Drs.Bob's Delphi Notes
These are the voyages using Delphi Enterprise (and Architect). Its mission: to explore strange, new worlds. To design and build new applications. To boldly go...
Title:

Unicode tip #7 - Unicode Text File Output

Author: Bob Swart
Posted: 12/3/2008 9:18:07 AM (GMT+1)
Content:

The TextFile of Delphi 2009 can only write AnsiStrings and not Unicode Strings, which means we can only write ANSI data to text files, right?
Wrong!
Since UTF8Strings are also (special) AnsiStrings, we can still write Unicode data to text files, provided we convert the Unicode String to a UTF8String (with no data loss) before writing to the file.
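
To make the "no data loss" claim concrete, here is a minimal sketch of a round trip from a Unicode String to a UTF8String and back. It assumes Delphi 2009 or later, where a typecast between the two string types performs the conversion; the program name and variable names are illustrative only.

```delphi
program Utf8RoundTrip;
{$APPTYPE CONSOLE}
uses
  SysUtils;

const
  // U+1D11E (musical G clef) as a UTF-16 surrogate pair
  Clef = #$D834 + #$DD1E;

var
  Original, Restored: string;
  Encoded: UTF8String;
begin
  Original := 'Clef: ' + Clef;
  Encoded := UTF8String(Original); // UTF-16 -> UTF-8
  Restored := string(Encoded);     // UTF-8 -> UTF-16

  Writeln(Length(Original)); // 8: six ASCII characters plus two surrogates
  Writeln(Length(Encoded));  // 10: six ASCII bytes plus a four-byte sequence
  Writeln(Original = Restored);
end.
```

The lengths differ because Length counts UTF-16 code units for a Unicode String and bytes for a UTF8String, but the text that comes back is identical to the text that went in.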

Note that we should also write the UTF-8 BOM at the start of the output, so that the file is recognised as UTF-8 when we save it and read it back afterwards:

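program UnicodeTextFile;
uses
  SysUtils; // provides TEncoding

const
  // '[' + U+1D11E (musical G clef, as a UTF-16 surrogate pair) + ']'
  Clef = #$5B + #$D834 + #$DD1E + #$5D;

var
  F: Text;
  B: Byte;
begin
  AssignFile(F, 'output.txt');
  Rewrite(F);
  // Write the UTF-8 BOM byte by byte
  for B in TEncoding.UTF8.GetPreamble do
    Write(F, AnsiChar(B));
  // Convert the UTF-16 string to UTF-8 before writing it
  Writeln(F, UTF8String('[' + Clef + ']'));
  CloseFile(F);
end.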
Since UTF8String is an AnsiString, we can combine the code above with writeln of normal strings, which will be converted to AnsiStrings, as long as we keep away from high-ASCII characters (above #127). Those would be written as single ANSI bytes, which a UTF-8 reader interprets as part of a multi-byte character sequence.
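program UnicodeTextFile;
uses
  SysUtils; // provides TEncoding

const
  // '[' + U+1D11E (musical G clef, as a UTF-16 surrogate pair) + ']'
  Clef = #$5B + #$D834 + #$DD1E + #$5D;

var
  F: Text;
  B: Byte;
begin
  AssignFile(F, 'output.txt');
  Rewrite(F);
  // Write the UTF-8 BOM byte by byte
  for B in TEncoding.UTF8.GetPreamble do
    Write(F, AnsiChar(B));
  // Explicitly converted to UTF-8, so the clef survives intact
  Writeln(F, UTF8String('[' + Clef + ']'));
  // Plain ASCII literal: the implicit conversion to AnsiString is harmless
  Writeln(F, 'This is a UTF-16 String which will be written as AnsiString');
  CloseFile(F);
end.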
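
One way to avoid the high-ASCII pitfall altogether is to route every line through a small helper that always converts to UTF-8. The sketch below is only an illustration: WritelnUtf8 and the file name are hypothetical, not part of the article or the RTL, and it assumes the Delphi 2009 behaviour described above, where the bytes of a UTF8String are written to the text file unchanged.

```delphi
program Utf8LineHelper;
uses
  SysUtils; // provides TEncoding

// Hypothetical helper (not from the article): converts every line to
// UTF-8 explicitly, so callers cannot forget the conversion.
procedure WritelnUtf8(var F: Text; const S: string);
begin
  Writeln(F, UTF8String(S));
end;

var
  F: Text;
  B: Byte;
begin
  AssignFile(F, 'helper-output.txt'); // file name is an assumption
  Rewrite(F);
  try
    // Write the UTF-8 BOM byte by byte
    for B in TEncoding.UTF8.GetPreamble do
      Write(F, AnsiChar(B));
    WritelnUtf8(F, 'plain ASCII is unchanged');
    WritelnUtf8(F, 'caf' + #$00E9); // e-acute becomes the two bytes C3 A9
  finally
    CloseFile(F);
  end;
end.
```

With the helper, a string containing characters above #127 is encoded as a valid UTF-8 sequence instead of being narrowed to a single ANSI byte.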
As long as we convert UTF-16 Unicode Strings to UTF-8 before writing to text files, and don’t forget to write the UTF-8 BOM as a prefix, this will work fine for writing files with Unicode UTF-8 output.
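
To check the result, the file can be read back into Unicode Strings. The following is a minimal sketch, assuming the output.txt written above exists in the current directory; it relies on TStringList.LoadFromFile, which in Delphi 2009 inspects the BOM when no encoding is passed. The program name is illustrative.

```delphi
program ReadUnicodeTextFile;
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes; // Classes provides TStringList

var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // No encoding given: the UTF-8 BOM tells LoadFromFile how to decode
    Lines.LoadFromFile('output.txt');
    Writeln('Lines read: ', Lines.Count);
    if Lines.Count > 0 then
      Writeln('UTF-16 code units in first line: ', Length(Lines[0]));
  finally
    Lines.Free;
  end;
end.
```

If the file had been written without a BOM, the encoding could instead be stated explicitly with Lines.LoadFromFile('output.txt', TEncoding.UTF8).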

This tip is the 7th in a series of Unicode tips taken from my Delphi 2009 Development Essentials book published earlier this week on Lulu.com.



6 Comments

Author | Posted | Comments
54 | 09/12/21 10:23:03 | 示
hema venkata ramu | 10/04/19 17:05:44 | it is not suiteble , infact it is not working in my system
Imtiaz | 10/10/11 11:29:26 | If i did not Write BOM, does this effects any thing. Some people wrote the above sample is not working on their system. is this a platform dependent change. what is the purpose of Writing Clef = #$5B + #$D834 + #$DD1E + #$5D;
mb | 10/11/06 06:15:01 | how do convert unicode to ansi or ansi to unicode file with delphi 7.0 ??????? help me . help me
Bob | 11/08/09 14:04:14 | What if your string has foreign, mathematical or other special characters? How do you write those so they're preserved?
mg30rg | 13/07/25 12:14:38 | @mb - UTF8Encode() will do the trick for you. @Bob - All Unicode characters can be encoded to UTF-8. Even matematical, arabic, cyrillic, or asian symbols.



This webpage © 2005-2017 by Bob Swart (aka Dr.Bob - www.drbob42.com). All Rights Reserved.