IFC String Encoding
String Encoding & Decoding
The IFC exchange format "STEP physical file" directly uses only the characters with decimal values 32 to 126 from the ISO 8859-1 code table. Any other character, such as accented Western characters (for example the German umlauts), Greek or Cyrillic letters, or Asian characters, has to be encoded before being exchanged as part of a string value. This encoding is used in IFC up to IFC4.x. In the future, IFC will adopt UTF-8 encoding.
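
As a rough illustration of the rule above, the following Python sketch passes characters in the range 32 to 126 through unchanged and writes everything else as a 16-bit `\X2\...\X0\` sequence, one of the forms defined in ISO 10303-21. The function names `needs_encoding` and `encode_ifc_string` are hypothetical, and the sketch makes simplifying assumptions: it rejects characters beyond the 16-bit range and leaves apostrophes and backslashes untouched, which a real STEP writer would also have to handle.

```python
def needs_encoding(ch: str) -> bool:
    """Return True if ch lies outside the directly usable range 32..126."""
    return not 32 <= ord(ch) <= 126


def encode_ifc_string(text: str) -> str:
    """Encode text using \\X2\\...\\X0\\ for every character outside 32..126."""
    out = []  # finished pieces of the encoded string
    run = []  # pending characters that must be encoded together

    def flush() -> None:
        # Emit one \X2\...\X0\ block for the pending run, four hex digits each.
        if run:
            codes = "".join(f"{ord(c):04X}" for c in run)
            out.append("\\X2\\" + codes + "\\X0\\")
            run.clear()

    for ch in text:
        if needs_encoding(ch):
            if ord(ch) > 0xFFFF:
                # Assumption: characters beyond 16 bits are out of scope here.
                raise ValueError("character outside the 16-bit range")
            run.append(ch)
        else:
            flush()
            out.append(ch)
    flush()
    return "".join(out)


print(encode_ifc_string("M\u00fcller"))  # M\X2\00FC\X0\ller
```

Grouping consecutive non-ASCII characters into a single `\X2\...\X0\` block keeps the output shorter than opening and closing the directive for every character.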
The rules for decoding and encoding are defined in ISO 10303-21: "Industrial automation systems and integration - Product data representation and exchange - Part 21: Implementation methods: Clear text encoding of the exchange structure". A short summary and guideline is included in the IFC Implementation Guide.
Example: The following encodings all define the character "uppercase A umlaut" (Ä), whose hexadecimal character code is xC4 (decimal 196).
Characters | Description |
---|---|
"\S\D" | the character code of D, x44 (decimal 68), added to x80 (128) gives x44 + x80 (68 + 128) = xC4 (196); since Ä is defined in ISO 8859-1, which is the default code page, no \P directive is required. |
"\PA\\S\D" | same as above, but the \PA\ directive at the beginning of the string explicitly states that the value xC4 (196) is taken from ISO 8859-1 |
"\X\C4" | character code xC4 as an 8-bit character code found in ISO 10646 (the first 256 characters, code points 0 to 255, also referred to as "row 0") |
"\X2\00C4\X0\" | character code xC4 as the 16-bit character x00C4 in ISO 10646 (Unicode) |
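
The four forms in the table can be decoded with a small routine such as the Python sketch below. It is a minimal illustration, not a complete ISO 10303-21 parser: the name `decode_ifc_string` is hypothetical, the mapping of `\PA\` through `\PI\` to ISO 8859-1 through ISO 8859-9 is taken from Part 21 rather than from this page, and 32-bit `\X4\` sequences as well as doubled apostrophes and backslashes are not handled.

```python
import re

# One alternative per control directive shown in the table.
_DIRECTIVE = re.compile(
    r"\\P([A-I])\\"                     # \PA\ .. \PI\ : select the code page
    r"|\\S\\([\x20-\x7e])"              # \S\c : code of c plus x80 in that page
    r"|\\X\\([0-9A-F]{2})"              # \X\HH : 8-bit code, ISO 10646 row 0
    r"|\\X2\\((?:[0-9A-F]{4})+)\\X0\\"  # \X2\HHHH...\X0\ : 16-bit codes
)


def decode_ifc_string(text: str) -> str:
    """Replace the supported control directives in text by their characters."""
    page = 1  # ISO 8859-1 is the default code page

    def replace(match: re.Match) -> str:
        nonlocal page
        p, s, x, x2 = match.groups()
        if p is not None:
            # Remember the selected page; the directive itself yields no text.
            page = ord(p) - ord("A") + 1
            return ""
        if s is not None:
            # Add x80 to the character code and look it up in the current page.
            return bytes([ord(s) + 0x80]).decode(f"iso-8859-{page}")
        if x is not None:
            return chr(int(x, 16))
        # Otherwise x2 matched: split the run into groups of four hex digits.
        return "".join(chr(int(x2[i:i + 4], 16)) for i in range(0, len(x2), 4))

    return _DIRECTIVE.sub(replace, text)


# The four encodings from the table all decode to the same character.
for raw in ["\\S\\D", "\\PA\\\\S\\D", "\\X\\C4", "\\X2\\00C4\\X0\\"]:
    print(decode_ifc_string(raw))  # each line prints Ä
```

Tracking the code page in a variable that the `\P` branch updates lets a single left-to-right pass handle strings where the page is selected once and then used by several `\S\` sequences.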