Overview
This article provides comprehensive information about character sets used in the EFS Survey, focusing on UTF-8 as the standard character set and listing other available options for various languages and regions.
Information
This guide explains the importance of UTF-8 in the EFS Survey and provides a detailed list of character sets that can be used for different language requirements.
The standard character set for EFS Survey: UTF-8
The EFS Survey admin area is coded in UTF-8. Newly created projects likewise use UTF-8 as the default character set, unless you have configured a different default for your account.
UTF-8 is an encoding of Unicode, the character set defined by the Unicode Consortium.
Using UTF-8 significantly simplifies foreign-language and multilingual projects. In particular:
- UTF-8 can represent every character defined in Unicode, so virtually all written languages can be reproduced.
- You can enter characters from any language directly in the admin area using the keyboard.
- All entered data and settings are saved internally and uniformly in UTF-8: questionnaire texts, participant data, and internal EFS Survey data such as to-do notes or user accounts.
- Participants' answers to open questions are coded uniformly in UTF-8, so all open entries in multilingual surveys can be exported and viewed in a single record.
- Survey and panel passwords can contain characters from any language.
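The difference between UTF-8 and a single-region character set can be illustrated with a short, generic Python sketch. It is not part of EFS Survey; the helper name `fits` and the sample answers are made up for illustration. It checks whether each text can be encoded in UTF-8 and in ISO 8859-1.

```python
def fits(text: str, codec: str) -> bool:
    """Return True if every character of `text` can be encoded with `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical open-ended answers in German, Russian, Chinese, and Hebrew.
answers = ["Grüße", "Привет", "你好", "שלום"]

for text in answers:
    print(text, "| UTF-8:", fits(text, "utf-8"), "| ISO 8859-1:", fits(text, "iso8859-1"))
```

All four answers fit into UTF-8, whereas ISO 8859-1 can only hold the German one. This is why a multilingual survey stored in UTF-8 can keep all answers in one record.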
Character sets that can be used in EFS Survey
You can set the character set of your projects yourself. Tivian recommends using UTF-8 in general, and in particular for surveys that would otherwise require several character sets.
The following table lists all available character sets.
| Character set | Description |
|---|---|
| UTF-8 | International character set |
| ISO 8859-1 West European | Latin 1 |
| ISO 8859-2 East European | Latin 2 |
| ISO 8859-3 South European | Latin 3 |
| ISO 8859-4 Baltic | Latin 4 |
| ISO 8859-5 Cyrillic | Covers largely the languages Bulgarian, Macedonian, |
| ISO 8859-6 Arabic | Arabic. The direction of the text is from right to left. |
| ISO 8859-7 Greek | Modern Greek |
| ISO 8859-8 Hebrew | Hebrew. The direction of the text is from right to left. |
| ISO 8859-9 Turkish | Latin 5 |
| ISO 8859-13 Baltic | Latin 7 |
| ISO 8859-15 West European | Latin 9 |
| ASCII (7-bit charset) | ASCII character set |
| KOI8-R, Russian | Russian and Bulgarian |
| Simplified Chinese, PRC standard | Simplified Chinese |
| GB2312, EUC encoding, Simplified Chinese | Simplified Chinese |
| GBK, Simplified Chinese | Simplified Chinese |
| CNS11643 (Plane 1-3), EUC encoding, | Traditional Chinese |
| Big5, Traditional Chinese | Traditional Chinese. Used in Taiwan and Hong Kong. |
| Big5 with Hong Kong extensions, Traditional Chinese | Traditional Chinese with extensions for the Cantonese dialect |
| JISX 0201, 0208 and 0212, EUC encoding Japanese | Japanese |
| JISX 0201, 0208 and 0212, EUC encoding Japanese | Japanese |
| Shift-JIS, Japanese | Japanese |
| JIS X 0201, 0208, in ISO 2022 form, | Japanese |
| KS C 5601, EUC encoding, Korean | Korean |
| ISO 2022 KR, Korean | Korean |
| TIS620 Thai | Thai |
FAQ
Why is UTF-8 recommended for the EFS Survey?
UTF-8 is recommended because it supports all characters from all languages, facilitates multilingual surveys, ensures uniform data storage, and allows for consistent coding of open-ended responses across different languages.
Can I use multiple character sets in a single survey?
While it's possible to use different character sets, it's generally recommended to use UTF-8 for all surveys, especially those that would otherwise require multiple character sets. This ensures consistency and simplifies data management.
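If you already hold texts in different legacy character sets, they can be brought into UTF-8 before further processing. The following is a minimal Python sketch under stated assumptions: it runs outside EFS Survey, the source character set of each input is known, and `to_utf8` and the sample inputs are hypothetical.

```python
def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Decode bytes from a known legacy character set and re-encode them as UTF-8."""
    return raw.decode(source_charset).encode("utf-8")


# Hypothetical legacy inputs: Russian text in KOI8-R and Greek text in ISO 8859-7.
legacy_inputs = [
    ("Привет".encode("koi8_r"), "koi8_r"),
    ("Γειά σου".encode("iso8859_7"), "iso8859_7"),
]

# After conversion, all texts share one encoding and can be stored together.
merged = [to_utf8(raw, charset).decode("utf-8") for raw, charset in legacy_inputs]
print(merged)
```

The conversion only works reliably when the source character set is known; decoding with the wrong character set usually produces garbled text rather than an error.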
How do I change the character set for my EFS Survey project?
You can set the character set for your projects in the EFS Survey admin area. However, it's recommended to use UTF-8 unless you have a specific reason to use a different character set.
Priyanka Bhotika