Developing a locale file for OpenOffice.

 

 

Javier SOLA - www.khmeros.info - Last edited 05/04/2005

 

The written culture of a country is not only defined by its language. Many other conventions are applied to written texts.

 

As an example, in the US dates are written with the name or number of the month before the day of the month, the month is capitalized and the day of the month is an ordinal (May 2nd, 1960), while in Spain the day is written first, the month is not capitalized and the word “de” has to be placed between the day and the month and between the month and the year (2 de mayo de 1960). In other countries the word “day” has to precede the day number, and in others the year is written first. Still others use the Buddhist calendar or another calendar, changing dates altogether... and these are only a few examples.

 

The same applies to numbers (the period is used as a decimal separator in some countries and the comma in others), currency writing formats (number of decimals, currency identifier placed before or after the amount, etc.), measurement units and some other pieces of data.

 

Sorting order, word order, alphabetic order, collation order, or whatever you prefer to call it (all equivalent terms), is also script and culture dependent. Cultures that use the same script sometimes have different rules. In English the sorting order starts with “a b c d e f…” while in Spanish it starts with “a b c ch d e f…”. Accents and diacritics are also classified differently in different languages.

 

All this data is usually referred to in the computer world as LOCALE data. Locale data refers either to the general use of a language (independently of the country or region in which it is used) or to the specific use of a language in a given country or region (the conventions for the use of the Spanish language in Spain are different from the ones used for the Spanish language in Chile).

 

Programs that are localized to many languages tend to place all the data related to a language or to a country (region) in a file called a LOCALE for that culture.

 

OpenOffice requires that you place all the cultural data for your language/region in a file that has a format specific to OpenOffice. This file is an XML file in plain text (UTF-8 if your language requires it). It can be edited with any plain text editor.

 

By now, you should know your locale name (in this case languageCode.xml or languageCode_regionCode.xml). If you don't, you can find the codes in these standards:

Language codes - ISO 639-2
Country codes - ISO 3166-1

The locale name in OpenOffice always uses both the language code and the country code (language code in lowercase letters and country code in capital letters), separated by an underscore (not a hyphen), and with the .xml extension. Some examples of names are: km_KH.xml, es_CL.xml, en_US.xml, etc. Please look into the directory that contains all the locale files presently included in OpenOffice:

 

http://l10n.openoffice.org/source/browse/l10n/i18npool/source/localedata/data/

 

If there is a file for your culture already, you might want to check it to see if any corrections are needed. If they are, you should file an issue with OpenOffice related to mistakes in the file.

 

If there is no file… then follow the instructions below on how to create and submit a locale.

 

To make this easier for you, we have created a template file derived from the present en_US.xml file. It includes most of the information that will be needed; other information is inherited from the en_US.xml (English in the United States) file. The information that is not included in our template file (and is inherited instead) is not important for first-level localization work. If you localize the template.xml file to your own culture you will have everything that you need.

 

We will now go through this file point by point to see what needs to be changed:

 

  • The first block gives general information about the locale itself. You should not change anything unless you know what you are doing, especially in the first two lines. The third line contains three values (the version number and the attributes versionDTD and allowUpdateFromCLDR). For the version number (at the end of the line) you can start with 1.0 (as it is now) and change it when needed (later, if you make big upgrades to the locale file). This version number is for your own internal control. The versionDTD attribute is the same in all locale files for all languages. To make sure that you have the correct number, you need to look into the file:

/i18npool/source/localedata/data/locale.dtd

In this file, search for versionDTD, and you will find a line such as:

<!ATTLIST Locale versionDTD CDATA #FIXED "2.0">

which indicates that the value of versionDTD is fixed for all locale files, and that it must be "2.0". If your locale does not have this attribute, or if the number is incorrect, be careful: you might have taken as a starting template an old locale file that is not compatible. Once again, we recommend that you start with our template file if you are working on version 2.0 of OpenOffice.org.

The allowUpdateFromCLDR value indicates whether you want to permit new changes in the CLDR locale file to be applied automatically to the OOo locale file.

The Common Locale Data Repository (CLDR) is a central location for locale data managed by the Unicode Consortium (and previously by the Openi18n group) that is becoming a central reference for all locale data. Specific CLDR locales are accessible here.

If your OOo locale and the CLDR locale contain the same general data (such as separators, day and month names and so on), as is already the case for many locales, then you might be interested in using the value "yes", so that you do not have to think about upgrading the OOo locale every time something new comes out in the CLDR locale. If the CLDR locale for your language/country has different data from the data in the OOo locale (there are disagreements), and you do not want the values to be changed to those of the CLDR locale, then you should use the value "no".

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Locale SYSTEM 'locale.dtd'>
<Locale versionDTD="2.0" allowUpdateFromCLDR="no" version="1.0">

 

  • In the following section, you need to provide the language and region (country) codes, as well as the name of the language and the region in English (not in your own language and script).


<LC_INFO>

<Language>

<LangID>en</LangID>

<DefaultName>English</DefaultName>

</Language>

<Country>

<CountryID>US</CountryID>

<DefaultName>United States</DefaultName>

</Country>

</LC_INFO>
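
As an illustration, the same block for the km_KH locale mentioned earlier might look as follows. This is only a sketch: the English names "Khmer" and "Cambodia" are our assumption of what the block would contain, so check them against the existing locale files.

```xml
<LC_INFO>
<Language>
<LangID>km</LangID>
<DefaultName>Khmer</DefaultName>
</Language>
<Country>
<CountryID>KH</CountryID>
<DefaultName>Cambodia</DefaultName>
</Country>
</LC_INFO>
```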

 

  • You have to pay most attention to the <DecimalSeparator> and the <ThousandSeparator>. If in your country you write “one thousand three hundred fifty-five and a half” as 1,355.5, you do not need to change them, but if you usually write 1.355,5, then you have to swap them. The list separator is presently not used in OpenOffice, but it is defined to separate the items of an enumerated list, such as: socks; shoes; Swiss army knife; toothbrush and clean underwear.

 

<LC_CTYPE unoid="generic">

<Separators>

 

<ThousandSeparator>,</ThousandSeparator>

<DecimalSeparator>.</DecimalSeparator>

<ListSeparator>;</ListSeparator>
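
For instance, a locale that writes the number as 1.355,5 would simply swap the two values. The fragment below is a sketch for such a locale, not taken from any particular locale file; the list separator is left unchanged.

```xml
<ThousandSeparator>.</ThousandSeparator>
<DecimalSeparator>,</DecimalSeparator>
<ListSeparator>;</ListSeparator>
```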

 

 

  • The following separators are declared for future, more automatic date and time formats. For consistency, they should agree with the separators that you use further down in the date formats, so that future automatic dates have the same format as the ones you are declaring here. They normally do not interact with the present date formats, with one exception: <LongDateDayOfWeekSeparator> is always inserted, in the long date formats, after an occurrence of NNNN (the complete name of the day of the week). This is why in some date formats you will find patterns like the following one, in which MMMM is placed right after NNNN.

 

NNNNMMMM DD, YYYY

 

Because the separator is inserted automatically, and given that <LongDateDayOfWeekSeparator> is ", " in this template locale, the pattern will be rendered as something like:

 

Wednesday, March 12, 2003


if 3/12/03 was the date. If you inserted another space in the date format after NNNN, the date would show a double space.

 

<DateSeparator>/</DateSeparator>

<TimeSeparator>:</TimeSeparator>

<Time100SecSeparator>.</Time100SecSeparator>

<LongDateDayOfWeekSeparator>, </LongDateDayOfWeekSeparator>

<LongDateDaySeparator>, </LongDateDaySeparator>

<LongDateMonthSeparator> </LongDateMonthSeparator>

<LongDateYearSeparator> </LongDateYearSeparator>

</Separators>
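
To see why no space is typed after NNNN, compare the two format codes below. This is an illustrative sketch using the separator of this template; the rendered dates in the comments are what we would expect from the rule described above, not output copied from OpenOffice.

```xml
<!-- Separator inserted automatically after every NNNN in long dates -->
<LongDateDayOfWeekSeparator>, </LongDateDayOfWeekSeparator>

<!-- No space after NNNN: expected to render as "Monday, May 02, 1960" -->
<FormatCode>NNNNMMMM DD, YYYY</FormatCode>

<!-- Extra space after NNNN: expected to render as "Monday,  May 02, 1960" (double space) -->
<FormatCode>NNNN MMMM DD, YYYY</FormatCode>
```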

  • Here you can either use typographic (directional) quotation marks (as they are now), or the good old-style straight marks ( ' and " , the same for Start and End).

<Markers>

<QuotationStart>‘</QuotationStart>

<QuotationEnd>’</QuotationEnd>

<DoubleQuotationStart>“</DoubleQuotationStart>

<DoubleQuotationEnd>”</DoubleQuotationEnd>

</Markers>
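
If you prefer the straight marks, the block could be written as in the following sketch, with the same character used for Start and End:

```xml
<Markers>
<QuotationStart>'</QuotationStart>
<QuotationEnd>'</QuotationEnd>
<DoubleQuotationStart>"</DoubleQuotationStart>
<DoubleQuotationEnd>"</DoubleQuotationEnd>
</Markers>
```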

 

  • What words or abbreviations do you use to indicate morning (0:00 to 12:00) or afternoon (12:00 to 24:00) in a 12 hour clock?

 

<TimeAM>AM</TimeAM>

<TimePM>PM</TimePM>

 

  • Here you can use metric (as it is now), US or UK (if your country uses the US or UK measurement systems).

 

<MeasurementSystem>metric</MeasurementSystem>

</LC_CTYPE>

 

  • You have to replace the second $ sign with the sign(s) used to indicate your currency (up to three characters). The number in the second part is the Microsoft language ID for your language in hexadecimal format. You can find it here:

 

http://www.microsoft.com/globaldev/reference/lcid-all.mspx

 

If your language does not appear in this list, then say so on the OpenOffice Localization list and they will assign a number for you.

 

 

<LC_FORMAT replaceFrom="[CURRENCY]" replaceTo="[$$-409]">
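
As a sketch, a Spanish (Spain) locale that uses the euro might declare the line below. We are assuming here that 40A is the hexadecimal Microsoft language ID for Spanish (Spain, traditional sort) and that the euro sign is the desired symbol; please check both against the list above before using them.

```xml
<LC_FORMAT replaceFrom="[CURRENCY]" replaceTo="[$€-40A]">
```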

 

  • Dates are the following part. A large number of formats are used in OpenOffice, each one with a different amount of information and a different layout. In the example, the data is structured in the usual US format, with the month before the day of the month and then the year (I wonder who ever came up with this order). You will most probably have to change the order, but it is a good idea to maintain the separators and the amount of information in each format. The formats are built around this table of “placeholder” letters:

Era             G
Year            Y
Month           M    (when used within a date)
Day             D
DayOfWeek       N
DayOfWeek       A ?  (probably in old specifications, not used now)
Quarter         Q
Hour            H
Minutes         M    (when used within a time structure)
Seconds         S
1/100 of sec.   00

 

The number of times a placeholder letter is repeated is an indication of the number of characters to be used, but the number of characters does not always correspond exactly to the number of times the placeholder letter is repeated in the format. For example, ‘D’ means the day of the month with one or two digits (as needed: 2 for day 2, 12 for day 12), while ‘DD’ means that two digits must always be used (day 5 must be 05). M is a one or two digit month, MM a two digit month, MMM the short (abbreviated) month name and MMMM the long month name (long and short month names for your language are defined further down in the locale).

In the following block you only have to localize what is in the <FormatCode> lines; do not touch the other lines. Do not change the format of the entries with formatindex 32 and 33, because they correspond to specific ISO formats (year first). Also, pay special attention to using the same number of letters for each piece of data in the entries with formatindex 21 and 47 (change only the order of the elements if needed).

Spaces inside the dates are significant. If you include a space, it will be included in the printed date. In some cases you will see that the day of the week is attached to the name of the month (no space between them). This is because OpenOffice automatically inserts between them the <LongDateDayOfWeekSeparator> that you defined above (comma + space for US English).

Inside the date format, you can include strings with text before, between or after the placeholder letters, such as in D "de" MMMM "de" YYYY for Spanish. This is because dates in Spanish should be printed with these words in the middle: 2 de mayo de 1960. Note that there are spaces outside the quotes, which are significant. If they had been inside the quotes, they would also be taken into account. What you should never do is put spaces both inside and outside the quotes, because then they would be duplicated in the final date.

Inside some of the date formats, you will see “AM/PM”. Do not translate these. They are placeholders for the words for AM and PM that you have defined in <TimeAM> and <TimePM>.

The placeholder M is used in two different situations. When used within a date, it means "month", but when used within a time structure, it means "minute". Note that it might be used with both meanings within the same format.

 

<FormatElement msgid="DateFormatskey1" default="true" type="short" usage="DATE" formatindex="18">

<FormatCode>M/D/YY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey2" default="false" type="medium" usage="DATE" formatindex="28">

<FormatCode>NN DD/MMM YY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey3" default="false" type="medium" usage="DATE" formatindex="34">

<FormatCode>MM/YY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey4" default="false" type="medium" usage="DATE" formatindex="35">

<FormatCode>MMM DD</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey5" default="false" type="medium" usage="DATE" formatindex="36">

<FormatCode>MMMM</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey6" default="false" type="medium" usage="DATE" formatindex="37">

<FormatCode>QQ YY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey7" default="false" type="medium" usage="DATE" formatindex="21">

<FormatCode>MM/DD/YYYY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey8" default="true" type="medium" usage="DATE" formatindex="20">

<FormatCode>MM/DD/YY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey9" default="true" type="long" usage="DATE" formatindex="19">

<FormatCode>NNNNMMMM DD, YYYY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey10" default="false" type="long" usage="DATE" formatindex="22">

<FormatCode>MMM D, YY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey11" default="false" type="long" usage="DATE" formatindex="23">

<FormatCode>MMM D, YYYY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey12" default="false" type="long" usage="DATE" formatindex="25">

<FormatCode>MMMM D, YYYY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey13" default="false" type="long" usage="DATE" formatindex="27">

<FormatCode>NN, MMM D, YY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey14" default="false" type="long" usage="DATE" formatindex="29">

<FormatCode>NN, MMMM D, YYYY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey15" default="false" type="long" usage="DATE" formatindex="30">

<FormatCode>NNNNMMMM D, YYYY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey16" default="false" type="long" usage="DATE" formatindex="24">

<FormatCode>D. MMM. YYYY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey17" default="false" type="long" usage="DATE" formatindex="26">

<FormatCode>D. MMMM YYYY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey18" default="false" type="short" usage="DATE" formatindex="31">

<FormatCode>MM-DD</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey19" default="false" type="medium" usage="DATE" formatindex="32">

<FormatCode>YY-MM-DD</FormatCode>

<DefaultName>ISO 8601</DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey20" default="false" type="medium" usage="DATE" formatindex="33">

<FormatCode>YYYY-MM-DD</FormatCode>

<DefaultName>ISO 8601</DefaultName>

</FormatElement>

<FormatElement msgid="DateFormatskey21" default="false" type="medium" usage="DATE" formatindex="38">

<FormatCode>WW</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="TimeFormatskey1" default="false" type="short" usage="TIME"  formatindex="39">

<FormatCode>HH:MM</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="TimeFormatskey2" default="false" type="medium" usage="TIME"  formatindex="40">

<FormatCode>HH:MM:SS</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="TimeFormatskey3" default="true" type="short" usage="TIME"  formatindex="41">

<FormatCode>HH:MM AM/PM</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="TimeFormatskey4" default="true" type="medium" usage="TIME"  formatindex="42">

<FormatCode>HH:MM:SS AM/PM</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="TimeFormatskey5" default="false" type="medium" usage="TIME"  formatindex="43">

<FormatCode>[HH]:MM:SS</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="TimeFormatskey6" default="false" type="short" usage="TIME"  formatindex="44">

<FormatCode>MM:SS.00</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="TimeFormatskey7" default="false" type="medium" usage="TIME"  formatindex="45">

<FormatCode>[HH]:MM:SS.00</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateTimeFormatskey1" default="true" type="medium" usage="DATE_TIME"  formatindex="46">

<FormatCode>MM/DD/YY HH:MM AM/PM</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="DateTimeFormatskey2" default="false" type="medium" usage="DATE_TIME" formatindex="47">

<FormatCode>MM/DD/YYYY HH:MM:SS</FormatCode>

<DefaultName></DefaultName>

</FormatElement>
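
As an illustration of how these entries might be localized, here is a sketch of three of them adapted to Spanish conventions (day first, then month, then year). The format codes are our own suggestion rather than a copy of an existing locale file, and the long format assumes that <LongDateDayOfWeekSeparator> is ", " as in this template.

```xml
<FormatElement msgid="DateFormatskey1" default="true" type="short" usage="DATE" formatindex="18">
<FormatCode>D/M/YY</FormatCode>
<DefaultName></DefaultName>
</FormatElement>

<!-- Same number of letters as the template (MM/DD/YYYY), only the order changes -->
<FormatElement msgid="DateFormatskey7" default="false" type="medium" usage="DATE" formatindex="21">
<FormatCode>DD/MM/YYYY</FormatCode>
<DefaultName></DefaultName>
</FormatElement>

<!-- No space after NNNN: the day-of-week separator is inserted automatically -->
<FormatElement msgid="DateFormatskey15" default="false" type="long" usage="DATE" formatindex="30">
<FormatCode>NNNND "de" MMMM "de" YYYY</FormatCode>
<DefaultName></DefaultName>
</FormatElement>
```

With Spanish day and month names defined further down in the locale, the last entry would be expected to print 2 May 1960 as: lunes, 2 de mayo de 1960.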

 

If you want additional predefined date formats to appear in the places in OOo in which date formatting is offered automatically (such as in Calc cell formatting), then you can include additional date formats, or any other type of format. By default, numbers in these formats will use Western Arabic digits (0-9). If you want the new date formats to use the digits of your script, you need to precede the format with [NatNum1].

 

For example, you could add:

 

<FormatElement msgid="DateFormatskey22" default="false" type="medium" usage="DATE" formatindex="50">

<FormatCode>[NatNum1]NNNN, "Day" D "of" MMMM "of the year" YYYY</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

 

This will not translate the expressions "Day", "of" or "of the year", but if your locale is selected, all the numbers will be displayed in the digits of your script.

Warning: if you include new date styles, you should give them new consecutive DateFormatskey numbers. If the last one is DateFormatskey21, you should start with DateFormatskey22... Also, and even more important, you should use a formatindex="XX" number in which XX is over 49, as all numbers up to 49 are reserved by the system. Notice that all formats (dates, numbers, currency, etc.) share the same formatindex numbers. If you add dates with formatindex numbers 50 and 51 (for example), and then add number formats, they must have different formatindex values such as 52, 53, etc. (different from 50 and 51). There is no relation between the number that you use in DateFormatskey and the number that you use in formatindex.
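
Following these rules, a second additional date format could be sketched as below. The msgid DateFormatskey23 and the formatindex 51 are simply the next free values after the example above, and the format code itself is only an illustration.

```xml
<FormatElement msgid="DateFormatskey23" default="false" type="long" usage="DATE" formatindex="51">
<FormatCode>[NatNum1]D MMMM YYYY</FormatCode>
<DefaultName></DefaultName>
</FormatElement>
```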

 

  • The next block is about number formats. The most important issues here are the decimal separator, the thousand separator and how to represent currency: should the currency symbol be placed before the amount or after it? How many decimals should a currency amount have? Look at each format carefully. Remember to change only the lines that include <FormatCode>, and make sure that the separators you use here are the same as the <DecimalSeparator> and <ThousandSeparator> that you defined above. By default, this file contains the US number representation; if yours is different, you must change it.

 

<FormatElement msgid="FixedFormatskey1" default="true" type="medium" usage="FIXED_NUMBER"  formatindex="0">

<FormatCode>General</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="FixedFormatskey2" default="true" type="short" usage="FIXED_NUMBER"  formatindex="1">

<FormatCode>0</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="FixedFormatskey3" default="false" type="medium" usage="FIXED_NUMBER"  formatindex="2">

<FormatCode>0.00</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="FixedFormatskey4" default="false" type="short" usage="FIXED_NUMBER"  formatindex="3">

<FormatCode>#,##0</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="FixedFormatskey5" default="false" type="medium" usage="FIXED_NUMBER"  formatindex="4">

<FormatCode>#,##0.00</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="FixedFormatskey6" default="false" type="medium" usage="FIXED_NUMBER"  formatindex="5">

<FormatCode>#,###.00</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="CurrencyFormatskey1" default="true" type="short" usage="CURRENCY"  formatindex="12">

<FormatCode>[CURRENCY]#,##0;-[CURRENCY]#,##0</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="CurrencyFormatskey2" default="false" type="medium" usage="CURRENCY"  formatindex="13">

<FormatCode>[CURRENCY]#,##0.00;-[CURRENCY]#,##0.00</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="CurrencyFormatskey3" default="false" type="medium" usage="CURRENCY"  formatindex="14">

<FormatCode>[CURRENCY]#,##0;[RED]-[CURRENCY]#,##0</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="CurrencyFormatskey4" default="true" type="medium" usage="CURRENCY"  formatindex="15">

<FormatCode>[CURRENCY]#,##0.00;[RED]-[CURRENCY]#,##0.00</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="CurrencyFormatskey5" default="false" type="medium" usage="CURRENCY"  formatindex="16">

<FormatCode>#,##0.00 CCC</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="CurrencyFormatskey6" default="false" type="medium" usage="CURRENCY"  formatindex="17">

<FormatCode>[CURRENCY]#,##0.--;[RED]-[CURRENCY]#,##0.--</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="PercentFormatskey1" default="true" type="short" usage="PERCENT_NUMBER"  formatindex="8">

<FormatCode>0%</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="PercentFormatskey2" default="true" type="long" usage="PERCENT_NUMBER"  formatindex="9">

<FormatCode>0.00%</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="ScientificFormatskey1" default="true" type="medium" usage="SCIENTIFIC_NUMBER"  formatindex="6">

<FormatCode>0.00E+000</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

<FormatElement msgid="ScientificFormatskey2" default="false" type="medium" usage="SCIENTIFIC_NUMBER"  formatindex="7">

<FormatCode>0.00E+00</FormatCode>

<DefaultName></DefaultName>

</FormatElement>

</LC_FORMAT>
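
As a sketch of the kind of change described at the start of this block, a locale that uses the period as thousand separator and the comma as decimal separator, and that places the currency symbol after the amount, might rewrite some of the entries as follows. Only the <FormatCode> lines differ from the template; the codes are an illustration, not a copy of an existing locale file.

```xml
<FormatElement msgid="FixedFormatskey5" default="false" type="medium" usage="FIXED_NUMBER" formatindex="4">
<FormatCode>#.##0,00</FormatCode>
<DefaultName></DefaultName>
</FormatElement>

<FormatElement msgid="CurrencyFormatskey2" default="false" type="medium" usage="CURRENCY" formatindex="13">
<FormatCode>#.##0,00 [CURRENCY];-#.##0,00 [CURRENCY]</FormatCode>
<DefaultName></DefaultName>
</FormatElement>

<FormatElement msgid="PercentFormatskey2" default="true" type="long" usage="PERCENT_NUMBER" formatindex="9">
<FormatCode>0,00%</FormatCode>
<DefaultName></DefaultName>
</FormatElement>
```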

If you want additional predefined formats to appear in the places in OOo in which number formatting is offered automatically (such as in Calc cell formatting), then you can include additional FIXED_NUMBER, CURRENCY or PERCENT_NUMBER formats, or any other type of format. By default, numbers in these formats will use Western Arabic digits (0-9). If you want the new number formats to use the digits of your script, you need to precede the format with [NatNum1].

Warning: if you include new number styles, you should give them new FixedFormatskey, CurrencyFormatskey or PercentFormatskey numbers that are consecutive to the already existing ones. As an example, if the last FIXED_NUMBER is FixedFormatskey6, you should start with FixedFormatskey7... Also, and even more important, you should use a formatindex="XX" number in which XX is over 49, as all numbers up to 49 are reserved by the system. Notice that all formats (dates, numbers, currency, etc.) share the same formatindex numbers. If you add dates with formatindex numbers 50 and 51 (for example), and then add number formats, they must have different formatindex values such as 52, 53, etc. (different from 50 and 51). There is no relation between the number that you use in FixedFormatskey, CurrencyFormatskey or PercentFormatskey and the number that you use in formatindex.
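
For example, assuming that date formats with formatindex 50 and 51 have already been added, an additional number format could be sketched like this. FixedFormatskey7 and formatindex 52 are just the next free values in that scenario. Like the other format elements, it belongs inside <LC_FORMAT>, before the closing tag.

```xml
<FormatElement msgid="FixedFormatskey7" default="false" type="medium" usage="FIXED_NUMBER" formatindex="52">
<FormatCode>[NatNum1]#,##0.00</FormatCode>
<DefaultName></DefaultName>
</FormatElement>
```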

 

 

  • Please see this page for information on collation

 

<LC_COLLATION>

<Collator unoid="alphanumeric" default="true"/>

<CollationOptions>

<TransliterationModules>IGNORE_CASE</TransliterationModules>

</CollationOptions>

</LC_COLLATION>

 

  • No information

 

<LC_SEARCH>

<SearchOptions>

<TransliterationModules>IGNORE_CASE</TransliterationModules>

</SearchOptions>

</LC_SEARCH>

 

 

  • In <IndexKey> you should list the letters of your script that you would use for enumerating with letters instead of numbers. These characters should replace A-Z below (unless, as in English, this is what your language uses). You can name a set of characters by giving the first one and the last one separated by a hyphen (as in the case of A-Z, which corresponds to all letters between A and Z).
  • You need to type the real characters (this is a UTF-8 file). [Still under review: you can then list, one by one, all other characters that are used but are not in that range. In Spanish this would probably amount to: A-Z á é í ó ú Á É Í Ó Ú ñ Ñ]

  

<LC_INDEX>

<IndexKey unoid="alphanumeric" default="true" phonetic="false">A-Z</IndexKey>
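
As a sketch of the range notation for a non-Latin script, a Greek locale might declare the line below. This simply applies the first-to-last rule described above to the Greek capital letters; we have not checked it against an existing locale file.

```xml
<IndexKey unoid="alphanumeric" default="true" phonetic="false">Α-Ω</IndexKey>
```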

  

  • In the next block, you have to define <UnicodeScript>. If your language uses the basic Latin alphabet (as used by the English language), do not change this; leave the two entries. If your language's script has its own characters and is listed in Unicode, you should remove one of the two <UnicodeScript> lines and, in the other one, place the number from the following table that corresponds to your script.

      

 0   BasicLatin
 1   Latin1Supplement
 2   LatinExtendedA
 3   LatinExtendedB
 4   IPAExtension
 5   SpacingModifier
 6   CombiningDiacritical
 7   Greek
 8   Cyrillic
 9   Armenian
10   Hebrew
11   Arabic
12   Syriac
13   Thaana
14   Devanagari
15   Bengali
16   Gurmukhi
17   Gujarati
18   Oriya
19   Tamil
20   Telugu
21   Kannada
22   Malayalam
23   Sinhala
24   Thai
25   Lao
26   Tibetan
27   Myanmar
28   Georgian
29   HangulJamo
30   Ethiopic
31   Cherokee
32   UnifiedCanadianAboriginalSyll
33   Ogham
34   Runic
35   Khmer
36   Mongolian
37   LatinExtendedAdditional
38   GreekExtended
39   GeneralPunctuation
40   SuperSubScript
41   CurrencySymbolScript
42   SymbolCombiningMark
43   LetterlikeSymbol
44   NumberForm
45   Arrow
46   MathOperator
47   MiscTechnical
48   ControlPicture
49   OpticalCharacter
50   EnclosedAlphanumeric
51   BoxDrawing
52   BlockElement
53   GeometricShape
54   MiscSymbol
55   Dingbat
56   BraillePatterns
57   CJKRadicalsSupplement
58   KangxiRadicals
59   IdeographicDescriptionChars
60   CJKSymbolPunctuation
61   Hiragana
62   Katakana
63   Bopomofo
64   HangulCompatibilityJamo
65   Kanbun
66   BopomofoExtended
67   EnclosedCJKLetterMonth
68   CJKCompatibility
69   CJKUnifiedIdeographsExtA
70   CJKUnifiedIdeograph
71   YiSyllables
72   YiRadicals
73   HangulSyllable
74   HighSurrogate
75   HighPrivateUseSurrogate
76   LowSurrogate
77   PrivateUse
78   CJKCompatibilityIdeograph
79   AlphabeticPresentation
80   ArabicPresentationA
81   CombiningHalfMark
82   CJKCompatibilityForm
83   SmallFormVariant
84   ArabicPresentationB
85   NoScript
86   HalfwidthFullwidthForm

 

If you cannot find the script here, you should point your browser to:

 http://api.openoffice.org/source/browse/api/offapi/com/sun/star/i18n/UnicodeScript.idl

 

and click on view. This will show you a file that has the latest version of this list… but unfortunately, the entries are not numbered. If you find the script for your language there, you have to start counting from the beginning (BasicLatin is number 0) and work out the correct number for your <UnicodeScript>. If your script is in Unicode, but you still do not find it listed there, you should write to the L10N@openoffice.apache.org list mentioning this and asking what should be done.

 

<UnicodeScript>0</UnicodeScript>

<UnicodeScript>1</UnicodeScript>
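
For a language written in a script of its own, such as Khmer (number 35 in the list above), the two entries would be reduced to a single one. A minimal sketch:

```xml
<UnicodeScript>35</UnicodeScript>
```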

 

  • The next characters refer to the abbreviation used in your language to talk about the following page or pages (such as s. and ss. in Spanish or sv. in both cases in French).

 

<FollowPageWord>p.</FollowPageWord>

<FollowPageWord>pp.</FollowPageWord>

</LC_INDEX>
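
Using the Spanish abbreviations mentioned above, the two entries would read as follows. This is a sketch, with the form for one following page first and the form for several following pages second, as in the English original.

```xml
<FollowPageWord>s.</FollowPageWord>
<FollowPageWord>ss.</FollowPageWord>
```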

 

  • The next one is easy if your country uses the Gregorian (western) calendar, with 12 months and a 7-day week. For each day of the week and for each month, you have to define a long (full) name and a short (abbreviated) name. You should not translate the ID fields (<DayID> or <MonthID>); they are used for reference. Translate only the abbreviated and full names (<DefaultAbbrvName> and <DefaultFullName>), using the characters of your language. If you want to define a different type of calendar, you will have to ask about it on the list.

 

Use the right capitalization. In English, the names of the months are written with a capital first letter; in other languages they are written entirely in lowercase letters.

 

<LC_CALENDAR>

<Calendar unoid="gregorian" default="true">

<DaysOfWeek>

<Day>

<DayID>sun</DayID>

<DefaultAbbrvName>Sun</DefaultAbbrvName>

<DefaultFullName>Sunday</DefaultFullName>

</Day>

<Day>

<DayID>mon</DayID>

<DefaultAbbrvName>Mon</DefaultAbbrvName>

<DefaultFullName>Monday</DefaultFullName>

</Day>

<Day>

<DayID>tue</DayID>

<DefaultAbbrvName>Tue</DefaultAbbrvName>

<DefaultFullName>Tuesday</DefaultFullName>

</Day>

<Day>

<DayID>wed</DayID>

<DefaultAbbrvName>Wed</DefaultAbbrvName>

<DefaultFullName>Wednesday</DefaultFullName>

</Day>

<Day>

<DayID>thu</DayID>

<DefaultAbbrvName>Thu</DefaultAbbrvName>

<DefaultFullName>Thursday</DefaultFullName>

</Day>

<Day>

<DayID>fri</DayID>

<DefaultAbbrvName>Fri</DefaultAbbrvName>

<DefaultFullName>Friday</DefaultFullName>

</Day>

<Day>

<DayID>sat</DayID>

<DefaultAbbrvName>Sat</DefaultAbbrvName>

<DefaultFullName>Saturday</DefaultFullName>

</Day>

</DaysOfWeek>

<MonthsOfYear>

<Month>

<MonthID>jan</MonthID>

<DefaultAbbrvName>Jan</DefaultAbbrvName>

<DefaultFullName>January</DefaultFullName>

</Month>

<Month>

<MonthID>feb</MonthID>

<DefaultAbbrvName>Feb</DefaultAbbrvName>

<DefaultFullName>February</DefaultFullName>

</Month>

<Month>

<MonthID>mar</MonthID>

<DefaultAbbrvName>Mar</DefaultAbbrvName>

<DefaultFullName>March</DefaultFullName>

</Month>

<Month>

<MonthID>apr</MonthID>

<DefaultAbbrvName>Apr</DefaultAbbrvName>

<DefaultFullName>April</DefaultFullName>

</Month>

<Month>

<MonthID>may</MonthID>

<DefaultAbbrvName>May</DefaultAbbrvName>

<DefaultFullName>May</DefaultFullName>

</Month>

<Month>

<MonthID>jun</MonthID>

<DefaultAbbrvName>Jun</DefaultAbbrvName>

<DefaultFullName>June</DefaultFullName>

</Month>

<Month>

<MonthID>jul</MonthID>

<DefaultAbbrvName>Jul</DefaultAbbrvName>

<DefaultFullName>July</DefaultFullName>

</Month>

<Month>

<MonthID>aug</MonthID>

<DefaultAbbrvName>Aug</DefaultAbbrvName>

<DefaultFullName>August</DefaultFullName>

</Month>

<Month>

<MonthID>sep</MonthID>

<DefaultAbbrvName>Sep</DefaultAbbrvName>

<DefaultFullName>September</DefaultFullName>

</Month>

<Month>

<MonthID>oct</MonthID>

<DefaultAbbrvName>Oct</DefaultAbbrvName>

<DefaultFullName>October</DefaultFullName>

</Month>

<Month>

<MonthID>nov</MonthID>

<DefaultAbbrvName>Nov</DefaultAbbrvName>

<DefaultFullName>November</DefaultFullName>

</Month>

<Month>

<MonthID>dec</MonthID>

<DefaultAbbrvName>Dec</DefaultAbbrvName>

<DefaultFullName>December</DefaultFullName>

</Month>

</MonthsOfYear>
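
As an illustration of the capitalization rule, a Spanish locale would keep the IDs untouched and write the names in lower case. The fragment below is only a sketch showing one day and one month. The Spanish abbreviations are an assumption made for this example, not values copied from an existing locale file.

```xml
<Day>
<!-- The ID stays exactly as in the reference file -->
<DayID>mon</DayID>
<DefaultAbbrvName>lun</DefaultAbbrvName>
<DefaultFullName>lunes</DefaultFullName>
</Day>

<Month>
<MonthID>jan</MonthID>
<!-- Spanish month names are not capitalized -->
<DefaultAbbrvName>ene</DefaultAbbrvName>
<DefaultFullName>enero</DefaultFullName>
</Month>
```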

 

  • Change the following if your language uses different names for the eras of the Gregorian calendar (before and after Christ).

 

<Eras>

<Era>

<EraID>bc</EraID>

<DefaultAbbrvName>BC</DefaultAbbrvName>

<DefaultFullName>Before Christ</DefaultFullName>

</Era>

<Era>

<EraID>ad</EraID>

<DefaultAbbrvName>AD</DefaultAbbrvName>

<DefaultFullName>Anno Domini</DefaultFullName>

</Era>

</Eras>
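
For example, Spanish uses "antes de Cristo" and "después de Cristo" for the two eras. A Spanish version of the block could look like the sketch below. The exact abbreviations are an assumption for illustration, and the <EraID> values are left untranslated in the same way as the day and month IDs.

```xml
<Eras>
<Era>
<EraID>bc</EraID>
<DefaultAbbrvName>a. C.</DefaultAbbrvName>
<DefaultFullName>antes de Cristo</DefaultFullName>
</Era>
<Era>
<EraID>ad</EraID>
<DefaultAbbrvName>d. C.</DefaultAbbrvName>
<DefaultFullName>después de Cristo</DefaultFullName>
</Era>
</Eras>
```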

 

  • In the US, the week starts on Sunday, but in Spain and other countries, the week is considered to start on Monday. Select the correct one for your language. Indicate it with the corresponding three-letter “DayID”, as given above in the list of the days of the week.

 

<StartDayOfWeek><DayID>sun</DayID></StartDayOfWeek>

 

  • At the beginning of a month, how many days does a week need to have in order to be considered a working week?

 

<MinimalDaysInFirstWeek>1</MinimalDaysInFirstWeek>

</Calendar>

</LC_CALENDAR>
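
As a sketch of the start-of-week setting, a locale in which the week starts on Monday (Spain, for instance) would presumably replace the <StartDayOfWeek> line of the reference file with the one below. The value has to be one of the <DayID> entries defined in <DaysOfWeek>, not the translated day name.

```xml
<!-- Week starts on Monday: use the untranslated DayID "mon" -->
<StartDayOfWeek><DayID>mon</DayID></StartDayOfWeek>
```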

 

  • The currency ID can be found in this graph… or in:

 

http://nsdsa.phdnswc.navy.mil/mspecs/docs/styleman2000/chapter_txt-17.html#17t6

 

The currency symbol should be written in your own script.

 

The currency code (<BankSymbol>), <CurrencyName> and <DecimalPlaces> come from the ISO 4217 list, which can be found here. If your currency is a new one and is not listed there, you should try to find the data yourself in your country, because if you go to the maintainer of the standard (BSI Global), they will make you PAY for the data.

 

 

<LC_CURRENCY>

<Currency default="true" usedInCompatibleFormatCodes="true">

<CurrencyID>dollar</CurrencyID>

<CurrencySymbol>$</CurrencySymbol>

<BankSymbol>USD</BankSymbol>

<CurrencyName>US Dollar</CurrencyName>

<DecimalPlaces>2</DecimalPlaces>

</Currency>

</LC_CURRENCY>
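
To show how the fields map onto ISO 4217 data, here is a sketch for a locale that uses the euro. EUR and the two decimal places are the ISO 4217 values. The <CurrencyID> value "euro" is only a hypothetical name chosen to mirror "dollar" in the reference file, and the attribute values are simply copied from that file.

```xml
<LC_CURRENCY>
<Currency default="true" usedInCompatibleFormatCodes="true">
<!-- Hypothetical ID, modelled on "dollar" above -->
<CurrencyID>euro</CurrencyID>
<CurrencySymbol>€</CurrencySymbol>
<!-- ISO 4217 alphabetic code -->
<BankSymbol>EUR</BankSymbol>
<CurrencyName>Euro</CurrencyName>
<!-- ISO 4217 number of decimals for EUR -->
<DecimalPlaces>2</DecimalPlaces>
</Currency>
</LC_CURRENCY>
```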

 

  • The next section is used to specify character conversion algorithms. Languages that use Latin characters use upper-to-lowercase conversion algorithms, such as the ones listed in this file. Some languages, such as Japanese, Chinese or Korean, have complicated transliteration schemes. Specific transliteration could be defined for other scripts, but so far none has been. Transliteration procedures need to be written before they are included here. If your language is not Chinese, Korean or Japanese, just leave this as it is; if it is one of them, it has already been done. If you really want to do transliteration, please look into:

http://l10n.openoffice.org/i18n_framework/HowToAddLocaleInI18n.html

 

<LC_TRANSLITERATION>

<Transliteration unoid="LOWERCASE_UPPERCASE"/>

<Transliteration unoid="UPPERCASE_LOWERCASE"/>

<Transliteration unoid="IGNORE_CASE"/>

</LC_TRANSLITERATION>

 

 

  • Write here the translation of the words “true” and “false” into your language.

 

<LC_MISC>

<ReservedWords>

<trueWord>true</trueWord>

<falseWord>false</falseWord>

 

  • Translate the following four expressions using whatever term is most common in your language: quarters, trimesters, or any other name that your language gives to the four three-month periods of a year.

 

<quarter1Word>1st quarter</quarter1Word>

<quarter2Word>2nd quarter</quarter2Word>

<quarter3Word>3rd quarter</quarter3Word>

<quarter4Word>4th quarter</quarter4Word>

 

  • Translate the words “above” and “below”.

 

<aboveWord>above</aboveWord>

<belowWord>below</belowWord>

 

  • These are the abbreviated names of the four quarters or trimesters. Don’t let this bother you: if you have no idea, leave it as it is or change it to T1, T2….

 

<quarter1Abbreviation>Q1</quarter1Abbreviation>

<quarter2Abbreviation>Q2</quarter2Abbreviation>

<quarter3Abbreviation>Q3</quarter3Abbreviation>

<quarter4Abbreviation>Q4</quarter4Abbreviation>

</ReservedWords>

</LC_MISC>
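
Put together, a translated <ReservedWords> block might look like the Spanish sketch below. The wording is just one possible translation chosen for illustration and is not taken from an existing locale file. The abbreviations follow the T1, T2… suggestion made above.

```xml
<LC_MISC>
<ReservedWords>
<trueWord>verdadero</trueWord>
<falseWord>falso</falseWord>
<!-- Spanish speaks of "trimestres" rather than quarters -->
<quarter1Word>1.er trimestre</quarter1Word>
<quarter2Word>2.º trimestre</quarter2Word>
<quarter3Word>3.er trimestre</quarter3Word>
<quarter4Word>4.º trimestre</quarter4Word>
<aboveWord>arriba</aboveWord>
<belowWord>abajo</belowWord>
<quarter1Abbreviation>T1</quarter1Abbreviation>
<quarter2Abbreviation>T2</quarter2Abbreviation>
<quarter3Abbreviation>T3</quarter3Abbreviation>
<quarter4Abbreviation>T4</quarter4Abbreviation>
</ReservedWords>
</LC_MISC>
```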

 

  • The next section relates to numbering styles for paragraphs. In the locale file we will define what styles will be included in OpenOffice.org Writer in Format-->Bullets and Numbering-->Numbering type tab. Each line in the locale file defines one of the 8 squares on this page. For each of them it defines three things: the style of numbering, the character to be placed before the number (prefix) and the character to be placed after the number (suffix). The NumType value refers to a list of numbering types that is defined in:

offapi/com/sun/star/style/NumberingType.idl

 

and that includes some traditional Western types of numbering (Latin letters, Arabic numerals, Roman numerals), the possibility of using numbers in the script of the locale (number 12), and the creation of specific numbering series (for example, Thai letters). In our reference file, 4 refers to Arabic numerals (as used in English), 2 to uppercase Roman numerals, 0 to uppercase Latin letters, 1 to lowercase Latin letters and 3 to lowercase Roman numerals. You should define here the method that best fits your culture. Some locales replace 4 with 12, so that the first two or three styles use local numbering, and then use other methods for the rest. You should evaluate, for example, whether Roman numerals are known in your culture.

 

<LC_NumberingLevel>
<NumberingLevel Prefix=" " NumType="4" Suffix=")" />
<NumberingLevel Prefix=" " NumType="4" Suffix="." />
<NumberingLevel Prefix="(" NumType="4" Suffix=")" />
<NumberingLevel Prefix=" " NumType="2" Suffix="." />
<NumberingLevel Prefix=" " NumType="0" Suffix=")" />
<NumberingLevel Prefix=" " NumType="1" Suffix=")" />
<NumberingLevel Prefix="(" NumType="1" Suffix=")" />
<NumberingLevel Prefix=" " NumType="3" Suffix="." />
</LC_NumberingLevel>
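
As a sketch of the substitution described above, a locale with its own digits could switch the first three styles from NumType="4" to NumType="12" and keep the remaining five lines of the reference file. Whether Roman numerals and Latin letters make sense in the other lines depends on your culture, so treat this only as a starting point.

```xml
<LC_NumberingLevel>
<!-- 12 = numbers in the script of the locale, replacing 4 (Arabic numerals) -->
<NumberingLevel Prefix=" " NumType="12" Suffix=")" />
<NumberingLevel Prefix=" " NumType="12" Suffix="." />
<NumberingLevel Prefix="(" NumType="12" Suffix=")" />
<!-- The remaining styles are unchanged from the reference file -->
<NumberingLevel Prefix=" " NumType="2" Suffix="." />
<NumberingLevel Prefix=" " NumType="0" Suffix=")" />
<NumberingLevel Prefix=" " NumType="1" Suffix=")" />
<NumberingLevel Prefix="(" NumType="1" Suffix=")" />
<NumberingLevel Prefix=" " NumType="3" Suffix="." />
</LC_NumberingLevel>
```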

 

  • Outline numbering styles are defined in the next section. In the locale file we will define what styles will be included in OpenOffice.org Writer in Format-->Bullets and Numbering-->Outline tab. Each <OutlineStyle> block in the locale file defines one of the 8 squares on this page. Each block has five lines defining the first five levels of heading. The first line of a block, for example, defines how paragraphs with style 'Heading 1' will be numbered (including the number style and the characters to be placed before and after the number), whether a bullet character should be used, the left margin of numbered Heading 1 paragraphs, etc. For subsequent levels (the other lines), it is important to say how many parent heading levels will be shown in the number. For example, if we are defining the numbering of level 3 (Heading 3), ParentNumbering could be 0 (in which case only one number is shown), 1 (two numbers shown, as in 1.1) or 2, in which case we will have numbers of the style 1.1.1.

If you want to use numbers in the script of the locale, you need to use NumType="12". Information about the different number styles is available in:

 

offapi/com/sun/star/style/NumberingType.idl

 

<LC_OutLineNumberingLevel>
<OutlineStyle>
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="0" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="1" LeftMargin="50" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="1" Suffix=")" BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="100" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="150" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="200" SymbolTextDistance="50" FirstLineOffset="0" />
</OutlineStyle>
<OutlineStyle>
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="0" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="1" Suffix=")" BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="50" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="100" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="150" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="200" SymbolTextDistance="50" FirstLineOffset="0" />
</OutlineStyle>
<OutlineStyle>
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="0" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix="(" NumType="1" Suffix=")" BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="50" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="3" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="100" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="0" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="150" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix="." BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="200" SymbolTextDistance="50" FirstLineOffset="0" />
</OutlineStyle>
<OutlineStyle>
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="0" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="50" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="100" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="150" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="200" SymbolTextDistance="50" FirstLineOffset="0" />
</OutlineStyle>
<OutlineStyle>
<OutLineNumberingLevel Prefix=" " NumType="2" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="0" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="0" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="50" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="3" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="100" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="1" Suffix=")" BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="150" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="200" SymbolTextDistance="50" FirstLineOffset="0" />
</OutlineStyle>
<OutlineStyle>
<OutLineNumberingLevel Prefix=" " NumType="0" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="0" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="2" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="50" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="1" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="100" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="3" Suffix="." BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="150" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="200" SymbolTextDistance="50" FirstLineOffset="0" />
</OutlineStyle>
<OutlineStyle>
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix=" " BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="0" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix=" " BulletChar="0020" BulletFontName="" ParentNumbering="1" LeftMargin="50" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix=" " BulletChar="0020" BulletFontName="" ParentNumbering="2" LeftMargin="100" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix=" " BulletChar="0020" BulletFontName="" ParentNumbering="3" LeftMargin="150" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix=" " BulletChar="0020" BulletFontName="" ParentNumbering="4" LeftMargin="200" SymbolTextDistance="50" FirstLineOffset="0" />
</OutlineStyle>
<OutlineStyle>
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="27A2" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="0" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="E006" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="50" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=")" BulletChar="E004" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="100" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="150" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="200" SymbolTextDistance="50" FirstLineOffset="0" />
</OutlineStyle>
</LC_OutLineNumberingLevel>
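
To make the ParentNumbering attribute more concrete, the sketch below is one <OutlineStyle> block in which the first three heading levels show an increasing number of parent levels and the last two fall back to bullets. All attribute values are taken from lines that already appear in the reference file above. The comments are only explanatory and describe the behaviour as explained in the text.

```xml
<OutlineStyle>
<!-- Heading 1: only its own number, style "1" -->
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix=" " BulletChar="0020" BulletFontName="" ParentNumbering="0" LeftMargin="0" SymbolTextDistance="50" FirstLineOffset="0" />
<!-- Heading 2: one parent level shown, style "1.1" -->
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix=" " BulletChar="0020" BulletFontName="" ParentNumbering="1" LeftMargin="50" SymbolTextDistance="50" FirstLineOffset="0" />
<!-- Heading 3: two parent levels shown, style "1.1.1" -->
<OutLineNumberingLevel Prefix=" " NumType="4" Suffix=" " BulletChar="0020" BulletFontName="" ParentNumbering="2" LeftMargin="100" SymbolTextDistance="50" FirstLineOffset="0" />
<!-- Headings 4 and 5: bullet character instead of a number -->
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="150" SymbolTextDistance="50" FirstLineOffset="0" />
<OutLineNumberingLevel Prefix=" " NumType="6" Suffix=" " BulletChar="2022" BulletFontName="StarSymbol" ParentNumbering="0" LeftMargin="200" SymbolTextDistance="50" FirstLineOffset="0" />
</OutlineStyle>
```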

 

</Locale>

 

And you are finished. Save your file, check it a couple of times and then submit it as an ENHANCEMENT issue against the Localization (L10n) project. To submit an issue, you first need to log in to the OpenOffice website, then hit File Issue on the left-hand menu… go to proceed on the next page… click on the component l10n on the next one… and you are ready to file it. Select version current, subcomponent code, type ENHANCEMENT, Summary Locale file for language…., and hit Submit. The system will ask you if you want to attach a file and what type. Attach the file that you have been working on, submit it… and you are done.

 

If you would - nevertheless - like to prepare a more developed locale, please look at the following documents:

 

http://l10n.openoffice.org/i18n_framework/HowToAddLocaleInI18n.html

 

http://l10n.openoffice.org/source/browse/l10n/i18npool/source/localedata/data/locale.dtd

 

http://l10n.openoffice.org/i18n_framework/LocaleData.html

 

http://api.openoffice.org/docs/common/ref/com/sun/star/i18n/NumberFormatIndex.html