Phrase is a localization platform supporting a large number of resource file formats used in the most popular platforms and languages. It detects keys, values, and descriptions. Where appropriate, it can also help translators handle pluralization.
Phrase is a localization platform created for the unique needs of software developers. In a preparatory internationalization step, developers pull all displayable text out of their code and store it in a so-called resource or localization file.
What is a localization file?
Here, we are talking about text files – files that can be opened and edited in a text editor such as Notepad or TextEdit or one of the myriad enhanced text editing tools used by programmers. These files generally follow the key-value principle. This means that they contain a list of text snippets ("strings") that are associated with unique IDs ("keys"). Each string is thus a "value" of a key:
key1 = value1
key2 = value2
keyN = valueN
This simple example actually shows the format of real localization files used in Java programming. Below, we will also discuss other formats that deviate from this bare-bones structure to include more information about the strings.
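To make the key-value principle concrete, here is a minimal Python sketch that reads lines of this shape into a dictionary. The function name `parse_key_value` is our own, and the sketch deliberately ignores features of real Java properties files such as escape sequences and line continuations.

```python
def parse_key_value(text: str) -> dict[str, str]:
    """Parse 'key = value' lines into a dictionary (simplified sketch)."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines starting with '#' or '!'.
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' separator
            entries[key.strip()] = value.strip()
    return entries


sample = "key1 = value1\nkey2 = value2\nkeyN = valueN"
strings = parse_key_value(sample)
print(strings["key2"])  # value2
```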
How are localization files created?
Given that localization files are plain-text files with relatively simple structure, it is certainly possible to create them by hand. Most of the time, however, they are automatically generated by "internationalization" utilities or scripts that are available for the different development environments. Automatic creation of localization files relieves the programmers of highly tedious work and guarantees that the structure of the files is valid.
To create a localization file, all pieces of displayable text are replaced with unique IDs in the code files. Then the text strings are added to the localization file with their IDs.
How are localization files used?
Instead of the actual text strings, the code now contains only keys. Thus, when the software generates a view for the user, it first uses these keys to look up the associated strings in the localization file. So far, so unexciting. But now we get to the nifty part that is the reason for going through all this in the first place...
Let's say your software is set up to be used in English and Spanish. You might keep all of your English text in a file called English.txt. This may be your default text location. This means that if your user does not select a language, all text will be pulled from this file to generate any display. However, if the user selects Spanish, the software is redirected to Spanish.txt instead. Obviously, you can add as many languages as you wish with a system like this.
The advantage of doing this is that the choice of language for the display does not affect the code. If the software needs to display a login button, it may require the string associated with the key "login_button". It just needs to know in which file to look to retrieve the appropriate string for the given language.
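The lookup described above might be sketched as follows. The dictionaries stand in for English.txt and Spanish.txt, and the names `CATALOGS`, `DEFAULT_LANGUAGE`, and `translate` are hypothetical. Falling back to the default language when a key is missing in the selected language is our own assumption, not something every framework does.

```python
from typing import Optional

# Hypothetical in-memory stand-ins for English.txt and Spanish.txt.
CATALOGS = {
    "en": {"login_button": "Log in", "logout_button": "Log out"},
    "es": {"login_button": "Iniciar sesión"},
}
DEFAULT_LANGUAGE = "en"


def translate(key: str, language: Optional[str] = None) -> str:
    """Return the string for `key` in `language`, with a default-language fallback."""
    catalog = CATALOGS.get(language or DEFAULT_LANGUAGE, {})
    if key in catalog:
        return catalog[key]
    # Assumption: fall back to the default catalog, then to the key itself.
    return CATALOGS[DEFAULT_LANGUAGE].get(key, key)


print(translate("login_button", "es"))
print(translate("login_button"))
```

The calling code only ever mentions the key `login_button`. Which language is shown is decided entirely by which catalog is consulted.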
How does Phrase work with localization files?
As a key-based translation platform, Phrase supports many different resource file types. After files are uploaded to Phrase, it extracts the keys and their associated string values. Then, Phrase presents the keys and strings to the translator in a standard format. Translators can thus focus on their task without having to worry about the exact format of the localization file. They can still inspect the keys, since a key often provides crucial context and guides them toward the correct word choice.
When all strings are translated, files are downloaded from Phrase. In the process, Phrase recreates the needed localization file format that matches the original source file.
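The idea of a format-neutral middle step can be illustrated with a small sketch. This is not how Phrase is implemented. It only shows how one dictionary of translated strings could be written back out in two of the formats discussed in this article. The exporter names and the `EXPORTERS` table are hypothetical, and real Android files need additional escaping (for apostrophes, for instance) that is omitted here.

```python
from xml.sax.saxutils import escape, quoteattr


def to_key_value(entries: dict[str, str]) -> str:
    """Write entries as plain 'key = value' lines."""
    return "\n".join(f"{key} = {value}" for key, value in entries.items())


def to_android_xml(entries: dict[str, str]) -> str:
    """Write entries as <string> elements inside a <resources> root."""
    lines = ["<resources>"]
    for key, value in entries.items():
        # quoteattr adds the surrounding quotes; escape handles &, <, >.
        lines.append(f"  <string name={quoteattr(key)}>{escape(value)}</string>")
    lines.append("</resources>")
    return "\n".join(lines)


# Hypothetical dispatch table: the export format is chosen by name.
EXPORTERS = {"key_value": to_key_value, "android_xml": to_android_xml}

translated = {"simple_key": "Nur ein Schlüssel mit einer Nachricht."}
print(EXPORTERS["android_xml"](translated))
```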
What formats does Phrase support?
Phrase supports four broad types of resource files. These are all essentially text-based, i.e. you could open and inspect them with a text editor. XLSX looks like an exception, but it is really XML within a zip file.
Phrase supports XLSX and CSV files. These formats are equivalent for localization purposes and contain rows of key-value pairs: the keys are in one column, while the corresponding values are in an adjacent column. Which exact column is used for which purpose depends on the application, and a localizer needs to configure Phrase to interpret the columns correctly. Zendesk CSV files have a fixed structure, so this file type does not require further adjustments:
"Title","Default language","Default text","English text","Variant status"
"simple_key","German","Einfacher Schlüssel.","Simple key.","Current"
XML is a format that offers meta information in the form of <tags>. This means that Phrase can use the tag structure to determine where the keys and their corresponding values are, as shown here from an Android XML file:
<string name="simple_key">Just a key with a message.</string>
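As a rough illustration of how tags make keys and values easy to locate, the following sketch uses Python's standard `xml.etree.ElementTree` module. The `<resources>` wrapper follows the usual Android layout, and the function name `read_android_strings` is our own.

```python
import xml.etree.ElementTree as ET

ANDROID_XML = """<resources>
  <string name="simple_key">Just a key with a message.</string>
</resources>"""


def read_android_strings(xml_text: str) -> dict[str, str]:
    """Map each <string> element's name attribute to its text content."""
    root = ET.fromstring(xml_text)
    return {el.get("name"): el.text or "" for el in root.iter("string")}


print(read_android_strings(ANDROID_XML))
```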
Two standard translation formats are based on XML: TMX and XLIFF. These hold not only keys and values in one language but also pair values from a source language with the corresponding values from a target language. Such files are typically bilingual, as this translation unit from a Symfony XLIFF file shows:
<trans-unit id="simple_key" resname="simple_key">
<source xml:lang="de-DE">Nur ein einfacher Schlüssel mit einer einfachen Nachricht.</source>
<target xml:lang="en-GB">Just a simple key with a simple message.</target>
</trans-unit>
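A bilingual unit like this one can be read with the same standard-library parser. The sketch below works on the isolated fragment only. A complete XLIFF file normally declares a namespace on its elements, which would require namespace-qualified lookups. The function name `read_unit` is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

UNIT = """<trans-unit id="simple_key" resname="simple_key">
  <source xml:lang="de-DE">Nur ein einfacher Schlüssel mit einer einfachen Nachricht.</source>
  <target xml:lang="en-GB">Just a simple key with a simple message.</target>
</trans-unit>"""


def read_unit(xml_text: str) -> dict[str, str]:
    """Return the key plus the source and target strings, indexed by language."""
    unit = ET.fromstring(xml_text)
    source = unit.find("source")
    target = unit.find("target")
    return {
        "key": unit.get("id"),
        source.get(XML_LANG): source.text,
        target.get(XML_LANG): target.text,
    }


print(read_unit(UNIT))
```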
Qt programs use resource files with a structure that is very similar to these standardized formats, but for historical reasons they have a somewhat different layout.
Plain key-value lists
First, there are resource files that contain just simple listings of keys and values, as this snippet from a Ruby on Rails YAML shows:
simple_key: Just a simple key with a simple message.
Many different programming languages or platforms use such formats with minor layout differences.
Since these are monolingual files, a localization program needs to maintain parallel versions of such files: one for the source language and one for each target language.
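Keeping parallel files in sync mostly comes down to comparing their key sets. Assuming each file has already been loaded into a dictionary, a minimal check could look like this. The function name `compare_keys` and the sample data are hypothetical.

```python
def compare_keys(source: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Report keys still untranslated and keys that no longer exist in the source."""
    return {
        "missing_in_target": sorted(set(source) - set(target)),
        "obsolete_in_target": sorted(set(target) - set(source)),
    }


english = {"simple_key": "Just a simple key with a simple message.", "new_key": "New text"}
german = {"simple_key": "Nur ein einfacher Schlüssel.", "old_key": "Alter Text"}
print(compare_keys(english, german))
```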
Gettext produces key-value files containing additional information, such as descriptive comments or plural variants:
# This is the amazing description for this key!
msgstr[0] "Check it out! This key has a description! (At least in some formats)"
msgstr[1] "Check it out! This key has %s descriptions! (At least in some formats)"
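How software picks between such plural variants can be sketched in a few lines. The rule below (singular for exactly one, plural otherwise) is the English-style rule and is only an assumption for this example. Other languages define different rules, and real Gettext tooling reads the rule from the file header instead of hard-coding it. The names `plural_index` and `render` are hypothetical.

```python
PLURAL_FORMS = [
    "Check it out! This key has a description! (At least in some formats)",
    "Check it out! This key has %s descriptions! (At least in some formats)",
]


def plural_index(count: int) -> int:
    """English-style rule (assumption): form 0 for one item, form 1 otherwise."""
    return 0 if count == 1 else 1


def render(forms: list[str], count: int) -> str:
    """Pick the plural form for `count` and fill in the %s placeholder if present."""
    template = forms[plural_index(count)]
    return template % count if "%s" in template else template


print(render(PLURAL_FORMS, 1))
print(render(PLURAL_FORMS, 3))
```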
Again, there are competing formats with similar functionality and layouts that vary in relatively minor ways.
For example, go-i18n JSON refers to keys as "id":
"id": "simple_key",
"translation": "simple key, simple message, so simple."
while Angular uses the keys themselves as the property names in its JSON objects:
"simple_key": "I am a simple key with a simple message."
Since these differences are minor but crucial, it is very helpful that Phrase directly supports widely used JSON and PHP array structures.
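To see why these layout differences matter to a tool, consider a sketch that brings both JSON variants into one common dictionary. It assumes that the go-i18n file is a list of objects with "id" and "translation" fields and that the Angular-style file is a flat object. The function name `normalize` is hypothetical.

```python
import json

# Assumed layouts for the two JSON variants.
GO_I18N_JSON = '[{"id": "simple_key", "translation": "simple key, simple message, so simple."}]'
FLAT_JSON = '{"simple_key": "I am a simple key with a simple message."}'


def normalize(json_text: str) -> dict[str, str]:
    """Convert either layout into a plain key-to-string dictionary."""
    data = json.loads(json_text)
    if isinstance(data, list):
        # List of objects: the key lives in the "id" field.
        return {item["id"]: item["translation"] for item in data}
    # Flat object: the property names are the keys already.
    return dict(data)


print(normalize(GO_I18N_JSON))
print(normalize(FLAT_JSON))
```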
Our full listing of supported file types contains links to pages for each file type with a detailed code snippet.