Strings and keys are the fundamental building blocks of software localization projects. Keys are the IDs used to reference pieces of displayable text or strings. Organized like this, software becomes localizable while remaining maintainable.
In a successful localization project, strings and keys are handled correctly and efficiently. Ehm, come again? What are “strings” and “keys”? What do you need to do with them? And what are best practices here?
What is a string?
In programming jargon, a string is a series of characters. In other words, a string is a piece of text. It can be just one character long (“a”), or it can contain several words (“this is a string”) or whole sentences. In theory, there is no upper limit on string length. In practice, software displays strings of reasonable length.
The dialog here shows six different strings:
What are keys?
When you localize a software application, your goal is to present it in the language of your end-user. This means that all strings in the user interface need to be translated. In an initial step – internationalization – these strings are replaced with unique IDs or “keys”. The keys are then stored in so-called “resource files” together with their associated strings. In Phrase, we refer to these string-key pairs as "translations" - whether we are talking about the source language or a target language. When the time comes to show a screen to your end-user, your code will use the keys to pull the associated strings from appropriate translations and display them.
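To make the lookup step concrete, here is a minimal sketch of how code might resolve a key to a string. The `RESOURCES` dictionary stands in for real resource files, and the key names, the `translate` function, and the fallback behavior are all assumptions made for illustration, not how Phrase or any particular framework does it.

```python
# Hypothetical in-memory stand-in for per-language resource files.
# Each language maps the same keys to its own strings.
RESOURCES = {
    "en": {"button_save": "Save", "button_cancel": "Cancel"},
    "de": {"button_save": "Speichern"},  # "button_cancel" not translated yet
}


def translate(key: str, locale: str, fallback: str = "en") -> str:
    """Return the string stored under `key` for `locale`.

    Falls back to the source language, then to the key itself, so a
    missing translation does not crash the user interface.
    """
    for lang in (locale, fallback):
        text = RESOURCES.get(lang, {}).get(key)
        if text is not None:
            return text
    return key


print(translate("button_save", "de"))
print(translate("button_cancel", "de"))
```

The code that draws the screen only ever mentions `button_save`; which text appears is decided entirely by the resource data.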
Benefits of working with key-string pairs
You might wonder why developers go to these lengths. Why don’t they just keep the strings in their code and display them directly? There are several reasons, and only some involve localization:
- text changes don’t require recompilation
- text changes shouldn’t break your code, so less testing is needed
- easier hand-off to translators
- translators can focus on strings without getting sidetracked by code
What should strings look like?
Ideally, each string should be self-contained. If we take one of the strings in the above screenshot as an example, we could conceivably handle it as a concatenation of two strings and a variable:
s1 = "Save changes to document"
s2 = "before closing"
Display: s1 + FILENAME + s2
The problem with this approach is that the translator sees two incomplete strings without knowing how they fit together. As a result, the combined sentence will likely sound unnatural or even ungrammatical. Keeping the string together is the better solution, as you can see in the resource file excerpt below. Here, the filename is inserted into a complete sentence via a variable.
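The difference between the two approaches can be sketched in a few lines. This sketch uses Python's `{doc}` placeholder syntax rather than the `$(DOC)` style seen in resource files, and the `TEMPLATES` dictionary and `render` function are hypothetical names chosen for the example.

```python
filename = "report.txt"

# Fragile: two fragments glued around the variable. A translator receives
# "Save changes to document " and " before closing?" as separate items
# and has to guess how they combine.
s1 = "Save changes to document "
s2 = " before closing?"
concatenated = s1 + filename + s2

# Sturdier: one complete sentence per language with a named placeholder.
# Each translation can place the placeholder wherever its grammar needs it.
TEMPLATES = {
    "en": "Save changes to document “{doc}” before closing?",
    "de": "Änderungen am Dokument „{doc}“ vor dem Schließen speichern?",
}


def render(locale: str, doc: str) -> str:
    """Fill the placeholder in the sentence for the given locale."""
    return TEMPLATES[locale].format(doc=doc)


print(render("en", filename))
print(render("de", filename))
```

With the template approach, the translator works on a whole sentence and can also adapt the quotation marks to the target language.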
What should keys look like?
There are different naming conventions for keys - each has its advantages and disadvantages and requires a different type of support from the code and the localization platform.
Source Strings as Keys
The screenshot above shows a .po file – a very common resource file format. The present example shows a file holding German strings that are marked as “msgstr.” Each string is associated with its key, called “msgid.” The original English here serves as a lookup key. This method has a clear advantage: it is easy to figure out what each msgstr means, because you have the source text right next to it. The main disadvantage is also relatively obvious: even if you just make a minor adjustment to an English string (such as fixing a typo, or inserting a comma), you thereby invalidate all translations. After all, with the new, fixed string, you will not be able to look up existing translations that are still associated with the old source string as key. This means that any change in the source language must immediately be followed by adjustments to all resource files that contain this key. At the very least, you need to carefully keep track of all adjustments if you do not want to end up with lots of strings and translations without a clear relationship.
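The fragility described here is easy to reproduce. The following sketch imitates a catalog keyed by source strings with a plain dictionary; the strings and the `lookup` function are made up for the example, and the fallback to the untranslated source text mirrors what gettext-style lookups commonly do.

```python
# German catalog keyed by the English source string (msgid -> msgstr).
CATALOG_DE = {
    "Save changes to document?": "Änderungen am Dokument speichern?",
}


def lookup(msgid: str) -> str:
    """Return the translation for `msgid`, or `msgid` itself if none exists."""
    return CATALOG_DE.get(msgid, msgid)


# The original source string finds its translation.
before_edit = lookup("Save changes to document?")

# A small wording fix in the English text changes the key, so the
# existing German translation is no longer found.
after_edit = lookup("Save changes to the document?")

print(before_edit)
print(after_edit)
```

After the edit, the German user silently gets English text until the catalog is updated to the new source string.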
The issue of strings as keys arises specifically with file formats similar to .po, such as .properties, .strings, .yaml, and different varieties of JSON. Such files are monolingual, which means that for each target language, you need one resource file (or a set thereof). And using the string as a key is the simplest way of keeping the file readable and translatable: just populate the string fields with the translations for the associated key.
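Bilingual formats such as XLIFF take a different approach: each unit carries its own ID alongside the source and target strings.
<trans-unit id="querysavedialog_query">
<source xml:lang="en-US">Save changes to document “$(DOC)” before closing?</source>
<target xml:lang="de-DE">Änderungen am Dokument „$(DOC)“ vor dem Schließen speichern?</target>
</trans-unit>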
In formats like this, the key is logically independent of the string it refers to. Best practice, however, asks the developer to make the key as descriptive as possible, because the maintainability of the code is at stake: it is much easier for a developer to read code that says something like display('querysavedialog_query') than an abstraction such as display('wFiA8').
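As a rough illustration of a key that is independent of its string, the sketch below reads a simplified XLIFF-style fragment with Python's standard `xml.etree.ElementTree` module. The fragment is an assumption made for the example: it treats English as the source language and omits the XML namespace that real XLIFF files declare, and `load_units` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment for illustration only.
XLIFF_FRAGMENT = """
<body>
  <trans-unit id="querysavedialog_query">
    <source xml:lang="en-US">Save changes to document “$(DOC)” before closing?</source>
    <target xml:lang="de-DE">Änderungen am Dokument „$(DOC)“ vor dem Schließen speichern?</target>
  </trans-unit>
</body>
"""


def load_units(xml_text: str) -> dict[str, dict[str, str]]:
    """Map each trans-unit id to its source and target strings."""
    root = ET.fromstring(xml_text)
    units = {}
    for unit in root.iter("trans-unit"):
        units[unit.get("id")] = {
            "source": unit.findtext("source", default=""),
            "target": unit.findtext("target", default=""),
        }
    return units


units = load_units(XLIFF_FRAGMENT)
print(units["querysavedialog_query"]["target"])
```

Because the lookup goes through the ID, the English wording can be corrected without orphaning the German translation.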
While most types of localization files are simple text files, some formats may hide surprising complexities. Above, we have a key querysavedialog_query, which refers to the query string within the querysavedialog, the dialog used to ask whether something should be saved. This means that the key contains a hierarchical structure. The file formats mentioned above, however, do not directly express this hierarchy - all key/value pairs are stored at the same level.
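When the hierarchy exists only in the naming convention, code has to recover it by parsing the key. A minimal sketch, assuming an underscore separates the dialog name from the string name; all keys other than querysavedialog_query, and the `group_by_prefix` helper, are invented for the example.

```python
from collections import defaultdict

# Flat key/value pairs, all stored at the same level.
FLAT_STRINGS = {
    "querysavedialog_query": "Save changes to document before closing?",
    "querysavedialog_yes": "Yes",
    "querysavedialog_no": "No",
    "opendialog_title": "Open file",
}


def group_by_prefix(strings: dict[str, str], sep: str = "_") -> dict[str, dict[str, str]]:
    """Group flat keys by the part before the first separator."""
    groups: dict[str, dict[str, str]] = defaultdict(dict)
    for key, text in strings.items():
        # partition() splits at the first separator only; a key without
        # a separator ends up in its own group under an empty name.
        prefix, _, name = key.partition(sep)
        groups[prefix][name] = text
    return dict(groups)


grouped = group_by_prefix(FLAT_STRINGS)
print(sorted(grouped))
print(sorted(grouped["querysavedialog"]))
```

This only works as long as every developer sticks to the same separator and ordering, which is exactly the kind of convention a flat format cannot enforce.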
JSON-based formats can store nested keys, that is, keys within keys. This makes it possible to express hierarchical relationships between keys in the structure of the localization file itself. For example, with a file containing
"querysave": "Do you want to save this file?"
you could retrieve the question "Do you want to save this file?" in a dialog for file operations in productXYZ by referencing the key
This provides opportunities for keeping different string contexts separate.
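A nested lookup might look like the following sketch. The intermediate level name `filedialog`, the dot-separated key notation, and the `lookup` function are assumptions made for illustration; different frameworks use different separators and file layouts.

```python
import json

# Hypothetical nested resource file; only "productXYZ" and "querysave"
# come from the example above, the "filedialog" level is invented.
RESOURCE_JSON = """
{
  "productXYZ": {
    "filedialog": {
      "querysave": "Do you want to save this file?"
    }
  }
}
"""


def lookup(tree: dict, dotted_key: str) -> str:
    """Walk the nested dictionaries one key segment at a time."""
    node = tree
    for part in dotted_key.split("."):
        node = node[part]  # raises KeyError if a segment is missing
    if not isinstance(node, str):
        raise KeyError(f"{dotted_key!r} names a group, not a string")
    return node


resources = json.loads(RESOURCE_JSON)
print(lookup(resources, "productXYZ.filedialog.querysave"))
```

Two dialogs can then both define a `querysave` entry without colliding, because the path to each entry is different.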
Certainly, with complex keys like this, you need complex mechanisms in the internationalization code, and you also need a sophisticated translation platform (such as Phrase) that helps translators use keys for reference.
Choice of a localization tool
The format choices for strings and keys are made on the battleground between maintainability and localizability. Each resource file format favors one side or another, or one specific aspect at the expense of others. And some developers choose a resource file format because they have a specific translation tool in mind. If you work in an environment with many developer groups, you will thus likely encounter many string and key formats.
Phrase is a localization management platform that is focused precisely on the close relationship of keys and strings that is crucial in software development. As such, it displays keys to the translator to provide important contextual clues. It also offers sophisticated mechanisms for searching keys and grouping strings by key patterns.