A translation termbase is a list of approved terms for a specific subject matter, paired with the corresponding terms in the target language and with guidelines on usage.
What is a Termbase?
Companies and industries tend to have their own languages. For example, a “screen” in an IT setting refers to a computer display, but in medicine it is a medical check. If you are dealing with translation technology, then “CAT” will most likely mean Computer-Aided Translation, but if you write marketing materials for Caterpillar, it is a short version of the company name. If a company decides to use the term “drive” in its manuals, then the term should not be replaced by “disc” or “HDD” on another page, because that invites misunderstandings. These are terminology issues of the kind that terminology management and termbases are meant to solve.
A term is not always a single word. It can be an expression of more than one word that illustrates a concept or a thing, or even a symbol (©, ®). A termbase is a reference document containing all such terms and instructions for proper usage. Companies often take measures to make it clear what terms should and should not be used. On the most basic level, it may be a simple list of guidelines, such as:
- Avoid using the term "screen," use "test" instead
- Use "drive," and not "disc" or "HDD"
- Use the full name "Caterpillar" in most cases, but the short version "CAT" is allowed for brevity.
This sort of basic monolingual termbase may be enough at first, but as more and more terms require explanations, a better format is needed, for example:
|Term||Definition||Usage note|
|CAT||Short for Caterpillar||Use only when required for brevity.|
|Caterpillar||Full company name||Use in all cases, except where space is limited.|
|Drive||Data storage unit||Preferred term in all communications.|
|Disc||Data storage unit||Avoid, use “drive” instead.|
|HDD||Hard Disc Drive. Data storage unit||Avoid, use “drive” instead.|
|Screen||May refer to display unit or medical examination.||Use only in the meaning of a display unit.|
|Test||Medical examination.||Preferred generic term for all types of medical examinations.|
The information is the same, but it is presented in a more organized manner. There are three columns (known as data categories), but adding more would allow for even more details, as indicated below:
|Term||Part of speech||Example of usage|
|CAT||Noun or adjective||Noun: The latest catalog from CAT. Adjective: The latest CAT catalog.|
|Caterpillar||Noun or adjective||Noun: The latest catalog from Caterpillar. Adjective: The latest Caterpillar catalog.|
|Drive||Noun||Check if the drive is connected to a power source.|
|Screen||Noun, do not use as verb||The information is displayed on a screen.|
|Test||Noun or verb||Noun: Results of the test will be available tomorrow. Verb: We need to test for the presence of antibodies in the organism.|
As you can see, additional information can make the correct usage of terms clearer. On the other hand, the task of collecting, developing, storing, reviewing, synchronizing, updating and distributing terminology becomes quite complex as more terms are added. Computer software helps manage terminology, whether in a simple multi-column spreadsheet or in sophisticated tools designed for this purpose. This work is referred to as terminology management, and the terminology data is stored in a terminology database, or termbase, also known as a glossary of terms or termbank.
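To make the idea of data categories concrete, here is a minimal sketch of a monolingual termbase held in code, with a check that flags terms marked "avoid" in a piece of text. The entries are copied from the tables above; the class name `TermEntry`, its field names and the function `find_avoided_terms` are illustrative assumptions, not part of any particular terminology tool.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TermEntry:
    term: str
    definition: str
    usage_note: str
    # Replacement to suggest when the term should be avoided (hypothetical field).
    preferred: Optional[str] = None


# Entries taken from the example tables; the structure itself is an assumption.
TERMBASE = [
    TermEntry("drive", "Data storage unit", "Preferred term in all communications."),
    TermEntry("disc", "Data storage unit", "Avoid, use 'drive' instead.", preferred="drive"),
    TermEntry("HDD", "Hard Disc Drive. Data storage unit", "Avoid, use 'drive' instead.", preferred="drive"),
]


def find_avoided_terms(text, termbase):
    """Return (found_term, suggested_replacement) pairs for avoided terms in text."""
    findings = []
    for entry in termbase:
        if entry.preferred is None:
            continue  # nothing to flag for preferred or neutral terms
        # Whole-word, case-insensitive match, so "disc" does not match "discuss".
        pattern = r"\b" + re.escape(entry.term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            findings.append((entry.term, entry.preferred))
    return findings


print(find_avoided_terms("Connect the HDD before you discuss the disc.", TERMBASE))
```

A spreadsheet with the same columns would serve equally well at this size; the point is only that each column becomes a named field that software can check against.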
Why is a Termbase necessary?
In high-risk fields such as medicine, the military and law, ambiguity or inconsistency can have especially serious consequences and could even cost lives. However, even with less sensitive material, terminology management brings clear benefits. Here are a few:
- Readability – Texts are easier to understand when there is no ambiguity or inconsistency. Readers will focus on what you are saying rather than struggle to get the intended meaning or stop reading because they don’t understand;
- Image – Consistent use of terminology indicates good writing and projects a professional and confident image of your organization;
- Subject matter expertise – Termbases are produced or approved by subject matter specialists, so the translators and editors who use them do not need to be domain experts themselves;
- String lengths – String lengths are often restricted in software, gaming and mobile applications. A termbase can hold pre-approved terms that fit within those limits;
- Productivity – Translators and editors work faster and more confidently when there is a clear reference for the terminology. When a well-built termbase is integrated with CAT tools, translators and editors see the terms automatically as they work instead of having to consult an external document or research online;
- Cost – Fewer quality complaints and corrections lower the total cost of translation projects because higher quality materials require less work from editors and proofreaders;
- Controlled language – Termbases are part of controlled language, which makes documentation easier and faster to index, search and manage.
What is a translation Termbase?
Things become even more complex when providing information in multiple languages. If you enforce standard terminology in all your source content, then you should enforce it in translations too. Otherwise, the same content in new languages will not be as consistent as the source material. What’s more, terminology inconsistencies often increase in translated versions, because there can be several ways to translate a given word or expression.
Types of information recorded in a termbase:
- Term in source language
- Definition in source language
- Examples of usage in source language
- Grammatical information
- Subject domain
- Term in target language
- Definition in target language
- Examples of usage in target language
- Internal links between related entries
- External links to information in other sources
- Metadata: date, location, person who last updated the term, etc.
- Hundreds of other categories are possible
Sometimes, under pressure to get their products to market as soon as possible, companies divide a large document into smaller parts and send them to different translators. Consistency issues multiply as more translators are involved, so a translation termbase becomes even more important.
Termbases can be monolingual (described earlier), bilingual (source and one target language) or multilingual (several languages). Translation termbases, translation memories and style guides are the three most common types of reference documents used in translation.
How to create and use a Termbase?
Termbases should be created before a translation begins rather than during it. At the same time, they are living documents and should be updated regularly as consensus develops among translators, editors, reviewers, subject matter experts, Language Service Providers (LSPs) and clients. While many parties contribute to their creation and maintenance, it is important to note that termbases belong to the client. A spreadsheet can be used to record terminology initially, for example by a writer. Later, however, it should be imported into a format appropriate for use in Computer-Aided Translation (CAT) tools.
If there is no existing termbase, the process starts with a review of your approved materials and common industry terminology, and with a scan of the new source content to identify candidate terms prior to translation. Much of this “term-mining” can be automated using terminology extraction tools, but approval needs to be done by humans.
Running a terminology extraction tool on a text (corpus) automatically produces a list of words and word combinations called “term candidates.” A terminologist will then go through the list to determine the words and phrases that should be included.
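As a rough illustration of what such a tool does, the sketch below counts single words and two-word sequences in a small corpus and keeps those that repeat. It is a deliberately naive frequency-based approach with a tiny, made-up stop list; real extraction tools rely on linguistic analysis and statistics well beyond this, and the function name and thresholds here are assumptions.

```python
import re
from collections import Counter

# A tiny illustrative stop list; a real tool would use a much fuller one.
STOPWORDS = {"the", "a", "an", "is", "to", "of", "and", "if", "in", "on", "it"}


def extract_term_candidates(corpus, max_words=2, min_count=2):
    """Return (candidate, count) pairs for word sequences seen at least min_count times."""
    counts = Counter()
    # Split on sentence-ending punctuation so candidates never span sentences.
    for sentence in re.split(r"[.!?]+", corpus.lower()):
        words = re.findall(r"[a-z][a-z-]*", sentence)
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                # Skip candidates that start or end with a stop word.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return [(term, c) for term, c in counts.most_common() if c >= min_count]


corpus = (
    "Check if the drive is connected to a power source. "
    "Disconnect the power source before you remove the drive."
)
for term, count in extract_term_candidates(corpus):
    print(count, term)
```

The output is only a list of candidates. Deciding whether "power source" deserves an entry while "power" and "source" do not is exactly the judgment left to the terminologist.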
Which terms should be included?
- Technical terms
- Company-specific or product-specific terms
- Acronyms
- Terms that caused confusion in the past
- Terms with a preferred or forbidden synonym
- Terms with multiple meanings
Linguists and translators then provide any remaining information: definition, context and usage. Finally, the client reviews and approves the terminology for each language. The review and approval step is critical and should be done by an in-house expert for each language who is based in the country where the translation will be used. Once finalized, the termbase should be converted into a format suitable for use in CAT tools. The open, XML-based TermBase eXchange (TBX) standard is the most common such format.
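The sketch below shows roughly what converting approved entries into an XML exchange file can look like. It is a simplified, TBX-style structure rather than a schema-valid TBX file: the element names (`conceptEntry`, `langSec`, `termSec`, `term`) follow our understanding of recent editions of the standard and should be checked against the specification, and a real file also needs a header and the appropriate namespace and dialect declarations. The function name and the sample German terms are illustrative.

```python
import xml.etree.ElementTree as ET


def build_tbx_like(entries, source_lang="en"):
    """Build a simplified, TBX-style XML tree from {language: term} dictionaries."""
    root = ET.Element("tbx", {"type": "TBX-Basic", "xml:lang": source_lang})
    body = ET.SubElement(ET.SubElement(root, "text"), "body")
    for number, terms_by_lang in enumerate(entries, start=1):
        # One entry per concept, holding one language section per language.
        concept = ET.SubElement(body, "conceptEntry", {"id": f"c{number}"})
        for lang, term in terms_by_lang.items():
            lang_sec = ET.SubElement(concept, "langSec", {"xml:lang": lang})
            term_sec = ET.SubElement(lang_sec, "termSec")
            ET.SubElement(term_sec, "term").text = term
    return root


# Sample entries for illustration only; the second is a name left untranslated.
entries = [
    {"en": "drive", "de": "Laufwerk"},
    {"en": "Caterpillar", "de": "Caterpillar"},
]
print(ET.tostring(build_tbx_like(entries), encoding="unicode"))
```

Grouping all languages under one concept, rather than keeping one row per language pair, is what lets the same file serve bilingual and multilingual projects.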
Termbases can be used directly with CAT tools in the same way as translation memory. Every time a segment is presented for translation, there will be an automatic search for terms in the termbase. Finally, as the termbase is used, translators, editors, reviewers or clients may point out the need to update existing information or add new terms.
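The automatic search can be pictured as follows. This is a minimal sketch under simple assumptions: the termbase is a plain dictionary from source term to approved target term, and matching is exact and whole-word, so inflected forms such as "drives" would be missed. CAT tools typically apply fuzzier, language-aware matching; the names `TERMBASE_EN_DE` and `lookup_terms` and the sample German terms are hypothetical.

```python
import re

# Hypothetical bilingual termbase: source term -> approved target term.
TERMBASE_EN_DE = {
    "drive": "Laufwerk",
    "power source": "Stromquelle",
}


def lookup_terms(segment, termbase):
    """Return (source_term, target_term) pairs for termbase terms found in the segment."""
    hits = []
    # Check longer terms first so multi-word terms are listed before shorter ones.
    for source_term in sorted(termbase, key=len, reverse=True):
        pattern = r"\b" + re.escape(source_term) + r"\b"
        if re.search(pattern, segment, flags=re.IGNORECASE):
            hits.append((source_term, termbase[source_term]))
    return hits


print(lookup_terms("Check if the drive is connected to a power source.", TERMBASE_EN_DE))
```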
The termbase is used during translation and editing, and then once more in a terminology check, which is one of the most important quality control steps before the project is finalized.
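A terminology check at the quality control stage can be sketched as a pass over source and target segment pairs: whenever a source segment contains a termbase term, the target segment is expected to contain the approved translation. The code below is an assumption-laden illustration, not how any specific QA tool works. It uses a plain substring test on the target side, which tolerates compounds and some inflections but can also produce false positives and false negatives, so flagged segments still need human review.

```python
import re


def check_terminology(pairs, termbase):
    """Return (segment_number, source_term, expected_target_term) for each suspected issue."""
    issues = []
    for number, (source, target) in enumerate(pairs, start=1):
        for source_term, target_term in termbase.items():
            pattern = r"\b" + re.escape(source_term) + r"\b"
            in_source = re.search(pattern, source, re.IGNORECASE)
            # Substring test on the target: compound or inflected forms still count.
            in_target = target_term.lower() in target.lower()
            if in_source and not in_target:
                issues.append((number, source_term, target_term))
    return issues


# Illustrative data: the second translation uses "Festplatte" instead of the approved term.
termbase = {"drive": "Laufwerk"}
pairs = [
    ("Connect the drive.", "Schließen Sie das Laufwerk an."),
    ("Remove the drive.", "Entfernen Sie die Festplatte."),
]
print(check_terminology(pairs, termbase))
```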
Best practices for translation Termbases
- Well-organized – Each term should appear only once in a termbase (with the exception of terms with more than one meaning) and all terms that require definition should be included. Navigation and updating should be quick and simple.
- Right size – While a termbase should include all terms that require definition, it should not include unnecessary terms that are either never used in the content or are well understood and do not require clarification. A large termbase is time-consuming and cumbersome to use and manage.
- Usage guidelines – Good termbases include examples of usage and comments for writers and translators.
- Terms not for translation – Termbases should also include a list of terms that should not be translated. For example, product and company names are typically left in the original language.
- Relevant – Base your termbases on actual materials used by the company and use general industry terms as a supporting source.
- Reviewed – Ensure that any changes or additions to a termbase are checked and reviewed by native-speaking subject matter experts approved by the client.
- Language variants – Include information about a specific locale or language variant. For example, Spanish for Spain and Spanish for Latin America should have separate termbases.
- Updated – A termbase is never finished. New terms will emerge as products, technologies and languages change. Old terms that are no longer used should be removed or deprecated.