Andovar Localization Blog - tips & content for global growth

The Ultimate Guide to Machine Translation (now known as 'AI Translation') Software

Written by Steven Bussey | Aug 31, 2022 3:57:00 AM

When you're working with a deadline and a budget, machine translation can be your best friend. Find out how it works and why we think every business should consider using this service!

In this blog we will cover:

1. What is Machine Translation
2. How to Choose Machine Translation Software
3. Benefits of Machine Translation Software
4. Disadvantages of Machine Translation
5. Machine Translation vs. Human Translators
6. History of Machine Translation
7. Machine Translation Use Cases
8. Machine Translation Software Options




What is Machine Translation 

Machine Translation (MT), in its latest iteration, is the use of automated, AI-facilitated software to translate speech or text from one language to another. Basic MT software simply performs word-for-word substitutions. More advanced MT technology incorporates AI, natural language processing (NLP), and machine learning, and automatically applies termbases along with techniques for analyzing grammar, syntax, and semantics.
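
To see why plain word-for-word substitution falls short, here is a minimal Python sketch. The tiny English-to-Spanish glossary and the function name are made up for illustration; they do not come from any real MT product.

```python
# Toy word-for-word substitution. The glossary is a made-up illustration,
# not data from any real MT engine.
GLOSSARY = {
    "the": "el",
    "red": "rojo",
    "car": "coche",
    "is": "es",
    "fast": "rápido",
}


def word_for_word(sentence: str) -> str:
    """Replace each known word with its glossary entry; keep unknown words."""
    words = sentence.lower().split()
    return " ".join(GLOSSARY.get(word, word) for word in words)


print(word_for_word("The red car is fast"))  # el rojo coche es rápido
```

The output keeps the English word order, so the adjective lands before the noun, whereas natural Spanish would be "el coche rojo es rápido". Handling that kind of reordering is what the grammar, syntax, and semantic analysis in more advanced MT is for.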

There are many useful, basic, free public Machine Translation engines available online, such as Google Translate, Amazon Translate, and DeepL. On their own, platforms in this class are generally regarded as less than fully reliable. Overall, however, MT quality has evolved so much that it has become a fundamental tool for top language service providers (LSPs). When combined with skilled human translators for post-editing, professional-grade Machine Translation software can be widely used in companies with medium to large localization requirements.

How to choose Machine Translation Software

Machine Translation is a rapidly growing industry that has been making waves in recent years. As the use of MT software continues to grow, so do its capabilities and its effects on language service providers (LSPs). Some top-rated MT engines can speed up production and improve accuracy at a lower cost than human-only workflows, leaving companies more time for other important tasks. These engines offer options tailored to your needs, whether you are looking for fast turnaround or higher precision, and there are settings suited to most use cases.

Let’s look at the things you should consider when selecting Machine Translation technology for your particular use.

 

Benefits of Machine Translation Software 

Machine Translation is often undervalued when people compare it to human translation work. The problem lies in uninformed expectations, not in the performance capacity of machine translation software. MT can help professional translators, but it should never replace them!

When machine translation software is integrated with a computer-assisted translation (CAT) platform, the powerful CAT tool capabilities are multiplied. This high-performance combination of translation technologies is used by global organizations that need to manage very large and complex localization projects with exceptional efficiency. 

  • Productivity: Automated translation is designed for processing large volumes of content. 
  • Speed: Machine Translation software translates at incomparable rates of output.
  • Consistency: Termbases in Machine Translation engines ensure consistent application of terms. 
  • Reliability: Machine Translation is accurate enough for translating commonly used words and expressions.
  • Customizable: Workflows can be modified to accommodate specialized industry or client needs.
  • Compliance Ready: General Data Protection Regulation (GDPR) compliant translation solution.
  • Multitasking: Machine Translation automatically translates to multiple languages simultaneously.
  • Cost Savings: Machine translations cost a fraction of human translation services.
  • Unburdens Translators: Frees human translators to focus on the finer details of the translation.
  • Integrable: Machine Translation seamlessly integrates with CAT and other localization systems.
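
The consistency benefit in the list above can be pictured as a simple termbase check. This is only a sketch: it assumes a termbase is a plain mapping from a source term to its approved target term, and the sample terms, sentences, and function name are hypothetical.

```python
# Hypothetical termbase: the approved target-language term for each source term.
TERMBASE = {
    "dashboard": "tableau de bord",
    "settings": "paramètres",
}


def missing_terms(source: str, translation: str, termbase: dict[str, str]) -> list[str]:
    """Return source terms whose approved translation is absent from the output."""
    source_lower = source.lower()
    translation_lower = translation.lower()
    return [
        term
        for term, approved in termbase.items()
        if term in source_lower and approved.lower() not in translation_lower
    ]


issues = missing_terms(
    "Open the dashboard and check your settings.",
    "Ouvrez le tableau de bord et vérifiez vos réglages.",
    TERMBASE,
)
print(issues)  # ['settings']
```

Plain substring matching is used here to keep the example short; a production check would also need to handle inflected forms and word boundaries.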

Disadvantages of Machine Translation  

Machine translation software offers a long list of pros (above), but like any other complex technology, it naturally also comes with its cons: 

  • Imperfect: Not all Machine Translation translations are precise matches with the original content. 
  • Generic: A software program cannot comprehend all cultural or contextual nuances, so Machine Translation cannot accurately translate those aspects of content.
  • Developing: Translation quality in some languages is less than in others, due to development levels in termbases, glossaries, customizations, and integration.


Machine Translation vs. Human Translators


The misconception surrounding MT is in the assumption that professional users of Machine Translation cannot expect accuracy without post-editing. The point here is that, in fact, both the Machine Translation and post-editing phases of the process are understood by LSP project managers as necessary and built-in stages of their standard workflow.

Although in the vast majority of machine translation projects, post-editing is necessary, there are instances when only light post-editing or even no post-editing is needed. Whether PE will be needed for MT depends on these key factors:

  • The quality of the Machine Translation output
  • The extent of the engine training
  • The robustness of the corpus
  • Content structure
  • Target language
  • Availability and extensiveness of the termbase
  • How well-aligned users’ expectations are with the performance potential of the Machine Translation engine

Why is PEMT necessary with Machine Translation?

Post-edited machine translations (PEMT) achieve precision in translation accuracy by capturing any mistakes that have been made in earlier stages to ensure the meaning of the message is conveyed to people of a different language and cultural experience. 

 

 

History of Machine Translation

The initial attempts at machine translation reportedly date back to the 1940s. The concept attracted more interest in the early 1950s, underwent a surge of development in the 1990s, and has continued evolving since then. The Machine Translation market is now anticipated to grow to over USD 600 million in North America alone by 2024 and, by some estimates, to between USD 943 million and USD 1.5 billion globally that same year. (The U.S. currently accounts for over 60% of the global localization market.)

The globalization of the commercial marketplace now relies on MT technology. More and more e-commerce companies and other international enterprises are now relying on machine translation software to facilitate the work of human translators in localization processes.

  1. The Evolution of Machine Translation

Modern adaptive AI-enhanced MT systems perform real-time updates driven by content edits. This means the best machine translation software is continuously learning and building on its knowledge base. Current MT technology types include:

  • Rule-Based Machine Translation (RBMT): Developed many years ago as the first practical Machine Translation technology, RBMT parses source content segments to interpret words, analyze sentence structures, and translate them based on rules set by linguistics experts. The rules are applied by the system to define correlations between structures in the source and target languages.

  • Statistical Machine Translation (SMT): SMT is the processing system that popular free online platforms like Bing Translator and Google Translate relied on for years before moving to neural models, and it was long the most commonly used MT technology. SMT searches segments of source texts and potential translations, and phrases within the segments, for statistical correlations in order to develop translation models. The system then calculates confidence scores, evaluating the likelihood that the translation to be rendered matches the source text.

  • Neural Machine Translation (NMT): NMT marks a technological paradigm shift in machine translation. Today’s state-of-the-art Neural Machine Translation engine is an advanced form of MT built on an artificial neural network that is trained on large volumes of translated text. The system is designed to predict word sequences, extrapolate from accumulated information, and generate translated sentences that are modeled from the results.

    In contrast to the conventional SMT translation system consisting of numerous separately adapted components, NMT is built and trained as a singular neural network that reads and translates sentences. All parts of the system are jointly trained from end to end, to maximize translation performance.

  • Deep Neural Machine Translation (DNMT): First-generation NMT, with its single layer of neural network language translation processing, has further evolved into Deep NMT (DNMT). This version of neural machine translation design features multiple stacked neural processing layers. This means there are many more jointly trained processing elements maximizing translation performance compared to the early NMT engines.

In most circumstances, NMT generates translations of much higher quality than SMT while using just a fraction of the memory. NMT is becoming increasingly important in localization and is already used by leading global LSPs. The technology is projected to become even more accessible and more widely used.

 

  2. The Current State of Machine Translation

As far as MT has evolved, even to the multi-layered neural networks of the DNMT model, it has not yet reached its maximum potential. Generally speaking, the world’s best machine translation software currently serves these common translating purposes:

  • Gisting: Translating to a generalized version of what the source text says, representing only the essence of the message, without delivering the benefit of a richer expression of it. 
  • Immediate Need: Translating content that cannot wait for more time-consuming human translation, such as for texts, chat, etc.
  • MT/human Translation: Humans perform post-editing of machine translations to produce error-free, stylistically appropriate final versions of content.
  • Controlled Language: Customized Machine Translation platforms provide exceptionally high-quality translations of content written in controlled language, for example, certain reports, specifications, and various other documentation.
  • High Volume: Machine Translation generates great volumes of translations at lightning speed when human translation alone is not feasible economically or technically.
  • Support for Translators: Human translators can edit results as needed for higher machine translation quality or use machine-translated content as-is, when that level of quality is sufficient for that content. 
  • Pseudo-Translation: Localization specialists can apply Machine Translation to compare source text with target languages, to examine for internationalization issues prior to undertaking translation.

Although this list emphasizes the limitations of Machine Translation without human post-editing, there are cases in which very little, if any, PE is needed. As noted earlier, the extent of the need for PE can depend on the MT output quality, the termbase, and the user's expectations.
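
As an aside on the pseudo-translation item in the list above: one common form of it does not call an MT engine at all. It swaps characters for accented look-alikes and pads the text, so that encoding and layout problems show up before real translation starts. The sketch below assumes that simpler form; the character mapping, the 30% expansion figure, and the function name are illustrative choices.

```python
# Sketch of pseudo-translation: accent the vowels and pad the text so that
# encoding and layout problems become visible in the user interface.
# The mapping and the default 30% expansion are illustrative assumptions.
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


def pseudo_translate(text: str, expansion: float = 0.3) -> str:
    """Return an accented, padded, bracketed version of the text."""
    padding = "~" * round(len(text) * expansion)
    return f"[{text.translate(ACCENTED)}{padding}]"


print(pseudo_translate("Save changes"))  # [Sávé chángés~~~~]
```

If the brackets are cut off on screen, the layout is too tight for longer languages; if the accented characters display incorrectly, there is an encoding issue to fix first.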

Machine Translation Use Cases

By all accounts, Machine Translation is not perfect. It comes with its good and bad aspects. The takeaway point here is that the MT option should be weighed for all sizeable translation projects. You should decide to use it or dismiss it for each project based on your results and not on preconceptions based on anecdotal reports.

The primary use cases for machine translation are: 1) processes that necessitate rapid interaction, such as assimilating web chat or texts, and 2) as a tool to increase the productivity of human translators.

Use these general guidelines for the use of MT:


Machine Translation Software Options

All the basic Machine Translation software programs listed below integrate with Phrase. These platforms auto-translate verbiage between more than 600 language pairs. This makes these basic MT resources good content translation aids. When coupled with post-editing by human translators, Machine Translation helps elevate translated content to meet the highest standards for human translation quickly, and at a reduced cost.

Options for Machine Translation software types include:

  • Rule-Based Machine Translation (RBMT): RBMT solutions operate by rules that are based on the software’s respective analyses of the source and target languages.
  • Statistical Machine Translation (SMT): SMT software utilizes an array of separate components enabling algorithms and statistical models to create translations after analyzing substantial amounts of data.
  • Example-Based Machine Translation (EBMT): EBMT systems translate sentences by retrieving and comparing similar or matching existing translation source sentences and targets as interpretive examples for processing the current translation task. 
  • Neural Machine Translation (NMT): NMT uses a large single artificial neural network of functions jointly trained for end-to-end maximization of translation performance.
  • Hybrid Method: Combine options and build a system that enables you to select the best choice to match the content type and other considerations for a given Machine Translation project. 
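
To make the example-based approach in the list above more concrete, here is a minimal sketch of its retrieval step using Python's standard difflib module. The stored sentence pairs and the function name are hypothetical, and a real EBMT system would go further by recombining fragments of several examples.

```python
from difflib import SequenceMatcher

# Hypothetical stored examples: (source sentence, approved translation).
EXAMPLES = [
    ("How do I reset my password?", "Comment réinitialiser mon mot de passe ?"),
    ("Where can I download the invoice?", "Où puis-je télécharger la facture ?"),
]


def closest_example(sentence: str) -> tuple[str, str, float]:
    """Return the stored pair most similar to the sentence, with its score."""
    scored = [
        (source, target, SequenceMatcher(None, sentence.lower(), source.lower()).ratio())
        for source, target in EXAMPLES
    ]
    return max(scored, key=lambda item: item[2])


source, target, score = closest_example("How do I reset my password")
print(f"{source} -> {target} (similarity {score:.2f})")
```

The similarity score plays a role comparable to a fuzzy-match percentage: the higher it is, the more of the stored translation can be reused as-is.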

Hybrid MT platform models combine the capacity of services like Google Translate, DeepL, and Amazon Translate with human-only translation workflows. These tools can be integrated into workflows in AI-facilitated Phrase.

Another alternative is to use a platform product like Phrase, which automatically identifies and selects the best machine translation software solution for your current purposes, based on factors including language pairs, content type, and domain, among others. 
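
The idea of selecting an engine by language pair and content type can be sketched as a simple lookup. This is not how Phrase or any other product implements it; the routing table, the engine labels, and the function name are placeholders invented for illustration.

```python
# Hypothetical routing table: (language pair, content type) -> engine label.
# The labels are placeholders, not a ranking of real products.
ROUTES = {
    ("en-de", "legal"): "custom-engine",
    ("en-de", "support"): "generic-engine-a",
    ("en-ja", "support"): "generic-engine-b",
}
DEFAULT_ENGINE = "generic-engine-a"


def pick_engine(language_pair: str, content_type: str) -> str:
    """Return the configured engine for this job, or the default."""
    return ROUTES.get((language_pair, content_type), DEFAULT_ENGINE)


print(pick_engine("en-de", "legal"))      # custom-engine
print(pick_engine("en-fr", "marketing"))  # generic-engine-a
```

A real platform would base the table on measured quality per language pair and domain rather than on fixed entries.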

Machine Translation platform model options:

  • Free MT Engines: Such as Google Cloud Translate, Amazon Translate, Microsoft Translator, other leading free ‘All language’ solutions
  • Language-Specific MT Solutions: For example, DeepL is a good choice for various European to/from Asian language translations
  • Customized MT Platforms: Like Language Studio customized for Andovar, with over 600 language pairs. Key features of a customized MT system may include:
      • Ready-made MT engines: Use Google Translate, Amazon Translate, or other publicly available free MT services. These do not have advanced functions or customization, and your data can be reused in the providers’ other services.
      • Other custom MT engines: Platforms designed for processing in certain industries, for specific language pairs, particular content types, and other defined needs and outcomes.
      • Cutting-Edge AI Deep Neural MT: State-of-the-art DNMT technology for the fastest and most accurate performance in broad-scale translation for localization.
      • Cloud MT: Similar functionality to free public MT engines, hosted in the cloud, but provides a dedicated account for exclusive use by your company. Cloud MT provides added capabilities in terminology customization, plus various other benefits. Your data with the service is well-secured. OR:
      • On-Premises MT: For companies that plan to deploy machine translation software in their in-house IT sphere. This is an exceptionally secure approach, but the cost is significant, and deploying and managing the system is complex and requires continuous maintenance.
      • Best of Breed MT: This is a platform that enables the management of multiple MT engines, provides one layer of term customization, and features a conveniently manageable UI. It allows you to select the best MT engine(s) for various content types and language pairs.
      • REST API: The REST API protocol is the common preference for flexible integration, simplicity, and ease of use.

MT Landscape

NOTE: Machine Translation is not the same as CAT (Computer-Assisted Translation). CAT tools enable collaboration between multiple translators and groups and integrate multiple tech components on one platform, like document editors, Quality Assurance, MT software, and others. CAT tools are designed to maximize the productivity and consistency of translation.

 

When Should You Use Machine Translation?

While Machine Translation can be very useful, it is not meant to be treated as a one-size-fits-all solution. Here are some situations where you should consider using MT:

  • When time is of the essence
  • Where you need to scale your efforts
  • When accuracy is less important
  • When a project is budget-conscious
  • When human translators are not available

Another criterion for judging a project as a good candidate for Machine Translation application is the level of nuance or complexity of the information in the content. In high-volume projects with these translation challenges, combining MT with post-editing vastly expands the opportunities to use MT for speed and cost savings.

 

Machine Translation Platform Implementation

Successful Machine Translation implementation requires following a thorough process. Here are the seven basic steps for implementing MT engines:

  1. Prioritize Data Security Rules: Not all MT engines are compliant with GDPR or HIPAA. If customer data to be handled through your translation system requires protection, ensure that the machine translation software you select provides it.
  2. Process Content Suitable for MT: Some types of content are more compatible with MT engines than others. For the best results, use MT for content that is structured and straightforward in form, such as professionally written FAQs and general customer service information.
  3. Train Your MT Engine: Train the MT engine with words, phrases, and other content elements your company frequently uses. Accurate machine translations require a minimum of 100,000 segments of training data. You can build or buy corpora (text collections), or obtain them from public sources, for use in training your MT engine with data relevant to your industry.
  4. Recruit Post-Editors: Apply post-editing to ensure accuracy after machine translation. Use light editing for glaring content translation issues, or use full editing to correct any mistakes, including cultural errors. Ensure consistency in the application of MT post-editing process management.
  5. Sample in Advance: The purpose of choosing the best machine translation engine is to save time and money. But, if the results are bad, MT can cost you more instead of less. So, test to confirm that the quality will be sufficient to send your translated content forward for post-editing.
  6. Get Pricing Upfront: As with any investment, obtain an agreement on price from all the stakeholders before you commit. Use MT engines that generate quality estimations (MTQE), to help you determine the most accurate cost estimate.
  7. Roll It Out: Your machine translation outcomes may not meet your standards initially. But, by continuing to train the engine, the results will improve. With some fine-tuning, you will achieve the level of efficiency you’re aiming for.
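
Before training (step 3 above), it can help to check whether a corpus actually reaches the 100,000-segment minimum once blank lines and exact duplicates are removed. The sketch below assumes the corpus is available as a list of strings with one segment per entry; that layout and the function names are assumptions made for illustration.

```python
# Minimum from step 3 above. The one-segment-per-entry layout is an assumption.
MIN_SEGMENTS = 100_000


def count_unique_segments(lines: list[str]) -> int:
    """Count distinct non-empty segments, ignoring surrounding whitespace."""
    return len({line.strip() for line in lines if line.strip()})


def is_ready_for_training(lines: list[str], minimum: int = MIN_SEGMENTS) -> bool:
    """True when the corpus holds at least the minimum number of unique segments."""
    return count_unique_segments(lines) >= minimum


sample = ["Hello world", "hello world", "Hello world ", "", "Goodbye"]
print(count_unique_segments(sample))  # 3
print(is_ready_for_training(sample))  # False
```

The comparison is case-sensitive, so "Hello world" and "hello world" count as two segments; whether to normalize case is a choice that depends on the language pair and content.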

Remember that post-editing is an integral part of the success of translation processes that rely on MT and require both volume output and high accuracy. Andovar turnkey translation solutions, for example, include both MT and post-editing components. Our technology is set up to auto-forward segments of content to either the MT engines or to human translators, as most appropriate. 

Before we begin MT implementation, we utilize MTQE scores that indicate the level of accuracy to be expected in the MT output. This allows us to correctly estimate the cost to help you go global with your content. 
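
The idea of forwarding each segment to the most appropriate workflow can be sketched with a quality-estimate threshold. This is not a description of Andovar's actual system: the score scale from 0.0 to 1.0, the two thresholds, and all names below are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    mtqe_score: float  # hypothetical quality estimate between 0.0 and 1.0


def route(segment: Segment, light_pe: float = 0.9, full_pe: float = 0.6) -> str:
    """Pick a workflow step from the estimated quality of the MT output."""
    if segment.mtqe_score >= light_pe:
        return "light post-editing"
    if segment.mtqe_score >= full_pe:
        return "full post-editing"
    return "human translation"


segments = [
    Segment("Reset your password.", 0.95),
    Segment("Terms of sale", 0.70),
    Segment("Brand tagline", 0.30),
]
for seg in segments:
    print(seg.text, "->", route(seg))
```

Because each route implies a different amount of human effort, counting segments per route gives a rough basis for a cost estimate.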

 

Measuring Machine Translation Quality

The accuracy of a large-scale translation project mostly depends on the MT engine you choose. Currently, the most accurate method for evaluating the quality of MT output is for human evaluators to score sentence-by-sentence. Additionally, auto-evaluation methods are used to measure consistency between MT and human translations. For example:

  • Word Error Rate (WER): Based on the number of insertions, deletions, and substitutions needed to turn the MT output into the reference sentence (the word-level edit distance), usually divided by the length of the reference.
  • Position-Independent Error Rate (PER): A variant of WER that treats each sentence as a bag of words and disregards the word order.
  • Rank-based Intuitive Bilingual Evaluation Score (RIBES): Based on analysis of the reordering of words.
  • Bilingual Evaluation Understudy (BLEU): Measures similarity of MT output to a set of high-quality reference translations, based on overlapping word sequences (n-grams).
  • Metric for Evaluation of Translation with Explicit Ordering (METEOR): Considers word stems and synonyms.
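
The first two metrics in the list above are simple enough to sketch in a few lines of Python. The WER function uses the standard word-level edit distance; the PER function follows one common formulation, and other variants exist. Both assume a non-empty reference and plain whitespace tokenization, and the function names are our own.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # previous[j]: edits to turn the reference words seen so far into hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1] / len(ref)


def position_independent_error_rate(reference: str, hypothesis: str) -> float:
    """Like WER, but compares the sentences as bags of words."""
    ref, hyp = reference.split(), hypothesis.split()
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


reference = "the cat sat on the mat"
print(word_error_rate(reference, "the cat sat on mat"))  # one missing word out of six
print(position_independent_error_rate(reference, "on the mat the cat sat"))  # 0.0
```

The second call shows the difference between the two: the reordered sentence contains exactly the reference words, so PER is zero, while WER for the same pair would be well above zero.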

Frequently used QA testing tools for MT include Xbench, Verifika, ServiceNow, and our latest tool, Content Quo, among others, as well as CAT platforms that feature QA tools for localization.

Apart from the industry’s standard final full-scope QA testing, LSP QA testing experts recommend a preliminary process: QA spot-checking. MT is typically used in this kind of supplementary localization QA process.

 

Security Concerns for Corporations

Translating company documents, internal communications, protected customer information, and other materials can present serious cybersecurity risks. To tighten data security in translation processes, examine your company’s language translation activities, and assess practices that may be exposing sensitive information. Here are some areas of security concern to be aware of as you shop for the best machine translation software platform for your company’s needs:

  • Data returns to online MT engines when using those free translation tools
  • Disregarding user permission controls 
  • Transmitting files for translation as email attachments
  • Failure to use translation memory group permissions
  • Using free online MT engines that do not have:
      • Encrypted file storage
      • 256-bit SSL certification
      • SHA-2 and 4096-Bit Encryption
      • Device verification
      • Transport layer security
      • Two-step authentication
      • Compliance with specified mandates
      • Compliant translation data centers
      • Third-party security assessment records
      • Updated browser
      • Automatic logoff
      • Last login information
To remedy these security issues, use a CAT tool that enables first-draft translations in the MT process without enabling data access to the free MT engine provider. Choose a secure MT platform with robust project management capabilities. Ensure that data transitions can be centralized and user permission controls and translation memory group permission controls are active. 

If the machine translation software you’re considering does not have the above security controls, you should move on to find a secure translation platform that features all the above security measures. Further, keep in mind that for data protection, on-premises MT applications are safest, secure cloud products are safe, and free online MT engines are the least likely to be GDPR compliant and the least secure.

 

Which Machine Translation Software is Right for Your Company?

There are numerous MT options, and as you have probably gathered by now, there is not an MT engine specifically designed for any particular content type. Generic MT engines can translate most kinds of content. But, with a custom MT platform, you can tailor training data to your industry and content type. 

But, which is the best machine translation software for your company? The answer depends on a set of factors including these, among others: 

  • MT Software Type: First, familiarize yourself with the classes of MT engines and generally how they work to process language translations. 
  • Your Content Type: Decide which type of content you want to translate. Test your results by running some content samples through the MT and post-editing process.
  • Your Industry Type: Some global industry types involve translation of voluminous complex technical language that requires the highly sophisticated processing provided by Neural MT.
  • Desired Language Pairings: Statistical MT may offer adequate translation for your needs.
  • Data Volume: Neural MT requires large amounts of text to learn and deliver benefits.
  • MT Security Types: Scrutinize the MT software provider’s privacy and security policies.
  • TMS Compatibility: Be sure your chosen MT software is supported by your TMS. 
  • Budget: Neural MT is more expensive to train than statistical MT. 

For best results, use a TMS that automatically selects the best machine translation engine for your particular needs in any given project. Again, there is no particular machine translation software with functionality specifically designed to match your content type perfectly. But, the best MT engines can be trained for particular data types and subjects.





Andovar Takes Your Content Global
 

Andovar is a global media-focused content localization services provider. We have built our brand on customizing solutions for complex localization projects that have facilitated seamless global growth for our clients. Our experts in MT and localization systems can help you develop and implement the ideal translation and testing hardware system and the best machine translation software for your needs. Our language technology tools accommodate the largest-scale localization needs and make your life much easier as you expand your international market reach.

Our expanding multi-cultural team currently includes localization project management experts, technical experts, and over 10,000 professional translators. We have helped eCommerce companies, games developers, software and technology companies and enterprise companies in various other industries deliver ideally localized content of all types throughout their global markets. 



For more tips and content on global growth, please visit our blog.