Aug 9, 2022
When “Parasite” won the Academy Award for Best Picture a few years ago, the filmmakers weren’t the only ones surprised. The entire media and entertainment industry immediately recognized the changing landscape: not only did a foreign language film win the top prize, but a “subtitled” foreign language film won!
Since that watershed moment, more subtitled foreign films have entered the mainstream and captured the acclaim and attention of new audiences. This suggests that audiences will read if a story is compelling.
Running in parallel to this viewing trend is the rise of Artificial Intelligence (AI). Even if you don’t fully understand it, you’ve surely encountered it in some way during your daily activities. AI or, more accurately, its major subset Machine Learning (ML), refers to the ability of “machines” to learn over time. Some common examples include auto-correct; Netflix recommending movies you might like; Google Maps directing you to your destination in real time based on traffic; being matched with a driver and taking a trip through the Uber app; or asking Alexa or Siri for the weather or to stream your favorite music playlist. These common apps all use AI and/or machine learning in some way.
You may also be surprised to learn that these consumer technologies are driving innovation for many business tools, and the benefits are similar. AI is now used to extract, interpret, and process data quickly and contextually in real time, which improves operational efficiency and accuracy and reduces costs. If configured and trained correctly, machine-learning algorithms can consistently exceed human expectations.
The change in viewers’ behavior, coupled with the innovations in AI, contributed to the passion and foundation of XL8. XL8’s machine translation (MT) engine is unique because it is the only engine in the market that was trained exclusively on media and entertainment content that was 100% hand-curated by professional translators over the past 20 years. To further increase accuracy, we’ve applied our own approach to “context awareness” in MT, which enhances the immersive experience for audiences watching live or pre-recorded streaming or broadcast content.
We’ll cover context awareness in more detail, but first, a little more background on content localization.
The explosive global demand for content across every platform and language has created a ripple effect throughout every corner of media and entertainment. Yes, people want more content, but their preferences have evolved with the proliferation of OTT services and availability of foreign language content in new markets. Localization efforts have supported this evolution through the integration of subtitling and dubbing to bridge the language barrier. Additionally, the mode of consuming content has also evolved with the immediacy that OTT services provide. Consumers can watch anytime, anywhere, and from whatever device they prefer. Sometimes, they consume on multiple devices and platforms - all at the same time!
Riding the coattails of this content explosion is the widely expanded use of subtitles and captioning. Long seen solely as an assist for the hearing impaired (and that is still a primary use), captions and subtitles are now considered creative onscreen elements specifically designed to enhance the viewing experience. They make visual content easier to digest and, most importantly, unite people through the joy of shared interactions. We're also seeing a rise in interactive viewing in the form of "Watch Parties," which are designed to create a community around a shared viewing experience. Imagine how live interpretation could unite viewers across all languages! We'll touch on this more in a future article.
Subtitles and captions can also represent genuine business potential for online content creation and live event production. Content owners and rights holders see the increased value of dubbing, subtitling, and accessibility services.
Context Awareness and Machine Translation
The amount of content produced and consumed worldwide continues to increase at an unending pace. The direct result is that content owners and service providers are taking traditional localization services a few steps further, combining translation with “hyper-localization” capabilities.
Hyper-localization means more than only considering the languages spoken in one country. Localization strategies must also consider the various dialects and even “slang” terms spoken in each of a country’s regions. Getting content prepped and out to the market faster means more consumers can enjoy the titles they want sooner, and content owners can make maximum use of the content in their libraries. The catch is that it’s not an easy or quick process. This is precisely where context-aware MT can speed up the localization process and produce better accuracy when localized content must be turned around in a significantly shorter timeframe.
While there are still use cases where it is more difficult to adopt MT, content localization workflows are rapidly transforming by adopting Machine Translation Post Editing (MTPE). MTPE is a process where the output of MT is post-edited by a linguist to refine the final results of the translation.
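The MTPE process described above can be sketched in a few lines. This is a minimal illustration only, not XL8's workflow: the `Segment` class, `run_mtpe`, and the toy `translate` and `review` functions are all hypothetical names invented for this example. The idea it shows is that every segment keeps its raw MT output, and a linguist's edit, when there is one, takes precedence in the final result.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Segment:
    """One subtitle segment moving through a hypothetical MTPE workflow."""
    source: str
    mt_output: str = ""
    post_edit: Optional[str] = None  # None means the linguist accepted the MT output

    @property
    def final(self) -> str:
        # The linguist's edit wins when present; otherwise keep the MT output.
        return self.post_edit if self.post_edit is not None else self.mt_output


def run_mtpe(
    sources: List[str],
    translate: Callable[[str], str],
    review: Callable[[Segment], Optional[str]],
) -> List[Segment]:
    """Machine-translate each source line, then pass it to a reviewer."""
    segments = []
    for text in sources:
        segment = Segment(source=text, mt_output=translate(text))
        segment.post_edit = review(segment)
        segments.append(segment)
    return segments


# Toy stand-ins (assumptions): a real system would call an MT engine
# and collect edits from a human linguist.
def toy_translate(text: str) -> str:
    return text.upper()


def toy_review(segment: Segment) -> Optional[str]:
    return "HELLO THERE!" if segment.source == "hello there" else None


result = run_mtpe(["hello there", "good night"], toy_translate, toy_review)
print([segment.final for segment in result])
```

Keeping the MT output and the post-edit side by side also makes it easy to measure how much editing each segment needed.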
After the breakthroughs that the Transformer, a deep machine learning model for translation, brought to the MT field in 2017, steady and incremental contributions were made to the technology. These contributions resulted in better translation quality and easier and/or faster training of the model. Context awareness (“CA”) was among the enhancements that contributed significantly to translation quality, and it helped to dispel the long-standing belief that MT would never reach its full potential.
The meaning of context is “the circumstances that form the setting for an event, statement, or idea, and the terms in which it can be fully understood and assessed.”
A generic Transformer model translates sentences one by one, which means it loses any context that lies outside the sentence being translated. In contrast, context-aware translation models use information “surrounding” the source sentence.
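To make the difference concrete, here is a minimal sketch of how the input to a translation model might be prepared in each case. It is not a description of XL8's engine: concatenating preceding sentences is just one simple, commonly used way to supply context, and the `<sep>` token, the function names, and the `window` parameter are assumptions made for illustration.

```python
from typing import List

SEP = " <sep> "  # hypothetical separator token between context and current sentence


def sentence_level_inputs(sentences: List[str]) -> List[str]:
    """Sentence-by-sentence input: each sentence is translated in isolation."""
    return list(sentences)


def context_aware_inputs(sentences: List[str], window: int = 2) -> List[str]:
    """Prefix each sentence with up to `window` preceding sentences as context."""
    inputs = []
    for i, current in enumerate(sentences):
        context = sentences[max(0, i - window):i]
        inputs.append(SEP.join(context + [current]))
    return inputs


subtitles = ["Where is Sam?", "She left an hour ago.", "Did you see her bag?"]

for line in context_aware_inputs(subtitles, window=1):
    print(line)
```

With the context attached, a model translating "Where is Sam?" into a gendered language could see the following "She", and a model translating "her bag" could see who "her" refers to. A sentence-level model has neither signal. This sketch only looks backward; a model could equally draw on the sentences that follow.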
CA does what was previously the sole domain of humans. In other words, CA reads between the lines and accurately considers the context of a conversation. It evaluates the subtle differences of gender, slang, multiple word meanings, and hundreds of other elements that make language the living, dynamic wonder it is.
CA models have now advanced to the point of human-like capabilities. Largely for this reason, when XL8 released its CA models, many translators were surprised by results that exceeded their expectations.
In a series of articles, I’ll describe why our patent-pending CA engine is producing results that exceed expectations in the localization service provider community. Our “secret sauce” is rooted in two main tenets. As previously described, the first is our “golden” data, which was 100% hand-curated by professional translators over the past 20 years and is accessible through our partnerships. Unlike data scraped or crawled from the web, our dataset includes deep sets of conversational contexts that teach our engines to draw “contexts” from the text.
The second is our research efforts. Because we start with a pure, golden dataset, we don’t need alchemy to turn poor-quality web-scraped data into gold. Our research centers on developing a model architecture along with our own training, data preparation, and inference methods that are optimized for the data we have.
There are myriad applications for this type of targeted subtitling and captioning technology:
- Live events, such as e-sports, gaming tournaments, and online sports viewing, where AI-generated captioning doesn’t need to be 100 percent accurate. It exists purely to enhance the experience.
- Offline experiences, such as viewing pre-recorded media or watching broadcast or OTT content, where an MTPE workflow is critical. Mistakes will not only be noticed but will detract from the experience.
- The streaming market, especially its rapidly growing subset, Free Ad-Supported Streaming TV (“FAST”), provides all the benefits of streaming TV for free but with ad breaks inserted into programming. Cost-effective and efficient subtitling will be key to these services adding new channels while keeping costs low.
And the list doesn’t stop here. Stay tuned for a deeper dive into context awareness!