July 1, 2016 · Web · User Experience · Research

Building a better way to write posts in multiple languages

Necip Fazil Ayan, Don Husa, Shawn Mei

People use Facebook to communicate and share in many different languages. In fact, 50 percent of our community speaks a language other than English, and most people don’t speak each other’s languages. Given that, we're always thinking about ways we can help remove language as a barrier to connecting on Facebook.

To help solve this problem, we built a new multilingual composer that allows people to reach a broad audience across languages with less work. It rolled out for Pages earlier this year, and today we're beginning to test it for individual people on Facebook.

Many Facebook Pages have diverse audiences, and Page owners often want to share their messages across a large group of people who speak many different languages. In the past, some Page owners have done this by creating multiple posts written in different languages and then using post targeting to choose a specific audience for each post. Others have strung together several versions of a message written in multiple languages into a single long post, but this isn't ideal because it requires people to scroll through a long block of text to find the part written in a language they understand. Still others created separate Pages for each language audience, but this adds additional Page management time and effort.

With the multilingual composer, Page authors and people can compose a single post in multiple languages, and viewers who speak one of those languages will see the post only in their preferred language. This enables diverse audiences to more easily interact with the Pages and people they follow.

The composer became available to all Pages earlier this year, and it's now being used by around 5,000 Pages to post nearly 10,000 times per day on average. Collectively, these posts receive 70 million views per day, 25 million of which are in one of the post's secondary languages. We're excited to see this tool help even more people on Facebook connect with their friends who speak different languages.


Building a multilingual experience

Developing this end-to-end multilingual experience involved three main components: the composer experience and editing flow, the storage implementation, and the viewing experience.

Composer experience: Multilingual post creation with machine translation

Constructing the composer components was relatively straightforward. Upon creating a new post, authors are given the option to write the post in additional languages. Authors specify each language they're posting in using the drop-down selections. The composer components were built using React to render additional text areas. We used Flux to handle events, maintain state, and update the UI accordingly. Upon creating a post, the multilingual data is sent to the server as well.

To help authors create multilingual posts, we're testing a pre-fill feature that takes the first message composed and uses machine translation to pre-fill the messages in the additional languages selected. Authors can use the translations as a starting point for their own translations in other languages, or use the provided version as is. These machine translations are generated by machine learning models trained on hundreds of thousands or millions of translations from one language to the other. This is the same system that generates translations in other places on Facebook, like when you click "See Translation" for posts and comments.

Storing multilingual posts: Concatenation vs. author translation

One of the most difficult parts of building the multilingual composer was the storage implementation, as it affects both the engineering and the experience of viewing and editing. Existing Facebook code assumed that a post contains a single message written in at most one language, so we needed to change this assumption to handle multilingual posts correctly.

We considered two potential solutions: concatenation and author translation.

In the concatenation approach, the post's messages in all languages would be stored on the post's TAO object, with metadata objects containing information about the character ranges for each message. Editing a post with this approach would involve keeping track of character ranges, as well as changes to each range. This could get especially tricky when accounting for all character sets used across Facebook.
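To make the bookkeeping concrete, here is a minimal sketch of the concatenation idea in Python. It is an illustration only: the names LanguageRange, concatenate, and edit_message are hypothetical, and plain in-memory values stand in for the TAO object and its metadata.

```python
from dataclasses import dataclass


@dataclass
class LanguageRange:
    """Hypothetical metadata record: where one language's message lives."""
    language: str
    start: int  # inclusive offset into the concatenated text
    end: int    # exclusive offset into the concatenated text


def concatenate(messages):
    """Join (language, message) pairs into one string plus range metadata."""
    parts, ranges, offset = [], [], 0
    for language, message in messages:
        parts.append(message)
        ranges.append(LanguageRange(language, offset, offset + len(message)))
        offset += len(message)
    return "".join(parts), ranges


def edit_message(text, ranges, language, new_message):
    """Replace one language's message and shift every later range."""
    index = next(i for i, r in enumerate(ranges) if r.language == language)
    target = ranges[index]
    delta = len(new_message) - (target.end - target.start)
    new_text = text[:target.start] + new_message + text[target.end:]
    new_ranges = []
    for i, r in enumerate(ranges):
        if i < index:
            new_ranges.append(r)  # earlier ranges are unaffected
        elif i == index:
            new_ranges.append(LanguageRange(r.language, r.start, r.end + delta))
        else:
            # Every range after the edited one moves by the length change.
            new_ranges.append(
                LanguageRange(r.language, r.start + delta, r.end + delta)
            )
    return new_text, new_ranges
```

Even in this small version, a single edit touches the metadata of every later message. The offsets here count Python code points. A client that counts UTF-16 code units would compute different offsets for the same text (an emoji such as "😀" has length 1 in Python but 2 in JavaScript), which is one way character handling can complicate this approach.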

In the author translation approach, we would store the first message on the post object and create separate TAO objects for each additional language. We call these additional objects “author translations,” as they conceptually are translations of the original post provided by the author. Editing was more straightforward with this approach, giving authors the option to edit all language versions within a post simultaneously. Because of this, we decided to use author translations.
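The author translation model can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: Post and AuthorTranslation are hypothetical in-memory classes standing in for separate TAO objects, and the edit method is one possible way to update several language versions at once.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorTranslation:
    """Hypothetical stand-in for one additional-language object."""
    language: str
    message: str


@dataclass
class Post:
    """Hypothetical stand-in for the original post object."""
    language: str
    message: str
    author_translations: list = field(default_factory=list)

    def edit(self, updates):
        """Apply edits keyed by language to the original and its translations."""
        for language, new_message in updates.items():
            if language == self.language:
                self.message = new_message
                continue
            for translation in self.author_translations:
                if translation.language == language:
                    translation.message = new_message
                    break
            else:
                # No existing translation in this language, so add one.
                self.author_translations.append(
                    AuthorTranslation(language, new_message)
                )
```

Because each language version is its own object, editing one message never requires recomputing offsets for the others.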

We then needed to make sure that each translation was treated like a post in terms of validation and processing logic. This required developing common interfaces to abstract away the storage implementation details from the validation logic. These abstractions allowed for easier substitution between the original post and author translations. One question we considered was how to handle mentions across languages of a multilingual post. To optimize for consistency, we decided that author translations shouldn't include mentions that aren't in the original post. We also implemented caching of some multilingual post metadata on the original post to speed up rendering at viewing time.
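As a rough sketch of what such a common interface might look like, the Python below lets the same validation run on an original post or an author translation, and checks the mention rule described above. Everything here is assumed for illustration: the HasMessage protocol, the length limit, and the "@[id]" mention markup are hypothetical, not Facebook's actual interfaces or formats.

```python
import re
from dataclasses import dataclass
from typing import Protocol


class HasMessage(Protocol):
    """Anything validation can treat like a post: it only needs a message."""
    message: str


@dataclass
class OriginalPost:
    message: str


@dataclass
class AuthorTranslation:
    language: str
    message: str


# Hypothetical mention markup, e.g. "@[123]" refers to the entity with ID 123.
MENTION_PATTERN = re.compile(r"@\[(\d+)\]")


def mentioned_ids(item: HasMessage) -> set:
    """Collect the IDs mentioned in any post-like object."""
    return set(MENTION_PATTERN.findall(item.message))


def validate_message(item: HasMessage, max_length: int = 5000) -> list:
    """Shared checks that do not care how the message is stored."""
    errors = []
    if not item.message.strip():
        errors.append("empty message")
    if len(item.message) > max_length:
        errors.append("message too long")
    return errors


def unexpected_mentions(original: HasMessage, translation: HasMessage) -> set:
    """Mentions in the translation that are missing from the original post."""
    return mentioned_ids(translation) - mentioned_ids(original)
```

A translation would pass the consistency rule when unexpected_mentions returns an empty set.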

We also had to ensure that each type of post was properly supported. Different kinds of posts — status updates, photos, videos, shares, etc. — could potentially have a different code execution flow. Moreover, the code is constantly changing in a rapidly evolving codebase. To manage these cross-infrastructure dependencies, we developed rigorous integration tests for each type of post supported by multilingual composer. We worked with product operations and support teams to discover what behavior would result for the majority of product use cases, in order to make sure multilingual posts were being saved properly.

What viewers see: Determining which language to display

The last challenge was to determine what viewers will see. We heard feedback that most Pages prefer their followers to see messages only in the language they best understand, as this is the smoothest and easiest experience for readers. So that's what we decided the viewers of a multilingual post should see. For example, if a multilingual post was created in English and Spanish, English-speaking viewers will see the English version of the message and Spanish speakers will see the Spanish message. We use several signals to determine the most relevant version of the post to show each viewer, including the language preferences and the locale that people selected for their accounts, as well as the language they most commonly post in (using a naive Bayes classifier to determine the probability distribution of the text across the languages our system can identify). If there's no match between the languages the post was written in and the viewer's preferred language, then we show the author's first message as a default, and the viewer can use the existing “See Translation” tool to see the post in their preferred language.
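The selection and fallback logic can be sketched as follows. The post does not say how the signals are weighted, so the priority order used here (explicit language preferences, then account locale, then most likely posting language) is purely an assumption, as are the Viewer class and the function names. The posting-language probabilities stand in for the classifier's output.

```python
from dataclasses import dataclass, field


@dataclass
class Viewer:
    """Hypothetical bundle of the language signals for one viewer."""
    preferred_languages: list = field(default_factory=list)  # account settings
    locale: str = ""  # e.g. "es_MX"
    # Assumed classifier output: language -> probability for the viewer's posts.
    posting_language_probs: dict = field(default_factory=dict)


def rank_viewer_languages(viewer):
    """Order candidate languages by an assumed signal priority."""
    candidates = list(viewer.preferred_languages)
    if viewer.locale:
        candidates.append(viewer.locale.split("_")[0].lower())
    by_probability = sorted(
        viewer.posting_language_probs.items(),
        key=lambda item: item[1],
        reverse=True,
    )
    candidates.extend(language for language, _ in by_probability)
    ranked, seen = [], set()
    for language in candidates:  # drop duplicates, keep first occurrence
        if language not in seen:
            seen.add(language)
            ranked.append(language)
    return ranked


def select_message(messages, viewer):
    """Pick the message to show.

    `messages` is a list of (language, text) pairs whose first entry is the
    author's first message, which is used as the default when nothing matches.
    """
    available = dict(messages)
    for language in rank_viewer_languages(viewer):
        if language in available:
            return available[language]
    return messages[0][1]
```

For a post written in English and Spanish, a viewer whose settings list Spanish would get the Spanish message, while a viewer whose only signal is a French locale would fall back to the author's first message.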

We select the best message for each viewer on the server when feed loads, and send only that message to native clients. This gave us a consistent viewing experience across platforms without requiring any native mobile changes. It also let us iterate more quickly, since app upgrades can take time to be broadly adopted, and it gave us full support on older app versions.

What's next

Building the multilingual post experience was only the beginning. Many Pages and people have already developed routines for reaching multilingual audiences — so to change this pattern and make our tools more useful, we'll continue to iterate on our product design to provide the best experience for post authors and viewers alike.

We also plan to use multilingual posts to improve our machine translation training data. By opening this tool to people who speak less common languages, we'll be able to build better machine translation systems. This will move us closer to our vision of removing language barriers across Facebook.

How to use the multilingual composer: If you're part of the test group, you can enable the multilingual composer by going to the Language section of Account Settings. Page owners can find instructions on how to enable the composer for Pages here. The composer is currently available on desktop, while the viewing experience works across all platforms.
