Facebook just unveiled DeepText, a deep learning-based engine that can understand the textual content of several thousand posts per second with near-human accuracy, spanning more than 20 languages. The idea is that understanding the various ways text is used on Facebook can help the company improve people's experiences with its products.
One of the goals of DeepText is to understand more languages more quickly. Facebook thinks that AI systems such as Parsey McParseface are simply far too time-consuming and do not properly understand or translate slang. DeepText reduces the reliance on language-dependent knowledge, since the system can learn from text with little to no preprocessing. DeepText also uses “word embeddings”, a mathematical concept that preserves the semantic relationship among words. For example, DeepText recognizes that for English and Spanish, “happy birthday” and “feliz cumpleaños” should be very close to each other in the common embedding space.
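To make "close to each other in the embedding space" concrete, here is a minimal sketch that compares phrases by cosine similarity. The three-dimensional vectors are invented purely for illustration and are not DeepText's embeddings; real embeddings are learned from data and have far more dimensions. The names TOY_EMBEDDINGS and cosine_similarity are hypothetical.

```python
import math

# Hypothetical toy vectors, made up for this example only.
# They are NOT values produced by DeepText or any trained model.
TOY_EMBEDDINGS = {
    "happy birthday": [0.90, 0.80, 0.10],
    "feliz cumpleaños": [0.88, 0.82, 0.12],
    "sell my car": [0.10, 0.20, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Phrases with the same meaning in different languages sit close together.
same_meaning = cosine_similarity(
    TOY_EMBEDDINGS["happy birthday"], TOY_EMBEDDINGS["feliz cumpleaños"]
)
# An unrelated phrase points in a different direction.
different_meaning = cosine_similarity(
    TOY_EMBEDDINGS["happy birthday"], TOY_EMBEDDINGS["sell my car"]
)

print(f"same meaning:      {same_meaning:.3f}")
print(f"different meaning: {different_meaning:.3f}")
```

A similarity near 1 means two vectors point in almost the same direction, which is the sense in which a shared embedding space can place translations of the same phrase next to each other.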
Perhaps the most lucrative purpose of DeepText is intent detection. For example, someone might write on Facebook, “I am looking to sell my car for $5,000”. DeepText would be able to detect that the post is about selling something. It would extract information such as the object being sold and its price, and prompt the seller to use existing tools that make these transactions easier through Facebook.
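The input and output of that kind of intent detection can be sketched with a simple rule. This is only a hand-written regular expression standing in for the idea, not how DeepText works; DeepText is described as a deep learning system, and the names detect_sale_intent, SaleIntent and SELL_PATTERN are hypothetical.

```python
import re
from typing import Optional, TypedDict


class SaleIntent(TypedDict):
    item: str
    price: float


# Assumed pattern for this sketch: "sell/selling my|a|an|the <item> for $<price>".
# A learned model would not depend on one fixed phrasing like this.
SELL_PATTERN = re.compile(
    r"\bsell(?:ing)?\s+(?:my|a|an|the)\s+(?P<item>.+?)\s+for\s+"
    r"\$(?P<price>[\d,]+(?:\.\d+)?)",
    re.IGNORECASE,
)


def detect_sale_intent(post: str) -> Optional[SaleIntent]:
    """Return the item and price if the post looks like a sale offer, else None."""
    match = SELL_PATTERN.search(post)
    if match is None:
        return None
    # Drop thousands separators before converting the price to a number.
    price = float(match.group("price").replace(",", ""))
    return {"item": match.group("item"), "price": price}


result = detect_sale_intent("I am looking to sell my car for $5,000")
print(result)
```

A rule like this breaks as soon as the wording changes or slang appears, which is the kind of brittleness a system that learns from raw text is meant to avoid.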
DeepText is also intended to detect sentiment and entities (e.g., people, places, events) using mixed content signals such as text and images, and to remove spam. The announcement noted that many celebrities and public figures use Facebook to start conversations with the public that often draw thousands of comments. DeepText could find the highest-quality or most relevant comments in multiple languages, although the announcement did not state what constitutes a “high-quality” comment.
Facebook aims to better understand people’s intent on its website, but it also has a wider goal. The unstructured data on Facebook gives text understanding systems an opportunity to learn how people naturally use language across multiple languages. The company hopes that this will further advance the state of the art in natural language processing.