'Feels More Human', Say Users as Facebook Open-Sources Its Blender Chatbot

Venturebeat | April 30, 2020

  • FAIR claims that Blender, which is available in open source on GitHub, is the largest-ever open-domain chatbot.

  • Blender promises to make interactions with conversational AI systems like Alexa, Siri, and Cortana more natural than before.

  • To achieve Blender’s state-of-the-art performance, researchers at FAIR focused on two engineering steps: blending skills and generation strategy.

Facebook AI Research (FAIR), Facebook’s AI and machine learning division, today detailed work on a comprehensive AI chatbot framework called Blender. FAIR claims that Blender, which is available in open source on GitHub, is the largest-ever open-domain chatbot and outperforms existing approaches to generating dialogue while “feel[ing] more human,” according to human evaluators.

FAIR says Blender is the culmination of years of research to combine empathy, knowledge, and personality into one system. To this end, the underlying models — which benefit from improved decoding and skill-blending techniques — contain up to 9.4 billion parameters (the internal variables, learned during training, that determine a model’s skill on a given problem), or 3.6 times more than previous systems.

Blender promises to make interactions with conversational AI systems like Alexa, Siri, and Cortana more natural than before, whether in enterprise, industrial, or consumer-facing contexts. That’s because they’re able to ask and answer a wide range of questions; display knowledge about specific topics; and express sentiments like empathy, seriousness, or playfulness as circumstances dictate.


Blending skills and generation strategies

To achieve Blender’s state-of-the-art performance, researchers at FAIR focused on two engineering steps: blending skills and generation strategy.

“Blending skills” refers to fine-tuning models on tasks that emphasize desirable conversational skills, such as the personality, knowledge, and empathy mentioned above. As the FAIR researchers point out in a paper, a model tuned this way can outperform larger models that lack tuning. As it turns out, tuning can also minimize undesirable traits learned from large data sets, such as toxicity.

With respect to generation strategy, the choice of decoding algorithm — the algorithm used to generate text from a language model — has an outsized impact on a chatbot’s responses. Because the length of a bot’s responses tends to correspond to human judgments of quality, decoders that strike an appropriate balance are desirable: responses that are too short are typically perceived as dull or showing a lack of interest, while those that are too long imply waffling or distraction.

Over the course of these engineering steps, the researchers tested three types of model architectures, all of which used Transformers as a base. Transformers — a Google innovation — contain neurons (mathematical functions) arranged in layers that transmit signals from input data and adjust the strength (weights) of each connection, as with all deep neural networks. That’s how they extract features and learn to make predictions, but Transformers also have attention. This means every output element is connected to every input element and the weightings between them are calculated dynamically.
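As a rough sketch of that dynamic weighting, here is a toy scaled dot-product self-attention in NumPy; this illustrates the mechanism in general, not FAIR's implementation:

```python
import numpy as np

def attention(queries, keys, values):
    """Scaled dot-product attention: every output element is a mix of all
    input elements, with the mixing weights computed dynamically from the data."""
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)        # pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over input positions
    return weights @ values                         # weighted sum of values

# Self-attention over 3 positions with 4-dimensional representations
x = np.random.default_rng(0).normal(size=(3, 4))
out = attention(x, x, x)  # queries = keys = values
print(out.shape)          # (3, 4)
```

In a full Transformer, queries, keys, and values are learned linear projections of the input, and many such attention "heads" run in parallel inside each layer.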

First up was a retriever model that, given a dialogue history (or context) as input, selected the next dialogue response by scoring a large set of candidate responses and outputting the highest-scoring one. The FAIR researchers employed a poly-encoder architecture that encoded features of the context using representations attended to by each candidate response, which they say resulted in improved performance while remaining “tractable” to compute, compared with other architectures, like cross-encoders.
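The candidate-scoring idea can be illustrated with a heavily simplified, untrained stand-in for the poly-encoder, in which each candidate attends over a few context codes before a final dot-product score. The vectors below are hand-picked for illustration; a real model learns them:

```python
import numpy as np

def retrieve(context_codes, candidates):
    """Score each candidate response against m context codes: the candidate
    attends over the codes to build a candidate-specific context summary,
    then is scored against that summary by dot product."""
    att = candidates @ context_codes.T                   # (n_cand, m) similarities
    att = np.exp(att - att.max(axis=-1, keepdims=True))
    att /= att.sum(axis=-1, keepdims=True)               # softmax over codes
    ctx = att @ context_codes                            # per-candidate context summary
    scores = (candidates * ctx).sum(axis=-1)             # final dot-product scores
    return int(np.argmax(scores))                        # highest-scoring candidate

codes = np.array([[1.0, 0.0], [1.0, 0.0]])  # context points along dimension 0
cands = np.array([[1.0, 0.0], [0.0, 1.0]])  # candidate 0 matches the context
print(retrieve(codes, cands))  # 0
```

The key tractability property is that each candidate interacts with only a small number of context codes, rather than attending over every context token as a cross-encoder would.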

The second model was a generator that produced responses rather than retrieving them from a fixed set. Three sizes were considered: 90 million, 2.7 billion, and 9.4 billion parameters.

The third model attempted to address issues with the generator, namely its tendency to synthesize repetitive responses and to “hallucinate” knowledge. It took a “retrieve and refine” (RetNRef) approach, where the above-described retrieval model produced a response when provided a dialogue history, which was then appended to the input sequence of the generator. In this way, the generator learned when to copy elements of responses from the retriever and when not to so it could output more interesting, engaging, and “vibrant” responses. (Retriever models produce human-written responses that tend to include more vibrant language than standard generative models.)
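The retrieve-and-refine wiring can be sketched as follows; the `[SEP]` separator and the word-overlap scorer are illustrative stand-ins, not FAIR's actual tokenization or retriever:

```python
def build_retnref_input(history, candidates, score_fn, sep=" [SEP] "):
    """Retrieve and refine (sketch): run the retriever over the candidate set,
    then append its top response to the generator's input sequence so the
    generator can learn when to copy phrasing from it."""
    best = max(candidates, key=lambda c: score_fn(history, c))  # retrieval step
    return history + sep + best                                 # generator input

def overlap(history, cand):
    """Toy retrieval scorer: word overlap with the dialogue history."""
    return len(set(history.lower().split()) & set(cand.lower().split()))

history = "do you like jazz music"
cands = ["i love jazz music !", "the weather is nice"]
print(build_retnref_input(history, cands, overlap))
# do you like jazz music [SEP] i love jazz music !
```

The generator then conditions on this combined sequence, free to reuse the retrieved reply's wording or ignore it.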

The FAIR team paired a Wizard Generative model with another retriever that together determined when to incorporate knowledge into chatbot responses. The two models produced a set of initial knowledge candidates, ranked those candidates, and then selected a single sentence to condition response generation. A classifier chose whether to perform retrieval on a per-dialogue basis, so as to avoid serving knowledge when it wasn’t required.
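A minimal sketch of that two-stage decision, with all four components replaced by toy stand-ins for what would in practice be learned models:

```python
def knowledge_grounded_reply(history, candidates, needs_knowledge, rank, generate):
    """Sketch of the two-stage scheme: a classifier first decides whether this
    dialogue needs retrieved knowledge at all; if it does, the candidate
    sentences are ranked and the single best one conditions generation."""
    if not needs_knowledge(history):
        return generate(history, knowledge=None)
    best = max(candidates, key=lambda s: rank(history, s))
    return generate(history, knowledge=best)

# Toy stand-ins for what would be learned models
def needs(history):
    return "?" in history  # pretend only questions need facts

def rank(history, sentence):
    return len(set(history.split()) & set(sentence.split()))  # word overlap

def gen(history, knowledge):
    return f"reply({knowledge})" if knowledge else "chitchat"

print(knowledge_grounded_reply(
    "who wrote hamlet ?",
    ["hamlet was written by shakespeare", "paris is in france"],
    needs, rank, gen))
# reply(hamlet was written by shakespeare)
```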




For the generative models, the FAIR researchers used a beam search decoder to generate responses to given dialogue contexts. Beam search maintains a set of partially decoded sequences, called hypotheses, which are extended token by token and scored at each step so the best sequences bubble to the top.
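A minimal beam search over a toy language model looks like this; the toy model, its tokens, and its probabilities are invented for illustration:

```python
import math

def beam_search(step_fn, start, beam_size=2, max_len=5):
    """Minimal beam search: keep the `beam_size` best partial hypotheses,
    extend each with every next-token option, re-score, and repeat.
    `step_fn(seq)` returns {token: probability} for the next position."""
    beams = [(start, 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(max_len):
        expanded = []
        for seq, logp in beams:
            if seq and seq[-1] == "<eos>":
                expanded.append((seq, logp))  # finished hypothesis, keep as-is
                continue
            for tok, p in step_fn(seq).items():
                expanded.append((seq + [tok], logp + math.log(p)))
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:beam_size]
        if all(s[-1] == "<eos>" for s, _ in beams):
            break
    return beams[0][0]

def toy_lm(seq):
    """Stand-in for a language model: a lookup table of next-token distributions."""
    table = {
        (): {"hello": 0.6, "hi": 0.4},
        ("hello",): {"there": 0.9, "<eos>": 0.1},
        ("hi",): {"<eos>": 1.0},
        ("hello", "there"): {"<eos>": 1.0},
    }
    return table[tuple(seq)]

print(beam_search(toy_lm, []))  # ['hello', 'there', '<eos>']
```

Note that hypotheses are ranked by cumulative log-probability, which is why plain beam search tends to favor short responses and needs the length controls described next.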

To control the length of the chatbot’s responses, the FAIR team considered two approaches: a hard constraint on the minimum generation length and a classifier that predicted the length of responses and set the minimum generation length constraint to its corresponding prediction. The latter was more complex but resulted in variable-length responses to questions, ensuring the chatbot served long responses when they seemed appropriate.
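The hard minimum-length constraint amounts to masking the end-of-sequence token until enough tokens have been generated. A sketch, with illustrative token names and probabilities:

```python
def apply_min_length(token_probs, generated_so_far, min_length, eos="<eos>"):
    """Hard minimum-length constraint (sketch): zero out the end-of-sequence
    token until `min_length` tokens have been generated, then renormalize.
    The predicted-length variant would set `min_length` per response from a
    classifier's prediction instead of a fixed value."""
    if len(generated_so_far) < min_length and eos in token_probs:
        probs = {t: p for t, p in token_probs.items() if t != eos}
        total = sum(probs.values())
        return {t: p / total for t, p in probs.items()}
    return token_probs

dist = {"fun": 0.3, "really": 0.2, "<eos>": 0.5}
print(apply_min_length(dist, ["that", "was"], min_length=5))
# {'fun': 0.6, 'really': 0.4}
```

Applied inside beam search, this forces every hypothesis to keep generating past the minimum length before it is allowed to end.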


Training the models

To prep the various models that make up Blender, the researchers first performed pretraining, a step that conditions machine learning models for particular tasks. They used Facebook’s own Fairseq, a toolkit that supports the training of custom language models, with data samples from a Reddit corpus containing 1.5 billion comments (with two sets of 360,000 comments each reserved for validation and testing) pruned for known bots, non-English subreddits, deleted comments, comments with a URL, and comments of a certain length.
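The pruning heuristics can be sketched as a filter over comment records; the field names, bot heuristic, and length bounds below are illustrative assumptions, not FAIR's exact criteria:

```python
import re

def keep_comment(comment):
    """Pruning heuristics in the spirit of those described above (illustrative)."""
    text = comment["body"]
    if comment.get("author", "").lower().endswith("bot"):  # known bots
        return False
    if comment.get("subreddit_lang", "en") != "en":        # non-English subreddits
        return False
    if text in ("[deleted]", "[removed]"):                 # deleted comments
        return False
    if re.search(r"https?://", text):                      # comments with a URL
        return False
    if not (5 <= len(text.split()) <= 128):                # assumed length bounds
        return False
    return True

sample = {"author": "helper_bot", "body": "see https://example.com",
          "subreddit_lang": "en"}
print(keep_comment(sample))  # False
```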

Next, the FAIR team fine-tuned the models using another Facebook-developed suite — ParlAI — designed for training and testing dialogue models. One training corpus selected was ConvAI2, which contains 140,000 utterances involving paired volunteers getting to know each other by asking and answering friendly questions. Another was Empathetic Dialogues, which consists of 50,000 crowdsourced utterances grounded in an emotional situation. Yet another data set — the Wizard of Wikipedia — comprises 194,000 utterances of 1,250 topics, where each conversation begins with a randomly chosen topic and the goal is to display expert knowledge.

A fourth fine-tuning data set — Blended Skill Talk — aimed to blend the previous three sets (ConvAI2, Empathetic Dialogues, and Wizard of Wikipedia) to combine their respective skills during dialogue. Here, 76,000 utterances were collected with a guided and unguided human speaker, where the guided speaker could select utterances suggested by bots trained on the three individual data sets.



Post-training, the researchers evaluated Blender’s performance by comparing it with Google’s latest Meena chatbot, a machine learning model with 2.6 billion parameters. Human volunteers were tasked with answering two questions — “Who would you prefer to talk to for a long conversation?” and “Which speaker sounds more human?” — given 100 publicly released and randomized logs from Meena and the same number of logs generated by Blender. In each case, the volunteers were shown series of dialogues between humans paired with the respective chatbots.

The topics of conversation ranged from cooking, music, movies, and pets to yoga, veganism, instruments, and malls — with the Blender models often going into detail when asked and naming relevant stores, bands, movies, actors, pet species, and pet names. In one example, Blender offered a nuanced answer to a question about how Bach compared with Justin Bieber, while a request that Blender write a song indeed yielded lyrics — although nothing particularly poetic.

When presented with chats showing Meena in action and chats showing Blender in action, 67% of the evaluators said the best-performing Blender-powered chatbot — the one with a generative model containing 9.4 billion parameters pretrained on the Blended Skill Talk corpus — sounded more human. About 75% said they’d rather have a long conversation with the 2.7 billion-parameter fine-tuned model than with Meena. And in an A/B comparison between human-to-human and human-to-Blender conversations, the volunteers expressed a preference for models fine-tuned on Blended Skill Talk 49% of the time, while models trained only on public domain conversations were preferred just 36% of the time.

Problematically, further experiments showed that Blender sometimes produced responses in the style of offensive samples from the training corpora — mostly from Reddit comments. The FAIR researchers say that fine-tuning on the Blended Skill Talk data set mitigated this to an extent but addressing it comprehensively would require using an unsafe word filter and a kind of safety classifier.

“We’re excited about the progress we’ve made in improving open-domain chatbots,” wrote Facebook in a blog post. “However, building a truly intelligent dialogue agent that can chat like a human remains one of the largest open challenges in AI today … True progress in the field depends on reproducibility — the opportunity to build upon the best technology possible. We believe that releasing models is essential to enable full, reliable insights into their capabilities.”

The pretrained and fine-tuned Blender models with 90 million parameters, 2.7 billion parameters, and 9.4 billion parameters are available on GitHub, along with a script for interacting with the bot (with safety filtering built in). All code for model evaluation and fine-tuning, including the data sets themselves, is available in ParlAI.




Related News

DefinedCrowd Rebrands as Defined.ai, Reflecting Expanded Position as a Developer Platform for Artificial Intelligence

Defined.ai, the leading provider of data, models and tools for Artificial Intelligence, today announced a rebranding in response to continued company growth and the evolution of the development and application of Artificial Intelligence, which is impacting companies in all sectors, from healthcare and retail to finance and consumer goods. At the center of this rebranding is a change of the company name to Defined.ai and an update to the corporate logo and tagline. With its product suites now folding under one name, the company can continue to scale the business to new heights, moving beyond a resource for crowd-sourced data gathering to a comprehensive AI platform and marketplace that embraces and empowers a new era of AI development. While Defined.ai will continue to offer its existing services, including custom collection, data crowdsets, and white-glove support, the product brands DefinedData, DefinedWorkflows, DefinedSolutions and DefinedCrew will all merge under the cohesive Defined.ai umbrella.

As companies look to invest in AI technology, development teams have increasingly realized how critical AI models with robust and diverse datasets are to building a good product. If AI is built on data that is representative of the populations it serves, the end result will be more successful and have a higher degree of consumer engagement and support. The Defined.ai platform gives AI model builders a place to create and trade the tools and datasets necessary to develop successful AI models that drive key business goals. Developers can purchase datasets through subscriptions or one-off transactions, sell their own datasets as third-party vendors, or request highly specialized, custom datasets built by the Defined.ai team.

"Today marks a major milestone for Defined.ai, as AI technology continues to be incorporated into every aspect of our lives, both in the tech stack and geographically. Our team is reshaping the way the AI industry innovates by changing the way AI developers build and add to the value chain of AI, starting with the data. By advocating for ethical and bias-free models, and enabling the world with the tools that will make their products as inclusive as possible, the Defined.ai team is creating the future in real time. We're setting the standard and promoting the already rapid evolution, adoption, and application of AI technology."
Daniela Braga, CEO and Founder, Defined.ai

About Defined.ai

Defined.ai is on a mission to enable the creators of the future. At Defined.ai, we believe AI should be created as we raise our children, with the responsibility to make it the best version possible, to be fair, kind, inclusive and to strive for a better world. That's why we provide high-quality AI training data, tools, and models to the creators of the future. We offer data scientists the solutions to get it just right, from datasets to bootstrap their models and keep their projects moving, to the final tuning in domains and perfection in accents and phonetics. We host the leading AI marketplace, where data scientists can buy and sell off-the-shelf datasets, tools and models, and we provide customizable workflows that can be used to generate datasets tuned to their needs. And, because the future of AI is complicated, Defined.ai also offers professional services to help deliver success in complex machine learning projects.

Read More


Uniphore Announces “Uniphore Unite” Partner Program to Accelerate Global AI and Automation Innovation

Uniphore, the leader in Conversational Automation, today announced its Uniphore Unite partner program to support a rapidly expanding market that is seeing the benefits of using Artificial Intelligence (AI) and automation technology to significantly improve customer experience (CX). Uniphore Unite is a robust partner program that includes essential resources to support the partner lifecycle end-to-end and enables partners to leverage Uniphore’s best-of-breed, innovative technology to expand their portfolio and profitability. Uniphore provides a unique value proposition that combines improved CX with a strong return on investment, increasing customer satisfaction while driving cost savings. Customers can now view and take advantage of the services expertise, capabilities, and complementary technology of the partners in Uniphore’s Unite program to achieve these returns.

“Uniphore has always been committed to building a robust partner ecosystem to support our customers. With the launch of Uniphore Unite, we enhance the value of our industry-leading AI and automation solutions by partnering with world-class services and complementary technology firms. Uniphore Unite provides structure and foundation for enhanced partner collaboration and will facilitate the creation of a strong community built around the mission to transform CX across the board.”
Jafar Syed, SVP, Global Head of Channel Alliances & Partnerships at Uniphore

Uniphore Unite offers a range of programs to support each partner’s business model, including referral, resell, managed services, co-selling, and services, delivering the resources this global community needs for success. There are three program levels in the reseller and Business Process Outsourcer (BPO) programs, providing support for partners of all sizes:

  • Uniphore Reseller/Unite BPO: Unite’s entry level, which allows new partners to ramp up, build skills and drive increased revenue

  • Unite Pro: For companies that have established a relationship with Uniphore and participated in key sales and technical training

  • Unite Pro+: Designed for organizations that have developed a strong partnership with Uniphore and are consistently collaborating on sales, marketing and training opportunities

Partners who join Uniphore Unite will benefit from the program in numerous ways, including:

  • Significant Partner Resources: The initial package of partner resources includes sales training, technical training and support, dedicated channel teams, deal registration and co-selling, marketing and sales assets and support, and a comprehensive rewards program

  • Partner Helpdesk: The Partner Help Desk will be available to all Unite members to provide world-class support via web conferencing, email and phone

  • Marketing Development Funds (MDF): The Uniphore Unite MDF program provides not only funding but also access to an experienced global marketing agency to assist partners in planning, messaging, positioning, demand generation and other go-to-market activities

  • Partner Advisory Council: The advisory council enables strategic partners to give direct feedback and to engage consistently with key members of the Uniphore team to build a strong partner community

  • App Alliances Program: This complementary ISV program includes benefits around co-selling and positioning Uniphore’s solutions with these partners

The launch of Uniphore Unite is yet another milestone indicative of Uniphore’s accelerating momentum in the market. In addition to its latest $150M Series D funding announced in March 2021, Uniphore has announced numerous product innovations and two acquisitions so far this year: Emotion Research Labs and Jacada. With the acquisition of Jacada, Uniphore is the leading vendor that can truly deliver front- and back-office automation across every customer and agent interaction, optimizing every conversation and delivering it in a simplified, business-user-friendly UX environment and desktop. Uniphore Unite will enable the company’s global partners and their customers to take full advantage of this innovative platform.

About Uniphore

Uniphore is the global leader in Conversational Automation. Every day, billions of conversations take place across industries: customer service, sales, HR, education and more. Whether they are human to human, human to machine or machine to machine, conversations are at the heart of everything we do, and the new currency of the enterprise.

Read More


The expert.ai NL API Now Available in AWS Marketplace

Expert.ai announced today that its natural language (NL) API providing deep language understanding is now available in the AWS Marketplace, a digital catalog with thousands of software listings from independent software vendors that makes it easy to find, test, buy, and deploy software that runs on Amazon Web Services (AWS). The expert.ai NL API is a powerful way to structure unstructured language data, leveraging deep language intelligence with minimal effort. The API identifies which meaning of a word is used in context ("disambiguation") to quickly analyze text for key elements, relations, classifications and more. It can also determine sentiment and even capture a range of 117 behavioral and emotional traits, providing the richest, most comprehensive and granular emotional and behavioral taxonomy available in the AI-based API ecosystem. Furthermore, using built-in technologies and its extensive knowledge graph, the expert.ai NL API can be used in more targeted ways to identify sensitive data (to protect customers, victims, users or research subjects, as well as to comply with data privacy regulations), media-related topics, geographical taxonomies and more.

"At expert.ai, we aim to make it easy for developers and data scientists to design, build and test NL-aware functions and easily embed advanced natural language understanding and natural language processing capabilities into their apps. The availability of our NL API in the AWS Marketplace expands this opportunity to more users: we are excited to offer all of the insights the NL API provides to enrich business data, understanding it in less time, at scale and in the most precise way."
Brian Munz, product manager, NL API & developer experience at expert.ai

AWS customers can quickly begin extracting insight from their unstructured language data by using the expert.ai NL API with their existing AWS account. The expert.ai NL capabilities can be accessed via two feature options:

  • Core Bundle: includes semantic analysis, part-of-speech tagging, morphological analysis, text subdivision, dependency parsing, lemmatization, named entity recognition, key phrase extraction and relation extraction

  • Premium Bundle: includes sentiment analysis; IPTC media topics; geographic, emotional traits and behavioral traits taxonomies; personally identifiable information (PII) detection; and writeprint for performing a stylometric analysis of business documents

About expert.ai

Expert.ai is the premier artificial intelligence platform for language understanding. Its unique approach to hybrid natural language combines symbolic human-like comprehension and machine learning to extract useful knowledge and insight from unstructured data and improve decision making. With a full range of on-premises, private and public cloud offerings, expert.ai enhances business operations and accelerates and scales natural language data science capabilities while simplifying AI adoption across a vast range of industries, including insurance, banking & finance, publishing & media, defense & intelligence, life science & pharma, and oil, gas & energy. Expert.ai has cemented itself at the forefront of natural language solutions and serves global businesses such as AXA XL, Zurich Insurance Group, Generali, The Associated Press, Bloomberg INDG, BNP Paribas, Rabobank, Gannett, and EBSCO.

Read More