Have you ever wondered, “How does my bot know which answer to choose when responding to my customers?” Gordon Gibson, our Machine Learning Lead at Ada, breaks down everything you need to know about machine learning and the behind-the-scenes processes that make your bot understand, choose, and learn.
What is Machine Learning?
Machine learning (ML) is a set of mathematical models and algorithms that use data to learn how to perform tasks and make predictions without being given explicit rules for doing so. Natural Language Processing (NLP) is the area of computer science concerned with programming computers to process and analyze human language data. Natural Language Understanding (NLU) is a subfield of NLP concerned with programs that can understand the content or meaning of a piece of text. In our case, we apply NLU to help your bot understand what a customer is asking about. Our classification models then learn to detect patterns between questions and answers so that we can send your customers the most relevant answer to their questions.
Every Ada bot performs natural language understanding (NLU) and answer classification. Each of these steps leverages machine learning techniques in one way or another. As an example, let’s say your customer asks your bot, “how do I find my points?” Ada uses NLP techniques to convert the question into a machine-readable format, and then uses answer classification to determine the most relevant answer.
What data goes into Ada's NLU?
First, let’s address the question: “why does Ada even need NLU?” We’ve said that we use it to understand customer questions, but why can’t we apply some fixed rules instead? The reason is that human language is incredibly complex and varied, and it would be very difficult to write a set of rules covering all of the different variations of language that you would want your bot to capture.
Our bots’ NLU learns by being fed examples of questions a customer might ask along with the correct Ada answers for those questions. These are the questions added in the “Training” section of the Answer editor. When you train a question to an answer in Ada, you are effectively telling the bot that when someone asks a similar question in the future, it should respond with that answer. Along with learning the relationships between questions and answers, Ada bots also automatically determine how confident they need to be in their predictions for an answer to actually be delivered to a customer in a chat. We call this the bot’s “confidence threshold”.
When we say our bots learn to identify relationships between questions and answers, we mean they learn which combinations of concepts and words in a customer’s question are likely to indicate that they should be sent a particular answer. For example, if you’ve trained an answer called “How to Add Credits” with five questions, and they all contain the term “add credits,” your bot will deduce that when customers ask questions that contain “add credits” or related concepts, it should respond with the “How to Add Credits” answer. Going back to our original customer question, “how do I find my points?”, how would the bot know to respond with the “How to Add Credits” answer if it were never trained on that exact language? This is where Ada’s proprietary word embedding comes in.
Word embeddings are powerful NLP tools that represent the meaning of words in a way that a computer can work with. They can be thought of as digital thesauruses that are able to recognize synonyms and understand the relationships between words. So when a customer asks a question containing similar concepts to “add credits,” such as “find points” or “discover rewards,” Ada’s word embedding helps bots understand that the customer should receive the same “How to Add Credits” answer. The power of Ada’s word embedding lies in the fact that it is trained on the interactions of customers across all of Ada’s bots. So, while your bot is only ever directly trained on the training data added in your bot’s dashboard, it is able to learn from the knowledge contained in Ada’s entire client base. Put another way, your bot learns the meaning of words in general from all the knowledge across Ada’s client base, but learns how to map a question to an answer based solely on the specific training data added in your bot’s dashboard.
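To make the "digital thesaurus" idea concrete, here is a minimal sketch of how similarity between phrases can be measured once words are represented as vectors. It is not Ada's embedding: the three-dimensional vectors in TOY_EMBEDDINGS are invented for illustration, the words "refund" and "order" are hypothetical additions, and averaging word vectors into a phrase vector is just one simple assumption about how phrases could be compared.

```python
import math

# Hypothetical 3-dimensional word vectors, made up for this example.
# Real embeddings are learned from data and have many more dimensions.
TOY_EMBEDDINGS = {
    "add": [0.9, 0.1, 0.0],
    "find": [0.8, 0.2, 0.1],
    "credits": [0.1, 0.9, 0.0],
    "points": [0.2, 0.8, 0.1],
    "refund": [0.0, 0.1, 0.9],
    "order": [0.1, 0.0, 0.8],
}


def embed_phrase(phrase):
    """Average the vectors of the known words in a phrase."""
    vectors = [TOY_EMBEDDINGS[w] for w in phrase.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        raise ValueError(f"no known words in phrase: {phrase!r}")
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


trained = embed_phrase("add credits")
sim_related = cosine_similarity(trained, embed_phrase("find points"))
sim_unrelated = cosine_similarity(trained, embed_phrase("refund order"))

print(f"add credits vs find points:  {sim_related:.2f}")
print(f"add credits vs refund order: {sim_unrelated:.2f}")
```

With these toy vectors, "find points" scores far closer to "add credits" than "refund order" does, even though the two phrases share no words. That is the property that lets a bot match a question it was never trained on word for word.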
BERT stands for Bidirectional Encoder Representations from Transformers. Before BERT, we relied on static word embeddings to determine how to respond to a user’s question with the most accurate answer available.
BERT improves on static word embeddings in part because of its bidirectionality: it takes into account the words on both the left and the right of each word in your training data strings and in the chatter's input strings, and it creates a numerical vector for each word only after scanning the entire string. For instance, if BERT scans "I took my cell phone to the prison cell and worked on cancer cells in the lab," the vectors created for the first, second, and third "cell" are all different, depending on the surrounding words. With static word embeddings alone, the vector would be the same for all three instances of "cell." This significantly improves the machine's ability to make a semantically context-aware prediction compared with using static word embeddings alone.
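The difference between a static and a context-aware vector can be illustrated with a deliberately simplified sketch. This is not BERT, which uses a trained Transformer network. The toy function below only blends each word's fixed vector with the average of its immediate left and right neighbours, which is enough to show why the same word ends up with different vectors in different contexts. The vectors in STATIC, the window size, and the 50/50 blend are all assumptions made for illustration, and the example sentence is shortened and uses "cell" in the singular throughout.

```python
# Hypothetical static vectors: one fixed vector per word, whatever the context.
STATIC = {
    "cell": [1.0, 0.0, 0.0, 0.0],
    "phone": [0.0, 1.0, 0.0, 0.0],
    "prison": [0.0, 0.0, 1.0, 0.0],
    "cancer": [0.0, 0.0, 0.0, 1.0],
}
ZERO = [0.0, 0.0, 0.0, 0.0]


def static_vector(word):
    """Look up a word's fixed vector; unknown words get a zero vector."""
    return STATIC.get(word, ZERO)


def contextual_vectors(tokens, window=1):
    """Toy stand-in for a contextual model: mix each word's static vector
    with the average of its neighbours on BOTH sides."""
    result = []
    for i, token in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        neighbours = [static_vector(t) for t in left + right]
        if neighbours:
            context = [sum(dim) / len(neighbours) for dim in zip(*neighbours)]
        else:
            context = ZERO
        own = static_vector(token)
        result.append([0.5 * o + 0.5 * c for o, c in zip(own, context)])
    return result


tokens = "i took my cell phone to the prison cell and worked on cancer cell".split()
cell_positions = [i for i, t in enumerate(tokens) if t == "cell"]

static_cells = [static_vector(tokens[i]) for i in cell_positions]
contextual = contextual_vectors(tokens)
contextual_cells = [contextual[i] for i in cell_positions]

print("static:    ", static_cells)
print("contextual:", contextual_cells)
```

The static lookup returns an identical vector for every "cell", while the context-aware version produces three distinct vectors, each pulled toward "phone", "prison", or "cancer" respectively.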
Putting It All Together
Let’s now summarize what happens when a customer asks your bot a question. First, Ada’s word embedding and other NLU tools transform the question into a format that the bot’s classification model can understand. The classification model then generates a ranked list of predictions for the most relevant answer to the question. Along with this list of answers, the model also returns a number indicating its confidence in its top prediction. If the model’s confidence is above the bot’s confidence threshold, the predicted answer will be sent to the customer. If not, the bot will respond with the “Not Understood” (or upcoming “Needs Clarification”) answer and will include its model’s top predictions as clarification buttons for the customer to select. There are two likely causes for the bot not being confident in a prediction: either the customer asked something outside the bot’s realm of understanding, which we call “Not Understood”, or they asked a question that returns multiple possible answers that the model is unable to pick between, which we refer to as “Needs Clarification”.
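The final decision step described above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not Ada's implementation: the function choose_reply, the BotReply type, the max_buttons limit, the answer names "Check Balance" and "Refund Policy", and all confidence values are hypothetical. The only behaviour taken from the text is the rule itself: send the top answer if its confidence is above the threshold, otherwise fall back to “Not Understood” and offer the top predictions as clarification buttons.

```python
from dataclasses import dataclass, field


@dataclass
class BotReply:
    answer: str
    clarification_buttons: list = field(default_factory=list)


def choose_reply(predictions, confidence_threshold, max_buttons=3):
    """Pick a reply from a mapping of answer name -> model confidence.

    `predictions` stands in for the classification model's output.
    """
    # Rank candidate answers from most to least confident.
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return BotReply(answer="Not Understood")

    top_answer, top_confidence = ranked[0]
    if top_confidence > confidence_threshold:
        # Confident enough: send the predicted answer directly.
        return BotReply(answer=top_answer)

    # Not confident: fall back and offer the top predictions as buttons.
    buttons = [name for name, _ in ranked[:max_buttons]]
    return BotReply(answer="Not Understood", clarification_buttons=buttons)


confident = choose_reply(
    {"How to Add Credits": 0.91, "Check Balance": 0.06, "Refund Policy": 0.03},
    confidence_threshold=0.7,
)
unsure = choose_reply(
    {"How to Add Credits": 0.45, "Check Balance": 0.40, "Refund Policy": 0.15},
    confidence_threshold=0.7,
)

print(confident)
print(unsure)
```

In the second call, the two leading answers are nearly tied, so neither clears the threshold and the customer is shown buttons to choose from. This corresponds to the case where the model cannot pick between several plausible answers.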