At Helpr we frequently get questions about the magic behind chatbots. How does a chatbot know what you’re talking about? Why are chatbots trending again lately? How smart is a chatbot? I’ll explain this complex matter in a simple way; no prerequisites are needed.
In this blog series I’ll make complex technical knowledge of chatbots accessible to everyone and discuss the many parts a chatbot consists of. In this first post I want to teach you about the core of a chatbot: Natural Language Processing, or NLP.
Without NLP a chatbot only sees a sequence of characters when a user talks to it. For a chatbot, it’s like reading an alien language. But we humans know that most parts of these sequences have a meaning or represent some concept. Together these words form sentences, each with its own purpose. So we must give the chatbot processable elements and structure.
The first step consists of splitting sequences of characters at periods and whitespace to break them up into sentences and words. Some algorithms go one step further and break up words into syllables. Other algorithms use two or three consecutive words as building blocks (n-grams). Let’s stick to words as our building blocks.
We want to keep track of the words surrounding each word, because these words should be somehow related to each other. For example, physics occurs relatively often in the same context as chemistry or another school subject.
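To make this splitting step concrete, here is a minimal sketch in Python. The function names (split_sentences, split_words, ngrams) and the simple regular expressions are my own assumptions for illustration; real NLP libraries handle abbreviations, punctuation, and other languages far more carefully.

```python
import re

def split_sentences(text):
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def split_words(sentence):
    # Lowercase the sentence and keep runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", sentence.lower())

def ngrams(words, n):
    # Every run of n consecutive words, as a tuple.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

text = "I struggle with physics. I struggle with chemistry."
sentences = split_sentences(text)
tokens = [split_words(s) for s in sentences]
bigrams = ngrams(tokens[0], 2)
```

Here tokens holds one list of words per sentence, and bigrams holds the two-word building blocks of the first sentence.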
Surrounding words should be somehow related to each other
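Keeping track of surrounding words can be sketched as simple counting. The snippet below is only an illustration under my own assumptions: the function name context_counts and the window of two words on each side are hypothetical choices, not the method of any particular chatbot.

```python
from collections import Counter, defaultdict

def context_counts(sentences, window=2):
    # For every word, count which words appear within `window`
    # positions to its left or right in the same sentence.
    counts = defaultdict(Counter)
    for words in sentences:
        for i, word in enumerate(words):
            lo = max(0, i - window)
            hi = min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][words[j]] += 1
    return counts

sentences = [
    ["i", "struggle", "with", "physics"],
    ["i", "struggle", "with", "chemistry"],
]
counts = context_counts(sentences, window=2)
```

In this tiny example physics and chemistry end up with exactly the same neighbors (struggle and with), which is the kind of signal an algorithm can use to decide that two words are related.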
Now for the next challenge: the chatbot has to store this similarity in such a way that it can quickly process and interpret text. You don’t want to wait two minutes for an answer from a human, let alone a chatbot!
Bringing New York into this: A chatbot’s language map.
Most NLP algorithms represent each word as an ordered list of numbers, called a vector. You can compare a vector with the location of a floor in a building in Manhattan. Take, for example, the 12th floor of a building at the intersection of West 50th Street and 7th Avenue. This floor corresponds to the vector (50, 7, 12). Let this vector represent the word physics (physics has a nice apartment). The vector of chemistry might be (51, 7, 11), and the vector of the word summarizing might be on the 4th floor at East 118th Street and 2nd Avenue: (118, 2, 4).
The distance between these floors measures how similar the words are: the shorter the distance, the more similar the words. Physics and chemistry are almost neighbors and thus similar terms for a chatbot. Physics and summarizing, on the other hand, are a long way from each other in Manhattan, so a chatbot will interpret these words as different.
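The Manhattan picture can be checked with a few lines of Python. This sketch uses the three example vectors from the text and plain straight-line (Euclidean) distance. Treating streets, avenues, and floors as equal units is a simplification of mine, and the helper name distance is hypothetical. Real word vectors usually have many more than three numbers.

```python
import math

# The example vectors from the text: (street, avenue, floor).
vectors = {
    "physics": (50, 7, 12),
    "chemistry": (51, 7, 11),
    "summarizing": (118, 2, 4),
}

def distance(word_a, word_b):
    # Straight-line distance between the two word vectors.
    return math.dist(vectors[word_a], vectors[word_b])

near = distance("physics", "chemistry")
far = distance("physics", "summarizing")
print(f"physics-chemistry:   {near:.2f}")
print(f"physics-summarizing: {far:.2f}")
```

The first distance comes out far smaller than the second, which matches the intuition that physics and chemistry are neighbors while summarizing lives in another part of town.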
Vectors in practice
When you teach a chatbot to classify “I struggle with physics” as an expression of someone who has difficulties with a certain school subject, then “I struggle with chemistry” and “I have difficulties with chemistry” will be put in the same class, because physics and chemistry are neighbors. But when you say “I struggle with summarizing texts”, the vector of summarizing is far away from the area of school subjects in Manhattan, so this sentence will be put in another class, probably one for expressions of someone who has difficulties with a study skill.
The algorithm has to assemble these vectors from a huge amount of text written in the same language and about the same subject as the chatbot. Example sentences belonging to each class also have to be written.
There it is! We’ve now learned how the core of NLP in chatbots works. To put this into practice, we use NLU (natural language understanding) with intents and entities to teach the chatbot what to interpret and what to ignore. So tune in for part 2, which explains these.
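The classification idea above can be sketched as a nearest-example lookup. Everything here is a toy under stated assumptions. The class labels school_subject and study_skill, the function names, and the extra word planning with its vector are hypothetical additions of mine. A sentence is reduced to the average of the word vectors the chatbot knows, and all other words are ignored.

```python
import math

WORD_VECTORS = {
    "physics": (50, 7, 12),
    "chemistry": (51, 7, 11),
    "summarizing": (118, 2, 4),
    "planning": (117, 2, 5),  # hypothetical word, not from the text
}

# One labeled example sentence per class (labels are made up).
EXAMPLES = [
    ("I struggle with physics", "school_subject"),
    ("I struggle with planning", "study_skill"),
]

def sentence_vector(sentence):
    # Average the vectors of the words we know; ignore the rest.
    known = [WORD_VECTORS[w] for w in sentence.lower().split()
             if w in WORD_VECTORS]
    if not known:
        return None
    return tuple(sum(axis) / len(known) for axis in zip(*known))

def classify(sentence):
    # Return the label of the closest example sentence, or None
    # when the sentence contains no known word.
    target = sentence_vector(sentence)
    if target is None:
        return None
    best_label, best_dist = None, math.inf
    for example, label in EXAMPLES:
        d = math.dist(target, sentence_vector(example))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

subject = classify("I have difficulties with chemistry")
skill = classify("I struggle with summarizing texts")
```

With these toy vectors the chemistry sentence lands in the school subject class and the summarizing sentence lands in the study skill class, because each sits next door to the example it was never explicitly taught.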