Chatbot
Client: Big EU-based corporation
Length: 5 months
Goal: Create a chatbot that connects a user with the colleague who can help with a current topic or problem.

Tech: Python, Docker



Speak to us about your software development requirements:



The client approached us with an idea for a chatbot that finds the best person among a user's colleagues to help with a current topic or problem. The chatbot interacts with employees, learns what they are working on, and derives their areas of expertise. Once it has this knowledge about the employees, it is ready to assist: when a user (employee) needs help on some topic, the chatbot finds the colleague with the most relevant experience. NLP and machine learning techniques enable us to implement this matching. When the chatbot is in doubt about a match or needs more information, it can ask additional questions to get more context from the user.


The approach we followed for creating the chatbot app included defining the architecture, the conversation flow, and the matching algorithm. Once we had an initial database of employee descriptions in raw-text format, we implemented data extraction and matching. For extracting important information from text, we relied on a number of NLP methods and algorithms. Matching was framed as a classification problem, while clustering was used to define the right clarifying questions.


The “complete my profile” part of the chatbot (collecting user data) was implemented as a state machine. If information is missing from a user's profile, this part is activated on every second login. The user is asked only the questions with incomplete answers and can cancel “complete my profile” at any point; in that case, the same question is asked again on a following login. The collected information is stored for later use, training the model and facilitating matching. In addition, users can enter daily status reports about their current work at any point, providing additional information for our machine learning model.
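The state-machine flow described above can be sketched as follows. This is an illustrative assumption of how such a flow might look: the profile field names, the login-counting rule, and the class name are all hypothetical, not the client's actual implementation.

```python
# Hypothetical sketch of the "complete my profile" state machine.
# PROFILE_FIELDS and the every-second-login rule are illustrative assumptions.

PROFILE_FIELDS = ["role", "projects", "skills"]

class ProfileBot:
    def __init__(self):
        self.answers = {}       # field -> user's answer
        self.login_count = 0

    def missing_fields(self):
        return [f for f in PROFILE_FIELDS if f not in self.answers]

    def on_login(self):
        """On every second login, return the next question to ask,
        or None if the profile is complete or it is an off login."""
        self.login_count += 1
        missing = self.missing_fields()
        if missing and self.login_count % 2 == 0:
            return missing[0]
        return None

    def answer(self, field, text):
        """Store an answer; answered questions are never asked again."""
        self.answers[field] = text

    def cancel(self):
        """Cancelling leaves missing fields untouched, so the same
        question comes up again on a later login."""
        pass
```

Because unanswered questions simply stay in `missing_fields`, cancelling needs no extra bookkeeping: the next eligible login re-asks the same question.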

As for asking questions and the NLP methods, the first step was processing the questions and user texts (user descriptions). The NLP steps we used were: autocorrection, lowercasing, text cleanup, abbreviation replacement, phrase detection and replacement, removal of unimportant and noisy words, stopword removal, lemmatization, and stemming. We improved our model by introducing synonyms and word relations, using Google's Word2vec model to calculate similarities between words appearing in the dataset provided by the client. Each user is described by a number of attributes (dimensions). To estimate the probability that a user is a good match for a given question, the text of the question is compared against the texts of all user dimensions and matching features are created, taking into account word relations/similarities, word importance, and phrases. These features were used to train a classification model that outputs the probability that a user is a good match.
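A few of the steps above can be sketched in miniature. This is a toy illustration, not the production pipeline: the stopword list and abbreviation map are placeholders, and the hard-coded similarity table stands in for cosine similarities that Word2vec would provide.

```python
import re

# Toy stand-ins for the real preprocessing resources (assumptions).
STOPWORDS = {"the", "a", "an", "is", "on", "with", "i", "need", "help"}
ABBREVIATIONS = {"ml": "machine learning", "db": "database"}

def preprocess(text):
    text = text.lower()                               # lowercasing
    text = re.sub(r"[^a-z0-9\s]", " ", text)          # text cleanup
    tokens = []
    for tok in text.split():
        tok = ABBREVIATIONS.get(tok, tok)             # abbreviation replacement
        tokens.extend(tok.split())
    return [t for t in tokens if t not in STOPWORDS]  # stopword removal

# Placeholder for Word2vec-derived similarities between dataset words.
SIMILARITY = {("database", "sql"): 0.8, ("python", "django"): 0.7}

def word_sim(a, b):
    if a == b:
        return 1.0
    return SIMILARITY.get((a, b), SIMILARITY.get((b, a), 0.0))

def match_score(question, user_description):
    """One possible matching feature: for each question token, take the
    best similarity to any word in the user's text, then average."""
    q = preprocess(question)
    d = preprocess(user_description)
    if not q or not d:
        return 0.0
    return sum(max(word_sim(qt, dt) for dt in d) for qt in q) / len(q)
```

In the real system, several such features (word importance, phrases, and so on) would feed the classifier rather than being used as a score directly.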

For generating clarifying questions we used clustering: all keywords describing users' expertise were grouped into clusters using Word2vec, which maps similar or co-occurring words to similar vectors, and the K-means algorithm. The clusters are used in the clarifying-question phase: while selecting the top experts for a given question, if the chatbot is in doubt about which expert to recommend, it checks which clusters they belong to and can filter and separate them by the cluster keywords that describe their expertise. Using these keywords, the chatbot generates clarifying questions to get more context from the user.
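The keyword-clustering step can be illustrated with a minimal sketch. The 2-D vectors below are toy stand-ins for real Word2vec embeddings, and the hand-rolled K-means (with deterministic initialization, so the sketch is reproducible) replaces whatever library implementation the project actually used.

```python
import math

# Toy "embeddings" standing in for Word2vec vectors (assumption).
KEYWORD_VECTORS = {
    "python":   (0.9, 0.1),
    "django":   (0.8, 0.2),
    "flask":    (0.85, 0.15),
    "sql":      (0.1, 0.9),
    "postgres": (0.2, 0.8),
    "database": (0.15, 0.85),
}

def kmeans(points, k, iters=20):
    # Deterministic init (first k points) keeps this sketch reproducible.
    centers = points[:k]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[i].append(p)
        centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def cluster_keywords(k=2):
    """Group expertise keywords by assigning each to its nearest center."""
    words = list(KEYWORD_VECTORS)
    points = [KEYWORD_VECTORS[w] for w in words]
    centers = kmeans(points, k)
    clusters = {}
    for w, p in zip(words, points):
        i = min(range(k), key=lambda c: math.dist(p, centers[c]))
        clusters.setdefault(i, []).append(w)
    return list(clusters.values())
```

With clusters like these, two candidate experts who tie on overall score but sit in different keyword clusters can be separated by a clarifying question built from the cluster keywords (e.g. "Is this about Python web frameworks or databases?").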


We created a chatbot that collects user information and recommends a relevant expert. A range of text-processing techniques was applied to extract information from raw text, which is then used to train our machine learning model. Every conversation with the chatbot is saved and available for future use, enabling retraining and further improvement.

