Guest Post by Microsoft Student Partner Kai Alan Du, Hacking My 1st Chatbot With special guest: Microsoft Cognitive Services
Hi everyone, my name is Kai, but some people call me by my English middle name, Alan.
I am studying Computer Science at Imperial College London. I am passionate about hackathons, games, drawing and using Photoshop. I look forward to having fun and improving my skills during my time as a Microsoft Student Partner.
Course: Computing MEng
So, you want to make your first chatbot? But not just any chatbot… A chatbot that takes emotion into account. Does detecting emotions in text or parsing language sound too difficult for you? Then Microsoft Cognitive Services is the API for you.
Its easy-to-use yet powerful API will be your lifesaver in making the chatbot.
If you’re looking to hack out a simple chatbot, then put your socks on and keep reading. We will provide a frontend web client for the chatbot and walk through how I made a simple chatbot backend using Python 3. The backend involves the chatbot producing simple responses given a user message.
Prerequisites:
- Somewhat competent with Python 3.
- Python 3 installed.
- Simple WebSocket Server (https://github.com/dpallot/simple-websocket-server) installed.
- Microsoft Cognitive Services Subscription (it’s free ^o^): https://www.microsoft.com/cognitive-services/en-US/subscriptions?mode=NewTrials
- Microsoft Cognitive Services Python SDKs https://github.com/Microsoft?utf8=%E2%9C%93&q=cognitive%20python&type=&language=
Setting up the project:
Download the project from the GitHub repository here:
A rough overview of what the files are.
● generate_reply.py : the file you will be modifying throughout the tutorial.
● generate_reply_complete.py : the completed version of generate_reply.py. Following this tutorial, your generate_reply.py should become similar to generate_reply_complete.py.
● api_key.py : this file stores your Microsoft Cognitive Services API keys. You will need to configure these to contain your API keys for the Text-Analysis and Linguistic-Analysis APIs.
● linguistic.py : Provides a method (wrapper) for the Linguistic-Analysis API.
● sentiment.py : Provides a method (wrapper) to get the sentiment score using the Text-Analysis API.
● index.html : Frontend web client for the chat (uses WebSockets).
● server.py : The WebSocket server for the chat.
Once you have satisfied the prerequisites and configured the API keys in api_key.py, you can start the chat server by running “python server.py” in your terminal.
After that, open index.html in a modern browser. Say “hi” to the bot and it should say “I don’t understand”, which for now is its only reply. Whenever you change the bot’s Python code, restart the Python server (CTRL+C in your prompt terminates it) and then refresh the client. If the bot doesn’t respond, there may be an error in your Python code or the WebSocket connection may have failed.
If you want to skip to the end result without doing the tutorial, in line 2 of server.py, rename generate_reply to generate_reply_complete. Change it back before continuing with the tutorial.
You need to configure the api_key.py file to the API keys of your Microsoft Cognitive Services.
Go to https://www.microsoft.com/cognitive-services/en-US/subscriptions?mode=NewTrials and view your API keys for Text Analysis and Linguistic Analysis and set them to the TEXT_ANALYSIS and LINGUISTIC variables respectively.
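As a rough sketch, assuming api_key.py simply holds the two keys as module-level strings, the configured file could look something like this. The placeholder values are not real keys, and the actual file in the project may contain more than these two lines.

```python
# api_key.py
# Replace each placeholder with the matching key from your subscriptions page.
TEXT_ANALYSIS = "your-text-analysis-key"      # key for the Text Analysis API
LINGUISTIC = "your-linguistic-analysis-key"   # key for the Linguistic Analysis API
```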
Let’s start with a greeting!
Hi there! We will do something very simple - giving the bot the ability to greet.
To do that, we need to check that the first word of the user input is an element of an array of greetings.
Look into generate_reply.py and define the array of greeting words as below:
Next, we need to check that the first word of the message is an element of that array, which tells us that the user’s message is a greeting. We lowercase the word using lower() to make the match case-insensitive.
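The two steps above might look like the sketch below. The names GREETINGS and is_greeting, the particular greeting words, and the one-argument signature of generateReply are my own assumptions for illustration, so adapt them to whatever generate_reply.py in the project already defines.

```python
# Words that count as a greeting (an illustrative list, extend as you like).
GREETINGS = ["hi", "hello", "hey", "yo"]


def is_greeting(message):
    """Return True if the first word of the message is a known greeting."""
    words = message.split()
    if not words:  # empty or whitespace-only message
        return False
    # Lowercase the first word so "Hi" and "HI" both match "hi".
    return words[0].lower() in GREETINGS


def generateReply(message):
    """Reply to greetings, fall back to the default reply otherwise."""
    if is_greeting(message):
        return "Hey there"
    return "I don't understand"
```

Note that a plain split() keeps punctuation attached to the word, so “Hi!” would not match unless you strip the punctuation first.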
To make it more interesting, you may want to replace “Hey there” with a random greeting. To do that, define a list of random greeting responses and import the ‘random’ Python library. ‘random.choice(array)’ will return you a random element of that array.
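A small sketch of the random reply, assuming a list named GREETING_RESPONSES and a helper named random_greeting (both names and the reply wording are made up for this example):

```python
import random

# Possible greeting replies (illustrative wording).
GREETING_RESPONSES = ["Hey there", "Hello!", "Hi, nice to meet you"]


def random_greeting():
    """Pick one of the greeting replies at random."""
    return random.choice(GREETING_RESPONSES)
```

You would then return random_greeting() wherever the bot currently returns the fixed “Hey there”.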
To see your changes, you need to stop the Python server (if it’s running) by hitting CTRL+C on the terminal/prompt, and start the Python server by running python server.py, then refresh your client on your browser.
Let’s get emotional!
Okay, now all we have so far is an emotionless bot who only responds to greetings. Pffft, that’s boring… But how can we infer emotion from the user? How hard can that be? Do we have to implement some complex machine learning algorithm or take a PhD in psychology? Nope. We just leave all the dirty work to our trusty helper, Microsoft’s Cognitive Services.
We will use the text-analysis API. The text-analysis API, given some text, returns you the sentiment score (a continuous value between 0 and 1, 1 indicating an ‘emotionally’ positive sentence, and 0 a negative sentence). This API also returns key phrases of the text, but for now, we are only interested in the sentiment score.
Going back to our generateReply method, we use the sentiment score to produce a reply. So if a user says a positive sentence, the bot will say “I’m happy to hear that”, otherwise “I’m sad to hear that”.
To do that, we need to import our sentiment method and use the sentiment score returned to produce a happy response if it’s more than 0.5, otherwise a sad response, as below:
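A minimal sketch of that threshold logic follows. It takes the score as a plain number so that it runs on its own. In the project the score would come from the wrapper in sentiment.py, whose exact function name is not shown here, and reply_for_sentiment is a name I made up for the example.

```python
def reply_for_sentiment(score):
    """Map a sentiment score between 0 and 1 to a happy or sad reply."""
    if score > 0.5:
        # Closer to 1: the message reads as positive.
        return "I'm happy to hear that"
    # 0.5 or below: treat the message as negative.
    return "I'm sad to hear that"
```

Inside generateReply you would call the sentiment wrapper on the user’s message and pass the returned score to a function like this one.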
You are stupid!
Now let’s go a step up and detect if a user is calling the bot stupid!
To do this, we again enlist the help of Microsoft’s Cognitive Services, this time using the Linguistic Analysis API. This API lets you explore the structure of the text and access its POS (Part-of-Speech) tagging, which labels each word in a given text with a part of speech (e.g. noun, verb, adjective). Don’t worry if this sounds complicated for now; you will get it soon enough.
Looking at the Python code sample for this API, we create a simple wrapper, taking in chat input and returning the input tokenized and POS tagged.
Below is an example of tokenisation and POS tagging applied to “Hello world, I love banitsa!”.
Now we will need to detect “You are cool” or “You are quite stupid”.
To do this, we need to find “you are” in the sentence, followed by 0 or more words, and then the adjective. We want to capture this adjective and use it in the bot’s response. An adjective is represented as ‘JJ’. There are many possible tags (e.g. UH, NN, PRP), but in this tutorial we will only worry about the JJ tag.
Let’s define the method findYouAreJJ(pos) in generate_reply.py. This method returns the relevant adjective in the sentence. If the message does not contain “you are” followed by an adjective somewhere, it will return False. This method takes in POS tagged tokens.
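One way to write this method is sketched below. It assumes that pos is a list of (token, tag) pairs, for example [("You", "PRP"), ("are", "VBP"), ("stupid", "JJ")]. The wrapper in linguistic.py may return a different shape, in which case the loop needs adjusting.

```python
def findYouAreJJ(pos):
    """Return the first adjective after "you are", or False if there is none.

    pos is assumed to be a list of (token, tag) pairs.
    """
    found_you = False      # previous token was "you"
    found_you_are = False  # we have already seen "you are"
    for token, tag in pos:
        word = token.lower()
        if found_you_are:
            # Skip any words in between until we hit an adjective.
            if tag == "JJ":
                return token
        elif found_you and word == "are":
            found_you_are = True
        else:
            found_you = (word == "you")
    return False
```

For the tagged tokens of “You are quite stupid”, this skips the adverb “quite” and returns “stupid”.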
To simplify the tutorial, we will not support cases where the user uses ‘not’, e.g. “You are NOT cool”. Then we can include the adjective (JJ) returned by this function in our response.
We then modify the generateReply method to do just that.
To make it better, we can combine our work with sentiment analysis to produce a more relevant reply.
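As a sketch of how the captured adjective and the sentiment score could be combined, the hypothetical helper below picks a reply based on both. The function name and the reply wording are invented for illustration and are not taken from the project.

```python
def reply_to_adjective(adjective, score):
    """Build a reply from the captured adjective and the sentiment score."""
    if score > 0.5:
        # Positive message, e.g. "You are cool": accept the compliment.
        return "Thank you, I know I am " + adjective + "!"
    # Negative message, e.g. "You are stupid": push back.
    return "I am not " + adjective + "!"
```

In generateReply, you would call this only when the adjective search returned something other than False, passing in the sentiment score of the same message.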
I am cool!
And… we’re finished! Or are we?
Congratulations on hacking your first simple chatbot. Although we are at the end of making our simple chatbot, let’s not finish here. You can improve your chatbot! We have come nowhere near utilising all of Microsoft’s Cognitive Services features, and the chatbot demonstrated here is just a simple hack. For some ideas, you could use the Spell Check API to correct user input, use the Speech or Face Recognition API to detect the emotion of the user without text, make use of the key phrases returned by the Text-Analysis API, and the list goes on.