Anyone who has ever shopped on Amazon or listened to music on Spotify will be familiar with recommendations: the user is offered items based on browsing history, purchase history and other indicators of how they have interacted with a product, artist, etc.
Recommendations often take one of these forms:
- Frequently Bought Together: Items that are frequently bought or used in conjunction with each other. For example, if you look at a SONOS Play:1 speaker on Amazon, it will tell you that the Flexson Desk Stand for Sonos PLAY:1 is frequently bought with the speaker. Amazon show this under the 'Frequently Bought Together' and 'Customers Who Bought This Item Also Bought' sections on every product page.
- Item-to-item Recommendations: Items that are closely related to each other based on generic historical user interactions such as click-through, add-to-cart, download or any other kind of usage. On Spotify, if you look at any artist, you'll see 'Related Artists'.
- Personalised Recommendations: These are like item-to-item recommendations, but the recommendations are based on a user's personal interactions. On Amazon, you'll see this type of recommendation labelled as 'Related to items you've viewed' or 'Inspired by your purchases' on the home page.
Recommendations are a proven way to boost conversions, discoverability and user satisfaction for any digital service.
The Cognitive Services Recommendations API
The API is free to use for up to 10,000 transactions a month and relatively inexpensive after that.
Your Catalogue & Features
To use the API, you first need to provide a catalogue, which is a CSV file containing your items. These don't have to be products in an e-commerce store; they can be songs, files or anything that a user will 'use' or interact with.
As a minimum, you'll need to provide an Item Name and Item Category (see the full catalogue schema here).
You can optionally provide a list of 'features', which are additional categorical data points about the item. Features are used by the machine learning algorithms behind the API to make usage connections, so the more features you can provide, the better (up to a limit of 20). For example, if you are providing recommendations on bands, you could provide the music genre as the category and also provide location, year established, number of members and last gig location as features. Features are defined as key/value pairs in the CSV file.
Features are used by the model when there is not enough transaction data to provide recommendations on transaction information alone. So features will have the greatest impact on “cold items” – items with few transactions. If all your items have sufficient transaction information you may not need to enrich your model with features.
As an example, the catalogue entry for Metallica might look like this, where '1234' is some unique ID:
1234,Metallica,Metal,location=San Francisco,yearestablished=1981,numberofmembers=4,lastgiglocation=Mexico City
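If you generate your catalogue from a database, a small helper can build each line for you. The following is only a minimal sketch: the function name catalogue_row is made up for illustration, and the column layout (ID, name, category, then key=value features) simply mirrors the example line above, so check it against the full catalogue schema before relying on it.

```python
import csv
import io

MAX_FEATURES = 20  # the feature limit mentioned in the text above


def catalogue_row(item_id, name, category, features=None):
    """Return one catalogue CSV line: id, name, category, then key=value features.

    The column layout is an assumption copied from the example line in the article.
    """
    features = features or {}
    if len(features) > MAX_FEATURES:
        raise ValueError(f"at most {MAX_FEATURES} features per item")
    fields = [item_id, name, category]
    fields += [f"{key}={value}" for key, value in features.items()]
    buffer = io.StringIO()
    # csv.writer quotes any field that itself contains a comma or a quote.
    csv.writer(buffer, lineterminator="").writerow(fields)
    return buffer.getvalue()


print(catalogue_row(
    "1234", "Metallica", "Metal",
    {"location": "San Francisco", "yearestablished": 1981,
     "numberofmembers": 4, "lastgiglocation": "Mexico City"},
))
```

Using the csv module rather than joining strings by hand means an item name that contains a comma is quoted instead of silently shifting the columns.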
If you are planning to use features, please take note of the advice in the 'Building your model' section.
You can get an example catalogue containing data about books, including feature data from http://aka.ms/RecoSampleData
Your Usage Data
Once you've uploaded your catalogue, you can then upload your usage data.
Usage data describes transactions where users have interacted with catalogue items. This could be downloading a song, viewing a details page, adding to cart, purchasing etc.
Usage data files have to contain the following data points (see the full usage schema here):
UserID: Some unique value that identifies a user
ItemID: The ID of the item that was used. This corresponds to the item IDs in your catalogue.
Time: A date/time stamp for the time when the usage occurred.
You can also optionally define the Event, which indicates the type of usage that occurred. Events must be one of the types listed in the usage schema, and default to Purchase if not defined.
As with the catalogue file, the usage data is in CSV format. For example, if someone had viewed the artist details page for Metallica, the usage record might look like this (where 9876 is a unique user ID)
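If your application logs usage itself, a helper like the one below can turn each event into a CSV line. This is a hedged sketch only: usage_row is a hypothetical name, the column order (UserID, ItemID, Time, then the optional Event) and the ISO-style timestamp are assumptions, and the date is made up, so confirm all of them against the usage schema.

```python
import csv
import io
from datetime import datetime


def usage_row(user_id, item_id, when, event=None):
    """Return one usage CSV line: user id, item id, time, optional event.

    Column order and timestamp format are assumptions, not the documented schema.
    """
    fields = [user_id, item_id, when.isoformat(timespec="seconds")]
    if event is not None:
        fields.append(event)  # when left out, the event defaults to Purchase
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="").writerow(fields)
    return buffer.getvalue()


# User 9876 interacting with item 1234; the timestamp is illustrative.
print(usage_row("9876", "1234", datetime(2016, 11, 2, 20, 15, 0)))
```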
The quality of your model is heavily dependent on the quality and quantity of your data. The system learns when users buy different items (we call these co-occurrences).
A good rule of thumb is to have 20+ usage transactions per catalogue item. So if you had 10,000 items in your catalogue, you should have at least 20 times that number of transactions, or about 200,000 transactions. If you do not have that level of usage, you may need to consider allowing cold items. See the 'Building your model' section of this article for more detail.
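Before you upload anything, it can be worth checking your own data against that rule of thumb. The sketch below is one way to do it, assuming you already have the catalogue item IDs and the item ID of every usage event in memory; find_cold_items is a hypothetical helper and is not part of the API.

```python
from collections import Counter

MIN_EVENTS_PER_ITEM = 20  # the rule of thumb from the text above


def find_cold_items(catalogue_item_ids, usage_item_ids, min_events=MIN_EVENTS_PER_ITEM):
    """Return the catalogue item ids that have fewer than min_events usage events."""
    counts = Counter(usage_item_ids)  # missing items count as 0
    return sorted(i for i in catalogue_item_ids if counts[i] < min_events)


catalogue = ["1234", "5678", "9012"]
usage = ["1234"] * 25 + ["5678"] * 3  # "9012" has no usage at all

print(find_cold_items(catalogue, usage))  # ['5678', '9012']
# Overall check: 28 events is less than 20 x 3 items, so this prints False.
print(len(usage) >= MIN_EVENTS_PER_ITEM * len(catalogue))
```

The overall total can look healthy while individual items are still cold, which is why the per-item count is the more useful of the two checks.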
You can upload usage files either via the Recommendations UI tool or the API Upload usage file function, or you can post usage events directly from your application as they occur via the API Upload usage event function.
The model will take all usage files and events into account when performing a build.
Building your model
Once your catalogue and usage files/events have been uploaded, you can start a build. A build is the process where the machine learning engine analyses the data and works out the recommendations for each of your items.
There are three main types of build:
- Recommendation build: Determines recommendations based on usage. Split into two sub-builds:
  - Item-to-item (or I2I): Given an item or a list of items, it will predict other items that are likely to be of high interest to customers that have interacted with the original set of items.
  - User-to-item (or U2I): Given a user ID (and optionally a list of items), this option will predict a list of items that are likely to be of high interest to the given user (and their additional choice of items). The U2I recommendations are based on the history of items that were of interest to the user.
- Frequently-Bought-Together build (or FBT): Counts the number of times two or three different products co-occur, and then sorts the sets based on a similarity function (Co-occurrences, Jaccard, Lift). Given an item, an FBT build returns other items that are likely to occur in the same transaction.
- Rank build: A rank build is a technical build that allows you to learn about the usefulness of your features.
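To give a feel for the three similarity functions named in the FBT build, here is a small sketch using their textbook definitions on item pairs. It is not the service's implementation: pair_scores and the sample baskets are made up for illustration, and it only scores pairs, not sets of three.

```python
from collections import Counter
from itertools import combinations


def pair_scores(baskets):
    """Score every item pair by co-occurrence count, Jaccard and lift."""
    total = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # sorted so each pair has one canonical key
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    scores = {}
    for (a, b), both in pair_counts.items():
        either = item_counts[a] + item_counts[b] - both  # baskets with a or b
        scores[(a, b)] = {
            "cooccurrences": both,
            "jaccard": both / either,
            # lift = P(a and b) / (P(a) * P(b)); above 1 means bought together
            # more often than chance would suggest
            "lift": both * total / (item_counts[a] * item_counts[b]),
        }
    return scores


baskets = [
    {"speaker", "stand"},
    {"speaker", "stand"},
    {"speaker", "cable"},
    {"cable"},
]
for pair, score in sorted(pair_scores(baskets).items()):
    print(pair, score)
```

Raw co-occurrence counts favour popular items, whereas Jaccard and lift both normalise by how often each item appears on its own, so they tend to rank the pairs differently.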
As indicated in the 'Your Usage Data' section, builds require around 20+ usage events per catalogue item to produce recommendations. Some items may not have that level of usage data; these are referred to as cold items. If you have a large number of cold items in your data, you may want to explore build parameters such as ModelingFeatureList when creating your build (see the API Create/Trigger build function for the full list of parameters). These options, along with many others, are only available when creating a build via the API.
Builds can be managed both via the Recommendations UI tool and the API. Most of the available API functions relate to creating and managing builds. This is because it is best practice to keep updating your usage data, as usage may change over time with seasons or trends. The SDK contains a sample Windows application that can be used to manage your builds in more detail than is available via the web UI.
You can have several builds in place concurrently. This lets you produce a new build whilst your application is still using an older one, or offer both item/user-to-item recommendations and frequently-bought-together recommendations. In either case, there is always an 'active' build, which is the default if no build ID is specified by your application.
Using your model
Once you have a build, you can call the API in your application. This is simply a case of calling either the Get item-to-item recommendations API function or the Get user-to-item recommendations API function.
As with all Cognitive Services, you'll need to pass an Ocp-Apim-Subscription-Key header, which you can get from your Cognitive Services account - you can subscribe here if you have not already done so.
You must also pass several request parameters, which you can see in the API references listed above. These basically involve passing the item(s) or user(s) you want recommendations for, the number of recommendations and the build ID (which defaults to the active build if omitted).
As with all Cognitive Services, these are just simple HTTP-based APIs and can be accessed from any language or platform.
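As a rough illustration of what such a call involves, the sketch below builds (but does not send) an item-to-item request using only Python's standard library. The Ocp-Apim-Subscription-Key header comes from the text above; everything else is an assumption made for illustration, including the function name, the placeholder endpoint URL and the query parameter names itemIds, numberOfResults and buildId. Take the real values from the API reference.

```python
from urllib.parse import urlencode
from urllib.request import Request


def item_recommendations_request(endpoint, subscription_key, item_ids,
                                 number_of_results, build_id=None):
    """Build (but do not send) a GET request for item-to-item recommendations."""
    params = {
        "itemIds": ",".join(item_ids),         # assumed parameter name
        "numberOfResults": number_of_results,  # assumed parameter name
    }
    if build_id is not None:
        params["buildId"] = build_id           # assumed name; omit for the active build
    return Request(
        f"{endpoint}?{urlencode(params)}",
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
        method="GET",
    )


request = item_recommendations_request(
    "https://example.invalid/recommend/item",  # placeholder, not the real endpoint
    "YOUR-SUBSCRIPTION-KEY", ["1234"], 5,
)
print(request.full_url)
# To send it: urllib.request.urlopen(request), then read the response body.
```

Keeping the request construction separate from the network call makes it easy to unit test, and to swap in the user-to-item function by changing only the endpoint and parameters.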
To show how recommendations are used in a web application, I've used the book sample data to build http://recommendationsapi.azurewebsites.net/ in ASP.NET Core. You can see how the recommendations work by going to the site and clicking a book.
You can see the source code for this project here:
Microsoft Bot Framework example
One of the best types of application for the Recommendations API is bots. Bots are conversational applications that users access through social media chat applications such as Facebook Messenger and Skype. You can see more about bots in several Web Hack Wednesday episodes I have recorded with my chum Martin Beeby:
I produced a very simple bot that uses the same book sample data to provide recommendations via the bot framework. You can see the source code for this project here.
I recently worked with a company called GigSeekr who provide a live music discovery service. GigSeekr are working on a bot using the Microsoft bot framework which will allow users to locate artists, events and venues near them.
Once users have found an artist, GigSeekr want the bot to be able to provide recommendations for other artists the user may be interested in.
GigSeekr have a database of around 38,000 artists and usage data from various sources, including visits to their website and view and favourite data from their app Pepper.
We used the Cognitive Recommendations API to easily provide recommendations from within the bot. As I write, the work is underway, but I'll post again when it is completed.
Recommendations are a great user feature which you can easily implement in a consistent, data-driven way using the Cognitive Services Recommendations API.