The universal recommender in Action(ML)



Recommender systems have become crucial for many different businesses. E-commerce sites use recommenders to guide customers to the right products and to ensure they stay on the site. Newspapers and entertainment websites want to keep their users engaged by showing them the right content. Current machine learning techniques for recommender systems range from collaborative filtering algorithms to methods based on neural networks and reinforcement learning.

In this article a new collaborative filtering technique is presented and applied to a sample dataset. The universal recommender is a recommender system based on co-occurrences between events. It can be run as an “engine” of Harness, a service provided by ActionML that offers different end-to-end machine learning solutions. We will go through its architecture, but our focus lies on the algorithm and its interpretation.

Universal recommender: architecture and main idea


The universal recommender is configured as a machine learning engine in the Harness server. Harness provides a REST API for input data and queries. Once the input data is imported to the engine instance a model is trained and the engine becomes “queryable”: queries are sent via the Harness REST API and recommendations (for example: a list of recommended items with scores) are given in return.

A native installation requires several components (Java, MongoDB, Elasticsearch, etc.). The model is trained using Apache Spark. Harness and all its dependencies can also be run in Docker containers. For the experiments described below, a docker-compose installation was used.

Main idea

A common task of recommender systems is to find personalized recommendations based on user interactions with items, such as ratings (explicit user feedback) or purchases, views, clicks, etc. (implicit user feedback). In matrix factorization techniques like the alternating least squares algorithm (ALS), the user-item matrix is decomposed into the product of two matrices in a lower-dimensional space. In this way users and items can be represented as vectors, which are then used to create recommendations.

The approach used in the universal recommender is different. User interactions with items are still considered but there is no matrix factorization involved: items are scored to give recommendations based on co-occurrences between events. Let’s look at this in more detail:

Assume we have \(n\) users that can buy \(m\) items on a website. Let \(h_p\) be the vector representing a user’s purchase history. We want to recommend new items to the user (i.e. items the user has not purchased yet) based on a “score”:

    • First consider the \(n\times m\) user-item matrix of purchases:

      \(P = \left( \begin{array}{rrrrr}1 & . & . & . & 1 \\ . & 1 & . & 1 & . \\. & . & . & . & . \\. & . & . & . & . \\1 & 1 & . & 1 & . \\ \end{array}\right)\)

      Here the value 1 at position \(i,j\) means the user \(i\) has purchased item \(j\).

    • Calculate the \(m\times m\) matrix of log-likelihood ratios between item purchase vectors:

      \(\left[P^{t}P\right]\)

      The square brackets [ ] indicate that we are taking the log-likelihood ratio between purchase vectors (row of \(P^t\) compared to column of \(P\)), a sort of similarity score between items. For more details on log-likelihood ratio similarity see the next section.

    • Finally the scores vector is given as the product:

      \(r = \left[P^{t}P\right]h_{p}\)

We can then give recommendations to the user based on the scores in \(r\). More on this can be found on the universal recommender presentation slides.
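As a rough illustration, the three steps above can be evaluated literally with NumPy on the example matrix \(P\). This is only a sketch of the formula, not of how the engine is implemented: the helper names (`xlogx`, `llr_matrix`) are made up for this example, the LLR uses the usual entropy formulation, and leaving item pairs that never co-occur at zero is an assumption of the sketch.

```python
import numpy as np

def xlogx(x):
    """Element-wise x * log(x), with the convention 0 * log(0) = 0."""
    x = np.asarray(x, dtype=float)
    safe = np.where(x > 0, x, 1.0)  # log(1) = 0, so zeros contribute nothing
    return x * np.log(safe)

def llr_matrix(A, B):
    """LLR scores between every column of A and every column of B, i.e. [A^t B]."""
    n = A.shape[0]                        # number of users
    k11 = (A.T @ B).astype(float)         # users with both events
    k12 = A.sum(axis=0)[:, None] - k11    # event in A only
    k21 = B.sum(axis=0)[None, :] - k11    # event in B only
    k22 = n - k11 - k12 - k21             # neither event
    row = xlogx(n) - xlogx(k11 + k12) - xlogx(k21 + k22)
    col = xlogx(n) - xlogx(k11 + k21) - xlogx(k12 + k22)
    mat = xlogx(n) - xlogx(k11) - xlogx(k12) - xlogx(k21) - xlogx(k22)
    scores = np.maximum(0.0, 2.0 * (row + col - mat))
    # Assumption of this sketch: pairs that never co-occur get score 0.
    return np.where(k11 > 0, scores, 0.0)

# User-item purchase matrix P from the example above (5 users, 5 items).
P = np.array([
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
])

h_p = P[1]                    # purchase history of the second user
r = llr_matrix(P, P) @ h_p    # r = [P^t P] h_p
r[h_p > 0] = -np.inf          # never recommend already purchased items
ranking = np.argsort(-r)      # item indices, best score first
print(ranking, r)
```

For the second user, who bought the second and fourth items, the only item with a positive score is the first one, because it shares a buyer with both purchased items.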

The main advantage of this approach is that any type of user interaction and information can be ingested: user views, preferences of categories but also profile data and contextual information. In other words, the formula above can be extended to other secondary actions as follows:

\(r = \left[P^{t}P\right]h_{p}+\left[P^{t}V\right]h_{v}+\cdots\)

In this example \(V\) denotes the matrix of user-item views and \(h_v\) is the vector of the user’s views.
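Written out per item, and using \(\mathrm{LLR}(\cdot,\cdot)\) as shorthand for the bracket operation (a notation introduced here only for illustration), the score of item \(i\) is a sum over the items the user has purchased and the items the user has viewed:

\(r_i = \sum_{j=1}^{m} \mathrm{LLR}\left(p_i, p_j\right) h_{p,j} + \sum_{j=1}^{m} \mathrm{LLR}\left(p_i, v_j\right) h_{v,j} + \cdots\)

Here \(p_i\) and \(v_j\) denote the \(i\)-th column of \(P\) and the \(j\)-th column of \(V\). Every secondary action is compared against purchases, the action we want to predict, so a view of item \(j\) contributes to the score of item \(i\) only to the extent that viewing \(j\) is statistically associated with purchasing \(i\).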

Log-likelihood ratio score

The log-likelihood ratio (LLR) is a similarity score that depends not only on the number of times two events have occurred together (\(k_{11}\) in the table below) but also on the number of times neither event has occurred (\(k_{22}\)) and the number of times one event has occurred without the other (\(k_{12}\) and \(k_{21}\)).

LLR will be higher if there is a correlation or anti-correlation between events A and B.
For example consider the following vectors representing three different item purchase histories (this would be three columns of the user-purchase matrix \(P\) above):

\(i_1 = (1,1,1,0,0,0,0)\)

\(i_2 = (1,1,1,1,1,0,0)\)

\(i_3 = (1,1,1,1,0,0,0)\)

Even though the co-occurrence counts between \(i_1\) and \(i_2\) and between \(i_1\) and \(i_3\) are the same (the number of users who purchased \(i_1\) AND \(i_2\) equals the number of users who purchased \(i_1\) AND \(i_3\)), the LLR score between \(i_1\) and \(i_3\) is higher because the number of users who purchased neither \(i_1\) nor \(i_3\) is higher.
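The comparison can be checked numerically. The sketch below assumes the usual entropy formulation of the LLR, \(2\,(H_{\text{rows}} + H_{\text{columns}} - H_{\text{matrix}})\) with unnormalised entropies; the function names are chosen for this example only.

```python
import math

def xlogx(x):
    return 0.0 if x == 0 else x * math.log(x)

def entropy(*counts):
    """Unnormalised Shannon entropy of a list of counts."""
    return xlogx(sum(counts)) - sum(xlogx(k) for k in counts)

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio of a 2x2 contingency table."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))

def contingency(a, b):
    """Counts (both, a only, b only, neither) for two 0/1 vectors."""
    k11 = sum(1 for x, y in zip(a, b) if x and y)
    k12 = sum(1 for x, y in zip(a, b) if x and not y)
    k21 = sum(1 for x, y in zip(a, b) if not x and y)
    k22 = sum(1 for x, y in zip(a, b) if not x and not y)
    return k11, k12, k21, k22

i1 = (1, 1, 1, 0, 0, 0, 0)
i2 = (1, 1, 1, 1, 1, 0, 0)
i3 = (1, 1, 1, 1, 0, 0, 0)

print(contingency(i1, i2), llr(*contingency(i1, i2)))  # (3, 0, 2, 2), about 2.83
print(contingency(i1, i3), llr(*contingency(i1, i3)))  # (3, 0, 1, 3), about 5.06
```

Both pairs share \(k_{11} = 3\), but the pair \((i_1, i_3)\) has \(k_{22} = 3\) instead of 2, which is what pushes its score up.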


In this section we present an application of the universal recommender to build recommendations for a multi-category online store. The data is available on Kaggle. Jupyter notebooks with analysis and instructions on how to run the application can be found in the GitHub repository.

Data preparation

For more details on this section see the data preparation notebook.
We first load the data in a pandas dataframe:

import pandas as pd
df = pd.read_csv("../datasets/2019-Nov.csv")

Each row contains information about the product (category, brand, price) and the user’s interaction with the site (event type): purchase, view or cart.

For simplicity and to avoid sparsity we restrict the dataframe to the smartphone category and keep only the top 10,000 users by number of purchases.

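df_el = df[df["category_code"] == "electronics.smartphone"]
n = 10000
# count purchases per user (this aggregation is reconstructed: the original line was truncated)
purchases = df_el[df_el["event_type"] == "purchase"]
purch_by_users = purchases.groupby("user_id").size().reset_index(name="nr_purch")
top_n = list(purch_by_users.sort_values("nr_purch", ascending=False).head(n)["user_id"])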

We train three different recommenders based on user-item interactions:

      1. For the first recommender we use purchase-item as the unique main action. The recommendation scores are computed using the formula above.
        #for the first recommender we only consider purchases as interaction
        df_el_purch = df_el[(df_el["event_type"] == "purchase") & (df_el["user_id"].isin(top_n))]
      2. For the second recommender we add view-item as secondary action. For the score’s computation the LLR between purchases and views is added as described in the extended formula.
      3. In the third recommender we add cart-item as a further action.

We prepare a training set for each of the three recommenders from the users’ earlier interactions with items. The last item purchased by each user is kept in a test set, which we use to compare the recommenders: we count for how many users the purchased item in the test set appears in the list of recommendations.
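One way to build such a split with pandas is sketched below. The function name and the toy data are made up for illustration, and the cut-off rule (keep only events that happened before the user's last purchase) is an assumption; the data preparation notebook may do this differently.

```python
import pandas as pd

def split_last_purchase(events):
    """Hold out each user's last purchase; train on everything that happened before it."""
    purchases = events[events["event_type"] == "purchase"].sort_values("event_time")
    # last purchase per user
    test = purchases.groupby("user_id").tail(1)[["user_id", "product_id", "event_time"]]
    # per-row cut-off time: the time of that user's held-out purchase
    cutoff = events["user_id"].map(test.set_index("user_id")["event_time"])
    train = events[events["event_time"] < cutoff]
    return train, test[["user_id", "product_id"]]

# Toy events, for illustration only.
events = pd.DataFrame({
    "event_time": pd.to_datetime(
        ["2019-11-01", "2019-11-02", "2019-11-03", "2019-11-01", "2019-11-04"]
    ),
    "event_type": ["view", "purchase", "purchase", "purchase", "purchase"],
    "user_id": [1, 1, 1, 2, 2],
    "product_id": [10, 10, 11, 20, 21],
})

train, test = split_last_purchase(events)
print(train)
print(test)
```

Cutting by time rather than simply dropping one row also removes any views or cart events that happened after the held-out purchase, so the training set never sees information from the "future" of the test item.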

Engine configuration

We need to create an engine for each of the recommenders we want to train. Each engine is specified by a JSON file containing information such as the algorithm type (universal recommender), Spark resources, the name of the Elasticsearch container (where the input events will be sent) and the indicators, i.e. the user interactions considered in each recommender.

This is what the engine template for the first recommender looks like:

{
  "engineId": "ecommerce_electronic_purchase",
  "engineFactory": "com.actionml.engines.ur.UREngine",
  "sparkConf": {
    "master": "local",
    "spark.driver.memory": "4g",
    "spark.executor.memory": "4g",
    "": "true",
    "": "elasticsearch",
    "": "true",
    "spark.kryo.referenceTracking": "false",
    "spark.kryo.registrator": "",
    "spark.kryoserializer.buffer": "300m",
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer"
  },
  "algorithm": {
    "indicators": [
      {
        "name": "purchase"
      }
    ],
    "num": 4
  }
}

Purchase is the only interaction specified in the list of indicators. For the other two recommenders we just need to add view (and cart) in the list (see the engine configuration files). More details on how to set up engine templates can be found in the ActionML documentation.
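To make the difference between the three templates concrete, the following sketch builds only the `algorithm` part of the configuration for a given list of indicators. The helper `algorithm_section` is hypothetical, and it assumes that the first indicator in the list is the main action, as in the setup described above where purchase is the main action.

```python
import json

def algorithm_section(indicator_names, num=4):
    """Build the "algorithm" block of an engine template for the given indicators."""
    return {
        "indicators": [{"name": name} for name in indicator_names],
        "num": num,
    }

first = algorithm_section(["purchase"])
second = algorithm_section(["purchase", "view"])
third = algorithm_section(["purchase", "view", "cart"])

print(json.dumps(third, indent=2))
```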


Input

We need to send the training data to each model as input: it contains the users’ historical interactions with the items. This is what an input event for the first recommender looks like:

{
   "event" : "purchase",
   "entityType" : "user",
   "entityId" : "520088904",
   "targetEntityType" : "item",
   "targetEntityId" : "1003461",
   "properties" : {}
}

In the first recommender the event is always purchase, entityType is always user and targetEntityType is always item. The fields entityId and targetEntityId hold the user_id and product_id respectively. Properties can be used to specify item properties (type of category, expiration date), but we do not need this.
Recommenders 2 and 3 are trained using similar inputs, but in their case the event can be purchase, view or cart.
The input data can be sent via the Harness REST API using curl. In this application we create events via the Python SDK using the following function:

import csv
from datetime import datetime
import harness
import argparse
import pytz

def create_event(row, client):
   """Create input events for the recommender.
       - row: list denoting a row in a pandas dataframe
       - client: harness client
   """
   event_time = datetime.strptime(row[1],"%Y-%m-%d %H:%M:%S+00:00").replace(tzinfo=pytz.utc)
   event = row[2]       # event type: purchase, view or cart
   entity_id = row[8]   # user_id
   target_id = row[3]   # product_id
   # The call name and the entity_type/event_time arguments are reconstructed
   # (the original lines were lost); they are assumed from the Harness Python SDK.
   client.create(
       event = event,
       entity_type = "user",
       entity_id = str(entity_id),
       target_entity_type = "item",
       target_entity_id = str(target_id),
       event_time = event_time
   )
   print("Event: {}, entity_id: {}, target_entity_id: {}".format(event, entity_id, target_id))
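For readers who prefer to send events over the REST API instead of the SDK, the JSON body shown earlier can be built from a dataframe row with a small helper. This is a sketch: `event_payload` is a hypothetical name, and it only fills the fields that appear in the example event above, so any further fields a real event needs (such as a timestamp) are left out.

```python
import json

def event_payload(event, user_id, product_id):
    """Build the JSON body of one input event for the recommender."""
    return {
        "event": event,                      # purchase, view or cart
        "entityType": "user",
        "entityId": str(user_id),
        "targetEntityType": "item",
        "targetEntityId": str(product_id),
        "properties": {},
    }

payload = event_payload("purchase", 520088904, 1003461)
print(json.dumps(payload, indent=2))
```

Ids are converted to strings because they are quoted in the example event; pandas would otherwise pass them through as integers.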



Once the model has been trained we need to retrieve recommendations for our users. Like input events, queries can be sent using curl as follows:

curl -i -X POST "http://harness-address:9090/engines/some_engine_id/queries" \
-H "Content-Type: application/json" \
-d '{
  "user": "520088904"
}'

The curl request above returns recommended items for the specified user. If the user has never been seen by the recommender (not present in the training set), the most popular items are recommended (by default, the items with the most events).
Note that the Harness address (with port) and the engine id need to be specified in the curl request.

As we use Python, we send queries using the requests library with the following function:

import json
import os
import requests
# the requests module needs to be installed

def query_for_user(user_id, host_url, engine_id):
   """Creates POST requests for recommendation"""
   h = {'Content-Type': 'application/json'}
   if user_id:
       d = {'user': user_id}
   else:
       d = {} #user not specified; returns most popular items
   url = os.path.join(host_url,"engines", engine_id, "queries")
   r = requests.post(url, data=json.dumps(d), headers=h)
   return r
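The response body then has to be turned into item ids and scores. The sketch below assumes the body is a JSON object with a `result` list whose entries carry an `item` and a `score` field; this shape is an assumption and should be checked against an actual response of your engine. The helper name and the sample body are made up for illustration.

```python
def extract_recommendations(body):
    """Return (item ids, scores) from a decoded query response body."""
    results = body.get("result", [])   # assumed key, see note above
    items = [entry["item"] for entry in results]
    scores = [entry["score"] for entry in results]
    return items, scores

# Made-up response body, for illustration only.
sample_body = {"result": [{"item": "a", "score": 2.0}, {"item": "b", "score": 1.0}]}
items, scores = extract_recommendations(sample_body)
print(items, scores)
```

With the function above, the decoded body would come from `query_for_user(...).json()`.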


Communicate with harness server via harness-cli and put everything together

Now that we know how to send input data and queries, we need to communicate this information to the Harness server and train our model. For this purpose a few harness-cli commands are needed (the operations described below are summarized in the bash file of this application).

We first create a new engine instance by specifying the path to the JSON file:

harness-cli add ${engine_json}

Now we are ready to send input data to the specified engine-id as we described in the Input section. We do this by running the Python file and specifying the engine-id and the file containing train set data:

python3 ${data_folder}/python/ --engine_id ${engine} --input_file ${train_file} --url ${host_url}

Once all the data is imported we can train the recommender using the command:

harness-cli train

We can then retrieve the recommendations for the specified users by running the Python file as follows:

python3 ${data_folder}/python/ --engine_id ${engine} --url ${host_url} --input_file ${test_file} --output_file ${results_file}

Here we need to give the engine id, the test file (a csv file containing the unique user ids of our selected users) and an output file where the recommendations are stored.

Analyse results

For more details on this section see the data analysis notebook in the GitHub repository.

By running the bash file summarized in the section above for each recommender, we create three output files containing recommended items.
We load the result files as pandas dataframes:

import os
import pandas as pd
main_path = "../data/"
#results of the recommender with only purchase as action on electronics products
res_p = pd.read_json(os.path.join(main_path,"results/predictions-ecommerce-eletronics-p-10kusers.json"))
#results of the recommender with purchase (main action) and view on electronics products
res_pv = pd.read_json(os.path.join(main_path, "results/predictions-ecommerce-eletronics-pv-10kusers.json"))
#results of the recommender with purchase (main action), view and cart as secondary actions on electronics products
res_pvc = pd.read_json(os.path.join(main_path, "results/predictions-ecommerce-eletronics-pvc-10kusers.json"))

The dataframes contain a list of recommended items with a score for each user_id.

We also load the common test set containing user id and the last item purchased by each user:

test = pd.read_csv(os.path.join(main_path, 

After extracting information from the recommended results (result_items containing item ids, result_score containing scores), we merge each result dataframe with the test set above. As a simple comparison we calculate the number of users for which the item in the test set is in the list of recommendations:

As we can see adding view-item and cart-item to the recommender as secondary actions slightly increases the average recall. On the other hand, these secondary actions also create some “noise” as the second and third recommenders “miss” user-items that were correctly recommended by the first one.
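The comparison used here, the share of users whose held-out item appears in their recommendation list, can be sketched as follows. The function name and the toy frames are illustrative; the column names `user_id`, `product_id` and `result_items` follow the text, and the sketch assumes that item ids have the same type in both frames.

```python
import pandas as pd

def hit_rate(recs, test):
    """Share of test users whose held-out item is in their recommendation list."""
    merged = test.merge(recs, on="user_id", how="left")
    hits = [
        isinstance(items, list) and target in items   # users without results count as misses
        for target, items in zip(merged["product_id"], merged["result_items"])
    ]
    return sum(hits) / len(hits)

# Toy data, for illustration only.
recs = pd.DataFrame({
    "user_id": [1, 2, 3],
    "result_items": [[10, 11], [20], [30, 31]],
})
test = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "product_id": [11, 21, 30, 40],
})

print(hit_rate(recs, test))
```

Running the same function on the three result dataframes gives one number per recommender, which makes the comparison between them direct.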

As a further analysis we consider which position the item in the test set occupies in the list of recommendations (when it is recommended).

As we can see from the plots above ~38% of the items in the test set that are correctly recommended occupy the first position in the list of recommendations. This percentage slightly decreases for the second and the third recommenders: it seems the correctly recommended items become “less” important in the list of recommendations once we add further user interactions.
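A distribution like this can be computed by looking up the position of the held-out item in each list. The sketch below uses hypothetical names and toy data, and assumes that each list in `result_items` is ordered by decreasing score.

```python
import pandas as pd

def hit_positions(recs, test):
    """1-based position of the held-out item in the list, for users where it was recommended."""
    merged = test.merge(recs, on="user_id", how="inner")
    positions = [
        items.index(target) + 1
        for target, items in zip(merged["product_id"], merged["result_items"])
        if target in items
    ]
    return pd.Series(positions, name="position")

# Toy data, for illustration only.
recs = pd.DataFrame({
    "user_id": [1, 2, 3],
    "result_items": [[10, 11, 12], [20, 22], [30, 31]],
})
test = pd.DataFrame({"user_id": [1, 2, 3], "product_id": [11, 21, 30]})

positions = hit_positions(recs, test)
share = positions.value_counts(normalize=True).sort_index()
print(share)   # share of correctly recommended items per list position
```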

Further ideas

Here are some ideas for experimenting further with the universal recommender on e-commerce data:

    • Instead of view-item consider view-category as a second interaction. An analysis on this is contained in the repository notebooks.
    • Try out “Item-based” recommendations: instead of outputting the top n recommended items for a user, try to find the items with a similar user behaviour to a given item (“people who purchased item x also purchased item y,z,…”).
    • Add business rules: restrict recommendations to certain categories or boost/favour certain types of items by adding a bias factor to the business rule (see business rules for queries).
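For the last two ideas only the query body changes. The sketch below shows what such bodies might look like. The key names (`item`, `rules`, `name`, `values`, `bias`) are assumptions to be checked against the ActionML documentation on queries and business rules, the property name `category` and the bias value are purely hypothetical, and the meaning of specific bias values is defined in that documentation.

```python
import json

def item_query(item_id):
    """Query body for item-based recommendations (assumed key: "item")."""
    return {"item": str(item_id)}

def user_query_with_rule(user_id, field, values, bias):
    """Query body for a user with one business rule (assumed keys: "rules", "name", "values", "bias")."""
    return {
        "user": str(user_id),
        "rules": [{"name": field, "values": list(values), "bias": bias}],
    }

similar_items = item_query(1003461)
# "category" is a hypothetical item property; items would need to carry it.
boosted = user_query_with_rule(520088904, "category", ["electronics.smartphone"], 2)

print(json.dumps(similar_items))
print(json.dumps(boosted, indent=2))
```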
