Product Recommendation with machine learning using ElasticSearch

Elasticsearch provides us with a powerful machine-learning-based tool for product recommendation in e-commerce.

What do we know about our customers?

It is relatively easy to know which products our customers have either bought or clicked on in the past. Using Elasticsearch we can leverage this data to recommend to each user other products that have interested different users who have shown an interest in the same products.

For example, if our user has clicked on a book about medieval French history, then it would seem obvious to show the user the most popular books in the category of medieval French history. However, this approach of simply repeating products from the same category may become tedious, and we may miss many interesting opportunities to offer users products across categories. For example, if the user buys a camera, we may well want to offer the user a book on photography.

Machine Learning Product Recommendation

Elasticsearch lets us recommend products to a user based on what other buyers of the same products have also purchased.

Term aggregation

For example we have an index which contains all of our users, with all of the products they have purchased as shown in the mapping below.

If a user has bought a Polaroid camera, then we can look up the most popular products purchased by other users who bought the same Polaroid camera. This list is likely to include products which are directly related to a Polaroid camera (such as Polaroid film) and products which are indirectly related (such as books on photography).
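The original mapping and query are not reproduced here, so the following is only a minimal sketch of what they might look like. The field names `user_id` and `products_purchased` and the product ID `polaroid-camera` are hypothetical, and the mapping is written without a document type, which assumes Elasticsearch 7 or later. The code only builds the request bodies; they would then be sent with the Python client or pasted into Kibana as JSON.

```python
import json

# Hypothetical mapping: one document per user, with the IDs of every
# product they bought stored as exact-match keyword values.
USER_MAPPING = {
    "mappings": {
        "properties": {
            "user_id": {"type": "keyword"},
            "products_purchased": {"type": "keyword"},
        }
    }
}


def popular_with_buyers(product_id, size=10):
    """Build a terms aggregation over the users who bought product_id."""
    return {
        "size": 0,  # we only need the aggregation buckets, not the users
        "query": {"term": {"products_purchased": product_id}},
        "aggs": {
            "also_bought": {
                "terms": {
                    "field": "products_purchased",
                    "exclude": [product_id],  # do not recommend the same item
                    "size": size,
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(popular_with_buyers("polaroid-camera"), indent=2))
```

Each bucket in the response would hold a product ID and the number of matching users who bought it, ordered from most to least popular.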

Significant Term Aggregation

A term aggregation may indicate products which are popular with Polaroid users, but some of these may be completely unrelated to Polaroid cameras (e.g. mobile telephones), simply because they are popular with everyone, including people who buy Polaroid cameras. To avoid this, we can use the significant terms aggregation, which returns products which are significantly more popular with Polaroid camera buyers than with our customer set as a whole.

The example below shows a significant term aggregation
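The original example is not reproduced here; a minimal sketch of such a request, under stated assumptions, could look like this. The field name `products_purchased`, the aggregation name `recommended` and the product ID are hypothetical. `significant_terms` is the Elasticsearch aggregation type that compares the buyers matched by the query against the whole index.

```python
import json


def significant_for_buyers(product_id, size=10):
    """Build a significant_terms aggregation over buyers of product_id."""
    return {
        "size": 0,  # only the aggregation result is of interest
        "query": {"term": {"products_purchased": product_id}},
        "aggs": {
            "recommended": {
                "significant_terms": {
                    "field": "products_purchased",
                    "exclude": [product_id],  # skip the product itself
                    "size": size,
                }
            }
        },
    }


def recommended_ids(response):
    """Extract the product IDs from the aggregation part of a response."""
    buckets = response["aggregations"]["recommended"]["buckets"]
    return [bucket["key"] for bucket in buckets]


if __name__ == "__main__":
    print(json.dumps(significant_for_buyers("polaroid-camera"), indent=2))
```

The only change from a plain terms aggregation is the aggregation type, so it is easy to try both and compare the recommendations they produce.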



This approach is interesting because as our volume of data increases, the quality of our recommendations improves:  as time goes by, we will learn more about our specific users and  at the same time grow our database of user preferences.






Make your own search engine with elasticsearch

In this article you can see how to use Elasticsearch to create a fast search engine capable of deep text search, working with terabytes of data.

We are going to build a search engine based on the "Living people" category of Wikipedia, store the data in Elasticsearch, test the speed and relevance of our queries, and also create an autocomplete suggestion query.


This article assumes that you already have Elasticsearch and Kibana installed.

Install Pywikibot

Pywikibot enables you to easily download the contents of wikipedia articles.  If you have access to a different source of data, then you can use that instead.

Instructions to install pywikibot are here

Configure pywikibot to use wikipedia.

This is done by running the setup script

python generate_user_files.py

The script is interactive and enables you to define the type of wiki you want to access.  In our case, choose wikipedia.

Install Python Libraries
pip install elasticsearch


Create a Mapping in Elasticsearch

The mapping tells elasticsearch what sort of data is being stored in each field, and how it should be indexed.

The following command can be pasted directly into  the Kibana Dev Tools console

This creates a mapping for the document type “wiki_page” in the index “wikipeople” with four text fields (full URL, title, categories, text) and one special field called suggest, which will be used for the autocomplete function (more on that later). Note also that we have specified that the text field uses an English language analyser (as opposed to French, Spanish or any other language).
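The original command is not reproduced here. As a rough sketch of the mapping described above, the following builds an equivalent request body in Python. The exact field names (`full_url`, `title`, `categories`, `text`) are assumptions based on the description, and the body is written without the “wiki_page” document type, which assumes Elasticsearch 7 or later, where mapping types no longer exist.

```python
import json

INDEX_NAME = "wikipeople"


def build_mapping():
    """Return an index body with four text fields and a completion field."""
    return {
        "mappings": {
            "properties": {
                "full_url": {"type": "text"},
                "title": {"type": "text"},
                "categories": {"type": "text"},
                # The page body is analysed with the built-in English analyser.
                "text": {"type": "text", "analyzer": "english"},
                # Completion fields back the autocomplete suggester.
                "suggest": {"type": "completion"},
            }
        }
    }


if __name__ == "__main__":
    # The printed JSON can be used as the body of "PUT wikipeople" in Kibana.
    print(json.dumps(build_mapping(), indent=2))
```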


Create Pywikibot script

In the directory where you installed Pywikibot, you will find a subdirectory “/core/scripts”

In the scripts directory create a new script called

You can then get pywikibot to run your script using the following command (from ../pywikibot/core directory)


The output on the screen will reveal any errors. If all is going well, you should see the script download pages from Wikipedia and load them into Elasticsearch. The speed of download will depend on your machine; in my case it was one or two pages per second. For testing you can abort the script (Ctrl+C) after a minute or so.
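The original script is not reproduced here, so what follows is only a sketch of how such a script might be written, not the author's code. It assumes a configured Pywikibot, an Elasticsearch node at `http://localhost:9200`, and the field names from the mapping described earlier; the function names are made up for illustration. The two libraries are imported inside the functions so that the pure conversion step can be tried without them.

```python
def page_to_action(title, full_url, categories, text, index="wikipeople"):
    """Turn one wiki page into a bulk-index action for Elasticsearch."""
    return {
        "_index": index,
        "_id": title,  # page titles are unique, so they work as document IDs
        "_source": {
            "full_url": full_url,
            "title": title,
            "categories": categories,
            "text": text,
            "suggest": {"input": [title]},  # feeds the autocomplete field
        },
    }


def generate_actions(limit=100):
    """Yield bulk actions for pages in the 'Living people' category."""
    import pywikibot  # needs the user files created during setup

    site = pywikibot.Site("en", "wikipedia")
    category = pywikibot.Category(site, "Category:Living people")
    for page in category.articles(total=limit):
        yield page_to_action(
            title=page.title(),
            full_url=page.full_url(),
            categories=[c.title(with_ns=False) for c in page.categories()],
            text=page.text,
        )


def main():
    """Download the pages and load them into Elasticsearch."""
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch("http://localhost:9200")  # assumed local node
    indexed, _errors = helpers.bulk(es, generate_actions())
    print(f"Indexed {indexed} pages")

# Add a call to main() at the end of the file when running it for real.
```

The `limit` argument keeps test runs short; raising it (or removing it) would be needed to load the whole category.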

Elasticsearch Search engine Query

Below is an example elasticsearch query and the beginning of the response.

The “_source” part of the command specifies that we exclude the text of the page, to keep the size of the response down.

The query searches for the terms american football and bearcats in the title, category and body of the text.  However it gives greater weight to the score if these terms are found in the category and title (as determined by the values “boost” in the search query).

The highlight part of the command also returns the detail of where the search term has been found. This can be seen in the part of the response labeled “highlight”. This makes it very easy to display the context of the search term to users, so that they can see whether they are interested in the results.
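The original query and response are not reproduced here. A minimal sketch of a query with the three parts described above (source filtering, boosted fields and highlighting) might look like the following. The field names and the boost values 3 and 2 are assumptions chosen for illustration, not the original values.

```python
import json


def build_search(terms, title_boost=3, category_boost=2):
    """Build a search body that favours matches in title and categories."""
    return {
        # Leave the long page text out of the returned documents.
        "_source": {"excludes": ["text"]},
        "query": {
            "bool": {
                "should": [
                    {"match": {"title": {"query": terms, "boost": title_boost}}},
                    {"match": {"categories": {"query": terms, "boost": category_boost}}},
                    {"match": {"text": {"query": terms}}},
                ]
            }
        },
        # Ask for the matching fragments of each field.
        "highlight": {"fields": {"title": {}, "categories": {}, "text": {}}},
    }


if __name__ == "__main__":
    print(json.dumps(build_search("american football bearcats"), indent=2))
```

Raising or lowering the boosts changes how strongly a hit in the title or categories outweighs a hit in the body text, so they are worth tuning against real queries.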


Autocomplete suggestions Using ElasticSearch and Jquery

In our mapping we created a special field called “suggest” based on the page title.  This enables us to display an “autocomplete” suggester as the user types into the search box.  Autocomplete queries are optimized to provide very quick responses.   A sample query and response would be as follows:

The query returns suggestions where the title starts with the letters we have entered in our query. This would enable us to create autocomplete functionality with jQuery or similar.
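The original query and response are not reproduced here. As a sketch, a completion suggester request against the `suggest` field, plus a helper that pulls the suggested titles out of the response, could be written as follows. The suggester name `title_suggest` and the function names are hypothetical.

```python
import json


def build_suggest(prefix, size=5):
    """Build a completion-suggester request for the typed prefix."""
    return {
        "_source": ["title"],  # the suggestions only need the title
        "suggest": {
            "title_suggest": {
                "prefix": prefix,
                "completion": {"field": "suggest", "size": size},
            }
        },
    }


def suggested_titles(response):
    """Return the suggestion texts from a suggest response."""
    options = response["suggest"]["title_suggest"][0]["options"]
    return [option["text"] for option in options]


if __name__ == "__main__":
    print(json.dumps(build_suggest("joh"), indent=2))
```

A web page would send the current contents of the search box as `prefix` on each keystroke and display the returned titles under the box.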







How to automate and track email campaigns on google analytics using thunderbird

In this article we will show you how to create a personalized email campaign and create a tracker in Google Analytics so that you can see who has clicked on the links in your campaign. We will use the open-source mail client Thunderbird, with a plugin called Mail Merge, and Google Analytics.


You need to have set up google analytics on your web site.

Install thunderbird

Install mail merge plug in for thunderbird

From Thunderbird, go to Tools > Add-ons > Extensions, then search for the Mail Merge plugin.

After installing, restart Thunderbird.

Create a csv file of your contact list

You can create a csv file using excel, or LibreOffice Calc, using save as csv option.

Each column will be some data you want to include in the marketing mail.

Each row will be transformed into a single email.

The CONTACT_ID should be a code number which we will use to identify the customer and will appear in google analytics.

Take care to remember the exact spelling of the column headers in the first row, as these will be our PLACEHOLDERS below.
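If the contact list comes from a database or script rather than a spreadsheet, the same file can be produced in code. The following is a small sketch using Python's standard csv module. The EMAIL and CONTACT_ID columns come from this article; the FIRST_NAME column and the sample contacts are made up for illustration. It uses a tab as delimiter and a double quote as text delimiter, matching the settings recommended for Mail Merge later on.

```python
import csv
import io

# Column headers: these become the {{PLACEHOLDERS}} in the mail.
FIELDNAMES = ["CONTACT_ID", "EMAIL", "FIRST_NAME"]


def write_contacts(contacts, stream):
    """Write the contacts as tab-separated rows with a header line."""
    writer = csv.DictWriter(
        stream, fieldnames=FIELDNAMES, delimiter="\t", quotechar='"'
    )
    writer.writeheader()
    writer.writerows(contacts)


if __name__ == "__main__":
    sample = [
        {"CONTACT_ID": "1001", "EMAIL": "ana@example.com", "FIRST_NAME": "Ana"},
        {"CONTACT_ID": "1002", "EMAIL": "ben@example.com", "FIRST_NAME": "Ben"},
    ]
    buffer = io.StringIO()
    write_contacts(sample, buffer)
    print(buffer.getvalue())
```

To write a real file, pass a file opened with `open("contacts.csv", "w", newline="", encoding="utf-8")` instead of the in-memory buffer.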



Create a mail in thunderbird

Use the format {{<place holder>}}, where <place holder> is the column header of the data in the csv file that you want to merge into your mails. The text must match exactly.

In my example I am using {{EMAIL}} for the user's email address.



We strongly recommend you include a text or link to enable people to UNSUBSCRIBE from your mailing list.

Add a custom link with google analytics tracking code to your email

Now edit the links to include the tracking code as follows:

The format is:{{CONTACT_ID}}

You need to replace the parts in red with your web landing page, campaign name and keyword.  If you have two or more links in your email, then by using two different keywords you can tell which link the user has pressed.

The CONTACT_ID placeholder will pull the CONTACT_ID column that you put in your csv file earlier just like we did for the email address, so do not change this.

Note! The CONTACT_ID must NOT be anything you can directly identify a user with (such as an email address or name), since that contravenes privacy laws in many countries.
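As an illustration of how such a tracked link is put together, here is a small sketch that builds one with Python's standard library. The `utm_source`, `utm_medium`, `utm_campaign`, `utm_term` and `utm_content` parameters are the standard Google Analytics campaign parameters. Placing the {{CONTACT_ID}} placeholder in `utm_content` is an assumption made for this sketch, and the landing page, campaign name and keyword are made-up examples.

```python
from urllib.parse import urlencode


def tracked_link(landing_page, campaign, keyword, source="newsletter"):
    """Return a landing-page URL with campaign tracking parameters."""
    params = {
        "utm_source": source,
        "utm_medium": "email",
        "utm_campaign": campaign,
        "utm_term": keyword,  # use a different keyword for each link
        "utm_content": "{{CONTACT_ID}}",  # filled in later by Mail Merge
    }
    # Keep the braces unescaped so Mail Merge still sees its placeholder.
    return landing_page + "?" + urlencode(params, safe="{}")


if __name__ == "__main__":
    print(tracked_link("https://www.example.com/offer", "spring_sale", "header_link"))
```

The resulting URL is what goes into the link in the email; Mail Merge then replaces the placeholder with each contact's code when the mails are generated.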


Create the mails using mail merge

When you have finished editing, press



Source- csv

Deliver mode: send later

csv: select the csv file you created earlier

Character set, Field delimiter and Text delimiter should be the same as you used when saving the csv file. (Recommended: UTF-8, Tab, “)

Rest of values- leave blank

Click OK and one mail will be created for each line in your csv file in your OUTBOX.

Check the contents of your marketing campaign

Before sending, make sure that the mails have been created as you expected in your OUTBOX. Check several mails, and pay attention to how the placeholders and links have been filled in.

Check the links by clicking on them yourself.


Send your campaign mails

In the Outbox, right-click and choose Send Unsent Messages.

Check the results in google analytics

You will probably need to wait at least 24 hours before any results show up. You can then see them in your Google Analytics account under Analytics > Acquisition > Campaigns.




Here you will see the campaign properties we included in the link, recorded each time a link is clicked. If you want to know who the user is, you need to add a secondary dimension, which you will find under Advertising > Ad Content or Advertising > Keyword.

This way we know not only how many users clicked on our campaign mail, but also who they are, so we can follow up with those customers.



Build your own customer service portal


A customer service portal enables your staff to set up a repository of information to help users install, use and service their products, using software similar to that used by Wikipedia. Together with a user forum, it lets your customers ask questions and receive answers in a web format, so that other users can see the answers regardless of whether your customer support staff are in the office.

Furthermore, these tools can optionally allow people outside your organization to contribute to the knowledge base about your products by adding their own comments and participating in the discussion. Building a community around your products is a great way of increasing customer and user loyalty and improving your customer service. It will also help increase the visibility of your products on the web, since Google and most other search engines rank web sites on the basis of the volume of useful content they contain.

Our service includes:

  • Setup and configuration of knowledge base and user community forum.
  • User training
  • Telephone help and support (12 months)




Track and measure your marketing campaigns


All lean manufacturers know that measurement is the first step to continuous improvement. In marketing, that means measuring the number of users who open our mails or click on the links in our marketing messages, to determine whether our message is reaching our users. This can easily be achieved using a number of email software tools combined with web tracking tools such as Google Analytics. Tools such as Mailchimp can be used at little or no cost. They allow us to optimize and automate our email communication with customers, to track how effectively it attracts customers to our web site, and even to follow up how many customers attracted by the campaign subsequently purchase in our web shop. Smart factory provides an electronic marketing workshop to accelerate the process of setting up trackable e-marketing integrated with your web site.