In the last post we had used Conditional Random Fields (CRF) to extract attributes from e-commerce product titles and description. CRFs are linear models just like Logistic Regression. The drawback with linear models is that they do not take feature-feature interaction or higher order feature terms into account while building model. Linear models can under-fit on the data while too much non-linearity can lead to over-fitting. Non-linear models such as Neural […]
In the last post, we had started to design a movie recommendation engine using the 20 million ratings dataset available from MovieLens. We started with a Content Based Recommendation approach, where we built a classification/regression model for each user based on the tags and genres assigned to each movie he has rated. The assumption behind this approach is that, the rating that an user has given to a movie depends […]
In this post, we would be looking to design a movie recommendation engine with the MovieLens dataset. We will not be designing the architecture of such a system, but will be looking at different methods by which one can recommend movies to users that minimizes the root mean squared error of the predicted ratings from the actual ratings on a hold out validation dataset.
This post is motivated from trying to find better unsupervised vector representations for questions pertaining to the queries from customers to our agents. Earlier, in a series of posts, we have seen how to design and implement a clustering framework for customer questions, so that we can efficiently find the most appropriate answer and at the same time find out most similar questions to recommend to the customer.
In continuation of my earlier posts on designing an automated question-answering system, in part three of the series we look into how to incorporate feedback into our system. Note that since getting labelled data is an expensive operation from the perspective of our company resources, the amount of feedback from human agents is very low (~ 2-3% of the total number of questions). So obviously with such less labelled data, […]
In this post we will look at the offline implementation architecture. Assuming that, there are currently about a 100 manual agents, each serving somewhere around 60-80 customers (non-unique) a day, i.e. a total of about 8K customer queries each day for our agents. And each customer session has an average of 5 question-answer rounds including statements, greetings, contextual and personal questions. Thus on average we generate 40K client-agent response pairs […]
Natural Language Question Answering system such as chatbots and AI conversational agents requires answering customer queries in an intelligent fashion. Many companies employ manual resources to answer customer queries and complaints. Apart from the high cost factor with employing people, many of the customer queries are repetitive in nature and most of the time, same intents are asked in different tones.
In this second part of the series on my posts on Dynamic Programming in NLP, I will be showing how to solve the Longest Common Subsequence problem using DP and then use modified versions of the algorithm to find out the similarity between two strings. LCS is a common programming question asked in many technical interviews. Given two strings (sequence of words, characters etc.) S1 and S2, return the number […]
In this series of posts, I will be exploring and showing how dynamic programming technique is used in machine learning and natural language processing. Dynamic programming is very popular in programming interviews. DP technique is used mainly in problems which has optimal substructure and can be defined recursively, which means that a problem of size N can be solved by solving smaller sub-problems of size m < N. Such problems […]
For the past few days, I have been reading quite a lot of research papers, articles and blogs related to artificial neural networks and its transition towards deep learning. With so many different methods of selecting the best neural network architecture for a problem, the optimal hyper-parameters, the best optimization algorithm and so on, it becomes a little overwhelming to connect all the dots together when we ourselves start to […]