Stokastik

LeetCode : Reaching Points

Problem Statement Solution: The obvious way is to go with a DFS- or BFS-like approach. Starting from the point (sx, sy), we can go in two possible directions, (sx + sy, sy) and (sx, sx + sy), and then each of these points can go in two possible directions, and so on. Note that the paths will form a tree-like structure, as we are going in either […]
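The forward search described above can be sketched as follows (an illustrative recursive DFS; the function name is mine, not from the post, and this brute-force form is exponential in the worst case):

```python
def reaching_points_dfs(sx, sy, tx, ty):
    # Forward DFS from (sx, sy): each point branches into the two
    # children (sx + sy, sy) and (sx, sx + sy), forming a binary tree
    # of states rooted at the starting point.
    if sx > tx or sy > ty:
        return False  # moves only increase coordinates, so prune
    if sx == tx and sy == ty:
        return True
    return (reaching_points_dfs(sx + sy, sy, tx, ty)
            or reaching_points_dfs(sx, sx + sy, tx, ty))
```

For example, (1, 1) reaches (3, 5) via (1, 2) and (3, 2), but no sequence of moves takes (1, 1) to (2, 2).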

LeetCode : Swim in Rising Water

Problem Statement Solution: One possible solution is to check, for each water level, whether he can swim from (0, 0) to (N-1, N-1) without waiting anywhere in between. To check whether he can swim from (0, 0) to (N-1, N-1), i.e. whether a path exists, we can use DFS or BFS to traverse the matrix. He can swim from point A to B if the height of B is less […]
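A minimal sketch of the per-level check described above (BFS over cells at or below the current water level), with the answer taken as the smallest level that admits a path. The names and the linear scan over levels are my illustration; the post may search the levels differently:

```python
from collections import deque

def can_swim(grid, t):
    # BFS from (0, 0), visiting only cells whose height is <= water level t.
    n = len(grid)
    if grid[0][0] > t:
        return False
    seen, q = {(0, 0)}, deque([(0, 0)])
    while q:
        r, c = q.popleft()
        if (r, c) == (n - 1, n - 1):
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n \
                    and (nr, nc) not in seen and grid[nr][nc] <= t:
                seen.add((nr, nc))
                q.append((nr, nc))
    return False

def swim_in_rising_water(grid):
    # Smallest water level t at which (N-1, N-1) is reachable.
    t = grid[0][0]
    while not can_swim(grid, t):
        t += 1
    return t
```

For the grid [[0, 2], [1, 3]], no path exists at levels 0-2, so the answer is 3.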

Neural Networks as a Function Approximator

For the past few days, I have been reading quite a lot of research papers, articles and blogs related to artificial neural networks and their transition towards deep learning. With so many different methods of selecting the best neural network architecture for a problem, the optimal hyper-parameters, the best optimization algorithm and so on, it becomes a little overwhelming to connect all the dots when we ourselves start to […]

Building a Neural Network from scratch in Python

In this post I am going to build an artificial neural network from scratch. Although there exist many advanced neural network libraries written in a variety of programming languages, the idea is not to re-invent the wheel but to understand which components are required to make a workable neural network. A full-fledged industrial-scale neural network might require a lot of research and experimentation with the dataset. Building a simple […]

Building an Incremental Named Entity Recognizer System

In the last post, we saw how to train a system to identify Part-Of-Speech tags for words in sentences. In essence we found that discriminative models such as Neural Networks and Conditional Random Fields outperform other methods by 5-6% in prediction accuracy. In this post, we will look at another common problem in Natural Language Processing, known as Named Entity Recognition (NER for short). The problem […]

Building a POS Tagger with Python NLTK and Scikit-Learn

In this post we are going to learn about Part-Of-Speech taggers for the English language and look at multiple methods of building a POS tagger with the help of the Python NLTK and scikit-learn libraries. The available methods range from simple regular-expression-based taggers to classifier-based ones (Naive Bayes, Neural Networks and Decision Trees) and then sequence-model-based ones (Hidden Markov Model, Maximum Entropy Markov Model and Conditional Random […]
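The simplest class of taggers mentioned above, a regular-expression tagger, can be sketched in a few lines. The patterns below are common illustrative suffix rules, not the post's actual rule set:

```python
import re

# Ordered (pattern, tag) rules; the first match wins, with a
# catch-all noun tag as the fallback.
PATTERNS = [
    (r'.*ing$', 'VBG'),           # gerunds, e.g. "running"
    (r'.*ed$', 'VBD'),            # past-tense verbs, e.g. "walked"
    (r'.*ly$', 'RB'),             # adverbs, e.g. "quickly"
    (r'.*s$', 'NNS'),             # plural nouns, e.g. "dogs"
    (r'^-?\d+(\.\d+)?$', 'CD'),   # cardinal numbers, e.g. "42"
    (r'.*', 'NN'),                # default: singular noun
]

def regexp_tag(words):
    tagged = []
    for w in words:
        for pattern, tag in PATTERNS:
            if re.match(pattern, w):
                tagged.append((w, tag))
                break
    return tagged
```

Such a tagger has no context sensitivity at all, which is exactly the weakness the classifier-based and sequence-model-based methods address.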

Understanding Conditional Random Fields

Given a sequence of observations, many machine learning tasks require us to label each observation in the sequence with a corresponding class (or named entity) such that the overall likelihood of the labelling is maximized. For example, given an English sentence, i.e. a sequence of words, label each word with a Part-Of-Speech tag such that the combined POS tag sequence of the sentence is optimal. "Machine Learning is a field of […]
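For reference, the standard linear-chain CRF formulation of this labelling objective (stated here in the usual notation, which may differ from the post's own) scores a label sequence $y = (y_1, \dots, y_T)$ given observations $x$ as:

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \right),
\qquad
Z(x) = \sum_{y'} \exp\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \right)
```

where the $f_k$ are feature functions over adjacent labels and the observation sequence, the $\lambda_k$ are learned weights, and $Z(x)$ normalizes over all possible label sequences, so maximizing $p(y \mid x)$ is exactly the "overall likelihood of the labelling" mentioned above.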

Machine Learning Interview Questions and Answers (Part III)

Which loss function is better for neural network training, the logistic loss or the squared error loss, and why? The loss function depends mostly on the type of problem we are solving and on the activation function. In the case of regression, where the values from the output units are normally distributed, the squared error is the preferred loss function, whereas in a classification problem, where the output units follow the multinomial distribution, the […]
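The two losses being compared above, written out for a single example with targets $y_i$ and outputs $\hat{y}_i$ (standard definitions, included here for context):

```latex
L_{\text{sq}} = \frac{1}{2} \sum_{i} \left( y_i - \hat{y}_i \right)^2
\qquad \text{(squared error, regression)}

L_{\text{log}} = - \sum_{i} y_i \log \hat{y}_i
\qquad \text{(logistic / cross-entropy loss, classification)}
```

Each corresponds to the negative log-likelihood under the distribution named in the excerpt: the squared error under a Gaussian output, and the cross-entropy under a multinomial output.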

Optimization Methods for Deep Learning

In this post I am going to give a brief overview of a few of the common optimization techniques used in training a neural network, from simple classification problems to deep learning. As we know, the critical part of a classification algorithm is optimizing the loss (objective) function in order to learn the correct parameters of the model. The type of the objective function (convex, non-convex, constrained, unconstrained, etc.), along with […]
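The baseline that the techniques surveyed in the post build on is plain gradient descent, which can be sketched in a few lines (an illustrative scalar version; the function names are mine):

```python
def gradient_descent(grad, theta, lr=0.1, steps=100):
    # Vanilla gradient descent: repeatedly step against the gradient,
    # theta <- theta - lr * grad(theta).
    for _ in range(steps):
        theta = theta - lr * grad(theta)
    return theta

# Minimizing f(x) = (x - 3)^2, whose gradient is 2 * (x - 3),
# converges to the minimum at x = 3.
minimum = gradient_descent(lambda t: 2 * (t - 3), theta=0.0)
```

Momentum, RMSProp, Adam and the other methods commonly covered in such overviews all modify how this single update step uses the gradient.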

Machine Learning Interview Questions and Answers (Part II)

Why does the negative sampling strategy work during the training of word vectors? In word2vec training, the objective is to have semantically and syntactically similar words close to each other in terms of the cosine distance between their word vectors. In the skip-gram architecture, the probability of a word 'c' being predicted as a context word at the output node, given the target word 'w' and the input and output weights […]
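For context, the standard skip-gram-with-negative-sampling objective (from the original word2vec formulation; the post's notation may differ) for a target word $w$ and observed context word $c$ is:

```latex
\log \sigma\!\left( v'^{\,\top}_{c} v_{w} \right)
+ \sum_{i=1}^{k} \mathbb{E}_{n_i \sim P_n(w)}
\left[ \log \sigma\!\left( -v'^{\,\top}_{n_i} v_{w} \right) \right]
```

where $v_w$ is the input (target) vector, $v'_c$ the output (context) vector, $\sigma$ the sigmoid, and the $k$ negative words $n_i$ are drawn from a noise distribution $P_n(w)$, replacing the expensive softmax normalization over the whole vocabulary.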