# Stokastik

## Leetcode : Bricks Falling When Hit Problem Statement Solution: This is a trick problem. I will come later to explain why. So the most logical approach to solve the problem is to first 'hit & erase' one of the brick A and then for each of the adjacent bricks of A, check if they are 'connected' to the top of the grid. If they are connected to the top of the grid don't do anything, else […]

## Leetcode : Longest Valid Parentheses Problem Statement Solution: Most of us must have encountered a similar problem of finding the maximum number of valid parentheses pairs, which is pretty straightforward and quite commonly asked in many programming interviews. This is one variant where we need to find the longest sub-string which is a valid parentheses. The sub-string may contain any number of parenthesis pairs. One naive approach is to check for all pair of indices […]

## Leetcode : Count The Repetitions Problem Statement Solution: One naive approach is to estimate the value of M from the fact that the maximum value of M possible is: Q=(n1*len(s1))/(n2*len(s2)) Let s(x) denote the string s repeated 'x' times. Using binary search between 0 to Q, check if the string s2(n2*m) is present in the string s1(n1) for some m. If s2(n2*m) is present in s1(n1), then search to the right half of m else to the left […]

## Building a classification pipeline with C++11, Cython and Scikit-Learn We have earlier seen how using Cython increases the performance of Python code 50-60x, mostly due to static typing as compared to dynamic typing in pure Python. But we have also seen how one can wrap pure C++ classes and functions with Cython and export them as Python packages with improved speed. The codes we have dealt with so far using Cython were mostly generic modules like generating primes using […]

## LeetCode: Word Ladder II Problem Statement Solution: This is one of the problems asked frequently in coding interviews but has a relatively straighforward solution once one can get hold of the data-structures that he/she is going to implement. The idea is pretty simple, starting from 'beginWord', find out all possible words that are 1-distance away and also in the list of valid words. Repeat this step until we hit the 'endWord'. But since the […]

## LeetCode: Three Equal Parts Problem Statement Solution: The problem can be approached in multiple different ways. One trivial approach is to find the decimal representation for all binary sequences [i, j], where j >= i. Once we find the decimal representations for all [i, j], we can then find the indices i' and j' such that the decimal representation [0, i'], [i'+1, j'-1] and [j', N-1] are equal where N is the length of […]

## LeetCode: Minimum Refuel Stops Problem Statement Solution: At first, this looks like a classic graph traversal problem. For example, I might be tempted to use BFS approach to find the minimum number of steps to reach the target. But one important distinction this problem has from a standard BFS is that in a standard BFS problem, the car can either directly reach a station Y from station X or they cannot depending whether there […]

## Interfacing C++ with Cython In the last post we saw how we can use Cython programming language to boost speed of Python programs. For a simple program like finding primes upto a certain N, we obtained a gain of around 50-60x with Cython as compared to a naive Python implementation. This is significant when we are going to deploy our codes in production. A complex system will have multiple such programs calling each other […]

## Speeding up with Cython While I was working with the R programming language, I always somehow found it to be slow coming from a JAVA/C++ background. Then I discovered the "RCpp" package which allowed me to write C++ codes and call them from R. It greatly improved my program speed by an order of 50-100x. Most of the coding that I did was in C++ while the interface remained in R. Since I moved […]

## BiLSTM-CRF Sequence Tagging for E-Commerce Attribute Extraction In the last post we had used Conditional Random Fields (CRF) to extract attributes from e-commerce product titles and description. CRFs are linear models just like Logistic Regression. The drawback with linear models is that they do not take feature-feature interaction or higher order feature terms into account while building model. Linear models can under-fit on the data while too much non-linearity can lead to over-fitting. Non-linear models such as Neural […]