Machine Learning, AI and Programming

Leetcode : Reverse Pairs

Problem Statement


The brute force approach is to consider all pairs (i, j) such that i < j and check whether they satisfy the condition that nums[i] > 2*nums[j]. The time complexity for this approach is definitely O(N2) where N is the number of elements in the array 'nums'. We can improve this time complexity by reducing the number of comparisons required for each element. For e.g. if we are on the i-th element in the array, then all values j > i such that nums[i] > 2*nums[j] are the valid candidates. Thus if we traverse the array from back and maintain a sorted array T of the values 2*nums[j]+1 for all nums[j] seen so far, then we need to check at the i-th element how many elements in the current array T has values less than or equal to nums[i]. For e.g. if the array is like this:


Then when we traverse from the back and reach the 2nd element (at i=1) i.e. 4, the sorted array T would look like [3, 7, 11], because 2*1+1=3, 2*3+1=7 and 2*5+1=11. Now only 3 is less than or equal to 4, thus for the 2nd element only possible pair is (i, j) = (1,4). But challenge is to keep the array T sorted every-time. We cannot afford to sort the array every-time after adding a new element. One possible mechanism is to keep a Balanced Binary Search Tree of the values 2*nums[j]+1 with an additional value at each node indicating how many nodes are there in its left sub-tree.

In this way adding a new node is O(logN), updating the left-subtree count value of its parent by 1 (if it is left node of its parent) up to the root node is also O(logN). Thus with this modified BST, we can compute the number of such reverse pairs in O(N*logN) complexity. But most of the time I have found that an alternative solution exists for a BST method that does not require us to create and maintain a balanced BST.

We will use a modified Merge Sort based approach for this problem instead of a BST with the same running time complexity but reduced space-time complexity.

The idea is to sort the array in decreasing order using merge sort and while doing the merge operation on the two sorted sub-arrays, we can also count the number of reverse pairs for the merged array.

For e.g. given an array nums=[a1, a2, ..., aN], in merge sort, first we divide this array into two equal parts:

x1 = [a1, a2, ..., aN/2], and x2 = [aN/2+1, aN/2+2, ..., aN]

We sort x1 and x2 recursively using merge sort to get the sorted arrays (in decreasing order) y1 and y2:

y1 = [b1, b2, ..., bN/2], and y2 = [c1, c2, ...,cN/2]

We will count the number of reverse pairs in  y1 and y2 then merge them and return back. Let us say that the number of reverse pairs in the array y1 be p1 and in the array y2 be p2, then the number of reverse pairs for the merged array will also include the reverse pairs between y1 and y2 i.e. for each element in y1 which are the corresponding reverse pairs from y2.

For the 1st element in y2 i.e. c1, the possible reverse pairs from y1 will be [b1, b2, ..., bk1] such that bk1 >= 2*c1+1  and bk1+1 < 2*c1+1 because the elements are in descending order. Similarly for the next element c2, the possible reverse pairs will include all the reverse pairs from c1, plus additional elements [bk1+1, bk1+2, ..., bk2], such that bk2 >= 2*c2+1 and bk2+1 < 2*c2+1 and so on.

Let us denote the number of reverse pairs for an element cj from y2 by N(cj), then we have:

Thus, N(cj+1) = N(cj) + length([bkj+1, bkj+2, ..., bk(j+1)])

#ReversePairs([y1, y2]) = p1 + p2 + N(c1) + N(c2) + ... + N(cN/2)

Observe that we never scan each element in y1 or y2 more than once. Thus the time complexity of computing the number of reverse pairs is O(len(y1)+len(y2)). Thus the overall time complexity of sorting + counting reverse pairs is O(N*logN). The python code for the same is as follows:

class Solution(object):
    def count_reverse(self, nums, left, right):
        if left < right:
            mid = (left+right)/2
            a, x = self.count_reverse(nums, left, mid)
            b, y = self.count_reverse(nums, mid+1, right)
            n, m = len(a), len(b)

            i, j, cnt, last_cnt, sum_cnt = 0, 0, 0, 0, 0
            while True:
                if i < n and j < m and a[i] > 2*b[j]:
                    cnt += 1
                    i += 1
                elif (i < n and j < m and a[i] <= 2*b[j]) or (i == n and j < m):
                    last_cnt += cnt
                    sum_cnt += last_cnt
                    cnt = 0
                    j += 1

            out, i, j = [], 0, 0
            while True:
                if (i < n and j < m and a[i] > b[j]) or (i < n and j == m):
                    i += 1
                elif (i < n and j < m and a[i] <= b[j]) or (i == n and j < m):
                    j += 1

            return out, sum_cnt + x + y
        return [nums[left]], 0
    def reversePairs(self, nums):
        if len(nums) == 0:
            return 0
        out, cnt = self.count_reverse(nums, 0, len(nums)-1)
        return cnt

In each recursion, we return the sorted array as well as the number of reverse pairs in the sorted array.

The BST technique is preferred if we need to build a streaming system for this problem because with each new element in the stream we can add that element efficiently to the Balanced BST, whereas merge sort needs to be called on the entire array each time a new element is added.


Tags: , , , ,

Leave a Reply