Solution:

It is pretty obvious that O(1) insertion and removal can be designed using a Set or Hash-Map or a Doubly Linked List. When duplicates are prohibited, using a Set() data structure is very convenient, but when duplicates are allowed we cannot use Set. Instead we can use Hash-Map. The key of the Hash-Map are the unique integers while the values are the count of each integer. While inserting we increment the count for that integer if already present, else add a new key and set the count to 1. While removing an integer we decrement the counter and if counter becomes 0, we remove the key. The only problem we will face with the Hash-Map is that we cannot fetch a random integer uniformly in O(1) time.

Given that we need to get random numbers uniformly from the integers according to their counts distribution, we can achieve it in O(M) time where M is the number of unique integers. The following simple method can do that:

import random def get_random(count_map, m): p = random.uniform(0, m) s = 0 for h, count in count_map.items(): s += count if p < s: return h return None

But we need to get random numbers in O(1) time complexity which can be made possible only if we use "random.choice" method on the list of all the integers including the duplicates. This is possible with two Hash-Maps but we are going to demonstrate it using a Doubly Linked List (DLL) which is conceptually simpler.

Recall that DLL allows O(1) insertion and removal. Insertion is pretty straightforward. Append the new Node at the tail of the DLL. Since in a DLL we have both previous and next pointer for each Node, once we know which Node to remove, we just set the next pointer of its previous Node to the next Node and similarly the previous pointer of its next Node to its previous Node.

But unlike array, Nodes in linked list are not indexed and thus we cannot access a Node to be removed in O(1) time complexity, we need to linearly scan for the Node which is O(N) time. Instead we can use a Hash-Map to map an integer to its corresponding Node pointer in the linked list in O(1) time and then delete the pointer. In case of duplicates, we map an integer to a set of Nodes in the linked list.

Here is the Python code for O(1) insertion and removal from DLL.

class Node(object): def __init__(self, val): self.val = val self.next, self.prev = None, None class RandomizedCollection(object): def __init__(self): self.tail = None self.node_map = dict() def insert(self, val): node = Node(val) if self.tail is None: self.tail = node else: self.tail.next = node node.prev = self.tail self.tail = self.tail.next out = val not in self.node_map if out: self.node_map[val] = set() self.node_map[val].add(node) return out def remove(self, val): if val in self.node_map: node = self.node_map[val].pop() prev, nxt = node.prev, node.next if prev is None: self.head = nxt else: prev.next = nxt if nxt is None: self.tail = prev else: nxt.prev = prev if len(self.node_map[val]) == 0: self.node_map.pop(val) return True return False

Now coming to the get_random part, we can keep a list of the Nodes present in the DLL and then randomly select one and return its integer value. But this is fine if we do not remove any Nodes from in-between. If we are removing a Node from the list of Nodes, the list needs to be updated which is O(N) again. Deletion in DLL in O(1) but not in an array or in an indexed list.

Instead we delete the tail Node from the DLL and only change the value of the Node to be deleted to be equal to the value of the tail Node. Any change made to the Node pointers in the DLL is reflected in the Nodes list because Node is a pointer. Thus we only need to remove the last element from the Node list. Here is the Python code for the entire functionality - i.e. O(1) insertion, removal and get_random:

import collections, random from random import randint class Node(object): def __init__(self, val): self.val = val self.next, self.prev = None, None class RandomizedCollection(object): def __init__(self): self.tail = None self.node_map = dict() self.node_list = [] def insert(self, val): node = Node(val) if self.tail is None: self.tail = node else: self.tail.next = node node.prev = self.tail self.tail = self.tail.next out = val not in self.node_map if out: self.node_map[val] = set() self.node_map[val].add(node) self.node_list.append(node) return out def remove(self, val): if val in self.node_map: node = self.node_map[val].pop() node.val = self.tail.val self.node_map[node.val].add(node) self.node_map[node.val].remove(self.tail) self.tail = self.tail.prev if self.tail is not None: self.tail.next = None if len(self.node_list) > 0: self.node_list.pop() if len(self.node_map[val]) == 0: self.node_map.pop(val) return True return False def getRandom(self): if len(self.node_list) > 0: return random.choice(self.node_list).val return None

Categories: PROBLEM SOLVING

Tags: Doubly Linked List, Get Random O(1), Leetcode, O(1) deletion, O(1) insertion, Sampling

### Leave a Reply

You must be logged in to post a comment.