Most of us must have encountered a similar problem of finding the maximum number of valid parentheses pairs, which is pretty straightforward and quite commonly asked in many programming interviews. This is one variant where we need to find the longest sub-string which is a valid parentheses. The sub-string may contain any number of parenthesis pairs. One naive approach is to check for all pair of indices [i, j] whether it is a valid parentheses or not. This can be done using a dynamic programming approach, where we say that the substring [i, j] is a valid parentheses if
number of open parenthesis '(' in [i, j-1] = 1 + number of closed parenthesis ')' in [i, j-1]
Thus we need to keep track for all pairs of indices [i, j], the number of open parenthesis and the number of closed parenthesis. The run-time of this approach is O(N2) where N is the length of the entire string because we are searching all pairs of positions within the string. We know we can do better than this.
Observe that using a single linear scan over the string, we can determine the positions of all the valid parentheses, similar to our maximum valid parenthesis problem.
Maintain a LIFO stack of the positions of the open parenthesis, whenever we encounter a closed parenthesis while scanning the string, remove one from the stack (if the stack is not empty) and add the indices of the removed open parenthesis and the current closed parenthesis to a separate array A. Whenever we encounter an open parenthesis, we add it to the stack.
At the end, the array A will contain tuples for all the starting and ending positions of the valid pairs of parenthesis. Now the problem is to de-construct this array A to find the longest valid parentheses. While the pairs of indices will take care of nested parentheses but it will not take into account consecutive valid parentheses. For e.g. in "((()))" the indices will be [2,3], [1,4] and [0,5], thus we can obtain the longest valid parentheses of length=6 just from [0,5] but consider the string "()()()", for which the indices are [0,1], [2,3] and [4,5]. Although the longest valid parentheses here is also of length=6, but it is not directly obtained from any single pair of indices.
This can be done by storing the indices in a min-heap instead of an array keyed on the starting positions. Thus the root of the min-heap will contain the valid parentheses with the least starting position in the string. If the pair is [i, j] , then its length is j-i+1. Pop the root each time. The next pair we consider is the one whose starting position is greater than the current maximum ending position.
If the current starting position of the root is 1 greater than the maximum ending position, it means that the parentheses are consecutive and there lengths should be added. Here is the Python code for doing the same:
import heapq class Solution(object): def longestValidParentheses(self, s): open_stack, max_len = , 0 min_heap =  for i in range(len(s)): if s[i] == '(': open_stack.append(i) else: if len(open_stack) > 0: last = open_stack.pop() heapq.heappush(min_heap, (last, i)) curr_max_end, cnt = -1, 0 while len(min_heap) > 0: start, end = heapq.heappop(min_heap) if start > curr_max_end: if start == curr_max_end + 1: cnt += end - start + 1 else: max_len = max(max_len, cnt) cnt = end - start + 1 curr_max_end = end max_len = max(max_len, cnt) return max_len
The run-time complexity of this approach is O(N*logN) because pushing and popping to the heap is O(logN).
While this can be improved by storing the indices in a dictionary rather than min-heap. Use the starting positions as key and the ending positions of each valid parentheses as values to the dict. Then for all possible indices from 0 to N-1, check if the index is a starting point of any valid parentheses. If it is then do the same thing similar to our min-heap.
class Solution(object): def longestValidParentheses(self, s): open_stack, max_len = , 0 start_end_map = dict() for i in range(len(s)): if s[i] == '(': open_stack.append(i) else: if len(open_stack) > 0: last = open_stack.pop() start_end_map[last] = i curr_max_end, cnt = -1, 0 start = 0 while start < len(s): if start in start_end_map: end = start_end_map[start] if start > curr_max_end: if start == curr_max_end + 1: cnt += end - start + 1 else: max_len = max(max_len, cnt) cnt = end - start + 1 curr_max_end = end start = end+1 else: start += 1 max_len = max(max_len, cnt) return max_len
Only difference is that the time-complexity is now linear (O(N)) assuming that the dict() stores and retrieves in O(1) time complexity.
Categories: PROBLEM SOLVING