Stokastik

Machine Learning, AI and Programming

Leetcode : Course Schedule 3

Problem Statement

Solution:

An obvious brute force approach is to enumerate all possible scenarios i.e. all possible valid sequence of courses and take the answer as the sequence with the maximum length. For an input size of N and a sequence of length M, the number of possible sequences is exponential in N i.e. NM. Obviously this is a bad choice, but can we think of a better solution. We can solve this problem in O(N*logN) time complexity using a greedy approach.

Observation - If we sort the courses in increasing order of their last day, then a student who wants to maximize the number of courses has to take a course with an earlier last day before taking another course with a later last day. For e.g. given the two courses:

C = [[10,20],[5,6]]

If the student takes the 1st course [10,20] first then he cannot take [5,6] because 10+5 > 6. But if he takes the 2nd course [5,6] first then he can also take [10,20] course. It also says that for a pair of courses:

[[a1, b1], [a2, b2]], where b1 > b2

If the student is able to take both the courses, first [a1, b1] and then [a2, b2] then he can also take them in reverse order [a2, b2] and then [a1, b1] but other way around is not true.

If it is possible that he takes [a1, b1] and then [a2, b2], it implies that the end day for the 2nd course i.e. a1 + a2 <= b2. Now if he takes [a2, b2] then [a1, b1], then the end day for the 2nd course is a2 + a1, but we know already that a1 + a2 <= b2 and b2 < b1 thus a1 + a2 < b1.

Now assuming that he is not able to take [a1, b1] and then [a2, b2], it implies that the end day for the 2nd course i.e. a1 + a2 > b2. Now if he takes [a2, b2] then [a1, b1] then the end day for the 2nd course is a2 + a1, but we cannot say for sure whether a2 + a1 is less than or greater than b1.

While if we know that he cannot take [a2, b2] and then [a1, b1], it implies that a2 + a1 > b1, which means that he also cannot take [a1, b1] and then [a2, b2] because a1 + a2 > b1 > b2.

Let us denote the following:

  • D(i) - duration of the i-th course in the sorted course list
  • L(i) - last day of the i-th course
  • S(i) - sum of D(0) to D(i)
  • H(i) - maximum value of D(i) from 0 to i

Assuming that the student has already taken the maximum possible number of courses from the course list indexed 0 to i-1. Then the student can also take the i-th course in the list if:

S(i-1) + D(i) <= L(i)

Because then the i-th course can also be finished before the last day of the current course. For e.g. given the following list of courses (sorted by last day):

[[3,12],[7,17],[10,20]]

First the student takes the course [3, 12], then sees that for the next course [7, 17], 7 + 3 <= 17 (because L(1)=17) and thus he can also take the next course. For the 3rd course [10,20], S(2)=10 and D(3)=10 and L(3)=20 and thus S(2) + D(3) <= L(3). He can also take the 3rd course.

Now what if S(i-1) + D(i) > L(i) ? Should he take the i-th course or not ?

In such a case, the student will substitute the i-th course with the course with highest duration so far i.e. H(i). If the i-th course itself is the one with the highest duration, then he will not take the i-th course. This ensures that the current sum of duration of all courses taken is minimum so that he has bandwidth to take additional courses and thus maximize the number of courses taken.

The python code for this problem is as follows:

import collections, heapq

class Solution(object):
    def scheduleCourse(self, courses):
        courses = sorted(courses, key=lambda k:k[1])
        max_heap = [(-courses[0][0], courses[0][1])]
        end_day = courses[0][0]
        
        for i in range(1, len(courses)):
            heapq.heappush(max_heap, (-courses[i][0], courses[i][1]))
            end_day += courses[i][0]
            
            if end_day > courses[i][1]:
                a, b = heapq.heappop(max_heap)
                end_day += a
        
        return len(max_heap)

First we sort the courses by their last day and then use the max heap on the course durations to remove the course with the highest duration if in case the i-th course cannot be taken. The run-time complexity is O(N*logN) because heap insertion and deletion is O(logN).

Edit : A more optimized heap operation code. Reduces the number of heap insertions and deletions:

import heapq

class Solution(object):
    def scheduleCourse(self, courses):
        courses = sorted(courses, key=lambda k:k[1])
        max_heap = [(-courses[0][0], courses[0][1])]
        end_day = courses[0][0]
        
        for i in range(1, len(courses)):
            if end_day + courses[i][0] > courses[i][1]:
                if courses[i][0] < -max_heap[0][0]:
                    a, b = heapq.heappushpop(max_heap, (-courses[i][0], courses[i][1]))
                    end_day += a + courses[i][0]
            else:
                heapq.heappush(max_heap, (-courses[i][0], courses[i][1]))
                end_day += courses[i][0]
        
        return len(max_heap)

Categories: PROBLEM SOLVING

Tags: , , , ,

Leave a Reply