Machine Learning, AI and Programming

Leetcode : Number of digit one

Problem Statement


The brute force approach would be iterate through all numbers from 0 to N and then for each number count the number of 1's and add them up. For a number N, the number of digits it contains is O(log10N). Thus the brute force approach has a time complexity of O(N*log10N). Can we do better than this ? The proposed solution achieves a O(log10N) run-time complexity. The idea is to use dynamic programming on the number of digits.

Let S(m) denote the number of digit one in all numbers upto 99...9 repeated 'm' times i.e. all numbers with less than or equal to m digits. Then in order to compute S(m+1), we know that all numbers with m+1 digits can be obtained as follows:

1 ... followed by all numbers with m digits

2 ... followed by all numbers with m digits


9 ... followed by all numbers with m digits

Thus S(m+1) will contain a component 9*S(m) corresponding to the above reasoning. Also since S(m+1) is the summation till the m+1 digits, thus we also need to add an additional S(m) term to S(m+1). Think of S(m) as a cumulative quantity i.e. already takes care of S(1), S(2), ... upto S(m-1).

S(m) does not include any numbers which does not contain any one i.e. all numbers used to compute S(m) contains at-least 1 one. Now when computing S(m+1), for all numbers starting with a 1, the remaining m digits can contain any number from 0 to 9 and not only 1 because we already have a 1 in the first place. Since the number of all possible numbers with 'm' digits is 10m.

One can argue that we already have considered the quantity "1 ... followed by all numbers with m digits" in the 9*S(m) term above, then aren't we double counting for all numbers that already has a 1 in at-least one position in the last m digits ? In-fact yes, we are double counting all these numbers because since we are adding a 1 in the first position, the count of 1's for all these numbers should be 1 higher than what is already present.

For e.g. if m=2, then '11' would be counted twice in S(2), now for S(3), when we have a 1 in the 1st position i.e. '111' this number should be counted 3 times.

Thus the final recurrence relation for S(m+1) is as follows:

S(m+1) = 9*S(m) + S(m) + 10m = 10*S(m) + 10m.

Now for a given number N, with the number of digits as M=int(log10N)+1, we compute the above upto S(M-1) because if N=7495 then M=4 and S(M) will contain number of 1's upto 9999 which overshoots 7495. Thus we will first compute for one digit less i.e. S(3) upto 999. Now given N=7495, we know that S(3) will be counted 6 times because number of digit one in 7495 is at-least as great as 6999. Thus S(3) will be accounted for 1999, 2999, ... 6999.

Once we are done with the 1st digit, we move to the next digit i.e. 4. Similar as above but with S(2). S(2) will be counted 3 times for the 2nd position because number of digit one in 495 is at-least as great as 399. S(2) will be accounted for 199, 299 and 399. Repeat this process for all remaining digits. Thus finally, the number of digit one in 7495 would be:

(6*S(3) + 103)+ (3*S(2) + 102)+ (9*S(1) + 101) + 1

The terms 10m occurs in the above calculation by the same reasoning we had used to deduce S(m+1) from S(m) i.e. for all m+1 digits starting with a 1, can have the remaining m digits any thing between 0 and 9. But what if there is a 1 in N. This becomes a special case.

Consider N=1495. We cannot use S(3) for the 1st position because S(3) is only valid for starting digits > 1. Also in such a case we cannot include the 103 term in the above calculations because it implies that N >= 1999. We only take the value of the next 3 digits instead of 103 i.e. 495. Let us denote :

Ti = (Ni-1) * S(M-i-1) + 10M-i-1    if Ni > 1 else    1 + Ni+1Ni+2...NM

where 'i' is a position (indexed from 0) in the number N, Ni is the digit in the i-th position and Ni+1Ni+2...NM represents the integer value of the digits from position i+1 to M. Notice that there is an additional 1 in the else condition in the above equation. This is to handle the special case of '1000'.

Thus the final answer G(N) would be:

G(N) = T0 + T1 + ... + TM-1

Here is the Python code for the above approach:

import math

class Solution(object):
    def countDigitOne(self, n):
        if n < 1:
            return 0
        m = int(math.log10(n))
        sums = dict()
        for m1 in range(1, m+1):
            if m1 == 1:
                sums[m1] = 1
                sums[m1] = 10*sums[m1-1] + 10**(m1-1)
        str_n, out = str(n), 0
        for i in range(len(str_n)):
            v = int(str_n[i])
            m1 = len(str_n) - i
            if i == len(str_n)-1:
                out += 1 if v > 0 else 0
                if v > 0:
                    if v > 1:
                        out += v*sums[m1-1] + 10**(m1-1)
                        out += sums[m1-1] + 1 + int(str_n[i+1:])
        return out

We can compress the above code by noticing that we don't need to store all the 'sums' in a dict as a pre-processing step i.e. we remove the first for-loop from the code and add the updation of the 'sums' in the second for-loop itself. Also instead of working with strings for the digits in N, we can directly work with mathematical operations as shown below:

import math

class Solution(object):
    def countDigitOne(self, n):
        if n < 1:
            return 0
        m1, out, u, v, p = 1, 0, 0, 1, 1
        while n > 0:
            rem = n % 10
            if m1 == 1:
                out += 1 if rem > 0 else 0
                if rem > 0:
                    if rem > 1:
                        out += rem*v + p
                        out += v + 1 + u
                v = 10*v + p
            u = rem*p + u
            n = int(n/10)
            p = 10*p
            m1 += 1
        return out


Tags: , , , ,

Leave a Reply