Problem Overview
LeetCode 338 — Counting Bits — gives you a non-negative integer n and asks you to return an array ans of length n+1 where ans[i] equals the number of 1 bits in the binary representation of i, for every i from 0 to n inclusive. For example, given n = 5, the output is [0, 1, 1, 2, 1, 2] because 0 has zero 1-bits, 1 has one, 2 (binary 10) has one, 3 (binary 11) has two, 4 (binary 100) has one, and 5 (binary 101) has two.
The problem explicitly asks for an O(n) solution without using any built-in popcount function such as bin(i).count('1') in Python or Integer.bitCount(i) in Java. This constraint rules out the naive O(n log n) approach of counting bits for each integer independently and forces you to find a recurrence — a way to express each number's bit count in terms of a previously computed entry in the dp array.
LeetCode 338 is rated Easy but tests a non-trivial insight: that every integer's bit count is related to a simpler integer's bit count by a single bit operation. Discovering that relationship is the core challenge. The problem appears in the LeetCode Top Interview 150 and the LeetCode 75 lists and is a natural companion to LeetCode 191 (Number of 1 Bits) and LeetCode 268 (Missing Number).
- Given n, return array ans of length n+1 where ans[i] = popcount(i)
- Must be O(n) time — no per-number bit loop
- Cannot use built-in popcount functions
- ans[0] is always 0 (zero has no set bits)
- n can be up to 100,000 in the LeetCode constraints
Naive Approach
The straightforward approach iterates from 0 to n and, for each number i, counts its set bits individually. Using Brian Kernighan's trick — while i > 0: count += 1; i &= i - 1 — each inner loop iteration removes the lowest set bit, so the inner loop runs in O(k) time where k is the number of set bits in i. Since k is at most log2(n) bits, the naive approach runs in O(n * log n) time overall.
Alternatively you can use a right-shift loop: while i > 0: count += i & 1; i >>= 1. This is O(log n) per number in the worst case. Either way, the total cost across all n+1 numbers is O(n log n), not O(n). For n = 100,000, log2(n) is about 17, so the naive approach does roughly 1.7 million bit-counting operations instead of 100,000. For larger n the gap grows.
The naive approach is not wrong — it produces the correct output — but it fails the O(n) requirement. The key observation that unlocks the O(n) solution is that when you compute dp[i], all entries dp[0] through dp[i-1] are already available. If you can express dp[i] in terms of one of those earlier entries using a single O(1) bit operation, the total cost drops to O(n).
The DP approach reuses previously computed results: each number's bit count relates to a smaller number's bit count by one operation
Think about it structurally: every integer i has some set of 1-bits. If you can identify one specific 1-bit and strip it away, you get a smaller integer j < i. Then dp[i] = dp[j] + 1 — one more 1-bit than j. The challenge is finding the right bit to strip such that the operation to compute j is O(1) and j is always guaranteed to be less than i (so it is already in your dp array). Three different bit tricks each satisfy these properties, leading to three different O(n) recurrences.
DP with Right Shift
The right-shift recurrence is dp[i] = dp[i >> 1] + (i & 1). Right-shifting i by 1 removes the least significant bit (LSB), giving i >> 1. Since right-shifting always produces a strictly smaller number (i >> 1 < i for all i >= 1), the entry dp[i >> 1] is already computed when we reach i in the forward pass. The (i & 1) term checks whether the bit we just removed was a 1: if i is odd, the LSB was 1 and we add 1; if i is even, the LSB was 0 and we add 0.
This recurrence is the most intuitive of the three. Reading it aloud: the number of 1-bits in i equals the number of 1-bits in i with its last bit removed, plus 1 if that last bit was a 1. Equivalently, i >> 1 is just i divided by 2 (integer division), and the parity of i (odd or even) tells whether the LSB contributes. The recurrence naturally groups numbers into even and odd: dp[2k] = dp[k] (same bits, just shifted) and dp[2k+1] = dp[k] + 1 (same bits plus one extra 1-bit at the end).
For example, dp[6] = dp[3] + (6 & 1) = dp[3] + 0 = 2 (6 is 110 in binary, 3 is 11 in binary, both have two 1-bits). And dp[7] = dp[3] + (7 & 1) = 2 + 1 = 3 (7 is 111 in binary with three 1-bits). The recurrence builds the entire dp array in a single left-to-right pass, using each entry exactly once.
- 1Initialize dp array of length n+1 with dp[0] = 0
- 2For i from 1 to n: dp[i] = dp[i >> 1] + (i & 1)
- 3i >> 1 removes the last bit — always < i, so already computed
- 4(i & 1) adds 1 if i is odd (last bit is 1), 0 if i is even
- 5Return the dp array
DP with Kernighan's Trick
The Brian Kernighan recurrence is dp[i] = dp[i & (i-1)] + 1. The expression i & (i-1) clears the lowest set bit of i. This is the same trick used in LeetCode 191 (Number of 1 Bits) to count bits by repeatedly stripping the lowest set bit. Here we use it in reverse: rather than stripping bits in a loop, we note that i & (i-1) is always strictly less than i (it has one fewer 1-bit and all remaining bits match i), so dp[i & (i-1)] is already computed when we reach i.
Reading the recurrence: the number of 1-bits in i equals the number of 1-bits in i with its lowest set bit cleared, plus 1 for that cleared bit. This is always valid because i & (i-1) is strictly less than i for any i >= 1, and i & (i-1) >= 0. The base case dp[0] = 0 is essential here: for any i that is a power of 2, i & (i-1) = 0, so dp[i] = dp[0] + 1 = 1. Indeed, every power of 2 has exactly one 1-bit.
For example, dp[12] = dp[12 & 11] + 1 = dp[8] + 1. 12 in binary is 1100, 11 is 1011, their AND is 1000 = 8. dp[8] = 1 (8 is a power of 2), so dp[12] = 2. 12 in binary is 1100, which indeed has two 1-bits. The Kernighan recurrence is slightly more abstract than the right-shift version but connects directly to the LeetCode 191 approach.
Both the right-shift and Kernighan recurrences are O(n) total; right-shift is easier to understand, Kernighan connects to LeetCode 191
The right-shift recurrence dp[i] = dp[i >> 1] + (i & 1) strips the least significant bit positionally — it removes bit 0. The Kernighan recurrence dp[i] = dp[i & (i-1)] + 1 strips the lowest set bit by value — it removes whichever bit position has the rightmost 1. For powers of two these are the same operation. For other numbers they differ: dp[6] via right-shift uses dp[3], while dp[6] via Kernighan uses dp[4]. Both give the correct answer. In interviews, mention both recurrences to demonstrate breadth; use right-shift as your default because the + (i & 1) term makes the derivation transparent.
DP with Last Set Bit
The last-set-bit recurrence is dp[i] = dp[i - (i & -i)] + 1. The expression (i & -i) isolates the lowest set bit of i — it returns a power of 2 equal to the position value of the lowest set bit. For example, for i = 12 (binary 1100), (i & -i) = 4 (binary 0100). For i = 7 (binary 0111), (i & -i) = 1 (binary 0001). The trick works because -i in two's complement flips all bits and adds 1, which causes the lowest set bit of i to be the only bit that is 1 in both i and -i.
Subtracting (i & -i) from i removes that lowest set bit, giving a number with the same higher bits and one fewer 1-bit. Since we removed a positive power of 2 from i, the result i - (i & -i) is strictly less than i, so it is already in the dp array. The +1 accounts for the removed bit. This formulation is mathematically equivalent to the Kernighan recurrence — i & (i-1) and i - (i & -i) both clear the lowest set bit — but the subtraction form is easier to derive from the (i & -i) isolation property.
For example, dp[10] via last-set-bit: (10 & -10) = 2 (10 is 1010, lowest set bit is at position 1, value 2). dp[10] = dp[10 - 2] + 1 = dp[8] + 1 = 1 + 1 = 2. 10 in binary is 1010, which has two 1-bits. Correct. All three recurrences — right-shift, Kernighan, and last-set-bit — produce the identical dp array; they differ only in which smaller index they refer back to.
- 1Initialize dp array of length n+1 with dp[0] = 0
- 2For i from 1 to n: compute lowest_set = i & (-i)
- 3dp[i] = dp[i - lowest_set] + 1
- 4i - lowest_set removes the lowest set bit — always < i, so already computed
- 5Return the dp array
Code Walkthrough Python and Java
Python — right-shift (cleanest): def countBits(n): dp = [0] * (n + 1); [dp.__setitem__(i, dp[i >> 1] + (i & 1)) for i in range(1, n + 1)]; return dp. Or more readably: dp = [0] * (n + 1); for i in range(1, n + 1): dp[i] = dp[i >> 1] + (i & 1); return dp. O(n) time, O(1) extra space (the output array itself does not count as extra space per the problem statement). Both the Kernighan form dp[i] = dp[i & (i - 1)] + 1 and the last-set-bit form dp[i] = dp[i - (i & -i)] + 1 work identically in Python, just substituting the index expression.
Java — all three forms translate directly: public int[] countBits(int n) { int[] dp = new int[n + 1]; for (int i = 1; i <= n; i++) { dp[i] = dp[i >> 1] + (i & 1); } return dp; }. The right-shift version is the most commonly seen in interviews. The Kernighan version would use dp[i] = dp[i & (i - 1)] + 1 and the last-set-bit version dp[i] = dp[i - (i & -i)] + 1. All three have O(n) time and O(1) extra space. Java's fixed-width int makes the (i & -i) two's complement trick work correctly without any masking.
When presenting this problem in an interview, state the recurrence first, then explain why the referred index is always less than i (ensuring it is already computed), then code the loop. For the right-shift version: i >> 1 < i because right-shifting a positive integer always reduces it. For Kernighan: i & (i-1) < i because clearing a set bit strictly reduces the value. For last-set-bit: i - (i & -i) < i because (i & -i) >= 1 for all i >= 1. The time is O(n) because each dp[i] is computed in O(1) using a single bit operation plus one array lookup. The extra space is O(1) because no auxiliary structures are needed beyond the output array.
Make sure dp[0] = 0 as the base case — all three recurrences depend on it since 0 has zero 1-bits
All three recurrences bottom out at dp[0] = 0. For any power of 2 (i = 1, 2, 4, 8, ...), the Kernighan recurrence gives dp[i] = dp[0] + 1 = 1 because i & (i-1) = 0. The right-shift recurrence gives dp[1] = dp[0] + (1 & 1) = 0 + 1 = 1. The last-set-bit recurrence gives dp[i] = dp[0] + 1 = 1 because i - (i & -i) = 0 for powers of two. If dp[0] were initialized to any non-zero value, all downstream results would be wrong. In Python, [0] * (n + 1) initializes all entries to 0 correctly. In Java, int[] dp = new int[n + 1] also zero-initializes. Never skip the base case.