Problem Overview
LeetCode 204 — Count Primes — asks you to count the number of prime numbers that are strictly less than a given integer n. A prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself.
The naive approach checks each number from 2 to n-1 for primality by testing divisibility up to its square root. This runs in O(n * sqrt(n)) time, which is too slow for large inputs like n = 5,000,000. We need a fundamentally better algorithm.
The Sieve of Eratosthenes is the classic solution. Instead of testing each number individually, it marks all composite numbers in bulk by iterating through primes and crossing off their multiples. This reduces the time complexity to O(n log log n), which is nearly linear.
- Input: integer n
- Output: count of primes strictly less than n
- n = 0 or 1 returns 0 (no primes below these values)
- n = 2 returns 0 (2 is not strictly less than 2)
- n = 10 returns 4 (primes: 2, 3, 5, 7)
The Sieve of Eratosthenes
Create a boolean array isPrime of size n, initialized to true. Set isPrime[0] and isPrime[1] to false since 0 and 1 are not prime. Then iterate from p = 2 upward. For each index where isPrime[p] is still true, mark all multiples of p as composite.
The critical optimization is starting the marking at p*p rather than 2*p. All smaller multiples of p (like 2*p, 3*p, ..., (p-1)*p) have already been marked by smaller primes. Starting at p*p skips this redundant work and significantly reduces the number of operations.
Continue until p*p >= n, because any composite number less than n must have a factor less than or equal to sqrt(n). After the sieve completes, count the number of true values in the array — each represents a prime number less than n.
Start at p*p, Not 2*p
Starting the inner loop at p*p instead of 2*p is not just an optimization — it can make the difference between TLE and accepted on LeetCode. Every multiple of p below p*p has already been marked by a smaller prime factor.
Step-by-Step Walkthrough
Let n = 20. Create isPrime[0..19], all true. Set isPrime[0] = isPrime[1] = false. Start with p = 2. Since isPrime[2] is true, mark 4, 6, 8, 10, 12, 14, 16, 18 as false (starting from 2*2 = 4, incrementing by 2).
Move to p = 3. isPrime[3] is true. Mark 9, 12, 15, 18 as false (starting from 3*3 = 9, incrementing by 3). Note that 12 and 18 were already marked by 2, but that is fine — the sieve naturally handles overlaps.
Move to p = 4. isPrime[4] is false (already marked), so skip. At p = 5, 5*5 = 25 >= 20, so we stop the outer loop. Count trues: indices 2, 3, 5, 7, 11, 13, 17, 19 — that is 8 primes less than 20.
Implementation in Python and Java
In Python, use a list of booleans: is_prime = [True] * n. After running the sieve, return sum(is_prime) — Python counts True as 1. The inner loop uses range(p*p, n, p) for efficient composite marking.
In Java, use a boolean[] of size n. The sieve logic is identical. Count primes with a simple loop at the end. Java's boolean array defaults to false, so initialize all entries from index 2 onward to true before sieving.
Both implementations handle edge cases naturally: when n <= 2, the array is too small to contain any primes, so the count is 0. No special case handling is needed beyond the initial array setup.
Memory Optimization
For very large n, you can use a bitset instead of a boolean array to reduce memory by 8x. In Python, use bitarray; in Java, use java.util.BitSet. This matters when n exceeds 10 million.
Complexity Analysis
Time complexity is O(n log log n). The outer loop runs up to sqrt(n) times. For each prime p, the inner loop marks n/p composites. The sum over all primes p of n/p converges to n * log(log(n)) by Mertens' theorem. This is nearly linear in practice.
Space complexity is O(n) for the boolean sieve array. Each entry uses one byte in most languages (or one bit with a bitset optimization). For n = 5,000,000, this is about 5 MB — well within typical memory limits.
The sieve is optimal for this problem. No known algorithm counts primes below n faster than O(n) for the general case without advanced number theory (like the Meissel-Lehmer algorithm, which is far more complex to implement).
Common Mistakes and Edge Cases
The most frequent mistake is using n <= rather than n < in the problem constraint. Count Primes asks for primes strictly less than n, not less than or equal to n. Creating an array of size n (not n+1) handles this correctly.
Another common error is not starting the inner loop at p*p. Starting at 2*p works but leads to TLE on large inputs because it re-marks composites that smaller primes already handled. The p*p optimization is essential.
Integer overflow can bite in languages like C++ or Java when computing p*p. If p is close to sqrt(INT_MAX), p*p overflows. Cast to long before multiplying: (long)p * p < n. In Python this is not an issue due to arbitrary precision integers.
YeetCode Flashcard Tip
The Sieve of Eratosthenes is a foundational algorithm that appears in number theory problems across coding interviews. Add it to your YeetCode flashcard deck for spaced repetition mastery.