Problem Walkthrough

Contains Duplicate LeetCode 217 Deep Dive

LeetCode 217 Contains Duplicate is the gateway warm-up problem for HashSet membership checking. Given an integer array, return true if any value appears at least twice and false if every element is distinct. The HashSet approach achieves O(n) time by inserting each element and checking for membership in O(1): if an element is already in the set, a duplicate exists and you return true immediately. The Python one-liner len(nums) != len(set(nums)) captures this in a single expression. The alternative sorting approach brings duplicates adjacent in O(n log n) time, then a single pass comparing nums[i] == nums[i-1] detects any match, using O(1) extra space if sorting is done in-place. The classic interview trade-off: sorting saves space at the cost of time; HashSet saves time at the cost of space.

7 min read|

Contains Duplicate

LeetCode 217

Problem Overview

LeetCode 217 — Contains Duplicate — gives you an integer array nums and asks you to return true if any value appears at least twice in the array, and false if every element is distinct. For example, nums = [1, 2, 3, 1] returns true because 1 appears at indices 0 and 3. But nums = [1, 2, 3, 4] returns false because all four values are distinct. The problem is rated Easy and is typically the first problem assigned in LeetCode study plans.

Despite its Easy rating, Contains Duplicate is the perfect vehicle for introducing two fundamental algorithmic patterns: HashSet membership checking and sort-then-scan. These two approaches embody the classic space-vs-time trade-off: HashSet gives O(n) time at the cost of O(n) extra space, while sorting gives O(n log n) time but only O(1) extra space if done in-place. Understanding both and being able to discuss the trade-off is what interviewers look for when they assign this problem.

The constraints are generous: 1 <= nums.length <= 10^5 and the values can be any 32-bit signed integer. This means a brute force O(n^2) approach will time out — with n up to 100,000, comparing all pairs gives up to 5 * 10^9 operations, far exceeding what can run in 1-2 seconds. An efficient approach is required.

  • Input: integer array nums of length 1 to 10^5
  • Output: true if any value appears at least twice, false if all elements are distinct
  • Values can be any 32-bit signed integer (negative, zero, or positive)
  • Constraints: 1 <= nums.length <= 10^5
  • Brute force O(n^2) will time out for n up to 10^5

Brute Force

The brute force approach checks all pairs (i, j) where i < j and returns true if nums[i] == nums[j] for any pair. Two nested loops iterate over all combinations: the outer loop runs from index 0 to n-2 and the inner loop runs from i+1 to n-1. If a matching pair is found, return true immediately; if no match is found after all pairs are checked, return false.

This approach is correct but has O(n^2) time complexity — with n = 10^5, this means up to 5 * 10^9 comparisons, which will exceed the time limit. The space complexity is O(1) since no extra data structures are used. The brute force is useful only for very small inputs or for establishing correctness before optimizing.

In an interview, mentioning the brute force first is fine — it shows you understand the problem and can verify correctness. But always follow up immediately with the optimized approach. Saying "the brute force is O(n^2) and will time out; here are two better approaches with different trade-offs" is exactly the right framing to show algorithmic maturity.

💡

This is the simplest HashSet problem: add each element; if it already exists the answer is true. One line in Python: len(nums) != len(set(nums))

Contains Duplicate is arguably the purest HashSet membership problem on LeetCode — its entire solution reduces to: for each element, check if it is already in the set; if yes, return true; otherwise add it and continue. In Python this collapses to the one-liner len(nums) != len(set(nums)): set(nums) deduplicates the array, and if the set is smaller than the original, a duplicate existed. In an interview, state this one-liner and then offer the explicit loop with early exit as the more efficient implementation for arrays with duplicates near the start.

Sorting Approach

The sorting approach exploits the property that sorting brings equal elements adjacent to each other. Sort nums in-place, then do a single pass comparing each element to its neighbor: if nums[i] == nums[i-1] for any i from 1 to n-1, a duplicate exists and you return true. If the entire array is traversed without finding adjacent equals, return false.

The time complexity is O(n log n) dominated by the sort. If the sort is done in-place (as in Python's list.sort() or Java's Arrays.sort()), the space complexity is O(1) — no extra data structure proportional to n is needed. This makes sorting attractive in memory-constrained environments where the O(n) HashSet space is unacceptable.

The sorting approach is slightly more verbose than the HashSet approach and has worse time complexity, but its O(1) space usage is a genuine advantage in certain contexts. In an interview, presenting both approaches and explaining when you would choose each (HashSet for speed, sorting for memory) demonstrates the depth interviewers are looking for on this problem.

  1. 1Sort nums in-place: nums.sort() in Python, Arrays.sort(nums) in Java
  2. 2Iterate from index 1 to len(nums) - 1
  3. 3At each index i, check if nums[i] == nums[i - 1]
  4. 4If equal, a duplicate was found — return true immediately
  5. 5If the loop completes without finding a match, return false

HashSet Approach

The HashSet approach iterates through the array and adds each element to a set. Before adding, check if the element is already in the set: if it is, a duplicate has been found and you return true immediately. If the element is not yet in the set, add it and continue. If you reach the end of the array without finding a duplicate, return false.

This is O(n) time because each element is processed exactly once and HashSet membership checks and insertions are O(1) on average. The space complexity is O(n) in the worst case — if all elements are distinct, every element is added to the set. In the best case (duplicate found immediately), the set holds only one or two elements.

In Java, HashSet.add(element) returns false if the element was already in the set, which allows a particularly clean implementation: for each element, if add returns false, return true. In Python, the check is explicit: if num in seen: return True; seen.add(num). Both are idiomatic for their respective languages and both have the same O(n) time and O(n) space complexity.

ℹ️

The trade-off is classic: sorting uses less space but more time; HashSet uses more space but optimal time. Interviewers expect you to discuss both.

Contains Duplicate is a textbook example of the space-vs-time trade-off. Sorting requires O(1) extra space but O(n log n) time; HashSet requires O(n) extra space but O(n) time. Neither is universally superior — the right choice depends on the constraints of your system. In an interview, stating the trade-off explicitly ("I will use HashSet for optimal O(n) time at the cost of O(n) space; if memory is a constraint, sorting at O(n log n) with O(1) space is the alternative") shows you understand algorithms as engineering decisions, not just puzzles to solve.

Early Exit Optimization

The HashSet approach with early exit returns true as soon as the first duplicate is detected, without processing any remaining elements. In the best case — when duplicates appear at the very beginning of the array — the algorithm terminates in O(1) time after examining just two elements. This is a significant practical advantage over approaches that always process the entire array.

Consider an array where the first two elements are the same: [5, 5, 1, 2, 3, ...]. The early exit HashSet finds the duplicate immediately and returns. The sorting approach must still sort the entire array before discovering the adjacent duplicate. The Python one-liner set(nums) also processes the entire array before comparing lengths. For duplicate-heavy inputs, early exit can be dramatically faster in practice.

Even the worst-case scenario — all elements distinct — processes each element exactly once, maintaining the O(n) theoretical bound. The early exit is a pure optimization with no downside: it never makes the algorithm slower and can make it arbitrarily faster. Always implement early exit in the HashSet approach; there is no reason not to.

  1. 1Initialize an empty HashSet: seen = set() in Python, Set<Integer> seen = new HashSet<>() in Java
  2. 2Iterate over each element num in nums
  3. 3Check if num is already in seen
  4. 4If yes: a duplicate found — return true immediately without processing the rest
  5. 5If no: add num to seen and continue to the next element
  6. 6After the loop completes with no duplicates found, return false

Code Walkthrough Python and Java

Python — one-liner: def containsDuplicate(nums): return len(nums) != len(set(nums)). This builds the full set before comparing lengths; it is correct and minimal but does not exit early. Python — explicit loop with early exit: def containsDuplicate(nums): seen = set(); [return True for num in nums if num in seen or seen.add(num)]; return False. More idiomatically: seen = set() then for num in nums: if num in seen: return True; seen.add(num). Then return False outside the loop. O(n) time, O(n) space, with early exit.

Java — HashSet.add() returns false on duplicate: public boolean containsDuplicate(int[] nums) { Set<Integer> seen = new HashSet<>(); for (int num : nums) { if (!seen.add(num)) return true; } return false; }. The add() method of HashSet returns false if the element was already present, so !seen.add(num) is true when a duplicate is found. This is idiomatic Java and avoids a separate contains() call, making the code slightly more efficient. O(n) time, O(n) space.

Both Python and Java implementations handle all edge cases: a single-element array (always returns false), an array of all identical elements (returns true on the second element), and an array where the duplicate is the last pair encountered. The Java solution benefits from HashSet's O(1) amortized add() and contains() backed by a hash table, giving consistent performance across all integer values including negative numbers and Integer.MIN_VALUE.

⚠️

The Python one-liner set(nums) builds the entire set before comparing lengths; the explicit loop with early exit can be much faster for arrays with early duplicates.

len(nums) != len(set(nums)) is elegant and correct, but set(nums) always processes every element in the array before the length comparison happens. If the array is [1, 1, 2, 3, ..., 100000], the one-liner still visits all 100,000 elements while the explicit loop returns true after just two. In an interview, mention both: lead with the one-liner to show Python fluency, then explain the explicit loop as the production-quality implementation with early exit. The interviewer will appreciate that you understand the performance difference even between two correct O(n) solutions.

Ready to master algorithm patterns?

YeetCode flashcards help you build pattern recognition through active recall and spaced repetition.

Start practicing now