Problem Overview
LeetCode 3 — Longest Substring Without Repeating Characters — gives you a string s and asks for the length of the longest substring that contains no duplicate characters. A substring is a contiguous sequence of characters within s, not to be confused with a subsequence which can skip characters. For example, "abcabcbb" yields 3 because the longest window with all unique characters is "abc".
The problem is deceptively simple to state but requires careful handling of duplicate detection and window management. Naive approaches that check all substrings quickly exceed time limits on large inputs. The key insight is that you never need to restart from scratch after finding a duplicate — you only need to shrink the window from the left until the duplicate is removed, then continue expanding rightward.
This problem is the canonical introduction to the variable-size sliding window pattern. Mastering it here pays dividends across dozens of LeetCode problems involving substrings, subarrays, and contiguous segments with uniqueness or frequency constraints.
- Constraints: 0 <= s.length <= 5 * 10^4
- s consists of English letters, digits, symbols and spaces
- Return 0 for empty string
- Window contains only unique characters at all times
- Answer is the maximum window length seen during the scan
Brute Force to Optimal
The brute force approach checks all O(n^2) substrings. For each pair of indices (i, j), extract s[i:j] and check for duplicate characters using a set — that check itself is O(n), giving an overall O(n^3) time complexity. For n = 50,000 this is 1.25 * 10^14 operations, far too slow. Even reducing the check to O(1) per character still leaves O(n^2) substrings to enumerate.
The sliding window insight eliminates redundant work: instead of restarting the window at every position, maintain a window [left, right] where all characters are unique. Expand right by one character at a time. When a duplicate appears, advance left until the duplicate leaves the window. This way each character is visited at most twice — once when entering on the right, once when leaving on the left — giving O(2n) = O(n) time.
The further optimization with a hashmap instead of a hashset reduces worst-case character visits from 2n to n. Rather than shrinking left one step at a time until the duplicate exits, store each character's last seen index. When a duplicate is detected, jump left directly to lastIndex[char] + 1 in a single operation. This eliminates the inner shrink loop entirely and guarantees a true single-pass O(n) solution.
The Hashmap Stores Last Seen Index for Jump Optimization
The hashmap stores each character's last seen index, not just whether it is present. When s[right] is already in the map, the left pointer jumps directly to max(left, map[s[right]] + 1) — one atomic operation that skips over all intermediate positions the hashset approach would have visited one-by-one. This is the critical difference between O(2n) and true O(n).
Sliding Window with HashSet
The hashset variant maintains a set of characters currently in the window. Expand right: if s[right] is not in the set, add it and update max length. If s[right] is already in the set, remove s[left] from the set and advance left by one — repeat until s[right] can be safely added. This two-pointer approach is O(2n) in the worst case because each character can be added and removed from the set once.
The worst case for the hashset approach occurs with strings like "abcdeabcde" where every right expansion triggers a shrink phase. For "abcdeabcde" with window size 5, when right hits the second "a", left must advance 5 times to clear the duplicate. Each of the n characters can trigger up to n shrink steps in a degenerate case — but amortized over the full scan, each character is added and removed exactly once, keeping total work at O(2n).
The hashset approach is conceptually cleaner and easier to reason about for beginners. It correctly handles all edge cases: empty strings, single characters, all-unique strings, and all-duplicate strings. The max window size is updated after every successful right expansion, so no valid window is missed. Space complexity is O(min(n, |charset|)) — bounded by the character set size, typically 26 or 128.
- 1Initialize: left = 0, maxLen = 0, charSet = empty set
- 2Expand right from 0 to len(s) - 1:
- 3 While s[right] in charSet: remove s[left] from charSet, left += 1
- 4 Add s[right] to charSet
- 5 Update maxLen = max(maxLen, right - left + 1)
- 6Return maxLen
Optimized with HashMap
The hashmap variant replaces the set with a dictionary mapping each character to its last seen index. Initialize left = 0 and an empty map. For each right position, if s[right] is in the map and its stored index >= left (meaning it is inside the current window), jump left to map[s[right]] + 1. Then update map[s[right]] = right and compute max length. This is a single-pass O(n) solution with no inner loop.
The condition map[s[right]] >= left is critical. A character may appear in the map from a previous window that has already been passed. For example, in "abba" when right = 3 (second "a"), the map has "a" at index 0. But left is already at 2 (after handling the second "b"). Using map["a"] + 1 = 1 as the new left would move it backwards, breaking the invariant. The max(left, map[char] + 1) guard prevents this regression.
Space complexity remains O(min(n, |charset|)) — the map holds at most one entry per unique character. The time complexity is strictly O(n) since the right pointer advances exactly n times and the left pointer only jumps forward (never shrinks one step at a time). This is the production-grade solution most interviewers expect for this problem.
Why max(left, lastIndex+1) Is Essential
The max(left, map[char] + 1) expression is not optional. Without the max, the left pointer can jump backwards when a duplicate character was seen before the current window began. For example in "abba": when right=3 hits "a" again, map["a"]=0 but left=2. Using just map["a"]+1=1 moves left backwards to 1, incorrectly including "b" twice. The max ensures left only ever moves forward.
Walk-Through Example
Trace "abcabcbb" with the hashmap approach. Start: left=0, maxLen=0, map={}. Right=0 "a": map={"a":0}, window="a", maxLen=1. Right=1 "b": map={"a":0,"b":1}, window="ab", maxLen=2. Right=2 "c": map={"a":0,"b":1,"c":2}, window="abc", maxLen=3. Right=3 "a": "a" in map at index 0 >= left=0, so left=max(0,0+1)=1. map={"a":3,"b":1,"c":2}, window="bca", maxLen=3.
Continuing: Right=4 "b": "b" in map at index 1 >= left=1, so left=max(1,1+1)=2. map={"a":3,"b":4,"c":2}, window="cab", maxLen=3. Right=5 "c": "c" in map at index 2 >= left=2, so left=max(2,2+1)=3. map={"a":3,"b":4,"c":5}, window="abc", maxLen=3. Right=6 "b": "b" in map at index 4 >= left=3, so left=max(3,4+1)=5. map={"a":3,"b":6,"c":5}, window="cb", maxLen=3.
Finally: Right=7 "b": "b" in map at index 6 >= left=5, so left=max(5,6+1)=7. map={"a":3,"b":7,"c":5}, window="b", maxLen=3. Scan complete. Return maxLen=3. The answer matches the expected output. Notice the left pointer only moved forward — never backward — and each character was processed exactly once.
- 1right=0 "a": map={a:0}, left=0, window=1, maxLen=1
- 2right=1 "b": map={a:0,b:1}, left=0, window=2, maxLen=2
- 3right=2 "c": map={a:0,b:1,c:2}, left=0, window=3, maxLen=3
- 4right=3 "a": dup at 0>=0, left=1. map={a:3,b:1,c:2}, window=3, maxLen=3
- 5right=4 "b": dup at 1>=1, left=2. map={a:3,b:4,c:2}, window=3, maxLen=3
- 6right=5 "c": dup at 2>=2, left=3. map={a:3,b:4,c:5}, window=3, maxLen=3
- 7right=6 "b": dup at 4>=3, left=5. map={a:3,b:6,c:5}, window=2, maxLen=3
- 8right=7 "b": dup at 6>=5, left=7. map={a:3,b:7,c:5}, window=1, maxLen=3
Code Walkthrough: Python and Java
Python solution using a dict: def lengthOfLongestSubstring(s): char_map = {}; left = 0; max_len = 0. For right, char in enumerate(s): if char in char_map and char_map[char] >= left: left = char_map[char] + 1. char_map[char] = right; max_len = max(max_len, right - left + 1). Return max_len. Time O(n), space O(min(n, 26)) for lowercase alphabet or O(128) for ASCII. The solution is 8 lines and handles all edge cases including empty strings.
Java solution: public int lengthOfLongestSubstring(String s) { Map<Character,Integer> map = new HashMap<>(); int left = 0, maxLen = 0; for (int right = 0; right < s.length(); right++) { char c = s.charAt(right); if (map.containsKey(c) && map.get(c) >= left) { left = map.get(c) + 1; } map.put(c, right); maxLen = Math.max(maxLen, right - left + 1); } return maxLen; }. Same O(n) time and O(min(n,|charset|)) space. The HashMap.get and put operations are O(1) average.
Both implementations are structurally identical: one pass with right pointer, conditional left jump using max guard, map update, and running max. The Python version uses enumerate for cleaner indexing; the Java version uses charAt for character extraction. Edge cases handled automatically: empty string returns 0 (loop never executes), single character returns 1, all-unique string returns n, all-same character returns 1.
Always Use max(left, map[char]+1) Not Just map[char]+1
A common implementation bug is writing left = map[char] + 1 without the max(left, ...) guard. This compiles and passes most test cases but fails when a duplicate character was last seen before the current window. The left pointer must never move backwards — omitting the max guard causes incorrect results on inputs like "abba", "dvdf", and "tmmzuxt". Always guard with max(left, map[char] + 1).