Problem Walkthrough

Min Deletions Unique Freq LeetCode 1647

A greedy approach using frequency counting and a set to track used frequencies — ensuring every character frequency in the string is unique with the fewest deletions possible.

8 min read|

Min Deletions Frequencies

LeetCode 1647

Problem Overview

LeetCode 1647 asks: given a string s, return the minimum number of character deletions required so that no two distinct characters share the same frequency. For example, "aaabbbcc" has frequencies {a:3, b:3, c:2}. Deleting one 'b' gives {a:3, b:2, c:2}, but c and b still collide. Deleting one 'c' as well yields {a:3, b:2, c:1} — all unique. That is 2 deletions total.

The problem is rated Medium and appears regularly in FAANG phone screens as a test of greedy reasoning and frequency-map fluency. The key insight is that you never need to increase any character's count — only reduce. This makes a greedy, top-down approach optimal.

Constraints to keep in mind: s consists only of lowercase English letters (26 possible characters), the length of s can be up to 100,000, and the answer is guaranteed to exist since every character can be deleted entirely (frequency 0).

  • s consists of lowercase English letters only (a–z)
  • 1 <= s.length <= 100,000
  • Return the minimum number of character deletions
  • Frequency 0 is valid — a character can be completely removed

Greedy Insight

The greedy strategy is straightforward: count how often each character appears, sort those frequency counts in descending order, and iterate through them. For each frequency, if it is already claimed by a previous character, keep decrementing by 1 (each decrement = one deletion) until you find an unclaimed frequency or reach 0.

This works because reducing a frequency to the nearest available slot below it is always optimal. You are never forced to skip a slot — every integer from 1 up to the maximum frequency is a valid target, and the set tells you instantly which slots are still open.

Sorting descending is the natural choice: larger frequencies have more room to maneuver (they can step down to any lower value), so processing them first fills the highest available slots and leaves smaller slots for smaller frequencies. The result is a minimum total number of decrements — i.e., minimum deletions.

💡

Greedy Correctness

The greedy approach works because reducing a frequency to the nearest unused value is always optimal. You never need to increase any frequency — only decrease. Each character independently finds its best available slot, and no backtracking is required.

Algorithm with Set

The algorithm uses a hash map for frequency counting and a set for tracking which frequency values are already in use. Both data structures support O(1) lookups, keeping the overall complexity low.

The set starts empty. As you assign each frequency to a character, you add it to the set. If the current frequency is already in the set, you decrement and count that as one deletion. You repeat until the frequency is either 0 or unused. A frequency of 0 means the character is completely removed — it simply is not added to the set since multiple characters can all have frequency 0.

This clean separation — count in a map, track usage in a set — makes the algorithm easy to implement, easy to explain in an interview, and easy to verify for correctness.

  1. 1Count character frequencies using a hash map (or Counter in Python)
  2. 2Create an empty set to track used frequency values
  3. 3Sort frequency values in descending order
  4. 4For each frequency f: while f > 0 and f is in the used set, decrement f and add 1 to deletions
  5. 5Add f to the used set (even if f is now 0, skip adding 0 so multiple chars can share it)
  6. 6Return total deletions

Why Sort Descending

Sorting frequencies in descending order before processing ensures that high-frequency characters claim the highest available slots first. A character with frequency 5 can legitimately occupy 5, 4, 3, 2, 1, or 0. Starting from the top means it takes slot 5 if available, preserving lower slots for characters that start with lower counts.

If you process in ascending order, a low-frequency character might claim slot 2, forcing a high-frequency character (already at 2 due to collisions) to keep decrementing past slot 2 unnecessarily — producing more total deletions. Descending order prevents this cascade.

In practice the sort covers at most 26 frequencies (one per lowercase letter), so it runs in constant time — O(26 log 26) — and has no meaningful performance cost.

ℹ️

Sort is Optional with a Set

Sorting is optional if using a set: the set approach works in any order because each frequency independently finds its nearest available slot. However, sorting descending guarantees minimum deletions and makes the greedy argument easier to prove in an interview setting.

Walk-Through Example

Take the string "aaabbbcc". Character frequencies are: a → 3, b → 3, c → 2. After sorting descending: [3, 3, 2]. The used set starts empty and deletions = 0.

Process first 3: the set is empty, so 3 is unused. Add 3 to the set. Used = {3}, deletions = 0. Process second 3: 3 is in the set. Decrement to 2 — but 2 is also in the set (from c? No, not yet). Actually 2 is not in the set yet. Add 2, deletions = 1. Used = {3, 2}. Process 2 (from c): 2 is already in the set. Decrement to 1 — 1 is free. Add 1, deletions = 2. Used = {3, 2, 1}.

Total deletions = 2. This matches the expected answer. The final frequency assignment is a:3, b:2, c:1 — all distinct. No character was eliminated entirely, which is the minimal solution.

  1. 1Input: "aaabbbcc" — frequencies {a:3, b:3, c:2}
  2. 2Sorted descending: [3, 3, 2]
  3. 3Process 3 (a): set empty, add 3. Used={3}, deletions=0
  4. 4Process 3 (b): 3 in set, decrement to 2. 2 not in set, add 2. Used={3,2}, deletions=1
  5. 5Process 2 (c): 2 in set, decrement to 1. 1 not in set, add 1. Used={3,2,1}, deletions=2
  6. 6Result: 2 deletions — final frequencies are {a:3, b:2, c:1}

Code Walkthrough — Python and Java

In Python, Counter from collections handles frequency counting in one line. The sorted() call with reverse=True gives the descending order. A plain set() tracks used frequencies. The inner while loop decrements until finding a free slot or hitting 0. Total code: ~10 lines.

In Java, use a HashMap<Character, Integer> for counting, then collect values into a List, sort in reverse, and use a HashSet<Integer> for tracking. The logic is identical — the verbosity is higher but the algorithm is the same. Both run in O(n + 26 log 26) = O(n) time and O(26) = O(1) space, since there are at most 26 distinct characters.

A common alternative uses a frequency-of-frequency array (size n+1) instead of sorting — iterate from the highest frequency down and merge collisions. Both approaches are O(n) but the set-based approach is more intuitive and interview-friendly.

⚠️

Do Not Add 0 to the Used Set

Frequencies that decrement all the way to 0 mean those characters are completely deleted from the string. Do not add 0 to the used set — multiple characters can all have frequency 0 (i.e., be fully removed), so 0 is always a valid landing spot and should never be blocked.

Ready to master algorithm patterns?

YeetCode flashcards help you build pattern recognition through active recall and spaced repetition.

Start practicing now