Efficient String Permutation Check Using Sliding Window Technique
Understanding the Permutation Substring Problem
When solving Leetcode problem 567, we need to determine if any permutation of string s1 exists within string s2. Permutations are rearrangements of the same characters in different orders. For example, "ab" permutations include "ab" and "ba". The challenge is efficiently checking for these permutations in larger strings.
The core insight: All permutations share identical character frequencies. Whether "ab" appears as "ab" or "ba", both contain exactly one 'a' and one 'b'. This means we can solve the problem by comparing character counts rather than checking every possible rearrangement.
Key Constraints and Optimization Opportunities
The problem specifies strings contain only lowercase English letters. This constraint allows optimization since we know there are exactly 26 possible characters. We leverage this by:
- Using a fixed-size integer array (size 26) for frequency counting
- Calculating character indices via
char - 'a'(e.g., 'b' - 'a' = 1) - Avoiding expensive hash maps when unnecessary
Critical observation: If constraints allowed extended character sets (uppercase/digits/symbols), we'd need a hash map instead. This variation is important for real-world applications.
Step-by-Step Solution Methodology
Frequency Counting Initialization
First, we create a frequency map for s1:
int[] s1Freq = new int[26];
for (char c : s1.toCharArray()) {
s1Freq[c - 'a']++;
}
This stores counts for each character in s1. For s1 = "ab", s1Freq[0] = 1 ('a') and s1Freq[1] = 1 ('b').
Sliding Window Implementation
We then slide a window of length s1.length() through s2:
int windowSize = s1.length();
int[] windowFreq = new int[26];
for (int i = 0; i <= s2.length() - windowSize; i++) {
// Reset window frequency for new segment
Arrays.fill(windowFreq, 0);
// Populate frequency for current window
for (int j = 0; j < windowSize; j++) {
char c = s2.charAt(i + j);
windowFreq[c - 'a']++;
}
// Check frequency match
if (Arrays.equals(s1Freq, windowFreq)) {
return true;
}
}
return false;
Key efficiency note: The nested loop leads to O(n*m) time complexity. We'll optimize this next.
Optimized Frequency Updates
Instead of recalculating the entire window for each position, we update frequencies incrementally:
int[] windowFreq = new int[26];
// Initialize first window
for (int i = 0; i < windowSize; i++) {
windowFreq[s2.charAt(i) - 'a']++;
}
if (Arrays.equals(s1Freq, windowFreq)) return true;
// Slide window
for (int i = windowSize; i < s2.length(); i++) {
// Remove leftmost character
windowFreq[s2.charAt(i - windowSize) - 'a']--;
// Add new character
windowFreq[s2.charAt(i) - 'a']++;
if (Arrays.equals(s1Freq, windowFreq)) return true;
}
This reduces time complexity to O(n) by:
- Removing the leftmost character from the count
- Adding the new right character
- Maintaining constant-time comparison (26 checks)
Handling Extended Character Sets
When dealing with uppercase/digits/symbols, replace the array with a hash map:
Map<Character, Integer> s1Freq = new HashMap<>();
for (char c : s1.toCharArray()) {
s1Freq.put(c, s1Freq.getOrDefault(c, 0) + 1);
}
Comparison logic changes: Instead of Arrays.equals(), check mapA.equals(mapB). This handles any character set but requires more memory.
Performance Analysis
| Approach | Time Complexity | Space | Best Use Case |
|---|---|---|---|
| Double Loop | O(n*m) | O(1) | Small strings |
| Optimized Sliding Window | O(n) | O(1) | Most cases |
| Hash Map | O(n) | O(k) | Extended character sets |
Practical Implementation Checklist
- Verify constraints: Check if input is only lowercase letters
- Initialize frequency counter: Array for a-z, hash map otherwise
- Set window size: Equal to
s1.length() - Slide window through s2: Update counts incrementally
- Compare frequencies: Use constant-time array comparison when possible
- Edge case handling: Empty strings, unequal lengths
Common Pitfalls and Solutions
- Mismatched lengths: If
s1is longer thans2, immediately return false - Case sensitivity: Convert to lowercase if constraints allow mixed cases
- Unicode characters: Use
Character.codePointAt()for multi-byte characters - Overcounting: Reset window frequencies properly between iterations
Advanced Optimization Techniques
For large inputs:
- Early termination: Track matched characters count instead of full comparison
- Skip invalid windows: Jump ahead when encountering characters not in
s1 - Parallel processing: Divide
s2into segments for multi-threaded search
Pro tip: The same approach solves anagram detection problems by treating both strings as s1 and s2 with equal lengths.
Conclusion
The sliding window technique with frequency counting provides an optimal solution for permutation substring checks. By leveraging character constraints and incremental updates, we achieve O(n) time complexity. The core principle remains: permutations share identical character distributions regardless of order.
When implementing this solution, which character set variation do you anticipate being most challenging in your projects? Share your use cases below!