Dayue's Blog

Leetcode 3. Longest Substring Without Repeatable Characters [MEDIUM]

Given a string s, find the length of the longest substring without repeating characters.

Example 1:

Input: s = "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.
Example 2:

Input: s = "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.
Example 3:

Input: s = "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Notice that the answer must be a substring, "pwke" is a subsequence and not a substring.
Example 4:

Input: s = ""
Output: 0


0 <= s.length <= 5 * 104
s consists of English letters, digits, symbols and spaces.

Solution 1 (Brute force)

To find longest substring without repeatable characters, the most naive solution is to
1. first enumerate all possbile substrings
2. check each substring if there exists repeatable characters in the substring.

Step 1 takes O(N^2) time to enumerate all substrings.
Step 2 takes O(N) time check if a substring contains repeatable chracters with hashtable.

Overall, this naive solution takes O(N^3) time.

Solution 2 (Optimized brute force)

We can optimize our solution based on solution 1. For each starting index i, find the maximum value of starting j such that there are no repeatable characters within the range [i, j); when we find the first repeatable characters, stop executing and increment i. In this way, for eaching starting index i, we don't have to iterate j to the last character of the string. Instead, we can stop iterating j when we first find repeatable chracters (further iteration is redundant).

Java code:

Optimized brute force solution
// Time: O(N^2)
// Space: O(128)
// class Solution {
//     public int lengthOfLongestSubstring(String s) {
//         int ans = 0;
//         for (int i = 0; i < s.length(); i++) {
//             char[] seen = new char[128];
//             int j = i;
//             while (j < s.length() && seen[s.charAt(j)] == 0) {
//                 seen[s.charAt(j)] = 1;
//                 j++;
//             }
//             ans = Math.max(ans, j - i);
//         }
//         return ans;
//     }
// }

1. Time
There are N iterations in total (for loop). For each iteration, it loops at most 128 times because there are only 128 unique chracters in ascii table. Therefore the time complexity is O(N^2) => further reduced to O(N * 128)
2. Space
For each iteration, we keep track of characters that already appear in this iteration. We use an array to check (instead of hashtable, reason: array is cheap, hashtable is expensive). Overall, the space complexity is just the size of the tracking array => O(128)

Solution 3 (Hash Table + Sliding Window)

To further optimize this algorithm, we should think about how we can get the final answer using only one scan of the string. Suppose we have a sliding window [i, j] such that there are no repeatable chracters in the window. Then, the final answer is the maximum possible size of this sliding window.

How to maintain such a window?

  1. We keep iterating the ending point (j) of this sliding window. j monotonically increases.
  2. Find the smallest possible value of i.

Now, the key challenge is how to find the feasible i for each j?

  1. We use a hash table to store the last index of each character.
  2. For an endpoint j (stores character c), find the last index of c (i_c). i should be i_c + 1

However, there might be a case shown as below:

for j = 3 (zero based), it find the last index of a is 0, so it set i = 0 + 1 = 1. However, the window[1, 3] still contains repeatable chracters.

How to solve this issue?
Let i = Math.max(i, last occurrence index of c + 1) // TODO: need further explanation

Java code:

class Solution {
    public int lengthOfLongestSubstring(String s) {
        int[] lastOccur = new int[128];
        Arrays.fill(lastOccur, -1);
        int ans = 0;
        for (int i = 0, j = 0; j < s.length(); j++) {
            i = Math.max(i, lastOccur[s.charAt(j)] + 1);
            ans = Math.max(ans, j - i + 1);
            lastOccur[s.charAt(j)] = j;
        return ans;

1. Time: O(N), reason: one time scan of the input array
2. Space: O(128), reason: use array of size 128 as a hash table.