Thursday, July 9, 2020

If a String Contains an Anagram of Another String

If a String Contains an Anagram of Another String From last weeks survey, most Gainlo users said that they want to read posts about how to come up with solutions to coding questions. So we decided to start talking about this topic from this week. I think itll be much more helpful by analyzing popular coding interview questions as examples. Id also like to highlight few points that make this article worth reading: We manually select popular coding interview questions that are asked by big companies recently. The whole point of the post is not providing something like standard answers. I want to tell you how to analyze and solve a coding question step by step. In each post, Ill summarize techniques that you may use in other problems. Alright, so here we go. Question How to check if a string contains an anagram of another string? This question was asked by Google few weeks ago and other companies as well (reported by Glassdoor). In case you dont know what anagram is An anagram is a type of word play, the result of rearranging the letters of a word or phrase to produce a new word or phrase, using all the original letters exactly once. For example, string “logain” is an anagram of “gainlo”. To provide an example for this question, string “coding interview questions” contains an anagram of string “weivretni”, however not for “abcde”. Start simple If you have read our previous post A Step-By-Step Guide to Solve Coding Problems, you should know that its recommended to start with a naive solution first and then optimize it step by step. Many people think that the naive solution is too trivial to mention, however, it can be used as a starting point and at the same time you can at least provide a solution. A most straightforward solution is brute force. Suppose we want to check if string A contains an anagram of string B, we can get all the substrings of A and check each of them. To check anagram, one way to do that is to compare the hash of two strings. More specifically, you can map each character to a different prime number and the hash of string is the multiples of all the prime numbers of each character. By doing this, you can check anagram in linear time. If the length of the string is N and M, then the time complexity should be O(N^2 * M). Apparently, this is not ideal and you may tell the interview without even being asked. Optimization One way to think about optimization is to figure out at which point the algorithm is costly. In the above solution, there are two parts: get all substrings O(N^2) and check if its an anagram O(M). If we can make both faster, itll be great. Obviously, if a string is an anagram of another, they should have the same length. In other words, we dont need to check all substrings, but only substring of length M. This operation is of O(N) time. To check if its an anagram, you should realize that you wont be able to reduce it since you should at least iterate a string once. So theres no point thinking about how to reduce O( M) to O(1). However, we can try to merge this iteration into the substring iteration process. To sum up, we keep a sliding window of length M using two indices pointing to the start and end of the substring. The sliding window keeps moving forward. Each time it moves one character, we can re-calculate the hash of the current substring and check if its same as the hash of string B. By doing this, we can reduce the overall time complexity to O(N) without using extra memory. Optimization hash map You should be able to know that O(N) is the minimal time we can have since you should at least iterate string A once. So dont need to waste time thinking about how to make it O(logN). Another way to check the anagram is to use a hash map whose key is the character and value is the frequency of the character in this substring. We first store the character frequency of B inside the hash map. When the sliding window moves forward by one step, update the hash map based on the current substring inside the window. Once the hash map gets zero for all keys, the current sliding window has the anagram substring. Takeaways This is the most important section. Again, the whole point of this article is not providing answers, but focused on analysis. Here are several techniques you can reuse in other problems: When you need to compare things regardless of order (like anagram), you may consider hash, hash map. Sliding window or keep two indices pointing to the start and end is a great way to iterate an array in linear time. This is also used in questions like “find 2 numbers in a sorted array that sum to M”. You should be able to tell the bottleneck of an algorithm and how much it can be optimized. If you have to iterate an array at least once, you dont need to think about O(logN) solution.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.