Search for a pattern in a given string
Given two strings, text and pattern, of sizes N and M respectively (N > M), the task is to print all occurrences of pattern in text.
Examples:
Input: text = “This is a dummy text”, pattern = “This”
Output: Pattern found at indices: 0
Explanation: The pattern “This” starts from index 0 in the given text.
Input: text = “Welcome to Geeks for Geeks”, pattern = “Geeks”
Output: Pattern found at indices: 21 11
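Explanation: The pattern “Geeks” starts at indices 11 and 21 (counting the white spaces). The indices are not printed in sorted order because the approach stores them in hash sets, which have no defined iteration order.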
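As a quick cross-check of the indices in these examples, the following sketch lists every match with Python's built-in `str.find`. The function name `find_all_builtin` is hypothetical, and this is only a baseline for verifying results, not the approach described in this article.

```python
def find_all_builtin(text, pattern):
    """Return the start indices of every match, in increasing order."""
    indices = []
    start = text.find(pattern)
    while start != -1:
        indices.append(start)
        # Resume one position later so overlapping matches are also reported.
        start = text.find(pattern, start + 1)
    return indices


print(find_all_builtin("This is a dummy text", "This"))
print(find_all_builtin("Welcome to Geeks for Geeks", "Geeks"))
```

Any other implementation should report the same set of indices, although possibly in a different order.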
Approach: The approach for this problem is based on the following idea:
Find every index in the text where the first character of the pattern occurs; these are the possible starting points of a match. Then, for each of those indices, check whether the characters that follow it match the remaining characters of the pattern.
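The idea can be illustrated with a minimal sketch that collects the candidate starting points and then confirms each one directly. The helper names `candidate_starts` and `matches_at` are hypothetical, the pattern is assumed to be non-empty, and this version skips the hash sets and the queue used in the steps that follow.

```python
def candidate_starts(text, pattern):
    """Positions in text that hold the pattern's first character."""
    return [i for i, ch in enumerate(text) if ch == pattern[0]]


def matches_at(text, pattern, start):
    """Check whether the characters from `start` onward spell the pattern."""
    return text[start:start + len(pattern)] == pattern


text = "Welcome to Geeks for Geeks"
pattern = "ek"
starts = candidate_starts(text, pattern)
confirmed = [s for s in starts if matches_at(text, pattern, s)]
print(starts)     # [1, 6, 12, 13, 22, 23]
print(confirmed)  # [13, 23]
```

Only two of the six candidate positions survive the check, which is the filtering that the queue performs one character at a time.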
The above idea can be implemented using a queue. Follow the steps mentioned below to implement the idea.
- First, use an array of 256 unordered_set objects, one per character code. Traverse the text and insert every index into the set of the respective character.
C++
for (int i = 0; i < st.length(); i++)
    // Insert every index to the hash set using character
    // ASCII.
    structured_text[st[i]].insert(i);
- Search for the first character of the pattern in the array and push every index contained in the hash set into a queue.
C++
for (int ind : structured_text[pattern[0]])
    q_indices.push(ind);
- Then traverse through pattern from i = 1 to M-1:
- For each index in the queue (a position where pattern[i-1] matched), check whether the next index is present in structured_text[pattern[i]]. If it is, push that next index to the queue for the next iteration.
- Otherwise, drop the index and continue checking the other positions.
C++
for (int i = 1; i < pattern.length(); i++) {
    char ch = pattern[i];
    int q_size = q_indices.size();

    /* the queue contains the number of occurrences of the
       previous character. traverse the queue for q_size times
       Check the next character of the pattern found or not. */
    while (q_size--) {
        int ind = q_indices.front();
        q_indices.pop();
        if (structured_text[ch].find(ind + 1) != structured_text[ch].end())
            q_indices.push(ind + 1);
    }
}
Python3
for i in range(1, pat_len):
    ch = pattern[i]
    q_size = len(q_indices)

    ## The queue contains the
    ## number of occurrences of
    ## the previous character.
    ## Traverse the queue for
    ## q_size times.
    ## Check the next character of
    ## the pattern found or not.
    while q_size > 0:
        q_size -= 1
        ind = q_indices[0]
        q_indices.pop(0)
        if (ind + 1) in structured_text[ord(ch)]:
            q_indices.append(ind + 1)

# This code is contributed by akashish_.
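- The indices left in the queue after the last iteration mark the final character of each complete match. Subtract M - 1 from each of them to get the starting indices and return those.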
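To make the steps above concrete, the sketch below records the queue after it is seeded and after every later round. The function name `trace_rounds` is hypothetical, a dictionary of sets stands in for the 256 sized array, the pattern is assumed to be non-empty, and the seed is sorted purely so the trace is easy to read.

```python
from collections import defaultdict, deque


def trace_rounds(text, pattern):
    """Return the queue contents after seeding and after each later round."""
    positions = defaultdict(set)      # character -> set of its indices in text
    for i, ch in enumerate(text):
        positions[ch].add(i)

    # Seed the queue with every occurrence of the first pattern character.
    queue = deque(sorted(positions[pattern[0]]))
    history = [list(queue)]

    for ch in pattern[1:]:
        for _ in range(len(queue)):   # process only the entries of this round
            ind = queue.popleft()
            if ind + 1 in positions[ch]:
                queue.append(ind + 1)  # the match extends by one character
        history.append(list(queue))
    return history


for step, contents in enumerate(trace_rounds("Welcome to Geeks for Geeks", "eks")):
    print(step, contents)
# 0 [1, 6, 12, 13, 22, 23]
# 1 [14, 24]
# 2 [15, 25]
```

The final queue holds the indices of the last matched character, so subtracting the pattern length minus one (here 2) gives the starting indices 13 and 23.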
Below is the implementation for the above approach:
C++
// C++ code for the above approach:
#include <bits/stdc++.h>
using namespace std;

// Using a 256 sized array of hash sets, one per byte value.
unordered_set<int> structured_text[256];

// Function to perform the hashing
void StringSearch(const string& st)
{
    // Clear any indices left over from a previous call.
    for (auto& positions : structured_text)
        positions.clear();

    // Structure the text. It will be helpful in pattern searching.
    for (int i = 0; i < (int)st.length(); i++)
        // Insert every index to the hash set using character ASCII.
        // The cast avoids a negative array index when char is signed.
        structured_text[(unsigned char)st[i]].insert(i);
}

// Function to search the pattern
void pattern_search(const string& st, const string& pattern)
{
    StringSearch(st);

    // Queue contains the candidate indices
    queue<int> q_indices;
    for (int ind : structured_text[(unsigned char)pattern[0]])
        q_indices.push(ind);

    // Pattern length
    int pat_len = pattern.length();
    for (int i = 1; i < pat_len; i++) {
        unsigned char ch = pattern[i];
        int q_size = q_indices.size();

        // The queue holds every position where the pattern
        // has matched up to the previous character.
        // Traverse the queue q_size times and check whether
        // the next character of the pattern follows.
        while (q_size--) {
            int ind = q_indices.front();
            q_indices.pop();
            if (structured_text[ch].find(ind + 1) != structured_text[ch].end())
                q_indices.push(ind + 1);
        }
    }

    cout << "Pattern found at indexes:";
    while (!q_indices.empty()) {
        // last_ind is the last index of the pattern in the text
        int last_ind = q_indices.front();
        q_indices.pop();
        cout << " " << last_ind - (pat_len - 1);
    }
    cout << endl;
}

// Driver code
int main()
{
    // Passing the Text
    string text = "Welcome to Geeks for Geeks";
    string pattern = "Geeks";

    // Function call
    pattern_search(text, pattern);
    return 0;
}
Python3
# Python program for the above approach:
from collections import deque

## Using a 256 sized array of hash sets.
## This assumes every character code in the text is below 256.
structured_text = [set() for _ in range(256)]


## Function to perform the hashing
def StringSearch(st):
    ## Clear any indices left over from a previous call.
    for positions in structured_text:
        positions.clear()

    ## Structure the text. It will be helpful in pattern searching.
    for i in range(len(st)):
        ## Insert every index to the hash set using character ASCII.
        structured_text[ord(st[i])].add(i)


## Function to search the pattern
def pattern_search(st, pattern):
    StringSearch(st)

    ## Queue contains the candidate indices.
    ## A deque pops from the front in constant time.
    q_indices = deque(structured_text[ord(pattern[0])])

    ## Pattern length
    pat_len = len(pattern)
    for i in range(1, pat_len):
        ch = pattern[i]
        q_size = len(q_indices)

        ## The queue holds every position where the pattern
        ## has matched up to the previous character.
        ## Traverse the queue q_size times and check whether
        ## the next character of the pattern follows.
        while q_size > 0:
            q_size -= 1
            ind = q_indices.popleft()
            if (ind + 1) in structured_text[ord(ch)]:
                q_indices.append(ind + 1)

    print("Pattern found at indexes:", end="")
    while len(q_indices) > 0:
        ## last_ind is the last index of the pattern in the text
        last_ind = q_indices.popleft()
        print(" " + str(last_ind - (pat_len - 1)), end="")
    print("")


## Driver code
if __name__ == '__main__':
    ## Passing the Text
    text = "Welcome to Geeks for Geeks"
    pattern = "Geeks"

    ## Function call
    pattern_search(text, pattern)

# This code is contributed by subhamgoyal2014.
Pattern found at indexes: 21 11
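Time Complexity: O(N + K * M) on average, where K is the number of occurrences of the pattern's first character in the text. Building the index takes O(N) expected time with hash sets, and each of the M - 1 rounds performs at most K lookups of expected O(1) cost.
Auxiliary Space: O(N + d), where d = 256 is the size of the array of unordered_set and the sets together store all N indices of the text.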
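To get a feel for how the work in the matching phase depends on how often the pattern's characters occur, the sketch below counts the set-membership tests performed by the queue-based search. The function name `count_lookups` is hypothetical, a dictionary of sets stands in for the 256 sized array, and the pattern is assumed to be non-empty.

```python
from collections import defaultdict, deque


def count_lookups(text, pattern):
    """Count the membership tests made while extending candidate matches."""
    positions = defaultdict(set)      # character -> set of its indices in text
    for i, ch in enumerate(text):
        positions[ch].add(i)

    queue = deque(positions[pattern[0]])
    lookups = 0
    for ch in pattern[1:]:
        for _ in range(len(queue)):   # one lookup per surviving candidate
            ind = queue.popleft()
            lookups += 1
            if ind + 1 in positions[ch]:
                queue.append(ind + 1)
    return lookups


print(count_lookups("Welcome to Geeks for Geeks", "Geeks"))  # 8
print(count_lookups("a" * 10, "a" * 4))                      # 27
```

In the first call only two candidates exist, so each of the four rounds costs two lookups. In the second call almost every position stays in the queue through every round, so the count grows with both the number of candidates and the pattern length.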