Find Anagrams in array of Strings

Problem: You are given an array of strings and you have to print all the anagrams within the array.

What is anagram – For those who don’t know, two words are anagrams if they contain the same characters.
We have already discussed how to check if 2 strings are anagrams here.

Solution:

Let’s use this example to solve the problem [“bat”, “code”, “cat”, “tab”, “cab”, “crazy”, “act”]

Method 1: (Brute Force)

The brute force solution would be to take each string and compare it to every other string in the array one character at a time O(n^4) solution.
Assuming there are n strings with maximum n characters each.

Method 2: O(N^3)

One way we can quicken the string comparison process is by sorting each string alphabetically (any sorting would be fine actually). Pre-sorting helps you avoid comparing each character in a string to each character in another string. You only need to compare every character to the character in the corresponding position of the other string. This reduces the order of the solution ( O(n^2 log n (sorting n strings with n characters each) + n^3 (n^2 comparisons of n characters each)) = O(n^3)) assuming the sorting algorithm is O(n log n)).

However, this increases memory usage since you now need to store the original strings so you can print them out later.
After sorting the characters in the string, the array would look like this [abt”, “cdeo”, “act”, “abt”, “abc”, “acryz”, “act”].

Method 3:

We are comparing each string with every other string. That seems inefficient. How about in addition to sorting the characters in each string, we sort each string in the array alphabetically. Sorting the strings in the array means you do not have to compare each string to every other string, you only have to compare it to the next string in line. If they happen to be the same (i.e. found an anagram), then you can compare with the one after that. Whichever the case is, the order of comparison in this case will be of the order O(n) instead of O(n^2). This sorting will reduce the order to ( O(n^2 log n (sorting characters) + n^2 log n (sorting strings) + n^2 (n comparisons of n characters each)) = O(n^2 log n)
Assuming the sorting algorithm is O(n log n)).

After sorting the strings, the array would look like [“abt”, ""abc“, abt”, "acryz", ”act”, “act”, “act”, “cdeo”].

There is an alternate solution using hashing. Put each string in a hash table ( O(n) ). If the spot in the hash table is already taken, then that is an anagram. The order for this solution is n^2 (hashing n strings) + n (n insertions into hash table) = O(n^2)
Assuming hashing each string is O(n) .

This article is contributed by Rohit Khandelwal. If you have any better solution then please do comment.

0 Thoughts on “Find Anagrams in array of Strings

  1. Ranjeet_Great on May 1, 2014 at 2:05 pm said:

    public class Anagram_String
    {
    public static void main(String arr[])
    {
    String str1=”abc”;
    String str2=”blc”;
    int count=0;
    for(int i=0;i<str1.length();i++)
    {
    for(int j=0;j<str2.length();j++)
    {
    if(str1.charAt(i)==str2.charAt(j))
    count++;
    }

    }
    if(str1.length()==count)
    {
    System.out.println("Its Anagram");}
    }

    }

  2. Turja Chaudhuri on June 27, 2014 at 12:29 am said:

    We can use trie.This will increase space complexity , but might be good for this problem.
    We scan our input array from left to right, for each word,we sort it in a temp string,then insert it into a trie,in the leaf of the trie,for valid words,we maintain a linked list which contains the indices of the words in the original array.In the end we print these indices after traversal through the trie.

  3. venkat on August 15, 2014 at 1:36 pm said:

    No sorting is needed.

    Compare strings using count of characters, sum and product of characters. If all three match, they must be anagrams.

  4. vikash mishra on August 26, 2014 at 8:01 pm said:

    1-enter first string into hash table and cunt field will count the frequency of each character.
    2-for next string for each character if same character is already present then decrease its count by 1.
    3- if counts are zero then strings are anagrams.
    4- repeat for others.

    TC = o(n^2)

  5. Nishant on January 5, 2015 at 3:16 am said:

    @Ranjeet_Great: this does not work for repeating strings try
    str1:”aabcde”
    str2″abcdea”

  6. pankaj on July 29, 2015 at 7:44 am said:

    Efficient solution:
    loop
    traversal of array-0 to N= O(N)
    loop
    sum of ASCII codes for every string- O(N)
    put in hash table as a linklist with values for all same ASCII sum strings on a same key
    end
    end

    print the hashtable

  7. The article is misleading. Hash value is not unique. At most it can be used to break strings into subgroup for further check (combining with string lengthy would enhance performance, too)

Leave a Reply to WL Cancel reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Post Navigation