I have discussed O(n) method for finding first non-repeating character in a given string.
You can find it here Method-1
The above approach takes O(n) time, but in practice it can be improved. The first part of the algorithm runs through the string to construct the count array (in O(n) time). This is reasonable. But the second part about running through the string again just to find the first non-repeater is not good in practice. In real situations, your string is expected to be much larger than your alphabet. Take DNA sequences for example: they could be millions of letters long with an alphabet of just 4 letters. What happens if the non-repeater is at the end of the string? Then we would have to scan for a long time (again).
We can augment the count array by storing not just counts but also the index of the first time you encountered the character e.g. (3, 26) for ‘a’ meaning that ‘a’ got counted 3 times and the first time it was seen is at position 26. So when it comes to finding the first non-repeater, we just have to scan the count array, instead of the string. Thanks to Ben for suggesting this approach.
Hope it helps...
....Next time... :)
You can find it here Method-1
The above approach takes O(n) time, but in practice it can be improved. The first part of the algorithm runs through the string to construct the count array (in O(n) time). This is reasonable. But the second part about running through the string again just to find the first non-repeater is not good in practice. In real situations, your string is expected to be much larger than your alphabet. Take DNA sequences for example: they could be millions of letters long with an alphabet of just 4 letters. What happens if the non-repeater is at the end of the string? Then we would have to scan for a long time (again).
We can augment the count array by storing not just counts but also the index of the first time you encountered the character e.g. (3, 26) for ‘a’ meaning that ‘a’ got counted 3 times and the first time it was seen is at position 26. So when it comes to finding the first non-repeater, we just have to scan the count array, instead of the string. Thanks to Ben for suggesting this approach.
#include <stdlib.h>
#include <stdio.h>
#include <limits.h>
#define NO_OF_CHARS 256
// Structure to store count of a character and index of the first
// occurrence in the input string
struct countIndex {
int count;
int index;
};
/* Returns an array of above structure type. The size of
array is NO_OF_CHARS */
struct countIndex *getCharCountArray(char *str)
{
struct countIndex *count =
(struct countIndex *)calloc(sizeof(countIndex), NO_OF_CHARS);
int i;
for (i = 0; *(str+i); i++)
{
(count[*(str+i)].count)++;
// If it's first occurrence, then store the index
if (count[*(str+i)].count == 1)
count[*(str+i)].index = i;
}
return count;
}
/* The function returns index of first non-repeating
character in a string. If all characters are repeating
then reurns INT_MAX */
int firstNonRepeating(char *str)
{
struct countIndex *count = getCharCountArray(str);
int result = INT_MAX, i;
for (i = 0; i < NO_OF_CHARS; i++)
{
// If this character occurs only once and appears
// before the current result, then update the result
if (count[*(str+i)].count == 1 &&
result > count[*(str+i)].index)
result = i;
}
free(count); // To avoid memory leak
return result;
}
/* Driver program to test above function */
int main()
{
char str[] = "HelloWorldHowAreYou";
int index = firstNonRepeating(str);
if(index == INT_MAX)
printf("Either all characters are repeating or string is empty");
else
printf("First non-repeating character is %c", str[index]);
getchar();
return 0;
}
0 comments:
Post a Comment