[OCLUG-devel] Matching Strings
James Colannino
james at colannino.org
Thu Aug 4 13:57:16 PDT 2005
Hey everyone. I'm writing a C program that will report either duplicate
strings in a file or strings that share the same first n characters. For
example, let's say that I have the following file named text.txt:
albert
albert
albany
bert
biscuit
true
trouble
lone
If I were to do the following:
match text.txt
It would search for complete matches and should output the following:
albert
albert
If I were to do the following:
match -n 3 text.txt
It should output the following:
albert
albert
albany
But instead, it outputs this:
albert
albert
albany
albert
albany
I know why it does this. I look for matches in the following way:
read string
search rest of file below it for match
read next string
search rest of file below it for match
Since albert is there twice, it reads the first instance of albert and
outputs all matches below it (the second albert and albany), then reads
the second albert and outputs all the matches below it again (albany),
and so on.
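For illustration, the search described above could be sketched roughly as follows. This is not the actual match.c (which is only linked, not shown here); the names same_prefix and report_naive are hypothetical, the lines are assumed to be already loaded into an array, and n == 0 is assumed to mean "compare whole strings".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: nonzero if a and b match. Whole strings are
 * compared when n == 0, otherwise only the first n characters. */
static int same_prefix(const char *a, const char *b, size_t n)
{
    return (n == 0 ? strcmp(a, b) : strncmp(a, b, n)) == 0;
}

/* Naive search: for every line, scan all lines below it and print the
 * line plus each match. Returns the number of lines written to out. */
size_t report_naive(const char *const lines[], size_t count, size_t n,
                    FILE *out)
{
    size_t printed = 0;

    assert(lines != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        int head_printed = 0;  /* has lines[i] itself been printed yet? */

        for (size_t j = i + 1; j < count; j++) {
            if (!same_prefix(lines[i], lines[j], n))
                continue;
            if (!head_printed) {
                fprintf(out, "%s\n", lines[i]);
                head_printed = 1;
                printed++;
            }
            fprintf(out, "%s\n", lines[j]);
            printed++;
        }
    }
    return printed;
}
```

Nothing in this loop remembers that the second albert and albany were already printed while handling the first albert, so the outer loop starts a second group from the second albert.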
My question is, how can I write my program in such a way that each
matching line shows up only once? In essence, I'd like it to report:
albert
albert
albany
Instead of:
albert
albert
albany
albert
albany
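One possible approach, offered only as a sketch under assumptions: keep a "reported" flag for every line, and skip any line that was already printed as part of an earlier group. The names same_prefix and report_matches are hypothetical, the lines are assumed to be held in memory as an array, and n == 0 is assumed to mean "compare whole strings". This works because matching on the first n characters is transitive: anything that matches an already-reported line would have been reported with it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true if a and b match. Whole strings are
 * compared when n == 0, otherwise only the first n characters. */
static bool same_prefix(const char *a, const char *b, size_t n)
{
    return (n == 0 ? strcmp(a, b) : strncmp(a, b, n)) == 0;
}

/* Print each group of matching lines exactly once.
 * Returns the number of lines written to out (0 if allocation fails). */
size_t report_matches(const char *const lines[], size_t count, size_t n,
                      FILE *out)
{
    size_t printed = 0;
    bool *reported;

    assert(lines != NULL && out != NULL);
    reported = calloc(count, sizeof *reported);  /* all flags start false */
    if (reported == NULL)
        return 0;

    for (size_t i = 0; i < count; i++) {
        if (reported[i])
            continue;  /* already printed with an earlier group */

        for (size_t j = i + 1; j < count; j++) {
            if (reported[j] || !same_prefix(lines[i], lines[j], n))
                continue;
            if (!reported[i]) {
                /* First match found: print the group's first line once. */
                fprintf(out, "%s\n", lines[i]);
                reported[i] = true;
                printed++;
            }
            fprintf(out, "%s\n", lines[j]);
            reported[j] = true;
            printed++;
        }
    }
    free(reported);
    return printed;
}
```

With the eight lines of text.txt and n set to 3, this prints albert, albert, albany and nothing more. The trade-off is that the whole file has to be held in memory (or the flags kept per line number) rather than re-reading the file for each string.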
Here's a link to the source (it's probably not very good, but I tried to
structure and comment it in such a way that it's at least easy to read
-- I hope...)
http://james.colannino.org/match.c
Any help would be greatly appreciated :)
James