[OCLUG-devel] Matching Strings

Tim Nation tnation at baipro.com
Thu Aug 4 14:01:18 PDT 2005


> Hey everyone.  I'm writing a C program that will report either duplicate
> strings in a file or strings that are the same up to n characters.  For
> example, let's say that I have the following file named text.txt:

<snip>

> But instead, it outputs this:
>
> albert
> albert
> albany
> albert
> albany
>
> I know why it does this.  I look for matches in the following way:
>
> read string
> search rest of file below it for match
> read next string
> search rest of file below it for match
>
> Since albert is there twice, it reads the first instance of albert,
> outputs all matches below it, reads the second albert, and outputs all
> the matches below it, and so on.

<snip>

How about removing the matching lines from the in-memory record as they are
processed? That way they can't be processed more than once. As long as you
are only reading the text file and not writing it back, you are free to
manipulate the copy in memory however you want.
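
A minimal sketch of that idea in C, assuming the file has already been read
into an array of string pointers. The function name report_matches and its
parameters are made up for illustration and are not from the original
program. "Removing" a line here just means setting its pointer to NULL so
later passes skip it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: print every group of lines that share their first
 * n characters, printing each line at most once. A matched line is
 * "removed" from the in-memory record by setting its pointer to NULL.
 * Only the pointer array is modified; the file on disk is never touched.
 * Returns the number of lines written to out.
 */
size_t report_matches(const char **lines, size_t count, size_t n, FILE *out)
{
    size_t printed = 0;

    assert(lines != NULL && out != NULL);

    for (size_t i = 0; i < count; i++) {
        if (lines[i] == NULL)
            continue;           /* already reported with an earlier group */

        int found = 0;
        for (size_t j = i + 1; j < count; j++) {
            if (lines[j] == NULL || strncmp(lines[i], lines[j], n) != 0)
                continue;
            if (!found) {       /* first match: print the reference line */
                fprintf(out, "%s\n", lines[i]);
                printed++;
                found = 1;
            }
            fprintf(out, "%s\n", lines[j]);
            printed++;
            lines[j] = NULL;    /* remove so it is never processed again */
        }
    }
    return printed;
}
```

Called on an array holding albert, bob, albert, albany, carol with n = 3, it
prints albert, albert, albany once each. For exact duplicates only, pass an n
at least as long as the longest line, since strncmp stops at the terminating
NUL.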

Tim


