[OCLUG-devel] GNU code with security holes?!! [was: Trouble with strings and files]

Christopher Smith x at xman.org
Thu Jul 8 14:41:47 PDT 2004


On Wed, 2004-07-07 at 22:11, Jack Denman wrote:
> *****************************************
> 
> /* Read an arbitrarily long line of text from STREAM into LINEBUFFER.
>    Remove any newline.  Does not null terminate.
>    Return LINEBUFFER, except at end of file return 0.  */
> 
> struct linebuffer *readline (struct linebuffer *line, FILE *stream)
> {
>     int         c;
>     char        *buffer = line->buffer;
>     char        *p = line->buffer;
>     char        *end = buffer + line->size;   /* Sentinel.                  */
> 
>     c = getc (stream);
>     if (feof (stream)) {
>         line->length = 0;
>         return (NULL);
>     } else {
>         ungetc(c, stream);
>     }
> 
>     for ( ; ; ) {
>         c = getc (stream);
>         if (c == '\r')
>             continue;
>         if (p == end) {
>             line->size *= 2;
>             buffer = xrealloc (buffer, (size_t) line->size);
>             p += buffer - line->buffer;
>             line->buffer = buffer;
>             end = buffer + line->size;
>         }
>         if (c == EOF || c == '\n') {
>             c = '\0';
>             *p++ = (char) c;
>             break;
>         }
>         *p++ = (char) c;
>     }
> 
>     if (feof (stream) && p == buffer) {
>         line->length = 0;
>         return(NULL);
>     }
>     line->length = p - line->buffer - 1;
>     return (line);
> }

So, now that I've mostly caught up on my sleep, I had a chance to look
at this code, and my code. My main observation is actually that my code
doesn't have a buffer overflow but ironically, yours does. Some
observations:

First of all, having looked at my code sample with alert eyes, and
having context switched from my C++ world (where STL does all this stuff
so wonderfully for me), I feel pretty confident saying that fgets is
being used properly in my original code sample, and would not introduce
a buffer overflow. If I'm wrong, I'd really appreciate someone detailing
my mistake.

The one thing I could have improved on was to use strtol() instead of
atoi(), so that underflow and overflow conditions could be detected.
Since it wasn't clear how the case of how underflow and overflow should
be handled it didn't occur to me to consider that possibility, but it's
just good hygene to worry about such things. As it stands, the output is
going to be incorrect two-thirds of the time for lines where the numbers
are larger than INT_MAX, but this won't introduce a security problem
such as a buffer overflow.

This readline() function is quite different in behavior from fgets(),
and as such, depending on how you look at the original program it would
be unwise to use readline() (in particular, unless you start using
unlimited precision integers, there's no point in reading in more than
50 characters on most platforms). The code also drops carriage returns
and line feeds from the file it reads in, and makes a carriage return
and an EOF indistinguishable, which is not necessarily desireable.
However, these are all minor points that really shouldn't be a concern,
particularly given the scope of the program.

The code isn't entirely portable C all by itself, as you also need to
provide an implementation of xrealloc (no biggie here). Most
implementations of xrealloc exit with an error if the line is too large
to fit in memory, and as mentioned before, that may not be the
appropriate behavior here. Still, xrealloc is simple to implement, and
in this case the error behavior seems appropriate.

The code is also not entirely portable, as different platforms treat
carriage return and line feed characters differently. That may be an
issue, but this code should be portable to POSIX systems as well as
Windows, so no big concern here either.

The handling of the (c == EOF) || (c == '\n') case, while perfectly good
code, could be improved. For starters, I'd test this before I'd test
against '\r', because one would hope that the '\r' case would occur less
often. More importantly this snippet:

>            c = '\0';
>             *p++ = (char) c;
>             break;

seems quite torturous to me. Why widen '\0' to an int, assign it to
c, and cast c to char, and then assign it to p, particularly given the
fact that c won't be used again. Why not just assign '\0' to p directly?

I'm also trying to understand the point of the ungetc() when your
next operation on the stream is guaranteed to be getc(). It seems
smarter to just move the getc currently at the beginning loop to the
end of the loop.

The code could also be much more efficient for the case where FILE* is
pointing to a seekable stream (or even better something that can be
memory mapped, but of course memory mapping isn't portable). If you test
first for whether the stream is seekable (presumably a common case) you
can jump from a O(nlog(n)) algorithm to an O(n) algorithm.

The big concern I have about this (and it actually makes me a little
suspect and concerned about this being the "GNU way" to code) is that
using this function actually *introduces* a security hole.

Yup, you got that. When I first saw that I was so surprised I read it
over again to make sure I wasn't mistaken, but I'm quite sure of it now.

The problem is in this line of code:

line->size *= 2;

This can can cause line->size to overflow, which in turn would result in
the buffer being reallocated to a *smaller* size, but p would be updated
to point to a location as if memory had grown. All subsequent writes to
p would constitute a buffer overflow.

The simple fix to this would be to change the line to:

line->size = (line->size < (SIZE_MAX/2)) ? line->size*2 : SIZE_MAX;

Given that realloc(SIZE_MAX) has to fail, it's a good idea to do
something more clever here, but at least this correction eliminates the
security problem.

So Jack, if this is standard GNU code, do you know of any code bases
where it exists? I'd like to send in patches to fix it ASAP.

-- 
Christopher Smith <x at xman.org>


More information about the OCLUG-devel mailing list