From james at colannino.org Mon Jul 5 19:57:57 2004 From: james at colannino.org (James Colannino) Date: Mon, 05 Jul 2004 19:57:57 -0700 Subject: [OCLUG-devel] invalid initializer? Message-ID: <40EA1535.5070407@colannino.org> Hey everyone. I know that you can do, say, the following: char string[] = "Hello"; But, I tried the following: char filename[] = argv[1]; This resulted in the following compiler error: error: invalid initializer What does this mean? I thought that you could assign a string to a variable-length character array at the time you initialize it. Am I missing something? James -- My blog: http://www.crazydrclaw.com/ My homepage: http://james.colannino.org/ "There are no uninteresting things; only uninterested people." --G.K. Chesterton From johnhscs at scsoftware.sc-software.com Mon Jul 5 20:35:39 2004 From: johnhscs at scsoftware.sc-software.com (John Heil) Date: Mon, 5 Jul 2004 20:35:39 -0700 (PDT) Subject: [OCLUG-devel] invalid initializer? In-Reply-To: <40EA1535.5070407@colannino.org> References: <40EA1535.5070407@colannino.org> Message-ID: On Mon, 5 Jul 2004, James Colannino wrote: > Date: Mon, 05 Jul 2004 19:57:57 -0700 > From: James Colannino > To: oclug-devel at oclug.org > Subject: [OCLUG-devel] invalid initializer? > > Hey everyone. I know that you can do, say, the following: > > char string[] = "Hello"; > > But, I tried the following: > > char filename[] = argv[1]; > > This resulted in the following compiler error: > > error: invalid initializer > > What does this mean? I thought that you could assign a string to a > variable-length character array at the time you initialize it. Am I > missing something? > > James The char filename[] is an array of undefined length in units of characters. argv[1] is a specific array entry which came from, IIRC, char **argv, thus argv[1] is a char * ie a pointer to a parameter value which is a 0x00 terminated string. 
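A minimal sketch, not taken from the thread, of what can legally be done with a char * such as argv[1]: either keep the pointer itself, or copy the characters into an array that has a definite size. The helper name copy_arg and the return-code convention are assumptions made for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, which has room for n bytes.
   Returns 0 on success, or -1 (leaving dst untouched) if src plus its
   terminating NUL would not fit. */
static int copy_arg(char *dst, size_t n, const char *src)
{
    if (n == 0 || strlen(src) >= n)
        return -1;
    strcpy(dst, src);   /* safe here: the length was checked above */
    return 0;
}

/* Typical use inside main(int argc, char **argv), after checking argc:
 *
 *     const char *alias = argv[1];   // no copy, just a second pointer
 *     char filename[256];            // a real array with its own storage
 *     if (copy_arg(filename, sizeof filename, argv[1]) != 0)
 *         return 1;                  // argument too long for the buffer
 */
```

The array form only works with a string literal as its initializer because the compiler can see the literal's length at compile time; a pointer carries no such information.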
You'd need to give filename a size, like char filename[256], then use something like strcpy to copy argv[1]'s contents into it (check the length first, or use strncpy, so a long argument can't overrun the array). You should probably do man strcpy or man strncpy just for the experience of using man for function lookup, but any C text ought to describe them.

Also you could do

    char *filename[1];

Then

    filename[0] = argv[1];

ought to work just fine because you're placing a pointer into a specific location in an array of pointers, both of which point to chars. (A plain char *filename = argv[1]; does the same job without the array.)

johnh

- -----------------------------------------------------------------
John Heil
South Coast Software
Custom systems software for UNIX and IBM MVS mainframes
1-714-774-6952
johnhscs at sc-software.com
http://www.sc-software.com
-----------------------------------------------------------------

From james at colannino.org Tue Jul 6 19:59:16 2004
From: james at colannino.org (James Colannino)
Date: Tue, 06 Jul 2004 19:59:16 -0700
Subject: [OCLUG-devel] Trouble with strings and files
Message-ID: <40EB6704.4020700@colannino.org>

Hey everyone. One of the programming exercises in my C book asked me to write a program that will take numbers from a file, place all the numbers divisible by 3 in one file, and all the numbers not divisible by 3 in another file. I wrote a program that compiles without any fatal errors (albeit I do get some warnings; I will post if asked), but when I run the program with the correct number of arguments, it seg faults. I ran strace on it to see what it was failing on, but had trouble locating the exact problem. Below are URLs pointing to both the C source and strace's output:

http://james.colannino.org/mailinglist/strace.output
http://james.colannino.org/mailinglist/twolist.c

I'm sure there's some blatantly obvious and stupid thing that I'm doing, but for the life of me, I can't figure out what it is. I'm hoping that someone more experienced can hit me over the head with the correct answer :) Thanks very much in advance.
James -- My blog: http://www.crazydrclaw.com/ My homepage: http://james.colannino.org/ "There are no uninteresting things; only uninterested people." --G.K. Chesterton From tthelin at sbcglobal.net Tue Jul 6 21:13:49 2004 From: tthelin at sbcglobal.net (Tim Thelin) Date: Tue, 06 Jul 2004 21:13:49 -0700 Subject: [OCLUG-devel] Trouble with strings and files In-Reply-To: <40EB6704.4020700@colannino.org> References: <40EB6704.4020700@colannino.org> Message-ID: <40EB787D.1020002@sbcglobal.net> James Colannino wrote: > Hey everyone. One of the programming excercises in my C book asked me > to write a program that will take numbers from a file, place all the > numbers divisible by 3 in one file, and all the numbers not divisible > by 3 in another file. I wrote a program that compiles without any > fatal errors (albeit I do get some warnings; I will post if asked), > but when I run the program with the correct number of arguments, it > seg faults. I ran strace on it to see what it was failing on, but had > trouble locating the exact problem. Below are URLs pointing to both > the C source and strace's output: > > http://james.colannino.org/mailinglist/strace.output > http://james.colannino.org/mailinglist/twolist.c > > I'm sure there's some blatantly obvious and stupid thing that I'm > doing, but for the life of me, I can't figure out what it is. I'm > hoping that someone more experienced can hit me over the head with the > correct answer :) Thanks very much in advance. > > James You probably only want to use strace for debugging when your trying to figure out what system calls are being used and with what arguments. Otherwise its not too useful as a general debugging tool. Getting familiar with gdb (or one of its graphical front ends) would probably be more useful to you. I believe the issue is your sscanfs for turning the number into a string. You should actually be using sprintf there, not sscanf. 
scanf is for taking a string and parsing it into smaller units (scanning); printf is used for taking smaller units and turning them into a single string. So you want to take your smaller unit (the number) and turn it into the string you want to place in the file with your fputs.

It's probably segfaulting because sscanf needs a string as its first arg and you're passing in a number; remember how strings are just char pointers... and numbers can be cast to a char* quite easily... so sscanf is using your number as a memory address and seg faulting. One of the compiler's warnings probably hinted at this (as a personal rule I always compile with -Wall (all warnings) and make sure I've solved every warning before continuing, or at least made sure I know why I'm ignoring a warning).

Btw you could combine your (soon to be) sprintfs and fputs into a single fprintf, and you could combine your fgets and sscanf into a single fscanf.

- Tim Thelin

From james at colannino.org Tue Jul 6 21:19:35 2004
From: james at colannino.org (James Colannino)
Date: Tue, 06 Jul 2004 21:19:35 -0700
Subject: [OCLUG-devel] Trouble with strings and files
In-Reply-To: <40EB787D.1020002@sbcglobal.net>
References: <40EB6704.4020700@colannino.org> <40EB787D.1020002@sbcglobal.net>
Message-ID: <40EB79D7.2020008@colannino.org>

Tim Thelin wrote:
> James Colannino wrote:
>
>> Hey everyone. One of the programming excercises in my C book asked me
>> to write a program that will take numbers from a file, place all the
>> numbers divisible by 3 in one file, and all the numbers not divisible
>> by 3 in another file. I wrote a program that compiles without any
>> fatal errors (albeit I do get some warnings; I will post if asked),
>> but when I run the program with the correct number of arguments, it
>> seg faults. I ran strace on it to see what it was failing on, but had
>> trouble locating the exact problem.
Below are URLs pointing to both >> the C source and strace's output: >> >> http://james.colannino.org/mailinglist/strace.output >> http://james.colannino.org/mailinglist/twolist.c >> >> I'm sure there's some blatantly obvious and stupid thing that I'm >> doing, but for the life of me, I can't figure out what it is. I'm >> hoping that someone more experienced can hit me over the head with the >> correct answer :) Thanks very much in advance. >> >> James > > > You probably only want to use strace for debugging when your trying to > figure out what system calls are being used and with what arguments. > Otherwise its not too useful as a general debugging tool. Getting > familiar with gdb (or one of its graphical front ends) would probably be > more useful to you. > > I believe the issue is your sscanfs for turning the number into a > string. You should actually be using sprintf there, not sscanf. scanf > is for turning a string something and parsing it into smaller units > (scaning); printf is used for taking smaller units and turning them into > a single string. So you want to take your smaller unit (the number) and > turn it into the string you want to place in the file with your fputs. > > Its probably segfaulting because scanf needs a string as a first arg and > your passing in a number; remember how strings are just char pointers... > and numbers can be cast as a char* quite easily... so scanf is using > your number as a memory address and seg faulting. One of the compiler's > warnings probably hinted at this (as a personal rule I always compile > with -Wall (all warnings) and make sure i've solved every warning before > continuingor at least made sure i know why i'm ignoring a warning). > > Btw you could combine your (soon to be) sprintfs and fputs into a single > fprintf, and you could combine your fgets and scanf into a single fscanf. Extremely helpful. 
Thank you :)

James
--
My blog: http://www.crazydrclaw.com/
My homepage: http://james.colannino.org/

"There are no uninteresting things; only uninterested people." --G.K. Chesterton

From john at jjdev.com Wed Jul 7 10:36:10 2004
From: john at jjdev.com (johnd)
Date: Wed, 7 Jul 2004 10:36:10 -0700
Subject: [OCLUG-devel] Trouble with strings and files
In-Reply-To: <40EB6704.4020700@colannino.org>
References: <40EB6704.4020700@colannino.org>
Message-ID: <20040707173610.GA27689@stang.jjdev.com>

James,

just a small pointer... if you do an 'if x == something' then you don't really need to do an 'if x != something'; you could just do an else... that is pretty much what else is for.

I would have done it like:

    if ((number % 3) == 0) {
        sprintf (line, "%d\n", number);   /* sprintf, not sscanf: number -> text */
        fputs (line, div_file);
    } else {
        sprintf (line, "%d\n", number);
        fputs (line, other_file);
    }

On Tue, Jul 06, 2004 at 07:59:16PM -0700, James Colannino wrote:
> Hey everyone. One of the programming excercises in my C book asked me
> to write a program that will take numbers from a file, place all the
> numbers divisible by 3 in one file, and all the numbers not divisible by
> 3 in another file. I wrote a program that compiles without any fatal
> errors (albeit I do get some warnings; I will post if asked), but when I
> run the program with the correct number of arguments, it seg faults. I
> ran strace on it to see what it was failing on, but had trouble locating
> the exact problem. Below are URLs pointing to both the C source and
> strace's output:
>
> http://james.colannino.org/mailinglist/strace.output
> http://james.colannino.org/mailinglist/twolist.c
>
> I'm sure there's some blatantly obvious and stupid thing that I'm doing,
> but for the life of me, I can't figure out what it is. I'm hoping that
> someone more experienced can hit me over the head with the correct
> answer :) Thanks very much in advance.
>
> James
> --
> My blog: http://www.crazydrclaw.com/
> My homepage: http://james.colannino.org/
>
> "There are no uninteresting things; only uninterested people." --G.K.
> Chesterton

From x at xman.org Wed Jul 7 22:46:30 2004
From: x at xman.org (Christopher Smith)
Date: Wed, 07 Jul 2004 22:46:30 -0700
Subject: [OCLUG] Re: [OCLUG-devel] Trouble with strings and files
In-Reply-To: <200407072211.38416.cashmere@adelphia.net>
References: <40EB6704.4020700@colannino.org> <40EB7DD1.2050405@colannino.org> <1089177489.4301.14.camel@diffie> <200407072211.38416.cashmere@adelphia.net>
Message-ID: <1089265590.8027.20.camel@localhost>

On Wed, 2004-07-07 at 22:11, Jack Denman wrote:
> On Tuesday 06 July 2004 10:19 pm, Christopher Smith wrote:
> > On Tue, 2004-07-06 at 21:36, James Colannino wrote:
> > > I was wrong. I should not have used EOF, but instead, should have used
> > > NULL. The program didn't work until I made that change. C is fun... :)
> >
> > Hehe. Good on you for figuring that out! I was about to point that one
> > out to you.
> >
> > I'd also personally recommend changing your core logic to this:
> >
> > while (fgets(line, sizeof(line), in_file)) {
> >     FILE* target = div_file;
> >     int number = atoi(line);
> >     if ((number % 3) != 0) { /* != 0 is gratuitous, but clearer */
> >         target = other_file;
> >     }
> >     fputs(line, target);
> > }
> >
> > A number of advantages to this approach:
> >
>
> fgets should not be used because of buffer overflow, a security violation and
> the gcc compiler warns about that. The GNU way is as follows:

Not sure how this ended up traveling back on to the main list.

Yes, fgets in general is not a safe function, but James is trying to write his first couple of programs in C; he is still learning how heap allocations work.
Give him a chance to learn how C works first, so he can understand why he can't use fgets (I was hesitant to even introduce atoi(), but given that scanf is more complex than atoi(), it was probably a good idea)! Of course, even if we used the getline function you describe, we'd have other problems to worry about. -- Christopher Smith From x at xman.org Wed Jul 7 23:14:15 2004 From: x at xman.org (Christopher Smith) Date: Wed, 07 Jul 2004 23:14:15 -0700 Subject: [OCLUG] Re: [OCLUG-devel] Trouble with strings and files In-Reply-To: <200407072211.38416.cashmere@adelphia.net> References: <40EB6704.4020700@colannino.org> <40EB7DD1.2050405@colannino.org> <1089177489.4301.14.camel@diffie> <200407072211.38416.cashmere@adelphia.net> Message-ID: <1089267255.8027.25.camel@localhost> On Wed, 2004-07-07 at 22:11, Jack Denman wrote: > On Tuesday 06 July 2004 10:19 pm, Christopher Smith wrote: > > while (fgets(line, sizeof(line), in_file)) { > > FILE* target = div_file; > > int number = atoi(line); > > if ((number % 3) != 0) { /* != 0 is gratuitous, but clearer */ > > target = other_file; > > } > > fputs(line, target); > > } > > > > A number of advantages to this approach: > > > > fgets should not be used because of buffer overflow, a security violation and > the gcc compiler warns about that. The GNU way is as follows: Wait... I think maybe I'm way too tired to be reading e-mail right now. The code above uses *fgets*, not *gets* (and manages to use it properly too!). How do you see a buffer overflow occurring? -- Christopher Smith From x at xman.org Thu Jul 8 14:41:47 2004 From: x at xman.org (Christopher Smith) Date: Thu, 08 Jul 2004 14:41:47 -0700 Subject: [OCLUG-devel] GNU code with security holes?!! 
[was: Trouble with strings and files] In-Reply-To: <200407072211.38416.cashmere@adelphia.net> References: <40EB6704.4020700@colannino.org> <40EB7DD1.2050405@colannino.org> <1089177489.4301.14.camel@diffie> <200407072211.38416.cashmere@adelphia.net> Message-ID: <1089322907.11797.136.camel@smithch-ws.research.p4pnet.net> On Wed, 2004-07-07 at 22:11, Jack Denman wrote: > ***************************************** > > /* Read an arbitrarily long line of text from STREAM into LINEBUFFER. > Remove any newline. Does not null terminate. > Return LINEBUFFER, except at end of file return 0. */ > > struct linebuffer *readline (struct linebuffer *line, FILE *stream) > { > int c; > char *buffer = line->buffer; > char *p = line->buffer; > char *end = buffer + line->size; /* Sentinel. */ > > c = getc (stream); > if (feof (stream)) { > line->length = 0; > return (NULL); > } else { > ungetc(c, stream); > } > > for ( ; ; ) { > c = getc (stream); > if (c == '\r') > continue; > if (p == end) { > line->size *= 2; > buffer = xrealloc (buffer, (size_t) line->size); > p += buffer - line->buffer; > line->buffer = buffer; > end = buffer + line->size; > } > if (c == EOF || c == '\n') { > c = '\0'; > *p++ = (char) c; > break; > } > *p++ = (char) c; > } > > if (feof (stream) && p == buffer) { > line->length = 0; > return(NULL); > } > line->length = p - line->buffer - 1; > return (line); > } So, now that I've mostly caught up on my sleep, I had a chance to look at this code, and my code. My main observation is actually that my code doesn't have a buffer overflow but ironically, yours does. Some observations: First of all, having looked at my code sample with alert eyes, and having context switched from my C++ world (where STL does all this stuff so wonderfully for me), I feel pretty confident saying that fgets is being used properly in my original code sample, and would not introduce a buffer overflow. If I'm wrong, I'd really appreciate someone detailing my mistake. 
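As a small illustration of why the fgets call in the quoted loop stays inside its buffer, here is a sketch (the helper name longest_chunk is hypothetical, not from the thread). fgets stores at most one byte fewer than the size it is given and then appends the terminating NUL, so an over-long input line is simply handed back in several pieces.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the stream with fgets() and return the
   length of the longest piece it hands back. Because fgets() writes at
   most cap - 1 characters plus a NUL, the result is always below cap,
   however long the lines in the input are. */
static size_t longest_chunk(FILE *in, char *line, size_t cap)
{
    size_t longest = 0;

    while (fgets(line, (int) cap, in) != NULL) {
        size_t len = strlen(line);
        if (len > longest)
            longest = len;
    }
    return longest;
}
```

The function the compiler warns about is gets, which takes no size argument at all.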
The one thing I could have improved on was to use strtol() instead of atoi(), so that underflow and overflow conditions could be detected. Since it wasn't clear how underflow and overflow should be handled, it didn't occur to me to consider that possibility, but it's just good hygiene to worry about such things. As it stands, the output is going to be incorrect two-thirds of the time for lines where the numbers are larger than INT_MAX, but this won't introduce a security problem such as a buffer overflow.

This readline() function is quite different in behavior from fgets(), and as such, depending on how you look at the original program it would be unwise to use readline() (in particular, unless you start using unlimited precision integers, there's no point in reading in more than 50 characters on most platforms). The code also drops carriage returns and line feeds from the file it reads in, and makes a carriage return and an EOF indistinguishable, which is not necessarily desirable. However, these are all minor points that really shouldn't be a concern, particularly given the scope of the program.

The code isn't entirely portable C all by itself, as you also need to provide an implementation of xrealloc (no biggie here). Most implementations of xrealloc exit with an error if the line is too large to fit in memory, and as mentioned before, that may not be the appropriate behavior here. Still, xrealloc is simple to implement, and in this case the error behavior seems appropriate.

The code is also not entirely portable, as different platforms treat carriage return and line feed characters differently. That may be an issue, but this code should be portable to POSIX systems as well as Windows, so no big concern here either.

The handling of the (c == EOF) || (c == '\n') case, while perfectly good code, could be improved. For starters, I'd test this before I'd test against '\r', because one would hope that the '\r' case would occur less often.
More importantly this snippet:

>         c = '\0';
>         *p++ = (char) c;
>         break;

seems quite torturous to me. Why widen '\0' to an int, assign it to c, cast c to char, and then assign it to *p, particularly given the fact that c won't be used again? Why not just assign '\0' to *p directly?

I'm also trying to understand the point of the ungetc() when your next operation on the stream is guaranteed to be getc(). It seems smarter to just move the getc currently at the beginning of the loop to the end of the loop.

The code could also be much more efficient for the case where FILE* is pointing to a seekable stream (or even better something that can be memory mapped, but of course memory mapping isn't portable). If you test first for whether the stream is seekable (presumably a common case) you can jump from an O(nlog(n)) algorithm to an O(n) algorithm.

The big concern I have about this (and it actually makes me a little suspicious and concerned about this being the "GNU way" to code) is that using this function actually *introduces* a security hole. Yup, you got that. When I first saw that I was so surprised I read it over again to make sure I wasn't mistaken, but I'm quite sure of it now. The problem is in this line of code:

    line->size *= 2;

This can cause line->size to overflow, which in turn would result in the buffer being reallocated to a *smaller* size, but p would be updated to point to a location as if memory had grown. All subsequent writes to p would constitute a buffer overflow.

The simple fix to this would be to change the line to:

    line->size = (line->size < (SIZE_MAX/2)) ? line->size*2 : SIZE_MAX;

Given that realloc(SIZE_MAX) has to fail, it's a good idea to do something more clever here, but at least this correction eliminates the security problem.

So Jack, if this is standard GNU code, do you know of any code bases where it exists? I'd like to send in patches to fix it ASAP.
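The guarded doubling described in this message could also be packaged as a tiny helper, sketched below. The name grow_size is hypothetical, and signalling failure through a return code (rather than clamping to SIZE_MAX as the one-liner does) is just one possible design, not something the thread settles on.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: double *size unless doing so would wrap around.
   Returns 0 on success, or -1 (leaving *size untouched) when the
   doubled value would not fit in a size_t. */
static int grow_size(size_t *size)
{
    if (*size > SIZE_MAX / 2)
        return -1;          /* *size * 2 would exceed SIZE_MAX */
    *size *= 2;
    return 0;
}
```

A caller would check the return value before calling realloc, so the buffer is never reallocated to a size smaller than the data already written into it.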
--
Christopher Smith

From x at xman.org Fri Jul 9 00:53:42 2004
From: x at xman.org (Christopher Smith)
Date: Fri, 09 Jul 2004 00:53:42 -0700
Subject: [OCLUG-devel] Re: GNU code with security holes?!! [was: Trouble with strings and files]
In-Reply-To: <1089339056.2097.87.camel@lanya.local>
References: <40EB6704.4020700@colannino.org> <40EB7DD1.2050405@colannino.org> <1089177489.4301.14.camel@diffie> <200407072211.38416.cashmere@adelphia.net> <1089322907.11797.136.camel@smithch-ws.research.p4pnet.net> <1089339056.2097.87.camel@lanya.local>
Message-ID: <1089359622.8022.152.camel@localhost>

On Thu, 2004-07-08 at 19:10, Jack Denman wrote:
> On Thu, 2004-07-08 at 14:41, Christopher Smith wrote:
>
> > This readline() function is quite different in behavior from fgets(),
> > and as such, depending on how you look at the original program it would
> > be unwise to use readline() (in particular, unless you start using
> > unlimited precision integers, there's no point in reading in more than
> > 50 characters on most platforms). The code also drops carriage returns
> > and line feeds from the file it reads in, and makes a carriage return
> > and an EOF indistinguishable, which is not necessarily desirable.
>
> Old news.
> There is a major difference between EOF and carriage returns. EOF =
> -1 as a 32 bit integer.

Please try to read more carefully. I didn't claim there wasn't. I said that your sample code treats them equivalently (as indicated by the (c == EOF || c == '\n')). You'd have to add an additional EOF test after doing this in order to know whether you needed to append the "\n" or just let the file end. It is, as I said, a minor point.

> > The code is also not entirely portable, as different platforms treat
> > carriage return and line feed characters differently. That may be an
> > issue, but this code should be portable to POSIX systems as well as
> > Windows, so no big concern here either.
>
> '\n' covers both the <CR><LF> on Windows and the <LF> on *nix systems.
> The only reason for testing for an '\r' would be if no <LF> was
> present. Has anyone ever seen this in a text file?

Well, '\n' doesn't cover <CR><LF> on Windows if you open in binary mode, but yes, as I said, it works on POSIX systems and Windows. It's kind of frightening that you think that's the whole world. Notice a major platform I didn't mention? Yup, MacOS. While OS X is a POSIX-ish system, so I imagine '\n' works fine on it, all prior versions of the MacOS used '\r'. If you ever used a teletype, this actually might seem to make more sense than '\n', as '\r' corresponds to "carriage return" and '\n' corresponds to "line feed". Here's a quick link I found searching online which explains it in more detail:

http://www.websiterepairguy.com/articles/os/crlf.html

So, with old MacOS text files, having \r's without \n's would be pretty common.

> > The handling of the (c == EOF) || (c == '\n') case, while perfectly good
> > code, could be improved. For starters, I'd test this before I'd test
> > against '\r', because one would hope that the '\r' case would occur less
> > often. More importantly this snippet:
> >
> > >         c = '\0';
> > >         *p++ = (char) c;
> > >         break;

> > The problem is in this line of code:
> >
> > line->size *= 2;
> >
> > This can cause line->size to overflow, which in turn would result in
> > the buffer being reallocated to a *smaller* size, but p would be updated
> > to point to a location as if memory had grown. All subsequent writes to
> > p would constitute a buffer overflow.
>
> If the buffer is doubled how does it get smaller?
> (character) p is updated to the position of the first position in
> second half of the doubled buffer. Where is the beef (overflow)?

Jack, given all the claims you have made about security expertise, and the work that you did working on AIX, making it so secure, I'm frankly quite shocked by this. I explained to you exactly what is wrong with this code, and you still don't get it. This is a classic integer overflow problem.
It's one thing to make the mistake in the first place (even the best security experts acknowledge they need code reviews in order to avoid mistakes slipping through), but it should be obvious once someone points it out to you. This is the kind of thing you are taught in the first few hours of training on writing secure software. Don't worry though, I'll break it down so those without security training can understand it. > > The simple fix to this would be to change the line to: > > > > line->size = (line->size < (SIZE_MAX/2)) ? line->size*2 : SIZE_MAX; > > I don't see this as necessary over the GNU code. Well, I guess GNU does. I tracked down your function in the GNU code base. The function is called readlinebuffer. It is like your code sample, with a few changes. There is no ungetc(), no checking against '\r', and a few other changes, some of which we've talked about. readlinebuffer no longer does the *=2 itself, nor does it call xrealloc(). It now calls a GNU function called x2realloc() which grows the buffer to 2x whatever size you pass in. Now, why would GNU make a special realloc function for this? Well, x2realloc() eventually traces down to this little bit of code: if (SIZE_MAX / 2 / s < n) xalloc_die (); n *= 2; Looks a bit like my code doesn't it? They decide to die right then and there, rather than wait for xrealloc to do it for them. After thinking about it some more, it seems like the smarter thing to do would be to try reallocate to ((n/2) + (SIZE_MAX/2))/s. I might ping the GNU maintainer about this. Anyway, the "s" there is an extra parameter that lets you set an allocation ceiling that is SIZE_MAX/s, for whatever reason (s == 1 in the case where you invoke x2realloc()). Anyway, what does this tell us? It tells us GNU seems to think I'm on to something here. Now, this takes us back to what is the risk here. Jack wants to know why the number could get smaller when you do: line->size *= 2 Where's the beef? 
Well, think of the test case in my code:

    (line->size < (SIZE_MAX/2))

Okay, that's a pretty strong hint. I seem to think the problem is happening somewhere around SIZE_MAX/2. Now, technically, there isn't a problem if line->size == (SIZE_MAX/2), but since "*= 2" on that would work out to SIZE_MAX anyway, I thought why not save ourselves doing a multiply? So, what happens once line->size > (SIZE_MAX/2)? Well, let's do this algebraically.

Let line->size = SIZE_MAX/2 + i, where i is defined as being a positive integer.

Now, let's see what we end up with:

    line->size = (SIZE_MAX/2 + i) * 2
    line->size = SIZE_MAX + 2i

So, based on this, it'd be safe to say line->size is going to be greater than SIZE_MAX. But wait... they must call it SIZE_MAX for a reason, right? SIZE_MAX represents the highest possible value that a value of size_t can possibly be; size_t simply isn't big enough to fit a larger value. So, what happens when you go above SIZE_MAX? You get an integer overflow. In some programming languages, this produces an exception, but in C the result "wraps around". Your end result is essentially "what you'd expect" mod (SIZE_MAX + 1). So it turns out:

    line->size = (SIZE_MAX + 2i) % (SIZE_MAX + 1)

which is the same as:

    line->size = 2(i - 1)

I know, I'm sure Jack thinks I'm a liar, who's making all this stuff up. I'm sure he checked out what happens when you go over SIZE_MAX/2 himself. Let's see what happens:

    #include <stdint.h>   /* SIZE_MAX */
    #include <stdio.h>

    int main()
    {
        size_t i;
        for (i = 1; i < (SIZE_MAX/2); ++i) {
            size_t foo = SIZE_MAX/2 + i;
            printf("i = %zu\n", i);
            printf("2(i - 1) = %zu\n", 2*(i - 1));
            printf("foo: %zu\n", foo);
            foo *= 2;   /* wraps around modulo SIZE_MAX + 1 */
            printf("foo*2: %zu\n", foo);
        }
        return 0;
    }

You probably will want to stop this before it runs too long.

Jack seemed to be suggesting that this line might somehow correct for this problem:

    p += buffer - line->buffer

But that line compensates for changes in the starting address of the buffer. It doesn't in any way compensate for changes in the size of the buffer.
Indeed, for p to be pointing at allocated memory after this line, you have to assume that the size of the new buffer is greater than the size of the old buffer. If not.... ta-da! Buffer overflow. I'm sure Jack's code sample is based on an older version of the GNU code base. I know the GNU core utilities guy did a presentation at SCALE where he talked about an audit of the core code base he did the previous summer, where he uncovered and corrected hundreds of bugs like this. So, maybe this was corrected during that audit, and your code snapshot is older than that. Now, I've explained the problem in this code in quite painful detail, but I haven't seen an explanation of a security hole in my code. I'll conclude that Jack couldn't find one. -- Christopher Smith From x at xman.org Sat Jul 10 17:41:27 2004 From: x at xman.org (Christopher Smith) Date: Sat, 10 Jul 2004 17:41:27 -0700 Subject: [OCLUG-devel] Re: GNU code with security holes?!! [was: Trouble with strings and files] In-Reply-To: <1089359622.8022.152.camel@localhost> References: <40EB6704.4020700@colannino.org> <40EB7DD1.2050405@colannino.org> <1089177489.4301.14.camel@diffie> <200407072211.38416.cashmere@adelphia.net> <1089322907.11797.136.camel@smithch-ws.research.p4pnet.net> <1089339056.2097.87.camel@lanya.local> <1089359622.8022.152.camel@localhost> Message-ID: <1089506487.7004.6.camel@localhost> On Fri, 2004-07-09 at 00:53, Christopher Smith wrote: > Where's the beef? Well, think of the test case in my code: > > (line->size < (SIZE_MAX/2)) > > Okay, that's a pretty strong hint. I seem to think the problem is > happening somewhere around SIZE_MAX/2. Now, technically, there isn't a > problem if line->size == (SIZE_MAX/2), but since "*= 2" on that would > work out to SIZE_MAX anyway, I thought why not save ourselves doing a > multiply? So, what happens once line->size > (SIZE_MAX/2). Well, lets do > this algebraically. Okay, so upon *further* review I realized a mistake in my math here. 
SIZE_MAX is of course going to be an odd number (unless size_t is signed, which it never is). So, (SIZE_MAX/2)*2 == SIZE_MAX - 1. So, there is a slight error here. Which helps explain how I came up with

> you'd expect" mod (SIZE_MAX + 1). So it turns out:
>
> line->size = (SIZE_MAX + 2i) % (SIZE_MAX + 1)
>
> which is the same as:
>
> line->size = 2(i - 1)

This is of course wrong. Going from (SIZE_MAX + 2i) % (SIZE_MAX + 1) should come out to:

    2i - 1

The devil is always in the details. If you go back into the algebra, you'll find that the "SIZE_MAX" I have there is the result of (SIZE_MAX/2)*2. Because the division step is done before the multiply, this doesn't work out to SIZE_MAX, but to SIZE_MAX-1, so while line->size does end up equaling 2(i - 1), that is a transformation from:

    ((SIZE_MAX - 1) + 2i) % (SIZE_MAX + 1)

Anyway, it's a niggly detail, but an important bit to get right.

--
Christopher Smith

From james at colannino.org Sun Jul 11 09:27:44 2004
From: james at colannino.org (James Colannino)
Date: Sun, 11 Jul 2004 09:27:44 -0700
Subject: [OCLUG-devel] calculations one digit at a time
Message-ID: <40F16A80.2070204@colannino.org>

Hey everyone. I'm trying to figure out how I'm going to do calculations working with one digit at a time (this is so that I can work with fixed digit numbers.) For example:

110 + 100

First, the two 0's in the one's place would be added, then the 0 and the 1 in the tens place, and finally the two 1's in the hundreds place. At first glance, it doesn't seem like it would be too difficult. I would assign each individual digit to its own variable and go from there. The problem is, if I add two numbers and I get a number that's equal to 10 or greater, I'll have to figure out how to borrow, and I'm not quite sure how to do that in C. For example:

732 + 298

8 + 2 is 10, so I would have to move the 1 digit in the tens place over one.
Would I maybe convert 10 into a string, split it into the 1 and 0, and then convert both back into integers and add them to their appropriate places? I hope I'm not too out there with my solution :-P I'm just not quite sure what I should do. James -- My blog: http://www.crazydrclaw.com/ My homepage: http://james.colannino.org/ "There are no uninteresting things; only uninterested people." --G.K. Chesterton From michael.elkins at gmail.com Sun Jul 11 11:18:46 2004 From: michael.elkins at gmail.com (Michael Elkins) Date: Sun, 11 Jul 2004 11:18:46 -0700 Subject: [OCLUG-devel] calculations one digit at a time In-Reply-To: <40F16A80.2070204@colannino.org> References: <40F16A80.2070204@colannino.org> Message-ID: <404f0322040711111854f76cee@mail.gmail.com> On Sun, 11 Jul 2004 09:27:44 -0700, James Colannino wrote: > Hey everyone. I'm trying to figure out how I'm going to do calculations > working with one digit at a time (this is so that I can work with fixed > digit numbers.) For example: Are you doing this as a programming exercise? I'm trying to figure out why you would want to do integer arithmetic this way? > > 110 + 100 > > First, the two 0's in the one's place would be added, then the 0 and the > 1 in the tens place, and finally the two 1's in the hundreds place. At > first glance, it doesn't seem like it would be too difficult. I would > assign each individual digit to its own variable and go from there. The > problem is, if I add two numbers and I get a number that's equal to 10 > or greater, I'll have to figure out how to borrow, and I'm not quite > sure how to do that in C. I believe you mean carry. > For example: > > 732 + 298 > > 8 + 2 is 10, so I would have to move the 1 digit in the tens place over > one. Would I maybe convert 10 into a string, split it into the 1 and 0, > and then convert both back into integers and add them to their > appropriate places? > > I hope I'm not too out there with my solution :-P I'm just not quite > sure what I should do. 
Off the top of my head, I'd say to represent each digit as an element in an array, in reverse order such that index 0 is the ones place, 1 is the tens place, etc. Then doing carries is pretty easy:

char a[10]; //operand
char b[10]; //operand
char c[11] = {0}; //result
for (i=0; i<10; i++) {
    c[i] = a[i] + b[i];
    if (c[i] >= 10) {
        c[i+1]++;
        c[i] -= 10;
    }
}

-Michael From james at colannino.org Sun Jul 11 16:40:21 2004 From: james at colannino.org (James Colannino) Date: Sun, 11 Jul 2004 16:40:21 -0700 Subject: [OCLUG-devel] problems with atof() Message-ID: <40F1CFE5.2070602@colannino.org> Anyone know why the following won't work? I'm compiling with gcc 3.3.3 if that makes a difference.

#include <stdio.h>

int main()
{
    char number[] = "1.293";
    double number_float = atof(number);
    printf ("\nnumber is: %f\n\n", number_float);
    return 0;
}

Instead of printing to the screen something even vaguely resembling 1.293, it instead shows me the following (perhaps it would be different on a different machine): -996432413.000000 I'm pretty sure I'm using atof() correctly, and I'm not sure why this is happening :( James -- My blog: http://www.crazydrclaw.com/ My homepage: http://james.colannino.org/ "There are no uninteresting things; only uninterested people." --G.K. Chesterton From x at xman.org Sun Jul 11 18:02:55 2004 From: x at xman.org (Christopher Smith) Date: Sun, 11 Jul 2004 18:02:55 -0700 Subject: [OCLUG-devel] problems with atof() In-Reply-To: <40F1CFE5.2070602@colannino.org> References: <40F1CFE5.2070602@colannino.org> Message-ID: <1089594174.8059.13.camel@localhost> On Sun, 2004-07-11 at 16:40, James Colannino wrote: > Anyone know why the following won't work? I'm compiling with gcc 3.3.3 > if that makes a difference. You failed to #include <stdlib.h>, which is where atof() is defined.
-- Christopher Smith From x at xman.org Sun Jul 11 19:52:57 2004 From: x at xman.org (Christopher Smith) Date: Sun, 11 Jul 2004 19:52:57 -0700 Subject: [OCLUG-devel] Re: [OCLUG] Security Holes In Sample Code [was: Trouble with strings and files] In-Reply-To: <200407110807.38006.cashmere@adelphia.net> References: <40EB6704.4020700@colannino.org> <200407072211.38416.cashmere@adelphia.net> <1089506929.7004.13.camel@localhost> <200407110807.38006.cashmere@adelphia.net> Message-ID: <1089600776.8059.20.camel@localhost> On Sun, 2004-07-11 at 08:07, Jack Denman wrote: > I do not see the buffer overflow in the GNU code. Let each one that knows the > "C" language judge for themselves, and don't take anybody's word for it. Could you enlighten us as to where I'm going wrong in describing the error? Do integer overflows not happen in C? What does happen if the buffer is already > SIZE_MAX/2 and needs to be resized to continue reading the line? Oh, and do you still feel there is a buffer overflow in the fgets() code? -- Christopher Smith From james at colannino.org Sun Jul 11 20:46:19 2004 From: james at colannino.org (James Colannino) Date: Sun, 11 Jul 2004 20:46:19 -0700 Subject: [OCLUG-devel] problems with atof() In-Reply-To: <1089594174.8059.13.camel@localhost> References: <40F1CFE5.2070602@colannino.org> <1089594174.8059.13.camel@localhost> Message-ID: <40F2098B.5000308@colannino.org> Christopher Smith wrote: > On Sun, 2004-07-11 at 16:40, James Colannino wrote: > >>Anyone know why the following won't work? I'm compiling with gcc 3.3.3 >>if that makes a difference. > > > You failed to #include <stdlib.h>, which is where atof() is defined. Wow...how embarrassing... :-P But this begs the question: if I was trying to use a function that wasn't defined, then why didn't I get a compiler error? James -- My blog: http://www.crazydrclaw.com/ My homepage: http://james.colannino.org/ "There are no uninteresting things; only uninterested people." --G.K.
Chesterton From x at xman.org Sun Jul 11 21:44:29 2004 From: x at xman.org (Christopher Smith) Date: Sun, 11 Jul 2004 21:44:29 -0700 Subject: [OCLUG-devel] problems with atof() In-Reply-To: <40F2098B.5000308@colannino.org> References: <40F1CFE5.2070602@colannino.org> <40F2098B.5000308@colannino.org> Message-ID: <1089607468.8059.33.camel@localhost> On Sun, 2004-07-11 at 20:46, James Colannino wrote: > Christopher Smith wrote: > > On Sun, 2004-07-11 at 16:40, James Colannino wrote: > > > >>Anyone know why the following won't work? I'm compiling with gcc 3.3.3 > >>if that makes a difference. > > > > > > You failed to #include <stdlib.h>, which is where atof() is defined. > > Wow...how embarrassing... :-P Nah, it's the kind of mistake we all make. Notice that no one else figured out this problem. ;-) > But this begs the question: if I was trying to use a function that > wasn't defined, then why didn't I get a compiler error? My guess is that there is a compiler builtin for atof() with a slightly different signature from the standard function. -- Christopher Smith From michael.elkins at gmail.com Sun Jul 11 22:23:26 2004 From: michael.elkins at gmail.com (Michael Elkins) Date: Sun, 11 Jul 2004 22:23:26 -0700 Subject: [OCLUG-devel] calculations one digit at a time In-Reply-To: <404f0322040711111854f76cee@mail.gmail.com> References: <40F16A80.2070204@colannino.org> <404f0322040711111854f76cee@mail.gmail.com> Message-ID: <404f032204071122237317dd12@mail.gmail.com> On Sun, 11 Jul 2004 11:18:46 -0700, Michael Elkins wrote:

> for (i=0; i<10; i++) {
>     c[i] = a[i] + b[i];
>     if (c[i] >= 10) {
>         c[i+1]++;
>         c[i] -= 10;
>     }
> }

Just for the sake of completeness, the above code has an error in it. The second line should read:

c[i] += a[i] + b[i];

Otherwise any carries that are done would be overwritten.
-Michael From x at xman.org Mon Jul 12 01:25:44 2004 From: x at xman.org (Christopher Smith) Date: Mon, 12 Jul 2004 01:25:44 -0700 Subject: [OCLUG-devel] Anatomy of a buffer overflow In-Reply-To: <200407110807.38006.cashmere@adelphia.net> References: <40EB6704.4020700@colannino.org> <200407072211.38416.cashmere@adelphia.net> <1089506929.7004.13.camel@localhost> <200407110807.38006.cashmere@adelphia.net> Message-ID: <1089620744.20220.114.camel@diffie> On Sun, 2004-07-11 at 08:07, Jack Denman wrote: > I do not see the buffer overflow in the GNU code. Let each one that knows the > "C" language judge for themselves, and don't take anybody's word for it. I have to admit, Jack's statements concerned me. For one, I was pretty confident about the security flaw in the code and I didn't want him or anyone else to come to the wrong conclusion about the code. I felt I'd already gone into great detail as to why there is an error, and nobody has pointed out an error in the analysis, or some point of confusion. So I was at a loss for what to add for further explanation. I was also concerned that somehow perhaps I'd gotten things wrong, and if so, I wanted to understand why. So, I wrote out a test snippet to demonstrate a buffer overflow in the code. I thought it might also prove to be a useful demonstration that people could step through in their debugger. Sure enough, I did in fact find it. Please find attached a code snippet demonstrating how a buffer overflow can happen with Jack's code. I took Jack's snippet, removed the DOS stuff, and made a few corrections so that the support libraries could compile. None of this should affect the code's behavior on a Linux or Unix system (Jack, feel free to point out if I am wrong in this regard). I then added a fairly simple main() function which would create a file with a single line of size SIZE_MAX (much bigger than necessary, but just to be safe), as well as a linebuffer just over SIZE_MAX/2, and then invoke readline on the file.
You can actually start with various sizes of line buffers by passing in a 3rd argument. Depending on your starting buffer size and a ton of other factors, you may run out of virtual memory before the buffer overflow occurs. The program will attempt to detect cases where this will happen, but really, this is in the hands of a particular system's malloc/realloc/free implementation (my little test loop that checks for memory seems to really fragment the heap, so you might want to remove it). Also note that you need to have your system set up so that it can address >2GB of virtual memory. For best results, you want to use a so-called "4+4" kernel, which allows 4GB of virtual memory per process. Most of the distros ship with kernels configured for at least 3GB of virtual memory, but a lot of people who build their own kernels trim it down to 1GB, as that is the default. Also, you might want to make sure that you don't have any memory allocation limits from ulimit which might get in the way. In general, YMMV as far as which buffer sizes result in a buffer overrun, but if you support >2.5GB of virtual memory, you shouldn't have a problem getting there with the default config. The code generates a test file larger than 2GB, so you need a file system with large file support (basically all of them). The code attempts to create a sparse file to avoid chewing up a ton of disk space and time, so stick to filesystems which also support sparse files: ext2/3, reiserfs, xfs, jfs, etc. It runs quite slowly (if for no other reason than that doing large reads with getc() is painfully slow), so be patient. I also added an assertion into the main loop of readline which tests for whether we have a buffer overflow.
Where previously, it was just:

*p++ = (char) c;

We now have:

//make sure that p is not pointing outside the allocated buffer
//if p is pointing outside, we have a buffer overflow
if ((p - line->buffer) > line->size) {
    fprintf(stderr, "Buffer overflow with p at\t%x\nline->buffer at\t\t\t%x\nDifference is:\t%u\nline->size is:\t%u\n", p, line->buffer, p - line->buffer, line->size);
    exit(10);
}
*p++ = (char) c;

Compilation is straightforward; I've been building it with "gcc -g readline.c -o readline". If you strip out the symbols the code will take up less virtual memory, but then you won't be able to look at it in the debugger too easily. This is what happens when I run it:

$ ./readline testFile
Buffer overflow with p at b74a900a
line->buffer at 374a9008
Difference is: 2147483650
line->size is: 4
$ echo $?
10

The default test case seems to work consistently on the 32-bit x86 RHEL systems I tested on, a 64-bit x86 RHEL Opteron system (running as a 32-bit binary of course), as well as a Solaris system (again running as a 32-bit binary). While it's theoretically possible to reproduce the error with a 64-bit executable, you'd need > 2^63 bytes of virtual memory, and I figure most of us can't come up with that until another 30-odd years of Moore's law and its variants pass us by. -- Christopher Smith -------------- next part -------------- A non-text attachment was scrubbed... Name: readline.c Type: text/x-csrc Size: 5454 bytes Desc: not available Url : http://localhost.localdomain/pipermail/oclug-devel/attachments/20040712/296abb22/attachment.bin From ndoclug at ninkware.com Tue Jul 27 16:35:59 2004 From: ndoclug at ninkware.com (ndoclug at ninkware.com) Date: Tue, 27 Jul 2004 16:35:59 -0700 (PDT) Subject: [OCLUG-devel] Hello All Message-ID: <3357.64.215.118.254.1090971359.squirrel@webmail.ninksink.com> I am a regular on oclug but I also code like a banshee =-)