From james at colannino.org Thu Aug 4 13:57:16 2005
From: james at colannino.org (James Colannino)
Date: Thu, 04 Aug 2005 13:57:16 -0700
Subject: [OCLUG-devel] Matching Strings
Message-ID: <42F2812C.2060100@colannino.org>

Hey everyone. I'm writing a C program that will report either duplicate
strings in a file or strings that are the same up to n characters. For
example, let's say that I have the following file named text.txt:

albert
albert
albany
bert
biscuit
true
trouble
lone

If I were to do the following:

match text.txt

It would search for complete matches and should output the following:

albert
albert

If I were to do the following:

match -n 3 text.txt

It should output the following:

albert
albert
albany

But instead, it outputs this:

albert
albert
albany
albert
albany

I know why it does this. I look for matches in the following way:

read string
search rest of file below it for match
read next string
search rest of file below it for match

Since albert is there twice, it reads the first instance of albert,
outputs all matches below it, reads the second albert, and outputs all
the matches below it, and so on.

My question is, how can I write my program in such a way that these only
show up once? In essence, I'd like it to report:

albert
albert
albany

Instead of:

albert
albert
albany
albert
albany

Here's a link to the source (it's probably not very good, but I tried to
structure and comment it in such a way that it's at least easy to read
-- I hope...)

http://james.colannino.org/match.c

Any help would be greatly appreciated :)

James

From tnation at baipro.com Thu Aug 4 14:01:18 2005
From: tnation at baipro.com (Tim Nation)
Date: Thu, 4 Aug 2005 14:01:18 -0700
Subject: [OCLUG-devel] Matching Strings
References: <42F2812C.2060100@colannino.org>
Message-ID: <000401c59937$aa5ba170$0317a8c0@win2000>

> Hey everyone. I'm writing a C program that will report either duplicate
> strings in a file or strings that are the same up to n characters.
> For example, let's say that I have the following file named text.txt:
> But instead, it outputs this:
>
> albert
> albert
> albany
> albert
> albany
>
> I know why it does this. I look for matches in the following way:
>
> read string
> search rest of file below it for match
> read next string
> search rest of file below it for match
>
> Since albert is there twice, it reads the first instance of albert,
> outputs all matches below it, reads the second albert, and outputs all
> the matches below it, and so on.

How about removing the matching lines from the record in memory as they
are processed? That way they can't be processed more than once. As long
as you are only reading and not writing the text record, you should be
free to manipulate it in memory all you want.

Tim

From x at xman.org Thu Aug 4 14:30:35 2005
From: x at xman.org (Christopher Smith)
Date: Thu, 04 Aug 2005 14:30:35 -0700
Subject: [OCLUG-devel] Matching Strings
In-Reply-To: <42F2812C.2060100@colannino.org>
References: <42F2812C.2060100@colannino.org>
Message-ID: <42F288FB.4000109@xman.org>

Hmm... I don't have time to look at your code right now, but the
algorithm is pretty straightforward for doing this right.
In C++ it'd be something like (filling out the necessary types and
various bits of prologue and epilogue code is left as an exercise to
the reader):

    //keep going until we've seen all the data,
    //reading a new line on each pass
    while (getline(my_ifstream, buffer)) {
        //get whatever section is relevant as a key
        string key = buffer.substr(0, n);
        //check to see if we've seen this key before
        my_map_iterator = my_map.find(key);
        if (my_map_iterator != my_map.end()) {
            //check if we've printed out the previously
            //scanned line before
            if (my_map_iterator->second) {
                //print out the previous line
                cout << *(my_map_iterator->second) << endl;
                //set the stored pointer to null and wipe
                //out the string
                auto_ptr<string> temp(my_map_iterator->second);
                my_map_iterator->second = 0;
            }
            //print out this line
            cout << buffer << endl;
        } else {
            //add this new key to our map
            my_map.insert(make_pair(key, new string(buffer)));
        }
    }

--Chris

From msimpson at braysimpson.com Thu Aug 4 14:51:14 2005
From: msimpson at braysimpson.com (Morgan Simpson)
Date: Thu, 04 Aug 2005 14:51:14 -0700
Subject: [OCLUG-devel] Matching Strings
In-Reply-To: <42F2812C.2060100@colannino.org>
References: <42F2812C.2060100@colannino.org>
Message-ID: <42F28DD2.9010701@braysimpson.com>

James Colannino wrote:

> My question is, how can I write my program in such a way that these only
> show up once? In essence, I'd like it to report:
>
> albert
> albert
> albany
>
> Instead of:
>
> albert
> albert
> albany
> albert
> albany

James, you may want to consider a dictionary construct where each
unique word is a key and the argument (definition) is the number of
times that word appears in the input. You traverse the file once to
create the dictionary. Then you traverse the dictionary and report
those words that occur more than once.

Best regards,
Morgan Simpson
Bray, Simpson & Associates, Inc.
+1 714 390 5040
+1 714 549 3064

From strombrg at dcs.nac.uci.edu Thu Aug 4 15:15:30 2005
From: strombrg at dcs.nac.uci.edu (Dan Stromberg)
Date: Thu, 04 Aug 2005 15:15:30 -0700
Subject: [OCLUG-devel] Matching Strings
In-Reply-To: <42F2812C.2060100@colannino.org>
References: <42F2812C.2060100@colannino.org>
Message-ID: <1123193731.14126.194.camel@seki.nac.uci.edu>

There's a variety of ways of doing this sort of thing, but some of the
easier ones that should still have a good running time, in pseudo code,
look like:

A) Not absolutely blazing, but not slow, and very simple coding:

1) go through file, stuffing all matches in an array
2) pass the resulting array to qsort with a comparison function that
   only compares -all- characters in the array
3) go through and print out all matches that aren't adjacent and
   identical

Or perhaps a little bit better:

1) Initialize a suitably-sized hash table
2) Go through the file, stuffing all matches in the hash table. The
   table is indexed by the matching prefix, but contains an associated
   piece of data for every matching line
3) Just run through the hash table, outputting all data at each key

...or you could do much the same thing with a binary tree or something,
but a suitably-sized hash table will probably outperform the binary
tree, -and- be less likely to exhibit pathological behavior.

-Or-, if you have a truly huge list, then you might use a
database-backed "hash table", rather than keeping everything in memory.

On Thu, 2005-08-04 at 13:57 -0700, James Colannino wrote:
> Hey everyone. I'm writing a C program that will report either duplicate
> strings in a file or strings that are the same up to n characters.
> For example, let's say that I have the following file named text.txt:
>
> albert
> albert
> albany
> bert
> biscuit
> true
> trouble
> lone
>
> If I were to do the following:
>
> match text.txt
>
> It would search for complete matches and should output the following:
>
> albert
> albert
>
> If I were to do the following:
>
> match -n 3 text.txt
>
> It should output the following:
>
> albert
> albert
> albany
>
> But instead, it outputs this:
>
> albert
> albert
> albany
> albert
> albany
>
> I know why it does this. I look for matches in the following way:
>
> read string
> search rest of file below it for match
> read next string
> search rest of file below it for match
>
> Since albert is there twice, it reads the first instance of albert,
> outputs all matches below it, reads the second albert, and outputs all
> the matches below it, and so on.
>
> My question is, how can I write my program in such a way that these only
> show up once? In essence, I'd like it to report:
>
> albert
> albert
> albany
>
> Instead of:
>
> albert
> albert
> albany
> albert
> albany
>
> Here's a link to the source (it's probably not very good, but I tried to
> structure and comment it in such a way that it's at least easy to read
> -- I hope...)
>
> http://james.colannino.org/match.c
>
> Any help would be greatly appreciated :)
>
> James
> _______________________________________________
> OCLUG-devel mailing list -- OCLUG-devel at oclug.org
> http://mailman.oclug.org/mailman/listinfo/oclug-devel
>

From james at colannino.org Thu Aug 4 16:25:59 2005
From: james at colannino.org (James Colannino)
Date: Thu, 04 Aug 2005 16:25:59 -0700
Subject: [OCLUG-devel] Matching Strings
In-Reply-To: <42F2812C.2060100@colannino.org>
References: <42F2812C.2060100@colannino.org>
Message-ID: <42F2A407.9020708@colannino.org>

Thanks everyone for all your answers :)

James

From hugo at teknofx.com Tue Aug 9 00:27:49 2005
From: hugo at teknofx.com (Hugo Samayoa)
Date: Tue, 09 Aug 2005 00:27:49 -0700
Subject: [OCLUG-devel] mod_dav customization
Message-ID: <42F85AF5.3050806@teknofx.com>

*I was wondering if anyone here has done any webdav customization work.
I'm in the process of re-writing some functions in mod_dav so that when
a user uploads/puts an image to the folder that it automatically creates
a thumbnail for the image. Just wondering if anyone has messed around
with either dav_method_copymove, dav_method_delete, dav_method_put or
any other method in mod_dav for apache. I just don't want to duplicate
efforts. Thanks in advance.

-Hugo*

From msimpson at braysimpson.com Tue Aug 9 10:46:10 2005
From: msimpson at braysimpson.com (Morgan Simpson)
Date: Tue, 09 Aug 2005 10:46:10 -0700
Subject: [OCLUG-devel] mod_dav customization
In-Reply-To: <42F85AF5.3050806@teknofx.com>
References: <42F85AF5.3050806@teknofx.com>
Message-ID: <42F8EBE2.90805@braysimpson.com>

Hugo Samayoa wrote:

> *I was wondering if anyone here has done any webdav customization work.
> I'm in the process of re-writing some functions in mod_dav so that when
> a user uploads/puts an image to the folder that it automatically creates
> a thumbnail for the image.
> Just wondering if anyone has messed around
> with either dav_method_copymove, dav_method_delete, dav_method_put or
> any other method in mod_dav for apache. I just don't want to duplicate
> efforts. Thanks in advance.

We are just starting a DAV implementation as a replacement for Samba.
We'd be interested in following your progress. Good luck!

--
Best regards,
Morgan Simpson
Bray, Simpson & Associates, Inc.
+1 714 390 5040
+1 714 549 3064

From hugo at teknofx.com Tue Aug 9 14:08:10 2005
From: hugo at teknofx.com (Hugo Samayoa)
Date: Tue, 09 Aug 2005 14:08:10 -0700
Subject: [OCLUG-devel] mod_dav customization
In-Reply-To: <42F8EBE2.90805@braysimpson.com>
References: <42F85AF5.3050806@teknofx.com> <42F8EBE2.90805@braysimpson.com>
Message-ID: <42F91B3A.6060508@teknofx.com>

Morgan,

I have a DAV implementation already in production with backend
authentication to MySQL. We have Windows, Linux, Mac OS 9 and OS X
clients using it. Right now I'm just customizing the modules to do
image capture on uploaded movies and thumbnail creation for those
images and other uploaded image files. Our webapp will take care of
that as well, but since we allow webdav connections to the app I need
to customize this work. All of it is backed by a GFS filesystem. So
yeah, if you have any questions about the implementation, send me an
email off list.

-Hugo

Morgan Simpson wrote:

> Hugo Samayoa wrote:
>
>> *I was wondering if anyone here has done any webdav customization
>> work. I'm in the process of re-writing some functions in mod_dav so
>> that when a user uploads/puts an image to the folder that it
>> automatically creates a thumbnail for the image. Just wondering if
>> anyone has messed around with either dav_method_copymove,
>> dav_method_delete, dav_method_put or any other method in mod_dav for
>> apache. I just don't want to duplicate efforts. Thanks in advance.
>
>
> We are just starting a DAV implementation as a replacement for Samba.
> We'd be interested in following your progress. Good luck!
>

From ddjolley at gmail.com Fri Aug 12 16:15:05 2005
From: ddjolley at gmail.com (Doug Jolley)
Date: Fri, 12 Aug 2005 19:15:05 -0400
Subject: [OCLUG-devel] Conditionally Included Sections in HTML Doc
Message-ID:

I have a rather large HTML document which I'm going to call the master
document and which includes a variety of sections. Actually, this large
document represents a composite of several smaller separate documents,
each of which is a derivative of the master document, drawing certain
sections alternatively and choosing whether to include other sections
conditionally. The point is that all of the smaller documents can be
built from the larger document by selecting and/or conditionally
including sections from the master document. The reason I maintain the
master document is that it is much easier to maintain the one master
document and then develop the derivative documents on the fly (i.e.,
programmatically) as needed. I am trying to figure out how to best
develop the derivative documents programmatically from the master
document. PGP is an obvious solution. Using the here document feature
of bash script seems appealing. Using server-side includes seems a bit
clumsy, but could also be considered a possibility. Those are about the
only 3 possibilities I recognize as being practical. I'm wondering what
I might be missing. Suggestions? Thanks for any input.

... doug

From x at xman.org Fri Aug 12 16:31:05 2005
From: x at xman.org (Christopher Smith)
Date: Fri, 12 Aug 2005 16:31:05 -0700
Subject: [OCLUG-devel] Conditionally Included Sections in HTML Doc
In-Reply-To:
References:
Message-ID: <42FD3139.4090802@xman.org>

Doug Jolley wrote:

> I have a rather large HTML document which I'm going to call the
> master document and which includes a variety of sections.
> Actually, this large document represents a composite of several
> smaller separate documents, each of which is a derivative of the
> master document, drawing certain sections alternatively and choosing
> whether to include other sections conditionally. The point is that
> all of the smaller documents can be built from the larger document by
> selecting and/or conditionally including sections from the master
> document. The reason I maintain the master document is that it is
> much easier to maintain the one master document and then develop the
> derivative documents on the fly (i.e., programmatically) as needed.
> I am trying to figure out how to best develop the derivative
> documents programmatically from the master document. PGP is an
> obvious solution.

I'm guessing you mean PHP. ;-)

> Using the here document
> feature of bash script seems appealing. Using server-side includes
> seems a bit clumsy, but could also be considered a possibility.
> Those are about the only 3 possibilities I recognize as being practical.
> I'm wondering what I might be missing. Suggestions? Thanks for any
> input.

I should think the proper thing would be to have files for each of the
component parts, which are then assembled as externally parsed
entities. There's a good document on this stuff (and other stuff):

http://www.xml.org/xml/xslt_efficient_programming_techniques.pdf

--Chris

From james at colannino.org Mon Aug 15 12:07:45 2005
From: james at colannino.org (James Colannino)
Date: Mon, 15 Aug 2005 12:07:45 -0700
Subject: [OCLUG-devel] Perl and Recursion
Message-ID: <4300E801.6060600@colannino.org>

Hey everyone. Here's a Perl question for you guys.
Here's some pseudo code to show what I've been working on (it's going
backward recursively through a list of directories looking for the
previous version of a specified package - note that the function below,
except for the first two variables, is private to another function,
which is why I wrote it in this way):

#These have scope outside of the private function below
my @dirs2check;
my $dirs2check_elements;

$look_for_rpm = sub {

    #If it wasn't found and we've reached the end of directories we can search,
    #we must return 0 to indicate our failure to find the package.

    if ($dirs2check_elements <= -1) {return 0;}

    while (files exist in $dirs2check[$dirs2check_elements]) {

        examine file;

        if match found, {
            install older version of package;
            return 1;
        }

        else {
            --$dirs2check_elements;
            &$look_for_rpm();
        }
    }
};

My problem here is that if a package is found, 1 is only returned to
the previous instance of &$look_for_rpm() and is not returned to where
I want it to go (past all the recursion and to the function that
&$look_for_rpm() is private to). If there is an error, then 0 is also
not returned to where I want it to go. I want to find a way for either
1 or 0 to be returned outside of the recursion that's taking place so I
can tell whether or not a file was found. I should probably be able to
figure this out, and I'm sure the answer is a stupid one, but I can't
for the life of me figure this out.

Thanks in advance.

James

From ddjolley at gmail.com Mon Aug 15 12:48:01 2005
From: ddjolley at gmail.com (Doug Jolley)
Date: Mon, 15 Aug 2005 15:48:01 -0400
Subject: [OCLUG-devel] Conditionally Included Sections in HTML Doc
In-Reply-To: <42FD3139.4090802@xman.org>
References: <42FD3139.4090802@xman.org>
Message-ID:

> I'm guessing you mean PHP. ;-)

Yes. I seem to have a propensity for making that mistake. Anyway,
thanks for the input. I finally went with a JS solution. I know, that
left me spending a couple of hours typing in 'document.write' and
adding backslashes in front of quotes.
But it kept me from having to install an interpreter which probably
wouldn't be needed for anything else. It's all a trade-off. I just
wanted to make sure I wasn't missing something. Thanks again for the
input.

... doug

From x at xman.org Mon Aug 15 16:24:42 2005
From: x at xman.org (Christopher Smith)
Date: Mon, 15 Aug 2005 16:24:42 -0700
Subject: [OCLUG-devel] Perl and Recursion
In-Reply-To: <4300E801.6060600@colannino.org>
References: <4300E801.6060600@colannino.org>
Message-ID: <4301243A.101@xman.org>

James Colannino wrote:

> $look_for_rpm = sub {
>
>     #If it wasn't found and we've reached the end of directories we
>     #can search, we must return 0 to indicate our failure to find the
>     #package.
>
>     if ($dirs2check_elements <= -1) {return 0;}
>
>     while (files exist in $dirs2check[$dirs2check_elements]) {
>
>         examine file;
>
>         if match found, {
>             install older version of package;
>             return 1;
>         }
>
>         else {
>             --$dirs2check_elements;
>             &$look_for_rpm();
>         }
>     }
> };

Okay, the biggest problem I see with this code is that it isn't using a
return statement for all of its exit points. Technically that can work
with Perl, but probably not the way you want it to. So, just for
clarity I'd do "return &$look_for_rpm();", but you still have the case
where you fall out of the while loop. I'm guessing you want to return
"0" for that case.

Generally with recursion you want to structure things as such:

sub recursiveFunction(arguments) {
    if (atEndOfRecursion) {
        return someAppropriateValue;
    } else {
        return recursiveFunction(someChange(arguments));
    }
}

The way to think of it is you can prove correctness of the code via
induction. For some end case you can prove the function returns the
correct value. Then for all other cases it returns the correct value as
long as some reduced form (i.e. n-1) of the recursive function returns
the correct value.

--Chris

From james at colannino.org Wed Aug 24 13:53:57 2005
From: james at colannino.org (James Colannino)
Date: Wed, 24 Aug 2005 13:53:57 -0700
Subject: [OCLUG-devel] C and external variables
Message-ID: <430CDE65.50108@colannino.org>

I have a quick question about C and external variables; let's say that
in one file I have the following:

typedef struct thing_ {
    variables;
} thing;

Now let's say that I want to declare a variable in another file with
that external type. How would I declare it? Would I have to do another
typedef in the other file, such as:

typedef extern thing thing;

I'm just guessing here, so I know I'm most likely wrong. Thanks in
advance.

James

From x at xman.org Wed Aug 24 16:36:30 2005
From: x at xman.org (Christopher Smith)
Date: Wed, 24 Aug 2005 16:36:30 -0700
Subject: [OCLUG-devel] C and external variables
In-Reply-To: <430CDE65.50108@colannino.org>
References: <430CDE65.50108@colannino.org>
Message-ID: <430D047E.7070902@xman.org>

James Colannino wrote:

> I have a quick question about C and external variables; let's say that
> in one file I have the following:
>
> typedef struct thing_ {
>     variables;
> } thing;
>
> Now let's say that I want to declare a variable in another file with
> that external type. How would I declare it? Would I have to do another
> typedef in the other file, such as:
>
> typedef extern thing thing;
>
> I'm just guessing here, so I know I'm most likely wrong. Thanks in
> advance.

extern is a modifier on the variable, not the type. Normally the type
definition goes in a header that is included by the "other file".
extern just tells you that the storage space for that variable is being
allocated in some other object file (and the linker is going to get
right mad at you if one of the object files it links doesn't claim
ownership).

So, if I have a type defined in "type_defined.h" that I want to have
shared between multiple object files, then any file that uses that type
should '#include "type_defined.h"'.
No extern necessary.

If I have a *variable* that is used by multiple object files, then to
avoid a conflict I need to assign ownership to one of the object files
and have the rest have the variable as an extern. So that probably
means "variable_owned.c" has the variable declared as a global (as
opposed to static), and "other_file.c" has the variable declared as
extern (both will have '#include "type_defined.h"' to get the type
definition). The linker will end up linking together "variable_owned.o"
and "other_file.o" and marry up the storage in variable_owned.o with
the references to the externed variable in other_file.o.

--Chris

From james at colannino.org Wed Aug 24 22:29:54 2005
From: james at colannino.org (James Colannino)
Date: Wed, 24 Aug 2005 22:29:54 -0700
Subject: [OCLUG-devel] C and external variables
In-Reply-To: <430D047E.7070902@xman.org>
References: <430CDE65.50108@colannino.org> <430D047E.7070902@xman.org>
Message-ID: <430D5752.9030603@colannino.org>

Christopher Smith wrote:

>James Colannino wrote:
>
>
>>I have a quick question about C and external variables; let's say that
>>in one file I have the following:
>>
>>typedef struct thing_ {
>>    variables;
>>} thing;
>>
>>Now let's say that I want to declare a variable in another file with
>>that external type. How would I declare it? Would I have to do another
>>typedef in the other file, such as:
>>
>>typedef extern thing thing;
>>
>>I'm just guessing here, so I know I'm most likely wrong. Thanks in
>>advance.
>>
>>
>
>extern is a modifier on the variable, not the type. Normally the type
>definition goes in a header that is included by the "other file".
>
>

Now I understand.

>So, if I have a type defined in "type_defined.h" that I want to have
>shared between multiple object files, then any file that uses that type
>should '#include "type_defined.h"'. No extern necessary.
>
>

That's what I'll be doing then. Thanks for the help.

James

From ddjolley at gmail.com Sat Aug 27 11:52:43 2005
From: ddjolley at gmail.com (Doug Jolley)
Date: Sat, 27 Aug 2005 14:52:43 -0400
Subject: [OCLUG-devel] warning: incompatible implicit declaration of built-in function blah
Message-ID:

I am persistently trying to learn C. It is a struggle for me. I had a
program written on a FC2 machine and it compiled just fine and seemed
to work OK as well. For reasons beyond my control, I had to move it to
a FC4 machine to continue my development. Now when I try to compile the
program I get 3 warnings of the type 'warning: incompatible implicit
declaration of built-in function whatever'. In my case the functions
are strlen, strcpy, and strcat. Here are the lines involved:

char *path2CountFile=(char*)malloc(strlen(cgi_val(entries,"__dataDir"))+12);
strcpy(path2CountFile,cgi_val(entries,"__dataDir"));
strcat(path2CountFile,"/count.dat");

I did some googling. One suggestion that I came across was to include
stdlib.h. That didn't work; but, even if it had, I don't see why it
should be suddenly required just because I switched machines. I did
encounter one comment from an individual stating that he was having
this problem on FC4; but, from the context it sounded to me like his
problems were originating elsewhere.

Anyway, I'd really like to understand why I'm suddenly encountering
this problem. I'd certainly like to know how to fix it. I'm beginning
to wonder if I need to move back to an FC2 machine for some reason. If
I do have to move back, I'd sure like to know why I'm making the move.

Thanks for any input.

... doug

From ddjolley at gmail.com Mon Aug 29 11:17:43 2005
From: ddjolley at gmail.com (Doug Jolley)
Date: Mon, 29 Aug 2005 14:17:43 -0400
Subject: [OCLUG-devel] Single Element Structures
Message-ID:

Hi --

I am a real newbie to C and in way over my head on a project that I'd
really like to get to work. I'm thinking that the thing to do may be to
try to get some help at the next OCLUG meeting; but, in the meantime
I'm going to try to muddle along as best I can.

To continue my muddling, I have encountered the following type
definition:

typedef struct {
    node* head;
} llist;

My question is: Why would anyone go to the trouble of defining a
structure with only one element in it? I thought the essence of
structures was that they contained a collection of dissimilar data
elements. I'm sure the guy that did this had a very good reason and I
realize that I'm asking a question without giving a lot of context; so,
from what I've given there may be no answer. Anyway, if an answer may
be deduced from what I've given, I'd love to hear it.

Thanks for any input.

... doug