Topic: Linux Pipes (Read 175 times)
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Linux Pipes
« on: January 28, 2004, 08:03:34 PM »
|
|
I have a problem/challenge:
The story is this. I am currently taking a second-year software class titled 'Principles of Software Engineering', which I should have taken last year but, due to some timetable shuffling, didn't. The course is basically meant to teach good coding and software practices. We get programming assignments that serve no real purpose other than to apply some of the practices that are taught (size estimation, PSP, coupling and cohesion, etc.). A program that is due a week and a half from now is supposed to open a file that contains some C code and count the lines of code. The only criterion is that it counts the lines ending in a semicolon or a closing curly bracket, and ignores anything that is commented out.
Now that I have some limited knowledge of pipes, and know that there exist tools that will do some of this for me, I'd feel like a chump going through the file character by character like the rest of the unix-illiterate class. (We work in a lab that runs Debian and KDE, but most of the class just fires up WinXP on VMware to program.)
At this point, all I know is that sorting through files can be done with piping and little else.
For example, this counts how many times each unique word appears in a text file, insensitive to case:
cat infile | tr ' ' '\012' |tr '[A-Z]' '[a-z]' | sort | uniq -c
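As a rough sketch, the same pipeline can be wrapped in a shell function. The name `word_freq` is made up for illustration, and two details differ from the command above: `tr -s` is used so that runs of spaces do not produce empty words, and the POSIX character classes stand in for the `[A-Z]`/`[a-z]` ranges.

```shell
# word_freq FILE: print each distinct word with its occurrence count,
# ignoring case. (Hypothetical helper name, not a standard tool.)
word_freq() {
    tr -s ' ' '\n' < "$1" |              # one word per line; -s squeezes repeats
        tr '[:upper:]' '[:lower:]' |     # fold everything to lower case
        sort |                           # bring identical words together
        uniq -c                          # prefix each word with its count
}
```

Piping the result through `wc -l` would then give the number of distinct words, which is probably what "unique words" was after.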
What I need is a few commands that I can put together and throw into the system() function, and that might solve this problem (or at least some of it). I'd be the coolest dude there if I could estimate my program size to be 5 lines and actually hit it.
Any help on this from you leet comp sci guys would be greatly appreciated.
|
|
Logged
|
|
|
|
Porter
[Wumpa]
Board Admin
Karma: +176/--88
Offline
Gender:
Posts: 3910
|
|
Re:Linux Pipes
« Reply #1 on: January 28, 2004, 09:20:33 PM »
|
|
Does it have to be C? Can you use Perl instead? Jeez, you could do this with one or two regular expressions and a counter in Perl. Five lines wouldn't be far off. C is another story though. Seriously, if I was given this problem, I'd be choosing Perl over a C program --OR-- command line utilities. It's the best suited tool for parsing text. Hell, that's what it was designed to do!
|
|
Logged
|
[Wumpa] Porter --Silent, professional, lethal... sometimes.
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #2 on: January 28, 2004, 11:25:35 PM »
|
|
Unfortunately, it has to be done in C or Java. I could probably get away with C++, but that wouldn't do much good. I'm doing it in C because I need the practice.
The idea of the assignment is to just occupy us for a while writing a moderate chunk of code so we can record a bunch of PSP garbage. I'm trying to find a way around the system.
If this way isn't feasible, it's no big problem; I'll just have to plug away at it for a while.
|
|
Logged
|
|
|
|
Porter
[Wumpa]
Board Admin
Karma: +176/--88
Offline
Gender:
Posts: 3910
|
|
Re:Linux Pipes
« Reply #3 on: January 29, 2004, 08:30:36 AM »
|
|
Oh it's feasible, but it's also equivalent to going out and hunting for your own food with a spear when there's a McDonald's down the street. One of the fundamental principles of Computer Science is to always do the least amount of work necessary. This doesn't mean be lazy, it means don't waste your time doing something that has already been done (or can be done an easier way).
That being said: Perl.
Perl Perl Perl.
|
|
Logged
|
[Wumpa] Porter --Silent, professional, lethal... sometimes.
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #4 on: January 29, 2004, 12:22:04 PM »
|
|
Personally, I would definitely consider hunting before eating at McDonald's if I were hungry and those were the only two options.
I fully understand the idiocy of doing it the chump way. We SoftE's are into re-use just as much as (if not more so than) Comp Sci. But unfortunately, the assignment is designed to waste my time for purposes other than just gaining coding skill. My problem, more clearly, is this:
1) I am aware of the existence of tools (like grep) that can do some of the manipulation.
2) I do not know their names, or a resource that would let me look them up.
3) Once I have them, it would be easy from within C to fork and execute the processes.
4) I was never formally trained or taught any of my unix/linux knowledge (thanks to my incompetent engineering department, who think chemistry, physics, thermodynamics, rigid body statics/dynamics, etc. are more important and relevant to software design than some basic unix know-how), so I apologise for sounding strange with this plea for help.
|
|
Logged
|
|
|
|
Porter
[Wumpa]
Board Admin
Karma: +176/--88
Offline
Gender:
Posts: 3910
|
|
Re:Linux Pipes
« Reply #5 on: January 29, 2004, 12:53:51 PM »
|
|
Why not call a quick Perl script from a C program? If you're looking to waste your time, that would be a good bet for me. Perl is almost a necessary language to know for all computer programmers-- it really is worth your time to get familiar with it, even if now isn't the time for it.
As for searching for tools, hopefully you are familiar with man pages. The tool to search man pages is called apropos. Give it a term, and it will display all the man page listings it can find that match your query.
Slight might have more specific solutions for you than I do.
|
|
Logged
|
[Wumpa] Porter --Silent, professional, lethal... sometimes.
|
|
|
slightcrazed
-TWB-
Admin Team CSR Connoisseur
Karma: +65/-7
Offline
Gender:
Posts: 983
|
|
Re:Linux Pipes
« Reply #6 on: January 29, 2004, 09:04:36 PM »
|
|
Puh-LEEZE.
cat file | grep -v '#' | wc -l
replace the # with whatever is used for comments in the file you are looking at. This tells grep to output every line that does not have a '#' (i.e. every line NOT commented out) and then pipe the output to wc. The -l option with wc tells it to output the number of lines to stdout (or you can > to a file or variable so that you can use it later).
slight
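For C sources the comment marker would be `//`. A minimal sketch, assuming only whole-line `//` comments should be dropped: the pattern is anchored with `^` (allowing leading whitespace) so that a line such as `i++; // note` is still counted. The function name is hypothetical.

```shell
# count_uncommented FILE: count lines that are not whole-line // comments.
# (Hypothetical helper; blank lines are still counted here.)
count_uncommented() {
    grep -v '^[[:space:]]*//' "$1" |   # drop lines whose first non-blank text is //
        wc -l |                        # count what is left
        tr -d ' '                      # some wc implementations pad with spaces
}
```

Without the anchor, `grep -v '//'` would also throw away every code line that merely has a trailing comment.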
|
|
Logged
|
I once beat Drizzt Do'Urden at thumb wrestling.
|
|
|
Porter
[Wumpa]
Board Admin
Karma: +176/--88
Offline
Gender:
Posts: 3910
|
|
Re:Linux Pipes
« Reply #7 on: January 29, 2004, 09:39:56 PM »
|
|
Okay, now recognize semi-colons and curly braces as the "line counter" instead of the "\n" char, but don't count comments, including c-style multi-line comments...
/* like this */
In other words, don't count the literal number of lines, but the number of C statements, delimited by semicolons or curly braces, and excluding multi-line comments.
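One way to read that stricter rule is "count every `;`, `{` and `}` character that sits outside a comment". The sketch below does that with awk and tr. Both function names are invented for illustration, and it assumes comment markers never appear inside string literals.

```shell
# strip_comments FILE: print FILE with // and /* ... */ comments removed.
# (Hypothetical helper; string literals are not handled.)
strip_comments() {
    awk '
    {
        line = $0; out = ""
        while (length(line) > 0) {
            if (in_block) {                       # currently inside a block comment
                i = index(line, "*/")
                if (i == 0) { line = "" }
                else { line = substr(line, i + 2); in_block = 0 }
            } else {
                b = index(line, "/*"); s = index(line, "//")
                if (s > 0 && (b == 0 || s < b)) { # line comment comes first
                    out = out substr(line, 1, s - 1); line = ""
                } else if (b > 0) {               # block comment opens here
                    out = out substr(line, 1, b - 1)
                    line = substr(line, b + 2); in_block = 1
                } else { out = out line; line = "" }
            }
        }
        print out
    }' "$1"
}

# count_delims FILE: count ; { } characters that survive comment stripping.
count_delims() {
    strip_comments "$1" | tr -cd ';{}' | wc -c | tr -d ' '
}
```

Note that under this reading a `for(i=0;i<50;i++){` header contributes three delimiters on its own, so the rule would probably need refining before it matched anyone's intuition of a "statement".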
|
|
Logged
|
[Wumpa] Porter --Silent, professional, lethal... sometimes.
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #8 on: January 29, 2004, 11:24:39 PM »
|
|
One beauty of this assignment is that the specifics of what counts as a line and what is valid input are up to me, so the program doesn't have to be perfect. Multi-line comments seem like they might be too difficult to wipe out, so I can neglect those. Perhaps this could be clarified better with examples:
// this line should not be counted
|
|
/* I would not worry about this case*/
|
|
for(i=0;i<50; i++) { foo+=bar; } //this counts as 3 lines or possibly 2 depending on what is easier
|
|
And of course blank space should not be counted. I think it is starting to come together for me from the last two posts. I'll have to look at it a bit tomorrow since my head is a bit fried right now from spending the last 12 hours in computerland.
|
|
Logged
|
|
|
|
slightcrazed
-TWB-
Admin Team CSR Connoisseur
Karma: +65/-7
Offline
Gender:
Posts: 983
|
|
Re:Linux Pipes
« Reply #9 on: January 30, 2004, 07:34:36 AM »
|
|
Hmmm.... interesting. I will have to noodle on that one for a while..... I was always taught that comments should ONLY be placed on their own line, and not appended to another line or function, but I think I can come up with something.
slight
|
|
Logged
|
I once beat Drizzt Do'Urden at thumb wrestling.
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #10 on: January 30, 2004, 09:28:01 AM »
|
|
Okay, after sleeping on it I have come up with some pseudocode for what needs to be done, which hopefully is in the realm of possibility. I think the only situation my solution will not deal with is when there are comment lines that end in ';', '{' or '}', but I think that falls into the category of things I can neglect, since the absolute correctness of this program will not be tested. What it will do is deal with cases like this: (sorry for forgetting to give an example of it in my previous post, I was tired.)
// for(i=0;i<50;i++){
//foo+=bar;
//}
|
|
To deal with lines like that, I will have to do this first:
cat SomeCode.c | grep (the lines that start with '//') | grep (the lines that end with '{', '}' or ';') | wc -l (this gives the count of lines piped from the greps, right??) > tempfile.txt
I believe the above will give a count of the lines of code that are commented out and store it as an integer in tempfile.txt.
Next:
cat SomeCode.c | tr (replace all instances of '//', '/*' or '*/' with '\n') | grep (all lines that end in '{', '}' or ';') | wc -l > tempfile2.txt
Now, subtract the integer value in tempfile.txt from the one in tempfile2.txt to get the line count. What I still don't know is the syntax for tr, grep and wc to do this. There might even be a cleaner solution that you can come up with, for all I know. I also think it is possible to bypass the temp files by redirecting stdout to a stream within the C program. I am able to look in my book for that one though.
Once again, any help/suggestions is greatly appreciated.
Terraji
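The two-pass plan above might look roughly like this in real shell. It is only a sketch under a few assumptions: the function name is made up, trailing whitespace after the final `;`, `{` or `}` is tolerated, and awk's `gsub` stands in for tr, because tr maps single characters and cannot replace a two-character marker such as `//`. Shell variables hold the two counts, so no temp files are needed.

```shell
# count_loc_two_pass FILE (hypothetical name): commented-out code is counted
# first, then subtracted from the count of all pieces that end in ; { or }.
count_loc_two_pass() {
    # Pass 1: lines that start with // and end in ; { or }
    commented=$(grep -c '^[[:space:]]*//.*[;{}][[:space:]]*$' "$1" || true)

    # Pass 2: break lines at every //, /* and */ marker, then count the
    # resulting pieces that end in ; { or }
    total=$(awk '{ gsub(/\/\/|\/\*|\*\//, "\n"); print }' "$1" |
        grep -c '[;{}][[:space:]]*$' || true)

    echo $((total - commented))
}
```

The `|| true` is there because `grep -c` exits with a non-zero status when it counts zero matches. As the post itself anticipates, a trailing comment that happens to end in `;` (for example `i++; // done;`) would still be over-counted.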
|
« Last Edit: January 30, 2004, 09:32:16 AM by Terraji » |
Logged
|
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #11 on: January 30, 2004, 09:37:16 AM »
|
|
Also, I have just thought of another case where it would fail:
/* for(i=0;i<50;i++){ foo+=bar; } */
If this proves difficult to deal with, I'll just assume that it will always be done like this in the input:
// for(i=0;i<50;i++){
//foo+=bar;
//}
|
|
Logged
|
|
|
|
slightcrazed
-TWB-
Admin Team CSR Connoisseur
Karma: +65/-7
Offline
Gender:
Posts: 983
|
|
Re:Linux Pipes
« Reply #12 on: January 30, 2004, 10:03:24 AM »
|
|
Personally I would do it by grepping for each of the characters that you need to find, and then throwing out the lines that are identical (i.e. a line that has both a } and a // would be found twice, so you need to discount one of the lines), OR you could use a multiple-grep line, although this might be more difficult.
So first, what characters do you need to look for on each line, and which ones would mean 'count the line' and which ones would mean 'don't count the line'?
slight
|
|
Logged
|
I once beat Drizzt Do'Urden at thumb wrestling.
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #13 on: January 30, 2004, 10:17:08 AM »
|
|
I would say that ending in a '{', '}' or a ';' is the criterion for counting a line. To not count a line: starting with a '//'.
Also, to deal with the case of
i++; // comment
is it possible to insert a newline character in front of the '//' initially, before the counting?
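Inserting the newline is possible. A minimal sketch, assuming awk is available and using a made-up function name: every `//` is pushed onto its own line first, after which the "starts with `//`" and "ends in `;`, `{` or `}`" rules can be applied line by line.

```shell
# count_code_lines FILE (hypothetical name)
count_code_lines() {
    awk '{ gsub(/\/\//, "\n//"); print }' "$1" |   # start a new line before every //
        grep -v '^//' |                            # drop the comment lines
        grep -c '[;{}][[:space:]]*$'               # count lines ending in ; { or }
}
```

The `[[:space:]]*` before `$` matters here: once `i++; // comment` is split, the code half is `i++; ` with a trailing space.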
|
|
Logged
|
|
|
|
Guardian_Tenshi
Global Moderator
Karma: +53/-26
Offline
Gender:
Posts: 1114
|
|
Re:Linux Pipes
« Reply #14 on: February 02, 2004, 02:51:16 PM »
|
|
Do you get to write the C code that you're sifting through?
If so, it would be ideal if you just wrote the code such that every comment is on a different line, such as:
j--;
//decrement j
i++;
//this code above increments i
|
|
|
|
Logged
|
|
|
|
slightcrazed
-TWB-
Admin Team CSR Connoisseur
Karma: +65/-7
Offline
Gender:
Posts: 983
|
|
Re:Linux Pipes
« Reply #15 on: February 02, 2004, 03:10:07 PM »
|
|
That would make it too easy, and my above snippet of code would work fine if that was the case. I noodled around with this one again, and the best I could come up with was the idea of saving each line of code in an array as a separate variable, then grepping each of the variables for the above criteria, and then matching the results. So if line 14 in the program is:
for(i=0;i<50;i++){ //blah blah blah
|
then you would grep the line first for //, and if it is found, then the line would be tested for ; and if it is found then the line is counted. If it's not then it's not counted.
Make sense?
slight
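One possible reading of that per-line idea, sketched as a plain shell loop rather than an array: each line has everything from the first `//` onward cut off, and whatever code remains is tested for a trailing `;`, `{` or `}`. The function name is hypothetical, and block comments are not handled.

```shell
# count_statement_lines FILE (hypothetical name): test each line on its own.
count_statement_lines() {
    count=0
    while IFS= read -r line || [ -n "$line" ]; do
        code=${line%%//*}      # keep only what precedes the first //
        # trim trailing blanks so "x; " still ends in ;
        code=$(printf '%s' "$code" | sed 's/[[:space:]]*$//')
        case $code in
            *\; | *\{ | *\}) count=$((count + 1)) ;;   # ends in ; { or }
        esac
    done < "$1"
    echo "$count"
}
```

Spawning sed once per line is slow on large files, but for an assignment-sized input that is unlikely to matter.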
|
|
Logged
|
I once beat Drizzt Do'Urden at thumb wrestling.
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #16 on: February 02, 2004, 03:14:22 PM »
|
|
Yes we do, which is fortunate since we can cook it to make our program easier to write. I'm actually going to start writing the program in a few minutes, so I think I have enough info to do it. I have just one quick question, if you know. Would
cat program.c | grep '^//' | wc -l
|
|
count the lines that start with '//' or is my syntax wrong?
|
|
Logged
|
|
|
|
Guardian_Tenshi
Global Moderator
Karma: +53/-26
Offline
Gender:
Posts: 1114
|
|
Re:Linux Pipes
« Reply #17 on: February 02, 2004, 03:37:24 PM »
|
|
yes Terraji, I believe the argument '^//' should imply starting with //. Hence any line that starts with //. Don't forget though, if you need to do large comments /* might be something to look for also.
|
|
Logged
|
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #18 on: February 02, 2004, 04:25:09 PM »
|
|
I assume it follows that ';^' means ending with a semicolon. As for the block comments, I can deal with the /* and */ similarly for the first and last line. Any ideas for how to deal with the lines in between? It's not 100% necessary to do that, but for completeness, I'd like to try.
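For the lines in between, one line-oriented option is a sed address range, which deletes everything from a line containing `/*` through the next line containing `*/`. This is a sketch with a made-up function name. It assumes a multi-line comment starts on a line of its own, since any code sharing a line with the opening `/*` or the closing `*/` is deleted too.

```shell
# strip_block_comment_lines FILE (hypothetical name)
strip_block_comment_lines() {
    # 1st expression: remove a /* ... */ comment that opens and closes on one line
    # 2nd expression: delete whole lines from an opening /* through the closing */
    sed -e 's|/\*.*\*/||g' \
        -e '/\/\*/,/\*\//d' "$1"
}
```

The one-line case is handled first because a sed range only starts looking for its end pattern on the following line, so a lone `/* note */` would otherwise swallow code up to the next `*/`.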
|
|
Logged
|
|
|
|
Porter
[Wumpa]
Board Admin
Karma: +176/--88
Offline
Gender:
Posts: 3910
|
|
Re:Linux Pipes
« Reply #19 on: February 02, 2004, 04:57:37 PM »
|
|
Nope, you use the dollar sign $ to indicate the end of a line.
Your example would be:
';$'
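Putting the two anchors side by side, as a small sketch (the function names are invented for illustration, and `grep -c` is used in place of `grep | wc -l`):

```shell
# Hypothetical helpers showing both anchors.
count_comment_lines()   { grep -c '^//' "$1"; }     # lines that START with //
count_semicolon_lines() { grep -c ';$' "$1"; }      # lines that END with ;
count_ending_lines()    { grep -c '[;{}]$' "$1"; }  # lines that end with ; { or }
```

A bracket expression such as `[;{}]` matches any one of the listed characters, so the last helper covers all three line endings from the assignment in a single pattern.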
|
|
Logged
|
[Wumpa] Porter --Silent, professional, lethal... sometimes.
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #20 on: February 02, 2004, 10:32:12 PM »
|
|
I got a preliminary version working, but unfortunately, my professor threw cold water on it. I have to do it the stupid way now. Despite the bad ending, I learned a lot, which I am sure will be helpful for me in the future.
Thank you all for your help, it was MEGA appreciated,
Terraji
|
|
Logged
|
|
|
|
Guardian_Tenshi
Global Moderator
Karma: +53/-26
Offline
Gender:
Posts: 1114
|
|
Re:Linux Pipes
« Reply #21 on: February 03, 2004, 02:50:32 AM »
|
|
hey, slightly off topic and all, but we were talking today that most unix compilers let you give command line commands in your program, apparently, it is possible for your program to "recompile" itself midway through running...of course while you're at it, you might as well just write your code all on one line.
Tenshi
|
|
Logged
|
|
|
|
Porter
[Wumpa]
Board Admin
Karma: +176/--88
Offline
Gender:
Posts: 3910
|
|
Re:Linux Pipes
« Reply #22 on: February 03, 2004, 09:46:23 AM »
|
|
gcc has more command line options than you can shake a stick at, but I don't know if you can add actual programming code to it directly. Mostly it's enabling or disabling compile-time configuration options, or making sure the system libraries you need get found. Check the man page for gcc sometime. I dare you to read all of it! If you take that dare, I'll be waiting for your next post when you finish reading sometime next week.
|
|
Logged
|
[Wumpa] Porter --Silent, professional, lethal... sometimes.
|
|
|
slightcrazed
-TWB-
Admin Team CSR Connoisseur
Karma: +65/-7
Offline
Gender:
Posts: 983
|
|
Re:Linux Pipes
« Reply #23 on: February 03, 2004, 10:38:52 AM »
|
|
gcc does have a list of options that will make you squeal like a stuck pig, but I have only used a few on a regular basis.
Glad we could help, Terraji... programming is something that needs to be done in order to be learned. You could memorize a textbook, take a test and get a perfect score, and then not be able to write the famous 'hello world' example.
slight
|
|
Logged
|
I once beat Drizzt Do'Urden at thumb wrestling.
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #24 on: February 03, 2004, 12:28:01 PM »
|
|
The thing that frustrates me the most is that the university takes the 'give the man a fish and feed him for a day' approach instead of 'give the man a fishing rod and feed him for a lifetime' (or however that proverb goes).
I hate it when I learn more from the guy sitting next to me in the lab than I do from the professors and TAs.
Oh well, 15 more months then I am out.
As for gcc, I was involved in a conversation with the lab tech yesterday and he said that to upgrade gcc properly to a new version, he runs a script that compiles the new gcc with the old gcc, then recompiles the new gcc with itself to take advantage of the new version's improvements, then, for good measure, compiles itself again using itself. Just thought that was kinda cool and that I should share it.
(I'd like to see Deuce try to wrap his head around that one)
|
« Last Edit: February 03, 2004, 12:34:30 PM by Terraji » |
Logged
|
|
|
|
slightcrazed
-TWB-
Admin Team CSR Connoisseur
Karma: +65/-7
Offline
Gender:
Posts: 983
|
|
Re:Linux Pipes
« Reply #25 on: February 03, 2004, 12:54:42 PM »
|
|
I had to do that exact process while setting up my linux-from-scratch system. It's actually a pretty common practice, but one that is not used all that often.
slight
|
|
Logged
|
I once beat Drizzt Do'Urden at thumb wrestling.
|
|
|
Porter
[Wumpa]
Board Admin
Karma: +176/--88
Offline
Gender:
Posts: 3910
|
|
Re:Linux Pipes
« Reply #26 on: February 03, 2004, 01:01:02 PM »
|
|
I'm taking a Compilers and Interpreters class right now-- we're BUILDING our own compiler from scratch. Let me tell you, there is TONS that goes into it. There are SOO many NP Complete problems in the whole process that interact with each other it's amazing programs like gcc run as fast as they do.
|
|
Logged
|
[Wumpa] Porter --Silent, professional, lethal... sometimes.
|
|
|
Miscreant
Guest
|
|
Re:Linux Pipes
« Reply #27 on: February 03, 2004, 01:08:34 PM »
|
|
It's actually a pretty common practice, but one that is not used all that often.
|
|
Might I say that, that is an excellent quote?
|
|
Logged
|
|
|
|
slightcrazed
-TWB-
Admin Team CSR Connoisseur
Karma: +65/-7
Offline
Gender:
Posts: 983
|
|
Re:Linux Pipes
« Reply #28 on: February 03, 2004, 03:25:56 PM »
|
|
What I meant was that it is a commonly suggested practice, but people seem to ignore the suggestion.
Make more sense now?
slight
|
|
Logged
|
I once beat Drizzt Do'Urden at thumb wrestling.
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #29 on: February 04, 2004, 12:48:02 PM »
|
|
There are SOO many NP Complete problems in the whole process that interact with each other it's amazing programs like gcc run as fast as they do.
|
|
The gcc people must know a thing or two about optimizing code.
|
|
Logged
|
|
|
|
Porter
[Wumpa]
Board Admin
Karma: +176/--88
Offline
Gender:
Posts: 3910
|
|
Re:Linux Pipes
« Reply #30 on: February 04, 2004, 02:20:06 PM »
|
|
It's really more about cheating.
For example, machine instructions (or assembly code) can be rearranged to leverage the fact that each instruction takes a different number of cycles to complete. Optimizing the order of the instructions for speed causes more of the CPU's internal registers to be used. Going the other way can create code that uses the fewest registers, but doesn't run nearly as quickly.
On top of just THAT conflict, the compiler has to take into account the code SURROUNDING this current chunk, so it knows which variables to keep in registers between chunks, and which ones need to be moved out to main memory. So not only is each problem really tough in itself, but working on one has a direct impact on the performance of the others.
Compilers get around this partly by allowing the user to specify which aspects of performance they value, and also by frequently taking a "good enough" approach. It's pretty interesting to study, but a real pain to implement.
|
|
Logged
|
[Wumpa] Porter --Silent, professional, lethal... sometimes.
|
|
|
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
|
|
Re:Linux Pipes
« Reply #31 on: February 04, 2004, 03:23:22 PM »
|
|
In my Computer Architecture course, I learned that those kinds of optimizations are taken care of at the architecture level and that cheating in the software isn't all that effective. The CPU actually looks ahead multiple instructions and executes them out of order to pipeline them most efficiently through the modules. The compiler then has to be designed to take advantage of the specific architecture that it is compiling for, for example by making it easier for the CPU to predict which branches are going to be taken so it can look ahead past the branch.
I suppose this is just an approach to the same problem from two separate directions, both being equally important. This is, I guess, the fundamental difference between Computer Engineering and Computer Science. (That, and Engineers get paid more.)
|
|
Logged
|
|
|
|
CSReloaded Forums | Powered by YaBB SE