I have a C++ program that processes a set of files with the same prefix (e.g. file0, file1, file2, etc.). When I run the program (on Linux systems) I usually pass the prefix as a command-line argument:
myscript file*
This processes all the files (within the folder) that have the prefix file. The C++ script includes a for loop such as:
for(i=1;i<argc;i++) {
//do something
}
I'm not a C++ expert and I don't know how * is processed. Now, how could I pass a subset of files (e.g. from file0 to file10 or from file20 to file35) to the C++ program? How can I use shell commands to list a subset of files?
Assuming you are running on a Linux-like system, the * is evaluated by the shell before your program is executed (which, by the way, is called a program, not a script, as it first has to be compiled before execution).
So the shell expands the * to match everything. This means you should modify how you call the program, rather than modifying the code. For example, file0* would match anything beginning with file0.
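To make the point concrete, here is a minimal sketch of what the program actually receives. The helper name collect_file_args is hypothetical; it only assumes the usual argc/argv convention, where the shell has already replaced file* with the matching names.

```cpp
#include <string>
#include <vector>

// Collect the file names the shell passed in. argv[0] is the program name,
// so the real arguments are argv[1] .. argv[argc-1]. The program never sees
// the '*' itself, only the names the shell expanded it to.
std::vector<std::string> collect_file_args(int argc, const char* const* argv) {
    std::vector<std::string> files;
    for (int i = 1; i < argc; ++i) {
        files.emplace_back(argv[i]);
    }
    return files;
}
```

Called from int main(int argc, char** argv) as collect_file_args(argc, argv), running "myscript file*" in a folder containing file0, file1 and file2 would yield those three names.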
Chances are good you are working with a bash terminal, in which case you should be looking for command-line help. The GNU project publishes a great book called "Introduction to the Command Line" (http://shop.fsf.org/product/Introduction_to_Command_Line/) which is released under the GPL and freely available. You might enjoy it.
You might like to be aware of two things about your line for(i=2; i<argc; i++).
First, you are setting i to two, which skips over the first command-line parameter. argv is an array where the 0th element is the command itself, and all options are in elements 1 to argc-1. If you are intentionally skipping the first argument, then that's fine.
The second is a pretty small one, but it's a good idea to get into the habit of preferring the prefix increment operation (++i) over the postfix. It won't make a difference on a simple integer, but in some cases using the prefix operator results in more efficient code (by avoiding an unnecessary temporary). Since the prefix operator is just as readable as the postfix, you lose nothing by getting into the habit of always using the prefix operator, unless you really need the postfix one. This is discussed quite well in, for example, Items 8 and 9 (don't optimize prematurely; don't pessimize prematurely) of Sutter and Alexandrescu's C++ Coding Standards.
Basile is right, the C++ program sees only real file names. The sequence of file names passed to the program is the result of the shell's file name expansion: in a directory with files a1, a2, a3 and a11, a command like echo a[0-9] would result in "a1 a2 a3".
Bash file name patterns (globs) are not true regular expressions, so you would need to pipe the output of ls through grep in order to get all files named f1...f100 or so (with different number lengths). Example: ls | grep -E 'file[0-9]+' (egrep is the older spelling of grep -E).
A program "my_executable" would get the result on the command line with something like
my_executable $(ls a* | grep -E 'a[0-9]+$')
Putting a command inside $() replaces $() with the output of that command.
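If the selection ever had to happen inside the program rather than in the shell, the same idea could be sketched in C++ roughly as follows. This is only an illustration, not what the answer recommends: select_numbered is a hypothetical helper, and it assumes the prefix contains no regex metacharacters and that the numeric suffix has at most nine digits.

```cpp
#include <regex>
#include <string>
#include <vector>

// Keep only names of the form <prefix><digits> whose number lies in [lo, hi].
// The pattern plays the role of the grep expression 'a[0-9]+$' above;
// std::regex_match requires the whole name to match.
std::vector<std::string> select_numbered(const std::vector<std::string>& names,
                                         const std::string& prefix,
                                         int lo, int hi) {
    // Assumption: prefix has no regex metacharacters; at most 9 digits so
    // that std::stoi cannot overflow an int.
    const std::regex pattern(prefix + "([0-9]{1,9})");
    std::vector<std::string> selected;
    for (const auto& name : names) {
        std::smatch m;
        if (std::regex_match(name, m, pattern)) {
            const int n = std::stoi(m[1].str());
            if (n >= lo && n <= hi) {
                selected.push_back(name);
            }
        }
    }
    return selected;
}
```

With the names file0, file5, file10, file11 and fileX, a call such as select_numbered(names, "file", 0, 10) keeps file0, file5 and file10.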
Hope that helps.
I want to learn how I can find the source code of __builtin functions such as __builtin_cbrt() in C++. If you know, can you help me?
I want to learn how I can find
First become acquainted with the language you are working with: learn the C and C++ programming languages. Learn about tools like git and autotools and the environment around these programming languages. Become familiar with the tool set needed for browsing files: at least grep, but I recommend its (way) faster alternatives, "the silver searcher" ag or ack, and be aware of tools like ctags or GNU Global.
Then research. GNU projects are open source and it's very easy to find their source code; nowadays they are even mirrored on GitHub.
Then it's just a "feeling" or "experience". Surely a project will have builtin functions in a file named "builtins.c" or something similar. Be curious, reasonable and inventive. If you wanted to add a builtin function to a codebase, where would you put it? Become familiar with the project structure you are working with. And expect big and old projects to have stuff scattered all over the place.
First I find the gcc sources: builtins.def has an entry with BUILT_IN_CBRT and "cbrt", and builtins.c has some references to BUILT_IN_CBRT.
After cloning the gcc repository I scan for BUILT_IN_CBRT macro name. Browsing the code leads me to CASE_CFN_CBRT macro name, which leads me to fold-const-call.c:
CASE_CFN_CBRT:
return do_mpfr_arg1 (result, mpfr_cbrt, arg, format);
By the name of the file fold-const-call.c I suspect this part of code is taken only when folding a constant call.
From there I can search Google for the mpfr_cbrt symbol, which leads me to the GNU MPFR library. I find a clone of the MPFR library on GitHub and search for a file named cbrt; I find cbrt.c, which contains the source of mpfr_cbrt(). This is the code that will be called and will compute the cube root of a number when __builtin_cbrt is folded inside a constant expression, I suspect.
When not in a constant expression, I suspect that fold_const_call_ss (https://code.woboq.org/gcc/gcc/fold-const-call.c.html#_ZL18fold_const_call_ssP10real_value11combined_fnPKS_PK11real_format) is not called at all; instead, some fold_const_* function tells gcc that the expression cannot be constant-folded, so a call to the actual cbrt() standard function is generated.
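The two cases described above can be tried with a small sketch like the following. The function names are hypothetical, and the code assumes GCC or a compatible compiler such as Clang, since __builtin_cbrt is a compiler extension.

```cpp
#include <cmath>

// Constant argument: the compiler can evaluate the cube root at compile
// time (constant folding), so no call needs to remain in the output.
double cbrt_of_constant() {
    return __builtin_cbrt(27.0);
}

// Run-time argument: there is nothing to fold, so the compiler typically
// emits a call to the C library's cbrt().
double cbrt_of_variable(double x) {
    return __builtin_cbrt(x);
}
```

Compiling this with g++ -O2 -S and reading the generated assembly is one way to check whether a call to cbrt remains in each function.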
Say I have a C++ program with 100 functions, and each function with 100 local variables, and each of them is an array, maybe 1D, maybe 2D, maybe 3D, maybe dynamically allocated.
Now I'm debugging the program and I have to check if all the variables are correct. Now I simply fprintf() them to their own files, and then check the data in the files. But I have to write many many fprintf() statements and fopen(), fclose() statements in the program, which is quite ugly.
Is there any better way or tool that can simplify and possibly automate this stuff?
You can use a debugger for that, but you'll have to check everything on your own.
If you want to check everything automatically, just write unit tests and run them.
and each function with 100 local variables
There's your problem. Cut that down so that each function is 100 lines (even then it's still too much!) and you'll have a fighting chance.
Create a global log file and open/close it once.
Debug printing is a powerful tool, but I suppose you'll also need a tool (one you write yourself) to compare the result files.
First, as @UKMonkey already said, your functions shouldn't have 100 local variables. A common guideline is to keep functions to at most about 25 lines, with at most 80 characters per line. That will make it easier for you to debug and for others to understand your code.
Furthermore, if you use Linux or another Unix-like system, you can use GDB for debugging. Just compile your app with the -g flag passed to gcc/g++ and run it under GDB.
$ g++ -g example.cpp -o example.out
$ gdb ./example.out
There you can add breakpoints and print the values of your variables. Read the GDB manual for more details.
I've recently learned about command line arguments and I understand how to use them. But I just don't get why I should use them at all. I mean, you could use any normal variable to do the same job as a command line argument.
Could someone explain or give a scenario of how a command line argument could be essential to a program?
edit myfile.txt
You could always make an editor that edits one specific file, but it would make more sense if the user was able to tell you which file they wanted to edit. Command-line arguments are one way of doing this.
The purpose of a command line argument is to allow you to pass information into your program without hard coding it into the program. For example
Foo -pages:10
Foo -pages:20
Here we've passed information into the program (in this case a pages setting). If you set a variable in your program you'd have to recompile it every time you wanted to change it!
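As a rough sketch of how a program like Foo might read that setting, assuming the -pages:N syntax shown above. The helper parse_pages and its fallback parameter are hypothetical, and the parsing is deliberately simple.

```cpp
#include <cstdlib>
#include <cstring>

// Look for an argument of the form "-pages:N" and return N.
// Returns `fallback` when no such argument is present. std::atoi yields 0
// for a non-numeric value; real code would validate the input more strictly.
int parse_pages(int argc, const char* const* argv, int fallback) {
    const char* key = "-pages:";
    const std::size_t key_len = std::strlen(key);
    for (int i = 1; i < argc; ++i) {  // argv[0] is the program name
        if (std::strncmp(argv[i], key, key_len) == 0) {
            return std::atoi(argv[i] + key_len);
        }
    }
    return fallback;
}
```

Called from main as parse_pages(argc, argv, 1), the same executable gives 10 for "Foo -pages:10" and 20 for "Foo -pages:20", with no recompilation in between.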
It means you don't have to edit the program to change something in it.
Say your program processes all files in a folder to remove icon previews. You could hardcode the folder to process in the program. Or you could specify it as a commandline argument.
I use this example because it describes a bash script I use on my Mac at home.
Automation.
You cannot script or use an application/tool in a headless (or unmanned) environment if you require interactive user input.
You can use "config files" and write and read from temporary files, but this can become cumbersome quickly.
Driving the application.
Almost every non-trivial application has some variation in what or how it does something; there is some level of control that can and must be applied. Similar to functions, accepting arguments is a natural fit.
The command line is a natural and intuitive environment; supporting and using a command line allows for better and easier adoption of the application (or tool).
A GUI can be used, sure, but unless your plan is to only support GUI environments and only support input via the GUI, the command line is required.
Consider echo, which repeats its arguments—it could hardly work without them.
Or tar—how could it tell whether to extract, create, decompress, list, etc. etc. without command line arguments?
Or git, with its options to pull, push, merge, branch, checkout, fetch, ...
Or literally any UNIX program except maybe true and false (although the GNU versions of those do take arguments.)
There are countless applications for passing arguments to main. For example, let's say you are a developer and you've designed an application for processing images. Normal users need only to pass the names of the images to your application for processing. The actual source files of your application are not available to them or they are probably not programmers.
I'm looking for a way to search for a given term in a project's C/C++ code, while ignoring any occurrences in comments and strings.
As the code base is rather large, I am searching for a way to automatically identify the lines of code matching my search term, as they need manual inspection.
If possible I'd like to perform the search on my linux system.
background
the code base in question is a realtime signal processing engine with a large number of 3rd party plugins. plugins are implemented in a variety of languages (mostly C, but also C++ and others; currently I only care for those two), no standards have been enforced.
our code base currently uses the built-in type float for floating-point numbers and we would like to replace that with a typedef that would allow us to use doubles.
we would like to find all occurrences of float in the actual code (ignoring legit uses in comments and printouts).
What complicates things further is that there are some (albeit few) legit uses of float in the code payload (so we are really looking for a way to identify all places that require manual inspection, rather than run some automatic search-and-replace).
the code also contains C-style static casts to (float), so relying on compiler warnings to identify type mismatches is often not an option.
the code base consists of more than 3000 (C and C++) files accumulating about 750000 lines of code.
the code is cross-platform (linux, osx, w32 being the main targets; but also freebsd and similar), and is compiled with the various native compilers (gcc/g++, clang/clang++, VisualStudio,...).
so far...
so far I'm using something ugly like:
grep "\bfloat\b" | sed -e 's|//.*||' -e 's|"[^"]*"||g' | grep "\bfloat\b"
but I'm thinking that there must be some better way to search only payload code.
IMHO there is a good answer to a similar question at "Unix & Linux":
grep works on pure text and does not know anything about the
underlying syntax of your C program. Therefore, in order not to search
inside comments you have several options:
Strip C-comments before the search, you can do this using gcc
-fpreprocessed -dD -E yourfile.c For details, please see Remove comments from C/C++ code
Write/use some hacky half-working scripts like you have already found
(e.g. they work by skipping lines starting with // or /*) in order to
handle the details of all possible C/C++ comments (again, see the
previous link for some scary testcases). Then you still may have false
positives, but you do not have to preprocess anything.
Use more advanced tools for doing "semantic search" in the code. I
have found "coccigrep": http://home.regit.org/software/coccigrep/ This
kind of tool allows searching for specific language statements
(e.g. an update of a structure with a given name) and certainly
drops the comments.
https://unix.stackexchange.com/a/33136/158220
Although it doesn't completely cover your "not in strings" requirement.
It might practically depend upon the size of your code base, and perhaps also on the editor you are usually using. I am suggesting to use GNU emacs (if possible on Linux with a recent GCC compiler...)
For a small to medium size code (e.g. less than 300KLOC), I would suggest using the grep mode of Emacs. Then (assuming you have bound the next-error Emacs function to some key, perhaps with (global-set-key [f10] 'next-error) in your ~/.emacs...) you can quickly scan every occurrence of float (even inside strings or comments, but you'll skip very quickly such occurrences...). In a few hours you'll be done with a medium sized source code (and that is quicker than learning how to use a new tool).
For a large sized code (millions of lines), it might be worthwhile to customize some static analysis tool or compiler. You could use GCC MELT to customize your GCC compiler on Linux. Its findgimple mode could be inspirational, and perhaps even useful (you probably want to find all Gimple assignments targeting a float)
BTW, you probably don't want to replace all occurrences -but only most of them- of the float type with double (probably suitably typedef-ed...), because very probably you are using some external (or standard) functions requiring a float.
The CADNA tool might also be useful, to help you estimate the precision of results (so help you deciding when using double is sensible).
Using semantical tools like GCC MELT, CADNA, Coccinelle, Frama-C (or perhaps Fluctuat, or Coccigrep mentioned in g0hl1n's answer) would give more precise or relevant results, at the expense of having to spend more time (perhaps days!) in learning and customizing the tool.
The robust way to do this should be with cscope (http://cscope.sourceforge.net/) in line-oriented mode using the find this C symbol option but I haven't used that on a variety of C standards so if that doesn't work for you or if you can't get cscope then do this:
find . -type f -print |
while IFS= read -r file
do
sed 's/a/aA/g; s/__/aB/g; s/#/aC/g' "$file" |
gcc -P -E - |
sed 's/aC/#/g; s/aB/__/g; s/aA/a/g' |
awk -v file="$file" -v OFS=': ' '/\<float\>/{print file, $0}'
done
The first sed rewrites every a as aA (so that the sequences aB and aC cannot already occur in the input), then replaces all __ and hash (#) symbols with the unique identifier strings aB and aC, so that the preprocessor doesn't do any expansion of #include, etc. but we can restore them after preprocessing.
The gcc step preprocesses the input to strip out comments.
The second sed reverses the first one: it replaces the identifier strings that we previously added with the actual hash signs, __ symbols and a characters.
The awk actually searches for float within word-boundaries and if found prints the file name plus the line it was found on. This uses GNU awk for word-boundaries \< and \>.
The 2nd sed's job COULD be done as part of the awk command but I like the symmetry of the 2 seds.
Unlike if you use cscope, this sed/gcc/sed/awk approach will NOT avoid finding false matches within strings but hopefully there's very few of those and you can weed them out while post-processing manually anyway.
It will not work for file names that contain newlines - if you have those you can put the body in a script and execute it as find .. -print0 | xargs -0 script.
Modify the gcc command line by adding whatever C or C++ version you are using, e.g. -ansi.
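For comparison, the "ignore comments and strings" part can also be sketched as a small hand-written scanner. This is an illustrative sketch only, not a replacement for cscope or the preprocessor: the function names are hypothetical, and it does not handle raw string literals, trigraphs, backslash-newline continuations inside // comments, or C++14 digit separators such as 1'000.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Replace comments and string/char literals with spaces, keeping newlines
// so that line numbers stay valid.
std::string blank_comments_and_strings(const std::string& src) {
    enum class State { Code, LineComment, BlockComment, String, Char };
    State state = State::Code;
    std::string out = src;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const char c = src[i];
        const char next = (i + 1 < src.size()) ? src[i + 1] : '\0';
        switch (state) {
        case State::Code:
            if (c == '/' && next == '/') { state = State::LineComment; out[i] = ' '; }
            else if (c == '/' && next == '*') {
                state = State::BlockComment; out[i] = ' '; out[i + 1] = ' '; ++i;
            }
            else if (c == '"') { state = State::String; out[i] = ' '; }
            else if (c == '\'') { state = State::Char; out[i] = ' '; }
            break;
        case State::LineComment:
            if (c == '\n') state = State::Code; else out[i] = ' ';
            break;
        case State::BlockComment:
            if (c == '*' && next == '/') {
                out[i] = ' '; out[i + 1] = ' '; ++i; state = State::Code;
            } else if (c != '\n') out[i] = ' ';
            break;
        case State::String:
        case State::Char:
            if (c == '\\' && next != '\0') {   // skip the escaped character
                out[i] = ' ';
                if (next != '\n') out[i + 1] = ' ';
                ++i;
            } else {
                const char quote = (state == State::String) ? '"' : '\'';
                if (c == quote) state = State::Code;
                if (c != '\n') out[i] = ' ';
            }
            break;
        }
    }
    return out;
}

bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// 1-based line numbers on which `word` (assumed non-empty) occurs as a
// whole identifier outside comments and literals.
std::vector<int> lines_with_word(const std::string& src, const std::string& word) {
    const std::string code = blank_comments_and_strings(src);
    std::vector<int> lines;
    int line = 1;
    std::size_t counted = 0;  // newlines before this offset are already counted
    for (std::size_t pos = code.find(word); pos != std::string::npos;
         pos = code.find(word, pos + 1)) {
        const std::size_t end = pos + word.size();
        const bool left_ok = pos == 0 || !is_ident_char(code[pos - 1]);
        const bool right_ok = end >= code.size() || !is_ident_char(code[end]);
        for (; counted < pos; ++counted) if (code[counted] == '\n') ++line;
        if (left_ok && right_ok && (lines.empty() || lines.back() != line))
            lines.push_back(line);
    }
    return lines;
}
```

Reading a file into a std::string and calling lines_with_word(contents, "float") would list the lines that need manual inspection. Unlike the sed/gcc/sed/awk pipeline, this sketch also skips matches inside string literals, at the cost of the limitations listed above.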
Sometimes I am reading some code and would like to find the definition for a certain symbol, but it is sprinkled throughout the code to such an extent that grep is more or less insufficient for pointing me to its definition.
For example, I am working with Zlib and I want to figure out what FAR means.
Steven@Steven-PC /c/Users/Steven/Desktop/zlib-1.2.5
$ grep "FAR" * -R | wc -l
260
That's a lot to scan through. It turns out it is in fact #defined to nothing but it took me some time to figure it out.
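For readers who have not met an empty macro before, this is roughly what "#defined to nothing" means in practice. The typedef name below is hypothetical and only illustrates the pattern; it is not copied from zlib.

```cpp
#include <type_traits>

// Assumption for illustration: FAR is defined to nothing, as reported
// above for this zlib build.
#define FAR

// After preprocessing this line reads: typedef unsigned char Bytef_demo;
typedef unsigned char FAR Bytef_demo;  // hypothetical name

// The macro leaves no trace in the resulting type.
static_assert(std::is_same<Bytef_demo, unsigned char>::value,
              "FAR expands to nothing");
```

This is why every one of the 260 grep hits compiles as if the word FAR were simply not there.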
If I was using Eclipse I would have it easy because I can just hover over the symbol and it will tell me what it is.
What kinds of tools out there can I use to analyze code in this way? Can GCC do this for me? clang maybe? I'm looking for something command-line preferably. Some kind of tool that isn't a full fledged IDE at any rate.
You may want to check out cscope, it's basically made for this, and a command line tool (if you like, using ncurses). Also, libclang (part of clang/llvm) can do so - but that's just a library (but took me just ~100 lines of python to use libclang to emulate basic cscope features).
cscope requires you to build a database first. libclang can parse code "live".
If the variable is not declared in your current file, it is declared in an included file, i.e. a .h file. So you can limit the amount of data by running grep only on those files.
Moreover, you can filter whole word matches with -w option of grep.
Try:
grep -Rw --include='*.h' "FAR" . | wc -l
Our Source Code Search Engine (SCSE) is a kind of graphical grep that indexes a large code base according to the tokens of its language(s) (e.g., C, Java, COBOL, ...). Queries are stated in terms of the tokens, not strings, so finding an identifier won't find it in the middle of a comment. This minimizes false positives, which in a big code base can be a serious waste of time. Found hits are displayed one per line; a click takes you to the source text.
One can do queries from the command line and get grep-like responses, too.
A query of the form of
I=foo*
will find all uses of any identifier that starts with the letters "foo".
Queries can combine multiple tokens:
I=foo* '[' ... ']' '='
finds assignments to a subscripted foo ("..." means "near").
For C, Java and COBOL, the SCSE can find reads, writes, updates, and declarations of variables.
D=*baz
finds declarations of variables whose names end in "baz". I think this is what OP is looking for.
While SCSE works for C++, it presently can't find reads/writes/updates/declarations in C++. It does everything else.
The SCSE will handle mixed languages with aplomb. An "I" query will search across all languages that have identifiers, so you can see cross-language calls relatively easily, since the source and target identifiers tend to be the same for software engineering reasons.
gcc can output the preprocessing result, including all macro definitions, with gcc -E -dD. The output file can be rather large, often because of nested system headers, but the first appearance of a symbol is usually its declaration (definition). The output uses line markers (lines of the form # linenum "filename") to show which source or header file each part of the preprocessed result comes from, so you can find where the symbol was originally declared.
To get the exact result from when the file is compiled, you may need to add all the other parameters used to compile the file, like -I, -D, etc. In fact, I always copy the actual compilation command line, add -E -dD to the beginning, and add (or change) -o in case I accidentally overwrite anything.
There is gccxml, but I am not aware of tools that build on top of it. clang and LLVM are suited for such stuff, too; equally, I am not aware of standalone tools that build on them.
Apart from that: QtCreator and Code::Blocks can find the declaration, too.
So what is it about a "full fledged IDE" you don't want? If it's a little speed, I found NetBeans somewhat useful when I was in school, but really for power and speed and general utility I would recommend emacs. It has keyboard shortcuts for things like this. Keep in mind, it's a learning curve to be sure, but once you are over the hump there is no going back.