Struggling with sprintf... something stupid? - c++

Sorry to pester everyone, but this has been causing me some pain. Here's the code:
char buf[500];
sprintf(buf,"D:\\Important\\Calibration\\Results\\model_%i.xml",mEstimatingModelID);
mEstimatingModelID is an integer, currently holding value 0.
Simple enough, but debugging shows this is happening:
0x0795f630 "n\Results\model_0.xml"
I.e. it's missing the start of the string.
Any ideas? This is simple stuff, but I can't figure it out.
Thanks!

In an effort to make this an actual general answer: Here's a checklist for similar errors:
Never trust what you see in release mode, especially local variables that have been allocated from stack memory. Static variables that exist in heap data are about the only thing that will generally be correct but even then, don't trust it. (Which was the case for the user above)
It's been my experience that the more recent versions of VS have less reliable release mode data (probably b/c they optimize much more in release, or maybe it's 64bitness or whatever)
Always verify that you are examining the variable in the correct function. It is very easy to have a variable named "buf" in a higher function that has some uninitialized garbage in it. This would be easily confused with the same named variable in the lower subroutine/function.
It's always a good idea to double check for buffer overruns. If you ever use a %s in your sprintf, you could get a buffer overrun.
Check your types. sprintf is pretty adaptable and you can easily get a non-crashing but strange result by passing in a string pointer when an int is expected etc.

Related

Why is printf(inputString) a security hole?

I was reading an answer on Quora where I encountered that something as simple as:
char* inputString;
printf(inputString);
is a security hole.
I assume that the inputString is not simply uninitialized, but initialized with some external input between the two statements.
How exactly is this a security hole?
The original answer on Quora was here:
If C and C++ give the best performance, why do we still code in other languages?
but it provides no additional context for this claim.
I assume that the input string is a string you got from the user, and not just an uninitialized value.
The problem is that the user can
crash the program: printf ("%s%s%s%s%s%s%s%s%s%s%s%s")
view the stack: printf ("%08x %08x %08x %08x %08x\n");
view memory on any location,
or even write an integer to nearly any location in the process memory.
This leads to an attacker being able to:
Overwrite important program flags that control access privileges
Overwrite return addresses on the stack, function pointers, etc
It is all explained quite well here.
It's not just a security problem, but it won't work at all, because the pointer is not initialized. In this context, making the program crash = not running anymore could be a (security) problem, depending what the program does and in what context it runs.
I assume you mean you have a proper string. In this case, if the string is provided by some external input (user etc.), there can be (unexpected) placeholders like %s etc. while the rest of the printf expects eg. a %d. For this example (%s instead of %d), instead of printing an integer number, it will start printing all memory content until some 0 byte then, possibly giving out some secret information stored after the int bytes.
Something similar, ie. giving out too much bytes because of wrong unchecked user input, happened eg. in the known "Heartbleed" bug not too long ago, which was/is a pretty big global problem. ... The first printf parameter should be fixed, not coming from any variable.
Other placeholder combinations are possible too, leading to a wide range of possible effects (including generation of wrong floating point signals in the CPU, which could lead to more serious problems depending on the architecture, etc.etc.)
char* inputSting;
printf(inputSting);
Printing out an uninitialized string is undefined behavior. Effects from undefined behavior can range from printing garbage values to segfaults and other nastyness. Such unpredictable patterns can be exploited and thus compromise security.
But more importantly, nobody would write this as it doesn't do anything meaningful besides risk a segfault.

Is it possible read the content of a variable with gdb?

Is it possible without any file.out and source code, but just the binary?
Is it possible, knowing the name of a var, found and read at runtime the value?
Is it possible, knowing the name of a var, found and read at runtime the value
It depends.
If the variable is a global, and the binary is not stripped, then you should be able to examine its value with a simple
x/gx &var
print var
The latter may print the variable as if it were of type int (if the binary has no debug info), which may not be what you are looking for.
If the variable is local (automatic), then you can print it only while inside the routine in which it is declared (obviously).
If the binary has debug info, then simple print var in correct context should work.
If the binary doesn't, you'll have to figure out the in-memory address of the variable (usually at fixed offset from stack pointer of frame pointer register), and examine that address. You can often figure out a lot about the given routine by disassembling it.
Update:
if I strip the binary, is harder to do the reverse engineering?
Sure: the less info you provide to the attacker, the harder you make his job.
But you also make your job harder: when your binary doesn't work, often your end-user will know more about his system than you do. Often he will load your binary into GDB, and tell you exactly where your bug is. With a stripped executable, he likely wouldn't be able to do that, so you'll guess back and forth, and after a week of trying will lose that customer.
And there is nothing you can do to prevent a sufficiently determined and sufficiently skilled hacker with root access to his system and hardware from reverse engineering your program.
In the end, in my experience, anti-circumvention techniques are usually much more trouble than they are worth.

C++: Determining whether a variable contains no data

I've been messing around in C++ a little bit but I'm still pretty new. I searched around a little bit and even using the keywords of exactly the problem I am trying to tackle yields no results. Basically I am just trying to figure out how to tell if a variable has no data. I have a file that my program reads and it searches for a specific character within that file and basically uses delimiters to determine where to store the actual data in a variable. Now I added some comments in the file saying that it should not be edited which has caused me some problems. So I pretty much want to count the number of comments, but I'm not sure how to do it because the way I had it set up was resulting in huge numbers being returned. So I figured I would attempt to fix it with a simple if statement to see if there was any data in the array while it was running the loop, and if there was then simply add +1 to my variable. Needless to say it did not work. Here's the code. And if you know a better way of doing this, by all means please do share.
size_t arySearchData[20];
size_t commentLines[20];
size_t foundDelimiter;
size_t foundComment;
int commentsNum;
foundDelimiter = lineText.find("]");
foundComment = lineText.find("#");
if (foundComment != std::string::npos) {
commentLines[20] = int(foundComment);
if (foundComment = <PROBLEM>){
commentsNum++;
}
}
So it successfully gets the two comments in my file and recognizes that they are located at the first index(0) in each line but when I tried to have it just do commentsNum++ in my first if statement it just comes up with tons of random numbers, and I am not sure why. So as I said my problem is within the second if statement, I need a void or just a better way to solve this. Any help would be greatly appreciated.
And yes I do realize I could just determine if there 'was' data in the there rather than being void or null but then it would have to be specific and if the comment (#) had a space before it, then it would render my method of reading the file useless as the index will have changed.
A variable in C++ always contains data, just it may not be initialised.
int i;
It will have some value, what it is can't be determined until you do something like
i = 1337;
until you do that the value of i will be what ever happened to be in the memory location that i has been assigned to.
The compile may pick up on the fact that you are trying to use a variable which you have not actually given a value your self, but this will normally just be a warning, as their is nothing wrong as such with doing so
You do not initialize commentsNum. Try this:
int commentsNum = 0;
In C++ other than static variables, other variables are assigned undetermined values. This is primarily done to adhere to underlying philosophy -- "you don't pay for things you don't use", so it doesn't zero that memory by default." However, for static variables, memory is allocated at link time. Unlike runtime initialization, which would need to happen in local variables, link time allocation and initialization incur low cost.
I would recommend hence setting int commentsNum = 0;

Allocate room for null terminating character when copying strings in C?

const char* src = "hello";
Calling strlen(src); returns size 5...
Now say I do this:
char* dest = new char[strlen(src)];
strcpy(dest, src);
That doesn't seem like it should work, but when I output everything it looks right. It seems like I'm not allocating space for the null terminator on the end... is this right? Thanks
You are correct that you are not allocating space for the terminator, however the failure to do this will not necessarily cause your program to fail. You may be overwriting following information on the heap, or your heap manager will be rounding up allocation size to a multiple of 16 bytes or something, so you won't necessarily see any visible effect of this bug.
If you run your program under Valgrind or other heap debugger, you may be able to detect this problem sooner.
Yes, you should allocate at least strlen(src)+1 characters.
That doesn't seem like it should work, but when I output everything it looks right.
Welcome to the world of Undefined Behavior. When you do this, anything can happen. Your program can crash, your computer can crash, your computer can explode, demons can fly out of your nose.
And worst of all, your program could run just fine, inconspicuously looking like it's working correctly until one day it starts spitting out garbage because it's overwriting sensitive data somewhere due to the fact that somewhere, someone allocated one too few characters for their arrays, and now you've corrupted the heap and you get a segfault at some point a million miles away, or even worse your program happily chugs along with a corrupted heap and your functions are operating on corrupted credit card numbers and you get in huge trouble.
Even if it looks like it works, it doesn't. That's Undefined Behavior. Avoid it, because you can never be sure what it will do, and even when what it does when you try it is okay, it may not be okay on another platform.
The best description I have read (was on stackoverflow) and went like this:
If the speed limit is 50 and you drive at 60. You may get lucky and not get a ticket but one day maybe not today maybe not tomorrow but one day that cop will be waiting for you. On that day you will pay and you will pay dearly.
If somebody can find the original I would much rather point at that they were much more eloquent than my explanation.
strcpy will copy the null terminated char as well as all of the other chars.
So you are copying the length of hello + 1 which is 6 into a buffer size which is 5.
You have a buffer overflow here though, and overwiting memory that is not your own will have undefined results.
Alternatively, you could also use dest = strdup(src) which will allocate enough memory for the string + 1 for the null terminator (+1 for Juliano's answer).
This is why you should always, always, always run valgrind on any C program that appears to work.
Yeah, everyone has covered the major point; you are not guaranteed to fail. The fact is that the null terminator is usually 0 and 0 is a pretty common value to be sitting in any particular memory address. So it just happens to work. You could test this by taking a set of memory, writing a bunch of garbage to it and then writing that string there and trying to work with it.
Anyway, the major issue I see here is that you are talking about C but you have this line of code:
char* dest = new char[strlen(src)];
This won't compile in any standard C compiler. There's no new keyword in C. That is C++. In C, you would use one of the memory allocation functions, usually malloc. I know it seems nitpicy, but really, it's not.

c++ what happens if you print more characters with sprintf, than the char pointer has allocated?

I assume this is a common way to use sprintf:
char pText[x];
sprintf(pText, "helloworld %d", Count );
but what exactly happens, if the char pointer has less memory allocated, than it will be print to?
i.e. what if x is smaller than the length of the second parameter of sprintf?
i am asking, since i get some strange behaviour in the code that follows the sprintf statement.
It's not possible to answer in general "exactly" what will happen. Doing this invokes what is called Undefined behavior, which basically means that anything might happen.
It's a good idea to simply avoid such cases, and use safe functions where available:
char pText[12];
snprintf(pText, sizeof pText, "helloworld %d", count);
Note how snprintf() takes an additional argument that is the buffer size, and won't write more than there is room for.
This is a common error and leads to memory after the char array being overwritten. So, for example, there could be some ints or another array in the memory after the char array and those would get overwritten with the text.
See a nice detailed description about the whole problem (buffer overflows) here. There's also a comment that some architectures provide a snprintf routine that has a fourth parameter that defines the maximum length (in your case x). If your compiler doesn't know it, you can also write it yourself to make sure you can't get such errors (or just check that you always have enough space allocated).
Note that the behaviour after such an error is undefined and can lead to very strange errors. Variables are usually aligned at memory locations divisible by 4, so you sometimes won't notice the error in most cases where you have written one or two bytes too much (i.e. forget to make place for a NUL), but get strange errors in other cases. These errors are hard to debug because other variables get changed and errors will often occur in a completely different part of the code.
This is called a buffer overrun.
sprintf will overwrite the memory that happens to follow pText address-wise. Since pText is on the stack, sprintf can overwrite local variables, function arguments and the return address, leading to all sorts of bugs. Many security vulnerabilities result from this kind of code — e.g. an attacker uses the buffer overrun to write a new return address pointing to his own code.
The behaviour in this situation is undefined. Normally, you will crash, but you might also see no ill effects, strange values appearing in unrelated variables and that kind of thing. Your code might also call into the wrong functions, format your hard-drive and kill other running programs. It is best to resolve this by allocating more memory for your buffer.
I have done this many times, you will receive memory corruption error. AFAIK, I remember i have done some thing like this:-
vector<char> vecMyObj(10);
vecMyObj.resize(10);
sprintf(&vecMyObj[0],"helloworld %d", count);
But when destructor of vector is called, my program receive memory corruption error, if size is less then 10, it will work successfully.
Can you spell Buffer Overflow ? One possible result will be stack corruption, and make your app vulnerable to Stack-based exploitation.