Using memcmp in C++ gives memory leak while running with Valgrind - c++

I am using memcmp for comparing the char pointer to empty string as:
if((0 == memcmp("", pcNewBeginPtr, 1))){
// do some stuff
}
I am able to compare this, but while running through Valgrind, I get this error message:
Invalid read of size 1 at this line.

Amazingly, you could read the docs to see what the "invalid read" message means. E.g. you don't legitimately have read access to the memory at pcNewBeginPtr: it's already been freed, wasn't validly initialised to point at a char buffer, points to a local variable in a scope that's already exited etc....
You might read some other questions: e.g. here.

Related

Buffer array overflow in for loop in c

When would a program crash in a buffer overrun case
#include<stdio.h>
#include<stdlib.h>
main() {
char buff[50];
int i=0;
for( i=0; i <100; i++ )
{
buff[i] = i;
printf("buff[%d]=%d\n",i,buff[i]);
}
}
What will happen to first 50 bytes assigned, when would the program crash?
I see in my UBUNTU with gcc a.out it is crashing when i 99
>>
buff[99]=99
*** stack smashing detected ***: ./a.out terminated
Aborted (core dumped)
<<
I would like to know why this is not crashing when assignment happening at buff[51] in the for loop?
It is undefined behavior. You can never predict when (or if at all) it crashes, but you cannot rely upon it 'not crashing' and code an application.
Reasoning
The rationale is that there is no compile or run time 'index out of bound checking' in c arrays. That is present in STL vectors or arrays in other higher level languages. So whenever your program accesses memory beyond the allocated range, it depends whether it simply corrupts another field on your program's stack or affects memory of another program or something else, so one can never predict a crash which only occurs in extreme cases. It only crashes in a state that forces the OS to intervene OR when it no longer remains possible for your program to function correctly.
Example
Say you were inside a function call, and immediately next to your array was, the RETURN address i.e. the address your program uses to return to the function it was called from. Suppose you corrupted that and now your program tries to return to the corrupted value, which is not a valid address. Hence it would crash in such a situation.
The worst happens when you silently modified another field's value and didn't even discover what was wrong assuming no crash occurred.
Since it seems you have allocated on the stack the buffer, the app possibly will crash on the first occasion you overwrite an instruction which is to be executed, possibly somewhere in the code of the for loop... at least that's how it's supposed to be in theory.

Strange situation related to char[] and memcpy

I am a c++ newbie and just try to write some code to have experiment myself.
Recently I have encountered a problem that I cannot debug.
char acExpReason[128];
char acReason[] = "An error message is information displayed when an unexpected condition occurs, usually on a computer or other device. On modern operating systems.";
memcpy(acExpReason, acReason, sizeof(acExpReason));
std::string strExpReason(acExpReason);
I use VS2005 to add breakpoints to every line to debug.
when it reaches the breakpoint on the second line, the variable name and value info in the Autos is:
acReason 0x00f6f78c "ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ" char [147]
acExpReason 0x00f6f828 "ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ̺" char [128]
when it reaches the breakpoint on the third line, the variable name and value info in the Autos is:
acExpReason 0x00f6f828 "ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ̺" char [128]
acReason 0x00f6f78c "An error message is information displayed when an unexpected condition occurs, usually on a computer or other device. On modern operating systems." char [147]
when it reaches the breakpoint on the fourth line, the variable name and value info in the Autos is:
acExpReason 0x00f6f828 "An error message is information displayed when an unexpected condition occurs, usually on a computer or other device. On modern ÌÌÌÌÌÌÌ̺" char [128]
acReason 0x00f6f78c "An error message is information displayed when an unexpected condition occurs, usually on a computer or other device. On modern operating systems." char [147]
strExpReason Bad Ptr
after the last line is executed, the info in Autos is:
acExpReason 0x00f6f828 "An error message is information displayed when an unexpected condition occurs, usually on a computer or other device. On modern ÌÌÌÌÌÌÌ̺" char [128]
acReason 0x00f6f78c "An error message is information displayed when an unexpected condition occurs, usually on a computer or other device. On modern operating systems." char [147]
strExpReason "An error message is information displayed when an unexpected condition occurs, usually on a computer or other device. On modern ÌÌÌÌÌÌÌ̺"
basically what my code wants to do is just to have a complete line of msg which is stored in acReason[], and also there is a copy of the complete msg in fixed length(here is 128).
but I don't know why acExpReason and strExpReason(the string version of the acExpReason) will end with some strange characters "ÌÌÌÌÌÌÌ̺" which I don't want(as I will use this string to compare with other string later).
I tried using memcpy, strcpy, and strncpy and they all ended up having that set of strange characters in the end of the string.
Can anyone help?
Many thanks in advance.
std::string strExpReason(acExpReason);
This constructor requires a C-style string. However acExpReason is not a C-style string since it does not have a terminating zero byte. (How would the constructor know how many bytes should be in the string?) You have to follow the rules.
In C all string functions like strcpy and also the constructor for c++'s std::string take char* as a parameter but the char* must be terminated with a byte containing `\0`.
acExpReason does not have a zero ending it so all the string functions look for the next 0 byte in memory. acReason does have a trailing `\0`. Normal strcpy would work as it also copies the 0 however as #VladLazarenko says the buffer size is too small which will cause all memory to be overwritten.
To make memcpy work you need to copy one less byte than the buffer and make the last byte of the buffer 0.
e.g.
memcpy(acExpReason, acReason, sizeof(acExpReason)-1);
acReason[sizeof(acExpReason)-1] = 0;
You can also use string constructor which accepts iterator range -
std::string strExpReason(acExpReason, acExpReason+sizeof(acExpReason));

Regarding Possible Lost in Valgrind

What is wrong if we push the strings into vector like this:
globalstructures->schema.columnnames.push_back("id");
When i am applied valgrind on my code it is showing
possibly lost of 27 bytes in 1 blocks are possibly lost in loss record 7 of 19.
like that in so many places it is showing possibly lost.....because of this the allocations and frees are not matching....which is resulting in some strange error like
malloc.c:No such file or directory
Although I am using calloc for allocation of memory everywhere in my code i am getting warnings like
Syscall param write(buf) points to uninitialised byte(s)
The code causing that error is
datapage *dataPage=(datapage *)calloc(1,PAGE_SIZE);
writePage(dataPage,dataPageNumber);
int writePage(void *buffer,long pagenumber)
{
int fd;
fd=open(path,O_WRONLY, 0644);
if (fd < 0)
return -1;
lseek(fd,pagenumber*PAGE_SIZE,SEEK_SET);
if(write(fd,buffer,PAGE_SIZE)==-1)
return false;
close(fd);
return true;
}
Exact error which i am getting when i am running through gdb is ...
Breakpoint 1, getInfoFromSysColumns (tid=3, numColumns=#0x7fffffffdf24: 1, typesVector=..., constraintsVector=..., lengthsVector=...,
columnNamesVector=..., offsetsVector=...) at dbheader.cpp:1080
Program received signal SIGSEGV, Segmentation fault.
_int_malloc (av=0x7ffff78bd720, bytes=8) at malloc.c:3498
3498 malloc.c: No such file or directory.
When i run the same through valgrind it's working fine...
Well,
malloc.c:No such file or directory
can occur while you are debugging using gdb and you use command "s" instead of "n" near malloc which essentially means you are trying to step into malloc, the source of which may not be not available on your Linux machine.
That is perhaps the same reason why it is working fine with valgrind.
Why error is in malloc:
The problem is that you overwrote some memory buffer and corrupted one
of the structures used by the memory manager. (c)
Try to run valgrind with --track-origins=yes and see where that uninitialized access comes from. If you believe that it should be initialized and it is not, maybe the data came from a bad pointer, valgrind will show you where exactly the values were created. Probably those uninitialized values overwrote your buffer, including memory manager special bytes.
Also, review all valgrind warnings before the crash.

Why am I not getting a segmentation fault with this code? (Bus error)

I had a bug in my code that went like this.
char desc[25];
char name[20];
char address[20];
sprintf (desc, "%s %s", name, address);
Ideally this should give a segfault. However, I saw this give a bus error.
Wikipedia says something to the order of 'Bus error is when the program tries to access an unaligned memory location or when you try to access a physical (not virtual) memory location that does not exist or is not allowed. '
The second part of the above statement sounds similar to a seg fault. So my question is, when do you get a SIGBUS and when a SIGSEGV?
EDIT:-
Quite a few people have mentioned the context. I'm not sure what context would be needed but this was a buffer overflow lying inside a static class function that get's called from a number of other class functions. If there's something more specific that I can give which will help, do ask.
Anyways, someone had commented that I should simply write better code. I guess the point of asking this question was "can an application developer infer anything from a SIGBUS versus a SIGSEGV?" (picked from that blog post below)
As you probably realize, the base cause is undefined behavior in your
program. In this case, it leads to an error detected by the hardware,
which is caught by the OS and mapped to a signal. The exact mapping
isn't really specified (and I've seen integral division by zero result
in a SIGFPE), but generally: SIGSEGV occurs when you access out of
bounds, SIGBUS for other accessing errors, and SIGILL for an illegal
instruction. In this case, the most likely explination is that your
bounds error has overwritten the return address on the stack. If the
return address isn't correctly aligned, you'll probably get a SIGBUS,
and if it is, you'll start executing whatever is there, which could
result in a SIGILL. (But the possibility of executing random bytes as
code is what the standards committee had in mind when they defined
“undefined behavior”. Especially on machines with no memory
protection, where you could end up jumping directly into the OS.)
A segmentation fault is never guaranteed when you're doing fishy stuff with memory. It all depends on a lot of factors (how the compiler lays out the program in memory, optimizations etc).
What may be illegal for a C++ program may not be illegal for a program in general. For instance the OS doesn't care if you step outside an array. It doesn't even know what an array is. However it does care if you touch memory that doesn't belong to you.
A segmentation fault occurs if you try to do a data access a virtual address that is not mapped to your process. On most operating systems, memory is mapped in pages of a few kilobytes; this means that you often won't get a fault if you write off the end of an array, since there is other valid data following it in the memory page.
A bus error indicates a more low-level error; a wrongly-aligned access or a missing physical address are two reasons, as you say. However, the first is not happening here, since you're dealing with bytes, which have no alignment restriction; and I think the second can only happen on data accesses when memory is completely exhausted, which probably isn't happening.
However, I think you might also get a bus error if you try to execute code from an invalid virtual address. This could well be what is happening here - by writing off the end of a local array, you will overwrite important parts of the stack frame, such as the function's return address. This will cause the function to return to an invalid address, which (I think) will give a bus error. That's my best guess at what particular flavour of undefined behaviour you are experiencing here.
In general, you can't rely on segmentation faults to catch buffer overruns; the best tool I know of is valgrind, although that will still fail to catch some kinds of overrun. The best way to avoid overruns when working with strings is to use std::string, rather than pretending that you're writing C.
In this particular case, you don't know what kind of garbage you have in the format string. That garbage could potentially result in treating the remaining arguments as those of an "aligned" data type (e.g. int or double). Treating an unaligned area as an aligned argument definitely causes SIGBUS on some systems.
Given that your string is made up of two other strings each being a max of 20 characters long, yet you are putting it into a field that is 25 characters, that is where your first issue lies. You are have a good potential to overstep your bounds.
The variable desc should be at least 41 characters long (20 + 20 + 1 [for the space you insert]).
Use valgrind or gdb to figure out why you are getting a seg fault.
char desc[25];
char name[20];
char address[20];
sprintf (desc, "%s %s", name, address);
Just by looking at this code, I can assume that name and address each can be 20 chars long. If that is so, then does it not imply that desc should be minimum 20+20+1 chars long? (1 char for the space between name and address, as specified in the sprintf).
That can be the one reason of segfault. There could be other reasons as well. For example, what if name is longer than 20 chars?
So better you use std::string:
std::string name;
std::string address;
std::string desc = name + " " + address;
char const *char_desc = desc.str(); //if at all you need this

c++ what happens if you print more characters with sprintf, than the char pointer has allocated?

I assume this is a common way to use sprintf:
char pText[x];
sprintf(pText, "helloworld %d", Count );
but what exactly happens, if the char pointer has less memory allocated, than it will be print to?
i.e. what if x is smaller than the length of the second parameter of sprintf?
i am asking, since i get some strange behaviour in the code that follows the sprintf statement.
It's not possible to answer in general "exactly" what will happen. Doing this invokes what is called Undefined behavior, which basically means that anything might happen.
It's a good idea to simply avoid such cases, and use safe functions where available:
char pText[12];
snprintf(pText, sizeof pText, "helloworld %d", count);
Note how snprintf() takes an additional argument that is the buffer size, and won't write more than there is room for.
This is a common error and leads to memory after the char array being overwritten. So, for example, there could be some ints or another array in the memory after the char array and those would get overwritten with the text.
See a nice detailed description about the whole problem (buffer overflows) here. There's also a comment that some architectures provide a snprintf routine that has a fourth parameter that defines the maximum length (in your case x). If your compiler doesn't know it, you can also write it yourself to make sure you can't get such errors (or just check that you always have enough space allocated).
Note that the behaviour after such an error is undefined and can lead to very strange errors. Variables are usually aligned at memory locations divisible by 4, so you sometimes won't notice the error in most cases where you have written one or two bytes too much (i.e. forget to make place for a NUL), but get strange errors in other cases. These errors are hard to debug because other variables get changed and errors will often occur in a completely different part of the code.
This is called a buffer overrun.
sprintf will overwrite the memory that happens to follow pText address-wise. Since pText is on the stack, sprintf can overwrite local variables, function arguments and the return address, leading to all sorts of bugs. Many security vulnerabilities result from this kind of code — e.g. an attacker uses the buffer overrun to write a new return address pointing to his own code.
The behaviour in this situation is undefined. Normally, you will crash, but you might also see no ill effects, strange values appearing in unrelated variables and that kind of thing. Your code might also call into the wrong functions, format your hard-drive and kill other running programs. It is best to resolve this by allocating more memory for your buffer.
I have done this many times, you will receive memory corruption error. AFAIK, I remember i have done some thing like this:-
vector<char> vecMyObj(10);
vecMyObj.resize(10);
sprintf(&vecMyObj[0],"helloworld %d", count);
But when destructor of vector is called, my program receive memory corruption error, if size is less then 10, it will work successfully.
Can you spell Buffer Overflow ? One possible result will be stack corruption, and make your app vulnerable to Stack-based exploitation.