Is this legal in C++?

I created a C++/CLI wrapper that calls third-party code, and the call ended in corrupted memory. So I suspect the code may not be legal C++.
Below is the code that crashed:
void Init_4bit_tab(unsigned char *dest,unsigned char *source)
{
unsigned char masque,i;
masque=0x08;
for(i=0; i<4; i++) {
dest[i] = (*source & masque)>>(3-i);
masque >>= 1;
}
}
the exact error was:
Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Update:
After scanning the third-party code, it appears to involve a multidimensional array, judging by the way it is passed, but I'm still not sure what's causing the problem:
the source function
unsigned char Data_B[81];
...
S_Box_Calc(&Data_B[33]);
void S_Box_Calc(unsigned char *vect)
{
unsigned char *S_Box[8];
unsigned lig,col,i;
S_Box[0]=S1;
S_Box[1]=S2;
S_Box[2]=S3;
S_Box[3]=S4;
S_Box[4]=S5;
S_Box[5]=S6;
S_Box[6]=S7;
S_Box[7]=S8;
for(i=0;i<8;i++) {
col= 8*vect[1+6*i] + 4*vect[2+6*i] + 2*vect[3+6*i] + vect[4+6*i];
lig= 2*vect[6*i] + vect[5+6*i];
Init_4bit_tab(&vect[4*i],&S_Box[i][col+lig*16]);
}
}
Update 2:
I checked the values in debug mode: dest and source are not null. However, if I try to Quick Watch (*source & masque) in the line dest[i] = (*source & masque)>>(3-i);
I get this error:
(*source & masque) error: & cannot be performed on '*source' and 'masque'
Update 3:
S1...Sn were originally defined at the global scope of the file, but I got an error when I left them as they were, so I initialized them in the constructor this way:
unsigned char lS1[64] = {
14,4,13,1,2,15,11,8,3,10,6,12,5,9,0,7,
0,15,7,4,14,2,13,1,10,6,12,11,9,5,3,8,
4,1,14,8,13,6,2,11,15,12,9,7,3,10,5,0,
15,12,8,2,4,9,1,7,5,11,3,14,10,0,6,13
};
std::copy(S1, S1 + 64, lS1);
could this be the problem?

There's no problem with the code you show, if it is passed
valid pointers. If it's corrupting memory, it's probably
because the caller didn't pass it valid pointers.
After your edit: if S_Box_Calc is called with vect equal to
Data_B + 33, as you show, the range [vect, vect+48) is
legal, which means that Init_4bit_tab should not be called
with a dest beyond vect + 44. In fact, in the code you show, it
is never called with a dest beyond vect + 28, so you shouldn't be able
to corrupt memory here. If any of S1 through S8, however, does
not point to valid memory, you'll get the symptoms you state.
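The index arithmetic in this answer can be checked mechanically. The following is a minimal sketch, assuming the loop in S_Box_Calc exactly as posted; the helper names max_read_index and max_write_index and the constant available are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>

// Largest index of vect that S_Box_Calc reads:
// it touches vect[6*i] .. vect[5 + 6*i] for i in [0, 8).
std::size_t max_read_index() {
    std::size_t m = 0;
    for (std::size_t i = 0; i < 8; ++i)
        m = std::max(m, 5 + 6 * i);
    return m;
}

// Largest index of vect that Init_4bit_tab writes:
// dest is &vect[4*i], and it writes dest[0] .. dest[3].
std::size_t max_write_index() {
    std::size_t m = 0;
    for (std::size_t i = 0; i < 8; ++i)
        m = std::max(m, 4 * i + 3);
    return m;
}

// Number of elements available from &Data_B[33] in an 81-element array.
const std::size_t available = 81 - 33;
```

Both maxima stay below available, which matches the conclusion that the vect accesses are in bounds and that the S1 through S8 tables are the more likely suspects.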

This is perfectly legal C++ syntactically; it compiles. Check whether it's semantically correct. The only place where you could inadvertently stray into UB land is in accessing the dest pointer, i.e. the array it points to should be at least 4 chars long from where it's pointing. Also, since the error talks about an access violation, make sure dest points to a writable memory location.

Where are your overflow checks? You should be passing in sizes to your functions to allow you to restrict writes to memory if there is a chance that it will overflow. It's similar to strcpy() vs. strncpy() or strlcpy() in BSD. Perhaps if you implement something along those lines and generate an error where there is a condition that memory written would otherwise overflow, you might find the cause of memory corruption.
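One way to follow this advice is to pass the destination size along with the pointer. The following is only a sketch: the name Init_4bit_tab_checked, the dest_size parameter, and the bool return value are assumptions made for illustration, not part of the third-party code.

```cpp
#include <cstddef>

// Hypothetical size-checked variant of Init_4bit_tab.
// Writes the low 4 bits of *source into dest[0..3], most significant bit first.
// Returns false without writing anything if a pointer is null or dest is too small.
bool Init_4bit_tab_checked(unsigned char *dest, std::size_t dest_size,
                           const unsigned char *source)
{
    if (dest == nullptr || source == nullptr || dest_size < 4)
        return false;

    unsigned char masque = 0x08;
    for (unsigned i = 0; i < 4; ++i) {
        dest[i] = static_cast<unsigned char>((*source & masque) >> (3 - i));
        masque >>= 1;
    }
    return true;
}
```

A caller would then report an error when the function returns false, instead of silently writing past the end of the buffer. Note that this check cannot detect a non-null pointer that simply points to invalid memory.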

Related

What happens when assigning a value to non-allocated memory?

int main()
{
char* p = new char('a');
*reinterpret_cast<int*>(p) = 43523;
return 0;
}
This code runs fine but how safe is it? It should have only allocated one byte of memory but it doesn't seem to be having any problem filling 4 bytes with data. What can happen with the other 3 bytes that are not allocated?
char* p = new char('a');
ASCII code of 'a' is 97, so this is equivalent to:
char* p = new char(97);
Space for a single character (1 byte) is allocated, with *p == 'a'.
Now you are trying to store a value larger than 1 byte there. That's certainly risky, even if it appears to work, i.e. runs without a segmentation fault. You are overwriting other parts of memory that you don't own or that, even if you do own them, serve some other purpose.
The code is not safe. Your program has undefined behaviour. Anything at all could happen.
The code causes a heap buffer overflow. You're overwriting memory which you don't own and which may be in use by other parts of the program.
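For contrast, here is a minimal sketch of a version that stays in bounds, assuming the goal is simply to store an int in heap memory obtained as raw bytes. The function name store_and_load is hypothetical. It allocates sizeof(int) bytes rather than one, and uses memcpy rather than a reinterpret_cast to int*.

```cpp
#include <cstring>

// Allocates enough bytes for a whole int, copies the value in, and reads it back.
int store_and_load(int value) {
    unsigned char *p = new unsigned char[sizeof(int)];  // room for the full int
    std::memcpy(p, &value, sizeof value);               // in-bounds write
    int result = 0;
    std::memcpy(&result, p, sizeof result);             // in-bounds read
    delete[] p;
    return result;
}
```

If raw bytes are not actually needed, allocating the object with its real type (new int) is simpler still.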

Why don't I get a runtime error when I access an out-of bounds element of an array?

In this code below I try to access the '-1'th element of an array, I don't get any runtime error.
#include <stdio.h>
int A[10] = {0};
int main(){
A[-1] += 12;
printf("%d",A[-1]);
return 0;
}
When I run the code, it outputs 12, which means it is adding 12 to the non-existent A[-1]. Until today, whenever I tried to access an out-of-bounds element, I got a runtime error. I had never tried it in such simple code before.
Can anyone explain why my code runs successfully?
I ran it on my computer and also on ideone, in both the cases it ran successfully.
You see, when you allocate a variable like this, it lands on the stack. Stack holds small packages of information about local variables in each function you call, to say it in simple words. The runtime is able to check, whether you exceed the bounds of allocated stack, but not if you write some data in the invalid place on the stack. The stack may look like the following:
[4 bytes - some ptr][4 bytes - A's first element][4 bytes - A's second element] ...
When you try to assign to the -1th element of an array, you actually write to the four bytes preceding the array (four bytes, because it's an int array). You overwrite some data held on the stack, but that's still in the process's valid memory, so there are no complaints from the system.
Try running this code in release mode in Visual Studio:
#include <stdio.h>
int main(int argc, char * argv[])
{
// NEVER DO IT ON PURPOSE!
int i = 0;
int A[5];
A[-1] = 42;
printf("%d\n", i);
getchar();
return 0;
}
Edit: in response to comments.
I missed the fact that A is global. It won't be held on the stack, but instead (most probably) in the .data segment of the binary module. However, the rest of the explanation stands: A[-1] is still within the process's memory, so the assignment won't raise an AV. Such an assignment will, however, overwrite something that is located before A (possibly a pointer or another part of the binary module), resulting in undefined behavior.
Note, that my example may work and may not, depending on compiler (or compiler mode). For example, in debug mode the program returns 0 - I guess, that memory manager inserts some sentry data between stack frames to catch errors like buffer over/underrun.
C and C++ do not have any bounds checking. It is a part of the language. It is to enable the language to execute faster.
If you want bounds checking use another language that has it. Java perhaps?
If your code runs without an error, you are just lucky.
In C++ (and C), the arrays don't check out of range indices. They're not classes.
In C++11, however you could use std::array<int,10> and at() function as:
std::array<int,10> arr;
arr.at(-1) = 100; //it throws std::out_of_range exception
Or you can use std::vector<int> and at() member function.
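A minimal sketch of the std::vector variant, assuming the caller wants to detect the bad index rather than terminate. The helper name at_throws is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if v.at(index) rejects the index by throwing std::out_of_range.
bool at_throws(const std::vector<int>& v, std::size_t index) {
    try {
        (void)v.at(index);   // bounds-checked access
    } catch (const std::out_of_range&) {
        return true;         // index was outside [0, v.size())
    }
    return false;
}
```

Note that operator[] on the same vector performs no such check, so v[index] with a bad index is still undefined behavior.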

Stack overflow for string in C++?

I made a small program that looked like this:
void foo () {
char *str = "+++"; // length of str = 3 bytes
char buffer[1];
strcpy (buffer, str);
cout << buffer;
}
int main () {
foo ();
}
I was expecting a stack overflow exception to appear, because the buffer is smaller than str, but it printed out +++ successfully... Can someone please explain why this happened?
Thank you very much.
Undefined Behavior(UB) happened and you were unlucky it did not crash.
Writing beyond the bounds of allocated memory is Undefined Behavior and UB does not warrant a crash. Anything might happen.
Undefined behavior means that the language standard places no requirements on what the program does.
You don't get a stack overflow because it's undefined behaviour, which means anything can happen.
Many compilers today have special flags that tell them to insert code to check some stack problems, but you often need to explicitly tell the compiler to enable that.
Undefined behavior...
In case you actually care about why there's a good chance of getting a "correct" result in this case: there are a couple of contributing factors. Variables with auto storage class (i.e., normal, local variables) will typically be allocated on the stack. In a typical case, all items on the stack will be a multiple of some specific size, most often int. For example, on a typical 32-bit system, the smallest item you can allocate on the stack will be 32 bits. In other words, on your typical 32-bit system, room for four bytes (or four chars, if you prefer that term).
Now, as it happens, your source string contained only 3 characters, plus the NUL terminator, for a total of 4 characters. By pure bad chance, that just happened to be short enough to fit into the space the compiler was (sort of) forced to allocate for buffer, even though you told it to allocate less.
If, however, you'd copied a longer string to the target (possibly even just a single byte/char longer) chances of major problems would go up substantially (though in 64-bit software, you'd probably need longer still).
There is one other point to consider as well: depending on the system and the direction the stack grows, you might be able to write well past the end of the space you allocated, and still have things appear to work. You've allocated buffer in main. The only other thing defined in main is str, but it's just a string literal, so chances are that no space is actually allocated to store the address of the string literal. You end up with the string literal itself allocated statically (not on the stack) and its address substituted where you've used str. Therefore, if you write past the end of buffer, you may be just writing into whatever space is left at the top of the stack. In a typical case, the stack will be allocated one page at a time. On most systems, a page is 4K or 8K in size, so for a random amount of space used on the stack, you can expect an average of 2K or 4K free respectively.
In reality, since this is in main and nothing else has been called, you can expect the stack to be almost empty, so chances are that there's close to a full page of unused space at the top of the stack, so copying the string into the destination might appear to work until/unless the source string was quite long (e.g., several kilobytes).
As to why it will often fail much sooner than that though: in a typical case, the stack grows downward, but the addresses used by buffer[n] will grow upward. In a typical case, the next item on the stack "above" buffer will be the return address from main to the startup code that called main -- therefore, as soon as you write past the amount of space on the stack for buffer (which, as above, is likely to be larger than you specified) you'll end up overwriting the return address from main. In that case, the code inside main will often appear to work fine, but as soon as execution (tries to) return from main, it'll end up using that data you just wrote as the return address, at which point you're a lot more likely to see visible problems.
Outlining what happens:
Either you are lucky and it crashes at once, or, because the behaviour is undefined, you end up writing to a memory address used by something else. Say you had two buffers, buffer[1] and longbuffer[100], and assume that the address of buffer[2] happens to be the same as that of longbuffer[0]. Then longbuffer would now terminate at longbuffer[1] (because of the null terminator).
const char *str = "+++";
char longbuffer[100] = "lorem ipsum dolor sith ameth";
char buffer[1];
strcpy (buffer, str);
/*
buffer[0] = +
buffer[1] = +
buffer[2] = longbuffer[0] = +
buffer[3] = longbuffer[1] = \0 <- since strcpy also copies the terminating \0
*/
std::cout << longbuffer; // would output: + (under the layout assumed above)
Hope that helps to clarify. Please note that it's not very likely that these memory addresses will coincide in the random case, but it could happen, and the neighbour doesn't even need to be of the same type: anything can be at the buffer[2] and buffer[3] addresses before being overwritten by the copy. Then the next time you try to use your (now destroyed) variable it might well crash, and that's when debugging becomes a bit tedious, since the crash doesn't seem to have much to do with the real problem (i.e. it crashes when you try to access a variable on your stack, while the real problem is that somewhere else in your code you destroyed it).
There is no explicit bounds checking, or exception throwing on strcpy - it's a C function. If you want to use C functions in C++, you're going to have to take on the responsibility of checking for bounds etc. or switch to using std::string.
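Both options mentioned above can be sketched as follows. The helper names copy_if_fits and copy_as_string are hypothetical; the first shows the kind of check the caller has to do by hand with C strings, the second lets std::string manage the storage.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Copies src into dest only if it fits, including the terminating '\0'.
// Returns false and leaves dest untouched otherwise.
bool copy_if_fits(char *dest, std::size_t dest_size, const char *src) {
    const std::size_t needed = std::strlen(src) + 1;  // characters plus '\0'
    if (needed > dest_size)
        return false;
    std::memcpy(dest, src, needed);
    return true;
}

// std::string allocates as much storage as the text needs.
std::string copy_as_string(const char *src) {
    return std::string(src);
}
```

With the question's one-element buffer, copy_if_fits(buffer, sizeof buffer, "+++") would refuse the copy, because "+++" needs four bytes.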
In this case it did work, but in a critical system, taking this approach might mean that your unit tests pass but in production, your code barfs - not a situation that you want.
Stack corruption is happening. It's undefined behaviour, and it is only by luck that no crash occurred. Make the modifications below to your program and run it: it will very likely crash because of the stack corruption.
void foo () {
char *str = "+++"; // length of str = 3 bytes
int a = 10;
int *p = NULL;
char buffer[1];
int *q = NULL;
int b = 20;
p = &a;
q = &b;
cout << *p;
cout << *q;
//strcpy (buffer, str);
// Now uncomment the strcpy above: it is likely to crash in one of the cout statements below.
cout << *p;
cout << *q;
cout << buffer;
}

Why Does Array With 1 element Allow 2k elements? [duplicate]

Why does this code work? If the first element only holds the first character, then where are the rest of the characters being stored? And if this is possible, why aren't we using this method?
Notice the line static char c[1]. Using one element, you can store as many characters as you want. I use static to keep the memory location alive outside of the function when pointing to it later.
#include <stdio.h>
void PutString( const char* pChar ){
for( ; *pChar != 0; pChar++ )
{
putchar( *pChar );
}
}
char* GetString(){
static char c[1];
int i = 0;
do
{
c[i] = getchar();
}while( c[i++] != '\n' );
c[i] = '\0';
return c;
}
void main(){
PutString( "Enter some text: " );
char* pChar = GetString();
PutString( "You typed the following:\n" );
PutString( pChar );
}
C doesn't check for array boundaries, so no error is thrown. However, characters after the first one will be stored in memory not allocated by the program. If the string is short, this may work, but a long enough string will corrupt enough memory to crash the process.
You can write wherever you want:
char *bad = (char *)0xABCDEF00;
bad[0] = 'A';
But you shouldn't. Who knows what the above lines of code will do? In the very best case, your program will crash. In the worst case, you've corrupted memory and won't find out until much later. (And good luck tracking down the source!)
To answer your specific questions, it doesn't "work". The rest of the characters are stored directly after the array.
You are just very (un)lucky that you are not overwriting some other data structures. The array definitely cannot store as many characters as you want: sooner or later you either silently corrupt your memory (in the worst case), or hit a segfault by accessing memory your process hasn't mapped. The fact that it works is likely because the compiler didn't place any other data after your c[1]. Just try to add a second array, let's say static char d[1]; after c, and then try reading from it: you'll likely see the second character from c.
C++ does not do bounds checking on arrays. That's for performance reasons; checking every array index to see if it's outside the bounds would incur an unacceptable runtime overhead. Avoiding overhead has always been a design goal of C++.
If you want bounds checking, you should use std::vector instead, which provides it as an optional feature through std::vector::at().
In this case the behavior is undefined: depending on the compiler, the current state of the memory, and so on, it may seem to run fine, it may write corrupted chars, or it may crash because of a segfault.
Linking against Electric Fence or running in valgrind may help to find such errors at runtime.
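For the GetString function in the question, one bounds-safe alternative is to let std::string grow as input arrives. This is a minimal sketch, not a drop-in replacement: the name GetLine is hypothetical, it takes the input stream as a parameter, and unlike the original it does not keep the trailing '\n'.

```cpp
#include <istream>
#include <sstream>  // only needed for the std::istringstream mentioned in the note below
#include <string>

// Reads one line from the stream; std::string allocates as much room as the line needs.
std::string GetLine(std::istream& in) {
    std::string line;
    std::getline(in, line);  // stops at '\n' and discards it
    return line;
}
```

In a program this would be called as GetLine(std::cin); for experimenting, a std::istringstream holding some text can be passed instead.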

Write data to memory in c++

I had posted a question on this, but I thought I would use the memcpy() function instead.
Writing data to memory in C++
int x = 1;
int fileptr = 0;
void *data = malloc(4096);
memcpy((int *)data+fileptr, &x, sizeof(int));
Then I read the values back
int y;
fileptr = 0;
memcpy(&y, (int *)data+fileptr, sizeof(int));
cout<<y;
In this way, I get a different output for variable y (some long integer values). Please need immediate help.
You declare a pointer data but never initialize it, so the behavior is undefined. You have to point data somewhere, i.e. allocate memory.
Unless there's some code you're leaving out, data is never initialized, and hence points to some random location. Since your application isn't crashing entirely, you're probably getting (somewhat) lucky and ending up with data pointing into a location on the stack which is written to by other code; hence the change in value.
Allocate some memory for data before you write to it and this won't happen.
You haven't allocated any storage for data, so when you copy the int data there you are causing no end of problems by corrupting the stack.
You might want to do something like
int x = 1;
void *data=new int();
memcpy(data, &x, sizeof(x)); // never use the type use the var - more resilient to change
That actually invokes undefined behaviour, as you have not allocated memory for data. Where should it store the bit-patterns of x then?
If you have allocated memory (using malloc), and still get wrong output, then in that case, you are doing something terribly wrong. See your program again and then compile and run it again. Because the output cannot be anything other than 1. See this:
http://www.ideone.com/SBoHk
Compare your program with mine (in the above link) word by word, and see if there is anything you are missing.
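For reference, a minimal sketch of the write-then-read round trip with the allocation and a bounds check made explicit. It assumes a std::vector<unsigned char> as the backing store in place of the malloc'd block, and the helper names write_int and read_int are hypothetical. The slot argument plays the role of fileptr in the question, counted in units of int.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Copies value into the buffer at the given int-sized slot; false if it would not fit.
bool write_int(std::vector<unsigned char>& data, std::size_t slot, int value) {
    const std::size_t offset = slot * sizeof(int);
    if (offset + sizeof(int) > data.size())
        return false;
    std::memcpy(data.data() + offset, &value, sizeof value);
    return true;
}

// Copies the int stored at the given slot into value; false if the slot is out of range.
bool read_int(const std::vector<unsigned char>& data, std::size_t slot, int& value) {
    const std::size_t offset = slot * sizeof(int);
    if (offset + sizeof(int) > data.size())
        return false;
    std::memcpy(&value, data.data() + offset, sizeof value);
    return true;
}
```

With a 4096-byte buffer, writing 1 to slot 0 and reading slot 0 back yields 1, as the answer above says it must.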