segfault when copying an array to a vector in Linux - c++

I'm trying to debug legacy code written for Linux. Sometimes the application segfaults when it reaches the memcpy call in the following method:
std::vector<uint8> _storage;
size_t _wpos;

void append(const uint8 *src, size_t cnt)
{
    if (!cnt)
        return;
    if (_storage.size() < _wpos + cnt)
        _storage.resize(_wpos + cnt);
    memcpy(&_storage[_wpos], src, cnt);
    _wpos += cnt;
}
The values are as follows:
_storage.size() is 1000
_wpos is 0
src points to an array of 3 uint8 values: { 3, 110, 20 }
cnt is 3
I have no idea why this happens: this method is called thousands of times during the application's runtime, but it only sometimes segfaults.
Does anyone have any idea how to solve this?

Your code looks good in terms of the data that is written. Are you absolutely sure that you're passing in the right src pointer? What happens when you run the code with a debugger such as gdb? It should halt on the segfault, and then you can print out the values of _storage.size(), src, and cnt.
I'm sure you'll find that (at least) one of those is not at all what you're expecting. You might have passed an invalid src; you might have passed an absurdly large cnt.

I'd suggest running Valgrind on your program.
It's very good at spotting memory corruption early, which may well be what's happening here (since the crash you get is not systematic).

For the values you give, I can't see why that would segfault. It's possible that your segfault is a delayed failure due to an earlier memory management mistake. Writing past the end of the vector in some earlier function could cause some of the vector's internal members to be corrupted, or you may have accidentally freed part of the memory used by the vector earlier. I'd check the other functions that manipulate the vector to see if any of them are doing any suspicious casting.

I see the size of the vector increasing; I never see it decreasing.
Besides that, vector has exquisite memory-management support built in. You can insert your values right at the end:
_storage.insert(_storage.end(), src, src + cnt);
This will both expand the vector to the right size and copy the values.

The only thing I can think of is that _storage.resize() fails (which should throw a std::bad_alloc exception).
Another alternative would be to append each value separately with a call to push_back() (probably far slower, though).

One thing to watch here:
The memcpy() function copies n bytes, so if cnt is a count of elements, you need a * sizeof(uint8) in the call to memcpy. (Since sizeof(uint8) is 1, it happens to make no difference for this element type, but it would for any wider one.)

In a comment to my other answer, you said that "The vector gets cleaned up in another method since it is a class member variable. I'll test insert and see what happens".
What about thread safety? Are you absolutely sure that the clearing method does not clear while the resize is happening, or immediately after it? Since it's a 'sometimes' problem, it may be induced by concurrent access to the vector's memory management.

Related

Buffer overrun with STL vector

I am copying the contents of one STL vector to another.
The program is something like this
std::vector<uint_8> l_destVector(100); // Just for illustration, take the size as 100.
std::vector<uint_8> l_sourceVector;    // Assume the source vector is already populated.

memcpy( l_destVector.data(), l_sourceVector.data(), l_sourceVector.size() );
The above example is pretty simplistic, but in my actual code the size of the destination vector is calculated dynamically, and the source vector is also populated dynamically, so the data can have different lengths. Hence the chance of a buffer overrun increases.
The problem I faced is that my program does not crash at the point of the memcpy when there is a buffer overrun, but some time later, which makes it hard to debug.
How do we explain this behavior?
Based on the responses I am editing the question to make my concern more understandable.
So, this is legacy code, and there are a lot of places where vectors have been copied using memcpy; we do not intend to change the existing code. My main concern here is: should memcpy not guarantee an immediate crash, and if not, why? I'll honestly admit that this is not very well written code.
A brief illustration of actual use is as follows.
In the method below, i_DDRSPDBuffer and i_dataDQBuffer were generated based on some logic in the calling method.
o_dataBuffer was assigned enough memory to take the data from the two input buffers, but a recent change in the method that calls updateSPDDataToRecordPerDimm is causing an overrun in one of the flows.
typedef std::vector<uint8_t> DataBufferHndl;

errHdl_t updateSPDDataToRecordPerDimm(
    dimmContainerIterator_t i_spdMmap,
    const DataBufferHndl & i_DDRSPDBuffer,
    const DataBufferHndl & i_dataDQBuffer,
    DataBufferHndl & o_dataBuffer)
{
    uint16_t l_dimmSPDBytes = (*i_spdMmap).second.dimmSpdBytes;

    // Get the data buffer handles for the input and output vectors
    uint8_t * l_pOutLDimmSPDData = o_dataBuffer.data();
    const uint8_t * l_pInDDRSPDData = i_DDRSPDBuffer.data();
    const uint8_t * l_pInDQData = i_dataDQBuffer.data();

    memcpy(l_pOutLDimmSPDData, l_pInDDRSPDData, l_dimmSPDBytes);
    memcpy(l_pOutLDimmSPDData + l_dimmSPDBytes,
           l_pInDQData, LDIMM_DQ_DATA_BYTES);
    memcpy(l_pOutLDimmSPDData,   // <== Expecting the crash here, but it happens
           l_pInDQData,          //     somewhere after updateSPDDataToRecordPerDimm returns.
           LDIMM_DQ_DATA_BYTES);
}
It doesn't have to crash; it's undefined behaviour.
If you had used std::copy with std::vector<uint_8>::iterators in a debug build, you probably would have hit an assertion that caught it.
Do not do that! It will eventually bite you.
Use std::copy with an output iterator; or, since you know the size of the destination, resize the vector to the correct size (or create a vector of the correct size) and copy the contents straight in; or simply use the assignment operator.
It doesn't crash at the moment of the memcpy because you 'only' overwrite the memory behind the vector's allocation. As long as your program does not read and use that corrupted memory, it will continue to run.
As already mentioned, memcpy is not the recommended way to copy the contents of STL containers. You'd be on the safe side with
std::copy
std::vector::assign
And in both cases you'd also get the aforementioned iterator debugging, which triggers close to the point where the error actually is.

New vector with fixed size causes crash

Problem
I had a problem appearing at random when creating a new vector (through a pointer) with a fixed initial size.
std::vector<double> * ret = new std::vector<double>(size);
This sometimes causes my program to crash, and I don't really get why... maybe stack corruption? Sadly, I didn't find any explanation on the web of what can cause this issue.
Example:
Code
// <- ... Some independent code
// [size] is an unsigned int passed as a parameter to the function
cout << size << endl;
std::vector<double> * ret = new std::vector<double>(size);
cout << "Debug text" << endl;
// More code ... ->
EDIT: I will update the code as soon as possible to have a clear, minimal, reproducible example, to make this a correct question according to: How to create a Minimal, Complete, and Verifiable example
Output
100
... Then it crashes (the trace "Debug text" is not printed; the size is correct).
I tried putting the critical line of code inside a try/catch, as some people suggested for memory-related errors, but no exception is caught and I still get the crash.
This code is inside a function called multiple times (with various values of size, always between 1 and 1000); sometimes the function ends without a problem, sometimes not (the value of size does not seem to have any influence, but maybe I'm wrong).
My "solution" (you can skip this part)
I adapted my code to use a pointer to a vector without an initial size
std::vector<double> * ret;
and I use push_back() instead of [].
[] was quicker for my algorithm because of how the vector is filled at first (element order is important, and I get positions from an external file, but I still need a vector rather than an array for its dynamic aspects later in the code). I adapted everything to use push_back() (less efficient in my case, since I now need more iterations, but nothing critical).
Question
In short: does anyone know what could be causing this issue, or how I can track down what is causing it?
It looks like your program stopped crashing not because you create the vector without a size, but because you use push_back(). The fact that replacing operator[] with push_back() removes the symptom points to somewhere else accessing a vector element out of bounds, corrupting your memory, and crashing the program. Check the code where you access the data.
From what you wrote, it seems like you are trying to access it using ret[...], right? :-) Sorry for my smile, but this happens when you use a pointer to a vector...
If this is the case, you need to replace it with (*ret)[...]

Difference between accessing non-existent array index and existing-but-empty index

Suppose I wrote
vector<int> example(5);
example[6];
What difference would it make with the following?
vector<int> example(6);
example[5];
In the first case I'm trying to access a non-existent, undeclared index. Could that result in malicious code execution? Would it be possible to put some sort of code in the portion of memory corresponding to example[6] and have it executed by a program written like the first above?
What about the second case? Would it still be possible to place code in the memory of example[5], even though it should be reserved for my program, and even though I haven't written anything to it?
Could that result in malicious code execution?
No, this causes 'only' undefined behaviour.
Simple code-execution exploits usually write past the end of a stack-allocated buffer, thereby overwriting a return address. When the function returns, it jumps to the malicious code. A write is always required, because otherwise there is no malicious code in your program's address space.
With a vector the chances of this are low, because the storage for the elements is not allocated on the stack.
Exploits that write to a wrong location on the heap are possible too, but they are much more complicated.
The first case reads beyond the vector's buffer and thus invokes undefined behaviour. Technically, this means literally anything can happen, but it's unlikely to be directly exploitable to run malicious code: either the program reads the invalid memory (getting a garbage value or a memory error), or the compiler has eliminated the code path altogether (because it's allowed to assume UB doesn't happen). Depending on what's done with the result, it might potentially reveal unintended data from memory, though.
In the second case, all is well. Your program has already written into this memory: std::vector's constructor value-initialised all six int objects. So you're guaranteed to find a 0 of type int there.

Why does my dynamically allocated array get initialized to 0?

I have some code that creates a dynamically allocated array with
int *Array = new int[size];
From what I understand, Array should be a pointer to the first element of the array in memory. When using gdb, I can call x Array to examine the value at the first memory location, x Array+1 to examine the second, and so on. I expected to see junk values left over from whatever application was using those spots in memory before mine. However, x Array returns 0x00000000 for all of those spots. What am I doing wrong? Is my code initializing all of the values of the array to zero?
EDIT: For the record, I ask because my program is an attempt to implement this: http://eli.thegreenplace.net/2008/08/23/initializing-an-array-in-constant-time/. I want to make sure that my algorithm isn't incrementing through the array to initialize every element to 0.
In most modern OSes, the OS gives zeroed pages to applications, as opposed to letting information seep between unrelated processes. That's important for security reasons, for example. Back in the old DOS days, things were a bit more casual. Today, with memory protected OSes, the OS generally gives you zeros to start with.
So, if this new happens early in your program, you're likely to get zeros. You'd be crazy to rely on that though; it's undefined behavior if you do.
If you keep allocating, filling, and freeing memory, eventually new will return memory that isn't zeroed. Rather, it'll contain remnants of your process' own earlier scribblings.
And there's no guarantee that any particular call to new, even at the beginning of your program, will return memory filled with zeros. You're just likely to see that for calls to new early in your program. Don't let that mislead you.
I expect to have junk values left over from whatever application was using those spots
It's certainly possible but by no means guaranteed. Particularly in debug builds, you're just as likely to have the runtime zero out that memory (or fill it with some recognisable bit pattern) instead, to help you debug things if you use the memory incorrectly.
And, really, "those spots" is a rather loose term, given virtual addressing.
The important thing is that, no, your code is not setting all those values to zero.

C++ What's the max number of bytes you can dynamically allocate using the new operator in Windows XP using VS2005?

I have C++ code that attempts to dynamically allocate a 2D array of bytes measuring approx 151 MB in size. When I attempt to go back and index through the array, my program crashes in exactly the same place every time with an "Access violation reading location 0x0110f000" error, but the indices appear to be in range. That leads me to believe the memory at those indices wasn't allocated correctly.
1) What's the max number of bytes you can dynamically allocate using the new operator?
2) If it is the case that I'm failing to dynamically allocate memory, would it make sense that my code crashes when attempting to access the array at exactly the same two indices every time? For some reason, I feel like they would be different every time the program is run, but what do I know ;)
3) If you don't think the problem is from an unsuccessful call to new, any other ideas what could be causing this error and crash?
Thanks in advance for all your help!
*Edit
Here's my code to allocate the 2d array...
#define HD_WIDTH  960
#define HD_HEIGHT 540
#define HD_FRAMES 100

// pHDVideo->VideoData is a char**
pHDVideo->VideoData = new char*[HD_FRAMES];
for (int iFrame = 0; iFrame < HD_FRAMES; iFrame++)
{
    // Create the new HD frame
    pHDVideo->VideoData[iFrame] = new char[HD_WIDTH * HD_HEIGHT * 3];
    memset(pHDVideo->VideoData[iFrame], 0, HD_WIDTH * HD_HEIGHT * 3);
}
and here's a screenshot of the crashing code and debugger (Dead Link) it will help.
I should add that the call to memset never fails, which to me means the allocation is successful, but I could be wrong.
EDIT
I found a fix everyone, thanks for all your help. Somehow, and I still need to figure out how, there was one extra horizontal line being upscaled, so I changed...
for(int iHeight = 0; iHeight < HD_HEIGHT; iHeight++)
to
for(int iHeight = 0; iHeight < HD_HEIGHT-1; iHeight++)
and it suddenly worked. Anyhow, thanks so much again!
Some possibilities to look at or things to try:
It may be that pHDVideo->VideoData[iFrame] or pHDVideo->VideoData is being freed somewhere. I doubt this is the case, but I'd check all the places where this can happen anyway. Output a debug statement each time you free one of those, and just before your crash statement.
Something might be overwriting the pHDVideo->VideoData[iFrame] values. Print them out when allocated and just before your crash statement to see if they've changed. If 0x0110f000 isn't within the range of one of them, that's almost certainly the case.
Something might be overwriting the pHDVideo value. Print it out when allocated and just before your crash statement to see if it's changed. This depends on what else is within your pHDVideo structure.
Please show us the code that crashes, with a decent amount of context so we can check that out as well.
In answer to your specific questions:
1/ It's implementation- or platform-specific, and it doesn't matter in this case. If your calls to new were failing, you'd get an exception or a null return, not a dodgy pointer.
2/ It's not the case: see (1).
3/ See above for some possibilities and things to try.
Following addition of your screenshot:
You do realize that the error message says "Access violation reading ..."?
That means it's not complaining about writing to pHDVideo->VideoData[iFrame][3*iPixel+2] but reading from this->VideoData[iFrame][3*iPixelIndex+2].
iPixelIndex is set to 25458, so can you confirm that this->VideoData[iFrame][76376] exists? I can't see from your screenshot how this->VideoData is allocated and populated.
How are you accessing the allocated memory? Does it always die on the same statement? It looks very much like you're running off the end of either the one-dimensional array of pointers, or one of the big blocks of chars it points to. As you say, the memset pretty much proves that the memory was allocated correctly. The total amount of memory you're allocating is around 0x9450C00 bytes, so the address you quoted is off the end of the allocated memory if it was allocated contiguously.
Your screenshot appears to show that iPixel is in range, but it doesn't show what the value of iFrame was. Is it outside the range 0-99?
Update: The bug isn't in allocating the memory, it's in your conversion from HD to SD coordinates. The value you're reading from on the SD buffer is out of range, because it's at coordinates (144,176), which isn't in the range (0,0)-(143,175).
If it is the case that I'm failing to dynamically allocate memory, would it make sense that my code is crashing when attempting to access the array at exactly the same two indices every time?
No, it wouldn't make sense to me.
If your call to operator new fails, I'd expect it to throw an exception or to return a null pointer (but not to return a non-null pointer to memory that's OK for some indices but not others).
Why are you using floating point math to calculate an integer index?
It looks like you are indexing out of range of the SD image buffer. 25344==SD_WIDTH*SD_HEIGHT, which is less than iPixelIndex, 25458.
Note that heap allocation (e.g. using new) is geared towards many small objects. If you're in the business of very large memory allocations, it might be better to use VirtualAlloc and friends.