In the documentation for avcodec_decode_video2 it gives the following warning:
Warning:
The input buffer must be FF_INPUT_BUFFER_PADDING_SIZE larger than the
actual read bytes because some optimized bitstream
readers read 32 or 64 bits at once and could read over the end. The
end of the input buffer buf should be set to 0 to ensure that no
overreading happens for damaged MPEG streams.
If this were not implemented would this cause segmentation faults when overreading occurs? Or would it potentially cause weird corruption? I'm just curious as I have corruption and I'm not sure if this could potentially be causing my problem.
It wouldn't necessarily cause segmentation faults, but it would be undefined behavior, since these readers would be reading unallocated memory. This could make the program crash immediately, or work for a while, or even appear to work fine: you can never be sure when it comes to undefined behavior.
Related
This question already has answers here:
No out of bounds error
(7 answers)
Accessing an array out of bounds gives no error, why?
(18 answers)
Closed 3 years ago.
Here's a sample of my code:
char chipid[13];
void initChipID(){
write_to_logs("Called initChipID");
strcpy(chipid, string2char(twelve_char_string));
write_to_logs("Chip ID: " + String(chipid));
}
Here's what I don't understand: even if I define chipid as char[2], I still get the expected result printed to the logs.
Why is that? Shouldn't the allocated memory space for chipid be overflown by the strcpy, and only the first 2 char of the string be printed?
Here's what I don't understand: even if I define chipid as char[2], I
still get the expected result printed to the logs.
Then you are (un)lucky. You are especially lucky if the undefined behavior produced by the overflow does not manifest as corruption of other data, yet also does not crash the program. The behavior is undefined, so you should not interpret whatever manifestation it takes as something you should rely upon, or that is specified by the language.
Why is that?
The language does not specify that it will happen, and it certainly doesn't specify why it does happen in your case.
In practice, the manifest ation you observe is as if the strcpy writes the full data into memory at the location starting at the beginning of your array and extending past its end, overwriting anything else your program may have stored in that space, and that the program subsequently reads it back via a corresponding overflowing read.
Shouldn't the allocated memory space for chipid be
overflown by the strcpy,
Yes.
and only the first 2 char of the string be
printed?
No, the language does not specify what happens once the program exercises UB by performing a buffer overflow (or by other means). But also no, C arrays are represented in memory simply as a flat sequence of contiguous elements, with no explicit boundary. This is why C strings need to be terminated. String functions do not see the declared size of the array containing a string's elements, they see only the element sequence.
You have it part correct: "the allocated memory space for chipid be overflowed by the strcpy" -- this is true. And this is why you get the full result (well, the result of an overflow is undefined, and could be a crash or other result).
C/C++ gives you a lot of power when it comes to memory. And with great power comes great responsibility. What you are doing gives undefined behaviour, meaning it may work. But it will definitely give problems later on when strcpy is writing to memory it is not supposed to write to.
You will find that you can get away with A LOT of things in C/C++. However, things like these will give you headaches later on when your program unexpectedly crashes, and this could be in an entire different part of your program, which makes it difficult to debug.
Anyway, if you are using C++, you should use std::string, which makes things like this a lot easier.
I created a stream using fmemopen(). I am closing it with fclose() and freeing the buffer after reading. Valgrind reports about problem at fclose() line:
==9323== Invalid write of size 8
==9323== at 0x52CAE52: _IO_mem_finish (memstream.c:139)
==9323== by 0x52C6A3E: fclose##GLIBC_2.2.5 (iofclose.c:63)
==9323== by 0x400CB6: main (main.cpp:80)
==9323== Address 0xffefffa80 is just below the stack ptr. To suppress,
use: --workaround-gcc296-bugs=yes
What's happening? Maybe fclose() cannot properly close a memory stream? Or maybe valgrind worries without reason and I can ignore that?
As you haven't posted your code, the following is pure speculation. You're probably writing more output to the stream than you promised to when you opened it. Did you account for the final NUL (if you didn't open with "b")?
Did you read the following in the manual page?
When a stream that has been opened for writing is flushed (fflush(3)) or closed (fclose(3)), a null byte is written at the end of the buffer if there is space. The caller should ensure that an extra byte is available in
the buffer (and that size counts that byte) to allow for this.
Attempts to write more than size bytes to the buffer result in an error. (By default, such errors will be visible only when the stdio buffer is flushed. Disabling buffering with setbuf(fp, NULL) may be useful to detect
errors at the time of an output operation. Alternatively, the caller can explicitly set buf as the stdio stream buffer, at the same time informing stdio of the buffer's size, using setbuffer(fp, buf, size).)
Following the latter advice should reveal the write that is exceeding the capacity.
It was my fault. I wrote that I use fmemopen() to open a stream, but this code wasn't running, and I get stream, produced earlier with open_memstream(). Although this stream was opened for writing, I can read from it. However valgrind reports about problem. I fixed my code, valgrind find no errors.
I had a bug in my code that went like this.
char desc[25];
char name[20];
char address[20];
sprintf (desc, "%s %s", name, address);
Ideally this should give a segfault. However, I saw this give a bus error.
Wikipedia says something to the order of 'Bus error is when the program tries to access an unaligned memory location or when you try to access a physical (not virtual) memory location that does not exist or is not allowed. '
The second part of the above statement sounds similar to a seg fault. So my question is, when do you get a SIGBUS and when a SIGSEGV?
EDIT:-
Quite a few people have mentioned the context. I'm not sure what context would be needed but this was a buffer overflow lying inside a static class function that get's called from a number of other class functions. If there's something more specific that I can give which will help, do ask.
Anyways, someone had commented that I should simply write better code. I guess the point of asking this question was "can an application developer infer anything from a SIGBUS versus a SIGSEGV?" (picked from that blog post below)
As you probably realize, the base cause is undefined behavior in your
program. In this case, it leads to an error detected by the hardware,
which is caught by the OS and mapped to a signal. The exact mapping
isn't really specified (and I've seen integral division by zero result
in a SIGFPE), but generally: SIGSEGV occurs when you access out of
bounds, SIGBUS for other accessing errors, and SIGILL for an illegal
instruction. In this case, the most likely explination is that your
bounds error has overwritten the return address on the stack. If the
return address isn't correctly aligned, you'll probably get a SIGBUS,
and if it is, you'll start executing whatever is there, which could
result in a SIGILL. (But the possibility of executing random bytes as
code is what the standards committee had in mind when they defined
“undefined behavior”. Especially on machines with no memory
protection, where you could end up jumping directly into the OS.)
A segmentation fault is never guaranteed when you're doing fishy stuff with memory. It all depends on a lot of factors (how the compiler lays out the program in memory, optimizations etc).
What may be illegal for a C++ program may not be illegal for a program in general. For instance the OS doesn't care if you step outside an array. It doesn't even know what an array is. However it does care if you touch memory that doesn't belong to you.
A segmentation fault occurs if you try to do a data access a virtual address that is not mapped to your process. On most operating systems, memory is mapped in pages of a few kilobytes; this means that you often won't get a fault if you write off the end of an array, since there is other valid data following it in the memory page.
A bus error indicates a more low-level error; a wrongly-aligned access or a missing physical address are two reasons, as you say. However, the first is not happening here, since you're dealing with bytes, which have no alignment restriction; and I think the second can only happen on data accesses when memory is completely exhausted, which probably isn't happening.
However, I think you might also get a bus error if you try to execute code from an invalid virtual address. This could well be what is happening here - by writing off the end of a local array, you will overwrite important parts of the stack frame, such as the function's return address. This will cause the function to return to an invalid address, which (I think) will give a bus error. That's my best guess at what particular flavour of undefined behaviour you are experiencing here.
In general, you can't rely on segmentation faults to catch buffer overruns; the best tool I know of is valgrind, although that will still fail to catch some kinds of overrun. The best way to avoid overruns when working with strings is to use std::string, rather than pretending that you're writing C.
In this particular case, you don't know what kind of garbage you have in the format string. That garbage could potentially result in treating the remaining arguments as those of an "aligned" data type (e.g. int or double). Treating an unaligned area as an aligned argument definitely causes SIGBUS on some systems.
Given that your string is made up of two other strings each being a max of 20 characters long, yet you are putting it into a field that is 25 characters, that is where your first issue lies. You are have a good potential to overstep your bounds.
The variable desc should be at least 41 characters long (20 + 20 + 1 [for the space you insert]).
Use valgrind or gdb to figure out why you are getting a seg fault.
char desc[25];
char name[20];
char address[20];
sprintf (desc, "%s %s", name, address);
Just by looking at this code, I can assume that name and address each can be 20 chars long. If that is so, then does it not imply that desc should be minimum 20+20+1 chars long? (1 char for the space between name and address, as specified in the sprintf).
That can be the one reason of segfault. There could be other reasons as well. For example, what if name is longer than 20 chars?
So better you use std::string:
std::string name;
std::string address;
std::string desc = name + " " + address;
char const *char_desc = desc.str(); //if at all you need this
I am observing a crash within my application and the call stack shows below
mfc42u!CString::AllocBeforeWrite+5
mfc42u!CString::operator=+22
No idea why this occuring. This does not occur frequently also.
Any suggestions would help. I have the crash dump with me but not able to progress any further.
The operation i am performing is something like this
iParseErr += m_RawMessage[wMsgLen-32] != NC_SP;
where m_RawMessage is a 512 length char array.
wMsgLen is unsigned short
and NC_SP is defined as
#define NC_SP 0x20 // Space
EDIT:
Call Stack:
042afe3c 5f8090dd mfc42u!CString::AllocBeforeWrite+0x5 * WARNING: Unable to verify checksum for WP Communications Server.exe
042afe50 0045f0c0 mfc42u!CString::operator=+0x22
042aff10 5f814d6b WP_Communications_Server!CParserN1000::iCheckMessage(void)+0x665 [V:\CSAC\SourceCode\WP Communications Server\HW Parser N1000.cpp # 1279]
042aff80 77c3a3b0 mfc42u!_AfxThreadEntry+0xe6
042affb4 7c80b729 msvcrt!_endthreadex+0xa9
042affec 00000000 kernel32!BaseThreadStart+0x37
Well this is complete call stack and i have posted the code snippet as in my original message
Thanks
I have a suggestion that might be a little frustrating for you:
CString::AllocBeforeWrite does implicate to me, that the system tries to allocate some memory.
Could it be, that some other memory operation (specially freeing or resizing of memory) is corrupted before?
A typical problem with C/C++ memory management is, that an error on freeing (or resizing) memory (for example two times freeing the same junk of memory) will not crash the system immediatly but can cause dumps much later -- specially when new memory is to be allocated.
Your situation looks to me quite like that.
The bad thing is:
It can be very difficult to find the place where the real error occurs -- where the heap is corrupted in the first place.
This also can be the reason, why your problem only occurs once in a while. It could depend on some complicated situation beforehand.
I'm sure you'll have checked the obvious: wMsgLen >= 32
I assume this is a common way to use sprintf:
char pText[x];
sprintf(pText, "helloworld %d", Count );
but what exactly happens, if the char pointer has less memory allocated, than it will be print to?
i.e. what if x is smaller than the length of the second parameter of sprintf?
i am asking, since i get some strange behaviour in the code that follows the sprintf statement.
It's not possible to answer in general "exactly" what will happen. Doing this invokes what is called Undefined behavior, which basically means that anything might happen.
It's a good idea to simply avoid such cases, and use safe functions where available:
char pText[12];
snprintf(pText, sizeof pText, "helloworld %d", count);
Note how snprintf() takes an additional argument that is the buffer size, and won't write more than there is room for.
This is a common error and leads to memory after the char array being overwritten. So, for example, there could be some ints or another array in the memory after the char array and those would get overwritten with the text.
See a nice detailed description about the whole problem (buffer overflows) here. There's also a comment that some architectures provide a snprintf routine that has a fourth parameter that defines the maximum length (in your case x). If your compiler doesn't know it, you can also write it yourself to make sure you can't get such errors (or just check that you always have enough space allocated).
Note that the behaviour after such an error is undefined and can lead to very strange errors. Variables are usually aligned at memory locations divisible by 4, so you sometimes won't notice the error in most cases where you have written one or two bytes too much (i.e. forget to make place for a NUL), but get strange errors in other cases. These errors are hard to debug because other variables get changed and errors will often occur in a completely different part of the code.
This is called a buffer overrun.
sprintf will overwrite the memory that happens to follow pText address-wise. Since pText is on the stack, sprintf can overwrite local variables, function arguments and the return address, leading to all sorts of bugs. Many security vulnerabilities result from this kind of code — e.g. an attacker uses the buffer overrun to write a new return address pointing to his own code.
The behaviour in this situation is undefined. Normally, you will crash, but you might also see no ill effects, strange values appearing in unrelated variables and that kind of thing. Your code might also call into the wrong functions, format your hard-drive and kill other running programs. It is best to resolve this by allocating more memory for your buffer.
I have done this many times, you will receive memory corruption error. AFAIK, I remember i have done some thing like this:-
vector<char> vecMyObj(10);
vecMyObj.resize(10);
sprintf(&vecMyObj[0],"helloworld %d", count);
But when destructor of vector is called, my program receive memory corruption error, if size is less then 10, it will work successfully.
Can you spell Buffer Overflow ? One possible result will be stack corruption, and make your app vulnerable to Stack-based exploitation.