Why am I getting segfault from this unsigned int? - c++

I'm trying to initialize an integer array and set all elements to 1. I need the array to have an upper bound of 4294967295, or the maximum number possible for a 32-bit unsigned int.
This seems like a trivial task to me, and it should be, but I am running into a segfault. I can run the for loop empty and it seems to work fine (albeit slowly, but it's processing nearly 4.3 billion numbers so I won't complain). The problem shows up when I try to perform any kind of action within the loop. The statement below, primeArray[i] = 1;, causes the segfault. As best I can tell, this shouldn't overrun the array. If I comment out that line, there is no segfault.
It's late and my tired eyes are probably just missing something simple, but I could use another pair.
Here's what I've got:
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <stdint.h>
#define LIMIT 0xFFFFFFFF;
int main(int argc, char const *argv[])
{
uint32_t i;
uint32_t numberOfPrimes = LIMIT; // hardcoded for debugging
int *primeArray = (int*) malloc(numberOfPrimes * sizeof(int));
for (i = 0; i < numberOfPrimes; ++i) {
primeArray[i] = 1;
}
}

Check the return value from malloc() to make sure the array was actually allocated. I suspect that the allocation fails, in which case the following check would skip the loop:
int *primeArray = (int*) malloc(numberOfPrimes * sizeof(int));
if (primeArray != NULL) { /* check that array was allocated */
for (i = 0; i < numberOfPrimes; ++i) {
primeArray[i] = 1;
}
}

Your malloc call requests roughly 16 gigabytes of memory from the system (4294967295 elements times 4 bytes, assuming a 4-byte int). If you don't have that much free virtual memory, or if you are running on any 32-bit system, the call will fail. Because your code doesn't check for that failure, primeArray will be NULL and any subsequent access to its elements will cause a segmentation fault.
If you really need to work with an array that large, you will either need to get a 64-bit system with a lot of memory, or rewrite your program to work with a smaller working set, and persist the rest to disk.

Related

How to fix 'Conditional jump or move depends on uninitialised value(s)' valgrind error for strlen?

My goal is to reverse a string in two-digit groups, e.g. 123456 becomes 563412.
I'm using valgrind to check for memory problems, but the strlen(reverse_chr) call produces this error:
Conditional jump or move depends on uninitialised value(s)
Here is my code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h> // malloc, free
int main()
{
char chr[] = "123456";
char* reverse_chr=(char *) malloc(strlen(chr)+1);
memset(reverse_chr, 0, strlen(chr));
int chrlen=strlen(chr);
for (int t=0; t<chrlen; t+=2)
{
reverse_chr[t]=chr[chrlen-t-2];
reverse_chr[t+1]=chr[chrlen-t-1];
}
int len_reverse_chr = strlen(reverse_chr);
free(reverse_chr);
return 0;
}
I expect output without any valgrind error.
The problem is that reverse_chr is not a valid string as it's not properly terminated.
char* reverse_chr=(char *) malloc(strlen(chr)+1);
memset(reverse_chr, 0, strlen(chr));
You allocate 7 bytes, but only set the first 6 to 0.
for (int t=0; t<chrlen; t+=2)
{
reverse_chr[t]=...
reverse_chr[t+1]=...
This for loop also only writes to the first 6 elements of reverse_chr.
int len_reverse_chr = strlen(reverse_chr);
Then this line tries to find a NUL byte in reverse_chr, but the first 6 elements aren't '\0' and the 7th is uninitialized (hence the complaint by valgrind).
Fix:
Either do
reverse_chr[chrlen] = '\0';
after the loop, or use calloc:
reverse_chr = static_cast<char *>(calloc(strlen(chr)+1, sizeof *reverse_chr));
This way all allocated bytes are initialized (and you don't need memset anymore).

How to copy a vector to a subvector of another vector without segfaulting?

I have some code that looks like this:
for (int = i; i > (chip.rom.size() - 1); i++) {
//int i = 0;
chip.memory[i + 512] = chip.rom[i];
//i++;
}
chip is a struct in which memory and rom are both vectors of unsigned chars.
rom is a vector of 160 bytes holding the game ROM that I'm using to test my emulator.
memory is zeroed to 4096 bytes like so:
chip.memory = std::vector<BYTE>(4096);
Upon debugging, I managed to find out that I was segfaulting after this for statement. I feel like I'm going crazy! What obvious error am I missing?
You don't initialize i and your comparison is in the wrong direction. Also, if you use < for your comparison, the - 1 is unnecessary if you're just trying to avoid running off the end of chip.rom.
You shouldn't even be using a loop at all here really. Here's how the code ought to read:
#include <algorithm> // Somewhere up above
::std::copy(chip.rom.begin(), chip.rom.end(), chip.memory.begin() + 512);
Use the stuff from the <algorithm> header and you don't have to remember to get everything in your for statement correct all the time. Not only that, but it's likely your compiler will generate more efficient code.

Memory page alignment doesn't seem to affect performance...?

I am developing an application for which performance is a fundamental issue. In particular, I wanted to organize a tree-like structure that needs to be traversed really quickly into blocks of the same size as my memory page, so as to reduce the number of cache misses needed to reach a leaf.
I am quite a novice in the art of memory optimization. As far as I understand, the process of accessing the main memory goes more or less as follows:
CPUs have several layers of cache of increasing size and decreasing speed.
Every time some data that I need is already in the cache, it is fetched from the cache (cache hit).
If it is not in the cache, it will be fetched from the main memory.
Anytime something is loaded from the main memory, the whole page (or pages) containing the data are loaded and stored in the cache. In this way, if I try to access locations in memory that are close to the ones I already fetched from the main memory, they will already be in my CPU cache.
However, if I organize my data in blocks of the same size as my memory page, I thought I would also need to align that data properly, so that whenever a new block of my data needs to be loaded, only one page of memory is fetched from the main memory (rather than the two pages containing the first half and the second half of my data block). In principle, shouldn't a correctly aligned data block mean only one access to the memory rather than two? Shouldn't that more or less double memory performance?
I tried the following:
#include <iostream>
#include <unistd.h>
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <stdlib.h>
using namespace std;
#define BLOCKS 262144
#define TESTS 131072
unsigned long int utime()
{
struct timeval tp;
gettimeofday(&tp, NULL);
return tp.tv_sec * 1000000 + tp.tv_usec;
}
unsigned long int pagesize = sysconf(_SC_PAGE_SIZE);
unsigned long int block_slots = pagesize / sizeof(unsigned int);
unsigned int t = 0;
unsigned int p = 0;
unsigned long int test(unsigned int * data)
{
unsigned long int start = utime();
for(unsigned int n=0; n<TESTS; n++)
{
for(unsigned int i=0; i<block_slots; i++)
t += data[p * block_slots + i];
p = t % BLOCKS;
}
unsigned long int end = utime();
return end - start;
}
int main()
{
srand((unsigned int) time(NULL));
char * buffer = new char[(BLOCKS + 3) * pagesize];
for(unsigned int i=0; i<(BLOCKS + 3) * pagesize; i++)
buffer[i] = rand();
for(unsigned int i=0; i<pagesize; i++)
cout<<test((unsigned int *) (buffer + i))<<endl;
cout<<"("<<t<<")"<<endl;
delete [] buffer;
}
This code allocates roughly 1 GB and fills it with random numbers. Then the function test is called with every possible shift within a memory page (from a shift of 0 to a shift of pagesize - 1, i.e. 4095 with 4096-byte pages). The test function interprets the pointer provided as a group of blocks of data and carries out some simple operation (sum) over those blocks. The order of access to the blocks is more or less random (it's determined by the partial sums) so that every time a new block is accessed it is nearly certain not to already be in the cache.
The function test is then timed. I expected all but one of the shift configurations to show similar timings, and one particular shift (the zero shift, maybe?) to show a big improvement in efficiency. This, however, does not happen at all: all the shift timings are perfectly compatible with each other.
Why does this happen and what does this mean? Can I just forget about memory alignment? Can I also forget about making my data blocks exactly as big as a memory page? (I was planning to use some padding in case they were smaller). Or maybe something in the cache management process is just unclear to me?

Why does allocating large chunks of memory fail when reallocing small chunks doesn't

This code results in x pointing to a chunk of memory 100GB in size.
#include <stdlib.h>
#include <stdio.h>
int main() {
auto x = malloc(1);
for (int i = 1; i< 1024; ++i) x = realloc(x, i*1024ULL*1024*100);
while (true); // Give us time to check top
}
While this code fails allocation.
#include <stdlib.h>
#include <stdio.h>
int main() {
auto x = malloc(1024ULL*1024*100*1024);
printf("%p\n", x);
while (true); // Give us time to check top
}
My guess is that the memory size of your system is less than the 100 GiB that you are trying to allocate. While Linux does overcommit memory, it still rejects requests that are way beyond what it can fulfill. That is why the second example fails.
The many small increments of the first example, on the other hand, are way below that threshold. So each one of them succeeds as the kernel knows that you didn't require any of the prior memory yet, so it has no indication that it won't be able to back those 100 additional MiB.
I believe that the threshold for when a memory request from a process fails is relative to the available RAM, and that it can be adjusted (though I don't remember how exactly).
Well you're allocating less memory in the one that succeeds:
for (int i = 1; i< 1024; ++i) x = realloc(x, i*1024ULL*1024*100);
The last realloc is:
x = realloc(x, 1023 * (1024ULL*1024*100));
As compared to:
auto x = malloc(1024 * (1024ULL*100*1024));
Maybe that's right where your memory boundary is - the last 100M that broke the camel's back?

How to get a "bus error"?

I am trying very hard to get a bus error.
One way is misaligned access and I have tried the examples given here and here, but no error for me - the programs execute just fine.
Is there some situation which is sure to produce a bus error?
This should reliably result in a SIGBUS on a POSIX-compliant system.
#include <unistd.h>
#include <stdio.h>
#include <sys/mman.h>
int main() {
FILE *f = tmpfile();
int *m = mmap(0, 4, PROT_WRITE, MAP_PRIVATE, fileno(f), 0);
*m = 0;
return 0;
}
From the Single Unix Specification, mmap:
References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal.
Bus errors can only be invoked on hardware platforms that:
Require aligned access, and
Don't compensate for an unaligned access by performing two aligned accesses and combining the results.
You probably do not have access to such a system.
Try something along the lines of:
#include <signal.h>
int main(void)
{
raise(SIGBUS);
return 0;
}
(I know, probably not the answer you want, but it's almost sure to get you a "bus error"!)
As others have mentioned this is very platform specific. On the ARM system I'm working with (which doesn't have virtual memory) there are large portions of the address space which have no memory or peripheral assigned. If I read or write one of those addresses, I get a bus error.
You can also get a bus error if there's actually a hardware problem on the bus.
If you're running on a platform with virtual memory, you might not be able to intentionally generate a bus error with your program unless it's a device driver or other kernel mode software. An invalid memory access would likely be trapped as an access violation or similar by the memory manager (and it never even has a chance to hit the bus).
On Linux with an Intel CPU, try this:
int main(int argc, char **argv)
{
# if defined i386
/* enable alignment check (AC) */
asm("pushf; "
"orl $(1<<18), (%esp); "
"popf;");
# endif
char d[] = "12345678"; /* yep! - causes SIGBUS even on Linux-i386 */
return 0;
}
The trick here is to set the "alignment check" bit in one of the CPU's "special" registers.
see also: here
I am sure that you must be using an x86 machine.
An x86 CPU does not generate a bus error on misaligned access unless the AC flag in its EFLAGS register is set.
Try this code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char *p;
__asm__("pushf\n"
"orl $0x40000, (%rsp)\n"
"popf");
/*
* malloc() always provides aligned memory.
* Do not use stack variable like a[9], depending on the compiler you use,
* a may not be aligned properly.
*/
p = malloc(sizeof(int) + 1);
memset(p, 0, sizeof(int) + 1);
/* making p unaligned */
p++;
printf("%d\n", *(int *)p);
return 0;
}
More about this can be found at http://orchistro.tistory.com/206
Also keep in mind that some operating systems report "bus error" for errors other than misaligned access. You didn't mention in your question what you were actually trying to achieve. Maybe try this:
int *x = 0;
*x=1;
The Wikipedia page you linked to mentions that access to nonexistent memory can also result in a bus error. You might have better luck loading a known-invalid address into a pointer and dereferencing that.
How about this? untested.
#include<stdio.h>
typedef struct
{
int a;
int b;
} busErr;
int main()
{
busErr err;
char * cPtr;
int *iPtr;
cPtr = (char *)&err;
cPtr++;
iPtr = (int *)cPtr;
*iPtr = 10;
}
int main(int argc, char **argv)
{
char *bus_error = new char[1];
for (int i=0; i<1000000000;i++) {
bus_error += 0xFFFFFFFFFFFFFFF;
*(bus_error + 0xFFFFFFFFFFFFFF) = 'X';
}
}
Bus error: 10 (core dumped)
Simple, write to memory that isn't yours:
int main()
{
char *bus_error = 0;
*bus_error = 'X';
}
Instant bus error on my PowerPC Mac (OS X 10.4, dual 1 GHz PPC7455s), not necessarily on your hardware and/or operating system.
There's even a Wikipedia article about bus errors, including a program to make one.
For the x86 architecture:
#include <stdio.h>
int main()
{
#if defined(__GNUC__)
# if defined(__i386__)
/* Enable Alignment Checking on x86 */
__asm__("pushf\norl $0x40000,(%esp)\npopf");
# elif defined(__x86_64__)
/* Enable Alignment Checking on x86_64 */
__asm__("pushf\norl $0x40000,(%rsp)\npopf");
# endif
#endif
int b = 0;
int a = 0xffffff;
char *c = (char*)&a;
c++;
int *p = (int*)c;
*p = 10; //Bus error as memory accessed by p is not 4 or 8 byte aligned
printf("%zu\n", sizeof(a)); /* sizeof yields a size_t */
printf("%x\n", *p);
printf("%p\n", (void *)p); /* pointers are printed with %p */
printf("%p\n", (void *)&a);
}
Note: if the asm instructions are removed, the code won't generate the SIGBUS error, as suggested by others.
SIGBUS can occur for other reasons too.
Bus errors occur if you try to access memory that is not addressable by your computer. For example, your computer's memory has an address range 0x00 to 0xFF but you try to access a memory element at 0x0100 or greater.
In reality, your computer will have a much greater range than 0x00 to 0xFF.
To answer your original post:
Tell me some situation which is sure to produce a bus error.
In your code, index into memory way outside the scope of the max memory limit. I dunno ... use some kind of giant hex value 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF indexed into a char* ...