Accessing array beyond the limit - c++

I have two character arrays of size 100 (char array1[100], char array2[100]). I want to check whether anybody is accessing an array beyond its limit. This is necessary because, suppose the memory allocated for array1 and array2 is consecutive, so that array2 starts where array1 ends. If anyone then writes array1[101], it is conceptually wrong; the compiler may give a warning, but the program will not necessarily crash. How can I detect such problems and solve them?
Update 1:
I already have 15,000 lines of code. I have to check this condition for that code; I can invoke my own functions but cannot change the code already written. Please make suggestions accordingly.

Most modern languages will detect this and prevent it from happening. C and its derivatives don't detect this, and basically can't detect this, because of the numerous ways you can access the memory, including bare pointers. If you can restrict the way you access the memory, then you can possibly use a function or something to check your access.

My initial response to this would be to wrap the access to these arrays in a function or method and send the index as a parameter. If the index is out of bounds, raise an exception or report the error in some other way.
EDIT:
This is of course a run-time prevention. I don't know how you would check this at compile time if the compiler cannot check it for you. Also, as Kolky has already pointed out, it would be easier to answer this if we knew which language you are using.
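The checked-access idea above can be sketched for a built-in array as follows. This is only an illustration: checked_at and demo_checked_at are hypothetical names, and the sketch assumes the call sites can be routed through such a helper.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: checked access to a built-in array of known size N.
template <typename T, std::size_t N>
T& checked_at(T (&arr)[N], std::size_t index) {
    if (index >= N) {
        throw std::out_of_range("index outside array bounds");
    }
    return arr[index];
}

// Example use: returns true if the out-of-range access was caught.
bool demo_checked_at() {
    char array1[100] = {};
    checked_at(array1, 99) = 'x';        // last valid element
    assert(array1[99] == 'x');
    try {
        checked_at(array1, 101) = 'y';   // past the end: throws
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Because the size is deduced from the array type, this only works where the real array (not a pointer to it) is visible.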

If you are using C++ rather than C, is there any reason you can't use std::vector? Its at() member function gives you bounds checking if the user goes outside your range (operator[] does not). Am I missing something here?
Wouldn't it be sensible to prevent the user having direct access to the collections in the first place?

If you use boost::array or similar and access elements through at(), you will get a range_error exception if array bounds are overstepped (operator[] does not throw). http://www.boost.org/doc/libs/1_44_0/doc/html/boost/array.html. Boost is fabulous.
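As a sketch of the same idea with the standard library, assuming a C++11 compiler is available: std::array is the standardized counterpart of boost::array, and its at() throws std::out_of_range. The function name at_detects_overrun is made up for the example.

```cpp
#include <array>
#include <cassert>
#include <stdexcept>

// Returns true if at() reported the out-of-range index.
bool at_detects_overrun() {
    std::array<char, 100> array1 = {};   // all elements zero
    array1.at(99) = 'x';                 // last valid element
    assert(array1[99] == 'x');
    try {
        array1.at(101) = 'y';            // throws instead of corrupting memory
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Note that array1[101] with operator[] would still be unchecked; only at() performs the test.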

In C/C++, there is no general solution. You can't do it at compile time since there are too many ways to change memory in C. Example:
char * ptr = array2;
ptr = foo(ptr); // ptr --;
ptr now contains a valid address but the address is outside of array2. This can be a bug or what you want. C can't know (there is no way to say "I want it so" in C), so the compiler can't check it. Similarly:
char * array2 = malloc(100);
How should the C compiler know that you are treating the memory as a char array and would like a warning when you write to array2[100]?
Therefore, most solutions use "mungwalls", i.e. when you call malloc(), they will actually allocate 16/32 bytes more than you ask for:
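malloc(size) {
    mungwall_size = 16;
    // room for one mungwall before and one after the caller's block
    ptr = real_malloc(size + mungwall_size*2);
    createMungwall(ptr, mungwall_size);                        // wall before the block
    createMungwall(ptr + mungwall_size + size, mungwall_size); // wall after the block
    return ptr + mungwall_size;                                // caller's block starts after the first wall
}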
In free(), it checks that the 16 bytes before and after the allocated memory area haven't been touched (i.e. that the mungwall pattern is still intact). While not perfect, it makes your program crash earlier (and hopefully closer to the bug).
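A minimal runnable sketch of the mungwall idea is shown below. Everything in it is an assumption made for illustration: the names guarded_malloc, guards_intact and guarded_free, the 16-byte header that stores the requested size, and the 0xA5 pattern are not taken from any real allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical constants for this sketch.
const std::size_t kHeaderSize = 16;     // stores the requested size, keeps alignment
const std::size_t kGuardSize = 16;      // guard bytes on each side of the user block
const unsigned char kGuardByte = 0xA5;  // arbitrary pattern

// Layout: [header][guard][user block of `size` bytes][guard]
void* guarded_malloc(std::size_t size) {
    unsigned char* raw = static_cast<unsigned char*>(
        std::malloc(kHeaderSize + kGuardSize + size + kGuardSize));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &size, sizeof size);                       // remember the size
    std::memset(raw + kHeaderSize, kGuardByte, kGuardSize);     // leading guard
    std::memset(raw + kHeaderSize + kGuardSize + size,
                kGuardByte, kGuardSize);                        // trailing guard
    return raw + kHeaderSize + kGuardSize;
}

// True if both guard areas still hold the pattern.
bool guards_intact(const void* user) {
    const unsigned char* p = static_cast<const unsigned char*>(user);
    const unsigned char* raw = p - kGuardSize - kHeaderSize;
    std::size_t size = 0;
    std::memcpy(&size, raw, sizeof size);
    for (std::size_t i = 0; i < kGuardSize; ++i) {
        if (raw[kHeaderSize + i] != kGuardByte) return false;   // underrun
        if (p[size + i] != kGuardByte) return false;            // overrun
    }
    return true;
}

// Frees the block and reports whether the guards were intact.
bool guarded_free(void* user) {
    if (user == nullptr) return true;
    bool ok = guards_intact(user);
    std::free(static_cast<unsigned char*>(user) - kGuardSize - kHeaderSize);
    return ok;
}

// Example: an off-by-one write is detected.
bool demo_guard_detection() {
    char* a = static_cast<char*>(guarded_malloc(100));
    assert(a != nullptr);
    a[99] = 'x';                         // in bounds: guards untouched
    bool ok_before = guards_intact(a);
    a[100] = 'y';                        // one past the end: lands in the trailing guard
    bool ok_after = guards_intact(a);
    guarded_free(a);
    return ok_before && !ok_after;
}
```

The check only notices writes that land inside the guard areas, and only when someone looks, which is why such schemes catch bugs late rather than at the faulty access.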
You could also use special CPU commands to check all memory accesses but this approach would make your program 100 to 1 million times slower than it is now.
Therefore, many languages designed after C don't expose raw pointers, which means an "array" is a basic type that carries its size. Now, you can check every array access with a simple compare.
If you want to write code in C which is safe, you must emulate this. Create an array type, and never use pointers or char * for strings. It means you must convert your data type all the time (because all library functions use const char * for strings) but it makes your code safer.
Languages do age. C is now 40 years old and our knowledge has moved on. It's still used in a lot of places but it shouldn't be the first choice anymore. The same applies (to a lesser extent) to C++ because it suffers from the same fundamental flaws as C (even though you now have libraries and frameworks which work around many of them).

If you're in C++ you can write a quick wrapper class.
#include <stdexcept>

template<typename T, int size> class my_array_wrapper {
    T contents[size];
    // Throws if index is outside [0, size).
    static void check(int index) {
        if (index < 0 || index >= size)
            throw std::runtime_error("Attempted to access outside array bounds!");
    }
public:
    T& operator[](int index) {
        check(index);
        return contents[index];
    }
    const T& operator[](int index) const {
        check(index);
        return contents[index];
    }
    // Pointer decay: access through these pointers is NOT bounds-checked.
    operator T*() {
        return contents;
    }
    operator const T*() const {
        return contents;
    }
};
my_array_wrapper<char, 100> array1;
array1[101]; // exception
Problem solved, although if you access the contents through the pointer decay there will be no bounds checking. Alternatively, you could use boost::array as a ready-made solution.

If you ran a static analyser (e.g. cppcheck) against your code, it could report a bounds error:
http://en.wikipedia.org/wiki/User:Exuwon/Cppcheck#Bounds_checking
To solve it, you'd be better off using a container of some sort (e.g. std::vector) or writing a wrapper.

Related

3D pointer seg error

I want to stick 2D arrays together in a 3D array. First I defined the 3D array in the following way:
int ***grid;
grid=new int **[number];
then I want to assign the 2D arrays to the 3D construct
for(i=0;i<number;i++)
grid[i]=rk4tillimpact2dens(...);
with
int** rk4tillimpact2dens(...
...
static int** grid;
grid=new int*[600];
for(i=0;i<600;i++)
grid[i]=new int[600];
memset(grid,0x0,sizeof(grid));
...
return(grid);
}
so far no problem, everything works fine, but when I want to access the 3D array afterwards I get a seg fault. Like that e.g.
printf("%d",grid[1][1][1]);
What is my mistake?
Best,
Hannes
Oh, sorry, it was typo in my question, I did
printf("%d",grid[1][1][1]);
it's not working :(. But even
printf("%d",&grid[1][1][1]);
or
printf("%d",*grid[1][1][1]);
would not work. The strange thing is, that there are no errors unless I try to access the array
First, you discard the very first row of each matrix with that memset (the actual row is leaked). While technically grid[1][1][1] should still be readable, it probably becomes corrupt in some other place.
Can you provide a minimal verifiable example? This is likely to solve your problem.
To clear out the memory allocated for grid, you can't do the whole NxN matrix with one memset, because it isn't contiguous memory. Since each row is allocated as a separate memory block, you need to clear them individually.
for(i=0;i<600;i++) {
grid[i]=new int[600];
memset(grid[i], 0, sizeof(int) * 600);
}
The 600 value should be a named constant, and not a hardcoded number.
And grid does not need to be a static variable.
Printing out the address
printf("%p",&grid[1][1][1]);
You are printing the address here. That's why you may not get what you desire to see.
printf("%d",grid[1][1][1]);
This will print the array element.
And to read input from stdin you will use scanf(), which requires you to pass the address of a variable.
scanf("%d",&grid[1][1][1]);
Zeroing out the allocated memory
Also, you can't get the size of the array using sizeof. So to initialize with 0, use memset on each chunk that was allocated by a single new.
In your case it would be, as 1201ProgramAlarm pointed out:
for(int i = 0; i < 600; i++){
...
memset(grid[i],0,sizeof(int)*600);
}
There is another way to initialise allocated memory in C++: value-initialise it with ().
grid[i]=new int[600]();
For example:
int** rk4tillimpact2dens(...
...
static int** grid;
grid=new int*[600];
for(i=0;i<600;i++)
grid[i]=new int[600]();
...
return(grid);
}
Do you expect memset(grid,0x0,sizeof(grid)); not to zero the pointer values you've just assigned? Since grid is a pointer, sizeof(grid) is the size of one pointer, so on typical platforms the call overwrites grid[0] with zero bits and leaves grid[1] through to grid[599] alone. You should test that by inspecting the pointer values of grid[0] through to grid[599] before and after that call to memset, to find out what memset does to true (more on that later) arrays.
Any access through a pointer zeroed by that line of code dereferences a null pointer. Typically, a crash can be expected when you attempt to dereference a null pointer, because null pointers don't reference any objects. This would explain a crash whenever the first row of one of your matrices is accessed, and the crash disappearing when you comment out that call to memset. You can't expect good things to happen if you try to use the value of something which isn't an object, such as the elements of a row whose pointer consists entirely of zero bits.
The term 3D array doesn't mean what you think it means, by the way. Arrays in C and C++ are considered to be a single allocation, whereas what your code is producing seems to be multiple allocations, associated in a hierarchical form; you've allocated a tree as opposed to an array, and memset isn't appropriate to zero a tree. Perhaps your experiments could be better guided from this point on by a book regarding algorithms, such as Algorithms in C, parts 1-4 by Robert Sedgewick.
For the meantime, in C, the following will get you a pointer to a 2D array which you can mostly use as though it's a 3D array:
void *make_grid(size_t x, size_t y, size_t z) {
int (*grid)[y][z] = malloc(x * sizeof *grid);
/* XXX: use `grid` as though it's a 3D array here.
* i.e. grid[0][1][2] = 42;
*/
return grid;
}
Assuming make_grid returns something non-null, you can use a single call to memset to zero the entirety of the array pointed to by that function, because there's a single call to malloc matching that single call to memset. Otherwise, if you want to zero a tree, you'll probably want to call memset n times for n items.
In C++, I don't think you'll find many who discourage the use of std::vector in place of arrays. You might want to at least consider that option, as well as the other options you have (such as trees; it seems like you want to use a tree, which is fine because trees have perfectly appropriate usecases that arrays aren't valid for, and you haven't given us enough context to tell which would be most appropriate for you).
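As a sketch of the std::vector option, the nested-vector layout below mirrors the grid[i][j][k] indexing used in the question. The type names and make_grid are made up for the example, and the dimensions are passed in rather than hardcoded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> Row;
typedef std::vector<Row> Matrix;
typedef std::vector<Matrix> Grid3D;

// Builds `count` matrices of rows x cols ints, all initialised to zero.
Grid3D make_grid(std::size_t count, std::size_t rows, std::size_t cols) {
    return Grid3D(count, Matrix(rows, Row(cols, 0)));
}

// Example use: no memset and no manual delete are needed.
int demo_grid() {
    Grid3D grid = make_grid(3, 600, 600);
    assert(grid[1][1][1] == 0);     // already zeroed
    grid.at(2).at(5).at(7) = 42;    // at() would throw on a bad index
    return grid[2][5][7];
}
```

Each row is still a separate allocation, so this is the same tree shape as the original code, only with ownership and initialisation handled by the containers.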

Determine the nature of parameter in runtime

I have a function
void fname(char* Ptr)
{
...
}
I want to know inside this function whether this pointer Ptr holds the address of dynamically allocated memory using new char[] or the address of locally allocated memory in the calling function. Is there any way I can determine that? I think <typeinfo> doesn't help here.
One way to do this is to have your own operator new functions and keep track of everything allocated so that you can just ask your allocation library if the address given is one it allocated. The custom allocator then just calls the standard one to actually do the allocation.
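A minimal sketch of the bookkeeping idea follows. To keep it short it uses explicit helper functions instead of replacing the global operator new; the names allocation_registry, tracked_new_chars, tracked_delete_chars and is_tracked are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Registry of every block handed out by tracked_new_chars.
std::set<const void*>& allocation_registry() {
    static std::set<const void*> registry;
    return registry;
}

// Allocates with the standard new[] and records the address.
char* tracked_new_chars(std::size_t n) {
    char* p = new char[n];
    allocation_registry().insert(p);
    return p;
}

// Forgets the address, then releases the block.
void tracked_delete_chars(char* p) {
    allocation_registry().erase(p);
    delete[] p;
}

// True only for the exact start address of a live tracked block.
bool is_tracked(const void* p) {
    return allocation_registry().count(p) != 0;
}

// Example: a heap block is recognised, a local array is not.
bool demo_tracking() {
    char local[8] = "stack";
    char* heap = tracked_new_chars(8);
    bool result = is_tracked(heap) && !is_tracked(local);
    tracked_delete_chars(heap);
    return result;
}
```

This sketch only answers the question for pointers to the start of a block; recognising addresses inside a block would require storing sizes as well.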
Another approach (messy and details highly OS dependent) may be to examine the process layout in virtual memory and hence determine which addresses refer to which areas of memory.
You can combine these ideas by actually managing your own memory pools. So if you get a single large chunk of system memory with known address bounds and use that for all new'd memory, you can just check that an address is in the given range to answer your question.
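The pool variant can be sketched like this. CharPool, its fixed 4096-byte buffer and the simple bump allocation are assumptions chosen to keep the example small; a real pool would also need deallocation and alignment handling.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical pool: hands out chars from one buffer with known bounds.
class CharPool {
    char buffer_[4096];
    std::size_t used_;
public:
    CharPool() : used_(0) {}

    // Bump allocation; returns nullptr when the pool is exhausted.
    char* allocate(std::size_t n) {
        if (n > sizeof buffer_ - used_) return nullptr;
        char* p = buffer_ + used_;
        used_ += n;
        return p;
    }

    // Range check: is p inside the pool's buffer?
    bool owns(const void* p) const {
        const char* c = static_cast<const char*>(p);
        std::less<const char*> before;   // ordering usable for unrelated pointers
        return !before(c, buffer_) && before(c, buffer_ + sizeof buffer_);
    }
};

// Example: pool memory is recognised, a local array is not.
bool demo_pool() {
    CharPool pool;
    char local[16] = "elsewhere";
    char* p = pool.allocate(16);
    assert(p != nullptr);
    return pool.owns(p) && pool.owns(p + 15) && !pool.owns(local);
}
```

The comparison goes through std::less because comparing unrelated pointers with < is not guaranteed to give a meaningful result.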
However: Any of these ideas is a lot of work and not appropriate if this problem is the only purpose in doing so.
Having said all that, if you do want to implement something, you will need to work carefully through all the ways that an address might be generated.
For example (and surely I've missed some):
Stack
Return from new
Inside something returned from new.
Was returned from new but already deleted (hopefully not, but that's why we need diagnostics)
statically allocated
static constant memory
command line arguments/ environment
code addresses.
Now, ignoring all that for a moment, and assuming this is for some debug purpose rather than system design, you might be able to try this kind of thing:
This is ugly, unreliable, not guaranteed by the standard, etc etc, but might work . . .
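#include <cstdint>
#include <cstring>
#include <iostream>

char* firstStack = 0;
bool isOnStack(const void* p)
{
    // Compare addresses as integers; multiplying pointer differences stored in int can overflow.
    std::uintptr_t check = (std::uintptr_t)p;
    std::uintptr_t here = (std::uintptr_t)&check;       // a local: near the current top of the stack
    std::uintptr_t first = (std::uintptr_t)firstStack;  // a local of main: near the bottom of the stack
    // p counts as "on the stack" if it lies strictly between the two, whichever way the stack grows.
    return (first > check && check > here) || (first < check && check < here);
}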
void g(const char* p)
{
bool onStack = isOnStack(p);
std::cout << p << (onStack ? "" : " not" ) << " on stack " << std::endl;
}
void f()
{
char stuff[1024] = "Hello";
g(stuff);
}
void h()
{
char* nonsense = new char[1024];
strcpy(nonsense, "World");
g(nonsense);
delete [] nonsense;
}
int main()
{
int var = 0;
firstStack = (char*)&var;
f();
h();
}
Output:
Hello on stack
World not on stack
The short answer: no, you can't. You have no way of knowing whether Ptr is a pointer to a single char, the start of a statically allocated array, a pointer to a single dynamically allocated char, or the start of an array thereof.
If you really wanted to, you could try an overload like so:
template <std::size_t N>
void fname(char (&Ptr)[N])
{
// ...
}
which would match when passed an actual array (one whose size is known at compile time), whereas the first version would be picked when dealing with dynamically allocated memory or a pointer to a single char.
(But note that function overloading rules are a bit complicated in the presence of templates -- in particular, a non-template function is preferred if it matches. So you might need to make the original function take a "dummy" template parameter if you go for this approach.)
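A sketch of that overload pair, with the pointer version turned into a template as the note above suggests, might look as follows. The function name describe is made up, and the result reflects only the static type of the argument.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Generic version: chosen for char* and other non-array arguments.
template <typename T>
const char* describe(T) {
    return "pointer";
}

// More specialised version: chosen when a real char array is passed.
template <std::size_t N>
const char* describe(char (&)[N]) {
    return "array";
}

// Example: the same bytes are classified by how they are passed.
bool demo_describe() {
    char buffer[8] = "Hello";
    char* decayed = buffer;              // size information is lost here
    char* dynamic = new char[8];
    bool ok = std::string(describe(buffer)) == "array"
           && std::string(describe(decayed)) == "pointer"
           && std::string(describe(dynamic)) == "pointer";
    delete[] dynamic;
    return ok;
}
```

As the decayed case shows, this cannot tell where the memory lives; an array that has already decayed to a pointer is indistinguishable from new'd memory.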
In vc++ there is an assertion _CrtIsMemoryBlock (http://msdn.microsoft.com/en-us/library/ww5t02fa.aspx#BKMK_CRT_assertions) that can be used to check if a pointer was allocated from the heap. This will only work when a debug heap is being used but this is fine if you are just wanting to add some 'debug only' assertions. This method has worked well for me in the past under Windows.
For Linux however I know of no such equivalent.
Alternatively, you could use an inline assembler block to try to determine whether it is a stack address or not. This would be hardware dependent, as it would rely heavily not only on the processor type but also on the memory model being used (flat address model vs segmented etc). It's probably best to avoid this type of approach.

C++ - Regarding the scope of dynamic arrays

I have a quick question regarding the scope of dynamic arrays, which I assume is causing a bug in a program I'm writing. This snippet checks a function parameter and branches to either the first or the second, depending on what the user passes.
When I run the program, however, I get a scope related error:
error: ‘Array’ was not declared in this scope
Unless my knowledge of C++ fails me, I know that variables created within a conditional fall out of scope when the branch is finished. However, I dynamically allocated these arrays, so I cannot understand why I can't manipulate the arrays later in the program, since the pointer should remain.
//Prepare to store integers
if (flag == 1) {
int *Array;
Array = new int[input.length()];
}
//Prepare to store chars
else if (flag == 2) {
char *Array;
Array = new char[input.length()];
}
Can anyone shed some light on this?
Declare Array before the if. And you can't declare arrays of different types as one variable, so I think you should use two pointers.
char *char_array = nullptr;
int *int_array = nullptr;
//Prepare to store integers
if (flag == 1) {
int_array = new int[input.length()];
}
//Prepare to store chars
else if (flag == 2) {
char_array = new char[input.length()];
}
if (char_array)
{
//do something with char_array
}
else if (int_array)
{
//do something with int_array
}
Also, as j_random_hacker points out, you might want to change your program design to avoid lots of ifs.
While you are right that since you dynamically allocated them on the heap, the memory won't be released to the system until you explicitly delete it (or the program ends), the pointer to the memory falls out of scope when the block it was declared in exits. Therefore, your pointer(s) need to exist at a wider scope if they will be used after the block.
The memory remains allocated (i.e. taking up valuable space), there's just no way to access it after the closing }, because at that point the program loses the ability to address it. To avoid this, you need to assign the pointer returned by new[] to a pointer variable declared in an outer scope.
As a separate issue, it looks as though you're trying to allocate memory of one of 2 different types. If you want to do this portably, you're obliged to either use a void * to hold the pointer, or (less commonly done) a union type containing a pointer of each type. Either way, you will need to maintain state information that lets the program know which kind of allocation has been made. Usually, wanting to do this is an indication of poor design, because every single access will require switching on this state information.
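The "state information" point can be sketched with a small tagged structure. This swaps the raw void * for two std::vector members, which is a different technique from the one described above; TaggedBuffer and make_buffer are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Remembers which kind of storage was requested.
struct TaggedBuffer {
    enum Kind { None, Ints, Chars };
    Kind kind;
    std::vector<int> ints;     // used when kind == Ints
    std::vector<char> chars;   // used when kind == Chars
    TaggedBuffer() : kind(None) {}
};

// flag == 1 prepares int storage, flag == 2 prepares char storage.
TaggedBuffer make_buffer(int flag, std::size_t length) {
    TaggedBuffer buffer;
    if (flag == 1) {
        buffer.kind = TaggedBuffer::Ints;
        buffer.ints.assign(length, 0);
    } else if (flag == 2) {
        buffer.kind = TaggedBuffer::Chars;
        buffer.chars.assign(length, '\0');
    }
    return buffer;
}

// Example: every later access still has to switch on the tag.
std::size_t element_count(const TaggedBuffer& buffer) {
    switch (buffer.kind) {
        case TaggedBuffer::Ints:  return buffer.ints.size();
        case TaggedBuffer::Chars: return buffer.chars.size();
        default:                  return 0;
    }
}
```

The switch in element_count illustrates the cost mentioned above: the tag has to be consulted wherever the data is used.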
If I understand your intent correctly, what you are trying to do is: depending on some logic, allocate memory to store n elements of either int or char, and then later in your function access that array as either int or char without the need for a single if statement.
If the above understanding is correct, then the simple answer is: "C++ is a strongly typed language and what you want is not possible".
However... C++ is also an extremely powerful and flexible language, so here's what can be done:
Casting. Something like the following:
void * Array;
if(flag) Array = new int[len];
else Array = new char[len];
// ... later in the function
if(flag) { // access as int array
    int i = ((int*)Array)[0];
}
Yes, this is ugly and you'll have to have those ifs sprinkled around the function. So here's an alternative: templates
template<class T> T foo(size_t _len)
{
    T* Array = new T[_len](); // () value-initializes the elements
    T element = Array[0];
    delete[] Array;           // release the array before returning
    return element;
}
Yet another, even more obscure way of doing things, could be the use of unions:
union int_or_char {int i; char c;};
int_or_char *Array = new int_or_char[len];
if(flag) // access as int
int element = Array[0].i;
But one way or the other (or the third) there's no way around the fact that the compiler has to know how to deal with the data you are trying to work with.
Turix's answer is right. You need to keep in mind that two things are being allocated here: the memory for the array, and the memory where the location of the array is stored.
So even though the memory for the array is allocated from the heap and will be available to the code wherever required, the memory where the location of the array is stored (the Array variable) is allocated on the stack and will be lost as soon as it goes out of scope, which in this case is when the if block ends. You can't even use it in the else part of the same if.
A code suggestion different from Andrew's would be:
void *Array = nullptr;
if (flag == 1) {
Array = new int[input.length()];
} else if (flag == 2) {
Array = new char[input.length()];
}
Then you can directly use it as you intended.
This part I am not sure about: in case you want to know whether it's an int or a char array, you could try the typeid operator. It doesn't work, at least I can't get it to work (applied to a void * it only reports the static type).
Alternatively, you can use your flag variable to tell what type it is.

Size of an Array.... in C/C++?

Okay, so you have an array A[] that is passed to you in some function, say with the following function prototype:
void foo(int A[]);
Okay, as you know it's kind of hard to find the size of that array without knowing some sort of ending variable or knowing the size already...
Well, here is the deal though. I have seen some people figure it out on a challenge problem, and I don't understand how they did it. I wasn't able to see their source code of course, which is why I am here asking.
Does anyone know how it would even be remotely possible to find the size of that array? Maybe something like what the free() function does in C?
What do you think of this??
template<typename E, int size>
int ArrLength(E(&)[size]){return size;}
void main()
{
int arr[17];
int sizeofArray = ArrLength(arr);
}
The signature of that function is not that of a function taking an array, but rather a pointer to int. You cannot obtain the size of the array within the function, and will have to pass it as an extra argument to the function.
If you are allowed to change the signature of the function there are different alternatives:
C/C++ (simple):
void f( int *data, int size ); // function
f( array, sizeof array/sizeof array[0] ); // caller code
C++:
template <int N>
void f( int (&array)[N] ); // Inside f, size N embedded in type
f( array ); // caller code
C++ (through a dispatch):
void f( int *array, int size ); // Actual function, as per option 1; declared first so the dispatcher can find it
template <int N>
void f( int (&array)[N] ) { // Dispatcher
f( array, N );
}
f( array ); // Compiler processes the type as per 2
You cannot do that. Either you have a convention to signal the end of the array (e.g. that it is made of non-zero integers followed by a 0), or you transmit the size of the array (usually as an additional argument).
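The sentinel convention mentioned above can be sketched as follows, assuming the caller guarantees that the data is non-zero and that a 0 terminator is present; length_until_zero is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

// Counts elements up to (not including) the terminating 0.
// Precondition (assumed): the array really contains a 0 terminator.
std::size_t length_until_zero(const int* a) {
    std::size_t n = 0;
    while (a[n] != 0) {
        ++n;
    }
    return n;
}

// Example use with a terminated array.
std::size_t demo_sentinel() {
    int values[] = {4, 8, 15, 16, 23, 42, 0};
    return length_until_zero(values);   // six values before the 0
}
```

This is the same convention C strings use with '\0'; without the terminator the loop reads past the end of the array.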
If you use the Boehm garbage collector (which has a lot of benefits; in particular, you allocate with GC_malloc and friends but you don't care about freeing memory explicitly), you could use the GC_size function to give you the size of a GC_malloc-ed memory zone, but standard malloc doesn't have this feature.
You're asking what we think of the following code:
template<typename E, int size>
int ArrLength(E(&)[size]){return size;}
void main()
{
int arr[17];
int sizeofArray = ArrLength(arr);
}
Well, void main has never been standard, neither in C nor in C++.
It's int main.
Regarding the ArrLength function, a proper implementation does not work for local types in C++98. It does work for local types by C++11 rules. But in C++11 you can write just end(a) - begin(a).
The implementation you show is not proper: it should absolutely not have int template argument. Make that a ptrdiff_t. For example, in 64-bit Windows the type int is still 32-bit.
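Putting those two remarks together, a sketch of the length function for C++11 could look like this; array_length is a made-up name, and the sketch only accepts real arrays, not pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Length of a built-in array, computed as end(a) - begin(a).
template <typename T, std::size_t N>
std::ptrdiff_t array_length(T (&a)[N]) {
    return std::end(a) - std::begin(a);
}

// Example use in the scope where the array is declared.
std::ptrdiff_t demo_array_length() {
    int arr[17] = {};
    return array_length(arr);
}
```

Passing an int * to array_length does not compile, which is the useful part: the size question is answered only where the size is actually known.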
Finally, as general advice:
Use std::vector and std::array.
One relevant benefit of this approach is that it avoids throwing away the size information, i.e. it avoids creating the problem you're asking about. There are also many other advantages. So, try it.
The first element could be a count, or the last element could be a sentinel. That's about all I can think of that could work portably.
In new code, for container-agnostic code prefer passing two iterators (or pointers in C) as a much better solution than just passing a raw array. For container-specific code use the C++ containers like vector.
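As an illustration of the two-iterator style, the sketch below works unchanged for a raw array and for a std::vector; sum_range is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// Sums the half-open range [first, last); the range carries its own size.
template <typename Iterator>
long sum_range(Iterator first, Iterator last) {
    long total = 0;
    for (; first != last; ++first) {
        total += *first;
    }
    return total;
}

// Example use with a raw array and with a vector.
bool demo_sum_range() {
    int raw[] = {1, 2, 3, 4};
    std::vector<int> vec(3, 10);                 // {10, 10, 10}
    return sum_range(raw, raw + 4) == 10
        && sum_range(vec.begin(), vec.end()) == 30;
}
```

The caller decides where the range ends, so the callee never needs to guess the size of the underlying storage.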
No you can't. Your prototype is equivalent to
void foo(int * A);
there is obviously no size information. Also implementation dependent tricks can't help:
the array variable can be allocated on the stack or be static, so there is no information provided by malloc or friends
if allocated on the heap, a user of that function is not forced to call it with the first element of an allocation.
e.g the following are valid
int B[22];
foo(B);
int * A = new int[33];
foo(A + 25);
This is something that I would not suggest doing, however if you know the address of the beginning of the array and the address of the next variable/structure defined, you could subtract the address. Probably not a good idea though.
An array allocated at compile time probably has information on its size in the debug information of the executable. Moreover, one could search the code for all the addresses corresponding to compile-time allocated variables and assume that the size of the array is the difference between its starting address and the next closest starting address of any variable.
For a dynamically allocated variable it should be possible to get its size from the heap data structures.
It is hacky and system dependent, but it is still a possible solution.
One estimate is as follows: if you have for instance an array of ints but know that they are between (stupid example) 0..80000, the first array element that's either negative or larger than 80000 is potentially right past the end of the array.
This can sometimes work because the memory right past the end of the array (I'm assuming it was dynamically allocated) won't have been initialized by the program (and thus might contain garbage values), but might still be part of the allocated pages, depending on the size of the array. In other cases it will crash or fail to provide meaningful output.
All of the other answers are probably better, i.e. you either have to pass the length of the array or terminate it with a special byte sequence.
The following method is not portable, but it works for me in VS2005:
int getSizeOfArray( int* ptr )
{
int size = 0;
void* ptrToStruct = ptr;
long adr = (long)ptrToStruct;
adr = adr - 0x10;
void* ptrToSize = (void*)adr;
size = *(int*)ptrToSize;
size /= sizeof(int);
return size;
}
This is entirely dependent on the memory model of your compiler and system so, again, it is not portable. I bet there are equivalent methods for other platforms. I would never use this in a production environment; I am merely stating this as an alternative.
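You can use this: int n = sizeof(A) / sizeof(A[0]); but only in the scope where A is declared as an actual array. Inside foo, A is a pointer, so the expression yields sizeof(int*) / sizeof(int) rather than the element count.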

Global Arrays in C++

Why does the array not overflow (e.g. with an error alert) when it is declared globally? In other words, why am I able to fill it with an unlimited number of elements (through a for loop) even though its size is limited in the declaration, whereas I do get an alert when I declare the array locally inside main?
char name[9];
int main(){
int i;
for( int i=0; i<18; ++i){
cin>>name[i];
}
cout<<"Inside the array: ";
for(i=0; i<20; i++)
cout<<name[i];
return 0;
}
C++ does not check bounds errors for arrays of any kind. Reading or writing outside of the array bounds causes what is known as "undefined behaviour", which means anything could happen. In your case, it seems that what happens is that it appears to work, but the program will still be in an invalid state.
If you want bounds checking, use a std::vector and its at() member function:
vector <int> a( 3 ); // vector of 3 ints
int n = a.at( 0 ); // ok
n = a.at( 42 ); // throws an exception
C++ does not have array bounds checking, so the language never checks whether you have exceeded the end of your array, but as others have mentioned, bad things can be expected to happen.
Global variables exist in the static segment, which is totally separate from your stack. It also does not contain important information like return addresses. When you exceed an array's boundaries you are effectively corrupting memory. It just so happens that corrupting the stack is more likely to cause visible bad things than corrupting the data segment. All of this depends on the way your operating system organizes a process's memory.
It's undefined behavior. Anything can happen.
You cannot assume too much about the memory layout of variables. It may run on your computer with these parameters, but totally fail when you increase your access bounds, or run that code on another machine. So if you seriously want to write code, don't let this become a habit.
I'd go one step further and state that C/C++ do not have arrays. What they have is array-like syntactic sugar that is immediately translated to pointer arithmetic, which cannot be checked, as pointers can be used to access potentially all of memory. Any checking that the compiler may manage to perform based on static sizes and constant bounds on an index is a happy accident, but you cannot rely on it.
Here's an oddity that stunned me when I first saw it:
int a[10], i;
i = 5;
a[i] = 42; // Looks normal.
5[a] = 37; // But what's this???
std::cout << "Array element = " << a[i] << std::endl;
But the odd-looking line is perfectly legal C++. This example emphasizes that arrays in C/C++ are a fiction.
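The reason both spellings work is that the built-in subscript a[i] is defined as *(a + i), and addition is commutative. A small sketch that checks this (the function name is made up):

```cpp
#include <cassert>

// Verifies that a[i], i[a] and *(a + i) name the same element.
bool subscript_is_pointer_arithmetic() {
    int a[10] = {};
    int i = 5;
    a[i] = 42;
    bool same_address = (&a[i] == a + i) && (&i[a] == a + i);
    bool same_value = (*(a + i) == 42) && (i[a] == 42);
    return same_address && same_value;
}
```

This holds only for built-in arrays and pointers; a class with an overloaded operator[], such as std::vector, does not accept the reversed form.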
Neil Butterworth already commented on the benefits of using std::vector and the at() access method for it, and I cannot second his recommendation strongly enough. (Unfortunately, the designers of STL blew a golden opportunity to make checked access the [] operators, with the at() methods the unchecked operators. This has probably cost the C++ programming community millions of hours and millions of dollars, and will continue to do so.)