Pascal and Delphi Arrays to C/C++ Arrays - c++

In Pascal and Delphi, arrays have their lengths stored at some offset in memory from the array's pointer. I found that the following code works for me and gets the length of an array:
type PInt = ^Integer; //pointer to integer.
Function Length(Arr: PInt): Integer;
var
Ptr: PInt;
Begin
Ptr := Arr - sizeof(Integer);
Result := Ptr^ + 1;
End;
Function High(Arr: PInt): Integer; //equivalent to length - 1.
Begin
Result := (Arr - sizeof(Integer))^;
End;
I translated the above code into C++ and it thus becomes:
int Length(int* Arr)
{
int* Ptr = (Arr - sizeof(int));
return *reinterpret_cast<char*>(Ptr) + 1;
}
int High(int* Arr)
{
return *(Arr - sizeof(int));
}
Now assuming the above are equivalent to the Pascal/Delphi versions, how can I write a struct to represent a Pascal Array?
In other words, how can I write a struct such that the following is true:
Length(SomeStructPointer) = SomeStructPointer->size
I tried the following:
typedef struct
{
unsigned size;
int* IntArray;
} PSArray;
int main()
{
PSArray ps;
ps.IntArray = new int[100];
ps.size = 100;
std::cout<<Length((int*) &ps); //should print 100 or the size member but it doesn't.
delete[] ps.IntArray;
}

In Pascal and Delphi, arrays have their lengths stored at
some offset in memory from the array's pointer.
This is not so. The entire premise of your question is wrong. The Delphi functions you present do not work in general. They might work for dynamic arrays. But it is certainly not the case that you can pass a pointer to an array and be sure that the length is stored before it.
And in fact the Delphi code in the question does not even work for dynamic arrays. Your pointer arithmetic is all wrong. You read a value 16 bytes to the left rather than 4 bytes. And you fail to check for nil. So it's all a bit of a disaster really.
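The 16-byte figure comes from typed pointer arithmetic, which counts elements rather than bytes. A minimal C++ sketch of the same effect follows; the helper name bytes_moved_back is hypothetical, and it only measures an offset inside a buffer it owns, it does not read any hidden length:

```cpp
#include <cstddef>

// Returns how many bytes `p - sizeof(int)` moves back from p.
// Pointer arithmetic on int* counts ints, so the step is
// sizeof(int) elements, i.e. sizeof(int) * sizeof(int) bytes.
std::ptrdiff_t bytes_moved_back()
{
    int buffer[32] = {};
    int* p = buffer + 16;          // a pointer into the middle of the buffer
    int* q = p - sizeof(int);      // moves back sizeof(int) ELEMENTS, not bytes
    return reinterpret_cast<char*>(p) - reinterpret_cast<char*>(q);
}
```

With a 4-byte int the result is 16, which matches the offset described above.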
Moving on to your C++ code, you are reaping the result of this false premise. You've allocated an array. There's no reason to believe that the int to the left of the array holds the length. Your C++ code is also very broken. But there's little point attempting to fix it because it can never be fixed. The functions you define cannot be implemented. It is simply not the case that an array is stored adjacent to a variable containing the length.
What you are looking for in your C++ code is std::vector. That offers first class support for obtaining the length of the container. Do not re-invent the wheel.
If interop is your goal, then you need to use valid interop types. And Delphi managed dynamic arrays do not qualify. Use a pointer to an array, and a separately passed length.
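As a rough sketch of the "pointer plus separately passed length" shape, assuming a C-linkage boundary is acceptable on both sides (the function name sum_ints is made up for illustration):

```cpp
#include <cstddef>

// Hypothetical interop entry point: a plain pointer plus an explicit length,
// with C linkage so that non-C++ callers can bind to it.
extern "C" long long sum_ints(const int* data, std::size_t length)
{
    long long total = 0;
    for (std::size_t i = 0; i < length; ++i)
        total += data[i];          // never reads outside [data, data + length)
    return total;
}
```

The caller owns the array and states its length explicitly, so nothing depends on how either language lays out its managed arrays.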

Why? I can see no good reason to do this. Use idiomatic Pascal in Pascal, use idiomatic C++ in C++. Using sizeof like that also ignores padding, and so your results may vary from platform to platform.
If you want a size, store it in the struct. If you want a non-member length function, just write one that works with the way you wrote the struct. Personally, I suggest using std::array if the size won't change and std::vector if it will. If you absolutely need a non-member length function, try this:
template<typename T>
auto length(const T& t) -> decltype(t.size()) {
return t.size();
}
That will work with both std::array and std::vector.
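A small sketch of that claim, repeating the template so the example is self-contained (lengths_match is a hypothetical helper used only for demonstration):

```cpp
#include <array>
#include <vector>

template<typename T>
auto length(const T& t) -> decltype(t.size()) {
    return t.size();
}

// Hypothetical helper: calls the same non-member function on both containers.
bool lengths_match() {
    std::array<int, 3> fixed = {{1, 2, 3}};   // size is part of the type
    std::vector<int> growable(3, 0);          // size is stored at run time
    return length(fixed) == 3 && length(growable) == 3;
}
```

Because the return type is taken from t.size(), the function simply does not take part in overload resolution for types without a size() member.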
PS: If you're doing this for "performance reasons", please profile your code and prove that there is a bottleneck before doing something that will become a maintenance hazard.

Related

Defining Array C/C++

What is the difference between this two array definitions and which one is more correct and why?
#include <stdio.h>
#define SIZE 20
int main() {
// definition method 1:
int a[SIZE];
// end definition method 1.
// definition method 2:
int n;
scanf("%d", &n);
int b[n];
// end definition method 2.
return 0;
}
I know that if we read the size, variable n, from stdin, it's more correct to define our array (the block of memory we'll be using) as a pointer and use stdlib.h and array = malloc(n * sizeof(int)), rather than declaring it as int array[n], but again, why?
It's not "more correct" or "less correct". It either is xor isn't correct. In particular, this works in C, but not in C++.
You are declaring dynamic arrays. A better way to declare a dynamic array is:
int *arr; // int * type is just for simplicity
arr = malloc(n * sizeof(int));
This is because variable length arrays are only allowed from C99 on; you can't use them in C89/C90.
In (pre-C99) C and C++, all types are statically sized. This means that arrays must be declared with a size that is both constant and known to the compiler.
Now, many C++ compilers offer dynamically sized arrays as a nonstandard extension, and C99 explicitly permits them. So int b[n] will most likely work if you try it. But in some cases, it will not, and the compiler is not wrong in those cases.
If you know SIZE at compile-time:
int ar[SIZE];
If you don't:
std::vector<int> ar;
I don't want to see malloc anywhere in your C++ code. However, you are fundamentally correct and for C that's just what you'd do:
int* ptr = malloc(sizeof(int) * SIZE);
/* ... */
free(ptr);
In C++, variable-length arrays are a GCC extension that allows you to write:
int ar[n];
but I've had issues where VLAs were disabled but GCC didn't successfully detect that I was trying to use them. Chaos ensues. Just avoid it.
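For the run-time-sized case in C++, a minimal sketch of what the std::vector route might look like (make_squares is a hypothetical name chosen for the example):

```cpp
#include <cstddef>
#include <vector>

// Builds a buffer whose size is only known at run time (hypothetical helper).
std::vector<int> make_squares(std::size_t n)
{
    std::vector<int> squares(n);           // heap-backed, size chosen at run time
    for (std::size_t i = 0; i < n; ++i)
        squares[i] = static_cast<int>(i * i);
    return squares;                         // storage is released automatically
}
```

This needs neither a VLA nor malloc, and the size travels with the object via size().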
Q1: The first definition is a static array declaration. Perfectly correct.
It applies when the size is known, so there is no comparison with VLA or malloc().
Q2: Which is better when taking the size as input from the user: VLA or malloc?
VLA: They are limited by the environment's bounds on the size of automatic allocation. Automatic variables are usually allocated on the stack, which is relatively small. The limitation is platform specific. Also, VLAs are available in C99 and above only. They do make declaring multidimensional arrays somewhat easier.
Malloc: Allocates from the heap, so it is definitely better for large sizes. For multidimensional arrays pointers are involved, so the implementation is a bit more complex.
Check http://bytes.com/topic/c/answers/578354-vla-feature-c99-vs-malloc
I think that method 1 could be a little bit faster, but both of them are correct in C.
In C++ the first will work, but if you want to use the second you should write:
int size = 5;
int * array = new int[size];
and remember to delete it:
delete [] array;
I think it gives you more option to use while coding.
If you use malloc or another dynamic allocation you get a pointer, which you can access as *(p + n) or, equivalently, p[n]; with an array you write array[n]. Also, a pointer obtained this way must be freed, whereas an array does not need to be freed.
And in C++ we can define a class to do such things; the STL has std::vector, which does what arrays do and much more.
Both are correct. The declaration you use depends on your code.
The first declaration, i.e. int a[SIZE];, creates an array with a fixed size of 20 elements.
It is helpful when you know the exact size of the array that will be used in the code, for example when you are generating the table of a number n up to its 20th multiple.
The second declaration allows you to make an array of the size that you desire.
It is helpful when you will need an array of different sizes, each time the code is executed for example, you want to generate the fibonacci series till n. In that case, the size of the array must be n for each value of n. So say you have n = 5, in this case int a [20] will waste memory because only the first five slots will be used for the fibonacci series and the rest will be empty. Similarly if n = 25 then your array int a[20] will become too small.
The difference when you define an array using malloc is that you can pass the size of the array dynamically, i.e. at run time, for example a value your program reads while it is running.
One more difference is that arrays created using malloc are allocated on the heap, so they persist across function calls until they are freed, unlike local (automatic) arrays.
example-
#include<stdio.h>
#include<stdlib.h>
int main()
{
int n;
int *a;
scanf("%d",&n);
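a = (int *)malloc(n * sizeof(int));
if (a == NULL) return 1; /* allocation failed */
/* ... use a[0] .. a[n-1] ... */
free(a); /* heap memory must be released explicitly */
return 0;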
}

Is it valid to access a multi dimensional C++ array as one contiguous block (on heap) [duplicate]

Related thread here: Does C99 guarantee that arrays are contiguous?
Apparently answer() isn't valid below, but could be re-written to use char * or cast to int[nElements] (possibly). I'll admit I don't understand the standard references and why a contiguous block of int couldn't be accessed via int* if properly aligned.
First is the following code block valid on most C++ platforms?
void answer(int *pData, size_t nElements)
{
for( size_t i=0; i<nElements; ++i )
pData[i] = 42;
}
void random_code()
{
int arr1[1][2][3][4]; // local allocation
answer(arr1, sizeof(arr1) / sizeof(int));
int arr2[20][15];
answer(arr2, sizeof(arr2) / sizeof(int));
}
Second does answer() remain valid for all allocation types (global, local, heap(hopefully correct!))?
int g_arr[20][15]; // global
void foo() {
int (*pData)[10] = new int[50][10]; // heap allocation, at least partially
answer(&pData[0][0], 50*10);
// not even sure if delete[] will free pData correctly, but...
}
Yes, most platforms will indeed pack the elements of an N-dimensional array in such a way that linear addressing on a pointer to the first element will find all of the elements.
It is actually hard (as in, I cannot figure it out) to come up with a standards compliant implementation that does not do this, as an array of arrays must pack said arrays, and the size of the array of arrays is the size of each sub array times the number of sub arrays. There does not seem to be room for it not to work. Even the ordering of each element seems to be well defined.
Despite this, no clause in the standard I am aware of lets you actually reinterpret the pointer to the first element of a multi dimensional array as a pointer to an array of the product. Many clauses talk about how you can only access the elements of the array, or one-past-the-end.
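One way to sidestep the standards question entirely is to allocate a single flat block and compute the index by hand. The sketch below assumes row-major order; the Grid type is hypothetical, and answer() mirrors the function from the question:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D grid stored as one contiguous block of ints.
struct Grid {
    std::size_t rows, cols;
    std::vector<int> cells;

    Grid(std::size_t r, std::size_t c) : rows(r), cols(c), cells(r * c, 0) {}

    // Row-major index: element (i, j) lives at i * cols + j.
    int& at(std::size_t i, std::size_t j) { return cells[i * cols + j]; }
};

// Same shape as answer() in the question: a linear fill through int*.
void answer(int* pData, std::size_t nElements)
{
    for (std::size_t i = 0; i < nElements; ++i)
        pData[i] = 42;
}
```

Here the int* really does point into a one-dimensional array of rows * cols ints, so linear access stays within what the language clearly allows.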
The code in answer() is fine. The code in random_code() is misusing answer() (or not calling the overload of answer() shown in the question). It should be:
void random_code()
{
int arr1[1][2][3][4];
answer(&arr1[0][0][0][0], sizeof(arr1) / sizeof(int));
int arr2[20][15];
answer(&arr2[0][0], sizeof(arr2) / sizeof(int));
}
The code in answer() expects an int *; you were passing an int (*)[2][3][4] and an int (*)[15], neither of which looks like int *.
This remains valid for other allocation mechanisms that allocate a single contiguous block of data, such as the ones shown.
As the previous person said, there's a type error in your code. You're trying to pass an actual argument of type int (*)[X] for a formal argument of type int *. So to make your code work, you should use a type cast.
C++/C uses the same memory layout for data types not depending on what section of memory is used for allocating an object so that the same code can be used for values where they are. So the answer to your second question, is if your code is working on stack-allocated values, it is going to work with a heap-allocated value too.

Memset Definition and use

What's the usefulness of the function memset()?.
Definition: Sets the first num bytes of the block of memory pointed by ptr to the
specified value (interpreted as an unsigned char).
Does this mean it hard codes a value in a memory address?
memset(&serv_addr,0,sizeof(serv_addr)) is the example that I'm trying to understand.
Can someone please explain in a VERY simplified way?
memset() is a very fast version of a relatively simple operation:
void* memset(void* b, int c, size_t len) {
char* p = (char*)b;
for (size_t i = 0; i != len; ++i) {
p[i] = c;
}
return b;
}
That is, memset(b, c, len) sets the len bytes starting at address b to the value c. It just does it much faster than the above implementation.
memset() is usually used to initialise values. For example consider the following struct:
struct Size {
int width;
int height;
};
If you create one of these on the stack like so:
struct Size someSize;
Then the values in that struct are going to be undefined. They might be zero, they might be whatever values happened to be there from when that portion of the stack was last used. So usually you would follow that line with:
memset(&someSize, 0, sizeof(someSize));
Of course it can be used for other scenarios, this is just one of them. Just think of it as a way to simply set a portion of memory to a certain value.
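A short sketch of the struct example, next to the C++ value-initialization that gives the same zeroed result for a plain struct like this one (both function names are invented for illustration):

```cpp
#include <cstring>

struct Size {
    int width;
    int height;
};

// Zeroes a Size with memset, as described above.
Size zeroed_with_memset()
{
    Size s;                              // members are indeterminate here
    std::memset(&s, 0, sizeof(s));       // every byte of s becomes 0
    return s;
}

// C++ alternative: value-initialization zeroes the members as well.
Size zeroed_with_init()
{
    Size s = Size();
    return s;
}
```

For a struct of plain ints both forms leave width and height at 0.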
memset is a common way to set a memory region to 0 regardless of the data type. One can say that memset doesn't care about the data type and just sets all bytes to zero.
IMHO in C++ one should avoid doing memset when possible since it circumvents the type safety that C++ provides, instead one should use constructor or initialization as means of initializing. memset done on a class instance may also destroy something unintentionally:
e.g.
class A
{
public:
shared_ptr<char*> _p;
};
a memset on an instance of the above would not do a reference counter decrement properly.
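One possible guard against that mistake, sketched under the assumption that C++11 type traits are available: wrap memset in a template that refuses to compile for types that are not trivially copyable. The names zero_fill and Plain are hypothetical.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical guard: rejects types such as classes holding a shared_ptr,
// whose bookkeeping would be silently overwritten by memset.
template <typename T>
void zero_fill(T& object)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memset is only safe on trivially copyable types");
    std::memset(&object, 0, sizeof(T));
}

// A plain aggregate that passes the check.
struct Plain {
    int a;
    double b;
};
```

Calling zero_fill on a class like A above would fail at compile time instead of corrupting the reference count at run time.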
I guess that serv_addr is some local or global variable of some struct type -perhaps struct sockaddr- (or maybe a class).
&serv_addr is taking the address of that variable. It is a valid address, given as first argument to memset. The second argument to memset is the byte to be used for filling (zero byte). The last argument to memset is the size, in bytes, of that memory zone to fill, which is the size of that serv_addr variable in your example.
So this call to memset clears a global or local variable serv_addr containing some struct.
In practice, the GCC compiler, when it is optimizing, will generate clever code for that, usually unrolling and inlining it (actually, it is often a builtin, so GCC can generate very clever code for it).
It is nothing but setting the memory to particular value.
Here is example code.
Memset(uint8_t *p, uint8_t V, uint8_t L): here p is the pointer to the target memory, V is the value that each byte of the target buffer will be set to, and L is the length of the data.
while (L-- > 0)
{
*p++ = V;
}
memset- set bytes in memory
Synopsis-
#include<string.h>
void *memset(void *s, int c, size_t n);
Description- The memset() function shall copy c (converted to an unsigned char) into each of the first n bytes of the object pointed to by s.
The memset() function shall return s.
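Because the fill value is converted to an unsigned char and written byte by byte, filling an int array with 1 does not produce ints equal to 1. A small sketch of that consequence (both helper names are hypothetical, and it assumes int is wider than one byte):

```cpp
#include <cstddef>
#include <cstring>

// Fills an int array via memset and checks that every BYTE equals `byte`.
bool every_byte_is(unsigned char byte)
{
    int values[4];
    std::memset(values, byte, sizeof(values));
    const unsigned char* raw = reinterpret_cast<const unsigned char*>(values);
    for (std::size_t i = 0; i < sizeof(values); ++i)
        if (raw[i] != byte)
            return false;
    return true;
}

// Returns the first int after memset(values, 1, ...); each of its bytes is 1,
// so the int itself is not 1 when int is wider than one byte.
int first_int_after_memset_one()
{
    int values[4];
    std::memset(values, 1, sizeof(values));
    return values[0];
}
```

This is why memset is mostly used with 0 for anything wider than a byte.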

Size of an Array.... in C/C++?

Okay, so you have an array A[] that is passed to you in some function, say with the following function prototype:
void foo(int A[]);
Okay, as you know it's kind of hard to find the size of that array without knowing some sort of ending variable or knowing the size already...
Well, here is the deal though. I have seen some people figure it out on a challenge problem, and I don't understand how they did it. I wasn't able to see their source code of course, which is why I am here asking.
Does anyone know how it would even be remotely possible to find the size of that array?? Maybe something like what the free() function does in C??
What do you think of this??
template<typename E, int size>
int ArrLength(E(&)[size]){return size;}
void main()
{
int arr[17];
int sizeofArray = ArrLength(arr);
}
The signature of that function is not that of a function taking an array, but rather a pointer to int. You cannot obtain the size of the array within the function, and will have to pass it as an extra argument to the function.
If you are allowed to change the signature of the function there are different alternatives:
C/C++ (simple):
void f( int *data, int size ); // function
f( array, sizeof array/sizeof array[0] ); // caller code
C++:
template <int N>
void f( int (&array)[N] ); // Inside f, size N embedded in type
f( array ); // caller code
C++ (through a dispatch):
template <int N>
void f( int (&array)[N] ) { // Dispatcher
f( array, N );
}
void f( int *array, int size ); // Actual function, as per option 1
f( array ); // Compiler processes the type as per 2
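A compilable sketch of the dispatch option, with the worker declared before the template that forwards to it. The name sum and the use of std::size_t for the size are choices made for this example, not part of the answer above:

```cpp
#include <cstddef>

// Actual worker: pointer plus explicit size.
int sum(const int* data, std::size_t size)
{
    int total = 0;
    for (std::size_t i = 0; i < size; ++i)
        total += data[i];
    return total;
}

// Dispatcher: N is deduced from the array type, then forwarded to the worker.
template <std::size_t N>
int sum(const int (&array)[N])
{
    return sum(array, N);
}
```

Callers holding a real array write sum(arr); callers holding only a pointer must still supply the size themselves.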
You cannot do that. Either you have a convention to signal the end of the array (e.g. that it is made of non-zero integers followed by a 0), or you transmit the size of the array (usually as an additional argument).
If you use the Boehm garbage collector (which has a lot of benefits, in particular you allocate with GC_malloc and friends but you don't have to care about freeing memory explicitly), you could use the GC_size function to give you the size of a GC_malloc-ed memory zone, but standard malloc doesn't have this feature.
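The end-of-array convention mentioned above (non-zero integers followed by a 0) could be sketched like this; sentinel_length is a hypothetical name, and the function is only correct if the caller really did append the terminating 0:

```cpp
#include <cstddef>

// Counts elements up to (not including) the terminating 0.
// If the sentinel is missing, this reads past the end of the array.
std::size_t sentinel_length(const int* data)
{
    std::size_t n = 0;
    while (data[n] != 0)
        ++n;
    return n;
}
```

This is the same idea as strlen for C strings, with the same cost: 0 can no longer be stored as ordinary data.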
You're asking what we think of the following code:
template<typename E, int size>
int ArrLength(E(&)[size]){return size;}
void main()
{
int arr[17];
int sizeofArray = ArrLength(arr);
}
Well, void main has never been standard, neither in C nor in C++.
It's int main.
Regarding the ArrLength function, a proper implementation does not work for local types in C++98. It does work for local types by C++11 rules. But in C++11 you can write just end(a) - begin(a).
The implementation you show is not proper: it should absolutely not have int template argument. Make that a ptrdiff_t. For example, in 64-bit Windows the type int is still 32-bit.
Finally, as general advice:
Use std::vector and std::array.
One relevant benefit of this approach is that it avoids throwing away the size information, i.e. it avoids creating the problem you're asking about. There are also many other advantages. So, try it.
The first element could be a count, or the last element could be a sentinel. That's about all I can think of that could work portably.
In new code, for container-agnostic code prefer passing two iterators (or pointers in C) as a much better solution than just passing a raw array. For container-specific code use the C++ containers like vector.
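A minimal sketch of the two-iterator style, assuming the elements are convertible to long long (sum_range is a hypothetical name):

```cpp
// Container-agnostic: works with raw pointers and with container iterators,
// because the range [first, last) carries its own length.
template <typename Iterator>
long long sum_range(Iterator first, Iterator last)
{
    long long total = 0;
    for (; first != last; ++first)
        total += *first;
    return total;
}
```

For a raw array the call would be sum_range(a, a + n); for a std::vector v it would be sum_range(v.begin(), v.end()).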
No you can't. Your prototype is equivalent to
void foo(int * A);
there is obviously no size information. Also implementation dependent tricks can't help:
the array variable can be allocated on the stack or be static, so there is no information provided by malloc or friends
if allocated on the heap, a user of that function is not forced to call it with the first element of an allocation.
e.g the following are valid
int B[22];
foo(B);
int * A = new int[33];
foo(A + 25);
This is something that I would not suggest doing, however if you know the address of the beginning of the array and the address of the next variable/structure defined, you could subtract the address. Probably not a good idea though.
An array allocated at compile time probably has information on its size in the debug information of the executable. Moreover, one could search the code for all the addresses corresponding to compile-time allocated variables and assume the size of the array is the difference between its starting address and the next closest starting address of any variable.
For a dynamically allocated variable it should be possible to get its size from the heap data structures.
It is hacky and system dependent, but it is still a possible solution.
One estimate is as follows: if you have for instance an array of ints but know that they are between (stupid example) 0..80000, the first array element that's either negative or larger than 80000 is potentially right past the end of the array.
This can sometimes work because the memory right past the end of the array (I'm assuming it was dynamically allocated) won't have been initialized by the program (and thus might contain garbage values), but might still be part of the allocated pages, depending on the size of the array. In other cases it will crash or fail to provide meaningful output.
All of the other answers are probably better, i.e. you either have to pass the length of the array or terminate it with a special byte sequence.
The following method is not portable, but it works for me in VS2005:
int getSizeOfArray( int* ptr )
{
int size = 0;
void* ptrToStruct = ptr;
long adr = (long)ptrToStruct;
adr = adr - 0x10;
void* ptrToSize = (void*)adr;
size = *(int*)ptrToSize;
size /= sizeof(int);
return size;
}
This is entirely dependent of the memory model of your compiler and system so, again, it is not portable. I bet there are equivalent methods for other platforms. I would never use this in a production environment, merely stating this as an alternative.
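You can use this: int n = sizeof(A) / sizeof(A[0]); Note that this only works where A is declared as an actual array. Inside foo(int A[]), A is a pointer, so the expression yields sizeof(int*) / sizeof(int) rather than the element count.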

use array in structure c++

I have a struct like this:
struct process {int PID;int myMemory[];};
however, when I try to use it
process p;
int memory[2];
p.myMemory = memory;
I get a cryptic error from Eclipse saying int[0] is not compatible with int[2].
What am I doing wrong?
Thanks!
Don't use static arrays, malloc, or even new if you're using C++. Use std::vector which will ensure correct memory management.
#include <vector>
struct Process {
int pid;
std::vector<int> myMemory;
};
Process p;
p.myMemory.reserve(2); // allocates enough space on the heap to store 2 ints
p.myMemory.push_back( 4815 ); // add an index-zero element of 4815
p.myMemory.push_back( 162342 ); // add an index-one element of 162342
I might also suggest creating a constructor so that pid does not initially have an undefined value:
struct Process {
Process() : pid(-1), myMemory() {
}
int pid;
std::vector<int> myMemory;
};
I think you should declare myMemory as an int* and then malloc() it once you know its size. After that it can be used like a normal array. int[0] seems to mean "array with no dimension specified".
EXAMPLE:
int *a; // suppose you'd like to have an array with user specified length
// get dimension (int d)
a = (int *) malloc(d * sizeof(int));
// now you can forget a is a pointer:
a[0] = 5;
a[2] = 1;
free((void *) a); // don't forget this!
All these answers about vector or whatever are confused :) using a dynamically allocated pointer opens up a memory management problem, using vector opens up a performance problem as well as making the data type a non-POD and also preventing memcpy() working.
The right answer is to use
Array<int,2>
where Array is a template the C++ committee didn't bother to put in C++98 but which is in C++0x (as std::array). This is an inline (no memory management or performance issues) first class array which is a wrapper around a C array. Boost already provides one as boost::array.
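A sketch of that idea using std::array, assuming a C++11 compiler; the Process layout and the make_process helper are illustrative only:

```cpp
#include <array>
#include <type_traits>

struct Process {
    int pid;
    std::array<int, 2> memory;   // fixed size, stored inline in the struct
};

// Unlike a struct holding a std::vector, this one stays trivially copyable.
static_assert(std::is_trivially_copyable<Process>::value,
              "Process can be copied with memcpy");

// Hypothetical helper: whole-array assignment works, unlike with C arrays.
Process make_process(int pid, const std::array<int, 2>& memory)
{
    Process p;
    p.pid = pid;
    p.memory = memory;
    return p;
}
```

The size is fixed at compile time, so this fits only when two elements is always the right number.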
In C++, array definition is almost equal to pointer constants, meaning that their address cannot be changed, while the values which they point to can be changed. That said, you cannot copy elements of an array into another by the assignment operator. You have to go through the arrays and copy the elements one by one and check for the boundary conditions yourself.
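If the member stays a C array with a fixed size, the element-by-element copy can be left to std::copy. The sketch below assumes the member is declared as int memory[2] and uses a hypothetical set_memory helper; taking the source by reference to an array of two ints lets the compiler check the bounds:

```cpp
#include <algorithm>
#include <iterator>

struct Process {
    int pid;
    int memory[2];   // fixed-size member array
};

// Copies a source array of exactly two ints into the struct's member array.
void set_memory(Process& p, const int (&source)[2])
{
    std::copy(std::begin(source), std::end(source), p.memory);
}
```

Passing an array of any other size to set_memory is a compile-time error rather than a run-time overrun.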
The syntax ...
struct process {int PID;int myMemory[];};
... is not valid C++, but it may be accepted by some compilers as a language extension. In particular, as I recall g++ accepts it. It's in support for the C "struct hack", which is unnecessary in C++.
In C++, if you want a variable length array in a struct, use std::vector or some other array-like class, like
#include <vector>
struct Process
{
int pid;
std::vector<int> memory;
};
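As a sketch of how the original assignment from an existing array might look with this struct (make_process and its parameters are hypothetical), std::vector::assign copies the elements in one call:

```cpp
#include <cstddef>
#include <vector>

struct Process
{
    int pid;
    std::vector<int> memory;
};

// Hypothetical helper: copies `count` ints from an existing C array.
Process make_process(int pid, const int* source, std::size_t count)
{
    Process p;
    p.pid = pid;
    p.memory.assign(source, source + count);   // vector now owns its own copy
    return p;
}
```

The vector keeps its own copy, so the original array can go out of scope without affecting the struct.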
By the way, it's a good idea to reserve use of UPPERCASE IDENTIFIERS for macros, so as to reduce the probability of name collisions with macros, and not make people reading the code deaf (it's shouting).
Cheers & hth.,
You cannot make an array (defined using []) point to another array, because the array identifier behaves like a const pointer. You can change the values it refers to, but you cannot change where it points. Think of "int array[]" as "int* const array".
The only time you can do that is during initialization.
// OK
int array[] = {1, 2, 3};
// NOT OK
int array[];
array = [1, 2, 3]; // this is no good.
As a function parameter, int x[] is understood as int * x.
Inside a struct it is not, so if you want a vector of integers with an undetermined number of positions, change your declaration to:
struct process {int PID;int * myMemory;};
You should change your initialization to:
int memory[2];
p.myMemory = new int[ 10 ];