Why is the function's return value not an ordinary int[]? - D

I have just started using D, and it seems ideal for people who want a safer C. It also supports modern paradigms such as functional programming.
I am trying the following code to convert a list of numbers stored as strings into a list of integers:
import std.stdio;
import std.array;
import std.conv;
int[] list_str2int(char[][] slist){ // this function should convert a list of numbers as char[] to integers
int[100] intcol;
int N = cast(int)slist.length;
foreach(int i; 0..N){
char[] temp = slist[i];
temp = split(temp, ".")[0]; // effectively this is floor; to!int does not work if '.' is there;
intcol[i] = to!int(temp); // not working;
}
return intcol; // error from this statement;
}
void main(){
char[][] strlist = cast(char[][])["1.1","2.1","3.2","4.4"];
int[] newintlist = list_str2int(strlist);
writeln("Converted list: ", newintlist);
}
But I get the following error:
testing.d(13): Error: returning cast(int[])intcol escapes a reference to local variable intcol
Failed: ["/usr/bin/dmd", "-v", "-o-", "testing.d", "-I."]
I cannot understand why there is an error on the return line of the first function, where the variable is int[].
Where is the problem and how can it be solved? Thanks for your help.

int[100] is a static array (a value type), and is located on list_str2int's stack frame; therefore it will cease to exist once the function returns. The function's return value, int[], is a slice (a reference type), which does not hold any data, but refers to a contiguous number of integers somewhere in memory. The statement return intcol; thus takes a slice of the static array. Returning that slice is invalid because it would point to memory that is no longer valid after the function returns.
You have a few options:
Declare the return type as int[100] also. Making it a value type, the integers will be copied to the caller's stack frame.
Allocate the array in the program's heap by declaring and initializing the array as auto intcol = new int[100];. This will make intcol a slice of memory in the heap. Memory in the heap is owned by the garbage collector, and has effectively infinite lifetime.
An option that's further from the above but more idiomatic to modern D is to use ranges. Your program could be rewritten to a single statement as follows:
import std.algorithm.iteration;
import std.stdio;
import std.array;
import std.conv;
void main()
{
["1.1", "2.1", "3.2", "4.4"]
.map!(item => item
.split(".")[0]
.to!int
)
.array // optional, writeln and co can write ranges
.writefln!"Converted list: %s";
}

int[] is a slice; it can refer to dynamically allocated memory.
int[100] is a static array of 100 elements. It is allocated on the stack.
Just like in C, you can't return local memory from a function: you can't return intcol, because the memory behind it becomes invalid after the function returns.
It is unclear to me whether you want to use dynamic or static arrays. If you want dynamic arrays, then stick to them:
import std.stdio;
import std.array;
import std.conv;
int[] list_str2int(char[][] slist){ // this function should convert a list of numbers as char[] to integers
int N = cast(int)slist.length;
int[] intcol = new int[N];
foreach(int i; 0..N){
char[] temp = slist[i];
temp = split(temp, ".")[0]; // effectively this is floor; to!int does not work if '.' is there;
intcol[i] = to!int(temp); // temp now holds only the digits before the '.'
}
return intcol; // fine: intcol refers to memory allocated with new, not to the stack
}
void main(){
char[][] strlist = cast(char[][])["1.1","2.1","3.2","4.4"];
int[] newintlist = list_str2int(strlist);
writeln("Converted list: ", newintlist);
}
will output:
Converted list: [1, 2, 3, 4]
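For readers who know C++ better than D, the same fix can be sketched there: return a container that owns heap storage instead of a view of a local array. This is only an analogy, not D code; the function name is reused from the question, and the std::vector<std::string> parameter type is an assumption made for the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical C++ counterpart of list_str2int: keeps the digits before the
// first '.' and converts them to int.
std::vector<int> list_str2int(const std::vector<std::string>& slist)
{
    std::vector<int> result;          // owns heap storage, so returning it is safe
    result.reserve(slist.size());
    for (const std::string& s : slist) {
        // find returns npos when there is no '.', and substr then keeps the whole string
        const std::string whole = s.substr(0, s.find('.'));
        assert(!whole.empty());       // the sketch assumes at least one digit
        result.push_back(std::stoi(whole));
    }
    return result;                    // returned by value; no local memory escapes
}
```

Calling it with {"1.1", "2.1", "3.2", "4.4"} should yield the same four integers as the D version.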

Related

Helper function to construct 2D arrays

Am I breaking C++ coding conventions writing a helper function which allocates a 2D array outside main()? Because my application calls for many N-dimensional arrays I want to ensure the same process is followed. A prototype which demonstrates what I am doing :
#include <cstdio> // for printf
#include <iostream>
// my helper function which allocates the memory for a 2D int array, then returns its pointer.
// the final version will be templated so I can return arrays of any primitive type.
int** make2DArray(int dim1, int dim2)
{
int** out = new int* [dim1];
for (int i = 0; i < dim1; i++) { out[i] = new int[dim2];} // one row of dim2 ints per outer index
return out;
}
//helper function to deallocate the 2D array.
void destroy2DArray(int** name, int dim1, int dim2)
{
for (int i = 0; i < dim1; i++) { delete[] name[i]; } // free each of the dim1 rows
delete[] name;
return;
}
int main()
{
int** test = make2DArray(2,2); //makes a 2x2 array and stores its pointer in test.
//set the values to show setting works
test[0][0] = 5;
test[0][1] = 2;
test[1][0] = 1;
test[1][1] = -5;
// print the array values to show accessing works
printf("array test is test[0][0] = %d, test[0][1] = %d, test[1][0] = %d, test[1][1] = %d",
test[0][0],test[0][1],test[1][0],test[1][1]);
//deallocate the memory held by test
destroy2DArray(test,2,2);
return 0;
}
My concern is this may not be memory-safe, since it appears I am allocating memory outside of the function in which it is used (potential out-of-scope error). I can read and write to the array when I am making a single small array, but am worried when I scale this up and there are many operations going on the code might access and alter these values.
I may be able to sidestep these issues by making an array class which includes these functions as members, but I am curious about this as an edge case of C++ style and scoping.
There is a difference between allocating 2D arrays like this and declaring a local variable such as int ary[10][10]. Based on your statement
My concern is that this operation may not be memory-safe, since it appears that I am allocating memory for an array outside of the function in which it is used (potential out-of-scope error)
I am guessing you do not fully understand that difference.
You are allocating arrays on the heap. Declaring a local variable like int ary[10][10] places it on the stack. It is the latter case where you need to worry about not referencing that memory outside of its scope-based lifetime; that is, it is the following that is totally wrong:
//DON'T DO THIS.
template<size_t M, size_t N>
int* make2DArray( ) {
int ary[M][N];
return reinterpret_cast<int*>(ary);
}
int main()
{
auto foo = make2DArray<10, 10>();
}
because ary is local to the function and when the stack frame created by the call to make2DArray<10,10> goes away the pointer the function returns will be dangling.
Heap allocation is a different story. It outlives the scope in which it was created. It lasts until it is deleted.
But anyway, as others have said in the comments, your code looks like C, not C++. Prefer a std::vector<std::vector<int>> to rolling your own.
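A minimal sketch of that suggestion, reusing the make2DArray name from the question (the demo function is hypothetical and only there to show usage):

```cpp
#include <cassert>
#include <vector>

// Hypothetical replacement for make2DArray: the vector owns its rows,
// so no matching destroy function is needed.
std::vector<std::vector<int>> make2DArray(int dim1, int dim2)
{
    assert(dim1 >= 0 && dim2 >= 0);
    // dim1 rows, each a vector of dim2 zero-initialized ints
    return std::vector<std::vector<int>>(dim1, std::vector<int>(dim2, 0));
}

void demo()
{
    auto test = make2DArray(2, 2);   // storage lives on the heap, owned by `test`
    test[0][0] = 5;
    test[1][1] = -5;
    assert(test[0][0] == 5 && test[1][1] == -5);
}   // `test` goes out of scope here and frees everything itself
```

Because the vector is returned by value and manages its own memory, the out-of-scope concern from the question does not arise.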
If you must use an array and are allergic to std::vector, create the 2d array (matrix) as one contiguous area in memory:
int * matrix = new int [dim1 * dim2];
If you want to set the values to zero:
std::fill(matrix, (matrix + (dim1 * dim2)), 0);
If you want to access a value at <row, column>:
int value = matrix[(row * dim2) + column]; // dim2 is the number of columns per row
Since the matrix was one allocation, you only need one delete:
delete [] matrix;
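The single-allocation layout can also be wrapped so that the index arithmetic lives in one place. The sketch below is one possible shape, not a standard class: Matrix, at, rows_, cols_ and data_ are all made-up names, and it stores the data in a std::vector so no explicit delete is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: one contiguous allocation, row-major indexing.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0) {}

    int& at(std::size_t row, std::size_t col)
    {
        assert(row < rows_ && col < cols_);
        return data_[row * cols_ + col];   // row * (row width) + column
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;                // released automatically
};
```

With this, `Matrix m(2, 3); m.at(1, 2) = 7;` writes to the sixth and last element of the underlying buffer.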

Returning a filtered range

I'm looking at filtering ranges, and getting a little confused. In D, I can write this code:
import std.stdio;
import std.range;
import std.algorithm;
auto filterNums(int[] vals)
{
int limit = 3;
return filter!(n => n >limit)(vals);
}
int main()
{
int[] nums = [1,2,3,4,5];
auto flt = filterNums(nums);
foreach(n;flt)
{
writeln(n);
}
return 0;
}
which gives the expected output of:
4
5
But this doesn't seem to be terribly safe code. If the filter is lazily evaluated, how does it know that limit is 3 once the local variable goes out of scope? Is my code just lucky that nothing else has overwritten limit in memory? Also, is the passed variable nums a reference or a copied value? If it is a value, is the filter making its own copy?
Or perhaps I am not using filter in the correct way?
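limit is captured by the lambda. Because the lambda escapes filterNums inside the returned filter range, D allocates the variables it captures in a closure on the garbage-collected heap rather than on the stack, so limit is still valid when the filter is evaluated lazily.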
int[] nums is a dynamic array managed by D's runtime. It consists of two parts: a value part, which contains its length and a pointer, which points to the dynamic part on the heap (where the ints are stored). The length and pointer itself are passed by value, which means appending or removing will not affect the original, but editing an element e.g. vals[1] = 1 would. To pass all of it by reference, you would use
filterNums(ref int[] vals).
Either way, the garbage collector keeps it around as long as it's needed, in this case it is stored in the filter construct.

C++ Function Alters Value of Passed Parameter

I have a simple swapping function to take an integer array, and return a new array with swapped values.
int* Node::dataSwap(int *data, int n_index, int swap_index){
printDatt(data);
int *path = data;
int swapped = data[n_index];
int to_swap = data[swap_index];
path[n_index] = to_swap;
path[swap_index] = swapped;
printDatt(data);
return path;
}
However, the original data is being altered by this function. The output looks something like this (printing what should be the same data to the console twice):
0, 1, 2
3, 4, 5
6, 7, 8
0, 1, 2
3, 4, 8
6, 7, 5
Why is "data" being changed when I am not changing it? Is "path" a reference to the actual mem addr of "data"?
The type of the argument data and the local variable path is int *. You can read this as "pointer to int".
A pointer is a variable holding a memory address. Nothing more, nothing less. Since you set path = data, those two pointers are equal.
In your mind, data is an array. But that's not what the function dataSwap is seeing. To the function dataSwap, its argument data is just a pointer to an int. This int is the first element of your array. You accessed elements of the array using data[n_index]; but that's just a synonym for *(data + n_index).
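The aliasing can be checked directly. In this small sketch (the function name is made up), a write through path is observed through data because both hold the same address:

```cpp
#include <cassert>

// Returns true when a write through `path` is visible through `data`,
// i.e. when both names refer to the same array.
bool writes_are_shared(int* data, int n_index)
{
    assert(data != nullptr && n_index >= 0);
    int* path = data;        // copies the address, not the elements
    path[n_index] = 42;      // same as *(path + n_index) = 42
    return *(data + n_index) == 42 && &path[n_index] == &data[n_index];
}
```

For any valid array and index this returns true, which is exactly the behaviour the question observed.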
How can you remedy the problem?
The C way: malloc and memcpy
Since you want to return a new array, you should return a new array. To do this, you should allocate a new region of memory with malloc, and then copy the values of the original array to the new region of memory, using memcpy.
Note that it is impossible to do this using only the current arguments of the function, since none of those arguments indicate the size of the array:
data is a pointer to the first element of the array;
n_index is the index of one of the elements in the array;
swap_index is the index of another element in the array.
So you should add a fourth parameter to the function, int size, to specify how many elements are in the array. You can use size as an argument to malloc and memcpy, or to write a for loop iterating over the elements of the array.
New problem arising: if you call malloc to allocate new memory, then the user will have to call free to free the memory at some point.
C++ has the cool keyword new whose syntax is somewhat lighter than the syntax of malloc. But this doesn't solve the main problem; if you allocate new memory with the keyword new, then the user will have to free the memory with the keyword delete at some point.
Urgh, so much burden!
But this was the C way. A good rule of thumb in C++ is: never handle arrays manually. The standard library has std::vector for that. There are situations where using new might be the best solution; but in most simple cases, it isn't.
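For completeness, the malloc and memcpy approach described above might look like the sketch below. It is written as a free function rather than a member of Node, the name dataSwapCopy is made up, and size is the extra parameter discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical C-style version. The caller owns the result and must
// release it with std::free.
int* dataSwapCopy(const int* data, int n_index, int swap_index, int size)
{
    assert(data != nullptr && size > 0);
    assert(n_index >= 0 && n_index < size && swap_index >= 0 && swap_index < size);

    const std::size_t bytes = static_cast<std::size_t>(size) * sizeof(int);
    int* path = static_cast<int*>(std::malloc(bytes));
    if (path == nullptr) return nullptr;   // allocation failed

    std::memcpy(path, data, bytes);        // copy every element into the new block
    path[n_index] = data[swap_index];      // swap in the copy only
    path[swap_index] = data[n_index];
    return path;
}
```

The original array is never written to, but every call now needs a matching std::free, which is the burden described above.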
The C++ way: std::vector
Using the class std::vector from the standard library, your code becomes:
#include <vector>
std::vector<int> Node::dataSwap(std::vector<int> data, int n_index, int swap_index)
{
std::vector<int> new_data = data;
int swapped = data[n_index];
int to_swap = data[swap_index];
new_data[n_index] = to_swap;
new_data[swap_index] = swapped;
return (new_data);
}
No malloc, no new, no free and no delete. The class std::vector handles all that internally. You don't need to manually copy the data either; the initialisation new_data = data calls the copy constructor of class std::vector and does that for you.
Avoid using new as much as you can; use a class that handles all the memory internally, like you would expect it in a higher-level language.
Or, even simpler:
The C++ way: std::vector and std::swap
#include <vector>
#include <algorithm>
std::vector<int> Node::dataSwap(std::vector<int> data, int n_index, int swap_index)
{
std::vector<int> new_data = data;
std::swap(new_data[n_index], new_data[swap_index]);
return (new_data);
}
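Since data is taken by value in the versions above, it is already a private copy of the caller's vector, so the extra new_data copy is not strictly needed. A possible shorter variant, written here as a free function with std::size_t indices (both choices are assumptions of this sketch):

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical free-function variant: `data` is a by-value parameter,
// so modifying it cannot affect the caller's vector.
std::vector<int> dataSwap(std::vector<int> data, std::size_t n_index, std::size_t swap_index)
{
    assert(n_index < data.size() && swap_index < data.size());
    std::swap(data[n_index], data[swap_index]);
    return data;   // moved out to the caller
}
```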
Is "path" a reference to the actual mem addr of "data"?
Yes! In order to create a new array that is a copy of the passed data (only with one pair of values swapped over), then your function would need to create the new array (that is, allocate data for it), copy the passed data into it, then perform the swap. The function would then return the address of that new data, which should be freed later on, when it is no longer needed.
However, in order to do this, you would need to also pass the size of the data array to the function.
One way to do this, using 'old-style' C++, is with the new operator. With the added 'size' parameter, your function would look something like this:
int* Node::dataSwap(int *data, int n_index, int swap_index, int data_size)
{
printDatt(data);
int *path = new int[data_size]; // Create new array...
for (int i = 0; i < data_size; ++i) path[i] = data[i]; // ... and copy data
int swapped = data[n_index];
int to_swap = data[swap_index];
path[n_index] = to_swap;
path[swap_index] = swapped;
printDatt(data);
return path; // At some point later on, your CALLING code would "delete[] path"
}
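If the raw new[] version is kept, the "delete[] later" obligation can be handed to std::unique_ptr<int[]>. The following is a sketch under that assumption; dataSwapOwned is a made-up name and the function is shown as a free function for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical variant: the returned unique_ptr calls delete[] automatically
// when it goes out of scope in the calling code.
std::unique_ptr<int[]> dataSwapOwned(const int* data, int n_index, int swap_index, int data_size)
{
    assert(data != nullptr && data_size > 0);
    assert(n_index >= 0 && n_index < data_size);
    assert(swap_index >= 0 && swap_index < data_size);

    std::unique_ptr<int[]> path(new int[data_size]);   // create new array...
    std::copy(data, data + data_size, path.get());     // ...and copy data
    std::swap(path[n_index], path[swap_index]);        // swap in the copy only
    return path;
}
```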
You are changing the memory that the pointer path points to, and that memory is data. I think a better understanding of how pointers work will help you. :)
Then you can use the swap function from the std library:
std::swap(data[n_index], data[swap_index]);
It will make your code nicer.

How to properly dynamically create an int and use it to size a dynamic array initialized with values?

I am working on an assignment for my c++ class where I need to dynamically assign a new int, array, and pointer from another pointer to practice dynamic memory allocation.
At first, I was struggling with creating a new int to provide an int for my new array, but I got it to compile and was wondering if my declarations were correct.
int *dmaArray = new int;
*dmaArray = 4;
I then took that and put it into a dynamically created array, but I don't know how to declare values of the array as it errors out saying "cannot convert to int". I did some thinking and I believe it's because it was declared and needs to be initialized at the declaration; I can't because the declaration is already a declaration (new) in itself.
int * nodeValues = new int[*dmaArray];
nodeValues[*dmaArray] = {6, 2, 28, 1};
A loop wouldn't work to assign values afterwards, since the values aren't consecutive or in any pattern. (Well, regardless, I would need to use an array because the assignment said so.)
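This is not how to initialize a dynamic array:
int * nodeValues = new int[*dmaArray];
nodeValues[*dmaArray] = {6, 2, 28, 1};
The second line does not fill the array: nodeValues[*dmaArray] names a single int (one past the last valid index), and a braced list of four values cannot be assigned to it. Declare the array like this:
int* nodeValues = new int[*dmaArray];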
And to assign values to it use loops or manually:
nodeValues[0] = 6;
nodeValues[1] = 2;
nodeValues[2] = 28;
nodeValues[3] = 1;
Remember that an array is a run of elements of the same type stored contiguously in memory, and you read or write each element through its index.
So if you want to print the array:
for(auto i(0); i != *dmaArray; ++i)
std::cout << nodeValues[i] << ", ";
Finally you should clean memory dynamically allocated after finishing with it because the compiler doesn't do it for you:
delete[] nodeValues;
I figured out how:
int * nodeValues = new int[*dmaArray]{6,5,28,1};
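Putting the pieces of this thread together, a complete round trip might look like the sketch below: allocate the size, allocate and brace-initialize the array, use it, and release both. The function name is made up, and the values are the ones from the question.

```cpp
#include <cassert>

// Hypothetical helper that builds the array, sums it, and cleans up.
int sumNodeValues()
{
    int* dmaArray = new int(4);                          // a single int holding the size
    assert(*dmaArray >= 4);                              // size must cover all initializers
    int* nodeValues = new int[*dmaArray]{6, 2, 28, 1};   // allocate and initialize together

    int sum = 0;
    for (int i = 0; i != *dmaArray; ++i)
        sum += nodeValues[i];

    delete[] nodeValues;   // array form, matches new[]
    delete dmaArray;       // scalar form, matches new
    return sum;
}
```

Note the two different delete forms: memory from new[] is released with delete[], and memory from a plain new with delete.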

Pointer Pointer Methods C++

I have two questions:
1) How can I make an array which points to objects of integers?
int* myName[5]; // is this correct?
2) If I want to return a pointer to an array that points to objects (like (1)), how can I do this in a method? I.e., I want to implement the method:
int **getStuff() {
// what goes here?
return *(myName); // im pretty sure this is not correct
}
Thanks for the help!
How can I make an array which points
to objects?
int * myName[5]; /* correct */
If I want to return a pointer to an
array, which points to objects (like
(1)) how can I do this in a method?
Technically, you write this function:
int * (* getStuff() )[5] {
return &myName;
}
That returns a pointer to that array. However, you don't want to do that. You wanted to return a pointer to the first element of the array:
int ** getStuff() {
return myName; /* or return &myName[0]; */
}
That way, you can now access items as you want like getStuff()[0] = &someInteger;
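The two return types can be compared side by side. In this sketch myName is assumed to be a global, as the question implies; getWholeArray and demo are made-up names for illustration.

```cpp
#include <cassert>

int* myName[5];   // array of five pointers to int (a global, so zero-initialized)

// Pointer to the first element: what callers usually want.
int** getStuff() { return myName; }

// Pointer to the entire array: the type carries the length 5.
int* (*getWholeArray())[5] { return &myName; }

void demo()
{
    static int someInteger = 7;
    getStuff()[0] = &someInteger;                    // write through the element pointer
    assert((*getWholeArray())[0] == &someInteger);   // read through the array pointer
    assert(*getStuff()[0] == 7);
}
```

Both functions refer to the same storage; they differ only in how much type information the caller gets and in the extra dereference the array-pointer form requires.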
Note that your code,
int* myName[5];
declares an array containing 5 values, each of which is a "pointer to int", which is what you asked.
However this being C++, that's all it does. As a Python scripter, that might cause you some surprises.
It does not give any of those 5 pointers sensible values, and it does not create any integers for them to point to.
If you put it in a function body, then it creates the array on the stack. This means that the array will cease to exist when the current scope ends (which, to put it simply, means when you get to the enclosing close-curly, so for example return does it). So in particular, the following code is bad:
int **myFunction() {
int *myArray[5];
return myArray;
} // <-- end of scope, and return takes us out of it
It might compile, but the function returns a pointer to something that no longer exists by the time the caller sees it. This leads to what we call "undefined behaviour".
If you want the array to exist outside the function it's created in, you could create one on the heap each time your function is called, and return a pointer, like this:
int **myFunction() {
int **myArray = new int*[5]; // five pointers to int, not five ints
return myArray;
}
The function returns a different array each time it's called. When the caller has finished with it, it should destroy the array, like this:
delete[] myArray;
otherwise it will never be freed, and will sit around using up memory forever (or, on most OSes, until your program exits).
Alternatively, you can use the keyword "static" to create an array with "static storage duration" (meaning that it exists as long as the program is running, but there's only one of it rather than a new one each time). That means the function returns the same array each time it's called. The caller could store some pointers in it, forget about it, call the function again, and see the same pointers still there:
int **myFunction() {
static int *myArray[5];
return myArray;
}
Note how similar this code is to the very bad code from earlier.
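The difference between the heap and static variants can be observed directly. In the sketch below the function names sharedArray, freshArray and demo are made up for illustration:

```cpp
#include <cassert>

int** sharedArray()
{
    static int* myArray[5];   // static storage duration: one array for the whole program
    return myArray;
}

int** freshArray()
{
    return new int*[5]();     // a new zeroed array per call; the caller must delete[] it
}

void demo()
{
    assert(sharedArray() == sharedArray());   // same array every time

    int** a = freshArray();
    int** b = freshArray();
    assert(a != b);                           // two distinct allocations
    delete[] a;
    delete[] b;
}
```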
Finally, if you just want to create an array of integers, not an array of pointers to integers, you can do this:
int myArray[5] = { 1, 2, 3, 4, 5};
That actually creates 5 integers, meaning it allocates space that stores the integer values themselves. That's different from the array of pointers, which stores the addresses of space used to store integer values.
It also stores the specified values in that space: myArray[0] is now 1, myArray[1] is 2, etc.
1) Correct - this is an array of 5 pointers to ints
2) You can return a pointer to an array of pointers to ints by returning a pointer to the first element of that array. This has two levels of indirection, so you need two asterisks. You can also return the array normally, since arrays automatically decay into pointers to their first elements.
int **getStuff() {
return myName; // 1: the array decays to a pointer to its first element
// return &myName[0]; // 2: equivalent, written explicitly
}
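int **myName;
int **getStuff() {
int **array = new int*[5];
for (int i = 0; i < 5; i++)
{
array[i] = new int(i); // allocate each int; the address of a loop-local variable would dangle
}
return array; // caller must delete each element, then delete[] the array
}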
Steve Jessop, I think you meant:
int **myFunction() {
int **myArray = new int*[5];
return myArray;
}
This returns a pointer to a heap-allocated array of five int pointers, which the caller can test and later release with delete[]. Nothing leaks as long as the caller does so.
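#include <cstring> // std::memset
#include <new> // std::bad_alloc
template <class T>
T* newarray(int len)
{
try
{
T* a = new T[len];
std::memset(a,0,len*sizeof(T)); // only valid for trivially copyable T such as int or float
return a;
}
catch (const std::bad_alloc&)
{return 0;} // allocation failed
}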
.
.
.
void foo()
{
float *f=0;
f=newarray<float>(1000000);
if(!f) return;
//use f
delete [] f;
}
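A possible alternative to the try/catch and memset combination is the nothrow form of new together with value-initialization. This is a sketch, not a drop-in replacement: it takes std::size_t instead of int for the length, and foo is shortened to a smaller allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

template <class T>
T* newarray(std::size_t len)
{
    // new (std::nothrow) returns nullptr instead of throwing std::bad_alloc;
    // the trailing () value-initializes every element (zero for built-in types).
    return new (std::nothrow) T[len]();
}

void foo()
{
    float* f = newarray<float>(1000);
    if (!f) return;          // allocation failed
    assert(f[0] == 0.0f);    // elements start out zeroed
    delete[] f;
}
```

Unlike memset, value-initialization also works for element types that are not trivially copyable.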