How exactly does a reference to an array work? - c++

I am looking at an unusual example here and am trying to understand why this snippet behaves the way it does:
// uninitialized mem
char test[99999];
//
test[0] = 'a';
test[1] = 'b';
test[2] = 'c';
test[3] = 'd';
test[4] = 'e';
test[5] = 'f';
test[6] = 'g';
for (int i = 0; i < 99999; i++) {
cout << (&test[i])[i] << endl;
}
In particular, what is happening in memory for the output to skip a character?
output:
a
c
e
g
..

This is what is happening:
An array is just a contiguous chunk of memory.
&test[0]
gets the address of the array's starting point, not the value stored there.
When you add [some number], it counts up that number times the size of the data type; in this case each char is one byte.
So when you do
&test[i]
that means the starting address + i bytes.
When you do
(&test[i])[i]
you move i bytes from the starting address, then treat that as the new starting address and go up i more bytes.
So in your iterations:
(&test[0])[0] // index 0 + 0 = 0
(&test[1])[1] // index 1 + 1 = 2
(&test[2])[2] // index 2 + 2 = 4
(&test[3])[3] // index 3 + 3 = 6

It should become a bit more obvious when you consider what the array indexing is actually doing.
Given an array test, you usually access the nth element of test with test[n]. However, this is actually the equivalent of *(test+n). This is because addition on pointers automatically multiplies the amount you add with the size of the type being pointed to. This means the pointer will then be pointing at the second item in the array if you add one to the pointer, the third item if you add two, and so on.
The code you provide then takes the address of that element, so you end up with &(*(test+n)). The address-of (&) and dereference (*) operations cancel each other out, which means you end up with just test+n.
The code then does another array index on that value, so you end up with (test+n)[n], which again may be written as *((test+n)+n). If you simplify that, you get *(test+n+n), which may be rewritten as *(test+2*n).
Clearly then, if you convert that back to array indexing notation, you end up with test[2*n], which indicates in simple form that you'll be querying every other element of the array.
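The derivation above can be checked directly. This is a minimal sketch, assuming a short initialized buffer instead of the question's 99999-element array; every_other is a hypothetical helper name, and the loop stops while 2*n is still inside the buffer so nothing is read out of bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the question): collects (&buf[n])[n]
// for every n where index 2*n is still inside the buffer.
std::string every_other(const char* buf, std::size_t len) {
    assert(buf != nullptr);
    std::string out;
    for (std::size_t n = 0; 2 * n < len; ++n) {
        // All three spellings name the same element.
        assert(&(&buf[n])[n] == buf + 2 * n);
        assert(&(&buf[n])[n] == &buf[2 * n]);
        out.push_back((&buf[n])[n]);
    }
    return out;
}
```

Calling every_other("abcdefg", 7) collects the characters at indices 0, 2, 4 and 6, which matches the a, c, e, g output in the question.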

Related

Palindrome mystery: Why an array of size 3 ends up being printed with 5 elements?

#include <iostream>
#include <cstring>
using namespace std;
int main(){
char a[] = "abc";
char b[2];
for(int i = 0,k = 2;i < 3;i++,k--){
b[k] = a[i];
cout << i << " " << k << endl;
}
if(strcmp(a,b) == 0){
cout << "palindrome";
}else{
cout << "no palindrome" << endl;
}
cout << "a: " << a << endl;
cout << "b: " << b << endl;
return 0;
}
output:
0 2
1 1
2 0
no palindrome
a: abc
b: cbabc
I don't understand why b array ends up with 5 elements, when the array holds only 3. Additionally, the loop loops only 3 times and this is the output I get.... A mystery.
You have an out-of-bounds array access and also need to be conscious of null-terminating your strings!
Specifically, char b[2]; gives you an array with exactly 2 chars, so only b[0] and b[1] are valid. You also need to account for the null character that should terminate all C-style strings. So to hold "cba" for example you need 4 elements. You can also see this if you print sizeof(a) (should be 4: 'a', 'b', 'c', '\0').
Basically, your program elicits undefined behavior (UB). The simple fix is to make b bigger (the same size as a, which is 4 in this case). The more complete answer is to manage your array lengths more carefully and look at the safer "n" versions of the C manipulation functions such as strncmp
Edit: to be complete, you have 2 sources of UB. The first is in the line b[k] = a[i] when k == 2, because again you have only allocated b[0] and b[1]. The second is when you call strcmp: b has not been properly null-terminated, and strcmp will happily read past the array bounds, which it knows nothing about.
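A sketch of the fix described in this answer: give the reversed copy enough room for every character plus the terminator, and write the terminator explicitly. The helper name is_palindrome and the 16-byte buffer are assumptions made for illustration, not part of the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if the C string `a` reads the same
// forwards and backwards. Assumes strlen(a) < 16 so the copy fits in `b`.
bool is_palindrome(const char* a) {
    assert(a != nullptr);
    char b[16];
    const std::size_t len = std::strlen(a);
    if (len >= sizeof b) {
        return false;  // too long for this sketch's fixed buffer
    }
    for (std::size_t i = 0; i < len; ++i) {
        b[len - 1 - i] = a[i];  // highest index written is len - 1
    }
    b[len] = '\0';  // terminate explicitly so strcmp stops inside b
    return std::strcmp(a, b) == 0;
}
```

With the terminator in place, strcmp and cout both stop inside b instead of running on into whatever happens to follow it in memory.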
b is not terminated by a null character (\0), so any string operation on it (like strcmp, or even just printing it with cout) runs on until it happens to hit such a character somewhere in memory. In other words, you are witnessing undefined behavior.
Strictly speaking you have undefined behaviour and any observed behaviour (wrong or seemingly partially correct) is explained by that.
For details and solutions see the other answers.
End of answer.
Now let's speculate about why, in your environment, you might end up with specifically the output you observe.
Assume the memory for your arrays
char a[] = "abc";
char b[2];
is laid out the way compilers and linkers often arrange adjacent variables:
b[0] non-initialised
b[1] non-initialised
a[0] = 'a'
a[1] = 'b'
a[2] = 'c'
a[3] = '\0'
Note the four (not three) elements of a and the terminator 0.
Your loop, right in the first iteration, attempts to write to the non-existing b[2].
This is already what causes undefined behaviour. Clean discussion ends here.
Let's continue speculating.
Your loop unintentionally writes one place beyond the existing b[1] and ends up clobbering a[0]. By chance it writes the value which happens to be already there, so no change there.
Your loop continues to write, now to existing entries of b.
The speculated result is
b[0] = 'c'
b[1] = 'b'
a[0] = 'a' (overwritten with the same value, 'a')
a[1] = 'b'
a[2] = 'c'
a[3] = '\0'
and the loop ends.
Then you try to output a and b.
This is done by outputting all characters found consecutively from the start of the arrays, until a terminator 0 is found.
For a this (luckily in case of the "a") is "abc\0", all from a.
For b this is "cb" from b, followed (on the search for a 0) by "abc\0" from a.
Note that the seemingly correct "a" already is incorrectly from a, not from b.
OK, when debugging this you can check the address of b[1].
In gdb:
(gdb) p &b[1]
$8 = 0x7fffffffdfe3 "\377abc"
See? If b[1] held a null terminator, the string shown would be empty, but it isn't. You told the compiler to reserve 2 elements for b. When asked for the address of the last element, b[1], the debugger shows not only the address but also the char* string that starts there. Since b is not null-terminated (my compiler didn't initialize it), that string continues beyond the bounds of b. Suspiciously enough, it ends with 'a' 'b' 'c' '\0'. Let's check the address of a[0]:
(gdb) p &a[0]
$9 = 0x7fffffffdfe4 "abc"
See? a sits directly after b in memory. Now, you are making two mistakes here:
You are not properly initializing b.
b reserves 2 slots of memory. If you want to check palindromes of a fixed size of 3 characters you should reserve 4 slots like you did for the null terminated string "abc".
Try changing b declaration from:
char b[2];
To:
char b[] = "xyz";
Your loop then overwrites the three characters of b with the reverse of a and leaves the terminator in place, so it does what you intend.

C++ expression is not assignable when writing `&str_send[0] = &str_recv[c * 6];`

I'm trying to parse an array that receives a multiple of 8 bytes and sends back each array of 8 bytes one at a time.
I'm getting an "expression is not assignable" or "lvalue required as left operand of assignment" error when building.
I'm trying to figure out why I cannot simply change the address of an array to the new position. At first I thought it was a C-style array issue, but the same error happened when I tried with std::vector<unsigned char>.
Is there a preferable way of doing this without copying the bytes?
Thanks,
unsigned char str_send[8];
unsigned char str_recv[BUF_SIZE];
int n = receive(cport_nr, str_recv, (int)BUF_SIZE);
if (n > 0 && ( n % 8 == 0 ) )
{
for (int c = 0; c < n / 8; c++) //Break up multiple 8-byte chunks
{
&str_send[0] = &str_recv[c * 6]; //ERROR expression is not assignable
}
}
return (0);
You can't change the address of a variable.
What you could do is create an array of pointers into the original array, but then you're just copying addresses (probably 8 bytes each) instead of single byte values. Then you'd have to dereference those pointers, which makes it non-trivial to send what they point to.
Copying the bytes is exactly what you want to do here. Then you have a buffer you can send as-is:
str_send[0] = str_recv[c * 6];
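The assignment above copies a single byte. To fill the whole send buffer, a block copy is one option. The sketch below assumes chunks are 8 bytes wide, as the question's text says, so chunk c starts at offset c * 8 (that the question's c * 6 is a typo is an assumption). The names kChunk, copy_chunk and chunk_view are hypothetical. It also shows the non-copying alternative: a pointer into the receive buffer rather than a second array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t kChunk = 8;  // chunk width, assumed from the question text

// Copies chunk number `c` of `recv` into `send`, which must hold kChunk bytes.
void copy_chunk(unsigned char* send, const unsigned char* recv, std::size_t c) {
    assert(send != nullptr && recv != nullptr);
    std::memcpy(send, recv + c * kChunk, kChunk);
}

// Non-copying alternative: a pointer to chunk `c` inside `recv`.
// The result is only meaningful while `recv` is alive and unmodified.
const unsigned char* chunk_view(const unsigned char* recv, std::size_t c) {
    assert(recv != nullptr);
    return recv + c * kChunk;
}
```

The pointer version works because a pointer variable can be reassigned, whereas the address of an array such as str_send cannot.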

setting char arrays equal to eachother with isdigit and isalpha

I'm trying to copy the elements of one char array into 2 other arrays, depending on whether each element is a number or a letter. The code makes logical sense to me, but the contents of the 2 other strings after the for loop don't correspond to the logic. Is it because of a missing null terminator somewhere in the other 2 arrays, or is the code itself invalid? arrayAlpha, arrayNum, and palind are all char arrays with a length of 30 elements, and stringLength was already determined before the for loop began.
for(int k=0; k<=stringLength; k++)
{
if( isalpha(palind[k])){
arrayAlpha[k]=palind[k];}
if ( isdigit(palind[k]))
{
arrayNum[k]=palind[k];
}
}
Given the input:
char palind[30] = "12345abcde";
arrayAlpha is garbage.
arrayNum is "12345"
However,
char palind[30] = "abcde12345";
arrayAlpha is "abcde".
arrayNum is garbage.
Thus, the shared index [k] is the problem: whichever of arrayNum or arrayAlpha receives its first character at k > 0 is left with unwritten elements at the front.
A simple change is to subtract the length of the other array:
arrayAlpha[k - strlen(arrayNum)] = palind[k];
arrayNum[k - strlen(arrayAlpha)] = palind[k];
since lengthOfPalind = lengthOfArrayAlpha + lengthOfArrayNum assuming palind only contains letters or numbers.
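A different way to avoid the shared index, shown here as a sketch rather than as this answer's method, is to keep a separate write position for each output array and terminate both at the end. This also handles input where letters and digits are interleaved. The function name split_alnum is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical helper: copies the letters of `src` into `alpha` and the
// digits into `num`. Each output must hold at least strlen(src) + 1 chars.
void split_alnum(const char* src, char* alpha, char* num) {
    assert(src != nullptr && alpha != nullptr && num != nullptr);
    std::size_t a = 0;  // next free slot in alpha
    std::size_t n = 0;  // next free slot in num
    for (std::size_t k = 0; src[k] != '\0'; ++k) {
        // <cctype> functions expect a value representable as unsigned char.
        const unsigned char ch = static_cast<unsigned char>(src[k]);
        if (std::isalpha(ch)) {
            alpha[a++] = src[k];
        } else if (std::isdigit(ch)) {
            num[n++] = src[k];
        }
    }
    alpha[a] = '\0';  // terminate both results explicitly
    num[n] = '\0';
}
```

Because each array has its own counter, both results start at index 0 regardless of where the first letter or digit appears in the input.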

What exactly happens if I assign a value to a string position that is not used at the moment?

I almost never use the c++ string type but I'll need to use a set of strings so I think they would be the best way to go...
I coded like this:
string a;
a[0] = 'b';
printf("%s", a.c_str());
and it printed the letter 'b'
but when I tried:
string a;
// i below would be a number from 0 to 9, so I add 48 to get the corresponding char
a[0] = 48 + i;
printf("%s", a.c_str());
It is not printing a single digit...
My question is: did it print 'b' correctly in the first case just because of a lucky undefined behavior?
I'm asking that because if I already had something in position 0, the assignment a[0] = 48 + i; would print the number correctly.
A string is a dynamic array, so you can't access an element that doesn't exist yet. Calling
string a;
a[0] = 'b';
printf("%s", a.c_str());
is very dangerous because you are writing past the end of the string's contents. I would guess your program will fail somewhere else later on. Look at what you did:
string a; a[0] = 'b'; printf("%zu ", a.size()); the output will of course be 0.
You have to make room first, for example with a.resize(10).
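Two ways to make that assignment well-defined, as a minimal sketch; the function names are hypothetical. Either let the string grow with push_back, or give it a real first element with resize before indexing.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds a one-character string for the digit i (0..9)
// by appending, so the string grows and size() becomes 1.
std::string digit_string(int i) {
    assert(i >= 0 && i <= 9);
    std::string a;
    a.push_back(static_cast<char>('0' + i));
    return a;
}

// Alternative: size the string first, then index into it.
std::string digit_string_resized(int i) {
    assert(i >= 0 && i <= 9);
    std::string a;
    a.resize(1);  // a[0] is now an element the string actually has
    a[0] = static_cast<char>('0' + i);
    return a;
}
```

Writing '0' + i instead of 48 + i gives the same value on ASCII systems and makes the intent easier to read.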
Please correct me anyone, if this is a platform dependent answer. I used VS2012 and had no time to test with gcc.
Once you create a string like
string s;
It allocates some initial memory. You can see it by,
s.capacity();//For me it gets 15
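It returns some non-zero value (again in VS2012). So
s[0] = 'b';
happens to land in memory the string owns, although the standard still treats writing anything other than '\0' at index size() as undefined behavior.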
If you try something,
a[10000] = 'b';
it will most likely crash your program.
The case for a[0] = 48 + i ( 0 <= i <= 9), I'd say this shouldn't be a problem other than it prints some gibberish after your single digit.

Multidimensional array with different sizes using Arduino

I would like to have a multidimensional array that allows for different sizes.
Example:
int x[][][] = {{{1,2},{2,3}},{{1,2}},{{4,5},{2,7},{1,1}}};
The values will be known at compile time and will not change.
I would like to be able to access the values like val = x[2][0][1];
What is the best way to go about this? I'm used to java/php where doing something like this is trivial.
Thanks
I suppose you could do this "the old fashioned (uphill both ways) way":
#include <stdio.h>
int main(void){
int *x[3][3];
int y[12] = {1,2,3,4,5,6,7,8,9,10,11,12};
x[0][0] = &y[0];
x[0][1] = &y[2];
x[1][0] = &y[4];
x[2][0] = &y[6];
x[2][1] = &y[8];
x[2][2] = &y[10];
// testing:
printf("x[0][0][0] = %d\n", x[0][0][0]);
printf("x[0][0][1] = %d\n", x[0][0][1]);
printf("x[0][1][0] = %d\n", x[0][1][0]);
printf("x[0][1][1] = %d\n", x[0][1][1]);
printf("x[1][0][0] = %d\n", x[1][0][0]);
printf("x[1][0][1] = %d\n", x[1][0][1]);
printf("x[2][0][0] = %d\n", x[2][0][0]);
printf("x[2][0][1] = %d\n", x[2][0][1]);
printf("x[2][1][0] = %d\n", x[2][1][0]);
printf("x[2][1][1] = %d\n", x[2][1][1]);
printf("x[2][2][0] = %d\n", x[2][2][0]);
printf("x[2][2][1] = %d\n", x[2][2][1]);
return 0;
}
Basically, the array x is a little bit too big (3x3) and it points to the "right place" in the array y that contains your data (I am using the digits 1…12 because it's easier to see it is doing the right thing). For a small example like this, you end up with an array of 9 pointers in x (72 bytes with 8-byte pointers), plus the 12 integers in y (48 bytes with 4-byte ints).
If you filled an int array with zeros where you didn't need values (or -1 if you wanted to indicate "invalid") you would end up with 18x4 = 72 bytes. So the above method is less efficient, because this array is not "very sparse". As you change the degree of raggedness, this gets better. If you really wanted to be efficient you would have an array of pointers-of-pointers, followed by n arrays of pointers, but this gets very messy very quickly.
Very often the right approach is a tradeoff between speed and memory size (which is always at a premium on the Arduino).
By the way, the above code does indeed produce the output
x[0][0][0] = 1
x[0][0][1] = 2
x[0][1][0] = 3
x[0][1][1] = 4
x[1][0][0] = 5
x[1][0][1] = 6
x[2][0][0] = 7
x[2][0][1] = 8
x[2][1][0] = 9
x[2][1][1] = 10
x[2][2][0] = 11
x[2][2][1] = 12
x[2][2][1] = 12
Of course it doesn't stop you from accessing an invalid array element, and doing so will likely crash (on a desktop, typically with a seg fault), since the unused elements in x are uninitialized and probably invalid pointers.
Thanks Floris.
I've decided to just load all values into a single array, like
{1,2,2,3,1,2,4,5,2,7,1,1}
and have a second array which stores the length of each first dimension, like
{2,1,3}
The third dimension always has a length of 2, so I will just multiply the number by 2. I'm going to make a helper class so I can just do something like getX(2,0) which would return 4, and have another function like getLength(0) which would return 2.
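A sketch of that flattened layout, written as free functions rather than a class to keep it short. The three-argument getX(row, pair, k) signature is an assumption for illustration (the post only mentions getX(2,0)), and the constant names are hypothetical. The start of a row is found by summing the lengths of the rows before it.

```cpp
#include <cassert>
#include <cstddef>

// All values in one flat array, two ints per innermost pair.
const int kData[] = {1, 2, 2, 3, 1, 2, 4, 5, 2, 7, 1, 1};
// Number of pairs in each first-dimension row.
const std::size_t kLengths[] = {2, 1, 3};
const std::size_t kRows = sizeof kLengths / sizeof kLengths[0];
const std::size_t kPairWidth = 2;  // the third dimension always has length 2

// Number of pairs stored in `row`.
std::size_t getLength(std::size_t row) {
    assert(row < kRows);
    return kLengths[row];
}

// Equivalent of x[row][pair][k] on the jagged array from the question.
int getX(std::size_t row, std::size_t pair, std::size_t k) {
    assert(row < kRows);
    assert(pair < kLengths[row]);
    assert(k < kPairWidth);
    std::size_t offset = 0;  // pairs stored before this row
    for (std::size_t r = 0; r < row; ++r) {
        offset += kLengths[r];
    }
    return kData[(offset + pair) * kPairWidth + k];
}
```

Here getX(2, 0, 0) is 4 and getX(2, 0, 1) is 5, matching x[2][0][1] from the question. If the per-call summation ever becomes a concern, a third array of precomputed row offsets would trade a few bytes for a constant-time lookup.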