int* to Constant Array - c++

I asked this question: Array Equivalent of Bare-String
To which the answer was C++ doesn't provide this functionality for const int*s. Which is disappointing. So my question then is: In practice how do I get around this limitation?
I want to write a struct like this:
struct foo{
const char* letters = "abc";
const int* numbers = ???
};
I cannot:
&{1, 2, 3} cause I can't take the address of an r-value
array<int, 3>{{1, 2, 3}}.data() cause the memory is cleaned up immediately after initialization
const int* bar(){ return new int[3]{1, 2, 3}; } cause nothing will delete this pointer
I know that I can use an auto pointer to get around this. I am not suggesting that struct foo is good code, I am trying to illustrate that the compiler makes a provision to store the const array "abc" in memory and clean it up on program exit, I want there to be a way to do that for ints as well.
Is there a way to accomplish this?

How about a static which you point to - I think this is pretty much what the compiler does internally for string literals anyway?
static const int Numbers[] = {1, 2, 3};
struct foo{
const char* letters = "abc";
const int* numbers = Numbers;
};
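A minimal sketch of that idea, assuming C++11 for the default member initializers. The names kNumbers, Foo, and sum_numbers are made up for illustration, and the count member is an addition, since a bare const int* does not carry its length:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration only.
static const int kNumbers[] = {1, 2, 3};   // static storage, like a string literal

struct Foo {
    const char* letters = "abc";
    const int* numbers = kNumbers;          // no new, so nothing to delete
    std::size_t count = sizeof(kNumbers) / sizeof(kNumbers[0]);
};

// Adds up the integers that a Foo points at.
int sum_numbers(const Foo& f) {
    assert(f.numbers != nullptr);
    int total = 0;
    for (std::size_t i = 0; i < f.count; ++i) {
        total += f.numbers[i];
    }
    return total;
}
```

Every Foo shares the same array, which is the same trade-off the compiler makes for "abc".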

String literals are all you get. However, they are also enough to cover most integral data. In your case you can use
L"\1\2\3"
to get a compiler-managed array of wide characters. C++11 and later also support the u8, u, and U prefixes for UTF-8, UTF-16 (char16_t), and UTF-32 (char32_t) string literals.

We can accomplish this using Ben Voigt's answer:
const int* numbers = sizeof(int) == sizeof(char32_t) ? reinterpret_cast<const int*>(U"\1\2\3") : reinterpret_cast<const int*>(u"\1\2\3");
The ternary is compiled out as is evidenced by the fact that you can declare numbers as constexpr.
There are a couple drawbacks to this implementation:
This is actually a char32_t (or char16_t) string literal, so you will get a terminating 0 element in addition to any characters you specify
This assumes that an int will be either 32 bits or 16 bits; if that's not the case, this will cast from a char16_t array to a differently sized int and you will have major problems
In any case we can simplify this into a macro:
#define QUOTATION(x) sizeof(int) == sizeof(char32_t) ? reinterpret_cast<const int*>(U ## x) : reinterpret_cast<const int*>(u ## x)
Which can be used like:
const int* numbers = QUOTATION("\1\2\3");
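The first drawback can be checked directly. A small sketch, where kLiteralElements and element_at are hypothetical helpers, showing that the literal carries one more element than the three escapes written in it:

```cpp
#include <cassert>
#include <cstddef>

// Number of char32_t elements in the literal, including the terminator.
constexpr std::size_t kLiteralElements = sizeof(U"\1\2\3") / sizeof(char32_t);

// Reads element i of the literal without any reinterpret_cast.
unsigned long element_at(std::size_t i) {
    static const char32_t* const text = U"\1\2\3";
    assert(i < kLiteralElements);
    return static_cast<unsigned long>(text[i]);
}
```

Any code that walks the array therefore has to know the real length separately, or treat 0 as a sentinel.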

Related

Must a pointer be initialized before use? Then how should I understand char * p?

I'm a new learner, and something about pointers puzzles me.
As I learned from books, a pointer must be initialized before it is used, so we usually write something like this:
int a = 12;
int * p = &a;
So I understand why int* p = 12 is wrong: 12 has no address.
Then I found something today while coding. It came from this:
char * months[12] = {"Jan", "Feb", "Mar", "April", "May" , "Jun", "Jul"
,"Aug","Sep","Oct","Nov","Dec"};
Then another commonly used construct came to mind:
char *p = "string"; (this is OK, so why isn't int * a = 12 allowed?)
I am puzzled. When is the pointer initialized, and how? And why can't int * a = 12 be initialized automatically? Maybe it has something to do with how memory is arranged.
First off:
int a = 12;
int* p = &a;
This works because &a is a memory address.
int* p = 12;
This fails mostly because 12 is not a memory address. It's also true that 12, by itself, has no address, but this would be better reflected by a snippet like int* p = &12; (which wouldn't work, as you correctly noted).
An interesting property of pointers is that they are often used to designate the start of a list of values. For instance, take this array of integers:
int a[] = {1, 3, 7, 13};
It can trivially be turned into an integer pointer.
int* p = a; // magic!
The pointee is the first element of a, so *p == 1. Now, you can also do p[0] (which is 1, too), p[1] == 3, p[2] == 7, and p[3] == 13.
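As a quick check of the indexing, here is a small sketch (element_via_pointer is a made-up helper name) that reads each element of that array through a pointer:

```cpp
#include <cassert>
#include <cstddef>

// Returns element i of the array {1, 3, 7, 13}, read through a pointer.
int element_via_pointer(std::size_t i) {
    static const int a[] = {1, 3, 7, 13};
    const int* p = a;          // the array converts to a pointer to a[0]
    assert(i < sizeof(a) / sizeof(a[0]));
    return p[i];               // p[i] is shorthand for *(p + i)
}
```

Valid indices run from 0 to 3; anything past that reads outside the array.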
The reason char* foo = "bar" works is that "bar" is not a single value: it's a character array in disguise. Single characters are denoted by single quotes. As a matter of fact:
"bar"[0] == 'b'
"bar"[1] == 'a'
"bar"[2] == 'r'
The compiler has special support for string literals (quoted strings) that make it possible to assign them straight to pointers. For instance, char* foo = "bar" is valid.
A C99-compliant compiler also has support for array literals (compound literals). For instance, int* p = (int [3]){1, 2, 3}; is valid C99. A string literal always has static storage duration; a compound literal has static storage only at file scope, and inside a function it lives until the enclosing block ends.
int* p = 12 is wrong because the assigned value may or may not be a valid memory address. You are forcing p to point at that location.
char *p = "string" is allowed because compiler already has set the space for the string and p is pointing to the first character of that string.
It comes down to types.
In both C and C++, the type of a plain integer literal like 12 is int. There is no implicit conversion from the type int to the type int*, which makes sense: a pointer and an integer are, conceptually, completely different things. So int *p = 12; is invalid.
In C, a plain string literal like "abc" is translated into a static array of chars (of size exactly sufficient to store abc plus a terminating null char). The type "array of chars" is implicitly convertible to the type char* (pointer to char) - arrays are said to decay into pointers. So the assignment char *p = "abc"; is valid.
But there's a catch: it's undefined behavior to modify that array (both in C and C++). The conversion to char * was deprecated in C++03 and is ill-formed since C++11, so you should use const char * instead.
In reality the gcc compiler will warn you about:
char* p = "hello";
This is because in C++ "hello" has type const char[6], which converts to const char*.
so this would be better:
const char* p = "hello";
But yes as other people have described, "hello" has an address which points to the start of a fixed sequence of characters.
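Because the literal itself must not be modified, the usual way to get editable text is to copy it first. A small sketch, with capitalize_copy as a made-up helper name:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns a writable copy of the text with its first character upper-cased.
std::string capitalize_copy(const char* literal) {
    assert(literal != nullptr);
    std::string copy(literal);   // the copy owns its own characters
    if (!copy.empty()) {
        copy[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(copy[0])));
    }
    return copy;                 // the literal itself is never written to
}
```

A plain char buffer[] = "hello"; works the same way in C, since it also copies the literal into writable storage.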
int* p = 12 is wrong in the sense that it does something that is almost definitely not what you think it does. C++ rejects the implicit conversion from int to int *, and a C compiler must at least diagnose it. If your compiler only warns, or you add an explicit cast, what you did was point p at memory location 12, which is almost definitely something you shouldn't be reading. If you dereference that pointer you are in undefined behavior territory. If you are in user mode, then *(int*)12 is probably a segmentation fault.
Using C terminology, the difference between these two cases is that string literals exist, which are an array of char (and thus an lvalue, so you can take their address or point to them); but in C90 there are no other literals. 12 is an integer constant, not a literal. You can't do &(12), because the language says so. Brace-enclosed initializer lists are not values. Constants are rvalues; literals are lvalues.
In C++ the behaviour is the same, however C++ uses different terminology for the same thing. In C++, constants are all called "literals", however they are also all rvalues (except for string literals) so you cannot take their address.
C99 added array literals of other types.
The language has made an exception and allows string literals to be used to initialize char const*. Some compilers are less strict and allow string literals to be used to initialize char* as well.
Update, in response to comment by Pascal Cuoq
In C and C++, the following are valid ways to initialize variables using string literals:
char carr[] = "abc";
char carr[10] = "abc";
char const* cp = "abc";
The following are valid ways to initialize variables using integer literals in an initializer list:
int iarr[] = {1, 2, 3};
int iarr[10] = {1, 2, 3};
However, the following is not a valid way to initialize a variable using integer literals in an initializer list:
int const* ip = {1, 2, 3};
That's what I meant when I said the language has made an exception and allows string literals to be used to initialize char const*.
Many programmers are confused by the syntax of specifying a pointer.
int* p;
int *p;
int * p;
All of the above declare the same thing: a pointer, p, which can either be NULL or hold the address of an integer-size storage unit in memory.
Thus
int * p = 12;
declares a pointer, p, and assigns it the value 12.
In C and C++ pointers are just variables with special meaning to the compiler such that you are allowed to use special syntax to access the memory location whose value they hold.
Think about this a different way for a moment.
Think of the number "90210". That could be your bank balance. It could be the number of hours since you were born.
Those are all simple "integer" interpretations of the number. If I tell you that it is a Zip Code - suddenly it describes a place. 90210 isn't Beverly Hills in California, it is the [postal] address of Beverly Hills in California.
Likewise, when you write
int * p = 12;
you are saying that "12" is the address of an integer in memory, and you're going to remember that fact in the pointer-variable p.
You cannot write
int * p = &12;
because 12 is an rvalue with no address, so the compiler rejects it. To get a pointer to an integer holding 12 you need a named object (int a = 12; int * p = &a;) or, in C99, a compound literal such as &(int){12}.
char* p = "hello";
is very different.
12; // evaluates to an integer value of 12
"hello"; // evaluates to an array of characters { 'h', 'e', 'l', 'l', 'o', 0 }, which converts to a pointer to its first element
int i = 12; // legal
char h = 'h'; // legal
const char* p = "hello"; // legal
uintptr_t u = (uintptr_t)"hello"; // legal only with an explicit cast
The double-quotes in C and C++ have a special meaning, they evaluate to a pointer to the string contained in them, rather than evaluating to the string itself.
This is because
"The quick brown fox jumped over the lazy dog"
wouldn't fit into a CPU register (which are 32 or 64 bits depending on your machine). Instead, the text is written into the executable and the program instead loads the address of it (which will fit into a register). And this is what in C/C++ we call a pointer.
I found one good hint on Stack Overflow: always initialize a declared pointer with NULL.
This makes it clear that a newly created pointer cannot be used without further action.
The action that follows has to assign a proper address to the pointer (initialize it).
To do what was probably your intention with original code
int *p = 12;
you have to do for example:
int *p;
p = malloc(sizeof(*p)); /* pointer p holds address of allocated memory for one int */
*p = 12;
or another example:
int *p;
int a = 12; /* not const, because p points to a non-const int */
p = &a; /* pointer p holds address of variable a */
Why char *p = "string"; is correct was answered by others above. This is just another way of pointer initialization.
Also, similarly, in one of the answers here on Stack Overflow I found a really nice option for your case (a C99 compound literal, which is not valid C++):
int *p = &(int){12};
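Since the compound literal is a C feature, here is a hedged C++ sketch of two alternatives. The function names are made up, and std::unique_ptr assumes C++11:

```cpp
#include <cassert>
#include <memory>

// Option 1: point at a named object. The pointer is only valid while 'a' lives.
int read_named_object() {
    int a = 12;
    int* p = &a;
    return *p;
}

// Option 2: allocate the int dynamically; unique_ptr deletes it automatically.
std::unique_ptr<int> make_twelve() {
    std::unique_ptr<int> p(new int(12));
    assert(p != nullptr);   // new throws on failure rather than returning null
    return p;
}
```

The second form matters when the value has to outlive the function that created it.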

Pointer to array of character arrays

Okay, this one has me stumped. I am trying to pass an array of character arrays into my class's constructor. The class has a private attribute which stores a pointer to the array of character arrays. The class may then process the array via the pointer.
Below is some code that demonstrates the desired functionality. But, it won't compile. How do I fix this code so it works?
using namespace std;
const int MAX_LINES = 10, MAX_STRING = 80;
class Alphabetizer{
public:
Alphabetizer(char * inArray[][MAX_STRING]) : input(inArray){};
private:
char * input[MAX_LINES][MAX_STRING];
};
int main(){
char charArray[MAX_LINES][MAX_STRING];
Alphabetizer theAlaphBet(charArray);
return 0;
}
If you're insisting on using C-compatible character pointers, I think you'll have the best luck using a char ** as the type for input. This is more of the usual way to do this (in C at least), and it has the added benefit of not forcing you to define a maximum string size.
As others have pointed out, you can take advantage of std::string instead, which may be a better choice overall.
I'm guessing it's that you're not passing a pointer to char[][], you're passing a char[][].
Also, you should be using std::string instead of char arrays.
std::string will be the most appropriate here! It handles strings and character arrays well enough!
There are a few errors in the code. I suppose you are trying to refer to the charArray in the main function from inside the Alphabetizer object. If that is the case, the declaration
char * input[MAX_LINES][MAX_STRING];
is wrong because it makes input an array of MAX_LINES of (array of MAX_STRING of (char*)). In summary, input is an array, not a pointer to an array of anything. If you intended it to be a pointer, which is what the rest of your code suggests, then you have to do the following:
const int MAX_LINES = 10, MAX_STRING = 80;
class Alphabetizer{
public:
Alphabetizer(char ((*ar)[MAX_LINES])[MAX_STRING]) : m_ar(ar){};
private:
char ((*m_ar)[MAX_LINES])[MAX_STRING];
};
int main(){
char charArray[MAX_LINES][MAX_STRING];
char ((*ar)[MAX_LINES])[MAX_STRING] = &charArray;
Alphabetizer theAlaphBet(&charArray);
return 0;
}
Moreover doing,
input(inArray)
is wrong, as it is equivalent to doing the following,
char a[1] = {'a'};
char b[1] = {'p'};
a = b;
Assigning one array to another is not allowed in C or C++, so a = b does not compile, let alone copy one array over the other. You have to copy the elements explicitly, for example with memcpy.
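A sketch of two ways to get the copy that a = b was reaching for. copy_chars and assign_std_array are illustrative names, and std::array assumes C++11:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Copies every element of src into dst; both must be arrays of the same size N.
template <std::size_t N>
void copy_chars(const char (&src)[N], char (&dst)[N]) {
    std::copy(src, src + N, dst);
}

// std::array wraps a built-in array in a class, so plain assignment works.
std::array<char, 1> assign_std_array() {
    std::array<char, 1> a = {{'a'}};
    std::array<char, 1> b = {{'p'}};
    a = b;                       // element-wise copy
    assert(a[0] == 'p');
    return a;
}
```

Calling copy_chars(b, a) on the two one-element arrays above leaves 'p' in a[0].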
It's difficult to tell without seeing the compile errors, but I think the problem might be this line:
Alphabetizer theAlaphBet(charArray);
You are passing the array directly rather than its address. It should read:
Alphabetizer theAlaphBet( &charArray );
However I think you may be overcomplicating things. You might be better off using a reference rather than a pointer:
const int MAX_LINES = 10, MAX_STRING = 80;
class Alphabetizer{
public:
Alphabetizer(char (&inArray)[MAX_LINES][MAX_STRING]) : input(inArray){}
private:
char (&input)[MAX_LINES][MAX_STRING]; // a reference to the caller's array, not an array of references
};
int main(){
char charArray[MAX_LINES][MAX_STRING];
Alphabetizer theAlaphBet(charArray);
return 0;
}
You might also want to look into using std::string instead as this may help to simplify your code.
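For comparison, a rough sketch of the std::string route that several answers suggest. The sorted() member is an assumption about what the class is meant to do, and std::move and auto assume C++11:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

class Alphabetizer {
public:
    // Takes ownership of a copy of the lines; no fixed MAX_LINES or MAX_STRING.
    explicit Alphabetizer(std::vector<std::string> lines) : input_(std::move(lines)) {}

    // Returns the lines in ascending order, leaving the stored input untouched.
    std::vector<std::string> sorted() const {
        std::vector<std::string> out = input_;
        std::sort(out.begin(), out.end());
        assert(std::is_sorted(out.begin(), out.end()));
        return out;
    }

private:
    std::vector<std::string> input_;
};
```

The class now owns its data, so the lifetime of the caller's array no longer matters.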

Is it alright to use memcpy() to copy a struct that contains a pointer?

I was thinking about this the other day and I am curious if this is a bad idea...
Let's say there is a structure that contains a pointer to a string array.
Would the memcpy() copy the 'name' array pointer in the below example?
Edit: The std is inaccessible in this example.
struct charMap
{
unsigned char * name;
unsigned char id;
};
typedef struct charMap CharMapT;
class ABC
{
public:
ABC(){}
void Function();
CharMapT* structList;
};
void ABC::Function ()
{
CharMapT list[] =
{
{"NAME1", 1},
{"NAME2", 2},
{"NAME3", 3}
};
structList = new CharMapT[sizeof(list)];
memcpy(structList, &list, sizeof(list));
}
There are several errors in the code presented, which I will talk about first, followed by my stock-diatribe of pointers vs. arrays.
struct charMap
{
unsigned int * name;
unsigned int id;
};
typedef struct charMap CharMapT;
This declares a structure type that includes a pointer to unsigned int as the first member (name) and an unsigned int as the second member (id). On a 32-bit system with default byte packing this will be 8 bytes wide (32-bit pointer = 4 bytes, 32-bit unsigned int = 4 bytes). On a 64-bit machine the pointer will be 8 bytes wide and the int most likely still 32 bits wide, so the members take 12 bytes and alignment padding typically rounds the structure size up to 16 bytes.
Questionable Code
void ABC::Function ()
{
CharMapT list[] =
{
{"NAME1", 1},
{"NAME2", 2},
{"NAME3", 3}
};
structList = new CharMapT[sizeof(list)];
memcpy(structList, &list, sizeof(list));
}
This allocates a dynamic array of CharMapT structs. How many? More than you think. sizeof(list) returns the byte count of the list[] array. Since a CharMapT structure is 8 bytes wide on a 32-bit system (see above), this will be 3 * 8, or 24 CharMapT items (typically 48 items with 64-bit pointers, where the padded structure is 16 bytes).
We then memcpy() 24 bytes (or 48 bytes) from list (the & in &list is unnecessary) to the newly allocated memory. This will copy over 3 CharMapT structures, leaving the other 21 we allocated untouched (beyond their initial default construction).
Note: you're initializing a field declared as unsigned int * from a const char *, so if this even compiled the fundamental data type would be different. Assuming you fixed your structure and changed the pointer type to const char *, the addresses of the static string constants (the addresses of the "NAME" constants) somewhere in your const data segment will be assigned to the pointer variables structList[0].name, structList[1].name, and structList[2].name respectively.
This will NOT copy the data pointed to. It will only copy the pointer values. If you want copies of the data then you must allocate them yourself (malloc, new, whatever).
Better still, use an std::vector<CharMapT>, use std::string for CharMapT::name, and use std::copy() to replicate the source (or even direct-assignment).
I hope that explains what you were looking for.
Pointer vs. Array Diatribe
Never confuse a pointer with an array. A pointer is a variable that holds an address. Just like an int variable holds an integer value, or a char variable holds a character, the value held in a pointer is an address.
An array is different. It is also a variable (obviously), but it cannot be a modifiable l-value, and nearly every place it is used a conversion happens. Conceptually that conversion results in a temporary pointer that points to the element type of the array and holds the address of the first element. There are times when that conversion does not happen (such as when applying the address-of operator or sizeof).
void foo(const char * p)
{
}
char ar[] = "Hello, World!";
foo(ar); // passes 'ar', converted to `char*`, into foo.
// the parameter p in foo will *hold* this address
or this:
char ar[] = "Goodbye, World!";
const char *p = ar; // ok. p now holds the address of first element in ar
++p; // ok. address in `p` changed to address (ar+1)
but not this:
char ar[] = "Goodbye, World!";
++ar; // error. nothing to increment.
It won't copy your actual data pointed by name. It will copy the pointer and you'll have 2 pointers to the same place in 2 objects (for each pair of objects in 2 arrays).
All you really need to know here is that memcpy will give you a bit for bit copy of the original. So what you'll have is two pointers with the same value (i.e., an address) which refer to the same data.
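A small sketch that makes the shared pointers visible. It assumes name is declared as const char* so the string literals can initialize it, and copy_shares_strings is a made-up helper:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct CharMapT {
    const char* name;   // const char*, so it can point at a string literal
    unsigned char id;
};

// Returns true when the memcpy'd elements point at the very same strings.
bool copy_shares_strings() {
    CharMapT list[] = {{"NAME1", 1}, {"NAME2", 2}, {"NAME3", 3}};
    CharMapT copy[sizeof(list) / sizeof(list[0])];
    assert(sizeof(copy) == sizeof(list));
    std::memcpy(copy, list, sizeof(list));   // copies pointers and ids, not characters
    for (std::size_t i = 0; i < sizeof(list) / sizeof(list[0]); ++i) {
        if (copy[i].name != list[i].name || copy[i].id != list[i].id) {
            return false;
        }
    }
    return true;
}
```

With string literals the sharing is harmless, because nobody owns or frees them. It becomes a problem once name points at memory that one of the two copies deletes.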
On a side note, you have declared name as a pointer to int, which is of course wrong here. It should be a const char*. Also, as this is C++ and not C, you're better served by something like std::copy which won't break your code subtly if charMap someday becomes a complex type. On the same note, prefer std::string instead of const char* in most situations.
Your use of sizeof() is wrong when calling new. You are allocating an array of CharMapT elements. You have to specify the number of elements, but you are specifying a byte count instead. So you need to fix that:
structList = new CharMapT[sizeof(list) / sizeof(CharMapT)];
With that fixed, the result of the memcpy() will be that structList will contain an exact copy of the raw data that list[] contains. That means that the structList[N].name pointers will contain the same values as the list[N].name pointers, and thus they will all be pointing at the same physical memory for the string values.
If you want to do a deep copy of the string values, you have to allocate them separately, eg:
void ABC::Function ()
{
CharMapT list[] =
{
{"NAME1", 1},
{"NAME2", 2},
{"NAME3", 3}
};
int num = sizeof(list) / sizeof(CharMapT);
structList = new CharMapT[num];
for (int i = 0; i < num; ++i)
{
size_t len = strlen(list[i].name); // assumes name is declared as char* and <cstring> is included
structList[i].name = new char[len+1];
strcpy(structList[i].name, list[i].name);
structList[i].name[len] = 0;
structList[i].id = list[i].id;
}
...
for (int i = 0; i < num; ++i)
delete[] structList[i].name;
delete[] structList;
}
I'd like to add to @EdS.'s answer:
Your code reads much more like C++, and less like C-style code, if you do it like this:
#include<string>
#include<vector>
struct CharMap
{
CharMap(const std::string& name, unsigned char id); // only needed if you don't use -std=c++11
std::string name;
unsigned char id;
};
CharMap::CharMap(const std::string& name, unsigned char id):
name(name),
id(id)
{}
class ABC
{
public:
ABC(); // or ABC() = default; if you use -std=c++11
void Function();
private:
std::vector<CharMap> structList;
};
ABC::ABC(){} // not needed with -std=c++11
void ABC::Function ()
{
// This works with -std=c++11:
//structList =
//{
// {"NAME1", 1},
// {"NAME2", 2},
// {"NAME3", 3}
//};
// without c++11:
structList.clear(); // CharMap has no default constructor, so build the elements one by one
structList.push_back(CharMap("NAME1",1)); // don't worry about copies, we have RVO (check wikipedia or SO)
structList.push_back(CharMap("NAME2",2));
structList.push_back(CharMap("NAME3",3));
}
Why not use std::vector to make the array? You can do it like this:
#include<vector>
std::vector<CharMapT> structList(list.size());
It is safer, too: avoiding raw pointers decreases the chance of memory leaks and of bugs that arise from misusing the sizeof operator.
I suppose you do not really want a structList that has as many elements as the byte size of your list. (If list held doubles, that could be many times more than the number of elements in your list.)
Also, memcpy (which is really a C function) is not necessary if list is also a vector. You just do a simple assignment:
structList = list; // given that list is a vector.
This copies the elements one by one using their own copy semantics, which for a plain struct like CharMapT has the same effect as memcpy.
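When the element type holds a std::string rather than a raw pointer, the same assignment also duplicates the text. A sketch under that assumption (CharMap here is a simplified stand-in, and copy_and_rename is a made-up helper):

```cpp
#include <cassert>
#include <string>
#include <vector>

struct CharMap {
    std::string name;
    unsigned char id;
};

// Copies 'list' by assignment, then edits the copy to show the two are independent.
std::vector<CharMap> copy_and_rename(const std::vector<CharMap>& list) {
    std::vector<CharMap> structList;
    structList = list;                      // each std::string is duplicated
    assert(structList.size() == list.size());
    if (!structList.empty()) {
        structList[0].name = "RENAMED";     // does not touch list[0].name
    }
    return structList;
}
```

This is the deep copy that memcpy cannot give you, and there is nothing to delete afterwards.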

Binary serialization of variable length data and zero length arrays, is it safe?

I did some research but cannot find a definite approval or disapproval.
What I want is a fixed-size structure followed by a variable-length part, so that serialization can be expressed in a simple and less error-prone way.
struct serialized_data
{
int len;
int type;
char variable_length_text[0];
};
And then:
serialized_data* buff = (serialized_data*)malloc(sizeof(serialized_data)+5);
buff->len=5;
buff->type=1;
memcpy(buff->variable_length_text, "abcd", 5);
Unfortunately I can't find if MSVC, GCC, CLang etc., are ok with it.
Maybe there is a better way to achieve the same?
I really don't want those ugly casts all around:
memcpy((char*)(((char*)buff)+sizeof(serialized_data)), "abcd", 5);
This program is using a zero length array. This is not C but a GNU extension.
http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
A common idiom in C89, called the struct hack, was to use:
struct serialized_data
{
int len;
int type;
char variable_length_text[1];
};
Unfortunately its common use as a flexible array is not strictly conforming.
C99 comes with something similar to perform the same task: a feature called the flexible array member.
Here is an example right from the Standard (C99, 6.7.2.1p17)
struct s { int n; double d[]; };
int m = 12; // some value
struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
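Flexible array members are a C99 feature and are not part of standard C++. If the code has to build as C++, one alternative is to pack a fixed header and the payload into a single byte buffer. This is only a sketch: Header, serialize, and read_header are hypothetical names, and it ignores byte order, so it suits data that is written and read on the same kind of machine:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Fixed-size part of the record (hypothetical layout mirroring len and type).
struct Header {
    std::int32_t len;    // payload bytes, including the terminating 0
    std::int32_t type;
};

// Packs the header followed by the text into one contiguous buffer.
std::vector<char> serialize(std::int32_t type, const std::string& text) {
    Header h;
    h.len = static_cast<std::int32_t>(text.size() + 1);
    h.type = type;
    const std::size_t payload = static_cast<std::size_t>(h.len);
    std::vector<char> buf(sizeof(Header) + payload);
    std::memcpy(buf.data(), &h, sizeof(Header));
    std::memcpy(buf.data() + sizeof(Header), text.c_str(), payload);
    return buf;
}

// Reads the header back with memcpy, which sidesteps alignment and aliasing issues.
Header read_header(const std::vector<char>& buf) {
    assert(buf.size() >= sizeof(Header));
    Header h;
    std::memcpy(&h, buf.data(), sizeof(Header));
    return h;
}
```

The payload starts at buf.data() + sizeof(Header), so the cast-heavy pointer arithmetic stays inside these two functions.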

C++: sizeof for array length

Let's say I have a macro called LengthOf(array):
sizeof array / sizeof array[0]
When I make a new array of size 23, shouldn't I get 23 back for LengthOf?
WCHAR* str = new WCHAR[23];
str[22] = '\0';
size_t len = LengthOf(str); // len == 4
Why does len == 4?
UPDATE: I made a typo, it's a WCHAR*, not a WCHAR**.
Because str here is a pointer to a pointer, not an array.
This is one of the fine differences between pointers and arrays: in this case, your pointer is on the stack, pointing to the array of 23 characters that has been allocated elsewhere (presumably the heap).
WCHAR** str = new WCHAR[23];
First of all, this shouldn't even compile -- it tries to assign a pointer to WCHAR to a pointer to pointer to WCHAR. The compiler should reject the code based on this mismatch.
Second, one of the known shortcomings of the sizeof(array)/sizeof(array[0]) macro is that it can and will fail completely when applied to a pointer instead of a real array. In C++, you can use a template to get code like this rejected:
#include <cstddef>
#include <iostream>
template <class T, size_t N>
size_t size(T (&x)[N]) {
return N;
}
int main() {
int a[4];
int *b;
b = ::new int[20];
std::cout << size(a); // compiles and prints '4'
// std::cout << size(b); // uncomment this, and the code won't compile.
return 0;
}
As others have pointed out, the macro fails to work properly if a pointer is passed to it instead of an actual array. Unfortunately, because pointers and arrays evaluate similarly in most expressions, the compiler isn't able to let you know there's a problem unless you make your macro somewhat more complex.
For a C++ version of the macro that's typesafe (will generate an error if you pass a pointer rather than an array type), see:
Compile time sizeof_array without using a macro
It wouldn't exactly 'fix' your problem, but it would let you know that you're doing something wrong.
For a macro that works in C and is somewhat safer (many pointers will diagnose as an error, but some will pass through without error - including yours, unfortunately):
Is there a standard function in C that would return the length of an array?
Of course, using the power of #ifdef __cplusplus you can have both in a general purpose header and have the compiler select the safer one for C++ builds and the C-compatible one when C++ isn't in effect.
The problem is that the sizeof operator checks the size of its argument. The argument passed in your sample code is a WCHAR*, so sizeof(WCHAR*) is 4. If you had an array, such as WCHAR foo[23], and took sizeof(foo), the type passed is WCHAR[23], and it would yield sizeof(WCHAR) * 23. At compile time WCHAR* and WCHAR[23] are different types, and while you and I can see that the result of new WCHAR[23] is functionally equivalent to WCHAR[23], in actuality the return type is WCHAR*, with absolutely no size information.
As a corollary, since sizeof(new WCHAR[23]) equals 4 on your platform, you're obviously dealing with an architecture where a pointer is 4 bytes. If you built this on an x64 platform, you'd find that sizeof(new WCHAR[23]) returns 8.
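The difference is easy to see side by side. This sketch uses wchar_t in place of the Windows WCHAR typedef so it builds anywhere, and the function names are made up:

```cpp
#include <cassert>
#include <cstddef>

// Element count computed from a real array: the size is part of the type.
std::size_t count_from_array() {
    wchar_t arr[23] = {};
    return sizeof(arr) / sizeof(arr[0]);
}

// sizeof applied to a pointer only measures the pointer itself.
std::size_t size_of_pointer() {
    wchar_t arr[23] = {};
    wchar_t* p = arr;            // the 23 is no longer part of p's type
    assert(p == &arr[0]);
    return sizeof(p);
}
```

The first function returns 23 on every platform, while the second returns whatever a pointer occupies on the target.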
You wrote:
WCHAR* str = new WCHAR[23];
If 23 is meant to be a fixed value (one that does not vary over the life of your program), it's better to use #define or const than to hardcode 23.
#define STR_LENGTH 23
WCHAR* str = new WCHAR[STR_LENGTH];
size_t len = (size_t) STR_LENGTH;
or C++ version
const int STR_LENGTH = 23;
WCHAR* str = new WCHAR[STR_LENGTH];
size_t len = static_cast<size_t>(STR_LENGTH);
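Another option, if the standard library is available, is to let the container remember the length. A sketch using std::vector, with wchar_t standing in for WCHAR and make_buffer as a made-up helper:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a zero-filled buffer of 'length' wide characters.
std::vector<wchar_t> make_buffer(std::size_t length) {
    std::vector<wchar_t> str(length);   // elements are value-initialized to L'\0'
    assert(str.size() == length);
    return str;
}
```

str.size() then replaces the LengthOf macro, and str.data() still gives a wchar_t* for APIs that need one.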