Conversion from vector to array: length becomes exactly 4 - c++

I have a vector of chars:
vector<char> bytesv;
I push 1024 chars to this vector in a loop (not important to the context) using push_back(char):
bytesv.push_back(c);
I know this vector holds exactly 1024 elements. It does indeed print 1024 when I do the following:
cout << bytesv.size() << "\n";
What I am trying to do: I need to transform this vector into a char array (char[]) of the same length and elements as the vector. I do the following:
char* bytes = &bytesv[0];
The problem: But when I print the size of this array, it prints 4, so the size is not what I expected:
cout << sizeof(bytes) << "\n";
Full code:
vector<char> bytesv;
for (char c : charr) { // Not important, there are 1024 chars in this array
bytesv.push_back(c);
}
cout << bytesv.size() << "\n";
char* bytes = &bytesv[0];
cout << sizeof(bytes) << "\n";
Prints:
1024
4
This obviously has to do with the fact that bytes is actually a char*, not really an array AFAIK.
The question: How can I safely transfer all the vector's contents into an array, then?

How can I safely transfer all the vector's contents into an array, then?
Allocate the required memory by using dynamic memory allocation.
size_t size = bytesv.size();
char* char_array = new char[size];
Copy the elements from the vector to the array.
for ( size_t i = 0; i < size; ++i )
char_array[i] = bytesv[i];
Make sure you deallocate the memory after you are done using it.
delete [] char_array;
Having said that, I noticed that you mentioned in a comment:
My ultimate goal is to save these bytes to a file, using fstream, which requires an array of chars as far as I am concerned.
You don't need to copy the contents of the vector to an array to save them to an fstream. The contents of a std::vector are guaranteed to be in contiguous memory. You can just use:
outStream.write(bytesv.data(), bytesv.size());

sizeof(bytes) is the size of the pointer, not what it points to. Also,
char* bytes = &bytesv[0];
doesn't transfer anything to an array; all you've done is save a pointer to the beginning of the underlying array in the std::vector.
To correctly move the data to an array you'll need to dynamically allocate an array. But the question is why would you do that? You already have a vector. It's like an array but about a billion times better.

How can I safely transfer all the vector's contents into an array, then?
There's no need to "transfer" (i.e. copy). You can access the vector's underlying storage as an array by using the data method.
char* arr = bytesv.data();
http://en.cppreference.com/w/cpp/container/vector/data
bytes is actually a char*, not really an array
The char* is not an array but a pointer to the first value in the array. You can get the number of elements in the array from bytesv.size()
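As a rough sketch of how the pointer and the length travel together, the hypothetical function sumBytes below takes a char pointer plus an element count, the way a C-style API would. Both function names are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical C-style consumer: it only sees a pointer, so the caller
// must supply the element count separately.
long sumBytes(const char* data, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

// Hands the vector's own storage to the function above; nothing is copied.
long sumVector(const std::vector<char>& bytesv) {
    return sumBytes(bytesv.data(), bytesv.size());
}
```

The pointer stays valid only as long as the vector is alive and is not resized.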

sizeof(bytes) is always 4 because bytes is a pointer and you are using a machine that uses 4-byte pointers.
You already know that you have 1024 bytes; just use that knowledge.

First of all: you need to copy the content of the vector to the array - otherwise you can't access elements of your array, when the vector is gone. So you need to allocate memory to your array (not just defining a pointer).
char* bytes = new char[bytesv.size()];
for (size_t i = 0; i < bytesv.size(); ++i) {
bytes [i] = bytesv.at(i);
}
//...
delete[] bytes;
Secondly, sizeof doesn't do what you expect: it doesn't report the length of an array but the size of a type or object, which here is a pointer. For stack-allocated arrays you can use sizeof(array)/sizeof(array[0]) to determine the element count, but since sizeof is a compile-time operator it cannot know the size of dynamically allocated arrays or vectors.
If you use an array, you need a separate variable to store its length (alternatively, if the length is known at compile time, you could use std::array instead).
#include <cstdint>   // uint8_t
#include <iostream>
#include <vector>
int main(int argc, char* argv[]){
std::vector<uint8_t> bytesv = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06 };
size_t bytesLength = bytesv.size();
char* bytes = new char[bytesLength];
for (size_t i = 0; i < bytesv.size(); ++i) {
bytes[i] = bytesv.at(i);
}
//...
std::cout << bytesLength << std::endl;
delete[] bytes;
return 0;
}
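For the std::array alternative, here is a small sketch that assumes the target length N is known at compile time. The helper name toArray is hypothetical; if the vector is shorter than N, the remaining elements stay zero.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies up to N elements of `v` into a std::array.
// Unlike a raw pointer, the result still knows its own length via size().
template <std::size_t N>
std::array<char, N> toArray(const std::vector<char>& v) {
    std::array<char, N> out{};   // value-initialised, so all elements start at 0
    std::copy_n(v.begin(), std::min(N, v.size()), out.begin());
    return out;
}
```

A call such as toArray<1024>(bytesv) yields an object whose size() is 1024, and the array is destroyed automatically, so no delete[] is involved.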

That's because the size of any pointer is 4 on a typical 32-bit target. What
char* bytes = &bytesv[0];
is giving you is a pointer to the first element, not an array of chars.
Now, if you used:
char (*bytes)[1024] = (char (*)[1024])&bytesv[0];
std::cout << sizeof(bytes) << " " << sizeof(*bytes);
that would give you a pointer to a char[1024] array, so sizeof(*bytes) would be 1024 while sizeof(bytes) would still be the size of a pointer.

Related

c++ read into c-style string one char at a time?

In C++ I'd like to read input into a C-style string one character at a time. How do you do this without first creating a char array with a set size (you don't know how many chars the user will enter)? And since you can't resize the array, how is this done? I was thinking of something along these lines, but this does not work.
char words[1];
int count = 0;
char c;
while(cin.get(c))
{
words[count] = c;
char temp[count+1];
count++;
words = temp;
delete[] temp;
}
Since you cannot use std::vector, I am assuming you cannot use std::string either. If you can use std::string, you can use the solution provided in the answer by @ilia.
Without that, your only option is to:
Use a pointer that points to dynamically allocated memory.
Keep track of the size of the allocated array. If the number of characters to be stored exceeds the current size, increase the array size, allocate new memory, copy the contents from the old memory to new memory, delete old memory, use the new memory.
Delete the allocated memory at the end of the function.
Here's what I suggest:
#include <iostream>
int main()
{
size_t currentSize = 10;
// Always make space for the terminating null character.
char* words = new char[currentSize+1];
size_t count = 0;
char c;
while(std::cin.get(c))
{
if ( count == currentSize )
{
// Allocate memory to hold more data.
size_t newSize = currentSize*2;
char* temp = new char[newSize+1];
// Copy the contents from the old location to the new location.
for ( size_t i = 0; i < currentSize; ++i )
{
temp[i] = words[i];
}
// Delete the old memory.
delete [] words;
// Use the new memory
words = temp;
currentSize = newSize;
}
words[count] = c;
count++;
}
// Terminate the string with a null character.
words[count] = '\0';
std::cout << words << std::endl;
// Deallocate the memory.
delete [] words;
}
You asked for a C-style array. Neither stack nor dynamic allocation will resize it for you: the array's size would have to change every time you add an element, and that does not happen automatically. You would have to work around it by deleting and reallocating the array each time a new char is read. So you have two options:
Use std::vector (which was created for this purpose)
Duplicate what std::vector does internally and write it yourself in your code (which seems terrible)
std::vector solution:
std::vector<char> words;
words.reserve(ESTIMATED_COUNT); // if you do not know the estimated count, just delete this line
char c;
while(cin.get(c)){
words.push_back(c);
}
#include <iostream>
#include <string>
using namespace std;
int main()
{
string s1;
char c;
while (cin.get(c))
{
if (c == '\n')
continue;
s1 += c;
cout << "s1 is: " << s1.c_str() << endl; //s1.c_str() returns c style string
}
}
You have two ways. The first is to use a zero-size array: after each input you delete the array, allocate a new one that is one element bigger, and then store the input. This uses less memory but is inefficient. (In C, you can use realloc to improve efficiency.)
The second is to use a buffer: for example, you store the input you read in a fixed-size array and, when it gets full, you append the buffer to the end of the main array (by deleting and re-allocating).
By the way, you can use std::vector which grows the size of itself automatically and efficiently.
If you're set on using c-style strings then:
char* words = nullptr; // start as null so the first delete[] below is safe
int count = 0;
char c;
while(cin.get(c))
{
// Create a new array and store the value at the end.
char* temp = new char[++count];
temp[count - 1] = c;
// Copy over the previous values. Notice the -1.
// You could also replace this FOR loop with a memcpy().
for(int i = 0; i < count - 1; i++)
temp[i] = words[i];
// Delete the data in the old memory location.
delete[] words;
// Point to the newly updated memory location.
words = temp;
}
I would do what Humam Helfawi suggested and use std::vector<char> since you are using C++; it will make your life easier. The implementation above is basically a less elegant version of vector. If you don't know the size beforehand, you will have to resize memory.
You need to allocate a string buffer of arbitrary size. Then, if the maximum number of characters is reached upon appending, you need to enlarge the buffer with realloc.
In order to avoid calling realloc at each character, which is not optimal, a growth strategy is recommended, such as doubling the size at each allocation. There are even more fine-tuned growth strategies, which depend on the platform.
Then, at the end, you may use realloc to trim the buffer to the exact number of appended bytes, if necessary.

Are character arrays the only type that have some form of null-termination?

I've read some of my friend's code and seen functions like this:
int foo(int* arr, int n)
{
// ...
}
which he then calls like this:
int myArr [] = {69, 69, 69, 69, 69};
int f = foo(myArr, sizeof(myArr)/sizeof(int));
Now, I understand that sizeof(myArr)/sizeof(int) is dividing the size of myArr in bytes by the size of an int in bytes, thus returning the length of myArr. However, I don't understand how sizeof(myArr) is implemented unless there's some sort of generic null element that terminates arrays and then sizeof(...) works similarly to how strlen(...) works:
size_t strlen(char* c)
{
size_t k = 0;
while (*c != '\0')
{
++k;
++c;
}
return k;
}
Now, if sizeof(...) does work similar to that, then I don't see why, when passing an array to a function, you can't simply do
int foo(int* arr)
{
int n = sizeof(arr)/sizeof(int);
// ....
}
which is simpler way of writing functions because the array is essentially being passed in as a single unit that gets unpacked.
My guess is that arrays of non-character type don't have the null-termination property that character arrays do. In that case, how does sizeof(...) work? And what is the point of null-termination in character arrays anyhow? Why are they created differently than any other array?
Anyways, I was wondering whether someone could clear up all the obvious confusion that I have.
sizeof works on arrays because the compiler knows the length at compile time. If you pass that array to a function, it turns into a pointer, at which point the compiler doesn't know the full size of the array anymore.
For example:
#include <iostream>
void printPointerSize(int* a) {
// a is a pointer, and all pointers are 8 bytes (64 bits) on my machine
std::cout << "int* pointer argument has size: " << sizeof a << std::endl;
}
int main() {
// the compiler determines from the initializer that this is an int[5]
int a[] = {1, 2, 3, 4, 5};
// since the compiler knows that a is an int[5],
// then sizeof a is 5 * sizeof(int)
std::cout << "int[5] has size: " << sizeof a << std::endl;
printPointerSize(a);
}
Output (on a platform with 64-bit pointers and 32-bit integers):
int[5] has size: 20
int* pointer argument has size: 8
Note that if you try to create a function that takes an array as an argument, the compiler will just turn it into a pointer anyway:
void printPointerSize(int a[5]) {
// this will print the size of a pointer,
// not the size of a 5-element int array
std::cout << "int[5] argument has size: " << sizeof a << std::endl;
}
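If the function really needs the length, one option is to take the array by reference so that it does not decay to a pointer. The sketch below uses a hypothetical helper named arrayLength; the template parameter N is deduced by the compiler from the argument's array type.

```cpp
#include <cstddef>

// Hypothetical helper: because the parameter is a reference to an array,
// no decay to int* happens and the compiler can deduce N.
template <std::size_t N>
std::size_t arrayLength(const int (&)[N]) {
    return N;
}
```

This only works for true arrays. Passing an int* to arrayLength fails to compile, which is usually preferable to silently getting the size of a pointer.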
In addition to Brandon's answer, you need to distinguish between array capacity and array size.
Array Capacity
Array Capacity is the maximum number of items the array can hold. There can be from 0 to capacity items in the array, which brings up the question, "How many items are in the array?"
Array Size
Array Size is the number of valid items in the array. An array that has a capacity of 20 items, may only have 3 valid items in it.
Example:
char array[20];
array[0] = 'M';
array[1] = 'e';
array[2] = 'a';
array[3] = 't';
The above array has a capacity of 20, but only 4 valid items.
Repeating the question, "How many items are in the array?"
The C++ language does not maintain the number of items in an array. The sizeof operator returns the capacity of an array, but not the size.
The size of an array must be maintained in a separate variable. Some crafty programmers can use an array slot to maintain this value. So when passing an array, you will have to pass: the array (or pointer to it), the capacity and the size. Most starting programmers forget about the size parameter, which leads to many difficult defects that are hard to find.
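A minimal sketch of passing all three pieces together is shown below. The function name appendItem and its signature are assumptions made for this illustration.

```cpp
#include <cstddef>

// Hypothetical helper: appends `value` to a C-style array.
// `capacity` is how many items the array can hold,
// `size` is how many valid items it currently holds (updated on success).
bool appendItem(char* array, std::size_t capacity, std::size_t& size, char value) {
    if (size >= capacity) {
        return false;            // array is full, nothing is written
    }
    array[size] = value;
    ++size;
    return true;
}
```

Checking size against capacity before every write is what prevents the out-of-bounds defects mentioned above.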

What's going on with my memory?

I have a function that allocated a buffer for the size of a file with
char *buffer = new char[size_of_file];
Then I loop over the buffer and copy some of the bytes into a subbuffer to work with smaller units of it.
char *subbuffer = new char[size+1];
for (int i =0; i < size; i++) {
subbuffer[i] = (buffer + cursor)[i];
}
Next I call a function and pass it this subbuffer, an arbitrary cursor for a location in the subbuffer, and the size of the text to be extracted.
wchar_t* FileReader::getStringForSizeAndCursor(int32_t size, int cursor, char *buffer) {
int wlen = size/2;
#if MARKUP_SIZEOFWCHAR == 4 // sizeof(wchar_t) == 4
uint32_t *dest = new uint32_t[wlen+1];
#else
uint16_t *dest = new uint16_t[wlen+1];
#endif
char *bcpy = new char[size];
memcpy(bcpy, (buffer + cursor), size+2);
unsigned char *ptr = (unsigned char *)bcpy; //need to be careful not to read outside the buffer
for(int i=0; i<wlen; i++) {
dest[i] = (ptr[0] << 8) + ptr[1];
ptr += 2;
}
//cout << "size:: " << size << " wlen:: " << wlen << " c:: " << c << "\n";
dest[wlen] = ('\0' << 8) + '\0';
return (wchar_t *)dest;
}
I store this in a value as the property of a struct whilst looping through the file.
My issue seems to be when I free subbuffer, and start reading the title properties of my structs by looping over an array of struct pointers, my app segfaults. GDB tells me it finished normally though, but a bunch of records that I cout are missing.
I suspect this has to do with function scope or something similar. I thought the memcpy in getStringForSizeAndCursor would fix the segfault, since it copies the bytes out of subbuffer before I free it. Right now I would expect those copies to be cleaned up by my struct's destructor, but either things are being destroyed earlier than I expect or some memory is still pointing to the original subbuffer. If I let subbuffer leak I get back the data I expected, but that is not a solution.
The only definite error I can see in your question's code is the too small allocation of bcpy, where you allocate a buffer of size size and promptly copy size+2 bytes to the buffer. Since you're not using the extra 2 bytes in the code, just drop the +2 in the copy.
Besides that, I can only see one suspicious thing, you're doing;
char *subbuffer = new char[size+1];
and copying size bytes to the buffer. The allocation hints that you're allocating extra memory for a zero terminator, but either it shouldn't be there at all (no +1) or you should allocate 2 bytes (since your function hints at a double-byte character set). Either way, I can't see you zero-terminating it, so using it as a zero-terminated string will probably break.
@Grizzly in the comments has a point too: allocating and handling memory for strings and wstrings is probably something you could "offload" to the STL with good results.
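As one possible sketch of letting the standard library own the memory, the function below decodes byte pairs into a std::u16string. It assumes, as the question's loop suggests, that each character is stored as two bytes with the high byte first. The name decodeBigEndianPairs is hypothetical, and the out-of-range clamping is a design choice made here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decodes `size` bytes starting at `cursor` as
// big-endian 16-bit units. Ranges past the end of the buffer are clamped,
// so the function never reads outside `buffer`.
std::u16string decodeBigEndianPairs(const std::vector<char>& buffer,
                                    std::size_t cursor, std::size_t size) {
    std::u16string out;
    if (cursor > buffer.size()) return out;
    if (size > buffer.size() - cursor) size = buffer.size() - cursor;

    out.reserve(size / 2);
    for (std::size_t i = 0; i + 1 < size; i += 2) {
        unsigned char hi = static_cast<unsigned char>(buffer[cursor + i]);
        unsigned char lo = static_cast<unsigned char>(buffer[cursor + i + 1]);
        out.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return out;   // the string owns its memory; no new[] or delete[] needed
}
```

Because the result is returned by value, there is no subbuffer to free and no pointer that can outlive the data it refers to.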

How to create a jagged string array in c++?

I want to create jagged character two dimensional array in c++.
int arrsize[3] = {10, 5, 2};
char** record;
record = (char**)malloc(3);
cout << endl << sizeof(record) << endl;
for (int i = 0; i < 3; i++)
{
record[i] = (char *)malloc(arrsize[i] * sizeof(char *));
cout << endl << sizeof(record[i]) << endl;
}
I want to use record[0] for the name (10 letters), record[1] for the marks (a 5-digit mark) and record[2] for the Id (a 2-digit number). How can I implement this? I write the record array directly to a binary file. I don't want to use struct or class.
In C++ it would look like this:
std::vector<std::string> record;
Why would you not use a struct when it is the sensible solution to your problem?
struct record {
char name[10];
char mark[5];
char id[2];
};
Then writing to a binary file becomes trivial:
record r = get_a_record();
write( fd, &r, sizeof r );
Notes:
You might want to allocate a bit of extra space for NUL terminators, but this depends on the format that you want to use in the file.
If you are writing to a binary file, why do you want to write mark and id as strings? Why not store an int (4 bytes, greater range of values) and an unsigned char (1 byte)?
If you insist on not using a user defined type (really, you should), then you can just create a single block of memory and use pointer arithmetic, but beware that the binary generated by the compiler will be the same, the only difference is that your code will be less maintainable:
char record[ 10+5+2 ];
// copy name to record
// copy mark to record+10
// copy id to record+15
write( fd, record, sizeof record);
Actually the right pattern “to malloc” is:
T * p = (T *) malloc(count * sizeof(T));
where T could be any type, including char *. So the right code for allocating memory in this case is like that:
int arrsize[3] = { 10, 5, 2 };
char** record;
record = (char**) malloc(3 * sizeof(char *));
cout << sizeof(record) << endl;
for (int i = 0; i < 3; ++i) {
record[i] = (char *) malloc(arrsize[i] * sizeof(char));
}
I removed the line printing sizeof(record[i]) because it always yields the size of a single pointer to char (4 on my laptop). sizeof is evaluated at compile time and has no idea how much memory was allocated at run time for the block that record[i] (which is really a pointer of type char *) points to.
malloc(3) allocates 3 bytes. Your jagged array would be an array containing pointers to character arrays. Each pointer usually takes 4 bytes (on a 32-bit machine), but more correctly sizeof(char*), so you should allocate using malloc(3 * sizeof(char*) ).
And then record[i] = (char*)malloc((arrsize[i]+1) * sizeof(char)), because a string is a char* and a character is a char, and because each C-style string is conventionally terminated with a '\0' character to indicate its length. You could do without it, but it would be harder to use for instance:
strcpy(record[0], name);
sprintf(record[1], "%0.2f", mark);
sprintf(record[2], "%d", id);
to fill in your record, because sprintf puts in a \0 at the end. I assumed mark was a floating-point number and id was an integer.
As regards writing all this to a file, if the file is binary why put everything in as strings in the first place?
Assuming you do, you could use something like:
ofstream f("myfile",ios_base::out|ios_base::binary);
for (int i=0; i<3; i++)
f.write(record[i], arrsize[i]);
f.close();
That being said, I second Anders' idea. If you use STL vectors and strings, you won't have to deal with ugly memory allocations, and your code will probably look cleaner as well.

length of array in c++

I read to get the length of array in C++, you do this:
int arr[17];
int arrSize = sizeof(arr) / sizeof(int);
I tried to do the same for a string:
where I have
string * arr;
arr = new (nothrow) string [213561];
And then I do
arr[k] = "stuff";
where I loop through each index and put "stuff" in it.
Now I want the size of the array which should be 213561, what's the correct way to do it and why is it so complex in C++?
What you are trying to do cannot work because sizeof works on types at compile-time (and pointer types never hold the size of the array they may be pointing to).
In your case, computing sizeof(arr) returns the size taken in memory by the pointer, not
size of the array * size of a std::string
I suggest you use one of these two options
either use fixed-size arrays (sizeof works)
or vectors (myVector.size() returns what you need)
... unless you have a good reason not to.
The correct way of doing this in C++ is to use a vector. That way you can either specify a size up-front, or resize it as you go.
Specifying size up-front:
using namespace std;
vector<string> arr(213561);
for (vector<string>::iterator p = arr.begin(); p != arr.end(); ++p)
{
*p = "abc";
}
Expanding the vector as you go:
using namespace std;
vector<string> arr; // <-- note, default constructor
for (int i = 0; i < 213561; ++i)
{
// add elements to the end of the array, automatically reallocating memory if necessary
arr.push_back("abc");
}
Either way, the size of the array is found with:
size_t elements = arr.size(); // = 213561
The sizeof method only works as long as your array is really an array, i.e. an object that has the array type. In your first example object arr has type int[17]. It is an array type, which means that you can use the sizeof method and get 17 as the result.
Once you convert your array type T[N] to a pointer type T *, you basically lose your array type. The sizeof method applied to a pointer will not evaluate to the size of the original array.
When you allocate an array of type T[N] with new[], the result is a pointer of type T * right away. It is not an array type from the very beginning. The information about the array size is lost right away, and trying to use the sizeof method with such a pointer will not work. In order to preserve the size information about a dynamically allocated, run-time sized array, you have to store it in a separate variable yourself.
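One way to keep that separate variable from drifting away from the pointer is to bundle the two. The sketch below is only an illustration: the struct StringArray and the function makeStringArray are hypothetical names, and std::unique_ptr is used here so that delete[] happens automatically.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical bundle: the pointer returned by new[] plus the length
// that the pointer itself cannot report.
struct StringArray {
    std::unique_ptr<std::string[]> items;
    std::size_t length;
};

// Allocates `length` default-constructed strings and records the count.
StringArray makeStringArray(std::size_t length) {
    StringArray result;
    result.items.reset(new std::string[length]);
    result.length = length;
    return result;
}
```

This is essentially a small part of what std::vector already does, which is why a vector is usually the simpler choice.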
Here is how you find the size of an array:
const size_t ARRAY_SIZE = 17;
int array[ARRAY_SIZE];
//...
std::cout << "My array size is: " << ARRAY_SIZE << "\n";
You can put ARRAY_SIZE into a header so that other translation units can access the array size.
If you want a dynamic array, that will grow as needed, try std::vector.
You need to keep track of the length using a separate variable. There is no way of getting the length of an area that you only have a pointer to, unless you store that length somewhere.
You cannot get the length of the allocated array.
What you can do is save it separately at the time of allocation.
Also, you could check the length of each string (which isn't what you're asking, but still) using its size() member function, or strlen() for a C-style string.
In C++, arr here is simply a pointer to the first element of the array. For dynamic arrays, getting the length from that pointer is not possible.
There is a subtle nuance in both C and C++ with memory allocation. Neither language supports dynamic arrays. Here is what you are seeing:
int ary[17];
int arrSize = sizeof(ary) / sizeof(ary[0]);
Here ary is a true array of 17 integers. The array size calculation works because sizeof(ary) returns the size of the memory block allocated for the entire array. You divide this by the size of each element and voilà, you have the number of elements in the array.
std::string * arr;
arr = new (std::nothrow) std::string[213561];
In this case arr is a pointer to some memory. The new operator allocates a block of memory large enough to hold 213,561 contiguous std::string objects and constructs each of them into the memory. The arr variable simply points to the beginning of the block of memory. C++ does not track the number of elements that you have allocated. You didn't really create a dynamic array - instead, you have allocated enough memory for a bunch of contiguous objects.
C and C++ both allow you to apply the subscripting operator to a pointer as syntactical sugar. You will see a lot of comments about how arr[0] translates into *(arr + 0). The reality is that allocating memory using the new operator results in a block of memory that is not an array at all. The syntactical sugar makes it look like one. The next thing that you will encounter is that multi-dimensional arrays are similar sugar.
Consider the following snippet. Once you understand what is going on there, you will be a lot closer to understanding how memory works. This is the primary reason why C and C++ cannot tell you how large an array is if it is dynamically allocated - it does not know the size, all that it has is a pointer to the allocated memory.
#include <iostream>
int
main()
{
//
// The compiler translates array subscript notation into
// pointer arithmetic in simple cases so "hello"[3] is
// translated into *("hello" + 3). Since addition is
// commutative, the order of "hello" and 3 are irrelevant.
//
std::cout
<< "\"hello\"[3] = '" << "hello"[3] << "'\n"
<< "3[\"hello\"] = " << 3["hello"] << "\n"
<< std::endl;
//
// All memory is linear in C or C++. So a 3x3 array of
// integers is a contiguous block of 9 integers in row
// major order. The following snippet prints out the 3x3
// identity matrix using row and column syntax.
//
int ary[3][3] = { { 1, 0, 0 },
{ 0, 1, 0 },
{ 0, 0, 1 } };
for (int r=0; r<3; ++r) {
for (int c=0; c<3; ++c) {
std::cout << "\t" << ary[r][c];
}
std::cout << "\n";
}
std::cout << "\n";
//
// Since memory is linear, we can also access the same
// 3x3 array linearly through a pointer. The inner loop
// is what the compiler is doing when you access ary[r][c]
// above - "ary[r][c]" becomes "*(ptr + (r * 3) + c)"
// since the compiler knows the dimensions of "ary" at
// compile time.
//
int *ptr = &ary[0][0];
for (int i=0; i<9; ++i) {
ptr[i] = i;
}
for (int r=0; r<3; ++r) {
for (int c=0; c<3; ++c) {
std::cout << "\t" << *(ptr + (r * 3) + c);
}
std::cout << "\n";
}
return 0;
}