I guess I'm still not understanding the limitations of C++ containers and arrays. According to this post and this one, it is impossible to store items of dynamic size in an STL vector.
However, with the following code I can dynamically resize an element of a vector, with the results one would expect if it were OK to have items of varying and changing size in a vector.
string test = "TEST";
vector<string> studentsV;
for (int i = 0; i < 5; ++i)
{
    studentsV.push_back(test);
}
studentsV[2].resize(100);
for (string s : studentsV)
{
    cout << s << "end" << endl;
}
Result:
TESTend
TESTend
TEST
end
TESTend
TESTend
I can re-size the string element to any size, and it works fine. I can also do the same with a regular C-style array. So, what is the difference between the above posts and what I am doing, and can you give an example of what "dynamic item size" really means, because apparently I am not understanding.
A std::string uses dynamic memory to increase the size of the string being stored. This is not what those articles are talking about.
What they mean is that sizeof(std::string) is constant. The actual object representing a std::string will always have the same size, but it might make additional allocations in another part of memory.
A std::vector is really just a friendly wrapper around a dynamically-sized array. The definition of an array in C or C++ is a contiguous block of memory where all elements are of equal size.
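To make the point about sizeof concrete, here is a minimal sketch. The function name string_footprint_is_constant is made up for illustration; it only checks that a short string and a very long one have the same object size.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true when a short and a very long std::string
// occupy the same number of bytes as objects, i.e. sizeof does not depend on
// how many characters the string currently holds.
bool string_footprint_is_constant() {
    std::string small = "TEST";
    std::string large(10000, 'x');  // 10000 characters, stored outside the object
    return sizeof(small) == sizeof(large)
        && sizeof(small) == sizeof(std::string);
}
```

The characters of the long string live in a separate allocation that the string object points to, which is why a vector can keep all its string elements at equal size.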
can you give an example of what "dynamic item size" really means, because apparently I am not understanding.
This is the core of your question.
Namely: if all C++ classes (even ones that manage dynamic memory as part of their implementations) have a fixed and known footprint size via sizeof()...just what sort of thing is it that you can't put in a std::vector?
Since something like a std::string and a std::bitset are classes of different sizes, you couldn't have a vector of [string string bitset string bitset string]. But the type system already wouldn't let you do that. So that can't be what they're talking about.
They're just saying there's no hook for supporting structures like this from the C world:
struct packetheader {
    int id;
    int filename_len;
};
struct packet {
    struct packetheader h;
    char filename[1];
};
You couldn't make a std::vector<packet> and expect to find some parameter to push_back letting you specify a per-item size. You'd lose any data you'd allocated outside of the structure boundary.
So to use something like that, you'd have to do std::vector<packet*> and store pointers.
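If you are free to change the struct, a different route from storing raw pointers is to let a std::string own the variable-length part. This is only a sketch under that assumption; Packet and make_packets are hypothetical names, not something from the answer above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical C++ counterpart of the C "packet": the variable-length part
// lives in a std::string, so every Packet object has the same sizeof and
// can be stored by value in a vector.
struct Packet {
    int id;
    std::string filename;
};

// Builds two packets whose file names differ in length.
std::vector<Packet> make_packets() {
    std::vector<Packet> packets;
    packets.push_back({1, "a.txt"});
    packets.push_back({2, "a/much/longer/path/to/some/file.txt"});
    return packets;
}
```

The elements still have a fixed size; only the memory each string points to varies.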
The size of std::string is not dynamic. std::string is probably implemented with a pointer to dynamically allocated memory. This makes sizeof(std::string) a constant, possibly different from the size of the actual string.
The following piece of code works fine. The problem is that I need it to work when the size of the array is unknown. In the example below I have hardcoded the size to 2. In the real world I do not know the size. Is there a way to modify the code so that it works even when the size of the array is not known?
void namesArray(std::string (&numList)[2], std::string name)
{
    //This is just place holder code. Please ignore the logic.
    numList[ 0 ] = "Peter" + name;
    numList[ 1 ] = "Bruce" + name;
}
int main()
{
    std::string nameList[2];
    namesArray( nameList, "Parker");
    std::cout << nameList[0]<< std::endl;
    std::cout << nameList[1] << std::endl;
    return 0;
}
I CANNOT use any other datatype (eg: Vectors) except Arrays due to external limitations.
Edit: When I say the size is unknown, what I mean is the size of the Array is not known until runtime.
Also, what I am presenting is an over simplification of my actual code. The function accepts only arrays.
UPDATE: Thank you all for the solutions offered. It looks like the code I had already written works in my solution. I know it's weird to use arrays when vectors offer more flexibility. However, when dealing with legacy code you sometimes don't have a choice. THANKS A LOT FOR ALL THE ANSWERS TO EVERYONE WHO RESPONDED. IT WAS VERY INFORMATIVE.
If I understood your question correctly, you can use a template to deduce the array size:
#include <iostream>
#include <string>
template <size_t N>
void namesArray(std::string (&numList)[N], std::string name) {
    //This is just place holder code. Please ignore the logic.
    numList[ 0 ] = "Peter" + name;
    numList[ 1 ] = "Bruce" + name;
}
int main() {
    std::string nameList[5];
    namesArray( nameList, "Parker");
    std::cout << nameList[0]<< std::endl;
    std::cout << nameList[1] << std::endl;
    return 0;
}
You have been presented with several solutions. I'd like to present another, an abstraction that incorporates several of them: take a gsl::span (or a std::span if you are from the future).
A span is a generalized view of a contiguous sequence of elements, and a powerful abstraction.
You want to pass an array of static size? A span can be constructed from one via a template constructor.
You want to pass a pointer and a size? span got you covered there as well.
A container like std::vector or std::array? No problem.
Use a span if all you care about is the sequence property, and not what the sequence itself is.
The simplest solution is to pass the size in with the array:
void namesArray(std::string *numList, std::size_t numCount, std::string name)
This will probably work for you -- you need to know the size of the array when you pass it in, but it doesn't require that you be working with a stack array from that scope, and in general usage you're more likely to have access to the size of the array than to be creating the array in the same scope that you're calling the function. It also makes things much more explicit and, if you switch to heap arrays instead of stack arrays, it still works fine.
Here's the thing: You always know the size of the array. It's literally impossible to write code that creates an array of an unknown size. You can (un)intentionally forget the array size, but at some point, it has to be known, because you literally can't create the array otherwise. You might only know it at runtime, because it's, say, defined by user input, but that just means it's in a variable, and you still know it, you just don't have it predefined at compile time.
Also, if the array size is defined at runtime, you're using heap arrays, and the other solution won't work for that.
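As a rough sketch of the pointer-plus-size approach with an array whose size is only known at runtime: the fill logic and the helper lastNameOf are placeholders invented for this example, and std::unique_ptr is used only to own the heap array.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Fills every element; the caller supplies the element count explicitly.
// The "Hero<i> <name>" text is placeholder logic.
void namesArray(std::string* numList, std::size_t numCount, const std::string& name) {
    for (std::size_t i = 0; i < numCount; ++i) {
        numList[i] = "Hero" + std::to_string(i) + " " + name;
    }
}

// Hypothetical helper: allocates an array whose size is only known at
// runtime, fills it, and returns its last element.
// Assumes runtimeCount is greater than zero.
std::string lastNameOf(std::size_t runtimeCount) {
    std::unique_ptr<std::string[]> names(new std::string[runtimeCount]);
    namesArray(names.get(), runtimeCount, "Parker");
    return names[runtimeCount - 1];
}
```

The same namesArray also accepts a stack array, for example std::string fixed[2]; namesArray(fixed, 2, "Wayne");, because the array decays to a pointer to its first element.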
I have this code.
int x[5];
printf("%d\n",sizeof(x) );
int *a;
a = new int[3];
printf("%d\n",sizeof(*a));
When I pass a 'static' array to sizeof(), it returns the dimension of the declared array multiplied by the number of bytes that the datatype uses in memory. However, a dynamic array seems to be different. My question is: what should I do to get the size of a 'dynamic' array?
PS: Could it be related to the following?
int *a;
a=new int[3];
a[0]=3;
a[1]=4;
a[2]=5;
a[3]=6;
Why can I write to a[3] if I supposedly set a limit with "a=new int[3]"?
When I pass a 'static' array to sizeof(), it returns the dimension of the declared array multiplied by the number of bytes that the datatype uses in memory.
Correct, that is how the size of the entire array is computed.
However, a dynamic array seems to be different.
This is because you are not passing a dynamic array; you are passing a pointer. A pointer is a data type whose size is independent of the size of the memory block it points to, hence you always get a constant value. When you allocate a dynamically sized memory block, you need to store the size of the allocation for future reference:
size_t count = 123; // <<== You can compute this count dynamically
int *array = new int[count];
cout << "Array size: " << (sizeof(*array) * count) << endl;
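Standard C++ does not have variable-length arrays: runtime-sized arrays were proposed for C++14 but were not adopted. Some compilers accept C99-style variable-length arrays as an extension, but portable code should not rely on them.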
Could it be related to the following? [...]
No, it is unrelated. Your code snippet shows undefined behavior (writing past the end of the allocated block of memory), meaning that your code is invalid. It could crash right away, lead to a crash later on, or exhibit other arbitrary behavior.
In C++ arrays do not have any intrinsic size at runtime.
At compile time one can use sizeof as you showed in order to obtain the size known to the compiler, but if the actual size is not known until runtime then it is the responsibility of the program to keep track of the length.
Two popular strategies are:
Keep a separate variable that contains the current length of the array.
Add an extra element to the end of the array that contains some sort of marker value that indicates that it's the last element. For example, if your array is known to be only of positive integers then you could use -1 as your marker.
If you do not keep track of the end of your array and you write beyond what you allocated then you risk overwriting other data stored adjacent to the array in memory, which could cause crashes or other undefined behavior.
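The two strategies above can be sketched as follows. The names lengthUntilSentinel, IntArray and sum are invented for this example.

```cpp
#include <cstddef>

// Strategy 2: count the elements up to (not including) a marker value.
// Assumes the marker is present and never occurs as real data.
std::size_t lengthUntilSentinel(const int* data, int sentinel) {
    std::size_t n = 0;
    while (data[n] != sentinel) {
        ++n;
    }
    return n;
}

// Strategy 1: the pointer and the length travel together (hypothetical struct).
struct IntArray {
    const int* data;
    std::size_t length;
};

// Uses the stored length to stay inside the array bounds.
int sum(const IntArray& a) {
    int total = 0;
    for (std::size_t i = 0; i < a.length; ++i) {
        total += a.data[i];
    }
    return total;
}
```

With values 3, 4, 5 followed by a -1 marker, lengthUntilSentinel reports 3 elements, and the same count can then be stored in an IntArray.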
Other languages tend to use the former strategy and provide a mechanism for obtaining the current record of the length. Some languages also allow the array to be dynamically resized after it's created, which usually involves creating a new array and copying over all of the data before destroying the original.
The vector type in the standard library provides an abstraction over arrays that can be more convenient when the size of the data is not known until runtime. It keeps track of the current array size, and allows the array to grow later. For example:
#include <cstdio>
#include <vector>
int main() {
    std::vector<int> a;
    a.push_back(3);
    a.push_back(4);
    a.push_back(5);
    a.push_back(6);
    printf("%d\n", a.size());
    return 0;
}
As a side-note, since a.size() (and sizeof(...)) returns a size_t, which isn't necessarily the same size as an int (though it happens to be on some platforms), using printf with %d is not portable. Instead, one can use iostream, which is also more idiomatic C++:
#include <iostream>
std::cout << a.size() << '\n';
You do not, at least not in standard C++. You have to keep track of it yourself, use an alternative to raw pointers such as std::vector that keeps track of the allocated size for you, or use a non-standard function such as _msize https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/msize?view=msvc-160 on Microsoft Windows or malloc_size https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/malloc_size.3.html on MacOS X.
I have a class in C++ like the following:
class myCls
{
public:
    myCls();
    void setAngle(float angle);
    void setArr(unsigned char arr[64]);
    unsigned char arr[64];
    double angle;
    int index;
    static float calcMean(const unsigned char arr[64]);
    static float sqrt7(float x);
};
Now in my main program I have a 3D vector of the class:
vector<vector<vector< myCls > > > obj;
The size of the vector also changes dynamically. My question is: how can I store the content of my vector in a file and retrieve it afterward?
I have tried many ways with no success. This is my attempt:
std::ofstream outFile;
outFile.open(fileName, ios::out);
for(int i=0;i<obj.size();i++)
{
    outFile.write((const char *)(obj.data()),sizeof(vector<vector<myCls> >)*obj.size());
}
outFile.close();
And for reading it:
vector<vector<vector<myCls>>> myObj;
if(inFile.is_open())
{
    inFile.read((char*)(myObj.data()),sizeof(vector<vector<myCls> >)*obj.size());
}
All I get is a runtime error.
Can anyone help me in this issue please?
If you don't care too much about performance, try boost::serialization. Since it already implements serialization functions for STL containers, you would only have to write the serialize function for myCls, and everything else comes for free. Since your member variables are all public, you can do that intrusively or non-intrusively.
Internally, a vector most usually consists of two numbers, representing the current length and the allocated length (capacity), as well as a pointer to the actual data. So the size of the “raw” object is fixed and approximately thrice the size of a pointer. This is what your code currently writes. The values the pointer points at won't be stored. When you read things back, you're setting the pointer to something which in most cases won't even be allocated memory, thus the runtime error.
In general, it's a really bad idea to directly manipulate the memory of any class which provides constructors, destructors or assignment operators. Your code writing to the private members of the vector would thoroughly confuse memory management, even if you took care to restore the pointed-at data as well. For this reason, you should only write simple (POD) data the way you did. Everything else should be serialized with custom code.
In the case of a vector, you'd probably store the length first, and then write the elements one at a time. For reading, you'd read the length, probably reserve memory accordingly, and then read elements one at a time. The boost::serialization templates suggested by Voltron will probably save you the trouble of implementing all that.
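A hand-rolled version of the "length first, then elements" scheme might look like the sketch below. Cell is a hypothetical plain-data stand-in for myCls, and the helpers writeSize, readSize, save, load and roundTrip are names made up here. The format uses the host byte order and is not portable between machines.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical plain-data element; stands in for a class like myCls.
struct Cell {
    unsigned char arr[64];
    double angle;
    int index;
};

using Grid = std::vector<std::vector<std::vector<Cell>>>;

// Lengths are stored as fixed-width integers in host byte order.
void writeSize(std::ostream& out, std::uint64_t n) {
    out.write(reinterpret_cast<const char*>(&n), sizeof n);
}

std::uint64_t readSize(std::istream& in) {
    std::uint64_t n = 0;
    in.read(reinterpret_cast<char*>(&n), sizeof n);
    return n;
}

// Writes each vector's length, then its contents, from the outside in.
void save(std::ostream& out, const Grid& grid) {
    writeSize(out, grid.size());
    for (const auto& plane : grid) {
        writeSize(out, plane.size());
        for (const auto& row : plane) {
            writeSize(out, row.size());
            // Cell is trivially copyable, so its bytes can be written directly.
            out.write(reinterpret_cast<const char*>(row.data()),
                      static_cast<std::streamsize>(row.size() * sizeof(Cell)));
        }
    }
}

// Reads the lengths back, resizes, then fills the elements.
Grid load(std::istream& in) {
    Grid grid(static_cast<std::size_t>(readSize(in)));
    for (auto& plane : grid) {
        plane.resize(static_cast<std::size_t>(readSize(in)));
        for (auto& row : plane) {
            row.resize(static_cast<std::size_t>(readSize(in)));
            in.read(reinterpret_cast<char*>(row.data()),
                    static_cast<std::streamsize>(row.size() * sizeof(Cell)));
        }
    }
    return grid;
}

// Round-trips a grid through an in-memory stream; a file stream opened with
// std::ios::binary would be used the same way.
Grid roundTrip(const Grid& grid) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    save(buffer, grid);
    return load(buffer);
}
```

Only the innermost level is written as raw bytes, because only Cell is plain data; the vectors themselves are rebuilt through resize rather than by copying their internal pointers.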
My question is with regard to C++
Suppose I write a function to return a list of items to the caller. Each item has 2 logical fields: 1) an int ID, and 2) some data whose size may vary, let's say from 4 bytes up to 16Kbytes. So my question is whether to use a data structure like:
struct item {
    int field1;
    char field2[MAX_LEN];
};
OR, rather, to allocate field2 from the heap, and require the caller to free it when done:
struct item {
    int field1;
    char *field2; // new char[N] -- delete[] when done!
};
Since the max size of field #2 is large, it makes sense that this would be allocated from the heap, right? So once I know the size N, I call field2 = new char[N], and populate it.
Now, is this horribly inefficient?
Is it worse in cases where N is always small, i.e. suppose I have 10000 items that have N=4?
You should instead use one of the standard library containers, like std::string or std::vector<char>; then you don't have to worry about managing the memory yourself.
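For illustration, a container-based item could be as simple as this sketch. The names Item and makeItems are hypothetical, and the 'x' fill is placeholder data.

```cpp
#include <cstddef>
#include <vector>

struct Item {
    int id;
    std::vector<char> data;  // owns its bytes; released automatically
};

// Hypothetical producer: returns `count` items, each carrying `bytesEach`
// bytes of placeholder data.
std::vector<Item> makeItems(std::size_t count, std::size_t bytesEach) {
    std::vector<Item> items;
    items.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        items.push_back(Item{static_cast<int>(i), std::vector<char>(bytesEach, 'x')});
    }
    return items;
}
```

The caller never calls delete[]; when the returned vector goes out of scope, every item's buffer is released with it.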
The thing that's horribly inefficient is all the time you will waste tracking down memory leaks. Use classes that take care of this for you.
But if you don't want to do that:
suppose I have 10000 items that have N=4?
So you waste 40k of memory - your PC has at least a gigabyte, probably two, don't worry about it. A consistent interface, even if you're doing new/delete, is better than something fancy that will be harder to debug.
The only time you can safely use fixed-size buffers in production code is when the sizes are compile-time system constants, such as MAX_PATH.
You could do both:
struct item {
    ...
    char *field2; // Points to buf if < 8 chars (assuming null-terminator).
    char buf[8];
};
This does require some clever copy semantics, so you'll need a custom copy-constructor and assignment operator.
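One possible shape for those copy semantics is sketched below. It is an assumption-laden illustration rather than the answerer's code: the class name Item, its members, and the rule "up to 8 bytes stay inline" are all chosen for this example.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical small-buffer item: data of up to 8 bytes is stored inside the
// object, larger data goes to the heap. Assumes `bytes` is a valid pointer.
class Item {
public:
    Item(int id, const char* bytes, std::size_t size)
        : id_(id), size_(size), data_(allocate(size)) {
        std::memcpy(data_, bytes, size);
    }

    // A copy must point at its OWN inline buffer, not at the source's.
    Item(const Item& other)
        : id_(other.id_), size_(other.size_), data_(allocate(other.size_)) {
        std::memcpy(data_, other.data_, size_);
    }

    Item& operator=(const Item& other) {
        if (this != &other) {
            // Acquire the new storage first, then release the old heap block.
            char* fresh = allocate(other.size_);
            if (data_ != buf_) delete[] data_;
            data_ = fresh;
            id_ = other.id_;
            size_ = other.size_;
            std::memcpy(data_, other.data_, size_);
        }
        return *this;
    }

    ~Item() {
        if (data_ != buf_) delete[] data_;
    }

    bool isInline() const { return data_ == buf_; }
    const char* data() const { return data_; }
    std::size_t size() const { return size_; }
    int id() const { return id_; }

private:
    // Returns the inline buffer when the data fits, otherwise a heap block.
    char* allocate(std::size_t size) {
        return size <= sizeof(buf_) ? buf_ : new char[size];
    }

    int id_;
    std::size_t size_;
    char* data_;
    char buf_[8];
};
```

The compiler-generated copy operations would copy the pointer verbatim, leaving the copy pointing into the original object's buffer (or double-deleting a heap block), which is why both are written by hand here.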
Alternatively, if item is always heap-allocated, you could ensure that item and its data are always allocated together:
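struct item {
    ...
    char field2[1]; // first byte of the variable-length data
};
item* new_item(int size) {
    // offsetof (from <cstddef>) avoids arithmetic on a null pointer, which is undefined behavior.
    std::size_t offset = offsetof(item, field2);
    // Placement new (from <new>) constructs the item inside the malloc'd block.
    return new(malloc(offset + size)) item;
}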
Actually it depends. As I see it:
statically sized buffer
    Good
        No need to manage memory
        Very efficient in terms of execution speed
    Bad
        Might waste some memory
dynamically sized buffer
    Good
        Does not have to waste any memory, as the exact amount needed is known
    Bad
        Memory must be managed.
        Might be slow(er)
With that in mind, and based on the situation (Is it likely sizes will vary much? Is execution speed extra important? ... ), pick one.
I need to craft a packet that has a header, a trailer, and a variable length payload field. So far I have been using a vector for the payload so my struct is set up like this:
struct a_struct{
    hdr a_hdr;
    vector<unsigned int> a_vector;
    tr a_tr;
};
When I try to access members of the vector I get a seg fault, and sizeof of the entire struct gives me 32 (after I've added about 100 elements to the vector).
Is this a good approach? What is better?
I found this post
Variable Sized Struct C++
He was using a char array, and I'm using a vector though.
Even though the vector type is inlined in the struct, the only member that is in the vector is likely a pointer. Adding members to the vector won't increase the size of the vector type itself but the memory that it points to. That's why you won't ever see the size of the struct increase in memory and hence you get a seg fault.
Usually when people want to make a variable-sized struct, they do so by adding an array as the last member of the struct and setting its length to 1. They then allocate more memory for the structure than sizeof() reports in order to "expand" the array. This is almost always accompanied by an extra member in the struct recording the size of the expanded array.
The reason for using 1 is thoroughly documented on Raymond's blog
http://blogs.msdn.com/oldnewthing/archive/2004/08/26/220873.aspx
The solution in the other SO answer is C-specific and relies on the peculiarities of C arrays; even in C, sizeof() won't help you find the "true" size of a variable-size struct. Essentially, it's cheating, and it's a kind of cheating that isn't necessary in C++.
What you are doing is fine. To avoid seg faults, access the vector as you would any other vector in C++:
a_struct a;
for(int i = 0; i < 100; ++i) a.a_vector.push_back(i);
cout << a.a_vector[22] << endl; // Prints 22
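If the packet eventually has to go out as one contiguous block of bytes, the pieces can be copied into a buffer at send time. The sketch below assumes made-up Header and Trailer layouts (the question does not show hdr or tr) and uses the host byte order; Packet and toBytes are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical header and trailer layouts, chosen only for this example.
struct Header  { std::uint32_t payloadCount; };
struct Trailer { std::uint32_t checksum; };

struct Packet {
    Header header;
    std::vector<std::uint32_t> payload;
    Trailer trailer;
};

// Copies header, payload elements, and trailer into one contiguous buffer.
// Values are copied in host byte order; no endianness conversion is done.
std::vector<unsigned char> toBytes(const Packet& p) {
    const std::size_t payloadBytes = p.payload.size() * sizeof(std::uint32_t);
    std::vector<unsigned char> out(sizeof(Header) + payloadBytes + sizeof(Trailer));
    unsigned char* cursor = out.data();
    std::memcpy(cursor, &p.header, sizeof(Header));
    cursor += sizeof(Header);
    if (payloadBytes != 0) {
        std::memcpy(cursor, p.payload.data(), payloadBytes);
    }
    cursor += payloadBytes;
    std::memcpy(cursor, &p.trailer, sizeof(Trailer));
    return out;
}
```

The in-memory struct keeps using a vector, and its sizeof stays fixed; only the serialized buffer grows with the payload.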
I saw this implementation in Boost. It looks like a really neat way to have a variable-length payload:
class msg_hdr_t
{
public:
    std::size_t len;        // Message length
    unsigned int priority;  // Message priority
    //!Returns the data buffer associated with this message
    void * data(){ return this+1; }
};
This may be totally unrelated to the question, but I wanted to share the info.