I am creating a struct called student. In order to store the name, is there anything wrong with just declaring a char pointer in the struct instead of a char array with a predefined size? I can then assign a string literal to the char pointer in the main code.
struct student
{
int ID;
char* name;
};
It really depends on your use case. As suggested above you should use std::string in C++. But if you are using C-style strings, then it depends on your usage.
With a char[] of fixed size you avoid null-pointer errors and other pointer-related problems such as memory leaks and dangling pointers, but you may not make optimal use of memory. For example, you might define
#define MAX_SIZE 100
struct student
{
int ID;
char name[MAX_SIZE];
};
And then
#define STUDENT_COUNT 50
struct student many_students[STUDENT_COUNT];
But the names will have different lengths, and in many cases they will be much shorter than MAX_SIZE, so a lot of memory is wasted.
In other cases a name might be longer than MAX_SIZE, and you would have to truncate it to avoid memory corruption.
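A minimal sketch of the truncation mentioned above, assuming the fixed-size struct from this answer. The helper name set_name and its bool return value are my own additions for illustration; snprintf is used because it never writes more than the buffer size and always terminates the string.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

#define MAX_SIZE 100

struct student
{
    int ID;
    char name[MAX_SIZE];
};

// Copies src into s->name, truncating if it does not fit.
// Returns true if the whole name fit, false if it was truncated.
bool set_name(student *s, const char *src)
{
    assert(s != nullptr && src != nullptr);
    // snprintf writes at most sizeof s->name bytes, including the '\0',
    // and returns the length the complete string would have needed.
    int needed = std::snprintf(s->name, sizeof s->name, "%s", src);
    return needed >= 0 && static_cast<std::size_t>(needed) < sizeof s->name;
}
```

The caller can use the return value to decide whether a truncated name is acceptable or should be reported as an error.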
In the other case, where we use char*, no memory is wasted because we allocate only the required amount, but we must take care of allocating and freeing the memory ourselves.
struct student
{
int ID;
char *name;
};
Then while storing name we need to do something like this:
struct student many_student[STUDENT_COUNT];
int i;
for( i=0; i<STUDENT_COUNT; i++) {
// some code to get student name
many_student[i].name = (char*)malloc((name_length + 1) * sizeof(char));
// Now we can store name
}
// Later when name is no longer required free it
free(many_student[some_valid_index_to_free].name);
// also set it to NULL, to avoid dangling pointers
many_student[some_valid_index_to_free].name = NULL;
Also, if you allocate memory for name again, free the previously allocated memory first to avoid a memory leak. Another thing to consider is checking pointers for NULL before use, i.e., you should always check as
if(many_students[valid_index].name!=NULL) {
// do stuff
}
You can create macros to do this, but these are the basic overheads that come with pointers.
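To make the bookkeeping above concrete, here is a rough sketch of two helpers that free the old name before storing a new one and reset the pointer after freeing. The function names student_set_name and student_clear_name are hypothetical, and the sketch assumes the name is either NULL or was allocated with malloc.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

struct student
{
    int ID;
    char *name;   // owning pointer: NULL or memory obtained from malloc
};

// Replaces s->name with a heap copy of src. Returns false if allocation
// fails, in which case the old name is left untouched.
bool student_set_name(student *s, const char *src)
{
    assert(s != nullptr && src != nullptr);
    std::size_t len = std::strlen(src);
    char *copy = static_cast<char *>(std::malloc(len + 1));
    if (copy == nullptr)
        return false;
    std::memcpy(copy, src, len + 1);   // len + 1 copies the '\0' as well
    std::free(s->name);                // free(NULL) is a no-op
    s->name = copy;
    return true;
}

// Releases the name and resets the pointer so it cannot dangle.
void student_clear_name(student *s)
{
    assert(s != nullptr);
    std::free(s->name);
    s->name = nullptr;
}
```

Allocating the new copy before freeing the old one means a failed allocation does not lose the existing name.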
Another advantage of pointers is that if many names are identical, several pointers can point to the same string and save memory, whereas with arrays each student has separate storage, e.g.,
// IF we have a predefined global name array
char *GLOBAL_NAMES[] = {"NAME_1", "NAME_2", "NAME_3", "NAME_4", ... , "NAME_N"};
// using pointers, just need to assign name to correct pointer in array
many_student[valid_index_1].name = GLOBAL_NAMES[INDEX_NAME_1];
many_student[valid_index_2].name = GLOBAL_NAMES[INDEX_NAME_1];
// In case of array we would have had to copy.
This might not be your case, but it shows that pointers can help avoid extra memory usage.
Hope it will help you :)
Don't use either, use std::string. I (and many others) guarantee that compared to either char* or char[]:
it will be easier to use and
it will be less prone to bugs.
The difference is the same as the difference between static and dynamic memory allocation. With the former (static) you have to specify a size large enough to store the name, whereas with the latter you have to remember to free it when it is no longer needed.
Still, it is always better to use std::string.
Unless there is a strong reason to not do so, I'd suggest you to use a convenient string class like std::string, instead of a raw char* pointer.
Using std::string will simplify your code a lot, e.g. the structure will be automatically copyable, the strings will be automatically released, etc.
A reason why you could not use std::string is because you are designing an interface boundary, think of e.g. Win32 APIs which are mainly C-interface-based (implementation can be in C++), so you can't use C++ at the boundary and instead must use pure C.
But if that's not the case, do yourself a favor and use std::string.
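As a small illustration of the "automatically copyable" point, here is a sketch of the struct with a std::string member. The helper clone_with_id is hypothetical and exists only to show that copying the struct deep-copies the name without any manual allocation.

```cpp
#include <cassert>
#include <string>

struct Student
{
    int ID;
    std::string name;   // owns its characters; released automatically
};

// Returns an independent copy with a different ID. The name is deep-copied
// by std::string's copy constructor.
Student clone_with_id(const Student &original, int newId)
{
    assert(newId != original.ID);   // assumption of this sketch: IDs are unique
    Student copy = original;
    copy.ID = newId;
    return copy;
}
```

Modifying the copy's name afterwards leaves the original untouched, which is exactly what a raw char* member would not give you for free.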
Note also that in case you must use a raw char* pointer, you have several design questions to clarify, e.g.:
Is this an owning pointer, or an observing pointer?
If it's an owning pointer, in what way is it allocated, and in what way is it released? (e.g. malloc()/free(), new[]/delete[], some other allocator like COM CoTaskMemAlloc(), SysAllocString(), etc.)
If it's an observing pointer, you must make sure that the observed string's lifetime exceeds that of the observing pointer, to avoid dangling references.
All these questions are just non-existent if you use a convenient string class (like e.g. std::string).
Note also that, as some Win32 data structures do, you can have a maximum-sized string buffer inside your structure, e.g.
struct Student
{
int ID;
char Name[60];
};
In this case you could use C functions like strcpy(), or safer variants, to deep-copy a source string into the Name buffer. In that case you have good locality since the name string is inside the structure, and a simplified memory management with respect to the raw char* pointer case, but at the cost of having a pre-allocated memory buffer.
This may or may not be a better option for you, depending on your particular programming context. Anyway, keep in mind that this is a more C-like approach; a better C++ approach would be to just use a string class like std::string.
TL;DR - use std::string, as we're talking in c++.
EDIT: Previously, as per the C tag (currently removed)
As per your requirement, assigning a string literal needs a pointer; you cannot do that with an array anyway, because the base address of an array allocated at compile time cannot be changed, so the assignment won't work.
If you're using that pointer to store the base address of a string literal, then it is OK. Otherwise, you need to
allocate memory before using that pointer
deallocate memory once you're done with it.
Use the std::string library. It is easier to work with and has far more functionality than the built-in counterparts.
Related
I have the following C++ function that reads a string from the flash memory and returns it.
I want to avoid using the String class because this is on an Arduino and I've been advised that the String class for the Arduino is buggy.
const char* readFromFlashMemory()
{
char s[FLASH_STOP_ADDR-FLASH_START_ADDR+2];
memset(s, (byte)'\0', FLASH_STOP_ADDR-FLASH_START_ADDR+2);
unsigned short int numChars = 0;
for (int i = FLASH_START_ADDRESS; i <= FLASH_STOP_ADDRESS; i++)
{
byte b = EEPROM.read(i);
if (b == (byte)'\0')
return s;
else
s[numChars++] = (char)b;
}
}
This function seems to work. But the calling method gets back an empty string.
Am I not allowed to return a pointer to a character array that is on this function's stack?
What is the best/most idiomatic way to write this function so the calling function receives the value that I want to pass to it?
The comments are probably going to lead you astray or at best confuse you. Let me break it down with some options.
Firstly, the problem is as you say: the array whose address you are returning no longer exists once the function returns and its stack frame is popped. Using that pointer afterwards is Undefined Behavior.
Here are a few options, along with some discussion:
The caller owns the buffer
void readFromFlashMemory(char *s, size_t len)
Advantage is that the caller chooses how this memory is allocated, which is useful in embedded environments.
Note you could also choose to return s from this function as a convenience, or to convey some extra meaning.
For me, if I was working in an embedded environment such as Arduino, this would be my preference 100%.
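Here is a rough sketch of what the caller-owned-buffer version could look like. To keep it self-contained, the storage is faked with a small array and a hypothetical readFlashByte helper; on a real Arduino that helper would wrap something like EEPROM.read. Returning the number of characters stored (rather than void) is my own choice for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for the real storage so the sketch can run anywhere.
static const char kFakeFlash[] = "hello";
static const int FLASH_START_ADDR = 0;
static const int FLASH_STOP_ADDR = sizeof kFakeFlash - 1;

static unsigned char readFlashByte(int address)   // hypothetical helper
{
    return static_cast<unsigned char>(kFakeFlash[address - FLASH_START_ADDR]);
}

// Reads a '\0'-terminated string into the caller's buffer s of size len.
// Always terminates s (when len > 0) and returns the number of characters stored.
std::size_t readFromFlashMemory(char *s, std::size_t len)
{
    assert(s != nullptr);
    if (len == 0)
        return 0;
    std::size_t n = 0;
    // n + 1 < len keeps one byte free for the terminator
    for (int addr = FLASH_START_ADDR; addr <= FLASH_STOP_ADDR && n + 1 < len; ++addr)
    {
        unsigned char b = readFlashByte(addr);
        if (b == '\0')
            break;
        s[n++] = static_cast<char>(b);
    }
    s[n] = '\0';
    return n;
}
```

The caller decides where the buffer lives (stack, static, or heap), and a buffer that is too small yields a truncated but still terminated string.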
Use std::string, std::vector or similar
std::string readFromFlashMemory()
This is probably the way you'd do it if you didn't care about allocation overhead and other potential issues such as fragmentation over time.
Allocate memory yourself
char* readFromFlashMemory()
If you want to ensure the allocated memory is exactly the right size, then you'd probably read into a local buffer first, and then allocate the memory and copy. Same memory considerations as std::string or other solutions dealing with heap memory.
This form also has the nightmarish property of the caller being responsible for managing the returned pointer and eventually calling delete[]. It's highly inadvisable. It's also distressingly common. :(
Better way to return dynamically allocated memory, if you absolutely must
std::unique_ptr<char[]> readFromFlashMemory()
Same as the previous option (allocating the memory yourself), but the pointer is managed safely. Requires C++11 or later.
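A minimal sketch of the unique_ptr approach, assuming the data has already been read into some temporary C string. The function name copyToOwnedBuffer is hypothetical; the point is only that the caller never has to call delete[] by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Returns a heap copy of src whose lifetime is tied to the returned
// unique_ptr; delete[] runs automatically when it goes out of scope.
std::unique_ptr<char[]> copyToOwnedBuffer(const char *src)
{
    assert(src != nullptr);
    std::size_t len = std::strlen(src);
    std::unique_ptr<char[]> result(new char[len + 1]);
    std::memcpy(result.get(), src, len + 1);   // copies the '\0' too
    return result;
}
```

A caller that needs a plain const char* can use get() on the result for as long as the unique_ptr is alive.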
Use a static buffer
const char* readFromFlashMemory()
{
static char s[FLASH_STOP_ADDR-FLASH_START_ADDR+2];
// ...
return s;
}
Generally frowned upon. Particularly because this type of pattern results in nasty problems in multi-threaded environments.
People mostly choose this approach because they're lazy and want a simple interface. I guess one benefit is that the caller doesn't have to know anything about what size buffer is acceptable.
Make your own class with an internal buffer
class FlashReader
{
public:
const char* Read();
private:
char buffer[FLASH_STOP_ADDR-FLASH_START_ADDR+2];
};
This is more of a verbose solution, and may start to smell like over-engineering. But it does combine the best parts of the caller-owned buffer and the static buffer options. That is, you get stack allocation of your buffer, you don't need to know the size, and the function itself doesn't need extra arguments.
If you did want to have a static buffer, then you could just define one static instance of the class, but the difference would be a clear intent of this in the code.
I have the following method signature:
int get_name(char* &name_out);
What I'd like to be able to do is allocate memory that is the size of name for name_out and then copy the contents of name into name_out.
Or would it be better to change it to:
int get_name(char name_out[], int size);
and allow the user of the function to first allocate and track the memory, and return an error if the size of the given array isn't large enough to contain the string?
The only thing that I don't like about that is that it would require the user of the get_name() function to have knowledge about the length of the name string.
I feel it would be redundant to have two functions int get_name_length(); and get_name(char* name_out);
Since this is a piece of a programming assignment, there are stipulations:
The string class is not allowed.
C-style strings must be used.
Vectors cannot be used.
The function must return an int to indicate an error.
Exception handling is not allowed.
Thank you.
If I understand correctly what you are trying to do is to implement a variant for 'strcpy'.
The major difference is that you pass the allocation responsibility to your copy function, while 'strcpy' leaves that to the user. If this is a production code then I recommend following the 'strcpy' approach which is what the industry is accustomed to.
If it's just for play then wrap strcpy with a function that does the allocation and stick to strcpy's interface.
The standard C way to do this, is to pass two pointers to the function:
int getName(char** name_out, size_t* size_out);
This has several advantages:
The caller is free to preallocate memory/reuse an allocation.
The callee is free to adjust the allocation via realloc() (assuming that the C-style string is allocated with malloc()), or via a pair of delete[]/new[].
The address taking is explicit. I.e. at the calling site you would write:
char* name = NULL;
size_t size = 0;
if(getName(&name, &size)) handleError();
The explicit & operator makes it very clear that the function getName() can change both variables. If you go with references, you cannot distinguish between call by reference and call by value without looking at the function declaration.
Also note, the type to use for allocation sizes is size_t: it is the type that is guaranteed to be able to hold the size of any object you can allocate.
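A sketch of how the callee side of this two-pointer interface might look, assuming the buffer is managed with malloc/realloc. The constant kStoredName is a hypothetical stand-in for wherever the name actually comes from.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical source of the name, used only to make the sketch runnable.
static const char *const kStoredName = "Grace Hopper";

// Copies the stored name into *name_out, growing the buffer with realloc()
// if *size_out is too small. Returns 0 on success, nonzero on failure.
// *name_out must be NULL or a pointer obtained from malloc/realloc.
int getName(char **name_out, std::size_t *size_out)
{
    assert(name_out != nullptr && size_out != nullptr);
    std::size_t needed = std::strlen(kStoredName) + 1;
    if (*name_out == nullptr || *size_out < needed)
    {
        char *grown = static_cast<char *>(std::realloc(*name_out, needed));
        if (grown == nullptr)
            return 1;               // the caller's old buffer is still valid
        *name_out = grown;
        *size_out = needed;
    }
    std::memcpy(*name_out, kStoredName, needed);
    return 0;
}
```

Starting with a NULL pointer and a size of 0 lets the function do the first allocation; passing the same pair again reuses the buffer. The caller releases it with free() when done.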
I'm working on a program that stores a vital data structure as an unstructured string with program-defined delimiters (so we need to walk the string and extract the information we need as we go) and we'd like to convert it to a more structured data type.
In essence, this will require a struct with a field describing what kind of data the struct contains and another field that's a string with the data itself. The length of the string will always be known at allocation time. We've determined through testing that doubling the number of allocations required for each of these data types is an unacceptable cost. Is there any way to allocate the memory for the struct and the std::string contained in the struct in a single allocation? If we were using cstrings I'd just have a char * in the struct and point it to the end of the struct after allocating a block big enough for the struct and string, but we'd prefer std::string if possible.
Most of my experience is with C, so please forgive any C++ ignorance displayed here.
If you have such rigorous memory needs, then you're going to have to abandon std::string.
The best alternative is to find or write an implementation of basic_string_ref (a proposal for the next C++ standard library), which is really just a char* coupled with a size. But it has all of the (non-mutating) functions of std::basic_string. Then you use a factory function to allocate the memory you need (your struct size + string data), and then use placement new to initialize the basic_string_ref.
Of course, you'll also need a custom deletion function, since you can't just pass the pointer to "delete".
Given the previously linked to implementation of basic_string_ref (and its associated typedefs, string_ref), here's a factory constructor/destructor, for some type T that needs to have a string on it:
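template<typename T> T *Create(..., const char *theString, size_t lenstr)
{
char *memory = new char[sizeof(T) + lenstr + 1];
// Copy the characters into the space directly after the T object and terminate them
memcpy(memory + sizeof(T), theString, lenstr);
memory[sizeof(T) + lenstr] = '\0';
try
{
// The string_ref must refer to the copy, not to the caller's string
return new(memory) T(..., string_ref(memory + sizeof(T), lenstr));
}
catch(...)
{
delete[] memory;
throw;
}
}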
template<typename T> T *Create(..., const std::string & theString)
{
// T cannot be deduced from the arguments, so it is named explicitly
return Create<T>(..., theString.c_str(), theString.length());
}
template<typename T> T *Create(..., const string_ref &theString)
{
return Create<T>(..., theString.data(), theString.length());
}
template<typename T> void Destroy(T *pValue)
{
pValue->~T();
char *memory = reinterpret_cast<char*>(pValue);
delete[] memory;
}
Obviously, you'll need to fill in the other constructor parameters yourself. And your type's constructor will need to take a string_ref that refers to the string.
If you are using std::string, you can't really do one allocation for both structure and string, and you also can't make the allocation of both to be one large block. If you are using old C-style strings it's possible though.
If I understand you correctly, you are saying that through profiling you have determined that having to allocate a string and another data member in your data structure imposes an unacceptable cost on your application.
If that's indeed the case I can think of a couple solutions.
You could pre-allocate all of these structures up front, before your program starts. Keep them in some kind of fixed collection so they aren't copy-constructed, and reserve enough buffer in your strings to hold your data.
Controversial as it may seem, you could use old C-style char arrays. It seems like you are forgoing much of the reason to use strings in the first place, which is the memory management. However in your case, since you know the needed buffer sizes at start up, you could handle this yourself. If you like the other facilities that string provides, bear in mind that much of that is still available in the <algorithm>s.
Take a look at Variable Sized Struct C++ - the short answer is that there's no way to do it in vanilla C++.
Do you really need to allocate the container structs on the heap? It might be more efficient to have those on the stack, so they don't need to be allocated at all.
Indeed two allocations can seem too high. There are two ways to cut them down though:
Do a single allocation
Do a single dynamic allocation
It might not seem so different, so let me explain.
1. You can use the struct hack in C++
Yes this is not typical C++
Yes this requires special care
Technically it requires:
disabling the copy constructor and assignment operator
making the constructor and destructor private and provide factory methods for allocating and deallocating the object
Honestly, this is the hard-way.
2. You can avoid allocating the outer struct dynamically
Simple enough:
struct M {
Kind _kind;
std::string _data;
};
and then pass instances of M on the stack. Move operations should guarantee that the std::string is not copied (you can always disable copy to make sure of it).
This solution is much simpler. The only (slight) drawback is in memory locality... but on the other hand the top of the stack is already in the CPU cache anyway.
C-style strings can always be converted to std::string as needed. In fact, there's a good chance that your observations from profiling are due to fragmentation of your data rather than simply the number of allocations, and creating an std::string on demand will be efficient. Of course, not knowing your actual application this is just a guess, and really one can't know this until it's tested anyways. I imagine a class
class my_class {
public:
std::string data() const { return _data; }
const char* data_as_c_str() const // In case you really need it!
{ return _data; }
private:
int _type;
char _data[1];
};
Note I used a standard clever C trick for data layout: _data is as long as you want it to be, so long as your factory function allocates the extra space for it. IIRC, C99 even gave a special syntax for it:
struct my_struct {
int type;
char data[];
};
which has good odds of working with your C++ compiler. (Is this in the C++11 standard?)
Of course, if you do do this, you really need to make all of the constructors private and friend your factory function, to ensure that the factory function is the only way to actually instantiate my_class -- it would be broken without the extra memory for the array. You'll definitely need to make operator= private too, or otherwise implement it carefully.
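A hedged sketch of the factory idea described above: one allocation holds both the fixed header and the characters, the constructor and destructor are private, and copying is disabled. The class name Record and its members are hypothetical, and instead of a trailing array member the sketch simply addresses the bytes that follow the object. Like the struct hack itself, this relies on layout behaviour that mainstream compilers support in practice rather than on anything the C++ standard spells out for this use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <string>

class Record
{
public:
    // Factory: a single allocation holds the Record header and the characters.
    static Record *create(int type, const char *text, std::size_t len)
    {
        assert(text != nullptr);
        void *memory = ::operator new(sizeof(Record) + len + 1);
        Record *r = new (memory) Record(type, len);   // trivial constructor, cannot throw
        char *chars = reinterpret_cast<char *>(memory) + sizeof(Record);
        std::memcpy(chars, text, len);
        chars[len] = '\0';
        return r;
    }
    // Must be used instead of delete, because of how the memory was obtained.
    static void destroy(Record *r)
    {
        if (r == nullptr)
            return;
        r->~Record();
        ::operator delete(r);
    }
    int type() const { return _type; }
    std::size_t size() const { return _len; }
    // The characters live immediately after the header in the same block.
    const char *c_str() const { return reinterpret_cast<const char *>(this) + sizeof(Record); }
    // Builds a std::string on demand (this is the only extra allocation).
    std::string str() const { return std::string(c_str(), _len); }

    Record(const Record &) = delete;
    Record &operator=(const Record &) = delete;

private:
    Record(int type, std::size_t len) : _type(type), _len(len) {}
    ~Record() {}
    int _type;
    std::size_t _len;
};
```

Usage would be along the lines of Record *r = Record::create(kind, text, length); followed later by Record::destroy(r);.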
Rethinking your data types is probably a good idea.
For example, one thing you can do is, rather than trying to put your char arrays into a structured data type, use a smart reference instead. A class that looks like
class structured_data_reference {
public:
structured_data_reference(const char *data):_data(data) {}
std::string get_first_field() const {
// Do something interesting with _data to get the first field
}
private:
const char *_data;
};
You'll want to do the right thing with the other constructors and assignment operator too (probably disable assignment, and implement something reasonable for move and copy). And you may want reference counted pointers (e.g. std::shared_ptr) throughout your code rather than bare pointers.
Another hack that's possible is to just use std::string, but store the type information in the first entry (or first several). This requires accounting for that whenever you access the data, of course.
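A small sketch of that last hack, assuming a one-character tag is enough to identify the type. The enum Kind and the helper names are made up for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical kinds; the tag is stored as the first character of the string.
enum class Kind : char { Id = 'i', Data = 'd', Control = 'c' };

std::string makeTagged(Kind kind, const std::string &payload)
{
    std::string tagged;
    tagged.reserve(payload.size() + 1);   // reserve once so the appends below do not reallocate
    tagged.push_back(static_cast<char>(kind));
    tagged += payload;
    return tagged;
}

Kind kindOf(const std::string &tagged)
{
    assert(!tagged.empty());
    return static_cast<Kind>(tagged[0]);
}

// Returns the payload without the tag (note that this makes a copy).
std::string payloadOf(const std::string &tagged)
{
    assert(!tagged.empty());
    return tagged.substr(1);
}
```

Every access has to remember the one-character offset, which is the accounting cost mentioned above.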
I'm not sure if this exactly addresses your problem. One way to optimize memory allocation in C++ is to use a pre-allocated buffer together with the 'placement new' operator.
I tried to solve your problem as I understood it.
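#include <cstddef>
#include <new>
#include <string>
using std::string;
unsigned char *myPool = new unsigned char[10000];
std::size_t myPoolUsed = 0;
// Hands out the next chunk of the pool so that objects do not overlap.
// Sizes are rounded up to keep every chunk suitably aligned; no bounds checking.
void *poolAlloc(std::size_t size)
{
const std::size_t align = alignof(std::max_align_t);
void *p = myPool + myPoolUsed;
myPoolUsed += (size + align - 1) / align * align;
return p;
}
struct myStruct
{
myStruct(const char* aSource1, const char* aSource2)
{
original = new (poolAlloc(sizeof(string))) string(aSource1); //placement new
data = new (poolAlloc(sizeof(string))) string(aSource2); //placement new
}
~myStruct()
{
// no deallocation of the objects themselves, but the strings must be
// destroyed explicitly so they can release any memory they own
original->~string();
data->~string();
}
string* original;
string* data;
};
int main()
{
myStruct* aStruct = new (poolAlloc(sizeof(myStruct))) myStruct("h1", "h2");
// Use the struct
aStruct->~myStruct(); // destroy explicitly; there is no per-object deallocation
delete [] myPool;
return 0;
}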
[Edit] After the comment from NicolBolas, the problem is a bit clearer. I decided to write one more answer, even though in reality it is not much more advantageous than using a raw character array. Still, I believe that this is well within the stated constraints.
The idea would be to provide a custom allocator for the string class as specified in this SO question.
In the implementation of the allocate method, use the placement new as
pointer allocate(size_type n, void * = 0)
{
// fail if we try to allocate too much
if((n * sizeof(T))> max_size()) { throw std::bad_alloc(); }
//T* t = static_cast<T *>(::operator new(n * sizeof(T)));
T* t = new (/* provide the address of the original character buffer*/) T[n];
return t;
}
The constraint is that for the placement new to work, the original string address must be known to the allocator at run time. This can be achieved by setting it explicitly from outside before the new string member is created. However, this is not so elegant.
In essence, this will require a struct with a field describing what kind of data the struct contains and another field that's a string with the data itself.
I have a feeling that maybe you are not exploiting C++'s type system to its maximum potential here. It looks and feels very C-ish (that is not a proper word, I know). I don't have concrete examples to post here since I don't have any idea about the problem you are trying to solve.
Is there any way to allocate the memory for the struct and the std::string contained in the struct in a single allocation?
I believe that you are worrying about the structure allocation followed by a copy of the string to the structure member? This ideally shouldn't happen (but of course, this depends on how and when you are initializing the members). C++11 supports move construction. This should take care of any extra string copies that you are worried about.
You should really, really post some code to make this discussion worthwhile :)
a vital data structure as an unstructured string with program-defined delimiters
One question: Is this string mutable? If not, you can use a slightly different data-structure. Don't store copies of parts of this vital data structure but rather indices/iterators to this string which point to the delimiters.
// assume that !, [, ], $, % etc. are your program defined delims
const std::string vital = "!id[thisisdata]$[moredata]%[controlblock]%";
// define a special struct
enum Type { ... };
struct Info {
size_t start, end;
Type type;
// define appropriate ctors
};
// parse the string and return Info objects
std::vector<Info> parse(const std::string& str) {
std::vector<Info> v;
// loop through the string looking for delims
for (size_t b = 0, e = str.size(); b < e; ++b) {
// on hitting one such delim create an Info
switch( str[ b ] ) {
case '%':
...
case '$':
// initializing the start and then move until
// you get the appropriate end delim
}
// use push_back/emplace_back to insert this newly
// created Info object back in the vector
v.push_back( Info( start, end, kind ) );
}
return v;
}
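The outline above leaves the grammar open, so here is a runnable sketch under an assumed one: '!', '$' and '%' select the type of the next field, and each field's payload sits between '[' and ']'. The enumerator names and the helper text() are hypothetical. The parser stores only index ranges into the original string and copies nothing until a field is actually requested.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Type { Unknown, Id, Data, Control };   // hypothetical kinds

struct Info
{
    std::size_t start, end;   // half-open range [start, end) into the source string
    Type type;
};

// Assumed grammar for this sketch: '!', '$' and '%' select the type of the
// next field, and each field's payload is enclosed in '[' ... ']'.
std::vector<Info> parse(const std::string &str)
{
    std::vector<Info> v;
    Type current = Type::Unknown;
    for (std::size_t b = 0, e = str.size(); b < e; ++b)
    {
        switch (str[b])
        {
        case '!': current = Type::Id; break;
        case '$': current = Type::Data; break;
        case '%': current = Type::Control; break;
        case '[':
        {
            std::size_t close = str.find(']', b + 1);
            if (close == std::string::npos)
                return v;   // unterminated field: stop parsing
            v.push_back(Info{b + 1, close, current});
            b = close;      // the loop's ++b then moves past the ']'
            break;
        }
        default: break;
        }
    }
    return v;
}

// Materializes a field only when it is actually needed.
std::string text(const std::string &str, const Info &info)
{
    assert(info.start <= info.end && info.end <= str.size());
    return str.substr(info.start, info.end - info.start);
}
```

Because Info holds indices rather than pointers or iterators, the vector stays valid even if the source string object is moved, as long as its contents do not change.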
When does using pointers in any language require someone to use more than one, let's say a triple pointer. When does it make sense to use a triple pointer instead of just using a regular pointer?
For example:
char * * *ptr;
instead of
char *ptr;
Each star should be read as "which is pointed to by a pointer", so
char *foo;
is "char which is pointed to by a pointer foo". However
char *** foo;
is "char which is pointed to by a pointer which is pointed to by a pointer which is pointed to by a pointer foo". Thus foo is a pointer. At that address is a second pointer. At the address pointed to by that is a third pointer. Dereferencing the third pointer results in a char. If that's all there is to it, it's hard to make much of a case for that.
It's still possible to get some useful work done, though. Imagine we're writing a substitute for bash, or some other process control program. We want to manage our processes' invocations in an object oriented way...
struct invocation {
char* command; // command to invoke the subprocess
char* path; // path to executable
char** env; // environment variables passed to the subprocess
...
};
But we want to do something fancy. We want to have a way to browse all of the different sets of environment variables as seen by each subprocess. To do that, we gather the env member of each invocation instance into an array env_list and pass it to the function that deals with that:
void browse_env(size_t envc, char*** env_list);
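One possible body for such a function, as a sketch. It assumes each env array is NULL-terminated like a process's envp, and it returns the number of entries visited (the declaration above returns void; the count is my addition so the result can be checked).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Prints every "NAME=value" entry across all environments and returns how
// many entries were seen. env_list[i] is one NULL-terminated array of strings.
std::size_t browse_env(std::size_t envc, char ***env_list)
{
    assert(envc == 0 || env_list != nullptr);
    std::size_t total = 0;
    for (std::size_t i = 0; i < envc; ++i)                             // char***: each invocation's env
        for (char **entry = env_list[i]; *entry != nullptr; ++entry)   // char**: each variable
        {
            std::printf("env[%zu]: %s\n", i, *entry);                  // char*: one string
            ++total;
        }
    return total;
}
```

Each level of indirection corresponds to one loop or dereference: the list of environments, the variables in one environment, and the characters of one variable.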
If you work with "objects" in C, you probably have this:
typedef struct customer {
char *name;
char *address;
int id;
} Customer;
If you want to create an object, you would do something like this:
Customer *customer = malloc(sizeof(Customer));
// Initialise state.
We're using a pointer to a struct here because struct arguments are passed by value and we need to work with one object. (Also: Objective-C, an object-oriented wrapper language for C, uses pointers to structs internally, and visibly so.)
If I need to store multiple objects, I use an array:
Customer **customers = malloc(sizeof(Customer *) * 10);
int customerCount = 0;
Since an array variable in C points to the first item, I use a pointer… again. Now I have double pointers.
But now imagine I have a function which filters the array and returns a new one. And imagine it can't return the array via the return mechanism because it must return an error code (my function accesses a database). I need to do it through a by-reference argument. This is my function's signature:
int filterRegisteredCustomers(Customer **unfilteredCustomers, Customer ***filteredCustomers, int unfilteredCount, int *filteredCount);
The function takes an array of customers and returns a reference to an array of customers (which are pointers to a struct). It also takes the number of customers and returns the number of filtered customers (again, by-reference argument).
I can call it this way:
Customer **result = NULL;
int n = 0;
int errorCode = filterRegisteredCustomers(customers, &result, customerCount, &n);
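For completeness, a sketch of what the function body might look like. The rule that a customer counts as "registered" when id > 0 is purely an assumption for this example (the original function consults a database), and the struct is declared with const char* members so it can be initialized from string literals.

```cpp
#include <cassert>
#include <cstdlib>

struct Customer
{
    const char *name;
    const char *address;
    int id;
};

// Hypothetical rule for this sketch: a customer is "registered" if id > 0.
// On success returns 0 and stores a malloc'd array of pointers in
// *filteredCustomers (the caller frees it); on failure returns nonzero.
int filterRegisteredCustomers(Customer **unfilteredCustomers,
                              Customer ***filteredCustomers,
                              int unfilteredCount, int *filteredCount)
{
    assert(filteredCustomers != nullptr && filteredCount != nullptr);
    // Allocate at least one slot so malloc(0) never comes into play.
    std::size_t slots = unfilteredCount > 0 ? static_cast<std::size_t>(unfilteredCount) : 1;
    Customer **out = static_cast<Customer **>(std::malloc(slots * sizeof(Customer *)));
    if (out == nullptr)
        return 1;
    int n = 0;
    for (int i = 0; i < unfilteredCount; ++i)
        if (unfilteredCustomers[i]->id > 0)
            out[n++] = unfilteredCustomers[i];
    *filteredCustomers = out;   // writes through the triple pointer
    *filteredCount = n;
    return 0;
}
```

The filtered array only borrows the Customer objects; freeing it releases the array of pointers, not the customers themselves.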
I could go on imagining more situations… This one is without the typedef:
int fetchCustomerMatrix(struct customer ****outMatrix, int *rows, int *columns);
Obviously, I would be a horrible and/or sadistic developer to leave it that way. So, using:
typedef Customer **CustomerArray;
typedef CustomerArray *CustomerMatrix;
I can just do this:
int fetchCustomerMatrix(CustomerMatrix *outMatrix, int *rows, int *columns);
If your app is used in a hotel where you use a matrix per level, you'll probably need an array of matrices:
int fetchHotel(struct customer *****hotel, int *rows, int *columns, int *levels);
Or just this:
typedef CustomerMatrix *Hotel;
int fetchHotel(Hotel *hotel, int *rows, int *columns, int *levels);
Don't get me even started on an array of hotels:
int fetchHotels(struct customer ******hotels, int *rows, int *columns, int *levels, int *hotelCount);
…arranged in a matrix (some kind of large hotel corporation?):
int fetchHotelMatrix(struct customer *******hotelMatrix, int *rows, int *columns, int *levels, int *hotelRows, int *hotelColumns);
What I'm trying to say is that you can imagine crazy applications for multiple indirections. Just make sure you use typedef if multi-pointers are a good idea and you decide to use them.
(Does this post count as an application for a SevenStarDeveloper?)
A pointer is simply a variable that holds a memory address.
So you use a pointer to a pointer, when you want to hold the address of a pointer variable.
If you want to return a pointer, and you are already using the return variable for something, you will pass in the address of a pointer. The function then dereferences this pointer so it can set the pointer value. I.e. the parameter of that function would be a pointer to a pointer.
Multiple levels of indirection are also used for multi dimensional arrays. If you want to return a 2 dimensional array, you would use a triple pointer. When using them for multi dimensional arrays though be careful to cast properly as you go through each level of indirection.
Here is an example of returning a pointer value via a parameter:
//Not a very useful example, but shows what I mean...
bool getOffsetBy3Pointer(const char *pInput, const char **pOutput)
{
*pOutput = pInput + 3;
return true;
}
And you call this function like so:
const char *p = "hi you";
const char *pYou;
bool bSuccess = getOffsetBy3Pointer(p, &pYou);
assert(bSuccess && !strcmp(pYou, "you"));
ImageMagick's Wand has a function that is declared as
WandExport char* * * * * * DrawGetVectorGraphics ( const DrawingWand *)
I am not making this up.
N-dimensional dynamically-allocated arrays, where N >= 3, require three or more levels of indirection in C.
A standard use of double pointers, e.g. myStruct** ptrptr, is as a pointer to a pointer. As a function parameter, for example, this allows you to change the actual structure the caller is pointing to, instead of only being able to change the values within that structure.
char ***foo can be interpreted as a pointer to a two-dimensional array of strings.
You use an extra level of indirection - or pointing - when necessary, not because it would be fun. You seldom see triple pointers; I don't think I've ever seen a quadruple pointer (and my mind would boggle if I did).
State tables can be represented by a 2D array of an appropriate data type (pointers to a structure, for example). When I wrote some almost generic code to do state tables, I remember having one function that took a triple pointer - which represented a 2D array of pointers to structures. Ouch!
int main( int argc, char** argv );
Functions that encapsulate creation of resources often use double pointers. That is, you pass in the address of a pointer to a resource. The function can then create the resource in question, and set the pointer to point to it. This is only possible if it has the address of the pointer in question, so it must be a double pointer.
If you have to modify a pointer inside a function you must pass a reference to it.
It makes sense to use a pointer to a pointer whenever the pointer actually points towards a pointer (this chain is unlimited, hence "triple pointers" etc are possible).
The reason for creating such code is because you want the compiler/interpreter to be able to properly check the types you are using (prevent mystery bugs).
You don't have to use such types: you can always use a plain "void *" and cast whenever you need to actually dereference the pointer and access the data it points to. But that is usually bad practice and prone to errors. There are certainly cases where using void * is actually good and makes code much more elegant, but think of it as your last resort.
=> It's mostly for helping the compiler make sure things are used the way they are supposed to be used.
To be honest, I've rarely seen a triple-pointer.
I glanced at Google Code Search, and there are some examples, but they are not very illuminating. (see links at end - SO doesn't like 'em)
As others have mentioned, double pointers you'll see from time to time. Plain single pointers are useful because they point to some allocated resource. Double pointers are useful because you can pass them to a function and have the function fill in the "plain" pointer for you.
It sounds like maybe you need some explanation about what pointers are and how they work?
You need to understand that first, if you don't already.
But that's a separate question (:
http://www.google.com/codesearch/p?hl=en#e_ObwTAVPyo/security/nss/lib/ckfw/capi/ckcapi.h&q=***%20lang:c&l=301
http://www.google.com/codesearch/p?hl=en#eVvq2YWVpsY/openssl-0.9.8e/crypto/ec/ec_mult.c&q=***%20lang:c&l=344
Pointers to pointers are rarely used in C++. They primarily have two uses.
The first use is to pass an array. char**, for instance, is a pointer to pointer to char, which is often used to pass an array of strings. Pointers to arrays don't work for good reasons, but that's a different topic (see the comp.lang.c FAQ if you want to know more). In some rare cases, you may see a third * used for an array of arrays, but it's commonly more effective to store everything in one contiguous array and index it manually (e.g. array[x+y*width] rather than array[x][y]). In C++, however, this is far less common because of container classes.
The second use is to pass by reference. An int* parameter allows the function to modify an integer that belongs to the caller, and is commonly used to provide multiple return values. This pattern of passing parameters by reference to allow multiple returns is still present in C++, but, like other uses of pass-by-reference, it is generally superseded by actual references. The other reason for pass-by-reference, avoiding the copying of complex constructs, is also covered by the C++ reference.
C++ has a third factor which reduces the use of multiple pointers: it has string. A reference to string might take the type char** in C, so that the function can change the address of the string variable it's passed, but in C++, we usually see string& instead.
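To make the contrast concrete, here is a small sketch of the same "change the caller's string" operation written both ways. The function names rename_c and rename_cpp are made up for this example, and the C-style version uses const char** because it points the caller at a string literal.

```cpp
#include <cassert>
#include <string>

// C style: the function receives the address of the caller's pointer
// so that it can make that pointer refer to a different string.
void rename_c(const char** name) {
    assert(name != nullptr);
    *name = "renamed";  // a literal has static storage, so nothing is allocated
}

// C++ style: a reference to std::string does the same job with no extra '*'.
void rename_cpp(std::string& name) {
    name = "renamed";
}
```

The C version is called as `rename_c(&s)` with `const char* s`, the C++ version simply as `rename_cpp(s)` with `std::string s`.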
You use them when you have nested dynamically allocated (or pointer-linked) data structures, where everything is linked together by pointers.
Particularly in single-threaded dialects of C which don't aggressively use type-based aliasing analysis, it can sometimes be useful to write memory managers which can accommodate relocatable objects. Instead of giving applications direct pointers to chunks of memory, the application receives pointers into a table of handle descriptors, each of which contains a pointer to an actual chunk of memory along with a word indicating its size. If one needs to allocate space for a struct woozle, one could say:
struct woozle **my_woozle = newHandle(sizeof(struct woozle));
and then access a field (somewhat awkwardly in C syntax; the syntax is cleaner
in Pascal) with (*my_woozle)->someField = 23; It's important that applications
not keep direct pointers to any handle's target across calls to functions which
allocate memory, but if there only exists a single pointer to every block
identified by a handle, the memory manager will be able to move things around
if fragmentation becomes a problem.
The approach doesn't work nearly as well in dialects of C which aggressively
pursue type-based aliasing, since the pointer returned by newHandle doesn't
identify a pointer of type struct woozle* but instead identifies a pointer
of type void*, and even on platforms where those pointer types would have
the same representation the Standard doesn't require that implementations
interpret a pointer cast as an indication that it should expect that aliasing
might occur.
Double indirection simplifies many tree-balancing algorithms, where usually one wants to be able to efficiently "unlink" a subtree from its parent. For instance, an AVL tree implementation might use:
void rotateLeft(struct tree **tree) {
    struct tree *t = *tree,
                *r = t->right,
                *rl = r->left;
    *tree = r;
    r->left = t;
    t->right = rl;
}
Without the "double pointer", we would have to do something more complicated, like explicitly keeping track of a node's parent and whether it's a left or right branch.
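As a rough illustration of why the double pointer helps, the sketch below applies the same rotation once through the root variable and once through a parent's left link, with no special-casing. The tree struct layout (key, left, right) and the demo function names are assumptions made for this example.

```cpp
#include <cassert>

// Assumed node layout; the original answer does not show the struct.
struct tree {
    int key;
    tree* left;
    tree* right;
};

// Same rotation as above; 'link' is whichever pointer currently refers to the subtree.
void rotateLeft(tree** link) {
    assert(link && *link && (*link)->right);
    tree* t = *link;
    tree* r = t->right;
    tree* rl = r->left;
    *link = r;
    r->left = t;
    t->right = rl;
}

// Chain 1 -> 2 -> 3 along right links; the root variable itself is the link.
int demo_rotate_root() {
    tree c{3, nullptr, nullptr};
    tree b{2, nullptr, &c};
    tree a{1, nullptr, &b};
    tree* root = &a;
    rotateLeft(&root);
    return root->key;  // key of the node that is now the root
}

// Same chain hanging off a parent's left link; the parent's field is the link.
int demo_rotate_child() {
    tree c{3, nullptr, nullptr};
    tree b{2, nullptr, &c};
    tree a{1, nullptr, &b};
    tree parent{10, &a, nullptr};
    rotateLeft(&parent.left);
    return parent.left->key;  // key of the node now attached to the parent
}
```

In both cases node 2 ends up on top, and the caller never has to tell rotateLeft whether it is dealing with the root, a left child, or a right child.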
I have been given a header with the following declaration:
//The index of 1 is used to make sure this is an array.
MyObject objs[1];
However, I need to make this array dynamically sized once the program is started. I would think I should just declare it as MyObject *objs;, but I figure if the original programmer declared it this way, there is some reason for it.
Is there any way I can dynamically resize this? Or should I just change it to a pointer and then malloc() it?
Could I use the new keyword somehow to do this?
Use an STL vector:
#include <vector>
std::vector<MyObject> objs(size);
A vector is a dynamic array and is a part of the Standard Template Library. It resizes automatically as you push back objects into the array and can be accessed like a normal C array with the [] operator. Also, &objs[0] is guaranteed to point to a contiguous sequence in memory -- unlike a list -- if the container is not empty.
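A short sketch of those three properties (sized construction, growth via push_back, and contiguous storage reachable through &objs[0]) might look like this. MyObject here is a stand-in struct with a single int field, and the helper names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct MyObject { int id; };  // stand-in for the real MyObject

// Creates 'size' objects, numbers them, then grows the vector by one element.
std::vector<MyObject> make_objects(std::size_t size) {
    std::vector<MyObject> objs(size);          // size elements, value-initialised
    for (std::size_t i = 0; i < objs.size(); ++i)
        objs[i].id = static_cast<int>(i);      // indexed like a plain array
    MyObject extra;
    extra.id = 99;
    objs.push_back(extra);                     // the vector resizes itself
    return objs;
}

// A C-style function that only understands a pointer plus a length.
int sum_ids(const MyObject* first, std::size_t count) {
    int sum = 0;
    for (std::size_t i = 0; i < count; ++i)
        sum += first[i].id;
    return sum;
}
```

With a non-empty vector v, a call such as `sum_ids(&v[0], v.size())` hands the contiguous storage to code that expects a plain array.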
You're correct. If you want to dynamically instantiate its size you need to use a pointer.
(Since you're using C++ why not use the new operator instead of malloc?)
MyObject* objs = new MyObject[size];
Or should I just change it to a
pointer and then malloc() it?
If you do that, how are constructors going to be called for the objects in the malloc'd memory? I'll give you a hint: they won't be. You need to use a std::vector.
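The difference can be observed with a small experiment. This is only a sketch: Counted is a made-up type whose default constructor bumps a global counter, standing in for any MyObject with a non-trivial constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

static int constructed = 0;  // counts default-constructor calls

struct Counted {
    int value;
    Counted() : value(7) { ++constructed; }
};

// malloc hands back raw bytes; no constructor runs.
int ctor_calls_with_malloc(std::size_t n) {
    constructed = 0;
    void* raw = std::malloc(n * sizeof(Counted));
    int calls = constructed;
    std::free(raw);
    return calls;
}

// new[] default-constructs every element.
int ctor_calls_with_new(std::size_t n) {
    constructed = 0;
    Counted* objs = new Counted[n];
    int calls = constructed;
    delete[] objs;
    return calls;
}

// std::vector also hands out properly constructed elements.
bool vector_elements_initialised(std::size_t n) {
    std::vector<Counted> objs(n);
    for (std::size_t i = 0; i < objs.size(); ++i)
        if (objs[i].value != 7) return false;
    return true;
}
```

For n elements, the malloc version reports zero constructor calls and the new[] version reports n, while the vector's elements all carry the value set by the constructor.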
I have only seen an array used as a pointer inside a struct or union. This was ages ago and was used to treat the len and first char of a string as a hash to improve the speed of string comparisons for a scripting language.
The code was similar to this:
union small_string {
struct {
char len;
char buff[1];
};
short hash;
};
Then small_string was initialised using malloc (note that the C cast is effectively a reinterpret_cast):
small_string *str = (small_string *) malloc(sizeof(small_string) + len);
str->len = (char) len;
strcpy(str->buff, val);
And to test for equality:
int fast_str_equal(const small_string *str1, const small_string *str2)
{
    /* hash overlays len and the first char, so one compare rejects most mismatches */
    if (str1->hash == str2->hash)
        return strcmp(str1->buff, str2->buff) == 0;
    return 0;
}
As you can see, this is not a very portable or safe style of C++. But it offered a great speed improvement for associative arrays indexed by short strings, which are the basis of most scripting languages.
I would probably avoid this style of C++ today.
Is this at the end of a struct somewhere?
One trick I've seen is to declare a struct
struct foo {
    /* optional stuff here */
    int arr[1];
};
and malloc more memory than sizeof (struct foo) so that arr becomes a variable-sized array.
This was fairly commonly used in C programs back when I was hacking C, since variable-sized arrays were not available, and doing an additional allocation was considered too error-prone.
The right thing to do, in almost all cases, is to change the array to an STL vector.
If you want a dynamically sized array, using the STL is best. There are several options; one is std::vector. If you care more about cheap insertion than about indexing, you can also use std::list.
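It seems that yes, you can make this change.
But check your code for uses of sizeof(objs):
MyObj *arr1 = new MyObj[1];
MyObj arr2[1];
// sizeof(arr1) is the size of a pointer; sizeof(arr2) is the size of the whole array, so the two generally differ
Maybe this fact is used somewhere in your code.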
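The sketch below spells out that difference. MyObj is a stand-in with an arbitrary 64-byte payload, and the function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>

struct MyObj { char payload[64]; };  // stand-in; the 64-byte size is arbitrary

// With a pointer, sizeof reports the size of the pointer itself.
std::size_t size_via_pointer() {
    MyObj* arr1 = new MyObj[1];
    std::size_t s = sizeof(arr1);
    delete[] arr1;
    return s;
}

// With a real array, sizeof reports the size of the whole array.
std::size_t size_via_array() {
    MyObj arr2[1];
    return sizeof(arr2);
}

// The common element-count idiom only works on real arrays.
std::size_t count_via_array() {
    MyObj arr2[1];
    return sizeof(arr2) / sizeof(arr2[0]);
}
```

Any existing code that computes an element count with the sizeof(objs) / sizeof(objs[0]) idiom would silently produce a different number once objs becomes a pointer.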
That comment is incredibly bad. A one-element array is an array even though the comment suggests otherwise.
I've never seen anybody try to enforce "is an array" this way. The array syntax is largely syntactic sugar: a[2] gives the same result as 2[a], i.e., the third element of a. (This is valid and interesting syntax, but usually very bad form, because it confuses other programmers for no reason.)
Because the array syntax is largely syntactic sugar, switching to a pointer makes sense as well. But if you're going to do that, then going with new[] makes more sense (because you get your constructors called for free), and going with std::vector makes even more sense (because you don't have to remember to call delete[] every place the array goes out of scope due to return, break, the end of a block, a thrown exception, etc.).