Overlapping strings - c++

I have a problem with overlapping char*.
I'm working in a low-memory environment, namely Arduino, and I would like to use as little memory as possible. I want to be able to prepend one string to another without copying any data, since copies waste memory.
This is standard C or C++.
char* bigPacket = (char*)malloc(25); //Makes a big string of length 25
char* payload = bigPacket + 2; //This is part of the big string, 2 chars in.
bigPacket[0] = 72; // Letter 'H'
bigPacket[1] = 72; //I'm expecting the final bigPacket to read "HHHello, world"
payload = "Hello, World";
print(bigPacket);
But the problem is that it does not print "HHHello, world" as it should. Instead, it just prints "HH". Is there a proper way to make it be able to overlap these strings to print "HHHello, world"?

You changed where payload points. What you needed to do was leave payload alone and change the data it points to.
strcpy(payload, "Hello World");
Edit: If you really want to avoid copies you'd end up with something like the SGI Rope class. But you'd pay a lot in code complexity.

If you want to do this without either very complicated code or multiple copies of data, destroying the benefit, you need to have the complete string as one literal in your program: "HHHelloWorld". You can then play with pointers and lengths to access various parts of it, but remember there is only one null byte, at the end of the string.
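A minimal sketch of the single-literal idea, written in portable C++ rather than Arduino-specific code. The names kPacket, kHeaderLen, payloadOf and headerIs are made up for illustration and do not come from the question. Nothing here copies the string; the payload is only a pointer into the literal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical names for illustration: one literal holds header and payload.
static const char kPacket[] = "HHHello, World";
static const std::size_t kHeaderLen = 2;

// The payload is just a pointer into the same literal; no bytes are copied.
inline const char* payloadOf(const char* packet, std::size_t headerLen) {
    assert(headerLen <= std::strlen(packet));
    return packet + headerLen;
}

// The header has no terminator of its own, so it is compared by length.
inline bool headerIs(const char* packet, const char* expected,
                     std::size_t headerLen) {
    return std::strncmp(packet, expected, headerLen) == 0;
}
```

Printing kPacket gives the whole text, while payloadOf(kPacket, kHeaderLen) can be passed to anything that expects a null-terminated string, because it shares the single terminator at the end of the literal.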
However, I suspect that this is an over-optimization. Arduino programming rarely involves many very long strings. It is important to keep the code simple and direct.

You should not mess with pointers for something like that. Instead, you should store string literals in flash rather than in SRAM. This is usually done with the help of the PROGMEM macros. Often the F() macro is sufficient, though. Then you can copy your strings, if and when needed, into a suitable buffer.
Simplest example:
Serial.println(F("this is text from flash memory"));

You only reassign the payload pointer so that it points to the constant string; you do not copy the string into the memory it previously pointed to.
In order to copy the string you need to use strcpy or memcpy:
char *bigPacket = (char*)malloc(25); // the cast is required in C++
bigPacket[0] = bigPacket[1] = 72; // 'H'
strcpy(bigPacket + 2, "Hello, World");
print( bigPacket );
Note that this is rather unlikely to save memory, since "Hello, World" will still exist as a constant string in your code. To save memory it is probably most efficient to call print multiple times.
However, I guess that is not possible in this case.
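If calling print more than once does turn out to be possible, the idea could look roughly like the sketch below. It is plain C++ for a desktop compiler, not Arduino code: printPacket and StringSink are hypothetical names, and the print parameter stands in for whatever output call the real program uses.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: 'print' is called once per piece, so the header and the
// payload never have to be joined in a RAM buffer.
template <typename Print>
void printPacket(Print&& print, const char* header, const char* payload) {
    assert(header != nullptr && payload != nullptr);
    print(header);   // e.g. "HH"
    print(payload);  // e.g. "Hello, World", read straight from the literal
}

// Stand-in output for trying the helper on a desktop: collects what was printed.
struct StringSink {
    std::string text;
    void operator()(const char* s) { text += s; }
};
```

With a StringSink named sink, printPacket(sink, "HH", "Hello, World") leaves the combined text in sink.text. On the device the sink would be the real output routine instead.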

Related

Storing single characters

If I want to store a single character, say 'c', am I better off using
std::string myChar = 'c';
rather than the built in char type?
char myChar = 'c';
Is there any safety gained by storing single characters as string?
There is a little safety gained as you won't accidentally use the string for calculations.
int a = 5+myChar;
This gives a compiler error if myChar is a string and won't if it is a char, because chars are treated as numbers.
Please note, that the first example doesn't compile. It has to be
std::string myChar = "c";
(with double quotes). I see more disadvantages in this approach:
It will consume far more memory than required. Even with the short-string optimization, where the data is not stored on the heap, a string object is still typically 3 words long (a word is often 4 bytes, so that would be 12 bytes) compared to one byte [1] when using char.
Access to that char is really inconvenient: you would always have to use .front(), .back() or [0] to get at it.
It doesn't convey the meaning of your variables. It's like replacing every int variable in your program with a std::vector<int> holding a single element.
The only "safety" I can see is, as AlexGeorg already mentioned, that you can't mistakenly use it in calculations. But that's it, and this could also be seen as a disadvantage.
So, no, you're most likely not better off using a string to store a single character, unless you have some really specific circumstances.
[1] Plus maybe some padding.
The positive side of using a string is the compile-time error you get when you try to use the variable in a mathematical expression, for example:
int sum = 15 + myChar;
There are also some negatives to take into consideration:
The first is performance: a string is more expensive than a char in both memory use and execution time.
The second is that a string does not guarantee that the variable holds a single character, so you have to pay attention when you use it.
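The trade-offs described in these answers can be checked with a small sketch. The helper names canAddToInt, addsToInt and onlyChar are made up for illustration. The first two use overload resolution to ask the compiler whether 5 + x is a valid expression for a given type, and onlyChar shows the run-time check that a "single character" string needs.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Chosen only when the expression `5 + T` compiles.
template <typename T>
auto canAddToInt(int)
    -> decltype(void(5 + std::declval<T>()), std::true_type{});
// Fallback chosen when it does not.
template <typename T>
std::false_type canAddToInt(...);

template <typename T>
constexpr bool addsToInt() { return decltype(canAddToInt<T>(0))::value; }

// A string may hold any number of characters, so "exactly one character"
// has to be checked at run time; a char needs no such check.
inline char onlyChar(const std::string& s) {
    assert(s.size() == 1);
    return s.front();
}
```

addsToInt<char>() is true and addsToInt<std::string>() is false, which is the compile-time safety mentioned above. sizeof(std::string) compared with sizeof(char) shows the memory cost on a given implementation.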

String efficiency with small strings

I think I heard somewhere that strings have something called "small string optimization", a way of avoiding allocations. Can I avoid allocations altogether by doing something like this:
auto s = "hello" + "world!"s;
Instead of:
auto s = "hello, world!"s;
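No, that won't help. SSO means that a short string is stored inside the string object itself, in place of a pointer to a heap buffer. Concatenating two short strings produces a longer string, which is stored inline only if it still fits in that internal buffer, so splitting a literal in two cannot reduce allocations. There are string classes that have larger internal buffers, in case you need one that does SSO for, say, 31 characters.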
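To see what a particular standard library does, a rough probe like the following can help. It is a heuristic sketch, not a standard facility: storedInline and concatenated are hypothetical names, and the check only tests whether the character data lies inside the string object. Whether a short string is stored inline depends on the implementation.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Heuristic: reports whether the character data lives inside the std::string
// object itself (SSO) rather than in a separately allocated buffer.
inline bool storedInline(const std::string& s) {
    const char* objBegin = reinterpret_cast<const char*>(&s);
    const char* objEnd = objBegin + sizeof(std::string);
    std::less<const char*> before;  // total order, even for unrelated pointers
    return !before(s.data(), objBegin) && before(s.data(), objEnd);
}

// The concatenation from the question: the result is built at run time and
// is subject to the same size limit as any other string.
inline std::string concatenated() {
    return "hello" + std::string("world!");
}
```

Calling storedInline on concatenated() and on a much longer string, for example std::string(1000, 'x'), shows where the implementation switches from the internal buffer to the heap.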

Is there any difference between the different methods of clearing the contents of a string variable?

Given a string variable set to some value:
string s = "Hello";
Is there any difference (performance, gotchas) between the following methods to clear the contents?:
s = ""
s = std::string()
s.clear()
I got the sample code from this answer to a question about clearing a variable https://stackoverflow.com/a/11617595/1228532
There are some noticeable differences.
clear sets the length of the string to 0, but does not change its capacity.
s="" or s = std::string() creates a whole new (empty) string, assigns its value to the existing string, and throws away the contents of the existing string. Especially if you're using an implementation of std::string that doesn't include the short string optimization, this may well be much slower than clear. To add insult to injury, it also means that if you add more data to the string, it'll end up reallocating the buffer starting from a tiny buffer that it will probably have to reallocate as the string grows.
Bottom line: clear will often be faster, not to mention giving a...clear expression of your real intent.
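The capacity behaviour described above can be observed with a short sketch. Capacities and clearAndMeasure are hypothetical names. The standard does not spell out that clear() keeps the capacity, so treat the result as typical of mainstream implementations rather than guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct Capacities {
    std::size_t before;      // capacity while the string still holds data
    std::size_t afterClear;  // capacity once clear() has run
};

// Takes the string by value so the caller's copy is left untouched.
inline Capacities clearAndMeasure(std::string s) {
    Capacities c;
    c.before = s.capacity();
    s.clear();               // length becomes 0
    assert(s.empty());
    c.afterClear = s.capacity();
    return c;
}
```

Comparing the two fields for a long string shows whether the buffer was kept. The same measurement around s = std::string() shows how a given library treats assignment of an empty string.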

Strange error in variable values C++

I have used this code. A string is stored in the buffer starting at offset 4, and its length is 14. All these calculations are done prior to this code; I am pasting only the small snippet that contains the error.
void *data = malloc(4096);
int len = 14;
int fileptr = 4;
string str;
cout<<len<<endl;
cout<<fileptr<<endl;
memcpy(&str, (char *)data+fileptr, len);
cout<<len<<endl;
cout<<fileptr<<endl;
The output I get is:
14
4
4012176
2009288233
Here I am reading the string "System Catalog" from memory. It's displaying the string correctly, but the values of fileptr and len change abruptly after the call to memcpy().
string is not the same as a char*. string is an object. So you can't just memcpy() data to it. So the behavior of this code is undefined.
In your case, you are copying 14 bytes of junk data into str and corrupting the stack.
The result is that you are overwriting both len and fileptr with junk from the malloc().
I'm not sure exactly what you're trying to do, but if you want to create a string, you should do it like this:
string str = "System Catalog";
A string is an object and is not just a sequence of bytes. You cannot just memcpy over it from raw memory.
My guess is that in your code the str variable is allocated before other variables in stack memory and memcpy-ing over it you are overwriting them.
Note that your phrase "It's displaying the string correctly" has the seed of a common misconception about C++ in it.
When you do bad things in C++ (e.g. writing bytes over an object) you should expect the worst possible behavior. The worst possible behavior however is NOT an ugly result, a crash or a runtime error... but something that seems to work but that has bad consequences in the future.
You want to assign this many characters from that char pointer into a std::string, so you should look at what facilities a string object provides for doing that rather than hitting it over the head with memcpy(). As others have noted, memcpy() is for use in low-level C-style code, not for interacting with C++ objects.
In particular, you should study the assignment methods provided by std::string, one of which does exactly what you want -- which isn't a coincidence.
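As a sketch of what that could look like for the snippet in the question: readString is a hypothetical helper name, and the caller is assumed to guarantee that offset + len stays inside the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds a std::string from 'len' bytes starting 'offset' bytes into a raw
// buffer. The string allocates and manages its own storage, so nothing else
// on the stack is overwritten.
inline std::string readString(const void* data, std::size_t offset,
                              std::size_t len) {
    assert(data != nullptr);
    const char* bytes = static_cast<const char*>(data);
    return std::string(bytes + offset, len);
}
```

With the variables from the question this would be str = readString(data, fileptr, len), or equivalently str.assign((char *)data + fileptr, len) on an existing string.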
string is an object - please look up the semantics for it. Why are you doing this and what are you trying to achieve?
If for some reason you actually MUST use memcpy, you can get the internal address of the string's buffer to copy to (provided the string is already big enough to hold the data you want to copy):
static_cast<char*>(&str[0]);
But this is VERY VERY BAD. If you use it, I'm quite sure there are more crazy things going on in your code :-)

single character c-style string full of junk

It's a shame I can't figure out such a basic thing about C++, but C-style strings are not behaving as I would expect. For example, I create one like this:
char* cstr = new char[1];
It's initialized to: Íýýýýý««««««««îţ . As usual, I can set just the first char because the others don't really exist (or so I thought). While working with C-style strings, all this junk is ignored and everything works fine.
Now I have mixed std::string with those C-style ones and what I get is a mess. With this code:
std::string str = "aaa";
str += cstr;
I end up with: aaaÍýýýýý««««««««îţ , but now those characters actually exist, as string.size() returns a length that includes this junk.
I can't work out why this is happening, but it must be connected with how the string is created, because something like char* cstr = "aaa" results in aaa without any additional junk, but trying to change a string initialized this way results in a memory access violation. Could someone explain this behavior to me, please? Thanks!
PS: My JavaScript failed to load, so if someone could format this post properly, I'd be glad!
Answer: Oh god! How could I forget that... thanks to all for the, well, immediate answers. The best one was from minitech, so I'll mark it as the answer as soon as my JavaScript loads :/
All C-style strings are null-terminated. So, a string initialized using new char[1] leaves you space for no characters. You can't set the first character to anything but \0, otherwise normal string operations will keep reading into memory until they find a zero. So use new char[2] instead.
When working with C-style strings you need to have a null terminator:
char* cstr = new char[2];
cstr[0] = 'X';
cstr[1] = '\0';
Having said all that, it is really bad code to do the above. Just use std::string unless you have a very good reason not to. It takes care of the memory allocations and deallocations for you.
C-style strings require a NUL ('\0') terminator; they don't have a length associated with them like C++ strings do. So your single-character string must be new char[2]; it will not be initialized; and you will need to make sure it's terminated with \0.
When you use new char[1], you request space for an array of characters. There is no request that said characters are initialized. Thus, the "junk" that you see is uninitialized memory. Before treating the array as a C-style string, you should do this:
cstr[0] = '\0';
C-style strings are null-terminated. So, to ignore any junk in memory you need to place a null byte ('\0') in the string body. Otherwise, library functions will read every byte from the start of your string until they meet a null byte in memory (which will be at some random position).
This also means that to have a C-style string of one character you actually need to allocate 2 bytes: one for the meaningful character and a second for '\0'.
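Putting the answers together, a minimal sketch of appending a one-character C-style string to a std::string might look like this. appendOneChar is a hypothetical helper, and a stack array replaces new char[2] so that nothing has to be freed.

```cpp
#include <cassert>
#include <string>

// Appends a single character to 'base' by way of a C-style string, to show
// the two-byte layout: one byte for the character, one for the terminator.
inline std::string appendOneChar(std::string base, char c) {
    char cstr[2];
    cstr[0] = c;
    cstr[1] = '\0';   // without this, += would keep reading past the array
    base += cstr;
    return base;
}
```

With base "aaa" and character 'X', the result has size 4 and no trailing junk. In real code base += c does the same job without the temporary array.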