Storing single characters

If I want to store a single character, say 'c', am I better off using
std::string myChar = 'c';
rather than the built in char type?
char myChar = 'c';
Is there any safety gained by storing single characters as strings?

There is a little safety gained as you won't accidentally use the string for calculations.
int a = 5+myChar;
This gives a compiler error if myChar is a string, but not if it is a char, because a char is treated as a number.
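The difference can be sketched as follows. The function name add_to_char is made up for illustration; it is not from the question.

```cpp
#include <string>

// Hypothetical helper: char is an integer type, so it is promoted to int
// and takes part in arithmetic without any diagnostic.
int add_to_char(char c) {
    return 5 + c;
}

// The std::string equivalent is rejected by the compiler, because there is
// no operator+ that takes an int and a std::string:
//
//   std::string s = "c";
//   int a = 5 + s;   // error: no match for operator+
```

On an ASCII-based system, add_to_char('c') returns 104, since 'c' has the code 99.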

Please note that the first example doesn't compile. It has to be
std::string myChar = "c";
(with double quotes). I see more disadvantages in this approach:
It will consume far more memory than required. With the short-string optimization the character will not be stored on the heap, but the string object itself is still typically three words or more (12 bytes with 4-byte words; common 64-bit implementations use 24 or 32 bytes), compared to one byte [1] when using char.
Access to that char is really inconvenient: you would always have to use .front(), .back() or [0] to get at it.
It doesn't convey the meaning of your variables. It's like replacing all int variables in your program with a std::vector<int> holding a single element.
The only "safety" I can see is, as AlexGeorg already mentioned, that you can't mistakenly use it in calculations. But that's it, and this could also be seen as a disadvantage.
So, no, you're most likely not better off using a string to store a single character, unless you have some really specific circumstances.
[1] Plus maybe some padding.
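The size difference is easy to check for your own toolchain. This is a minimal sketch; the constant names are made up for illustration, and the exact size of std::string depends on the standard library in use.

```cpp
#include <cstddef>
#include <string>

// sizeof(char) is 1 by definition in C++.
constexpr std::size_t char_size = sizeof(char);

// sizeof(std::string) is implementation-defined; common 64-bit standard
// libraries report 24 or 32 bytes, but treat those as typical values,
// not guarantees.
constexpr std::size_t string_size = sizeof(std::string);

static_assert(char_size == 1, "a char always occupies one byte");
static_assert(string_size > char_size,
              "a string object carries extra bookkeeping");
```

Printing string_size on the target platform shows the actual per-object cost there.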

The advantage of using a string is the compile-time error you get when you try to use the variable in an arithmetic expression, for example:
int sum = 15 + myChar;
On the other hand, there are some drawbacks to consider:
The first is performance: a string is more expensive than a char in both memory use and execution time.
The second is that a string does not guarantee that the variable holds exactly one character, so you have to be careful when you use it.

Related

What is the advantage of using gets(a) instead of cin.getline(a,20)?

We will have to define an array for storing the string either way.
char a[10];
And so suppose I want to store smcck in this array. What is the advantage of using gets(a)? My teacher said that the extra space in the array is wasted when we use cin.getline(a, 20), but that applies to gets(a) too, right?
Also, just an extra question: what exactly is stored in the empty "boxes" of an array?

gets() is a C function. It does no bounds checking and is considered dangerous; it was kept for years purely for compatibility, and it has since been removed from the C11 and C++14 standards.
The following link explains the problem in more detail:
http://www.gidnetwork.com/b-56.html
Don't mix C features into C++. Although nearly all C features work in C++, doing so is not recommended. If you are working in C++, avoid gets() and use getline() instead.

Well, I don't think gets(a) is better, because it does not check the size of the string. If you try to read a long string with it, it may cause a buffer overflow. That means it will fill all 10 elements you allocated and then keep writing into memory beyond the array, which may belong to other variables (this is undefined behavior and will probably make your application crash).
cin.getline() takes a size parameter that tells it not to read more than the expected number of characters. If you allocate an array with only 10 elements and tell it to read 20 characters, you get the same problem I described for gets().
As for how strings are represented in memory, if you put "smcck" in an array
char v[10];
The word takes the first 5 positions (0 to 4), and position 5 holds a null character (written '\0') that marks the end of the string. What comes after that in the array usually does not matter; those elements keep whatever values they had before. The null terminator marks where the string ends, so you can work with it safely.
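The layout described above can be checked with a short sketch. The helper name terminator_index is hypothetical and only serves the illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: stores "smcck" in a 10-element array and returns the
// index of the terminating null character.
std::size_t terminator_index() {
    char v[10];                 // elements are uninitialised at this point
    std::strcpy(v, "smcck");    // fills v[0]..v[4] and writes '\0' into v[5]
    // v[6]..v[9] were never written; their contents are indeterminate
    // and should not be read.
    return std::strlen(v);      // counts the characters before the first '\0'
}
```

The function returns 5: five characters, followed by the terminator at index 5.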

C string one character shorter than defined length?

Very new to c++ and I have the following code:
char input[3];
cout << "Enter input: ";
cin.getline(input,sizeof(input));
cout << input;
And entering something like abc will only output ab, cutting it one character short. The defined length is 3 characters so why is it only capturing 2 characters?

Remember that c-strings are null terminated. To store 3 characters you need to allocate space for 4 because of the null terminator.
Also, as @MikeSeymour mentioned in the comments, in C++ it is best to avoid the issue completely and use std::string.
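A minimal sketch of both approaches, reading from a std::istringstream instead of cin so that the behavior is easy to reproduce. The function names read_fixed and read_string are made up for illustration.

```cpp
#include <sstream>
#include <string>

// Fixed buffer of 3 chars: room for 2 characters plus the '\0' terminator,
// so getline stores at most 2 characters.
std::string read_fixed(const std::string& text) {
    std::istringstream in(text);
    char input[3];
    in.getline(input, sizeof(input));
    return input;
}

// std::string grows as needed, so the whole line is kept.
std::string read_string(const std::string& text) {
    std::istringstream in(text);
    std::string input;
    std::getline(in, input);
    return input;
}
```

With the input abc, read_fixed returns ab while read_string returns abc.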

You can thank your favorite deity that this fail-safe is in, most functions aren't that kind.
In C, strings are null-terminated, which means they take one more character than the actual data, to mark where the string ends.
Since you're using C++ anyway, you should avoid bare-bones char arrays. Some reasons:
Buffer overflows. You managed to hit this issue on your first try, take a hint!
Unicode awareness. We're living in 2015. Still being limited to 256 characters is unacceptable by any standard.
Memory safety. It's way harder to leak a proper string than a plain old array. Strings have strong copy semantics that cover pretty much anything you can think of.
Ease of use. You have the entire STL algorithm library at your disposal! Use it rather than rolling your own.
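As a small illustration of the last point, a std::string works directly with the standard algorithms. The helper name count_char is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: counts how often a character occurs in a string
// using std::count instead of a hand-written loop.
std::size_t count_char(const std::string& text, char c) {
    return static_cast<std::size_t>(std::count(text.begin(), text.end(), c));
}
```

For example, count_char("banana", 'a') returns 3.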

Overlapping strings

I have a problem with overlapping char*.
I'm working in a low-memory environment, namely Arduino and I would like to use the least memory possible. I want to be able to prepend a string with another and to do it without any copying of variables which wastes memory.
This is standard C or C++.
char* bigPacket = (char*)malloc(25); //Makes a big string of length 25
char* payload = bigPacket + 2; //This is part of the big string, 2 chars in.
bigPacket[0] = 72; // Letter 'H'
bigPacket[1] = 72; //I'm expecting the final bigPacket to read "HHHello, world"
payload = "Hello, World";
print(bigPacket);
But the problem is that it does not print "HHHello, world" as it should. Instead, it just prints "HH". Is there a proper way to make it be able to overlap these strings to print "HHHello, world"?

You changed where payload points. What you needed to do was leave payload alone and change the data it points to.
strcpy(payload, "Hello World");
Edit: If you really want to avoid copies you'd end up with something like the SGI Rope class. But you'd pay a lot in code complexity.
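The difference between reassigning the pointer and copying through it can be sketched as below. The function names are made up, a zero-filled stack array stands in for the malloc'd block so the result is predictable, and std::string is used only to return the result on a desktop compiler; it is an assumption of this sketch, not something the Arduino code needs.

```cpp
#include <cstring>
#include <string>

// Reassigning the pointer: the buffer never receives the text.
std::string reassign_pointer() {
    char bigPacket[25] = {};              // zero-filled buffer
    const char* payload = bigPacket + 2;  // points into the buffer
    bigPacket[0] = 'H';
    bigPacket[1] = 'H';
    payload = "Hello, World";             // payload now points at the literal
    (void)payload;                        // the buffer itself is unchanged
    return bigPacket;                     // still just "HH"
}

// Copying through the pointer: the text lands inside the buffer.
std::string copy_into_buffer() {
    char bigPacket[25] = {};
    char* payload = bigPacket + 2;
    bigPacket[0] = 'H';
    bigPacket[1] = 'H';
    std::strcpy(payload, "Hello, World"); // 12 characters plus '\0' fit in 23 bytes
    return bigPacket;                     // "HHHello, World"
}
```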

If you want to do this without either very complicated code or multiple copies of data, destroying the benefit, you need to have the complete string as one literal in your program: "HHHelloWorld". You can then play with pointers and lengths to access various parts of it, but remember there is only one null byte, at the end of the string.
However, I suspect that this is an over-optimization. Arduino programming rarely involves a lot of very long strings. It is important to keep the code simple and direct.

You should not mess with pointers for something like that. Instead, you should store string literals in flash rather than in SRAM. This is usually done with the help of the PROGMEM macros. Often the "F" macro is sufficient, though. Then you can copy your strings, as and if needed, into a suitable buffer.
Simplest example:
Serial.println(F("this is text from flash memory"));

You just assign the payload pointer to point to the constant string; you do not copy the string into the memory it currently points to.
In order to copy the string you need to use strcpy or memcpy:
char *bigPacket = (char*)malloc(25); // the cast is required in C++
bigPacket[0] = bigPacket[1] = 72; // letter 'H'
strcpy(bigPacket + 2, "Hello, World"); // copy the text into the buffer
print(bigPacket);
Note that this is rather unlikely to save memory, since "Hello, World" will still exist as a constant string in your program. To save memory, it is probably most efficient to call print multiple times.
However, I guess that is not possible in this case.

What's the purpose of the char data type in C++? Why not just use strings? [closed]

I don't see the point of reserving single quotes for single characters. So, when is this used, e.g. char a = 'a';? Coming from JavaScript and PHP, I'm used to being able to use single quotes for entire strings, but after learning that single quotes are reserved for characters in C++, I'm curious to know why. Why would you need a single character?

PHP and JavaScript are languages that operate at quite a high level. This means that the basic types are essentially just a few different types, whose implementation is hidden inside a set of functions in the actual script engine.
C and C++, as well as most other low-level languages, expose more of "how the machine works". A string, in C, is a sequence of characters. If you want to deal with strings, you need to be able to deal with their components, which are of type char. A single character becomes useful when you want to build strings, compare the contents of strings, etc. Naturally, for normal string operations in C++, you'd use std::string, and then, like in script languages, most aspects of how the string is actually represented are hidden inside the std::string class implementation, so you don't really need to care about them. But if you were to "look inside" a std::string, it would sooner or later become a char *, which is a pointer to a piece of memory that contains a sequence of characters, individual char elements.
One could look at it like going from having "ready made big lumps of Lego" to having only small pieces to work with. You can still build the same things, but it requires more pieces, and requires a bit more effort to construct. What you win is flexibility and speed. A char is really easy to deal with for the processor, where a single character in PHP is still represented as a string - it just happens to be one element long. As such, there is extra overhead in keeping track of this one character string, where it's stored, how long it is, etc, because the functionality in the language doesn't make any distinction between a single character and a string of a megabyte.
The purpose of C, and to a large degree also C++, is to closely represent the hardware. So your basic types are much closer to what the actual hardware representation is, and this is something you will need to learn more about if you are going to understand C and C++ well. Unfortunately, to cover ALL of that would be far beyond a single answer in SO. You will need to get yourself a good C and/or C++ book.

To make things run faster and more efficiently, you have to learn to use no more space than needed. An 'a' is actually a number (see the ASCII table), but "a" is an array of 2 characters, {'a','\0'}. The zero marks the end of the string, because otherwise the computer has no way of knowing where it ends. Would you rather add a length property, as in JavaScript, to know the string's length directly? Then you use extra space that may not be needed. The language has to distinguish these two kinds of values somehow for code to run efficiently. By learning C/C++ first, you learn how things work at the low level of your computer and understand more than by learning PHP/JavaScript/Ruby/Python first. C++ is more customizable than higher-level programming languages.

JavaScript and PHP are scripting languages, while C++ (and especially its predecessor C) is a quite low-level native programming language where you have to consider how your variables are stored in memory.
char a = 'a'; creates an 8-bit numeric variable (8 bits on virtually all platforms) that can hold a character value (ASCII code) and puts the value of the character a into it. So, with ASCII, char a = 97; does the same job.
const char* s = "a"; creates a null-terminated string, which is an array with two elements: the value of the character a and the terminating 0 character (just the number 0, or the character '\0'). The * means we create a pointer to the array. The element type is const char because a string literal is constant. We could create an identical array using const char s[2] = { 97, 0 }; or const char s[2] = { 'a', '\0' };.
By the way, single quotes are not reserved exclusively for single characters. You can put a few characters into single quotes. See What do single quotes do in C++ when used on multiple characters?.
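The two representations can be compared directly. This is a minimal sketch; the names a, s and literal_size are chosen for illustration.

```cpp
#include <cstddef>

// A character literal is a single numeric value.
constexpr char a = 'a';

// A string literal is an array: the character plus the terminating '\0'.
constexpr std::size_t literal_size = sizeof("a");

// The same two-element array written out by hand.
constexpr char s[2] = { 'a', '\0' };

static_assert(sizeof(a) == 1, "one char is one byte");
static_assert(literal_size == 2, "the literal holds 'a' and the terminator");
```

On an ASCII-based system the value of a is 97.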

C++ inherited its scalar types from C, so you should get a good book about C++ or C for the details.
The type char is a numeric type. It's an integer that can hold a character. Plain char is usually signed, but this can be overridden with the signed or unsigned keyword. In addition, many compilers let you configure the signedness of plain char. The 'A' literal defines a number that is identical to the code of the character A.
The "a" string is an array of chars with a zero terminator. In the example you have two bytes, 'a' and '\0'. When you use the literal, the compiler passes the address as a pointer to the array. This pointer can be assigned to a pointer variable
char *s = "A";
or passed to a function
foo("A");
Additionally, all the usual pointer arithmetic is possible, which you will grasp once you understand what a char array is.
Edit: Both the number literal 'a' and the char array "A" are const objects. It's obvious that you can't assign anything to a number like
'a' = 23; // wrong!
Once you have assigned the literal to a variable, you can change the variable later, though.
But when you have stored the pointer in a pointer variable, modifying the char array through it is illegal and causes undefined behavior:
char *s = "A";
*s = 'B'; // try to change the first byte of the char array, causes undefined behavior.
To express this it's good style to use a pointer to const char variable:
const char *s = "A";
*s = 'B'; // compiler diagnostic says: not allowed
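If the text really has to be modified, one option is to initialise a char array from the literal, which gives a private, writable copy. This is a minimal sketch with a made-up function name; std::string is used only to return the result.

```cpp
#include <string>

// Hypothetical helper: the array s is a copy of the literal's characters,
// so writing to it is well defined.
std::string modify_copy() {
    char s[] = "A";   // two elements: 'A' and '\0', owned by this function
    s[0] = 'B';       // changes the copy, not the literal
    return s;
}
```

Here modify_copy() returns "B", and the literal "A" itself is never touched.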

Is it better to use std::string or single char when possible?

Is it better to use std::string or single char when possible?
In my class I want to store certain characters. I have a CsvReader
class, and I want to store a columnDelimiter character. I wonder:
is it better to have it as a char, or just use std::string?
In terms of usage I suppose std::string is far better, but I wonder
whether there will be major performance differences.

If your delimiter is constrained to be a single character, use a char.
If your delimiter may be a string, use a std::string.
Seems fairly self-explanatory. Refer to the requirements of the project, and the constraints of the feature that follow from those requirements.
Personally it seems to me that a CSV field delimiter will always be a single character, in which case std::string is not only misleading, but pointlessly heavy.
"In terms of usage I suppose std::string is far better"
I have largely ignored this claim as you did not provide any rationale, but let me just say that I reject the hypothetical premise of the claim.
"I wonder maybe there will be major performance differences?"
Absolutely! A string manages a dynamically allocated block of characters; this is far heavier than a single byte in memory. Notwithstanding the small-string optimisation that your implementation may perform, it's simply pointless to add all this weight when all you wish to represent is a single character. A single character is a char, so use a char in such a case.
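A minimal sketch of what that could look like. Only the names CsvReader and columnDelimiter come from the question; the constructor, the splitLine member and the naive splitting (no quoting or escaping, and a trailing empty field is dropped) are assumptions made for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

class CsvReader {
public:
    // The delimiter is a single character, so a char is all that is stored.
    explicit CsvReader(char columnDelimiter = ',')
        : columnDelimiter_(columnDelimiter) {}

    // Hypothetical member: splits one line at every delimiter.
    std::vector<std::string> splitLine(const std::string& line) const {
        std::vector<std::string> fields;
        std::istringstream in(line);
        std::string field;
        // std::getline accepts the delimiter directly as a char.
        while (std::getline(in, field, columnDelimiter_)) {
            fields.push_back(field);
        }
        return fields;
    }

private:
    char columnDelimiter_;
};
```

For example, CsvReader(';').splitLine("a;b;c") yields the three fields a, b and c.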

A character is a character. A string is a string; conceptually, a sequence of N characters, where N is any natural number.
If your design requires a character, use char. If it requires a string, use string.
In both cases you may have multilanguage issues (what happens if the character is 青? what happens if the string is 青い?), but these are totally independent of your choice of whether you need a character or a sequence of N characters, i.e. a string.
