What is a good, safe way to extract a specific number of characters from a std::basic_istream and store it in a std::string?
In the following program I use a char[] to eventually obtain result, but I would like to avoid the POD types and ensure something safer and more maintainable:
#include <sstream>
#include <string>
#include <iostream>
#include <stdexcept>
int main()
{
std::stringstream inss{std::string{R"(some/path/to/a/file/is/stored/in/50/chars Other data starts here.)"}};
char arr[50]{};
if (!inss.read(arr,50))
throw std::runtime_error("Could not read enough characters.\n");
//std::string result{arr}; // Will probably copy past the end of arr
std::string result{arr,arr+50};
std::cout << "Path is: " << result << '\n';
std::cout << "stringstream still has: " << inss.str() << '\n';
return 0;
}
Alternatives:
Convert the entire stream to a string up front: std::string{inss.str()}
This seems wasteful as it would make a copy of the entire stream.
Write a template function to accept the char[]
This would still use an intermediate POD array.
Use std::basic_istream::get in a loop to read the required number of characters together with std::basic_string::push_back
The loop seems a bit unwieldy, but it does avoid the array.
Just read it directly into the result string.
#include <sstream>
#include <string>
#include <iostream>
#include <stdexcept>
int main()
{
std::stringstream inss{std::string{R"(some/path/to/a/file/is/stored/in/50/chars Other data starts here.)"}};
std::string result(50, '\0');
if (!inss.read(&result[0], result.size()))
throw std::runtime_error("Could not read enough characters.\n");
std::cout << "Path is: " << result << '\n';
std::cout << "stringstream still has: " << inss.str() << '\n';
return 0;
}
Since C++11, std::string comes with the following guarantee about its memory layout (from cppreference):
The elements of a basic_string are stored contiguously, that is, for a basic_string s, &*(s.begin() + n) == &*s.begin() + n for any n in [0, s.size()), or, equivalently, a pointer to s[0] can be passed to functions that expect a pointer to the first element of a CharT[] array.
(since C++11)
Related
I am attempting to extract a hash-digest in hexadecimal via a stringstream, but I cannot get it to work when iterating over data.
Using std::hex I can do this easily with normal integer literals, like this:
#include <sstream>
#include <iostream>
std::stringstream my_stream;
my_stream << std::hex;
my_stream << 100;
std::cout << my_stream.str() << std::endl; // Prints "64"
However when I try to push in data from a digest it just interprets the data as characters and pushes them into the stringstream. Here is the function:
#include <sstream>
#include <sha.h> // Crypto++ library required
std::string hash_string(const std::string& message) {
using namespace CryptoPP;
std::stringstream buffer;
byte digest[SHA256::DIGESTSIZE]; // 32 bytes or 256 bits
static SHA256 local_hash;
local_hash.CalculateDigest(digest, reinterpret_cast<byte*>(
const_cast<char*>(message.data())),
message.length());
// PROBLEMATIC PART
buffer << std::hex;
for (size_t i = 0; i < SHA256::DIGESTSIZE; i++) {
buffer << *(digest+i);
}
return buffer.str();
}
The type byte is just a typedef of unsigned char so I do not see why this would not input correctly. Printing the return value using std::cout gives the ASCII mess of normal character interpretation. Why does it work in the first case, and not in the second case?
Example:
std::string my_hash = hash_string("hello");
std::cout << my_hash << std::endl; // Prints: ",≥M║_░ú♫&Φ;*┼╣Γ₧←▬▲\▼ºB^s♦3bôïÿ$"
First, the std::hex format modifier applies to integers, not to characters. Since you are trying to print unsigned char, the format modifier is not applied. You can fix this by casting to int instead. In your first example, it works because the literal 100 is interpreted as an integer. If you replace 100 with e.g. static_cast<unsigned char>(100), you would no longer get the hexadecimal representation.
Second, std::hex is not enough, since you likely want to pad each character to a 2-digit hex value (i.e. F should be printed as 0F). You can fix this by also applying the format modifiers std::setfill('0') and std::setw(2) (reference, reference).
Applying these modifications, your code would then look like this:
#include <iomanip>
...
// std::hex and std::setfill persist; std::setw resets after each output, so set it per byte
buffer << std::hex << std::setfill('0');
for (size_t i = 0; i < SHA256::DIGESTSIZE; i++) {
buffer << std::setw(2) << static_cast<int>(*(digest+i));
}
I'm trying to save some string via string_view to second data container but run into some difficulties.
It turns out that string changes its underlying data storage after move()'ing it.
And my question is, why does it happen?
Example:
#include <iostream>
#include <string>
#include <string_view>
using namespace std;
int main() {
string a_str = "abc";
cout << "a_str data pointer: " << (void *) a_str.data() << endl;
string_view a_sv = a_str;
string b_str = move(a_str);
cout << "b_str data pointer: " << (void *) b_str.data() << endl;
cout << "a_sv: " << a_sv << endl;
}
Output:
a_str data pointer: 0x63fdf0
b_str data pointer: 0x63fdc0
a_sv: bc
Thanks for your replies!
What you are seeing is a consequence of short string optimization. In the most basic sense, there is an array in the string object to save a call to new for small strings. Since the array is a member of the class, it has to have its own address in each object, and when you move a string that is stored in the array, a copy happens.
The string "abc" is short enough for short string optimization. See What are the mechanics of short string optimization in libc++?
If you change it to a longer string you will see the same address.
I was programming some test cases and noticed an odd behaviour.
A move assignment to a string did not erase the value of the first string, but assigned the value of the target string.
sample code:
#include <utility>
#include <string>
#include <iostream>
int main(void) {
std::string a = "foo";
std::string b = "bar";
std::cout << a << std::endl;
b = std::move(a);
std::cout << a << std::endl;
return 0;
}
result:
$ ./string.exe
foo
bar
expected result:
$ ./string.exe
foo
So to my questions:
Is that intentional?
Does this happen only with strings and/or STL objects?
Does this happen with custom objects (as in user defined)?
Environment:
Win10 64bit
msys2
g++ 5.2
EDIT
After reading the possible duplicate answer and the answer by @OMGtechy
I extended the test to check for small string optimizations.
#include <utility>
#include <string>
#include <iostream>
#include <cinttypes>
#include <sstream>
int main(void) {
std::ostringstream oss1;
oss1 << "foo ";
std::ostringstream oss2;
oss2 << "bar ";
for (std::uint64_t i(0);;++i) {
oss1 << i % 10;
oss2 << i % 10;
std::string a = oss1.str();
std::string b = oss2.str();
b = std::move(a);
if (a.size() < i) {
std::cout << "move operation origin was cleared at: " << i << std::endl;
break;
}
if (0 == i % 1000)
std::cout << i << std::endl;
}
return 0;
}
This ran on my machine up to 1 MB, which is not a small string anymore.
And it just stopped, so I could paste the source here (read: I stopped it).
This is likely due to short string optimization; i.e. there's no internal pointer to "move" over, so it ends up acting just like a copy.
I suggest you try this with a string that has a large number of characters; this should be enough to get around short string optimization and exhibit the behaviour you expected.
This is perfectly valid, because the C++ standard states that moved-from objects of standard library types (with some exceptions; strings are not one of them as of C++11) are left in a valid but unspecified state.
I'm a C++ programmer, who's still in the nest, and not yet found my wings. I was writing a Calendar program, and I discovered, that C++ does not support a string type. How do I make an Array, that will be able to store strings of characters?
I've thought of creating an enumerated data type, as the array type. While, it will work, for my Calendar, it won't work if say I was creating a database of the names of students in my class.
http://prntscr.com/7m074w I got; "error, 'string' does not name a type."
that C++ does not support a string type.
Wrong info: you can create a character array as follows
char array[length];
//Where length should be a constant integer
Otherwise you can depend on standard template library container, std::string
If you have C++11 compiler you can depend on std::array
The C++ Standard Library includes a string type, std::string. See http://en.cppreference.com/w/cpp/string/basic_string
The Standard Library also provides a fixed-size array type, std::array. See http://en.cppreference.com/w/cpp/container/array
But you may also want to learn about the dynamically-sized array type, std::vector. See http://en.cppreference.com/w/cpp/container/vector
The language also includes legacy support for c-strings and c-arrays, which you can find in a good C++ or C book. See The Definitive C++ Book Guide and List
An example of how to use an array/vector of strings:
#include <string>
#include <array>
#include <vector>
#include <iostream>
int main() {
std::array<std::string, 3> stringarray;
stringarray[0] = "hello";
stringarray[1] = "world";
// stringarray[2] contains an empty string.
for (size_t i = 0; i < stringarray.size(); ++i) {
std::cout << "stringarray[" << i << "] = " << stringarray[i] << "\n";
}
// Using a vector, which has a variable size.
std::vector<std::string> stringvec;
stringvec.push_back("world");
stringvec.insert(stringvec.begin(), "hello");
stringvec.push_back("greetings");
stringvec.push_back("little bird");
std::cout << "size " << stringvec.size()
<< "capacity " << stringvec.capacity()
<< "empty? " << (stringvec.empty() ? "yes" : "no")
<< "\n";
// remove the last element
stringvec.pop_back();
std::cout << "size " << stringvec.size()
<< "capacity " << stringvec.capacity()
<< "empty? " << (stringvec.empty() ? "yes" : "no")
<< "\n";
std::cout << "stringvec: ";
for (auto& str : stringvec) {
std::cout << "'" << str << "' ";
}
std::cout << "\n";
// iterators and string concatenation
std::string greeting = "";
for (auto it = stringvec.begin(); it != stringvec.end(); ++it) {
if (!greeting.empty()) // add a space between words
greeting += ' ';
greeting += *it;
}
std::cout << "stringvec combined :- " << greeting << "\n";
}
Live demo: http://ideone.com/LWYevW
You can create an array of characters by char name[length];.
C++ also has a string data type. You can create an array of strings and store whatever values you'd like.
So
use array of characters
use string data type
For Example -
#include <iostream>
#include <string>
int main ()
{
//To Create a String
std::string s0 ("Initial string");
return 0;
}
C++ does have a string type: string from #include <string>
If you don't want to use that, you can also use const char* name = "YourTextHere..." or char name[length+1] = "YourTextHere"
I'm developing an application and my idea is to store "apps" in files, like executables. This is what I have so far:
AppWriter.c
#include <iostream>
#include <vector>
#include <time.h>
#include <functional>
struct PROGRAM
{
std::vector<int> RandomStuff;
std::vector<std::function<void()>> Functions;
std::function<void()> MAIN;
} CODED;
void RANDOMFUNC()
{
srand(time(NULL));
for(int i = 0; i < 40; i++)
CODED.RandomStuff.push_back(rand() % 254);
}
void LOGARRAY()
{
for(int i = 0; i < CODED.RandomStuff.size(); i++)
std::cout << "["<< i + 1 <<"]: "<< CODED.RandomStuff[i] << std::endl;
}
void PROGRAMMAIN()
{
std::cout << "Hello i call random function!" << std::endl;
CODED.Functions[0]();
CODED.Functions[1]();
}
void main()
{
CODED.MAIN = PROGRAMMAIN;
CODED.Functions.push_back(RANDOMFUNC);
CODED.Functions.push_back(LOGARRAY);
std::cout << "Testing MAIN" << std::endl;
CODED.MAIN();
FILE *file = fopen("TEST_PROGRAM.TRI","wb+");
fwrite(&CODED,sizeof(CODED),1,file);
fclose(file);
std::cout << "Program writted correctly!" << std::endl;
_sleep(10000);
}
AppReader.c
#include <iostream>
#include <vector>
#include <time.h>
#include <functional>
struct PROGRAM
{
std::vector<int> RandomStuff;
std::vector<std::function<void()>> Functions;
std::function<void()> MAIN;
} DUMPED;
void main()
{
FILE *file = fopen("TEST_PROGRAM.TRI","rb+");
fseek(file,0,SEEK_END);
int program_len = ftell(file);
rewind(file);
fread(&DUMPED,sizeof(PROGRAM),1,file);
std::cout
<< "Function array size: " << DUMPED.Functions.size() << std::endl
<< "Random Stuff Array size: " << DUMPED.RandomStuff.size() << std::endl;
DUMPED.MAIN();
}
When I run AppReader, the functions don't work (maybe because std::function is like a void pointer?). With the arrays, or if I add plain variables, I can see in the debugger that the data is stored correctly (that is why I tried the vector of functions), but calling the functions fails with an error raised inside the <functional> header. Any ideas how I can do this?
This is never going to work. At all. Ever. std::function is a complex type. Binary reads and writes don't work for complex types. They never can. You would have to ask for functions in a pre-defined serializable format, like LLVM IR.
Your problem is that you're storing information about functions that exist in one executable, then trying to run them in a separate executable. Other than that, your code does work, but as DeadMG says, you shouldn't be storing complex types in a file. Here's how I modified your code to prove that your code works if run within a single executable:
#include <iostream>
#include <vector>
#include <time.h>
#include <functional>
struct PROGRAM
{
std::vector<int> RandomStuff;
std::vector<std::function<void()>> Functions;
std::function<void()> MAIN;
} CODED;
void RANDOMFUNC()
{
srand(time(NULL));
for(int i = 0; i < 40; i++)
CODED.RandomStuff.push_back(rand() % 254);
}
void LOGARRAY()
{
for(int i = 0; i < CODED.RandomStuff.size(); i++)
std::cout << "["<< i + 1 <<"]: "<< CODED.RandomStuff[i] << std::endl;
}
void PROGRAMMAIN()
{
std::cout << "Hello i call random function!" << std::endl;
CODED.Functions[0]();
CODED.Functions[1]();
}
int main()
{
CODED.MAIN = PROGRAMMAIN;
CODED.Functions.push_back(RANDOMFUNC);
CODED.Functions.push_back(LOGARRAY);
std::cout << "Testing MAIN" << std::endl;
CODED.MAIN();
FILE *file = fopen("TEST_PROGRAM.TRI","wb+");
fwrite(&CODED,sizeof(CODED),1,file);
fclose(file);
std::cout << "Program writted correctly!" << std::endl;
// _sleep(10000);
std::cout << "---------------------\n";
file = fopen("TEST_PROGRAM.TRI","rb+");
fseek(file,0,SEEK_END);
int program_len = ftell(file);
rewind(file);
fread(&CODED,sizeof(PROGRAM),1,file);
std::cout
<< "Function array size: " << CODED.Functions.size() << std::endl
<< "Random Stuff Array size: " << CODED.RandomStuff.size() << std::endl;
CODED.MAIN();
}
The problem is not that you're storing complex types via binary read/write, per se. (Although that is a problem, it's not the cause of the problem you posted this question about.) Your problem is that your data structures are storing information about the functions that exist in your 'writer' executable. Those same functions don't even exist in your 'reader' executable, but even if they did, they likely wouldn't be at the same address. Your data structures are storing, via std::function, pointers to the addresses where the functions exist in your 'writer' executable. When you try to call these non-existent functions in your 'reader' executable, your code happily tries to call them but you get a segfault (or whatever error your OS gives) because that's not the start of a valid function in your 'reader' executable.
Now with regard to writing complex types (e.g. std::vector) directly to a file in binary format: doing so "works" in the sample code above only because the binary copies of the std::vectors contain pointers that, once read back in, still point to valid data from the original std::vectors which you wrote out. Note that you didn't write the vectors' actual data; you only wrote their metadata, which probably includes things like the length of the vector, the amount of memory currently allocated for the vector, and a pointer to the vector's data. When you read that back, the metadata is correct except for one thing: any pointers in it point to addresses that were valid when you wrote the data, but which may not be valid now.
In the case of the sample code above, the pointers end up pointing to the same (still valid) data from the original vectors. But there's still a problem here: you now have more than one std::vector that thinks it owns that memory. When one of them is destroyed, it will free the memory that the other vector expects to still exist. And when the other vector is destroyed, it will cause a double delete. That opens the door to all kinds of UB. E.g. that memory could have been allocated for another purpose by that time, and now the second delete will free that other purpose's memory, or else the memory has NOT been allocated for another purpose and the second delete may corrupt the heap.
To fix this, you'd have to serialize out the essence of each vector, rather than its binary representation, and when reading it back in, you'd have to reconstruct an equivalent copy, rather than simply reconstitute a copy from the binary image of the original.