I'm going to keep this question very simple. I'm learning C++ and I've come across stringstreams. I understand that their main usage is to have variables input into them so they can later output the value they hold using str() as a string. My question is - what's the point of this? This sounds like a very fancy way of just concatenating a bunch of variables in a string object using the + operator. Does it have more to it than that or is it just so it confuses noobs and causes them to fail their exams?
Well, one problem is that you cannot "concatenate a bunch of variables in a string using the + operator" (only other strings or char*s).
So, how are you going to turn all your objects into strings? Unlike Java, with its toString() method, C++ has no convention of a to_string() member. On the other hand, every class interested in using iostream will define a stream inserter (std::ostream& operator<<(std::ostream& os, const MyClass& foo)) and maybe a stream extractor (std::istream& operator>>(std::istream& is, MyClass& foo)).
So, you use the stream inserters to convert objects to text. Sometimes you don't want to write to the console or to a file, but instead you want to store it as a string.
Also, using the iostream framework lets you use the manipulators to control precision, width, numerical base, and so on, instead of trying to do all that manually as you construct a string.
Now, that's not to say that the stringstream solution is ideal: in fact, a lot of libraries exist to do the same sort of task better (including at least Boost.Format, Boost.Convert, Boost.Lexical_Cast, and Boost.Spirit just in Boost.)
If you have:
int a = 3;
std::string str = "hello";
MyObject obj;
Then:
std::string concat = a + str + obj;
std::string objstr = obj;
won't work, while:
std::stringstream stream;
stream << a << str << obj;
std::string concat = stream.str();
std::stringstream stream2;
stream2 << obj;
std::string objstr = stream2.str();
Will work (at least if MyObject defines an operator<<). That's the whole point of std::stringstream: make it easy to redirect "anything" to a string.
Any object that can be redirected to a std::ostream (std::fstream, std::cout...) can also be redirected to a std::stringstream (as it derives from std::ostream too). Then you just need to declare one std::ostream redirection operator (operator<<) and it can be used to redirect the object everywhere (file, console, but also string...).
The whole point is that you could declare an operator+ and operator+= to make it possible to concatenate your object to a std::string. But then, if you also wish to redirect it to a stream (file, cout), you'll have to declare 3 operators (operator+, operator+= and finally operator<< for streams), all doing almost the same thing. In the end, thanks to std::stringstream, having only one single operator (operator<<) is enough to redirect to file, cout and string.
What's the point of stringstream?
It is a flexible and fast stream, and works as a simulator for other (comparatively) slow streams. I use stringstream for lots of different things.
My favorite use is for testing. I create a std::istringstream and fill it with test data. This allows me to create the test data 'file' using the same editor with which I code (and no actual file polluting my work dir). Adding more test cases is remarkably less time-consuming.
// 012345678901234567890123456789012345678901234567890
std::istringstream iss("7 ((23, 342), (17, 234), (335, 159), (10, 10))");
// ---|^^^^^^^|^v|vvvvvvv|v^|^^^^^^^^|^v|vvvvvv|^
// 1 2 3 4
// echo to user of test
std::cout << "Test Data Input: " << iss.str() << std::endl ;
// code under test starts here ...
int nTowns = 1;
char lparen0 = 2;
iss >> nTowns // NOTE : formatted input drops whitespace
>> lparen0;
// and continues with echo of post conversion
std::cout << " 0: " << nTowns << ' ' << lparen0 << std::endl;
// then more of the same - until record is completely read.
I have used stringstream to build a screen update, and used that dynamically changing string to 'measure' the banner width and compute where to start the placement on the screen so that the banner is centered (or left, or right):
static void centerUpdateScreenBanner( uint64_t gen,
int pdMS,
int changes,
TC_t& tc)
{
// build contents of screen banner update
std::stringstream ss;
ss << std::setw(3) << pdMS << " "
<< std::setw(4) << gen << " "
<< std::setw(4) << changes;
// compute start column placement for centering
int rCol = tc.maxCol -
static_cast<int>(ss.str().size()) +
tc.DfltIndnt-1;
// send banner to terminal device for user info
tc.termRef << Ansi_t::gotoRC(0, rCol) // top right
<< ss.str() << std::flush;
}
Inside this linked list, I use a stringstream to absorb and recursively concatenate the list nodes' info. While this builds recursively, other cout or cerr output can proceed unhindered, as if on a 3rd channel.
std::string showR(void) {
std::stringstream ss;
ss << m_payload->show(); // this node
if(m_next) ss << m_next->showR(); // subsequent nodes
return (ss.str());
}
Summary: I have found std::stringstream to be very useful.
Related
In this minimal example, the input to a stringstream gets strangely mixed up with the content of a previously used cout:
online gdb:
https://onlinegdb.com/itO69QGAE
code:
#include <string>
#include <iostream>
#include <sstream>
using namespace std;
const char sepa[] = {':', ' '};
const char crlf[] = {'\r', '\n'};
int main()
{
cout<<"Hello World" << endl;
stringstream s;
string test1 = "test_01";
string test2 = "test_02";
s << test1;
cout << s.str() << endl;
// works as expected
// expecting: "test_01"
// output: "test_01"
s << sepa;
cout << s.str() << endl;
// messing up with previous cout output
// expecting: "test_01: "
// output: "test_01: \nHello World"
s << test2;
cout << s.str() << endl;
// s seems to be polluted
// expecting: "test_01: test_02"
// output: "test_01: \nHello Worldtest_02"
s << crlf;
cout << s.str() << endl;
// once again messing up with the cout content
// expecting: "test_01: test_02\r\n"
// output: "test_01: Hello Worldtest_02\r\nHello World"
return 0;
}
So I am wondering why this is happening.
As it only happens when a char array is pushed into the stringstream, it is likely related to that... but according to the reference, the stringstream's << operator can/should handle char* (which is what the name of this array actually stands for).
Besides that, there seems to be a (hidden, or at least not obvious) relation between stringstream and cout. So why does the cout content leak into the stringstream?
Is there any wrong/foolish usage in this example, or where is the dog buried (a German idiom :P)?
Best regards and thanks
Damian
P.S. My question is not about "fixing" this issue, e.g. by using a string instead of the char array (this will work)... it's about comprehending the internal mechanics and why this is actually happening, because for me this is just unexpected behaviour.
The std::stringstream::str() function returns a string containing all characters previously written into the stream, in all previous calls to operator<< (or other output functions). However it seems that you expect that only the last output operation will be returned - this is not the case.
This is analogous to how e.g. std::cout works: each invocation of std::cout << appends the string to standard output; it does not clear the console's screen.
To achieve what you want, you either need to use a separate std::stringstream instance every time:
std::stringstream s1;
s1 << test1;
std::cout << s1.str() << std::endl;
std::stringstream s2;
s2 << sepa;
std::cout << s2.str() << std::endl;
Or better, clear the contents of the std::stringstream using the single argument overload of the str() function:
std::stringstream s;
s << test1;
std::cout << s.str() << std::endl;
// reset the contents of s to an empty string
s.str("");
s << sepa;
std::cout << s.str() << std::endl;
The s.str("") call effectively discards all characters previously written into the stream.
Note that even though std::stringstream has a clear() function that would seem a better candidate, it is not analogous to e.g. std::string::clear() or std::vector::clear(): it only resets the stream's error state flags and won't yield the effect desired in your case.
Here I am again,
Thanks to "Some programmer dude"'s comment I think I figured it out:
As there is no null terminator in either char array, it seems that the stringstream's << operator keeps inserting until it stumbles over a null terminator '\0'.
Either extending both arrays with a \0 symbol (e.g. const char sepa[] = {':', ' ', '\0'}) or fixing the length with e.g. s << string(sepa, 2) produces the expected output.
In the specific case above, the data happens to be laid out in memory so that the next null terminator is the one ending the "Hello World" literal used in the cout statement. As this layout is not guaranteed, reading past the end of the array is actually undefined behaviour when the terminator is missing.
So two additional "terminating" arrays, e.g. const char sepa[] = {':', ' '}; const char end_of_sepa[] = {'\0'}; declared right after the mentioned arrays, also produce the expected output even when the rest is left unchanged... but this is not guaranteed and depends on the layout in memory.
P.S. As previously written this issue is not about fixing but comprehension. So please feel free to confirm or correct my assumption.
EDIT: Corrected the bold code section.
My name is Jose. I need help with a project. I need to handle .csv files in C++. The file contains nit, date and amount spent. The program must accumulate the purchase totals by NIT and must print on screen:
Sum NITs:
Average NITs
Min NITs
Max NITs
Count NITs
The following are links to the CSV files with NIT, date, and total spent:
I am trying to create output similar to:
My current code is:
#include<iostream>
#include<fstream>
#include<string>
#include<cstdlib>
#include<vector>
#include<sstream>
using namespace std;
void mostrar_csv();
int main()
{
mostrar_csv();
system("pause");
return 0;
}
void mostrar_csv()
{
ifstream archivo("archivo.csv");
string linea = "";
string escritura = "";
vector<string> vect;
while (getline(archivo, linea))
{
stringstream dato(linea);
while (getline(dato, escritura, ';'))
{
vect.push_back(escritura);
}
}
for (size_t i = 0; i < vect.size(); i++)
{ // .size() is literally a method; it returns the size of the vector
cout << i + 1 << ".-- " << vect.at(i) << "\n";
}
cout << "\n\n";
cout << "the size is " << " " << vect.size() << " \n\n ";
}
See a full description below.
But first the example code (one of many possible solutions):
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <numeric>
#include <iterator>
#include <regex>
#include <map>
#include <tuple>
#include <algorithm>
#include <iomanip>
std::regex delimiter(",");
using Data = std::tuple<unsigned long, std::string, double>;
int main() {
// Open the file and check if it could be opened
if (std::ifstream csvFileStream{ "r:\\archivo.csv" }; csvFileStream) {
// Here we will store all data
std::vector<Data> data;
// Now read every line of the file until eof
for (std::string line{}; std::getline(csvFileStream, line); ) {
// Split the line into tokens
std::vector token(std::sregex_token_iterator(line.begin(), line.end(), delimiter, -1), {});
// Add to our data vector
data.emplace_back(Data{ std::stoul(token[0]), std::string(token[1]), std::stod(token[2]) });
}
// Now we want to aggregate the data. Get the sum over all
const double sum = std::accumulate(data.begin(), data.end(), 0.0, [](double v, const Data& d) { return v + std::get<2>(d); });
// Get the average over all
const double average = sum / data.size();
// Get the min and max value over all.
const auto [min, max] = std::minmax_element(data.begin(), data.end(), [](const Data& d1, const Data& d2) { return std::get<2>(d1) < std::get<2>(d2); });
// Next, we want to group based on NIT
std::map<unsigned long, double> groups{};
for (const Data& d : data) groups[std::get<0>(d)] += std::get<2>(d);
// Generate output
std::cout << "No. NIT Total Vendido\n";
unsigned int no{ 1U };
for (const auto& [NIT, gsum] : groups)
std::cout << std::right << std::setw(3) << no++ << ' ' << std::left << std::setw(9) << NIT
<< std::right << std::fixed << std::setprecision(2) << std::setw(19) << gsum << "\n";
std::cout << " ---------------\nSumatoria NITS:" << std::setw(17) << sum
<< "\nMedia NITs :" << std::setw(17) << average << "\nMin NITS :" << std::setw(17) << std::get<2>(*min)
<< "\nMax NITS :" << std::setw(17) << std::get<2>(*max) << "\nCount NITs :" << std::setw(14) << groups.size() << "\n";
}
else {
std::cerr << "\n*** Error: Could not open csv file\n";
}
return 0;
}
One of the major topics here is how to parse a string or, as it is also called, how to split a string into tokens.
Splitting strings into tokens is a very old task. Very early C already had the function strtok, which still exists, even in C++, as std::strtok.
But because of the additional functionality of std::getline, it has been heavily misused for tokenizing strings. If you look at the top question/answer regarding how to parse a CSV file (please see here), you will see what I mean.
People use std::getline to read a text line, a string, from the original stream, then stuff it into a std::istringstream and use std::getline with a delimiter again to parse the string into tokens. Weird.
But for many years we have had a dedicated facility for tokenizing strings, explicitly designed for that purpose. It is the
std::sregex_token_iterator
And since we have such a dedicated facility, we should simply use it.
This thing is an iterator for iterating over a string, hence the name starts with an s. The begin and end arguments define the range of input to operate on, then there is a std::regex for what should be matched (or not matched) in the input string. The type of matching strategy is given by the last parameter.
0 --> give me the stuff that I defined in the regex and (optional)
-1 --> give me that what is NOT matched based on the regex.
We can use this iterator for storing the tokens in a std::vector. The std::vector has a range constructor, which takes 2 iterators as parameter, and copies the data between the first iterator and 2nd iterator to the std::vector. The statement
std::vector tokens(std::sregex_token_iterator(s.begin(), s.end(), re, -1), {});
defines a variable “tokens” as a std::vector and uses the so called range-constructor of the std::vector. Please note: I am using C++17 and can define the std::vector without template argument. The compiler can deduce the argument from the given function parameters. This feature is called CTAD ("class template argument deduction").
Additionally, you can see that I do not use the "end()"-iterator explicitly.
This iterator will be constructed from the empty brace-enclosed default initializer list with the correct type, because it will be deduced to be the same as the type of the first argument due to the std::vector constructor requiring that.
You can read any number of tokens in a line and put it into the std::vector
But you can do even more. You can validate your input. If you use 0 as last parameter, you define a std::regex that even validates your input. And you get only valid tokens.
Additionally, it helps you to avoid the error that you made, with the last getline statement.
Overall, the usage of a dedicated facility is superior to the misused std::getline, and people should simply use it.
Some people may complain about the function overhead, but how many of them are working with big data? And even then, the approach would probably be to use std::string::find and std::string::substr, or std::string_view, or whatever.
Now we should have gotten a basic understanding, how to split a string into tokens.
Next, we will explore the rest of the software.
At the beginning we open a file and check whether it has been opened. We use the if statement with initializer introduced in C++17, where you can put an initializer and the condition in the (). So, we define a std::ifstream variable and use its constructor to open the file. That was the initializer. Then we put the stream as the condition in the 2nd part of the if statement. This checks whether the file could be opened. That works because std::ifstream has an explicit conversion to bool (alongside an overloaded operator!) that reports the state of the stream.
OK, now the file is open. With a normal for-statement, we read all lines of the file, using std::getline.
Then we tokenize the line (the string). Our data per line (csv) consists of 3 values. An unsigned long, a std::string and a double. We define a Type "Data" to be a tuple of those types.
The tokens for each line will be converted and put into the std::tuple via in-place construction and the tuple will then be added to our target vector.
So, basically we need just 3 lines of code, to read and parse the complete source csv-file.
Good. Now we have all data in a std::vector "data".
We can use existing functions from the standard library (std::accumulate from <numeric> and std::minmax_element from <algorithm>) for getting the sum, average, min and max value.
Since we want to group the data based on the NIT, we then create an associative container: std::map. The key is the NIT and the value is the sum of the doubles. With the std::map index operator [] we can access or create a new key. Meaning, when a NIT is not existing in the map, then it will be added. In any case, the index operator [] will return a reference to the value. And we simply add the double to the value of the map. This we do for all tuples in the data-vector.
After this, all group sums exist, and the number of keys in the map, the size() of the std::map is the number of groups.
The rest is just simple formatting and output.
I recently had a problem creating a stringstream due to the fact that I incorrectly assumed std::setw() would affect the stringstream for every insertion, until I changed it explicitly. However, it is always unset after the insertion.
// With timestruct with value of 'Oct 7 9:04 AM'
std::stringstream ss;
ss.fill('0'); ss.setf(ios::right, ios::adjustfield);
ss << setw(2) << timestruct.tm_mday;
ss << timestruct.tm_hour;
ss << timestruct.tm_min;
std::string filingTime = ss.str(); // BAD: '0794'
So, I have a number of questions:
Why is setw() this way?
Are any other manipulators this way?
Is there a difference in behavior between std::ios_base::width() and std::setw()?
Finally is there an online reference that clearly documents this behavior? My vendor documentation (MS Visual Studio 2005) doesn't seem to clearly show this.
Important notes from the comments below:
By Martin:
@Charles: Then by this requirement all manipulators are sticky. Except setw which seems to be reset after use.
By Charles:
Exactly! and the only reason that setw appears to behave differently is because there are requirements on formatted output operations to explicitly .width(0) the output stream.
The following is the discussion that led to the above conclusion:
Looking at the code the following manipulators return an object rather than a stream:
setiosflags
resetiosflags
setbase
setfill
setprecision
setw
This is a common technique to apply an operation to only the next object that is applied to the stream. Unfortunately this does not preclude them from being sticky. Tests indicate that all of them except setw are sticky.
setiosflags: Sticky
resetiosflags: Sticky
setbase: Sticky
setfill: Sticky
setprecision: Sticky
All the other manipulators return a stream object. Thus any state information they change must be recorded in the stream object and is thus permanent (until another manipulator changes the state). Thus the following manipulators must be Sticky manipulators.
[no]boolalpha
[no]showbase
[no]showpoint
[no]showpos
[no]skipws
[no]unitbuf
[no]uppercase
dec/ hex/ oct
fixed/ scientific
internal/ left/ right
These manipulators actually perform an operation on the stream itself rather than the stream object (Though technically the stream is part of the stream objects state). But I do not believe they affect any other part of the stream objects state.
ws/ endl/ ends/ flush
The conclusion is that setw seems to be the only manipulator on my version that is not sticky.
For Charles a simple trick to affect only the next item in the chain:
Here is an example of how an object can be used to temporarily change the state and then put it back:
#include <iostream>
#include <iomanip>
// Private object constructed by the format object PutSquareBracket
struct SquareBracktAroundNextItem
{
SquareBracktAroundNextItem(std::ostream& str)
:m_str(str)
{}
std::ostream& m_str;
};
// New Format Object
struct PutSquareBracket
{};
// Format object passed to stream.
// All it does is return an object that can maintain state away from the
// stream object (so that it is not STICKY)
SquareBracktAroundNextItem operator<<(std::ostream& str,PutSquareBracket const& data)
{
return SquareBracktAroundNextItem(str);
}
// The Non Sticky formatting.
// Here we temporarily set formatting to fixed with a precision of 10.
// After the next value is printed we return the stream to the original state
// Then return the stream for normal processing.
template<typename T>
std::ostream& operator<<(SquareBracktAroundNextItem const& bracket,T const& data)
{
std::ios_base::fmtflags flags = bracket.m_str.flags();
std::streamsize currentPrecision = bracket.m_str.precision();
bracket.m_str << '[' << std::fixed << std::setprecision(10) << data << std::setprecision(currentPrecision) << ']';
bracket.m_str.flags(flags);
return bracket.m_str;
}
int main()
{
std::cout << 5.34 << "\n" // Before
<< PutSquareBracket() << 5.34 << "\n" // Temp change settings.
<< 5.34 << "\n"; // After
}
> ./a.out
5.34
[5.3400000000]
5.34
The reason that width does not appear to be 'sticky' is that certain operations are guaranteed to call .width(0) on an output stream. Those are:
21.3.7.9 [lib.string.io]:
template<class charT, class traits, class Allocator>
basic_ostream<charT, traits>&
operator<<(basic_ostream<charT, traits>& os,
const basic_string<charT,traits,Allocator>& str);
22.2.2.2.2 [lib.facet.num.put.virtuals]: All do_put overloads for the num_put template. These are used by overloads of operator<< taking a basic_ostream and a built in numeric type.
22.2.6.2.2 [lib.locale.money.put.virtuals]: All do_put overloads for the money_put template.
27.6.2.5.4 [lib.ostream.inserters.character]: Overloads of operator<< taking a basic_ostream and one of the char type of the basic_ostream instantiation or char, signed char or unsigned char or pointers to arrays of these char types.
To be honest I'm not sure of the rationale for this, but no other states of an ostream should be reset by formatted output functions. Of course, things like badbit and failbit may be set if there is a failure in the output operation, but that should be expected.
The only reason that I can think of for resetting the width is that it might be surprising if, when trying to output some delimited fields, your delimiters were padded.
E.g.
std::cout << std::setw(6) << 4.5 << '|' << 3.6 << '\n';
"   4.5     |   3.6     \n"
To 'correct' this would take:
std::cout << std::setw(6) << 4.5 << std::setw(0) << '|' << std::setw(6) << 3.6 << std::setw(0) << '\n';
whereas with a resetting width, the desired output can be generated with the shorter:
std::cout << std::setw(6) << 4.5 << '|' << std::setw(6) << 3.6 << '\n';
setw() only affects the next insertion. That's just the way setw() behaves. The behavior of setw() is the same as ios_base::width(). I got my setw() information from cplusplus.com.
You can find a full list of manipulators here. From that link, all the stream flags should stay set until changed by another manipulator. One note about the left, right and internal manipulators: They are like the other flags and do persist until changed. However, they only have an effect when the width of the stream is set, and the width must be set every line. So, for example
cout.width(6);
cout << right << "a" << endl;
cout.width(6);
cout << "b" << endl;
cout.width(6);
cout << "c" << endl;
would give you
>     a
>     b
>     c
but
cout.width(6);
cout << right << "a" << endl;
cout << "b" << endl;
cout << "c" << endl;
would give you
>     a
>b
>c
The Input and Output manipulators are not sticky and only occur once where they are used. The parameterized manipulators are each different, here's a brief description of each:
setiosflags lets you manually set flags, a list of which can be found here, so it is sticky.
resetiosflags behaves similar to setiosflags except it unsets the specified flags.
setbase sets the base of integers inserted into the stream (so 17 in base 16 would be "11", and in base 8 would be "21"); only bases 8, 10 and 16 are supported, and any other value falls back to decimal.
setfill sets the fill character to insert in the stream when setw is used.
setprecision sets the decimal precision to be used when inserting floating point values.
setw makes only the next insertion the specified width by filling with the character specified in setfill
I am trying to convert several values into one string, which is to be used as a filename, however after trying several different methods, I'm a bit stumped.
string reportfile = myarray[0][2] + myarray[0][3] + "report.txt";
cout << reportfile << endl;
ofstream outfile(reportfile);
I've tried to_string and .str(), and I tried to add each of them onto the string separately, still converting with the methods mentioned before, but either I did all of it incorrectly or it didn't work.
The arrays would contain year and day, I need the reportfile value to be, for example:
201312report.txt
So, how would I go about converting the two int array items and the text into a single string?
In C++11 you can use std::to_string() for int-to-string conversions:
string reportfile = to_string(myarray[0][2]) + to_string(myarray[0][3]) + "report.txt";
Try this:
#include <sstream> // ^ top of the file
std::ostringstream reportfile;
reportfile << myarray[0][2] << myarray[0][3] << "report.txt";
std::string reportfile_str = reportfile.str();
std::cout << reportfile_str << std::endl;
std::ofstream outfile(reportfile_str.c_str()); // in C++11, omit the ".c_str()"
This assumes there is an output operation from whatever type is stored in myarray (I assumed it's an integer type).
I have a string whose last part (suffix) needs to be changed several times, and I need to generate new strings. I am trying to use ostringstream to do this, as I think using streams will be faster than string concatenation. But when the previous suffix is longer than the later one, it gets messed up. The stream strips off null characters too.
#include<iostream>
#include<sstream>
using namespace std;
int main()
{
ostringstream os;
streampos pos;
os << "Hello ";
pos = os.tellp();
os << "Universe";
os.seekp(pos);
cout<< os.str() << endl;
os << "World\0";
cout<< os.str().c_str() << endl;
return 0;
}
Output
Hello Universe
Hello Worldrse
But I want Hello World. How do I do this? Is there any other way to do this in a faster manner?
Edit:
Appending std::ends works. But wondering how it works internally. Also like to know if there are faster ways to do the same.
The string "World" is already null-terminated. That's how C strings work. "World\0" has two \0 characters. Therefore, operator<<(ostream&, const char*) will treat them the same, and copy all characters up to \0. You can see this even more clearly, if you try os << "World\0!". The ! will not be copied at all, since operator<< stopped at the first \0.
So, it's not ostringstream. It's how C strings aka const char* work.
It doesn't strip anything. All string literals in C++ are terminated by NUL, so by inserting one manually you just finish the string, as far as anyone processing it is concerned. Use ostream::write or ostream::put, if you need to do that — anything that expects char* (with no additional argument for size) will most likely treat it specially.
os.write("World\0", 6);
Why do you think a stream operation is faster than a string? And why build the string before outputting to cout?
If you want a prefix to your output you could just do it like this
const std::string prefix = "Hello ";
std::cout << prefix << "Universe" << std::endl;
std::cout << prefix << "World" << std::endl;