Given data format as "int,int,...,int,string,int", is it possible to use stringstream (only) to properly decode the fields?
[Code]
int main(int c, char** v)
{
std::string line = "0,1,2,3,4,5,CT_O,6";
char delimiter[7];
int id, ag, lid, cid, fid, did, j = -12345;
char dcontact[4]; // <- The size of <string-field> is known and fixed
std::stringstream ssline(line);
ssline >> id >> delimiter[0]
>> ag >> delimiter[1]
>> lid >> delimiter[2]
>> cid >> delimiter[3]
>> fid >> delimiter[4]
>> did >> delimiter[5] // <- should I do something here?
>> dcontact >> delimiter[6]
>> j;
std::cout << id << ":" << ag << ":" << lid << ":" << cid << ":" << fid << ":" << did << ":";
std::cout << dcontact << "\n";
}
[Output] 0:1:2:3:4:5:CT_6,0:-45689. The output shows that the stringstream did not stop after 4 characters when reading into dcontact: dcontact actually received more than 4 chars, leaving j with garbage data.
Yes. There is no overload of operator >> (istream&, char[N]) that takes the array size N into account, but there is one for char*, so the compiler picks that as the best match. The char* overload reads up to the next whitespace character, so it doesn't stop at the comma.
You could wrap your dcontact in a struct and have a specific overload to read into your struct. Else you could use read, albeit it breaks your lovely chain of >> operators.
ssline.read( dcontact, 4 );
will work at that point.
To read up to a delimiter, incidentally, you can use getline. (get will also work, but the free function getline, which writes to a std::string, means you don't have to guess the length.)
(Note that other people have suggested using get rather than read, but this will fail in your case as you do not have an extra byte at the end of your dcontact array for a null terminator. If you want dcontact to be null-terminated, make it 5 characters and use get, and the null will be appended for you.)
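The getline suggestion above can be sketched roughly as follows. This is only an illustration, not the asker's code: Record and parse_record are hypothetical names, and the sketch assumes the string field never contains a comma.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type for the "int,...,int,string,int" format.
struct Record {
    int id = 0, ag = 0, lid = 0, cid = 0, fid = 0, did = 0, j = 0;
    std::string dcontact;
};

// Returns true only if every field was read and every separator was a comma.
bool parse_record(const std::string& line, Record& r)
{
    std::istringstream in(line);
    int* ints[] = { &r.id, &r.ag, &r.lid, &r.cid, &r.fid, &r.did };
    for (int* field : ints) {
        char sep = '\0';
        if (!(in >> *field >> sep) || sep != ',')
            return false;
    }
    // The free function getline reads up to the next comma and discards it.
    if (!std::getline(in, r.dcontact, ','))
        return false;
    return static_cast<bool>(in >> r.j);
}
```

Because the string field lands in a std::string, its length does not have to be known in advance.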
Slightly more robust (handles the ',' delimiter correctly):
template <char D>
std::istream& delim(std::istream& in)
{
char c;
if (in >> c && c != D) in.setstate(std::ios_base::failbit);
return in;
}
int main()
{
std::string line = "0,1,2,3,4,5,CT_O,6";
int id, ag, lid, cid, fid, did, j = -12345;
char dcontact[5]; // <- The size of <string-field> is known and fixed
std::stringstream ssline(line);
(ssline >> id >> delim<','>
>> ag >> delim<','>
>> lid >> delim<','>
>> cid >> delim<','>
>> fid >> delim<','>
>> did >> delim<','> >> std::ws
).get(dcontact, 5, ',') >> delim<','>
>> j;
std::cout << id << ":" << ag << ":" << lid << ":"
<< cid << ":" << fid << ":" << did << ":"
<< dcontact << "\n";
}
The problem is that the >> operator for a string
(std::string or a C style string) actually implements the
semantics for a word, with a particular definition of word. The
decision is arbitrary (I would have made it a line), but since
a string can represent many different things, they had to choose
something.
The solution, in general, is not to use >> on a string, ever.
Define the class you want (here, probably something like
Symbol), and define an operator >> for it which respects its
semantics. Your code will be a lot clearer for it, and you
can add various invariant controls as appropriate. If you know
that the field is always exactly four characters, you can do
something simple like:
class DContactSymbol
{
char myName[ 4 ];
public:
// ...
friend std::istream&
operator>>( std::istream& source, DContactSymbol& dest );
// ...
};
std::istream&
operator>>( std::istream& source, DContactSymbol& dest )
{
    std::istream::sentry guard( source );   // skips leading white space if skipws is set
    if ( guard ) {
        std::string tmp;
        std::streambuf* sb = source.rdbuf();
        int ch = sb->sgetc();
        while ( source && (isalnum( ch ) || ch == '_') ) {
            tmp += static_cast< char >( ch );
            if ( tmp.size() > sizeof( dest.myName ) ) {
                source.setstate( std::ios_base::failbit );
            }
            ch = sb->snextc();              // advance to the next character
        }
        if ( ch == std::istream::traits_type::eof() ) {
            source.setstate( std::ios_base::eofbit );
        }
        if ( tmp.size() != sizeof( dest.myName ) ) {
            source.setstate( std::ios_base::failbit );
        }
        if ( source ) {
            tmp.copy( dest.myName, sizeof( dest.myName ) );
        }
    }
    return source;
}
(Note that unlike some of the other suggestions, for example
using std::istream::read, this one maintains all of the usual
conventions, like skipping leading white space dependent on the
skipws flag.)
Of course, if you can't guarantee 100% that the symbol will
always be 4 characters, you should use std::string for it, and
modify the >> operator accordingly.
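For the variable-length case, a std::string based version might look something like the sketch below. Symbol is a hypothetical name, and the rule that a symbol is any non-empty run of letters, digits and underscores is an assumption carried over from the fixed-size example.

```cpp
#include <cctype>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying the operator out
#include <string>

// Hypothetical variable-length counterpart of DContactSymbol.
struct Symbol {
    std::string name;
};

// Reads one run of [A-Za-z0-9_] characters; sets failbit if the run is empty.
std::istream& operator>>(std::istream& source, Symbol& dest)
{
    std::istream::sentry guard(source);   // honours the skipws flag
    if (guard) {
        std::string tmp;
        std::streambuf* sb = source.rdbuf();
        int ch = sb->sgetc();
        while (ch != std::istream::traits_type::eof()
               && (std::isalnum(ch) || ch == '_')) {
            tmp += static_cast<char>(ch);
            ch = sb->snextc();            // advance to the next character
        }
        if (ch == std::istream::traits_type::eof())
            source.setstate(std::ios_base::eofbit);
        if (tmp.empty())
            source.setstate(std::ios_base::failbit);
        else
            dest.name = tmp;
    }
    return source;
}
```

The comma that ends the field is left in the stream, so it can still be checked by whatever reads the next delimiter.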
And BTW, you seem to want to read four characters into
dcontact, although it's only large enough for three (since
>> will insert a terminating '\0'). If you read any more
than three into it, you have undefined behavior.
Try this:
int main(int c, char** v) {
string line = "0,1,2,3,4,5,CT_O,6";
char delimiter[7];
int id, ag, lid, cid, fid, did, j = -12345;
char dcontact[5]; // <- The size of <string-field> is known and fixed
stringstream ssline(line);
ssline >> id >> delimiter[0]
>> ag >> delimiter[1]
>> lid >> delimiter[2]
>> cid >> delimiter[3]
>> fid >> delimiter[4]
>> did >> delimiter[5];
ssline.get(dcontact, 5);
ssline >> delimiter[6]
>> j;
std::cout << id << ":" << ag << ":" << lid << ":" << cid << ":" << fid << ":" << did << ":";
std::cout << dcontact << "\n" << j;
}
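Since the length of the string is known you can use std::setw. The width counts the terminating '\0', so reading four characters needs std::setw(5) and a dcontact of at least five elements, as in
ssline >> std::setw(5) >> dcontact >> delimiter[6];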
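A minimal sketch of the width-limited read, on a shortened record. read_fixed_field is a hypothetical helper, and the point it illustrates is that extraction into a char array stores at most width-1 characters plus the terminator, so a four-character field needs a width of 5.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Reads "<int>,<4 chars>,<int>" using a width-limited char-array extraction.
bool read_fixed_field(const std::string& line, int& before,
                      std::string& field, int& after)
{
    std::istringstream in(line);
    char buf[5] = {};                 // 4 characters + terminating '\0'
    char c1 = '\0', c2 = '\0';
    in >> before >> c1 >> std::setw(5) >> buf >> c2 >> after;
    if (in.fail() || c1 != ',' || c2 != ',')
        return false;
    field = buf;
    return true;
}
```

A field longer than four characters is not silently truncated here: the fifth character is read as the delimiter, fails the comma check, and the function reports failure.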
I have a file with lines in the format:
firstword;secondword;4.0
I need to split the lines by ;, store the first two words in char arrays, and store the number as a double.
In Python, I would just use split(";"), then split("") on the first two indexes of the resulting list then float() on the last index. But I don't know the syntax for doing this in C++.
So far, I'm able to read from the file and store the lines as strings in the studentList array. But I don't know where to begin with extracting the words and numbers from the items in the array. I know I would need to declare new variables to store them in, but I'm not there yet.
I don't want to use vectors for this.
#include <iomanip>
#include <fstream>
#include <string>
#include <stdlib.h>
#include <iostream>
using namespace std;
int main() {
string studentList[4];
ifstream file;
file.open("input.txt");
if(file.is_open()) {
for (int i = 0; i < 4; i++) {
file >> studentList[i];
}
file.close();
}
for(int i = 0; i < 4; i++) {
cout << studentList[i];
}
return 0;
}
You can use std::getline, which supports a custom delimiter:
#include <string>
#include <sstream>
#include <iostream>
int main() {
std::istringstream file("a;b;1.0\nc;d;2.0");
for (int i = 0; i < 2; i++){
std::string x,y,v;
std::getline(file,x,';');
std::getline(file,y,';');
std::getline(file,v); // default delim is new line
std::cout << x << ' ' << y << ' ' << v << '\n';
}
}
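Since the question asks for char arrays and a double rather than three strings, the same getline approach could be extended along these lines. This is a sketch under assumptions: Entry, parse_entry and the 32-byte buffer size are all made up for illustration.

```cpp
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical record: two words as char arrays plus a number.
struct Entry {
    char first[32];
    char second[32];
    double value;
};

// Splits "firstword;secondword;4.0"; returns false on malformed or oversized input.
bool parse_entry(const std::string& line, Entry& e)
{
    std::istringstream in(line);
    std::string a, b;
    double v = 0.0;
    if (!std::getline(in, a, ';') || !std::getline(in, b, ';') || !(in >> v))
        return false;
    // Leave room for the terminating '\0' in each array.
    if (a.size() >= sizeof e.first || b.size() >= sizeof e.second)
        return false;
    std::strcpy(e.first, a.c_str());
    std::strcpy(e.second, b.c_str());
    e.value = v;
    return true;
}
```

Reading into std::string first and copying afterwards keeps the length check in one place, instead of trusting the input to fit the arrays.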
C++ uses the stream class as its string-handling workhorse. Every kind of transformation is typically designed to work through them. For splitting strings, std::getline() is absolutely the right tool. (And possibly a std::istringstream to help out.)
A few other pointers as well.
Use struct for related information
Here we have a “student” with three related pieces of information:
struct Student {
std::string last_name;
std::string first_name;
double gpa;
};
Notice how one of those items is not a string.
Keep track of the number of items used in an array
Your arrays should have a maximum (allocated) size, plus a separate count of the items used.
constexpr int MAX_STUDENTS = 100;
Student studentList[MAX_STUDENTS];
int num_students = 0;
When adding an item (to the end), remember that in C++ arrays always start with index 0:
if (num_students < MAX_STUDENTS) {
studentList[num_students].first_name = "James";
studentList[num_students].last_name = "Bond";
studentList[num_students].gpa = 4.0;
num_students += 1;
}
You can avoid some of that bookkeeping by using a std::vector:
std::vector <Student> studentList;
studentList.push_back( { "Bond", "James", 4.0 } ); // last_name, first_name, gpa
But as you requested we avoid them, we’ll stick with arrays.
Use a stream extractor function overload to read a struct from stream
The input stream is expected to have student data formatted as a semicolon-delimited record — that is: last name, semicolon, first name, semicolon, gpa, newline.
std::istream & operator >> ( std::istream & ins, Student & student ) {
ins >> std::ws; // skip any leading whitespace
getline( ins, student.last_name, ';' ); // read last_name & eat delimiter
getline( ins, student.first_name, ';' ); // read first_name & eat delimiter
ins >> student.gpa; // read gpa. Does not eat delimiters
ins >> std::ws; // skip all trailing whitespace (including newline)
return ins;
}
Notice how std::getline() was put to use here to read strings terminating with a semicolon. Everything else must be either:
read as a string then converted to the desired type, or
read using the >> operator and have the delimiter specifically read.
For example, if the GPA were not last in our list, we would have to read and discard (“eat”) a semicolon:
char c;
ins >> student.gpa >> c;
if (c != ';') ins.setstate( std::ios::failbit );
Yes, that is kind of long and obnoxious. But it is how C++ streams work.
Fortunately with our current Student structure, we can eat that trailing newline along with all other whitespace.
Now we can easily read a list of students until the stream indicates EOF (or any error):
while (f >> studentList[num_students]) {
num_students += 1;
if (num_students == MAX_STUDENTS) break; // don’t forget to watch your bounds!
}
Use a stream insertion function overload to write
’Nuff said.
std::ostream & operator << ( std::ostream & outs, const Student & student ) {
return outs
<< student.last_name << ";"
<< student.first_name << ";"
<< std::fixed << std::setprecision(1) << student.gpa << "\n";
}
I am personally disinclined to modify stream characteristics on argument streams, and would instead use an intermediary std::ostringstream:
std::ostringstream oss;
oss << std::fixed << std::setprecision(1) << student.gpa;
outs << oss.str() << "\n";
But that is beyond the usual examples, and is often unnecessary. Know your data.
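If the extra stream feels heavy, the formatting state of the argument stream can also be saved and restored around the write. The helper below is a hypothetical sketch of that idea, not part of the Student example itself.

```cpp
#include <iomanip>
#include <ios>
#include <ostream>
#include <sstream>   // std::ostringstream, only needed to try the helper out

// Writes value with one decimal place, then restores the caller's formatting.
std::ostream& write_one_decimal(std::ostream& outs, double value)
{
    const std::ios_base::fmtflags old_flags = outs.flags();
    const std::streamsize old_precision = outs.precision();
    outs << std::fixed << std::setprecision(1) << value;
    outs.flags(old_flags);            // undo std::fixed
    outs.precision(old_precision);    // undo std::setprecision(1)
    return outs;
}
```

Writing 3.7 through the helper and then 0.125 directly to the same stream leaves the second number in the stream's original default formatting.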
Either way you can now write the list of students with a simple << in a loop:
for (int n = 0; n < num_students; n++)
f << studentList[n];
Use streams with C++ idioms
You are typing too much. Use C++’s object storage model to your advantage. Curly braces (for compound statements) help tremendously.
While you are at it, name your input files as descriptively as you are allowed.
{
std::ifstream f( "students.txt" );
while (f >> studentList[num_students])
if (++num_students == MAX_STUDENTS)
break;
}
No students will be read if f does not open. Reading will stop once you run out of students (or some error occurs) or you run out of space in the array, whichever comes first. And the file is automatically closed and the f object is destroyed when we hit that final closing brace, which terminates the lexical context containing it.
Include only required headers
Finally, try to include only those headers you actually use. This is something of an acquired skill, alas. When you are beginning, it helps to list the things you are including each header for right alongside the directive.
Putting it all together into a working example
#include <algorithm> // std::sort
#include <fstream> // std::ifstream
#include <iomanip> // std::setprecision
#include <iostream> // std::cin, std::cout, etc
#include <string> // std::string
struct Student {
std::string last_name;
std::string first_name;
double gpa;
};
std::istream & operator >> ( std::istream & ins, Student & student ) {
ins >> std::ws; // skip any leading whitespace
getline( ins, student.last_name, ';' ); // read last_name & eat delimiter
getline( ins, student.first_name, ';' ); // read first_name & eat delimiter
ins >> student.gpa; // read gpa. Does not eat delimiters
ins >> std::ws; // skip all trailing whitespace (including newline)
return ins;
}
std::ostream & operator << ( std::ostream & outs, const Student & student ) {
return outs
<< student.last_name << ";"
<< student.first_name << ";"
<< std::fixed << std::setprecision(1) << student.gpa << "\n";
}
int main() {
constexpr int MAX_STUDENTS = 100;
Student studentList[MAX_STUDENTS];
int num_students = 0;
// Read students from file
std::ifstream f( "students.txt" );
while (f >> studentList[num_students])
if (++num_students == MAX_STUDENTS)
break;
// Sort students by GPA from lowest to highest
std::sort( studentList, studentList+num_students,
[]( auto a, auto b ) { return a.gpa < b.gpa; } );
// Print students
for(int i = 0; i < num_students; i++) {
std::cout << studentList[i];
}
}
The “students.txt” file contains:
Blackfoot;Lawrence;3.7
Chén;Junfeng;3.8
Gupta;Chaya;4.0
Martin;Anita;3.6
Running the program produces the output:
Martin;Anita;3.6
Blackfoot;Lawrence;3.7
Chén;Junfeng;3.8
Gupta;Chaya;4.0
You can, of course, print the students any way you wish. This example just prints them with the same semicolon-delimited-format as they were input. Here we print them with GPA and surname only:
for (int n = 0; n < num_students; n++)
std::cout << studentList[n].gpa << ": " << studentList[n].last_name << "\n";
Every language has its own idiomatic usage which you should learn to take advantage of.
I'm trying to create some code to open a file, read the content and check if a couple of integers are equal by using getline(). The problem is that it seems to work only with strings, and not with integers as well. Could you help me?
fstream ficheroEntrada;
string frase;
int dni, dnitxt;
int i=0;
int time;
cout << "Introduce tu DNI: ";
cin >> dni;
ficheroEntrada.open ("Datos.txt",ios::in);
if (ficheroEntrada.is_open()) {
while (! ficheroEntrada.eof() ) {
getline (ficheroEntrada, dnitxt);
if (dnitxt == dni){
getline (ficheroEntrada, frase);
cout << dni << " " << frase << endl;
}else{
getline (ficheroEntrada, dnitxt);
}
}
ficheroEntrada.close();
}
The getline() function is used to extract string input. So it would be better to read the data as a string and then use "stoi" (which stands for string to integer) to convert it to an integer value.
You can look up how to use "stoi" separately.
getline doesn't read an integer, only a string, a whole line at a time.
If I understand correctly, you are searching for the int dni into the file Datos.txt. What is the format of the file ?
Assuming it looks something like this:
4
the phrase corresponding to 4
15
the phrase corresponding to 15
...
You can use stoi to convert what you've read into an integer:
string dni_buffer;
int found_dni;
if (ficheroEntrada.is_open()) {
    // loop on the read itself rather than on eof()
    while (getline (ficheroEntrada, dni_buffer)) {
        found_dni = stoi(dni_buffer);
        if (found_dni == dni){
            getline (ficheroEntrada, frase);
            cout << dni << " " << frase << endl;
        }else{
            // discard the text line with the wrong dni
            // we can use frase as it will be overwritten anyways
            getline (ficheroEntrada, frase);
        }
    }
    ficheroEntrada.close();
}
This is not tested.
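For reference, here is a self-contained sketch of the same idea that can be compiled on its own. It assumes the alternating number/phrase layout shown above; find_phrase is a hypothetical name, and skipping pairs whose first line is not a number is a design choice of this sketch.

```cpp
#include <exception>
#include <istream>
#include <sstream>   // std::istringstream, for trying it without a file
#include <string>

// Scans number/phrase line pairs; on a match, fills phrase and returns true.
bool find_phrase(std::istream& in, int dni, std::string& phrase)
{
    std::string number_line, text_line;
    while (std::getline(in, number_line) && std::getline(in, text_line)) {
        int value = 0;
        try {
            value = std::stoi(number_line);
        } catch (const std::exception&) {
            continue;   // stoi threw: the line was not a usable number
        }
        if (value == dni) {
            phrase = text_line;
            return true;
        }
    }
    return false;
}
```

Taking a std::istream& means the function works on an opened std::ifstream as well as on a std::istringstream.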
C++ has two kinds of getline.
One of them is a non-member function declared in <string>. This version extracts from a stream into a std::string object, like:
std::string line;
std::getline( input_stream, line );
The other one is a member function of an input stream such as std::ifstream. This version extracts from the stream into an array of characters, like:
char array[ 50 ];
input_stream.getline( array, 50 );
NOTE
Both versions extract characters from a stream, NOT a real integer type!
To answer your question, you need to know what kind of data you have in your file. Take a file like this: I have only 3 $dollars!. When you read that with std::getline or input_stream.getline, you cannot extract 3 as an integer type. Instead of getline you can use operator >> to extract the items one by one, like:
input_stream >> word_1 >> word_2 >> word_3 >> int_1 >> word_4;
Now int_1 has the value: 3
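As a compilable sketch of that extraction, with fourth_token_as_int being a hypothetical helper name:

```cpp
#include <sstream>
#include <string>

// Skips three whitespace-separated words, then reads the fourth token as an int.
bool fourth_token_as_int(const std::string& text, int& value)
{
    std::istringstream in(text);
    std::string w1, w2, w3;
    return static_cast<bool>(in >> w1 >> w2 >> w3 >> value);
}
```

The call fails (returns false) when the fourth token is not numeric, which is the stream's way of reporting that the data did not match the expected types.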
Practical Example
std::ifstream input_stream( "file", std::ifstream::in );
int number_1;
int number_2;
while( input_stream >> number_1 >> number_2 ){
std::cout << number_1 << " == " << number_2 << " : " << ( number_1 == number_2 ) << '\n';
}
input_stream.close();
The output:
10 == 11 : 0
11 == 11 : 1
12 == 11 : 0
How to split the string to two-parts after I assign the operation to math operator? For example 4567*6789 I want to split string into three part
First:4567 Operation:* Second:6789
Input is from textfile
char operation;
while (getline(ifs, line)){
stringstream ss(line.c_str());
char str;
//get string from stringstream
//delimiter here + - * / to split string to two part
while (ss >> str) {
if (ispunct(str)) {
operation = str;
}
}
}
Maybe, just maybe, by thinking this out, we can come up with a solution.
We know that operator>> will stop processing an integer when it encounters a character that is not a digit. So we can use this fact.
int multiplier = 0;
ss >> multiplier;
The next characters are not digits, so they could be an operator character.
What happens if we read in a character:
char operation = '?';
ss >> operation;
Oh, I forgot to mention that the operator>> will skip spaces by default.
Lastly, we can input the second number:
int multiplicand = 0;
ss >> multiplicand;
To confirm, let's print out what we have read in:
std::cout << "First Number: " << multiplier << "\n";
std::cout << "Operation : " << operation << "\n";
std::cout << "Second Number: " << multiplicand << "\n";
Using a debugger here will help show what is happening, as each statement is executed, one at a time.
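Put together, the three reads above might be wrapped in a small function like this one. It is a sketch: parse_expression is a hypothetical name, and restricting the operator to + - * / is an assumption based on the question.

```cpp
#include <sstream>
#include <string>

// Parses "<int><op><int>", with optional spaces, where op is one of + - * /.
bool parse_expression(const std::string& line, int& lhs, char& op, int& rhs)
{
    std::istringstream ss(line);
    if (!(ss >> lhs >> op >> rhs))
        return false;                 // one of the three reads failed
    return op == '+' || op == '-' || op == '*' || op == '/';
}
```

Because operator>> skips whitespace by default, "4567*6789" and "4567 * 6789" are handled the same way.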
Edit 1: More complicated
You can always get more complicated and use a parser, lexer or write your own. A good method of implementation is to use a state machine.
For example, you would read a single character, then decide what to do with it depending on the state. For example, if the character is a digit, you may want to build a number. For a character (other than white space), convert it to a token and store it somewhere.
There are parse trees and other data structures which can ease the operation of parsing. There are parsing libraries out there too, such as boost::spirit, yacc, bison, flex and lex.
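To give a flavour of the state-machine idea, here is a deliberately tiny tokenizer with two states. Token, tokenize and the state names are hypothetical, and it only knows about unsigned integers and single-character symbols.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token type: either a run of digits or a single symbol.
struct Token {
    enum Kind { Number, Symbol } kind;
    std::string text;
};

std::vector<Token> tokenize(const std::string& input)
{
    enum State { Idle, InNumber } state = Idle;
    std::vector<Token> tokens;
    for (char raw : input) {
        const unsigned char ch = static_cast<unsigned char>(raw);
        if (std::isdigit(ch)) {
            if (state == Idle) {              // a digit starts a new number
                tokens.push_back({Token::Number, ""});
                state = InNumber;
            }
            tokens.back().text += raw;        // extend the current number
        } else {
            state = Idle;                     // any non-digit ends a number
            if (!std::isspace(ch))
                tokens.push_back({Token::Symbol, std::string(1, raw)});
        }
    }
    return tokens;
}
```

For "4567 * 6789" this yields three tokens: a number, the symbol *, and a second number. Anything beyond this (signs, decimals, parentheses, precedence) is where a real parser or one of the libraries above earns its keep.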
One way is:
char opr;
int firstNumber, SecondNumber;
ss>>firstNumber>>opr>>SecondNumber;
instead of:
while (ss >> str) {
if (ispunct(str)) {
operation = str;
}
}
Or use regex for complex expressions. Here is an example of using regex in math expressions.
If you have a string at hand, you could simply split the string into left and right at the operator position as follows:
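char* linePtr = strdup("4567*6789"); // strdup to preserve original value
char* op = strpbrk(linePtr, "+-*/"); // first operator character, or NULL
if (op) {
    string opStr(op,1);
    *op = 0x0; // terminate the left-hand side at the operator
    string lhs(linePtr);
    string rhs(op+1);
    cout << lhs << " " << opStr << " " << rhs;
}
free(linePtr); // strdup allocates with malloc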
A simple solution would be to use sscanf:
int left, right;
char o;
if (sscanf("4567*6789", "%d%c%d", &left, &o, &right) == 3) {
// scan valid...
cout << left << " " << o << " " << right;
}
My proposal is to create two functions:
std::size_t delimiter_pos(const std::string line)
{
std::size_t found = std::string::npos;
(found = line.find('+')) != std::string::npos ||
(found = line.find('-')) != std::string::npos ||
(found = line.find('*')) != std::string::npos ||
(found = line.find('/')) != std::string::npos;
return found;
}
And a second function that extracts the operands:
void parse(const std::string& line)
{
    std::size_t pos = delimiter_pos(line);
    if (pos != std::string::npos)
    {
        std::string first = line.substr(0, pos);
        char operation = line[pos];
        std::string second = line.substr(pos + 1, line.size() - (pos + 1));
    }
}
I hope my examples helped you
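One thing to be aware of with the chained find calls is that they prefer '+' over '-' over '*' over '/' regardless of position, so "8-2+1" would be split at the '+'. If the leftmost operator is wanted instead, std::string::find_first_of does that in one call. A sketch, with Split and split_at_operator as hypothetical names:

```cpp
#include <string>

// Hypothetical result type for the three parts of "lhs op rhs".
struct Split {
    std::string first;
    char operation;
    std::string second;
};

// Splits at the leftmost of + - * /; returns false if none is present.
bool split_at_operator(const std::string& line, Split& out)
{
    const std::size_t pos = line.find_first_of("+-*/");
    if (pos == std::string::npos)
        return false;
    out.first = line.substr(0, pos);
    out.operation = line[pos];
    out.second = line.substr(pos + 1);   // everything after the operator
    return true;
}
```

Returning the parts through a struct also makes the result usable by the caller, which the local variables inside parse are not.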
I have the following code to convert an encrypted ciphertext to a readable hexadecimal format:
std::string convertToReadable(std::string ciphertext)
{
std::stringstream outText;
for(unsigned int i = 0; i < ciphertext.size(); i++ )
outText << std::hex << std::setw(2) << std::setfill('0') << (0xFF & static_cast<byte>(ciphertext[i])) << ":";
return outText.str();
}
The readable result of this function is something as:
56:5e:8b:a8:04:93:e2:f1:5c:20:8b:fd:f5:b7:22:0b:82:42:46:58:9b:d4:c1:8e:ac:62:85:04:ff:7f:c6:d3:
Now I need to do the way back, converting the readable format to the original ciphertext in order to decrypt it:
std::string convertFromReadable(std::string text)
{
std::istringstream cipherStream;
for(unsigned int i = 0; i < text.size(); i++ )
{
if (text.substr(i, 1) == ":")
continue;
std::string str = text.substr(i, 2);
std::istringstream buffer(str);
int value;
buffer >> std::hex >> value;
cipherStream << value;
}
return cipherStream.str();
}
This is not working at all, as I'm getting the wrong string back.
How can I fix the convertFromReadable() so that I can have the original ciphertext back ?
Thanks for helping
Here are problems that you should fix before debugging this any further:
cipherStream should be ostringstream, not istringstream
The for loop should stop two characters before the end, so that substr(i, 2) never runs past the final byte pair (substr does not throw here, it just returns a shorter string). Make the loop condition i+2 < text.size()
When you read two characters from the input, you need to advance i by two, i.e. add i++ after the std::string str = text.substr(i, 2); line.
Since you want character output, add a cast to char when writing the data to cipherStream, i.e. cipherStream << (char)value
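With those four changes applied, the function might end up looking like the sketch below. It assumes the input has exactly the shape convertToReadable produces, that is, two hex digits followed by a colon for every byte, including the last one.

```cpp
#include <sstream>
#include <string>

std::string convertFromReadable(const std::string& text)
{
    std::ostringstream cipherStream;                  // fix 1: output stream
    for (std::string::size_type i = 0; i + 2 < text.size(); i++)  // fix 2
    {
        if (text[i] == ':')
            continue;
        std::istringstream buffer(text.substr(i, 2));
        i++;                                          // fix 3: two characters consumed
        int value = 0;
        buffer >> std::hex >> value;
        cipherStream << static_cast<char>(value);     // fix 4: write a byte, not digits
    }
    return cipherStream.str();
}
```

Note that the loop condition relies on the trailing colon: without it, the last byte pair would be skipped.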
Good you got your code working. Just thought I'd illustrate a slightly simpler, more direct approach using streams without the fiddly index tracking and substr extraction:
std::string convertFromReadable(const std::string& text)
{
std::istringstream iss(text);
std::ostringstream cipherStream;
int n;
while (iss >> std::hex >> n)
{
cipherStream << (char)n;
// if there's another character it better be ':'
char c;
if (iss >> c && c != ':')
throw std::runtime_error("invalid character in cipher");
}
return cipherStream.str();
}
Note that after the last hex value, if there's no colon, the if (iss >> c... test will evaluate false, as will the while (iss >> ... test, falling through to return.
Hey I am trying to read in the following lines using a getline
(15,0,1,#)
(2,11,2,.)
(3,20,0,S)
I want to be able to just extract the integers as ints and the characters as char, but I have no idea how to only extract those.
It seems you could read off the separators, i.e., '(', ')', and ',' and then just use the formatted input. Using a simple template for a manipulator should do the trick nicely:
#include <iostream>
#include <sstream>
template <char C>
std::istream& read_char(std::istream& in)
{
if ((in >> std::ws).peek() == C) {
in.ignore();
}
else {
in.setstate(std::ios_base::failbit);
}
return in;
}
auto const open_paren = &read_char<'('>;
auto const close_paren = &read_char<')'>;
auto const comma = &read_char<','>;
int main()
{
int x, y, z;
char c;
std::istringstream in("(1, 2, 3, x)\n(4, 5, 6, .)");
if (in >> open_paren >> x
>> comma >> y
>> comma >> z
>> comma >> c
>> close_paren) {
std::cout << "x=" << x << " y=" << y << " z=" << z << " c=" << c << '\n';
}
}
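Since the question mentions getline, a line-by-line variant is sketched below. Cell, parse_cell and parse_cells are hypothetical names, and lines that do not match the (int,int,int,char) shape are simply skipped, which is a choice of this sketch rather than a requirement.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one "(x,y,z,symbol)" line.
struct Cell {
    int x, y, z;
    char symbol;
};

// Parses a single line; every separator is read into a char and then checked.
bool parse_cell(const std::string& line, Cell& cell)
{
    std::istringstream in(line);
    char open = 0, c1 = 0, c2 = 0, c3 = 0, close = 0;
    in >> open >> cell.x >> c1 >> cell.y >> c2 >> cell.z >> c3
       >> cell.symbol >> close;
    return !in.fail() && open == '(' && c1 == ',' && c2 == ','
        && c3 == ',' && close == ')';
}

// Reads all lines from input and keeps the ones that parse.
std::vector<Cell> parse_cells(std::istream& input)
{
    std::vector<Cell> cells;
    std::string line;
    while (std::getline(input, line)) {
        Cell cell;
        if (parse_cell(line, cell))
            cells.push_back(cell);
    }
    return cells;
}
```

One limitation: operator>> skips whitespace before reading a char, so a line whose symbol is a space cannot be read this way.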
Alternatively, look at each character of the string you get from getline() and compare its character code against the ASCII ranges with a few if statements. That will tell you whether you grabbed a number, a letter, or a symbol.