I would like to read a text file, which contains the following data:
Wogger John 2 6.2
Bilbo 111 81.3
Mary 29 154.8
The data separated by tabulator. The problem is if the string contains a space (for example: 'Wogger John'), then the program doesn't work. THe program does work if I replace string 'Wogger John' to 'Wogger' or 'John'. How to fix the problem? How to use the getline() function. Here the code:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cerrno>
#include <cstring>
#include <tuple>
std::vector<std::tuple<std::string, int, double>> Readfile()
{
std::ifstream File("file_read_v3.txt");
std::vector<std::tuple<std::string, int, double>> data;
std::string name;
int a;
double b;
while (File >> name >> a >> b)
{
data.push_back(std::tuple<std::string, int, double>(name, a, b));
}
return data;
}
int main()
{
auto vt = Readfile();
for (const auto& i : vt) {
std::cout << std::get<0>(i) << ", " << std::get<1>(i) << ", " << std::get<2>(i) << std::endl;
}
return 0;
system("pause");
}
while (std::getline(File, name, '\t') >> a >> b)
{
data.push_back(std::tuple<std::string, int, double>(name, a, b));
}
std::getline reads up to and including a delimiter, then discards the delimiter. It defaults to the delimiter being a newline; here we make it a tab.
If the separator was always the same, and different from the separator between the names you could use it in getline as a delimiter, something like:
while (std::getline(File, name, '\t')){
File >> a >> b;
data.push_back(std::tuple<std::string, int, double>(name, a, b));
}
But by your sample file contents, it seems that you have tabs and spaces randomly placed, for that reason getline wouldn't really help very much, the problem remains.
You could, for instance, read the file char by char and stop reading when you encouter a digit, put that digit back and then parse the numeric values, roughly something like this:
std::vector<std::tuple<std::string, int, double>> Readfile()
{
std::ifstream File("file_read_v3.txt");
std::vector<std::tuple<std::string, int, double>> data;
if (File.is_open()) // always check if file was opened
{
int a;
double b;
char c;
do
{
std::string name;
while (!std::isdigit(c = File.get())) // get character, check if is digit
{
name.push_back(c); // add characters until a digit is found
}
File.unget(); // put digit back
File >> a >> b; // parse the int and double
data.push_back(std::tuple<std::string, int, double>(name, a, b));
c = File.get(); // get next character
} while (c != EOF); // if there is nothing else to read
}
return data;
}
To do, remove the space or tab at the end of name.
Thise line:
while (File >> name >> a >> b)
is causing your problems. You are saying cut my line into 1 string and 2 ints seperated by whitespace, so the line:
Wogger John 2 6.2
can not be parsed because it contains 2 strings + 2 ints and the stream operator is going to return false and your while loop is exiting.
To solve this natively you need to parse every char until you hit an \t. Put the parsed chars in a string, then parse the number until \t comes again and parse the next number.
Related
I'm a beginner programmer working through the 2019 Advent of Code challenges in C++.
The last piece of the puzzle I'm putting together is actually getting the program to read the input.txt file, which is essentially a long string of values in the form of '10,20,40,23" etc. on a single line.
In the previous puzzle I used the lines
int inputvalue;
std::ifstream file("input.txt");
while(file >> inputvalue){
//
}
to grab lines from the file, but it was formatted as a text file in sequential lines with no comma separation.
ie:
10
20
40
23
What can I do to read through the file using the comma delineation, and specifically how can I get those values to be read as integers, instead of as strings or chars, and store them into a vector?
While it would be strange to write a routine to read just one line from a comma separated file, instead of writing a general routine to read all lines (and just take the first one if you only wanted one) -- you can take out the parts for reading multiple lines into a std::vector<std::vector<int>> and just read one line into a std::vector<int> -- though it only saves a handful of lines of code.
The general approach would be to read the entire line of text with getline(file, line) and then create a std::stringstream (line) from which you could then use >> to read each integer followed by a getline (file, tmpstr, ',') to read the delimiter.
You can take a second argument in addition to the file to read, so you can pass the delimiter as the first character of the second argument -- that way there is no reason to re-compile your code to handle delimiters of ';' or ',' or any other single character.
You can put a short bit of code together to do that which could look like the following:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
int main (int argc, char **argv) {
if (argc < 2) { /* validate at least 1 argument given */
std::cerr << "error: insufficient number of arguments.\n"
"usage: " << argv[0] << " <filename>\n";
return 1;
}
std::vector<int> v {}; /* vector<int> */
std::string line {}; /* string to hold each line */
std::ifstream f (argv[1]); /* open file-stream with 1st argument */
const char delim = argc > 2 ? *argv[2] : ','; /* delim is 1st char in */
/* 2nd arg (default ',') */
if (!f.good()) { /* validate file open for reading */
std::cerr << "errro: file open failed '" << argv[1] << "'.\n";
return 1;
}
if (getline (f, line)) { /* read line of input into line */
int itmp; /* temporary integer to fill */
std::stringstream ss (line); /* create stringstream from line */
while (ss >> itmp) { /* read integer value from ss */
std::string stmp {}; /* temporary string to hold delim */
v.push_back(itmp); /* add to vector */
getline (ss, stmp, delim); /* read delimiter */
}
}
for (auto col : v) /* loop over each integer */
std::cout << " " << col; /* output col value */
std::cout << '\n'; /* tidy up with newline */
}
(note: there are relatively few changes needed to read all lines into a vector of vectors, the more notable is simply replacing the if(getline...) with while(getline..) and then filling a temporary vector which, if non-empty, is then pushed back into your collection of vectors)
Example Input File
With a set of comma separated integers in the file named dat/int-1-10-1-line.txt, e.g.
$ cat dat/int-1-10-1-line.txt
1,2,3,4,5,6,7,8,9,10
Example Use/Output
Your use an results would be:
$ ./bin/read_csv_int-1-line dat/int-1-10-1-line.txt
1 2 3 4 5 6 7 8 9 10
Of course you can change the output format to whatever you need. Look things over and let me know if you have further questions.
You have options. In my opinion, the most straight-forward is to just read a string and then convert to integer. You can use the additional "delimiter" parameter of std::getline to stop when it encounters a comma:
std::string value;
while (std::getline(file, value, ',')) {
int ival = std::stoi(value);
std::cout << ival << std::endl;
}
A common alternative is to read a single character, expecting it to be a comma:
int ival;
while (file >> ival) {
std::cout << ival << std::endl;
// Skip comma (we hope)
char we_sure_hope_this_is_a_comma;
file >> we_sure_hope_this_is_a_comma;
}
If it's possible for whitespace to also be present, you may want a less "hopeful" technique to skip the comma:
// Skip characters up to (and including) next comma
for (char c; file >> c && c != ',';);
Or simply:
// Skip characters up to (and including) next comma
while (file && file.get() != ',');
Or indeed, if you expect only whitespace or a comma, you could do something like:
// Skip comma and any leading whitespace
(file >> std::ws).get();
Of course, all the above are more-or-less clunky ways of doing this:
// Skip characters up to (and including) next comma on next read
file.ignore(std::numeric_limits<std::streamsize>::max(), ',');
All these approaches assume input is a single line. If you expect multiple lines of input with comma-separated values, you'll also need to handle end-of-line occurring without encountering a comma. Otherwise, you might miss the first input on the next line. Except for the "hopeful" approach, which will work but only on a technicality.
For robustness, I generally advise you to read line-based input as a whole string with std::getline, and then use std::istringstream to read individual values out of that line.
Here is another compact solution using iterators.
#include <iostream>
#include <vector>
#include <string>
#include <iterator>
#include <fstream>
#include <algorithm>
template <char D>
struct WordDelimiter : public std::string
{};
template <char D>
std::istream &
operator>>(std::istream & is, WordDelimiter<D> & output)
{
// Output gets every comma-separated token
std::getline(is, output, D);
return is;
}
int main() {
// Open a test file with comma-separated tokens
std::ifstream f{"test.txt"};
// every token is appended in the vector
std::vector<std::string> vec{ std::istream_iterator<WordDelimiter<','>>{ f },
std::istream_iterator<WordDelimiter<','>>{} };
// Transform str vector to int vector
// WARNING: no error checking made here
std::vector<int> vecint;
std::transform(std::begin(vec),std::end(vec),std::back_inserter(vecint),[](const auto& s) { return std::stoi(s); });
for (auto val : vecint) {
std::cout << val << std::endl;
}
return 0;
}
I have a stream in a file (which is opened by ifstream) with doubles and some string but I don't know their positions. For example:
1 2 3 4 g 3 2 t 1 d
So I have to read a stream as the above one, but I don't know how to differentiate the type of the variable before reading.
How can I do that with ifstream and with no knowledge about the order of the variables?
If you are sure your numbers are integers, use the code below. If not, replace int in vec and temp by double
#include <fstream>
#include <vector>
#include <fstream>
int main()
{
std::vector<int> vec;
std::ifstream file {"fileLocation"};
while(file.peek() != EOF)
{
if (isdigit(file.peek())){
int temp;
file >> temp;
vec.push_back(temp);
file.seekg(file.tellg() + file.gcount());
}
else
file.seekg(static_cast<int>(file.tellg()) + 1);
}
for(auto const & el : vec ) std::cout << el << " ";
}
You can peek() the next character in the stream, and if it is a digit or sign then read the next value as a double, otherwise read it as a string.
Otherwise, just read everything from the stream as a string, and then use std::stod() or std::strtod() to attempt to convert each string to a double, and check the conversions for failure.
I want to read string and extract all numbers.
Input: 5a3 1f a0aaaa f1fg3
Output: 53 1 0 13
I tried this code:
string s;
getline(cin, s);
stringstream str_strm(s);
int found;
string temp;
while (!str_strm.eof()) {
str_strm >> temp;
if (stringstream(temp) >> found)
{
cout << found << endl;
}
}
but when found 5 (from example)after that automatically start to check the other string. How can I extract all numbers?
Here's a possible solution - while loop is used to separate strings with whitespaces, after that digits are extracted from the sub-strings.
int main()
{
stringstream ss("5a3 1f a0aaaa f1fg3");
string str;
while (getline(ss, str, ' ') ){
str.erase(std::remove_if(str.begin(), str.end(), [](unsigned char c) { return !std::isdigit(c); }), str.end());
cout << str << " ";
}
}
You could read each space separated word, and then remove the non-digits, like this
std::string word;
while (std::cin >> word)
{
word.erase(std::remove_if(word.begin(), word.end(),
[](unsigned char c) { return not std::isdigit(c); }),
word.end());
std::cout << word << " ";
}
For the input of 5a3 1f a0aaaa f1fg3, it prints 53 1 0 13.
The admittedly odd way of removing elements of a range, is a common idiom.
You could even avoid the loop entirely, if you have the input on a single line
std::string word;
std::getline(std::cin, word);
word.erase(std::remove_if(word.begin(), word.end(),
[](unsigned char c) { return not std::isdigit(c)
and not std::isspace(c); }),
word.end());
std::cout << word;
Please see here the ultra simple example. (There is an even simpler solution at the bottom of this post)
It is using modern C++ elements and algorithms. And has only a few lines of code.
#include <iostream>
#include <string>
#include <regex>
#include <iterator>
#include <algorithm>
#include <sstream>
int main() {
// Read a string from the console
if (std::string line{}; std::getline(std::cin, line)) {
// Put the complete line into a std::istringstream
std::istringstream iss{line};
// Print result
std::transform(std::istream_iterator<std::string>(iss), {}, std::ostream_iterator<std::string>(std::cout, " "),
[](const std::string& s) { return std::regex_replace(s, std::regex{ R"([^\d])" }, ""); });
}
return 0;
}
So, what's going on here. Let us look at it statement by statement. So, first:
if (std::string line{}; std::getline(std::cin, line)) {
This is a if-statement with initializer. If you look up if in the C++ reference, here, then you can see, that we can now have an additional initialization statement as the first part in the if. And why are we using that? Because it is an additional measure for scoping. The variable "line" is only used within the scope of the if statement. It is not needed outside the if. From the functionality point of view, it is the same as writing:
std::string line{};
if (std::getline(std::cin, line)) {
But then, "line" would be also visible outside of the if statement. And, because we want to prevent the pollution of outer namespace, we select this method.
Next is std::getline. This will read a complete line from the input stream, so, from the console (std::cin)and put it into the string. The std::getline returns a reference to the stream. The stream has an overloaded bool operator, that returns, if there was a failure (or end of file) or not. So, the if statement checks, if the input operation works. By the way. All IO-opereations should be checked, if they work or fail.
Good, now we have the complete line of the user input in our variable "line".
With
std::istringstream iss{line};
we put the string into an std::istringstream. We do this, because we want to make use of the C++ "iostream" library. The std::istringstream behaves as any other stream, for example std::cin and you can extract values from it that are separated by a white space. Like in std::cin >> v1 >> v2. The disadvantage for such an approach is, that you need to know the number of values in advance or use a dynamic growing container and a loop.
And this brings ud to our next construct that I want to explain. You may have heard about "iterators". Iterators are like pointers and can point to a range of elements. If you have a std::vector or any other container, then you can iterate with the begin() and end() iterator over all elements in the std::vector without knowing, how many elements are in the std::vector, without knowing how many elements it contains.
And for input streams, we have something similar: The std::istream_iterator. This iterator will iterate over the elements in the std::sitringstream and returns the type of variable given in its template parameter, by repeatedly calling the extractor operator >>. Here, in our case, a std::string. You may know ask: Until when? Where is the end. If you look in the description of the constructor number 1 of the std::istream_operator then you will see, that the default constructor Constructs the end-of-stream iterator. and the default construct can be generated by using the empty braced {} initializer. So {} is the end iterator.
If we want to read all std::strings from the std::istringstream, then we read between
std::istream_iterator<std::string>(iss) and {}. So every string that is in the std::istringstream.
Good, next, there is a similar thing for output, the std::ostream_iterator. This will call the inserter operator "<<" for all elements in a given range. And, we can can specify, to which stream it should send the data, here std::cout and additionally a separator-string, which will be appended to the outputted value.
OK, next: std::transform. As it names says, it will transform the elements in a range of elements, between a begin() and end() iterator, to a other range. So, it will transform the elements as shown above from the std::istringstream and send them to the std::ostream iterator. So, we read the source value, transform it, then write it.
But, how to transform. For the transformation, we give a simple lambda function, which calls the std::regex_replace function. This is a standard function, to replace parts of a string with other string data. And, the what that will be replaced is specified by a std::regex. This is a special pattern that is defined in some kind of meta language and matches specified parts of a string. in our case we use [^\d] which means, not a digit. You can test regexes here. You can also lean about them here.
And now, all together, explains the above solution.
All this can be further optimized to 2 statements:
#include <iostream>
#include <string>
#include <regex>
int main() {
// Read a string from the console
if (std::string line{}; std::getline(std::cin, line)) {
// Remove unnecessary characters
std::cout << std::regex_replace(line, std::regex{ R"([^\d ])" }, "") << "\n";
}
return 0;
}
I cannot think of a more simpler solution.
In case of questions, please ask.
You can use get from istream to get each character, including whitespace, and then isdigit to check for a digit character...
#include <iostream>
#include <cctype>
int main()
{
char ch;
std::cin.get(ch);
while (!std::cin.eof())
{
if (isdigit(ch) || ch == ' ' || ch == '\n')
{
std::cout << ch;
}
std::cin.get(ch);
}
return 0;
}
However, you can avoid using std::cin.eof() for your expression for your While loop as follows...
#include <iostream>
#include <cctype>
int main()
{
char ch;
while (std::cin.get(ch))
{
if (isdigit(ch) || ch == ' ' || ch == '\n')
{
std::cout << ch;
}
}
return 0;
}
Regular expression pattern matching can be used to find all the digits in the input string.
Here is an example program to find the digits:
// C++ program to find all digits in a string
#include <bits/stdc++.h>
using namespace std;
int main() {
string inputString;
cout << "Enter the input string: ";
getline(cin, inputString);
cout << "Digits found: ";
// Define the regular expression matcher and pattern
smatch matcher;
regex pattern("[[:digit:]]");
while (regex_search(inputString, matcher, pattern)) {
// Show the match
cout << matcher.str(0);
// Continue searching the rest of the string
inputString = matcher.suffix().str();
}
return 0;
}
Output:
Enter the input string: sdfh354 eutyt;ljkn756897490uiotureu 587689jkgf 90
Digits found: 35475689749058768990
Here is another approach of finding the numbers in the string, without using the regular expression pattern matching:
#include <iostream>
#include <cctype>
#include <bits/stdc++.h>
using namespace std;
int main() {
string rawInput;
cout <<"Enter input string: ";
getline(cin, rawInput);
// Get all words from the input string
stringstream allWords(rawInput);
// Find and print digits in each word
string word;
while(allWords >> word) {
for(int i = 0; word[i]; i++) {
// Print only the numbers in the word
if(isdigit(word[i])) {
cout<<word[i];
}
}
cout<<" ";
}
cout<<"\n";
return 0;
}
Output:
Enter input string: ghjg45 jsdfj 897897 343yut45 90
45 897897 34345 90
How can I extract all numbers?
When you KNOW that the input numbers are all hex values ... (and how many)
stringstream ss ("5a3 1f a0aaaa f1fg3");
for (int i=0; i<4; ++i)
{
int k;
ss >> hex >> k;
cout << k << endl;
}
with output
1443
31
10529450
3871
I have a program that takes an input stream from a text file, which contains positive integers, delimited by spaces. The file contains only numbers and one instance of abc which my program should ignore before continuing to read data from the file.
this is my code and it does not work
int line;
in >> line;
in.ignore(1, 'abc');
in.clear();
could someone specify what the problem is? essentially, I want to discard the alpha input, clear cin and continue to read from file but I get an infinite loop.
This code should work
I get three parts from the input stream, and put the useful parts into a stringstream and read from it
int part1, part2;
std::string abc;
in >> part1 >> abc >> part2;
std::stringstream ss;
ss << part1 << part2;
int line;
ss >> line
You can explicitely ignore any alphabetical character this way:
#include <locale>
#include <iostream>
std::istream& read_number(std::istream& is, int& number) {
auto& ctype = std::use_facet<std::ctype<char>>(is.getloc());
while(ctype.is(std::ctype_base::alpha, is.peek())) is.ignore(1);
return is >> number;
}
int main() {
using std::string;
using std::cin;
using std::locale;
int number;
while(read_number(std::cin, number)) {
std::cout << number << ' ';
}
}
For input like "452abc def 23 11 -233b" this will produce "452 23 11 -233".
I'm trying to read just the integers from a text file structured like this....
ALS 46000
BZK 39850
CAR 38000
//....
using ifstream.
I've considered 2 options.
1) Regex using Boost
2) Creating a throwaway string ( i.e. I read in a word, don't do anything with it, then read in the score ). However, this is a last resort.
Are there any ways to express in C++ that I want the ifstream to only read in text that is an integer? I'm reluctant to use regular expressions if it turns out that there is a much simpler way to accomplish this.
why to make simple things complicated?
whats wrong in this :
ifstream ss("C:\\test.txt");
int score;
string name;
while( ss >> name >> score )
{
// do something with score
}
Edit:
it's in fact possible to work on streams directly with spirit than I suggested previously, with a parser:
+(omit[+(alpha|blank)] >> int_)
and one line of code(except for variable definitions):
void extract_file()
{
std::ifstream f("E:/dd/dd.trunk/sandbox/text.txt");
boost::spirit::istream_iterator it_begin(f), it_end;
// extract all numbers into a vector
std::vector<int> vi;
parse(it_begin, it_end, +(omit[+(alpha|blank)] >> int_), vi);
// print them to verify
std::copy(vi.begin(), vi.end(),
std::ostream_iterator<int>(std::cout, ", " ));
}
you get all numbers into a vector at once with one line, couldn't be simpler.
if you do not mind using boost.spirit2. the parser to get numbers from a line only is
omit[+(alpha|blank)] >> int_
to extract everything is
+(alpha|blank) >> int_
See the whole program below(Test with VC10 Beta 2):
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <cstring>
#include <vector>
#include <fstream>
#include <algorithm>
#include <iterator>
using std::cout;
using namespace boost::spirit;
using namespace boost::spirit::qi;
void extract_everything(std::string& line)
{
std::string::iterator it_begin = line.begin();
std::string::iterator it_end = line.end();
std::string s;
int i;
parse(it_begin, it_end, +(alpha|blank)>>int_, s, i);
cout << "string " << s
<< "followed by nubmer " << i
<< std::endl;
}
void extract_number(std::string& line)
{
std::string::iterator it_begin = line.begin();
std::string::iterator it_end = line.end();
int i;
parse(it_begin, it_end, omit[+(alpha|blank)] >> int_, i);
cout << "number only: " << i << std::endl;
}
void extract_line()
{
std::ifstream f("E:/dd/dd.trunk/sandbox/text.txt");
std::string s;
int i;
// iterated file line by line
while(getline(f, s))
{
cout << "parsing " << s << " yields:\n";
extract_number(s); //
extract_everything(s);
}
}
void extract_file()
{
std::ifstream f("E:/dd/dd.trunk/sandbox/text.txt");
boost::spirit::istream_iterator it_begin(f), it_end;
// extract all numbers into a vector
std::vector<int> vi;
parse(it_begin, it_end, +(omit[+(alpha|blank)] >> int_), vi);
// print them to verify
std::copy(vi.begin(), vi.end(),
std::ostream_iterator<int>(std::cout, ", " ));
}
int main(int argc, char * argv[])
{
extract_line();
extract_file();
return 0;
}
outputs:
parsing ALS 46000 yields:
number only: 46000
string ALS followed by nubmer 46000
parsing BZK 39850 yields:
number only: 39850
string BZK followed by nubmer 39850
parsing CAR 38000 yields:
number only: 38000
string CAR followed by nubmer 38000
46000, 39850, 38000,
You can call ignore to have in skip over a specified number of characters.
istr.ignore(4);
You can also tell it to stop at a delimiter. You would still need to know the maximum number of characters the leading string could be, but this would also work for shorter leading strings:
istr.ignore(10, ' ');
You could also write a loop that just reads characters until you see the first digit character:
char c;
while (istr.getchar(c) && !isdigit(c))
{
// do nothing
}
if (istr && isdigit(c))
istr.putback(c);
here goes :P
private static void readFile(String fileName) {
try {
HashMap<String, Integer> map = new HashMap<String, Integer>();
File file = new File(fileName);
Scanner scanner = new Scanner(file).useDelimiter(";");
while (scanner.hasNext()) {
String token = scanner.next();
String[] split = token.split(":");
if (split.length == 2) {
Integer count = map.get(split[0]);
map.put(split[0], count == null ? 1 : count + 1);
System.out.println(split[0] + ":" + split[1]);
} else {
split = token.split("=");
if (split.length == 2) {
Integer count = map.get(split[0]);
map.put(split[0], count == null ? 1 : count + 1);
System.out.println(split[0] + ":" + split[1]);
}
}
}
scanner.close();
System.out.println("Counts:" + map);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
public static void main(String[] args) {
readFile("test.txt");
}
}
fscanf(file, "%*s %d", &num);
or %05d if you have leading zeros and fixed width of 5....
sometimes the fastest way to do things in C++ is to use C. :)
You can create a ctype facet that classifies letters as white space. Create a locale that uses this facet, then imbue the stream with that locale. Having that, you can extract numbers from the stream, but all letters will be treated as white space (i.e. when you extract numbers, the letters will be ignored just like a space or a tab would be):
Such a locale can look like this:
#include <iostream>
#include <locale>
#include <vector>
#include <algorithm>
struct digits_only: std::ctype<char>
{
digits_only(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table()
{
static std::vector<std::ctype_base::mask>
rc(std::ctype<char>::table_size,std::ctype_base::space);
if (rc['0'] == std::ctype_base::space)
std::fill_n(&rc['0'], 9, std::ctype_base::mask());
return &rc[0];
}
};
Sample code to use it could look like this:
int main() {
std::cin.imbue(std::locale(std::locale(), new digits_only()));
std::copy(std::istream_iterator<int>(std::cin),
std::istream_iterator<int>(),
std::ostream_iterator<int>(std::cout, "\n"));
}
Using your sample data, the output I get from this looks like this:
46000
39850
38000
Note that as it stands, I've written this to accept only digits. If (for example) you were reading floating point numbers, you'd also want to retain '.' (or the locale-specific equivalent) as the decimal point. One way to handle things is to start with a copy of the normal ctype table, and then just set the things you want to ignore as space.