C++ how to get ONLY integers from complex string

I have a few strings; each one contains one word and several integers (each string is a whole line):
Adam 2 5 1 5 3 4
John 1 4 2 5 22 7
Kate 7 3 4 2 1 15
Bill 2222 2 22 11 111
As you can see, each word/number is separated by a space. I want to load this data into a map, where the word (name) is the key and the value is a vector of the numbers in the line. I already have the keys in a separate temporary STL container, so the task is to load only the integers from each line into a 2D vector and then merge the two into a map.
The question is: is there any C++ function that skips words and whitespace and gets only the integers from a string, or do I have to search the strings char by char, like here?
I found only a partial solution, which cannot read numbers with more than one digit:
vector<int> out;
for (int i = 0; i < line.length(); i++) {
if (isdigit(line.at(i))) {
stringstream const_char;
int intValue;
const_char << line.at(i);
const_char >> intValue;
out.push_back(intValue);
}
}

If every line has the format "word number number number ...", use a stringstream and skip the word by reading it.
If the current line is in line:
vector<int> out;
istringstream in(line);
string word;
in >> word;
int x = 0;
while (in >> x)
{
out.push_back(x);
}
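To connect this with the map the question asks for, here is a minimal sketch. It assumes the keys and the lines are stored in two vectors in the same order; numbers_after_word and merge_into_map are made-up names for illustration, not anything from the question.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: skips the leading word of `line` and returns the integers after it.
std::vector<int> numbers_after_word(const std::string& line) {
    std::istringstream in(line);
    std::string word;
    in >> word;                          // read and discard the name
    std::vector<int> out;
    int x = 0;
    while (in >> x) out.push_back(x);    // stops at end of line or at the first non-integer
    return out;
}

// Hypothetical helper: pairs names[i] with the numbers parsed from lines[i].
// Assumption: both containers are in the same order; extra entries in the longer one are ignored.
std::map<std::string, std::vector<int>> merge_into_map(
        const std::vector<std::string>& names,
        const std::vector<std::string>& lines) {
    std::map<std::string, std::vector<int>> result;
    const std::size_t n = names.size() < lines.size() ? names.size() : lines.size();
    for (std::size_t i = 0; i < n; ++i) {
        result[names[i]] = numbers_after_word(lines[i]);
    }
    return result;
}
```

Since the name is also the first word of every line, the separate key container is not strictly needed; the word read into `word` could serve as the key directly.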

Split the string on spaces, since that seems to be your delimiter. Then check with strtol that each substring contains an int.
Then use stoi to convert the integer substrings to int.
If stoi cannot perform a conversion (the string does not contain a number), it throws an invalid_argument exception, so don't try to convert the name substring.
#include <iostream>
#include <vector>
#include <string>
#include <sstream>
#include <cstdlib>
#include <cctype>
std::vector<std::string> &split(const std::string &s, char delim, std::vector<std::string> &elems) {
std::stringstream ss(s);
std::string item;
while (std::getline(ss, item, delim)) {
elems.push_back(item);
}
return elems;
}
std::vector<std::string> split(const std::string &s, char delim) {
std::vector<std::string> elems;
split(s, delim, elems);
return elems;
}
inline bool isInteger(const std::string & s)
{
if(s.empty() || ((!isdigit(s[0])) && (s[0] != '-') && (s[0] != '+'))) return false ;
char * p ;
strtol(s.c_str(), &p, 10) ;
return (*p == 0) ;
}
int main()
{
std::cout << "Hello World" << std::endl;
std::string example="Adam 2 5 1 5 3 4";
std::vector<std::string> subStrings;
subStrings = split(example, ' ');
std::string sItem;
for(std::vector<std::string>::iterator it = subStrings.begin(); it != subStrings.end(); ++it) {
sItem = *it;
if( isInteger(sItem) ){
int nItem = std::stoi (sItem);
std::cout << nItem << '\n';
}
}
return 0;
}
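The check and the conversion can also be folded into one call, because std::stoi reports how many characters it consumed. A rough sketch of that idea follows; parse_whole_int is a made-up name for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true only if the whole token is one int, and stores it in `value`.
// `value` is left untouched when the token is rejected.
bool parse_whole_int(const std::string& token, int& value) {
    try {
        std::size_t used = 0;
        const int parsed = std::stoi(token, &used);  // throws if there is no number or it is out of range
        if (used != token.size()) return false;      // trailing characters, e.g. "12abc"
        value = parsed;
        return true;
    } catch (const std::invalid_argument&) {
        return false;   // token does not start with a number, e.g. "Adam"
    } catch (const std::out_of_range&) {
        return false;   // number does not fit in an int
    }
}
```

Called on each substring returned by split, this would replace both isInteger and the separate stoi call.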

Use find() and substr() of the string class to find the name, if it is always at the beginning of the string.
std::string s = "Adam 2 5 1 5 3 4";
std::string delimiter = " ";
std::string name = s.substr(0, s.find(delimiter)); // To get the name
s.erase(0, s.find(delimiter) + delimiter.length()); // To delete the name and the delimiter after it
// Repeat the mechanism with a for or a while for the numbers
I have not tested this solution, but I use something similar where the label always comes first.
If the name could be anywhere, I do not see how to detect it without checking every character.
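For what it's worth, the "repeat the mechanism" step could look roughly like this. It is a sketch under two assumptions: the name is the first field, and every later field is a valid integer. numbers_by_find is a made-up name.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: walks the string with find/substr, skips the first field
// (the name) and converts every later field with std::stoi.
// Assumption: all fields after the name are valid integers; std::stoi throws otherwise.
std::vector<int> numbers_by_find(const std::string& s, char delimiter = ' ') {
    std::vector<int> out;
    std::string::size_type start = s.find(delimiter);        // end of the name
    while (start != std::string::npos) {
        ++start;                                              // step over the delimiter
        const std::string::size_type end = s.find(delimiter, start);
        const std::string field = s.substr(
            start, end == std::string::npos ? std::string::npos : end - start);
        if (!field.empty()) out.push_back(std::stoi(field));  // empty field = repeated delimiter
        start = end;
    }
    return out;
}
```

Working with positions rather than erase() avoids copying the remainder of the string on every step.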

Here is a program that demonstrates an approach to the task that can be used.
#include <iostream>
#include <string>
#include <vector>
#include <map>
#include <sstream>
#include <iterator>
int main()
{
std::string s( "Adam 2 5 1 5 3 4" );
std::map<std::string, std::vector<int>> m;
std::string key;
std::istringstream is( s );
if ( is >> key )
{
m[key] = std::vector<int>( std::istream_iterator<int>( is ),
std::istream_iterator<int>() );
}
for ( const auto &p : m )
{
std::cout << p.first << ": ";
for ( int x : p.second ) std::cout << x << ' ';
std::cout << std::endl;
}
return 0;
}
The output is
Adam: 2 5 1 5 3 4

Assuming that the name comes first, here is a function that will read the string and add to the map.
#include <map>
#include <vector>
#include <sstream>
#include <string>
#include <algorithm>
#include <iostream>
#include <iterator>
using namespace std;
typedef std::map<std::string, std::vector<int> > StringMap;
void AddToMap(StringMap& sMap, const std::string& line)
{
// copy string to stream and get the name
istringstream strm(line);
string name;
strm >> name;
// iterate through the ints and populate the vector
StringMap::iterator it = sMap.insert(make_pair(name, std::vector<int>())).first;
int num;
while (strm >> num)
it->second.push_back(num);
}
The function above adds a new entry to the map with the first read, and on subsequent reads, populates the vector.
Note that the map::insert function returns a std::pair, where the first of that pair is the iterator to the map entry that was created. So we just get the iterator, and from there, push_back the entries.
Here is a test program:
int main()
{
vector<std::string> data = { "Adam 2 5 1 5 3 4", "John 1 4 2 5 22 7",
"Kate 7 3 4 2 1 15", "Bill 2222 2 22 11 111" };
StringMap vectMap;
// Add results to map
for_each(data.begin(), data.end(),
[&](const std::string& s){AddToMap(vectMap, s); });
// Output the results
for_each(vectMap.begin(), vectMap.end(),
[](const StringMap::value_type& vt)
{cout << vt.first << " "; copy(vt.second.begin(), vt.second.end(),
ostream_iterator<int>(cout, " ")); cout << "\n"; });
}
Live example: http://ideone.com/8UlnX2
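The other half of the pair returned by map::insert is a bool saying whether a new entry was created. A small sketch of using it follows; AddToMapReportNew is a made-up variant, not part of the answer above.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, std::vector<int> > StringMap;

// Hypothetical variant of AddToMap that also reports whether the name was new.
bool AddToMapReportNew(StringMap& sMap, const std::string& line) {
    std::istringstream strm(line);
    std::string name;
    if (!(strm >> name)) return false;   // empty line: nothing to add

    // insert() leaves an existing entry untouched; .second tells us which case we hit
    std::pair<StringMap::iterator, bool> res =
        sMap.insert(std::make_pair(name, std::vector<int>()));

    int num;
    while (strm >> num) res.first->second.push_back(num);  // appends for a repeated name
    return res.second;
}
```

With this, a second line for "Adam" extends Adam's vector instead of replacing it, and the return value lets the caller notice the duplicate.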

Related

How to read a CSV dataset in which each row has a distinct length. C++

I'm new to C++ and am learning how to read data from a CSV file.
I want to read the following CSV data into vectors, one vector per row. The file name is path.csv:
0
0 1
0 2 4
0 3 6 7
I use the following function:
vector<vector<int>> read_multi_int(string path) {
vector<vector<int>> user_vec;
ifstream fp(path);
string line;
getline(fp, line);
while (getline(fp, line)) {
vector<int> data_line;
string number;
istringstream readstr(line);
while (getline(readstr, number, ',')) {
//getline(readstr, number, ',');
data_line.push_back(atoi(number.c_str()));
}
user_vec.push_back(data_line);
}
return user_vec;
}
vector<vector<int>> path = read_multi_int("C:/Users/data/paths.csv");
Print function:
template <typename T>
void print_multi(T u)
{
for (int i = 0; i < u.size(); ++i) {
if (u[i].size() > 1) {
for (int j = 0; j < u[i].size(); ++j) {
//printf("%d ", u[i][j]);
cout << u[i][j] << " ";
}
printf("\n");
}
}
printf("\n");
}
Then I get
0 0 0
0 1 0
0 2 4
0 3 6 7
Zeros are added at the end of the rows. Is it possible to just read the data from the CSV file without adding those extra zeros? Thanks!
Based on the output you are seeing and the code with ',' commas, I believe that your actual input data really looks like this:
A,B,C,D
0,,,
0,1,,
0,2,4,
0,3,6,7
atoi returns 0 when it cannot parse a number, so the empty fields show up as zeros. The main change is therefore to replace atoi with strtol, which lets us check whether the parse succeeded.
That means that the solution is as follows:
vector<vector<int>> read_multi_int(string path) {
vector<vector<int>> user_vec;
ifstream fp(path);
string line;
getline(fp, line);
while (getline(fp, line)) {
vector<int> data_line;
string number;
istringstream readstr(line);
while (getline(readstr, number, ',')) {
// strtol needs <cstdlib>; errno and ERANGE need <cerrno>
char* end = nullptr;
errno = 0; // clear any stale error before the call
const long value = strtol(number.c_str(), &end, 10);
if (end == number.c_str() || *end != '\0' || errno == ERANGE)
{
// Could not convert: empty field, trailing characters, or out of range
}else{
data_line.emplace_back(static_cast<int>(value));
}
}
user_vec.emplace_back(data_line);
}
return user_vec;
}
Then to display your results:
vector<vector<int>> path = read_multi_int("C:/Users/data/paths.csv");
for (const auto& row : path)
{
for (const auto& s : row) std::cout << s << ' ';
std::cout << std::endl;
}
This gives the expected output:
0
0 1
0 2 4
0 3 6 7
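The difference between the two functions can be seen in isolation with a small sketch. parse_field is a made-up helper name; the range check against INT_MIN/INT_MAX is my addition for platforms where long is wider than int.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper: converts one CSV field; returns false for empty or malformed fields
// and leaves `value` untouched in that case.
bool parse_field(const std::string& field, int& value) {
    char* end = nullptr;
    errno = 0;
    const long parsed = std::strtol(field.c_str(), &end, 10);
    if (end == field.c_str()) return false;   // no digits at all (e.g. an empty field)
    if (*end != '\0') return false;           // trailing characters after the number
    if (errno == ERANGE || parsed < INT_MIN || parsed > INT_MAX) return false;
    value = static_cast<int>(parsed);
    return true;
}
```

std::atoi("") and std::atoi("0") both return 0, so the caller cannot tell an empty field from a real zero; parse_field("") returns false while parse_field("0") succeeds.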
This is already very good, but there is one obvious error, plus another error in your print function. See how I output the values with simple range-based for loops.
If your source file does not use a comma (',') but a different delimiter, you need to call std::getline with that delimiter, in your case a blank (' '). Please read here about std::getline.
If we then use the following input
Header
0
0 1
0 2 4
0 3 6 7
with the corrected program:
#include <vector>
#include <fstream>
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
vector<vector<int>> read_multi_int(string path) {
vector<vector<int>> user_vec;
ifstream fp(path);
string line;
getline(fp, line);
while (getline(fp, line)) {
vector<int> data_line;
string number;
istringstream readstr(line);
while (getline(readstr, number, ' ')) {
//getline(readstr, number, ',');
data_line.push_back(atoi(number.c_str()));
}
user_vec.push_back(data_line);
}
return user_vec;
}
int main() {
vector<vector<int>> path = read_multi_int("C:/Users/data/paths.csv");
for (vector<int>& v : path) {
for (int i : v) std::cout << i << ' ';
std::cout << '\n';
}
}
then we receive this as output:
0
0 1
0 2 4
0 3 6 7
Which is correct, but unfortunately different from the output you showed.
So your output routine, or some other code, may also have a problem.
Besides, if there is no comma, you can take advantage of formatted input with the extraction operator >>. It reads the input up to the next whitespace and converts it to a number automatically.
Additionally, it is strongly recommended to initialize all variables when you define them.
With formatted input, initialization, and perhaps better variable names, your code could look like this:
#include <vector>
#include <fstream>
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
vector<vector<int>> multipleLinesWithIntegers(const string& path) {
// Here we will store the resulting 2d vector
vector<vector<int>> result{};
// Open the file
ifstream fp{ path };
// Read header line
string line{};
getline(fp, line);
// Now read all lines with numbers in the file
while (getline(fp, line)) {
// Here we will store all numbers of one line
vector<int> numbers{};
// Put the line into an istringstream for easier extraction
istringstream sline{ line };
int number{};
while (sline >> number) {
numbers.push_back(number);
}
result.push_back(numbers);
}
return result;
}
int main() {
vector<vector<int>> values = multipleLinesWithIntegers("C:/Users/data/paths.csv");
for (const vector<int>& v : values) {
for (const int i : v) std::cout << i << ' ';
std::cout << '\n';
}
}
And the next step would be to use a somewhat more advanced style (this version needs C++17 for the if statements with initializers):
#include <vector>
#include <fstream>
#include <iostream>
#include <string>
#include <sstream>
#include <iterator>
auto multipleLinesWithIntegers(const std::string& path) {
// Here we will store the resulting 2d vector
std::vector<std::vector<int>> result{};
// Open the file and check, if it could be opened
if (std::ifstream fp{ path }; fp) {
// Read header line
if (std::string line{}; getline(fp, line)) {
// Now read all lines with numbers in the file
while (getline(fp, line)) {
// Put the line into an istringstream for easier extraction
std::istringstream sline{ line };
// Get the numbers and add them to the result
result.emplace_back(std::vector(std::istream_iterator<int>(sline), {}));
}
}
else std::cerr << "\n\nError: Could not read header line '" << line << "'\n\n";
}
else std::cerr << "\n\nError: Could not open file '" << path << "'\n\n";
return result;
}
int main() {
const std::vector<std::vector<int>> values{ multipleLinesWithIntegers("C:/Users/data/paths.csv") };
for (const std::vector<int>& v : values) {
for (const int i : v) std::cout << i << ' ';
std::cout << '\n';
}
}
Edit
You have shown your output routine. That should be changed to:
void printMulti(const std::vector<std::vector<int>>& u)
{
for (int i = 0; i < u.size(); ++i) {
if (u[i].size() > 0) {
for (int j = 0; j < u[i].size(); ++j) {
std::cout << u[i][j] << ' ';
}
std::cout << '\n';
}
}
std::cout << '\n';
}

stringstream in cpp is continual parsing possible

I've got a vector of strings. If the first character is "1", I need to push the integer that follows (represented as a string) into a vector; otherwise I just need to print the first character.
Using stringstream, this is the code I've written:
vector<string> arr = {"1 23", "2", "1 45", "3", "4"};
vector<int> v;
for(string x : arr){
stringstream ss(x);
string word;
string arr[2];
int i =0 ;
while(ss >> word){
arr[i++] = word;
}
i = 0;
if(arr[0] == "1")
v.push_back(atoi(arr[1].c_str()));
else
cout << arr[0] << endl;
}
Instead of using an array arr, is there a way to take the next word from the stringstream once the first word is "1"? When I tried, the stringstream started over from the beginning.
The code uses std::stringstream, but it doesn't take advantage of what the object can do, such as extracting an int directly.
std::vector<std::string> arr = {"1 23", "2", "1 45", "3", "4"};
std::vector<int> v;
for ( auto const& word : arr )
{
std::stringstream ss{ word }; // Initialize with a string,
int first;
if ( ss >> first )
{ // ^^^^^^^^^^^ but extract an int...
if ( first == 1 )
{
int second;
if ( ss >> second ) // and another.
v.push_back(second);
}
else
std::cout << first << '\n';
} // Error handling is left to the reader.
}
Assuming the strings are always well-formed and in the format you describe, and the numbers in the strings are always valid integers, you could do something like this instead:
#include <iostream>
#include <vector>
#include <string>
#include <cstdlib>
using namespace std;
int main() {
const vector<string> arr = {"1 23", "2", "1 45", "3", "4"};
vector<int> v;
for (const string& s : arr) {
if (s.size() > 2 && s[0] == '1' && s[1] == ' ') {
v.push_back(atoi(s.c_str() + 2));
} else {
cout << s << "\n";
}
}
for (const int i: v) {
cout << i << "\n";
}
}
For strings in the array that don't start with a 1 and a space that you said you're just supposed to print, I just printed out the whole string instead of its first character.
If you're not sure about your strings in the array, you'll need to check for errors first. Also, see How can I convert a std::string to int? for alternatives to atoi().
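As one possible shape for that error checking, here is a sketch that relies on the stream state instead of atoi(). EntryKind and classify_entry are made-up names, and treating extra text after "1 <n>" as an error is my assumption.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for one entry of the input vector.
enum EntryKind { Malformed, Pushed, PrintOnly };

// Hypothetical helper: for "1 <n>" appends n to v; for any other leading
// integer stores it in `to_print`; reports Malformed for everything else.
EntryKind classify_entry(const std::string& entry, std::vector<int>& v, int& to_print) {
    std::istringstream ss(entry);
    int first = 0;
    if (!(ss >> first)) return Malformed;       // no leading integer
    if (first != 1) {
        to_print = first;
        return PrintOnly;
    }
    int second = 0;
    if (!(ss >> second)) return Malformed;      // "1" without a value to push
    std::string rest;
    if (ss >> rest) return Malformed;           // unexpected extra text (assumed to be an error)
    v.push_back(second);
    return Pushed;
}
```

The caller would print to_print when the result is PrintOnly and decide for itself how to report Malformed entries.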

C++ - Parsing number from std::string [duplicate]

I need to iterate through a shopping list which I have put into a vector and further separate each line by the quantity and item name. How can I get a pair with the number as the first item and the item name as the second?
Example:
vector<string> shopping_list = {"3 Apples", "5 Mandarin Oranges", "24 Eggs", "152 Chickens"};
I'm not sure how big the number will be, so I can't use a constant index.
Ideally I would like a vector of pairs.
You can write a function to split quantity and item like the following:
#include <sstream>
#include <string>
#include <utility>
auto split( const std::string &p ) {
int num = 0;
std::string item;
std::istringstream ss ( p );
ss >> num; // assuming format is integer followed by space then item
std::getline(ss >> std::ws, item); // remaining string, without the leading space
return std::make_pair(num, item);
}
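Then use std::transform (from <algorithm>, with std::back_inserter from <iterator>) to get a vector of pairs:
std::vector<std::pair<int, std::string>> items;
std::transform( shopping_list.cbegin(),
shopping_list.cend(),
std::back_inserter(items),
split );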
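Put together as a self-contained piece, the idea might look like the sketch below. split_quantity and to_pairs are made-up names, and explicit return types are used so that it does not depend on C++14 return type deduction.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: "5 Mandarin Oranges" -> {5, "Mandarin Oranges"}.
// Assumption: the line starts with an integer; otherwise the result is {0, ""}.
std::pair<int, std::string> split_quantity(const std::string& line) {
    std::istringstream ss(line);
    int num = 0;
    std::string item;
    ss >> num;
    std::getline(ss >> std::ws, item);   // std::ws drops the space before the item name
    return std::make_pair(num, item);
}

// Hypothetical helper: applies split_quantity to every line of the list.
std::vector<std::pair<int, std::string> > to_pairs(const std::vector<std::string>& list) {
    std::vector<std::pair<int, std::string> > items;
    items.reserve(list.size());
    std::transform(list.begin(), list.end(), std::back_inserter(items), split_quantity);
    return items;
}
```

Because the rest of the line is read with getline, item names containing spaces stay in one piece.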
As an alternative, I suggest the following solution without stringstream:
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main() {
vector<string> shopping_list = { "3 Apples", "5 Mandarin Oranges", "24 Eggs", "152 Chickens" };
vector< pair<int, string> > pairs_list;
for (string s : shopping_list)
{
int num;
string name;
std::string::size_type space_pos = s.find_first_of(" ");
if (space_pos == std::string::npos)
continue; // format is broken : no spaces
try{
name = s.substr(space_pos + 1);
num = std::stoi(s.substr(0, space_pos));
}
catch (...)
{
continue; // format is broken : any problem
}
pairs_list.push_back(make_pair(num, name));
}
for (auto p : pairs_list)
{
cout << p.first << " : " << p.second << endl;
}
return 0;
}
You can use std::stringstream as follows.
vector< pair<int,string> > myList;
for(int i=0;i<shopping_list.size();i++) {
int num;
string item;
std::stringstream ss;
ss<<shopping_list[i];
ss>>num;
ss>>item;
myList.push_back(make_pair(num,item));
...
}
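num is the number you need. Note that ss>>item reads only one whitespace-delimited word, so "Mandarin Oranges" would be cut down to "Mandarin"; use getline if you need the rest of the line.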

Parse a string by whitespace into a vector

Suppose I have a string of numbers
"1 2 3 4 5 6"
I want to split this string and place every number into a different slot in my vector. What is the best way to go about this?
Use istringstream to treat the string as a stream and the >> operator to extract the numbers. It also works if the string contains newlines and tabs. Here is an example:
#include <vector>
#include <sstream> // for istringstream
#include <iostream> // for cout
using namespace std; // I like using vector instead of std::vector
int main()
{
const char *s = "1 2 3 4 5"; // string literals are const in C++
istringstream s2(s);
vector<int> v;
int tmp;
while (s2 >> tmp) {
v.push_back(tmp);
}
// print the vector
for (vector<int>::iterator it = v.begin(); it != v.end(); it++) {
cout << *it << endl;
}
}
#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>
#include <cstdlib>
std::vector<std::string> StringToVector(std::string const& str, char const delimiter);
int main(){
std::string str{"1 2 3 4 5 6 "};
std::vector<std::string> vec{StringToVector(str, ' ')};
//print the vector
for(std::string const& item : vec){
std::cout << "[" << item << "]";
}
return EXIT_SUCCESS;
}
std::vector<std::string> StringToVector(std::string const& str, char const delimiter){
std::vector<std::string> vec;
std::string element;
//we are going to loop through each character of the string slowly building an element string.
//whenever we hit a delimiter, we will push the element into the vector, and clear it to get ready for the next element
for_each(begin(str),end(str),[&](char const ch){
if(ch!=delimiter){
element+=ch;
}
else{
if (element.length()>0){
vec.push_back(element);
element.clear();
}
}
});
//push in the last element if the string does not end with the delimiter
if (element.length()>0){
vec.push_back(element);
}
return vec;
}
g++ -std=c++0x -o main main.cpp
This has the advantage of never pushing an empty string into the vector.
You can also choose what you want the separator to be.
Maybe you could write some others: one for a vector of characters, or maybe the delimiter could be a string? :)
Good luck!
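The string-delimiter variant hinted at above could be sketched like this. SplitOnString is a made-up name, and returning the whole input for an empty delimiter is an arbitrary choice of mine.

```cpp
#include <string>
#include <vector>

// Hypothetical variant of StringToVector whose delimiter is a whole string.
// Empty pieces are skipped; an empty delimiter returns the input as one piece.
std::vector<std::string> SplitOnString(const std::string& str, const std::string& delimiter) {
    std::vector<std::string> vec;
    if (delimiter.empty()) {
        if (!str.empty()) vec.push_back(str);
        return vec;
    }
    std::string::size_type start = 0;
    while (start <= str.size()) {
        std::string::size_type end = str.find(delimiter, start);
        if (end == std::string::npos) end = str.size();     // last piece runs to the end
        if (end > start) vec.push_back(str.substr(start, end - start));
        start = end + delimiter.size();                     // continue after the delimiter
    }
    return vec;
}
```

For example, SplitOnString("1, 2, 3", ", ") yields the three pieces "1", "2" and "3".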
#include <vector>
#include <string>
#include <sstream>
using namespace std;
int str_to_int(const string& str){
stringstream io;
int out;
io<<str;
io>>out;
return out;
};
vector<int> Tokenize(string str, string delimiters = " ")
{
vector<int> tokens;
string::size_type nwpos; //position of first non white space, which means it is first real char
nwpos = str.find_first_not_of(delimiters, 0); //ignore the whitespace before the first word
string::size_type pos = str.find_first_of(delimiters, nwpos);
while (string::npos != pos || string::npos != nwpos)
{
// Found a token, add it to the vector.
tokens.push_back(str_to_int(str.substr(nwpos, pos - nwpos)));
// Skip delimiters. Note the "not_of"
nwpos = str.find_first_not_of(delimiters, pos);
// Find next "non-delimiter"
pos = str.find_first_of(delimiters, nwpos);
}
return tokens;
};
Try:
#include <iostream>
#include <sstream>
#include <string>
#include <algorithm>
#include <iterator>
#include <vector>
int main()
{
// The data
std::string data = "1 2 3 4 5 6";
// data in a stream (this could be a file)
std::stringstream datastream(data);
// Copy the data from the stream into a vector.
std::vector<int> vec;
std::copy(std::istream_iterator<int>(datastream), std::istream_iterator<int>(),
std::back_inserter(vec)
);
// We can also copy the vector to the output (or any other stream).
std::copy(vec.begin(), vec.end(),
std::ostream_iterator<int>(std::cout, "\n")
);
}

C++ cin read STDIN

How to use C++ to get all the STDIN and parse it?
For example, my input is
2
1 4
3
5 6 7
I want to use C++ to read STDIN with cin and store each line in an array, so the result will be a vector/array of arrays of integers.
Thanks!
Since this isn't tagged as homework, here's a small example of reading from stdin using std::vectors and std::stringstreams. I added an extra part at the end also for iterating through the vectors and printing out the values. Give the console an EOF (ctrl + d for *nix, ctrl + z for Windows) to stop it from reading in input.
#include <iostream>
#include <string>
#include <vector>
#include <sstream>
int main(void)
{
std::vector< std::vector<int> > vecLines;
// read in every line of stdin
std::string line;
while ( getline(std::cin, line) )
{
int num;
std::vector<int> ints;
std::istringstream ss(line); // create a stringstream from the string
// extract all the numbers from that line
while (ss >> num)
ints.push_back(num);
// add the vector of ints to the vector of vectors
vecLines.push_back(ints);
}
std::cout << "\nValues:" << std::endl;
// print the vectors - iterate through the vector of vectors
for ( std::vector< std::vector<int> >::iterator it_vecs = vecLines.begin();
it_vecs != vecLines.end(); ++it_vecs )
{
// iterate through the vector of ints and print the ints
for ( std::vector<int>::iterator it_ints = (*it_vecs).begin();
it_ints < (*it_vecs).end(); ++it_ints )
{
std::cout << *it_ints << " ";
}
std::cout << std::endl; // new line after each vector has been printed
}
return 0;
}
Input/Output:
2
1 4
3
5 6 7
Values:
2
1 4
3
5 6 7
EDIT: Added a couple more comments to the code. Also note that empty vectors of ints can be added to vecLines (from empty lines of input); that's intentional, so that the output is the same as the input.
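If the reading loop is moved into a function that takes any std::istream, the same code can be fed from std::cin, a file, or a string in a test. A minimal sketch follows; read_int_lines is a made-up name.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns one inner vector per input line, empty lines included.
std::vector<std::vector<int> > read_int_lines(std::istream& in) {
    std::vector<std::vector<int> > lines;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);
        std::vector<int> ints;
        int num = 0;
        while (ss >> num) ints.push_back(num);   // stops at end of line or first non-integer
        lines.push_back(ints);
    }
    return lines;
}
```

Calling read_int_lines(std::cin) would behave like the loop in main above, while passing a std::istringstream makes it easy to try without typing input by hand.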
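#include <cstdio>
#include <iostream>
using namespace std;
int main ()
{
char line[100];
// Test the read itself: checking eof() before reading runs the body once more after a failed read
while (cin.getline(line, 100)) {
printf("%s\n", line);
}
return 0;
}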
Sorry, I just wasn't sure if there's any way better than this.
This one should fit your requirement: use istringstream to split the line into an array.
#include <iostream>
#include <vector>
#include <sstream>
#include <string>
using namespace std;
int main()
{
string s("A B C D E F G");
vector<string> vec;
istringstream iss(s);
do
{
string sub;
iss >> sub;
if ( ! sub.empty() )
vec.push_back (sub);
} while (iss);
vector<string>::iterator it = vec.begin();
while ( it != vec.end() )
{
cout << *it << endl;
it ++;
}
return 0;
}