Reading lines and columns with different separators from a text file - c++

I am trying to write a function that reads individual lines from a text file. Each line has two or three columns. I want to know the most elegant/clean approach for it. I need the function to work with different separators (\t, \n, ' ', ',', ';').
My approach works correctly except for different separators.
E.g. Input:
6
0 0
1 1
2 2
3 3
4 4
5 5
10
0 1 0.47
2 0 0.67
3 0 0.98
4 0 0.12
2 1 0.94
3 1 0.05
4 1 0.22
3 2 0.24
4 2 0.36
4 3 0.69
Pattern Input:
[total number of vertices]
[id-vertex][\separator][name-vertex]
...
[total number of edges]
[id-vertex][\separator][id-neighbor][\separator][weight]
...
*\separator=\t|\n|' '|','|';'
My approach:
void readStream(istream& is, const char separator) {
uint n, m;
is >> n;
cout << n << endl;
string name;
uint vertexId, neighborId;
float weight;
while(!is.eof()) {
for(uint i = 0; i < n; i++) {
is >> vertexId >> name;
cout << vertexId;
cout << " " << name << endl;
}
is >> m;
cout << m << endl;
for(uint j = 0; j < m; j++) {
is >> vertexId >> neighborId >> weight;
cout << vertexId;
cout << " " << neighborId;
cout << " " << weight << endl;
}
break;
}
}
Overview:
Problem: Different separators.
Other elegant solutions: In general, does anyone have other elegant/clean solutions to this problem?

You may use boost::split; it can split a string on multiple separators that you specify.
std::string line; // holds one line read from the file
std::vector<std::string> parts;
boost::split(parts, line, boost::is_any_of("\t\n,; "));
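If Boost is not available, a similar effect can be sketched with the standard library alone. The helper name splitAny below is hypothetical, and it deliberately skips empty fields, so consecutive separators collapse into one; that may differ from what boost::split does with its default settings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `line` on any character found in `delims`,
// skipping empty fields (so "1,,2" yields {"1", "2"}).
std::vector<std::string> splitAny(const std::string& line,
                                  const std::string& delims = "\t\n,; ")
{
    std::vector<std::string> parts;
    std::size_t start = line.find_first_not_of(delims);
    while (start != std::string::npos) {
        // end is npos when the last field runs to the end of the line
        std::size_t end = line.find_first_of(delims, start);
        parts.push_back(line.substr(start, end - start));
        start = line.find_first_not_of(delims, end);
    }
    return parts;
}
```

Each returned part can then be converted with std::stoi or std::stod as needed.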

If you are sure the separator is not whitespace, you can just read it into a throwaway char variable (separator in the line below):
is >> vertexId >> separator >> neighborId >> separator >> weight;
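A minimal sketch of that idea, assuming the separator is always a single non-whitespace character such as ',' or ';'. The function name readEdge is made up for illustration. With whitespace separators this would not work, because the char extraction skips the whitespace and swallows the first digit of the next field instead.

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper: reads "id<sep>id<sep>weight", where <sep> is one
// non-whitespace character. Returns false if any extraction fails.
bool readEdge(std::istream& is, unsigned& vertexId, unsigned& neighborId, float& weight)
{
    char separator; // throwaway: receives ',' or ';' and is otherwise ignored
    return static_cast<bool>(
        is >> vertexId >> separator >> neighborId >> separator >> weight);
}
```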

The following code may be useful:
#include <cmath> // for pow
int t1,t2;
double t3;//global variables...
void parse_Vertex_Line(char *str)
{
int tmp=0;
char *p=str;
//extract the vertex-id
while(*p >='0' && *p <='9')
tmp = tmp*10 + *(p++) -'0';
t1=tmp;
tmp=0;
p++;
//now extract the vertex-name..
while(*p >='0' && *p <='9')
tmp = tmp*10 + *(p++) -'0';
t2=tmp;
return;
}
void parse_Edge_Line(char *str)
{
//extracting the first two numbers is just the same...
int tmp=0;
char *p=str;
//extract the first vertex-id
while(*p >='0' && *p <='9')
tmp = tmp*10 + *(p++) -'0';
t1=tmp;
tmp=0;
p++;
//now extract the second vertex-id..
while(*p >='0' && *p <='9')
tmp = tmp*10 + *(p++) -'0';
t2=tmp;
p++;
//but extracting a double value is a bit different...
//extract the weight...
int before_decimal=0, after_decimal=0;
while(*p>='0' && *p<='9') // stop at '.' or at the end of the number
before_decimal = before_decimal*10 + *(p++) -'0';
int no_of_digits=0;
if(*p=='.') // the fractional part is optional
{
p++;
while(*p>='0' && *p<='9')
{
after_decimal = after_decimal*10 + *(p++) -'0';
no_of_digits++;
}
}
//assign it to the global double variable...
t3 = before_decimal + (after_decimal/pow(10.0, no_of_digits));
}
Now first read the number of vertices (n). Next, read each of the n lines, calling parse_Vertex_Line each time. Then read the number of edges and similarly call parse_Edge_Line for each edge line, storing the extracted values after every call.
This code works for any single-character delimiter that is not a digit. Hope this looks elegant to you.

You could use the following (assuming that your file will always be in the format shown above):
fstream file;
file.open("abc.txt",ios::in);
int numOfVertices;
string line;
getline(file, line);
numOfVertices = stoi(line);
vector<int> xCoord;
vector<int> yCoord;
while((--numOfVertices)>=0)
{
string line;
getline(file, line);
std::size_t prev = 0, pos;
pos = line.find_first_of(" ,;", prev);
xCoord.push_back(stoi(line.substr(prev, pos-prev)));
prev = pos+1;
pos = line.find_first_of(" ,;", prev); //considering some of the delimiters
yCoord.push_back(stoi(line.substr(prev, pos-prev)));
}
This is to add the vertices. Similarly you may extract the edges as well.
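The same find_first_of technique could be applied to an edge line roughly as follows. This is only a sketch: the Edge struct and the parseEdgeLine name are hypothetical, and it assumes a well-formed line with exactly one delimiter character between fields.

```cpp
#include <cstddef>
#include <string>

// Hypothetical record for one edge line: "id<sep>id<sep>weight".
struct Edge {
    int from;
    int to;
    double weight;
};

// Assumes exactly two single-character delimiters on the line.
Edge parseEdgeLine(const std::string& line, const std::string& delims = " \t,;")
{
    Edge e{};
    std::size_t prev = 0;
    std::size_t pos = line.find_first_of(delims, prev);
    e.from = std::stoi(line.substr(prev, pos - prev));
    prev = pos + 1;
    pos = line.find_first_of(delims, prev);
    e.to = std::stoi(line.substr(prev, pos - prev));
    prev = pos + 1;
    e.weight = std::stod(line.substr(prev)); // the rest of the line is the weight
    return e;
}
```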

I modified my other posting for this scenario: Override istream operator >> and modify delimiter (the first variant of the accepted solution there explains a possible implementation).
In general, one way to deal with unwanted separators is to turn them into spaces!
My approach makes istream's operator>> treat the new delimiters as whitespace:
struct delimiterIsSpace : ctype<char> {
delimiterIsSpace() : ctype<char>(get_table()) {}
static mask const* get_table() {
static mask rc[table_size];
rc[';'] = ctype_base::space;
rc[','] = ctype_base::space;
rc[' '] = ctype_base::space;
rc['\t'] = ctype_base::space;
rc['\n'] = ctype_base::space;
return &rc[0];
}
};
How to use:
cin.imbue(locale(cin.getloc(), new delimiterIsSpace));
for (int a, b; cin >> a >> b; ) {
cout << "a=" << a << " b=" << b << "\n";
}
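The facet is not tied to cin; it can be imbued on any input stream, for example a file or string stream. Below is a small sketch using a string stream. The readInts helper is a made-up name, and the facet is repeated so that the snippet compiles on its own.

```cpp
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Same facet as above: only the listed characters count as whitespace.
struct delimiterIsSpace : std::ctype<char> {
    delimiterIsSpace() : std::ctype<char>(get_table()) {}
    static mask const* get_table() {
        static mask rc[table_size]; // static storage, so zero-initialized
        rc[';'] = std::ctype_base::space;
        rc[','] = std::ctype_base::space;
        rc[' '] = std::ctype_base::space;
        rc['\t'] = std::ctype_base::space;
        rc['\n'] = std::ctype_base::space;
        return &rc[0];
    }
};

// Hypothetical helper: extracts every int from `text`, whatever the separators.
std::vector<int> readInts(const std::string& text)
{
    std::istringstream in(text);
    in.imbue(std::locale(in.getloc(), new delimiterIsSpace)); // the locale owns the facet
    std::vector<int> values;
    for (int v; in >> v; ) values.push_back(v);
    return values;
}
```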

Related

Get all input of int from cin separated with space in c++

N = Input How much attempt (First Line).
s = Input How much value can be added (Second, fourth and sixth lines).
P = Input of numbers separated with space.
Example :
3 ( Input N )
2 ( s 1 )
2 3
3 ( s 2 )
1 2 3
1 ( s 3 )
12
Expected output:
Read #1: 5 (Output s1 = 2 + 3)
Read #2: 6 (Output s2 = 1+2+3)
Read #3: 12 (Output s3 = 12)
I've been searching and trying for a very long time but couldn't figure out something as basic as how to read a given count of space-separated numbers with cin and add all the values into a variable. For example:
#include <iostream>
using namespace std;
int main() {
int l, o[l], r, p[r], i;
cin >> l;
for(i = 0; i < l; i++) {
cin>>o[l];
r = o[l]; // for every o[0] to o[l]
}
while (cin>>o[l]) {
for (i = 0; i < l; i++){
cin>>p[o]; // for every o[0] to o[l]
// i.e o[l] = 1 then 2 values can be added (because it starts from zero)
// input 1 2
// o[1] = {1, 2}
int example += o[1];
cout<< "Read#2: " << example;
}
}
}
And it doesn't work. Then I found getline(), ignoring s and just reading everything that should be added up, but it turned out to work only with character strings. I tried scanf, but I'm not sure how it works. So I'm wondering whether this is all about building an s (values) × 1 (column) matrix in a loop, but I'm still not sure how to do that. Are there any easy solutions to this without additional libraries or something like that? Thanks in advance.
#include <iostream>
using namespace std;
int main() {
int t; //number of attempts
cin >> t;
while(t--) { // for t attempts
int n, s = 0; //number of values and initial sum
cin >> n;
while (n--) { //for n values
int k; //value to be added
cin >> k;
s += k; //add k to sum
}
cout << s << "\n"; //print the sum and a newline
}
return 0;
}
If you want to add more details (e.g. print Read#i on the i-th attempt), you can always use
for (int i = 1; i <= t; i++)
to replace while(t--) and at the end of the attempt just print
cout << "Read#" << i << ": " << s << "\n";
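Put together, the variant with numbered output might look like the sketch below. Taking the streams as parameters (and the name sumAttempts) is an assumption made here so the logic can be tried on a string; passing std::cin and std::cout gives the interactive behaviour.

```cpp
#include <iostream>
#include <sstream>

// Reads t attempts; each attempt is a count n followed by n integers.
// Writes one "Read #i: sum" line per attempt.
void sumAttempts(std::istream& in, std::ostream& out)
{
    int t = 0; // number of attempts
    in >> t;
    for (int i = 1; i <= t; i++) {
        int n = 0, s = 0; // number of values and running sum
        in >> n;
        while (n-- > 0) {
            int k = 0; // value to be added
            in >> k;
            s += k;
        }
        out << "Read #" << i << ": " << s << "\n";
    }
}
```

Calling sumAttempts(std::cin, std::cout) from main would process the example input from the question.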

find the maximum number of words in a sentence from a paragraph with C++

I am trying to find the maximum number of words in a sentence (sentences are separated by a dot) from a paragraph, and I am completely stuck on how to sort and output to stdout.
Eg:
Given a string S: {"Program to split strings. By using custom split function. In C++"};
The expected output should be : 5
#define max 8 // define the max string
string strings[max]; // define max string
string words[max];
int count = 0;
void split (string str, char seperator) // custom split() function
{
int currIndex = 0, i = 0;
int startIndex = 0, endIndex = 0;
while (i <= str.size())
{
if (str[i] == seperator || i == str.size())
{
endIndex = i;
string subStr = "";
subStr.append(str, startIndex, endIndex - startIndex);
strings[currIndex] = subStr;
currIndex += 1;
startIndex = endIndex + 1;
}
i++;
}
}
void countWords(string str) // Count The words
{
int count = 0, i;
for (i = 0; str[i] != '\0';i++)
{
if (str[i] == ' ')
count++;
}
cout << "\n- Number of words in the string are: " << count +1 <<" -";
}
//Sort the array in descending order by the number of words
void sortByWordNumber(int num[30])
{
/* CODE str::sort? std::*/
}
int main()
{
string str = "Program to split strings. By using custom split function. In C++";
char seperator = '.'; // dot
int numberOfWords;
split(str, seperator);
cout <<" The split string is: ";
for (int i = 0; i < max; i++)
{
cout << "\n initial array index: " << i << " " << strings[i];
countWords(strings[i]);
}
return 0;
}
count + 1 in countWords() gives the correct number only for the first result; for the later sentences the leading " " whitespace is added to the word count.
Please consider answering with the easiest solution to understand first (std::sort, making a new function, lambda).
Your code does not make sense. For example, the meaning of this declaration
string strings[max];
is unclear.
And to find the maximum number of words in the sentences of a paragraph, there is no need to sort the sentences themselves by the number of words.
If I have understood correctly, what you need is something like the following.
#include <iostream>
#include <sstream>
#include <string>
#include <iterator>
int main()
{
std::string s;
std::cout << "Enter a paragraph of sentences: ";
std::getline( std::cin, s );
size_t max_words = 0;
std::istringstream is( s );
std::string sentence;
while ( std::getline( is, sentence, '.' ) )
{
std::istringstream iss( sentence );
auto n = std::distance( std::istream_iterator<std::string>( iss ),
std::istream_iterator<std::string>() );
if ( max_words < n ) max_words = n;
}
std::cout << "The maximum number of words in sentences is "
<< max_words << '\n';
return 0;
}
If you enter the paragraph
Here is a paragraph. It contains several sentences. For example, how to use string streams.
then the output will be
The maximum number of words in sentences is 7
If you are not yet familiar with string streams then you could use member functions find, find_first_of, find_first_not_of with objects of the type std::string to split a string into sentences and to count words in a sentence.
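A rough sketch of that stream-free alternative, using only std::string member functions. The names wordsIn and maxWordsPerSentence are invented for illustration, and a "word" is assumed to be any maximal run of non-whitespace characters.

```cpp
#include <cstddef>
#include <string>

// Counts maximal runs of non-whitespace characters in one sentence.
std::size_t wordsIn(const std::string& sentence)
{
    const std::string spaces = " \t\n";
    std::size_t count = 0;
    std::size_t pos = sentence.find_first_not_of(spaces);
    while (pos != std::string::npos) {
        ++count;
        pos = sentence.find_first_of(spaces, pos);     // end of the current word
        pos = sentence.find_first_not_of(spaces, pos); // start of the next word
    }
    return count;
}

// Splits the paragraph at '.' and returns the largest word count found.
std::size_t maxWordsPerSentence(const std::string& paragraph)
{
    std::size_t best = 0;
    std::size_t start = 0;
    while (start < paragraph.size()) {
        std::size_t end = paragraph.find('.', start);
        if (end == std::string::npos) end = paragraph.size(); // last sentence without a dot
        const std::size_t n = wordsIn(paragraph.substr(start, end - start));
        if (n > best) best = n;
        start = end + 1;
    }
    return best;
}
```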
Your use case sounds like a reduction. Essentially, you can have a state machine (parser) that goes through the string and updates some state (e.g. counters) when it encounters the word and sentence delimiters. Special care should be given to corner cases, e.g. multiple consecutive whitespace characters or more than one consecutive full stop (.). A reduction handling these cases is shown below:
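#include <numeric> // std::accumulate
#include <string>
#include <utility> // std::pair, std::make_pair
int max_words_in(std::string const& str)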
{
// p is the current and max word count.
auto parser = [in_space = false] (std::pair<int, int> p, char c) mutable {
switch (c) {
case '.': // Sentence ends.
if (!in_space && p.second <= p.first) p.second = p.first + 1;
p.first = 0;
in_space = true;
break;
case ' ': // Word ends.
if (!in_space) ++p.first;
in_space = true;
break;
default: // Other character encountered.
in_space = false;
}
return p; // Return the updated accumulation value.
};
return std::accumulate(
str.begin(), str.end(), std::make_pair(0, 0), parser).second;
}
The tricky part is deciding how to handle degenerate cases, e.g. what should the output be for "This is a , ,tricky .. .. string to count" where different types of delimiters alternate in arbitrary ways. Having a state machine implementation of the parsing logic allows you to easily adjust your solution (e.g. you can pass an "ignore list" to the parser and update the default case to not reset the in_space variable when c belongs to that list).
vector<string> split(const string& str, char separator) // custom split() function
{
    vector<string> sentences;
    size_t start = 0; // index where the current sentence begins
    for (size_t i = 0; i < str.size(); i++)
    {
        if (str[i] == separator)
        {
            // keep the separator at the end of the sentence
            sentences.push_back(str.substr(start, i - start + 1));
            start = i + 1;
        }
    }
    if (start < str.size()) // trailing text without a final separator
    {
        sentences.push_back(str.substr(start));
    }
    return sentences;
}

No Instance of overloaded function vector of structs

I am creating a project for college where I have to recreate the Scrabble Junior game as a console game, but I've got a problem and a question about my code.
First, I get an error in my code saying:
"no instance of overloaded function "std::vector<_Ty, _Alloc>::push_back [with _Ty=Board::Word, _Alloc=std::allocator<Board::Word>]" matches the argument list
argument types are: (Board::Word)
object type is: std::vector<Board::Word, std::allocator<Board::Word>>
The struct Word is this one:
struct Word {
int row;
int column;
char orientation;
int tilesadded = 0; //starts at 0
int wordlength;
bool completed = false;
int currentletterpositiontoAdd[2]; //array to hold the coordinates of the next tile to be added
std::string name;
};
This struct basically stores every word and its position on the board.
I also have a vector storing every Word struct: std::vector <Word> words;
The code that builds this struct is the following (I read every word and its board position from a file):
void Board::GetBoard()
{
std::ifstream file;
std::string filename, input;
std::cout << "----------------------------------------------------------------------------------------" << std::endl << std::endl;
std::cout << "What is the directory of the board file? (.txt is added for you) -> ";
while (std::getline(std::cin, filename))
{
file.open(filename + ".txt");
if (!file.is_open())
{
std::cin.clear();
file.clear();
RED;
std::cerr << "Error reading file." << std::endl;
WHITE;
std::cout << "What is the directory of the board file? (.txt is added for you) -> ";
}
else
break;
}
while (std::getline(file, input))
{
if ((int)input[0] == 49 || (int)input[0] == 50)
{ //this means that its the first line of the file and the first character is either 1 or 2
boardSize = stoi(input.substr(0, 2));
}
else
{
std::string nametoCat;
Word word;
word.row = input[0] - 'A' + 1; //calculation of the position on the board using ascii code ex: input[0] = C so: 'C' -'A' + 1 = 3 row -> 3
word.column = input[1] - 'a' + 1; //calculation of the position on the board using ascii code ex: input[1] = e so: 'e' -'a' + 1 = 5 column -> 5
word.orientation = input[3];
word.currentletterpositiontoAdd[0] = word.row;
word.currentletterpositiontoAdd[1] = word.column;
for (int x = 5; x < 1000000000; x++)
{ //for loop to check the name ending and build a string with the name
if (input[x] == '\0')
break;
else
nametoCat += input[x];
}
word.name = nametoCat;
word.wordlength = word.name.size(); //storing the word length to use later to check if word is completed in board
words.push_back(word);
}
}
}
The file looks like this:
15 x 15
Ak H EGGS
Bg H BUZZ
Ca H MUSIC
Cm H ARM
...
Secondly, I would like to make the code cleaner and more understandable by removing the 1000000000 from for (int x = 5; x < 1000000000; x++) and doing it another way, but I can't find a solution. This 1000000000 looks like a magic number rather than a bound that would always work; for example, it wouldn't work with a word of size 1000000001 (unlikely but possible).
Thank you.
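For the second point, one possible way to drop the magic number is to let std::string::substr take everything from index 5 to the end of the line. The sketch below assumes, as the loop above does, that the word always starts at index 5; the helper name wordFromLine is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the word part of a board line such as "Ak H EGGS".
// Assumes the word starts at index 5, as in the original loop.
std::string wordFromLine(const std::string& input)
{
    const std::size_t nameStart = 5;
    if (input.size() <= nameStart) return ""; // line too short to contain a word
    return input.substr(nameStart);           // everything up to the end of the line
}
```

Inside GetBoard() this would amount to word.name = input.substr(5); after checking that the line is long enough.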

C++: Reading lines of integers from cin

As I'm familiarizing myself with the I/O aspect of C++, I'm trying to write a program to read some lines of integers from std::cin. Say the input looks like this:
1 2 3
4 5 6
7 8 9
10 11 12
How can I read the above lines into a 2D vector?
vector<vector<int>> nums;
/*
... some code here and nums will look like the following:
nums = {
{1,2,3},
{4,5,6},
{7,8,9},
{10,11,12}
}
*/
I've also tried to read the above lines of integers to a 1D vector, but I'm having some issues dealing with the '\n' character. My code is:
string rawInput;
vector<int> temp;
while(getline(cin, rawInput, ' ') ){
int num = atoi( rawInput.c_str() );
temp.push_back(num);
}
And the final result I got by printing out all the elements in the "temp" vector is:
1 2 3 5 6 8 9 11 12 // 4, 7, 10 went missing
Any help is appreciated. Thank you.
First use getline to grab an entire line, then you can use an istringstream to create a stream of ints just for that line.
At that point it's just a matter of creating each subvector of ints using the vector constructor that takes two iterators. An istream_iterator<int> on your istringstream gets this done:
std::vector<std::vector<int>> nums;
std::string line;
while (std::getline(std::cin, line)) {
std::istringstream ss(line);
nums.emplace_back(std::istream_iterator<int>{ss}, std::istream_iterator<int>{});
}
What is happening is that, since you are using only ' ' (space) as the delimiter, the input is split into
1
2
3\n4 //<------ Newline also comes with the input
...
So you are passing 3\n4, 6\n7, etc. to atoi, which returns 3, 6, etc. (atoi parses the input up to the first non-digit character), and the 4, 7, etc. are lost.
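The effect can be reproduced in isolation. The sketch below runs the question's loop on an arbitrary stream (the function name is made up for illustration), which makes it easy to see which tokens reach atoi.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reproduces the question's loop: tokens are split on ' ' only.
std::vector<int> readSplitOnSpaceOnly(std::istream& in)
{
    std::vector<int> values;
    std::string token;
    while (std::getline(in, token, ' ')) {
        // A token such as "3\n4" spans two lines; atoi stops at '\n' and yields 3.
        values.push_back(std::atoi(token.c_str()));
    }
    return values;
}
```

Feeding it the four lines from the question yields 1 2 3 5 6 8 9 11 12, the same sequence the question reports.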
To achieve what you want, you can use getline with istringstream (keeping the default delimiter, newline):
string rawInput;
vector<vector<int>> temp;
while(getline(cin, rawInput) ){
istringstream bufferInput(rawInput);
temp.push_back(vector<int>{std::istream_iterator<int>{bufferInput}, std::istream_iterator<int>{}});
}
You can use stringstream:
string rawInput;
vector<vector<int>> nums;
while (getline(cin, rawInput)) {
    stringstream ss(rawInput); // fresh stream per line, so no stale eof state
    vector<int> temp;
    int x;
    while (ss >> x) {
        temp.push_back(x);
    }
    nums.push_back(temp);
}
I recently wrote an answer to another question but with a few adaptations it achieves exactly what you are looking for (I hope):
#include <cstring> // strlen
#include <iostream>
#include <stdexcept> // out_of_range
#include <string>
#include <vector>
using namespace std;
enum XYZ { X = 0, Y = 1, Z = 2 };
struct Vector {
float x, y, z;
Vector(float _x=0, float _y=0, float _z=0) {
x = _x;
y = _y;
z = _z;
}
float& operator[](size_t index) {
if (index == XYZ::X) return x;
if (index == XYZ::Y) return y;
if (index == XYZ::Z) return z;
throw out_of_range("Vector index must be 0, 1 or 2"); // throw by value, not a heap pointer
}
};
#define min(a, b) (((a) < (b)) ? (a) : (b))
bool isCharNumeric(char c) {
const char* numbers = "0123456789";
for (size_t index = 0; index < strlen(numbers); index++)
if (c == numbers[index]) return true; return false;
}
vector<Vector> parseNumbers(string str_in) {
str_in += " "; //safe, no out of bounds
vector<Vector> results = {};
char currentChar;
char skipChar = ' ';
bool found_period = false;
size_t count_len = 0;
Vector vector_buffer(0,0,0);
XYZ current_axis = (XYZ)0;
for (size_t index = 0; index < str_in.length(); index++) {
currentChar = str_in[index];
if (currentChar == skipChar || currentChar == '\n' || currentChar == '\t')
continue;
else if (isCharNumeric(currentChar)) {
string word = ""; //word buffer
size_t word_len = min(min(str_in.find_first_of(' ', index + 1) - (index), str_in.find_first_of('\n', index + 1) - (index)), str_in.find_first_of('\t', index + 1) - (index)); //whatever char comes first; newline, tab or space
//append chars of following word checking if it is still valid number char
if (word_len > 0) {
size_t count_word_len = 0;
for (count_word_len = 0; count_word_len < word_len; count_word_len++)
if (isCharNumeric(str_in[index + count_word_len])) {
word += str_in[index + count_word_len];
}
else if (str_in[index + count_word_len] == '.' && isCharNumeric(str_in[index + count_word_len + 1])) {
//Floating-point numbers
word += '.';
found_period = true;
continue;
}
else {
word = "";
continue;
}
vector_buffer[current_axis] = stof(word);
if (current_axis == XYZ::Z) {
current_axis = XYZ::X;
results.push_back(vector_buffer);
}
else {
current_axis = (XYZ)(current_axis + 1);
}
index += count_word_len;
word = "";
continue;
}
}
}
return results;
}
Example implementation:
int main() {
    string user_input;
    getline(cin, user_input); // read the whole line; cin >> would stop at the first space
    vector<Vector> numbers = parseNumbers(user_input);
    for (const Vector& v : numbers) { // standard range-based for instead of the MSVC-only "for each"
        cout << "X=" << v.x << "\n";
        cout << "Y=" << v.y << "\n";
        cout << "Z=" << v.z << "\n\n";
    }
}
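Surprisingly, none of the answers use the istream extraction operator (operator>>):
http://www.cplusplus.com/reference/istream/istream/operator%3E%3E/
When an extraction fails (for example at end of input, where eofbit and failbit are set), the stream converts to false, so run a while loop on the extraction itself.
This works for all built-in types, and the operator can be overloaded for custom types (such as a 2D texture).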
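A minimal sketch of such a loop, with a hypothetical function name. Note that operator>> treats newlines like any other whitespace, so this flattens the input into a single sequence; the row structure asked for in the question is not preserved.

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Reads ints until extraction fails (end of input or a non-numeric token).
std::vector<int> readAllInts(std::istream& in)
{
    std::vector<int> values;
    int x;
    while (in >> x) { // the stream converts to false once an extraction fails
        values.push_back(x);
    }
    return values;
}
```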

Input String where Integer should be - C++

I'm a beginner and am stuck on a simple problem while working through Stroustrup's Programming: Principles and Practice Using C++.
Using only basic elements
#include "std_lib_facilities.h"
int main()
{
double highest = 0;
double lowest = 100;
int i=0;
double sum = 0;
vector <double> inputlist;
double input;
string unit;
cout<<"Type in a number followed by it's unit \n";
while(cin>>input>>unit){
inputlist.push_back(input);
sum += inputlist[i];
if (input >= lowest && input <= highest){
cout<<input<<" \n";
++i;
}
else if (input < lowest){
lowest = input;
cout<<"\nLowest Number so far \n"<<lowest;
++i;
}
else if (input > highest){
highest = input;
cout<<"\nHighest number so far \n"<< highest;
++i;
}
else
cout<<"Lowest is: \n"<<lowest<<"\n\n Highest is: \n"<<highest<<" \n\n and the total is: \n"<<sum;
if (unit == "ft", "m", "in","cm")
cout<<unit<<"\n";
else
cout<<"cannot recognize unit";
}
keep_window_open();
return 0;
}
I need the program to show the user the sum and the highest and lowest values when the character "|" is entered. The problem is that it needs to be entered where the number should be entered.
NOTE: I don't know much about conversions, but I tried a few and they didn't work.
If I understood you correctly, you want to read an int from std::cin, but:
int i;
if (std::cin >> i) {
...
doesn't suit your needs, since a '|' sign might appear as the signal to stop reading.
Here's what you could do: read input word by word (std::string) and parse these words separately using temporary std::istringstream:
std::string word;
if (std::cin >> word) {
if (word == "|")
...
// else:
std::istringstream is(word);
int i;
if (is >> i) {
// integer successfully retrieved from stream
}
}
Just remember to #include <sstream>.
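As a fuller sketch of this word-by-word approach: the function below (a made-up name) reads doubles rather than ints, since the question's program stores doubles, and it simply skips words that are not numbers, such as units. Both choices are assumptions on top of the snippet above.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Collects numeric words until "|" or end of input; other words are skipped.
std::vector<double> readUntilBar(std::istream& in)
{
    std::vector<double> values;
    std::string word;
    while (in >> word && word != "|") {
        std::istringstream is(word); // temporary stream for this word only
        double d;
        if (is >> d) {
            values.push_back(d); // number successfully retrieved
        }
    }
    return values;
}
```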
Read the value as a string. If it isn't "|", convert it to double using the following function:
double toDouble(string s)
{
int sign = 1, i=0;
if (s[0]=='-')
sign = -1, i=1;
double result = 0, result2 = 0;
for (; i < s.size(); i++)
if (s[i] == '.')
break;
else
result = result * 10 + (s[i] - '0');
for (i = s.size()-1 ; i>=0 ; i--)
if (s[i] == '.')
break;
else
result2 = result2 / 10 + (s[i] - '0');
if (i>=0)
result += result2/10;
return result * sign;
}
Summing meters with inches does not make much sense. Therefore, you should consider translating the units into scaling factors. You could use a map to look up the scaling factors.
Even if it is somewhat overkill, you might use regular expressions to parse the user input. If the regex does not match, you can test for things like "|".
The new C++ standard (http://en.wikipedia.org/wiki/C%2B%2B11) defines a regex library for this purpose. Unfortunately, at the time of writing the g++ regex library is buggy, but you can use Boost (http://www.boost.org/doc/libs/1_54_0/libs/regex/doc/html/boost_regex/).
Here is an example:
#include <cstdlib> // atof
#include <iostream>
#include <string>
#include <vector>
#include <map>
#include <boost/regex.hpp> //< Unfortunately, std::regex is buggy.
using namespace std; ///< Avoid this in larger projects!
using namespace boost;
int main() {
const string strReFloat("([-+]?[[:digit:]]*\\.?[[:digit:]]+(?:[eE][-+]?[[:digit:]]+)?)");
const string strReUnit("([[:alpha:]]+)");
const string strReMaybeBlanks("[[:blank:]]*");
const string strReFloatWithUnit(strReMaybeBlanks+strReFloat+strReMaybeBlanks+strReUnit+strReMaybeBlanks);
const regex reFloatWithUnit(strReFloatWithUnit);
const map<const string,double> unitVal= {
{"m", 1.0},
{"in", 0.0254},
{"ft", 0.3048},
{"cm", 0.01}
};
double highest = 0;
double lowest = 100;
int i=0;
double sum = 0;
vector <double> inputlist;
double input;
double unitToMeter;
string unit;
string str;
while( (cout<<"\nType in a number followed by it's unit \n", getline(cin,str), str != "") ){
smatch parts;
if( regex_match(str,parts,reFloatWithUnit) ) {
unit = parts[2].str();
auto found = unitVal.find(unit);
if( found != unitVal.end() ) {
cout<<unit<<"\n";
input = found->second * atof(parts[1].str().c_str());
} else {
cout << "Unit \"" << unit << "\" not recognized. Using meters.\n";
input = atof(parts[1].str().c_str()); // fall back to treating the value as meters
}
inputlist.push_back(input);
sum += inputlist[i];
if (input >= lowest && input <= highest){
cout<<input<<" \n";
++i;
}
else if (input < lowest){
lowest = input;
cout<<"\nLowest Number so far \n"<<lowest;
++i;
}
else if (input > highest){
highest = input;
cout<<"\nHighest number so far \n"<< highest;
++i;
}
} else if( str == "|" ) {
cout << "sum:" << sum << "\n";
} else {
cout << "Input not recognized.\n";
}
}
return 0;
}