C++ Read until tab detected - c++

This may be a little bit redundant, but is there a short/compact method of reading in a string until a tab is reached in C++? Similar to other questions, but I want to keep reading even if I hit a space. For example if the STDIN is
Cute Kitty is fabulous as always
Then I want to read in Cute Kitty; is; fabulous as always, three times.
I've seen people do this with regex in files, but how would you do this on the stdin in C++? I want to put it in a string class and whenever I try something like
scanf("%s\t", &mystring);
It throws up an error because I'm not using an array of chars.
Thanks, please keep answers easy enough for a noob to understand.

This code seems to work for me. It basically gets the line that was entered from the user via stdin and then reads each character waiting for a tab character (\t), or the end of the line.
#include <iostream>
#include <string>
int main()
{
std::string a;
std::getline(std::cin,a);
int index_holder = 0;
for(std::string::size_type i = 0; i < a.size(); ++i)
{
if(a[i] == '\t' || (i == a.size() - 1)) {
std::cout << a.substr(index_holder, i - index_holder) << std::endl;
index_holder = i + 1;
}
}
return 0;
}

Have a look at strtok:
char * strtok ( char * str, const char * delimiters );
Split string into tokens. A sequence of calls to this function split str into tokens, which are sequences of contiguous characters separated by any of the characters that are part of delimiters.

Related

Extracting a particular data from a CSV file in c++

I have written a program to read a CSV file but I'm having some trouble in extracting data from that CSV file in c++. I want to count the no. of columns starting from the 5th column in the 1st row until the last column of the 1st row of the CSV file. I have written the following code to read a CVS file, but I am not sure how shall I count the no. of columns as I have mentioned before.
Will appreciate it if anyone could please tell me how shall I go about it?
char* substring(char* source, int startIndex, int endIndex)
{
int size = endIndex - startIndex + 1;
char* s = new char[size+1];
strncpy(s, source + startIndex, size); //you can read the documentation of strncpy online
s[size] = '\0'; //make it null-terminated
return s;
}
char** readCSV(const char* csvFileName, int& csvLineCount)
{
ifstream fin(csvFileName);
if (!fin)
{
return nullptr;
}
csvLineCount = 0;
char line[1024];
while(fin.getline(line, 1024))
{
csvLineCount++;
};
char **lines = new char*[csvLineCount];
fin.clear();
fin.seekg(0, ios::beg);
for (int i=0; i<csvLineCount; i++)
{
fin.getline(line, 1024);
lines[i] = new char[strlen(line)+1];
strcpy(lines[i], line);
};
fin.close();
return lines;
}
I have attached a few lines from the CSV file:-
Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,
,Afghanistan,33.0,65.0,0,0,0,0,0,0,0,
,Albania,41.1533,20.1683,0,0,0,0
What I need is, in the 1st row, the number of dates after Long.
To answer your question:
I have attached a few lines from the CSV file:-
Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20, ,Afghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Albania,41.1533,20.1683,0,0,0,0
What I need is, in the 1st row, the number of dates after Long.
Yeah, not that difficult - that's how I would do it:
#include <iostream>
#include <string>
#include <fstream>
#include <regex>
#define FILENAME "test.csv" //Your filename as Macro
//(The compiler just sees text.csv instead of FILENAME)
void read(){
std::string n;
//date format pattern %m/%dd/%YY
std::regex pattern1("\\b\\d{1}[/]\\d{2}[/]\\d{2}\\b");
//date format pattern %mm/%dd/%YY
std::regex pattern2("\\b\\d{2}[/]\\d{2}[/]\\d{2}\\b");
std::smatch result1, result2;
std::ifstream file(FILENAME, std::ios::in);
if ( ! file.is_open() )
{
std::cout << "Could not open file!" << '\n';
}
do{
getline(file,n,',');
//https://en.cppreference.com/w/cpp/string/basic_string/getline
if(std::regex_search(n,result1,pattern1))
std::cout << result1.str(1) << n << std::endl;
if(std::regex_search(n,result2,pattern2))
std::cout << result2.str(1) << n << std::endl;
}
while(!file.eof());
file.close();
}
int main ()
{
read();
return 0;
}
The file test.csv contains the following for testing:
Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20, ,Afghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Albania,41.1533,20.1683,0,0,0,0
Province/State,Country/Region,Lat,Long,1/25/20,12/26/20,1/27/20, ,Bfghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Blbania,41.1533,20.1683,0,0,0,0
It actually is pretty simple:
getline takes the open file and "escapes" at a so called escape-charachter,
in your case a comma ','.
(That is the very best way I found in reading csv - you can replace it with whatever you want, for example: ';' or ' ' or '...' - guess you get the drill)
After this you got all data nicely separated underneath one another without a comma.
Now you can "filter" out what you need. I use regex - but use what ever you want.
(Just fyi: For c++ tagged questions you shouldn't use c-style like strncpy..)
I gave you an example for 1.23.20 (m/dd/yy) and to make it simple if your file contains a november or december like 12.22.20 (mm/dd/yy) to make
the regex pattern more easy to read/understand in 2 lines.
you can/may have to expand the regex pattern if the data somehow matches
your date format in the file, really good explained here and not as complicated as it looks.
From that point you can put all the printed stuff f.e. in a vector (some more convenient array) to handle and/or pass/return data - that's up to you.
If you need more explaining I am happy to help you out and/or expand this example, just leave a comment.
You basically want to search for the seperator substring within your line (normally it is ';').
If you print out your lines it should look like this:
a;b;c;d;e;f;g;h
There are several ways to achieve what you want, I would look for a strip or split upon character function. Something along the example below should work. If you use std you can go with str.IndexOf instead of a loop.
int rows(char* line,char seperator, int count) {
unsigned length = strlen(line);
for (int i=pos; i<length;i++){
if(strcmp(line[i],seperator)) break;
}
count++;
if (i<length-1) return rows(substring(line,i,length-i),seperator,count);
else return count;
}
The recursion can obviously be replaced by one loop ;)
int countSign(char* line, char* sign){
unsigned l = strlen(line);
int count = 0;
for (int i=0; i < l; i++) {
if(strcmp(line[i],sign)) count++;
}
}

Comparing numbers in strings to numbers in int?

I'm trying to make a program that will open a txt file containing a list of names in this format (ignore the bullets):
3 Mark
4 Ralph
1 Ed
2 Kevin
and will create a file w/ organized names based on the number in front of them:
1 Ed
2 Kevin
3 Mark
4 Ralph
I think I'm experiencing trouble in line 40, where I try to compare the numbers stored in strings with a number stored in an int.
I can't think of any other way to tackle this, any advice would be wonderful!
#include <iostream>
#include <fstream>
#include <vector>
#include <cstdlib>
using namespace std;
int main()
{
ifstream in;
ofstream out;
string line;
string collection[5];
vector <string> lines;
vector <string> newLines;
in.open("infile.txt");
if (in.fail())
{
cout << "Input file opening failed. \n";
exit(1);
}
out.open("outfile.txt");
if (out.fail())
{
cout << "Output file opening failed. \n";
exit(1);
}
while (!in.eof())
{
getline(in, line);
lines.push_back(line);
}
for (int i = 0; i < lines.size(); i++)
{
collection[i] = lines[i];
}
for (int j = 0; j < lines.size(); j++)
{
for (int x = 0; x < lines.size(); x--)
{
if (collection[x][0] == j)
newLines.push_back(collection[x]);
}
}
for (int k = 0; k < newLines.size(); k++)
{
out << newLines[k] << endl;
}
in.close( );
out.close( );
return 0;
}
Using a debugger would tell you where you went wrong, but let me highlight the mistake:
if (collection[x][0] == j)
You're expecting a string like 3 Mark. The first character of this string is '3', but that has the ASCII value of 51, and that is the numerical value you'll get when trying work with it is this way! This will never equal j, unless you've got a lot of lines in your file, and then your search system will not work at all like you wanted. YOu need to convert that character into an integer, and then do your comparison.
C++ offers many way to process data via streams, including parsing simple datafiles and converting text to numbers and vice versa. Here's a simple standalone function that will read a datafile like you have (only with arbitrary text including spaces after the number on each line).
#include <algorithm>
// snip
struct file_entry { int i; std::string text; };
std::vector<file_entry> parse_file(std::istream& in)
{
std::vector<file_entry> data;
while (!in.eof())
{
file_entry e;
in >> e.i; // read the first number on the line
e.ignore(); // skip the space between the number and the text
std::getline(in, e.text); // read the whole of the rest of the line
data.push_back(e);
}
return data;
}
Because the standard way that >> works involves reading until the next space (or end of line), if you want to read a chunk of text which contains whitespace, it will be much easier to use std::getline to just slurp up the whole of the rest of the current line.
Note: I've made no attempt to handle malformed lines in the textfile, or any number of other possible error conditions. Writing a proper file parser is outside of the scope of this question, but there are plenty of tutorials out there on using C++'s stream functionality appropriately.
Now you have the file in a convenient structure, you can use other standard c++ features to sort it, rather than reinventing the wheel and trying to do it yourself:
int sort_file_entry(file_entry a, file_entry b)
{
return a.i < b.i;
}
int main()
{
// set up all your streams, etc.
std::vector<file_entry> lines = parse_file(in);
std::sort(lines.begin(), lines.end(), sort_file_entry);
// now you can write the sorted vector back out to disk.
}
Again, a full introduction to how iterators and containers work is well outside the scope of this answer, but the internet has no shortage of introductory C++ guides out there. Good luck!

Remove character from array where spaces and punctuation marks are found [duplicate]

This question already has answers here:
C++ Remove punctuation from String
(12 answers)
Closed 9 years ago.
In my program, I am checking whole cstring, if any spaces or punctuation marks are found, just add empty character to that location but the complilor is giving me an error: empty character constant.
Please help me out, in my loop i am checking like this
if(ispunct(str1[start])) {
str1[start]=''; // << empty character constant.
}
if(isspace(str1[start])) {
str1[start]=''; // << empty character constant.
}
This is where my errors are please correct me.
for eg the word is str,, ing, output should be string.
There is no such thing as an empty character.
If you mean a space then change '' to ' ' (with a space in it).
If you mean NUL then change it to '\0'.
Edit: the answer is no longer relevant now that the OP has edited the question. Leaving up for posterity's sake.
If you're wanting to add a null character, use '\0'. If you're wanting to use a different character, using the appropriate character for that. You can't assign it nothing. That's meaningless. That's like saying
int myHexInt = 0x;
or
long long myIndeger = L;
The compiler will error. Put in the value you wanted. In the char case, that's a value from 0 to 255.
UPDATE:
From the edit to OP's question, it's apparent that he/she wanted to trim a string of punctuation and space characters.
As detailed in the flagged possible duplicate, one way is to use remove_copy_if:
string test = "THisisa test;;';';';";
string temp, finalresult;
remove_copy_if(test.begin(), test.end(), std::back_inserter(temp), ptr_fun<int, int>(&ispunct));
remove_copy_if(temp.begin(), temp.end(), std::back_inserter(finalresult), ptr_fun<int, int>(&isspace));
ORIGINAL
Examining your question, replacing spaces with spaces is redundant, so you really need to figure out how to replace punctuation characters with spaces. You can do so using a comparison function (by wrapping std::ispunct) in tandem with std::replace_if from the STL:
#include <string>
#include <algorithm>
#include <iostream>
#include <cctype>
using namespace std;
bool is_punct(const char& c) {
return ispunct(c);
}
int main() {
string test = "THisisa test;;';';';";
char test2[] = "THisisa test;;';';'; another";
size_t size = sizeof(test2)/sizeof(test2[0]);
replace_if(test.begin(), test.end(), is_punct, ' ');//for C++ strings
replace_if(&test2[0], &test2[size-1], is_punct, ' ');//for c-strings
cout << test << endl;
cout << test2 << endl;
}
This outputs:
THisisa test
THisisa test another
Try this (as you asked for cstring explicitly):
char str1[100] = "str,, ing";
if(ispunct(str1[start]) || isspace(str1[start])) {
strncpy(str1 + start, str1 + start + 1, strlen(str1) - start + 1);
}
Well, doing this just in pure c language, there are more efficient solutions (have a look at #MichaelPlotke's answer for details).
But as you also explicitly ask for c++, I'd recommend a solution as follows:
Note you can use the standard c++ algorithms for 'plain' c-style character arrays also. You just have to place your predicate conditions for removal into a small helper functor and use it with the std::remove_if() algorithm:
struct is_char_category_in_question {
bool operator()(const char& c) const;
};
And later use it like:
#include <string>
#include <algorithm>
#include <iostream>
#include <cctype>
#include <cstring>
// Best chance to have the predicate elided to be inlined, when writing
// the functor like this:
struct is_char_category_in_question {
bool operator()(const char& c) const {
return std::ispunct(c) || std::isspace(c);
}
};
int main() {
static char str1[100] = "str,, ing";
size_t size = strlen(str1);
// Using std::remove_if() is likely to provide the best balance from perfor-
// mance and code size efficiency you can expect from your compiler
// implementation.
std::remove_if(&str1[0], &str1[size + 1], is_char_category_in_question());
// Regarding specification of the range definitions end of the above state-
// ment, note we have to add 1 to the strlen() calculated size, to catch the
// closing `\0` character of the c-style string being copied correctly and
// terminate the result as well!
std::cout << str1 << endl; // Prints: string
}
See this compilable and working sample also here.
As I don't like the accepted answer, here's mine:
#include <stdio.h>
#include <string.h>
#include <cctype>
int main() {
char str[100] = "str,, ing";
int bad = 0;
int cur = 0;
while (str[cur] != '\0') {
if (bad < cur && !ispunct(str[cur]) && !isspace(str[cur])) {
str[bad] = str[cur];
}
if (ispunct(str[cur]) || isspace(str[cur])) {
cur++;
}
else {
cur++;
bad++;
}
}
str[bad] = '\0';
fprintf(stdout, "cur = %d; bad = %d; str = %s\n", cur, bad, str);
return 0;
}
Which outputs cur = 18; bad = 14; str = string
This has the advantage of being more efficient and more readable, hm, well, in a style I happen to like better (see comments for a lengthy debate / explanation).

Basic C++: convert from char to string

I'm a little confused by the class code. Here's what I'm trying to do:
//Program takes "string text" and compares it to "string remove". Any letters in
//common between the two get deleted and the remaining string gets returned.
#include <string>
#include "genlib.h"
string CensorString1(string text, string remove);
int main() {
CensorString1("abcdef", "abc");
return 0;
}
string CensorString1(string text, string remove) {
for (int i = 0; text[i]; i++){
for (int n = 0; remove[n]; n++){
if (i != n){
string outputString = ' '; //to store non repeated chars in,
//forming my return string
outputString += i;
}
}
}
return outputString;
}
I'm getting an error on the "outputString += 1" saying: "cannot convert from "char" to
std::basic_string
I'm also getting an error on the "return outputString" saying: undeclared identifier
???????
I get that I'm putting a "char" on a "string" variable but what if shortly that "char" will soon be a string? Is there a way to pass this?
I'm always forgetting libraries. Can someone recommend a couple of standard/basic libraries I should always think about? Right now I'm thinking , "genlib.h" (from class).
C++ is kicking my ass. I can't get around constant little errors. Tell me it's going to get better.
There are many errors in your code:
Your outputString needs to be in the outer scope (syntax)
You compare i to n instead of text[i] to remove[n] (semantic)
You are adding i to the output instead of text[i] (semantic)
You ignore the return of CensorString1 (semantic)
Here is your modified code:
string CensorString1(string text, string remove) {
string outputString;
for (int i = 0; text[i] ; i++){
for (int n = 0; remove[n] ; n++){
if (text[i] != remove[n]){
outputString += text[i];
}
}
}
return outputString;
}
This has some remaining issues. For example, using text[i] and remove[n] for termination conditions. It is also very inefficient, but it should be a decent start.
At any rate, strings are always double-quoted in C and C++. Single-quoted constants are char constants. Fix that and you should probably be all right.
Also, look at this SO question: How do you append an int to a string in C++?
Good luck at Stanford!
There are some problems there:
string outputString = ' '; will try to construct a string from a char, which you can't do. You can assign a char to a string though, so this should be valid:
string outputString;
outputString = ' ';
Then, outputString is only visible within your if, so it won't act as an accumulator but rather be created and destroyed.
You're also trying to add character indices to the string, instead of characters, which is not what I think you want to be doing. It seems like you're mixing up C and C++.
For example, if you want to print the characters of a string, you could do something like:
string s("Test");
for (int i=0;i<s.length();i++)
cout << s[i];
Finally, I'd say that if you want to remove characters in text that also appear in remove, you'd need to make sure that none of the characters in remove match your current character, before you add it to the output string.
This is an implementation of what I think you want, your code has multiple problems which just showed up described in multiple other answers.
std::string CensorString1(std::string text, std::string remove) {
std::string result;
for (int i = 0; i<text.length(); i++) {
const char ch = text[i];
if(remove.find(ch) == -1)
result.append(1,ch);
}
return result;
}

How to tokenize (words) classifying punctuation as space

Based on this question which was closed rather quickly:
Trying to create a program to read a users input then break the array into seperate words are my pointers all valid?
Rather than closing I think some extra work could have gone into helping the OP to clarify the question.
The Question:
I want to tokenize user input and store the tokens into an array of words.
I want to use punctuation (.,-) as delimiter and thus removed it from the token stream.
In C I would use strtok() to break an array into tokens and then manually build an array.
Like this:
The main Function:
char **findwords(char *str);
int main()
{
int test;
char words[100]; //an array of chars to hold the string given by the user
char **word; //pointer to a list of words
int index = 0; //index of the current word we are printing
char c;
cout << "die monster !";
//a loop to place the charecters that the user put in into the array
do
{
c = getchar();
words[index] = c;
}
while (words[index] != '\n');
word = findwords(words);
while (word[index] != 0) //loop through the list of words until the end of the list
{
printf("%s\n", word[index]); // while the words are going through the list print them out
index ++; //move on to the next word
}
//free it from the list since it was dynamically allocated
free(word);
cin >> test;
return 0;
}
The line tokenizer:
char **findwords(char *str)
{
int size = 20; //original size of the list
char *newword; //pointer to the new word from strok
int index = 0; //our current location in words
char **words = (char **)malloc(sizeof(char *) * (size +1)); //this is the actual list of words
/* Get the initial word, and pass in the original string we want strtok() *
* to work on. Here, we are seperating words based on spaces, commas, *
* periods, and dashes. IE, if they are found, a new word is created. */
newword = strtok(str, " ,.-");
while (newword != 0) //create a loop that goes through the string until it gets to the end
{
if (index == size)
{
//if the string is larger than the array increase the maximum size of the array
size += 10;
//resize the array
char **words = (char **)malloc(sizeof(char *) * (size +1));
}
//asign words to its proper value
words[index] = newword;
//get the next word in the string
newword = strtok(0, " ,.-");
//increment the index to get to the next word
++index;
}
words[index] = 0;
return words;
}
Any comments on the above code would be appreciated.
But, additionally, what is the best technique for achieving this goal in C++?
Have a look at boost tokenizer for something that's much better in a C++ context than strtok().
Already covered by a lot of questions is how to tokenize a stream in C++.
Example: How to read a file and get words in C++
But what is harder to find is how get the same functionality as strtok():
Basically strtok() allows you to split the string on a whole bunch of user defined characters, while the C++ stream only allows you to use white space as a separator. Fortunately the definition of white space is defined by the locale so we can modify the locale to treat other characters as space and this will then allow us to tokenize the stream in a more natural fashion.
#include <locale>
#include <string>
#include <sstream>
#include <iostream>
// This is my facet that will treat the ,.- as space characters and thus ignore them.
class WordSplitterFacet: public std::ctype<char>
{
public:
typedef std::ctype<char> base;
typedef base::char_type char_type;
WordSplitterFacet(std::locale const& l)
: base(table)
{
std::ctype<char> const& defaultCType = std::use_facet<std::ctype<char> >(l);
// Copy the default value from the provided locale
static char data[256];
for(int loop = 0;loop < 256;++loop) { data[loop] = loop;}
defaultCType.is(data, data+256, table);
// Modifications to default to include extra space types.
table[','] |= base::space;
table['.'] |= base::space;
table['-'] |= base::space;
}
private:
base::mask table[256];
};
We can then use this facet in a local like this:
std::ctype<char>* wordSplitter(new WordSplitterFacet(std::locale()));
<stream>.imbue(std::locale(std::locale(), wordSplitter));
The next part of your question is how would I store these words in an array. Well, in C++ you would not. You would delegate this functionality to the std::vector/std::string. By reading your code you will see that your code is doing two major things in the same part of the code.
It is managing memory.
It is tokenizing the data.
There is basic principle Separation of Concerns where your code should only try and do one of two things. It should either do resource management (memory management in this case) or it should do business logic (tokenization of the data). By separating these into different parts of the code you make the code more generally easier to use and easier to write. Fortunately in this example all the resource management is already done by the std::vector/std::string thus allowing us to concentrate on the business logic.
As has been shown many times the easy way to tokenize a stream is using operator >> and a string. This will break the stream into words. You can then use iterators to automatically loop across the stream tokenizing the stream.
std::vector<std::string> data;
for(std::istream_iterator<std::string> loop(<stream>); loop != std::istream_iterator<std::string>(); ++loop)
{
// In here loop is an iterator that has tokenized the stream using the
// operator >> (which for std::string reads one space separated word.
data.push_back(*loop);
}
If we combine this with some standard algorithms to simplify the code.
std::copy(std::istream_iterator<std::string>(<stream>), std::istream_iterator<std::string>(), std::back_inserter(data));
Now combining all the above into a single application
int main()
{
// Create the facet.
std::ctype<char>* wordSplitter(new WordSplitterFacet(std::locale()));
// Here I am using a string stream.
// But any stream can be used. Note you must imbue a stream before it is used.
// Otherwise the imbue() will silently fail.
std::stringstream teststr;
teststr.imbue(std::locale(std::locale(), wordSplitter));
// Now that it is imbued we can use it.
// If this was a file stream then you could open it here.
teststr << "This, stri,plop";
cout << "die monster !";
std::vector<std::string> data;
std::copy(std::istream_iterator<std::string>(teststr), std::istream_iterator<std::string>(), std::back_inserter(data));
// Copy the array to cout one word per line
std::copy(data.begin(), data.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
}