Hi I have a string like this:
word1--tab--word2--tab--word3--tab--word4--tab--word5--tab--word6
I need to extract the third word from the string. I thought of reading character by character and getting the word after reading the second tab. But I guess it is inefficient. Can you show me a more specific way please?
std::string has a find method that returns an index. You can call
find("--", lastFoundIndex + 1)
repeatedly until you reach the separator just before your word (that gives the start index), once more for the end index, and then use substr.
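A minimal sketch of the find/substr idea, assuming the fields are separated by a single tab character ('\t') rather than by the literal text shown in the question. Under that assumption the third word starts just past the second tab. The helper name nth_field is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the n-th (0-based) field of `s`, where fields
// are separated by the single character `sep`. Returns an empty string if
// the line has fewer than n + 1 fields.
std::string nth_field(const std::string& s, std::size_t n, char sep = '\t') {
    std::size_t start = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::size_t pos = s.find(sep, start);
        if (pos == std::string::npos) return "";  // not enough fields
        start = pos + 1;                          // step past the separator
    }
    std::size_t end = s.find(sep, start);         // npos means "to the end"
    return s.substr(start, end == std::string::npos ? end : end - start);
}
```

Calling nth_field(line, 2) would give the third word. With a multi-character separator the same loop applies, but each step has to advance by the separator's length instead of 1.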
Assuming "tab" is \t:
std::istringstream str(".....");
std::string temp, word;
str >> temp >> temp >> word;
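Note that operator>> splits on any whitespace, so the snippet above only works when the words themselves contain no spaces. If fields may contain spaces, a sketch using std::getline with '\t' as the delimiter could look like this (the function name third_field is hypothetical):

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns the third tab-separated field of `line`,
// or an empty string if the line has fewer than three fields.
std::string third_field(const std::string& line) {
    std::istringstream in(line);
    std::string field;
    for (int i = 0; i < 3; ++i) {
        // Each call reads up to (and discards) the next tab.
        if (!std::getline(in, field, '\t')) return "";
    }
    return field;
}
```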
Related
I am reading from a file in C++, and I want to remove everything but the first word and store that word:
sentence = sentence.substr(sentence.find_first_of(" \t") + 1);
This code removes the first word and keeps the rest of the sentence. Is there a way to store the removed word?
https://en.cppreference.com/w/cpp/string/basic_string/find_first_of
Take the position of the first match from find_first_of, then take the substring from the start of the sentence up to that position:
std::string w1 = sentence.substr(0, sentence.find_first_of(" \t"));
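If both pieces are needed, a small sketch that returns the first word together with the remainder might look like the following. The helper name split_first_word is hypothetical, and it assumes the sentence has no leading whitespace.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: splits `sentence` at the first space or tab and
// returns {first word, remainder}. If there is no whitespace at all, the
// whole input is the first word and the remainder is empty.
std::pair<std::string, std::string> split_first_word(const std::string& sentence) {
    std::size_t pos = sentence.find_first_of(" \t");
    if (pos == std::string::npos) {
        return std::make_pair(sentence, std::string());
    }
    return std::make_pair(sentence.substr(0, pos), sentence.substr(pos + 1));
}
```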
I'm trying to find a certain word in a string, but find that word alone. For example, if I had a word bank:
789540132143
93
3
5434
I only want a match to be found for the value 3, as the other values do not match exactly. I used the normal string::find function, but that found matches for all four values in the word bank because they all contain 3.
There is no whitespace surrounding the values, and I am not allowed to use Regex. I'm looking for the fastest implementation of completing this task.
If you want to count the words, you should use a string-to-int map. Read a word from your file into a string using >>, then increment the map entry accordingly:
string word;
map<string,int> count;
ifstream input("file.txt");
while (input >> word) { // test the read itself, so a failed read at end of file is not counted
    count[word]++;
}
Using >> has the benefit that you don't have to worry about whitespace.
It all depends on the definition of a word: is it a string separated from others by whitespace? Or are other word separators (e.g. comma, dot, semicolon, colon, parentheses...) relevant as well?
How to parse for words without regex:
Here is an acceptable approach using find() and its variant find_first_of():
string myline;                       // line to be parsed
string what = "3";                   // string to be found
string separator = " \t\n,;.:()[]";  // word separators
while (getline(cin, myline)) {
    size_t nxt = 0;
    while ((nxt = myline.find(what, nxt)) != string::npos) {  // search occurrences of what
        if (nxt == 0 || separator.find(myline[nxt-1]) != string::npos) {  // if at the beginning of a word
            size_t nsep = myline.find_first_of(separator, nxt+1);  // check whether it runs to the end of the word
            if ((nsep == string::npos && myline.length()-nxt == what.length()) || nsep-nxt == what.length()) {
                cout << "Line: " << myline << endl;  // bingo!
                cout << "from pos " << nxt << " to " << nsep << endl;
            }
        }
        nxt++;  // ready for next occurrence
    }
}
The principle is to check whether each occurrence found corresponds to a whole word, i.e. it starts at the beginning of the line or of a word (the previous char is a separator) and it runs until the next separator (or the end of the line).
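The same boundary check can be packaged as a function. This is only a sketch: the name contains_word and the default separator set are assumptions, and it tests the character right after the match instead of searching for the next separator.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if `what` occurs in `line` as a whole word,
// i.e. bounded on both sides by a separator or by the ends of the line.
bool contains_word(const std::string& line, const std::string& what,
                   const std::string& separators = " \t\n,;.:()[]") {
    if (what.empty()) return false;
    std::size_t pos = 0;
    while ((pos = line.find(what, pos)) != std::string::npos) {
        std::size_t end = pos + what.length();
        bool starts_word = pos == 0 ||
            separators.find(line[pos - 1]) != std::string::npos;
        bool ends_word = end == line.length() ||
            separators.find(line[end]) != std::string::npos;
        if (starts_word && ends_word) return true;
        ++pos;  // partial match only: keep looking further along the line
    }
    return false;
}
```

For the word bank in the question, only the line consisting of 3 alone would match.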
How to solve your real problem:
You can have the fastest word search function, but if you use it to solve your problem of counting words, as you explained in your comment, you'll waste a lot of effort!
The best way to achieve this would certainly be to use a map<string, int> to store and update a counter for each string encountered in the file.
You then just have to parse each line into words (you could use find_first_of() as suggested above) and use the map:
mymap[word]++;
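A rough sketch of that parsing step, splitting one line on a set of separators and updating the map in a single pass. The function name count_words and the default separator set are assumptions for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: splits `line` on any character in `separators`
// and increments the counter of every non-empty word found.
void count_words(const std::string& line, std::map<std::string, int>& counts,
                 const std::string& separators = " \t\n,;.:()[]") {
    std::size_t start = line.find_first_not_of(separators);
    while (start != std::string::npos) {
        std::size_t end = line.find_first_of(separators, start);
        // When end is npos, substr takes everything up to the end of the line.
        counts[line.substr(start, end == std::string::npos ? end : end - start)]++;
        start = line.find_first_not_of(separators, end);
    }
}
```

Calling it once per line read with getline leaves the map holding the count for every distinct word, so no separate search per word is needed.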
I have a text file that resembles
1 \t words words words words
2 \t words words words words
where the # is the line #, followed by a tab, then followed by random words
I need to read in the int, store it, then skip the \t, and read in each word individually while keeping track of that word's position.
I was hoping I could do it with getline(file, word, ' ') and a counter, but that grabs my first word as 1 \t words.
Any help or suggestions would be much appreciated.
Use stringstream and getline:
getline(file, line);
std::stringstream ssline(line);
int num;
ssline >> num;
std::string word;
while(ssline >> word){
// do whatever you want.
}
Assuming that you can't say how many words there are on a line, then the simple answer is to do it in two steps.
1. Read a whole line into a string with getline.
2. Place that string into a std::istringstream and read the line number and the following words from the std::istringstream.
3. Then repeat from step 1.
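The two steps can be sketched as follows, with the word position tracked by a simple counter. The struct ParsedLine and the function parse_line are hypothetical names, and positions are assumed to be 0-based.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: the leading line number plus each word paired with
// its 0-based position on that line.
struct ParsedLine {
    int number = 0;
    std::vector<std::pair<int, std::string>> words;
};

// Parses one line of the form "<int>\t<word> <word> ...".
// Returns false if the line does not start with an integer.
bool parse_line(const std::string& line, ParsedLine& out) {
    std::istringstream in(line);
    if (!(in >> out.number)) return false;
    out.words.clear();
    std::string word;
    int position = 0;
    while (in >> word) {  // >> skips the tab and the spaces between words
        out.words.emplace_back(position++, word);
    }
    return true;
}
```

In the reading loop this would be called once per line, for example inside while (std::getline(file, line)), which covers the "repeat from step 1" part.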
std::istream::ignore discards characters until one compares equal to delim. Is there an alternative that works on strings rather than chars, i.e. one that discards strings until one compares equal to the specified string?
The easiest way would be to continuously extract a string until you find the right one:
std::istringstream iss;
std::string str;
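std::string pattern = "find_me"; // must be a single token: >> never yields a string containing whitespace
while ( iss >> str && str != pattern ) ;
if (!iss) { /* Pattern not found, or a read error occurred */ }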
This assumes that the strings are delimited with whitespace characters, of course.
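Wrapped up as a reusable function, the loop might look like this sketch. The name skip_until is hypothetical (it is not part of the standard library), and the pattern is assumed to be a single whitespace-free token.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: consumes whitespace-delimited tokens from `in` up to
// and including the first one equal to `pattern`. Returns true if the
// pattern was found, false if the stream ended or failed first.
bool skip_until(std::istream& in, const std::string& pattern) {
    std::string token;
    while (in >> token) {
        if (token == pattern) return true;
    }
    return false;
}
```

After a successful call, the next extraction from the stream continues with the token that follows the pattern.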
I have for example
string s = " abc edef";
I create istringstream with this string.
Is there any way to get only "abc" and "edef" from getline? Because right now I also get the empty strings between consecutive spaces :/
Use the >> operator to get only whitespace-delimited "words".
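A short sketch of that suggestion: operator>> skips leading whitespace before each read, so runs of spaces never produce empty strings. The function name split_words is hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Collects the whitespace-delimited words of `s`. Consecutive spaces are
// skipped by >>, so the result never contains empty strings.
std::vector<std::string> split_words(const std::string& s) {
    std::istringstream in(s);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}
```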