Split a line read from a file in C++ - c++

How do I access individual elements of a line read from a file?
I used the following to read a line from a file:
getline(infile, data) // where infile is an object of ifstream and data is the variable where the line will be stored
The following line is stored in data : "The quick brown fox jumped over the lazy dog"
How do I access particular elements of the line now? What if I want to play around with the second element ( quick ) of the line or get hold of a certain word in the line? How do I select it?
Any help will be appreciated

Here data is a std::string holding "The quick brown fox jumped over the lazy dog" and the delimiter is a single space, " ". You can use std::string::find() to find the position of the delimiter and std::string::substr() to get a token:
std::string data = "The quick brown fox jumped over the lazy dog";
std::string delimiter = " ";
std::string token = data.substr(0, data.find(delimiter)); // token is "The"
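The snippet above extracts only the first token. To reach the second word, or any other, the same find()/substr() pair can be applied in a loop. A minimal sketch, assuming a non-empty delimiter; the helper name split_on is hypothetical and not part of the standard library:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `text` on every occurrence of `delimiter`
// using only std::string::find() and std::string::substr().
std::vector<std::string> split_on(const std::string& text,
                                  const std::string& delimiter)
{
    std::vector<std::string> tokens;
    if (delimiter.empty()) {               // an empty delimiter would never advance
        tokens.push_back(text);
        return tokens;
    }
    std::size_t start = 0;
    std::size_t end = text.find(delimiter, start);
    while (end != std::string::npos) {
        tokens.push_back(text.substr(start, end - start));
        start = end + delimiter.size();    // skip past the delimiter
        end = text.find(delimiter, start);
    }
    tokens.push_back(text.substr(start));  // remainder after the last delimiter
    return tokens;
}
```

With the data from the question, split_on(data, " ")[1] would be the second word, quick. Note that two delimiters in a row produce an empty token in this sketch.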

Since your text is space separated, you can use std::istringstream to separate the words.
std::vector<std::string> words;
const std::string data = "The quick brown fox jumped over the lazy dog";
std::string w;
std::istringstream text_stream(data);
while (text_stream >> w)
{
words.push_back(w);
std::cout << w << "\n";
}
The operator>> skips any leading whitespace and then reads characters into the string until the next whitespace character is found.
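If only one particular word is needed, the same stream extraction can stop as soon as that word is reached. A small sketch under that assumption; nth_word is a hypothetical helper name, and the index is zero-based:

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: return the word at zero-based position `index`,
// or an empty string if the line has fewer words than that.
std::string nth_word(const std::string& line, std::size_t index)
{
    std::istringstream text_stream(line);
    std::string word;
    for (std::size_t i = 0; text_stream >> word; ++i) {
        if (i == index) {
            return word;
        }
    }
    return std::string();
}
```

For the line in the question, nth_word(data, 1) would give quick. Collecting the words into a std::vector, as in the answer above, is the better fit when several words from the same line are needed.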

Related

regexp, how to match longer strings first?

There must be an option or flag for this that I missed in MATLAB:
I want to match regular expressions against a given string, but multiple matches are possible. I want the longer alternatives to be matched first, before the shorter ones.
How can this be achieved?
regexpi('A quick brown fox jumps over the lazy dog.','quick|the|a','match','once')
%returns 'A', would like it to return 'quick'
Maybe you can try the following code
% match all possible key words, don't use argument 'once' in `regexpi()`
v = regexpi('A quick brown fox jumps over the lazy dog.','quick|the|a','match');
% calculate the lengths of matched words
lens = cellfun(@length,v);
% output the longest word
v{lens == max(lens)}
such that
ans = quick
You could do
>> regexpi('A quick brown fox jumps over the lazy dog.','.*?(quick)|.*?(the)|.*?(a)','tokens','once')
ans =
1×1 cell array
{'quick'}
but that's pretty ugly. Another solution, which is a smidge less ugly, is
>> str = "A quick brown fox jumps over the lazy dog.";
>> list = ["quick" "the" "a"];
>> list(find(arrayfun(@(x)contains(str,x), list), 1))
ans =
"quick"
I think I like Thomas' solution the most.

How does split and strip function work together in python?

headers = table.by_tag('th')
labels = [str(t.content).split('(')[0].strip() for t in headers[3:-1]]
I know what split() and strip() do. But what does split('(')[0] mean? headers holds the header cells of a table.
For example, suppose the HTML was:
<table>
<tr><th>Jerry Brown (D)</th><th>Meg Whitman(D)</th></tr>
<tr><td>1</td><td>4</td></tr>
<tr><td>2</td><td>1</td></tr>
<tr><td>3</td><td>2</td></tr>
</table>
The headers can be extracted, for example with BeautifulSoup,
and the result is a list containing the following:
["<th>Jerry Brown (D)</th>", "<th>Meg Whitman(D)</th>"]
so t.content is Jerry Brown (D) for the first header and Meg Whitman(D) for the second:
"Jerry Brown (D)".split('(') = ["Jerry Brown ", "D)"]
"Meg Whitman(D)".split('(') = ["Meg Whitman", "D)"]
["Jerry Brown ", "D)"][0] = "Jerry Brown "
["Meg Whitman", "D)"][0] = "Meg Whitman"
strip() then removes whitespace from both ends of each string, so
labels is ["Jerry Brown", "Meg Whitman"]

Matching a whole string with case insensitive

I'm looking for a function, if one is available, that matches a whole word. For example:
std::string str1 = "I'm using firefox browser";
std::string str2 = "The quick brown fox.";
std::string str3 = "The quick brown fox jumps over the lazy dog.";
Only str2 and str3 should match the word fox. It should not matter whether there is a symbol such as a period (.) or a comma (,) before or after the word, and the search also has to be case-insensitive.
I've found many ways to do a case-insensitive string search, but I would like to know how to match a whole word.
I would recommend std::regex from C++11, but it does not work yet with g++ 4.8, so I recommend boost::regex as a replacement.
#include<iostream>
#include<string>
#include<vector>
#include<boost/regex.hpp>
int main()
{
std::vector <std::string> strs = {"I'm using firefox browser",
"The quick brown fox.",
"The quick brown Fox jumps over the lazy dog."};
for( auto s : strs ) {
std::cout << "\n s: " << s << '\n';
if( boost::regex_search( s, boost::regex("\\<fox\\>", boost::regex::icase))) {
std::cout << "\n Match: " << s << '\n';
}
}
return 0;
}
/*
Local Variables:
compile-command: "g++ --std=c++11 test.cc -lboost_regex -o ./test.exe && ./test.exe"
End:
*/
The output is:
s: I'm using firefox browser
s: The quick brown fox.
Match: The quick brown fox.
s: The quick brown Fox jumps over the lazy dog.
Match: The quick brown Fox jumps over the lazy dog.
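On compilers whose standard library has a complete <regex> implementation (GCC 4.9 and later, for example), roughly the same check can be written without Boost. A minimal sketch, assuming the default ECMAScript grammar, where \b marks a word boundary; contains_word is a hypothetical helper name, and the word is assumed to contain no regex metacharacters:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `word` occurs in `text` as a whole word,
// ignoring case. Assumes `word` contains no regex metacharacters.
bool contains_word(const std::string& text, const std::string& word)
{
    const std::regex pattern(
        "\\b" + word + "\\b",
        std::regex_constants::ECMAScript | std::regex_constants::icase);
    return std::regex_search(text, pattern);
}
```

With the three strings above, contains_word(s, "fox") would be false for the firefox sentence and true for the other two, because punctuation such as a period counts as a word boundary.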

read and get value btw whitespace and another character

How do I get the first name? Here is a sample of the data; the first names here are Owen and Florencio. I need to read the value between the whitespace and the ';'.
Owen;Grzegorek;Howard Miller Co;15410 Minnetonka Industrial Rd;Minnetonka;Hennepin;MN;55345;952-939-2973;952-939-4663;owen@grzegorek.com;http://www.owengrzegorek.com
Florencio;Hollberg;Hellenic Museum & Cultural Ctr;2211 Kenmere Ave;Burbank;Los
Use string::find to find the first instance of semi-colon, and do a string::substr.
std::string str = "Florencio;Hollberg;Hellenic Museum & Cultural Ctr;2211 Kenmere Ave;Burbank;Los";
std::size_t pos = str.find(";"); // position of the first semicolon
str = str.substr(0, pos);        // keep everything before it
std::cout << str << std::endl;
Output:
Florencio
Of course, you have to modify the code to suit your needs.
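If the other fields of a record are needed as well, one option is std::getline with ';' as the delimiter, reading from a std::istringstream. A sketch under that assumption; split_fields is a hypothetical helper name:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split one record into its separator-delimited fields.
std::vector<std::string> split_fields(const std::string& record,
                                      char separator = ';')
{
    std::vector<std::string> fields;
    std::istringstream record_stream(record);
    std::string field;
    // std::getline stops at each separator and discards it.
    while (std::getline(record_stream, field, separator)) {
        fields.push_back(field);
    }
    return fields;
}
```

For the sample record, split_fields(str)[0] would be the first name and split_fields(str)[1] the last name. In this sketch an empty field after a trailing separator is dropped.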

why is my std::string being cut off?

I initialize a string as follows:
std::string myString = "'The quick brown fox jumps over the lazy dog' is an English-language pangram (a phrase that contains all of the letters of the alphabet)";
and the myString ends up being cut off like this:
'The quick brown fox jumps over the
lazy dog' is an English-language
pangram (a phrase that contains
Where can I set the size limit?
I tried the following without success:
std::string myString;
myString.resize(300);
myString = "'The quick brown fox jumps over the lazy dog' is an English-language pangram (a phrase that contains all of the letters of the alphabet)";
Many thanks!
Of course it was just the debugger cutting it off (xcode). I'm just getting started with xcode/c++, so thanks a lot for the quick replies.
Are you sure?
kkekan> ./a.out
'The quick brown fox jumps over the lazy dog' is an English-language pangram (a phrase that contains all of the letters of the alphabet)
There is no good reason why this should have happened!
Try the following (in debug mode):
assert(!"Congratulations, I am in debug mode! Let's do a test now..."); // always fails when assertions are enabled
std::string myString = "'The quick brown fox jumps over the lazy dog' is an English-language pangram (a phrase that contains all of the letters of the alphabet)";
assert(myString.size() > 120);
Does the (second) assertion fail?
When printing or displaying text, the output machinery buffers the output. You can tell it to flush the buffers (display all remaining text) by using std::endl or by calling the flush() method. Writing '\n' on its own is not guaranteed to flush, although output to a terminal is often line-buffered, in which case a newline makes the text appear:
#include <iostream>
#include <string>
using std::cout;
using std::endl;
int main(void)
{
std::string myString =
"'The quick brown fox jumps over the lazy dog'" // Compiler concatenates
" is an English-language pangram (a phrase" // these contiguous text
" that contains all of the letters of the" // literals automatically.
" alphabet)";
// Method 1: use '\n'
// A newline by itself does not force a flush; it only triggers one
// when the stream is line-buffered (often the case for terminals).
cout << myString << '\n';
// Method 2: use std::endl;
// The std::endl sends '\n' to the output, then flushes the buffer.
cout << myString << endl;
// Method 3: use flush() method
cout << myString;
cout.flush();
return 0;
}
For more information about buffers, search Stack Overflow for "C++ output buffer".