Save next word, if a given word is found (C++)

I'm pretty new to C++. I have a text doc that looks like this:
InputFile.txt
...
.
..
.
.
....
TIME/DISTANCE = 500/ 0.1500E+05
..
..
.
...
TIME/DISTANCE = 500/ 1.5400E+02
.
...
...
.
TIME/DISTANCE = 500/ 320.0565
..
..
.
.
...
The one line shown keeps repeating throughout the file. My objective is to save all the numbers after the 500/ into an array/vector/another file/anything. I know how to read a file and get a line:
string line;
vector <string> v1;
ifstream txtfile ("InputFile.txt");
if (txtfile.is_open())
{
while (txtfile.good())
{
while( getline( txtfile, line ) )
{
// ?????
// if(line.find("500/") != string::npos)
// ?????
}
}
txtfile.close();
}
Does anybody have a solution? Or point me in the right direction?
Thanks in advance.
Edit: Both proposed solutions (Jerry's and Galik's) work perfectly. I love this community. :)

This is one of those rare cases where (IMO) it may make sense to use sscanf in C++.
std::string line;
std::vector<double> numbers;
while (std::getline(txtfile, line)) {
double d;
if (1==sscanf(line.c_str(), " TIME/DISTANCE = 500 / %lf", &d))
numbers.push_back(d);
}
This takes each line, and attempts to treat it as having the format you care about. Where that succeeded, the return value from sscanf will be 1 (the number of items converted). Where it fails, the return value will be 0 (i.e., it didn't convert anything successfully). Then we save it if (and only if) there was a successful conversion.
Also note that sscanf is "smart" enough to treat a single space in the format string as matching an arbitrary amount of white-space in the input, so we don't have to try to match the amount of white space precisely.
We could vary this somewhat. If there has to be a number before the '/', but it could be something different from 500, we could replace that part of the format string with %*d. That means sscanf will look for a number (specifically an integer) there, but not assign it to anything. If it finds something other than an integer, conversion will fail, so (for example) TIME/DISTANCE = ABC/1.234 would fail, but TIME/DISTANCE = 234/1.234 would succeed.
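As a rough sketch of that variation (the helper name parseDistance is made up for illustration, and the line layout is assumed to match the sample in the question), any integer before the '/' is accepted and discarded:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: returns true and stores the number after the '/'
// when the line looks like "TIME/DISTANCE = <integer>/ <number>".
bool parseDistance(const std::string& line, double& value) {
    // %*d matches an integer without storing it; %lf stores a double.
    // A space in the format matches any amount of white-space (including none).
    return std::sscanf(line.c_str(), " TIME/DISTANCE = %*d / %lf", &value) == 1;
}
```

Inside the getline loop this would be used as: if (parseDistance(line, d)) numbers.push_back(d);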

When processing each line, you can use line.find() to check that it's the right line and to find your data:
if(line.find("TIME/DISTANCE") != std::string::npos)
{
// this is the correct line
}
Once you have the correct line you can get the position of the data like this:
std::string::size_type pos = line.find("500/");
if(pos != std::string::npos)
{
// pos holds the position of the numbers you want
std::string wanted_numbers = line.substr(pos + 4); // get only the numbers in a string
}
Hope that helps
EDIT: Fixed bug (adding 4 to pos to skip over the "500/" part)
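To go from that substring to an actual number, something along these lines might do. This is only a sketch: the helper name numberAfter and its parameters are assumptions, and it uses a std::istringstream for the conversion so that leading spaces and exponent notation are handled by the stream.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: finds `marker` in `line` and parses the number after it.
// Returns false if the marker is missing or no number follows it.
bool numberAfter(const std::string& line, const std::string& marker, double& value) {
    const std::string::size_type pos = line.find(marker);
    if (pos == std::string::npos)
        return false;
    // Skip over the marker itself; the stream skips leading white-space.
    std::istringstream rest(line.substr(pos + marker.size()));
    return static_cast<bool>(rest >> value);
}
```

For the file in the question, numberAfter(line, "500/", d) would be called for each line read, pushing d into a std::vector<double> whenever it returns true.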

Related

C++ read different kind of datas from file until there's a string beginning with a number

In C++, I'd like to read from an input file which contains different kinds of data: first the name of a contestant (2 or more strings separated by whitespace), then an ID (a string without whitespace, always beginning with a digit), then more strings without whitespace, each followed by a number (the sports and the places achieved).
For example:
Josh Michael Allen 1063Szinyei running 3 swimming 1 jumping 1
Here is the code I started to write before I got stuck:
void ContestEnor::next()
{
string line;
getline(_f , line);
if( !(_end = _f.fail()) ){
istringstream is(line);
is >> _cur.contestant >> _cur.id; // here I don't know how to go on
_cur.counter = 0;
//...
}
}
Thank you for your help in advance.
You should look into using std::getline with a delimiter. This way, you can delimit on a space character and read until you find a string whose first character is a digit. Here is a short code example (this seems rather homework-like, so I don't want to write too much of it for you ;):
std::string temp, id;
while (std::getline(_f, temp, ' ')) {
if (!temp.empty() && temp[0] >= '0' && temp[0] <= '9') {
id = temp;
}
// you would need to add more code for the rest of the data on that line
}
/* close the file, etc. */
This code should be pretty self-explanatory. The most important thing to know is that you can use std::getline to get data up until a delimiter. The delimiter is consumed, just like the default behavior of delimiting on a newline character. Thus, the name getline isn't entirely accurate - you can still get only part of a line if you need to.
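For what it's worth, here is one way the rest of the line could be handled, working on a single line that has already been read. The Contestant struct, its field names and parseContestant are all hypothetical; the question's own _cur type is not shown, so treat this as a sketch rather than a drop-in.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type; the field names are assumptions.
struct Contestant {
    std::string name;                                  // e.g. "Josh Michael Allen"
    std::string id;                                    // first token starting with a digit
    std::vector<std::pair<std::string, int>> results;  // (sport, place) pairs
};

// Parses one line; returns false if no ID token was found.
bool parseContestant(const std::string& line, Contestant& out) {
    std::istringstream is(line);
    std::string token;
    out = Contestant{};
    // Name tokens run until the first token that begins with a digit.
    while (std::getline(is, token, ' ')) {
        if (token.empty())
            continue;  // consecutive spaces yield empty tokens
        if (std::isdigit(static_cast<unsigned char>(token[0]))) {
            out.id = token;
            break;
        }
        if (!out.name.empty())
            out.name += ' ';
        out.name += token;
    }
    if (out.id.empty())
        return false;
    // The rest of the line is (sport, place) pairs; operator>> skips white-space.
    std::string sport;
    int place;
    while (is >> sport >> place)
        out.results.emplace_back(sport, place);
    return true;
}
```

Parsing from a std::istringstream built from one line keeps a malformed record from swallowing the next line of the file.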

Extracting certain integers from string C++

Good day to all,
I am having a hard time trying to extract desired integers from a string. I am given the following to read in from a file:
itemnameitemnumber price percentmarkup
examples
Gowns-u2285 24.22 37%
TwoB1Ask1-m1275 90.4 1%
What I have been trying to do is get the item number separated from the item name so that I can store the item number as a reference for sorting. As you can see the first example itemnameitemnumber is a clear cut character to digit separation, whereas the next example has numbers within its item name.
I have tried several different approaches; however, handling item names that have integers as part of their name is proving to be beyond my experience.
If anyone can help me with this I would be greatly appreciative for their time and knowledge.
Good day,
I don't know if you have a fixed number of digits for itemnumber, but I am going to assume that you don't.
This is a simple approach: first you have to separate the words of your line, for example with std::istringstream.
Once the line is split into words, for example by passing its iterators to a vector, or by reading it with operator>>, scan the first word backwards until you find a character that is not one of "0123456789 " (note the whitespace at the end).
That gives you the position where the trailing digits begin, and you can cut your original string there or, if you have the opportunity, the already split string. Voilà! You have your item name and item number.
For the record, I am going to do this whole thing, using the same technique for the percent markup too, with the exception characters being "% ".
#define VALID_DIGITS "0123456789 "
#define VALID_PERCENTAGE "% "
struct ItemData {
std::string Name;
int Count;
double Price;
double PercentMarkup;
};
int ExtractItemData(std::string Line, ItemData & Output) {
    std::istringstream Stream( Line );
    // Split the line into whitespace-separated words
    std::istream_iterator<std::string> First( Stream ), Last;
    std::vector<std::string> Words( First, Last );
    if (Words.size() < 3) {
        /* somebody gave us a malformed line with less than needed words */
        return -1;
    }
    // Search backwards for the last character that is not a digit (0-9) or a whitespace
    std::size_t LastNonDigit = Words[0].find_last_not_of( VALID_DIGITS );
    if (LastNonDigit == std::string::npos || LastNonDigit + 1 == Words[0].length()) {
        /* error; the first word is all digits, or does not end in digits */
        return -2;
    }
    // The digits start right after the last non-digit character
    std::size_t StartOfDigits = LastNonDigit + 1;
    // Separate the string into 2 parts
    Output.Name = Words[0].substr(0, StartOfDigits); // Get the first part
    Output.Count = std::stoi( Words[0].substr(StartOfDigits) );
    Output.Price = std::stod( Words[1] );
    // Search backwards for the last character that is not '%' or ' '
    std::size_t EndOfNumber = Words[2].find_last_not_of(VALID_PERCENTAGE);
    if (EndOfNumber == std::string::npos) {
        /* error; the markup contains no number */
        return -3;
    }
    Output.PercentMarkup = std::stod( Words[2].substr(0, EndOfNumber + 1) );
    return 0;
}
The code requires the headers sstream, iterator, vector and string, plus cstddef if std::size_t is not already defined.
Hope the answer was useful.
Best of luck, COlda.
PS.: My first answer on stack overflow ^^;
You can iterate over the string, pushing the digits to a vector, then use a stringstream to convert them to an integer.
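A minimal sketch of that idea, assuming the item number is the run of digits at the very end of the first word (the function name trailingNumber is made up for illustration):

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects the digits at the end of `word` and converts
// them to an int. Returns false if the word does not end in a digit.
bool trailingNumber(const std::string& word, int& number) {
    std::vector<char> digits;
    // Walk backwards, collecting digits until the first non-digit.
    for (auto it = word.rbegin();
         it != word.rend() && std::isdigit(static_cast<unsigned char>(*it)); ++it)
        digits.push_back(*it);
    if (digits.empty())
        return false;
    // The digits were collected back to front, so restore their order.
    std::reverse(digits.begin(), digits.end());
    const std::string text(digits.begin(), digits.end());
    std::istringstream in(text);
    return static_cast<bool>(in >> number);
}
```

Because the scan starts from the end, digits inside the name (as in TwoB1Ask1-m1275) are left alone.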

Pull out data from a file and store it in strings in C++

I have a file which contains records of students in the following format.
Umar|Ejaz|12345|umar#umar.com
Majid|Hussain|12345|majid#majid.com
Ali|Akbar|12345|ali#geeks-inn.com
Mahtab|Maqsood|12345|mahtab#myself.com
Juanid|Asghar|12345|junaid#junaid.com
The data has been stored according to the following format:
firstName|lastName|contactNumber|email
The total number of lines(records) can not exceed the limit 100. In my program, I've defined the following string variables.
#define MAX_SIZE 100
// other code
string firstName[MAX_SIZE];
string lastName[MAX_SIZE];
string contactNumber[MAX_SIZE];
string email[MAX_SIZE];
Now, I want to pull data from the file, and using the delimiter '|', I want to put data in the corresponding strings. I'm using the following strategy to put back data into string variables.
ifstream readFromFile;
readFromFile.open("output.txt");
// other code
int x = 0;
string temp;
while(getline(readFromFile, temp)) {
int charPosition = 0;
while(temp[charPosition] != '|') {
firstName[x] += temp[charPosition];
charPosition++;
}
while(temp[charPosition] != '|') {
lastName[x] += temp[charPosition];
charPosition++;
}
while(temp[charPosition] != '|') {
contactNumber[x] += temp[charPosition];
charPosition++;
}
while(temp[charPosition] != endl) {
email[x] += temp[charPosition];
charPosition++;
}
x++;
}
Is it necessary to attach a null character '\0' at the end of each string? If I don't, will it cause problems when I actually use those string variables in my program? I'm new to C++, and this is the solution I've come up with. If anybody has a better technique, it is certainly welcome.
Edit: Also I can't compare a char(acter) with endl, how can I?
Edit: The code that I've written isn't working. It gives me following error.
Segmentation fault (core dumped)
Note: I can only use .txt file. A .csv file can't be used.
There are many techniques to do this. I suggest searching Stack Overflow for "[C++] read file" to see some more methods.
Find and Substring
You could use the std::string::find method to find the delimiter and then use std::string::substr to return a substring between the position and the delimiter.
std::string::size_type position = 0;
position = temp.find('|');
if (position != std::string::npos)
{
firstName[x] = temp.substr(0, position);
}
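Extending the same find/substr idea to every field on the line could look roughly like this. The function name splitFields is an assumption, and empty fields are kept as empty strings:

```cpp
#include <string>
#include <vector>

// Splits `line` on `delim` using find and substr; empty fields are kept.
std::vector<std::string> splitFields(const std::string& line, char delim) {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    std::string::size_type pos;
    while ((pos = line.find(delim, start)) != std::string::npos) {
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;  // continue after the delimiter
    }
    fields.push_back(line.substr(start));  // last field has no trailing delimiter
    return fields;
}
```

With the records in the question, splitFields(temp, '|') should give four entries, which can then be assigned to firstName[x], lastName[x], contactNumber[x] and email[x] after checking the size.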
If you don't terminate a C-style string with a null character there is no way to determine where the string ends. Thus, you'll need to terminate the strings.
I would personally read the data into std::string objects:
std::string first, last, etc;
while (std::getline(readFromFile, first, '|')
&& std::getline(readFromFile, last, '|')
&& std::getline(readFromFile, etc)) {
// do something with the input
}
std::endl is a manipulator implemented as a function template. You can't compare a char with that. There is also hardly ever a reason to use std::endl because it flushes the stream after adding a newline which makes writing really slow. You probably meant to compare to a newline character, i.e., to '\n'. However, since you read the string with std::getline() the line break character will already be removed! You need to make sure you don't access more than temp.size() characters otherwise.
Your record also contains arrays of strings rather than arrays of characters, and you append individual chars to them. You either wanted to use char something[SIZE] or to store strings!

copying after a line has been found from a file from that position till the end of that file in c++

I have a file which holds protein coordinates as well as other information preceding it. My aim is to look for a certain line called "$PARAMETERS" and then copy from there every line succeeding it till the end of the file.
How can I get that done? This is the small piece of code I wrote as part of the larger program (which someone else wrote years ago, and which I took over to upgrade for my research):
ifstream InFile;
InFile.open (DC_InFile.c_str(), ios::in);
while ( not InFile.eof() )
{
Line = NextLine (&InFile);
if (Line.find ("#") == 0) continue; // skip lines starting with # (comments)
if (Line.length() == 0) continue; // skip empty lines
size_t pos = Line.find("$PARAMETERS");
Line.copy(Line.begin("$PARAMETERS")+pos, Line.end("$END"));
&Line.copy >> x_1 >> y_2 >> z_3;
}
Bearing in mind that I defined Line as string
I guess you need to read data between $PARAMETERS and $END, not from $PARAMETERS until end of file. If so, you can use the following code:
string str;
while (getline(InFile, str))
{
if (str.find("#") == 0)
continue;
if (str.length() == 0)
continue;
if (str.find("$PARAMETERS") == 0)
{
double x_1, y_2, z_3; // you want to read numbers, i guess
while (getline(InFile, str))
{
if (str.find("$END") == 0)
break;
stringstream stream(str);
if (stream >> x_1 >> y_2 >> z_3)
{
// Do whatever you want with x_1, y_2 and z_3
}
}
}
}
This will handle multiple sections of data; not sure if you really want this behavior.
For example:
# comment
$PARAMETERS
1 2 3
4 5 6
$END
#unrelated data
100 200 300
$PARAMETERS
7 8 9
10 11 12
$END
I'm not sure what you want on the first line of the copied file but assuming you get that straight and you haven't read beyond the current line, you can copy the tail of the file you are reading like this:
out << InFile.rdbuf();
Here out is the std::ostream you want to send the data to.
Note, that you should not use InFile.eof() to determine whether there is more data! Instead, you should read what you want to read and then check that the read was successful. You need to check after reading because the stream cannot know what you are trying to read before you have done so.
Following up on Dietmar's answer: it sounds to me like you
should be using std::getline until you find a line which
matches your pattern. If you want that line as part of your
output, then output it, then use Dietmar's solution to copy the
rest of the file. Something like:
while ( std::getline( in, line ) && ! isStartLine( line ) ) {
}
if ( in ) { // Since you might not have found the line
out << line << '\n'; // If you want the matching line
// You can also edit it here.
out << in.rdbuf();
}
And don't put all sorts of complicated parsing information,
with continue and break, in the loop. The results are both
unreadable and unmaintainable. Factor it out into a simple
function, as above: you'll also have a better chance of getting
it right. (In your case, should you match "$PARAMETERS #
xxx", or not?) In a separate function, it's much easier to get
it right.

Cleaning a string of punctuation in C++

Ok so before I even ask my question I want to make one thing clear. I am currently a student at NIU for Computer Science and this does relate to one of my assignments for a class there. So if anyone has a problem read no further and just go on about your business.
Now for anyone who is willing to help, here's the situation. For my current assignment we have to read a file that is just a block of text. For each word in the file we are to clear any punctuation in the word (e.g. "can't" would end up as "can" and "that--to" would end up as "that", without the quotes; they are only there to mark the examples).
The problem I've run into is that I can clean the string fine and then insert it into the map that we are using but for some reason with the code I have written it is allowing an empty string to be inserted into the map. Now I've tried everything that I can come up with to stop this from happening and the only thing I've come up with is to use the erase method within the map structure itself.
So what I am looking for is two things: any suggestions about how I could a) fix this without simply erasing it and b) improve the code I have already written.
Here are the functions I have written to read in from the file and then the one that cleans it.
Note: the function that reads in from the file calls the clean_entry function to get rid of punctuation before anything is inserted into the map.
Edit: Thank you Chris. Numbers are allowed :). If anyone has any improvements to the code I've written or any criticisms of something I did I'll listen. At school we really don't get feed back on the correct, proper, or most efficient way to do things.
int get_words(map<string, int>& mapz)
{
int cnt = 0; //set out counter to zero
map<string, int>::const_iterator mapzIter;
ifstream input; //declare instream
input.open( "prog2.d" ); //open instream
assert( input ); //assure it is open
string s; //temp strings to read into
string not_s;
input >> s;
while(!input.eof()) //read in until EOF
{
not_s = "";
clean_entry(s, not_s);
if((int)not_s.length() == 0)
{
input >> s;
clean_entry(s, not_s);
}
mapz[not_s]++; //increment occurence
input >>s;
}
input.close(); //close instream
for(mapzIter = mapz.begin(); mapzIter != mapz.end(); mapzIter++)
cnt = cnt + mapzIter->second;
return cnt; //return number of words in instream
}
void clean_entry(const string& non_clean, string& clean)
{
int i, j, begin, end;
for(i = 0; isalnum(non_clean[i]) == 0 && non_clean[i] != '\0'; i++);
begin = i;
if(begin ==(int)non_clean.length())
return;
for(j = begin; isalnum(non_clean[j]) != 0 && non_clean[j] != '\0'; j++);
end = j;
clean = non_clean.substr(begin, (end-begin));
for(i = 0; i < (int)clean.size(); i++)
clean[i] = tolower(clean[i]);
}
The problem with empty entries is in your while loop. If you get an empty string, you clean the next one, and add it without checking. Try changing:
not_s = "";
clean_entry(s, not_s);
if((int)not_s.length() == 0)
{
input >> s;
clean_entry(s, not_s);
}
mapz[not_s]++; //increment occurence
input >>s;
to
not_s = "";
clean_entry(s, not_s);
if((int)not_s.length() > 0)
{
mapz[not_s]++; //increment occurence
}
input >>s;
EDIT: I notice you are checking if the characters are alphanumeric. If numbers are not allowed, you may need to revisit that area as well.
Further improvements would be to
declare variables only when you use them, and in the innermost scope
use c++-style casts instead of the c-style (int) casts
use empty() instead of length() == 0 comparisons
use the prefix increment operator for the iterators (i.e. ++mapzIter)
A blank string is a valid instance of the string class, so there's nothing special about adding it into the map. What you could do is first check whether it's empty, and only increment if it isn't:
if (!not_s.empty())
mapz[not_s]++;
Style-wise, there's a few things I'd change, one would be to return clean from clean_entry instead of modifying it:
string not_s = clean_entry(s);
...
string clean_entry(const string &non_clean)
{
string clean;
... // as before
if(begin ==(int)non_clean.length())
return clean;
... // as before
return clean;
}
This makes it clearer what the function is doing (taking a string, and returning something based on that string).
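Filling in the elided parts, a complete version might look like this. It is a sketch that keeps the original behaviour (first alphanumeric run, lowercased) but swaps the '\0' checks for bounds checks against size() and casts to unsigned char before calling isalnum/tolower:

```cpp
#include <cctype>
#include <string>

// Returns the first run of alphanumeric characters in `non_clean`, lowercased,
// or an empty string if there is none.
std::string clean_entry(const std::string& non_clean) {
    std::string::size_type begin = 0;
    // Skip leading non-alphanumeric characters.
    while (begin < non_clean.size()
           && !std::isalnum(static_cast<unsigned char>(non_clean[begin])))
        ++begin;
    // Extend over the alphanumeric run that starts at `begin`.
    std::string::size_type end = begin;
    while (end < non_clean.size()
           && std::isalnum(static_cast<unsigned char>(non_clean[end])))
        ++end;
    std::string clean = non_clean.substr(begin, end - begin);
    for (char& c : clean)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return clean;
}
```

The caller then only needs string not_s = clean_entry(s); followed by the empty() check before touching the map.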
The function 'get_words' is doing a lot of distinct actions that could be split out into other functions. There's a good chance that by splitting it up into its individual parts, you would have found the bug yourself.
From the basic structure, I think you could split the code into (at least):
getNextWord: Return the next (non blank) word from the stream (returns false if none left)
clean_entry: What you have now
getNextCleanWord: Calls getNextWord, and if 'true' calls clean_entry. Returns 'false' if no words left.
The signatures of 'getNextWord' and 'getNextCleanWord' might look something like:
bool getNextWord (std::ifstream & input, std::string & str);
bool getNextCleanWord (std::ifstream & input, std::string & str);
The idea is that each function does a smaller more distinct part of the problem. For example, 'getNextWord' does nothing but get the next non blank word (if there is one). This smaller piece therefore becomes an easier part of the problem to solve and debug if necessary.
The main component of 'get_words' then can be simplified down to:
std::string nextCleanWord;
while (getNextCleanWord (input, nextCleanWord))
{
++mapz[nextCleanWord];
}
An important aspect to development, IMHO, is to try to Divide and Conquer the problem. Split it up into the individual tasks that need to take place. These sub-tasks will be easier to complete and should also be easier to maintain.