Here's my task and below is most of the code I already wrote:
Develop the program so that it finds and extracts specified items from the xmlstring file using start and end tags. Now we find and extract and display first the location information and then the temperature information.
Location can be found between the tags <location> and </location>. The temperature is between the tags <temp_c> and </temp_c>.
To make it easy to find whatever information from the xml-string, write a function that takes the xml-string and the "inner" text (same for start tag and end tag) of the tags as parameters and returns the text from between the start tag and end tags. If either start or end tag is not found the function must return "not found".
Note that when you search for the tag you must search for the whole tag (including angle brackets) not just the tag name that was given as parameter.
For example, if you wanted to find the location
location = find_field(page, "location");
and to get the temperature you could call it as follows:
temperature = find_field(page, "temp_c");
MY CODE:
#pragma warning (disable:4996)
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
string find_field(const string& xml, string tag_name);
int main() {
string page, line, location, temperature;
ifstream inputFile("weather.xml");
while (getline(inputFile, line)) {
page.append(line);
line.erase();
}
location = find_field(page, "location");
temperature = find_field(page, "<temp_c>");
cout << "Location: " << location << endl;
cout << "Temperature: " << temperature << endl;
}
string find_field(const string& xml, string tag_name)
{
string start_tag = "<" + tag_name + ">";
string end_tag = "<" + tag_name + ">";
return "not found";
}
SPECIFIC QUESTION
When I run the program it says:
Location: not found
Temperature: not found
Just not found. But it doesn't show the data that is in the file. How can I fix it? Thanks
I will not solve this completely for you, because I assume this is a task for you to actually learn, but I will give you some guidance.
In C++ you work on strings with iterators or offsets (positions). Since you are starting, I would suggest to first get familiar with offsets.
Basically, you need to search for the positions of start_tag and end_tag in your xml string and then return what is in between. The std::string class has a find method (http://www.cplusplus.com/reference/string/string/find/). You use it to find the start position of the search string. Additionally, you will probably need the length method (http://www.cplusplus.com/reference/string/string/length/) for position calculation and the substr method (http://www.cplusplus.com/reference/string/string/substr/) to get the interesting part.
Your pseudo logic of the find could be:
1. Find the start position of the start_tag in xml
2. Calculate the end position of the start_tag (start position + length of start_tag)
3. Find the start position of the end_tag in xml
4. Return the sub-string between the two positions (the length is the difference between the start position of end_tag and the end position of start_tag)
Of course you need to check that the positions are valid before step 4 and, if they are not, return "not found".
Additionally, please consider Scheff's third comment on your question, that the end_tag starts with </ and that you need to call find_field without the angle brackets around your search string because you add them later in the function.
I hope this helps you find a solution.
Here is a very incomplete starter example:
string find_field(const string& xml, string tag_name)
{
std::string start_tag = "<" + tag_name + ">";
std::string end_tag = "</" + tag_name + ">";
size_t start_tag_start_pos = xml.find(start_tag);
// make some sanity checks to the position here (find returns std::string::npos if start_tag wasn't found)
size_t start_pos_of_interesting_string = start_tag_start_pos + start_tag.length();
// you can find the start_pos of the end tag similarly
// don't forget sanity checks
// calculate the length of between start and end tag
return xml.substr(start_pos_of_interesting_string, /* you need to add the length of the interesting string here*/);
}
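For comparison once you have tried it yourself, here is a minimal sketch of one way the starter above might be completed. It follows the four steps listed earlier and assumes that the first occurrence of each tag is the one you want and that the tags carry no attributes; it is one possible reading of the task, not the only valid solution.

```cpp
#include <string>

// Returns the text between <tag_name> and </tag_name>,
// or "not found" if either tag is missing.
std::string find_field(const std::string& xml, const std::string& tag_name)
{
    const std::string start_tag = "<" + tag_name + ">";
    const std::string end_tag = "</" + tag_name + ">";

    // Step 1: locate the start tag.
    const std::size_t start_tag_pos = xml.find(start_tag);
    if (start_tag_pos == std::string::npos) {
        return "not found";
    }

    // Step 2: the interesting text begins right after the start tag.
    const std::size_t content_pos = start_tag_pos + start_tag.length();

    // Step 3: look for the end tag, searching only after the start tag.
    const std::size_t end_tag_pos = xml.find(end_tag, content_pos);
    if (end_tag_pos == std::string::npos) {
        return "not found";
    }

    // Step 4: return everything between the two tags.
    return xml.substr(content_pos, end_tag_pos - content_pos);
}
```

It is called with the bare tag name, for example find_field(page, "temp_c"), because the angle brackets are added inside the function.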
Related
An input file is entered with the following data:
Juan Dela Cruz 150.50 5
'Juan Dela Cruz' is a name that I would like to assign to string A,
'150.50' is a number I would like to assign to float B
and 5 is a number I would like to assign to int C.
If I try cin, it is delimited by the spaces in between.
If I use getline, it's getting the whole line as a string.
What would be the correct syntax for this?
If we analyze the string, then we can make the following observation. At the very end, we have an integer. In front of the integer we have a space. And in front of that the float value. And again in front of that a space.
So, we can simply look from the back of the string for the 2nd last space. This can easily be achieved by
size_t position = lineFromeFile.rfind(' ', lineFromeFile.rfind(' ')-1);
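We need nested calls to rfind: the inner call finds the last space, and the outer call searches backwards from the position just before it (the rfind overload that takes a start position).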
Then we build a substring with the name. From start of the string up to the found position.
For the numbers, we put the rest of the original string into an std::istringstream and then simply extract from there.
Please see the following simple code, which has just a few lines of code.
#include <iostream>
#include <string>
#include <cctype>
#include <sstream>
int main() {
// This is the string that we read via getline or whatever
std::string lineFromeFile("Juan Dela Cruz 150.50 5");
// Let's search for the 2nd last space
size_t position = lineFromeFile.rfind(' ', lineFromeFile.rfind(' ')-1);
// Get the name as a substring from the original string
std::string name = lineFromeFile.substr(0, position);
// Put the numbers in a istringstream for better extraction
std::istringstream iss(lineFromeFile.substr(position));
// Get the rest of the values
float fValue;
int iValue;
iss >> fValue >> iValue;
// Show result to user
std::cout << "\nName:\t" << name << "\nFloat:\t" << fValue << "\nInt:\t" << iValue << '\n';
return 0;
}
Probably simplest in this case would be to read whole line into string and then parse it with regex:
const std::regex reg("\\s*(\\S.*)\\s+(\\d+(\\.\\d+)?)\\s+(\\d+)\\s*");
std::smatch match;
if (std::regex_match( input, match, reg)) {
auto A = match[1];
auto B = std::stof( match[2] );
auto C = std::stoi( match[4] );
} else {
// error invalid format
}
As always when the input does not (or sometimes does not) match a strict enough syntax, read the whole line and then apply the rules which to a human are "obvious".
In this case (quoting comment by john):
Read the whole string as a single line. Then analyze the string to work out where the breaks are between A, B and C. Then convert each part to the type you require.
Specifically, you probably want to use reverse searching functions (e.g. https://en.cppreference.com/w/cpp/string/byte/strrchr ), because the last parts of the input seem the most strictly formatted, i.e. easiest to parse. The rest is then the unpredictable part at the start.
Either put each value on its own line and use the line breaks to separate the data types, or add a distinguishing delimiter between the values, such as a . or a comma.
Use the same symbol after each data package, for example Juan Dela Cruz;150.50;5. Then you can check for a ; and split your string there.
If you want to keep the same input format, you could use the digits as an indicator of where to separate the fields.
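The delimiter suggestion can be sketched as follows, assuming the input format is changed to Juan Dela Cruz;150.50;5. The Record struct and the parse_record function are hypothetical names used only for this illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type, for illustration only.
struct Record {
    std::string name;
    float amount;
    int count;
};

// Parses a line such as "Juan Dela Cruz;150.50;5".
// Returns false if a field is missing or a number cannot be read.
bool parse_record(const std::string& line, Record& out)
{
    std::istringstream in(line);
    std::string amount_text;
    std::string count_text;

    // std::getline with a delimiter reads up to the next ';'.
    if (!std::getline(in, out.name, ';') ||
        !std::getline(in, amount_text, ';') ||
        !std::getline(in, count_text)) {
        return false;
    }

    // Convert the two numeric fields with stream extraction.
    std::istringstream numbers(amount_text + ' ' + count_text);
    return static_cast<bool>(numbers >> out.amount >> out.count);
}
```

Because the name is read up to the first ';', it may contain any number of spaces.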
I have this project due; however, I am unsure how to parse the data into the word, its part of speech, and its definition. I know that I should make use of the tab spacing to read it, but I have no idea how to implement it. Here is an example of the file:
Recollection n. The power of recalling ideas to the mind, or the period within which things can be recollected; remembrance; memory; as, an event within my recollection.
Nip n. A pinch with the nails or teeth.
Wodegeld n. A geld, or payment, for wood.
Xiphoid a. Of or pertaining to the xiphoid process; xiphoidian.
NB: Each word and part of speech and definition is one line in a text file.
If you can be sure that the definition will always follow the first period on a line, you could use an implementation like this. But it will break if there are ever more than 2 periods on a single line.
string str;
vector<pair<string,string>> v; // <word,definition>
while(getline(fileStream >> ws, str, '.')) { // skip leading whitespace, grab text up to the first '.'
    size_t cut = str.find_last_of(" \t"); // whitespace before the part of speech
    if (cut != string::npos) str.erase(cut); // get rid of n, v, etc. from word
    v.push_back(make_pair(str, string())); // push the word
    getline(fileStream >> ws, str, '.'); // grab the next part of the line
    v.back().second = str; // store definition in last added element
}
for(const auto& x : v) { // check your results
    cout << "word -> " << x.first << endl;
    cout << "definition -> " << x.second << endl << endl;
}
The better solution would be to learn Regular Expressions. It's a complicated topic but absolutely necessary if you want to learn how to parse text efficiently and properly:
http://www.cplusplus.com/reference/regex/
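As a rough illustration of the regular-expression route, the sketch below splits one line into word, part of speech, and definition. It assumes every line has the shape "Word abbreviation. definition" with a single-token part of speech such as n. or a.; the Entry struct and parse_entry function are hypothetical names, and the pattern would need adjusting for entries with compound parts of speech or multi-word headwords.

```cpp
#include <regex>
#include <string>

// Hypothetical entry type, for illustration only.
struct Entry {
    std::string word;
    std::string part_of_speech;
    std::string definition;
};

// Splits a line such as "Nip n. A pinch with the nails or teeth."
// Returns false if the line does not have the assumed shape.
bool parse_entry(const std::string& line, Entry& out)
{
    // group 1: the word (no whitespace)
    // group 2: letters followed by a period, e.g. "n."
    // group 3: the rest of the line
    static const std::regex pattern("(\\S+)\\s+([A-Za-z]+\\.)\\s+(.*)");

    std::smatch match;
    if (!std::regex_match(line, match, pattern)) {
        return false;
    }
    out.word = match[1].str();
    out.part_of_speech = match[2].str();
    out.definition = match[3].str();
    return true;
}
```

Unlike splitting on every period, this keeps any further periods inside the definition, because the last group simply takes the remainder of the line.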
Good day to all,
I am having a hard time trying to extract desired integers from a string. I am given the following to read in from a file:
itemnameitemnumber price percentmarkup
examples
Gowns-u2285 24.22 37%
TwoB1Ask1-m1275 90.4 1%
What I have been trying to do is get the item number separated from the item name so that I can store the item number as a reference for sorting. As you can see the first example itemnameitemnumber is a clear cut character to digit separation, whereas the next example has numbers within its item name.
I have tried several different approaches; however, handling item names that have integers as part of their name is proving to be beyond my experience.
If anyone can help me with this I would be greatly appreciative for their time and knowledge.
Good day,
I don't know if you have a fixed number of digits for itemnumber, but I am going to assume that you don't.
This is a simple approach; first you have to separate the words of your line. For example, use std::istringstream.
Once the line is split into words, for example by passing its iterators to a vector or by reading it with operator>>, check the first word from the back until you find a character that is not one of "0123456789 " (note the whitespace at the end).
After that, you have the position where these digits end (counting from the back), and you can cut your original string or, if you have the opportunity, the already split string. Voilà! You have your item name and item number.
For the record, I am going to do this whole thing using the same technique for the percent markup too, with the exception characters being "% ".
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
#define VALID_DIGITS "0123456789 "
#define VALID_PERCENTAGE "% "
struct ItemData {
    std::string Name;
    int Count;
    double Price;
    double PercentMarkup;
};
int ExtractItemData(std::string Line, ItemData & Output) {
    std::istringstream Stream( Line );
    // Split the line into whitespace-separated words
    std::vector<std::string> Words( (std::istream_iterator<std::string>(Stream)), std::istream_iterator<std::string>() );
    if (Words.size() < 3) {
        /* somebody gave us a malformed line with less than needed words */
        return -1;
    }
    // Search from the back for the last character that is not a digit (0-9) or a whitespace
    std::size_t LastNonDigit = Words[0].find_last_not_of( VALID_DIGITS );
    if (LastNonDigit == std::string::npos || LastNonDigit + 1 == Words[0].length()) {
        /* error; the item name is missing, or no item number follows it */
        return -2;
    }
    // The digits start right after the last non-digit character
    std::size_t StartOfDigits = LastNonDigit + 1;
    // Separate the string into 2 parts
    Output.Name = Words[0].substr(0, StartOfDigits); // Get the first part
    Output.Count = std::stoi( Words[0].substr(StartOfDigits) );
    Output.Price = std::stod( Words[1] );
    // Search from the back for the last character that is not '%' or ' '
    std::size_t LastOfPercent = Words[2].find_last_not_of(VALID_PERCENTAGE);
    if (LastOfPercent == std::string::npos) {
        /* error; the markup contains no number */
        return -3;
    }
    Output.PercentMarkup = std::stod( Words[2].substr(0, LastOfPercent + 1) );
    return 0;
}
The code requires the headers <iterator>, <sstream>, <string>, and <vector>; std::size_t is declared in <cstddef> if you do not already have it defined.
Hope the answer was useful.
Best of luck, COlda.
PS.: My first answer on stack overflow ^^;
You can iterate over the string, pushing the numbers into a vector, and then use a stringstream to convert them to integers.
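A minimal sketch of that idea, under the assumption that "numbers" means runs of consecutive digits and that the item number is the last run in the word. The function name extract_numbers is hypothetical.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Collects every run of consecutive digits in text and converts
// each run to an int with a string stream.
std::vector<int> extract_numbers(const std::string& text)
{
    std::vector<int> numbers;
    std::string digits;

    // Run one step past the end so a trailing run of digits is flushed too.
    for (std::size_t i = 0; i <= text.size(); ++i) {
        if (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i]))) {
            digits += text[i];
        } else if (!digits.empty()) {
            std::istringstream converter(digits);
            int value = 0;
            converter >> value;
            numbers.push_back(value);
            digits.clear();
        }
    }
    return numbers;
}
```

For an input such as TwoB1Ask1-m1275, the digits inside the item name end up as separate entries, so under the stated assumption the item number is the last element of the returned vector.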
Here is the content of the txt file that I've managed to read.
X-axis=0-9
y-axis=0-9
location.txt
temp.txt
I'm not sure whether it's possible, but after reading the contents of this txt file I'm trying to store just the x and y axis ranges in 2 variables so that I'll be able to use them in later functions. Any suggestions? And do I need to use vectors? Here is the code for reading the file.
string configName;
ifstream inFile;
do {
cout << "Please enter config filename: ";
cin >> configName;
inFile.open(configName);
if (inFile.fail()){
cerr << "Error finding file, please re-enter again." << endl;
}
} while (inFile.fail());
string content;
string tempStr;
while (getline(inFile, content)){
if (content[0] && content[1] == '/') continue;
cout << endl << content << endl;
It depends on the style of your file. If you are sure that the style will remain unchanged, you can read the file character by character and implement pattern recognition such as
if (tempstr == "y-axis=")
and then convert the appropriate substring to integer using functions like
std::stoi
and store it
I'm going to assume you already have the whole contents of the .txt file in a single string somewhere. In that case, your next task should be to split the string. Personally, yes, I would recommend using vectors. Say you wanted to split that string by newlines. A function like this:
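#include <string>
#include <vector>
std::vector<std::string> split(const std::string& str)
{
    std::vector<std::string> ret;
    std::size_t cur_pos = 0;
    std::size_t next_delim = str.find('\n');
    while (next_delim != std::string::npos) {
        ret.push_back(str.substr(cur_pos, next_delim - cur_pos));
        cur_pos = next_delim + 1;
        next_delim = str.find('\n', cur_pos);
    }
    // Keep the text after the last newline (the input may not end with one)
    if (cur_pos < str.size()) {
        ret.push_back(str.substr(cur_pos));
    }
    return ret;
}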
will split an input string by newlines. From there, you can begin parsing the strings in that vector. The key functions you'll want to look at are std::string's substr() and find() methods. A quick Google search should get you to the relevant documentation, but here you are, just in case:
http://www.cplusplus.com/reference/string/string/substr/
http://www.cplusplus.com/reference/string/string/find/
Now, say you have the string "X-axis=0-9" in vec[0]. Then, what you can do is do a find for = and then get the substrings before and after that index. The stuff before will be "X-axis" and the stuff after will be "0-9". This will allow you to figure that the "0-9" should be ascribed to whatever "X-axis" is. From there, I think you can figure it out, but I hope this gives you a good idea as to where to start!
std::string::find() can be used to search for a character in a string;
std::string::substr() can be used to extract part of a string into another new sub-string;
std::atoi() can be used to convert a string into an integer.
So then, these three functions will allow you to do some processing on content, specifically: (1) search content for the start/stop delimiters of the first value (= and -) and the second value (- and string::npos), (2) extract them into temporary sub-strings, and then (3) convert the sub-strings to ints. Which is what you want.
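Those three steps might look like the sketch below for a line such as X-axis=0-9. The AxisRange struct and parse_axis_range function are hypothetical names. The sketch uses std::stoi rather than std::atoi so that non-numeric text is reported as an error instead of silently becoming 0, and it assumes both values are non-negative.

```cpp
#include <exception>
#include <string>

// Hypothetical holder for one axis range, for illustration only.
struct AxisRange {
    int low;
    int high;
};

// Parses a line such as "X-axis=0-9" into its two integer bounds.
// Returns false if the delimiters are missing or a value is not a number.
bool parse_axis_range(const std::string& content, AxisRange& out)
{
    // (1) Find the delimiters. The key itself contains a '-',
    // so the dash is searched for only after the '='.
    const std::size_t equals = content.find('=');
    if (equals == std::string::npos) {
        return false;
    }
    const std::size_t dash = content.find('-', equals + 1);
    if (dash == std::string::npos) {
        return false;
    }

    try {
        // (2) Extract the sub-strings and (3) convert them to ints.
        out.low = std::stoi(content.substr(equals + 1, dash - equals - 1));
        out.high = std::stoi(content.substr(dash + 1));
    } catch (const std::exception&) {
        return false; // std::stoi throws on non-numeric or out-of-range text
    }
    return true;
}
```

Lines without an '=' such as location.txt are rejected, so the function can be called on every line of the config file and only the axis lines will fill in a range.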
I am making a roman numeral converter. I have everything figured out except there is one problem at the end.
The string looks like IVV
I need to make it IX
I have split the string at each new letter, then appended them back on, then using an if statement to see if it contains 2 "V"s. I want to know if there is a simpler way to do this.
Using std::string should help you tremendously as you can leverage its search and replace functionality. You'll want to start with the find function which allows you to search for a character or a string and returns an index where what you are searching for exists or npos if the search fails.
You can then call replace passing it the index returned by find, the number of characters you want to replace and what replace the range with.
The code below should help you get started.
#include <string>
#include <iostream>
int main()
{
std::string roman("IVV");
// Search for the string you want to replace
std::string::size_type loc = roman.find("VV");
// If the substring is found replace it.
if (loc != std::string::npos)
{
// replace 2 characters starting at position loc with the string "X"
roman.replace(loc, 2, "X");
}
std::cout << roman << std::endl;
return 0;
}
You could use the std::string find and rfind operations. They return the positions of the first and the last occurrence of the given parameter; if the two positions are not equal, you know the string contains the character at least twice.
Answer updated
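#include <string>
int main()
{
    std::string x1 = "IVV";
    // Different positions mean the string contains at least two 'V's
    if (x1.find('V') != x1.rfind('V'))
    {
        // Assumes the two 'V's are adjacent, as in "IVV"
        x1.replace(x1.find('V'), 2, "X");
    }
    return 0;
}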