How to dismiss characters/integers from input text from file? - c++

I have managed to write some code that can read from a .txt file, however I want my program to only read in important data.
For example, if my text file had the following data:
Name= Samuel
Favourite colour= Green
Age= 24
Gender= Male
I want my program to read only the values and ignore everything up to and including the "=":
Samuel
Green
24
Male
I looked into the .substr() method, however, you need to know the exact position of the = sign.
This is my code, and it does not work
while ( getline (open_file,line) ){
for (int i=0; i<line.length(); i++){
if (line == "="){
cout << " " + (rest of the line;
}
I would really appreciate it if someone could help me out.

A simple and reliable way to read this kind of data file is to read one line at a time into a string variable, then extract the important parts.
Your data file looks like it is of the format:
<name> = <value>
I suggest you extract both name and value as strings (e.g. substrings), then you can pass the original data, in its original form, to other functions. Let the other functions worry about converting into integers or other data types.
The name field can be found by searching for the '=' and remembering the position of the '='. Next use the substring method and extract from the beginning of the string to the position before the '='.
The value is the substring that starts after the position of the '=' to the end of the string.
I'll let you look up the std::string functions and how to use them. I don't want to give you the code because you won't learn as much (such as how to look up functions).
See also std::getline.
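For readers who want to check their own attempt, here is a minimal sketch of the search-and-substring steps described above. The helper name splitNameValue is hypothetical (it is not from the answer), and the sketch assumes the first '=' on a line separates the name from the value and that blanks after the '=' should be skipped.

```cpp
#include <string>

// Hypothetical helper (not from the original answer): splits "name= value"
// at the first '='. Returns false when the line contains no '='.
bool splitNameValue(const std::string& line, std::string& name, std::string& value) {
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos) return false;

    // Everything before the '=' is the name.
    name = line.substr(0, eq);

    // The value starts at the first character after '=' that is not a blank.
    const std::string::size_type start = line.find_first_not_of(" \t", eq + 1);
    value = (start == std::string::npos) ? std::string() : line.substr(start);
    return true;
}
```

Inside the getline loop from the question, this could be called once per line, printing value whenever the function returns true.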

Related

How to read CSV file with newline and comma characters inside cells in C++

I've got a CSV file containing cells with break lines ("\n") and/or commas which are enclosed with double quotes.
When I use the getline() function to get each row, it considers each line inside a cell to be a new row of the CSV file. In addition, when I use splitIntoVec to get a vector for each row, it treats a comma inside a cell as the start of a new vector element.
I want to store the content of the CSV file in a vector of vectors, in which each row is a vector of the strings in its cells.
for instance, for the following csv file content
"Row 1 cell 1
With break line","Row1 cell2, with comma"
"Row 2 cell 1
With break line","Row2 cell2, with comma"
Row 3 cell 1,Row3 cell 2
I get a result vector of 4 string vectors, of which the first has only one element and the second has 3 elements.
Here is my code :
vector<vector<string>> readFromCsv(string &fileName, char rowDelimiter = '\n', char colDelimiter = ',') {
ifstream file(fileName); // declare file stream
string value;
vector<vector<string>> contentVec;
vector<string> rowVec;
string rowStr;
while (getline(file, rowStr, rowDelimiter)) {
rowVec = splitIntoVec(rowStr, colDelimiter);
contentVec.push_back(rowVec);
}
return contentVec;
}
Is there any other function (in libraries like boost) available to resolve these issues? Any help would be appreciated.
In PHP, I get the content of the CSV file correctly with fgetcsv(). Is there an alternative function in C++?
@Simone already said in his comment that this is not a CSV file. But given your problem, you will need to get your hands dirty and do some text processing to split it. You can read the complete file into a string and then break it up using loops or whichever way you see fit. For this you will need to keep track of each " encountered while traversing, and break only when you are not inside double quotes.
For Example,
(opening quote)"Row 1 cell 1
With break line"(closing quote),(opening quote)"Row1 cell2, with comma"(closing quote)
You will have to keep track of opening and closing double quotes using an index or a counter, and break rows only when '\n' is found outside a quoted cell.
You can use regex also if you are sure there are no " in the cells.
Thanks @Alex. Useful link if someone else faces the same issue: http://mybyteofcode.blogspot.nl/2010/11/parse-csv-file-with-embedded-new-lines.html
You have to split on ", keeping track of two states: inside quotes and outside quotes. A comma and an end of line have different meanings depending on the state.
You can use getline(file, rowStr, '"') to read in everything up to the ", but your logic to separate in records will be a bit more complex. If numbers are allowed without quotation marks, then it becomes even more complex.
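The two-state idea from these answers can be sketched as a single pass over the file contents. This is an illustration under stated assumptions, not a complete CSV implementation: the function name parseCsv is hypothetical, the whole file is assumed to already be in a string, a doubled quote ("") inside a quoted cell is assumed to mean one literal quote, and blank lines are skipped.

```cpp
#include <string>
#include <vector>

// Sketch of a quote-aware CSV splitter (hypothetical helper).
// Assumption: "" inside a quoted cell stands for one literal quote character.
std::vector<std::vector<std::string>> parseCsv(const std::string& text) {
    std::vector<std::vector<std::string>> rows;
    std::vector<std::string> row;
    std::string cell;
    bool inQuotes = false;  // state: inside or outside a quoted cell
    bool pending = false;   // true once the current row has any content

    for (std::string::size_type i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (inQuotes) {
            if (c != '"') {
                cell += c;                       // commas and newlines are literal here
            } else if (i + 1 < text.size() && text[i + 1] == '"') {
                cell += '"';                     // doubled quote -> one literal quote
                ++i;
            } else {
                inQuotes = false;                // closing quote
            }
        } else if (c == '"') {
            inQuotes = true;
            pending = true;
        } else if (c == ',') {
            row.push_back(cell);                 // cell boundary
            cell.clear();
            pending = true;
        } else if (c == '\n') {
            if (pending) {                       // row boundary; blank lines are skipped
                row.push_back(cell);
                rows.push_back(row);
            }
            row.clear();
            cell.clear();
            pending = false;
        } else if (c != '\r') {                  // drop '\r' from CRLF line endings
            cell += c;
            pending = true;
        }
    }
    if (pending) {                               // last row without a trailing newline
        row.push_back(cell);
        rows.push_back(row);
    }
    return rows;
}
```

For the sample file in the question, this should yield three rows of two cells each, with the line break and the comma preserved inside the quoted cells.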

(C++) Seekg in fstream cutting off characters

I'm not entirely sure why this is happening. I've tried adding spaces before the words in the txt file that I'm reading from, and that fixes it for some names, but not all. Basically I'm just trying to return a name, and each name in the file is on a different line. But when I print the names, some of them are cut off: "Dillon" becomes "llon", "Stephanie" becomes "phanie", and so on. Here's the use of seekg:
string Employee::randomFirstName()
{
int i;
string fName;
i = rand() % 100;
ifstream firstName;
firstName.open("First Names.txt", ios::out);
firstName.seekg(i);
firstName >> fName;
return fName;
}
I would post the txt file, but it's just a list of names, one per line, 100 of them. I've tried looking up examples of the use of seekg, but I can't figure out why it cuts off some names. Also, it only happens sometimes: one run prints "Dillon" correctly, the next prints "llon".
Any help would be appreciated
istream::seekg() will move to a character position. Therefore, seeking to a random character position between 0 and 99 (rand() % 100) may end up in the middle of a line. There is no way for seekg to know you wanted to seek to a line number: it has no concept of lines.
Instead, you can call std::getline repeatedly, discarding lines, until you reach the line you want.
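A minimal sketch of that approach, assuming a zero-based line index. The helper name nthLine is hypothetical, and it takes any std::istream so it can be tried with a string stream as well as with the ifstream from the question.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, convenient for trying the helper out
#include <string>

// Hypothetical helper: returns the line at zero-based position `index`,
// or an empty string if the stream has fewer lines than that.
std::string nthLine(std::istream& in, int index) {
    std::string line;
    for (int k = 0; k <= index; ++k) {
        // Each call overwrites `line`, so only the last line read is kept.
        if (!std::getline(in, line)) return std::string();
    }
    return line;
}
```

In randomFirstName, the seekg and >> calls could then be replaced by something like fName = nthLine(firstName, i), which always returns a whole line.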

sscanf input not working

I have tab-separated records like this:
1000 Muhammad Aashir 0213-4211685 123456 0
First I read the line using fgets, and now I am trying to extract its contents using sscanf, but there is an unexpected problem. Please help, I am a beginner.
here is the code
char buffer[SIZE];
Account req;
while(fgets(buffer,SIZE,fptr))
{
cout<<endl<<buffer<<endl;
sscanf(buffer,"%d\t%s\t%s\t%s\t%ld\n",&req.acc_num,req.name,req.mobileno,req.pass,&req.acc_bal);
cout<<endl<<req.pass;
}
The output of buffer is the same as the record line,
but after extracting the values, req.pass is displayed incorrectly:
it shows '0213-4211685' when it should show '123456'.
The %s conversion in sscanf reads only up to the next whitespace of any kind, including a space. In your case, req.name contains only Muhammad, so every later field is shifted by one and holds the wrong data.
If you need to use sscanf() with %s, you'll have to replace each " " in the name with a placeholder character, such as "_".
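A different option from the underscore replacement, sketched here as an alternative rather than as the answer's method, is the scanset conversion %[^\t], which reads everything up to the next tab and so keeps the space in the name. The Account layout and field sizes below are assumptions, since the real struct is not shown in the question, and parseRecord is a hypothetical name.

```cpp
#include <cstdio>

// Assumed layout; the real Account struct is not shown in the question.
struct Account {
    int  acc_num;
    char name[50];
    char mobileno[20];
    char pass[20];
    long acc_bal;
};

// Hypothetical helper: returns true when all five tab-separated fields parse.
bool parseRecord(const char* buffer, Account& req) {
    // %49[^\t] reads up to 49 characters that are not a tab, so spaces stay
    // inside the field. The widths keep sscanf within the assumed array sizes.
    const int n = std::sscanf(buffer, "%d\t%49[^\t]\t%19[^\t]\t%19[^\t]\t%ld",
                              &req.acc_num, req.name, req.mobileno, req.pass,
                              &req.acc_bal);
    return n == 5;  // sscanf returns the number of fields it assigned
}
```

Checking the return value of sscanf also makes a malformed record visible instead of silently leaving fields unset.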

C++ Filter Content from Text File

I want to extract a line of text whenever [1], [2] ... [n] is found, but I can't work out how to store a line starting with [1] into a char array.
void ExtractWebContent::filterContent(){
char str [10];
ifstream reading;
reading.open("file_Currency.txt");
while (!reading.eof()){
reading.get(str,10,'[1]');
cout << str << endl;
}
cout << str;
reading.close();
}
This is the file that I want to extract from..
CAPTION: Currencies
Name Price Change % Chg
[80]USD/SGD
1.2606 -0.00 -0.13%
USD/SGD [81]USDSGD=X
[82]EUR/SGD
1.5242 0.00 +0.11%
EUR/SGD [83]EURSGD=X
I am using linux, C++ programming. This is meant to filter figures obtained from HTML text file.
Any help would be very much appreciated. Thank you!
The main error is that '[1]' is a multi-character literal, not a string: the third argument of get() must be a single delimiter character, i.e. a character that separates records in the file. If you add the compiler option -Wall when compiling, you will get a warning about having more than one character in the character literal.
One way of doing what you want, is to use regular expressions.
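A minimal sketch with std::regex, assuming the wanted lines are those that begin with a marker such as [80] and that only the text after the marker is needed. The function name linesWithMarker is hypothetical, and lines where the marker appears mid-line (such as "USD/SGD [81]USDSGD=X") are deliberately not matched under this assumption.

```cpp
#include <istream>
#include <regex>
#include <sstream>  // std::istringstream, convenient for trying the helper out
#include <string>
#include <vector>

// Hypothetical helper: collects the text that follows a leading "[n]" marker.
std::vector<std::string> linesWithMarker(std::istream& in) {
    // \[ and \] match literal brackets, (\d+) the number, (.*) the rest of the line.
    static const std::regex marker(R"(\[(\d+)\](.*))");
    std::vector<std::string> out;
    std::string line;
    std::smatch m;
    while (std::getline(in, line)) {
        // regex_match succeeds only if the whole line fits the pattern.
        if (std::regex_match(line, m, marker)) out.push_back(m[2].str());
    }
    return out;
}
```

Reading with std::getline in the loop condition also avoids the while (!reading.eof()) pattern, which tends to process the last read twice.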

input, output and \n's

I'm trying to solve a problem that asks me to look for palindromes in strings. I think I've got everything right, but the problem is with the output.
Here's the original and my output:
http://pastebin.com/c6Gh8kB9
Here's what's been said about the input and output of the problem:
Input format :
A file with no more than 20,000
characters. The file has one or more
lines. No line is longer than 80
characters (not counting the newline
at the end).
Output format :
The first line of the output should be the length of the longest
palindrome found. The next line or
lines should be the actual text of the
palindrome (without any surrounding
white space or punctuation but with
all other characters) printed on a
line (or more than one line if
newlines are included in the
palindromic text). If there are
multiple palindromes of longest
length, output the one that appears
first.
Here's how I read the input :
string test;
string original;
while (getline(fin,test))
original += test;
And here's how I output it:
int len = answer.length();
answer = cleanUp(answer);
while (len > 0){
string s3 = answer.substr(0,80);
answer.erase(0,80);
fout << s3 << endl;
len -= 80;
}
cleanUp() is a function to remove the illegal characters from the beginning and the end. I'm guessing that the problem is with \n's and the way I read the input. How can I fix this ?
"No line is longer than 80 characters (not counting the newline at the end)"
does not imply that every line except the last is exactly 80 characters long, but your output code assumes this by taking 80 characters off answer in every iteration.
You may want to keep the newlines in the string until the output phase. Alternatively, you might store newline positions in a separate std::vector. The first option complicates your palindrome search routine; the second your output code.
(If I were you, I'd also index into answer instead of taking chunks off with substr/erase; your output code is now O(n^2) while it could be O(n).)
After rereading, it appears that I misunderstood the question. I was thinking in terms of each line representing a single word, with the intent being to test whether that "word" is palindromic.
I now think the question is really more like: "Given a sequence of up to 20,000 characters, find the longest palindromic substring. Oh, incidentally, the input is broken up into lines of no more than 80 characters."
If that's correct, I'd ignore the line-length completely. I'd read the entire file into a single buffer, then search for palindromes in that buffer.
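Reading the whole input into one buffer, newlines included, can be sketched with stream-buffer iterators. The helper name readAll is hypothetical; fin from the question could be passed to it directly.

```cpp
#include <istream>
#include <iterator>
#include <sstream>  // std::istringstream, convenient for trying the helper out
#include <string>

// Hypothetical helper: copies every remaining character of the stream,
// including '\n', into a single string.
std::string readAll(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Because the newlines stay in the buffer, a palindrome that spans several lines can later be printed with its original line breaks.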
To find the palindromes, I'd simply walk through each position in the array, and find the longest possible palindrome with that as its center point:
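for (int i = 1; i < total_chars; i++) {
    int n = 1;
    // Grow outward while both neighbours exist and match (odd lengths only).
    while (n <= min(i, total_chars - 1 - i) && array[i + n] == array[i - n])
        n++;
    // Candidate palindrome is from array[i-n+1] to array[i+n-1]
}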
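The centre-expansion idea can be sketched as a complete function. This version is an illustration with its own assumptions: the name longestPalindrome is hypothetical, it also tries centres between two characters so that even-length palindromes are found, and it compares characters exactly, so skipping punctuation or ignoring case as the original problem may require would need an extra step.

```cpp
#include <string>

// Returns the first longest palindromic substring of s (exact character match).
std::string longestPalindrome(const std::string& s) {
    const int len = static_cast<int>(s.size());
    int bestStart = 0;
    int bestLen = 0;
    for (int centre = 0; centre < 2 * len - 1; ++centre) {
        // Even values of `centre` sit on a character, odd ones between two characters.
        int lo = centre / 2;
        int hi = lo + centre % 2;
        // Expand outward while the characters on both sides match.
        while (lo >= 0 && hi < len && s[lo] == s[hi]) {
            --lo;
            ++hi;
        }
        // The loop overshoots by one on each side, hence the -1 and +1.
        const int curLen = hi - lo - 1;
        if (curLen > bestLen) {  // strict '>' keeps the earliest palindrome on ties
            bestLen = curLen;
            bestStart = lo + 1;
        }
    }
    return s.substr(bestStart, bestLen);
}
```

The worst case is quadratic in the input length, which should be acceptable for the 20,000-character limit stated in the problem.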