How to ignore a character through strtok? - c++

In the below code i would like to also ignore the character ” . But after adding that in i still get “Mr_Bishop” as my output.
I have the following code:
ifstream getfile;
getfile.open(file,ios::in);
char data[256];
char *line;
//loop till end of file
while(!getfile.eof())
{
//get data and store to variable data
getfile.getline(data,256,'\n');
line = strtok(data," ”");
while(line != NULL)
{
cout << line << endl;
line = strtok(NULL," ");
}
}//end of while loop
my file content :
hello 7 “Mr_Bishop”
hello 10 “0913823”
Basically all i want my output to be :
hello
7
Mr_Bishop
hello
10
0913823
With this code i only get :
hello
7
"Mr_Bishop"
hello
10
"0913823"
Thanks in advance! :)
I realise i have made an error in the inner loop missing out the quote. But now i receive the following output :
hello
7
Mr_Bishop
�
hello
10
0913823
�
any help? thanks! :)

It looks like you used Wordpad or something to generate the file. You should use Notepad or Notepad++ on Windows or similar thing that will create ASCII encoding on Linux. Right now you are using what looks like UTF-8 encoding.
In addition the proper escape sequence for " is \". For instance
line = strtok(data," \"");
Once you fix your file to be in ASCII encoding, you'll find you missed something in your loop.
while(!getfile.eof())
{
//get data and store to variable data
getfile.getline(data,256,'\n');
line = strtok(data," \"");
while(line != NULL)
{
std::cout << line << std::endl;
line = strtok(NULL," \""); // THIS used to be strtok(NULL," ");
}
}//end of while loop
You missed a set of quotes there.
Correcting the file and this mistake yields the proper output.

Have a very careful look at your code:
line = strtok(data," ”");
Notice how the quotes lean at different angles (well mine do, I guess hopefully your font shows the same thing). You have included only the closing double quote in your strtok() call. However, Your data file has:
hello 7 “Mr_Bishop”
which has two different kinds of quotes. Make sure you're using all the right characters, whatever "right" is for your data.
UPDATE: Your data is probably UTF-8 encoded (that's how you got those leaning double quotes in there) and you're using strtok() which is completely unaware of UTF-8 encoding. So it's probably doing the wrong thing, splitting up the multibyte UTF-8 characters, and leaving you with rubbish at the end of the line.

Related

Formatting Output c++

Wanting to do some fancy formatting. I have several lines that I want to interact with each other. Get the first two lines. Print out the character in the second line times the integer in the first line. Seperate them all with a asterisk character. No asterisk after the final character is printed. Move onto the next integer and character. Print them on a separate line. Do this for the whole list. The problem I am having is printing them on separate lines. Example:
5
!
2
?
3
#
Desired output:
!*!*!*!*!
?*?
#*#*#
My output:
!*!*!*!*!*?*?*#*#*#*
Below is my code. Another thing to mention is that I am reading the data about the characters and numbers from a separate text file. So I am using the getline function.
Here is a chunk of the code:
ifstream File
File.open("NumbersAndCharacters.txt")
string Number;
string Character;
while(!File.eof(){
getline(File, Number);
getline(File, Character);
//a few lines of stringstream action
for (int i=0; i<=Number; i++){
cout<<Character<<"*";}//end for. I think this is where
//the problem is.
}//end while
File.close();
return 0;
Where is the error? Is it the loop? Or do I not understand getline?
It should be printing an "endl" or "\n" after each multiplication of the character is done.
Thanks to everyone for the responses!
You have not shown your code yet, but what seems to be the issue here is that you simply forgot to add a new line every time you print your characters. For example, you probably have done:
std::cout << "!";
Well, in this context you forgot to add the new line ('\n'), so you have two options here: first insert the new line yourself:
std::cout << "! \n";
Or std::endl;
std::cout << "!" << std::endl;
For comparison of the two, see here and here. Without further description, or more importantly your code that doesn't seem to work properly, we can't make suggestions or solve your problem.

seekg() not working as expected

I have a small program, that is meant to copy a small phrase from a file, but it appears that I am either misinformed as to how seekg() works, or there is a problem in my code preventing the function from working as expected.
The text file contains:
//Intro
previouslyNoted=false
The code is meant to copy the word "false" into a string
std::fstream stats("text.txt", std::ios::out | std::ios::in);
//String that will hold the contents of the file
std::string statsStr = "";
//Integer to hold the index of the phrase we want to extract
int index = 0;
//COPY CONTENTS OF FILE TO STRING
while (!stats.eof())
{
static std::string tempString;
stats >> tempString;
statsStr += tempString + " ";
}
//FIND AND COPY PHRASE
index = statsStr.find("previouslyNoted="); //index is equal to 8
//Place the get pointer where "false" is expected to be
stats.seekg(index + strlen("previouslyNoted=")); //get pointer is placed at 24th index
//Copy phrase
stats >> previouslyNotedStr;
//Output phrase
std::cout << previouslyNotedStr << std::endl;
But for whatever reason, the program outputs:
=false
What I expected to happen:
I believe that I placed the get pointer at the 24th index of the file, which is where the phrase "false" begins. Then the program would've inputted from that index onward until a space character would have been met, or the end of the file would have been met.
What actually happened:
For whatever reason, the get pointer started an index before expected. And I'm not sure as to why. An explanation as to what is going wrong/what I'm doing wrong would be much appreciated.
Also, I do understand that I could simply make previouslyNotedStr a substring of statsStr, starting from where I wish, and I've already tried that with success. I'm really just experimenting here.
The VisualC++ tag means you are on windows. On Windows the end of line takes two characters (\r\n). When you read the file in a string at a time, this end-of-line sequence is treated as a delimiter and you replace it with a single space character.
Therefore after you read the file you statsStr does not match the contents of the file. Every where there is a new line in the file you have replaced two characters with one. Hence when you use seekg to position yourself in the file based on numbers you got from the statsStr string, you end up in the wrong place.
Even if you get the new line handling correct, you will still encounter problems if the file contains two or more consecutive white space characters, because these will be collapsed into a single space character by your read loop.
You are reading the file word by word. There are better methods:
while (getline(stats, statsSTr)
{
// An entire line is read into statsStr.
std::string::size_type posn = statsStr.find("previouslyNoted=");
// ...
}
By reading entire text lines into a string, there is no need to reposition the file.
Also, there is a white-space issue when reading by word. This will affect where you think the text is in the file. For example, white space is skipped, and there is no telling how many spaces, newlines or tabs were skipped.
By the way, don't even think about replacing the text in the same file. Replacement of text only works if the replacement text has the same length as the original text in the file. Write to a new file instead.
Edit 1:
A better method is to declare your key strings as array. This helps with positioning pointers within a string:
static const char key_text[] = "previouslyNoted=";
while (getline(stats, statsStr))
{
std::string::size_type key_position = statsStr.find(key_text);
std::string::size_type value_position = key_position + sizeof(key_text) - 1; // for the nul terminator.
// value_position points to the character after the '='.
// ...
}
You may want to save programming type by making your data file conform to an existing format, such as INI or XML, and using appropriate libraries to parse them.

how to test for white space c++: [duplicate]

This question already has answers here:
Why does reading a record struct fields from std::istream fail, and how can I fix it?
(9 answers)
Closed 8 years ago.
I'm trying to parse a .csv file, and I need to be able to test for a carriage return. Here is a test .csv file called sample.csv:
2
3
As you'll notice, there are two rows and one column in this file. I now write the following C++ code:
ifstream myfile (sample.csv); //Import file
char nextchar;
myfile.get(nextchar);
cout<<nextchar<<'\n';
myfile.get(nextchar);
cout<< nextchar<<" If 0, then that was not a carriage return. If 1, it was. :"<<(nextchar=='\n')<<'\n';
myfile.get(nextchar);
cout<<nextchar<<'\n';
I expect the following output:
2
If 0, then that was not a carriage return. If 1, it was. :1
3
however, I get:
2
If 0, then that was not a carriage return. If 1, it was. :0
3
How is this possible? how do I test for a carriage return??
It may be a pair of characters CR + LF. In any case you could output the code of this character yourself. Why did not you do this?
Also you could apply standard function std::isspace decalred in header <cctype>
I suggest to use standard function std::getline to read a whole line instead of using get.
There are a lot of things that can go wrong in the assumptions: OS behaviour, the text editor used to write the sample file, an undesired extra space or tab at the end of line, and the ios_base::openmode used to open the file, as well as all possible combination between those...
First instert this line to see what you actually read: is it 0x0d or 0x0a ? or somthing else ?
cout << "Char read: 0x0"<< std::hex << (int)nextchar<<"\n";
cout << "If 0 ... // Existing line
You can also replace your sample with the following. It opens the file in binary mode and display in hex the chars really in the file :
ifstream myfile ("sample.csv", ifstream::binary); //Import file
while (myfile.good() ) {
char nextchar;
myfile.get(nextchar);
if (myfile.good())
cout << "0x0"<< std::hex << (int)nextchar
<< " " << (isprint(nextchar)? nextchar:'?') <<"\n";
}
If second and third line are 0x0d and 0x0a, you'll know for sure that your text editor has put the extra CR.
Then you can remove ifstream::binary in the code above. Normally you should have, as you pointed out only 0x0a in the second line. If it's not the case, then you should investigate if the default openmode was somehow altered.
By the way, I've compiled your original code under windows and prepared the sample file using notepad , ran the programm and got... what you did expect ! Then I've redone the test with the following modification and the finally got what you got.
Good luck !

No methods of read a file seem to work, all return nothing - C++

EDIT: Problem solved! Turns out Windows 7 wont let me read/ write to files without explicitly running as administrator. So if i run as admin it works fine, if i dont i get the weird results i explain below.
I've been trying to get a part of a larger program of mine to read a file.
Despite trying multiple methods(istream::getline, std::getline, using the >> operator etc) All of them return with either /0, blank or a random number/what ever i initialised the var with.
My first thought was that the file didn't exist or couldn't be opened, however the state flags .good, .bad and .eof all indicate no problems and the file im trying to read is certainly in the same directory as the debug .exe and contains data.
I'd most like to use istream::getline to read lines into a char array, however reading lines into a string array is possible too.
My current code looks like this:
void startup::load_settings(char filename[]) //master function for opening a file.
{
int i = 0; //count variable
int num = 0; //var containing all the lines we read.
char line[5];
ifstream settings_file (settings.inf);
if (settings_file.is_open());
{
while (settings_file.good())
{
settings_file.getline(line, 5);
cout << line;
}
}
return;
}
As said above, it compiles but just puts /0 into every element of the char array much like all the other methods i've tried.
Thanks for any help.
Firstly your code is not complete, what is settings.inf ?
Secondly most probably your reading everything fine, but the way you are printing is cumbersome
cout << line; where char line[5]; be sure that the last element of the array is \0.
You can do something like this.
line[4] = '\0' or you can manually print the values of each element in array in a loop.
Also you can try printing the character codes in hex for example. Because the values (character codes) in array might be not from the visible character range of ASCII symbols. You can do it like this for example :
cout << hex << (int)line[i]

C++ Text File, Chinese characters

I have a C++ project which is supposed to add <item> to the beginning of every line and </item > to the end of every line. This works fine with normal English text, but I have a Chinese text file I would like to do this to, but it does not work. I normally use .txt files, but for this I have to use .rtf to save the Chinese text. After I run my code, it becomes gibberish. Here's an example.
{\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff31507\deff0\stshfdbch31506\stshfloch31506\stshfhich31506\stshfbi31507\deflang1033\deflangfe1033\themelang1033\themelangfe0\themelangcs0{\fonttbl{\f2\fbidi
\fmodern\fcharset0\fprq1{*\panose
02070309020205020404}Courier
New;}
Code:
int main()
{
ifstream in;
ofstream out;
string lineT, newlineT;
in.open("rawquote.rtf");
if(in.fail())
exit(1);
out.open("itemisedQuote.rtf");
do
{
getline(in,lineT,'\n');
newlineT += "<item>";
newlineT += lineT;
newlineT += "</item>";
if (lineT.length() >5)
{
out<<newlineT<<'\n';
}
newlineT = "";
lineT = "";
} while(!in.eof());
return 0;
}
That looks like RTF, which makes sense as you say this is an rtf file.
Basically, if you dump that file when you open, you'll see it looks like that...
Also, you should revisit your loop
std::string line;
while(getline(in, line, '\n'))
{
// do stuff here, the above check correctly that you have indeed read in a line!
out << "<item>" << line << "</item>" << endl;
}
You can't read the RTF code the same way as plain text as you'll just ignore format tags, etc. and might just break the code.
Try to save your chinese text as a text file using UTF-8 (without BOM) and your code should work. However this might fail if some other UTF-8 encoded character contains essentially a line break (not sure about this part right now), so you should try to do real UTF-8 conversion and read the file using wide chars instead of regular chars (as Chan suggested), which is a little bit tricky using C++.
It's kind of a miracle that this works for non-Chinese text. "\n" is not the line separator in RTF, "\par" is. The odds that more damage is done to the RTF header are certainly greater for Chinese.
C++ is not the best language to tackle this. It is a trivial 5 minute program in C# as long as the file doesn't get too large:
using System;
using System.Windows.Forms; // Add reference
class Program {
static void Main(string[] args) {
var rtb = new RichTextBox();
rtb.LoadFile(args[0], RichTextBoxStreamType.RichText);
var lines = rtb.Lines;
for (int ix = 0; ix < lines.Length; ++ix) {
lines[ix] = "<item>" + lines[ix] + "</item>";
}
rtb.Lines = lines;
rtb.SaveFile(args[0], RichTextBoxStreamType.RichText);
}
}
If C++ is a hard requirement then you'll have to find an RTF parser.
I think you should use 'wchar' for string instead of 'regular char'.
If I'm understanding the objective of this code, your solution is not going to work. A line break in an RTF document does not correspond to a line break in the visible text.
If you can't just use plain text (Chinese characters are not a problem with a valid encoding), take a look at the RTF spec. You'll discover that it is a nightmare. So you're best bet is probably a third-party library that can parse RTF and read it "line" by "line." I have never looked for such a library, so do not have any suggestions off the top of my head, but I'm sure they are out there.