How to test for white space in C++ [duplicate]

This question already has answers here:
Why does reading a record struct fields from std::istream fail, and how can I fix it?
(9 answers)
Closed 8 years ago.
I'm trying to parse a .csv file, and I need to be able to test for a carriage return. Here is a test .csv file called sample.csv:
2
3
As you'll notice, there are two rows and one column in this file. I now write the following C++ code:
ifstream myfile ("sample.csv"); //Import file
char nextchar;
myfile.get(nextchar);
cout<<nextchar<<'\n';
myfile.get(nextchar);
cout<< nextchar<<" If 0, then that was not a carriage return. If 1, it was. :"<<(nextchar=='\n')<<'\n';
myfile.get(nextchar);
cout<<nextchar<<'\n';
I expect the following output:
2
If 0, then that was not a carriage return. If 1, it was. :1
3
however, I get:
2
If 0, then that was not a carriage return. If 1, it was. :0
3
How is this possible? How do I test for a carriage return?

The line ending may be a pair of characters, CR + LF. In any case, you could print the numeric code of the character yourself to find out.
You could also use the standard function std::isspace, declared in the header <cctype>.
I suggest using the standard function std::getline to read a whole line instead of using get.

Several of your assumptions can go wrong: OS behaviour, the text editor used to write the sample file, an unwanted extra space or tab at the end of a line, the ios_base::openmode used to open the file, and any combination of these.
First insert this line to see what you actually read: is it 0x0d, 0x0a, or something else?
cout << "Char read: 0x0"<< std::hex << (int)nextchar<<"\n";
cout << "If 0 ... // Existing line
You can also replace your sample with the following. It opens the file in binary mode and displays in hex the characters that are really in the file:
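ifstream myfile ("sample.csv", ifstream::binary); // Binary mode: no newline translation
char nextchar;
while (myfile.get(nextchar)) { // Stops at the first failed read
    // Cast to unsigned char so bytes above 0x7f print correctly and isprint is well defined
    unsigned char uc = static_cast<unsigned char>(nextchar);
    cout << "0x" << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(uc) // needs <iomanip>
         << " " << (isprint(uc) ? nextchar : '?') << "\n"; // isprint needs <cctype>
}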
If the second and third lines are 0x0d and 0x0a, you'll know for sure that your text editor has added the extra CR.
Then you can remove ifstream::binary from the code above. Normally you should then see, as you expected, only 0x0a on the second line. If that's not the case, investigate whether the default openmode was somehow altered.
By the way, I compiled your original code under Windows, prepared the sample file using Notepad, ran the program and got... what you expected! Then I redid the test with the following modification and finally got what you got.
Good luck !

Read a file line by line in C++

I wrote the following C++ program to read a text file line by line and print out the content of the file line by line. I entered the name of the text file as the only command line argument into the command line.
#include <iostream>
#include <fstream>
using namespace std;
int main(int argc, char* argv[])
{
    char buf[255] = {};
    if (argc != 2)
    {
        cout << "Invalid number of files." << endl;
        return 1;
    }
    ifstream f(argv[1], ios::in | ios::binary);
    if (!f)
    {
        cout << "Error: Cannot open file." << endl;
        return 1;
    }
    while (!f.eof())
    {
        f.get(buf,255);
        cout << buf << endl;
    }
    f.close();
    return 0;
}
However, when I ran this code in Visual Studio, the Debug Console was completely blank. What's wrong with my code?
Apart from the errors mentioned in the comments, the program has a logical error because istream& istream::get(char* s, streamsize n) does not do what you (or I, until I debugged it) thought it does. Yes, it reads to the next newline; but it leaves the newline in the input!
The next time you call get(), it will see the newline immediately and return with an empty line in the buffer, for ever and ever.
The best way to fix this is to use the appropriate function, namely istream::getline() which extracts, but does not store the newline.
The EOF issue
The EOF issue is worth mentioning. The canonical way to read lines (if you want to write to a character buffer) is
while (f.getline(buf, bufSz))
{
cout << buf << "\n";
}
getline() returns a reference to the stream which in turn has a conversion function to bool, which makes it usable in a boolean expression like this. The conversion is true if input could be obtained. Interestingly, it may have encountered the end of file, and f.eof() would be true; but that alone does not make the stream convert to false. As long as it could extract at least one character it will convert to true, indicating that the last input operation made input available, and the loop will work as expected.
The next read after encountering EOF would then fail because no data could be extracted: after all, the read position is still at EOF. That is considered a read failure. The stream then converts to false and the loop exits, which was exactly the intent.
The buffer size issue
The buffer size issue is worth mentioning as well. The standard draft says in 30.7.4.3:
Characters are extracted and stored until one of the following occurs:
end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit));
traits::eq(c, delim) for the next available input character c (in which case the input character is extracted but not stored);
n is less than one or n - 1 characters are stored (in which case the function calls setstate(failbit)).
The conditions are tested in that order, which means that if n-1 characters have been stored and the next character is a newline (the default delimiter), the input was successful (and the newline is extracted as well).
This means that if your file contains a single line 123 you can read that successfully with f.getline(buf, 4), but not a line 1234 (both may or may not be followed by a newline).
The line ending issue
Another complication here is that on Windows a file created with a typical editor will have a hidden carriage return before the newline, i.e. a line actually looks like "123\r\n" ("\r" and "\n" each being a single character with the values 13 and 10, respectively). Because you opened the file with the binary flag the program will see the carriage return; all lines will contain that "invisible" character, and the number of visible characters fitting in the buffer will be one shorter than one would assume.
The console issue ;-)
Oh, and your Console was not entirely empty; it's just that modern computers are too fast and the first line which was probably printed (it was in my case) scrolled away faster than anybody could switch windows. When I looked closely there was a cursor in the bottom left corner where the program was busy printing line after line of nothing ;-).
The conclusion
Debug your programs. It's very easy with VS.
Use getline(istream, string).
Use the return value of input functions (typically the stream) as a boolean in a while loop: "As long as you can extract any input, use that input."
Beware of line ending issues.
Consider C I/O (printf, scanf) for anything non-trivial (I didn't discuss this in my answer but I think that's what many people do).

Clear CSV-file from non-specified symbols using C++ [duplicate]

This question already has answers here:
Why can Windows not read beyond the 0x1A (EOF) character but Unix can? [duplicate]
(2 answers)
Closed 3 years ago.
I'm trying to convert CSV-file to TXT-file using simple C++-code like this:
std::ofstream txtFile(strFileName, std::ofstream::out | std::ofstream::app);
std::string strLine;
std::ifstream csvFile(strCSVDir);
while (std::getline(csvFile, strLine))
{
std::string subString;
std::stringstream s(strLine);
while (std::getline(s, subString, ';'))
{
txtFile << subString << "\t";
}
txtFile << "\n";
}
txtFile.close();
csvFile.close();
It works fine, but only if the CSV file doesn't contain any unexpected symbols, like the arrow in this picture:
In that case my code reads the CSV file only up to the arrow symbol. How can I get around this?
Update: if I look at the CSV file in a byte view (for example in Far's hex view), I see that the code of the arrow symbol is "1A". According to the table of Unicode characters, that is the Substitute character. I don't know how it got into this CSV file.
It might be easier to just read the entire file - then replacing and finally saving.
Going from your snippet:
std::stringstream sstr;
sstr << csvFile.rdbuf();
std::string buffer = sstr.str();
boost::replace_all(buffer, ";", "");
txtFile << buffer;
Update: if you don't have Boost, it should be easy to replace with something else, like a for loop (since it is just a single-character replacement).
Update 2: The read probably stops before the end of the file because the file is opened in text mode and contains a byte that is treated as a terminating character in that mode; see https://en.cppreference.com/w/cpp/io/c#Binary_and_text_modes for an explanation.

Why can't I read a file into an integer vector?

I have a text file that contains some integer values along with non-integers such as character strings and white space. I want to read only the integer values, so I used a vector of integers. Opening the file succeeds, but the first input seems to fail, which ends the loop immediately.
Here is my main example:
ifstream in("file.txt");
if(in.fail())
cout << "opening failed!" << endl;
//opening is fine!
int value;
vector<int> v;
while(in >> value) // the problem here; it fails why?
{
cout << "ok"; // not printed
v.push_back(value);
}
cout << v.size() << endl; // 0??!!
this is the content of file.txt:
32 43 24 32
15 23
57
77 81
If I make a vector of chars it works, but I want to use a vector of integers.
I have used code like this before and it worked fine, but now I don't know what happened. It's really annoying.
Any help, comment, or tip is welcome and appreciated.
This line:
while(in >> value)
says while I can read integers...
But in the post this may not be true - you are not handling this case.
Either read stuff that is not integers and handle it. Or just read strings and then decide what to do.
In addition
cout << "ok"; // not printed
is because it is buffered.
Do this
cout << "ok" << flush; // printed
First, excuse me for bothering you with a nonsense question. I finally managed to find the error:
In the main folder of my project I had unintentionally created a WinRAR file, input.rar. Instead of removing it, I renamed it to input.txt, opened it manually, and removed some unreadable characters. Then I put the integers shown above into it. My C++ application succeeded in opening it but couldn't read it.
I have now removed that input.txt (the former input.rar) and created a new text document named input.txt, and everything works.
Thank you for your help; this post may help someone else.
Don't create a RAR file or a file in some other format, rename it to a text file, and then try to read it via C++ fstream: it will fail, and the error looks impossible to track down.

Formatting Output c++

I want to do some fancy formatting. I have several lines that interact with each other. Take the first two lines and print the character in the second line as many times as the integer in the first line, separated by an asterisk character, with no asterisk after the final character. Then move on to the next integer and character and print them on a separate line. Do this for the whole list. The problem I am having is printing them on separate lines. Example:
5
!
2
?
3
#
Desired output:
!*!*!*!*!
?*?
#*#*#
My output:
!*!*!*!*!*?*?*#*#*#*
Below is my code. Another thing to mention is that I am reading the data about the characters and numbers from a separate text file. So I am using the getline function.
Here is a chunk of the code:
ifstream File;
File.open("NumbersAndCharacters.txt");
string Number;
string Character;
while(!File.eof()){
getline(File, Number);
getline(File, Character);
//a few lines of stringstream action
for (int i=0; i<=Number; i++){
cout<<Character<<"*";}//end for. I think this is where
//the problem is.
}//end while
File.close();
return 0;
Where is the error? Is it the loop? Or do I not understand getline?
It should be printing an "endl" or "\n" after each multiplication of the character is done.
Thanks to everyone for the responses!
You have not shown your code yet, but what seems to be the issue here is that you simply forgot to add a new line every time you print your characters. For example, you probably have done:
std::cout << "!";
Well, in this context you forgot to add the new line ('\n'), so you have two options here: first insert the new line yourself:
std::cout << "!\n";
Or use std::endl:
std::cout << "!" << std::endl;
For comparison of the two, see here and here. Without further description, or more importantly your code that doesn't seem to work properly, we can't make suggestions or solve your problem.

How to ignore a character through strtok?

In the code below I would also like to ignore the character ”. But after adding it I still get “Mr_Bishop” as my output.
I have the following code:
ifstream getfile;
getfile.open(file,ios::in);
char data[256];
char *line;
//loop till end of file
while(!getfile.eof())
{
//get data and store to variable data
getfile.getline(data,256,'\n');
line = strtok(data," ”");
while(line != NULL)
{
cout << line << endl;
line = strtok(NULL," ");
}
}//end of while loop
my file content :
hello 7 “Mr_Bishop”
hello 10 “0913823”
Basically all i want my output to be :
hello
7
Mr_Bishop
hello
10
0913823
With this code i only get :
hello
7
"Mr_Bishop"
hello
10
"0913823"
Thanks in advance! :)
I realise I made an error in the inner loop by leaving out the quote. But now I get the following output:
hello
7
Mr_Bishop
�
hello
10
0913823
�
any help? thanks! :)
It looks like you used WordPad or something similar to generate the file. You should use Notepad or Notepad++ on Windows, or a similar editor on Linux, that saves the file in ASCII encoding. Right now you are using what looks like UTF-8 encoding.
In addition the proper escape sequence for " is \". For instance
line = strtok(data," \"");
Once you fix your file to be in ASCII encoding, you'll find you missed something in your loop.
while(!getfile.eof())
{
//get data and store to variable data
getfile.getline(data,256,'\n');
line = strtok(data," \"");
while(line != NULL)
{
std::cout << line << std::endl;
line = strtok(NULL," \""); // THIS used to be strtok(NULL," ");
}
}//end of while loop
You missed a set of quotes there.
Correcting the file and this mistake yields the proper output.
Have a very careful look at your code:
line = strtok(data," ”");
Notice how the quotes lean at different angles (well, mine do; hopefully your font shows the same thing). You have included only the closing double quote in your strtok() call. However, your data file has:
hello 7 “Mr_Bishop”
which has two different kinds of quotes. Make sure you're using all the right characters, whatever "right" is for your data.
UPDATE: Your data is probably UTF-8 encoded (that's how you got those leaning double quotes in there) and you're using strtok() which is completely unaware of UTF-8 encoding. So it's probably doing the wrong thing, splitting up the multibyte UTF-8 characters, and leaving you with rubbish at the end of the line.