Currently my program takes a string as input, which I access using argc and argv.
Then I use
FILE *fp, *input = stdin;
char mystring[100];
fp = fopen("input.xml", "w+b");
while (fgets(mystring, 100, input) != NULL)
{
    fputs(mystring, fp);
}
fclose(fp);
I did this part only to create a file input.xml which I then supply to
ifstream in("input.xml");
string s((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>());
to get s as a string (basic_string).
Is there a way to feed my input directly to ifstream? (i.e. feeding a string to an ifstream).
Let me get this straight:
You read a string from standard input
You write it to a file
You then read it from the file
And use the file stream object to create a string
That's crazy talk!
Drop the file streams and just instantiate the string from STDIN directly:
string s(
(std::istreambuf_iterator<char>(std::cin)),
std::istreambuf_iterator<char>()
);
Remember, std::cin is a std::istream, and the IOStreams part of the standard library is designed for this sort of generic access to data.
Be aware, though, that your approach with std::istreambuf_iterator is not the most efficient, which may be a concern if you're reading a lot of data.
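One commonly suggested faster alternative (a sketch, not the only way; the helper name is just illustrative) is to pull the whole stream through its stream buffer in a single call:

```cpp
#include <sstream>
#include <string>

// Read everything remaining on an input stream into a string via its
// stream buffer; usually faster than the istreambuf_iterator pair,
// because the copy happens in bulk inside the library.
std::string slurp(std::istream& in) {
    std::ostringstream oss;
    oss << in.rdbuf();
    return oss.str();
}
```

You would then call it as `std::string s = slurp(std::cin);`.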
If I get it right then you want to read all the text provided through standard input into a string.
An easy way to achieve this is std::getline, which reads all the bytes up to a specific delimiter into a string. If you can assume the input is text content (which does not contain any \0 character), you could write the following:
std::string s;
std::getline(std::cin, s, '\0');
Related
I am working with a custom file type that behaves similarly to a zip file and contains files within it. I'm trying to read a text file inside this custom file type, but when I open and parse the text file it returns information I can't use. Below is how I'm reading it currently:
std::ifstream file("C:\\Ex\\ample\\file.cust\\signature.txt\\");
// Below is a while loop extracting the items
// In this example it should extract 9 items
// Currently it is unable to open properly, it is behaving similar to a zip file
std::vector <std::string> names;
while (file)
{
std::string s;
if (!getline(file, s)) break;
std::istringstream ss(s);
std::vector<std::string> record;
while (ss)
{
std::string s;
if (!getline(ss, s, ',')) break;
names.push_back(s);
}
}
// text output:
// s = "ƒÃ\x10‰\x1f[ë\vÿpô‹ÏPèÕ\näÿ‹Ç_^]Â\x4"
If you have come up with this new format yourself, make sure you know how the file was written, i.e., what mode was used to write it:
Constant | Explanation
---------|------------------------------------------------
app      | seek to the end of stream before each write
binary   | open in binary mode
in       | open for reading
out      | open for writing
trunc    | discard the contents of the stream when opening
ate      | seek to the end of stream immediately after open
It may help to look at the ifstream class definition and the examples here.
The file must be read with the same mode with which it was written. As @Someprogrammerdude pointed out in a comment, you seem to be reading binary information, so perhaps something like .open(filename, std::ios::binary) could help.
If this doesn't help, you should provide more details on the exact specification of the file format and perhaps some code (if you can share, obviously).
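As a sketch of the binary-mode suggestion above (the function name and the empty-vector error handling are just illustrative, not part of the question's code):

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read a file's raw bytes with no newline translation,
// so the buffer matches the on-disk contents exactly.
// Returns an empty vector if the file cannot be opened.
std::vector<char> read_raw(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```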
Consider I created a file using this way:
std::ofstream osf("MyTextFile.txt");
std::string buffer = "spam\neggs\n";
osf.write(buffer.c_str(), buffer.length());
osf.close();
When I was trying to read that file using the following way, I realized that more characters than present was read.
std::ifstream is("MyTextFile.txt");
is.seekg (0, is.end);
int length = is.tellg();
is.seekg (0, is.beg);
char * buffer = new char [length];
is.read (buffer,length);
//work with buffer
delete[] buffer;
For example, if the file contains spam\neggs\n, then this procedure reads 12 characters instead of 10. The first 10 chars are spam\neggs\n as expected but there are 2 more, which have the integer value 65533.
Moreover, this problem happens only when \n is present in the file. For example, there is no problem if file contains spam\teggs\t instead.
The question is;
Am I doing anything wrong? Or doesn't this procedure work as it should do?
Bonus Q: Can you suggest an alternative for reading the whole file at once?
Note: I found this way here.
The problem is that you wrote the string
"spam\neggs\n"
initially to an ofstream without setting the std::ios::binary flag at open (or in the constructor). This causes the runtime to translate to the "native text format", i.e., to convert each \n to \r\n on output (as you are on Windows). So, after being written, the contents of your file were actually:
"spam\r\neggs\r\n"
(i. e., 12 chars). That was returned by
int length = is.tellg();
But, when you tried to read 12 chars you got
"spam\neggs\n"
back, because the runtime converted each \r\n back to \n.
As a final piece of advice, please don't use new char[length]... use std::string (resized to the right length) so you won't leak memory, etc. And if your file can be very big, maybe it's not a good idea to slurp the whole file into memory at once, either.
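A minimal way to avoid the size mismatch described above is to open the stream in binary mode, so tellg() counts exactly the bytes that read() will deliver (a sketch, with an illustrative helper name):

```cpp
#include <fstream>
#include <string>

// Get the exact on-disk byte count of a file by opening it in binary
// mode with the position at the end, so no \r\n -> \n translation
// skews the number that tellg() reports.
std::streamsize file_size(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return -1;
    return static_cast<std::streamsize>(in.tellg());
}
```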
Just an idea, since the number 2 corresponds to the count of \ns: Are you doing this on Windows? It might have something to do with the file actually containing \r\n. What happens if you open the file in binary mode (std::ios::binary)?
Can you suggest an alternative for reading the whole file at once?
Yes:
std::ifstream is("MyTextFile.txt");
std::string str( std::istreambuf_iterator<char>{is}, {} ); // requires <iterator>
str now contains the file. Does this solve your problem?
Here is the code I'm having a trouble with, I have a .txt file that contains a list of users and their passwords using this format: user;password.
I need to search for a user in the file and then delete the line which contains this user.
void deleteauser()
{
string user;
cout<<"Which user do you wish to delete?";
cin>>user;
string line;
string delimiter=";";
string token,token1;
ifstream infile;
infile.open("users.txt",ios::in);
while (getline(infile,line,'\n'))
{
token = line.substr(0, line.find(delimiter));
token1=line.substr(token.length(), line.find('\n'));
if(token==user)
{
//here i need to delete the line of the user that has been found
}
}
infile.close();
}
Read the input file, line by line, writing to a temporary file. When you find lines you don't want then just don't write them to the temporary file. When done rename the temporary file as the real file.
To edit a file you have 2 options:
Read in every line and write out those you want to keep
Seek to the part of the file you want deleted and replace the text with spaces (or similar)
You have the first half pretty much done - just write out what you read to a temporary file and delete/rename to make it the original.
For the second option, you can write to the input file at that point if you use an iofstream (be aware of buffering issues). The better option is to use seekp or seekg to get to the right point before overwriting the file.
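A minimal sketch of the first option, rewrite-and-rename (the file names and the user;password format come from the question; the delete_user name and the bool return are just illustrative):

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Copy users.txt to a temporary file, skipping the line whose user
// field (the text before ';') matches `user`, then swap the files.
bool delete_user(const std::string& user) {
    std::ifstream in("users.txt");
    std::ofstream out("users.tmp");
    if (!in || !out) return false;
    std::string line;
    while (std::getline(in, line)) {
        if (line.substr(0, line.find(';')) != user)
            out << line << '\n';   // keep every other line
    }
    in.close();
    out.close();
    std::remove("users.txt");
    return std::rename("users.tmp", "users.txt") == 0;
}
```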
I'm doing some file I/O and created the test below, but I thought testoutput2.txt would be the same as testinputdata.txt after running it?
testinputdata.txt:
some plain
text
data with
a number
42.0
testoutput2.txt (in some editors it's on separate lines, but in others it's all on one line)
some plain
ऀ琀攀砀琀ഀഀ
data with
愀 渀甀洀戀攀爀ഀഀ
42.0
int main()
{
//Read plain text data
std::ifstream filein("testinputdata.txt");
filein.seekg(0,std::ios::end);
std::streampos length = filein.tellg();
filein.seekg(0,std::ios::beg);
std::vector<char> datain(length);
filein.read(&datain[0], length);
filein.close();
//Write data
std::ofstream fileoutBinary("testoutput.dat");
fileoutBinary.write(&datain[0], datain.size());
fileoutBinary.close();
//Read file
std::ifstream filein2("testoutput.dat");
std::vector<char> datain2;
filein2.seekg(0,std::ios::end);
length = filein2.tellg();
filein2.seekg(0,std::ios::beg);
datain2.resize(length);
filein2.read(&datain2[0], datain2.size());
filein2.close();
//Write data
std::ofstream fileout("testoutput2.txt");
fileout.write(&datain2[0], datain2.size());
fileout.close();
}
It's working fine on my side. I have run your program on VC++ 6.0 and checked the output in Notepad and MS Word. Can you specify the name of the editor where you are facing the problem?
You can't read Unicode text into a std::vector<char>. The char data type only works with narrow strings, and my guess is that the text file you're reading in (testinputdata.txt) is saved with either UTF-8 or UTF-16 encoding.
Try using the wchar_t type for your characters, instead. It is specifically designed to work with "wide" (or Unicode) characters.
Thou shalt verify thy input was successful! Although this would sort you out, you should also note that the number of bytes in the file has no direct relationship to the number of characters being read: there can be fewer characters than bytes (think of a Unicode character encoded as multiple bytes in UTF-8) or vice versa (although the latter doesn't happen with any of the Unicode encodings). All you are experiencing is that read() couldn't read as many characters as you asked it to, but write() happily wrote the junk you gave it.
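To make the "verify thy input" advice concrete, here is a sketch that trims the buffer to what read() actually delivered, using gcount() (the helper name is just illustrative):

```cpp
#include <fstream>
#include <string>
#include <vector>

// Read up to `length` bytes and keep only what was actually read.
// On Windows text-mode streams this can be fewer bytes than tellg()
// reported, because of \r\n -> \n translation on input.
std::vector<char> read_checked(const std::string& path, std::streamsize length) {
    std::ifstream in(path);                              // text mode on purpose
    std::vector<char> buf(static_cast<std::size_t>(length));
    in.read(buf.data(), length);
    buf.resize(static_cast<std::size_t>(in.gcount()));   // trim to bytes read
    return buf;
}
```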
I have a text file (~10GB) with the following format:
data1<TAB>data2<TAB>data3<TAB>data4<NEWLINE>
I want to scan through it and do processing only on data2. What is the best (fastest) way to extract data2 in C++?
EDIT: Added NEWLINE
Read the file line by line. For each line, split on the tab. That will leave you with an array containing the fields, allowing you to work with the second field (data2).
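A minimal sketch of that split, pulling out just the second tab-separated field (the helper name is illustrative):

```cpp
#include <cstddef>
#include <string>

// Return the second tab-separated field of a line, or an empty
// string if the line has fewer than two fields.
std::string second_field(const std::string& line) {
    std::size_t first = line.find('\t');
    if (first == std::string::npos) return "";
    std::size_t second = line.find('\t', first + 1);
    return line.substr(first + 1,
                       second == std::string::npos ? std::string::npos
                                                   : second - first - 1);
}
```

Searching for the two tab positions directly avoids building a whole array of fields when only data2 is needed, which matters on a 10GB file.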
This sounds like a job for a higher level tool like shell utilities:
cut -f2 # from stdin
cut -f2 <my_file # from file
But nonetheless, you can do that with C++ as well:
void parse(std::istream& in)
{
    std::string word;
    while( in >> word ) {   // throwaway 1
        in >> word;         // data2
        process(word);
        in >> word >> word; // throwaway 3 and 4
    }
}
// ...
parse(std::cin);
std::ifstream file("my_file");
parse(file);
Read the file a line at a time. It's pretty straightforward to parse out the tabs from there. You could use something like strtok() or a similar routine.
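A sketch of the strtok() suggestion (note that strtok modifies its input and collapses adjacent delimiters, so this works on a mutable copy of the line; the helper name is illustrative):

```cpp
#include <cstring>
#include <string>
#include <vector>

// Split a line on tabs using strtok, returning all fields.
// strtok writes '\0' into the buffer it scans, hence the local copy.
std::vector<std::string> split_tabs(const std::string& line) {
    std::vector<std::string> fields;
    std::vector<char> buf(line.begin(), line.end());
    buf.push_back('\0');                       // strtok needs a C string
    for (char* tok = std::strtok(buf.data(), "\t"); tok != nullptr;
         tok = std::strtok(nullptr, "\t"))
        fields.push_back(tok);
    return fields;
}
```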
Well, open a file stream (which should be able to handle 10gig files) and then just jump to after the first tab, which is a '\t', read your data and then skip to the next newline and repeat.
#include <fstream>
#include <limits>
#include <string>

int main(){
    std::ifstream fin("your_file.txt");
    while(fin){
        std::string data2;
        // skip to first tab
        fin.ignore(std::numeric_limits<std::streamsize>::max(), '\t');
        fin >> data2;
        // do stuff with data2

        // skip to next line
        fin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
}
Since the file is of a considerable size, you might consider using a technique that lets you overlap your I/O with your processing. In response to a comment, you mentioned you are working on Linux. Provided you are using kernel 2.6 or later, you might consider Linux asynchronous I/O (AIO). Specifically, you would use aio_read to queue up some read requests, then use aio_suspend to wait for one (or more) of the requests to complete. As requests complete, you would scan through the buffers using a plain char* to locate the data you are interested in. For each piece of data you find, you could then create a std::string (although avoiding the copy may be beneficial) and process it. Once you have scanned a block, you would requeue it to read another block from the file. You continue doing this until you have processed every block in the file.
The code for this method will be more complex than reading the file line by line, but it may be considerably faster.
You could use iostream as others have suggested. Another way to go would be to simply use fscanf. For example:
#include <stdio.h>
...
FILE* fp = fopen(path_to_file, "r");
char data[256];
while(fscanf(fp, "%*s%255s%*s%*s", data) == 1)
{
    /* do what you want with your data */
}
fclose(fp);
Note that %s already skips leading whitespace (including tabs), so there is no need to spell the tabs out in the format string, and the 255 width guards against overflowing the buffer.