Line by line reading in C and C++? - c++

I want to read a file line by line in C or C++, and I know how to do that when I assume some fixed line size, but is there a simple way to calculate or get the exact size needed for a line, or for all lines in the file? (Reading word by word until a newline is also fine for me if anyone can do it that way.)

If you use a streamed reader, all this will be hidden from you. See getline. The example below is based on the code here.
// getline with strings
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main () {
string str;
ifstream ifs("data.txt");
getline (ifs,str);
cout << "first line of the file is " << str << ".\n";
}

In C, if you have POSIX 2008 libraries (more recent versions of Linux, for example), you can use the POSIX getline() function. If you don't have the function in your libraries, you can implement it easily enough, which is probably better than inventing your own interface to do the job.
In C++, you can use std::getline().
Even though the two functions have the same basic name, the calling conventions and semantics are quite different (because the languages C and C++ are quite different) - except that they both read a line of data from a file stream, of course.
There isn't an easy way to tell how big the longest line in a file is - except by reading the whole file to find out, which is kind of wasteful.
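For reference, here is a minimal sketch of the POSIX getline() interface (my sketch, not from the thread; "data.txt" is a placeholder name). The function allocates and grows the line buffer itself, so no maximum line size has to be guessed:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    FILE *f = fopen("data.txt", "r");
    if (f == NULL) { perror("fopen"); return 1; }
    char *line = NULL;  /* getline allocates and resizes this buffer */
    size_t cap = 0;     /* current capacity, maintained by getline */
    ssize_t n;          /* characters read, including the '\n' */
    while ((n = getline(&line, &cap, f)) != -1)
        printf("%zd chars: %s", n, line);
    free(line);         /* one free at the end, whatever the line lengths were */
    fclose(f);
    return 0;
}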

I would use an ifstream and getline to read from a file.
http://www.cplusplus.com/doc/tutorial/files/
int main () {
string line;
ifstream myfile ("example.txt");
if (myfile.is_open())
{
while ( getline (myfile,line) ) // test the read itself; looping on good() processes the last line twice
{
cout << line << endl;
}
myfile.close();
}
else cout << "Unable to open file";
return 0;
}

You can't get the length of line until after you read it in. You can, however, read into a buffer repeatedly until you reach the end of line.
For programming in C, try using fgets to read in a line of text. It reads at most n-1 characters, stopping early if it encounters a newline. You can keep reading into a small buffer of size n until the last character in the string is the newline.
Here is an example of how to read and display a full line of a file using a small buffer:
#include <stdio.h>
#include <string.h>
int main()
{
FILE * pFile;
const int n = 5;
char mystring [n];
int lineLength = 0;
pFile = fopen ("myfile.txt" , "r");
if (pFile == NULL)
{
perror ("Error opening file");
}
else
{
/* Read chunks of at most n-1 chars until a chunk ends in '\n' (or EOF). */
while (fgets (mystring , n , pFile) != NULL)
{
fputs (mystring, stdout); /* puts() would add an extra newline after each chunk */
lineLength += strlen(mystring);
if (mystring[strlen (mystring) - 1] == '\n')
break;
}
fclose (pFile);
}
printf("Line Length: %d\n", lineLength);
return 0;
}

In C++ you can use the std::getline function, which takes a stream and reads up to the first '\n' character. In C, I would just use fgets and keep reallocating a buffer until the last character is '\n'; then we know we have read the entire line.
C++:
std::ifstream file("myfile.txt");
std::string line;
std::getline(file, line);
std::cout << line;
C:
// I didn't test this code, I just made it off the top of my head.
// (needs <stdio.h>, <stdlib.h> and <string.h>)
FILE* file = fopen("myfile.txt", "r");
size_t cap = 256;
size_t len = 0;
char* line = malloc(cap);
line[0] = '\0'; // keep strlen well defined even if the first fgets fails
for (;;) {
if (fgets(&line[len], cap - len, file) == NULL)
break; // EOF or read error
len = strlen(line);
if (len > 0 && line[len-1] != '\n' && !feof(file)) {
cap <<= 1; // line longer than the buffer: double it and keep reading
line = realloc(line, cap);
} else {
break;
}
}
printf("%s", line);
free(line);

getline is POSIX-only; here is an ANSI C version (no max-line-size info needed!):
/* needs <stdio.h>, <string.h> and <stdlib.h>; *r must point to a malloc'd buffer,
which is grown with realloc as needed. Named getline_ansi here because
POSIX <stdio.h> already declares a getline with a different signature. */
const char* getline_ansi(FILE *f,char **r)
{
char t[100];
if( feof(f) )
return 0;
**r=0;
while( fgets(t,100,f) )
{
char *p=strchr(t,'\n');
if( p )
{
*p=0;
if( (p=strchr(t,'\r')) ) *p=0;
*r=realloc(*r,strlen(*r)+1+strlen(t));
strcat(*r,t);
return *r;
}
else
{
if( (p=strchr(t,'\r')) ) *p=0;
*r=realloc(*r,strlen(*r)+1+strlen(t));
strcat(*r,t);
}
}
return feof(f)?(**r?*r:0):*r;
}
and now it's easy and short in your main:
char *line,*buffer = malloc(100);
FILE *f=fopen("yourfile.txt","rb");
if( !f ) return 1;
setvbuf(f,0,_IOLBF,4096);
while( (line=getline_ansi(f,&buffer)) )
puts(line);
fclose(f);
free(buffer);
It works on Windows with both Windows and Unix text files, and on Unix with both Unix and Windows text files (the '\r' stripping handles CRLF line endings either way).

Here is a C++ way of reading the lines, using std algorithms and iterators:
#include <iostream>
#include <iterator>
#include <vector>
#include <algorithm>
#include <string>
struct getline :
public std::iterator<std::input_iterator_tag, std::string>
{
std::istream* in;
std::string line;
getline(std::istream& in) : in(&in) {
++*this;
}
getline() : in(0) {
}
getline& operator++() {
if(in && !std::getline(*in, line)) in = 0;
return *this; // required: the function is declared to return getline&
}
std::string operator*() const {
return line;
}
bool operator!=(const getline& rhs) const {
return !in != !rhs.in;
}
};
int main() {
std::vector<std::string> v;
std::copy(getline(std::cin), getline(), std::back_inserter(v));
}

Related

Input from text file into char *array[9]

I have a file with 9 words and I have to store each word into the char array of 9 pointers, but I keep getting an error message. I cannot use vectors!
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
char *words[9];
ifstream inStream;
inStream.open("sentence.txt");
if (inStream.fail())
{
cout << "Input file opening failed.\n";
exit(1);
}
for ( int i = 0; i < 10; i++)
{
inStream >> words[i];
}
inStream.close();
return 0;
}
The declaration
char *words[9];
declares a raw array of pointers. This array is not initialized so the pointers have indeterminate values. Using any of them would be Undefined Behavior.
Instead you want
vector<string> words;
where vector is std::vector from the <vector> header, and string is std::string from the <string> header.
Use the push_back member function to add strings to the end of the vector.
Also, keep the close call out of the reading loop; otherwise the file would be closed in the first iteration.
This approach gives the code (off the cuff, disclaimer...)
#include <fstream>
#include <iostream>
#include <vector>
#include <string>
using namespace std;
int main()
{
vector<string> words;
ifstream inStream;
inStream.open("sentence.txt");
for ( int i = 0; i < 10; i++)
{
string word;
if( inStream >> word )
words.push_back( word );
}
inStream.close();
}
If you can't use std::string and std::vector then you need to initialize the array of pointers, and make sure that you don't read more into the buffers than there's room for.
The main problem here is that >> is unsafe for reading into a raw array given by a pointer. It doesn't know how large that array is. It can easily lead to a buffer overrun, with dire consequences.
And so this gets a bit complicated, but it can look like this:
#include <assert.h> // assert
#include <ctype.h> // isspace
#include <fstream>
#include <iostream>
#include <locale.h> // setlocale, LC_ALL
#include <stdlib.h> // EXIT_FAILURE
using namespace std;
void fail( char const* const message )
{
cerr << "! " << message << "\n";
exit( EXIT_FAILURE );
}
void readWordFrom( istream& stream, char* const p_buffer, int const buffer_size )
{
int charCode;
// Skip whitespace:
while( (charCode = stream.get()) != EOF and isspace( charCode ) ) {}
int n_read = 0;
char* p = p_buffer;
while( n_read < buffer_size - 1 and charCode != EOF and not isspace( charCode ) )
{
*p = charCode; ++p;
++n_read;
charCode = stream.get();
}
*p = '\0'; // Terminating null-byte.
if( charCode != EOF )
{
stream.putback( charCode );
if( not isspace( charCode ) )
{
assert( n_read == buffer_size - 1 ); // We exceeded buffer size.
stream.setstate( ios::failbit );
}
}
}
int main()
{
static int const n_words = 9;
static int const max_word_length = 80;
static int const buffer_size = max_word_length + 1; // For end byte.
char *words[n_words];
for( auto& p_word : words ) { p_word = new char[buffer_size]; }
ifstream inStream{ "sentence.txt" };
if( inStream.fail() ) { fail( "Input file opening failed." ); }
setlocale( LC_ALL, "" ); // Pedantically necessary for `isspace`.
for( auto const p_word : words )
{
readWordFrom( inStream, p_word, buffer_size );
if( inStream.fail() ) { fail( "Reading a word failed." ); }
}
for( auto const p_word : words ) { cout << p_word << "\n"; }
for( auto const p_word : words ) { delete[] p_word; }
}
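As a footnote (my addition, not part of the original answer): the standard library already has a bounded variant of >> for character arrays. Extracting with std::setw first limits the number of characters stored to the array size minus one, which avoids the overrun without a hand-written reader. A sketch under the same assumptions (sentence.txt, nine words of at most 80 characters):
#include <fstream>
#include <iomanip>   // std::setw
#include <iostream>
int main()
{
    static int const n_words = 9;
    static int const buffer_size = 81;   // 80 chars + terminating '\0'
    char words[n_words][buffer_size];
    std::ifstream inStream{ "sentence.txt" };
    if( inStream.fail() ) { std::cerr << "Input file opening failed.\n"; return 1; }
    for( auto& word : words )
    {
        // setw bounds the extraction: at most buffer_size - 1 chars are stored
        if( !(inStream >> std::setw( buffer_size ) >> word) )
        {
            std::cerr << "Reading a word failed.\n"; return 1;
        }
    }
    for( auto const& word : words ) { std::cout << word << "\n"; }
}
One caveat: with setw an over-long word is silently truncated rather than flagged, which is the case the readWordFrom version above detects explicitly.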
You never allocate any memory for your char* pointers kept in the array.
The idiomatic way to write this in C++ would be:
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
int main() {
std::vector<std::string> words(9);
std::ifstream inStream;
inStream.open("sentence.txt");
for ( int i = 0; inStream && i < 9; i++) {
inStream >> words[i];
}
}
The inStream.close() isn't necessary (and would be wrong inside the loop). The std::ifstream is closed automatically as soon as the variable goes out of scope.
There are a few problems with your code.
char *words[9];
This allocates space for 9 pointers, not nine strings. Since you don't know how big the strings are you have two choices. You can either "guess" how much you'll need and limit the inputs accordingly, or you can use dynamic memory allocation (malloc or new) to create the space you need to store the strings. Dynamic memory would be my choice.
for ( int i = 0; i < 10; i++)
This loop will execute on words[0] through words[9]. However, there is no words[9] (that would be the tenth word), so you'll overwrite memory that you have not allocated.
inStream >> words[i];
This will send your input stream to memory that you don't "own". You need to allocate space for the words to live before capturing them from the input stream. To do this correctly, you'll need to know how much space each word will need so you can allocate it.
You could try something like this:
// (needs <iostream>, <fstream>, <cstring>, <cstdlib>; using namespace std)
int main()
{
char *words[9];
char tempInput[256]; // space to capture the input, up to a maximum size of 256 chars
ifstream inStream;
inStream.open("sentence.txt");
if (inStream.fail())
{
cout << "Input file opening failed.\n";
exit(1);
}
for ( int i = 0; i < 9; i++)
{
//Clear the input buffer
memset(tempInput, 0, 256);
//Capture the next word
inStream >> tempInput;
//allocate space to save the word (+1 for the terminating '\0')
words[i] = new char[strlen(tempInput) + 1];
//Copy the word to its final location
strcpy(words[i], tempInput);
}
inStream.close();
return 0;
}

c++ - std::getline reads non existing characters [duplicate]

When I read from a file string by string, the >> operation gets the first string, but it starts with "i". Assume the first string is "street"; then it is read as "istreet".
The other strings are okay. I tried different txt files. The result is the same: the first string starts with "i". What is the problem?
Here is my code :
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
int cube(int x){ return (x*x*x);}
int main(){
int maxChar;
int lineLength=0;
int cost=0;
cout<<"Enter the max char per line... : ";
cin>>maxChar;
cout<<endl<<"Max char per line is : "<<maxChar<<endl;
fstream inFile("bla.txt",ios::in);
if (!inFile) {
cerr << "Unable to open file datafile.txt";
exit(1); // call system to stop
}
while(!inFile.eof()) {
string word;
inFile >> word;
cout<<word<<endl;
cout<<word.length()<<endl;
if(word.length()+lineLength<=maxChar){
lineLength +=(word.length()+1);
}
else {
cost+=cube(maxChar-(lineLength-1));
lineLength=(word.length()+1);
}
}
}
You're seeing a UTF-8 Byte Order Mark (BOM). It was added by the application that created the file.
To detect and ignore the marker you could try this (untested) function:
bool SkipBOM(std::istream & in)
{
char test[4] = {0};
in.read(test, 3); // note: this read fails (setting failbit) if the file has fewer than 3 bytes; see the variant below
if (strcmp(test, "\xEF\xBB\xBF") == 0) // strcmp needs <cstring>
return true;
in.seekg(0);
return false;
}
With reference to the excellent answer by Mark Ransom above, adding this code skips the BOM (Byte Order Mark) on an existing stream. Call it after opening a file.
// Skips the Byte Order Mark (BOM) that defines UTF-8 in some text files.
void SkipBOM(std::ifstream &in)
{
char test[3] = {0};
in.read(test, 3);
if ((unsigned char)test[0] == 0xEF &&
(unsigned char)test[1] == 0xBB &&
(unsigned char)test[2] == 0xBF)
{
return;
}
in.clear(); // a very short file makes the read fail; clear the state so seekg works
in.seekg(0);
}
To use:
ifstream in(path);
SkipBOM(in);
string line;
while (getline(in, line))
{
// Process lines of input here.
}
Here are another two ideas:
If you are the one who creates the files, save their length along with them; when reading, just cut the prefix with this simple calculation: trueFileLength - savedFileLength = numOfBytesToCut.
Or create your own prefix when saving the files; when reading, search for it and discard everything found before it.
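A minimal sketch of the first idea (my illustration; the sidecar file "data.len" and its layout are invented for the example): record the payload length when writing, then on reading skip trueFileLength - savedFileLength bytes:
#include <fstream>
#include <iostream>
#include <string>
int main()
{
    // "data.len" holds the length we recorded when writing "data.txt"
    std::ifstream lenFile("data.len");
    long long savedFileLength = 0;
    lenFile >> savedFileLength;

    std::ifstream in("data.txt", std::ios::binary);
    in.seekg(0, std::ios::end);
    long long trueFileLength = in.tellg();
    in.seekg(trueFileLength - savedFileLength, std::ios::beg); // cut the prefix

    std::string line;
    while (std::getline(in, line))
        std::cout << line << "\n";
}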

reading last n lines from file in c/c++

I have seen many posts but didn't find anything like what I want.
I am getting wrong output:
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ...... // maybe this is the EOF character
and going into an infinite loop.
My algorithm:
1. Go to the end of the file.
2. Decrease the position of the pointer by 1 and read character by character.
3. Exit if we find our 10 lines or reach the beginning of the file.
4. Now scan the full file till EOF and print the lines. // not implemented in code
code:
#include<iostream>
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#include<string.h>
using namespace std;
int main()
{
FILE *f1=fopen("input.txt","r");
FILE *f2=fopen("output.txt","w");
int i,j,pos;
int count=0;
char ch;
int begin=ftell(f1);
// GO TO END OF FILE
fseek(f1,0,SEEK_END);
int end = ftell(f1);
pos=ftell(f1);
while(count<10)
{
pos=ftell(f1);
// FILE IS LESS THAN 10 LINES
if(pos<begin)
break;
ch=fgetc(f1);
if(ch=='\n')
count++;
fputc(ch,f2);
fseek(f1,pos-1,end);
}
return 0;
}
UPD 1:
I changed the code; it has just one error now. If the input has lines like
3enil
2enil
1enil
it prints only these 10 lines:
line1
line2
line3ÿine1
line2
line3ÿine1
line2
line3ÿine1
line2
line3ÿine1
line2
PS:
1. I am working on Windows in Notepad++.
2. This is not homework.
3. I want to do it without using any extra memory or the STL.
4. I am practicing to improve my basic knowledge, so please don't post answers that just use existing utilities (like tail -5 etc.).
Please help me improve my code.
Comments are in the code:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *in, *out;
int count = 0;
long int pos;
char s[100];
in = fopen("input.txt", "r");
/* always check return of fopen */
if (in == NULL) {
perror("fopen");
exit(EXIT_FAILURE);
}
out = fopen("output.txt", "w");
if (out == NULL) {
perror("fopen");
exit(EXIT_FAILURE);
}
fseek(in, 0, SEEK_END);
pos = ftell(in);
/* Don't write each char on output.txt, just search for '\n' */
while (pos) {
fseek(in, --pos, SEEK_SET); /* seek from begin */
if (fgetc(in) == '\n') {
if (count++ == 10) break;
}
}
/* If we ran off the front of the file (no break above), the last fgetc
consumed the first byte, so rewind to include it in the output. */
if (count <= 10) rewind(in);
/* Write line by line, is faster than fputc for each char */
while (fgets(s, sizeof(s), in) != NULL) {
fprintf(out, "%s", s);
}
fclose(in);
fclose(out);
return 0;
}
There are a number of problems with your code. The most
important one is that you never check that any of the functions
succeeded. And saving the results of ftell in an int isn't
a very good idea either. Then there's the test pos < begin;
this can only occur if there was an error. And the fact that
you're putting the results of fgetc in a char (which results
in a loss of information). And the fact that the first read you
do is at the end of file, so will fail (and once a stream enters
an error state, it stays there). And the fact that you can't
reliably do arithmetic on the values returned by ftell (except
under Unix) if the file was opened in text mode.
Oh, and there is no "EOF character"; 'ÿ' is a perfectly valid
character (0xFF in Latin-1). Once you assign the return value
of fgetc to a char, you've lost any possibility to test for
end of file.
I might add that reading backwards one character at a time is
extremely inefficient. The usual solution would be to allocate
a sufficiently large buffer, then count the '\n' in it.
EDIT:
Just a quick bit of code to give the idea:
// needs <algorithm>, <fstream>, <string> and <vector>
std::string
getLastLines( std::string const& filename, int lineCount )
{
size_t const granularity = 100 * lineCount;
std::ifstream source( filename.c_str(), std::ios_base::binary );
source.seekg( 0, std::ios_base::end );
size_t size = static_cast<size_t>( source.tellg() );
std::vector<char> buffer;
int newlineCount = 0;
while ( source
&& buffer.size() != size
&& newlineCount < lineCount ) {
buffer.resize( std::min( buffer.size() + granularity, size ) );
source.seekg( -static_cast<std::streamoff>( buffer.size() ),
std::ios_base::end );
source.read( buffer.data(), buffer.size() );
newlineCount = std::count( buffer.begin(), buffer.end(), '\n');
}
std::vector<char>::iterator start = buffer.begin();
while ( newlineCount > lineCount ) {
start = std::find( start, buffer.end(), '\n' ) + 1;
-- newlineCount;
}
std::vector<char>::iterator end = remove( start, buffer.end(), '\r' );
return std::string( start, end );
}
This is a bit weak in the error handling; in particular, you
probably want to distinguish between the inability to open
a file and any other errors. (No other errors should occur,
but you never know.)
Also, this is written for Windows text: it supposes that the actual file contains pure text, and doesn't contain any '\r' that aren't part of a CRLF. (For Unix, just drop the next-to-last line of the function, the one that removes '\r'.)
This can be done very efficiently using a circular array.
No additional buffer is required.
// needs <fstream>, <iostream>, <string>, <vector>, <algorithm>; using namespace std
void printlast_n_lines(char* fileName, int n){
const int k = n;
ifstream file(fileName);
vector<string> l(k); // circular array; a plain `string l[k]` VLA is non-standard C++
int size = 0 ;
while(getline(file, l[size%k])){ // loop on the read itself, not on file.good()
size++;
}
//start of circular array & size of it
int start = size > k ? (size%k) : 0 ; //this gets the start of the last k lines
int count = min(k, size); // number of lines to print
for(int i = 0; i< count ; i++){
cout << l[(start+i)%k] << '\n' ; // start mid-array and wrap around until all count lines are printed
}
}
Please provide feedback.
int end = ftell(f1);
pos=ftell(f1);
This tells you the last position in the file, i.e. EOF.
When you read there you get the EOF error, and the pointer moves one space forward...
So I recommend decreasing the current position by one,
or putting fseek(f1, -2, SEEK_CUR) at the beginning of the while loop to make up for the fgetc by one position and go one position back.
I believe you are using fseek wrong. Check man fseek.
Try this:
fseek(f1, -2, SEEK_CUR);
// 1 to neutralize the change from fgetc
// and 1 to move backward
Also you should set position at the beginning to the last element:
fseek(f1, -1, SEEK_END).
You don't need the end variable.
You should check the return values of all functions (fgetc, fseek and ftell). It is good practice. I don't know if this code will work with empty files or similar edge cases.
Use fseek(f1,-2,SEEK_CUR); to move back.
I wrote this code. It works; you can try it:
#include "stdio.h"
int main()
{
int count = 0;
char * fileName = "count.c";
char * outFileName = "out11.txt";
FILE * fpIn;
FILE * fpOut;
if((fpIn = fopen(fileName,"r")) == NULL ) {
printf(" file %s open error\n",fileName);
return 1;
}
if((fpOut = fopen(outFileName,"w")) == NULL ) {
printf(" file %s open error\n",outFileName);
return 1;
}
fseek(fpIn,0,SEEK_END);
while(count < 10)
{
fseek(fpIn,-2,SEEK_CUR);
if(ftell(fpIn)<0L)
break;
char now = fgetc(fpIn);
printf("%c",now);
fputc(now,fpOut);
if(now == '\n')
++count;
}
fclose(fpIn);
fclose(fpOut);
}
I would use two streams to print the last n lines of the file.
This runs in O(lines) time and O(1) additional space (only one line is held at a time).
#include<bits/stdc++.h>
using namespace std;
int main(){
// read last n lines of a file
ifstream f("file.in");
ifstream g("file.in");
// move f stream n lines down.
int n;
cin >> n;
string line;
for(int i=0; i<k; ++i) getline(f,line);
// move f and g stream at the same pace.
for(; getline(f,line); ){
getline(g, line);
}
// g now has only the last n lines left to read.
for(; getline(g,line); )
cout << line << endl;
}
A solution with O(lines) runtime and O(k) space uses a queue:
ifstream fin("file.in");
int k;
cin >> k;
queue<string> Q;
string line;
for(; getline(fin, line); ){
if(Q.size() == k){
Q.pop();
}
Q.push(line);
}
while(!Q.empty()){
cout << Q.front() << endl;
Q.pop();
}
Here is a solution in C++.
#include <iostream>
#include <string>
#include <exception>
#include <cstdlib>
int main(int argc, char *argv[])
{
auto& file = std::cin;
int n = 5;
if (argc > 1) {
try {
n = std::stoi(argv[1]);
} catch (std::exception& e) {
std::cout << "Error: argument must be an int" << std::endl;
std::exit(EXIT_FAILURE);
}
}
file.seekg(0, file.end);
n = n + 1; // Add one so the loop stops at the newline above
while (file.tellg() != 0 && n) {
file.seekg(-1, file.cur);
if (file.peek() == '\n')
n--;
}
if (file.peek() == '\n') // If we stop in the middle we will be at a newline
file.seekg(1, file.cur);
std::string line;
while (std::getline(file, line))
std::cout << line << std::endl;
std::exit(EXIT_SUCCESS);
}
Build:
$ g++ <SOURCE_NAME> -o last_n_lines
Run:
$ ./last_n_lines 10 < <SOME_FILE>

read output of a command line by line into a vector of strings in c++

I need to read the output of a bash command into a vector of strings, line by line. I tried this code with ifstream but it gives an error. What must I use to parse the output instead of ifstream?
using namespace std;
int main()
{
vector<string> text_file;
string cmd = "ls";
FILE* stream=popen(cmd.c_str(), "r");
ifstream ifs( stream );
string temp;
while(getline(ifs, temp))
text_file.push_back(temp);
for (int i=0; i<text_file.size(); i++)
cout<<text_file[i]<<endl;
}
You cannot use C I/O with the C++ iostream facilities: there is no standard way to construct an ifstream from a FILE*. If you really want to use popen, you need to access its results through C stdio calls such as fgets or fread (or read on the underlying fileno(stream)).
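For example, a minimal sketch of the FILE*-based route (my sketch, reusing the question's ls command; popen/pclose are POSIX functions): accumulate fgets chunks until each '\n' and push complete lines into the vector:
#include <cstdio>
#include <iostream>
#include <string>
#include <vector>
int main()
{
    std::vector<std::string> text_file;
    FILE* stream = popen("ls", "r");
    if (stream == nullptr) return 1;
    char buf[256];
    std::string line;
    // fgets may return only part of a long line, so accumulate until '\n'
    while (fgets(buf, sizeof buf, stream) != nullptr)
    {
        line += buf;
        if (!line.empty() && line.back() == '\n')
        {
            line.pop_back();              // drop the trailing newline
            text_file.push_back(line);
            line.clear();
        }
    }
    if (!line.empty()) text_file.push_back(line); // last line without '\n'
    pclose(stream);
    for (std::size_t i = 0; i < text_file.size(); ++i)
        std::cout << text_file[i] << std::endl;
}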
If ls is really what you want to do, give
Boost.Filesystem
a try.
#include <boost/filesystem.hpp>
#include <vector>
int main()
{
namespace bfs = boost::filesystem;
bfs::directory_iterator it{bfs::path{"/tmp"}};
for(bfs::directory_iterator it{bfs::path{"/tmp"}}; it != bfs::directory_iterator{}; ++it) {
std::cout << *it << std::endl;
}
return 0;
}
I think you would like to use the GNU/POSIX library function getline:
// needs <stdio.h>, <stdlib.h>, <string>, <vector>, <iostream>; using namespace std
int main ()
{
vector<string> text_file;
FILE *stream = popen ("ls", "r");
char *ptr = NULL;
size_t len = 0; // must be 0 when ptr is NULL; getline allocates the buffer
string str;
while (getline (&ptr, &len, stream) != -1)
{
str = ptr;
text_file.push_back (str);
}
free (ptr); // the caller frees getline's buffer
pclose (stream);
for (size_t i = 0; i < text_file.size(); ++i)
cout << text_file[i];
}
