Saving char in a file problems - c++

Hey.
I have some problems writing chars to a file with ofstream.
This is how the code looks (just to show how it works; this is NOT the real code):
char buffer[5001];
char secondbuffer[5001];
char temp;
ifstream in(Filename here);
int i = 0;
while(in.get(buffer[i]) && !in.eof())
{
i++;
}
for(int j = 0; j < i; j++)
{
secondbuffer[j] = buffer[j];
}
ofstream fout(somefile);
fout << secondbuffer;
// end of program
The problem is that it reads the characters of the first file fine, and when it writes to the second file it adds all the characters from the first file, as it's supposed to. But once there are no more characters, it adds a lot of "Ì" characters at the end of the file.
For example:
file 1:
abc
file 2:
abcÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ...
How can I prevent the program from saving "Ì" in the file?
EDIT2:
int i = 0;
lenghtofFile++;
while(fin.get(firstfileBuffer[i]) && !fin.eof())
{
i++;
lenghtofFile++;
}
firstfileBuffer[i] = '\0';
for(int j = 0; j < lenghtofFile; j++)
{
if(secondfileBuffer[j] != ' ' && secondfileBuffer[j] != '\0')
{
secondfileBuffer[j] = function(key, firstfileBuffer[j]);
}
}
secondfileBuffer[lenghtofFile]='\0';
fout << secondfileBuffer;

You need to null-terminate secondbuffer. You are adding all the characters read from the stream, which do not include a trailing null character.
On the line before fout, add
secondbuffer[i] = '\0';

The problem is that nothing puts a terminating null character into your buffer; the file itself does not contain one. When you read the file in, you get "abc" just fine, but the garbage that was sitting in secondbuffer when it was declared is still there, so writing "abc" to the beginning of it means that you have a 5001-length array of garbage that starts with "abc".
Try adding
secondbuffer[i] = '\0'; after your for loop.
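As a minimal sketch of both fixes (the names copyChars and copyViaCString are made up for illustration, and streams are passed in so the idea can be tried without real files): either write exactly the number of characters that were read, or terminate the buffer before streaming it as a C string.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: copies up to 5000 chars from `in` to `out`.
// It writes exactly `i` chars, so no terminator is needed at all.
std::size_t copyChars(std::istream& in, std::ostream& out)
{
    char buffer[5001];
    std::size_t i = 0;
    while (i < 5000 && in.get(buffer[i]))
        ++i;
    out.write(buffer, static_cast<std::streamsize>(i));
    return i;
}

// Same loop, but keeping `out << buffer`: the buffer must be terminated first.
std::size_t copyViaCString(std::istream& in, std::ostream& out)
{
    char buffer[5001];
    std::size_t i = 0;
    while (i < 5000 && in.get(buffer[i]))
        ++i;
    buffer[i] = '\0'; // without this, operator<< keeps reading past the data
    out << buffer;
    return i;
}
```

With real files, the same functions can be called with an std::ifstream and an std::ofstream in place of the string streams.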

This should work fine:
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
    char buffer[5001];
    ifstream in("foo.txt");
    ofstream fout("blah_copy.txt");
    // getline() discards the '\n', so write it back explicitly;
    // testing the stream itself stops the loop on the first failed read
    while (in.getline(buffer, sizeof buffer))
    {
        fout << buffer << '\n';
    }
    return 0;
}

Related

reading a large txt file (2GB) passing it to a string, takes too long

I have a big text file (2 GB) that contains a couple of books. I want to create a char** that holds each word of the whole text file. But first I read all the text file data into one HUGE string, and only THEN build the char** variable.
The problem is that it takes TOO long (hours) for the getline() loop to end. I ran it for 30 minutes and the program had read 500,000 lines. The whole file is 43,000,000 lines.
int main (){
ifstream book;
string sbook,str;
book.open("gutenberg.txt"); // the huge file
cout<<"Reading the file ....."<<endl;
while(!book.eof()){
getline(book,sbook);//passing the line as a string to sbook
if(str.empty()){
str= sbook;
}
else
str= str + " " + sbook;//append sbook to another string until the file ends
}//I never managed to get out of this loop
cout<<"Done reading the file."<<endl;
cout<<"Removal....."<<endl;
removal(str);//removes all punctuation and converts each uppercase letter to lowercase
cout<<"done removal"<<endl;
cout<<"Removing doublewhitespaces...."<<endl;
int whitespaces=removedoublewhitespace(str);//removes excess whitespace, leaving only one space between words,
//and returns the total number of whitespace characters
cout<<"doublewhitespaces removed."<<endl;
cout<<"initiating leksis....."<<endl;
char **leksis=new char*[whitespaces+1];//whitespaces+1 is how many words are left in the file
for(int i=0;i<whitespaces+1;i++){
leksis[i]= new char[30];
}
cout<<"done initiating leksis."<<endl;
int y=0,j=0;
cout<<"constructing leksis,finding plithos...."<<endl;
for(int i=0;i<str.length();i++){
if(isspace(str[i])){;
y++;
j=0;
leksis[y][j]=' ';
j++;
}
else{
leksis[y][j]=str[i];
j++;
}
}
cout<<"Done constructing leksis,finding plithos...."<<endl;
The removal() function:
void removal(string &s) {
for (int i = 0, len = s.size(); i < len; i++)
{
char c=s[i];
if(isupper(s[i])){
s[i]=tolower(s[i]);
}
int flag=ispunct(s[i]);
if (flag){
s.erase(i--, 1);
len = s.size();
}
}
}
The removedoublewhitespace() function:
int removedoublewhitespace(string &str){
int wcnt=0;
for(int i=str.size()-1; i > 0; i-- ) // stop at 1: the body reads str[i-1]
{
if(str[i]==' '&&str[i]==str[i-1]) //added equal sign
{
str.erase( str.begin() + i );
}
}
for(int i=0;i<str.size();i++){
if(isspace(str[i])){
wcnt++;
}
}
return wcnt;
}
This loop
while(!book.eof()){
getline(book,sbook);//passing the line as a string to sbook
if(str.empty()){
str= sbook;
}
else
str= str + " " + sbook;
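is hugely inefficient. Concatenating onto a huge string like that is terrible: str + " " + sbook builds a brand-new copy of everything read so far on every line, so the total work grows quadratically with the file size. If you must have the whole file in memory at once, then put it in a linked list of strings, one for each line. Or use a vector of strings; that's also a huge chunk of memory, but it will be allocated more efficiently.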
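A minimal sketch of that idea, assuming that what is ultimately needed is a list of lowercase words without punctuation: the words are read straight into a std::vector<std::string>, so the huge intermediate string and the repeated erase() calls disappear. The name readWords is made up for this example.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated words, drops punctuation
// and lowercases the rest. Each character is handled once, so the cost grows
// linearly with the input size.
std::vector<std::string> readWords(std::istream& in)
{
    std::vector<std::string> words;
    std::string raw;
    while (in >> raw) {               // operator>> skips any run of whitespace
        std::string word;
        for (unsigned char c : raw) {
            if (!std::ispunct(c))
                word += static_cast<char>(std::tolower(c));
        }
        if (!word.empty())            // a token of pure punctuation yields nothing
            words.push_back(word);
    }
    return words;
}
```

With the file from the question it could be called as readWords(book) on the opened std::ifstream.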

How to identify a specific word (like C Keyword) from an input text file and output them into an another external text file one word per line?

I want to extract the C keywords from a given input text file (which is sample C code) and write them to a separate text file, one word per line. How can I do this? Please help me. My code is given below:
#include <iostream>
#include<fstream>
#include<string.h>
using namespace std;
int main()
{
ifstream My_input_file("D:\\input.txt");
ofstream My_output_file("D:\\output.txt");
string line, temp;
int flag=1;
if(My_input_file.is_open())
{
while(getline(My_input_file,line))
{
int lenth=line.length(),i;
char* newline = new char[lenth+1];
strcpy(newline,line.c_str());
for(i=0; i<lenth; i++)
{
if(newline[i]=='/'&& newline[i+1]=='/')
break;
if (newline[i]=='/'&& newline[i+1]=='*')
flag=2;
if (newline[i]=='*'&& newline[i+1]=='/')
{flag=1;
i++;}
else if(flag==1)
My_output_file<<newline[i];
}
My_output_file<<"\n";
for(i=0;i<lenth;i++)
{
if(newline[i]=' ')
{
if(temp=="auto"||temp=="double"||temp=="int"||temp=="struct"||temp=="break"||temp=="else"||temp=="long"||temp=="switch"||temp=="case"||temp=="enum"||temp=="register"||temp=="typedef"||temp=="char"||temp=="extern"||temp=="return"||temp=="union"||temp=="const"||temp=="float"||temp=="short"||temp=="unsigned"||temp=="continue"||temp=="for"||temp=="signed"||temp=="void"||temp=="default"||temp=="goto"||temp=="sizeof"||temp=="volatile"||temp=="do"||temp=="if"||temp=="static"||temp=="while")
My_output_file<<newline[i];
}
else
{
temp+=newline[i];
}
}
}
}
return 0;
}
I want to find the C keywords in my input text file and write each matching word to an output text file. But my program only strips the comment lines. It does not write the matched words I searched for.
If my input.txt file contains some text, such as C++ code for a prime number check with comment lines,
then the output.txt file should look like this:
The whole prime number code without comment lines
Matched keyword list is:
Int
float
double
return
After editing my code I have found a solution for the lexical analysis. The edited code looks like this:
#include <iostream>
#include<fstream>
#include<string.h>
#include<stdlib.h>
#include<ctype.h>
using namespace std;
int isKeyword(char buffer[]){
char keywords[32][10] = {"auto","break","case","char","const","continue","default",
"do","double","else","enum","extern","float","for","goto",
"if","int","long","register","return","short","signed",
"sizeof","static","struct","switch","typedef","union",
"unsigned","void","volatile","while"};
int i, flag = 0;
for(i = 0; i < 32; ++i){
if(strcmp(keywords[i], buffer) == 0){
flag = 1;
break;
}
}
return flag;
}
int main()
{
ifstream My_input_file("D:\\input.txt");
ifstream My_library_file("D:\\library.txt");
ofstream My_output_file("D:\\output.txt");
string line, temp;
int flag=1;
if(My_input_file.is_open())
{
while(getline(My_input_file,line))
{
int lenth=line.length(),i;
char* newline = new char[lenth+1];
strcpy(newline,line.c_str());
for(i=0; i<lenth; i++)
{
if(newline[i]=='/'&& newline[i+1]=='/')
break;
if (newline[i]=='/'&& newline[i+1]=='*')
flag=2;
if (newline[i]=='*'&& newline[i+1]=='/')
{flag=1;
i++;}
else if(flag==1)
My_output_file<<newline[i];
}
My_output_file<<"\n";
}
My_output_file<<endl<<"Keywords Found in Your Text";
My_output_file<<endl<<"-------------------------------------"<<endl;
cout<<"Keyword Found in Your Text"<<endl;
cout<<"-------------------------------------"<<endl;
char ch, buffer[15];
ifstream fin("D:\\input.txt");
int j=0;
if(!fin.is_open()){
cout<<"error while opening the file\n";
exit(0);
}
while(!fin.eof()){
ch = fin.get();
if(isalnum(ch)){
buffer[j++] = ch;
}
else if((ch == ' ' || ch == '\n') && (j != 0)){
buffer[j] = '\0';
j = 0;
if(isKeyword(buffer) == 1)
cout<<buffer<<"\n";
}
}
}
return 0;
}
But now it writes a text file without the comment lines and prints the list of found keywords in a console window. I want to write the keyword list to my output.txt file.
My input.txt looks like this:
[screenshot: input.txt]
My output should look like this:
[screenshot: output.txt]
But the program produces a text file like this:
[screenshot: the output after clicking Build and Run]
and the console shows the list of found keywords like this:
[screenshot: the console after Build and Run]
But I want to put the list of found keywords into the output.txt file instead of displaying it in the console.
Then I replaced this code:
if(isKeyword(buffer) == 1)
cout<<buffer<<"\n";
with this:
My_output_file<<buffer[j]<<"\n";
That got me close to a solution, but it writes only the first letter of each keyword. How can I get the full word?

C++, reading chars into a vector<char> from a file, character by character

I am trying to read the first 7 chars of a file named "board.txt" into a vector<char>, but I am having issues for some reason. I am not too familiar with C++, so any advice would be appreciated. Here is the code I have so far:
//rack
int charCount = 0;
char ch;
ifstream rackIn("board.txt");
while(rackIn.get(ch) && charCount < 7){
this->getMyRack().push_back(ch);
}
And here is the function getMyRack used in the code above:
vector<char> board::getMyRack(){
return this->myRack;
}
myRack is a char vector
I tried to test this in my main using this:
for (int i = 0; i < test->getMyRack().size(); ++i){
cout << test->getMyRack().at(i);
}
But it does not output anything. Why are the chars I am reading in not being added to my char vector?
Because the chars never end up in your vector. Your function getMyRack() returns the vector by value, not a reference to it, so push_back() modifies a temporary copy that is thrown away immediately. You can add a method to your class board for adding a char, for example:
void board::addChar(char c){
this->myRack.push_back(c);
}
And then call this function:
while(charCount < 7 && rackIn.get(ch)){
this->addChar(ch);
charCount++; // without this the loop never stops at 7 chars
}
Or change the return type of your function to a reference (vector<char>&).
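For the second option, a minimal sketch (only the rack member is shown here; the rest of the class is assumed to stay as it is):

```cpp
#include <cstddef>
#include <vector>

// Minimal stand-in for the class from the question.
class board {
public:
    // Returning a reference lets callers modify the member itself.
    std::vector<char>& getMyRack() { return myRack; }
    // Read-only access for const objects.
    const std::vector<char>& getMyRack() const { return myRack; }
private:
    std::vector<char> myRack;
};
```

With this signature, this->getMyRack().push_back(ch) changes myRack directly, so the loop from the question stores the chars it reads.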
Read one line (or as many lines as required) from the file into a string, then take a substring of the first 7 chars:
std::ifstream file("board.txt");
std::string str;
// read a single line
std::getline(file, str);
// keep at most the first 7 chars
str = str.substr(0, 7);
std::vector<char> char_buf;
for (size_t i = 0; i < str.size(); i++)
{
    char_buf.push_back(str[i]);
}
// use the char_buf
A second, simpler way is to read the chars directly:
#include <fstream> // for ifstream
#include <cstdlib> // for exit()
#include <string>
#include <vector>
std::string file_name = "board.txt";
std::ifstream input_stream;
std::vector<char> char_buf;
input_stream.open(file_name);
if (input_stream.fail()) { exit(EXIT_FAILURE); }
char c;
// stop after 7 chars, or earlier if the file runs out
while (char_buf.size() < 7 && input_stream.get(c))
{
    char_buf.push_back(c);
}
// use char_buf
To collect whole lines instead, read line by line:
std::string str;
std::vector<std::string> lines; // one entry per non-empty line
int line_count = 0;
// 'in' is an already opened std::ifstream.
// Read at most 7 lines from the file.
while (line_count < 7 && std::getline(in, str))
{
    // Store the line only if it is not empty
    if (!str.empty())
        lines.push_back(str);
    line_count++;
}

reading last n lines from file in c/c++

I have seen many posts but didn't find anything like what I want.
I am getting wrong output:
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ...... // may be this is EOF character
The program also goes into an infinite loop.
My algorithm:
1. Go to the end of the file.
2. Decrease the position of the pointer by 1 and read character by character.
3. Exit if we have found our 10 lines or have reached the beginning of the file.
4. Now scan the full file till EOF and print the lines (not implemented in the code).
Code:
#include<iostream>
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#include<string.h>
using namespace std;
int main()
{
FILE *f1=fopen("input.txt","r");
FILE *f2=fopen("output.txt","w");
int i,j,pos;
int count=0;
char ch;
int begin=ftell(f1);
// GO TO END OF FILE
fseek(f1,0,SEEK_END);
int end = ftell(f1);
pos=ftell(f1);
while(count<10)
{
pos=ftell(f1);
// FILE IS LESS THAN 10 LINES
if(pos<begin)
break;
ch=fgetc(f1);
if(ch=='\n')
count++;
fputc(ch,f2);
fseek(f1,pos-1,end);
}
return 0;
}
UPD 1:
I changed the code and it has just one error now. If the input has lines like
3enil
2enil
1enil
it prints only these 10 lines:
line1
line2
line3ÿine1
line2
line3ÿine1
line2
line3ÿine1
line2
line3ÿine1
line2
PS:
1. I am working on Windows, in Notepad++.
2. This is not homework.
3. I also want to do it without using any more memory or the STL.
4. I am practicing to improve my basic knowledge, so please don't post about ready-made tools (like tail -5 etc.).
Please help me improve my code.
See the comments in the code:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *in, *out;
int count = 0;
long int pos;
char s[100];
in = fopen("input.txt", "r");
/* always check return of fopen */
if (in == NULL) {
perror("fopen");
exit(EXIT_FAILURE);
}
out = fopen("output.txt", "w");
if (out == NULL) {
perror("fopen");
exit(EXIT_FAILURE);
}
fseek(in, 0, SEEK_END);
pos = ftell(in);
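/* Don't write each char on output.txt, just search for '\n' */
while (pos) {
    fseek(in, --pos, SEEK_SET); /* seek from begin */
    if (fgetc(in) == '\n') {
        if (count++ == 10) break;
    }
}
/* Fewer than 10 lines: the loop ran out at pos 0 after reading the first
   char, so go back to the start (count only exceeds 10 after a break) */
if (count <= 10) rewind(in);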
/* Write line by line, is faster than fputc for each char */
while (fgets(s, sizeof(s), in) != NULL) {
fprintf(out, "%s", s);
}
fclose(in);
fclose(out);
return 0;
}
There are a number of problems with your code. The most
important one is that you never check that any of the functions
succeeded. And saving the results of ftell in an int isn't
a very good idea either. Then there's the test pos < begin;
this can only occur if there was an error. And the fact that
you're putting the results of fgetc in a char (which results
in a loss of information). And the fact that the first read you
do is at the end of file, so will fail (and once a stream enters
an error state, it stays there). And the fact that you can't
reliably do arithmetic on the values returned by ftell (except
under Unix) if the file was opened in text mode.
Oh, and there is no "EOF character"; 'ÿ' is a perfectly valid
character (0xFF in Latin-1). Once you assign the return value
of fgetc to a char, you've lost any possibility to test for
end of file.
I might add that reading backwards one character at a time is
extremely inefficient. The usual solution would be to allocate
a sufficiently large buffer, then count the '\n' in it.
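To illustrate the point about fgetc and EOF, here is a small sketch; countNewlines is a made-up helper name. The return value is kept in an int, so a byte with the value 0xFF and the end-of-file indicator remain distinguishable.

```cpp
#include <cstdio>

// Hypothetical helper: counts '\n' characters from the current position
// to the end of the file.
long countNewlines(std::FILE* f)
{
    long count = 0;
    int ch;                           // int, not char: EOF must stay distinct
    while ((ch = std::fgetc(f)) != EOF) {
        if (ch == '\n')
            ++count;
    }
    return count;
}
```

Had ch been declared as char, a 'ÿ' byte in the file could compare equal to EOF (where char is signed), or EOF could never match at all (where char is unsigned).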
EDIT:
Just a quick bit of code to give the idea:
std::string
getLastLines( std::string const& filename, int lineCount )
{
size_t const granularity = 100 * lineCount;
std::ifstream source( filename.c_str(), std::ios_base::binary );
source.seekg( 0, std::ios_base::end );
size_t size = static_cast<size_t>( source.tellg() );
std::vector<char> buffer;
int newlineCount = 0;
while ( source
&& buffer.size() != size
&& newlineCount < lineCount ) {
buffer.resize( std::min( buffer.size() + granularity, size ) );
source.seekg( -static_cast<std::streamoff>( buffer.size() ),
std::ios_base::end );
source.read( buffer.data(), buffer.size() );
newlineCount = std::count( buffer.begin(), buffer.end(), '\n');
}
std::vector<char>::iterator start = buffer.begin();
while ( newlineCount > lineCount ) {
start = std::find( start, buffer.end(), '\n' ) + 1;
-- newlineCount;
}
std::vector<char>::iterator end = remove( start, buffer.end(), '\r' );
return std::string( start, end );
}
This is a bit weak in the error handling; in particular, you
probably want to distinguish between the inability to open
a file and any other errors. (No other errors should occur,
but you never know.)
Also, this is purely Windows, and it supposes that the actual
file contains pure text, and doesn't contain any '\r' that
aren't part of a CRLF. (For Unix, just drop the next to the
last line.)
This can be done very efficiently using a circular array.
No additional buffer is required.
void printlast_n_lines(const char* fileName, int n){
    if (n <= 0) return;
    ifstream file(fileName);
    vector<string> l(n); // circular array holding the last n lines
    string line;
    int size = 0;
    while (getline(file, line)) { // stops cleanly at end of file
        l[size % n] = line; // overwrite the oldest slot
        size++;
    }
    // start of the circular array and number of lines to print
    int start = size > n ? (size % n) : 0; // index of the oldest kept line
    int count = min(n, size); // number of lines to print
    for (int i = 0; i < count; i++) {
        cout << l[(start + i) % n] << '\n'; // wrap around via the remainder
    }
}
Please provide feedback.
int end = ftell(f1);
pos=ftell(f1);
This gives you the last position in the file, i.e. EOF.
When you read there, you get the EOF error, and the file pointer still wants to move one position forward...
So I recommend decreasing the current position by one.
Or put fseek(f1, -2, SEEK_CUR) at the beginning of the while loop: one position to make up for the read and one more to go back...
I believe you are using fseek wrong. Check the fseek man page.
Try this:
fseek(f1, -2, SEEK_CUR);
// 1 to neutralize the change from fgetc
// and 1 to move backward
Also, at the beginning you should set the position to the last character:
fseek(f1, -1, SEEK_END).
You don't need the end variable.
You should check the return values of all functions (fgetc, fseek and ftell). It is good practice. I don't know if this code will work with empty files or something similar.
Use fseek(f1,-2,SEEK_CUR); to go back.
I wrote this code and it works; you can try it:
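#include <stdio.h>
int main(void)
{
    int count = 0;
    const char *fileName = "count.c";
    const char *outFileName = "out11.txt";
    FILE *fpIn;
    FILE *fpOut;
    if ((fpIn = fopen(fileName, "r")) == NULL) {
        printf(" file %s open error\n", fileName);
        return 1;
    }
    if ((fpOut = fopen(outFileName, "w")) == NULL) {
        printf(" file %s open error\n", outFileName);
        fclose(fpIn);
        return 1;
    }
    /* start on the last character; the seek fails if the file is empty */
    if (fseek(fpIn, -1, SEEK_END) == 0) {
        /* note: the characters are read last-to-first, so the output is reversed */
        while (count < 10)
        {
            int now = fgetc(fpIn); /* int, so EOF stays distinguishable */
            if (now == EOF)
                break;
            printf("%c", now);
            fputc(now, fpOut);
            if (now == '\n')
                ++count;
            /* step back over the char just read plus one more;
               fseek fails once the beginning is reached */
            if (fseek(fpIn, -2, SEEK_CUR) != 0)
                break;
        }
    }
    fclose(fpIn);
    fclose(fpOut);
    return 0;
}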
I would use two streams to print the last n lines of the file.
This runs in O(lines) time and keeps only one line in memory at a time.
#include<bits/stdc++.h>
using namespace std;
int main(){
// read last n lines of a file
ifstream f("file.in");
ifstream g("file.in");
// move f stream n lines down.
int n;
cin >> n;
string line;
for(int i=0; i<n; ++i) getline(f,line);
// move f and g stream at the same pace.
for(; getline(f,line); ){
getline(g, line);
}
// g now has to go the last n lines.
for(; getline(g,line); )
cout << line << endl;
}
A solution with O(lines) runtime and O(k) space uses a queue:
ifstream fin("file.in");
int k;
cin >> k;
queue<string> Q;
string line;
for(; getline(fin, line); ){
if(Q.size() == k){
Q.pop();
}
Q.push(line);
}
while(!Q.empty()){
cout << Q.front() << endl;
Q.pop();
}
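The same sliding-window idea can be wrapped in a function, sketched here with std::deque and an arbitrary input stream; lastLines is a hypothetical name. Pushing first and popping afterwards also keeps k = 0 safe, because pop is never called on an empty container.

```cpp
#include <cstddef>
#include <deque>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the last `k` lines of `in`, oldest first.
std::deque<std::string> lastLines(std::istream& in, std::size_t k)
{
    std::deque<std::string> window;
    std::string line;
    while (std::getline(in, line)) {
        window.push_back(line);
        if (window.size() > k)
            window.pop_front();       // drop the oldest line
    }
    return window;
}
```

It can be called with an std::ifstream just as well as with the std::istringstream used for quick experiments.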
Here is the solution in C++.
#include <iostream>
#include <string>
#include <exception>
#include <cstdlib>
int main(int argc, char *argv[])
{
auto& file = std::cin;
int n = 5;
if (argc > 1) {
try {
n = std::stoi(argv[1]);
} catch (std::exception& e) {
std::cout << "Error: argument must be an int" << std::endl;
std::exit(EXIT_FAILURE);
}
}
file.seekg(0, file.end);
n = n + 1; // Add one so the loop stops at the newline above
while (file.tellg() != 0 && n) {
file.seekg(-1, file.cur);
if (file.peek() == '\n')
n--;
}
if (file.peek() == '\n') // If we stop in the middle we will be at a newline
file.seekg(1, file.cur);
std::string line;
while (std::getline(file, line))
std::cout << line << std::endl;
std::exit(EXIT_SUCCESS);
}
Build:
$ g++ <SOURCE_NAME> -o last_n_lines
Run:
$ ./last_n_lines 10 < <SOME_FILE>

Output wrong. Possible strncpy issue?

So, I'm trying to get this code to parse each line inputted from the file into individual tokens, then add each one in turn to tklist array. Then the main just prints out each token. It's printing blanks though, and when I step into the code it looks like the strncpy isn't working. Any ideas what the issue is? I get no errors.
Here's the main function:
#include <iostream>
#include <fstream>
using namespace std;
#include "definitions.h"
#include "system_utilities.h"
int main()
{
ifstream inFile;
char line[MAX_CMD_LINE_LENGTH];
char* token[MAX_TOKENS_ON_A_LINE];
int numtokens;
system("pwd");
inFile.open("p4input.txt", ios::in);
if(inFile.fail()) {
cout << "Could not open input file. Program terminating.\n\n";
return 0;
}
while (!inFile.eof())
{
inFile.getline(line, 255);
line[strlen(line)+1] = '\0';
numtokens = parseCommandLine(line, token);
int t;
for (t=1; t <= numtokens; t++) {
cout << "Token "<< t << ": " << token[t-1] << "\n";
}
}
return 0;
}
And here's the parseCommandLine function:
int parseCommandLine(char cline[], char *tklist[]){
int i;
int length; //length of line
int count = 0; //counts number of tokens
int toklength = 0; //counts the length of each token
length = strlen(cline);
for (i=0; i < length; i++) { //go to first character of each token
if (((cline[i] != ' ' && cline[i-1]==' ') || i == 0)&& cline[i]!= '"') {
while ((cline[i]!=' ')&& (cline[i] != '\0') && (cline[i] != '\r')){
toklength++;
i++;
}
//---------------
tklist[count] = (char *) malloc( toklength +1);
strncpy(tklist[count], &cline[i-toklength], toklength);
//--------------
count ++;
toklength = 0;
}
if (cline[i] == '"') {
do {
toklength++;
i++;
if (cline[i] == ' ') {
toklength--;
}
} while (cline[i]!='"');
//--------------
tklist[count] = (char *) malloc( toklength +1);
strncpy(tklist[count], &cline[i-toklength], toklength);
//--------------
count ++;
toklength = 0;
}
}
int j;
for (j = 0; j < count; j++) {
free( (void *)tklist[j] );
}
return count;
}
Like I said, when I debug it looks like a problem with copying, but I'm a beginner so I suspect I'm doing something wrong.
Thanks for any help you can give!!
Try something like
tklist[count][toklength]='\0';
after
strncpy(tklist[count], &cline[i-toklength], toklength);
strncpy() does not necessarily add a null terminator for you, so it needs some care to use safely:
No null-character is implicitly appended at the end of destination if
source is longer than num.
Just for starters... there are other, deeper issues, as mentioned in the comments.
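Put together, the copy step could look like the sketch below. copyToken is a made-up helper and not part of the original code; it allocates room for the terminator and writes it explicitly after strncpy.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a malloc'ed copy of the first `len` chars of
// `src`, always null-terminated. The caller releases it with std::free().
char* copyToken(const char* src, std::size_t len)
{
    char* token = static_cast<char*>(std::malloc(len + 1));
    if (token == nullptr)
        return nullptr;
    std::strncpy(token, src, len);    // copies at most len chars, may not terminate
    token[len] = '\0';                // so terminate explicitly
    return token;
}
```

The returned pointer stays valid only until it is freed, so freeing the tokens inside parseCommandLine before main prints them would leave main reading released memory.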
To start with, the C++ equivalent of malloc/free is new/delete (heap memory allocation).
Second, you seem to be mixing up std::string and C strings (good old char*). The free function std::getline works with std::string, whereas the member function inFile.getline and your parsing function work with C strings. They are not the same thing, and std::string has a member function .c_str() to do the conversion.
So, I'm trying to get this code to parse each line inputted from the
file into individual tokens, then add each one in turn to tklist
array.
For
each line inputted from the file
use
std::ifstream ifs;
std::string s;
/**/
std::getline(ifs, s);
adapted to your loop.
To
parse [..] into individual tokens
look how
std::string
can help you on that task (or use boost::tokenizer).
And this
then add each [token] in turn to tklist array.
almost cries for std::list or std::vector instead of a plain C array, the choice, which container to use depends e.g. on what you intend to do with the tokens found.
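A rough sketch of that direction, using std::istringstream to split one line into a std::vector<std::string>. splitTokens is a hypothetical name, and the quoted tokens that the original parser handles are deliberately left out.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a line into whitespace-separated tokens.
// Quoted tokens are not treated specially in this sketch.
std::vector<std::string> splitTokens(const std::string& line)
{
    std::vector<std::string> tokens;
    std::istringstream stream(line);
    std::string token;
    while (stream >> token)           // operator>> skips spaces, tabs and '\r'
        tokens.push_back(token);
    return tokens;
}
```

No malloc, strncpy or manual terminator is involved, and the vector releases its memory on its own.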