C++ clear/overwrite content from beginning of file - c++

I need to erase or overwrite (erasing would be better) a number of bytes at the start of a file. The content is first read into a data structure; after the old bytes are deleted, the new content is written.
Currently I have the reading and writing part. How can I do the data clearing part? Ask if you want to know anything else. Thanks, sorry for bad English :)
typedef struct header {
char version[5]; //Offset 0, length 5, Archiver version
int files_no; //Offset 5, length 8, Number of files in the archive
char desc[256]; //Offset 13, length 256, Archive description, Header size 296??
header() {
strcpy(version, "0.20");
strcpy(desc, "THIS IS A DESCRIPTION FIELD WITH 256BYTE!");
}
}archiveHeader;
int archive(char *argv[], int argc) {
archiveHeader archive_struct_write, archive_struct_read;
string output = argv[argc-1]; output += ".n0b"; //Name of file to modify
int write_counter = 1;
ofstream file_write(output.c_str(), ios::binary | ios::app);
file_write.seekp(0, file_write.beg);
ifstream file_read(output.c_str(), ios::binary);
file_read.read((char*)&archive_struct_read, sizeof(archive_struct_read)); //Read existing beginning data
//Steps for erasing the read stuff goes here.
write_counter = archive_struct_read.files_no;
archive_struct_write.files_no = write_counter + 1;
file_write.write((char*)&archive_struct_write, sizeof(archive_struct_write)); //New beginning data written
file_read.close();
file_write.close();
write_counter++;
return 0;
}
Edit: Buffering everything after the beginning bytes and writing it to a new file is not an option; the file can be up to several GB in size. Thanks!
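A note on the approach, offered as a sketch rather than a definitive answer: standard C++ streams have no operation that removes bytes from the front of a file, so truly erasing the header would mean rewriting everything after it. Because the header has a fixed size, overwriting it in place avoids that. Two details matter: ios::app sends every write to the end of the file regardless of seekp, and a plain ofstream truncates the file, so the file should be opened with ios::in | ios::out | ios::binary. The names ArchiveHeader, bump_file_count and bump_file_count_in_file are made up for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <fstream>
#include <iostream>
#include <sstream>   // only needed to try the function on a std::stringstream
#include <string>
#include <type_traits>

// Hypothetical fixed-size header, modelled on the struct in the question.
struct ArchiveHeader {
    char version[5];
    int  files_no;
    char desc[256];
};
static_assert(std::is_trivially_copyable<ArchiveHeader>::value,
              "raw read/write needs a trivially copyable type");

// Reads the header at offset 0, increments files_no and writes the header
// back over the same bytes. Everything after the header stays untouched.
bool bump_file_count(std::iostream& io) {
    ArchiveHeader h;
    io.seekg(0, std::ios::beg);
    if (!io.read(reinterpret_cast<char*>(&h), sizeof h)) return false;
    ++h.files_no;
    io.seekp(0, std::ios::beg);
    io.write(reinterpret_cast<const char*>(&h), sizeof h);
    io.flush();
    return static_cast<bool>(io);
}

// Opens an existing file for reading and writing, without ios::app
// (which would force writes to the end) and without truncation.
bool bump_file_count_in_file(const std::string& path) {
    assert(!path.empty());
    std::fstream f(path.c_str(), std::ios::in | std::ios::out | std::ios::binary);
    return f.is_open() && bump_file_count(f);
}
```

The header is written in the machine's native layout (padding and byte order included), so this only round-trips between builds that agree on that layout.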

Related

ifstream not reading the same characters as they are written in the file

Simple explanation: ifstream's get() is reading the wrong chars (console is different from file) and I need to know why.
I am recording records into a file as a char array. The write succeeds: when I open the file I find the chars I intended, except that Notepad apparently shows the Unicode character U+0000 (NUL) as a space.
For instance, the entries
id = 1000; //an 8-byte long long
name = "stack"; //variable size
surname = "overflow"; //variable size
degree = "internet"; //variable size
sex = 'c'; //1-byte char
birthdate = 256; //4-byte int
become this on the file:
& èstackoverflowinternetc
or, putting the number of unicode characters that disappear when posted here between brackets:
&[3]| [1]è|stack|overflow|internet|c| [1] | //separating each section with a | for easier reading. Some unicode characters disappear when I post them here, but I assure you they are the correct ones
SIZE| ID | name| surname| degree |g| birth
(writing is working fine and puts the expected characters)
Trouble is, when the console in the code below prints what the buffer is reading from the file, it gives me the following record (extra spaces included)
Þstackoverflowinternetc
Which is bad because it gives me the wrong ID and birthdate: either "-21" and "4747968" or "Ù" and "-1066252288". The other fields are unaffected. This is weird because the size bytes show up as empty space in the console, so it shouldn't be able to split name, surname, degree and sex.
ifstream infile("alumni.freire", ios::binary);
if(infile.is_open()){
infile.seekg(pos, ios::beg);
int size;
size = infile.get();
char charreg[size];
charreg[0] = size;
//testing what buffer gives me
for(int i = 1; i < size; i++){
charreg[i] = infile.get();
cout << charreg[i];
}
}
What am I doing wrong?
EDIT: to explain better what I did:
I get the entries in the first "code" section from user input and use them as parameters when creating an instance of a "reg" class I implemented. The reg class then does the conversion to strings (adequately, I've already tested it) and calculates a hidden four-element char array containing the instance size, name size, surname size and degree size. When the program writes the class to the file, it is written perfectly, as I showed in the second "code" section. (If you do the calculations you'll see '&' equals the size of the entire thing, for example.) When I read it from the file, it appears differently on the console for some reason: different characters. But it reads the right number of characters, because "name", "surname" and "degree" appear correctly.
EDIT n2: I made "charreg[]" into an int array and printed it and the values are correct. I have no idea what's happening anymore.
EDIT n3: Apparently the reason I was getting the wrong chars is that I should have used unsigned chars...
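To illustrate the point about unsigned chars, here is a minimal sketch. It assumes the 8-byte ID was written in little-endian order (low byte first), which is an assumption about the machine that produced the file; byte_value and load_le64 are hypothetical helper names. When plain char is signed, a byte such as 0xE8 turns into a negative number as soon as it is widened to int, which is why converting through unsigned char first gives the expected values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Value of one raw byte as 0..255, regardless of whether plain char is signed.
inline int byte_value(char c) {
    return static_cast<unsigned char>(c);
}

// Reassembles a 64-bit integer from 8 raw bytes stored low byte first.
// Each byte goes through unsigned char so no sign extension can leak in.
inline std::int64_t load_le64(const char* p) {
    assert(p != nullptr);
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        v |= static_cast<std::uint64_t>(static_cast<unsigned char>(p[i])) << (8 * i);
    }
    return static_cast<std::int64_t>(v);
}
```

For the ID 1000 (0x03E8) the first byte in the file is 0xE8: byte_value reports it as 232, while a signed char would report it as -24.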
Writing your structure to the file as-is is a good idea, but your approach is wrong.
You need something to separate your fields.
For example, you know that your ID is 8 bytes long, great! You can read 8 bytes:
long long id;
read(fd, &id, 8);
In your example you got -24 because you read only the first byte of the full ID number: on a little-endian machine 1000 is stored low byte first, and that byte, 0xE8, is -24 when held in a signed char.
But for the rest of the file, how can you know the length of the first name and the last name?
You could read byte by byte until you find a null byte.
But I suggest you use a more structured file.
For example, you can define a structure like this:
long long id; // 8 bytes
char firstname[256]; // 256 bytes
char lastname[256]; // 256 bytes
char sex; // 1 byte
int birthdate; // 4 bytes
With this structure you can read and write very easily:
struct my_struct s;
read(fd, &s, sizeof(struct my_struct)); // read the whole struct: 8+256+256+1+4 bytes plus any padding the compiler adds
s.birthdate = 128;
write(fd, &s, sizeof(struct my_struct)); // write the structure
Of course you lose the "variable length" of the first name and last name. Do you really need more than 100 chars for a name?
If you really need it, you could put a header in front of each variable-length value. But then you lose the ability to read everything at once.
long long id;
int foo_size;
char *foo;
And then to read it:
struct my_struct s;
read(fd, &s, 12); // read the header, 8 + 4 bytes
char foo[s.foo_size];
read(fd, foo, s.foo_size); // read the variable-length data into foo, not into s
You should define exactly what you need to save. Define a precise data structure whose layout you can easily deduce when reading, and avoid things like "oh, let's read until the null byte".
I used C functions to explain this because they are much more explicit: you know what you read and what you write.
Start by playing with this, and then try the same with C++ streams and functions.
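The length-header idea above can be sketched with C++ streams as follows. This is only one possible layout, not the asker's actual format: the Record struct, the helper names and the kMaxStringLength limit are all invented for the example, and integers are written in the machine's native byte order, so the file is not portable between different architectures.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>   // only needed to try the functions on a std::stringstream
#include <string>

// Hypothetical record with two variable-length fields.
struct Record {
    long long   id;
    std::string name;
    std::string surname;
    char        sex;
    int         birthdate;
};

// Arbitrary sanity limit so a corrupt length cannot trigger a huge allocation.
const std::uint32_t kMaxStringLength = 1u << 20;

template <typename T>
void write_pod(std::ostream& out, const T& v) {
    out.write(reinterpret_cast<const char*>(&v), sizeof v);
}

template <typename T>
bool read_pod(std::istream& in, T& v) {
    return static_cast<bool>(in.read(reinterpret_cast<char*>(&v), sizeof v));
}

// A string is stored as a 32-bit length followed by that many bytes.
void write_string(std::ostream& out, const std::string& s) {
    assert(s.size() <= kMaxStringLength);
    const std::uint32_t n = static_cast<std::uint32_t>(s.size());
    write_pod(out, n);
    out.write(s.data(), n);
}

bool read_string(std::istream& in, std::string& s) {
    std::uint32_t n = 0;
    if (!read_pod(in, n) || n > kMaxStringLength) return false;
    s.resize(n);
    return n == 0 || static_cast<bool>(in.read(&s[0], n));
}

void write_record(std::ostream& out, const Record& r) {
    write_pod(out, r.id);
    write_string(out, r.name);
    write_string(out, r.surname);
    write_pod(out, r.sex);
    write_pod(out, r.birthdate);
}

bool read_record(std::istream& in, Record& r) {
    return read_pod(in, r.id) && read_string(in, r.name) &&
           read_string(in, r.surname) && read_pod(in, r.sex) &&
           read_pod(in, r.birthdate);
}
```

Because every string carries its own length, the reader never has to guess where a field ends, and names may even contain null bytes.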
I don't know how you are writing back information to the file but here is how I would do that, I'm hoping this is a fairly simple way of doing it. Keep in mind I have no idea what kind of file you are actually working with.
long long id = 1000;
std::string name = "name";
std::string surname = "overflow";
std::string degree = "internet";
unsigned char sex = 'c';
int birthdate = 256;
ofstream outfile("test.txt", ios::binary);
if (outfile.is_open())
{
const char* idBytes = static_cast<char*>(static_cast<void*>(&id));
const char* nameBytes = name.c_str();
const char* surnameBytes = surname.c_str();
const char* degreeBytes = degree.c_str();
const char* birthdateBytes = static_cast<char*>(static_cast<void*>(&birthdate));
outfile.write(idBytes, sizeof(id));
outfile.write(nameBytes, name.length());
outfile.write(surnameBytes, surname.length());
outfile.write(degreeBytes, degree.length());
outfile.put(sex);
outfile.write(birthdateBytes, sizeof(birthdate));
outfile.flush();
outfile.close();
}
And here is how I read it back and print it, which for me comes out as expected.
ifstream infile("test.txt", std::ifstream::ate | ios::binary);
if (infile.is_open())
{
std::size_t fileSize = infile.tellg();
infile.seekg(0);
for (int i = 0; i < fileSize; i++)
{
char c = infile.get();
std::cout << c;
}
std::cout << std::endl;
}

loop through WAV file in c(or c++)

I am trying to copy a WAV sound in C. The original file is 2 seconds long, but I want to replicate the data in the destination file several times so that it plays longer. For example, if I copy it 3 times, it should play for 6 seconds... right?
But for some reason, even though the destination file is bigger than the original file, it still plays for 2 seconds...
Can anyone help please?
Here is my code:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
using namespace std;
typedef struct header_file
{
char chunk_id[4];
int chunk_size;
char format[4];
char subchunk1_id[4];
int subchunk1_size;
short int audio_format;
short int num_channels;
int sample_rate;
int byte_rate;
short int block_align;
short int bits_per_sample;
char subchunk2_id[4];
int subchunk2_size;
} header;
typedef struct header_file* header_p;
int main()
{
FILE * infile = fopen("../files/man1_nb.wav","rb"); // Open wave file in read mode
FILE * outfile = fopen("../files/Output.wav","wb"); // Create output ( wave format) file in write mode
int BUFSIZE = 2; // BUFSIZE can be changed according to the frame size required (eg: 512)
int count = 0; // For counting number of frames in wave file.
short int buff16[BUFSIZE]; // short int used for 16 bit as input data format is 16 bit PCM audio
header_p meta = (header_p)malloc(sizeof(header)); // header_p points to a header struct that contains the wave file metadata fields
int nb; // variable storing number of bytes returned
if (infile)
{
fread(meta, 1, sizeof(header), infile); // Read only the header
fwrite(meta,1, sizeof(*meta), outfile); // copy header to destination file
int looper = 0; // number of times sound data is copied
for(looper=0; looper <2; looper++){
while (!feof(infile))
{
nb = fread(buff16,1,BUFSIZE,infile); // Reading data in chunks of BUFSIZE
count++; // Incrementing Number of frames
fwrite(buff16,1,nb,outfile); // Writing read data into output file
}
fseek(infile, 44, SEEK_SET); // Go back to end of header
}
}
fclose(infile); fclose(outfile);
return 0;
}
Both the read and the write parts of your code are wrong.
WAV files use the RIFF format and consist of TLV (type-length-value) chunks. Each chunk consists of a header and data. Typically a WAV file consists of three parts: the RIFF header with the "WAVE" FOURCC code, the format chunk with a PCMWAVEFORMAT struct, and the data chunk with the sound data. Also, since the size of each chunk is limited by its 32-bit length field, large files are constructed by concatenating WAV files together.
You need to parse the file chunk by chunk and write to the destination chunk by chunk, updating the headers accordingly.
When you change the size of your data you'll need to update the output header as well.
long total_bytes_written_to_outfile = ftell(outfile);
// correct chunk_size and subchunk2_size just before closing outfile:
fseek(outfile, 0, SEEK_SET);
int size = total_bytes_written_to_outfile - sizeof(*meta);
meta->chunk_size = sizeof(header) - sizeof(meta->chunk_id) - sizeof(meta->chunk_size) + size;
meta->subchunk2_size = size;
fwrite(meta, 1, sizeof(*meta), outfile);
fclose(outfile);
Also, to make sure you are reading the correct file, check that meta->chunk_size == (file size of man1_nb.wav) - 8.
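Putting the header fix together with the copy loop, a minimal sketch could look like the following. It rests on several assumptions that are not guaranteed for every WAV file: a canonical 44-byte PCM header with no extra chunks before the data chunk, a little-endian host, and sound data small enough to hold in memory (true for a 2-second clip). WavHeader and repeat_wav are names invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>   // only needed to try the function on string streams
#include <string>
#include <vector>

// Canonical 44-byte PCM WAV header (assumed layout, see the note above).
struct WavHeader {
    char          chunk_id[4];      // "RIFF"
    std::uint32_t chunk_size;       // file size - 8
    char          format[4];        // "WAVE"
    char          subchunk1_id[4];  // "fmt "
    std::uint32_t subchunk1_size;
    std::uint16_t audio_format;
    std::uint16_t num_channels;
    std::uint32_t sample_rate;
    std::uint32_t byte_rate;
    std::uint16_t block_align;
    std::uint16_t bits_per_sample;
    char          subchunk2_id[4];  // "data"
    std::uint32_t subchunk2_size;   // number of sound data bytes
};
static_assert(sizeof(WavHeader) == 44, "unexpected padding in WavHeader");

// Copies the sound data `times` times and writes a header whose two size
// fields describe the enlarged file. Returns false on malformed input.
bool repeat_wav(std::istream& in, std::ostream& out, std::uint32_t times) {
    assert(times > 0);
    WavHeader h;
    if (!in.read(reinterpret_cast<char*>(&h), sizeof h)) return false;
    if (std::memcmp(h.chunk_id, "RIFF", 4) != 0 ||
        std::memcmp(h.subchunk2_id, "data", 4) != 0) return false;

    std::vector<char> data(h.subchunk2_size);
    if (!data.empty() &&
        !in.read(data.data(), static_cast<std::streamsize>(data.size()))) return false;

    // The size fields are 32 bits wide, so refuse anything that would overflow.
    if (data.size() > (0xFFFFFFFFu - 36u) / times) return false;

    h.subchunk2_size = static_cast<std::uint32_t>(data.size()) * times;
    h.chunk_size = 36u + h.subchunk2_size;  // header bytes after chunk_size + data

    out.write(reinterpret_cast<const char*>(&h), sizeof h);
    for (std::uint32_t i = 0; i < times; ++i) {
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
    }
    return static_cast<bool>(out);
}
```

With real files, both streams would be opened in binary mode, for example std::ifstream in("in.wav", std::ios::binary). Players use subchunk2_size to decide how much audio there is, which is why a bigger file with an unchanged header still plays for 2 seconds.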

How to check whether ifstream is end of file in C++

I need to read all blocks of one large file (about 10GB) sequentially. The file contains many floats with a few strings, like this (each item separated by '\n'):
6.292611
-1.078219E-266
-2.305673E+065
sod;eiwo
4.899747e-237
1.673940e+089
-4.515213
I read MAX_NUM_PER_FILE items each time, process them and write them to another file, but I don't know how to tell when the ifstream has reached the end.
Here is my code:
ifstream file_input(path_input); //my file is a text file, but i tried both text and binary mode, both failed.
if(file_input)
{
file_input.seekg(0,file_input.end);
unsigned long long length = file_input.tellg(); //get file size
file_input.seekg(0,file_input.beg);
char * buffer = new char [MAX_NUM_PER_FILE+MAX_NUM_PER_LINE];
int i=1,j;
char c,tmp[3];
while(file_input.tellg()<length)
{
file_input.read(buffer,MAX_NUM_PER_FILE);
j=MAX_NUM_PER_FILE;
while(file_input.get(c)&&c!='\n')
buffer[j++]=c; //get a complete item
//process with buffer...
itoa(i++,tmp,10); //int2char
string out_name="out"+string(tmp)+".txt";
ofstream file_output(out_name);
file_output.write(buffer,j);
file_output.close();
}
file_input.close();
delete[] buffer;
}
My code goes wrong: length is bigger than the real file size. I have tried file_input.good() and !file_input.eof(), but they didn't work. getline(file_input,s) works, but it is much slower than read. I want to use read, but I don't know how to check whether the ifstream has reached end-of-file.
I am working on Windows 7 with VS2010.
I have searched but could not find an answer; the question "How to open a file using ifstream and keep reading it until the end" doesn't answer mine.
Update, Problem solved
Hi everyone, I have figured out that it was my fault. Both while(file_input.tellg()<length) and while(file_input.peek()!=EOF) work fine! while(file_input.peek()!=EOF) is recommended.
The extra items written after the end-of-file were leftover items in the buffer from the previous iteration.
Here is the correct code:
ifstream file_input(path_input);
if(file_input)
{
//file_input.seekg(0,file_input.end);
//unsigned long long length = file_input.tellg(); //get file size
//file_input.seekg(0,file_input.beg);
char * buffer = new char [MAX_NUM_PER_FILE+MAX_NUM_PER_LINE];
int i=1,j;
char c,tmp[3];
while(file_input.peek()!=EOF)
{
memset(buffer,0,sizeof(char)*(MAX_NUM_PER_FILE+MAX_NUM_PER_LINE)); //clear first!
file_input.read(buffer,MAX_NUM_PER_FILE);
j=MAX_NUM_PER_FILE;
while(file_input.get(c)&&c!='\n')
buffer[j++]=c;
itoa(i++,tmp,10);//int2char
string out_name="out"+string(tmp)+".txt";
ofstream file_output(out_name);
file_output.write(buffer,strlen(buffer)); //use the correct buffer size instead of j
file_output.close();
}
file_input.close();
delete[] buffer;
}
while( file_input.peek() != EOF )
{
// code
}
Basically peek() will read the next char without extracting it.
So you can simply compare it to EOF.
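As an alternative to clearing the buffer and calling strlen, the number of bytes actually delivered by read() is available from gcount(). The following is a minimal sketch of that idea, not the asker's code: next_chunk is a hypothetical helper that reads one block and, if the block ended in the middle of a line, extends it to the end of that line. It assumes the stream was opened in binary mode so that byte counts match the file.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>   // only needed to try the function on a std::istringstream
#include <string>

// Reads roughly chunk_size bytes into `chunk`, extended to the end of the
// current line. Returns false once there is nothing left to read.
bool next_chunk(std::istream& in, std::size_t chunk_size, std::string& chunk) {
    assert(chunk_size > 0);
    chunk.assign(chunk_size, '\0');
    in.read(&chunk[0], static_cast<std::streamsize>(chunk_size));
    // gcount() is the number of bytes the last read() really produced,
    // which is smaller than chunk_size for the final block.
    chunk.resize(static_cast<std::size_t>(in.gcount()));
    if (chunk.empty()) return false;

    // A full block may stop mid-line: pull in the remainder of that line.
    if (in && chunk[chunk.size() - 1] != '\n') {
        std::string rest;
        std::getline(in, rest);
        chunk += rest;
        if (!in.eof()) chunk += '\n';  // getline consumed the delimiter
    }
    return true;
}
```

A loop of the form while (next_chunk(file_input, MAX_NUM_PER_FILE, chunk)) then needs neither tellg() nor peek(), and each chunk can be written out with file_output.write(chunk.data(), chunk.size()).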

Split a File and put it back together in c++

I want to copy a file by reading blocks of data, sending them, and then putting them back together again. Sending is not part of the problem, so I left it out of the code. It should work with any type of file and any value of piece_length.
This is just a preliminary stage. In the end the data blocks should not be chosen sequentially but at random, and there could be some time between receiving one block and the next.
I know the example only makes sense if size % piece_length != 0.
I'm getting corrupted files of the same size as the original file at the other end.
Does anyone see the problem?
int main ()
{
string file = "path/test.txt";
string file2 = "path2/test.txt";
std::ifstream infile (file.c_str() ,std::ifstream::binary);
//get size of file
infile.seekg (0,infile.end);
long size = infile.tellg();
infile.seekg (0);
size_t piece_length = 5;
for (int i = 0; i < ((size / piece_length) + 1); i++)
{
if ( i != (size / piece_length))
{
std::ifstream infile (file.c_str() ,std::ifstream::binary);
infile.seekg((i * piece_length) , infile.beg);
char* buffer = new char[piece_length];
infile.read(buffer, piece_length);
infile.close();
std::ofstream outfile (file2.c_str() ,std::ofstream::binary);
outfile.seekp((i * piece_length), outfile.beg);
outfile.write(buffer, piece_length);
outfile.close();
}
else
{
std::ifstream infile (file.c_str() ,std::ifstream::binary);
infile.seekg((i * piece_length) , infile.beg);
char* buffer = new char[size % piece_length];
infile.read(buffer, size % piece_length);
infile.close();
std::ofstream outfile (file2.c_str() ,std::ofstream::binary);
outfile.seekp((i * piece_length), outfile.beg);
outfile.write(buffer, size % piece_length);
outfile.close();
}
}
return 0;
}
To answer your specific question, you need to open outfile with ios::in | ios::out in the flags, otherwise it defaults to write-only mode and destroys what was already in the file. See this answer for more details: Write to the middle of an existing binary file c++
You may want to consider the following though:
If you are just writing parts to the end of the file, just use ios::app (append). Don't even need to seek.
You don't need to keep reopening infile or even outfile, just reuse them.
You can also reuse buffer. Please remember to delete them, or better yet use a std::vector.
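The suggestions above can be sketched with two small helpers that work on streams which stay open for the whole transfer. read_piece and write_piece are names made up for this example. For a real destination file, the stream would be a std::fstream opened with ios::in | ios::out | ios::binary; that mode does not create a missing file, so the file has to exist first (an assumption of this sketch is that it has been created, and ideally pre-sized, beforehand).

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>   // only needed to try the functions on string streams
#include <string>

// Returns piece number `index`; the last piece may be shorter than piece_length.
std::string read_piece(std::istream& in, std::size_t index, std::size_t piece_length) {
    assert(piece_length > 0);
    std::string piece(piece_length, '\0');
    in.clear();  // a previous short read leaves eofbit/failbit set
    in.seekg(static_cast<std::streamoff>(index * piece_length), std::ios::beg);
    in.read(&piece[0], static_cast<std::streamsize>(piece_length));
    piece.resize(static_cast<std::size_t>(in.gcount()));
    return piece;
}

// Writes a piece at its final position without disturbing other pieces.
bool write_piece(std::ostream& out, std::size_t index, std::size_t piece_length,
                 const std::string& piece) {
    out.seekp(static_cast<std::streamoff>(index * piece_length), std::ios::beg);
    out.write(piece.data(), static_cast<std::streamsize>(piece.size()));
    return static_cast<bool>(out);
}
```

Because each piece is addressed by index * piece_length, the pieces can arrive in any order, and using std::string for the buffer removes the need for new[] and delete[].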

What is the proper method of reading and parsing data files in C++?

What is an efficient, proper way of reading in a data file with mixed content? For example, I have a data file that contains a mixture of data loaded from other files, 32-bit integers, characters and strings. Currently I am using an fstream object, but it stops once it hits an int32 or the end of a string. If I add random data onto the end of the string in the data file, it seems to carry on through the rest of the file. This leads me to believe that the null termination added onto strings is messing it up. Here's an example of loading the file:
void main()
{
fstream fin("C://mark.dat", ios::in|ios::binary|ios::ate);
char *mymemory = 0;
int size;
size = 0;
if (fin.is_open())
{
size = static_cast<int>(fin.tellg());
mymemory = new char[static_cast<int>(size+1)];
memset(mymemory, 0, static_cast<int>(size + 1));
fin.seekg(0, ios::beg);
fin.read(mymemory, size);
fin.close();
printf(mymemory);
std::string hithere;
hithere = cin.get();
}
}
Why might this code stop after reading in an integer or a string? How might one get around this? Is this the wrong approach when dealing with these types of files? Should I be using fstream at all?
Have you ever considered that the file reading is working perfectly and it is printf(mymemory) that is stopping at the first null?
Have a look with the debugger and see if I am right.
Also, if you want to print someone else's buffer, use puts(mymemory) or printf("%s", mymemory). Don't accept someone else's input for the format string, it could crash your program.
Try
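for (int i = 0; i < size ; ++i)
{
// 0 - pad with 0s
// 2 - minimum field width of two characters
// X - a hex value with capital A-F (0A, 1B, etc)
// Cast through unsigned char so bytes >= 0x80 are not sign-extended (e.g. FFFFFFE8)
printf("%02X ", (unsigned int)(unsigned char)mymemory[i]);
if ((i + 1) % 32 == 0)
printf("\n"); //New line every 32 bytes
}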
as a way to dump your data file back out as hex.
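To see that the read itself is fine and only the printing stops at the first null, the whole file can be loaded into a container that keeps track of its own size. This is a small sketch with a hypothetical helper name, read_all; it is simple rather than fast, and for large files a sized read() as in the question is the better choice.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <iterator>
#include <sstream>   // only needed to try the function on a std::istringstream
#include <string>
#include <vector>

// Reads every remaining byte of the stream, embedded nulls included.
std::vector<char> read_all(std::istream& in) {
    std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    assert(!in.bad());  // a bad stream would mean an I/O error, not just end of data
    return bytes;
}
```

The vector's size() is the real byte count, whereas strlen or printf("%s") on the same data would stop at the first zero byte. Writing the bytes out unchanged can be done with std::cout.write(bytes.data(), bytes.size()).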