Write at a specific position in a file with open() - C++

Hello, I am trying to simulate two programs that send and receive files over the network in C++, something like a client and a server. To begin with, I have to split a file into pages of 4096 bytes and send them to the other program so it can recreate the file. I send and receive the pages through the network with write and read. So in the client program I must write a function that receives the pages and puts them into a file, and I cannot figure out a way to do that. For example, if a file has 2 pages, I must create another file using these 2 pages. Also, I cannot know whether the pages arrive in order, so I must create the file and put each page in the right position.
/* consider the connections are OK and the file's name is in char* name */
int file = open(name, O_CREAT | O_WRONLY, 0666);
char buffer[4096];
int pagenumber;
for (int i = 0; i < page_number; i++) {
    read(socket, &pagenumber, sizeof(int));
    read(socket, buffer, sizeof(int));
    write(file(pagenumber*4096), buffer, 4096);
}
This code works for pagenumber=0 but for pagenumber=1 nothing happens! Can you help me? Thanks in advance!

To write at a certain position in the file you must use lseek:
off_t lseek(int fd, off_t offset, int whence);
It takes the descriptor, the offset, and a final parameter that is one of these constants:
SEEK_SET The offset is set to offset bytes.
SEEK_CUR The offset is set to its current location plus offset bytes.
SEEK_END The offset is set to the size of the file plus offset bytes.
If you know how big the file is going to be, you can pre-allocate it with ftruncate:
int ftruncate(int fd, off_t length);
Anyway, even if you create a huge file, since most filesystems on Linux support sparse files, the actual space used on disk will be only the sum of the blocks that have actually been written.
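As a minimal sketch under the question's assumptions (4096-byte pages, the page number arriving before the page data), the receive loop could position each page like this; write_page is a hypothetical helper name:
#include <fcntl.h>
#include <unistd.h>

// Seek to the page's byte offset, then write the whole page there.
void write_page(int file, int pagenumber, const char *buffer) {
    off_t pos = (off_t)pagenumber * 4096;        // byte offset of this page
    if (lseek(file, pos, SEEK_SET) == (off_t)-1)
        return;                                  // seek failed
    ssize_t written = write(file, buffer, 4096);
    if (written != 4096) {
        // handle a short write or an error here
    }
}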

The first argument to write() is a file descriptor, which you obtained with open(). So it should be
int file = open(...);
...
write(file,buffer,4096);
not
write(file(pagenumber*4096),buffer,4096);
Regarding the question of how to write at a specific position: you can prepare the file beforehand with write, and then use lseek() to position the file offset where you want to write. For a description, see the lseek(2) man page.

Mario, first of all, let's not rely on garbage in pagenumber to continue the loop (which is what happens the first time the loop's boundary condition is checked). Also, if you receive page number 0 and then the page following it, pagenumber will be set to 0 and your loop will exit. Finally, please check the number of bytes actually written and read by the write and read system calls, for example with a helper like the sketch below.
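As a minimal sketch of that last point (read_fully is a hypothetical name), a helper that loops until all requested bytes have arrived, since read() on a socket may return fewer bytes than asked for:
#include <unistd.h>

// Keep calling read() until 'count' bytes have arrived, or fail.
// Returns 0 on success, -1 on error or if the peer closed the connection.
int read_fully(int fd, void *buf, size_t count) {
    char *p = static_cast<char *>(buf);
    while (count > 0) {
        ssize_t n = read(fd, p, count);
        if (n <= 0)                 // 0 means EOF, -1 means error
            return -1;
        p += n;
        count -= (size_t)n;
    }
    return 0;
}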

try pwrite:
int file = open(name, O_CREAT | O_WRONLY, 0666);
char buffer[4096];
int pagenumber;
for (int i = 0; i < page_number; i++) {
    read(socket, &pagenumber, sizeof(int));
    read(socket, buffer, sizeof(buffer));                 // read the whole page, not sizeof(int)
    pwrite(file, buffer, 4096, (off_t)pagenumber * 4096); // offset by page number, since pages may arrive out of order
}

Related

Knowing current compressed file size using gzwrite (zlib)

I'm using zlib for C++.
Quote from
http://refspecs.linuxbase.org/LSB_3.0.0/LSB-PDA/LSB-PDA/zlib-gzwrite-1.html regarding gzwrite function:
The gzwrite() function shall write data to the compressed file referenced by file, which shall have been opened in a write mode (see gzopen() and gzdopen()). On entry, buf shall point to a buffer containing len bytes of uncompressed data. The gzwrite() function shall compress this data and write it to file. The gzwrite() function shall return the number of uncompressed bytes actually written.
I interpret this as meaning that the return value will NOT tell me how much larger the file became when writing, only how much data was compressed into the file.
The only way to know how large the file is would then be to close it, and read the size from the file system. I have a requirement to only continue to write to the file until it reaches a certain size. Can this be achieved without closing the file?
A workaround would be to write until the uncompressed size reaches my limit and then close the file, read the size from file system and update my best guess of file size based on that, and then re-open the file and continue writing. This would make me close and open the file a few times towards the end (as I'm approaching the size limit).
Another workaround, which would give more of an estimate (which is not what I really want), would be to write until the uncompressed size reaches the limit, close the file, read the file size from the file system, and calculate the compression ratio so far. Then I can use this compression ratio to calculate a new limit for the uncompressed file size at which the compression should get me down to the limit for the compressed file size. If I repeat this, the estimate improves, but again, it is not what I'm looking for.
Are there better options?
The preferred option would be if zlib could tell me the compressed file size while the file is still open. I don't see why this information would not be available inside zlib at this point, since compression happens when I call gzwrite and not when I close the file.
zlib provides the function gzoffset(), which does exactly what you're asking.
If for some reason you are stuck with a version of zlib that is more than about eight years old, from before gzoffset() was added, then this is easy to do with gzdopen(). You open the output file with fopen() or open(), and provide the file descriptor (using fileno() and dup() if you used fopen()), and then provide that descriptor to gzdopen(). Then you can use ftell() or lseek() at any time to see how much has been written. Be careful not to double-close the descriptor. See the comments for gzdopen().
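A minimal sketch of the gzoffset() approach; the file name and the 10 MB limit are made up for illustration, and note that zlib buffers compressed output internally, so gzoffset() can lag slightly behind what you have handed to gzwrite() until the data is flushed:
#include <zlib.h>

int main() {
    const z_off_t limit = 10 * 1024 * 1024;     // illustrative size cap
    gzFile zf = gzopen("out.gz", "wb");
    if (!zf) return 1;

    char buf[4096] = {0};                       // stand-in for real data
    while (gzoffset(zf) < limit) {              // compressed bytes written so far
        if (gzwrite(zf, buf, sizeof(buf)) != (int)sizeof(buf))
            break;                              // write error
    }
    gzclose(zf);
    return 0;
}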
You can work around this issue by using a pipe. The idea is to write the compressed data into a pipe. After that, you read the data from the other end of the pipe, count it and write it to the actual file.
To set this up, you first need to open the file to write to via a simple open. Then create a pipe via pipe2 and initialize zlib by passing the write end of the pipe to gzdopen:
int out = open("/path/to/file", O_WRONLY | O_CREAT | O_TRUNC, 0644); // O_CREAT needs a mode argument
int p[2];
pipe2(p, O_NONBLOCK);
gzFile zFile = gzdopen(p[1], "w");   // p[1] is the write end of the pipe
You can now write the data first to the pipe and then splice it from the pipe to the out file:
gzwrite(zFile, buf, 1024); // or any other length
ssize_t bytesWritten = 0;
do {
    bytesWritten = splice(p[0], NULL, out, NULL, 1024, SPLICE_F_NONBLOCK | SPLICE_F_MORE); // p[0] is the read end
} while (bytesWritten == 1024);
As you can see, you now have bytesWritten telling you how much compressed data was actually written. Simply sum it up in another variable and stop splicing as soon as you have written as much data as you are allowed to. (Alternatively, splice in one go: write everything to the zFile, then splice once, passing the amount of data you may store as the fifth parameter. If you want to avoid compressing unnecessary data, do it in chunks as shown above.)
A note on splice: splice is Linux-specific and is basically just a very efficient copy. You can always replace it with a simple read-and-write combo, i.e. read data from p[0] into a buffer and then write the data from that buffer into out; splice is just faster and less code.
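A sketch of that portable read-and-write replacement (drain_pipe is a hypothetical name; it assumes the pipe was created with O_NONBLOCK as above):
#include <unistd.h>

// Copy everything currently sitting in the pipe into the output file.
// Returns the number of bytes copied; read() returning -1 with EAGAIN
// simply means the (non-blocking) pipe is empty.
ssize_t drain_pipe(int pipe_rd, int out_fd) {
    char buf[1024];
    ssize_t total = 0, n;
    while ((n = read(pipe_rd, buf, sizeof(buf))) > 0) {
        if (write(out_fd, buf, (size_t)n) != n)
            return -1;                          // short write or error
        total += n;
    }
    return total;
}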

Reading a Potentially incomplete File C++

I am writing a program to reformat a DNS log file for insertion to a database. There is a possibility that the line currently being written to in the log file is incomplete. If it is, I would like to discard it.
I started off believing that the eof function might be a good fit for my application; however, I noticed a lot of programmers advising against the use of the eof function. I have also noticed that the feof function seems quite similar.
Any suggestions/explanations that you guys could provide about the side effects of these functions would be most appreciated, as would any suggestions for more appropriate methods!
Edit: I currently am using the istream::peek function in order to skip over the last line, regardless of whether it is complete or not. While acceptable, a solution that determines whether the last line is complete would be preferred.
The specific comparison I'm using is: logFile.peek() != EOF
I would consider using
int fseek ( FILE * stream, long int offset, int origin );
with SEEK_END
and then
long int ftell ( FILE * stream );
to determine the number of bytes in the file, and therefore where it ends. I have found this to be more reliable for detecting the end of the file (in bytes).
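A sketch of that size check, assuming the file is already open as a FILE* (file_size is a hypothetical helper name):
#include <cstdio>

// Returns the file size in bytes, or -1 on error.
long file_size(FILE *stream) {
    if (fseek(stream, 0, SEEK_END) != 0)
        return -1;
    long size = ftell(stream);              // offset of the end == size
    fseek(stream, 0, SEEK_SET);             // restore the position for the caller
    return size;
}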
Could you detect an end-of-record/line (EOR) marker (CRLF, perhaps) in the last two or three bytes of the file? (3 bytes might be needed for CRLF^Z, depending on the file type.) This would verify that you have a complete last row:
fseek(stream, -2, SEEK_END);
char tail[2];
fread(tail, 1, 2, stream);   // then check tail for "\r\n"
If you try to open the file with exclusive locks, you can detect (by the failure to open) that the file is in use, and try again in a second...(or whenever)
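The answer does not name a locking API; on POSIX systems one possibility is a non-blocking exclusive flock(), as in this sketch (note that flock() locks are advisory, so the writer must also use them for the test to mean anything):
#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>

// Returns true if we could take an exclusive lock, i.e. no cooperating
// writer currently holds the file.
bool writer_is_done(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return false;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {  // fails if a writer holds the lock
        close(fd);
        return false;
    }
    flock(fd, LOCK_UN);
    close(fd);
    return true;
}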
If you need to capture the file contents as the file is being written, it's much easier if you eliminate as many layers of indirection and buffering between your logic and the actual bytes of data in the file.
Do not use C++ IO streams of any type - you have no real control over them. Don't use FILE*-based functions such as fopen() and fread() - those are buffered, and even if you disable buffering there are layers of code between your code and the data that, once again, you can't control and can't fully observe.
In a POSIX environment, you can use low-level C-style open() and read()/pread() calls. And use fstat() to know when the file contents have changed - you'll see the st_size member of the struct stat argument change after a call to fstat().
You'd open the file like this:
int logFileFD = open( "/some/file/name.log", O_RDONLY );
Inside a loop, you could do something like this (error checking and actual data processing omitted):
size_t lastSize = 0;
while ( !done )
{
    struct stat statBuf;
    fstat( logFileFD, &statBuf );
    if ( statBuf.st_size == lastSize )
    {
        sleep( 1 ); // or however long you want
        continue; // go to next loop iteration
    }
    // process new data - might need to keep some of the old data
    // around to handle lines that cross boundaries
    processNewContents( logFileFD, lastSize, statBuf.st_size );
    lastSize = statBuf.st_size; // remember how far we've processed
}
processNewContents() could look something like this:
void processNewContents( int fd, size_t start, size_t end )
{
    static char oldData[ BUFSIZE ];
    static char newData[ BUFSIZE ];
    // assumes the amount of new data will fit in newData...
    ssize_t bytesRead = pread( fd, newData, end - start, start ); // count comes before offset
    // process the data that was read here
    return;
}
You may also find that you need to add some code to close() then re-open() the file in case your application doesn't seem to be "seeing" data written to the file. I've seen that happen on some systems - the application somehow sees a cached copy of the file size somewhere while an ls run in another context gets the more accurate, updated size. If, for example, you know your log file is written to every 10-15 seconds, if you go 30 seconds without seeing any change to the file you know to try reopening the file.
You can also track the inode number in the struct stat results to catch log file rotation.
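A sketch of that rotation check, comparing the inode behind the open descriptor with the inode currently at the path (log_rotated is a hypothetical helper name):
#include <sys/stat.h>

// Returns true if the file at 'path' is no longer the one behind 'fd',
// which typically means the log was rotated and should be reopened.
bool log_rotated(int fd, const char *path) {
    struct stat open_st, path_st;
    if (fstat(fd, &open_st) != 0 || stat(path, &path_st) != 0)
        return true;                        // treat errors as "reopen"
    return open_st.st_ino != path_st.st_ino ||
           open_st.st_dev != path_st.st_dev;
}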
In a non-POSIX environment, you can replace open(), fstat() and pread() calls with the low-level OS equivalent, although Windows provides most of what you'd need. On Windows, lseek() followed by read() would replace pread().

Storing audio file into an array/stringstream C++

I would like to send the contents of an audio file to another system over the network using sockets. Both systems run the Windows operating system. Is there a tutorial on some way to store the audio contents in a C++ array or stringstream, so that it will be easier to send to a different node?
I basically want to know how to extract data bytes from an audio file.
The easiest thing to do is to simply send the data in chunks of bytes. If you are starting with an audio file, just open it like any other binary file with something like file = fopen(filename, "rb"); (where filename is the name of the audio file). Then enter a loop to read chunks of bytes until you reach the end of the file. Just use something like bytes_read = fread(buffer, sizeof(char), read_size, file); where buffer should be a char array of at least size read_size, which could be, say, 1024. After each fread, you can make your network send call, as sketched below. Alternately, you could read the whole file first and then send it chunk by chunk. Your call. Either way, when you reach the end of the file, send some sort of signal that you have reached the end. The receiving system should take these chunks and call fwrite to create a new audio file. You can either append each chunk as it comes in, or buffer it all until you reach the end and then write it all out.
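A sketch of that read-and-send loop; send_chunk stands in for whatever network send call you use (e.g. send() on a socket):
#include <cstdio>

// Placeholder for the actual network send call.
void send_chunk(const char * /*data*/, size_t /*len*/) { /* send over the socket */ }

void send_file(const char *filename) {
    const size_t read_size = 1024;
    char buffer[1024];
    FILE *file = fopen(filename, "rb");     // binary mode matters on Windows
    if (!file)
        return;
    size_t bytes_read;
    while ((bytes_read = fread(buffer, sizeof(char), read_size, file)) > 0)
        send_chunk(buffer, bytes_read);     // forward each chunk as it is read
    fclose(file);
    // ...then send an end-of-file marker of your choosing
}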
soundfile++ can be used if you have WAV files only. Check the readtest and writetest demo programs here: http://sig.sapp.org/doc/examples/soundfile/

ERROR_NOT_ENOUGH_MEMORY Error when writing INI using WritePrivateProfileString, after 200k calls

I'm making a simple DLL packet sniffer in C++ that hooks into an application and writes each received packet to an INI file. Unfortunately, after 20-30 minutes it crashes the main application.
When a packet is received, receivedPacket() is called. After 20+ minutes, the WriteCount value is around 150,000-200,000 and I start to get a C++ runtime error/crash. The GetLastError() code I get is 0x8, which is ERROR_NOT_ENOUGH_MEMORY, and WritePrivateProfileStringA() returns 0.
void writeToINI(LPCSTR iSec, LPCSTR iKey, int iVal){
    sprintf(inival, _T("%d"), iVal);
    WritePrivateProfileStringA(iSec, iKey, inival, iniloc);
    //sprintf(strc, _T("%d \n"), WriteCount);
    //WriteConsole(GetStdHandle(STD_OUTPUT_HANDLE), strc, strlen(strc), 0, 0);
    WriteCount++;
}
void receivedPacket(char *packet, WORD size){
    switch ( packet[2] )
    {
    case 0x30:
        // Size : 0x5F
        ID = *(signed char*)&packet[0x10];
        X = *(signed short*)&packet[0x20];
        Y = *(signed short*)&packet[0x22];
        Z = *(signed short*)&packet[0x24];
        sprintf(inisec, _T("PACKET_%d"), (ID+1));
        writeToINI(inisec, "id", ID);
        writeToINI(inisec, "x", X);
        writeToINI(inisec, "y", Y);
        writeToINI(inisec, "z", Z);
        break;
    [.....OTHER CASES.....]
    }
}
Thanks :)
WritePrivateProfileString() and GetPrivateProfileString() are very slow (because they parse the INI file on each call). Instead you can:
use one of the existing parsing libraries, but I am not sure about their memory efficiency or support for sequential writes
write your own sequential INI writer:
read the file (or only part by part, if it is too big)
find the section and key (if not found, create a new section at the end of the file, or find the insertion position if you want sorted sections), and save the file positions of the key and of the next key
change the value
save the file: the beginning of the original file up to the position of the key, then the changed key, then from the position of the next key to the end of the file. (If a new section was created at the end, you can simply append it to the original file. If packets rewrite the same ID often, you can pad each key with trailing whitespace, large enough to hold any value of the desired type, for example changing X=1---\n to X=100-\n where - stands for whitespace; then every key has a constant size and you can update only that part of the file.)
use a database, for example MySQL
write a binary file (the fastest solution; see the sketch below) and make a program to read the values, or to convert them to text
A little note: I used GetPrivateProfileString() a few years ago to read a settings file (about 1 KB in size); reading from HDD took 50 ms, reading from a USB flash disk took 1000 ms! After changing it (1. read the file into memory, 2. run my own parser) it ran in 1 ms on both HDD and USB.
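To illustrate the binary-file option, a minimal sketch; the record layout is made up for this example, and real code would need to worry about struct padding and endianness if the file is read on a different machine:
#include <cstdio>
#include <cstdint>

// One fixed-size record per packet; the field layout here is illustrative only.
struct PacketRecord {
    int32_t id;
    int16_t x, y, z;
};

// Appending a record is a single buffered write - no INI re-parsing per call.
void appendRecord(FILE *out, const PacketRecord &rec) {
    fwrite(&rec, sizeof(rec), 1, out);
}

// Usage: FILE *out = fopen("packets.bin", "ab"); then appendRecord(out, rec);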
Thanks for the reply guys, but it looks like the problem didn't come from WritePrivateProfileStringA() after all.
I just needed to allocate some extra size in malloc() for the hook.
:)

Is ftruncate() asynchronous?

I am attempting to write a class in C++ that provides a means of atomically appending to a file, even in the case of a power failure mid-write.
First, I write my current file position (a 64-bit offset from the beginning of the file, in bytes) to a separate journal file. Then, I write the requested data to the end of the data file. Finally, I call ftruncate() (setting the truncated size to 0) on the journal file.
The main idea is that if this class is ever asked to open a file that has a non-empty journal file, then you know a write was interrupted, and you can read the position of the last write from the journal file and fseek to that spot. You lose the last partial write, but the file should not be corrupted.
Unfortunately, it seems like ftruncate() is asynchronous. In practice, even if I call fflush() and fsync() after ftruncate, I see the journal grow to hundreds of bytes while doing lots of writes. It always ultimately ends up at 0, but I expected to see it at either size 0 or size 8 at all times.
Is it possible to make ftruncate completely synchronous? Or is there a better way to use the journal?
ftruncate() does not change your file descriptor's write offset in the file. If you are leaving the file open and writing the next entry after calling ftruncate(), then what's happening is that the file's offset is still increasing. When you write, the file is extended back out to that offset and your bytes are written there, which is why you see the journal grow.
Probably what you want to do is call lseek(fd, 0, SEEK_SET) after you call ftruncate() so that the next write to the file will take place at the beginning of the file.
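Putting both points together, a sketch of the journal update; journal_fd is the already-open descriptor of the journal file, and the 8-byte offset record matches the question's expectation of seeing the journal at size 0 or 8:
#include <unistd.h>
#include <cstdint>

// Record the data file's current end offset, overwriting the previous
// entry. Rewinding first keeps the journal at 8 bytes instead of growing.
void journal_begin(int journal_fd, uint64_t data_offset) {
    lseek(journal_fd, 0, SEEK_SET);         // back to the start each time
    write(journal_fd, &data_offset, sizeof(data_offset));
    fsync(journal_fd);                      // durable before the data write
}

// After the data write has been fsync'd, empty the journal to mark success.
void journal_commit(int journal_fd) {
    ftruncate(journal_fd, 0);               // empty journal == clean state
    lseek(journal_fd, 0, SEEK_SET);         // reset the offset for the next entry
    fsync(journal_fd);
}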