C++ using 7zip.dll - c++

I'm developing an app which will need to work with different types of archives, the more archive types the better. I have chosen 7zip.dll as the archive engine. But there is a problem: does anyone know how to uncompress a file from an archive to a memory buffer? As far as I can see, 7zip.dll only supports uncompressing to the hard disk. It would also be nice to load the archive from a memory buffer. Has anyone tried to do something like that?

Not sure if I completely understand your needs (for example, don't you need the decompressed file on disk?).
I was looking at LZMA SDK 9.20 and its lzma.txt readme file, and there are plenty of hints that decompression to memory is possible - you may just need to use the C API rather than the C++ interface. Check out, for example, the section called Single-call Decompressing:
When to use: RAM->RAM decompressing
Compile files: LzmaDec.h + LzmaDec.c + Types.h
Compile defines: no defines
Memory Requirements:
- Input buffer: compressed size
- Output buffer: uncompressed size
- LZMA Internal Structures: state_size (16 KB for default settings)
Also, there is this function:
SRes LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode, ELzmaStatus *status);
You can utilize these by memory-mapping the archive file. To the best of my knowledge, if your process creates a memory-mapped file with exclusive access (so no other process can access it) and does no explicit flushing, all changes to the file will be kept in memory until the mapping is destroyed or the file closed. Alternatively, you could just load the archive contents in memory.
For the sake of completeness, I hacked together several examples into a demo of using memory mapping in Windows.
#include <stdio.h>
#include <stdlib.h> // srand, rand
#include <string.h> // memcpy_s
#include <time.h>
#include <Windows.h>
#include <WinNT.h>
// This demo will limit the file to 4KiB
#define FILE_SIZE_MAX_LOWER_DW 4096
#define FILE_SIZE_MAX_UPPER_DW 0
#define MAP_OFFSET_LOWER_DW 0
#define MAP_OFFSET_UPPER_DW 0
#define TEST_ITERATIONS 1000
#define INT16_SIZE 2
typedef short int int16;
// NOTE: This will not work for Windows less than XP or 2003 Server!
int main()
{
HANDLE hFile, hFileMapping;
PBYTE mapViewStartAddress;
// Note: with no explicit security attributes, the process needs to have
// the necessary rights (e.g. read, write) to this location.
LPCSTR path = "C:\\Users\\mcmlxxxvi\\Desktop\\test.dat";
// First, open a file handle.
hFile = CreateFile(path,
GENERIC_READ | GENERIC_WRITE, // The file is created with Read/Write permissions
FILE_SHARE_READ, // Set this to 0 for exclusive access
NULL, // Optional security attributes
CREATE_ALWAYS, // File is created if not found, overwritten otherwise
FILE_ATTRIBUTE_TEMPORARY, // This affects the caching behaviour
0); // Attributes template, can be left NULL
if ((hFile) == INVALID_HANDLE_VALUE)
{
fprintf(stderr, "Unable to open file");
return 1;
}
// Then, create a memory mapping for the opened file.
hFileMapping = CreateFileMapping(hFile, // Handle for an opened file
NULL, // Optional security attributes
PAGE_READWRITE, // File can be mapped for Read/Write access
FILE_SIZE_MAX_UPPER_DW, // Maximum size, high-order DWORD (dwMaximumSizeHigh)
FILE_SIZE_MAX_LOWER_DW, // Maximum size, low-order DWORD (dwMaximumSizeLow)
NULL); // Optional name
if (hFileMapping == 0)
{
CloseHandle(hFile);
fprintf(stderr, "Unable to open file for mapping.");
return 1;
}
// Next, map a view (a contiguous portion of the file) to a memory region.
// The view's starting offset must be a multiple of the system allocation
// granularity (see GetSystemInfo; typically 64 KiB, which is larger than the page size).
mapViewStartAddress = (PBYTE)MapViewOfFile(hFileMapping, // Handle to a memory-mapped file
FILE_MAP_READ | FILE_MAP_WRITE, // Maps the view for Read/Write access
MAP_OFFSET_UPPER_DW, // Offset in the file from which
MAP_OFFSET_LOWER_DW, // the view starts, split in DWORDs.
FILE_SIZE_MAX_LOWER_DW); // Size of the view (here, entire file)
if (mapViewStartAddress == 0)
{
CloseHandle(hFileMapping);
CloseHandle(hFile);
fprintf(stderr, "Couldn't map a view of the file.");
return 1;
}
// This is where actual business stuff belongs.
// This example application does iterations of reading and writing
// random numbers for the entire length of the file.
int16 value;
errno_t result = 0;
srand((int)time(NULL));
for (int i = 0; i < TEST_ITERATIONS; i++)
{
// Write
for (int j = 0; j < FILE_SIZE_MAX_LOWER_DW / INT16_SIZE; j++)
{
value = rand();
result = memcpy_s(mapViewStartAddress + j * INT16_SIZE, INT16_SIZE, &value, INT16_SIZE);
if (result != 0)
{
CloseHandle(hFileMapping);
CloseHandle(hFile);
fprintf(stderr, "File write error during iteration #%d, error %d", i, result); // memcpy_s reports failure through its return value
return 1;
}
}
// Read
// No seek is needed: the view is ordinary memory, addressed by offset.
for (int j = 0; j < FILE_SIZE_MAX_LOWER_DW / INT16_SIZE; j++)
{
result = memcpy_s(&value, INT16_SIZE, mapViewStartAddress + j * INT16_SIZE, INT16_SIZE);
if (result != 0)
{
CloseHandle(hFileMapping);
CloseHandle(hFile);
fprintf(stderr, "File read error during iteration #%d, error %d", i, result); // memcpy_s reports failure through its return value
return 1;
}
}
}
// End business stuff
UnmapViewOfFile(mapViewStartAddress); // Release the view before closing the handles
CloseHandle(hFileMapping);
CloseHandle(hFile);
return 0;
}
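The alternative mentioned above, simply loading the archive contents into memory, needs no Windows-specific calls. Here is a minimal sketch using only the standard library. The helper name load_file is made up for this example, and it assumes the archive fits comfortably in RAM:

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read an entire file into a byte buffer.
// Throws std::runtime_error if the file cannot be opened.
std::vector<unsigned char> load_file(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);
    // istreambuf_iterator reads raw bytes without skipping whitespace.
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}
```

The buffer's data() and size() can then serve as the source pointer and length for a RAM-to-RAM decoder such as the LzmaDec_DecodeToBuf function quoted earlier.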

Related

How to use ReadFileScatter

I've been attempting to use ReadFileScatter today in my code (which sounds like exactly what I need), so far without a lot of luck. Searching the internet for what is going wrong hasn't given me much insight.
The documentation states:
The array must contain enough elements to store nNumberOfBytesToRead bytes of data, plus one element for the terminating NULL. For example, if there are 40 KB to be read and the page size is 4 KB, the array must have 11 elements that includes 10 for the data and one for the NULL.
Each buffer must be at least the size of a system memory page and must be aligned on a system memory page size boundary. The system reads one system memory page of data into each buffer.
The function stores the data in the buffers in sequential order. For example, it stores data into the first buffer, then into the second buffer, and so on until each buffer is filled and all the data is stored, or there are no more buffers.
So far I've been attempting to do just that. I allocate a bunch of bytes using VirtualAlloc (which ensures the page boundary constraint), add a terminating NULL to the list, ensure the data on disk is on a page boundary (and implicitly on a disk sector boundary) as well, and issue the call.
Without further ado, here's the minimum test case in C++:
// Setup: c:\tmp\test.dat is a file with at least 12K of stuff.
// I attempt to read page 2/3, e.g. offset [4096-4096+8192>
// TEST:
SYSTEM_INFO systemInfo;
GetSystemInfo(&systemInfo);
auto pageSize = systemInfo.dwPageSize;
std::cout << "Page size: "<< pageSize << std::endl;
// Allocate 3 pages; the first and third are used as the two read buffers:
auto buffer = reinterpret_cast<char*>(VirtualAlloc(NULL, pageSize * 3, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE));
// Create read buffer:
std::vector<FILE_SEGMENT_ELEMENT> elements;
{
FILE_SEGMENT_ELEMENT element1;
element1.Buffer = buffer;
elements.push_back(element1);
}
{
FILE_SEGMENT_ELEMENT element2;
element2.Buffer = buffer + pageSize * 2;
elements.push_back(element2);
}
{
FILE_SEGMENT_ELEMENT terminator;
terminator.Buffer = NULL;
elements.push_back(terminator);
}
// [..] Physical sector size is normally checked as well. In my case it's 512 bytes,
// so I guess that's irrelevant here.
//
// Open file:
auto fileHandle = CreateFile(
"c:\\tmp\\test.dat",
GENERIC_READ | GENERIC_WRITE,
FILE_SHARE_READ,
NULL,
OPEN_ALWAYS,
FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH,
NULL);
auto err = GetLastError();
if (err != ERROR_ALREADY_EXISTS && err != ERROR_SUCCESS)
{
throw std::exception(); // FIXME.
}
OVERLAPPED overlapped;
memset(&overlapped, 0, sizeof(OVERLAPPED));
LARGE_INTEGER tmp;
tmp.QuadPart = 4096; // Read from disk page 1
overlapped.Offset = tmp.LowPart;
overlapped.OffsetHigh = tmp.HighPart;
overlapped.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
auto success = ReadFileScatter(fileHandle, elements.data(), pageSize * 2, NULL, &overlapped);
err = GetLastError();
if (!success && err != ERROR_IO_PENDING && err != ERROR_SUCCESS)
{
throw std::exception(); // The call always ends up here with error 87: Invalid parameter
}
WaitForSingleObject(overlapped.hEvent, INFINITE);
std::cout << "Call succeeded!" << std::endl;
// FIXME: Proper exception handling.
// Clean up:
VirtualFree(buffer, 0, MEM_RELEASE); // MEM_RELEASE requires a size of 0 and cannot be combined with MEM_DECOMMIT
CloseHandle(overlapped.hEvent);
CloseHandle(fileHandle);
In the code, the error is noted with the comment // The call always ends up here with error 87: Invalid parameter. However, as far as I can see, I check all the boxes that they describe on MSDN... so...
What am I doing wrong here?
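The array-sizing rule quoted from the documentation above (one element per page of data, plus one for the terminating NULL) can be written down as a small helper. This is only a sketch: the function name is made up, and rounding a partial last page up to a whole element is my assumption rather than something the quoted text spells out.

```cpp
#include <cstddef>
#include <stdexcept>

// Number of FILE_SEGMENT_ELEMENT entries needed to read `bytes` bytes when
// each buffer holds one page: one entry per page plus the NULL terminator.
// Assumption: a partial last page still needs a full element.
std::size_t segment_element_count(std::size_t bytes, std::size_t page_size)
{
    if (page_size == 0)
        throw std::invalid_argument("page size must be non-zero");
    const std::size_t pages = (bytes + page_size - 1) / page_size;
    return pages + 1;
}
```

For the documentation's own example, 40 KB with a 4 KB page size gives 11 elements, and the two-page read in the test case above gives 3, which matches the three entries pushed into the vector.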

CreateFileMapping returns ERROR_INVALID_HANDLE

I am trying to use CreateFileMapping for the first time and it is giving me this error when I use GetLastError():
ERROR_INVALID_HANDLE: The handle is invalid.
Here is my code:
// create the name of our file-mapping object
nTry++; // Ensures a unique string is used in case user closes and reopens
wsprintfA(szName, FS6IPC_MSGNAME1 ":%X:%X", GetCurrentProcessId(), nTry);
// stuff the name into a global atom
m_atom = GlobalAddAtomA(szName);
if (m_atom == 0)
{ *pdwResult = ERR_ATOM;
return FALSE;
}
// create the file-mapping object
m_hMap = CreateFileMappingA(
(HANDLE)0xFFFFFFFF, // use system paging file
NULL, // security
PAGE_READWRITE, // protection
0, MAX_SIZE+256, // size
szName); //
EDIT:
The first issue was resolved, but now my program crashes somewhere else.
#define FS6IPC_MESSAGE_SUCCESS 1
#define FS6IPC_MESSAGE_FAILURE 0
// IPC message types
#define FS6IPC_READSTATEDATA_ID 1
#define FS6IPC_WRITESTATEDATA_ID 2
// read request structure
typedef struct tagFS6IPC_READSTATEDATA_HDR
{
DWORD dwId; // FS6IPC_READSTATEDATA_ID
DWORD dwOffset; // state table offset
DWORD nBytes; // number of bytes of state data to read
void* pDest; // destination buffer for data (client use only)
} FS6IPC_READSTATEDATA_HDR;
// write request structure
typedef struct tagFS6IPC_WRITESTATEDATA_HDR
{
DWORD dwId; // FS6IPC_WRITESTATEDATA_ID
DWORD dwOffset; // state table offset
DWORD nBytes; // number of bytes of state data to write
} FS6IPC_WRITESTATEDATA_HDR;
while (*pdw)
{ switch (*pdw)
{ case FS6IPC_READSTATEDATA_ID:
pHdrR = (FS6IPC_READSTATEDATA_HDR *) pdw;
m_pNext += sizeof(FS6IPC_READSTATEDATA_HDR);
if (pHdrR->pDest && pHdrR->nBytes)
CopyMemory(pHdrR->pDest, m_pNext, pHdrR->nBytes);
m_pNext += pHdrR->nBytes; // Debugger says the issue is here
break;
case FS6IPC_WRITESTATEDATA_ID:
// This is a write, so there's no returned data to store
pHdrW = (FS6IPC_WRITESTATEDATA_HDR *) pdw;
m_pNext += sizeof(FS6IPC_WRITESTATEDATA_HDR) + pHdrW->nBytes;
break;
default:
// Error! So terminate the scan
*pdw = 0;
break;
}
pdw = (DWORD *) m_pNext;
}
I'm guessing you're running on a 64-bit system, on which HANDLEs are 64 bits. The OS is quite right—the handle value 0x00000000FFFFFFFF is an invalid handle value for your process.
What exactly are you trying to do? If you want to create a file mapping backed by an actual file, pass in the handle for that file. If you want to create a file mapping backed by the paging file instead, pass in INVALID_HANDLE_VALUE. INVALID_HANDLE_VALUE happens to be (HANDLE)-1, which is 0xFFFFFFFF on 32-bit systems but 0xFFFFFFFFFFFFFFFF on 64-bit systems, but that doesn't really matter since you should just use the symbolic value INVALID_HANDLE_VALUE in any case.
If your application is crashing when you pass in INVALID_HANDLE_VALUE, it's not because the call to CreateFileMapping is failing, it's for some other reason, and you should debug that.
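To see why the hard-coded constant stops working, here is a portable sketch that mimics the two casts with a pointer-sized integer instead of a real HANDLE. The function names are invented for illustration, and std::uintptr_t only stands in for the handle type.

```cpp
#include <cstdint>

// Stand-in for (HANDLE)-1: every bit of a pointer-sized value is set.
constexpr std::uintptr_t all_bits_set()
{
    return static_cast<std::uintptr_t>(-1);
}

// Stand-in for (HANDLE)0xFFFFFFFF: the 32-bit constant is zero-extended
// to pointer width, so the upper half stays zero on a 64-bit target.
constexpr std::uintptr_t truncated_constant()
{
    return static_cast<std::uintptr_t>(0xFFFFFFFFu);
}

// True only where pointers are 32 bits wide.
constexpr bool hard_coded_constant_is_valid()
{
    return all_bits_set() == truncated_constant();
}
```

On a 64-bit build hard_coded_constant_is_valid() is false, which is the situation described above. Using the symbolic name avoids the question entirely.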

Unable to receive data from serial port

I am currently trying to write serial port communication code in VC++ to transfer data between a PC and a robot via an XBee transmitter. But after writing some commands to poll data from the robot, I don't receive anything from it (the output of filesize is 0 in the code). Since my MATLAB interface works, the problem should be in the code rather than in the hardware or the communication link. Could you please help me?
01/03/2014 Updated: I have updated my code. It still cannot receive any data from my robot (the output of read is 0). When I use "cout<<&read" in the while loop, I obtain "0041F01C1". I also don't know how to choose the size of the buffer, because I don't know the size of the data I will receive. In the code, I just give it an arbitrary size like 103. Please help me.
// This is the main DLL file.
#include "StdAfx.h"
#include <iostream>
#define WIN32_LEAN_AND_MEAN //for GetCommState command
#include "Windows.h"
#include <WinBase.h>
using namespace std;
int main(){
char init[]="";
HANDLE serialHandle;
// Open serial port
serialHandle = CreateFile("\\\\.\\COM8", GENERIC_READ | GENERIC_WRITE, 0, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
// Do some basic settings
DCB serialParams;
DWORD read, written;
serialParams.DCBlength = sizeof(serialParams);
if((GetCommState(serialHandle, &serialParams)==0))
{
printf("Get configuration port has a problem.");
return FALSE;
}
GetCommState(serialHandle, &serialParams);
serialParams.BaudRate = CBR_57600;
serialParams.ByteSize = 8;
serialParams.StopBits = ONESTOPBIT;
serialParams.Parity = NOPARITY;
//set flow control="hardware"
serialParams.fOutX=false;
serialParams.fInX=false;
serialParams.fOutxCtsFlow=true;
serialParams.fOutxDsrFlow=true;
serialParams.fDsrSensitivity=true;
serialParams.fRtsControl=RTS_CONTROL_HANDSHAKE;
serialParams.fDtrControl=DTR_CONTROL_HANDSHAKE;
if (!SetCommState(serialHandle, &serialParams))
{
printf("Set configuration port has a problem.");
return FALSE;
}
GetCommState(serialHandle, &serialParams);
// Set timeouts
COMMTIMEOUTS timeout = { 0 };
timeout.ReadIntervalTimeout = 30;
timeout.ReadTotalTimeoutConstant = 30;
timeout.ReadTotalTimeoutMultiplier = 30;
timeout.WriteTotalTimeoutConstant = 30;
timeout.WriteTotalTimeoutMultiplier = 30;
SetCommTimeouts(serialHandle, &timeout);
if (!SetCommTimeouts(serialHandle, &timeout))
{
printf("Set configuration port has a problem.");
return FALSE;
}
//write packet to poll data from robot
WriteFile(serialHandle,">*>p4",strlen(">*>p4"),&written,NULL);
//check whether the data can be received
char buffer[103];
do {
ReadFile (serialHandle,buffer,sizeof(buffer),&read,NULL);
cout << read;
} while (read!=0);
//buffer[read]="\0";
CloseHandle(serialHandle);
return 0;
}
GetFileSize is documented not to be valid when used with a serial port handle. Use the ReadFile function to receive serial port data.
You should use strlen instead of sizeof here:
WriteFile(serialHandle,init,strlen(init),&written,NULL)
You would be even better off creating a function like this:
void write_to_robot (const char * msg)
{
DWORD written;
BOOL ok = WriteFile(serialHandle, msg, strlen(msg), &written, NULL)
&& (written == strlen(msg));
if (!ok) printf ("Could not send message '%s' to robot\n", msg);
}
But that's only the appetizer. The main trouble is, as MSDN says:
You cannot use the GetFileSize function with a handle of a nonseeking device such as a pipe or a communications device.
If you want to read from the port, you can simply use ReadFile until it returns zero bytes.
If you already know the max size of your robot's response, try reading that many characters.
Continue reading until a read reports fewer bytes than the size of the buffer. For instance:
#define MAX_ROBOT_ANSWER_LENGTH 1000 /* bytes */
const char * read_robot_response ()
{
static char buffer[MAX_ROBOT_ANSWER_LENGTH];
DWORD read;
if (!ReadFile (serialHandle, buffer, sizeof(buffer), &read, NULL))
{
printf ("something wrong with the com port handle");
exit (-1);
}
if (read == sizeof(buffer))
{
// the robot response is bigger than it should be
printf ("this robot is overly talkative. Flushing input\n");
// read the rest of the input so that the next answer will not be
// polluted by leftovers of the previous one.
do {
ReadFile (serialHandle, buffer, sizeof(buffer), &read, NULL);
} while (read != 0);
// report error
return "error: robot response exceeds maximal length";
}
else
{
// add a terminator to string in case Mr Robot forgot to provide one
buffer[read] = '\0';
printf ("Mr Robot said '%s'\n", buffer);
return buffer;
}
}
This simplistic function returns a static variable, which will be overwritten each time you call read_robot_response.
Of course the proper way of doing things would be to use blocking I/Os instead of waiting one second and praying for the robot to answer in time, but that would require a lot more effort.
If you feel adventurous, you can use overlapped I/O, as this lengthy MSDN article thoroughly explores.
EDIT: after looking at your code
// this reads at most 103 bytes of the answer, and does not display them
if (!ReadFile(serialHandle,buffer,sizeof(buffer),&read,NULL))
{
printf("Reading data to port has a problem.");
return FALSE;
}
// this could display the length of the remaining of the answer,
// provided it is more than 103 bytes long
do {
ReadFile (serialHandle,buffer,sizeof(buffer),&read,NULL);
cout << read;
}
while (read!=0);
You are displaying nothing but the length of the response beyond the first 103 characters received.
This should do the trick:
#define BUFFER_LEN 1000
DWORD read;
char buffer [BUFFER_LEN];
do {
if (!ReadFile(
serialHandle, // handle
buffer, // where to put your characters
sizeof(buffer) // max nr of chars to read
-1, // leave space for terminator character
&read, // get the number of bytes actually read
NULL)) // lpOverlapped: NULL for plain synchronous I/O (yet another bloody stupid Microsoft parameter)
{
// die if something went wrong
printf("Reading data to port has a problem.");
return FALSE;
}
// add a terminator after last character read,
// so as to have a null terminated C string to display
buffer[read] = '\0';
// display what you actually read
cout << buffer;
}
while (read!=0);
I advised you to wrap the actual calls to serial port accesses inside simpler functions for a reason.
As I said before, Microsoft interfaces are a disaster. They are verbose, cumbersome and only moderately consistent. Using them directly leads to awkward and obfuscated code.
Here, for instance, you seem to have gotten confused between read and buffer:
read holds the number of bytes actually read from the serial port.
buffer holds the actual data.
buffer is what you will want to display to see what the robot answered.
Also, you should have documentation for your robot stating what kind of answers to expect. It would help to know how they are formatted, for instance whether they are null-terminated strings or not. That could spare you from adding the string terminator.

Libzip - read file contents from zip

I am using libzip to work with zip files and everything went fine until I needed to read a file from a zip.
I just need to read whole text files, so it would be great to have something like PHP's "file_get_contents" function.
To read a file from a zip there is the function "int zip_fread(struct zip_file *file, void *buf, zip_uint64_t nbytes)".
The main problem is that I don't know what size buf must be and how many nbytes I must read (I need to read the whole file, but the files have different sizes). I could make one big buffer to fit them all and read its full size, or loop until zip_fread returns -1, but I don't think either is a rational option.
You can try using zip_stat to get file size.
http://linux.die.net/man/3/zip_stat
I haven't used the libzip interface, but from what you write it looks very similar to a file interface: once you have a handle to the stream, you keep calling zip_fread() until it returns an error (or, possibly, fewer bytes than requested). The buffer you pass in is just a reasonably sized temporary buffer through which the data is communicated.
Personally I would probably create a stream buffer for this so once the file in the zip archive is set up it can be read using the conventional I/O stream methods. This would look something like this:
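#include <algorithm> // std::max
#include <streambuf>
#include <zip.h>
struct zipbuf: std::streambuf {
zipbuf(???): file_(???) {}
private:
zip_file* file_;
enum { s_size = 8192 };
char buffer_[s_size];
// Refill the get area from the zip entry; called when the buffer runs dry.
int_type underflow() override {
// zip_fread returns the number of bytes read, or -1 on error.
zip_int64_t rc = zip_fread(this->file_, this->buffer_, s_size);
this->setg(this->buffer_, this->buffer_,
this->buffer_ + std::max<zip_int64_t>(0, rc));
return this->gptr() == this->egptr()
? traits_type::eof()
: traits_type::to_int_type(*this->gptr());
}
};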
With this stream buffer you should be able to create an std::istream and read the file into whatever structure you need:
zipbuf buf(???);
std::istream in(&buf);
...
Obviously, this code isn't tested or compiled. However, when you replace the ??? with whatever is needed to open the zip file, I'd think this should pretty much work.
Here is a routine I wrote that extracts data from a zip-stream and prints out a line at a time. This uses zlib, not libzip, but if this code is useful to you, feel free to use it:
/*
 * Compile with the -lz option in order to link in the zlib library.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zlib.h>
#define Z_CHUNK 2097152
int unzipFile(const char *fName)
{
z_stream zStream;
char *zRemainderBuf = malloc(1);
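/* static: three 2 MiB arrays would overflow the default stack on many platforms */
static unsigned char zInBuf[Z_CHUNK];
static unsigned char zOutBuf[Z_CHUNK + 1]; /* +1 for the terminator written at zOutBuf[zHave] */
static char zLineBuf[Z_CHUNK];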
unsigned int zHave, zBufIdx, zBufOffset, zOutBufIdx;
int zError;
FILE *inFp = fopen(fName, "rbR");
if (!inFp) { fprintf(stderr, "could not open file: %s\n", fName); return EXIT_FAILURE; }
zStream.zalloc = Z_NULL;
zStream.zfree = Z_NULL;
zStream.opaque = Z_NULL;
zStream.avail_in = 0;
zStream.next_in = Z_NULL;
zError = inflateInit2(&zStream, (15+32)); /* cf. http://www.zlib.net/manual.html */
if (zError != Z_OK) { fprintf(stderr, "could not initialize z-stream\n"); return EXIT_FAILURE; }
*zRemainderBuf = '\0';
do {
zStream.avail_in = fread(zInBuf, 1, Z_CHUNK, inFp);
if (zStream.avail_in == 0)
break;
zStream.next_in = zInBuf;
do {
zStream.avail_out = Z_CHUNK;
zStream.next_out = zOutBuf;
zError = inflate(&zStream, Z_NO_FLUSH);
switch (zError) {
case Z_NEED_DICT: { fprintf(stderr, "Z-stream needs dictionary!\n"); return EXIT_FAILURE; }
case Z_DATA_ERROR: { fprintf(stderr, "Z-stream suffered data error!\n"); return EXIT_FAILURE; }
case Z_MEM_ERROR: { fprintf(stderr, "Z-stream suffered memory error!\n"); return EXIT_FAILURE; }
}
zHave = Z_CHUNK - zStream.avail_out;
zOutBuf[zHave] = '\0';
/* copy remainder buffer onto line buffer, if not NULL */
if (zRemainderBuf) {
strncpy(zLineBuf, zRemainderBuf, strlen(zRemainderBuf));
zBufOffset = strlen(zRemainderBuf);
}
else
zBufOffset = 0;
/* read through zOutBuf for newlines */
for (zBufIdx = zBufOffset, zOutBufIdx = 0; zOutBufIdx < zHave; zBufIdx++, zOutBufIdx++) {
zLineBuf[zBufIdx] = zOutBuf[zOutBufIdx];
if (zLineBuf[zBufIdx] == '\n') {
zLineBuf[zBufIdx] = '\0';
zBufIdx = -1;
fprintf(stdout, "%s\n", zLineBuf);
}
}
/* copy some of line buffer onto the remainder buffer, if there are remnants from the z-stream */
if (strlen(zLineBuf) > 0) {
if (strlen(zLineBuf) > strlen(zRemainderBuf)) {
/* to minimize the chance of doing another (expensive) malloc, we double the length of zRemainderBuf */
free(zRemainderBuf);
zRemainderBuf = malloc(strlen(zLineBuf) * 2);
}
strncpy(zRemainderBuf, zLineBuf, zBufIdx);
zRemainderBuf[zBufIdx] = '\0';
}
} while (zStream.avail_out == 0);
} while (zError != Z_STREAM_END);
/* close gzip stream */
zError = inflateEnd(&zStream);
if (zError != Z_OK) {
fprintf(stderr, "could not close z-stream!\n");
return EXIT_FAILURE;
}
if (zRemainderBuf)
free(zRemainderBuf);
fclose(inFp);
return EXIT_SUCCESS;
}
With any streaming you should consider the memory requirements of your app.
A large buffer is good for speed, but you do not want to tie up too much memory, depending on your RAM usage requirements. A small buffer requires more calls to your read and write operations, which are expensive in terms of time. So you need to find a buffer size between those two extremes.
Typically I use a size of 4096 (4 KB), which is sufficiently large for many purposes. If you want, you can go larger. But in the worst case, a size of 1 byte, you will be waiting a long time for your read to complete.
So to answer your question, there is no "right" size to pick. It is a choice you should make so that the speed of your app and the memory it requires are what you need.

How can I detect only deleted, changed, and created files on a volume?

I need to know if there is an easy way of detecting only the files that were deleted, modified or created on an NTFS volume.
I have written a program for offsite backup in C++. After the first backup, I check the archive bit of each file to see if there was any change made, and back up only the files that were changed. Also, it backs up from the VSS snapshot in order to prevent file locks.
This seems to work fine on most file systems, but for some with lots of files and directories, this process takes too long and often the backup takes more than a day to finish backing up.
I tried using the change journal to easily detect changes made on an NTFS volume, but the change journal shows a lot of records, most of them relating to small temporary files that are created and destroyed. Also, I could get the file name, the file reference number, and the parent file reference number, but I could not get the full file path. The parent file reference number is somehow supposed to give you the parent directory path.
EDIT: This needs to run every day, so at the beginning of every scan, it should record only the changes that took place since the last scan. Or at least there should be a way to ask for the changes since a given time and date.
You can enumerate all the files on a volume using FSCTL_ENUM_USN_DATA. This is a fast process (my tests returned better than 6000 records per second even on a very old machine, and 20000+ is more typical) and only includes files that currently exist.
The data returned includes the file flags as well as the USNs so you could check for changes whichever way you prefer.
You will still need to work out the full path for the files by matching the parent IDs with the file IDs of the directories. One approach would be to use a buffer large enough to hold all the file records simultaneously, and search through the records to find the matching parent for each file you need to back up. For large volumes you would probably need to process the directory records into a more efficient data structure, perhaps a hash table.
Alternately, you can read/reread the records for the parent directories as needed. This would be less efficient, but the performance might still be satisfactory depending on how many files are being backed up. Windows does appear to cache the data returned by FSCTL_ENUM_USN_DATA.
This program searches the C volume for files named test.txt and returns information about any files found, as well as about their parent directories.
#include <Windows.h>
#include <stdio.h>
#define BUFFER_SIZE (1024 * 1024)
HANDLE drive;
USN maxusn;
void show_record (USN_RECORD * record)
{
void * buffer;
MFT_ENUM_DATA mft_enum_data;
DWORD bytecount = 1;
USN_RECORD * parent_record;
WCHAR * filename;
WCHAR * filenameend;
printf("=================================================================\n");
printf("RecordLength: %u\n", record->RecordLength);
printf("MajorVersion: %u\n", (DWORD)record->MajorVersion);
printf("MinorVersion: %u\n", (DWORD)record->MinorVersion);
printf("FileReferenceNumber: %lu\n", record->FileReferenceNumber);
printf("ParentFRN: %lu\n", record->ParentFileReferenceNumber);
printf("USN: %lu\n", record->Usn);
printf("Timestamp: %lu\n", record->TimeStamp);
printf("Reason: %u\n", record->Reason);
printf("SourceInfo: %u\n", record->SourceInfo);
printf("SecurityId: %u\n", record->SecurityId);
printf("FileAttributes: %x\n", record->FileAttributes);
printf("FileNameLength: %u\n", (DWORD)record->FileNameLength);
filename = (WCHAR *)(((BYTE *)record) + record->FileNameOffset);
filenameend= (WCHAR *)(((BYTE *)record) + record->FileNameOffset + record->FileNameLength);
printf("FileName: %.*ls\n", (int)(filenameend - filename), filename); // the precision argument for %.* must be an int
buffer = VirtualAlloc(NULL, BUFFER_SIZE, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
if (buffer == NULL)
{
printf("VirtualAlloc: %u\n", GetLastError());
return;
}
mft_enum_data.StartFileReferenceNumber = record->ParentFileReferenceNumber;
mft_enum_data.LowUsn = 0;
mft_enum_data.HighUsn = maxusn;
if (!DeviceIoControl(drive, FSCTL_ENUM_USN_DATA, &mft_enum_data, sizeof(mft_enum_data), buffer, BUFFER_SIZE, &bytecount, NULL))
{
printf("FSCTL_ENUM_USN_DATA (show_record): %u\n", GetLastError());
VirtualFree(buffer, 0, MEM_RELEASE);
return;
}
parent_record = (USN_RECORD *)((USN *)buffer + 1);
if (parent_record->FileReferenceNumber != record->ParentFileReferenceNumber)
{
printf("=================================================================\n");
printf("Couldn't retrieve FileReferenceNumber %u\n", record->ParentFileReferenceNumber);
VirtualFree(buffer, 0, MEM_RELEASE);
return;
}
show_record(parent_record);
// Release this call's buffer only after the recursion has finished with parent_record.
VirtualFree(buffer, 0, MEM_RELEASE);
}
void check_record(USN_RECORD * record)
{
WCHAR * filename;
WCHAR * filenameend;
filename = (WCHAR *)(((BYTE *)record) + record->FileNameOffset);
filenameend= (WCHAR *)(((BYTE *)record) + record->FileNameOffset + record->FileNameLength);
if (filenameend - filename != 8) return;
if (wcsncmp(filename, L"test.txt", 8) != 0) return;
show_record(record);
}
int main(int argc, char ** argv)
{
MFT_ENUM_DATA mft_enum_data;
DWORD bytecount = 1;
void * buffer;
USN_RECORD * record;
USN_RECORD * recordend;
USN_JOURNAL_DATA * journal;
DWORDLONG nextid;
DWORDLONG filecount = 0;
DWORD starttick, endtick;
starttick = GetTickCount();
printf("Allocating memory.\n");
buffer = VirtualAlloc(NULL, BUFFER_SIZE, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
if (buffer == NULL)
{
printf("VirtualAlloc: %u\n", GetLastError());
return 0;
}
printf("Opening volume.\n");
drive = CreateFileW(L"\\\\?\\c:", GENERIC_READ, FILE_SHARE_DELETE | FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, NULL); // wide-string path needs the W variant; volumes are opened with OPEN_EXISTING
if (drive == INVALID_HANDLE_VALUE)
{
printf("CreateFile: %u\n", GetLastError());
return 0;
}
printf("Calling FSCTL_QUERY_USN_JOURNAL\n");
if (!DeviceIoControl(drive, FSCTL_QUERY_USN_JOURNAL, NULL, 0, buffer, BUFFER_SIZE, &bytecount, NULL))
{
printf("FSCTL_QUERY_USN_JOURNAL: %u\n", GetLastError());
return 0;
}
journal = (USN_JOURNAL_DATA *)buffer;
printf("UsnJournalID: %lu\n", journal->UsnJournalID);
printf("FirstUsn: %lu\n", journal->FirstUsn);
printf("NextUsn: %lu\n", journal->NextUsn);
printf("LowestValidUsn: %lu\n", journal->LowestValidUsn);
printf("MaxUsn: %lu\n", journal->MaxUsn);
printf("MaximumSize: %lu\n", journal->MaximumSize);
printf("AllocationDelta: %lu\n", journal->AllocationDelta);
maxusn = journal->MaxUsn;
mft_enum_data.StartFileReferenceNumber = 0;
mft_enum_data.LowUsn = 0;
mft_enum_data.HighUsn = maxusn;
for (;;)
{
// printf("=================================================================\n");
// printf("Calling FSCTL_ENUM_USN_DATA\n");
if (!DeviceIoControl(drive, FSCTL_ENUM_USN_DATA, &mft_enum_data, sizeof(mft_enum_data), buffer, BUFFER_SIZE, &bytecount, NULL))
{
printf("=================================================================\n");
printf("FSCTL_ENUM_USN_DATA: %u\n", GetLastError());
printf("Final ID: %lu\n", nextid);
printf("File count: %lu\n", filecount);
endtick = GetTickCount();
printf("Ticks: %u\n", endtick - starttick);
return 0;
}
// printf("Bytes returned: %u\n", bytecount);
nextid = *((DWORDLONG *)buffer);
// printf("Next ID: %lu\n", nextid);
record = (USN_RECORD *)((USN *)buffer + 1);
recordend = (USN_RECORD *)(((BYTE *)buffer) + bytecount);
while (record < recordend)
{
filecount++;
check_record(record);
record = (USN_RECORD *)(((BYTE *)record) + record->RecordLength);
}
mft_enum_data.StartFileReferenceNumber = nextid;
}
}
Additional notes
As discussed in the comments, you may need to replace MFT_ENUM_DATA with MFT_ENUM_DATA_V0 on versions of Windows later than Windows 7. (This may also depend on what compiler and SDK you are using.)
I'm printing the 64-bit file reference numbers as if they were 32-bit. That was just a mistake on my part. Probably in production code you won't be printing them anyway, but FYI.
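For reference, a portable way to print such 64-bit values without truncation is the PRIu64 macro from <cinttypes>. The helper below is a sketch with an invented name; it formats into a string rather than printing so that the result is easy to check. On Windows, long is 32 bits wide, which is why %lu cuts the value short there.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Format a 64-bit value such as a file reference number without truncation.
// PRIu64 expands to the correct printf length modifier for the platform.
std::string format_u64(std::uint64_t value)
{
    char text[32];
    std::snprintf(text, sizeof(text), "%" PRIu64, value);
    return text;
}
```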
The change journal is your best bet. You can use the file reference numbers to match file creation/deletion pairs and thus ignore temporary files, without having to process them any further.
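A rough sketch of that filtering step: group the journal records by file reference number and drop every file that was both created and deleted within the scan window. The JournalEvent type and its Change enum are simplified stand-ins invented for this example, not the real USN_RECORD layout or reason flags.

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Simplified, hypothetical view of one change-journal record.
enum class Change { Created, Modified, Deleted };
struct JournalEvent {
    std::uint64_t file_ref;  // file reference number
    Change change;
};

// What has been observed for one file during the scan window.
struct Seen {
    bool created = false;
    bool deleted = false;
};

// Return the file reference numbers that still matter after the scan:
// anything created and deleted within the same window is ignored.
std::unordered_set<std::uint64_t> files_needing_attention(const std::vector<JournalEvent>& events)
{
    std::unordered_map<std::uint64_t, Seen> seen;
    for (const JournalEvent& e : events) {
        Seen& s = seen[e.file_ref];
        if (e.change == Change::Created) s.created = true;
        if (e.change == Change::Deleted) s.deleted = true;
    }
    std::unordered_set<std::uint64_t> result;
    for (const auto& entry : seen) {
        const bool temporary = entry.second.created && entry.second.deleted;
        if (!temporary)
            result.insert(entry.first);
    }
    return result;
}
```

A file that existed before the window and was deleted during it is kept, since the backup needs to know about the deletion.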
I think you have to scan the Master File Table to make sense of ParentFileReferenceNumber. Of course you only need to keep track of directories when doing this, and use a data structure that will allow you to quickly lookup the information, so you only need to scan the MFT once.
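Such a lookup structure could be a hash table keyed by the directory's file reference number. The sketch below assumes the table has already been filled from one pass over the volume. DirEntry, DirTable and build_path are invented names, and narrow strings are used for brevity even though the real records carry wide-character names.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical directory entry, filled from one pass over the enumeration.
struct DirEntry {
    std::uint64_t parent_ref;  // parent file reference number
    std::string name;          // directory name
};
using DirTable = std::unordered_map<std::uint64_t, DirEntry>;

// Walk parent links upward from `parent_ref` until reaching `root_ref`,
// then join the names into a path. A leading "?" marks a partial path.
std::string build_path(const DirTable& dirs, std::uint64_t root_ref,
                       std::uint64_t parent_ref, const std::string& file_name)
{
    std::string path = file_name;
    std::uint64_t current = parent_ref;
    // The step limit guards against a corrupt table containing a cycle.
    for (int steps = 0; current != root_ref && steps < 1024; ++steps) {
        const auto it = dirs.find(current);
        if (it == dirs.end())
            return "?\\" + path;  // parent not in the table
        path = it->second.name + "\\" + path;
        current = it->second.parent_ref;
    }
    if (current != root_ref)
        return "?\\" + path;      // gave up before reaching the root
    return "\\" + path;
}
```

Each lookup is a constant-time hash probe on average, so resolving a path costs one probe per directory level.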
You can use ReadDirectoryChangesW and the related Windows APIs.
I know how to achieve this in Java; it may help if you can call Java code from your C++ program.
In Java you can achieve this using the JNotify API. It also watches for changes in subdirectories.