I am trying to read a binary file's data. Sadly, opening files in C++ is quite different from Python, which has a byte mode for this. It seems C++ does not have that.
for (auto p = directory_iterator(path); p != directory_iterator(); p++) {
if (!is_directory(p->path()))
{
byte tmpdata;
std::ifstream tmpreader;
tmpreader.open(desfile, std::ios_base::binary);
int currentByte = tmpreader.get();
while (currentByte >= 0)
{
//std::cout << "Does this get Called?" << std::endl;
int currentByte = tmpreader.get();
tmpdata = currentByte;
}
tmpreader.close();
}
else
{
continue;
}
Basically, I want a clone of Python's way of opening a file in 'rb' mode, so that I have the actual byte data of all of the contents (which is not readable, as it contains nonprintable chars, even in C++). Most of it probably can't be converted to signed chars, because it contains zlib-compressed data that I need to feed to my DLL to decompress.
I do know that in Python I can do something like this:
file_object = open('[file here]', 'rb')
It turns out that replacing the C++ code above with the following helps. However, fopen is reported as deprecated, but I don't care.
The code above did not work because I was not reading the data into a buffer. I realized later that fopen, fseek, fread, and fclose were the functions I needed for read-bytes mode ('rb').
for (auto p = directory_iterator(path); p != directory_iterator(); p++) {
if (!is_directory(p->path()))
{
std::string desfile = p->path().filename().string();
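unsigned char* data2 = nullptr;
FILE *fp = fopen("data.d", "rb");
if (fp == NULL)
{
continue; // could not open the file, skip it
}
fseek(fp, 0, SEEK_END); // GO TO END OF FILE
size_t size = ftell(fp);
fseek(fp, 0, SEEK_SET); // GO BACK TO START
data2 = new unsigned char[size];
size_t bytesRead = fread(data2, 1, size, fp); // number of bytes actually read
fclose(fp);
// data2 now holds bytesRead bytes; release it with delete[] data2 when done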
}
else
{
continue;
}
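For comparison, a stream-based way to get the same result as Python's open(..., 'rb').read() might look like the sketch below. The helper name readAllBytes is made up for illustration; std::ios::binary is what switches off newline translation, and the byte values themselves are never altered, so zlib-compressed data survives intact.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: returns every byte of the file, or an empty
// vector if the file could not be opened.
std::vector<unsigned char> readAllBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);  // binary: no newline translation
    if (!in)
        return {};
    // istreambuf_iterator reads raw characters and does not skip whitespace.
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}
```

The returned vector's data() and size() can then be handed to a decompression routine that expects a pointer and a length.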
int currentByte = tmpreader.get();
while (currentByte >= 0)
{
//std::cout << "Does this get Called?" << std::endl;
int currentByte = tmpreader.get();
//^ here!
You are declaring a second variable hiding the outer one. However, this inner one is only valid within the while loop's body, so the while condition checks the outer variable which is not modified any more. Rather do it this way:
int currentByte;
while ((currentByte = tmpreader.get()) >= 0)
{
void demodlg::printData(short* data)
{
FILE* pF;
char buf[50];
snprintf(buf, sizeof(buf), "%s\\%s\\%s%d.binary", "test", "data", "data", frameNum++);
pF = fopen(buf, "wb");
int lines = frameDescr->m_numLines;
int samples = frameDescr->m_pLineTypeDescr[0].m_numSamples;
int l, s;
fprintf(pF, "\t");
for (l = 0; l < lines; l++)
{
fprintf(pF, "%d\t", l);
}
fprintf(pF, "\n");
for (s = 0; s < samples; s++)
{
fprintf(pF, "%d)\t", s);
for (l = 0; l < lines; l++)
{
fprintf(pF, "%d\t", *(data + l * samples + s));
}
fprintf(pF, "\n");
}
fclose(pF);
}
I have the code snippet above which just takes in some data and then writes it out to a binary file. This function gets called about 20-30 times per second, so I'm trying to optimize it as much as possible. Each file that it writes to is about 1 MB in size. Ideally, I'd be able to write 20-30 MB per second. As of now, it's not at that rate.
Does anyone have any ideas on how I can optimize this further?
I was originally writing to a txt file before changing to a binary file, but the difference isn't very noticeable, surprisingly.
Also, frameDescr gets updated for every frame so I believe I do need to get access to the lines and samples variables from inside, unfortunately.
I found this post to refer to (Writing a binary file in C++ very fast) but I'm not sure how I can apply it to mine.
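One way the idea from that post might be applied here is to stop formatting each sample with fprintf and instead write the whole frame with a single fwrite. The sketch below is only an illustration under assumptions: the function name writeFrameBinary and the two-field header (lines, then samples) are invented, and the resulting file is raw binary rather than the tab-separated table that printData produces. Whether it reaches 20-30 MB per second depends on the disk and has not been measured.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: writes a small header followed by all samples
// in one call. Returns false if the file could not be written.
bool writeFrameBinary(const char* path, const short* data,
                      std::uint32_t lines, std::uint32_t samples)
{
    FILE* f = std::fopen(path, "wb");
    if (!f)
        return false;
    const std::size_t count = static_cast<std::size_t>(lines) * samples;
    // Header so a reader knows the frame dimensions (assumed layout).
    bool ok = std::fwrite(&lines, sizeof(lines), 1, f) == 1
           && std::fwrite(&samples, sizeof(samples), 1, f) == 1
           // One bulk write instead of lines * samples fprintf calls.
           && std::fwrite(data, sizeof(short), count, f) == count;
    ok = (std::fclose(f) == 0) && ok;
    return ok;
}
```

Converting such a file back to the text layout can be done offline, outside the per-frame path.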
Here is a short example of how I would write an array of data to a binary file and how I would read it back.
I do not understand the concept or purpose of lines in your code, so I did not attempt to replicate it. If you have additional data that must be written so the content can be reconstructed when read, I have placed comments where you could insert that code.
Keep in mind that data written as binary must be read back the same way. If you were writing text in a particular format for another program to consume, a binary file will not work unless you modify that program or add a step that reads the binary data and writes the text format before consumption.
Assuming there is a speed advantage to writing the data as binary, adding such a conversion step is still beneficial, because you can do it offline when you're not trying to maintain a particular frame rate.
Normally, since you tagged this C++, I would prefer manipulating the data in a vector and perhaps using C++ streams to write and read it, but I tried to keep this as similar to your code as possible.
#include <cstdio>
#include <stdint.h>
const size_t kNumEntries = 128 * 1024;
void writeData(const char *filename, int16_t *data, size_t numEntries)
{
FILE *f = fopen(filename, "wb");
if (!f)
{
fprintf(stderr, "Error opening file: '%s'\n", filename);
return;
}
//If you have additional data that must be in the file, write it here,
//either as individual items that are mirrored in the reader,
//or using the pattern shown below for variable-sized data.
//Write the number of entries to the file first so the reader
//will know how much memory to allocate and how many entries to read.
fwrite(&numEntries, sizeof(numEntries), 1, f);
//Write the actual data
fwrite(data, sizeof(*data), numEntries, f);
fclose(f);
}
int16_t* readData(const char *filename)
{
FILE *f = fopen(filename, "rb");
if (!f)
{
fprintf(stderr, "Error opening file: '%s'\n", filename);
return 0;
}
//If you have additional data to read, do it here.
//This code should mirror the writing function.
//Read the number of entries in the file.
size_t numEntries;
if (fread(&numEntries, sizeof(numEntries), 1, f) != 1)
{
fclose(f);
return 0;
}
//Allocate memory for the entries and read them into it.
//new[] takes an element count, not a byte count.
int16_t *data = new int16_t[numEntries];
fread(data, sizeof(*data), numEntries, f);
fclose(f);
return data;
}
int main()
{
//new[] takes an element count, so no sizeof factor is needed.
int16_t *dataToWrite = new int16_t[kNumEntries];
for (size_t i = 0; i < kNumEntries; ++i)
{
dataToWrite[i] = static_cast<int16_t>(i);
}
writeData("test.bin", dataToWrite, kNumEntries);
//readData allocates the buffer it returns, so nothing is allocated here.
int16_t *dataRead = readData("test.bin");
for (int i = 0; i < kNumEntries; ++i)
{
if (dataToWrite[i] != dataRead[i])
{
fprintf(stderr,
"Data mismatch at entry %d, : dataToWrite = %d, dataRead = %d\n",
i, dataToWrite[i], dataRead[i]);
}
}
delete[] dataToWrite;
delete[] dataRead;
return 0;
}
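The vector-and-streams variant mentioned above could look roughly like the sketch below. The names writeSamples and readSamples are invented for illustration, and the count is stored as a fixed-width 64-bit value (an assumption, chosen so the file layout does not depend on the platform's size_t). Byte order is still whatever the writing machine uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes the element count, then the raw samples.
bool writeSamples(const std::string& filename, const std::vector<std::int16_t>& data)
{
    std::ofstream out(filename, std::ios::binary);
    if (!out)
        return false;
    const std::uint64_t count = data.size();
    out.write(reinterpret_cast<const char*>(&count), sizeof(count));
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size() * sizeof(std::int16_t)));
    return static_cast<bool>(out);
}

// Hypothetical helper: mirrors writeSamples; returns an empty vector on failure.
// A corrupt count could request a very large allocation; a real reader
// would want to sanity-check it against the file size first.
std::vector<std::int16_t> readSamples(const std::string& filename)
{
    std::ifstream in(filename, std::ios::binary);
    std::uint64_t count = 0;
    if (!in || !in.read(reinterpret_cast<char*>(&count), sizeof(count)))
        return {};
    std::vector<std::int16_t> data(static_cast<std::size_t>(count));
    if (!in.read(reinterpret_cast<char*>(data.data()),
                 static_cast<std::streamsize>(data.size() * sizeof(std::int16_t))))
        return {};
    return data;
}
```

Because the vector owns its memory, there is no delete[] to forget on any return path.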
I have a server that sends raw binary data to print a "map" that a user must traverse. However, I am having trouble clearing out my buffer after each line is read, and so I keep getting residual data printed at the end of the shorter lines. In the screenshot below you can see my output on the left, and what the output should be on the right. What is the best way to solve this? I feel like I am missing something but can't seem to find a solution.
And the code that is reading/printing this is below:
char* mapData = NULL;
string command = "command> ";
size_t dataSize = 0;
while(mapData != command.c_str()) {
unsigned char* buffer = (unsigned char*) &dataSize;
connection = read(mySocket, buffer, 8);
if(connection == -1 || connection < 0) {
cerr << "**Error: could not read text size" << endl;
return 1;
}
mapData = (char*)malloc(dataSize);
buffer = (unsigned char*) mapData;
while((connection = read(mySocket, buffer, dataSize)) != -1) {
if(connection == -1 || connection < 0) {
cerr << "**Error: could not read text size" << endl;
return 1;
}
if(dataSize != 1) {
cout << buffer;
}
free(buffer);
buffer = NULL;
}
}
You are ignoring the return value of read(), which tells you how many bytes are actually in the buffer.
read() returns the actual number of bytes that were read, which may be fewer than you requested. So you need to call read() in a loop until you have read all of the bytes you are expecting, e.g.:
int readAll(int sock, void *buffer, size_t buflen)
{
unsigned char* pbuf = reinterpret_cast<unsigned char*>(buffer);
while (buflen > 0) {
int numRead = read(sock, pbuf, buflen);
if (numRead < 0) return -1;
if (numRead == 0) return 0;
pbuf += numRead;
buflen -= numRead;
}
return 1;
}
Also, after reading the buffer, you are treating it as if it were null-terminated, but it is not, which is why you get extra garbage in your output.
More importantly, mapData != command.c_str() will ALWAYS be true, so your while loop iterates indefinitely (until a socket error occurs), which is not what you want. You want the loop to end when you receive a "command> " string instead.
mapData is initially NULL, and c_str() NEVER returns NULL, so the loop ALWAYS iterates at least once.
Then you allocate and free mapData but don't reset it to NULL, so it is left pointing at invalid memory. Which doesn't really matter, since your while loop is just comparing pointers. c_str() will NEVER return a pointer to memory that mapData ever points to.
To end your loop correctly, you need to compare the contents of mapData after reading, not compare its memory address.
Try this instead:
char *mapData = NULL;
uint64_t dataSize = 0;
const string command = "command> ";
bool keepLooping = true;
do {
if (readAll(mySocket, &dataSize, sizeof(dataSize)) <= 0) {
cerr << "**Error: could not read text size" << endl;
return 1;
}
if (dataSize == 0)
continue;
mapData = new char[dataSize];
if (readAll(mySocket, mapData, dataSize) <= 0) {
cerr << "**Error: could not read text" << endl;
delete[] mapData;
return 1;
}
cout.write(mapData, dataSize);
keepLooping = (dataSize != command.size()) || (strncmp(mapData, command.c_str(), command.size()) != 0);
delete[] mapData;
}
while (keepLooping);
Alternatively:
string mapData;
uint64_t dataSize = 0;
const string command = "command> ";
do {
if (readAll(mySocket, &dataSize, sizeof(dataSize)) <= 0) {
cerr << "**Error: could not read text size" << endl;
return 1;
}
mapData.resize(dataSize);
if (dataSize > 0) {
if (readAll(mySocket, &mapData[0], dataSize) <= 0) {
cerr << "**Error: could not read text" << endl;
return 1;
}
cout << mapData;
}
}
while (mapData != command);
As @eozd pointed out, calling malloc and free in your loop is a bad idea since you use return statements. Your code may leak memory. You should ensure you call free before each return. Even better, you could declare your buffer outside of the while loop, use break instead of return, and call free if there was an error.
Looking at your solution, it seems that the communication protocol involves sending data size first, followed by the actual data. How is data size written to the wire? You may need to convert it from network byte order.
To debug, you could print out the value of dataSize before every read to make sure that it is what you expect.
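If the size does turn out to be sent in network byte order (big-endian), one portable way to decode it is to assemble the value byte by byte instead of reading straight into an integer. This is only a sketch: the function name decodeBigEndian64 is invented, and the 8-byte big-endian layout is an assumption about the protocol that would need to be checked against the server.

```cpp
#include <cstdint>

// Hypothetical helper: interprets 8 bytes as a big-endian unsigned integer.
// Works the same on little-endian and big-endian hosts.
std::uint64_t decodeBigEndian64(const unsigned char* bytes)
{
    std::uint64_t value = 0;
    for (int i = 0; i < 8; ++i)
        value = (value << 8) | bytes[i];  // most significant byte first
    return value;
}
```

The 8 size bytes would be read into an unsigned char array first and then passed through this function before being used as a length.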
You should clear the buffer too. Add:
memset(mapData, 0, dataSize);
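after the malloc. Note that this only leaves a terminating null if the buffer is allocated with one extra byte (dataSize + 1); otherwise a full read overwrites every zeroed byte.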
I'm trying to read a binary file and store it in a buffer. The problem is that the binary file contains multiple null characters ('\0'), and they are not at the end; they come before other binary data, so whatever I store after a '\0' seems to vanish from the buffer.
Example:
char * a = "this is a\0 test";
cout << a;
This will just output: this is a
Here's my real code.
This function reads one character:
bool CStream::Read (int * _OutChar)
{
if (!bInitialized)
return false;
int iReturn = 0;
*_OutChar = fgetc (pFile);
if (*_OutChar == EOF)
return false;
return true;
}
And this is how I use it:
char * SendData = new char[4096 + 1];
for (i = 0; i < 4096; i++)
{
if (Stream.Read (&iChar))
SendData[i] = iChar;
else
break;
}
I just want to mention that there is a standard way to read from a binary file into a buffer.
Using <cstdio>:
char buffer[BUFFERSIZE];
FILE * filp = fopen("filename.bin", "rb");
size_t bytes_read = fread(buffer, sizeof(char), BUFFERSIZE, filp);
Using <fstream>:
std::ifstream fin("filename.bin", std::ios::in | std::ios::binary);
fin.read(buffer, BUFFERSIZE);
What you do with the buffer afterwards is all up to you of course.
Edit: Full example using <cstdio>
#include <cstdio>
#include <cstddef>
#include <utility> // std::swap
const int BUFFERSIZE = 4096;
int main() {
const char * fname = "filename.bin";
FILE* filp = fopen(fname, "rb" );
if (!filp) { printf("Error: could not open file %s\n", fname); return -1; }
char * buffer = new char[BUFFERSIZE];
size_t bytes; // declared outside: a declaration cannot appear inside the parenthesized condition
while ( (bytes = fread(buffer, sizeof(char), BUFFERSIZE, filp)) > 0 ) {
// Do something with the first `bytes` elements of buffer.
// For example, reverse the data in place and forget about it afterwards!
// end starts at the last valid byte, not one past it.
for (char *beg = buffer, *end = buffer + bytes - 1; beg < end; beg++, end-- ) {
std::swap(*beg, *end);
}
}
// Done: release the buffer and close the file.
delete[] buffer;
fclose(filp);
return 0;
}
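A comparable chunked loop using <fstream> might look like the sketch below. The function name countBytesInChunks and the chunk size are assumptions made for illustration; the point it shows is that gcount() reports how many bytes the last read() actually delivered, which matters for the final, shorter chunk.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads the file in fixed-size chunks and returns
// the total number of bytes seen (0 if the file could not be opened).
std::size_t countBytesInChunks(const std::string& path)
{
    const std::size_t kChunkSize = 4096;  // assumed chunk size
    std::ifstream in(path, std::ios::binary);
    std::vector<char> buffer(kChunkSize);
    std::size_t total = 0;
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        // The last read usually hits end of file early; gcount() says
        // how many bytes of buffer are valid this time.
        total += static_cast<std::size_t>(in.gcount());
    }
    return total;
}
```

In real code the counting line would be replaced by whatever processing the first gcount() bytes of the buffer need.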
static std::vector<unsigned char> read_binary_file (const std::string filename)
{
// binary mode is only for switching off newline translation
std::ifstream file(filename, std::ios::binary);
file.unsetf(std::ios::skipws);
std::streampos file_size;
file.seekg(0, std::ios::end);
file_size = file.tellg();
file.seekg(0, std::ios::beg);
std::vector<unsigned char> vec;
vec.reserve(file_size);
vec.insert(vec.begin(),
std::istream_iterator<unsigned char>(file),
std::istream_iterator<unsigned char>());
return (vec);
}
and then
auto vec = read_binary_file(filename);
auto src = (char*) new char[vec.size()];
std::copy(vec.begin(), vec.end(), src);
The problem is definitely in how you write out your buffer, not in how you read it, since you read one byte at a time.
If you know the length of the data in your buffer, you could force cout to go on:
char *bf = "Hello\0 world";
cout << bf << endl;
cout << string(bf, 12) << endl;
This should give the following output:
Hello
Hello world
However, this is a workaround, as cout is intended for printable data. Be aware that the output of nonprintable chars such as '\0' is system dependent.
Alternative solutions:
If you manipulate binary data, you should define ad-hoc data structures and printing routines. Here are some hints, with a quick draft of the general principles:
struct Mybuff { // special structure to manage buffers of binary data
static const int maxsz = 512;
int size;
char buffer[maxsz];
void set(const char *src, int sz) // binary copy of data of a given length
{ size = (sz < maxsz) ? sz : maxsz; memcpy(buffer, src, size); } // clamp so the copy never overruns buffer
} ;
Then you could overload the output operator function:
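ostream& operator<< (ostream& os, const Mybuff &b)
{
for (int i = 0; i < b.size; i++) {
unsigned char c = static_cast<unsigned char>(b.buffer[i]); // isprint() needs a non-negative value
os.put(isprint(c) ? b.buffer[i] : '*'); // non printables replaced with *
}
return os;
}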
And you could use it like this:
char *bf = "Hello\0 world";
Mybuff my;
my.set(bf, 13); // physical copy of memory
cout << my << endl; // special output
I believe your problem is not in reading the data, but rather in how you try to print it.
char * a = "this is a\0 test";
cout << a;
The example you show us prints a C-string. Since a C-string is a sequence of chars ended by '\0', the printing function stops at the first null char.
This is because you need to know where the string ends, either by using a special terminating character (like '\0' here) or by knowing its length.
So, to print whole data, you must know the length of it and use a loop similar to the one you use for reading it.
Are you on Windows? If so you need to execute _setmode(_fileno(stdout), _O_BINARY);
Include <fcntl.h> and <io.h>
UPDATE
I solved it with the answer that's marked as valid, but with one slight difference. I open the file using fopen(file, "r+b"), not fopen(file, "r+"). The b opens it in binary mode, and doesn't screw up the file.
I was doing a simple program which I called "fuzzer".
This is my code:
int main(int argc, char* argv[]){
// Here go some checks, such as argc being correct, etc.
// ...
// Read source file
FILE *fSource;
fSource = fopen(argv[1], "r+");
if(fSource == NULL){
cout << "Can't open file!";
return 2;
}
// Loop source file
char b;
int i = 0;
while((b = fgetc(fSource)) != EOF){
b ^= 0x13;
fseek(fSource, i++, SEEK_SET);
fwrite(&b, 1, sizeof(b), fSource);
}
fclose(fSource);
cout << "Fuzzed.";
return 0;
}
However, it doesn't work. Before, I used while(!feof), but it didn't work either, and I saw that it's not correct, so I changed it to (b = fgetc()) != EOF (I suppose it's correct, right?).
When I run it, it gets stuck on an endless loop, and it doesn't modify the original file, but rather appends tildes to it (and the file quickly increases its size, until I stop it). If I change the open mode from "a+" to "r+", it simply deletes the contents of the file (but it at least doesn't get stuck in an endless loop).
Note: I understand that this isn't any kind of obfuscation or encryption. I'm not trying to encode files, just practicing with C++ and files.
This code worked for me when tested on an Ubuntu 12.04 derivative with GCC 4.9.0:
#include <iostream>
#include <stdio.h>
using namespace std;
int main(int argc, char* argv[])
{
if (argc != 2)
{
cerr << "Usage: " << argv[0] << " file\n";
return 1;
}
FILE *fSource = fopen(argv[1], "r+");
if (fSource == NULL)
{
cerr << "Can't open file: " << argv[1] << "\n";
return 2;
}
int c;
int i = 0;
while ((c = fgetc(fSource)) != EOF)
{
char b = c ^ 0x13;
fseek(fSource, i++, SEEK_SET);
fwrite(&b, 1, sizeof(b), fSource);
fseek(fSource, i, SEEK_SET);
}
fclose(fSource);
cout << "Fuzzed: " << argv[1] << "\n";
return 0;
}
It reports file names; it reports errors to standard error (cerr); it uses int c; to read the character, but copies that to char b so that the fwrite() works. When run on (a copy of) its own source code, the first time the output looks like gibberish, and the second time recovers the original.
This loop, using fputc() instead of fwrite(), also works without needing the intermediate variable b:
while ((c = fgetc(fSource)) != EOF)
{
fseek(fSource, i++, SEEK_SET);
fputc(c ^ 0x13, fSource);
fseek(fSource, i, SEEK_SET);
}
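The C standard requires a file-positioning call such as fseek() between a read and a following write on the same stream, and an fseek() or fflush() between a write and a following read. I'm not sure whether that's the main cause of your trouble, but it could in theory be one of the issues.
You need int b;. fgetc() returns an int so that EOF can be distinguished from every valid byte value; once the result is stored in a char, that distinction is lost. The manual describes all this. All in all, something like this: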
for (int b, i = 0; (b = fgetc(fSource)) != EOF; ++i)
{
unsigned char x = b;
x ^= 0x13;
fseek(fSource, i, SEEK_SET);
fwrite(&x, 1, 1, fSource);
fseek(fSource, i + 1, SEEK_SET);
}
You should also open the file with mode "rb+", and seek between each read and write (thanks @Jonathan Leffler).
I need to read a binary file and divide its contents into frames of 535 bytes each. The number of frames in the file is not known until runtime, so I need to allocate memory for them dynamically. The code below is a snippet; as you can see, I'm trying to create a pointer to an array of bytes (uint8_t) and then step to the next frame, and so on, in the loop that copies the buffered data into the frames. How do I allocate memory at runtime, and is this the best way to do the task? Please let me know if there is a more elegant solution. Also, how do I manage the memory?
#include <cstdio>
using namespace std;
long getFileSize(FILE *file)
{
long currentPosition, endPosition;
currentPosition = ftell(file);
fseek(file, 0, 2);
endPosition = ftell(file);
fseek(file, currentPosition, 0);
return endPosition;
}
int main()
{
const char *filePath = "C:\\Payload\\Untitled.bin"; // backslashes must be escaped in a string literal
uint8_t *fileBuffer;
FILE *file = NULL;
if((file = fopen(filePath, "rb")) == NULL)
cout << "Failure. Either the file does not exist or this application lacks sufficient permissions to access it." << endl;
else
cout << "Success. File has been loaded." << endl;
long fileSize = getFileSize(file);
fileBuffer = new uint8_t[fileSize];
fread(fileBuffer, fileSize, 1, file);
uint8_t (*frameBuffer)[535];
for(int i = 0, j = 0; i < fileSize; i++)
{
frameBuffer[j][i] = fileBuffer[i];
if((i % 534) == 0)
{
j++;
}
}
struct frame {
unsigned char bytes[535];
};
std::vector<frame> frames;
Now your loop can simply read a frame and push it into frames. No explicit memory management needed: std::vector does that for you.
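A minimal sketch of such a loop is shown below. The function name readFrames is invented for illustration, and dropping a trailing partial frame (fewer than 535 bytes) is an assumption about what the application wants.

```cpp
#include <cstdio>
#include <vector>

struct frame {
    unsigned char bytes[535];
};

// Hypothetical helper: reads whole 535-byte frames until the file runs out.
// Returns an empty vector if the file could not be opened.
std::vector<frame> readFrames(const char* path)
{
    std::vector<frame> frames;
    FILE* file = std::fopen(path, "rb");
    if (!file)
        return frames;
    frame f;
    // fread returns 1 only when a complete frame was read.
    while (std::fread(f.bytes, sizeof(f.bytes), 1, file) == 1)
        frames.push_back(f);
    std::fclose(file);
    return frames;
}
```

The number of frames is then simply frames.size(), and the memory is released when the vector goes out of scope.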