Arrays to binary files and vice versa - c++

I am trying to write an array of ints to a binary file, then read that file back into a second array of the same size. I don't understand why the second array contains the correct numbers only up to 25 (its 26th element, since the numbers I wrote to the first array start from 0).
A very weird thing I noticed is that if I replace 'x[i] = i;' with 'x[i] = i * 3;' in the first for loop in main, the correct numbers are printed up to 279 instead of 25 (and 25*3 != 279).
How can I write/read binary files to/from raw arrays in C++?
main.cpp:
#include "arrays_binary_files.hpp"
#include <iostream>
int main()
{
int x[1000];
//#if 0
for (size_t i{ 0 }; i != sizeof x / sizeof * x; ++i)
x[i] = i;
//#endif
int y[sizeof x / sizeof *x];
std::cout << "scrivo x su x.bin? [invio per continuare] _";//write?
(void)getchar();
std::cout << "\nscrivo x su x.bin...";//writing...
arrToBinFile(x, sizeof x / sizeof * x, "x.bin");
std::cout << "\nscritto x su x.bin";//written!
std::cout << "\n\nscrivo x.bin su y? [invio per continuare] _";//read?
(void)getchar();
std::cout << "scrivo x.bin su y...";//reading...
binFileToArr("x.bin", y, sizeof y / sizeof * x);
std::cout << "\nscritto x.bin su y";//read!
std::cout << "\n\nvisualizzo y? [invio per continuare] _";//show?
for (size_t i{ 0 }; i != sizeof y / sizeof * y; ++i) {
std::cout << '\n' << y[i];//prints correctly only up to 25
(void)getchar();
}
return 0;
}
arrays_binary_files.hpp:
#ifndef arrays_binary_files_hpp_included
#define arrays_binary_files_hpp_included
#include <fstream>
#include <filesystem>
//namespace {
/*
* gives internal linkage (like specifying static for everything), so that each
* function is "local" in each translation unit which is pasted in by #include
*/
//i tried to inline in order to debug using breakpoints, but i didn't understand the same where the bug is
inline char arrToBinFile(const int inputArray[], const size_t inputArrayLength, const std::string& fileName) {
std::ofstream outputData;
outputData.open(fileName);
if (outputData) {
outputData.write(reinterpret_cast<const char*>(inputArray), sizeof(int) * inputArrayLength);
outputData.close();
return 0;
}
else {
outputData.close();
return -1;
}
}
//i tried to inline to debug using breakpoints, but i didn't understand the same where the bug is
inline char binFileToArr(const std::string& fileName, int outputArray[], const size_t outputArrayLength) {
std::ifstream inputData;
inputData.open(fileName);
if (inputData /*&& std::filesystem::file_size(fileName) <= outputArrayLength*/) {
inputData.read(reinterpret_cast<char*>(outputArray), sizeof(int) * outputArrayLength);
inputData.close();
return 0;
}
else {
inputData.close();
return -1;
}
}
//}
#endif
screenshot of the console in case of leaving the main function in main.cpp as it is:
screenshot of the console in case of replacing 'x[i] = i;' with 'x[i] = i * 3;' in the main function in main.cpp:
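A likely cause, assuming the program runs on Windows: both streams are opened in text mode. In that mode the byte 0x1A (decimal 26) is treated as end-of-file when reading. The int 26 is the first value that contains that byte, and with x[i] = i * 3 the first such value is 282 (0x011A), which matches the output stopping after 25 and after 279. The following is a minimal sketch of the same idea with the files opened using std::ios::binary; writeInts, readInts, and roundTripDemo are hypothetical names, not part of the code above.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes `count` ints as raw bytes; returns true on success.
inline bool writeInts(const std::string& fileName, const int* data, std::size_t count) {
    std::ofstream out(fileName, std::ios::binary);  // binary: no newline or EOF translation
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(data),
              static_cast<std::streamsize>(sizeof(int) * count));
    out.flush();                                    // surface write errors before reporting
    return static_cast<bool>(out);
}

// Hypothetical helper: reads exactly `count` ints; returns false if the file is too short.
inline bool readInts(const std::string& fileName, int* data, std::size_t count) {
    std::ifstream in(fileName, std::ios::binary);
    if (!in) return false;
    const auto wanted = static_cast<std::streamsize>(sizeof(int) * count);
    in.read(reinterpret_cast<char*>(data), wanted);
    return in.gcount() == wanted;
}

// Round-trips 1000 ints through a file and reports whether they all match.
inline bool roundTripDemo(const std::string& fileName) {
    int x[1000];
    int y[1000] = {};
    for (std::size_t i = 0; i != 1000; ++i) x[i] = static_cast<int>(i);
    if (!writeInts(fileName, x, 1000) || !readInts(fileName, y, 1000)) return false;
    for (std::size_t i = 0; i != 1000; ++i)
        if (x[i] != y[i]) return false;
    return true;
}
```

Checking gcount() after the read also reports a short file instead of silently leaving part of the array unfilled.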

Related

(C++) Fastest way possible for reading in matrix files (arbitrary size)

I'm developing a bioinformatic tool, which requires reading in millions of matrix files (average dimension = (20k, 20k)). They are tab-delimited text files, and they look something like:
0.53 0.11
0.24 0.33
Because the software reads the matrix files one at a time, memory is not an issue, but it's very slow. The following is my current function for reading in a matrix file. I first make a matrix object using a double pointer, then fill in the matrix by looping through an input file.
float** make_matrix(int nrow, int ncol, float val){
float** M = new float *[nrow];
for(int i = 0; i < nrow; i++) {
M[i] = new float[ncol];
for(int j = 0; j < ncol; j++) {
M[i][j] = val;
}
}
return M;
}
float** read_matrix(string fname, int dim_1, int dim_2){
float** K = make_matrix(dim_1, dim_2, 0);
ifstream ifile(fname);
for (int i = 0; i < dim_1; ++i) {
for (int j = 0; j < dim_2; ++j) {
ifile >> K[i][j];
}
}
ifile.clear();
ifile.seekg(0, ios::beg);
return K;
}
Is there a much faster way to do this? From my experience with python, reading in a matrix file using pandas is so much faster than using python for-loops. Is there a trick like that in c++?
(added)
Thanks so much everyone for all your suggestions and comments!
The fastest way, by far, is to change the way you write those files: write in binary format, two int first (width, height) then just dump your values.
You will be able to load it in just three read calls.
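The layout suggested above can be sketched as follows. This is a minimal sketch, assuming a 4-byte width, a 4-byte height, then the values as raw floats in row-major order, and assuming the writer and the reader share the same endianness. FlatMatrix, saveBinary, loadBinary, and binaryRoundTripDemo are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical in-memory form: dimensions plus one contiguous block of values.
struct FlatMatrix {
    std::uint32_t width = 0;
    std::uint32_t height = 0;
    std::vector<float> values;  // row-major, width * height entries
};

inline bool saveBinary(const char* path, const FlatMatrix& m) {
    if (m.values.size() != static_cast<std::size_t>(m.width) * m.height) return false;
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    bool ok = std::fwrite(&m.width, sizeof m.width, 1, f) == 1
           && std::fwrite(&m.height, sizeof m.height, 1, f) == 1
           && std::fwrite(m.values.data(), sizeof(float), m.values.size(), f) == m.values.size();
    ok = (std::fclose(f) == 0) && ok;  // fclose flushes, so its result matters too
    return ok;
}

inline bool loadBinary(const char* path, FlatMatrix& m) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    bool ok = std::fread(&m.width, sizeof m.width, 1, f) == 1      // read 1: width
           && std::fread(&m.height, sizeof m.height, 1, f) == 1;   // read 2: height
    if (ok) {
        m.values.resize(static_cast<std::size_t>(m.width) * m.height);
        ok = std::fread(m.values.data(), sizeof(float), m.values.size(), f)
             == m.values.size();                                   // read 3: all values
    }
    std::fclose(f);
    return ok;
}

// Saves a 2x2 matrix and checks that loading it gives the same data back.
inline bool binaryRoundTripDemo(const char* path) {
    FlatMatrix a;
    a.width = 2;
    a.height = 2;
    a.values = {0.53f, 0.11f, 0.24f, 0.33f};
    FlatMatrix b;
    return saveBinary(path, a) && loadBinary(path, b)
        && b.width == 2 && b.height == 2 && b.values == a.values;
}
```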
Just for fun, I measured the program posted above (using a 20,000x20,000 ASCII input file, as described) on my Mac Mini (3.2GHz i7 with SSD drive) and found that it took about 102 seconds to parse in the file using the posted code.
Then I wrote a version of the same function that uses the C stdio API (fopen()/fread()/fclose()) and does character-by-character parsing into a 1D float array. This implementation takes about 13 seconds to parse in the file on the same hardware, so it's about 7 times faster.
Both programs were compiled with g++ -O3 test_read_matrix.cpp.
float* faster_read_matrix(string fname, int numRows, int numCols)
{
FILE * fpIn = fopen(fname.c_str(), "r");
if (fpIn == NULL)
{
printf("Couldn't open file [%s] for input!\n", fname.c_str());
return NULL;
}
float* K = new float[numRows*numCols];
// We'll hold the current number in (numberBuf) until we're ready to parse it
char numberBuf[128] = {'\0'};
int numCharsInBuffer = 0;
int curRow = 0, curCol = 0;
while(curRow < numRows)
{
char tempBuf[4*1024]; // an arbitrary size
const size_t bytesRead = fread(tempBuf, 1, sizeof(tempBuf), fpIn);
if (bytesRead == 0)
{
if (ferror(fpIn)) perror("fread"); // size_t is unsigned, so test ferror() instead of < 0
break;
}
for (size_t i=0; i<bytesRead; i++)
{
const char c = tempBuf[i];
if ((c=='.')||(c=='+')||(c=='-')||(isdigit(c)))
{
if ((numCharsInBuffer+1) < sizeof(numberBuf)) numberBuf[numCharsInBuffer++] = c;
else
{
printf("Error, number string was too long for numberBuf!\n");
}
}
else
{
if (numCharsInBuffer > 0)
{
// Parse the current number-chars we have assembled into (numberBuf) and reset (numberBuf) to empty
numberBuf[numCharsInBuffer] = '\0';
if (curCol < numCols) K[curRow*numCols+curCol] = strtod(numberBuf, NULL);
else
{
printf("Error, too many values in row %i! (Expected %i, found at least %i)\n", curRow, numCols, curCol);
}
curCol++;
}
numCharsInBuffer = 0;
if (c == '\n')
{
curRow++;
curCol = 0;
if (curRow >= numRows) break;
}
}
}
}
fclose(fpIn);
if (curRow != numRows) printf("Warning: I read %i lines in the file, but I expected there would be %i!\n", curRow, numRows);
return K;
}
I am dissatisfied with Jeremy Friesner’s otherwise excellent answer because it:
blames C++'s I/O system for the problem (which is not where the problem lies)
fixes the problem by circumventing the actual I/O problem without being explicit about how it is a significant contributor to speed
modifies memory accesses which (may or may not) contribute to speed, and does so in a way that very large matrices may not be supported
The reason his code runs so much faster is because he removes the single most important bottleneck: unoptimized disk access. JWO’s original code can be brought to match with three extra lines of code:
float** read_matrix(std::string fname, int dim_1, int dim_2){
float** K = make_matrix(dim_1, dim_2, 0);
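constexpr std::size_t buffer_size = 4*1024; // 1 (constexpr, so the array below is standard C++)
char buffer[buffer_size]; // 2
std::ifstream ifile;
ifile.rdbuf()->pubsetbuf(buffer, buffer_size); // 3 (set before open(); some implementations ignore it afterwards)
ifile.open(fname);
for (int i = 0; i < dim_1; ++i) {
for (int j = 0; j < dim_2; ++j) {
ifile >> K[i][j];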
}
}
// ifile.clear();
// ifile.seekg(0, std::ios::beg);
return K;
}
The addition exactly replicates Friesner’s design, but using the C++ library capabilities without all the extra programming grief on our end.
You’ll notice I also removed a couple lines at the bottom that should be inconsequential to program function and correctness, but which may cause a minor cumulative time issue as well. (If they are not inconsequential, that is a bug and should be fixed!)
How much difference this all makes depends entirely on the quality of the C++ Standard Library implementation. AFAIK the big three modern C++ compilers (MSVC, GCC, and Clang) all have sufficiently-optimized I/O handling to make the issue moot.
locale
One other thing that may also make a difference is to .imbue() the stream with the default "C" locale, which avoids a lot of special handling for numbers in locale-dependent formats other than what your files use. You only need to bother to do this if you have changed your global locale, though.
ifile.imbue(std::locale::classic());
redundant initialization
Another thing that is killing your time is the effort to zero-initialize the array when you create it. Don’t do that if you don’t need it! (You don’t need it here because you know the total extents and will fill them properly. Since C++11, a failed extraction also stores zero in its target, but once the stream is in a failed state, later extractions leave their targets untouched, so check the stream state after reading rather than relying on zeros.)
dynamic memory block size
Finally, keeping memory accesses to an array of array should not significantly affect speed, but it still might be worth testing if you can change it. This is assuming that the resulting matrix will never be too large for the memory manager to return as a single block (and consequently crash your program).
A common design is to allocate the entire array as a single block, with the requested size plus size for the array of pointers to the rest of the block. This allows you to delete the array in a single delete[] statement. Again, I don’t believe this should be an optimization issue you need to care about until your profiler says so.
At the risk of this answer being considered incomplete (no code examples), I would like to add some further options for tackling the problem to the other answers:
Use a binary format (width,height, values...) as file format and then use file mapping (MapViewOfFile() on Windows, mmap() or so on posix/unix systems).
Then, you can simply point your "matrix structure" pointer to the mapped address space and you are done. And in case, you do something like sparse access to the matrix, it can even save some real IO. If you always do full access to all elements of the matrix (no sparse matrices etc.), it is still quite elegant and probably faster than malloc/read.
Replacements for c++ iostream, which is known to be quite slow and should not be used for performance critical stuff:
Have a look at the {fmt} library, which has become quite popular in recent years and claims to be quite fast.
Back in the day, when I did a lot of numerics on large data sets, I always opted for binary files for storage. That was when the fastest CPU you could get your hands on was the Pentium 1 (with the floating point bug :)). Back then, everything was slower and memory was much more limited (we had MB, not GB, as units for RAM in our systems). All in all, nearly 20 years have passed since.
So, as a refresher, I wrote some code to show how much faster than iostream and text files you can be if you do not have extra constraints (such as the endianness of different CPUs).
So far, my little test only has an iostream and a binary file version with a) stdio fread() kind of loading and b) mmap(). Since I sit in front of a Debian bullseye computer, my code uses Linux-specific calls for the mmap() approach. To run it on Windows, you have to change a few lines of code and some includes.
Edit: I added a save function using {fmt} now as well.
Edit: I added a load function with stdio now as well.
Edit: To reduce memory workload, I reordered the code somewhat and now only keep 2 matrix instances in memory at any given time.
The program does the following:
Create a 20k x 20k matrix in RAM (in a struct named Matrix_t), filled with random values, slowly generated with the <random> facilities.
Write the matrix with iostream to a text file.
Write the matrix with stdio to a binary file.
Create a new matrix textMatrix by loading its data from the text file.
Create a new matrix inMemoryMatrix by loading its data from the binary file with a few fread() calls.
mmap() the binary file and use it under the name mappedMatrix.
Compare each of the loaded matrices to the original randomMatrix to see if the round-trip worked.
Here are the results I got on my machine after compiling this work of wonder with clang++ -O3 -o fmatio fast-matrix-io.cpp -lfmt:
./fmatio
creating random matrix (20k x 20k) (27.0775seconds)
the first 10 floating values in randomMatrix are:
57970.2 -365700 -986079 44657.8 826968 -506928 668277 398241 -828176 394645
saveMatrixAsText_IOSTREAM()
saving matrix with iostream. (192.749seconds)
saveMatrixAsText_FMT(mat0_fmt.txt)
saving matrix with {fmt}. (34.4932seconds)
saveMatrixAsBinary()
saving matrix into a binary file. (30.7591seconds)
loadMatrixFromText_IOSTREAM()
loading matrix from text file with iostream. (102.074seconds)
randomMatrix == textMatrix
comparing randomMatrix with textMatrix. (0.125328seconds)
loadMatrixFromText_STDIO(mat0_fmt.txt)
loading matrix from text file with stdio. (71.2746seconds)
randomMatrix == textMatrix
comparing randomMatrix with textMatrix (stdio). (0.124684seconds)
loadMatrixFromBinary(mat0.bin)
loading matrix from binary file into memory. (0.495685seconds)
randomMatrix == inMemoryMatrix
comparing randomMatrix with inMemoryMatrix. (0.124206seconds)
mapMatrixFromBinaryFile(mat0.bin)
mapping a view to a matrix in a binary file. (4.5883e-05seconds)
randomMatrix == mappedMatrix
comparing randomMatrix with mappedMatrix. (0.158459seconds)
And here is the code:
#include <cinttypes>
#include <memory>
#include <random>
#include <iostream>
#include <fstream>
#include <cstring>
#include <string>
#include <chrono>
#include <limits>
#include <iomanip>
// includes for mmap()...
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <cstdio>
#include <cstdlib>
#include <unistd.h>
// includes for {fmt}...
#include <fmt/core.h>
#include <fmt/os.h>
struct StopWatch {
using Clock = std::chrono::high_resolution_clock;
using TimePoint =
std::chrono::time_point<Clock>;
using Duration =
std::chrono::duration<double>;
void start(const char* description) {
this->description = std::string(description);
tstart = Clock::now();
}
void stop() {
TimePoint tend = Clock::now();
Duration elapsed = tend - tstart;
std::cout << description << " (" << elapsed.count()
<< "seconds)" << std::endl;
}
TimePoint tstart;
std::string description;
};
struct Matrix_t {
uint32_t ncol;
uint32_t nrow;
float values[];
inline uint32_t to_index(uint32_t col, uint32_t row) const {
return ncol * row + col;
}
};
template <class Initializer>
Matrix_t *createMatrix
( uint32_t ncol,
uint32_t nrow,
Initializer initFn
) {
size_t nfloats = ncol*nrow;
size_t nbytes = UINTMAX_C(8) + nfloats * sizeof(float);
Matrix_t * result =
reinterpret_cast<Matrix_t*>(operator new(nbytes));
if (nullptr != result) {
result->ncol = ncol;
result->nrow = nrow;
for (uint32_t row = 0; row < nrow; row++) {
for (uint32_t col = 0; col < ncol; col++) {
result->values[result->to_index(col,row)] =
initFn(ncol,nrow,col,row);
}
}
}
return result;
}
void saveMatrixAsText_IOSTREAM(const char* filePath,
const Matrix_t* matrix) {
std::cout << "saveMatrixAsText_IOSTREAM()" << std::endl;
if (nullptr == matrix) {
std::cout << "cannot save matrix - no matrix!" << std::endl;
return; // nothing to write; avoid dereferencing a null pointer below
}
std::ofstream outFile(filePath);
if (outFile) {
outFile << matrix->ncol << " " << matrix->nrow << std::endl;
const auto defaultPrecision = outFile.precision();
outFile.precision
(std::numeric_limits<float>::max_digits10);
for (uint32_t row = 0; row < matrix->nrow; row++) {
for (uint32_t col = 0; col < matrix->ncol; col++) {
outFile << matrix->values[matrix->to_index(col,row)]
<< " ";
}
outFile << std::endl;
}
} else {
std::cout << "could not open " << filePath << " for writing."
<< std::endl;
}
}
void saveMatrixAsText_FMT(const char* filePath,
const Matrix_t* matrix) {
std::cout << "saveMatrixAsText_FMT(" << filePath << ")"
<< std::endl;
if (nullptr == matrix) {
std::cout << "cannot save matrix - no matrix!" << std::endl;
return; // nothing to write; avoid dereferencing a null pointer below
}
auto outFile = fmt::output_file(filePath);
outFile.print("{} {}\n", matrix->ncol, matrix->nrow);
for (uint32_t row = 0; row < matrix->nrow; row++) {
outFile.print("{}", matrix->values[matrix->to_index(0,row)]);
for (uint32_t col = 1; col < matrix->ncol; col++) {
outFile.print(" {}",
matrix->values[matrix->to_index(col,row)]);
}
outFile.print("\n");
}
}
void saveMatrixAsBinary(const char* filePath,
const Matrix_t* matrix) {
std::cout << "saveMatrixAsBinary()" << std::endl;
FILE * outFile = fopen(filePath, "wb");
if (nullptr != outFile) {
fwrite( &matrix->ncol, 4, 1, outFile);
fwrite( &matrix->nrow, 4, 1, outFile);
size_t nfloats = matrix->ncol * matrix->nrow;
fwrite( &matrix->values, sizeof(float), nfloats, outFile);
fclose(outFile);
} else {
std::cout << "could not open " << filePath << " for writing."
<< std::endl;
}
}
Matrix_t* loadMatrixFromText_IOSTREAM(const char* filePath) {
std::cout << "loadMatrixFromText_IOSTREAM()" << std::endl;
std::ifstream inFile(filePath);
if (inFile) {
uint32_t ncol;
uint32_t nrow;
inFile >> ncol;
inFile >> nrow;
uint32_t nfloats = ncol * nrow;
auto loader =
[&inFile]
(uint32_t , uint32_t , uint32_t , uint32_t )
-> float
{
float value;
inFile >> value;
return value;
};
Matrix_t * matrix = createMatrix( ncol, nrow, loader);
return matrix;
} else {
std::cout << "could not open " << filePath << " for reading."
<< std::endl;
}
return nullptr;
}
Matrix_t* loadMatrixFromText_STDIO(const char* filePath) {
std::cout << "loadMatrixFromText_STDIO(" << filePath << ")"
<< std::endl;
Matrix_t* matrix = nullptr;
FILE * inFile = fopen(filePath, "rt");
if (nullptr != inFile) {
uint32_t ncol;
uint32_t nrow;
fscanf(inFile, "%" SCNu32 " %" SCNu32, &ncol, &nrow); // SCNu32 (from <cinttypes>) matches uint32_t
auto loader =
[&inFile]
(uint32_t , uint32_t , uint32_t , uint32_t )
-> float
{
float value;
fscanf(inFile, "%f", &value);
return value;
};
matrix = createMatrix( ncol, nrow, loader);
fclose(inFile);
} else {
std::cout << "could not open " << filePath << " for reading."
<< std::endl;
}
return matrix;
}
Matrix_t* loadMatrixFromBinary(const char* filePath) {
std::cout << "loadMatrixFromBinary(" << filePath << ")"
<< std::endl;
FILE * inFile = fopen(filePath, "rb");
if (nullptr != inFile) {
uint32_t ncol;
uint32_t nrow;
fread( &ncol, 4, 1, inFile);
fread( &nrow, 4, 1, inFile);
size_t nfloats = static_cast<size_t>(ncol) * nrow;
size_t nbytes = nfloats * sizeof(float) + 8; // size_t avoids 32-bit overflow for large matrices
Matrix_t* matrix =
reinterpret_cast<Matrix_t*>
(operator new (nbytes));
if (nullptr != matrix) {
matrix->ncol = ncol;
matrix->nrow = nrow;
fread( &matrix->values[0], sizeof(float), nfloats, inFile);
fclose(inFile); // close the file on the success path too
return matrix;
} else {
std::cout << "could not find memory for the matrix."
<< std::endl;
}
fclose(inFile);
} else {
std::cout << "could not open file "
<< filePath << " for reading." << std::endl;
}
return nullptr;
}
void freeMatrix(Matrix_t* matrix) {
operator delete(matrix);
}
Matrix_t* mapMatrixFromBinaryFile(const char* filePath) {
std::cout << "mapMatrixFromBinaryFile(" << filePath << ")"
<< std::endl;
Matrix_t * matrix = nullptr;
int fd = open( filePath, O_RDONLY);
if (-1 != fd) {
struct stat sb;
if (-1 != fstat(fd, &sb)) {
auto fileSize = sb.st_size;
void* mapped =
mmap(nullptr, fileSize, PROT_READ, MAP_PRIVATE, fd, 0);
if (MAP_FAILED == mapped) { // mmap() signals failure with MAP_FAILED, not nullptr
std::cout << "mmap() failed!" << std::endl;
} else {
matrix = reinterpret_cast<Matrix_t*>(mapped);
}
} else {
std::cout << "fstat() failed!" << std::endl;
}
close(fd);
} else {
std::cout << "open() failed!" << std::endl;
}
return matrix;
}
void unmapMatrix(Matrix_t* matrix) {
if (nullptr == matrix)
return;
size_t nbytes =
UINTMAX_C(8) +
sizeof(float) * matrix->ncol * matrix->nrow;
munmap(matrix, nbytes);
}
bool areMatricesEqual( const Matrix_t* m1, const Matrix_t* m2) {
if (nullptr == m1) return false;
if (nullptr == m2) return false;
if (m1->ncol != m2->ncol) return false;
if (m1->nrow != m2->nrow) return false;
// both exist and have same size...
size_t nfloats = m1->ncol * m1->nrow;
size_t nbytes = nfloats * sizeof(float);
return 0 == memcmp( m1->values, m2->values, nbytes);
}
int main(int argc, const char* argv[]) {
std::random_device rdev;
std::default_random_engine reng(rdev());
std::uniform_real_distribution<> rdist(-1.0E6F, 1.0E6F);
StopWatch sw;
auto randomInitFunction =
[&reng,&rdist]
(uint32_t ncol, uint32_t nrow, uint32_t col, uint32_t row)
-> float
{
return rdist(reng);
};
sw.start("creating random matrix (20k x 20k)");
Matrix_t * randomMatrix =
createMatrix(UINT32_C(20000),
UINT32_C(20000),
randomInitFunction);
sw.stop();
if (nullptr != randomMatrix) {
std::cout
<< "the first 10 floating values in randomMatrix are: "
<< std::endl;
std::cout << randomMatrix->values[0];
for (size_t i = 1; i < 10; i++) {
std::cout << " " << randomMatrix->values[i];
}
std::cout << std::endl;
sw.start("saving matrix with iostream.");
saveMatrixAsText_IOSTREAM("mat0_iostream.txt", randomMatrix);
sw.stop();
sw.start("saving matrix with {fmt}.");
saveMatrixAsText_FMT("mat0_fmt.txt", randomMatrix);
sw.stop();
sw.start("saving matrix into a binary file.");
saveMatrixAsBinary("mat0.bin", randomMatrix);
sw.stop();
sw.start("loading matrix from text file with iostream.");
Matrix_t* textMatrix =
loadMatrixFromText_IOSTREAM("mat0_iostream.txt");
sw.stop();
sw.start("comparing randomMatrix with textMatrix.");
if (!areMatricesEqual(randomMatrix, textMatrix)) {
std::cout << "randomMatrix != textMatrix!" << std::endl;
} else {
std::cout << "randomMatrix == textMatrix" << std::endl;
}
sw.stop();
freeMatrix(textMatrix);
textMatrix = nullptr;
sw.start("loading matrix from text file with stdio.");
textMatrix =
loadMatrixFromText_STDIO("mat0_fmt.txt");
sw.stop();
sw.start("comparing randomMatrix with textMatrix (stdio).");
if (!areMatricesEqual(randomMatrix, textMatrix)) {
std::cout << "randomMatrix != textMatrix!" << std::endl;
} else {
std::cout << "randomMatrix == textMatrix" << std::endl;
}
sw.stop();
freeMatrix(textMatrix);
textMatrix = nullptr;
sw.start("loading matrix from binary file into memory.");
Matrix_t* inMemoryMatrix =
loadMatrixFromBinary("mat0.bin");
sw.stop();
sw.start("comparing randomMatrix with inMemoryMatrix.");
if (!areMatricesEqual(randomMatrix, inMemoryMatrix)) {
std::cout << "randomMatrix != inMemoryMatrix!"
<< std::endl;
} else {
std::cout << "randomMatrix == inMemoryMatrix" << std::endl;
}
sw.stop();
freeMatrix(inMemoryMatrix);
inMemoryMatrix = nullptr;
sw.start("mapping a view to a matrix in a binary file.");
Matrix_t* mappedMatrix =
mapMatrixFromBinaryFile("mat0.bin");
sw.stop();
sw.start("comparing randomMatrix with mappedMatrix.");
if (!areMatricesEqual(randomMatrix, mappedMatrix)) {
std::cout << "randomMatrix != mappedMatrix!"
<< std::endl;
} else {
std::cout << "randomMatrix == mappedMatrix" << std::endl;
}
sw.stop();
unmapMatrix(mappedMatrix);
mappedMatrix = nullptr;
freeMatrix(randomMatrix);
} else {
std::cout << "could not create random matrix!" << std::endl;
}
return 0;
}
Please note that binary formats where you simply cast to a struct pointer also depend on how the compiler does alignment and padding within structures. In my case, I was lucky and it worked. On other systems, you might have to tweak things a little (#pragma pack(4) or something along those lines) to make it work.
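One way to catch such layout surprises at compile time is to describe the fixed part of the file as a separate header struct and assert its size and field offsets. This is only a sketch: MatrixHeader is a hypothetical name that is not part of the program above, and the checks do not cover endianness.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the fixed 8-byte prefix of the binary file.
struct MatrixHeader {
    std::uint32_t ncol;
    std::uint32_t nrow;
};

// Compilation fails if the compiler lays the header out differently from the file format.
static_assert(std::is_trivially_copyable<MatrixHeader>::value,
              "header must be safe to copy byte-wise");
static_assert(sizeof(MatrixHeader) == 8, "unexpected padding in MatrixHeader");
static_assert(offsetof(MatrixHeader, ncol) == 0, "ncol must come first");
static_assert(offsetof(MatrixHeader, nrow) == 4, "nrow must follow ncol directly");
static_assert(8 % alignof(float) == 0,
              "float data must be able to start right after the header");
```

If one of these assertions fires on some platform, the cast-based loading would have misread the file there, so the failure points at exactly the spot that needs the packing tweak.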

Convert String in to MAC address [duplicate]

This question already has answers here:
Can a local variable's memory be accessed outside its scope?
(20 answers)
Closed 5 years ago.
Inside a file in SPIFFS, I'm saving a MAC address in the form "XX:XX:XX:XX:XX:XX".
When I read the file, I need to convert it from a string to an array of byte values.
uint8_t* str2mac(char* mac){
uint8_t bytes[6];
int values[6];
int i;
if( 6 == sscanf( mac, "%x:%x:%x:%x:%x:%x%*c",&values[0], &values[1], &values[2],&values[3], &values[4], &values[5] ) ){
/* convert to uint8_t */
for( i = 0; i < 6; ++i )bytes[i] = (uint8_t) values[i];
}else{
/* invalid mac */
}
return bytes;
}
wifi_set_macaddr(STATION_IF, str2mac((char*)readFileSPIFFS("/mac.txt").c_str()));
But something in my code is wrong.
When I put AA:00:00:00:00:01 in the file, my ESP8266 sets 29:D5:23:40:00:00.
I need help, thank you.
You are returning a pointer to a "local" variable, i.e. one whose lifetime ends when the function returns. Using such a pointer afterwards is undefined behavior (UB), which may show up as, for example, the behaviour you are seeing.
To overcome this, you could pass the array as parameter; then the caller is responsible for memory management.
BTW: you could use format %hhx to read in directly into an 8 bit unsigned data type:
int str2mac(const char* mac, uint8_t* values){
if( 6 == sscanf( mac, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",&values[0], &values[1], &values[2],&values[3], &values[4], &values[5] ) ){
return 1;
}else{
return 0;
}
}
int main() {
uint8_t values[6] = { 0 };
int success = str2mac("AA:00:00:00:00:01", values);
if (success) {
for (int i=0; i<6; i++) {
printf("%02X:",values[i]);
}
}
}
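Another way to avoid the dangling pointer, sketched here as an alternative and not taken from the answer above, is to return the six bytes by value inside a small struct. MacResult and parseMac are hypothetical names.

```cpp
#include <array>
#include <cstdint>
#include <cstdio>

// Hypothetical result type: the bytes travel by value, so nothing dangles.
struct MacResult {
    bool ok = false;
    std::array<std::uint8_t, 6> bytes{};
};

inline MacResult parseMac(const char* text) {
    MacResult result;
    unsigned int v[6];
    if (std::sscanf(text, "%x:%x:%x:%x:%x:%x",
                    &v[0], &v[1], &v[2], &v[3], &v[4], &v[5]) != 6)
        return result;                       // ok stays false
    for (int i = 0; i < 6; ++i) {
        if (v[i] > 0xFF) return result;      // each field must fit in one byte
        result.bytes[i] = static_cast<std::uint8_t>(v[i]);
    }
    result.ok = true;
    return result;
}
```

The caller keeps the result in a local variable and, if ok is set, passes bytes.data() to a function such as wifi_set_macaddr while that variable is still in scope.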
Your code doesn't seem to be compatible with wifi_set_macaddr (I looked up the API documentation). It expects a uint8 pointer to the MAC address, which means the way you wrote it is not going to work (returning a pointer to a local variable etc). Here is an example which you should be able to adapt to your purpose:
#include <cstdint>
#include <iostream>
#include <fstream>
// mock up/print result
bool wifi_set_macaddr(uint8_t index, uint8_t *mac)
{
std::cout << "index: " << (int)index << " mac: ";
for (int i = 0; i < 6; ++i)
std::cout << std::hex << (int)mac[i] << " ";
std::cout << std::endl;
return true;
}
// test file
void writeTestFile()
{
std::ofstream ofs("mac.txt");
if (!(ofs << "AA:00:00:00:00:01" << std::endl))
{
std::cout << "File error" << std::endl;
}
ofs.close();
}
int main()
{
writeTestFile();
uint8_t mac[6];
int i = 0, x;
std::ifstream ifs("mac.txt");
for (; i < 6 && ifs >> std::hex >> x; ++i)
{
mac[i] = static_cast<uint8_t>(x);
ifs.ignore();
}
if (i < 6)
{
std::cout << "File error or invalid MAC address" << std::endl;
return 1;
}
wifi_set_macaddr(0x00, mac);
return 0;
}
http://coliru.stacked-crooked.com/view?id=d387c628e590a467

How to fill a String inside an array of structures within a structure in C++

So I have a structure called fastarray which contains a pointer to another structure called token_det. My problem is that filling a string inside the array of structs fails midway and gives the error message "The exception unknown software exception (0x0000417) occurred in the application at location 0x78b2ae6e". I tried increasing the size of the char array using malloc, but the string append keeps failing after concatenating a few strings. Below is an example of the code:
#include <stdio.h>
#include <string>
#include <stdlib.h>
#include <iostream.h>
using namespace std;
#define MAX_TOKENS 300000
struct token_det
{
int token;
std::string data;
char mdepth[300];
};
typedef struct fastarray
{
token_det *td; //MAX_TOKENS
}FASTARRAY;
int main()
{
printf("inside main\n");
int lv_ret = 0;
int count = 0;
char log[50] = {""};
int wtoken = 0;
FASTARRAY *f_array = NULL;
f_array = (FASTARRAY *)malloc(sizeof(FASTARRAY));
f_array->td = NULL;
f_array->td = (token_det *)malloc(MAX_TOKENS * sizeof(token_det));
printf("after malloc\n");
memset(f_array, 0, sizeof(f_array));
memset(f_array->td, 0, sizeof(f_array->td));
int x=0;
while(x<=10000)
{
printf("inside while");
f_array->td[x].data = "104,";
f_array->td[x].data.append("stasimorphy");
f_array->td[x].data.append(",");
f_array->td[x].data.append("psychognosy");
f_array->td[x].data.append(",");
f_array->td[x].data.append("whoever");
f_array->td[x].data.append(",");
x++;
sprintf_s(log,sizeof(log),"Data for x-%d = %s\n",x,f_array->td[x].data);
printf(log);
}
free(f_array->td);
free(f_array);
printf("after while\n");
return 0;
}
Explanation of what I was doing and why
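When I tried to understand what you wanted to do, I had no problem except for the parts where you use memset. sizeof(f_array) is the size of a pointer, so memset(f_array, 0, sizeof(f_array)); zeroes the first pointer-sized bytes of the struct, which here is exactly f_array->td. The next memset then writes through a null pointer, which was constantly throwing exceptions for me. In addition, malloc does not run the std::string constructors, so the data members are never valid objects.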
As I've never really been a friend of malloc I've been using C++ syntax as follows:
For allocating a single instance I'd use FASTARRAY *f_array = new fastarray;. You can read up on why using new instead of malloc is favorable in C++ here.
In the same way I've been using C++ syntax for allocating the dynamic array f_array->td = new token_det[MAX_TOKENS]; A Q&A about that topic for reference can be found here.
For filling the data string inside the dynamic array's struct I've been using the += syntax as it's easier to read, in my opinion. Accessing the element inside the f_array has been achieved using (*(f_array->td + x)).data += "stasimorphy";
You can try my solution online here.
Code Dump
I tried to change as little as possible to make it work.
#include <sstream>
#include <string>
#include <iostream>
using namespace std;
#define MAX_TOKENS 300000
struct token_det
{
int token;
std::string data;
char mdepth[300];
};
typedef struct fastarray
{
token_det *td; //MAX_TOKENS
}FASTARRAY;
int main()
{
std::cout << "inside main\n";
int lv_ret = 0;
int count = 0;
char log[50] = { "" };
int wtoken = 0;
FASTARRAY *f_array = new fastarray;
f_array->td = new token_det[MAX_TOKENS];
std::cout << "after malloc\n";
int x = 0;
while (x <= 10000)
{
std::cout << "inside while";
std::stringstream log;
(*(f_array->td + x)).data = "104,";
(*(f_array->td + x)).data += "stasimorphy";
(*(f_array->td + x)).data += ",";
(*(f_array->td + x)).data += "psychognosy";
(*(f_array->td + x)).data += ",";
(*(f_array->td + x)).data += "whoever";
(*(f_array->td + x)).data += ",";
log << "Data for x-" << x << " = " << (f_array->td + x)->data << std::endl;
std::cout << log.str();
x++;
}
delete[] f_array->td;
delete f_array; // allocated with new, so release with delete (not free)
std::cout << "after while\n";
return 0;
}
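If changing the data structure is an option, the manual allocation can be dropped altogether. The following is a minimal sketch, assuming a std::vector may replace the raw pointer; TokenDet and makeTokens are hypothetical names, not from the question.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical counterpart of token_det with default member initializers.
struct TokenDet {
    int token = 0;
    std::string data;
    char mdepth[300] = {};
};

// Builds `count` tokens; std::vector constructs and destroys every std::string,
// so there is no malloc, memset, or free to get wrong.
inline std::vector<TokenDet> makeTokens(std::size_t count) {
    std::vector<TokenDet> tokens(count);
    for (TokenDet& t : tokens) {
        t.data = "104,";
        t.data += "stasimorphy,";
        t.data += "psychognosy,";
        t.data += "whoever,";
    }
    return tokens;
}
```

The vector releases its elements when it goes out of scope, which removes the delete[]/delete pairing from the calling code.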

C++: Issues with Circular Buffer

I'm having some trouble writing a circular buffer in C++. Here is my code base at the moment:
circ_buf.h:
#ifndef __CIRC_BUF_H__
#define __CIRC_BUF_H__
#define MAX_DATA (25) // Arbitrary size limit
// The Circular Buffer itself
struct circ_buf {
int s; // Index of oldest reading
int e; // Index of most recent reading
int data[MAX_DATA]; // The data
};
/*** Function Declarations ***/
void empty(circ_buf*);
bool is_empty(circ_buf*);
bool is_full(circ_buf*);
void read(circ_buf*, int);
int overwrite(circ_buf*);
#endif // __CIRC_BUF_H__
circ_buf.cpp:
#include "circ_buf.h"
/*** Function Definitions ***/
// Empty the buffer
void empty(circ_buf* cb) {
cb->s = 0; cb->e = 0;
}
// Is the buffer empty?
bool is_empty(circ_buf* cb) {
// By common convention, if the start index is equal to the end
// index, our buffer is considered empty.
return cb->s == cb->e;
}
// Is the buffer full?
bool is_full(circ_buf* cb) {
// By common convention, if the start index is one greater than
// the end index, our buffer is considered full.
// REMEMBER: we still need to account for wrapping around!
return cb->s == ((cb->e + 1) % MAX_DATA);
}
// Read data into the buffer
void read(circ_buf* cb, int k) {
int i = cb->e;
cb->data[i] = k;
cb->e = (i + 1) % MAX_DATA;
}
// Overwrite data in the buffer
int overwrite(circ_buf* cb) {
int i = cb->s;
int k = cb->data[i];
cb->s = (i + 1) % MAX_DATA;
}
circ_buf_test.cpp:
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
#include "circ_buf.h"
int main(int argc, char** argv) {
// Our data source
std::string file = "million_numbers.txt";
std::fstream in(file, std::ios_base::in);
// The buffer
circ_buf buffer = { .s = 0, .e = 0, .data = {} };
for (int i = 0; i < MAX_DATA; ++i) {
int k = 0; in >> k; // Get next int from in
read(&buffer, k);
}
for (int i = 0; i < MAX_DATA; ++i)
std::cout << overwrite(&buffer) << std::endl;
}
The main issue I'm having is getting the buffer to write integers to its array. When I compile and run the main program (circ_buf_test), it just prints the same number 25 times, instead of what I expect it to print (the numbers 1 through 25 - "million_numbers.txt" is literally just the numbers 1 through 1000000). The number is 2292656, in case this may be important.
Does anyone have an idea about what might be going wrong here?
Your function overwrite(circ_buf* cb) never returns a value (there is no return statement in its body), so the loop that prints the values can print anything (this is undefined behavior):
for (int i = 0; i < MAX_DATA; ++i)
std::cout << overwrite(&buffer) << std::endl;
I expect the compiler already points at this "main issue" in the compilation log, provided warnings are enabled (look for lines starting with "warning"). You can fix it this way:
int overwrite(circ_buf* cb) {
int i = cb->s;
int k = cb->data[i];
cb->s = (i + 1) % MAX_DATA;
return k;
}
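To check the fix in isolation, here is a minimal self-contained sketch. It assumes MAX_DATA is 25 (inferred from the 25 lines of output mentioned in the question, since the header's value is not shown here), and round_trip_demo is a hypothetical helper added only for illustration.

```cpp
#include <cassert>

const int MAX_DATA = 25; // assumption: inferred from the output described in the question

struct circ_buf {
    int s;              // Index of oldest reading
    int e;              // Index of most recent reading
    int data[MAX_DATA]; // The data
};

// Store k at the end index, then advance that index (wrapping around).
void read(circ_buf* cb, int k) {
    int i = cb->e;
    cb->data[i] = k;
    cb->e = (i + 1) % MAX_DATA;
}

// Return the oldest value and advance the start index (wrapping around).
int overwrite(circ_buf* cb) {
    int i = cb->s;
    int k = cb->data[i];
    cb->s = (i + 1) % MAX_DATA;
    return k; // omitting this return is what made the original undefined behavior
}

// Hypothetical usage sketch: write 1..MAX_DATA, then check they come back in order.
void round_trip_demo() {
    circ_buf buffer = {0, 0, {}};
    for (int i = 1; i <= MAX_DATA; ++i)
        read(&buffer, i);
    for (int i = 1; i <= MAX_DATA; ++i)
        assert(overwrite(&buffer) == i);
}
```

With GCC or Clang, compiling with -Wall should flag the original version, because that option enables the missing-return warning.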

Adding a string or char array to a byte vector

I'm currently working on a class to create and read out packets sent over the network. So far I have it working with 16-bit and 8-bit integers (well, unsigned, but still).
Now I want to add a string or char array as well. I've tried numerous ways of copying one into the buffer, but each time the _buffer got mangled, the program segfaulted, or the result was wrong.
I'd appreciate if someone could show me a working example.
My current code can be seen below.
Thanks, Xeross
Main
#include <iostream>
#include <stdio.h>
#include "Packet.h"
using namespace std;
int main(int argc, char** argv)
{
cout << "#################################" << endl;
cout << "# Internal Use Only #" << endl;
cout << "# Codename PACKETSTORM #" << endl;
cout << "#################################" << endl;
cout << endl;
Packet packet = Packet();
packet.SetOpcode(0x1f4d);
cout << "Current opcode is: " << packet.GetOpcode() << endl << endl;
packet.add(uint8_t(5))
.add(uint16_t(4000))
.add(uint8_t(5));
for(uint8_t i=0; i<10;i++)
printf("Byte %u = %x\n", i, packet._buffer[i]);
printf("\nReading them out: \n1 = %u\n2 = %u\n3 = %u\n4 = %s",
packet.readUint8(),
packet.readUint16(),
packet.readUint8());
return 0;
}
Packet.h
#ifndef _PACKET_H_
#define _PACKET_H_
#include <iostream>
#include <vector>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
using namespace std;
class Packet
{
public:
Packet() : m_opcode(0), _buffer(0), _wpos(0), _rpos(0) {}
Packet(uint16_t opcode) : m_opcode(opcode), _buffer(0), _wpos(0), _rpos(0) {}
uint16_t GetOpcode() { return m_opcode; }
void SetOpcode(uint16_t opcode) { m_opcode = opcode; }
Packet& add(uint8_t value)
{
if(_buffer.size() < _wpos + 1)
_buffer.resize(_wpos + 1);
memcpy(&_buffer[_wpos], &value, 1);
_wpos += 1;
return *this;
}
Packet& add(uint16_t value)
{
if(_buffer.size() < _wpos + 2)
_buffer.resize(_wpos + 2);
memcpy(&_buffer[_wpos], &value, 2);
_wpos += 2;
return *this;
}
uint8_t readUint8()
{
uint8_t result = _buffer[_rpos];
_rpos += sizeof(uint8_t);
return result;
}
uint16_t readUint16()
{
uint16_t result;
memcpy(&result, &_buffer[_rpos], sizeof(uint16_t));
_rpos += sizeof(uint16_t);
return result;
}
uint16_t m_opcode;
std::vector<uint8_t> _buffer;
protected:
size_t _wpos; // Write position
size_t _rpos; // Read position
};
#endif // _PACKET_H_
Since you're using an std::vector for your buffer, you may as well let it keep track of the write position itself and avoid having to keep manually resizing it. You can also avoid writing multiple overloads of the add function by using a function template:
template <class T>
Packet& add(T value) {
std::copy((uint8_t*) &value, ((uint8_t*) &value) + sizeof(T), std::back_inserter(_buffer));
return *this;
}
Now you can write any POD type to your buffer.
implicitly:
int i = 5;
o.add(i);
or explicitly:
o.add<int>(5);
To read from the buffer, you will need to keep track of a read position:
template <class T>
T read() {
T result;
uint8_t *p = &_buffer[_rpos];
std::copy(p, p + sizeof(T), (uint8_t*) &result);
_rpos += sizeof(T);
return result;
}
You will need to explicitly pass a type parameter to read. i.e.
int i = o.read<int>();
Caveat: I have used this pattern often, but since I am typing this off the top of my head, there may be a few errors in the code.
Edit: I just noticed that you want to be able to add strings or other non-POD types to your buffer. You can do that via template specialization:
template <>
Packet& add(std::string s) {
add(s.length()); // length prefix, stored as size_t
for (size_t i = 0; i < s.length(); ++i)
add(s[i]); // then one char at a time
return *this;
}
This tells the compiler: if add is called with a string type, use this function instead of the generic add() function.
and to read a string:
template <>
std::string read<std::string>() {
size_t len = read<size_t>();
std::string s;
while (len--)
s += read<char>();
return s;
}
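Putting the pieces of this answer together, a minimal self-contained sketch could look like the following. It is not the asker's Packet class: PacketSketch, addString and readString are hypothetical names, the 16-bit length prefix is an arbitrary choice for the sketch, and plain member functions stand in for the template specializations to keep overload resolution obvious. Values are copied in host byte order, so real network code would still need a byte-order conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical sketch of a byte buffer with typed add/read helpers.
class PacketSketch {
public:
    // Append the raw bytes of any trivially copyable value (host byte order).
    template <class T>
    PacketSketch& add(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "add() copies raw bytes; use addString() for strings");
        const std::uint8_t* p = reinterpret_cast<const std::uint8_t*>(&value);
        buffer_.insert(buffer_.end(), p, p + sizeof(T));
        return *this;
    }

    // Store a string as a 16-bit length followed by its characters.
    // A const char* argument converts to std::string implicitly.
    PacketSketch& addString(const std::string& s) {
        assert(s.size() <= UINT16_MAX);
        add(static_cast<std::uint16_t>(s.size()));
        buffer_.insert(buffer_.end(), s.begin(), s.end());
        return *this;
    }

    // Copy sizeof(T) bytes out of the buffer and advance the read position.
    template <class T>
    T read() {
        static_assert(std::is_trivially_copyable<T>::value,
                      "read() copies raw bytes; use readString() for strings");
        assert(rpos_ + sizeof(T) <= buffer_.size());
        T result;
        std::memcpy(&result, buffer_.data() + rpos_, sizeof(T));
        rpos_ += sizeof(T);
        return result;
    }

    // Read the 16-bit length, then that many characters.
    std::string readString() {
        const std::uint16_t len = read<std::uint16_t>();
        assert(rpos_ + len <= buffer_.size());
        std::string s(reinterpret_cast<const char*>(buffer_.data() + rpos_), len);
        rpos_ += len;
        return s;
    }

    std::size_t size() const { return buffer_.size(); }

private:
    std::vector<std::uint8_t> buffer_; // grows as values are added
    std::size_t rpos_ = 0;             // read position
};
```

Usage would then be something like p.add(uint8_t(5)).add(uint16_t(4000)).addString("hello"), followed by p.read<uint8_t>(), p.read<uint16_t>() and p.readString() in the same order.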
You could use a std::string as the internal buffer and call append() when adding new elements.
That makes adding strings or const char* trivial.
A uint8 can be written by casting it to char; a uint16 can be written by passing a char* to its bytes together with the length sizeof(uint16_t):
void write_uint16( uint16_t val )
{
m_strBuffer.append( (char*)(&val), sizeof(val) );
}
Reading uint16:
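uint16_t read_uint16()
{
uint16_t val;
// memcpy avoids reading through a possibly misaligned uint16_t pointer
memcpy( &val, m_strBuffer.data() + m_nOffset, sizeof(val) );
m_nOffset += sizeof(val); // advance the read position
return val;
}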
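As a rough, self-contained sketch of this std::string-backed approach: the class name StringBackedBuffer and the write_string and read_string helpers are hypothetical, and read_string assumes the caller already knows the string length, since nothing in this sketch stores it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical sketch; member names follow the snippets above.
class StringBackedBuffer {
public:
    // A single byte is appended as one char.
    void write_uint8(std::uint8_t val) {
        m_strBuffer.push_back(static_cast<char>(val));
    }

    // The two bytes of val are appended in host byte order.
    void write_uint16(std::uint16_t val) {
        m_strBuffer.append(reinterpret_cast<const char*>(&val), sizeof(val));
    }

    // Appends the characters only; no length prefix or terminator is stored.
    void write_string(const std::string& s) { m_strBuffer.append(s); }

    // Copies two bytes out at the current offset and advances it.
    std::uint16_t read_uint16() {
        assert(m_nOffset + sizeof(std::uint16_t) <= m_strBuffer.size());
        std::uint16_t val;
        std::memcpy(&val, m_strBuffer.data() + m_nOffset, sizeof(val));
        m_nOffset += sizeof(val);
        return val;
    }

    // Reads len characters starting at the current offset and advances it.
    std::string read_string(std::size_t len) {
        assert(m_nOffset + len <= m_strBuffer.size());
        std::string s = m_strBuffer.substr(m_nOffset, len);
        m_nOffset += len;
        return s;
    }

private:
    std::string m_strBuffer;   // raw packet bytes
    std::size_t m_nOffset = 0; // read position
};
```

Because std::string can hold embedded zero bytes, binary values and text can share the same buffer.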
You appear to be attempting to print ten bytes out of the buffer when you've only added four, and thus you're running off the end of the vector. This could be causing your seg fault.
Also, your printf passes a uint8_t where %x expects an unsigned int. Use static_cast<unsigned>(packet._buffer[i]) as the argument so the types match.
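Both points can be sketched together as below. hex_dump and hex_dump_demo are hypothetical helpers, not part of the asker's code; the output is built as a string here only so that it is easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Formats every byte of buf in hex, one line per byte.
// The loop is bounded by buf.size(), so it cannot run off the end.
std::string hex_dump(const std::vector<std::uint8_t>& buf) {
    std::string out;
    char line[48];
    for (std::size_t i = 0; i < buf.size(); ++i) {
        // The cast makes the argument type match what %x expects.
        std::snprintf(line, sizeof line, "Byte %zu = %x\n",
                      i, static_cast<unsigned>(buf[i]));
        out += line;
    }
    return out;
}

// Hypothetical usage sketch: three bytes in, three lines out.
void hex_dump_demo() {
    const std::vector<std::uint8_t> bytes = {0x05, 0xa0, 0x0f};
    assert(hex_dump(bytes) == "Byte 0 = 5\nByte 1 = a0\nByte 2 = f\n");
}
```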
Stylistically:
Packet packet = Packet(); could potentially result in two objects being constructed (before C++17, which guarantees that the copy is elided). Just use Packet packet;
Generally try to avoid protected attributes (protected methods are fine) as they reduce encapsulation of your class.