How to parse data in C++

final byte LOGIN_REQUEST = 1;
long deviceId = 123456789;
String nickname = "testid";
Socket mSocket = new Socket("localhost", 12021);
ByteBuffer bBuffer = ByteBuffer.allocate(1 + 8 + 4 + nickname.getBytes().length);
bBuffer.order(ByteOrder.LITTLE_ENDIAN);
//1
bBuffer.put(LOGIN_REQUEST);
//8
bBuffer.putLong(deviceId);
byte[] bString = nickname.getBytes();
int sLength = bString.length;
//4
bBuffer.putInt(sLength);
bBuffer.put(bString);
I am sending byte data like this and I want to parse it on my Linux server using C++.
In C++, I am reading
char *pdata = new char[BUF_SIZE];
int dataLength = read(m_events[i].data.fd, pdata, BUF_SIZE);
and pushing pdata into the pthread's queue. I think I have to read the first byte to see the type of the packet, then the next 8 bytes to get the device id, and so on.
Please give me some references or a tutorial on how to do this in C++.
Thanks in advance.

The code below will do the trick. A Java int is 32 bits and a Java long is 64 bits, which is what the sizes used here assume.
#include <inttypes.h>
#include <string.h>  /* memcpy */
#include <stdlib.h>  /* malloc */
// Declare variables
unsigned char login_req;
int64_t device_id;
uint32_t name_len;
char* name_str;
// Populate variables;
login_req = pdata[0];
memcpy( &device_id, pdata+1, 8 );
memcpy( &name_len, pdata+9, 4 );
name_str = (char*)malloc( name_len + 1 );
memcpy( name_str, pdata+13, name_len );
name_str[name_len] = '\0';
Note: I am glossing over some things, namely:
It does not handle the case where BUF_SIZE is too small.
It does not handle the case where the machine running the C program is not little-endian. If it is big-endian you would need to swap the bytes after the memcpy for device_id and name_len (see the sketch below).
It does not cast on the memcpy calls to avoid possible compiler warnings.
This solution is pure C and will work in C++ too.
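For the big-endian case, here is a minimal byte-swap sketch (these helper functions are my own illustration, not part of the original answer; on a little-endian host they are not needed):
#include <stdint.h>
/* Swap the bytes of the little-endian wire values into host order
   on a big-endian machine. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
}
static uint64_t swap64(uint64_t v)
{
    return ((uint64_t)swap32((uint32_t)v) << 32) | swap32((uint32_t)(v >> 32));
}
/* After the memcpy calls above, on a big-endian host only: */
/* device_id = (int64_t)swap64((uint64_t)device_id); */
/* name_len  = swap32(name_len); */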

If you have control over the Java code, and there's not a standard network protocol you have to implement, then I would recommend a different approach. Instead of pushing bytes around at this low a level, you can use a higher level serialization library, such as Google Protocol Buffers. There are both C++ tutorials and Java tutorials, which should get you started.
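As a rough sketch of what that looks like (the message type, field names and file names here are made up for illustration, not taken from the question):
// login.proto (hypothetical definition)
// message LoginRequest {
//     required int64 device_id = 1;
//     required string nickname = 2;
// }

// C++ receiving side, after generating login.pb.h with protoc:
#include "login.pb.h"

LoginRequest req;
if (req.ParseFromArray(pdata, dataLength)) {
    int64_t deviceId = req.device_id();
    const std::string& nickname = req.nickname();
    // handle the login request...
}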

Using the iostream library is pretty much the same as what you could have done in C with basic read and write, minus the intermediate buffer, i.e.:
#include <inttypes.h>
#include <iostream>
#include <cstdlib>   // malloc
void readData( std::istream& input )
{
    // Declare variables
    unsigned char login_req;
    int64_t device_id;
    uint32_t name_len;
    char* name_str;
    // Populate variables
    input.read( reinterpret_cast<char*>(&login_req), 1 );
    input.read( reinterpret_cast<char*>(&device_id), 8 );
    input.read( reinterpret_cast<char*>(&name_len), 4 );
    name_str = (char*)malloc( name_len + 1 );
    input.read( name_str, name_len );
    name_str[name_len] = '\0';
}
Once again I am not error checking the istream::read calls or worrying about endian issues. Trying to keep it simple.
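Assuming the bytes were already pulled off the socket into pdata as in the question, one way to feed them to readData is to wrap the buffer in a string stream (just a sketch):
#include <sstream>
#include <string>

// pdata and dataLength come from the read() call in the question
std::string raw(pdata, dataLength);
std::istringstream in(raw);
readData(in);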

Related

ntohl() returning 0 when reading from mmap()

Good evening, I am attempting to read some binary information from a .img file. I can retrieve 16-bit numbers (uint16_t) from ntohs(), but when I try to retrieve from the same position using ntohl(), it gives me 0 instead.
Here are the critical pieces of my program.
#include <iostream>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <arpa/inet.h>
#include <cmath>
int fd;

struct blockInfo {
    long blockSize = 0;
    long blockCount = 0;
    long fatStart = 0;
    long fatBlocks = 0;
    long rootStart = 0;
    long rootBlocks = 0;
    long freeBlocks = 0;
    long resBlocks = 0;
    long alloBlocks = 0;
};

int main(int argc, char *argv[]) {
    fd = open(argv[1], O_RDWR);
    // Get file size
    struct stat buf{};
    stat(argv[1], &buf);
    size_t size = buf.st_size;
    // A struct to hold data retrieved from a big endian image.
    blockInfo info;
    auto mapPointer = (char*) mmap(nullptr, size,
                                   (PROT_READ | PROT_WRITE), MAP_PRIVATE, fd, 0);
    info.blockSize = ntohs((uint16_t) mapPointer[12]);
    long anotherBlockSize = ntohl((uint32_t) mapPointer[11]);
    printf("%ld", info.blockSize);     // == 512, correct
    printf("%ld", anotherBlockSize);   // == 0, what?
}
I understand that blockSize and anotherBlockSize are not supposed to be equal, but anotherBlockSize should be non-zero at the least, right?
Something else: when I access data at ntohs(pointer[16]), which should return 2, it also returns 0. What is going on here? Any help would be appreciated.
No, anotherBlockSize will not necessarily be non-zero
info.blockSize = ntohs((uint16_t) mapPointer[12]);
This code reads a char from offset 12 relative to mapPointer, casts it to uint16_t and applies ntohs() to it.
long anotherBlockSize = ntohl((uint32_t) mapPointer[11]);
This code reads a char from offset 11 relative to mapPointer, casts it to uint32_t and applies ntohl() to it.
Obviously, you are reading non-overlapping data (different chars) from the mapped memory, so you should not expect blockSize and anotherBlockSize to be related.
If you are trying to read the same memory in different ways (as uint32_t and uint16_t), you must do some pointer casting:
info.blockSize = ntohs( *((uint16_t*)&mapPointer[12]));
Note that such code is generally platform dependent: a cast like this may work perfectly on x86 but fail on ARM because of alignment requirements.
auto mapPointer = (char*) ...
This declares mapPointer to be a char *.
... ntohl((uint32_t) mapPointer[11]);
Your obvious intent here is to use mapPointer to retrieve a 32 bit value, a four-byte value, from this location.
Unfortunately, because mapPointer is a plain, garden-variety char *, the expression mapPointer[11] evaluates to a single, lonely char value. One byte. That's what the code reads from the mmaped memory block, at the 11th offset from the start of the block. The (uint32_t) does not read a uint32_t from the address referenced by mapPointer+11. mapPointer[11] reads a single char value from mapPointer+11 (because mapPointer is a pointer to char), converts it to a uint32_t, and feeds it to ntohl().
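If the goal is to read a 32-bit big-endian value starting at that offset, one way that avoids both the truncation and any alignment problems is to memcpy the four bytes into a properly typed variable first (a sketch; it assumes the mapping is large enough and that <cstring> is included for memcpy):
uint32_t raw;
memcpy(&raw, mapPointer + 11, sizeof raw);   // copy 4 bytes, no alignment assumptions
long anotherBlockSize = ntohl(raw);          // convert big-endian to host order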

How to make secondary DirectBuffer sound?

I am trying to get sound when tapping the keyboard, like a little drum machine.
If DirectSound is not a proper way to do this, please suggest something else.
In my code I don't know what's wrong. Here it is without error checking, and with the comments translated:
//Declaring the IDirectSound object
IDirectSound* device;
DirectSoundCreate(NULL, &device, NULL);
device->SetCooperativeLevel(hWnd, DSSCL_NORMAL );
/* Declaring secondary buffers */
IDirectSoundBuffer* kickbuf;
IDirectSoundBuffer* snarebuf;
/* Declaring .wav file pointers
and structures for reading the information at the beginning of the .wav file */
FILE* fkick;
FILE* fsnare;
sWaveHeader kickHdr;
sWaveHeader snareHdr;
The structure sWaveHeader is declared this way:
typedef struct sWaveHeader
{
    char           RiffSig[4];         // 'RIFF'
    unsigned long  WaveformChunkSize;  // 8
    char           WaveSig[4];         // 'WAVE'
    char           FormatSig[4];       // 'fmt '
    unsigned long  FormatChunkSize;    // 16
    unsigned short FormatTag;          // WAVE_FORMAT_PCM
    unsigned short Channels;           // Channels
    unsigned long  SampleRate;
    unsigned long  BytesPerSec;
    unsigned short BlockAlign;
    unsigned short BitsPerSample;
    char           DataSig[4];         // 'data'
    unsigned long  DataSize;
} sWaveHeader;
The .wav file opening
#define KICK "D:/muzic/kick.wav"
#define SNARE "D:/muzic/snare.wav"
fkick = fopen(KICK, "rb");
fsnare = fopen(SNARE, "rb");
Here I make a function that does the common work for snarebuf and kickbuf:
int read_wav_to_WaveHeader (sWaveHeader* , FILE* , IDirectSoundBuffer* ); // The declaring
But I will not write out this function, just show the way it works with kickbuf, for instance.
fseek(fkick, 0, SEEK_SET); // Zero the position in file
fread(&kickHdr, 1, sizeof(sWaveHeader), fkick); // reading the sWaveHeader structure from file
Here is a check that the sWaveHeader structure is valid:
if(memcmp(pwvHdr.RiffSig, "RIFF", 4) ||
   memcmp(pwvHdr.WaveSig, "WAVE", 4) ||
   memcmp(pwvHdr.FormatSig, "fmt ", 4) ||
   memcmp(pwvHdr.DataSig, "data", 4))
    return 1;
Declaring the format and descriptor for a buffer and filling them:
DSBUFFERDESC bufDesc;
WAVEFORMATEX wvFormat;
ZeroMemory(&wvFormat, sizeof(WAVEFORMATEX));
wvFormat.wFormatTag = WAVE_FORMAT_PCM;
wvFormat.nChannels = kickHdr.Channels;
wvFormat.nSamplesPerSec = kickHdr.SampleRate;
wvFormat.wBitsPerSample = kickHdr.BitsPerSample;
wvFormat.nBlockAlign = wvFormat.wBitsPerSample / 8 * wvFormat.nChannels;
ZeroMemory(&bufDesc, sizeof(DSBUFFERDESC));
bufDesc.dwSize = sizeof(DSBUFFERDESC);
bufDesc.dwFlags = DSBCAPS_CTRLVOLUME |
DSBCAPS_CTRLPAN |
DSBCAPS_CTRLFREQUENCY;
bufDesc.dwBufferBytes = kickHdr.DataSize;
bufDesc.lpwfxFormat = &wvFormat;
Well, the creating of a buffer:
device->CreateSoundBuffer(&bufDesc, &kickbuf, NULL); // Any mistakes by this point?
Now locking the buffer and loading some data to it.
This data starts after sizeof(sWaveHeader) bytes in a WAVE file, am I wrong?
LPVOID Ptr1; // pointer to the first locked block of data
LPVOID Ptr2; // pointer to the second locked block of data
DWORD Size1, Size2; // their sizes
Now calling the Lock() method:
kickbuf->Lock((DWORD)LockPos, (DWORD)Size,
&Ptr1, &Size1,
&Ptr2, &Size2, 0);
Loading data (is it ok?):
fseek(fkick, sizeof(sWaveHeader), SEEK_SET);
fread(Ptr1, 1, Size1, fkick);
if(Ptr2 != NULL)
fread(Ptr2, 1, Size2, fkick);
Unlocking the buffer:
kickbuf->Unlock(Ptr1, Size1, Ptr2, Size2);
Setting the volume:
kickbuf->SetVolume(-2500);
Then I make a while(1) loop:
1. ask for a key press
2. if it is pressed:
kickbuf->SetCurrentPosition(0);
kickbuf->Play(0,0,0);
But there is no sound playing. Please tell me what is wrong in my code, or maybe in the whole concept. Thank you.
When you initialize the WAVEFORMATEX, you are forgetting to set the nAvgBytesPerSec member. Add this line after the initialization of wvFormat.nBlockAlign:
wvFormat.nAvgBytesPerSec = wvFormat.nSamplesPerSec * wvFormat.nBlockAlign;
Also, I suspect this could be a problem:
kickbuf->SetVolume(-2500);
I suspect that will just attenuate your sample to absolute silence. Try taking that call out so that it plays at full volume.
More likely: none of your sample code above shows validation of the return values from any of the DirectSound APIs, nor of any of the file I/O calls. Have you validated that the HRESULTs returned by all the DSound APIs are S_OK? Have you tried printing, or using OutputDebugString to print, the values you computed for the members of WAVEFORMATEX?
Have you stepped through the fread calls to validate that you are getting valid data into your buffers?
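For example, a sketch of the kind of checking meant here (the early return is illustrative; adapt it to however your function reports errors):
HRESULT hr = device->CreateSoundBuffer(&bufDesc, &kickbuf, NULL);
if (FAILED(hr))
{
    // CreateSoundBuffer failed; report hr and stop instead of carrying on
    printf("CreateSoundBuffer failed: 0x%08X\n", (unsigned)hr);
    return;
}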
Hope this helps.

How to zlib compress a QByteArray?

I would like to maintain interoperability between every other application on the planet (including web applications) when compressing text. Since qCompress and qUncompress seem to go against the grain, I'm trying to use zlib directly from my Qt application.
I will accept the simplest (most minimal) answer that shows me how to use the zlib library with a QByteArray directly OR modify the output of qCompress so that it can be used outside of a Qt application.
Here's my embarrassing attempt:
QByteArray tdata = QString("Oh noes!").toUtf8();
QByteArray cdata;
uLongf len = 12 + 1.002*tdata.length();
compress(&cdata, &len, &tdata, tdata.length());
And the error:
error: cannot convert 'QByteArray*' to 'Bytef*' for argument '1' to 'int compress(Bytef*, uLongf*, const Bytef*, uLong)'
Then I tried using QByteArray::constData()
compress(cdata.constData(), &len, &tdata, tdata.length());
But got the following error:
error: invalid conversion from 'const char*' to 'Bytef*'
I have no idea what a Bytef is so I start looking in the zlib sources to investigate. But all I can find for this is in QtSources/src/3rdparty/zlib/zconf.h
# define Bytef z_Bytef
So now I'm just lost.
Based on this note in qUncompress, I think it's pretty easy.
Note: If you want to use this function to uncompress external data that was compressed using zlib, you first need to prepend a four byte header to the byte array containing the data. The header must contain the expected length (in bytes) of the uncompressed data, expressed as an unsigned, big-endian, 32-bit integer.
So you can probably just compress it like this:
QByteArray tdata = QString("Oh noes!").toUtf8();
QByteArray compressedData = qCompress(tdata);
compressedData.remove(0, 4);
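Going the other way (decompressing a raw zlib stream that came from outside Qt), the note above says to prepend a four-byte big-endian length header before calling qUncompress. A sketch, with a hypothetical helper name and assuming you know the expected uncompressed size from your protocol:
QByteArray addQtHeader(const QByteArray &externalData, quint32 expectedSize)
{
    QByteArray withHeader;
    // four-byte, big-endian expected uncompressed length, as qUncompress requires
    withHeader.append(char((expectedSize >> 24) & 0xFF));
    withHeader.append(char((expectedSize >> 16) & 0xFF));
    withHeader.append(char((expectedSize >> 8) & 0xFF));
    withHeader.append(char(expectedSize & 0xFF));
    withHeader.append(externalData);
    return withHeader;
}
// QByteArray plain = qUncompress(addQtHeader(externalData, expectedSize));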
Here is some code I once wrote which gets as input a pointer to a byte array, the number of bytes to compress and the compression level and then uses zlib to compress the input. The result is returned in a string.
#include <zlib.h>
#include <string>
#include <stdexcept>
#include <algorithm>
#include <cstring>

enum compressionLevel
{
    clFast,
    clSmall,
    clDefault
};

const size_t ChunkSize = 262144; //256k default size for chunks fed to zlib

void compressZlib(const char *s, size_t nbytes, std::string &out, compressionLevel l /*= clDefault*/ )
{
    int level = Z_DEFAULT_COMPRESSION;
    switch (l)
    {
    case clDefault:
        level = Z_DEFAULT_COMPRESSION; break;
    case clSmall:
        level = Z_BEST_COMPRESSION; break;
    case clFast:
        level = Z_BEST_SPEED; break;
    };

    z_stream strm;
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    int ret = deflateInit(&strm, level);
    if (ret != Z_OK)
    {
        throw std::runtime_error("Error while initializing zlib, error code " + std::to_string(ret));
    }

    size_t toCompress = nbytes;
    char *readp = (char*)s;
    size_t writeOffset = out.size();
    out.reserve((size_t)(nbytes * 0.7));

    while (toCompress > 0)
    {
        size_t toRead = std::min(toCompress, ChunkSize);
        int flush = toRead < toCompress ? Z_NO_FLUSH : Z_FINISH;
        strm.avail_in = toRead;
        strm.next_in = (Bytef*)readp;

        char *writep = new char[ChunkSize];
        do {
            strm.avail_out = ChunkSize;
            strm.next_out = (Bytef*)writep;
            deflate(&strm, flush);
            size_t written = ChunkSize - strm.avail_out;
            out.resize(out.size() + written);
            memcpy(&(out[writeOffset]), writep, written);
            writeOffset += written;
        } while (strm.avail_out == 0);
        delete[] writep;

        readp += toRead;
        toCompress -= toRead;
    }
    (void)deflateEnd(&strm);
}
Maybe this helps you solve your problem. Using cdata.constData() you can call this function directly on your QByteArray.
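For example, a call with the QByteArray from the question might look like this (untested sketch):
QByteArray tdata = QString("Oh noes!").toUtf8();
std::string compressed;
compressZlib(tdata.constData(), tdata.size(), compressed, clDefault);
// 'compressed' now holds a raw zlib stream readable by other applications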
Just to help you out with the last section of your question here:
I have no idea what a Bytef is so I start looking in the zlib sources to investigate.
For the definitions of Byte and Bytef, look at lines 332 and 333 of zconf.h, as well as line 342:
332 #if !defined(__MACTYPES__)
333 typedef unsigned char Byte; /* 8 bits */
...
338 #ifdef SMALL_MEDIUM
339 /* Borland C/C++ and some old MSC versions ignore FAR inside typedef */
340 # define Bytef Byte FAR
341 #else
342 typedef Byte FAR Bytef;
The definition of FAR is for mixed-mode MSDOS programming, otherwise it is not defined as anything (see lines 328-330 of zconf.h).
Thus the zlib typedefs Bytef and Byte are basically the same as unsigned char on most platforms. Therefore you should be able to do the following:
QByteArray tdata = QString("Oh noes!").toUtf8();
QByteArray cdata(compressBound(tdata.length()), '\0');
uLongf len = compressBound(tdata.length());
compress(reinterpret_cast<unsigned char*>(cdata.data()), &len,
         reinterpret_cast<unsigned char*>(tdata.data()), tdata.length());
cdata.resize(len);  // shrink to the actual compressed size

C++ replacement for BYTE C macro

I'm trying to port the C openGL texture loading code found here:
http://www.nullterminator.net/gltexture.html
to C++. In particular I'm trying to deal with reading some textures in from a file, what is the best way of rewriting the following code in an idiomatic and portable manner:
GLuint texture;
int width = 256, height = 256;
BYTE * data;
FILE * file;
// open texture data
file = fopen( filename, "rb" );
if ( file == NULL ) return 0;
// allocate buffer
data = malloc( width * height * 3 );
// read texture data
fread( data, width * height * 3, 1, file );
fclose( file );
In particular, what is the best way of replacing the BYTE macro in a portable, C++ way?
EDIT: the BYTE macro is not defined in the environment I am working in. I was trying to figure out what its underlying type is on other systems so that I can typedef the correct type.
Assuming the original code is portable, you can just leave it. Just make sure you pull in the definition of BYTE as is. C++ compilers are backwards compatible to C, so the corresponding headers are still there.
(If BYTE is really a macro, I'd perhaps typedef it.)
The C code should work just fine when compiled as C++.
Rather than use the BYTE type, just use the OpenGL-defined type GLbyte, which is the actual type the APIs take anyway. It is defined in gl.h thus:
typedef signed char GLbyte;
A very quick (untested!) translation of the above code into C++ would be something like:
#include <fstream>

GLuint texture;
unsigned width = 256, height = 256;
unsigned buffer_size = width * height * 3;
GLbyte * data;
std::ifstream file;
// open texture data
file.open(filename, std::ios_base::in | std::ios_base::binary);
if (!file) return 0;
// allocate buffer
data = new GLbyte[buffer_size];
// read texture data
file.read(reinterpret_cast<char*>(data), buffer_size);
file.close();
// Process data...
// ...
// Don't forget to release it when you're done!
delete [] data;
BYTE in this case seems to be just another name for char or unsigned char, so BYTE* is equivalent to char* or unsigned char*. I could be wrong but I doubt it. So using char* or unsigned char* in your program would be equivalent. However, if you are porting from C to C++ you might want to consider using ifstream (in binary mode) from the C++ standard library.
Use unsigned char instead of BYTE - it should work as expected (you might have to cast the return value of malloc()).
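A sketch of that suggestion, combined with the ifstream idea from the other answer (untested; width, height and filename are the variables from the question):
#include <fstream>
#include <vector>

std::vector<unsigned char> data(width * height * 3);
std::ifstream file(filename, std::ios::binary);
if (!file) return 0;
file.read(reinterpret_cast<char*>(data.data()), data.size());
// the buffer is released automatically when 'data' goes out of scope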

Serialization/Deserialization of a struct to a char* in C

I have a struct
struct Packet {
    int senderId;
    int sequenceNumber;
    char data[MaxDataSize];

    char* Serialize() {
        char *message = new char[MaxMailSize];
        message[0] = senderId;
        message[1] = sequenceNumber;
        for (unsigned i = 0; i < MaxDataSize; i++)
            message[i+2] = data[i];
        return message;
    }

    void Deserialize(char *message) {
        senderId = message[0];
        sequenceNumber = message[1];
        for (unsigned i = 0; i < MaxDataSize; i++)
            data[i] = message[i+2];
    }
};
I need to convert this to a char* of maximum length MaxMailSize (> MaxDataSize) for sending over the network, and then deserialize it at the other end.
I can't use tpl or any other library.
Is there any way to make this better? I am not that comfortable with this. Or is this the best we can do?
Since this is to be sent over a network, I strongly advise you to convert those data into network byte order before transmitting, and back into host byte order when receiving. This is because the byte ordering is not the same everywhere, and once your bytes are not in the right order, it may become very difficult to reverse them (depending on the programming language used on the receiving side). Byte ordering functions are defined along with sockets, and are named htons(), htonl(), ntohs() and ntohl() (in those names: h means 'host', i.e. your computer, n means 'network', s means 'short' or 16-bit value, l means 'long' or 32-bit value).
Then you are on your own with serialization; C and C++ have no automatic way to perform it. Some software can generate the code for you, like the ASN.1 implementation asn1c, but such tools are difficult to use because they involve much more than just copying data over the network.
Depending on whether you have enough space or not... you might simply use streams :)
std::string Serialize() {
    std::ostringstream out;
    char version = '1';
    out << version << senderId << '|' << sequenceNumber << '|' << data;
    return out.str();
}

void Deserialize(const std::string& iString)
{
    std::istringstream in(iString);
    char version = 0, check1 = 0, check2 = 0;
    in >> version;
    switch(version)
    {
    case '1':
        in >> senderId >> check1 >> sequenceNumber >> check2 >> data;
        break;
    default:
        // Handle an unknown version
        break;
    }
    // You can check here that 'check1' and 'check2' are both equal to '|'
}
I readily admit it takes more space... or that it might.
Actually, on a 32-bit architecture an int usually covers 4 bytes (4 chars). Serializing it with streams only takes more than 4 chars when the value is greater than 9999, which usually leaves some room.
Also note that you should probably include some guards in your stream, just to check when you get it back that it is all right.
Versioning is probably a good idea; it does not cost much and allows for unplanned later development.
You can have a class representing the object you use in your software, with all the niceties, member functions and whatever you need. Then you have a 'serialized' struct that is more of a description of what will end up on the network.
To ensure the compiler does exactly what you tell it to do, you need to instruct it to 'pack' the structure. The directive I used here is for gcc; see your compiler's documentation if you are not using gcc.
Then the serialize and deserialize routines just convert between the two, taking care of byte order and details like that.
#include <arpa/inet.h> /* ntohl htonl */
#include <string.h>    /* memcpy */

class Packet {
    int senderId;
    int sequenceNumber;
    char data[MaxDataSize];
public:
    void* Serialize();
    void Deserialize(void *message);
};

struct SerializedPacket {
    int senderId;
    int sequenceNumber;
    char data[MaxDataSize];
} __attribute__((packed));

void* Packet::Serialize() {
    struct SerializedPacket *s = new SerializedPacket();
    s->senderId = htonl(this->senderId);
    s->sequenceNumber = htonl(this->sequenceNumber);
    memcpy(s->data, this->data, MaxDataSize);
    return s;
}

void Packet::Deserialize(void *message) {
    struct SerializedPacket *s = (struct SerializedPacket*)message;
    this->senderId = ntohl(s->senderId);
    this->sequenceNumber = ntohl(s->sequenceNumber);
    memcpy(this->data, s->data, MaxDataSize);
}
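A rough usage sketch for the sending side (sock is an assumed, already-connected socket descriptor; error handling omitted):
#include <sys/socket.h>

Packet pkt;
// ... fill pkt through whatever interface the class provides ...
void *wire = pkt.Serialize();
send(sock, wire, sizeof(SerializedPacket), 0);
delete static_cast<SerializedPacket*>(wire);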
int senderId;
int sequenceNumber;
...
char *message = new char[MaxMailSize];
message[0] = senderId;
message[1] = sequenceNumber;
You're truncating values here. senderId and sequenceNumber are both ints and will take up more than sizeof(char) bytes on most architectures. Try something more like this:
char * message = new char[MaxMailSize];
int offset = 0;
memcpy(message + offset, &senderId, sizeof(senderId));
offset += sizeof(senderId);
memcpy(message + offset, &sequenceNumber, sizeof(sequenceNumber));
offset += sizeof(sequenceNumber);
memcpy(message + offset, data, MaxDataSize);
EDIT:
fixed code written in a stupor. Also, as noted in comment, any such packet is not portable due to endian differences.
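The matching deserialization is just the same memcpy calls in reverse (a sketch, with the same endianness caveat):
int offset = 0;
memcpy(&senderId, message + offset, sizeof(senderId));
offset += sizeof(senderId);
memcpy(&sequenceNumber, message + offset, sizeof(sequenceNumber));
offset += sizeof(sequenceNumber);
memcpy(data, message + offset, MaxDataSize);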
To answer your question generally, C++ has no reflection mechanism, so manual serialize and deserialize functions defined on a per-class basis are the best you can do. That being said, the serialization function you wrote will mangle your data. Here is a correct implementation:
char * message = new char[MaxMailSize];
int net_senderId = htonl(senderId);
int net_sequenceNumber = htonl(sequenceNumber);
memcpy(message, &net_senderId, sizeof(net_senderId));
memcpy(message + sizeof(net_senderId), &net_sequenceNumber, sizeof(net_sequenceNumber));
// and then the payload after the two converted integers
memcpy(message + sizeof(net_senderId) + sizeof(net_sequenceNumber), data, MaxDataSize);
As mentioned in other posts, senderId and sequenceNumber are both of type int, which is likely to be larger than char, so these values will be truncated.
If that's acceptable, then the code is OK. If not, then you need to split them into their constituent bytes. Given that the protocol you are using will specify the byte order of multi-byte fields, the most portable, and least ambiguous, way of doing this is through shifting.
For example, let's say that senderId and sequenceNumber are both 2 bytes long, and the protocol requires that the higher byte goes first:
char* Serialize() {
char *message = new char[MaxMailSize];
message[0] = senderId >> 8;
message[1] = senderId;
message[2] = sequenceNumber >> 8;
message[3] = sequenceNumber;
memcpy(&message[4], data, MaxDataSize);
return message;
}
I'd also recommend replacing the for loop with memcpy (if available), as it's unlikely to be less efficient, and it makes the code shorter.
Finally, this all assumes that char is one byte long. If it isn't, then all the data will need to be masked, e.g.:
message[0] = (senderId >> 8) & 0xFF;
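Deserializing mirrors this, reassembling each field from its bytes (a sketch, again assuming 2-byte fields with the high byte first):
senderId       = ((unsigned char)message[0] << 8) | (unsigned char)message[1];
sequenceNumber = ((unsigned char)message[2] << 8) | (unsigned char)message[3];
memcpy(data, &message[4], MaxDataSize);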
You can use Protocol Buffers for defining and serializing structs and classes. This is what Google uses internally, and it has a very compact wire format.
http://code.google.com/apis/protocolbuffers/