Looking for a sequence number in a series of char* buffers? - c++

I am receiving a stream of const char* msg buffers of a certain size_t len. At some byte offset within each buffer there is a sequence number (32 or 64 bit, I'm not sure which), so my idea was to do the following every time I get one of the msg buffers:
for (int i = 0; i < 30; ++i)
{
    uint32_t seq = *(uint32_t*) msg[i];
    cout << "seq" << i << " " << seq << endl;
}
// and similarly for 64 bits
so that afterwards I can group the lines with the same offset and see which offset i gives me sequential-looking output. The problem with this is that I segfault with stuff like:
(gdb) p *(uint32_t*) msg[i]
Cannot access memory at address 0x2d
How can I carry out my little search idea for the sequence numbers?

Try:
uint32_t seq = *(uint32_t*) &msg[i];
and
(gdb) p *(uint32_t*)&msg[i]
EDIT: A bigger change, which is potentially more portable, is:
uint32_t seq;
memcpy(&seq, msg + i, sizeof(seq));
seq = ntohl(seq);
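For the search itself, here is a minimal sketch of the whole scan, assuming (as in the question) that msg and len are the received buffer and its length and that the field is in network byte order; ntohl comes from <arpa/inet.h> on POSIX systems:

#include <cstring>
#include <cstdint>
#include <iostream>
#include <arpa/inet.h> // ntohl (POSIX)

void scanOffsets(const char* msg, size_t len)
{
    // The bounds check keeps the 4-byte read inside the buffer.
    for (size_t i = 0; i + sizeof(uint32_t) <= len && i < 30; ++i)
    {
        uint32_t seq;
        std::memcpy(&seq, msg + i, sizeof(seq)); // alignment-safe read
        std::cout << "seq" << i << " " << ntohl(seq) << std::endl;
    }
}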

char msg[30];
for (int i = 0; i < 30; i++)
    msg[i] = '\0';

char *iter_p = msg;
int i = 0;
// Stop while a full 4-byte read still fits inside the buffer.
while (iter_p + 4 <= &msg[30]) {
    uint32_t seq = *(uint32_t *)iter_p;
    cout << "seq" << i << " " << seq << endl;
    iter_p += 4;
    i++;
}
Try iterating through it like this: step an iterator pointer through. =)
iter_p += 4 advances 4 bytes (32 bits), since iter_p is a char pointer.

That's not how you convert bytes to an int; you are trying to dereference a pointer to a location in memory that doesn't exist. Try something like this: http://www.cplusplus.com/forum/beginner/3076/

You are making a simple mistake: msg[i] returns the VALUE of the char at position i. To get its address you should use msg + i or &msg[i].
But this code is not portable to architectures that can't read an unaligned word.
The best way to read an unaligned word is to use a packed structure:
#pragma pack(1)
struct Header {
    uint32_t seq;
};
#pragma pack()

for (int i = 0; i < 30; ++i)
{
    const Header *h = (const Header *)(msg + i);
    cout << "seq" << i << " " << ntohl(h->seq) << endl;
}
Pay attention to the endianness issue and the ntohl call: if the stream carries the number in network (big-endian) byte order, ntohl converts it to the host's byte order.

Related

Deinterleave audio data in varied bitrates

I'm trying to write one function that can deinterleave 8/16/24/32 bit audio data, given that the audio data naturally arrives in an 8 bit buffer.
I have this working for 8 bit, and it works for 16/24/32, but only for the first channel (channel 0). I have tried so many + and * and other operators that I'm just guessing at this point. I cannot find the magic formula. I am using C++ but would also accept a memcpy into the vector if that's easiest.
Check out the code. If you change the demux call to another bitrate you will see the problem. There is an easy math solution here, I am sure; I just cannot get it.
#include <vector>
#include <map>
#include <iostream>
#include <iomanip>
#include <string>
#include <string.h>
#include <cstdint> // for uint8_t

const int bitrate = 8;
const int channel_count = 5;
const int audio_size = bitrate * channel_count * 4;
uint8_t audio_ptr[audio_size];
const int bytes_per_channel = audio_size / channel_count;

void Demux(int bitrate){
    int byterate = bitrate/8;
    std::map<int, std::vector<uint8_t> > channel_audio;
    for(int i = 0; i < channel_count; i++){
        std::vector<uint8_t> audio;
        audio.reserve(bytes_per_channel);
        for(int x = 0; x < bytes_per_channel; x += byterate){
            for(int z = 0; z < byterate; z++){
                // What is the magic formula!
                audio.push_back(audio_ptr[(x * channel_count) + i + z]);
            }
        }
        channel_audio.insert(std::make_pair(i, audio));
    }

    int remapsize = 0;
    std::cout << "\nRemapped Audio";
    std::map<int, std::vector<uint8_t> >::iterator it;
    for(it = channel_audio.begin(); it != channel_audio.end(); ++it){
        std::cout << "\nChannel" << it->first << " ";
        std::vector<uint8_t> v = it->second;
        remapsize += v.size();
        for(size_t i = 0; i < v.size(); i++){
            std::cout << "0x" << std::hex << std::setfill('0') << std::setw(2) << +v[i] << " ";
            if(i && (i + 1) % 32 == 0){
                std::cout << std::endl;
            }
        }
    }
    std::cout << "Total remapped audio size is " << std::dec << remapsize << std::endl;
}

int main()
{
    // External data
    std::cout << "Raw Audio\n";
    for(int i = 0; i < audio_size; i++){
        audio_ptr[i] = i;
        std::cout << "0x" << std::hex << std::setfill('0') << std::setw(2) << +audio_ptr[i] << " ";
        if(i && (i + 1) % 32 == 0){
            std::cout << std::endl;
        }
    }
    std::cout << "Total raw audio size is " << std::dec << audio_size << std::endl;

    Demux(8);
    //Demux(16);
    //Demux(24);
    //Demux(32);
}
You're actually pretty close. But the code is confusing: specifically, the variable names and what actual values they represent. As a result, you appear to be just guessing at the math. So let's go back to square one and determine what exactly it is we need to do, and the math will very easily fall out of it.
First, just imagine we have one sample covering each of the five channels. This is called an audio frame for that sample. The frame looks like this:
[channel0][channel1][channel2][channel3][channel4]
The width of a sample in one channel is called byterate in your code, but I don't like that name. I'm going to call it bytes_per_sample instead. You can easily see the width of the entire frame is this:
int bytes_per_frame = bytes_per_sample * channel_count;
It should be equally obvious that to find the starting offset for channel c within a single frame, you multiply as follows:
int sample_offset_in_frame = bytes_per_sample * c;
That's just about all you need! The last bit is your z loop which covers each byte in a single sample for one channel. I don't know what z is supposed to represent, apart from being a random single-letter identifier you chose, but hey let's just keep it.
Putting all this together, you get the absolute offset of sample s in channel c and then you copy individual bytes out of it:
int sample_offset = bytes_per_frame * s + bytes_per_sample * c;
for (int z = 0; z < bytes_per_sample; ++z) {
    audio.push_back(audio_ptr[sample_offset + z]);
}
This does actually assume you're looping over the number of samples, not the number of bytes in your channel. So let's show all the loops for completeness' sake:
const int bytes_per_sample = bitrate / 8;
const int bytes_per_frame = bytes_per_sample * channel_count;
const int num_samples = audio_size / bytes_per_frame;

for (int c = 0; c < channel_count; ++c)
{
    int sample_offset = bytes_per_sample * c;
    for (int s = 0; s < num_samples; ++s)
    {
        for (int z = 0; z < bytes_per_sample; ++z)
        {
            audio.push_back(audio_ptr[sample_offset + z]);
        }
        // Skip to the next frame
        sample_offset += bytes_per_frame;
    }
}
You'll see here that I split the math up so that it's doing fewer multiplications in the loops. This is mostly for readability, but it might also help a compiler understand what's happening when it tries to optimize. Concerns over optimization are secondary (and in your case, there are much more expensive worries going on with those vectors and the map).
The most important thing is that you have readable code, with reasonable variable names, that makes logical sense.
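For completeness, here is roughly how those loops might drop into the question's Demux (a sketch reusing the question's globals audio_ptr, audio_size, and channel_count; the printing part is unchanged):

void Demux(int bitrate)
{
    const int bytes_per_sample = bitrate / 8;
    const int bytes_per_frame  = bytes_per_sample * channel_count;
    const int num_samples      = audio_size / bytes_per_frame;

    std::map<int, std::vector<uint8_t> > channel_audio;
    for (int c = 0; c < channel_count; ++c)
    {
        std::vector<uint8_t> audio;
        audio.reserve(num_samples * bytes_per_sample);

        int sample_offset = bytes_per_sample * c;
        for (int s = 0; s < num_samples; ++s)
        {
            for (int z = 0; z < bytes_per_sample; ++z)
                audio.push_back(audio_ptr[sample_offset + z]);
            sample_offset += bytes_per_frame; // skip to the next frame
        }
        channel_audio.insert(std::make_pair(c, audio));
    }
    // ... print channel_audio exactly as in the question ...
}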

Why is the function displaying the hex code in reverse order?

The following code (in C++) is supposed to get some data along with its size (in bytes) and return a string containing the hexadecimal code. size is the size of the memory block whose location is stored in val.
std::string byteToHexString(const unsigned char* val, unsigned long long size)
{
    unsigned char temp;
    std::string vf;
    vf.resize(2 * size + 1);
    for (unsigned long long i = 0; i < size; i++)
    {
        temp = val[i] / 16;
        vf[2*i] = (temp <= 9) ? '0' + temp : 'A' + temp - 10;   // 10..15 -> 'A'..'F'
        temp = val[i] % 16;
        vf[2*i+1] = (temp <= 9) ? '0' + temp : 'A' + temp - 10;
    }
    vf[2*size] = '\0';
    return vf;
}
So on executing the above function the following way:
int main()
{
    unsigned int a = 5555;
    std::cout << byteToHexString((unsigned char*)(&a), 4);
    return 0;
}
The output we obtain is:
B3150000
Shouldn't the output rather be 000015B3? So why is this displaying in reverse order? Is there something wrong with the code (I am using g++ compiler in Ubuntu)?
You are seeing the order in which bytes are stored when representing integers on your architecture, which happens to be little-endian. That means the least-significant byte comes first.
If you want to display it in normal numeric form, you either need to detect the endianness of your architecture and switch the code accordingly, or just use a string stream:
#include <iostream>
#include <sstream>
#include <iomanip>

unsigned int a = 5555;
std::ostringstream ss;
ss << std::setfill('0') << std::setw(sizeof(a) * 2) << std::hex << a;
std::cout << ss.str() << std::endl;
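If you do want to go the detection route instead, one portable run-time check looks roughly like this (a sketch; since C++20 you could use std::endian for a compile-time answer):

#include <cstdint>

bool isLittleEndian()
{
    const uint16_t probe = 1;
    // On a little-endian machine the least-significant byte is stored
    // first, so the first byte of `probe` is 1.
    return *reinterpret_cast<const unsigned char*>(&probe) == 1;
}

byteToHexString could then walk the bytes from the end when this returns true.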

reading binary from a file gives negative number

Hey everyone, this may turn out to be a simple, stupid question, but it has been giving me headaches for a while now. I'm reading data from a Named Binary Tag file, and the code is working except when I try to read big-endian numbers. The code that gets an integer looks like this:
long NBTTypes::getInteger(istream &in, int num_bytes, bool isBigEndian)
{
    long result = 0;
    char buff[8];
    // get bytes
    readData(in, buff, num_bytes, isBigEndian);
    // convert to integer
    cout << "Converting bytes to integer..." << endl;
    result = buff[0];
    cout << "Result starts at " << result << endl;
    for (int i = 1; i < num_bytes; ++i)
    {
        result = (result << 8) | buff[i];
        cout << "Result is now " << result << endl;
    }
    cout << "Done." << endl;
    return result;
}
And the readData function:
void NBTTypes::readData(istream &in, char *buffer, unsigned long num_bytes, bool BE)
{
    char hold;
    // get data
    in.read(buffer, num_bytes);
    if (BE)
    {
        // convert to little-endian
        cout << "Converting to a little-endian number..." << endl;
        for (unsigned long i = 0; i < num_bytes / 2; ++i)
        {
            hold = buffer[i];
            buffer[i] = buffer[num_bytes - i - 1];
            buffer[num_bytes - i - 1] = hold;
        }
        cout << "Done." << endl;
    }
}
This code originally worked (gave correct positive values), but now, for whatever reason, the values I get are either over- or underflowing. What am I missing?
Your byte-order swapping is fine; however, building the integer from the sequence of bytes is not.
First of all, you get the endianness wrong: the first byte you read in becomes the most significant byte, while it should be the other way around.
Then, when OR-ing in the characters from the array, be aware that they are promoted to an int, which, for a signed char, sets a lot of additional bits unless you mask them out.
Finally, when long is wider than num_bytes, you need to sign-extend the bits.
This code works:
union {
long s; // Signed result
unsigned long u; // Use unsigned for safe bit-shifting
} result;
int i = num_bytes-1;
if (buff[i] & 0x80)
result.s = -1; // sign-extend
else
result.s = 0;
for (; i >= 0; --i)
result.u = (result.u << 8) | (0xff & buff[i]);
return result.s;
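If you'd rather avoid the union, the same idea can be written with a plain unsigned accumulator and manual sign extension (a sketch under the same assumptions: buff holds num_bytes bytes, least-significant byte first):

unsigned long u = 0;
for (int i = num_bytes - 1; i >= 0; --i)
    u = (u << 8) | (0xffu & buff[i]);   // mask off the promoted sign bits

// Sign-extend when the value is narrower than long.
if (num_bytes < (int)sizeof(long) && (buff[num_bytes - 1] & 0x80))
    u |= ~0ul << (8 * num_bytes);

return (long)u; // implementation-defined before C++20; fine on two's-complement machines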

Conversion from Integer to BCD

I want to convert an integer (whose maximum value can reach 99999999) into BCD and store it into an array of 4 characters.
Like for example:
Input is: 12345 (integer)
Output should be: "00012345" in BCD, stored into an array of 4 characters.
Here the bytes 0x00 0x01 0x23 0x45 are stored in BCD format.
I tried it in the below manner but it didn't work:
int decNum = 12345;
long aux;
aux = (long)decNum;
cout << " aux = " << aux << endl;
char* str = (char*)&aux;
char output[4];
int len = 0;
int i = 3;
while (len < 8)
{
    cout << "str: " << len << " " << (int)str[len] << endl;
    unsigned char temp = str[len] % 10;
    len++;
    cout << "str: " << len << " " << (int)str[len] << endl;
    output[i] = ((str[len]) << 4) | temp;
    i--;
    len++;
}
Any help will be appreciated
str actually points to a long (probably 4 bytes), but the iteration accesses 8 bytes.
The operation str[len] % 10 looks as if you were expecting digits, but there is only binary data there. In addition, I suspect that i goes negative.
First, don't use C-style casts (like (long)a or (char*)). They are a bad smell. Instead, learn and use C++-style casts (like static_cast<long>(a)), because they point out where you are doing things that are dangerous, instead of silently compiling and causing undefined behavior.
char* str = (char*)& aux; gives you a pointer to the bytes of aux -- it is actually char* str = reinterpret_cast<char*>(&aux);. It does not give you a traditional string with digits in it. sizeof(char) is 1, sizeof(long) is almost certainly 4, so there are only 4 valid bytes in your aux variable. You proceed to try to read 8 of them.
I doubt this is doing what you want it to do. If you want to print out a number into a string, you will have to run actual code, not just reinterpret bits in memory.
std::string s; std::stringstream ss; ss << aux; ss >> s; will create a std::string with the base-10 digits of aux in it.
Then you can look at the characters in s to build your BCD.
This is far from the fastest method, but it at least is close to your original approach.
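Concretely, a sketch of that last step: pad the string to 8 digits first, then pack two digits per byte.

#include <iomanip>
#include <sstream>
#include <string>

std::ostringstream ss;
ss << std::setw(8) << std::setfill('0') << 12345;
std::string s = ss.str();   // "00012345"

unsigned char output[4];
for (int i = 0; i < 4; ++i)
    output[i] = ((s[2 * i] - '0') << 4) | (s[2 * i + 1] - '0');
// output now holds 0x00 0x01 0x23 0x45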
First of all, sorry about the C code; I was deceived, since this started as a C question. Porting it to C++ should not really be such a big deal.
If you really want it in a char array, I would do something like the following code. I find it useful to still leave the result in little-endian format so I can just cast it to an int for printing it out; however, that is not strictly necessary:
#include <stdio.h>

typedef struct
{
    char value[4];
} BCD_Number;

BCD_Number bin2bcd(int bin_number);

int main(int argc, char **argv)
{
    BCD_Number bcd_result;
    bcd_result = bin2bcd(12345678);
    /* Assuming an int is 4 bytes */
    printf("result=0x%08x\n", *((int *)bcd_result.value));
}

BCD_Number bin2bcd(int bin_number)
{
    BCD_Number bcd_number;
    for (int i = 0; i < sizeof(bcd_number.value); i++)
    {
        bcd_number.value[i] = bin_number % 10;
        bin_number /= 10;
        bcd_number.value[i] |= (bin_number % 10) << 4;
        bin_number /= 10;
    }
    return bcd_number;
}
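For reference, a rough C++ rendering of the same idea (a sketch using std::array; the names here are mine, not from the original):

#include <array>
#include <cstdio>

std::array<unsigned char, 4> bin2bcd(unsigned int bin_number)
{
    std::array<unsigned char, 4> bcd{};
    for (auto &byte : bcd)
    {
        byte = bin_number % 10;            // low nibble
        bin_number /= 10;
        byte |= (bin_number % 10) << 4;    // high nibble
        bin_number /= 10;
    }
    return bcd; // least-significant digit pair first, as above
}

int main()
{
    std::array<unsigned char, 4> bcd = bin2bcd(12345678);
    // Print most-significant byte first: 12 34 56 78
    for (int i = 3; i >= 0; --i)
        std::printf("%02x ", bcd[i]);
    std::printf("\n");
}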

How to put bit sequence into bytes (C/C++)

I have a couple of integers, for example (in binary representation):
00001000, 01111111, 10000000, 00000001
and I need to put them in sequence into an array of bytes (chars), without the leading zeros, like so:
10001111 11110000 0001000
I understand that it must be done by bit shifting with <<, >> and using bitwise OR |. But I can't find the correct algorithm; can you suggest the best approach?
The integers I need to put there are unsigned long long ints, so the length of one can be anywhere from 1 bit to 8 bytes (64 bits).
You could use a std::bitset:
#include <bitset>
#include <iostream>

int main() {
    unsigned i = 242122534;
    std::bitset<sizeof(i) * 8> bits;
    bits = i;
    std::cout << bits.to_string() << "\n";
}
There are doubtless other ways of doing it, but I would probably go with the simplest:
std::vector<unsigned char> integers; // Has your list of bytes
integers.push_back(0x02);
integers.push_back(0xFF);
integers.push_back(0x00);
integers.push_back(0x10);
integers.push_back(0x01);

std::string str; // Will have your resulting string
for (unsigned int i = 0; i < integers.size(); i++)
    for (int j = 0; j < 8; j++)
        str += ((integers[i] << j) & 0x80 ? "1" : "0");
std::cout << str << "\n";

size_t begin = str.find("1");
if (begin > 0) str.erase(0, begin);
std::cout << str << "\n";
I wrote this up before you mentioned that you were using long ints or whatnot, but that doesn't actually change very much of this. The mask needs to change, and the j loop variable, but otherwise the above should work.
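For instance, the 64-bit version might look like this (a sketch: only the element type, the mask, and the bit count change):

#include <iostream>
#include <string>
#include <vector>

int main()
{
    // The question's example values, now as 64-bit integers.
    std::vector<unsigned long long> integers = { 0x08, 0x7F, 0x80, 0x01 };
    std::string str;
    for (size_t i = 0; i < integers.size(); i++)
        for (int j = 0; j < 64; j++)
            str += ((integers[i] << j) & 0x8000000000000000ULL ? "1" : "0");

    // Trim the leading zeros as before.
    size_t begin = str.find("1");
    if (begin != std::string::npos) str.erase(0, begin);
    std::cout << str << "\n";
}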
Convert them to strings, then erase all leading zeros:
#include <iostream>
#include <sstream>
#include <string>
#include <cstdint>

std::string to_bin(uint64_t v)
{
    std::stringstream ss;
    for (size_t x = 0; x < 64; ++x)
    {
        if (v & 0x8000000000000000)
            ss << "1";
        else
            ss << "0";
        v <<= 1;
    }
    return ss.str();
}

// Erases the leading zeros (the original name, trim_right, was misleading).
void trim_leading_zeros(std::string& in)
{
    size_t non_zero = in.find_first_not_of("0");
    if (std::string::npos != non_zero)
        in.erase(in.begin(), in.begin() + non_zero);
    else
    {
        // no 1 in data set, what to do?
        in = "<no data>";
    }
}

int main()
{
    uint64_t v1 = 437148234;
    uint64_t v2 = 1;
    uint64_t v3 = 0;
    std::string v1s = to_bin(v1);
    std::string v2s = to_bin(v2);
    std::string v3s = to_bin(v3);
    trim_leading_zeros(v1s);
    trim_leading_zeros(v2s);
    trim_leading_zeros(v3s);
    std::cout << v1s << "\n"
              << v2s << "\n"
              << v3s << "\n";
    return 0;
}
A simple approach would be to have the "current byte" (acc in the following), the associated number of used bits in it (bitcount), and a vector of fully processed bytes (output):
#include <vector>

int acc = 0;
int bitcount = 0;
std::vector<unsigned char> output;

void writeBits(int size, unsigned long long x)
{
    while (size > 0)
    {
        // sz = how many bits we're about to copy
        int sz = size;
        // max avail space in acc
        if (sz > 8 - bitcount) sz = 8 - bitcount;
        // get the bits
        acc |= ((x >> (size - sz)) << (8 - bitcount - sz));
        // zero them off in x (note 1ULL, not 1: size - sz can exceed 31,
        // so shifting a plain int here would be undefined behavior)
        x &= (1ULL << (size - sz)) - 1;
        // acc got bigger and x got smaller
        bitcount += sz;
        size -= sz;
        if (bitcount == 8)
        {
            // got a full byte!
            output.push_back(acc);
            acc = bitcount = 0;
        }
    }
}

void writeNumber(unsigned long long x)
{
    // How big is it?
    int size = 0;
    while (size < 64 && x >= (1ULL << size))
        size++;
    writeBits(size, x);
}
Note that at the end of the processing you should check whether there are any bits still in the accumulator (bitcount > 0), and if so you should flush them by doing an output.push_back(acc);.
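That flush could look something like this (a sketch reusing the globals above; the leftover bits are already left-aligned in acc, so the partial byte comes out zero-padded on the right):

void flushBits()
{
    if (bitcount > 0)
    {
        output.push_back(acc); // emit the partial, right-zero-padded byte
        acc = bitcount = 0;
    }
}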
Note also that if speed is an issue, then using a bigger accumulator is probably a good idea (however, the output will then depend on machine endianness), and that discovering how many bits are used in a number can be done much faster than with a linear search in C++ (for example, x86 has a special machine-language instruction, BSR, dedicated to this).