How to convert audio byte to samples - c++

This is my struct
/* wave data block header */
typedef struct wavehdr_tag {
    LPSTR lpData;                   /* pointer to locked data buffer */
    DWORD dwBufferLength;           /* length of data buffer */
    DWORD dwBytesRecorded;          /* used for input only */
    DWORD_PTR dwUser;               /* for client's use */
    DWORD dwFlags;                  /* assorted flags (see defines) */
    DWORD dwLoops;                  /* loop control counter */
    struct wavehdr_tag FAR *lpNext; /* reserved for driver */
    DWORD_PTR reserved;             /* reserved for driver */
} WAVEHDR, *PWAVEHDR, NEAR *NPWAVEHDR, FAR *LPWAVEHDR;
I have this variable: WAVEHDR waveHeader;
I record 10 seconds from the microphone; waveHeader.lpData holds my raw recorded data, and waveHeader.dwBytesRecorded is the raw data's length in bytes.
Now I want to calculate the volume in each second, to say which second has the highest volume and which one has the lowest.
I know I should sum the absolute values and divide by the number of samples.
I used sum += abs(waveHeader.lpData[i]); for i from 0 to the length of one second's data, but it doesn't give me a good result:
it always gives me the same result for each second, even though I am silent in some seconds and speak in others.
I read that I have to add samples, not bytes. How should I convert waveHeader.lpData[i] to samples?
// len = length of one second's data (waveHeader.dwBytesRecorded / 10)
for (int i = 0; i < len; i++)
{
    sum += abs(waveHeader.lpData[i]);
}

You have the WAVEFORMATEX used for capturing the audio, right? If so, you can modify the following routine to meet your needs:
void ProcessSamples(WAVEHDR* header, WAVEFORMATEX* format)
{
    BYTE* pData = (BYTE*)(header->lpData);
    DWORD dwNumSamples = header->dwBytesRecorded / format->nBlockAlign;

    // 16-bit stereo, the most common format
    if ((format->wBitsPerSample == 16) && (format->nChannels == 2))
    {
        for (DWORD index = 0; index < dwNumSamples; index++)
        {
            short left  = *(short*)pData; pData += 2;
            short right = *(short*)pData; pData += 2;
        }
    }
    else if ((format->wBitsPerSample == 16) && (format->nChannels == 1))
    {
        for (DWORD index = 0; index < dwNumSamples; index++)
        {
            short monoSample = *(short*)pData; pData += 2;
        }
    }
    else if ((format->wBitsPerSample == 8) && (format->nChannels == 2))
    {
        // 8-bit samples are unsigned.
        // 128 is the midpoint ("silent") value,
        // so subtract it to normalize to a signed value.
        for (DWORD index = 0; index < dwNumSamples; index++)
        {
            int left  = (int)(*pData) - 128; pData += 1;
            int right = (int)(*pData) - 128; pData += 1;
        }
    }
    else if ((format->wBitsPerSample == 8) && (format->nChannels == 1))
    {
        for (DWORD index = 0; index < dwNumSamples; index++)
        {
            int monoSample = (int)(*pData) - 128; pData += 1;
        }
    }
}
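Once the bytes are reinterpreted as samples, the per-second volume the question asks for is just the mean absolute sample value over each one-second slice. Here is a minimal sketch assuming 16-bit mono PCM; perSecondVolume is an illustrative name, and the caller would pass reinterpret_cast<const int16_t*>(header->lpData) together with format->nSamplesPerSec as rate:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Mean absolute amplitude of each one-second slice of 16-bit mono PCM.
// rate = samples per second, e.g. format->nSamplesPerSec.
std::vector<double> perSecondVolume(const int16_t* samples,
                                    std::size_t numSamples,
                                    std::size_t rate)
{
    std::vector<double> volumes;
    for (std::size_t start = 0; start < numSamples; start += rate) {
        std::size_t end = std::min(numSamples, start + rate);
        double sum = 0.0;
        for (std::size_t i = start; i < end; ++i)
            sum += std::abs(static_cast<int>(samples[i]));  // samples, not bytes
        volumes.push_back(sum / static_cast<double>(end - start));
    }
    return volumes;
}
```

To find the loudest and quietest second, take std::max_element / std::min_element over the returned vector. For 8-bit audio, first convert each unsigned byte to a signed value by subtracting 128, as in the routine above.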

Related

Strange behavior with with writing / reading from vector

I am sitting here for hours looking at the code, and I just don't get it.
It's about the std::vector canData which is used as a buffer for decoding and encoding data from a CAN DBC parser.
The complete example of my problem is here.
Basically a value is encoded into an array and then decoded again from this array. But the size of this array is always zero, and even after clearing the array, although its size is zero, one can still decode data from it.
Can somebody please explain that to me?
Am I missing something?
unsigned int canIdentifier = 0x100;
std::vector<std::uint8_t> canData;
canData.reserve(4);
network.messages[canIdentifier].signals["multiplexor"].encode(canData, 0);
network.messages[canIdentifier].signals["signal_1"].encode(canData, 0x12);
std::cout << "size: " << canData.size() << std::endl;
canData.clear();
decodeMessage(canIdentifier, canData);
std::cout << "2size: " << canData.size() << std::endl;
Updated needed functions:
uint64_t Signal::decode(std::vector<uint8_t> & data)
{
    /* safety check */
    if (bitSize == 0) {
        return 0;
    }

    /* copy bits */
    uint64_t retVal = 0;
    if (byteOrder == ByteOrder::BigEndian) {
        /* start with MSB */
        unsigned int srcBit = startBit;
        unsigned int dstBit = bitSize - 1;
        for (unsigned int i = 0; i < bitSize; ++i) {
            /* copy bit */
            if (data[srcBit / 8] & (1 << (srcBit % 8))) {
                retVal |= (1ULL << dstBit);
            }
            /* calculate next position */
            if ((srcBit % 8) == 0) {
                srcBit += 15;
            } else {
                --srcBit;
            }
            --dstBit;
        }
    } else {
        /* start with LSB */
        unsigned int srcBit = startBit;
        unsigned int dstBit = 0;
        for (unsigned int i = 0; i < bitSize; ++i) {
            /* copy bit */
            if (data[srcBit / 8] & (1 << (srcBit % 8))) {
                retVal |= (1ULL << dstBit);
            }
            /* calculate next position */
            ++srcBit;
            ++dstBit;
        }
    }

    /* if signed, then fill all bits above MSB with 1 */
    if (valueType == ValueType::Signed) {
        if (retVal & (1ULL << (bitSize - 1))) {
            for (unsigned int i = bitSize; i < 8 * sizeof(retVal); ++i) {
                retVal |= (1ULL << i);
            }
        }
    }
    return retVal;
}
void Signal::encode(std::vector<uint8_t> & data, uint64_t rawValue)
{
    /* safety check */
    if (bitSize == 0) {
        return;
    }

    /* copy bits */
    if (byteOrder == ByteOrder::BigEndian) {
        /* start with MSB */
        unsigned int srcBit = startBit;
        unsigned int dstBit = bitSize - 1;
        for (unsigned int i = 0; i < bitSize; ++i) {
            /* copy bit */
            if (rawValue & (1ULL << dstBit)) {
                data[srcBit / 8] |= (1 << (srcBit % 8));
            } else {
                data[srcBit / 8] &= ~(1 << (srcBit % 8));
            }
            /* calculate next position */
            if ((srcBit % 8) == 0) {
                srcBit += 15;
            } else {
                --srcBit;
            }
            --dstBit;
        }
    } else {
        /* start with LSB */
        unsigned int srcBit = startBit;
        unsigned int dstBit = 0;
        for (unsigned int i = 0; i < bitSize; ++i) {
            /* copy bit */
            if (rawValue & (1ULL << dstBit)) {
                data[srcBit / 8] |= (1 << (srcBit % 8));
            } else {
                data[srcBit / 8] &= ~(1 << (srcBit % 8));
            }
            /* calculate next position */
            ++srcBit;
            ++dstBit;
        }
    }
}
Looking at the code step by step:
canData.reserve(4);
allocates memory for a vector that can contain (at least) 4 uint8_t elements, but the vector still contains 0 of them (canData.resize(4) would change the vector's size). canData.capacity() is then 4 (or more), but canData.size() is still 0.
The encode(...) method accesses the vector using operator[]. It does not check whether the index is in range (i.e. less than canData.size()), so there is no exception (if the vector's at() were used instead, it would throw). Also, since the accessed indexes happen to lie within the allocated memory, nothing visibly bad happens there - but it is still undefined behavior.
canData.clear()
destroys all vector elements that are in range, i.e. between index 0 and canData.size(). Thus it does not touch bytes beyond canData.size(), which is 0 in this case. clear() also does not shrink the memory allocated for the vector (or at least is not guaranteed to reallocate and shrink it) - shrink_to_fit() would do that.
In the end, decodeMessage operates on memory that is still allocated and still holds the previously written bytes, which were never destroyed. Again, the use of operator[] causes no exception or crash.
As stated in the comments: a lot of undefined behavior.

Correctly add string to a REG_BINARY type in Windows Registry

I am trying to automate the process of adding software policy hash rules to Windows, and am currently having a problem adding valid hashes to the registry. This code creates a key and adds the hash to the registry:
HKEY m_hKey;
string md5Digest;
string valueName = "ItemData";
vector<BYTE> itemData;

/*
    use Crypto++ to get the file hash
*/

// convert the string to a format that can be loaded into the registry
for (int i = 1; i < md5Digest.length(); i += 2)
    itemData.push_back('0x' + md5Digest[i - 1] + md5Digest[i]);

// Total data size, in bytes
const DWORD dataSize = static_cast<DWORD>(itemData.size());
::RegSetValueEx(
    m_hKey,
    valueName.c_str(),
    0, // reserved
    REG_BINARY,
    &itemData[0],
    dataSize
);
This works fine, and adds the key to the registry. But when comparing the registry key to a rule added by Group Policy, you can see a very important difference: the 'ItemData' values differ between them (the Group Policy entry is the correct output). When debugging the program I can clearly see that md5Digest has the correct value, so the problem lies in the conversion of the md5Digest string to the ItemData vector of BYTEs (unsigned chars).
What is the problem with my code, why is the data being entered incorrectly to the registry?
You have a string that you want to convert to a byte array. You can write a helper function to convert 2 chars to a BYTE:
using BYTE = unsigned char;

BYTE convert(char a, char b)
{
    // Convert hex chars to a byte.
    // Needs fixing for lower case.
    if (a >= '0' && a <= '9') a -= '0';
    else a -= 55; // 55 == 'A' - 10
    if (b >= '0' && b <= '9') b -= '0';
    else b -= 55;
    return (a << 4) | b;
}
....
vector<BYTE> v;
string s = "3D65B8EBDD0E";
for (size_t i = 0; i + 1 < s.length(); i += 2) {
    v.push_back(convert(s[i], s[i+1]));
}
v now contains {0x3D, 0x65, 0xB8, 0xEB, 0xDD, 0x0E}
Or, as mentioned by @RbMm, you can use the CryptStringToBinary Windows function:
#include <wincrypt.h>
...
std::string s = "3D65B8EBDD0E";
DWORD hex_len = s.length() / 2;
BYTE *buffer = new BYTE[hex_len];
CryptStringToBinary(s.c_str(),
s.length(),
CRYPT_STRING_HEX,
buffer,
&hex_len,
NULL,
NULL
);
You have the '0x' two-character literal summed with md5Digest[i - 1] + md5Digest[i] and then truncated to a BYTE. It looks like you were trying to build a "0xFF"-style byte value out of them. If you want to store the hex text itself, store the md5 string directly:
const DWORD dataSize = static_cast<DWORD>(md5Digest.size());
::RegSetValueEx(
    m_hKey,
    valueName.c_str(),
    0, // reserved
    REG_BINARY,
    reinterpret_cast<BYTE const *>(md5Digest.data()),
    dataSize
);
If you actually need to store binary representation of hex numbers from md5 then you need to convert them to bytes like this:
BYTE char_to_halfbyte(char const c)
{
    if(('0' <= c) && (c <= '9'))
    {
        return(static_cast<BYTE>(c - '0'));
    }
    else
    {
        assert(('A' <= c) && (c <= 'F'));
        return(static_cast<BYTE>(10 + c - 'A'));
    }
}

for(std::size_t i = 0; i < md5Digest.length(); i += 2)
{
    assert((i + 1) < md5Digest.length());
    itemData.push_back
    (
        (char_to_halfbyte(md5Digest[i    ]) << 4)
        |
        (char_to_halfbyte(md5Digest[i + 1])     )
    );
}
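For completeness, here is a self-contained version of that conversion which also handles lower-case digits (hexDigit and hexToBytes are illustrative names, not from the original code):

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Map one hex digit ('0'-'9', 'a'-'f', 'A'-'F') to its 4-bit value.
std::uint8_t hexDigit(char c)
{
    if (c >= '0' && c <= '9') return static_cast<std::uint8_t>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<std::uint8_t>(10 + c - 'a');
    if (c >= 'A' && c <= 'F') return static_cast<std::uint8_t>(10 + c - 'A');
    throw std::invalid_argument("not a hex digit");
}

// Decode an even-length hex string such as "3D65B8EBDD0E" into raw bytes.
std::vector<std::uint8_t> hexToBytes(const std::string& s)
{
    std::vector<std::uint8_t> out;
    out.reserve(s.size() / 2);
    for (std::size_t i = 0; i + 1 < s.size(); i += 2)
        out.push_back(static_cast<std::uint8_t>(
            (hexDigit(s[i]) << 4) | hexDigit(s[i + 1])));
    return out;
}
```

hexToBytes("3D65B8EBDD0E") yields {0x3D, 0x65, 0xB8, 0xEB, 0xDD, 0x0E}, which can then be passed to RegSetValueEx as in the snippets above.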

Remove nth bit from buffer, and shift the rest

Given a uint8_t buffer of length x, I am trying to come up with a function or a macro that can remove the nth bit (or bits n to n+i), then left-shift the remaining bits.
example #1:
for input 0b76543210 0b76543210 ... then output should be 0b76543217 0b654321 ...
example #2: if the input is:
uint8_t input[8] = {
0b00110011,
0b00110011,
...
};
the output without the first bit, should be
uint8_t output[8] = {
0b00110010,
0b01100100,
...
};
I have tried the following to remove the first bit, but it did not work for the second group of bits.
/* A macro to extract the (x-y) range of bits without shifting */
#define BIT_RANGE(N,x,y) ((N) & ((0xff >> (7 - (y) + (x))) << ((x))))

void removeBit0(uint8_t *n) {
    for (int i = 0; i < 7; i++) {
        n[i] = (BIT_RANGE(n[i], i + 1, 7)) << (i + 1) |
               (BIT_RANGE(n[i + 1], 1, i + 1)) << (7 - i); /* this does not extract the next element's bits */
    }
    n[7] = 0;
}
Update #1
In my case, the input will be a uint64_t number, and I will then use memmove to shift it one place to the left.
Update #2
The solution can be in C/C++, assembly (x86-64), or inline assembly.
This is really two subproblems: remove bits from each byte, then pack the results. That is the flow of the code below. I wouldn't use a macro for this - too much going on. Just inline the function if you're worried about performance at that level.
#include <stdio.h>
#include <stdint.h>

// Remove bits n to n+k-1 from x.
unsigned scrunch_1(unsigned x, int n, int k) {
    unsigned hi_bits = ~0u << n;
    return (x & ~hi_bits) | ((x >> k) & hi_bits);
}

// Remove bits n to n+k-1 from each byte in the buffer,
// then pack left. Return number of packed bytes.
size_t scrunch(uint8_t *buf, size_t size, int n, int k) {
    size_t i_src = 0, i_dst = 0;
    unsigned src_bits = 0; // Scrunched source bit buffer.
    int n_src_bits = 0;    // Initially it's empty.
    for (;;) {
        // Get scrunched bits until the buffer has at least 8.
        while (n_src_bits < 8) {
            if (i_src >= size) { // Done when source bytes exhausted.
                // If there are left-over bits, add one more byte to output.
                if (n_src_bits > 0) buf[i_dst++] = src_bits << (8 - n_src_bits);
                return i_dst;
            }
            // Pack 'em in.
            src_bits = (src_bits << (8 - k)) | scrunch_1(buf[i_src++], n, k);
            n_src_bits += 8 - k;
        }
        // Write the highest 8 bits of the buffer to the destination byte.
        n_src_bits -= 8;
        buf[i_dst++] = src_bits >> n_src_bits;
    }
}

int main(void) {
    uint8_t x[] = { 0xaa, 0xaa, 0xaa, 0xaa };
    size_t n = scrunch(x, 4, 2, 3);
    for (size_t i = 0; i < n; i++) {
        printf("%x ", x[i]);
    }
    printf("\n");
    return 0;
}
This writes b5 ad 60, which by my reckoning is correct. A few other test cases work as well.
Oops, I coded it the first time shifting the wrong way, but I include that version here in case it's useful to someone.
#include <stdio.h>
#include <stdint.h>

// Remove bits n to n+k-1 from x.
unsigned scrunch_1(unsigned x, int n, int k) {
    unsigned hi_bits = 0xffu << n;
    return (x & ~hi_bits) | ((x >> k) & hi_bits);
}

// Remove bits n to n+k-1 from each byte in the buffer,
// then pack right. Return number of packed bytes.
size_t scrunch(uint8_t *buf, size_t size, int n, int k) {
    size_t i_src = 0, i_dst = 0;
    unsigned src_bits = 0; // Scrunched source bit buffer.
    int n_src_bits = 0;    // Initially it's empty.
    for (;;) {
        // Get scrunched bits until the buffer has at least 8.
        while (n_src_bits < 8) {
            if (i_src >= size) { // Done when source bytes exhausted.
                // If there are left-over bits, add one more byte to output.
                if (n_src_bits > 0) buf[i_dst++] = src_bits;
                return i_dst;
            }
            // Pack 'em in.
            src_bits |= scrunch_1(buf[i_src++], n, k) << n_src_bits;
            n_src_bits += 8 - k;
        }
        // Write the lower 8 bits of the buffer to the destination byte.
        buf[i_dst++] = src_bits;
        src_bits >>= 8;
        n_src_bits -= 8;
    }
}

int main(void) {
    uint8_t x[] = { 0xaa, 0xaa, 0xaa, 0xaa };
    size_t n = scrunch(x, 4, 2, 3);
    for (size_t i = 0; i < n; i++) {
        printf("%x ", x[i]);
    }
    printf("\n");
    return 0;
}
This writes d6 5a b. A few other test cases work as well.
Something similar to this should work:
template<typename S> void removeBit(S* buffer, size_t length, size_t index)
{
    const size_t BITS_PER_UNIT = sizeof(S) * 8;
    // first we find which data unit contains the desired bit
    const size_t unit = index / BITS_PER_UNIT;
    // and which index the bit has inside the specified unit, counting from the most significant bit
    const size_t relativeIndex = (BITS_PER_UNIT - 1) - index % BITS_PER_UNIT;
    // then we unset that bit
    buffer[unit] &= ~(S(1) << relativeIndex);
    // now we have to shift what's on the right by 1 position:
    // if 0b00100000 is the bit removed, we use 0b00011111 as the mask to shift the rest
    const S partialShiftMask = (S(1) << relativeIndex) - 1;
    // we keep all bits to the left of the removed one and shift all the others left
    buffer[unit] = (buffer[unit] & ~partialShiftMask) | ((buffer[unit] & partialShiftMask) << 1);
    for (size_t i = unit + 1; i < length; ++i)
    {
        // set the rightmost bit of the previous unit from the top bit of the current unit
        buffer[i - 1] |= buffer[i] >> (BITS_PER_UNIT - 1);
        // then shift the current unit by one
        buffer[i] <<= 1;
    }
}
I just tested it on some basic cases, so maybe something is not exactly correct, but this should move you onto the right track.

Bitwise shifting in C++

Trying to hide data within a PPM Image using C++:
void PPMObject::hideData(string phrase)
{
    phrase += '\0';
    size_t size = phrase.size() * 8;
    bitset<8> binary_phrase (phrase.c_str()[0]);
    //We need 8 channels for each letter
    for (size_t index = 0; index < size; index += 3)
    {
        //convert red channel to bits and find LSB
        bitset<8> r (this->m_Ptr[index]);
        if (r.at(7) != binary_phrase.at(index))
        {
            r.flip(7);
        }
        this->m_Ptr[index] = (char) r.to_ulong();
        //convert green channel to bits and find LSB
        bitset<8> g (this->m_Ptr[index+1]);
        if (g.at(7) != binary_phrase.at(index+1))
        {
            g.flip(7);
        }
        this->m_Ptr[index+1] = (char) g.to_ulong();
        //convert blue channel to bits and find LSB
        bitset<8> b (this->m_Ptr[index+2]);
        if (b.at(7) != binary_phrase.at(index+2))
        {
            b.flip(7);
        }
        this->m_Ptr[index+2] = (char) b.to_ulong();
    }
    //this->m_Ptr[index+1] = (r.to_ulong() & 0xFF);
}
Then extracting the data by reversing the above process:
string PPMObject::recoverData()
{
    size_t size = this->width * this->height * 3;
    string message("");
    //We need 8 channels for each letter
    for (size_t index = 0; index < size; index += 3)
    {
        //retrieve our hidden data from the LSB in the red channel
        bitset<8> r (this->m_Ptr[index]);
        message += r.to_string()[7];
        //retrieve our hidden data from the LSB in the green channel
        bitset<8> g (this->m_Ptr[index+1]);
        message += g.to_string()[7];
        //retrieve our hidden data from the LSB in the blue channel
        bitset<8> b (this->m_Ptr[index+2]);
        message += b.to_string()[7];
    }
    return message;
}
The hideData function above converts each channel (RGB) to binary. It then finds the least significant bit and flips it if it does not match the nth bit of the phrase (starting at zero). It then assigns the converted binary back through the pointer as a cast char.
Is using the bitset library a "best practice" technique? I am all ears for a more straightforward, efficient technique - perhaps using bitwise manipulations?
There are no logic errors or problems whatsoever with reading and writing the PPM image. The pixel data is assigned to a char pointer: this->m_Ptr (above).
Here's some more compact code that does bit manipulation. It doesn't bounds check m_Ptr, but neither does your code.
#include <iostream>
#include <string>
using namespace std;

struct PPMObject
{
    void hideData(const string &phrase);
    string recoverData(size_t size);
    char m_Ptr[256];
};

void PPMObject::hideData(const string &phrase)
{
    size_t size = phrase.size();
    for (size_t p_index = 0, i_index = 0; p_index < size; ++p_index)
        for (int i = 0, bits = phrase[p_index]; i < 8; ++i, bits >>= 1, ++i_index)
        {
            m_Ptr[i_index] &= 0xFE;          // set lsb to 0
            m_Ptr[i_index] |= (bits & 0x1);  // set lsb to lsb of bits
        }
}

string PPMObject::recoverData(size_t size)
{
    string ret(size, ' ');
    for (size_t p_index = 0, i_index = 0; p_index < size; ++p_index)
    {
        int i, bits;
        for (i = 0, bits = 0; i < 8; ++i, ++i_index)
            bits |= ((m_Ptr[i_index] & 0x1) << i);
        ret[p_index] = (char) bits;
    }
    return ret;
}

int main()
{
    PPMObject p;
    p.hideData("Hello World!");
    cout << p.recoverData(12) << endl;
    return 0;
}
Note that this code encodes from lsb to msb of each byte of the phrase.

Does anyone have an easy solution to parsing Exp-Golomb codes using C++?

I am trying to decode the SDP sprop-parameter-sets values for an H.264 video stream, and have found that accessing some of the values will involve parsing Exp-Golomb encoded data. My method has the base64-decoded sprop-parameter-sets data in a byte array which I am now bit-walking, but I have come up to the first piece of Exp-Golomb encoded data and am looking for a suitable code extract to parse these values.
Exp-Golomb codes of what order?
If you need to parse an H.264 bit stream (I mean the transport layer), you can write simple functions to access specified bits in the endless bit stream. Bits are indexed from left to right.
inline u_dword get_bit(const u_byte * const base, u_dword offset)
{
    return ((*(base + (offset >> 0x3))) >> (0x7 - (offset & 0x7))) & 0x1;
}
This function implements decoding of exp-Golomb codes of order zero (as used in H.264).
u_dword DecodeUGolomb(const u_byte * const base, u_dword * const offset)
{
    u_dword zeros = 0;
    // calculate zero bits. Will be optimized.
    while (0 == get_bit(base, (*offset)++)) zeros++;
    // insert first 1 bit
    u_dword info = 1 << zeros;
    for (s_dword i = zeros - 1; i >= 0; i--)
    {
        info |= get_bit(base, (*offset)++) << i;
    }
    return (info - 1);
}
u_dword means an unsigned 4-byte integer.
u_byte means an unsigned 1-byte integer.
Note that the first byte of each NAL unit is a specified structure with a forbidden bit, NAL reference, and NAL type.
The accepted answer is not a correct implementation; it gives wrong output. Here is a correct implementation, following the pseudocode from "Sec 9.1 Parsing process for Exp-Golomb codes" of spec T-REC-H.264-201304:
int32_t getBitByPos(unsigned char *buffer, int32_t pos) {
    return (buffer[pos/8] >> (8 - pos%8) & 0x01);
}

uint32_t decodeGolomb(unsigned char *byteStream, uint32_t *index) {
    uint32_t leadingZeroBits = -1;
    uint32_t codeNum = 0;
    uint32_t pos = *index;
    if (byteStream == NULL || pos == 0) {
        printf("Invalid input\n");
        return 0;
    }
    for (int32_t b = 0; !b; leadingZeroBits++)
        b = getBitByPos(byteStream, pos++);
    for (int32_t b = leadingZeroBits; b > 0; b--)
        codeNum = codeNum | (getBitByPos(byteStream, pos++) << (b - 1));
    *index = pos;
    return ((1 << leadingZeroBits) - 1 + codeNum);
}
I wrote a c++ jpeg-ls compression library that uses golomb codes. I don't know if Exp-Golomb codes is exactly the same. The library is open source can be found at http://charls.codeplex.com. I use a lookup table to decode golomb codes <= 8 bits in length. Let me know if you have problems finding your way around.
Revised with a function to get N bits from the stream; this works for parsing H.264 NALs:
inline uint32_t get_bit(const uint8_t * const base, uint32_t offset)
{
    return ((*(base + (offset >> 0x3))) >> (0x7 - (offset & 0x7))) & 0x1;
}

inline uint32_t get_bits(const uint8_t * const base, uint32_t * const offset, uint8_t bits)
{
    uint32_t value = 0;
    for (int i = 0; i < bits; i++)
    {
        value = (value << 1) | (get_bit(base, (*offset)++) ? 1 : 0);
    }
    return value;
}

// This function implements decoding of exp-Golomb codes of order zero (used in H.264).
uint32_t DecodeUGolomb(const uint8_t * const base, uint32_t * const offset)
{
    uint32_t zeros = 0;
    // count the leading zero bits
    while (0 == get_bit(base, (*offset)++)) zeros++;
    // insert the first 1 bit
    uint32_t info = 1 << zeros;
    for (int32_t i = zeros - 1; i >= 0; i--)
    {
        info |= get_bit(base, (*offset)++) << i;
    }
    return (info - 1);
}