Why doesn't gzip find two identical parts? - compression

I wrote the following code:
#include "zlib.h"
unsigned char dst[1<<26];
unsigned char src[1<<24];
int main() {
unsigned long dstlen = 1<<26;
srand (12345);
for (int i=0; i<1<<23; i++) src[i] = src[i | 1<<23] = rand();
compress(dst,&dstlen,src,1<<24);
printf ("%d/%d = %f\n", dstlen, 1<<24, dstlen / double(1<<24));
}
which tries to compress two identical 2^23-byte parts concatenated together. However, the result is
16782342/16777216 = 1.000306
How can data with such an obvious repetition not be compressed?

The maximum distance for matching strings in zlib's deflate is 32,768 bytes back. Your repeated block starts 2^23 bytes (8 MiB) earlier, far outside that sliding window, so the match can never be found.
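As a quick check of that limit, here is a minimal sketch (not the poster's code) that repeats one random block: with a 16 KiB block the later copies stay inside the 32 KiB window and the data compresses well, while an 8 MiB block is out of reach:

#include <cstdio>
#include <cstdlib>
#include "zlib.h"

// Fill `total` bytes by repeating one block of `blocksize` random bytes,
// then report the compressed size and ratio.
static void demo(size_t blocksize, size_t total) {
    unsigned char *src = (unsigned char *)malloc(total);
    unsigned char *dst = (unsigned char *)malloc(compressBound(total));
    for (size_t i = 0; i < blocksize; i++) src[i] = rand();
    for (size_t i = blocksize; i < total; i++) src[i] = src[i - blocksize];
    uLongf dstlen = compressBound(total);
    compress(dst, &dstlen, src, total);
    printf("block %zu: %lu/%zu = %f\n", blocksize, (unsigned long)dstlen,
           total, dstlen / (double)total);
    free(src);
    free(dst);
}

int main() {
    srand(12345);
    demo(1 << 14, 1 << 24); // 16 KiB block: inside the window, ratio well below 1
    demo(1 << 23, 1 << 24); // 8 MiB block: outside the window, ratio about 1
}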

How to convert an unsigned char variable that holds two hexadecimal digits into two chars with one hex digit each

I need to get the hash (SHA-1) value of a given unsigned char array, so I have used OpenSSL. The SHA1 function generates the hash value in an unsigned char array which has 20 values; each value represents two hexadecimal digits.
But I need to convert the generated array (with length of 20) to an array of chars with 40 values.
For example, now hashValue[0] is "a0", but I want to have hashValue[0] = 'a' and hashValue[1] = '0'.
#include <iostream>
#include <cstdio>
#include <openssl/sha.h> // For SHA-1

using namespace std;

int main() {
    unsigned char plainText[] = "compute sha1";
    unsigned char hashValue[20];
    SHA1(plainText, sizeof(plainText), hashValue);
    for (int i = 0; i < 20; i++) {
        printf("%02x", hashValue[i]);
    }
    printf("\n");
    return 0;
}
You could create another array and use sprintf, or the safer snprintf, to print into it instead of to standard output.
Something like this:
#include <iostream>
#include <cstdio>
#include <openssl/sha.h> // For SHA-1

using namespace std;

int main() {
    unsigned char plainText[] = "compute sha1";
    unsigned char hashValue[20];
    char output[41]; // 40 hex digits plus the terminating '\0'
    SHA1(plainText, sizeof(plainText), hashValue);
    char *c_output = output;
    for (int i = 0; i < 20; i++, c_output += 2) {
        snprintf(c_output, 3, "%02x", hashValue[i]); // writes two digits and a '\0'
    }
    return 0;
}
Now output[0] == 'a' and output[1] == '0'.
There might be other, even better solutions; this is just the first that comes to mind.
EDIT: Added fix from comments.
It seems like you want to separate the high-order and low-order nibbles.
To isolate the high-order nibble, shift right by 4 bits.
To isolate the low-order nibble, apply a mask: AND with 0x0F.
#include <cstdio>

int main() {
    int x = 0x3A;
    int y = x >> 4;   // get high-order nibble
    int z = x & 0x0F; // get low-order nibble
    printf("%02x\n", x);
    printf("%02x\n", y);
    printf("%02x\n", z);
}

Failing to fill a buffer for a *.wav file using two different frequencies

This was solved by changing the buffer from int16_t to int8_t, since I was trying to write 8-bit audio.
I'm trying to fill a buffer for a mono wave file with two different frequencies but failing at it. I'm using CLion on Ubuntu 18.04.
I know the buffer size is equal to duration * sample_rate, so I'm creating an int16_t buffer of that size. I tried filling it with one note first:
for(int i = 0; i < frame_total; i++)
audio[i] = static_cast<int16_t>(128 + 127 * sin(i));
which generated a nice long beep. Then I changed it to:
for(int i = 0; i < frame_total; i++)
audio[i] = static_cast<int16_t>(128 + 127 * sin(i*2));
which generated a higher beep. But when I try the following:
for(int i = 0; i < frame_total/2; i++)
audio[i] = static_cast<int16_t>(128 + 127 * sin(i*2));
for(int i = frame_total/2; i < frame_total; i++)
audio[i] = static_cast<int16_t>(128 + 127 * sin(i));
I expect it to write the higher beep in the first half of the audio and fill the other half with the "normal" beep, but the *.wav file just plays the first note the entire time.
#include <fstream>
#include <cmath>
#include <cstdint>

using namespace std;

#define FORMAT_AUDIO 1
#define FORMAT_SIZE 16

struct wave_header{
    // Header
    char riff[4];
    int32_t file_size;
    char wave[4];
    // Format
    char fmt[4];
    int32_t format_size;
    int16_t format_audio;
    int16_t num_channels;
    int32_t sample_rate;
    int32_t byte_rate;
    int16_t block_align;
    int16_t bits_per_sample;
    // Data
    char data[4];
    int32_t data_size;
};

void write_header(ofstream &music_file, int16_t bits, int32_t samples, int32_t duration){
    wave_header wav_header{};
    int16_t channels_quantity = 1;
    int32_t total_data = duration * samples * channels_quantity * bits/8;
    int32_t file_data = 4 + 8 + FORMAT_SIZE + 8 + total_data;
    wav_header.riff[0] = 'R';
    wav_header.riff[1] = 'I';
    wav_header.riff[2] = 'F';
    wav_header.riff[3] = 'F';
    wav_header.file_size = file_data;
    wav_header.wave[0] = 'W';
    wav_header.wave[1] = 'A';
    wav_header.wave[2] = 'V';
    wav_header.wave[3] = 'E';
    wav_header.fmt[0] = 'f';
    wav_header.fmt[1] = 'm';
    wav_header.fmt[2] = 't';
    wav_header.fmt[3] = ' ';
    wav_header.format_size = FORMAT_SIZE;
    wav_header.format_audio = FORMAT_AUDIO;
    wav_header.num_channels = channels_quantity;
    wav_header.sample_rate = samples;
    wav_header.byte_rate = samples * channels_quantity * bits/8;
    wav_header.block_align = static_cast<int16_t>(channels_quantity * bits / 8);
    wav_header.bits_per_sample = bits;
    wav_header.data[0] = 'd';
    wav_header.data[1] = 'a';
    wav_header.data[2] = 't';
    wav_header.data[3] = 'a';
    wav_header.data_size = total_data;
    music_file.write((char*)&wav_header, sizeof(wave_header));
}

int main(int argc, char const *argv[]) {
    int16_t bits = 8;
    int32_t samples = 44100;
    int32_t duration = 4;
    ofstream music_file("music.wav", ios::out | ios::binary);
    int32_t frame_total = samples * duration;
    auto* audio = new int16_t[frame_total];
    for(int i = 0; i < frame_total/2; i++)
        audio[i] = static_cast<int16_t>(128 + 127 * sin(i*2));
    for(int i = frame_total/2; i < frame_total; i++)
        audio[i] = static_cast<int16_t>(128 + 127 * sin(i));
    write_header(music_file, bits, samples, duration);
    music_file.write(reinterpret_cast<char*>(audio), sizeof(int16_t)*frame_total);
    return 0;
}
There are two major issues with your code.
The first one is that you may be writing an invalid header depending on your compiler settings and environment.
The reason is that the wave_header struct is not packed in memory, and may contain padding between members. Therefore, when you do:
music_file.write((char*)&wav_header, sizeof(wave_header));
it may write something that isn't a valid WAV header. Even if you are lucky enough to get exactly what you want, it is worth fixing, because the layout can change with compiler settings and certainly isn't portable.
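One common fix, shown here as a sketch (the pragma is compiler-specific, though GCC, Clang, and MSVC all support it), is to force a packed layout so there is no padding:

#pragma pack(push, 1) // no padding between members from here on
struct wave_header {
    char riff[4];
    int32_t file_size;
    // ... remaining members exactly as in the question's struct ...
};
#pragma pack(pop)

// Sanity check once all members are present: a packed WAV header is 44 bytes.
// static_assert(sizeof(wave_header) == 44, "header must be packed");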
The second issue is that the call that writes the actual wave data:
music_file.write(reinterpret_cast<char*>(audio),sizeof(char)*frame_total);
is writing exactly half the amount of data you are expecting: the actual size of the data pointed to by audio is sizeof(int16_t) * frame_total.
This explains why you are only hearing the first part of the wave you wrote.
This was solved by changing the buffer (audio) from int16_t to int8_t, since I was trying to write 8-bit audio.

Convert unsigned char array of characters to int C++

How can I convert an unsigned char array that contains letters into an integer? I have tried this so far, but it only converts up to four bytes. I also need a way to convert the integer back into the unsigned char array.
int buffToInteger(char *buffer)
{
    int a = static_cast<int>(static_cast<unsigned char>(buffer[0]) << 24 |
                             static_cast<unsigned char>(buffer[1]) << 16 |
                             static_cast<unsigned char>(buffer[2]) << 8 |
                             static_cast<unsigned char>(buffer[3]));
    return a;
}
It looks like you want a for loop, i.e. repeating the task over and over, for a number of steps that isn't fixed at compile time:
unsigned int buffToInteger(char *buffer, unsigned int size)
{
    // assert(size <= sizeof(unsigned int));
    unsigned int ret = 0;
    int shift = 0;
    for (int i = size - 1; i >= 0; i--) {
        // Cast through unsigned char first to avoid sign-extending bytes >= 0x80.
        ret |= static_cast<unsigned int>(static_cast<unsigned char>(buffer[i])) << shift;
        shift += 8;
    }
    return ret;
}
What I think you are going for is called a hash: converting an object to an integer. The problem is that a hash IS NOT REVERSIBLE. This hash will produce different results for hash("WXYZABCD", 8) and hash("ABCD", 4); the answer by @Nicholas Pipitone DOES NOT produce different outputs for these different inputs.
Once you compute this hash, there is no way to get the original string back. If you want to keep knowledge of the original string, you MUST keep the original string as a variable.
int hash(char* buffer, size_t size) {
    int res = 0;
    for (size_t i = 0; i < size; ++i) {
        res += buffer[i];
        res *= 31;
    }
    return res;
}
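As a quick sketch (reusing the hash function above), printing both values shows they differ:

#include <cstdio>

int main() {
    printf("%d\n", hash((char *)"ABCD", 4));
    printf("%d\n", hash((char *)"WXYZABCD", 8));
}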
Here's how to convert the first sizeof(int) bytes of the char array to an int:
int val = *(unsigned int *)buffer;
and to convert it back:
*(unsigned int *)buffer = val;
Note that your buffer must be at least sizeof(int) bytes long; you should check for this.
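A safer variant of the same idea (a sketch, not part of the original answer): memcpy avoids the alignment and strict-aliasing problems of casting the pointer, while keeping the same dependence on the machine's byte order:

#include <cstring>

unsigned int bytesToInt(const char *buffer) {
    unsigned int val;
    std::memcpy(&val, buffer, sizeof(val)); // copy sizeof(int) bytes into the int
    return val;
}

void intToBytes(unsigned int val, char *buffer) {
    std::memcpy(buffer, &val, sizeof(val)); // copy the int's bytes back out
}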

HMAC-SHA512 bug in my code

I would greatly appreciate it if you could help me with this C++ implementation of HMAC-SHA512; I can't seem to find why it gives a different hash than online converters. (The SHA-512 itself works just fine.)
Code (based on Wikipedia):
#include <iostream>
#include <string>
#include "sha512.h"

using namespace std;

const unsigned int BLOCKSIZE = (512/8); // 64 bytes

int main(int argc, char *argv[])
{
    if (argc != 3) return 0;
    string key = argv[1];
    string message = argv[2];
    if (key.length() > BLOCKSIZE) {
        key = sha512(key);
    }
    while (key.length() < BLOCKSIZE) {
        key = key + (char)0x00;
    }
    string o_key_pad = key;
    for (unsigned int i = 0; i < BLOCKSIZE; i++) {
        o_key_pad[i] = key[i] ^ (char)0x5c;
    }
    string i_key_pad = key;
    for (unsigned int i = 0; i < BLOCKSIZE; i++) {
        i_key_pad[i] = key[i] ^ (char)0x36;
    }
    string output = sha512(o_key_pad + sha512(i_key_pad + message));
    cout << "hmac-sha512: \n" << output << endl;
    return 0;
}
It turned out that BLOCKSIZE is incorrect.
According to http://en.wikipedia.org/wiki/SHA-2, SHA-512's block size is 1024 bits, which is 128 bytes.
So simply change the code to
const unsigned int BLOCKSIZE = (1024/8); // 128 bytes
and you get the correct result.
Thank you for the quick responses; the problem was with the hash function (sort of).
The sha512 output was converted to hex before being returned, so sha512(i_key_pad + message) did not produce what I was expecting. (And the block size was indeed 1024 bits.)
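For reference, a minimal sketch using OpenSSL's HMAC() directly (a different route than the custom sha512.h in the question) gives a value to cross-check against online converters:

#include <cstdio>
#include <cstring>
#include <openssl/evp.h>
#include <openssl/hmac.h>

int main(int argc, char *argv[]) {
    if (argc != 3) return 0;
    const char *key = argv[1];
    const char *message = argv[2];
    unsigned char mac[EVP_MAX_MD_SIZE];
    unsigned int mac_len = 0;
    // HMAC() does the two-pass key padding and hashing internally.
    HMAC(EVP_sha512(), key, (int)strlen(key),
         (const unsigned char *)message, strlen(message), mac, &mac_len);
    for (unsigned int i = 0; i < mac_len; i++) printf("%02x", mac[i]);
    printf("\n");
    return 0;
}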

Sending a hex value in Win32 does not work the same as the C++ code

I have the following C++ code to display a hex value built from an int array.
#include <iostream>
#include <cstdio>
#include <cstring>
#include <cstdlib> // for ltoa (non-standard, but available on Windows)
#include <cmath>

using namespace std;

int main()
{
    int binText[32] = {1,0,1,0,1,1,1,1,0,0,0,1,1,0,1,0,1,1,1,1,0,1,0,1,1,1,1,1,0,0,0,1};
    char temp[255] = {0};
    for (int i = 0; i < 32; i++)
    {
        sprintf(&temp[strlen(temp)], "%d", binText[i]);
    }
    char HexBuffer[255];
    unsigned long long int Number = 0;
    int BinLength = strlen(temp);
    for (int i = 0; i < 32; i++)
    {
        Number += (long int)((temp[32 - i - 1] - 48) * pow((double)2, i));
    }
    ltoa(Number, HexBuffer, 16);
    cout << HexBuffer << endl;
}
Its output is: af1af5f1
So this code converts the binary digits stored in the int array into a hex value.
But when I tried to use this same code to send the hex value over serial communication using Win32, it did not send the correct hex value. The code is:
serialObj.begin("COM1", 9600); // opens the port
int binText[32] = {1,0,1,0,1,1,1,1,0,0,0,1,1,0,1,0,1,1,1,1,0,1,0,1,1,1,1,1,0,0,0,1};
char temp[255] = {0};
for (int i = 0; i < 32; i++)
{
    sprintf(&temp[strlen(temp)], "%d", binText[i]);
}
char HexBuffer[255];
unsigned long long int Number = 0;
int BinLength = strlen(temp);
for (int i = 0; i < 32; i++)
{
    Number += (long int)((temp[32 - i - 1] - 48) * pow((double)2, i));
}
ltoa(Number, HexBuffer, 16);
serialObj.send(HexBuffer);
serialObj.close(); // closes the port
The "send" function invoked by "serialObj.send(HexBuffer);" is as below:
void serial::send(char data[])
{
    DWORD dwBytesWrite;
    WriteFile(serialHandle, data, 4, &dwBytesWrite, NULL);
}
But the data it is sending is "61 66 31 61". I could not figure out why it gives this output.
The "send" function "serialObj.send" works properly for following code
char str[4]={0x24,0x24,0x53,0x3F};
serialObj.send(str);
and it sends 24 24 53 3F.
So I want to send AF 1A F5 F1 built from the binary digits stored in the int array (shown above). How can I do this?
The bytes 61 66 31 61 are simply the ASCII codes of 'a', 'f', '1', 'a': ltoa produced the text "af1af5f1", and send() transmits only its first four characters. If you want to send the actual binary value, don't call ltoa at all:
serialObj.send((char *)&Number);
Note that Number is declared as long long, which is likely 64 bits, so it isn't going to fit in the 4 bytes sent by your serial::send() function.
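For example (a sketch reusing the question's serialObj and the Number computed above), you can pack the low 32 bits into 4 bytes yourself, most significant byte first, so the receiver sees AF 1A F5 F1 regardless of the host's byte order:

uint32_t value = (uint32_t)Number;       // low 32 bits: 0xAF1AF5F1
char bytes[4];
bytes[0] = (char)((value >> 24) & 0xFF); // 0xAF
bytes[1] = (char)((value >> 16) & 0xFF); // 0x1A
bytes[2] = (char)((value >> 8) & 0xFF);  // 0xF5
bytes[3] = (char)(value & 0xFF);         // 0xF1
serialObj.send(bytes);                   // send() writes exactly 4 bytes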