Steganography: how to change the LSB of a pixel array - c++

I'm using the lodePNG library to decode a PNG image and change the LSB of its pixels with bits from an imported text file, then re-encode it. The program compiles, but I'm not sure the PNG file is actually being encoded according to my bitwise operation.
The lodePNG library decodes/encodes a PNG image and stores the pixels in the vector "image", 4 bytes per pixel, ordered RGBARGBA...:
void decodeOneStep(const char* filename)
{
    unsigned width, height;

    //decode
    unsigned error = lodepng::decode(image, width, height, filename);

    //if there's an error, display it
    if (error) std::cout << "decoder error " << error << ": "
                         << lodepng_error_text(error) << std::endl;
}
The program takes the text file and the PNG file as command-line arguments. I have not included error checking for the arguments yet.
int const MAX_SIZE = 100;
std::vector<unsigned char> image;

int main(int argc, char *argv[])
{
    const char* filename;
    char* textArray = new char[MAX_SIZE];
    std::ifstream textfile;
    textfile.open(argv[1]);

    int numCount = 0;
    while (!textfile.eof() && numCount < MAX_SIZE)
    {
        textfile.get(textArray[numCount]); //reading single character from file to array
        numCount++;
    }
    textfile.close();

    filename = argv[2];
    decodeOneStep(filename);

    unsigned width = 512, height = 512;
    image.resize(width * height * 4);

    int pixCount = 0;
    for (int i = 0; i < numCount - 1; i++) {
        std::cout << textArray[i];
        for (int j = 0; j < 8; j++) {
            std::cout << ((textArray[i]) & 1); //used to see actual bit value.
            image[pixCount++] |= ((textArray[i]) & 1);
            (textArray[i]) >>= 1;
        }
        std::cout << std::endl;
    }

    encodeOneStep(filename, image, width, height);
In the for-loop, I go through the vector and replace the LSB of each byte with a bit from the current char. Since a char is 8 bits, the inner loop runs 8 times. This program should work for most PNG images and for texts that don't exceed MAX_SIZE, but I'm not sure the bitwise operation is actually doing anything. Also, how would I shift the bits so that the char's bits are stored from MSB to LSB? I feel like I'm misunderstanding how the pixel values (bits) are stored in the array.
EDIT: Test I've run on the new bit operation:
for (int j = 7; j >= 0; j--) {
    //These tests were written to see if the 4-bits of the pixels were actually being replaced.
    //The LSB of the pixel bits are replaced with the MSB of the text character.
    std::cout << "Initial pixel 4-bits: " << std::bitset<4>(image[pixCount]) << " ";
    std::cout << "MSB of char: " << ((textArray[i] >> j) & 0x01) << " ";
    std::cout << "Pixel LSB replaced: " << ((image[pixCount] & mask) | ((textArray[i] >> j) & 0x01)) << " ";
    image[pixCount] = (image[pixCount] & mask) | ((textArray[i] >> j) & 0x01);
    pixCount++;
    std::cout << std::endl;
}
Test Result:
For char 'a'
Initial pixel 4-bits : 0000 MSB: 0 Pixel LSB replaced: 0
Initial pixel 4-bits : 0001 MSB: 1 Pixel LSB replaced: 1
Initial pixel 4-bits : 0001 MSB: 1 Pixel LSB replaced: 1
Initial pixel 4-bits : 0000 MSB: 0 Pixel LSB replaced: 0
Initial pixel 4-bits : 0000 MSB: 0 Pixel LSB replaced: 0
Initial pixel 4-bits : 0000 MSB: 0 Pixel LSB replaced: 0
Initial pixel 4-bits : 0000 MSB: 0 Pixel LSB replaced: 0
Initial pixel 4-bits : 0001 MSB: 1 Pixel LSB replaced: 1

You first need to clear out the lsb of the pixel before embedding your secret bit.
unsigned char mask = 0xfe; // in binary 11111110
// and in your loop
image[pixCount] = (image[pixCount] & mask) | (textArray[i] & 1);
pixCount++;
If you want to embed the bits of each character of the secret from the most to the least significant bit, you want to count down the j loop.
for (int j = 7; j >= 0; j--) {
    image[pixCount] = (image[pixCount] & mask) | ((textArray[i] >> j) & 0x01);
    pixCount++;
}
Edit: To explain the code above, image[pixCount] & mask is an AND operation between your pixel and the chosen mask value (11111110 in binary), so the result is that the lsb is cleared.
(textArray[i] >> j) & 0x01 shifts your char to the right by j and keeps only the lsb. If you map out the maths, this is what you get:
// assume our character has the bits `abcdefgh`
j = 7
(character >> j) & 0x01 = 0000000a & 0x01 = a
j = 6
(character >> j) & 0x01 = 000000ab & 0x01 = b
j = 5
(character >> j) & 0x01 = 00000abc & 0x01 = c
// and so on
j = 0
(character >> j) & 0x01 = abcdefgh & 0x01 = h
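
To see the whole round trip in one place, here is a minimal, self-contained sketch of the same technique using a small stand-in buffer instead of the real image vector (the names here are illustrative, not from the question's code):
#include <bitset>
#include <iostream>
#include <vector>

int main()
{
    // Hypothetical stand-in for the image vector: 8 bytes, one per embedded bit.
    std::vector<unsigned char> pixels(8, 0xAB);
    const unsigned char mask = 0xfe;   // 11111110: clears the lsb
    char secret = 'a';                 // 01100001

    // Embed the bits of 'secret' MSB-first into the lsb of each byte.
    for (int j = 7; j >= 0; j--)
        pixels[7 - j] = (pixels[7 - j] & mask) | ((secret >> j) & 0x01);

    // Recover the character by reading the lsbs back in the same order.
    char recovered = 0;
    for (int k = 0; k < 8; k++)
        recovered = (recovered << 1) | (pixels[k] & 0x01);

    std::cout << std::bitset<8>(secret) << " -> " << std::bitset<8>(recovered)
              << " (" << recovered << ")\n";
}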


Why is this data being flipped

Below is an example of processing very similar to what I am working with. I understand the concept of endianness and have read through the suggested posts, but it doesn't seem to explain what is happening here.
I have an array of unsigned characters that I am packing with data. I was under the impression that memcpy was endianness-agnostic. I would think that the left-most bit would stay the left-most bit. However, when I attempt to print the characters, each word is copied backwards.
Why does this happen?
#include <iostream>
#include <cstring>
#include <array>

const unsigned int MAX_VALUE = 64ul;
typedef unsigned char DDS_Octet[MAX_VALUE];

int main()
{
    // create an array and populate it with printable
    // characters
    DDS_Octet octet;
    for (int i = 0; i < MAX_VALUE; ++i)
        octet[i] = (i + 33);

    // print characters before the memcpy operation
    for (int i = 0; i < MAX_VALUE; ++i)
    {
        if (i && !(i % 4)) std::cout << "\n";
        std::cout << octet[i] << "\t";
    }
    std::cout << "\n\n------------------------------\n";

    // This is an equivalent copy operation
    // to what is actually being used
    std::array<unsigned int, 16> arr;
    memcpy(
        arr.data(),
        octet,
        sizeof(octet));

    // print the character contents of each
    // word left to right (MSB to LSB on little endian)
    for (auto i : arr)
        std::cout
            << (char)(i >> 24) << "\t"
            << (char)((i >> 16) & 0xFF) << "\t"
            << (char)((i >> 8) & 0xFF) << "\t"
            << (char)(i & 0xFF) << "\n";

    return 0;
}
** output **
! " # $
% & ' (
) * + ,
- . / 0
1 2 3 4
5 6 7 8
9 : ; <
= > ? #
A B C D
E F G H
I J K L
M N O P
Q R S T
U V W X
Y Z [ \
] ^ _ `
------------------------------
$ # " !
( ' & %
, + * )
0 / . -
4 3 2 1
8 7 6 5
< ; : 9
# ? > =
D C B A
H G F E
L K J I
P O N M
T S R Q
X W V U
\ [ Z Y
` _ ^ ]
----Update-----
I took a look at the memcpy source code (below), which was far simpler than expected. It actually explains everything. It seems correct to say that the endianness of the integer is the cause of this, but incorrect to say that memcpy plays no role. What I was overlooking was that the data is copied byte by byte. Given that, it makes sense that a little-endian integer would appear reversed.
void *
memcpy (void *dest, const void *src, size_t len)
{
    char *d = dest;
    const char *s = src;
    while (len--)
        *d++ = *s++;
    return dest;
}
When you memcpy 4 chars into a 4-byte unsigned int they get stored in the same order they were in the original array. That is, the first char in the input array will be stored in the lowest address byte of the unsigned int, the second in the second lowest address byte, and so on.
x86 is little-endian. The lowest address byte of an unsigned int is the least significant byte.
The shift operators are endianness-independent though. They work on the logical value of an integer, not the physical bytes. That means that for an unsigned int i on a little-endian platform, i & 0xFF gives the lowest-address byte and (i >> 24) & 0xFF gives the highest-address byte, while on a big-endian platform i & 0xFF gives the highest-address byte and (i >> 24) & 0xFF gives the lowest-address byte.
Taken together, these three facts explain why your data is reversed. '!' is the first char in your array, so when you memcpy that array into an array of unsigned int, '!' becomes the lowest-address byte of the first unsigned int in the destination array. The lowest-address byte is the least significant on your little-endian platform, and so that is the byte you retrieve with i & 0xFF.
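A small program can confirm these three facts; this sketch assumes a little-endian machine and uses the first four characters from the question's array:
#include <cstdint>
#include <cstring>
#include <iostream>

int main()
{
    unsigned char bytes[4] = { '!', '"', '#', '$' };  // first four chars of the array
    std::uint32_t word = 0;
    std::memcpy(&word, bytes, sizeof(word));          // byte-for-byte copy, no reordering

    // On a little-endian machine bytes[0] ('!') lands in the least significant byte,
    // so reading the word from its most significant byte down prints them reversed.
    std::cout << (char)(word >> 24) << (char)((word >> 16) & 0xFF)
              << (char)((word >> 8) & 0xFF) << (char)(word & 0xFF) << '\n';  // prints $#"!

    // Reading the bytes of the int directly preserves the memory order instead.
    const unsigned char *p = reinterpret_cast<const unsigned char *>(&word);
    std::cout << p[0] << p[1] << p[2] << p[3] << '\n';                       // prints !"#$
}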
Maybe this will make it easier to understand. Let's say we have these data defined:
uint32_t val = 0x01020304;
auto *pi = reinterpret_cast<unsigned char *>( &val );
The following code will produce the same result on both big-endian and little-endian platforms:
std::cout << ( (val >> 24) & 0xFF ) << '\t'
<< ( (val >> 16) & 0xFF ) << '\t'
<< ( (val >> 8) & 0xFF ) << '\t'
<< ( (val >> 0) & 0xFF ) << '\n';
but this code will have different output:
std::cout << static_cast<unsigned int>( pi[0] ) << '\t'
<< static_cast<unsigned int>( pi[1] ) << '\t'
<< static_cast<unsigned int>( pi[2] ) << '\t'
<< static_cast<unsigned int>( pi[3] ) << '\n';
It has nothing to do with memcpy(); it is how ints are stored in memory and how bit shifting works.
The value 0x12345678 is stored (on a little-endian machine) as 4 bytes: 0x78 0x56 0x34 0x12. But 0x12345678 >> 24 is still 0x12, because the shift works on the value, not on the 4 separate bytes.
If you have the 4 bytes 0x78 0x56 0x34 0x12 and interpret them as a 4-byte little-endian integer, you get 0x12345678. If you right-shift by 24 bits, you get the 4th byte: 0x12. If you right-shift by 16 bits and mask with 0xff, you get the 3rd byte: 0x34, because ((0x12345678 >> 16) & 0xff) == 0x34. And so on.
The memcpy has nothing to do with it.

Difference between bitshifting mask vs unsigned int

For a project, I had to extract the individual bytes of an unsigned int. I first tried bit-shifting the mask to find them, but that didn't work, so I tried bit-shifting the value instead and it worked.
What's the difference between these two? Why didn't the first one work?
void ExampleFunk(unsigned int value) {
    for (int i = 0; i < 4; i++) {
        ExampleSubFunk(value & (0x00FF << (i * 8)));
    }
}

void ExampleFunk(unsigned int value) {
    for (int i = 0; i < 4; i++) {
        ExampleSubFunk((value >> (i * 8)) & 0x00FF);
    }
}
Take the value 0xAABBCCDD as an example.
The expression value & (0xFF << (i * 8)) assumes the values:
0xAABBCCDD & 0x000000FF = 0x000000DD
0xAABBCCDD & 0x0000FF00 = 0x0000CC00
0xAABBCCDD & 0x00FF0000 = 0x00BB0000
0xAABBCCDD & 0xFF000000 = 0xAA000000
While the expression (value >> (i * 8)) & 0xFF assumes the values:
0xAABBCCDD & 0x000000FF = 0x000000DD
0x00AABBCC & 0x000000FF = 0x000000CC
0x0000AABB & 0x000000FF = 0x000000BB
0x000000AA & 0x000000FF = 0x000000AA
As you can see, the results are quite different after i = 0, because the first expression is only "selecting" 8 bits from value, while the second expression is shifting them down to the least significant byte first.
Note that in the first case, the expression (0xFF << (i * 8)) is shifting an int literal (0xFF) left. You should cast the literal to unsigned int to avoid signed integer overflow, which is undefined behavior:
value & ((unsigned int)0xFF << (i * 8))
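For reference, a minimal program that prints both sequences side by side for the 0xAABBCCDD example (using an unsigned 0xFFu literal to sidestep the overflow issue):
#include <cstdio>

int main()
{
    unsigned int value = 0xAABBCCDD;
    for (int i = 0; i < 4; i++) {
        unsigned int masked  = value & (0xFFu << (i * 8));   // selects the bits in place
        unsigned int shifted = (value >> (i * 8)) & 0xFFu;   // moves them down first
        std::printf("i=%d  mask-shift: 0x%08X   value-shift: 0x%02X\n",
                    i, masked, shifted);
    }
}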
In this code:
void ExampleFunk(unsigned int value) {
    for (int i = 0; i < 4; i++) {
        ExampleSubFunk(value & (0x00FF << (i * 8)));
    }
}
You are shifting the bits of 0x00FF itself, producing new masks of 0x00FF, 0xFF00, 0xFF0000, and 0xFF000000, and then you are masking value with each of those masks. The result contains only the 8 bits of value that you are interested in, but those 8 bits are not moving position at all.
In this code:
void ExampleFunk(unsigned int value) {
    for (int i = 0; i < 4; i++) {
        ExampleSubFunk((value >> (i * 8)) & 0x00FF);
    }
}
You are shifting the bits of value, thus moving those 8 bits that you want, and then you are masking the result with 0x00FF to extract those 8 bits.
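If you want to keep the mask-shifting form, it can be made equivalent by shifting the masked result back down afterwards. A sketch, with a hypothetical ExampleSubFunk that just prints what it receives:
#include <cstdio>

// Hypothetical stand-in for the project's ExampleSubFunk: it just shows its argument.
void ExampleSubFunk(unsigned int byte)
{
    std::printf("got 0x%02X\n", byte);
}

void ExampleFunk(unsigned int value)
{
    for (int i = 0; i < 4; i++) {
        // Masking in place only works if the selected bits are then shifted down.
        ExampleSubFunk((value & (0xFFu << (i * 8))) >> (i * 8));
    }
}

int main()
{
    ExampleFunk(0xAABBCCDD);  // prints 0xDD, 0xCC, 0xBB, 0xAA
}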

stretch mask - bit manipulation

I want to stretch a mask in which every bit represents 4 bits of the stretched mask.
I am looking for an elegant bit manipulation to do this stretch in C++ and SystemC.
for example:
input:
mask (32 bits) = 0x0000CF00
output:
stretched mask (128 bits) = 0x00000000 00000000 FF00FFFF 00000000
and just to clarify the example, let's look at the nibble C:
0xC = 1100 after stretching: 1111111100000000 = 0xFF00
Doing this in an elegant way is not easy.
The simplest approach is probably a loop with bit shifts, walking the mask from its most significant bit down so the nibbles come out in the right order:
sc_biguint<128> result = 0;
for (int i = 31; i >= 0; i--) {
    result <<= 4;              // make room for the next nibble
    if (bit_test(var, i)) {
        result += 0xF;         // stretch this mask bit into four set bits
    }
}
Here's a way of stretching a 16-bit mask into 64 bits where every bit represents 4 bits of stretched mask:
uint64_t x = 0x000000000000CF00LL;
x = (x | (x << 24)) & 0x000000ff000000ffLL;
x = (x | (x << 12)) & 0x000f000f000f000fLL;
x = (x | (x << 6)) & 0x0303030303030303LL;
x = (x | (x << 3)) & 0x1111111111111111LL;
x |= x << 1;
x |= x << 2;
It starts off with the mask in the bottom 16 bits. Then it moves the top 8 bits of the mask into the top 32 bits, like this:
0000000000000000 0000000000000000 0000000000000000 ABCDEFGHIJKLMNOP
becomes
0000000000000000 00000000ABCDEFGH 0000000000000000 00000000IJKLMNOP
Then it solves the similar problem of stretching a mask from the bottom 8 bits of a 32 bit word, to the top and bottom 32-bits simultaneously:
000000000000ABCD 000000000000EFGH 000000000000IJKL 000000000000MNOP
Then it does it for 4 bits inside 16 and so on until the bits are spread out:
000A000B000C000D 000E000F000G000H 000I000J000K000L 000M000N000O000P
Then it "smears" them across 4 bits by ORing the result with itself twice:
AAAABBBBCCCCDDDD EEEEFFFFGGGGHHHH IIIIJJJJKKKKLLLL MMMMNNNNOOOOPPPP
You could extend this to 128 bits by adding an extra first step where you shift by 48 bits and mask with a 128-bit constant:
x = (x | (x << 48)) & 0x000000000000ffff000000000000ffff; // illustrative only, see below
You'd also have to stretch the other constants out to 128 bits by repeating their bit patterns. However, as far as I know there is no way to write a 128-bit integer literal in C++, so you would have to build those constants with shifts, or perhaps with macros. You could also make a 128-bit version just by using the 64-bit version on the top and bottom 16 bits of the mask separately.
If loading the masking constants turns out to be a difficulty or bottleneck you can generate each one from the previous one using shifting and masking:
uint64_t m = 0x000000ff000000ffLL;
m &= m >> 4; m |= m << 16; // gives 0x000f000f000f000fLL
m &= m >> 2; m |= m << 8; // gives 0x0303030303030303LL
m &= m >> 1; m |= m << 4; // gives 0x1111111111111111LL
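Putting the pieces together, here is a sketch of the split-in-two idea mentioned above: the 64-bit routine wrapped in a function and applied to the top and bottom 16 bits of the question's example mask, giving the 128-bit result as two 64-bit halves:
#include <cstdint>
#include <cstdio>

// The 16-bit -> 64-bit stretch from above, wrapped in a function.
std::uint64_t stretch16(std::uint64_t x)
{
    x = (x | (x << 24)) & 0x000000ff000000ffULL;
    x = (x | (x << 12)) & 0x000f000f000f000fULL;
    x = (x | (x << 6))  & 0x0303030303030303ULL;
    x = (x | (x << 3))  & 0x1111111111111111ULL;
    x |= x << 1;
    x |= x << 2;
    return x;
}

int main()
{
    std::uint32_t mask = 0x0000CF00;              // example from the question
    std::uint64_t hi = stretch16(mask >> 16);     // top 16 bits -> upper 64 bits
    std::uint64_t lo = stretch16(mask & 0xFFFF);  // bottom 16 bits -> lower 64 bits
    std::printf("%016llx %016llx\n", (unsigned long long)hi, (unsigned long long)lo);
    // prints: 0000000000000000 ff00ffff00000000
}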
Does this work for you?
#include <stdio.h>

long long Stretch4x(int input)
{
    long long output = 0;
    while (input & -input)
    {
        int b = (input & -input);
        long long s = 0;
        input &= ~b;
        s = b * 15;
        while (b >>= 1)
        {
            s <<= 3;
        }
        output |= s;
    }
    return output;
}

int main(void) {
    int input = 0xCF00;
    printf("0x%0x ==> 0x%0llx\n", input, Stretch4x(input));
    return 0;
}
Output:
0xcf00 ==> 0xff00ffff00000000
The other solutions are good. However, most of them are more C than C++. This solution is pretty straightforward: it uses std::bitset and sets four output bits for each input bit.
#include <bitset>
#include <iostream>

std::bitset<128>
starch_32 (const std::bitset<32> &input)
{
    std::bitset<128> output;
    for (size_t i = 0; i < input.size(); ++i) {
        // If `input[N]` is `true`, set `output[N*4]` through `output[N*4 + 3]` to true.
        if (input.test (i)) {
            const size_t output_index = i * 4;
            output.set (output_index);
            output.set (output_index + 1);
            output.set (output_index + 2);
            output.set (output_index + 3);
        }
    }
    return output;
}

// Example with 0xC.
int main() {
    std::bitset<32> input{0b1100};
    auto result = starch_32 (input);
    std::cout << "0x" << std::hex << result.to_ullong() << "\n";
}
On x86 you could use the PDEP intrinsic to move the 16 mask bits into the correct nibble (into the low bit of each nibble, for example) of a 64-bit word, and then use a couple of shift + or to smear them into the rest of the word:
uint64_t x = _pdep_u64(m, 0x1111111111111111ULL);
x |= x << 1;
x |= x << 2;
You could also replace those two OR and two shift by a single multiplication by 0xF which accomplishes the same smearing.
Finally, you could consider a SIMD approach: shift-and-mask solutions such as the first one above should map naturally to SIMD.
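A self-contained sketch of the PDEP approach, using the multiply-by-0xF smear (this assumes a CPU with BMI2 and needs <immintrin.h>; compile with something like -mbmi2):
#include <cstdint>
#include <cstdio>
#include <immintrin.h>   // _pdep_u64 (BMI2)

int main()
{
    std::uint64_t m = 0xCF00;                               // 16-bit mask from the question
    std::uint64_t x = _pdep_u64(m, 0x1111111111111111ULL);  // one mask bit per nibble
    x *= 0xF;                                               // smear each bit across its nibble
    std::printf("%016llx\n", (unsigned long long)x);        // prints ff00ffff00000000
}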

Reading 16 bit DPX Pixel Data

I'm trying to read pixel data from a 16-bit DPX file, extending a previous git repo (which only supports 10-bit).
I'm working from the DPX format summary.
I'm using an existing header and cpp to deal with the header info and to get that sort of data.
Note that the variables _pixelOffset, _width, _height, and _channels come from the header information of the DPX; pPixels is a float* array:
#include <iostream>
#include <fstream>
#include <dpxHeader.h>

// First read the file as binary.
std::ifstream _in(_filePath.asChar(), std::ios_base::binary);

// Seek to the pixel offset to start reading where the pixel data starts.
if (!_in.seekg (_pixelOffset, std::ios_base::beg))
{
    std::cerr << "Cannot seek to start of pixel data " << _filePath << " in DPX file.";
    return MS::kFailure;
}

// Create a buffer to store one scan line of the image.
unsigned char *rawLine = new unsigned char[_width * 4]();

// Iterate over height pixels
for (int y = 0; y < _height; ++y)
{
    // Read full pixel data for width.
    if (!_in.read ((char *)&rawLine[0], _width * 4))
    {
        std::cerr << "Cannot read scan line " << y << " " << "from DPX file " << std::endl;
        return MS::kFailure;
    }

    // Iterate over width
    for (int x = 0; x < _width; ++x)
    {
        // We do this to flip the image because it's flipped vertically when read in
        int index = ((_height - 1 - y) * _width * _channels) + x * _channels;

        unsigned int word = getU32(rawLine + 4 * x, _byteOrder);
        pPixels[index]     = ((word >> 22) & 0x3ff) / 1023.0;
        pPixels[index + 1] = ((word >> 12) & 0x3ff) / 1023.0;
        pPixels[index + 2] = ((word >>  2) & 0x3ff) / 1023.0;
    }
}

delete [] rawLine;
This currently works for 10-bit files, but as I am new to bitwise operations I'm not completely sure how to extend it to 12 and 16 bit. Does anyone have any clues, or can you point me in the right direction?
This file format is somewhat comprehensive, but if you are only targeting a known subset it shouldn't be too hard to extend.
From your code sample it appears that you are currently working with three components per pixel, and that the components are packed into 32-bit words. In this mode both 12-bit and 16-bit will store two components per word, according to the specification you've provided. For 12-bit, the upper 4 bits of each component are padding data. You will need three 32-bit words to get the six color components that decode into two pixels:
...
unsigned int word0 = getU32(rawLine + 6 * x + 0, _byteOrder);
unsigned int word1 = getU32(rawLine + 6 * x + 4, _byteOrder);
unsigned int word2 = getU32(rawLine + 6 * x + 8, _byteOrder);
// First pixel
pPixels[index] = (word0 & 0xffff) / (float)0xffff; // (or 0xfff for 12-bit)
pPixels[index+1] = (word0 >> 16) / (float)0xffff;
pPixels[index+2] = (word1 & 0xffff) / (float)0xffff;
x++;
if(x >= _width) break; // In case of an odd amount of pixels
// Second pixel
pPixels[index+3] = (word1 >> 16) / (float)0xffff;
pPixels[index+4] = (word2 & 0xffff) / (float)0xffff;
pPixels[index+5] = (word2 >> 16) / (float)0xffff;
...
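If it helps, the per-word unpacking could be factored into a small helper so the same loop handles both depths; a sketch with a hypothetical unpackTwoComponents function (byte-order handling via getU32 is assumed to happen before the call, as in the snippet above):
#include <cstdint>

// Hypothetical helper: unpack the two components stored in one 32-bit DPX word.
// For 16-bit data each half-word is a full component; for 12-bit data the
// component sits in the low 12 bits of each half-word and the top 4 bits are padding.
inline void unpackTwoComponents(std::uint32_t word, int bitDepth,
                                float &first, float &second)
{
    const std::uint32_t maxVal = (bitDepth == 12) ? 0xFFF : 0xFFFF;
    first  = (word         & maxVal) / (float)maxVal;   // low half-word
    second = ((word >> 16) & maxVal) / (float)maxVal;   // high half-word
}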

Bitwise shifting in C++

Trying to hide data within a PPM Image using C++:
void PPMObject::hideData(string phrase)
{
    phrase += '\0';
    size_t size = phrase.size() * 8;
    bitset<8> binary_phrase (phrase.c_str()[0]);

    //We need 8 channels for each letter
    for (size_t index = 0; index < size; index += 3)
    {
        //convert red channel to bits
        bitset<8> r (this->m_Ptr[index]);
        if (r.at(7) != binary_phrase.at(index))
        {
            r.flip(7);
        }
        this->m_Ptr[index] = (char) r.to_ulong();

        //convert green channel to bits and find LSB
        bitset<8> g (this->m_Ptr[index+1]);
        if (g.at(7) != binary_phrase.at(index+1))
        {
            g.flip(7);
        }
        this->m_Ptr[index+1] = (char) g.to_ulong();

        //convert blue channel to bits and find LSB
        bitset<8> b (this->m_Ptr[index+2]);
        if (b.at(7) != binary_phrase.at(index+2))
        {
            b.flip(7);
        }
        this->m_Ptr[index+2] = (char) b.to_ulong();
    }
    //this->m_Ptr[index+1] = (r.to_ulong() & 0xFF);
}
Then extracting the data by reversing the above process:
string PPMObject::recoverData()
{
    size_t size = this->width * this->height * 3;
    string message("");

    //We need 8 channels for each letter
    for (size_t index = 0; index < size; index += 3)
    {
        //retrieve our hidden data from the LSB in red channel
        bitset<8> r (this->m_Ptr[index]);
        message += r.to_string()[7];

        //retrieve our hidden data from the LSB in green channel
        bitset<8> g (this->m_Ptr[index+1]);
        message += g.to_string()[7];

        //retrieve our hidden data from the LSB in blue channel
        bitset<8> b (this->m_Ptr[index+2]);
        message += b.to_string()[7];
    }
    return message;
}
The above hide data function converts each channel (RGB) to binary. It then attempts to find the least significant bit and flips it if it does not match the nth bit of the phrase (starting at zero). It then assigns that new converted binary string back into the pointer as a casted char.
Is using the bitset library a "best practice" technique? I am all ears for a more straightforward, efficient technique, perhaps using bitwise manipulations?
There are no logic errors or problems whatsoever with reading and writing the PPM Image. The pixel data is assigned to a char pointer: this->m_Ptr (above).
Here's some more compact code that does bit manipulation. It doesn't bounds check m_Ptr, but neither does your code.
#include <iostream>
#include <string>

using namespace std;

struct PPMObject
{
    void hideData(const string &phrase);
    string recoverData(size_t size);

    char m_Ptr[256];
};

void PPMObject::hideData(const string &phrase)
{
    size_t size = phrase.size();
    for (size_t p_index = 0, i_index = 0; p_index < size; ++p_index)
        for (int i = 0, bits = phrase[p_index]; i < 8; ++i, bits >>= 1, ++i_index)
        {
            m_Ptr[i_index] &= 0xFE;          // set lsb to 0
            m_Ptr[i_index] |= (bits & 0x1);  // set lsb to lsb of bits
        }
}

string PPMObject::recoverData(size_t size)
{
    string ret(size, ' ');
    for (size_t p_index = 0, i_index = 0; p_index < size; ++p_index)
    {
        int i, bits;
        for (i = 0, bits = 0; i < 8; ++i, ++i_index)
            bits |= ((m_Ptr[i_index] & 0x1) << i);
        ret[p_index] = (char) bits;
    }
    return ret;
}

int main()
{
    PPMObject p;
    p.hideData("Hello World!");
    cout << p.recoverData(12) << endl;
    return 0;
}
Note that this code encodes from lsb to msb of each byte of the phrase.
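If MSB-first ordering is preferred instead (as asked in the first question above), the same structure works with the inner loops reversed; a sketch using free functions over a raw buffer:
#include <iostream>
#include <string>

void hideDataMsbFirst(char *ptr, const std::string &phrase)
{
    size_t i_index = 0;
    for (size_t p = 0; p < phrase.size(); ++p)
        for (int j = 7; j >= 0; --j, ++i_index)
        {
            ptr[i_index] &= 0xFE;                    // clear the lsb
            ptr[i_index] |= (phrase[p] >> j) & 0x1;  // store bit j of the char, msb first
        }
}

std::string recoverDataMsbFirst(const char *ptr, size_t size)
{
    std::string ret(size, ' ');
    size_t i_index = 0;
    for (size_t p = 0; p < size; ++p)
    {
        int bits = 0;
        for (int j = 0; j < 8; ++j, ++i_index)
            bits = (bits << 1) | (ptr[i_index] & 0x1);  // msb comes back out first
        ret[p] = (char) bits;
    }
    return ret;
}

int main()
{
    char buffer[256] = {};
    hideDataMsbFirst(buffer, "Hello World!");
    std::cout << recoverDataMsbFirst(buffer, 12) << std::endl;
}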