Splitting a JPEG image into blocks in C++

Can anyone tell me how a JPEG image can be divided into 8 x 8 blocks in C++?
Thanks.

Ah, the die-hard approach. My heart goes out to you. Expect to learn a lot, but be forewarned that it will cost you time, blood and pain.
The Compression FAQ has some details on how JPEG works. A good starting point is Part 2: Subject 75: Introduction to JPEG.
In a nutshell, for a typical JPEG file, you will have to reverse the encoding steps 6 to 4:
(6) extract the appropriate headers and image data from the JFIF container
(5) reverse the Huffman coding
(4) reverse the quantization
You should then be left with 8x8 blocks you could feed into an appropriate inverse DCT.
Wikipedia has some details on the JFIF format as well as Huffman tables and structure of the JPEG data within the JFIF.
I'm assuming you're looking to play with JPEG to learn about it? Because access to the raw encoded blocks is almost certainly not necessary if you have some practical application.
EDIT after seeing comments: If you just want to get a part of a very large JPEG without reading/decompressing the whole file, you could use ImageMagick's stream command, which extracts a subimage without reading the whole file. For example, stream -extract 8x8+16+16 large.jpeg block.rgb gets an 8x8 block starting at (16,16).

You have to decompress the image first. Use the TurboJPEG (libjpeg-turbo) library (it's very fast), which will give you an array of unsigned char as RGB (or RGBA). Now you have an uncompressed image, with one byte each for R, G and B per pixel.
From here, you can write a simple for loop that goes through the image in runs of 3*8 bytes (8 RGB pixels, one row of a block) and copies them with memcpy to some other memory location.
Keep in mind that the array returned by the library is a one-dimensional linear array of bytes, so the scanlines are stored one after the other. Take this into account when creating your blocks, because depending on your needs you'll have to traverse the array differently.
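As a rough illustration of that loop, here is a minimal sketch. It assumes a tightly packed, row-major RGB buffer (3 bytes per pixel, no row padding); the function name extractBlock and the zero padding of partial edge blocks are choices made for this example, not anything the library prescribes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr int kBlock = 8;     // block edge length in pixels
constexpr int kChannels = 3;  // interleaved R, G, B (assumed layout)

// Copies the 8x8 block with block coordinates (bx, by) out of a row-major,
// tightly packed RGB image. Pixels that would fall outside the image
// (partial blocks on the right or bottom edge) are left as zero.
std::vector<std::uint8_t> extractBlock(const std::vector<std::uint8_t>& rgb,
                                       int width, int height, int bx, int by) {
    std::vector<std::uint8_t> block(kBlock * kBlock * kChannels, 0);
    const int x0 = bx * kBlock;
    const int y0 = by * kBlock;
    const int cols = std::min(kBlock, width - x0);
    const int rows = std::min(kBlock, height - y0);
    if (cols <= 0 || rows <= 0) return block;

    for (int row = 0; row < rows; ++row) {
        // Start of this block row inside the big image.
        const std::size_t src =
            (static_cast<std::size_t>(y0 + row) * width + x0) * kChannels;
        // Start of the same row inside the 8x8 block.
        const std::size_t dst =
            static_cast<std::size_t>(row) * kBlock * kChannels;
        std::memcpy(&block[dst], &rgb[src],
                    static_cast<std::size_t>(cols) * kChannels);
    }
    return block;
}
```

Calling extractBlock for every bx in [0, ceil(width/8)) and by in [0, ceil(height/8)) visits the whole image. Note that these are blocks of decoded pixels, not the DCT blocks stored inside the JPEG file.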

Related

How does hiding files in jpeg file works

I was reading an article explaining How to Hide Files in JPEG Pictures.
I am wondering how it's possible for a file to contain both jpeg data and a rar file without any visible distortion either to the image or to the compressed file.
My guess is that it has something to do with how either the compressed file or the jpeg file is represented in binary form, but I have no idea how this works.
Can someone elaborate on that?

All that is doing is adding the archive to the end of a JPEG stream. You then hope your JPEG decoder will not read past the EOI marker, find data there, and say something is wrong.
A JPEG image is a stream of bytes starting with an SOI marker and ending with an EOI marker.
ZIP and RAR files are streams of bytes too. A ZIP stream starts with 50 4B. A RAR stream starts with 52 61 72 21 1A 07.
The method described in the link above makes a binary copy of a JPEG stream and appends a ZIP or RAR stream to it.
The RAR/ZIP decoders scan the stream until they find the signature for RAR or ZIP (ignoring the JPEG stream).
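The scanning step can be sketched in a few lines. This is only an illustration of the idea, assuming the whole file has already been read into memory; findRarSignature is a name made up for this example, and real archive tools do considerably more validation.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the byte offset of the first RAR signature in `data`,
// or -1 if the signature does not occur.
std::ptrdiff_t findRarSignature(const std::vector<std::uint8_t>& data) {
    static const std::array<std::uint8_t, 6> kRar = {
        0x52, 0x61, 0x72, 0x21, 0x1A, 0x07};  // "Rar!" 1A 07
    const auto it =
        std::search(data.begin(), data.end(), kRar.begin(), kRar.end());
    if (it == data.end()) return -1;
    return it - data.begin();
}
```

Everything before the returned offset is the JPEG stream, which the archive reader simply skips over.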

This answer does not address the exact case in the link you gave, but it provides another way of hiding data:
It would also be theoretically possible to hide a file within the JPEG picture itself, but you would need a complicated program to write the encoded data and then read it again.
Basically, a JPEG photograph contains a lot of information which, if it changed, would not be noticeable to the human eye. Imagine you have a photo of a person in a blue shirt. If you zoom in on that shirt you will see that it is not an even blue colour, but made up of a multitude of flecks of colour, most of which are a bluish tone (but some could be other colours as well). You could easily change some of those flecks to a slightly different tone and it would make no obvious visible difference to the picture.
A clever program could embed a code in the photo by subtly changing pixels to a pattern that represents data. A very simple example: if the "hue" (i.e. colour tone) is represented by a number between 0 and 255, pixels of an even hue could represent a "0" bit and pixels of odd hue a "1" bit. It would be hard for the human eye to detect such a difference in the picture.
It is an old idea and this article discusses how much data could be hidden in this way: High capacity data hiding in JPEG-compressed images (2004)

In general, hiding a file within another file is a practice known as steganography. The method described in the link you provided simply concatenates the .rar to the end of the .jpg using the + operator, taking advantage of the different headers of each file type. @user3344003 does an excellent job of explaining why this works in his/her answer. This doesn't distort the image because the image data is left unaltered.
Another common method of hiding a file within an image is to use the Least Significant Bit (LSB) of each byte. The way this is performed is to replace every 8th bit in the image's bitstream with the next bit of the file you wish to hide. This works because the image's colors can be distorted slightly without being easily perceived by the human eye. In this approach, the image's size on disk will not grow as it would in the method from your link. This makes evidence of the hidden file much harder to detect. For a detailed look at this and other Steganographic methods, see this paper by Bret Dunbar.
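A minimal sketch of LSB embedding follows. It assumes the carrier is a buffer of uncompressed pixel bytes (for example decoded RGB saved in a lossless format); applying it to pixels that are then re-saved as JPEG would generally destroy the hidden bits. The function names and the most-significant-bit-first ordering are choices made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Stores each bit of `payload` (most significant bit first) in the least
// significant bit of successive carrier bytes. Every carrier byte changes
// by at most 1.
void embedLsb(std::vector<std::uint8_t>& carrier,
              const std::vector<std::uint8_t>& payload) {
    if (payload.size() * 8 > carrier.size())
        throw std::invalid_argument("carrier too small for payload");
    for (std::size_t i = 0; i < payload.size() * 8; ++i) {
        const std::uint8_t bit =
            static_cast<std::uint8_t>((payload[i / 8] >> (7 - i % 8)) & 1u);
        carrier[i] = static_cast<std::uint8_t>((carrier[i] & 0xFEu) | bit);
    }
}

// Reads `payloadBytes` bytes back out of the carrier's least significant bits.
std::vector<std::uint8_t> extractLsb(const std::vector<std::uint8_t>& carrier,
                                     std::size_t payloadBytes) {
    if (payloadBytes * 8 > carrier.size())
        throw std::invalid_argument("carrier too small for payload");
    std::vector<std::uint8_t> out(payloadBytes, 0);
    for (std::size_t i = 0; i < payloadBytes * 8; ++i)
        out[i / 8] = static_cast<std::uint8_t>((out[i / 8] << 1) |
                                               (carrier[i] & 1u));
    return out;
}
```

The capacity is one payload bit per carrier byte, so a carrier of N bytes can hold at most N/8 payload bytes.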

There is a simple algorithm that I implemented in MATLAB. If you split each pixel value of your image into its 8 bits, the most significant bits carry the most valuable information, and you can discard bit 0 and bit 1 without any visible change to the original image. So you can store your file in bits 0 and 1 instead. I saw this algorithm in Anil K. Jain's book.

Most efficient way to store video data

In order to accomplish some specific editing on some .avi files, I'd like to create an application (in C++) that is able to load, edit, and save those .avi files. But what is the most efficient way? When first thinking about it, a simple 3D array containing a 2D array of pixels for every frame seems the simplest solution, but then its size would be ENORMOUS. I mean, let's assume that a pixel only needs a color. One color would mean 3 bytes (1 char r, 1 char g, 1 char b). If I now have a 1920x1080 video format, this would mean about 6 MEGABYTES for only one frame! This data may or may not be smaller if using pointers for the colors, so that already used colors won't take more space. I don't really know, since I'm pretty new to C++ and the whole low-level stuff. (As a comparison: one of my AVI files recorded with the Xvid codec is 40 seconds long, 30 fps, and only has 2 MB.)
So how would you actually store the video data (Not even the audio, just the video) efficiently (while still being easily able to perform per-frame-changes on it)?

As you have realised, uncompressed video is enormous and it is not practical to store an entire video in this way.
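To put numbers on that, here is a small sketch of the arithmetic, assuming 3 bytes per pixel (24-bit RGB) and no padding; the function names are made up for this example.

```cpp
#include <cstdint>

// Size in bytes of one uncompressed frame.
constexpr std::uint64_t frameBytes(std::uint64_t width, std::uint64_t height,
                                   std::uint64_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

// Size in bytes of a whole uncompressed clip.
constexpr std::uint64_t clipBytes(std::uint64_t width, std::uint64_t height,
                                  std::uint64_t bytesPerPixel,
                                  std::uint64_t fps, std::uint64_t seconds) {
    return frameBytes(width, height, bytesPerPixel) * fps * seconds;
}

// 1920 x 1080 x 3 bytes = 6,220,800 bytes per frame (about 6.2 MB).
static_assert(frameBytes(1920, 1080, 3) == 6220800ULL, "frame size");
// 40 s at 30 fps = 1200 frames = 7,464,960,000 bytes (about 7.5 GB).
static_assert(clipBytes(1920, 1080, 3, 30, 40) == 7464960000ULL, "clip size");
```

So the 40-second clip from the question would need roughly 7.5 GB uncompressed, which is why only a handful of decoded frames are normally kept in memory at a time.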
Video compression is an extremely complex topic, but more-or-less, it works as follows: certain "key-frames" are compressed using fairly standard compression techniques similar or identical to still-photo compression such as JPEG. Frames following key-frames are compressed by comparing the frame with the previous one and looking for changes (such as moving blocks). Every now and again, a new key-frame is used.
You don't really have to worry much about that as you are not going to write your own video coder/decoder (codec). There are standard ones.
What will happen is that your program will decode the compressed video frame-by-frame and keep a certain number of frames in memory while you are working on them and then re-encode them when it is finished. In the uncompressed form, you will have access to the individual pixels and can work on them how you want.
You are probably not going to do that either by yourself - it is very hard. You probably need to use a framework, such as OpenCV. There are a huge number of standard filters and tools built in to these frameworks, and it may be that what you want to do is already implemented somewhere.
The OpenCV framework can return individual frames in a Mat object and you can then access the pixels. See this post Get Pixels from Mat
OpenCV
Tutorial page: Open CV Tutorial

Image steganography that could survive jpeg compression

I am trying to implement a steganographic algorithm where hidden message could survive jpeg compression.
The typical scenario is the following:
Hide data in image
Compress image using jpeg
The hidden data is not destroyed by jpeg compression and can be restored
I have tried several published algorithms, but with no success.
For example, I tried a simple repetition code, but the jpeg compression destroyed the hidden data. I also tried to implement the algorithms described in the following articles:
http://nas.takming.edu.tw/chkao/lncs2001.pdf
http://www.securiteinfo.com/ebooks/palm/irvine-stega-jpg.pdf
Do you know about any algorithm that actually can survive jpeg compression?

You can hide the data in the frequency domain. JPEG stores information using the DCT (Discrete Cosine Transform) of every 8x8 pixel block, with the resulting coefficients arranged in a matrix. The lossy step is quantization: the smallest coefficients, which are mostly the high-frequency ones in the lower-right part of the matrix, are rounded to 0. That is why the compression works and why that information is lost. What survives compression is the information in the larger, low-frequency coefficients, so that is where hidden data has a chance of surviving.
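The effect of quantization can be shown with a tiny round trip. This is a sketch only: the coefficient values and step sizes below are illustrative numbers, not taken from any particular JPEG quantization table.

```cpp
#include <cmath>

// Quantize one DCT coefficient with step size `q` (round to nearest).
int quantize(double coefficient, int q) {
    return static_cast<int>(std::lround(coefficient / q));
}

// Reconstruct the coefficient a decoder would see.
int dequantize(int level, int q) {
    return level * q;
}

// A large low-frequency coefficient with a small step keeps most of its value:
//   quantize(-415.0, 16) == -26, dequantize(-26, 16) == -416
// A small high-frequency coefficient with a large step is wiped out:
//   quantize(3.0, 99) == 0, dequantize(0, 99) == 0
```

Anything hidden in a coefficient that quantizes to 0 is gone after compression, whereas a change that moves a low-frequency coefficient by at least one quantization step is preserved.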

Quite a few applications seem to implement Steganography on JPEG, so it's feasible:
http://www.jjtc.com/Steganography/toolmatrix.htm
Here's an article regarding a relevant algorithm (PM1) to get you started:
http://link.springer.com/article/10.1007%2Fs00500-008-0327-7#page-1

Perhaps this answer is late, but...
You can do the steganography in the compressed domain. Read the image as a binary file and analyse it with a library such as JPEG Parser. Based on your chosen algorithm, find the locations to modify, compute the new value for each location, and replace the corresponding bits in the file data. Finally, write the file out with the same extension as the input.
I hope this helps.

What you're looking for is called watermarking.
A little warning: Watermarking algorithms use insane amounts of redundancy to ensure high robustness of the information being embedded. That means the amount of data you'll be able to hide in an image will be orders of magnitude lower compared to standard steganographic algorithms.

Cropping large jpeg

The task is to write a program that crops JPEG files. The problem is that some of the JPEG files are very large, hundreds of megabytes. So the question is: is it possible to crop a JPEG file without loading the whole file into RAM, using something like fseek() and decoding only the parts that are needed?
Is that possible? If yes, are there libraries that do this?
Upd. All this will be used for deep zoom technology. When deep zoom asks for a part of a file, this program will serve it, and this has to happen in real time.

There are two ways to accomplish this.
The first is lossless cropping, where you don't decode the file all the way but work with the 8x8 DCT blocks. You'll need to use a library that has this capability, and it places some restrictions on the cropping ability. You can't crop to a boundary that isn't on the DCT square, which limits you to multiples of 8 or 16 depending on the subsampling in the file.
The second way is to use a library that allows you to read and write one line at a time. I know that the IJG library can do this, and probably others as well. This is the easy way, but the downside is that the image goes through a decompression/recompression pass and will lose quality and/or be larger.
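For the lossless route, the alignment restriction can be handled by snapping the requested rectangle before cropping. The sketch below moves the top-left corner up and left to the nearest block boundary and grows the rectangle so the requested area is still covered. The CropRect type and the function name are made up for this example, coordinates are assumed to be non-negative, and the block size (8 or 16) has to come from the file's subsampling.

```cpp
struct CropRect {
    int x;
    int y;
    int width;
    int height;
};

// Moves the top-left corner to the nearest multiple of `blockSize` that is
// not greater than the requested position, and enlarges the rectangle by
// the same amount so the bottom-right corner stays where it was.
CropRect alignToBlock(CropRect r, int blockSize) {
    const int x0 = (r.x / blockSize) * blockSize;
    const int y0 = (r.y / blockSize) * blockSize;
    return {x0, y0, r.width + (r.x - x0), r.height + (r.y - y0)};
}
```

For example, a request for 100x50 pixels at (20, 37) with 16-pixel blocks becomes 104x55 pixels at (16, 32); the few extra pixels can be hidden by the viewer.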

efficient TIFF tile extraction C++

I am working with large TIFF images of about 1 GB and around 20000 x 20000 pixels. I need to extract several tiles (of about 300x300 pixels) out of the images, at random positions.
I tried the following solutions:
Libtiff (the only low-level library I could find) offers TIFFReadScanline(), but that means reading in around 19700 unnecessary pixels per line.
I implemented my own TIFF reader which extracts a tile out of the image without reading in unnecessary pixels. I expected it to be faster, but doing a seekg for every line of the tile makes it very slow. I also tried reading all the lines of the file that include my tile into a buffer and then extracting the tile from the buffer, but the results are more or less the same.
I'd like to receive suggestions that would improve my tile extraction tool!
Everything is welcome, maybe you can propose a more efficient library I could use, some tips about C/C++ I/O, some higher level strategy for my needs, etc.
Regards,
Juan

[Major edit 14 Jan 10]
I was a bit confused by your mention of tiles, when the tiff is not tiled.
I do use tiled/pyramidical TIFF images. I've created those with VIPS
vips im_vips2tiff source_image output_image.tif:none,tile:256x256,pyramid
I think you can do this with :
vips im_vips2tiff source_image output_image.tif:none,tile:256x256,flat
You may want to experiment with tile size. Then you can read using TIFFReadEncodedTile.
Multi-resolution storage using pyramidal TIFFs is much faster if you need to zoom in/out. You may also want to use this to show a coarse image nearly immediately, followed by a detailed picture.
After switching to (appropriately sized) tiled storage (which will bring you MASSIVE performance improvements for random access!), your bottleneck will be disk io. File read is much faster if read in sequence. Here mmapping may be the solution.
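Once the file is tiled, extracting a region starts with working out which stored tiles it overlaps. A minimal sketch of that calculation, with made-up names (TileRange, tilesCovering) and assuming non-negative pixel coordinates:

```cpp
struct TileRange {
    int firstCol;
    int lastCol;
    int firstRow;
    int lastRow;
};

// Returns the inclusive range of tile columns and rows that overlap the
// region with top-left pixel (x, y) and size w x h.
TileRange tilesCovering(int x, int y, int w, int h, int tileW, int tileH) {
    return {x / tileW, (x + w - 1) / tileW, y / tileH, (y + h - 1) / tileH};
}

// Number of tiles that have to be read for the region.
int tileCount(const TileRange& r) {
    return (r.lastCol - r.firstCol + 1) * (r.lastRow - r.firstRow + 1);
}
```

A 300x300 region at (1000, 2000) with 256x256 tiles touches tile columns 3 to 5 and rows 7 to 8, so 6 tiles are read instead of 300 full scanlines. Each of those tiles can then be fetched with TIFFReadEncodedTile and the wanted pixels copied out.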
Some useful links:
VIPS
IIPImage
LibTiff.NET stackoverflow
VIPS is a image handling library which can do much more than just read/write. It has its own, very efficient internal format. It has a good documentation on the algorithms. For one, it decouples processing from filesystem, thereby allowing tiles to be cached.
IIPImage is a multi-zoom webserver/browser library. I found the documentation a very good source of information on multi-resolution imaging (like google maps)
The other solution on this page, using mmap, is efficient only for 'small' files. I've often hit the 32-bit limits. Generally, allocating a 1 GByte chunk of memory will fail on a 32-bit OS (even with 4 GBytes of RAM installed) because the virtual address space gets fragmented after one or two application runs. Still, there is sufficient memory to cache parts or the whole of the image. More memory = more performance.

Just mmap your file.
http://www.kernel.org/doc/man-pages/online/pages/man2/mmap.2.html

Thanks everyone for the replies.
Actually a change in the way tiles were required, allowed me to extract the tiles from the files in hard disk, in a sequential way, instead of a random way. This allowed me to load a part of the file into ram, and extract the tiles from there.
The efficiency gain was huge. Otherwise, if you need random access to a file, mmap is a good deal.
Regards,
Juan

I did something similar to this to handle an arbitrarily large TARGA(TGA) format file.
The thing that made it simple for that kind of file is that the image is not compressed. You can calculate the position of any arbitrary pixel within the image and find it with a simple seek. You might consider targa format if you have the option to specify the image encoding.
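The position calculation is a one-liner. The sketch below assumes an uncompressed image with tightly packed rows; dataOffset is where the pixel data begins in the file (18 bytes for a TGA file that has no image ID field and no color map), and `row` counts rows in the order they are stored in the file, which for TGA is often bottom-up. The function name is made up for this example.

```cpp
#include <cstdint>

// Byte offset of pixel (x, row) in an uncompressed, tightly packed image.
std::uint64_t pixelOffset(std::uint64_t dataOffset, std::uint64_t width,
                          std::uint64_t bytesPerPixel, std::uint64_t x,
                          std::uint64_t row) {
    return dataOffset + (row * width + x) * bytesPerPixel;
}
```

To read a 300-pixel-wide strip, seek to pixelOffset(...) for the first pixel of each row and read 300 * bytesPerPixel bytes; that is one seek and one read per row of the tile.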
If not there are many varieties of TIFF formats. You probably want to use a library if they've already gone through the pain of supporting all the different formats.

Did you get a specific error message? Depending on how you used that command line, you could have been stepping on your own file.
If that wasn't the issue, try using imagemagick instead of vips if it's an option.
