Large JPEG/PNG Image Sequence Looping - c++

I have been working on a project involving remotely sensed image processing and image sequence looping. Each resulting image (in JPEG or PNG format) has approximately 8000 * 4000 pixels. Our users usually want to loop through an image sequence (more than 50 images) one region of interest at a time. Thus, I have to extract the required viewing area from each image according to the size of the user's visualization client. For example, if the user's current client view is 640 * 480, I have to find a 640 * 480 data block in each original image based on the current x (column) and y (row) coordinates, and remap it to the client view. When the user pans to another viewing area by dragging the mouse, our program must reload the regional data from each original image as quickly as possible.
As far as I know, neither the JPEG library nor the PNG library has built-in routines for reading a block of data, such as
long ReadRectangle (long x0, long y0, long x1, long y1, char* RectData);
long ReadInaRectangle (long x0, long y0, short width, short height, char* RectData);
The built-in JPEG decompressor lacks this kind of functionality. I know that the JPEG2000 format has provisions for decompressing a specific area of the image. I'm not entirely sure about JPEG.
Someone suggested that I use CreateFileMapping, MapViewOfFile, and CreateDIBSection to map a given number of bytes of the file into a view. Unlike simple flat binary image formats such as *.raw, *.img, and *.bmp, a JPEG blob contains not only the image data but also the complicated JPEG header, and the image data itself is compressed. So it's not easy to map a block of pixel data out of a JPEG file.
Someone else recommended that I use image tiling or image pyramid technology to generate sub-images, just as many popular image visualization applications (Google Earth, etc.) and GIS applications (WebGIS, etc.) do.
How can I solve this problem?
Thanks for your help.
Golden Lee

If you're OK with region co-ordinates being multiples of 8, the JPEG library from ijg may be able to help you load partial JPEG images.
You'd want to:
1. Get all DCT coefficients for the entire image. Here's an example of how to do this. Yes, this will involve entropy decoding of the entire image, but this is the less expensive step of JPEG decoding (IDCT is the most expensive one, and we're avoiding it).
2. Throw away the blocks that you don't need (each block consists of 8x8 coefficients). You'll have to do this by hand, but since the layout is quite simple (the blocks are in scanline order) it shouldn't be that hard.
3. Apply the block inverse DCT to each of the remaining blocks. You can probably get IJG to do that for you. If you can't, then you'll have to do your own IDCT and color transform, and shift the samples back to [0, 255], because intensities are in [-128, 127] in the world of JPEG.
4. If all goes well, you'll get your decoded JPEG image. Because of chroma subsampling, the luma and chroma channels may be of different dimensions, and you will have to compensate for this yourself by scaling.
The first two steps are pretty much covered by the links. The fourth one is quite trivial (you can get the type of chroma subsampling using the IJG interface, and scaling -- essentially upsampling -- is easily achieved by using something like OpenCV or rolling your own code). The third one is something I haven't tried yet, but it sounds like it would be possible.
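To make steps 2 and 3 a bit more concrete, here is a minimal sketch of the bookkeeping involved. The names `BlockRange`, `blocks_for_region` and `to_sample` are hypothetical helpers and not part of the IJG API. It assumes a component without subsampling, so that one block covers 8x8 pixels; for subsampled chroma the same arithmetic applies on that component's own sample grid.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper type (not from libjpeg): inclusive 8x8 block indices.
struct BlockRange { int col0, row0, col1, row1; };

// Maps the pixel rectangle [x0, x1) x [y0, y1) to the 8x8 blocks that cover it.
BlockRange blocks_for_region(int x0, int y0, int x1, int y1) {
    assert(0 <= x0 && x0 < x1 && 0 <= y0 && y0 < y1);
    return { x0 / 8, y0 / 8, (x1 - 1) / 8, (y1 - 1) / 8 };
}

// Undoes the JPEG level shift: IDCT output is centred on 0, so add 128
// and clamp to the displayable 8-bit range [0, 255].
std::uint8_t to_sample(int idct_value) {
    return static_cast<std::uint8_t>(std::min(255, std::max(0, idct_value + 128)));
}
```

For a 640 * 480 view whose top-left corner is at (640, 480), `blocks_for_region(640, 480, 1280, 960)` selects block columns 80 to 159 and block rows 60 to 119, so only 80 * 60 blocks per component need an IDCT.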

It's easy with the gd library. LibGD is an open source code library for the dynamic creation of images on the fly by programmers.
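Regarding the tiling idea raised in the question: if each large image is pre-cut into fixed-size tiles, the only run-time work is to figure out which tiles the current view touches. A rough sketch, assuming square tiles and a half-open view rectangle (the names `Rect`, `TileRange` and `tiles_for_view` are made up for illustration):

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper types; the names are not from any library.
struct Rect { int x0, y0, x1, y1; };              // half-open: [x0, x1) x [y0, y1)
struct TileRange { int col0, row0, col1, row1; }; // inclusive tile indices

// Returns the tiles (tile x tile pixels each) that overlap the view,
// after clamping the view to the image bounds.
TileRange tiles_for_view(Rect view, int image_w, int image_h, int tile) {
    assert(tile > 0 && image_w > 0 && image_h > 0);
    view.x0 = std::max(0, std::min(view.x0, image_w - 1));
    view.y0 = std::max(0, std::min(view.y0, image_h - 1));
    view.x1 = std::max(view.x0 + 1, std::min(view.x1, image_w));
    view.y1 = std::max(view.y0 + 1, std::min(view.y1, image_h));
    return { view.x0 / tile, view.y0 / tile,
             (view.x1 - 1) / tile, (view.y1 - 1) / tile };
}
```

With an 8000 * 4000 image, 256-pixel tiles and a 640 * 480 view at (1000, 500), this gives tile columns 3 to 6 and rows 1 to 3, i.e. 12 small files to decode per frame instead of one 32-megapixel image.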

Related

Image with sparse and continuous coordinates in ITK

I have a raw data image which is potentially sparse and has continuous coordinates (e.g. 1000 pixels which are positioned on a spiral, the coordinates are floats). What is the best way to load this data into ITK for further processing and the ability to save the image in physical coordinates?
My research so far: there is itk::SpecialCoordinatesImage, which I could inherit from to override TransformPhysicalPointToContinuousIndex(…) and TransformPhysicalPointToIndex(…). I do not know the positions or the number of pixels before reading the whole data stream. So, to get even a minimal amount of speed, I would need to re-sort the data "manually". Isn't there a better way?
I am more familiar with vtk than itk, so what comes to mind is probably a bit biased. You could:
load the raw data into a vtk unstructured grid (see for example the function ReadFinancialData in http://vtk.org/gitweb?p=VTK.git;a=blob;f=Examples/Modelling/Cxx/finance.cxx )
then voxelize it to an image. For example, see http://www.vtkjournal.org/browse/publication/713 (I've never used it, and I don't know if it is compatible with the latest versions) or http://www.vtk.org/Wiki/VTK/Examples/Cxx/PolyData/PolyDataContourToImageData

Encode image in JPG with OpenCV avoiding the ghost effect

I have an application (OpenCV, C++) that grabs an image from a webcam, encodes it as JPG and transmits it from a server to a client. The webcam is stereo, so I actually have two images, LEFT and RIGHT. In the client, when I receive the images I decode them and generate an anaglyph 3D effect.
To do this I use OpenCV...
Well I encode the image in this way:
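std::vector<int> params;
std::vector<uchar> buffer; // receives the encoded JPEG bytes
params.push_back(CV_IMWRITE_JPEG_QUALITY);
params.push_back(60); // image quality (0-100)
cv::imshow("anaglyph", image); // here the anaglyph image is good!
cv::imencode(".jpg", image, buffer, params);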
and decode in this way:
cv::Mat imageRecieved = cv::imdecode( cv::Mat(v), CV_LOAD_IMAGE_COLOR );
What I see is that this kind of encoding generates a "ghost effect" (artifact?) in the anaglyph image, so the edges of objects look bad. If I look at a door, for example, there is a ghost effect along the edge of the door. I'm sure that this depends on the encoding, because if I show the anaglyph image before the encode instruction it looks fine. I cannot use PNG because it generates images that are too large, and this is a problem for the connection between the server and the client.
I looked at GIF but, if I understood correctly, it is not supported by the cv::imencode function.
So is there another way to encode a cv::Mat object as JPG without this bad effect and without increasing the size of the image too much?
If your server is only used as an image storage, you can send to the server the 2 original stereo images (compressed) and just generate the Anaglyph when you need it. I figure that if you fetch the image pair (JPEG) from the server and then generate the Anaglyph (client-side), it will have no ghosting. It might be that the compressed pair of images combined is smaller than the Anaglyph .png.
I assume the anaglyph encoding is using line interlacing to combine both sides into one image.
You are using JPEG to compress the image.
This algorithm is optimized to compress "photo-like" real world images from cameras, and works very well on these.
The difference of "photo-like" and other images, regarding image compression, is about the frequencies occurring in the image.
Roughly speaking, in "photo-like" images, the high frequency part is relatively small, and mostly not important for the image content.
So the high frequencies can be safely compressed.
If two frames are interlaced line by line, this creates an image with very strong high frequency part.
The JPEG algorithm discards much of that information as unimportant, but because it is actually important, that causes relatively strong artefacts.
JPEG basically just "does not work" on this kind of images.
If you can change the encoding of the anaglyph images to side by side, or alternating full images from left and right, JPEG compression should work just fine.
Is this an option for you?
If not, it will get much more complicated. One problem - if you need good compression - is that the algorithms that are great for compressing images with very high frequencies are really bad at compressing "photo-like" data, which is still the larger part of your image.
Therefore, please try really hard to change the encoding to be not line-interlacing, that should be about an order of magnitude easier than other options.
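To illustrate the frequency argument with numbers, here is a small sketch that interleaves two images line by line and measures the average jump between vertically adjacent pixels. That measure is only a crude stand-in for "high vertical frequency content", and the helper names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

using Image = std::vector<std::vector<int>>; // rows of grey values

// Interleaves two equally sized images: even rows from left, odd rows from right.
Image interlace_rows(const Image& left, const Image& right) {
    assert(left.size() == right.size());
    Image out;
    for (std::size_t y = 0; y < left.size(); ++y)
        out.push_back(y % 2 == 0 ? left[y] : right[y]);
    return out;
}

// Mean absolute difference between vertically adjacent pixels.
double mean_vertical_step(const Image& img) {
    assert(img.size() >= 2 && !img[0].empty());
    long long sum = 0, count = 0;
    for (std::size_t y = 1; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x) {
            sum += std::abs(img[y][x] - img[y - 1][x]);
            ++count;
        }
    return static_cast<double>(sum) / static_cast<double>(count);
}
```

Two perfectly flat images with grey values 50 and 200 each have a vertical step of 0, but their line-interlaced combination jumps by 150 on every row. That is exactly the kind of detail JPEG quantization is designed to throw away.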

How can I compress jpeg image with compression rate 4 bpp or less?

I am trying to compress my .jpeg image in Photoshop.
What is the best way to do this?
I am currently calculating the bpp by taking the image size in kB and working out how many bits that is. Then I multiply the image dimensions (pixels * pixels) to get the number of pixels in the image. After that I divide bits by pixels to find how many bits per pixel the image has.
But how can I change this number? My guess is to change how many kB the image is, but how do I do this?
Thanks for any help!!
Yes, you can achieve a higher compression ratio than 4 bits per pixel. Images with solid color can have a rate as low as 0.13bpp.
In fact 4bpp is quite poor compression: it's the same as an uncompressed 16-color image or half of a 256-color image, which even GIF can manage. JPEG can look decent at 1-2bpp.
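The arithmetic from the question can be written down as a small sketch, together with the inverse: the file size you would need to hit for a given bpp target. The function names are invented for this example, and "file size" here means the whole file including headers and metadata.

```cpp
#include <cassert>
#include <cstdint>

// Bits per pixel of a stored image: total file bits divided by pixel count.
double bits_per_pixel(std::uint64_t file_bytes, std::uint32_t width, std::uint32_t height) {
    assert(width > 0 && height > 0);
    return static_cast<double>(file_bytes) * 8.0 /
           (static_cast<double>(width) * static_cast<double>(height));
}

// Largest file size in bytes that stays at or below target_bpp.
std::uint64_t max_bytes_for(double target_bpp, std::uint32_t width, std::uint32_t height) {
    assert(target_bpp > 0.0);
    return static_cast<std::uint64_t>(target_bpp * width * height / 8.0);
}
```

For a 1000 * 1000 image, 4bpp corresponds to a file of at most 500000 bytes. In practice you would lower the JPEG quality setting when saving until the file drops below that size.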
In general, you cannot "compress" a JPEG image. All you can do is reduce the image quality further in order to achieve a lower bpp value. JPEG streams are always compressed, and they use a lossy compression method. This means that the original image can never be reconstructed exactly from a JPEG image. The smaller the file, the more information you have lost.
A specific "bpp value" is not, and should never be, your target, especially with lossy compression. You should always look at your current image and decide whether it is still good enough or not.
If you still have the original image, try a lossless compression format, like zip-compressed or LZW-compressed TIFF, or compressed PNG. I'm sure Photoshop can handle these formats as well. Other software like IrfanView (https://www.irfanview.com/) or XnView MP (https://www.xnview.com/en/xnviewmp/) will convert your images too.
If you want manual (i.e. full) control over your images, you should use command line utilities, like ImageMagick (https://imagemagick.org/) or NConvert (see the XnView MP link above).
If you have only the JPEG images, do not touch (edit & save) them. With every single save operation you lose another bunch of information. You should always work on file copies.
You should always keep your master image (the very picture you took with your phone or your camera).
Of course, these rules of thumb will not answer your original question.

How can I process an image?

I'm building a program to convert an image file (whatever file type would be easiest) to G-Code for use on a rep-rap with a pen plotter attachment.
I'm wondering: if I wanted to process the image pixel by pixel and check things like pixel color, how could I do this with C++?
I would really like to know how I can process a bitmap image, pixel by pixel, to check the color of the pixel.
The best way is to use a library, like for example Magick++.
When you load an image, you can access its pixel data with Blob.
You will probably want to use an existing library that has been tested.
But for fun/practice/etc, this would be a good exercise and wouldn't be impossible to do. The Bitmap Format is (relatively) simple compared with other image formats. The Wikipedia page has tons of info, including some C++ code. It looks like once you've gotten past the header information, you get to a pixel array that shouldn't be difficult to parse.
Good luck.
Most image formats consist of a header and the actual raw image data. A bitmap image is no different. If you don't want to use one of the existing libraries, or if you are not allowed to, you should read about the bitmap format:
http://en.wikipedia.org/wiki/BMP_file_format
Once you understand this you could create appropriate structs/classes to store the information you want from the header, such as x,y size, bpp etc., and also have a pointer to the raw image data. You could then simply iterate through every pixel and do whatever you want with it :)
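As a rough sketch of that approach, the code below parses an uncompressed 24-bit BMP that has already been read into memory and exposes per-pixel access. It assumes the common 54-byte layout (14-byte file header plus 40-byte BITMAPINFOHEADER) and rejects everything else. `Rgb`, `Bitmap` and `parse_bmp24` are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };

// Reads a little-endian 32-bit value at the given byte offset.
static std::uint32_t read_u32(const std::vector<std::uint8_t>& d, std::size_t off) {
    return static_cast<std::uint32_t>(d[off]) |
           (static_cast<std::uint32_t>(d[off + 1]) << 8) |
           (static_cast<std::uint32_t>(d[off + 2]) << 16) |
           (static_cast<std::uint32_t>(d[off + 3]) << 24);
}

struct Bitmap {
    int width = 0, height = 0;
    std::vector<Rgb> pixels; // row-major, top row first
    Rgb at(int x, int y) const {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Parses an uncompressed 24-bit BMP. Palettes, RLE and 32-bit files are rejected.
Bitmap parse_bmp24(const std::vector<std::uint8_t>& d) {
    if (d.size() < 54 || d[0] != 'B' || d[1] != 'M')
        throw std::runtime_error("not a BMP file");
    const std::size_t data_offset = read_u32(d, 10);
    const std::int32_t w = static_cast<std::int32_t>(read_u32(d, 18));
    const std::int32_t h = static_cast<std::int32_t>(read_u32(d, 22));
    const unsigned bpp = d[28] | (d[29] << 8);
    if (bpp != 24 || read_u32(d, 30) != 0 || w <= 0 || h == 0)
        throw std::runtime_error("only uncompressed 24-bit BMP is handled");
    const bool bottom_up = h > 0;  // positive height: bottom row is stored first
    const std::size_t rows = static_cast<std::size_t>(bottom_up ? h : -h);
    const std::size_t stride = (static_cast<std::size_t>(w) * 3 + 3) / 4 * 4; // rows padded to 4 bytes
    if (d.size() < data_offset + stride * rows)
        throw std::runtime_error("truncated pixel data");
    Bitmap bmp;
    bmp.width = w;
    bmp.height = static_cast<int>(rows);
    bmp.pixels.reserve(static_cast<std::size_t>(w) * rows);
    for (std::size_t y = 0; y < rows; ++y) {
        const std::size_t src_row = bottom_up ? rows - 1 - y : y;
        const std::size_t base = data_offset + stride * src_row;
        for (std::size_t x = 0; x < static_cast<std::size_t>(w); ++x) {
            const std::size_t p = base + x * 3;
            bmp.pixels.push_back(Rgb{d[p + 2], d[p + 1], d[p]}); // file order is B, G, R
        }
    }
    return bmp;
}
```

After parsing, `bmp.at(x, y)` gives the color at column x and row y counted from the top-left corner, which is all a pen-plotter converter needs to decide whether to draw at that position.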
Once you decipher the image file, I suggest you place the pixels into a matrix, for the first pass. (Future revisions can use other methods to access the pixels).
You can apply transformations to the pixels by using matrix multiplication. You can also access the pixels individually by using array indexing.
Search the web and SO for "introduction to graphics c++".

Can't find logic behind png file sizes

I'm saving a large number of small png files for use in a game on a phone, so space is at a premium.
I'm trying to figure out the logic behind the file sizes so I can save things most efficiently, but even after using pngcrush the sizes are totally inconsistent.
I saved a 1x1 image and it takes 3kb. I have another 23x21 image which takes only 2kb. I have two images which are almost the same size, but one takes 6kb and the other takes 13kb. I doubled the image height and copied one image into the empty space of the other and saved that. The combined image is only 11kb!
Why is a 1x1 image larger than a 23x21 image? Why can I combine a 13kb image and a 6kb image and get an 11kb image?
Here are the images I'm talking about (there's a 1x1 pixel in between the 1st and second images. It's difficult to see, so I'll just give the URL: http://g42.org/temp/png/1x1.png):
example http://g42.org/temp/png/hat.png
example http://g42.org/temp/png/1x1.png
example http://g42.org/temp/png/helmet1.png
example http://g42.org/temp/png/helmet2.png
example http://g42.org/temp/png/helmet1_2.png
It's not a compression thing. The problem with the 1x1 image is that it has metadata (added by Photoshop, it seems): a color profile (iCCP chunk). If you look inside the binary, it's the data between the strings "iCCP" and "IDAT"; it could be removed, and you would get a 69-byte file.
If you reopen and save the file with most image viewers (e.g. XnView), or use pngcrush, you can strip that chunk. See it here: http://i.stack.imgur.com/fmOdA.png
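If you want to check for such chunks yourself rather than eyeballing the binary, a PNG file is just an 8-byte signature followed by chunks of the form length (4 bytes, big-endian), type (4 bytes), data, CRC (4 bytes). A minimal sketch that lists them, with the CRC skipped rather than verified (`list_png_chunks` is a name invented here):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Returns (chunk type, data length) for every chunk in a PNG byte stream.
// CRCs are skipped, not verified.
std::vector<std::pair<std::string, std::uint32_t>>
list_png_chunks(const std::vector<std::uint8_t>& d) {
    static const std::uint8_t sig[8] = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    if (d.size() < 8) throw std::runtime_error("too short for a PNG");
    for (std::size_t i = 0; i < 8; ++i)
        if (d[i] != sig[i]) throw std::runtime_error("bad PNG signature");

    std::vector<std::pair<std::string, std::uint32_t>> chunks;
    std::size_t pos = 8;
    while (pos + 12 <= d.size()) {
        const std::uint32_t len = (static_cast<std::uint32_t>(d[pos]) << 24) |
                                  (static_cast<std::uint32_t>(d[pos + 1]) << 16) |
                                  (static_cast<std::uint32_t>(d[pos + 2]) << 8) |
                                  static_cast<std::uint32_t>(d[pos + 3]);
        std::string type;
        for (std::size_t i = 4; i < 8; ++i)
            type.push_back(static_cast<char>(d[pos + i]));
        chunks.emplace_back(type, len);
        if (type == "IEND") break;
        pos += 12 + static_cast<std::size_t>(len); // length + type + data + CRC
    }
    return chunks;
}
```

On a file like the 1x1 image described above, you would expect the iCCP entry to account for most of the bytes, with IHDR, IDAT and IEND being tiny.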
And regarding the helmet images: besides other informational chunks (ImageReady adds some informational text, as you can see), the difference is due to different formats: the two-helmet image is a paletted image (8 bits per pixel), while the single helmet is RGB with alpha (32 bits per pixel).
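To get a feel for how much the pixel format matters, you can compute how many raw bytes go into the compressor before any compression happens. This sketch assumes a non-interlaced PNG, where each scanline is one filter-type byte followed by the packed pixels. `png_raw_bytes` is a made-up name, and the 23x21 size is only borrowed from the question as an example.

```cpp
#include <cassert>
#include <cstdint>

// Bytes fed to the compressor for a non-interlaced PNG:
// per scanline, one filter-type byte plus the packed pixel bytes.
std::uint64_t png_raw_bytes(std::uint32_t width, std::uint32_t height, unsigned bits_per_pixel) {
    assert(bits_per_pixel > 0);
    const std::uint64_t row = (static_cast<std::uint64_t>(width) * bits_per_pixel + 7) / 8;
    return static_cast<std::uint64_t>(height) * (row + 1);
}
```

A 23x21 image is 1953 raw bytes as 32-bit RGBA but only 504 as an 8-bit paletted image (plus the palette itself, which is stored in a separate PLTE chunk), so the paletted version starts with roughly a quarter of the data.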
PNG compression is based on the same algorithm as zlib and is highly sensitive to the data that is being compressed so you won't see a consistent relationship between image size and file size. In the case of the combined image, it is still bigger than the smaller image and given the similarity of the two halves of the image, the compressor was probably able to reuse a lot of the Huffman tree. I don't know enough about the algorithm to say for certain how it ended up smaller than the other half.
As long as you are not seeing oddities like the 1x1 image, which you seem to have figured out in the comments, I don't think this will make a lot of sense without extensive study of image compression.
There is a great utility called pngcrush
http://pmt.sourceforge.net/pngcrush/
Compressing to PNG is a rather difficult task - there are lots of assumptions and strategies to try - do we create a palette, or are we better off without it?
PNGcrush essentially bruteforces 100+ different compression strategies, while at the same time trimming useless tags and sections.
PNG has several sub-formats: 24-bit with or without alpha, 8-bit (includes alpha), grayscale, etc., which use different numbers of bytes per pixel and have different "compressibility".
Plus PNG supports several compression tricks (filters and gzip settings) which affect how well image data is compressed.
On top of that PNG can contain metadata, which sometimes can be pretty large, like some embedded color profiles.
ImageAlpha converts images to the most space-efficient PNG8+alpha variant.
ImageOptim removes junk metadata and finds best compression parameters.
With a combination of those two your images can be reduced by 30-50%.
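As an illustration of the filter trick mentioned above, here is a sketch of PNG's "Sub" filter (filter type 1), which replaces each byte with its difference from the corresponding byte of the pixel to the left, modulo 256. The function name is invented; a real encoder chooses among five filter types per scanline.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// PNG "Sub" filter: each byte minus the byte bytes_per_pixel positions
// to its left (0 if there is none), modulo 256.
std::vector<std::uint8_t> sub_filter(const std::vector<std::uint8_t>& row,
                                     std::size_t bytes_per_pixel) {
    assert(bytes_per_pixel > 0);
    std::vector<std::uint8_t> out(row.size());
    for (std::size_t i = 0; i < row.size(); ++i) {
        const std::uint8_t left = i >= bytes_per_pixel ? row[i - bytes_per_pixel] : 0;
        out[i] = static_cast<std::uint8_t>(row[i] - left); // wraps modulo 256
    }
    return out;
}
```

A smooth gradient row such as 10, 20, 30, 40, 50 becomes 10, 10, 10, 10, 10 after filtering, which the deflate stage can compress far better than the original values. That is part of why two images of the same dimensions can end up with very different file sizes.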