I am writing an application that generates a huge amount of images. Each frame is 1280x800 pixels large and has 1 byte per pixel for color information (greyscale). Each of the frames must be written to disk.
Currently I simply dump the raw pixel data to a binary file on the disk. The file can then be viewed with a special viewer I also created.
This is a very unsatisfactory solution, since the images can't be viewed/processed directly. They always have to run through my custom viewer/converter.
Is there an image format I could use to write my images to disk that:
Is fast to be written (no compression etc.)
Does not increase the final file's size much
Supports dumping my raw pixel buffer in there (no alignment changes etc.)
Can be read by common applications (Windows Explorer, Paint, Photoshop etc.)
I already tried to use .png, but the file generation takes much too long due to the compression.
Have a look at the binary Portable GrayMap (P5) format. It consists of an extremely simple header followed by raw image data (without any alignment requirements), and is widely supported by image viewers.
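As a minimal sketch of how small the P5 header is, the following builds a complete binary PGM file from a raw 8-bit buffer. The function names `pgm_p5_bytes` and `write_pgm` are made up for illustration, and the sketch assumes one byte per pixel with a maximum grey value of 255.

```python
def pgm_p5_bytes(pixels: bytes, width: int, height: int) -> bytes:
    """Return a complete binary PGM (P5) file for 8-bit greyscale pixels."""
    if len(pixels) != width * height:
        raise ValueError("pixel buffer size does not match width * height")
    # Header: magic number, width and height, maximum grey value, and one
    # whitespace byte; the raw pixel rows follow immediately, top row first.
    header = b"P5\n%d %d\n255\n" % (width, height)
    return header + pixels


def write_pgm(path: str, pixels: bytes, width: int, height: int) -> None:
    """Write the buffer to disk as a P5 file (hypothetical helper)."""
    with open(path, "wb") as f:
        f.write(pgm_p5_bytes(pixels, width, height))
```

For a 1280x800 frame the header adds 16 bytes to the 1,024,000 bytes of pixel data, and the buffer is written unchanged.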
Both BMP and TIFF can be used to save raw data. BMP has the oddity of storing the image upside down (bottom row first) unless the height is negative, and TIFF has plenty of encoding options. In either case it should be feasible to reverse engineer the format and use the result as a template into which the image data is copied. There is then no need for a library: the file is just a header, the image data and an optional footer concatenated.
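The "header plus image data" idea can be sketched for an 8-bit greyscale BMP as below. This is an illustration under assumptions, not a full BMP writer: it uses the 40-byte BITMAPINFOHEADER, a 256-entry grey palette (which 8-bit BMPs require), and a negative height so the rows stay top-down. The function name `bmp_gray8_bytes` is invented here. Note that BMP rows are padded to a multiple of 4 bytes; a width of 1280 needs no padding, so the raw buffer can be appended as-is.

```python
import struct


def bmp_gray8_bytes(pixels: bytes, width: int, height: int) -> bytes:
    """Return an uncompressed, top-down, 8-bit greyscale BMP file."""
    if len(pixels) != width * height:
        raise ValueError("pixel buffer size does not match width * height")
    row_size = (width + 3) & ~3          # rows are padded to 4 bytes
    pad = bytes(row_size - width)
    if pad:
        rows = b"".join(pixels[y * width:(y + 1) * width] + pad
                        for y in range(height))
    else:
        rows = pixels                    # buffer can be dumped unchanged
    # Palette: 256 entries of blue, green, red, reserved.
    palette = b"".join(bytes((i, i, i, 0)) for i in range(256))
    offset = 14 + 40 + len(palette)      # where the pixel data starts
    file_header = struct.pack("<2sIHHI", b"BM", offset + len(rows), 0, 0, offset)
    info_header = struct.pack(
        "<IiiHHIIiiII",
        40,            # header size
        width,
        -height,       # negative height means top-down rows
        1,             # colour planes
        8,             # bits per pixel
        0,             # no compression (BI_RGB)
        len(rows),     # image data size
        2835, 2835,    # about 72 DPI, an arbitrary choice
        256, 0)        # palette entries used / important
    return file_header + info_header + palette + rows
```

For a fixed frame size the first 1078 bytes never change, so they can be computed once and reused as the template for every frame.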
I want to write a program that extracts the Pixel Data of a DICOM file using C or C++, and I don't want to use external libraries like dicomsdl. Can anyone help me write an algorithm to extract and show the image?
Just extracting the data under the Pixel Data element is not enough to interpret the DICOM image properly. You will need other attributes from the DICOM file, such as Rows, Columns, Bits Allocated, Bits Stored, High Bit, Photometric Interpretation, Samples per Pixel and Number of Frames, just to interpret the raw uncompressed image data. Also, stored image data can be in Little Endian or Big Endian byte order. In addition, image data can be encapsulated or compressed (e.g. using compression algorithms such as JPEG, JPEG 2000, JPEG-LS or RLE), and compressed streams are stored differently from uncompressed image data. The Pixel Data element can even exist in multiple locations in a single DICOM file (e.g. one under the Icon Image Sequence (thumbnail) and one at the top level (actual image)).
It can get more complicated when you need to account for Palette Color (segmented vs un-segmented), modality LUT, VOI LUT etc. My recommendation is to use an existing DICOM SDK; there are many open source and commercial SDKs available for different platforms and programming environments.
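To show how those attributes combine in the simplest case, here is a rough sketch that computes how many bytes of uncompressed pixel data to expect. It assumes native (unencapsulated) data and whole-byte samples; the function and parameter names are chosen for illustration and are not part of any DICOM library.

```python
def expected_pixel_data_bytes(rows: int, columns: int, bits_allocated: int,
                              samples_per_pixel: int = 1,
                              number_of_frames: int = 1) -> int:
    """Size of native, uncompressed pixel data implied by the attributes."""
    if bits_allocated % 8 != 0:
        # Assumption of this sketch: samples occupy whole bytes.
        raise ValueError("this sketch only handles whole-byte samples")
    bytes_per_sample = bits_allocated // 8
    return (rows * columns * samples_per_pixel
            * bytes_per_sample * number_of_frames)
```

A 512x512 single-frame image with Bits Allocated = 16 gives 524,288 bytes. If the length of the Pixel Data element does not match this figure, the data is probably compressed or the attributes were read incorrectly.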
I'm trying to write a TCP client/server application that transmits objects containing OpenCv Mat. I'd like to serialize these objects using JSON. I found some libraries that help me in doing that (rapidjson), but they of course do not take into account images as object members.
What would you suggest to serialize in a JSON object a cv::Mat variable? How can I use RapidJson, for example, to achieve that?
imencode can be used to encode a viewable image (with CV_8UC1 or CV_8UC3 pixel formats) into a std::vector<uchar>.
The vector<uchar> will contain the same bytes as if OpenCV had saved the image into one of the supported image file formats (such as JPEG or PNG) and then have the file bytes loaded back into a byte array.
imencode can be found in highgui module when using OpenCV 2.x, or imgcodecs module when using OpenCV 3.x.
With the compressed data in a vector<uchar>, you can use Base64 encoding to format it into a string, which can then be added as a JSON value inside a JSON object.
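The Base64 step can be sketched as follows, using Python's standard library rather than RapidJSON to keep the example short. The `encoded_image` argument stands in for the bytes that imencode would produce, and the JSON field names "format" and "data" are arbitrary choices for this illustration.

```python
import base64
import json


def image_to_json(encoded_image: bytes, image_format: str) -> str:
    """Wrap already-encoded image bytes (e.g. PNG or JPEG) in a JSON object."""
    payload = {
        "format": image_format,  # hypothetical field name
        "data": base64.b64encode(encoded_image).decode("ascii"),
    }
    return json.dumps(payload)


def image_from_json(text: str) -> bytes:
    """Recover the encoded image bytes on the receiving side."""
    payload = json.loads(text)
    return base64.b64decode(payload["data"])
```

Base64 turns every 3 bytes into 4 characters, so the string is roughly a third larger than the encoded image. The receiver would pass the recovered bytes to imdecode.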
When using JSON to transmit large amounts of data, consider very carefully the character encoding that the JSON library is instructed to emit. Normally, if a large portion of the data is going to be Base64, you will want to make sure the JSON is emitted in UTF-8.
If you have the option of sending in binary (which requires an "out-of-band" design in the web service, something not always doable), it should be seriously considered.
When considering different serialization choices for images, these things should be taken into account:
Typical image sizes (total number of pixels)
Size efficiency is less of a concern if images are small.
Pixel format (number of channels and precision)
Most common image file formats will only allow 8-bit grayscale and 24-bit RGB pixel data. Trying to save higher-precision pixel data into these image formats will result in partial loss of precision.
Available transmission bandwidth (if it is scarce enough to be a concern). With less available bandwidth, compression becomes more important.
Compression options.
Typical (photographic or synthetic) images are highly compressible, since images that are too "dense" would be too hard to comprehend when viewed by human eyes.
Compression can be lossless or lossy.
Choice of compression may depend on the statistical characteristics of the pixel values (image content).
As mentioned above, if compression is performed by encoding into some image formats, you have to make sure the image format can satisfy the pixel value precision requirements of your application.
If no existing image format meets your requirements and you still want to perform lossless compression, consider using the zlib API that is integrated into the OpenCV Core module.
If you are good at image processing and data compression theory, you may be able to devise an application-specific compression method based on your own needs.
Remember that reducing the image resolution can be a powerful (and super-lossy) way of reducing the transmission file size. Consider carefully what minimum image resolution is actually needed for your application.
Other considerations
Binary or text
Endianness
Availability of highgui, imgcodecs or an image decoder for the chosen image format on the receiving end.
Information source: just did this a few months ago.
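For the lossless zlib option in the list above, a minimal sketch might look like this. It uses Python's own zlib module rather than anything from OpenCV, and the 12-byte little-endian header (rows, columns, bytes per pixel) is an invented layout for illustration. Fixing the byte order of the header is one way to deal with the endianness item; the byte order of multi-byte pixel values would still have to be agreed on separately.

```python
import struct
import zlib


def pack_raw_image(pixels: bytes, rows: int, cols: int,
                   bytes_per_pixel: int) -> bytes:
    """Losslessly compress a raw pixel buffer and prepend its dimensions."""
    if len(pixels) != rows * cols * bytes_per_pixel:
        raise ValueError("pixel buffer size does not match the dimensions")
    header = struct.pack("<III", rows, cols, bytes_per_pixel)
    return header + zlib.compress(pixels, 6)


def unpack_raw_image(blob: bytes):
    """Return (pixels, rows, cols, bytes_per_pixel) from a packed blob."""
    rows, cols, bytes_per_pixel = struct.unpack_from("<III", blob, 0)
    pixels = zlib.decompress(blob[12:])
    if len(pixels) != rows * cols * bytes_per_pixel:
        raise ValueError("decompressed size does not match the header")
    return pixels, rows, cols, bytes_per_pixel
```

Because nothing is quantized, this keeps 16-bit or floating-point pixel data bit-exact, which the common 8-bit image formats cannot do.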
In order to accomplish some specific editing on some .avi files, I'd like to create an application (in C++) that is able to load, edit, and save those .avi files. But what is the most efficient way? When first thinking about it, a simple 3D array containing a 2D array of pixels for every frame seems the simplest solution, but then its size would be ENORMOUS. I mean, let's assume that a pixel only needs a color. One color would mean 3 bytes (1 char r, 1 char g, 1 char b). If I now have a 1920x1080 video format, this would mean about 6 MEGABYTES (1920 x 1080 x 3 bytes) for only one frame! This data may or may not be smaller if using pointers for the colors, so that already used colors won't take more space. I don't really know, since I'm pretty new to C++ and the whole low-level stuff. (As a comparison: one of my AVI files recorded with the Xvid codec is 40 seconds long, 30 fps, and only has 2 MB.)
So how would you actually store the video data (Not even the audio, just the video) efficiently (while still being easily able to perform per-frame-changes on it)?
As you have realised, uncompressed video is enormous and it is not practical to store an entire video in this way.
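To put a number on "enormous", the arithmetic for the clip described in the question (1920x1080, 3 bytes per pixel, 30 fps, 40 seconds) can be worked through as below. The function name is made up for this example.

```python
def uncompressed_video_bytes(width: int, height: int, bytes_per_pixel: int,
                             fps: int, seconds: int):
    """Return (bytes per frame, bytes for the whole clip) without compression."""
    frame_bytes = width * height * bytes_per_pixel
    total_bytes = frame_bytes * fps * seconds
    return frame_bytes, total_bytes


frame_bytes, total_bytes = uncompressed_video_bytes(1920, 1080, 3, 30, 40)
print(frame_bytes)  # 6220800    (about 6.2 MB per frame)
print(total_bytes)  # 7464960000 (about 7.5 GB for 1200 frames)
```

Holding only a handful of decoded frames in memory at a time keeps this manageable.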
Video compression is an extremely complex topic, but more-or-less, it works as follows: certain "key-frames" are compressed using fairly standard compression techniques similar or identical to still-photo compression such as JPEG. Frames following key-frames are compressed by comparing the frame with the previous one and looking for changes (such as moving blocks). Every now and again, a new key-frame is used.
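The key-frame idea can be illustrated with a deliberately naive sketch: keep the first frame whole, then store only the pixels that differ from the previous frame. Real codecs work on blocks, use motion compensation and are usually lossy, so this toy only shows why storing changes is cheaper than storing every frame. Both function names are invented for the example.

```python
def encode_delta(previous: bytes, current: bytes) -> list:
    """Return (index, new_value) pairs for the pixels that changed."""
    if len(previous) != len(current):
        raise ValueError("frames must have the same size")
    return [(i, c) for i, (p, c) in enumerate(zip(previous, current)) if p != c]


def apply_delta(previous: bytes, delta: list) -> bytes:
    """Rebuild the current frame from the previous frame and its delta."""
    frame = bytearray(previous)
    for index, value in delta:
        frame[index] = value
    return bytes(frame)
```

When little changes between frames, the delta list is far shorter than the frame itself.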
You don't really have to worry much about that as you are not going to write your own video coder/decoder (codec). There are standard ones.
What will happen is that your program will decode the compressed video frame-by-frame and keep a certain number of frames in memory while you are working on them and then re-encode them when it is finished. In the uncompressed form, you will have access to the individual pixels and can work on them how you want.
You are probably not going to do that either by yourself - it is very hard. You probably need to use a framework, such as OpenCV. There are a huge number of standard filters and tools built in to these frameworks, and it may be that what you want to do is already implemented somewhere.
The OpenCV framework can return individual frames in a Mat object, and you can then access the pixels. See the post "Get Pixels from Mat".
OpenCV
Tutorial page: Open CV Tutorial
I'm saving a large number of small png files for use in a game on a phone, so space is at a premium.
I'm trying to figure out the logic behind the file sizes so I can save things most efficiently, but even after using pngcrush the sizes are totally inconsistent.
I saved a 1x1 image and it takes 3kb. I have another 23x21 image which takes only 2kb. I have two images which are almost the same size, but one takes 6kb and the other takes 13kb. I doubled the image height and copied one image into the empty space of the other and saved that. The combined image is only 11kb!
Why is a 1x1 image larger than a 23x21 image? Why can I combine a 13kb image and a 6kb image and get an 11kb image?
Here are the images I'm talking about (there's a 1x1 pixel in between the 1st and second images. It's difficult to see, so I'll just give the URL: http://g42.org/temp/png/1x1.png):
example http://g42.org/temp/png/hat.png
example http://g42.org/temp/png/1x1.png
example http://g42.org/temp/png/helmet1.png
example http://g42.org/temp/png/helmet2.png
example http://g42.org/temp/png/helmet1_2.png
It's not a compression thing. The problem with the 1x1 image is that it has metadata (added by Photoshop, it seems): a color profile (iCCP chunk). If you look inside the binary, it's the data between the strings "iCCP" and "IDAT". It could be removed, and you would get a 69-byte file.
If you reopen and save the file in most image viewers (e.g. XnView), or use pngcrush, you can strip that chunk. See it here: http://i.stack.imgur.com/fmOdA.png
And regarding the helmet images: besides other informational chunks (ImageReady adds some informational text, as you can see), the difference is due to different formats: the two-helmet image is a paletted image (8 bits per pixel), while the single helmet is RGB with alpha (32 bits per pixel).
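A PNG file is an 8-byte signature followed by chunks, each laid out as length, type, data and CRC, so finding and dropping an iCCP chunk takes only a few lines. The following is a sketch of that idea with invented function names; in practice pngcrush or an image editor does the same job.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(png: bytes):
    """Yield (chunk_type, chunk_data) for every chunk in a PNG file."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        length, chunk_type = struct.unpack_from(">I4s", png, pos)
        yield chunk_type, png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one chunk; the CRC covers the type and the data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def strip_chunks(png: bytes, unwanted=(b"iCCP",)) -> bytes:
    """Return a copy of the PNG without the unwanted chunk types."""
    kept = [make_chunk(t, d) for t, d in iter_chunks(png) if t not in unwanted]
    return PNG_SIGNATURE + b"".join(kept)
```

Printing the chunk type and `len(data)` for each chunk returned by `iter_chunks` shows at a glance where the bytes of a suspiciously large file are going.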
PNG compression is based on the same algorithm as zlib and is highly sensitive to the data that is being compressed so you won't see a consistent relationship between image size and file size. In the case of the combined image, it is still bigger than the smaller image and given the similarity of the two halves of the image, the compressor was probably able to reuse a lot of the Huffman tree. I don't know enough about the algorithm to say for certain how it ended up smaller than the other half.
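The sensitivity to content is easy to demonstrate with zlib directly. The sketch below compresses two buffers of the same size, one uniform and one pseudo-random, and prints the resulting sizes. It ignores PNG's row filters and file overhead, so it only illustrates the deflate stage.

```python
import random
import zlib


def deflate_size(data: bytes) -> int:
    """Number of bytes zlib needs for the given buffer at maximum effort."""
    return len(zlib.compress(data, 9))


size = 64 * 64                        # same pixel count for both buffers
flat = bytes(size)                    # a single flat colour
rng = random.Random(0)                # fixed seed for repeatability
noisy = bytes(rng.randrange(256) for _ in range(size))

print(deflate_size(flat), deflate_size(noisy))
```

The flat buffer shrinks to a few dozen bytes, while the noisy one stays at about its original size, even though both represent the same number of pixels.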
As long as you are not seeing oddities like the 1x1 image, which you seem to have figured out in the comments, I don't think this will make a lot of sense without extensive study of image compression.
There is a great utility called pngcrush
http://pmt.sourceforge.net/pngcrush/
Compressing to PNG is a rather difficult task. There are lots of assumptions and strategies to try: do we create a palette, or are we better off without it?
PNGcrush essentially bruteforces 100+ different compression strategies, while at the same time trimming useless tags and sections.
PNG has several sub-formats: 24-bit with or without alpha, 8-bit (includes alpha), grayscale, etc. which use different amount of bytes per pixel and have different "compressibility".
Plus PNG supports several compression tricks (filters and gzip settings) which affect how well image data is compressed.
On top of that PNG can contain metadata, which sometimes can be pretty large, like some embedded color profiles.
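How much the sub-format matters before any compression happens can be estimated from the PNG colour type and bit depth. The sketch below computes the size of the raw data that is handed to the compressor; the function names are made up, and the palette itself (stored separately in a PLTE chunk) is not counted.

```python
# Channels per pixel for each PNG colour type:
# 0 greyscale, 2 RGB, 3 palette index, 4 greyscale + alpha, 6 RGB + alpha.
CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def raw_scanline_bytes(width: int, color_type: int, bit_depth: int) -> int:
    """Bytes in one row before compression, including the filter-type byte."""
    bits = width * CHANNELS[color_type] * bit_depth
    return 1 + (bits + 7) // 8


def raw_image_bytes(width: int, height: int, color_type: int,
                    bit_depth: int) -> int:
    """Total bytes handed to the compressor for a non-interlaced image."""
    return height * raw_scanline_bytes(width, color_type, bit_depth)
```

For a 100x100 image this gives 40,100 bytes as 8-bit RGB with alpha (colour type 6) but only 10,100 bytes as an 8-bit paletted image (colour type 3), which is why converting to a palette often shrinks the file.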
ImageAlpha converts images to the most space-efficient PNG8+alpha variant.
ImageOptim removes junk metadata and finds best compression parameters.
With a combination of those two your images can be reduced by 30-50%.
Please help! Thanks in advance.
Update: Sorry for the delayed response, but it is probably helpful to provide more context here, since I'm not sure what alternative question I should be asking.
I have an image for a website home page that is 300px x 300px. That image has several distinct regions, including two that have graphical copy on top of the regions.
I have compressed the image down as much as I can without compromising the appearance of that text, and those critical regions of the image.
I tried slicing the less critical regions of the image and saving those at lower compressions in order to get the total kbs down, but as gregmac posted, the sections don't look right when rejoined.
I was wondering if there was a piece of software out there, or manual solution for identifying critical regions of an image to "compress less" and could compress other parts of the image more in order to get the file size down, while keeping those elements in the graphic that need to be high resolution sharper.
You cannot - you can only compress an entire PNG file.
You don't need to (I cannot think of a single case where compressing a specific portion of a PNG file would be useful).
Dividing the image into multiple parts ("slicing") is the only way to compress different portions of an image file, although I'd even recommend against using different compression levels in one "sliced image", as differing compression artefacts joining up will probably look odd.
Regarding your update,
identifying critical regions of an image to "compress less" and could compress other parts of the image more in order to get the file size down
This is inherently what image compression does: if there's a big empty area it will be compressed to a few bytes (using RLE for example), but if there's a very detailed region it will have more bytes "spent" on it.
It sounds like the problem is that the image is too big (in terms of file size). Have you tried other image formats, mainly GIF or JPEG (or the other PNG format, PNG-8 or PNG-24)?
I have compressed the image down as much as I can without compromising the appearance of that text
Perhaps the text could be overlaid using CSS, rather than embedded in the image? Might not be practical, but it would allow you to compress the background more (if the background image is a photo, JPEG might work best, since you no longer have to worry about the text)
Other than that, I'm out of ideas. Is the 300*300px PNG really too big?
It sounds like you are compressing parts of your image using something like JPEG and then pasting those compressed images onto a PNG combined with other images, and the entire PNG is sent to the browser where you split them up.
The problem with this is that the more you compress your JPEG parts the more decompression artifacts you will get. Then when you put these low quality images onto the PNG, which uses deflate compression, you will actually end up increasing the file size because it won't be able to compress well.
So if you are keen on keeping PNG as your file format the best solution would be to not compress the parts using JPEG which you paste onto your PNG - keep everything as sharp as possible.
PNG compresses each row separately unless you have used a "predictor" in the compression.
So it's best to keep your PNG as wide as possible with similar images next to each other horizontally rather than under each other vertically.
Perhaps upload an example of the images you're working with?