Simple and Fast Video Encoding/Decoding

Simple and Fast Video Encoding/Decoding - c++

I need a simple and fast video codec with alpha support as an alternative to Quicktime Animation which has horrible compression rates for regular video.
Since I haven't found any good open-source encoder/decoder with alpha support, I have been trying to write my own (with inspiration from huff-yuv).
My strategy is the following:
Convert to YUVA420
Subtract current frame from previous (no need for key-frames).
Huffman encode the result from the previous step. Split each frame into 64x64 blocks and create a new huffman table for each block and encode it.
With this strategy I achieve decent compression rate 60-80%. I could probably improve the compression rate by splitting each frame into block after step 1 and add motion vectors to the reduce the data output from step 2. However, better compression-rate than 60% is lower prio than performance.
Acceptable compression-speed on a quad-core cpu 60ms/frame.
However the decoding speed suffers, 40ms/frame (barely real-time with full cpu usage).
My question is whether there is a way to compress video with much faster decoding, while still achieving decent compression rate?
Decoding huffman-coded symbols seems rather slow. I have not tried to use table-lookups yet, not sure if table lookups is a good idea since I have a new huffman table for each block, and building the lookup-table is quite expensive. As far as I have been able to figure out its not possible to make use of any SIMD or GPU features. Is there any alternative? Note that it doesn't have to be lossless.

You want to try a Golomb Code instead of a Huffman Code. A golomb code is IMO faster to decode then a huffman code. If it doesn't have to be loseless you want to use a hilbert curve and a DCT and then a Golomb Code. You want to subdivide the frames with a space-filling curve. IMO a continous subdivision of a frame with a sfc and also a decode is very fast.

Related

Image steganography that could survive jpeg compression

I am trying to implement a steganographic algorithm where hidden message could survive jpeg compression.
The typical scenario is the following:
Hide data in image
Compress image using jpeg
The hidden data is not destroyed by jpeg compressiona nd could be restored
I was trying to use different described algorithms but with no success.
For example I was trying to use simple repetition code but the jpeg compression destroyed hidden data. Also I was trying to implementt algorithms described by the following articles:
http://nas.takming.edu.tw/chkao/lncs2001.pdf
http://www.securiteinfo.com/ebooks/palm/irvine-stega-jpg.pdf
Do you know about any algorithm that actually can survive jpeg compression?

You can hide the data in the frequency domain, JPEG saves information using DCT (Discrete Cosine Transform) for every 8x8 pixel block, the information that is invariant under compression is the highest frequency values, and they are arranged in a matrix, the lossy compression is done when the lowest coefficients of the matrix are rounded to 0 after the quantization of the block, these zeroes are arranged in the low-right part of the matrix and that is why the compression works and the information is lost.

Quite a few applications seem to implement Steganography on JPEG, so it's feasible:
http://www.jjtc.com/Steganography/toolmatrix.htm
Here's an article regarding a relevant algorithm (PM1) to get you started:
http://link.springer.com/article/10.1007%2Fs00500-008-0327-7#page-1

Perhaps the answer is late,but ...
You can do it in compressed domain steganography.Read image as binary file and analysis this file with libs like JPEG Parser. Based on your selected algorithm, find location of venues and compute new value of this venue and replace result bits in file data. Finally write file in same input extension.
I hope I helped.

What you're looking for is called watermarking.
A little warning: Watermarking algorithms use insane amounts of redundancy to ensure high robustness of the information being embedded. That means the amount of data you'll be able to hide in an image will be orders of magnitude lower compared to standard steganographic algorithms.

How can I further compress a JPEG to send over a TCP Stream in C/C++ on Windows?

I am making my own version of teamviewer, where you can show your screen to other people so that they can help you with issues and provide support. However, I have ran into a slight issue.
Firstly, when I take a screenshot the .BMP is about 1 MB in memory. After using EncodingParams and converting it to a JPEG with ulong quality = 20, it drops to 92 kb. However, I still feel like this might be a little too big to constantly send over a stream.
Is there any way I can reduce the size of an image or any kind of way I can make it less network intensive? Every single byte I remove would help speed it up for slower connections, and use less bandwidth.
I would appreciate if someone could give me some advice.
Thanks

Either use a lower quality than 20, or or reduce the size of the image. Both will of course degrade the quality. But there aren't any obvious compression techniques that you can use on a JPEG compressed image - it's about as small as it will get (even high quality JPEG's will compress rather poorly, because JPEG does a good job at compressing the actual data in the image, as well as "destroying" pixels when compression levels are quite high). You can prove this by taking a number of JPEG images and compress them with your favourite compression program (ZIP, GZIP, BZIP, etc).
You can also reduce the number of colours in the image - a 256 (8-bit) or 64K (16-bit) image will compress much better than a 16M (24/32-bit) image.
Other obvious answer is to only send the DIFFERENCE between one frame and another - this can give you pretty decent benefits as long as the picture isn't changing very much - if someone is playing a "shoot-em up" type game, where things explode and the scene changes at 50 fps, it's probably not going to help much, but for regular office type applications, the number of pixels that change each frame is probably minimal.

Appropriate image file format for losslessly compressing series of screenshots

I am building an application which takes a great many number of screenshots during the process of "recording" operations performed by the user on the windows desktop.
For obvious reasons I'd like to store this data in as efficient a manner as possible.
At first I thought about using the PNG format to get this done. But I stumbled upon this: http://www.olegkikin.com/png_optimizers/
The best algorithms only managed a 3 to 5 percent improvement on an image of GUI icons. This is highly discouraging and reveals that I'm going to need to do better because just using PNG will not allow me to use previous frames to help the compression ratio. The filesize will continue to grow linearly with time.
I thought about solving this with a bit of a hack: Just save the frames in groups of some number, side by side. For example I could just store the content of 10 1280x1024 captures in a single 1280x10240 image, then the compression should be able to take advantage of repetitions across adjacent images.
But the problem with this is that the algorithms used to compress PNG are not designed for this. I am arbitrarily placing images at 1024 pixel intervals from each other, and only 10 of them can be grouped together at a time. From what I have gathered after a few minutes scanning the PNG spec, the compression operates on individual scanlines (which are filtered) and then chunked together, so there is actually no way that info from 1024 pixels above could be referenced from down below.
So I've found the MNG format which extends PNG to allow animations. This is much more appropriate for what I am doing.
One thing that I am worried about is how much support there is for "extending" an image/animation with new frames. The nature of the data generation in my application is that new frames get added to a list periodically. But I do have a simple semi-solution to this problem, which is to cache a chunk of recently generated data and incrementally produce an "animation", say, every 10 frames. This will allow me to tie up only 10 frames' worth of uncompressed image data in RAM, not as good as offloading it to the filesystem immediately, but it's not terrible. After the entire process is complete (or even using free cycles in a free thread, during execution) I can easily go back and stitch the groups of 10 together, if it's even worth the effort to do it.
Here is my actual question that everything has been leading up to. Is MNG the best format for my requirements? Those reqs are: 1. C/C++ implementation available with a permissive license, 2. 24/32 bit color, 4+ megapixel (some folks run 30 inch monitors) resolution, 3. lossless or near-lossless (retains text clarity) compression with provisions to reference previous frames to aid that compression.
For example, here is another option that I have thought about: video codecs. I'd like to have lossless quality, but I have seen examples of h.264/x264 reproducing remarkably sharp stills, and its performance is such that I can capture at a much faster interval. I suspect that I will just need to implement both of these and do my own benchmarking to adequately satisfy my curiosity.

If you have access to a PNG compression implementation, you could easily optimize the compression without having to use the MNG format by just preprocessing the "next" image as a difference with the previous one. This is naive but effective if the screenshots don't change much, and compression of "almost empty" PNGs will decrease a lot the storage space required.

Is there a faster lossy compression than JPEG?

Is there a compression algorithm that is faster than JPEG yet well supported? I know about jpeg2000 but from what I've heard it's not really that much faster.
Edit: for compressing.
Edit2: It should run on Linux 32 bit and ideally it should be in C or C++.

Jpeg encoding and decoding should be extremely fast. You'll have a hard time finding a faster algorithm. If it's slow, your problem is probably not the format but a bad implementation of the encoder. Try the encoder from libavcodec in the ffmpeg project.

Do you have MMX/SSE2 instructions available on your target architecture? If so, you might try libjpeg-turbo. Alternatively, can you compress the images with something like zlib and then offload the actual reduction to another machine? Is it imperative that actual lossy compression of the images take place on the embedded device itself?

In what context? On a PC or a portable device?
From my experience you've got JPEG, JPEG2000, PNG, and ... uh, that's about it for "well-supported" image types in a broad context (lossy or not!)
(Hooray that GIF is on its way out.)

JPEG2000 isn't faster at all. Is it encoding or decoding that's not fast enough with jpeg? You could probably be alot faster by doing only 4x4 FDCT and IDCT on jpeg.
It's hard to find any documentation on IJG libjpeg, but if you use that, try lowering the quality setting, it might make it faster, also there seems to be a fast FDCT option.
Someone mentioned libjpeg-turbo that uses SIMD instructions and is compatible with the regular libjpeg. If that's an option for you, I think you should try it.

I think wavelet-based compression algorithms are in general slower than the ones using DCT. Maybe you should take a look at the JPEG XR and WebP formats.

You could simply resize the image to a smaller one if you don't require the full image fidelity. Averaging every 2x2 block into a single pixel will reduce the size to 1/4 very quickly.

Fastest deskew algorithm?

I am a little overwhelmed by my task at hand. We have a toolkit which we use for TWAIN scanning. Some of our customers are complaining about slower scan speeds when the deskew option is set. This is because if their scanner does not support a hardware deskew, it is done in post-processing on the CPU. I was wondering if anyone knows of a good (i.e. fast) algorithm to achieve this. It is hard for me to say what algorithm we are using now. What algorithms are out there for this, and how do they rank as far as speed/accuracy? If I knew the names of the algorithms, it could be easier for me to do a google search on them.
Thank You.
-Tom

Are you scanning in Color or B/W ?
Deskew is processor intensive. A Group4 tiff or JPEG must be decompressed, skew angle determined, deskewed and then compressed.
There are many image processing algorithms out there with deskew and I have evaluated many over the years. There are some huge differences in processing speed between the different libraries and a lot of it comes down to how well it is coded rather than the algorithm used. There is a huge difference in commercial libraries just reading and writing images.
The fastest commerical deskew I have used by far comes from Unisoft Imaging (www.unisoftimaging.com). I assume much of it is written in assembler. Unisoft has been around for many years and is very fast and efficient. It supports different many different deskew options including black border removal, color and B/W deskew. The Group4 routines are very solid and very fast. The library comes with many other image processing options as well as TWAIN and native SCSI scanner support. It also supports Unix.
If you want a free deskew then you might want to have a look at Leptonica. It does not come with too much documentation but is very stable and well written. http://www.leptonica.com/
Developing code from scratch could be quite time consuming and may be quite buggy and prone to errors.
The other option is to process the document in a separate process so that scanning can run at the speed of the scanner. At the moment you are probably processing everything in a parallel fashion, one task after another, hence the slowdown.

Consider doing it as post-processing, because deskew cannot be done at real-time (unless it's hardware accelerated).
Deskew consists of two steps: skew detection and rotation. Detecting the skew angle can usually be done on a B&W (1-bit) image faster. Rotation speed depends on the quality of the interpolation. A good quality deskew will take a lot of time to run, much more than scanning pages.
A good high speed scanner can do 120 double-sided pages per minute, if it has hardware JPEG or TIFF Group 4 compression, and your TWAIN library takes advantage of it (hint: do not use native mode). You barely have enough time to save the file to the hard drive at that speed, let alone decompress, skew detect, rotate, re-compress. Quality deskew takes several seconds per page, unless you can use the video card's hardware accelerator to rotate and compress.

Do I correctly understand you already have such algorithm implemented? If so, are you sure there is no space for optimization? I'd start with profiling existing solution.
Anyway, I guess you should look for fast digital Radon transform algorithm.
Take a look at http://pagetools.sourceforge.net. They have deskew algorithm implementation.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js