My camera gives me JPEG with 4:2:2 chroma subsampling, but I need 4:2:0.
Can I change the MJPEG default chroma subsampling with v4l2?
v4l2 itself provides a very thin layer around the actual video data being transferred: it will simply give you the formats that the camera (the hardware!) delivers.
So if your hardware offers two distinct formats, there is no way that v4l2 will offer you anything else.
You might want to check out the libv4l2 library, which does some basic colorspace conversion: in general it can convert the more exotic hardware formats to a handful of "standard" formats, so your application does not need to support every format a hardware manufacturer can come up with. However, it is not very likely that these standard formats include a very specific (compressed) format like the one you need.
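If the driver only hands you 4:2:2, you can always resample the chroma yourself after decoding the JPEG. A minimal sketch of the idea (pure Python on nested lists standing in for decoded chroma planes; in a real pipeline you would run this over the planes your JPEG decoder gives you): going from 4:2:2 to 4:2:0 just means averaging each pair of vertically adjacent chroma rows, since 4:2:2 subsamples chroma only horizontally and 4:2:0 subsamples it vertically as well.

```python
def chroma_422_to_420(chroma_plane):
    """Downsample a 4:2:2 chroma plane (half width, full height)
    to 4:2:0 (half width, half height) by averaging row pairs."""
    out = []
    for r in range(0, len(chroma_plane), 2):
        row_a = chroma_plane[r]
        # Duplicate the last row if the plane height is odd.
        row_b = chroma_plane[r + 1] if r + 1 < len(chroma_plane) else row_a
        out.append([(a + b + 1) // 2 for a, b in zip(row_a, row_b)])
    return out

# Tiny 2x4 chroma plane (4:2:2 layout for a 4x4 luma plane).
cb = [[100, 102],
      [104, 106],
      [110, 110],
      [120, 130]]
print(chroma_422_to_420(cb))  # [[102, 104], [115, 120]]
```

The `(a + b + 1) // 2` is just integer averaging with rounding; a production converter would normally use a proper filter kernel instead of a plain 2-tap average.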
As far as I know, a modern JPEG decoder produces the same image when given the same input JPEG file.
Normally, we create JPEG files in such a way that the decoded image is an approximation of some input image.
Is the JPEG format flexible enough to allow lossless encoding of arbitrary input images with a custom encoder?
I'd imagine you'd at least have to fiddle with how quantization tables are used, essentially to disable them? Perhaps something else?
(To be clear, I don't mean the special 'lossless' mode in JPEG that many decoders don't support. I am talking about using the default, mainstream code path through the decoder.)
No. Even with no quantization, the RGB-to-YCbCr transformation is lossy in the lowest few bits. The chroma channels are then downsampled as well, although that step can be skipped. And while the DCT is mathematically lossless, in an integer implementation it is lossy in the least significant bit or two.
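The color-transform loss is easy to demonstrate. A sketch using the full-range JFIF YCbCr equations with rounding to integers at each stage, as a baseline JPEG codec effectively does (the coefficients are the standard JFIF ones; the coarse sampling of the RGB cube is just to keep the demo fast):

```python
def rgb_to_ycbcr(r, g, b):
    # JFIF full-range forward transform, rounded to integers.
    y  = round( 0.299    * r + 0.587    * g + 0.114    * b)
    cb = round(-0.168736 * r - 0.331264 * g + 0.5      * b) + 128
    cr = round( 0.5      * r - 0.418688 * g - 0.081312 * b) + 128
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    # Inverse transform, rounded and clamped back to 0..255.
    clamp = lambda v: max(0, min(255, round(v)))
    r = clamp(y + 1.402    * (cr - 128))
    g = clamp(y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128))
    b = clamp(y + 1.772    * (cb - 128))
    return r, g, b

# Count round-trip failures over a coarse sample of the RGB cube.
bad = sum(1 for r in range(0, 256, 17)
            for g in range(0, 256, 17)
            for b in range(0, 256, 17)
            if ycbcr_to_rgb(*rgb_to_ycbcr(r, g, b)) != (r, g, b))
print(bad, "of", 16 ** 3, "samples fail to round-trip")
```

Pure green, for instance, comes back as (0, 255, 1) rather than (0, 255, 0) once both directions round to 8-bit integers.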
I need to decode video but my video player only supports RGB8 pixel format. So I'm looking into how to do pixel format conversion in the GPU, preferably in the decoding process, but if not possible, after it.
I've found How to set decode pixel format in libavcodec? which explains how to decode video with ffmpeg to a specific pixel format, as long as it's supported by the codec.
Basically, get_format() is a function which chooses, from a list of supported pixel formats from the codec, a pixel format for the decoded video. My questions are:
Is this list of supported codec output formats the same for all computers? For example, if my codec is for H264, then it will always give me the same list on all computers? (assuming same ffmpeg version of all computers)
If I choose any of these supported pixel formats, will the pixel format conversion always happen in the GPU?
If some of the pixel format conversions won't happen in the GPU, then my question is: does the sws_scale() function convert in the GPU or the CPU?
It depends. First, H264 is just a codec standard. While libx264 and openh264 both implement this standard, you can expect each implementation to support different formats. But let's assume (as you did in your question) that you are using the same implementation on different machines; even then there might still be cases where different machines support different formats. Take H264_AMF, for example: you need an AMD graphics card to use that codec, and the supported formats depend on your graphics card as well.
Decoding will generally happen on your CPU unless you explicitly specify a hardware decoder. See this example for Hardware decoding: https://github.com/FFmpeg/FFmpeg/blob/release/4.1/doc/examples/hw_decode.c
When using hardware decoding you are heavily dependent on your machine, and each hardware decoder will output its own (proprietary) frame format, e.g. NV12 for an Nvidia graphics card. Now comes the tricky part. The decoded frames will remain in your GPU memory, which means you might be able to reuse the AVFrame buffer to do the pixel conversion using OpenCL/GL. But achieving GPU zero-copy when working with different frameworks is not that easy, and I don't have enough knowledge to help you there. So what I would do is download the decoded frame from the GPU via av_hwframe_transfer_data, as in the example.
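For orientation, this is what an NV12 frame looks like once it has been transferred to system memory: a full-resolution Y plane followed by one half-resolution plane of interleaved U/V byte pairs. A pure-Python sketch of splitting such a flat buffer (with libavutil you would instead read the Y and interleaved-UV planes from the frame's separate data pointers, but the byte layout is the same):

```python
def split_nv12(buf, width, height):
    """Split a flat NV12 buffer into Y, U and V planes.
    NV12 = width*height bytes of Y, then (width*height)//2 bytes
    of interleaved U/V samples at half resolution in both axes."""
    y_size = width * height
    y = buf[:y_size]
    uv = buf[y_size:y_size + y_size // 2]
    u = uv[0::2]  # even bytes: U samples
    v = uv[1::2]  # odd bytes:  V samples
    return y, u, v

# 4x2 frame: 8 Y bytes, then 4 interleaved UV bytes (2 U/V pairs).
frame = bytes([10, 11, 12, 13, 14, 15, 16, 17, 90, 200, 91, 201])
y, u, v = split_nv12(frame, 4, 2)
print(list(u), list(v))  # [90, 91] [200, 201]
```

Knowing this layout is what lets you feed the downloaded frame straight into a CPU (or OpenCL/GL) color-conversion step.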
From this point on it doesn't make much of a difference if you used hardware or software decoding.
To my knowledge sws_scale doesn't use hardware acceleration, since it doesn't accept "hwframes". If you want to do color conversion at the hardware level, you might want to take a look at OpenCV: you can use a GpuMat there, upload your frame, call cvtColor, and download it again.
Some general remarks:
Almost any image operation (scaling etc.) is faster on your GPU, but uploading and downloading the data can take ages. For single operations, it's often not worth using your GPU.
In your case, I would try CPU decoding and CPU color conversion first. Just make sure to use well-threaded and vectorized implementations such as OpenCV or Intel IPP. If you still lack performance, then you can think about hardware acceleration.
I'm developing a multimedia streaming application for desktop using the Media Foundation SourceReader technique.
I'm using a USB camera device to show streaming. The camera supports two video formats: YUY2 and MJPG.
For YUY2 at 1920x1080p, I'm receiving only 48 fps instead of 60 fps. I took the YUY2-to-RGB32 conversion code from an MSDN page and use it in my application (note: I'm not using any transform filter for color conversion).
For the MJPG video format, I used the MJPEG Decoder MFT to convert MJPG → YUY2 → RGB32, and then display on the window using Direct3D9. For specific resolutions I'm facing framerate drops from 60 fps to 30 fps (e.g. 1920x1080 is 60 fps but draws only 30-33 fps).
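For reference, the per-pixel math behind that YUY2-to-RGB32 step looks like the sketch below (pure Python; the fixed-point BT.601 video-range coefficients follow the MSDN YUV-to-RGB sample, and each 4-byte YUY2 macropixel Y0 U Y1 V expands to two RGB pixels sharing one chroma pair):

```python
def clip(v):
    return max(0, min(255, v))

def yuy2_pair_to_rgb(y0, u, y1, v):
    """Convert one 4-byte YUY2 macropixel (Y0 U Y1 V) to two RGB
    pixels, using fixed-point BT.601 video-range conversion."""
    d, e = u - 128, v - 128
    out = []
    for y in (y0, y1):
        c = y - 16
        r = clip((298 * c + 409 * e + 128) >> 8)
        g = clip((298 * c - 100 * d - 208 * e + 128) >> 8)
        b = clip((298 * c + 516 * d + 128) >> 8)
        out.append((r, g, b))
    return out

# Video-range white (Y=235) and black (Y=16) with neutral chroma.
print(yuy2_pair_to_rgb(235, 128, 16, 128))
# [(255, 255, 255), (0, 0, 0)]
```

Doing this per pixel on the CPU for 1920x1080 at 60 fps is a lot of work, which is one reason a scalar C implementation of the MSDN code can fail to keep up; an SSE2/AVX2 version or a GPU shader handles it easily.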
I verified two ways in GraphEdit to confirm it is the decoder:
Added the MJPEG Decompressor filter and built the graph for the MJPG video format to check the fps at Full HD resolution: it shows 28 fps instead of 60.
Added the AVI Decompressor filter and built the graph for the MJPG video format to check the fps at Full HD resolution: it shows 60 fps.
I have searched many sites to find an AVI Decompressor equivalent for Media Foundation, but no luck.
Can anyone confirm whether such a filter is available as an MFT?
Microsoft ships [recent versions of] Windows with stock Motion JPEG decoders:
MJPEG Decompressor Filter for DirectShow
MJPEG Decoder MFT for Media Foundation
To the best of my knowledge these do not share a codebase; however, neither is meant to be a performance-efficient decoder.
Your use of GraphEdit means you are trying the DirectShow decoders, and AVI Decompressor is presumably wrapping another (Video for Windows) codec which you did not identify.
For Media Foundation, you might be able to use the Intel Hardware M-JPEG Decoder MFT or the NVIDIA MJPEG Video Decoder MFT if you have the respective hardware and drivers. Presumably, vendor-specific decoders deliver better performance, and they also have higher priority compared to generic software peers. Other than that, in the MFT form factor you might need to look at commercial and/or custom-developed decoders, as the API itself is not popular enough to offer a wide range of options.
I'm trying to write a TCP client/server application that transmits objects containing OpenCv Mat. I'd like to serialize these objects using JSON. I found some libraries that help me in doing that (rapidjson), but they of course do not take into account images as object members.
What would you suggest to serialize in a JSON object a cv::Mat variable? How can I use RapidJson, for example, to achieve that?
imencode can be used to encode a viewable image (with CV_8UC1 or CV_8UC3 pixel format) into a std::vector<uchar>. Link to documentation.
The vector<uchar> will contain the same bytes as if OpenCV had saved the image into one of the supported image file formats (such as JPEG or PNG) and the file's bytes were then loaded back into a byte array.
imencode can be found in highgui module when using OpenCV 2.x, or imgcodecs module when using OpenCV 3.x.
With the compressed data in a vector<uchar>, you can use Base64 encoding to format it into a string, which can then be added as a JSON value inside a JSON object.
When using JSON to transmit large amounts of data, consider very carefully the character encoding the JSON library is instructed to emit. Normally, if a large portion of the data is going to be Base64, you will want to make sure the JSON is emitted in UTF-8.
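End to end, the imencode-plus-Base64 approach looks roughly like this (a pure-Python sketch of the wire format; the encoded_bytes placeholder and the width/height/format fields stand in for the vector<uchar> and metadata you would produce from cv::imencode in C++, and are assumptions for illustration):

```python
import base64
import json

# Stand-in for the buffer cv::imencode would produce (assumption:
# in real code this is the compressed JPEG/PNG byte stream).
encoded_bytes = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])

# Pack into a JSON object, Base64-encoding the binary payload.
msg = json.dumps({
    "width": 640,
    "height": 480,
    "format": "jpeg",
    "data": base64.b64encode(encoded_bytes).decode("ascii"),
})

# Receiving side: parse and decode back to the original bytes,
# ready to hand to cv::imdecode.
decoded = base64.b64decode(json.loads(msg)["data"])
assert decoded == encoded_bytes
```

Note that Base64 inflates the payload by about 4/3, which is part of why binary out-of-band transfer is worth considering for large images.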
If you have the option of sending in binary (which requires an "out-of-band" design in the web service, something not always doable), it should be seriously considered.
When considering different serialization choices for images, these things should be taken into account:
Typical image sizes (total number of pixels)
Size efficiency is less of a concern if images are small.
Pixel format (number of channels and precision)
Most common image file formats will only allow 8-bit grayscale and 24-bit RGB pixel data. Trying to save higher-precision pixel data into these image formats will result in partial loss of precision.
Available transmission bandwidth (if it is scarce enough to be a concern). With less available bandwidth, compression becomes more important.
Compression options.
Typical (photographic or synthetic) images are highly compressible, for the simple reason that an image too "dense" in information would be too hard for human eyes to comprehend anyway.
Compression can be lossless or lossy.
Choice of compression may depend on the statistical characteristics of the pixel values (image content).
As mentioned above, if compression is performed by encoding into some image formats, you have to make sure the image format can satisfy the pixel value precision requirements of your application.
If no existing image format meets your requirements and you still want to perform lossless compression, consider using the zlib API that is integrated into the OpenCV Core module.
If you are good at image processing and data compression theory, you may be able to devise an application-specific compression method based on your own needs.
Remember that reducing the image resolution can be a powerful (and super-lossy) way of reducing the transmission file size. Consider carefully what minimum image resolution is actually needed for your application.
Other considerations
Binary or text
Endianness
Availability of highgui, imgcodecs or an image decoder for the chosen image format on the receiving end.
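For the lossless-compression route mentioned above, a quick sketch with Python's stdlib zlib (in C++ you would call the zlib API that ships inside OpenCV's core module; the sample buffer here is an assumption, standing in for raw pixel data that the common 8-bit image formats could not hold losslessly):

```python
import zlib

# Sample pixel buffer: a repetitive 16 KiB ramp, standing in for
# raw high-precision pixel data.
pixels = bytes(range(256)) * 64

# Deflate-compress, then verify the round trip is lossless.
packed = zlib.compress(pixels, 9)
assert zlib.decompress(packed) == pixels
print(len(pixels), "->", len(packed), "bytes")
```

How well this does in practice depends entirely on the statistical characteristics of the pixel values, which is exactly the point made above: deflate shines on repetitive synthetic content and does much less for noisy photographic data.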
Information source: just did this a few months ago.
I'm looking for a way to display videos in a 2D game. The videos need to support an alpha-channel so they can be overlayed on top of the other game elements.
Currently I just have a series of PNG files which are decompressed and then flipped through for the animation. This works, but it is a massive memory hog; a 1024x1024 animation that is 5 seconds long at 24 frames per second takes up well over 400MB. And I'm targeting embedded systems, so this is really not good.
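The arithmetic behind that figure checks out: holding every frame decompressed as 32-bit RGBA gives

```python
width, height = 1024, 1024
bytes_per_pixel = 4            # RGBA8
frames = 5 * 24                # 5 seconds at 24 fps
total = width * height * bytes_per_pixel * frames
print(total / 2**20, "MiB")    # 480.0 MiB
```

so any real solution has to keep the frames compressed in memory and decode on the fly, not just compress them on disk.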
I've been looking for some video codecs that can support these requirements, yet so far all I've really been able to come up with that support RGBA are licensed under GPL, so we can't use them in a commercial product.
Any such beast(s) out there?
Most codecs don't support an alpha channel - the only one I can think of is the QuickTime animation codec, which isn't very popular.
If you only need binary alpha channel (transparent or not) then setting the top bit of one of the color channels is a common approach.
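One way to realize that trick (an illustrative sketch, not a standard format: here the top bit of the blue channel is stolen as the transparency flag, which halves blue's precision):

```python
def pack(r, g, b, opaque):
    """Steal the top bit of the blue channel as a 1-bit alpha flag.
    Blue loses its bottom bit of precision (0-255 -> 128 levels)."""
    return (r, g, (b >> 1) | (0x80 if opaque else 0))

def unpack(r, g, b):
    """Recover the alpha flag and an approximation of the blue value."""
    opaque = bool(b & 0x80)
    return (r, g, (b & 0x7F) << 1), opaque

rgb, opaque = unpack(*pack(200, 100, 50, True))
print(rgb, opaque)  # (200, 100, 50) True
```

Because the flag rides inside an ordinary RGB triple, the frames stay encodable with any alpha-less codec; the cost is one bit of color precision and only binary (on/off) transparency.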
If these are animation-type frames, then something like MJPEG might work well, and there are lots of LGPL-licensed MJPEG libraries.