How to get the resolution of YUV frame - c++

I have a .yuv file and want to read it using C++. The issue is that I don't know its resolution. So, is there any way to derive the resolution from the .yuv file?
Also, I don't know the number of frames, so I can't take the file size from ftell() and divide it by the frame count.

As far as I know there's no .yuv file format. Some programs use that extension to store raw data, but it's not a format. Therefore there's no way to use the file unless you already know the metadata.
And you need to know in advance if the chroma samples are interleaved with the luminance ones or stored separately and if there's any bit padding and/or any row padding, just to be able to read the data from the file.
Then you also need to know in advance the frame resolution, the chroma subsampling and the color space in order to be able to correctly process that data.
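To make the dependency on metadata concrete, here is a minimal sketch of the arithmetic once the layout is known. It assumes 8-bit planar YUV 4:2:0 with no row padding (often called I420); the function names are hypothetical, and any other subsampling or bit depth needs a different formula. It cannot recover an unknown resolution. It can only check whether a guessed resolution divides the file size evenly.

```cpp
#include <cstdint>

// Bytes in one frame of 8-bit planar YUV 4:2:0 with no row padding:
// one full-size luma plane plus two chroma planes at half width and
// half height (odd dimensions are rounded up for chroma).
std::uint64_t i420FrameBytes(std::uint32_t width, std::uint32_t height) {
    const std::uint64_t w = width;
    const std::uint64_t h = height;
    const std::uint64_t luma = w * h;
    const std::uint64_t chroma = ((w + 1) / 2) * ((h + 1) / 2);
    return luma + 2 * chroma;
}

// Number of whole frames in a file of fileBytes bytes, or -1 when the size
// is not an exact multiple of the frame size, which suggests that the
// assumed resolution or layout is wrong.
std::int64_t frameCount(std::uint64_t fileBytes,
                        std::uint32_t width,
                        std::uint32_t height) {
    const std::uint64_t frame = i420FrameBytes(width, height);
    if (frame == 0 || fileBytes % frame != 0) {
        return -1;
    }
    return static_cast<std::int64_t>(fileBytes / frame);
}
```

An exact division is only a hint, not proof: several resolutions can divide the same file size, so the guess still has to be confirmed by looking at the decoded picture.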

Related

Vulkan vkCreateImage with 3 components

I am trying to use vkCreateImage with a 3-component image (rgb).
But all the rgb formats give:
vkCreateImage format parameter (VK_FORMAT_R8G8B8_xxxx) is an unsupported format
Does this mean that I have to reshape the data in memory? So add an empty byte after each 3, and then load it as RGBA?
I also noticed R8 and R8G8 formats do work, so I would guess the only reason RGB is not supported is that 3 is not a power of two.
Before I actually do this reshaping of the data I'd like to know for sure that this is the only way, because it is not very good for performance and maybe there is some offset or padding value somewhere that will help loading the RGB data into an RGBA format. So can somebody confirm the reshaping into RGBA is a necessary step to load RGB formats (albeit with 33% overhead)?
Thanks in advance.
First, you're supposed to check to see what is supported before you try to create an image. You shouldn't rely on validation layers to stop you; that's just a debugging aid to catch something when you forgot to check. What is and is not supported is dynamic, not static. It's based on your implementation. So you have to ask every time your application starts whether the formats you intend to use are available.
And if they are not, then you must plan accordingly.
Second, yes, if your implementation does not support 3-channel formats, then you'll need to emulate them with a 4-channel format. You will have to re-adjust your data to fit your new format.
If you don't like doing that, I'm sure there are image editors you can use to load your image, add an opaque alpha of 1.0, and save it again.
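If you end up repacking on the CPU, the conversion itself is small. The sketch below is an illustration rather than anything Vulkan-specific: it assumes tightly packed 8-bit RGB input, and the function name is hypothetical. The result is laid out the way a 4-channel 8-bit format such as VK_FORMAT_R8G8B8A8_UNORM expects. Whether a given format is usable is normally queried with vkGetPhysicalDeviceFormatProperties before any image is created.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands tightly packed 8-bit RGB pixels to RGBA by appending a constant
// alpha byte after every three colour bytes. Trailing bytes that do not
// form a complete pixel are ignored.
std::vector<std::uint8_t> expandRgbToRgba(const std::vector<std::uint8_t>& rgb,
                                          std::uint8_t alpha = 255) {
    const std::size_t pixelCount = rgb.size() / 3;
    std::vector<std::uint8_t> rgba;
    rgba.reserve(pixelCount * 4);
    for (std::size_t i = 0; i < pixelCount; ++i) {
        rgba.push_back(rgb[3 * i + 0]);  // R
        rgba.push_back(rgb[3 * i + 1]);  // G
        rgba.push_back(rgb[3 * i + 2]);  // B
        rgba.push_back(alpha);           // opaque by default
    }
    return rgba;
}
```

Writing the output straight into a mapped staging buffer instead of a temporary vector would avoid one extra copy; that variant is left out to keep the sketch short.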

Most efficient way to store video data

In order to accomplish some specific editing on some .avi files, I'd like to create an application (in C++) that is able to load, edit, and save those .avi files. But what is the most efficient way? When first thinking about it, a simple 3D array containing a 2D array of pixels for every frame seems the simplest solution, but then its size would be ENORMOUS. I mean, let's assume that a pixel only needs a color. One color would mean 3 bytes (1 char r, 1 char g, 1 char b). If I now have a 1920x1080 video format, this would mean about 6 MEGABYTES for only one frame! This data may or may not be smaller if using pointers for the colors, so that already used colors won't take more space - I don't really know, since I'm pretty new to C++ and the whole low-level stuff. (As a comparison: one of my AVI files recorded with the Xvid codec is 40 seconds long, 30 fps, and only 2 MB.)
So how would you actually store the video data (Not even the audio, just the video) efficiently (while still being easily able to perform per-frame-changes on it)?
As you have realised, uncompressed video is enormous and it is not practical to store an entire video in this way.
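To put numbers on "enormous", here is a small sketch of the arithmetic, assuming 3 bytes per pixel and no padding (the function names are hypothetical). A single 1920x1080 frame comes to 6,220,800 bytes, and 40 seconds at 30 fps to roughly 7.5 GB.

```cpp
#include <cstdint>

// Bytes needed for one uncompressed frame with no row padding.
std::uint64_t rawFrameBytes(std::uint32_t width, std::uint32_t height,
                            std::uint32_t bytesPerPixel) {
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel;
}

// Bytes needed to hold a whole uncompressed clip in memory.
std::uint64_t rawVideoBytes(std::uint32_t width, std::uint32_t height,
                            std::uint32_t bytesPerPixel,
                            std::uint32_t framesPerSecond,
                            std::uint32_t seconds) {
    return rawFrameBytes(width, height, bytesPerPixel) * framesPerSecond * seconds;
}
```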
Video compression is an extremely complex topic, but more-or-less, it works as follows: certain "key-frames" are compressed using fairly standard compression techniques similar or identical to still-photo compression such as JPEG. Frames following key-frames are compressed by comparing the frame with the previous one and looking for changes (such as moving blocks). Every now and again, a new key-frame is used.
You don't really have to worry much about that as you are not going to write your own video coder/decoder (codec). There are standard ones.
What will happen is that your program will decode the compressed video frame-by-frame and keep a certain number of frames in memory while you are working on them and then re-encode them when it is finished. In the uncompressed form, you will have access to the individual pixels and can work on them how you want.
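As a rough picture of what "access to the individual pixels" means, here is a minimal sketch of one decoded frame held in memory. It assumes tightly packed 8-bit RGB in row-major order; the Frame type and invert function are hypothetical and not part of any codec library, which will usually hand you its own frame type instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One decoded frame: width * height pixels, 3 bytes (R, G, B) per pixel,
// stored row by row in a single contiguous buffer.
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> data;

    Frame(int w, int h)
        : width(w), height(h),
          data(static_cast<std::size_t>(w) * h * 3, 0) {}

    // Pointer to the three bytes of the pixel at column x, row y.
    // No bounds checking is done here.
    std::uint8_t* pixel(int x, int y) {
        return &data[(static_cast<std::size_t>(y) * width + x) * 3];
    }
};

// Example per-frame edit: invert every channel of every pixel in place.
void invert(Frame& frame) {
    for (std::uint8_t& value : frame.data) {
        value = static_cast<std::uint8_t>(255 - value);
    }
}
```

Only the frames currently being edited need to exist in this form; everything else can stay compressed on disk.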
You are probably not going to do that either by yourself - it is very hard. You probably need to use a framework, such as OpenCV. There are a huge number of standard filters and tools built in to these frameworks, and it may be that what you want to do is already implemented somewhere.
The OpenCV framework can return individual frames in a Mat object and you can then access the pixels. See this post Get Pixels from Mat
OpenCV
Tutorial page: Open CV Tutorial

Qt reading many images optimization - how to only read the size?

The title sums this one up. I'm loading ~200 images of various sizes. How can I load just the header so I can know the size of each image?
Currently I find it takes a lot of CPU, memory, and IO to load them all into memory just for the size (I'm trying to generate an atlas from them).
QImage doesn't seem to have a way to do this. QImageReader sounded like it was what I wanted, yet this still seems to just go ahead and read the whole image, so I'm not really sure what its purpose is. Is there another class, or some way to use either of the classes I've mentioned, to only grab the image size from the header?
How can I load just the header so I can know the size of each image?
It looks like you have assumed that the image file header (the first few bytes of the file) contains the size of the image. This does not hold true, at least not for every image format. I checked it for a few formats (PNG among them).
Currently I find it takes a lot of cpu/memory and IO to load them all in to memory just for the size
As you mentioned, you are trying to load around 200 images at once just to find their sizes. This design does not look good, and we should try to break the problem into smaller ones. An efficient approach might be to open one file, find the size, store it in some data structure, and close the file. If some other part of your program needs all ~200 images loaded into memory, then we should think about how to avoid that.
QImage doesn't seem to have a way to do this?
It does not, as there seems to be no portable, consistent way to do it for all image formats. However, if you know of a file format whose header contains the size, you may write a small helper function that opens the file, reads the header, and finds the size. This helper function would be very specific to that image format, and we may need to write different logic for each format (image formats have different header sizes and contents).
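As an illustration of such a format-specific helper, here is a minimal sketch for PNG only. It relies on the PNG layout in which the 8-byte signature is followed directly by the IHDR chunk, so the width and height are big-endian 32-bit values at byte offsets 16 and 20. The names ImageSize and pngSize are hypothetical, and other formats need entirely different logic.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct ImageSize {
    std::uint32_t width;
    std::uint32_t height;
};

// Reads a 32-bit big-endian unsigned integer starting at p.
static std::uint32_t readBe32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}

// Extracts the image size from the first 24 bytes of a PNG file.
// Returns false if the buffer is too short or is not a PNG.
bool pngSize(const unsigned char* data, std::size_t len, ImageSize& out) {
    static const unsigned char kSignature[8] =
        {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    if (len < 24) return false;
    if (std::memcmp(data, kSignature, 8) != 0) return false;
    if (std::memcmp(data + 12, "IHDR", 4) != 0) return false;
    out.width = readBe32(data + 16);
    out.height = readBe32(data + 20);
    return true;
}
```

With this approach only the first 24 bytes of each file have to be read, so the cost per image no longer depends on the image size.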

How can I process an image?

I'm building a program to convert an image file (whatever file type would be easiest) to G-Code for use on a rep-rap with a pen plotter attachment.
I'm wondering: if I wanted to process the image pixel by pixel and check things like pixel color, how could I do this with C++?
I would really like to know how I can process a bitmap image, pixel by pixel, to check the color of the pixel.
The best way is to use a library, like for example Magick++.
When you load an image, you can access its pixel data with Blob.
You will probably want to use an existing library that has been tested.
But for fun/practice/etc, this would be a good exercise and wouldn't be impossible to do. The Bitmap Format is (relatively) simple compared with other image formats. The Wikipedia page has tons of info, including some C++ code. It looks like once you've gotten past the header information, you get to a pixel array that shouldn't be difficult to parse.
Good luck.
Most image formats consist of a header and the actual raw image data. A bitmap image is no different. If you don't want to use one of the existing libraries, or if you are not allowed to, you should read about the bitmap format:
http://en.wikipedia.org/wiki/BMP_file_format
Once you understand this you could create appropriate structs/classes to store the information you want from the header, such as x,y size, bpp etc., and also have a pointer to the raw image data. You could then simply iterate through every pixel and do whatever you want with it :)
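A minimal sketch of such a struct and the header parsing might look as follows. It assumes the common layout of a 14-byte file header followed by a BITMAPINFOHEADER, with little-endian fields at the offsets given on the Wikipedia page; the names BmpInfo and parseBmpHeader are hypothetical, and older or compressed BMP variants are not handled.

```cpp
#include <cstddef>
#include <cstdint>

struct BmpInfo {
    std::uint32_t pixelOffset;   // byte offset of the pixel array in the file
    std::int32_t  width;         // in pixels
    std::int32_t  height;        // negative means rows are stored top-down
    std::uint16_t bitsPerPixel;
    std::uint32_t rowStride;     // bytes per row, padded to a multiple of 4
};

static std::uint16_t readLe16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t readLe32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Fills out from the first bytes of a BMP file held in memory.
// Returns false if the buffer is too short or lacks the "BM" magic.
bool parseBmpHeader(const unsigned char* data, std::size_t len, BmpInfo& out) {
    if (len < 30 || data[0] != 'B' || data[1] != 'M') return false;
    out.pixelOffset  = readLe32(data + 10);
    out.width        = static_cast<std::int32_t>(readLe32(data + 18));
    out.height       = static_cast<std::int32_t>(readLe32(data + 22));
    out.bitsPerPixel = readLe16(data + 28);
    if (out.width <= 0) return false;
    // Each row is padded so that its length is a multiple of 4 bytes.
    const std::uint64_t bitsPerRow =
        static_cast<std::uint64_t>(out.width) * out.bitsPerPixel;
    out.rowStride = static_cast<std::uint32_t>(((bitsPerRow + 31) / 32) * 4);
    return true;
}
```

For an uncompressed 24-bit image, the pixel at column x of stored row r then starts at byte pixelOffset + r * rowStride + x * 3, with the channels in blue, green, red order. Rows are stored bottom-up when height is positive.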
Once you decipher the image file, I suggest you place the pixels into a matrix, for the first pass. (Future revisions can use other methods to access the pixels).
You can apply transformations to the pixels by using matrix multiplication. You can also access the pixels individually by using array indexing.
Search the web and SO for "introduction to graphics c++".

Extracting basic info from animation file

I'm writing an application that handles metadata for images and all kinds of animations, so I'm looking for a way to find basic info about an animation file, e.g:
length (in minutes/seconds/frames)
aspect ratio of pixels
resolution of individual frames
framerate
Right now, I let my program execute
mplayer -identify animfile.avi
and parse its console output, which contains all the info I need in a machine-readable format. This works fine, but I know that some potential users of the program prefer vlc as a media player so I'd rather avoid having a hard dependence on mplayer being installed.
I've tried
vlc -vv animfile.avi
which prints an ungodly amount of junk on the console, sometimes containing the stuff I'm looking for. The formatting and what data gets printed seems to vary depending on the file format of the animation though.
Is there an easier way to extract basic info from an animation of any format one has a decoder for (especially the length of the animation) using vlc or some other app/library that is usually available on a typical Linux installation?
Edit: I'd rather use another program to do the dirty work, as this is supposed to work for any animation format, e.g avi, mpg, mov, wmv, vob etc.
Edit: totem-video-indexer seems more promising, and was also included with the standard installation. Enough codecs to make it useful, however, were not. That could be fixed by installing the "non-free-codecs" package from medibuntu.
The output of totem-video-indexer is very easy to parse:
TOTEM_INFO_DURATION=5217
TOTEM_INFO_HAS_VIDEO=True
TOTEM_INFO_VIDEO_WIDTH=720
TOTEM_INFO_VIDEO_HEIGHT=480
TOTEM_INFO_VIDEO_CODEC=XVID MPEG-4
TOTEM_INFO_FPS=30
TOTEM_INFO_HAS_AUDIO=True
TOTEM_INFO_AUDIO_BITRATE=50
TOTEM_INFO_AUDIO_CODEC=MPEG 1 Audio, Layer 3 (MP3)
TOTEM_INFO_AUDIO_SAMPLE_RATE=48000
TOTEM_INFO_AUDIO_CHANNELS=Stereo
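Parsing that output can be sketched in a few lines of C++. This assumes the tool's output has already been captured into a string (for example through popen) and that every useful line has the KEY=VALUE shape shown above; the function name is hypothetical.

```cpp
#include <map>
#include <sstream>
#include <string>

// Splits text into lines and stores each KEY=VALUE pair in a map.
// Lines without '=' or with an empty key are skipped. Only the first '='
// splits the line, so values may themselves contain '='.
std::map<std::string, std::string> parseKeyValueLines(const std::string& text) {
    std::map<std::string, std::string> result;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line[line.size() - 1] == '\r') {
            line.erase(line.size() - 1);  // tolerate CRLF line endings
        }
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos || eq == 0) continue;
        result[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return result;
}
```

Numeric fields such as TOTEM_INFO_VIDEO_WIDTH can then be converted with std::stoi or similar once the key has been looked up.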
mediainfo is a pretty useful program. It's LGPL, and is just a frontend for libmediainfo, which should be exactly what you want.
http://mediainfo.sf.net/
This is a little more difficult question than you may realize. The AVI file format grew over time, and often has nearly the same information in two or three different places. In some cases those are really supposed to agree (but sometimes don't) and in other cases they're subtly different.
Just for example, you asked about the width and height. There are actually four different width/height specs for a single frame: the screen width/height, the pixel width/height (from which you derive the pixel aspect ratio), the active width/height, and the compressed width/height. The frame width and height is the (theoretical) size of the screen. The active width/height excludes the overscan area. The compressed width/height takes into account rounding -- for example, JPEG compresses in blocks of 8x8 pixels, so the compressed width and height have to be multiples of 8 for a motion JPEG file. The active width/height tells you if (for example) some pixels at the border should be ignored.
In any case, since your question is tagged C++, I'm going to guess you'd rather read the file and get the data directly than depend on spawning something else to do the dirty work. If so, you probably want to look at the OpenDML AVI file spec. You can get at least some idea of the length, resolution, and framerate just from reading the basic AVI header, which is in a fixed spot at the beginning of the file, so that much is trivial to get. It'll take a bit more work to get to the pixel aspect ratio though...
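A minimal sketch of reading that basic header is shown below. It assumes the typical layout in which the file starts with RIFF/"AVI ", immediately followed by the "hdrl" list whose first chunk is "avih"; files that deviate from this are simply rejected. The struct and function names are hypothetical, and the values read here are the header's own claims, which, as noted above, may disagree with other parts of the file.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct AviMainInfo {
    std::uint32_t microSecPerFrame;  // frame period in microseconds
    std::uint32_t totalFrames;
    std::uint32_t width;
    std::uint32_t height;
};

static std::uint32_t readLe32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Parses the main AVI header from the first 72 bytes of a file.
bool parseAviMainHeader(const unsigned char* data, std::size_t len,
                        AviMainInfo& out) {
    if (len < 72) return false;
    if (std::memcmp(data, "RIFF", 4) != 0) return false;
    if (std::memcmp(data + 8, "AVI ", 4) != 0) return false;
    if (std::memcmp(data + 12, "LIST", 4) != 0) return false;
    if (std::memcmp(data + 20, "hdrl", 4) != 0) return false;
    if (std::memcmp(data + 24, "avih", 4) != 0) return false;
    const unsigned char* h = data + 32;  // start of the avih payload
    out.microSecPerFrame = readLe32(h + 0);
    out.totalFrames      = readLe32(h + 16);
    out.width            = readLe32(h + 32);
    out.height           = readLe32(h + 36);
    return true;
}

// Length of the clip in seconds according to the main header.
double durationSeconds(const AviMainInfo& info) {
    return static_cast<double>(info.totalFrames) * info.microSecPerFrame / 1e6;
}
```

The frame rate follows as 1e6 / microSecPerFrame. The pixel aspect ratio is not in this header at all, which is why it takes more work.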