I am trying to use vkCreateImage with a 3-component image (rgb).
But all the the rgb formats give:
vkCreateImage format parameter (VK_FORMAT_R8G8B8_xxxx) is an unsupported format
Does this mean that I have to reshape the data in memory? So add an empty byte after each 3, and then load it as RGBA?
I also noticed R8 and R8G8 formats do work, so I would guess the only reason RGB is not supported because 3 is not a power of two.
Before I actually do this reshaping of the data I'd like to know for sure that this is the only way, because it is not very good for performance and maybe there is some offset or padding value somewhere that will help loading the RGB data into an RGBA format. So can somebody confirm the reshaping into RGBA is a necessary step to load RGB formats (albeit with 33% overhead)?
Thanks in advance.
First, you're supposed to check to see what is supported before you try to create an image. You shouldn't rely on validation layers to stop you; that's just a debugging aid to catch something when you forgot to check. What is and is not supported is dynamic, not static. It's based on your implementation. So you have to ask every time your application starts whether the formats you intend to use are available.
And if they are not, then you must plan accordingly.
Second, yes, if your implementation does not support 3-channel formats, then you'll need to emulate them with a 4-channel format. You will have to re-adjust your data to fit your new format.
If you don't like doing that, I'm sure there are image editors you can use to load your image, add an opaque alpha of 1.0, and save it again.
Related
I have written a C/C++ implementation of what I term a "compositor" (I come from a video background) to composite/overlay video/graphics on the top of a video source. My current compositor implementation is rather naive and there is room for CPU optimization improvements (ex: SIMD, threading, etc).
I've created a high-level diagram of what I am currently doing:
The diagram is self explanatory. Nonetheless, I'll elaborate on some of the constraints:
The main video always comes served in an 8-bit YUV 4:2:2 packed format
The secondary video (optional) will come served in either an 8-bit YUV 4:2:2 or YUVA 4:2:2:4 packed format.
The output from the overlay must come out in an 8-bit YUV 4:2:2 packed format
Some other bits of information:
The number of graphics inputs will vary; it may (or may not) be a constant value.
The colour format of the Graphics can be pinned to either ARGB or YUVA format (ie. I can provide it as you see fit). At the moment, I pin it to YUVA to keep a consistent colour format.
The potential of using OpenGL and accompanying shaders is rather appealing:
No need to reinvent the wheel (in terms of actually performing the composition)
The possibility of using GPU where available.
My concern with using OpenGL is performance. Looking around on the web, it is my understanding that a YUV surface would be converted to RGB internally; I would like to minimize the number of colour format conversions and ensure optimal performance. Without prior OpenGL experience, I hope someone can shed some light and suggest if I'm about to venture down the wrong path.
Perhaps my concern relating to performance is less of an issue when using a dedicated GPU? Do I need to consider separate code paths:
Hardware with GPU(s)
Hardware with only CPU(s)?
Additionally, am I going to struggle when I need to process 10-bit YUV?
You should be able to treat YUV as independent channels throughout. OpenGL shaders will be calling them r, g, and b, but it's just data that can be treated as whatever you want.
Most GPUs will support 10 bits per channel (+ 2 alpha bits). Various will support 16 bits per channel for all 4 channels but I'm a little rusty here so I have no idea how common support is for this. Not sure about the 4:2:2 data, but you can always treat it as 3 separate surfaces.
The number of graphics inputs will vary; it may (or may not) be a constant value.
This is something I'm a little less sure about. Shaders like this to be predictable. If your implementation allows you to add each input iteratively then you should be fine.
As an alternative suggestion, have you looked into OpenCL?
I need a way to get the pixels of an already existing texture. Similarly to how D3DTexture's LockRect works with ReadOnly and NoSysLock. Some of my textures are also stored in compressed DXT1/3/5 formats, not entirely sure if that would affect anything. If those formats are simply decoded by Opengl and stored as raw pixels instead of in the compression. So would retrieving the pixels guarantee the same format that was used to set the texture with?
Generally you will want to use a PBO for reading pixels. Here's all the information you need on PBOs, click here
So would retrieving the pixels guarantee the same format that was used
to set the texture with?
It is possible to convert the format and retrieve the pixels at the same time. Look at the format conversion section on the page I linked.
I have a .yuv file and want to read it using C++. The issue is that I don't know it's resolution. So, is there any means to derive the resolution from the yuv file ?
Also, I don't know the number of frames. Hence, I can't use ftell() and divide it by the total frames.
As far as I know there's no .yuv file format. Some programs use that extension to store raw data, but it's not a format. Therefore there's no way to use the file unless you already know the metadata.
And you need to know in advance if the chroma samples are interleaved with the luminance ones or stored separately and if there's any bit padding and/or any row padding, just to be able to read the data from the file.
Then you also need to know in advance the frame resolution, the chroma subsampling and the color space in order to be able to correctly process that data.
I'm trying to use OpenCV to read/write images for me. Currently, I have them in a different, non-standard format, and I know how to get them into OpenCV's containers. Here are the requirements:
The pixels are 1, or 3 bands, U8, U16, U32, or F32
The images have metadata, random stuff, like the camera ID that took the images. I would like the metadata to be vi/notepad editable
I want to write as little code as possible when it comes to low level stuff. My experience is that this stuff requires the most maintenance.
I can define the format. It's only to read and write for these programs.
I don't want the pixels to be anything but binary, '0.5873499082' is way too much data for one float.
Is there a way to describe to OpenCV how to read and write image types it doesn't know? Are there image types already available for the types of images I have?
My interim solution is to use boost to serialize the image, and save the metadata in a separate file.
Try using gdal library for reading images and then convert it to IplImage.
OpenCV can't do that for you, you can store the metadata in a separate file, or you can use for example the jpeg exif (that won't be notepad editable though).
I'm building a program to convert an image file (whatever file type would be easiest) to G-Code for use on a rep-rap with a pen plotter attachment.
I'm wondering if i wanted to process the image pixel by pixel and check things like pixel color, how could I do this with C++?
I would really like to know how I can process a bitmap image, pixel by pixel, to check the color of the pixel.
The best way is to use a library, like for example Magick++.
When you load an image, you can access it's pixels data with Blob
You will probably want to use an existing library that has been tested.
But for fun/practice/etc, this would be a good exercise and wouldn't be impossible to do. The Bitmap Format is (relatively) simple compared with other image formats. The Wikipedia page has some tons of info, including some C++ code. It looks like once you've gotten past the header information, you get to a pixel array that shouldn't be difficult to parse.
Good luck.
Most image formats consist of a header and the actual raw image data. A bimpap image is no different. If you don't want to use one of the existing libraries, or if you are not allowed to, you should read about bitmap format :
http://en.wikipedia.org/wiki/BMP_file_format
Once you understand this you could create appropriate structs/classes to store the information you want from the header such as x,y size, bpp etc. And also have a pointer to the raw image data. You could then simpy iterate through every pixel and do whatever you want with it :)
Once you decipher the image file, I suggest you place the pixels into a matrix, for the first pass. (Future revisions can use other methods to access the pixels).
You can apply transformations to the pixels by using matrix multiplication. You can also access the pixels individually by using array indexing.
Search the web and SO for "introduction to graphics c++".