Reading contents of a qcow2 image using `bdrv_pread(..)` or alternatives - c++

I want to read the contents of a .qcow2 image using the bdrv_pread(...) function in QEMU.
Say the full path of my image is /path/to/myimage.qcow2. I want to be able to read 'n' bytes of data from this image at a particular offset. The bdrv_pread function takes the arguments 'BlockDriverState *bs, int64_t offset, void *buf, int count1'. How exactly do I initialize the BlockDriverState (device?) from the path of the image? All parameters other than BlockDriverState are clear to me.
Thanks.

If your goal is to access a qcow2 file from your own program, I would recommend not using the QEMU functions. They carry a lot of QEMU-specific state that is unnecessary if all you want to do is read the contents of the qcow2 file. Instead, you could look at the qcow2 specification or, if you want to work one level of abstraction higher, at the libguestfs library, which states that it has an API for accessing the supported VM disk formats (although I have never used it myself). There is some example code here that can help you get started.
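To give an idea of what working from the qcow2 specification looks like, here is a minimal sketch that parses a few fields of the fixed image header from bytes already read into memory. The struct and function names (Qcow2Header, parse_qcow2_header) are made up for this illustration, and only a subset of the header is decoded. Actually reading guest data would additionally require walking the L1/L2 tables, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Subset of the fixed qcow2 header. All on-disk integers are big-endian.
struct Qcow2Header {
    std::uint32_t version;
    std::uint32_t cluster_bits;     // cluster size is 1 << cluster_bits
    std::uint64_t virtual_size;     // guest-visible disk size in bytes
    std::uint64_t l1_table_offset;  // file offset of the L1 table
};

// Read an n-byte big-endian unsigned integer starting at p.
static std::uint64_t read_be(const std::uint8_t* p, std::size_t n) {
    assert(n <= 8);
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i) v = (v << 8) | p[i];
    return v;
}

// Parse the start of an image file (hypothetical helper).
// Returns false if the buffer is too short or the magic does not match.
bool parse_qcow2_header(const std::vector<std::uint8_t>& buf, Qcow2Header& out) {
    if (buf.size() < 48) return false;
    if (read_be(&buf[0], 4) != 0x514649FBu) return false;  // 'Q' 'F' 'I' 0xfb
    out.version         = static_cast<std::uint32_t>(read_be(&buf[4], 4));
    out.cluster_bits    = static_cast<std::uint32_t>(read_be(&buf[20], 4));
    out.virtual_size    = read_be(&buf[24], 8);
    out.l1_table_offset = read_be(&buf[40], 8);
    return true;
}
```

In practice you would fill the vector with the first bytes of /path/to/myimage.qcow2 (for example via std::ifstream in binary mode) before calling the parser.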

Related

Get raw buffer for in-memory dataset in GDAL C++ API

I have generated a GeoTiff dataset in-memory using GDALTranslate() with a /vsimem/ filepath. I need access to the buffer for the actual GeoTiff file to put it in a stream for an external API. My understanding is that this should be possible with VSIGetMemFileBuffer(), however I can't seem to get this to return anything other than nullptr.
My code is essentially as follows:
//^^ GDALDataset* srcDataset created somewhere up here ^^
//psOptions struct has "-b 4" and "-of GTiff" settings.
const char* filep = "/vsimem/foo.tif";
GDALDataset* gtiffData = GDALTranslate(filep, srcDataset, psOptions, nullptr);
vsi_l_offset size = 0;
GByte* buf = VSIGetMemFileBuffer(filep, &size, true); //<-- returns nullptr
gtiffData seems to be a real dataset on inspection: it has all the appropriate properties (number of bands, raster size, etc). When I provide a real filesystem location to GDALTranslate() rather than the /vsimem/ path and load the result in QGIS, it renders correctly too.
Looking at the source for VSIGetMemFileBuffer(), this should really only be returning nullptr if the file can't be found. This suggests I'm using it incorrectly. Does anyone know what the correct usage is?
Bonus points: Is there a better way to do this (stream the file out)?
Thanks!
I don't know anything about the C++ API. But in Python, the snippet below is what I sometimes use to get the contents of an in-memory file. In my case these are mainly VRTs, but it shouldn't be any different for other formats.
As said, though, I don't know whether the VSI API translates one-to-one to C++.
from osgeo import gdal
filep = "/vsimem/foo.tif"
# get the file size
stat = gdal.VSIStatL(filep, gdal.VSI_STAT_SIZE_FLAG)
# open file
vsifile = gdal.VSIFOpenL(filep, 'r')
# read entire contents
vsimem_content = gdal.VSIFReadL(1, stat.size, vsifile)
# close the handle once the contents have been read
gdal.VSIFCloseL(vsifile)
In the case of a VRT the content would be text, shown with something like print(vsimem_content.decode()). For a tiff it would of course be binary data.
I came back to this after putting in a workaround, and upon swapping things back over it seems to work fine. @mmomtchev suggested looking at the CPL_DEBUG output, which showed nothing unusual (and was silent during the actual VSIGetMemFileBuffer call).
In particular, for other reasons I had to put a GDALWarp call in between calling GDALTranslate and accessing the buffer, and it seems that this is what makes the difference. My guess is that GDALWarp is calling VSIFOpenL internally - although I can't find this in the source - and this does some kind of initialisation for VSIGetMemFileBuffer. Something to try for anyone else who encounters this.

I called av_probe_input_format3(), now I want to call avcodec_find_decoder(), how do I convert the format into a codec?

So... I'm dealing with a system that has input data coming in buffers (i.e. NOT a file). I want to determine which decoder to create to decompress an audio stream (MP3, WAV, OGG, ...). So obviously I do not know the input format in advance.
I found out that I could determine the format using the av_probe_input_format[23]() functions. That part works great, I get a format pointer that matches the files that I use as input.
AVInputFormat * format(av_probe_input_format3(&pd, true, &score));
I can print the format->name and format->long_name and these are the correct type (so the detection is working as expected).
Now, I'm trying to understand how to convert that AVInputFormat * into an AVCodec * so I can call avcodec_alloc_context3(codec) to create the actual audio decoder.
I found a couple of functions, which I used like so:
AVCodecID const codec_id(av_codec_get_id(format->codec_tag, format->raw_codec_id));
AVCodec * codec(avcodec_find_decoder(codec_id));
Problem 1. the raw_codec_id field is marked as "private" (should not access/use anywhere in your client's code).
Problem 2. the first function always returns AV_CODEC_ID_NONE (0) so of course the second call fails each time.
Am I doing something wrong? Is there a way to instead create a generic decoder that will automatically detect the type of audio I have as input? (That is, would that be the only way to make this work?)
Okay, so the fact is that trying to use these functions directly is pretty much futile. The problem I have with the design is that it forces me to actually have a callback and that callback forces me to have a thread (i.e. I have to somehow feed data from a stream, not a file or such!)
So I can use the avformat_open_input() as mentioned by Gyan, only I have to have my own AVIOContext. I was hoping I could just call functions with my incoming data and avoid the pipeline concept. The issue here is some background processes could be servers that use fork() and thus you need to be really careful (i.e. fork() is not friendly with threads).
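For what it's worth, the read callback itself does not have to involve a thread if the data is already sitting in a buffer. Below is a rough sketch of such a callback with the same shape as the read function that avio_alloc_context() takes (void *opaque, uint8_t *buf, int buf_size). It is written without the FFmpeg headers so it compiles on its own: MemoryReader and read_from_memory are names invented here, and kEndOfStream is only a stand-in for AVERROR_EOF. It also assumes that all the input (or at least enough of it for probing) has already been accumulated in memory, which may not match every setup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for FFmpeg's AVERROR_EOF so this sketch builds without FFmpeg.
constexpr int kEndOfStream = -1;

// State handed to the callback through the opaque pointer (hypothetical type).
struct MemoryReader {
    const std::uint8_t* data;  // start of the accumulated input
    std::size_t size;          // total number of bytes available
    std::size_t pos;           // how many bytes have been consumed so far
};

// Copy up to buf_size bytes into buf and return the number of bytes copied,
// or kEndOfStream once everything has been consumed.
int read_from_memory(void* opaque, std::uint8_t* buf, int buf_size) {
    assert(opaque != nullptr && buf_size >= 0);
    MemoryReader* r = static_cast<MemoryReader*>(opaque);
    const std::size_t remaining = r->size - r->pos;
    if (remaining == 0) return kEndOfStream;
    const std::size_t n = std::min(remaining, static_cast<std::size_t>(buf_size));
    std::memcpy(buf, r->data + r->pos, n);
    r->pos += n;
    return static_cast<int>(n);
}
```

With a context built around a callback like this, avformat_open_input() followed by avformat_find_stream_info() fills in the streams, and the codec ID for avcodec_find_decoder() can then be taken from the stream's codecpar->codec_id rather than from the AVInputFormat.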

C++ - Loading Base64 Encoded String of an image to Boost GIL image/view

I'm using Boost's Generic Image Library (GIL). I'm being given a Base64-encoded string representation of an image. After decoding it, could I directly make an Image or View object from that data? Or would I need to write the data to disk as example.png and use GIL's read_image functions? The documentation mentions dynamic images, but the I/O functions still take a filename as a parameter.
I would ideally be looking for a function that takes a string or byte array as a parameter rather than the image name to be loaded from disk. Something like GDI+ FromStream. I see that the documentation says "All functions take the filename or a device as the first parameter. A device could be a FILE*, std::ifstream, and TIFF*." Maybe it is possible to edit the contents of an ifstream to have the image data, not sure if this is actually possible though.
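As a sketch of the "put the image data in a stream" idea, the snippet below decodes a Base64 string with the standard library only and wraps the decoded bytes in an in-memory std::istringstream. The helper names (base64_decode, make_image_stream) are invented for this example. The quoted documentation lists std::ifstream as a device; whether your GIL version also accepts a general std::istream like this one is an assumption you would need to verify.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Decode standard Base64 (RFC 4648 alphabet). Decoding stops at '=' padding,
// and characters outside the alphabet (e.g. line breaks) are skipped.
std::string base64_decode(const std::string& in) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    assert(alphabet.size() == 64);
    std::string out;
    unsigned int acc = 0;  // bit accumulator
    int bits = 0;          // number of not-yet-emitted bits in acc
    for (char c : in) {
        if (c == '=') break;
        const std::string::size_type idx = alphabet.find(c);
        if (idx == std::string::npos) continue;
        acc = (acc << 6) | static_cast<unsigned int>(idx);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<char>((acc >> bits) & 0xFFu));
        }
    }
    return out;
}

// Wrap the decoded bytes in an in-memory stream opened in binary mode.
std::istringstream make_image_stream(const std::string& base64_text) {
    return std::istringstream(base64_decode(base64_text),
                              std::ios_base::in | std::ios_base::binary);
}
```

The resulting stream behaves like a file opened for reading, so nothing has to be written to disk first.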

OpenCV image dimensions without reading entire image

I'm using OpenCV and am reading gigabytes of images -- too much to fit into memory at a single time. However, I need to initialize some basic structures which require the image dimensions. At the moment I'm using imread and then freeing the image right away, and this is really inefficient.
Is there a way to get the image dimensions without reading the entire file, using opencv? If not could you suggest another library (preferably lightweight, seeing as that's all it'll be used for) that can parse the headers? Ideally it would support at least as many formats as OpenCV.
I don't think this is possible in opencv directly.
Although it isn't specified in the docs, Java's ImageReader.getHeight (and getWidth) only parse the image header, not the whole image.
Alternatively, here is a reasonable-looking lightweight library that definitely only checks the headers and supports a good number of image formats.
Finally, if you're on Linux, ImageMagick's 'identify' terminal command will also give you the dimensions, which you could then read in programmatically.
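To illustrate what "only parsing the header" means, here is a minimal standard-library sketch for one format, PNG, where the width and height sit at fixed positions in the first 24 bytes of the file (8-byte signature, then the IHDR chunk). ImageSize and png_dimensions are names made up for this example. Other formats such as JPEG need a different and more involved scan, so this is not a replacement for a proper library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct ImageSize {
    std::uint32_t width;
    std::uint32_t height;
};

// Read a 4-byte big-endian unsigned integer.
static std::uint32_t read_be32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}

// Extract width and height from the first 24 bytes of a PNG file.
// Returns false if the bytes do not start with the PNG signature and IHDR.
bool png_dimensions(const unsigned char* header, std::size_t len, ImageSize& out) {
    static const unsigned char signature[8] =
        {0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'};
    assert(header != nullptr);
    if (len < 24) return false;
    if (std::memcmp(header, signature, 8) != 0) return false;
    if (std::memcmp(header + 12, "IHDR", 4) != 0) return false;
    out.width  = read_be32(header + 16);
    out.height = read_be32(header + 20);
    return true;
}
```

You would read just the first 24 bytes of each file (for example with std::ifstream::read) and pass them in, so the pixel data is never loaded.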
You could use boost gil:
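#include <boost/gil/extension/io/jpeg_io.hpp>
#include <iostream>
int main(int argc, char *argv[])
{
    if (argc < 2) return 1; // expects the JPEG path as the first argument
    const char* file_path = argv[1];
    auto dims = boost::gil::jpeg_read_dimensions(file_path);
    int width = static_cast<int>(dims.x);
    int height = static_cast<int>(dims.y);
    std::cout << width << " x " << height << "\n";
}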
You will have to link against libjpeg, by adding -ljpeg flag to the linker. You can get some more information here.

FSCTL_GET_RETRIEVAL_POINTERS failure on very small file on a NT File System

My question is: how can I get the on-disk offset of a file if the file is (very important) small, i.e. less than one cluster, only a few bytes?
Currently I use this Windows API function:
DeviceIOControl(FileHandle, FSCTL_GET_RETRIEVAL_POINTERS, @InBuffer, SizeOf(InBuffer), @OutBuffer, SizeOf(OutBuffer), Num, Nil);
FirstExtent.Start := OutBuffer.Pair[0].LogicalCluster;
It works perfectly with files bigger than a cluster but it just fails with smaller files, as it always returns a null offset.
What is the procedure to follow with small files? Where are they located on an NTFS volume? Is there an alternative way to find a file's offset? This subtlety doesn't seem to be documented anywhere.
Note: the question is tagged as Delphi but C++ samples or examples would be appreciated as well.
The file is probably resident, meaning that its data is small enough to fit in its MFT entry. See here for a slightly longer description:
http://www.disk-space-guide.com/ntfs-disk-space.aspx
So you'd basically need to find the location of the MFT entry in order to know where the data is on disk. Do you control this file? If so the easiest thing to do is make sure that it's always larger than the size of an MFT entry (not a documented value, but you could always just do 4K or something).
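As a rough sketch of the "find the location of the MFT entry" step, the arithmetic below computes where a given MFT record starts on the volume. The function names are invented for this example, and several things are assumptions: the inputs would have to come from elsewhere (for instance the MftStartLcn, BytesPerCluster and BytesPerFileRecordSegment fields returned by FSCTL_GET_NTFS_VOLUME_DATA, and the 64-bit file index from GetFileInformationByHandle), and the calculation only holds if the MFT is contiguous from its first cluster up to that record, which is not guaranteed.

```cpp
#include <cassert>
#include <cstdint>

// The low 48 bits of an NTFS file reference are the MFT record number;
// the high 16 bits are a sequence number.
std::uint64_t mft_record_number(std::uint64_t file_reference) {
    return file_reference & 0x0000FFFFFFFFFFFFULL;
}

// Byte offset of an MFT record from the start of the volume.
// ASSUMPTION: the MFT is contiguous from mft_start_lcn up to this record.
std::uint64_t mft_record_offset(std::uint64_t mft_start_lcn,
                                std::uint32_t bytes_per_cluster,
                                std::uint32_t bytes_per_record,
                                std::uint64_t record_number) {
    assert(bytes_per_cluster != 0 && bytes_per_record != 0);
    return mft_start_lcn * bytes_per_cluster + record_number * bytes_per_record;
}
```

This only locates the record itself. The resident data sits inside the record's data attribute, at a position that depends on the attributes stored before it, so the record still has to be parsed to find the exact bytes.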