In a camera application, bitmap pixel arrays are retrieved from a streaming camera.
The pixel arrays are captured by writing them to a named pipe; on the other end of the pipe, ffmpeg retrieves them and creates an AVI file.
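For context, the receiving end would be an ffmpeg invocation along these lines (the pipe name, frame rate and output codec here are illustrative assumptions, not taken from the actual setup):

ffmpeg -f rawvideo -pix_fmt gray -s 658x492 -r 25 -i \\.\pipe\camera_frames -c:v rawvideo out.avi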
I need to create one custom frame (with custom text on it) and pipe its pixels as the first frame of the resulting movie.
The question is how I can use a TBitmap (for convenience) to:

1. Create an X by Y monochrome (8-bit) bitmap from scratch, with custom text on it. I want the background to be white and the text to be black. (Mostly figured this step out, see below.)
2. Retrieve the pixel array that I can send/write to the pipe.
Step 1: The following code creates a TBitmap and writes text on it:
int w = 658;
int h = 492;
TBitmap* bm = new TBitmap();
bm->Width = w;
bm->Height = h;
bm->HandleType = bmDIB;
bm->PixelFormat = pf8bit;
bm->Canvas->Font->Name = "Tahoma";
bm->Canvas->Font->Size = 8;
int textY = 10;
string info("some Text");
bm->Canvas->TextOut(10, textY, info.c_str());
The above basically concludes step 1.
The writing/piping code expects a byte array with the bitmap's pixels; e.g.
unsigned long numWritten;
WriteFile(mPipeHandle, pImage, size, &numWritten, NULL);
where pImage is a pointer to an unsigned char buffer (the bitmap's pixels), and size is the length of this buffer.
Update:
Using the generated TBitmap and a TMemoryStream for transferring data to the ffmpeg pipeline does not generate the proper result. I get a distorted image with 3 diagonal lines on it.
The buffer size of the camera frames that I receive is exactly 323736, which equals the number of pixels in the image, i.e. 658x492.
NOTE: I have concluded that this camera 'bitmap' is not padded, even though 658 is not divisible by four.
The buffer size I get after dumping my generated bitmap to a memory stream, however, is 325798, which is 2062 bytes larger than it is supposed to be. As @Spektre pointed out below, this discrepancy comes from row padding plus the BMP headers.
I use the following code for getting the pixel array:
ByteBuffer CustomBitmap::getPixArray()
{
    // --- Local variables --- //
    unsigned int iInfoHeaderSize = 0;
    unsigned int iImageSize = 0;
    BITMAPINFO *pBitmapInfoHeader;
    unsigned char *pBitmapImageBits;

    // First we call GetDIBSizes() to determine the amount of
    // memory that must be allocated before calling GetDIB().
    // NB: GetDIBSizes() is a part of the VCL.
    GetDIBSizes(mTheBitmap->Handle, iInfoHeaderSize, iImageSize);

    // Next we allocate memory according to the information
    // returned by GetDIBSizes(). Note: iInfoHeaderSize is a byte
    // count, so allocate a byte buffer rather than an array of
    // BITMAPINFO structures.
    pBitmapInfoHeader = (BITMAPINFO*) new unsigned char[iInfoHeaderSize];
    pBitmapImageBits = new unsigned char[iImageSize];

    // Call GetDIB() to convert a device dependent bitmap into a
    // Device Independent Bitmap (a DIB).
    // NB: GetDIB() is a part of the VCL.
    GetDIB(mTheBitmap->Handle, mTheBitmap->Palette,
           pBitmapInfoHeader, pBitmapImageBits);
    delete [] (unsigned char*) pBitmapInfoHeader;

    ByteBuffer buf;
    buf.buffer = pBitmapImageBits;
    buf.size = iImageSize;
    return buf;
}
So the final challenge seems to be to get a byte array that has the same size as the ones coming from the camera. How do I find and remove the padding bytes in the TBitmap data?
TBitmap has a PixelFormat property to set the bit depth.
TBitmap has a HandleType property to control whether a DDB or a DIB is created. DIB is the default.
Since you are passing BMPs around between different systems, you really should be using DIBs instead of DDBs, to avoid any corruption/misinterpretation of the pixel data.
Also, this line of code:
Image1->Picture->Bitmap->Handle = bm->Handle;
Should be changed to this instead:
Image1->Picture->Bitmap->Assign(bm);
// or:
// Image1->Picture->Bitmap = bm;
Or this:
Image1->Picture->Assign(bm);
Either way, don't forget to delete bm; afterwards, since the TPicture makes a copy of the input TBitmap, it does not take ownership.
To get the BMP data as a buffer of bytes, you can use the TBitmap::SaveToStream() method, saving to a TMemoryStream. Or, if you just want the pixel data without the complete BMP data (ie, without the BMP headers - see Bitmap Storage), you can use the Win32 GetDIBits() function, which outputs the pixels in DIB format. You can't obtain a byte buffer of the pixels for a DDB, since they depend on the device they are rendered to. DDBs are only usable in-memory in conjunction with HDCs; you can't pass them around. But you can convert a DIB to a DDB once you have a final device to render it to.
In other words, get the pixels from the camera, save them to a DIB, pass that around as needed (ie, over the pipe), and then do whatever you need with it - save to a file, convert to DDB to render onscreen, etc.
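For the SaveToStream() route, a minimal sketch (note the stream holds the complete BMP file image, file header, info header, palette and padded pixel rows included, which is why its size comes out larger than width*height):

TMemoryStream *ms = new TMemoryStream;
bm->SaveToStream(ms);
unsigned char *pData = (unsigned char*) ms->Memory; // complete BMP file image
int size = (int) ms->Size;                          // headers + palette + padded rows
// ... use pData/size, then:
delete ms;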
This is just an add-on to the existing answer (with additional info after the OP's edit).
The bitmap file format aligns each row to a 4-byte boundary, so there are usually some bytes at the end of each line that are not pixels. Those extra bytes create the skew and the diagonal-looking lines. In your case each 658-pixel row is padded to 660 bytes (2 align bytes per row), and the stream dump also contains the headers and the 256-entry palette of an 8-bit BMP (14 + 40 + 1024 = 1078 bytes):

(xs + align)*ys + header = size
(658 + 2)*492 + 1078 = 325798

But beware: the align size depends on the image width, and the header size on the BMP version and pixel format...
Try this instead:
// create bmp
Graphics::TBitmap *bmp = new Graphics::TBitmap;
// bmp->Assign(???);           // a) copy image from ???
bmp->SetSize(658, 492);        // b) in case you use Assign do not change resolution
bmp->HandleType = bmDIB;
bmp->PixelFormat = pf8bit;
// bmp->Canvas->Draw(0,0,???); // b) copy image from ???

// here render your text
bmp->Canvas->Brush->Style = bsSolid;
bmp->Canvas->Brush->Color = clWhite;
bmp->Canvas->Font->Color = clBlack;
bmp->Canvas->Font->Name = "Tahoma";
bmp->Canvas->Font->Size = 8;
bmp->Canvas->TextOutA(5, 5, "Text");

// Byte data
for (int y = 0; y < bmp->Height; y++)
{
    BYTE *p = (BYTE*)bmp->ScanLine[y]; // pf8bit -> BYTE*
    // here send/write/store ... bmp->Width bytes from p[]
}

// Canvas->Draw(0,0,bmp); // just render it on the Form
delete bmp; bmp = NULL;
Mixing GDI WinAPI calls for pixel-array access (BitBlt etc.) with a VCL bmDIB bitmap might cause problems and resource leaks (hence the error on exit), and it is also slower than using ScanLine[] (if coded right), so I strongly advise using the native VCL functions (as in the example above) instead of the GDI/WinAPI calls where you can.
for more info see:
#4. GDI Bitmap
Delphi / C++ builder Windows 10 1709 bitmap operations extremely slow
Draw tbitmap with scale and alpha channel faster
Also, you mention your image source is a camera. If you use pf8bit, it means palette-indexed color, which is relatively slow and ugly if the native GDI algorithm is used to convert from the true/hi-color camera image. For a better transform see:
Effective gif/image color quantization?
simple dithering
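If you want to keep the GetDIB() approach from the question instead of ScanLine[], you can also strip the row padding from the returned buffer yourself. A minimal sketch, assuming pf8bit and the 658x492 resolution (so GetDIB() returns 660*492 = 324720 bytes); note that a bottom-up DIB stores rows last-to-first, so you may also need to flip the row order:

ByteBuffer buf = getPixArray();      // as in the question; call shown for illustration
int w = 658, h = 492;
int stride = (w + 3) & ~3;           // 8-bit rows padded to 4-byte multiples -> 660
unsigned char *packed = new unsigned char[w * h];
for (int y = 0; y < h; y++)
    memcpy(packed + y * w, buf.buffer + y * stride, w);
// 'packed' now holds exactly w*h = 323736 bytes, like the camera frames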
Related
I've been trying to load compressed images with S3TC (BC/DXT) compression in Vulkan, but so far I haven't had much luck.
Here is what the Vulkan specification says about compressed images:
https://www.khronos.org/registry/dataformat/specs/1.1/dataformat.1.1.html#S3TC:
Compressed texture images stored using the S3TC compressed image formats are represented as a collection of 4×4 texel blocks, where each block contains 64 or 128 bits of texel data. The image is encoded as a normal 2D raster image in which each 4×4 block is treated as a single pixel.
https://www.khronos.org/registry/vulkan/specs/1.0/xhtml/vkspec.html#resources-images:
For images created with linear tiling, rowPitch, arrayPitch and depthPitch describe the layout of the subresource in linear memory. For uncompressed formats, rowPitch is the number of bytes between texels with the same x coordinate in adjacent rows (y coordinates differ by one). arrayPitch is the number of bytes between texels with the same x and y coordinate in adjacent array layers of the image (array layer values differ by one). depthPitch is the number of bytes between texels with the same x and y coordinate in adjacent slices of a 3D image (z coordinates differ by one). Expressed as an addressing formula, the starting byte of a texel in the subresource has address:
// (x,y,z,layer) are in texel coordinates
address(x,y,z,layer) = layer*arrayPitch + z*depthPitch + y*rowPitch + x*texelSize + offset
For compressed formats, the rowPitch is the number of bytes between compressed blocks in adjacent rows. arrayPitch is the number of bytes between blocks in adjacent array layers. depthPitch is the number of bytes between blocks in adjacent slices of a 3D image.
// (x,y,z,layer) are in block coordinates
address(x,y,z,layer) = layer*arrayPitch + z*depthPitch + y*rowPitch + x*blockSize + offset;
arrayPitch is undefined for images that were not created as arrays. depthPitch is defined only for 3D images.
For color formats, the aspectMask member of VkImageSubresource must be VK_IMAGE_ASPECT_COLOR_BIT. For depth/stencil formats, aspect must be either VK_IMAGE_ASPECT_DEPTH_BIT or VK_IMAGE_ASPECT_STENCIL_BIT. On implementations that store depth and stencil aspects separately, querying each of these subresource layouts will return a different offset and size representing the region of memory used for that aspect. On implementations that store depth and stencil aspects interleaved, the same offset and size are returned and represent the interleaved memory allocation.
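Expressed in code, the compressed-format formula reads roughly as follows (a sketch against the C API's VkSubresourceLayout; x, y, z are block coordinates and blockSize is the per-block byte count, as in the spec text above):

// Byte address of the compressed block at block coordinates (x,y,z,layer)
VkDeviceSize blockAddress(const VkSubresourceLayout *layout,
                          VkDeviceSize x, VkDeviceSize y, VkDeviceSize z,
                          VkDeviceSize layer, VkDeviceSize blockSize)
{
    return layer * layout->arrayPitch + z * layout->depthPitch
         + y * layout->rowPitch + x * blockSize + layout->offset;
}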
My image is a normal 2D image (1 array layer, 1 mip level), so there's no arrayPitch or depthPitch to worry about. Since S3TC compression is directly supported by the hardware, it should be possible to use the image data without decompressing it first. In OpenGL this can be done using glCompressedTexImage2D, and this has worked for me in the past.
In OpenGL I've used GL_COMPRESSED_RGBA_S3TC_DXT1_EXT as image format, for Vulkan I'm using VK_FORMAT_BC1_RGBA_UNORM_BLOCK, which should be equivalent.
Here's my code for mapping the image data:
auto dds = load_dds("img.dds");
auto *srcData = static_cast<uint8_t*>(dds.data());
auto *destData = static_cast<uint8_t*>(vkImageMapPtr); // pointer to mapped memory of the VkImage
destData += layout.offset(); // layout = VkSubresourceLayout of the image

assert((w % 4) == 0);
assert((h % 4) == 0);
assert(blockSize == 8); // S3TC BC1

auto wBlocks = w / 4;
auto hBlocks = h / 4;
for(auto y = decltype(hBlocks){0}; y < hBlocks; ++y)
{
    auto *rowDest = destData + y * layout.rowPitch(); // rowPitch is 0
    auto *rowSrc = srcData + y * (wBlocks * blockSize);
    for(auto x = decltype(wBlocks){0}; x < wBlocks; ++x)
    {
        auto *pxDest = rowDest + x * blockSize;
        auto *pxSrc = rowSrc + x * blockSize; // 4x4 image block
        memcpy(pxDest, pxSrc, blockSize);     // 64 bits per block
    }
}
And here's the code for initializing the image:
vk::Device device = ...; // Initialization
vk::AllocationCallbacks allocatorCallbacks = ...; // Initialization
[...] // Load the dds data
uint32_t width = dds.width();
uint32_t height = dds.height();
auto format = dds.format(); // = vk::Format::eBc1RgbaUnormBlock;
vk::Extent3D extent(width,height,1);
vk::ImageCreateInfo imageInfo(
vk::ImageCreateFlagBits(0),
vk::ImageType::e2D,format,
extent,1,1,
vk::SampleCountFlagBits::e1,
vk::ImageTiling::eLinear,
vk::ImageUsageFlagBits::eSampled | vk::ImageUsageFlagBits::eColorAttachment,
vk::SharingMode::eExclusive,
0,nullptr,
vk::ImageLayout::eUndefined
);
vk::Image img = nullptr;
device.createImage(&imageInfo,&allocatorCallbacks,&img);
vk::MemoryRequirements memRequirements;
device.getImageMemoryRequirements(img,&memRequirements);
uint32_t typeIndex = 0;
get_memory_type(memRequirements.memoryTypeBits(),vk::MemoryPropertyFlagBits::eHostVisible,typeIndex); // -> typeIndex is set to 1
auto szMem = memRequirements.size();
vk::MemoryAllocateInfo memAlloc(szMem,typeIndex);
vk::DeviceMemory mem;
device.allocateMemory(&memAlloc,&allocatorCallbacks,&mem); // Note: Using the default allocation (nullptr) doesn't change anything
device.bindImageMemory(img,mem,0);
uint32_t mipLevel = 0;
vk::ImageSubresource resource(
vk::ImageAspectFlagBits::eColor,
mipLevel,
0
);
vk::SubresourceLayout layout;
device.getImageSubresourceLayout(img,&resource,&layout);
auto *srcData = device.mapMemory(mem,0,szMem,vk::MemoryMapFlagBits(0));
[...] // Map the dds-data (See code from first post)
device.unmapMemory(mem);
The code runs without issues; however, the resulting image isn't correct. (The source image and the distorted result were attached as screenshots.)
I'm certain that the problem lies in the first code snippet I've posted; however, in case it doesn't, I've written a small adaptation of the triangle demo from the Vulkan SDK which produces the same result. It can be downloaded here. The source code is included; all I've changed from the triangle demo are the demo_prepare_texture_image function in tri.c (lines 803 to 903) and the dds.cpp and dds.h files. dds.cpp contains the code for loading the dds and mapping the image memory.
I'm using gli to load the dds-data (which is supposed to "work perfectly with Vulkan"); it is also included in the download above. To build the project, the Vulkan SDK include directory has to be added to the "tri" project, and the path to the dds has to be changed (tri.c, line 809).
The source image ("x64/Debug/test.dds" in the project) uses DXT1 compression. I've tested it on different hardware as well, with the same result.
Any example code for initializing/mapping compressed images would also help a lot.
Your problem is actually quite simple: in the demo_prepare_textures function, on the first line, there is a variable tex_format, which is set to VK_FORMAT_B8G8R8A8_UNORM (which is what it is in the original sample). This eventually gets used to create the VkImageView. If you just change this to VK_FORMAT_BC1_RGBA_UNORM_BLOCK, it displays the texture correctly on the triangle.
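That is, at the top of demo_prepare_textures in tri.c, roughly:

/* was: const VkFormat tex_format = VK_FORMAT_B8G8R8A8_UNORM; */
const VkFormat tex_format = VK_FORMAT_BC1_RGBA_UNORM_BLOCK; /* match the DXT1 data */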
As an aside, you can verify that your texture loaded correctly with RenderDoc, which comes with the Vulkan SDK installation. Doing a capture and looking at the Inputs tab in the Texture Viewer shows that the texture looks identical to the one on disk, even with the incorrect format.
I have a program which runs in a window using OpenGL (VS2012 with freeglut 2.8.1). Basically at every time step (run via a call to glutPostRedisplay from my glutIdleFunc hook) I call my own draw function followed by a call to glFlush to display the result. Then I call my own screenShot function which uses the glReadPixels function to dump the pixels to a tga file.
The problem with this setup is that the files are empty when the window gets minimised. That is to say, the output from glReadPixels is empty; how can I avoid this?
Here is a copy of the screenShot function I am using (I am not the copyright holder):
//////////////////////////////////////////////////
// Grab the OpenGL screen and save it as a .tga //
// Copyright (C) Marius Andra 2001 //
// http://cone3d.gz.ee EMAIL: cone3d@hot.ee //
//////////////////////////////////////////////////
// (modified by me a little)
int screenShot(int const num)
{
    typedef unsigned char uchar;
    // we will store the image data here
    uchar *pixels;
    // the thingy we use to write files
    FILE *shot;
    // we get the width/height of the screen into this array
    int screenStats[4];

    // get the width/height of the window
    glGetIntegerv(GL_VIEWPORT, screenStats);

    // generate an array large enough to hold the pixel data
    // (width*height*bytesPerPixel)
    pixels = new unsigned char[screenStats[2]*screenStats[3]*3];
    // read in the pixel data, TGA's pixels are BGR aligned
    // (GL_BGR; the original used the magic constant 0x80E0)
    glReadPixels(0, 0, screenStats[2], screenStats[3], GL_BGR,
                 GL_UNSIGNED_BYTE, pixels);

    // open the file for writing. If unsuccessful, return 1
    std::string filename = kScreenShotFileNamePrefix + Function::Num2Str(num) + ".tga";
    shot = fopen(filename.c_str(), "wb");
    if (shot == NULL)
    {
        delete [] pixels; // don't leak the pixel buffer on failure
        return 1;
    }

    // this is the tga header it must be in the beginning of
    // every (uncompressed) .tga
    uchar TGAheader[12] = {0,0,2,0,0,0,0,0,0,0,0,0};
    // the header that is used to get the dimensions of the .tga
    // header[1]*256+header[0] - width
    // header[3]*256+header[2] - height
    // header[4] - bits per pixel
    // header[5] - ?
    uchar header[6] = {((int)(screenStats[2]%256)),
                       ((int)(screenStats[2]/256)),
                       ((int)(screenStats[3]%256)),
                       ((int)(screenStats[3]/256)), 24, 0};

    // write out the TGA header
    fwrite(TGAheader, sizeof(uchar), 12, shot);
    // write out the header
    fwrite(header, sizeof(uchar), 6, shot);
    // write the pixels
    fwrite(pixels, sizeof(uchar),
           screenStats[2]*screenStats[3]*3, shot);

    // close the file
    fclose(shot);
    // free the memory
    delete [] pixels;

    // return success
    return 0;
}
So how can I print the screenshot to a TGA file regardless of whether Windows decides to actually display the content on the monitor?
Note: Because I am trying to keep a visual record of the progress of a simulation, I need to print every frame, regardless of whether it is being rendered. I realise that last statement is a bit of a contradiction, since I need to render the frame in order to produce the screengrab. To rephrase: I need glReadPixels (or some alternative function) to produce the updated state of my program at every step so that I can print it to a file, regardless of whether Windows chooses to display it.
Sounds like you're running afoul of the pixel ownership problem.
Render to a FBO and use glReadPixels() to slurp images out of that instead of the front buffer.
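A minimal sketch of that approach, assuming the framebuffer-object entry points are available (e.g. via GLEW on Windows) and that width, height and pixels are as in the screenShot code:

// Render into an offscreen FBO; we own its contents, so they survive
// the window being minimised or covered.
GLuint fbo = 0, rbo = 0;
glGenFramebuffers(1, &fbo);
glGenRenderbuffers(1, &rbo);
glBindRenderbuffer(GL_RENDERBUFFER, rbo);
glRenderbufferStorage(GL_RENDERBUFFER, GL_RGB8, width, height);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                          GL_RENDERBUFFER, rbo);
if (glCheckFramebufferStatus(GL_FRAMEBUFFER) == GL_FRAMEBUFFER_COMPLETE)
{
    // ... draw the frame as usual ...
    glReadPixels(0, 0, width, height, GL_BGR, GL_UNSIGNED_BYTE, pixels);
}
glBindFramebuffer(GL_FRAMEBUFFER, 0); // back to the window's framebuffer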
I would suggest keeping the last rendered frame stored in memory and updating that memory's contents whenever an update is called and there is actual pixel data in the new render. Alternatively you could use the accumulation buffer, though I can't quite recall how it stores older frames (it may just update so fast that it ends up storing no render data either).
Another solution might be to use a shader to manually render each frame and write the result to a file.
This may well have come up before but the following code is taken from an MSDN example I am modifying. I want to know how I can iterate through the contents of the buffer which contains data about a bitmap and print out the colors. Each pixel is 4 bytes of data so I am assuming the R G B values account for 3 of these bytes, and possibly A is the 4th.
What is the correct C++ syntax for the pointer arithmetic required (ideally inside a loop) that will store the value pointed to during that iteration into a local variable that I can use, e.g. print to the console?
Many thanks
PS. Is this safe? Or is there a safer way to read the contents of an IMFMediaBuffer? I could not find an alternative.
Here is the code:
hr = pSample->ConvertToContiguousBuffer(&pBuffer); // this is the BitmapData
// Converts a sample with multiple buffers into a sample with a single IMFMediaBuffer which we Lock in memory next...
// IMFMediaBuffer represents a block of memory that contains media data
hr = pBuffer->Lock(&pBitmapData, NULL, &cbBitmapData); // pBuffer is IMFMediaBuffer
/* Lock method gives the caller access to the memory in the buffer, for reading or writing:
pBitmapData - receives a pointer to start of buffer
NULL - receives the maximum amount of data that can be written to the buffer. This parameter can be NULL.
cbBitmapData - receives the length of the valid data in the buffer, in bytes. This parameter can be NULL.
*/
I solved the problem myself and thought it best to add the answer here so that it formats correctly and maybe others will benefit from it. Basically, in this situation we have 32 bits per pixel of image data, and conveniently we are reading raw memory, so there is no bitmap header to skip; it is just raw color information.
NOTE: Across these 4 bytes we have (from byte 0 to byte 3) B G R A, which we can verify with my code:
int x = 0;
while(x < cbBitmapData){
    Console::Write("B: {0}", (*(pBitmapData + x++)));
    Console::Write("\tG: {0}", (*(pBitmapData + x++)));
    Console::Write("\tR: {0}", (*(pBitmapData + x++)));
    Console::Write("\tA: {0}\n", (*(pBitmapData + x++)));
}
From the output you will see that the A value is 0 for each pixel because there is no concept of transparency or depth here, which is what we expect.
Also, to verify that all we have in the buffer is raw image data and nothing else, I used this calculation, which you may also find of use:
Console::Write("no of pixels in buffer: {0} \nexpected no of pixels based on dimensions:{1}", (cbBitmapData/4), (m_format.imageWidthPels * m_format.imageHeightPels) );
We divide the value of cbBitmapData by 4 because it is a count of bytes, and as mentioned above each pixel occupies 4 bytes (a 32-bit DWORD). We compare this to the image width multiplied by its height. They are equal, and thus we have just pixel color information in the buffer.
Hope this helps someone.
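On the "is this safe?" question from the post: the main rules are to check each HRESULT and to balance Lock() with Unlock(). A minimal native C++ sketch (same names as in the question, without the C++/CLI console output):

IMFMediaBuffer *pBuffer = NULL;
BYTE *pBitmapData = NULL;
DWORD cbBitmapData = 0;
HRESULT hr = pSample->ConvertToContiguousBuffer(&pBuffer);
if (SUCCEEDED(hr))
    hr = pBuffer->Lock(&pBitmapData, NULL, &cbBitmapData);
if (SUCCEEDED(hr))
{
    // ... read pBitmapData[0 .. cbBitmapData-1] here ...
    pBuffer->Unlock();   // always release the lock when done
}
if (pBuffer)
    pBuffer->Release();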
I've been working for a while on image processing and I've noticed weird things.
I'm reading a BMP file, using simple methods like ReadFile and stuff, and using Microsoft's BMP structures.
Here is the code:
ReadFile(_bmpFile, &bfh, sizeof(bfh), &data, NULL);
ReadFile(_bmpFile, &bih, sizeof(bih), &data, NULL);
imagesize = bih.biWidth*bih.biHeight;
image = new RGBQUAD[imagesize];
ReadFile(_bmpFile, image, imagesize*sizeof(RGBQUAD), &written, NULL);
That is how I read the file and then I'm turning it into gray scale using a simple for-loop.
for (int i = 0; i < imagesize; i++)
{
    RED = image[i].rgbRed;
    GREEN = image[i].rgbGreen;
    BLUE = image[i].rgbBlue;
    avg = (RED + GREEN + BLUE) / 3;
    image[i].rgbRed = avg;
    image[i].rgbGreen = avg;
    image[i].rgbBlue = avg;
}
Now when I write the file using this code:
#pragma pack(push, 1)
WriteFile(_bmpFile, &bfh, sizeof(bfh), &data, NULL);
WriteFile(_bmpFile, &bih, sizeof(bih), &data, NULL);
WriteFile(_bmpFile, image, imagesize*sizeof(RGBQUAD), &written, NULL);
#pragma pack(pop)
The file gets much bigger (30MB -> 40MB).
The reason is that I'm using RGBQUAD instead of RGBTRIPLE; but if I use RGBTRIPLE I have a problem converting small pictures into
grayscale - I can't open the picture after creating it (it says it's not in the right structure).
Also the file size comes out short (1174kb before, 1173kb after).
Has anybody seen this before (it only occurs with small pictures)?
In a BMP file, every scan line has to be padded out so the next scan line starts on a 32-bit boundary. If you do 32 bits per pixel, that happens automatically, but if you use 24 bits per pixel, you'll need to add code to do it explicitly.
You are ignoring the stride (see Jerry's comment) and the pixel format of the bitmap, which is 24bpp judging by the file size increase; you are writing it as though it were 32bpp. Your grayscale conversion is also wrong: the human eye isn't equally sensitive to red, green and blue.
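For reference, a common choice is the ITU-R BT.601 luma weighting; a sketch of how the loop above might use it (the weights are the standard values, not something from this answer):

// Rec. 601 luma instead of the plain average
BYTE gray = (BYTE)(0.299 * RED + 0.587 * GREEN + 0.114 * BLUE);
image[i].rgbRed   = gray;
image[i].rgbGreen = gray;
image[i].rgbBlue  = gray;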
Consider using GDI+, you #include <gdiplus.h> in your code to use the Bitmap class. Its LockBits() method gives you access to the bitmap bits. The ColorMatrixEffect class lets you apply a color transformation in a single operation. Check this answer for the color matrix you need to get a grayscale image. The MSDN docs start here.
Each horizontal row in a BMP must be a multiple of 4 bytes long.
If the pixel data does not take up a multiple of 4 bytes, then 0x00 bytes are added at the end of the row. For a 24-bpp image, the number of bytes per row is (imageWidth*3 + 3) & ~3. The number of padding bytes is ((imageWidth*3 + 3) & ~3) - (imageWidth*3).
This was answered by immibis.
I would like to add that the size of the array is ((imageWidth*3 + 3) & ~3) * imageHeight.
I hope this helps
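Putting those formulas into code (24 bpp assumed, as above):

int rowSize   = (imageWidth * 3 + 3) & ~3;  // bytes per padded row
int padding   = rowSize - imageWidth * 3;   // 0..3 padding bytes per row
int arraySize = rowSize * imageHeight;      // total size of the pixel array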
I'm reading the mouse cursor pixmap data from the StdFBShmem_t structure, as defined in the IOFramebufferShared API.
Everything works fine 90% of the time. However, I have noticed that some applications on the Mac set a cursor in a different format. According to the documentation for the data structures, the cursor pixmap format should always be in the same format as the frame buffer. My frame buffer is 32 bpp. I expect the pixmap data to be in the format 0xAARRGGBB, which it is (most of the time). However, in some cases, I'm reading data that looks like a mask. Specifically, the pixels in this data will either be 0x00FFFFFF or 0x00000000. This looks to me like a mask for separate pixel data stored somewhere else.
As far as I can tell, the only application that uses this cursor pixel format is Qt Creator, but I need to work with all applications, so I'd like to sort this out.
The code I'm using to read the cursor pixmap data is:
NSAutoreleasePool *autoReleasePool = [[NSAutoreleasePool alloc] init];
NSPoint mouseLocation = [NSEvent mouseLocation];
NSArray *allScreens = [NSScreen screens];
NSEnumerator *screensEnum = [allScreens objectEnumerator];
NSScreen *screen;
NSDictionary *screenDesc = nil;
while ((screen = [screensEnum nextObject]))
{
    NSRect screenFrame = [screen frame];
    screenDesc = [screen deviceDescription];
    if (NSMouseInRect(mouseLocation, screenFrame, NO))
        break;
}
if (screen)
{
    kern_return_t err;
    CGDirectDisplayID displayID = (CGDirectDisplayID) [[screenDesc objectForKey:@"NSScreenNumber"] pointerValue];
    task_port_t taskPort = mach_task_self();
    io_service_t displayServicePort = CGDisplayIOServicePort(displayID);
    io_connect_t displayConnection = 0;
    err = IOFramebufferOpen(displayServicePort,
                            taskPort,
                            kIOFBSharedConnectType,
                            &displayConnection);
    if (KERN_SUCCESS == err)
    {
        union
        {
            vm_address_t vm_ptr;
            StdFBShmem_t *fbshmem;
        } cursorInfo;
        vm_size_t size;
        err = IOConnectMapMemory(displayConnection,
                                 kIOFBCursorMemory,
                                 taskPort,
                                 &cursorInfo.vm_ptr,
                                 &size,
                                 kIOMapAnywhere | kIOMapDefaultCache | kIOMapReadOnly);
        if (KERN_SUCCESS == err)
        {
            // For some reason, cursor data is not always in the same format as
            // the frame buffer. For this reason, we need some way to detect
            // which structure we should be reading.
            QByteArray pixData(
                (const char*)cursorInfo.fbshmem->cursor.rgb24.image[currentFrame],
                m_mouseInfo.currentSize.width() * m_mouseInfo.currentSize.height() * 4);
            IOConnectUnmapMemory(displayConnection,
                                 kIOFBCursorMemory,
                                 taskPort,
                                 cursorInfo.vm_ptr);
        } // IOConnectMapMemory
        else
            qDebug() << "IOConnectMapMemory Failed:" << err;
        IOServiceClose(displayConnection);
    } // IOServiceOpen
    else
        qDebug() << "IOFramebufferOpen Failed:" << err;
} // if screen
[autoReleasePool release];
My questions are:

1. How can I detect if the cursor is in a different format from the framebuffer?
2. Where can I read the actual pixel data? The bm18Cursor structure contains a mask section, but it's not in the right place for me to be reading it using the code above.
How can I detect if the cursor is in a different format from the framebuffer?
The cursor is in the framebuffer. It can't be in a different format than itself.
There is no way to tell what format it's in (x-radar://problem/7751503). There would be a way to divine at least the number of bytes per pixel if you could tell how many frames the cursor has, but since you can't (that information isn't set as of 10.6.1 — x-radar://problem/7751530), you are left trying to figure out two factors of a four-factor product (bytes per pixel × width × height × number of frames, where you only have the width, the height, and the product). And even if you can figure out those missing two factors, you still don't know what order the bytes are in or whether the color components are premultiplied by the alpha component.
Where can I read the actual pixel data?
In the cursor member of the shared-cursor-memory structure.
You should define IOFB_ARBITRARY_SIZE_CURSOR before including the I/O Kit headers. Cursors can be any size now, not just 16×16, which is the size you expect when you don't define that constant. As an example, the usual Mac arrow cursor is 24×24, the “Windows” arrow cursor in CrossOver is 32×32, and the arrow cursor in X11 is 10×16.
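That is, something along these lines (the exact header path is from the I/O Kit framework and may vary by SDK version):

#define IOFB_ARBITRARY_SIZE_CURSOR              // must precede the I/O Kit includes
#include <IOKit/graphics/IOFramebufferShared.h> // defines StdFBShmem_t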
However, in some cases, I'm reading data that looks like a mask. Specifically, the pixels in this data will either be 0x00FFFFFF or 0x00000000. This looks to me to be a mask for separate pixel data stored somewhere else.
That sounds to me more like 16-bit pixels with an 8-bit alpha channel. At least it's more probably 5-6-5 than 5-5-5.
As far as I can tell, the only application that uses this cursor pixel format is Qt Creator, but I need to work with all applications, so I'd like to sort this out.
I'm able to capture the current cursor in that app just fine with my new cursor-capturing app. Is there a specific part of the app I should hit to make it show me a specific cursor?
You might try the CGSCreateRegisteredCursorImage function, as demonstrated by Karsten in a comment on my weblog.
It is a private function, so it may change or go away at any time, so you should check whether it exists and hold IOFramebuffer in reserve, but as long as it does exist, you may find it more reliable than the complex and thinly-documented IOFramebuffer.