Saving output frame as an image file CUDA decoder - c++

I am trying to save the decoded image file back as a BMP image using the code in CUDA Decoder project.
if (g_bReadback && g_ReadbackSID)
{
CUresult result = cuMemcpyDtoHAsync(g_bFrameData[active_field], pDecodedFrame[active_field], (nDecodedPitch * nHeight * 3 / 2), g_ReadbackSID);
long padded_size = (nWidth * nHeight * 3 );
CString output_file;
output_file.Format(_T("image/sample_45.BMP"));
SaveBMP(g_bFrameData[active_field],nWidth,nHeight,padded_size,output_file );
if (result != CUDA_SUCCESS)
{
printf("cuMemcpyDtoHAsync returned %d\n", (int)result);
}
}
But the saved image looks like this
Can anybody help me figure out what I am doing wrong? Thank you.

After investigating further, there were several modifications I made to your approach.
pDecodedFrame is actually in a non-RGB format. I think it is NV12, which I believe is a particular YUV variant.
pDecodedFrame gets converted to an RGB format on the GPU using a particular CUDA kernel.
The target buffer for this conversion will either be a surface provided by OpenGL if g_bUseInterop is specified, or else an ordinary region allocated by the driver API version of cudaMalloc if interop is not specified.
The target buffer mentioned above is pInteropFrame (even in the non-interop case). So to make an example for you, for simplicity I chose to only use the non-interop case, because it's much easier to grab the RGB buffer (pInteropFrame) in that case.
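The NV12 layout mentioned above can be easier to see on the CPU. The following is only a minimal sketch, not the kernel the sample uses: nv12ToBgr and toByte are hypothetical helpers, the coefficients assume BT.601 limited-range video, and the frame is assumed to have even dimensions with the same row pitch for both planes.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clamp to 0..255 and round to the nearest byte value.
static std::uint8_t toByte(float v) {
    return static_cast<std::uint8_t>(std::lround(std::min(255.0f, std::max(0.0f, v))));
}

// NV12: a full-resolution Y plane (pitch * height bytes) followed by a
// half-height plane of interleaved U,V pairs, one pair per 2x2 pixel block.
// Returns tightly packed 24-bit BGR (assumed BT.601 limited range).
std::vector<std::uint8_t> nv12ToBgr(const std::uint8_t* nv12,
                                    int width, int height, int pitch) {
    const std::uint8_t* yPlane = nv12;
    const std::uint8_t* uvPlane = nv12 + static_cast<std::size_t>(pitch) * height;
    std::vector<std::uint8_t> bgr(static_cast<std::size_t>(width) * height * 3);
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            const std::size_t yIndex = static_cast<std::size_t>(row) * pitch + col;
            const std::size_t uvIndex =
                static_cast<std::size_t>(row / 2) * pitch + (col / 2) * 2;
            const float y = 1.164f * (static_cast<int>(yPlane[yIndex]) - 16);
            const float u = static_cast<float>(uvPlane[uvIndex]) - 128.0f;
            const float v = static_cast<float>(uvPlane[uvIndex + 1]) - 128.0f;
            std::uint8_t* out = &bgr[(static_cast<std::size_t>(row) * width + col) * 3];
            out[0] = toByte(y + 2.018f * u);              // blue
            out[1] = toByte(y - 0.391f * u - 0.813f * v); // green
            out[2] = toByte(y + 1.596f * v);              // red
        }
    }
    return bgr;
}
```

Note that the input size is pitch * height * 3 / 2 bytes, which matches the copy size in the question, while the output is width * height * 3 bytes.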
The method here copies pInteropFrame back to the host, after it has been populated with the appropriate RGB image by cudaPostProcessFrame. There is also a routine to save the image as a bitmap file. All of my modifications are delineated with comments that include RMC so search for that if you want to find all the changes/additions I made.
To use, drop this file in the cudaDecodeGL project as a replacement for the videoDecodeGL.cpp source file, then rebuild the project. Run the executable normally to display the video. To capture a specific frame, run the executable with the nointerop command-line switch, e.g. cudaDecodeGL nointerop. The video will not display, but the decode operation and frame capture will take place, and the frame will be saved in a framecap.bmp file. To change which frame is captured, change the value in g_FrameCapSelect = 37; to some other number and recompile.
Here is the replacement for videoDecodeGL.cpp. I used pastebin because SO has a limit on the number of characters that can be entered in a post body.
Note that my approach is independent of whether readback is specified. I would recommend not using readback for this sequence.
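When writing the captured buffer out as a BMP, the size of the pixel data depends on row padding, not just width * height * 3. As a rough sketch (bmpRowStride and bmpPixelDataSize are hypothetical helper names, not part of the sample), each BMP row is padded to a multiple of 4 bytes:

```cpp
#include <cstddef>

// Bytes per row in a BMP file: rows are padded up to a multiple of 4 bytes.
std::size_t bmpRowStride(std::size_t width, std::size_t bitsPerPixel) {
    return ((width * bitsPerPixel + 31) / 32) * 4;
}

// Total size of the pixel array that follows the BMP headers.
std::size_t bmpPixelDataSize(std::size_t width, std::size_t height,
                             std::size_t bitsPerPixel) {
    return bmpRowStride(width, bitsPerPixel) * height;
}
```

For a 24-bit image the two sizes only agree when width * 3 happens to be a multiple of 4, so a helper like this is a safer source for the size passed to a bitmap writer.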

Related

Having problems loading a jpg file using libjpeg

I need to load jpg files in my application. I used libjpeg to save JPGs (from processed raw files) and it works nicely.
Reading them though is a different issue. I am getting very weird results, the image is very distorted, in 12 columns, which are mostly gray scale.
I followed the example, and the only modification I made is how to put the data in my buffer (the put_scanline_someplace() function is missing from the example).
Here is my relevant code (I need the data in BGR format):
dest=0;
while(cinfo.output_scanline < cinfo.output_height)
{
jpeg_read_scanlines(&cinfo, buffer, 1);
src=0;
for(i=0;i<cinfo.output_width;i++)
{
image_buffer[dest*3+2]=buffer[src*3+0];
image_buffer[dest*3+1]=buffer[src*3+1];
image_buffer[dest*3+0]=buffer[src*3+2];
src++;
dest++;
}
}
Is there something wrong with this code?
I found the solution. buffer is a pointer to an array of scanline rows, not a flat array of samples, so the row has to be indexed first. The code that works is:
image_buffer[dest*3+2]=buffer[0][src*3+0];
image_buffer[dest*3+1]=buffer[0][src*3+1];
image_buffer[dest*3+0]=buffer[0][src*3+2];
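The fix above can be isolated into a small row-copy routine. This is only a sketch that leaves libjpeg out so it can stand alone: appendRowAsBgr is a hypothetical helper, and its row argument stands in for buffer[0], the single scanline that jpeg_read_scanlines fills when asked for one line.

```cpp
#include <cstddef>
#include <vector>

// Append one RGB scanline to `image` in BGR order.
// `row` plays the role of buffer[0]: width * 3 bytes of R,G,B samples.
void appendRowAsBgr(const unsigned char* row, std::size_t width,
                    std::vector<unsigned char>& image) {
    for (std::size_t x = 0; x < width; ++x) {
        image.push_back(row[x * 3 + 2]); // blue
        image.push_back(row[x * 3 + 1]); // green
        image.push_back(row[x * 3 + 0]); // red
    }
}
```

Calling it once per scanline inside the read loop keeps the destination index implicit, which avoids the separate src and dest counters.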

OpenCV VideoCapture: Howto get specific frame correctly?

I am trying to get a specific frame from a video file using OpenCV 2.4.11.
I have tried to follow the documentation and online tutorials of how to do it correctly and have now tested two approaches:
1) The first method is brute force reading each frame using the video.grab() until I reach the specific frame (timestamp) I want. This method is slow if the specific frame is late in the video sequence!
string videoFile(videoFilename);
VideoCapture video(videoFile);
double videoTimestamp = video.get(CV_CAP_PROP_POS_MSEC);
int videoFrameNumber = static_cast<int>(video.get(CV_CAP_PROP_POS_FRAMES));
while (videoTimestamp < targetTimestamp)
{
videoTimestamp = video.get(CV_CAP_PROP_POS_MSEC);
videoFrameNumber = static_cast<int>(video.get(CV_CAP_PROP_POS_FRAMES));
// Grab frame (but don't decode the frame as we are only "Fast forwarding")
video.grab();
}
// Get and save frame
if (video.retrieve(frame))
{
char txtBuffer[100];
sprintf(txtBuffer, "Video1Frame_Target_%f_TS_%f_FN_%d.png", targetTimestamp, videoTimestamp, videoFrameNumber);
string imgName = txtBuffer;
imwrite(imgName, frame);
}
2) The second method uses video.set(...). This method is faster and doesn't seem to be any slower if the specific frame is late in the video sequence.
string videoFile(videoFilename);
VideoCapture video2(videoFile);
videoTimestamp = video2.get(CV_CAP_PROP_POS_MSEC);
videoFrameNumber = static_cast<int>(video2.get(CV_CAP_PROP_POS_FRAMES));
video2.set(CV_CAP_PROP_POS_MSEC, targetTimestamp);
while (videoTimestamp < targetTimestamp)
{
videoTimestamp = video2.get(CV_CAP_PROP_POS_MSEC);
videoFrameNumber = (int)video2.get(CV_CAP_PROP_POS_FRAMES);
// Grab frame (but don't decode the frame as we are only "Fast forwarding")
video2.grab();
}
// Get and save frame
if (video2.retrieve(frame))
{
char txtBuffer[100];
sprintf(txtBuffer, "Video2Frame_Target_%f_TS_%f_FN_%d.png", targetTimestamp, videoTimestamp, videoFrameNumber);
string imgName = txtBuffer;
imwrite(imgName, frame);
}
Problem) The issue is that the two methods end up with the same frame number, but the content of the target image frame is not the same.
I am tempted to conclude that Method 1 is the correct one and that there is something wrong with the OpenCV video.set(...) method. But if I use the VLC player to find the approximate target frame position, it is actually Method 2 that is closest to a "correct" result.
As some extra info: I have tested the same video sequence but in two different video files being encoded with respectively 'avc1' MPG4 and 'wmv3' WMV codec.
Using the WMV file, the two found frames are way off.
Using the MPG4 file, the two found frames are only slightly off.
Does anybody have experience with this who can explain my findings and tell me the correct way to get a specific frame from a video file?
Obviously there's still a bug in OpenCV/FFmpeg.
FFmpeg doesn't deliver the frames that are wanted and/or OpenCV doesn't handle this. See here and here.
Edit: Until that bug is fixed (either in FFmpeg or, as a workaround, in OpenCV), the only way to get an exact frame by number is to "fast forward" as you did.
Concerning VLC player: I suspect that it uses that buggy set() interface. For a player it is usually not too important to seek frame-exactly, but for an editor it is.
I think that OpenCV uses FFmpeg for video decoding.
We once had a similar problem but used FFmpeg directly. By default, random (but exact) frame access isn't guaranteed. The WMV decoder was particularly fuzzy.
Newer versions of FFmpeg give you access to lower-level routines which can be used to build a retrieval function for frames. This solution was a little involved and nothing I can remember off the top of my head right now. I'll try to find some more details later.
As a quick workaround, I would suggest decoding your videos offline and then working on sequences of images. Though this increases the amount of storage needed, it guarantees exact random frame access. You can use FFmpeg to convert your video file into a sequence of images like this:
ffmpeg -i "input.mov" -an -f image2 "output_%05d.png"
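With the frames on disk, fetching frame N becomes a file-name lookup. A minimal sketch, with several assumptions: frameFileName and frameIndexAt are hypothetical helpers, the files follow the output_%05d.png pattern from the command above, the image numbering starts at 1 (FFmpeg's default for image sequences), and the timestamp conversion only holds for constant-frame-rate video.

```cpp
#include <cstdio>
#include <string>

// Map a zero-based frame index to the file written by the ffmpeg command,
// assuming the numbering starts at 1.
std::string frameFileName(int zeroBasedFrame) {
    char name[32];
    std::snprintf(name, sizeof(name), "output_%05d.png", zeroBasedFrame + 1);
    return std::string(name);
}

// Zero-based index of the frame shown at a timestamp (constant fps assumed).
int frameIndexAt(double timestampMs, double fps) {
    return static_cast<int>(timestampMs * fps / 1000.0);
}
```

The resulting name can then be passed to an image loader such as cv::imread instead of seeking in the video.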

Can't get CreateDDSTextureFromFile to work

So, I've been trying to figure out my problem for a few hours now, but I have no idea what I'm doing wrong. I'm a noob when it comes to DirectX programming, so I've been following some tutorials, and right now I'm trying to create an OBJ loader.
http://www.braynzarsoft.net/index.php?p=D3D11OBJMODEL
However, I can't get my texture to work.
This is how I try to load the DDS-texture:
ID3D11ShaderResourceView* tempMeshSRV = nullptr;
hr = CreateDDSTextureFromFile(gDevice, L"boxTexture.dds", NULL, &tempMeshSRV);
if (SUCCEEDED(hr))
{
textureNameArray.push_back(L"boxTexture.dds");
material[matCount - 1].texArrayIndex = meshSRV.size();
meshSRV.push_back(tempMeshSRV);
material[matCount - 1].hasTexture = true;
}
However, my HRESULT never succeeds, but it doesn't crash either. If I hover over hr, it just says "HRESULT_FROM_WIN32(ERROR_NOT_SUPPORTED)". I also tried removing the if statement, but that just turns my box black.
Any idea on what I'm doing wrong? =/
Thanks in advance!
The most likely problem is that your "boxTexture.dds" is a 24 bit-per-pixel format file. In Direct3D 9, this was D3DFMT_R8G8B8 and was reasonably common. However, there is no DXGI equivalent format for 24 bits-per-pixel and it therefore requires format conversion to work.
The DDSTextureLoader module in DirectX Tool Kit is designed to be a minimum-overhead function, and therefore does no runtime conversions at all. If the data directly maps to a DXGI format, it loads. If it doesn't, it fails with HRESULT_FROM_WIN32(ERROR_NOT_SUPPORTED).
There are two different solutions depending on your usage scenario.
The ideal solution is to convert 'boxTexture.dds' to a supported format. You can do this with the texconv command-line tool provided with DirectXTex. This is by far the best option, because the potentially expensive conversion is done once and not every single time your application runs and loads the data.
If you don't actually control the source of the DDS files you are trying to load (i.e. they are arbitrary files provided by a user, or you are writing some kind of content tool that has to support legacy formats), then you should make use of the DirectXTex 'full-fat' LoadFromDDSFile function, which has extensive conversion code for handling legacy DDS file formats.
Note that this situation can happen for a number of legacy-format DDS files, as listed in the CodePlex wiki documentation:
D3DFMT_R8G8B8 (24bpp RGB) - Use a 32bpp format
D3DFMT_X8B8G8R8 (32bpp RGBX) - Use BGRX, BGRA, or RGBA
D3DFMT_A2R10G10B10 (BGRA 10:10:10:2) - Use RGBA 10:10:10:2
D3DFMT_X1R5G5B5 (BGR 5:5:5) - Use BGRA 5:5:5:1 or BGR 5:6:5
D3DFMT_A8R3G3B2, D3DFMT_R3G3B2 (BGR 3:3:2) - Expand to a supported format
D3DFMT_P8, D3DFMT_A8P8 (8-bit palette) - Expand to a supported format
D3DFMT_A4L4 (Luminance 4:4) - Expand to a supported format
D3DFMT_UYVY (YUV 4:2:2 16bpp) - Swizzle to YUY2
See also Direct3D 11 Textures and Block Compression
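To confirm that a file is the 24 bits-per-pixel case, it can help to look at the pixel format stored in the DDS header. The following is a minimal sketch, not part of DirectX Tool Kit: isLegacy24bppRgb and readU32 are hypothetical helpers, the offsets assume the standard layout of a 4-byte magic number followed by the 124-byte DDS_HEADER, and only the uncompressed RGB case is checked.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian 32-bit value at a byte offset.
static std::uint32_t readU32(const std::vector<std::uint8_t>& bytes, std::size_t off) {
    return static_cast<std::uint32_t>(bytes[off]) |
           (static_cast<std::uint32_t>(bytes[off + 1]) << 8) |
           (static_cast<std::uint32_t>(bytes[off + 2]) << 16) |
           (static_cast<std::uint32_t>(bytes[off + 3]) << 24);
}

// True when the file's pixel format is uncompressed RGB at 24 bits per pixel.
bool isLegacy24bppRgb(const std::vector<std::uint8_t>& file) {
    const std::size_t kMagicAndHeader = 4 + 124;      // "DDS " + DDS_HEADER
    if (file.size() < kMagicAndHeader) return false;
    if (readU32(file, 0) != 0x20534444u) return false; // "DDS " magic
    const std::uint32_t kDdpfRgb = 0x40;               // DDPF_RGB flag
    const std::uint32_t pixelFormatFlags = readU32(file, 80); // ddspf.dwFlags
    const std::uint32_t rgbBitCount = readU32(file, 88);      // ddspf.dwRGBBitCount
    return (pixelFormatFlags & kDdpfRgb) != 0 && rgbBitCount == 24;
}
```

Reading the first 128 bytes of boxTexture.dds into a vector and passing it to this check would tell you whether conversion with texconv is the likely fix.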
If you look at the source code for CreateTextureFromDDS (which is called by CreateDDSTextureFromFile to do the main data processing) - http://directxtk.codeplex.com/SourceControl/latest#Src/DDSTextureLoader.cpp - you will see that there are a lot of reasons you could be getting "HRESULT_FROM_WIN32(ERROR_NOT_SUPPORTED)".
It's not likely a problem with opening or reading the file, since that would return a different error code. So most likely it's an unsupported DXGI_FORMAT, a malformed cubemap, an invalid mipmap count, or invalid image dimensions (i.e. larger than the limits found here: http://msdn.microsoft.com/en-us/library/ff819065(v=vs.85).aspx ).

iOS waveform generator connected via AUGraph

I have created a simple waveform generator which is connected to an AUGraph. I have reused some sample code from Apple to set AudioStreamBasicDescription like this
void SetCanonical(UInt32 nChannels, bool interleaved)
// note: leaves sample rate untouched
{
mFormatID = kAudioFormatLinearPCM;
int sampleSize = SizeOf32(AudioSampleType);
mFormatFlags = kAudioFormatFlagsCanonical;
mBitsPerChannel = 8 * sampleSize;
mChannelsPerFrame = nChannels;
mFramesPerPacket = 1;
if (interleaved)
mBytesPerPacket = mBytesPerFrame = nChannels * sampleSize;
else {
mBytesPerPacket = mBytesPerFrame = sampleSize;
mFormatFlags |= kAudioFormatFlagIsNonInterleaved;
}
}
In my class I call this function
mClientFormat.SetCanonical(2, true);
mClientFormat.mSampleRate = kSampleRate;
while sample rate is
#define kSampleRate 44100.0f;
The other setting are taken from sample code as well
// output unit
CAComponentDescription output_desc(kAudioUnitType_Output, kAudioUnitSubType_RemoteIO, kAudioUnitManufacturer_Apple);
// iPodEQ unit
CAComponentDescription eq_desc(kAudioUnitType_Effect, kAudioUnitSubType_AUiPodEQ, kAudioUnitManufacturer_Apple);
// multichannel mixer unit
CAComponentDescription mixer_desc(kAudioUnitType_Mixer, kAudioUnitSubType_MultiChannelMixer, kAudioUnitManufacturer_Apple);
Everything works fine, but the problem is that I am not getting stereo sound and my callback function is failing (bad access) when I try to reach the second buffer
Float32 *bufferLeft = (Float32 *)ioData->mBuffers[0].mData;
Float32 *bufferRight = (Float32 *)ioData->mBuffers[1].mData;
// Generate the samples
for (UInt32 frame = 0; frame < inNumberFrames; frame++)
{
switch (generator.soundType) {
case 0: //Sine
bufferLeft[frame] = sinf(thetaLeft) * amplitude;
bufferRight[frame] = sinf(thetaRight) * amplitude;
break;
So it seems I am getting mono instead of stereo. The pointer bufferRight is empty, but don't know why.
Any help will be appreciated.
I can see two possible errors. First, as @invalidname pointed out, recording in stereo probably isn't going to work on a mono device such as the iPhone. Well, it might work, but if it does, you're just going to get back dual-mono stereo streams anyway, so why bother? You might as well configure your stream to work in mono and spare yourself the CPU overhead.
The second problem is probably the source of your sound distortion. Your stream description format flags should be:
kAudioFormatFlagIsSignedInteger |
kAudioFormatFlagsNativeEndian |
kAudioFormatFlagIsPacked
Also, don't forget to set the mReserved flag to 0. While the value of this flag is probably being ignored, it doesn't hurt to explicitly set it to 0 just to make sure.
Edit: Another more general tip for debugging audio on the iPhone -- if you are getting distortion, clipping, or other weird effects, grab the data payload from your phone and look at the recording in a wave editor. Being able to zoom down and look at the individual samples will give you a lot of clues about what's going wrong.
To do this, you need to open up the "Organizer" window, click on your phone, and then expand the little arrow next to your application (in the same place where you would normally uninstall it). Now you will see a little downward pointing arrow, and if you click it, Xcode will copy the data payload from your app to somewhere on your hard drive. If you are dumping your recordings to disk, you'll find the files extracted here.
reference from link
I'm guessing the problem is that you're specifying an interleaved format, but then accessing the buffers as if they were non-interleaved in your IO callback. ioData->mBuffers[1] is invalid because all the data, both left and right channels, is interleaved in ioData->mBuffers[0].mData. Check ioData->mNumberBuffers. My guess is it is set to 1. Also, verify that ioData->mBuffers[0].mNumberChannels is set to 2, which would indicate interleaved data.
Also check out the Core Audio Public Utility classes to help with things like setting up formats. Makes it so much easier. Your code for setting up format could be reduced to one line, and you'd be more confident it is right (though to me your format looks set up correctly - if what you want is interleaved 16-bit int):
CAStreamBasicDescription myFormat(44100.0, 2, CAStreamBasicDescription::kPCMFormatInt16, true);
Apple used to package these classes up in the SDK that was installed with Xcode, but now you need to download them here: https://developer.apple.com/library/mac/samplecode/CoreAudioUtilityClasses/Introduction/Intro.html
Anyway, it looks like the easiest fix for you is to just change the format to non-interleaved. So in your code: mClientFormat.SetCanonical(2, false);
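If you would rather keep the interleaved format, the callback has to write both channels into the single buffer instead. A minimal sketch of that indexing, kept independent of Core Audio: fillInterleavedStereoSine is a hypothetical helper, the samples are assumed to be 16-bit signed integers, and in a real callback the output would be ioData->mBuffers[0].mData rather than a vector.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Generate interleaved stereo samples: L0, R0, L1, R1, ...
// `amplitude` is in 16-bit sample units (at most 32767).
std::vector<std::int16_t> fillInterleavedStereoSine(std::size_t frames,
                                                    double leftHz, double rightHz,
                                                    double sampleRate,
                                                    double amplitude) {
    const double kTwoPi = 6.283185307179586;
    std::vector<std::int16_t> samples(frames * 2);
    for (std::size_t frame = 0; frame < frames; ++frame) {
        const double t = static_cast<double>(frame) / sampleRate;
        // Channel 0 (left) and channel 1 (right) sit next to each other.
        samples[frame * 2 + 0] =
            static_cast<std::int16_t>(std::lround(amplitude * std::sin(kTwoPi * leftHz * t)));
        samples[frame * 2 + 1] =
            static_cast<std::int16_t>(std::lround(amplitude * std::sin(kTwoPi * rightHz * t)));
    }
    return samples;
}
```

The key difference from the question's code is the frame * 2 + channel index into one buffer, in place of separate bufferLeft and bufferRight pointers.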

Converting PDF to JPG like Photoshop quality - Commercial C++ / Delphi library

For the implementation of a Windows based page-flip application I need to be able to convert a large number of PDF pages into good quality JPG, not just thumbnails.
The aim is to achieve the best quality-to-file-size ratio, similar to what Photoshop's Save for Web does.
Currently I'm using the Datalogics Adobe PDF Library SDK, which does not seem to be able to fulfill that task. I am thus looking for an alternative commercial C++ or Delphi library which provides a good balance of quality, size and speed.
After doing some searching here, I noticed that most posts are about GS and ImageMagick, which I have also tested, but I am not satisfied with the output and the speed.
The target is to import the PDFs at 300 dpi and convert them to JPG with quality 50, 1500px height and an output size of 300-500 KB.
If anyone could point out a good library for that task, I would be most grateful.
The Gnostice PDFtoolKit VCL may be a candidate. Convert to JPEG is one of the options.
I always recommend Graphics32 for all your image manipulation needs; you have several resamplers to choose. However, I don't think it can read PDF files as images. But if you can generate the big image yourself it may be a good choice.
Atalasoft DotImage (with the PDF rasterizer add-on) will do that (I work on PDF technologies there). You'd be working in C# (or another .NET language):
void ConvertToJpegs(string outFileStem, Stream pdf)
{
JpegEncoder encoder = new JpegEncoder();
encoder.Quality = 50;
int page = 1;
PdfImageSource source = new PdfImageSource(pdf);
source.Resolution = 300; // sets the rendering resolution to 300 dpi
// larger numbers means better resolution in the image, but will cost in
// terms of output file size - as resolution increases, memory used increases
// as a function of the square of the resolution, whereas compression only
// saves maybe a flat 30% of the total image size, depending on the Quality
// setting on the encoder.
while (source.HasMoreImages()) {
AtalaImage image = source.AcquireNext();
// this image will be in either 8 bit gray or 24 bit rgb depending
// on the page contents.
try {
string path = String.Format("{0}{1}.jpg", outFileStem, page++);
// if you need to resample the image, this is the place to do it
image.Save(path, encoder, null);
}
finally {
source.Release(image);
}
}
}
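The relationship between rendering resolution and output pixel size is simple arithmetic, since PDF page dimensions are given in points of 1/72 inch. A small sketch of that calculation, in C++ since that is what the question asks about (pixelsAtDpi and dpiForPixelHeight are hypothetical helpers, not part of any of the libraries mentioned):

```cpp
#include <cmath>

// Pixel length of a page dimension given in PDF points (1/72 inch).
long pixelsAtDpi(double points, double dpi) {
    return std::lround(points / 72.0 * dpi);
}

// Resolution needed to render a page of the given height at a target pixel height.
double dpiForPixelHeight(double pageHeightPoints, double targetPixels) {
    return targetPixels * 72.0 / pageHeightPoints;
}
```

For a US Letter page (792 points high), 300 dpi gives 3300 pixels of height, while a 1500 pixel target corresponds to roughly 136 dpi, so rendering at 300 dpi implies a downsampling step before JPEG encoding.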
There is also Quick PDF Library.
Have a look at DynaPDF. I know it's pretty expensive, but you can try the starter pack.
P.S.: Before buying a product, please make sure it meets your needs.