Can't display a PNG using GLUT or OpenGL - C++

Code is here:
void readOIIOImage( const char* fname, float* img )
{
    int xres, yres;
    ImageInput *in = ImageInput::create (fname);
    if (!in) { return; }
    ImageSpec spec;
    in->open (fname, spec);
    xres = spec.width;
    yres = spec.height;
    iwidth = spec.width;
    iheight = spec.height;
    channels = spec.nchannels;
    cout << "\n";
    pixels = new float[xres*yres*channels];
    in->read_image (TypeDesc::FLOAT, pixels);
    // copy into img, flipping the image vertically for display
    long index = 0;
    for (int j = 0; j < yres; j++)
    {
        for (int i = 0; i < xres; i++)
        {
            for (int c = 0; c < channels; c++)
            {
                img[ (i + xres*(yres - j - 1))*channels + c ] = pixels[index++];
            }
        }
    }
    in->close ();
    delete in;
}
Currently, my code handles JPG files fine: it can read the file's information and display it correctly. However, when I try reading in a PNG file, it doesn't display correctly at all. Usually it displays the same distorted version of the image in three separate columns on the display. It's very strange. Any idea why this is happening with the given code?
Additionally, the JPG files all have 3 channels; the PNG has 2.
fname is simply a filename, and img is `new float[3*size]`.
Any help would be great. Thanks.

Usually it displays the same distorted version of the image in three separate columns on the display. It's very strange. Any idea why this is happening with the given code?
This reads a lot like the decoder's output being in row-planar format. Planar means you get individual rows, one for every channel, one after another. The distortion and the discrepancy between the number of channels in the PNG and the apparent channel count are likely due to an alignment mismatch. You didn't specify which image decoder library you're using exactly, so I can't look up how it communicates the layout of the pixel buffer; I suppose you can read the necessary information from the ImageSpec.
Anyway, you'll have to rearrange the indexing in your pixel-copy loop a bit so that consecutive row planes are interleaved into channel tuples, as sketched below.
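If the buffer really is row-planar, the copy loop could be reshaped roughly like this (a sketch under that assumption, not taken from the post; whether the decoder actually hands you planar data for this file should be checked against the ImageSpec):

// Sketch: de-interleave a row-planar buffer (per scanline: a full row of channel 0,
// then a full row of channel 1, ...) into the packed layout 'img' expects,
// keeping the vertical flip from the original loop.
for (int j = 0; j < yres; j++)                   // scanline
{
    for (int c = 0; c < channels; c++)           // plane within this scanline
    {
        const float* srcRow = pixels + (long)(j * channels + c) * xres;
        for (int i = 0; i < xres; i++)
        {
            img[ (i + xres*(yres - j - 1))*channels + c ] = srcRow[i];
        }
    }
}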
Of course, you could also use a ready-made image-file-to-OpenGL loader library. DevIL is thrown around a lot, but it's not very well maintained; SOIL seems to be a popular choice these days.

Related

SDL putting lots of pixel data onto the screen

I am creating a program that allows you to view fractals like the Mandelbrot or Julia set. I would like to render them as quickly as possible. I would love a way to put an array of uint8_t pixel values onto the screen. The array is formatted like this...
{r0,g0,b0,r1,g1,b1,...}
(A one dimensional array of RGB color values)
I know I have the proper data because before I just set individual points and it worked...
for (int i = 0; i < height * width; ++i) {
    // setStroke and point are functions that I made that together just draw a colored point
    r.setStroke(data[i*3], data[i*3+1], data[i*3+2]);
    r.point(i % r.window.w, i / r.window.w);
}
This is a pretty slow operation, especially if the screen is big (which I would like it to be).
Is there any faster way to just put all the data onto the screen?
I tried doing something like this
void* pixels;
int pitch;
SDL_Texture* img = SDL_CreateTexture(ren,
    SDL_GetWindowPixelFormat(win), SDL_TEXTUREACCESS_STREAMING, window.w, window.h);
SDL_LockTexture(img, NULL, &pixels, &pitch);
memcpy(pixels, data, window.w * 3 * window.h);
SDL_UnlockTexture(img);
SDL_RenderCopy(ren, img, NULL, NULL);
SDL_DestroyTexture(img);
I have no idea what I'm doing so please have mercy
Edit (thank you for comments :))
So here is what I do now
SDL_Texture* img = SDL_CreateTexture(ren, SDL_PIXELFORMAT_RGB888,
    SDL_TEXTUREACCESS_STREAMING, window.w, window.h);
SDL_UpdateTexture(img, NULL, &data[0], window.w * 3);
SDL_RenderCopy(ren, img, NULL, NULL);
SDL_DestroyTexture(img);
But I get this image... which is not what it should look like.
I am thinking that my data is just formatted wrong. Right now it is an array of uint8_t in RGB order. Is there another way I should be formatting it? (Note: I do not need an alpha channel.)
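For reference, a minimal sketch (an assumption about the layout, not from the original thread): if the buffer really is tightly packed 8-bit RGB, SDL_PIXELFORMAT_RGB24 describes exactly that byte layout, whereas SDL_PIXELFORMAT_RGB888 is a 4-bytes-per-pixel format, which would explain the skewed result above.

// Sketch assuming 'data' holds window.w * window.h * 3 bytes of packed RGB.
SDL_Texture* img = SDL_CreateTexture(ren, SDL_PIXELFORMAT_RGB24,
                                     SDL_TEXTUREACCESS_STREAMING,
                                     window.w, window.h);
SDL_UpdateTexture(img, NULL, &data[0], window.w * 3);   // pitch = bytes per row
SDL_RenderCopy(ren, img, NULL, NULL);
SDL_RenderPresent(ren);
SDL_DestroyTexture(img);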

How to get image shape after decode in TensorFlow's C++ API

I followed the TensorFlow Inception label_image tutorial, and I can compile and run the demo C++ code successfully.
I want to adapt this demo to my own project. The input images to my network have a fixed height while the width varies accordingly. For example, the original image is 64x100 and I want to resize it to 32x50: 32 is the new_height, and I need to know the original image size after reading it from the file. How can I get width=100 and height=64, so that I can compute new_width = new_height/height x width = 32/64 x 100 = 50?
One possible way is to first load the image with OpenCV, resize it, and then copy the elements into a tensor pixel by pixel like this example, but performance is the main problem and it seems hard to compile TensorFlow along with OpenCV. Does anyone know a way using TensorFlow's API?
The following is a small piece of the image_recognition tutorial C++ code; the resize is hard-coded to a predefined size. I tried float_caster.shape(), tensor(), float_caster.dimension(0), etc., and all failed (float_caster and file_reader are not Tensors; I don't know why Google designed it like this, it really slows down development, and I find no documentation about it). Is there any easy way to get the image size, or to cast the tensorflow::Output type to a Tensor?
Thanks in advance!
// Given an image file name, read in the data, try to decode it as an image,
// resize it to the requested size, and then scale the values as desired.
Status ReadTensorFromImageFile(string file_name, const int input_height,
                               const int input_width, const float input_mean,
                               const float input_std,
                               std::vector<Tensor>* out_tensors) {
  auto root = tensorflow::Scope::NewRootScope();
  using namespace ::tensorflow::ops;  // NOLINT(build/namespaces)

  string input_name = "file_reader";
  string output_name = "normalized";
  auto file_reader =
      tensorflow::ops::ReadFile(root.WithOpName(input_name), file_name);
  // Now try to figure out what kind of file it is and decode it.
  const int wanted_channels = 3;
  tensorflow::Output image_reader;
  if (tensorflow::StringPiece(file_name).ends_with(".png")) {
    image_reader = DecodePng(root.WithOpName("png_reader"), file_reader,
                             DecodePng::Channels(wanted_channels));
  } else if (tensorflow::StringPiece(file_name).ends_with(".gif")) {
    image_reader = DecodeGif(root.WithOpName("gif_reader"), file_reader);
  } else {
    // Assume if it's neither a PNG nor a GIF then it must be a JPEG.
    image_reader = DecodeJpeg(root.WithOpName("jpeg_reader"), file_reader,
                              DecodeJpeg::Channels(wanted_channels));
  }
  // Now cast the image data to float so we can do normal math on it.
  auto float_caster =
      Cast(root.WithOpName("float_caster"), image_reader, tensorflow::DT_FLOAT);
  // The convention for image ops in TensorFlow is that all images are expected
  // to be in batches, so that they're four-dimensional arrays with indices of
  // [batch, height, width, channel]. Because we only have a single image, we
  // have to add a batch dimension of 1 to the start with ExpandDims().
  auto dims_expander = ExpandDims(root, float_caster, 0);
  // Bilinearly resize the image to fit the required dimensions.
  auto resized = ResizeBilinear(
      root, dims_expander,
      Const(root.WithOpName("size"), {input_height, input_width}));
  // Subtract the mean and divide by the scale.
  Div(root.WithOpName(output_name), Sub(root, resized, {input_mean}),
      {input_std});

  // This runs the GraphDef network definition that we've just constructed, and
  // returns the results in the output tensor.
  tensorflow::GraphDef graph;
  TF_RETURN_IF_ERROR(root.ToGraphDef(&graph));

  std::unique_ptr<tensorflow::Session> session(
      tensorflow::NewSession(tensorflow::SessionOptions()));
  TF_RETURN_IF_ERROR(session->Create(graph));
  TF_RETURN_IF_ERROR(session->Run({}, {output_name}, {}, out_tensors));
  return Status::OK();
}
As you said
I want to know original image size after reading from the file
So I suppose you don't mind getting the height and width from the output tensor:
Status read_tensor_status =
    ReadTensorFromImageFile(image_path, input_height, input_width, input_mean,
                            input_std, &resized_tensors);
if (!read_tensor_status.ok()) {
  LOG(ERROR) << read_tensor_status;
  return -1;
}
// resized_tensor: the tensor storing the image
const Tensor &resized_tensor = resized_tensors[0];
auto resized_tensor_height = resized_tensor.shape().dim_sizes()[1];
auto resized_tensor_width = resized_tensor.shape().dim_sizes()[2];
std::cout << "resized_tensor_height:\t" << resized_tensor_height
          << "\nresized_tensor_width:\t" << resized_tensor_width << std::endl;
And the output is (for me)
resized_tensor_height: 636
resized_tensor_width: 1024
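If you need the size of the decoded image before the resize (rather than the dimensions of the resized tensor above), one option is to also fetch the "float_caster" node that ReadTensorFromImageFile already names. This is only a sketch under that assumption, not code from the tutorial:

// Inside ReadTensorFromImageFile: fetch the pre-resize image as well.
// "float_caster" is the op name used above; its output is [height, width, channels]
// before ExpandDims/ResizeBilinear are applied.
std::vector<Tensor> outputs;
TF_RETURN_IF_ERROR(
    session->Run({}, {output_name, "float_caster"}, {}, &outputs));

const Tensor& decoded = outputs[1];
const auto original_height = decoded.dim_size(0);
const auto original_width  = decoded.dim_size(1);
// new_width = new_height / original_height * original_width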

Writing a tif pixel by pixel using LibTiff?

Is it possible to create a new tif by iterating pixel by pixel and setting the RGB values for each pixel?
Let me explain what I'm attempting to do. I'm trying to open an existing tif, read it using TIFFReadRGBAImage, take the RGB values given by TIFFGetR/TIFFGetG/TIFFGetB, subtract them from 255, and then use those new values to write each pixel one by one. In the end I'd like to end up with the original image and a new "complement" image that would be like a negative of the original.
Is there a way to do this using LibTiff? I've gone over the documentation and searched around Google, but I've only seen very short examples of TIFFWriteScanline that provide so little code/context/comments that I cannot figure out how to implement it the way I'd like it to work.
I'm still fairly new to programming so if someone could please either point me to a thorough example with plenty of explanatory comments or help me out directly with my code, I would appreciate it greatly. Thank you for taking the time to read this and help me learn.
What I have so far:
// Other unrelated code here...
//Invert color values and write to new image file
for (e = height - 1; e != -1; e--)
{
    for (c = 0; c < width; c++)
    {
        red = TIFFGetR(raster[c]);
        newRed = 255 - red;
        green = TIFFGetG(raster[c]);
        newGreen = 255 - green;
        blue = TIFFGetB(raster[c]);
        newBlue = 255 - blue;
        // What to do next? Is this feasible?
    }
}
// Other unrelated code here...
Full code if you need it.
I went back and looked at my old code. It turns out that I didn't use libtiff. Nevertheless, you are on the right track. You want something like:
lineBuffer = (char *)malloc(width * 3);   // 3 bytes per pixel
for all lines
{
    ptr = lineBuffer;
    // modify your loop above so that you build one new line here
    for all pixels in line
    {
        *ptr++ = newRed;
        *ptr++ = newGreen;
        *ptr++ = newBlue;
    }
    // write the line using the libtiff scanline write
    write a line here
}
Remember to set the tags appropriately. This example assumes 3 bytes per pixel; TIFF also allows for separate planes of 1 byte per pixel in each plane.
Alternatively, you can also write the whole image into a new buffer instead of one line at a time.
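To make the scanline route concrete, here is a hedged sketch with libtiff. It assumes the width, height and raster buffer from TIFFReadRGBAImage in the question, and that 8-bit contiguous RGB output is wanted; the tag values would need adjusting for anything else.

#include <tiffio.h>
#include <vector>

// Sketch: write an inverted copy of the image that was read with TIFFReadRGBAImage.
TIFF* out = TIFFOpen("negative.tif", "w");
TIFFSetField(out, TIFFTAG_IMAGEWIDTH, width);
TIFFSetField(out, TIFFTAG_IMAGELENGTH, height);
TIFFSetField(out, TIFFTAG_SAMPLESPERPIXEL, 3);
TIFFSetField(out, TIFFTAG_BITSPERSAMPLE, 8);
TIFFSetField(out, TIFFTAG_ORIENTATION, ORIENTATION_TOPLEFT);
TIFFSetField(out, TIFFTAG_PLANARCONFIG, PLANARCONFIG_CONTIG);
TIFFSetField(out, TIFFTAG_PHOTOMETRIC, PHOTOMETRIC_RGB);

std::vector<unsigned char> lineBuffer(width * 3);
for (uint32 row = 0; row < height; ++row)
{
    // TIFFReadRGBAImage stores the raster bottom-up by default, so flip here.
    const uint32* src = raster + (height - 1 - row) * width;
    for (uint32 col = 0; col < width; ++col)
    {
        lineBuffer[col * 3 + 0] = 255 - TIFFGetR(src[col]);
        lineBuffer[col * 3 + 1] = 255 - TIFFGetG(src[col]);
        lineBuffer[col * 3 + 2] = 255 - TIFFGetB(src[col]);
    }
    TIFFWriteScanline(out, lineBuffer.data(), row, 0);
}
TIFFClose(out);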

Raw RGB values to JPEG

I have an array with raw RGB values in it, and I need to write these values to a JPEG file. Is there an easy way to do this?
I tried:
std::ofstream ofs("./image.JPG", std::ios::out | std::ios::binary);
for (unsigned i = 0; i < width * height; ++i) {
    ofs << (int)(std::min(1.0f, image[i].x) * 255)
        << (int)(std::min(1.0f, image[i].y) * 255)
        << (int)(std::min(1.0f, image[i].z) * 255);
}
but the format isn't recognized.
If you're trying to produce an image file you might look at Netpbm. You could write the intermediate format (PPM or PAM) fairly simply from what you have. There are then a large number of already-written programs that will generate many types of images from your intermediate file.
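As an illustration of that route, here is a minimal sketch under the same assumptions as the code in the question (a width*height buffer of float .x/.y/.z values in [0,1]) that writes a binary PPM instead of a raw dump:

#include <algorithm>
#include <fstream>

// Sketch: write the buffer as a binary PPM (P6) file.
std::ofstream ofs("./image.ppm", std::ios::out | std::ios::binary);
ofs << "P6\n" << width << " " << height << "\n255\n";
for (unsigned i = 0; i < width * height; ++i) {
    ofs << (unsigned char)(std::min(1.0f, image[i].x) * 255)
        << (unsigned char)(std::min(1.0f, image[i].y) * 255)
        << (unsigned char)(std::min(1.0f, image[i].z) * 255);
}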
Whoa there!
JPEG is MUCH more complicated than raw RGB values. You are going to need to use a library, like libjpeg, to store the data as JPEG.
If you wrote it yourself you'd have to:
Convert from RGB to YCbCr
Sample the image
Divide into 8x8 blocks.
Perform the DCT on each block.
Run-length/Huffman encode the values
Write these values in properly formatted JPEG blocks.
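For reference, a minimal libjpeg sketch along those lines (a hypothetical helper, not from the answers above; it assumes a tightly packed 8-bit RGB buffer and skips all error handling):

#include <cstdio>
#include <jpeglib.h>

// Sketch: compress a packed 8-bit RGB buffer to image.jpg using libjpeg.
void writeJpeg(const unsigned char* rgb, int width, int height)
{
    jpeg_compress_struct cinfo;
    jpeg_error_mgr jerr;
    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_compress(&cinfo);

    FILE* fp = std::fopen("image.jpg", "wb");
    jpeg_stdio_dest(&cinfo, fp);

    cinfo.image_width = width;
    cinfo.image_height = height;
    cinfo.input_components = 3;        // packed RGB
    cinfo.in_color_space = JCS_RGB;
    jpeg_set_defaults(&cinfo);
    jpeg_set_quality(&cinfo, 90, TRUE);

    jpeg_start_compress(&cinfo, TRUE);
    while (cinfo.next_scanline < cinfo.image_height) {
        JSAMPROW row = const_cast<unsigned char*>(rgb + cinfo.next_scanline * width * 3);
        jpeg_write_scanlines(&cinfo, &row, 1);
    }
    jpeg_finish_compress(&cinfo);
    jpeg_destroy_compress(&cinfo);
    std::fclose(fp);
}

The library then takes care of the color conversion, DCT, and entropy coding steps listed above.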
You could use Boost GIL: it's free, portable (it's part of Boost libraries), usable across a broad spectrum of operating systems (including Windows).
Popular Linux and Unix distributions such as Fedora, Debian and NetBSD include pre-built Boost packages.
The code is quite simple:
#include <boost/gil/extension/io/jpeg_io.hpp>

const unsigned width = 320;
const unsigned height = 200;

// Raw data.
unsigned char r[width * height];  // red
unsigned char g[width * height];  // green
unsigned char b[width * height];  // blue

int main()
{
    boost::gil::rgb8c_planar_view_t view =
        boost::gil::planar_rgb_view(width, height, r, g, b, width);
    boost::gil::jpeg_write_view("out.jpg", view);
    return 0;
}
jpeg_write_view saves the currently instantiated view to a jpeg file specified by the name (throws std::ios_base::failure if it fails to create the file).
Remember to link your program with -ljpeg.

NVIDIA CUDA Video Encoder (NVCUVENC) input from device texture array

I am modifying the CUDA Video Encoder (NVCUVENC) encoding sample found in the SDK samples pack so that the data comes not from external yuv files (as is done in the sample) but from a cudaArray which is filled from a texture.
So the key API method that encodes the frame is:
int NVENCAPI NVEncodeFrame(NVEncoder hNVEncoder, NVVE_EncodeFrameParams *pFrmIn, unsigned long flag, void *pData);
If I get it right, the param:
CUdeviceptr dptr_VideoFrame
is supposed to pass the data to encode. But I really haven't understood how to connect it with some texture data on the GPU. The sample source code is very vague about it, as it works with CPU yuv file input.
For example, in main.cpp, lines 555-560, there is the following block:
// If dptrVideoFrame is NULL, then we assume that frames come from system memory, otherwise it comes from GPU memory
// VideoEncoder.cpp, EncodeFrame() will automatically copy it to GPU Device memory, if GPU device input is specified
if (pCudaEncoder->EncodeFrame(efparams, dptrVideoFrame, cuCtxLock) == false)
{
    printf("\nEncodeFrame() failed to encode frame\n");
}
So, from the comment, it seems like dptrVideoFrame should be filled with yuv data coming from the device to encode the frame. But there is no place where it is explained how to do so.
UPDATE:
I would like to share some findings. First, I managed to encode data from the framebuffer texture. The problem now is that the output video is a mess.
That is the desired result:
Here is what I do :
On the OpenGL side I have 2 custom FBOs. The first gets the scene rendered normally into it. Then the texture from the first FBO is used to render a screen quad into the second FBO, doing the RGB -> YUV conversion in the fragment shader.
The texture attached to the second FBO is then mapped to a CUDA resource.
Then I encode the current texture like this:
void CUDAEncoder::Encode()
{
    NVVE_EncodeFrameParams efparams;
    efparams.Height = sEncoderParams.iOutputSize[1];
    efparams.Width = sEncoderParams.iOutputSize[0];
    efparams.Pitch = (sEncoderParams.nDeviceMemPitch ? sEncoderParams.nDeviceMemPitch : sEncoderParams.iOutputSize[0]);
    efparams.PictureStruc = (NVVE_PicStruct)sEncoderParams.iPictureType;
    efparams.SurfFmt = (NVVE_SurfaceFormat)sEncoderParams.iSurfaceFormat;
    efparams.progressiveFrame = (sEncoderParams.iSurfaceFormat == 3) ? 1 : 0;
    efparams.repeatFirstField = 0;
    efparams.topfieldfirst = (sEncoderParams.iSurfaceFormat == 1) ? 1 : 0;
    if (_curFrame > _framesTotal) {
        efparams.bLast = 1;
    } else {
        efparams.bLast = 0;
    }

    //----------- get cuda array from the texture resource -------------//
    checkCudaErrorsDrv(cuGraphicsMapResources(1, &_cutexResource, NULL));
    checkCudaErrorsDrv(cuGraphicsSubResourceGetMappedArray(&_cutexArray, _cutexResource, 0, 0));

    /////////// copy data into dptrVideoFrame //////////
    // LUMA, based on CUDA SDK sample //////////////
    CUDA_MEMCPY2D pcopy;
    memset((void *)&pcopy, 0, sizeof(pcopy));
    pcopy.srcXInBytes = 0;
    pcopy.srcY = 0;
    pcopy.srcHost = NULL;
    pcopy.srcDevice = 0;
    pcopy.srcPitch = efparams.Width;
    pcopy.srcArray = _cutexArray; /// SOME DEVICE ARRAY!!!!!!!!!!!!! <--------- to figure out how to fill this.
    /// destination //////
    pcopy.dstXInBytes = 0;
    pcopy.dstY = 0;
    pcopy.dstHost = 0;
    pcopy.dstArray = 0;
    pcopy.dstDevice = dptrVideoFrame;
    pcopy.dstPitch = sEncoderParams.nDeviceMemPitch;
    pcopy.WidthInBytes = sEncoderParams.iInputSize[0];
    pcopy.Height = sEncoderParams.iInputSize[1];
    pcopy.srcMemoryType = CU_MEMORYTYPE_ARRAY;
    pcopy.dstMemoryType = CU_MEMORYTYPE_DEVICE;

    // CHROMA, based on CUDA SDK sample /////
    CUDA_MEMCPY2D pcChroma;
    memset((void *)&pcChroma, 0, sizeof(pcChroma));
    pcChroma.srcXInBytes = 0;
    pcChroma.srcY = 0; // if I uncomment this line I get an error from CUDA for an incorrect value. It does work in the original CUDA SDK sample: sEncoderParams.iInputSize[1] << 1; // U/V chroma offset
    pcChroma.srcHost = NULL;
    pcChroma.srcDevice = 0;
    pcChroma.srcArray = _cutexArray;
    pcChroma.srcPitch = efparams.Width >> 1; // chroma is subsampled by 2 (but it has U/V next to each other)
    pcChroma.dstXInBytes = 0;
    pcChroma.dstY = sEncoderParams.iInputSize[1] << 1; // chroma offset (srcY*srcPitch now points to the chroma planes)
    pcChroma.dstHost = 0;
    pcChroma.dstDevice = dptrVideoFrame;
    pcChroma.dstArray = 0;
    pcChroma.dstPitch = sEncoderParams.nDeviceMemPitch >> 1;
    pcChroma.WidthInBytes = sEncoderParams.iInputSize[0] >> 1;
    pcChroma.Height = sEncoderParams.iInputSize[1]; // U/V are sent together
    pcChroma.srcMemoryType = CU_MEMORYTYPE_ARRAY;
    pcChroma.dstMemoryType = CU_MEMORYTYPE_DEVICE;

    checkCudaErrorsDrv(cuvidCtxLock(cuCtxLock, 0));
    checkCudaErrorsDrv(cuMemcpy2D(&pcopy));
    checkCudaErrorsDrv(cuMemcpy2D(&pcChroma));
    checkCudaErrorsDrv(cuvidCtxUnlock(cuCtxLock, 0));

    //=============================================
    // If dptrVideoFrame is NULL, then we assume that frames come from system memory, otherwise it comes from GPU memory
    // VideoEncoder.cpp, EncodeFrame() will automatically copy it to GPU Device memory, if GPU device input is specified
    if (_encoder->EncodeFrame(efparams, dptrVideoFrame, cuCtxLock) == false)
    {
        printf("\nEncodeFrame() failed to encode frame\n");
    }
    checkCudaErrorsDrv(cuGraphicsUnmapResources(1, &_cutexResource, NULL));
    // computeFPS();
    if (_curFrame > _framesTotal) {
        _encoder->Stop();
        exit(0);
    }
    _curFrame++;
}
I set the encoder params from the .cfg files included with the CUDA SDK Encoder sample, so here I use the 704x480-h264.cfg setup. I tried all of them and always get a similarly ugly result.
I suspect the problem is somewhere in the CUDA_MEMCPY2D params setup for the luma and chroma objects. Maybe wrong pitch, width, or height dimensions. I set the viewport to the same size as the video (704x480) and compared the params to those used in the CUDA SDK sample, but got no clue where the problem is.
Anyone?
First: I messed around with the CUDA Video Encoder and had lots of trouble too. But it looks to me as if you convert it to YUV values as a one-to-one pixel conversion (like AYUV 4:4:4). AFAIK you need the correct kind of YUV with padding and subsampling (color values shared by more than one pixel, like 4:2:0). A good overview of YUV layouts can be seen here:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd206750(v=vs.85).aspx
As far as I remember, you have to use the NV12 layout for the CUDA encoder.
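For reference, this is roughly what an NV12 frame looks like in memory (a sketch based on the NV12 format itself, not taken from the SDK sample): a full-resolution Y plane followed by a half-height plane of interleaved U/V bytes, with rows of both planes padded to the same pitch.

// NV12 layout sketch; 'width', 'height' and the device 'pitch' of dptrVideoFrame
// are assumptions about the destination buffer, not values from the sample.
//
//   rows 0 .. height-1            : Y plane, 'width' bytes of luma per row
//   rows height .. height*3/2 - 1 : interleaved U,V pairs, 'width' bytes per row,
//                                   covering 2x2 pixel blocks (4:2:0 subsampling)
//
size_t lumaBytes   = (size_t)pitch * height;        // Y plane
size_t chromaBytes = (size_t)pitch * (height / 2);  // interleaved UV plane
size_t frameBytes  = lumaBytes + chromaBytes;       // total NV12 frame size

Comparing the byte offsets and row widths implied by the CUDA_MEMCPY2D parameters above against this layout is one way to sanity-check the pitch/width/height values.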
The nvEncoder application is used for codec conversion; it uses CUDA for processing on the GPU and the nvEncoder API to communicate with the hardware.
The logic in this application is to read the yuv data into an input buffer, store that content in memory, and then start encoding the frames, writing the encoded frames to the output file in parallel.
Handling of the input buffer is done in the nvRead function, which is available in nvFileIO.h.
If any other help is required, leave a message here...