How do I pass an OpenCV Mat into a C++ TensorFlow graph?

In Tensorflow C++ I can load an image file into the graph using
tensorflow::Node* file_reader = tensorflow::ops::ReadFile(tensorflow::ops::Const(IMAGE_FILE_NAME, b.opts()),b.opts().WithName(input_name));
tensorflow::Node* image_reader = tensorflow::ops::DecodePng(file_reader, b.opts().WithAttr("channels", 3).WithName("png_reader"));
tensorflow::Node* float_caster = tensorflow::ops::Cast(image_reader, tensorflow::DT_FLOAT, b.opts().WithName("float_caster"));
tensorflow::Node* dims_expander = tensorflow::ops::ExpandDims(float_caster, tensorflow::ops::Const(0, b.opts()), b.opts());
tensorflow::Node* resized = tensorflow::ops::ResizeBilinear(dims_expander, tensorflow::ops::Const({input_height, input_width},b.opts().WithName("size")),b.opts());
For an embedded application I would like to instead pass an OpenCV Mat into this graph.
How would I convert the Mat to a tensor that could be used as input to tensorflow::ops::Cast or tensorflow::ops::ExpandDims?

It's not directly from a CvMat, but you can see an example of how to initialize a Tensor from an in-memory array in the TensorFlow Android example:
https://github.com/tensorflow/tensorflow/blob/0.6.0/tensorflow/examples/android/jni/tensorflow_jni.cc#L173
You would start off by creating a new tensorflow::Tensor object, with something like this (all code untested):
tensorflow::Tensor input_tensor(tensorflow::DT_FLOAT,
tensorflow::TensorShape({1, height, width, depth}));
This creates a Tensor object with float values, a batch size of 1, a size of height x width, and depth channels. For example, a 128-wide by 64-high image with 3 channels would pass in a shape of {1, 64, 128, 3}. The batch size is only used when you need to pass multiple images in a single call; for simple uses you can leave it as 1.
Then you would get the underlying array behind the tensor using a line like this:
auto input_tensor_mapped = input_tensor.tensor<float, 4>();
The input_tensor_mapped object is an interface to the data in your newly-created tensor, and you can then copy your own data into it. Here I'm assuming you've set source_data as a pointer to your source data, for example:
const float* source_data = some_structure.imageData;
You can then loop through your data and copy it over:
for (int y = 0; y < height; ++y) {
const float* source_row = source_data + (y * width * depth);
for (int x = 0; x < width; ++x) {
const float* source_pixel = source_row + (x * depth);
for (int c = 0; c < depth; ++c) {
const float* source_value = source_pixel + c;
input_tensor_mapped(0, y, x, c) = *source_value;
}
}
}
There are obvious opportunities to optimize this naive approach, and I don't have sample code on hand to show how to deal with the OpenCV side of getting the source data, but hopefully this is helpful to get you started.
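As a rough illustration of the OpenCV side, here is a minimal sketch that uses only the standard library. It assumes the source is an 8-bit, 3-channel image stored in BGR order with a row stride given in bytes, which is how a cv::Mat loaded with imread is typically laid out. The function name bgr8_to_rgb_float_nhwc is hypothetical and is not part of OpenCV or TensorFlow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copies an 8-bit BGR image whose rows start
// `stride_bytes` apart into a float buffer laid out as
// {1, height, width, 3} in RGB order.
std::vector<float> bgr8_to_rgb_float_nhwc(const std::uint8_t* bgr, int height,
                                          int width, std::size_t stride_bytes) {
  const int depth = 3;
  assert(stride_bytes >= static_cast<std::size_t>(width) * depth);
  std::vector<float> out(static_cast<std::size_t>(height) * width * depth);
  for (int y = 0; y < height; ++y) {
    // Use the stride, not width * depth, so padded rows are handled.
    const std::uint8_t* row = bgr + y * stride_bytes;
    for (int x = 0; x < width; ++x) {
      const std::uint8_t* px = row + x * depth;
      float* dst = &out[(static_cast<std::size_t>(y) * width + x) * depth];
      dst[0] = px[2];  // R
      dst[1] = px[1];  // G
      dst[2] = px[0];  // B
    }
  }
  return out;
}
```

With a cv::Mat you would presumably pass image.data, image.rows, image.cols and image.step, then copy the result into the tensor's buffer. Whether the model expects RGB or BGR, and whether it wants mean subtraction or scaling, depends on how it was trained.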

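Here is a complete example that reads an image and feeds it to the graph:
Mat image;
image = imread("flowers.jpg", CV_LOAD_IMAGE_COLOR);
// cv::Size takes (width, height), in that order
cv::resize(image, image, cv::Size(input_width, input_height), 0, 0, CV_INTER_CUBIC);
int depth = 3;
tensorflow::Tensor input_tensor(tensorflow::DT_FLOAT,
tensorflow::TensorShape({1, image.rows, image.cols, depth}));
// Typed view of the tensor, used by the copy loop below
auto input_tensor_mapped = input_tensor.tensor<float, 4>();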
for (int y = 0; y < image.rows; y++) {
for (int x = 0; x < image.cols; x++) {
Vec3b pixel = image.at<Vec3b>(y, x);
input_tensor_mapped(0, y, x, 0) = pixel.val[2]; //R
input_tensor_mapped(0, y, x, 1) = pixel.val[1]; //G
input_tensor_mapped(0, y, x, 2) = pixel.val[0]; //B
}
}
auto result = Sub(root.WithOpName("subtract_mean"), input_tensor, {input_mean});
ClientSession session(root);
TF_CHECK_OK(session.Run({result}, out_tensors));

I tried running the Inception model on an OpenCV Mat, and the following code worked for me: https://gist.github.com/kyrs/9adf86366e9e4f04addb. There are some issues with integrating OpenCV and TensorFlow, though: the code worked without problems for .png files but failed to load .jpg and .jpeg files. See https://github.com/tensorflow/tensorflow/issues/1924 for more information.

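// Copies a cv::Mat into a float tensor of shape {height, width, depth}.
Tensor convertMatToTensor(Mat &input)
{
int height = input.rows;
int width = input.cols;
int depth = input.channels();
Tensor imgTensor(tensorflow::DT_FLOAT, tensorflow::TensorShape({height, width, depth}));
float* p = imgTensor.flat<float>().data();
// Wrap the tensor's buffer in a Mat header (no copy). The type must match the
// input's channel count, otherwise convertTo() reallocates and the tensor stays unfilled.
Mat outputImg(height, width, CV_32FC(depth), p);
input.convertTo(outputImg, CV_32F);
return imgTensor;
}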

Related

Setting pixel color of 8-bit grayscale image using pointer

I have this code:
QImage grayImage = image.convertToFormat(QImage::Format_Grayscale8);
int size = grayImage.width() * grayImage.height();
QRgb *data = new QRgb[size];
memmove(data, grayImage.constBits(), size * sizeof(QRgb));
QRgb *ptr = data;
QRgb *end = ptr + size;
for (; ptr < end; ++ptr) {
int gray = qGray(*ptr);
}
delete[] data;
It is based on this: https://stackoverflow.com/a/40740985/8257882
How can I set the color of a pixel using that pointer?
In addition, using qGray() and loading a "bigger" image seem to crash this.
This works:
int width = image.width();
int height = image.height();
for (int y = 0; y < height; ++y) {
for (int x = 0; x < width; ++x) {
image.setPixel(x, y, qRgba(0, 0, 0, 255));
}
}
But it is slow when compared to explicitly manipulating the image data.
Edit
Ok, I have this code now:
for (int y = 0; y < height; ++y) {
uchar *line = grayImage.scanLine(y);
for (int x = 0; x < width; ++x) {
int gray = qGray(line[x]);
*(line + x) = uchar(gray);
qInfo() << gray;
}
}
And it seems to work. However, when I use an image that has only black and white colors and print the gray value, black color gives me 0 and white gives 39. How can I get the gray value in a range of 0-255?
First of all you are copying too much data in this line:
memmove(data, grayImage.constBits(), size * sizeof(QRgb));
The size of QRgb is 4 bytes, but according to the documentation, the size of a Format_Grayscale8 pixel is only 8 bits, or 1 byte. If you remove sizeof(QRgb) you should be copying the correct number of bytes, assuming all the lines in the bitmap are consecutive (which, according to the documentation, they are not: they are aligned to at least 32 bits, so you would have to account for that in size). The array data should not be of type QRgb[size] but uchar[size]. You can then modify data as you like. Finally, you will probably have to create a new QImage with one of the constructors that accept image bits as uchar and assign the new image to the old image:
auto newImage = QImage( data, image.width(), image.height(), QImage::Format_Grayscale8, ...);
grayImage = std::move( newImage );
But instead of copying image data, you could probably just modify grayImage directly by accessing its data through bits(), or even better, through scanLine(), maybe something like this:
int line, column;
auto pLine = grayImage.scanLine(line);
*(pLine + column) = uchar(grayValue);
EDIT:
According to scanLine documentation, the image is at least 32-bit aligned. So if your 8-bit grayScale image is 3 pixels wide, a new scan line will start every 4 bytes. If you have a 3x3 image, the total size of the memory required to hold the image pixels will be 12. The following code shows the required memory size:
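#include <QImage>
#include <iostream>
int main() {
auto image = QImage(3, 3, QImage::Format_Grayscale8);
// bytesPerLine() includes the padding, so this prints 12 rather than 9
std::cout << image.bytesPerLine() * image.height() << "\n";
return 0;
}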
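The same arithmetic can be sketched without Qt. The helper names below are hypothetical, and the only assumption is the 32-bit scan line alignment described above.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per scan line when every line is padded to a 32-bit boundary.
std::size_t padded_bytes_per_line(std::size_t width_px, std::size_t bits_per_pixel) {
  assert(bits_per_pixel > 0);
  return ((width_px * bits_per_pixel + 31) / 32) * 4;
}

// Total bytes needed for all scan lines, padding included.
std::size_t padded_image_bytes(std::size_t width_px, std::size_t height_px,
                               std::size_t bits_per_pixel) {
  return padded_bytes_per_line(width_px, bits_per_pixel) * height_px;
}
```

For the 3x3, 8-bit example this gives 4 bytes per line and 12 bytes in total, which is why a buffer of width * height bytes is too small whenever the width is not a multiple of 4.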
The fill method (setting all gray values to 0xC0) could be implemented like this:
auto image = QImage(3, 3, QImage::Format_Grayscale8);
uchar gray = 0xc0;
for ( int i = 0; i < image.height(); ++i ) {
auto pLine = image.scanLine( i );
for ( int j = 0; j < image.width(); ++j )
*pLine++ = gray;
}
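On the follow-up in the question's edit, where white prints as 39: a likely explanation is that qGray() expects a packed QRgb value, so a single byte of 255 is read as red 0, green 0, blue 255. Assuming the weighting documented by Qt, (r * 11 + g * 16 + b * 5) / 32, that works out to 39. The sketch below reproduces the arithmetic without Qt; gray_from_packed_rgb is a hypothetical stand-in for qGray(QRgb).

```cpp
#include <cstdint>

// Stand-in for qGray(QRgb), assuming Qt's documented weighting
// (r * 11 + g * 16 + b * 5) / 32 on a packed 0xAARRGGBB value.
int gray_from_packed_rgb(std::uint32_t rgb) {
  const int r = (rgb >> 16) & 0xff;
  const int g = (rgb >> 8) & 0xff;
  const int b = rgb & 0xff;
  return (r * 11 + g * 16 + b * 5) / 32;
}
```

In a Format_Grayscale8 image each byte in the scan line already is the gray level in the range 0-255, so line[x] can be used directly without calling qGray() on it.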

How to convert image storage order from channel-height-width to height-width-channel?

I would like to know how to convert an image stored as a 1D std::vector<float> from CHW format (Channel, Height, Width) to HWC format (Height, Width, Channel) in C++. The format change is needed due to requirements of a neural network.
I used OpenCV to read and show the image as below:
cv::namedWindow("Screenshot", cv::WINDOW_AUTOSIZE );
cv::imshow("Screenshot", rgbImage);
Then I converted the cv::Mat rgbImage to a 1D std::vector<float> in format CHW:
size_t channels = 3;
std::vector<float> data(channels*ROS_IMAGE_HEIGHT*ROS_IMAGE_WIDTH);
for(size_t j=0; j<ROS_IMAGE_HEIGHT; j++){
for(size_t k=0; k<ROS_IMAGE_WIDTH; k++){
cv::Vec3b intensity = rgbImage.at<cv::Vec3b>(j, k);
for(size_t i=0; i<channels; i++){
data[i*ROS_IMAGE_HEIGHT*ROS_IMAGE_WIDTH + j*ROS_IMAGE_HEIGHT + k] = (float) intensity[i];
}
}
}
Now I want to convert the format of std::vector<float> data to HWC. How can I do this?
I found some description of the "CHW" and "HWC" formats here.
If the storage order is HWC, it means that
Each sample is stored as a column-major matrix (height, width) of float[numChannels] (r00, g00, b00, r10, g10, b10, r01, g01, b01, r11, g11, b11).
Thus a pixel (x, y, c) is found using
xStride = channels;
yStride = channels * width;
cStride = 1;
data[x*xStride + y*yStride + c*cStride]
If the storage order is CHW, it means that each channel is a different plane. A pixel (x, y, c) is found using
xStride = 1;
yStride = width;
cStride = width * height;
data[x*xStride + y*yStride + c*cStride]
Note that in the code in the question, data[i*ROS_IMAGE_HEIGHT*ROS_IMAGE_WIDTH + j*ROS_IMAGE_HEIGHT + k] is incorrect: j is the y-coordinate and should be multiplied by ROS_IMAGE_WIDTH.
The code in the question can be modified to yield a std::vector in the HWC format by replacing the line in the innermost loop by:
data[i + j*ROS_IMAGE_WIDTH*channels + k*channels] = (float) intensity[i];
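If the data is already in a CHW vector and has to be reordered afterwards, the two stride formulas above can be combined into one loop. This is a minimal sketch; chw_to_hwc is a hypothetical name, and it assumes the input really is in correct CHW order (so not the output of the incorrect indexing in the question).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reorders a planar CHW buffer into an interleaved HWC buffer.
std::vector<float> chw_to_hwc(const std::vector<float>& chw, std::size_t channels,
                              std::size_t height, std::size_t width) {
  assert(chw.size() == channels * height * width);
  std::vector<float> hwc(chw.size());
  for (std::size_t c = 0; c < channels; ++c) {
    for (std::size_t y = 0; y < height; ++y) {
      for (std::size_t x = 0; x < width; ++x) {
        // CHW strides: c -> width * height, y -> width, x -> 1
        // HWC strides: y -> width * channels, x -> channels, c -> 1
        hwc[(y * width + x) * channels + c] = chw[(c * height + y) * width + x];
      }
    }
  }
  return hwc;
}
```

Calling chw_to_hwc(data, 3, ROS_IMAGE_HEIGHT, ROS_IMAGE_WIDTH) would return the reordered copy. Writing HWC directly while reading the cv::Mat, as shown above, avoids the extra pass.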

Why does QDBMP fail to write 128*128 images?

I am developing a C++ application that reads some bitmaps, works with them, and then saves them as bitmaps. I use the QDBMP library for working with bitmap files, and everything is fine for 512*512 bitmap images. But when working with 128*128 bitmap files, it just writes some striped lines in the output. Here is my code for reading and writing bitmap files:
int readBitmapImage(const char *file_name,UCHAR* r, UCHAR* g, UCHAR* b)
{
BMP* bmp;
UINT width, height;
bmp = BMP_ReadFile(file_name);
BMP_GetDepth(bmp);
BMP_CHECK_ERROR(stderr, -1);
width = BMP_GetWidth(bmp); height = BMP_GetHeight(bmp);
for (int x = 0; x < width; ++x)
{
for (int y = 0; y < height; ++y)
{
BMP_GetPixelRGB(bmp, x, y, &r[x*width+y], &g[x*width + y], &b[x*width + y]);
}
}
BMP_CHECK_ERROR(stderr, -2);
return 0;
}
void writeImageData(const char *file_name, UCHAR* r, UCHAR* g, UCHAR* b,int width,int height,int bitDepth)
{
BMP* bmp=BMP_Create(width,height,bitDepth);
width = BMP_GetWidth(bmp); height = BMP_GetHeight(bmp);
for (int x = 0; x < width; ++x)
{
for (int y = 0; y < height; ++y)
{
BMP_SetPixelRGB(bmp, x, y, r[x*width + y], g[x*width + y], b[x*width + y]);
}
}
BMP_WriteFile(bmp, file_name);
}
Thanks for your help.
UPDATE1
The source image is: (image not preserved)
The result of saving the source image is: (image not preserved)
UPDATE2
The value of bitDepth is 24, and the code block that allocates the memory is:
UCHAR* WimageDataR = (UCHAR*)calloc(128* 128, sizeof(UCHAR));
UCHAR* WimageDataG = (UCHAR*)calloc(128 * 128, sizeof(UCHAR));
UCHAR* WimageDataB = (UCHAR*)calloc(128 * 128, sizeof(UCHAR));
After a while I finally found out what was wrong. In the BMP_ReadFile() function of QDBMP, when the image has a size of 128*128, the header parameter ImageDataSize is not read from the file and has a size of 0. So I added this block of code to it to prevent the problem, and now everything works fine.
if (bmp->Header.ImageDataSize == 0)
{
bmp->Header.ImageDataSize = bmp->Header.FileSize - bmp->Header.DataOffset;
}
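For context, uncompressed BMP files are generally allowed to store 0 in the image size field, so a reader has to derive the size itself. The sketch below shows the fallback from the fix next to the value expected from the dimensions. BmpHeaderSketch and both function names are hypothetical and only mirror the QDBMP header fields used above; they are not the library's actual types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of a BMP header, for illustration only.
struct BmpHeaderSketch {
  std::uint32_t file_size;
  std::uint32_t data_offset;
  std::uint32_t image_data_size;  // may legitimately be 0 in the file
};

// Row size in bytes; BMP rows are padded to a multiple of 4 bytes.
std::uint32_t bmp_row_bytes(std::uint32_t width, std::uint32_t bits_per_pixel) {
  assert(bits_per_pixel > 0);
  return ((width * bits_per_pixel + 31) / 32) * 4;
}

// Same fallback as the fix: derive the size when the field is 0.
std::uint32_t effective_image_data_size(const BmpHeaderSketch& h) {
  if (h.image_data_size != 0) return h.image_data_size;
  assert(h.file_size >= h.data_offset);
  return h.file_size - h.data_offset;
}
```

For a 128*128, 24-bit image with a 54-byte header, both routes should give 384 * 128 = 49152 bytes, which is a quick sanity check on a file that triggers the problem.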

Convert cv::Mat to openni::VideoFrameRef

I have a kinect streaming data into a cv::Mat. I am trying to get some example code running that uses OpenNI.
Can I convert my Mat into an OpenNI format image somehow?
I just need the depth image, and after fighting with OpenNI for a long time, have given up on installing it.
I am using OpenCV 3, Visual Studio 2013, Kinect v2 for Windows.
The relevant code is:
void CDifodoCamera::loadFrame()
{
//Read the newest frame
openni::VideoFrameRef framed; //I assume I need to replace this with my Mat...
depth_ch.readFrame(&framed);
const int height = framed.getHeight();
const int width = framed.getWidth();
//Store the depth values
const openni::DepthPixel* pDepthRow = (const openni::DepthPixel*)framed.getData();
int rowSize = framed.getStrideInBytes() / sizeof(openni::DepthPixel);
for (int yc = height-1; yc >= 0; --yc)
{
const openni::DepthPixel* pDepth = pDepthRow;
for (int xc = width-1; xc >= 0; --xc, ++pDepth)
{
if (*pDepth < 4500.f)
depth_wf(yc,xc) = 0.001f*(*pDepth);
else
depth_wf(yc,xc) = 0.f;
}
pDepthRow += rowSize;
}
}
First you need to understand how your data arrives. If it is already in a cv::Mat, you should be receiving two images: one for the RGB information, usually a 3-channel uchar cv::Mat, and another for the depth information, usually stored as 16-bit values in millimeters (you cannot save a float Mat as an image, but you can save it as a yml/xml file using OpenCV).
Assuming you want to read and process the image that contains the depth information, you can change your code to:
void CDifodoCamera::loadFrame()
{
//Read the newest frame
//the depth image should be png since it is the one which supports 16 bits and it must have the ANYDEPTH flag
cv::Mat depth_im = cv::imread("img_name.png",CV_LOAD_IMAGE_ANYDEPTH);
const int height = depth_im.rows;
const int width = depth_im.cols;
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++)
{
if (depth_im.at<unsigned short>(y,x) < 4500)
depth_wf(y,x) = 0.001f * (float)depth_im.at<unsigned short>(y,x);
else
depth_wf(y,x) = 0.f;
}
}
}
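One difference worth noting: the OpenNI loop in the question fills depth_wf from the last row and column backwards, so the frame ends up mirrored in both axes, while the loop above copies it unflipped. Whether the flip is needed depends on how the sensor and the rest of the algorithm are oriented. Here is a minimal standard-library sketch of the flipped variant, assuming 16-bit depth in millimetres; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts 16-bit depth in millimetres to metres, zeroing readings at or
// beyond max_mm, and mirrors the frame in both axes like the original loop.
// row_stride is the distance between rows, in elements (not bytes).
std::vector<float> depth_mm_to_metres_flipped(const std::uint16_t* depth_mm,
                                              int height, int width,
                                              std::size_t row_stride,
                                              std::uint16_t max_mm = 4500) {
  assert(row_stride >= static_cast<std::size_t>(width));
  std::vector<float> out(static_cast<std::size_t>(height) * width, 0.f);
  for (int r = 0; r < height; ++r) {
    const std::uint16_t* src_row = depth_mm + r * row_stride;
    for (int c = 0; c < width; ++c) {
      const std::uint16_t d = src_row[c];
      const int yc = height - 1 - r;  // first source row lands in the last output row
      const int xc = width - 1 - c;   // first source column lands in the last output column
      out[static_cast<std::size_t>(yc) * width + xc] = (d < max_mm) ? 0.001f * d : 0.f;
    }
  }
  return out;
}
```

With a live 16-bit cv::Mat, the inputs would presumably be depth_im.ptr<unsigned short>(0), depth_im.rows, depth_im.cols and depth_im.step / sizeof(unsigned short).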
I hope this helps you. If you have any question just ask :)

Accessing certain pixel RGB value in openCV

I have searched internet and stackoverflow thoroughly, but I haven't found answer to my question:
How can I get/set (both) RGB value of certain (given by x,y coordinates) pixel in OpenCV? What's important-I'm writing in C++, the image is stored in cv::Mat variable. I know there is an IplImage() operator, but IplImage is not very comfortable in use-as far as I know it comes from C API.
Yes, I'm aware that there was already this Pixel access in OpenCV 2.2 thread, but it was only about black and white bitmaps.
EDIT:
Thank you very much for all your answers. I see there are many ways to get/set RGB value of pixel. I got one more idea from my close friend-thanks Benny! It's very simple and effective. I think it's a matter of taste which one you choose.
Mat image;
(...)
Point3_<uchar>* p = image.ptr<Point3_<uchar> >(y,x);
And then you can read/write RGB values with:
p->x //B
p->y //G
p->z //R
Try the following:
cv::Mat image = ...do some stuff...;
image.at<cv::Vec3b>(y,x); // gives you the RGB (it might be ordered as BGR) vector of type cv::Vec3b
image.at<cv::Vec3b>(y,x)[0] = newval[0];
image.at<cv::Vec3b>(y,x)[1] = newval[1];
image.at<cv::Vec3b>(y,x)[2] = newval[2];
The low-level way would be to access the matrix data directly. In an RGB image (which I believe OpenCV typically stores as BGR), and assuming your cv::Mat variable is called frame, you could get the blue value at location (x, y) (from the top left) this way:
frame.data[frame.channels()*(frame.cols*y + x)];
Likewise, to get B, G, and R:
uchar b = frame.data[frame.channels()*(frame.cols*y + x) + 0];
uchar g = frame.data[frame.channels()*(frame.cols*y + x) + 1];
uchar r = frame.data[frame.channels()*(frame.cols*y + x) + 2];
Note that this code assumes the stride is equal to the width of the image.
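When the rows are padded, or the Mat is a region of interest inside a larger image, the offset has to be built from the row stride in bytes instead of cols * channels. This is a minimal sketch for 8-bit images; pixel_offset is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of channel c of pixel (x, y) in an 8-bit interleaved image
// whose rows start step_bytes apart.
std::size_t pixel_offset(std::size_t step_bytes, int channels, int y, int x, int c) {
  assert(channels > 0 && y >= 0 && x >= 0 && c >= 0 && c < channels);
  return static_cast<std::size_t>(y) * step_bytes +
         static_cast<std::size_t>(x) * channels + c;
}
```

With a cv::Mat this would presumably be used as frame.data[pixel_offset(frame.step, frame.channels(), y, x, 0)] for the blue value. When the step equals cols * channels, it reduces to the expressions above.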
A piece of code may be easier for people who have this problem. I am sharing my code so you can use it directly. Please note that OpenCV stores pixels as BGR.
cv::Mat vImage_(vHeight_, vWidth_, CV_32FC3); // allocate the image before writing to it with at<>()
if(src_)
{
cv::Vec3f vec_;
for(int i = 0; i < vHeight_; i++)
for(int j = 0; j < vWidth_; j++)
{
vec_ = cv::Vec3f((*src_)[0]/255.0, (*src_)[1]/255.0, (*src_)[2]/255.0);//Please note that OpenCV store pixels as BGR.
vImage_.at<cv::Vec3f>(vHeight_-1-i, j) = vec_;
++src_;
}
}
if(! vImage_.data ) // Check for invalid input
printf("failed to read image by OpenCV.");
else
{
cv::namedWindow( windowName_, CV_WINDOW_AUTOSIZE);
cv::imshow( windowName_, vImage_); // Show the image.
}
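The current version allows the cv::Mat::at function to handle 3 dimensions. So for a Mat object m, m.at<uchar>(0,0,0) should work. Note that, as far as I know, this three-index form addresses a 3-dimensional Mat; for an ordinary 2D multi-channel image the channel is still selected through the element type, e.g. at<cv::Vec3b>(y,x)[c].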
uchar * value = img2.data; // Pointer to the first byte of pixel data; the buffer holds all channel values interleaved
int r = 2;
for (size_t i = 0; i < img2.cols* (img2.rows * img2.channels()); i++)
{
if (r > 2) r = 0;
if (r == 0) value[i] = 0;
if (r == 1)value[i] = 0;
if (r == 2)value[i] = 255;
r++;
}
const double pi = boost::math::constants::pi<double>();
cv::Mat distance2ellipse(cv::Mat image, cv::RotatedRect ellipse){
float distance = 2.0f;
float angle = ellipse.angle;
cv::Point ellipse_center = ellipse.center;
float major_axis = ellipse.size.width/2;
float minor_axis = ellipse.size.height/2;
cv::Point pixel;
float a,b,c,d;
for(int x = 0; x < image.cols; x++)
{
for(int y = 0; y < image.rows; y++)
{
auto u = cos(angle*pi/180)*(x-ellipse_center.x) + sin(angle*pi/180)*(y-ellipse_center.y);
auto v = -sin(angle*pi/180)*(x-ellipse_center.x) + cos(angle*pi/180)*(y-ellipse_center.y);
distance = (u/major_axis)*(u/major_axis) + (v/minor_axis)*(v/minor_axis);
if(distance<=1)
{
image.at<cv::Vec3b>(y,x)[1] = 255;
}
}
}
return image;
}
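The membership test inside that loop can be pulled out into a small function that does not depend on OpenCV or Boost. This is a minimal sketch; inside_rotated_ellipse is a hypothetical name, and semi_a and semi_b stand for the half-axes that the code above computes as size.width/2 and size.height/2.

```cpp
#include <cassert>
#include <cmath>

// Returns true if (x, y) lies inside or on an ellipse centred at (cx, cy)
// with semi-axes semi_a and semi_b, rotated by angle_deg degrees.
bool inside_rotated_ellipse(double x, double y, double cx, double cy,
                            double semi_a, double semi_b, double angle_deg) {
  assert(semi_a > 0.0 && semi_b > 0.0);
  const double pi = std::acos(-1.0);
  const double t = angle_deg * pi / 180.0;
  // Rotate the point into the ellipse's own coordinate frame.
  const double u = std::cos(t) * (x - cx) + std::sin(t) * (y - cy);
  const double v = -std::sin(t) * (x - cx) + std::cos(t) * (y - cy);
  return (u / semi_a) * (u / semi_a) + (v / semi_b) * (v / semi_b) <= 1.0;
}
```

Computing the sine and cosine once per call, or once per ellipse, also avoids recomputing them for every pixel as the loop above does.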