I am using a PMD camera to capture depth images in the format below:
struct DepthData
{
    int                       version;
    std::chrono::microseconds timeStamp;
    uint16_t                  width;
    uint16_t                  height;
    std::vector<uint16_t>     exposureTimes;
    std::vector<DepthPoint>   points;        //!< array of points
};
The depth point structure looks like this:
struct DepthPoint
{
    float    x;               //!< X coordinate [meters]
    float    y;               //!< Y coordinate [meters]
    float    z;               //!< Z coordinate [meters]
    float    noise;           //!< noise value [meters]
    uint16_t grayValue;       //!< 16-bit gray value
    uint8_t  depthConfidence; //!< value 0 = bad, 255 = good
};
I am trying to convert it into an OpenCV Mat with the code below, but it throws an exception. Kindly help.
const int imageSize = w * h;
Mat out = cv::Mat(h, w, CV_16UC3, Scalar(0, 0, 0));
const Scalar S;
for (int h = 0; h < out.rows; h++)
{
    //printf("%" PRIu64 "\n", point.at(h).grayValue);
    for (int w = 0; w < out.cols; w++)
    {
        //printf("%" PRIu64 "\n", point.at(h).grayValue);
        out.at<cv::Vec3f>(h, w)[0] = point[w].x;
        out.at<cv::Vec3f>(h, w)[1] = point[w].y;
        out.at<cv::Vec3f>(h, w)[2] = point[w].z;
    }
}
imwrite("E:/softwares/1.8.0.71/bin/depthImage1.png", out);
You seem to be doing a couple of things wrong.
You are creating an image of type CV_16UC3, which is a 3-channel 16-bit unsigned short image, but in out.at<cv::Vec3f>(h, w)[0] you then try to access it as a vector of 3 floats. You should probably create your image as a float image instead.
For further details please provide the exception. It will be easier to help.
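For reference, a minimal sketch of that float-image approach, assuming point holds w * h DepthPoints stored row by row (the indexing is an assumption, not from the question):
// Sketch only: a 3-channel float image holding x/y/z per pixel.
cv::Mat xyz(h, w, CV_32FC3);
for (int row = 0; row < h; row++)
{
    for (int col = 0; col < w; col++)
    {
        const DepthPoint& p = point[row * w + col]; // flat, row-major point array
        xyz.at<cv::Vec3f>(row, col) = cv::Vec3f(p.x, p.y, p.z);
    }
}
// Note: PNG cannot store 32-bit floats; write .exr/.tiff or convert before imwrite.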
UPD: If you just want a depth image, create an image like this:
Mat out = cv::Mat::zeros(h, w, CV_16UC1);
Then in every pixel:
out.at<uint16_t>(h, w) = depth_point.grayValue;
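Putting that together with the question's loop, a hedged sketch (it assumes the points are stored row by row, so the index is row * w + col rather than just the column index used in the question):
cv::Mat out = cv::Mat::zeros(h, w, CV_16UC1);
for (int row = 0; row < out.rows; row++)
{
    for (int col = 0; col < out.cols; col++)
    {
        out.at<uint16_t>(row, col) = point[row * w + col].grayValue;
    }
}
cv::imwrite("depthImage1.png", out); // 16-bit single-channel PNG is supported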
I have a 2D vector of float values that I need to create an image from.
The code that I have is as follows:
inline cv::Mat ConvertToMat(vector<vector<float>> inputData)
{
    static int MAXGREY = 255;
    static int MAXRANGE = 255;
    int Red, Blue, Green;

    float maxValue = GetMaxValue(inputData); // find max value in input data
    cv::Mat output((int)inputData.size(), (int)inputData[0].size(), CV_8UC3, cv::Scalar::all(0));

    // if the max value is equal to or less than 0, there is no data in the vector to convert.
    if (maxValue > 0)
    {
        for (int x = 0; x < inputData.size(); x++)
        {
            for (int y = 0; y < inputData[x].size(); y++)
            {
                auto Value = inputData[x][y];
                Green = 0;
                Red = Value * 255 / maxValue;
                Blue = (maxValue - Value) * 255 / maxValue;

                cv::Vec3b xyzBuffer;
                xyzBuffer[0] = Blue;
                xyzBuffer[1] = Red;
                xyzBuffer[2] = Green;
                output.at<cv::Vec3b>(x, y) = xyzBuffer;
            }
        }
    }
    return output;
}
But this method doesn't give usable results when there is one pixel with a very high value and many pixels with small values: all the small values become indistinguishable in the output.
For example, let's look at this set of input data:
int main()
{
    vector<vector<float>> inputData =
    {
        { 1,   2,   3,   4,   5,    6,   7,   8,   9,   10   },
        { 1.5, 2.5, 3.5, 4.5, 5.5,  6.5, 7.5, 8.5, 9.5, 10.5 },
        { 1,   2,   3,   4,   5,    6,   7,   8,   9,   10   },
        { 1.5, 2.5, 3.5, 4.5, 5.5,  6.5, 7.5, 8.5, 9.5, 10.5 },
        { 1,   2,   3,   4,   2000, 6,   7,   8,   9,   10   },
        { 1.5, 2.5, 3.5, 4.5, 5.5,  6.5, 7.5, 8.5, 9.5, 10.5 },
        { 1,   2,   3,   4,   5,    6,   7,   8,   9,   10   },
        { 1.5, 2.5, 3.5, 4.5, 5.5,  6.5, 7.5, 8.5, 9.5, 10.5 },
        { 1,   2,   3,   4,   5,    6,   7,   8,   9,   10   },
        { 1.5, 2.5, 3.5, 4.5, 5.5,  6.5, 7.5, 8.5, 9.5, 10.5 }
    };

    cv::Mat image = ConvertToMat(inputData);
    cv::imwrite("c://tmp//myimage.jpg", image);
    return 0;
}
The generated output is as follows (the value of each pixel is shown on the pixel):
Since we have 3 bytes of colour per pixel, there should be enough dynamic range to give each value its own colour, but the above algorithm maps the values 1, 2 and 3 all to the same colour (254, 0, 0): with maxValue = 2000, Red = 1 * 255 / 2000 truncates to 0 and Blue comes out at 254 for all three.
How can I map a float onto three colour channels so that every pixel gets a colour suitable for visual inspection (pixels with nearby values get similar, but not identical, colours)?
We are creating a game with maps. Players can walk on those maps, but to know whether they can walk somewhere we have another image on which the walkable path is painted.
The player moves by clicking on the map: if the click falls on the collider image, the character should go to the clicked point using a pathfinder; if not, the character doesn't move.
For example, here is a map and its collision path image:
How can I know whether I've clicked on the collider (a PNG with one colour and transparency) in Qt?
I'm using QML and Felgo for rendering, so if there is already a way to do it with QML, even better, but I can implement it in C++ too.
My second question is: how can I do a pathfinder? I know the algorithms for that, but should I move using pixels?
I've seen the QPainterPath class, which could be what I'm looking for. How can I read all pixels of a certain colour in my image and get their coordinates?
Thanks
The QML interface doesn't provide an efficient way to solve this task; it should be done on the C++ side.
To get the image data you can:
Use QImage to load the image.
Call QImage::constScanLine N times, reading K pixels each time; N equals the image height in pixels, K equals the width.
How do you deal with the uchar* returned by QImage::constScanLine?
You should call QImage::format() to determine the pixel format behind the uchar*. Or you can call QImage::convertToFormat(QImage::Format_RGB32) and then always cast the pixel data from uchar* to a custom struct like PixelData:
#pragma pack(push, 1)
// Format_RGB32 stores each pixel as 0xffRRGGBB; on a little-endian machine
// the bytes appear in memory as B, G, R, 0xFF.
struct PixelData {
    uint8_t b;
    uint8_t g;
    uint8_t r;
    uint8_t padding;
};
#pragma pack(pop)
according to this documentation: https://doc.qt.io/qt-5/qimage.html#Format-enum
Here is a compilable solution for loading an image into RAM for further efficient work with its data:
#include <QImage>
#include <cstring> // for memcpy
#pragma pack(push, 1)
// Format_RGB32 stores each pixel as 0xffRRGGBB; on a little-endian machine
// the bytes appear in memory as B, G, R, 0xFF.
struct PixelData {
    uint8_t b;
    uint8_t g;
    uint8_t r;
    uint8_t padding;
};
#pragma pack(pop)
void loadImage(const char* path, int& w, int& h, PixelData** data) {
    Q_ASSERT(data);

    QImage initialImage;
    initialImage.load(path);
    auto image = initialImage.convertToFormat(QImage::Format_RGB32);

    w = image.width();
    h = image.height();
    *data = new PixelData[w * h];

    PixelData* outData = *data;
    for (int y = 0; y < h; y++) {
        auto scanLine = image.constScanLine(y);
        memcpy(outData, scanLine, sizeof(PixelData) * w);
        outData += w;
    }
}
void pathfinder(const PixelData* data, int w, int h) {
    // Your algorithm here
}

void cleanupData(PixelData* data) {
    delete[] data;
}
int main(int argc, char *argv[])
{
    int width, height;
    PixelData* data;

    loadImage("D:\\image.png", width, height, &data);
    pathfinder(data, width, height);
    cleanupData(data);

    return 0;
}
You can access each pixel by calling this function
inline const PixelData& getPixel(int x, int y, const PixelData* data, int w) {
    return *(data + (w * y) + x);
}
... or use this formula somewhere in your pathfinding algorithm, where it could be more efficient.
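For the original click question, a hypothetical helper on top of getPixel (not part of the solution above; the path colour is a placeholder, substitute whatever colour is actually painted in your collider image):
inline bool isWalkable(int x, int y, const PixelData* collider, int w) {
    const PixelData& px = getPixel(x, y, collider, w);
    // Assumed: the walkable path is painted pure white in the collider image.
    return px.r == 255 && px.g == 255 && px.b == 255;
}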
Here is code that decodes a WebM frame and puts it into a buffer:
image->planes[p] = pointer to the top-left pixel
image->linesize[p] = stride between rows
framesArray = vector of unsigned char*
while (videoDec->getImage(*image) == VPXDecoder::NO_ERROR)
{
    const int w = image->getWidth(p);
    const int h = image->getHeight(p);

    int offset = 0;
    for (int y = 0; y < h; y++)
    {
        // fwrite(image->planes[p] + offset, 1, w, pFile);
        for (int i = 0; i < w; i++) {
            framesArray.at(count)[i + (w * y)] = *(image->planes[p] + offset + i);
        }
        offset += image->linesize[p];
    }
}
.............................
How can I write into the buffer line by line instead of pixel by pixel, or otherwise optimize writing the frame into the buffer?
If the source image and the destination buffer share the same width, height and bits per pixel, you can use std::copy to copy the whole image in one go:
std::copy(image->planes[p], image->planes[p] + image->getHeight(p) * image->linesize[p], framesArray.at(count));
If the bits per pixel are the same but the width and height differ, you can use std::copy line by line.
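A minimal sketch of the per-line copy, assuming the destination buffer at framesArray.at(count) holds w * h tightly packed bytes (names taken from the question; requires <algorithm>):
const int w = image->getWidth(p);
const int h = image->getHeight(p);
unsigned char* dst = framesArray.at(count);
for (int y = 0; y < h; y++)
{
    const unsigned char* srcRow = image->planes[p] + y * image->linesize[p];
    std::copy(srcRow, srcRow + w, dst + y * w); // one row at a time, skipping the stride padding
}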
In TensorFlow C++ I can load an image file into the graph using:
tensorflow::Node* file_reader = tensorflow::ops::ReadFile(tensorflow::ops::Const(IMAGE_FILE_NAME, b.opts()),b.opts().WithName(input_name));
tensorflow::Node* image_reader = tensorflow::ops::DecodePng(file_reader, b.opts().WithAttr("channels", 3).WithName("png_reader"));
tensorflow::Node* float_caster = tensorflow::ops::Cast(image_reader, tensorflow::DT_FLOAT, b.opts().WithName("float_caster"));
tensorflow::Node* dims_expander = tensorflow::ops::ExpandDims(float_caster, tensorflow::ops::Const(0, b.opts()), b.opts());
tensorflow::Node* resized = tensorflow::ops::ResizeBilinear(dims_expander, tensorflow::ops::Const({input_height, input_width},b.opts().WithName("size")),b.opts());
For an embedded application I would like to instead pass an OpenCV Mat into this graph.
How would I convert the Mat to a tensor that could be used as input to tensorflow::ops::Cast or tensorflow::ops::ExpandDims?
It's not directly from a CvMat, but you can see an example of how to initialize a Tensor from an in-memory array in the TensorFlow Android example:
https://github.com/tensorflow/tensorflow/blob/0.6.0/tensorflow/examples/android/jni/tensorflow_jni.cc#L173
You would start off by creating a new tensorflow::Tensor object, with something like this (all code untested):
tensorflow::Tensor input_tensor(tensorflow::DT_FLOAT,
                                tensorflow::TensorShape({1, height, width, depth}));
This creates a Tensor object with float values, with a batch size of 1, a size of width x height, and depth channels. For example, a 128-wide by 64-high image with 3 channels would pass in a shape of {1, 64, 128, 3}. The batch size is only used when you need to pass in multiple images in a single call; for simple uses you can leave it as 1.
Then you would get the underlying array behind the tensor using a line like this:
auto input_tensor_mapped = input_tensor.tensor<float, 4>();
The input_tensor_mapped object is an interface to the data in your newly-created tensor, and you can then copy your own data into it. Here I'm assuming you've set source_data as a pointer to your source data, for example:
const float* source_data = some_structure.imageData;
You can then loop through your data and copy it over:
for (int y = 0; y < height; ++y) {
    const float* source_row = source_data + (y * width * depth);
    for (int x = 0; x < width; ++x) {
        const float* source_pixel = source_row + (x * depth);
        for (int c = 0; c < depth; ++c) {
            const float* source_value = source_pixel + c;
            input_tensor_mapped(0, y, x, c) = *source_value;
        }
    }
}
There are obvious opportunities to optimize this naive approach, and I don't have sample code on hand to show how to deal with the OpenCV side of getting the source data, but hopefully this is helpful to get you started.
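As a rough sketch of the OpenCV side (an assumption, not from the answer above: image is taken to be a 3-channel CV_8UC3 cv::Mat, and height, width, depth and input_tensor_mapped are the names already used here): convert the Mat to float first, then copy row by row so any row padding is respected.
cv::Mat float_image;
image.convertTo(float_image, CV_32FC3); // 8-bit BGR -> 32-bit float, 3 channels
for (int y = 0; y < height; ++y) {
    const float* source_row = float_image.ptr<float>(y); // width * depth floats per row
    for (int x = 0; x < width; ++x) {
        for (int c = 0; c < depth; ++c) {
            // Note: OpenCV stores channels as BGR; swap indices if the graph expects RGB.
            input_tensor_mapped(0, y, x, c) = source_row[x * depth + c];
        }
    }
}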
Here is a complete example that reads an image and feeds it:
Mat image;
image = imread("flowers.jpg", CV_LOAD_IMAGE_COLOR);
cv::resize(image, image, cv::Size(input_height, input_width), 0, 0, CV_INTER_CUBIC);
int depth = 3;
tensorflow::Tensor input_tensor(tensorflow::DT_FLOAT,
                                tensorflow::TensorShape({1, image.rows, image.cols, depth}));
auto input_tensor_mapped = input_tensor.tensor<float, 4>(); // map the tensor for element access

for (int y = 0; y < image.rows; y++) {
    for (int x = 0; x < image.cols; x++) {
        Vec3b pixel = image.at<Vec3b>(y, x);

        input_tensor_mapped(0, y, x, 0) = pixel.val[2]; // R
        input_tensor_mapped(0, y, x, 1) = pixel.val[1]; // G
        input_tensor_mapped(0, y, x, 2) = pixel.val[0]; // B
    }
}
auto result = Sub(root.WithOpName("subtract_mean"), input_tensor, {input_mean});
ClientSession session(root);
TF_CHECK_OK(session.Run({result}, out_tensors));
I tried to run the Inception model on an OpenCV Mat, and the following code worked for me: https://gist.github.com/kyrs/9adf86366e9e4f04addb. There are still some issues with the integration of OpenCV and TensorFlow: the code worked without any issue for .png files but failed to load .jpg and .jpeg. You can follow this for more info: https://github.com/tensorflow/tensorflow/issues/1924
Tensor convertMatToTensor(Mat& input)
{
    int height = input.rows;
    int width = input.cols;
    int depth = input.channels();

    Tensor imgTensor(tensorflow::DT_FLOAT, tensorflow::TensorShape({height, width, depth}));

    // Wrap the tensor's memory in a Mat header, then let convertTo write
    // the float pixels straight into the tensor.
    float* p = imgTensor.flat<float>().data();
    Mat outputImg(height, width, CV_32FC3, p);
    input.convertTo(outputImg, CV_32FC3);

    return imgTensor;
}
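A hedged usage sketch (the node names and the session object are placeholders for your own graph). Note that this helper produces a rank-3 tensor with no batch dimension, so the graph must expect that shape or you need an ExpandDims/reshape step:
std::unique_ptr<tensorflow::Session> session; // assumed created elsewhere with your graph loaded

cv::Mat frame = cv::imread("frame.png", cv::IMREAD_COLOR);
tensorflow::Tensor input = convertMatToTensor(frame);

std::vector<tensorflow::Tensor> outputs;
// "input" and "output" are placeholder node names.
TF_CHECK_OK(session->Run({{"input", input}}, {"output"}, {}, &outputs));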
I have a Kinect streaming data into a cv::Mat, and I am trying to get some example code running that uses OpenNI.
Can I convert my Mat into an OpenNI format image somehow?
I just need the depth image, and after fighting with OpenNI for a long time, have given up on installing it.
I am using OpenCV 3, Visual Studio 2013, Kinect v2 for Windows.
The relevant code is:
void CDifodoCamera::loadFrame()
{
    //Read the newest frame
    openni::VideoFrameRef framed; //I assume I need to replace this with my Mat...
    depth_ch.readFrame(&framed);

    const int height = framed.getHeight();
    const int width = framed.getWidth();

    //Store the depth values
    const openni::DepthPixel* pDepthRow = (const openni::DepthPixel*)framed.getData();
    int rowSize = framed.getStrideInBytes() / sizeof(openni::DepthPixel);

    for (int yc = height - 1; yc >= 0; --yc)
    {
        const openni::DepthPixel* pDepth = pDepthRow;
        for (int xc = width - 1; xc >= 0; --xc, ++pDepth)
        {
            if (*pDepth < 4500.f)
                depth_wf(yc, xc) = 0.001f * (*pDepth);
            else
                depth_wf(yc, xc) = 0.f;
        }
        pDepthRow += rowSize;
    }
}
First you need to understand how your data arrives. If it is already in a cv::Mat, you should be receiving two images: one with the RGB information, usually a 3-channel uchar cv::Mat, and another with the depth information, usually stored as 16-bit values in millimetres (you cannot save a float Mat as an image, but you can save it as a yml/xml file using OpenCV).
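As an illustration of that last point (a sketch, not part of the original answer; depth_float and depth_16u are hypothetical Mats): a float depth map can be round-tripped through a .yml file with cv::FileStorage, while a 16-bit depth map can go straight to PNG.
cv::Mat depth_float(480, 640, CV_32FC1); // hypothetical float depth map [meters]
cv::Mat depth_16u(480, 640, CV_16UC1);   // hypothetical 16-bit depth map [millimetres]

// Save/load the float depth map via YML (PNG cannot hold 32-bit floats).
cv::FileStorage fs("depth_float.yml", cv::FileStorage::WRITE);
fs << "depth" << depth_float;
fs.release();

cv::FileStorage in("depth_float.yml", cv::FileStorage::READ);
cv::Mat loaded;
in["depth"] >> loaded;

// The 16-bit depth map can be written directly as a PNG.
cv::imwrite("depth_16u.png", depth_16u);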
Assuming you want to read and process the image that contains the depth information, you can change your code to:
void CDifodoCamera::loadFrame()
{
    //Read the newest frame
    //The depth image should be a PNG, since that format supports 16 bits, and it must be read with the ANYDEPTH flag
    cv::Mat depth_im = cv::imread("img_name.png", CV_LOAD_IMAGE_ANYDEPTH);

    const int height = depth_im.rows;
    const int width = depth_im.cols;

    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            if (depth_im.at<unsigned short>(y, x) < 4500)
                depth_wf(y, x) = 0.001f * (float)depth_im.at<unsigned short>(y, x);
            else
                depth_wf(y, x) = 0.f;
        }
    }
}
I hope this helps. If you have any questions, just ask :)