Tensorflow DrawBoundingBoxes C++

Tensorflow DrawBoundingBoxes C++ - c++

I am detecting objects using Tensorflow in C++. It does work well and i want to draw the boxes to have some visual feedback.
There is the operation tensorflow::ops::DrawBoundingBoxes that would let me do this but the problem is :
I don't understand what the input values should be for the boxes. What does this mean :
boxes: 3-D with shape [batch, num_bounding_boxes, 4] containing bounding boxes.
I could not find an example that uses this operation in C++ anywhere, like almost as if this ops does not exist.
Is there an example somewhere in C++ where this ops is used? It sounds a basic things to do for a tutorial or to debug.

In case if you are still on this question, I have written my own implementation of this operation using OpenCV basic methods. It also supports captioning the boxes with corresponding class labels.
/** Draw bounding box and add caption to the image.
* Boolean flag _scaled_ shows if the passed coordinates are in relative units (true by default in tensorflow detection)
*/
void drawBoundingBoxOnImage(Mat &image, double yMin, double xMin, double yMax, double xMax, double score, string label, bool scaled=true) {
cv::Point tl, br;
if (scaled) {
tl = cv::Point((int) (xMin * image.cols), (int) (yMin * image.rows));
br = cv::Point((int) (xMax * image.cols), (int) (yMax * image.rows));
} else {
tl = cv::Point((int) xMin, (int) yMin);
br = cv::Point((int) xMax, (int) yMax);
}
cv::rectangle(image, tl, br, cv::Scalar(0, 255, 255), 1);
// Ceiling the score down to 3 decimals (weird!)
float scoreRounded = floorf(score * 1000) / 1000;
string scoreString = to_string(scoreRounded).substr(0, 5);
string caption = label + " (" + scoreString + ")";
// Adding caption of type "LABEL (X.XXX)" to the top-left corner of the bounding box
int fontCoeff = 12;
cv::Point brRect = cv::Point(tl.x + caption.length() * fontCoeff / 1.6, tl.y + fontCoeff);
cv::rectangle(image, tl, brRect, cv::Scalar(0, 255, 255), -1);
cv::Point textCorner = cv::Point(tl.x, tl.y + fontCoeff * 0.9);
cv::putText(image, caption, textCorner, FONT_HERSHEY_SIMPLEX, 0.4, cv::Scalar(255, 0, 0));
}
/** Draw bounding boxes and add captions to the image.
* Box is drawn only if corresponding score is higher than the _threshold_.
*/
void drawBoundingBoxesOnImage(Mat &image,
tensorflow::TTypes<float>::Flat scores,
tensorflow::TTypes<float>::Flat classes,
tensorflow::TTypes<float,3>::Tensor boxes,
map<int, string> labelsMap, double threshold=0.5) {
for (int j = 0; j < scores.size(); j++)
if (scores(j) > threshold)
drawBoundingBoxOnImage(image, boxes(0,j,0), boxes(0,j,1), boxes(0,j,2), boxes(0,j,3), scores(j), labelsMap[classes(j)]);
}
The complete example is here.

Here is a small usage example in python that draws 2 rectangles on the image.png and saves it as outout.png, I believe it should help you:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
import numpy as np
import PIL.Image as pimg
if __name__ == '__main__':
image = tf.convert_to_tensor(np.array(pimg.open('image.png'), np.float), tf.float32)
bbox = tf.convert_to_tensor([[0.1, 0.1, 0.4, 0.4], [0.5, 0.5, 0.6, 0.7]])
with tf.Session() as s:
s.run(tf.global_variables_initializer())
output = s.run(tf.image.draw_bounding_boxes(tf.expand_dims(image, 0), tf.expand_dims(bbox, 0)))
pimg.fromarray(np.uint8(output[0])).save('output.png')
The boxes is an array of arrays of rectangles for each image, where rectangle is defined by four normalized floats [min_y, min_x, max_y, max_x].

I manage to got it working using this.
Too bad you cannot really control how the box should look like nor other informations like the score or the label text.
Status CreateBoxedTensor(Tensor &input_image, Tensor &input_boxes, Tensor *output) {
auto root = Scope::NewRootScope();
// First OP is to convert uint8 image tensor to float for the drawing op to not loose its s*
Input imgin(input_image);
auto cast_op = Cast(root, imgin, DT_FLOAT);
// Next one is the drawing itself
Input boxin(input_boxes);
auto draw_op = DrawBoundingBoxes(root, cast_op, boxin);
// And we convert back to uint8 because it's RGB after all that we want
auto cast_back_op = Cast(root, draw_op, DT_UINT8);
ClientSession session(root);
std::vector<Tensor> out_tensors;
TF_RETURN_IF_ERROR(session.Run({cast_back_op}, &out_tensors));
*output = out_tensors[0];
return Status::OK();
}

Related

My C++ code is not detecting objects correctly yolov5

I have a yolov5 onnx file where I trained apples and bananas. I was using python until today, but I decided to switch to c++ to gain some speed. I get correct results when I use yolov5's own onnx files and image in the code I added below. But when I add my own onnx file and my test image it gives me wrong result. You can also find the attached image. What is the problem here?
// Include Libraries.
\#include \<opencv2/opencv.hpp\>
\#include \<fstream\>
// Namespaces.
using namespace cv;
using namespace std;
using namespace cv::dnn;
// Constants.
const float INPUT_WIDTH = 640.0;
const float INPUT_HEIGHT = 640.0;
const float SCORE_THRESHOLD = 0.3;
const float NMS_THRESHOLD = 0.4;
const float CONFIDENCE_THRESHOLD = 0.65;
// Text parameters.
const float FONT_SCALE = 0.7;
const int FONT_FACE = FONT_HERSHEY_SIMPLEX;
const int THICKNESS = 1;
// Colors.
Scalar BLACK = Scalar(0,0,0);
Scalar BLUE = Scalar(255, 178, 50);
Scalar YELLOW = Scalar(0, 255, 255);
Scalar RED = Scalar(0,0,255);
// Draw the predicted bounding box.
void draw_label(Mat& input_image, string label, int left, int top)
{
// Display the label at the top of the bounding box.
int baseLine;
Size label_size = getTextSize(label, FONT_FACE, FONT_SCALE, THICKNESS, &baseLine);
top = max(top, label_size.height);
// Top left corner.
Point tlc = Point(left, top);
// Bottom right corner.
Point brc = Point(left + label_size.width, top + label_size.height + baseLine);
// Draw black rectangle.
rectangle(input_image, tlc, brc, BLACK, FILLED);
// Put the label on the black rectangle.
putText(input_image, label, Point(left, top + label_size.height), FONT_FACE, FONT_SCALE, YELLOW, THICKNESS);
}
vector\<Mat\> pre_process(Mat &input_image, Net &net)
{
// Convert to blob.
Mat blob;
blobFromImage(input_image, blob, 1./255., Size(INPUT_WIDTH, INPUT_HEIGHT), Scalar(), true, false);
net.setInput(blob);
// Forward propagate.
vector<Mat> outputs;
net.forward(outputs, net.getUnconnectedOutLayersNames());
return outputs;
}
Mat post_process(Mat &input_image, vector\<Mat\> &outputs, const vector\<string\> &class_name)
{
// Initialize vectors to hold respective outputs while unwrapping detections.
vector\<int\> class_ids;
vector\<float\> confidences;
vector\<Rect\> boxes;
// Resizing factor.
float x_factor = input_image.cols / INPUT_WIDTH;
float y_factor = input_image.rows / INPUT_HEIGHT;
float *data = (float *)outputs[0].data;
const int dimensions = 85;
const int rows = 25200;
// Iterate through 25200 detections.
for (int i = 0; i < rows; ++i)
{
float confidence = data[4];
// Discard bad detections and continue.
if (confidence >= CONFIDENCE_THRESHOLD)
{
float * classes_scores = data + 5;
// Create a 1x85 Mat and store class scores of 80 classes.
Mat scores(1, class_name.size(), CV_32FC1, classes_scores);
// Perform minMaxLoc and acquire index of best class score.
Point class_id;
double max_class_score;
minMaxLoc(scores, 0, &max_class_score, 0, &class_id);
// Continue if the class score is above the threshold.
if (max_class_score > SCORE_THRESHOLD)
{
// Store class ID and confidence in the pre-defined respective vectors.
confidences.push_back(confidence);
class_ids.push_back(class_id.x);
// Center.
float cx = data[0];
float cy = data[1];
// Box dimension.
float w = data[2];
float h = data[3];
// Bounding box coordinates.
int left = int((cx - 0.5 * w) * x_factor);
int top = int((cy - 0.5 * h) * y_factor);
int width = int(w * x_factor);
int height = int(h * y_factor);
// Store good detections in the boxes vector.
boxes.push_back(Rect(left, top, width, height));
}
}
// Jump to the next column.
data += 85;
}
// Perform Non Maximum Suppression and draw predictions.
vector<int> indices;
NMSBoxes(boxes, confidences, SCORE_THRESHOLD, NMS_THRESHOLD, indices);
for (int i = 0; i < indices.size(); i++)
{
int idx = indices[i];
Rect box = boxes[idx];
int left = box.x;
int top = box.y;
int width = box.width;
int height = box.height;
// Draw bounding box.
rectangle(input_image, Point(left, top), Point(left + width, top + height), BLUE, 3*THICKNESS);
// Get the label for the class name and its confidence.
string label = format("%.2f", confidences[idx]);
label = class_name[class_ids[idx]] + ":" + label;
// Draw class labels.
draw_label(input_image, label, left, top);
//cout<<"The Value is "<<label;
//cout<<endl;
}
return input_image;
}
int main()
{
vector<string> class_list;
ifstream ifs("/Users/admin/Documents/C++/First/obj.names");
string line;
while (getline(ifs, line))
{
class_list.push_back(line);
}
// Load image.
Mat frame;
frame = imread("/Users/admin/Documents/C++/First/test.jpg");
// Load model.
Net net;
net = readNet("/Users/admin/Documents/C++/First/my.onnx");
vector<Mat> detections;
detections = pre_process(frame, net);
Mat img = post_process(frame, detections, class_list);
//Mat img = post_process(frame.clone(), detections, class_list);
// Put efficiency information.
// The function getPerfProfile returns the overall time for inference(t) and the timings for each of the layers(in layersTimes)
vector<double> layersTimes;
double freq = getTickFrequency() / 1000;
double t = net.getPerfProfile(layersTimes) / freq;
string label = format("Inference time : %.2f ms", t);
putText(img, label, Point(20, 40), FONT_FACE, FONT_SCALE, RED);
imshow("Output", img);
waitKey(0);
return 0;
}
The photos I use are 640x480. I played around with the size of the photo, thinking it might be related, but the same problem persisted.

The Yolov5 output format is xyxy as can be seen here:
https://github.com/ultralytics/yolov5/blob/bfa1f23045c7c4136a9b8ced9d6be8249ed72692/detect.py#L161
Not xywh as you are assuming in your code

Calculate text size

Note: This question How to put text into a bounding box in OpenCV? is in some ways similar to this one but it is not the same question. The OP of the questions tried to spread a text to the whole size of his image & the code in the answer that gots the mark is just resizing the text using a mask.
I'm using openCV combined with C++ to do some image detection & manipulation.
So I want to align a text with a unknown length at a specific origin. The font-scale should be calculated because I'd like to specify a width factor for the maximum Text-width like you can see in the image below:
This is the code I got so far:
int fontFace = cv::FONT_HERSHEY_DUPLEX,
fontScale = myTextString.size() / 10;
cv::Size textSize = getTextSize(image, fontFace, fontScale, 0, 0);
putText(image, myTextString, cv::Point( (origin.x + textSize.width)/2, (origin.y + textSize.height)/2 ), fontFace, fontScale, Scalar(255, 0, 0));

Something like this should do it. You can change how the margins are calculated to change the horizontal/vertical alignment of the font.
If the height doesn't matter, you can just leave target.height a large number.
void drawtorect(cv::Mat & mat, cv::Rect target, int face, int thickness, cv::Scalar color, const std::string & str)
{
cv::Size rect = cv::getTextSize(str, face, 1.0, thickness, 0);
double scalex = (double)target.width / (double)rect.width;
double scaley = (double)target.height / (double)rect.height;
double scale = std::min(scalex, scaley);
int marginx = scale == scalex ? 0 : (int)((double)target.width * (scalex - scale) / scalex * 0.5);
int marginy = scale == scaley ? 0 : (int)((double)target.height * (scaley - scale) / scaley * 0.5);
cv::putText(mat, str, cv::Point(target.x + marginx, target.y + target.height - marginy), face, scale, color, thickness, 8, false);
}
* edit *
// Sample code
int L = 80; // width and height per square
int M = 60;
cv::Mat m( 5*M, 7*L,CV_8UC3,cv::Scalar(0,0,0) );
// create checkerboard
for ( int y=0,ymax=m.rows-M;y<=ymax; y+=M)
{
int c = (y/M)%2 == 0 ? 0 : 1;
for ( int x=0,xmax=m.cols-L;x<=xmax;x+=L)
{
if ( (c++)%2!=0 )
continue; // skip odd squares
// convenient way to do this
m( cv::Rect(x,y,L,M) ).setTo( cv::Scalar(64,64,64) );
}
}
// fill checkerboard ROIs by some text
int64 id=1;
for ( int y=0,ymax=m.rows-M;y<=ymax; y+=M)
{
for ( int x=0,xmax=m.cols-L;x<=xmax;x+=L)
{
std::stringstream ss;
ss<<(id<<=1); // some increasing text input
drawtorect( m, cv::Rect(x,y,L,M), cv::FONT_HERSHEY_PLAIN,1,cv::Scalar(255,255,255),ss.str() );
}
}

opencv: how to draw arrows on orientation image

I'm trying to perform orientation estimation on an input image in OpenCV. I used sobel function to get gradients of the image, and used another function called calculateOrientations, which I found on the internet, to calculate orientations.
The code is as follows:
void computeGradient(cv::Mat inputImg)
{
// Gradient X
cv::Sobel(inputImg, grad_x, CV_16S, 1, 0, 5, 1, 0, cv::BORDER_DEFAULT);
cv::convertScaleAbs(grad_x, abs_grad_x);
// Gradient Y
cv::Sobel(inputImg, grad_y, CV_16S, 0, 1, 5, 1, 0, cv::BORDER_DEFAULT);
cv::convertScaleAbs(grad_y, abs_grad_y);
// convert from CV_8U to CV_32F
abs_grad_x.convertTo(abs_grad_x2, CV_32F, 1. / 255);
abs_grad_y.convertTo(abs_grad_y2, CV_32F, 1. / 255);
// calculate orientations
calculateOrientations(abs_grad_x2, abs_grad_y2);
}
void calculateOrientations(cv::Mat gradientX, cv::Mat gradientY)
{
// Create container element
orientation = cv::Mat(gradientX.rows, gradientX.cols, CV_32F);
// Calculate orientations of gradients --> in degrees
// Loop over all matrix values and calculate the accompagnied orientation
for (int i = 0; i < gradientX.rows; i++){
for (int j = 0; j < gradientX.cols; j++){
// Retrieve a single value
float valueX = gradientX.at<float>(i, j);
float valueY = gradientY.at<float>(i, j);
// Calculate the corresponding single direction, done by applying the arctangens function
float result = cv::fastAtan2(valueX, valueY);
// Store in orientation matrix element
orientation.at<float>(i, j) = result;
}
}
}
Now, I need to make sure whether the obtained orientation is correct or not. For that I want to draw arrows for each block of size 5x5 on the orientation matrix. Could someone advice me on how to draw arrows on this? Thank you.

The simplest way for OpenCV to distinguish direction is to draw little circle or square at a start or end point of line. There are no function for arrows afaik. If you need arrow you have to write this (it is simple but takes time too). Once I did it this way (not openCV, but I hope you convert it):
double arrow_pos = 0.5; // 0.5 = at the center of line
double len = sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1));
double co = (x2-x1)/len, si = (y2-y1)/len; // line coordinates are (x1,y1)-(x2,y2)
double const l = 15, sz = linewidth*2; // l - arrow length
double x0 = x2 - len*arrow_pos*co;
double y0 = y2 - len*arrow_pos*si;
double x = x2 - (l+len*arrow_pos)*co;
double y = y2 - (l+len*arrow_pos)*si;
TPoint tp[4] = {TPoint(x+sz*si, y-sz*co), TPoint(x0, y0), TPoint(x-sz*si, y+sz*co), TPoint(x+l*0.3*co, y+0.3*l*si)};
Polygon(tp, 3);
Canvas->Polyline(tp, 2);
UPDATE: arrowedLine(...) function added since OpenCV 2.4.10 and 3.0

The easiest way to draw an arrow in opencv is:
arrowedLine(img, pointStart, pointFinish, colorScalar, thickness, line_type, shift, tipLength);
thickness, line_type, shift and tipLength have already default values, so can be omitted

Rotate an image without cropping in OpenCV in C++

I'd like to rotate an image, but I can't obtain the rotated image without cropping
My original image:
Now I use this code:
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
// Compile with g++ code.cpp -lopencv_core -lopencv_highgui -lopencv_imgproc
int main()
{
cv::Mat src = cv::imread("im.png", CV_LOAD_IMAGE_UNCHANGED);
cv::Mat dst;
cv::Point2f pc(src.cols/2., src.rows/2.);
cv::Mat r = cv::getRotationMatrix2D(pc, -45, 1.0);
cv::warpAffine(src, dst, r, src.size()); // what size I should use?
cv::imwrite("rotated_im.png", dst);
return 0;
}
And obtain the following image:
But I'd like to obtain this:

My answer is inspired by the following posts / blog entries:
Rotate cv::Mat using cv::warpAffine offsets destination image
http://john.freml.in/opencv-rotation
Main ideas:
Adjusting the rotation matrix by adding a translation to the new image center
Using cv::RotatedRect to rely on existing opencv functionality as much as possible
Code tested with opencv 3.4.1:
#include "opencv2/opencv.hpp"
int main()
{
cv::Mat src = cv::imread("im.png", CV_LOAD_IMAGE_UNCHANGED);
double angle = -45;
// get rotation matrix for rotating the image around its center in pixel coordinates
cv::Point2f center((src.cols-1)/2.0, (src.rows-1)/2.0);
cv::Mat rot = cv::getRotationMatrix2D(center, angle, 1.0);
// determine bounding rectangle, center not relevant
cv::Rect2f bbox = cv::RotatedRect(cv::Point2f(), src.size(), angle).boundingRect2f();
// adjust transformation matrix
rot.at<double>(0,2) += bbox.width/2.0 - src.cols/2.0;
rot.at<double>(1,2) += bbox.height/2.0 - src.rows/2.0;
cv::Mat dst;
cv::warpAffine(src, dst, rot, bbox.size());
cv::imwrite("rotated_im.png", dst);
return 0;
}

Just try the code below, the idea is simple:
You need to create a blank image with the maximum size you're expecting while rotating at any angle. Here you should use Pythagoras as mentioned in the above comments.
Now copy the source image to the newly created image and pass it to warpAffine. Here you should use the centre of newly created image for rotation.
After warpAffine if you need to crop exact image for this translate four corners of source image in enlarged image using rotation matrix as described here
Find minimum x and minimum y for top corner, and maximum x and maximum y for bottom corner from the above result to crop image.
This is the code:
int theta = 0;
Mat src,frame, frameRotated;
src = imread("rotate.png",1);
cout<<endl<<endl<<"Press '+' to rotate anti-clockwise and '-' for clockwise 's' to save" <<endl<<endl;
int diagonal = (int)sqrt(src.cols*src.cols+src.rows*src.rows);
int newWidth = diagonal;
int newHeight =diagonal;
int offsetX = (newWidth - src.cols) / 2;
int offsetY = (newHeight - src.rows) / 2;
Mat targetMat(newWidth, newHeight, src.type());
Point2f src_center(targetMat.cols/2.0F, targetMat.rows/2.0F);
while(1){
src.copyTo(frame);
double radians = theta * M_PI / 180.0;
double sin = abs(std::sin(radians));
double cos = abs(std::cos(radians));
frame.copyTo(targetMat.rowRange(offsetY, offsetY + frame.rows).colRange(offsetX, offsetX + frame.cols));
Mat rot_mat = getRotationMatrix2D(src_center, theta, 1.0);
warpAffine(targetMat, frameRotated, rot_mat, targetMat.size());
//Calculate bounding rect and for exact image
//Reference:- https://stackoverflow.com/questions/19830477/find-the-bounding-rectangle-of-rotated-rectangle/19830964?noredirect=1#19830964
Rect bound_Rect(frame.cols,frame.rows,0,0);
int x1 = offsetX;
int x2 = offsetX+frame.cols;
int x3 = offsetX;
int x4 = offsetX+frame.cols;
int y1 = offsetY;
int y2 = offsetY;
int y3 = offsetY+frame.rows;
int y4 = offsetY+frame.rows;
Mat co_Ordinate = (Mat_<double>(3,4) << x1, x2, x3, x4,
y1, y2, y3, y4,
1, 1, 1, 1 );
Mat RotCo_Ordinate = rot_mat * co_Ordinate;
for(int i=0;i<4;i++){
if(RotCo_Ordinate.at<double>(0,i)<bound_Rect.x)
bound_Rect.x=(int)RotCo_Ordinate.at<double>(0,i); //access smallest
if(RotCo_Ordinate.at<double>(1,i)<bound_Rect.y)
bound_Rect.y=RotCo_Ordinate.at<double>(1,i); //access smallest y
}
for(int i=0;i<4;i++){
if(RotCo_Ordinate.at<double>(0,i)>bound_Rect.width)
bound_Rect.width=(int)RotCo_Ordinate.at<double>(0,i); //access largest x
if(RotCo_Ordinate.at<double>(1,i)>bound_Rect.height)
bound_Rect.height=RotCo_Ordinate.at<double>(1,i); //access largest y
}
bound_Rect.width=bound_Rect.width-bound_Rect.x;
bound_Rect.height=bound_Rect.height-bound_Rect.y;
Mat cropedResult;
Mat ROI = frameRotated(bound_Rect);
ROI.copyTo(cropedResult);
imshow("Result", cropedResult);
imshow("frame", frame);
imshow("rotated frame", frameRotated);
char k=waitKey();
if(k=='+') theta+=10;
if(k=='-') theta-=10;
if(k=='s') imwrite("rotated.jpg",cropedResult);
if(k==27) break;
}
Cropped Image

Thanks Robula!
Actually, you do not need to compute sine and cosine twice.
import cv2
def rotate_image(mat, angle):
# angle in degrees
height, width = mat.shape[:2]
image_center = (width/2, height/2)
rotation_mat = cv2.getRotationMatrix2D(image_center, angle, 1.)
abs_cos = abs(rotation_mat[0,0])
abs_sin = abs(rotation_mat[0,1])
bound_w = int(height * abs_sin + width * abs_cos)
bound_h = int(height * abs_cos + width * abs_sin)
rotation_mat[0, 2] += bound_w/2 - image_center[0]
rotation_mat[1, 2] += bound_h/2 - image_center[1]
rotated_mat = cv2.warpAffine(mat, rotation_mat, (bound_w, bound_h))
return rotated_mat

Thanks #Haris! Here's the Python version:
def rotate_image(image, angle):
'''Rotate image "angle" degrees.
How it works:
- Creates a blank image that fits any rotation of the image. To achieve
this, set the height and width to be the image's diagonal.
- Copy the original image to the center of this blank image
- Rotate using warpAffine, using the newly created image's center
(the enlarged blank image center)
- Translate the four corners of the source image in the enlarged image
using homogenous multiplication of the rotation matrix.
- Crop the image according to these transformed corners
'''
diagonal = int(math.sqrt(pow(image.shape[0], 2) + pow(image.shape[1], 2)))
offset_x = (diagonal - image.shape[0])/2
offset_y = (diagonal - image.shape[1])/2
dst_image = np.zeros((diagonal, diagonal, 3), dtype='uint8')
image_center = (diagonal/2, diagonal/2)
R = cv2.getRotationMatrix2D(image_center, angle, 1.0)
dst_image[offset_x:(offset_x + image.shape[0]), \
offset_y:(offset_y + image.shape[1]), \
:] = image
dst_image = cv2.warpAffine(dst_image, R, (diagonal, diagonal), flags=cv2.INTER_LINEAR)
# Calculate the rotated bounding rect
x0 = offset_x
x1 = offset_x + image.shape[0]
x2 = offset_x
x3 = offset_x + image.shape[0]
y0 = offset_y
y1 = offset_y
y2 = offset_y + image.shape[1]
y3 = offset_y + image.shape[1]
corners = np.zeros((3,4))
corners[0,0] = x0
corners[0,1] = x1
corners[0,2] = x2
corners[0,3] = x3
corners[1,0] = y0
corners[1,1] = y1
corners[1,2] = y2
corners[1,3] = y3
corners[2:] = 1
c = np.dot(R, corners)
x = int(c[0,0])
y = int(c[1,0])
left = x
right = x
up = y
down = y
for i in range(4):
x = int(c[0,i])
y = int(c[1,i])
if (x < left): left = x
if (x > right): right = x
if (y < up): up = y
if (y > down): down = y
h = down - up
w = right - left
cropped = np.zeros((w, h, 3), dtype='uint8')
cropped[:, :, :] = dst_image[left:(left+w), up:(up+h), :]
return cropped

Increase the image canvas (equally from the center without changing the image size) so that it can fit the image after rotation, then apply warpAffine:
Mat img = imread ("/path/to/image", 1);
double offsetX, offsetY;
double angle = -45;
double width = img.size().width;
double height = img.size().height;
Point2d center = Point2d (width / 2, height / 2);
Rect bounds = RotatedRect (center, img.size(), angle).boundingRect();
Mat resized = Mat::zeros (bounds.size(), img.type());
offsetX = (bounds.width - width) / 2;
offsetY = (bounds.height - height) / 2;
Rect roi = Rect (offsetX, offsetY, width, height);
img.copyTo (resized (roi));
center += Point2d (offsetX, offsetY);
Mat M = getRotationMatrix2D (center, angle, 1.0);
warpAffine (resized, resized, M, resized.size());

After searching around for a clean and easy to understand solution and reading through the answers above trying to understand them, I eventually came up with a solution using trigonometry.
I hope this helps somebody :)
import cv2
import math
def rotate_image(mat, angle):
height, width = mat.shape[:2]
image_center = (width / 2, height / 2)
rotation_mat = cv2.getRotationMatrix2D(image_center, angle, 1)
radians = math.radians(angle)
sin = math.sin(radians)
cos = math.cos(radians)
bound_w = int((height * abs(sin)) + (width * abs(cos)))
bound_h = int((height * abs(cos)) + (width * abs(sin)))
rotation_mat[0, 2] += ((bound_w / 2) - image_center[0])
rotation_mat[1, 2] += ((bound_h / 2) - image_center[1])
rotated_mat = cv2.warpAffine(mat, rotation_mat, (bound_w, bound_h))
return rotated_mat
EDIT: Please refer to #Remi Cuingnet's answer below.

A python version of rotating an image and take control of the padded black coloured region you can use the scipy.ndimage.rotate. Here is an example:
from skimage import io
from scipy import ndimage
image = io.imread('https://www.pyimagesearch.com/wp-
content/uploads/2019/12/tensorflow2_install_ubuntu_header.jpg')
io.imshow(image)
plt.show()
rotated = ndimage.rotate(image, angle=234, mode='nearest')
rotated = cv2.resize(rotated, (image.shape[:2]))
# rotated = cv2.cvtColor(rotated, cv2.COLOR_BGR2RGB)
# cv2.imwrite('rotated.jpg', rotated)
io.imshow(rotated)
plt.show()

If you have a rotation and a scaling of the image:
#include "opencv2/opencv.hpp"
#include <functional>
#include <vector>
bool compareCoords(cv::Point2f p1, cv::Point2f p2, char coord)
{
assert(coord == 'x' || coord == 'y');
if (coord == 'x')
return p1.x < p2.x;
return p1.y < p2.y;
}
int main(int argc, char** argv)
{
cv::Mat image = cv::imread("lenna.png");
float angle = 45.0; // degrees
float scale = 0.5;
cv::Mat_<float> rot_mat = cv::getRotationMatrix2D( cv::Point2f( 0.0f, 0.0f ), angle, scale );
// Image corners
cv::Point2f pA = cv::Point2f(0.0f, 0.0f);
cv::Point2f pB = cv::Point2f(image.cols, 0.0f);
cv::Point2f pC = cv::Point2f(image.cols, image.rows);
cv::Point2f pD = cv::Point2f(0.0f, image.rows);
std::vector<cv::Point2f> pts = { pA, pB, pC, pD };
std::vector<cv::Point2f> ptsTransf;
cv::transform(pts, ptsTransf, rot_mat );
using namespace std::placeholders;
float minX = std::min_element(ptsTransf.begin(), ptsTransf.end(), std::bind(compareCoords, _1, _2, 'x'))->x;
float maxX = std::max_element(ptsTransf.begin(), ptsTransf.end(), std::bind(compareCoords, _1, _2, 'x'))->x;
float minY = std::min_element(ptsTransf.begin(), ptsTransf.end(), std::bind(compareCoords, _1, _2, 'y'))->y;
float maxY = std::max_element(ptsTransf.begin(), ptsTransf.end(), std::bind(compareCoords, _1, _2, 'y'))->y;
float newW = maxX - minX;
float newH = maxY - minY;
cv::Mat_<float> trans_mat = (cv::Mat_<float>(2,3) << 0, 0, -minX, 0, 0, -minY);
cv::Mat_<float> M = rot_mat + trans_mat;
cv::Mat warpedImage;
cv::warpAffine( image, warpedImage, M, cv::Size(newW, newH) );
cv::imshow("lenna", image);
cv::imshow("Warped lenna", warpedImage);
cv::waitKey();
cv::destroyAllWindows();
return 0;
}

Thanks to everyone for this post, it has been super useful. However, I have found some black lines left and up (using Rose's python version) when rotating 90º. The problem seemed to be some int() roundings. In addition to that, I have changed the sign of the angle to make it grow clockwise.
def rotate_image(image, angle):
'''Rotate image "angle" degrees.
How it works:
- Creates a blank image that fits any rotation of the image. To achieve
this, set the height and width to be the image's diagonal.
- Copy the original image to the center of this blank image
- Rotate using warpAffine, using the newly created image's center
(the enlarged blank image center)
- Translate the four corners of the source image in the enlarged image
using homogenous multiplication of the rotation matrix.
- Crop the image according to these transformed corners
'''
diagonal = int(math.ceil(math.sqrt(pow(image.shape[0], 2) + pow(image.shape[1], 2))))
offset_x = (diagonal - image.shape[0])/2
offset_y = (diagonal - image.shape[1])/2
dst_image = np.zeros((diagonal, diagonal, 3), dtype='uint8')
image_center = (float(diagonal-1)/2, float(diagonal-1)/2)
R = cv2.getRotationMatrix2D(image_center, -angle, 1.0)
dst_image[offset_x:(offset_x + image.shape[0]), offset_y:(offset_y + image.shape[1]), :] = image
dst_image = cv2.warpAffine(dst_image, R, (diagonal, diagonal), flags=cv2.INTER_LINEAR)
# Calculate the rotated bounding rect
x0 = offset_x
x1 = offset_x + image.shape[0]
x2 = offset_x + image.shape[0]
x3 = offset_x
y0 = offset_y
y1 = offset_y
y2 = offset_y + image.shape[1]
y3 = offset_y + image.shape[1]
corners = np.zeros((3,4))
corners[0,0] = x0
corners[0,1] = x1
corners[0,2] = x2
corners[0,3] = x3
corners[1,0] = y0
corners[1,1] = y1
corners[1,2] = y2
corners[1,3] = y3
corners[2:] = 1
c = np.dot(R, corners)
x = int(round(c[0,0]))
y = int(round(c[1,0]))
left = x
right = x
up = y
down = y
for i in range(4):
x = c[0,i]
y = c[1,i]
if (x < left): left = x
if (x > right): right = x
if (y < up): up = y
if (y > down): down = y
h = int(round(down - up))
w = int(round(right - left))
left = int(round(left))
up = int(round(up))
cropped = np.zeros((w, h, 3), dtype='uint8')
cropped[:, :, :] = dst_image[left:(left+w), up:(up+h), :]
return cropped

Go version (using gocv) of #robula and #remi-cuingnet
func rotateImage(mat *gocv.Mat, angle float64) *gocv.Mat {
height := mat.Rows()
width := mat.Cols()
imgCenter := image.Point{X: width/2, Y: height/2}
rotationMat := gocv.GetRotationMatrix2D(imgCenter, -angle, 1.0)
absCos := math.Abs(rotationMat.GetDoubleAt(0, 0))
absSin := math.Abs(rotationMat.GetDoubleAt(0, 1))
boundW := float64(height) * absSin + float64(width) * absCos
boundH := float64(height) * absCos + float64(width) * absSin
rotationMat.SetDoubleAt(0, 2, rotationMat.GetDoubleAt(0, 2) + (boundW / 2) - float64(imgCenter.X))
rotationMat.SetDoubleAt(1, 2, rotationMat.GetDoubleAt(1, 2) + (boundH / 2) - float64(imgCenter.Y))
gocv.WarpAffine(*mat, mat, rotationMat, image.Point{ X: int(boundW), Y: int(boundH) })
return mat
}
I rotate in the same matrice in-memory, make a new matrice if you don't want to alter it

For anyone using Emgu.CV or OpenCvSharp wrapper in .NET, there's a C# implement of Lars Schillingmann's answer:
Emgu.CV:
using Emgu.CV;
using Emgu.CV.CvEnum;
using Emgu.CV.Structure;
public static class MatExtension
{
/// <summary>
/// <see>https://stackoverflow.com/questions/22041699/rotate-an-image-without-cropping-in-opencv-in-c/75451191#75451191</see>
/// </summary>
public static Mat Rotate(this Mat src, float degrees)
{
degrees = -degrees; // counter-clockwise to clockwise
var center = new PointF((src.Width - 1) / 2f, (src.Height - 1) / 2f);
var rotationMat = new Mat();
CvInvoke.GetRotationMatrix2D(center, degrees, 1, rotationMat);
var boundingRect = new RotatedRect(new(), src.Size, degrees).MinAreaRect();
rotationMat.Set(0, 2, rotationMat.Get<double>(0, 2) + (boundingRect.Width / 2f) - (src.Width / 2f));
rotationMat.Set(1, 2, rotationMat.Get<double>(1, 2) + (boundingRect.Height / 2f) - (src.Height / 2f));
var rotatedSrc = new Mat();
CvInvoke.WarpAffine(src, rotatedSrc, rotationMat, boundingRect.Size);
return rotatedSrc;
}
/// <summary>
/// <see>https://stackoverflow.com/questions/32255440/how-can-i-get-and-set-pixel-values-of-an-emgucv-mat-image/69537504#69537504</see>
/// </summary>
public static unsafe void Set<T>(this Mat mat, int row, int col, T value) where T : struct =>
_ = new Span<T>(mat.DataPointer.ToPointer(), mat.Rows * mat.Cols * mat.ElementSize)
{
[(row * mat.Cols) + col] = value
};
public static unsafe T Get<T>(this Mat mat, int row, int col) where T : struct =>
new ReadOnlySpan<T>(mat.DataPointer.ToPointer(), mat.Rows * mat.Cols * mat.ElementSize)
[(row * mat.Cols) + col];
}
OpenCvSharp:
OpenCvSharp already has Mat.Set<> method that functions same as mat.at<> in the original OpenCV, so we don't have to copy these methods from How can I get and set pixel values of an EmguCV Mat image?
using OpenCvSharp;
public static class MatExtension
{
/// <summary>
/// <see>https://stackoverflow.com/questions/22041699/rotate-an-image-without-cropping-in-opencv-in-c/75451191#75451191</see>
/// </summary>
public static Mat Rotate(this Mat src, float degrees)
{
degrees = -degrees; // counter-clockwise to clockwise
var center = new Point2f((src.Width - 1) / 2f, (src.Height - 1) / 2f);
var rotationMat = Cv2.GetRotationMatrix2D(center, degrees, 1);
var boundingRect = new RotatedRect(new(), new Size2f(src.Width, src.Height), degrees).BoundingRect();
rotationMat.Set(0, 2, rotationMat.Get<double>(0, 2) + (boundingRect.Width / 2f) - (src.Width / 2f));
rotationMat.Set(1, 2, rotationMat.Get<double>(1, 2) + (boundingRect.Height / 2f) - (src.Height / 2f));
var rotatedSrc = new Mat();
Cv2.WarpAffine(src, rotatedSrc, rotationMat, boundingRect.Size);
return rotatedSrc;
}
}
Also, you may want to mutate the src param instead of returning a new clone of it during rotation, for that you can just set the det param of WrapAffine() as the same with src: c++, opencv: Is it safe to use the same Mat for both source and destination images in filtering operation?
CvInvoke.WarpAffine(src, src, rotationMat, boundingRect.Size);
This is being called as in-place mode: https://answers.opencv.org/question/24/do-all-opencv-functions-support-in-place-mode-for-their-arguments/
Can the OpenCV function cvtColor be used to convert a matrix in place?

If it is just to rotate 90 degrees, maybe this code could be useful.
Mat img = imread("images.jpg");
Mat rt(img.rows, img.rows, CV_8U);
Point2f pc(img.cols / 2.0, img.rows / 2.0);
Mat r = getRotationMatrix2D(pc, 90, 1);
warpAffine(img, rt, r, rt.size());
imshow("rotated", rt);
Hope it's useful.

By the way, for 90º rotations only, here is a more efficient + accurate function:
def rotate_image_90(image, angle):
angle = -angle
rotated_image = image
if angle == 0:
pass
elif angle == 90:
rotated_image = np.rot90(rotated_image)
elif angle == 180 or angle == -180:
rotated_image = np.rot90(rotated_image)
rotated_image = np.rot90(rotated_image)
elif angle == -90:
rotated_image = np.rot90(rotated_image)
rotated_image = np.rot90(rotated_image)
rotated_image = np.rot90(rotated_image)
return rotated_image

Can normal maps be generated from a texture?

If I have a texture, is it then possible to generate a normal-map for this texture, so it can be used for bump-mapping?
Or how are normal maps usually made?

Yes. Well, sort of. Normal maps can be accurately made from height-maps. Generally, you can also put a regular texture through and get decent results as well. Keep in mind there are other methods of making a normal map, such as taking a high-resolution model, making it low resolution, then doing ray casting to see what the normal should be for the low-resolution model to simulate the higher one.
For height-map to normal-map, you can use the Sobel Operator. This operator can be run in the x-direction, telling you the x-component of the normal, and then the y-direction, telling you the y-component. You can calculate z with 1.0 / strength where strength is the emphasis or "deepness" of the normal map. Then, take that x, y, and z, throw them into a vector, normalize it, and you have your normal at that point. Encode it into the pixel and you're done.
Here's some older incomplete-code that demonstrates this:
// pretend types, something like this
struct pixel
{
uint8_t red;
uint8_t green;
uint8_t blue;
};
struct vector3d; // a 3-vector with doubles
struct texture; // a 2d array of pixels
// determine intensity of pixel, from 0 - 1
const double intensity(const pixel& pPixel)
{
const double r = static_cast<double>(pPixel.red);
const double g = static_cast<double>(pPixel.green);
const double b = static_cast<double>(pPixel.blue);
const double average = (r + g + b) / 3.0;
return average / 255.0;
}
const int clamp(int pX, int pMax)
{
if (pX > pMax)
{
return pMax;
}
else if (pX < 0)
{
return 0;
}
else
{
return pX;
}
}
// transform -1 - 1 to 0 - 255
const uint8_t map_component(double pX)
{
return (pX + 1.0) * (255.0 / 2.0);
}
texture normal_from_height(const texture& pTexture, double pStrength = 2.0)
{
// assume square texture, not necessarily true in real code
texture result(pTexture.size(), pTexture.size());
const int textureSize = static_cast<int>(pTexture.size());
for (size_t row = 0; row < textureSize; ++row)
{
for (size_t column = 0; column < textureSize; ++column)
{
// surrounding pixels
const pixel topLeft = pTexture(clamp(row - 1, textureSize), clamp(column - 1, textureSize));
const pixel top = pTexture(clamp(row - 1, textureSize), clamp(column, textureSize));
const pixel topRight = pTexture(clamp(row - 1, textureSize), clamp(column + 1, textureSize));
const pixel right = pTexture(clamp(row, textureSize), clamp(column + 1, textureSize));
const pixel bottomRight = pTexture(clamp(row + 1, textureSize), clamp(column + 1, textureSize));
const pixel bottom = pTexture(clamp(row + 1, textureSize), clamp(column, textureSize));
const pixel bottomLeft = pTexture(clamp(row + 1, textureSize), clamp(column - 1, textureSize));
const pixel left = pTexture(clamp(row, textureSize), clamp(column - 1, textureSize));
// their intensities
const double tl = intensity(topLeft);
const double t = intensity(top);
const double tr = intensity(topRight);
const double r = intensity(right);
const double br = intensity(bottomRight);
const double b = intensity(bottom);
const double bl = intensity(bottomLeft);
const double l = intensity(left);
// sobel filter
const double dX = (tr + 2.0 * r + br) - (tl + 2.0 * l + bl);
const double dY = (bl + 2.0 * b + br) - (tl + 2.0 * t + tr);
const double dZ = 1.0 / pStrength;
math::vector3d v(dX, dY, dZ);
v.normalize();
// convert to rgb
result(row, column) = pixel(map_component(v.x), map_component(v.y), map_component(v.z));
}
}
return result;
}

There's probably many ways to generate a Normal map, but like others said, you can do it from a Height Map, and 3d packages like XSI/3dsmax/Blender/any of them can output one for you as an image.
You can then output and RGB image with the Nvidia plugin for photoshop, an algorithm to convert it or you might be able to output it directly from those 3d packages with 3rd party plugins.
Be aware that in some case, you might need to invert channels (R, G or B) from the generated normal map.
Here's some resources link with examples and more complete explanation:
http://developer.nvidia.com/object/photoshop_dds_plugins.html
http://en.wikipedia.org/wiki/Normal_mapping
http://www.vrgeo.org/fileadmin/VRGeo/Bilder/VRGeo_Papers/jgt2002normalmaps.pdf

I don't think normal maps are generated from a texture. they are generated from a model.
just as texturing allows you to define complex colour detail with minimal polys (as opposed to just using millions of ploys and just vertex colours to define the colour on your mesh)
A normal map allows you to define complex normal detail with minimal polys.
I believe normal maps are usually generated from a higher res mesh, and then is used with a low res mesh.
I'm sure 3D tools, such as 3ds max or maya, as well as more specific tools will do this for you. unlike textures, I don't think they are usually done by hand.
but they are generated from the mesh, not the texture.

I suggest starting with OpenCV, due to its richness in algorithms. Here's one I wrote that iteratively blurs the normal map and weights those to the overall value, essentially creating more of a topological map.
#define ROW_PTR(img, y) ((uchar*)((img).data + (img).step * y))
cv::Mat normalMap(const cv::Mat& bwTexture, double pStrength)
{
// assume square texture, not necessarily true in real code
int scale = 1.0;
int delta = 127;
cv::Mat sobelZ, sobelX, sobelY;
cv::Sobel(bwTexture, sobelX, CV_8U, 1, 0, 13, scale, delta, cv::BORDER_DEFAULT);
cv::Sobel(bwTexture, sobelY, CV_8U, 0, 1, 13, scale, delta, cv::BORDER_DEFAULT);
sobelZ = cv::Mat(bwTexture.rows, bwTexture.cols, CV_8UC1);
for(int y=0; y<bwTexture.rows; y++) {
const uchar *sobelXPtr = ROW_PTR(sobelX, y);
const uchar *sobelYPtr = ROW_PTR(sobelY, y);
uchar *sobelZPtr = ROW_PTR(sobelZ, y);
for(int x=0; x<bwTexture.cols; x++) {
double Gx = double(sobelXPtr[x]) / 255.0;
double Gy = double(sobelYPtr[x]) / 255.0;
double Gz = pStrength * sqrt(Gx * Gx + Gy * Gy);
uchar value = uchar(Gz * 255.0);
sobelZPtr[x] = value;
}
}
std::vector<cv::Mat>planes;
planes.push_back(sobelX);
planes.push_back(sobelY);
planes.push_back(sobelZ);
cv::Mat normalMap;
cv::merge(planes, normalMap);
cv::Mat originalNormalMap = normalMap.clone();
cv::Mat normalMapBlurred;
for (int i=0; i<3; i++) {
cv::GaussianBlur(normalMap, normalMapBlurred, cv::Size(13, 13), 5, 5);
addWeighted(normalMap, 0.4, normalMapBlurred, 0.6, 0, normalMap);
}
addWeighted(originalNormalMap, 0.3, normalMapBlurred, 0.7, 0, normalMap);
return normalMap;
}

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Tensorflow DrawBoundingBoxes C++ - c++

Related

My C++ code is not detecting objects correctly yolov5

Calculate text size

opencv: how to draw arrows on orientation image

Rotate an image without cropping in OpenCV in C++

Can normal maps be generated from a texture?

Categories

Resources