I'm using OpenCV to extract a subimage of a scanned document and would like to use tesseract to perform OCR over this subimage.
I found out that I can use two methods for text recognition in tesseract, but so far I wasn't able to find a working solution.
A.) How can I convert a cv::Mat into a PIX*?
(PIX* is a datatype of leptonica)
Based on vasile's code below, this is essentially my current code:
cv::Mat image = cv::imread("c:/image.png");
cv::Mat subImage = image(cv::Rect(50, 200, 300, 100));
int depth;
if(subImage.depth() == CV_8U)
depth = 8;
//other cases not considered yet
PIX* pix = pixCreateHeader(subImage.size().width, subImage.size().height, depth);
pix->data = (l_uint32*) subImage.data;
tesseract::TessBaseAPI tess;
STRING text;
if(tess.ProcessPage(pix, 0, 0, &text))
{
std::cout << text.string();
}
While it doesn't crash, the OCR result is still wrong. It should recognize the one word in my sample image, but instead it returns unreadable characters.
The method PIX_HEADER doesn't exist, so I used pixCreateHeader, but it doesn't take the number of channels as an argument. So how can I set the number of channels?
B.) How can I use cv::Mat for TesseractRect() ?
Tesseract offers another method for text recognition with this signature:
char * TessBaseAPI::TesseractRect (
const UINT8 * imagedata,
int bytes_per_pixel,
int bytes_per_line,
int left,
int top,
int width,
int height
)
Currently I am using the following code, but it also returns unreadable characters (although different ones than the code above):
char* cr = tess.TesseractRect(
subImage.data,
subImage.channels(),
subImage.channels() * subImage.size().width,
0,
0,
subImage.size().width,
subImage.size().height);
tesseract::TessBaseAPI tess;
cv::Mat sub = image(cv::Rect(50, 200, 300, 100));
tess.SetImage((uchar*)sub.data, sub.size().width, sub.size().height, sub.channels(), sub.step1());
tess.Recognize(0);
const char* out = tess.GetUTF8Text();
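The row stride passed as the last argument matters because a cv::Mat sub-image is only a view into the parent image: consecutive rows of the view are separated by the parent's row length in bytes, not by `channels * width` of the view. The following is a minimal sketch of that addressing using a plain byte buffer and no OpenCV dependency; `pixelAt` and `makeParentImage` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read the first byte of a pixel inside a rectangular view of a larger image.
// data:          bytes of the parent image, row after row
// bytesPerLine:  distance in bytes between the starts of two parent rows
// bytesPerPixel: bytes per pixel (1 for 8-bit grayscale)
// left, top:     origin of the view inside the parent image
// x, y:          coordinates inside the view
inline std::uint8_t pixelAt(const std::vector<std::uint8_t>& data,
                            std::size_t bytesPerLine, std::size_t bytesPerPixel,
                            std::size_t left, std::size_t top,
                            std::size_t x, std::size_t y)
{
    const std::size_t offset =
        (top + y) * bytesPerLine + (left + x) * bytesPerPixel;
    assert(offset < data.size());
    return data[offset];
}

// Build a 6x4 single-channel test image whose value encodes its position
// as 10 * row + column.
inline std::vector<std::uint8_t> makeParentImage()
{
    const std::size_t width = 6, height = 4;
    std::vector<std::uint8_t> img(width * height);
    for (std::size_t r = 0; r < height; ++r)
        for (std::size_t c = 0; c < width; ++c)
            img[r * width + c] = static_cast<std::uint8_t>(10 * r + c);
    return img;
}
```

With the parent stride of 6, `pixelAt(img, 6, 1, 2, 1, 1, 1)` reads parent row 2, column 3. Passing the view width 3 as the stride instead lands on a different pixel, which is the kind of skew that produces garbage in the OCR input.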
For anybody using the JavaCPP presets of OpenCV/Tesseract, here is what works:
Mat img = imread("file.jpg");
Mat gray = new Mat();
cvtColor(img, gray, CV_BGR2GRAY);
// api is a Tesseract client which is initialised
api.SetImage(gray.data().asBuffer(),gray.size().width(),gray.size().height(),gray.channels(),gray.size1());
cv::Mat image = cv::imread(argv[1]);
cv::Mat gray;
cv::cvtColor(image, gray, CV_BGR2GRAY);
PIX *pixS = pixCreate(gray.size().width, gray.size().height, 8);
for(int i=0; i<gray.rows; i++)
for(int j=0; j<gray.cols; j++)
pixSetPixel(pixS, j,i, (l_uint32) gray.at<uchar>(i,j));
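Copying pixel by pixel works because pixSetPixel hides how Leptonica lays out its data. To my understanding, Leptonica stores each row as 32-bit words, pads every row to a 32-bit boundary, and packs pixels from the most significant byte to the least significant, so pointing `pix->data` straight at 8-bit OpenCV data does not give the same image on a little-endian machine. The sketch below illustrates that layout in plain C++; `wordsPerLine` and `packRow8` are hypothetical helpers, not Leptonica functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of 32-bit words needed for one row of `width` pixels of `depth`
// bits each, rounded up so that every row starts on a word boundary.
inline std::size_t wordsPerLine(std::size_t width, std::size_t depth)
{
    return (width * depth + 31) / 32;
}

// Pack one row of 8-bit pixels into 32-bit words, first pixel in the most
// significant byte. Unused bytes of the last word stay zero.
inline std::vector<std::uint32_t> packRow8(const std::vector<std::uint8_t>& row)
{
    std::vector<std::uint32_t> words(wordsPerLine(row.size(), 8), 0u);
    for (std::size_t x = 0; x < row.size(); ++x) {
        const unsigned shift = 8u * (3u - static_cast<unsigned>(x % 4));
        words[x / 4] |= static_cast<std::uint32_t>(row[x]) << shift;
    }
    return words;
}
```

For a 5-pixel row this yields two words, with the fifth pixel in the top byte of the second word and three padding bytes after it.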
First, make a deep copy of your subImage so that it is stored in a contiguous memory block:
cv::Mat subImage = image(cv::Rect(50, 200, 300, 100)).clone();
Then, initialize a PIX header (I don't know how) with the correct parameters.
// ???? Put your own constructor here.
PIX* pix = new PIX_HEADER(width, height, channels, depth);
OR, create it manually:
PIX pix;
pix.width = subImage.width;
...
Then set the pix data pointer to the subImage data pointer
pix.data = subImage.data;
Finally, make sure your subImage object does not go out of scope before you finish your work with pix.
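What the deep copy achieves is that the rows of the sub-image end up stored back to back instead of being separated by the parent's row length. A minimal sketch of that copy in plain C++, without OpenCV, assuming a row-major byte buffer; `copyRoiContiguous` is a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy a rectangular region out of a strided parent buffer so that the rows
// of the result are stored back to back (row length = width * bytesPerPixel).
inline std::vector<std::uint8_t> copyRoiContiguous(
    const std::vector<std::uint8_t>& parent, std::size_t parentBytesPerLine,
    std::size_t bytesPerPixel,
    std::size_t left, std::size_t top, std::size_t width, std::size_t height)
{
    const std::size_t rowBytes = width * bytesPerPixel;
    // The region must lie inside the parent image.
    assert(left * bytesPerPixel + rowBytes <= parentBytesPerLine);
    assert((top + height) * parentBytesPerLine <= parent.size());

    std::vector<std::uint8_t> out(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t* src = parent.data()
            + (top + y) * parentBytesPerLine + left * bytesPerPixel;
        std::memcpy(out.data() + y * rowBytes, src, rowBytes);
    }
    return out;
}
```

The returned vector owns its data, so it stays valid independently of the parent buffer for as long as the vector itself is alive.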
Related
I have some code that takes multiple images, aligns them, and stacks them together. For some reason, the alignment is off. A simplified version of the code is below:
void stackImages(uint8_t **pixels, uint32_t width, uint32_t height, size_t len)
{
cv::Mat firstImg;
cv::Mat stacked;
for (int i = 0; i < len; i++)
{
// Transformation matrix
cv::Mat1f M = cv::Mat1f(cv::Mat::eye(3, 3, CV_8UC1));
// Convert pixels (4 channel RGBA) to Mat
// (named frame so it does not shadow the pixels parameter)
cv::Mat frame = cv::Mat(height, width, CV_8UC4, pixels[i]);
cv::Mat gray;
cv::cvtColor(frame, gray, cv::COLOR_RGBA2GRAY);
// skip the reference image
if(!i) {
firstImg = gray;
stacked = gray;
continue;
}
cv::Mat warped;
// create size struct
cv::Size size;
size.width = width;
size.height = height;
// create the transformation matrix
cv::findTransformECC(firstImg, gray, M, cv::MOTION_HOMOGRAPHY);
// warp the image according to the transformation matrix
cv::warpPerspective(gray, warped, M, size);
// stack the image
stacked += warped;
}
// write the image
cv::imwrite("stacked.jpg", stacked);
}
I've tested this code with three images taken in rapid succession and the results are below. This is my first foray into image processing, so I'm mostly following online documentation.
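Separately from the alignment itself, two properties of the stacking step may be worth checking. Assigning one cv::Mat to another shares the pixel data, so `firstImg` and `stacked` refer to the same buffer unless one of them is cloned. And adding 8-bit images saturates at 255, so bright areas clip after a few frames. A common alternative is to accumulate in floating point and divide by the number of frames. The sketch below shows only that accumulation idea in plain C++ without OpenCV; `averageFrames` is a hypothetical helper and not a diagnosis of the alignment problem.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Average several equally sized 8-bit frames.
// Sums are kept in float so they cannot clip at 255 while accumulating.
inline std::vector<std::uint8_t> averageFrames(
    const std::vector<std::vector<std::uint8_t>>& frames)
{
    assert(!frames.empty());
    const std::size_t n = frames.front().size();

    std::vector<float> sum(n, 0.0f);
    for (const auto& frame : frames) {
        assert(frame.size() == n);  // all frames must have the same size
        for (std::size_t i = 0; i < n; ++i)
            sum[i] += static_cast<float>(frame[i]);
    }

    std::vector<std::uint8_t> out(n);
    const float count = static_cast<float>(frames.size());
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<std::uint8_t>(sum[i] / count + 0.5f);  // round to nearest
    return out;
}
```

Three frames with values 200, 220 and 240 at the same pixel average to 220, whereas a saturating 8-bit sum of the same values would stop at 255.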
Good day! I'm using imwrite to save the image below after cropping it in OpenCV (C++), but the saved file also includes the black area surrounding the cropped region. All I want is to save the cropped part. Please help.
Here's my code:
Mat mask,draft,res;
int nPixels;
char c=0;
while(true && c!='q') {
imshow("SAMPLE", img);
if(!roi.isSet())
roi.set("SAMPLE");
if (roi.isSet()) {
roi.createMask(img.size());
mask = roi.getMask();
res = mask & img.clone();
imwrite("masked.png",res);
imshow("draft", res);
}
c = waitKey(1);
}
Here is an example of how to crop an image and save the cropped image (see the comment from api55). Maybe that helps you.
cv::Mat img = cv::imread("Path/To/Image/image.png", cv::IMREAD_GRAYSCALE);
if(img.empty()) // loading failed
return -1;
cv::Rect roi(0, 0, 100, 100); // define roi here as x0, y0, width, height
cv::Mat croppedImg(img, roi); // view into img restricted to roi
cv::imwrite("Path/To/Save/Location/croppedImage.png", croppedImg);
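If the region comes from a mask rather than from a known rectangle, as in the question, one way to obtain the rectangle is to compute the bounding box of the non-zero mask pixels and crop to that. The following is a minimal sketch in plain C++ without OpenCV, assuming a row-major single-channel mask; `Box` and `maskBoundingBox` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rectangle type for this sketch (x, y, width, height in pixels).
struct Box {
    std::size_t x, y, width, height;
};

// Smallest rectangle containing every non-zero pixel of a row-major mask.
// Returns a box with zero width and height if the mask is entirely zero.
inline Box maskBoundingBox(const std::vector<std::uint8_t>& mask,
                           std::size_t width, std::size_t height)
{
    assert(mask.size() == width * height);
    std::size_t minX = width, minY = height, maxX = 0, maxY = 0;
    bool found = false;
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            if (mask[y * width + x] == 0) continue;
            minX = std::min(minX, x);
            minY = std::min(minY, y);
            maxX = std::max(maxX, x);
            maxY = std::max(maxY, y);
            found = true;
        }
    }
    if (!found) return Box{0, 0, 0, 0};
    return Box{minX, minY, maxX - minX + 1, maxY - minY + 1};
}
```

The resulting x, y, width and height map directly onto the arguments of the cv::Rect used for cropping above.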
I tried this code, which I found here; it displays multiple images in a single window in C++. I am using the OpenCV 3.0 library. The code is below. I am trying to load 2 images, but only the first one (1.jpg) appears. However, when I set image2 to cv::imread("1.jpg"), two copies of 1.jpg appear. I am new to this and don't understand where I am going wrong. I hope someone can help. Thank you.
int main(int argc, char *argv[])
{
// read an image
cv::Mat image1= cv::imread("1.jpg");
cv::Mat image2= cv::imread("2.jpg");
int dstWidth = image1.cols;
int dstHeight = image1.rows * 2;
cv::Mat dst = cv::Mat(dstHeight, dstWidth, CV_8UC3, cv::Scalar(0,0,0));
cv::Rect roi(cv::Rect(0,0,image1.cols, image1.rows));
cv::Mat targetROI = dst(roi);
image1.copyTo(targetROI);
targetROI = dst(cv::Rect(0,image1.rows,image1.cols, image1.rows));
image2.copyTo(targetROI);
// create image window named "My Image"
cv::namedWindow("OpenCV Window");
// show the image on window
cv::imshow("OpenCV Window", dst);
// wait key for 5000 ms
cv::waitKey(5000);
return 0;
}
This is the result of the program above
Your code works fine for me if the images have the same size. Otherwise, the call to
image2.copyTo(targetROI);
will copy image2 into a newly created image, not into dst as you would expect.
To make it work in general, you should:
1) set dstWidth and dstHeight like this:
int dstWidth = max(image1.cols, image2.cols);
int dstHeight = image1.rows + image2.rows;
2) set the second ROI with the size of the second image:
targetROI = dst(cv::Rect(0, image1.rows, image2.cols, image2.rows));
// note: image2.cols and image2.rows here, not image1's
From the comments: to show 4 images arranged in a 2x2 grid, you need a little more work:
#include <opencv2/opencv.hpp>
#include <iostream>
using namespace cv;
using namespace std;
int main()
{
// read an image
cv::Mat image1 = cv::imread("path_to_image1");
cv::Mat image2 = cv::imread("path_to_image2");
cv::Mat image3 = cv::imread("path_to_image3");
cv::Mat image4 = cv::imread("path_to_image4");
//////////////////////
// image1 image2
// image3 image4
//////////////////////
int max13cols = max(image1.cols, image3.cols);
int max24cols = max(image2.cols, image4.cols);
int dstWidth = max13cols + max24cols;
int max12rows = max(image1.rows, image2.rows);
int max34rows = max(image3.rows, image4.rows);
int dstHeight = max12rows + max34rows;
cv::Mat dst = cv::Mat(dstHeight, dstWidth, CV_8UC3, cv::Scalar(0, 0, 0));
cv::Rect roi(cv::Rect(0, 0, image1.cols, image1.rows));
image1.copyTo(dst(roi));
roi = cv::Rect(max13cols, 0, image2.cols, image2.rows);
image2.copyTo(dst(roi));
roi = cv::Rect(0, max12rows, image3.cols, image3.rows);
image3.copyTo(dst(roi));
roi = cv::Rect(max13cols, max12rows, image4.cols, image4.rows);
image4.copyTo(dst(roi));
cv::imshow("OpenCV Window", dst);
cv::waitKey(0);
return 0;
}
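The offsets used for the 2x2 layout can be computed separately from the copying, which makes the arithmetic easier to check. The following is a minimal sketch in plain C++ without OpenCV; `ImageSize`, `GridLayout` and `layout2x2` are hypothetical names, and the image order is top-left, top-right, bottom-left, bottom-right.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Width and height of one input image, in pixels.
struct ImageSize {
    int cols;
    int rows;
};

// Size of the destination canvas and the top-left corner of each image.
struct GridLayout {
    int dstWidth;
    int dstHeight;
    int x[4];
    int y[4];
};

// Index order: 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right.
inline GridLayout layout2x2(const std::array<ImageSize, 4>& s)
{
    const int leftColWidth    = std::max(s[0].cols, s[2].cols);
    const int rightColWidth   = std::max(s[1].cols, s[3].cols);
    const int topRowHeight    = std::max(s[0].rows, s[1].rows);
    const int bottomRowHeight = std::max(s[2].rows, s[3].rows);

    GridLayout g;
    g.dstWidth  = leftColWidth + rightColWidth;
    g.dstHeight = topRowHeight + bottomRowHeight;
    // The right column starts after the widest left image,
    // the bottom row starts after the tallest top image.
    g.x[0] = 0;            g.y[0] = 0;
    g.x[1] = leftColWidth; g.y[1] = 0;
    g.x[2] = 0;            g.y[2] = topRowHeight;
    g.x[3] = leftColWidth; g.y[3] = topRowHeight;
    return g;
}
```

Each pair `x[i]`, `y[i]` together with the size of image i gives the rectangle into which that image is copied.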
I want to implement an OCR feature.
I have collected some samples and I want to use K-Nearest to implement it.
So I use the code below to load the data and initialize KNearest:
KNearest* knn = new KNearest; // pointer, since it is used as knn-> below
Mat mData, mClass;
for (int i = 0; i <= 9; ++i)
{
Mat mImage = imread( FILENAME ); // the filename format is '%d.bmp', presenting a 15x15 image
Mat mFloat;
if (mImage.empty()) break; // if the file doesn't exist
mImage.convertTo(mFloat, CV_32FC1);
mData.push_back(mFloat.reshape(1, 1));
mClass.push_back( '0' + i );
}
knn->train(mData, mClass);
Then I call this code to find the best result:
for (vector<Mat>::iterator it = charset.begin(); it != charset.end(); ++it)
{
Mat mFloat;
it->convertTo(mFloat, CV_32FC1); // 'it' presents a 15x15 gray image
float result = knn->find_nearest(mFloat.reshape(1, 1), knn->get_max_k());
}
But my application crashes at find_nearest.
Could anyone help me?
I seem to have found the problem.
The images I classify were converted to grayscale with cvtColor, but the training images loaded with imread were not.
After I add
cvtColor(mImage, mImage, COLOR_BGR2GRAY);
between
if (mImage.empty()) break;
mImage.convertTo(mFloat, CV_32FC1);
find_nearest() returns a value and my application works.
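This is consistent with a feature-length mismatch: imread loads a 3-channel image by default, so a 15x15 training image flattened with reshape(1, 1) has 675 values, while a 15x15 grayscale query has only 225. The sketch below makes that check explicit in a plain C++ 1-nearest-neighbor search without OpenCV; `featureLength`, `Sample` and `nearestLabel` are hypothetical names, and this is not the KNearest implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Number of values in an image flattened to a single row.
inline std::size_t featureLength(std::size_t rows, std::size_t cols,
                                 std::size_t channels)
{
    return rows * cols * channels;
}

// One training sample: flattened pixel values plus its class label.
struct Sample {
    std::vector<float> features;
    int label;
};

// Label of the training sample closest to `query` (squared Euclidean distance).
// Throws instead of reading out of bounds when the lengths do not match.
inline int nearestLabel(const std::vector<Sample>& training,
                        const std::vector<float>& query)
{
    if (training.empty())
        throw std::invalid_argument("no training samples");

    int bestLabel = training.front().label;
    float bestDist = std::numeric_limits<float>::max();
    for (const Sample& s : training) {
        if (s.features.size() != query.size())
            throw std::invalid_argument("feature length mismatch");
        float dist = 0.0f;
        for (std::size_t i = 0; i < query.size(); ++i) {
            const float d = s.features[i] - query[i];
            dist += d * d;
        }
        if (dist < bestDist) {
            bestDist = dist;
            bestLabel = s.label;
        }
    }
    return bestLabel;
}
```

Comparing `featureLength(15, 15, 3)` with `featureLength(15, 15, 1)` before training and querying would have flagged the problem without a crash.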
cvSetImageROI(dst, cvRect(0, 0,img1->width,img1->height) );
cvCopy(img1,dst,NULL);
cvResetImageROI(dst);
I was using these commands to set the image ROI, but now I am using a Mat object and these functions only take an IplImage as a parameter. Is there a similar command for Mat objects?
Thanks for any help.
You can use the cv::Mat::operator() to get a reference to the selected image ROI.
Consider the following example where you want to perform Bitwise NOT operation on a specific image ROI. You would do something like this:
cv::Mat img = cv::imread("image.jpg", CV_LOAD_IMAGE_COLOR);
int x = 20, y = 20, width = 50, height = 50;
cv::Rect roi_rect(x,y,width,height);
cv::Mat roi = img(roi_rect);
/* ROI data pointer points to a location in the same memory as img. i.e.
No separate memory is created for roi data */
cv::Mat complement;
cv::bitwise_not(roi,complement);
complement.copyTo(roi);
cv::imshow("Image",img);
cv::waitKey();
The example you provided can be done as follows:
cv::Mat roi = dst(cv::Rect(0, 0,img1.cols,img1.rows));
img1.copyTo(roi);
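Because the ROI shares memory with the parent image, writing through it changes the parent directly; no copy back is needed. The sketch below illustrates the same idea in plain C++ without OpenCV, inverting a rectangle of a row-major 8-bit buffer in place; `invertRegion` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Apply a bitwise NOT to the rectangle (x, y, w, h) of a row-major,
// single-channel 8-bit image. The image is modified in place, the same way
// a cv::Mat ROI writes into the memory of its parent image.
inline void invertRegion(std::vector<std::uint8_t>& image, std::size_t imageWidth,
                         std::size_t x, std::size_t y,
                         std::size_t w, std::size_t h)
{
    assert(x + w <= imageWidth);
    assert((y + h) * imageWidth <= image.size());
    for (std::size_t row = y; row < y + h; ++row) {
        for (std::size_t col = x; col < x + w; ++col) {
            std::uint8_t& p = image[row * imageWidth + col];
            p = static_cast<std::uint8_t>(~p);
        }
    }
}
```

Only the pixels inside the rectangle change; the rest of the image is left untouched.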
Yes, you have a few options; see the docs.
The easiest way is usually to use a cv::Rect to specify the ROI:
cv::Mat img1(...);
cv::Mat dst(...);
...
cv::Rect roi(0, 0, img1.cols, img1.rows);
img1.copyTo(dst(roi));