Tesseract/Leptonica proper way to handle single and multipage images? - c++

I have a few questions about how input images are handled in Tesseract (with Leptonica).
What I'm trying to do is write a method that can process any image file (no specific format required) and later feed it to the Tesseract API, but this doesn't seem to be the right way of doing things with Leptonica...
Here is an example of what I'm doing:
string tmpFile ="path/to/my/file";
// Trying to load a PIXA struct, since it can handle multipage images
PIXA* sourceImg =pixaRead(tmpFile.c_str());
if (sourceImg == NULL) {
// This happens when pixaRead fails to load the image,
// so we assume it's a single-page image file.
sourceImg =new PIXA;
sourceImg->n =1;
sourceImg->pix =(Pix**)malloc(sizeof(Pix*));
assert(sourceImg->pix != NULL);
sourceImg->pix[0] =pixRead(tmpFile.c_str());
sourceImg->refcount =1;
}
api = new tesseract::TessBaseAPI();
if (api->Init(NULL, "eng")) {
fprintf(stderr, "Could not initialize tesseract.\n");
exit(1);
}
// Now we can process each page
for(int i=0; i<sourceImg->n; i++) {
// results is an object I use to save the text from each document,
// with a page count
if (i > 0)
results.addPage();
Pix* image =sourceImg->pix[i];
api->SetImage(image);
// Get OCR result
outText = api->GetUTF8Text();
// Here I process stuff, not really important
int dummyPos=0;
results.addLine(outText, dummyPos, dummyPos, dummyPos, dummyPos);
delete [] outText;
}
pixaDestroy(&sourceImg);
api->End();
So this works, but not the way I want: even when I use a multipage TIFF, I get the following message when loading the image:
Error in pixaReadStream: not a pixa file
Error in pixaRead: pixa not read
It's still able to process the document thanks to the "pixRead" call I fall back to when "pixaRead" fails...
Can someone explain what is wrong with my use of the "pixaRead" function?
And is it possible to handle both single-page and multipage images with something like this?
PS: I'm using Tesseract V4.0 and Leptonica V1.74.4
Thanks in advance!

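Use pixaReadMultipageTiff to read TIFF images (single or multiple pages) and pixRead for other image formats. pixaRead only reads Leptonica's own serialized pixa files, which is why it reports "not a pixa file" for a TIFF.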
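To choose between the two readers, the program has to know whether the file is a TIFF. Below is a minimal sketch of one way to decide, using only the standard library: it checks the first four bytes for the classic TIFF signature. The helper names hasTiffSignature and isTiffFile are hypothetical (they are not part of Leptonica), and BigTIFF files are deliberately not covered.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns true when the first four bytes match a
// classic TIFF header, "II*\0" (little-endian) or "MM\0*" (big-endian).
bool hasTiffSignature(const unsigned char* bytes, std::size_t size) {
    if (bytes == nullptr || size < 4) {
        return false;
    }
    const bool littleEndian = bytes[0] == 'I' && bytes[1] == 'I' &&
                              bytes[2] == 0x2A && bytes[3] == 0x00;
    const bool bigEndian = bytes[0] == 'M' && bytes[1] == 'M' &&
                           bytes[2] == 0x00 && bytes[3] == 0x2A;
    return littleEndian || bigEndian;
}

// Hypothetical helper: reads the first four bytes of the file and checks them.
// Returns false if the file cannot be opened or is shorter than four bytes.
bool isTiffFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    unsigned char header[4] = {0, 0, 0, 0};
    in.read(reinterpret_cast<char*>(header), sizeof(header));
    return in.gcount() == 4 && hasTiffSignature(header, sizeof(header));
}
```

When isTiffFile returns true, the file would go to pixaReadMultipageTiff; otherwise it would go to pixRead, and the resulting Pix can be wrapped with pixaCreate(1) and pixaAddPix(pixa, pix, L_INSERT) instead of filling the PIXA fields by hand, so that pixaDestroy frees everything consistently.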

Related

Does anyone know why I get this exception using Tesseract OCR in C++?

I'm trying to use Tesseract in C++ for a small personal project.
It's my first time using Tesseract in C++, and I'm coding in Visual Studio 2019.
I followed this tutorial to install Tesseract with VB :
https://www.technical-recipes.com/2021/getting-started-with-tesseract-optical-character-recognition-ocr-library-in-visual-studio/
But when I run the sample code to check that everything works, it raises an exception.
Image of the error
The code is :
{
char* outText;
tesseract::TessBaseAPI* api = new tesseract::TessBaseAPI();
// Initialize tesseract-ocr with English, without specifying tessdata path
if (api->Init("C:\\Users\\Kilian\\source\\repos\\LecteurHdv\\tessdata", "fra")) {
fprintf(stderr, "Could not initialize tesseract.\n");
exit(1);
}
// Open input image with leptonica library
Pix* image = pixRead("C:\\Users\\Kilian\\Pictures\\TestOcr\\test.png");
api->SetImage(image);
// Get OCR result
outText = api->GetUTF8Text();
printf("OCR output:\n%s", outText);
// Destroy used object and release memory
api->End();
delete api;
delete[] outText;
pixDestroy(&image);
return 0;
}
The only changes I made to the base code were changing the language to French and changing the image folder.
Does anyone have any idea why I get this exception?
Sorry if a post about this already exists.
Thank you all for your replies.

Trying to encode a GIF file using giflib

I am given image data and a color table, and I am trying to export them as a single-frame GIF using giflib. I looked into the API, but can't get it to work. The program crashes even at the first function:
GifFileType image_out;
int errorCode = 0;
char* fileName = "SomeName.gif";
image_out = *EGifOpenFileName(fileName,true, &errorCode);
It is my understanding that I first need to open a file by specifying its name and then update it with the file handle, then fill in the screen description, the extension block, and the image data, and add the 3B ending to the file, and finally use EGifSpew to export the whole GIF. The problem is that I can't even call EGifOpenFileName(): the program crashes at that line.
Can someone help me with the giflib API? This problem is getting really frustrating.
Thanks.
EDIT:
For the purposes of simple encoding I do not want to specify a color table and I just want to encode a single frame GIF.
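The prototype is:
GifFileType *EGifOpenFileName(char *GifFileName, bool GifTestExistance, int *ErrorCode)
so you should write:
GifFileType* image_out = EGifOpenFileName(fileName,true, &errorCode);
Note that the function returns NULL on failure (for example, with GifTestExistance set to true, when the file already exists), so dereferencing the result as in your code can crash. Keep the pointer, check it against NULL, and do NOT copy the GifFileType by value.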

Compress DICOM file with DCMTK (C++)

I'm very frustrated...
Following the example on this page, http://support.dcmtk.org/docs/mod_dcmjpeg.html, I wrote a C++ program to decompress a JPEG-compressed DICOM image file.
Now I want to do the reverse, from uncompressed to compressed. If I use the other example on the same page, with the same file (or another one), the code compiles and runs but is not able to compress the file...
I saw that after the following call, the original transfer syntax and the current one are the same, which is wrong because they need to be different:
dataset->chooseRepresentation(EXS_JPEGProcess14SV1, &params);
It's as if the chooseRepresentation method fails...
Moreover, the line
dataset->canWriteXfer(EXS_JPEGProcess14SV1)
returns false.
While debugging, I saw that in the dcpixel.cc file the code goes into
DcmPixelData::canChooseRepresentation(.........
....
....
// representation not found, check if we have a codec that can create the
// desired representation.
if (original == repListEnd)
{
result = DcmCodecList::canChangeCoding(EXS_LittleEndianExplicit, toType.getXfer());
}
and result is false...
How can I fix it? Does someone have working code to compress a DICOM image with DCMTK or another library?
This is the full code:
int main()
{
//dcxfer.h
DJDecoderRegistration::registerCodecs(); // register JPEG codecs
DcmFileFormat fileformat;
/**** MONO FILE ******/
if (fileformat.loadFile("Files/cnv3DSlice (1)_cnv.dcm").good())
{
DcmDataset *dataset = fileformat.getDataset();
DcmItem *metaInfo = fileformat.getMetaInfo();
DJ_RPLossless params; // codec parameters, we use the defaults
// this causes the lossless JPEG version of the dataset to be created
dataset->chooseRepresentation(EXS_JPEGProcess14SV1, &params);
// check if everything went well
if (dataset->canWriteXfer(EXS_JPEGProcess14SV1))
{
// force the meta-header UIDs to be re-generated when storing the file
// since the UIDs in the data set may have changed
delete metaInfo->remove(DCM_MediaStorageSOPClassUID);
delete metaInfo->remove(DCM_MediaStorageSOPInstanceUID);
// store in lossless JPEG format
fileformat.saveFile("Files/test_jpeg_compresso.dcm", EXS_JPEGProcess14SV1);
}
}
DJDecoderRegistration::cleanup(); // deregister JPEG codecs
return 0;
}
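To compress an image you need to register the encoders (and call DJEncoderRegistration::cleanup() when you are done):
DJEncoderRegistration::registerCodecs();
The call in your code only registers the decoders, which are used for decompression:
DJDecoderRegistration::registerCodecs();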

How do I load an image (raw bytes) with OpenCV?

I am using Mat input = imread(filename); to read an image, but I'd like to do it from memory instead. The file comes from an HTTP server. To make it faster, instead of writing the file to disk and then using imread() to read it back, I'd like to skip a step and load it directly from memory. How do I go about doing this?
Updated to add error
I tried the following, but I'm getting a segmentation fault:
char * do_stuff(char img[])
{
vector<char> vec(img, img + strlen(img));
Mat input = imdecode(Mat(vec), 1);
}
See the documentation for imdecode().
http://docs.opencv.org/modules/highgui/doc/reading_and_writing_images_and_video.html#imdecode
I had a similar problem. I needed to decode a jpeg image stream in memory and use the Mat image output for further analysis.
The documentation on OpenCV::imdecode did not provide me enough information to solve the problem.
However, the code here by OP worked for me. This is how I used it ( in C++ ):
// Here pImageData is an [unsigned char *] that points to a JPEG-compressed image buffer;
// ImageDataSize is the size of the compressed content in the buffer;
// the image here is grayscale.
std::vector<unsigned char> ImVec(pImageData, pImageData + ImageDataSize);
cv::Mat ImMat;
ImMat = cv::imdecode(ImVec, 1);
To check, I saved ImMat and was able to open the image file using an image viewer.
cv::imwrite("opencvDecodedImage.jpg", ImMat);
I used : OpenCV 2.4.10 binaries for VC10 on x86.
I hope this information can help others.
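One likely reason the code in the question misbehaves is that it sizes the buffer with strlen(), which stops at the first zero byte, and compressed image data normally contains zero bytes. (The function there also never returns a value despite its char * return type.) The sketch below uses only the standard library to show the difference; toByteVector and lengthSeenByStrlen are hypothetical names chosen for this illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies exactly `size` bytes, including any embedded
// zero bytes, into a vector suitable for passing to an image decoder.
std::vector<unsigned char> toByteVector(const unsigned char* data,
                                        std::size_t size) {
    if (data == nullptr || size == 0) {
        return {};
    }
    return std::vector<unsigned char>(data, data + size);
}

// Hypothetical helper: the length the question's code would use.
// strlen() stops at the first zero byte, so binary data gets truncated.
// The buffer must contain at least one zero byte for this call to be safe.
std::size_t lengthSeenByStrlen(const unsigned char* data) {
    return std::strlen(reinterpret_cast<const char*>(data));
}
```

The caller therefore has to carry the byte count alongside the pointer (for example, the Content-Length of the HTTP response). With OpenCV, the resulting vector would be passed to cv::imdecode, and checking the returned Mat with empty() before using it shows whether decoding succeeded.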

OpenCV Error: Null pointer (NULL array pointer is passed) in cvGetMat

I ran the code of Caltech-Lanes-Detection. Here is my command:
$ ./LaneDetector32 --show --list-file=/home/me/caltech-lanes/cordova1/list.txt --list-path=/home/me/caltech-lanes/cordova1/ --output-suffix=_result
and I get the following error:
main.cc:187 msg Loaded camera file
main.cc:194 msg Loaded lanes config file
main.cc:249 msg Processing image: /home/me/caltech-lanes/cordova1/f00000.png
OpenCV Error: Null pointer (NULL array pointer is passed) in cvGetMat, file /home/me/OpenCV-2.0.0/src/cxcore/cxarray.cpp, line 2370
terminate called after throwing an instance of 'cv::Exception'
and if I run this command:
eog /home/me/caltech-lanes/cordova1/f00000.png
I can see the picture. Please help me. Thank you.
This question might better be answered by Mohamed Aly, the guy who actually worked on this. His contact is right on the page you linked.
That said, let's take a look. (There's a TLDR if you want to skip this) The error is caused by the cvGetMat in the cxarray.cpp file. The first couple lines of which are:
2362 cvGetMat( const CvArr* array, CvMat* mat,
2363 int* pCOI, int allowND )
2364 {
2365 CvMat* result = 0;
2366 CvMat* src = (CvMat*)array;
2367 int coi = 0;
2368
2369 if( !mat || !src )
2370 CV_Error( CV_StsNullPtr, "NULL array pointer is passed" );
...
return result;
}
It isn't until later that we actually check whether your image has data in it or not.
So now let's find where Mr. Aly used cvGetMat(). We're in luck! There is only one place where he's used it without commenting it out. The file is mcv.cc:
void mcvLoadImage(const char *filename, CvMat **clrImage, CvMat** channelImage)
{
// load the image
IplImage* im;
im = cvLoadImage(filename, CV_LOAD_IMAGE_COLOR);
// convert to mat and get first channel
CvMat temp;
cvGetMat(im, &temp);
*clrImage = cvCloneMat(&temp);
// convert to single channel
CvMat *schannel_mat;
CvMat* tchannelImage = cvCreateMat(im->height, im->width, INT_MAT_TYPE);
cvSplit(*clrImage, tchannelImage, NULL, NULL, NULL);
// convert to float
*channelImage = cvCreateMat(im->height, im->width, FLOAT_MAT_TYPE);
cvConvertScale(tchannelImage, *channelImage, 1./255);
// destroy
cvReleaseMat(&tchannelImage);
cvReleaseImage(&im);
}
This is clearly where the filename you specified ends up. Nothing is wrong here as such, although it would be nice if the code checked that the image was actually loaded. cvGetMat has two inputs: the image, and the mat it gets written into. The mat should be fine, so we need to check the image. cvLoadImage accepts any filename, whether or not the file exists, and returns NULL on failure instead of raising an error, so we need to check that the filename got there intact. mcvLoadImage is called in ProcessImage(*) in the main.cc file, but that function also gets the filename passed into it. ProcessImage is called in Process(), where the filename is the same string that is printed out when it says
Processing image: /home/me/caltech-lanes/cordova1/f00000.png
Of course, that's just a string. He didn't check whether he could read the file beforehand, so when he says "Processing image" he really means "This is the path I was given to the image, but I don't actually know if I can read it yet".
TLDR: (And I can't blame ya) It seems the main issue is that the program can't read the file even though eog can display it. As-is, the only thing I can suggest is moving the folder cordova1 to something like C:/Test/cordova1/ or (if settings on your computer prevent that from working) C:/Users/[You]/cordova1/, with the files in there, and running
$ ./LaneDetector32 --show --list-file=/home/me/caltech-lanes/cordova1/list.txt --list-path=/home/me/caltech-lanes/cordova1/ --output-suffix=_result
to see if it's a permissions error preventing the lane-detection program from actually reading in the file.
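A quick way to test the "can the program actually read this path" theory, independent of OpenCV, is a small check like the one below. It is only a sketch using the standard library; canReadFile is a hypothetical helper and is not part of the Caltech lane detector code.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns true only if the file can be opened for
// reading and at least one byte can actually be read from it.
bool canReadFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in.is_open()) {
        return false;
    }
    char byte = 0;
    in.read(&byte, 1);
    return in.gcount() == 1;
}
```

Calling it on the exact string printed after "Processing image:" would separate a path or permissions problem from a decoding problem. Inside mcvLoadImage itself, the analogous guard would be to test the pointer returned by cvLoadImage for NULL before handing it to cvGetMat.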
Just in case it helps, I had this same error because I was dealing with (trying to show) very large images.
So I had to segment the images and process them chunk by chunk.
(I was using OpenCV 3.0 for Python. I know this question is about C++, but that is basically what is running underneath.)