How do I link Tesseract to a C++ project in VS 2019? - c++

So I've linked OpenCV already and that was pretty straightforward and there are many guides online how to do it.
But I don't know how to go about downloading Tesseract for usage in one's own applications. I want to get the API and use it in my code in conjunction with OpenCV. Can anyone guide me through what I need to download and what settings I'd need to tinker with to achieve this?

Install vcpkg (Microsoft's package manager for open-source C and C++ libraries) and run a PowerShell command such as .\vcpkg install tesseract:x64-windows-static. Dependencies such as Leptonica are installed automatically. Tesseract can then be integrated into your VS projects with .\vcpkg integrate install.

I had a similar problem, and in this thread I shared how I solved it. It may be helpful for someone, so I'll copy the text here:
I had been trying to link the Tesseract library to my C++ project in Visual Studio 2019 for a couple of days, and I finally managed to do it.
None of the threads I found, nor even the official Tesseract documentation, has a full list of instructions.
I'll list what I did; hopefully it will help someone. I don't claim it's the optimal way.
There are basic tips in official tesseract documentation.
Go to "Windows" section.
I did install sw and cppan but I guess it wasn't necessary.
The main thing here is installing vcpkg.
It requires Git, so I installed that first.
Then:
> cd c:\tools (I installed it in c:\tools, you may choose any dir)
> git clone https://github.com/microsoft/vcpkg
> .\vcpkg\bootstrap-vcpkg.bat
> .\vcpkg\vcpkg install tesseract:x64-windows-static (I used x64 version)
> .\vcpkg\vcpkg integrate install
At this point everything should work, they said: headers should be found and libs should be linked. But neither worked for me.
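One way to check whether the compiler actually sees the vcpkg headers is a compile-time probe. This is only a sketch under assumptions: it needs a compiler that offers __has_include (standard since C++17), it assumes the headers sit at tesseract/baseapi.h and leptonica/allheaders.h (the usual layout under vcpkg's include directory), and header_status and report_headers are made-up helper names.

```cpp
#include <iostream>
#include <string>

// Probe for the headers at compile time. When __has_include is not
// available, the result is recorded as "unknown" (-1) instead of failing.
#ifdef __has_include
#  if __has_include(<tesseract/baseapi.h>)
#    define HAVE_TESSERACT_HEADER 1
#  else
#    define HAVE_TESSERACT_HEADER 0
#  endif
#  if __has_include(<leptonica/allheaders.h>)
#    define HAVE_LEPTONICA_HEADER 1
#  else
#    define HAVE_LEPTONICA_HEADER 0
#  endif
#else
#  define HAVE_TESSERACT_HEADER -1
#  define HAVE_LEPTONICA_HEADER -1
#endif

// Turns the tri-state probe value into a readable word.
inline std::string header_status(int flag) {
    if (flag > 0) return "found";
    if (flag == 0) return "missing";
    return "unknown";
}

// Writes one line per probed header to the given stream.
inline void report_headers(std::ostream& out) {
    out << "tesseract/baseapi.h: " << header_status(HAVE_TESSERACT_HEADER) << '\n';
    out << "leptonica/allheaders.h: " << header_status(HAVE_LEPTONICA_HEADER) << '\n';
}
```

Calling report_headers(std::cout) from main prints "missing" for a header the project's include directories do not reach, which at least separates an include-path problem from a linker problem.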
Change project configuration to Release x64 (or Release x86 if you installed x86 tesseract).
To include headers: go to project properties -> C/C++ -> General. Set Additional Include Directories to C:\tools\vcpkg\installed\x64-windows-static\include (or wherever you installed vcpkg).
To link libraries : project properties -> Linker -> General. Set Additional Library Directories to C:\tools\vcpkg\installed\x64-windows-static\lib
Project properties -> C/C++ -> Code Generation. Set Runtime Library to Multi-threaded (/MT). Otherwise I got errors like "runtime mismatch static vs DLL".
The Tesseract lib couldn't link to its dependencies, so I added every lib that vcpkg had installed in C:\tools\vcpkg\installed\x64-windows-static\lib.
Project properties -> Linker -> Input. I set Additional Dependencies to archive.lib;bz2.lib;charset.lib;gif.lib;iconv.lib;jpeg.lib;leptonica-1.80.0.lib;libcrypto.lib;libpng16.lib;libssl.lib;libwebpmux.lib;libxml2.lib;lz4.lib;lzma.lib;lzo2.lib;openjp2.lib;tesseract41.lib;tiff.lib;tiffxx.lib;turbojpeg.lib;webp.lib;webpdecoder.lib;webpdemux.lib;xxhash.lib;zlib.lib;zstd_static.lib;%(AdditionalDependencies)
And after that it finally compiled and launched.
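If the runtime mismatch error from the /MT step shows up again, it can help to confirm which runtime option a given source file is really compiled with. The sketch below relies on the macros _MSC_VER, _DLL and _DEBUG that the MSVC compiler predefines; msvc_runtime_flag is a hypothetical helper name, and the result can be misleading if a project defines _DEBUG by hand.

```cpp
#include <string>

// Reports which MSVC runtime library option this translation unit was built with.
// MSVC defines _DLL for /MD and /MDd, and _DEBUG for /MTd and /MDd.
// On other compilers the function just says so.
inline std::string msvc_runtime_flag() {
#if !defined(_MSC_VER)
    return "not MSVC";
#elif defined(_DLL) && defined(_DEBUG)
    return "/MDd";
#elif defined(_DLL)
    return "/MD";
#elif defined(_DEBUG)
    return "/MTd";
#else
    return "/MT";
#endif
}
```

With the x64-windows-static libraries described above, a Release build would be expected to report /MT; anything mentioning /MD points at the Code Generation setting.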
But... api->Init returned -1. To work with tesseract you should have tessdata directory with .traineddata files for the languages you need.
Download tessdata. I got it from official docs.
BTW, tessdata_fast worked better than tessdata_best for my purposes :)
So I downloaded the single "eng" file and saved it as C:\tools\TesseractData\tessdata\eng.traineddata.
Then I added environment variable TESSDATA_PREFIX with value C:\tools\TesseractData\tessdata. I also added C:\tools\TesseractData to Path variables (just in case)
And after all this it is finally working for me.
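When api->Init returns -1, a quick sanity check is to look for the language file before involving Tesseract at all. The following is a minimal sketch using only the standard library (C++17). It assumes the layout from this answer, where TESSDATA_PREFIX points directly at the folder holding eng.traineddata; other Tesseract versions may interpret the variable differently. All function names here are hypothetical.

```cpp
// Silences MSVC's deprecation diagnostic for std::getenv; ignored elsewhere.
#define _CRT_SECURE_NO_WARNINGS
#include <cstdlib>
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Builds <tessdata_dir>/<lang>.traineddata, the layout used in this answer.
inline fs::path traineddata_file(const fs::path& tessdata_dir, const std::string& lang) {
    return tessdata_dir / (lang + ".traineddata");
}

// Reads TESSDATA_PREFIX; returns an empty optional when it is unset or empty.
inline std::optional<fs::path> tessdata_prefix_from_env() {
    const char* value = std::getenv("TESSDATA_PREFIX");
    if (value == nullptr || *value == '\0') return std::nullopt;
    return fs::path(value);
}

// True when the language file exists as a regular file in that directory.
inline bool language_data_present(const fs::path& tessdata_dir, const std::string& lang) {
    std::error_code ec;  // non-throwing overload: a bad path simply yields false
    return fs::is_regular_file(traineddata_file(tessdata_dir, lang), ec);
}
```

A caller would first test tessdata_prefix_from_env() and then pass its value to language_data_present(dir, "eng"); if either check fails, the problem is in the environment setup rather than in the linking.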

Related

How to setup complex project with Visual Studio 2022?

I have a project that I have been developing for years on Linux.
It depends on the MKL, libxml++, GSL and Armadillo libraries.
Its build is set up with CMake, and the project consists of a shared library and a couple of executables that link to it. There are about 20 classes in the library.
Project structure is:
--src
  --executable1.cpp
  --executable2.cpp
  --mysharedlib
    --class1.h
    --class1.cpp
    --...
My question is how to build and run this code in Visual Studio on Windows.
I have never used VS before and am still going through tutorials. I managed to run my code by installing Ubuntu on WSL, but I'd like a VS solution, as it would be handy to pass on to users not familiar with Linux.
I tried opening the project directory with VS, hoping CMake would do all the magic, but as expected it cannot locate the dependent libraries, so I am now searching the web for how to integrate each one with VS. I managed to find guides for Armadillo and MKL, but I am lost on how to link these libraries to my project code, and on whether I should abandon the current CMake setup and structure the code differently in VS.
Any links to useful VS tutorials and advice on how to do this are greatly appreciated.
VS does have support for CMake, although I have no idea how well VS integrates it. If you're not set on using VS, you might want to look into an IDE that uses CMake at its core; CLion comes to mind. That being said, when coming from Linux you don't have the (initial) luxury of simply installing all the dependencies via a preinstalled package manager.
One way to let CMake find your dependencies (assuming you currently locate them with find_package()) is to add their sources to your project in a thirdparty folder (the name is up to you) and pull them in with add_subdirectory() instead. This compiles all your dependencies from source, so you may have to configure them yourself (see each dependency's documentation on how to build it from source).
Another way is to use a package manager that is available on Windows to download, compile and provide your dependencies to your build tools. vcpkg comes to mind, claiming to integrate well with CMake by providing a toolchain file that you can pass to CMake when building your project. You might even be able to configure VS to automatically pass this toolchain to CMake whenever it's invoked.
From personal experience, there is no need to convert an existing project to the VS project structure. There are plenty of solutions and tools on Windows for working with CMake projects. The cross-platform approach should be preferred unless you're only targeting Windows, in which case using VS to its fullest might give you some additional quality of life.
If you have more specific questions regarding this, I suggest that you update your original post or to create separate, specific questions regarding the processes involved in setting up an existing CMake project on Windows.

Installing C++ Boost library on different hard drive

I'm still pretty inexperienced with C++ but I need to install Boost 1.6.1.
I just want to do it with the minimum hassle possible.
I'm using Visual Studio 2015 for development, which is installed on my C drive. The problem is I don't have much space left on my C drive.
Is it possible to install Boost on my D drive?
Can someone explain to me step by step how to do this, or point me to a good step-by-step tutorial?
Thanks
Download my Boost Build Environment.
Extract it to the root of your D drive. It will create a boost_build_environment directory.
Open the MSBuild Command Prompt for VS2015.
CD into D:\boost_build_environment.
Build boost as follows.
msbuild /nologo /target:BuildAll BuildBoost.proj
Run the CleanAll target as follows.
msbuild /nologo /target:CleanAll BuildBoost.proj
Have fun using Boost.
The magic is in the Microsoft.Cpp.Win32.user.props and Microsoft.Cpp.x64.user.props files, which are copied into $(LOCALAPPDATA)\Microsoft\MSBuild\v4.0 by the CopyProps target. These props files are automatically imported by most, if not all project files. They set the AdditionalIncludeDirectories and AdditionalLibraryDirectories lists so that ICU and Boost will be found.

C++ how to manage dependencies (use libraries from github for example)

I'm very new to C++ world, so please, sorry for such a dummy question. I googled a little, but wasn't able to find a proper answer.
My question is fairly simple: how should I use libraries in the C++ world? In Java, for example, there are Maven and Gradle for this task. In Python I use pip. In JavaScript, npm and Bower do all the work. In C# you use NuGet or just add a DLL to your project. But it looks like things aren't so easy in C++.
I found a tool called conan, but the number of libraries it has is pretty small and does not include any that I'm looking for.
So, for example, I want to use the NLP library meta, but it seems they don't provide any installer files. So I assume I need to get the sources from GitHub. Should I compile them and then add the compiled files to my project, or should I have a lib folder in my project, put meta's sources in that folder, and then work with meta's sources as if they were part of my project?
My question isn't about how to install the meta lib specifically, but more about source management. For example, I use Visual Studio on Windows, but my colleague will be coding in CLion under Linux, and I don't know the proper way of managing dependencies in the C++ world.
C++ doesn't have anything like pip or npm/bower. I don't know if maven or gradle can be persuaded to handle C++ libraries.
In general, you are going to have to end up with:
Header files in a directory somewhere.
Library files (either static libraries or DLLs/shared objects). If the library is header-only, like some of the Boost libraries, you won't need these.
You get hold of the library files, either by building them on your machine (typical for open source projects, and projects aimed at Linux platforms), or by downloading the pre-compiled binaries (typical for Windows libraries, particularly paid-for).
Hopefully, the instructions for building the library will be included on the library website. As noted in the comments, 'meta' seems to be quite good at that.
When you try to compile with the library, you may need a command line option (eg -I) to specify the directory containing the header files, and you may need a linker option (eg -l) to tell the linker to link against your library.
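To make the role of those options concrete, here is a small sketch that assembles a GCC-style command line from its parts. The -I, -L and -l flags are real GCC/Clang options; everything else (the function name compile_command, the deps/foo paths and the library name foo) is hypothetical and only there for illustration.

```cpp
#include <string>
#include <vector>

// Builds a g++ command line: -I adds a header search directory,
// -L adds a library search directory, -l names a library to link.
// Libraries are placed after the source file because GNU linkers
// resolve symbols from left to right.
inline std::string compile_command(const std::string& source,
                                   const std::vector<std::string>& include_dirs,
                                   const std::vector<std::string>& library_dirs,
                                   const std::vector<std::string>& libraries) {
    std::string cmd = "g++";
    for (const auto& dir : include_dirs) cmd += " -I" + dir;
    cmd += " " + source;
    for (const auto& dir : library_dirs) cmd += " -L" + dir;
    for (const auto& lib : libraries) cmd += " -l" + lib;
    return cmd;
}
```

For instance, compile_command("main.cpp", {"deps/foo/include"}, {"deps/foo/lib"}, {"foo"}) gives g++ -Ideps/foo/include main.cpp -Ldeps/foo/lib -lfoo. A header-only library would need only the -I part.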
Cget will install any package that uses standard CMake, and it works on Linux and Windows. It has a shortened syntax for getting packages directly from GitHub (such as cget install google/googletest).
In addition, dependencies can be downloaded automatically by listing them in a requirements.txt file.
There are also recipes for installing non-CMake packages, and the repository here has over 300 libraries (and growing). So you can install curl with just cget install pfultz2/cget-recipes curl.
Sadly, C++ has no standard package manager for libraries. Some tools try to be one (like conan), but they are still small and scattered.
On Linux there are "-dev" packages you can install, but they do not cover everything either.
You will most likely end up downloading libraries yourself. The next problem is integrating them: build systems differ per operating system, so you have to work out how to build the C++ files on each.
On Windows with Visual Studio, for example, you need a Visual Studio project or an nmake-compatible makefile to build the libraries, and then you add them to your project. The same goes for makefiles on Linux.
There are several higher-level build frameworks, such as CMake. The example in your post also works with CMake, so integrating it into a CMake build environment would be easier, but this only applies to libraries that themselves provide a CMake build (e.g. Boost / Qt are doing this).
Those are some thoughts on this. Sadly there won't be an easy or definitive answer, because there is no real central C++ package repository that is also integrated into a build system.
It appears to me that the Crascit/DownloadProject could be of help in your situation. It provides CMake plugins for downloading projects from a git repository by specifying tags, etc. Then you can use add_custom_target to run commands you need to have the project built.
A number of popular C++ libraries are released as NuGet packages.
You can search the gallery for them, usually using the native or c++ tags. Obviously you need a NuGet client for your OS, and I'm pretty sure the C++ NuGet packages rely on MSBuild for a lot of the grunt work, so you may have trouble getting a non-Visual Studio setup to work nicely.
Gradle also has some support for native dependencies. I had a look a little while ago, but the work on it was curtailed because support for VS 2015 was lacking.
I recommend vcpkg for cross platform development. It has a number of IDE integrations. GitHub project is here.
I do cross platform development using tools like CMake, Visual Studio, WSL. vcpkg was incredibly helpful.
I started a new project. At the moment it is just a "source package manager": you provide some source code on GitHub and it is simply copied into your project (based on CMake, with auto-generated CMake files).
The link is here:
https://github.com/wsjcpp/wsjcpp

Installing Boost libraries with MinGW and CodeBlocks

I'm having my first fling with the Boost libraries, and I've picked a pretty girl named Regex.
I've installed the libraries (which build automatically?) on my machine, but I'm getting the above error (cannot find -lboost_regex). I'm using Code::Blocks with MinGW, and a C++0X compiler flag.
I have
Pointed the "search directories" to the installation directory
Added the -lboost_regex flag to the linker
but no luck. Can someone help me get this working?
Update
Got things running now. I've added some further notes in an answer below, for newcomers to this problem.
(Also, changed the title of the question since it turned out to be a broader issue than when I started out.)
Here are some links and tips that can help a newcomer, from my first build experience. I built the libraries directly from the zip file. I built on MinGW and I used CodeBlocks for the IDE.
1. Download the Boost zip and unzip it somewhere (I'll call that place $boostdir).
   - Pretty large when unzipped, > 300MB.
2. Add the MinGW bin directory to the PATH variable.
   - When Boost builds, it will need access to the MinGW executables.
3. Build b2.exe and bjam.exe.
   - The documentation for Windows blithely assumes the MSVC compiler is available.
   - If it is, you can apparently use bootstrap.bat like the docs say.
   - If it's not (as in my case), you'll have to build the exe files yourself, in steps 4 and 5.
4. In CMD, navigate to $boostdir/tools/build/v2/engine
5. Run build.bat mingw (will build b2.exe and bjam.exe)
   - Some aging basic documentation on that
6. Now you've got b2 and bjam custom-built according to your system spec. Navigate back up to $boostdir and get ready to start building the libraries.
   - Boost will make a new bin.v2 directory in the current directory.
   - All the libs will go in bin.v2.
   - This is an "intermediate" directory, for some reason.
   - Nothing to do in this step, just some extra info :)
7. Run b2 toolset=gcc --build-type=complete
   - This takes a long time, in the neighborhood of 1 - 2 hours.
   - You'll know if it's working. If you think something's wrong, it's not working.
   - The build can use various flags.
8. Now you're all built. Time to set up CodeBlocks.
9. Point your compiler to the header files.
   - Right click your project -> Build Options -> Search Directories tab -> Compiler tab -> add $boostdir address
10. Boost has built a DLL for the library you want according to your current system spec. Look in the stage\lib\ directory of $boostdir.
   - This DLL will be used later in the linker, so don't close its explorer window yet.
   - Mine was in C:\Program Files\Boost_1_52\stage\lib\libboost_regex-mgw44-1_52.dll
   - I think the documentation had a smart way to do this but I haven't tried it yet.
   - The "intermediate" directory from step #6 can be deleted now that the build is finished.
11. Point your linker to the directory of that DLL.
   - Right click your project -> Build Options -> Search Directories tab -> Linker tab -> add that directory address (blah\blah\blah\stage\lib\)
12. Add that DLL flag to your linker settings.
   - Mine was -lboost_regex-mgw44-1_52
13. Deep breath, prayers to your god, and fire up a test.
Further docs that may either help or confuse:
The Code::Blocks website has a version of this that I didn't find until I neared the end of my search. It was fairly helpful but had a few weird things. This post also is helpful.
Good luck!
I'm not sure what you mean by which build automatically. Most of the Boost libraries are header-only, but a few, such as regex, need to be compiled to a shared / static library. The compilation step is not automatic, you need to invoke the Boost build system (bjam) to do this. Of course, there are sources (BoostPro for instance) that distribute pre-built Boost binaries for various platforms.
Once that's done, you need to add the path where the libraries are present to the linker's search path. For MinGW, this option is -L"path/to/library". Boost does have directives to allow auto-linking of the required libraries, and this seems to work pretty well with MSVC, but I've never gotten it to work with MinGW. So you must also list the libraries to be linked explicitly. The Boost libraries include target and version information in the file name by default, so a typical linker command line option will look like -lboost_regex-mgw47-mt-1_51 for MinGW gcc 4.7 and Boost 1.51
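The naming pattern described here can be spelled out in a few lines. This is just a sketch of how the pieces of the flag fit together, based on the two examples in this thread (-lboost_regex-mgw47-mt-1_51 and -lboost_regex-mgw44-1_52); boost_link_flag is a hypothetical helper, and real Boost builds may add further tags (for debug or static runtime variants, for example), so check the actual file names in stage\lib.

```cpp
#include <string>

// Composes a MinGW linker flag for a versioned Boost library name:
//   -lboost_<library>-<toolset tag>[-mt]-<version tag>
inline std::string boost_link_flag(const std::string& library,
                                   const std::string& toolset_tag,
                                   bool multithreaded,
                                   const std::string& version_tag) {
    std::string flag = "-lboost_" + library + "-" + toolset_tag;
    if (multithreaded) flag += "-mt";  // threading tag, present in the 1.51 example
    flag += "-" + version_tag;
    return flag;
}
```

So boost_link_flag("regex", "mgw47", true, "1_51") reproduces the flag above, and boost_link_flag("regex", "mgw44", false, "1_52") reproduces the one from the question's own build notes.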

Windows package-manager for C++ libraries

I've been working on various open-source projects, which involve the following C++ libraries (& others):
MuPDF
Boost
FreeType
GTKmm
hummus PDF libraries
LibTiff
LibXML2
Wt
xpdf
Poppler
ZLib
It often takes a long time to configure these libraries, when setting them up on a clean machine. Is there a way to automate the grabbing of all dependencies on a windows machine?
The closest I've found is CMake, which checks to make sure you have the dependencies installed/extracted before generating your project files. But I haven't found anything for Windows which can parse the list of dependencies and then download+install the required versions.
Please recommend a package manager for Windows with up-to-date C++ libraries.
Vcpkg, a Microsoft open source project, helps you get C and C++ libraries on Windows.
Take a look at the Hunter package manager if you already use CMake to set up your project. It automatically downloads and builds your dependencies with only a few lines of extra CMake code. Hunter is based on CMake export and import targets.
For example if you want to use the GoogleTest library in your cmake based project you would add the following lines to your root CMakeLists.txt
# file root CMakeLists.txt
cmake_minimum_required(VERSION 3.0)
# To get hunter you need to download and include a single cmake file
# see documentation for correct name
include("../gate.cmake")
project(download-gtest)
# set the location of all your hunter-packages
set( HUNTER_ROOT_DIR C:/CppLibraries/HunterLibraries )
# This call automatically downloads and compiles gtest the first time
# cmake is executed. The library is then cached in the HUNTER_ROOT_DIR
hunter_add_package(GTest)
# Now the GTest library can be found and linked to by your own project
find_package(GTest CONFIG REQUIRED)
add_executable(foo foo.cpp)
target_link_libraries(foo GTest::main)
Not all the libraries you list are available as "hunter-packages" but the project is open source so you can create hunter-packages for your dependencies and commit them to the project. Here is a list of libraries that are already available as hunter packages.
This will not solve all your problems out of the box because you have to create hunter-packages for your dependencies. But the existing framework already does a lot of the work and it is better to use that instead of having a half-assed selfmade solution.
Biicode is a new dependency manager for C++. It also has a few libraries that you listed. Biicode automatically scans your source files for dependencies, downloads and builds them. See here for a very cool example that includes Freeglut.
What I've found:
Closest thing to what I'm looking for:
NuGET
Unfortunately it doesn't have any of the libraries I require in its repository.
So I ended up getting most of the libraries from the KDE4windows project and custom building the rest.
Npackd is a package manager for Windows. There is a default repository for C++ libraries and also a third party repository for Visual Studio 2010 64 bit libraries. Boost and zlib are already in the default repository. If you decide to use Npackd, you could file an issue if you need other libraries.
Windows does not have a package manager. Go to the libraries' website and download the Windows builds if they provide any.
There are some alternatives, but not without drawbacks:
Cygwin: provides a nice package manager, but all binaries are built for Cygwin, which means they run slower than their native equivalent, any apps using them will link to the Cygwin DLL, and you're stuck with that license. Also the use of the native Win32 API is sometimes troublesome due to incompatibility with the POSIX emulation offered. Only for GCC.
MinGW-get: is a package manager for the MinGW.org compiler. These are native Win32 binaries, but only for use with MinGW's GCC.
There is no package manager or slightly equivalent thing for anything Visual Studio or MinGW-w64 related.
There is no package management on Windows. On Windows developers typically use full-blown everything-and-the-kitchen-sink development environments and produce monolithic applications themselves, shipped with all dependencies.