Detect the language of a text using C++ or a shared object

I'm new to C++ and need a way to detect the language of a text.
I searched for a library that does this and only found the open source code from Chromium:
https://code.google.com/p/cld2/
There are many pages about using it from Python or Java, or as a standalone tool, but I can't find any tutorial on using it from C++ code.
So I need an explanation of how to use it, or a pointer to any other good library for detecting the language of a text in C++.
The text will be entered by the user, and I want to detect whether it is English, French, Arabic, etc., so that I can apply NLP accordingly.
Thanks,

Although it's not a library, one option is simply to use the Google Translate API to detect the language. This is done over REST. The obvious downside is that you need an internet connection to make the call. The docs explain how you can do this here
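The REST call mentioned above needs the user's text percent-encoded into the request URL. Here is a minimal sketch of that step in C++. It assumes the Cloud Translation v2 `detect` endpoint with `key` and `q` query parameters, so check the current docs before relying on it. `percent_encode` and `build_detect_url` are hypothetical helper names, and actually sending the request needs an HTTP client such as libcurl, which is not shown.

```cpp
#include <string>

// Percent-encode a UTF-8 byte string for use in a URL query.
// Only RFC 3986 "unreserved" characters are passed through unchanged.
std::string percent_encode(const std::string& text) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : text) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Build the request URL for language detection.
// ASSUMPTION: the endpoint and the "key"/"q" parameter names are taken from
// the v2 REST API as I understand it; verify them against the official docs.
std::string build_detect_url(const std::string& api_key, const std::string& text) {
    return "https://translation.googleapis.com/language/translate/v2/detect?key=" +
           percent_encode(api_key) + "&q=" + percent_encode(text);
}
```

The response is JSON, so you would also need a JSON parser to pull out the detected language code.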

Related

Is there common practice to find required header files of libraries in C++?

I am fairly new to C++ and this might sound like a very dumb question, but is there any resource or common practice to find the headers that need to be included when using C/C++ libraries?
For example: I am currently doing a project using the OpenSSL library.
How do I find out which headers I need to include for the sample code on this page:
https://wiki.openssl.org/index.php/EVP_Key_and_Parameter_Generation
I have had this issue many times, and I almost always struggle to find the right header files to include if the documentation doesn't provide a full working example.
Am I missing something when it comes to finding the required header files, or is this lack of documentation the norm for examples?
I realize you were hoping for something like a fancy database or some secret documentation that lists the headers.
In Qt Creator, for example, you can place the cursor on a class name in your code and press Alt+Enter, and the necessary header is added at the top of the file.
Sadly, that functionality is implemented only for Qt, not for C/C++ in general.
The correct answer to your question may be as simple and basic as this; it is also the fastest way I have found, and probably what most people use:
A search engine of your choice (Google, DuckDuckGo, Startpage.com, ...)
Library command (e.g. EVP_PKEY)
Programming language name (e.g. C++, Qt, ...)
Proof of concept, e.g. with startpage.com:
Maybe https://en.cppreference.com/w/cpp/header is an alternative, although I have never found things quickly there.
The second-best option, which I use regularly to find good sample code in addition to the header, is a program called Recoll (for Linux, Mac and Windows), or a similar desktop search engine.
Recoll is based on the very capable Xapian search engine library, for
which it provides a powerful text extraction layer and a complete, yet
easy to use, Qt graphical interface.
(https://www.lesbonscomptes.com/recoll/)
It works like this:
I put a selection of the 50 best books on a specific topic (e.g. C++, C, Qt; stay really specific) in a folder and let Recoll index the folder.
Now use keywords like EVP_PKEY to find every mention across all of your most loved and respected C++ PDF books in nano- to milliseconds, depending on how much money you spent on your PDF library. (Sure, you have to get/buy them first.)
(But it's a freakishly fast tool, and the results are even ranked thanks to the Xapian search engine library.)

C++ reading from .lang files with locales

I am creating an OpenGL game and I would like to make it available in more languages than just English, for obvious reasons. From looking around and fiddling with the games installed on my computer, I can see that locales play a big part in this. I can also see that .lang files, such as the en-US.lang that ships with Minecraft, are basically text documents in which each line has a code ("item.iron.ingot", for example), an equals sign, and then what that code means in the given language (English for en-US), which in this case would be "Iron Ingot". So I created a file named en-US.lang with these contents:
item.iron.ingot=Iron Ingot
In my C++ main function I put:
setlocale(LC_ALL, "en-US");
after including the locale header file. So the part that confuses me is how to use locales to read from the .lang file. Please help; some example code would be appreciated.
C++ does not come with built-in support for resource files or internationalization. However, there is a huge variety of solutions.
To support multi-language messages, you should have a basic understanding of how such strings are encoded in files and read into memory. Here is a basic introduction if you are not familiar with this:
http://www.joelonsoftware.com/articles/Unicode.html
To store and load the correct text at runtime you would typically use a third-party library; GNU gettext (http://www.gnu.org/software/gettext/) is one such example, and there are other solutions out there.

Small library for generating HTML files in C++

Is there a library that makes it easier to generate a simple website from C++ code? This 'website' will then be compiled into a CHM help file (which is the final goal here). Ideally, it would make it easy to generate pages and to generate links between pages. I can do all this by hand, but that is going to be very tedious and error-prone.
I know about bigger libraries such as Wt, but I am more interested in smaller ones with few or no dependencies and no need for installation.
You can try the CTPP template engine. It is written in C++, is small, and is quite fast.
Does this project need to be written in C++? If you just need to prepare documentation in CHM, I would go with Sphinx. Sphinx is a set of tools written in Python that generates manuals in several formats (CHM, HTML, LaTeX, PDF) from text files formatted with the reStructuredText markup language. Those text files can be written by hand or produced by some application and then combined into one manual using Sphinx. At work we currently use this solution to write documentation, because text files are much easier to maintain (merging, tracking changes, etc.) than, for example, HTML or DOC files. Sphinx is used to generate the Python language documentation (CHM), so it is capable of handling really large projects.
I've used the FLATE library every day for ten years and it works flawlessly. It's a piece of cake to use; I can't recommend it enough.
It will definitely do the trick, though probably at a much lower level than you have in mind. It is a C-language source library that you can link with a C++ caller. It's also available as a Perl module, but I haven't used that.
FLATE library
Flate is a template library used to deal with html code in CGI applications. The library includes C and Perl support. All html code is put in an external file (the template) and printed using the library functions: variables, zones (parts to be displayed or not) and tables (parts to be displayed 0 to n times). Using this method you don't need to modify/recompile your application when modifying html code, printing order doesn't matter in your CGI code, and your CGI code is much cleaner.
HTH and good luck!
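To illustrate the template idea described above (HTML kept in an external file, with variables filled in by the program), here is a rough C++ sketch of variable substitution only. It is not Flate's API. The `##name##` placeholder syntax and the `render` function are assumptions made up for this example, and zones and tables are not covered.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Replace every ##name## placeholder in tmpl with vars[name].
// Unknown placeholders are left in place so they are easy to spot.
// NOTE: the "##" marker is a hypothetical syntax chosen for this sketch,
// and values are inserted as-is (no HTML escaping).
std::string render(const std::string& tmpl,
                   const std::map<std::string, std::string>& vars) {
    const std::string mark = "##";
    std::string out;
    std::size_t pos = 0;
    while (true) {
        const std::size_t open = tmpl.find(mark, pos);
        if (open == std::string::npos) break;
        const std::size_t close = tmpl.find(mark, open + mark.size());
        if (close == std::string::npos) break;  // unterminated marker
        out.append(tmpl, pos, open - pos);      // literal text before it
        const std::string name =
            tmpl.substr(open + mark.size(), close - open - mark.size());
        const auto it = vars.find(name);
        if (it != vars.end()) {
            out += it->second;
        } else {
            out.append(tmpl, open, close + mark.size() - open);
        }
        pos = close + mark.size();
    }
    out.append(tmpl, pos, std::string::npos);   // remaining literal text
    return out;
}
```

With a template such as `<a href="##url##">##title##</a>`, links between generated pages come down to filling in `url` and `title` for each page.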
Are this CHM lib and the related links what you're looking for?

Help programmatically add text to an existing PDF

I need to write a program that displays a PDF supplied by a third party. I need to insert text data into the form before displaying it to the user. I do have the option of converting the PDF into another format, but it has to look exactly like the original PDF. C++ is the preferred language, but other languages can be investigated (e.g. C#). It needs to work on a Windows desktop machine.
What libraries, tools, strategies, or other programming languages do you suggest I investigate to accomplish this task? Are there any online examples you could direct me to?
Thank you in advance.
What about PoDoFo:
The PoDoFo library is a free, portable
C++ library which includes classes to
parse PDF files and modify their
contents into memory. The changes can
be written back to disk easily. The
parser can also be used to extract
information from a PDF file (for
example the parser could be used in a
PDF viewer). Besides parsing PoDoFo
includes also very simple classes to
create your own PDF files. All classes
are documented so it is easy to start
writing your own application using
PoDoFo.
iTextSharp is a free library that you can use in .Net applications. Take a look at the iText page - that is for the iText project, which is a Java library. iTextSharp is part of that project, and is a port to C# and .Net.
Consider Python. It has a lot of PDF libraries (for both creating and extracting), e.g.:
http://pypi.python.org/pypi/pdfsplit/0.4.2
http://pypi.python.org/pypi/JagPDF/1.4.0
http://pypi.python.org/pypi/pdfminer/20091129
http://pypi.python.org/pypi/podofo/0.0.1
http://pypi.python.org/pypi/pyFPDF/1.52
There are also good tools for using C/C++ code from Python and for creating .exe files from Python scripts. Even if you decide to use a different language, consider Python as a prototyping language!

C++ Transformer scripting

I'm looking to see whether there are any pre-existing projects that do this.
Generally, I need something that will load a C++ file, parse it, and then transform it based on a set of rules in a script: for example, to add headers, reformat, or remove coding quirks (such as turning const int parameters in functions into int parameters). Alternatively, something that generates a DOM of some sort from the C++ file, which could then be manipulated and written out again.
Are there any such projects/products out there free or commercial?
Taras Glek of Mozilla has been working on the Dehydra tool, based on Elkhound and scripted using JavaScript, to transform the Mozilla codebase to fit with XPCOM and garbage collector changes.
The parser from Eclipse CDT seems to be pretty complete by now, as some refactoring methods have already been contributed to CDT.