What Linux Full Text Indexing Tool Has A Good C++ API? - c++

I'm looking to add full text indexing to a Linux desktop application written in C++. I think the easiest way to do this would be to call an existing library or utility. This article reviews various open source utilities available for the GNOME and KDE desktops; MetaTracker, Recoll and Strigi are all written in C++, so they each seem reasonable. But I cannot find any notable documentation on how to use them as libraries or through an API. I could, instead, use something like CLucene or Xapian, which are generic full text indexing libraries. They seem more straightforward, but if I used them I'd have to implement my own indexing daemon, an unappealing prospect.
Also, Xesam seems to be the latest thing. Does anyone have any evidence that it works?
So, does anyone have experience using any of the applications or libraries? How did you use it and what documentation was useful?

I used CLucene, which you mentioned (and also Lucene.NET), and found it to be pretty good.

There's also Strigi which AFAIK works with Xesam and is the default used in KDE.

After further looking around, I found and worked with Recoll. I believe that it has the best C++ interface to a full text search engine, in this case Xapian.
It is important to realize that CLucene and Xapian are both highly complex libraries designed primarily for multi-user server applications. Cutting them down to a level appropriate for a client system is not easy. If I remember correctly, Strigi has a complex, pure C interface which isn't adapted.
CLucene also doesn't seem to be actively maintained at the moment, whereas Xapian does. But the main point is that Recoll exists: it lets you index particular files without the massive setup that raw Xapian or CLucene requires (creating your own "stemming" set is not normally desirable, etc.).
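To make it concrete what these engines do at their core, here is a minimal sketch of an inverted index using only the standard library. It is not the Xapian, CLucene, or Recoll API; the class name TinyIndex and its methods are made up for illustration, and it has no stemming, ranking, or on-disk storage, which is exactly the part the real libraries take care of.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy inverted index: maps each lowercase term to the IDs of the documents
// that contain it. Hypothetical illustration, not a real library's API.
class TinyIndex {
public:
    // Split `text` on non-alphanumeric characters and record each term.
    void add_document(int doc_id, const std::string& text) {
        assert(doc_id >= 0);  // this sketch assumes non-negative document IDs
        std::string term;
        for (unsigned char ch : text) {
            if (std::isalnum(ch)) {
                term += static_cast<char>(std::tolower(ch));
            } else if (!term.empty()) {
                postings_[term].insert(doc_id);
                term.clear();
            }
        }
        if (!term.empty()) postings_[term].insert(doc_id);
    }

    // Return the sorted IDs of documents containing `term` (case-insensitive).
    std::vector<int> search(std::string term) const {
        std::transform(term.begin(), term.end(), term.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        auto it = postings_.find(term);
        if (it == postings_.end()) return {};
        return std::vector<int>(it->second.begin(), it->second.end());
    }

private:
    std::map<std::string, std::set<int>> postings_;
};
```

Everything beyond this (file format filters, stemming, incremental updates, a daemon watching the filesystem) is where the setup effort mentioned above comes from.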

Related

using iRODS c++ API

I intend to put/get files to/from an iRODS server. iRODS provides well documented Java and PHP APIs, however I'm looking for a C/C++ library providing such functions.
Are there libraries or examples of code I could use?
Not exactly what you've requested, but you might be interested in Baton, which uses the iRODS C API, and has had a lot of work put into it, so you might be able to use it as-is, depending on your use case, rather than writing your own from scratch. Failing that, it provides a lot of examples of using the API.
If you have reasons that you must write your own, then in 4.x the iRODS docs for the C API are much improved from the earlier 3.3.1 code path.
Good luck and do try the mailing list as a previous commenter mentioned - the developers respond often.

Does a C++ shell framework exist?

I used to develop some Perl programs using Fry::Shell. I think it is very powerful and easy to use.
For one of my C++ projects I need to create a command line client. The idea is to create a TUI like the one found in routing hardware.
Does such a framework exist?
You can keep using Fry::Shell. It's not much of a hassle to call Perl from C++. Here's a starting point for that, though there might be a better way to do it.
EDIT: I just found a project on GitHub. It's written in C and seems pretty much dead, but try it out, it might be useful. Even if it's not, since it's open source, you can use it as a starting point. It claims to provide a Cisco-like interface, which should suit you pretty well.
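If neither option works out, the core of such a shell is small enough to sketch by hand. The following is a minimal command dispatcher using only the standard library; MiniShell, add_command and execute are hypothetical names for illustration, and there is no line editing, history, or tab completion as you would get from a real framework.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Minimal command dispatcher: maps a command word to a handler function.
class MiniShell {
public:
    using Handler = std::function<std::string(const std::vector<std::string>&)>;

    // Register a handler under a command name such as "show".
    void add_command(const std::string& name, Handler handler) {
        assert(!name.empty());
        commands_[name] = std::move(handler);
    }

    // Split one input line on whitespace, look up the first word, and run
    // its handler with the remaining words as arguments.
    std::string execute(const std::string& line) const {
        std::istringstream in(line);
        std::vector<std::string> words;
        for (std::string w; in >> w;) words.push_back(w);
        if (words.empty()) return "";
        auto it = commands_.find(words.front());
        if (it == commands_.end()) return "% Unknown command: " + words.front();
        std::vector<std::string> args(words.begin() + 1, words.end());
        return it->second(args);
    }

private:
    std::map<std::string, Handler> commands_;
};
```

An interactive loop would read lines with std::getline(std::cin, line) and print the result of execute(line) after each one.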

Easiest way to build a cross-platform application

I have read a few articles in the cross-platform tag. However, as I'm starting a fresh application (mostly a terminal/console app), I'm wondering about the easiest way to make it cross-platform (i.e. working for Linux, Mac OS X, and Windows). I have thought about the following:
adding various macro/tags in my code to build different binary executables for each operating system
use Qt platform to develop a cross-functional app (although the GUI and platform component would add more development time as I'm not familiar with Qt)
Your thoughts? Thanks in advance for your contribution!
Edit: Sounds like there are a lot of popular responses on Java and Qt. What are the tradeoffs between these two while we're at it?
Do not go the first way. You'll encounter a lot of problems that are already solved for you by numerous tools.
Qt is an excellent choice if you definitely want C++. In fact, it will speed up development even if you aren't familiar with it, as it has excellent documentation and is easy to use. The good part about it is that it isn't just a GUI framework, but also networking, XML, I/O and lots of other stuff you'll probably need.
If C++ is not necessary, I'd go with Java. C++ is far too low-level a language for most applications. Debugging memory management and corrupt stacks can be a nightmare.
To your edited question:
The obvious one: Java has garbage collection, C++ doesn't. It means no memory leaks in Java (unless you count possible bugs in JVM), no need to worry about dangling pointers and such.
Another obvious one: it is extremely easy to use platform-dependent code in C++ using #ifdefs. In Java it is a real pain. There is JNI but it isn't easy to use at all.
Java has very extensive support for exceptions. While C++ has exceptions too, Qt doesn't use them, and some things that generate exceptions in Java will leave you with corrupt memory and crashes in C++ (think buffer overflows).
"Write once, run everywhere." Recompiling a C++ program for many platforms can be daunting. Java programs don't need to be recompiled.
It is open to debate, but I think Java has a more extensive and well-defined library. The abstraction level is generally higher, the interfaces are cleaner. And it supports more useful things, like XML schemas and such. I can't think of a feature that is present in Qt, but absent in Java. Maybe multimedia or something, I'm not sure.
Both languages are very fast nowadays, so performance is usually not an issue, but Java can be a real memory hog. Not extremely important on modern hardware either, but still.
The least obvious one: C++ can be more portable than Java. One example is FreeBSD OS which had very poor support for Java some time ago (don't know if it is still the case). C++/Qt works perfectly there. If you plan on supporting a wide range of Unix systems, C++ may be a better choice.
Use Java. As much bashing as it gets/used to get, it's the best thing to get stuff working across any platform. Sure, you will still need to handle external OS related functions you may be using, but it's much better than using anything else.
Apart from Java, there are a few things you can run on the JVM - JRuby, Jython, Scala come to mind.
You could also write with the scripting languages directly (Ruby, Python, etc.).
C/C++ is best left for applications that demand complete memory control and high controllability.
I'd go with the Qt (or some other framework) option. If you went with the first you'd find it considerably harder. After all, you have to know what to put into the various conditionally compiled sections for all the platforms you're targeting.
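For reference, this is roughly what the conditionally compiled approach looks like in practice. It is a minimal sketch: the function names are made up for illustration, and the only platform facts it relies on are the predefined macros _WIN32, __APPLE__ and __linux__ that the common compilers set.

```cpp
#include <cassert>
#include <string>

// Pick a platform name at compile time from compiler-predefined macros.
std::string platform_name() {
#if defined(_WIN32)
    return "Windows";
#elif defined(__APPLE__)
    return "macOS";
#elif defined(__linux__)
    return "Linux";
#else
    return "unknown";
#endif
}

// Path separator conventionally used on the target platform.
char path_separator() {
#if defined(_WIN32)
    return '\\';
#else
    return '/';
#endif
}

// Join a directory and a file name using the platform's separator.
std::string join_path(const std::string& dir, const std::string& file) {
    assert(!file.empty());
    if (dir.empty() || dir.back() == path_separator()) return dir + file;
    return dir + path_separator() + file;
}
```

Two helpers are manageable. The difficulty described above shows up when every file, process, and console call needs its own block like this, which is the work a framework has already done.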
I would suggest using a technology designed for cross-platform application development. Here are two technologies I know of that -- as long as you read the documentation and use the features properly -- you can build the application to run on all 3 platforms:
Java
XULRunner (Mozilla's Development Platform)
Of course, there is always the web. I mostly use web applications not just for their portability, but also because they run on my Windows PC, my Ubuntu computer, and my Mac.
We mainly build web applications because the web is the future. Local applications are viewed in my organization as mostly outdated, unless there is of course some feature or technology the web doesn't yet support that holds that application back from being fully web-based.
I would also suggest GitHub's Electron, which lets you build cross-platform desktop applications using Node.js and Google's Chromium. The only drawback of this method is that an Electron application runs much slower than a native C++ application due to the abstraction layers between JavaScript and native C++.
If you're making a console app, you should be able to use the same source for all three platforms if you stick to the functions defined in the POSIX libraries. Setting up your build environment is the most complicated part, especially if you want to be able to build for multiple platforms out of the same source tree.
I'd say if you really want to use C++, Qt is the easiest way to build a cross-platform application. I find myself using Qt when I need a UI, even though Qt has a large set of libraries that make pretty much everything easier in C++.
If you don't want to use Qt then you need a good design and a lot of abstraction to make a cross-platform application.
However, I'm using the Python bindings to Qt more and more for medium-size applications.
If you are working on a console application and you know a bit of Python, you might find Python scripting much more comfortable than C++. It keeps the time-consuming stuff out of the way so you can focus on your application.

Small native cross-platform GUI framework for C++

I wrote a small program with Boost in C++. It works fine and so I want to give it a graphical interface so that it is easier to use.
In order to do so, I am looking for a small cross-platform framework which provides a native look and feel. Windows and Linux support would be enough; currently I do not need OS X...
I used wxWidgets for some other project, but it was a pain to set everything up and ship this big library with the software.
But I was really amazed by the use of real native controls.
In order to keep the program small I also tried fltk, but it has an awful look.
I just need a simple framework without network support or other gimmicks.
So my question: Is there any framework out there which fits all the requirements? Or if not, which frameworks fit at least some of these needs?
Thanks in advance!
When it has the word "framework" in its name it's almost never small.
Anyway, graphical frameworks/libraries tend to be big, cause they need to handle a lot of stuff.
Qt is probably the best straightforward library for cross-platform GUI, but it definitely doesn't constitute a "small framework". On the other hand, on Linux systems, Qt will be most likely already installed. Plus it definitely pays for its size.
wxWidgets is fairly small as far as GUI toolkits go.
And it's cross-platform.
http://www.wxwidgets.org/
You have mentioned it, but as far as cross-platform toolkits go it's one of the smallest I've seen.
The only other suggestion I have is that you could wrap your code up into a C library and link that into another language, e.g. use .NET on Windows and Mono on Linux, or even a Java-based app (although they don't always look very native to the platform). Then use your library from there.
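Wrapping C++ code as a C library usually means an opaque handle plus a few extern "C" functions. Here is a minimal sketch of that pattern; the Counter class is a hypothetical stand-in for the existing program logic, and the counter_* function names are made up for illustration.

```cpp
#include <cassert>
#include <string>

// Existing C++ logic (hypothetical stand-in for the real program).
class Counter {
public:
    void add(const std::string& word) { total_ += static_cast<int>(word.size()); }
    int total() const { return total_; }
private:
    int total_ = 0;
};

// C-compatible interface: an opaque handle and free functions, with no
// C++ types in any signature, so other languages can bind to it.
extern "C" {

typedef struct counter_handle counter_handle;

counter_handle* counter_create(void) {
    return reinterpret_cast<counter_handle*>(new Counter());
}

void counter_add(counter_handle* h, const char* word) {
    assert(h != nullptr && word != nullptr);
    reinterpret_cast<Counter*>(h)->add(word);
}

int counter_total(const counter_handle* h) {
    assert(h != nullptr);
    return reinterpret_cast<const Counter*>(h)->total();
}

void counter_destroy(counter_handle* h) {
    delete reinterpret_cast<Counter*>(h);
}

}  // extern "C"
```

The other language then calls these functions through its foreign function interface (P/Invoke in .NET and Mono, JNI or similar in Java). In real code you would also catch C++ exceptions inside each function, since they must not cross the C boundary.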
Ultimate++ might contain what you need. (Although they make it sound in the FAQ as if their library is really big, it doesn't seem that bad to me.)
Don't forget to check JUCE as well.
Qt works amazingly, but is not very small. I've found there is a genuine lack of "small" cross-platform GUIs. You either might try to just abstract your GUI with #ifdefs all over the place, or use Qt/wx.
If you want it to be small, just write the GUI twice -- once in MFC and then in X. Your GUI sounds simple enough. Build up your own small abstraction that is just what you need.
There is a long list of both active and dead cross-platform C++ UI libraries here: https://philippegroarke.com/posts/2018/c++_ui_solutions/
Some of them are small and have a native look.
As others mentioned, you cannot use "cross-platform" and "small in size" in the same sentence.
More work, smaller in size:
One solution I can suggest is to use native Python bindings for the UI portion. Since you are already using Boost, it should be fairly trivial to have Boost.Python communicate between C++ space and Python space. You already have Python on Linux and it's a 20-40MB package on Windows (can't remember how big the latest release is). But here you will have to use win32 bindings on Windows and GTK/Qt bindings on Linux, so more work. Nah, too much work to maintain, scratch this.
Moderate work, smaller in size but with non-native controls:
You can try to get clutter or freeglut to get your UI work done but I personally haven't used them so I don't know if they provide full native looks for your apps. But they are small in size compared to wx or qt.
Less work, bigger in size:
Use wxWidgets if you are already comfortable with it, otherwise I recommend Qt.
You can also have a look at some of the other offerings: http://en.wikipedia.org/wiki/List_of_widget_toolkits
Clutter: http://www.clutter-project.org/about
FreeGLUT: http://freeglut.sourceforge.net
Ever heard of Qt?
http://qt.nokia.com/products/
I think it should fit all your needs.

c/c++ XML library question

I know that a lot of C/C++ XML library questions have been asked already (I tried to read through all of them before getting to this).
Here are the things I'm going to need in my own project:
Excellent performance
SAX2
Validation
Open source
Cross platform
I was going to use Xerces-C, but I see that a simple SAX2 setup with nothing going on in the filter is taking 5 seconds to run. (Perhaps I'm doing something wrong here?)
I would like to use libxml++, but as I tried to get it set up on my MacBook, there were some crazy dependencies that took me all the way back to gtk-doc, at which point I sort of tabled the idea.
So now I'm at libxml2. Is this the way to go? Have I missed an important option, bearing in mind the five requirements above? I don't mind using a (good) c-library like libxml2, but a c++ interface would be nice. (I don't like Xerces-C's API very much.)
I am willing to bend on the SAX2 requirement if comparable functionality is available.
Having spent a goodly amount of time on this same problem, it was my conclusion that libxml2 is the best option available under your guidelines. The C interface is not too difficult to use and it's very fast.
There are some other good options for commercial libraries, but most of the other comparable open-source options are either painfully slow or are mired in a deep, annoying vat of dependency soup.
You say you need these things in your project, but don't give any idea of the pipeline. For example, we had a whole load of static XML files which needed to be loaded quickly, but only validated rarely. So we validated them in batch using a separate process (using RelaxNG, as it is human-writable markup) and loaded the XML using Expat. The system also used XMPP, so we checked streaming input, but that didn't require validating against a schema (partly because it was streamed, and mostly because most of the possible errors were not expressible in a schema).
If you need a whole host of other facilities, you can consider Qt, which has good XML support. Be warned though, it's WAY more than an XML processing library; it's a full blown application framework with support for GUIs, networking and a whole host of other things.
Qt
You can also try Poco. It's another application framework, but not as huge as Qt (i.e. no GUI-related things etc.)
Poco
Lastly, if you don't mind a C library, you can use Expat. It's not SAX per se, but writing code using Expat is somewhat like SAX. It has C++ wrappers, but they're not officially part of the project IIRC, and may not be as well-maintained or designed. I'm not too sure though.
Expat
Hope this helps!
EDIT: I misread your original post: not too sure about the validation features of these libraries, I've never used them before.
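For anyone unfamiliar with the SAX/Expat style mentioned above, the idea is that the parser pushes events to callbacks you register instead of building a tree. The following toy scanner sketches that model using only the standard library. It is not the Expat, libxml2, or Xerces-C API: SaxHandlers and parse_events are hypothetical names, and it handles only plain start tags, end tags and text (no attributes, comments, entities, CDATA, or validation).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Callbacks invoked as the scanner walks the input, in the push style
// used by SAX and Expat.
struct SaxHandlers {
    std::function<void(const std::string&)> start_element;
    std::function<void(const std::string&)> end_element;
    std::function<void(const std::string&)> characters;
};

// Toy event scanner. Returns false if a '<' is never closed by '>'.
bool parse_events(const std::string& xml, const SaxHandlers& h) {
    assert(h.start_element && h.end_element && h.characters);
    std::size_t pos = 0;
    while (pos < xml.size()) {
        if (xml[pos] == '<') {
            std::size_t close = xml.find('>', pos);
            if (close == std::string::npos) return false;
            std::string tag = xml.substr(pos + 1, close - pos - 1);
            if (!tag.empty() && tag[0] == '/') {
                h.end_element(tag.substr(1));   // "</name>"
            } else {
                h.start_element(tag);           // "<name>"
            }
            pos = close + 1;
        } else {
            std::size_t next = xml.find('<', pos);
            if (next == std::string::npos) next = xml.size();
            h.characters(xml.substr(pos, next - pos));
            pos = next;
        }
    }
    return true;
}
```

With a real library the shape of your code stays the same: you register start, end and character-data handlers and keep whatever state you need between calls yourself.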