Text indexing library in C/C++


I am developing a Windows desktop product which requires a text indexing library in C/C++. I want to give it a series of words and a record to be stored against those words. Searching for those words should bring back one or more records quickly. Data will be stored on disk.
I have searched this forum and found Lucene, but it is basically Java. There is also a C++ port, CLucene, but I am not sure whether it is suitable (lightweight enough?) for a small Windows desktop product.
I have found other .NET-based libraries, but nothing lightweight for C++.
Can you help please?

Have you considered SQLite? An RDBMS might be a little heavy, but I believe that it is used inside some web browsers to implement HTML5 "Local Databases".
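To make the requirement concrete, here is a minimal in-memory sketch of the word-to-records lookup the question describes. It is not SQLite's API or any library's method: the class name `WordIndex` and its functions are hypothetical, and a real product would persist the mapping on disk (for example in a database table) rather than keep it in RAM.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory inverted index: maps each lowercased word to the
// set of record IDs stored against it. Persistence is deliberately left out.
class WordIndex {
public:
    // Store `record_id` against every word in `words`.
    void add(const std::vector<std::string>& words, int record_id) {
        for (const std::string& w : words) {
            postings_[normalize(w)].insert(record_id);
        }
    }

    // Return the IDs of all records stored against `word`, in ascending order.
    // An unknown word yields an empty vector.
    std::vector<int> search(const std::string& word) const {
        auto it = postings_.find(normalize(word));
        if (it == postings_.end()) {
            return {};
        }
        return std::vector<int>(it->second.begin(), it->second.end());
    }

private:
    // Lowercase a word so that lookups are case-insensitive (ASCII only).
    static std::string normalize(std::string w) {
        std::transform(w.begin(), w.end(), w.begin(), [](unsigned char c) {
            return static_cast<char>(std::tolower(c));
        });
        return w;
    }

    std::map<std::string, std::set<int>> postings_;
};
```

Using a `std::set` per word keeps each record ID unique and sorted, so a record added twice against the same word is returned only once.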

Related

Grid view in MFC

I have to create an application laid out in a tabular way, with rows and columns of cells in a grid-like format. There have to be appropriate cell-level controls as well.
Because of certain constraints, this has to be done in MFC.
I tried searching for something like a grid view/tabular view in MFC, but couldn't locate one. All I managed to find were user-developed libraries on other sites, which I cannot use because of license restrictions.
As a starting point, what should I be looking for? I've worked with Qt before, but not MFC, and am finding it difficult to locate appropriate tutorials on grid/tabular views.
Kindly give me a starting point, or a library name for me to start looking into.
Thanks.

Either you use something open source like the Ultimate Grid or CGridCtrl, or you use a library like BCGSuite. You say you 'cannot use because of licence restrictions', but you don't say what you mean. CGridCtrl, for example, can be used in commercial and free applications. For a high-quality grid (i.e., one with support for modern features like theming), you'll need a commercial library.

Using a C++ CLR Library in a C# Metro Application

I have been writing some applications for Windows Metro in C# and have been trying to create a Twitter program, using the TweetSharp library, that will allow the user to tweet, view the tweets of the people they are following, and check for updates in a background task.
The problem I ran into is that I wanted to use C++ for sorted maps. Sorted maps were, as far as I could find, the quickest way to sort through and organize the large numbers of tweeters and their tweets, which is especially helpful given the CPU constraints on background tasks.
But I found that my CLR libraries couldn't be used in my Metro application because the DLL file had the wrong build target.
Is it possible to use a CLR library with WinRT applications and deploy them on the app store, or does anyone know of an alternative way to manage these large amounts of tweets considering the CPU constraints?
Thanks in advance.

According to this post it currently is not possible.
Apart from that, WinRT has a different system for processing metadata in apps and class libraries.
For more detail, you can look at the .NET for Windows Store apps overview, and also at the material on CLR integration (C++/CX).
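For reference, the sorted-map idea from the question can be sketched in standard C++ with `std::map`, which keeps its keys in sorted order. This is only an illustration under assumptions: the `TweetStore` class and its member names are hypothetical, and the sketch says nothing about how such code would be packaged for a WinRT app.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical container: tweets grouped by author. std::map keeps the
// authors ordered by name, so iteration is already sorted.
class TweetStore {
public:
    // Append a tweet to the given author's list, creating the entry if needed.
    void add(const std::string& author, const std::string& text) {
        by_author_[author].push_back(text);
    }

    // Return all author names in ascending (lexicographic) order.
    std::vector<std::string> authors() const {
        std::vector<std::string> names;
        names.reserve(by_author_.size());
        for (const auto& entry : by_author_) {
            names.push_back(entry.first);
        }
        return names;
    }

    // Return a pointer to the author's tweets, or nullptr if the author
    // is unknown. Lookup cost is logarithmic in the number of authors.
    const std::vector<std::string>* tweets_of(const std::string& author) const {
        auto it = by_author_.find(author);
        return it == by_author_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, std::vector<std::string>> by_author_;
};
```

If ordering by author is not actually needed, `std::unordered_map` is a common alternative with faster average lookups.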

Choosing Cross-Platform GUI Toolkit for Desktop App With WebServices

For a current project, we're designing a client desktop application that parses text files and interfaces with a web based database.
So far we've split the project into parts:
(Third-Party Program) -> (Our Desktop Client) -> (Our Parsing Library #1 and #2) -> (Our Web Server) -> (Our Verification Library) -> (Our Database)
We've hit confusion when it comes to choosing the correct way (and the best language) to make these pieces work together.
The third-party program's output is a simple text file, and we're just parsing it into a SQL-esque format for insertion into our database after verifying the numbers are in a certain range.
The first question we have is regarding the client language itself. We're planning on writing the parser libraries in C++ as they're just mostly text management. Our desktop client needs to be cross-platform for Windows and Mac. Currently we're leaning towards writing this in Java using Swing and the JNI. However, we realize there's a lot of hate for Java and that we'd have to worry about bundling in the JRE.
Is Java a good choice in this situation? Our other options seem to be writing this also in C++ using something like Qt for the GUI, or going platform specific and writing the windows version in .NET and then a Mac specific version. Our Windows community is the vast majority of users.
Our second issue is connecting this client with our web server. Originally we were just going to use an http POST to upload the file. We could also FTP the file which seems like overkill. We started to explore web services but were not sure if a web service could handle large amounts of text data.
Is there an easier way to do this? Everything is text, so it's no problem to send it in chunks or as one giant string. If we go the web services route, will that affect our language choice for the desktop client?
There are definitely hundreds of ways to handle something like this, but most of these concepts are new for us. Any suggestions would be greatly appreciated.

Qt is an excellent choice and as it's native C++ it will be easy to integrate with your parsers too. Why write two versions when a single Qt version will run fine on both platforms with native look and feel? Depending on the license you choose you can even statically link Qt if you're concerned about deployment complexity.
A web service would generally have no problem handling large amounts of text and pretty much any language will interact with it easily assuming basic network I/O functionality. Depending on the language you will probably be able to find libraries that do most of the work for you, assuming it's not already supported natively.
As you say, there are many different ways to do what you want to achieve. There is no right or wrong way but obviously some designs will suit your needs better than others.

File preview component (C++/MFC)

Is anyone aware of a good, general purpose file preview component for MFC/C++ desktop applications?
Specifically, I'm looking for a component that I could embed in my application that would allow a broad range of file types (text files, multimedia, etc.) to be previewed without the need for original applications (such as MS Word, etc.) to be installed.
I could only find one, via Google:
http://www.file-viewer-sdk.com/
Unfortunately, these folks want $60k for unlimited redistribution, which is outside of our budget.
Anyone have any recommendations? If not a component, is anyone using another general-purpose strategy that works well for them?

You can write your own shell preview host once you know the interfaces.

You might want to check out Autovue, originally made by Cimmetry, since acquired by Oracle.
Our product makes limited use of their SDK to do some document conversions (mostly RTF->PS) and that works well enough for us.

How to replace text in a PowerPoint (.ppt) document?

What solutions are there? I only know of solutions for replacing bookmarks in Word (.doc) files with Apache POI.
Are there also ways to change images, layouts, and text styles in .doc and .ppt documents?
I am thinking of replacing areas in Word and PowerPoint documents for bulk processing.
Platform: MS-Office 2003

What are your platform limitations?
Obviously Apache POI will get you at least part of the way there.
Microsoft's own COM APIs are fairly powerful and are documented here. I would recommend using them if a) you are not running in a server (many users, multithreaded) environment; b) you can have a proper version of PowerPoint installed on the production machine; and c) you can code against a COM object model.

It's a bit pricey, but Aspose.Slides is a very powerful library for manipulating PowerPoint files.

If you include using other Office suites as an option, here's a list of possible solutions:
Apache POI-HSLF
PowerPoint 2007 APIs
OpenOffice.org UNO
With POI you can't edit the .pptx file format, but you don't depend on any applications installed on the system. The other two options, by contrast, rely on other applications, but they are definitely better at dealing with presentations. OpenOffice has better compatibility with older formats, by the way. Also, if you use UNO you'll have a wide choice of languages: UNO bindings exist for Java, C++, Python and other languages.

My experience is not directly with PowerPoint, but I've actually rolled my own WordML (XML) generator. It a) removed all dependencies on Word, b) was very fast, and c) let me build up documents from scratch.
But it was a lot of work to create, and I was only creating a write-only implementation.
I'm not as familiar with PowerPoint, so this is conjecture, but you may be able to roll your own by reading XML (PowerPoint 2003??) and/or cracking open the Office Open XML file (zipped XML), then using XPath to manipulate the data, and then saving everything back to disk.
This won't work on the older OLE Compound Document-based PowerPoint files, though.

I've done something like that before: programmatically accessed and manipulated PowerPoint presentations. Back when I did it, it was all in C++ using COM, but similar principles apply to C#/VB .NET apps, since they do COM interop very easily.
What you're looking for is called the Office Document Model. Basically, Office applications expose their documents programmatically, as trees of objects that define their contents. These objects are accessible via an API, and you can manipulate them, add new ones, and do whatever other processing you want. It's exceedingly powerful; you can use it to manipulate pretty much all aspects of a document. But you'll need an installation of Office and Visual Studio to be able to use it.
Some links:
Intro: http://msdn.microsoft.com/en-us/library/d58327k6.aspx
Hope this helps!

Apparently new users can only include one link per posting. How lame! :)
Here's the other link I meant to include:
Example of manipulating PowerPoint documents programmatically: http://msdn.microsoft.com/en-us/library/cc668192.aspx
