best way to programmatically modify excel spreadsheets - c++

I'm looking for a library that will allow me to programatically modify Excel files to add data to certain cells. My current idea is to use named ranges to determine where to insert the new data (essentially a range of 1x1), then update the named ranges to point at the data. The existing application this is going to integrate with is written entirely in C++, so I'm ideally looking for a C++ solution (hence why this thread is of limited usefulness). If all else fails, I'll go with a .NET solution if there is some way of linking it against our C++ app.
An ideal solution would be open source, but none of the ones I've seen so far (MyXls and XLSSTREAM) seem up to the challenge. I like the looks of Aspose.Cells, but it's for .NET or Java, not C++ (and costs money). I need to support all Excel formats from 97 through the present, including the XLSX and XLSB formats. Ideally, it would also support formats such as OpenOffice, and (for output) PDF and HTML.
Some use-cases I need to support:
reading and modifying any cell in the spreadsheet, including formulas
creating, reading, modifying named ranges (the ranges themselves, not just the cells)
copying formatting from a cell to a bunch of others (including conditional formatting) -- we'll use one cell as a template for all the others we fill in with data.
Any help you can give me finding an appropriate library would be great. I'd also like to hear some testimonials about the various suggestions (including the ones in my post) so I can make more informed decisions -- what's easy to use, bug-free, cheap, etc?

The safest suggestion is to just use OLE. It uses the COM, which does not require .NET at all.
http://en.wikipedia.org/wiki/OLE_Automation <--about halfway down is a C++ example.
You may have to wrap a few functionalities into functions for usability, but it's really not ugly to work with.
EDIT: Just be aware that you need a copy of Excel for it to work. Also, there's some first-party .h files that you can find specific to excel. (it's all explained in the Wikipedia article)

I don't know if this is an option for you, but the new office 2007 formats are in zipped XML format, which makes it very doable to do your own modifications. See here for the specifications.

SpreadsheetGear for .NET will handle your requirements and has an API which is very similar to Excel.
When you insert cells, your defined names (and any other formulas / charts / etc...) will automatically be fixed up to reference the new range (just as they would in Excel). So you would not need to update your defined names (although there is complete support for creating and updating defined names if that is what you want to do).
SpreadsheetGear is a .NET component, but you can build your own wrapper which is callable from C++.
You can see what our customers say and download the free, fully functional evalution here.

Have you already tried using the Excel COM interfaces? Obviously Excel needs to be install on the machine, and it's a pain to deal with...

I would argue that a .net solution with COM interop for linking into your C++ application is the best solution. In more than ten years of working with them, I've never seen a COM automation of Excel that didn't leak memory somewhere.
If you need to automate Excel, I recommend Visual Studio Tools for Office. If you don't need to automate, only modify files and those files can be in Office 2007 format, you're better off finding a library that manipulates the files directly instead of opening Excel to do it.

I ended up using Aspose.Cells as I mentioned in my original post, since it seemed like the easiest path. I'm very happy with the way it turned out, and their support is very good. I had to create a wrapper around it in C# that exported a COM interface to my C++ application.

Related

Can I save delta information from a spreadsheet?

I am going to develop a generic C++/Qt GUI tool for data I/O from the user.
The data will be directed to/from the core application through a file.
However the same task could be performed by a spreadsheet. The only doubt I have is whether spreadsheets can save/load only the data that have changed since the last save/load operation, even in a temporary file.
I would like to know if this is a common feature among spreadsheets (especially the open source ones).
Do you mean with spreadsheet a software like Excel or or OpenOffice calc? That's a big difference to a customized Qt application. I am sure you can do this with Excel or OpenOffice calc. To decide the way to go it is more important which other requirements you have. Who should use the application and for which purpose? Do you know the neccessary programming languages/frameworks? Which functions should it implement?
Without a LOT more details you will not get a good answer here.
It seems that with spreadsheets (e.g. Excel, LibreOffice Calc, ...) it is not possible to save/load portions of the project, not even in a temporary delta file.
For these tasks, a database is the tool to use.

Copy specific cells from an excel document using C++

I have been looking into trying to manipulate an excel document using C++. Basically what I want to do is access an excel document and copy a specified line from that document to the windows clipboard.
I haven't found any libraries I am able to use or any commands I can use to accomplish this task. If anyone can point me to any documentation or examples that show me how to get this accomplished it would be much appreciated.
You can do so using Excel's COM object. Microsoft provides examples in all their languages, here's one in C++
http://support.microsoft.com/kb/216686
And another, more recent one also from Microsoft
http://blogs.msdn.com/b/codefx/archive/2012/03/17/sample-of-mar-17th-excel-automation-in-c-c-and-vb.aspx
It's not an easy subject, the interfaces offered are quite extensive, but it's quite popular and thus lots of examples and reading material can be found.
EDIT: do you have to do this in C++? Because you can do exactly the same thing using the same COM object but using Powershell. Have a look here
http://blogs.technet.com/b/heyscriptingguy/archive/2006/09/08/how-can-i-use-windows-powershell-to-automate-microsoft-excel.aspx

Making charts in excel programmatically

I have an XLL made with the help of the XLW wrapper for Excel's C API. I would like to be able to programmatic make charts in excel using data stored in some objects in the XLL. Is this possible and if so, what is the best way to do it? I believe one way to do is through COM but would like to avoid it if at all possible.
Currently using Excel 2007 on Windows 7 and VS2010.
Edit: In general what API does excel expose that supports programmatic charting? Can anyone point me towards some documentation?
Edit2: Since I'm not getting any hits I will try to give a little more detail abaout what I'm trying to do. I want to call a formula from Excel like =PlotCurve("CurvreHandle") and I want to get a curve object stored in memory and owned by the XLL (unmanaged code), get an some data from it and display it in a chart somewhere on the sheet where the PlotCurve was made. From what I have been able to gleam so far, the C API that XLW is wrapping offers no support for the second part of the problem so I need to go to either COM (which I have no idea how to mix with the already running C api) or some .net interop which I also don't know how to do . If someone has ever done something like this or knows of a safe and stable way to do this, I would love to hear it.
You can use http://xll.codeplex.com and Excel4/Excel12. The function ExcelX can be used to handle the memory management. The command to use is xlcChartWizard.
To create, modify, save charts in Excel programmatically you can use the C++ classes in the Microsoft.Office.Interop.Excel namespace.

Internationalization in MFC

It's finally (after years of postponing) the time to localize my app in a few other languages other than English.
The first challenge is to design the integration into my C++ / MFC application that has dozens of dialogs and countless strings. I came across two possible alternative implementations:
Compile and deploy localized resource files as DLLs
Extract and replace all strings with the localized version. For each
language there will be an XML (or simple text) file.
Personally I opt for the second alternative since it seems to me more flexible. The changes are many but not hard to make, and very importantly the XML files will be very easy to modify for the translators.
Any advise is greatly appreciated.
Regards,
Cosmin Unguru
http://www.batchphoto.com/
I did some long-living MFC projects with different languages.
I strongly recommend the first approach with resource-only DLLs.
The reasons:
(1) If the user does a XCOPY install, he always has the default language (English) in the main executables.
(2) If you don't translate everything (e.g. you're late with your release or forget some strings), the Windows resource functions if properly used return the resource in the default language automatically - you don't have to implement it on your own.
(3) My very person opinion: (a) Line breaks, tabs, whitespaces in XML files are a pain in your a**. (b) Merging XML files is even worse...
(4) Don't forget the encoding. It's okay in XML but your translators might use an unsuitable editor and damage the file.
And now for the main reason:
(5) You will have to rearrange many of your dialogs, because many strings are longer in e.g. French or German than in English. And making all statics, buttons, ... larger "just in case" looks crappy.
Another hint: Spend some bucks and buy one of the translation tools which import your projects / binaries and build up a translation database. This will be amortized after the first release.
Another hint (2): If possible make a release which doesn't contain any changes but only the multi-language feature. Also in future, if possible: Release your product in English. Then do the translation in one single step (per language) and release the other languages.
My good and friendly suggestion from somebody who worked a lot with localization:
Grab GNU Gettext,
Mark all your strings as _("English").
Extract all strings using gettext tool xgettext and compile dictionalries
Translate string using great tools like poedit.
Use gettext in your project and make your localization life simpler!
You can also use boost::locale for same purpose - it uses GNU Gettext dictionaries and approach but provides different and more powerful runtime and for windows developer it has very good addon - it supports wide strings that MFC requires to use for normal Unicode support.
Don't use resources and other "translation" tools that are total crap from linguistic point of view (and developer's point of view as well).
Further reading: http://cppcms.sourceforge.net/boost_locale/html/tutorial.html
Using a DLL resource library is a relatively straightforward operation, and allows you to manage not only strings, but other resources as well. And this is its main advantage, because i18n is not only about string translation.
However, depending on your needs, a text-based solution may be a better decision, because of its easier handling - resource scripts being more complex than xml files, especially for the average translator.
I would suggest creating your own abstraction layer, something like "LoadLocalizedString", etc.; in this way, you can start implementing it just with text files, and then change to something more complex when and if required in a transparent way - all the effort for making your software i18n aware would still be valid.
In our case we had diffrent dialogues per Language. The resource file was the same as the multiple laguages were implemented at development time. You could basically append on existing resource files the diferent languages. I hope it helps to find your way.
The DLL option is commonly used for this since the resource loading procedure (e.g. LoadLibrary) is already written - meaning you don't have to write any parsing/loading code. While XML is easier to edit, DLLs have a bit more security (users won't be able to easily edit them) and will allow the developer (meaning you) more time to work on application logic instead of writing a language loading system.
HMODULE hLangDLL = LoadLibrary("text_en.dll");
// more stuff
TCHAR mybuffer[1024] = {0};
LoadString(hLangDLL, IDS_MYSTRING, mybuffer, 1023);
If it is just the strings that are changing then I agree that XML is the way forward here for the exact reasons you outline. Easy for other people to edit, easy to change language at runtime, etc.
The only reason (in my eyes) that you'd choose option 1 is if things other than strings are being localized such as needing different icons.
If it's just text? I say go with the XML.

How to replace text in a PowerPoint (.ppt) document?

What solutions are there? I know only solutions for replacing Bookmarks in Word (.doc) files with Apache POI?
Are there also possibilities to change images, layouts, text-styles in .doc and .ppt documents?
I think about replacement of areas in Word and PowerPoint documents for bulk processing.
Platform: MS-Office 2003
What are your platform limitations?
Obviously Apache POI will get you at least part of the way there.
Microsoft's own COM API's are fairly powerful and are documented here. I would recommend using them if a) you are not running in a server (many users, multithreaded) environment; b) you can have a proper version of powerpoint installed on the production machine; and c) you can code against a COM object model.
It's a bit pricey, but Aspose.Slides is a very powerful library for manipulating PowerPoint files
If you include using other Office suits as an option, here's a list of possible solutions:
Apache POI-HSLF
PowerPoint 2007 APIs
OpenOffice.org UNO
Using POI you can't edit .pptx file format, but you don't depend on the apps installed on the system. Other two options, on the contrary, make use of other apps, but they are definitely better for dealing with presentations. OpenOffice has better compability with older formats, by the way. Also if you use UNO, you'll have a great choice of languages, UNO exists for Java, C++, Python and other languages.
My experience is not directly with Power Point, but I've actually rolled my own WordML (XML) generator. It a) removed all dependencies on Word, b) was very fast c) and let me build up documents from scratch.
But it was a lot of work to create. And I was only creating a write only implementation.
I'm not as familiar with Power Point, so this is conjecture, but you may be able to roll your own by reading XML (Power Point 2003??) and/or cracking the Office Open XML file (zipped XML), then using XPath to manipulate the data, and then saving everything back to disk.
This won't work on older OLE Compound Document based Power Point files though.
I've done something like that before: programmatically accessed and manipulated PowerPoint presentations. Back when I did it, it was all in C++ using COM, but similar principles apply to C#/VB .NET apps, since they do COM interop very easily.
What you're looking for is called the Office Document Model. Basically, Office applications expose their documents programmatically, as trees of objects that define their contents. These objects are accessible via an API, and you can manipulate them, add new ones, and do whatever other processing you want. It's exceedingly powerful; you can use it to manipulate pretty much all aspects of a document. But you'll need an installation of Office and Visual Studio to be able to use it.
Some links:
Intro: http://msdn.microsoft.com/en-us/library/d58327k6.aspx
Hope this helps!
Apparently new users can only include one link per posting. How lame! :)
Here's the other link I meant to include:
Example of manipulating PowerPoint documents programmatically: http://msdn.microsoft.com/en-us/library/cc668192.aspx