How to create and save a .docx document in C/C++?

I need to manipulate .docx documents using C/Visual C++. All the samples I have found are in C#.
How can I do this?

What I've found is that Microsoft wants you to either use .NET or use their Office Automation API to invoke Word to perform the manipulations for you. Depending on how low-level your manipulations need to be, you might be able to get by with the Office Automation API. If not, you may have to get your hands dirty with the Office Open XML format that's behind the .docx file format.
Here's Microsoft's skimpy documentation on Office Automation
And here's an article that goes into it a bit more, although it may be out of date.
It just occurred to me that one big issue with Office Automation is that you need to have Word installed to do anything with it. Of course, this all depends on what exactly you need to do.
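To give an idea of what "getting your hands dirty" with Office Open XML looks like: a .docx is a ZIP archive, and the body text lives in a part named word/document.xml. Below is a minimal sketch in plain C++ that builds that one part as a string. The function names escapeXml and buildDocumentXml are made up for this illustration, the input is assumed to be UTF-8, and a real .docx also needs [Content_Types].xml and _rels/.rels packed into the ZIP alongside it, which this sketch does not do.

```cpp
#include <string>
#include <vector>

// Escape the five characters that are special in XML text.
// (Hypothetical helper, not part of any library.)
std::string escapeXml(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;
        }
    }
    return out;
}

// Build the contents of word/document.xml with one <w:p> per input string.
// Assumption: plain unformatted paragraphs, UTF-8 input.
std::string buildDocumentXml(const std::vector<std::string>& paragraphs) {
    std::string xml =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
        "<w:document xmlns:w=\"http://schemas.openxmlformats.org/"
        "wordprocessingml/2006/main\"><w:body>";
    for (const std::string& p : paragraphs) {
        xml += "<w:p><w:r><w:t xml:space=\"preserve\">";
        xml += escapeXml(p);
        xml += "</w:t></w:r></w:p>";
    }
    xml += "</w:body></w:document>";
    return xml;
}
```

You would then hand the resulting string to whatever ZIP library you use (libopc, mentioned in this thread, is one candidate) to store it as word/document.xml.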

Try http://libopc.codeplex.com/

Related

Safe to perform Word automation on open document?

I'm looking to perform Microsoft Word automation -- straightforward stuff such as instructing Word to open a document and save it as an RTF file. But what happens if the user already has the document open in a running instance of Word? Can I still safely perform automation actions (that don't modify the document itself), or am I asking for trouble? Will this interfere with the user working on the open document? Are there any alternative ways to accomplish the same thing?
I'm only interested in Word 2003 and later (but also about the Word for the Mac, as this will eventually be a cross-platform application).
There are a lot of problems with doing this.
First of all, Microsoft doesn't recommend using Word for automation; use Open XML instead. In my experience, I ended up investigating COM exceptions every day when automating Word on a server.
http://support.microsoft.com/kb/257757
Even if you decide to take the risk and automate anyway, it is a bad idea to do it on a machine where a user is working interactively. If a dialog box such as Find or Save As is open, it will block another instance of Word from doing anything else.
If, like me, you can't find any other solution, then create a new user profile called OfficeAutomationUser and follow the steps in http://theether.net/download/Microsoft/kb/288367.html
Thanks for reading my Words of caution about automating. Note: I am not a C++ programmer; I use VSTO with C#.

Creating a PDF reader in C++

So I want to make a PDF reader in C++ as a hobby project. The problem is that I can't find much of a head start, so if anyone has worked on a similar project, please guide me; a few web links would be great! I will be using a Windows environment and Visual Studio.
If you want to simply "host" an existing PDF reader (such as Acrobat or Foxit) in your own window, then you'll want to look into ActiveX.
Alternatively, if you want to do your own PDF decoding, then the best place to start would be to find a soft couch and cozy up with the PDF format specification, in particular ISO 32000-1. It's a real page-turner.
http://www.adobe.com/devnet/pdf/pdf_reference.html
Adobe's publication about the details of the PDF file format.
There are PDF components as well, if you want to go that route, but the majority of them are either not free, or already have a UI of their own. Just tossing a PDF component into a form doesn't strike me as much of a hobby project. :)
You might find this article on parsing Reg files using Boost Spirit a useful starter. I've used Spirit before for parsing complex data but I think you're biting off a mighty big challenge!
If you want to look at existing parsers, try PoDoFo in C++ or the lexing side of Panda, in C.

How to replace text in a PowerPoint (.ppt) document?

What solutions are there? The only ones I know of replace bookmarks in Word (.doc) files using Apache POI.
Is it also possible to change images, layouts and text styles in .doc and .ppt documents?
I'm thinking of replacing areas in Word and PowerPoint documents for bulk processing.
Platform: MS-Office 2003
What are your platform limitations?
Obviously Apache POI will get you at least part of the way there.
Microsoft's own COM APIs are fairly powerful and are documented here. I would recommend using them if a) you are not running in a server (many users, multithreaded) environment; b) you can have a proper version of PowerPoint installed on the production machine; and c) you can code against a COM object model.
It's a bit pricey, but Aspose.Slides is a very powerful library for manipulating PowerPoint files.
If using other office suites is an option, here's a list of possible solutions:
Apache POI-HSLF
PowerPoint 2007 APIs
OpenOffice.org UNO
With POI you can't edit the .pptx file format, but you don't depend on any applications being installed on the system. The other two options, on the contrary, rely on other applications, but they are definitely better at dealing with presentations. OpenOffice has better compatibility with older formats, by the way. Also, if you use UNO you'll have a wide choice of languages: UNO bindings exist for Java, C++, Python and others.
My experience is not directly with PowerPoint, but I've actually rolled my own WordML (XML) generator. It a) removed all dependencies on Word, b) was very fast, and c) let me build up documents from scratch.
But it was a lot of work to create, and I was only creating a write-only implementation.
I'm not as familiar with PowerPoint, so this is conjecture, but you may be able to roll your own by reading XML (PowerPoint 2003??) and/or cracking open the Office Open XML file (zipped XML), then using XPath to manipulate the data, and then saving everything back to disk.
This won't work on older PowerPoint files based on OLE Compound Documents, though.
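As a rough illustration of the "crack the zipped XML and edit it" idea: in an Office Open XML presentation, the visible text of a slide sits in <a:t> elements inside the slide part. The sketch below is plain C++ string handling rather than XPath, and it makes several assumptions: the slide XML has already been extracted from the ZIP into a string, the DrawingML namespace uses the conventional a: prefix, the replacement text is already XML-escaped, and the placeholder is not split across several text runs (PowerPoint does sometimes split them). The function name replaceInTextRuns is invented for this example.

```cpp
#include <cstddef>
#include <string>

// Replace every occurrence of `from` with `to`, but only inside
// <a:t>...</a:t> text runs, so tags and attributes are left untouched.
// (Hypothetical helper; naive string scan, not a real XML parser.)
std::string replaceInTextRuns(std::string xml,
                              const std::string& from,
                              const std::string& to) {
    const std::string open = "<a:t>";
    const std::string close = "</a:t>";
    if (from.empty()) return xml;  // nothing sensible to search for

    std::size_t pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        const std::size_t start = pos + open.size();
        const std::size_t end = xml.find(close, start);
        if (end == std::string::npos) break;  // malformed input: stop

        // Work on a copy of the run text, then splice it back in.
        std::string run = xml.substr(start, end - start);
        std::size_t hit = 0;
        while ((hit = run.find(from, hit)) != std::string::npos) {
            run.replace(hit, from.size(), to);
            hit += to.size();  // skip past the inserted text
        }
        xml.replace(start, end - start, run);
        pos = start + run.size() + close.size();
    }
    return xml;
}
```

For anything beyond simple placeholders, a proper XML parser with namespace support is the safer choice; the string scan is only meant to show how little magic is in the file format itself.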
I've done something like that before: programmatically accessed and manipulated PowerPoint presentations. Back when I did it, it was all in C++ using COM, but similar principles apply to C#/VB .NET apps, since they do COM interop very easily.
What you're looking for is called the Office Document Model. Basically, Office applications expose their documents programmatically, as trees of objects that define their contents. These objects are accessible via an API, and you can manipulate them, add new ones, and do whatever other processing you want. It's exceedingly powerful; you can use it to manipulate pretty much all aspects of a document. But you'll need an installation of Office and Visual Studio to be able to use it.
Some links:
Intro: http://msdn.microsoft.com/en-us/library/d58327k6.aspx
Hope this helps!
Apparently new users can only include one link per posting. How lame! :)
Here's the other link I meant to include:
Example of manipulating PowerPoint documents programmatically: http://msdn.microsoft.com/en-us/library/cc668192.aspx

best way to programmatically modify excel spreadsheets

I'm looking for a library that will allow me to programmatically modify Excel files to add data to certain cells. My current idea is to use named ranges to determine where to insert the new data (essentially a range of 1x1), then update the named ranges to point at the data. The existing application this is going to integrate with is written entirely in C++, so I'm ideally looking for a C++ solution (hence why this thread is of limited usefulness). If all else fails, I'll go with a .NET solution if there is some way of linking it against our C++ app.
An ideal solution would be open source, but none of the ones I've seen so far (MyXls and XLSSTREAM) seem up to the challenge. I like the looks of Aspose.Cells, but it's for .NET or Java, not C++ (and costs money). I need to support all Excel formats from 97 through the present, including the XLSX and XLSB formats. Ideally, it would also support formats such as OpenOffice, and (for output) PDF and HTML.
Some use-cases I need to support:
reading and modifying any cell in the spreadsheet, including formulas
creating, reading, modifying named ranges (the ranges themselves, not just the cells)
copying formatting from a cell to a bunch of others (including conditional formatting) -- we'll use one cell as a template for all the others we fill in with data.
Any help you can give me finding an appropriate library would be great. I'd also like to hear some testimonials about the various suggestions (including the ones in my post) so I can make more informed decisions -- what's easy to use, bug-free, cheap, etc?
The safest suggestion is to just use OLE Automation. It uses COM, which does not require .NET at all.
http://en.wikipedia.org/wiki/OLE_Automation <--about halfway down is a C++ example.
You may have to wrap a few functionalities into functions for usability, but it's really not ugly to work with.
EDIT: Just be aware that you need a copy of Excel for it to work. Also, there are some first-party .h files specific to Excel that you can find. (It's all explained in the Wikipedia article.)
I don't know if this is an option for you, but the new Office 2007 formats are zipped XML, which makes it very feasible to do your own modifications. See here for the specifications.
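If you do go the zipped-XML route, one detail you'll hit early is cell addressing: in the sheet XML each cell element carries an r attribute holding an A1-style reference such as B3, so to find or write a cell by row and column number you need to convert between the two. Here is a small sketch of that conversion in C++. The function names columnName and cellReference are made up for this example, and both row and column are assumed to be 1-based.

```cpp
#include <string>

// Convert a 1-based column number to spreadsheet letters:
// 1 -> "A", 26 -> "Z", 27 -> "AA". Returns "" for column <= 0.
// (Hypothetical helper, not part of any library.)
std::string columnName(int column) {
    std::string name;
    while (column > 0) {
        const int rem = (column - 1) % 26;  // 0..25 for 'A'..'Z'
        name.insert(name.begin(), static_cast<char>('A' + rem));
        column = (column - 1) / 26;
    }
    return name;
}

// Build an A1-style reference from 1-based row and column numbers.
std::string cellReference(int row, int column) {
    return columnName(column) + std::to_string(row);
}
```

With that in hand, locating a cell comes down to searching the sheet XML for the element whose r attribute equals cellReference(row, column).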
SpreadsheetGear for .NET will handle your requirements and has an API which is very similar to Excel.
When you insert cells, your defined names (and any other formulas / charts / etc...) will automatically be fixed up to reference the new range (just as they would in Excel). So you would not need to update your defined names (although there is complete support for creating and updating defined names if that is what you want to do).
SpreadsheetGear is a .NET component, but you can build your own wrapper which is callable from C++.
You can see what our customers say and download the free, fully functional evaluation here.
Have you already tried using the Excel COM interfaces? Obviously Excel needs to be installed on the machine, and it's a pain to deal with...
I would argue that a .net solution with COM interop for linking into your C++ application is the best solution. In more than ten years of working with them, I've never seen a COM automation of Excel that didn't leak memory somewhere.
If you need to automate Excel, I recommend Visual Studio Tools for Office. If you don't need to automate, only modify files and those files can be in Office 2007 format, you're better off finding a library that manipulates the files directly instead of opening Excel to do it.
I ended up using Aspose.Cells as I mentioned in my original post, since it seemed like the easiest path. I'm very happy with the way it turned out, and their support is very good. I had to create a wrapper around it in C# that exported a COM interface to my C++ application.

Generating Powerpoint PPT with ColdFusion?

Does anyone know if it's possible to generate PowerPoint PPTs within ColdFusion? I can't rely on installing a copy of Office and generating one through COM, and I can't use OOXML since my client is still in the Office 2003 era. Any suggestions are much appreciated.
You can try using Apache POI, specifically their PowerPoint support. It looks to be still in beta, though:
http://poi.apache.org/slideshow/index.html
I've used POI to extract content from Word docs before and it was rather easy in ColdFusion.
ColdFusion doesn't have built-in PPT creation, but you may be able to make something work with OpenOffice.
Look into CFPresentation (CF8), it allows you to create web-based presentations - not actually PPT format, but displayed in the same way via Flash player.
Have you considered using PDF instead? For all intents and purposes except perhaps some animation, PDFs do well replacing PPTs. And CF has tons of PDF creation and manipulation features!
I know it's not a good answer, but ColdFusion 9 can turn a cfpresentation into a PowerPoint file, and creating a cfpresentation is pretty damned trivial...
However, this of course requires a server that's still in beta, and a large cash outlay once it's released if you're running your own server.