XLS/XLSX to CSV in c++ - c++

I've been given a project in which I need to import data from CSV, XLS and XLSX files, do some processing, then write the results to a database.
I'm working on a project that's been going on for a while and there are several import functions already that use a very nice object to handle opening files with all sorts of separators and such. And this object is key to the processing that I need to perform.
Since a CSV is basically a textfile with a different extension this object opens it perfectly and I've managed to complete most of the processing and testing with the object and values stored within.
But now I need to add the XLS and XLSX support. And since this object is now pretty much central to the processing I figured the easiest way to fit XLS and XLSX files in would be to convert them to CSV, then import that.
Any help would be appreciated and I'll try answer questions if it's necessary, but since the request is just for some way to convert from one file type to another and nothing more insightful I don't think it's really necessary to add any snippets just yet.

Your options in terms of C++ libraries:
OpenXLSX - https://github.com/troldal/OpenXLSX
XLNT - https://github.com/tfussell/xlnt
Or you could give "XLSX I/O" a try. It's a small C library.
"XLSX I/O" - https://github.com/brechtsanders/xlsxio
Don't forget to add the usual extern "C", when calling C functions from C++.
The repo contains basic xlsx-to-csv (and csv-to-xlsx) examples, which should get you started: https://github.com/brechtsanders/xlsxio/blob/master/src/xlsxio_xlsx2csv.c

Maybe this will help:
http://www.codeproject.com/Articles/42504/ExcelFormat-Library
Also you can use libraries from Open/Libre Office project.

Related

Beginner - data storage through XML or text files

I am a beginner in visual studio and has only code C and C++ in command line settings.
Currently, I am taking a module(software development) which requires me to come up with an expense tracker - a program which helps user tracks his/her daily expenses. Therefore, at the end of each individual day, or after a user uses finishes the program, we would have to perform data storage to store all the info in one place which we would export it during the next usage.
My constraint include not using any relational database(although i have no idea what it is :( ). Data storage must be done using XML or text files. Following this, I have several questions regarding data storage:
1) If data is stored successfully, do we export it everytime we start the program? And everytime after the user closes the program, we overwrite the existing data file and then store it accordingly?
2) I have heard from some people that using text file may be easier. Searching on the internet and library only provides me with information regarding XML and not text. Would anyone be able to help me with it? Like tutorials link and stuff?
Thank you very much!
File writing/handling works similar to every other buffer in c++.
you can enable file handling using the fstream header. You can create a file, write to it and over-write every time the program is run, or can even create a file the first time the program is run and then append to it every subsequent time the program runs.
Ive only ever done text files, never tried XML, but Im guessing they're similar.
http://www.cplusplus.com/doc/tutorial/files/ should give you everything you need to know.
Your choice of XML vs plain text depends on the kind of data that you'll be storing.
The reason why you'll only find XML libraries on the internet is because XML is a lot more complicated than plain text. If you don't know what XML is or if the data that you're storing isn't very complex, then I would suggest going with plain text.
For example, to track expenses, you might store a file like this:
sandwich 5.00
coffee 2.30
soft drink 1.50
...
It's very easy to read/write lines like this to/from a file in C++.

Hiding application resources

I'm making a simple game with SFML 1.6 in C++. Of course, I have a lot of picture, level, and data files. Problem is, I don't want these files visible. Right now they're just plain picture files in a res/ subdirectory, and I want to either conceal them or encrypt them. Is it possible to put the raw data from the files into a resource file or something? Any solution is okay to me, I just don't want the files exposed to the user.
EDIT
Cross platform solutions best, but if they don't exist, that's okay, I'm working on windows. But I don't really want to use a library if it's not needed.
Most environments come with a resource compiler that converts images/icons/etc into string data and includes them in the source.
Another common technique is to copy them into the end of the final .exe as the last part of the build process. Then at run time, open the .exe as a file and read the data from some determined offset, see Embedding a filesystem in an executable?
The ideal way for this is to make your own archive format, which would contain all of your files' data along with some extra info needed to split files distinctly within it.

Prgrammably making excel spreadsheets (97 - 2003 format)

I was wondering how difficult it would be to make an application like this. Basically, I have some old html files that use tables. I want to put these tables into excel for easier reading and manipulation. I only have text, I have no numbers of formulas or anything.
Are there any tutorials on how to do this sort of thing?
The application would produce .xls
Thanks
You have three options:
Output a CSV file. While not an XLS file, Excel is more than capable of opening such a file, and it's extremely easy to create. You need nothing more than standard C++ to implement this solution. This is by far the easiest and quickest way to output to Excel (or any spreadsheet program, for that matter).
Use OLE automation. Microsoft even has a Knowledge Base article that provides an example of how to invoke Excel from your native C++ application and fill in some values. If you absolutely need to output XLS files, this is the easiest way to go. Note that users must have Excel installed on their computers for this to work.
Create your own XLS writer. Don't even bother with this option unless you really want to generate XLS files without requiring Excel to be installed on end-user computers. Options 1 and 2 are more than good enough for just about any application.
You don't need to reverse-engineer the XLS format; Microsoft documents the excel file format here. Due to the evolution of Excel over the years, it's not exactly a clean specification.
If you don't mind installing a copy of Excel along with your program, using OLE Automation would be much easier.
The simplest thing to do is simply create a CSV file. If you have column headers, put them in the first row. CSV files can be opened natively in Excel as if they were Excel spreadsheets.
There is a trick here: save .html tables with the .xls extension and Excel can read them (ie Excel can read the output of the DataGrid control).
But, if you want to create 'real' Excel files, then you can either use Excel Interop (which could be messy, requires Excel and the PIA's to be installed on the machine, and needs careful memory management (since its COM)). You could also opt for a 3rd-party library like FlexCel - which will avoid many of the InterOp problems but will not give you 'complete' Excel functionality (addins, custom vba macros etc.). For most uses, however, a 3rd party library should do the trick.
Looks like there's another alternative called ExcelFormat. I didn't try it, though.

Load Excel data into Linux / wxWidgets C++ application?

I'm using wxWidgets to write cross-plafrom applications. In one of applications I need to be able to load data from Microsoft Excel (.xls) files, but I need this to work on Linux as well, so I assume I cannot use OLE or whatever technology is available on Windows.
I see that there are many open source programs that can read excel files (OpenOffice, KOffice, etc.), so I wonder if there is some library that I could use?
Excel files it needs to support are very simple, straight tabular data. I don't need to extract any formatting except column/row position and the data itself.
Suggestedd reference: What is a simple and reliable C library for working with Excel files?
I came across other libraries (chicago on sf.net, xlsLib) but they seem to be outdated.
jrh
I can say that I know of a wxWidgets application that reads Excel .xls and .xlsx files on any platform. For the .xlsx files we used an XML parser and zip stream reader and grab the data we need, pretty easy to get going. For the .xls files we used: ExcelFormat, which works well and we found the author to be very generous with his support.
Maybe just some encouragement to give it a go? It was a couple of days work to get working.
Maybe http://www.libxl.com/ can help ?
I think that it is not something easy to do. xls files are quite complex and it is a proprietary format.
Maybe this is a stupid idea but why don't you upload and access your doc with Google docs. There are some apis to access your doc.
2 potential problems:
- Your app needs internet access
- Currently there is no C++ api.
But there are api for several languages including python see http://code.google.com/intl/fr/apis/gdata/articles/python_client_lib.html

How to read/write data into excel 2007 in c++?

How to read/write data into excel 2007 in c++?
Excel provides COM interface which you can use from your C++ application.
I have experience with Excel 2003 only but I think for excel 2007 it will also work.
This can be done e.g. with #import or in the way described in this article:
http://support.microsoft.com/kb/216686
There is a python solution (using COM dispatch) here: http://snippets.dzone.com/posts/show/2036
It's not C++, but the COM interface should be the same no matter which language you use, right?
You wouldn't need to port everything. Just __init__, set_range, get_value, set_value, save (or save_as), close, and quit. You might also need to dispose of garbage (as python has automatic gc).
Or you could just port (and modify) the following code (which I haven't tested, as I don't have excel anymore - you should probably check it by downloading python and pythonwin):
from win32com.client import Dispatch
app = Dispatch("Excel.Application")
app.Visible = True # spooky - watch the app run on your desktop!
app.Workbooks.Open("C:\\book.xls")
range = app.ActiveWorkbook.Sheets(1).Range("a1")
print 'The range was:'
print range.Value
range.Value = 42
print 'The value is now 42'
app.ActiveWorkbook.Save()
app.ActiveWorkbook.SaveAs("C:\\excel2.xls")
app.ActiveWorkbook.Close()
app.Quit()
# any gc to do?
Excel 2007 files are simply zip files. Try to rename .xlsx to .zip: you can extract files and folders. With a text editor you can view that they are all XML files.
So the solution:
use a common class to unzip your xlsx
use an xml parser to grab you data
if you have modified somethig, re-zip all
No COM object required.
Depending on your c++ compiler you can easyly find the required sources.
There are three main things you need to do.
1) Ensure pre-requisite files are installed and locatable.
Blindingly obvious I know, but make sure you have a suitable version of Excel installed, so that you can locate the required Microsoft libraries (and their locations).
namely MSO.DLL, VBE6EXT.OLB and EXCEL.EXE
2) Set up the Microsoft libraries.
In your C++ code, let's say you start with a simple console application, be sure to include the import libraries in any C++ application that interfaces with Excel. In my example I use:
#import "C:\\Program Files (x86)\\Common Files\\microsoft shared\\OFFICE11\\MSO.DLL" \
rename( "RGB", "MSORGB" )
using namespace Office;
#import "C:\\Program Files (x86)\\Common Files\\microsoft shared\\VBA\\VBA6\\VBE6EXT.OLB"
using namespace VBIDE;
#import "C:\\Program Files (x86)\\Microsoft Office\\OFFICE11\\EXCEL.EXE" \
rename( "DialogBox", "ExcelDialogBox" ) \
rename( "RGB", "ExcelRGB" ) \
rename( "CopyFile", "ExcelCopyFile" ) \
rename( "ReplaceText", "ExcelReplaceText" ) \
exclude( "IFont", "IPicture" ) no_dual_interfaces
3) Use the Excel Object Model in your C++ code
For example, to declare an Excel Application Object pointer for reading / writing an Excel Workbook:
Excel::_ApplicationPtr pXL;
pXL->Workbooks->Open( L"C:\\dump\\book.xls" );
And to access and manipulate the Excel Worksheet and the cells within it:
Excel::_WorksheetPtr pWksheet = pXL->ActiveSheet;
Excel::RangePtr pRange = pWksheet->Cells;
double value = pRange->Item[1][1];
pRange->Item[1][1] = 5.4321;
And so on. I have a more in-depth discussion at this following blog posting.
A low tech way I have used on a couple of projects is to make a simple script/macro/whatever that runs an external program. Write that external program in C++. Get that external program to read its input from and write its output to a .csv file (simple comma separated value text file). Excel can easily read and write .csv files, so this setup gives you everything you need to craft a viable solution.
This can all be done via the IDispatch interface. It can get really bloody complicated quickly (Cheers Microsoft) but at least once you've got your head round Excel you'll find integrating with any other MS application easy too :)
Fortunately there is someone over on codeguru who has made the process nice and easy. Once I'd read through that code I started to get my head round what excel was doing and, furthermore, i found it became REALLY easy to extend it to do other things that I wanted. Just remember that you are sending commands to Excel via the IDispatch interface. This means you need to think about how YOU would do something to figure out how to do it programatically.
Edit: the code guru example is for Excel 2003 but it should be fairly easy to extend it to 2007 :)
Start here with the OpenXml Sdk. Download the SDK from here. The SDK uses .NET, so you might need to use C++.NET
It has been a very long time but I have used the Jet OLEDB to get at Excel files. You might start searching in that direction.
I have used a 3rd Party component for this in the past: OLE XlsFile from SM Software (not free, but inexpensive). The advantage of this component over the Microsoft COM components is that you can use it to write XLS files even if Excel is not installed.
It also allows you to create speadsheets or workbooks with embedded formulas and formatting, something not possible if you use CSV files as an interchange format.
If you need the best performance, writing binary Excel files is the way to go and writing them is not as difficult as reading them. The binary file format is relatively well documented by the OpenOffice.org project:
http://sc.openoffice.org/excelfileformat.pdf
and Microsoft has also released the documentation:
http://download.microsoft.com/download/0/B/E/0BE8BDD7-E5E8-422A-ABFD-4342ED7AD886/Excel97-2007BinaryFileFormat(xls)Specification.xps
This solution will work best if you have to write many Excel files, mostly with data and little formatting, and if you don't want to open the files at the same time. If you write a single Excel file to open it afterwards, you can probably use the propsed C++ COM Interface as the other posters have explained.
Using the COM interface will necessitate expensive out-of-process calls unless you write an Excel COM Addin. If you write an Addin that imports your data into the current Excel sheet, filling the sheet will still be much slower than dumping a file, but for a single file at a time this can be acceptable depending on your user scenario. If you do decide to use the COM interface, minimize the number of calls to Excel. Use the methods that allow to insert an array of values if they exist, set style properties by row and column is possible and not per cell.
You can use ODBC to read and write data from an excel file easily without Excel. The link:Here
The link for how to write data using odbc: Here
I've used a set of C++ classes that is available from CodeProject before to write to XLS files.
It's really easy to use, and free! You can find it here.
Took me no time to set up.