How do you convert hdf5 files into a format that is readable by SAS Enterprise Miner (sas7bdat)?

I have a subset of the 'Million Song Dataset', available on the website (http://labrosa.ee.columbia.edu/millionsong/), on which I would like to perform data mining operations in SAS Enterprise Miner (13.2).
The subset I have downloaded contains 10,000 files and they are all in HDF5 format.
How do you convert HDF5 files into a format that is readable by SAS Enterprise Miner (sas7bdat)?

On Windows there is an ODBC driver for HDF5. If you have SAS/ACCESS ODBC then you can use that to read the file.

I don't think it's feasible to do this directly, as hdf5 seems to be a binary file format. You might be able to use another application to convert hdf5 to a plain text format and then write SAS code to import that.
I think some of the other files on this page might be easier to import:
http://labrosa.ee.columbia.edu/millionsong/pages/getting-dataset
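The two-step route suggested above (convert HDF5 to plain text, then import the text file into SAS) could be sketched in Python roughly as follows. This is only a sketch under assumptions that are not in the original posts: it relies on the third-party h5py package, it assumes each song file keeps its fields in a table-like (compound) dataset, and the dataset path "/metadata/songs", the function names, and the directory name are placeholders to adapt to the actual files.

```python
import csv
import glob
import os


def read_table(path, dataset_name):
    """Return the rows of one compound (table-like) HDF5 dataset as dicts."""
    import h5py  # third-party package, imported here so write_csv works without it

    with h5py.File(path, "r") as f:
        table = f[dataset_name][()]  # load the dataset as a NumPy structured array
    rows = []
    for record in table:
        row = {}
        for name in table.dtype.names:
            value = record[name]
            if isinstance(value, bytes):  # HDF5 strings usually arrive as bytes
                value = value.decode("utf-8", "replace")
            row[name] = value
        rows.append(row)
    return rows


def write_csv(rows, out_path):
    """Write a list of dicts to a CSV file with a header row; return the row count."""
    if not rows:
        return 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def convert_directory(h5_dir, out_path, dataset_name="/metadata/songs"):
    """Collect one dataset from every .h5 file under h5_dir into a single CSV."""
    rows = []
    pattern = os.path.join(h5_dir, "**", "*.h5")
    for path in sorted(glob.glob(pattern, recursive=True)):
        rows.extend(read_table(path, dataset_name))
    return write_csv(rows, out_path)
```

A call such as convert_directory("MillionSongSubset", "songs.csv") would then leave one CSV file that SAS can import (for example with PROC IMPORT) and save as a sas7bdat data set.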

Related

need a tool or editor to read SAS files for free

I need a free editor or tool to read/open .sas7bdat or .wpd files. I am trying to open SAS files, but I don't have SAS installed.
I tried Sublime Text and Notepad++.
SAS provides a tool called SAS® Universal Viewer.
From the link:
The SAS Universal Viewer is a replacement for the SAS System Viewer. The SAS Universal Viewer enables you to view, sort, and filter SAS data sets and other simple text-based files. You cannot edit SAS data sets with the SAS Universal Viewer. You do not have to invoke SAS or install SAS on your computer in order to use the SAS Universal Viewer.
The most recent release is SAS Universal Viewer 1.42.
You can use this free tool: www.clinbay.com/datasly. It opens SAS datasets in .sas7bdat and XPT formats and can also export them to Excel format.
You can pull them into R using the haven package (part of the tidyverse):
https://cran.r-project.org/web/packages/haven/readme/README.html

converting a large number of excel files into a single mysql table

I am using Microsoft Visual Studio 10 (C++) and MySQL Workbench. I have a large number of Excel files, and I want to load the content of all of them into a single MySQL table. I can create a CSV file for each Excel file and then import it, but I want this to be done with the help of a stored procedure. I want to use C++, and the procedure has to be repeatable with different Excel files.
I was thinking of connecting my C++ program to both Excel and MySQL simultaneously (is that possible?), reading the Excel files, and adding the data to the MySQL table.
I have already connected my program to the MySQL database.
Any other approach would be appreciated.
A MySQL stored procedure cannot do a bulk load or CSV import (LOAD DATA is not permitted inside stored programs), but you can run the import outside a stored procedure. It is better to do the import from C++.
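As a rough illustration of running the import outside a stored procedure: if each Excel file is first saved as CSV (as the question already proposes), a small client-side script can emit one LOAD DATA statement per file. The sketch below only builds the SQL text. The table name, the directory, the function names, and the assumption that every CSV has exactly one header row are hypothetical, and it is written in Python for brevity even though the question uses C++; the same statements could be sent from a C++ client over the existing MySQL connection.

```python
import glob
import os


def load_data_statement(csv_path, table):
    """Build a LOAD DATA statement for one comma-separated file with a header row."""
    # MySQL treats backslash as an escape character inside string literals,
    # so use forward slashes and double any single quotes in the path.
    path = os.path.abspath(csv_path).replace("\\", "/").replace("'", "''")
    return (
        f"LOAD DATA LOCAL INFILE '{path}' "
        f"INTO TABLE `{table}` "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        # Files saved on Windows may need '\r\n' here instead of '\n'.
        "LINES TERMINATED BY '\\n' "
        "IGNORE 1 LINES;"
    )


def statements_for_directory(csv_dir, table):
    """Return one LOAD DATA statement per .csv file in csv_dir, in name order."""
    paths = sorted(glob.glob(os.path.join(csv_dir, "*.csv")))
    return [load_data_statement(p, table) for p in paths]
```

Note that the LOCAL variant only works when local_infile is enabled on both the client and the server; without it, the files have to be readable by the server itself.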

SAS transport file explanation?

I am trying to find an easy to understand definition of what a SAS transport file is, but most of what I find is on how to create them.
I am hoping for a basic explanation/definition, if possible geared towards someone who isn't too techno-savvy.
A SAS transport file is one format for saving a SAS dataset that is easily read in by other sources. SAS's native format, .sas7bdat, is proprietary - they do not publish how to make a .sas7bdat, so the few people who have done so had to reverse engineer it themselves. SAS does however have some interest in being able to send or receive data to other programs; so they created the transport file format, which is an "open" format - meaning they published the specification for how to make a transport file, so anybody could easily write a program in another language to make a SAS transport file.
Think of it like the old DOC format for Microsoft Word, versus RTF. Both convey the same information (roughly), but RTF is an open format that many programs can write out, while the old DOC format was not initially published (I think).
The transport format does lose some advantages of the sas7bdat, in terms of speed of access and some of the more modern choices in terms of lengths of variable name, as well as formats. They're most commonly used for FDA transmittals.

Programmatically creating Excel file in C++

I have seen programs exporting to Excel in two different ways.
Opening Excel and entering data cell by cell (while it is running it looks like a macro at work)
Creating an Excel file on disk and writing the data to the file (like the Export feature in MS Access)
Number 1 is terribly slow and to me it is just plain awful.
Number 2 is what I need to do. I'm guessing I need some sort of SDK so that I can create Excel files in C++.
Do I need different SDKs for .xls and .xlsx?
Where do I obtain these? (I've tried Googling it, but the SDKs I've found look like they do other things than providing an interface to create Excel files.)
When it comes to the runtime, is MS Office a requirement on the PC that needs to create Excel files or do you get a redistributable DLL that you can deploy with your executable?
You can easily do that with the XML Spreadsheet format. See the Wikipedia article about it:
http://en.wikipedia.org/wiki/Microsoft_Excel#XML_Spreadsheet
This format was introduced in Excel 2002, and it is an easy way to generate a spreadsheet file that Excel can open.
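Because the XML Spreadsheet format is plain text, producing it needs no SDK and no Office installation; any language that can write a text file will do. The sketch below shows the idea in Python rather than C++ to keep it short. The function name and the choice to mark only int and float values as numbers are illustrative assumptions, and no styling is included.

```python
from xml.sax.saxutils import escape, quoteattr


def xml_spreadsheet(rows, sheet_name="Sheet1"):
    """Return rows (a list of lists) as an XML Spreadsheet 2003 document string."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<?mso-application progid="Excel.Sheet"?>',
        '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"',
        '          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">',
        f' <Worksheet ss:Name={quoteattr(sheet_name)}>',
        '  <Table>',
    ]
    for row in rows:
        lines.append('   <Row>')
        for value in row:
            # Numbers get the Number type; everything else is written as a string.
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            cell_type = "Number" if is_number else "String"
            lines.append(
                f'    <Cell><Data ss:Type="{cell_type}">{escape(str(value))}</Data></Cell>'
            )
        lines.append('   </Row>')
    lines += ['  </Table>', ' </Worksheet>', '</Workbook>']
    return "\n".join(lines)
```

Writing the returned string to a UTF-8 file with an .xml extension should give a document that Excel opens as a one-sheet workbook. A C++ version would be the same string building with std::ofstream.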
You can also try working with XLS/XLSX files through ODBC or ADO drivers, just like databases, though with limited functionality. You can use templates if you need formatting, or create the files from scratch. Of course, this way you are limited to manipulating field values. For styling etc. you will need to use an Excel API like Microsoft's.
I'm doing this via the Wt library's WTemplate.
In short, I created the Excel document I wanted in OpenOffice and saved it in Excel 2003 (.xml) format.
I then loaded that in google-chrome to make it look pretty and copied it to the clipboard.
Now I'm painstakingly breaking it out into templates so that Wt can render a new file each time.

How can I read/convert SAS Gov't Data files on a MAC?

There are gov't data files: http://www.cdc.gov/EpiInfo/
Available in this weird SAS format. How can I convert them into XML/CSV, something much simpler that can be read by scripts/etc.???
I had the same problem, so I made a simple SAS data viewer. You can download it from the downloads section here: http://code.google.com/p/sasquatch
It has a lot of the same features as SAS Universal Viewer, but it's still a work in progress.
You need to have Adobe AIR installed; you can get that on the Adobe website.
Are the data in the SAS XPORT (.xpt) or .sas7bdat format?
For future reference, SAS XPORT files can be read and written using the 'SASxport' package for R (http://cran.r-project.org/web/packages/SASxport/index.html).
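If the file extension is missing or unreliable, one way to answer the XPORT-or-not question is to look at the first bytes of the file. The sketch below checks for the library header record that version 5 transport files start with, as given in the published transport format specification. The function name is made up for illustration, and a False result only means "not a version 5 transport file"; it does not prove the file is .sas7bdat.

```python
# First 48 characters of the 80-byte library header record of a V5 transport file.
XPORT_V5_PREFIX = b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!"


def looks_like_xport(path):
    """Return True if the file starts with the version 5 transport library header."""
    with open(path, "rb") as f:
        first_record = f.read(80)  # transport files are made of 80-byte records
    return first_record.startswith(XPORT_V5_PREFIX)
```

For example, looks_like_xport("ABC.XPT") can be used to decide whether an XPORT reader is worth trying before falling back to a .sas7bdat reader.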
(Already posted this to superuser.com)
SAS Institute (the company that makes SAS) produces a viewer for SAS data sets.
Note that SAS program files usually have the extension .sas, whereas the data files themselves usually have the extension .sas7bdat.
(EDIT: I notice belatedly that your title says on a Mac, so this may not help much as I believe the tool is Windows only.)
Here is a quick-and-dirty Python snippet to convert a SAS .xpt (aka XPORT) file to .csv:
import os
import pandas as pd
FILE_PATH = "(fully qualified name of directory containing file)"
FILE = "ABC"  # filename itself (without suffix)
# Note: to use one of the columns as the index, pass its name (in quotes) instead of None
# os.path.join supplies the path separator that plain string concatenation would miss
df = pd.read_sas(os.path.join(FILE_PATH, FILE + '.XPT'), index=None)
df.to_csv(os.path.join(FILE_PATH, FILE + '.csv'))
Hopefully this helps someone.
JMP runs on Mac and can read SAS files. Visit jmp.com for more information.
There are two parts to your question
1. Read these files
2. Convert these files
I looked at the link you shared. There are no directly downloadable files, but I am assuming that you mean the files for Windows.
For viewing, you can use the following:
a. SAS Universal Viewer: https://support.sas.com/downloads/package.htm?pid=667
b. Use SAS on Mac to read the files directly.
For conversion, you can do the following:
a. Use SAS PROC IMPORT to read the files and PROC EXPORT to write them out in another format.
b. Use third-party software, e.g., DBMSCopy.
c. Download a trial version of JMP, convert the files to the desired format (e.g., CSV or TXT), and be done with it.