How to use Talend to read/write SAS files without installing a SAS server? - sas

I'd like to use Talend to manipulate SAS files; however, the SAS plugins require some sort of server authentication. I don't have a SAS server on my machine and I'd like not to install one if possible. Is there a way to read/write SAS files without installing the server?

Not easily! The simplest way would be to purchase a copy of Base SAS and use it as a local server; see the thread "How can I read a SAS dataset?".
A cheaper way would be to purchase a licensed version of the WPS product:
http://en.wikipedia.org/wiki/World_Programming_System
A free, but less reliable way would be to use an open source reader, such as:
http://kasper.eobjects.org/2011/06/sassyreader-open-source-reader-of-sas.html
Hope this helps.

Related

How to copy huge file(200-500GB) everyday from Teradata server to HDFS

I have Teradata files on Server A and I need to copy them to Server B, into HDFS. What options do I have?
distcp is ruled out because Teradata is not on HDFS.
scp is not feasible for huge files.
Flume and Kafka are meant for streaming, not for file movement. Even if I used Flume with Spool_dir, it would be overkill.
The only option I can think of is NiFi. Does anyone have suggestions on how I can use NiFi for this?
Or, if someone has already gone through this kind of scenario, what approach did you follow?
I haven't specifically worked with Teradata dataflows in NiFi, but having worked with other SQL sources in NiFi, I believe it is possible and pretty straightforward to develop a dataflow that ingests data from Teradata into HDFS.
For starters, you can do a quick check with the ExecuteSQL processor available in NiFi. The SQL-related processors take a DBCPConnectionPool property, a NiFi controller service that should be configured with the JDBC URL of your Teradata server, the driver path, and the driver class name. Once you have validated that the connection works, take a look at GenerateTableFetch and QueryDatabaseTable.
Hortonworks has an article which talks about configuring DBCPConnectionPool with a Teradata server : https://community.hortonworks.com/articles/45427/using-teradata-jdbc-connector-in-nifi.html

Driver to read from or write to Hive from C++ code

I have a core product built in C++ that uses an RDBMS, namely Oracle DB. We are in the process of Big Data-enabling this product with access to Hive tables. I know that Apache Spark has libraries for accessing Hive tables directly.
With C++ as the base language, what are the possible ways to read/write data in Hive on Cloudera?
Note: I am not looking to move data between Hive and an RDBMS (Sqoop). I am looking to read from, or execute queries on, Hive itself.
Thanks in advance.
This is what worked out for me.
1. Install the ODBC driver.
2. Go through the installation guide.
3. Open the project in Visual C++ and run it.

Using a result (table from SAS Enterprise Guide) in SAS Enterprise Miner

In SAS Enterprise Miner, I would like to use a result/table from SAS Enterprise Guide.
So far, I have managed to save this result/table in SAS Studio.
In SAS Enterprise Miner, when creating a data source, I have to select either a SAS table or a metadata repository. I select SAS table, and then I am stuck because I cannot access my source.
It would be great if you know how to solve this.
Save the data set into a location available to both the server used by Enterprise Guide and SAS Enterprise Miner. If they are the same server, great, put it into an easy to get to location.
After that, you will need to create a library reference to that location in Enterprise Miner. I am unsure of the specifics (I don't have a version to play with), but your administrator should be able to do that for you. If not, that's a good question for SAS Tech Support (which is free since you are a client).
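A library reference of the kind described above might look like the following sketch. The path /shared/sasdata and the libref egdata are hypothetical; use whatever directory both servers can actually read. In Enterprise Miner, a statement like this is commonly placed in the project start code, but check with your administrator where it belongs in your installation.

```sas
/* Hypothetical path: replace with the directory where Enterprise Guide saved the table. */
libname egdata '/shared/sasdata';

/* List the members of the library to confirm that the table is visible. */
proc datasets library=egdata;
quit;
```

If the table shows up in the PROC DATASETS listing, it should then be selectable as a SAS table when creating the data source.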

Use a Windows share in a Libname statement with SAS

When running SAS through EGuide locally I can successfully declare a libname as follows:
libname winlib '\\pc\folder\';
When using a SAS server this is not possible and I have to resort to using a Copy Files task.
For interest:
I believe this is because of the fact that the SAS server is Unix, is this correct?
What I've tried:
libname test '//pc/folder/';
libname test2 'smb://pc/folder/';
The other option I can think of is mounting the drive on the SAS server, but this isn't viable for me because this is for ad-hoc cases.
The question:
How would I correctly declare a libname to \\pc\folder for the SAS server?
A few notes:
I cannot run locally as I have to connect to a few DBs, and I don't want to use a PROC UPLOAD or DOWNLOAD for this.
If you want SAS to read a directory then the SAS process needs to be able to see the directory.
What most companies do is create a shared directory that can be mounted by both the SAS machine and your PC then you can reference the files directly from both, just using different paths.
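As a sketch of the "different paths" idea, assume the share that the question calls \\pc\folder has been mounted on the Unix server at a hypothetical mount point /mnt/pcshare. Each session then assigns a libref to the same physical directory through its own path:

```sas
/* On the Unix SAS server: the share as mounted there (hypothetical mount point). */
libname shared '/mnt/pcshare/folder';

/* In a local Windows SAS session: the same directory through its UNC path. */
libname shared '\\pc\folder';
```

Only one of the two statements is submitted in any given session. The libref name shared is arbitrary.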
Otherwise if you want SAS to use a file that EG can see but SAS cannot then I suggest asking EG to upload the file. There are custom tasks available for EG to upload binary files.
Another method would be to write SAS code that connects to a machine that can see the files and pulls them over, perhaps using FTP or SFTP.
Unfortunately there is no way to do this in the manner I wish (directly using the remote path in the libname statement in a Unix environment).
You should be able to do this with a Windows SAS server, and you can do it with a local Windows SAS installation.
This is due to how Unix works, meaning one would have to mount the share.
That isn't feasible as an ad-hoc method.
I do wish Unix had a more direct way of accessing remote directories.
That being said, alternatively one can do one of the following:
Write the data to a server-local directory, even work or home, then copy it to a local directory (for example with the Copy Files task in Enterprise Guide, or manually if you can reach the location from your local PC).
Do the SAS processing locally and fetch the needed data over the network (this isn't feasible if you need DB access, which can't be done on the local server).
Get whoever is in charge of your SAS servers to set up a mount that's accessible from both your machine and the SAS server.
Use SAS PC Files Server to accomplish this for Microsoft Office files.
Set up an FTP server on your local machine and use a filename statement with the FTP option to read/write to it. See "How do I read raw data via FTP in SAS?" for an idea.
Thanks to @Tom for the suggestions.
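The FTP option from the list above could be sketched as follows. Every value here is a placeholder: the host name, account, password, directory, and file name are hypothetical, and the sketch assumes an FTP server is already running on the PC and reachable from the SAS server. Note that this gives you a fileref for reading a raw file, not a libname pointing at the folder.

```sas
/* Placeholders throughout: substitute your own host, credentials, directory, and file. */
filename pcfile ftp 'input.csv'
    host='mypc.example.com'
    user='ftpuser'
    pass='your-password'
    cd='/folder';

/* Read the remote file line by line and echo each line to the SAS log. */
data _null_;
    infile pcfile;
    input;
    put _infile_;
run;

/* Release the fileref when finished. */
filename pcfile clear;
```

Replacing the DATA _NULL_ step with a normal DATA step and an INPUT statement that names the columns would load the file into a dataset instead of printing it.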

How to run SAS using batch if I do not have it locally

Is there a way to run SAS in batch if I don't have sas.exe on my machine?
My computer has SAS EG, but the code is run on our company's servers.
Thanks
If you are asking whether it is possible to run SAS batch on your local machine without having SAS on your local machine, the answer is no.
If you are using EG to connect to a SAS server, and you want to execute a batch job on the SAS server, that is possible (just not with EG). For example, if you have terminal access to the SAS server via PuTTY or similar, you can do a batch submit.
Enterprise Guide is quite capable of scheduling jobs, whether or not you have a local SAS installation.
Wendy McHenry covers this well in Four Ways to Schedule SAS Tasks. Way 1 is what you probably are familiar with ('batch'), but Ways 2 through 4 are all possible in server environments.
Way 2 is what I use, which is specifically covered in Chris Hemedinger's post Doing More with SAS Enterprise Guide Automation. In Enterprise Guide since I think EG 4.3, there has been an option in the File menu "Schedule ...", as well as a right-click option on a process flow "Schedule ...". These create VBScript files that can be scheduled using your normal Windows scheduler, and allow you to schedule a process flow or a project to run unattended, even if it needs to connect to a server.
You need to make sure you can connect to that server using the credentials you'll schedule the job to run under, of course, and that any network connections are created when you're not logged in interactively, but other than that it's quite simple to schedule the job. Then, once you've run it, it will save the project with the updated log and results tabs.
If your company uses the full suite of server products, I would definitely recommend seeing if you can get Way 3 to work (using SAS Management Console) - that is likely easier than doing it through EG. That's how SAS would expect you to schedule jobs in that kind of environment (and lets your SAS Administrator have better visibility on when the server will be more/less busy).