SAS automation: move files from folder to folder

I currently have Python code that runs an HTML scrape and stores the data as CSV files in a folder on my computer called "New Data". I would like to run my SAS code against every CSV file that lands in that folder. Once the data has been through my SAS code, I would like to move all of the CSV files from "New Data" to a folder named "Processed Data". What SAS code would move CSV files from one folder on my computer to another after they have been processed? The code also has to be automated, as new CSV files will be coming in daily.
Thanks!

I have done something similar. Part of my data step work was copying files from one folder to another, and I used DOS commands from within SAS to do it. The whole command needs to be in single quotes. If a folder name or file name contains a space, the path needs to be in double quotes.
Here is an example that moves all the CSV files in the "new data" folder to the "processed data" folder:
data _null_;
/* SYSTEM passes the command to the operating system and returns its return code */
rc = system('move "x:\sas\project\new data\*.csv" "x:\sas\project\processed data\" ');
run;
Pay attention to the quoting. As Reeza mentioned, this code assumes the XCMD system option is enabled, which is what allows SAS to run operating system commands.
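Since the scraper in the question is already written in Python, the same move could alternatively be done from Python once the SAS step has finished. The following is only a minimal sketch of that alternative, not the SAS approach described above. The function name move_processed_csvs and the choice to skip files that already exist in the target folder are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def move_processed_csvs(src_dir, dst_dir):
    """Move every .csv file in src_dir into dst_dir and return the new paths."""
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the target folder if missing

    moved = []
    for csv_path in sorted(src.glob("*.csv")):
        target = dst / csv_path.name
        if target.exists():
            # Assumption: never overwrite a file that was archived earlier.
            continue
        shutil.move(str(csv_path), str(target))
        moved.append(target)
    return moved
```

A daily scheduled job could call move_processed_csvs(r"x:\sas\project\new data", r"x:\sas\project\processed data") after the SAS program ends. Paths with spaces need no extra quoting here because no shell is involved.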

Related

Load multiple files, check file name, archive a file

In a Data Fusion pipeline:
How do I read all the file names from a bucket, load some of them based on file name, and archive the others?
Is it possible to run a gsutil script from the Data Fusion pipeline?
Sometimes more complex logic is needed to decide which files should be loaded. I need to go through all the files in a location and load only those dated today or later. The date is a suffix in the file name, e.g. customer_accounts_2021_06_15.csv
Depending on where you are planning to write the files, you may be able to use the GCS Source plugin with the logicalStartTime macro in the Regex Path Filter field to pick up only files after a certain date. However, this may cause all your file data to be condensed down to record formats. If you want to retain each file in its original format, you may want to consider writing your own custom plugin.
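The date-suffix rule from the question can be sketched in plain Python. This is not a Data Fusion plugin and does not call any Data Fusion or GCS API; it only illustrates the decision logic, assuming the file names have already been listed. The function name split_by_date and the rule that undated names get archived are assumptions.

```python
import re
from datetime import date

# Matches a trailing _YYYY_MM_DD immediately before the .csv extension.
DATE_SUFFIX = re.compile(r"_(\d{4})_(\d{2})_(\d{2})\.csv$")


def split_by_date(file_names, cutoff):
    """Return (to_load, to_archive) based on the date suffix in each name."""
    to_load, to_archive = [], []
    for name in file_names:
        match = DATE_SUFFIX.search(name)
        file_date = None
        if match is not None:
            year, month, day = (int(part) for part in match.groups())
            try:
                file_date = date(year, month, day)
            except ValueError:
                file_date = None  # digits present but not a real calendar date
        # Assumption: names without a valid date are archived, not loaded.
        if file_date is not None and file_date >= cutoff:
            to_load.append(name)
        else:
            to_archive.append(name)
    return to_load, to_archive
```

Passing date.today() as cutoff gives the "current date or higher" behaviour described in the question.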

SAS Folder mapping

I have created a SAS folder, say "/Public Development/Area Name/Project Name", under the "Folders" tab of SAS Management Console.
In SAS EG this folder shows up under the "SAS Folder" option. I'm able to save EGP projects and stored processes in this folder, but not SAS code, logs, etc.
I believe it's just a folder at the metadata level, and only items registered in metadata can be saved here.
So what approach should I take to organize my other project items, like code, jobs, macros, and reports?
The Enterprise Guide model includes storing your code as part of your EGP project. You put code modules in process flows, and the log and output are stored alongside them. This is somewhat similar to running them in batch mode: log, output, and program are effectively grouped as one entity.
Your organization may have specific rules for how code/etc. is stored, such as storing it in a SVN repository or similar, so you should check with your manager or site SAS admin to get a more complete answer that is specific to your site.
I tend to keep metadata folders for storing metadata objects (stored processes, DI jobs, etc.), and I use the OS file system for storing code (.sas files), .log files, and .egp projects. Generally I don't store code as part of the EG project; instead, the project just links to code that sits in the OS file system. So basically, I store my code, logs, macros, format catalogs, output reports, and so on the same way I did when I was using PC SAS.

SAS Path to "My Folder"

I created some CSV files and exported them to a file folder on a SAS server. I'm using the Excel SAS add-in to make some charts. For whatever reason, the only folder I can access is "My Folder", which I can also view inside Enterprise Guide. There, I can modify it and make changes.
Unfortunately, I can't figure out the path to the folder. I want to write my text files (or maybe some datasets) to that folder so I can access them with the add-in. Side note - I tried to just export the CSV files to a network drive but wasn't allowed for security reasons I guess. It looks like I'm stuck with "My Folder" being the only option, I just can't figure out the path to make use of it.
If your "My Folder" is equivalent to a SAS library, you can do the following:
%put %sysfunc(pathname(work));
That writes the path of the WORK library to the log, which is at least one location that you have write access to.
My guess is that you are confusing two things:
1. Physical folders. (the ones you are looking for)
2. SAS Metadata. (the 'file system' you are seeing)
It has been a while since I worked with the Excel add-in, but if I recall correctly (no guarantees), you can only access SAS objects that are registered in the SAS server metadata.
The SAS metadata looks like a file structure, but it is virtual. Objects in the same metadata folder can actually have totally different disk locations.
The easiest way would be to register the file you want to access in the metadata (in "My Folder", if you want to keep it simplest). Of course, this requires certain administrative rights on the server.
If that is not possible, I'm not sure you can access it some other way through the SAS add-in.
For reference, the metadata path to your "My Folder" is /User Folders/&sysuserid/My Folder
You can store the files in a folder on the server and reference that folder with a LIBNAME statement in the autoexec.sas file in your home folder (~) on the server. Then, when you browse libraries using the add-in, you will see the reference to your folder there.
For the university demo edition on Linux/Mac, try this:
INFILE '/folders/myfolders/yourfilename';
This works if you have set up your shared folders as described in the install how-to.
See one example from "the little SAS book" loading raw data:
You can also see the path in the status line at the bottom
Another approach: enter
%put _all_;
This lists all macro variables in the log. There you can find:
GLOBAL USERDIR /folders/myfolders
So in the example above you could also use
INFILE "&USERDIR/yourfilename";

How to open a .dat file (ASCII)?

I tried to open a .dat file using Stata, and it actually opened, but the data set was a complete mess. I took the file from NBER (CPS data)...
click on the A icon of the year 1964 March.
I tried the regular Stata procedure for .dat files: File->Import->ASCII data created by a spreadsheet (delimiter " "), as recommended in the Stata manual for .dat files.
But it is still not working. Are there any other ways to open .dat file? Can I convert it to .csv somehow?
(All the data files are ASCII files compressed with the Unix compress command.)
There is a Java app, DataFerrett, that lets you get CPS and other data sets, but it is not very efficient.
I can show you how to open one of the files yourself (this works for any year from 1989 to 2012).
1. Download the .dat file.
2. Save it in a Desktop folder (C:\Users\Owner...).
3. Download the corresponding .do and .dct files from here.
4. Save them in the same folder.
5. Open the .dat file in Stata the same way you did in your question.
6. Save it as a Stata .dta file in the same folder (C:\Users\Owner...).
7. Open the .do file (using Notepad++) that is in your (C:\Users\Owner...) folder.
8. At the very beginning you will see that the author defines local variables for the paths of the .dta, .dat and .dct files. Change the paths so that they point to the saved .dta, .dat and .dct files in your folder (C:\Users\Owner...) on your Desktop.
9. Reopen Stata, and run the .do file from your folder (C:\Users\Owner...).
10. Done! Save the .dta file.
For the years 1962 to 1988 you can follow the same procedure (10 steps), but unfortunately NBER does not provide the .do and .dct files, so you have to write them yourself. Take the .do and .dct files from one of the available years (1989 - 2012) as a template and adapt them so that they are consistent with the corresponding .pdf documentation for each year. I know it is very tedious, but this is the only way you can handle it.
We need more information.
".dat" is not an extension that is special so far as Stata is concerned. Perhaps you meant .dta.
Even if so, what file was it, what command did you use and what was wrong?
The page you linked to leads to numerous files. We have not a hope of guessing which you mean.
Spelling is "Stata".
This might not save you from spending days digging into that data, but here are some ideas:
The file contains two completely different kinds of lines, which might be the reason why you can't import it. You can see this by opening the unzipped file in a text editor. You have to find out what that means.
What do you want to obtain from this file? According to the PDF, it contains 85 different values per record. Do you need them all? If you're only interested in a few values, you could extract them in a Unix shell.
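The idea of pulling only a few values out of a fixed-column file can also be sketched in Python instead of a Unix shell. This is a generic illustration, not the layout of the CPS file: the column positions, the field names and the record-type test in the usage note are hypothetical and would have to be taken from the PDF documentation for the year in question.

```python
def extract_fields(lines, layout, keep=lambda line: True):
    """Slice named fixed-width fields out of each line that passes `keep`.

    layout maps a field name to (start, end), with a 1-based start column
    and an inclusive end column, the way codebooks usually list positions.
    """
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not keep(line):
            continue  # skip the other kind of record
        rows.append({name: line[start - 1:end].strip()
                     for name, (start, end) in layout.items()})
    return rows
```

For example, with a made-up layout such as {"rectype": (1, 1), "id": (3, 6)} and keep=lambda line: line.startswith("1"), only lines of the first record type are sliced. The resulting rows can then be written out with the csv module and imported into Stata as a delimited file.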

Read all files in a folder then place each file and its match in new excel document. Is it possible with C++?

I have a folder that contains 300 different files. There are 150 .cft files and 150 .s01 files. Each .cft file has a corresponding .s01 file of the same name. I would like to create a program that can read the files from the folder and place each .cft file and its corresponding .s01 file into an excel document. I would like the .cft file to be on the first worksheet in the document and the .s01 file to be on the second sheet. Then I would like the program to save the file and name it (---------).xls. The (---------) would be the name of the .cft and .s01 file since they are both the same.
So!!! I wrote a program that is able to take the .cft file and the .s01 file, append them and place them in a user defined .xls document. However...I don't want to manually get the names of the 150 files and have to type each one into the program. I also don't want the files to be placed on the same worksheet.
So!!!! I don't want to waste time trying to code something impossible, so before I spend any more time on this I have a few questions:
Is it possible to read all of the files in a folder and match files of the same name but with different types?
If this is possible, is it then possible to place the corresponding .cft file and .s01 file in the same excel document but on different worksheets?
Then, is it possible to create and save this worksheet as (---------).xls, (-------) being the name of the matching .cft and .s01 file?
So basically...I want to write this code because I am lazy and I don't want to do anything manually ><;;; lol
Example:
The main folder contains 8 files:
dog.cft dog.s01 cat.cft cat.s01 tree.cft tree.s01 bird.cft bird.s01
The program reads all of the files in the folder and recognizes that dog.cft and dog.s01 go together.
The program then creates an excel document and on worksheet 1 places dog.cft and on worksheet 2 places dog.s01.
The program then saves the excel document as dog.xls
Then the program loops through the main folder repeating this process for each of the .cft and .s01 pairs until all 150 pairs have been separated and saved in their own excel document.
I don't know if I'm dreaming a little too big with this but any advice is much appreciated!
Personally, I would do this with a macro in Excel rather than in C++, because Excel-related operations are much easier that way. All of the requirements can be met using VBA within Excel.
Yes, it's possible.
For the listing of files in a folder, you can use the Windows API functions FindFirstFile and FindNextFile. When you finish iterating the folder, you'll need to call FindClose.
For creating the Excel spreadsheet and working with the workbook's sheets, you can use COM automation. Here's a link to an article on doing so from C++ (MFC); the article explains where to find one that isn't MFC based.
If you get started and have specific questions about either of the tasks, please post them as separate questions. This should have been two individual questions, in fact - one about iterating the content of a folder and a different one about working with Excel files from C++.
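The matching step (finding which .cft and .s01 files share a name) is independent of the language. Below is a minimal sketch of it in Python rather than C++, shown only to make the logic concrete. In C++ the same grouping could be built on top of the FindFirstFile/FindNextFile loop mentioned above. The function name find_pairs is an assumption, and writing the Excel workbook is not covered here.

```python
from pathlib import Path


def find_pairs(folder, ext_a=".cft", ext_b=".s01"):
    """Return {stem: (path_a, path_b)} for stems that have both extensions."""
    by_stem = {}
    for path in Path(folder).iterdir():
        if path.is_file():
            # Group files by name without extension, keyed by lowercase suffix.
            by_stem.setdefault(path.stem, {})[path.suffix.lower()] = path
    # Keep only complete pairs, in alphabetical order of the shared name.
    return {stem: (found[ext_a], found[ext_b])
            for stem, found in sorted(by_stem.items())
            if ext_a in found and ext_b in found}
```

Each key of the result (for example "dog") is the shared name and could serve as the name of the output workbook. Stems that are missing one of the two files are left out.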