I am trying to use Google's Cloud SDK command-line interface on my desktop to extract a file from Google BigQuery and place it in a Google Cloud Storage bucket. I have managed this initial part, but now I want to give the file a dynamic date name, as this process will be repeated in order to create a history of these files. The idea would be to have (filename).20.01.2020 or something like that, so that we could have an organised history of these exports.
Here is what I currently have:
bq extract mp-uid-all-touchpoints:83778322.prod_placementTouchpoints gs://touchpointsrecord/placements/%date%
What this does is correctly gather the current date and pass it as the filename. The problem is that in Cloud Storage object names, '/' acts as a folder separator, so when the date is passed in dd/mm/yyyy format, what ends up happening is that it creates a nested folder for the day, the month and then the year.
I need it to be just one file, not multiple folders within folders.
I hope someone can help solve this.
Putting / in the name will produce a directory structure, which makes it difficult to read and write objects with / in the name.
I suggest trying another date format, such as DDMMYYYY, DD-MM-YYYY or DD.MM.YYYY.
For example, to get 20012020:
# capture today's date as ddmmyyyy (e.g. 20012020), avoiding any / characters
export CURR_DATE=$(date '+%d%m%Y')
bq extract mp-uid-all-touchpoints:83778322.prod_placementTouchpoints gs://touchpointsrecord/placements/$CURR_DATE
Hope this helps.
You can also embed the date directly in the object name:
gs://touchpointsrecord/placements-$(date '+%d-%m-%Y')
which creates an object such as:
gs://touchpointsrecord/placements-21-01-2020
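If you are driving these repeated exports from a script rather than the shell, the same export can be done with the BigQuery Python client. This is a minimal sketch, assuming the table and bucket names from the question; the dotted date layout is just one choice:
from datetime import date
from google.cloud import bigquery  # pip install google-cloud-bigquery

# Table and bucket taken from the question above.
TABLE = "mp-uid-all-touchpoints.83778322.prod_placementTouchpoints"
# Dots instead of slashes, so the export stays a single object.
DESTINATION = f"gs://touchpointsrecord/placements/{date.today():%d.%m.%Y}"

client = bigquery.Client()
extract_job = client.extract_table(TABLE, DESTINATION)
extract_job.result()  # block until the export job finishes
print("exported to", DESTINATION)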
Related
In a Data Fusion pipeline:
How do I read all the file names from a bucket and load some based on file name, archiving the others?
Is it possible to run a gsutil script from the Data Fusion pipeline?
Sometimes more complex logic needs to be put in place to decide which files should be loaded: go through all the files in a location, then load only those dated with the current date or later. The date is in the file name as a suffix, i.e. customer_accounts_2021_06_15.csv.
Depending on where you are planning to write the files, you may be able to use the GCS Source plugin with the logicalStartTime macro in the Regex Path Filter field in order to pick up only files after a certain date. However, this will condense all your file data down to record format. If you want to retain each file in its original format, you may want to consider writing your own custom plugin; a lighter alternative is sketched below.
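If the regex route is too restrictive, one lighter alternative to a full custom plugin is a small script, run before the pipeline (from Cloud Composer, a VM, or similar), that lists the bucket and decides per file. A sketch using the google-cloud-storage client; the bucket name and prefix are hypothetical, and the file-name pattern follows the customer_accounts_2021_06_15.csv example above:
import re
from datetime import date
from google.cloud import storage  # pip install google-cloud-storage

BUCKET = "my-bucket"   # hypothetical bucket name
PREFIX = "incoming/"   # hypothetical folder prefix
PATTERN = re.compile(r"customer_accounts_(\d{4})_(\d{2})_(\d{2})\.csv$")

client = storage.Client()
for blob in client.list_blobs(BUCKET, prefix=PREFIX):
    match = PATTERN.search(blob.name)
    if not match:
        continue  # not a dated customer_accounts file
    file_date = date(*map(int, match.groups()))
    if file_date >= date.today():
        print("load:", blob.name)     # hand these to the pipeline
    else:
        print("archive:", blob.name)  # e.g. copy to an archive bucket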
In my C++ project I want to open an Excel file, query its data and identify the rows that match specific column values (more than one column). What is the best methodology to connect to Excel and query the worksheets?
The Excel file might contain several thousand records, so it is very important that the search completes and shows its results quickly, with optimum performance.
Please suggest more than one option and recommend the best of them.
See here for a library (under an open CPOL license) that another Stack Overflow user recommends:
https://stackoverflow.com/a/2879322/444255
I created some CSV files and exported them to a file folder on a SAS server. I'm using the Excel SAS add-in to make some charts. For whatever reason, the only folder I can access is "My Folder", which I can also view inside Enterprise Guide. There, I can modify it and make changes.
Unfortunately, I can't figure out the path to the folder. I want to write my text files (or maybe some datasets) to that folder so I can access them with the add-in. Side note - I tried to just export the CSV files to a network drive but wasn't allowed for security reasons I guess. It looks like I'm stuck with "My Folder" being the only option, I just can't figure out the path to make use of it.
If your "My Folder" is equivalent to a SAS library, you can do the following:
%put %sysfunc(pathname(work));
That writes the path of the WORK library to the log, which is at least one location you have write access to.
My guess is that you are confusing two things:
1. Physical folders. (the ones you are looking for)
2. SAS Metadata. (the 'file system' you are seeing)
It has been a while since I worked with the Excel add-in, but if (no guarantees ;)) I recall correctly, you can only access SAS objects that were registered in the SAS server metadata.
The SAS metadata looks like a file structure, but it is virtual: objects in the same metadata folder can actually have totally different disk locations.
The easiest way would be to register the file you want to access in the metadata (under 'My Folder', if you want to keep it simple). Of course, this requires certain administrative rights on the server.
If that is not possible, I'm not sure you can access it any other way through the SAS add-in.
For reference, the metadata path to your "My Folder" is /User Folders/&sysuserid/My Folder
You can store the files in a folder on the server and create a reference to that folder with a LIBNAME statement in the autoexec.sas file in your ~/home folder on the server. Then, when you browse libraries using the add-in, you will see the reference to your folder listed there.
For the University (demo) Edition on Linux/Mac, try this:
INFILE '/folders/myfolders/yourfilename';
provided you have set up your shared folders as described in the installation how-to.
See, for example, "The Little SAS Book" on loading raw data. You can also see the path in the status line at the bottom of the screen.
Another approach: entering
%put _all_;
will list all macro variables in the log. There you can find:
GLOBAL USERDIR /folders/myfolders
So in the example above you could also use
INFILE "&USERDIR/yourfilename";
I have a file created in OpenOffice Calc which I need to load into my C++ program, to generate some game-specific code for which this design file was written. All I need is the spreadsheet names, the field data and the formula results.
Is there a way to do so?
Thank you.
All OpenOffice files are zip archives with their contents in publicly specified XML files. It might be some work, but it would not be impossible to get the data you need with the help of most available XML parsers.
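To make that concrete: a sheet is a table:table element inside content.xml, cells are table:table-cell elements, and a formula cell carries its formula in a table:formula attribute with the cached result in an office:value attribute (office:string-value for text results). Here is a small Python sketch for inspecting that structure before writing the C++ parser (design.ods is a placeholder file name); the same traversal carries over to any C++ XML library:
import zipfile
import xml.etree.ElementTree as ET

TABLE = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# "design.ods" is a placeholder for your Calc file
with zipfile.ZipFile("design.ods") as archive:
    root = ET.fromstring(archive.read("content.xml"))

for sheet in root.iter(f"{{{TABLE}}}table"):
    print("sheet:", sheet.get(f"{{{TABLE}}}name"))
    for cell in sheet.iter(f"{{{TABLE}}}table-cell"):
        formula = cell.get(f"{{{TABLE}}}formula")
        if formula is not None:
            # office:value holds the cached numeric result of the formula
            print("  ", formula, "=", cell.get(f"{{{OFFICE}}}value"))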
There are government data files available at http://www.cdc.gov/EpiInfo/ in this weird SAS format. How can I convert them into XML/CSV, or something much simpler that can be read by scripts?
I had the same problem, so I made a simple SAS data viewer. You can download it from the downloads section here: http://code.google.com/p/sasquatch
It has a lot of the same features as SAS Universal Viewer, but it's still a work in progress.
You need to have Adobe AIR installed; you can get that from the Adobe website.
Are the data in the SAS XPORT (.xpt) or .sas7bdat format?
For future reference, SAS XPORT files can be read and written using the 'SASxport' package for R (http://cran.r-project.org/web/packages/SASxport/index.html).
(Already posted this to superuser.com)
SAS Institute (the company that makes SAS) produces a viewer for SAS data sets.
Note that SAS program files usually have the extension .sas, whereas the data files themselves usually have the extension .sas7bdat.
(EDIT: I notice belatedly that your title says on a Mac, so this may not help much as I believe the tool is Windows only.)
Here is a quick-and-dirty Python five-liner to convert a SAS .xpt (aka XPORT) file to .csv:
import pandas as pd
FILE_PATH = "(fully qualified name of directory containing file)"  # include the trailing path separator
FILE = "ABC"  # the file name itself (without suffix)
# Note: substitute the column name of the index (in quotes) for None here if needed
df = pd.read_sas(FILE_PATH + FILE + '.XPT', index=None)
df.to_csv(FILE_PATH + FILE + '.csv')
Hopefully this helps someone.
JMP runs on Mac and can read SAS files. Visit jmp.com for more information.
There are two parts to your question:
1. Read these files
2. Convert these files
I looked at the link you shared; there are no directly downloadable files, but I am assuming that you mean the files for Windows.
For viewing you can use the following:
a. SAS Universal viewer: https://support.sas.com/downloads/package.htm?pid=667
b. Use SAS on the Mac to read the files directly
For conversion you can do the following (a short pandas sketch for this also follows the list):
a. Use SAS itself: PROC IMPORT to read and PROC EXPORT to write the files;
b. Use third-party software, e.g., DBMS/Copy;
c. Download a trial version of JMP, convert the files to the desired format, e.g., CSV/TXT, and be done with it.
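As mentioned above, if Python is available, pandas reads both .xpt and .sas7bdat directly, so the conversion is two lines. A minimal sketch with a hypothetical file name; the encoding argument may need adjusting for your files:
import pandas as pd

# hypothetical file name; pandas infers the SAS format from the extension
df = pd.read_sas("customer_accounts.sas7bdat", encoding="latin-1")
df.to_csv("customer_accounts.csv", index=False)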