Bulk change SAS project code nodes directory - SAS

I have a SAS project (SAS Enterprise Guide version 7.15) and all the SAS code nodes are currently saved in the directory location
'C:\Users\SAS\Prod\SASCode'. I am taking a backup of this project to the Dev area and I would like to save the SAS code nodes to the Dev location 'C:\Users\SAS\Dev\SASCode'. Is there a way I can bulk change the location of all these SAS code nodes to the above Dev location, instead of going into the individual SAS code nodes and changing them one by one, which is of course a very time-consuming process?

Enterprise Guide project files are really a collection of files within a zip archive that has the file extension .egp instead of .zip.
Suppose your code nodes are shortcuts to files in the operating system, all in a single folder, and you want to change where those files are located. The targets of the shortcuts are stored in the .egp file as project metadata, in a contained file named project.xml. project.xml is a UTF-16 encoded file, and each shortcut target is XML data within a <FullPath> node. The XML file is systematic enough that it can be text-processed without needing a full XML parse: the goal is to replace the old path with the new path in the data of the FullPath nodes.
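For illustration only (the surrounding XML structure varies by EG version and the file name is hypothetical), a shortcut entry in project.xml might look like
<FullPath>C:\Users\SAS\Prod\SASCode\step1.sas</FullPath>
and after the rewrite it should read
<FullPath>C:\Users\SAS\Dev\SASCode\step1.sas</FullPath>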
PowerShell is a very potent tool for Windows scripting and can be used to tear apart the project file, modify the paths, and reassemble the pieces back into an .egp project file.
Suppose the code nodes of the original project were shortcuts to files in C:\Zeus\sources, and those sources were copied or moved to C:\Olympus\sources. The code nodes of the updated EG project have to be modified to point to the new location. Consider this PowerShell script:
# Input and output projects, old and new source locations
$egpIn = 'C:\Q1\Zeus.egp'
$egpOut = 'C:\Q2\ZeusModified.egp'
$oldPath = 'C:\Zeus\sources'
$newPath = 'C:\Olympus\sources'

# Copy the .sas source files to their new location
Copy-Item "$oldPath\*.sas" -Destination $newPath

# The old landmark is regex-escaped because -replace treats it as a pattern
$oldLandmark = [regex]::Escape('<FullPath>' + $oldPath)
$newLandmark = '<FullPath>' + $newPath

# ZipFile lives in an assembly Windows PowerShell does not load by default
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Extract the .egp (really a zip archive) into a temporary working folder
$workingFolder =
    [System.IO.Path]::Combine(
        [System.IO.Path]::GetTempPath(),
        [System.IO.Path]::GetRandomFileName()
    )
[System.IO.Compression.ZipFile]::ExtractToDirectory($egpIn, $workingFolder)

# Rewrite the <FullPath> targets in every contained .xml file (UTF-16, hence -Encoding Unicode)
Get-ChildItem $workingFolder *.xml -Recurse |
    ForEach-Object {
        $fileContent = ($_ | Get-Content -Encoding Unicode)
        $fileContent = $fileContent -replace $oldLandmark, $newLandmark
        [IO.File]::WriteAllText($_.FullName, ($fileContent -join "`r`n"), [System.Text.Encoding]::Unicode)
    }

# Reassemble the pieces back into an .egp project file, replacing any previous output
if (Test-Path $egpOut)
{
    Remove-Item $egpOut
}
[System.IO.Compression.ZipFile]::CreateFromDirectory($workingFolder, $egpOut)

# Clean up the working folder
Remove-Item $workingFolder -Recurse
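As an optional sanity check (a sketch under the same assumptions as the script above), you can re-extract the modified project and confirm that no references to the old path remain:
# Hypothetical verification step: expect no output from Select-String
$checkFolder = [System.IO.Path]::Combine([System.IO.Path]::GetTempPath(), [System.IO.Path]::GetRandomFileName())
[System.IO.Compression.ZipFile]::ExtractToDirectory($egpOut, $checkFolder)
Get-ChildItem $checkFolder *.xml -Recurse |
    ForEach-Object { Get-Content $_.FullName -Encoding Unicode | Select-String -SimpleMatch $oldPath }
Remove-Item $checkFolder -Recurse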

Should be possible, since EG projects are basically ZIP files containing the various elements.
One of these elements is the file "project.xml", which contains the references to external code files.
Replacing these references with the new location should solve your problem.

Related

Can I run a data pipeline over a directory with a specific name in Data Fusion?

I'm using Google Cloud Platform Data Fusion.
Assume that the bucket's path is as follows:
test_buk/...
In the test_buk bucket there are four files:
20190901, 20190902
20191001, 20191002
Let's say there is a directory inside test_buk called dir.
I have a prefix-based bundle based on 201909 (e.g., 20190901, 20190902), and I also have a prefix-based bundle based on 201910 (e.g., 20191001, 20191002).
I'd like to run the data pipeline for the 201909 and 201910 bundles.
Here's what I've tried: running the data pipeline with the regex path filter
gs://test_buk/dir//2019
With this regex path filter in place, no input is read, and likewise there is no output.
When I want to create a data pipeline over a specific directory in a bucket, how do I handle it in Data Fusion?
If you use the raw path directly (gs://test_buk/dir/), you might be hitting an error caused by unescaped special characters in the regex. That might be the reason why no input file matches your filter in the pipeline.
I suggest instead that you use ".*" to match the initial part (given that you are also specifying the path, no additional files in other folders will match the filter).
Therefore, I would use the following expressions depending on the group of files you want to process (feel free to change the extension of the files):
path = gs://test_buk/dir/
regex path filter = .*201909.*\.csv or .*201910.*\.csv
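For instance, assuming the objects carry a hypothetical .csv extension, the filter .*201909.*\.csv would match gs://test_buk/dir/20190901.csv and gs://test_buk/dir/20190902.csv, while the 201910 files would fall through to the second filter.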
If you would like to know more about the regex used, you can take a look at (1)

PowerShell: Export a list of apps

I wanted a hands-off approach to get a list of all the applications installed on a system.
A search brought me to many websites that utilized Get-ItemProperty, like this page here.
I quickly found that I could export the list to txt file for easy access at a later time.
Get-ItemProperty HKLM:\Software\Wow6432Node\Microsoft\Windows\CurrentVersion\Uninstall\* | Select-Object DisplayName, DisplayVersion, Publisher, InstallDate | Format-Table –AutoSize > C:\Users\username\Documents\InstalledPrograms-PS.txt
*username is a placeholder
What I found, however, is that I could run this when pasting it directly into powershell, but when placing this into a script, it would crash or not run at all.
I'm assuming I'm missing something and a straight copy and paste will not work.
Any help would be appreciated.
NOTE: I'm sure someone is bound to recommend WMIC, but this does not seem to list all the apps.
Update:
Seems that within my script
Get-ItemProperty HKLM:\Software\Wow6432Node\Microsoft\Windows\CurrentVersion\Uninstall\*
on its own works just fine. It's when the formatting happens that it crashes.
Solution:
For anyone looking for an easy and quick way to get a list of all applications, including Microsoft Store ones, here's what I did. I just have them exported to .txt files to keep for later.
Get-ItemProperty HKLM:\Software\Wow6432Node\Microsoft\Windows\CurrentVersion\Uninstall\* | Select-Object DisplayName, DisplayVersion, Publisher, InstallDate | Format-Table > c:\applist.txt
Microsoft Store Apps
Get-AppxPackage | Select Name, PackageFullName | Format-Table -AutoSize > c:\microsoftlist.txt
It worked for me, although emacs flagged the '–' before AutoSize as a foreign character (\226).
It may be a Unicode en dash, 8211 decimal: https://www.fileformat.info/info/unicode/char/2013/index.htm. A regular dash would be 45.
[int][char]'–'
8211
By the way, there's also Get-Package to list installed applications.
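A minimal sketch of that alternative (Get-Package comes with the PackageManagement module in Windows PowerShell 5+; the output path is just an example):
Get-Package | Select-Object Name, Version, ProviderName | Sort-Object Name | Format-Table -AutoSize > C:\packagelist.txt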

Pick up a particular file from a directory using regex in Talend

My directory contains files named WM_PersonFile_22022018, WM_PersonFile_23022018, WM_PersonFile_24022018 and WM_PersonFile_25022018, and these files come in on a daily basis. I am using tFileList to iterate through the files.
What should the regex in my Filemask be to pick up the most recent file? Should the 'Use Global Expressions as Filemask' option be unchecked?
I tried "*.txt" which is picking up all the files.
A regex will help you filter for the correct files.
Some other logic is needed to get you the newest file. If you use tFileList, you might be able to sort by date and take only the first result.
Alternatively, if you also want to check that the date in the filename is correct, you might need to add a little logic with a tMap, tAssert, tJava or tJavaRow.
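For instance, if the newest file always carries today's date (a hedged assumption), you could build the filemask dynamically with Talend's date routines, keeping 'Use Global Expressions as Filemask' checked:
"WM_PersonFile_" + TalendDate.formatDate("ddMMyyyy", TalendDate.getCurrentDate())
The filenames in the question appear to follow the ddMMyyyy pattern; if a day's file can be missing, sorting the iterated filenames and keeping the latest is the safer route.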

Batch file move, and rename, using part of directory name

I've read several batch renaming answers, and haven't made them work for my application. My regex and loop skills are weak.
I need to move many files with the same name, let's say non_unique_file.txt from directories with semi-unique names such as 'Directory#1/' or 'Directory#2/' to the 'non_unique_files/' directory, while modifying their name so it contains the unique identifier from the directory of origin. If I were to move just one file, it would look like:
cp Directory#1/non_unique_file.txt non_unique_files/#1.txt
I tried several loops such as:
for f in Directory* ; do cp $f/*txt non_unique_files/$f ; done
knowing that it was not exactly what I needed, but I don't know how to parse the original directory names and add that to the new file names, in the new directory.
Any help/resources would be appreciated.
Figured it out:
for f in Directory* ; do cp "$f/non_unique_file.txt" "non_unique_files/$f.txt" ; done
where my files get renamed to 'Directory#X.txt' in my non_unique_files/ directory. (The doubled quotes in my original attempt, $f"".txt"", were just empty strings concatenated into the name; plain "$f.txt" does the same thing.)
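If you want only the unique identifier in the new name (e.g. '#1.txt' rather than 'Directory#1.txt', as in the original example), a hedged variant using shell parameter expansion:
# ${f#Directory} strips the leading 'Directory', leaving e.g. '#1'
for f in Directory* ; do cp "$f/non_unique_file.txt" "non_unique_files/${f#Directory}.txt" ; done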

Use tFileUnarchive on Amazon S3

I have a Talend job which is simple, like below:
tS3Connection -> tS3Get -> tFileInputDelimited -> tMap -> tAmazonMysqlOutput.
Now the scenario here is that sometimes I get the file in .txt format and sometimes I get it in a zip file.
So I want to use tFileUnarchive to unzip the file if it's a zip, or to bypass the tFileUnarchive component if the file arrives already unzipped, i.e. only in .txt format.
Any help on this is greatly appreciated.
The trick here is to break the file retrieval and potential unzipping into one sub job and then the processing of the files into another sub job afterwards.
Here's a simple example job:
As normal, you connect to S3 and then you might list all the relevant objects in the bucket using the tS3List and then pass this to tS3Get. Alternatively you might have another way of passing the relevant object key that you want to download to tS3Get.
In the above job I set tS3Get up to fetch every object that is iterated on by the tS3List component by setting the key as:
((String)globalMap.get("tS3List_1_CURRENT_KEY"))
and then downloading it to:
"C:/Talend/5.6.1/studio/workspace/S3_downloads/" + ((String)globalMap.get("tS3List_1_CURRENT_KEY"))
The extra bit I've added starts with a Run If conditional link from tS3Get to tFileUnarchive with the condition:
((String)globalMap.get("tS3List_1_CURRENT_KEY")).endsWith(".zip")
which checks whether the file being downloaded from S3 is a .zip file.
The tFileUnarchive component then just needs to be told what to unzip, which will be the file we've just downloaded:
"C:/Talend/5.6.1/studio/workspace/S3_downloads/" + ((String)globalMap.get("tS3List_1_CURRENT_KEY"))
and where to extract it to:
"C:/Talend/5.6.1/studio/workspace/S3_downloads"
This then puts any extracted files in the same place as the ones that didn't need extracting.
From here we can now use a tFileList to iterate through the downloads folder looking for the file types we want, by setting the directory to "C:/Talend/5.6.1/studio/workspace/S3_downloads" and the global expression to "*.csv" (in my case, because I wanted to read in only the CSV files, including the zipped ones, I had in S3).
Finally, we then read the delimited files by setting the file to be read by the tFileInputDelimited component as:
((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))
In my case I then simply printed this to the console, but obviously you would want to perform some transformation before uploading to your AWS RDS instance.