I have been trying to change the permissions of all the output datasets created by a SAS program.
I know that I can use the chmod command after a dataset's creation to change its permissions. But what if I have multiple outputs from one program and I want to change all of their permissions? How can I dynamically apply chmod to every dataset created in the code without having to manually supply the dataset names?
For example, if my SAS code does this:
Data Lib1.test1
     Lib2.test2
     Lib3.test3;
Set Work.Test;
If (certain conditions) then output Lib1.test1;
If (certain conditions) then output Lib2.test2;
If (certain conditions) then output Lib3.test3;
run;
How can I apply chmod to all these datasets after they have been created, without having to manually supply the dataset names? I know &SYSLAST can give me the last dataset touched, but what about the other two?
P.S. I can't update the Data step code block here; I can only add code after the end of the Data step mentioned above.
I tried using the umask command at the very beginning of the session, but it does not let me set the execute permissions.
X umask 000
It only sets rw-rw-rw- permissions on the datasets; it does not give u=rwx,g=rwx,o=rwx.
Two options:
Try setting the setgid bit on the folders where your datasets are created. They should then automatically inherit their group permissions from the folder. E.g. chmod g+s /mydir
Use a wildcard to set the permissions for all SAS datasets in a folder, e.g. chmod g=rw /mydir/*.sas7bdat
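If the directories behind Lib1, Lib2 and Lib3 are known, both options can be scripted; below is a minimal shell sketch (the /data/lib* paths are placeholders) that could also be issued from SAS after the DATA step with X statements. Note that umask can only remove bits from the mode a file is created with, which is why umask 000 still leaves datasets at rw-rw-rw-.

# Sketch only: /data/lib1, /data/lib2 and /data/lib3 stand in for the
# physical paths behind Lib1, Lib2 and Lib3.

# Option 1: have newly created datasets inherit the folder's group
chmod g+s /data/lib1 /data/lib2 /data/lib3

# Option 2: fix up everything already written, e.g. right after the DATA step
chmod u=rwx,g=rwx,o=rwx /data/lib1/*.sas7bdat /data/lib2/*.sas7bdat /data/lib3/*.sas7bdat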
Every time I download a "power portal" with the pac CLI command:
pac paportal download -id <guid> --path ./ --overwrite true
Many of the files seem to be regenerated with new short GUIDs on the end, although they haven't changed. And the sitesettings.yml file gets re-ordered, so it shows a bunch of changes.
Below I made one change to a site setting, and I have 134 changes.
Can this be avoided? It makes it frustrating to track actual changes in source control.
If you have multiple records with the same name, the short GUID will be appended, since files/folders cannot have the same name. If you avoid creating records with exactly the same names (both active and inactive), you should not face this issue.
I need to rename files in an S3 bucket and was wondering if there is a way to do it by running a SQL query, so that the query returns the information and I can use that to update the name of the file.
For example, a file in S3:
00123456789.word
-- Some SQL script
select u.name, a.address from users u
left join address a
on u.id = a.id
where u.id = 00123456789
-- returns George Washington / 111 abc street
Then somehow use this data returned from running the SQL script to rename the file
george_washington_111_abc_street.word
I will most likely need to use the CLI since there are many files and renaming them manually seems impractical.
Is this possible to do or is there some other method?
EDIT:
Is there a way to do this programmatically?
You would need to somehow program or script this activity.
The way I would do it is:
Run the SQL query to return the three columns (id, name, address) for ALL rows
Load the resulting data into an Excel spreadsheet
Add a column that makes a 'rename' command like:
aws s3 mv s3://bucketname/00123456789.word s3://bucketname/george_washington_111_abc_street.word
You can use Excel formulas to convert the name and address to lowercase and replace spaces with underscores.
Then, use Copy Down to create the aws s3 mv command for every row. You can then copy those commands into a text file and run the file from the command line. It will individually rename every object.
This might not work if you have a huge number of objects (eg tens of thousands). In this situation, you might need to do it programmatically.
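For the programmatic route, here is a rough shell sketch of the same idea. It assumes a MySQL-compatible database reachable with the mysql client, a bucket called my-bucket, a database called mydb, and object keys of the form <id>.word; all of those names are placeholders taken from the example above, not anything you must have. Keep --dryrun until the generated commands look right.

#!/bin/bash
# Sketch only: bucket, database and column names follow the example above.
mysql -N -B -e "
  select u.id, u.name, a.address
  from users u
  left join address a on u.id = a.id;" mydb |
while IFS=$'\t' read -r id name address; do
  # lowercase and replace spaces with underscores, e.g. george_washington_111_abc_street
  new_key=$(echo "${name} ${address}" | tr '[:upper:]' '[:lower:]' | tr ' ' '_')
  aws s3 mv "s3://my-bucket/${id}.word" "s3://my-bucket/${new_key}.word" --dryrun
done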
I am in the process of migrating a database from an external server to Cloud SQL Second Generation. I have been following the recommended steps, the 2TB mysqldump process completed, and replication started. However, I got an error:
'Error ''Access denied for user ''skip-grants user''@''skip-grants host'' (using password: NO)'' on query. Default database: ''mondovo_db''. Query: ''LOAD DATA INFILE ''/mysql/tmp/SQL_LOAD-0a868f6d-8681-11e9-b5d3-42010a8000a8-6498057-322806.data'' IGNORE INTO TABLE seoi_volume_update_tracker FIELDS TERMINATED BY ''^#^'' ENCLOSED BY '''' ESCAPED BY ''\'' LINES TERMINATED BY ''^|^'' (keyword_search_volume_id)'''
Two questions:
1) I'm guessing the error has come about because Cloud SQL requires LOAD DATA LOCAL INFILE instead of LOAD DATA INFILE? However, I am quite sure that on the master we only run LOAD DATA LOCAL INFILE, so I'm not sure how the LOCAL gets dropped during replication. Is that possible?
2) I can't stop the slave to skip the error and restart, since SUPER privileges aren't available, so I'm not sure how to skip this error and also avoid it in the future while the final sync happens. Suggestions?
There was no way to work around the slave replication error in Google Cloud SQL, so had to come up with another way.
Since replication wasn't going to work, I had to do a copy of all the databases. However, because the aggregate size of all my DBs was 2TB, it was going to take a long time.
The final strategy that took the least amount of time:
1) Pre-requisite: You need at least 1.5X the current database size in free disk space on your SQL drive. My 2TB DB was on a 2.7TB SSD, so I had to temporarily move everything to a 6TB SSD before I could proceed with the steps below. DO NOT proceed without sufficient disk space; you'll waste a lot of your time, as I did.
2) Install cloudsql-import on your server. Without this you can't proceed, and it took me a while to discover it. It facilitates the quick transfer of your SQL dumps to Google.
3) I had multiple databases to migrate. So if in a similar situation, pick one at a time and for the sites that access that DB, prevent any further insertions/updates. I needed to put a "Website under Maintenance" on each site, while I executed the operations outlined below.
4) Run the commands in the steps below in a separate screen. I launched a few processes in parallel on different screens.
screen -S DB_NAME_import_process
5) Run a mysqldump using the following command, and note that the output is an SQL file, not a compressed file:
mysqldump {DB_NAME} --hex-blob --default-character-set=utf8mb4 --skip-set-charset --skip-triggers --no-autocommit --single-transaction --set-gtid-purged=off > {DB_NAME}.sql
6) (Optional) For my largest DB of around 1.2TB, I also split the DB backup into individual table SQL files using the script mentioned here: https://stackoverflow.com/a/9949414/1396252
7) For each of the files dumped, I converted the INSERT commands into INSERT IGNORE because I didn't want any further duplicate errors during the import process.
sed 's/^INSERT/INSERT IGNORE/g' {DB_OR_TABLE_NAME}.sql > new_{DB_OR_TABLE_NAME}_ignore.sql
8) Create a database by the same name on Google Cloud SQL that you want to import. Also create a global user that has permission to access all the databases.
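A rough sketch of step 8 using the plain mysql client against the instance's public IP (the database name comes from the question; the user name, password and IP are placeholders, and your client host must already be authorized on the instance):

mysql -h {GCLOUD_SQL_PUBLIC_IP} -u root -p -e "
  CREATE DATABASE mondovo_db CHARACTER SET utf8mb4;
  CREATE USER 'import_user'@'%' IDENTIFIED BY 'choose-a-password';
  GRANT ALL PRIVILEGES ON *.* TO 'import_user'@'%';"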
9) Now, we import the SQL files using the cloudsql-import plugin. If you split the larger DB into individual table files in Step 6, use the cat command to combine a batch of them into a single file and make as many batch files as you see appropriate.
Run the following command:
cloudsql-import --dump={DB_OR_TABLE_NAME}.sql --dsn='{DB_USER_ON_GCLOUD}:{DB_PASSWORD}@tcp({GCLOUD_SQL_PUBLIC_IP}:3306)/{DB_NAME_CREATED_ON_GOOGLE}'
10) While the process is running, you can detach from the screen session with Ctrl+a then d, and reconnect to the screen later to check on progress. You can create another screen session and repeat the same steps for each of the DBs/batches of tables that you need to import.
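Putting steps 5, 7 and 9 together for one database inside a single screen session, the pipeline looks roughly like this (the placeholders follow the same convention as above, and mondovo_db is just the example name from the question):

DB_NAME=mondovo_db   # example database name
mysqldump ${DB_NAME} --hex-blob --default-character-set=utf8mb4 --skip-set-charset \
  --skip-triggers --no-autocommit --single-transaction --set-gtid-purged=off > ${DB_NAME}.sql
sed 's/^INSERT/INSERT IGNORE/g' ${DB_NAME}.sql > new_${DB_NAME}_ignore.sql
cloudsql-import --dump=new_${DB_NAME}_ignore.sql \
  --dsn="{DB_USER_ON_GCLOUD}:{DB_PASSWORD}@tcp({GCLOUD_SQL_PUBLIC_IP}:3306)/${DB_NAME}"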
Because of the large sizes I had to import, I believe it took me a day or two; I don't remember exactly now since it's been a few months, but I know it was much faster than any other way. I had tried using Google's copy utility to copy the SQL files to Cloud Storage and then using Cloud SQL's built-in visual import tool, but that was slow, not as fast as cloudsql-import. I would recommend this method up until Google fixes the ability to skip slave errors.
In the deploy dacpac step in VSTS, you can set the step to run only when custom conditions are met. The condition examples are based on VSTS build information, and I can't find any documentation on using conditions from a connected Azure subscription or from dacpac metadata. On the conditions page they have a version variable which seems like it might be useful, but I can't find other information about it.
Basically, when the dacpac step is triggered, I want to check metadata against existing data, conditionally run the build step, and update metadata. Is this possible through a VSTS build step?
Yes, it is possible. You can add a user-defined variable (for example, a variable named result with default value 0) in the VSTS build definition, then use the value 1 to run the dacpac step and 0 to skip it.
Detailed steps:
Add a PowerShell task with two operations before the dacpac step:
Check whether there are new changes to the existing data.
If the metadata is stored only in Azure, you can connect to Azure from PowerShell to check it. If the metadata is also stored in the repository you build from (such as a git repo), you can check for the update in the repository instead.
Set the result variable based on whether the metadata has been updated.
If the data has been updated, set the result variable to 1:
Write-Host ("##vso[task.setvariable variable=result]1")
Otherwise, do not change the value (keep it at 0).
Since the data here is managed in a git repository, you can check in the repo whether it has changed. If it has, set the result variable to 1. A sample PowerShell script:
$files = $(git diff HEAD HEAD~1 --name-only)
echo "changed files as below: $files"
if ($files -contains 'filename') {
    Write-Host ("##vso[task.setvariable variable=result]1")
}
Set conditions for the dacpac step:
In the task, select Custom conditions for Run this task. If you want to run this task only when the previous steps succeeded and the result variable is 1, you can use the expression:
and(succeeded(), eq(variables['result'], '1'))
Now, if result has the value 0 the dacpac step will be skipped; if result has the value 1, the dacpac step will be executed.
Currently, the project we are working on has a freelance front-end developer involved. As we have never used him before we are looking for a way to limit his access to our servers and files but at the same time let him modify the view files currently on these servers.
The current project (all on one server) is compartmentalised into 6 separate mini sites, all using an MVC structure.
e.g.
Mini Site 1
-- Models
-- Views
-- Controllers
Mini Site 2
-- Models
-- Views
-- Controllers
etc
We need to limit his access to each view folder for each project but nothing else.
We are using Amazon EC2 and are using security groups with a limited IP range. We are unable to allow him to use FTP because that opens us up to more potential issues.
Also we have looked at file and group permissions but we have thousands of files on this server alone.
Any ideas on how this can be achieved with as little footprint as possible, so once he leaves we can remove his access and revert the settings etc.?
You could use chmod. I assume that your normal users can sudo and modify files at will? Or are they group based? Here are the two approaches you can pick from.
Approach 1:
If your normal employees/users can use sudo, you can chown all the folders so they are owned by root and a new group called programmers by doing chown -R root:programmers /var/www/dir/ . This makes dir and everything in it owned by root and the group programmers. Then you would do chmod -R 744 /var/www/dir/ . This gives the root user R/W/X permissions on dir and all folders in it (that is the 7), users in the programmers group Read-only permissions (the 4), and all other users Read-only permissions (the last 4).
From there, for the directories you want him to have access to, you would do chmod -R 774 /var/www/dir/front-end/views/ , which gives root and all users in the programmers group full R/W/X permissions. If you wanted to do it per file, you could do chmod 774 /var/www/dir/front-end/views/index.html
For all other users who want to modify a file (let us say they are using vim), they'd need to do sudo vim /var/www/dir/front-end/views/index.html . This lets them act as root and edit the file regardless of the Other permission (the last 4 in the three-digit octal mode).
Approach 2
If they are group based, you could make all files owned by root and the group employees (assuming normal users are in that group). Then for the files that you want him to edit (let us say his username is frontdev), you could do chown -R frontdev:employees /var/www/dir/front-end/views/ and then chmod that directory to 774, and you can do the same for individual files. That way all your employees, including you, in the employees group would have full permissions, root would have permissions on all files and directories, and you could assign his user as the one-off owner of the files/dirs you need him to have access to.
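As a rough sketch of that second approach applied across all six mini sites (the paths, site names, the frontdev user and the employees group are placeholders for your actual layout):

# Sketch only: assumes the mini sites live under /var/www/site1 .. /var/www/site6
for site in /var/www/site{1..6}; do
  chown -R root:employees "$site"             # staff group owns everything
  chmod -R 774 "$site"                        # employees keep full access, others read-only
  chown -R frontdev:employees "$site/Views"   # hand only the Views folder to the freelancer
done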
You can also look into jailing the user to only authorized directories. Jailkit is a big one. Here is a good tutorial: https://askubuntu.com/questions/93411/simple-easy-way-to-jail-users