aws-cli 1.2.10 cron script fails - amazon-web-services

I have a crontab that fires a PHP script that runs the AWS CLI command "aws ec2 create-snapshot".
When I run the script via the command line the php script completes successfully with the aws command returning a JSON string to PHP. But when I setup a crontab to run the php script the aws command doesn't return anything.
The crontab runs as the same user I use when I run the PHP script on the command line myself, so I am a bit stumped.

I had the same problem running a Ruby script (ruby script.rb).
I replaced ruby with its full path (/sources/ruby-2.0.0-p195/ruby) and it worked.
In your case, replace "aws" with its full path. To find it:
find / -name "aws"
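As an illustrative sketch, if the binary turns out to live at /usr/local/bin/aws (and using a placeholder volume id), the call in the cron-driven script would then look like:
/usr/local/bin/aws ec2 create-snapshot --volume-id vol-1234567890abcdef0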

The reason it's necessary to specify the full path to the aws command is because cron by default runs with a very limited environment. I ran into this problem as well, and debugged it by adding this to the cron script:
set | sort > /tmp/environment.txt
I then ran the script via cron and via command line (renaming the environment file between runs) and compared them. This led me to see that I needed to set both the PATH and the AWS_DEFAULT_REGION environment variables. After doing this the script worked just fine.
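As a sketch of what that can look like (the paths, region, schedule, and script location below are placeholders; use whatever your environment comparison shows), the crontab can set both variables before invoking the PHP script:
PATH=/usr/local/bin:/usr/bin:/bin
AWS_DEFAULT_REGION=us-east-1
*/30 * * * * /usr/bin/php /path/to/snapshot-script.php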

Related

Jenkins - bash: aws: command not found but runs fine from terminal

In Build Step, I've added "Send files or execute commands over SSH" -> SSH Publishers -> Exec command. I'm trying to run an aws command to copy a file from EC2 to S3. The same command runs fine when I execute it in the terminal, but via Jenkins it simply returns:
bash: aws: command not found
The command is
cd ~/.local/bin/ && aws s3 cp /home/ec2-user/lambda_test/lambda_function.zip s3://temp-airflow-us/lambda_function.zip
Based on the comments.
The solution was to use the following command:
cd ~/.local/bin/ && ./aws s3 cp /home/ec2-user/lambda_test/lambda_function.zip s3://temp-airflow-us/lambda_function.zip
since aws is not available in the PATH environment variable.
command not found indicates that the aws utility is not on $PATH for the jenkins user.
To confirm, sudo su -l jenkins and then issue the command which aws - this will most likely return no results.
You have two options:
use the full path (likely /usr/local/bin/aws)
add /usr/local/bin to the jenkins user's $PATH
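Either option might look like this in the Exec command field (assuming the binary really is at /usr/local/bin/aws):
/usr/local/bin/aws s3 cp /home/ec2-user/lambda_test/lambda_function.zip s3://temp-airflow-us/lambda_function.zip
or, prepending the directory to PATH for that session:
export PATH=$PATH:/usr/local/bin && aws s3 cp /home/ec2-user/lambda_test/lambda_function.zip s3://temp-airflow-us/lambda_function.zip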
I need my Makefile to work on both Linux and Windows, so the accepted answer is not an option for me.
I diagnosed the problem by adding the following to the top of my build script:
whoami
which aws
env|grep PATH
This returned:
root
which: no aws in (/sbin:/bin:/usr/sbin:/usr/bin)
PATH=/sbin:/bin:/usr/sbin:/usr/bin
Bizarrely, the path does not include /usr/local/bin, even though the interactive shell on the Jenkins host includes it. The fix is simple enough: create a symlink on the Jenkins host:
ln -s /usr/local/bin/aws /bin/aws
Now the aws command can be found by scripts running in Jenkins (in /bin).

AWS EB (Elastic Beanstalk) CLI not working in the command line of Git Bash

The AWS EB (Elastic Beanstalk) CLI is not running in Git Bash (Windows 10). I have successfully installed the AWS EB CLI following the AWS documentation at https://github.com/aws/aws-elastic-beanstalk-cli-setup/blob/master/README.md . At the end I set the environment variables as described in the doc, so the "eb" command works from Windows PowerShell. But when I try to access the "eb" command from the Git Bash / IntelliJ bash prompt, it does not work.
Working fine in Windows PowerShell:
PS C:\> eb --version
EB CLI 3.19.2 (Python 3.7.3)
Environment variable set as below under "User Variable" -> "Path":
(screenshot: environment variable set in Windows)
When trying to access "eb" from Git Bash, the error is as below:
$ eb
bash: eb: command not found
$ echo $PATH
.....
......
/c/Users/xxxxxx/.ebcli-virtual-env/executables:
I have restarted the system and the command-line interfaces multiple times.
Can someone please let me know if there is some issue with how the environment variable is set, or whether I need to configure something additional in the bash environment?
After a lot of trial and error with the different solutions available on the internet, along with the AWS doc suggestions, I can finally use "eb" from Git Bash on Windows 10. The problem was fixed after I added the location below to my PATH environment variable:
C:\Users\XXXX\AppData\Roaming\Python\Python37\Scripts
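If you would rather set it from the Git Bash side than the Windows environment variable dialog, a line like this in ~/.bashrc should have the same effect (the path below is just my install location; adjust it for yours):
export PATH="$PATH:/c/Users/XXXX/AppData/Roaming/Python/Python37/Scripts"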
The issue for me was a username with a space. The path then looks like this: C:\Users\fname lastname\.ebcli-virtual-env\executables. The problem is that the .bat files created by the AWS script did not wrap the path in double quotes, so Windows interprets it as multiple parameters.
I had to edit eb.bat and path_exporter.bat and wrap the directives like this (in eb.bat):
CALL "C:\Users\fname lastname\.ebcli-virtual-env\Scripts\activate.bat"
#start CALL "C:\Users\fname lastname\.ebcli-virtual-env\Scripts\eb.exe" %args%
The EB CLI seems to work properly now.

spark cluster on aws emr can't find spark-env.sh

I am playing with Apache Spark on AWS EMR and trying to use this to set the cluster to use Python 3.
I use this command as the last command in a bootstrap script:
sudo sed -i -e '$a\export PYSPARK_PYTHON=/usr/bin/python3' /etc/spark/conf/spark-env.sh
When I use it the cluster crashes during the bootstrap with the following error.
sed: can't read /etc/spark/conf/spark-env.sh: No such file or directory
How should I set it to use python3 properly?
This is not a duplicate of the linked question: my issue is that the cluster is not finding the spark-env.sh file while bootstrapping, while the other question addresses the issue of the system not finding python3.
In the end I did not use that script, but used the EMR configuration file that is available at the creation stage. It gave me the proper configuration via spark-submit (in the AWS GUI). If you need it to be available for pyspark scripts in a more programmatic way, you can use os.environ to set the pyspark Python version in the Python script.
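As a sketch, that creation-time configuration is commonly expressed with the spark-env/export classification; it can be written to a file (the name below is arbitrary) and passed with --configurations when creating the cluster from the CLI, alongside your other cluster options:
cat > python3-config.json <<'EOF'
[
  {
    "Classification": "spark-env",
    "Configurations": [
      {
        "Classification": "export",
        "Properties": { "PYSPARK_PYTHON": "/usr/bin/python3" }
      }
    ]
  }
]
EOF
# then pass it at creation time: aws emr create-cluster --configurations file://python3-config.json ...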

AWS EMR bootstrap action as sudo

I need to update /etc/hosts for all instances in my EMR cluster (EMR AMI 4.3).
The whole script is nothing more than:
#!/bin/bash
echo -e 'ip1 uri1' >> /etc/hosts
echo -e 'ip2 uri2' >> /etc/hosts
...
This script needs to run as sudo or it fails.
From here: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-bootstrap.html#bootstrapUses
Bootstrap actions execute as the Hadoop user by default. You can execute a bootstrap action with root privileges by using sudo.
Great news... but I can't figure out how to do this, and I can't find an example.
I've tried a bunch of things... including...
running as Hadoop and adding 'sudo' to each of the 'echo' statements in the script
using a shell script to copy and chmod the above ('echo' statements with no 'sudo'), then running the local copy with a run-if bootstrap action that calls 1=1 sudo bash /home/hadoop/myDir/myScript.sh
hard coding the whole script as a one-liner into a run-if bootstrap action
I consistently get:
On the master instance (i-xxx), bootstrap action 2 returned a non-zero return code
If I check the logs for the "Setup hadoop debugging" step, there's nothing there.
From here: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-overview.html#emr-overview-cluster-lifecycle
Summary of EMR setup (in order):
provisions ec2 instances
runs bootstrap actions
installs native applications... like hadoop, spark, etc.
So it seems like there's some risk that, since I'm mucking around as the hadoop user before Hadoop is installed, I could be messing something up there, but I can't imagine what.
I think it must be that my script isn't running as 'sudo' and it's failing to update /etc/hosts.
My question... how can I use bootstrap actions (or something else) on EMR to run a simple shell script as sudo? ...specifically to update /etc/hosts?
I've not had problems using sudo from within a shell script run as an EMR bootstrap action, so it should work. You can test that it works with a simple script that simply does "sudo ls /root".
Your script is trying to append to /etc/hosts by redirecting stdout with:
sudo echo -e 'ip1 uri1' >> /etc/hosts
The problem here is that while the echo is run with sudo, the redirection (>>) is not. It's run by the underlying hadoop user, who does not have permission to write to /etc/hosts. The fix is:
sudo sh -c 'echo -e "ip1 uri1" >> /etc/hosts'
This runs the entire command, including the stdout redirection, in a shell with sudo.
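Putting that together, a minimal sketch of the bootstrap script (the ip/uri placeholders are the question's own) would be:
#!/bin/bash
# run each append, including the >> redirection, in a root shell
sudo sh -c 'echo -e "ip1 uri1" >> /etc/hosts'
sudo sh -c 'echo -e "ip2 uri2" >> /etc/hosts'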

Getting Data From A Specific Website Using Google Cloud

I have a machine learning project and I have to get data from a website every 15 minutes, and I cannot use my own computer, so I will use Google Cloud. I am trying to use Google Compute Engine, and I have a script for getting the data (here is the link: https://github.com/BurkayKirnik/Automatic-Crypto-Currency-Data-Getter/blob/master/code.py). This script gets data every 15 minutes and writes it to CSV files. I can run this code by opening an SSH terminal and executing it from there, but it stops working when I close the terminal. I tried to run it from a startup script, but it doesn't work that way either. How can I run this and save the CSV files? BTW, I have to install an API to run the code, and I am doing that in the startup script; there is no problem with that part.
Instances running in Google Cloud Platform can be configured with the same tools available in the operating system that they are running. If your instance is a Linux instance, the best method would be to use a cronjob to execute your script repeatedly at your chosen interval.
Once you have accessed the instance via SSH, you can open the crontab configuration file by running the following command:
$ crontab -e
The above command will provide access to your personal crontab configuration (for the user you are logged in as). If you want to run the script as root you can use this instead:
$ sudo crontab -e
You can now edit the crontab configuration and add an entry that tells cron to execute your script at your required interval (in your case every 15 minutes).
Therefore, your crontab entry should look something like this:
*/15 * * * * /path/to/your/script.sh
Notice the first entry is for minutes, so by using the */15, you are telling the cron daemon to execute the script once every 15 minutes.
Once you have edited the crontab configuration file, it is a good idea to restart the cron daemon to ensure the change you made will take place. To do this you can run:
$ sudo service cron restart
If you would like to check the status to ensure the cron service is running you can run:
$ sudo service cron status
Your script will now execute every 15 minutes.
In terms of storing the CSV files, you could either program your script to store them on the instance, or an alternative would be to use a Google Cloud Storage bucket. Files can be copied to buckets easily using the gsutil command (part of the Cloud SDK) as described here. It's also possible to mount buckets as a file system as described here.
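For example, something like the following at the end of your data-collection script (the bucket name and local path are placeholders) copies each run's output to a bucket:
gsutil cp /path/to/output/*.csv gs://your-bucket-name/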