Create a .txt file with a SageMaker Lifecycle Configuration script

I'm unable to get a SageMaker Lifecycle Configuration to create a plain .txt file in the directory with my Jupyter notebooks when the SageMaker notebook instance starts.
In the future I'll add text to this file, but creating the file is the first step.
Start notebook script:
#!/bin/bash
set -e
touch filename.txt
Note: I have edited my notebook to add this lifecycle configuration.
But when the notebook starts and I open it, the file does not exist. Is this possible?

You are creating the file in the root directory.
Use the terminal option of your notebook to explore the filesystem.

The working directory of the Lifecycle Configuration script is "/" and Jupyter starts up from the "/home/ec2-user/SageMaker" directory. So, if you create a file outside "/home/ec2-user/SageMaker", it will not be visible inside the Jupyter file browser.
To address this, you can modify your script to use an absolute path:
touch /home/ec2-user/SageMaker/filename.txt
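For reference, a complete on-start script along these lines might look like the sketch below (the filename is a placeholder):
#!/bin/bash
set -e
# Create the file under the directory Jupyter serves,
# so it shows up next to the notebooks in the file browser.
touch /home/ec2-user/SageMaker/filename.txt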
Thanks for using Amazon SageMaker!

Why can't my GCP script/notebook find my file?

I have a working script that finds the data file when it is in the same directory as the script. This works both on my local machine and Google Colab.
When I try it on GCP, though, it cannot find the file. I tried three approaches:
PySpark Notebook:
Upload the .ipynb file, which includes a wget command. This downloads the file without error, but I am unsure where it saves it, and the script cannot find the file either (I assume because I am telling it that the file is in the same directory, and presumably wget on GCP saves it somewhere else by default).
PySpark with bucket:
I did the same as the PySpark notebook above, but first I uploaded the dataset to the bucket and then used the two links provided in the file details when you click the file name inside the bucket on the console (neither worked). I would like to avoid this anyway, as wget is much faster than downloading on my slow wifi and then re-uploading to the bucket through the console.
GCP SSH:
Create cluster
Access VM through SSH.
Upload .py file using the cog icon
wget the dataset and move both into the same folder
Run script using python gcp.py
Running it just gives me an error saying the file was not found.
Thanks.
Regarding your first and third approaches: if you are running PySpark code on Dataproc, irrespective of whether you use an .ipynb file or a .py file, please note the points below:
If you use the wget command to download the file, it will be downloaded to the current working directory where your code is executed.
When you try to access the file through the PySpark code, it checks HDFS by default. If you want to access the downloaded file from the current working directory, use the "file:///" URI scheme with an absolute file path.
If you want to access the file from HDFS, you have to move the downloaded file into HDFS first and then access it using an absolute HDFS file path. Please refer to the example below:
hadoop fs -put <local file_name> </HDFS/path/to/directory>
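Putting these points together, a rough end-to-end sketch on a Dataproc node might look like this (the URL and paths are placeholders):
wget https://example.com/dataset.csv              # lands in the current working directory
hadoop fs -put dataset.csv /data/dataset.csv      # copy the local file into HDFS
# In the PySpark code, either location can then be read:
#   spark.read.csv("file:///home/your-user/dataset.csv")   # local filesystem, absolute path
#   spark.read.csv("hdfs:///data/dataset.csv")             # HDFS path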

User data .sh script not running on startup for CloudFormation

I need to run a .sh script on startup of an EC2 instance created from CloudFormation. I am copying the script from S3 and then trying to run it. The script is copied from the S3 bucket to the EC2 root successfully, but it does not run when we try . setupec2.sh . The script has no issues when run manually (it is a bit long, as it performs a couple of installations), and I can find it when we log into the EC2 instance, but I wanted to run it on startup from CloudFormation and so supplied it as user data.
The error it gives is:
/var/lib/cloud/instance/scripts/part-001: line 33: setupec2.sh: No such file or directory
You need to specify a full path when you call setupec2.sh
e.g. /setupec2.sh if it is in the root folder.
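For illustration, a minimal user data sketch along these lines might look as follows (the bucket name and paths are placeholders):
#!/bin/bash
# Copy the setup script from S3 and invoke it by absolute path,
# since the user data script's working directory is not where the file lands.
aws s3 cp s3://my-bucket/setupec2.sh /setupec2.sh
chmod +x /setupec2.sh
. /setupec2.sh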

How do I resolve "command not found" in AWS EC2?

All of a sudden, no Linux command (ls, vi, etc.) is working in my AWS EC2 instance, and I get a message saying "command not found".
I had launched an EC2 instance and all Linux commands were working fine.
I then uploaded some files to EC2 and extracted them (setting up my environment).
I made the following changes to the ~/.bashrc file:
export M2_HOME=/home/ec2-user/apache-maven-3.6.0
export JAVA_HOME=/home/ec2-user/jdk1.8.0_151
export ANT_HOME=/home/ec2-user/apache-ant-1.9.13
export PATH=/home/ec2-user/jdk1.7.0_80/bin:/home/ec2-user/apache-maven-3.6.0/bin
export JBOSS_HOME=target/wildfly-run/wildfly-11.0.0.Final
and I executed the command below in my AWS EC2 instance:
source ~/.bashrc
After this, Linux commands (ls, vi, cat, etc.) are not working; however, the "which" and "pwd" commands are working.
Can someone help me correct the PATH settings so that my commands start executing normally?
You should append the original PATH to the additions you made (using the $PATH variable), like below:
export PATH=/home/ec2-user/jdk1.7.0_80/bin:/home/ec2-user/apache-maven-3.6.0/bin:$PATH
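In the meantime, one way to recover the current session is to call binaries by absolute path or to temporarily put the standard directories back (a quick sketch):
/bin/ls ~                                                        # invoke a command by its absolute path
export PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:$PATH   # restore standard dirs for this session
vi ~/.bashrc                                                     # then fix the PATH line permanently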
Changing the value of PATH as below sorted out all the issues:
export PATH=/usr/bin:/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/local/bin:/opt/aws/bin:/root/bin:/home/ec2-user/jdk1.7.0_80/bin:/home/ec2-user/apache-maven-3.5.2/bin:/home/ec2-user/apache-ant-1.9.14/bin
Below is the system default PATH:
PATH=/usr/bin:/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/local/bin:/opt/aws/bin:/root/bin

How to set an environment variable for the root user at start-up?

I'm trying to add memory usage monitoring to the monitoring tab of an instance at console.aws.amazon.com. It's an instance running Amazon Linux AMI 2013.09.2. I have found the Amazon CloudWatch Monitoring Scripts for Linux, specifically mon-put-instance-data.pl, which lets me collect memory stats and report them to CloudWatch as custom metrics.
To have this working I need to set the environment variable AWS_CREDENTIAL_FILE to point to a file containing my AWSAccessKeyId and AWSSecretKey. I do this by typing:
export AWS_CREDENTIAL_FILE=/home/ec2-user/aws-scripts-mon/awscreds.template
To avoid having to type this over and over again, I'm looking for a way to set the environment variable at startup. I have tried adding the code to these files:
/etc/rc.local file
/etc/profile
/home/ec2-user/.bash_profile
Since adding the line of code to any of these files does not seem to work when I switch to the root user, where should I put it? If I set the variable in /home/ec2-user/.bash_profile, the variable is set for ec2-user but not for root. If I then run sudo -E su it works, but I don't know if this is the best way to go about it.
Create an .sh file and put the code in it, then place this file in the /etc/profile.d/ directory.
Note: create this file as the root user.
This file will then run automatically whenever a user logs in and will create the environment variable for you, so it will be accessible to all users, including root.
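As a minimal sketch (the file name is an example; the path to the credentials file comes from the question):
# /etc/profile.d/aws-credentials.sh -- create this file as root
export AWS_CREDENTIAL_FILE=/home/ec2-user/aws-scripts-mon/awscreds.template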

How to set up and use EC2 CLI on Mac?

I am stuck at using Amazon EC2 CLI.
I have downloaded the Command Line Tools from
http://aws.amazon.com/developertools/351.
I placed the bin and lib folder into my Amazon project folder: /Users/Invictus/EC2
I downloaded the cert-xxxx.pem and pk-xxx.pem into the same folder.
Created a .bash_profile in the same folder.
I tried to execute ec2-describe-images -o amazon after I moved to cd /Users/Invictus/EC2.
The system does not recognise the command: command not found.
If I try to execute the same command inside the bin folder, the result is the same.
My .bash_profile:
export EC2_HOME=~/.EC2
export PATH=$PATH:$EC2_HOME/bin
export EC2_PRIVATE_KEY=`ls $EC2_HOME/pk-*.pem`
export EC2_CERT=`ls $EC2_HOME/cert-*.pem`
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Home/
Where did I make a mistake?
My aim is to connect to the launched instance and be able to execute commands there from my local machine.
I have Java installed.
The newer unified AWS CLI is much, much easier to set up. All you need is Python, which comes built in on every Mac.
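A rough sketch of that route (the pip-based install is for the Python CLI; the last call is approximately equivalent to ec2-describe-images -o amazon):
sudo pip install awscli            # install the unified, Python-based CLI
aws configure                      # prompts for access key, secret key, region, output format
aws ec2 describe-images --owners amazon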
Here are a few things I can think of:
Your .bash_profile should be in /Users/Invictus/, not /Users/Invictus/EC2. Move it to your home directory, log off and log back in (or restart your machine), and see if it picks up the right path.
Instead of ec2-describe-images, can you run it as "./ec2-describe-images" - does that work? If not, can you check the permissions on that script?
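Putting the first suggestion into commands, a quick sketch (paths are taken from the question; note that the profile sets EC2_HOME=~/.EC2, so the bin folder and the .pem files must actually live under that path for $EC2_HOME/bin to resolve):
mv /Users/Invictus/EC2/.bash_profile ~/.bash_profile   # the profile belongs in the home directory
source ~/.bash_profile                                 # or log out and back in
ls $EC2_HOME/bin                                       # sanity check: should list ec2-describe-images
ec2-describe-images -o amazon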