"Connect timeout on endpoint URL" when running cron job - amazon-web-services

I have set up a crontab file to run a Python script that creates a JSON file and writes it to an S3 bucket. It runs as expected when executed from the command line, but when I run it as a cron job, I get the following error:
botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL
This results from the following lines of code in the script:
import boto3

def main():
    # Create EC2 client and get EC2 response
    ec2 = boto3.client('ec2')
    response = ec2.describe_instances()
My guess is that some permission is not set in the cron job, denying me access to the URL.

It turns out that I had to set the proxy settings so that I access AWS as myself rather than as root. I ran the cron job as a Linux shell script rather than a Python script, and exported the http_proxy, https_proxy, and no_proxy settings found in ~/.bash_profile in the first lines of the shell script:
export http_proxy=<http_proxy from ~/.bash_profile>
export https_proxy=<https_proxy from ~/.bash_profile>
export no_proxy=<no_proxy from ~/.bash_profile>
python <python script>
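Alternatively, the proxy settings can be passed to boto3 inside the script itself, so the job no longer depends on what cron exports. A minimal sketch, assuming the same proxy values from ~/.bash_profile (the proxy URL below is a placeholder):

import boto3
from botocore.config import Config

# Placeholder proxy URL; substitute the values from ~/.bash_profile
proxy_config = Config(proxies={
    'http': 'http://proxy.example.com:8080',
    'https': 'http://proxy.example.com:8080',
})

ec2 = boto3.client('ec2', config=proxy_config)
response = ec2.describe_instances()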

Related

Tensorboard - can't connect from Google Cloud Instance

I'm trying to load Tensorboard from within my google cloud VM terminal.
tensorboard --logdir logs --port 6006
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.2.1 at http://localhost:6006/ (Press CTRL+C to quit)
When I click on the link:
Chrome: I get error 400
Firefox: Error: Could not connect to Cloud Shell on port 6006. Ensure your server is listening on port 6006 and try again.
I've added a new firewall rule to allow port 6006 for IP 0.0.0.0/0, but still can't get this to work. I've tried using --bind_all too, but that doesn't work either.
From Training a Keras Model on Google Cloud ML GPU:
... To train this model now on google cloud ML engine run the below command on cloud sdk terminal
gcloud ml-engine jobs submit training JOB1 \
  --module-name=trainer.cnn_with_keras \
  --package-path=./trainer \
  --job-dir=gs://keras-on-cloud \
  --region=us-central1 \
  --config=trainer/cloudml-gpu.yaml
Once you have started the training you can watch the logs from google console. Training would take around 5 minutes and the logs should look like below. Also you would be able to view the tensorboard logs in the bucket that we had created earlier named ‘keras-on-cloud’
To visualize the training and changes graphically open the cloud shell by clicking the icon on top right for the same. Once started type the below command to start Tensorboard on port 8080.
tensorboard --logdir=gs://keras-on-cloud --port=8080
For anyone else struggling with this, I decided to output my logs to an S3 bucket, and then, rather than trying to run TensorBoard from within the GCP instance, I just ran it locally using the script below.
I needed to put this into a script rather than calling it directly from the command line because I needed my AWS credentials to be loaded. I then use subprocess to run the command-line function as normal.
The credentials are contained in an env file, loaded using python-dotenv:
from dotenv import find_dotenv, load_dotenv
import subprocess

load_dotenv(find_dotenv())

if __name__ == '__main__':
    cmd = 'tensorboard --logdir s3://path-to-s3-bucket/Logs/'
    p = subprocess.Popen(cmd, shell=True)
    p.wait()
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.1.1 at http://localhost:6006/ (Press CTRL+C to quit)

Automatically "stop" Sagemaker notebook instance after inactivity?

I have a Sagemaker Jupyter notebook instance that I keep leaving online overnight by mistake, unnecessarily costing money...
Is there any way to automatically stop the Sagemaker notebook instance when there is no activity for say, 1 hour? Or would I have to make a custom script?
You can use Lifecycle configurations to set up an automatic job that will stop your instance after inactivity.
There's a GitHub repository which has samples that you can use. In the repository, there's an auto-stop-idle script which will shut down your instance once it has been idle for more than 1 hour.
What you need to do is:
create a Lifecycle configuration using the script, and
associate the configuration with the instance. You can do this when you edit or create a Notebook instance.
If you think 1 hour is too long, you can tweak the script; the timeout value is set on a single line in it.
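If you'd rather wire this up programmatically than through the console, a rough boto3 sketch might look like the following. The configuration and notebook names are placeholders, and 'on-start.sh' stands in for the downloaded auto-stop-idle script:

import base64
import boto3

sm = boto3.client('sagemaker')

# 'on-start.sh' is the downloaded auto-stop-idle sample script
with open('on-start.sh', 'rb') as f:
    content = base64.b64encode(f.read()).decode('utf-8')

sm.create_notebook_instance_lifecycle_config(
    NotebookInstanceLifecycleConfigName='auto-stop-idle',
    OnStart=[{'Content': content}],
)

# Attach the configuration to the (stopped) notebook instance
sm.update_notebook_instance(
    NotebookInstanceName='my-notebook',
    LifecycleConfigName='auto-stop-idle',
)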
You could also use CloudWatch + Lambda to monitor Sagemaker and stop when your utilization hits a minimum. Here is a list of what's available in CW for SM: https://docs.aws.amazon.com/sagemaker/latest/dg/monitoring-cloudwatch.html.
For example, you could set a CW alarm to trigger when CPU utilization falls below ~5% for 30 minutes and have that trigger a Lambda which would shut down the notebook.
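A bare-bones sketch of the Lambda that the alarm would trigger (the notebook name is a placeholder, not part of the original answer):

import boto3

def lambda_handler(event, context):
    # Fired by the CloudWatch alarm when CPU utilization stays low;
    # stops the (placeholder-named) notebook instance.
    sm = boto3.client('sagemaker')
    sm.stop_notebook_instance(NotebookInstanceName='my-notebook')
    return {'stopped': 'my-notebook'}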
After we've burned quite a lot of money by forgetting to turn off these machines, I've decided to create a script. It's based on AWS' script, but provides an explanation why the machine was or was not killed. It's pretty lightweight because it does not use any additional infrastructure like Lambda.
Here is the script and the guide on installing it! It's just a simple lifecycle configuration!
Unfortunately, automatically stopping the Notebook Instance when there is no activity is not possible in SageMaker today. To avoid leaving them overnight, you can write a cron job to check if there's any running Notebook Instance at night and stop them if needed.
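Such a nightly check only needs a few lines of boto3. A sketch, assuming the credentials it runs under are allowed to stop notebook instances:

import boto3

sm = boto3.client('sagemaker')

# Stop every notebook instance that is still running
paginator = sm.get_paginator('list_notebook_instances')
for page in paginator.paginate(StatusEquals='InService'):
    for notebook in page['NotebookInstances']:
        name = notebook['NotebookInstanceName']
        print('Stopping', name)
        sm.stop_notebook_instance(NotebookInstanceName=name)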
SageMaker Studio Notebook Kernels can be terminated by attaching the following lifecycle configuration script to the domain.
#!/bin/bash
# This script installs the idle notebook auto-checker server extension to SageMaker Studio
# The original extension has a lab extension part where users can set the idle timeout via a Jupyter Lab widget.
# In this version the script installs the server side of the extension only. The idle timeout
# can be set via a command-line script which will also be created by this script and placed into the
# user's home folder
#
# Installing the server side extension does not require Internet connection (as all the dependencies are stored in the
# install tarball) and can be done via VPCOnly mode.
set -eux
# timeout in minutes
export TIMEOUT_IN_MINS=120
# Should already be running in user home directory, but just to check:
cd /home/sagemaker-user
# By working in a directory starting with ".", we won't clutter up users' Jupyter file tree views
mkdir -p .auto-shutdown
# Create the command-line script for setting the idle timeout
cat > .auto-shutdown/set-time-interval.sh << EOF
#!/opt/conda/bin/python
import json
import requests
TIMEOUT=${TIMEOUT_IN_MINS}
session = requests.Session()
# Getting the xsrf token first from Jupyter Server
response = session.get("http://localhost:8888/jupyter/default/tree")
# calls the idle_checker extension's interface to set the timeout value
response = session.post("http://localhost:8888/jupyter/default/sagemaker-studio-autoshutdown/idle_checker",
                        json={"idle_time": TIMEOUT, "keep_terminals": False},
                        params={"_xsrf": response.headers['Set-Cookie'].split(";")[0].split("=")[1]})
if response.status_code == 200:
    print("Succeeded, idle timeout set to {} minutes".format(TIMEOUT))
else:
    print("Error!")
    print(response.status_code)
EOF
chmod +x .auto-shutdown/set-time-interval.sh
# "wget" is not part of the base Jupyter Server image, you need to install it first if needed to download the tarball
sudo yum install -y wget
# You can download the tarball from GitHub or alternatively, if you're using VPCOnly mode, you can host on S3
wget -O .auto-shutdown/extension.tar.gz https://github.com/aws-samples/sagemaker-studio-auto-shutdown-extension/raw/main/sagemaker_studio_autoshutdown-0.1.5.tar.gz
# Or instead, could serve the tarball from an S3 bucket in which case "wget" would not be needed:
# aws s3 --endpoint-url [S3 Interface Endpoint] cp s3://[tarball location] .auto-shutdown/extension.tar.gz
# Installs the extension
cd .auto-shutdown
tar xzf extension.tar.gz
cd sagemaker_studio_autoshutdown-0.1.5
# Activate studio environment just for installing extension
export AWS_SAGEMAKER_JUPYTERSERVER_IMAGE="${AWS_SAGEMAKER_JUPYTERSERVER_IMAGE:-'jupyter-server'}"
if [ "$AWS_SAGEMAKER_JUPYTERSERVER_IMAGE" = "jupyter-server-3" ] ; then
eval "$(conda shell.bash hook)"
conda activate studio
fi;
pip install --no-dependencies --no-build-isolation -e .
jupyter serverextension enable --py sagemaker_studio_autoshutdown
if [ "$AWS_SAGEMAKER_JUPYTERSERVER_IMAGE" = "jupyter-server-3" ] ; then
conda deactivate
fi;
# Restarts the jupyter server
nohup supervisorctl -c /etc/supervisor/conf.d/supervisord.conf restart jupyterlabserver
# Waiting for 30 seconds to make sure the Jupyter Server is up and running
sleep 30
# Calling the script to set the idle-timeout and active the extension
/home/sagemaker-user/.auto-shutdown/set-time-interval.sh
Resource
https://docs.aws.amazon.com/sagemaker/latest/dg/notebook-lifecycle-config.html
https://github.com/aws-samples/sagemaker-studio-lifecycle-config-examples/blob/main/scripts/install-autoshutdown-server-extension/on-jupyter-server-start.sh

Python Crontab permanently registered

I have set up Python CronTab over SSH on cloud Linux like this:
ipython
from crontab import CronTab; cron = CronTab('user')
for job in cron: print(job)
job = cron.new(command='ipython /home/batch_query.py &>> /home/logfile_batch.txt', comment='b01')
job.day.on(21); job.enable()
cron.write('bb.tab')
for job in cron: print(job)
However, when I leave the SSH session, the cron job is no longer registered.
How do I make it permanently registered on the Linux machine?
If you write the cron out to a tab file, then you've not written it out to the user account. If you want the user to have a modified tab, you should write it back to the user's crontab.
cron = CronTab(user='user')
cron.do_some_stuff(...)
cron.write()
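Putting that together with the job from the question, a sketch that registers the entry in the user's real crontab (instead of bb.tab) would be:

from crontab import CronTab

cron = CronTab(user='user')
job = cron.new(command='ipython /home/batch_query.py &>> /home/logfile_batch.txt',
               comment='b01')
job.day.on(21)
job.enable()
cron.write()  # no filename: the job is written back to the user's crontab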

Running Python script on AWS EC2

Apologies if this is a repeat, but I couldn't find anything worthwhile to accomplish my task.
I have an instance, and I have figured out starting and stopping it using boto3, and that works. The real problem is running the script when the instance is up. I would like to wait for the script to finish and then stop the instance.
python /home/ubuntu/MyProject/TechInd/EuropeRun.py &
python /home/ubuntu/FTDataCrawlerEU/EuropeRun.py &
Reading quite a few posts leads in the direction of Lambda and AWS Beanstalk, but those don't appear simple.
Any suggestion is greatly appreciated.
Regards
DC
You can use the following code.
import boto3
import botocore
import os
from termcolor import colored
import paramiko

def stop_instance(instance_id, region_name):
    client = boto3.client('ec2', region_name=region_name)
    while True:
        try:
            client.stop_instances(
                InstanceIds=[
                    instance_id,
                ],
                Force=False
            )
        except Exception as e:
            print(e)
        else:
            break
    # Waiter to wait till instance is stopped
    waiter = client.get_waiter('instance_stopped')
    try:
        waiter.wait(
            InstanceIds=[
                instance_id,
            ]
        )
    except Exception as e:
        print(e)

def ssh_connect(public_ip, cmd):
    # Join the paths using directory name and file name, to avoid OS conflicts
    key_path = os.path.join('path_to_aws_pem', 'file_name.pem')
    key = paramiko.RSAKey.from_private_key_file(key_path)
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    # Connect/ssh to an instance
    while True:
        try:
            client.connect(hostname=public_ip, username="ubuntu", pkey=key)
            # Execute a command after connecting/ssh to an instance
            stdin, stdout, stderr = client.exec_command(cmd)
            print(stdout.read())
            # Close the client connection once the job is done
            client.close()
            break
        except Exception as e:
            print(e)

# Main/other module where you're doing other jobs:
# Get the public IP address of the EC2 instance; I assume you already have a handle to the instance.
# You can use any alternate method to fetch the public IP of your EC2 instance.
public_ip = ec2_instance.public_ip_address
# Get the instance ID of the EC2 instance, again assuming you have a handle to it
instance_id = ec2_instance.instance_id
# Command to run/execute the Python scripts
cmd = "nohup python /home/ubuntu/MyProject/TechInd/EuropeRun.py & python /home/ubuntu/FTDataCrawlerEU/EuropeRun.py &"
ssh_connect(public_ip, cmd)
print(colored('Script execution finished !!!', 'green'))
# Shut down/stop the instance
stop_instance(instance_id, region_name)
You can execute your shutdown command through python code once your script is done.
An example of using ls
from subprocess import call
call(["ls", "-l"])
But for something that simple, Lambda is much easier and more resource-efficient. You only need to upload your script to S3 and then execute the Lambda function through boto3.
Actually you can just copy paste your script code to the lambda console if you don't have any dependencies.
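Once the function exists, kicking it off from boto3 is a single call. A sketch, with a hypothetical function name:

import boto3

# 'europe-run' is a hypothetical Lambda function name
client = boto3.client('lambda')
response = client.invoke(FunctionName='europe-run', InvocationType='Event')
print(response['StatusCode'])  # 202 for asynchronous ('Event') invocations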
Some options for running the script automatically at system startup:
Call the script via the EC2 User-Data
Configure the AMI to start the script on boot via an init.d script, or an #reboot cron job.
To shut down the instance after the script is complete, add some code at the end of the script to either initiate an OS shutdown, or call the AWS API (via Boto) to stop the instance, as sketched below.
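A sketch of those last few lines, assuming the instance profile allows ec2:StopInstances (the region is a placeholder):

import boto3
import requests

# The instance can look up its own ID from the EC2 instance metadata service
instance_id = requests.get(
    'http://169.254.169.254/latest/meta-data/instance-id', timeout=2).text

# Region is a placeholder; the instance profile must allow ec2:StopInstances
ec2 = boto3.client('ec2', region_name='eu-west-1')
ec2.stop_instances(InstanceIds=[instance_id])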

aws-cli 1.2.10 cron script fails

I have a crontab that fires a PHP script that runs the AWS CLI command "aws ec2 create-snapshot".
When I run the script via the command line, the PHP script completes successfully, with the aws command returning a JSON string to PHP. But when I set up a crontab to run the PHP script, the aws command doesn't return anything.
The crontab is running as the same user as when I run the PHP script on the command line myself, so I am a bit stumped?
I had the same problem with running a Ruby script (ruby script.rb).
I replaced ruby with its full path (/sources/ruby-2.0.0-p195/ruby) and it worked.
In your case, replace "aws" with its full path. To find it:
find / -name "aws"
The reason it's necessary to specify the full path to the aws command is that cron by default runs with a very limited environment. I ran into this problem as well, and debugged it by adding this to the cron script:
set | sort > /tmp/environment.txt
I then ran the script via cron and via command line (renaming the environment file between runs) and compared them. This led me to see that I needed to set both the PATH and the AWS_DEFAULT_REGION environment variables. After doing this the script worked just fine.
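If you'd rather make those settings explicit instead of relying on the cron environment, one option (sketched here in Python rather than PHP, with placeholder values) is a small wrapper that cron calls:

import os
import subprocess

env = os.environ.copy()
env['PATH'] = '/usr/local/bin:/usr/bin:/bin'   # directory containing the aws binary
env['AWS_DEFAULT_REGION'] = 'us-east-1'        # placeholder region

# Placeholder volume ID; pass whatever your script currently snapshots
subprocess.run(
    ['aws', 'ec2', 'create-snapshot', '--volume-id', 'vol-0123456789abcdef0'],
    check=True, env=env)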