I'm writing an expect script to start an SSH tunnel.
It runs on EC2 when the instance starts, as part of a deployment that creates the script from an .ebextensions config file.
When the script is run, it always gets stuck at this point:
Enter passphrase for key '/home/ec2-user/id_data_app_rsa':
If I run the same script manually on the server it succeeds, and I can see the tunnel process running:
ps aux | grep ssh
root 19046 0.0 0.0 73660 1068 ? Ss 16:58 0:00 ssh -i /home/ec2-user/id_data_app_rsa -p222 -vfN -L 3306:X.X.X.X:3306 root@X.X.X.X
I can verify that the script is reading the SSH_PASSPHRASE correctly by printing it to the console.
set password $::env(SSH_PASSPHRASE)
send_user "retrieved env variable : $password "
This is the debug output I get from the EC2 logs:
Enter passphrase for key '/home/ec2-user/id_data_app_rsa':
interact: received eof from spawn_id exp0
I'm baffled as to why it's getting no further here when the EC2 deployer runs, but it continues normally when run manually.
This is the script in .ebextensions; the expect script itself starts at #!/usr/bin/expect:
files:
  "/scripts/createTunnel.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/expect
      exp_internal 1
      set timeout 60
      # set variables
      set password $::env(SSH_PASSPHRASE)
      send_user "retrieved env variable : $password "
      spawn -ignore HUP ssh -i /home/ec2-user/id_data_app_rsa -p222 -vfN -L 3306:X.X.X.X:3306 root@X.X.X.X
      expect {
        "(yes/no)?" { send "yes\n" }
        -re "(.*)assphrase" { sleep 1; send -- "$password\n" }
        -re "(.*)data_app_rsa" { sleep 1; send -- "$password\n" }
        -re "(.*)assword:" { sleep 1; send -- "$password\n" }
        timeout { send_user "un-able to login: timeout\n"; return }
        "denied" { send_user "\nFatal Error: denied \n" }
        eof { send_user "Closed\n"; return }
      }
      interact
We finally resolved this. There were two things that seemed to be at issue:

1. Changing the final interact to expect eof.
2. Trimming down the expect pattern matching as much as possible.

We noticed in testing that expect seemed to be matching falsely, sending the password, for example, when it should have been sending "yes" in response to the yes/no prompt.

This is the final script we ended up with, in case it's useful to anyone else:
#!/usr/bin/expect
exp_internal 1
set timeout 60
# set variables
set password $::env(SSH_TUNNEL_PASSPHRASE)
spawn -ignore HUP ssh -i /home/ec2-user/id_data_rsa -p222 -vfN -L 3306:X.X.X.X:3306 root@X.X.X.X
expect {
  "(yes/no)?" { send "yes\r" }
  "Enter passphrase" { sleep 2; send -- "$password\r"; sleep 2; exit }
}
expect eof
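For reference, a minimal invocation sketch (the passphrase value is a placeholder; the script path and variable name come from the config above):

# export the passphrase the script reads, then run it
export SSH_TUNNEL_PASSPHRASE='<passphrase>'
/scripts/createTunnel.sh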
Your problem is here:
set password $::env(SSH_PASSPHRASE)
and the way the shell handles environment variables. When the script is invoked, you assume your environment variables are set, but depending on how it is invoked, $::env(SSH_PASSPHRASE) may not be, leaving the variable null/blank. When init scripts (or cloud-init) run, they do not run with the environment of a login shell, so you should not assume that variables from .profile or /etc/profile are set; source or set them explicitly.
A possible solution may be to source the profile before running the script:

. ~ec2-user/.profile
/path/to/above.script
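Put into a standalone wrapper, a minimal sketch (the wrapper file itself is hypothetical; it assumes .profile exports SSH_PASSPHRASE):

#!/bin/sh
# load the login-shell environment, then hand off to the expect script
. ~ec2-user/.profile           # expected to export SSH_PASSPHRASE (assumption)
exec /scripts/createTunnel.sh  # the expect script from the question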
Related
I'm attempting to create an SSH tunnel when deploying an application to AWS Elastic Beanstalk. I want the tunnel to be a background process that is always connected once the application deploys. The script hangs forever during the deployment and I can't see why.
"/home/ec2-user/eclair-ssh-tunnel.sh":
mode: "000500" # u+rx
owner: root
group: root
content: |
cd /root
eval $(ssh-agent -s)
DISPLAY=":0.0" SSH_ASKPASS="./askpass_script" ssh-add eclair-test-key </dev/null
# we want this command to keep running in the backgriund
# so we add & at then end
nohup ssh -L 48682:localhost:8080 ubuntu#[host...] -N &
and here is the output I'm getting from /var/log/eb-activity.log:
[2019-06-14T14:53:23.268Z] INFO [15615] - [Application update suredbits-api-root-0.37.0-testnet-ssh-tunnel-fix-port-9#30/AppDeployStage1/AppDeployPostHook/01_eclair-ssh-tunnel.sh] : Starting activity...
The ssh tunnel is spawned, and I can find it by doing:
[ec2-user@ip-172-31-25-154 ~]$ ps aux | grep 48682
root 16047 0.0 0.0 175560 6704 ? S 14:53 0:00 ssh -L 48682:localhost:8080 ubuntu@ec2-34-221-186-19.us-west-2.compute.amazonaws.com -N
If I kill that process, the deployment continues as expected, which indicates that the bug is in the tunnel script. I can't seem to find out where, though.
You need to add the -n option to ssh when running it in the background, to keep it from reading stdin.
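Applied to the script above (a sketch; the bracketed host placeholder is kept from the question):

# -n redirects stdin from /dev/null, so the backgrounded ssh
# no longer holds the deployment hook open waiting for input
nohup ssh -n -L 48682:localhost:8080 ubuntu@[host...] -N &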
I am in the process of trying to automate deployment to an AWS server as a cool project for my coding course. I'm using a shell script to automate different processes, but when it connects to the AWS EC2 Ubuntu server, it will not run any other shell command until I close the connection. Is there any way to have it continue sending commands while connected?
read -p "Enter Key Name: " KEYNAME
read -p "Enter Server IP With Dashes: " IPWITHD
chmod 400 $KEYNAME.pem
ssh -i "$KEYNAME.pem" ubuntu#ec2-$IPWITHD.us-east-2.compute.amazonaws.com
ANYTHING HERE AND BELOW WILL NOT RUN UNTIL SERVER IS DISCONNECTED
A couple of basic points:
A shell script is a sequential set of commands for the shell to execute. It runs a program, waits for it to exit, and then runs the next one.
The ssh program connects to the server and tells it what to do. Once it exits, you are no longer connected to the server.
The commands that you put after ssh will only run once ssh exits, and they will then run on your local machine, not on the server you sshed into.
So what you want to do instead is to run ssh and tell it to run a set of steps on the server, and then exit.
Look at man ssh. It says:
ssh destination [command]
If a command is specified, it is executed on the remote host instead of a login shell.
So, to run a command like echo hi, you use ssh like this:
ssh -i "$KEYNAME.pem" ubuntu#ec2-$IPWITHD.us-east-2.compute.amazonaws.com "echo hi"
Or, for longer commands, use a bash heredoc:
ssh -i "$KEYNAME.pem" ubuntu#ec2-$IPWITHD.us-east-2.compute.amazonaws.com <<EOF
echo "this will execute on the server"
echo "so will this"
cat /etc/os-release
EOF
Or, put all those commands in a separate script and pipe it to ssh:
cat commands-to-execute-remotely.sh | ssh -i "$KEYNAME.pem" ubuntu@ec2-$IPWITHD.us-east-2.compute.amazonaws.com
Definitely read "What is the cleanest way to ssh and run multiple commands in Bash?" and its answers.
This has been irritating me for the past hour. I use Ansible's expect module to answer a command prompt, namely:
Re-format filesystem in Storage Directory /mnt/ephemeral-hdfs/dfs/name ? (Y or N)
for which I want to reply
Y
This should work according to standard regex matching and this other Stack Overflow question:
- name: Run Spark Cluster script
  expect:
    command: /home/ubuntu/cluster_setup/scripts/shell/utils-cluster_launcher-start_spark.sh
    responses:
      "Re-format filesystem": "Y"
    timeout: 600
    echo: yes
The issue I am facing is that when it reaches the point where it expects keyboard input, it doesn't get anything, and therefore it hangs. There is no error output as such; it just stays still.
Any ideas how to fix this?
The task from the question works properly on the data included in the question:
---
- hosts: localhost
  gather_facts: no
  connection: local
  tasks:
    - name: Run script producing the same prompt as Spark Cluster script
      expect:
        command: ./prompt.sh
        responses:
          "Re-format filesystem": "Y"
        timeout: 600
        echo: yes
      register: prompt
    - debug:
        var: prompt.stdout_lines
Contents of ./prompt.sh:
#!/bin/bash
read -p "Re-format filesystem in Storage Directory /mnt/ephemeral-hdfs/dfs/name ? (Y or N) " response
echo pressed: $response
Result:
PLAY [localhost] ***************************************************************

TASK [Run script producing the same prompt as Spark Cluster script] ************
changed: [localhost]

TASK [debug] *******************************************************************
ok: [localhost] => {
    "prompt.stdout_lines": [
        "Re-format filesystem in Storage Directory /mnt/ephemeral-hdfs/dfs/name ? (Y or N) Y",
        "pressed: Y"
    ]
}

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=0
The Ansible documentation for expect does not have quotes around the regex in the example.
# Case insensitive password string match
- expect:
    command: passwd username
    responses:
      (?i)password: "MySekretPa$$word"
Maybe try:
Re-format\sfilesystem: "Y"
I know this is old, but I had the same trouble with this module and these answers didn't help. I did eventually find my own solutions and thought I'd save people some time.

First: the timeout in the poster's example is 10 minutes. Though that makes sense for a reformat, it means you need to wait 10 minutes before the task fails, e.g. if it is stuck waiting for a response to "Are you sure?". When debugging, keep the timeout low, and if you can't, wait patiently.
Second: the fields in responses are processed in alphabetical order, so

responses:
  "Test a specific string": "Specific"
  "Test": "General"

will always answer ALL prompts containing Test with General, since "Test" comes first alphabetically in the responses map.
Third (following on): this caught me out because in my case expect was simply hitting Enter at the prompt, and the script kept asking again for valid data. My timeout therefore never fired and nothing was returned, so I saw no response from the module; it just hung. The solution in this case is to go to the server you are provisioning with Ansible, find the command Ansible is running with ps, and kill it. That lets Ansible collect the output and show you where it was stuck in an infinite loop.
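For example, on the target host (a sketch; the script name is the one from the question, and <pid> is a placeholder):

# find the hung command Ansible spawned, then kill it so the task returns
ps aux | grep utils-cluster_launcher-start_spark.sh
kill <pid>   # <pid> comes from the ps output above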
I tried the PowerShell script below on startup (passed as user data):
<powershell>
$instancePublicIp = (wget http://169.254.169.254/latest/meta-data/public-ipv4).Content
while ($instancePublicIp.length -eq 0)
{
    Start-Sleep -s 40
    $instancePublicIp = (wget http://169.254.169.254/latest/meta-data/public-ipv4).Content
    Add-Content c:\debug.txt "public ip : $($instancePublicIp) - "
}
</powershell>
I verified the script is running because debug.txt exists on the C: drive, but the public IP comes back empty.
I also tried sleeping for 40 seconds to allow the instance to start and the network adapter to connect, with no luck.
The same script runs and gives back the proper IP address when I run it manually, though.
Any thoughts?
I am using Ansijet to automate an Ansible playbook so it runs on a button click. The playbook stops running instances on AWS. Run manually from the command line, the playbook works well and does its tasks, but when run through Ansijet's web interface, the following error is encountered:
Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the remote directory. Consider changing the remote temp path in ansible.cfg to a path rooted in "/tmp". Failed command was: mkdir -p $HOME/.ansible/tmp/ansible-tmp-1390414200.76-192986604554742 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1390414200.76-192986604554742 && echo $HOME/.ansible/tmp/ansible-tmp-1390414200.76-192986604554742, exited with result 1:
The following is the ansible.cfg configuration:
[defaults]
# some basic default values...
inventory = /etc/ansible/hosts
#library = /usr/share/my_modules/
remote_tmp = $HOME/.ansible/tmp/
pattern = *
forks = 5
poll_interval = 15
sudo_user = root
#ask_sudo_pass = True
#ask_pass = True
transport = smart
#remote_port = 22
module_lang = C
I tried changing the remote_tmp path to /home/ubuntu/.ansible/tmp, but I still get the same error.
By default, Ansible connects to remote servers as the same user it runs as locally. In the case of Ansijet, it will try to connect to remote servers as whatever user started Ansijet's node.js process. You can override this by specifying remote_user in a playbook, or globally in the ansible.cfg file.
Ansible will try to create the temp directory if it doesn't already exist, but will be unable to if that user does not have a home directory or if their home directory permissions do not allow them write access.
I actually changed the temp directory in my ansible.cfg file to point to a location in /tmp, which works around these sorts of issues.
remote_tmp = /tmp/.ansible-${USER}/tmp
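A quick hedged check that the connecting user can actually create that path (user and host are placeholders):

# confirm the override path is writable by the connecting user;
# single quotes let $USER expand on the remote side
ssh <user>@<host> 'mkdir -p /tmp/.ansible-$USER/tmp && echo writable'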
I faced the same problem a while ago and solved it like this. One possible cause is that the remote server's /tmp directory does not grant write permission. Run ls -ld /tmp and make sure its output looks something like this:

drwxrwxrwt 7 root root 20480 Feb 4 14:18 /tmp

Here root is the superuser and /tmp has 1777 permissions.
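If it doesn't look like that, a one-liner (run as root on the remote host) restores the standard mode:

# 1777 = world-writable with the sticky bit, the normal mode for /tmp
chmod 1777 /tmp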
Also, for me, simply

remote_tmp = /tmp

worked well.
Another check is to make sure $HOME is set in the shell you are running through. Ansible runs commands via /bin/sh, not /bin/bash, so make sure $HOME is set in the sh shell.
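A quick way to verify (run as the connecting user on the remote host):

# print what sh sees for HOME; empty output means it is not set
sh -c 'echo "HOME=$HOME"'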
In my case, I needed to log in to the server for the first time and change the default password.
Check the ansible user on the remote/client machine, as this error also occurs when that user's password has expired there.
==========
'WARNING: Your password has expired.\nPassword change required but no TTY available.\n')
<*.*.*.*> Failed to connect to the host via ssh: WARNING: Your password has expired.
Password change required but no TTY available.

Actual error:

host_name | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to create temporary directory.In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \"/tmp\", for more error information use -vvv. Failed command was: ( umask 77 && mkdir -p \"` echo /tmp/ansible-$USER `\"&& mkdir /tmp/ansible-$USER/ansible-tmp-1655256382.78-15189-162690599720687 && echo ansible-tmp-1655256382.78-15189-162690599720687=\"` echo /tmp/ansible-$USER/ansible-tmp-1655256382.78-15189-162690599720687 `\" ), exited with result 1",
    "unreachable": true
}
==========
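A hedged way to confirm this on the remote machine (the user name is a placeholder):

# show password aging info; check the "Password expires" line
chage -l <ansible_user>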
This happens mainly because there is no home directory on the remote server for the user.
The following steps resolved the issue for me:

Log in to the remote server.
Switch to root.
If the user the host (in my case, Ansible) connects as is linux_user, run the following commands:
mkdir /home/linux_user
chown linux_user:linux_user /home/linux_user
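And a quick hedged check that the fix worked (run on the remote server):

# as the connecting user, confirm the new home directory is writable
sudo -u linux_user sh -c 'mkdir -p /home/linux_user/.ansible/tmp && echo ok'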