Subprocess Error when Launching ec2 cluster instance - amazon-web-services

I am getting subprocess error when launching ec2 cluster instance.
The terminal lags on
Waiting for cluster to enter 'ssh-ready' state'
when running
./spark-ec2 --key-pair=ru_spark --identity-file=ru_spark.pem --region=us-east-1 --zone=us-east-1a launch mycluster
Console:
Warning: Permanently added 'ec2-52-87-225-32.compute-1.amazonaws.com,52.87.225.32' (RSA) to the list of known hosts.
Connection to ec2-52-87-225-32.compute-1.amazonaws.com closed.
Warning: Permanently added 'ec2-52-87-225-32.compute-1.amazonaws.com,52.87.225.32' (RSA) to the list of known hosts.
Transferring cluster's SSH key to slaves...
ec2-34-207-153-79.compute-1.amazonaws.com
Warning: Permanently added 'ec2-34-207-153-79.compute-1.amazonaws.com,34.207.153.79' (RSA) to the list of known hosts.
Cloning spark-ec2 scripts from https://github.com/amplab/spark-ec2/tree/branch-1.6 on master...
Warning: Permanently added 'ec2-52-87-225-32.compute-1.amazonaws.com,52.87.225.32' (RSA) to the list of known hosts.
Cloning into 'spark-ec2'...
error: Peer reports incompatible or unsupported protocol version. while accessing https://github.com/amplab/spark-ec2/info/refs?service=git-upload-pack
fatal: HTTP request failed
Connection to ec2-52-87-225-32.compute-1.amazonaws.com closed.
Error executing remote command, retrying after 30 seconds: Command '['ssh', '-o', 'StrictHostKeyChecking=no', '-o', 'UserKnownHostsFile=/dev/null', '-i', 'ru_spark.pem', '-t', '-t', u'root#ec2-52-87-225-32.compute-1.amazonaws.com', 'rm -rf spark-ec2 && git clone https://github.com/amplab/spark-ec2 -b branch-1.6 spark-ec2']' returned non-zero exit status 128
I updated to curl ssl, changed file permissions to 400 and 600 for ru_spark.pem but neither have helped solve the issue.

Related

Ansible GCP dynamic inventory Failed to connect to the host via ssh Permission denied (publickey)

Configuration
I followed the steps in the below links to set up my GCP dynamic inventory.
https://docs.ansible.com/ansible/latest/scenario_guides/guide_gce.html
http://matthieure.me/2018/12/31/ansible_inventory_plugin.html
In short, it was the below steps
I installed the needed requisites.
$ pip install requests google-auth1
I created a service account with sufficient privileges. and set it's
credentials.
I added the below to the /etc/ansible/ansible.cfg file
[inventory]
enable_plugins = gcp_compute
I created a file called hosts.gcp.yml which holds the dynamic inventory setup (as shown below):
projects:
- my-project-id
hostnames:
- name
filters: []
auth_kind: serviceaccount
service_account_file: my/credentials_path.json
keyed_groups:
- key: zone
and tried to run the below command which worked fine
macbook#MacBooks-MacBook-Pro Ansible % ansible-inventory --graph -i hosts.gcp.yml
#all:
|--#_us_central1_a:
| |--test
|--#ungrouped:
but when running the below command I got the following errors
macbook#MacBooks-MacBook-Pro Ansible % ansible -i hosts.gcp.yml all -m ping
test | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname test: nodename nor servname provided, or not known",
"unreachable": true
}
I then commented out the - name option from the hosts.gcp.yml file but got another error.
macbook#MacBooks-MacBook-Pro Ansible % ansible -i hosts.gcp.yml all -m ping
34.X.X.8 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: macbook#34.X.X.8: Permission denied (publickey).",
"unreachable": true
}
This raises the following questions
1- Is an SSH setup (creating users and copying ssh-keys) needed on the host machines when using dynamic Inventories (I don't think so)?
2- Why is ansible resorting to SSH though a dynamic Inventory is set? What if the host didn't expose SSH to the public or didn't have a public IP?
Your kind support is highly appreciated.
Thanks.
A more verbose output of the test
macbook#MacBooks-MacBook-Pro Ansible % ansible -i hosts.gcp.yml all -vvv -m ping
ansible [core 2.11.6]
config file = /etc/ansible/ansible.cfg
configured module search path = ['/Users/macbook/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/Cellar/ansible/4.7.0/libexec/lib/python3.9/site-packages/ansible
ansible collection location = /Users/macbook/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/local/bin/ansible
python version = 3.9.7 (default, Oct 13 2021, 06:45:31) [Clang 13.0.0 (clang-1300.0.29.3)]
jinja version = 3.0.2
libyaml = True
Using /etc/ansible/ansible.cfg as config file
redirecting (type: inventory) ansible.builtin.gcp_compute to google.cloud.gcp_compute
Parsed /Users/macbook/xxxx/Projects/xxxx/Ansible/hosts.gcp.yml inventory source with ansible_collections.google.cloud.plugins.inventory.gcp_compute plugin
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.
META: ran handlers
<34.132.201.8> ESTABLISH SSH CONNECTION FOR USER: None
<34.132.201.8> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o ControlPath=/Users/macbook/.ansible/cp/026bb454d7 34.132.201.8 '/bin/sh -c '"'"'echo ~ && sleep 0'"'"''
<34.X.X.8> (255, b'', b'macbook#34.X.X.8: Permission denied (publickey).\r\n')
34.X.X.8 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: macbook#34.X.X.8: Permission denied (publickey).",
"unreachable": true
}
macbook#MacBooks-MacBook-Pro Ansible % ansible -i hosts.gcp.yml all -u ansible -vvv -m ping
ansible [core 2.11.6]
config file = /etc/ansible/ansible.cfg
configured module search path = ['/Users/macbook/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/Cellar/ansible/4.7.0/libexec/lib/python3.9/site-packages/ansible
ansible collection location = /Users/macbook/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/local/bin/ansible
python version = 3.9.7 (default, Oct 13 2021, 06:45:31) [Clang 13.0.0 (clang-1300.0.29.3)]
jinja version = 3.0.2
libyaml = True
Using /etc/ansible/ansible.cfg as config file
redirecting (type: inventory) ansible.builtin.gcp_compute to google.cloud.gcp_compute
Parsed /Users/macbook/xxxx/Projects/xxx/Ansible/hosts.gcp.yml inventory source with ansible_collections.google.cloud.plugins.inventory.gcp_compute plugin
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.
META: ran handlers
<34.132.201.8> ESTABLISH SSH CONNECTION FOR USER: ansible
<34.132.201.8> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="ansible"' -o ConnectTimeout=10 -o ControlPath=/Users/macbook/.ansible/cp/46d2477dfb 34.132.201.8 '/bin/sh -c '"'"'echo ~ansible && sleep 0'"'"''
<34.X.X.8> (255, b'', b'ansible#34.X.X.8: Permission denied (publickey).\r\n')
34.X.X.8 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: ansible#34.X.X.8: Permission denied (publickey).",
"unreachable": true
}
Dynamic inventory used only for collect data of your machines. If you want to get access into it, you should use SSH.
You must add your ssh-public key into VM's config and specify username
Add these lines in your ansible.cfg into the [defaults] section:
host_key_checking = false
remote_user = <username that you specify in VM's config>
private_key_file = <path to private ssh-key>
Most probably Ansible can't establish ssh connection to the hosts (listed in hosts.gcp.yml) because they don't recognize ssh key of the machine that tries to ping them.
Since you're using a macbook it's clear it's not a GCP VM. This means your GCP VM's don't have it's public ssh key by default.
You can add your macboook's key (found in ~ssh/id_rsa.pub) to the list of authorized keys that all GCP VM's will accept without any action on your side.
As for the first question - it's clearly DNS issue - however I'm not versed enough with this tool so You'd have tell if you can ping all the VM's using their DNS names directly from your mac's terminal. If so then the issue will be with Ansible configuration - otherwise it's DNS issue that prevent's your computer from using DNS names of your VM's.
Additionally - ansible-inventory --graph i /file/path works "offline" and will only show the structure of your inventory regardles if it exists or works.
There are a couple of points in your question, one about inventory and one about connections.
Inventory
Your hosts.gcp.yml file is for a dynamic inventory plugin, as you said. What that means is that Ansible will run the GCP inventory plugin using the settings in that file, and the plugin will call GCP's API and generate a list of hosts to use as inventory. What the ansible-inventory command returns is what the ansible command will use also. In the example bit of output you pasted into your question, it looks like "test" is the only host it sees.
Connections
When you run the ansible command it will run the module against each host. It will first get the hostname returned by inventory, and then connect to that host using the transport type you specified. This is true even for the ping module. From the ping module's doc page: "This is NOT ICMP ping, this is just a trivial test module that requires Python on the remote-node." Meaning, it makes a connection.
Potential Gotchas
Is inventory returning the correct hostname for your environment?
What is the connection type you're using?
As for hostname, you set "hostnames" to "name" in your inventory file. Just be sure that's right. It might not be in your case.
As for connection type, if you haven't configured it, then by default it will be "smart", which uses SSH. You can find what you're using by doing this:
ansible-config dump | grep DEFAULT_TRANSPORT
You can change the connection type with the --connection option to the ansible command, or any of the other ways ansible lets you specify config options. Connection type is set independently from inventory type. They are two separate steps. The connection type is set via config or the command line option and is not based on what inventory plugin you're using.
Your Problem
To resolve your problem, figure out what hostnames ansible-inventory is actually returning, and what connection type you're using. Then see if you can connect to that hostname using that connection type. If the hostname being returned is "test" and your connection type is "smart" or "ssh", then try actually connecting with ssh to "test". From the command line, literally do ssh test. If that succeeds, then ansible should successfully connect to that host when it's run. If that doesn't succeed, then you have to do whatever you need to do to fix it in order for ansible to run successfully. Likewise, if you set a connection plugin different from SSH, then you should try to connect to your host using whatever that connection method uses in order to ensure that those types of connections are actually working.
More info about all this can be found in ansible's user guide. See, for example, "Connecting to remote nodes".

Not able to access HDFS

I installed cloudera vm and started trying some basic stuff. First I just wanted to ls the hdfs directoires. so I issued the below command.
[cloudera#quickstart ~]$ hadoop fs -ls /
ls: Failed on local exception: java.net.SocketException: Network is unreachable; Host Details : local host is: "quickstart.cloudera/10.0.2.15"; destination host is: "quickstart.cloudera":8020;
though ps -fu hdfs says both namenode and data node is running. I checked the status using the service command.
[cloudera#quickstart ~]$ sudo service hadoop-hdfs-namenode status
Hadoop namenode is not running [FAILED]
Thinking all the problems will be resolved if I restart all the services, I executed the below command.
[cloudera#quickstart conf]$ sudo /home/cloudera/cloudera-manager --express --force
[QuickStart] Shutting down CDH services via init scripts...
[QuickStart] Disabling CDH services on boot...
[QuickStart] Starting Cloudera Manager daemons...
[QuickStart] Waiting for Cloudera Manager API...
[QuickStart] Configuring deployment...
Submitted jobs: 92
[QuickStart] Deploying client configuration...
Submitted jobs: 93
[QuickStart] Starting Cloudera Management Service...
Submitted jobs: 101
[QuickStart] Enabling Cloudera Manager daemons on boot...
Now I thought all services will be up so again checked the status of namenode service. Again it came failed.
[cloudera#quickstart ~]$ sudo service hadoop-hdfs-namenode status
Hadoop namenode is not running [FAILED]
Now I decided to manually stop and start the namenode service. Again not much use.
[cloudera#quickstart ~]$ sudo service hadoop-hdfs-namenode stop
no namenode to stop
Stopped Hadoop namenode: [ OK ]
[cloudera#quickstart ~]$ sudo service hadoop-hdfs-namenode status
Hadoop namenode is not running [FAILED]
[cloudera#quickstart ~]$ sudo service hadoop-hdfs-namenode start
starting namenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-namenode-quickstart.cloudera.out
Failed to start Hadoop namenode. Return value: 1 [FAILED]
I checked the file /var/log/hadoop-hdfs/hadoop-hdfs-namenode-quickstart.cloudera.out . It just said below
log4j:ERROR Could not find value for key log4j.appender.RFA
log4j:ERROR Could not instantiate appender named "RFA".
I also checked /var/log/hadoop-hdfs/hadoop-cmf-hdfs-NAMENODE-quickstart.cloudera.log.out . Found below when I searched for error. Can anyone please suggest me what is the best way to get the services back on track. Unfortunately I am not able to access cloudera manager from browser. Anything that I can do from command line?
2016-02-24 21:02:48,105 WARN com.cloudera.cmf.event.publish.EventStorePublisherWithRetry: Failed to publish event: SimpleEvent{attributes={ROLE_TYPE=[NAMENODE], CATEGORY=[LOG_MESSAGE], ROLE=[hdfs-NAMENODE], SEVERITY=[IMPORTANT], SERVICE=[hdfs], HOST_IDS=[quickstart.cloudera], SERVICE_TYPE=[HDFS], LOG_LEVEL=[WARN], HOSTS=[quickstart.cloudera], EVENTCODE=[EV_LOG_EVENT]}, content=Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!, timestamp=1456295437905} - 1 of 17 failure(s) in last 79302s
java.io.IOException: Error connecting to quickstart.cloudera/10.0.2.15:7184
at com.cloudera.cmf.event.shaded.org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:249)
at com.cloudera.cmf.event.shaded.org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:198)
at com.cloudera.cmf.event.shaded.org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:133)
at com.cloudera.cmf.event.publish.AvroEventStorePublishProxy.checkSpecificRequestor(AvroEventStorePublishProxy.java:122)
at com.cloudera.cmf.event.publish.AvroEventStorePublishProxy.publishEvent(AvroEventStorePublishProxy.java:196)
at com.cloudera.cmf.event.publish.EventStorePublisherWithRetry$PublishEventTask.run(EventStorePublisherWithRetry.java:242)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketException: Network is unreachable
You can try this:
check witch process is using the port 7184 of namenode (i.e netstat linux command)
and kill that and then restart
Or
change you namenode port from conf and restart hadoop

Why does Kubernetes apiserver present a bad certificate to the etcd server?

Running Kubernetes on CoreOS on an AWS EC2 instance, I am unable to execute apiserver via a hyperkube Docker container successfully. The problem is that the etcd server refuses connections due to a bad certificate.
What happens is this:
$ docker run -v /etc/ssl/etcd:/etc/ssl/etcd:ro gcr.io/google_containers/hyperkube:v1.1.2 /hyperkube apiserver --bind-address=0.0.0.0 --insecure-bind-address=127.0.0.1 --etcd-servers=https://172.31.29.111:2379 --allow-privileged=true --service-cluster-ip-range=10.3.0.0/24 --secure-port=443 --advertise-address=172.31.29.111 --admission-control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota --tls-cert-file=/etc/ssl/etcd/master1-master-client.pem --tls-private-key-file=/etc/ssl/etcd/master1-master-client-key.pem --client-ca-file=/etc/ssl/etcd/ca.pem --kubelet-certificate-authority=/etc/ssl/etcd/ca.pem --kubelet-client-certificate=/etc/ssl/etcd/master1-master-client.pem --kubelet-client-key=/etc/ssl/etcd/master1-master-client-key.pem --kubelet-https=true
I0227 17:07:34.117098 1 plugins.go:71] No cloud provider specified.
I0227 17:07:34.549806 1 master.go:368] Node port range unspecified. Defaulting to 30000-32767.
[restful] 2016/02/27 17:07:34 log.go:30: [restful/swagger] listing is available at https://172.31.29.111:443/swaggerapi/
[restful] 2016/02/27 17:07:34 log.go:30: [restful/swagger] https://172.31.29.111:443/swaggerui/ is mapped to folder /swagger-ui/
E0227 17:07:34.659701 1 cacher.go:149] unexpected ListAndWatch error: pkg/storage/cacher.go:115: Failed to list *api.Pod: 501: All the given peers are not reachable (failed to propose on members [https://172.31.29.111:2379] twice [last error: Get https://172.31.29.111:2379/v2/keys/registry/pods?quorum=false&recursive=true&sorted=true: remote error: bad certificate]) [0]
The certificate should be good though. If I execute an interactive shell within that Docker image, I can get the etcd URL via curl without any issues. So, what is going wrong in this case and how do I fix it?
I found I could solve this by using --etcd-config instead of --etcd-servers:
docker run -p 443:443 -v /etc/kubernetes:/etc/kubernetes:ro -v /etc/ssl/etcd:/etc/ssl/etcd:ro gcr.io/google_containers/hyperkube:v1.1.2 /hyperkube apiserver --bind-address=0.0.0.0 --insecure-bind-address=127.0.0.1 --etcd-config=/etc/kubernetes/etcd.client.conf --allow-privileged=true --service-cluster-ip-range=10.3.0.0/24 --secure-port=443 --advertise-address=172.31.29.111 --admission-control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota --kubelet-certificate-authority=/etc/ssl/etcd/ca.pem --kubelet-client-certificate=/etc/ssl/etcd/master1-master-client.pem --kubelet-client-key=/etc/ssl/etcd/master1-master-client-key.pem --client-ca-file=/etc/ssl/etcd/ca.pem --tls-cert-file=/etc/ssl/etcd/master1-master-client.pem --tls-private-key-file=/etc/ssl/etcd/master1-master-client-key.pem
etcd.client.conf:
{
"cluster": {
"machines": [ "https://172.31.29.111:2379" ]
},
"config": {
"certFile": "/etc/ssl/etcd/master1-master-client.pem",
"keyFile": "/etc/ssl/etcd/master1-master-client-key.pem"
}
}

Docker Private Registry: ping attempt failed

I'm trying to set up my private Docker Registry and I'm following the official documentation.
I have installed Docker and I'm able to run my registry on my server. But I want my registry to be more widely available.
My docker-server with the private registry is installed on an AWS-instance.
I have created my own certificate and key by using keytool:
docker run -d -p 5000:5000 --restart=always --name registry \
-v `pwd`/certs:/certs \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
registry:2
I'm able to ping this instance by:
ping ec2-xx-xx-xx-xx.xx-west/east-1.compute.amazonaws.com
But pushing is not possible:
The push refers to a repository [ec2-xx-xx-xx-xx.compute.amazonaws.com:5000/ubuntu] (len: 1)
unable to ping registry endpoint https://ec2-xx-xx-xx-xx.compute.amazonaws.com:5000/v0/
v2 ping attempt failed with error: Get https://ec2-xx-xx-xx-xx.compute.amazonaws.com:5000/v2/: dial tcp 10.x.x.x:5000: i/o timeout
v1 ping attempt failed with error: Get https://ec2-xx-xx-xx-xx.amazonaws.com:5000/v1/_ping: dial tcp 10.0.x.x:5000: i/o timeout
EDIT1:
After changing my aws-security group. Set port 5000 to TCP, the error changed:
unable to ping registry endpoint https://ec2-xx-xx-xx-xx.compute.amazonaws.com:5000/v0/
v2 ping attempt failed with error: Get https://ec2-xx-xx-xx-xx.compute.amazonaws.com:5000/v2/: dial tcp 10.0.x.x:5000: connection refused
v1 ping attempt failed with error: Get https://ec2-xx-xx-xx-xx.compute.amazonaws.com:5000/v1/_ping: dial tcp 10.0.x.x:5000: connection refused
How do I have to make my registry accessible for other aws-instances?
My docker logs are showing the following. They can't find my certificate.
level=fatal msg="open /certs/domain.crt: no such file or directory"
Do I have to put this certificate in my container itself? (and generate it with keytool by myself or using an existing)
EDIT2:
I've generated my own certificates using this documentation.
After generating the certificates I did restart my docker daemon. I did not perform the copy of domain.crt to ca.crt because the path didn't exist. Maybe I have to create it by myself?
new error:
unable to ping registry endpoint https://ec2-xx-xx-xx-xx.compute.amazonaws.com:5000/v0/
v2 ping attempt failed with error: Get https://ec2-xx-xx-xx-xx.compute.amazonaws.com:5000/v2/: dial tcp 10.0.x.x:5000: no route to host
v1 ping attempt failed with error: Get https://ec2-xx-xx-xx-xx.compute.amazonaws.com:5000/v1/_ping: dial tcp 10.0.x.x:5000: no route to host
But I still get the following in my docker logs:
level=fatal msg="open /certs/domain.crt: no such file or directory"
After trying to perform a push, there is created a new /certs folder into my existing certsfolder
EDIT3:
After finding the right directory for my certificate (/home/centos/certs/certs/*.). I get the following error:
level=fatal msg="open /certs/domain.crt: permission denied
Even if I perform a chmod -R 777 and chown -R root:root
You will need to place the certificate in this directory.
/etc/docker/certs.d/<your-domain-name>:5000/ca.crt

Why am I getting a permission denied error on docker/aws eb?

I don't know why but I cannot seem to figure out why this is happening. I can build and run the docker image locally.
Recent Events:
2015-05-25 12:57:07 UTC+1000 ERROR Update environment operation is complete, but with errors. For more information, see troubleshooting documentation.
2015-05-25 12:57:07 UTC+1000 INFO New application version was deployed to running EC2 instances.
2015-05-25 12:57:04 UTC+1000 INFO Command execution completed on all instances. Summary: [Successful: 0, Failed: 1].
2015-05-25 12:57:04 UTC+1000 ERROR [Instance: i-4775ec9b] Command failed on instance. Return code: 1 Output: (TRUNCATED)... run Docker container: vel="fatal" msg="Error response from daemon: Cannot start container 02c057b331bf3a3d912bf064f1dca3e00c95746b5748c3c4a28a5c6b452ff335: [8] System error: exec: \"bin/app\": permission denied" . Check snapshot logs for details. Hook /opt/elasticbeanstalk/hooks/appdeploy/pre/04run.sh failed. For more detail, check /var/log/eb-activity.log using console or EB CLI.
2015-05-25 12:57:03 UTC+1000 ERROR Failed to run Docker container: vel="fatal" msg="Error response from daemon: Cannot start container 02c057b331bf3a3d912bf064f1dca3e00c95746b5748c3c4a28a5c6b452ff335: [8] System error: exec: \"bin/app\": permission denied" . Check snapshot logs for details.
Dockerfile:
FROM java:8u45-jre
MAINTAINER Terence Munro <terry#zenkey.com.au>
ADD ["opt", "/opt"]
WORKDIR /opt/docker
RUN ["chown", "-R", "daemon:daemon", "."]
USER daemon
ENTRYPOINT ["bin/app"]
EXPOSE 9000
Dockerrun.aws.json:
{
"AWSEBDockerrunVersion": "1",
"Ports": [
{
"ContainerPort": "9000"
}
],
"Volumes": []
}
Additional logs as attachment at: https://forums.aws.amazon.com/thread.jspa?threadID=181270
Any help is extremely appreciated.
#nick-humrich suggestion of trying eb local run worked. So using eb deploy ended up working.
I had previously been uploading through the web interface.
Initially using eb deploy was giving me a ERROR: TypeError :: data must be a byte string but I found this issue which was resolved by uninstalling pyopenssl.
So I don't know why the web interface was giving me permission denied perhaps something to do with the zip file?
But anyway I'm able to deploy now thank you.
I had a similar problem running Docker on Elastic Beanstalk. When I pointed CMD in the Dockerfile to a shell script (/path/to/my_script.sh), the EB deployment would fail with
/path/to/my_script.sh: Permission denied.
Apparently, even though I had run RUN chmod +x /path/to/my_script.sh during the Docker build, by the time the image was run, the permissions had been changed. Eventually, to make it work I settled on:
CMD ["/bin/bash","-c","chmod +x /path/to/my_script.sh && /path/to/my_script.sh"]