cloud-init manage_resolv_conf - centos7

I have the following in my user-data file for cloud-init, but this doesn't seem to work:
#cloud-config
manage_resolv_conf: true
resolv_conf:
  nameservers: ['10.0.100.1']
  searchdomains:
    - myawesomedomain.com
  domain: myawesomedomain.com
  options:
    rotate: true
    timeout: 1
In my centos 7 resolv.conf after initial VM creation:
; Created by cloud-init on instance boot automatically, do not edit.
;
# Generated by NetworkManager
nameserver 10.0.2.3
search localdomain
I haven't the slightest idea where that IP for the nameserver came from. Any idea what I'm missing?

I figured it out eventually.
Turns out that on CentOS 7, the resolv_conf cloud-init module doesn't run by default. I had to enable this in cloud_config_modules in my user-data file:
cloud_config_modules:
  - resolv_conf
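To confirm after a rebuild that the module actually ran, it can help to compare the resulting /etc/resolv.conf against the nameservers from the user-data. The following is only a minimal sketch: the function names are made up for illustration, and the sample text is the file content from the question, not the output of a fixed system.

```python
def parse_resolv_conf(text):
    """Return the nameserver and search entries found in resolv.conf text."""
    nameservers, search = [], []
    for raw in text.splitlines():
        line = raw.strip()
        # Lines starting with ';' or '#' are comments in resolv.conf.
        if not line or line[0] in ";#":
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            nameservers.append(values[0])
        elif keyword == "search":
            search = values  # a later search line replaces an earlier one
    return {"nameservers": nameservers, "search": search}


def matches_expected(text, expected_nameservers):
    """True when the file lists exactly the expected nameservers, in order."""
    return parse_resolv_conf(text)["nameservers"] == list(expected_nameservers)


# The resolv.conf content shown in the question (before the fix).
SAMPLE = """\
; Created by cloud-init on instance boot automatically, do not edit.
;
# Generated by NetworkManager
nameserver 10.0.2.3
search localdomain
"""

print(parse_resolv_conf(SAMPLE))
print(matches_expected(SAMPLE, ["10.0.100.1"]))
```

On a real VM you would pass the content of /etc/resolv.conf instead of SAMPLE; once the resolv_conf module runs, the check against 10.0.100.1 should presumably succeed.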

Related

Change Hostname permanently on Google Cloud Compute - WHM

I've had to change the hostname on a Google Compute Engine VM that is running WHM, but it keeps resetting every now and then and on restart.
My /etc/hosts currently looks as follows:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.156.0.7 cpanel.server-location-c.c.ascendant-hub-hidden.internal cpanel # Added by Google
169.254.169.254 metadata.google.internal # Added by Google
My system information is:
Linux cpanel.xxx.com 3.10.0-1127.10.1.el7.x86_64 #1 SMP Wed Jun 3 14:28:03 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
My old hostname is something like:
cpanel.xxx.com
I want my new hostname to become:
brain.xxx.com
Even when I change it from WHM using their Change Hostname feature, it keeps resetting.
Is there a cleaner method than setting up a crontab?
Unfortunately, you can't change a custom hostname after you've created the VM instance. Have a look at the documentation, Creating a VM instance with a custom hostname:
You can create a VM with a custom hostname by specifying any fully
qualified DNS name.
and at the section Limitations:
You cannot change a custom hostname after you have created the VM.
To change this behavior you can try to file a feature request at Google Issue Tracker under this component.
UPDATE In addition, have a look at the documentation Storing and retrieving instance metadata section Default metadata keys:
Compute Engine defines a set of default metadata entries that provide
information about your instance or project. Default metadata is always
defined and set by the server. You can't manually edit any of these
metadata pairs.
hostname is one of the default metadata entries, so it cannot be changed manually.
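To see which hostname the metadata server currently reports for the instance, you can query it from inside the VM. This is a rough sketch using only the Python standard library; the helper names are my own, and the fetch only works on a Compute Engine VM, where metadata.google.internal resolves.

```python
import urllib.request

# Default metadata entry that holds the instance hostname.
METADATA_URL = "http://metadata.google.internal/computeMetadata/v1/instance/hostname"


def build_hostname_request():
    """Build the request; the metadata server rejects calls without this header."""
    return urllib.request.Request(METADATA_URL, headers={"Metadata-Flavor": "Google"})


def fetch_metadata_hostname(timeout=2.0):
    """Return the hostname reported by the metadata server (GCE VMs only)."""
    with urllib.request.urlopen(build_hostname_request(), timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()
```

Comparing fetch_metadata_hostname() with the output of the hostname command shows whether the guest has drifted from the value held in the metadata.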
UPDATE 2 As a possible workaround, you can use a startup script or other solutions to change the hostname every time the system restarts, otherwise it will automatically get re-synced with the metadata server on every reboot. For example, I applied this startup script via Custom metadata:
Key: startup-script
Value: #! /bin/bash
hostname changed-host-name
then restarted the VM instance, and it works for me:
changed-host-name:~$ hostname
changed-host-name
There are a few ways to change your hostname:
One way is to edit /etc/hostname directly: just replace the file content with your new hostname.
The other way is to use hostnamectl set-hostname <your new hostname>, which changes the /etc/hostname file for you.
But I think your problem is that Google keeps overwriting some data, not only when you reboot the system but also while your VM is running. If so, the solutions above won't solve your issue.
Solution:
Thankfully, Google Cloud Platform allows you to have a custom hostname, but you have to define it when creating a new virtual instance. Check out this GCP document.

Why is Google Compute Engine not running my container?

I can do this successfully:
Bundle my app into a docker image
Build this image into a container using Google Cloud Build upon push to master
(This container is stored in the registry at, for example, gcr.io/my-project/my-container)
Deploy this container to the web using Google Cloud Run
Visit the Cloud Run url and see my website
I am now trying more sophisticated builds and I think the next step is to use Google Compute Engine.
To start, I am simply trying to deploy a single instance of the same app that I deployed to Cloud Run:
Navigate to Compute Engine > VM Instances
Enter basics like instance name
Enter my container location under "Container Image": gcr.io/my-project/my-container
(As an aside, I find it suspect that the interface does not offer a selector for your existing Container Registry items here.)
Select "Allow HTTP Traffic" and "Allow HTTPS Traffic"
Click "Create"
GCE takes a minute to create it, and then it shows the green checkmark and the instance name, and "External IP: 35.238.xxx.xxx". I visit that URL in my browser and get... "35.238.xxx.xxx refused to connect."
To inspect, I go back to the GCE page and select "SSH > Open in browser window" next to my instance, which opens a type of cloud terminal to the machine.
In this terminal window, I type ps and see that no processes are running. The container Dockerfile ends with CMD yarn start:prod, so I guess that's not happening here.
Further, I ls here and there and navigate around, and see that there is no /app directory from my Dockerfile's WORKDIR /app command. It seems that not only did my app not boot, but the container was not even copied to the VM instance?
What am I doing wrong?
For anyone having this issue: I faced the same problem and couldn't figure it out.
Reading Serhii's answer gave me the clue. I believe that as of today (Jan 2021) the GCP Console UI is a bit unhelpful. It appears that if you type in a container name when creating your VM WITHOUT specifying a tag on the end, it neither complains nor assumes a default such as 'latest'; it just fails silently. Hence you get the VM, but with no Docker container running.
At least this now works for me; hopefully it helps others.
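If you script your deployments, a small guard can catch an untagged image reference before it reaches the console or gcloud. This is just a sketch of the idea, with hypothetical helper names; it looks only at the part after the last slash so that a registry port is not mistaken for a tag.

```python
def has_explicit_tag(image_ref):
    """True if the reference ends in ':tag' or carries an '@digest'."""
    if "@" in image_ref:
        return True
    # Only look after the last '/', so a registry port such as
    # 'localhost:5000/app' is not mistaken for a tag.
    last_component = image_ref.rsplit("/", 1)[-1]
    return ":" in last_component


def with_default_tag(image_ref, tag="latest"):
    """Append ':tag' when the reference has neither a tag nor a digest."""
    return image_ref if has_explicit_tag(image_ref) else f"{image_ref}:{tag}"


print(with_default_tag("gcr.io/my-project/my-container"))
```

With the image name from the question, this yields gcr.io/my-project/my-container:latest, which is the form that worked for me in the Container Image field.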
Check whether your VM has an external IP address.
If it doesn't, the VM might not have network access to the public repository, or even to the Google Container Registry (gcr.io), and the Docker container silently fails to start.
I've decided to follow Deploying a container on a new VM instance again.
Please find my steps and commands below:
create a new VM that runs the Docker image gcr.io/cloud-marketplace/google/nginx1:latest with network tag http-server:
$ gcloud compute instances create-with-container instance-3 --tags=http-server,https-server --container-image=gcr.io/cloud-marketplace/google/nginx1:latest
Created [https://www.googleapis.com/compute/v1/projects/test-prj/zones/europe-west3-a/instances/instance-3].
NAME ZONE MACHINE_TYPE PREEMPTIBLE INTERNAL_IP EXTERNAL_IP STATUS
instance-3 europe-west3-a n1-standard-1 10.156.0.30 35.XXX.111.XXX RUNNING
create a new firewall rule:
$ gcloud compute firewall-rules create default-allow-http --direction=INGRESS --priority=1000 --network=default --action=ALLOW --rules=tcp:80 --source-ranges=0.0.0.0/0 --target-tags=http-server
Creating firewall...⠹
Created [https://www.googleapis.com/compute/v1/projects/test-prj/global/firewalls/default-allow-http].
Creating firewall...done.
NAME NETWORK DIRECTION PRIORITY ALLOW DENY DISABLED
default-allow-http default INGRESS 1000 tcp:80 False
check current firewall rules:
$ nmap -Pn 35.XXX.111.XXX
Starting Nmap 7.70 ( https://nmap.org ) at 2020-04-02 12:04 CEST
PORT STATE SERVICE
...
80/tcp open http
check if NGINX is running in the container:
$ curl -I http://35.XXX.111.XXX
HTTP/1.1 200 OK
Server: nginx/1.16.1
...
$ curl http://35.XXX.111.XXX
...
<h1>Welcome to nginx!</h1>
...
also via web browser at http://35.XXX.111.XXX
check status of the container:
$ gcloud compute ssh instance-3
...
instance-3 ~ $ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
...
a657c8871239 gcr.io/cloud-marketplace/google/nginx1:latest "/usr/local/bin/dock…" 14 minutes ago Up 14 minutes klt-instance-3-uwtu
attach to the container and run curl http://35.XXX.111.XXX in the separate terminal:
instance-3 ~ $ docker attach a657c8871239
YY.YY.43.203 - - [02/Apr/2020:10:18:06 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.64.0" "-"
YY.YY.43.203 - - [02/Apr/2020:10:18:07 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.64.0" "-"
I found no errors while following documentation.
To solve your issue:
Compare your steps and commands to mine.
Run test Docker image by following documentation on your project.
Try to replicate steps from documentation with your custom image.
If you still have issue - update your question with all your steps, commands and outputs.
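If nmap is not available, the "is port 80 reachable" part of the steps above can be approximated with a few lines of Python. This is a minimal sketch, not a replacement for nmap: it only tells you whether a TCP connection can be opened, and the function name is made up for illustration.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, port_is_open("35.XXX.111.XXX", 80) with your instance's external IP. A False result points at the firewall rule or at the container not listening, which is when docker ps on the VM becomes the next thing to check.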
I also had the problem, the instance was running, but could not pull my container.
Error: Failed to start container: Error response from daemon:
{"message":"unauthorized: You don't have the needed permissions to
perform this operation, and you may have invalid credentials. To
authenticate your request, follow the steps in:
https://cloud.google.com/container-registry/docs/advanced-authentication"
I had to add some extra scope to the yaml file : https://www.googleapis.com/auth/source.full_control
steps:
- name: gcr.io/cloud-builders/docker
  args: ['build', '-t', 'gcr.io/local-xxxxxxxxxxxxxx/apptraining', '.']
- name: 'gcr.io/cloud-builders/docker'
  args: ["push", "gcr.io/local-xxxxxxxxxxxxxx/apptraining"]
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['compute', 'instances', 'create-with-container', 'instanceapptraining', '--machine-type=n1-standard-1', '--scopes=https://www.googleapis.com/auth/devstorage.full_control,https://www.googleapis.com/auth/trace.append,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/userinfo.email,https://www.googleapis.com/auth/bigquery,https://www.googleapis.com/auth/datastore,https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/trace.append,https://www.googleapis.com/auth/source.full_control,https://www.googleapis.com/auth/source.read_only,https://www.googleapis.com/auth/compute.readonly','--zone=us-central1-a', '--preemptible', '--container-image=gcr.io/local-xxxxxxxxxxxxxx/apptraining:latest']

AWS Elasticbeanstalk with Django: Incorrect application version found on all instances

I'm trying to deploy a django application on elasticbeanstalk. It has been working fine then suddenly stopped and I cannot figure out why.
When I do eb deploy I get
INFO: Environment update is starting.
INFO: Deploying new version to instance(s).
INFO: New application version was deployed to running EC2 instances.
INFO: Environment update completed successfully.
Alert: An update to the EB CLI is available. Run "pip install --upgrade awsebcli" to get the latest version.
INFO: Attempting to open port 22.
INFO: SSH port 22 open.
INFO: Running ssh -i /home/ubuntu/.ssh/web-cdi_011017.pem ec2-user@54.188.214.227 if ! grep -q 'WSGIApplicationGroup %{GLOBAL}' /etc/httpd/conf.d/wsgi.conf ; then echo -e 'WSGIApplicationGroup %{GLOBAL}' | sudo tee -a /etc/httpd/conf.d/wsgi.conf; fi;
INFO: Attempting to open port 22.
INFO: SSH port 22 open.
INFO: Running ssh -i /home/ubuntu/.ssh/web-cdi_011017.pem ec2-user@54.188.214.227 sudo /etc/init.d/httpd reload
Reloading httpd: [ OK ]
When I then run eb health, I get
Incorrect application version found on all instances. Expected version
"app-c56a-190604_135423" (deployment 300).
If I eb ssh and look in /opt/python/current, there is nothing there, so nothing is being copied across.
I think something may be wrong with .elasticbeanstalk/config.yml. Somehow the directory was deleted and set up again. This is the config.yml:
branch-defaults:
  master:
    environment: app-prod
  scoring-dev:
    environment: app-dev
environment-defaults:
  app-prod:
    branch: null
    repository: null
global:
  application_name: my-app
  default_ec2_keyname: am-app_011017
  default_platform: arn:aws:elasticbeanstalk:us-west-2::platform/Python 2.7 running
    on 64bit Amazon Linux/2.3.1
  default_region: us-west-2
  include_git_submodules: true
  instance_profile: null
  platform_name: null
  platform_version: null
  profile: null
  sc: git
  workspace_type: Application
Please, any ideas about how to troubleshoot?
I upgraded to the latest AWS stack for python 2.7 and that sorted it
I faced the same problem, and the cause was the command timeout.
The default maximum deployment time (Command timeout) is 600 seconds (10 minutes).
You can change it under Your Environment → Configuration → Deployment preferences → Command timeout.
Increase the command timeout, for example to 1800,
or upgrade the instance type so the deployment runs faster.
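If you would rather keep this setting with the source bundle than in the console, the same value can, as far as I know, be set through an .ebextensions config file using the aws:elasticbeanstalk:command namespace. Below is a small sketch that generates such a file as JSON. The helper name and the file name are made up, and the 1 to 3600 range check is my assumption about the allowed values, so verify it against the Elastic Beanstalk documentation.

```python
import json


def command_timeout_config(seconds):
    """Build an .ebextensions-style option_settings document for the command timeout."""
    # Assumed valid range for the Timeout option; check the EB docs.
    if not 1 <= seconds <= 3600:
        raise ValueError("timeout must be between 1 and 3600 seconds")
    return {
        "option_settings": [
            {
                "namespace": "aws:elasticbeanstalk:command",
                "option_name": "Timeout",
                "value": seconds,
            }
        ]
    }


# Print the document; save it as e.g. .ebextensions/timeout.config (hypothetical name).
print(json.dumps(command_timeout_config(1800), indent=2))
```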

Intermittent DNS issues while pulling docker image from ECR repository

Has anyone faced this issue with docker pull? We recently upgraded Docker to 18.03.1-ce, and we have been seeing the issue since then. We are not sure whether it is related to Docker, but we would like to know if anyone else has run into this problem.
We did some troubleshooting with tcpdump. The DNS queries being made were under the permissible limit of 1024 packets that applies on EC2. We also tried working around the issue by modifying the /etc/resolv.conf file to use higher retry and timeout values, but that didn't seem to help.
We then went through a packet capture line by line and found that some responses were negative. If you use Wireshark, you can use 'udp.stream eq 12' as a filter to view one of the negative answers, where the resolver sends the answer "No such name". All the requests that get a negative response use the following name:
354XXXXX.dkr.ecr.us-east-1.amazonaws.com.ec2.internal
Would any of you happen to know why ec2.internal is being added to the end of the DNS name? If we run dig against this name, it fails. So it appears that a wrong name is being sent to the server, which responds with 'no such host'. Is Docker sending a wrong DNS name for resolution?
We see this issue happening intermittently. We look forward to any help. Thanks in advance.
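One possible explanation for the ec2.internal suffix, offered as an assumption since the host's /etc/resolv.conf is not shown: if that file contains a line such as "search ec2.internal", a stub resolver will typically retry a failed lookup with each search domain appended. The sketch below models the usual search-list ordering (the function name is made up, and real resolvers have more options than this):

```python
def lookup_candidates(name, search_domains, ndots=1):
    """Return the names a resolver would try, in order, under the usual search-list rules."""
    # A trailing dot marks the name as fully qualified: no search list is applied.
    if name.endswith("."):
        return [name]
    with_search = [f"{name}.{domain}" for domain in search_domains]
    # Names with at least ndots dots are tried as-is first, then with the
    # search domains; shorter names are tried with the search domains first.
    if name.count(".") >= ndots:
        return [name] + with_search
    return with_search + [name]


print(lookup_candidates("354XXXXX.dkr.ecr.us-east-1.amazonaws.com", ["ec2.internal"]))
```

Under that assumption, the query for the name ending in .ec2.internal would be the fallback attempt after the plain name has already failed or timed out, so the "No such name" answers may be a symptom of the intermittent failure rather than its cause.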
Expected behaviour
5.0.25_61: Pulling from rrg
Digest: sha256:50bbce4af6749e9a976f0533c3b50a0badb54855b73d8a3743473f1487fd223e
Status: Downloaded newer image for XXXXXXXX.dkr.ecr.us-east-1.amazonaws.com/rrg:5.0.25_61
Actual behaviour
docker-compose up -d rrg-node-1
Creating rrg-node-1
ERROR: for rrg-node-1 Cannot create container for service rrg-node-1: Error response from daemon: Get https://XXXXXXXX.dkr.ecr.us-east-1.amazonaws.com/v2/: dial tcp: lookup XXXXXXXX.dkr.ecr.us-east-1.amazonaws.com on 10.5.0.2:53: no such host
Steps to reproduce the issue
docker pull XXXXXXXX.dkr.ecr.us-east-1.amazonaws.com/rrg:5.0.25_61
Output of docker version:
(Docker version 18.03.1-ce, build 3dfb8343b139d6342acfd9975d7f1068b5b1c3d3)
Output of docker info:
([ec2-user@ip-10-5-3-45 ~]$ docker info
Containers: 37
Running: 36
Paused: 0
Stopped: 1
Images: 60
Server Version: swarm/1.2.5
Role: replica
Primary: 10.5.4.172:3375
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 12
Plugins:
Volume:
Network:
Log:
Swarm:
NodeID:
Is Manager: false
Node Address:
Kernel Version: 4.14.51-60.38.amzn1.x86_64
Operating System: linux
Architecture: amd64
CPUs: 22
Total Memory: 80.85GiB
Name: mgr1
Docker Root Dir:
Debug Mode (client): false
Debug Mode (server): false
Experimental: false
Live Restore Enabled: false
WARNING: No kernel memory limit support)

Metabase on Google App Engine

I'm trying to set up Metabase on Google App Engine using Google Cloud SQL (MySQL).
I've got it running using this git and this app.yaml:
runtime: custom
env: flex
# Metabase does not support horizontal scaling
# https://github.com/metabase/metabase/issues/2754
# https://cloud.google.com/appengine/docs/flexible/java/configuring-your-app-with-app-yaml
manual_scaling:
  instances: 1
env_variables:
  # MB_JETTY_PORT: 8080
  MB_DB_TYPE: mysql
  MB_DB_DBNAME: [db_name]
  # MB_DB_PORT: 5432
  MB_DB_USER: [db_user]
  MB_DB_PASS: [db_password]
  # MB_DB_HOST: 127.0.0.1
  CLOUD_SQL_INSTANCE: [project-id]:[location]:[instance-id]
I have 2 issues:
Metabase fails to connect to Cloud SQL, even though the Cloud SQL instance is part of the same project and App Engine is authorized.
After I create my admin user in Metabase, I am only able to log in for a few seconds (and only sometimes); it keeps throwing me back to either /setup or /auth/login, saying the password doesn't match (when it does).
I hope someone can help - thank you!
So, we just got metabase running in Google App Engine with a Cloud SQL instance running PostgreSQL and these are the steps we went through.
First, create a Dockerfile:
FROM gcr.io/google-appengine/openjdk:8
EXPOSE 8080
ENV JAVA_OPTS "-XX:+IgnoreUnrecognizedVMOptions -Dfile.encoding=UTF-8 --add-opens=java.base/java.net=ALL-UNNAMED --add-modules=java.xml.bind"
ENV JAVA_TOOL_OPTIONS "-Xmx1g"
ADD https://downloads.metabase.com/enterprise/v1.1.6/metabase.jar $APP_DESTINATION
We tried pushing the memory further down, but 1 GB seemed to be the sweet spot. On to the app.yaml:
runtime: custom
env: flex
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 1
  disk_size_gb: 10
readiness_check:
  path: "/api/health"
  check_interval_sec: 5
  timeout_sec: 5
  failure_threshold: 2
  success_threshold: 2
  app_start_timeout_sec: 600
beta_settings:
  cloud_sql_instances: <Instance-Connection-Name>=tcp:5432
env_variables:
  MB_DB_DBNAME: 'metabase'
  MB_DB_TYPE: 'postgres'
  MB_DB_HOST: '172.17.0.1'
  MB_DB_PORT: '5432'
  MB_DB_USER: '<username>'
  MB_DB_PASS: '<password>'
  MB_JETTY_PORT: '8080'
Note the beta_settings field near the bottom, which handles what akilesh raj was doing manually. Also, the trailing =tcp:5432 is required, since Metabase does not support Unix sockets yet.
Relevant documentation can be found here.
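For what it's worth, the value that goes into cloud_sql_instances is just the instance connection name followed by =tcp:<port>. A tiny sketch of how that string is put together, assuming the usual project:region:instance form of the connection name (the helper name and the example values are made up):

```python
def cloud_sql_tcp_setting(project, region, instance, port=5432):
    """Build the cloud_sql_instances value that exposes the instance on a TCP port."""
    for label, part in (("project", project), ("region", region), ("instance", instance)):
        if not part:
            raise ValueError(f"{label} must not be empty")
    # Connection name (project:region:instance) followed by the TCP port mapping.
    return f"{project}:{region}:{instance}=tcp:{port}"


print(cloud_sql_tcp_setting("my-project", "europe-west1", "metabase-db"))
```

The port here has to match MB_DB_PORT in env_variables, otherwise Metabase will try to connect to a port nothing is listening on.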
Although I am not sure of the reason, I think authorizing the App Engine service account is not enough for accessing Cloud SQL.
To authorize your app to access your Cloud SQL instance, you can use either of these two methods:
Within the app.yaml file, configure an environment variable pointing to a service account key file with the correct authorization for Cloud SQL:
env_variables:
  GOOGLE_APPLICATION_CREDENTIALS: [YOURKEYFILE].json
Your code executes a fetch of an authorized service account key from a bucket, and loads it afterwards with the help of the Cloud storage Client library. Seeing your runtime is custom, the pseudocode which would be translated into the code you use is the following:
.....
It is better to use the Cloud proxy to connect to the SQL instances. This way you do not have to authorize the instances in CloudSQL every time there is a new instance.
More on CloudProxy here
As for setting up Metabase in the Google App Engine, I am including the app.yaml and Dockerfile below.
The app.yaml file,
runtime: custom
env: flex
manual_scaling:
  instances: 1
env_variables:
  MB_DB_TYPE: mysql
  MB_DB_DBNAME: metabase
  MB_DB_PORT: 3306
  MB_DB_USER: root
  MB_DB_PASS: password
  MB_DB_HOST: 127.0.0.1
  METABASE_SQL_INSTANCE: instance_name
The Dockerfile,
FROM gcr.io/google-appengine/openjdk:8
# Set locale to UTF-8
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
# Install CloudProxy
ADD https://dl.google.com/cloudsql/cloud_sql_proxy.linux.amd64 ./cloud_sql_proxy
RUN chmod +x ./cloud_sql_proxy
#Download the latest version of Metabase
ADD http://downloads.metabase.com/v0.21.1/metabase.jar ./metabase.jar
CMD nohup ./cloud_sql_proxy -instances=$METABASE_SQL_INSTANCE=tcp:$MB_DB_PORT & java -jar /startup/metabase.jar