To start with I am trying to upgrade from 1.9 version to 1.10 so my setup contains two vms running different versions of airflow with different port forwarding.
I can access UI from vm running with 1.9 but not able to access UI from 1.10.
To debug I want to confirm if airflow webserver is running. if I execute
sudo systemctl start airflow-webserver
it throws no error but when
I am looking at netstat I am not seeing any process listening to port 8080(default).
Also I have not created any user as I do not need rbac authentication ? Can that be a problem?
As requested by #kaxil. Below is the output of ps aux | grep airflow
Can someone provide some suggestions on how to fix this problem? Also if you need any further resource can provide it. I am not sure what is relevant here.
Output of journalctl -u airflow-webserver.service -b
The Error message shows that there is an issue with airflow.cfg file i.e. there might be a character in your airflow.cfg that is causing the issue. Recheck your config file, if you don't find an issue, post your config file in your question and we will try to figure it out.
Related
I am using Code Server within my Cloud Shell. I need to use the port 3000 for a specific npm package. Unfortunately port 3000 is already used by the default editor Theia within Cloud Shell.
I have already tried the following:
sudo kill {{PID of Theia process}} ...but it restarts again immediatelly
searched for settings within /google/devshell/editor/theia ...but could not find any port settings
sudo netstat -tlnp gives the following output:
Any help is very appreciated.
As mentioned by JShinigami, That issue got resolved here by changing the port of the other application, other alternative of resolving this issue is as below :
First I would recommend you to reset your cloud shell.
You can refer to the Answer to follow the steps on how to kill a process running on the particular Port.
Option 1 A One-liner to kill only LISTEN on specific port:
kill -9 $(lsof -t -i:3000 -sTCP:LISTEN)`
Option 2 If you have npm installed you can also run
npx kill-port 3000
I also found this answer on stack overflow that may be relevant as it shows how they were able to kill the process once they obtained its PID.
could you run the following command :
"sudo netstat -tlnp"
From the above you will be able to tell what processes are running on the ports. From there you will see the Possibility of "auto restart" configuration somewhere causing the process to appear even after kill command.
Found this useful article on ways to list processes running on ports.
This is cloudshelledit occupy the port
If you don't need cloudshelledit and can kill off
And if you open the cloudshelledit, this process is not shut off
cloudshelledit
I'm not referring to more sophisticated debugging techniques, but how to get access to the same kind of error messages that are normally directed to terminal tabs?
Basically I'm adopting Docker in a Django project also using Redis.
In the old way of working I opened a linux terminal tab for gunicorn like this: gunicorn --reload --bind 0.0.0.0:8001 myapp.wsgi:application
And this tab kept running Gunicorn and any Python error was shown in this tab so I could see the problem and fix it.
I could also open a second tab for the celery woker: celery -A myapp worker --pool=solo -l info
The same thing happened, the tab was occupied by Celery and any Python error in a task was shown in the tab and I could see the problem and correct the code.
My question is: Using docker is there a way to make each of the containers direct these same errors that would previously go to the screen, to go to log files so that I can debug my code when an error occurs in Python?
What is the correct way to handle simple debugging during development using Docker containers?
After looking more about this in the docker documentation I found a link that solves this problem: View logs for a container or service
Basically the command "docker logs CONTAINER_ID" shows on the screen exactly what we would see in the terminal running the application.
Works perfectly to see Django, Redis and Angular logs.
Just type:
docker logs CONTAINER_ID
Replace the container_id keyword with the real id of the container you want to log in.
To find the id type:
docker ps
The problem:
I have a managed Cloud composer environment, under a 1.9.7-gke.6 Kubernetes cluster master.
I tried to upgrade it (as well as the default-pool nodes) to 1.10.7-gke.1, since an upgrade was available.
Since then, Airflow has been acting randomly. Tasks that were working properly are failing for no given reason. This makes Airflow unusable, since the scheduling becomes unreliable.
Here is an example of a task that runs every 15 minutes and for which the behavior is very visible right after the upgrade:
airflow_tree_view
On hover on a failing task, it only shows an Operator: null message (null_operator). Also, there is no log at all for that task.
I have been able to reproduce the situation with another Composer environment in order to ensure that the upgrade is the cause of the dysfunction.
What I have tried so far :
I assumed the upgrade might have screwed up either the scheduler or Celery (Cloud composer defaults to CeleryExecutor).
I tried restarting the scheduler with the following command:
kubectl get deployment airflow-scheduler -o yaml | kubectl replace --force -f -
I also tried to restart Celery from inside the workers, with
kubectl exec -it airflow-worker-799dc94759-7vck4 -- sudo celery multi restart 1
Celery restarts, but it doesn't fix the issue.
So I tried to restart the airflow completely the same way I did with airflow-scheduler.
None of these fixed the issue.
Side note, I can't access Flower to monitor Celery when following this tutorial (Google Cloud - Connecting to Flower). Connecting to localhost:5555 stay in 'waiting' state forever. I don't know if it is related.
Let me know if I'm missing something!
1.10.7-gke.2 is available now [1]. Can you further upgrade to 1.10.7-gke.2 to see if the issue persists?
[1] https://cloud.google.com/kubernetes-engine/release-notes
I have a standard Debian 8.9 instance on google cloud compute (GCE) where my startup script is ignored.
In the custom metadata field, for startup-script, I am trying to run an Rscript (which is used for batch execution of R files), followed by a system shutdown, with the following:
#! /bin/bash
sudo /usr/bin/Rscript /home/myuser/launch_script.R
sudo shutdown -h now
Starting the instance is immediately followed by a shutdown and the Rscript is ignored. Removing the last line to shutdown causes the GCE instance to start, but the Rscript to be ignored. Running just "sudo /usr/bin/Rscript /home/myuser/launch_script.R" from the terminal results in the script being run. It has a chmod of 755, so I don't think this is a permissions issue.
In addition to this problem, I have read elsewhere that logging should happen in /var/log/, but there is nothing there. Instead, I have a bunch of log files (that only contain the start-up script and nothing else) in the root of my instance:
I got in touch with Google cloud support, who gave the following response:
script definition is kept under /var/run/google.startup.script
If the script does not run initially, you can force it manually with : $ sudo google_metadata_script_runner --script-type startup # for Debian, or # sudo /usr/share/google/run-startup-scripts # on Ubuntu and older images
I'm posting this information here, because it is not in their documentation (as of August 2017). I'm not sure how helpful it is, since the google.startup.script didn't exist in my case (using the latest Debian image on GCE), but I did run the other commands.
However, I think my main issues were:
I was using autossh to connect to a remote database. The startup-script was running before autossh. Building a 40 second delay into the script and running the script as a user (not sudo-type root) seems to have solved this problem for now. Autossh was being run as the main user, which I think gets loaded before lower-privilege user-defined scripts get loaded.
I was using some gcloud commands from the user account which had its own authentication issues. Running gcloud auth login as the user and ensuring correct permissions on my private key solved this.
Always remember to check the messages and syslog files in /var/log for troubleshooting. This allowed me to see the order of things being loaded at system-boot.
Please feel free to redirect me to any other place if this isn't the right one for this question.
Problem: When I log to the administration panel : "localhost:8083" with "root" "root" I cannot see the existing databases nor the data in it. Also, I have no way to access InfluxDB from the command line.
Also the line sudo /etc/init.d/influxdb start does not work for my setup. I have to go into /etc/init.d/ and run sudo ./influxdb start -config=config.toml in order to get the server running.
I've installed influxDB v0.8 from https://influxdb.com/docs/v0.8/introduction/installation.html for Ubuntu 14.04.
I've been developing a Clojure program using the Capacitor API just to get started and interact with InfluxDB. It runs well, I can create delete, insert and query a database without problems.
netstat -anp | grep LISTEN confirms me that ports 8083 8086 8090 and 8099 are listening.
I've been Googling all around but cannot manage to get a solution.
Thanks for the support and enjoy building things !
Problem solved: the database weren't visible in firefox but everything is visible in Chromium!
Why couldn't I access the CLI ? I was expecting the v0.8 to behave exactly like the v0.9.
You help was appreciated anyway !
For InfluxDB 0.9 the CLI could be started with:
/opt/influxdb/influx
then you can display available databases:
Connected to http://localhost:8086 version 0.9.1
InfluxDB shell 0.9.1
> show databases
name: databases
---------------
name
collectd
graphite
> use collectd
Using database collectd
> show series limit 5
You can try creating new database from CLI:
> CREATE DATABASE mydb
or with curl command:
curl -G 'http://localhost:8086/query' --data-urlencode "q=CREATE DATABASE mydb"
Web UI should be available on http://localhost:8083