I have a Django web app served from Apache2 with mod_wsgi in docker containers running on a Kubernetes cluster in Google Cloud Platform, protected by Identity-Aware Proxy. Everything is working great, but I want to send GCP Stackdriver traces for all requests without writing one for each view in my project. I found middleware to handle this, using Opencensus. I went through this documentation, and was able to manually generate traces that exported to Stackdriver Trace in my project by specifying the StackdriverExporter and passing the project_id parameter as the Google Cloud Platform Project Number for my project.
Now to make this automatic for ALL requests, I followed the instructions to set up the middleware. In settings.py, I added the module to INSTALLED_APPS, MIDDLEWARE, and set up the OPENCENSUS_TRACE dictionary of options. I also added the OPENCENSUS_TRACE_PARAMS. This works great with the default exporter 'opencensus.trace.exporters.print_exporter.PrintExporter', as I can see the Trace and Span information, including Trace ID and all details in my Apache2 web server logs. However, I want to send these to my Stackdriver Trace processor for analysis.
I tried setting the EXPORTER parameter to opencensus.trace.exporters.stackdriver_exporter.StackdriverExporter, which works when run manually from the shell, as long as you supply the project number.
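For reference, the manual test that works from the shell looks roughly like this (module paths are the same ones used in my settings below; the project number is a placeholder):

# Rough sketch of the manual shell test; the project number is a placeholder.
from opencensus.trace import tracer as tracer_module
from opencensus.trace.exporters import stackdriver_exporter

exporter = stackdriver_exporter.StackdriverExporter(project_id='123456789012')
tracer = tracer_module.Tracer(exporter=exporter)

with tracer.span(name='manual_test_span'):
    pass  # the work being traced goes here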
When it is set up to use StackdriverExporter, the web page will not load, the health check starts to fail, and ultimately the page comes back with a 502 error telling me to try again in 30 seconds (I believe Identity-Aware Proxy generates this error once it detects the failed health check). The server generates no errors, and nothing shows up in Apache2's access or error logs.
There is another dictionary in settings.py named OPENCENSUS_TRACE_PARAMS, which I presume is needed to determine which project number the exporter should be using. The example has GCP_EXPORTER_PROJECT set as None, and SERVICE_NAME set as 'my_service'.
What options do I need to set to get the exporter to send back to Stackdriver instead of printing to logs? Do you have any idea about how I can set this up?
settings.py
MIDDLEWARE = (
...
'opencensus.trace.ext.django.middleware.OpencensusMiddleware',
)
INSTALLED_APPS = (
...
'opencensus.trace.ext.django',
)
OPENCENSUS_TRACE = {
'SAMPLER': 'opencensus.trace.samplers.probability.ProbabilitySampler',
'EXPORTER': 'opencensus.trace.exporters.stackdriver_exporter.StackdriverExporter', # This one just makes the server hang with no response or error and kills the health check.
'PROPAGATOR': 'opencensus.trace.propagation.google_cloud_format.GoogleCloudFormatPropagator',
# 'EXPORTER': 'opencensus.trace.exporters.print_exporter.PrintExporter', # This one works to print the Trace and Span with IDs and details in the logs.
}
OPENCENSUS_TRACE_PARAMS = {
'BLACKLIST_PATHS': ['/health'],
'GCP_EXPORTER_PROJECT': 'my_project_number', # Should this be None like the example, or Project ID, or Project Number?
'SAMPLING_RATE': 0.5,
'SERVICE_NAME': 'my_service', # Not sure if this is my app name or some other service name.
'ZIPKIN_EXPORTER_HOST_NAME': 'localhost', # Are the following even necessary, or are they causing a failure that is not detected by Apache2?
'ZIPKIN_EXPORTER_PORT': 9411,
'ZIPKIN_EXPORTER_PROTOCOL': 'http',
'JAEGER_EXPORTER_HOST_NAME': None,
'JAEGER_EXPORTER_PORT': None,
'JAEGER_EXPORTER_AGENT_HOST_NAME': 'localhost',
'JAEGER_EXPORTER_AGENT_PORT': 6831
}
Here's an example (I prettified the format for readability) of the Apache2 log when it is set to use the PrintExporter:
[Fri Feb 08 09:00:32.427575 2019]
[wsgi:error]
[pid 1097:tid 139801302882048]
[client 10.48.0.1:43988]
[SpanData(
name='services.views.my_view',
context=SpanContext(
trace_id=e882f23e49e34fc09df621867d753532,
span_id=None,
trace_options=TraceOptions(enabled=True),
tracestate=None
),
span_id='bcbe7b96906a482a',
parent_span_id=None,
attributes={
'http.status_code': '200',
'http.method': 'GET',
'http.url': '/',
'django.user.name': ''
},
start_time='2019-02-08T17:00:29.845733Z',
end_time='2019-02-08T17:00:32.427455Z',
child_span_count=0,
stack_trace=None,
time_events=[],
links=[],
status=None,
same_process_as_parent_span=None,
span_kind=1
)]
Thanks in advance for any tips, assistance, or troubleshooting advice!
Edit 2019-02-08 6:56 PM UTC:
I found this in the middleware:
# Initialize the exporter
transport = convert_to_import(settings.params.get(TRANSPORT))
if self._exporter.__name__ == 'GoogleCloudExporter':
_project_id = settings.params.get(GCP_EXPORTER_PROJECT, None)
self.exporter = self._exporter(
project_id=_project_id,
transport=transport)
elif self._exporter.__name__ == 'ZipkinExporter':
_service_name = self._get_service_name(settings.params)
_zipkin_host_name = settings.params.get(
ZIPKIN_EXPORTER_HOST_NAME, 'localhost')
_zipkin_port = settings.params.get(
ZIPKIN_EXPORTER_PORT, 9411)
_zipkin_protocol = settings.params.get(
ZIPKIN_EXPORTER_PROTOCOL, 'http')
self.exporter = self._exporter(
service_name=_service_name,
host_name=_zipkin_host_name,
port=_zipkin_port,
protocol=_zipkin_protocol,
transport=transport)
elif self._exporter.__name__ == 'TraceExporter':
_service_name = self._get_service_name(settings.params)
_endpoint = settings.params.get(
OCAGENT_TRACE_EXPORTER_ENDPOINT, None)
self.exporter = self._exporter(
service_name=_service_name,
endpoint=_endpoint,
transport=transport)
elif self._exporter.__name__ == 'JaegerExporter':
_service_name = self._get_service_name(settings.params)
self.exporter = self._exporter(
service_name=_service_name,
transport=transport)
else:
self.exporter = self._exporter(transport=transport)
The exporter is now named StackdriverExporter instead of GoogleCloudExporter. I set up a class in my app named GoogleCloudExporter that inherits from StackdriverExporter and updated my settings.py to use it, but that didn't seem to work. I wonder if there is other code referencing the old naming scheme, possibly for the transport; I'm searching the source code for clues. This at least tells me I can drop the ZIPKIN and JAEGER param options, since the branch taken is determined by the EXPORTER param.
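For reference, the wrapper I tried is roughly this thin subclass (where it lives in my app is arbitrary):

# myapp/tracing.py (location is arbitrary); a thin alias so the middleware's
# 'GoogleCloudExporter' branch above passes GCP_EXPORTER_PROJECT through.
from opencensus.trace.exporters.stackdriver_exporter import StackdriverExporter

class GoogleCloudExporter(StackdriverExporter):
    pass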
Edit 2019-02-08 11:58 PM UTC:
I scrapped Apache2 to isolate the problem and just set my Docker image to use Django's built-in web server, CMD ["python", "/path/to/manage.py", "runserver", "0.0.0.0:80"], and it works! When I go to the site, it writes traces to Stackdriver Trace for each request, and the span name is the module and method being executed.
Somehow Apache2 is not being allowed to send these, though I can do so from the shell when running as root. I'm adding apache2 and mod-wsgi tags to the question, because I have a feeling this has to do with how Apache2 and mod_wsgi fork child processes. Could it be that a child process can't be created because Apache2's child process is sandboxed, or could this be a permissions issue? It seems strange, because as far as I know it is only calling Python modules, not external OS binaries. Any other ideas would be greatly appreciated!
I had this problem while using gunicorn with gevent as the worker class. To resolve it and get cloud traces working, the solution was to monkey-patch grpc, like so:
from gevent import monkey
monkey.patch_all()
import grpc.experimental.gevent as grpc_gevent
grpc_gevent.init_gevent()
See https://github.com/grpc/grpc/issues/4629#issuecomment-376962677
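One place the patch can live is at the very top of the WSGI entry module, so it runs before Django (and therefore grpc) is imported; a rough sketch, with the settings module path as a placeholder:

# wsgi.py (sketch): apply the gevent/grpc patches before anything else imports grpc.
from gevent import monkey
monkey.patch_all()

import grpc.experimental.gevent as grpc_gevent
grpc_gevent.init_gevent()

import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "my_project.settings")  # placeholder

from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()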
I've got an address, say example.com, and have added it to the ALLOWED_HOSTS list. But when I access the site I get
ALLOWED_HOSTS ['127.0.0.1', '::1', '178.XX.XX.XXX', 'xx80::xx81:xxx:xx3x:x12x%eth0']
in the debug error page, while the actual settings.py file reads ['178.XX.XX.XXX', 'example.com']. I figured the changes to settings.py aren't being registered, as I can remove 178.XX.XX.XXX from the list and still access the site (after clearing the browser cache). I've restarted nginx, gunicorn, and the whole server to no avail. The whole thing is set up on Ubuntu 16.04 running Django 1.8 behind nginx and gunicorn. Any ideas where this override of ALLOWED_HOSTS could be coming from?
OK, this is embarrassing, but the one-click install for Django on 16.04 from DigitalOcean adds a block at the very end of settings.py where ALLOWED_HOSTS is redefined.
# Find out what the IP addresses are at run time
# This is necessary because otherwise Gunicorn will reject the connections
import netifaces

def ip_addresses():
ip_list = []
for interface in netifaces.interfaces():
addrs = netifaces.ifaddresses(interface)
for x in (netifaces.AF_INET, netifaces.AF_INET6):
if x in addrs:
ip_list.append(addrs[x][0]['addr'])
return ip_list
# Discover our IP address
ALLOWED_HOSTS = ip_addresses()
ALLOWED_HOSTS.append('.example.com') #I added this line
So appending my domain to that list (the last line above) fixes the problem.
I'm using Django served by uWSGI and NGINX.
Ubuntu 14.04.1 LTS 64-bit
Python 3.4
Django 1.7.4
uWSGI 1.9.17.1-debian (64bit)
NGINX 1.4.6
python-pdfkit 0.5.0
wkhtmltopdf 0.12.2.1
OpenLayers v3.0.0
When I try running pdfkit.from_url(...) to print a map to PDF, the request times out.
More specifically, it hangs in Python's subprocess.py, in communicate / self._communicate:
with _PopenSelector() as selector:
if self.stdin and input:
selector.register(self.stdin, selectors.EVENT_WRITE)
if self.stdout:
selector.register(self.stdout, selectors.EVENT_READ)
if self.stderr:
selector.register(self.stderr, selectors.EVENT_READ)
while selector.get_map():
...
selector.get_map() always returns a valid result, so the loop never exits.
If I run this in the Django development server (instead of uWSGI+NGINX) everything runs fine.
in my view:
wkhtmltopdfBinLocationString = '/usr/local/bin/wkhtmltopdf'
wkhtmltopdfBinLocationBytes = wkhtmltopdfBinLocationString.encode('utf-8')
#this fixes some leftover python2 assumptions about strings
config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdfBinLocationBytes)
pdfkit.from_url(reportPdfUrl, reportPdfFile, configuration=config, options={
'javascript-delay': 1500
})
In several places I have seen answers along the lines of "set the close-on-exec flag on the socket" for similar issues.
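(For concreteness, setting that flag on an already-open file descriptor in Python looks roughly like this; which descriptor to target under uWSGI is exactly what I can't figure out.)

# Rough illustration only: set FD_CLOEXEC on an already-open file descriptor.
import fcntl

def set_cloexec(fd):
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)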
Is this something I can set from my "from_url" options (wkhtmltopdf does not accept it by that name) or can I configure uWSGI to assume 'close-on-exec'? I have not been able to make either of these work, but maybe I just need help with changing my uWSGI customization file:
[uwsgi]
workers = 1
chdir = [...]
plugins = python34
wsgi-file = [...]/wsgi.py
pythonpath = [...]
I tried something like
close-on-exec = true
but that didn't seem to do anything.
NOTE: the wsgi.py file is simple:
"""
WSGI config for dst project.
It exposes the WSGI callable as a module-level variable named ``application``.
For more information on this file, see
https://docs.djangoproject.com/en/dev/howto/deployment/wsgi/
"""
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "[my_project].settings")
from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()
Any thoughts?
I have django-rq set up and running (jobs are queued, I can run manage.py rqworker) but I cannot seem to get the admin screen working - I consistently get a 404 for the /admin/django_rq url.
I have django-extensions installed, and the show_urls command shows that the URL is registered:
$ python manage.py show_urls
...
/admin/django_rq/ django_rq.views.stats rq_home
/admin/django_rq/queues/<queue_index>/ django_rq.views.jobs rq_jobs
/admin/django_rq/queues/<queue_index>/<job_id>/ django_rq.views.job_detail rq_job_detail
/admin/django_rq/queues/<queue_index>/<job_id>/delete/ django_rq.views.delete_job rq_delete_job
...
I am logged in as someone who is a staff member, so the @staff_member_required decorator on the stats view should be working.
(r'^admin/django_rq/', include('django_rq.urls'))
must be before
url(r'^admin/', include(admin.site.urls))
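In other words, the urls.py ordering should look roughly like this (other patterns omitted); Django matches patterns in order, so the r'^admin/' catch-all would otherwise swallow the django_rq URLs and 404 inside the admin:

# urls.py (sketch): the django_rq include must come before the admin catch-all.
from django.conf.urls import include, url
from django.contrib import admin

urlpatterns = [
    url(r'^admin/django_rq/', include('django_rq.urls')),
    url(r'^admin/', include(admin.site.urls)),
]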
I've updated the installation instructions in django-rq's README.rst to drop the "admin" prefix from django-rq's URL pattern so Hugo's issue shouldn't happen again.
When I run the server it shows me the message http://127.0.0.1:8000/. I want to get that URL in code, and I don't have a request object there.
request.build_absolute_uri('/')
or you could try
import socket
socket.gethostbyname(socket.gethostname())
Per the Django docs, you can use:
domain = request.get_host()
# domain = 'localhost:8000'
If you want to know exactly the IP address that the development server was started with, you can use sys.argv for this. The Django development server uses the same trick internally to restart itself with the same arguments.
Start development server:
manage.py runserver 127.0.0.1:8000
Get address in code:
from django.conf import settings

if settings.DEBUG:
    import sys
    print(sys.argv[-1])
This prints 127.0.0.1:8000
import socket
host = socket.gethostname()
import re
import subprocess

def get_ip_machine():
    # Grab the first IPv4 address found in the output of `ifconfig`
    process = subprocess.Popen(['ifconfig'], stdout=subprocess.PIPE)
    ip_regex = re.compile(r'(((1?[0-9]{1,2})|(2[0-4]\d)|(25[0-5]))\.){3}((1?[0-9]{1,2})|(2[0-4]\d)|(25[0-5]))')
    return ip_regex.search(process.stdout.read().decode()).group()