Can I stop the significant amount of [Django] ERROR (EXTERNAL IP): Invalid HTTP_HOST header from strange sites I'm getting? - django

Since adding the option to email me (the admin) when there are problems with my Django server, I keep getting a LOT of the following emails (20 in the last hour alone).
[Django] ERROR (EXTERNAL IP): Invalid HTTP_HOST header: 'staging.menthanhgiadalat.com'. You may need to add 'staging.menthanhgiadalat.com' to ALLOWED_HOSTS.
I've put the following at the top of my sites-enabled nginx config, as I read (somewhere on SO) that this might prevent me from getting these types of emails:
server {
    server_name _;
    return 444;
}
But it hasn't done anything.
In the next server block I have the IP address and domain names for my site. Could this be causing the problem?
This 'staging' site isn't the only domain I'm being asked to add to my ALLOWED_HOSTS. But it is, by far, the most frequent.
Can I stop this type of alert being sent? Can I stop it from being raised? Is there something I've configured incorrectly on my server? (I'm ashamed to admit I'm pretty new at this.)
Thanks for any help you might be able to give.

You can configure LOGGING in your settings.py to silence django.security.DisallowedHost as directed at https://docs.djangoproject.com/en/3.2/topics/logging/#django-security
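As a minimal sketch of what that could look like in settings.py (the handler name "null" is just a label chosen here, and the rest of your LOGGING dict, if you already have one, would need to be merged in):

```python
# settings.py (sketch): send DisallowedHost errors to a handler that
# discards them, and stop them from propagating to the parent "django"
# logger, which is where the mail_admins handler normally lives.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # NullHandler drops every record it receives.
        "null": {"class": "logging.NullHandler"},
    },
    "loggers": {
        "django.security.DisallowedHost": {
            "handlers": ["null"],
            "propagate": False,
        },
    },
}
```

This only mutes the report; Django still rejects the requests with a 400 response.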

Related

Why is google trying to access my backend server?

I have a productionized Django backend server running on Kubernetes (Deployment/Service/Ingress) on GCP.
My django is configured with something like
ALLOWED_HOSTS = [BACKEND_URL,INGRESS_IP,THIS_POD_IP,HOST_IP]
Everything is working as expected.
However, my backend server logs intermittent errors like these (about 7 per day)
DisallowedHost: Invalid HTTP_HOST header: 'www.google.com'. You may need to add 'www.google.com' to ALLOWED_HOSTS.
DisallowedHost: Invalid HTTP_HOST header: 'xxnet-f23.appspot.com'. You may need to add 'xxnet-f23.appspot.com' to ALLOWED_HOSTS.
DisallowedHost: Invalid HTTP_HOST header: 'xxnet-301.appspot.com'. You may need to add 'xxnet-301.appspot.com' to ALLOWED_HOSTS.
DisallowedHost: Invalid HTTP_HOST header: 'www.google.com'. You may need to add 'www.google.com' to ALLOWED_HOSTS.
DisallowedHost: Invalid HTTP_HOST header: 'narutobm1234.appspot.com'. You may need to add 'narutobm1234.appspot.com' to ALLOWED_HOSTS.
DisallowedHost: Invalid HTTP_HOST header: 'z-h-e-n-116.appspot.com'. You may need to add 'z-h-e-n-116.appspot.com' to ALLOWED_HOSTS.
DisallowedHost: Invalid HTTP_HOST header: 'www.google.com'. You may need to add 'www.google.com' to ALLOWED_HOSTS.
DisallowedHost: Invalid HTTP_HOST header: 'xxnet-131318.appspot.com'. You may need to add 'xxnet-131318.appspot.com' to ALLOWED_HOSTS.
DisallowedHost: Invalid HTTP_HOST header: 'www.google.com'. You may need to add 'www.google.com' to ALLOWED_HOSTS.
DisallowedHost: Invalid HTTP_HOST header: 'stoked-dominion-123514.appspot.com'. You may need to add 'stoked-dominion-123514.appspot.com' to ALLOWED_HOSTS.
My primary question is: Why - what are all of these hosts?
I certainly don't want to allow those hosts without understanding their purpose.
Bonus question: What's the best way to silence unwanted hosts within my techstack?
My primary question is: Why - what are all of these hosts?
Some of them are web crawlers that gather information for various purposes. For example, the www.google.com address is most likely the web crawlers that populate the search engine databases for Google search, etcetera.
Google probably got to your back-end site by accident by following a chain of links from some other page that is searchable; e.g. your front end website. You could try to identify that path. I believe there is also a page where you can request the removal of URLs from search ... though I'm not sure how effective that would be in quieting your logs.
Others may be robots probing your site for vulnerabilities.
I certainly don't want to allow those hosts without understanding their purpose.
Well, you can never entirely know their purpose. And in some cases, you may never be able to find out.
Bonus question: What's the best way to silence unwanted hosts within my techstack?
One way is to simply block access using a manually managed blacklist or whitelist.
A second way is to have your back-end publish a "/robots.txt" document; see About /robots.txt. Note that not all crawlers will respect a "robots.txt" page, but the reputable ones will; see How Google interprets the robots.txt specification.
Note that it is easy to craft a "/robots.txt" that says "nobody crawl this site".
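For illustration, here is a small sketch of such a "nobody crawl this site" file, checked with Python's standard urllib.robotparser. The helper name is_allowed and the example URL are made up for this sketch; how you serve the text at /robots.txt (a Django view, an nginx location, etc.) is up to you.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt body that asks every crawler to stay away from every path.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a well-behaved crawler may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical backend URL, only used to exercise the rules.
    print(is_allowed("Googlebot", "https://backend.example.com/api/"))
```

Keep in mind this only affects crawlers that choose to honor robots.txt; it does nothing against probes that send arbitrary Host headers.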
Other ways would include putting your backend server behind a firewall or giving it a private IP address. (It seems a bit of an odd decision to expose your back-end services to the internet.)
Finally, the sites you are seeing are already being blocked, and Django is telling you that. Perhaps what you should be asking is how to mute the log messages for these events.
Django checks the Host header of every request against the ALLOWED_HOSTS setting. When the host is not listed there, Django raises the Invalid HTTP_HOST header error (DisallowedHost). See documentation.
These HTTP requests could be coming from bots that send a wrong Host header value. You may want to consider using Cloud Armor to block traffic for specific Host headers/domains.
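To make the check concrete, here is a simplified sketch of the matching rules that the Django docs describe for ALLOWED_HOSTS ("*" matches anything, a leading "." matches the domain and all its subdomains, everything else is an exact, case-insensitive match). The function host_is_allowed is written for this illustration and is not Django's own code; it also assumes the port has already been stripped from the host.

```python
from typing import Iterable


def host_is_allowed(host: str, allowed_hosts: Iterable[str]) -> bool:
    """Simplified illustration of ALLOWED_HOSTS matching (not Django's code)."""
    host = host.lower()
    for pattern in allowed_hosts:
        pattern = pattern.lower()
        if pattern == "*":
            # Wildcard: every host is accepted.
            return True
        if pattern.startswith("."):
            # ".example.com" matches example.com and any subdomain of it.
            if host == pattern[1:] or host.endswith(pattern):
                return True
        elif host == pattern:
            return True
    return False


if __name__ == "__main__":
    allowed = ["api.example.com"]  # hypothetical backend host
    for candidate in ("api.example.com", "www.google.com"):
        print(candidate, host_is_allowed(candidate, allowed))
```

A request whose Host header says www.google.com fails this check no matter who actually sent it, which is why those names show up in the log.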

Django Invalid HTTP_HOST header: '<ip>'. You may need to add '<ip>' to ALLOWED_HOSTS

<ip> is showing an actual ip address, I just didn't include it in the title.
I believe this ip address is the internal ip of my EC2 instance. I'm using AWS Elastic beanstalk to host.
I see this question has been answered a lot on SO, and the answer is always to add the ip address to the ALLOWED_HOSTS, but in my case I've set ALLOWED_HOSTS=['*'] and I'm still getting the error.
The weird thing is, I'm only getting the error when I try to access the site from my phone. When I access from the desktop browser, it works fine...
Things I've tried:
I've double checked my elastic beanstalk deployment and the changes are definitely deployed.
Ok so this probably won't be the answer for other people but it was for me..
In my case, I was doing an http GET request from my frontend and forgot the extra "/" at the end of the url. My django urls.py defines the url with a slash at the end. My fix was to add the extra "/" when doing the http GET.
On my desktop, this was handled automatically because Django would reply with an automatic redirect (a 301, if it comes from APPEND_SLASH) and my desktop browser would go to the url with / at the end, but my phone was not following the redirect!
This somehow was throwing the invalid HTTP_HOST header error.
For most people, the fix for an error message like this is to add all of your domains to the list of ALLOWED_HOSTS.
Oh and if you're using elasticbeanstalk like me, don't forget to add the AWS domain name. It should look something like this:
ALLOWED_HOSTS = [
    '<your-unique-id>.elasticbeanstalk.com',
    'example.com',
    '<your-subdomain>.example.com',
]
DON'T do ALLOWED_HOSTS = ['*'] in prod!!

Invalid HOST Header from router IP

I keep getting an Invalid HOST Header error which I am trying to find the cause of. It reads as such:
Report at /GponForm/diag_Form
Invalid HTTP_HOST header: '192.168.0.1:443'. You may need to add '192.168.0.1' to ALLOWED_HOSTS
I do not know what /GponForm/diag_Form is but from the looks of it, it may be a vulnerability attacked by malware.
I also am wondering why the IP is from a router 192.168.0.1 as well as why it is coming through SSL :443
Should I consider putting a HoneyPot and blocking this IP address? Before I do, why does the IP look like a local router?
The full Request URL in the report looks like this:
Request URL: https://192.168.0.1:443/GponForm/diag_Form?style/
I am getting this error at least ~10x/day now so I would like to stop it.
Yes, this is almost certainly an exploit attempt: someone tried to access this URL on a router (routers usually have the IP 192.168.0.1).
The IP looks like a local router because the attacker's request contains a Host header with this value.
Maybe Django is being run locally with DEBUG=True.
You may consider running it in a more production-like setup, with a web server (e.g. nginx) in front that filters unwanted requests via its config, and then adding fail2ban to parse the nginx error logs and ban the offending IPs.
Or make the site available only from specific IPs, or add simple authentication, e.g. Basic Auth at the web-server level.
Previous irrelevant answer
The ALLOWED_HOSTS option specifies the domains a Django project can serve.
When DEBUG=True and ALLOWED_HOSTS is empty (the usual local python manage.py runserver setup), hosts are validated against localhost (including its subdomains), 127.0.0.1 and [::1].
If you access Django via a different URL, it will complain in this manner.
To allow access from other domains, add them to ALLOWED_HOSTS: ALLOWED_HOSTS = ['localhost', '127.0.0.1', '[::1]', '192.168.0.1'].

unknown Invalid HTTP_HOST header in Django logs: api-keyboard.cmcm.com

I have been testing a new Django application on aws beanstalk. While looking through the httpd error logs I see thousands of lines like this:
... Invalid HTTP_HOST header: 'api-keyboard.cmcm.com'. You may need to add 'api-keyboard.cmcm.com' to ALLOWED_HOSTS.
Normally this is because I didn't add my own hostname to ALLOWED_HOSTS but this domain is completely foreign to me and I can't find references to it online.
So I'm wondering what this means, how a random host like this ends up in the header, and whether anyone recognizes this one.
Thanks!

Django with gunicorn and nginx: HTTP 500 not appearing in log files

I have a Django app running on a gunicorn server with an
nginx up front.
I need to diagnose a production failure with an HTTP 500 outcome,
but the error log files do not contain the information I would expect.
Thusly:
gunicorn has setting errorlog = "/somepath/gunicorn-errors.log"
nginx has setting error_log /somepath/nginx-errors.log;
My app has an InternalErrorView the dispatch of which does an
unconditional raise Exception("Just for testing.")
That view is mapped to URL /fail_now
I have not modified handler500
When I run my app with DEBUG=True and have my browser request
/fail_now, I see the usual Django error screen alright, including
the "Just for testing." message. Fine.
When I run my app with DEBUG=False, I get a response that consists
merely of <h1>Server Error (500)</h1>, as expected. Fine.
However, when I look into gunicorn-errors.log, there is no entry
for this HTTP 500 event at all. Why? How can I get it?
I would like to get a traceback.
Likewise in nginx-errors.log: No trace of a 500 or the /fail_now URL.
Why?
Bonus question:
When I compare this to my original production problem, I am getting
a different response there: a 9-line document with
<h1><p>Internal Server Error</p></h1> as the central message.
Why?
Bonus question 2:
When I copy my database contents to my staging server (which is identical
in configuration to the production server) and set
DEBUG=True in Django there, /fail_now works as expected, but my original
problem still shows up as <h1><p>Internal Server Error</p></h1>.
WTF?
OK, it took long, but I found it all out:
The <h1>Server Error (500)</h1> response comes from Django's
django.views.defaults.server_error (if no 500.html template exists).
The <h1><p>Internal Server Error</p></h1> from the bonus question
comes from gunicorn's gunicorn.workers.base.handle_error.
nginx logs the 500 error in the access log file, not the error log file;
presumably because it was not nginx itself that failed.
For /fail_now, gunicorn will also log the problem in the access log,
not the error log; again presumably because gunicorn as such has
not failed, only the application has.
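One possible way to get the traceback for cases like /fail_now is to give Django's django.request logger a handler of its own, since with DEBUG=False Django's default logging setup mails errors to the admins rather than writing them to the console. The following is a sketch under assumptions: the handler and formatter names are arbitrary, and whether stderr ends up in gunicorn-errors.log depends on how gunicorn is started (its capture_output setting redirects stdout/stderr to the errorlog).

```python
# settings.py (sketch): write unhandled-exception reports, including the
# traceback, to stderr even when DEBUG=False.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "verbose": {
            "format": "%(asctime)s %(levelname)s %(name)s %(message)s",
        },
    },
    "handlers": {
        # StreamHandler writes to sys.stderr by default.
        "stderr": {
            "class": "logging.StreamHandler",
            "formatter": "verbose",
        },
    },
    "loggers": {
        # Django logs 5xx responses to django.request at ERROR level.
        "django.request": {
            "handlers": ["stderr"],
            "level": "ERROR",
        },
    },
}
```

Propagation is left at its default here, so any existing admin emails keep working.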
My original problem did actually appear in the gunicorn error log,
but I had never searched for it there, because I had
introduced the log file only freshly (I had relied on Docker logs
output before, which is pretty confusing) and assumed it would be
better to use the very explicit InternalErrorView for initial
debugging. (This was an idea that was wrong in an interesting way.)
However, my actual programming error involved sending a response
with a Content-Disposition header (generated in Django code) like this:
attachment; filename="dag-wönnegården.pdf".
The special characters are apparently capable of making
gunicorn stumble when it processes this response.
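For what it's worth, a common way to avoid raw non-ASCII characters in such a header is the percent-encoded filename*=UTF-8''... form. The helper below is a minimal sketch using only the standard library; the function name content_disposition is made up here, and it is not necessarily the fix that was applied in the original code.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build an attachment header value that contains only ASCII characters."""
    try:
        filename.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII name: percent-encode the UTF-8 bytes (extended notation).
        return "attachment; filename*=UTF-8''{}".format(quote(filename, safe=""))
    # Plain ASCII name: escape backslashes and quotes, then quote the value.
    escaped = filename.replace("\\", "\\\\").replace('"', '\\"')
    return 'attachment; filename="{}"'.format(escaped)


if __name__ == "__main__":
    print(content_disposition("dag-wönnegården.pdf"))
    print(content_disposition("report.pdf"))
```

Django's FileResponse builds a similar header on its own when it is given as_attachment=True and a filename.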
Writing the question helped me considerably with diagnosing this situation.
Now if this response helps somebody else,
the StackOverflow magic has worked once again.
Maybe the server's 500 response is logged in the access_log, not in the error_log.
In the nginx default file:
access_log /var/log/nginx/example.log;
I think <h1><p>Internal Server Error</p></h1> is generated by nginx in production.
With DEBUG=False:
a raised exception is treated as an error (HTTP 500), so unless you changed the view for handler500, the default 500 error page will be displayed.
With DEBUG=True:
the raised exception is displayed in Django's fancy debug page.