Unusual request activity found in Django server log

Below is a screenshot of the server activity log. I can see that many requests are being raised automatically against the server. How can I avoid this?

It looks like someone is fuzzing your website, scanning for common file names or extensions that often have security vulnerabilities. One way to limit this behaviour is to implement rate limiting: for example, cap the number of requests from a client that result in HTTP 404 Not Found during some time period, then give that client a temporary ban. Note that this doesn't stop the scanning entirely, but it buys you time and may deter the attacker or researcher.
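One way to express that idea is a small piece of Django middleware. This is a minimal sketch, assuming a configured cache backend; the thresholds are placeholder values, not recommendations:

# Minimal sketch: temporarily ban clients that trigger too many 404s.
from django.core.cache import cache
from django.http import HttpResponseForbidden

NOT_FOUND_LIMIT = 20   # allowed 404s per window (placeholder)
WINDOW_SECONDS = 60    # counting window (placeholder)
BAN_SECONDS = 3600     # length of the temporary ban (placeholder)

class NotFoundRateLimitMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        ip = request.META.get("REMOTE_ADDR", "unknown")
        if cache.get(f"ban:{ip}"):
            return HttpResponseForbidden("Too many bad requests.")
        response = self.get_response(request)
        if response.status_code == 404:
            key = f"404:{ip}"
            count = cache.get(key, 0) + 1
            cache.set(key, count, WINDOW_SECONDS)
            if count > NOT_FOUND_LIMIT:
                cache.set(f"ban:{ip}", True, BAN_SECONDS)
        return response

Registering the class in MIDDLEWARE in settings.py activates it; with a shared cache such as Redis or Memcached the counts also hold across workers. Keying on REMOTE_ADDR is a simplification, since clients behind a proxy may share an address.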

Related

Request data seemingly dirty in multithreaded Flask app

We are seeing a random error that seems to be caused by two requests' data getting mixed up. We receive a request for quoting shipping costs on an Order, but the request fails because the requested Order is not accessible by the requesting account. I'm looking for anyone who can provide an inkling of what might be happening here; I haven't found anything on Google, the official Flask help channels, or SO that looks like what we're experiencing.
We're deployed on AWS, with Apache, mod_wsgi, 1 process, 15 threads, about 10 instances.
Here's the code that sends the email:
# Build an alert email containing the failing request's data and headers.
msg = f"Order ID {self.shipping.order.id} is not valid for this Account {self.user.account_id}"
body = f"Error:<br/>{msg}<br/>Request Data:<br/>{request.data}<br/>Headers:<br/>{request.headers}"
send_email(msg, body, "devops@*******.com")
request_data = None
The problem is that in that scenario we email ourselves the error and the request data, and the request data we're getting, in many cases, could never have reached that particular piece of code. It can be, for example, a request from the frontend to get the current user's settings, which makes no reference to any orders, never mind trying to get a shipping quote for one.
Comparing the application logs with Apache's access_log, we see that in all cases we got two requests on the same instance: one requesting the quote, and another which is the request that actually gets logged. We don't know whether these two requests are processed by the same thread in rapid succession or by different threads, but they come so close together that I think the latter is much more probable. So far we have no way of unambiguously tying the access_log entries to the application logging, so we don't know which of the two requests is logging the error, but the fact is that we're getting routed to a view that does not correspond to the request's content (i.e., we're not sure whether the quoting request is getting the wrong request object, or the other one is getting routed to the wrong view).
Another fact of interest is that we use GraphQL, so part of the routing is done after Flask/Werkzeug do theirs, but the body we get from flask.request at the moment the error shows up does not correspond to the GraphQL function/mutation that gets executed. This also happens in views mapped directly through Flask. The user is looked up by the flask-login workflow at the very beginning, and it corresponds to the "bad" request (i.e., the one not for quoting).
The actual issue turned out to be a bug in one of the python-graphql libraries (promise), not in Flask, Werkzeug, or Apache. It was not the request data that was "moving" to a different thread, but a different thread trying to resolve the promise for a query that was supposed to be handled elsewhere.
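For anyone hitting something similar: flask.request is a context local bound to the thread handling the request, so any work deferred to another thread must carry the request context along explicitly. A minimal sketch using Flask's copy_current_request_context helper; the route and handler names are made up for illustration:

import threading
from flask import Flask, copy_current_request_context, request

app = Flask(__name__)

@app.route("/quote", methods=["POST"])
def quote():
    # copy_current_request_context snapshots this request's context, so
    # the worker thread sees this request rather than whatever its own
    # context would otherwise resolve to.
    @copy_current_request_context
    def resolve_in_background():
        return request.get_json(silent=True)

    worker = threading.Thread(target=resolve_in_background)
    worker.start()
    worker.join()
    return "ok"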

How can I filter out errors on Sentry to avoid consuming my quota?

I'm using Sentry to log my errors, but there are errors I'm not able to fix (or that cannot be fixed by me), like:
OSError (write error)
Or errors that come from RQ (each time I deploy my app)
Or client errors (which are client.errors)
I can't just ignore them, because they consume all my quota. How can I filter out these errors?
Here are some references for interested people.
uwsgi: OSError: write error during GET request
Fixing broken pipe error in uWSGI with Python
https://github.com/unbit/uwsgi/issues/1623
I created a Gist for rate limiting the number of events that are sent to Sentry:
https://gist.github.com/jurrian/e22f8e724b8499a29c5537e956f0dc7f
It uses ratelimitingfilter, which can be configured to set a rate per minute and, additionally, a burst threshold so that rate limiting only kicks in after a number of events.
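For illustration, a minimal sketch of attaching such a filter to a logging handler; the handler here is a stand-in for whatever handler forwards records to Sentry, and the logger name and rate/burst numbers are placeholders:

import logging
from ratelimitingfilter import RateLimitingFilter

logger = logging.getLogger("myapp")    # hypothetical logger name
handler = logging.StreamHandler()      # stand-in for your Sentry handler
# Allow a burst of 5 events, then at most 1 event per 60 seconds.
handler.addFilter(RateLimitingFilter(rate=1, per=60, burst=5))
logger.addHandler(handler)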
I get the same errors, but I have never had any problems with my quota. But if you really want to filter them, you can do it in your SDK:
https://docs.sentry.io/error-reporting/configuration/filtering/?platform=python
But beware, this could hide other errors as mentioned here:
https://github.com/pypa/warehouse/issues/679
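As a sketch of that SDK-side filtering: the Python SDK accepts a before_send hook that can drop events before they leave the process. The matching logic below (dropping write-error OSErrors) is just an example, and the DSN is a placeholder:

import sentry_sdk

def before_send(event, hint):
    # Returning None discards the event; returning it forwards it as usual.
    if "exc_info" in hint:
        exc_type, exc_value, tb = hint["exc_info"]
        if isinstance(exc_value, OSError) and "write error" in str(exc_value):
            return None
    return event

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    before_send=before_send,
)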
To save yourself some quota, you have two options:
Avoid forwarding events client-side, thus preventing events from being sent to Sentry at all. Have a look at the docs for the available client-side filters. The drawback of this approach is, of course, that you need a new code deployment for any adjustment of client-side filters, and some clients may not instantly reflect your code changes.
Avoid forwarding events on Sentry's side, via inbound filters ([Project] > Project Settings > Inbound Filters). According to the Sentry documentation on quota usage, events filtered via inbound filters do not affect your quota.
Inbound filters include:
Common browser extension errors
Events coming from localhost
Known legacy browser errors
Known web crawlers
Business plans and above additionally allow filtering events:
By their error message
From specific release versions of your code
From certain IP addresses

Google Places API error 502 - The server encountered a temporary error

We run a website that obtains location data through the Google Places API. We have 150k daily searches available, which we haven't reached yet, as the website has only been live for a few weeks. We have suddenly received a 502 error. A notification in the Console says: "The server encountered a temporary error and could not complete your request." Is this a temporary error? Are there any suggestions on what we can do? The website hasn't been available for 40 minutes.
When you receive a 5xx status or UNKNOWN_ERROR in the response, you should implement retry logic. Google has the following recommendation in their web services documentation:
In rare cases something may go wrong serving your request; you may receive a 4XX or 5XX HTTP response code, or the TCP connection may simply fail somewhere between your client and Google's server. Often it is worthwhile re-trying the request as the followup request may succeed when the original failed. However, it is important not to simply loop repeatedly making requests to Google's servers. This looping behavior can overload the network between your client and Google causing problems for many parties.
A better approach is to retry with increasing delays between attempts. Usually the delay is increased by a multiplicative factor with each attempt, an approach known as Exponential Backoff.
https://developers.google.com/maps/documentation/directions/web-service-best-practices#exponential-backoff
However, if retry logic with exponential backoff doesn't help and the error persists for a long time, you should file a bug in the Google issue tracker.
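A minimal sketch of that pattern, assuming the requests library; the endpoint and parameters in the usage comment are placeholders:

import random
import time
import requests

def fetch_with_backoff(url, params, max_attempts=5):
    # Retry 5xx responses with exponentially growing delays plus jitter.
    delay = 1.0
    for attempt in range(max_attempts):
        resp = requests.get(url, params=params)
        if resp.status_code < 500:
            return resp
        time.sleep(delay + random.uniform(0, 0.5))
        delay *= 2  # double the wait after every failed attempt
    resp.raise_for_status()

# Hypothetical usage against the Places text search endpoint:
# fetch_with_backoff(
#     "https://maps.googleapis.com/maps/api/place/textsearch/json",
#     {"query": "pizza", "key": "YOUR_API_KEY"},
# )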
I hope this addresses your doubt!
UPDATE
There was an issue on Google's side yesterday (November 6, 2017); you can refer to the following bug report that explains the issue:
https://issuetracker.google.com/issues/68938173

When using the Admin SDK Directory API to insert Org Units, a dailyLimitExceeded error is returned even though that quota has not been reached

I work for a Student Information System, and we're using the Admin SDK Directory API to create school districts' Google Org Unit structures from within our software.
POST https://www.googleapis.com/admin/directory/v1/customer/customerId/orgunits
When generating these API requests we're consistently receiving dailyLimitExceeded errors even when the district's quota has not been reached.
This error can be bypassed by ignoring it and implementing an exponential back-off routine, but I believe it behaves much more like the quotaExceeded error is intended to behave than like dailyLimitExceeded, in that the request succeeds on the first retry afterwards.
In detail, the test I just ran successfully completed 9 of these API calls and then I received this response on the 10th:
Google.Apis.Requests.RequestError
Quota limit exceeded for the day. [403]
Errors [Message[Quota limit exceeded for the day.] Location[ - ] Reason[dailyLimitExceeded] Domain[usageLimits]
From the start of the batch of API calls it took about 10 seconds to get to the point where the error occurred.
Thanks for your help!
What I would suggest is to slow down your API requests: don't make 10 requests in 1 second; leave some space between requests. You are correct to implement exponential backoff. Also, if you can, use other accounts as well to make requests.
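A minimal sketch of that pacing, assuming a google-api-python-client Directory service; the delays and the helper name are placeholders:

import time
from googleapiclient.errors import HttpError

def insert_org_units(service, customer_id, org_unit_bodies):
    # Insert org units one at a time, backing off on 403 rate errors.
    for body in org_unit_bodies:
        delay = 1.0
        while True:
            try:
                service.orgunits().insert(customerId=customer_id, body=body).execute()
                break
            except HttpError as err:
                if err.resp.status == 403 and delay <= 64:
                    time.sleep(delay)  # back off and retry the same insert
                    delay *= 2
                else:
                    raise
        time.sleep(0.5)  # leave space between successive requests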

Sustain an HTTP connection while Django processes a big request (20 mins+)

I've got a Django site that produces a CSV download. The content of the CSV is dictated by user-defined parameters. It's possible that users will set parameters that require significant thinking time on the server. I need a way of sustaining the HTTP connection so the browser doesn't throw an error message. I've heard it's possible to send intermittent HTTP headers to do this. Can anyone point me in the right direction to set this up on a Django site?
(Unfortunately I'm stuck with the possibility of slow reports; improving my SQL won't mitigate this.)
Don't do it online. Trigger an offline task, use a bit of JavaScript to repeatedly call a view that checks whether the task has finished, and redirect to the finished file when it's ready.
Instead of blocking the user and their browser for 20 minutes (which is not a good idea), do the time-consuming task in the background. When the task finishes and has generated the result, simply notify the user so they just need to download the ready result.
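A minimal sketch of that approach using Celery; generate_csv stands in for the existing slow report logic, and the task and view names are made up for illustration:

# tasks.py
from celery import shared_task

@shared_task
def build_report(params):
    # generate_csv is a placeholder for the existing slow CSV logic;
    # it should write the file and return its path or URL.
    return generate_csv(params)

# views.py
from celery.result import AsyncResult
from django.http import JsonResponse

def report_status(request, task_id):
    # The frontend polls this view until "done" is true, then fetches "path".
    result = AsyncResult(task_id)
    if result.ready():
        return JsonResponse({"done": True, "path": result.get()})
    return JsonResponse({"done": False})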