requests hang unless timeout argument given - django

My backend runs on OpenShift and makes GET requests to other OpenShift clusters via the Kubernetes Python client. I am having an issue where requests hang until the default timeout value is reached. I have done some tests in the pod to see whether it can reach other OpenShift clusters and discovered the following:
requests.get("some_other_cluster_api_url") hangs and only returns (correctly) after about 2 minutes, but requests.get("some_other_cluster_api_url", timeout=1) returns correctly within 1 second. Why does the request not return immediately in the first case?
Edit: curl returns the right response instantly.

As per this Timeout Doc, in the first case you have not set any timeout value. By default, requests does not time out unless a timeout value is set explicitly; without a timeout, your code may hang for minutes or more.
In the second case, you have set the timeout to 1, so requests waits at most 1 second for the server to respond (and raises a Timeout exception if it does not).
For example, if you specify a single value for the timeout, like this:
r = requests.get('https://github.com', timeout=5)
then that value is applied to both the connect and the read timeouts: the call fails with a Timeout exception if the server has not started responding within 5 seconds, instead of hanging indefinitely.
Refer to this Doc and this SO post for more information on the usage of timeouts with requests.
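For instance, a minimal sketch (the URL below is a placeholder, not taken from the question) that sets separate connect and read timeouts and handles the resulting exceptions:
import requests

url = "https://some-other-cluster-api-url/healthz"  # placeholder endpoint

try:
    # wait up to 3 s to establish the connection, then up to 10 s for the first response bytes
    r = requests.get(url, timeout=(3, 10))
    r.raise_for_status()
    print(r.status_code, r.elapsed)
except requests.exceptions.ConnectTimeout:
    print("could not connect within 3 seconds")
except requests.exceptions.ReadTimeout:
    print("connected, but no response data arrived within 10 seconds")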

Related

gUnicorn/Flask/GAE - two processes started for processing the same http request

I have an app on Google App Engine (Python39 standard env) running on gUnicorn and Flask. I'm making a request to the server from a client-side app for a long-running operation and seeing that the request is processed twice. The second processing (worker) started about an hour and a half after the first one had begun working.
I'm not sure whether it is related to gUnicorn specifically or to GAE.
The server controller has logging at the beginning:
@app.route("/api/campaign/generate", methods=["GET"])
def campaign_generate():
    logging.info('Entering campaign_generate')
    # some very long processing here
The controller is called by clicking a button in the UI app. I checked the network tab in the browser's DevTools and confirmed that only one request was fired. And I can see that there is only one request in the server logs at the moment the workers are executing (more on this below).
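To make the two executions easier to tell apart in Cloud Logging, a small diagnostic variant of the handler (the extra logging is hypothetical, not part of the original code) could record the worker PID and the GAE trace header:
import logging
import os

from flask import request

# `app` is the existing Flask application object.
@app.route("/api/campaign/generate", methods=["GET"])
def campaign_generate():
    # Log which gUnicorn worker (PID) picked up the request and which trace
    # it belongs to, so duplicate executions show up clearly in the logs.
    logging.info(
        "Entering campaign_generate pid=%s trace=%s",
        os.getpid(),
        request.headers.get("X-Cloud-Trace-Context"),
    )
    # some very long processing here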
The whole app.yaml is like this:
runtime: python39
default_expiration: 0
instance_class: B2
basic_scaling:
  max_instances: 1
entrypoint: gunicorn -b :$PORT server.server:app --timeout 0 --workers 2
So I have 2 workers with an infinite timeout and basic scaling with max_instances = 1.
I expect that while the app is processing one long-running request, the other worker remains available to serve requests.
I don't expect the second worker to be used to process the same request; that would make no sense (unless the user starts another operation from another browser).
Thanks to --timeout 0 I expect gUnicorn to wait indefinitely until the controller finishes. The only thing that could interfere is GAE's own request timeout, but with basic scaling that is 24 hours. So I expect the app to process requests for several hours without a problem.
But what I'm seeing instead is that after the request has been processing for a while, another execution is started. Here are simplified logs from Cloud Logging:
13:00:58 GET /api/campaign/generate
13:00:59 Entering campaign_generate
..skipped
13:39:13 Starting generating zip-archive (it's something that takes a while)
14:25:49 Entering campaign_generate
So at 14:25, an hour and 25 minutes after the request came in, another processing of the same request started!
Now there are two processings of the request running in parallel.
Needless to say, this increases memory pressure and doubles the execution time.
When the first "worker" finishes its processing (14:29:28 in our example), its result is not returned to the client. It looks like gUnicorn or GAE simply abandons the first request, and the client has to wait until the second worker finishes processing.
Why is it happening?
And how can I fix it?
Regarding the HTTP request records in the log:
I saw only one request in Cloud Logging (the first one) while the processing was active, and even after the controller was called for the second time ('Entering campaign_generate' appeared in the logs) there was no new GET request in the logs. But after everything completed (actually, after the second processing returned a response), a mysterious second GET request appeared. So, from the server logs' point of view (Cloud Logging), it looks as if there were two consecutive requests from the client. But there weren't! There was only one, and I can see it in the browser's DevTools.
The two requests have different traceId and requestId HTTP headers.
It's very hard to understand what's going on. I tried running the app locally (on the same data), and it works as intended.

Google Cloud Tasks HTTP trigger - how to disable retry

I'm trying to create a Cloud Tasks queue that never retries if an HTTP task fails.
According to the documentation, maxAttempts should be what I'm looking for:
Number of attempts per task.
Cloud Tasks will attempt the task maxAttempts times (that is, if the
first attempt fails, then there will be maxAttempts - 1 retries). Must
be >= -1.
So, if maxAttempts is 1, there should be 0 retries.
But, for example, if I run
gcloud tasks queues create test-queue --max-attempts=1 --log-sampling-ratio=1.0
then use the following Python code to create an HTTP task:
from google.cloud import tasks_v2beta3
from google.protobuf import timestamp_pb2

client = tasks_v2beta3.CloudTasksClient()

project = 'project_id'  # replace by real project ID
queue = 'test-queue'
location = 'us-central1'
url = 'https://example.com/task_handler'  # replace by some endpoint that returns a 5xx status code

parent = client.queue_path(project, location, queue)

task = {
    'http_request': {  # Specify the type of request.
        'http_method': 'POST',
        'url': url  # The full url path that the task will be sent to.
    }
}

response = client.create_task(parent, task)
print('Created task {}'.format(response.name))
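A quick way to check what retry configuration the queue actually ended up with (this step is not in the original post) is to describe it with gcloud:
gcloud tasks queues describe test-queue
The retryConfig section of the output should show the effective maxAttempts value.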
In the Stackdriver logs for the queue (which I can see because I used --log-sampling-ratio=1.0 when creating the queue), the task is apparently retried once: there is one dispatch attempt, followed by a dispatch response with status UNAVAILABLE, followed by another dispatch attempt, which is finally followed by the last dispatch response (also indicating UNAVAILABLE).
Is there any way to retry 0 times?
Note
About maxAttempts, the documentation also says:
This field has the same meaning as task_retry_limit in queue.yaml/xml.
However, when I go to the description for task_retry_limit, it says:
The number of retries. For example, if 0 is specified and the task
fails, the task is not retried at all. If 1 is specified and the task
fails, the task is retried once. If this parameter is unspecified, the
task is retried indefinitely. If task_retry_limit is specified with
task_age_limit, the task is retried until both limits are reached.
This seems to be inconsistent with the description of maxAttempts, as it indicates that the task would be retried once if the parameter is 1.
I've experimented with setting maxAttempts to 0, but that seems to make it assume a default value of 100.
Thank you in advance.
As @averi-kitsch mentioned, this is currently an internal issue which our Cloud Tasks engineering team is working on right now; sadly, we don't have any ETA yet.
You can follow the progress of this issue in this Public Issue Tracker; click on the "star" to subscribe to it and receive future updates.
As a workaround, if you don't want the task to retry after it fails, set task_retry_limit: 0 directly in the queue.yaml.
Example:
queue:
- name: my-queue1
  rate: 1/s
  retry_parameters:
    task_retry_limit: 0
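If the queue is managed this way, the file is applied with the usual App Engine deployment command (assuming an App Engine-based project):
gcloud app deploy queue.yaml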

Do ColdFusion Scheduled Tasks have a built-in request timeout?

I have several scheduled tasks that essentially perform the same type of functionality:
Request JSON data from an external API
Parse the data
Save the data to a database
The "Timeout (in seconds)" field in the Scheduled Task form is empty for each task.
Each CFM template has the following line of code at the top of the page:
<cfscript>
    setting requesttimeout=299;
</cfscript>
However, I consistently see the following entries in the scheduled.log file:
"Information","DefaultQuartzScheduler_Worker-8","04/24/19","12:23:00",,"Task
default - Data - Import triggered."
"Error","DefaultQuartzScheduler_Worker-8","04/24/19","12:24:00",,"The
request has exceeded the allowable time limit Tag: cfhttp "
Notice that there is only a 1-minute difference between the start of the task and its timing out.
I know that, according to Charlie Arehart, the timeout error messages that are logged are usually not indicative of the actual cause/point of the timeout, and, in fact, I have run tests and confirmed that the CFHTTP calls generally run in a matter of 1-10 seconds.
Lastly, when I make the same request in a browser, it runs until the requesttimeout set in the CFM page is reached.
This leads me to believe that there is some "forced"/"built-in"/"unalterable" request timeout for Scheduled Tasks, or that they use the default timeout value for the server and/or application (which is set to 60 seconds for this server/application), yet I cannot find this documented anywhere.
If this is the case, is it possible to schedule a task in ColdFusion that runs longer than the forced request timeout?

Slack slash command works sometimes

We have a Slack slash command that executes a Lambda (written in Node) in AWS. The Lambda calls an internal service we have and returns JSON. It often takes multiple executions to get the slash command to work. The caller gets the message below:
Darn - that slash command didn't work. If you see this message more than once we suggest you contact "name".
We ran a bash script that calls the Lambda once a minute for 12 hours. The average duration of the calls was about 1.5 seconds, well below the slash command's expectation that a response will be returned within 3 seconds. Has anyone else experienced this issue?
Increase the timeout beyond 3 seconds even though your estimated run time is around 1.5 seconds.
Also, note that AWS Lambda limits the total concurrent executions across all functions within a given region to 100 (a default limit which can be increased on request).
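For example, a rough sketch (the function name is a placeholder) of raising the Lambda timeout with boto3:
import boto3

# Hypothetical function name; replace with the real slash-command Lambda.
FUNCTION_NAME = "slack-slash-command-handler"

client = boto3.client("lambda")

# Raise the function timeout to 10 seconds, well above the ~1.5 s average
# duration observed and above Slack's 3-second response expectation.
client.update_function_configuration(
    FunctionName=FUNCTION_NAME,
    Timeout=10,
)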

Restricting a web service request from being resubmitted after 5 minutes

The web service request for one of our Java REST services gets submitted again from the client/browser every 5 minutes when the service is taking longer to execute.
Can we prevent this resubmission once the request has been running for a sufficiently long time?
Regards,
Vaibhav
Exactly the same happened to me; the only difference is that mine is a simple HTTP POST request and not a web-service request.
Which application server are you using? It's up to the application server to resubmit the request in case of a request timeout.
You observed it right: the interval of the double trigger is exactly 5 minutes, which means your HTTP server is configured to time out after 5 minutes. You need to set the timeout to -1 (infinite) or some sensibly longer value so that the request won't time out and be re-submitted again.