Canceling multiple HTTP/2 requests leads to unrecoverable connection timeouts - amazon-web-services

I have a native iOS app that loads multiple videos from AWS CloudFront via HTTP/2 requests. If the user skips to the next video, I cancel the current request and start a new one.
After a few cancellations, subsequent requests time out and the connection appears to be unrecoverably broken.
Edit: CloudFront monitoring shows only 200 responses, so no errors or timeouts are reported on that side. Debugging with Charles Proxy shows the new requests being sent, but they never receive any data.
To check whether this is an iOS problem, I rebuilt the same logic in NodeJS (using got) and ran into the same problem, so it's not iOS-related.
When I used axios (which only supports HTTP/1.1) for the actual requests in Node, everything worked as expected.
I then disabled HTTP/2 on my CloudFront distribution, and after that the iOS implementation worked as well.
Is this a known problem with HTTP/2, that canceling requests can lead to timeouts? I searched the web and SO but couldn't find anything helpful.
How can I get this to work with HTTP/2, or should I just keep using HTTP/1.1?
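For anyone who wants to reproduce the pattern outside the app: below is a minimal sketch of the cancel-then-reuse behaviour, written here in Python with httpx purely for illustration (my actual repro used NodeJS/got, and the CloudFront URLs are placeholders).

# Sketch only: start HTTP/2 downloads on one connection, cancel them early,
# then see whether a fresh request on the same connection still completes.
# Requires: pip install "httpx[http2]"
import httpx

# Placeholder CloudFront objects, not a real distribution
VIDEO_URLS = [
    "https://dxxxxxxxxxxxx.cloudfront.net/video1.mp4",
    "https://dxxxxxxxxxxxx.cloudfront.net/video2.mp4",
]

with httpx.Client(http2=True, timeout=10.0) as client:
    # Simulate the user skipping videos: read one chunk, then cancel.
    for url in VIDEO_URLS:
        with client.stream("GET", url) as resp:
            next(resp.iter_bytes(64 * 1024), b"")
            # Leaving the block early closes the response, which cancels the
            # HTTP/2 stream (RST_STREAM) while keeping the connection open.

    # Follow-up request on the same connection; in my setup this is where
    # the timeouts started to appear.
    resp = client.get(VIDEO_URLS[0], headers={"Range": "bytes=0-1023"})
    print(resp.status_code, resp.http_version)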

Related

Flask API, browser requests stopped working, no log that the request was received

I have a very simple Flask app that has been working for years, but last week requests from the built app started returning a 500, and on the Flask side I can't even see the request. I'm not seeing an OPTIONS request either.
The lines below previously worked to keep CORS happy.
@app.after_request
def after_request(response):
    response.headers.add('Access-Control-Allow-Origin', '*')
    response.headers.add('Access-Control-Allow-Headers', 'Content-Type,Authorization,Auth-Token')
    response.headers.add('Access-Control-Allow-Methods', 'GET,PUT,POST,DELETE')
    return response
I have tried a few browsers and all of them fail to make any request successfully. Since the server doesn't even acknowledge that a request was made, I'm not sure where to troubleshoot. I did confirm that the app returns data as expected when I use Postman, and that requests succeed when I run the app locally (gulp serve on my computer). I have to believe it's CORS, but what might I have to add or do to make the browser happy? Thanks.
The solution to my problem was that Chrome started to "restrict the ability of websites to communicate with devices on the local network"; see:
Communicating from Chrome 94+ with LAN devices that do not support HTTPS from a web app
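In case the server side also needs to opt in to Chrome's Private Network Access preflight, here is a hedged sketch of what the handler above could look like; the extra header names follow Chrome's PNA proposal and may change, and the page generally also has to be served over HTTPS, which was the real sticking point here.

# Sketch only: the usual CORS headers plus Chrome's Private Network Access
# opt-in header.
from flask import Flask, request

app = Flask(__name__)

@app.after_request
def after_request(response):
    response.headers.add('Access-Control-Allow-Origin', '*')
    response.headers.add('Access-Control-Allow-Headers', 'Content-Type,Authorization,Auth-Token')
    response.headers.add('Access-Control-Allow-Methods', 'GET,PUT,POST,DELETE')
    # Chrome 94+ sends this extra preflight header before talking to a
    # private-network address; the server opts in by answering it.
    if request.headers.get('Access-Control-Request-Private-Network') == 'true':
        response.headers['Access-Control-Allow-Private-Network'] = 'true'
    return response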

HTTP 407 Proxy Authentication Required while accessing Amazon S3

I have tried everything, but I can't seem to fix this issue, which is happening for only one client behind a corporate proxy/firewall. Our Silverlight application connects to Amazon S3 for downloading/uploading documents. For one client, and one client only, it returns a 407 error, and after that the application fails to save anything.
Inner Exception:
System.ServiceModel.ProtocolException: [UnexpectedHttpResponseCode]
Arguments: 407,Proxy Authentication Required
We had something similar at a different client, but that was more of a CORS issue. To resolve it, I used CloudFront to fake a sub-domain that then accesses the S3 bucket, and that solved the issue. I was hoping it would fix things for this client as well, but it didn't.
I have tried adding this code to web.config as suggested by a lot of answers
<system.net>
  <defaultProxy useDefaultCredentials="true">
  </defaultProxy>
</system.net>
I have read articles about passing proxy headers with basic authentication using a username and password, but I am not sure how this would help us. The proxy server is the client's, and any authentication it requires is outside our domain.
**Additional Information**
The Silverlight code references two services. One is our WCF service that retrieves all the data for the application; the other is the Amazon S3 service, which uses the Amazon SOAP API, the endpoint for which is at http://s3.amazonaws.com/doc/2006-03-01/AmazonS3.wsdl
If I only use parts of the system that don't make any calls to the Amazon S3 API, the application works fine. As soon as I go to a part of the system that calls S3, the problem starts. Funnily enough, the call to S3 itself succeeds and I can retrieve the document fine, but any subsequent calls to our WCF service return 407.
Any ideas?
**Update 2**
Based on comments from Elliot Nelson, I checked the stack we were using for making HTTP requests in our application. It turns out we are using ClientHttp for both HTTP and HTTPS requests by default. Here is the code we have in the App constructor:
public App()
{
    Startup += Application_Startup;
    UnhandledException += Application_UnhandledException;
    InitializeComponent();

    WebRequest.RegisterPrefix("http://", WebRequestCreator.ClientHttp);
    WebRequest.RegisterPrefix("https://", WebRequestCreator.ClientHttp);
}
Now I need to understand the differences between ClientHttp and BrowserHttp and when to use each, as well as the potential impacts/issues of switching to BrowserHttp.
**Update 3**
Is there a way to ask browsers to run an in-browser Silverlight application in trusted mode, and would that help bypass this issue?
(Answer #2)
So, most likely (for corporate environments like this network), almost nothing can be done without whatever custom proxy settings are set in IE, usually pushed by corporate policy. To take advantage of these proxy settings, you want to use WebRequestCreator.BrowserHttp, which automatically uses the browser's default settings when making requests.
There's a table of the differences between these two clients available in the Microsoft docs. I'm guessing you were using something (maybe setting custom headers or reading the raw response body) that wasn't supported in BrowserHttp.
For security reasons, you can't "ask" the browser what its proxy settings are and use them, so this is a tricky situation. You can specify Browser vs Client handling by domain, or even for a specific request (the same page above describes how); you may be able in this case to get away with just using ClientHttp for your service calls and BrowserHttp for your S3 calls, and avoid the problem altogether!
For next steps, I'd try that approach; if it doesn't work, I'd try switching wholesale to BrowserHttp just to see if it bypasses the proxy issue (there's almost no chance the application will actually work, since you're probably using ClientHttp-only options).
Long term, you may want to consider making changes to your services so they are usable by a BrowserHttp-only application (this would require you to be pretty basic in your requests/responses, but using only BrowserHttp would be a guarantee you'd work in pretty much any corp network).
Running in trusted mode is probably a group policy thing which would require their AD admins to approve / whitelist your app.
I think the underlying issue you are facing is that the proxy requires NTLM authentication and for whatever reason the browser declines to provide your app with that context.
One way to prove that it's an NTLM auth issue is to test with curl: get it to make a request through the proxy, and then it should be a bit easier to code against. E.g., the following curl command will get you through 99% of Windows corporate proxies (assuming the proxy is at proxy-host.corp:3128):
C:\> curl.exe -v --proxy proxy-host:3128 --proxy-user : --proxy-ntlm https://www.google.com
NOTE: The --proxy-user : tells curl to use the current user's session to perform the NTLM challenge.
So if you can get the client to run that, you can at least verify that NTLM works; then it's just a matter of getting the app to perform the NTLM challenge using the default credentials (which may or may not be provided by the browser session).
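If curl isn't available on that machine, a rough companion check can be scripted in Python with requests; note that requests does not speak NTLM to proxies out of the box, so this only confirms the 407 symptom rather than proving NTLM will succeed (proxy-host:3128 is the same hypothetical proxy as above).

# Sketch only: confirm the proxy is demanding authentication.
import requests

PROXY = "http://proxy-host:3128"  # hypothetical proxy, as in the curl example

try:
    resp = requests.get("http://example.com",
                        proxies={"http": PROXY, "https": PROXY},
                        timeout=10)
    print(resp.status_code, resp.reason)  # expect: 407 Proxy Authentication Required
except requests.exceptions.ProxyError as exc:
    # For HTTPS targets the 407 arrives during CONNECT and is raised instead.
    print("Proxy rejected the request:", exc)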
Since you described this as a Silverlight application, I'm going to assume you can't use classic browser-proxy troubleshooting like "move the browser to a public network" or "try a different browser" to isolate the problem.
You should try to isolate the proxy server and have the customer use the required proxy auth.
The application is making the request, but it might be intercepted by a transparent proxy, or the response might be coming from something other than what you consider the web server.
In the early days, the 401 status was pretty strictly associated with web auth, and 407 with proxy auth.
Architecturally, the separation is a convenience; a single server can exhibit web-server, proxy, and reverse-proxy behaviors.
What happens is that your customer's environment makes a web connection toward the destination, but receives an HTTP 407 status from some host along the way, probably on their own network, or sometimes at the provider. Almost certainly the request is received but not forwarded. The HTTP client your application lives in needs to provide the credentials that host requires. Corporate environments are often complex enough that your customer will say this is the first time they have heard of this (some proxy auth is also dynamic or destination-specific).
Also, in some corporate environments, the operator will allow temporary or permanent white-listing from the proxy-auth service. You should see if they can do this, even temporarily, to confirm there aren't going to be other problems.
In the end, it sounds like your application might not robustly support proxy-auth, or the proxy-auth type they use in their environment.

Requests time out when sent from web services after upgrade to Play 2.3.8

So here's the story: I've got a Play Framework application that uses the org.apache.cxf plugin to provide SOAP services. In my routes file, I have the following:
GET /soap/*path org.apache.cxf.transport.play.CxfController.handle(path)
POST /soap/*path org.apache.cxf.transport.play.CxfController.handle(path)
This routes to one of my own functions that turns the path into another request that will hit my usual controllers. We do this by building up a WSRequestHolder object: we set headers, query parameters, etc.
This used to work quite well in Play 2.2, but after the upgrade to 2.3.8 there seems to be an issue. I've traced it to these lines:
Promise<WSResponse> responsePromise = request.get();
WSResponse response = responsePromise.get(2000);
When we make the request (when calling responsePromise.get), the call times out regardless of the timeout set. I was testing with a basic login request that used to respond in less than 200 ms. I've reproduced the request parameters using Postman and the request works fine on its own, but when it's fired from my web service, it times out.
I may have missed something in the upgrade, but I'm not even sure what to debug. The request clearly doesn't hit the controller, and turning on Play logging at the DEBUG level doesn't show the request at all.
Any help would be appreciated.
Update:
I have tested it in dev and prod mode. Both seem to fail in the same place.
I figured it out. The issue was that, during our redirection of the request, we were adding the Content-Length header twice: once as the actual length and once as zero (in an attempt to force regeneration of the length). It turns out this worked in Play 2.2 but causes the request to hang in 2.3. Making sure to add the Content-Length header only once prevents the hang. Tested and working in both dev and prod mode.

504 gateway timeout django site with nginx+fastcgi

We added the ability for admin users to change the server date & time through the portal. Changing the date & time backwards works fine, but changing it forwards by more than fastcgi_read_timeout returns a '504 Gateway Timeout', even though the server time is successfully changed behind the scenes.
Please advise how to handle this.
Thanks.
I had a very similar issue on another project. Maybe it is best to submit the date & time settings (I assume you would be using NTP server IPs to do this) through the portal asynchronously via a JavaScript AJAX request, and then let the server do its thing with the date & time.
Meanwhile, have the client-side JavaScript continuously probe the server with interval AJAX requests (perhaps every 5 seconds) to get back a response with the server time. That way, each AJAX request starts a new nginx request, and if the first fails or times out, you try a second time, then a third, and so on.
This worked on our system. However, I do not know whether your product has login/authentication. If it does, the user may have to log back in once everything is done, because a change in time may also expire their login session. I don't think this is a big deal, though, because in theory they should only need to change the date/time once in a while, if not just once, so it shouldn't have much impact on the user experience.
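For illustration, here is a minimal sketch of the server-side half of that idea, written with Flask and a background thread purely for brevity (the original site is Django behind nginx + FastCGI, and every route and function name below is made up): the slow time change runs outside the request, so nginx never waits past fastcgi_read_timeout.

# Sketch only: accept the change immediately, do the slow work in the
# background, and expose a status endpoint for the polling AJAX requests.
import subprocess
import threading

from flask import Flask, jsonify, request

app = Flask(__name__)
_status = {"state": "idle"}

def set_system_time(new_time):
    _status["state"] = "working"
    # The slow part that previously ran inside the request and tripped
    # fastcgi_read_timeout (e.g. setting the clock; needs root).
    subprocess.run(["date", "--set", new_time], check=False)
    _status["state"] = "done"

@app.route("/api/set-time", methods=["POST"])
def set_time():
    # Kick the change off in the background and answer right away.
    threading.Thread(target=set_system_time,
                     args=(request.json["time"],),
                     daemon=True).start()
    return jsonify(accepted=True), 202

@app.route("/api/time-status")
def time_status():
    return jsonify(_status)

The browser-side JavaScript then just POSTs to /api/set-time and polls /api/time-status every few seconds until it reports done.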
tags: nginx, NTP, timeout, 504

OpenGraph Debugger reporting bad HTTP response codes

For a number of sites that are functioning normally, when I run them through the OpenGraph debugger at developers.facebook.com/tools/debug, Facebook reports that the server returned a 502 or 503 response code.
These sites are clearly working fine, on servers that are not under heavy load. URLs I've tried include, but are not limited to:
http://ac.mediatemple.net
http://freespeechforpeople.org
These are in fact all sites hosted by MediaTemple. After talking to people at MediaTemple, though, they've insisted that it must be a bug in the Facebook API and not an issue on their end. Is anyone else getting unexpected 500/502/503 HTTP response codes from the Facebook debug tool, whether for sites hosted by MediaTemple or anyone else? Is there a fix?
Note that I've reviewed the Apache logs on one of these and could find no evidence of Apache receiving the request from Facebook, or of a 502 response etc.
I got this response from them:
At this time, it would appear that (mt) Media Temple servers are returning 200 response codes to all requests from Facebook, including the debugger. This can be confirmed by searching your access logs for hits from the debugger. For additional information regarding viewing access logs, please review the following KnowledgeBase article:
Where are the access_log and error_log files for my server?
http://kb.mediatemple.net/questions/732/Where+are+the+access_log+and+error_log+files+for+my+server%3F#gs
You can check your access logs for hits from Facebook by using the following command:
cat <name of access log> | grep 'facebook'
This will return all hits from Facebook. In general, the debugger will specify the user-agent 'facebookplatform/1.0 (+http://developers.facebook.com),' while general hits from Facebook will specify 'facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php).'
Using this information, you can perform even further testing by using 'curl' to emulate a request from Facebook, like so:
curl -Iv -A "facebookplatform/1.0 (+http://developers.facebook.com)" http://domain.com
This should return a 200 or 206 response code.
In summary, all indications are that our servers are returning 200 response codes, so it would seem that the issue is with the way that the debugger is interpreting this response code. Bug reports have been filed with Facebook, and we are still working to obtain more information regarding this issue. We will be sure to update you as more information becomes available.
So the good news is that they are busy solving it. The bad news is that it's out of our control.
There's a forum post on the matter here:
https://forum.mediatemple.net/topic/6759-facebook-503-502-same-html-different-servers-different-results/
With more than 800 views and recent activity, it states that they are working hard on it.
I noticed that HTTPS MT sites don't even give a return code:
Error parsing input URL, no data was scraped.
RESOLUTION
MT admitted it was their fault and fixed it:
During our investigation of the Facebook debugger issue, we have found that multiple IPs used by this tool were being filtered by our firewall due to malformed requests. We have whitelisted the range of IP addresses used by the Facebook debugger tool at this time, as listed on their website, which should prevent this from occurring again.
We believe our auto-banning system has been blocking several Facebook IP addresses. This was not immediately clear upon our initial investigation and we apologize this was not caught earlier.
The reason API requests may intermittently fail is because only a handful of the many Facebook IP addresses were blocked. The API is load-balanced across several IP ranges. When our system picks up abusive patterns, like HTTP requests resulting in 404 responses or invalid PUT requests, a global firewall rule is added to mitigate the behavior. More often than not, this system works wonderfully and protects our customers from constant threats.
So, that being said, we've been in the process of whitelisting the Facebook API ranges today and confirming our system is no longer blocking these requests. We'd still like those affected to confirm whether the issue persists. If for any reason you're still having problems, please open a new support request or respond to your existing one.