I have an application that I'm deploying on a private CloudFoundry instance, using the Ruby buildpack. Sometimes an inbound request causes my application to crash and the container to restart. At this point, the user is served an error page saying something like "Error 502 - container was unable to service your request". This error is served not by my app but by the infrastructure, so I don't have any control over it.
My app is designed to run as part of a dashboard/kiosk that refreshes periodically, and it adds a Refresh header to every successful response. The refresh time is dynamic and not always the same value (it may be anything from 0 seconds to 5 minutes), which is why I don't use a browser refresh extension.
When I hit the error page, there is no Refresh header so the page just sits there forever. How can I get CloudFoundry to add a Refresh header to the error page? I'd be content with that value being some static value set in my manifest.yml, but I can't see any option to get it to do that.
You can't modify responses that are generated by the Gorouters. If you want to customize those, and you have the authority to do so, consider putting something in your external load balancer that watches for errors from the infrastructure (I believe all such errors have headers that start with X-Cf-*, but I may be mistaken) and customizes the responses when they are received.
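As a rough illustration of the load balancer idea, here is a minimal Python sketch of the header logic only. It assumes the router marks its own error responses with an X-Cf-RouterError header (an assumption that should be checked against your installation), and the function name add_refresh_on_router_error and the 30 second default are made up for this example. How you hook it into your load balancer depends entirely on the product you use.

```python
def add_refresh_on_router_error(headers, refresh_seconds=30):
    """Return a copy of the response headers, adding a Refresh header when
    the response appears to come from the router rather than from the app.

    Assumption: router-generated errors carry an X-Cf-RouterError header.
    """
    # HTTP header names are case-insensitive, so compare them in lowercase.
    names = {name.lower() for name in headers}
    result = dict(headers)
    if "x-cf-routererror" in names and "refresh" not in names:
        result["Refresh"] = str(refresh_seconds)
    return result
```

Responses from the app itself pass through unchanged, so the dynamic Refresh value the app already sets is never overwritten.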
I have a baffling issue with cookie handling in a Blazor server app (.NET Core 6) using OpenID (Keycloak). Actually, more than a couple of issues, which may or may not be linked. It's a typical (?) reverse proxy architecture:
A central nginx receives queries for services like Jenkins, JupyterHub, SonarQube, Discourse etc. These are mapped through aliases to internal IPs where the nginx can access them. This nginx intercepts URLs like: https://hub.domain.eu
A reverse proxy which resolves to https://dsc.domain.eu. This forwards requests to a Blazor app running in Kestrel on port 5001. Both Kestrel and nginx run under SSL, which is required to get the websockets working.
Some required background: the Blazor app is essentially a 'hub' whose razor pages 'host' the above-mentioned services in an iframe-like fashion. How it works: when the user asks for the root path (https://hub.domain.eu), it opens the root page of the Blazor app (/).
The nav menu contains the links to razor pages which contain the iframes for the abovementioned services. For example:
The relative path is intercepted by the ‘central’ nginx which loads Jenkins. Everything is under the same Keycloak OpenID server. Note that everything works fine without the Blazor app.
Scenarios that cause the same problem
Assume the user logs in to my app using the Keycloak login page (NOT the REST API) through redirection, then proceeds to a link and is indeed logged in there as well. The controls in the app change accordingly to indicate that the user is authenticated. If you close the tab and open a new one, the Blazor app acts as if the user is not logged in, while the other services (e.g. Jenkins) still show the logged-in user from before. When you press the Login link, you are greeted with a 502 nginx error. If you clear the cookies in the browser (or use private / stealth mode), everything works again. The same happens if you just log off, e.g. from Jenkins.
Assume that the user is now in a service such as Jenkins, SonarQube, etc. If you press F5 now, you have two problems. First, you get a 404 error, but only on SOME services, such as SonarQube, and not on others; this is a side problem for another post. Second, after pressing Back / Refresh, the Blazor app again appears not logged in.
The critical part of Program.cs looks like the following:
This class handles the login / logoff:
Side notes:
SaveTokens = false still causes large header errors and results in empty token (shown in the above code with the Warning: Token received was null). I’m still able to obtain user details though from httpContext.
No errors show up in the reverse proxy error.log and in Kestrel (all deployed in Linux)
MOST important: if I copy-paste the failed login link (the one that produced the 502 error) to a "clean" browser, it works fine.
There are lots of properties affecting OpenID Connect, and it could also be an nginx issue, but I've run out of ideas over the last five days. The nginx config has already been adjusted for large headers and websockets.
Any clues as to where I should at least focus my research to track the error??
The 502 error points to a problem on the nginx side. The reverse proxy was configured properly but, as it turned out, the front nginx was not. Once we set its header size to the suggested size, everything worked.
I've built a content editor XML UI element. I launch it via a command with the code
string url = Sitecore.UIUtil.GetUri("control:CloneToMarkets") + "&id=" + HttpUtility.UrlEncode(id) + "&path=" + HttpUtility.UrlEncode(path) + "&database=" + HttpUtility.UrlEncode(database);
Context.ClientPage.ClientResponse.ShowModalDialog(url, "400px", "700px", string.Empty, true);
In my DialogForm class I'm overriding OnLoad() and OnOK(). OnLoad() invokes its base method at the start of the method, and OnOK() ends with a call to its base method.
If I "ok", "cancel" or "X" on the custom DialogForm I get this error:
My dialog works fine, and completes its purpose, I'm just getting this error afterwards. Does anyone know what causes this?
I believe that you experience a known problem when Sitecore Client users get mistakenly categorised as robots.
Usually, it happens when Sitecore Analytics is enabled and users do not visit the site front-end before logging into the Sitecore Client. In this situation, the current session may be mistakenly identified as a robot visit and will cause the admin session expiration as Sitecore Analytics reduces session timeout for robot visits aiming to minimise the server resources utilisation.
So, make sure that Sitecore.Analytics.Tracking.RobotDetection.config is disabled on your CM instance and also make the following changes in web.config:
In system.web/httpModules node, name="MediaRequestSessionModule" change the following line from
"Sitecore.Analytics.RobotDetection.Media.MediaRequestSessionModule, Sitecore.Analytics.RobotDetection"
to "Sitecore.Analytics.Media.MediaRequestSessionModule, Sitecore.Analytics".
In system.webServer/modules node, name="MediaRequestSessionModule" change the following line from
"Sitecore.Analytics.RobotDetection.Media.MediaRequestSessionModule, Sitecore.Analytics.RobotDetection"
to "Sitecore.Analytics.Media.MediaRequestSessionModule, Sitecore.Analytics".
Also, take a look at similar posts here:
Your session may have been lost due to a time-out or a server failure. in Sitecore 8.1
https://sitecore.stackexchange.com/questions/6383/can-sitecore-be-customized-to-auto-save-pages-including-the-rich-text-editor
I'm trying to host a Django web application on a Windows 10 machine with IIS 10 and FastCGI.
Whilst everything is running well so far, I'm running into problems with certain POST requests when uploading larger files (~120 MB), namely an HTTP 500 error. I'm at a point where I don't know how to debug any further.
I resolved the Error "413.1 - Request Entity Too Large" by increasing the request limits. However, now I get an HTTP-error stating the following:
C:\Apps\Python3\python.exe - The FastCGI process exceeded configured request timeout
The timeout is set to 90 seconds, and I can tell that after the uploading of files completes, my browser is waiting about that time for a response from the server.
The Django view does not have many operations to perform before responding to the request. If I run the Django development server on the same machine, the response is sent just seconds after the files have been uploaded. With IIS, the HTTP 500 response arrives more than 1 minute later than that.
I added some code to the Django-view in the post()-method to write something to a file whenever the method is called:
def post(self, request, *args, **kwargs):
    with open(os.path.join(settings.REPORT_DIR, "view_output.txt"), "w") as f:
        f.write("tbd.")
    (...)
However, this action is never performed, although it works in other Django-Views. I therefore assume a problem with IIS processing the request.
I enabled FREB logging, but am a little lost with interpretation. The "Errors & Warnings" just state the LOG_FILE_MAX_SIZE_TRUNCATE event, probably due to the large request.
Since I'm new to IIS, how can I debug any further?
Thank you very much!
To resolve the issue, you could follow the steps below:
The IIS default maximum upload size is 30 MB (30000000 bytes). Increase this value as follows:
Open IIS Manager and select your site.
Double-click the Request Filtering feature in the middle pane.
In the Actions pane on the right-hand side of the screen, click the Edit Feature Settings... link.
In the Request Limits section, enter an appropriate Maximum allowed content length (Bytes), e.g. 2147483648, which is 2 GB.
Click OK to apply the setting, then go back.
Increase the site connection time-out:
Open Internet Information Services (IIS) Manager.
Expand the local computer node, expand Web Sites, right-click the appropriate Web site, point to Manage Web Site, and click Advanced Settings.
In the Advanced Settings window, expand Connection Limits, change the value in the Connection time-out field, and then click OK.
Application Pool setting:
Open IIS.
On the left side, select "Application Pools".
On the right side, right-click the application pool your site uses and select Advanced Settings.
In the advanced settings, increase "Idle Time-out (minutes)".
CGI time-out:
In IIS, double-click the CGI icon and increase "Time-out".
We would like to add a maintenance page to our front-end which should appear when the back-end is currently unavailable (e.g. stopped or deploying). When the application is not running, the following message is displayed together with a 404 status code:
404 Not Found: Requested route ('name.scapp.io') does not exist.
Additionally, there is a header present when the application is stopped (and only then):
X-Cf-Routererror: unknown_route
Is this header reliably added if the application is not running? If this is the case, I can use this flag to display a maintenance page.
By the way: Wouldn't it make more sense to provide a 5xx status code if the application is not started/crashed, i.e. differ between stopped applications and wrong request routes? Catching a 503 error would be much easier, as it does not interfere with our business logic (404 is used inside the application).
Another option is to use a wildcard route.
https://docs.cloudfoundry.org/devguide/deploy-apps/routes-domains.html#create-an-http-route-with-wildcard-hostname
An application mapped to a wildcard route acts as a fallback app for route requests if the requested route does not exist.
Thus you can map a wildcard route to a static app that displays a maintenance page. Then if your app mapped to a specific route is down or unavailable the maintenance page will get displayed instead of the 404.
In regards to your question...
By the way: Wouldn't it make more sense to provide a 5xx status code if the application is not started/crashed, i.e. differ between stopped applications and wrong request routes? Catching a 503 error would be much easier, as it does not interfere with our business logic (404 is used inside the application).
The GoRouter maintains a list of routes for mapping incoming requests to applications. If your application is down then there is no route in the routing table, that's why you end up with a 404. If you think about it from the perspective of the GoRouter, it makes sense. There's no route, so it returns a 404 Not Found. For a 503 to make sense, the GoRouter would have to know about the app and know it's down or not responding.
I suppose you might be able to achieve that behavior with the wildcard route described above: instead of displaying a maintenance page with a successful status, have the fallback app return an HTTP 503.
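To make the fallback idea concrete, here is a minimal sketch of such an app in Python, using only the standard library. The names maintenance_app and serve, the page text, and the 60 second Retry-After value are all arbitrary choices for illustration. It assumes the app is pushed to Cloud Foundry and mapped to the wildcard route, and that it listens on the port given in the PORT environment variable.

```python
import os
from wsgiref.simple_server import make_server

MAINTENANCE_BODY = b"<html><body><h1>Down for maintenance</h1></body></html>"


def maintenance_app(environ, start_response):
    """WSGI app that answers every request with 503 Service Unavailable."""
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(MAINTENANCE_BODY))),
        ("Retry-After", "60"),  # hint to clients; 60 seconds is arbitrary
    ]
    start_response("503 Service Unavailable", headers)
    return [MAINTENANCE_BODY]


def serve():
    """Serve the maintenance app on the port named by PORT (default 8080)."""
    port = int(os.environ.get("PORT", "8080"))
    with make_server("", port, maintenance_app) as server:
        server.serve_forever()
```

Calling serve() starts the server. Because every path gets the same 503 response, the front-end can treat a 503 as "back-end unavailable" without it colliding with the 404s used in the business logic.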
Hope that helps!
The 404 error you see is generated by CloudFoundry's routing tier, which is maintained upstream.
Generally, if you don't want to get such error messages, you can use blue-green deployments. Here is a detailed description in the CF docs: https://docs.cloudfoundry.org/devguide/deploy-apps/blue-green.html
Another option is to add a route service that implements this functionality for you. Have a look at the CF docs for this: https://docs.cloudfoundry.org/services/route-services.html
We added the ability for admin users to change the server date & time through the portal. Changing the date & time backward works fine, but changing it forward (by more than fastcgi_read_timeout) returns '504 gateway timeout', even though the server time is changed successfully behind the scenes.
Please advise how to handle this.
Thanks.
I had a very similar issue with another project. It may be best to submit the date & time settings (I assume you would be using NTP server IPs to do this) through the portal asynchronously via a JavaScript AJAX request, and then let the server do its thing with the date & time.
Meanwhile, have the client-side JavaScript continuously probe the server with AJAX requests at an interval (perhaps every 5 seconds) to get back a response message with the server time. That way, each subsequent AJAX request initiates a new Nginx session; if the first one fails or times out, try a second time, if that fails, try a third time, and so on.
This worked on our system. However, I do not know if your product has login/authentication credentials. If it does, the user may have to log back in once all is said and done, because a change in time may also expire their login session. I don't think this is a big deal, though, because in theory they should only need to change the date/time once in a while, if not just once. So it shouldn't have too much of an impact on the user experience.
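The retry loop itself is simple. Here it is sketched in Python purely to show the logic; in the browser you would write the equivalent in JavaScript. The names poll_until_ok and fetch_server_time are hypothetical: fetch_server_time stands for whatever performs one short request for the server time and raises an exception on failure or timeout.

```python
import time


def poll_until_ok(fetch_server_time, attempts=10, interval_seconds=5.0,
                  sleep=time.sleep):
    """Call fetch_server_time() until it succeeds or attempts run out.

    Each call is an independent request, so one probe timing out does not
    affect the next one.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch_server_time()
        except Exception as error:  # a failed probe is expected; retry
            last_error = error
            if attempt < attempts - 1:
                sleep(interval_seconds)
    raise TimeoutError(
        f"server did not respond after {attempts} attempts"
    ) from last_error
```

The sleep parameter is only there so the wait can be replaced in tests. The caller gets the first successful response, or a TimeoutError once all attempts have failed.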
tags: nginx, NTP, timeout, 504