I am new to HAProxy and stuck in a situation.
I have configured HAProxy for two servers, 10.x.y.10 and 10.x.y.20. Both run Jetty.
Everything works fine when one of the Jetty instances is down: the request goes to the second server and everything happens as expected.
PROBLEM: Suppose both Jetty instances are running and I remove the WAR file from one of them. The request does not go to the second server; it just returns "Error 404 Not Found".
I know I have configured HAProxy for Jetty, not for the WAR files, but is there any way to redirect the request if the WAR file is missing, or is that not even possible?
Please point me in the right direction.
Thanks in advance.
This is my HAProxy configuration:
defaults
mode http
log global
option httplog
option logasap
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
frontend vs_http_80
bind *:9090
default_backend pool_http_80
backend pool_http_80
#balance options
balance roundrobin
#http options
mode http
option httpchk OPTIONS /
option forwardfor
option http-server-close
#monitoring service endpoints with healthchecks
server pool_member1 10.x.y.10:8080 # x and y are dummy variables
server pool_member2 10.x.y.20:8080
frontend vs_stats :8081
mode http
default_backend stats_backend
backend stats_backend
mode http
stats enable
stats uri /stats
stats realm Stats\ Page
stats auth serveruser:password
stats admin if TRUE
I finally found the solution. In case anybody comes across the same issue, please find the solution below.
The following link solved my problem
http://tecadmin.net/haproxy-acl-for-load-balancing-on-url-request/
Basically, the following lines in the frontend configuration did the trick.
acl is_blog url_beg /blog
use_backend tecadmin_blog if is_blog
default_backend tecadmin_website
ACL = Access Control List -> ACLs are used to test a condition and perform an action based on the result.
If the condition is satisfied, the request is routed to the corresponding backend.
We can use multiple ACLs and direct traffic to multiple backends through the same frontend.
Next, in the backend configuration we need to add "check" at the end of each server line so that HAProxy health-checks that server (using the request defined by option httpchk).
backend tecadmin_website
mode http
balance roundrobin # Load Balancing algorithm
option httpchk
option forwardfor
server WEB1 192.168.1.103:80 check
server WEB2 192.168.1.105:80 check
Here's the complete configuration for my problem.
defaults
mode http
log global
option httplog
option logasap
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
frontend vs_http_80
bind *:9090
acl x1_app path_dir x1
acl x2_app path_dir x2
acl x1_avail nbsrv(backend_x1) ge 1
acl x2_avail nbsrv(backend_x2) ge 1
use_backend backend_x1 if x1_app x1_avail
use_backend backend_x2 if x2_app x2_avail
backend backend_x1
#balance options
balance roundrobin
#http options
mode http
option httpchk GET /x1
option forwardfor
option http-server-close
#monitoring service endpoints with healthchecks
server pool_member1 10.x.y.143:8080 check
server pool_member2 10.x.y.141:8080 check
backend backend_x2
#balance options
balance roundrobin
#http options
mode http
option httpchk GET /x2
option forwardfor
option http-server-close
#monitoring service endpoints with healthchecks
server pool_member1 10.x.y.143:8080 check
server pool_member2 10.x.y.141:8080 check
frontend vs_stats :8081
mode http
default_backend stats_backend
backend stats_backend
mode http
stats enable
stats uri /stats
stats realm Stats\ Page
stats auth serveruser:password
stats admin if TRUE
Related
I have an HTTP API Gateway with an HTTP integration backend server on EC2. The API gets lots of queries during the day, and looking at the logs I realized that the API sometimes returns a 503 HTTP code with a body:
{ "message": "Service Unavailable" }
When I found this out, I tried the API by running the HTTP requests many times in Postman; when I try twenty times I get at least one 503.
I then thought the HTTP integration server was busy, but the server is not loaded, and when I go directly to the HTTP integration server I get 200 responses every time.
The timeout parameter is set to 30000 ms and the endpoint's average response time is 200 ms, so timeout is not the problem. Also, the HTTP 503 does not come 30 seconds after the request but instantly.
Can anyone help me?
Thanks
I solved this issue by editing the keep-alive connection parameters of my internal integration server. AWS API Gateway expects standard keep-alive behaviour from the integration, so I started tweaking my NGINX server parameters until the issue went away.
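Roughly, the knobs involved on the NGINX side are keepalive_timeout and keepalive_requests; the values, ports and upstream address in this sketch are placeholders, not my actual configuration:
events {}

http {
    keepalive_timeout  65s;     # how long idle connections from the gateway stay open
    keepalive_requests 1000;    # how many requests may reuse one connection

    server {
        listen 80;
        location / {
            proxy_pass http://127.0.0.1:3000;   # assumed internal application
        }
    }
}
The point is simply that the NGINX side must not drop idle connections sooner than the gateway expects to reuse them.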
I had the same issue on a self-made Node microservice that was integrated into AWS API Gateway. After some reconfiguration of the CloudWatch logs I got a further indicator of what was wrong: INTEGRATION_NETWORK_FAILURE
Verify your problem is the same, i.e. through more elaborate log output
In API Gateway -> Logging, add more output in "Log format"
Use this or similar content for "Log format":
{"httpMethod":"$context.httpMethod","integrationErrorMessage":"$context.integrationErrorMessage","protocol":"$context.protocol","requestId":"$context.requestId","requestTime":"$context.requestTime","resourcePath":"$context.resourcePath","responseLength":"$context.responseLength","routeKey":"$context.routeKey","sourceIp":"$context.identity.sourceIp","status":"$context.status","errMsg":"$context.error.message","errType":"$context.error.responseType","intError":"$context.integration.error","intIntStatus":"$context.integration.integrationStatus","intLat":"$context.integration.latency","intReqID":"$context.integration.requestId","intStatus":"$context.integration.status"}
After calling the API Gateway endpoint and hitting the failure, consult the logs again; the integration fields should now show what went wrong (in my case INTEGRATION_NETWORK_FAILURE).
Solve it in the Node.js microservice (using Express)
Add timeouts for headers and keep-alive to the Express server's socket configuration when listening.
const app = require('express')();
// if not already set, and you need to advertise keep-alive in the HTTP response, you might want to use this
/*
app.use((req, res, next) => {
res.setHeader('Connection', 'keep-alive');
res.setHeader('Keep-Alive', 'timeout=30');
next();
});
*/
/* ...your main logic... */
const server = app.listen(8080, 'localhost', () => {
console.warn(`⚡️[server]: Server is running at http://localhost:8080`);
});
server.keepAliveTimeout = 30 * 1000; // <- important lines
server.headersTimeout = 35 * 1000; // <- important lines
Reason
Some AWS components seem to demand that a connection be kept alive, even if the server indicates otherwise (Connection: close). When API Gateway (and possibly AWS ELBs) reuses the connection, the recycling fails because the other side has most likely already closed it, hence the assumed "NETWORK FAILURE".
This error seems intermittent, since at least API Gateway appears to close unused connections after a while, giving a clean execution the next time. I can only assume they do that for performance reasons.
My HTTP(S) External Load Balancer on GCP occasionally returns a response with error code 502.
And the reason for the response is as follows:
jsonPayload: {
#type: "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry"
statusDetails: "backend_connection_closed_before_data_sent_to_client"
}
According to the GCP documentation, such a response occurs for the following reason:
The backend unexpectedly closed its connection to the load balancer
before the response was proxied to the client. This can happen if the
load balancer is sending traffic to another entity. The other entity
might be a third-party load balancer that has a TCP timeout that is
shorter than the external HTTP(S) load balancer's 10-minute
(600-second) timeout. The third-party load balancer might be running
on a VM instance. Manually setting the TCP timeout (keepalive) on the
target service to greater than 600 seconds might resolve the issue.
Reference.
In the backend of my load balancer I have a GCP VM that runs an HAProxy server (v1.8) with following configuration:
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
pidfile /var/run/rh-haproxy18-haproxy.pid
user haproxy
group haproxy
daemon
stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
spread-checks 21
# Default SSL material locations
ca-base /etc/ssl/certs
crt-base /etc/ssl/private
# Default ciphers to use on SSL-enabled listening sockets.
# For more information, see ciphers(1SSL). This list is from:
# https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
# An alternative list with additional directives can be obtained from
# https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
ssl-default-bind-options no-sslv3
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 10000
balance roundrobin
frontend http-80
bind *:80
mode http
option httplog
default_backend www-80
backend www-80
balance roundrobin
mode http
option httpchk /haproxy_status
http-check expect status 200
rspidel ^Server:.*
rspidel ^server:.*
rspidel ^x-envoy-upstream-service-time:.*
server backendnode1 node-1:80 check port 8080 fall 3 rise 2 inter 1597
server backendnode2 node-2:80 check port 8080 fall 3 rise 2 inter 1597
frontend health-80
bind *:8080
acl backend_dead nbsrv(www-80) lt 1
monitor-uri /haproxy_status
monitor fail if backend_dead
listen stats # Define a listen section called "stats"
bind :9000 # Listen on localhost:9000
mode http
stats enable # Enable stats page
stats hide-version # Hide HAProxy version
stats realm Haproxy\ Statistics # Title text for popup window
stats uri /haproxy_stats # Stats URI
stats auth haproxy:pass # Authentication credentials
#lastline
According to the GCP documentation, we can get rid of the 502 errors by setting a TCP keep-alive value higher than 600 seconds (10 minutes).
They have suggested values for Apache and Nginx.
Web server software   Parameter           Default setting           Recommended setting
Apache                KeepAliveTimeout    KeepAliveTimeout 5        KeepAliveTimeout 620
nginx                 keepalive_timeout   keepalive_timeout 75s;    keepalive_timeout 620s;
Reference.
I'm not sure which timeout values or which settings I should change in my HAProxy configuration to set the keep-alive time to more than 600s.
Would setting timeout http-keep-alive to more than 600 seconds do the trick?
Your version of HAProxy should have the keep-alive option enabled by default, but I don't see the corresponding line in your config file;
to enable it you need to add the option http-keep-alive line in the defaults section, so it will look like this:
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option redispatch
retries 3
option http-keep-alive
To check if it's working, follow the instructions from this answer.
You may also find these threads on SO useful:
How to make HA Proxy keepalive
How to enable keep-alive in haproxy?
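For the 600-second requirement itself: the idle window HAProxy allows on a kept-alive client connection is bounded by timeout http-keep-alive (which falls back to timeout http-request and then timeout client when unset), so the conservative move is to raise both above the load balancer's 600s. A sketch following the 620s value GCP recommends for Apache and NGINX (the exact number is an assumption carried over from that table):
defaults
mode http
option http-keep-alive
timeout client 620s            # client-side inactivity; keep it above the LB's 600s
timeout http-keep-alive 620s   # idle time allowed between requests on a connection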
Apparently I cannot figure out how to do a custom HTTP endpoint for health checks. Maybe I missed something, or GCP doesn't offer it yet.
The Elasticsearch health check page describes various ways to check the ES cluster.
I was looking at the GCP health checks interface, and it doesn't let us add a URL endpoint, nor define a parser for the health check to match against the "green" cluster status.
What I was able to do is to wire in port 9200 and use a config like:
port: 9200, timeout: 5s, check interval: 60s, unhealthy threshold: 2 attempts
But this is not the way to go for an ES cluster, as the cluster may respond while being in a yellow/red state.
There would be an easier way, without parsing the output, by just adding a timeout check like:
GET /_cluster/health?wait_for_status=yellow&timeout=50s
Note: this will wait up to 50 seconds for the cluster to reach yellow status (if it reaches green or yellow before 50 seconds elapse, it returns at that point).
Any suggestions?
GCP health checks are simple and use the HTTP status code to determine if the check passes (200) - https://cloud.google.com/compute/docs/load-balancing/health-checks
What you can do is implement a simple HTTP service that queries ES's health check endpoint, parses the output, and decides whether to return status code 200 or something else.
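As a rough sketch of such a shim in Node (the ports, the ES address and the choice to treat yellow as healthy are all assumptions):
const http = require('http');

// Health-check shim: GCP's HTTP health check probes this service, which maps
// the Elasticsearch cluster status onto an HTTP status code.
http.createServer((req, res) => {
  http.get('http://localhost:9200/_cluster/health', (esRes) => {
    let body = '';
    esRes.on('data', (chunk) => { body += chunk; });
    esRes.on('end', () => {
      let status = 'red';
      try {
        status = JSON.parse(body).status; // "green" | "yellow" | "red"
      } catch (e) { /* keep "red" if the response cannot be parsed */ }
      const healthy = status === 'green' || status === 'yellow';
      res.writeHead(healthy ? 200 : 503, { 'Content-Type': 'text/plain' });
      res.end(status);
    });
  }).on('error', () => {
    res.writeHead(503); // ES unreachable: fail the health check
    res.end('unreachable');
  });
}).listen(8081); // port the GCP health check points at
Point the GCP health check at this service's port instead of 9200 and keep the same interval/threshold settings.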
After the endpoint has been running continuously for some time, we get the message "connection timeout after request is read" and the ESB stops responding. We need to restart the WSO2 services again.
I had already increased the socket timeout as suggested.
Timeouts in the ESB are defined at three levels:
endpoint timeout < socket timeout < synapse timeout. Check [1].
If you have defined an endpoint timeout for your endpoint, you can increase it up to the socket timeout value, and you can increase the socket timeout up to the synapse timeout value. The default synapse timeout is 2 minutes. So if you increase the endpoint timeout and socket timeout to 2 minutes and still don't get any response from your backend service, then you should check your backend service.
Once a timeout occurs, that endpoint is suspended for 30000 ms, so any request to that endpoint within the suspension period will be ignored by the ESB. You can disable the suspension period as mentioned here [2].
The keepalive property is enabled by default in the ESB, but some firewalls will ignore keep-alive packets from the ESB. So there will be an actual connection between the ESB and the firewall, while the connection from the firewall to the backend might already be closed. In that case, disabling the keepalive property will create a new connection for each request [3] and the backend will give the response (see the sketch after the links below).
1. http://soatutorials.blogspot.in/2014/11/how-to-configure-timeouts-in-wso2-esb.html
2. http://miyurudw.blogspot.com/2012/02/disable-suspension-of-wso2-esb-synapse.html
3. https://udaraliyanage.wordpress.com/2014/06/17/wso2-esb-keep-alive-property/
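As a sketch of where the endpoint timeout and the keep-alive property go in a Synapse proxy configuration (the proxy name, backend URI and timeout value are placeholders, not taken from your setup):
<proxy xmlns="http://ws.apache.org/ns/synapse" name="BackendProxy" transports="http https" startOnLoad="true">
    <target>
        <inSequence>
            <!-- disable keep-alive for this flow, so every request opens a fresh connection -->
            <property name="NO_KEEPALIVE" value="true" scope="axis2"/>
            <send>
                <endpoint>
                    <address uri="http://backend.example.com:8080/service">
                        <!-- endpoint-level timeout in ms; keep it at or below the socket timeout -->
                        <timeout>
                            <duration>120000</duration>
                            <responseAction>fault</responseAction>
                        </timeout>
                    </address>
                </endpoint>
            </send>
        </inSequence>
        <outSequence>
            <send/>
        </outSequence>
    </target>
</proxy>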
I have configured HAProxy on a RedHat server. The server is up and running without any issue, but I cannot access it through my browser. I have opened the firewall port for the bind address.
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 2080/haproxy
My haproxy.cfg is as below:
defaults
log global
mode http
option httplog
option dontlognull
retries 3
option redispatch
maxconn 2000
contimeout 5000
clitimeout 50000
srvtimeout 50000
frontend http-in
bind *:80
default_backend servers
backend servers
option httpchk OPTIONS /
option forwardfor
stats enable
stats refresh 10s
stats hide-version
stats scope .
stats uri /admin?stats
stats realm Haproxy\ Statistics
stats auth admin:pass
cookie JSESSIONID prefix
server adempiere1 192.168.1.216:8085 cookie JSESSIONID_SERVER_1 check inter 5000
server adempiere2 192.168.1.25:8085 cookie JSESSIONID_SERVER_2 check inter 5000
Any suggestions?
To view HAProxy stats on your browser, put these lines in your configuration file.
You will be able to see HAProxy at http://Hostname:9000
listen stats :9000
mode http
stats enable
stats hide-version
stats realm Haproxy\ Statistics
stats uri /
global
log 127.0.0.1 local0
log 127.0.0.1 local1 notice
daemon
defaults
log global
mode http
option httplog
option dontlognull
option forwardfor
retries 1 #number of times it will try to know if system is up or down
option redispatch #if one system is down, it will redispatch to another system which is up.
maxconn 2000
contimeout 5 #in milliseconds; you can increase these numbers according to your configuration
clitimeout 50 #these are set to small values just for testing
srvtimeout 50 #so you can view the actual result right away
listen http-in IP_ADDRESS_OF_LOAD_BALANCER:PORT #example 192.168.1.1:8080
mode http
balance roundrobin
maxconn 10000
server adempiere1 192.168.1.216:8085 cookie JSESSIONID_SERVER_1 check inter 5000
server adempiere2 192.168.1.25:8085 cookie JSESSIONID_SERVER_2 check inter 5000
#
#try accessing from your browser the IP address with the port mentioned in the listen configuration above
#or try this in a command line/terminal: curl http://192.168.1.1:8080