How to determine if an application/application container has started on AWS Elastic Beanstalk?

I am writing a generic application deployment tool. It takes an application from the user and deploys it to Elastic Beanstalk. That part is working. The issue is that users want to compose the deployment tool with other operations, and right now my tool reports success as soon as it has told the Beanstalk APIs to start the application.
Unfortunately, that means it returns before the application itself has started, so users are forced to write their own polling logic to wait for their application to start.
Looking at the AWS Elastic Beanstalk API, I cannot see any methods that report such a state. The closest I can find is DescribeEvents... which looks hopeful; however, it seems from the examples that events at the granularity of the application/application container starting within the environment are not part of that API:
<DescribeEventsResponse xmlns="https://elasticbeanstalk.amazonaws.com/docs/2010-12-01/">
  <DescribeEventsResult>
    <Events>
      <member>
        <Message>Successfully completed createEnvironment activity.</Message>
        <EventDate>2010-11-17T20:25:35.191Z</EventDate>
        <VersionLabel>New Version</VersionLabel>
        <RequestId>bb01fa74-f287-11df-8a78-9f77047e0d0c</RequestId>
        <ApplicationName>SampleApp</ApplicationName>
        <EnvironmentName>SampleAppVersion</EnvironmentName>
        <Severity>INFO</Severity>
      </member>
    </Events>
  </DescribeEventsResult>
</DescribeEventsResponse>
Note: the INFO-level event says the environment was created; nothing at the lower level of the application container starting within the environment appears to be reported...
I could mandate that the applications deployed with this tool expose a status REST endpoint, but that puts restrictions on the application.
Is there some API that I am missing that will report when the application container (e.g. Tomcat, Node, etc.) has started... or better yet, when the application deployed within the container has started? I can live with just the application container, though.

Every application is supposed to expose a health URL (otherwise Beanstalk/ELB will have problems in any case: it will think the instances are not responding and might replace them). This is typically a HEAD request expecting a 200 OK.
Since this is expected to be there in all apps anyway, you can probably hit this URL and check that the deployment is OK. I guess the Beanstalk console itself uses this method.
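For example, a minimal polling sketch against such a health URL, assuming the app answers a HEAD request on / with 200 OK once it is up (the URL below is the sample CNAME from the response further down):
import time
import urllib.request

def wait_until_healthy(url, timeout_s=600, interval_s=10):
    """Poll the health URL with HEAD requests until it returns 200 OK."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            request = urllib.request.Request(url, method="HEAD")
            if urllib.request.urlopen(request, timeout=5).status == 200:
                return True
        except Exception:
            pass  # instance not answering yet; keep polling
        time.sleep(interval_s)
    return False

wait_until_healthy("http://SampleApp-jxb293wg7n.elasticbeanstalk.amazonaws.com/")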

You can also poll using the DescribeEnvironments API call, which gives you the environment CNAME (the URL to check), the Health of the environment (Green, Yellow, Red, Grey), and the Status (Launching | Updating | Ready | Terminating | Terminated). This API takes environment names as an argument, so you can get the description of a single environment.
API Documentation: http://docs.aws.amazon.com/elasticbeanstalk/latest/APIReference/API_DescribeEnvironments.html
Explanation of Environment Description in response: http://docs.aws.amazon.com/elasticbeanstalk/latest/APIReference/API_EnvironmentDescription.html
Sample Response Below:
<DescribeEnvironmentsResponse xmlns="https://elasticbeanstalk.amazonaws.com/docs/2010-12-01/">
  <DescribeEnvironmentsResult>
    <Environments>
      <member>
        <VersionLabel>Version1</VersionLabel>
        <Status>Available</Status>
        <ApplicationName>SampleApp</ApplicationName>
        <EndpointURL>elasticbeanstalk-SampleApp-1394386994.us-east-1.elb.amazonaws.com</EndpointURL>
        <CNAME>SampleApp-jxb293wg7n.elasticbeanstalk.amazonaws.com</CNAME>
        <Health>Green</Health>
        <EnvironmentId>e-icsgecu3wf</EnvironmentId>
        <DateUpdated>2010-11-17T04:01:40.668Z</DateUpdated>
        <SolutionStackName>32bit Amazon Linux running Tomcat 7</SolutionStackName>
        <Description>EnvDescrip</Description>
        <EnvironmentName>SampleApp</EnvironmentName>
        <DateCreated>2010-11-17T03:59:33.520Z</DateCreated>
      </member>
    </Environments>
  </DescribeEnvironmentsResult>
  <ResponseMetadata>
    <RequestId>44790c68-f260-11df-8a78-9f77047e0d0c</RequestId>
  </ResponseMetadata>
</DescribeEnvironmentsResponse>
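If you are scripting this, a minimal polling sketch using boto3 (the environment name is the sample one above; adjust the timeout to your deployment):
import time
import boto3

eb = boto3.client("elasticbeanstalk")

def wait_for_environment(name, timeout_s=900, interval_s=15):
    """Return True once the environment reports Status=Ready and Health=Green."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        envs = eb.describe_environments(EnvironmentNames=[name])["Environments"]
        if envs and envs[0]["Status"] == "Ready" and envs[0]["Health"] == "Green":
            return True
        time.sleep(interval_s)
    return False

wait_for_environment("SampleApp")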
In your case you may want to read the following documentation:
Monitoring Application Health
You can also configure an Application Health Check URL for your environment. By default, AWS Elastic Beanstalk uses a TCP:80 check on your instances, but you can override this to an HTTP:80 check by setting the Application Health Check URL option as described here.
Using the Status/Health from DescribeEnvironments, you can check whether the application has been deployed.

Related

Making GCP Load-balancer HTTP logs integrate with Cloud Trace (in GCP Logs Explorer)?

Cloud Trace and Cloud Logging integrate quite nicely in most cases, described in https://cloud.google.com/trace/docs/trace-log-integration
Unfortunately, this doesn't seem to include the HTTP request logs generated by a Load Balancer when request logging is enabled.
The LB logs show the traces icon, and are correctly associated with an overall trace in the Cloud Trace system, but the context menu 'show trace details' is greyed out for those log items.
A similar problem arose with my application level logging/tracing, and was solved by setting the traceSampled attribute on the LogEntry, but this can't work for LB logs, since I'm not in control of their generation.
In this instance I'm tracing 100% of requests since the service is M2M and fairly low volume, but in the general case it makes sense that the LB can't know if something is actually generating traces without being told.
I can't find any good references in the docs, but in theory a response header indicating it was sampled could be observed by the LB and cause it to issue the appropriate log.
Any ideas if such a feature exists, in this form or any other?
(Last-ditch workaround might be to use Logs Router to feed LB logs into a pubsub queue (and exclude them from normal logging sinks), and resubmit them to the normal sink(s) with fields appropriately set by some Cloud Function or other pubsub consumer, but that seems like a lot of work and complexity for this purpose)
There is currently a Feature Request for this; you can follow its status in the following link 1.
As a workaround, you could implement target proxies along with your Load Balancer, according to the documentation for a Global external HTTP(S) load balancer:
The proxies set HTTP request/response headers as follows:
Via: 1.1 google (requests and responses)
X-Forwarded-Proto: [http | https] (requests only)
X-Cloud-Trace-Context: <trace-id>/<span-id>;<trace-options> (requests only). Contains parameters for Cloud Trace.
X-Forwarded-For: [<supplied-value>,]<client-ip>,<load-balancer-ip> (see X-Forwarded-For header) (requests only)
You can find the complete documentation about external HTTP(S) load balancers and target proxies here 2.
And finally, take a look at the documentation on how to use and configure target proxies here 3.
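For what it's worth, the X-Cloud-Trace-Context header the proxy injects is straightforward to parse on the application side, e.g. to correlate your own log entries with the LB's trace. A sketch (the sample header value is made up):
def parse_trace_context(header):
    """Split 'TRACE_ID/SPAN_ID;o=1' into trace id, span id, and sampled flag."""
    trace_id, _, rest = header.partition("/")
    span_id, _, options = rest.partition(";")
    return trace_id, span_id, options == "o=1"

trace_id, span_id, sampled = parse_trace_context("105445aa7843bc8bf206b12000100000/1;o=1")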

Google App Engine (GAE) basic scaling backend instance serves one request and undeploys

I have deployed an application (frontend and backend) in App Engine. First of all, I am using the free tier and I chose the default F1 for the frontend and B2 for the backend. I don't exactly understand the difference between B and F instances but based on their names, I chose them for backend and frontend respectively.
My backend is a Flask application that reads some data from Firestore on @app.before_first_request and "pre-caches" it for all future requests. This takes about 20-30 seconds before the first request is served, so I really don't want the backend instance to become undeployed all the time.
Right now, my backend successfully serves one request (that I am making from the browser) and then immediately gets undeployed (basically, I see no active instances in the App Engine dashboard after the request is served). This means that every request once again incurs the same long delay upon server start that I don't want. I am not sure why this is happening because I've set the idle timeout to 5 minutes. I know it is not a problem with my Flask application because it does not crash after a request on a local machine, and I've done memory profiling, which is within the bounds of the B2 limits. This is my app.yaml for the backend:
runtime: python38
service: api
env_variables:
  PORT: 8080
instance_class: B2
basic_scaling:
  max_instances: 1
  idle_timeout: 5m
Any insight would be appreciated!
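For reference, a minimal sketch of the pre-caching pattern described in the question (the collection name and route are hypothetical):
from flask import Flask, jsonify
from google.cloud import firestore

app = Flask(__name__)
_cache = {}

@app.before_first_request
def warm_cache():
    # Runs once per instance start; this is the 20-30 second cost the
    # question wants to pay once, not on every request.
    db = firestore.Client()
    for doc in db.collection("items").stream():  # "items" is hypothetical
        _cache[doc.id] = doc.to_dict()

@app.route("/items/<item_id>")
def get_item(item_id):
    return jsonify(_cache.get(item_id, {}))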
Based on the information and behavior you describe, please allow me to explain that both scaling models are behaving as they are designed to.
“Automatic Scaling: It creates instances based on request rate, response latencies, and other application metrics. You can specify thresholds for each of these metrics, and a minimum number of instances to keep running at all times.
Basic Scaling: Basic scaling creates instances only when your application receives requests. Each instance will be shut down when the application becomes idle. Basic scaling is ideal for work that is intermittent or driven by user activity.”
Use the following documentation as a reference for these models and more: How Instances are Managed.
Information added on 10/12/2021:
I think the correct term is “shut down” rather than “undeployed”; see Disabling your application.
Looking at Instance States: "an instance of a manual or basic scaled service can be either running or stopped. All instances of the same service and version share the same state."
Then looking at Scaling types: "Basic scaling creates instances when your application receives requests. Each instance will be shut down when the application becomes idle. Basic scaling is ideal for work that is intermittent or driven by user activity."
The table's Startup and shutdown row for basic scaling says: "Instances are created on demand to handle requests and automatically shut down when idle, based on the idle_timeout configuration parameter. An instance that is manually stopped has 30 seconds to finish handling requests before it is forcibly terminated."
And Scaling down: "You can specify a minimum number of idle instances. Setting an appropriate number of idle instances for your application based on request volume allows your application to serve every request with little latency."
Could you please verify:
that the instance was not manually halted?
that the instance is becoming idle?
that there are no background threads?
whether the behavior is the same when setting max_instances to 2?
that there are no logs showing an instance shutdown?
that requests are reaching the version with the updated idle_timeout set?

Nextjs 404s on buildManifest across multiple EC2 instances

Context: I have a simple Next.js and KeystoneJS app. I've made duplicate deployments on 2 AWS EC2 instances. Each instance also has an Nginx reverse proxy routing port 80 to 3000 (my app's port). The 2 instances are also behind an application load balancer.
Problem: When routing to my default url, my application attempts to fetch the buildManifest for the nextjs application. This, however, 404s most of the time.
My Guess: Because the requests are coming in so close together, my load balancer is routing the second request for the buildManifest to the other instance. Since I did a separate yarn build on that instance, the build ids are different, and therefore it is not fetching the correct build. This request 404s and my site is broken.
My Question: Is there a way to ensure that all requests made from instance A get routed to instance A? Or is there a better way to do my builds on each instance such that their ids are the same? Is this a use case for Docker?
I have had a similar problem with our load balancer, and specifying a custom build ID seems to have fixed it. Here's the dedicated issue, and this is how my next.config.js looks now:
const execSync = require("child_process").execSync;
const lastCommitCommand = "git rev-parse HEAD";

module.exports = {
  async generateBuildId() {
    // Use the current git commit hash so every instance built from the
    // same commit produces the same build ID.
    return execSync(lastCommitCommand).toString().trim();
  },
};
If you are using a custom build directory in your next.config.js file, remove it and use the default build directory. That is, remove a line like:
distDir: "build"
from your next.config.js file.
Credits: https://github.com/serverless-nextjs/serverless-next.js/issues/467#issuecomment-726227243

Health check in Cloud Foundry

Does anyone know how I can tell my Cloud Foundry instance to monitor my health endpoint, so that when the endpoint reports the app is not status: UP, the app is restarted?
The cf CLI 6.24.0 (released Feb 2017) exposed this type of health checking.
In your app manifest, use:
applications:
- name: myapp
  health-check-type: http
  health-check-http-endpoint: /admin/health
Your app needs to return a 200 status code from that path, or an error code when it's not status UP.
You can also use the cf set-health-check command to configure it on existing apps.
Check out this documentation for more details on the different health check types.
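A minimal sketch of such an endpoint (Flask; the dependency check is a hypothetical placeholder):
from flask import Flask

app = Flask(__name__)

def dependencies_ok():
    # Hypothetical placeholder: probe your real dependencies (DB, queue, ...).
    return True

@app.route("/admin/health", methods=["GET", "HEAD"])
def health():
    # 200 while healthy; a non-2xx tells CF the instance is unhealthy.
    return ("", 200) if dependencies_ok() else ("", 503)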
If an app instance dies, Cloud Foundry will, by default, spin up a new instance and try to start it. That resiliency is built into Cloud Foundry.
Actuators are REST endpoints auto-injected into your app that allow you to see the app's status and health at runtime.
https://spring.io/guides/gs/actuator-service/
Try Actuators out.
I don't believe that custom URL health checking is available today in CF. If your application instance is no longer healthy and you want to restart it, you can call System.exit(1) and CF will restart it for you.
I've heard rumors of custom health checks possibly coming in the future with the CC V3 api and Diego.
The way to do a health check in PCF:
cf set-health-check APP-NAME <HEALTH-CHECK-TYPE> --endpoint <CUSTOM-HTTP-ENDPOINT>
HEALTH-CHECK-TYPE = process | port | http (ideally http for web apps)
CUSTOM-HTTP-ENDPOINT = /health
Reference: https://docs.cloudfoundry.org/devguide/deploy-apps/healthchecks.html

Unable to launch task from a spring cloud data flow stream

I registered my task app in Spring Cloud Data Flow and created a definition for it, but the status shows 'unknown'. I created the stream and am trying to launch the task through task-sink, and I get an error:
java.lang.IllegalStateException: failed to resolve MavenResource:
How do I launch a task from the task-sink? Am I missing something? Any help is appreciated. Another question I have is: how do I access the payload sent via TaskLaunchRequest in my task?
S1 http | step1: transformer-rabbit | log
S2 :S1.step1 > filter --expression=payload.contains('CUSTADDRMODRQ_V15') | task-processor | task-sink
task-sink is launching the task provided by the URI in the TaskLaunchRequest. It is looking for the resource as shown in the log:
OUT Using manager EnhancedLocalRepositoryManager with priority 10.0 for /home/vcap/.m2/repository
OUT Using transporter HttpTransporter with priority 5.0 for https://repo.spring.io/libs-snapshot
and it finally fails.
The task is deployed in our repository and, as mentioned, I registered it and created the definition for it as well.
This is in a CF environment and I am using SCDF server 1.0.0.M4.
In the application.properties for the task-sink I am providing maven.remote.repositories.snapshots.url=**
task create fis-ifx-event-task --definition "fis-event-task"
My goal is to launch the task from the stream.
Thanks for the information. I am in fact using the BUILD-SNAPSHOT, as I am unable to enable tasks in the 1.0.0.M4 version; the one I am using is spring-cloud-dataflow-server-cloudfoundry-1.0.0.BUILD-20160808.144306-116. I am able to register and create task definitions, but the status of the task definition shows as 'unknown' even when I use the sample task module provided by your team.
When I initiate the flow of the stream and task-sink tries to launch the task, it is unable to find the Maven resource. When I create the task definition, does the task module get deployed? I don't see any app in Pivotal Apps Manager. As mentioned earlier, I provided maven.remote.repositories.snapshot.url in the application.properties file for the task-sink application.
Another thing I observed: when I launch the task manually from the dataflow shell, it gives an error CF-UnprocessableEntity(10008): The request is semantically invalid: Unknown field(s): 'staging_disk_in_mb', 'staging_memory_in_mb', and also a message saying 'Source is empty'. Presently the task is supposed to print the timestamp and is not dependent on any input.
TaskProcessor code:
@EnableBinding(Processor.class)
@EnableConfigurationProperties(TaskProcessorProperties.class)
public class TaskProcessor {

    @Autowired
    private TaskProcessorProperties processorProperties;

    public TaskProcessor() {
    }

    @Transformer(inputChannel = Processor.INPUT, outputChannel = Processor.OUTPUT)
    @ELI(level = "info", eventType = ELIEventType.INBOUND)
    public Object setupRequest(String message) {
        // Wrap the incoming payload in a TaskLaunchRequest pointing at the
        // task's Maven URI; the sink launches whatever this URI resolves to.
        Map<String, String> properties = new HashMap<String, String>();
        properties.put("payload", message);
        TaskLaunchRequest request = new TaskLaunchRequest(processorProperties.getUri(), null, properties, null);
        return new GenericMessage<>(request);
    }
}
TaskSink code:
@SpringBootApplication
@EnableTaskLauncher
@EnableBinding(Sink.class)
@EnableConfigurationProperties(TaskSinkProperties.class)
public class FisIfxEventTaskSinkApplication {

    public static void main(String[] args) {
        SpringApplication.run(FisIfxEventTaskSinkApplication.class, args);
    }
}
I provided the stream I am using earlier in the post. The sink is receiving the TaskLaunchRequest with the URI and payload, as you can see here, and is unable to launch the task:
OUT registering [40, java.io.File] with serializer org.springframework.integration.codec.kryo.FileSerializer
2016-08-10T16:08:55.02-0600 [APP/0]
OUT Launching Task for the following resource TaskLaunchRequest{uri='maven://com.xxx:fis.ifx.event-task:jar:1.0-SNAPSHOT', commandlineArguments=[], environmentProperties={payload={"statusCode":0,"fisTopic":"CustomerDataUpdated","payloadId":"CUSTADDRMODRQ_V15","customerIds":[1597304]}}, deploymentProperties={}}
Before I begin, you have a number of questions here. In the future, it's better to break them up into multiple questions so that they are easier to find by other users and easier to answer. That being said:
A little context on the current state of things
In order to understand how things will work, it's important to understand the current state of things. The current releases of the software involved are:
Pivotal Cloud Foundry (PCF) - 1.7.12. This version is required for any task support.
Spring Cloud Task (SCT) - 1.0.2.RELEASE
Spring Cloud Data Flow CF (SCDF) - 1.0.0.BUILD-SNAPSHOT (current as of the date of this post).
Currently PCF 1.7.12+ has all the capabilities to run tasks. You can create v3 applications (the type of application used to launch a task), run it as a task, etc. However, the tooling around that functionality is not currently complete. There is no support for v3 applications in Apps Manager or the CLI. There is a plugin for the CLI that is more of a dev tool that can be used to help with some functions (it will show you logs, etc), but it is not fully functional and requires a specific version of the CLI to work [1]. This is one of the reasons that the task functionality within PCF is still considered experimental.
Spring Cloud Task is currently GA and supports all the functionality needed to effectively run tasks on CF. However, it's important to note that SCT doesn't handle orchestration so the actual launching of tasks on CF is the responsibility of either the user, or Spring Cloud Data Flow (the easier route).
Spring Cloud Data Flow's Cloud Foundry server implementation currently has functionality to launch tasks on PCF in the latest snapshots. We have validated this against 1.7.12 as well as the development branch of 1.8.
The task workflow within SCDF
Tasks are fundamentally different from stream applications within the context of SCDF. When you create a stream definition, you are given the option to deploy it. What this actually does is download the Spring Boot über jars and deploy them to PCF as long-running processes. If they go down, PCF will relaunch them as expected, etc.
Tasks, on the other hand, are not deployed. They are launched. The difference is that while you create a task definition, nothing is deployed until you click launch. And when the task completes, the software is shut down and cleaned up. So while a stream definition may have states, it's really a one-to-one relationship between the definition and the deployed software, whereas with a task you can launch a task definition as many times as you want.
Your issues
Reading through your post, I see a few things that you are struggling with. Let me see if I can help:
Task definitions within SCDF and launching them via a stream - When launching a task from a stream, the task registry within SCDF is not used. The sink expects the URL for the resource to be within the TaskLaunchRequest.
Apps Manager and tasks - As mentioned above, there is no support for v3 applications in Apps Manager yet so you won't be able to see your tasks there.
Viewing the logs - In order to debug what's going wrong with launching your task on CF, you're going to want to view the logs. To do so, use the v3 CLI plugin mentioned above to view them. It's important to note that you can only tail live logs with the plugin, not view logs that have previously been rendered. Because of that, when testing, you'll want to tail the logs as soon as the app is created, before it's launched.
Error in SCDF Shell - The error you received from the SCDF shell (CF-UnprocessableEntity(10008):...) leads me to wonder whether you have both the correct version of PCF (1.7.12+) and the correct versions of the following other libraries:
spring-cloud-deployer-cloudfoundry - The latest snapshots
cf-java-client - 2.0.0.M10+
reactor-core - 3.0.0.RC1+
I hope this helps!
[1] https://github.com/cloudfoundry/v3-cli-plugin
Task support is not available in the 1.0.0.M4 release of SCDF's CF-server. In this release, the task commands/REST APIs should be disabled - see here. For that reason, you wouldn't see any docs related to tasks in the 1.0.0.M4 reference guide.
That said, task support is available/enabled in the BUILD-SNAPSHOT release. If you're locally building the CF-server and pushing it to CF, you can take advantage of the task commands in the shell to create and launch task definitions.