How to avoid a TooManyApplicationVersions exception on AWS Elastic Beanstalk?

Has anyone recently run into the TooManyApplicationVersions exception on the AWS Elastic Beanstalk console while deploying a new application version (war)? It's especially annoying because the message only appears after you have finished uploading the war.
I would be interested to know why this exception occurs and what precautions one should take to avoid it.

Cause
The exception you are seeing stems from hitting your account's limits for AWS Elastic Beanstalk; see the Errors section of CreateApplicationVersion [paraphrased]:
TooManyApplicationVersions - The caller has exceeded the limit on the
number of application versions associated with their account.
TooManyApplications - The caller has exceeded the limit on the number of applications associated with their account.
The current limits are outlined in the respective FAQ How many applications can I run with AWS Elastic Beanstalk?:
You can create up to 25 applications and 500 application versions. By
default you can run up to 10 environments across all of your
applications. If you are also using AWS outside of Elastic Beanstalk,
you may not be [...] If you need more resources, complete the AWS Elastic
Beanstalk request form and your request will be promptly evaluated. [emphasis mine]
Solution
As emphasized, AWS offers the usual escalation option and lets you submit a Request to Increase AWS Elastic Beanstalk Limits if you really do need that many application versions available for reuse. Otherwise, simply delete older versions you no longer need and the problem should go away.
Good luck!

Here's a one-liner using the AWS CLI that will help you clear out old application versions:
aws elasticbeanstalk describe-application-versions --output text --query 'ApplicationVersions[*].[ApplicationName,VersionLabel,DateCreated]' | grep "2014-02" | while read app ver date; do aws elasticbeanstalk delete-application-version --application-name $app --version-label $ver --delete-source-bundle; done
Replace the grep pattern with whatever date prefix (2013, 2014-01, 2014-02-0, etc.) you see fit.

As of EB CLI 3.3, you can now run the following to clear out old versions:
$ eb labs cleanup-versions
By default this leaves the last 10 versions and only deletes versions older than 60 days. Adding --help outputs the following:
usage: eb labs cleanup-versions [options...]
Cleans up old application versions.
optional arguments:
  --num-to-leave NUM   number of versions to leave DEFAULT=10
  --older-than DAYS    delete only versions older than x days DEFAULT=60
  --force              don't prompt for confirmation

You're approaching the maximum number of versions and need to delete old or unused ones.
In the current web console you can do this on the Application Versions tab of your Beanstalk application.

This is the piece of code that we use in our deploy script to delete the oldest application version.
console.log('Deleting oldest application version.');
var params = {};
local.waitFor(function(done) {
  eb.describeApplicationVersions(params, function(err, data) {
    if (err) {
      console.error(err, err.stack);
      local.abort('Could not retrieve the list of application versions.');
    } else {
      // The list already appears to be sorted by creation date, but that is
      // not documented anywhere, so sort it explicitly (newest first).
      var compare = function(a, b) {
        if (a.DateCreated > b.DateCreated) return -1;
        if (a.DateCreated < b.DateCreated) return 1;
        return 0;
      };
      var applicationVersions = data.ApplicationVersions.sort(compare),
          oldestVersion = applicationVersions[applicationVersions.length - 1],
          applicationName = oldestVersion.ApplicationName,
          versionLabel = oldestVersion.VersionLabel;
      params = {
        ApplicationName: applicationName, /* required */
        VersionLabel: versionLabel,       /* required */
        DeleteSourceBundle: false         /* do not delete the source bundle from S3 */
      };
      eb.deleteApplicationVersion(params, function(err, data) {
        if (err) {
          console.error(err, err.stack);
          local.abort('Could not delete the oldest application version. (' + versionLabel + ')');
        } else {
          console.log('Successfully deleted the oldest application version. (' + versionLabel + ')');
        }
        done(); // signal our deploy-script helper that this step has finished
      });
    }
  });
});
Documentation for the Elastic Beanstalk API (JavaScript SDK): http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/ElasticBeanstalk.html

Related

Errors connecting to AWS Keyspaces using a lambda layer

I am intermittently getting the following error when connecting to an AWS Keyspaces keyspace using a lambda layer:
All host(s) tried for query failed. First host tried, 3.248.244.53:9142: Host considered as DOWN. See innerErrors.
I am trying to query a table in a keyspace using a Node.js lambda function as follows:
import cassandra from 'cassandra-driver';
import fs from 'fs';

export default class AmazonKeyspace {
  tpmsClient = null;

  constructor () {
    let auth = new cassandra.auth.PlainTextAuthProvider('cass-user-at-xxxxxxxxxx', 'zzzzzzzzz');
    let sslOptions1 = {
      ca: [ fs.readFileSync('/opt/utils/AmazonRootCA1.pem', 'utf-8')],
      host: 'cassandra.eu-west-1.amazonaws.com',
      rejectUnauthorized: true
    };
    this.tpmsClient = new cassandra.Client({
      contactPoints: ['cassandra.eu-west-1.amazonaws.com'],
      localDataCenter: 'eu-west-1',
      authProvider: auth,
      sslOptions: sslOptions1,
      keyspace: 'tpms',
      protocolOptions: { port: 9142 }
    });
  }

  getOrganisation = async (orgKey) => {
    const SQL = 'select * FROM organisation where organisation_id=?;';
    return new Promise((resolve, reject) => {
      this.tpmsClient.execute(SQL, [orgKey], {prepare: true}, (err, result) => {
        if (!err?.message) resolve(result.rows);
        else reject(err.message);
      });
    });
  };
}
I am basically following this recommended AWS documentation.
https://docs.aws.amazon.com/keyspaces/latest/devguide/using_nodejs_driver.html
It seems that around 10-20% of the time the lambda function (cassandra driver) cannot connect to the endpoint.
I am pretty familiar with Cassandra (I already use a 6 node cluster that I manage) and don't have any issues with that.
Could this be a timeout or do I need more contact points?
I have followed the recommended guides and checked the AWS console for errors, but none are shown.
UPDATE:
I am occasionally (about 1 in 50 calls when I invoke the function in parallel with 5 concurrent calls) getting the below error:
"All host(s) tried for query failed. First host tried,
3.248.244.5:9142: DriverError: Socket was closed at Connection.clearAndInvokePending
(/opt/node_modules/cassandra-driver/lib/connection.js:265:15) at
Connection.close
(/opt/node_modules/cassandra-driver/lib/connection.js:618:8) at
TLSSocket.
(/opt/node_modules/cassandra-driver/lib/connection.js:93:10) at
TLSSocket.emit (node:events:525:35)\n at node:net:313:12\n at
TCP.done (node:_tls_wrap:587:7) { info: 'Cassandra Driver Error',
isSocketError: true, coordinator: '3.248.244.5:9142'}
This exception may be caused by throttling on the Keyspaces side, resulting in the DriverError that you are seeing sporadically.
I would suggest taking a look at this repo, which should help you put measures in place to either prevent this issue from occurring or at least reveal the true cause of the exception.
For some of the errors you see in the logs, you will need to investigate Amazon CloudWatch metrics to see whether you are hitting throttling or system errors. I've built this AWS CloudFormation template to deploy a CloudWatch dashboard with all the appropriate metrics. This will provide better observability for your application.
A System Error indicates an event that must be resolved by AWS and is often part of normal operations. Activities such as timeouts, server faults, or scaling activity can result in server errors. A User Error indicates an event that can often be resolved by the user, such as an invalid query or exceeding a capacity quota. Amazon Keyspaces passes a System Error back as a Cassandra ServerError. In most cases this is a transient error, in which case you can retry your request until it succeeds. With the Cassandra driver's default retry policy, customers can also experience NoHostAvailableException or AllNodesFailedException, or messages like yours, "All host(s) tried for query failed". This is a client-side exception that is thrown once all hosts in the load balancing policy's query plan have been tried for the request.
Take a look at this retry policy for NodeJs which should help resolve your "All hosts failed" exception or pass back the original exception.
The retry policies in the Cassandra drivers are pretty crude and will not be able to do more sophisticated things like circuit breaker patterns. You may eventually want to use a "failfast" retry policy for the driver and handle the exceptions in your application code.
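As a rough illustration of handling retries in application code rather than relying on the driver's retry policy, here is a minimal sketch. It assumes the same cassandra-driver client as in the question; the attempt count, backoff values and the isTransientError heuristic are made up for the example and are not part of any AWS or driver API.
import cassandra from 'cassandra-driver';

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Heuristic: treat "all hosts failed" and socket closures as retryable.
function isTransientError(err) {
  return err instanceof cassandra.errors.NoHostAvailableError ||
         (err && err.isSocketError === true);
}

// Retry a query a few times with exponential backoff before giving up.
async function executeWithRetry(client, query, params, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await client.execute(query, params, { prepare: true });
    } catch (err) {
      if (!isTransientError(err) || attempt === maxAttempts) throw err;
      await sleep(100 * Math.pow(2, attempt)); // 200 ms, 400 ms, ...
    }
  }
}
Used against the client from the question this would look something like const result = await executeWithRetry(this.tpmsClient, SQL, [orgKey]); with result.rows holding the rows.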

AWS SDK for JavaScript CloudWatch Logs - GetLogEventsCommand isn't fetching logs, potentially due to a log stream size issue?

I have multiple Node.js applications deployed via AWS Elastic Beanstalk on the Docker platform. I can manually download the full logs for every environment without trouble via the AWS console. Let's say I have two AWS Elastic Beanstalk Environments: env-a and env-b.
I've started using the AWS SDK for JavaScript, specifically @aws-sdk/client-cloudwatch-logs, in a Node app so that I can programmatically fetch logs, render them in a custom UI, and do my own analysis as needed.
I'm running the following code in order to fetch the log events for a given app (pseudocode):
// IMPORTS
const {
  CloudWatchLogsClient,
  DescribeLogStreamsCommand,
  GetLogEventsCommand
} = require("@aws-sdk/client-cloudwatch-logs");

// SETUP
const awsCloudWatchClient = new CloudWatchLogsClient({
  region: process.env.AWS_REGION,
});

// APPLICATION CODE
const logGroupName = getLogGroupName();

// Get the log streams for the given log group.
const logStreamRes = await awsCloudWatchClient.send(new DescribeLogStreamsCommand({
  descending: true,
  logGroupName,
  orderBy: 'LastEventTime',
  limit: 50,
}))

// For testing purposes, I'll just use the first log stream name I find.
const logStreamName = logStreamRes.logStreams[0].logStreamName;

// Get the log events for the first log stream.
const logEventRes = await awsCloudWatchClient.send(new GetLogEventsCommand({
  logGroupName,
  logStreamName,
}));
const logEvents = logEventRes.events;
Now, I can fetch the log events for env-a without trouble using this code. However, GetLogEventsCommand always returns an empty collection when I attempt to fetch the logs for env-b. If I download the logs manually via the AWS console, I can definitely see that logs exist - yet for a reason that isn't clear to me yet, the AWS SDK doesn't seem to recognize that.
Here are some interesting details that may help diagnose the issue.
env-a is configured in Elastic Beanstalk so that each new deploy (which happens potentially multiple times a day) replaces EC2 instances. On the other hand, env-b is configured so that new application code is deployed to existing EC2 instances without actually replacing them. Since log streams map to EC2 instances, env-a has a high number of pretty small log streams, whereas env-b has three extremely large log streams for each of its long-lived EC2 instances. The logs are easily >1 MB in size.
Considering that GetLogEventsCommand returns responses up to 1 MB in size, am I hitting some size limit that the AWS SDK handles by returning 0 log events for env-b? I tried setting a limit on the GetLogEventsCommand above, but the AWS SDK still returns 0 events for env-b.
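(For reference, here is a minimal sketch of what paging through a stream with GetLogEventsCommand and its nextForwardToken would look like, reusing the client and names from the code above; the limit of 1000 is arbitrary, and whether pagination changes the empty result for env-b is exactly what I'm unsure about.)
// Read a log stream page by page instead of relying on a single
// GetLogEvents response, which is capped at 1 MB / 10,000 events.
async function fetchAllLogEvents(logGroupName, logStreamName) {
  const events = [];
  let nextToken;
  do {
    const res = await awsCloudWatchClient.send(new GetLogEventsCommand({
      logGroupName,
      logStreamName,
      startFromHead: true, // read from the beginning of the stream
      limit: 1000,
      nextToken,
    }));
    events.push(...res.events);
    // The same forward token being returned means we've reached the end.
    if (res.nextForwardToken === nextToken) break;
    nextToken = res.nextForwardToken;
  } while (nextToken);
  return events;
}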
Another interesting note: if I go to Amazon CloudWatch > Log Group and select env-a's Log Group, I can see the log events for every log stream without trouble. If I try to view the log events for env-b's three very large log streams, I run into "Rate exceeded" errors on the console. This seems to confirm that the log stream's event count is simply too large for both the AWS console and AWS SDK to process, though I'm not certain.
Is there anything I can do to get the AWS SDK to fetch env-b's logs? How can I further confirm that excessive log stream size is the culprit here? And if that's the case, is there anything I can do about it, e.g. purge logs?
Or could this be some other issue that I'm not seeing?

How to increase _cluster/settings/cluster.max_shards_per_node for AWS Elasticsearch Service

I use AWS Elasticsearch Service version 7.1 and its built-in Kibana to manage application logs. New indexes are created daily by Logstash. Logstash gets an error about the maximum shards limit being reached from time to time, and I have to delete old indexes for it to start working again.
I found from this document (https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-handling-errors.html) that I have an option to increase _cluster/settings/cluster.max_shards_per_node.
So I tried that by putting the following command in Kibana Dev Tools:
PUT /_cluster/settings
{
  "defaults" : {
    "cluster.max_shards_per_node": "2000"
  }
}
But I got this error
{
  "Message": "Your request: '/_cluster/settings' payload is not allowed."
}
Someone suggested that this error occurs when I try to update settings that are not allowed by AWS, but this document (https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-supported-es-operations.html#es_version_7_1) tells me that cluster.max_shards_per_node is on the allowed list.
Please suggest how to update this setting.
You're almost there; you need to rename defaults to persistent:
PUT /_cluster/settings
{
  "persistent" : {
    "cluster.max_shards_per_node": "2000"
  }
}
Beware though, that the more shards you allow per node, the more resources each node will need and the worse the performance can get.

Programmatically move files from an S3 bucket to a Windows EC2 and vice-versa

I want to push files to a Windows EC2 instance and also get folders from my EC2 instances and store them in S3. As I have many EC2 instances, I wanted to automate this. I currently have a Node.js server on ECS that uses the ssh2-sftp-client library. With the following piece of code I'm able to push files to an instance (and, similarly, I'm able to extract files from it and upload them to S3):
var Client = require('ssh2-sftp-client');
var sftp = new Client(newClientID);

sftp.connect({
  host: host,
  username: 'username',
  password: 'password',
  port: '22',
  tryKeyboard: true
}).then(async () => {
  try {
    if (file.filename && file.file) {
      await sftp.put(file.file, `C:/Users/user/Desktop/${file.filename}`);
      console.log(`Successfully pushed ${file.filename}`);
    }
    sftp.on('error', error => {
      console.log(error);
      sftp.end();
    });
    resolve(); // resolve() comes from an enclosing Promise wrapper (not shown)
  } catch (ex) {
    console.log("SFTP EXCEPTION PUSHING FILES TO INSTANCE", ex);
  } finally {
    sftp.end();
  }
});
But this is not a robust solution: even with just 5 or 6 users trying to push at the same time, the server errors out because it has too many active SSH connections.
Is there a better way to do this? All I want to do is upload/download specific directories using Node.js (for a Windows EC2).
One good way to automate the running of scripts on EC2 is to use SSM Run Command. If you set up each EC2 instance correctly, then it becomes a managed instance and you can trigger the running of scripts across a fleet of EC2 instances, chosen by tags, for example.
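Not an official recipe, but to sketch what that could look like from Node.js with the SDK: the following assumes each instance is registered as an SSM managed instance, has an instance profile that allows S3 access, and has the AWS Tools for PowerShell available (typically preinstalled on AWS-provided Windows Server AMIs). The instance ID, bucket, keys and paths are placeholders.
// Push/pull files between S3 and a Windows EC2 instance via SSM Run Command
// instead of SFTP, so no SSH connections are held open.
const { SSMClient, SendCommandCommand } = require('@aws-sdk/client-ssm');

const ssm = new SSMClient({ region: process.env.AWS_REGION });

// Download a single S3 object onto the instance.
async function copyFromS3ToInstance(instanceId, bucket, key, localPath) {
  return ssm.send(new SendCommandCommand({
    InstanceIds: [instanceId],
    DocumentName: 'AWS-RunPowerShellScript',
    Parameters: {
      commands: [
        `Read-S3Object -BucketName ${bucket} -Key ${key} -File '${localPath}'`,
      ],
    },
  }));
}

// Upload a folder from the instance to S3.
async function copyFolderFromInstanceToS3(instanceId, localFolder, bucket, keyPrefix) {
  return ssm.send(new SendCommandCommand({
    InstanceIds: [instanceId],
    DocumentName: 'AWS-RunPowerShellScript',
    Parameters: {
      commands: [
        `Write-S3Object -BucketName ${bucket} -KeyPrefix ${keyPrefix} -Folder '${localFolder}' -Recurse`,
      ],
    },
  }));
}
Each call returns a command descriptor whose status you can poll afterwards with GetCommandInvocationCommand if you need to confirm that the copy finished.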

Unable to launch task from a spring cloud data flow stream

I registered my task app in Spring Cloud Data Flow and created a definition for it, but the status shows 'unknown'. I created the stream and am trying to launch the task through task-sink, and I get an error:
java.lang.IllegalStateException: failed to resolve MavenResource:
How do I launch a task from the task-sink? Am I missing something? Any help is appreciated. Another question I have is: how do I access the payload sent via TaskLaunchRequest in my task?
S1 http | step1: transformer-rabbit | log
S2 :S1.step1 > filter --expression=payload.contains('CUSTADDRMODRQ_V15') | task-processor | task-sink
task-sink is launching the task provided by the uri in the TaskLaunchRequest. It is looking for the resource as shown in the log
OUT Using manager EnhancedLocalRepositoryManager with priority 10.0 for /home/vcap/.m2/repository
OUT Using transporter HttpTransporter with priority 5.0 for https://repo.spring.io/libs-snapshot and finally failing.
The task is deployed in our repository and as mentioned I registered and created the definition for it as well.
This one is in cf environment and I am using SCDF server 1.0.0.M4.
In the application.properties for the task-sink I am providing maven.remote.repositories.snapshots.url=**
task create fis-ifx-event-task --definition "fis-event-task"
My goal is launching the task from the stream.
Thanks for the information. I am in fact using the BUILD-SNAPSHOT, as I am unable to enable tasks in the 1.0.0.M4 version. Here is the one I am using: spring-cloud-dataflow-server-cloudfoundry-1.0.0.BUILD-20160808.144306-116. I am able to register and create task definitions, but the status of the task definition shows as 'unknown' even when I use the sample task module provided by your team. And when I initiate the flow of the stream and task-sink tries to launch the task, it is unable to find the Maven resource.
When I create the task definition, does the task module get deployed? I don't see any app in Pivotal Apps Manager. As mentioned earlier, I provided maven.remote.repositories.snapshot.url in the application.properties file for the task-sink application.
Another thing I observed: when I launch the task manually from the dataflow shell it gives an error CF-UnprocessableEntity(10008): The request is semantically invalid: Unknown field(s): 'staging_disk_in_mb', 'staging_memory_in_mb' and also a message saying 'Source is empty'. Presently the task is supposed to print the timestamp and is not dependent on any input.
TaskProcessor code:
@EnableBinding(Processor.class)
@EnableConfigurationProperties(TaskProcessorProperties.class)
public class TaskProcessor {

    @Autowired
    private TaskProcessorProperties processorProperties;

    public TaskProcessor() {
    }

    @Transformer(inputChannel = Processor.INPUT, outputChannel = Processor.OUTPUT)
    @ELI(level = "info", eventType = ELIEventType.INBOUND)
    public Object setupRequest(String message) {
        Map<String, String> properties = new HashMap<String, String>();
        properties.put("payload", message);
        TaskLaunchRequest request = new TaskLaunchRequest(processorProperties.getUri(), null, properties, null);
        return new GenericMessage<>(request);
    }
}
TaskSink code:
@SpringBootApplication
@EnableTaskLauncher
@EnableBinding(Sink.class)
@EnableConfigurationProperties(TaskSinkProperties.class)
public class FisIfxEventTaskSinkApplication {

    public static void main(String[] args) {
        SpringApplication.run(FisIfxEventTaskSinkApplication.class, args);
    }
}
I provided the stream I am using earlier in the post. The sink is receiving the TaskLaunchRequest with the uri and payload, as you can see here, but it is unable to launch the task.
OUT registering [40, java.io.File] with serializer org.springframework.integration.codec.kryo.FileSerializer
2016-08-10T16:08:55.02-0600 [APP/0]
OUT Launching Task for the following resource TaskLaunchRequest{uri='maven://com.xxx:fis.ifx.event-task:jar:1.0-SNAPSHOT', commandlineArguments=[], environmentProperties={payload={"statusCode":0,"fisTopic":"CustomerDataUpdated","payloadId":"CUSTADDRMODRQ_V15","customerIds":[1597304]}}, deploymentProperties={}}
Before I begin, you have a number of questions here. In the future, it's better to break them up into multiple questions so that they are easier to find by other users and easier to answer. That being said:
A little context on the current state of things
In order to understand how things will work, it's important to understand the current state of things. The current releases of the software involved are:
Pivotal Cloud Foundry (PCF) - 1.7.12. This version is required for any task support.
Spring Cloud Task (SCT) - 1.0.2.RELEASE
Spring Cloud Data Flow CF (SCDF) - 1.0.0.BUILD-SNAPSHOT (current as of the date of this post).
Currently PCF 1.7.12+ has all the capabilities to run tasks. You can create v3 applications (the type of application used to launch a task), run it as a task, etc. However, the tooling around that functionality is not currently complete. There is no support for v3 applications in Apps Manager or the CLI. There is a plugin for the CLI that is more of a dev tool that can be used to help with some functions (it will show you logs, etc), but it is not fully functional and requires a specific version of the CLI to work [1]. This is one of the reasons that the task functionality within PCF is still considered experimental.
Spring Cloud Task is currently GA and supports all the functionality needed to effectively run tasks on CF. However, it's important to note that SCT doesn't handle orchestration so the actual launching of tasks on CF is the responsibility of either the user, or Spring Cloud Data Flow (the easier route).
Spring Cloud Data Flow's Cloud Foundry server implementation currently has functionality to launch tasks on PCF in the latest snapshots. We have validated this against 1.7.12 as well as the development branch of 1.8.
The task workflow within SCDF
Tasks are fundamentally different from stream applications within the context of SCDF. When you create a stream definition, you are given the option to deploy it. What this does is actually download the Spring Boot über jars and deploy them to PCF as long-running processes. If they go down, PCF will relaunch them as expected, etc.
Tasks, on the other hand, are not deployed. They are launched. The difference is that while you create a task definition, nothing is deployed until you click launch. And when the task completes, the software is shut down and cleaned up. So while a stream definition may have states, it's really a one-to-one relationship between the definition and the deployed software. Whereas with a task, you can launch a task definition as many times as you want.
Your issues
Reading through your post, I see a few things that you are struggling with. Let me see if I can help:
Task Definitions within SCDF and launching them via a stream - When launching a task from a stream, the task registry within SCDF is not used. The sink expects the URL for the resource to be within the TaskLaunchRequest.
Apps Manager and tasks - As mentioned above, there is no support for v3 applications in Apps Manager yet so you won't be able to see your tasks there.
Viewing the logs - In order to debug what's going wrong with launching your task on CF, you're going to want to view the logs. To do so, use the v3 CLI plugin mentioned above to view them. It's important to note that you can only tail live logs with the plugin, not view logs that have previously been rendered. Because of that, when testing, you'll want to tail the logs as soon as the app is created, before it's launched.
Error in SCDF Shell - The error you received from the SCDF shell (CF-UnprocessableEntity(10008):...) leads me to wonder if you have both the correct version of PCF (1.7.12+) and the correct version of the following other libraries:
spring-cloud-deployer-cloudfoundry - The latest snapshots
cf-java-client - 2.0.0.M10+
reactor-core - 3.0.0.RC1+
I hope this helps!
[1] https://github.com/cloudfoundry/v3-cli-plugin
Task support is not available in 1.0.0.M4 release of SCDF's CF-server. In this release, the task commands/REST-APIs should be disabled - see here. And for that reason, you wouldn't see any docs related to Tasks in the 1.0.0.M4 reference guide.
That said, task support is available/enabled in the BUILD-SNAPSHOT release. If you build the CF-server locally and push it to CF, you can take advantage of the task commands in the shell to create and launch task definitions.