I was using Elasticsearch 1.5 and now need to migrate to 5.5. However, AWS does not support a direct upgrade path. I'm using the CloudWatch Logs streaming support for Elasticsearch to feed events.
Currently, only new events are fed into Elasticsearch. I'm thinking of the following steps to migrate:
1. Create a new ES domain with 5.5.
2. Do a one-time import of existing CloudWatch logs.
3. Change the ES domain endpoint in the Lambda function to point to the new ES domain.
4. Drop the old ES domain.
Is there a way to achieve step 2 in the process? Or is there any better way of achieving this migration?
Your strategy looks good to me. We have done this ES migration in the past. The only thing you need to remember is that 1.5 to 5.5 is not a straightforward migration. There are a lot of code changes involved as well, and many classes are no longer available in 5.5.
For the import, you might have to write a custom exporter and importer.
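As a rough illustration of the importer half, here is a minimal sketch that turns already-exported log events into a request body for the Elasticsearch bulk API (newline-delimited JSON: one action line followed by one document line per event). The `LogEvent` holder, the `cwl-` index prefix, the `log` type name, and the `@timestamp`/`message` field names are assumptions made for this example, not something confirmed in this thread; fetching the events from CloudWatch Logs and POSTing the body to the new domain's `_bulk` endpoint are left out.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.List;

final class CloudWatchBulkExport {

    /** Hypothetical holder for one log event already read from CloudWatch Logs. */
    static final class LogEvent {
        final String id;
        final long timestampMillis;
        final String message;

        LogEvent(String id, long timestampMillis, String message) {
            this.id = id;
            this.timestampMillis = timestampMillis;
            this.message = message;
        }
    }

    // Daily index suffix such as 2017.07.14, computed in UTC (naming scheme is an assumption).
    private static final DateTimeFormatter INDEX_DATE =
            DateTimeFormatter.ofPattern("yyyy.MM.dd").withZone(ZoneOffset.UTC);

    /** Escapes the characters that are not allowed raw inside a JSON string. */
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds a bulk body: an "index" action line, then the document, each ending in a newline. */
    static String toBulkBody(List<LogEvent> events, String indexPrefix, String type) {
        StringBuilder body = new StringBuilder();
        for (LogEvent e : events) {
            Instant ts = Instant.ofEpochMilli(e.timestampMillis);
            String index = indexPrefix + INDEX_DATE.format(ts);
            body.append("{\"index\":{\"_index\":\"").append(escapeJson(index))
                .append("\",\"_type\":\"").append(escapeJson(type))
                .append("\",\"_id\":\"").append(escapeJson(e.id)).append("\"}}\n");
            body.append("{\"@timestamp\":\"").append(ts)
                .append("\",\"message\":\"").append(escapeJson(e.message)).append("\"}\n");
        }
        return body.toString();
    }
}
```

Reusing the event ID as `_id` means that re-running the import overwrites documents instead of duplicating them, which helps if the one-time import has to be restarted partway through.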
I have a pretty complex backend project that I deploy to AWS using the Serverless framework. The problem I'm facing is related to versioning. I have a React app on the FE, which has a version on it, but I didn't add a version to the BE for simplicity (it is the same app, I'm not exposing any special API so didn't want to deal with versioning matrices between the FE and the BE, backward compatibility, etc..) --> Is this a mistake?
When I deploy my BE code, AWS does keep track of the deploy calls and adds versions in the Versions tab of the Lambdas page, and each version has a Description property. I'd like to access that Description to at least have an idea which code is running at any given time.
I was looking at the serverless docs and couldn't find a way to send a Description up to AWS. I'm calling it like so:
serverless deploy -s integration
NOTE: I don't have CI/CD hooked up yet, but the idea would be that only checkins to a specific branch (master or develop) would do a deploy to AWS (as opposed to doing it manually on a feature branch while developing). Is this something anyone is doing?
Any thoughts and/or ideas on versioning serverless backend are appreciated.
I currently have AWS DynamoDB (port: 8000) and Elasticsearch (port: 9200) + Kibana (port: 5601) running locally. I spent almost half a day figuring out how to sync these services together. To provide context, I have a Next.js app running that's integrated with both clients; however, when I upsert data into DynamoDB, I'd like to get it synced with Elasticsearch right away.
Note, for upper environments I plan on using aws lambdas integrated with dynamodb streams to update indexes in elasticsearch instances (I am not a fan of the aws elasticsearch managed service).
Here is what I have tried thus far for local env syncing:
Logstash w/dynamodb plugin (https://github.com/awslabs/logstash-input-dynamodb) - the repo hasn't been updated in 4 years and has issues supporting newer versions of ES. It's been a headache all morning trying to get the thing running due to this and other issues --> https://github.com/awslabs/logstash-input-dynamodb/issues/10
result: it's a lost cause as far as I know... I've even tried their Docker images, but they're not working.
node-scheduler: oh jeez... given I'm using Next.js, I'd have to create a custom server just to get this extra piece synced... not to mention it removes a lot of the important features of Next.js -> https://nextjs.org/docs/advanced-features/custom-server
Host the development DynamoDB in AWS (not local) + provision ES on EC2 and point the local app there. I think this is overkill, as now pretty much everything is hosted except my app, which could be OK. Let me know your thoughts, as I already have a dev environment using a separate Cognito pool to authenticate.
Just chain ES calls to any updates made on DynamoDB. Can someone tell me why we can't do this locally and in general? I get that data can get out of sync, but maybe we can run hourly, or even twice-daily (off-peak), cron jobs to bulk update the DynamoDB records to ES.
Would really appreciate your thoughts. Thanks!
We are migrating some of our J2EE-based applications from on-prem to the AWS cloud. I am trying to find a good document on the steps to consider for the app migration. Since we already have an AWS account, and some applications have been migrated earlier, I don't have to worry about those aspects. However, I am thinking more about:
- Which app server to use?
- Do I need to migrate the DB as well, or just the app?
- Any licensing requirements for the app? We use mostly open source, so that should be fine.
- Operational monitoring after migrating to the cloud.
Came across some of these articles.
https://serverguy.com/cloud/aws-migration/
Migration Scenario: Migrating Web Applications to the AWS Cloud : https://d36cz9buwru1tt.cloudfront.net/CloudMigration-scenario-wep-app.pdf
I would like to know if you have done this kind of work, and whether you can point me to some helpful documents/links or share your own experience.
So there are two good resources I'd recommend for migration:
AWS Whitepaper for migration
AWS Well-Architected Framework.
The key is planning, but don't be afraid to experiment. This is the cloud, so an instance size is never set in stone; you can easily change it.
Earlier, I was using TransportClient in my app.
Recently, I have been moving towards the AWS managed Elasticsearch service.
I learned that an AWS managed ES cluster does not support TransportClient.
So I am migrating the code that used BulkProcessor to insert documents into ES.
While refactoring the code according to the ES documentation, I added this line:
BulkProcessor bulkProcessor = BulkProcessor.builder(client::bulkAsync, listener).build();
and I get an error at client::bulkAsync saying Client is not a functional interface.
I need help understanding what I am doing wrong.
Documentation link for reference:
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-document-bulk.html#java-rest-high-document-bulk-processor
What is the type of your client object?
It must be a RestHighLevelClient instance.
Here is a working example: https://github.com/dadoonet/legacy-search/blob/02-bulk/src/main/java/fr/pilato/demo/legacysearch/dao/ElasticsearchDao.java
Amazon Web Services offers a number of continuous deployment and management tools, such as Elastic Beanstalk, OpsWorks, CloudFormation and CodeDeploy, depending on your needs. The basic idea is to facilitate code deployment and upgrades with zero downtime. They also help you follow best architectural practice when using AWS resources.
For simplicity, let's assume a basic architecture with a two-tier structure: a collection of application servers behind a load balancer, and then a persistence layer using a Multi-AZ RDS DB.
The actual code upgrade across a fleet of instances (app servers) is easy to understand. As a very simplistic overview, the AWS service upgrades each node in turn, handing connections off so that the instance in question is not in use while it is upgraded.
However, I can't understand how DB upgrades are managed. Assume that we are going from version 1.0.0 to 2.0.0 of an application and that there is a requirement to change the DB structure. Normally you would use a script or a library like Flyway to perform the upgrade. However, if there is a fleet of servers to upgrade there is a point where both 1.0.0 and 2.0.0 applications exist across the fleet each requiring a different DB structure.
I need to understand how this is actually achieved (high level) to know what the best way/time of performing the DB migration is. I guess there are a couple of ways they could be achieving this but I am struggling to see how they can do it and allow both 1.0.0 and 2.0.0 to persist data without loss.
One possibility is that they migrate the DB structure with the first app node upgrade and at the same time create a cached version of the 1.0.0 DB. Users connected to the 1.0.0 app persist using the cached version of the DB, and users connected to the 2.0.0 app persist to the new migrated DB. Once all the app nodes are migrated, the cached data is merged into the DB.
It seems unlikely they can do this as the merge would be pretty complex but I can't see another way. Any pointers/help would be appreciated.
This is a common problem to encounter once your application infrastructure gets into multiple application nodes. In the olden days, you could take your application offline for "maintenance windows" during which you could:
Replace application with a "System Maintenance, back soon" page.
Perform database migrations (schema and/or data)
Deploy new application code
Put application back online
In 2015, and really for several years now, this approach has not been acceptable. Your users expect 24/7 operation, so there must be a better way. Of course there is: the answer is a series of patterns for Database Refactorings.
The basic concept to always keep in mind is to assume you have to maintain two concurrent versions of your application, and there can be no breaking changes between these two versions. This means that you have a current application (v1.0.0) in production and a new one (v2.0.0) that is scheduled to be deployed. Both of these versions must work on the same schema. Once v2.0.0 is fully deployed across all application servers, you can then develop v3.0.0, which allows you to complete any final database changes.
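To make the idea concrete, here is a minimal sketch of one such refactoring, a column rename done in stages. Everything in it is an assumption chosen for illustration: the column names `name` and `full_name` are hypothetical, and an in-memory `Map` stands in for a table row so the example runs without a database. It is one possible staging under those assumptions, not a description of how any AWS tool does it.

```java
import java.util.Map;

final class ExpandContractSketch {
    static final String OLD_COL = "name";       // hypothetical column used by v1.0.0
    static final String NEW_COL = "full_name";  // hypothetical column added before v2.0.0 ships

    // v1.0.0: only knows the old column.
    static void writeV1(Map<String, String> row, String value) {
        row.put(OLD_COL, value);
    }

    static String readV1(Map<String, String> row) {
        return row.get(OLD_COL);
    }

    // v2.0.0: writes both columns, but keeps reading the old one, because
    // v1.0.0 nodes still running during the rollout only update the old column.
    static void writeV2(Map<String, String> row, String value) {
        row.put(OLD_COL, value);
        row.put(NEW_COL, value);
    }

    static String readV2(Map<String, String> row) {
        return row.get(OLD_COL);
    }

    // One-off backfill, run once no v1.0.0 node is left:
    // the old column is still the source of truth, so copy it over.
    static void backfill(Map<String, String> row) {
        row.put(NEW_COL, row.get(OLD_COL));
    }

    // v3.0.0: reads the new column, and still writes both so that any
    // remaining v2.0.0 node sees current data during this rollout.
    static void writeV3(Map<String, String> row, String value) {
        row.put(OLD_COL, value);
        row.put(NEW_COL, value);
    }

    static String readV3(Map<String, String> row) {
        return row.get(NEW_COL);
    }
}
```

In this staging, the old column can only be dropped after v3.0.0 is on every node and a later release has stopped writing to it. At every step, the two versions that coexist in the fleet read and write a schema that both of them understand.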