I have a business requirement for a microservice that takes a list of CSV files and uses them to update the database it is connected to. This happens once a month; there is no endpoint, and a continuously running service is not required. The app starts, updates the database from the CSV files, and is done.
Can I use AWS Lambda for this? I already have a Spring Boot project that does the job, but we want to minimise cost rather than keep a service running on EC2, which isn't needed because the app only runs once a month. What is the best way to do this at minimum cost?
P.S. The DB will also reside in AWS.
Related
I'm currently working on a project that I want to move into the cloud so I can scale to multiple users. Basically, I want to grab the name of every user from my database (I'm thinking about using Firestore for this), and for each of those names I want to call a Python script every 24 hours (using a Cloud Scheduler job). I envision using Cloud Run, since I want multiple instances of the Python script running at the same time (one for every user) and I want it to be serverless, since I would only be running it once a day.
My questions are:
Are the cloud services I chose the correct ones for the job? For instance, is there a better service than Cloud Run to launch the script?
Is there built-in functionality to pass in Firestore entries and launch a Cloud Run instance for each? I see, for instance, that the question "Does it make sense to run a non web application on cloud run?" is very close to mine, but it does not cover the database element.
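For reference, a minimal sketch (assuming the google-cloud-firestore client and a hypothetical users collection with a name field) of the kind of entrypoint Cloud Scheduler could trigger, reading the names and kicking off the per-user work:

```python
# Minimal sketch: read user names from Firestore and process each one.
# Assumes a "users" collection with a "name" field; process_user() is a
# placeholder for the daily Python script.
from google.cloud import firestore

def process_user(name: str) -> None:
    # Placeholder for the per-user daily work.
    print(f"Processing {name}")

def main() -> None:
    db = firestore.Client()
    for doc in db.collection("users").stream():
        name = doc.to_dict().get("name")
        if name:
            process_user(name)

if __name__ == "__main__":
    main()
```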
Sorry, I am very new to AWS and am looking for the correct way to implement the following.
I need to build a job (preferably in C#, since the rest of the code is written in C#) that runs nightly (once every 24 hours) to read records from one or more PostgreSQL tables and update their status based on a pre-defined condition.
What is the best way to implement this in AWS with PostgreSQL?
You will need to trigger an SQL client (running somewhere) that connects to the PostgreSQL database and runs the desired queries.
This could be done from:
An Amazon EC2 instance
A computer anywhere on the Internet
An AWS Lambda function
If you have a Windows instance running somewhere, feel free to use it (be it on EC2 or elsewhere).
Alternatively, you could create an AWS Lambda function that connects to the database and runs the desired commands. The Lambda function can be assigned a schedule to run on a regular basis.
See: Schedule Expressions Using Rate or Cron - AWS Lambda
AWS Lambda functions can be written in a variety of languages, including .NET Core.
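As a rough illustration only (shown in Python for brevity; the same pattern applies in .NET Core), a scheduled Lambda that runs the status update could look like the sketch below. The psycopg2 dependency would need to be packaged with the function (for example as a layer), and the connection details, table, and condition are placeholders:

```python
# Sketch of a scheduled Lambda that runs a status-update query against PostgreSQL.
# Assumes psycopg2 is bundled with the function (e.g. as a Lambda layer) and that
# connection details come from environment variables; table and columns are placeholders.
import os
import psycopg2

def handler(event, context):
    conn = psycopg2.connect(
        host=os.environ["DB_HOST"],
        dbname=os.environ["DB_NAME"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
    )
    try:
        with conn, conn.cursor() as cur:
            # Hypothetical pre-defined condition: mark overdue records as expired.
            cur.execute(
                "UPDATE orders SET status = 'EXPIRED' "
                "WHERE status = 'PENDING' AND due_date < NOW()"
            )
            return {"rows_updated": cur.rowcount}
    finally:
        conn.close()
```

The nightly trigger would then be the rate/cron schedule mentioned above, for example cron(0 2 * * ? *) to run at 02:00 UTC every day.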
I have multiple CSV files containing data for different tables, with file sizes varying from 1 MB to 1.5 GB. I want to process the data row by row (replacing/removing column values) and then load it into existing Redshift tables. This is a once-a-day batch process.
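For context, a simplified sketch of the kind of row-by-row transform involved; the column logic, table name, and IAM role below are placeholders, and one common pattern (assumed here) is to write the transformed file back to S3 and load it with a Redshift COPY rather than row inserts:

```python
# Simplified sketch: stream a CSV row by row, transform columns, then load the
# result into Redshift with COPY (after uploading the transformed file to S3,
# which is omitted here). The transform, table, and IAM role are placeholders.
import csv
import psycopg2

def transform(row):
    # Placeholder transform: strip whitespace and blank out a sentinel value.
    return ["" if v == "N/A" else v.strip() for v in row]

def transform_file(src_path, dst_path):
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):   # row by row keeps memory flat, even for 1.5 GB files
            writer.writerow(transform(row))

def copy_into_redshift(conn, table, s3_uri, iam_role):
    with conn.cursor() as cur:
        cur.execute(
            f"COPY {table} FROM %s IAM_ROLE %s FORMAT AS CSV",
            (s3_uri, iam_role),
        )
    conn.commit()
```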
AWS Lambda:
Lambda has memory limitations, hence I was not able to run the process for the large CSV files.
EC2: I already have an EC2 instance where I run a Python script to transform the data and load it into Redshift.
I have to keep the EC2 instance running all the time, with all the Python scripts and the environment set up (Python installed, the psycopg lib, etc.), which leads to more cost.
AWS Batch:
I created a container image that has all the setup needed to run the Python scripts, and pushed it to ECR.
I then set up an AWS Batch job, which takes this container image and runs it through ECS.
This is more cost-optimized: I only pay for the EC2 time used and for ECR image storage.
But I would have to do all development and unit testing on my personal desktop and then push a container; there is no inline AWS service for testing.
AWS Workspaces:
I am not very familiar with AWS WorkSpaces, but I need input: can it also be used like AWS Batch, starting and stopping an instance when required, to run, edit, or test the Python scripts?
Also, can I schedule it to run every day at a defined time?
I need input on which service is the best fit and the most cost-optimized for this use case. It would also be great if anyone could suggest a better way to use the services I mentioned above.
Batch is best suited for your use case. Your concern about Batch seems to be doing development and unit testing on your personal desktop. You can automate that process using AWS ECR, CodePipeline, CodeCommit, and CodeBuild: set up a pipeline that detects changes to your code repo, builds the image, and pushes it to ECR. Batch can pick up the latest image from there.
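To answer the scheduling question: a Batch job can be kicked off daily by an EventBridge (CloudWatch Events) schedule, either by targeting Batch directly or via a small Lambda. A minimal boto3 sketch, where the job name, queue, and job definition are placeholders:

```python
# Sketch: submit the containerised ETL as an AWS Batch job, e.g. from a Lambda
# triggered daily by an EventBridge schedule. Names below are placeholders.
import boto3

batch = boto3.client("batch")

def handler(event, context):
    response = batch.submit_job(
        jobName="daily-csv-to-redshift",
        jobQueue="etl-job-queue",
        jobDefinition="csv-etl-job-def",  # points at the latest image pushed to ECR
    )
    return {"jobId": response["jobId"]}
```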
I was using a free-tier AWS account with one EC2 machine (Linux). I have a simple website with a backend server running Django on port 8000 and a frontend written in Angular served over HTTP on port 80. I used nginx for HTTPS and to route calls to the backend and frontend servers.
For the backend build system, I did these 3 main steps (which I automated by running Jenkins on the same machine):
1) git pull (pull the latest code from the repo).
2) Run migrations (update my DB with any new tables).
3) Restart the Django server (I was using gunicorn).
Now I have split my frontend and backend servers onto 2 different machines using Auto Scaling groups, and I am using an ELB (AWS Elastic Load Balancer) to route requests. The setup is done, but now I am having problems with continuous deployment. The main issue is that the ELB uses Auto Scaling groups, which in turn use an AMI.
Since AMIs are created only once, my first question is how to automate this process and deploy my latest code to the already-running AWS servers.
Second, if I want to run a few steps just once for all the servers, like my second step of updating the DB with new tables, how do I achieve that?
Third, if these steps need to run on a machine, do I need another EC2 instance to automate the process of creating the AMI, updating the Auto Scaling groups with it, and then deploying the latest code?
So basically I want to know the best practices people follow to deploy the latest code to AWS machines that were created by Auto Scaling groups from an AMI. I use Bitbucket for code management.
First question: how to automate package-based deployment.
Instead of creating a new AMI for every release, create a baseline AMI that only changes when your new release requires OS changes, security patches, etc. Look into tools such as Packer to create AMIs automatically. To automate your code deployment when it changes, you can use a package-based deployment approach, which means you create a package for every release (this should be part of your CI process) and store it in a repository such as Nexus, Artifactory, or even a simple S3 bucket.
When you deploy a new instance of your application, it should run some sort of script to pull and unpack/install that package on the instance. This is the basic concept; there are many tools that can help you achieve it, for example Chef or AWS CloudFormation.
So essentially, step 1 should pull the code, create the package, and store it in some repository available to your application servers; this can be done offline.
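As an illustration only (assuming the package is a tarball in an S3 bucket and the instance profile can read it), the instance-side pull-and-unpack step could be as small as:

```python
# Sketch of the instance-side "pull and unpack the release package" step,
# assuming the package is a tarball in S3 and the instance role grants s3:GetObject.
import tarfile
import boto3

def deploy_package(bucket, key, dest_dir="/opt/myapp"):
    s3 = boto3.client("s3")
    local_path = "/tmp/release.tar.gz"
    s3.download_file(bucket, key, local_path)   # pull the versioned package
    with tarfile.open(local_path, "r:gz") as archive:
        archive.extractall(dest_dir)            # unpack over the application directory
    # Restarting the app server (e.g. gunicorn via systemd) would follow here.

if __name__ == "__main__":
    deploy_package("my-release-bucket", "myapp/myapp-1.2.3.tar.gz")
```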
Second question: how to run other tasks such as updating the database schema.
As mentioned above, this can also be part of your deployment automation: if you are using Chef or even a simple bash script, it can update the database schema before unpacking the new code. This really depends on your database, how you manage it, and who orchestrates the deployment.
For example, you could have a Jenkins job that pulls the new schema and updates your database whenever you roll out a release.
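For instance (a sketch, not a prescribed setup), that Jenkins job could run the migration once, on a single instance of the Auto Scaling group, via SSM Run Command so it is not repeated on every server; the tag filter and application path below are assumptions:

```python
# Sketch: run the Django migration once, on one instance of the group, using
# SSM Run Command. The tag filter and application path are assumptions.
import boto3

ec2 = boto3.client("ec2")
ssm = boto3.client("ssm")

def run_migrations_once():
    # Pick one running backend instance from the Auto Scaling group.
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": "tag:role", "Values": ["backend"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )["Reservations"]
    instance_id = reservations[0]["Instances"][0]["InstanceId"]

    # Execute the migration command on just that instance.
    ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": ["cd /opt/myapp && python manage.py migrate --noinput"]},
    )
```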
Your third question can be solved with Packer: it can spin up an instance, create an AMI from it, and terminate the instance.
Read more about CI/CD and CI/CD-related tools.
I have a few questions regarding AWS Elastic Beanstalk. My upcoming mobile application has a backend written in PHP, and it uses a MySQL database.
I learnt that FTP is not possible with AWS Elastic Beanstalk. If I have to make changes to the application, I have to upload the entire application again.
My questions are: while uploading a fresh version of the application, will there be downtime? Will it destroy the old database and create a fresh one?
regards
You can upload a new version of the application using the console or you can use the CLI tools or the API.
You can avoid downtime during deployments by increasing the minimum number of instances to more than 1 and then doing a rolling deployment (with a batch size smaller than the number of instances). You can choose either a time-based or health-based rolling deployment. This ensures the code is deployed to only a subset of the instances at any given point in time.
You can read about rolling deployments here:
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.rollingupdates.html
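If you prefer to configure this from code rather than the console, a minimal sketch with boto3 (the environment name and option values are placeholders):

```python
# Sketch: enable rolling deployments on an existing Elastic Beanstalk environment.
# The environment name and option values below are placeholders.
import boto3

eb = boto3.client("elasticbeanstalk")

eb.update_environment(
    EnvironmentName="my-php-app-env",
    OptionSettings=[
        {"Namespace": "aws:autoscaling:asg",
         "OptionName": "MinSize", "Value": "2"},
        {"Namespace": "aws:elasticbeanstalk:command",
         "OptionName": "DeploymentPolicy", "Value": "Rolling"},
        {"Namespace": "aws:elasticbeanstalk:command",
         "OptionName": "BatchSizeType", "Value": "Fixed"},
        {"Namespace": "aws:elasticbeanstalk:command",
         "OptionName": "BatchSize", "Value": "1"},
    ],
)
```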