AWS CodeDeploy agent is deleting files in the wrong folder during install

We have an unusual setup. We use git on Azure Devops for our code repositories, and AWS for our cloud-based services. In our arsenal we have a mixture of AWS Lambda functions, along with console apps, web apps, and Windows services running on EC2 instances. We have been able to create CI/CD pipelines for all three classes of apps. For the apps running on EC2 instances we use AWS CodeDeploy. These deployments are more complicated, but they all work -- except for one.
Another unusual thing about our setup is that both our development and QA environments are on the same EC2 instance. When the CodeDeploy agent running on that instance retrieves the deployment archive, it unpacks it, reads the appspec.yml file, runs our before install script, which backs up the existing installation and shuts down any services that might be using those files. Then, the install phase updates the files in the designated environment, then deletes -- or tries to delete -- all the files in the other environment folder.
In other words, if a DEV deployment is running, it replaces the files in the DEV folder and also tries to delete the files in the QA folder. I know this sounds like a scripting problem, but I have checked all the script and YAML files, and nowhere do I reference the opposing environment.
In this case, the app is a Windows service. Normally, I get a Ruby 'Permission denied # unlink_internal' error on a file in the other folder. As an experiment, I shut down the service in the other environment in my before install script and, as I expected, the agent deleted all the files in the other environment. It updated the files in the target environment, but left the folder in the other environment empty!
Here are my files. I suspect the problem is being caused by something I did, but I can't, for the life of me, find it.
These are all .NET projects. In my solution I have a ConfigFiles folder set up with subfolders for each environment. Then, in my pipeline yaml file I run a script to select the correct files to move into the archive based on the git branch that is being built.
Here's the code for the script that selects the correct files.
Here's the Azure pipeline YAML file.
Here's my before install script:
And, finally, here is my appspec.yml file, which the CodeDeploy agent uses to know where to update the files during installation. How I wish this were the wrong path, but in the deployment archive the environment-specific values are all exactly right.
Any ideas on this one would be greatly appreciated.

I encountered the same problem, where the deployment of one app unexpectedly deletes files belonging to another app in another folder. My solution was to use a different deployment group for each app, even though they deploy to the same EC2 instance.

Deploying multiple apps to the same EC2 instance using the same deployment group results in the files/folders of the other deployed projects being deleted.
From AWS Technical Support:
The reason is that CodeDeploy creates a cleanup file with the format '[deployment group ID]_cleanup' in the directory '/opt/codedeploy-agent/deployment-root/deployment-instructions' every time a deployment is made to the deployment group, and this file deletes all the files that were installed during the previous deployment made to that deployment group. Since the deployment group is the same in your case, when you make a deployment that installs files to the folder "/var/www/project1", the files installed by the previous deployment in the folder "/var/www/project2" are cleaned up, and vice versa, which is the expected mechanism of the CodeDeploy agent.
You can find the explanation here: https://docs.aws.amazon.com/codedeploy/latest/userguide/codedeploy-agent.html#codedeploy-agent-install-files
Please consider creating two different applications/deployment groups and configuring the two pipelines to use them; that should fix your problem.

Related

How do I access beanstalk application venv?

This last week I have been trying to deploy a Flask app using AWS Elastic Beanstalk.
The main problem for me was including a very heavy library as part of the bundle (there is a 500 MB limit for the uploaded bundle).
Instead, I tried to use requirements.txt file so it would download the library directly to the server.
Unfortunately, every time I tried to include the library name in the requirements file, the deployment failed to install it (the torch library).
On the PythonAnywhere server there is a console which allows you to access the virtual environment and simply type
pip install torch
which was very useful and convenient.
I am looking for something similar in AWS Beanstalk, so that I could install the library directly instead of relying on the requirements.txt file.
I have been at it for a few days now and can't make any progress.
Your help would be much appreciated.
Another question: is it possible to load the venv onto Amazon S3 and then access the folder from the Beanstalk environment?
It's not good practice to "manually" install your dependencies or configure your EB environment from inside the instance. This is only useful for testing and debugging purposes, so keep that in mind.
To get to your venv, you have to SSH to your EB instance, either using regular SSH or one of the web-based clients available in the AWS EC2 console when you locate your EB EC2 instance. Session Manager should work out of the box to let you log in to the instance.
Once you are logged in to the instance, activate your venv with:
# start bash
bash
# source venv
source /var/app/venv/staging-*/bin/activate
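From there you can install the missing package by hand. Keep in mind this is only a temporary test, since Beanstalk may replace the instance at any time:
# inside the activated venv (temporary debugging only)
pip install torch
pip list | grep torch   # confirm the package is now installed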

Selenium cloud execution on a machine without code or IDE

I set up my Selenium project (Maven, Java, TestNG) in a GitHub repo and it is connected to Jenkins. I am able to execute the Maven project via Jenkins and do the testing. This requires all dependent tools (Maven, Java, Jenkins) to be set up on my local machine.
But we have a requirement to do this in the cloud. I know we can use Selenium Grid with Docker, BrowserStack, or GCP to execute the tests in the cloud, but what we need is to have everything installed in the cloud so that any external user with access can execute any test via a UI or an executable file, without installing anything on their local machine.
Is this possible at all? If yes, how?
I searched a lot and couldn't find anything. One of my friends said it can be done using AWS but doesn't know how. I just need guidance on the path to take here, and I'm willing to learn and implement it myself.
Solved this by deploying the code to AWS EC2.
Here's what I did.
I created a TestNG-Maven project and uploaded it to GitHub. Then I created an AWS EC2 t2.micro Linux instance and installed Chrome and Jenkins on it. I accessed Jenkins from my local machine and connected it to the GitHub repo. When I build the project from Jenkins, everything is downloaded on the EC2 instance and the execution happens there, as headless Chrome execution.
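If it helps, a rough sketch of installing Jenkins and Chrome on an Amazon Linux 2 instance looks like this (repo URLs and package names are assumptions and may have changed; check the current Jenkins and Chrome install docs):
# Java + Jenkins from the official Jenkins yum repo
sudo amazon-linux-extras install -y java-openjdk11
sudo wget -O /etc/yum.repos.d/jenkins.repo https://pkg.jenkins.io/redhat-stable/jenkins.repo
sudo rpm --import https://pkg.jenkins.io/redhat-stable/jenkins.io.key
sudo yum install -y jenkins
sudo systemctl enable --now jenkins
# Chrome for the headless Selenium runs
sudo yum install -y https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm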

Can you load standard zeppelin interpreter settings from S3?

Our company is building up a suite of common internal Spark functions and jobs, and I'd like to make sure that our data scientists have access to all of these when they prototype in Zeppelin.
Ideally, I'd like a way for them to start up a Zeppelin notebook on AWS EMR and have the dependency jar we build automatically loaded onto it, without them having to type in the Maven information manually every time (private repo location/credentials, package info, etc.).
Right now we have the dependency jar loaded on S3, and with some work we could get a private maven repository to host it on.
I see that ZEPPELIN_INTERPRETER_DIR saves off interpreter settings, but I don't think it can load them from a common default location (like S3, or something similar).
Is there a way to tell Zeppelin on an EMR cluster to load its interpreter settings from a common location? I can't be the first person to want this.
Other thoughts I've had but have not tried yet:
Have a script that uses the AWS command line to start an EMR cluster with all the necessary settings pre-made for you. (It could also upload the .jar dependency if we can't get Maven to work.)
Use an infrastructure-as-code framework to start up the clusters with the required settings.
I don't believe it's possible to tell EMR to load settings from a common location. The first thought you included is the way to go, in my opinion: you would run aws emr create ..., and that create would include a shell-script step that replaces /etc/zeppelin/conf.dist/interpreter.json by downloading the interpreter.json of interest from S3, and then hard-restarts Zeppelin (sudo stop zeppelin; sudo start zeppelin).
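As a concrete illustration, the step script could be as small as this (the bucket and key are placeholders; the exact conf path and restart commands depend on the EMR release):
# pull the shared interpreter settings from S3 and hard-restart Zeppelin
sudo aws s3 cp s3://my-company-zeppelin-config/interpreter.json /etc/zeppelin/conf.dist/interpreter.json
sudo stop zeppelin
sudo start zeppelin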

Is there a way to push changes to AWS Beanstalk instead of uploading an entire zip file on each deploy?

I'm migrating a Play! application from Heroku to AWS Beanstalk.
Heroku is really straightforward when it comes to deploying: just push changes to a remote git repository on Heroku and the build occurs on the server side.
This is very convenient because it is not necessary to upload the whole project for each tiny change (including all libraries!).
Basically for each change we are generating a huge 140 MB Docker zipped file that takes at least 10 minutes to upload.
Surely there must be a better way, but a long search on Google only returned options to automate the file generation with scripts, and alternatives like Jenkins; that does not solve the problem, it just automates it.
Does anyone have a better solution?
You can set up an AWS CodeCommit repository and use it as a remote for your local git repository. Next you can set up AWS CodePipeline to build your application and deploy it to Elastic Beanstalk whenever there is a new commit to the AWS CodeCommit repository.
This way you don't have to upload everything every time. Whenever you do git push, only the changed files are uploaded to the AWS CodeCommit repository, and then AWS CodePipeline takes care of building your application and deploying it to Elastic Beanstalk.
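Hooking the local repository up to CodeCommit is just a matter of adding another git remote; the repository name and region below are placeholders, and this assumes your git credentials for CodeCommit are already configured:
# create the CodeCommit repo and push the existing project to it
aws codecommit create-repository --repository-name my-play-app --region us-east-1
git remote add aws https://git-codecommit.us-east-1.amazonaws.com/v1/repos/my-play-app
git push aws master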
So I got curious about this question too and had a conversation with an AWS specialist about the different options here. Each option has its downsides, though.
The first option is to bake your application code, create an AMI out of it, and carry out deployments using the baked AMI. More on that
You have to test this approach first before adopting it. The downside is that you would have to regularly maintain the AMI. You might also miss out on critical patches from Beanstalk, since the AMI has been locked down.
A good read on this topic
The next approach would be to move out of Beanstalk and use CloudFormation, where you can just upload your application folder to S3. Your CloudFormation template has to take care of spinning up all the required resources, and using AWS::CloudFormation::Init and cfn-signal it is possible to install and set up software. Changes within the resource metadata can be detected by making use of the proper CloudFormation signal, and we can also run user-specified actions when a change is detected in the template specification.
(AWS::CloudFormation::Init)
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-helper-scripts-reference.html (set of helper scripts that can be used with CloudFormation)
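For orientation, the instance's UserData in such a template usually just invokes those helper scripts; the stack, resource, and region below are placeholders:
# run the metadata-driven install, then signal success/failure back to CloudFormation
/opt/aws/bin/cfn-init -v --stack my-stack --resource WebServerInstance --region us-east-1
/opt/aws/bin/cfn-signal -e $? --stack my-stack --resource WebServerInstance --region us-east-1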
Although these are not exactly a solution to what you asked for, they can be a good alternative. At least I made sure that you are not missing out any available options at Beanstalk.
One piece of advice I also got from them was to consider splitting the application up into multiple components and sub-components. This would reduce your application size considerably.
Hope this helped.
Short answer: No.
Long Answer: I ended up packaging the app with activator and not using Docker.
Create a folder named "dist" in the root of the project.
Include a file named Procfile with the following line:
web: ./bin/YOUR_APP_NAME -Dhttp.port=5000 -Dconfig.file=conf/application.conf
Make sure to replace YOUR_APP_NAME with the name of your app as configured in build.sbt.
Package the Play app with the following command:
activator clean dist
That will generate a zip file inside the target/universal/ folder of the project.
Deploy that zip file to AWS Elastic Beanstalk.
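If you prefer the CLI over the web console, deploying that zip can look roughly like this (the bucket, application, environment, and version label are placeholders):
# upload the bundle and point the Beanstalk environment at the new version
aws s3 cp target/universal/my-app-1.0.zip s3://my-deploy-bucket/my-app-1.0.zip
aws elasticbeanstalk create-application-version \
    --application-name my-play-app --version-label v1.0 \
    --source-bundle S3Bucket=my-deploy-bucket,S3Key=my-app-1.0.zip
aws elasticbeanstalk update-environment \
    --environment-name my-play-app-env --version-label v1.0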

How do I make a django project compatible with AWS Beanstalk?

I want to make a Django project compatible with AWS Beanstalk.
I don't want to do this the way the AWS tutorial does it, since they use git and require setting up the whole project the way they describe.
I just want to know if there is a way of converting an already created Python/Django project to be AWS Beanstalk compatible. I mean, isn't there a standard project layout to download, or a plugin or command-line tool that creates the .ebsettings folder for me? I want to convert my project and upload it through the AWS web GUI; I don't need all the git stuff.
You can do this without going the git route. You just need to zip your source bundle and upload it to the Beanstalk web console. The code structure can be kept the way you want.
Key configurations are:
1. WSGIPath: this should point to the .py file that starts the app (the WSGI application)
2. static: this should point to the path containing the static files
You can add these configurations in the .ebextensions folder, which should be at the root of your app zip. You can read more details here: Customizing and Configuring a Python Container - AWS Elastic Beanstalk
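As a rough sketch, such a config file could be generated like this (the project name mysite and the pre-Amazon Linux 2 Python namespaces are assumptions; adjust them to your project and platform version):
# write a minimal Python-container config into .ebextensions
mkdir -p .ebextensions
cat > .ebextensions/python.config <<'EOF'
option_settings:
  aws:elasticbeanstalk:container:python:
    WSGIPath: mysite/wsgi.py
  aws:elasticbeanstalk:container:python:staticfiles:
    /static/: static/
EOF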