Framework to run processes in the "cloud" - amazon-web-services

I am currently looking for a solution to run arbitrary scripts on a cloud instance (aws, digitalocean, rackspace, I'm not picky). I am not doing something shady I simply want to use it for performance testing and need reproducible results (deploy a service, set specific testdata, run the performance tests, kill everything, repeat if necessary).
Of course I can use the API of these providers and build a custom solution, but I'm wondering if there is a framework or a bunch of tools that will help me with that.
What I need is:
- Only using an instance for the runtime of the script
- possibility to store data outside of the instance for result analysis
There are a lot of tools to automate setting up cloud instances but they all seem targeted for deployment purposes. What I need is a cloud script runner.

From your description is sounds like you might be looking for something like AWS's new(ish) Lambda service.
This allows you define scripts and triggers to run them in he clued without the overhead of spinning up and having to manage cloud compute 'servers'.
More info:
https://aws.amazon.com/lambda/
One thing to be careful of when using the cloud for performance testing - you have no teal control over the actual HW that your code will run on and different runs may run on different HW. This is true even for server or instance based cloud testing.

Related

Setup serverless local environment for AWS using serverless framework

Hi I am using the serverless framework to develop my application and I need to set it up in a local environment I am using API gateway, Lambda, VPC , SNS, SQS, and DB is connected via VPC peering, currently, everytime I am deploying and testing my code and its tedious process and takes 5 mins to deploy, Is there any way to set up a local environment to have everything in one place
It should be possible in theory, but it is not an easy thing to do. There are products like LocalStack that offer exactly this.
But, I would not recommend going that route. Ultimately, by design this will always be a huge cat and mouse game. AWS introduces a new feature or changes some minor detail of their implementation and products like LocalStack need to catch up. Furthermore, you will always only get an "approximation" of the "actual cloud". It never won't be a 100% match.
I would think there is a lot of work involved to get products like LocalStack working properly with your setup and have it running well.
Therefore, I would propose to invest the same time into proper developer experience within the "actual cloud". That is what we do: every developer deploys their version of the project to AWS.
This is also not trivial, but the end result is not a "fake version" of the cloud that might or might not reflect the "real cloud".
The key to achieve this is Infrastructure as code and as much automation as possible. We use Terraform and Makefiles which works very well for us. If done properly, we only ever build and deploy what we changed. The result is that changes can be deployed in seconds to AWS and the developer can test the result either through the Makefile itself or using the AWS console.
And another upside of this is, that in theory you need to do all the same work anyway for your continuous deployment, so ultimately you are reducing work by not having to maintain local deployments and cloud deployments.

Cloud Workflows vs Cloud Build for buinding infrastructure?

Since now, I've used Cloud Build as a vanilla CICD for running terraform and for building the infrastructure (sometimes I've Docker containers to build, sometimes I've not).
Now that Cloud Workflows is available I was wondering if this could be a better tool for pipelining atomic steps execution, for easiness and better control (for ex. conditional executions, error handling and so on, centralized log pushing and so)
I think that everything of the aboves can be done in Cloud Build, but it's usually not trivial to do.
Is Workflows ok for that and, if not, which is the best use case of this new tool instead?
You can have similarities, if, for example, your Cloud Build only call APIs to run/deploy/configure stuff.
However, keep in mind 2 things:
Cloud Workflow can only call APIs and sleep. You can't build a container image (with Docker for example) with Workflow. it's not a runtime environment, just a stuff which call APIs
Cloud Build can be trigger on push, tag and pull request. You can't do that with Workflow.
So, yes, sometime you can ask yourselves if you can change one by the other, but personally, I think that you have to use the right product for the right job.
API call orchestration -> Workflow
CICD -> Cloud Build

Using cloud functions vs cloud run as webhook for dialogflow

I don't know much about web development and cloud computing. From what I've read when using Cloud functions as the webhook service for dialogflow, you are limited to write code in just 1 source file. I would like to create a real complex dialogflow agent, so It would be handy to have an organized code structure to make the development easier.
I've recently discovered Cloud run which seems like it can also handle webhook requests and makes it possible to develop a complex code structure.
I don't want to use Cloud Run just because it is inconvenient to write everything in one file, but on the other hand it would be strange to have a cloud function with a single file with thousands of lines of code.
Is it possible to have multiple files in a single cloud function?
Is cloud run suitable for my problem? (create a complex dialogflow agent)
Is it possible to have multiple files in a single cloud function?
Yes. When you deploy to Google Cloud Functions you create a bundle with all your source files or have it pull from a source repository.
But Dialogflow only allows index.js and package.json in the Built-In Editor
For simplicity, the built-in code editor only allows you to edit those two files. But the built-in editor is mostly just meant for basic testing. If you're doing serious coding, you probably already have an environment you prefer to use to code and deploy that code.
Is Cloud Run suitable?
Certainly. The biggest thing Cloud Run will get you is complete control over your runtime environment, since you're specifying the details of that environment in addition to the code.
The biggest downside, however, is that you also have to determine details of that environment. Cloud Funcitons provide an HTTPS server without you having to worry about those details, as long as the rest of the environment is suitable.
What other options do I have?
Anywhere you want! Dialogflow only requires that your webhook
Be at a public address (ie - one that Google can resolve and reach)
Runs an HTTPS server at that address with a non-self-signed certificate
During testing, it is common to run it on your own machine via a tunnel such as ngrok, but this isn't a good idea in production. If you're already familiar with running an HTTPS server in another environment, and you wish to continue using that environment, you should be fine.

What service should I use to process my files in a Cloud Storage bucket and upload the result?

I have a software that process some files. What I need is:
start a default image on google cloud (I think docker should be a good solution) using an API or a run command
download files from google storage
process it, run my software using those downloaded files
upload the result to google storage
shut the image down, expecting not to be billed anymore
What I do know is how to create my image hehe. But I can't find any info saying me what google cloud service should I use or even if I could do it like I'm thinking. I think I'm not using the right keywords to find what i need.
I was looking at Kubernetes, but i couldn't figure out how to manipulate those instances to execute a one time processing.
[EDIT]
Explaining better the process I have an app that receive images and send it to Google storage. After that, I need to process that images, apply filters, georeferencing, split image etc. So I want to start a docker image to process it and upload the results to google cloud again.
If you are using any of the runtimes supported by Google Cloud Functions, they are easiest way to do those kind of operations (i.e. fetch something from Google Cloud Storage, perform some actions on those files and upload them again). The Cloud Functions will be triggered by an event of your choice, and after the job, it will die.
Next option in terms of complexity would be to deploy a Google App Engine application in standard environment. It allows you to deploy your own application written in any of the supported languages for this environment. While there is traffic in your application, you will have instances serving, but the number of instances running can go down to 0 when they are not serving, which would mean less cost.
Another option would be Google App Engine in flexible environment. This product allows you to deploy your application in any custom runtime. This option has always at least one instance running, so it would never shut down.
Lastly, you can use Google Compute Engine to "create and run virtual machines on Google infrastructure". Otherwise than GAE, this is not that managed by Google, which means that most of the configuration is up to you. In this case, you would need to programmatically indicate your VM to shut down after you have finished your operations.
Based on your edit where you stated that you already have an app that is inserting images into Google Cloud Storage, your easiest option would be to use Cloud Functions that are triggered by additions, changes, or deletions to objects in Cloud Storage buckets.
You can follow the Cloud Functions tutorial for Cloud Storage to get an idea of the generic process and then implement your own code that handles your specific tasks. There are other tutorials like the Imagemagick tutorial for Cloud Functions that might also be relevant to the type of processing you intend to do.
Cloud Functions is probably your lightest weight approach. You could of course do more full scale applications, but that is likely overkill, more expensive, and more complex. You can write your processing code in Node.js, Python, or Go.

How can I speed up my unit tests using cloud computing?

I work on a project with a lot of legacy code that has some tests (in JUnit). Those unit tests are pretty slow and running them often slows me down. I plan on refactoring and optimizing them (so they're real unit tests), but before I do this, I'd like to speed them up just for now, by running them in parallel. Of course I could get myself some cluster and run them there, but that's not worth the hassle and the money. Is it possible to do so with some cloud services like e.g. Amazon AWS? Can you recommend me some articles where I could read more about it?
Running unit tests in parallel requires 2 things:
Creating groups of multiple tests using either suites or JUnit categories
A way to launch each group of tests
Once you have #1, an easy way to get #2 is to use Continuous Integration.
Do you already use Continuous Integration ?
If not, it is easy to add. One of the more popular ones is Jenkins
Jenkins also supports distributed builds where a master code launches build jobs on slave nodes. You could use this to run each of your groups in parallel.
Continuous Integration in the Cloud
Once you have Continuous Integration set up, you can move it into the cloud if you wish.
If you want to stick with jenkins, you can use
Cloud Bees which costs money
Red Hat Openshift "Gears" can be used to run Jenkins. I think it is possible to use gears in Amazon WS but I've never done it. You can run a few gears with limited memory usage for free, but if you need more it will cost money.
If you don't want to use Jenkins, there are other cloud based CIs including
Travis-CI which integrates with github. Here is an article about how to speed up builds in Travis