AWS Lambda - separate/protected binary store, so devs don't have to share binary files?

I'm trying to wrap my head around how using custom binaries with Lambda works. Since you have to upload the code bundle in a ZIP file (or pull it from S3), that means this action overwrites whatever you currently have in place. So let's say I have a folder structure like this in said ZIP file:
myFunc/
    index.js
    bin/
    node_modules/
And in the bin folder are a couple of binary executables. This means every developer on the team would need access to these binaries, and even the smallest code change to index.js would require zipping everything up with the binaries again and uploading the whole bundle.
Is there not some way in Lambda to specify some sort of separate cache/store where binaries can be kept, independently of the source code?

This is what you use a build server for - anybody pushes a code change, and a bundle is created automatically within seconds. Even better, the bundle can be pushed through a test pipeline until it reaches production (via the AWS API using, for example, boto) a few minutes later.
You could possibly store the binaries somewhere like S3 for the Lambda to access, but then you have a massive version-control problem. It is much easier (and safer) to create a complete bundle with absolutely everything the program needs. Additional benefits:
You can be absolutely certain of which exact code was involved in handling a particular request, making debugging much easier.
Developers can download the whole bundle and run it without having to establish a connection to the binary repository.
The bundle can be migrated to another service with minimal effort.
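To make that deploy step concrete, here is a minimal sketch, assuming the build job has already produced a bundle.zip containing index.js, bin/ and node_modules/. I mentioned boto above; this sketch makes the equivalent UpdateFunctionCode call from the AWS SDK for JavaScript, since the function itself is Node.js. The function name comes from the question, while the region and file name are assumptions.
// deploy.js - minimal sketch of the automated deploy step run by the build server
const fs = require("fs");
const AWS = require("aws-sdk");
const lambda = new AWS.Lambda({ region: "us-east-1" }); // assumed region
async function deploy() {
  // bundle.zip was produced by the build job and contains index.js, bin/ and node_modules/
  const zip = fs.readFileSync("bundle.zip");
  const result = await lambda.updateFunctionCode({
    FunctionName: "myFunc", // function name taken from the question
    ZipFile: zip,           // replaces the function's entire code package, binaries included
    Publish: true,          // publish an immutable version for easy rollback
  }).promise();
  console.log("Deployed version", result.Version);
}
deploy().catch(err => { console.error(err); process.exit(1); });
Hooking a script like this into the pipeline means nobody has to zip binaries by hand, and every deployed version is traceable back to a commit.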

Can I load code from a file in an AWS Lambda?

I am thinking of creating 2 generic AWS Lambda functions, one as an "Invoker" to run the other Lambda function. The invoked Lambda function loads the code it should run from a file, based on the parameter that is passed to it.
Invoker: Calls the invoked Lambda with a specified parameter, e.g. ID
Invoked: Based on the ID, loads the appropriate text file containing the actual code to run
Can this be done?
The reason for this thinking is that I don't want to have to deploy 100 Lambda functions if I could just save the code in 100 text files in an S3 bucket and load them as required.
The code is uploaded constantly by users, so I cannot include it in the Lambda. And the code can be in any of the languages supported by AWS (.NET, Node.js, Python, etc.).
For security, is there a way to maybe "containerize" running the code?
Any recommendation and ideas are greatly appreciated.
Thank you in advance.
The very first thing I'd like to mention is that you should pay close attention to the security aspects of your app, as you are going to execute code uploaded by users, meaning they will potentially be able to access sensitive data.
My example is based on Node.js, but I think something similar may be achievable with other runtimes (I'm not sure). There are two main things you need to know:
The AWS Lambda execution environment provides you with a /tmp folder with a capacity of 512 MB, and you are allowed to put there any resources needed for the current invocation.
Node.js allows you to require modules dynamically at any place in the app.
So, basically, you may download the desired js file into the /tmp folder and then require it from your code. I am not going to write the real code now as it could be quite big, but here are some general steps just to make things clear:
Lambda receives fileId as a parameter in event.
Lambda searches S3 for the file named fileId and then downloads it to the /tmp folder as fileId.js
Now in the app you may require that file and consider it as a module:
const dynamicModule = require("/tmp/fileId.js");
Use the loaded module
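To tie those steps together, here is a rough sketch of such a handler, assuming the aws-sdk module is available in the runtime; the bucket name is made up, and it presumes the downloaded file exports a run() function:
// handler.js - sketch of steps 1-4 above (bucket name and run() export are assumptions)
const fs = require("fs");
const AWS = require("aws-sdk");
const s3 = new AWS.S3();
const CODE_BUCKET = "my-user-code-bucket"; // assumed bucket name
exports.handler = async (event) => {
  const fileId = event.fileId;                      // step 1: fileId arrives in the event
  const localPath = "/tmp/" + fileId + ".js";
  const obj = await s3.getObject({ Bucket: CODE_BUCKET, Key: fileId + ".js" }).promise();
  fs.writeFileSync(localPath, obj.Body);            // step 2: save the file into /tmp
  delete require.cache[require.resolve(localPath)]; // drop any copy cached by a warm container
  const dynamicModule = require(localPath);         // step 3: require it as a module
  return dynamicModule.run(event);                  // step 4: use the loaded module
};
Keep in mind that this only works for JavaScript files; as the answer below points out, code for other runtimes cannot be loaded this way.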
You certainly won't be able to run Python code, or .NET code, in a Node Lambda. Can you load files and dynamically run the code? Probably. Should you? Probably not. Even if you trust the source of that code, you don't want them all running in the same function: 1) they would share the same permissions, which means that, at a minimum, they would all have access to the same S3 bucket where the code is stored; 2) they would all log to the same place. Good luck debugging.
We have several hundred lambda functions deployed in our account. I would never even entertain this idea as an alternative.

Can I sync two bazel-remote caches using rsync?

I have a build pipeline that builds and tests changes before they are merged to the main line. Once that happens, it would be great if the Bazel actions from that build are available to developers. Unfortunately, the build pipeline runs in the cloud and uses an in-cloud cache, but the developers use an on-premises cache.
I am using https://github.com/buchgr/bazel-remote
Does anyone know if I can just rsync the artifacts from the data directory of the cloud cache to the developers' cache in order to give them access to the pre-built artifacts? Normally, I would just try it out, but I'm concerned about subtle issues that might poison the cache or negatively affect the hit rate, so I'm hoping to hear from someone who understands the code before I go digging.
You can rsync the cache directory contents and use them from another location, but this won't work with a running bazel-remote: the items will be ignored until bazel-remote is restarted.
Another option would be to use the http_proxy configuration file setting to automatically put/get cache items to/from another bazel-remote instance. An example configuration file was recently added to README.md in the bazel-remote git repository.

Calculating when Lambda Layers should be updated

I have a project with a CodePipeline sourcing from GitHub which updates layers based on file changes. We don't want to automatically update the layers on every commit because they are not necessarily changing. Because there is no built-in comparison to tell whether a Lambda layer needs to be updated, the burden of determining if a layer should be updated falls on the user. I've tried a couple of different options:
1. Hash the local representation of the layer files and compare it to the most recent Lambda layer on AWS. If the hash is different, you know you have file changes and should update.
2. Look at your git file changes (i.e. with PythonGit) and see if any of your layers have changed files. If so, you should update your layer.
Option 2 is a problem in CodePipeline specifically because when a repo is sourced from GitHub, the Download ZIP functionality is used, not git clone, so the .git folder is removed. You could get it back via renaming it, but it gets messy.
I'd be interested to hear how other people have handled this problem.
You can write a version/hash/etc. to the description of the Lambda layer.
You can then compare this description with the version in your git repo.
It sounds a little bit hacky, so I prefer to build the layer every time I commit to master (for example) and automatically delete previous versions (keeping the last N versions for potential rollback).
It's not much overhead for my purposes, but it depends on your situation.
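For illustration, here is a rough sketch of the first idea (comparing a hash stored in the layer description against a locally computed hash) in Node.js with the AWS SDK. The layer name, source directory and region are assumptions, and it presumes the hash was written into the Description field when the layer version was published.
// check-layer.js - sketch of the "hash in the layer description" comparison
const crypto = require("crypto");
const fs = require("fs");
const path = require("path");
const AWS = require("aws-sdk");
const lambda = new AWS.Lambda({ region: "us-east-1" }); // assumed region
// Hash every file under the layer's source directory in a stable order.
function hashDirectory(dir) {
  const hash = crypto.createHash("sha256");
  for (const name of fs.readdirSync(dir).sort()) {
    const full = path.join(dir, name);
    hash.update(name);
    if (fs.statSync(full).isDirectory()) {
      hash.update(hashDirectory(full));
    } else {
      hash.update(fs.readFileSync(full));
    }
  }
  return hash.digest("hex");
}
async function layerNeedsUpdate(layerName, srcDir) {
  const localHash = hashDirectory(srcDir);
  const { LayerVersions } = await lambda.listLayerVersions({ LayerName: layerName }).promise();
  // Pick the highest version; its Description is assumed to hold the hash stored at publish time.
  const latest = LayerVersions.sort((a, b) => b.Version - a.Version)[0];
  return !latest || latest.Description !== localHash;
}
layerNeedsUpdate("my-shared-layer", "./layers/my-shared-layer") // assumed names
  .then(update => console.log(update ? "publish a new layer version" : "layer unchanged"));
On publish you would pass the same hash as the Description parameter of publishLayerVersion, so the next run has something to compare against.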

Sitecore Config Files + Project Setup

We are updating our Sitecore installation to 8.2, and in the process I am trying to refine our source control and development workflow.
Goals
1. Have a single source of truth for support DLLs, configs, lic files, etc.
2. Have everything in source control that is needed to recreate the entire site from dev to prod (excluding packages).
In order to have all of the different configs needed for the various machines I have created gulp tasks that transform the configs on build (dev, staging, prod). Those transformed configs are placed in a folder in the project that is then used to replace the originals on the target machines. This folder publishes all of its contents and seems to be working well so far.
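For context, a stripped-down sketch of that kind of per-environment gulp task is shown below; the gulp-replace plugin, the placeholder token and the folder names are stand-ins for illustration rather than the exact transforms we use.
// gulpfile.js - simplified sketch of an environment-specific config transform
const gulp = require("gulp");
const replace = require("gulp-replace"); // stand-in for the real transform step
const env = process.env.BUILD_ENV || "dev"; // dev | staging | prod (assumed variable)
// Copy the source configs into a per-environment folder, swapping a placeholder
// token for the current environment's value along the way.
gulp.task("transform-configs", function () {
  return gulp.src("src/configs/**/*.config")
    .pipe(replace("{{ENVIRONMENT}}", env))
    .pipe(gulp.dest("build/TransformedConfigs/" + env));
});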
What I don't know is how to deal with all of the config files that do not change.
Is it best to include all of those .config files in the project so that they publish? If not, then the target machine folders will have to be either manually managed (which seems like a bad idea) or kept up to date by a script (more customization, which by default is not a great idea).
The only downside (that I see) to including all of the configs in the project is the weight that it would add to file searches (and that doesn't seem like a very strong argument).
Am I not seeing something?
How are you other Sitecore humans handling this?
Gregory
As a general rule of thumb, do not check in any default files into Source Control.
The main reasons are bloat, which makes syncing/downloading from your source control take much longer, and upgrades, the latter being the much more important reason.
If/when you upgrade in the future and you do not have any Sitecore files checked into source control, you can simply deploy a new/clean instance of Sitecore, fix any conflicts in your own code and then deploy on top. You don't have to try to figure out what has changed in the default install files between releases.
Any changes you need to make to Sitecore configs or settings should be made using patch files and only those custom files added to your solution.
How to handle this for deployments?
There are a few options. You could go down the scripted route, which would take a clean Sitecore install, unzip it and make whatever modifications you need, then install/unzip the modules that you use in your solution one by one.
Another option may be to create a default install with all the modules and then zip this up; an install would then be a similar process to the above, but the simpler case of just unzipping a single file. You could use Sitecore SIM to install the instance and modules and then take a backup, or do this manually.
Yet another alternative may be to check everything into source control, either under a separate repository or a different project, to ensure that all default files and configs are kept separate. If you need to upgrade in the future, simply delete the repo/project and add them back in again.
I would also do the same (a separate project) to keep all Support patches/DLLs separate, again to help easily identify what fixes have been applied and to easily remove them if a future version resolves the issue.
These may add an additional step to your deploy, but keeping this separation will make your life much, much easier when it comes time to upgrade.

Is there an ideal way to move from staging to production for ColdFusion code?

I am trying to work out a good way to run a staging server and a production server for hosting multiple ColdFusion sites. Each site is essentially a fork of a repo, with site-specific changes made to each. I am looking for a good way to have this staging server move code (upon QA approval) to the production server.
One fanciful idea involved compiling the sites each into EAR files to be run on the production server, but I cannot seem to wrap my head around ColdFusion archives, plus I cannot see any good way of automating this, especially the deployment part.
What I have done successfully before is use Subversion as a go-between for a site, where once a site is QA'd the code is committed and then the production server's working directory has an SVN update run on it, which then triggers a code copy from the working directory to the actual live code. This worked fine, but has many moving parts and still required some form of access to each server to run the commits and updates. Plus, this worked for an individual site; I think it may be a nightmare to set up and maintain this architecture for multiple sites.
Ideally I would want a group of developers to have FTP access with the ability to log into some control panel to mark a site for QA, and then have a QA person check the site and mark it as stable/production worthy, and then have someone see that a site is pending and click a button to deploy the updated site. (Any of those roles could be filled by the same person mind you)
Sorry if that last part wasn't so much the question, just a framework to understand my current thought process.
I agree with @Nathan Strutz that Ant is a good tool for this purpose. Some more thoughts.
You want a repeatable build process that minimizes opportunities for deltas. With that in mind:
SVN export a build.
Tag the build in SVN.
Turn that export into a .zip, something with an installer, etc. The idea is to have one unit to validate, with a set of repeatable deployment steps.
Send the build to QA.
If QA approves, deploy that build into production.
Move whole code bases over as a build, rather than just changed files. This way you know what's put into place in production is the same thing that was validated. Refactor code so that configuration data is not overwritten by a new build.
As for actual production deployment, I have not come across a tool to solve the multiple servers, different code bases challenge. So I think you're best served rolling your own.
As an aside, in your situation I would think through an approach that allows for a standardized codebase, with a mechanism (i.e. an API) that allows for the customization you're describing. Otherwise managing each site as a "custom" project is very painful.
Update
Learning Ant: Ant in Action [book].
On Source Control: for the situation you describe, I would maintain a core code base plus an overlay per site. Export the core, then the site-specific overlay on top of it. This ensures any core updates that the site-specific changes don't override make it in.
Call this combination a "build". Do builds with Ant. Maintain an Ant script - or, perhaps more flexibly, an Ant configuration file - per core & site combination. Track the version numbers of core and site as part of a given build.
If your software is packaged inside an installer (Nullsoft Install Shield, for instance), that should be part of the build. Otherwise you should generate a .zip file (.ear is a possibility as well, but I haven't seen anyone actually do this with CF). The point being: one file that encompasses the whole build.
This build file is what QA should validate, so validation includes deployment, configuration and functionality testing. See the deployment notes below for how this can flow.
Deployment:
If you want to automate deployment, QA should be involved as well to validate it. Meaning QA would deploy / install builds using the same process on their servers before doing a staging-to-production deployment.
To do this I would create something that tracks which server receives which build file, along with whatever credentials and connection information are necessary to make that happen, most likely via FTP. Once the file is transferred, the tool would then extract the build file / run the installer. This last piece is an area I would have to research: how to let one server run commands such as extraction or installation remotely.
You should look into Ant as a migration tool. It allows you to package your build process with a simple XML file that you can run from the command line or from within Eclipse. Creating an automated build process is great because it documents the process as well as executes it the same way, every time.
Ant can handle zipping and unzipping, copying things around, making backups if needed, working with your Subversion repository, transferring via FTP, compressing JavaScript and even calling a web address if you need to do something like flush the application memory or server cache once it's installed. You may be surprised by the things you can do with Ant.
To get started, I would recommend the Ant manual as your main resource, but look into existing Ant builds as a good starting point to get you going. I have one on RIAForge, for example, that does some interesting stuff and calls a Groovy script to do some more processing on my files during the build. If you search RIAForge for build.xml files, you will come up with a great variety of them, many of which are directly for ColdFusion projects.