Is the content on disk in cloud (Azure, AWS) zeroized prior to re-releasing to other users? - amazon-web-services

Wanted to know if cloud based platforms such as Azure and Amazon zeroize the content on the hard disk whenever an 'instance' is 'deleted' and prior to making it available for other users?
I've tried using 'dd' command on an Amazon-LightSail instance and it appears that the raw data is indeed zeroized. However was not sure if it was by chance (i just tried few random lengths) or if they actually take care to do that.
The concern is, if I leave passwords in configuration files, then someone who comes along would be able to read them (theoretically). Same goes for data in a database.

Generically, the solution to your concern typically used by Azure is storage encryption.
Your data is encrypted by default at the platform level with a key specific to your subscription; when the data or resource is removed, whether or not the storage is zeroed, it is effective inaccessible to a resource deployed on the same storage in another subscription.

Related

Cache JWKS in Lambda memory vs in temp

I currently am retrieving a JWKS keys using the Auth0 JWKS library for my Lambda custom authoriser function.
As explained in this issue on the JWKS library, apparently the caching built into JWKS for the public key ID does not work on lambda functions and as such they recommend writing the key to the tmp file.
What reasons could there be as to why cache=true would not work?
As far as I was aware, there should be no difference that would prevent in-memory caching working with lambda functions but allow file-based caching on the tmp folder to be the appropriate solution.
As far as I can tell, the only issues that would occur would be from the spawning of containers rate-limiting JWKS API and not the act of caching using the memory of the created containers.
In which case, what would be the optimal pattern of storing this token externally in Lambda?
There are a lot of option how to solve this. All have different advantages and disadvantages.
First of, storing the keys in memory or on the disk (/tmp) has the same result in terms of persistence. Both are available across calls to the same Lambda instance.
I would recommend storing the keys in memory, because memory access is a lot faster than reading from a file (on every request).
Here are other options to solve this:
Store the keys in S3 and download during init.
Store the keys on an EFS volume, mount that volume in your Lambda instance, load the keys from the volume during init.
Download the keys from the API during init.
Package the keys with the Lambdas deployment package and load them from disk during init.
Store the keys in AWS SSM parameter store and load them during init.
As you might have noticed, the "during init" phase is the most important part for all of those solutions. You don't want to do that for every request.
Option 1 and 2 would require some other "application" that you build do regularly download the keys and store them on S3 or a EFS volume. That is extra effort, but might in certain circumstances be a good idea for more complex setups.
Option 3 is basically what you are already doing at the moment and is probably the best tradeoff between simplicity and sound engineering for simple use cases. As stated before, you should store the key in memory.
Option 4 is a working "hack" that is the easiest way to get your key to your Lambda. I'd never recommend doing this, because sudden changes to the key would require a re-deployment of the Lambda, while in the meantime requests can't be authenticated, resulting in a down time.
Option 5 can be a valid alternative to option 3, but requires the same key management by another application like option 1 and 2. So it is not necessarily a good fit for a simple authorizer.

Best way to store and share cryptographic keys on GCP accross functions

Some functions I am writing would need to store and share a set of cryptographic keys (<1kb) somewhere so that:
it is shared across functions and within instances of the same function
it is maintained after function deploys
The keys are modified (and written) every 4 hours or so, based on whether a key has expired or a new key needs to be created.
Right now, I am storing the keys as encrypted binary in a cloud bucket with access limited to that function. It works, except that it is fairly slow (~500ms for the read / write that is required when updating the keys).
I have considered some other solutions:
Redis: fast, but overkill given the price ($40/month) it would cost to store a single value
Cloud SQL: the functions are already connected to a cloud instance so it would not incur more costs
Dropping everything and using a KMS. Unfortunately it would not meet the requirements I have.
The library I use in my functions is available here.
Is there a better way to store a single small blob of data for cloud functions (and possibly other tools like GKE) ?
Edit
The solution I ended up using was using a single table in a database that the app was already connected to. It is also about 5 times faster than using a bucket (<100ms).
The moral of the story is to use whatever is already provisioned to store the keys. If storing a key is a problem, then using the combo KMS + cloud functions for rotations described below seems like a good option.
All the code + more details are available here.
A better approach would be to manage your keys with Cloud KMS. However, as you mentioned before Cloud KMS does not automatically delete the old key version material and you will need to manually delete old versions which I suspect is a thing that you don’t want to do.
Another possibility is to just keep the keys in Firestore. Since for this you don’t have to provision any specific infrastructure such as with Redis Memorystore and Postgres Cloud SQL it will be easier to manage and to scale in the long run.
The general idea would be to have a Cloud Function triggered by Cloud Scheduler every 4 hours, and this function will rotate the keys on your Cloud Firestore.
How does this sound to you?

Does SNS retain my data?

I am evaluating push notification services and cannot use services on the cloud as laws prohibit customer identification data being stored off-premise.
Question
Is there any chance data will be stored off-premise if I use AWS-SNS API (not the console) to send push notifications to end user devices via code hosted on-premise(using AWS SDK)? In other words, will SNS retain my data or will it forget it right after it send the notification?
What have I tried so far?
Combed through the documentation as much as I could, but couldn't find anything to be 100% sure.
Would appreciate any pointers on this. TIA.
I would pose this question directly to AWS as it pertains to a legal requirement. I would clarify if the laws you need to comply with are in relation to data at rest or in transit, or both. Additionally if there are any circumstances where it would be ok for one or both of the aforementioned if there was certain security aspects that have been met.
Knowing no real detail about your use case I will say that AWS has a Region specifically for use by the US Government. If your solution is for the US Government then you should be making use of this Region as it ticks off a lot of compliance forms for you well in advance.
You can open a support ticket in the AWS console.
Again if there is a legal requirement for your data I thoroughly recommend that you ask AWS directly so that you may reference their answer in writing in the future.
Even if they didn't store it, how can you prove that to auditors?
Besides, what is the difference between storing something in memory (which they obviously have to do) and storing something on disk? One is volatile and the other isn't I guess. But from a compliance point of view, an admin on the box can get both, so who cares if the hardware with your data on it is a stick of RAM or a disk plugged into a SATA port?

Where to store user file uploads?

In my compojure app, where should I store user upload files? Do I just make a user-upload dir in my project root and stick everything in there? Is there anything special I should do (classpath, permissions, etc)?
To properly answer your question, you need to think of the lifecycle of the uploaded files. I would start answering questions such as:
how big are the files going to be?
what storage options will hold enough data to store all the uploads?
how about SLAs, redundancy and disaster avoidance?
how and who to monitor the free space and health of the storage?
In general, the file system location is much less relevant than the block device sitting behind it: as long as your data is stored safely enough for your application, user-upload can be anywhere and be anything from a regular disk to an S3 bucket e.g. via s3fs-fuse.
Putting such folder in your classpath sounds odd to me. It gives no essential benefit, as you will always need to go through a configuration entry to state where to store and read files from.
Permission wise, your application will require at least write access to the upload storage (most likely read access as well). Granting such permissions depends on the physical device you choose: if you opt for the local file system as you suggest in your question, you need to make sure the Clojure app is run by a user with chmod +rw, but in case of S3, you will need to configure API keys.
For anything other than a practice problem, I would suggest using a database such as Postgres or Datomic. This way, you get the reliability of a DB with real transactions, along with the ability to access the files across a network from any location.

Should I bother with web server database and file encryption?

I'm launching a Python Django web app on Heroku, using the default PostgreSQL database. I'll also be using a AWS S3 to store some files. The client I'm creating the site for is rightly concerned with security and asks is we can encrypt the database and the files stored in S3.
Am I correct in saying the only benefit that encryption will have, is it will protect our data in the unlikely case somebody breaks into one of Amazon's datacenters and happens to steal a hard drive on which our data is located?
I've come to the following conclusions:
Unless somebody gets hold of my AWS credentials or Heroko login details, the data is as safe as it can be.
Also, even if the data is encrypted and they get hold of my credentials/login details, they will still be able to read the data.
The key in keeping the site secure is just making sure nobody gets hold of my credentials/login details.
It is therefore not necessary to encrypt the database and files unless we believe there is a strong possibility of somebody breaking into an AWS datacenter.
Are my statements above correct?