I'm building a server-side application which requires the data to be stored encrypted in the database. When a client accesses the data, it also has to be transferred encrypted. The clients each have a unique login.
My original idea is to store the data encrypted with a symmetric algorithm like AES. When a client wants to access the data, the encrypted data is transferred to the client, while the key is encrypted with the client's public key.
Is this a secure way to store and transfer the data, or is there a better solution to this problem?
Update: If following Søren's suggestion to keep a copy of the AES key encrypted with each client's public key, wouldn't that require the key to be stored somewhere in order to add additional clients, or could it be generated in some way?
First you should start by defining some security properties you want to provide, for example:
Is it OK to give different users access to the same secret key? That is, if File1 is AES-encrypted with key K, is it a problem if both Alice and Bob are given K?
How do I revoke users from the system? (It turns out Bob from scenario 1 is actually a Chinese spy working inside our company; how do I securely kick him out of the system?)
Does the encrypted data that is saved in the database need to be searched? (This problem is well researched and hard to solve!)
How much (if any) and what plaintext data will be placed into the database to help organize it? Databases expect data to have unique keys associated with them. You need to make sure these keys don't leak information, but are useful enough to retrieve the data later.
How often should secret keys be changed? If you are storing files and multiple users are allowed access to encrypted files, what happens when user X modifies a file? Does the secret key change? Should the new key be sent to all users?
What happens when 2 users modify the same data at the same time? Will the database be able to handle this without modification?
There are many others.
If the server is not trusted and must never see plaintext data, then here's a general overview of a possible solution.
Let the clients manage the crypto completely. Clients authenticate with the server and are allowed to store data in the database. It is the responsibility of the client to make sure the data is encrypted.
In this scenario, keys should be saved securely only on the client's computer. If they must be placed elsewhere, a "master key" could be created.
Secure from what? You need to define your goals more clearly.
The solution would protect the data during transfer, but from your description, the server would have full access to the data (since it'd need to store the AES key unencrypted). In other words, a hacker or burglar with access to the server would have full access to the data.
If secure transmission is what you want, use an SSL / TLS wrapper around the database connection. This is a standard solution from all major vendors.
To secure the data server-side, the server should not have the AES key. If the number of clients is limited, the server could store a copy of the AES key for every client, each copy encrypted with that client's public key, so that the server never sees the plaintext data or an unencrypted AES key.
That is indeed the common approach, e.g. also used by NTFS file encryption.
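For illustration, a minimal sketch of that per-client key-wrapping idea in Python (using the cryptography package; the key sizes, AES-GCM, and OAEP padding are just reasonable defaults I picked, not something prescribed above):

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    # One symmetric data key encrypts the record; the server only ever stores
    # the ciphertext plus one wrapped copy of the data key per client.
    data_key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    record_ct = AESGCM(data_key).encrypt(nonce, b"sensitive record", None)

    # In practice each client generates its own key pair and only the public
    # half is sent to the server; generated here to keep the sketch runnable.
    client_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    oaep = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()), algorithm=hashes.SHA256(), label=None)
    wrapped_key = client_key.public_key().encrypt(data_key, oaep)

    # Client side: unwrap the data key with the private key, then decrypt locally.
    recovered_key = client_key.decrypt(wrapped_key, oaep)
    assert AESGCM(recovered_key).decrypt(nonce, record_ct, None) == b"sensitive record"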
I am wondering how the signing key and encryption key of a GCP Shielded VM instance can be used. I am thinking of using the encryption key (ekPub) to encrypt an arbitrary blob of data and be sure only the associated GCP instance can decrypt it. But I am not sure how to ask the vTPM to decrypt the encrypted data.
Shielded VM and Confidential Computing are two different features on Google Cloud.
Shielded VM checks at startup whether any component has been tampered with in a way that could lead to a data leak (through malware or a backdoor).
Confidential Computing automatically creates a cryptographic key at startup. This key is used to encrypt all the data in memory. The data is only decrypted inside the CPU, while being processed.
When data is written to disk, it is read from encrypted memory, decrypted in the CPU, and written in plain text to the disk, which is itself encrypted automatically (but by another process, not by the CPU).
You have nothing to do; it's automatic!
Background and Definitions
The endorsement key (EK) is a key on TPM2.0 that is used for attestation. The EK typically comes with a certificate signed by the manufacturer (note, not available on GCE instances) stating that the TPM is a genuine TPM[1]. However, the TCG had privacy concerns around attestation with one signing key. So, they decided to make the endorsement key an encryption key. The ActivateCredential flow[2] is typically used to trust a new signing key. This sidesteps the privacy concerns by allowing the use of a privacy CA to create an AK cert endorsing that the EK and AK are on the same TPM. GCE creates an AK by default that allows users to avoid this process by using the get-shielded-identity API.
Decryption
There are a few ways to encrypt data using the endorsement key.
Since the EK is restricted [3], you have to jump through some hoops to use it easily. Restricted here means the key cannot be used for general decryption; rather, it is used for storage/wrapping of TPM objects. A storage key is typically a restricted decryption key.
Here are some ways you can get around this problem:
1. Use TPM2_Import and TPM2_Unseal (Part 3 of the TPM spec [3])
TPM2_Import has the TPM decrypt an external blob (public and private) with a storage key. Then the user can load that object under the storage key and use it. TPM2_Unseal returns the secret within the sealed blob.
The flow is roughly the following:
A remote entity creates a blob containing a private part and a corresponding public part. The private part contains the original secret to decrypt.
Remote entity uses an EK to wrap a seed for a known KDF that derives a symmetric and HMAC key.
Use seed and KDF derived key to encrypt the private part. This is the "duplicate" blob.
Send duplicate, public, and encrypted seed to the VM.
TPM2_Import on duplicate, public, and encrypted seed with handle for the EK.
TPM2_Load on public and outPrivate (decrypted private) from TPM2_Import.
TPM2_Unseal on the object handle, secret will be in outData.
This is all done for you in https://github.com/google/go-tpm-tools. All you need is to pass in the PEM, decode it, and parse it into a public key.
Then you can use server.CreateImportBlob.
Send the output blob to the VM.
On the client side, use EndorsementKeyRSA (or EndorsementKeyECC) to create a go-tpm-tools key.
Use key.Import with the blob.
Specifically, see https://pkg.go.dev/github.com/google/go-tpm-tools/server#CreateImportBlob and https://pkg.go.dev/github.com/google/go-tpm-tools/tpm2tools#Key.Import
Note: package tpm2tools was recently renamed to client, but this is not yet in a public release.
2. Use TPM2_ActivateCredential (TPM spec, Part 3)
ActivateCredential allows you to verify a key is co-resident with another. Again, while this is typically used for attestation, you can use this to create an asymmetric key pair for general decryption.
In this scenario, the VM would generate an unrestricted decryption key on the TPM.
The server then generates the ActivateCredential challenge with the known templates of the EK and the decryption key.
If the decryption key's properties match, the TPM can fetch the challenge secret and return it to the server.
The server, upon receiving the successful response, can rely on the corresponding public key generated in the challenge and encrypt data to the VM.
One thing you may notice is, if you only want to decrypt a few times, you can just use the challenge secret as the plaintext.
You would need to stitch this together using https://pkg.go.dev/github.com/google/go-tpm/tpm2/credactivation and
https://pkg.go.dev/github.com/google/go-tpm/tpm2#ActivateCredential, as I don't currently know of tooling that supports this out of the box.
References
[1] EK specification: https://trustedcomputinggroup.org/resource/tcg-ek-credential-profile-for-tpm-family-2-0/
[2] Credential activation: https://github.com/google/go-attestation/blob/master/docs/credential-activation.md
[3] TPM spec: https://trustedcomputinggroup.org/resource/tpm-library-specification
I need to build an identity service that uses a customer-supplied key to encrypt sensitive ID values for storage in RDS, but it also has to allow us to look up a record later using the plaintext ID. We'd like to use a simple deterministic encryption algorithm for this, but it looks like the KMS API doesn't allow you to specify the IV, so you can never get identical plaintext to encrypt to the same value twice.
We also have the requirement to look up the data using another non-secure value and retrieve the encrypted secure value and decrypt it - so one-way hashing is unfortunately not going to work.
Taken together, this means we won't be able to look up the secure ID without brute-force iterating through all records, decrypting each one, and comparing it to the plaintext value, instead of simply encrypting the plaintext search value with a known IV and using that encrypted value as an index to look up the matching record in the database.
I'm guessing this is a pretty common requirement for things like SSNs, so how do people solve it?
Thanks in advance.
look up a record later using the plaintext ID
Then you are losing quite a bit of security. Maybe you could store a hash (e.g. SHA-256) of the ID alongside the encrypted data, which would make it easier to look up the record but would not allow reverting the value.
This approach assumes that the ID comes from a reasonably large message space (there are potentially a lot of possible IDs), so that it is not feasible to build a table mapping every possible value to its hash.
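A minimal sketch of that idea in Python (the function, table, and column names are made up; the large-message-space assumption above still applies):

    import hashlib

    def lookup_hash(plaintext_id: str) -> str:
        # Store this digest in its own indexed column next to the ciphertext;
        # look records up by hashing the search value the same way.
        return hashlib.sha256(plaintext_id.encode()).hexdigest()

    # e.g. SELECT ... FROM records WHERE id_hash = %s, passing lookup_hash('ABC-123')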
KMS API doesn't allow you to specify the IV so you can never get identical plaintext to encrypt to the same value twice.
Yes, KMS provides its own IV for the ciphertext, enforcing good security practice.
If I understand your use case correctly, your flow is like this:
The customer provides a key K and you use this key to encrypt a secret S, which is stored in RDS with an associated ID.
Given a non-secret identifier, you want to be able to look up S and decrypt it.
If the customer is reusing the key, this is actually not all that hard to accomplish.
Create a KMS key for the customer.
Use this KMS key to encrypt the customer's IV and the key the customer has specified, and store them in Amazon Secrets Manager - preferably namespaced in some way by customer. A JSON structure like this:
{
"iv": "somerandomivvalue",
"key": "somerandomkey"
}
would allow you to easily parse the values out. ASM also allows you to seamlessly perform key rotation - which is really nifty.
If you're paranoid, you could take a cryptographic hash of the customer name (or whatever) and namespace by that.
RDS now stores the numeric ID of the customer, the insecure values, and a namespace value (or some other method of deriving the secret's location in ASM).
It goes without saying that you need to limit access to the secrets manager vault.
To employ the solution:
Customer issues request to read secure value.
Service accesses ASM and decrypts the secret for the customer.
Service extracts the IV and key.
Service initialises the cipher scheme with the IV and key and decrypts the customer data.
Benefits: You encrypt and decrypt the secret values in ASM with a KMS key under your full control, and you can store and recover whatever state you need to decrypt the customer values in a secure manner.
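For illustration, a rough Python/boto3 sketch of steps 2-4 above (the secret name, base64 field encoding, and AES-CBC mode are my assumptions; Secrets Manager transparently uses your KMS key to decrypt the stored secret when you read it):

    import base64, json
    import boto3
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
    from cryptography.hazmat.primitives.padding import PKCS7

    def decrypt_customer_value(namespace: str, ciphertext_b64: str) -> bytes:
        # Step 2: fetch and decrypt the per-customer secret (KMS happens under the hood).
        secret = boto3.client("secretsmanager").get_secret_value(
            SecretId=f"customers/{namespace}/crypto"   # hypothetical naming scheme
        )
        material = json.loads(secret["SecretString"])   # {"iv": ..., "key": ...}

        # Step 3: extract the IV and key.
        iv = base64.b64decode(material["iv"])
        key = base64.b64decode(material["key"])

        # Step 4: initialise the cipher and decrypt the customer's data.
        decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
        padded = decryptor.update(base64.b64decode(ciphertext_b64)) + decryptor.finalize()
        unpadder = PKCS7(algorithms.AES.block_size).unpadder()
        return unpadder.update(padded) + unpadder.finalize()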
Others will probably have cryptographically better solutions, but this should do for a first attempt.
In the end we decided to continue using KMS for the customer-supplied-key encrypt/decrypt of the sensitive ID column, but we also enabled the PostgreSQL pgcrypto extension to provide secure hashes for lookups. So in addition to our encrypted column we added an id_hash column, and we operate on the table something like this:
    INSERT INTO employee VALUES ..., id_hash = ENCODE(HMAC('SENSITIVE_ID+SECRET_SALT', 'SECRET_PASSPHRASE', 'sha256'), 'hex');
    SELECT FROM employee WHERE division_id = ??? AND id_hash = ENCODE(HMAC('SENSITIVE_ID+SECRET_SALT', 'SECRET_PASSPHRASE', 'sha256'), 'hex');
We could have done the hashing client-side but since the algorithm is key to later lookups we liked the simplicity of having the DB do the hashing for us.
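For anyone who would rather hash client-side, a rough Python equivalent of the SQL above (using the same placeholder salt and passphrase values):

    import hashlib
    import hmac

    SECRET_SALT = "SECRET_SALT"
    SECRET_PASSPHRASE = b"SECRET_PASSPHRASE"

    def id_hash(sensitive_id: str) -> str:
        # Produces the same hex digest as pgcrypto's
        # ENCODE(HMAC('<id>+SECRET_SALT', 'SECRET_PASSPHRASE', 'sha256'), 'hex')
        message = (sensitive_id + "+" + SECRET_SALT).encode()
        return hmac.new(SECRET_PASSPHRASE, message, hashlib.sha256).hexdigest()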
Hope this is of use to anyone else looking for a solution.
I'm using the Amazon Encryption SDK to encrypt data before storing it in a database. I'm also using Amazon KMS. As part of the encryption process, the SDK stores the Key Provider ID of the data key used for encryption in the generated ciphertext header.
As described in the documentation here http://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/message-format.html#header-structure
The encryption operations in the AWS Encryption SDK return a single data structure or message that contains the encrypted data (ciphertext) and all encrypted data keys. To understand this data structure, or to build libraries that read and write it, you need to understand the message format.
The message format consists of at least two parts: a header and a body. In some cases, the message format consists of a third part, a footer.
The Key Provider ID value contains the Amazon Resource Name (ARN) of the AWS KMS customer master key (CMK).
Here is where the issue comes in. Right now I have two different KMS regions available for encryption. Each Key Provider ID has the exact same Encrypted Data Key value. So either key could be used to decrypt the data. However, the issue is with the ciphertext headers. Let's say I have KMS1 and KMS2. If I encrypt the data with the key provided by KMS1, then the Key Provider ID will be stored in the ciphertext header. If I attempt to decrypt the data with KMS2, even though the Encrypted Data Key is the same, the decryption will fail because the header does not contain the Key Provider for KMS2. It has the Key Provider ID for KMS1. It fails with this error:
com.amazonaws.encryptionsdk.exception.BadCiphertextException: Header integrity check failed.
at com.amazonaws.encryptionsdk.internal.DecryptionHandler.verifyHeaderIntegrity(DecryptionHandler.java:312) ~[application.jar:na]
at com.amazonaws.encryptionsdk.internal.DecryptionHandler.readHeaderFields(DecryptionHandler.java:389) ~[application.jar:na]
...
com.amazonaws.encryptionsdk.internal.DecryptionHandler.verifyHeaderIntegrity(DecryptionHandler.java:310) ~[application.jar:na]
... 16 common frames omitted
Caused by: javax.crypto.AEADBadTagException: Tag mismatch!
It fails to verify the header integrity. This is not good, because I was planning to have multiple KMSs in case the KMS in one region fails. We duplicate our data across all our regions, and we thought we could use the KMS from any region to decrypt as long as the encrypted data keys match. However, it looks like I'm locked into using only the KMS that originally encrypted the data? How on earth can we scale this to multiple regions if we can only rely on a single KMS?
I could include all the regions' master keys in the call to encrypt the data. That way the headers would always match, although they would not reflect which KMS is actually being used. However, that's also not scalable, since we could add or remove regions in the future, and that would cause issues with all the data that's already encrypted.
Am I missing something? I've thought about this, and I want to solve this problem without crippling any integrity checks provided by the SDK/Encryption.
Update:
Based on a comment from @jarmod:
Using an alias doesn't work either, because we can only associate an alias with a key in the same region, and it stores the resolved key ARN it points to anyway.
I'm reading this document and it says
Additionally, envelope encryption can help to design your application for disaster recovery. You can move your encrypted data as-is between Regions and only have to reencrypt the data keys with the Region-specific CMKs.
However, that's not accurate at all: the Encryption SDK will fail to decrypt in a different Region, because the Key Provider ID of the re-encrypted data keys will be totally different!
Apologies, since I'm not familiar with Java programming, but I believe there is confusion about how you are using the KMS CMKs to encrypt (or decrypt) the data using keys from more than one region for DR.
When you use multiple master keys to encrypt plaintext, any one of the master keys can be used to decrypt it. Note that only one master key (let's say MKey1) generates the plaintext data key which is used to encrypt the data. This plaintext data key is then encrypted by the other master key (MKey2) as well.
As a result, you will have encrypted data + encrypted data key (using MKey1) + encrypted data key (using MKey2).
If for some reason MKey1 is unavailable and you want to decrypt the ciphertext, the SDK can use MKey2 to decrypt the encrypted data key, which in turn decrypts the ciphertext.
So, yes, you have to specify multiple KMS CMK ARNs in your program if you want to use multiple KMS regions. The document you shared has an example as well, which I'm sure you are aware of.
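Since I don't write Java, here is a rough sketch of that multi-master-key setup using the Python flavor of the Encryption SDK instead (I believe the aws-encryption-sdk package looks roughly like this; the key ARNs are made up):

    # pip install aws-encryption-sdk
    import aws_encryption_sdk

    client = aws_encryption_sdk.EncryptionSDKClient()

    # The data key is generated under the first CMK and re-wrapped under every
    # other CMK, so any single region's key can decrypt the message later.
    key_provider = aws_encryption_sdk.StrictAwsKmsMasterKeyProvider(key_ids=[
        "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-1111",
        "arn:aws:kms:eu-west-1:111122223333:key/EXAMPLE-2222",
    ])

    ciphertext, _header = client.encrypt(source=b"sensitive data", key_provider=key_provider)

    # Decryption succeeds with whichever of the listed CMKs is reachable.
    plaintext, _header = client.decrypt(source=ciphertext, key_provider=key_provider)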
I'm using AWS S3 in my C++ app to upload and download files. I've included the access key and secret in my code but I'm worried someone could read them from the binary. Is there any standard technique for obfuscating them?
Update: I'm not running this app on a PC, it's actually on an embedded device so I'm not worried about users reading the key and secret from a file or RAM (accessing the device is a lot harder). What I'm worried about is someone binwalking our update file and pulling the key and secret from the binary.
Storing a secret on a computer is not an easy task. One thing you could do is encrypt the key using a password and store the encrypted data in a file. Then, when the user enters the password, you can decrypt the encrypted data with it and retrieve the key, which you can then use.
But this approach will not work for scenarios where the software needs to run without user intervention.
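A minimal sketch of that password-wrapping idea (Python with the cryptography package for brevity; the same flow maps to C++ with, say, OpenSSL, and the iteration count is just a reasonable default):

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM
    from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

    def _derive(password: str, salt: bytes) -> bytes:
        # Derive a 256-bit key from the user's password.
        return PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt,
                          iterations=600_000).derive(password.encode())

    def wrap_secret(secret: bytes, password: str) -> bytes:
        # Store salt + nonce alongside the ciphertext so the key can be re-derived.
        salt, nonce = os.urandom(16), os.urandom(12)
        return salt + nonce + AESGCM(_derive(password, salt)).encrypt(nonce, secret, None)

    def unwrap_secret(blob: bytes, password: str) -> bytes:
        salt, nonce, ct = blob[:16], blob[16:28], blob[28:]
        return AESGCM(_derive(password, salt)).decrypt(nonce, ct, None)

    blob = wrap_secret(b"AWS_SECRET_ACCESS_KEY_VALUE", "user-entered password")
    assert unwrap_secret(blob, "user-entered password") == b"AWS_SECRET_ACCESS_KEY_VALUE"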
It is better not to keep keys in code; input them when needed.
If they must be kept in code, do not keep the key as a simple string. Keep it in some pattern, and generate the key with an algorithm when needed.
I am looking for a protocol/algorithm that will allow me to use a shared secret between my app and an HTML page.
The shared secret is designed to ensure only people who have the app can access the webpage.
My problem: I do not know what algorithm (my methodology to validate access to the HTML page) and what encryption protocol I should use for this.
People have suggested that I use HMAC-SHAxxx, DES, or AES; I am unsure which I should use - do you have any suggestions?
My algorithm is like so:
I create a shared secret that the app and the HTML page both know (let's call it "MySecret"). To ensure that the shared secret is always unique, I add the current date and minute to the end of the secret, then hash it using some algorithm/protocol (HMAC/AES/DES). So the unencrypted secret would be "MySecret08/17/2011-11-11", and let's say the hash of that is "xyz".
I then add this hash to the URL as a CGI parameter: http://mysite.com/comp.py?sharedSecret=xyz
The comp.py script then takes the same shared secret and date combination, hashes it, and checks that the resulting hash is the same as the CGI parameter sharedSecret ("xyz"). If it is, I know a valid user is accessing the webpage.
Can you think of a better methodology to ensure only valid people can access my webpage (the webpage allows the user to enter a competition)?
I think I am on the correct track using a shared secret, but my methodology for validating the secret seems flawed, especially if the hash algorithm doesn't produce the same result for the same input all the time.
especially if the hash algorithm doesn't produce the same result for the same input all the time.
Then the hash is broken. Why wouldn't it?
You want HMAC in the simple case. You are "signing" your request using the shared secret, and the signature is verified by the server. Note that the HMAC should include more data to prevent replay attacks - in fact it should include all query parameters (in a specified order), along with a serial number to prevent the replay of the same message by an eavesdropper. If all you are verifying is the shared secret, anyone overhearing the message can continue to use this shared secret until it expires. By including a serial number, or a short validity range, you can configure the server to flag that.
Note that this is still imperfect. TLS supports client- and server-side certificates - why not use that?
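To make that concrete, a minimal sketch of HMAC-signing the request in Python (the parameter names, timestamp field, and validity window are my additions; a real scheme should also track nonces/serial numbers server-side to block replays):

    import hashlib, hmac, time
    from urllib.parse import urlencode

    SECRET = b"MySecret"  # the shared secret from the question

    def sign(params: dict) -> str:
        params = dict(params, ts=int(time.time()))        # short validity window
        canonical = urlencode(sorted(params.items()))     # parameters in a fixed order
        sig = hmac.new(SECRET, canonical.encode(), hashlib.sha256).hexdigest()
        return canonical + "&sig=" + sig

    def verify(query: str, max_age: int = 180) -> bool:
        body, _, sig = query.rpartition("&sig=")
        expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
        ts = int(dict(p.split("=") for p in body.split("&"))["ts"])
        return hmac.compare_digest(sig, expected) and abs(time.time() - ts) <= max_age

    query = sign({"entry": "competition42"})
    print(verify(query))   # True within the validity window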
That looks like it would work. Clock drift could be a problem; you may need to validate a range of, say, +/- 3 minutes if it fails for the exact time.
flawed especially if the hash algorithm doesn't produce the same result for the same input all the time
Well, that would be a broken hash algorithm then. A hash reliably produces the same output for the same input every time (and almost always a different output for a different input).
Try using some sort of network encryption. Your web server should be able to handle that type of authentication automatically. All that remains is for you to write it into your app (which you have to do anyway). Depending on your app platform, you may be able to do that automatically as well.
Google these: Kerberos, SPNEGO and HTTP 401 Authorization Required. You may be able to get away with simple hard-coded user name and password HTTP headers and run your connections over HTTPS. That way you have less custom code on your server and your server takes care of authenticating your requests for you. Not to mention you are taking advantage of some additional features of HTTP.