Pickling Error - Unable to Decrypt the Column - amazon-web-services

I am using the AWS Encryption SDK to decrypt encrypted JSON coming from a source system.
The JSON looks like this:
{
    "id": "uuid",
    "data": {"cipherText": "ASHFLASHLhjhkjGKgkgl"}
}
At the source, the data-key payload is first encrypted and then Base64-encoded.
So to decrypt it, we first need to decode it and then decrypt the result.
from pyspark.sql.functions import udf, unbase64

def decrypt_string(encrypted_line):
    # Extract the ciphertext from the "data" column
    encrypted_payload = encrypted_line["ciphertext"]
    return encrypted_payload

def decryption_data(decoded_payload):
    # Pass the Base64-decoded row to the SDK decrypt call
    decrypted_data, decrypted_header = client.decrypt(
        source=decoded_payload, key_provider=master_key_provider)
    return decrypted_data

spark_udf = udf(decrypt_string)
df = df.withColumn("cipertext_value", spark_udf("data"))
df = df.withColumn("decoded_base64_value", unbase64(df.cipertext_value))
spark_udf_decrpt = udf(decryption_data)
df = df.withColumn("decrypted_value", spark_udf_decrpt("decoded_base64_value"))
df.printSchema()
I am able to extract the ciphertext value from the column and decode it using unbase64, but when I try to decrypt it I get the error below.
PicklingError: Could not serialize object: TypeError: can't pickle SSLContext objects
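A likely cause, stated here as an assumption because the question does not show where client and master_key_provider are created: Spark pickles a UDF together with the objects it references, and an SDK client built on the driver holds network objects such as an SSLContext that cannot be pickled. A common workaround is to build the client lazily inside the function, so each executor creates its own. The sketch below shows only the pattern in Python 3; FakeClient, can_pickle and get_client are hypothetical names, and FakeClient stands in for the real SDK client.

```python
import pickle
import ssl


class FakeClient:
    """Hypothetical stand-in for an SDK client that owns an SSLContext."""

    def __init__(self):
        self.ssl_context = ssl.create_default_context()

    def decrypt(self, source):
        # Placeholder for the real SDK decrypt call.
        return source[::-1]


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


_client = None  # one client per Python worker process, built on first use


def get_client():
    global _client
    if _client is None:
        _client = FakeClient()
    return _client


def decryption_data(decoded_payload):
    # The client is looked up at call time, so no SSLContext has to be
    # serialized when the function is shipped to the executors.
    return get_client().decrypt(decoded_payload)
```

With this layout, get_client() should not be called on the driver before the UDF is registered, otherwise the already-built client may be captured again when the function is serialized.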

Related

Encrypt value using AWS KMS - Can I have encrypted value in UTF8?

I can successfully encrypt the value using the following code:
final static Charset ENCODING = StandardCharsets.ISO_8859_1;
var awsCreds = AwsBasicCredentials.create(KEY, SECRET_KEY);
kmsClient = KmsClient.builder().credentialsProvider(StaticCredentialsProvider.create(awsCreds)).region(Region.US_EAST_1).build();
var sdkBytesString = SdkBytes.fromString(stringToEncrypt, ENCODING);
var encryptRequest = EncryptRequest.builder().keyId(KEY_ARN).plaintext(sdkBytesString).build();
var encryptResponse = this.kmsClient.encrypt(encryptRequest);
var result = encryptResponse.ciphertextBlob().asString(ENCODING);
In the result I can see the encrypted value.
The problem is that I need this value in UTF-8, not ISO_8859_1. When I try to read ciphertextBlob as UTF-8, I get a conversion error:
java.io.UncheckedIOException: Cannot encode string.
I need to save the string in a UTF-8 database and send this encrypted string to another service that accepts UTF-8 strings.
Could you please advise how to get a UTF-8 string after encryption?
Actually, Base64-encoding the ciphertext solves the problem:
https://github.com/amazon-archives/realworld-serverless-application/blob/master/backend/src/main/java/software/amazon/serverless/apprepo/api/impl/pagination/EncryptedTokenSerializer.java#L51
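The reason Base64 helps can be sketched in a few lines of Python: ciphertext is arbitrary binary and usually contains byte sequences that are not valid UTF-8, while its Base64 form is plain ASCII and therefore also valid UTF-8. The byte values below are a made-up stand-in for a real ciphertextBlob, and is_valid_utf8 is a hypothetical helper.

```python
import base64

# Hypothetical stand-in for ciphertext; real KMS output is arbitrary binary.
ciphertext = bytes([0x01, 0xFF, 0xFE, 0x80, 0x41])


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Base64 output uses only ASCII characters, so it is safe to store or send
# anywhere a UTF-8 string is expected.
encoded = base64.b64encode(ciphertext).decode("ascii")

# The receiver reverses the step before calling decrypt.
decoded = base64.b64decode(encoded)
```

The same idea applies in Java via java.util.Base64, which is what the linked serializer appears to do.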

Sagemaker boto3 invoke_endpoint - I keep getting type errors for payload. using Blazingtext model endpoint

Let me frame the issue. I have trained a BlazingText model and have an endpoint deployed.
Within my notebook instance I can call model.predict and get inferences from the endpoint.
I am now trying to set up a Lambda function and an API Gateway for the endpoint. I am having trouble figuring out what the payload is supposed to be for invoke_endpoint(EndpointName=mymodel, Body=payload).
I keep getting invalid payload format errors.
This is what my payload looks like when testing the Lambda:
{"instances":"string of text"}
The documentation says the body takes bytes or file-like objects. I have tinkered with io with no luck. I could not find good blogs or tutorials for this particular issue, only videos going over the cookie-cutter examples.
import os
import io
import boto3
import json
import csv

# grab environment variables
ENDPOINT_NAME = os.environ['ENDPOINT_NAME']
runtime = boto3.client('runtime.sagemaker')

def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=2))
    data = json.loads(json.dumps(event))
    payload = data["instances"]
    print(data)
    #print(payload)
    response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                       ContentType='application/json',
                                       Body=payload.getvalue())
    #print(response)
    #result = json.loads(response['Body'].read().decode())
    #print(result)
    #pred = int(result['predictions'][0]['score'])
    #predicted_label = 'M' if pred == 1 else 'B'
    return
"errorMessage": "An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (406) from model with message \"Invalid payload format\"
If your payload is what you describe, i.e.:
payload = {"instances":"string of text"}
then you can get it in the form of json string using:
json.dumps(payload)
# which gives:
'{"instances": "string of text"}'
If you want it as a byte array, you can do the following:
json.dumps(payload).encode()
# which gives:
b'{"instances": "string of text"}'
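Putting the answer together, the body can be built in one small helper before it is handed to invoke_endpoint. This is a sketch rather than a confirmed fix: build_body is a hypothetical name, and wrapping a single string in a list rests on the assumption that the BlazingText endpoint expects "instances" to be a list of sentences, which should be checked against the algorithm's documentation.

```python
import json


def build_body(event):
    """Turn the Lambda test event into JSON bytes for the endpoint."""
    instances = event["instances"]
    # Assumption: the model wants a list of strings, so wrap a lone string.
    if isinstance(instances, str):
        instances = [instances]
    return json.dumps({"instances": instances}).encode("utf-8")


body = build_body({"instances": "string of text"})

# In the handler the bytes would then be passed as the request body, e.g.
# runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
#                         ContentType="application/json",
#                         Body=body)
```

Passing bytes this way avoids calling getvalue() on a plain string, which only exists on io objects.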

What encoding does blob.download_as_string() return?

I am downloading a file from Google Storage as a byte string, b64 encoding it, and using that as input into the Google Vision API.
storage_client = storage.Client(project=[PROJECT])
bucket = storage_client.get_bucket([BUCKET])
blob = bucket.blob([KEY])
content = blob.download_as_string()
b64content = base64.b64encode(content)
client = vision.ImageAnnotatorClient()
image = vision.types.Image(content=b64content)
I am getting a bad image error using the b64content. However, if I use the non base64 content, my call to the Vision API succeeds:
image = vision.types.Image(content=content)
Does blob.download_as_string() return a byte string that is already base64 encoded?
Short answer: no, it is not base64 encoded. Then why does it work with the non-encoded string?
Using the Python Client as you do, you don't need to encode the string, as seen here. You need to encode it if you post a Vision API request in JSON, like this one. This is why you get it working already without base64.b64encode().
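To illustrate the difference, here is a sketch of where Base64 does belong: the JSON body of a REST request, where binary data cannot be embedded directly. build_annotate_request is a hypothetical helper, the feature type is only an example, and the image bytes are a stand-in for the downloaded blob.

```python
import base64
import json


def build_annotate_request(image_bytes, feature_type="LABEL_DETECTION"):
    """Build the JSON body for a REST images:annotate call."""
    return {
        "requests": [
            {
                # JSON cannot carry raw bytes, so the image is Base64 text here.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature_type}],
            }
        ]
    }


# Stand-in for the bytes returned by blob.download_as_string().
content = b"\x89PNG\r\n\x1a\n fake image bytes"
request_json = json.dumps(build_annotate_request(content))
```

With the Python client, the raw content bytes are passed as they are, and any encoding needed for transport is handled by the library.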

Twitter media/upload image by its web url

I followed the steps in media/upload. I wrote this function in python
def upload_media(self, access_token, image_url):
    client = self.get_client(access_token)
    message = {'media': image_url}
    encoded_status = urllib.urlencode(message)
    url = "https://upload.twitter.com/1.1/media/upload.json?" + encoded_status
    resp, content = client.request(url, 'post')
    return content
And I got this :
{"request":"\/1.1\/media\/upload.json","error":"media type unrecognized."}
As far as I can tell, the error is in trying to upload a URL. The Twitter API requires you to upload a base64-encoded image.
See: https://dev.twitter.com/rest/reference/post/media/upload
So instead of the image's URL, it should be the file content:
with open('example.jpg', 'rb') as f:
    data = f.read()
message = {'media': data}
Optionally (I still haven't figured out whether this is required or not, as different people give different answers), you could encode the image in base-64 encoding:
with open('example.jpg', 'rb') as f:
    data = f.read()
data = data.encode('base64')
message = {'media': data}
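On the raw-versus-Base64 question, my understanding of the endpoint's documentation is that it accepts either form under different parameter names: media for raw binary and media_data for Base64 text. Treat that as an assumption to verify. The Python 3 sketch below builds both variants; build_upload_params is a hypothetical helper, and base64.b64encode replaces the Python 2 only data.encode('base64').

```python
import base64


def build_upload_params(image_bytes, use_base64=False):
    """Return the form parameters for a media upload request."""
    if use_base64:
        # Base64 text goes under "media_data".
        return {"media_data": base64.b64encode(image_bytes).decode("ascii")}
    # Raw binary goes under "media".
    return {"media": image_bytes}


# Stand-in for the bytes read from example.jpg.
image_bytes = b"\xff\xd8\xff\xe0 fake jpeg bytes"
raw_params = build_upload_params(image_bytes)
b64_params = build_upload_params(image_bytes, use_base64=True)
```

Either dictionary would then be sent as the POST body rather than appended to the URL as a query string.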

Send login credentials to another server

I have two servers running django. I'll call one server my "logging" server and another my "client" server. The client server wants to log a message with the logging server by passing over a username, password, and message over the internet. With my current implementation I'm hitting an error when trying to decrypt the encrypted message, username, and password that was sent over the wire. It looks like I should be trying to decrypt a "byte string" according to the pycrypto documentation, but I can't seem to create a byte string correctly since I haven't been able to get around this problem. Also, it feels like my implementation is taking me down a rabbit hole of security vulnerabilities and codec confusion. Is there a package that I should look at which already implements this type of functionality? If so what would that implementation look like?
client:
from Crypto.Hash import MD5
from Crypto.PublicKey import RSA
from base64 import b64decode
import urllib2
import urllib
#I realize recreating the hash everytime is slow. I just included it here for simplicity.
logger_public_signature_message = "I am a client :)"
logger_public_signature_hash = MD5.new(logger_public_signature_message).digest()
client_private_key = #private key
logger_public_key = #public key
client_private = RSA.importKey(client_private_key)
client_public = client_private.publickey()
logger_public = RSA.importKey(logger_public_key)
message = "my message"
username = "user"
password = "password"
encrypted_message = logger_public.encrypt(message, "ignored_param")
encrypted_username = logger_public.encrypt(username, "ignored_param")
encrypted_password = logger_public.encrypt(password, "ignored_param")
signature = client_private.sign(logger_public_signature_hash, '')
params = { "message": encrypted_message, "username": encrypted_username, "password": encrypted_password, "signature": signature }
url_encoded_params = urllib.urlencode(params)
url = 'http://localhost:8000/url/to/logger/'
req = urllib2.Request(url, url_encoded_params)
logger:
from Crypto.Hash import MD5
from Crypto.PublicKey import RSA
from base64 import b64decode
import urllib2

def log(request):
    #I realize recreating the hash every time is slow. I just included it here for simplicity.
    logger_public_signature_message = "I am a client :)"
    logger_public_signature_hash = MD5.new(logger_public_signature_message).digest()
    client_public_key = #client public key
    logger_private_key = #logger private key
    client_public = RSA.importKey(client_public_key)
    logger_private = RSA.importKey(logger_private_key)
    p = request.POST
    encrypted_message = urllib2.unquote(p["message"])
    encrypted_username = urllib2.unquote(p["username"])
    encrypted_password = urllib2.unquote(p["password"])
    signature = urllib2.unquote(p["signature"])
    #I'm hitting exceptions when trying to decrypt the encrypted messages.
    #The exceptions are: "ValueError: Message too large". From reading the
    #documentation, I think the issue is that I'm trying to decrypt a base64
    #string where I should be decrypting a byte string, but I haven't been
    #able to create a byte string correctly.
    decrypted_message = logger_private.decrypt(encrypted_message.encode("base64"))
    decrypted_username = logger_private.decrypt(encrypted_username.encode("base64"))
    decrypted_password = logger_private.decrypt(encrypted_password.encode("base64"))
    verified = client_public.verify(logger_public_signature_hash, signature)
I think you are putting a lot of effort into things that don't need to be handled by Django.
Here is what I would usually do:
Use HTTPS as the transport encryption layer.
Use HTTP Basic Auth. Basic auth is implemented in urllib2 as well as requests.
But there is an even better solution: Django REST framework.
It provides you with a full-blown REST API, including different authentication solutions.
If you need any help setting up one of these options, let me know and I'll add an example.
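As a rough illustration of the HTTPS plus Basic Auth option, the client side could look like the Python 3 sketch below (urllib.request is the successor of urllib2). The URL and the build_log_request helper are hypothetical, and the request is only constructed, not sent.

```python
import base64
import urllib.request


def build_log_request(url, username, password, message):
    """Build a POST request carrying HTTP Basic credentials."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(
        url, data=message.encode("utf-8"), method="POST"
    )
    # Basic auth is only an encoding, not encryption, so the URL must be
    # HTTPS for the credentials to be protected in transit.
    request.add_header("Authorization", "Basic " + token)
    return request


request = build_log_request(
    "https://logger.example.com/url/to/logger/",  # hypothetical endpoint
    "user",
    "password",
    "my message",
)
# urllib.request.urlopen(request) would send it.
```

On the logging server, the credentials would be checked by the web framework's authentication layer instead of hand-written RSA code.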
Could you use Sentry for logging, unless this is a training exercise?
I have followed Sentry since it was a Django application, and it is now an excellent production-ready solution.
We're using it in banking software development.
You are very close to decrypting the values on the server. The result of the encryption on the client is a tuple. After you urllib2.unquote the items on the server, you still need to recreate the tuples from the resulting strings.
For example:
>>> c = public.encrypt('Hello', "ignored")
>>> c
('3\xae0\x1f\xd7\xe4b\xd4\xf1\xf4\x88!Be\xff!\x1e\xda\x82\x10\x9bRy\x0c\xa0v\xed\x84\xf9\xe35\xc6QG\xcf\xb7\x1b\xea\x9fe\t\x9b\x8d\xd6\xf3\x8cw\xde\x17\xb5\xf7\x9a+\x84i%#\x8a\xdf\xf4\xdd\xc8wY',)
which in your code you pack into params like this:
>>> params = { "message" : c }
>>> params
{'message': ('3\xae0\x1f\xd7\xe4b\xd4\xf1\xf4\x88!Be\xff!\x1e\xda\x82\x10\x9bRy\x0c\xa0v\xed\x84\xf9\xe35\xc6QG\xcf\xb7\x1b\xea\x9fe\t\x9b\x8d\xd6\xf3\x8cw\xde\x17\xb5\xf7\x9a+\x84i%#\x8a\xdf\xf4\xdd\xc8wY',)}
>>> urllib.urlencode(params)
'message=%28%273%5Cxae0%5Cx1f%5Cxd7%5Cxe4b%5Cxd4%5Cxf1%5Cxf4%5Cx88%21Be%5Cxff%21%5Cx1e%5Cxda%5Cx82%5Cx10%5Cx9bRy%5Cx0c%5Cxa0v%5Cxed%5Cx84%5Cxf9%5Cxe35%5Cxc6QG%5Cxcf%5Cxb7%5Cx1b%5Cxea%5Cx9fe%5Ct%5Cx9b%5Cx8d%5Cxd6%5Cxf3%5Cx8cw%5Cxde%5Cx17%5Cxb5%5Cxf7%5Cx9a%2B%5Cx84i%25%40%5Cx8a%5Cxdf%5Cxf4%5Cxdd%5Cxc8wY%27%2C%29'
I would guess that urllib2.unquote(p["message"]) returns this (but I did not try this):
"('3\\xae0\\x1f\\xd7\\xe4b\\xd4\\xf1\\xf4\\x88!Be\\xff!\\x1e\\xda\\x82\\x10\\x9bRy\\x0c\\xa0v\\xed\\x84\\xf9\\xe35\\xc6QG\\xcf\\xb7\\x1b\\xea\\x9fe\\t\\x9b\\x8d\\xd6\\xf3\\x8cw\\xde\\x17\\xb5\\xf7\\x9a+\\x84i%#\\x8a\\xdf\\xf4\\xdd\\xc8wY',)"
then you can recreate the tuple at the server like this (m is the unquoted message):
>>> from ast import literal_eval
>>> literal_eval(m)
('3\xae0\x1f\xd7\xe4b\xd4\xf1\xf4\x88!Be\xff!\x1e\xda\x82\x10\x9bRy\x0c\xa0v\xed\x84\xf9\xe35\xc6QG\xcf\xb7\x1b\xea\x9fe\t\x9b\x8d\xd6\xf3\x8cw\xde\x17\xb5\xf7\x9a+\x84i%#\x8a\xdf\xf4\xdd\xc8wY',)
once you have the tuple back, you can decrypt it:
>>> private.decrypt(literal_eval(m))
'Hello'
It would be better to find a vetted and standard mechanism to do this rather than roll your own. For example, in your scheme, I could capture different messages between the client and server, and then mix and match messages and username/password pairs, making it appear that the messages came from different users.
However, with just this minor tweak (recreating the tuples from the "unquoted" strings) your code should decrypt just fine.
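The round trip described in this answer can be reproduced without any crypto library. The Python 3 sketch below uses a made-up byte string in place of real RSA output and urllib.parse in place of the Python 2 urllib and urllib2 calls; it shows only that the tuple survives URL encoding once literal_eval is applied on the receiving side.

```python
from ast import literal_eval
from urllib.parse import parse_qs, urlencode

# Stand-in for the one-element tuple returned by encrypt(); not real ciphertext.
ciphertext_tuple = (b"3\xae0\x1f\xd7\xe4b\xd4",)

# Client side: the tuple is sent as its textual representation.
query = urlencode({"message": repr(ciphertext_tuple)})

# Server side: after URL decoding, the value is still just a string.
received = parse_qs(query)["message"][0]

# literal_eval safely parses the literal back into a tuple of bytes.
restored = literal_eval(received)
```

literal_eval only accepts Python literals, so unlike eval it does not execute arbitrary code from the request.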