Ktor - Compress large JSON string in GET routing request

New to this. Please provide a block of code that can help me compress a large JSON payload to a smaller size so that it is faster to fetch with a GET request.

You can use Ktor's Compression plugin to compress the response body with an algorithm such as gzip.
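As a minimal sketch (assuming Ktor 2.x; the route path and the JSON payload are placeholders), installing the plugin is enough for responses to be gzipped automatically whenever the client sends Accept-Encoding: gzip:
import io.ktor.http.*
import io.ktor.server.application.*
import io.ktor.server.plugins.compression.*
import io.ktor.server.response.*
import io.ktor.server.routing.*

fun Application.module() {
    install(Compression) {
        gzip {
            priority = 1.0
            minimumSize(1024) // only bother compressing responses larger than ~1 KB
        }
    }
    routing {
        get("/large-json") {
            // Hypothetical large JSON string; the plugin compresses it transparently
            val largeJsonString = """{"items": []}"""
            call.respondText(largeJsonString, ContentType.Application.Json)
        }
    }
}
Note that compression only kicks in when the client advertises support via Accept-Encoding; otherwise the body is sent uncompressed.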

Related

How to facilitate downloading both CSV and PDF from API Gateway connected to S3

In the app I'm working on, we have a process whereby a user can download a CSV or PDF version of their data. The generation works great, but I'm trying to get it to download the file and am running into all sorts of problems. We're using API Gateway for all the requests, and the generation happens inside a Lambda on a POST request. The GET endpoint takes in a file_name parameter, constructs the path in S3, and then makes the request directly there.
The problem I'm having is when I try to transform the response. I get a 500 error, and when I look at the logs it says Execution failed due to configuration error: Unable to transform response. So, clearly, that's where I've spent most of my time. I've tried at least 50 different iterations of templates and combinations with little success. The closest I've gotten is the following code, where the CSV downloads fine but the PDF is no longer a valid PDF:
CSV:
#set($contentDisposition = "attachment;filename=${method.request.querystring.file_name}")
$input.body
#set($context.responseOverride.header.Content-Disposition = $contentDisposition)
PDF:
#set($contentDisposition = "attachment;filename=${method.request.querystring.file_name}")
$util.base64Encode($input.body)
#set($context.responseOverride.header.Content-Disposition = $contentDisposition)
where contentHandling = CONVERT_TO_TEXT. My binaryMediaTypes just has application/pdf and that's it. My goal is to get this working without having to offload the problem into a Lambda so we don't have that overhead at the download step. Any ideas how to do this right?
Just as another comment, I've tried CONVERT_TO_BINARY and just leaving it as Passthrough. I've tried it with text/csv as another binary media type and I've tried different combinations of encoding and decoding base64 and stuff. I know the data is coming back right from S3, but the transformation is where it's breaking. I am happy to post more logs if need be. Also, I'm pretty sure this makes sense on StackOverflow, but if it would fit in another StackExchange site better, please let me know.
Resources I've looked at:
https://docs.aws.amazon.com/apigateway/latest/developerguide/request-response-data-mappings.html
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html#util-template-reference
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-workflow.html
https://docs.amazonaws.cn/en_us/apigateway/latest/developerguide/api-gateway-payload-encodings-configure-with-control-service-api.html
(But they're all so confusing...)
EDIT: One idea I've had is to use CONVERT_TO_BINARY and somehow base64-encode the CSVs in the transformation, but I can't figure out how to do it right. I keep feeling like I'm misunderstanding the order of things, specifically when the "CONVERT" part happens, if that makes any sense.
EDIT 2: So, I got rid of the $util.base64Encode in the PDF one and now I have a PDF that's empty. The actual file in S3 definitely has things in it, but for some reason CONVERT_TO_TEXT is not handling it right, or I'm still not understanding how this all works.
I had similar issues. One major thing is the Accept header. I was testing in Chrome, which sends an Accept header like text/html,application/xhtml.... API Gateway ignores everything except the first value (text/html) and will then convert any response from S3 to base64 to try to conform to text/html.
At last, after trying everything else, I tried via Postman, which defaults the Accept header to */*. I also set the content handling on the integration response to Passthrough, and everything was working!
One other thing is to pass the Content-Type and Content-Length headers through (add them in the method response first and then in the integration response):
Content-Length → integration.response.header.Content-Length
Content-Type → integration.response.header.Content-Type
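For what it's worth, the Accept-header behaviour is easy to reproduce outside Postman too. A minimal Kotlin sketch (the endpoint URL and file_name are hypothetical placeholders) that downloads the file the way Postman does, sending Accept: */* so API Gateway does not try to coerce the binary response into text/html:
import java.io.File
import java.net.HttpURLConnection
import java.net.URL

fun main() {
    // Hypothetical API Gateway endpoint and file name, for illustration only
    val url = URL("https://example.execute-api.us-east-1.amazonaws.com/prod/download?file_name=report.pdf")
    val conn = url.openConnection() as HttpURLConnection
    // Postman sends Accept: */* by default; browsers send text/html first, which trips up API Gateway
    conn.setRequestProperty("Accept", "*/*")
    conn.inputStream.use { input ->
        File("report.pdf").outputStream().use { output -> input.copyTo(output) }
    }
    println("Content-Type: ${conn.getHeaderField("Content-Type")}, Content-Length: ${conn.getHeaderField("Content-Length")}")
}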

CRC32C checksum for HTTP Range GET requests in Google Cloud Storage

When I want to get a partial range of file content from Google Cloud Storage, I use the XML API with an HTTP Range GET request. In the response from Google Cloud, I can find the x-goog-hash header, which contains CRC32C and MD5 checksums. But these checksums are calculated over the whole file. What I need is the CRC32C checksum of the partial range of content in the response. With that partial CRC32C checksum I can verify the data in the response; otherwise, I cannot check the validity of the response.
I was wondering: are the files stored in your bucket in gzip format? I read here (Using Range Header on gzip-compressed files) that you can't get partial information from a compressed file; by default you get the whole file.
Anyway, could you share the request you're sending?
I looked for more information and found this: Request Headers and Cloud Storage.
It says that when you use the Range header, the returned checksum will cover the whole file.
So far, there's no way to get the checksum for a byte range alone using the XML API.
However, you could do it yourself by extracting the byte range with your preferred programming language and computing the checksum for that part.
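For instance, here is a minimal Kotlin sketch (assuming Java 9+, whose java.util.zip.CRC32C uses the same Castagnoli polynomial as Cloud Storage; the file path and byte range are placeholders) that computes the CRC32C of just the bytes you requested, so you can compare the value across downloads of the same range:
import java.io.RandomAccessFile
import java.util.zip.CRC32C

// Compute the CRC32C checksum of bytes [offset, offset + length) of a local file.
fun crc32cOfRange(path: String, offset: Long, length: Int): Long {
    val crc = CRC32C()
    RandomAccessFile(path, "r").use { file ->
        file.seek(offset)
        val buffer = ByteArray(length)
        file.readFully(buffer)
        crc.update(buffer)
    }
    return crc.value
}

fun main() {
    // Hypothetical range matching e.g. "Range: bytes=1024-2047"
    println("CRC32C: %08x".format(crc32cOfRange("object.bin", 1024L, 1024)))
}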

Online prediction with Data stored in Bucket

As per my understanding, online prediction works with JSON data. Currently I am running online prediction from localhost, where each image gets converted to JSON, and the ML Engine API uses this JSON from localhost for prediction.
Internally, the ML Engine API might be uploading the JSON to the cloud for prediction.
Is there any way to run online prediction on JSON files already uploaded to a cloud bucket?
Internally, we parse the input directly from the request payload for serving; we do not store the requests on disk. Reading inputs from cloud storage is currently not supported for online prediction. You may consider using batch prediction, which reads data from files stored in the cloud.
There is a small discrepancy between the online and batch inputs for a model that accepts only one string input (probably like your case). In that case, you must base64-encode the image bytes and put them in a JSON file for online prediction, while for batch prediction you need to pack the image bytes into records in TFRecord format and save them as tfrecord file(s). Other than that, the inputs are compatible.
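As a rough Kotlin sketch of the online-prediction side (the image path and the outer input name image_bytes are assumptions that must match your model's serving signature; the inner "b64" wrapper is the usual convention for binary data):
import java.io.File
import java.util.Base64

fun main() {
    // Hypothetical image; each output line is one prediction instance
    val imageBytes = File("cat.jpg").readBytes()
    val b64 = Base64.getEncoder().encodeToString(imageBytes)
    // Binary inputs are wrapped as {"b64": "..."} under the model's input name
    val instance = """{"image_bytes": {"b64": "$b64"}}"""
    File("instances.json").writeText(instance + "\n")
}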

Upload Spark RDD to REST webservice POST method

Frankly, I'm not sure if this feature exists; sorry for that.
My requirement is to send Spark-analysed data to a file server on a daily basis; the file server supports file transfer through SFTP and a REST web service POST call.
My initial thought was to save the Spark RDD to HDFS and transfer it to the file server through SFTP.
I would like to know whether it is possible to upload the RDD directly by calling the REST service from the Spark driver class, without saving to HDFS.
The size of the data is less than 2 MB.
Sorry for my bad English!
There is no specific way to do that with Spark. With that kind of data size it will not be worth it to go through HDFS or another type of storage. You can collect that data in your driver's memory and send it directly. For a POST call you can just use plain old java.net.URL, which would look something like this:
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets

// The RDD you want to send
val rdd = ???
// Gather the data on the driver and turn it into a string with newlines
val body = rdd.collect.mkString("\n")
// Open a connection
val url = new URL("http://www.example.com/resource")
val conn = url.openConnection.asInstanceOf[HttpURLConnection]
// Configure for a POST request with a body
conn.setDoOutput(true)
conn.setRequestMethod("POST")
// Write the collected data as the request body
val os = conn.getOutputStream
os.write(body.getBytes(StandardCharsets.UTF_8))
os.flush()
os.close()
// Reading the response code actually triggers the request
val responseCode = conn.getResponseCode
A much more complete discussion of using java.net.URL can be found at this question. You could also use a Scala library to handle the ugly Java stuff for you, like akka-http or Dispatch.
Spark itself does not provide this functionality (it is not a general-purpose HTTP client).
You might consider using an existing REST client library such as akka-http, spray, or some other Java/Scala client library.
That said, you are by no means obliged to save your data to disk before operating on it. You could, for example, use the collect() or foreach methods on your RDD in combination with your REST client library.

Upload Photos using Silverlight - RIA Services

I'm trying to find a good example of uploading and downloading images using solely Silverlight + RIA Services. I tried to find some but failed; any help would be appreciated.
Thank you all in advance.
I just found a useful walkthrough here; make sure to read the follow-up, which improves the save process and the image used.
We did it by saving the images on disk (not in a DB) - like this:
Upload image:
Write a Domain Service with an operation like void UploadJPGImage(string uniqueName, byte[] jpgBytes). This needs to be marked with the EnableClientAccess attribute. The (server-side) implementation saves the image to disk.
For the uniqueName, we generate a GUID client-side.
Download image:
HTTP Handler - write an HTTP handler for downloading the image using a URL containing the unique name parameter passed by the client when uploading the image
Or, one could write a Domain Service operation, like byte[] DownloadJPGImage(string uniqueName)