Akka http streaming the response headers


An HTTP response consists of three parts (status code, then headers, then body), and when making client requests with akka-http the HttpResponse only becomes available after the first two parts have been received completely.
val responseFuture: Future[HttpResponse]
responseFuture.map {
case HttpResponse(statusCode:StatusCode, headers:Seq[HttpHeader], entity:ResponseEntity, protocol:HttpProtocol)
}
This is completely fine for most use cases, but in my particular case I need access to the headers before all the headers are received (a third party server is returning progress by writing custom progress headers until the response is ready). Is there any way to access the headers the same way we access the body?
val entity: ResponseEntity
val entitySource:Source[ByteString, Any] = entity.dataBytes
In a perfect world there would be a way to access the headers as a Source as well:
HttpResponse(statusCode:StatusCode, headers:Source[HttpHeader, NotUsed], entity:ResponseEntity, protocol:HttpProtocol)

Not Possible With akka-http
The representation of HttpResponse treats the headers as a Seq[HttpHeader] instead of an Iterator or an akka-stream Source. Therefore, as explained in the question, it is not possible to instantiate an HttpResponse object without having all the header values available first.
I do not know the exact reason behind this design decision but I suspect it is because it would be difficult to support a Source for the headers and a Source for the body. The body Source would not be able to be consumed without first consuming the header Source, so there would have to be a strict ordering of accessing the response's member variables. This would lead to confusion and unexpected errors.
Lower Level Processing with akka-stream
The hypertext transfer protocol is just an application layer protocol, usually on top of TCP. And, it is a fairly simple message format:
The response message consists of the following:
A status line which includes the status code and reason message (e.g., HTTP/1.1 200 OK, which indicates that the client's request succeeded).
Response header fields (e.g., Content-Type: text/html).
An empty line.
An optional message body.
Therefore, you could use the Tcp binding to get a connection and parse the message ByteString Source yourself to get at the headers:
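import akka.stream.scaladsl.{Flow, Source}
import akka.util.ByteString

val maximumFrameLength = 1024 * 1024

// Returns a fresh, stateful predicate that is true on the '\n' of a "\r\n" pair.
val endOfMessageLine: () => Byte => Boolean = () => {
  var previousWasCarriage = false

  (byte: Byte) =>
    if (byte == '\r') {
      previousWasCarriage = true
      false
    } else if (byte == '\n' && previousWasCarriage) {
      previousWasCarriage = false
      true
    } else {
      previousWasCarriage = false
      false
    }
}

// Flattens the incoming chunks into single bytes and starts a new substream after each CRLF.
def subFlow =
  Flow[ByteString]
    .flatMapConcat(str => Source(str))
    .splitAfter(endOfMessageLine())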
Unfortunately this probably requires that your request be sent as a raw ByteString via the Tcp binding as well.
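To make the line-splitting idea concrete outside of akka-stream, here is a minimal sketch in Python rather than Scala. It assumes the raw response bytes arrive as an iterable of chunks (for example, reads from a socket); the names iter_head_lines and parse_header and the X-Progress header are made up for illustration. Each header line is yielded as soon as its CRLF has arrived, so progress headers can be acted on before the head is complete.

```python
from typing import Iterable, Iterator, Tuple


def iter_head_lines(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield each CRLF-terminated line of the response head as soon as it is complete."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while True:
            line, sep, rest = buffer.partition(b"\r\n")
            if not sep:
                break  # no complete line buffered yet; wait for more bytes
            buffer = rest
            if line == b"":
                return  # empty line: end of the headers, the body follows
            yield line


def parse_header(line: bytes) -> Tuple[str, str]:
    """Split a 'Name: value' header line into a (name, value) pair."""
    name, _, value = line.decode("iso-8859-1").partition(":")
    return name.strip(), value.strip()


# Hypothetical response, delivered in arbitrary pieces (note the split CRLF).
chunks = [
    b"HTTP/1.1 200 OK\r\nX-Prog",
    b"ress: 10\r\nX-Progress: 50\r",
    b"\nContent-Length: 2\r\n\r\nhi",
]
lines = iter_head_lines(chunks)
status_line = next(lines).decode("iso-8859-1")
headers = [parse_header(line) for line in lines]
```

Because iter_head_lines is a generator, a caller can react to each header inside a for loop instead of collecting them into a list as done here.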

Related

Returning stream in AWS API Gateway -> Lambda function?

I have created an API using AWS API Gateway, e.g. https://api.mydomain.com/v1/download?id=1234. The download resource has a GET method, which invokes a Lambda function using Lambda Proxy Integration.
The Lambda function needs to act as a proxy. It must resolve the correct backend endpoint based on the x-clientId header, forward the request to that endpoint, and return the response as is. So it needs to be generic enough to handle GET requests with different content types.
My Lambda function looks like this (.NET Core):
public async Task<APIGatewayProxyResponse> Route(APIGatewayProxyRequest input, ILambdaContext context)
{
    // Resolve the backend from the x-clientId request header.
    var clientId = input.Headers["x-clientId"];
    var mappings = new Mappings();
    var url = await mappings.GetBackendUrl(clientId, input.Resource);

    // Forward the request and fail fast on a non-success status.
    var httpClient = new HttpClient();
    var response = await httpClient.GetAsync(url);
    response.EnsureSuccessStatusCode();

    var proxyResponse = new APIGatewayProxyResponse()
    {
        Headers = new Dictionary<string, string>(),
        StatusCode = (int)System.Net.HttpStatusCode.OK,
        IsBase64Encoded = false,
        Body = await response.Content.ReadAsStringAsync()
    };
    return proxyResponse;
}
The handler above works as long as the content type of the request and response is application/json or application/xml. However, I am not sure how to handle the response when the backend returns a stream.
For the download API, the backend returns Content-Disposition: attachment; filename="somefilename", and the Content-Type may be one of the following:
application/pdf
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
application/vnd.openxmlformats-officedocument.wordprocessingml.document
application/x-zip-compressed
application/octet-stream
For these streams, how do I set APIGatewayProxyResponse.Body?
For an Excel file I have tried setting the body like this:
var proxyResponse = new APIGatewayProxyResponse()
{
Headers = new Dictionary<string, string>(),
StatusCode = (int)System.Net.HttpStatusCode.OK,
IsBase64Encoded = true,
Body = Convert.ToBase64String(await response.Content.ReadAsByteArrayAsync())
};
proxyResponse.Headers.Add("Content-Type", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
proxyResponse.Headers.Add("Content-Disposition", "attachment; filename=\"Report.xlsx\"");
When I access the URL from the browser and try to open the file, I get this error:
Excel cannot open the file 'Report.xlsx' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.
I think the issue is how I am setting the response body.
Update 1
This is based on the AWS announcement "Binary Data Now Supported by API Gateway". As per the documentation:
you can specify if you would like API Gateway to either pass the
Integration Request and Response bodies through, convert them to text
(Base64 encoding), or convert them to binary (Base64 decoding). These
options are available for HTTP, AWS Service, and HTTP Proxy
integrations. In the case of Lambda Function and Lambda Function Proxy
Integrations, which currently only support JSON, the request body is
always converted to JSON.
I am using Lambda Function Proxy, which currently supports only JSON. However, the example here shows how to do it with Lambda Proxy.
I think what I am missing here is the Binary Media Types setting and the Method Response settings. Below are my settings; I am not sure if they are correct.
[Screenshot: Binary Media Types settings]
[Screenshot: Method Response settings]

Here is how I solved it:
1> Add Binary Media Types: API -> Settings -> Binary Media Types -> add application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
2> In Method Response, add the Content-Disposition and Content-Type headers for status 200.
3> In Integration Response, map these headers to the headers coming from the backend, and set content handling to "Convert to binary" (our backend API returns the file blob in the body).
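For the Lambda Proxy route tried in the question, the response shape can be sketched as follows. This is a Python sketch, not the .NET handler above, and it only builds the dictionary a proxy-integration Lambda would return; the helper name build_binary_proxy_response and the placeholder payload are assumptions for illustration. The point is that the body must be base64 text and isBase64Encoded must be true, in combination with the Binary Media Types setting from step 1.

```python
import base64


def build_binary_proxy_response(payload: bytes, content_type: str, filename: str) -> dict:
    """Build a Lambda proxy integration response that carries a binary payload."""
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": content_type,
            "Content-Disposition": f'attachment; filename="{filename}"',
        },
        # The body of a proxy response is always a string, so binary data is
        # sent as base64 text and flagged as such.
        "isBase64Encoded": True,
        "body": base64.b64encode(payload).decode("ascii"),
    }


XLSX = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
payload = b"PK\x03\x04 placeholder bytes standing in for a real workbook"
response = build_binary_proxy_response(payload, XLSX, "Report.xlsx")
```

If the browser still receives base64 text instead of the file, the binary media type configuration is the first thing I would re-check.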

How to send attachment using AWS SES in NodeJS?

The documentation at https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/SES.html#sendTemplatedEmail-property describes the sendTemplatedEmail API, which sends email using templates. That worked for me, but I could not figure out how to add attachments to it.
The 4th point of the sendTemplatedEmail API doc says "The total size of the message, including attachments, must be less than 10 MB". How do I add an attachment with the sendTemplatedEmail API?
There is also an API called sendRawEmail, but that does not suit my requirement: I need to use templates and also attach documents. Does anyone know what to do?

Take a look at the SendRawEmail example:
/* The following example sends an email with an attachment: */
var params = {
Destinations: [],
FromArn: "",
RawMessage: {
Data: <Binary String>
},
ReturnPathArn: "",
Source: "",
SourceArn: ""
};
ses.sendRawEmail(params, function(err, data) {
if (err) console.log(err, err.stack); // an error occurred
else console.log(data); // successful response
/*
data = {
MessageId: "EXAMPLEf3f73d99b-c63fb06f-d263-41f8-a0fb-d0dc67d56c07-000000"
}
*/
});
Reference: https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/SES.html
Important: you need to understand the MIME type standards to include your attachment. Take a look at this article.
MIME was defined in 1992 by the Internet Engineering Task Force
(IETF). The distinguishing characteristic of a MIME message is the
presence of the MIME headers. As long as your mail recipients also
have e-mail software that is MIME-compliant (and most e-mail software
is), you can swap files containing attachments automatically.
EDIT: This article explains how to include attachment in your body.
MIME completes the illusion of file attachments by allowing the
message body to be divided into distinct parts, each with their own
headers. The content type multipart/mixed means that the content of
the body is divided into blocks separated by "--" + a unique string
guaranteed to not be found anywhere else in the message. If you say
that your boundary string is "MyBoundaryString", then all occurrences
of that string will be treated as a boundary. So it better not be in
the message the user typed or it won't be decoded correctly.
Wikipedia also gives an example:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=frontier

This is a message with multiple parts in MIME format.
--frontier
Content-Type: text/plain

This is the body of the message.
--frontier
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64

PGh0bWw+CiAgPGhlYWQ+CiAgPC9oZWFkPgogIDxib2R5PgogICAgPHA+VGhpcyBpcyB0aGUg
Ym9keSBvZiB0aGUgbWVzc2FnZS48L3A+CiAgPC9ib2R5Pgo8L2h0bWw+Cg==
--frontier--
I assume you are familiar with Base64.
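Rather than assembling the MIME parts by hand, a library can generate them. Below is a minimal sketch using Python's standard email package (not the NodeJS SDK from the question); the addresses, file name and the helper name build_raw_message are placeholders. The resulting bytes are the kind of value that would go into RawMessage.Data of a sendRawEmail call.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_raw_message(sender: str, recipient: str, subject: str,
                      body_text: str, attachment: bytes, filename: str) -> bytes:
    """Build a multipart/mixed message with one text part and one attachment."""
    message = MIMEMultipart("mixed")
    message["Subject"] = subject
    message["From"] = sender
    message["To"] = recipient

    message.attach(MIMEText(body_text, "plain", "utf-8"))

    # MIMEApplication base64-encodes the payload by default.
    part = MIMEApplication(attachment)
    part.add_header("Content-Disposition", "attachment", filename=filename)
    message.attach(part)

    return message.as_bytes()


attachment = b"%PDF-1.4 placeholder bytes"
raw = build_raw_message("sender@example.com", "recipient@example.com",
                        "Monthly report", "Report attached.", attachment, "report.pdf")

# Parse the result back to check that the attachment survives the round trip.
parsed = message_from_bytes(raw)
parts = parsed.get_payload()
```

Since sendRawEmail does not apply templates, the template would have to be rendered into body_text by your own code before the message is built.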

Missing request headers in puppeteer

I want to read the request cookie during a test written with Puppeteer, but I noticed that most of the requests I inspect have only referrer and user-agent headers. If I look at the same requests in Chrome dev tools, they have a lot more headers, including Cookie. To check it out, copy-paste the code below into https://try-puppeteer.appspot.com/.
const browser = await puppeteer.launch();
const page = await browser.newPage();
page.on('request', function(request) {
console.log(JSON.stringify(request.headers, null, 2));
});
await page.goto('https://google.com/', {waitUntil: 'networkidle'});
await browser.close();
Is there a restriction on which request headers you can and cannot access? Is it a limitation of Chrome itself or of Puppeteer?
Thanks for suggestions!

I also saw this when I was trying to use Puppeteer to test some CORS behaviour - I found the Origin header was missing from some requests.
Looking through the GitHub issues, I found one mentioning that Puppeteer does not listen to the Network.responseReceivedExtraInfo event of the underlying Chrome DevTools Protocol. This event provides extra response headers that are not available in the Network.responseReceived event. There is also a similar Network.requestWillBeSentExtraInfo event for requests.
Hooking up to these events seemed to get me all the headers I needed. Here is some sample code which captures the data from all these events and merges it onto a single object keyed by request ID:
// Setup.
const browser = await puppeteer.launch()
const page = await browser.newPage()
const cdpRequestDataRaw = await setupLoggingOfAllNetworkData(page)

// Make requests.
await page.goto('http://google.com/')

// Log captured request data.
console.log(JSON.stringify(cdpRequestDataRaw, null, 2))
await browser.close()

// Returns map of request ID to raw CDP request data. This will be populated as requests are made.
async function setupLoggingOfAllNetworkData(page) {
  const cdpSession = await page.target().createCDPSession()
  await cdpSession.send('Network.enable')
  const cdpRequestDataRaw = {}
  const addCDPRequestDataListener = (eventName) => {
    cdpSession.on(eventName, request => {
      cdpRequestDataRaw[request.requestId] = cdpRequestDataRaw[request.requestId] || {}
      Object.assign(cdpRequestDataRaw[request.requestId], { [eventName]: request })
    })
  }
  addCDPRequestDataListener('Network.requestWillBeSent')
  addCDPRequestDataListener('Network.requestWillBeSentExtraInfo')
  addCDPRequestDataListener('Network.responseReceived')
  addCDPRequestDataListener('Network.responseReceivedExtraInfo')
  return cdpRequestDataRaw
}
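The merge step in that sample is independent of Puppeteer, so here it is in isolation as a small Python sketch. The event payloads are invented, heavily trimmed examples, and merge_events_by_request_id is a name chosen for illustration; only the requestId key and the event names come from the code above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def merge_events_by_request_id(events: Iterable[Tuple[str, dict]]) -> Dict[str, dict]:
    """Group event payloads by requestId, keeping one entry per event name."""
    merged: Dict[str, dict] = defaultdict(dict)
    for event_name, payload in events:
        merged[payload["requestId"]][event_name] = payload
    return dict(merged)


# Invented sample events; real payloads carry many more fields.
events = [
    ("Network.requestWillBeSent",
     {"requestId": "1", "request": {"headers": {"User-Agent": "test"}}}),
    ("Network.requestWillBeSentExtraInfo",
     {"requestId": "1", "headers": {"Cookie": "a=b"}}),
    ("Network.responseReceived",
     {"requestId": "1", "response": {"status": 200}}),
    ("Network.requestWillBeSent",
     {"requestId": "2", "request": {"headers": {}}}),
]
merged = merge_events_by_request_id(events)
cookie = merged["1"]["Network.requestWillBeSentExtraInfo"]["headers"]["Cookie"]
```

In this sample data the Cookie header shows up only under the ExtraInfo entry, which mirrors the behaviour described above.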

That's because your browser sets a number of headers depending on its settings and capabilities, and it also includes, for example, the cookies it has stored locally for the specific page.
If you want to add additional headers, you can use methods such as:
page.setExtraHTTPHeaders
page.setUserAgent
page.setCookie
With these you can mimic the extra headers that you see your Chrome browser dispatching.

Write OwinResponse content and set StatusCode at the same time

Is it possible to use OwinResponse.Write while setting the status code to something other than 200?
I have the following code in an OwinMiddleware but as long as OwinResponse.Write is called the StatusCode is always set to 200 :(
response.OnSendingHeaders(state =>
{
var resp = (OwinResponse) state;
var message = string.Format(
"Max API concurrent calls quota exceeded, please try again later. Maximum admitted: {0}",
_maxConcurrentRequests);
resp.ReasonPhrase = message;
resp.Write(message);
resp.StatusCode = 429; // doesn't work here unless I comment out the line above
}, response);

StatusCode must be set before writing to the body, not after.
Don't write to the body inside OnSendingHeaders; it's recursive, as OnSendingHeaders is usually triggered by a write to the body.
Why are you even using OnSendingHeaders here? Why not just do all of this directly on the response?
That's far more information than is usually included in a reason phrase. That level of detail belongs in the response body. The default reason phrase for 429 is Too Many Requests.
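The ordering rule (status and headers first, body afterwards, detail in the body rather than the reason phrase) is not specific to OWIN. As an illustration only, here is the same pattern as a minimal Python WSGI app, where the interface itself requires the status before any body bytes; MAX_CONCURRENT_REQUESTS and the app name are assumptions.

```python
from http import HTTPStatus

MAX_CONCURRENT_REQUESTS = 10  # hypothetical quota


def quota_exceeded_app(environ, start_response):
    """Reply with 429: status line and headers first, then the detailed body."""
    status = HTTPStatus.TOO_MANY_REQUESTS
    body = (
        "Max API concurrent calls quota exceeded, please try again later. "
        f"Maximum admitted: {MAX_CONCURRENT_REQUESTS}"
    ).encode("utf-8")
    # Keep the default reason phrase; the detailed message goes in the body.
    start_response(
        f"{status.value} {status.phrase}",
        [("Content-Type", "text/plain; charset=utf-8"),
         ("Content-Length", str(len(body)))],
    )
    return [body]


# Call the app directly with a stand-in for the server's start_response.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


chunks = quota_exceeded_app({}, fake_start_response)
```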

Weird Behavior When Using Python requests.put() with Flask

Background
I have a service A accessible with HTTP requests. And I have other services that want to invoke these APIs.
Problem
When I test service A's APIs with POSTMAN, every request works fine. But when I use Python's requests library to make these requests, there is one PUT method that just won't work. For some reason, the PUT method being called cannot receive the data (HTTP body) at all, though it can receive headers. On the other hand, the POST method called in the same manner receives the data perfectly.
I managed to achieve my goal simply by using httplib library instead, but I am still quite baffled by what exactly happened here.
The Crime Scene
Route 1:
@app.route("/private/serviceA", methods = ['POST'])
@app.route("/private/serviceA/", methods = ['POST'])
def A_create():
    # request.data contains correct data that can be read with request.get_json()
Route 2:
@app.route("/private/serviceA/<id>", methods = ['PUT'])
@app.route("/private/serviceA/<id>/", methods = ['PUT'])
def A_update(id):
    # request.data is empty, though request.headers contains headers I passed in
    # This happens when sending the request with Python requests library, but not when sending with httplib library or with POSTMAN
    # Also, data comes in fine when all other routes are commented out
    # Unless all other routes are commented out, this happens even when the function body has only one line printing request.data
Route 3:
@app.route("/private/serviceA/schema", methods = ['PUT'])
def schema_update_column():
    # This one again works perfectly fine
Using POSTMAN:
Using requests library from another service:
@app.route("/public/serviceA/<id>", methods = ['PUT'])
def A_update(id):
    content = request.get_json()
    headers = {'content-type': 'application/json'}
    response = requests.put('%s:%s' % (router_config.HOST, serviceA_instance_id) + '/private/serviceA/' + str(id), data=json.dumps(content), headers = headers)
    return Response(response.content, mimetype='application/json', status=response.status_code)
Using httplib library from another service:
@app.route('/public/serviceA/<id>', methods=['PUT'])
def update_course(id):
    content = request.get_json()
    headers = {'content-type': 'application/json'}
    conn = httplib.HTTPConnection('%s:%s' % (router_config.HOST, serviceA_instance_id))
    conn.request("PUT", "/private/serviceA/%s/" % id, json.dumps(content), headers)
    return str(conn.getresponse().read())
Questions
1. What am I doing wrong for the route 2?
2. For route 2, the handler doesn't seem to be executed when either handler is commented out, which also confuses me. Is there something important about Flask that I'm not aware of?
Code Repo
Just in case some nice ppl are interested enough to look at the messy undocumented code...
https://github.com/fantastic4ever/project1
The serviceA corresponds to course service (course_flask.py), and the service calling it corresponds to router service (router.py).
The version that was still using requests library is 747e69a11ed746c9e8400a8c1e86048322f4ec39.

In your use of the requests library, you are using requests.post, which sends a POST request. If you use requests.put, you will send a PUT request. That could be the issue.
Reference: the requests documentation.
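To check which method a client call will actually send, without touching the network, the request object can be inspected before it is dispatched. This sketch uses the standard library's urllib.request rather than the requests library from the question; the URL, payload and helper name build_json_request are placeholders.

```python
import json
from urllib.request import Request

URL = "http://localhost:5000/private/serviceA/42"  # placeholder address


def build_json_request(method: str, url: str, payload: dict) -> Request:
    """Build (but do not send) a JSON request with an explicit HTTP method."""
    data = json.dumps(payload).encode("utf-8")
    return Request(url, data=data,
                   headers={"Content-Type": "application/json"},
                   method=method)


put_request = build_json_request("PUT", URL, {"name": "example"})

# Without an explicit method, urllib infers POST whenever a body is present.
default_request = Request(URL, data=b"{}")
```

Here put_request.get_method() reports PUT while default_request.get_method() reports POST, which is the kind of mismatch described above.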
