How should I do an ISerializable Stream (or near enough)

I have a web service accessed via SOAP. I'd really like one of the methods to return a Stream.
What are my options?
My current idea is to implement Stream and stuff all the data into a string. Is there a type that does this already? If possible (and I don't think it is), I'd love to actually tunnel the stream through SOAP so that data gets pulled lazily even after the method returns.

Your best bet is to read the Stream into a byte array. You can then serialize the byte array in the web service. The client can then consume the raw byte array and reassemble it into its original format.
I've also used the same strategy for uploading files via a web service, and it worked great.
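The round trip described above can be sketched as follows. This is a minimal illustration in Python rather than .NET, and the function names `stream_to_payload` and `payload_to_stream` are hypothetical. It assumes the byte array travels as base64 text, which is how SOAP toolkits typically carry binary data inside the XML envelope.

```python
import base64
import io


def stream_to_payload(stream, chunk_size=64 * 1024):
    """Server side: drain a readable binary stream and return base64 text."""
    buffer = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means the stream is exhausted
            break
        buffer.extend(chunk)
    return base64.b64encode(bytes(buffer)).decode("ascii")


def payload_to_stream(payload):
    """Client side: decode the payload and wrap it in an in-memory stream."""
    return io.BytesIO(base64.b64decode(payload))


original = io.BytesIO(b"\x00\x01binary data\xff")
payload = stream_to_payload(original)   # what would go over the wire
restored = payload_to_stream(payload)   # what the client reads from
```

Note that the whole stream is buffered before the method returns, so this does not give the lazy pulling asked about in the question.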

Related

What's the most efficient setup for hosting a “passthrough” string manipulator?

Server A needs to send raw string data to Server B via http API calls.
Server B must parse and manipulate raw string data and then send resulting key/value pairs to Server C, via http API calls.
Server A cannot talk to Server C directly.
The operations to take place are:
Reception of string data (~8k of poorly formatted XML)
Manipulation of string data
Sending of string data
All of this must happen at scale, meaning thousands of times per second, from hundreds of different clients.
Perhaps nginx as a host, since it's great at concurrency? If so, what's the most efficient language to use behind it for the text manipulation, parsing and sending?
I also thought node might be a good option, since it has all the string manipulation functions built-in, as well as the protocols for sending and receiving data.
Very interested in hearing thoughts on the best way to approach this.

Should output be encoded at the API or client level?

We are moving our web app architecture to being microservice based. We have an internal debate as to whether a REST API that provides content (in JSON, let's say) should encode content to make it safe, or whether the consumers that take that content and display it (in HTML, for example, or otherwise use it) should be responsible for that encoding. The use case is to prevent XSS attacks and similar.
The provider stance is "Well, we can't know how to encode it for everyone, or how you're going to use the content, so of course the consumers should encode the content."
The consumer stance is "There is one provider and multiple consumers, so it's more secure to do it once in the providing API than to hope that every consumer does it."
Are there any generally accepted best practices on this and why?

As a rule, data passing through "internal" processes (whatever that might mean to you) should be stored or encoded in whatever "internal" format makes sense. The format chosen is typically designed to minimize encoding/decoding steps and to prevent data loss.
Then, upon output, data is encoded using whatever output format makes sense. Preventing data loss is important, but also proper escaping and formatting is key here.
So for example, with internal APIs, data in binary format may be sufficient. But when you output JSON or HTML or XML or PDF, you have to encode and escape your data appropriately to fit the output format.
The important point here is that different output formats have different concepts of "safe". What's "safe" for HTML may not be safe for JSON, and what's safe for JSON may not be safe for SQL. Data is encoded upon output specifically so that you can use the proper encoding for the task. You cannot assume that this step is done for you ahead of time, nor should you put your output function in the position to determine whether or not encoding must be done. If you stick with the rule: "output function ALWAYS encodes for safety", then you will never have to worry about data injection attacks.
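To make "different output formats have different concepts of safe" concrete, here is a small Python sketch that sends the same untrusted string to three outputs. The sample string and the `comments` table are made up for illustration; the calls are from the standard library.

```python
import html
import json
import sqlite3

untrusted = '<script>alert("x")</script> O\'Reilly'

# HTML output: escape the characters that are significant in markup.
html_fragment = "<p>" + html.escape(untrusted) + "</p>"

# JSON output: let the serializer quote and escape the string.
json_document = json.dumps({"comment": untrusted})

# SQL output: bind the value as a parameter instead of splicing it into the query.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE comments (body TEXT)")
connection.execute("INSERT INTO comments (body) VALUES (?)", (untrusted,))
stored = connection.execute("SELECT body FROM comments").fetchone()[0]
```

In each case the stored value stays unmodified and only the output step applies the encoding that its target format needs.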

I would say that the two important points are the following:
The encoding used by the provider MUST be specified with extreme clarity and precision in a reference document, so that all consumer implementors can know what to expect.
Whatever default encoding is used by the provider MUST keep all needed information, i.e. still be amenable to transcoding by any consumer who would wish to do it.
If you follow these two rules then you will have done 95% of the job for reliability and security.
As for your specific question, a good practice is a middle-ground: the provider follows by default a "generic" encoding, but consumers can ask (optionally) for a specific encoding which the provider may then apply -- this allows the provider to support a number of dumb, lightweight clients of possibly different kinds and can be extended later on with extra encodings without breaking the API.

I firmly believe it is both the consumer and the provider that need to do their part in being good citizens in the security space.
As the provider I want to make sure I deliver a secure product. I don't need to know the context in which my client is going to use my product; all I need to know is how I am going to deliver it. If my delivery is in JSON, then I can use that context to escape my data before sending it off, and similarly for XML, plain text, etc. Furthermore, there are transport methods that aid in security already. JSONP is one such delivery method. This ensures the payload is consumed appropriately.
As for the consumer: in our environment no one is the final consumer. We are all providers to the final end client (mostly end users via a web browser). Because of this we also have to secure the data at this end. I would never trust a black box API to do this job for me; I would always make a point of ensuring a secure payload. There are many tools out there (the ESAPI project from OWASP comes to mind) that will aid in sanitizing data by context. Remember that you are eventually sending this data on to the end user (browser), and if there is something awry you won't be able to pass the buck. Your service will be viewed as the vulnerable one regardless of where the flaw lies. Additionally, as the consumer, you may not always be able to rely on the black box provider to fix their flaws in a timely fashion. What if their support is lacking or they have higher priorities? Does that mean you continue to provide a known flaw to your end users?
Security is about layers, and having safeguards at the source and end-points is always preferable.

libcurl multipart post of variable length data objects

I'm having a difficult time with libcurl trying to adapt it to a particular situation. What I'm doing is essentially loading a variable number of objects into memory, performing various transforms on them, and then I want to upload them (as serialized binary data, of course) as part of a multipart post.
The part I'm struggling with is that I want to just add them as a part as they finish down this pipeline, then delete them after that particular part is posted.
I have thought about perhaps giving it a read function ptr, and on the callbacks perhaps manually feed the buffer with the part headers and data? This approach seems to be quite a hack.
I have tried the regular multipart approach (with multi-handle), but that seems to require all the data up front, or to be read from a file, which I do not want libcurl to deal with.
To recap, I want to open a connection, start http multipart post request -> get in memory buffer -> add as post attachment (multipart) -> send that off -> wait for next chunk of data -> repeat till done.
Thanks in advance.

Use the curl_formadd() function to prepare a multipart/form-data HTTP post, and then use the CURLOPT_HTTPPOST option to actually send it. curl_formadd() has a CURLFORM_STREAM option to enable use of the connection's CURLOPT_READFUNCTION callback so you can custom-stream each multipart's data.
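Independent of libcurl, the underlying idea of emitting each part only when it is ready can be sketched in Python. This is not libcurl's API: it is a hypothetical generator, `multipart_chunks`, that produces a multipart/form-data body piece by piece, with `pipeline` standing in for the transform pipeline from the question. The boundary string and field names are made up.

```python
def multipart_chunks(parts, boundary):
    """Yield a multipart/form-data body piece by piece.

    `parts` is any iterable of (field_name, payload_bytes). It may itself be
    a generator, so each payload only has to exist while it is being sent.
    """
    delimiter = b"--" + boundary.encode("ascii")
    for name, payload in parts:
        yield delimiter + b"\r\n"
        yield ('Content-Disposition: form-data; name="%s"\r\n' % name).encode("ascii")
        yield b"Content-Type: application/octet-stream\r\n\r\n"
        yield payload
        yield b"\r\n"
    yield delimiter + b"--\r\n"  # closing delimiter ends the body


def pipeline():
    """Stand-in for the transform pipeline: produces objects one at a time."""
    for index in range(3):
        yield "object%d" % index, bytes([index]) * 4


# Joined here only to inspect the result; a real sender would write each chunk as it arrives.
body = b"".join(multipart_chunks(pipeline(), "example-boundary"))
```

Because the total length is not known up front, a sender built this way would need chunked transfer encoding (or an equivalent) rather than a Content-Length header.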

Transmit raw vertex information to XTK?

We're using XTK to display data processed and created on a server. In our particular case, it's a parallel isocontouring application. As it currently stands we're converting to the (textual) VTK format and passing the entire (imaginary) VTK file over the wire to the client, where XTK renders it. This adds substantial overhead, as the text format outweighs the in-memory format by a considerable amount.
Is there a recommended mechanism available for transmitting binary data directly, either through an alternate format that is well-described or by constructing XTK primitives inside the JavaScript code itself?

Parsing an X.object from JSON should be supported. So you could generate the JSON on the server side and use the X.object(jsonobject) copy constructor to safely downcast it. This should also have the advantage that the objects can be 'webgl-ready' and do not require any client-side parsing, which should result in instant loading.
I was planning to play with that myself soon but if you get anything to work, please let us know.
Just have in mind that you need to match the X.object structure even in JSON. The best way to see what is expected by xtk is to JSON.stringify a webgl-ready X.object.

XMLHttpRequest, in its second specification (the latest one), allows cross-domain HTTP requests (but you must have control of the PHP header on the server side).
In addition, it allows sending ArrayBuffers, Blobs or Documents (look here). On the client side you can then write your own parser for that Blob or (I think it fits your case better) that ArrayBuffer, using binary buffer views (see doc here). However, XMLHttpRequest requests are initiated by the client; have a look at HTML5 WebSocket, which seems to be able to transfer binary arrays too (they say it here : ).
In every case you will need a parser on the client side to transform the binary data into a string or an X.object.
I hope this helps.
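As a rough illustration of the binary route, the sketch below packs vertices into a compact buffer and parses them back, in Python. The layout (a little-endian uint32 vertex count followed by x, y, z float32 triples) is an assumption made for this example, not an XTK or VTK format, and the function names are hypothetical.

```python
import struct

vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.5)]


def pack_vertices(points):
    """Server side: uint32 count, then x, y, z as little-endian float32."""
    flat = [coordinate for point in points for coordinate in point]
    return struct.pack("<I%df" % len(flat), len(points), *flat)


def unpack_vertices(buffer):
    """Receiving side: read the count, then rebuild the (x, y, z) triples."""
    (count,) = struct.unpack_from("<I", buffer, 0)
    flat = struct.unpack_from("<%df" % (count * 3), buffer, 4)
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


binary = pack_vertices(vertices)
# The same data as whitespace-separated text, for a size comparison.
as_text = "\n".join("%f %f %f" % point for point in vertices).encode("ascii")
```

In the browser, a buffer with this layout could be read with typed-array views over an ArrayBuffer instead of `struct`.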

Transferring large files with web services

What is the best way to transfer large files with web services? At present we are using the straightforward option: converting the binary data to base64 and embedding the base64 text in the SOAP envelope itself. But this slows down application performance considerably. Please suggest something to improve performance.

In my opinion the best way to do this is to not do this!
Web services are not designed to transfer large files. You should really transfer a URL to the file and let the receiver of the message pull the file itself.
IMHO that would be a better way to do this than encoding and sending it.

Check out MTOM, a W3C standard designed to transfer binary files through SOAP.
From Wikipedia:
MTOM provides a way to send the binary data in its original binary form, avoiding any increase in size due to encoding it in text.
Related resources:
SOAP Message Transmission Optimization Mechanism
Message Transmission Optimization Mechanism (Wikipedia)
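The size increase that MTOM avoids is easy to measure. Base64 emits 4 output characters for every 3 input bytes, so an embedded payload grows by about a third. A quick Python check, using random bytes as a stand-in for a file:

```python
import base64
import os

raw = os.urandom(3 * 1024 * 1024)       # 3 MiB of stand-in file content
encoded = base64.b64encode(raw)          # what would be embedded in the envelope
overhead = len(encoded) / len(raw)       # ratio of encoded size to raw size
```

Here the 3 MiB input becomes 4 MiB of text, before counting the cost of parsing it as part of the XML document.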
