Can anyone explain the difference between chunking and streaming, and which method should be preferred when uploading big files from an iPad to a WCF REST service? Right now we get a timeout error when uploading big files from the iPad, and we'd like to fix this. Our key requirement is that the WCF service should know whether the whole file was uploaded or not, so that if the client for some reason can't upload the whole file, WCF does not perform any operation on the uploaded content (as far as I understand, a streaming upload won't allow us to implement this).
Some more questions that confuse me:
1) How do both of these modes work in terms of HTTP?
2) I found that in chunked mode there is a "Transfer-Encoding: chunked" header in the first request. Then the client sends chunks within separate requests to the server and a final zero-length request. Do I need to set the Transfer-Encoding header in every request? What other headers should be used? (See the sketch after this list.)
3) Do I need to send only one HTTP request in streaming mode?
4) Do I need to tell the WCF service somehow that I'm sending streamed content?
5) Let's say the default connection timeout for the WCF service is 30 seconds. How does this timeout affect the streaming and chunking modes?
6) Can anyone explain briefly how both of these modes should be implemented on the server and the client? (No code required, just a high-level description.) The more I read on this topic, the more confused I get.
Many thanks!
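For reference, a minimal client-side sketch of a chunked upload (Python, assuming the requests library; the URL is a made-up placeholder). Note that chunked transfer encoding splits the body of a single HTTP request into length-prefixed chunks terminated by a zero-length chunk, rather than using separate requests:

    import requests

    def upload_chunked(path, url="https://example.com/upload"):  # hypothetical endpoint
        def read_in_chunks(chunk_size=64 * 1024):
            # Yielding the body piece by piece makes requests send one
            # HTTP request with "Transfer-Encoding: chunked"; the final
            # zero-length chunk is emitted when the generator ends.
            with open(path, "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        return
                    yield chunk

        resp = requests.post(url, data=read_in_chunks(), timeout=300)
        resp.raise_for_status()
        return resp.status_code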
We have a streaming endpoint where data streams through our api.domain.com service to our backend.domain.com service, and then, as chunks are received in backend.domain.com, we write those chunks to the database. In this way, we can stream an ndjson request into our servers and IT IS FAST, VERY FAST.
We were very disappointed to find out that the Cloud Run firewalls, for HTTP/1.1 at least (via curl), do NOT support streaming! curl speaks HTTP/2 to the Google Cloud Run firewall, and Google by default hits our servers with HTTP/1.1 (for some reason, though I saw an option to start in HTTP/2 mode that we have not tried).
What I mean by "they don't support streaming" is that Google does not send our servers a request UNTIL the whole request has been received by them (i.e. not just the headers; it needs to receive the entire body). This makes things very slow, as opposed to streaming straight through firewall 1, Cloud Run service 1, firewall 2, Cloud Run service 2, and into the database.
I am wondering if Google's Cloud Run firewall by chance supports HTTP/2 streaming and actually sends on the request headers instead of waiting for the entire body.
I realize Google has body size limits, AND I realize we respond to clients with 200 OK before the entire body is received (i.e. we stream the response back while the request is being streamed in), so I am totally OK with Google killing the connection if the size limits are exceeded.
So my second question in this post is: if they do support streaming, what will they do when the size is exceeded, since I will have already responded with 200 OK at that point?
In this post, my definition of streaming is "true streaming": you can stream a request into a system, and that system can forward it to the next system, reading and forwarding continuously rather than waiting for the whole request. The Google Cloud Run firewall does NOT match my definition of streaming, since it does not pass through the chunks it receives! Our servers send data on as they receive it, so even with many hops there is no impact, thanks to the webpieces webserver.
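For comparison, here is what a "true streaming" relay looks like in miniature (a Python sketch using the requests library; the URLs are placeholders). Each chunk is forwarded downstream as soon as it arrives, and the full body is never buffered:

    import requests

    def relay(upstream_url, downstream_url):
        # Read the upstream body incrementally and forward each chunk
        # downstream the moment it arrives, never holding the whole body.
        with requests.get(upstream_url, stream=True) as upstream:
            upstream.raise_for_status()
            resp = requests.post(downstream_url,
                                 data=upstream.iter_content(chunk_size=64 * 1024))
        resp.raise_for_status()
        return resp.status_code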
Unfortunately, Cloud Run doesn't support HTTP/2 end-to-end to the serving instance.
Server-side streaming is in ALPHA. Not sure if it helps solve your problem. If it does, please fill out the following form to opt in, thanks!
https://docs.google.com/forms/d/e/1FAIpQLSfjwvwFYFFd2yqnV3m0zCe7ua_d6eWiB3WSvIVk50W0O9_mvQ/viewform
I want to transfer files (upload and download) between clients and a server. Which approach would be better for me? Any recommendations?
One way would be using HTTP, with the client sending different parameters in the request and the server sending or requesting the file depending on the result of some algorithm (see the sketch below).
Another way may be to implement a web service on a LAMP server and then properly use RPCs to decide whether to upload or download.
I have basic knowledge of web services.
Thanks
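A rough sketch of the first option (Python with the requests library; the endpoint and parameter names are invented for illustration):

    import requests

    BASE_URL = "http://example.com/files"  # hypothetical endpoint

    def upload(path):
        # The "action" parameter tells the server which operation we want.
        with open(path, "rb") as f:
            resp = requests.post(BASE_URL, params={"action": "upload"},
                                 files={"file": f})
        resp.raise_for_status()

    def download(name, dest):
        # Stream the download so large files never sit fully in memory.
        with requests.get(BASE_URL, params={"action": "download", "name": name},
                          stream=True) as resp:
            resp.raise_for_status()
            with open(dest, "wb") as out:
                for chunk in resp.iter_content(chunk_size=64 * 1024):
                    out.write(chunk)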
I am writing an application, similar to SETI@home, that allows users to run processing on their home machines and then upload the result to a central server.
However, the final result is maybe a 10K binary file. (The processing to achieve this output takes several hours.)
What is the simplest reliable automatic method to upload this file to the central server? What do I need to do server-side to prevent blocking? Perhaps having the client send mail is simple and reliable? NB the client program is currently written in Python, if that matters.
Email is not a good solution; you will run into potential ISP blocking and other anti-spam mechanisms.
The easiest way is over HTTP via a simple web service. Have a listener on your server that accepts the uploaded files as part of an HTTP POST and then dumps them wherever they need to be for post-processing.
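A minimal sketch of such a listener (Python/Flask purely for illustration; the route and upload directory are assumptions, not a prescribed API):

    import os
    from flask import Flask, request

    app = Flask(__name__)
    UPLOAD_DIR = "/var/uploads"  # hypothetical drop directory for post-processing

    @app.route("/upload", methods=["POST"])
    def upload():
        # Accept the file from a multipart POST and dump it for later processing.
        f = request.files["result"]
        f.save(os.path.join(UPLOAD_DIR, f.filename))
        return "OK", 200

Since the client is already Python, it could then post its 10K result with requests.post(url, files={"result": open(path, "rb")}).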
We are trying to design six web services, which will serve another client component. The client component requires data from the web services we are implementing.
Now, the problem is that there isn't just one web service being implemented: there is one web service that the client component hits, and this initiates a series of five more web services that gather data from their respective data stores and finally provide the data back to the original web service, which then delivers it back to the client component.
So, if the requested data becomes huge, this will be a serious problem for our internal communication channel.
So, what do you guys suggest? What can be done to avoid overloading the communication channel between the internal web services while still delivering the data to the client component?
Update 1
Using five web services, where one web service does not know about the others except the next one in the chain, is a business requirement. Actually, five companies' "small services" are being integrated.
We use Java and Axis2
We've had a similar problem. Apart from trying to avoid it (e.g. for internal communication, go directly to the database instead of through a web service), you can mitigate it by at least not performing the five or so tasks in series. Spawn new threads to collect them all in parallel and process the results at the end to reduce latency (except where they might contend for the same resource and bottleneck); see the sketch below.
But before I'd do anything, load test it and see if it is even an issue, and get some baseline stats so you can see what improvement each change makes. Also, sometimes you might be better off tweaking network settings or the actual network rather than trying to optimise the code; but again, test and see.
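To illustrate the shape of the fan-out (a Python sketch with invented endpoints; the question's stack is Java/Axis2, where an ExecutorService would play the same role):

    from concurrent.futures import ThreadPoolExecutor

    import requests

    # Hypothetical endpoints for the five downstream services.
    SERVICE_URLS = ["http://service%d.internal/data" % i for i in range(1, 6)]

    def fetch(url):
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        return resp.json()

    def gather_all():
        # Fan out to all five services at once instead of calling them in
        # series; total latency approaches the slowest call, not the sum.
        with ThreadPoolExecutor(max_workers=len(SERVICE_URLS)) as pool:
            return list(pool.map(fetch, SERVICE_URLS))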
Put all the data in a temporary compressed file and give back the FTP URL of the file.
The client fetches the big compressed file, uncompresses it, and reads it. (Maybe add some authentication mechanism for the FTP server.)
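A minimal sketch of the server-side step, using only Python's standard library (the real service would expose the resulting file through its FTP server and hand back the URL):

    import gzip
    import shutil
    import tempfile

    def compress_to_tempfile(src_path):
        # Write the payload into a temporary .gz file and return its path;
        # the service would then serve this file over FTP.
        tmp = tempfile.NamedTemporaryFile(suffix=".gz", delete=False)
        tmp.close()
        with open(src_path, "rb") as src, gzip.open(tmp.name, "wb") as dst:
            shutil.copyfileobj(src, dst)
        return tmp.name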
OK, so coming in from a completely different field of software development, I have a problem that's a little outside my experience. I'll state it as plainly as possible without giving away confidential details:
I want to make a server that "does stuff" when requested by a client on the same network. The client will most likely be a back-end to a content management system.
The request consists of some parameters, an input file and several output files.
The files are quite large, from 10MB to 100MB of data that must be processed (possibly more). The client can specify the destination for the output files.
The client needs to be able to find out the status of the request, e.g. position in the queue and percent complete. And obviously when and where to pick up the output.
So, my questions are: what is a good method for the client and server to communicate? Should the client poll the server, or provide a "callback" somehow for status updates?
At this point the implementation platform is completely open; anything from C to scripting languages like Ruby is available (at either end). My main issue is how the communication should occur.
My first thought was to set up some web services between the machines. But web services aren't going to be too friendly or efficient with the large files.
Simple approach:
ServerA hits a web method on ServerB "BeginProcess". The response give you back a FTP location username/password, and ticket number.
ServerA delivers the files to the FTP location.
ServerA regularly polls a web method, "GetProcessStatus(ticketNumber)"; possible return values: Awaiting files, Percent complete, Finished. (A polling sketch follows below.)
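A minimal polling loop sketch (Python; `client` stands in for whatever generated service stub you use, so the call itself is an assumption, while the method name and status values follow the steps above):

    import time

    def wait_for_result(client, ticket_number, poll_interval=15):
        # Poll GetProcessStatus until the job reports "Finished".
        while True:
            status = client.GetProcessStatus(ticket_number)
            if status == "Finished":
                return
            print("current status: %s" % status)
            time.sleep(poll_interval)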
A slightly more complicated approach, without the polling:
ServerA hits a web method on ServerB "BeginProcess(postUrl)", and you send along a URL you want status updates POSTed to. Response: FTP location username/password, and ticket number.
ServerA delivers the files to the FTP location.
ServerB sends updates through to the POST location on ServerA every XXX% completed (see the sketch below).
For extra resilience you would keep GetProcessStatus around in case something gets lost in the ether...
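The receiving end of those POSTed updates could be as small as this (a Python/Flask sketch; the route and field names are invented for illustration):

    from flask import Flask, request

    app = Flask(__name__)

    @app.route("/process-status", methods=["POST"])
    def process_status():
        # ServerB POSTs progress updates here; this URL is what ServerA
        # would pass as postUrl to BeginProcess.
        update = request.get_json(force=True)
        print("ticket %s: %s%% complete" % (update.get("ticket"),
                                            update.get("percent")))
        return "", 204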
Files that will be up to 100MB aren't a good choice for a web service, since you run the risk of the HTTP session timing out before you have completed your processing.
Having a web service for checking the status of these jobs would be more ideal. Handle the file transfers via FTP or whatever file transfer method you choose, and poll a web service for updates on status. When the process is complete, you might have an output file URL returned that can be downloaded.