Design considerations while developing web services for mobile applications

I have been developing server-side applications in Java, and now I have been asked to serve some mobile applications. The question that always bugs me is how much data I should send to the app.
If I have to transfer a large XML document, should I send it node by node, as required? If so, won't it drain the phone's battery more, since the phone will be creating a new connection for every node? If I decide to send the whole document at once, it may take the client a long time to download, the client may have trouble storing it temporarily, and the two copies of the data may become inconsistent. In short, I need to know: is creating a connection too expensive for a mobile device? Which approach is better: receive data in chunks over multiple connections, or receive all the data together over one connection?
I also need to know, while developing my web service for mobile clients, whether I should send them the image URI or the image data (as a byte array)?
Thanks.

You should download as few times as possible.
It's the application that answers your question. Don't download a list of 1,000 items; load just 10, wait for the user to scroll down, then load the next 10. Cache the items.
A slightly trickier way to do something like that is to register what data each user has downloaded. The first time someone uses the app, it downloads as much as you need; call it the first-time load, and record what the user downloaded. On the next call you send only the data that has changed, along with the change actions to perform on the client's device.
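A minimal sketch of that kind of paged, delta-aware endpoint, using a tiny Flask app purely for brevity (the asker's stack is Java, where the same shape would be a servlet or JAX-RS resource); all names and the in-memory store here are made up:

```python
# Hypothetical sketch: paging plus "only what changed since last sync".
from flask import Flask, request, jsonify

app = Flask(__name__)

# Pretend data store: id -> record with a last-modified timestamp.
ITEMS = {i: {"id": i, "name": f"item {i}", "modified": 1700000000 + i}
         for i in range(1000)}

@app.route("/items")
def items():
    # Page through the data instead of sending all 1,000 records at once.
    offset = int(request.args.get("offset", 0))
    limit = min(int(request.args.get("limit", 10)), 50)
    # 'since' lets a returning client ask only for what changed after its last sync.
    since = int(request.args.get("since", 0))
    changed = [v for v in ITEMS.values() if v["modified"] > since]
    page = sorted(changed, key=lambda v: v["id"])[offset:offset + limit]
    return jsonify({"items": page,
                    "server_time": max(v["modified"] for v in ITEMS.values())})

# The client caches what it got, remembers 'server_time', and passes it back as
# 'since' on the next call, so only changed records travel over the air.
```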

With mobile clients, the latency kills more than the data package size. While you should not send more data than you are going to consume, node by node is not a good method to employ. This is not as much a battery answer as a user experience answer.
The best way to architect for mobile is to find the right bite size for the meal. In other words, you don't send 10,000 records at one time, but you don't send a grid of 10 items 1 row at a time either.
Depending on image size, you should send the image directly (base64 encoding is common) and not as a link. An exception would be sending the user to a web page and letting them browse, but then it is not really a "mobile application" any more, right?
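A rough sketch of that size-based trade-off, with made-up field names and an arbitrary threshold:

```python
# Inline small images as base64 so the client avoids a second round trip;
# fall back to a URL for large ones. Threshold and URL scheme are assumptions.
import base64
import json

def build_item_response(item_id, image_path, inline_threshold=32 * 1024):
    with open(image_path, "rb") as f:
        raw = f.read()
    if len(raw) <= inline_threshold:
        # Small image: embed it directly in the payload.
        return json.dumps({"id": item_id,
                           "image_b64": base64.b64encode(raw).decode("ascii")})
    # Large image: send a URL and let the client decide when (or whether) to fetch it.
    return json.dumps({"id": item_id,
                       "image_url": f"https://example.com/images/{item_id}"})
```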

Related

Collecting card data PCI level

We want to integrate a 3rd-party payment service. Their API expects the PAN and expiration date, and we need to determine what PCI level we need.
We just collect this data on the client and send it to our server, which sends it on to them; we do not store it in a database.
If your server can see this data, you need PCI SAQ-D, end of story. It doesn’t matter if you’re storing it or not, what matters is that someone who compromises your server can see it in transit. And if you’re asking this question, you do not want to be responsible for all the requirements of D.
To qualify for SAQ-A, or SAQ-A-EP, which are the only other two valid for websites, the card data needs to never come to your server in a readable form. That could mean redirecting the user to a page hosted by your payment processor to enter their data, embedding an iframe they provide, posting it directly to them from the front end (i.e. JavaScript POST), or (maybe) encrypting it with a key that only they can decrypt.
More information can be found in the official summary document

Real-time duplication of data among EC2 instances located in different regions

I'm new to AWS and back-end architecture in general. My current configuration is an EC2 instance (south-east region Singapore) running a Twisted real-time server for a real-time chat app.
Currently, in my implementation, whenever a sender sends a message to the server, it is stored in a Python dictionary on the server if the receiver is not online. So basically it is storing this message in the instance's RAM. Now, I want to make the app available worldwide, so I'll be running it on instances in different regions. My question is: how am I supposed to duplicate/replicate this dictionary, stored in the RAM of one instance, to all the other instances, so it is readily available in all regions? (The reason for storing the messages in RAM and not in a database is the nature of the app: it involves a large volume of messages sent in bursts, which requires it to be considerably faster than the I/O read/write speeds a persistent DB store offers.) My aim is to make the app available globally while keeping real-time performance.
(Kindly don't flag this question as an "opinion-based" question and close it. I'm new to server side architecture and I really need someone to at least just point me in the right direction. And I don't think I'll be able to find help on this anywhere other than StackOverflow.)
Here are a few things I would think about if I had to build it myself (I've implemented most of these pointers in our own project and it took me quite a while).
If you really, really need all servers to be in sync, you'll need a consensus protocol. If you do, don't build it yourself; it will cost a lot of time and errors.
If you can, partition your chat data into chatrooms and have only a few servers handle each chatroom.
I've used msgpack to encode my data. It's faster and smaller than JSON.
You'll benefit a lot from compressing your data before you send it over the wire. Have a look at something like zlib or lz4.
Even though compressed msgpack comes out at almost the same size as compressed JSON, I'd choose msgpack because it's faster and easier to parse, since it's length-prefix encoded.
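A minimal sketch of that encode-compress-frame pipeline, assuming the third-party msgpack package is installed:

```python
# Encode with msgpack, compress with zlib, and prepend a length prefix so the
# receiver knows where each frame ends on the stream.
import struct
import zlib
import msgpack

def frame(messages):
    payload = zlib.compress(msgpack.packb(messages, use_bin_type=True))
    # 4-byte big-endian length prefix.
    return struct.pack(">I", len(payload)) + payload

def unframe(data):
    (length,) = struct.unpack(">I", data[:4])
    payload = data[4:4 + length]
    return msgpack.unpackb(zlib.decompress(payload), raw=False)

batch = [{"from": "alice", "to": "bob", "text": "hi"},
         {"from": "bob", "to": "alice", "text": "hello"}]
assert unframe(frame(batch)) == batch
```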
I would try to send messages together: batch up all messages every x ms (in my project I chose 100 ms). Batching messages will save you a lot of bandwidth, since your compression algorithm can remove more duplication.
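A sketch of that batching loop, assuming a Twisted reactor (which the question already uses); send_frame is a stand-in for whatever actually writes to the transport:

```python
# Buffer outgoing messages and flush them in one frame every 100 ms.
from twisted.internet import reactor, task

outgoing = []

def queue_message(msg):
    outgoing.append(msg)          # just buffer; nothing hits the wire yet

def flush():
    if not outgoing:
        return
    batch = outgoing[:]
    del outgoing[:]
    send_frame(batch)             # e.g. the frame() helper sketched earlier

def send_frame(batch):
    print("sending", len(batch), "messages in one frame")

task.LoopingCall(flush).start(0.1)   # flush every 100 ms
# reactor.run() is started by the rest of the application.
```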
You'll have to handle connection timeouts. Only regard a message as sent and done when you get a reply back (you'll have to design or choose your protocol to handle that).
Think about what is acceptable: how much data you're willing to lose when something crashes or otherwise fails. If you're not willing to lose data, you'll have to implement something that stores it to disk.
I've had the problem that writes to the database we use (Google Cloud Datastore) take a long time as well, somewhere between 100 ms and 900 ms depending on how much I store. What I did was store this data only every x seconds, and set flags on objects that need to be saved on the next run. Of course you can only do this if you're willing to lose some data when your program crashes.
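A rough sketch of that dirty-flag, periodic-flush idea; save_to_datastore is a placeholder for the real (slow) persistence call:

```python
# Mark objects dirty on the hot path (no I/O), and write them out every few seconds.
import threading
import time

state = {}          # object id -> current value (lives in RAM)
dirty = set()       # ids changed since the last flush
lock = threading.Lock()

def update(obj_id, value):
    with lock:
        state[obj_id] = value
        dirty.add(obj_id)        # cheap: no I/O on the hot path

def flush_every(seconds):
    while True:
        time.sleep(seconds)
        with lock:
            to_save = {i: state[i] for i in dirty}
            dirty.clear()
        if to_save:
            save_to_datastore(to_save)   # one slow write instead of many

def save_to_datastore(objs):
    print("persisting", len(objs), "objects")

threading.Thread(target=flush_every, args=(5,), daemon=True).start()
```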
You'll need something to keep track of which servers are running and which server is responsible for which piece of data.
Set up something that checks whether your connection is alive, for example by sending echo requests and echoes every so often. The sooner you detect a failure, the better. Note, however, that if your reactor is blocked by some CPU-intensive task, it will not send your echo in time.
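A sketch of such a heartbeat with Twisted's LoopingCall; the connection bookkeeping and message format here are made up:

```python
# Ping every few seconds and drop connections that haven't answered recently.
import time
from twisted.internet import task

PING_INTERVAL = 5      # seconds between pings
TIMEOUT = 15           # consider the peer dead after this much silence

connections = {}       # transport -> timestamp of last echo reply

def on_echo_reply(conn):
    connections[conn] = time.time()

def heartbeat():
    now = time.time()
    for conn, last_seen in list(connections.items()):
        if now - last_seen > TIMEOUT:
            connections.pop(conn)
            conn.loseConnection()       # Twisted transports expose loseConnection()
        else:
            conn.write(b"ECHO?")        # peer is expected to answer with an echo reply

task.LoopingCall(heartbeat).start(PING_INTERVAL)
```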
If you're not in control of how much data comes in, you'll have to slow down or penalize connections that would otherwise take up all of your server time.
EDIT: I only now see that you're looking into Redis. As far as I know it's a good queueing system. Use that if you can; implementing the stuff above would take a lot of time to get right.

Publish/Subscribe/Request for exchange of big, complex, and confidential data?

I am working on a project where a website needs to exchange complex and confidential (and thus encrypted) data with other systems. The data includes personal information, technical drawings, public documents etc.
We would prefer to avoid the Request-Reply pattern to the dependent systems (and there are a LOT of them), as that would create an awful lot of empty traffic.
On the other hand, I am not sure that a pure Publisher/Subscriber pattern would be appropriate -- mainly because of the complex and bulky nature of the data to be exchanged.
For that reason we have discussed the possibility of a "publish/subscribe/request" solution. The Publish/Subscribe part would be to publish a message to the dependent systems, that something is ready for pickup. The actual content is then picked up by old-school Request-Reply action.
How does this sound to you?
Regards,
Morten
If the systems are always online, it sounds good.
You might want to look at PubSubHubbub because:
1. Don't solve a problem that has already been solved.
2. It is scalable and represents a good separation of concerns.
It involves 3 parties:
Publishers (who publish stuff)
Subscribers (who are interested in certain publications)
Hubs (who mediate and get rid of 'polling')
It works in the following way:
A subscriber registers their interest in a URL with a hub and provides a callback URL.
A publisher notifies the hub when publishing content.
A hub fetches the 'delta' and pushes it to interested subscribers.
The protocol itself is an extension to Atom, but it seems to fit your requirement, e.g. the new Atom 'content' could be an item containing URLs to newly published documents (which can then be downloaded separately).
New/modified documents => new/modified items in feed containing URLs to fetch them => Hub => Subscribers => Pull documents from Publisher
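For concreteness, a sketch of what a PubSubHubbub/WebSub subscription request looks like from the subscriber's side; the hub, topic, and callback URLs are placeholders:

```python
# Subscribe to a topic at a hub. The hub then verifies the callback with a GET
# carrying a hub.challenge value that the subscriber must echo back.
import requests

resp = requests.post(
    "https://hub.example.com/",
    data={
        "hub.mode": "subscribe",
        "hub.topic": "https://publisher.example.com/documents.atom",
        "hub.callback": "https://subscriber.example.com/websub-callback",
        "hub.lease_seconds": "86400",
    },
)
print(resp.status_code)   # 202 Accepted means the hub will verify asynchronously
```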
I don't have a great deal of experience with this, but a messaging queue should help you accomplish what you need. I am using such a system to publish data from a backend to multiple front-end clients.
If the client is offline, the data is not consumed and the server receives no acknowledgement of the data being received. Once the client comes back online, it consumes the data and keeps listening for more messages once the queue is clear. And of course the publisher receives an ack when the data is consumed. As a bonus, this way we can identify and notify people who have problems at the receiving end. Could this do it in your case?
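One concrete way to get that behaviour is a durable queue with explicit acknowledgements; a sketch assuming RabbitMQ and the pika client, with a placeholder queue name and host:

```python
# Messages sit in a durable queue until the consumer acks them, so an offline
# client simply picks them up when it reconnects.
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()
channel.queue_declare(queue="documents", durable=True)

# Publisher side: mark the message persistent so it survives a broker restart.
channel.basic_publish(
    exchange="",
    routing_key="documents",
    body=b"document 42 is ready for pickup",
    properties=pika.BasicProperties(delivery_mode=2),
)

# Subscriber side: the message is only removed once the client acks it.
def handle(ch, method, properties, body):
    print("received:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="documents", on_message_callback=handle)
channel.start_consuming()
```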
This approach works if the dependent systems are always online - you can't send messages to PCs that are turned off for the night/weekend.
So if the clients are servers that run 24/7, this works. Otherwise, try this approach:
Let clients register themselves
When new documents come in, add an entry "client X needs to see this" in your database
When clients connect, send them all the entries.
When clients successfully downloaded a document, delete the "client X needs to see this" entry. That keeps the work table small.
This has several advantages:
Clients don't need to run 24/7
The flag is only removed after the client has seen the document (so no updates can be lost).
You have one place where you can see which clients never pull their documents. A simple select client, count(*) group by client having count(*) > 10 tells you about problems.
Most clients will fetch their data in a timely manner, so the work table will stay small. That means there is little overhead when you have to collect the "what's new" data.
EDIT: The problem with offline subscribers is that they don't know what they're missing, so the sending side needs to keep track of failed push/pull requests. That means you must implement something like the steps suggested above to make sure broken connections can be resumed.
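A minimal sketch of that work-table idea, using SQLite only to keep the example self-contained; the table, column, and function names are made up:

```python
# "Client X needs to see this" entries: added when a document arrives, deleted
# only after the client confirms the download, queried to spot stale clients.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE pending (client TEXT, document_id INTEGER, "
           "PRIMARY KEY (client, document_id))")

def document_arrived(doc_id, registered_clients):
    db.executemany("INSERT INTO pending (client, document_id) VALUES (?, ?)",
                   [(c, doc_id) for c in registered_clients])

def documents_for(client):
    return [row[0] for row in db.execute(
        "SELECT document_id FROM pending WHERE client = ?", (client,))]

def confirm_download(client, doc_id):
    # Only delete after the client has really fetched the document.
    db.execute("DELETE FROM pending WHERE client = ? AND document_id = ?",
               (client, doc_id))

def stale_clients(threshold=10):
    # Clients that never pull their documents show up here.
    return list(db.execute(
        "SELECT client, COUNT(*) FROM pending GROUP BY client HAVING COUNT(*) > ?",
        (threshold,)))

document_arrived(1, ["client-a", "client-b"])
print(documents_for("client-a"))      # [1]
confirm_download("client-a", 1)
```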

Windows Phone: Updating backend datastore (via web service) while keeping UI very responsive

I am developing a Windows Phone app where users can update a list. Each update, delete, add, etc. needs to be stored in a database that sits behind a web service. As well as ensuring all the operations made on the phone end up in the cloud, I need to make sure the app is really responsive and the user doesn't feel any lag time whatsoever.
What's the best design to use here? Should each checkbox change and each text box edit fire a new thread to contact the web service? Should I locally store a list of things that need to be updated and then send them to the server in a batch every so often (and what about the back button)? Am I missing another, even easier implementation?
Thanks in advance,
Data updates to your web service are going to take some time to execute, so in terms of providing the very best responsiveness to the user your best approach would be to fire these off on a background thread.
If it is a concern for your app that updates won't go out until it resumes after a back press, you can increase the frequency at which you send these updates off.
Storing data locally would be a good idea following each change to make sure nothing is lost since you don't know if your app will get interrupted such as by a phone call.
You are able to intercept the back button, which would allow you to notify the user that pending updates are being processed, or to request confirmation to defer transmission (say, in a location with poor network performance). Perhaps a visual cue in your UI would be helpful to indicate pending requests in your storage queue.
You may want to give some thought to the overall frequency of data updates in a typical usage scenario for your application and how intensely this would utilise the network connection. Depending on this you may want to balance frequency of updates with potential power consumption.
This may guide you on whether to fire updates off field-level changes, off a timer when the queue isn't empty, and/or when the user moves on to a different row of data, among other possibilities.
General efficiency guidance with mobile network communications is to have larger and less frequent transmissions rather than a "chatty" or frequent transmissions pattern, however this is up to you to decide what is most applicable for your application.
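The queue-and-flush pattern itself is small; here is a sketch of it (written in Python purely for illustration, since the real app would be C# on Windows Phone, and every name here is hypothetical): each change is queued and persisted immediately, and a background worker drains the queue in batches.

```python
# Pattern sketch only: the UI thread never waits on the network, and changes
# survive interruptions because they are persisted locally before upload.
import json
import queue
import threading

pending = queue.Queue()

def record_change(change):
    pending.put(change)                      # returns instantly; UI stays responsive
    persist_locally(change)                  # survive tombstoning / interruptions

def persist_locally(change):
    with open("pending-changes.log", "a") as f:
        f.write(json.dumps(change) + "\n")

def flush_worker():
    while True:
        batch = [pending.get()]              # block until at least one change exists
        while not pending.empty() and len(batch) < 20:
            batch.append(pending.get_nowait())
        send_batch_to_web_service(batch)     # one request instead of twenty

def send_batch_to_web_service(batch):
    print("uploading", len(batch), "changes")

threading.Thread(target=flush_worker, daemon=True).start()
```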
You might want to look into something similar to REST or SOAP.
Each update, delete, add would send a request to the web service. After the request is fulfilled, the web service sends a message back to the Phone application.
Since you want to keep this simple on the Phone application, you would send a URL to the web service, and the web service would respond with a simple message you can easily parse.
Something like this:
http://webservice?action=update&id=10345&data=...
With a reply of:
Update 10345 successful
The id number is just an incrementing sequence to identify the request / response pair.
There is the Microsoft Sync Framework, recently released and discussed a few weeks back on DotNetRocks. I must admit I didn't consider it till I read your comment.
I've not yet looked into the Sync Framework's dependencies, and thus its ability to run on the WP7 platform, but it's probably worth checking out.
Here's a link to the framework.
And a link to Carl and Richard's show with Lev Novik, an architect on the project if you're interested in some background info. It was quite an interesting show.

Ideal way/architecture to deliver large data over Web Services

We are trying to design 6 web services, which will serve another client component. The client component requires data from the web service we are implementing.
Now, the problem is that we are not implementing just one web service. There is one web service that the client component hits; it initiates a series of five more web services, which gather data from their respective data stores and finally return it to the original web service, which then delivers the data back to the client component.
So, if the requested data becomes huge, then, this will be a serious problem for our internal communication channel.
So, what do you suggest? What can be done to avoid overloading the communication channel between the internal web services while still delivering the data to the client component?
Update 1
Using 5 web services, where one web service does not know about the others except the next one in the chain, is a business requirement. Actually, five companies' "small services" are being integrated.
We use Java and Axis2
We've had a similar problem. Apart from trying to avoid it (e.g. for internal communication, go directly to the database instead of through a web service), you can mitigate it by at least not performing the five or so tasks in series. Spawn threads to collect them all in parallel and process the results at the end to reduce latency (except where they might contend for the same resource and create a bottleneck).
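A sketch of that fan-out, shown in Python for brevity with made-up service URLs; in the asker's Java/Axis2 stack the same shape would come from an ExecutorService and Futures:

```python
# Call the downstream services in parallel instead of in series, so total
# latency is roughly the slowest call rather than the sum of all five.
from concurrent.futures import ThreadPoolExecutor

import requests

SERVICE_URLS = [                      # placeholders for the five internal services
    "http://internal-ws-1/data",
    "http://internal-ws-2/data",
    "http://internal-ws-3/data",
    "http://internal-ws-4/data",
    "http://internal-ws-5/data",
]

def fetch(url):
    return requests.get(url, timeout=10).json()

def gather_all():
    with ThreadPoolExecutor(max_workers=len(SERVICE_URLS)) as pool:
        results = list(pool.map(fetch, SERVICE_URLS))
    return combine(results)           # merge/transform before returning to the client

def combine(parts):
    merged = {}
    for part in parts:
        merged.update(part)
    return merged
```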
But before doing anything, load test it to see whether it is even an issue, and get some baseline stats so you can see what improvement each change makes. Also, sometimes you might be better off tweaking network settings, or the actual network, rather than trying to optimise the code - but again, test and see.
Put all the data in a temporary compressed file and give back the FTP URL of the file.
The client fetches the big data chunk, uncompresses it, and reads it (maybe with some authentication mechanism for the FTP server).
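A sketch of that hand-back-a-URL approach; the download directory, base URL, and serving mechanism (FTP or HTTP) are assumptions:

```python
# Write the payload to a compressed file in a download area and return its URL,
# instead of streaming the data through the web service response itself.
import gzip
import os
import uuid

DOWNLOAD_DIR = "/var/www/downloads"
BASE_URL = "https://files.example.com/downloads"

def publish_large_payload(data: bytes) -> str:
    name = f"{uuid.uuid4()}.gz"
    path = os.path.join(DOWNLOAD_DIR, name)
    with gzip.open(path, "wb") as f:
        f.write(data)
    return f"{BASE_URL}/{name}"       # the web service response carries only this URL

# Client side: fetch the URL, then gzip-decompress the downloaded bytes to get the data.
```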