MEAN stack and concurrency


What is the preferred way of handling concurrent users for a web app powered by the MEAN stack? What I really mean is that in a multi-user environment, several users can update the same object. For example, given a collection of students whose grades the users update, how would we handle it? My initial thought is that there is a need for an optimistic lock with a versioning hook, so that for each update we cross-check the version ID. Do frameworks like Mongoose handle it?

This should be dealt with at the Mongoose level, as you suggest. This may answer your question: http://aaronheckmann.tumblr.com/post/48943525537/mongoose-v3-part-1-versioning
Note in particular that Mongoose v3 adds a schema-configurable version key to each document.
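To make the version cross-check concrete, here is a minimal in-memory sketch in Python. It is not Mongoose's API: `GradeStore`, `VersionConflict`, and the `version` field are hypothetical names chosen for illustration, and a real app would push the same compare-and-increment into the database update itself.

```python
import threading


class VersionConflict(Exception):
    """Raised when the document changed after the caller read it."""


class GradeStore:
    """Hypothetical in-memory store that keeps a version number per document."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()

    def insert(self, doc_id, fields):
        with self._lock:
            self._docs[doc_id] = {"version": 0, **fields}

    def read(self, doc_id):
        with self._lock:
            return dict(self._docs[doc_id])  # copy, so callers cannot mutate the store

    def update(self, doc_id, expected_version, changes):
        with self._lock:
            current = self._docs[doc_id]
            # Optimistic check: refuse the write if someone else updated first.
            if current["version"] != expected_version:
                raise VersionConflict(doc_id)
            current.update(changes)
            current["version"] += 1
            return dict(current)


store = GradeStore()
store.insert("student-1", {"grade": "B"})

# Two users read the same document at the same version.
alice = store.read("student-1")
bob = store.read("student-1")

store.update("student-1", alice["version"], {"grade": "A"})  # first write wins

try:
    store.update("student-1", bob["version"], {"grade": "C"})  # stale version
    conflict = False
except VersionConflict:
    conflict = True  # Bob must re-read and retry
```

The losing writer gets an error instead of silently overwriting the other user's grade; what to do next (re-read and retry, or show the user a merge prompt) is an application decision.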

Related

API Throttling Best Practices

I have a SOAP API that I would like to throttle access to on a per-user basis after "x" many calls have been received in "y" amount of time.
After searching around, the #1 consideration (obviously) is to consider your parameters for when to throttle users. However, I don't see much in the way of best practices/examples for implementing such a solution. I did see the Leaky Bucket Method which makes sense. I have to believe there are more ideas out there though.
Any other takers on how you go about implementing your throttling solution? Questions include:
Do any frameworks (e.g. Spring) provide capabilities for throttling in web APIs?
Seems to me you would need to store access information per user. How do you minimize the database overhead for doing this EVERY call?
Do you even NEED to access a datastore to implement this?

For what it's worth, I've sort of answered this question after working on some other production projects.
Home brew: Using Spring AOP to pointcut around the method calls prior to executing API method code is one home-brew way if you have your own algorithm to implement. This ends up being pretty elegant and flexible as you can capture a lot of metadata prior to deciding what to do with the request.
API Management Service: If you're talking about a production system and you have the budget, probably the best way to go is to delegate this to an API Management layer like Apigee or Mashery.
The advantage is that it separates the concerns, so it's easier to change and allows you to focus just on your API. This is especially helpful if business stakeholders are involved and you need a good UI and dictionary of terms.
The disadvantage, of course, is the cost and the vendor lock-in.
Hope this helps someone!
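For the home-brew route, the Leaky Bucket method mentioned in the question can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not a Spring AOP implementation: `LeakyBucketThrottle`, `allow`, and the injected `clock` parameter are hypothetical names, and the state lives in a plain in-process dict, so it only works for a single server process.

```python
import time


class LeakyBucketThrottle:
    """Per-user leaky bucket used as a meter (hypothetical helper).

    Each call adds one unit to the user's bucket; the bucket drains at
    `leak_per_second`. A call is rejected when it would overflow `capacity`.
    """

    def __init__(self, capacity, leak_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.leak_per_second = leak_per_second
        self.clock = clock
        self._buckets = {}  # user -> (level, time of last call)

    def allow(self, user):
        now = self.clock()
        level, last_seen = self._buckets.get(user, (0.0, now))
        # Drain whatever leaked out since the user's last call.
        level = max(0.0, level - (now - last_seen) * self.leak_per_second)
        if level + 1 > self.capacity:
            self._buckets[user] = (level, now)
            return False
        self._buckets[user] = (level + 1, now)
        return True


# Demo with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
throttle = LeakyBucketThrottle(capacity=3, leak_per_second=1.0,
                               clock=lambda: fake_now[0])

burst = [throttle.allow("alice") for _ in range(4)]  # fourth call overflows
fake_now[0] = 2.0                                    # two seconds pass
after_wait = throttle.allow("alice")                 # bucket has drained enough
other_user = throttle.allow("bob")                   # users are independent
```

Keeping the counters in memory avoids a datastore hit on every call, at the cost of losing them on restart and not sharing them across servers; a shared cache would be the usual next step if that matters.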

How to do the equivalent of SELECT .. FOR UPDATE in a SOA architecture?

The title is probably not explicit enough, so let me try to explain. I'm working on a new project built on .NET; it consists of WPF clients that use WCF web services to access an Oracle database. The problem is not this basic architecture, but rather how it's supposed to work with what's already in place.
Currently, applications are written using PowerBuilder and connect to the database directly. Additionally, they use Oracle's SELECT .. FOR UPDATE statements extensively to manage concurrency by locking records. Since the new applications must exist side-by-side with the old ones, they are supposed to lock records in a similar manner too, but the new architecture that relies on web services does not make this easy.
For the moment, what we are thinking of doing is build a "data server" that would be called by the web services and would be responsible for accessing the database. The purpose of this server is to maintain the open connections/transaction needed to maintain record locks throughout several web service calls. This is needed because the "select" and the subsequent "update" parts of the operations that require SELECT .. FOR UPDATE are most likely going to happen in at least two separate web service calls ("get record" and "post update".)
I've searched the Internet for documentation regarding this kind of situation, but I can't seem to find much on the subject. Can this (i.e. keeping record locks open across several web service calls) be done? Properly? Is my approach appropriate? Are there any published "best practices" on the matter?
Update: The original title of the question was How to maintain record locking with a service oriented architecture? I changed it following John's suggestion, hoping it may inspire some answers.

In general, the SOA already has XA transactions turned on, so you can benefit from this and use the UPDATE / SELECT / UPDATE strategy. That is, you do the UPDATE and the SELECT as a single INVOKE. An example is:
UPDATE MY_TABLE SET state='Working' ... ;
SELECT * FROM MY_TABLE WHERE state='Working' ...
Then you can process the data without fear that someone else will claim it, since it is already in a different 'state', provided that the service you write is the only one in existence for each table.
Finally, you can complete it with UPDATE ... SET ... state='Complete' WHERE state='Working'. By the way, this is the same strategy the DatabaseAdapter uses for polling.
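The UPDATE / SELECT / UPDATE sequence can be tried out end to end with a small script. This is only a sketch using Python's built-in sqlite3 module rather than Oracle or XA transactions; the `owner` column, the 'New' starting state, and the `claim_rows` / `complete_rows` helpers are assumptions added for illustration so that two callers can be told apart.

```python
import sqlite3


def claim_rows(conn, worker_id):
    """Mark unclaimed rows as 'Working' and return them, in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE my_table SET state = 'Working', owner = ? WHERE state = 'New'",
            (worker_id,),
        )
        return conn.execute(
            "SELECT id, payload FROM my_table "
            "WHERE state = 'Working' AND owner = ? ORDER BY id",
            (worker_id,),
        ).fetchall()


def complete_rows(conn, worker_id):
    """Release the claim by moving the rows to their final state."""
    with conn:
        conn.execute(
            "UPDATE my_table SET state = 'Complete' "
            "WHERE state = 'Working' AND owner = ?",
            (worker_id,),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    "id INTEGER PRIMARY KEY, payload TEXT, state TEXT DEFAULT 'New', owner TEXT)"
)
conn.executemany("INSERT INTO my_table (payload) VALUES (?)", [("a",), ("b",)])
conn.commit()

first = claim_rows(conn, "worker-1")   # claims both rows
second = claim_rows(conn, "worker-2")  # nothing left to claim
complete_rows(conn, "worker-1")
states = [row[0] for row in conn.execute("SELECT state FROM my_table ORDER BY id")]
```

Unlike SELECT .. FOR UPDATE, the claim here is ordinary committed data, so it survives across separate web service calls without holding a connection open; the trade-off is that abandoned 'Working' rows need some timeout or cleanup policy.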

Java web application for multiple users

I need to design and implement a Java web application that can be used by multiple users at the same time. The data handled by this application is going to be huge, and a page may take about 5 minutes to display the results (database records).
I had designed this application using HTML, Servlets and JSP. But when two users would try to get the records, only one user was able to view the results while the other faced an error.
I always thought a web application would take care of handling multiple users but this is not the case.
Any insights on this would be highly appreciated.
Thanks.

I always thought a web application would take care of handling multiple users but this is not the case.
They do if they're written correctly. Obviously yours is not. That's all we can tell you unless you give more information, most importantly details of the error shown to the second user.
One possibility is that everything is OK on the web layer but your DB access for the first user causes an exclusive lock so that the second user cannot access the data at the same time. This could be fixed by using non-exclusive read locks. How to do that depends mainly on what DB you're using.
Getting concurrency right requires you to choose the correct tools and use them correctly. It doesn't just happen magically because it's a web app.

What are you using to develop this web application? If you are developing it in your own way from scratch, I must say you are trying to reinvent a wheel that has already been created and refined by very solid frameworks.
I suggest you analyze your requirements thoroughly and study some of the available frameworks. Let them handle things like multithreading in the best possible manner.
Handling multiple requests at a time is the container's job; as application developers, we have to concentrate on how we handle and process the requests forwarded by the container.
I suggest you get some insight into how web applications work and how the request-response cycle happens.

Building a web service: what options do I have?

I'm looking to build my first web service and I would like to be pointed in the right direction right from the start.
Here are the interactions that must take place:
First, a smartphone or computer will send a chunk of data to my web service.
The web service will persist the information to the database.
Periodically, an algorithm will access and modify the database.
The algorithm will periodically bundle data and send it out to smartphones or computers (how?)
The big question is: What basic things do I need to learn in order to implement something like this?
Now here are the little rambling questions that I've also got rolling around in my head. Feel free to answer these if you wish. (...maybe consider them extra credit?)
I've heard a lot of good things about RESTful services, I've read the wiki article, and I've even played around with Twitter's web service, which is RESTful. Is this the obvious way to go? Or should I still consider something else?
What programming language do I use to persist things to the database? I'm thinking php will be the first choice for this.
What programming language do I use to interact with the database? I'm thinking anything is probably acceptable, right?
Do I have to worry about concurrent access to the database, or does MySQL handle that for me? (I'm fairly new to databases too.)
How on earth do I send information back? Obviously, if it's a reply to an HTTP request that's no problem, but there will be times when the reply may take quite a long time to compute. Should I just let the HTTP request remain outstanding until I have the answer?
There will be times when I need to send information to a smartphone regardless of whether or not information has been sent to me. How can I push information to my users?
Other information that may help you know where I'm coming from:
I am pretty familiar with Java, C#, C++, and Python. I have played around with PHP, Javascript, and Ruby.
I am relatively new to databasing, but I get the basic idea.
I've already set up my server and I'm using a basic LAMP stack. My understanding of L, A, M and P is fairly rudimentary.

Language: Python for its ease of use, assuming the GIL is not a particular concern for your requirements (e.g. multi-threading). It has drivers for most databases and supports numerous protocols. There are several web frameworks for it, the most popular probably being Django.
Protocols: if you are HTTP focused study SOAP and REST. Note, SOAP tends to be verbose, which causes problems moving volumes of data. On the other hand, if you are looking at other options study socket programming and perhaps some sort of binary format such as Google's protocol buffers. Flash is also a possibility (see: Flash Remoting). Note, binary options require users install something onto their machine (e.g. applet or standalone app).
Replies: if the process is long running, and the client should be notified when it's done, I would recommend developing an app for the client. Browsers can be programmed with JavaScript to poll periodically, or a Flash movie can be embedded to receive real-time updates, but these are somewhat tricky bits of browser programming. If you're dealing with wireless phones, look at SMS. Otherwise I would just provide a way for clients to get status, but not send out notifications (i.e. pull rather than push). As @jcomea_ictx wrote, AJAX is an option if it's a browser-based solution; study jQuery.
Concurrency: understand what ACID means with regard to databases. Think about what should happen if you receive multiple writes to the same data; the database may not necessarily solve this problem the way you'd want.

Please, for the love of programming, don't use PHP if you're already comfortable with Python. The latter makes for far cleaner, more maintainable code. Not that it's impossible to write good code in PHP, but it's a relative rarity. You can use Python for all the server-end stuff including MySQL interaction, with the MySQLdb module. Either with standard CGI, or FCGI, or mod_python.
As for the database, use of transactions will eliminate conflicts. But you can usually design a system in such a way that conflicts will not happen. For example, use of auto-incrementing primary-key IDs on each insert will make sure that every entry is unique.
You can "pull" data with Javascript, perhaps using AJAX methodology, or "push" using SMS or other technologies.
When replies take a while to compute, you can "poll" using AJAX. This is a very common technique. The server just returns "we are working on this" (or equivalent) with a built-in refresh until the results are ready.
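The polling idea can be sketched on the server side as follows. This is a minimal Python illustration with hypothetical names (`JobBoard`, `submit`, `poll`, `slow_sum`), not tied to any web framework: a real service would expose `submit` and `poll` as two HTTP endpoints and the client would call the second one on a timer.

```python
import threading
import time
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobBoard:
    """Hypothetical registry of long-running jobs that clients can poll."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._jobs = {}  # job id -> Future

    def submit(self, fn, *args):
        """Start the work in the background and hand back a ticket."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(fn, *args)
        return job_id

    def poll(self, job_id):
        """Return 'working' until the result is ready, then the result."""
        future = self._jobs[job_id]
        if not future.done():
            return {"status": "working"}
        # If the job raised, the exception propagates to the poller here.
        return {"status": "done", "result": future.result()}


gate = threading.Event()  # stands in for "the computation takes a while"


def slow_sum(a, b):
    gate.wait()
    return a + b


board = JobBoard()
job_id = board.submit(slow_sum, 2, 3)
before = board.poll(job_id)  # job is still blocked on the gate

gate.set()                   # let the computation finish
while board.poll(job_id)["status"] == "working":
    time.sleep(0.01)         # the client-side polling loop
after = board.poll(job_id)
```

The HTTP request that starts the work returns immediately with the ticket, so nothing has to stay outstanding while the answer is computed.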
I'm no expert on REST, but AJAX, especially when using polling rather than simply responding to user input, can be said to violate RESTful principles. But you can be a purist, or you can do whatever works. It's up to you.
I don't believe I've ever used any "push" technologies other than SMS, and that was years ago when many companies had free SMS gateways. So if you want to "push" data, better hope someone else joins in the conversation!

Use Java. The latest version of Java EE 6 makes coding RESTful and SOAP services a breeze, and it interoperates with databases very easily as well.
The benefits of using a true language instead of a script are: full server-side state, strong typing, multithreading, and countless other things that may or may not come in handy, but knowing they are available makes your project future proof.

comparison of ways to maintain state

There are various ways to maintain user state in web development.
These are the ones that I can think of right now:
Query String
Cookies
Form Methods (Get and Post)
Viewstate (ASP.NET only I guess)
Session (InProc Web server)
Session (Dedicated web server)
Session (Database)
Local Persistence (Google Gears) (thanks Steve Moyer)
etc.
I know that each method has its own advantages and disadvantages like cookies not being secure and QueryString having a length limit and being plain ugly to look at! ;)
But, when designing a web application I am always confused as to what methods to use for what application or what methods to avoid.
What I would like to know is what method(s) do you generally use and would recommend or more interestingly which of these methods would you like to avoid in certain scenarios and why?

While this is a very complicated question to answer, I have a few quick-bite things I think about when considering implementing state.
Query string state is only useful for the most basic tasks -- e.g., maintaining the position of a user within a wizard, perhaps, or providing a path to redirect the user to after they complete a given task (e.g., logging in). Otherwise, query string state is horribly insecure, difficult to implement, and in order to do it justice, it needs to be tied to some server-side state machine by containing a key to tie the client to the server's maintained state for that client.
Cookie state is more or less the same -- it's just fancier than query string state. But it's still totally maintained on the client side unless the data in the cookie is a key to tie the client to some server-side state machine.
Form method state is again similar -- it's useful for hiding fields that tie a given form to some bit of data on the back end (e.g., "this user is editing record #512, so the form will contain a hidden input with the value 512"). It's not useful for much else, and again, is just another implementation of the same idea behind query string and cookie state.
Session state (in any of the ways you describe) is great, since it's infinitely extensible and can handle anything your chosen programming language can handle. The first caveat is that there needs to be a key in the client's hand to tie that client to its state being stored on the server; this is where most web frameworks provide either a cookie-based or query string-based key back to the client. (Almost every modern one uses cookies, but falls back on query strings if cookies aren't enabled.) The second caveat is that you need to put some thought into how you're storing your state... will you put it in a database? Does your web framework handle it entirely for you? Again, most modern web frameworks take the work out of this, and for me to go about implementing my own state machine, I need a very good reason... otherwise, I'm likely to create security holes and functionality breakage that's been hashed out over time in any of the mature frameworks.
So I guess I can't really imagine not wanting to use session-based state for anything but the most trivial reason.

Security is also an issue; values in the query string or form fields can be trivially changed by the user. User authentication should be saved either in an encrypted or tamper-evident cookie or in the server-side session. Keeping track of values passed in a form as a user completes a process, like a site sign-up, well, that can probably be kept in hidden form fields.
The nice (and sometimes dangerous) thing, though, about the query string is that the state can be picked up by anyone who clicks on a link. As mentioned above, this is dangerous if it gives the user some authorization they shouldn't have. It's nice, though, for showing your friends something you found on the site.

With the increasing use of Web 2.0, I think there are two important methods missing from your list:
8 AJAX applications - since the page doesn't reload and there is no page to page navigation, state isn't an issue (but persisting user data must use the asynchronous XML calls).
9 Local persistence - Browser-based applications can persist their user data and state to the local hard drive using libraries such as Google Gears.
As for which one is best, I think they all have their place, but the Query String method is problematic for search engines.

Personally, since almost all of my web development is in PHP, I use PHP's session handlers.
Sessions are the most flexible, in my experience: they're normally faster than db accesses, and the cookies they generate die when the browser closes (by default).

Avoid InProc if you plan to host your website on a cheap-n-cheerful host like webhost4life. I've learnt the hard way that because their systems are over subscribed, they recycle the applications very frequently which causes your session to get lost. Very annoying.
Their suggestion is to use StateServer, which is fine except you have to serialise/deserialise the session on each postback. I love objects and my web app is full of them. I'm concerned about performance when switching to StateServer. I need to refactor to only put the stuff I really need in the session.
Wish I'd known that before I started...
Cheers, Rob.

Be careful what state you store client side (query strings, form fields, cookies). Anything security-related should not be stored client-side, except maybe a session identifier if it is reasonably obscured and hard to guess. There are too many websites that have settings like "authenticated=true" and store those in a cookie or query string or hidden form field. It is trivial for a user to bypass something like that. Remember that ANY input coming from a client could have been tampered with and should not be trusted.

Signed Cookies linked to some sort of database store when you need to grab data. There's no reason to be storing data on the client side if you have a connected back-end; you're just looking for trouble if this is a public facing website.
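As a rough idea of what "signed" means here, the sketch below uses Python's standard hmac module. The names `SECRET_KEY`, `sign`, and `verify` and the `value.signature` cookie layout are assumptions for illustration only; mature web frameworks ship their own signed-cookie support, which is usually the better choice.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice load it from configuration.
SECRET_KEY = b"replace-with-a-long-random-secret"


def sign(value: str) -> str:
    """Return the cookie text: the value plus an HMAC-SHA256 signature."""
    mac = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}.{mac}"


def verify(cookie: str):
    """Return the original value if the signature matches, else None."""
    value, sep, mac = cookie.rpartition(".")
    if not sep:
        return None  # no signature part at all
    expected = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(mac.encode(), expected.encode()):
        return value
    return None


cookie = sign("session-42")
signature = cookie.rpartition(".")[2]

valid = verify(cookie)                        # untouched cookie is accepted
tampered = verify("session-43." + signature)  # changed value, old signature
garbage = verify("not-a-signed-cookie")
```

The cookie only carries a key; the data it points to stays in the server-side store, so a user who edits the value simply ends up with a cookie the server rejects.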

It's not so much a question of what to use and what to avoid, but when to use which. Each has particular circumstances in which it is the best, and different circumstances in which it's the worst.
The deciding factor is generally lifetime of the data. Session state lives longer than form fields, and so on.
