Client / server reactive sync in Clojure / Clojurescript - clojure

Is there an idiomatic way to do reactive data synchronization between browser and server with Clojure and Clojurescript? What are the pros and cons of one technique vs another?
Having used Meteor.js in the past this sort of reactive database sync is highly preferable to manually writing routes and polling for updates. A pub/sub system lets web developers write less boilerplate code to move data around. Clojure seems like it would be a natural fit for such a technique. I have been unable to determine if this is a solved problem in the clj/cljs ecosystem.

The short answer is "no".
I've used Meteor.js in production, and also looked for the same when getting into CLJ(S). The closest I'm aware of are:
Datsync
Uses Datomic and Datoms as the sync protocol
Seems to be alpha
Fulcro
Has help for optimistic updates, but not direct data sync like Meteor has with Minimongo, closer to Meteor methods (ie a client and server implementation of the same call)
Production-ready, widely used
Why hasn't a Meteor-killer hasn't been written in CLJ(S)?
It's hard to do (at all, performantly)
The company behind Meteor raised tens of millions of dollars over the last decade to work in this space and still didn't resolve all the scalability or semantic rough edges in the data sync design
The datastores Clojure people tend to use (Datomic or relational) likely make data sync even harder, and certainly quite different (it wouldn't be a port from Meteor's implementation).
Clojure developers tend to prefer assembling a system around the needs of a particular problem/solution/requirement rather than assembling a solution around a particular feature/framework (eg Mongo documents that sync full-stack).
People are still thinking about this though, both inside the Clojure community and out. Check out:
Baqend (and their blog comparing with Meteor.js, Firebase, RechinkDB)
"The Web After Tomorrow" (by the developer of DataScript)
Reactive Datalog
The only meaningful way to compare is to consider them both in your context. I remember giving up Meteor's magic felt significant at the time but haven't found myself wanting to go back. I see it as trading magic for simplicity and flexibility.

Related

Clojure Web (HttpKit, Manifold) vs Elixir/Pheonix

I am currently weighing up the use of Elixir vs Clojure for running a web server to handle many concurrent Web socket connections. Now Elixir/Phoenix seems a natural fit for this and you see benchmarks demonstrating how far it scales (I doubt this has an bearing on real life load). But, most of our infrastructure is written in Clojure.
In our case, the websocket handler is almost completely independent from the rest of the existing code base.
So the question is - would you consider taking on another language/ecosystem because it is a better fit for a particular job? vs using a tool that is already a large part of your existing ecosystem.
And, does  Elixir/Phoenix exceed Clojure/JVM by a significant margin under real world loads for this kind of task to justify using it?
You may find this conversation helpful.
In summary, the "speed of your webserver" is not going to be the bottleneck in your development. Coding, testing, debugging, refactoring, and other human-related tasks are more important by a ratio of 100x or 1000x for just about any modern project.
Since you are already comfortable with Clojure and using it for most of your project, I would strongly suggest sticking with it rather than split your codebase into 2 incompatible parts.

Twisted Django Comet(Orbited): the interaction of upper and middle level

I'm developing a monitoring system(something like real-time web app). And the question is about system architecture.
Device connects to server and sends information about controlled parameters state. Sever should save information to Database and notify Comet server. Comet server sends message to user saying that new data avaliable. User gets new information.
What's the best way to analyze and save information(create alarm messages if needed) about device state:
Twisted app it self analyzes and interacts with DB(adbapi) and Comet server(Orbited).
Twisted pushes received data to Django(how to push?) and Django analyzes and saves data to DB and sends "NEW" flag to orbited.
Any Your suggestions, if there is a better way.
More information you can find on the pictures below:
This question is fairly open ended. Someone could probably write a dozen pages on each of the options you described, and that much again on a handful of other approaches as a bonus.
Instead of doing that, I'll take an alternate route.
Make sure you have a good understanding of your requirements. Think about which approach is going to be easiest for you (or for the developers on your team) to satisfy those requirements. Take that approach, documenting the overall idea and unit testing everything you write (preferably using TDD).
When you're done, you might not have the optimal solution, but you'll have a solution, and 99 times out of 100 that's indistinguishable from being optimal.
If I do think about your proposed approaches a little bit, then what mostly occurs to me is that they don't differ from each other very much. Your analysis is just some Python code somewhere that you're going to invoke. Whether you invoke it closer to some Twisted-using code or closer to some Django-using code doesn't seem to make a huge difference to the outcome. Perhaps some part of your requirements would make one approach better than the other. However, if you have unit tests and understand your requirements, then I expect you'll actually find it quite easy to switch between those two approaches.
After you've implemented something, you'll have a much better understanding of the trade-offs involved and you'll be in a better position to decide if one implementation is going to work better or worse than another.
Note that unit tests are a pretty essential part of this idea. Without them, you won't really know if you've implemented your requirements, you won't know if your functionality still works after any particular refactoring, and refactoring itself will be harder because your units will not be as well-defined and isolated as they would be if you were doing test-driven development.

node.js concurrency

I'm new in node.js. I'm testing socket.io for real time messaging. I love it and I want to use. I have a question. How many concurrency can run in Node.js server ? Our program will be approximately 100 concurrency. So, I'm worry about that.
I found another real time messaging server , APE. Which one is better ? I love node.js because it's easy to learn and easy to write. But I couldn't find discussion about concurrency in node.js server. My friend company is using APE and it can control around 2000. So, I want to know about node.js server.
Without having any benchmarks to back this up--since both are event-driven (i.e. epoll on Linux), I would imagine that you'll see comparable results for both (at least 10K concurrent users). That being said, performance is likely to be much more affected by the frequency of messages than the number of concurrent connections, since that's where the implementations really differ.
For a real-world example and discussion about node.js Comet performance, see Amir Salihefendic's excellent blog post here: http://amix.dk/blog/post/19577 (you can follow the links in that post to other posts that are fantastic as well).
Notice that one of the versions he wrote was in C using libevent (epoll) which is what APE uses as well. Also, note that APE's website claims that it can handle over 100,000 concurrent users.
If you really want to understand the problems associated, you might find the famous "C10K problem" article interesting (do a google search for "C10K problem").
In the end, it probably comes down to how many requests per second you expect, and how many machines you have, and which language you prefer to code in. If you're only expecting around 100 concurrent users, I think you'll be just fine using any platform you want. That being said, I highly recommend using node.js--just for sheer enjoyment if nothing else. :-)

Node.js or Erlang

I really like these tools when it comes to the concurrency level it can handle.
Erlang/OTP looks like much more stable solution but requires much more learning and a lot of diving into functional language paradigm. And it looks like Erlang/OTP makes it much better when it comes to multi-core CPUs (correct me if I am wrong).
But which should I choose? Which one is better in the short and long term perspective?
My goal is to learn a tool which makes scaling my Web projects under high load easier than traditional languages.
I would give Erlang a try. Even though it will be a steeper learning curve, you will get more out of it since you will be learning a functional programming language. Also, since Erlang is specifically designed to create reliable, highly concurrent systems, you will learn plenty about creating highly scalable services at the same time.
I can't speak for Erlang, but a few things that haven't been mentioned about node:
Node uses Google's V8 engine to actually compile javascript into machine code. So node is actually pretty fast. So that's on top of the speed benefits offered by event-driven programming and non-blocking io.
Node has a pretty active community. Hop onto their IRC group on freenode and you'll see what I mean
I've noticed the above comments push Erlang on the basis that it will be useful to learn a functional programming language. While I agree it's important to expand your skillset and get one of those under your belt, you shouldn't base a project on the fact that you want to learn a new programming style
On the other hand, Javascript is already in a paradigm you feel comfortable writing in! Plus it's javascript, so when you write client side code it will look and feel consistent.
node's community has already pumped out tons of modules! There are modules for redis, mongodb, couch, and what have you. Another good module to look into is Express (think Sinatra for node)
Check out the video on yahoo's blog by Ryan Dahl, the guy who actually wrote node. I think that will help give you a better idea where node is at, and where it's going.
Keep in mind that node still is in late development stages, and so has been undergoing quite a few changes—changes that have broke earlier code. However, supposedly it's at a point where you can expect the API not to change too much more. So if you're looking for something fun, I'd say node is a great choice.
I'm a long-time Erlang programmer, and this question prompted me to take a look at node.js. It looks pretty damn good.
It does appear that you need to spawn multiple processes to take advantage of multiple cores. I can't see anything about setting processor affinity though. You could use taskset on linux, but it probably should be parametrized and set in the program.
I also noticed that the platform support might be a little weaker. Specifically, it looks like you would need to run under Cygwin for Windows support.
Looks good though.
Edit
Node.js now has native support for Windows.
I'm looking at the same two alternatives you are, gotts, for multiple projects.
So far, the best razor I've come up with to decide between them for a given project is whether I need to use Javascript. One existing system I'm looking to migrate is already written in Javascript, so its next version is likely to be done in node.js. Other projects will be done in some Erlang web framework because there is no existing code base to migrate.
Another consideration is that Erlang scales well beyond just multiple cores, it can scale to a whole datacenter. I don't see a built-in mechanism in node.js that lets me send another JS process a message without caring which machine it is on, but that's built right into Erlang at the lowest levels. If your problem isn't big enough to need multiple machines or if it doesn't require multiple cooperating processes, this advantage isn't likely to matter, so you should ignore it.
Erlang is indeed a deep pool to dive into. I would suggest writing a standalone functional program first before you start building web apps. An even easier first step, since you seem comfortable with Javascript, is to try programming JS in a more functional style. If you use jQuery or Prototype, you've already started down this path. Try bouncing between pure functional programming in Erlang or one of its kin (Haskell, F#, Scala...) and functional JS.
Once you're comfortable with functional programming, seek out one of the many Erlang web frameworks; you probably shouldn't be writing your app directly to something low-level like inets at this late stage. Look at something like Nitrogen, for instance.
While I'd personally go for Erlang, I'll admit that I'm a little biased against JavaScript. My advice is that you evaluate few points:
Are you reusing existing code in either of those languages (both in terms of source code, and programmer experience!)
Do you need/want on-the-fly updates without stopping the application (This is where Erlang wins by default - its runtime was designed for that case, and OTP contains all the tools necessary)
How big is the expected traffic, in terms of separate, concurrent operations, not bandwidth?
How "parallel" are the operations you do for each request?
Erlang has really fine-tuned concurrency & network-transparent parallel distributed system. Depending on what exactly is the project, the availability of a mature implementation of such system might outweigh any issues regarding learning a new language. There are also two other languages that work on Erlang VM which you can use, the Ruby/Python-like Reia and Lisp-Flavored Erlang.
Yet another option is to use both, especially with Erlang being used as kind of "hub". I'm unsure if Node.js has Foreign Function Interface system, but if it has, Erlang has C library for external processes to interface with the system just like any other Erlang process.
It looks like Erlang performs better for deployment in a relatively low-end server (512MB 4-core 2.4GHz AMD VM). This is from SyncPad's experience of comparing Erlang vs Node.js implementations of their virtual whiteboard server application.
There is one more language on the same VM that erlang is -> Elixir
It's a very interesting alternative to Erlang, check this one out.
Also it has a fast-growing web framework based on it-> Phoenix Framework
whatsapp could never achieve the level of scalability and reliability without erlang https://www.youtube.com/watch?v=c12cYAUTXXs
I will Prefer Erlang over Node.
If you want concurrency, Node can be substituted by Erlang or Golang because of their light weight processes.
Erlang is not easy to learn so requires a lot of effort but its community is active so can get help from that, this is only the reason why people prefer Node .

Practical SOA for a newbie

I am a total newbie to the world of SOA. As such, I am looking at some "SOA frameworks/technologies", and trying to understand how to utilize them to build a highly scalable (Facebook class) website.
There are several "pains" I am trying to solve here:
Composability (+ managing dependencies, Pub/Sub)
Language-independence of services
Scalability & Performance
High availability
I looked into some technologies that answer a subset of the above criteria:
Thrift - Facebook's cross-platform RPC platform
WCF - supports SOAP, JSON & REST, so it can be considered language-interoperable. Generates WSDL files that can be use to generate java proxies.
Microsoft DSS - just inclued it in my survey, but it doesn't seem relevant as it is highly state-driven and .NET specific.
Web Services
Now, I understand how I get some aspects of composability and language-independence out of the above. But, I haven't found much concrete information (not buzz) about how to use the above / other tools for scalability and high availability. So finally I get to my question:
How does one leverage SOA technologies to solve the pains I defined above? Where can I find technical guides for that? I am looking for more than just system diagrams, but rather actual libraries, code samples, APIS...
I think the question has more to do with the concepts involved than the tools. Answers to the items:
Understand and internalize bounded context. Keeping unrelated pieces separate its important to get real reuse on different services. Related technologies won't help you on this, its you that separate the app in appropriate contexts, which you can appropriately reuse for different services.
Clear endpoint for communication based on known protocols allows implementing different pieces with different technologies
Having the operations flow like independent actions based on different protocols, gives you a lot of places where you can add tiers. Is a particular sub-process of the overall process using lots of resources and the server can't take it anymore, just move to a separate server. Load kept growing, and that server isn't taking it anymore, add an additional server and load balance. You also have more opportunity to use caching and connection pooling.
Have a critical sub-process that needs to be available all the time, add a server so you can have a fail over. Have an overall process that needs to be "available" all the time, use queues for pieces that can be processed later.
Do you really need to support that type of load? Set appropriate performance/load/interoperability targets that relate to the specific scenario. If you really need to support that type of load, I recommend you get someone on board who has dealt with it.
If it is something for a load that might eventually be, identify the bounded contexts and design the interaction between those with a SOA mindset. Keeping the code clean is all you have to do for the rest, use TDD, loose coupling, focused integration tests, etc in your code base. With good code, if you later need to separate pieces of the system, it will be a lot easier.
There are interesting and relevant things said about services and architecture by Amazon's CTO - Werner Vogels:
An interview about the Amazon technology platform
A post on his blog - "Eventually Consistent - Revisited"
A 50 min presentation about Availability & Consistency
IDesign.net has a bunch of great downloads for WCF.
Worth a look at: http://www.manning.com/davis/ ?
Check out the Mule project which bundles the CXF services stack and also the Mule REST pack which provides RESTful alternatives. I think you'll see it addresses all of your pains and there are lots of examples in the documentation as well as the distribution.
May I recommend the book: Enterprise Service Oriented Architectures published by Springer Verlag.
All of the advice here is well and good, but don't worry about it until you really need to.
Focus on building a usable, functional application that people really like. When you start running into problems, then start handling bottlenecks.
You will never be able to foresee every way an application will fail, so how can you tell if [[insert tech here]] is your answer?