Synchronize Infinispan cache entries with database - jpa-2.0

I want to know whether I can use Infinispan for cached data synchronization with Oracle database. This is my scenario. I have two main applications. One is highly concurrent use app and the second is used as admin module. Since it is highly concurrent I want to reduce database connections (Load the entities into cache (read write enable) and use it from this place without calling database). But meantime I want to update database according to the cache changes because admin module is using database directly. Can that update process (cache to database) handle in entity level without involving application? Please let me know whether Infinispan supports this scenario or not. If supports please share ideas.

Yes, it is possible. Infinispan supports this use case.
This should be simple configuration "problem" only. All you need to use is properly configured CacheStore with passivation disabled. It will keep your cache (used by highly concurrent application) synchronized with database.
What does it exactly cause?
When passivation is disabled, whenever an element is modified, added
or removed, then that modification is persisted in the backend store
via the cache loader. There is no direct relationship between eviction
and cache loading. If you don't use eviction, what's in the persistent
store is basically a copy of what's in memory.
By memory is meant a cache here.
If you want to know even more about this and about other interesting options please see: https://docs.jboss.org/author/display/ISPN/Cache+Loaders+and+Stores#CacheLoadersandStores-cachepassivation
Maybe it's worth to consider aforementioned eviction. Whether to disable or enable it. It depends mainly on load generated by your highly concurrent application.

Actually, this only works when you use Infinispan in the same cluster for the admin module. If you load A in memory with Infinispan, change A to something else in the database directly with the admin module, then Infinispan will not know A has been updated.

Related

Does django use cache by default?

Is their predefined default caching system for django projects if nothing is configured ?
If yes. how it works ?
If you configure your view or method to use cache, it will use Local Memory Cache. It has separate in-memory thread-safe cache for each process. Quite nice temporal solution.
From Django cache documentation
Local-memory caching
This is the default cache if another is not specified in your settings
file. If you want the speed advantages of in-memory caching but don’t
have the capability of running Memcached, consider the local-memory
cache backend. This cache is per-process (see below) and thread-safe.

Should I implement revisioning using database triggers or using django-reversion?

We're looking into implementing audit logs in our application and we're not sure how to do it correctly.
I know that django-reversion works and works well but there's a cost of using it.
The web server will have to make two roundtrips to the database when saving a record even if the save is in the same transaction because at least in postgres the changes are written to the database and comitting the transaction makes the changes visible.
So this will block the web server until the revision is saved to the database if we're not using async I/O which is currently the case. Even if we would use async I/O generating the revision's data takes CPU time which again blocks the web server from handling other requests.
We can use database triggers instead but our DBA claims that offloading this sort of work to the database will use resources that are meant for handling more transactions.
Is using database triggers for this sort of work a bad idea?
We can scale both the web servers using a load balancer and the database using read/write replicas.
Are there any tradeoffs we're missing here?
What would help us decide?
You need to think about the pattern of db usage in your website.
Which may be unique to you, however most web apps read much more often than they write to the db. In fact it's fairly common to see optimisations done, to help scaling a web app, which trade off more complicated 'save' operations to get faster reads. An example would be denormalisation where some data from related records is copied to the parent record on each save so as to avoid repeatedly doing complicated aggregate/join queries.
This is just an example, but unless you know your specific situation is different I'd say don't worry about doing a bit of extra work on save.
One caveat would be to consider excluding some models from the revisioning system. For example if you are using Django db-backed sessions, the session records are saved on every request. You'd want to avoid doing unnecessary work there.
As for doing it via triggers vs Django app... I think the main considerations here are not to do with performance:
Django app solution is more 'obvious' and 'maintainable'... the app will be in your pip requirements file and Django INSTALLED_APPS, it's obvious to other developers that it's there and working and doesn't need someone to remember to run the custom SQL on the db server when you move to a new server
With a db trigger solution you can be certain it will run whenever a record is changed by any means... whereas with Django app, anyone changing records via a psql console will bypass it. Even in the Django ORM, certain bulk operations bypass the model save method/save signals. Sometimes this is desirable however.
Another thing I'd point out is that your production webserver will be multiprocess/multithreaded... so although, yes, a lengthy db write will block the webserver it will only block the current process. Your webserver will have other processes which are able to server other requests concurrently. So it won't block the whole webserver.
So again, unless you have a pattern of usage where you anticipate a high frequency of concurrent writes to the db, I'd say probably don't worry about it.

how to directly access appfabric cache without proxy?

Start to learn and use appfabric cache.
From the whitepaper, http://msdn.microsoft.com/en-us/library/gg186017%28v=azure.10%29.aspx, it says:
Bulk get calls result in better network utilization.Direct cache access is much faster than proxies (ASP.NET, WCF).
I am not sure what this means. What is a proxy in appfabric world?
We do websites base on asp.net/mvc, so if we write some logic to access our abpfabric cluster, it will be called from asp.net/mvc code?
Many Thanks
If you look at the document refernced by that page it explains what is meant by caching:
In some cases, the cache client is wrapped and accessed via a proxy
with additional application or domain logic. Oftentimes, performance
of such applications is much different from the Windows Server
AppFabric Cache cluster itself. The goal of the tests in this category
is to show performance of a middle tier application with additional
logic and compare it with performance of direct access to the cache.
To accomplish the goal, a simple WCF application was implemented that
provided access to the cache and contained additional logic of
populating the cache from an external data source if the requested
object is not yet in the cache.
The document contains details on how this affected performance, but if you need more detail the source code used is available.
Using the DataCacheFactory (and/or AppFabric Session provider) from your MVC site will access the cache cluster directly, once you've granted access to the Application Pool user.

C++ InMemory Cache Library

we have an application which utilizes a UDP service , and observe that 75% of calls to this UDP service is repeated.
Hence would wish to apply an In-Memory Cache , so as to avoid the costly network call and improve the application's turn-around time.
Never used caching before , hence any pointers on suitable cache libraries in C++/Unix environ would greatly help.
Also would like to share this cache across multiple processes.
The cache is required to store key value pairs of string type.
Have a look at redis, it's a noSQL key-value database. Here you have an interactive tutorial. We use it in several of our applications successfully.
Gemfire (no relation) is a commercial distributed caching system. Servers are written in Java but native support exists for C++ (among others).

Offline web application

I’m thinking about building an offline-enabled web application.
The architecture I’m considering is as follows:
Web server (remote) <--> Web server/cache (local) <--> Browser/Prism
The advantages I envision for this model are:
Deployment is web-based, with all the advantages of this approach
Offline-enabled
UI (html/js) synchronization is a non-issue
Data synchronization can be mostly automated
as long as I stay within a RESTful paradigm
I can break this as required but manual synchronization would largely remain surgical
The local web server is started as a service; I can run arbitrary code, including behind-the-scene data synchronization
I have complete control of the data (location, no size limit, no possibility of user deleting unknowingly)
Prism with an extension could allow to keep the javascript closed source
Any thoughts on this architecture? Why should I / shouldn’t I use it? I'm particularly looking for success/horror stories.
The long version
Notes:
Users are not very computer-literate.
For instance, even superficially
explaining how Gears works is totally
out of the question.
I WILL be held liable if data is loss, even if it’s really the users fault (short of him deleting random directories on his machine)
I can require users to install something on their machine. It doesn’t have to be 100% web-based and/or run in a sandbox
The common solutions to this problem don’t feel adequate somehow. Here is a short analysis of each.
Gears/HTML5:
no control over data, can be deleted
by users without any warning
no
control over location of data (not
uniform across browsers and
platforms)
users need to open application in browser for synchronization to happen; no automatic, behind-the-scene synchronization
different browsers are treated differently, no uniform view of data on a single machine
limited disk space available
synchronization is completely manual, sql-based storage makes this a pain (would be less complicated if sql tables were completely replicated but it’s not so in my case). This is a very complex problem.
my code would be almost completely open sourced (html/js)
Adobe AIR:
some of the above
no server-side includes (!)
can run in the background, but not windowless
manual synchronization
web caching seems complicated
feels like a kludge somehow, I’ve had trouble installing on some machines
My requirements are:
Web-based (must). For a number of
reasons, sharing data between users
for instance.
Offline (must). The application must be fully usable offline (w/ some rare exceptions).
Quick development (must). I’m a single developer going against players with far more business resources.
Closed source (nice to have). Yes, I understand the open source model. However, at this point I don’t want competitors to copy me too easily. Again, they have more resources so they could take my hard work and make it better in less time than I could myself. Obviously, they can still copy me developing their own code -- that is fine.
Horror stories from a CRM product:
If your application is heavily used, storing a complete copy of its data on a user's machine is unfeasible.
If your application features data that can be updated by many users, replication is not simple. If three users with local changes synch, who wins?
In reality, this isn't really what users want. They want real-time access to the most current data from anywhere. We had better luck offering a mobile interface to a single source of truth.
The part about running the local Web server as a service appears unwise. Besides the fact that you are tied to certain operating environments that are available in the client, you are also imposing an additional burden of managing the server, on the end user. Additionally, the local Web server itself cannot be deployed in a Web-based model.
All in all, I am not too thrilled by the prospect of a real "local Web server". There is a certain bias to it, no doubt since I have proposed embedded Web servers that run inside a Web browser as part of my proposal for seamless off-line Web storage. See BITSY 0.5.0 (http://www.oracle.com/technology/tech/feeds/spec/bitsy.html)
I wonder how essential your requirement to prevent data loss at any cost is. What happens when you are offline and the disk crashes? Or there is a loss of device? In general, you want the local cache to be the least farther ahead of the server, but be prepared to tolerate loss of data to the extent that the server is behind the client. This may involve some amount of contractual negotiation or training. In practice this may not be a deal-breaker.
The only way to do this reliably is to offer some sort of "check out and lock" at the record level. When a user is going remote they must check out the records they want to work with. This check out copied the data to a local DB and prevents the record in the central DB from being modified while the record is checked out.
When the roaming user reconnects and check their locked records back in the data is updated on the central DB and unlocked.