AppFabric local cache tracing with ETW

We have a problem where, after some time in production, the AppFabric caching client becomes very slow. Since we use a local cache, we suspected something was wrong there. When I examined the entries in the AppFabric client trace, I saw an entry containing "Count = 450000"; that number is very high and grows exponentially every day. The local cache settings are: objectCount="500000" ttl="300".
1) How can I tell how many items are in the local cache? Does the "Count = x" in the trace entry mean that the local cache holds x items?
2) What could cause the number of items in the local cache to grow toward 500000 while the item count on the server is only about 4000?
thanks

Related

Django expire cache every N'th HTTP request

I have a Django view whose response can be cached, but the cache needs to be recycled on every 100th HTTP request to that view.
I cannot use interval-based caching here, since the time it takes to reach 100 requests keeps changing with traffic.
How would I implement this? Are there any other clean approaches besides maintaining a counter (in the db)?
Here are some ideas / feedback:
You're going to have to centralize something if you need the count to be exact - the Redis idea in the linked solution below looks OK if you can't put it in the main DB. If Redis is in your stack, I'd use that. If the 100 requests can be counted per user and you're using sessions, you could attach a counter to the session.
implementing a counter that counts requests with django
Not centralizing the counter outside of the web server would mean your app has to be, and stay, single-threaded to keep the count in memory, and the count would reset whenever the server restarts. Not a great idea IMO...
If you really can't make it work with anything else, you could hack something like a request counter onto your load balancer (...if the load balancer is a single machine you control and you're comfortable doing that) and pass it to Django as a header.
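A minimal sketch of the Redis-counter idea, assuming redis-py and Django's cache framework are already set up; the key names and the render_expensive_body helper are placeholders, not part of any existing API:

    import redis
    from django.core.cache import cache
    from django.http import HttpResponse

    r = redis.Redis()  # assumes Redis is already in your stack

    CACHE_KEY = "my_view:response"       # hypothetical cache key for the view
    COUNTER_KEY = "my_view:hit_counter"  # hypothetical Redis counter key

    def my_view(request):
        # INCR is atomic, so all workers/processes share one exact count.
        hits = r.incr(COUNTER_KEY)
        if hits % 100 == 0:
            # Every 100th request: drop the cached response so it gets rebuilt.
            cache.delete(CACHE_KEY)

        body = cache.get(CACHE_KEY)
        if body is None:
            body = render_expensive_body(request)  # placeholder for the real work
            cache.set(CACHE_KEY, body, timeout=None)  # no time-based expiry
        return HttpResponse(body)

The same pattern works for the per-user variant: swap the Redis INCR for a counter kept in request.session if the "every 100th request" rule is per user rather than global.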

AppFabric Syncing Local Caches

We have a very simple AppFabric setup with two clients -- let's call them Server A and Server B. Server A is also the lead cache host, and both Server A and Server B have a local cache enabled. We'd like to be able to update an item from Server B and have that change propagate to the local cache of Server A within 30 seconds (for example).
As I understand it, there appear to be two different ways of getting changes propagated to the clients:
Set a timeout on the client cache to evict items every X seconds. On the next request for an item, the client fetches it from the host cache because the local cache no longer has it.
Enable notifications and effectively subscribe to get updates from the cache host
If my requirement is to get updates to all clients within 30 seconds, then setting a timeout of less than 30 seconds on the local cache appears to be the only choice with option #1 above. Given the size of the cache, evicting all of it (99.99% of which probably hasn't changed in the last 30 seconds) would be inefficient.
I think what we need to implement is option #2 above, but I'm not sure I understand how it works. I've read the MSDN documentation (http://msdn.microsoft.com/en-us/library/ee808091.aspx) and looked at some examples, but it is still unclear to me whether it is really necessary to write custom code, or whether that is only needed if you want to do extra handling.
So my question is: is it necessary to add code to your existing application if you want updates propagated to all local caches via notifications, or is the callback feature just a bonus way of adding extra handling when a notification is pushed down? Can I just enable notifications, set the appropriate polling interval on the client, and have things just work?
It seems like the default behavior (when Notifications are enabled) should be to pull down fresh items automatically at each polling interval.
I ran some tests and am happy to say that you do NOT need to write any code to keep all the clients in sync. On the cluster side, notifications have to be enabled for the cache itself in the cluster configuration.
In the client config you then need to set sync="NotificationBased" on the localCache element.
The clientNotification element in the client config tells the client, via its pollInterval attribute, how often it should check for new notifications on the server. With a poll interval of 15, every 15 seconds the client will check for notifications and pull down any items that have changed.
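For reference, a sketch of what the client-side dataCacheClient section could look like with notification-based sync; the host name, cache port, object count and queue length below are placeholders, and the cache itself must have been created on the cluster with notifications enabled (e.g. via New-Cache with -NotificationsEnabled):

    <dataCacheClient>
      <hosts>
        <host name="CacheServerA" cachePort="22233" />  <!-- placeholder host -->
      </hosts>
      <localCache isEnabled="true" sync="NotificationBased" objectCount="100000" ttlValue="300" />
      <clientNotification pollInterval="15" maxQueueLength="10000" />  <!-- check every 15 seconds -->
    </dataCacheClient>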
I'm guessing the callback logic that you can add to your app is just in case you want to add your own special logic (like emailing the president every time an item changes in the cache).

Appfabric local cache notifications

I want to check that my understanding of AppFabric local cache invalidation is correct.
Assume I have notification-based invalidation set up on my local cache.
The default polling interval is 5 minutes.
Which way does the polling occur? I believe the local cache polls the distributed cache to check for notifications, is this correct?
Does that mean that if a change occurs to the distributed cache it could be anywhere up to 5 minutes before that item in the local cache is invalidated depending on when the last sync occurred?
Is there any way to see the last synced time, through PowerShell or some other mechanism?
Yes, the local cache polls the server every pollInterval; the interval can be customized.
Yes, that's correct
I doubt it is possible through PowerShell. There may be some trace events if you use Set-CacheLogging, but I haven't tried it. What will definitely work is subscribing to cache notifications directly in code and putting a breakpoint in the callback.

Redis is taking too long to respond

We are experiencing very high response latency with Redis, to the point of not being able to get any output from the info command through redis-cli.
This server handles requests from around 200 concurrent processes, but it does not store very much data (at least to our knowledge). When the server is responsive, the info command reports used memory of around 20-30 MB.
When running top on the server, during periods of high response latency, CPU usage hovers around 95 - 100%.
What are some possible causes for this kind of behavior?
It is difficult to propose an explanation based only on the provided data, but here is my guess. I assume you have already checked the obvious latency sources (the ones linked to persistence), that no Redis command is hogging the CPU in the slow log, and that the size of the job data pickled by Python-rq is not huge.
According to the documentation, Python-rq inserts jobs into Redis as hash objects and lets Redis expire the related keys (500 seconds seems to be the default value) to get rid of finished jobs. If you have serious throughput, at some point you will have many items in Redis waiting to be expired, and their number will be high compared to the number of pending jobs.
You can check this point by looking at the number of keys with an expiration set in the output of the INFO command.
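For example, with redis-py (default host and port assumed), the keyspace section of INFO reports, per database, the total number of keys and how many of them carry an expiration:

    import redis

    r = redis.Redis()  # adjust host/port/db for your setup
    for db, stats in r.info("keyspace").items():
        print(db, "keys =", stats["keys"], "with TTL =", stats["expires"])

A very large "expires" count relative to the number of pending jobs is the symptom described above.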
Redis expiration is based on a lazy mechanism (applied when a key is accessed) plus an active mechanism based on key sampling, which is run from the event loop (in pseudo-background mode, every 100 ms). The point is that while the active expiration mechanism is running, no Redis command can be processed.
To avoid impacting the performance of the client applications too much, only a limited number of keys is processed each time the active mechanism is triggered (by default, 10 keys). However, if more than 25% of the sampled keys turn out to be expired, it tries to expire more keys and loops. This is how this probabilistic algorithm automatically adapts its activity to the number of keys Redis has to expire.
When many keys are to be expired, this adaptive algorithm can significantly impact the performance of Redis, though. You can find more information here.
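To make the adaptive behaviour concrete, here is a rough Python illustration of the loop described above; the constants (10 keys per pass, 25% threshold) are the figures quoted in this answer rather than values taken from the Redis source, and sample_keys_with_ttl / delete_key stand in for Redis internals:

    SAMPLE_SIZE = 10          # keys examined per pass (per the answer above)
    REPEAT_THRESHOLD = 0.25   # keep looping while more than 25% of the sample was expired

    def active_expire_cycle(sample_keys_with_ttl, delete_key, now):
        # Invoked roughly every 100 ms from the event loop; while it runs,
        # no client command is processed, which is where the latency comes from.
        while True:
            sample = sample_keys_with_ttl(SAMPLE_SIZE)  # [(key, expire_at), ...]
            if not sample:
                return
            expired = [key for key, expire_at in sample if expire_at <= now]
            for key in expired:
                delete_key(key)
            # If most sampled keys were already expired, many more are probably
            # waiting, so take another pass instead of yielding back to clients.
            if len(expired) / len(sample) <= REPEAT_THRESHOLD:
                return

With a burst of jobs whose TTLs all land at about the same time, the more-than-25% condition stays true for many consecutive passes, so the cycle keeps looping and monopolizes the event loop instead of returning after one small sample.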
My suggestion would be to try to prevent Python-rq from delegating item cleanup to Redis via key expiration. Relying on expiration for this is a poor design for a queuing system anyway.
I think reducing the TTL is not the right way to avoid the CPU usage caused by Redis expiring keys.
Didier makes a good point: the current architecture of Python-rq delegates the cleanup of jobs to Redis by using the key-expire feature, and, as Didier said, that is not the best way. (This is only used when result_ttl is greater than 0.)
The problem should then arise when you have a set of keys/jobs whose expiration dates are close to one another, which can happen when you have bursts of job creation.
But Python-rq sets the expiration only when a job has finished in a worker,
so this doesn't make much sense: the keys should be spread out over time, with enough time between them to avoid this situation.

Windows Server AppFabric Cache Time-out based invalidation callback

I am using Windows Server AppFabric Caching in our application with local cache enabled.
This is configured as following:
<localCache isEnabled="true" sync="TimeoutBased" objectCount="1000" ttlValue="120"/>
I have set up time-out based invalidation with a time-out interval of 120 seconds.
With this configuration, the local cache removes items from the in-memory cache every 120 seconds and then retrieves them again from the cache cluster. Is it possible to add a callback that gets fired whenever the local cache has to go to the cache cluster to retrieve an item instead of serving it locally?
Unfortunately, there is no way to know whether data was fetched locally or not. There are cache server notifications, but they are not reliable.
In your scenario, a good approach could be the Read-Through and Write-Behind feature. It does not fit all situations, but you can take a quick look.
Here are some links:
http://msdn.microsoft.com/en-us/library/hh377669.aspx
http://blogs.msdn.com/b/prathul/archive/2011/12/06/appfabric-cache-read-from-amp-write-to-database-read-through-write-behind.aspx