How to read all the items present in an AppFabric cache

I am trying to develop a tool (in Visual Studio 2010, C#) which can read all the items present in an AppFabric cache and store them in a table. I can't use PowerShell for this.
First I thought that if I could get all the regions present in the cache, I could use the DataCache.GetObjectsInRegion method to complete my task. But I was not able to get all the region names from the cache: it does not show the user-defined region names, only the default ones, so I am giving up on this approach.
Can anyone please guide me here? My main goal is to read all the items present in a cache.

There is no single built-in method that lists every item in the cache.
You're correct, though: it is possible to list all items using GetObjectsInRegion for a named cache. You first have to know all the region names (if named regions are used), or call GetSystemRegions to get all the (default) system regions. A simple foreach then lets you list every item. When you put something into the cache without a region name, it is added to a system region.
Here is a basic example:
// Declare array for cache host(s).
DataCacheServerEndpoint[] servers = new DataCacheServerEndpoint[1];
servers[0] = new DataCacheServerEndpoint("YOURSERVERHERE", 22233);
// Setup the DataCacheFactory configuration.
DataCacheFactoryConfiguration factoryConfig = new DataCacheFactoryConfiguration();
factoryConfig.Servers = servers;
factoryConfig.SecurityProperties = new DataCacheSecurity(DataCacheSecurityMode.None, DataCacheProtectionLevel.None);
// Create a configured DataCacheFactory object.
DataCacheFactory mycacheFactory = new DataCacheFactory(factoryConfig);
// Get a cache client for the default cache
DataCache myCache = mycacheFactory.GetDefaultCache(); //or change to mycacheFactory.GetCache(myNamedCache);
// insert dummy test data
myCache.Put("key1", "myobject1");
myCache.Put("key2", "myobject2");
myCache.Put("key3", "myobject3");
// list all items in the cache: the important part
foreach (string region in myCache.GetSystemRegions())
{
    foreach (var kvp in myCache.GetObjectsInRegion(region))
    {
        Console.WriteLine("data item ('{0}','{1}') in region {2} of cache {3}", kvp.Key, kvp.Value.ToString(), region, "default");
    }
}

Related

Aggregating a huge list from reducer input without running out of memory

At the reduce stage (67% of the reduce phase), my code gets stuck and then fails after hours of trying to complete. I found out that the issue is that the reducer receives a huge amount of data that it can't handle and ends up running out of memory, which is what leaves the reducer stuck.
Now, I am trying to find a way around this. Currently, I am assembling a list from the values received by the reducer for each key. At the end of the reduce phase, I try to write the key and all of the values in the list. So my question is: how can I get the same functionality of having the key and the list of values related to that key, without running out of memory?
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.log4j.Logger;

public class XMLReducer extends Reducer<Text, Text, Text, TextArrayWritable> {
    private final Logger logger = Logger.getLogger(XMLReducer.class);

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // Collect the distinct values seen for this key.
        Set<String> filesFinal = new HashSet<>();
        for (Text value : values) {
            String[] files = value.toString().split(",\\s+");
            filesFinal.add(value.toString());
        }
        // Convert the set into a Text[] so it can be emitted as a TextArrayWritable.
        String[] temp = filesFinal.toArray(new String[filesFinal.size()]);
        Text[] tempText = new Text[temp.length];
        for (int i = 0; i < temp.length; i++) {
            tempText[i] = new Text(temp[i]);
        }
        // Write the key together with the full list of values (this is where memory blows up).
        context.write(key, new TextArrayWritable(tempText));
    }
}
and TextArrayWritable is just a way to write an array to a file.
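For reference, such a class usually follows Hadoop's standard ArrayWritable pattern; a minimal sketch (assuming that is roughly what the asker's class does, which is not shown in the question) could look like this:

import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Text;

public class TextArrayWritable extends ArrayWritable {
    public TextArrayWritable() {
        super(Text.class);          // no-arg constructor required by Hadoop's serialization
    }

    public TextArrayWritable(Text[] values) {
        super(Text.class, values);  // wrap an existing Text[] so it can be written out
    }
}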
You can try reducing the amount of data that reaches a single reducer by writing a custom Partitioner.
HashPartitioner is the default partitioner used by a MapReduce job. While it usually spreads keys fairly evenly, in some cases it is quite possible that many keys get hashed to a single reducer. As a result, that reducer ends up with a lot more data than the others. In your case, I think this is the issue.
To resolve this:
Analyze your data and the key on which you are grouping.
Try to come up with a partitioning function for your custom Partitioner based on your group-by key, and try to limit the number of keys per partition (see the sketch below).
You should then see more of the reduce tasks in your job doing useful work. If the issue is uneven key distribution, the approach I proposed should resolve it.
You could also try increasing reducer memory.
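For illustration, here is a minimal sketch of what such a custom Partitioner could look like; the class name XMLPartitioner, the Text key/value types, and the hash-mixing strategy are assumptions for the example, not something taken from the question:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Hypothetical partitioner: re-hash the key with a stronger mix than the default
// Text.hashCode() used by HashPartitioner, so keys that happen to clump onto one
// reducer get spread more evenly across all reducers.
public class XMLPartitioner extends Partitioner<Text, Text> {

    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        int h = key.toString().hashCode();
        // MurmurHash-style finalizer to break up hash clusters.
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        return (h & Integer.MAX_VALUE) % numPartitions;
    }
}

You would then register it in the driver with job.setPartitionerClass(XMLPartitioner.class) and, if appropriate, raise the number of reducers with job.setNumReduceTasks(...) so the extra partitions actually map to more reduce tasks.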

Update newly created record on client after successfully processing the Create request on server

Please consider the following scenario:
IgniteUI 16.1 igGrid powered with igGridUpdating feature and RESTDataSource
User creates a new record through modal dialog
Post request is initiated with form data
Server processes the create request and returns an object, populated with correct ID
In the success handler on the client side, the newly added grid row has to be found and updated with the correct ID returned from the server.
The ID column serves as the grid's primary key and is hidden.
What happens when a new row is being added?
We are looking at infragistics.lob-16.1.js.
In _dialogOpening(), at line 68167, _originalValues are computed via $.extend(this._originalValues, values, this._originalValues), where values = _getDefaultValues(), or in other words values.id = this._pkVal. And _pkVal is a counter that is incremented each time a new row appears.
Keeping that in mind, _endEditDialog() is called later, where newValues, representing the data entered by the user, are merged with the default values of the input form: newValues = this._getNewValuesForRow(colElements), followed by newValues = $.extend({}, prevValues, newValues), where prevValues are the same _originalValues from above.
Then _addRow() is called, which in turn calls grid.dataSource.addRow(), and a transaction is created.
My point here is that the updating feature generates the ID for the new row automatically, with ID = CurrentRowsCount + 1.
So, if the grid contains 8 records, the newly created record will automatically be assigned ID = 9. Now imagine that one of the existing records already has ID = 9: then igGridUpdating's updateRow(rowId, values) will update both rows, the existing one and the new one. And I really want to call this method in order to update the row with the data returned from the server.
How could I intervene in the whole picture and accomplish the update of the new row?
The auto-generated primary keys are only meant to cover the most basic scenarios. If your app supports row deletion, you should replace them with something that will stay unique, using the generatePrimaryKeyValue event.
Using updateRow after receiving the permanent keys from the server is the way to go; however, remember to pop the transaction from the allTransactions array so the update doesn't go to the server on the next saveChanges call.

DynamoDB - how to retrieve and delete (pop) an item?

I am working on an application written in Flask and backed by Amazon's DynamoDB accessed through boto.
For a specific use case, we need to retrieve a value from a table and then make it unavailable for other users.
However, if we retrieve and then delete the value in two separate steps, a race condition could occur between the retrieval and the deletion.
Is there any way to retrieve an item from a table and immediately delete or update it in an atomic fashion?
If your logic is simply:
get item
delete
without any additional logic to determine whether the deletion should occur, then you can actually send the delete request straight away and read the old values from the response. Here is an example (I haven't checked it; it is mostly taken from http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LowLevelJavaItemCRUD.html):
// "client" is the DynamoDB client and "tableName" the table to delete from.
HashMap<String, AttributeValue> key = new HashMap<String, AttributeValue>();
key.put("Id", new AttributeValue().withN("101"));

// ReturnValue.ALL_OLD makes DynamoDB return the item as it was before the delete,
// so the read and the delete happen in a single atomic request.
DeleteItemRequest deleteItemRequest = new DeleteItemRequest()
    .withTableName(tableName)
    .withKey(key)
    .withReturnValues(ReturnValue.ALL_OLD);

DeleteItemResult deleteItemResult = client.deleteItem(deleteItemRequest);
Map<String, AttributeValue> deletedItem = deleteItemResult.getAttributes();
Documentation:
withReturnValues
getAttributes
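If, unlike the simple case above, you do need to read the item first and only delete it when it has not changed in the meantime, a conditional delete can close that gap. Below is a rough, unverified sketch in the same style as the example above; it reuses tableName and client, assumes a 1.x Java SDK recent enough to support condition expressions, and the ItemStatus attribute and its value are made up for the illustration:

// Delete "Id = 101" only if its ItemStatus is still the value we read earlier.
HashMap<String, AttributeValue> key = new HashMap<String, AttributeValue>();
key.put("Id", new AttributeValue().withN("101"));

HashMap<String, AttributeValue> expressionValues = new HashMap<String, AttributeValue>();
expressionValues.put(":status", new AttributeValue().withS("available"));

DeleteItemRequest conditionalDelete = new DeleteItemRequest()
    .withTableName(tableName)
    .withKey(key)
    .withConditionExpression("ItemStatus = :status")
    .withExpressionAttributeValues(expressionValues)
    .withReturnValues(ReturnValue.ALL_OLD);

try {
    Map<String, AttributeValue> popped = client.deleteItem(conditionalDelete).getAttributes();
    // "popped" holds the item exactly as it was when we removed it.
} catch (ConditionalCheckFailedException e) {
    // Another client changed or removed the item between our read and our delete.
}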

AppFabric - Putting Data into Local Cache

I'm pretty new to AppFabric, and what I'm trying to understand is how to stipulate that I want data to go into the distributed cache as well as the local cache.
I read the post here, which does this based on config. I am not using any XML config, but rather building my configuration objects programmatically. I am playing around with the following code:
// Declare array for cache host(s).
List<DataCacheServerEndpoint> servers = new List<DataCacheServerEndpoint>();
servers.Add(new DataCacheServerEndpoint("SERVER1", 10023));
servers.Add(new DataCacheServerEndpoint("SERVER2", 10023));
servers.Add(new DataCacheServerEndpoint("SERVER3", 10023));
DataCacheLocalCacheProperties localCacheConfig;
TimeSpan localTimeout = new TimeSpan(0, 5, 0);
localCacheConfig = new DataCacheLocalCacheProperties(10000, localTimeout, DataCacheLocalCacheInvalidationPolicy.TimeoutBased);
// Setup the DataCacheFactory configuration.
DataCacheFactoryConfiguration factoryConfig = new DataCacheFactoryConfiguration();
factoryConfig.Servers = servers;
factoryConfig.SecurityProperties = new DataCacheSecurity(DataCacheSecurityMode.None, DataCacheProtectionLevel.None);
factoryConfig.LocalCacheProperties = localCacheConfig;
DataCacheFactory factory = DataCacheFactoryExtensions.Create(factoryConfig);
DataCache dataCache = factory.GetCache("MyCache");
dataCache.Put("myKey", "MyValue");
Am I right to assume that, because I have added the local cache config to the factoryConfig object, my cached item will automatically be added to the local cache as well as the distributed cache?
And therefore, if I want items cached only in the distributed cache, do I just drop the local cache config from the factoryConfig object?
Or do I need two separate factory config objects, one for each cache?
You can see here that, yes, the object will be stored in the local cache, if the local cache is enabled:
When local cache is enabled, the cache client stores a reference to the object locally.
The instructions for "enabling the local cache" are exactly what you've done: basically just using DataCacheLocalCacheProperties (although the local cache can also be enabled through app.config settings instead).
So it's exactly as you say: to use the distributed cache only, without the local cache, use a DataCache object obtained from a DataCacheFactory whose configuration does not set DataCacheLocalCacheProperties.
Note also that items in the local cache can be evicted depending on the policies configured:
The lifetime of an object in the local cache is dependent on several factors, such as the maximum number of objects in the local cache and the invalidation policy.

Bing Maps Rest Services Multiple Locations for Geocoding

Currently, I have an ASP application which retrieves a set of locations from a data source and then uses the Bing Maps REST services to geocode the addresses and display them in a table and on a map, in pages of 10 results at a time.
The application processes the locations sequentially ...
var geocodeRequest = "http://ecn.dev.virtualearth.net/REST/v1/Locations/" + fullAddress.replace('&', ' ').replace(',', ' ') + "?output=json&jsonp=GeocodeCallback&key=" + getCredentials;
CallRestService(geocodeRequest);
......
function GeocodeCallback(result) {
    if (result &&
        result.resourceSets &&
        result.resourceSets.length > 0 &&
        result.resourceSets[0].resources &&
        result.resourceSets[0].resources.length > 0) {
        // Set the map view using the returned bounding box
        var bbox = result.resourceSets[0].resources[0].bbox;
        var viewBoundaries = MM.LocationRect.fromLocations(new MM.Location(bbox[0], bbox[1]), new MM.Location(bbox[2], bbox[3]));
        map.setView({ bounds: viewBoundaries });
        // Add a pushpin at the found location
        MM.Location.prototype.locID = null;
        var location = new MM.Location(result.resourceSets[0].resources[0].point.coordinates[0], result.resourceSets[0].resources[0].point.coordinates[1]);
        location.locID = tableRowIndex;
        locs.push(location);
.....
Is there any way to speed this up by passing 10 locations in one call and then processing result.resourceSets[0], result.resourceSets[1], etc.?
How would multiple addresses be passed into the REST services call? (comma delimited?)
Thanks
Bing has two REST-accessible geocoding APIs. One of them is the one you're using, which only supports one address at a time; the other is the Dataflow API, which is designed for high-volume batch processing. Neither really seems right for you as your system is currently designed.
Depending on where you're getting your street addresses from (all you mention is 'a datasource'), you might be able to do a big batch geocode for all the locations in your datasource: move the geocoding from request time to a batch process, and just use the request-time geocoding for the ones the batch process hasn't gotten to yet.
There is no way of doing this as things look right now. It has been proposed to support this natively in JavaScript (I think), but I do not believe it has been implemented yet. If you want some concurrency, you could look at web workers:
http://en.wikipedia.org/wiki/Web_Workers
https://developer.mozilla.org/En/Using_web_workers
But web workers are not supported in IE yet. Maybe you could also check out the HTML5 async attribute; I do not know whether it could be used on the script element that is created when you call the REST services.