@apollo/react-hooks: how to query the cache first?

I'm looking for a convenient way to set up my Apollo hooks to query the cache first and, if the cache is empty, to make the endpoint call.
I can't seem to find the right documentation for this. I have only found how to query the cache and how to do a normal query.

Apollo Client allows you to specify the fetchPolicy for an individual query made using useQuery. Possible values include:
cache-first: This is the default value where we always try reading data from your cache first. If all the data needed to fulfill your query is in the cache then that data will be returned. Apollo will only fetch from the network if a cached result is not available. This fetch policy aims to minimize the number of network requests sent when rendering your component.
cache-and-network: This fetch policy has Apollo first try to read data from your cache. If all the data needed to fulfill your query is in the cache then that data will be returned. However, regardless of whether or not the full data is in your cache, this fetchPolicy will always execute the query with the network interface, unlike cache-first, which will only execute your query if the query data is not in your cache. This fetch policy optimizes for users getting a quick response while also trying to keep cached data consistent with your server data, at the cost of extra network requests.
network-only: This fetch policy will never return your initial data from the cache. Instead it will always make a request using your network interface to the server. This fetch policy optimizes for data consistency with the server, but at the cost of an instant response to the user when one is available.
cache-only: This fetch policy will never execute a query using your network interface. Instead it will always try reading from the cache. If the data for your query does not exist in the cache then an error will be thrown. This fetch policy allows you to only interact with data in your local client cache without making any network requests which keeps your component fast, but means your local data might not be consistent with what is on the server. If you are interested in only interacting with data in your Apollo Client cache also be sure to look at the readQuery() and readFragment() methods available to you on your ApolloClient instance.
no-cache: This fetch policy will never return your initial data from the cache. Instead it will always make a request using your network interface to the server. Unlike the network-only policy, it also will not write any data to the cache after the query completes.
Since cache-first is the default, there's nothing else you need to do with your hook to get the desired behavior.

Related

Is there a way to tell when AWS Amplify Datastore is initialized or ready to be queried?

I have an application that needs to update the UI with the results of an Amplify Datastore query. I am making the query as soon as the component mounts/renders, but the results of the query are empty even though I know there is available data. If I add a timeout of 1 second or greater before making the query, then the query returns the expected data. My hunch is that this is because the query is returning an empty set of data before the response from the delta sync table, which shows there is data to be fetched, is returned.
Is there any type of event provided by Datastore that would allow me to wait until the data store is initialized or has data to query before making the query?
I understand that I could use the .observe functionality of datastore for a similar effect, but this is currently not an option.
First, if you do not call the Datastore start method, the sync from the backend only starts when the first query is submitted. Queries run against the local store, so at that point the data won't be there yet.
Second, Datastore publishes events on the Amplify Hub so that you can monitor changes, such as a set of data being synced, Datastore being ready, and Datastore being ready with all data synced locally.
See the documentation on Datastore.start and the documentation for Datastore events for more information.

Is Redis atomic when multiple clients attempt to read/write an item at the same time?

Let's say that I have several AWS Lambda functions that make up my API. One of the functions reads a specific value from a specific key on a single Redis node. The business logic goes as follows:
if the key exists:
    serve the value of that key to the client
if the key does not exist:
    get the most recent item from dynamoDB
    insert that item as the value for that key, and set an expiration time
    delete that item from dynamoDB, so that it only gets read into memory once
    serve the value of that key to the client
The idea is that every time a client makes a request, they get the value they need. If the key has expired, then lambda needs to first get the item from the database and put it back into Redis.
But what happens if 2 clients make an API call to lambda simultaneously? Will both lambda processes read that there is no key, and both will take an item from a database?
My goal is to implement a queue where a certain item lives in memory for only X amount of time, and as soon as that item expires, the next item should be pulled from the database, and when it is pulled, it should also be deleted so that it won't be pulled again.
I'm trying to see if there's a way to do this without having a separate EC2 process that's just keeping track of timing.
Is redis+lambda+dynamoDB a good setup for what I'm trying to accomplish, or are there better ways?
A Redis server will execute commands (or transactions, or scripts) atomically. But a sequence of operations involving separate services (e.g. Redis and DynamoDB) will not be atomic.
One approach is to make them atomic by adding some kind of lock around your business logic. This can be done with Redis, for example.
However, that's a costly and rather cumbersome solution, so if possible it's better to simply design your business logic to be resilient in the face of concurrent operations. To do that you have to look at the steps and imagine what can happen if multiple clients are running at the same time.
In your case, the flaw I can see is that two values can be read and deleted from DynamoDB, one writing over the other in Redis. That can be avoided by using Redis's SETNX (SET if Not eXists) command. Something like this:
1. GET the key from Redis
2. If the value exists:
   - Serve the value to the client
3. If the value does not exist:
   - Get the most recent item from DynamoDB
   - Insert that item into Redis with SETNX
   - If the key already exists, go back to step 1
   - Set an expiration time with EXPIRE
   - Delete that item from DynamoDB
   - Serve the value to the client
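The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: FakeRedis and FakeItemStore are hypothetical in-memory stand-ins so the example runs without any servers, and serve_current_item, the key name and the TTL are made up for the example. One deliberate deviation from the list: instead of SETNX followed by EXPIRE, the sketch uses a single SET with the NX and EX options (redis-py's set() accepts nx= and ex=), so the key can never be left without an expiry if the process dies between the two commands.

```python
import time


class FakeRedis:
    """In-memory stand-in for a Redis client (hypothetical, illustration only)."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp or None)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # lazily drop expired keys
            return None
        return value

    def set(self, key, value, nx=False, ex=None):
        # Same call shape as redis-py's set(name, value, nx=..., ex=...).
        if nx and self.get(key) is not None:
            return None  # key already exists, nothing is written
        expires_at = time.monotonic() + ex if ex is not None else None
        self._data[key] = (value, expires_at)
        return True


class FakeItemStore:
    """Stand-in for the DynamoDB table of queued items (hypothetical)."""

    def __init__(self, items):
        self._items = list(items)

    def most_recent(self):
        return self._items[-1] if self._items else None

    def delete(self, item):
        if item in self._items:
            self._items.remove(item)


def serve_current_item(cache, store, key="current-item", ttl_seconds=60):
    while True:
        value = cache.get(key)  # step 1
        if value is not None:
            return value  # step 2: the key exists, serve it
        item = store.most_recent()  # step 3
        if item is None:
            return None  # nothing left to promote
        if cache.set(key, item, nx=True, ex=ttl_seconds):
            store.delete(item)  # only the client that won the SET deletes
            return item
        # Another client set the key first: loop back to step 1.


cache = FakeRedis()
store = FakeItemStore(["item-1", "item-2", "item-3"])
first = serve_current_item(cache, store)
second = serve_current_item(cache, store)
```

Here both calls return the same item and only one item is removed from the store. With a real Redis client, get() returns bytes unless the client is configured to decode responses, so the comparison logic may need adjusting.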

Real Time Google Analytics API - Identify user session

I'm retrieving event data using the Real Time Google Analytics API, so as to trigger responses each time conditions are met while the user navigates.
This is my actual query on Google Analytics Real Time API (which works perfectly!)
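def get_realtime_events(service, profile_id):
    # Wrapper added so the fragment is valid Python (the function name is
    # illustrative). Returns event totals broken down by action, label
    # and category, limited to 25 rows.
    return service.data().realtime().get(
        ids='ga:' + profile_id,
        metrics='rt:totalEvents',
        dimensions='rt:eventAction,rt:eventLabel,rt:eventCategory',
        max_results='25').execute()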
I'd like to show results grouped by each particular session or user. So as to trigger a message to this particular user if some conditions are met.
Is that possible? And if so, how do I apply this criterion to this query?
"Trigger a message to a particular user" would imply that you either have personally identifiable data stored in GA, which would violate Google's TOS, or that you map an anonymous ID (clientid or UserID or similar) to a key stored in an external database (which might be legally murky, depending on your jurisdiction). Since I don't want to throw away the answer I had written before reading your question to the end :-) I am going to assume the latter.
So, is that possible? No, not really. By default GA exposes neither an identifier for the user (client id or user id) nor one for the session (a session identifier is present only in the BigQuery export schema).
The realtime API has a very limited set of dimensions (mostly I think because data aggregation does not happen in realtime), so you can't even use custom dimensions. Your only chance would be to overwrite one of the standard fields, i.e. campaign information.
Of course this destroys the original data in the field. So you should use an extra view for the API query, send a custom dimension with the user identifier along, and then use an advanced filter to copy the custom dimension value to a standard field (while your original data is safe in your other data views). This is a bit hackish, though.
Also the realtime API only displays the current hit per user, so you cannot group by user in the query in any case - you'd need to download and store the data to an external database and do your aggregation there.

Preventing Ember Data Caching & loading model data on demand

We are considering moving from Backbone to Ember. There are a few issues, though, that I can't get answers to from the docs.
1) Ember-Data caches its data. Our application is multi-user, so other users need to be able to see new records created by everyone. Is there a way around this? I read on another post that when a query string is passed, Ember Data does not cache data. Is this true? If it is, can I then just always send a query string so that nothing will be cached?
2) Ember Data has a single model in the router that appears to be instantiated at route load time. I can see that you can request data from multiple sources by returning an object with many this.store.find calls. Say I have a select element and when you select an option, another select gets populated with items based on the first select (which requires a call back to the server). How would that work? How can I get model data on demand (not at route load time)?
I'm not sure if it answers your question but you can always call
model.reload()
to refetch data from server so you can work with up to date data.
You may want to consider Faye (http://faye.jcoglan.com/), which would let you have a pub/sub setup that could update your store by listening to topics of interest. This uses WebSocket for the streaming interface. You could then put new objects into the store, remove or update existing objects which the server could publish to the client.

SOA/Web Service Pagination

In SOA we should not be building or holding state (or designing dependencies) between client and server. This is understood. But what patterns can be followed in the case that a client wants to consume a real-time service that may return an open ended number of 'rows'?
Web applications, similar to SOA but allowing for state (sessions) have solved this with pagination. Pagination requires (in most cases, especially with SQL) that the server holds the data and that the client request the data in chunks.
If we were to consider pagination-like scenarios for web services, what patterns would these follow that would still allow the tenets of SOA to be adhered to (or as closely as possible)?
Some rules for the thinkers:
1) Backed by a SQL database (therefore there is no concept of a row number in a select set)
2) It is important to not skip a row or duplicate a row in a set during pagination
3) Data may be inserted and deleted at any time into the database by other clients
4) There is no need to consider the dataset a live (update-able) dataset
Personally, I think that 1 and 2 above already spell out the solution by constraining the solution space with the requirements.
My proposed solution would have the data (as much as is selected) be stored in a read-only store/cache where it can be assigned a row number within the result set, allowing pagination to occur on this data snapshot. I would have infrastructure to store snapshots (servers, external caches, memcached or ehcache; this must scale quite large). The result of such a query would be a snapshot ID, and clients could retrieve the data from the snapshot using a snapshot API (web services) and the snapshot ID. Results would be processed in a read-only, forward-only manner for x records at a time, where x is something reasonable.
Competing thoughts and ideas, criticisms or accolades would be greatly appreciated.
Paginated results in a Web Service is actually quite easy to achieve.
All you have to do is add two parameters to the web service call: Page Size, Page Number.
Page Size is the number of results to include in a page. Page Number is the number of the page of results you are looking for.
Your web service then goes back to the database (or cache), retrieves the results, figures out which results fit on the requested page, and returns only those results.
The client then has to make a single request per page of results they want from the service.
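A minimal sketch of the Page Size / Page Number idea, assuming a SQL backend. It uses Python's built-in sqlite3 module with an in-memory database; the items table, its columns and the get_page function are hypothetical names chosen for the example, not part of any particular service.

```python
import sqlite3


def get_page(conn, page_size, page_number):
    """Return one page of rows; page_number is 1-based."""
    offset = (page_number - 1) * page_size
    cur = conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return cur.fetchall()


# Demo data: seven rows in an in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"row-{i}",) for i in range(1, 8)],
)

page_1 = get_page(conn, page_size=3, page_number=1)  # rows 1 to 3
page_3 = get_page(conn, page_size=3, page_number=3)  # the last, partial page
```

Because every call recomputes the offset against the live table, rows inserted or deleted between two requests can shift later pages, which is exactly the skip/duplicate problem the question's rule 2 wants to avoid.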
What you propose with memcached will also work with a caching table. The first service call would (1) INSERT results INTO the caching table with a snapshot ID (2) return the first page from the caching table and the snapshot ID. Subsequent calls would return pages based on page size and page number by querying the caching table using the snapshot ID.
I should think this could also be optimized by using an in-memory caching table, but that depends on whether your database supports INSERT-INTO from a disk table to an in-memory table. That might get complicated in a clustered environment though.
Such a cache is stateful by its very nature if you are retaining a client-specific copy between requests, whether storage is in a session object, database table or memcached data store. Given the requirements though, you have no choice but to cache results in some form or another, unless you are willing to risk returning deleted or no-longer-relevant records as legitimate results.
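The caching-table variant described above might look roughly like this. It is a sketch under assumptions: the snapshot_rows table, its columns, and the create_snapshot and get_snapshot_page functions are all hypothetical names, and sqlite3 stands in for whatever database is actually used.

```python
import sqlite3
import uuid


def create_snapshot(conn):
    """Copy the current result set into the caching table; return a snapshot ID."""
    snapshot_id = uuid.uuid4().hex
    rows = conn.execute("SELECT id, name FROM items ORDER BY id").fetchall()
    conn.executemany(
        "INSERT INTO snapshot_rows (snapshot_id, row_num, item_id, name) "
        "VALUES (?, ?, ?, ?)",
        [(snapshot_id, n, item_id, name)
         for n, (item_id, name) in enumerate(rows, start=1)],
    )
    return snapshot_id


def get_snapshot_page(conn, snapshot_id, page_size, page_number):
    """Read one 1-based page from a snapshot by its stored row numbers."""
    first = (page_number - 1) * page_size + 1
    last = first + page_size - 1
    return conn.execute(
        "SELECT item_id, name FROM snapshot_rows "
        "WHERE snapshot_id = ? AND row_num BETWEEN ? AND ? ORDER BY row_num",
        (snapshot_id, first, last),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE snapshot_rows ("
    "snapshot_id TEXT, row_num INTEGER, item_id INTEGER, name TEXT, "
    "PRIMARY KEY (snapshot_id, row_num))"
)
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"row-{i}",) for i in range(1, 6)],
)

snap = create_snapshot(conn)
# Another client changes the live table after the snapshot was taken.
conn.execute("DELETE FROM items WHERE id = 2")
conn.execute("INSERT INTO items (name) VALUES ('row-new')")

snap_page_1 = get_snapshot_page(conn, snap, page_size=2, page_number=1)
snap_page_3 = get_snapshot_page(conn, snap, page_size=2, page_number=3)
```

The pages stay stable even though the live table changed, at the price of storing a per-query copy that eventually has to be expired or cleaned up.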
SOA is not meant for such low level functionality.
SOA is meant to glue together business areas, not frontends to backends. Just because your application talks to the back end using web services does not mean you have a "SOA" application. That is nonsense, since SOA is meaningless in the context of one isolated system.
From that point of view, it is then clear that, in SOA, the caller should not know about the SQL table you are paginating; that's an implementation detail that SOA should hide. On the other hand, the server should not know about the client's state, because it should be agnostic to the details of the clients in order to be really open.
So, just understand that pagination is not SOA. Do as you wish, just understand that the web service you are using to paginate is an internal artifact of your application, not to be used by external clients on a SOA bus. Also remember that it cannot be transactionally consistent without state on the server. The problem is probably that you have only one service layer for both the application's UI and the SOA bus; you need to separate them.
Using this web service on a SOA bus would be bad. It cannot be consistent as the user paginates, and as other applications hook into it they become tied to the specific SQL.
... then you might as well have granted direct SQL access to the table for all that matters.
SOA is for business messages between systems, not to glue an application's frontend to the backend.
Same problem, resolved using the Navision approach.
$ws->getList($first_record_id, $limit)
This returns a page of up to $limit elements, starting after the passed id:
select * from collection where collection.id > $first_record_id order by id ASC limit $limit
Navision uses a key (each element has a key), but in MySQL an autoincrement id is better.
In this case pagination is intended for handling large result sets and not for frontend pagination...
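A small runnable sketch of this id-based approach, using Python's sqlite3 in place of MySQL. The get_list function mirrors the getList call above; the name column and the demo data are assumptions made for the example.

```python
import sqlite3


def get_list(conn, last_seen_id, limit):
    """Return up to `limit` rows whose id is greater than last_seen_id."""
    return conn.execute(
        "SELECT id, name FROM collection WHERE id > ? ORDER BY id ASC LIMIT ?",
        (last_seen_id, limit),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE collection (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO collection (name) VALUES (?)",
    [(f"row-{i}",) for i in range(1, 8)],
)

first_page = get_list(conn, last_seen_id=0, limit=3)

# Concurrent changes by other clients between the two requests.
conn.execute("DELETE FROM collection WHERE id = 2")
conn.execute("INSERT INTO collection (name) VALUES ('row-new')")

# The client passes the last id it received, so no row is skipped or repeated.
next_page = get_list(conn, last_seen_id=first_page[-1][0], limit=3)
```

Since each request is anchored on an id rather than an offset, the server keeps no per-client state, and deletes or inserts elsewhere in the table do not shift the following pages.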
I am not sure SOA is the concern here. The problem you have seems to be with paginating your APIs. I will point you to how Twitter handles its pagination: dev.twitter.com/rest/public/timelines