For Google Workspace Admin SDK ReportsAPI, is response ordered by time - google-admin-sdk

For the Activities API, is response sorted by time specifically id.time ?
If not is there a way we can do that?

Issue:
At least when I test this, the activities are sorted by id.time, with more recent activities listed first.
No official documentation makes this explicit, though, so this behavior could change without notice.
Feature request:
I'd suggest you to file a feature request in Issue Tracker using this template in order to add query parameters to explicitly sort activities based on their time, and either in ascending or descending order.
Side note:
Also, depending on your situation, startTime and endTime might be useful, in order to filter activities based on their time.

Related

what is the right way to query different filters on dynamodb?

I save my order data on dyanmodb table. And the partition key is orderId, sort key is timestamp. Each order has many other attributes like category, userName, price, items, status`. I am going to build a filter service to let clients query order based on these attributes. Also I'd like to add a limit on the query for pagination. But I find some limitations on dynamodb.
In order to support querying different fields, I have two options:
Create GSI for each attribute. It is very expensive but it supports query each attribute very performance. This solution doesn't support combine multiple attributes in the filter.
Attach a filter expression on the SCAN to include attribute condition. SCAN is not very performance in the first place. Also the filter expression is applied after limits. Which means it is very likely to response less than users request limits.
so what is the good way to achieve this in dynamodb?
There is unfortunately no magic way to solve your problems. There is no DynamoDB feature which you missed. Indeed, as you said, making each of the attributes available for efficient queries requires a GSI which will cost you additional money - but that's reasonable. Indeed, as you said, there is no efficient way to search for an intersection of requirements on two different attribute. And indeed, the "limit" feature doesn't quite do what you want and you'll need to emulate your page size need in the client code (asking for more pages until your desired amount is recieved), potentially with unacceptably high latency.
It sounds that what you really need is a search engine. These have exactly the features that you asked for. You'll still be paying for these features (indexing of individual columns still takes up CPU and disk space, intersection of multiple attribute searches still requires significant work at query time) but search engines are designed for exactly these operations, and do them more efficiently and with lower latency (which is important for interactive searches, which are the bread-and-butter of search engines).
You can add the limit for pagination using the limit attribute in the query. But can you please be more specific about your access patterns, is your clients going to query all the orders or only the orders belonging to them ?

AWS Data Structure and Stack Suggestion for highly filterable data

Firstly, let me know if I should place this in a different Community. It is programming related but less than I would prefer.
I am creating a mobile app based which I intend to base on AWS App Sync unless I can determine it is a poor fit.
I want to store a fairly large set of data, say a half million records.
From these records, I need to be able to grab all entries based on a tag and page them from the larger set.
An example of this data would be:
{
"name":"Product123",
"tags":[
{
"name":"1880",
"type":"year",
"value":7092
},
{
"name":"f",
"type":"gender",
"value":4120692
}
]
}
Various objects may or may not have a specific tag but may have up to 500 tags or more (the seed of initial data has 130 tags). My filter would ignore them if they did not match but return them if they did.
In reading about Query vs Scan on DyanmoDB, I feel like my current data structure would require mostly scanning and be in-efficient. Efficiency is only a real restriction due to cost.
With cost in mind, I will focus on the cost per user to access this data in filtered sets. Say 100,000 users for now each filtering and paging data many times a day.
Your concept of tags doesn't sound too different from the concept of Cognito User Pools' groups with AppSync (docs) - authentication based on groups will only return items allowed for groups that the user making the request is in. Cognito's default group limit is 25 per user pool, so while convenient out of the box, it wouldn't itself help you much. Instead, it's interesting just because it's similar conceptually, and can give you insight by looking at how it works internally.
If you go into the AppSync console and set up a request mapping template for groups auth, you'll see that it uses a scan and the contains operation. Doing something similar would probably be your best bet here, if you really want to use Dynamo. If you find that prohibitively costly, you could use a Lambda data source, which allows you to use any data store, if you have one in mind that's a little more flexible for this type of action.

Create fault tolerance example with Dynamodb streams

I have been looking at DynamoDB to create something close to a transaction. I was watching this video presentation: https://www.youtube.com/watch?v=KmHGrONoif4 in which the speaker shows around the 30 minute mark ways to make dynamodb operation close to ACID compliant as can be. He shows the best concept is to use dynamodb streams, but doesn't show a demo or an example. I have a very simple scenario I am look at and that is I have one Table called USERS. Each user has a list of friends. If two users no longer wish to be friends they must be removed from both of the user's entities (I can't afford for one friend to be deleted from one entity, and due to a crash for example, the second user entities friend attribute is not updated causing inconsistent data). I was wondering if someone could provide some simple walk-through oh of how to accomplish something like this to see how it all works? If code could be provided that would be great to see how it works.
Cheers!
Here is the transaction library that he is referring: https://github.com/awslabs/dynamodb-transactions
You can read through the design: https://github.com/awslabs/dynamodb-transactions/blob/master/DESIGN.md
Here is the Kinesis client library:
http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html
When you're writing to DynamoDB, you can get an output stream with all the operations that happen on the table. That stream can be consumed and processed by the Kinesis Client Library.
In your case, have your client remove it from the first user, then from the second user. In the Kinesis Client Library when you are consuming the stream and see a user removed, look at who he was friends with and go check/remove if needed - if needed the removal should probably done through the same means. It's not truly a transaction, and relies on the fact that KCL guarantees that the records from the streams will be processed.
To add to this confusion, KCL uses Dynamo to store where in the stream is at when processing and to checkpoint processed records.
You should try and minimize the need for transactions, which is a nice concept in a small scale, but can't really scale once you become very successful and need to support millions and billions of records.
If you are thinking in a NoSQL mind set, you can consider using a slightly different data model. One simple example is to use Global Secondary Index on a single table on the "friend-with" attribute. When you add a single record with a pair of friends, both the record and the index will be updated in a single action. Both table and index will be updated in a single action, when you delete the friendship record.
If you choose to use the Updates Stream mechanism or the Global Secondary Index one, you should take into consideration the "eventual consistency" case of the distributed system. The consistency can be achieved within milli-seconds, but it can also take longer. You should analyze the business implications and the technical measures you can take to solve it. For example, you can verify the existence of both records (main table as well as the index, if you found it in the index), before you present it to the user.

Overcoming querying limitations in Couchbase

We recently made a shift from relational (MySQL) to NoSQL (couchbase). Basically its a back-end for social mobile game. We were facing a lot of problems scaling our backend to handle increasing number of users. When using MySQL loading a user took a lot of time as there were a lot of joins between multiple tables. We saw a huge improvement after moving to couchbase specially when loading data as most of it is kept in a single document.
On the downside, couchbase also seems to have a lot of limitations as far as querying is concerned. Couchbase alternative to SQL query is views. While we managed to handle most of our queries using map-reduce, we are really having a hard time figuring out how to handle time based queries. e.g. we need to filter users based on timestamp attribute. We only need a user in view if time is less than current time:
if(user.time < new Date().getTime() / 1000)
What happens is that once a user's time is set to some future time, it gets exempted from this view which is the desired behavior but it never gets added back to view unless we update it - a document only gets re-indexed in view when its updated.
Our solution right now is to load first x user documents and then check time in our application. Sorting is done on user.time attribute so we get those users who's time is less than or near to current time. But I am not sure if this is actually going to work in live environment. Ideally we would like to avoid these type of checks at application level.
Also there are times e.g. match making when we need to check multiple time based attributes. Our current strategy doesn't work in such cases and we frequently get documents from view which do not pass these checks when done in application. I would really appreciate if someone who has already tackled similar problems could share their experiences. Thanks in advance.
Update:
We tried using range queries which works for only one key. Like I said in most cases we have multiple time based keys meaning multiple ranges which does not work.
If you use Date().getTime() inside a view function, you'll always get the time when that view was indexed, just as you said "it never gets added back to view unless we update it".
There are two ways:
Bad way (don't do this in production). Query views with stale=false param. That will cause view to update before it will return results. But view indexing is slow process, especially if you have > 1 milllion records.
Good way. Use range requests. You just need to emit your date in map function as a key or a part of complex key and use that range request. You can see one example here or here (also if you want to use DateTime in couchbase this example will be more usefull). Or just look to my example below:
I.e. you will have docs like:
doc = {
"id"=1,
"type"="doctype",
"timestamp"=123456, //document update or creation time
"data"="lalala"
}
For those docs map function will look like:
map = function(){
if (doc.type === "doctype"){
emit(doc.timestamp,null);
}
}
And now to get recently "updated" docs you need to query this view with params:
startKey="dateTimeNowFromApp"
endKey="{}"
descending=true
Note that startKey and endKey are swapped, because I used descending order. Here is also a link to documnetation about key types that couchbase supports.
Also I've found a link to a question that can also help.

Find User's First Post?

Using the Graph API or FQL, is there a way to efficiently find a user's first post or status? As in, the first one they ever made?
The slow way, I assume, would be to paginate through the feed, but for users like me who joined in 2005 or earlier, that would take a very long time with a huge amount of API calls.
From what I have found, we cannot obtain the date the user registered with Facebook for a good starting point, and we cannot sort by date ascending (not outside of the single page of data returned) to get the oldest post on top.
Is there any reasonable way to do this?
you can use facebook query language (FQL) to get first post information.
Please refer below query for more details :-
SELECT message, time FROM status WHERE uid= me() ORDER BY time ASC LIMIT 1
Please check and let me know in case of any issue.
Thanks and Regards
Durgaprasad
I think the Public API is limited to the depth of information it is allowed to query. Facebook probably put in these constraints for performance and cost concerns. Maybe they've changed it. When I tried to go backwards thru a person's stream about 4 months ago, there seemed to be a limit as to how far back I could go. Maybe it's a time limit or a # posts back limit. If you know when your user first posted, then getting to it should be fairly quick using the since/until time stamps in your queries.