Partial retrieval of results from the Google Health API

I want to get records from Google Health within a range. Every time I send a new query, I want to get ten records.
I.e. Start: 1 to 10
Second query: start 11 to 20
....
I used the following query :
PROFILE_URL + selectedProfileId
+ "/-/labtest?start-index="+index+"&max-results=10";
This retrieves records, but when it reaches the end of the list, where only five records remain and the query still asks for ten, the application crashes.
How can I get the total count of results, or partial results?

You can keep using the query above; it returns as many records as remain, even when that is fewer than max-results. However, it throws an exception once you query past the end with an invalid start index, so catch that exception at the point where you add the values into your ListView.
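As a language-agnostic illustration of that answer, here is a minimal Python-style sketch of the paging loop. fetch_records is a hypothetical helper standing in for however you issue the authenticated GET and parse the feed entries; the URL pattern simply mirrors the one in the question.

def fetch_all(profile_url, profile_id, fetch_records, page_size=10):
    """Page through the labtest feed until a short or failing page is seen."""
    index = 1
    records = []
    while True:
        url = (profile_url + profile_id
               + "/-/labtest?start-index=%d&max-results=%d" % (index, page_size))
        try:
            page = fetch_records(url)   # may raise once the index runs past the end of the feed
        except Exception:
            break                       # same idea as catching it where the ListView is filled
        records.extend(page)
        if len(page) < page_size:       # short page: nothing left to fetch
            break
        index += page_size
    return records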

Related

AWS Cloudwatch Log Insights: Aggregate results are impossible (count - count_distinct is negative)

I'm running a CloudWatch Logs Insights query on a single log stream that corresponds to a single Python AWS Lambda function. This function logs a unique line corresponding to the key in S3 that it is processing. It logs this line once at the beginning of the invocation. The only condition where it won't log this line is if it fails before it even reads the event.
The query is:
parse @message /(?<@unique_key>Processing key: \w+\/[\w=_-]+\/\w+\.\d{4}-\d{2}-\d{2}-\d{2}\.[\w-]+\.\w+\.\w+)/
| filter @message like /Processing key: \w+\/[\w=_-]+\/\w+\.\d{4}-\d{2}-\d{2}-\d{2}\.[\w-]+\.\w+\.\w+/
| stats count(@unique_key) - count_distinct(@unique_key) as @distinct_unique_keys_delta
  by datefloor(@timestamp, 1d) as @_datefloor
| sort @_datefloor asc
The two regular expressions in this query will parse the full key of the S3 file being processed. In this particular problem, and in general, my understanding is that the count(...) of any quantity minus the count_distinct(...) of the same quantity should always be greater than or equal to zero.
For several of the days in the results, it is a negative number.
I thought I might be misunderstanding the correct usage of datefloor(), so I tried running the following query:
parse @message /(?<@unique_key>Processing key: \w+\/[\w=_-]+\/\w+\.\d{4}-\d{2}-\d{2}-\d{2}\.[\w-]+\.\w+\.\w+)/
| filter @message like /Processing key: \w+\/[\w=_-]+\/\w+\.\d{4}-\d{2}-\d{2}-\d{2}\.[\w-]+\.\w+\.\w+/
| stats count(@unique_key) - count_distinct(@unique_key) as @distinct_unique_keys_delta
The result was -20,347.
At this point the only scenarios I can see are:
1. Something is wrong with the code executing the query.
2. I'm misunderstanding this tool.
I have discovered that the count_distinct function in CloudWatch Logs Insights queries doesn't really return a distinct count! As per the documentation:
Returns the number of unique values for the field. If the field has very high cardinality (contains many unique values), the value returned by count_distinct is just an approximation.
Apparently I can't just assume that a function returns an accurate result.
The documentation page.
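If you need the exact figure, one way to sanity-check it is to pull the matching log events yourself and count distinct keys client-side. A rough sketch using boto3's filter_log_events; the log group name is a placeholder, and you would normally also pass startTime/endTime to bound the scan:

import re
import boto3

logs = boto3.client("logs")
LOG_GROUP = "/aws/lambda/my-function"   # placeholder: the Lambda's log group

pattern = re.compile(r"Processing key: \S+")
keys = []
kwargs = {"logGroupName": LOG_GROUP, "filterPattern": '"Processing key:"'}
while True:
    resp = logs.filter_log_events(**kwargs)
    for event in resp["events"]:
        match = pattern.search(event["message"])
        if match:
            keys.append(match.group(0))
    token = resp.get("nextToken")
    if not token:
        break
    kwargs["nextToken"] = token

print("count:", len(keys))
print("exact distinct count:", len(set(keys)))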

How to query location every 15 mins?

I have a QuestDB table with item locations, similar to the taxi trip database on the QuestDB live demo, and I want to query the location no more than once every X minutes per item. A similar query on the demo server would be
SELECT vendor_id, pickup_latitude, pickup_longitude
FROM trips
WHERE vendor_id = 'VTS'
SAMPLE BY 15m
but I get back the error
at least one aggregation function must be present in 'select' clause
I don't want any aggregation like an average; I just need the location every hour (or every X minutes). Is there a way to query that?
Use first() as the aggregation:
SELECT vendor_id, first(pickup_latitude) lat, first(pickup_longitude) lon
FROM trips
WHERE vendor_id = 'VTS'
SAMPLE BY 15m
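first() here is not an average or any other computed value; it simply returns the first pickup_latitude/pickup_longitude recorded inside each 15-minute bucket, so you still get a real location per interval. If you would rather have the most recent point in each bucket, QuestDB's last() aggregate works the same way.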

How to implement a do while loop in Power Query and read the last row of a table which is updated dynamically

I am trying to import data from the SharePoint REST API using the document IDs of all the documents. My objective is to start from the smallest document ID and move on until there are no more documents.
I have written a custom function and I am calling it by passing a document ID, starting from 0. This function returns a table containing 500 documents whose DocId is greater than the document ID I pass in.
#"Output" = Table.AddColumn(Termset, "c", each GetList(try List.Max(Output[c.DocId]) otherwise LastDocID))
So my data ends up in the Output table. My problem is that it keeps returning the same set of 500 records again and again, probably because the value of List.Max(Output[c.DocId]) is not changing (I want this value to be the last document ID returned from the GetList function). What I am trying to do is essentially a do-while loop:
do {
    Output = GetList(LastDocID)
    LastDocID = List.Max(Output[DocId])
} while (there_are_more_docs)
Is there any way in Power Query to dynamically change the value of LastDocID that I am passing to the GetList function? The method I tried does not seem to work, because it cannot read the contents of the Output table after every function call.
Note: I am using Termset as the list of pages to cap the total number of documents read. It is a list that starts at 0 and increments by 500 while it is less than the total number of documents in SharePoint.
I would really appreciate it if somebody could help me here.
You need to look at List.Generate to generate all of your page requests and then compute the last document id from the list of results.
Use a loop like the following to fetch all pages from the REST API:
listOfPages = List.Generate(
    () => FetchPage(null),       // initial state: fetch the first page
    (page) => page <> null,      // keep going while the latest fetch returned a page
    (page) => FetchPage(page)    // next state: fetch the page that follows the current one
)
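FetchPage is a placeholder for a function you would write around your existing GetList: on the first call it fetches the page starting at DocId 0, and on later calls it reads the largest DocId from the page it was given (for example with List.Max(previousPage[DocId])) and passes that to GetList; when GetList comes back empty it returns null, which is what stops the generation.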

DynamoDB QuerySpec {MaxResultSize + filter expression}

From the DynamoDB documentation
The Query operation allows you to limit the number of items that it
returns in the result. To do this, set the Limit parameter to the
maximum number of items that you want.
For example, suppose you Query a table, with a Limit value of 6, and
without a filter expression. The Query result will contain the first
six items from the table that match the key condition expression from
the request.
Now suppose you add a filter expression to the Query. In this case,
DynamoDB will apply the filter expression to the six items that were
returned, discarding those that do not match. The final Query result
will contain 6 items or fewer, depending on the number of items that
were filtered.
It looks like the following query should (at least sometimes) return 0 records.
In summary, I have a UserLogins table. A simplified version is:
1. UserId - HashKey
2. DeviceId - RangeKey
3. ActiveLogin - Boolean
4. TimeToLive - ...
Now, let's say UserId = X has 10,000 inactive logins in different DeviceIds and 1 active login.
However, when I run this query against my DynamoDB table:
QuerySpec{
hashKey: null,
rangeKeyCondition: null,
queryFilters: null,
nameMap: {"#0" -> "UserId"}, {"#1" -> "ActiveLogin"}
valueMap: {":0" -> "X"}, {":1" -> "true"}
exclusiveStartKey: null,
maxPageSize: null,
maxResultSize: 10,
req: {TableName: UserLogins,ConsistentRead: true,ReturnConsumedCapacity: TOTAL,FilterExpression: #1 = :1,KeyConditionExpression: #0 = :0,ExpressionAttributeNames: {#0=UserId, #1=ActiveLogin},ExpressionAttributeValues: {:0={S: X,}, :1={BOOL: true}}}
I always get 1 row. The 1 active login for UserId=X. And it's not happening just for 1 user, it's happening for multiple users in a similar situation.
Are my results contradicting the DynamoDB documentation?
It looks like a contradiction because maxResultSize=10 means that DynamoDB will only read the first 10 items (out of 10,001) and then apply the filter ActiveLogin=true to just those (which might return 0 results). It seems very unlikely that the record with ActiveLogin=true happened to be among the first 10 records that DynamoDB read.
This is happening for hundreds of customers that are running similar queries. It works great, even though according to the documentation it shouldn't.
I can't see any obvious problem with the Query. Are you sure about your premise that users have 10,000 items each?
Your keys are UserId and DeviceId. That seems to mean that if your user logs in with the same device it would overwrite the existing item. Or put another way, I think you are saying your users have 10,000 different devices each (unless the DeviceId rotates in some way).
In your shoes I would just remove the filter expression and print the results to the log to see what you're getting in your 10 results. Then remove the limit too and see what results you get with that.
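For that kind of debugging, here is a sketch of the same comparison in Python with boto3 (the question uses the Java SDK; the table and attribute names are taken from the question):

import boto3
from boto3.dynamodb.conditions import Key, Attr

table = boto3.resource("dynamodb").Table("UserLogins")

# With Limit=10 and a filter: DynamoDB reads at most the first 10 items that
# match the key condition, then applies the filter to those 10 only.
filtered = table.query(
    KeyConditionExpression=Key("UserId").eq("X"),
    FilterExpression=Attr("ActiveLogin").eq(True),
    Limit=10,
    ConsistentRead=True,
)
print(filtered["Count"], filtered["ScannedCount"])   # Count = after filter, ScannedCount = items read

# Same query without the filter: shows which 10 items DynamoDB actually read,
# i.e. whether the active login really sits among the first 10.
unfiltered = table.query(
    KeyConditionExpression=Key("UserId").eq("X"),
    Limit=10,
    ConsistentRead=True,
)
for item in unfiltered["Items"]:
    print(item["DeviceId"], item.get("ActiveLogin"))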

Trying to get the time delta between two date columns in a dataframe, where one column can possibly be "NaT"

I am trying to get the difference in days between two dates pulled from a SQL database. One is the start date, the other is a completed date. The completed date can be, and in this case is, a NaT value. Essentially I'd like to iterate through each row and take the difference; if the completed date is NaT I'd like to skip it or assign a NaN value, in a completely new delta column. The code below is giving me this error: 'member_descriptor' object is not callable
for n in df.FINAL_DATE:
    df.DELTA = [0]
    if n is None:
        df.DELTA = None
        break
    else:
        df.DELTA = datetime.timedelta.days(df['FINAL_DATE'], df['START_DATE'])
You don't need to loop or pre-check for NaT: subtracting the two datetime columns is vectorized, and any row where FINAL_DATE is NaT simply comes out as NaT (NaN after .dt.days):
df['DELTA'] = (df['FINAL_DATE'] - df['START_DATE']).dt.days
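A small self-contained example of that approach (column names taken from the question; the data itself is made up for illustration):

import pandas as pd

df = pd.DataFrame({
    "START_DATE": pd.to_datetime(["2021-01-01", "2021-02-01", "2021-03-01"]),
    "FINAL_DATE": pd.to_datetime(["2021-01-11", None, "2021-03-31"]),  # second row is NaT
})

# Vectorized subtraction: a NaT FINAL_DATE gives NaT, and .dt.days turns that into NaN.
df["DELTA"] = (df["FINAL_DATE"] - df["START_DATE"]).dt.days
print(df)
#   START_DATE FINAL_DATE  DELTA
# 0 2021-01-01 2021-01-11   10.0
# 1 2021-02-01        NaT    NaN
# 2 2021-03-01 2021-03-31   30.0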