Power Query List.Generate with @odata.nextLink where pagination is not available - powerbi

I am trying to pull data from Dynamics CRM.
I cannot use the OData feed because it brings in the whole database:
= OData.Feed("https://crm/xxxxxx/api/data/v8.0/contacts")
Requesting JSON allows the columns to be restricted, but it has a number of limitations:
5,000-record limit per response
No page number listed
Offset function does not work
Limit function does not work
No total record count function
The JSON request below returns two fields:
value - a list of the first 5000 records
@odata.nextLink - a link to the next 5000 records
=Json.Document(Web.Contents("https://crm/xxxxxx/api/data/v8.0/contacts"))
I need to write a function that starts from the base URL and returns all of the lists combined into one table. The first request uses the plain URL; every later request uses the @odata.nextLink value, which makes the coding harder. The loop should end when an error occurs.
I have tried too many methods to list; none gives me the answer.
The @odata.nextLink value is a long string:
https://crm/xxxxxx/api/data/v8.0/contacts?$skiptoken=%3Ccookie%20pagenumber=%221%22%20pagingcookie=%22%253ccookie%2520page%253d%25221%2522%253e%253ccontactid%2520last%253d%2522%257bD0E5305F-0085-E211-9FD1-000C29854771%257d%2522%2520first%253d%2522%257b609AF16C-120C-4E2C-9498-00015D9B0068%257d%2522%2520%252f%253e%253c%252fcookie%253e%22%20istracking=%22False%22%20/%3E
let
    Source = {1..7},
    BaseURL = "https://crm/xxxxxx/api/data/v8.0/contacts",
    NextURL = Json.Document(Web.Contents("https://crm/xxxxxx/api/data/v8.0/contacts"))[#"@odata.nextLink"],
    NextList = Json.Document(Web.Contents("https://crm/xxxxxx/api/data/v8.0/contacts"))[#"value"],
    ToTable = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    Renamed = Table.RenameColumns(ToTable, {{"Column1", "Page"}}),
    AddedBase = Table.AddColumn(Renamed, "Base" as text, each BaseURL),
    AddedLink = Table.AddColumn(AddedBase, "Next" as text, each NextURL),
    AddedList = Table.AddColumn(AddedLink, "List" as text, each NextList)
in
    AddedList
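The loop described in the question (request the base URL, then keep requesting whatever @odata.nextLink returns until a page has none) can be sketched independently of Power Query. This is a minimal Python sketch, not the asker's code: fetch_all, get_json and the FAKE_PAGES data are hypothetical names introduced for illustration, and the HTTP call is replaced by a dictionary lookup so the example runs on its own.

```python
from typing import Callable, Iterator

NEXT_LINK = "@odata.nextLink"  # key carrying the next-page URL in the JSON response


def fetch_all(first_url: str, get_json: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every record, following the next link until a page has none."""
    url = first_url
    while url is not None:
        page = get_json(url)              # one request per page
        yield from page.get("value", [])  # records of this page
        url = page.get(NEXT_LINK)         # None on the last page ends the loop


# Stand-in for the HTTP call: three fake pages keyed by URL (hypothetical data).
FAKE_PAGES = {
    "https://crm/contacts": {"value": [{"id": 1}, {"id": 2}],
                             NEXT_LINK: "https://crm/contacts?p=2"},
    "https://crm/contacts?p=2": {"value": [{"id": 3}],
                                 NEXT_LINK: "https://crm/contacts?p=3"},
    "https://crm/contacts?p=3": {"value": [{"id": 4}]},
}

records = list(fetch_all("https://crm/contacts", FAKE_PAGES.__getitem__))
```

In Power Query the same three parts (first request, stop condition, next request) map onto the arguments of List.Generate.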

Related

Impossible to show all records if parameter is not given in Power Bi Report Builder

I've been struggling to make this work. I have a dataset with a Country column, and each record has its own country. I've also created a parameter @Country, which is a blank input. I want to show all records from the dataset when the parameter is left empty, and only the records matching the country when the user supplies one. Everything works when I enter a country, but when I leave it blank, no records are shown. How can I fix it?
Snippets of code I've tried, every one with the same result:
FILTER('Dataset', ISBLANK(@Country) || 'Dataset'[Country] = @Country)
FILTER('Dataset', IF(ISBLANK(@Country), 1, 'Dataset'[Country] = @Country))
FILTER('Dataset', IF(ISBLANK(@Country), 'Dataset'[Country], 'Dataset'[Country] = @Country))
FILTER('Dataset', IF(NOT(ISBLANK(@Country)), 'Dataset'[Country] = @Country, 1))

DAX - Count one row for each group that meet filter requirement

I have a UserLog table that logs the actions "ADDED", "DELETED" or "UPDATED" along with the date/timestamp.
In Power BI, I would like to accurately show the number of new users as well as removed users for a filtered time period. Since a user may have been added and then deleted, I need to make sure that I only take the last record (ADDED/DELETED) from the log for every user.
Firstly, I tried setting up a measure that gets the max date/timestamp:
LastUpdate = CALCULATE(MAX(UserLog[LogDate]), UserLog[Action] <> "UPDATED")
I then tried to create the measure that shows the number of new users:
AddedCount = CALCULATE(COUNT(UserLog[userId]), FILTER(UserLog, [Action] = "ADDED" && [LogDate] = [LastUpdate]))
But the result is not accurate, as it still counts all "ADDED" records regardless of whether each one is the last record for its user.
I find writing formulas against unstructured data is much harder. If you put a little more structure to the data, it becomes much easier. There are lots of ways to do this. Here's one.
Add 2 columns to your UserLog table to help count adds and deletes:
Add = IF([Action]="ADDED",1,0)
Delete = IF([Action]="DELETED",1,0)
Then create a summary table so you can tell if the same user was added and deleted on the same [LogDate]:
Counts = SUMMARIZE('UserLog',
    [LogDate],
    [userId],
    "Add", SUM('UserLog'[Add]),
    "Delete", SUM('UserLog'[Delete]))
Then the measures are simple to write. For the AddedCount, just filter the rows where [Delete]=0, and for DeletedCount, just filter the rows where [Add]=0:
AddedCount = CALCULATE(SUM(Counts[Add]),
    Counts[Delete]=0,
    FILTER(Counts, Counts[LogDate]=[LastUpdate]))
DeletedCount = CALCULATE(SUM(Counts[Delete]),
    Counts[Add]=0,
    FILTER(Counts, Counts[LogDate]=[LastUpdate]))
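To make the requirement from the question concrete (only the last ADDED/DELETED record per user counts), here is a small Python sketch. It is a different technique from the summary-table approach above and is not DAX: the sample rows, last_actions and the date range are hypothetical and only illustrate the rule being asked for.

```python
from datetime import datetime

# Hypothetical rows: (userId, action, log timestamp)
log = [
    (1, "ADDED",   datetime(2019, 1, 1, 9)),
    (1, "UPDATED", datetime(2019, 1, 2, 9)),
    (1, "DELETED", datetime(2019, 1, 3, 9)),
    (2, "ADDED",   datetime(2019, 1, 2, 9)),
    (3, "DELETED", datetime(2019, 1, 4, 9)),
    (3, "ADDED",   datetime(2019, 1, 5, 9)),
]


def last_actions(rows, start, end):
    """Map each userId to its latest ADDED/DELETED action within [start, end]."""
    latest = {}
    for user, action, ts in rows:
        if action == "UPDATED" or not (start <= ts <= end):
            continue  # ignore updates and rows outside the filtered period
        if user not in latest or ts > latest[user][0]:
            latest[user] = (ts, action)
    return {user: action for user, (ts, action) in latest.items()}


final = last_actions(log, datetime(2019, 1, 1), datetime(2019, 1, 31))
added_count = sum(1 for a in final.values() if a == "ADDED")
deleted_count = sum(1 for a in final.values() if a == "DELETED")
```

With this sample data, user 1 ends the period deleted while users 2 and 3 end it added, so a user who was added and later deleted is counted only once, under the final state.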

How to implement a do while loop in Power Query and read the last row of a table which is updated dynamically

I am trying to import data from the SharePoint REST API using the document ID of all the documents. My objective is to start from the smallest document ID and move on until there are no more documents.
I have designed a custom function, and I call it by passing the document ID, starting from 0. This function returns a table containing 500 documents whose Doc Id is greater than the document ID I pass in.
#"Output" =Table.AddColumn(Termset,"c", each GetList( try List.Max(Output[c.DocId]) otherwise LastDocID))
So my data is updated in the Output table. My problem is that it returns the same set of 500 records again and again, possibly because the value of List.Max(Output[c.DocId]) is not changing (I want this value to be the last document ID returned by the GetList function). I am trying to do something like a do-while loop:
do {
    Output = GetList(LastDocId)
    LastDocId = List.Max(Output[DocId])
} while (there_are_more_docs)
Is there any way in Power Query to dynamically change the value of LastDocId that I pass to the GetList function? The method I tried above does not seem to work, as it cannot read the contents of the Output table after every function call.
Note: I am using Termset as pages to put a check on the total documents being read. It is a list whose values start from 0 and increment by 500 while they are less than the total number of documents in SharePoint.
I would really appreciate if somebody can help me here.
You need to look at List.Generate to generate all of your page requests and then compute the last document id from the list of results.
Use a loop like the following to fetch all pages from the REST API:
listOfPages = List.Generate(
    () => FetchPage(null),
    (page) => page <> null,
    (page) => FetchPage(page)
)
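For readers unfamiliar with how the three arguments interact, here is a rough Python analogue of that call. It is only a sketch: generate mimics the initial/condition/next behaviour of List.Generate, and fetch_page, ALL_DOC_IDS and PAGE_SIZE are hypothetical stand-ins for the real SharePoint request.

```python
from typing import Callable, Iterator, Optional, TypeVar

T = TypeVar("T")


def generate(initial: Callable[[], T],
             condition: Callable[[T], bool],
             next_: Callable[[T], T]) -> Iterator[T]:
    """Yield states while `condition` holds, like a three-argument List.Generate."""
    state = initial()
    while condition(state):
        yield state
        state = next_(state)


# Hypothetical data source: document ids 1..12, served 5 at a time.
ALL_DOC_IDS = list(range(1, 13))
PAGE_SIZE = 5


def fetch_page(previous: Optional[list]) -> Optional[list]:
    """Return the docs after the last id of `previous`, or None when exhausted."""
    last_id = max(previous) if previous else 0
    page = [d for d in ALL_DOC_IDS if d > last_id][:PAGE_SIZE]
    return page or None


pages = list(generate(lambda: fetch_page(None),
                      lambda page: page is not None,
                      fetch_page))
```

The point of the pattern is that each step receives the previous page, so the last document ID is derived from the result just fetched instead of from a table that is still being built.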

How to combine sources into one?

I have an issue when calling a REST endpoint. The resulting data set is too large for the endpoint to return (I get an HTTP 500 error).
I can split the query up into pieces, e.g. by month. How do I perform multiple calls to the endpoint, one for each month I want to return, and then combine them into one table?
Unfortunately, the REST endpoint doesn't support OData queries, so I cannot page through the result set.
let
Source1 = Json.Document(Web.Contents("https://someurl?theapi" & "&q=Date>='2019-01-01' AND Date<='2019-01-31'")),
Source2 = Json.Document(Web.Contents("https://someurl?theapi" & "&q=Date>='2019-02-01' AND Date<='2019-02-28'")),
Table1= Table.FromList(Source1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
Table2= Table.FromList(Source2, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
CompositeTable= Table.NestedJoin(Table2, {"Datum"}, Table1, {"Datum"}, "CompositeTable", JoinKind.LeftOuter)
in
CompositeTable
I want to have the result sets from both queries merged into "CompositeTable".
There's a great blog post by Mark Tiedemann that suggests a solution. I've applied this method a dozen times and it works flawlessly for any paginated API I encountered.
Mark's elegant solution is to query the first page, extract the number of total results from it and then call the GetPage function for all remaining pages and combine all pages using the List.Union function.
For your use case, I would use a start and end date instead of items/page and total items. For each month in between start and end date, call the function that queries this month only, and combine the results. To give you an idea, something like this:
let
    BaseUrl = "https://someurl?theapi",
    StartDate = #date(2019, 1, 1),
    EndDate = #date(2019, 5, 31),
    // Fetch and parse one JSON response
    GetJson = (Url) =>
        let Json = Json.Document(Web.Contents(Url))
        in Json,
    // Query a single month; Index is any date within that month
    GetPage = (Index) =>
        let Start = "Date>='" & Date.ToText(Date.StartOfMonth(Index), "yyyy-MM-dd") & "'",
            End = "Date<='" & Date.ToText(Date.EndOfMonth(Index), "yyyy-MM-dd") & "'",
            Url = BaseUrl & "&q=" & Start & " AND " & End,
            Json = GetJson(Url),
            Value = Json[#"value"]
        in Value,
    // First day of every month from StartDate to EndDate
    PageIndices = List.Generate(() => StartDate, each _ <= EndDate, each Date.AddMonths(_, 1)),
    Pages = List.Transform(PageIndices, each GetPage(_)),
    Entities = List.Union(Pages),
    Table = Table.FromList(Entities, Splitter.SplitByNothing(), null, null, ExtraValues.Error)
in
    Table
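The month-by-month idea can also be checked outside Power Query. The following Python sketch is only an illustration of the splitting logic: month_ranges, month_query and fetch_month are hypothetical helpers, and fetch_month fakes the REST call so the example runs on its own.

```python
from datetime import date, timedelta


def month_ranges(start: date, end: date):
    """Yield (first_day, last_day) for each calendar month from start to end."""
    current = date(start.year, start.month, 1)
    while current <= end:
        # First day of the following month, rolling over the year in December.
        following = date(current.year + current.month // 12,
                         current.month % 12 + 1, 1)
        yield current, following - timedelta(days=1)
        current = following


def month_query(first: date, last: date) -> str:
    """Build the filter text for one month in the format used in the question."""
    return f"Date>='{first.isoformat()}' AND Date<='{last.isoformat()}'"


def fetch_month(query: str) -> list:
    """Hypothetical stand-in for the REST call; returns one fake row per query."""
    return [{"q": query}]


rows = []
for first, last in month_ranges(date(2019, 1, 1), date(2019, 5, 31)):
    rows.extend(fetch_month(month_query(first, last)))  # append, do not join
```

Note that the monthly results are appended to one another. A join on the date column, as in the Table.NestedJoin attempt in the question, matches rows between two tables instead of stacking them.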

DynamoDB QuerySpec {MaxResultSize + filter expression}

From the DynamoDB documentation
The Query operation allows you to limit the number of items that it
returns in the result. To do this, set the Limit parameter to the
maximum number of items that you want.
For example, suppose you Query a table, with a Limit value of 6, and
without a filter expression. The Query result will contain the first
six items from the table that match the key condition expression from
the request.
Now suppose you add a filter expression to the Query. In this case,
DynamoDB will apply the filter expression to the six items that were
returned, discarding those that do not match. The final Query result
will contain 6 items or fewer, depending on the number of items that
were filtered.
Given that, it looks like the following query should return 0 records (at least sometimes).
In summary, I have a UserLogins table. A simplified version is:
1. UserId - HashKey
2. DeviceId - RangeKey
3. ActiveLogin - Boolean
4. TimeToLive - ...
Now, let's say UserId = X has 10,000 inactive logins on different DeviceIds and 1 active login.
However, when I run this query against my DynamoDB table:
QuerySpec{
hashKey: null,
rangeKeyCondition: null,
queryFilters: null,
nameMap: {"#0" -> "UserId"}, {"#1" -> "ActiveLogin"}
valueMap: {":0" -> "X"}, {":1" -> "true"}
exclusiveStartKey: null,
maxPageSize: null,
maxResultSize: 10,
req: {TableName: UserLogins,ConsistentRead: true,ReturnConsumedCapacity: TOTAL,FilterExpression: #1 = :1,KeyConditionExpression: #0 = :0,ExpressionAttributeNames: {#0=UserId, #1=ActiveLogin},ExpressionAttributeValues: {:0={S: X,}, :1={BOOL: true}}}
I always get 1 row: the one active login for UserId=X. And it's not happening for just one user; it's happening for multiple users in a similar situation.
Are my results contradicting the DynamoDB documentation?
It looks like a contradiction because maxResultSize=10 should mean that DynamoDB reads only the first 10 items (out of 10,001) and then applies the filter active=true, which might return 0 results. It seems very unlikely that the record with active=true happened to be among the first 10 records that DynamoDB read.
This is happening for hundreds of customers running similar queries. It works great when, according to the documentation, it shouldn't work.
I can't see any obvious problem with the query. Are you sure about your premise that users have 10,000 items each?
Your keys are UserId and DeviceId. That seems to mean that if a user logs in with the same device, it would overwrite the existing item. Put another way, I think you are saying that your users have 10,000 different devices each (unless the DeviceId rotates in some way).
In your shoes, I would just remove the filter expression and print the results to the log to see what you're getting in your 10 results. Then remove the limit too and see what results you get with that.
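To see why the two readings give different answers, here is a small Python simulation of the behaviour the quoted documentation describes. It does not call DynamoDB: ITEMS, query_page and the page size of 1,000 are made up for illustration, and placing the active item last is an assumption. Case 1 applies a limit of 10 before filtering on a single request. Case 2 models a client that keeps requesting pages until it has collected 10 filtered results. Whether maxResultSize behaves like case 2 is an assumption to verify against the SDK documentation, although the req dump in the question does not show a Limit field.

```python
# Simulated partition: 10,000 inactive logins followed by 1 active one.
# Where the active item sorts is an assumption made for illustration.
ITEMS = [{"DeviceId": i, "ActiveLogin": False} for i in range(10_000)]
ITEMS.append({"DeviceId": 10_000, "ActiveLogin": True})


def query_page(start, limit, keep):
    """One request: read up to `limit` items from `start`, then filter them."""
    scanned = ITEMS[start:start + limit]
    next_start = start + len(scanned)
    last_key = next_start if next_start < len(ITEMS) else None
    return [item for item in scanned if keep(item)], last_key


def is_active(item):
    return item["ActiveLogin"]


# Case 1: a limit of 10 on a single request, as the documentation describes.
single_page, _ = query_page(0, 10, is_active)

# Case 2: a client that keeps paging until it has collected 10 results.
collected, start = [], 0
while start is not None and len(collected) < 10:
    page, start = query_page(start, 1_000, is_active)
    collected.extend(page)
collected = collected[:10]
```

In this simulation case 1 finds nothing, while case 2 finds the single active login, which matches what the question reports.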