How do I schedule refresh of a Web.Contents data source? - powerbi

I am trying to set a 'Scheduled Refresh' on a dataset in the Power BI web app (https://app.powerbi.com).
Normally I should see these options in the dataset settings:
But when I go to settings I am greeted by this warning:
and no way to select the 'Gateway Connection' or data source settings.
I found a useful article which explains a problem with Web.Contents and how to get around it:
https://blog.crossjoin.co.uk/2016/08/23/web-contents-m-functions-and-dataset-refresh-errors-in-power-bi/
I applied this and it still doesn't work.
In Power BI Desktop no data sources are listed as I am using a hand-authored query.
The way it works is that a main query (Log Scroll) calls a recursive function query (RecursiveFetch). The function then calls a Web API, which sends a new page of JSON data every time it is called, in a sort of 'scrolling' manner.
The Log Scroll query looks like this:
let
url = "http://exampleURL:1000",
Source = RecursiveFetch(url, 5, null, null)
in
Source
The RecursiveFetch looks like this:
let
RecursiveFetch= (url, scrollCount, scrollID, counter) =>
let
Counter = if (counter = null) then 0 else counter,
Results = if (scrollID = null) then
Json.Document(Web.Contents(url,
[
Headers=[
#"Authorization"="Basic <key here>",
#"Content-Type"="application/json"
]
]
))
else
Json.Document(Web.Contents(url,
[
Content = Text.ToBinary(scrollID),
Headers=[
#"Authorization"="Basic <key here>",
#"Content-Type"="application/json"
]
]
)),
ParsedResults = Table.FromList(Results[hits][hits], Splitter.SplitByNothing(), null, null, ExtraValues.Error),
Return = if (Counter < scrollCount) then
ParsedResults & RecursiveFetch(url, scrollCount, scrollID, Counter)
else
ParsedResults
in
Return
in
RecursiveFetch
It all works perfectly in Power BI Desktop but when I publish it to the web app I get the errors shown above.
I have manually set up a data source in my Gateway Cluster which connects fine to the URL with the same credentials that the hand-authored query uses.
How do I get this all to work? Is there something I have missed?

It all works perfectly in Power BI Desktop but when I publish it to the web app I get the errors shown above.
To fix that, pass a static base URL to Web.Contents and put the dynamic parts in options[RelativePath] and options[Query] (or options[Content] for an HTTP POST).
The original:
"https://data.gov.uk/api/3/action/package_search?q=" & Term
Will use:
let
BaseUrl = "https://data.gov.uk",
Options = [
RelativePath = "/api/3/action/package_search",
Headers = [
Accept="application/json"
],
Query = [
q = Term
]
],
// wrap 'Response' in 'Binary.Buffer' if you are using it multiple times
response = Web.Contents(BaseUrl, Options),
buffered = Binary.Buffer(response),
response_metadata = Value.Metadata(response),
status_code = response_metadata[Response.Status],
from_json = Json.Document(buffered)
in
from_json
The url parameter should be the shortest base URL possible; otherwise the service will treat your dynamic request as unchanged, causing the original refresh error.
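To make the split concrete, here is a minimal Python sketch (not M, and not part of the original answer) that decomposes a full request URL into the three pieces the answer passes to Web.Contents: a static base URL, a RelativePath, and a Query record. The function name split_for_web_contents and the search term "health" are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit, parse_qsl


def split_for_web_contents(full_url):
    """Split a URL into (base_url, relative_path, query) pieces.

    base_url      -> first argument of Web.Contents (static)
    relative_path -> options[RelativePath]
    query         -> options[Query]
    """
    parts = urlsplit(full_url)
    base_url = f"{parts.scheme}://{parts.netloc}"
    # keep_blank_values so parameters like "q=" are not silently dropped
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    return base_url, parts.path, query


base_url, relative_path, query = split_for_web_contents(
    "https://data.gov.uk/api/3/action/package_search?q=health"
)
print(base_url)
print(relative_path)
print(query)
```

Only the first piece should appear as the literal first argument of Web.Contents; everything that changes between calls belongs in the options record.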

AWS Kendra PreHook Lambdas for Data Enrichment

I am working on a POC using Kendra and Salesforce. The connector allows me to connect to my Salesforce Org and index knowledge articles. I have been able to set this up and it is currently working as expected.
There are a few custom fields and data points I want to bring over to help enrich the data even more. One of these is an additional answer / body that will contain key information for the searching.
This field in my data source is rich text containing HTML and is often larger than 2048 characters, a limit that seems to be imposed in a String data field within Kendra.
I came across two hooks that are built in for Pre and Post data enrichment. My thought here is that I can use the pre hook to strip HTML tags and truncate the field before it gets stored in the index.
Hook Reference: https://docs.aws.amazon.com/kendra/latest/dg/API_CustomDocumentEnrichmentConfiguration.html
Current Setup:
I have added a new field to the index called sf_answer_preview. I then mapped this field in the data source to the rich text field in the Salesforce org.
If I run this as is, it will index about 200 of the 1,000 articles and give an error that the remaining articles exceed the 2048 character limit in that field, hence why I am trying to set up the enrichment.
I set up the above enrichment on my data source. I specified a lambda to use in the pre-extraction, as well as no additional filtering, so run this on every article. I am not 100% certain what the S3 bucket is for since I am using a data source, but it appears to be needed so I have added that as well.
For my lambda, I created the following:
exports.handler = async (event) => {
// Debug
console.log(JSON.stringify(event))
// Vars
const s3Bucket = event.s3Bucket;
const s3ObjectKey = event.s3ObjectKey;
const meta = event.metadata;
// Answer
const answer = meta.attributes.find(o => o.name === 'sf_answer_preview');
// Remove HTML Tags
const removeTags = (str) => {
// Return an empty string (not false) so the result is always a string
if ((str === null) || (str === ''))
return '';
return str.toString().replace(/(<([^>]+)>)/ig, '');
}
// Truncate
const truncate = (input) => input.length > 2000 ? `${input.substring(0, 2000)}...` : input;
let result = truncate(removeTags(answer.value.stringValue));
// Response
const response = {
"version" : "v0",
"s3ObjectKey": s3ObjectKey,
"metadataUpdates": [
{"name":"sf_answer_preview", "value":{"stringValue":result}}
]
}
// Debug
console.log(response)
// Response
return response
};
Based on the contract for the lambda described here, it appears pretty straightforward. I access the event, find the field in the data called sf_answer_preview (the rich text field from Salesforce), and strip and truncate the value to 2,000 characters.
For the response, I am telling it to update that field to the new formatted answer so that it complies with the field limits.
When I log the data in the lambda, the pre-extraction event details are as follows:
{
"s3Bucket": "kendrasfdev",
"s3ObjectKey": "pre-extraction/********/22736e62-c65e-4334-af60-8c925ef62034/https://*********.my.salesforce.com/ka1d0000000wkgVAAQ",
"metadata": {
"attributes": [
{
"name": "_document_title",
"value": {
"stringValue": "What majors are under the Exploratory track of Health and Life Sciences?"
}
},
{
"name": "sf_answer_preview",
"value": {
"stringValue": "A complete list of majors affiliated with the Exploratory Health and Life Sciences track is available online. This track allows you to explore a variety of majors related to the health and life science professions. For more information, please visit the Exploratory program description. "
}
},
{
"name": "_data_source_sync_job_execution_id",
"value": {
"stringValue": "0fbfb959-7206-4151-a2b7-fce761a46241"
}
},
]
}
}
The Problem:
When this runs, I am still getting the same field limit error that the content exceeds the character limit. When I run the lambda on the raw data, it strips and truncates it as expected. I am thinking that the response in the lambda for some reason isn't setting the field value to the new content correctly and still trying to use the data directly from Salesforce, thus throwing the error.
Has anyone set up lambdas for Kendra before that might know what I am doing wrong? This seems pretty common to be able to do things like strip PII information before it gets indexed, so I must be slightly off on my setup somewhere.
Any thoughts?
Since you are still passing the rich text as a metadata field of the document, the character limit still applies, so the document fails at the validation step of the API call and never reaches the enrichment step. A workaround is to append those rich text fields to the body of the document so that your lambda can access them there. But if those fields are auto-generated for your documents from your data source, that might not be easy.

How can I get available license via Google Workspace Admin SDK?

On Google Admin screen, I can get numbers of available licenses and used licenses shown below:
How can I get these numbers via API?
Note: I read this question and tried its approach, but it did not work well.
-- EDIT: 2021/07/15 --
My request:
https://developers.google.com/admin-sdk/reports/reference/rest/v1/customerUsageReports/get
date: (few days before now)
parameters: accounts:gsuite_unlimited_total_licenses (comes from Account Parameters)
Response from API:
{
"kind": "admin#reports#usageReports",
"etag": "\"occ7bTD-Q2yefKPIae3LMOtCT9xQVZYBzlAbHU5b86Q/gt9BLwRjoWowpJCRESV3vBMjYMc\""
}
Expectation: I want to get the same data the GUI shows (2 available, 1132 assigned).
To be honest, I would not be satisfied even if I could get the info via this API, because it does not seem to return real-time data like the GUI does.
I think there are 2 ways this information can be obtained, but I can confirm only one of them.
1. Using the Report API that you mentioned.
NOTE: The report is not live data, so you must run the API call with a "date" parameter set at least 2 days before the execution date.
Given that, you would have to run this GET method with the proper date in the {date} param
GET https://admin.googleapis.com/admin/reports/v1/usage/dates/{date}
Then you would need to parse through the parameters to find the desired license you are looking for.
reference - https://developers.google.com/admin-sdk/reports/reference/rest/v1/customerUsageReports#UsageReports
Here is how it looks after parsing:
[
{
"BoolValue": null,
"DatetimeValueRaw": null,
"DatetimeValue": null,
"IntValue": 12065,
"MsgValue": null,
"Name": "accounts:gsuite_enterprise_total_licenses",
"StringValue": null
},
{
"BoolValue": null,
"DatetimeValueRaw": null,
"DatetimeValue": null,
"IntValue": 12030,
"MsgValue": null,
"Name": "accounts:gsuite_enterprise_used_licenses",
"StringValue": null
}
]
Important: The report always lags by 2 days, so you can get the total number of licenses from it (gsuite_enterprise_total_licenses in my example), and then use the Enterprise License Manager API to retrieve all currently assigned licenses.
reference https://developers.google.com/admin-sdk/licensing/reference/rest
2. Using the Reseller API
To retrieve the information from the reseller point of view, you would need to use the subscriptions.get method, providing your customerId and subscriptionId, with the following GET request:
GET https://reseller.googleapis.com/apps/reseller/v1/customers/{customerId}/subscriptions/{subscriptionId}
The response is a subscriptions resource, which contains various information about the license, including the Seats object, which looks like this when expanded:
{
"numberOfSeats": integer,
"maximumNumberOfSeats": integer,
"licensedNumberOfSeats": integer,
"kind": string
}
numberOfSeats should be the total amount of licenses and licensedNumberOfSeats should be the number of users having that license assigned to them.
NOTE: In order to use this API, the given tenant should have a "fully executed and signed reseller contract" - https://developers.google.com/admin-sdk/reseller/v1/how-tos/prerequisites
Reference - https://developers.google.com/admin-sdk/reseller/reference/rest/v1/subscriptions
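The parsing step described for the Reports API above can be sketched as follows. This is a minimal Python illustration, assuming the raw response shape documented for customerUsageReports (usageReports[].parameters[] entries with name and intValue, where intValue arrives as a string). The function name license_counts is hypothetical, and treating total minus used as the available count is an assumption that only holds as of the report date, not in real time.

```python
def license_counts(report, sku_prefix):
    """Pull total/used license counts for one SKU out of a usage report.

    sku_prefix is the parameter name without its suffix, for example
    "accounts:gsuite_enterprise".
    """
    wanted = {
        f"{sku_prefix}_total_licenses": "total",
        f"{sku_prefix}_used_licenses": "used",
    }
    counts = {}
    for usage in report.get("usageReports", []):
        for param in usage.get("parameters", []):
            key = wanted.get(param.get("name"))
            if key is not None and "intValue" in param:
                counts[key] = int(param["intValue"])
    # Assumption: available licenses = total - used, as of the report date.
    if "total" in counts and "used" in counts:
        counts["available"] = counts["total"] - counts["used"]
    return counts


# Sample data using the numbers from the parsed example above.
sample_report = {
    "kind": "admin#reports#usageReports",
    "usageReports": [
        {
            "parameters": [
                {"name": "accounts:gsuite_enterprise_total_licenses", "intValue": "12065"},
                {"name": "accounts:gsuite_enterprise_used_licenses", "intValue": "12030"},
            ]
        }
    ],
}
counts = license_counts(sample_report, "accounts:gsuite_enterprise")
print(counts)
```

A response that contains only kind and etag, like the one in the question, yields an empty dict, which is a quick way to detect that no report exists yet for the requested date.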
Answer:
You can only get the number of assigned licenses using the API. The number available isn't exposed, so it is not returned.
More Information:
Given that you have licenses assigned for your domain, and the user that is querying the API has access to this information, you can retrieve the data with the following request:
curl \
'https://admin.googleapis.com/admin/reports/v1/usage/dates/2021-07-10?parameters=accounts%3Agsuite_unlimited_total_licenses&fields=*&key=[YOUR_API_KEY]' \
--header 'Authorization: Bearer [YOUR_ACCESS_TOKEN]' \
--header 'Accept: application/json' \
--compressed
While not necessary, I added the parameter fields=* in order to make sure all data is returned.
This gave me a response as such:
{
"kind": "admin#reports#usageReports",
"etag": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"usageReports": [
{
"kind": "admin#reports#usageReport",
"date": "2021-07-10",
"etag": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"entity": {
"type": "CUSTOMER",
"customerId": "C0136mgul"
},
"parameters": [
{
"name": "accounts:gsuite_unlimited_total_licenses",
"intValue": "233"
}
]
}
]
}
Here you can see that the intValue for accounts:gsuite_unlimited_total_licenses is 233 - which is reflected in the UI:
Feature Request:
You can however let Google know that this is a feature that is important for access to their APIs, and that you would like to request they implement it.
Google's Issue Tracker is a place for developers to report issues and make feature requests for their development services. I'd urge you to make a feature request there. The best component to file this under would be the Google Admin SDK component, with the Feature Request template.
I was not able to use the Reseller API (I ran into some authorization issues) or the Reports API (it contained null values in all relevant attributes). The only way I found to determine how many licenses were remaining was through the Enterprise License Manager API.
After getting the assignments, I used skuName to filter records based on the type of license.
Here is the complete code.
function getRemainingGoogleUserLicensesCount() {
const GOOGLE_USER_LICENSES_TOTAL = 100 // You can find this from Google Admin -> Billing -> Subscriptions
const productId = 'Google-Apps';
const customerId = 'C03az79cb'; // You can find this from this response, https://developers.google.com/admin-sdk/directory/v1/guides/manage-users#json-response
let assignments = [];
let usedLicenseCount = 0
let pageToken = null;
do {
const response = AdminLicenseManager.LicenseAssignments.listForProduct(productId, customerId, {
maxResults: 500,
pageToken: pageToken
});
assignments = assignments.concat(response.items || []); // items is absent when a page is empty
pageToken = response.nextPageToken;
} while (pageToken);
for (const assignment of assignments) {
if (assignment["skuName"] == "Google Workspace Business Plus") {
usedLicenseCount += 1
}
}
return GOOGLE_USER_LICENSES_TOTAL - usedLicenseCount
}

Log entries api not retrieving log entries

I am trying to retrieve custom logs for a particular project in google-cloud. I am using this api:
https://logging.googleapis.com/v2/entries:list
as per the example given in this link.
The below is the payload:
{
"filter": "projects/projectA/logs/slow_log",
"resourceNames": [
"projects/projectA"
]
}
There is a custom log based metric called slow_log I created in that projectA, which gathers query logs from cloud-SQL database in that project. I also generated data before calling this api. I am able to see the data in stack-driver console, but unable to get it from the rest call.
Every time I run this api, I only get this response and nothing else:
"nextPageToken": "EAA4suKu3qnLwbtrSg8iDSIDCgEAKgYIgL7q8wVSBwibvMSMvhhglPDiiJzdjt_zAWocCgwI2buKhAYQlvTd2gESCAgLEMPV7ukCGAAgAQ"
Is there anything missing here?
How is it possible to pass time range in this query?
Update
I changed the request as per the comment below and gave the full path of the log, but still only the token is returned:
{
"filter": "projects/projectA/logs/cloudsql.googleapis.com%2Fmysql-slow.log",
"projectIds": [
"projectA"
],
"orderBy": "timestamp desc"
}
I also ran this command from the command line:
gcloud logging read logName="projects/projectA/logs/cloudsql.googleapis.com%2Fmysql-slow.log"
It fetches the logs on the command line, so I am not sure what I am missing in the API Explorer and Postman, where I get only the nextPageToken.
resourceNames, filter and orderBy are mandatory, try like this:
{
"resourceNames": [
"projects/projectA"
],
"filter": "projects/projectA/logs/cloudsql.googleapis.com%2Fmysql-slow.log",
"orderBy": "timestamp desc"
}
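On the time-range part of the question, here is a hedged Python sketch that only builds the request body. It assumes the filter is written in the Logging query language, using a logName comparison (as in the gcloud command in the question) combined with timestamp comparisons in RFC 3339 format. The function name build_entries_list_body and the dates are hypothetical; sending the body to entries:list with proper authorization is left out.

```python
import json
from datetime import datetime, timezone
from urllib.parse import quote


def build_entries_list_body(project_id, log_id, start, end):
    """Build a JSON body for entries:list limited to one log and a time window."""
    # Log IDs containing "/" must be URL-encoded inside logName.
    log_name = f"projects/{project_id}/logs/{quote(log_id, safe='')}"
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamp in UTC
    start_utc = start.astimezone(timezone.utc).strftime(fmt)
    end_utc = end.astimezone(timezone.utc).strftime(fmt)
    log_filter = (
        f'logName="{log_name}" '
        f'AND timestamp >= "{start_utc}" '
        f'AND timestamp <= "{end_utc}"'
    )
    return {
        "resourceNames": [f"projects/{project_id}"],
        "filter": log_filter,
        "orderBy": "timestamp desc",
    }


# Hypothetical one-day window for the slow query log from the question.
body = build_entries_list_body(
    "projectA",
    "cloudsql.googleapis.com/mysql-slow.log",
    datetime(2021, 7, 10, tzinfo=timezone.utc),
    datetime(2021, 7, 11, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```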

Is it possible to get reports by filtering using power bi rest api?

Is it possible to get reports by filtering using the Power BI REST API? I want to embed Power BI reports filtered by records. I can't see any such option in the Power BI REST API, so how can I get all reports by a filter and embed them in my application?
I am using powerbi.js as the JavaScript client, so below is my sample code:
https://github.com/Microsoft/PowerBI-JavaScript
var tokenType = 'embed';
// Get models. models contains enums that can be used.
var models = window['powerbi-client'].models;
// We give All permissions to demonstrate switching between View and
//Edit mode and saving report.
var permissions = models.Permissions.All;
var config = {
type: 'report',
tokenType: tokenType == '0' ? models.TokenType.Aad :
models.TokenType.Embed,
accessToken: txtAccessToken,
embedUrl: txtEmbedUrl,
id: txtEmbedReportId,
permissions: permissions,
settings: {
filterPaneEnabled: true,
navContentPaneEnabled: true
}
};
// Get a reference to the embedded report HTML element
var embedContainer = $('#embedContainer')[0];
// Embed the report and display it within the div container.
var report = (<any>window).powerbi.embed(embedContainer, config);
When you are embedding a report, you can use the Embed Configuration to apply filters when the report is loaded. You can also change the filters dynamically later.
Here is a quote from filters wiki:
Filters are JavaScript objects that have a special set of properties. Currently, there are five types of filters: Basic, Advanced, Relative Date, Top N and Include/Exclude, which match the types of filters you can create through the filter pane. There are corresponding interfaces IBasicFilter, IAdvancedFilter, IRelativeDateFilter, ITopNFilter and IIncludeExcludeFilter, which describe their required properties.
For example, your filter can be constructed like this:
const basicFilter: pbi.models.IBasicFilter = {
$schema: "http://powerbi.com/product/schema#basic",
target: {
table: "Sales",
column: "AccountId"
},
operator: "In",
values: [1,2,3],
filterType: pbi.models.FilterType.BasicFilter
}
Pass this filter in the filters property of the report's embed configuration.

How to send GET request to API

Summary: I have a job board, a user searches a zip code and all the jobs matching that zip code are displayed, I am trying to add a feature that lets you see jobs within a certain mile radius of that zip code. There is a web API ( www.zipcodeapi.com ) that does these calculations and returns zip codes within the specified radius, I am just unsure how to use it.
Using www.zipcodeapi.com , you enter a zip code and a distance and it returns all zip codes within this distance. The format for API request is as follows: https://www.zipcodeapi.com/rest/<api_key>/radius.<format>/<zip_code>/<distance>/<units>, so if a user enters zip code '10566' and a distance of 5 miles, the format would be https://www.zipcodeapi.com/rest/<api_key>/radius.json/10566/5/miles and this would return:
{
"zip_codes": [
{
"zip_code": "10521",
"distance": 4.998,
"city": "Croton On Hudson",
"state": "NY"
},
{
"zip_code": "10548",
"distance": 3.137,
"city": "Montrose",
"state": "NY"
}
#etc...
]
}
My question is how do I send a GET request to the API using django?
I have the user-searched zip code stored in zip = request.GET.get('zip') and the mile radius stored in mile_radius = request.GET['mile_radius']. How can I incorporate those two values in their respective spots in https://www.zipcodeapi.com/rest/<api_key>/radius.<format>/<zip_code>/<distance>/<units> and send the request? Can this be done with Django, or do I have this all confused? Does it need to be done with a frontend language? I have tried to search for this on Google but only find results about RESTful APIs, and I don't think this is what I am looking for. Thanks in advance for any help. If you couldn't tell, I've never worked with a web API before.
You can use the requests package to do exactly what you want. It's pretty straightforward and has good documentation.
Here's an example of how you could perform it for your case:
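import requests
# Inside your Django view:
zip_code = request.GET.get('zip')
mile_radius = request.GET['mile_radius']
api_key = YOUR_API_KEY  # placeholder: replace with your zipcodeapi.com key
fmt = 'json'
units = 'miles'
response = requests.get(
url=f'https://www.zipcodeapi.com/rest/{api_key}/radius.{fmt}/{zip_code}/{mile_radius}/{units}',
timeout=10)  # avoid hanging the view if the API is slow
zip_codes = response.json().get('zip_codes')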
zip_codes should then be a list of dicts like those in your example.
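Because the zip code and radius come straight from user input, it can help to build the URL in one place and percent-encode each path segment. The following is a minimal sketch under that assumption; the helper names build_radius_url and extract_zip_codes are hypothetical, and the sample payload simply mirrors the response shown in the question.

```python
from urllib.parse import quote

API_ROOT = "https://www.zipcodeapi.com/rest"


def build_radius_url(api_key, zip_code, distance, units="miles", fmt="json"):
    """Assemble the radius URL, percent-encoding every path segment."""
    segments = [api_key, f"radius.{fmt}", zip_code, distance, units]
    return API_ROOT + "/" + "/".join(quote(str(s), safe="") for s in segments)


def extract_zip_codes(payload):
    """Return just the zip code strings from a parsed radius response."""
    return [item["zip_code"] for item in payload.get("zip_codes", [])]


url = build_radius_url("YOUR_API_KEY", "10566", "5")
print(url)

# Payload shaped like the example response in the question.
sample_payload = {
    "zip_codes": [
        {"zip_code": "10521", "distance": 4.998, "city": "Croton On Hudson", "state": "NY"},
        {"zip_code": "10548", "distance": 3.137, "city": "Montrose", "state": "NY"},
    ]
}
print(extract_zip_codes(sample_payload))
```

In a view, the URL from build_radius_url would be passed to requests.get, and the resulting list of zip codes could then be used to filter the job listings.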