I am attempting (and can successfully do so) to connect to an API and loop through several iterations of the API call in order to grab the next_page value, put it in a list and then call the list.
Unfortunately, when this is published to the PBI service I am unable to refresh there and indeed 'Data Source Settings' tells me I have a 'hand-authored query'.
I have attempted to follow Chris Webbs' blog post around the usage of query parameters and relative path, but if I use this I just get a constant loop of the first page that's hit.
The Start Epoch Time is a helper to ensure I only grab data less than 3 months old.
let
iterations = 10000, // Number of MAXIMUM iterations
url = "https://www.zopim.com/api/v2/" & "incremental/" & "chats?fields=chats(*)" & "&start_time=" & Number.ToText( StartEpochTime ),
FnGetOnePage =
(url) as record =>
let
Source1 = Json.Document(Web.Contents(url, [Headers=[Authorization="Bearer MY AUTHORIZATION KEY"]])),
data = try Source1[chats] otherwise null, //get the data of the first page
next = try Source1[next_page] otherwise null, // the script ask if there is another page*//*
res = [Data=data, Next=next]
in
res,
GeneratedList =
List.Generate(
()=>[i=0, res = FnGetOnePage(url)],
each [i]<iterations and [res][Data]<>null,
each [i=[i]+1, res = FnGetOnePage([res][Next])],
each [res][Data])
Lookups
If Source1 exists, but [chats] may not, you can simplify
= try Source1[chats] otherwise null
to
= Source1[chats]?
Plus it you don't lose non-lookup errors.
m-spec-operators
Chris Web Method
should be something closer to this.
let
Headers = [
Accept="application/json"
],
BaseUrl = "https://www.zopim.com", // very important
Options = [
RelativePath = "api/v2/incremental/chats",
Headers = [
Accept="application/json"
],
Query = [
fields = "chats(*)",
start_time = Number.ToText( StartEpocTime )
],
Response = Web.Contents(BaseUrl, Options),
Result = Json.Document(Response) // skip if it's not JSON
in
Result
Here's an example of a reusable Web.Contents function
helper function
let
/*
from: <https://github.com/ninmonkey/Ninmonkey.PowerQueryLib/blob/master/source/WebRequest_Simple.pq>
Wrapper for Web.Contents returns response metadata
for options, see: <https://learn.microsoft.com/en-us/powerquery-m/web-contents#__toc360793395>
Details on preventing "Refresh Errors", using 'Query' and 'RelativePath':
- Not using Query and Relative path cause refresh errors:
<https://blog.crossjoin.co.uk/2016/08/23/web-contents-m-functions-and-dataset-refresh-errors-in-power-bi/>
- You can opt-in to Skip-Test:
<https://blog.crossjoin.co.uk/2019/04/25/skip-test-connection-power-bi-refresh-failures/>
- Debugging and tracing the HTTP requests
<https://blog.crossjoin.co.uk/2019/11/17/troubleshooting-web-service-refresh-problems-in-power-bi-with-the-power-query-diagnostics-feature/>
update:
- MaybeErrResponse: Quick example of parsing an error result.
- Raw text is returned, this is useful when there's an error
- now response[json] does not throw, when the data isn't json to begin with (false errors)
*/
WebRequest_Simple
= (
base_url as text,
optional relative_path as nullable text,
optional options as nullable record
)
as record =>
let
headers = options[Headers]?, //or: ?? [ Accept = "application/json" ],
merged_options = [
Query = options[Query]?,
RelativePath = relative_path,
ManualStatusHandling = options[ManualStatusHandling]? ?? { 400, 404, 406 },
Headers = headers
],
bytes = Web.Contents(base_url, merged_options),
response = Binary.Buffer(bytes),
response_metadata = Value.Metadata( bytes ),
status_code = response_metadata[Response.Status]?,
response_text = Text.Combine( Lines.FromBinary(response,null,null, TextEncoding.Utf8), "" ),
json = Json.Document(response),
IsJsonX = not (try json)[HasError],
Final = [
request_url = metadata[Content.Uri](),
response_text = response_text,
status_code = status_code,
metadata = response_metadata,
IsJson = IsJsonX,
response = response,
json = if IsJsonX then json else null
]
in
Final,
tests = {
WebRequest_Simple("https://httpbin.org", "json"), // expect: json
WebRequest_Simple("https://www.google.com"), // expect: html
WebRequest_Simple("https://httpbin.org", "/headers"),
WebRequest_Simple("https://httpbin.org", "/status/codes/406"), // exect 404
WebRequest_Simple("https://httpbin.org", "/status/406"), // exect 406
WebRequest_Simple("https://httpbin.org", "/get", [ Text = "Hello World"])
},
FinalResults = Table.FromRecords(tests,
type table[
status_code = Int64.Type, request_url = text,
metadata = record,
response_text = text,
IsJson = logical, json = any,
response = binary
],
MissingField.Error
)
in
FinalResults
Related
The issue
I'm trying out great expectations with dagster, as per this guide
My pipeline seems to execute correctly until it reaches this block:
expectation = dagster_ge.ge_validation_op_factory(
name='ge_validation_op',
datasource_name='dev.data-pipeline-data-storage.data_pipelines.raw_data.sirene_update',
suite_name='suite.data_pipelines.raw_data.sirene_update',
)
if expectation["success"]:
print("Success")
trying to call expectation["success"] results in a
# TypeError: 'SolidDefinition' object is not subscriptable
When I go inside the code of ge_validation_op_factory, there is a _ge_validation_fn that should yield ExpectationResult, but somehow it gets coverted into a SolidDefinition...
Dagster version = 0.15.9;
Great Expectations version = 0.15.44
Code to reproduce the error
In my code, I am trying to interact with an s3 bucket, so it would be a bit tedious to re-create the code for my example but here it is anyway:
In a gx_postprocessing.py
import json
import boto3
import dagster_ge
from dagster import (
op,
graph,
Field,
String,
OpExecutionContext,
)
from typing import List, Dict
#op(
config_schema={
"bucket": Field(
String,
description="s3 bucket name",
),
"path_in_s3": Field(
String,
description="Prefix representing the path to data",
),
"technical_date": Field(
String,
description="date string to fetch data",
),
"file_name": Field(
String,
description="file name that contains the data",
),
}
)
def read_in_json_datafile_from_s3(context: OpExecutionContext):
bucket = context.op_config["bucket"]
path_in_s3 = context.op_config["path_in_s3"]
technical_date = context.op_config["technical_date"]
file_name = context.op_config["file_name"]
object = f"{path_in_s3}/" f"technical_date={technical_date}/" f"{file_name}"
s3 = boto3.resource("s3")
content_object = s3.Object(bucket, object)
file_content = content_object.get()["Body"].read().decode("utf-8")
json_content = json.loads(file_content)
return json_content
#op
def process_example_dq(data: List[Dict]):
return len(data)
#op
def postprocess_example_dq(numrows, expectation):
if expectation["success"]:
return numrows
else:
raise ValueError
#op
def validate_example_dq(context: OpExecutionContext):
expectation = dagster_ge.ge_validation_op_factory(
name='ge_validation_op',
datasource_name='my_bucket.data_pipelines.raw_data.example_update',
suite_name='suite.data_pipelines.raw_data.example_update',
)
return expectation
#graph(
config={
"read_in_json_datafile_from_s3": {
"config": {
"bucket": "my_bucket",
"path_in_s3": "my_path",
"technical_date": "2023-01-24",
"file_name": "myfile_20230124.json",
}
},
},
)
def example_update_evaluation():
output_dict = read_in_json_datafile_from_s3()
nb_items = process_example_dq(data=output_dict)
expectation = validate_example_dq()
postprocess_example_dq(
numrows=nb_items,
expectation=expectation,
)
Do not forget to add great_expectations_poc_pipeline to your __init__.py where the pipelines=[..] are listed.
In this example, dagster_ge.ge_validation_op_factory(...) is returning an OpDefinition, which is the same type of thing as (for example) process_example_dq, and should be composed in the graph definition the same way, rather than invoked within another op.
So instead, you'd want to have something like:
validate_example_dq = dagster_ge.ge_validation_op_factory(
name='ge_validation_op',
datasource_name='my_bucket.data_pipelines.raw_data.example_update',
suite_name='suite.data_pipelines.raw_data.example_update',
)
Then use that op inside your graph definition the same way you currently are (i.e. expectation = validate_example_dq())
I am using the Python Zeep library to call Maximo web services and able to fetch the WSDL without any issues. When I am trying to query using one of the method in Web service, it is throwing error
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
multiasset_wsdl = "http://maximoqa.xyz.com:9080/meaweb/services/MULTIASSET?wsdl"
work_order = 'NA0000211'
# set the session
session = Session()
session.auth = HTTPBasicAuth(Username, Password)
session.verify = False
transport = Transport(session=session)
settings = Settings(strict=False ,force_https = False)
maximo_multiasset_client = Client(multiasset_wsdl, transport=transport, settings=settings
multiasset_type_factory = maximo_multiasset_client.type_factory('ns0')
mult_asset_query = multiasset_type_factory.QueryMULTIASSETType(
baseLanguage = "?",
creationDateTime = tt.get_now(),
maximoVersion = "?",
maxItems = "1",
messageID = "?",
rsStart = "0",
transLanguage = "EN",
uniqueResult = False,
MULTIASSETQuery = multiasset_type_factory.MULTIASSETQueryType
(
operandMode = multiasset_type_factory.OperandModeType('AND'),
orderby = "?",
WHERE = f"WONUM='{work_order}'",
WORKORDER = None
)
)
print('Calling Query MultiAsset')
query_response = maximo_multiasset_client.service.QueryMULTIASSET(mult_asset_query)
Appreciate any help on this issue.
Regarding PowerQuery's List.Generate function.
Could someone please help me understand why the first version of this List.Generate call returns 2 elements (desired result), while the second returns only 1 element (only the initial value - an empty list)? The only difference in the two calls is found on line 3.
Working query
Source =
List.Generate(
()=>[Page = 1, NextPage = "next", Response = [next = "next", results = {}]],
each [Page] <= MaxPages and NextPage <> null,
each[
Page = [Page] + 1,
Response = Json.Document(Web.Contents("https://platform.myapi.com/public_api?",
[
Query=
[
page=Text.From([Page]),
page_size=Text.From(1000)
],
Headers=[#"x-api-key"=#"ApiKey"],
RelativePath=relativePath
]
)
),
NextPage = [Response][next]
],
each [Response][results]
), ...
Correct. 2 lists - the initial, empty list, and the results of the API call
Query I think should work, but doesn't
List.Generate(
()=>[Page = 1, NextPage = "next", Response = [next = "next", results = {}]],
each [Page] <= MaxPages and [Response][next] <> null, // only change
each[
Page = [Page] + 1,
Response = Json.Document(Web.Contents("https://platform.myapi.com/public_api?",
[
Query=
[
page=Text.From([Page]),
page_size=Text.From(1000)
],
Headers=[#"x-api-key"=#"ApiKey"],
RelativePath=relativePath
]
)
),
NextPage = [Response][next]
],
each [Response][results]
),...
Incorrect. Only the initial empty list
For clarity, here is sample output from the API. It’s a paginated API with 1,000 records per page. So, since there are only 697 records returned, there is no further page to pull, and the next element is returned as null. There are 697 elements in the results array.
In you data, as you state, Response[next] is null (since you only have 687 items).
So the condition:
each [Page] <= MaxPages and [Response][next] <> null,
Is never met on line 3 of query 2 and no further processing happens.
In the first query, NextPage = "next", so this works.
I need help to create orders using the bittrex version 3 REST API. I have the code below and I can't understand what is missing to work.
I can make other GET calls, but I cannot make this POST request.
I don't know how to deal with the passing of parameters.
Official documentation at https://bittrex.github.io/api/v3#tag-Orders.
def NewOrder(market, amount, price):
#print 'open sell v3', market
market = 'HEDG-BTC'#'BTC-'+market
uri = 'https://api.bittrex.com/v3/orders?'
params = {
'marketSymbol': 'BTC-HEDG',#'HEDG-BTC', #market
'direction': 'BUY',
'type': 'LIMIT',
'quantity': amount,
'limit': price,
'timeInForce': 'POST_ONLY_GOOD_TIL_CANCELLED',
'useAwards': True
}
timestamp = str(int(time.time()*1000))
Content = ""
contentHash = hashlib.sha512(Content.encode()).hexdigest()
Method = 'POST'
uri2 = buildURI(uri, params)
#uri2 = 'https://api.bittrex.com/v3/orders?direction=BUY&limit=0.00021&marketSymbol=HEDG-BTC&quantity=1.1&timeInForce=POST_ONLY_GOOD_TIL_CANCELLED&type=LIMIT&useAwards=True'
#print uri2
PreSign = timestamp + uri2 + Method + contentHash# + subaccountId
#print PreSign
Signature = hmac.new(apisecret, PreSign.encode(), hashlib.sha512).hexdigest()
headers = {
'Api-Key' : apikey,
'Api-Timestamp' : timestamp,
'Api-Content-Hash': contentHash,
'Api-Signature' : Signature
}
r = requests.post(uri2, data={}, headers=headers, timeout=11)
return json.loads(r.content)
NewOrder('HEDG', 1.1, 0.00021)
And my error message:
{u'code': u'BAD_REQUEST', u'data': {u'invalidRequestParameter': u'direction'}, u'detail': u'Refer to the data field for specific field validation failures.'}
It seems from the documentation that this body is expected by the api as json data:
{
"marketSymbol": "string",
"direction": "string",
"type": "string",
"quantity": "number (double)",
"ceiling": "number (double)",
"limit": "number (double)",
"timeInForce": "string",
"clientOrderId": "string (uuid)",
"useAwards": "boolean"
}
and you are setting these values as url params that's the issue.
you need to do this:
uri = 'https://api.bittrex.com/v3/orders'
# NOTE >>>> please check that you provide all the required fields.
payload = {
'marketSymbol': 'BTC-HEDG',#'HEDG-BTC', #market
'direction': 'BUY',
'type': 'LIMIT',
'quantity': amount,
'limit': price,
'timeInForce': 'POST_ONLY_GOOD_TIL_CANCELLED',
'useAwards': True
}
# do rest of the stuffs as you are doing
# post payload as json data with the url given in doc
r = requests.post(uri, json=payload, headers=headers, timeout=11)
print(r.json())
If you still have issues let us know. If it works then please mark answer as accepted.
Hope this helps.
I made the following modifications to the code but started to give error in 'Content-Hash'
I'm assuming that some parameters are optional so they are commented.
def NewOrder(market, amount, price):
market = 'BTC-'+market
uri = 'https://api.bittrex.com/v3/orders'
payload = {
'marketSymbol': market,
'direction': 'BUY',
'type': 'LIMIT',
'quantity': amount,
#"ceiling": "number (double)",
'limit': price,
'timeInForce': 'POST_ONLY_GOOD_TIL_CANCELLED',
#"clientOrderId": "string (uuid)",
'useAwards': True
}
#ceiling (optional, must be included for ceiling orders and excluded for non-ceiling orders)
#clientOrderId (optional) client-provided identifier for advanced order tracking
timestamp = str(int(time.time()*1000))
Content = ''+json.dumps(payload, separators=(',',':'))
print Content
contentHash = hashlib.sha512(Content.encode()).hexdigest()
Method = 'POST'
#uri2 = buildURI(uri, payload)#line not used
print uri
#PreSign = timestamp + uri2 + Method + contentHash# + subaccountId
PreSign = timestamp + uri + Method + contentHash# + subaccountId
print PreSign
Signature = hmac.new(apisecret, PreSign.encode(), hashlib.sha512).hexdigest()
headers = {
'Api-Key' : apikey,
'Api-Timestamp' : timestamp,
'Api-Content-Hash': contentHash,
'Api-Signature' : Signature
}
r = requests.post(uri, json=payload, headers=headers, timeout=11)
print(r.json())
return json.loads(r.content)
NewOrder('HEDG', 1.5, 0.00021)
{u'code': u'INVALID_CONTENT_HASH'}
Bittrex API via requests package PYTHON
import hmac
import hashlib
import time, requests
nonce = str(int(time.time() * 1000))
content_hash = hashlib.sha512(''.encode()).hexdigest()
signature = hmac.new(
'<SECRET_KEY>'.encode(),
''.join([nonce, url, 'GET', content_hash]).encode(),
hashlib.sha512
).hexdigest()
headers = {
'Api-Timestamp': nonce,
'Api-Key': '<API_KEY>',
'Content-Type': 'application/json',
'Api-Content-Hash': content_hash,
'Api-Signature': signature
}
result = requests.get(url=url, headers=headers)
I want implement fetching in autocomplete, here is my autocomplete function
def autocomplete(request):
fetch_field = request.GET.get('fetch_field')
sqs = SearchQuerySet().autocomplete(
content_auto=request.GET.get(
'query',
''))[
:5]
s = []
for result in sqs:
d = {"value": result.title, "data": result.object.slug}
s.append(d)
output = {'suggestions': s}
print('hihi' ,output)
return JsonResponse(output)
Now I can get fetch fields but I don't know how to fetch with SearchQuerySet.
sqs = SearchQuerySet().filter(field_want_to_fetch = fetch_field ).autocomplete(
content_auto=request.GET.get(
'query',
''))[
:5]
Use this !!