JSONata - Grouping, summing, then dividing by the sum of the group

I am trying to group by a key and then calculate the sums for those groups (as shown in this example: JSONata (or JS) - group and sum JSON array / objects). Then I would like to divide the original number by the group sum.
Sample:
{
"positions": [
{
"ticker": "AAPL",
"marketValue": 100
},
{
"ticker": "AAPL",
"marketValue": 200
},
{
"ticker": "ATVI",
"marketValue": 200
},
{
"ticker": "ATVI",
"marketValue": 300
},
{
"ticker": "BAC",
"marketValue": 100
},
{
"ticker": "BAC",
"marketValue": 400
},
{
"ticker": "BAC",
"marketValue": 200
}
]
}
The result I want (where "group-weight" equals each item's marketValue divided by the sum of the marketValues for the same ticker):
{
"positions": [
{
"ticker": "AAPL",
"marketValue": 100,
"group-weight": 0.3333
},
{
"ticker": "AAPL",
"marketValue": 200,
"group-weight": 0.6667
},
{
"ticker": "ATVI",
"marketValue": 200,
"group-weight": 0.4
},
{
"ticker": "ATVI",
"marketValue": 300,
"group-weight": 0.6
},
{
"ticker": "BAC",
"marketValue": 100,
"group-weight": 0.1429
},
{
"ticker": "BAC",
"marketValue": 400,
"group-weight": 0.5714
},
{
"ticker": "BAC",
"marketValue": 200,
"group-weight": 0.2857
}
]
}
I can get the sum of the groups using:
positions{`ticker`: $sum(marketValue)}
but can't get that next step where I divide by the group sums.
https://try.jsonata.org/m_xPDfncW

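(
  /* Map each ticker to the sum of its marketValues, e.g. {"AAPL": 300, ...} */
  $totals := positions{ticker: $sum(marketValue)};
  /* Rebuild each position, dividing its value by the total for its ticker */
  positions.{
    "marketValue": marketValue,
    "ticker": ticker,
    "group-weight": marketValue / $lookup($totals, ticker)
  }
)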
See https://try.jsonata.org/jLDI1Pgnx
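
For readers who want to check the arithmetic outside JSONata, the same two-pass idea (first total per ticker, then divide each value by its ticker's total) can be sketched in plain Python. The function name add_group_weights and the rounding to four decimals are assumptions made for this illustration; the JSONata expression above does not round.

```python
from collections import defaultdict


def add_group_weights(positions):
    """Return copies of the positions with a "group-weight" field added."""
    # Pass 1: sum marketValue per ticker.
    totals = defaultdict(float)
    for position in positions:
        totals[position["ticker"]] += position["marketValue"]

    # Pass 2: divide each marketValue by the total for its ticker.
    # Rounding to 4 decimals is an assumption, chosen to match the
    # precision shown in the question's desired output.
    return [
        {**position,
         "group-weight": round(position["marketValue"] / totals[position["ticker"]], 4)}
        for position in positions
    ]


sample = [
    {"ticker": "AAPL", "marketValue": 100},
    {"ticker": "AAPL", "marketValue": 200},
    {"ticker": "ATVI", "marketValue": 200},
    {"ticker": "ATVI", "marketValue": 300},
    {"ticker": "BAC", "marketValue": 100},
    {"ticker": "BAC", "marketValue": 400},
    {"ticker": "BAC", "marketValue": 200},
]
weighted = add_group_weights(sample)
```

Note that a ticker whose values sum to zero would raise a ZeroDivisionError in this sketch; real data may need a guard for that case.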

Related

AWS Step Functions - Arrays

Given the following, how can I access the 'exists' field? I thought it was simply $.result.exists or $.result[0].exists, but neither of these works. Also, is there a way to extract all occurrences of 'exists' from the array and use them in a choice flow using Step Functions?
[
{
"result": {
"exists": true
},
"StatusCode": 200
},
{
"result": {
"exists": true
},
"StatusCode": 200
}
]
Even better, I would like to transform the following input:
"output": [
[
{
"result": {
"exists": true
},
"StatusCode": 200
},
{
"result": {
"exists": true
},
"StatusCode": 200
}
],
{
"result": {
"exists": true
},
"StatusCode": 200
}
]
into the following single array:
[
{
"result": {
"exists": true
},
"StatusCode": 200
},
{
"result": {
"exists": true
},
"StatusCode": 200
},
{
"result": {
"exists": true
},
"StatusCode": 200
}
]
Is this possible? Any suggestions welcome!
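
The transformation being asked for is a one-level flatten. As a minimal sketch of that operation only, here it is in plain Python; this is not Step Functions syntax, and the helper names flatten_one_level and all_exist are hypothetical.

```python
def flatten_one_level(items):
    """Flatten nested lists by one level, leaving non-list items in place."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(item)   # unpack an inner array
        else:
            flat.append(item)   # keep a plain object as-is
    return flat


def all_exist(results):
    """True when every entry reports result.exists == true."""
    return all(entry["result"]["exists"] for entry in results)


entry = {"result": {"exists": True}, "StatusCode": 200}
output = [[dict(entry), dict(entry)], dict(entry)]

flat = flatten_one_level(output)
exists_flags = [e["result"]["exists"] for e in flat]
```

The same logic could run in a small function invoked from the state machine, with the boolean from all_exist feeding the choice step; whether that fits depends on the workflow.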

Elasticsearch query to match a pattern and replace it with regex

I have to get the top 5 most-hit APIs/URLs and display the HTTP statuses of each URL. I am able to do this with the extraction query below:
{
"query": {
"bool": {
"must": [
{
"range": {
"@timestamp": {
"from": "now-15m",
"to": "now",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"aggregations": {
"Url": {
"terms": {
"field": "data.url.keyword",
"size": 5,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"Status": {
"terms": {
"field": "data.response.status",
"size": 5,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
}
}
}
}
}
}
Output:
"aggregations": {
"Url": {
"doc_count_error_upper_bound": 940,
"sum_other_doc_count": 52374,
"buckets": [
{
"Status": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"doc_count": 3,
"key": 200
},
{
"doc_count": 254,
"key": 400
}
]
},
"doc_count": 3515,
"key": "/account/me/"
},
{
"Status": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"doc_count": 3376,
"key": 200
}
]
},
"doc_count": 3385,
"key": "/PlanDetails"
},
{
"Status": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"doc_count": 3282,
"key": 200
}
]
},
"doc_count": 3282,
"key": "/evaluation"
},
{
"Status": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"doc_count": 3205,
"key": 200
}
]
},
"doc_count": 3205,
"key": "/user/me"
},
{
"Status": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"doc_count": 3055,
"key": 200
}
]
},
"doc_count": 3055,
"key": "/user"
}
]
}
}
}
But I have a few URLs where a user ID appears in the middle of the URL.
For example, URLs like:
/account/me/3417375321062014.cust
DJXXODKNPA1581975RI/PlanDetails
KXVEIPBYSR1597110677RI/payment
Because of the distinct user ID inside the URL, Elasticsearch treats these as separate URLs.
The user ID is usually a mix of letters and numbers, sometimes only numbers, and occasionally only letters.
I need to replace these user IDs in the URL and count the URLs as one.
Let's say there are two hits with a 200 status for /payment:
DJXXODKNPA1583201975RI/payment
KXVEIPBYSR1597110677RI/payment
I should be able to replace the user ID with * and count them as one API, so that I get output like:
*/payment 200:2
But at the moment it treats these as distinct URLs/APIs, so I am unable to get the correct top 5 hit URLs/APIs.
Any ideas on this would be a great help. Thanks.
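
To make the desired normalization concrete, here is a minimal Python sketch of replacing user-ID path segments with * before counting. It runs outside Elasticsearch and only illustrates the rule. The pattern is an assumption based on the sample URLs above (a whole path segment of 10 or more uppercase letters or digits, optionally followed by ".cust"), and the names normalize_url and count_hits are hypothetical.

```python
import re
from collections import Counter

# Assumption: a user ID is an entire path segment made of 10+ uppercase
# letters/digits, optionally followed by a ".cust" suffix.
USER_ID_SEGMENT = re.compile(r"^[A-Z0-9]{10,}(\.cust)?$")


def normalize_url(url):
    """Replace every path segment that looks like a user ID with '*'."""
    segments = url.split("/")
    return "/".join("*" if USER_ID_SEGMENT.match(s) else s for s in segments)


def count_hits(hits):
    """Count (normalized URL, status) pairs from (url, status) tuples."""
    return Counter((normalize_url(url), status) for url, status in hits)


hits = [
    ("DJXXODKNPA1583201975RI/payment", 200),
    ("KXVEIPBYSR1597110677RI/payment", 200),
    ("/account/me/3417375321062014.cust", 200),
]
counts = count_hits(hits)
```

With this rule both /payment hits collapse into the single key ("*/payment", 200). If the real IDs can be short or lowercase, the pattern would need adjusting, and applying the same rule at index time would let a terms aggregation group on the normalized value.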

Google Cloud Vision- Method: files.annotate - Response Object Changing

As per the documentation provided in Cloud Vision Docs the BoundingPoly object in the blocks array should have a format like this
{
"vertices": [
{
object (Vertex)
}
],
"normalizedVertices": [
{
object (NormalizedVertex)
}
]
}
But when we tried https://vision.googleapis.com/v1/files:annotate?key=xxxxxx to perform OCR on a PDF file with the request:
{
"requests": [{
"inputConfig": {
"content": "encoded content",
"mimeType": "application/pdf"
},
"features": [{
"type": "DOCUMENT_TEXT_DETECTION",
"maxResults": 50
}]
}]
}
the response from server was
{
"responses": [
{
"responses": [
{
"fullTextAnnotation": {
"pages": [
{
"property": {
"detectedLanguages": [
{
"languageCode": "en",
"confidence": 0.65
},
{
"languageCode": "fil",
"confidence": 0.01
}
]
},
"width": 841,
"height": 595,
"blocks": [
{
"property": {
"detectedLanguages": [
{
"languageCode": "en",
"confidence": 1
}
]
},
"boundingBox": {
"normalizedVertices": [
{
"x": 0.4351962,
"y": 0.057142857
},
{
"x": 0.6052319,
"y": 0.057142857
},
{
"x": 0.6052319,
"y": 0.08571429
},
{
"x": 0.4351962,
"y": 0.08571429
}
]
},
"paragraphs": [
{
"property": {
"detectedLanguages": [
{
"languageCode": "en",
"confidence": 1
}
]
},
"boundingBox": {
"normalizedVertices": [
{
"x": 0.4351962,
"y": 0.057142857
},
{
"x": 0.6052319,
"y": 0.057142857
},
{
"x": 0.6052319,
"y": 0.08571429
},
{
"x": 0.4351962,
"y": 0.08571429
}
]
},
"words": [
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"boundingBox": {
"normalizedVertices": [
{
"x": 0.4351962,
"y": 0.057142857
},
{
"x": 0.49346018,
"y": 0.057142857
},
{
"x": 0.49346018,
"y": 0.08571429
},
{
"x": 0.4351962,
"y": 0.08571429
}
]
},
"symbols": [
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"text": "F",
"confidence": 0.99
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"text": "a",
"confidence": 1
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"text": "c",
"confidence": 0.99
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"text": "t",
"confidence": 0.99
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"text": "o",
"confidence": 1
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"text": "r",
"confidence": 1
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
],
"detectedBreak": {
"type": "SPACE"
}
},
"text": "y",
"confidence": 1
}
],
"confidence": 0.99
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"text": "i",
"confidence": 0.99
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"text": "n",
"confidence": 1
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
],
"detectedBreak": {
"type": "SPACE"
}
},
"text": "g",
"confidence": 1
}
],
"confidence": 0.99
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"boundingBox": {
"normalizedVertices": [
{
"x": 0.57431626,
"y": 0.057142857
},
{
"x": 0.6052319,
"y": 0.057142857
},
{
"x": 0.6052319,
"y": 0.08571429
},
{
"x": 0.57431626,
"y": 0.08571429
}
]
},
"symbols": [
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"text": "L",
"confidence": 0.99
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"text": "i",
"confidence": 0.99
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
]
},
"text": "s",
"confidence": 0.99
},
{
"property": {
"detectedLanguages": [
{
"languageCode": "en"
}
],
"detectedBreak": {
"type": "LINE_BREAK"
}
},
"text": "t",
"confidence": 1
}
],
"confidence": 0.99
}
],
"confidence": 0.99
}
],
"blockType": "TEXT",
"confidence": 0.99
}
Is there anything to consider when the vertices field is missing from the BoundingPoly object (the boundingBox property) in the JSON above?
When we tried the drag & drop demo, the JSON response for OCR on an image was:
"fullTextAnnotation": {
"pages": [
{
"blocks": [
{
"blockType": "TEXT",
"boundingBox": {
"vertices": [
{
"x": 31,
"y": 63
},
{
"x": 147,
"y": 63
},
{
"x": 147,
"y": 81
},
{
"x": 31,
"y": 81
}
]
},
"confidence": 0.99,
"paragraphs": [
{
"boundingBox": {
"vertices": [
{
"x": 31,
"y": 63
},
{
"x": 147,
"y": 63
},
{
"x": 147,
"y": 81
},
{
"x": 31,
"y": 81
}
]
},
Is this the intended behavior, or is it an issue? Which field should we rely on: normalizedVertices or vertices?

The difference is that in the request made from your code you're sending a PDF, whereas in the drag & drop demo you're sending an image (the demo doesn't accept files).
I replicated this and the behavior is consistent: PDF files are annotated with normalizedVertices, whereas images are annotated with vertices. My guess is that this is intended, to improve the performance of annotation requests for large PDF files (large because of the number of pages).
I sent a request to Google Documentation so they can add this information to their docs.
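
If client code needs pixel coordinates in both cases, one option is to scale normalizedVertices by the page width and height reported in the same response, since normalized coordinates are fractions of the page dimensions in the range 0 to 1. The sketch below is an assumption about how a client might handle this, not an official recommendation; the function name to_pixel_vertices is hypothetical.

```python
def to_pixel_vertices(bounding_box, page_width, page_height):
    """Return the box corners as pixel coordinates, whichever field is present."""
    if "vertices" in bounding_box:
        # Image responses: coordinates are already in pixels.
        # Assumption: a coordinate missing from the JSON is treated as 0.
        return [{"x": v.get("x", 0), "y": v.get("y", 0)}
                for v in bounding_box["vertices"]]
    # PDF responses: scale the 0..1 fractions by the page dimensions.
    return [
        {"x": round(v.get("x", 0.0) * page_width),
         "y": round(v.get("y", 0.0) * page_height)}
        for v in bounding_box.get("normalizedVertices", [])
    ]


# Values taken from the PDF response in the question (page is 841 x 595).
pdf_box = {
    "normalizedVertices": [
        {"x": 0.4351962, "y": 0.057142857},
        {"x": 0.6052319, "y": 0.057142857},
        {"x": 0.6052319, "y": 0.08571429},
        {"x": 0.4351962, "y": 0.08571429},
    ]
}
pdf_pixels = to_pixel_vertices(pdf_box, 841, 595)

# Values taken from the image response in the question.
image_box = {"vertices": [{"x": 31, "y": 63}, {"x": 147, "y": 63}]}
image_pixels = to_pixel_vertices(image_box, 841, 595)
```

For the first block of the PDF response this yields corners at (366, 34), (509, 34), (509, 51) and (366, 51).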

icCube gauge with multiple band colors

I'm trying to make a gauge with two band colors (wrong/red, good/green). I have an example built with amCharts in their online chart maker, https://live.amcharts.com/new/edit/, but I'm not able to get this working in icCube.
We are currently on icCube reporting version 7.0.0 (5549).
This is my chart JSON:
{
"box": {
"id": "wb695",
"widgetAdapterId": "w28",
"rectangle": {
"left": 1510,
"top": 340,
"right": 1910,
"bottom": 640
},
"zIndex": 901
},
"data": {
"mode": "MDX",
"schemaSettings": {
"cubeName": null,
"schemaName": null
},
"options": {
"WIZARD": {
"measures": [],
"rows": [],
"rowsNonEmpty": false,
"columns": [],
"columnsNonEmpty": false,
"filter": []
},
"MDX": {
"statement": "with \n member [Measures].[Measure1] AS 0.9\n member [Measures].[Measure2] AS 0.1\nSELECT\n\n{[Measures].[Measure1], [Measures].[Measure2]} on 0\n\nFROM [cube]"
},
"DATASOURCE": {}
},
"ic3_name": "mdx Query-5",
"ic3_uid": "m17"
},
"data-render": {
"chartType": {
"label": "Gauge",
"proto": {
"chartPrototype": {
"type": "gauge",
"arrows": [
{
"id": "GaugeArrow-1"
}
],
"axes": [
{
"id": "GaugeAxis-1"
}
]
},
"graphPrototype": {},
"dataProviderType": 3
},
"id": "gauge-chart"
},
"graphsConfiguration": [
{
"graph": {}
}
],
"valueAxes": [],
"trendLinesGuides": {},
"configuredQuadrants": {},
"advanced": {
"titles": [],
"faceAlpha": 0,
"faceBorderAlpha": 0
},
"balloon": {
"offsetX": 8
},
"chartOptions": {
"axes": [
{
"axisAlpha": 0.25,
"bottomText": "SLA",
"bottomTextColor": "#2A3F56",
"tickAlpha": 0.25,
"bandOutlineAlpha": 1,
"bandAlpha": 1,
"bandOutlineThickness": 95,
"bandOutlineColor": "#0095BC",
"id": 1
}
],
"bands": [
{
"alpha": 0.8,
"color": "#B53728",
"endValue": 0.6,
"startValue": 0,
"id": "GaugeBand-1"
},
{
"alpha": 0.6,
"color": "#435035",
"endValue": 1,
"startValue": 0.6,
"innerRadius": 0.69,
"id": "GaugeBand-2"
}
]
},
"ic3Data": {
"chartTypeConfig": {
"pie-chart-donut": {
"chartType": {
"label": "Donut",
"proto": {
"chartPrototype": {
"type": "donut",
"pullOutRadius": 0,
"startDuration": 0,
"legend": {
"enabled": false,
"align": "center",
"markerType": "circle"
},
"innerRadius": "60%"
},
"dataProviderType": 1
},
"id": "pie-chart-donut"
},
"graphsConfiguration": [
{}
],
"valueAxes": [],
"trendLinesGuides": {},
"configuredQuadrants": {},
"advanced": {
"titles": []
},
"balloon": {
"offsetX": 8
},
"chartOptions": {
"showZeroSlices": false,
"labelsEnabled": false,
"innerRadius": "60%",
"startAngle": 270,
"radius": "",
"fontSize": 20,
"color": "#0095BC",
"outlineAlpha": 0.25,
"tapToActivate": false
}
}
}
},
"axes": [
{
"startValue": 0,
"endValue": 1,
"startAngle": -90,
"endAngle": 90
}
],
"valueFormatting": ""
},
"navigation": {
"menuVisibility": {
"back": true,
"axisXChange": "All",
"axisYChange": "All",
"filter": "All",
"reset": true,
"widget": true,
"others": "All"
},
"selectionMode": "disabled"
},
"events": {},
"filtering": {},
"hooks": {}
}

Sorry for the late answer. Out of the box this is not possible, but you can use hooks to change the JavaScript options sent to amCharts.
JS / On Widget Options:
function(context, options, $box) {
  // One entry per colored band; each band covers startValue..endValue.
  const bands = [
    {
      "color": "#00CC00",
      "endValue": 300000,
      "id": "GaugeBand-1",
      "startValue": 0
    },
    {
      "color": "#ffac29",
      "endValue": 600000,
      "id": "GaugeBand-2",
      "startValue": 300000
    },
    {
      "color": "#ea3838",
      "endValue": 900000,
      "id": "GaugeBand-3",
      "innerRadius": "95%",
      "startValue": 600000
    }
  ];
  // Attach the bands to the first gauge axis before the options reach amCharts.
  options.axes[0]["bands"] = bands;
  return options;
}
This should work.

Elasticsearch nested aggregation with range aggregation

I am very new to Elasticsearch and trying to figure out how to show particular JSON data from Elasticsearch. Please see the code below:
def bucket_for_order_amount(buckets, aggregations)
  buckets['order_range'] = {
    range: {
      field: :total,
      keyed: true,
      ranges: [
        { to: 25.0 },
        { from: 25.0, to: 50.0 },
        { from: 50.0, to: 75.0 },
        { from: 75.0, to: 100.0 },
        { from: 100.0 }
      ]
    }
  }
end
This returns the following JSON response:
"report": {
"order_range": {
"buckets": {
"*-25.0": {
"to": 25,
"doc_count": 0
},
"25.0-50.0": {
"from": 25,
"to": 50,
"doc_count": 0
},
"50.0-75.0": {
"from": 50,
"to": 75,
"doc_count": 0
},
"75.0-100.0": {
"from": 75,
"to": 100,
"doc_count": 0
},
"100.0-*": {
"from": 100,
"doc_count": 2
}
}
}
I need to show the list of users inside each bucket alongside doc_count, if any users exist in that range, but I am not sure which aggregation to use to achieve this response.
The response would look like this:
"*-25.0": {
"to": 25,
"doc_count": 2
{
user1,
user2
},
},

You can use the top_hits sub-aggregation for that purpose:
def bucket_for_order_amount(buckets, aggregations)
  buckets['order_range'] = {
    range: {
      field: :total,
      keyed: true,
      ranges: [
        { to: 25.0 },
        { from: 25.0, to: 50.0 },
        { from: 50.0, to: 75.0 },
        { from: 75.0, to: 100.0 },
        { from: 100.0 }
      ]
    },
    aggs: {
      users: { top_hits: {} }
    }
  }
end
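
For comparison, the same aggregation body can be built as a plain Python dictionary, which also makes it easy to see the JSON that would be sent. This is only a sketch: the function name order_range_aggregation and the hits_per_bucket parameter are hypothetical. The size option is set explicitly because top_hits returns only 3 documents per bucket by default.

```python
import json


def order_range_aggregation(hits_per_bucket=10):
    """Build a range aggregation on "total" with a top_hits sub-aggregation."""
    return {
        "order_range": {
            "range": {
                "field": "total",
                "keyed": True,
                "ranges": [
                    {"to": 25.0},
                    {"from": 25.0, "to": 50.0},
                    {"from": 50.0, "to": 75.0},
                    {"from": 75.0, "to": 100.0},
                    {"from": 100.0},
                ],
            },
            # Sub-aggregation: return the matching documents for each bucket.
            "aggs": {"users": {"top_hits": {"size": hits_per_bucket}}},
        }
    }


aggs = order_range_aggregation()
request_body = json.dumps({"size": 0, "aggs": aggs}, indent=2)
```

Each bucket in the response then carries a users.hits.hits array with the matching documents next to its doc_count.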
