Range query for long type in aws elasticsearch - amazon-web-services

I am trying to query an Elasticsearch index in AWS to get all entries with a mass attribute greater than 1000; the datatype of the attribute is Long.
I found the range query and have tried it (see example below), but it returns nothing, even though other queries do return documents with mass greater than 1000, so they're definitely in the index.
This is the Range query I'm trying:
{
  "method": "POST",
  "index": "users",
  "type": "user",
  "path": "_search?filter_path=filter",
  "body": {
    "size": 20,
    "from": 0,
    "query": {
      "bool": {
        "must": [{
          "range": {
            "mass": {
              "gte": 1000
            }
          }
        }]
      }
    }
  }
}
I'm not getting any error messages, just zero hits.

The problem that's causing you to get zero hits is the filter_path parameter you specify in
"path": "_search?filter_path=filter"
As stated in the official documentation, the filter_path parameter is part of the common options for the REST APIs. That means you can always add that parameter.
With response filtering you can reduce the response returned by Elasticsearch. Since you defined
_search?filter_path=filter
you probably get zero hits because there is no filter element in the response that could be returned.
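As a sketch of the fix, this is the same request with the filter_path parameter simply removed (alternatively you could keep it but point it at the parts of the response you actually want, e.g. _search?filter_path=hits.hits._source,hits.total):
{
  "method": "POST",
  "index": "users",
  "type": "user",
  "path": "_search",
  "body": {
    "size": 20,
    "from": 0,
    "query": {
      "bool": {
        "must": [{
          "range": {
            "mass": {
              "gte": 1000
            }
          }
        }]
      }
    }
  }
}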

Related

How do I extract a string of numbers from random text in Power Automate?

I am setting up a flow to organize and save emails as PDFs in a Dropbox folder. The first email to arrive includes a 10-digit identification number, which I extract along with an address. My flow creates a folder in Dropbox named in this format: 2023568684 : 123 Main St. Over a few weeks, additional emails arrive that I need to put into that folder. The subject always has a 10-digit number in it. I was building around each email and using functions like split, first, last, etc. to isolate the 10-digit ID. The problem is that there is no consistency in the subjects or bodies of the messages, so I can't easily find the ID with that method. I ended up starting to build around each email format individually, but there are way too many, not to mention the possibility of new senders or format changes.
My idea is to use List files in folder when a new message arrives which will create an array that I can filter to find the folder ID the message needs to be saved to. I know there is a limitation on this because of the 20 file limit but that is a different topic and question.
For now, how do I find a random 10 digit number in a randomly formatted email subject line so I can use it with the filter function?
For this requirement you really need regex, and at present PowerAutomate doesn't support regex expressions, but the good news is that it looks like it's coming ...
https://powerusers.microsoft.com/t5/Power-Automate-Ideas/Support-for-regex-either-in-conditions-or-as-an-action-with/idi-p/24768
There is a connector but it looks like it's not free ...
https://plumsail.com/actions/request-free-license
To get around it for now, my suggestion would be to create a function app in Azure and let it do the work. This may not be your cup of tea but it will work.
I created a .NET (C#) function with the following code (straight in the portal) ...
#r "Newtonsoft.Json"
using System.Net;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Primitives;
using Newtonsoft.Json;
public static async Task<IActionResult> Run(HttpRequest req, ILogger log)
{
string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
dynamic data = JsonConvert.DeserializeObject(requestBody);
string strToSearch = System.Text.Encoding.UTF8.GetString(Convert.FromBase64String((string)data?.Text));
string regularExpression = data?.Pattern;
var matches = System.Text.RegularExpressions.Regex.Matches(strToSearch, regularExpression);
var responseString = JsonConvert.SerializeObject(matches, new JsonSerializerSettings()
{
ReferenceLoopHandling = ReferenceLoopHandling.Ignore
});
return new ContentResult()
{
ContentType = "application/json",
Content = responseString
};
}
Then in PowerAutomate, call the HTTP action passing in a base64 encoded string of the content you want to search ...
This is the expression in the JSON ... base64(variables('String to Search')) ... and this is the JSON you need to pass in ...
{
  "Text": "#{base64(variables('String to Search'))}",
  "Pattern": "[0-9]{10}"
}
This is an example of the response ...
[
  {
    "Groups": {},
    "Success": true,
    "Name": "0",
    "Captures": [],
    "Index": 33,
    "Length": 10,
    "Value": "2023568684"
  },
  {
    "Groups": {},
    "Success": true,
    "Name": "0",
    "Captures": [],
    "Index": 98,
    "Length": 10,
    "Value": "8384468684"
  }
]
Next, add a Parse JSON action and use this schema ...
{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "Groups": {
        "type": "object",
        "properties": {}
      },
      "Success": {
        "type": "boolean"
      },
      "Name": {
        "type": "string"
      },
      "Captures": {
        "type": "array"
      },
      "Index": {
        "type": "integer"
      },
      "Length": {
        "type": "integer"
      },
      "Value": {
        "type": "string"
      }
    },
    "required": [
      "Groups",
      "Success",
      "Name",
      "Captures",
      "Index",
      "Length",
      "Value"
    ]
  }
}
Finally, extract the first value you find that matches the regex pattern. Multiple results are returned if they exist, so if you need to, you can do something with the others as well.
This is the expression ... #{first(body('Parse_JSON'))?['value']}
From this string ...
We're going to search for string 2023568684 within this text and we're also going to try and find 8384468684, this should work.
... this is the result: 2023568684 (the first 10-digit match).
Don't have a Premium PowerAutomate licence so can't use the HTTP action?
You can do this exact same thing using the LogicApps service in Azure. It's the same engine with some slight differences re: connectors and behaviour.
Instead of the HTTP, use the Azure Functions action.
In relation to your trigger that fires when an email is received: in LogicApps it will poll every x seconds/minutes/hours/etc. rather than firing on the event. I'm not 100% sure which email connector you're using, but it should exist.
Dropbox connectors exist, that's no problem.
You can export your PowerAutomate flow into a LogicApps format so you don't have to start from scratch.
https://learn.microsoft.com/en-us/azure/logic-apps/export-from-microsoft-flow-logic-app-template
If you're concerned about cost, don't be. Just make sure you use the consumption plan. Costs only really rack up for these services when the apps run for minutes at a time on a regular basis. Just keep an eye on it for your own mental health.
To get the function URL, you can find it in the function itself. You have to be in the function ...

Adding pages to a multi-column Notion database sometimes works flawlessly and sometimes gives a validation error for the same input

Basically, I'm using Postman to send POST requests to
https://api.notion.com/v1/pages
It works about 70% of the time, and the rest of the time it gives the following error. That is, for the same input.
{
  "object": "error",
  "status": 400,
  "code": "validation_error",
  "message": "body failed validation. Fix one: body.parent.type should be not present, instead was `\"database_id\"`. body.parent.page_id should be defined, instead was `undefined`."
}
Here's how my body starts
{
  "parent": {
    "type": "database_id",
    "database_id": "a94c42320ef04b6a9c1a7e5e73455557"
  },
  "properties": {
    "Title": {
      ..................
I'm not posting the entire body because it works flawlessly sometimes.
Please help me out. Is there a way to check logs of the requests that come to my page?
First, I found out that type: database_id is not necessary in parent.
I also found out that syntax errors in the payload return a 400 error:
body failed validation. Fix one: body.parent.type should be not present, instead was `\"database_id\"`. body.parent.page_id should be defined, instead was `undefined`.
In my case, I wrongly added a value at the same level as parent and properties, like this:
{
  "parent": {
    "database_id": "<database_id>"
  },
  "properties": {
    ...
  },
  "wrong_value": {}
}
Since the errors are not that specific, check if you made the same mistake as me, and please also double-check that the parent you are trying to post to is actually a database, not a page.
The issue was with having "type: database_id" inside "parent" in the request data.
{
  "parent": {
    "type": "database_id", (REMOVE THIS LINE)
    "database_id": "a94c42320ef04b6a9c1a7e5e73455557"
  },
  "properties": {
    "Title": {
      ..................
After removing "type" it worked fine. Notion needs to update their docs.

GCP Stackdriver entries list pagination issue

I am sending the following request to the entries list API; here is the link to the API:
https://cloud.google.com/logging/docs/reference/v2/rest/v2/entries/list
{
  "filter": "(jsonPayload.event_type=\"GCE_OPERATION_DONE\" OR protoPayload.serviceName=\"storage.googleapis.com\" OR protoPayload.serviceName=\"clientauthconfig.googleapis.com\" OR protoPayload.serviceName=\"iam.googleapis.com\" OR protoPayload.serviceName=\"compute.googleapis.com\") AND (jsonPayload.event_subtype=\"compute.instances.insert\" OR jsonPayload.event_subtype=\"compute.instances.delete\" OR protoPayload.methodName=\"storage.buckets.create\" OR protoPayload.methodName=\"storage.buckets.delete\" AND protoPayload.resourceOriginalState.direction=\"EGRESS\" AND protoPayload.request.disabled=true)) AND timestamp>=\"2020-05-16T12:52:00.820Z\" AND timestamp < \"2020-05-16T13:52:00.820Z\"",
  "resourceNames": [
    "projects/project1"
  ],
  "orderBy": "timestamp desc",
  "pageSize": 1000,
  "pageToken": "xxxx"
}
I am getting the following response:
{
  "error": {
    "code": 400,
    "message": "page_token doesn't match arguments from the request",
    "status": "INVALID_ARGUMENT"
  }
}
Can anyone suggest what this message implies, ideally with an example?
This error is faced when the page token of some other project is being used in place of project1's.
Try testing the API using the following format for the request body, and also try without any parameters.
{
  "projectIds": [
    string
  ],
  "resourceNames": [
    string
  ],
  "filter": string,
  "orderBy": string,
  "pageSize": integer,
  "pageToken": string
}
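The error message also suggests that a pageToken is only valid together with exactly the same arguments as the request that produced it. As a sketch, a follow-up page request would repeat the first request verbatim and only add the nextPageToken value returned by the previous response (placeholders below are illustrative):
{
  "resourceNames": [
    "projects/project1"
  ],
  "filter": "<exact same filter string as in the first request>",
  "orderBy": "timestamp desc",
  "pageSize": 1000,
  "pageToken": "<nextPageToken from the previous response>"
}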

How to specify attributes to return from DynamoDB through AppSync

I have an AppSync pipeline resolver. The first function queries an ElasticSearch database for the DynamoDB keys. The second function queries DynamoDB using the provided keys. This was all working well until I ran into the 1 MB limit of AppSync. Since most of the data is in a few attributes/columns I don't need, I want to limit the results to just the attributes I need.
I tried adding AttributesToGet and ProjectionExpression (from here) but both gave errors like:
{
  "data": {
    "getItems": null
  },
  "errors": [
    {
      "path": [
        "getItems"
      ],
      "data": null,
      "errorType": "MappingTemplate",
      "errorInfo": null,
      "locations": [
        {
          "line": 2,
          "column": 3,
          "sourceName": null
        }
      ],
      "message": "Unsupported element '$[tables][dev-table-name][projectionExpression]'."
    }
  ]
}
My DynamoDB function request mapping template looks like this (it returns results as long as the data is less than 1 MB):
#set($ids = [])
#foreach($pResult in ${ctx.prev.result})
  #set($map = {})
  $util.qr($map.put("id", $util.dynamodb.toString($pResult.id)))
  $util.qr($map.put("ouId", $util.dynamodb.toString($pResult.ouId)))
  $util.qr($ids.add($map))
#end
{
  "version" : "2018-05-29",
  "operation" : "BatchGetItem",
  "tables" : {
    "dev-table-name": {
      "keys": $util.toJson($ids),
      "consistentRead": false
    }
  }
}
I contacted the AWS people who confirmed that ProjectionExpression is not supported currently and that it will be a while before they will get to it.
Instead, I created a lambda to pull the data from DynamoDB.
To limit the results form DynamoDB I used $ctx.info.selectionSetList in AppSync to get the list of requested columns, then used the list to specify the data to pull from DynamoDB. I needed to get multiple results, maintaining order, so I used BatchGetItem, then merged the results with the original list of IDs using LINQ (which put the DynamoDB results back in the correct order since BatchGetItem in C# does not preserve sort order like the AppSync version does).
Because I was using C# with a number of libraries, the cold start time was a little long, so I used Lambda Layers pre-JITed to Linux, which allowed us to get the cold start time down from ~1.8 seconds to ~1 second (when using 1024 MB of RAM for the Lambda).
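As an illustrative sketch of that wiring (the field names here are assumptions, not the exact template used), the keys from the previous pipeline function and the requested fields can be forwarded to such a Lambda from the AppSync request mapping template like this:
## Invoke a Lambda data source, passing the keys from the previous
## pipeline function plus the GraphQL selection set.
{
  "version": "2018-05-29",
  "operation": "Invoke",
  "payload": {
    "keys": $util.toJson($ctx.prev.result),
    "fields": $util.toJson($ctx.info.selectionSetList)
  }
}
The Lambda can then build its own ProjectionExpression from the "fields" list before calling BatchGetItem.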
AppSync doesn't support projection but you can explicitly define what fields to return in the response template instead of returning the entire result set.
{
  "id": "$ctx.result.get('id')",
  "name": "$ctx.result.get('name')",
  ...
}

When predicting, what are the valid values for dataFormat?

Problem
Using the REST API, I have trained and deployed a model that I now want to use for prediction. I've defined the collections for prediction input and output and uploaded a JSON file, formatted accordingly, to Cloud Storage. However, when trying to create a prediction job I cannot figure out what value to use for the dataFormat field, which is a required parameter. Is there any way to list all valid values?
What I've tried
My requests look like the one below. I've tried JSON, NEWLINE_DELIMITED_JSON (like when importing data into BigQuery), and even the json mime type application/json, in pretty much all different cases I can think of (upper and lower combined with snake, camel, etc.).
{
  "jobId": "my_predictions_123",
  "predictionInput": {
    "modelName": "projects/myproject/models/mymodel",
    "inputPaths": [
      "gs://model-bucket/data/testset.json"
    ],
    "outputPath": "gs://model-bucket/predictions/0/",
    "region": "us-central1",
    "dataFormat": "JSON"
  },
  "predictionOutput": {
    "outputPath": "gs://my-bucket/predictions/1/"
  }
}
All my attempts have only gotten me this back though:
{
  "error": {
    "code": 400,
    "message": "Invalid value at 'job.prediction_input.data_format' (TYPE_ENUM), \"JSON\"",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "job.prediction_input.data_format",
            "description": "Invalid value at 'job.prediction_input.data_format' (TYPE_ENUM), \"JSON\""
          }
        ]
      }
    ]
  }
}
According to the Cloud ML API reference document https://cloud.google.com/ml/reference/rest/v1beta1/projects.jobs#DataFormat, the dataFormat field in your request should be "TEXT" for all text input (including JSON, CSV, etc.).
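With that, the request body from the question should be accepted once dataFormat is set to "TEXT", i.e.:
{
  "jobId": "my_predictions_123",
  "predictionInput": {
    "modelName": "projects/myproject/models/mymodel",
    "inputPaths": [
      "gs://model-bucket/data/testset.json"
    ],
    "outputPath": "gs://model-bucket/predictions/0/",
    "region": "us-central1",
    "dataFormat": "TEXT"
  },
  "predictionOutput": {
    "outputPath": "gs://my-bucket/predictions/1/"
  }
}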