How do I use JSON with U2/Universe

U2/Universe JSON documents are built with UDO functions such as UDOSetProperty. How would one set the value if a property has multiple values? For example, if I have multiple emails.
example: UDOSetProperty(udoHandle, "to", value)
"to": [
{
"email": "recipientEmail#example.com",
"name": "Recipient Name",
"type": "to"
}
],

Not sure if you are trying to add another "to" array element, or if you want to add a second "email" only.
So working with your example:
"to": [
{
"email": [ "recipientEmail#example.com",
"name": "Recipient Name",
"type": "to"
},
{
"email": [ "recipient2Email#example.com",
"name": "Recipient2 Name",
"type": "to"
}
],
If you wanted to create the above JSON from scratch with the UDO functions, the steps would be:
1. Create the initial/root object: UDOCreate(UDO_OBJECT, udoHandle)
2. Create the array: UDOCreate(UDO_ARRAY, thisArray)
3. Use UDOCreate and UDOSetProperty to build each theEmailObject you want to add to the array, then append it with UDOArrayAppendItem(thisArray, theEmailObject)
4. Add the array to the root object with UDOSetProperty(udoHandle, "to", thisArray)
The important part is that there are several dedicated functions for dealing with arrays (see the sketch below).
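To make the flow concrete, here is the same sequence sketched in Python. This is illustrative only, not UDO code; the comments map each line to the UDO call it mirrors:

import json

root = {}                                    # UDOCreate(UDO_OBJECT, udoHandle)
to_array = []                                # UDOCreate(UDO_ARRAY, thisArray)
for email, name in [("recipientEmail@example.com", "Recipient Name"),
                    ("recipient2Email@example.com", "Recipient2 Name")]:
    email_obj = {"email": email, "name": name, "type": "to"}   # UDOCreate + UDOSetProperty
    to_array.append(email_obj)               # UDOArrayAppendItem(thisArray, theEmailObject)
root["to"] = to_array                        # UDOSetProperty(udoHandle, "to", thisArray)
print(json.dumps(root, indent=2))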
Mike
I created a program that builds the JSON with the U2 UDO functions and added it to GitHub:
https://github.com/RocketSoftware/multivalue-lab/blob/master/U2/Demos/UDO/JSON/The-Basics/arrayExample

Related

How do I extract a string of numbers from random text in Power Automate?

I am setting up a flow to organize and save emails as PDFs in a Dropbox folder. The first email to arrive includes a 10-digit identification number, which I extract along with an address. My flow creates a folder in Dropbox named in this format: 2023568684 : 123 Main St. Over a few weeks, additional emails arrive that I need to put into that folder. The subject always has a 10-digit number in it. I was building around each email and using functions like split, first, last, etc. to isolate the 10-digit ID. The problem is that there is no consistency in the subjects or bodies of the messages, so I can't easily find the ID with that method. I started building around each email format individually, but there are far too many, not to mention the possibility of new senders or format changes.
My idea is to use List files in folder when a new message arrives, which will create an array that I can filter to find the folder ID the message needs to be saved to. I know there is a limitation on this because of the 20-file limit, but that is a different topic and question.
For now, how do I find a random 10-digit number in a randomly formatted email subject line so I can use it with the filter function?
For this requirement you really need regex, and at present Power Automate doesn't support regular expressions. The good news is that it looks like support is coming ...
https://powerusers.microsoft.com/t5/Power-Automate-Ideas/Support-for-regex-either-in-conditions-or-as-an-action-with/idi-p/24768
There is a connector but it looks like it's not free ...
https://plumsail.com/actions/request-free-license
To get around it for now, my suggestion would be to create a function app in Azure and let it do the work. This may not be your cup of tea but it will work.
I created a .NET (C#) function with the following code (straight in the portal) ...
#r "Newtonsoft.Json"
using System.Net;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Primitives;
using Newtonsoft.Json;
public static async Task<IActionResult> Run(HttpRequest req, ILogger log)
{
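// Read the raw request body and deserialize the JSON payload: { "Text": "<base64>", "Pattern": "<regex>" }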
string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
dynamic data = JsonConvert.DeserializeObject(requestBody);
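// The Text property arrives base64-encoded; decode it back to plain text before searching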
string strToSearch = System.Text.Encoding.UTF8.GetString(Convert.FromBase64String((string)data?.Text));
string regularExpression = data?.Pattern;
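// Run the supplied pattern over the decoded text and serialize the matches to JSON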
var matches = System.Text.RegularExpressions.Regex.Matches(strToSearch, regularExpression);
var responseString = JsonConvert.SerializeObject(matches, new JsonSerializerSettings()
{
ReferenceLoopHandling = ReferenceLoopHandling.Ignore
});
return new ContentResult()
{
ContentType = "application/json",
Content = responseString
};
}
Then in Power Automate, call the HTTP action, passing in a base64-encoded string of the content you want to search ...
This is the expression used in the JSON ... base64(variables('String to Search')) ... and this is the JSON you need to pass in ...
{
  "Text": "@{base64(variables('String to Search'))}",
  "Pattern": "[0-9]{10}"
}
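If you want to sanity-check the function outside Power Automate, a quick test from Python should work; the function URL below is a placeholder for your own (including its ?code= key):

import base64
import requests  # third-party package: pip install requests

text = "We're going to search for string 2023568684 within this text"
payload = {
    "Text": base64.b64encode(text.encode("utf-8")).decode("ascii"),
    "Pattern": "[0-9]{10}",
}
# Hypothetical URL; substitute your own function app and function name
resp = requests.post("https://myfuncapp.azurewebsites.net/api/RegexMatch", json=payload)
print(resp.json())  # list of match objects with Index/Length/Value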
This is an example of the response ...
[
  {
    "Groups": {},
    "Success": true,
    "Name": "0",
    "Captures": [],
    "Index": 33,
    "Length": 10,
    "Value": "2023568684"
  },
  {
    "Groups": {},
    "Success": true,
    "Name": "0",
    "Captures": [],
    "Index": 98,
    "Length": 10,
    "Value": "8384468684"
  }
]
Next, add a Parse JSON action and use this schema ...
{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "Groups": {
        "type": "object",
        "properties": {}
      },
      "Success": {
        "type": "boolean"
      },
      "Name": {
        "type": "string"
      },
      "Captures": {
        "type": "array"
      },
      "Index": {
        "type": "integer"
      },
      "Length": {
        "type": "integer"
      },
      "Value": {
        "type": "string"
      }
    },
    "required": [
      "Groups",
      "Success",
      "Name",
      "Captures",
      "Index",
      "Length",
      "Value"
    ]
  }
}
Finally, extract the first value found that matches the regex pattern. The call returns multiple results if more than one is found, so you can do something with the others if you need to.
This is the expression ... @{first(body('Parse_JSON'))?['value']}
From this string ...
We're going to search for string 2023568684 within this text and we're also going to try and find 8384468684, this should work.
... the result is 2023568684 (the first match).
Don't have a Premium Power Automate licence, so can't use the HTTP action?
You can do this exact same thing using the LogicApps service in Azure. It's the same engine with some slight differences re: connectors and behaviour.
Instead of the HTTP, use the Azure Functions action.
In relation to your action firing when an email is received: in Logic Apps, it will poll every x seconds/minutes/hours/etc. rather than fire on an event. I'm not 100% sure which email connector you're using, but it should exist.
Dropbox connectors exist, that's no problem.
You can export your Power Automate flow into a Logic Apps format so you don't have to start from scratch.
https://learn.microsoft.com/en-us/azure/logic-apps/export-from-microsoft-flow-logic-app-template
If you're concerned about cost, don't be. Just make sure you use the consumption plan. Costs only really rack up for these services when the apps run for minutes at a time on a regular basis. Just keep an eye on it for your own mental health.
To get the function URL, you can find it in the function itself; you have to be in the function in the portal to see it ...

AWS Redshift - Loading GeoJSON Into Geometry Field

Currently it is not possible to load GeoJSON directly into a Redshift GEOMETRY column using the COPY command, but a workaround has been suggested at:
Copying GeoJSON data from S3 to Redshift
This involves ingesting as WKT and then converting to geometry using a spatial function. However, I'm not entirely sure how to get from GeoJSON to WKT; I am sure there must be some converter available.
But this is where my limited understanding of spatial data comes in. Let's say I want to load weather GeoJSON objects like the one shown below into a table in Redshift.
If I understand it correctly, each Feature in the FeatureCollection would be a row in the table, with just the contents of the JSON geometry field loaded into a field of type GEOMETRY, i.e. this field does not take any of the properties or attributes of the feature. The properties would then be loaded into completely separate fields using conventional datatypes. Then, if I wanted to export that feature as GeoJSON, I would have to stitch the geometry and properties back together again.
Is that correct?
Or does the GEOMETRY type actually have the facility to store the properties as well as the geometry field contents?
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "id": "abcd1234",
      "geometry": {
        "type": "MultiPolygon",
        "coordinates": [
          [
            [
              [7.51, 48.04],
              [8.12, 48.05],
              [8.19, 47.95]
            ]
          ]
        ]
      },
      "geometry_name": "contour",
      "properties": {
        "identifier": "abcd1234",
        "analysisTime": "2019-04-06T14:15:00Z",
        "convectionCellType": "CELL_BASE",
        "speed": 3,
        "area": "MSG",
        "phasetype": null,
        "top": 10363,
        "intensityValue": null,
        "ice": true,
        "created_at": "2020-07-21T12:01:25.651Z"
      }
    }
  ],
  "totalFeatures": 1,
  "numberMatched": 1,
  "numberReturned": 1,
  "timeStamp": "2021-02-03T16:39:48.963Z",
  "crs": {
    "type": "name",
    "properties": {
      "name": "urn:ogc:def:crs:EPSG::4326"
    }
  }
}
I found this Gist which provides the following Python snippet to do the conversion:
from shapely.geometry import shape

o = {
    "coordinates": [[[23.314208, 37.768469], [24.039306, 37.768469], [24.039306, 38.214372], [23.314208, 38.214372], [23.314208, 37.768469]]],
    "type": "Polygon"
}
geom = shape(o)
# Now it's very easy to get a WKT/WKB representation
print(geom.wkt)
print(geom.wkb)
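Extending that to the FeatureCollection above, a sketch along these lines (assuming shapely is installed, and with weather.json as a file name chosen here purely for illustration) gives one row per Feature: the WKT destined for the GEOMETRY column, plus the properties as conventional columns:

import json
from shapely.geometry import shape

with open("weather.json") as f:   # the FeatureCollection shown in the question
    fc = json.load(f)

rows = []
for feature in fc["features"]:
    wkt = shape(feature["geometry"]).wkt   # load this via ST_GeomFromText into GEOMETRY
    props = feature["properties"]          # these become ordinary typed columns
    rows.append({"id": feature["id"], "wkt": wkt, **props})

for row in rows:
    print(row["id"], row["wkt"][:60], row["speed"], row["top"])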

How to highlight custom extractions using a2i's crowd-textract-analyze-document?

I would like to create a human review loop for images that have undergone OCR using Amazon Textract and entity extraction using Amazon Comprehend.
My process is:
1. send the image to Textract to extract the text
2. send the text to Comprehend to extract entities
3. find the Block IDs in Textract's output of the entities extracted by Comprehend
4. add new Blocks of type KEY_VALUE_SET to Textract's JSON output per the docs
5. create a Human Task with the crowd-textract-analyze-document element in the template and feed it the modified Textract output
What fails to work in this process is step 5. My custom entities are not rendered properly. By "fails to work" I mean that the entities are not highlighted on the image when I click them on the sidebar. There is no error in the browser's console.
Has anyone tried such a thing?
Sorry for not including examples. I will remove secrets/PII from my files and attach them to the question.
I used the AWS documentation of the a2i-crowd-textract-detection human task element to generate the value of the initialValue attribute. It appears the doc for that attribute is incorrect. The doc shows that the value should be in the same format as the output of Textract, namely:
[
  {
    "BlockType": "KEY_VALUE_SET",
    "Confidence": 38.43309020996094,
    "Geometry": { ... },
    "Id": "8c97b240-0969-4678-834a-646c95da9cf4",
    "Relationships": [
      { "Type": "CHILD", "Ids": [...] },
      { "Type": "VALUE", "Ids": [...] }
    ],
    "EntityTypes": ["KEY"],
    "Text": "Foo bar"
  },
]
but the a2i-crowd-textract-detection element actually expects the input to have lowerCamelCase attribute names (rather than UpperCamelCase). For example:
[
  {
    "blockType": "KEY_VALUE_SET",
    "confidence": 38.43309020996094,
    "geometry": { ... },
    "id": "8c97b240-0969-4678-834a-646c95da9cf4",
    "relationships": [
      { "Type": "CHILD", "ids": [...] },
      { "Type": "VALUE", "ids": [...] }
    ],
    "entityTypes": ["KEY"],
    "text": "Foo bar"
  },
]
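Until the docs are fixed, a small recursive helper can do the conversion. This is just a sketch that lower-cases the first letter of every key; note that in the example above "Type" inside relationships stays capitalized, so adjust the rule if your template needs that exception:

def lower_camel(obj):
    """Recursively lower-case the first letter of every dict key."""
    if isinstance(obj, dict):
        return {k[:1].lower() + k[1:]: lower_camel(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [lower_camel(item) for item in obj]
    return obj

# usage (hypothetical variable names):
# initial_value = lower_camel(textract_response["Blocks"])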
I opened a support case with AWS about this documentation error.

CouchDB Map syntax

I have documents in CouchDB (v. 2.1.1) as follows:
{
  "xyz": "a",
  "abc": "def"
},
{
  "xyz": "a",
  "ghi": "jkl"
},
{
  "xyz": "a",
  "mno": "pqr"
},
{
  "xyz": "a",
  "stu": "vwx"
},
{
  "xyz": "a",
  "bcd": 1000
}
If I run a simple map function, for example:
function (doc) {
  if (doc.xyz) {
    emit(doc.xyz, doc.abc);
  }
}
I get:
{
  "id": "4c3406a1d92942b4fb10d1314e0061a9",
  "key": "a",
  "value": "def"
},
{
  "id": "4c3406a1d92942b4fb10d1314e006ccf",
  "key": "a",
  "value": null
},
{
  "id": "4c3406a1d92942b4fb10d1314e00787f",
  "key": "a",
  "value": null
},
{
  "id": "4c3406a1d92942b4fb10d1314e00871e",
  "key": "a",
  "value": null
},
{
  "id": "4c3406a1d92942b4fb10d1314e00906a",
  "key": "a",
  "value": null
}
I want to try and eliminate the 'null' outputs.
I am looking at having a CouchDB database with many small documents containing small snippets of information rather than having larger documents containing much more information per document.
My question is: is my document design a good one, and if so, how do I get just what I am looking for rather than rows of nulls? If my storage design is not ideal, what kind of design should I be looking at to simplify the output, given my plan to have many small docs?
EDIT:
Having looked at possible answers, I have decided that having numerous small documents, as I described in my question, is not giving me the kind of benefit I imagined it would.
I was unable to get a satisfactory solution to the map function to get readable answers.
However, I investigated the 'Mango' query system available in recent updates of CouchDB and I was able using these queries to get acceptable output from a database like my supplied one.
This is what I did:
curl -X POST http://admin:123@127.0.0.1:5984/ptn/_find -d '{"selector": {"$or": [{"abc": {"$gt": null}},{"ghi": {"$gt": null}}]},"fields": ["abc","ghi"]}' -H "Content-Type:application/json"
Un-minified:
{
  "selector": {
    "$or": [
      {
        "abc": {
          "$gt": null
        }
      },
      {
        "ghi": {
          "$gt": null
        }
      }
    ]
  },
  "fields": [
    "abc",
    "ghi"
  ]
}
The output:
{"docs":[
{"abc":"def"},
{"ghi":"jkl"}
]
.....
A concise answer.
Sorting can be done but sorted fields must be indexed. Indexing is in any case advised for larger data sets.
Reference:
http://docs.couchdb.org/en/2.1.1/api/database/find.html
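For reference, the same _find request issued from Python (using the same hypothetical credentials and database name as the curl line above):

import requests  # third-party package: pip install requests

query = {
    "selector": {"$or": [{"abc": {"$gt": None}}, {"ghi": {"$gt": None}}]},
    "fields": ["abc", "ghi"],
}
resp = requests.post("http://127.0.0.1:5984/ptn/_find",
                     json=query, auth=("admin", "123"))
print(resp.json()["docs"])  # [{'abc': 'def'}, {'ghi': 'jkl'}]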
As my question asked for a map function, this perhaps cannot be regarded as a valid answer, but for me it is one. I have tried the Mango query system a little on other databases and it seems to be more useful/powerful than I thought it was, although it offers no means of totaling etc.

Formatting a JSON with a dictionary

I have a JSON document with format variables in it, similar to str.format placeholders, and I'd like to be able to load it with the variables replaced by actual values.
For example, if the JSON is:
[
  {
    "role": "President",
    "name": "{first_name}",
    "age": "{first_age}"
  },
  {
    "role": "Vice President",
    "name": "{second_name}",
    "age": "{second_age}"
  }
]
And the dictionary I'd like to format with is:
{"first_name": "Bob", "first_age": "50", "second_name": "Bill", "second_age": "35"}
I'd like to get:
[
  {
    "role": "President",
    "name": "Bob",
    "age": "50"
  },
  {
    "role": "Vice President",
    "name": "Bill",
    "age": "35"
  }
]
I tried converting the JSON to a string, using format, and then turning it back to a list of dictionaries:
from ast import literal_eval
literal_eval(str(raw_json).format(**json_params))
But the dictionaries' curly brackets confuse the format function and give me a KeyError exception. I suppose I could replace every pair of curly brackets which don't have a variable name between them with double curly brackets, but that's bound to go wrong and also not very Pythonic.
What would be the most elegant way to solve that issue?
What you are looking for is a templating engine: the template is the JSON string, and the data must be injected into it. The right tool for that in Python is Jinja2.
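A minimal sketch, assuming you can switch the placeholders to Jinja2's {{ ... }} syntax (which also side-steps the curly-brace clash with JSON):

import json
from jinja2 import Template  # third-party package: pip install jinja2

template_str = """
[
    {"role": "President", "name": "{{ first_name }}", "age": "{{ first_age }}"},
    {"role": "Vice President", "name": "{{ second_name }}", "age": "{{ second_age }}"}
]
"""
params = {"first_name": "Bob", "first_age": "50",
          "second_name": "Bill", "second_age": "35"}

# Render the template, then parse the result back into Python objects
data = json.loads(Template(template_str).render(**params))
print(data[0]["name"], data[0]["age"])  # Bob 50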