How to decode block extrinsics using Polkadot js api - polkadot-js

I am writing TypeScript code to get block information. I am connected to wss://kusama-rpc.polkadot.io and I am following the official Polkadot JS API documentation.
I am calling the api.rpc.chain.getBlock method to get block information, and it returns the block information as JSON:
{
"header": {
"parentHash": "0xf292579563eb2f12e7a1571643d5285a072f04694397758cae76b38075daf631",
"number": 1134,
"stateRoot": "0x468de0ef831c96f56d518017b18d76a89f35f30371c45866d12c12ca2116a407",
"extrinsicsRoot": "0x4875f3ab89c2a3c30f5de8be2ac40cfaee02059fd69ea76115550a418db5fcc8",
"digest": {
"logs": [
"0x066175726120d86ae01200000000",
"0x05617572610101be3d6d596445d3cb3b711da09e22f9f24c283306744657ce397d17ff1dbf9859051def7406cd356b2d3d2add155d76618f6b098de0c4ce6b7620106ec00e1188"
]
}
},
"extrinsics": [
"0x280401000bc0ca26af7001"
]
}
How do I get the extrinsic details like this:
{
"method": {
"callIndex": "0x0200",
"section":"timestamp",
"method": "set",
"args" : [
"1,582,827,870,000"
]
},
"isSigned": false
}
I am assuming the extrinsic I am getting is encoded; what is the method to decode it?

I found parseInt('0x33c395') works great to decode block numbers

Those are hex-encoded Uint8Arrays.
If you are using Node.js you could simply strip the "0x" prefix and perform a Buffer.from(<hex string>, 'hex')
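For example, a minimal Node.js/TypeScript sketch of that hex-to-bytes step (the variable names are just for illustration):
// one of the hex strings returned in the block JSON
const hex = '0x280401000bc0ca26af7001';

// strip the '0x' prefix and decode the remaining hex digits into raw bytes
const bytes = Buffer.from(hex.slice(2), 'hex');

console.log(bytes); // <Buffer 28 04 01 00 0b c0 ca 26 af 70 01>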

Given that the extrinsic is 0x280401000bc0ca26af7001
Let's decode it. First comes the length of the rest of the extrinsic, encoded with SCALE compact encoding.
0x28 is 0b0010_1000; the two low bits 0b00 select single-byte mode, so the value is 0b001010, i.e. 10 in decimal (ten bytes follow).
0x04 means that it is an unsigned extrinsic of version 4 (EXTRINSIC_VERSION); a signed one would instead carry 0b1000_0000 | EXTRINSIC_VERSION.
Since it is not signed, the call data comes next; otherwise a signature would follow.
0x01 is a runtime-specific enum value (the pallet_index).
0x00 is a runtime-specific enum value (the call_index).
0x0bc0ca26af7001: according to the compact encoding, the first byte 0x0b (0b0000_1011) means big-integer mode because of the two low bits 0b11; the remaining bits 0b000010 give the number of following bytes minus 4, i.e. 4 + 0b10 = 6 bytes follow, in little-endian order.
Thus one has to reverse the bytes of 0xc0ca26af7001 to account for the byte order.
0xc0ca26af7001 -> 0x0170af26cac0, i.e. 1,583,486,520,000 in decimal.
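That said, you normally do not have to decode SCALE by hand: the polkadot-js API decodes extrinsics for you. A rough TypeScript sketch along the lines of the official docs (the block number is the one from the question; treat this as a sketch rather than a drop-in snippet):
import { ApiPromise, WsProvider } from '@polkadot/api';

async function main() {
  const api = await ApiPromise.create({ provider: new WsProvider('wss://kusama-rpc.polkadot.io') });

  // fetch the block and walk its already-decoded extrinsics
  const blockHash = await api.rpc.chain.getBlockHash(1134);
  const signedBlock = await api.rpc.chain.getBlock(blockHash);

  signedBlock.block.extrinsics.forEach((ex) => {
    const { method: { args, method, section } } = ex;
    console.log(`${section}.${method}`, args.map((a) => a.toHuman()));
    console.log('isSigned:', ex.isSigned);
    // or simply: console.log(JSON.stringify(ex.toHuman(), null, 2));
  });

  // a raw hex extrinsic can also be decoded directly:
  // const ext = api.createType('Extrinsic', '0x280401000bc0ca26af7001');

  await api.disconnect();
}

main().catch(console.error);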

Related

NEAR FunctionCall `args` field

In near_primitives::views, the args field on FunctionCall is represented as a String type. In the chain data model, transaction::Action::FunctionCall, the args field is a `Vec<u8>`.
The question is: will this args field always contain a valid JSON payload? We assume the answer is probably no, since the underlying field contains raw bytes.
In which circumstances would it be a valid JSON string, and in which circumstances would it be a binary format?
Finally, if a binary format is possible (likely), how is it possible to decode it? Is this in the developers' hands, and could it be any binary format?
See
https://github.com/near/nearcore/blob/14711926391d3ec1d23116658a295a62e77bc701/core/primitives/src/views.rs#L768
https://github.com/near/nearcore/blob/14711926391d3ec1d23116658a295a62e77bc701/core/primitives/src/transaction.rs#L113
In most cases args will be a base64-encoded JSON string.
Here's an example of how we decode them on NEAR Indexer for Explorer side.
ActionView::FunctionCall {
    method_name,
    args,
    gas,
    deposit,
} => {
    // `args` is a base64-encoded string in the view; decode it, then try to parse it as JSON
    if let Ok(decoded_args) = base64::decode(args) {
        if let Ok(mut args_json) = serde_json::from_slice(&decoded_args) {
            escape_json(&mut args_json);
            arguments["args_json"] = args_json;
        }
    }
}
Is this in the developers' hands, and could it be any binary format?
Yes.
Rainbow Bridge-related transactions have borsh-serialized args which are not possible to decode into JSON.
ref: https://github.com/near/near-indexer-for-explorer/blob/master/src/models/serializers.rs#L94-L103
args are not limited to any format at all; they are just a binary blob. What you see in views.rs is partially serialized data where args is expected to be base64-encoded, which is why it is a String (so it is always base64 data there, whether that wraps JSON, Borsh-serialized data, or just a raw binary blob, e.g. a PNG image).
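For completeness, the same decode-then-fall-back idea outside of Rust: a small TypeScript sketch (the function name and the sample payload are made up for illustration) that base64-decodes args, tries JSON first and keeps the raw bytes otherwise:
// `args` as it appears in a FunctionCall action view: a base64 string
function decodeArgs(argsBase64: string): unknown {
  const raw = Buffer.from(argsBase64, 'base64');
  try {
    // most contracts take JSON arguments
    return JSON.parse(raw.toString('utf8'));
  } catch {
    // not JSON (e.g. Borsh-serialized or arbitrary binary): keep the raw bytes
    return raw;
  }
}

console.log(decodeArgs('eyJhbW91bnQiOiIxMCJ9')); // { amount: '10' }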

Nifi - Extracting Key Value pairs into new fields

With NiFi I am trying to use the ReplaceText processor to extract key-value pairs.
The relevant part of the JSON file is the 'RuleName':
"winlog": {
"channel": "Microsoft-Windows-Sysmon/Operational",
"event_id": 3,
"api": "wineventlog",
"process": {
"pid": 1640,
"thread": {
"id": 4452
}
},
"version": 5,
"record_id": 521564887,
"computer_name": "SERVER001",
"event_data": {
"RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"
},
"provider_guid": "{5790385F-C22A-43E0-BF4C-06F5698FFBD9}",
"opcode": "Info",
"provider_name": "Microsoft-Windows-Sysmon",
"task": "Network connection detected (rule: NetworkConnect)",
"user": {
"identifier": "S-1-5-18",
"name": "SYSTEM",
"domain": "NT AUTHORITY",
"type": "Well Known Group"
}
},
Within the ReplaceText processor I have this configuration
ReplaceText
"winlog.event_data.RuleName":"MitreRef=(.*),Technique=(.*),Tactic=(.*),Alert=(.*)"
"MitreRef":"$1","Technique":"$2","Tactic":"$3","Alert":"$4"
The first problem is that the new fields MitreRef etc. are not created.
The second thing is that the fields may appear in any order in the original JSON, e.g.
"RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"
or,
MitreRef=1043,Tactic=Command and Control,Technique=Commonly Used Port
Any ideas on how to proceed?
Welcome to StackOverflow!
As your question is quite ambiguous I'll try to guess what you aimed for.
Replacing the string value of "RuleName" with a JSON representation
I assume that you want to replace the entry
"RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"
with something along the lines of
"RuleName": {
"Technique": "Commonly Used Port",
"Tactic": "Command and Control",
"MitreRef": "1043"
}
In this case you can grab basically the whole line and assume you have three groups of characters, each consisting of
A number of characters that are not the equals sign: ([^=]+)
The equals sign =
A number of characters that are not the comma sign: ([^,]+)
These groups in turn are separated by a comma: ,
Based on these assumptions you can write the following RegEx inside the Search Value property of the ReplaceText processor:
"RuleName"\s*:\s*"([^=]+)=([^,]+),([^=]+)=([^,]+),([^=]+)=([^,]+)"
With this, you grab the whole line and build a group for every important data point.
Based on the groups you may set the Replacement Value to:
"RuleName": {
"${'$1'}": "${'$2'}",
"${'$3'}": "${'$4'}",
"${'$5'}": "${'$6'}"
}
Resulting in the above mentioned JSON object.
Some remarks
The RegEx assumes that the entry is on a single line and does NOT work when it is split across multiple lines, e.g.
"RuleName":
"Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"
The RegEx assumes there are exactly three "items" inside the value of RuleName and does NOT work with a different number of "items".
In case your JSON file can grow larger, you may want to avoid the Entire text evaluation mode, as it loads the whole content into a buffer and routes the FlowFile to the failure output in case the file is too large. In that case I recommend using the Line-by-Line evaluation mode instead.
Allowing a fourth additional value
In case there might be a fourth additional value, you may adjust the RegEx in the Search Value property.
You can add (,([^=]+)=([^,]+))? to the previous expression, which roughly translates to:
( )? - match what is in the bracket zero or one times
, - match the character comma
([^=]+)=([^,]+) - followed by the groups of characters as explained above
The whole RegEx will look like this:
"RuleName"\s*:\s*"([^=]+)=([^,]+),([^=]+)=([^,]+),([^=]+)=([^,]+)(,([^=]+)=([^,]+))?"
To allow the new value to be used you have to adjust the replacement value as well.
You can use the Expression Language available in most NiFi processor properties to decide whether to add another item to the JSON object or not.
${'$7':isEmpty():ifElse(
'',
${literal(', "'):append(${'$8'}):append('": '):append('"'):append(${'$9'}):append('"')}
)}
This expression checks whether the seventh RegEx group is empty and appends either an empty string or the found key and value.
With this modification included the whole replacement value will look like the following:
"RuleName": {
"${'$1'}": "${'$2'}",
"${'$3'}": "${'$4'}",
"${'$5'}": "${'$6'}"
${'$7':isEmpty():ifElse(
'',
${literal(', "'):append(${'$8'}):append('": '):append('"'):append(${'$9'}):append('"')}
)}
}
regarding multiple occurrences
The ReplaceText processor replaces all occurrences it finds where the RegEx matches. Using the settings provided in the last paragraph, the following example input
{
"event_data": {
"RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043,Foo=Bar"
},
"RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"
}
will result in the following:
{
"event_data": {
"RuleName": {
"Technique": "Commonly Used Port",
"Tactic": "Command and Control",
"MitreRef": "1043",
"Foo": "Bar"
}
},
"RuleName": {
"Technique": "Commonly Used Port",
"Tactic": "Command and Control",
"MitreRef": "1043"
}
}
example template
You may download a template I created that includes the above processor from gist.
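One more note on the second problem from the question (the pairs appearing in any order): the RegEx above assumes a fixed order and a fixed number of pairs. If the order really varies, it is simpler to split the value on commas and equals signs. The following TypeScript sketch only illustrates that parsing idea, it is not a NiFi configuration (in NiFi you could do something similar in an ExecuteScript processor):
// turn "Technique=...,Tactic=...,MitreRef=1043" into an object, whatever the order of the pairs
function parseRuleName(ruleName: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const pair of ruleName.split(',')) {
    const [key, ...rest] = pair.split('=');
    result[key.trim()] = rest.join('=').trim();
  }
  return result;
}

console.log(parseRuleName('MitreRef=1043,Tactic=Command and Control,Technique=Commonly Used Port'));
// { MitreRef: '1043', Tactic: 'Command and Control', Technique: 'Commonly Used Port' }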

Regexp starts with not working Elasticsearch 6.*

I am having trouble understanding the regexp mechanism in Elasticsearch. I have documents that represent property units:
{
"Unit" :
{
"DailyAvailablity" :
"UIAOUUUUUUUIAAAAAAAAAAAAAAAAAOUUUUIAAAAOUUUIAOUUUUUUUUUUUUUUUUUUUUUUUUUUIAAAAAAAAAAAAAAAAAAAAAAOUUUUUUUUUUIAAAAAOUUUUUUUUUUUUUIAAAAOUUUUUUUUUUUUUIAAAAAAAAOUUUUUUIAAAAAAAAAOUUUUUUUUUUUUUUUUUUIUUUUUUUUIUUUUUUUUUUUUUUIAAAOUUUUUUUUUUUUUIUUUUIAOUUUUUUUUUUUUUUUUUUUUUUUUUUUUUIAAAAAAAAAAAAOUUUUUUUUUUUUUUUUUUUUIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
}
}
The DailyAvailability field encodes the availability of the property by day for the next two years from today. 'A' means available, 'U' unavailable, 'I' can check in, 'O' can check out. How can I write a regexp filter to get all units that are available on particular dates?
I tried to find an 'A' substring with a particular length and offset in the DailyAvailability field. For example, to find units that would be available for 7 days starting 7 days from today:
{
"query": {
"bool": {
"filter": [
{
"regexp": { "Unit.DailyAvailability": {"value": ".{7}a{7}.*" } }
}
]
}
}
}
This query returns, for instance, a unit whose DailyAvailability starts with "UUUUUUUUUUUUUUUUUUUIAA" but contains suitable sequences somewhere inside the field. How can I anchor the regexp to the entire source string? The ES docs say that Lucene regexes should be anchored by default.
P.S. I have tried '^.{7}a{7}.*$'. Returns empty set.
It looks like you are using the text datatype to store Unit.DailyAvailability (which is also the default for strings if you are using dynamic mapping). You should consider using the keyword datatype instead.
Let me explain in a bit more detail.
Why does my regex match something in the middle of a text field?
What happens with text datatype is that the data gets analyzed for full-text search. It does some transformations like lowercasing and splitting into tokens.
Let's try to use the Analyze API against your input:
POST _analyze
{
"text": "UIAOUUUUUUUIAAAAAAAAAAAAAAAAAOUUUUIAAAAOUUUIAOUUUUUUUUUUUUUUUUUUUUUUUUUUIAAAAAAAAAAAAAAAAAAAAAAOUUUUUUUUUUIAAAAAOUUUUUUUUUUUUUIAAAAOUUUUUUUUUUUUUIAAAAAAAAOUUUUUUIAAAAAAAAAOUUUUUUUUUUUUUUUUUUIUUUUUUUUIUUUUUUUUUUUUUUIAAAOUUUUUUUUUUUUUIUUUUIAOUUUUUUUUUUUUUUUUUUUUUUUUUUUUUIAAAAAAAAAAAAOUUUUUUUUUUUUUUUUUUUUIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
}
The response is:
{
"tokens": [
{
"token": "uiaouuuuuuuiaaaaaaaaaaaaaaaaaouuuuiaaaaouuuiaouuuuuuuuuuuuuuuuuuuuuuuuuuiaaaaaaaaaaaaaaaaaaaaaaouuuuuuuuuuiaaaaaouuuuuuuuuuuuuiaaaaouuuuuuuuuuuuuiaaaaaaaaouuuuuuiaaaaaaaaaouuuuuuuuuuuuuuuuuuiuuuuuuuuiuuuuuuuuuuuuuuiaaaouuuuuuuuuuuuuiuuuuiaouuuuuuuuuuuuuuu",
"start_offset": 0,
"end_offset": 255,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "uuuuuuuuuuuuuuiaaaaaaaaaaaaouuuuuuuuuuuuuuuuuuuuiaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"start_offset": 255,
"end_offset": 510,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaouuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuiaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"start_offset": 510,
"end_offset": 732,
"type": "<ALPHANUM>",
"position": 2
}
]
}
As you can see, Elasticsearch has split your input into three tokens and lowercased them. This looks unexpected, but if you consider that it is actually trying to facilitate searching for words in human language, it makes sense: there are no such long words.
That's why the regexp query ".{7}a{7}.*" now matches: there is a token that actually starts with a lot of a's, which is the expected behavior of the regexp query.
...Elasticsearch will apply the regexp to the terms produced by the
tokenizer for that field, and not to the original text of the field.
How can I make the regexp query consider the entire string?
It is very simple: do not apply analyzers. The keyword type stores the string you provide as is.
With a mapping like this:
PUT my_regexes
{
"mappings": {
"doc": {
"properties": {
"Unit": {
"properties": {
"DailyAvailablity": {
"type": "keyword"
}
}
}
}
}
}
}
You will be able to do a query like this that will match the document from the post:
POST my_regexes/doc/_search
{
"query": {
"bool": {
"filter": [
{
"regexp": { "Unit.DailyAvailablity": "UIAOUUUUUUUIA.*" }
}
]
}
}
}
Note that the query became case-sensitive because the field is not analyzed.
This regexp won't return any results anymore: ".{12}a{7}.*"
This will: ".{12}A{7}.*"
So what about anchoring?
The regexes are anchored:
Lucene’s patterns are always anchored. The pattern provided must match the entire string.
The reason why it looked like the anchoring was wrong was most likely because tokens got split in an analyzed text field.
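To see the anchoring behaviour in isolation, here is a tiny TypeScript sketch with plain JavaScript regexes (the shortened sample string is taken from the document above); it mirrors the difference between the anchored Lucene pattern on the keyword field and a match anywhere inside a token:
const daily = 'UIAOUUUUUUUIAAAAAAAAAAAAAAAAAOUUUU'; // shortened sample of DailyAvailablity

// anchored: the first 7 characters may be anything, then 7 'A's must follow immediately
const anchored = /^.{7}A{7}/;
// unanchored: any run of 7 'A's anywhere in the string matches
const anywhere = /A{7}/;

console.log(anchored.test(daily)); // false - days 8..14 are not all 'A'
console.log(anywhere.test(daily)); // true  - there is a long 'A' run later in the string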
Just in addition to the brilliant and helpful answer of Nikolay Vasiliev: in my case I was forced to go further to make it work with NEST (.NET). I added an attribute mapping to DailyAvailability:
[Keyword(Name = "DailyAvailability")]
public string DailyAvailability { get; set; }
The filter still didn't work, and I got this mapping:
"DailyAvailability": {
    "type": "text",
    "fields": {
        "keyword": {
            "type": "keyword",
            "ignore_above": 256
        }
    }
}
My field contained about 732 characters, so it was ignored by the index. I tried:
[Keyword(Name = "DailyAvailability", IgnoreAbove = 1024)]
public string DailyAvailability { get; set; }
It didn't make any difference to the mapping. Only after adding manual mappings did it start working properly:
var client = new ElasticClient(settings);
client.CreateIndex("vrp", c => c
    .Mappings(ms => ms.Map<Unit>(m => m
        .Properties(ps => ps
            .Keyword(k => k.Name(u => u.DailyAvailability).IgnoreAbove(1024))
        )
    )
));
The point is that:
ignore_above - Do not index any string longer than this value. Defaults to 2147483647 so that all values would be accepted. Please however note that default dynamic mapping rules create a sub keyword field that overrides this default by setting ignore_above: 256.
So use an explicit mapping for long keyword fields to set ignore_above if you need to filter them with regexp.
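If you are not on NEST, the equivalent explicit mapping can be created with the JavaScript client as well; a rough TypeScript sketch (the index name, node URL and the 7.x-style typeless body are assumptions, on 6.x the mapping still needs the document type level shown earlier):
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// create the index with an explicit keyword mapping and a raised ignore_above,
// so that the ~732-character availability string is actually indexed
async function createIndex(): Promise<void> {
  await client.indices.create({
    index: 'vrp',
    body: {
      mappings: {
        properties: {
          Unit: {
            properties: {
              DailyAvailability: { type: 'keyword', ignore_above: 1024 },
            },
          },
        },
      },
    },
  });
}

createIndex().catch(console.error);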
In case it is useful for anyone: Elasticsearch regexp queries do not support the \d and \w shorthand classes; you should write those as [0-9] and [a-z] instead.

Type assertion on Goa package (uuid.UUID)

I'm testing out Goa for an API. I want to use uuid as an ID data type. I modified the following function in controller.go:
// Show runs the show action.
func (c *PersonnelController) Show(ctx *app.ShowPersonnelContext) error {
    // v4 UUID validate regex
    var validate = `^[a-f0-9]{8}-[a-f0-9]{4}-4[a-f0-9]{3}-[8|9|aA|bB][a-f0-9]{3}-[a-f0-9]{12}$`
    uuidValidator := regexp.MustCompile(validate)
    if !uuidValidator.Match(ctx.MemberID) {
        return ctx.NotFound()
    }
    // Build the resource using the generated data structure
    personnel := app.GoaCrewWorkforce{
        MemberID:  ctx.MemberID,
        FirstName: fmt.Sprintf("First-Name #%s", ctx.MemberID),
        LastName:  fmt.Sprintf("Last-Name #%s", ctx.MemberID),
    }
What I would like to do is validate a v4 UUID in my controller using a regexp so that the request does not hit the server if it does not validate. It's my understanding a uuid is a [16]byte array. regexp.Regexp has a Match([]byte) function. Yet I can't seem to understand why I get the following error:
cannot use ctx.MemberID (type uuid.UUID) as type []byte in argument to uuidValidator.Match
How can I type assert ctx.MemberID? I don't think it's possible to do a cast conversion in this case? Any guidance is appreciated.
If you want to validate the uuid, you can check the bits directly. There's not much to verify, since most of the 16 bytes are random, but you can check the top 4 bits of the 6th byte for the version number, or the top 3 bits of the 8th byte for the variant.
// enforce only V4 uuids
if ctx.MemberID[6]>>4 != 4 {
    log.Fatal("not a V4 UUID")
}
// enforce only RFC4122 type UUIDs
if ctx.MemberID[8]&0xc0 != 0x80 {
    log.Fatal("not an RFC4122 UUID")
}

Is there a particular format for a sigma.js compatible json file?

I'm trying to generate a JSON file which should be compatible with sigma.js to plot the graph. Can I know in what format it should be? (I am generating it in C++.)
The sigma.js json format, derived from the sigma.js wiki, consists of an array of edges and an array of nodes.
An edge has three string properties:
id is a unique identifier for this edge, and
source, target are ids for nodes.
A node has the properties:
id is a unique identifier for this node,
label is an optional text displayed along with the node,
x, y are coordinates in the plane (can be floating-point),
color is an optional CSS color specification, and
size is a logical size comparable to other node sizes.
In summary, only color and label are optional, and only x, y and size are numbers, the rest are strings. Even when using automatic layout generators (e.g. the ForceAtlas2 plugin), it is necessary to specify initial x and y, and even if all nodes have the same size, size must be specified. (This is not mentioned in the docs, but it can be confirmed experimentally.)
This is an abbreviated excerpt from the GitHub sample arctic.json:
{
"edges": [
{
"source": "473",
"target": "313",
"id": "6432"
},
...
],
"nodes": [
{
"id": "262",
"label": "Sciences De La Terre",
"x": 1412.2230224609,
"y": -2.0559763908386,
"size": 8.540210723877
"color": "rgb(255,204,102)",
},
...
]
}
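The question mentions C++, but the structure itself is language-agnostic. Purely as an illustration of the shape described above (the ids, coordinates and sizes are made up), here is a minimal TypeScript sketch that builds and serializes such a graph:
interface SigmaNode {
  id: string;
  label?: string;  // optional
  x: number;
  y: number;
  size: number;
  color?: string;  // optional CSS color
}

interface SigmaEdge {
  id: string;
  source: string;  // id of the source node
  target: string;  // id of the target node
}

const graph: { nodes: SigmaNode[]; edges: SigmaEdge[] } = {
  nodes: [
    { id: '262', label: 'Sciences De La Terre', x: 1412.22, y: -2.06, size: 8.54, color: 'rgb(255,204,102)' },
    { id: '313', x: 100.0, y: 50.0, size: 1 },
    { id: '473', x: -40.0, y: 12.5, size: 1 },
  ],
  edges: [{ id: '6432', source: '473', target: '313' }],
};

// write this out with JSON.stringify (or the equivalent serializer in C++) and feed it to sigma.js
console.log(JSON.stringify(graph, null, 2));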