Let's say I have the following type of data:
[577] {0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00}
[578] {0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00}
[579] {0x05,0x08,0x01,0x00,0x47,0x00,0x61,0x00,0x6c}
[580] {0x05,0x08,0x01,0x00,0x47,0x00,0x61,0x00,0x6c}
[581] {0x05,0x08,0x01,0x00,0x47,0x00,0x61,0x00,0x6c}
[582] {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[583] {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[584] {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[585] {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[586] {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[587] {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[588] {0x03,0x08,0x00,0x00,0x20,0x00,0x53,0x00,0x32}
[589] {0x03,0x08,0x00,0x00,0x20,0x00,0x53,0x00,0x32}
[590] {0x03,0x08,0x00,0x00,0x20,0x00,0x53,0x00,0x32}
[591] {0x03,0x08,0x00,0x00,0x20,0x00,0x53,0x00,0x32}
[592] {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[593] {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[594] {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[595] {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[596] {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[597] {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[598] {0x01,0x08,0x00,0x00,0x2d,0x00,0x39,0x00,0x33}
[599] {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}
[600] {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}
[601] {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}
[602] {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}
The relevant data is between the braces { }.
I want to find where the first column doesn't repeat.
In the data above that would be for the row marked as "[598]".
Row "[597]" starts with '0x02' and row "[599]" starts with '0x00', so the '0x01' is unique.
But the '0x01' could just as well be a '0x09': the value itself doesn't matter, as long as it differs from the lines above and below it. Only the first column matters.
I've been trying with lookarounds, but it doesn't work:
(?<!.*\{(\3).*\n)(.*\{(0x\d\d))(?!.*\n.*\{(\3))
Any ideas?
Notes:
I'm using VSCode to find.
No need to capture it, just would like it to highlight.
I think the following works for what you're after (slightly improved from Andrej's answer, and adapted to JavaScript's regex flavour, which I believe is what VSCode uses).
Regex101
^\[\d+\]\s+{([^,]+)[^[]+^\[\d+\]\s+{((?!\1)[^,]+)[^[]+^\[\d+\]\s+{((?!\2)[^,]+)[^[]+$
Notes:
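JavaScript regex accepts the (?!\1|\3) negative lookahead syntax, but a backreference to a group that hasn't captured yet (here \3) matches the empty string, so that lookahead can never succeed; I've swapped it for single back references, 2-vs-1 and 3-vs-2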
Due to this, if the first and third lines have the same value in the first element, then it'll still match, which isn't ideal...
Matches full lines and fields, if you need/want to use this for processing too
This is operating over three distinct lines:
^\[\d+\]\s+{([^,]+)[^[]+
matches against the numeric component surrounded by [] brackets, and the first element in the {} braces
^\[\d+\]\s+{((?!\1)[^,]+)[^[]+
matches the same again, but instead of "the first value", it explicitly forbids the value used on the first line
when compared with Andrej's answer, this will capture the full element due to ((?!\1)[^,]+) vs ((?!\1).{4})
^\[\d+\]\s+{((?!\2)[^,]+)[^[]+$
same again, but explicitly forbids the value used on the second line
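For a cross-check outside the editor, the same neighbour comparison can be sketched in Python. This is only a sketch under assumptions: the helper name unique_first_column_rows and the shortened sample are made up for illustration, and it compares each row with both the row above and the row below, which is the rule from the question rather than exactly what the regex above enforces.

```python
import re

# Hypothetical helper (not from the thread): report rows whose first
# field differs from the rows directly above and below.
ROW = re.compile(r"^\[(\d+)\]\s+\{([^,}]+)", re.MULTILINE)

def unique_first_column_rows(text):
    rows = ROW.findall(text)  # list of (label, first_field) pairs
    hits = []
    # The first and last rows have only one neighbour, so they are skipped.
    for i in range(1, len(rows) - 1):
        first = rows[i][1]
        if first != rows[i - 1][1] and first != rows[i + 1][1]:
            hits.append(rows[i][0])
    return hits

# Shortened sample taken from the question's data.
sample = """\
[596] {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[597] {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[598] {0x01,0x08,0x00,0x00,0x2d,0x00,0x39,0x00,0x33}
[599] {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}
[600] {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}
"""

print(unique_first_column_rows(sample))  # ['598']
```

Unlike the three-line regex, this version also reports a row when the lines above and below it happen to share the same first value.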
Try (regex101):
^\[\d+\]\s+{([^,]+)[^{]+{((?!\1|\3).{4})[^{]+{((?!\1|\2).{4})
Let's say I have a column which has values like:
foo/bar
chunky/bacon/flavor
/baz/quz/qux/bax
I.e. a variable number of strings separated by /.
In another column I want to get the last element from each of these strings, after they have been split on /. So, that column would have:
bar
flavor
bax
I can't figure this out. I can split on / to get an array, and I can see the INDEX function for getting the element at a specific numeric index, but I can't find a way to ask for "the last element" with it.
Edit:
this one is simpler:
=REGEXEXTRACT(A1,"[^/]+$")
You could use this formula:
=REGEXEXTRACT(A1,"(?:.*/)(.*)$")
It can also be used as an ArrayFormula:
=ARRAYFORMULA(REGEXEXTRACT(A1:A3,"(?:.*/)(.*)$"))
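To see what the two patterns capture, here is a rough Python sketch run over the three sample values from the question. It assumes Python's re module behaves like the Sheets regex engine for these two simple patterns; the helper names last_by_tail and last_by_group are made up for illustration.

```python
import re

def last_by_tail(value):
    # [^/]+$ : the longest run of non-slash characters at the very end
    m = re.search(r"[^/]+$", value)
    return m.group(0) if m else None

def last_by_group(value):
    # (?:.*/)(.*)$ : the greedy .*/ consumes everything up to the final
    # slash, and the capture group keeps whatever follows it
    m = re.search(r"(?:.*/)(.*)$", value)
    return m.group(1) if m else None

samples = ["foo/bar", "chunky/bacon/flavor", "/baz/quz/qux/bax"]
for s in samples:
    print(s, "->", last_by_tail(s), last_by_group(s))
```

In this sketch the two patterns differ on edge cases: a value with no slash only matches the first pattern, and a value ending in a slash only matches the second (with an empty capture).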
Here's some more info:
the RegExExtract function
Some good examples of syntax
my personal list of Regex Tricks
This formula will do the same:
=INDEX(SPLIT(A1,"/"),LEN(A1)-len(SUBSTITUTE(A1,"/","")))
But it references A1 three times, which is not preferable.
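The index in that formula is the number of slashes in the value. A quick Python sketch, assuming SPLIT drops empty pieces (its default) and that the index is meant as a 1-based position, compares the slash count with the number of pieces; the helper names are illustrative only.

```python
def slash_count(value):
    # Mirrors LEN(A1) - LEN(SUBSTITUTE(A1, "/", ""))
    return len(value) - len(value.replace("/", ""))

def pieces(value):
    # Mirrors SPLIT(A1, "/"), assuming empty pieces are dropped
    return [p for p in value.split("/") if p]

for s in ["foo/bar", "chunky/bacon/flavor", "/baz/quz/qux/bax"]:
    print(s, slash_count(s), len(pieces(s)))
```

In this sketch the two numbers agree only for the value with a leading slash, so it may be worth checking the formula against values that do not start with /.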
You could do this too
=index(SPLIT(A1, "/"), COLUMNS(SPLIT(A1, "/"))-1)
Also possible, perhaps best on a copy, with Find and replace. Find:
.+/
Replace with blank, with "Search using regular expressions" ticked.
You can try this!
You already have the array of Strings, so you can access the last element via its length:
// Split on "/" and take the element at the last index (length - 1).
String message = "chunky/bacon/flavor";
String[] outSplited = message.split("/");
System.out.println(outSplited[outSplited.length - 1]); // prints "flavor"
Please help me decipher this regular expression:
'!_[$0]++'
It is being used to get an MSISDN (one at a time, from a file containing a list of MSISDNs starting with zero) with the following usage:
awk '!_[$0]++' file.txt
It's not a regular expression, it's an arithmetic and boolean expression.
$0 = The current input line
_[$0] = An associative array element whose key is the input line
_[$0]++ = increments that array element each time the line is encountered, but evaluates to the value it had before the increment
!_[$0]++ = boolean inverse, so it returns true if the value was originally 0 or the empty string, false otherwise
So this expression is true the first time a line is encountered, false every other time. Since there's no action block after the expression, the default is to print the line if the expression is true, skip it when false.
So this prints the input file with duplicates omitted.
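The same logic can be mirrored step by step in Python, as a sketch. The function name first_occurrences and the sample numbers are made up for illustration; the dictionary seen stands in for awk's _ array.

```python
from collections import defaultdict

def first_occurrences(lines):
    seen = defaultdict(int)      # missing keys read as 0, like awk's _[$0]
    kept = []
    for line in lines:
        before = seen[line]      # the value the post-increment evaluates to
        seen[line] += 1          # the ++ side effect
        if not before:           # the leading ! : true only when it was 0
            kept.append(line)    # awk's default action: print the line
    return kept

print(first_occurrences(["0123", "0456", "0123", "0789", "0456"]))
# ['0123', '0456', '0789']
```

The order of first appearance is preserved, which is also what the awk one-liner does since it prints as it reads.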
If the expression is true, the line is printed.
'_[$0]++': the associative array entry is incremented every time $0 is seen, so it holds the number of times each line has appeared.
'!_[$0]++': this is true only the first time a line is inserted into the associative array; every later time it resolves to false, so the line is not printed.
So none of the duplicate lines are printed.
This is not a regular expression. This particular command prints unique lines the first time they are found.
_ is being used as an array here and $0 refers to the entire line. The default value of an array element is technically an empty string, but in numeric contexts it's treated as 0. So the first time you see a line, it is printed (since _[$0] is falsy, !_[$0] is true). The expression increments the counter every time it sees a line (the test uses the value from before the increment, and awk's default action is to print), so the next time the same line appears _[$0] will be 1 and the line will not be printed.