Can we extract dynamic data from a string using regex? - regex

I want to validate and extract the data for the following tags (9F03, 9F02, 9C) using regex:
9F02060000000060009F03070000000010009C0101
The above string is in tag-length-value format.
Here 9F02, 9F03 and 9C are tags. Each tag has a fixed length, but its position and value in the string can vary.
Immediately after each tag comes the length, in bytes, of the value stored for that tag.
For example:
9F02=tag
06=Length in bytes
000000006000= value
Thanks,
Ashutosh

Standard regex cannot count well: it behaves like a finite state machine, so it cannot read a length field and then consume exactly that many characters.
What you can do, if the number of possible lengths is small, is spell out each possibility as an alternative in the regex, and use a separate regex for each tag:
/9F02(01..|02....|03......)/
/9C(01..|02....)/
... And so on.
Example here.
http://rubular.com/r/euHRxeTLqH

import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegEx {
    public static void main(String[] args) {
        String s = "9F02060000000060009F03070000000010009C0101";
        String regEx = "(9F02|9F03|9C)";
        Pattern p = Pattern.compile(regEx);
        Matcher m = p.matcher(s);
        while (m.find()) {
            System.out.println("Tag : " + m.group());
            // the two characters right after the tag are the length field
            String length = s.substring(m.end(), m.end() + 2);
            System.out.println("Length : " + length);
            // reads `length` characters, starting three characters after the tag
            int valueEndIndex = m.end() + 3 + Integer.parseInt(length);
            String value = s.substring(m.end() + 3, valueEndIndex);
            System.out.println("Value : " + value);
        }
    }
}
This code will give you the following output:
Tag : 9F02
Length : 06
Value : 000000
Tag : 9F03
Length : 07
Value : 0000000
Tag : 9C
Length : 01
Value : 1
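I am not sure about the byte length you mention: this code treats the length as a character count and starts reading three characters after the tag, so adjust the offsets if each byte is written as two hex characters. Still, I hope it helps you get started!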
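Because a regex cannot use the length field to decide how much to consume, walking the string from left to right is often simpler. The following is a minimal sketch, not a full EMV parser. It assumes that each value byte is written as two hex characters, that the length is a single byte, and that only the tags 9F02, 9F03 and 9C occur. The names TlvSketch and parse are made up for illustration, and the sample string is a variant of the one in the question with the 9F03 length set to 06 so that the lengths are self-consistent.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TlvSketch {
    // Tags assumed from the question; not a complete tag list.
    private static final List<String> KNOWN_TAGS = List.of("9F02", "9F03", "9C");

    /** Walks the hex string left to right and returns tag -> hex value, in input order. */
    static Map<String, String> parse(String hex) {
        Map<String, String> result = new LinkedHashMap<>();
        int pos = 0;
        while (pos < hex.length()) {
            // Find which known tag starts at the current position.
            String tag = null;
            for (String candidate : KNOWN_TAGS) {
                if (hex.startsWith(candidate, pos)) {
                    tag = candidate;
                    break;
                }
            }
            if (tag == null) {
                throw new IllegalArgumentException("Unknown tag at index " + pos);
            }
            pos += tag.length();
            if (pos + 2 > hex.length()) {
                throw new IllegalArgumentException("Missing length after tag " + tag);
            }
            // Assumption: one length byte, written as two hex characters.
            int byteCount = Integer.parseInt(hex.substring(pos, pos + 2), 16);
            pos += 2;
            // Assumption: every value byte is two hex characters.
            int end = pos + 2 * byteCount;
            if (end > hex.length()) {
                throw new IllegalArgumentException("Value of tag " + tag + " is truncated");
            }
            result.put(tag, hex.substring(pos, end));
            pos = end;
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {9F02=000000006000, 9F03=000000001000, 9C=01}
        System.out.println(parse("9F02060000000060009F03060000000010009C0101"));
    }
}
```

Since the parser always jumps past the value it has just read, a tag-like byte sequence inside a value is never mistaken for a tag, which a plain find() over the whole string cannot guarantee.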

Related

How to detect an incomplete date in a list and replace it with Flutter?

Hello, I can't work out how to detect an incomplete date in a list of strings. I thought about using a regex, but I don't know how to extract this sequence from the input.
input=[2022-01-20 20:01, 2022-01-20 21, 2022-01-20 22:25, 2022-01-20 23:01]
Here I am trying to match 2022-01-20 21 (it is the only entry without minutes).
After the match, I want to append the minutes :00 to fix the wrong date format.
Here is the result I am looking for:
output=[2022-01-20 20:01, 2022-01-20 21:00, 2022-01-20 22:25, 2022-01-20 23:01]
Here is what I tried:
dateList=[2022-01-20 20:01, 2022-01-20 21, 2022-01-20 22:25, 2022-01-20 23:01];
for (var i = 1; i < dateList.length; i++) {
RegExp regExp = new RegExp(
r"^((?!:).)*$",
);
var match = regExp.firstMatch("${dateList}");
var index = dateList1.indexOf(match);
dateList.replaceRange(index, index + 1, ["$match:00"]);
}
For each index of my string list, I search for the only entry that has no ":". Once I have found the index with the problem, I replace that entry with the same value plus ":00".
The problem is that match returns null...
Thank you
I agree that regular expressions are the way to go here. Detecting a date is relatively simple: you're basically looking for
4 digits, dash, 2 digits, dash, 2 digits, space, 2 digits, colon, 2 digits
which, in RegExp language, is
\d{4}-\d{2}-\d{2} \d{2}:\d{2}
Now we can detect whether a given String contains a complete datetime. All that's left is to add the trailing minutes when they are missing. Note that you could decide what to add using another regular expression, but this code just adds the minutes, assuming that is always the issue.
List<String> input = ['2022-01-20 20:01', '2022-01-20 21', '2022-01-20 22:25', '2022-01-20 23:01'];
List<String> output = [];
// detect a date + time
RegExp regex = RegExp(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}');
for (String maybeDate in input) {
  bool isCompleteDate = regex.hasMatch(maybeDate);
  if (isCompleteDate) {
    output.add(maybeDate);
  } else {
    // we want to complete the String
    // in this case, I assume it's always just the minutes missing, but you could use another regex to see which part is missing
    output.add(maybeDate + ':00');
  }
}
print(output);
Alternatively, you can indeed use negative lookahead to find the missing minutes:
// detects a date and hour, without a colon and two digits (the minutes)
RegExp missingMinutes = RegExp(r'(\d{4}-\d{2}-\d{2} \d{2})(?!:\d{2})');
If you have a single String instead of a List<String>, this becomes:
List<String> input = ['2022-01-20 20:01', '2022-01-20 21', '2022-01-20 22:25', '2022-01-20 23:01'];
String listAsString = input.toString();
RegExp missingMinutes = RegExp(r'(\d{4}-\d{2}-\d{2} \d{2})(?!:\d{2})');
List<RegExpMatch> matches = missingMinutes.allMatches(listAsString).toList();
// walk through all matches from last to first, so earlier match positions stay valid after each insertion
for (int i = matches.length - 1; i >= 0; i--) {
  listAsString = listAsString.substring(0, matches[i].end) + ':00' + listAsString.substring(matches[i].end);
}
print(listAsString);
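For readers working outside Dart, the same idea carries over almost unchanged. Below is a minimal Java sketch, assuming (as above) that the only possible defect is a missing minutes part. The names MissingMinutesSketch and addMissingMinutes are hypothetical, and the pattern is anchored so that it only rewrites entries that stop right after the hour.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class MissingMinutesSketch {
    // A whole entry of the form "yyyy-MM-dd HH" with nothing after the hour.
    private static final Pattern DATE_AND_HOUR_ONLY =
            Pattern.compile("^(\\d{4}-\\d{2}-\\d{2} \\d{2})$");

    /** Returns a new list in which entries without minutes get ":00" appended. */
    static List<String> addMissingMinutes(List<String> input) {
        return input.stream()
                // entries that do not match the pattern are returned unchanged
                .map(s -> DATE_AND_HOUR_ONLY.matcher(s).replaceFirst("$1:00"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = List.of(
                "2022-01-20 20:01", "2022-01-20 21", "2022-01-20 22:25", "2022-01-20 23:01");
        // prints [2022-01-20 20:01, 2022-01-20 21:00, 2022-01-20 22:25, 2022-01-20 23:01]
        System.out.println(addMissingMinutes(input));
    }
}
```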

Regular expression to match all digits of unknown length except the last 4 digits

I have a number of unknown length, and the idea is to build a regular expression that matches all digits except the last 4.
I have tried a lot of things to achieve this, but with no luck yet.
Currently I have this regex: "^(\d*)\d{0}\d{0}\d{0}\d{0}.*$"
Input: 123456789089775
Expected output: XXXXXXXXXXX9775
I am using it as follows (and this doesn't work):
String accountNumber ="123456789089775";
String pattern = "^(\\d*)\\d{1}\\d{1}\\d{1}\\d{1}.*$";
String result = accountNumber.replaceAll(pattern, "X");
Please suggest how I should approach this problem or give me the solution.
In this case my whole point is to negate the regex : "\d{4}$"
You may use
\G\d(?=\d{4,}$)
See the regex demo.
Details
\G - start of string or end of the previous match
\d - a digit
(?=\d{4,}$) - a positive lookahead that requires 4 or more digits up to the end of the string immediately to the right of the current location.
Java demo:
String accountNumber ="123456789089775";
String pattern = "\\G\\d(?=\\d{4,}$)"; // Or \\G.(?=.{4,}$)
String result = accountNumber.replaceAll(pattern, "X");
System.out.println(result); // => XXXXXXXXXXX9775
I am still not allowed to comment, as I don't have 50 rep yet, but DDeMartini's answer would also accept account numbers with a non-numeric prefix, because "^(.*)" matches input like abcdef1234 as well. Stick to your \d syntax:
"^(\\d+)(\\d{4}$)"
This seems to work fine and requires digits only (minimum length 5 characters). I tested it like this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class AccountNumberPadder {
    private static final Pattern LAST_FOUR_DIGITS = Pattern.compile("^(\\d+)(\\d{4})$");
    public static void main(String[] args) {
        String[] accountNumbers = new String[] { "123456789089775", "999775", "1234567890897" };
        for (String accountNumber : accountNumbers) {
            Matcher m = LAST_FOUR_DIGITS.matcher(accountNumber);
            if (m.find()) {
                System.out.println(paddIt(m));
            } else {
                throw new RuntimeException(String.format("Whooaaa - don't work for %s", accountNumber));
            }
        }
    }
    // Replaces every digit captured in group 1 with X and keeps the last four digits (group 2).
    public static String paddIt(Matcher m) {
        StringBuilder b = new StringBuilder();
        for (int i = 0; i < m.group(1).length(); i++) {
            b.append("X");
        }
        return b.append(m.group(2)).toString();
    }
}
Try:
String pattern = "^(.*)[0-9]{4}$";
Addendum after comment: A refactor to only match full numerics could look like this:
String pattern = "^([0-9]+)[0-9]{4}$";
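Note that passing a whole-string pattern like this to replaceAll with "X" would replace the entire number with a single X, so the captured prefix has to be masked separately. A minimal sketch of one way to do that follows. It adds a second capturing group for the last four digits, uses String.repeat (Java 11+), and returns non-matching input unchanged; the second group, the fallback, and the names MaskSketch and mask are assumptions for illustration, not part of the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MaskSketch {
    // Group 1: the digits to hide. Group 2: the last four digits to keep.
    private static final Pattern NUMBER = Pattern.compile("^([0-9]+)([0-9]{4})$");

    static String mask(String accountNumber) {
        Matcher m = NUMBER.matcher(accountNumber);
        if (!m.matches()) {
            // Assumption: anything that is not at least five digits is returned as-is.
            return accountNumber;
        }
        // One X per hidden digit, followed by the visible tail.
        return "X".repeat(m.group(1).length()) + m.group(2);
    }

    public static void main(String[] args) {
        System.out.println(mask("123456789089775")); // prints XXXXXXXXXXX9775
    }
}
```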

Extract the first three octets from an IP address

I need to extract the first three octets from an IP address (class C), and I can do it by splitting on "\\.". But is there a way to do it using regex?
Input : 192.168.1.1 Output : 192.168.1
Something like this:
/^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/
Use match and you are done.
More precisely, for Java:
Pattern p = Pattern.compile("([0-9]{1,3})\\.([0-9]{1,3})\\.([0-9]{1,3}).*");
Matcher m = p.matcher("127.0.2.13");
if (m.matches()) {
    String s0 = m.group(1); // contains "127"
    String s1 = m.group(2); // contains "0"
    String s2 = m.group(3); // contains "2"
    System.out.println(s0 + "." + s1 + "." + s2);
}
This slightly simpler pattern also works:
Pattern p = Pattern.compile("(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3}).*");
Really good regex tutorial here.
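If only the joined prefix is needed, a single capturing group avoids reassembling the three parts. This is a minimal sketch under the assumption that the input is a plain dotted-quad string. It checks the shape only, not that each octet is in the range 0 to 255, and the names OctetSketch and firstThreeOctets are made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OctetSketch {
    // Group 1 captures "a.b.c"; the fourth octet must be present but is not captured.
    private static final Pattern FIRST_THREE =
            Pattern.compile("^(\\d{1,3}(?:\\.\\d{1,3}){2})\\.\\d{1,3}$");

    static Optional<String> firstThreeOctets(String ip) {
        Matcher m = FIRST_THREE.matcher(ip);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // prints 192.168.1
        System.out.println(firstThreeOctets("192.168.1.1").orElse("no match"));
    }
}
```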

Remove text between two tags

I'm trying to remove some text between two tags, [ and ]:
[13:00:00]
I want to remove 13:00:00 from the [] tags.
The number is different every time.
It is always a time of day, so it contains only digits and : symbols.
Can someone help me?
UPDATE:
I forgot to mention something. The time (13:00:00) is taken from a log file, which looks like this:
[10:56:49] [Client thread/ERROR]: Item entity 26367127 has no item?!
[10:57:25] [Dbutant] misterflo13 : ils coute chere les enchent aura de feu et T2 du spawn??*
[10:57:35] [Amateur] firebow ?.SkyLegend.? : ouai 0
[10:57:38] [Novice] iPasteque : ils sont gratuit me
[10:57:41] [Novice] iPasteque : ils sont gratuit mec *
[10:57:46] [Dbutant] misterflo13 : on ma dit k'ils etait payent :o
[10:57:57] [Novice] iPasteque : on t'a mytho alors
Ignore the other text; I just want to remove the time between [ and ] (the result needs to look like []). The time between [ and ] is updated every second.
It looks like your log has a specific format, and you seem to want to get rid of the time and keep all the other information. See the comments in the code.
I didn't test it, but it should work:
' Read log
Dim logLines() As String = File.ReadAllLines("File_path")
If logLines.Length = 0 Then Return
' prepare array to fill with sliced data
Dim lines(logLines.Length - 1) As String
For i As Integer = 0 To logLines.Length - 1
    ' cut off the 10-character time part, e.g. "[10:56:49]", and add empty brackets for each line
    lines(i) = "[]" & logLines(i).Substring(10)
Next
As shown above, if you know that your file comes in a fixed format, just cut the string at a known position.
Note: the code above can be written in one line using LINQ.
If you want to actually get the data out of it, use IndexOf. Since you are looking for the first occurrence of "[" or "]", just use start index 0:
' get position of open bracket in string
Dim openBracketPos As Integer = myString.IndexOf("[", 0, StringComparison.OrdinalIgnoreCase)
' get position of close bracket in string
Dim closeBracketPos As Integer = myString.IndexOf("]", 0, StringComparison.OrdinalIgnoreCase)
' get string between open and close bracket (the second argument of Substring is a length, not an end index)
Dim data As String = myString.Substring(openBracketPos + 1, closeBracketPos - openBracketPos - 1)
This is another possibility using Regex:
Public Function ReplaceTime(ByVal Input As String) As String
Dim m As Match = Regex.Match(Input, "(\[)(\d{1,2}\:\d{1,2}(\:\d{1,2})?)(\])(.+)")
Return m.Groups(1).Value & m.Groups(4).Value & m.Groups(5).Value
End Function
It's more of a readability nightmare but it's efficient and it takes only the brackets containing a time value.
I also took the liberty of making it match for example 13:47 as well as 13:47:12.
Test: http://ideone.com/yogWfD
(EDIT) Multiline example:
You can combine this with File.ReadAllLines() (if that's what you prefer) and a For loop to get the replacement done.
Public Function ReplaceTimeMultiline(ByVal TextLines() As String) As String
For x = 0 To TextLines.Length - 1
TextLines(x) = ReplaceTime(TextLines(x))
Next
Return String.Join(Environment.NewLine, TextLines)
End Function
Above code usage:
Dim FinalT As String = ReplaceTimeMultiline(File.ReadAllLines(<file path here>))
Another multiline example:
Public Function ReplaceTimeMultiline(ByVal Input As String) As String
Dim ReturnString As String = ""
Dim Parts() As String = Input.Split(Environment.NewLine)
For x = 0 To Parts.Length - 1
ReturnString &= ReplaceTime(Parts(x)) & If(x < (Parts.Length - 1), Environment.NewLine, "")
Next
Return ReturnString
End Function
Multiline test: http://ideone.com/nKZQHm
If your problem is to remove numeric strings in the format 99:99:99 that appear inside [], I would use the regex below. The demo removes each matching tag completely by replacing it with string.Empty; if you want to leave an empty [] behind instead, replace with "[]".
Here's a demo (in C#, not VB, but you get the point: you need the regex, not the syntax):
List<string> list = new List<string>
{
    "[13:00:00]",
    "[4:5:0]",
    "[5d2hu2d]",
    "[1:1:1000]",
    "[1:00:00]",
    "[512341]"
};
string s = string.Join("\n", list);
Console.WriteLine("Original input string:");
Console.WriteLine(s);
Regex r = new Regex(@"\[\d{1,2}?:\d{1,2}?:\d{1,2}?\]");
foreach (Match m in r.Matches(s))
{
    Console.WriteLine("{0} is a match.", m.Value);
}
Console.WriteLine();
Console.WriteLine("String with occurrences replaced with an empty string:");
Console.WriteLine(r.Replace(s, string.Empty).Trim());
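Since the regex matters more than the language, here is the same idea as a minimal Java sketch applied to a whole multi-line log. It assumes that the time always sits at the very start of a line and is written as H:MM or HH:MM with optional seconds, and it leaves an empty [] behind as the question asks. The names LogTimeSketch and stripTimes are hypothetical.

```java
import java.util.regex.Pattern;

final class LogTimeSketch {
    // MULTILINE makes ^ match at the start of every line, not just the start of the input.
    private static final Pattern LEADING_TIME =
            Pattern.compile("^\\[\\d{1,2}:\\d{2}(?::\\d{2})?\\]", Pattern.MULTILINE);

    /** Replaces a leading "[hh:mm:ss]" on each line with "[]" and keeps the rest of the line. */
    static String stripTimes(String log) {
        return LEADING_TIME.matcher(log).replaceAll("[]");
    }

    public static void main(String[] args) {
        String log = "[10:56:49] [Client thread/ERROR]: Item entity 26367127 has no item?!\n"
                + "[10:57:57] [Novice] iPasteque : on t'a mytho alors";
        System.out.println(stripTimes(log));
    }
}
```

Bracketed tags that do not contain a time, such as [Novice], are left alone because they neither start the line nor match the digit pattern.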

Dart how to add commas to a string number

I'm trying to adapt this:
Insert commas into number string
to work in Dart, but with no luck.
Neither of these works:
print("1000200".replaceAllMapped(new RegExp(r'/(\d)(?=(\d{3})+$)'), (match m) => "${m},"));
print("1000300".replaceAll(new RegExp(r'/\d{1,3}(?=(\d{3})+(?!\d))/g'), (match m) => "$m,"));
Is there a simpler/working way to add commas to a string number?
You just forgot to capture the first digits in a group. Use this short version:
'12345kWh'.replaceAllMapped(RegExp(r'(\d{1,3})(?=(\d{3})+(?!\d))'), (Match m) => '${m[1]},')
Here is a more readable version. In the last part of the expression I added a check for any non-digit character, including the end of the string, so you can use it with '12 Watt' too.
RegExp reg = RegExp(r'(\d{1,3})(?=(\d{3})+(?!\d))');
String Function(Match) mathFunc = (Match match) => '${match[1]},';
List<String> tests = [
'0',
'10',
'123',
'1230',
'12300',
'123040',
'12k',
'12 ',
];
for (String test in tests) {
String result = test.replaceAllMapped(reg, mathFunc);
print('$test -> $result');
}
It works perfectly:
0 -> 0
10 -> 10
123 -> 123
1230 -> 1,230
12300 -> 12,300
123040 -> 123,040
12k -> 12k
12 -> 12
import 'package:intl/intl.dart';
var f = NumberFormat("###,###.0#", "en_US");
print(f.format(int.parse("1000300")));
This prints 1,000,300.0.
See the documentation for Dart's NumberFormat:
The format is specified as a pattern using a subset of the ICU formatting patterns.
0 A single digit
# A single digit, omitted if the value is zero
. Decimal separator
- Minus sign
, Grouping separator
E Separates mantissa and exponent
+ - Before an exponent, to say it should be prefixed with a plus sign.
% - In prefix or suffix, multiply by 100 and show as percentage
‰ (\u2030) In prefix or suffix, multiply by 1000 and show as per mille
¤ (\u00A4) Currency sign, replaced by currency name
' Used to quote special characters
; Used to separate the positive and negative patterns (if both present)
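Try the following regex: (\d{1,3})(?=(\d{3})+$)
This captures the leading digits in group 1, and the lookahead checks that a multiple of three digits follows up to the end of the string. Replacing each match with $1, (group 1 followed by a comma) adds commas where they are supposed to be. Do not include $2 in the replacement: the lookahead consumes nothing, so those digits would be duplicated.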
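A minimal Java sketch of a lookahead-based replacement along these lines, where each match is replaced by the captured digit followed by a comma. It assumes a plain, non-negative string of digits with no decimal part, because the lookahead is anchored to the end of the string. The names GroupingSketch and addCommas are made up for illustration.

```java
final class GroupingSketch {
    /**
     * Inserts a comma after every digit that is followed by a multiple of
     * three digits up to the end of the string.
     */
    static String addCommas(String digits) {
        // "$1," puts back the captured digit and appends the separator.
        return digits.replaceAll("(\\d)(?=(\\d{3})+$)", "$1,");
    }

    public static void main(String[] args) {
        System.out.println(addCommas("1000300")); // prints 1,000,300
    }
}
```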
Let's take the example amount 12000. The expected result is 12,000.00,
so the solution is
double rawAmount = 12000;
String amount = rawAmount.toStringAsFixed(2).replaceAllMapped(RegExp(r'(\d{1,3})(?=(\d{3})+(?!\d))'), (Match m) => '${m[1]},');
Or, if you don't want to add .00, just use toString() instead of toStringAsFixed():
String amount = rawAmount.toString().replaceAllMapped(RegExp(r'(\d{1,3})(?=(\d{3})+(?!\d))'), (Match m) => '${m[1]},');
extension on int {
String get priceString {
final numberString = toString();
final numberDigits = List.from(numberString.split(''));
int index = numberDigits.length - 3;
while (index > 0) {
numberDigits.insert(index, ',');
index -= 3;
}
return numberDigits.join();
}
}
For a double, the output depends on which approach you use, so compare the variants below.
If you only need to format an integer, any of them works.
// 1233.45677 => 1,233.45677 (commas only in the integer part)
String num = '1233.45677';
RegExp pattern = RegExp(r'(?<!\.\d*)(\d)(?=(?:\d{3})+(?:\.|$))');
String Function(Match) replace = (m) => '${m[1]},';
print(num.replaceAllMapped(pattern, replace));
// 1233.45677 => 1,233.45,677 (the fractional part is grouped too)
pattern = RegExp(r'(\d{1,3})(?=(\d{3})+(?!\d))');
print(num.replaceAllMapped(pattern, replace));
// 1233.45677 => 1,233.46
// requires importing the intl package to be able to use NumberFormat
var f = NumberFormat("###,###.0#", "en");
print(f.format(double.parse(num)));
If the number is a String, parse it first:
// in case of int data type
int.parse(num);
// in case of double data type
double.parse(num);