BBEdit chomp off excessive decimals - regex

I have a very large data set of float values whose full precision is no longer needed. What regular expression can I use with BBEdit to keep a maximum of 5 digits after the period?
Physically, the decimal value always has a character preceding the period, is always preceded by a space, but can be followed by a comma or a space.
sample:
-162.40904700399989, -82.896416924999954

You may use
Find: (\d\.\d{5})\d+
Replace: \1
Details
(\d\.\d{5}) - Group 1 (referred to via \1 from the replacement pattern): a digit, a literal . and then 5 digits (the first \d has no quantifier because a single digit before the decimal separator is enough to anchor the match; any further leading digits do not matter)
\d+ - one or more digits. Note the + quantifier makes more sense than * because we only want to match those numbers that we want to modify, those that already have 5 digits after the decimal separator do not have to be matched.
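As a quick check outside BBEdit, the same find/replace pair can be tried in Python, whose re module also accepts \1 in the replacement string. This is only a sketch; the function name truncate_decimals is made up for the example. Note that the pattern truncates and does not round.

```python
import re

# Same pattern as the BBEdit "Find" field above.
TRUNCATE = re.compile(r"(\d\.\d{5})\d+")

def truncate_decimals(text: str) -> str:
    """Drop every digit past the fifth decimal place (no rounding)."""
    return TRUNCATE.sub(r"\1", text)

print(truncate_decimals("-162.40904700399989, -82.896416924999954"))
# -162.40904, -82.89641
```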

Related

How to match the decimal digits which equals 0 at the end of a number in Regex?

I want to remove the zeros at the end of a number coming after the decimal point. To give an example:
12.009000 should match "000"
I have the regex pattern below, but it gives the error "A quantifier inside a lookbehind makes it non-fixed width" and I can't find a way to fix that. What is the correct pattern to match successfully?
Pattern: (?<=\.[0-9]*)0+$
With Java, you can do it like this.
(\\.\\d+?) lazily captures the dot and the digits after it
followed by one or more trailing 0's
replace the match with the captured text.
$1 is the back reference to the capture group
str = str.replaceAll("(\\.\\d+?)0+$","$1");
System.out.println(str);
Note: It will leave 12.000000 as 12.0.
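For readers not using Java, here is a rough Python equivalent of the replaceAll call above. It is a sketch only: the function name is invented, and Python writes the back reference as \1 where Java uses $1.

```python
import re

def strip_trailing_zeros(number: str) -> str:
    # (\.\d+?) lazily captures the dot plus at least one digit;
    # 0+$ then consumes the zeros that run to the end of the string.
    return re.sub(r"(\.\d+?)0+$", r"\1", number)

print(strip_trailing_zeros("12.009000"))  # 12.009
print(strip_trailing_zeros("12.000000"))  # 12.0
```

Because the group requires at least one digit after the dot, a value such as 12.000000 keeps a single zero, as the note above says.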
(\d+[.]?\d*?)0*$
One more step is needed to remove the trailing dot for numbers such as 12.000, which this pattern leaves as "12."
Or to deal with numbers such as 12.000 in one step:
(?:(\d+)\.0*$)|(?:(\d+[.]?\d*?)0*$)
Here is my attempt:
(?:[.][0-9]*[1-9])(0+)$|([.]0+$)
This assumes that the input string is actually a number (it won't protect against things like xyz.001). It will not match at all if there are no trailing zeros after decimal point; and if there are, it removes:
sequence of 0s preceded by a [1-9] after [.][0-9]*
or
a [.] followed by a sequence of 0s.
The result will always be in the captured group if the regex matches.
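One possible way to use that pattern from code, sketched in Python (the helper name remove_captured_zeros is hypothetical). Since whichever group matched always runs to the end of the string, cutting the input at the start of that group removes exactly the captured text.

```python
import re

TRAILING = re.compile(r"(?:[.][0-9]*[1-9])(0+)$|([.]0+$)")

def remove_captured_zeros(number: str) -> str:
    m = TRAILING.search(number)
    if m is None:
        return number  # no trailing zeros after a decimal point
    # Exactly one of the two groups takes part in a successful match.
    group = 1 if m.group(1) is not None else 2
    # The captured text ends at the end of the string, so slice it off.
    return number[: m.start(group)]

print(remove_captured_zeros("12.009000"))  # 12.009
print(remove_captured_zeros("12.000"))     # 12
```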
([\d.]+?)(0*)$
"Find digits and dots, but not greedily, then find trailing zeros"
Group 1 is the number. Group 2 is the trailing zeros.

Input Commas into regex during the whole number part of 10,4 decimal

I am looking for a regex that will limit a decimal to 10,4 but in the whole number part (10) I would like it to separate with commas.
For example - 1,123,123,123.1234
This gets me close to what I need - \d{0,10}.\d{4}
But I would like to show commas as in the example.
But I am not sure how to tweak this to achieve what I need?
You should be able to use the following :
(?:\d{1,3}(?:,\d{3}){0,2}|\d(?:,\d{3}){3}|\d{1,10})(?:\.\d{1,4})?
The whole pattern is an integer part followed by an optional floating part.
The integer part, (?:\d{1,3}(?:,\d{3}){0,2}|\d(?:,\d{3}){3}|\d{1,10}), is an alternative between three sub-patterns :
up to 9 digits with commas, \d{1,3}(?:,\d{3}){0,2}: a leading group of one to three digits followed by up to two groups of exactly three digits, each preceded by a comma
the 10-digit case with commas, \d(?:,\d{3}){3}: the leading group must contain exactly one digit and is followed by three three-digit groups, each preceded by a comma
the comma-less number you had to begin with, \d{1,10}
The floating part is a dot followed by at least one digit and at most four.
Note that if you can avoid using a regex you absolutely should, this is the kind of regex which will make maintainers cry...
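A small Python harness for trying the pattern against a few inputs. This is a sketch; is_valid_amount is a made-up name, and fullmatch is used because the pattern as posted has no ^ or $ anchors.

```python
import re

AMOUNT = re.compile(
    r"(?:\d{1,3}(?:,\d{3}){0,2}|\d(?:,\d{3}){3}|\d{1,10})(?:\.\d{1,4})?"
)

def is_valid_amount(text: str) -> bool:
    # fullmatch requires the whole string to match, so partial hits are rejected.
    return AMOUNT.fullmatch(text) is not None

for sample in ["1,123,123,123.1234", "1234567890.12", "12,123,123,123", "1.12345"]:
    print(sample, is_valid_amount(sample))
```

The first two samples are accepted; the third has eleven integer digits and the fourth has five decimals, so both are rejected.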
I don't think you can do this with a single regex.
The algorithm I use is:
Take the part of the number before the decimal point
Convert that to a string
Reverse the string
Split the string into chunks of 3 digits allowing the last group to have 1, 2 or 3 digits (this depends on your programming language)
Join the string together inserting , between each group
Reverse the string.
Concatenate a decimal point and the decimal digits if necessary.
You now have a correctly formatted string.
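The steps above could be sketched in Python roughly as follows. The function name is invented, and the input is assumed to be a plain decimal string with an optional leading sign.

```python
def add_thousands_separators(number: str) -> str:
    # Split off the part before the decimal point.
    integer_part, dot, fraction = number.partition(".")
    sign = ""
    if integer_part.startswith(("+", "-")):
        sign, integer_part = integer_part[0], integer_part[1:]
    # Reverse, cut into chunks of 3 (the last chunk may be shorter), join, reverse back.
    reversed_digits = integer_part[::-1]
    chunks = [reversed_digits[i:i + 3] for i in range(0, len(reversed_digits), 3)]
    grouped = ",".join(chunks)[::-1]
    # Re-attach the decimal point and decimal digits, if there were any.
    return sign + grouped + dot + fraction

print(add_thousands_separators("1123123123.1234"))  # 1,123,123,123.1234
```

In Python specifically, the built-in format specifier does the grouping for numeric values, for example format(12345678, ","), so the manual version is mainly useful for illustrating the algorithm.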
This does the job:
^(?:\d,)?\d{0,3}(?:,\d{1,3}){0,2}\.\d{4}$
Explanation:
^ # beginning of string
(?:\d,)? # non capture group, a digit and a comma, optional
\d{0,3} # 0 to 3 digits
(?: # non capture group
, # a comma
\d{1,3} # 1 to 3 digits
){0,2} # end group, may appear 0, 1 or 2 times
\. # a dot
\d{4} # 4 digits
$ # end of string
The following perl code uses a trick to work from right to left:
$num = 12345678.01;
$rev = reverse($num);
$rev =~ s/(\d{3})(?=\d)(?!\d*\.)/$1,/g;
$res = reverse($rev);
print "$res\n";
results in
12,345,678.01

TCL regexp for float fails at single digit

I have developed the following regexp to capture float numbers.
([+-]?[0-9]+\.?[0-9]+([eE][-+]?[0-9]+)?)
It works fine for such things as 4.08955e-11 or 3.57. Now by stupid chance my parser came across 0 and failed. I guess I need to make all following the decimal point optional. But how do I do that?
Contrary to what one might think, matching every possible form of floating point number (including NaN etc) with a manageable regular expression that still discards e.g. impossibly large numbers or pseudo-octals is non-trivial.
There are some ideas about reducing the risk of false positives by using word boundaries, but note that those match boundaries between word characters (usually alphanumerics and underscore).
The scan command allows simple and reliable validation and extraction of floating point numbers:
scan $number %f
If you make everything after the decimal point optional (the point itself is already optional), the pattern will also match values with a bare trailing dot, such as "2."
Note that your regex does not match a single digit because you match 2 times one or more digits [0-9]+
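The failure is easy to reproduce outside Tcl. The sketch below runs the pattern from the question through Python's re module; the syntax used here behaves the same way in both engines as far as I can tell, but treat that as an assumption.

```python
import re

# The pattern from the question: two separate [0-9]+ runs are mandatory.
ORIGINAL = re.compile(r"([+-]?[0-9]+\.?[0-9]+([eE][-+]?[0-9]+)?)")

def matches_original(text: str) -> bool:
    return ORIGINAL.fullmatch(text) is not None

for sample in ["4.08955e-11", "3.57", "42", "0"]:
    print(sample, matches_original(sample))
```

The two-digit input 42 still matches because each [0-9]+ can take one digit, while the single digit 0 cannot satisfy both.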
If you only want to match float numbers or zero you could use an alternation and for example use word boundaries \b:
[-+]?\b(?:[0-9]+\.[0-9]+(?:[eE][-+]?[0-9]+)?|0)\b
Explanation
[-+]? Match optional + or -
\b Word boundary
(?: Non capturing group
[0-9]+\.[0-9]+ match one or more digits dot and one or more digits
(?:[eE][-+]?[0-9]+)? Optional exponent part
| Or
0 Match literally
) Close non capturing group
\b Word boundary
To match a float value that does not start with a dot and could be one or more digits without a dot, you could use:
^[-+]?[0-9]+(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?$
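A few sample inputs run against that anchored pattern, sketched in Python rather than Tcl (is_float is a hypothetical helper name):

```python
import re

FLOAT = re.compile(r"^[-+]?[0-9]+(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?$")

def is_float(text: str) -> bool:
    # fullmatch makes sure nothing trails the number (not even a newline).
    return FLOAT.fullmatch(text) is not None

for sample in ["0", "3.57", "4.08955e-11", "-1e5", "2.", ".5"]:
    print(sample, is_float(sample))
```

The single digit 0 is now accepted, while 2. and .5 are rejected because the fractional group needs digits on both sides of the dot.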
Perhaps using alternatives:
{[-+]?(?:\y[0-9]+(?:\.[0-9]*)?|\.[0-9]+\y)(?:[eE][-+]?[0-9]+\y)?}

Regex expressions for matching comparisons

Is it possible to create a regular expression that matches a comparison such as less than or greater than? For example, match all dollar values less than $500.
One way I would use this would be on online stores that list many products on a single page but do not provide a way to sort by price. I found a search page by regex extension for Chrome and am trying to figure out if there is a way I can use a regex to match any strings on the page beginning with a dollar sign followed by any number less than a number that I specify.
This should work for you \$[1-4]?\d?\d\b.
Explanation:
r"""
\$ # Match the character “$” literally
[1-4] # Match a single character in the range between “1” and “4”
? # Between zero and one times, as many times as possible, giving back as needed (greedy)
\d # Match a single digit 0..9
? # Between zero and one times, as many times as possible, giving back as needed (greedy)
\d # Match a single digit 0..9
\b # Assert position at a word boundary
"""
This could do what you need: ^(\$[1-4]?\d?\d)$. This will match any whole-dollar value from $0 to $499.
As mentioned above, if you would like to match even decimal values you could use something like so: ^(\$[1-4]?\d?\d(\.\d{2})?)$. That being said, numeric validation should ideally be done using actual mathematical operations, and not regular expressions.
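To illustrate the point about doing the comparison numerically, here is a minimal Python sketch that uses a regex only to find the prices and then compares them as numbers. The function name, the sample text and the assumption that prices contain no thousands separators are all mine, not part of the answer above.

```python
import re
from decimal import Decimal

# Assumption: prices look like $19 or $19.99, without thousands separators.
PRICE = re.compile(r"\$(\d+(?:\.\d{2})?)")

def prices_below(text: str, limit: str) -> list:
    cap = Decimal(limit)
    # The regex only locates candidates; the comparison itself is arithmetic.
    return [m.group(0) for m in PRICE.finditer(text) if Decimal(m.group(1)) < cap]

page = "Mouse $19.99, Laptop $1299.00, Monitor $499, Desk $500"
print(prices_below(page, "500"))  # ['$19.99', '$499']
```

Changing the threshold is then a matter of passing a different limit instead of rewriting the pattern.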
Edit: this is overly complicated, but it will also match any value strictly less than 500
\$[1-4]\d{2}(\.\d{2})?$|\$\d{1,2}(\.\d{2})?$
if you need to match $500 as well, add another |\$500(\.00)?$
This matches:
\$ the dollar symbol
[1-4] followed by a digit between 1 and 4
\d{2} followed by exactly 2 digits
(\.\d{2})? optionally --> ()? followed by a dot --> \. and exactly 2 digits
$ followed by end of line (may be replaced with \b for word boundaries)
| or
\$\d{1,2} the dollar symbol followed by one or two digits
(\.\d{2})?$ again optionally followed by cents, followed by end of line
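A short Python check of the pattern against boundary values, as a sketch (is_under_500 is an invented name, and each input is assumed to be a single price string):

```python
import re

UNDER_500 = re.compile(r"\$[1-4]\d{2}(\.\d{2})?$|\$\d{1,2}(\.\d{2})?$")

def is_under_500(price: str) -> bool:
    # fullmatch ties the match to the start of the string as well as the end.
    return UNDER_500.fullmatch(price) is not None

for sample in ["$5", "$99", "$0.50", "$499.99", "$500", "$1000"]:
    print(sample, is_under_500(sample))
```

Everything up to $499.99 is accepted; $500 and $1000 are not.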

Regular expression to find the number, 0 or decimal

I'm looking for a regular expression which will validate a number starting from 0 and up, and which might include decimals.
Any idea?
A simple regex to validate a number:
^\d+(\.\d+)?$
This should work for a number with optional leading zeros, with an optional single dot and more digits.
^...$ - match from start to end of the string (will not validate ab12.4c)
\d+ - At least one digit.
(...)? - optional group of...
\.\d+ - literal dot and one or more digits.
Because decimal numbers may or may not have a decimal point in them, and may or may not have digits before that decimal point if they have some afterwards, and may or may not have digits following that decimal point if they have some before it, you must use this:
^(\d+(\.\d*)?|\d*\.\d+)$
which is usually better written:
^(?:\d+(?:\.\d*)?|\d*\.\d+)$
and much better written:
(?x)
^ # anchor to start of string
(?: # EITHER
\d+ (?: \. \d* )? # some digits, then optionally a decimal point followed by optional digits
| # OR ELSE
\d* \. \d+ # optional digits, then a decimal point and more digits
) # END ALTERNATIVES
$ # anchor to end of string
If your regex compiler doesn't support \d, or if your engine is Unicode-aware and you would rather match only ASCII digits than anything with the Unicode Decimal_Number property (shortcut Nd, that is, anything with Numeric_Type=Decimal), then you might wish to swap in [0-9] for every \d used above.
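In Python, the commented form can be written with re.VERBOSE, and re.ASCII restricts \d to [0-9], which addresses the Unicode caveat without rewriting the pattern. The following is a sketch with a hypothetical helper name, not the only way to set this up.

```python
import re

NUMBER = re.compile(
    r"""
    ^                       # anchor to start of string
    (?:                     # EITHER
        \d+ (?: \. \d* )?   #   digits, then optionally a point and optional digits
      |                     # OR ELSE
        \d* \. \d+          #   optional digits, then a point and more digits
    )                       # END ALTERNATIVES
    $                       # anchor to end of string
    """,
    re.VERBOSE | re.ASCII,  # ASCII: \d matches only 0-9
)

def is_number(text: str) -> bool:
    return NUMBER.fullmatch(text) is not None

for sample in ["0", "12.", ".5", "3.14", ".", "1.2.3"]:
    print(sample, is_number(sample))
```

The first four samples are accepted; a lone dot and 1.2.3 are rejected.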
I always use RegExr to build my regular expressions. It is sort of drag-and-drop and has a live-preview of your regex and the result.
It'll look something like ^0[,.0-9]*
^[0-9]+(\.[0-9]+)?$
Note that with this expression 0.1 will be valid but .1 won't.
This should do what you want:
^[0-9]+([,.][0-9]+)?$
It will match one or more digits (so a leading 0 is fine), optionally followed by a single , or . and one or more further digits.
'/^([0-9\.]+)$/'
will match if the test string consists only of digits and dots, which covers positive decimal numbers but also accepts strings such as 1.2.3
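Since the answers in this thread accept slightly different inputs, a side-by-side comparison may help when choosing one. The sketch below runs four of the posted patterns in Python; the labels are made up for the example, and the PHP-style /.../ delimiters have been dropped from the last pattern.

```python
import re

# Labels are hypothetical; the patterns are copied from the answers above.
PATTERNS = {
    "simple":   r"^\d+(\.\d+)?$",
    "flexible": r"^(?:\d+(?:\.\d*)?|\d*\.\d+)$",
    "comma_ok": r"^[0-9]+([,.][0-9]+)?$",
    "loose":    r"^([0-9\.]+)$",
}

def accepted_by(text: str) -> list:
    """Return the labels of all patterns that accept the whole string."""
    return [name for name, pat in PATTERNS.items() if re.fullmatch(pat, text)]

for sample in ["0", "0.1", ".1", "1.", "1,5", "1.2.3"]:
    print(repr(sample), accepted_by(sample))
```

Only the last pattern accepts 1.2.3, and only the third accepts a comma as the decimal separator.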