I'm trying to write a regular expression that matches the decimal point (and the digits after it) of a dollar value. For example, I want to match within $1.00 and $1,100.89 (this includes values in the thousands with commas). It must not match any digits that are not preceded by a $ character. These values are also not the only pieces of text in this file.
So far, I've tried a few things that haven't quite gotten me there:
\.+[\d]+ (highlights the decimal and every digit after the decimal point, but not what we want because it includes non-dollar values like 1.00)
\$+[\d+\.]+ highlights the whole value of the dollar except the 1,250
(\$\d+\.+\d+)|\$\d+\,+\d+\.+\d+ highlights the whole value of anything with a dollar sign
Anyone have an idea?
I looked at your problem and I believe I have a solution.
You could use the regex below to search for the last two decimals.
^\$[\d,]+\.((?:\d){2})
You can see it in action here
Use:
^\$[\d,]+\.(\d\d)$
Explanation:
^ # beginning of string
\$ # $ sign
[\d,]+ # 1 or more digit or comma
\. # a dot
(\d\d) # group 1, 2 digits
$ # end of string
var test = [
'$100.00',
'$1,100.89',
'$123',
'123.45',
];
console.log(test.map(function (a) {
var m = a.match(/^\$[\d,]+\.(\d\d)$/);
if (m)
return a + ' : ' + m[1];
else
return a + ' : no match';
}));
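Since the question says the dollar values are embedded in other text, the anchored patterns (^ and $) only work when each value is tested on its own. Here is a minimal sketch of an unanchored variant: it drops the anchors, adds the g flag, and captures the decimal part. The sample text and the name decimalOfDollar are made up for illustration.

```javascript
// Hypothetical sample text; the dollar amounts are embedded in other content.
const text = "Paid $1.00 and $1,100.89, but 1.00 and 3.14 are not dollar values. Flat fee: $123";

// \$        a literal dollar sign
// \d[\d,]*  the whole-dollar part, digits and thousands separators
// (\.\d+)   group 1: the decimal point and the digits after it
const decimalOfDollar = /\$\d[\d,]*(\.\d+)/g;

// Collect group 1 of every match.
const decimals = Array.from(text.matchAll(decimalOfDollar), m => m[1]);
console.log(decimals); // [ '.00', '.89' ]
```

Numbers without a leading $ are skipped, and so is $123, because it has no decimal part.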
You could use a non-capturing group (?:) so that only the part you want ends up in a capturing group. I've come up with this regex and it seems to do what you are looking for:
^(?:\$[,\d]+)(?:\.([\d]{2}))
const regex = /^(?:\$[,\d]+)(?:\.([\d]{2}))/;
const values = [
'$100.00',
'$99.99',
'$1,354.92'
];
const result = values.map(item => regex.exec(item)[1]);
console.log(result);
You could test more cases here
EDIT :
Here is an example of how to replace only the decimal digits.
I'm using the same concept as before, only this time I'm not capturing the decimal digits. Instead, I capture the dollar part and use $1 to put that group back into the new string.
const regex = /^(\$[,\d]+)\.(?:[\d]{2})/;
const values = [
'$100.00',
'$99.99',
'$1,354.92'
];
const result = values.map(item => item.replace(regex, '$1.50'));
console.log(result);
Notice that the $1 in the replace call refers to the first capturing group of the regex. This way, we can get it back and "insert" it into our final string.
Here I've chosen .50 as the replacement, but you could use whatever you like.
P.S. I know this might be confusing because we are talking about dollars, so here is an example where we replace the decimal part with a word.
const regex = /^(\$[,\d]+)\.(?:[\d]{2})/;
const values = [
'$100.00',
'$99.99',
'$1,354.92'
];
const result = values.map(item => item.replace(regex, '$1 this is a word'));
console.log(result);
I've got this text: 3,142 people. I need to remove the people from it and get only the number, also removing comma(s). I need it to work with any higher numbers too like 13,142 or even 130,142 (at every 3 digits it will get a new comma).
So, in short, I need to get the numeric characters only, without commas and people. Ex: 3,142 people -> 3142.
My first version that didn't work was:
var str2 = "3,142 people";
var patt2 = /\d+/g;
var result2 = str2.match(patt2);
But after I changed patt2 to /\d+[,]\d+/g, it worked.
You can use this:
var test = '3,142 people';
test = test.replace(/[^0-9.]/g, ""); // "3142"
It removes everything except digits and the decimal point. Note that replace returns a new string, so the result has to be assigned.
'3,142 people'.replace(/[^\d]/g, ''); // 3142
JSFiddle Demo: http://jsfiddle.net/zjx2hn1f/1/
Explanation
[] // match any character in this set
[^] // match anything NOT in character set
\d // match only digit
[^\d] // match any character that is NOT a digit
string.replace(/[^\d]/g, '') // replace any character that is NOT a digit with an empty string, in other words, remove it.
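If the goal is an actual number rather than a string of digits, the cleaned string still has to be parsed. A minimal sketch, assuming the input holds a single whole number with optional thousands separators; extractInteger is a hypothetical helper name, not something from the answers above.

```javascript
// Hypothetical helper: strips every non-digit character, then parses the rest.
// Assumes the text contains one whole number (no decimal part).
function extractInteger(text) {
  const digits = text.replace(/\D/g, ""); // \D is shorthand for [^\d]
  return digits === "" ? NaN : parseInt(digits, 10);
}

console.log(extractInteger("3,142 people"));   // 3142
console.log(extractInteger("130,142 people")); // 130142
```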
I have a string that looks like the following:
<#399969178745962506> hello to <#!104729417217032192>
I have a dictionary containing both that looks like following:
{"399969178745962506", "One"},
{"104729417217032192", "Two"}
My goal here is to replace the <#399969178745962506> into the value of that number key, which in this case would be One
Regex.Replace(arg.Content, "(?<=<)(.*?)(?=>)", m => userDic.ContainsKey(m.Value) ? userDic[m.Value] : m.Value);
My current regex is as following: (?<=<)(.*?)(?=>) which only matches everything in between < and > which would in this case leave both #399969178745962506 and #!104729417217032192
I can't just ignore the # sign, because the ! sign is not there every time. So it could be optimal to only get numbers with something like \d+
I need to figure out how to only get the numbers between < and > but I can't for the life of me figure out how.
Very grateful for any help!
In C#, you may use two approaches: a lookaround-based one (possible because .NET lookbehind patterns can be variable width) and a capturing group approach.
Lookaround based approach
The pattern that will easily help you get the digits in the right context is
(?<=<#!?)\d+(?=>)
See the regex demo
The (?<=<#!?) is a positive lookbehind that requires <# or <#! immediately to the left of the current location, and (?=>) is a positive lookahead that requires a > char immediately to the right of the current location.
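The lookaround pattern can also be tried outside C#. Here is a minimal sketch in JavaScript, assuming an engine with lookbehind support (recent Node.js and browser versions); the variable names are illustrative only.

```javascript
const s = "<#399969178745962506> hello to <#!104729417217032192>";

// The lookbehind and lookahead check the context but are not part of the match,
// so each match is just the run of digits.
const ids = s.match(/(?<=<#!?)\d+(?=>)/g) || [];
console.log(ids); // [ '399969178745962506', '104729417217032192' ]
```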
Capturing approach
You may use the following pattern that will capture the digits inside the expected <...> substrings:
<#!?(\d+)>
Details
<# - a literal <# substring
!? - an optional exclamation sign
(\d+) - capturing group 1 that matches one or more digits
> - a literal > sign.
Note that the values you need can be accessed via match.Groups[1].Value, as shown in the snippet below.
Usage:
var userDic = new Dictionary<string, string> {
{"399969178745962506", "One"},
{"104729417217032192", "Two"}
};
var p = @"<#!?(\d+)>";
var s = "<#399969178745962506> hello to <#!104729417217032192>";
Console.WriteLine(
Regex.Replace(s, p, m => userDic.ContainsKey(m.Groups[1].Value) ?
userDic[m.Groups[1].Value] : m.Value
)
); // => One hello to Two
// Or, if you need to keep <#, <#! and >
Console.WriteLine(
Regex.Replace(s, @"(<#!?)(\d+)>", m => userDic.ContainsKey(m.Groups[2].Value) ?
$"{m.Groups[1].Value}{userDic[m.Groups[2].Value]}>" : m.Value
)
); // => <#One> hello to <#!Two>
See the C# demo.
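For comparison, the capturing approach carries over to other regex flavors almost unchanged. Here is a rough JavaScript equivalent, assuming a Map in place of the Dictionary; replaced is an illustrative name, and this is only a sketch of the same idea, not a port of the C# demo.

```javascript
// Lookup table standing in for the C# Dictionary.
const userDic = new Map([
  ["399969178745962506", "One"],
  ["104729417217032192", "Two"],
]);

const s = "<#399969178745962506> hello to <#!104729417217032192>";

// The callback receives the whole match and then group 1 (the digits).
// Unknown ids are left untouched by returning the original match.
const replaced = s.replace(/<#!?(\d+)>/g, (match, id) =>
  userDic.has(id) ? userDic.get(id) : match
);
console.log(replaced); // One hello to Two
```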
To extract just the numbers from your given format, use this regex pattern:
(?<=<#|<#!)(\d+)(?=>)
See it work in action: https://regexr.com/3j6ia
You can use a non-capturing group to keep the optional # and ! prefix out of the captured group:
(?<=<)(?:#?!?)(.*?)(?=>)
Alternatively, you could name the inner group and use the named group to get it:
(?<=<)(?:#?!?)(?<yourgroupname>.*?)(?=>)
Access it via m.Groups["yourgroupname"].Value. For more, see e.g. How do I access named capturing groups in a .NET Regex?
Regex: (?:<#!?(\d+)>)
Details:
(?:) Non-capturing group
<# matches the characters <# literally
!? matches the character ! between zero and one times
(\d+) 1st capturing group; \d+ matches one or more digits
> matches the character > literally
Regex demo
string text = "<#399969178745962506> hello to <#!104729417217032192>";
Dictionary<string, string> list = new Dictionary<string, string>() { { "399969178745962506", "One" }, { "104729417217032192", "Two" } };
text = Regex.Replace(text, @"(?:<#!?(\d+)>)", m => list.ContainsKey(m.Groups[1].Value) ? list[m.Groups[1].Value] : m.Value);
Console.WriteLine(text); // One hello to Two
Console.ReadLine();
The problem goes like this:
value match: 218\d{3}(\d{4})#domain.com replace with 10\1 to get 10 followed by last 4 digits
for example 2181234567 would become 104567
value match: 332\d{3}(\d{4})#domain.com replace with 11\1 to get 11 followed by last 4 digits
for example 3321234567 would become 114567
value match: 420\d{3}(\d{4})#domain.com replace with 12\1 to get 12 followed by last 4 digits
..and so on
for example 4201234567 would become 124567
Is there a better way to catch different values and replace with their corresponding replacements in a single RegEx than creating multiple expressions?
Something like matching (218|332|420)\d{3}(\d{4})#domain.com and replacing with (10\4|11\4|12\4), so that only the replacement corresponding to the matched prefix is produced.
Edit: Didn't specify the use case: It's for my PBX, that just uses RegEx to match patterns and then replace it with the values I want it to go out with. No code. Just straight up RegEx in the GUI.
Also for personal use, if I can get it to work with Notepad++
Ctrl+H
Find what: (?:(218)|(332)|(420))\d{3}(\d{4})(?=#domain\.com)
Replace with: (?{1}10$4)(?{2}11$4)(?{3}12$4)
CHECK Wrap around
CHECK Regular expression
Replace all
Explanation:
(?: # non capture group
(218) # group 1, 218
| # OR
(332) # group 2, 332
| # OR
(420) # group 3, 420
) # end group
\d{3} # 3 digits
(\d{4}) # group 4, 4 digits
(?=#domain\.com) # positive lookahead, make sure we have "#domain.com" after
# that allows to keep "#domain.com"
# if you want to remove it from the result, just put "#domain\.com"
# without lookahead.
Replacement:
(?{1} # if group 1 exists
10 # insert "10"
$4 # insert content of group 4
) # endif
(?{2}11$4) # same as above
(?{3}12$4) # same as above
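The (?{n}...) conditional is specific to Notepad++ replacements. In an environment where code is available, the same "which alternative matched" logic can be sketched with a replacer function. The JavaScript below is only an illustration of that idea; the function name rewrite is made up.

```javascript
// Same pattern as above: one capturing group per prefix, group 4 for the last 4 digits.
const re = /(?:(218)|(332)|(420))\d{3}(\d{4})(?=#domain\.com)/g;

function rewrite(input) {
  return input.replace(re, (match, g1, g2, g3, last4) => {
    // Only the group whose alternative matched is defined.
    if (g1 !== undefined) return "10" + last4;
    if (g2 !== undefined) return "11" + last4;
    return "12" + last4; // g3 matched
  });
}

console.log(rewrite("2181234567#domain.com")); // 104567#domain.com
console.log(rewrite("3321234567#domain.com")); // 114567#domain.com
console.log(rewrite("4201234567#domain.com")); // 124567#domain.com
```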
I don't think you can use a single regular expression to conditionally replace text as per your example. You either need to chain multiple search & replace, or use a function that does a lookup based on the first captured group (first three digits).
You did not specify the language used, regular expressions vary based on language. Here is a JavaScript code snippet that uses the function with lookup approach:
var str1 = '2181234567#domain.com';
var str2 = '3321234567#domain.com';
var str3 = '4201234567#domain.com';
var strMap = {
'218': '10',
'332': '11',
'420': '12'
// add more as needed
};
function fixName(str) {
var re = /(\d{3})\d{3}(\d{4})(?=\#domain\.com)/;
var result = str.replace(re, function(m, p1, p2) {
return strMap.hasOwnProperty(p1) ? strMap[p1] + p2 : m; // leave unknown prefixes unchanged
});
return result;
}
var result1 = fixName(str1);
var result2 = fixName(str2);
var result3 = fixName(str3);
console.log('str1: ' + str1 + ', result1: ' + result1);
console.log('str2: ' + str2 + ', result2: ' + result2);
console.log('str3: ' + str3 + ', result3: ' + result3);
Output:
str1: 2181234567#domain.com, result1: 104567#domain.com
str2: 3321234567#domain.com, result2: 114567#domain.com
str3: 4201234567#domain.com, result3: 124567#domain.com
@Toto has a nice answer, and there is another method if the operator (?{1}...) is not available (but thanks, Toto, I did not know this feature of Notepad++).
More details on my answer here: https://stackoverflow.com/a/63676336/1287856
Append to the end of the doc:
,218=>10,332=>11,420=>12
Search for:
(218|332|420)\d{3}(\d{4})(?=#domain.com)(?=[\s\S]*,\1=>([^,]*))
Replace with
\3\2
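To see why this works: the second lookahead scans forward to the appended table, uses \1 to find the entry for the matched prefix, and captures its replacement value as group 3. Here is a small JavaScript simulation of the same search and replace, assuming the table is the very last thing in the document; the sample lines and variable names are illustrative.

```javascript
const lines = ["2181234567#domain.com", "3321234567#domain.com", "4201234567#domain.com"];
const table = ",218=>10,332=>11,420=>12"; // lookup table appended to the end of the document
const doc = lines.join("\n") + "\n" + table;

// Group 1: prefix, group 2: last 4 digits, group 3: table value found via \1.
const re = /(218|332|420)\d{3}(\d{4})(?=#domain\.com)(?=[\s\S]*,\1=>([^,]*))/g;
const replaced = doc.replace(re, "$3$2");

// Drop the appended table (and the newline before it) again.
const result = replaced.slice(0, replaced.length - table.length - 1).split("\n");
console.log(result);
// [ '104567#domain.com', '114567#domain.com', '124567#domain.com' ]
```

In Notepad++ the table would be removed by hand after the replacement.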
I've got a nested array of prices stored as strings, and I want to remove the £ symbols and convert the strings into numbers so I can use the values in a chart.js line graph.
I've been trying to use regex replace to remove the £ sign but I don't think I can get it working because the strings are in a nested array. I can't seem to find anything on the net about replacing characters in a nested array. I haven't even tried converting the string to an integer yet, but wondering if it could all be handled in one go in someway?
this is my nested array called linedata
var linedata = [["£14.99,£14.99,£14.99"],["£34.99,£34.99,£34.99"]]
this is the code I've been playing around with
var re = /£/g;
var newlinedata = linedata.replace(re, "");
It's not returning anything in the Chrome console, and the Ionic CLI is kicking out this error:
ERROR in src/app/home/home.page.ts(66,26): error TS2339: Property
'replace' does not exist on type 'any[]'.
Thoughts?
This expression might help you to replace that:
£([0-9.]+,?)
The key is to add everything you like to keep in the capturing group () and the pound symbol outside of the group, then simply replace it with $1.
RegEx
If this wasn't your desired expression, you can modify/change your expressions in regex101.com.
RegEx Circuit
You can also visualize your expressions in jex.im:
JavaScript Demo
const regex = /£([0-9.]+,?)/gm;
const str = `£14.99,£14.99,£14.99
£34.99,£34.99,£34.99`;
const subst = `$1`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
You could capture in a group matching a digit with an optional decimal part after matching the £
£(\d+(?:\.\d+)?)
£ Match literally
( Capturing group
\d+(?:\.\d+)? Match 1+ digits with an optional part to match a dot and 1+ digits
) Close group
Regex demo
You should use replace on a string instead of an array.
To turn the nested array with the strings into arrays of numbers, you could use map and in the replacement refer to the capturing group using $1 which contains your value.
For example:
var linedata = [
["£14.99,£14.99,£14.99"],
["£34.99,£34.99,£34.99"]
].map(ary =>
ary[0].split(',')
.map(v => Number(v.replace(/£(\d+(?:\.\d+)?)/g, "$1")))
);
console.log(linedata);
Or if you want to keep multiple nested arrays, you could use another map.
var linedata = [
["£14.99,£14.99,£14.99"],
["£34.99,£34.99,£34.99", "£34.99,£34.99,£34.99"]
].map(ary => ary
.map(s => s.split(',')
.map(v => Number(v.replace(/£(\d+(?:\.\d+)?)/g, "$1")))
)
);
console.log(linedata);
I am trying to parse a file that contains parameter attributes. The attributes are set up like this:
w=(nf*40e-9)*ng
but also like this:
par_nf=(1) * (ng)
The issue is, all of these parameter definitions are on a single line in the source file, and they are separated by spaces. So you might have a situation like this:
pd=2.0*(84e-9+(1.0*nf)*40e-9) nf=ng m=1 par=(1) par_nf=(1) * (ng) plorient=0
The current algorithm just splits the line on spaces and then for each token, the name is extracted from the LHS of the = and the value from the RHS. My thought is if I can create a Regex match based on spaces within parameter declarations, I can then remove just those spaces before feeding the line to the splitter/parser. I am having a tough time coming up with the appropriate Regex, however. Is it possible to create a regex that matches only spaces within parameter declarations, but ignores the spaces between parameter declarations?
Try this RegEx:
(?<=^|\s) # Start of each formula (start of line OR [space])
(?:.*?) # Attribute Name
= # =
(?: # Formula
(?!\s\w+=) # DO NOT Match [space] Word Characters = (Attr. Name)
[^=] # Any Character except =
)* # Formula Characters repeated any number of times
When checking formula characters, it uses a negative lookahead to check for a Space, followed by Word Characters (Attribute Name) and an =. If this is found, it will stop the match. The fact that the negative lookahead checks for a space means that it will stop without a trailing space at the end of the formula.
Live Demo on Regex101
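As a rough illustration of how the pattern could feed a parser directly, here is a JavaScript sketch. It is a compacted variant, not the exact regex above: the attribute name is assumed to consist of word characters and is captured with (\w+) instead of (?:.*?), and the formula is captured as group 2. It also needs an engine with lookbehind support.

```javascript
const line = "pd=2.0*(84e-9+(1.0*nf)*40e-9) nf=ng m=1 par=(1) par_nf=(1) * (ng) plorient=0";

// (?<=^|\s)            name must start at the line start or after whitespace
// (\w+)=               group 1: attribute name, then the equals sign
// ((?:(?!\s\w+=)[^=])*) group 2: formula, stopping before " name="
const paramRe = /(?<=^|\s)(\w+)=((?:(?!\s\w+=)[^=])*)/g;

const params = {};
for (const [, name, value] of line.matchAll(paramRe)) {
  params[name] = value;
}
console.log(params.par_nf); // (1) * (ng)
console.log(Object.keys(params)); // [ 'pd', 'nf', 'm', 'par', 'par_nf', 'plorient' ]
```

This skips the "remove inner spaces, then split" step and yields name/value pairs in one pass.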
Thanks to @Andy for the tip:
In this case I'll probably just match on the parameter name and equals, but replace the preceding whitespace with some other "parse-able" character to split on, like so:
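(\s*)\w*[a-zA-Z_]=
Now my first capturing group can be used to insert something like a colon, semicolon, or line break. Note that \w*[a-zA-Z_] also matches one-character names such as m, whereas \w+[a-zA-Z_] would require at least two characters and miss m=1.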
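A minimal sketch of that separator idea in JavaScript, assuming a semicolon never appears inside a value. The name part is put in a second group so it can be written back, and \s+ is used instead of \s* so that only real whitespace gets replaced. All variable names are illustrative.

```javascript
const line = "pd=2.0*(84e-9+(1.0*nf)*40e-9) nf=ng m=1 par=(1) par_nf=(1) * (ng) plorient=0";

// Replace the whitespace in front of every "name=" with ";" and keep the name.
const separated = line.replace(/(\s+)(\w*[a-zA-Z_]=)/g, ";$2");

// Now it is safe to split on ";" and then on the first "=" of each token.
const pairs = separated.split(";").map(token => {
  const eq = token.indexOf("=");
  return [token.slice(0, eq), token.slice(eq + 1)];
});
console.log(pairs);
```

Spaces inside a value, as in par_nf=(1) * (ng), are not followed by name= and therefore survive.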
You need to add Perl tag. :-( Maybe this will help:
I ended up using this in C#. The idea was to break the line into name/value pairs, using a negative lookahead on the next key to stop a match and start a new one. I hope this helps.
var data = @"pd=2.0*(84e-9+(1.0*nf)*40e-9) nf=ng m=1 par=(1) par_nf=(1) * (ng) plorient=0";
var pattern = @"
(?<Key>[a-zA-Z_\s\d]+) # Key is any alpha, digit and _
= # = is a hard anchor
(?<Value>[.*+\-\\\/()\w\s]+) # Value is any combinations of text with space(s)
(\s|$) # Soft anchor of either a \s or EOB
((?!\s[a-zA-Z_\d\s]+\=)|$) # Negative lookahead to stop matching if a space then key then equal found or EOB
";
Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture)
.OfType<Match>()
.Select(mt => new
{
LHS = mt.Groups["Key"].Value,
RHS = mt.Groups["Value"].Value
});
Results: