Regular expression to extract words inside nested parentheses

I'm looking for a regexp that can do the following.
Message body input: Test1 (Test2) (test3) (ti,ab(text(text here(possible text)text(possible text(more text))))) end (text)
Desired result: (text(text here(possible text)text(possible text(more text))))
I want to collect everything that is inside ti,ab(................)
var messageBody = message.getPlainBody()
var ssFile = DriveApp.getFileById(id);
DriveApp.getFolderById(folder.getId()).addFile(ssFile);
var ss = SpreadsheetApp.open(ssFile);
var sheet = ss.getSheets()[0];
sheet.insertColumnAfter(sheet.getLastColumn());
SpreadsheetApp.flush();
var sheet = ss.getSheets()[0];
var range = sheet.getRange(1, 1, sheet.getLastRow(), sheet.getLastColumn() + 1)
var values = range.getValues();
values[0][sheet.getLastColumn()] = "Search Strategy";
for (var i = 1; i < values.length; i++) {
// here is my regexp
var y = messageBody.match(/\((ti,ab.*)\)/ig);
// only write the cell when something matched (match() returns null otherwise)
if (y) {
values[i][values[i].length - 1] = y.toString();
}
}
range.setValues(values);

The only solution you may use here is to extract all substrings inside parentheses and then filter them to get all those that start with ti,ab:
var a = [], r = [], result;
var txt = "Test1 (Test2) (test3) (ti,ab(text(text here(possible text)text(possible text(more text))))) end (text)";
for(var i=0; i < txt.length; i++){
if(txt.charAt(i) == '(') {
a.push(i);
}
if(txt.charAt(i) == ')') {
r.push(txt.substring(a.pop()+1,i));
}
}
result = r.filter(function(x) { return /^ti,ab\(/.test(x); })
.map(function(y) {return y.substring(6,y.length-1);})
console.log(result);
The nested parentheses function is borrowed from Nested parentheses get string one by one. The /^ti,ab\(/ regex matches ti,ab( at the start of the string.
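If only the single group that follows ti,ab is needed, a depth counter started at that prefix may be enough. The following is a minimal sketch under that assumption; extractAfterPrefix is a hypothetical helper name, not part of the answer above.

```javascript
// Hypothetical helper: returns the balanced "(...)" group that directly
// follows `prefix`, or null if the prefix is missing or the group is unbalanced.
function extractAfterPrefix(text, prefix) {
  var start = text.indexOf(prefix + "(");
  if (start === -1) return null;
  var open = start + prefix.length; // index of the opening "("
  var depth = 0;
  for (var i = open; i < text.length; i++) {
    var ch = text.charAt(i);
    if (ch === "(") {
      depth++;
    } else if (ch === ")") {
      depth--;
      if (depth === 0) return text.substring(open, i + 1);
    }
  }
  return null; // ran out of text before the group was closed
}

var input = "Test1 (Test2) (test3) (ti,ab(text(text here(possible text)text(possible text(more text))))) end (text)";
console.log(extractAfterPrefix(input, "ti,ab"));
// prints "(text(text here(possible text)text(possible text(more text))))"
```

This returns only the first occurrence of the prefix; the stack-based loop above is still the better fit when every nested group is wanted.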
The above solution allows extracting nested parentheses inside nested parentheses. If you do not need it, use
var txt = "Test1 (Test2) ((ti,ab(text(text here))) AND ab(test3) Near Ti(test4) NOT ti,ab,su(test5) NOT su(Test6))";
var start=0, r = [], level=0;
for (var j = 0; j < txt.length; j++) {
if (txt.charAt(j) == '(') {
if (level === 0) start=j;
++level;
}
if (txt.charAt(j) == ')') {
if (level > 0) {
--level;
}
if (level === 0) {
r.push(txt.substring(start, j+1));
}
}
}
console.log("r: ", r);
var rx = "\\b(?:ti|ab|su)(?:,(ti|ab|su))*\\(";
var result = r.filter(function(y) { return new RegExp(rx, "i").test(y); })
.map(function(x) {
return x.replace(new RegExp(rx, "ig"), '(')
});
console.log("Result:",result);
The pattern used to filter the groups and strip the unnecessary prefixes is
\b(?:ti|ab|su)(?:,(ti|ab|su))*\(
Details
\b - a word boundary
(?:ti|ab|su) - one of the three alternatives
(?:,(ti|ab|su))* - zero or more repetitions of a comma followed by one of the three alternatives
\( - a literal (.
The match is replaced with ( so that the opening parenthesis stays in the result.
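To see the replacement step in isolation, here is a small sketch that applies the pattern to the second sample string. It is written as a regex literal with non-capturing groups throughout, which is an assumption of this sketch and should not change what is matched.

```javascript
// Same pattern as above, as a literal with the "g" and "i" flags.
var prefixRx = /\b(?:ti|ab|su)(?:,(?:ti|ab|su))*\(/gi;

var sample = "((ti,ab(text(text here))) AND ab(test3) Near Ti(test4) NOT ti,ab,su(test5) NOT su(Test6))";

// Each "field(" prefix collapses to a bare "(".
var stripped = sample.replace(prefixRx, "(");
console.log(stripped);
// prints "(((text(text here))) AND (test3) Near (test4) NOT (test5) NOT (Test6))"
```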

Related

How to get integer with regular expression in kotlin?

ViewModel
fun changeQty(textField: TextFieldValue) {
val temp1 = textField.text
Timber.d("textField: $temp1")
val temp2 = temp1.replace("[^\\d]".toRegex(), "")
Timber.d("temp2: $temp2")
_qty.value = textField.copy(temp2)
}
TextField
OutlinedTextField(
modifier = Modifier
.focusRequester(focusRequester = focusRequester)
.onFocusChanged {
if (it.isFocused) {
keyboardController?.show()
}
},
value = qty.copy(
text = qty.text.trim()
),
onValueChange = changeQty,
label = { Text(text = qtyHint) },
singleLine = true,
keyboardOptions = KeyboardOptions(
keyboardType = KeyboardType.Number,
imeAction = ImeAction.Done
),
keyboardActions = KeyboardActions(
onDone = {
save()
onDismiss()
}
)
)
With KeyboardType.Number set, the keyboard shows the digits plus , . - and space.
I only want to accept an integer such as -10, 10, or 0.
But when I type , or . or - (other than the leading sign), it is shown as typed.
Example:
typing = -10---------
hope = -10
display = -10---------
I put a regular expression in
val temp2 = temp1.replace("[^\\d]".toRegex(), "")
But it doesn't seem to work.
How can I get only an integer (including negative integers)?
Use the regex (?<=(\d|-))(\D+) to remove every run of non-digit characters that follows a digit or a -, which leaves a leading - intact.
fun getIntegersFromString(input: String): String {
val pattern = Regex("(?<=(\\d|-))(\\D+)")
val formatted = pattern.replace(input, "")
return formatted
}
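The lookbehind above only removes non-digits that follow a digit or a -, so input such as abc10 is left unchanged. A stricter rule is to remember whether the input starts with a minus sign and then drop everything that is not a digit. The sketch below is written in JavaScript rather than Kotlin, and sanitizeInteger is a hypothetical name; the same two steps should translate directly to Kotlin's Regex API.

```javascript
// Hypothetical helper: keeps an optional leading "-" and all digits,
// discards every other character.
function sanitizeInteger(input) {
  var negative = /^\s*-/.test(input);      // sign only counts at the front
  var digits = input.replace(/\D/g, "");   // strip everything except 0-9
  return (negative ? "-" : "") + digits;
}

console.log(sanitizeInteger("-10---------")); // prints "-10"
console.log(sanitizeInteger("1,0.5"));        // prints "105"
console.log(sanitizeInteger("abc10"));        // prints "10"
```

A lone "-" is returned as "-", which lets the user keep typing a negative number.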

Find and change cyrillic word with boundary in google scripts

The problem is that \b doesn't work with Russian and Ukrainian letters.
Here I try to find all occurrences of the word 'февраля' in the text, change them to tempword, make that a link, and then change it back to 'февраля'.
function addLinks(word, siteurl) {
var id = 'doc\'s ID';
var doc = DocumentApp.openById(id);
var body = doc.getBody();
var tempword = 'ASDFDSGDDKDSL2';
var searchText = "\\b"+word+"\\b";
var element = body.findText(searchText);
console.log(element);
while (element) {
var start = element.getStartOffset();
var text = element.getElement().asText();
text.replaceText(searchText, tempword);
text.setLinkUrl(start, start + tempword.length - 1, siteurl);
element = body.findText(searchText);
}
body.replaceText(tempword, word);
}
addLinks('февраля', 'example.com');
It works as it should if I replace the Russian word 'февраля' with the English 'february'.
addLinks('february', 'example.com');
I need a regular expression because if I simply search for 'февраля', the script will also match other words such as 'февралям', 'февралями', etc.
So the question is how to make it work.
The error "Exception: Invalid regular expression pattern" occurs with this code:
var searchText = "(?<=[\\s,.:;\"']|^)"+word+"(?=[\\s,.:;\"']|$)";
or this:
var searchText = "(^|\s)"+word+"(?=\s|$)";
and with some other variants.
Here is my solution:
function main() {
addLinks('февраля', 'example.com');
}
function addLinks(word, url) {
var doc = DocumentApp.getActiveDocument();
var pgfs = doc.getParagraphs();
var bound = '[^А-яЁё]'; // any character except a Russian letter
var patterns = [
{regex: bound + word + bound, start: 1, end: 1}, // word inside of line
{regex: '^' + word + bound, start: 0, end: 1}, // word at the start
{regex: bound + word + '$', start: 1, end: 0}, // word at the end
{regex: '^' + word + '$', start: 0, end: 0} // word = line
];
for (var pgf of pgfs) for (var pattern of patterns) {
var location = pgf.findText(pattern.regex);
while (location) {
var start = location.getStartOffset() + pattern.start;
var end = location.getEndOffsetInclusive() - pattern.end;
pgf.editAsText().setLinkUrl(start, end, url);
location = pgf.findText(pattern.regex, location);
}
}
}
It handles the word correctly when it appears at the start or at the end of the line (or both), and it does not raise that odd error message.
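The idea of replacing \b with an explicit "not a Russian letter" class can also be tried outside of Docs. The sketch below assumes a plain JavaScript RegExp (where lookahead is available) rather than findText, which rejected the lookaround patterns in the question; findWholeWord is a hypothetical name, and the word is assumed to contain no regex metacharacters.

```javascript
// Hypothetical helper: returns the start index of every whole-word
// occurrence of `word`, treating Russian letters as word characters.
function findWholeWord(text, word) {
  var notLetter = "[^А-Яа-яЁё]";
  var rx = new RegExp("(^|" + notLetter + ")" + word + "(?=" + notLetter + "|$)", "g");
  var positions = [];
  var m;
  while ((m = rx.exec(text)) !== null) {
    // m[1] is the boundary character (or empty at the start of the text)
    positions.push(m.index + m[1].length);
  }
  return positions;
}

console.log(findWholeWord("1 февраля, февралям и февраля", "февраля"));
// prints [ 2, 22 ]
```

The character class covers Russian only; Ukrainian letters such as і or ї would have to be added to it.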

How to select first chars with a custom word boundary?

I have test cases with a series of words like this:
{
input: "Halley's Comet",
expected: "HC",
},
{
input: "First In, First Out",
expected: "FIFO",
},
{
input: "The Road _Not_ Taken",
expected: "TRNT",
},
I want a single regex that matches the first letter of each of these words, never matches "_" as a first letter, and treats a single quote as part of the word.
Currently I have this regex, which works with PCRE syntax but not with Go's regexp package: (?<![a-zA-Z0-9'])([a-zA-Z0-9'])
I know lookarounds aren't supported by Go, but I'm looking for a good way to do this.
I also use this func to get a slice of all matched strings: re.FindAllString(s, -1)
Thanks for helping.
Something that plays with character classes and word boundaries should suffice:
\b_*([a-z])[a-z]*(?:'s)?_*\b\W*
Usage:
package main
import (
"fmt"
"regexp"
)
func main() {
re := regexp.MustCompile(`(?i)\b_*([a-z])[a-z]*(?:'s)?_*\b\W*`)
fmt.Println(re.ReplaceAllString("O'Brian's dog", "$1"))
}
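Another way to avoid lookbehind is to match whole words and then take the first character of each match. The sketch below shows this in JavaScript; acronym is a hypothetical name, and the word pattern (a letter or digit followed by letters, digits, or apostrophes) is an assumption based on the test cases in the question. Since it needs no lookaround, the same pattern should also work with Go's re.FindAllString.

```javascript
// Hypothetical helper: builds an acronym from the first character of each word.
// A word starts with a letter or digit and may contain apostrophes;
// "_" is not part of a word, so "_Not_" yields "Not".
function acronym(s) {
  var words = s.match(/[A-Za-z0-9][A-Za-z0-9']*/g) || [];
  return words.map(function (w) { return w.charAt(0); }).join("");
}

console.log(acronym("Halley's Comet"));       // prints "HC"
console.log(acronym("First In, First Out"));  // prints "FIFO"
console.log(acronym("The Road _Not_ Taken")); // prints "TRNT"
```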
For the record, a regexp-free solution:
package main
import (
"fmt"
)
func main() {
inputs := []string{"Hallمرحباey's Comet", "First In, First Out", "The Road _Not_ Taken", "O'Brian's Dog"}
c := [][]string{}
w := [][]string{}
for _, input := range inputs {
c = append(c, firstLet(input))
w = append(w, words(input))
}
fmt.Printf("%#v\n", w)
fmt.Printf("%#v\n", c)
}
func firstLet(in string) (out []string) {
var inword bool
for _, r := range in {
if !inword {
if isChar(r) {
inword = true
out = append(out, string(r))
}
} else if r == ' ' {
inword = false
}
}
return out
}
func words(in string) (out []string) {
var inword bool
var w []rune
for _, r := range in {
if !inword {
if isChar(r) {
w = append(w, r)
inword = true
}
} else if r == ' ' {
if len(w) > 0 {
out = append(out, string(w))
w = w[:0]
}
inword = false
} else if r != '_' {
w = append(w, r)
}
}
if len(w) > 0 {
out = append(out, string(w))
}
return out
}
func isChar(r rune) bool {
return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
}
Output:
[][]string{[]string{"Hallمرحباey's", "Comet"}, []string{"First", "In,", "First", "Out"}, []string{"The", "Road", "Not", "Taken"}, []string{"O'Brian's", "Dog"}}
[][]string{[]string{"H", "C"}, []string{"F", "I", "F", "O"}, []string{"T", "R", "N", "T"}, []string{"O", "D"}}

Replacement matching regex with anchor tag?

I have a problem with a regex. I have an HTML document in which I want to create an anchor link wherever the text matches a pattern.
Example HTML:
Căn cứ Luật Tổ chức HĐND và UBND ngày 26/11/2003;
Căn cứ Nghị định số 63/2010/NĐ-CP ngày 08/6/2010 của Chính phủ về
kiểm soát thủ tục hành chính;
Căn cứ Quyết định số 165/2011/QĐ-UBND ngày 06/5/2011 của UBND tỉnh
ban hành Quy định kiểm soát thủ tục hành chính trên địa bàn tỉnh;
Căn cứ Quyết định số 278/2011/QĐ-UBND ngày 02/8/2011 của UBND tỉnh
ban hành Quy chế phối hợp thực hiện thống kê, công bố, công khai thủ
tục hành chính và tiếp nhận, xử lý phản ánh, kiến nghị của cá nhân, tổ
chức về quy định hành chính trên địa bàn tỉnh;
Xét đề nghị của Giám đốc Sở Công Thương tại Tờ trình số
304/TTr-SCT ngày 29 tháng 5 năm 2013
I want to match the bold text (the document numbers) and turn each one into an anchor link. If a match is already inside a link, it should be ignored. Example of a match: 63/2010/NĐ-CP
var matchLegals = new Regex(@"(?:[\d]+\/?)\d+\/[a-z\dA-Z_ÀÁÂÃÈÉÊÌÍÒÓÔÕÙÚĂĐĨŨƠàáâãèéêìíòóôõùúăđĩũơƯĂẠẢẤẦẨẪẬẮẰẲẴẶẸẺẼỀỀỂưăạảấầẩẫậắằẳẵặẹẻẽềềểỄỆỈỊỌỎỐỒỔỖỘỚỜỞỠỢỤỦỨỪễệỉịọỏốồổỗộớờởỡợụủứừỬỮỰỲỴÝỶỸửữựỳỵỷỹ\-]+", RegexOptions.Compiled);
var doc = new HtmlDocument();
doc.LoadHtml(htmlString);
var allElements = doc.DocumentNode.SelectSingleNode("//div[@class='main-content']").Descendants();
foreach (var node in allElements)
{
var matches = matchLegals.Matches(node.InnerHtml);
foreach (Match m in matches)
{
var k = m.Value;
//dont know what to do
}
}
How can I do this?
Many thanks.
I assume your regex pattern is OK and works. Another assumption is that node.InnerHtml doesn't contain any <a> tags already encompassing any of the potential matches.
In this case, it's as simple as doing something like this:
node.InnerHtml = Regex.Replace(node.InnerHtml, "[your pattern here]", "<a href='query=$&'>$&</a>");
...
doc.Save("output.html");
Note, that you may need to work on the href component - I'm unsure how your link should be built.
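The same single-pass idea, where every match is wrapped as it is found, can be sketched in JavaScript with a replacement callback. The pattern here is a deliberately simplified assumption (digits, an optional four-digit year, then letters and hyphens from a broad Latin range), not the asker's full character class; linkify is a hypothetical name, and the /search?q= path is borrowed from the asker's final code.

```javascript
// Hypothetical helper: wraps each document number in an anchor tag.
// Simplified pattern: "63/2010/NĐ-CP" or "304/TTr-SCT" style numbers.
function linkify(text) {
  var rx = /\d+(?:\/\d{4})?\/[A-Za-z\u00C0-\u1EF9-]+/g;
  return text.replace(rx, function (match) {
    // Encode the match for the query string, keep it readable as link text.
    return '<a href="/search?q=' + encodeURIComponent(match) + '">' + match + '</a>';
  });
}

console.log(linkify("Nghị định số 63/2010/NĐ-CP ngày 08/6/2010"));
```

Because the whole string is processed in one replace call, a match is never wrapped twice, and plain dates such as 08/6/2010 are left alone. Text that already sits inside an a element still has to be skipped separately.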
You can match the text and replace it:
<script>
var s = '...';
var matchs = s.match(/\d{2,3}\/\d{4}\/[a-zA-Z\-áàảãạăâắằấầặẵẫậéèẻẽẹêếềểễệóòỏõọôốồổỗộơớờởỡợíìỉĩịđùúủũụưứửữựÀÁÂÃÈÉÊÌÍÒÓÔÕÙÚĂĐĨŨƠƯĂẠẢẤẦẨẪẬẮẰẲẴẶẸẺẼÊỀỂỄỆỈỊỌỎỐỒỔỖỘỚỜỞỠỢỤỨỪỬỮỰỲỴÝỶỸửữựỵỷỹ]+/gi);
if (matchs != null) {
for(var i=0; i<matchs.length;i++){
var val = matchs[i];
s = s.replace(val, '<a href="?key=' + val + '">' + val + '</a>');
}
}
document.write(s);
</script>
@Shaamaan, thanks for your advice. After a few hours of coding, it works now:
var content = doc.DocumentNode.SelectSingleNode("//div[@class='main-content']");
var items = content.SelectNodes(".//text()[normalize-space(.) != '']");
foreach (HtmlNode node in items)
{
if (!matchLegals.IsMatch(node.InnerText) || node.ParentNode.Name == "a")
{
continue;
}
var texts = node.InnerHtml.Trim();
node.InnerHtml = matchLegals.Replace(texts, a => string.Format("<a href='/search?q={0}'>{0}</a>",a.Value));
}

Insert commas into number string

Hey there, I'm trying to perform a backwards regular expression search on a string to divide it into groups of 3 digits. As far as I can see from the AS3 documentation, searching backwards is not possible in the regex engine.
The point of this exercise is to insert triplet commas into a number like so:
10000000 => 10,000,000
I'm thinking of doing it like so:
string.replace(/(\d{3})/g, ",$1")
But this is not correct because the search does not happen from the back, and I believe the $1 in the replacement will only work for the first match.
I'm getting the feeling I would be better off performing this task with a loop.
UPDATE:
Due to AS3 not supporting lookahead this is how I have solved it.
public static function formatNumber(number:Number):String
{
var numString:String = number.toString()
var result:String = ''
while (numString.length > 3)
{
var chunk:String = numString.substr(-3)
numString = numString.substr(0, numString.length - 3)
result = ',' + chunk + result
}
if (numString.length > 0)
{
result = numString + result
}
return result
}
If your language supports positive lookahead assertions, then I think the following regex will work:
(\d)(?=(\d{3})+$)
Demonstrated in Java:
import static org.junit.Assert.assertEquals;
import org.junit.Test;
public class CommifyTest {
@Test
public void testCommify() {
String num0 = "1";
String num1 = "123456";
String num2 = "1234567";
String num3 = "12345678";
String num4 = "123456789";
String regex = "(\\d)(?=(\\d{3})+$)";
assertEquals("1", num0.replaceAll(regex, "$1,"));
assertEquals("123,456", num1.replaceAll(regex, "$1,"));
assertEquals("1,234,567", num2.replaceAll(regex, "$1,"));
assertEquals("12,345,678", num3.replaceAll(regex, "$1,"));
assertEquals("123,456,789", num4.replaceAll(regex, "$1,"));
}
}
Found on http://gskinner.com/RegExr/
Community > Thousands separator
Pattern: /\d{1,3}(?=(\d{3})+(?!\d))/g
Replace: $&,
trace ( String("1000000000").replace( /\d{1,3}(?=(\d{3})+(?!\d))/g , "$&,") );
It did the job!
If your regex engine has positive lookaheads, you could do something like this:
string.replace(/(\d)(?=(\d\d\d)+$)/g, "$1,")
Where the positive lookahead (?=...) means that the regex only matches when the lookahead expression ... matches.
(Note that lookaround-expressions are not always very efficient.)
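A runnable version of the lookahead approach, as a sketch: commify is a hypothetical name, and the input is assumed to be a string of digits only, because the $ anchor expects the digits to run to the end of the string. The g flag matters here; without it only the first position would get a comma.

```javascript
// Hypothetical helper: inserts a comma after every digit that is followed
// by a multiple of three digits up to the end of the string.
function commify(digits) {
  return digits.replace(/(\d)(?=(\d{3})+$)/g, "$1,");
}

console.log(commify("10000000")); // prints "10,000,000"
console.log(commify("123456"));   // prints "123,456"
console.log(commify("1"));        // prints "1"
```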
Many of these answers work fine with positive integers, but several of them take a Number argument, which implies that they can handle negative values and decimals, and in those cases the solutions fail. The currently selected answer does not assume a Number, but I was curious to find a solution that does and that is also more performant than RegExp (which AS3 does not do well).
I put many of the answers here in a testing class (along with a solution from this blog and an answer of my own called commaify) and formatted them consistently for easy comparison:
package
{
public class CommaNumberSolutions
{
public static function commaify( input:Number ):String
{
var split:Array = input.toString().split( '.' ),
front:String = split[0],
back:String = ( split.length > 1 ) ? "." + split[1] : null,
n:int = input < 0 ? 2 : 1,
commas:int = Math.floor( (front.length - n) / 3 ),
i:int = 1;
for ( ; i <= commas; i++ )
{
n = front.length - (3 * i + i - 1);
front = front.slice( 0, n ) + "," + front.slice( n );
}
if ( back )
return front + back;
else
return front;
}
public static function getCommaString( input:Number ):String
{
var s:String = input.toString();
if ( s.length <= 3 )
return s;
var i:int = s.length % 3;
if ( i == 0 )
i = 3;
for ( ; i < s.length; i += 4 )
{
var part1:String = s.substr(0, i);
var part2:String = s.substr(i, s.length);
s = part1.concat(",", part2);
}
return s;
}
public static function formatNumber( input:Number ):String
{
var s:String = input.toString()
var result:String = ''
while ( s.length > 3 )
{
var chunk:String = s.substr(-3)
s = s.substr(0, s.length - 3)
result = ',' + chunk + result
}
if ( s.length > 0 )
result = s + result
return result
}
public static function commaCoder( input:Number ):String
{
var s:String = "";
var len:Number = input.toString().length;
for ( var i:int = 0; i < len; i++ )
{
if ( (len-i) % 3 == 0 && i != 0)
s += ",";
s += input.toString().charAt(i);
}
return s;
}
public static function regex1( input:Number ):String
{
return input.toString().replace( /-{0,1}(\d)(?=(\d\d\d)+$)/g, "$1," );
}
public static function regex2( input:Number ):String
{
return input.toString().replace( /-{0,1}\d{1,3}(?=(\d{3})+(?!\d))/g , "$&,")
}
public static function addCommas( input:Number ):String
{
var negative:String = "";
if ( input < 0 )
{
negative = "-";
input = Math.abs(input);
}
var s:String = input.toString();
var results:Array = s.split(/\./);
s = results[0];
if ( s.length > 3 )
{
var mod:Number = s.length % 3;
var output:String = s.substr(0, mod);
for ( var i:Number = mod; i < s.length; i += 3 )
{
output += ((mod == 0 && i == 0) ? "" : ",") + s.substr(i, 3);
}
if ( results.length > 1 )
{
if ( results[1].length == 1 )
return negative + output + "." + results[1] + "0";
else
return negative + output + "." + results[1];
}
else
return negative + output;
}
if ( results.length > 1 )
{
if ( results[1].length == 1 )
return negative + s + "." + results[1] + "0";
else
return negative + s + "." + results[1];
}
else
return negative + s;
}
}
}
Then I tested each for accuracy and performance:
package
{
public class TestCommaNumberSolutions
{
private var functions:Array;
function TestCommaNumberSolutions()
{
functions = [
{ name: "commaify()", f: CommaNumberSolutions.commaify },
{ name: "addCommas()", f: CommaNumberSolutions.addCommas },
{ name: "getCommaString()", f: CommaNumberSolutions.getCommaString },
{ name: "formatNumber()", f: CommaNumberSolutions.formatNumber },
{ name: "regex1()", f: CommaNumberSolutions.regex1 },
{ name: "regex2()", f: CommaNumberSolutions.regex2 },
{ name: "commaCoder()", f: CommaNumberSolutions.commaCoder }
];
verify();
measure();
}
protected function verify():void
{
var assertions:Array = [
{ input: 1, output: "1" },
{ input: 21, output: "21" },
{ input: 321, output: "321" },
{ input: 4321, output: "4,321" },
{ input: 54321, output: "54,321" },
{ input: 654321, output: "654,321" },
{ input: 7654321, output: "7,654,321" },
{ input: 987654321, output: "987,654,321" },
{ input: 1987654321, output: "1,987,654,321" },
{ input: 21987654321, output: "21,987,654,321" },
{ input: 321987654321, output: "321,987,654,321" },
{ input: 4321987654321, output: "4,321,987,654,321" },
{ input: 54321987654321, output: "54,321,987,654,321" },
{ input: 654321987654321, output: "654,321,987,654,321" },
{ input: 7654321987654321, output: "7,654,321,987,654,321" },
{ input: 87654321987654321, output: "87,654,321,987,654,321" },
{ input: -1, output: "-1" },
{ input: -21, output: "-21" },
{ input: -321, output: "-321" },
{ input: -4321, output: "-4,321" },
{ input: -54321, output: "-54,321" },
{ input: -654321, output: "-654,321" },
{ input: -7654321, output: "-7,654,321" },
{ input: -987654321, output: "-987,654,321" },
{ input: -1987654321, output: "-1,987,654,321" },
{ input: -21987654321, output: "-21,987,654,321" },
{ input: -321987654321, output: "-321,987,654,321" },
{ input: -4321987654321, output: "-4,321,987,654,321" },
{ input: -54321987654321, output: "-54,321,987,654,321" },
{ input: -654321987654321, output: "-654,321,987,654,321" },
{ input: -7654321987654321, output: "-7,654,321,987,654,321" },
{ input: -87654321987654321, output: "-87,654,321,987,654,321" },
{ input: .012345, output: "0.012345" },
{ input: 1.012345, output: "1.012345" },
{ input: 21.012345, output: "21.012345" },
{ input: 321.012345, output: "321.012345" },
{ input: 4321.012345, output: "4,321.012345" },
{ input: 54321.012345, output: "54,321.012345" },
{ input: 654321.012345, output: "654,321.012345" },
{ input: 7654321.012345, output: "7,654,321.012345" },
{ input: 987654321.012345, output: "987,654,321.012345" },
{ input: 1987654321.012345, output: "1,987,654,321.012345" },
{ input: 21987654321.012345, output: "21,987,654,321.012345" },
{ input: -.012345, output: "-0.012345" },
{ input: -1.012345, output: "-1.012345" },
{ input: -21.012345, output: "-21.012345" },
{ input: -321.012345, output: "-321.012345" },
{ input: -4321.012345, output: "-4,321.012345" },
{ input: -54321.012345, output: "-54,321.012345" },
{ input: -654321.012345, output: "-654,321.012345" },
{ input: -7654321.012345, output: "-7,654,321.012345" },
{ input: -987654321.012345, output: "-987,654,321.012345" },
{ input: -1987654321.012345, output: "-1,987,654,321.012345" },
{ input: -21987654321.012345, output: "-21,987,654,321.012345" }
];
var i:int;
var len:int = assertions.length;
var assertion:Object;
var f:Function;
var s1:String;
var s2:String;
for each ( var o:Object in functions )
{
i = 0;
f = o.f;
trace( '\rVerify: ' + o.name );
for ( ; i < len; i++ )
{
assertion = assertions[ i ];
s1 = f.apply( null, [ assertion.input ] );
s2 = assertion.output;
if ( s1 !== s2 )
trace( 'Test #' + i + ' Failed: ' + s1 + ' !== ' + s2 );
}
}
}
protected function measure():void
{
// Generate random inputs
var values:Array = [];
for ( var i:int = 0; i < 999999; i++ ) {
values.push( Math.random() * int.MAX_VALUE * ( Math.random() > .5 ? -1 : 1) );
}
var len:int = values.length;
var stopwatch:Stopwatch = new Stopwatch;
var s:String;
var f:Function;
trace( '\rTesting ' + len + ' random values' );
// Test each function
for each ( var o:Object in functions )
{
i = 0;
s = "";
f = o.f;
stopwatch.start();
for ( ; i < len; i++ ) {
s += f.apply( null, [ values[i] ] ) + " ";
}
stopwatch.stop();
trace( o.name + '\t\ttook ' + (stopwatch.elapsed/1000) + 's' ); //(stopwatch.elapsed/len) + 'ms'
}
}
}
}
import flash.utils.getTimer;
class Stopwatch
{
protected var startStamp:int;
protected var stopStamp:int;
protected var _started:Boolean;
protected var _stopped:Boolean;
function Stopwatch( startNow:Boolean = true ):void
{
if ( startNow )
start();
}
public function start():void
{
startStamp = getTimer();
_started = true;
_stopped = false;
}
public function stop():void
{
stopStamp = getTimer();
_stopped = true;
_started = false;
}
public function get elapsed():int
{
return ( _stopped ) ? stopStamp - startStamp : ( _started ) ? getTimer() - startStamp : 0;
}
public function get started():Boolean
{
return _started;
}
public function get stopped():Boolean
{
return _stopped;
}
}
Because AS3's Number loses precision with large values, every function failed these tests:
Test #15 Failed: 87,654,321,987,654,320 !== 87,654,321,987,654,321
Test #31 Failed: -87,654,321,987,654,320 !== -87,654,321,987,654,321
Test #42 Failed: 21,987,654,321.012344 !== 21,987,654,321.012345
Test #53 Failed: -21,987,654,321.012344 !== -21,987,654,321.012345
But only two functions passed all of the other tests: commaify() and addCommas().
The performance tests show that commaify() is the most performant of all the solutions:
Testing 999999 random values
commaify() took 12.411s
addCommas() took 17.863s
getCommaString() took 18.519s
formatNumber() took 14.409s
regex1() took 40.654s
regex2() took 36.985s
Additionally, commaify() can be extended to take arguments for decimal length and zero-padding of the decimal portion; the extended version also outperforms the others at 13.128s:
public static function cappedDecimal( input:Number, decimalPlaces:int = 2 ):Number
{
if ( decimalPlaces == 0 )
return Math.floor( input );
var decimalFactor:Number = Math.pow( 10, decimalPlaces );
return Math.floor( input * decimalFactor ) / decimalFactor;
}
public static function cappedDecimalString( input:Number, decimalPlaces:int = 2, padZeros:Boolean = true ):String
{
if ( padZeros )
return cappedDecimal( input, decimalPlaces ).toFixed( decimalPlaces );
else
return cappedDecimal( input, decimalPlaces ).toString();
}
public static function commaifyExtended( input:Number, decimalPlaces:int = 2, padZeros:Boolean = true ):String
{
var split:Array = cappedDecimalString( input, decimalPlaces, padZeros ).split( '.' ),
front:String = split[0],
back:String = ( split.length > 1 ) ? "." + split[1] : null,
n:int = input < 0 ? 2 : 1,
commas:int = Math.floor( (front.length - n) / 3 ),
i:int = 1;
for ( ; i <= commas; i++ )
{
n = front.length - (3 * i + i - 1);
front = front.slice( 0, n ) + "," + front.slice( n );
}
if ( back )
return front + back;
else
return front;
}
So, I'd offer that commaify() meets the demands of versatility and performance, though it is certainly not the most compact or elegant.
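The core idea behind handling signs and decimals without a regex (split off the sign and the fraction, then group the integer digits from the right) can be sketched in JavaScript as follows. This is not the benchmarked AS3 code; groupThousands is a hypothetical name, and the sketch assumes the number stringifies in plain decimal notation (no exponent).

```javascript
// Hypothetical helper: formats a number with thousands separators,
// keeping a leading "-" and any fractional part untouched.
function groupThousands(input) {
  var parts = String(input).split(".");
  var front = parts[0];
  var back = parts.length > 1 ? "." + parts[1] : "";
  var sign = front.charAt(0) === "-" ? "-" : "";
  var digits = sign ? front.slice(1) : front;
  var out = "";
  // Walk the integer digits from the right in chunks of three.
  for (var i = digits.length; i > 0; i -= 3) {
    var chunk = digits.slice(Math.max(0, i - 3), i);
    out = out ? chunk + "," + out : chunk;
  }
  return sign + out + back;
}

console.log(groupThousands(-4321.012345)); // prints "-4,321.012345"
console.log(groupThousands(1987654321));   // prints "1,987,654,321"
console.log(groupThousands(321));          // prints "321"
```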
This really isn't the best use of RegEx... I'm not aware of a number formatting function, but this thread seems to provide a solution.
function commaCoder(yourNum):String {
//var yourNum:Number = new Number();
var numtoString:String = new String();
var numLength:Number = yourNum.toString().length;
numtoString = "";
for (var i:int = 0; i < numLength; i++) {
if ((numLength-i)%3 == 0 && i != 0) {
numtoString += ",";
}
numtoString += yourNum.toString().charAt(i);
trace(numtoString);
}
return numtoString;
}
If you really are insistent on using RegEx, you could just reverse the string, apply the RegEx replace function, then reverse it back.
A sexeger is good for this. In brief, a sexeger is a reversed regex run against a reversed string that you reverse the output of. It is generally more efficient than the alternative. Here is some pseudocode for what you want to do:
string = reverse string
string.replace(/(\d{3})(?!$)/g, "$1,")
string = reverse string
Here is a Perl implementation:
#!/usr/bin/perl
use strict;
use warnings;
my $s = 13_456_789;
for my $n (1, 12, 123, 1234, 12345, 123456, 1234567) {
my $s = reverse $n;
$s =~ s/([0-9]{3})(?!$)/$1,/g;
$s = reverse $s;
print "$s\n";
}
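The same reverse, replace, reverse sequence can be written in JavaScript as a sketch. The names reverse and commifyReversed are hypothetical, and the input is assumed to be a string of ASCII digits, so reversing by character is safe.

```javascript
// Reverses a string of ASCII characters.
function reverse(s) {
  return s.split("").reverse().join("");
}

// Hypothetical helper: adds a comma after every three digits of the
// reversed string, except at its end, then reverses the result back.
function commifyReversed(digits) {
  return reverse(reverse(digits).replace(/(\d{3})(?!$)/g, "$1,"));
}

console.log(commifyReversed("1234567")); // prints "1,234,567"
console.log(commifyReversed("123456"));  // prints "123,456"
console.log(commifyReversed("123"));     // prints "123"
```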
You may want to consider NumberFormatter
I'll take the downvotes for not being the requested language, but this non-regex technique should apply (and I arrived here via searching for "C# regex to add commas into number")
var raw = "104241824 15202656 KB 13498560 KB 1612672KB already 1,000,000 or 99.999 or 9999.99";
int i = 0;
bool isnum = false;
var formatted = raw.Reverse().Aggregate(new StringBuilder(), (sb, c) => {
//$"{i}: [{c}] {isnum}".Dump();
if (char.IsDigit(c) && c != ' ' && c!= '.' && c != ',') {
if (isnum) {
if (i == 3) {
//$"ins ,".Dump();
sb.Insert(0, ',');
i = 0;
}
}
else isnum = true;
i++;
}
else {
isnum = false;
i = 0;
}
sb.Insert(0, c);
return sb;
});
results in:
104,241,824 15,202,656 KB 13,498,560 KB 1,612,672KB already 1,000,000 or 99.999 or 9,999.99
// A simple approach that works fine.
import java.util.Scanner;
public class NumberWithCommas {
public static void main(String a[]) {
Scanner sc = new Scanner(System.in);
String num;
System.out.println("\n enter the number:");
num = sc.next();
printNumber(num);
}
public static void printNumber(String ar) {
int len, i = 0, temp = 0;
len = ar.length();
temp = len / 3;
if (len % 3 == 0)
temp = temp - 1;
len = len + temp;
char[] ch = ar.toCharArray();
char[] ch1 = new char[len];
for (int j = 0, k = (ar.length() - 1); j < len; j++)
{
if (i < 3)
{
ch1[j] = ch[k];
i++;
k--;
}
else
{
ch1[j] = ',';
i = 0;
}
}
for (int j = len - 1; j >= 0; j--)
System.out.print(ch1[j]);
System.out.println("");
}
}
If you can't use lookahead on regular expressions, you can use this:
string.replace(/^(.*?,)?(\d{1,3})((?:\d{3})+)$/, "$1$2,$3")
inside a loop until there's nothing to replace.
For example, a perlish solution would look like this:
my $num = '1234567890';
while ($num =~ s/^(.*?,)?(\d{1,3})((?:\d{3})+)$/$1$2,$3/) {}
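The same loop can be sketched in JavaScript: each pass inserts one more comma, and the loop stops when a pass changes nothing. addCommasNoLookahead is a hypothetical name, and the input is assumed to be a string of digits.

```javascript
// Hypothetical helper: repeatedly applies the lookahead-free pattern
// until the string stops changing.
function addCommasNoLookahead(s) {
  var rx = /^(.*?,)?(\d{1,3})((?:\d{3})+)$/;
  var prev;
  do {
    prev = s;
    s = s.replace(rx, "$1$2,$3"); // an unmatched $1 is replaced by ""
  } while (s !== prev);
  return s;
}

console.log(addCommasNoLookahead("1234567890")); // prints "1,234,567,890"
console.log(addCommasNoLookahead("1234"));       // prints "1,234"
```

Each pass works on the rightmost group that has no commas yet, so a number with n comma positions needs n replacing passes plus one final pass that finds nothing to change.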
Perl regexp one-liner:
1 while $VAR{total} =~ s/(.*\d)(\d\d\d)/$1,$2/g;
Try this code. It's simple and performs well.
var reg:RegExp=/\d{1,3}(?=(\d{3})+(?!\d))/g;
var my_num:Number = 48712694;
var my_num_str:String = String(my_num).replace(reg,"$&,");
trace(my_num_str);
Output:
48,712,694