How to select first chars with a custom word boundary?

I have test cases with a series of words like this:
{
input: "Halley's Comet",
expected: "HC",
},
{
input: "First In, First Out",
expected: "FIFO",
},
{
input: "The Road _Not_ Taken",
expected: "TRNT",
},
I want a single regex that matches the first letter of each of these words, never matches "_" as a first letter, and treats a single quote as part of the word.
Currently, I have this regex, which works with PCRE syntax but not with Go's regexp package: (?<![a-zA-Z0-9'])([a-zA-Z0-9'])
I know lookarounds aren't supported by Go, but I'm looking for a good way to do this.
I also use this function to get a slice of all matches: re.FindAllString(s, -1)
Thanks for helping.

Something that plays with character classes and word boundaries should suffice:
\b_*([a-z])[a-z]*(?:'s)?_*\b\W*
Usage:
package main

import (
    "fmt"
    "regexp"
)

func main() {
    re := regexp.MustCompile(`(?i)\b_*([a-z])[a-z]*(?:'s)?_*\b\W*`)
    fmt.Println(re.ReplaceAllString("O'Brian's dog", "$1"))
}
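Since Go's regexp package has no lookbehind, another option is to match whole words and keep only the first character of each. The sketch below assumes a word is a run of the characters from the question's class [a-zA-Z0-9']; the names wordRe and acronym are made up for illustration. Because "_" is outside the class it acts as a separator, and because "'" is inside the class it never starts a new word in the middle of one.

```go
package main

import (
    "fmt"
    "regexp"
    "strings"
)

// wordRe matches one "word": a run of ASCII letters, digits and single quotes.
var wordRe = regexp.MustCompile(`[a-zA-Z0-9']+`)

// acronym joins the first character of every word found in s.
func acronym(s string) string {
    var b strings.Builder
    for _, w := range wordRe.FindAllString(s, -1) {
        // Every character in the class is ASCII, so w[0] is a whole character.
        b.WriteByte(w[0])
    }
    return b.String()
}

func main() {
    inputs := []string{"Halley's Comet", "First In, First Out", "The Road _Not_ Taken"}
    for _, in := range inputs {
        fmt.Println(acronym(in)) // HC, FIFO, TRNT
    }
}
```

Note that a word that begins with a single quote would contribute the quote itself; strip leading quotes first if that matters for your data.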

For the record, a regexp-free solution:
package main

import (
    "fmt"
)

func main() {
    inputs := []string{"Hallمرحباey's Comet", "First In, First Out", "The Road _Not_ Taken", "O'Brian's Dog"}
    c := [][]string{}
    w := [][]string{}
    for _, input := range inputs {
        c = append(c, firstLet(input))
        w = append(w, words(input))
    }
    fmt.Printf("%#v\n", w)
    fmt.Printf("%#v\n", c)
}

func firstLet(in string) (out []string) {
    var inword bool
    for _, r := range in {
        if !inword {
            if isChar(r) {
                inword = true
                out = append(out, string(r))
            }
        } else if r == ' ' {
            inword = false
        }
    }
    return out
}

func words(in string) (out []string) {
    var inword bool
    var w []rune
    for _, r := range in {
        if !inword {
            if isChar(r) {
                w = append(w, r)
                inword = true
            }
        } else if r == ' ' {
            if len(w) > 0 {
                out = append(out, string(w))
                w = w[:0]
            }
            inword = false
        } else if r != '_' {
            w = append(w, r)
        }
    }
    if len(w) > 0 {
        out = append(out, string(w))
    }
    return out
}

func isChar(r rune) bool {
    return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
}
Output:
[][]string{[]string{"Hallمرحباey's", "Comet"}, []string{"First", "In,", "First", "Out"}, []string{"The", "Road", "Not", "Taken"}, []string{"O'Brian's", "Dog"}}
[][]string{[]string{"H", "C"}, []string{"F", "I", "F", "O"}, []string{"T", "R", "N", "T"}, []string{"O", "D"}}

Related

Regular expression to extract Words inside nested parentheses

I'm looking for a regexp that can do the following.
Message body input: Test1 (Test2) (test3) (ti,ab(text(text here(possible text)text(possible text(more text))))) end (text)
Desired result: (text(text here(possible text)text(possible text(more text))))
I want to collect everything that is inside ti,ab(................)
var messageBody = message.getPlainBody()
var ssFile = DriveApp.getFileById(id);
DriveApp.getFolderById(folder.getId()).addFile(ssFile);
var ss = SpreadsheetApp.open(ssFile);
var sheet = ss.getSheets()[0];
sheet.insertColumnAfter(sheet.getLastColumn());
SpreadsheetApp.flush();
var sheet = ss.getSheets()[0];
var range = sheet.getRange(1, 1, sheet.getLastRow(), sheet.getLastColumn() + 1)
var values = range.getValues();
values[0][sheet.getLastColumn()] = "Search Strategy";
for (var i = 1; i < values.length; i++) {
//here my Regexp
var y = messageBody.match(/\((ti,ab.*)\)/ig);
if (y);
values[i][values[i].length - 1] = y.toString();
range.setValues(values);

JavaScript regular expressions cannot match arbitrarily nested parentheses, so the workable approach here is to extract all substrings inside parentheses with a loop and then filter them to keep those that start with ti,ab:
var a = [], r = [], result;
var txt = "Test1 (Test2) (test3) (ti,ab(text(text here(possible text)text(possible text(more text))))) end (text)";
for(var i=0; i < txt.length; i++){
if(txt.charAt(i) == '(') {
a.push(i);
}
if(txt.charAt(i) == ')') {
r.push(txt.substring(a.pop()+1,i));
}
}
result = r.filter(function(x) { return /^ti,ab\(/.test(x); })
.map(function(y) {return y.substring(6,y.length-1);})
console.log(result);
The nested parentheses function is borrowed from Nested parentheses get string one by one. The /^ti,ab\(/ regex matches ti,ab( at the start of the string.
The above solution allows extracting nested parentheses inside nested parentheses. If you do not need it, use
var txt = "Test1 (Test2) ((ti,ab(text(text here))) AND ab(test3) Near Ti(test4) NOT ti,ab,su(test5) NOT su(Test6))";
var start=0, r = [], level=0;
for (var j = 0; j < txt.length; j++) {
if (txt.charAt(j) == '(') {
if (level === 0) start=j;
++level;
}
if (txt.charAt(j) == ')') {
if (level > 0) {
--level;
}
if (level === 0) {
r.push(txt.substring(start, j+1));
}
}
}
console.log("r: ", r);
var rx = "\\b(?:ti|ab|su)(?:,(ti|ab|su))*\\(";
var result = r.filter(function(y) { return new RegExp(rx, "i").test(y); })
.map(function(x) {
return x.replace(new RegExp(rx, "ig"), '(')
});
console.log("Result:",result);
The pattern used to filter the groups and remove the unnecessary field words is
\b(?:ti|ab|su)(?:,(ti|ab|su))*\(
Details
\b - a word boundary
(?:ti|ab|su) - 1 of the alternatives,
(?:,(ti|ab|su))* - 0 or more repetitions of , followed with 1 of the 3 alternatives
\( - a (.
The match is replaced with ( to restore the opening parenthesis that the pattern consumed.

Remove quotes between letters

In Go, how can I remove quotes between two letters, like this:
import (
"testing"
)
func TestRemoveQuotes(t *testing.T) {
var a = "bus\"zipcode"
var mockResult = "bus zipcode"
a = RemoveQuotes(a)
if a != mockResult {
t.Error("Error or TestRemoveQuotes: ", a)
}
}
Function:
import (
"fmt"
"strings"
)
func RemoveQuotes(s string) string {
s = strings.Replace(s, "\"", "", -1) //here I removed all quotes. I'd like to remove only quotes between letters
fmt.Println(s)
return s
}
For example:
"bus"zipcode" = "bus zipcode"

You may use a simple \b"\b regex that matches a double quote only when it has a word character (letter, digit or underscore) on both sides:
package main

import (
    "fmt"
    "regexp"
)

func main() {
    var a = "\"test1\",\"test2\",\"tes\"t3\""
    fmt.Println(RemoveQuotes(a))
}

func RemoveQuotes(s string) string {
    re := regexp.MustCompile(`\b"\b`)
    return re.ReplaceAllString(s, "")
}
This prints "test1","test2","test3".
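The test in the question actually expects a space in place of the inner quote ("bus zipcode"). A small variation, sketched here with the made-up names innerQuoteRe and replaceInnerQuotes, takes the replacement text as a parameter so both behaviours can be tried; it compiles the pattern once at package level instead of on every call.

```go
package main

import (
    "fmt"
    "regexp"
)

// innerQuoteRe matches a double quote that has a word character on each side.
var innerQuoteRe = regexp.MustCompile(`\b"\b`)

// replaceInnerQuotes swaps every inner quote for repl, taken literally
// (a "$" in repl is not treated as a group reference).
func replaceInnerQuotes(s, repl string) string {
    return innerQuoteRe.ReplaceAllLiteralString(s, repl)
}

func main() {
    fmt.Println(replaceInnerQuotes("bus\"zipcode", " "))                  // bus zipcode
    fmt.Println(replaceInnerQuotes("\"test1\",\"test2\",\"tes\"t3\"", "")) // "test1","test2","test3"
}
```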

I am not sure about what you need when you commented I want to only quote inside test3.
This code is removing the quotes from the inside, as you did, but it is adding the quotes with fmt.Sprintf()
package main
import (
"fmt"
"strings"
)
func main() {
var a = "\"test1\",\"test2\",\"tes\"t3\""
fmt.Println(RemoveQuotes(a))
}
func RemoveQuotes(s string) string {
s = strings.Replace(s, "\"", "", -1) //here I removed all quotes. I'd like to remove only quotes between letters
return fmt.Sprintf(`"%s"`, s)
}
https://play.golang.org/p/dKB9DwYXZp

In your example you define a string variable so the outer quotes are not part of the actual string. If you would do fmt.Println("bus\"zipcode") the output on the screen would be bus"zipcode. If your goal is to replace quotes in a string with a space then you need to replace the quote not with an empty string as you do, but rather with a space - s = strings.Replace(s, "\"", " ", -1). Though if you want to remove the quotes entirely you can do something like this:
package main
import (
"fmt"
"strings"
)
func RemoveQuotes(s string) string {
result := ""
arr := strings.Split(s, ",")
for i:=0;i<len(arr);i++ {
sub := strings.Replace(arr[i], "\"", "", -1)
result = fmt.Sprintf("%s,\"%s\"", result, sub)
}
return result[1:]
}
func main() {
a:= "\"test1\",\"test2\",\"tes\"t3\""
fmt.Println(RemoveQuotes(a))
}
Note however that this is not very efficient, but I assume it's more about learning how to do it in this case.

Swift splitting "abc1.23.456.7890xyz" into "abc", "1", "23", "456", "7890" and "xyz"

In Swift on OS X I am trying to chop up the string "abc1.23.456.7890xyz" into these strings:
"abc"
"1"
"23"
"456"
"7890"
"xyz"
but when I run the following code I get the following:
=> "abc1.23.456.7890xyz"
(0,3) -> "abc"
(3,1) -> "1"
(12,4) -> "7890"
(16,3) -> "xyz"
which means that the application correctly found "abc", the first token "1", but then the next token found is "7890" (missing out "23" and "456") followed by "xyz".
Can anyone see how the code can be changed to find ALL of the strings (including "23" and "456")?
Many thanks in advance.
import Foundation
import XCTest
public
class StackOverflowTest: XCTestCase {
public
func testRegex() {
do {
let patternString = "([^0-9]*)([0-9]+)(?:\\.([0-9]+))*([^0-9]*)"
let regex = try NSRegularExpression(pattern: patternString, options: [])
let string = "abc1.23.456.7890xyz"
print("=> \"\(string)\"")
let range = NSMakeRange(0, string.characters.count)
regex.enumerateMatchesInString(string, options: [], range: range) {
(textCheckingResult, _, _) in
if let textCheckingResult = textCheckingResult {
for nsRangeIndex in 1 ..< textCheckingResult.numberOfRanges {
let nsRange = textCheckingResult.rangeAtIndex(nsRangeIndex)
let location = nsRange.location
if location < Int.max {
let startIndex = string.startIndex.advancedBy(location)
let endIndex = startIndex.advancedBy(nsRange.length)
let value = string[startIndex ..< endIndex]
print("\(nsRange) -> \"\(value)\"")
}
}
}
}
} catch {
}
}
}

It's all about your regex pattern. You want to find a series of contiguous letters or digits. Try this pattern instead:
let patternString = "([a-zA-Z]+|\\d+)"

alternative 'Swifty' way
let str = "abc1.23.456.7890xyz"
let chars = str.characters.map{ $0 }
enum CharType {
case Number
case Alpha
init(c: Character) {
self = .Alpha
if isNumber(c) {
self = .Number
}
}
func isNumber(c: Character)->Bool {
return "1234567890".characters.map{ $0 }.contains(c)
}
}
var tmp = ""
tmp.append(chars[0])
var type = CharType(c: chars[0])
for i in 1..<chars.count {
let c = CharType(c: chars[i])
if c != type {
tmp.append(Character("."))
}
tmp.append(chars[i])
type = c
}
tmp.characters.split(".", maxSplit: Int.max, allowEmptySlices: false).map(String.init)
// ["abc", "1", "23", "456", "7890", "xyz"]

Regexp to match all the numbers, letters and special characters in a string

I want a pattern to match a string that can have anything in it (letters, numbers, special characters).
public static void main(String[] args) {
String retVal=null;
try
{
String s1 = "[0-9a-zA-Z].*:[0-9a-zA-Z].*:(.*):[0-9a-zA-Z].*";
String s2 = "BNTPSDAE31G:BNTPSDAE:Healthcheck:Major";
Pattern pattern = null;
//if ( ! StringUtils.isEmpty(s1) )
if ( ( s1 != null ) && ( ! s1.matches("\\s*") ) )
{
pattern = Pattern.compile(s1);
}
//if ( ! StringUtils.isEmpty(s2) )
if ( s2 != null )
{
Matcher matcher = pattern.matcher( s2 );
if ( matcher.matches() )
{
retVal = matcher.group(1);
// A special case/kludge for Asentria. Temp alarms contain "Normal/High" etc.
// Switch Normal to return CLEAR. The default for this usage will be RAISE.
// Need to handle switches in XML. This won't work if anyone puts "normal" in their event alias.
if ("Restore".equalsIgnoreCase ( retVal ) )
{
}
}
}
}
catch( Exception e )
{
System.out.println("Error evaluating args : " );
}
System.out.println("retVal------"+retVal);
}
and output is:
Healthcheck
Here, using [0-9a-zA-Z].*, I am matching only letters and numbers, but I want to match the string if it has special characters too.
Any help is highly appreciated

If you want to match individual characters, try this (shown in Ruby):
2.1.2 :001 > s = "asad3435##:$%adasd1213"
2.1.2 :008 > s.scan(/./)
=> ["a", "s", "a", "d", "3", "4", "3", "5", "#", "#", ":", "$", "%", "a", "d", "a", "s", "d", "1", "2", "1", "3"]
Or, if you want to match everything at once, try this:
2.1.2 :009 > s.scan(/[^.]+/)
=> ["asad3435##:$%adasd1213"]

Try the following regex, it works for me :)
[^:]+
You might need to put a global modifier on it to get it to match all strings.
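To make the [^:]+ suggestion concrete: the question's [0-9a-zA-Z].* already accepts special characters after the first character of a field, and only fails when a field starts with one. Using "anything except a colon" for each field removes that restriction. The Go sketch below assumes exactly four colon-separated fields, as in the question's sample; aliasRe and thirdField are hypothetical names.

```go
package main

import (
    "fmt"
    "regexp"
)

// aliasRe expects four colon-separated fields and captures the third one.
var aliasRe = regexp.MustCompile(`^[^:]+:[^:]+:([^:]*):[^:]+$`)

// thirdField returns the third field of s and whether s had the expected shape.
func thirdField(s string) (string, bool) {
    m := aliasRe.FindStringSubmatch(s)
    if m == nil {
        return "", false
    }
    return m[1], true
}

func main() {
    fmt.Println(thirdField("BNTPSDAE31G:BNTPSDAE:Healthcheck:Major"))  // Healthcheck true
    fmt.Println(thirdField("#BNTP-31G:$BNTP:Health check!:Major"))     // Health check! true
}
```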

Insert commas into number string

Hey there, I'm trying to perform a backwards regular expression search on a string to divide it into groups of 3 digits. As far as I can see from the AS3 documentation, searching backwards is not possible in the regex engine.
The point of this exercise is to insert triplet commas into a number like so:
10000000 => 10,000,000
I'm thinking of doing it like so:
string.replace(/(\d{3})/g, ",$1")
But this is not correct due to the search not happening from the back and the replace $1 will only work for the first match.
I'm getting the feeling I would be better off performing this task using a loop.
UPDATE:
Due to AS3 not supporting lookahead this is how I have solved it.
public static function formatNumber(number:Number):String
{
var numString:String = number.toString()
var result:String = ''
while (numString.length > 3)
{
var chunk:String = numString.substr(-3)
numString = numString.substr(0, numString.length - 3)
result = ',' + chunk + result
}
if (numString.length > 0)
{
result = numString + result
}
return result
}

If your language supports positive lookahead assertions, then I think the following regex will work:
(\d)(?=(\d{3})+$)
Demonstrated in Java:
import static org.junit.Assert.assertEquals;
import org.junit.Test;
public class CommifyTest {
@Test
public void testCommify() {
String num0 = "1";
String num1 = "123456";
String num2 = "1234567";
String num3 = "12345678";
String num4 = "123456789";
String regex = "(\\d)(?=(\\d{3})+$)";
assertEquals("1", num0.replaceAll(regex, "$1,"));
assertEquals("123,456", num1.replaceAll(regex, "$1,"));
assertEquals("1,234,567", num2.replaceAll(regex, "$1,"));
assertEquals("12,345,678", num3.replaceAll(regex, "$1,"));
assertEquals("123,456,789", num4.replaceAll(regex, "$1,"));
}
}

Found on http://gskinner.com/RegExr/
Community > Thousands separator
Pattern: /\d{1,3}(?=(\d{3})+(?!\d))/g
Replace: $&,
trace ( String("1000000000").replace( /\d{1,3}(?=(\d{3})+(?!\d))/g , "$&,") );
It did the job!

If your regex engine has positive lookaheads, you could do something like this:
string.replace(/(\d)(?=(\d\d\d)+$)/g, "$1,")
Where the positive lookahead (?=...) means that the regex only matches when the lookahead expression ... matches.
(Note that lookaround-expressions are not always very efficient.)

While many of these answers work fine with positive integers, many of them take their input as a Number, which implies that they can handle negative values or decimals, and for those inputs all of these solutions fail. Although the currently selected answer does not assume a Number, I was curious to find a solution that could and that was also more performant than RegExp (which AS3 does not do well).
I put together many of the answers here in a testing class (and included a solution from this blog and an answer of my own called commaify) and formatted them in a consistent way for easy comparison:
package
{
public class CommaNumberSolutions
{
public static function commaify( input:Number ):String
{
var split:Array = input.toString().split( '.' ),
front:String = split[0],
back:String = ( split.length > 1 ) ? "." + split[1] : null,
n:int = input < 0 ? 2 : 1,
commas:int = Math.floor( (front.length - n) / 3 ),
i:int = 1;
for ( ; i <= commas; i++ )
{
n = front.length - (3 * i + i - 1);
front = front.slice( 0, n ) + "," + front.slice( n );
}
if ( back )
return front + back;
else
return front;
}
public static function getCommaString( input:Number ):String
{
var s:String = input.toString();
if ( s.length <= 3 )
return s;
var i:int = s.length % 3;
if ( i == 0 )
i = 3;
for ( ; i < s.length; i += 4 )
{
var part1:String = s.substr(0, i);
var part2:String = s.substr(i, s.length);
s = part1.concat(",", part2);
}
return s;
}
public static function formatNumber( input:Number ):String
{
var s:String = input.toString()
var result:String = ''
while ( s.length > 3 )
{
var chunk:String = s.substr(-3)
s = s.substr(0, s.length - 3)
result = ',' + chunk + result
}
if ( s.length > 0 )
result = s + result
return result
}
public static function commaCoder( input:Number ):String
{
var s:String = "";
var len:Number = input.toString().length;
for ( var i:int = 0; i < len; i++ )
{
if ( (len-i) % 3 == 0 && i != 0)
s += ",";
s += input.toString().charAt(i);
}
return s;
}
public static function regex1( input:Number ):String
{
return input.toString().replace( /-{0,1}(\d)(?=(\d\d\d)+$)/g, "$1," );
}
public static function regex2( input:Number ):String
{
return input.toString().replace( /-{0,1}\d{1,3}(?=(\d{3})+(?!\d))/g , "$&,")
}
public static function addCommas( input:Number ):String
{
var negative:String = "";
if ( input < 0 )
{
negative = "-";
input = Math.abs(input);
}
var s:String = input.toString();
var results:Array = s.split(/\./);
s = results[0];
if ( s.length > 3 )
{
var mod:Number = s.length % 3;
var output:String = s.substr(0, mod);
for ( var i:Number = mod; i < s.length; i += 3 )
{
output += ((mod == 0 && i == 0) ? "" : ",") + s.substr(i, 3);
}
if ( results.length > 1 )
{
if ( results[1].length == 1 )
return negative + output + "." + results[1] + "0";
else
return negative + output + "." + results[1];
}
else
return negative + output;
}
if ( results.length > 1 )
{
if ( results[1].length == 1 )
return negative + s + "." + results[1] + "0";
else
return negative + s + "." + results[1];
}
else
return negative + s;
}
}
}
Then I tested each for accuracy and performance:
package
{
public class TestCommaNumberSolutions
{
private var functions:Array;
function TestCommaNumberSolutions()
{
functions = [
{ name: "commaify()", f: CommaNumberSolutions.commaify },
{ name: "addCommas()", f: CommaNumberSolutions.addCommas },
{ name: "getCommaString()", f: CommaNumberSolutions.getCommaString },
{ name: "formatNumber()", f: CommaNumberSolutions.formatNumber },
{ name: "regex1()", f: CommaNumberSolutions.regex1 },
{ name: "regex2()", f: CommaNumberSolutions.regex2 },
{ name: "commaCoder()", f: CommaNumberSolutions.commaCoder }
];
verify();
measure();
}
protected function verify():void
{
var assertions:Array = [
{ input: 1, output: "1" },
{ input: 21, output: "21" },
{ input: 321, output: "321" },
{ input: 4321, output: "4,321" },
{ input: 54321, output: "54,321" },
{ input: 654321, output: "654,321" },
{ input: 7654321, output: "7,654,321" },
{ input: 987654321, output: "987,654,321" },
{ input: 1987654321, output: "1,987,654,321" },
{ input: 21987654321, output: "21,987,654,321" },
{ input: 321987654321, output: "321,987,654,321" },
{ input: 4321987654321, output: "4,321,987,654,321" },
{ input: 54321987654321, output: "54,321,987,654,321" },
{ input: 654321987654321, output: "654,321,987,654,321" },
{ input: 7654321987654321, output: "7,654,321,987,654,321" },
{ input: 87654321987654321, output: "87,654,321,987,654,321" },
{ input: -1, output: "-1" },
{ input: -21, output: "-21" },
{ input: -321, output: "-321" },
{ input: -4321, output: "-4,321" },
{ input: -54321, output: "-54,321" },
{ input: -654321, output: "-654,321" },
{ input: -7654321, output: "-7,654,321" },
{ input: -987654321, output: "-987,654,321" },
{ input: -1987654321, output: "-1,987,654,321" },
{ input: -21987654321, output: "-21,987,654,321" },
{ input: -321987654321, output: "-321,987,654,321" },
{ input: -4321987654321, output: "-4,321,987,654,321" },
{ input: -54321987654321, output: "-54,321,987,654,321" },
{ input: -654321987654321, output: "-654,321,987,654,321" },
{ input: -7654321987654321, output: "-7,654,321,987,654,321" },
{ input: -87654321987654321, output: "-87,654,321,987,654,321" },
{ input: .012345, output: "0.012345" },
{ input: 1.012345, output: "1.012345" },
{ input: 21.012345, output: "21.012345" },
{ input: 321.012345, output: "321.012345" },
{ input: 4321.012345, output: "4,321.012345" },
{ input: 54321.012345, output: "54,321.012345" },
{ input: 654321.012345, output: "654,321.012345" },
{ input: 7654321.012345, output: "7,654,321.012345" },
{ input: 987654321.012345, output: "987,654,321.012345" },
{ input: 1987654321.012345, output: "1,987,654,321.012345" },
{ input: 21987654321.012345, output: "21,987,654,321.012345" },
{ input: -.012345, output: "-0.012345" },
{ input: -1.012345, output: "-1.012345" },
{ input: -21.012345, output: "-21.012345" },
{ input: -321.012345, output: "-321.012345" },
{ input: -4321.012345, output: "-4,321.012345" },
{ input: -54321.012345, output: "-54,321.012345" },
{ input: -654321.012345, output: "-654,321.012345" },
{ input: -7654321.012345, output: "-7,654,321.012345" },
{ input: -987654321.012345, output: "-987,654,321.012345" },
{ input: -1987654321.012345, output: "-1,987,654,321.012345" },
{ input: -21987654321.012345, output: "-21,987,654,321.012345" }
];
var i:int;
var len:int = assertions.length;
var assertion:Object;
var f:Function;
var s1:String;
var s2:String;
for each ( var o:Object in functions )
{
i = 0;
f = o.f;
trace( '\rVerify: ' + o.name );
for ( ; i < len; i++ )
{
assertion = assertions[ i ];
s1 = f.apply( null, [ assertion.input ] );
s2 = assertion.output;
if ( s1 !== s2 )
trace( 'Test #' + i + ' Failed: ' + s1 + ' !== ' + s2 );
}
}
}
protected function measure():void
{
// Generate random inputs
var values:Array = [];
for ( var i:int = 0; i < 999999; i++ ) {
values.push( Math.random() * int.MAX_VALUE * ( Math.random() > .5 ? -1 : 1) );
}
var len:int = values.length;
var stopwatch:Stopwatch = new Stopwatch;
var s:String;
var f:Function;
trace( '\rTesting ' + len + ' random values' );
// Test each function
for each ( var o:Object in functions )
{
i = 0;
s = "";
f = o.f;
stopwatch.start();
for ( ; i < len; i++ ) {
s += f.apply( null, [ values[i] ] ) + " ";
}
stopwatch.stop();
trace( o.name + '\t\ttook ' + (stopwatch.elapsed/1000) + 's' ); //(stopwatch.elapsed/len) + 'ms'
}
}
}
}
import flash.utils.getTimer;
class Stopwatch
{
protected var startStamp:int;
protected var stopStamp:int;
protected var _started:Boolean;
protected var _stopped:Boolean;
function Stopwatch( startNow:Boolean = true ):void
{
if ( startNow )
start();
}
public function start():void
{
startStamp = getTimer();
_started = true;
_stopped = false;
}
public function stop():void
{
stopStamp = getTimer();
_stopped = true;
_started = false;
}
public function get elapsed():int
{
return ( _stopped ) ? stopStamp - startStamp : ( _started ) ? getTimer() - startStamp : 0;
}
public function get started():Boolean
{
return _started;
}
public function get stopped():Boolean
{
return _stopped;
}
}
Because of AS3's lack of precision with larger Numbers, every function failed these tests:
Test #15 Failed: 87,654,321,987,654,320 !== 87,654,321,987,654,321
Test #31 Failed: -87,654,321,987,654,320 !== -87,654,321,987,654,321
Test #42 Failed: 21,987,654,321.012344 !== 21,987,654,321.012345
Test #53 Failed: -21,987,654,321.012344 !== -21,987,654,321.012345
But only two functions passed all of the other tests: commaify() and addCommas().
The performance tests show that commaify() is the most performant of all the solutions:
Testing 999999 random values
commaify() took 12.411s
addCommas() took 17.863s
getCommaString() took 18.519s
formatNumber() took 14.409s
regex1() took 40.654s
regex2() took 36.985s
Additionally, commaify() can be extended to include arguments for decimal length and zero-padding of the decimal portion. The extended version also outperforms the others at 13.128s:
public static function cappedDecimal( input:Number, decimalPlaces:int = 2 ):Number
{
if ( decimalPlaces == 0 )
return Math.floor( input );
var decimalFactor:Number = Math.pow( 10, decimalPlaces );
return Math.floor( input * decimalFactor ) / decimalFactor;
}
public static function cappedDecimalString( input:Number, decimalPlaces:int = 2, padZeros:Boolean = true ):String
{
if ( padZeros )
return cappedDecimal( input, decimalPlaces ).toFixed( decimalPlaces );
else
return cappedDecimal( input, decimalPlaces ).toString();
}
public static function commaifyExtended( input:Number, decimalPlaces:int = 2, padZeros:Boolean = true ):String
{
var split:Array = cappedDecimalString( input, decimalPlaces, padZeros ).split( '.' ),
front:String = split[0],
back:String = ( split.length > 1 ) ? "." + split[1] : null,
n:int = input < 0 ? 2 : 1,
commas:int = Math.floor( (front.length - n) / 3 ),
i:int = 1;
for ( ; i <= commas; i++ )
{
n = front.length - (3 * i + i - 1);
front = front.slice( 0, n ) + "," + front.slice( n );
}
if ( back )
return front + back;
else
return front;
}
So, I'd offer that commaify() meets the demands of versatility and performance, though it is certainly not the most compact or elegant.

This really isn't the best use of RegEx... I'm not aware of a number formatting function, but this thread seems to provide a solution.
function commaCoder(yourNum):String {
//var yourNum:Number = new Number();
var numtoString:String = new String();
var numLength:Number = yourNum.toString().length;
numtoString = "";
for (i=0; i<numLength; i++) {
if ((numLength-i)%3 == 0 && i != 0) {
numtoString += ",";
}
numtoString += yourNum.toString().charAt(i);
trace(numtoString);
}
return numtoString;
}
If you really are insistent on using RegEx, you could just reverse the string, apply the RegEx replace function, then reverse it back.

A sexeger is good for this. In brief, a sexeger is a reversed regex run against a reversed string that you reverse the output of. It is generally more efficient than the alternative. Here is some pseudocode for what you want to do:
string = reverse string
string.replace(/(\d{3})(?!$)/g, "$1,")
string = reverse string
Here is a Perl implementation:
#!/usr/bin/perl
use strict;
use warnings;
my $s = 13_456_789;
for my $n (1, 12, 123, 1234, 12345, 123456, 1234567) {
my $s = reverse $n;
$s =~ s/([0-9]{3})(?!$)/$1,/g;
$s = reverse $s;
print "$s\n";
}

You may want to consider NumberFormatter

I'll take the downvotes for not being the requested language, but this non-regex technique should apply (and I arrived here via searching for "C# regex to add commas into number")
var raw = "104241824 15202656 KB 13498560 KB 1612672KB already 1,000,000 or 99.999 or 9999.99";
int i = 0;
bool isnum = false;
var formatted = raw.Reverse().Aggregate(new StringBuilder(), (sb, c) => {
//$"{i}: [{c}] {isnum}".Dump();
if (char.IsDigit(c) && c != ' ' && c!= '.' && c != ',') {
if (isnum) {
if (i == 3) {
//$"ins ,".Dump();
sb.Insert(0, ',');
i = 0;
}
}
else isnum = true;
i++;
}
else {
isnum = false;
i = 0;
}
sb.Insert(0, c);
return sb;
});
results in:
104,241,824 15,202,656 KB 13,498,560 KB 1,612,672KB already 1,000,000 or 99.999 or 9,999.99

// This is a simple code and it works fine...:)
import java.util.Scanner;
public class NumberWithCommas {
public static void main(String a[]) {
Scanner sc = new Scanner(System.in);
String num;
System.out.println("\n enter the number:");
num = sc.next();
printNumber(num);
}
public static void printNumber(String ar) {
int len, i = 0, temp = 0;
len = ar.length();
temp = len / 3;
if (len % 3 == 0)
temp = temp - 1;
len = len + temp;
char[] ch = ar.toCharArray();
char[] ch1 = new char[len];
for (int j = 0, k = (ar.length() - 1); j < len; j++)
{
if (i < 3)
{
ch1[j] = ch[k];
i++;
k--;
}
else
{
ch1[j] = ',';
i = 0;
}
}
for (int j = len - 1; j >= 0; j--)
System.out.print(ch1[j]);
System.out.println("");
}
}

If you can't use lookahead on regular expressions, you can use this:
string.replace(/^(.*?,)?(\d{1,3})((?:\d{3})+)$/, "$1$2,$3")
inside a loop until there's nothing to replace.
For example, a perlish solution would look like this:
my $num = '1234567890';
while ($num =~ s/^(.*?,)?(\d{1,3})((?:\d{3})+)$/$1$2,$3/) {}
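This lookahead-free pattern also suits engines such as Go's regexp package, which has no lookarounds at all. A rough Go equivalent of the loop, under the assumption that the input is an unsigned run of digits (the anchors mean a sign or a decimal part prevents any match), could look like this; addCommas is a made-up name.

```go
package main

import (
    "fmt"
    "regexp"
)

// commaRe splits off the leftmost group that still needs a comma after it:
// $1 is the part that already has commas, $2 the next 1-3 digits,
// $3 the remaining digits (a multiple of three).
var commaRe = regexp.MustCompile(`^(.*?,)?(\d{1,3})((?:\d{3})+)$`)

// addCommas applies the replacement repeatedly until nothing changes.
func addCommas(digits string) string {
    for {
        next := commaRe.ReplaceAllString(digits, "${1}${2},${3}")
        if next == digits {
            return digits
        }
        digits = next
    }
}

func main() {
    fmt.Println(addCommas("1234567890")) // 1,234,567,890
}
```

Each pass adds exactly one comma, so the loop runs once per separator plus one final pass that finds nothing to replace.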

Perl RegExp 1 liner:
1 while $VAR{total} =~ s/(.*\d)(\d\d\d)/$1,$2/g;

Try this code. It's simple and performs well.
var reg:RegExp=/\d{1,3}(?=(\d{3})+(?!\d))/g;
var my_num:Number = 48712694;
var my_num_str:String = String(my_num).replace(reg,"$&,");
trace(my_num_str);
::output::
48,712,694
