Parsing Perl regex with golang

http://play.golang.org/p/GM0SWo0qGs
This is my code and playground.
func insert_comma(input_num int) string {
temp_str := strconv.Itoa(input_num)
var validID = regexp.MustCompile(`\B(?=(\d{3})+$)`)
return validID.ReplaceAllString(temp_str, ",")
}
func main() {
fmt.Println(insert_comma(1000000000))
}
Basically, my desired output is 1,000,000,000.
The regular expression works in JavaScript, but I do not know how to make this Perl-style regex work in Go. I would greatly appreciate any help. Thanks.

Since Go's regexp package does not support lookahead assertions, here is a different algorithm that uses no regexp:
Perl code:
sub insert_comma {
my $x=shift;
my $l=length($x);
# compare against the current length: $x grows with every inserted comma
for (my $i=$l%3==0?3:$l%3;$i<length($x);$i+=3) {
substr($x,$i++,0)=',';
}
return $x;
}
print insert_comma(1000000000);
Go code: Disclaimer: I have zero experience with Go, so bear with me if I have errors and feel free to edit my post!
func insert_comma(input_num int) string {
temp_str := strconv.Itoa(input_num)
var result []string
i := len(temp_str) % 3
if i == 0 {
i = 3
}
for index, element := range strings.Split(temp_str, "") {
if i == index {
result = append(result, ",")
i += 3
}
result = append(result, element)
}
return strings.Join(result, "")
}
func main() {
fmt.Println(insert_comma(1000000000))
}
http://play.golang.org/p/7pvo7-3G-s
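
To confirm the diagnosis, here is a minimal sketch. Go's regexp package uses RE2 syntax and rejects the lookahead when the pattern is compiled, so regexp.Compile returns an error where MustCompile would panic. The helper names compiles and insertCommas are made up for illustration. insertCommas is just one alternative way to group the digits, and it also handles negative numbers:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

// insertCommas groups the digits of n in threes, counting from the right.
func insertCommas(n int) string {
	s := strconv.Itoa(n)
	sign := ""
	if strings.HasPrefix(s, "-") {
		sign, s = "-", s[1:]
	}
	var b strings.Builder
	for i, c := range s {
		// A comma goes before every digit whose distance to the end
		// of the string is a multiple of three.
		if i > 0 && (len(s)-i)%3 == 0 {
			b.WriteByte(',')
		}
		b.WriteRune(c)
	}
	return sign + b.String()
}

func main() {
	fmt.Println(compiles(`\B(?=(\d{3})+$)`)) // the lookahead is rejected
	fmt.Println(insertCommas(1000000000))
	fmt.Println(insertCommas(-1234))
}
```

Checking the error from regexp.Compile is usually preferable to MustCompile when the pattern comes from outside the program.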

Related

Runtime optimization for regular expression

Most regular expressions are "constant" during their lifetime. Is it a good idea to use a global regular expression to speed up execution? For example:
func work() {
r := regexp.MustCompile(`...`)
if r.MatchString(...) {
...
}
}
compared with:
var r *regexp.Regexp
func work() {
if r.MatchString(...) {
...
}
}
func init() {
r = regexp.MustCompile(`...`)
}
Do these two versions have any meaningful difference?
Compiling a regular expression is so cheap that a global regex is not worth it, in terms of both CPU cost and garbage collection (suppose work() is called heavily).
It is better to use a global regular expression whenever appropriate.
Which of the above is correct, or is the answer not simply black and white?

If you use the same regular expression (e.g. "\d+") just once, a global regex is not worth it.
If you use the same regular expression (e.g. "\d+") often, it is worth it:
func Benchmark01(b *testing.B) {
for i := 0; i < b.N; i++ {
r := regexp.MustCompile(`\d+`)
r.MatchString("aaaaaaa123bbbbbbb")
}
}
func Benchmark02(b *testing.B) {
r := regexp.MustCompile(`\d+`)
for i := 0; i < b.N; i++ {
r.MatchString("aaaaaaa123bbbbbbb")
}
}
Benchmark01
Benchmark01-4 886909 1361 ns/op
Benchmark02
Benchmark02-4 5368380 232.8 ns/op
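
As a minimal sketch of the "compile once" variant, the pattern can also be compiled in a package-level variable declaration, which avoids a separate init() function. The names digits and hasDigits are assumptions made for this example. A compiled *regexp.Regexp is documented as safe for concurrent use by multiple goroutines, so sharing it this way is generally fine:

```go
package main

import (
	"fmt"
	"regexp"
)

// digits is compiled once, when the package is initialized.
var digits = regexp.MustCompile(`\d+`)

// hasDigits is a hypothetical helper that reuses the shared pattern
// instead of recompiling it on every call.
func hasDigits(s string) bool {
	return digits.MatchString(s)
}

func main() {
	fmt.Println(hasDigits("aaaaaaa123bbbbbbb"))
	fmt.Println(hasDigits("no numbers here"))
}
```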

How do I rewrite regex in golang to work around no support for positive lookaheads? [duplicate]

I am trying to write a password validation function with regexp and don't know how to do it.
The regexp package in Go's standard library differs from the regex engines of other languages.
Does anyone have an idea what this regexp pattern should look like?
The pattern should validate:
/*
* Password rules:
* at least 7 letters
* at least 1 number
* at least 1 upper case
* at least 1 special character
*/

That's not practical with a single pattern, since Go's regexp supports neither lookahead assertions nor backtracking.
However, it's easy to implement by hand; a simple example:
func verifyPassword(s string) (sevenOrMore, number, upper, special bool) {
letters := 0
for _, c := range s {
switch {
case unicode.IsNumber(c):
number = true
case unicode.IsUpper(c):
upper = true
letters++
case unicode.IsPunct(c) || unicode.IsSymbol(c):
special = true
case unicode.IsLetter(c) || c == ' ':
letters++
default:
//return false, false, false, false
}
}
sevenOrMore = letters >= 7
return
}

The right regexp would be... no regexp here.
You can define a custom function that validates the password, and combine it with other frameworks that help validate a field, like mccoyst/validate (mentioned in this discussion about parameter validation).
You also have go-validator/validator, which allows you to define similar validations (but I would still use a custom validator instead of one or several regexps).
Note: Go's regexp is based on re2, an efficient, principled regular expression library.
So the major trade-offs are no back-references, for example (abc)\1, and no lookahead or lookbehind assertions.
In exchange, you get high-speed regex matching.

Building on a neighboring answer, I too wrote a helper function that works well for me. This one checks the overall password length (len(s), which counts bytes) rather than counting letters. Check out the following...
func isValid(s string) bool {
var (
hasMinLen = false
hasUpper = false
hasLower = false
hasNumber = false
hasSpecial = false
)
if len(s) >= 7 {
hasMinLen = true
}
for _, char := range s {
switch {
case unicode.IsUpper(char):
hasUpper = true
case unicode.IsLower(char):
hasLower = true
case unicode.IsNumber(char):
hasNumber = true
case unicode.IsPunct(char) || unicode.IsSymbol(char):
hasSpecial = true
}
}
return hasMinLen && hasUpper && hasLower && hasNumber && hasSpecial
}
isValid("pass") // false
isValid("password") // false
isValid("Password") // false
isValid("P#ssword") // false
isValid("P#ssw0rd") // true

Based on @OneOfOne's answer, with improved error messages:
package main
import (
"fmt"
"strings"
"unicode"
)
func verifyPassword(password string) error {
var uppercasePresent bool
var lowercasePresent bool
var numberPresent bool
var specialCharPresent bool
const minPassLength = 8
const maxPassLength = 64
var passLen int
var errorString string
for _, ch := range password {
switch {
case unicode.IsNumber(ch):
numberPresent = true
passLen++
case unicode.IsUpper(ch):
uppercasePresent = true
passLen++
case unicode.IsLower(ch):
lowercasePresent = true
passLen++
case unicode.IsPunct(ch) || unicode.IsSymbol(ch):
specialCharPresent = true
passLen++
case ch == ' ':
passLen++
}
}
appendError := func(err string) {
if len(strings.TrimSpace(errorString)) != 0 {
errorString += ", " + err
} else {
errorString = err
}
}
if !lowercasePresent {
appendError("lowercase letter missing")
}
if !uppercasePresent {
appendError("uppercase letter missing")
}
if !numberPresent {
appendError("at least one numeric character required")
}
if !specialCharPresent {
appendError("special character missing")
}
if !(minPassLength <= passLen && passLen <= maxPassLength) {
appendError(fmt.Sprintf("password length must be between %d and %d characters", minPassLength, maxPassLength))
}
if len(errorString) != 0 {
// pass the message as an argument, not as the format string
return fmt.Errorf("%s", errorString)
}
return nil
}
// Let's test it
func main() {
password := "Apple"
err := verifyPassword(password)
fmt.Println(password, " ", err)
}

Below is my implementation of the above answers, with custom messages and a few performance-minded tweaks.
package main
import (
"fmt"
"strconv"
"unicode"
)
func main() {
pass := "12345678_Windrol"
// call the password validator and give it field name to be known by the user, password, and the min and max password length
isValid, errs := isValidPassword("Password", pass, 8, 32)
if isValid {
fmt.Println("The password is valid")
} else {
for _, v := range errs {
fmt.Println(v)
}
}
}
func isValidPassword(field, s string, min, max int) (isValid bool, errs []string) {
var (
isMin bool
special bool
number bool
upper bool
lower bool
)
// append error; declared here, before its first use
appendError := func(err string) {
errs = append(errs, field+" "+err)
}
// test for the maximum and minimum characters required for the password string
if len(s) < min || len(s) > max {
appendError("length should be " + strconv.Itoa(min) + " to " + strconv.Itoa(max))
} else {
isMin = true
}
for _, c := range s {
// Optimize perf if all become true before reaching the end
if special && number && upper && lower && isMin {
break
}
// else go on switching
switch {
case unicode.IsUpper(c):
upper = true
case unicode.IsLower(c):
lower = true
case unicode.IsNumber(c):
number = true
case unicode.IsPunct(c) || unicode.IsSymbol(c):
special = true
}
}
// Add custom error messages
if !special {
appendError("should contain at least a single special character")
}
if !number {
appendError("should contain at least a single digit")
}
if !lower {
appendError("should contain at least a single lowercase letter")
}
if !upper {
appendError("should contain at least a single uppercase letter")
}
// if there is any error
if len(errs) > 0 {
return false, errs
}
// everything is right
return true, errs
}

There are many ways to skin a cat. The other answers seem to veer away from regex completely, so I thought I'd show my method for simple pass/fail testing of a password string, which is styled to suit my thinking. (Note that this doesn't meet the literal "7 letters" requirement in the original question, but it does check overall length.) To me, this code is fairly simple and looks easier to read than switch statements or a bunch of if statements:
password := "Pa$$w0rd"
secure := true
tests := []string{".{7,}", "[a-z]", "[A-Z]", "[0-9]", "[^\\d\\w]"}
for _, test := range tests {
t, _ := regexp.MatchString(test, password)
if !t {
secure = false
break
}
}
//secure will be true, since the string "Pa$$w0rd" passes all the tests
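
If this check runs often, the same idea can be combined with patterns that are compiled once. The following is a minimal sketch under that assumption. The names passwordTests and isSecure are hypothetical, and the patterns are the ones from the snippet above:

```go
package main

import (
	"fmt"
	"regexp"
)

// passwordTests holds the patterns from the snippet above, compiled once
// at package initialization instead of on every call.
var passwordTests = []*regexp.Regexp{
	regexp.MustCompile(`.{7,}`),   // at least 7 characters overall
	regexp.MustCompile(`[a-z]`),   // a lowercase letter
	regexp.MustCompile(`[A-Z]`),   // an uppercase letter
	regexp.MustCompile(`[0-9]`),   // a digit
	regexp.MustCompile(`[^\d\w]`), // anything that is not a digit or word character
}

// isSecure reports whether the password matches every pattern.
func isSecure(password string) bool {
	for _, re := range passwordTests {
		if !re.MatchString(password) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isSecure("Pa$$w0rd"))
	fmt.Println(isSecure("password"))
}
```

Unlike regexp.MatchString, which returns an error at call time, MustCompile panics at startup if a pattern is invalid, so a typo in a pattern cannot be silently ignored.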

Regex to not allow space between words [duplicate]

I'm trying to write a regular expression that disallows white space at the beginning of the input, but not after it, and permits only a single space after each word.
RegExp used:
var re = new RegExp(/^([a-zA-Z0-9]+\s?)*$/);
Test examples:
1) test[space]ing - Should be allowed
2) testing - Should be allowed
3) [space]testing - Should not be allowed
4) testing[space] - Should be allowed but have to trim it
5) testing[space][space] - should be allowed but have to trim it
Only one space should be allowed. Is it possible?

To match what you need, you can use
var re = /^([a-zA-Z0-9]+\s)*[a-zA-Z0-9]+$/;
Maybe you could shorten that a bit, but then it matches _ as well:
var re = /^(\w+\s)*\w+$/;

function validate(s) {
if (/^(\w+\s?)*\s*$/.test(s)) {
return s.replace(/\s+$/, '');
}
return 'NOT ALLOWED';
}
validate('test ing') // => 'test ing'
validate('testing') // => 'testing'
validate(' testing') // => 'NOT ALLOWED'
validate('testing ') // => 'testing'
validate('testing  ') // => 'testing'
validate('test ing ') // => 'test ing'
BTW, new RegExp(..) is redundant if you use a regular expression literal.

This one does not allow leading or trailing spaces, and allows only one space between words. Feel free to add any special characters you want.
^([A-Za-z]+ )+[A-Za-z]+$|^[A-Za-z]+$

Working code, inside my name.addTextChangedListener():
public void onTextChanged(CharSequence s, int start, int before, int count) {
String n = name.getText().toString();
if (n.equals(""))
name.setError("Name required");
else if (!n.matches("[\\p{Alpha}\\s]*\\b") | n.matches(".*\\s{2}.*") | n.matches("\\s.*")) {
if (n.matches("\\s.*"))
name.setError("Name cannot begin with a space");
else if (n.matches(".*\\s{2}.*"))
name.setError("Multiple spaces between texts");
else if (n.matches(".*\\s"))
name.setError("Blank space at the end of text");
else
name.setError("Non-alphabetic character entered");
}
}
You could try adapting this to your code.

var f=function(t){return Math.pow(t.split(' ').length,2)/t.trim().split(' ').length==2}
f("a a")
true
f("a a ")
false
f("a  a")
false
f(" a a")
false
f("a a a")
false

Here is a solution without a regular expression.
Add this script inside the document.ready function and it will work.
var i=0;
jQuery("input,textarea").on('keypress',function(e){
//alert();
if(jQuery(this).val().length < 1){
if(e.which == 32){
//alert(e.which);
return false;
}
}
else {
if(e.which == 32){
if(i != 0){
return false;
}
i++;
}
else{
i=0;
}
}
});

const handleChangeText = text => {
let lastLetter = text[text.length - 1];
let secondLastLetter = text[text.length - 2];
if (lastLetter === ' ' && secondLastLetter === ' ') {
return;
}
setInputText(text.trim());
};

Use this:
^([A-Za-z]{5,}|[\s]{1}[A-Za-z]{1,})*$
Demo: https://regex101.com/r/3HP7hl/2

If statement fails with regex comparison

public list[str] deleteBlockComments(list[str] fileLines)
{
bool blockComment = false;
list[str] sourceFile = [];
for(fileLine <- fileLines)
{
fileLine = trim(fileLine);
println(fileLine);
if (/^[\t]*[\/*].*$/ := fileLine)
{
blockComment = true;
}
if (/^[\t]*[*\/].*$/ := fileLine)
{
blockComment = false;
}
println(blockComment);
if(!blockComment)
{
sourceFile = sourceFile + fileLine;
}
}
return sourceFile;
}
For some reason, I am not able to detect /* at the beginning of a string. If I execute this on the command line, it seems to work fine.
Can someone tell me what I am doing wrong? In the picture below you can see the string to be compared above the comparison result (false).

[\/*] is a character set that matches a forward slash or a star, not both one after the other. Simply remove the square brackets, escape the star so that it is treated as a literal character, and your pattern should start behaving as you expect.
While we're at it, let's also get rid of the superfluous square brackets around \t:
^\t*\/\*.*$

How to return only alphanumeric substring?

I'm new to Swift programming. Can anyone help me return only the alphanumeric characters of a string?
Example:
Input = "wolf & lion"
Output = "wolflion"
I wonder if there is any solution besides regex.
Thank you

try this:
let outputStr = "wolf & lion".components(separatedBy: CharacterSet.alphanumerics.inverted)
.joined()
print(outputStr)//wolflion

let a = "abs1 2csd^!#awerqwe"
// String is a collection of Characters in current Swift, so map works on it directly
let b = a.map { (char) -> String in
if String(char).rangeOfCharacter(from: CharacterSet.alphanumerics) != nil {
return String(char)
} else {
return ""
}
}.joined()
//OR use unicode scalar
let c = a.unicodeScalars.map { (char) -> String in
if CharacterSet.alphanumerics.contains(char) {
return String(char)
} else {
return ""
}
}.joined()
Output: abs12csdawerqwe

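Please check (this works only when the non-alphanumeric characters form a single run inside the string, as they do here):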
let str = "wolf & lion"
let charset = str.trimmingCharacters(in: CharacterSet.alphanumerics)
let alphanumericString = str.components(separatedBy: charset).joined(separator: "")
print(alphanumericString) // wolflion
