Condition:
Contents can only contain characters from the following set:
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9
/ - ? : ( ) . , ' +
• Contents may NOT begin with ‘/’
• Contents may NOT contain ‘//’
export function directDebitValidator(nameRe: RegExp): ValidatorFn {
return (control: AbstractControl): { [key: string]: any } | null => {
const directDebitID = nameRe.test(control.value);
return directDebitID ? { 'directDebit': { value: control.value } } : null;
};
}
@Directive({
selector: '[directDebit]',
providers: [{ provide: NG_VALIDATORS, useExisting: DirectDebitValidatorDirective, multi: true }]
})
export class DirectDebitValidatorDirective {
validate(control: AbstractControl): { [key: string]: any } | null {
return control.value ? directDebitValidator(new RegExp("^(? !.* [\/]{2})[a-zA-Z0-9-?:().,'+]+([a-zA-Z0-9\/-?:().,'+])*$"))(control)
: null;
}
}
There are a couple of issues:
There cannot be spaces in the lookahead definition
The [/-?] creates a range, the - must be escaped or placed at the start/end of the character class.
You may use a / unescaped in the constructor notation since no delimiters are being used there.
So, you may use
directDebitValidator(new RegExp("^(?!.*/{2})[a-zA-Z0-9?:().,'+-][a-zA-Z0-9/?:().,'+-]*$"))
Or, using a regex literal notation:
directDebitValidator(/^(?!.*\/{2})[a-zA-Z0-9?:().,'+-][a-zA-Z0-9\/?:().,'+-]*$/)
See the regex demo.
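As a quick check of the corrected pattern, here is a minimal sketch in plain JavaScript, outside Angular. The helper name isValidContents is made up for illustration; the pattern itself is the regex literal from the answer.

```javascript
// Corrected pattern from the answer, written as a regex literal.
const allowed = /^(?!.*\/{2})[a-zA-Z0-9?:().,'+-][a-zA-Z0-9\/?:().,'+-]*$/;

// Hypothetical helper: true when the value satisfies the stated condition.
function isValidContents(value) {
  return allowed.test(value);
}

console.log(isValidContents("ABC/123")); // true
console.log(isValidContents("/ABC"));    // false: begins with '/'
console.log(isValidContents("AB//C"));   // false: contains '//'
console.log(isValidContents("AB C"));    // false: space is not in the allowed set
```

Note that the validator in the question returns the error object when the pattern matches. With a pattern that describes valid contents, the result of test() probably needs to be negated, depending on what the directive is meant to report.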
I'm using ANTLR4 and the CSS grammar from https://github.com/antlr/grammars-v4/tree/master/css3. The grammar defines the following ( pared down a little for brevity )
dimension
: ( Plus | Minus )? Dimension
;
fragment FontRelative
: Number E M
| Number E X
| Number C H
| Number R E M
;
fragment AbsLength
: Number P X
| Number C M
| Number M M
| Number I N
| Number P T
| Number P C
| Number Q
;
fragment Angle
: Number D E G
| Number R A D
| Number G R A D
| Number T U R N
;
fragment Length
: AbsLength
| FontRelative
;
Dimension
: Length
| Angle
;
The matching works fine but I don't see an obvious way to extract the units. The parser creates a DimensionContext which has 3 TerminalNode members - Dimension, Plus and Minus. I'd like to be able to extract the unit during parse without having to do additional string parsing.
I know that one issue is that Length and Angle are fragments. I changed the grammar to not use fragments:
Unit
: 'em'
| 'ex'
| 'ch'
| 'rem'
| 'vw'
| 'vh'
| 'vmin'
| 'vmax'
| 'px'
| 'cm'
| 'mm'
| 'in'
| 'pt'
| 'q'
| 'deg'
| 'rad'
| 'grad'
| 'turn'
| 'ms'
| 's'
| 'hz'
| 'khz'
;
Dimension : Number Unit;
And things still parse but I don't get any more context about what the units are - the Dimension is still a single TerminalNode. Is there a way to deal with this without having to pull apart the full token string?
You will want to do as little as possible in the lexer:
NUMBER
: Dash? Dot Digit+ { atNumber(); }
| Dash? Digit+ ( Dot Digit* )? { atNumber(); }
;
UNIT
: { aftNumber() }?
( 'px' | 'cm' | 'mm' | 'in'
| 'pt' | 'pc' | 'em' | 'ex'
| 'deg' | 'rad' | 'grad' | '%'
| 'ms' | 's' | 'hz' | 'khz'
)
;
The trick is to produce the NUMBER and UNIT as separate tokens, yet limited to the required ordering. The actions in the NUMBER rule just set a flag and the UNIT predicate ensures that a UNIT can only follow a NUMBER:
protected void atNumber() {
_number = true;
}
protected boolean aftNumber() {
if (_number && Character.isWhitespace(_input.LA(1))) return false;
if (!_number) return false;
_number = false;
return true;
}
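The flag-and-predicate idea can be tried outside ANTLR. Below is a rough sketch of a hand-written tokenizer in JavaScript. It is not the ANTLR lexer: the tokenize function, the OTHER token type and the afterNumber variable are assumptions made for illustration, with afterNumber playing the role of the _number flag.

```javascript
// Hypothetical mini-tokenizer mirroring the NUMBER/UNIT idea (not ANTLR itself).
const NUMBER = /^-?(?:\.\d+|\d+(?:\.\d*)?)/;
const UNIT = /^(?:px|cm|mm|in|pt|pc|em|ex|deg|rad|grad|%|ms|s|hz|khz)/i;

function tokenize(input) {
  const tokens = [];
  let afterNumber = false; // set by NUMBER, consumed by UNIT
  let rest = input;
  while (rest.length > 0) {
    const num = NUMBER.exec(rest);
    // A UNIT is only considered directly after a NUMBER.
    const unit = afterNumber ? UNIT.exec(rest) : null;
    let text;
    if (num) {
      text = num[0];
      tokens.push({ type: "NUMBER", text });
      afterNumber = true;
    } else if (unit) {
      text = unit[0];
      tokens.push({ type: "UNIT", text });
      afterNumber = false;
    } else {
      // Any other character, including whitespace, clears the flag.
      text = rest[0];
      if (!/\s/.test(text)) tokens.push({ type: "OTHER", text });
      afterNumber = false;
    }
    rest = rest.slice(text.length);
  }
  return tokens;
}

console.log(tokenize("12.5em")); // NUMBER "12.5" followed by UNIT "em"
console.log(tokenize("12 px"));  // NUMBER "12", then "p" and "x" as OTHER
```

The second call shows the ordering constraint: once whitespace separates the number from the letters, no UNIT token is produced.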
The parser rule is trivial, but preserves the detail required:
number
: NUMBER UNIT?
;
Use a tree-walk, parse the NUMBER to a Double and an enum (or equivalent) to provide the semantic UNIT characterization:
public enum Unit {
CM("cm", true, true), // 1cm = 96px/2.54
MM("mm", true, true),
IN("in", true, true), // 1in = 2.54cm = 96px
PX("px", true, true), // 1px = 1/96 in
PT("pt", true, true), // 1pt = 1/72 in
EM("em", false, true), // element font size
REM("rem", false, true), // root element font size
EX("ex", true, true), // element font x-height
CAP("cap", true, true), // element font nominal capital letters height
PER("%", false, true),
DEG("deg", true, false),
RAD("rad", true, false),
GRAD("grad", true, false),
MS("ms", true, false),
S("s", true, false),
HZ("hz", true, false),
KHZ("khz", true, false),
NONE(Strings.EMPTY, true, false), // 'no unit specified'
INVALID(Strings.UNKNOWN, true, false);
public final String symbol;
public final boolean abs;
public final boolean len;
private Unit(String symbol, boolean abs, boolean len) {
this.symbol = symbol;
this.abs = abs;
this.len = len;
}
public boolean isAbsolute() { return abs; }
public boolean isLengthUnit() { return len; }
// call from the visitor to resolve from `UNIT` to Unit
public static Unit find(TerminalNode node) {
if (node == null) return NONE;
for (Unit unit : values()) {
if (unit.symbol.equalsIgnoreCase(node.getText())) return unit;
}
return INVALID;
}
@Override
public String toString() {
return symbol;
}
}
Is a recursive proc possible in Crystal?
Something like a lambda in Ruby
I'm trying to write a Y combinator in Crystal, something like this Ruby one:
puts -> {
fact_improver = ->(partial) {
-> (n) { n.zero? ? 1 : n * partial.(n-1) }
}
y = ->(f) {
->(x) { f.(->(v) { x.(x).(v) }) }.(
->(x) { f.(->(v) { x.(x).(v) }) }
)
}
fact = y.(fact_improver)
fact = fact_improver.(fact)
fact.(100)
}.()
The above code was taken from Y Not- Adventures in Functional Programming
As far as I know, Crystal does not have recursive procs. But to create a Y combinator you don't need a recursive proc. Actually, according to the definition:
In functional programming, the Y combinator can be used to formally define recursive functions in a programming language that doesn't support recursion.
Here is an example of Y combinator written in Crystal using recursive types:
alias T = Int32
alias Func = T -> T
alias FuncFunc = Func -> Func
alias RecursiveFunction = RecursiveFunction -> Func
fact_improver = ->(partial : Func) {
->(n : T) { n.zero? ? 1 : n * partial.call(n - 1) }
}
y = ->(f : FuncFunc) {
g = ->(r : RecursiveFunction) { f.call(->(x : T) { r.call(r).call(x) }) }
g.call(g)
}
fact = y.call(fact_improver)
fact = fact_improver.call(fact)
fact.call(5) # => 120
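For comparison, the same structure (an improver function plus a self-applying g) can be sketched in JavaScript, where no type aliases are needed. The names factImprover, y and g simply mirror the Crystal code above.

```javascript
// Takes a "partial" factorial and returns a slightly better one.
const factImprover = (partial) => (n) => (n === 0 ? 1 : n * partial(n - 1));

// Y combinator: g receives itself as r, so no function refers to its own name.
const y = (f) => {
  const g = (r) => f((x) => r(r)(x));
  return g(g);
};

const fact = y(factImprover);
console.log(fact(5)); // 120
```

The inner lambda (x) => r(r)(x) delays the self-application until the function is actually called, which is what keeps g(g) from looping forever in a strictly evaluated language.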
UPDATE: it is possible to create a recursive proc in Crystal with the uninitialized keyword:
g = uninitialized Int32 -> Int32
g = ->(n : Int32) { n.zero? ? 1 : n * g.call(n - 1) }
g.call(5) # => 120
Thanks to @mgarciaisaia for the comment.
Suppose I have the following local macro:
loc a = 12.000923
I would like to get the decimal position of the first non-zero decimal (4 in this example).
There are many ways to achieve this. One is to treat a as a string and to find the position of .:
loc a = 12.000923
loc b = strpos(string(`a'), ".")
di "`b'"
From here one could loop through the decimals and count until reaching the first non-zero element. Of course this doesn't seem to be a very elegant approach.
Can you suggest a better way to deal with this? Regular expressions perhaps?
Well, I don't know Stata, but according to the documentation, \.(0+)? is supported, and it shouldn't be hard to convert this two-line JavaScript function to Stata.
It returns the position of the first nonzero decimal or -1 if there is no decimal.
function getNonZeroDecimalPosition(v) {
var v2 = v.replace(/\.(0+)?/, "")
return v2.length !== v.length ? v.length - v2.length : -1
}
Explanation
We remove from input string a dot followed by optional consecutive zeros.
The difference between the lengths of original input string and this new string gives the position of the first nonzero decimal
Demo
Sample Snippet
function getNonZeroDecimalPosition(v) {
var v2 = v.replace(/\.(0+)?/, "")
return v2.length !== v.length ? v.length - v2.length : -1
}
var samples = [
"loc a = 12.00012",
"loc b = 12",
"loc c = 12.012",
"loc d = 1.000012",
"loc e = -10.00012",
"loc f = -10.05012",
"loc g = 0.0012"
]
samples.forEach(function(sample) {
console.log(getNonZeroDecimalPosition(sample))
})
You can do this in mata in one line and without using regular expressions:
foreach x in 124.000923 65.020923 1.000022030 0.0090843 .00000425 {
mata: selectindex(tokens(tokens(st_local("x"), ".")[selectindex(tokens(st_local("x"), ".") :== ".") + 1], "0") :!= "0")[1]
}
4
2
5
3
6
Below, you can see the steps in detail:
. local x = 124.000823
. mata:
: /* Step 1: break Stata's local macro x in tokens using . as a parsing char */
: a = tokens(st_local("x"), ".")
: a
1 2 3
+----------------------------+
1 | 124 . 000823 |
+----------------------------+
: /* Step 2: tokenize the string in a[1,3] using 0 as a parsing char */
: b = tokens(a[3], "0")
: b
1 2 3 4
+-------------------------+
1 | 0 0 0 823 |
+-------------------------+
: /* Step 3: find which values are different from zero */
: c = b :!= "0"
: c
1 2 3 4
+-----------------+
1 | 0 0 0 1 |
+-----------------+
: /* Step 4: find the first index position where this is true */
: d = selectindex(c :!= 0)[1]
: d
4
: end
You can also find the position of the string of interest in Step 2 using the
same logic.
This is the index value after the one for .:
. mata:
: k = selectindex(a :== ".") + 1
: k
3
: end
In which case, Step 2 becomes:
. mata:
:
: b = tokens(a[k], "0")
: b
1 2 3 4
+-------------------------+
1 | 0 0 0 823 |
+-------------------------+
: end
For unexpected cases without decimal:
foreach x in 124.000923 65.020923 1.000022030 12 0.0090843 .00000425 {
if strmatch("`x'", "*.*") mata: selectindex(tokens(tokens(st_local("x"), ".")[selectindex(tokens(st_local("x"), ".") :== ".") + 1], "0") :!= "0")[1]
else display " 0"
}
4
2
5
0
3
6
A straightforward answer uses regular expressions and string functions.
One can select all decimals, find the first non 0 decimal, and finally find its position:
loc v = "123.000923"
loc v2 = regexr("`v'", "^[0-9]*[/.]", "") // 000923
loc v3 = regexr("`v'", "^[0-9]*[/.][0]*", "") // 923
loc first = substr("`v3'", 1, 1) // 9
loc first_pos = strpos("`v2'", "`first'") // 4: position of 9 in 000923
di "`v2'"
di "`v3'"
di "`first'"
di "`first_pos'"
Which in one step is equivalent to:
loc first_pos2 = strpos(regexr("`v'", "^[0-9]*[/.]", ""), substr(regexr("`v'", "^[0-9]*[/.][0]*", ""), 1, 1))
di "`first_pos2'"
An alternative suggested in another answer is to compare the length of the decimals block with its leading 0s stripped against the length of the unstripped block.
In one step this is:
loc first_pos3 = strlen(regexr("`v'", "^[0-9]*[/.]", "")) - strlen(regexr("`v'", "^[0-9]*[/.][0]*", "")) + 1
di "`first_pos3'"
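The "count the leading zeros of the decimal block" idea can also be written with a single capture group. The sketch below is in JavaScript rather than Stata. The function name firstNonZeroDecimal is made up, and returning 0 when there is no non-zero decimal is an assumption chosen to match the convention of the Mata answer above.

```javascript
// Returns the 1-based position of the first non-zero decimal digit,
// or 0 when the string has no decimal part or only zero decimals.
function firstNonZeroDecimal(v) {
  // Group 1 captures the run of zeros between the dot and the first 1-9 digit.
  const m = /\.(0*)[1-9]/.exec(v);
  return m ? m[1].length + 1 : 0;
}

["124.000923", "65.020923", "1.000022030", "12", "0.0090843", ".00000425"]
  .forEach((s) => console.log(s, firstNonZeroDecimal(s)));
// prints positions 4, 2, 5, 0, 3, 6
```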
Not using regex but log10 instead (which treats a number like a number), this function will:
For numbers >= 1 or numbers <= -1, return with a positive number the number of digits to the left of the decimal.
Or (and more specifically to what you were asking), for numbers between 1 and -1, return with a negative number the number of digits to the right of the decimal where the first non-zero number occurs.
digitsFromDecimal = (n) => {
dFD = Math.log10(Math.abs(n)) | 0;
if (n >= 1 || n <= -1) { dFD++; }
return dFD;
}
var x = [118.8161330, 11.10501660, 9.254180571, -1.245501523, 1, 0, 0.864931613, 0.097007836, -0.010880074, 0.009066729];
x.forEach(element => {
console.log(`${element}, Digits from Decimal: ${digitsFromDecimal(element)}`);
});
// Output
// 118.816133, Digits from Decimal: 3
// 11.1050166, Digits from Decimal: 2
// 9.254180571, Digits from Decimal: 1
// -1.245501523, Digits from Decimal: 1
// 1, Digits from Decimal: 1
// 0, Digits from Decimal: 0
// 0.864931613, Digits from Decimal: 0
// 0.097007836, Digits from Decimal: -1
// -0.010880074, Digits from Decimal: -1
// 0.009066729, Digits from Decimal: -2
Pearly's Mata solution is very likable, but attention should be paid to the "unexpected" case of no decimal at all.
Besides, a regular expression is not a bad choice when it can be written as a memorable one-liner.
loc v = "123.000923"
capture local x = regexm("`v'","(\.0*)")*length(regexs(0))
The code below tests more values of v.
foreach v in 124.000923 605.20923 1.10022030 0.0090843 .00000425 12 .000125 {
capture local x = regexm("`v'","(\.0*)")*length(regexs(0))
di "`v': The wanted number = `x'"
}
In Go, the strings.SplitAfter function splits text after a given separator into a slice, but I didn't find a way for the Regexp type to split text after matches. Is there a way to do that?
Example :
var text string = "1.2.3.4.5.6.7.8.9"
res := strings.Split(text, ".")
fmt.Println(res) // print [1 2 3 4 5 6 7 8 9]
res = strings.SplitAfter(text, ".")
fmt.Println(res) // print [1. 2. 3. 4. 5. 6. 7. 8. 9]
First of all, your regex "." is wrong for this splitAfter function. You want a number followed by ".", so the regex is "[1-9]".
The function you are looking might look like this:
func splitAfter(s string, re *regexp.Regexp) (r []string) {
re.ReplaceAllStringFunc(s, func(x string) string {
s = strings.Replace(s,x,"::"+x,-1)
return s
})
for _, x := range strings.Split(s,"::") {
if x != "" {
r = append(r, x)
}
}
return
}
Then:
fmt.Println(splitAfter("healthyRecordsMetric",regexp.MustCompile("[A-Z]")))
fmt.Println(splitAfter("healthyrecordsMETetric",regexp.MustCompile("[A-Z]")))
fmt.Println(splitAfter("HealthyHecord Hetrics",regexp.MustCompile("[A-Z]")))
fmt.Println(splitAfter("healthy records metric",regexp.MustCompile("[A-Z]")))
fmt.Println(splitAfter("1.2.3.4.5.6.7.8.9",regexp.MustCompile("[1-9]")))
[Healthy Records Metric]
[healthy Records Metric]
[healthyrecords M E Tetric]
[Healthy Hecord Hetrics]
[healthy records metric]
[1. 2. 3. 4. 5. 6. 7. 8. 9]
Good luck!
The Regexp type itself does not have a method that does exactly that, but it's quite simple to write a function that implements what you're asking for based on Regexp functionality:
func SplitAfter(s string, re *regexp.Regexp) []string {
var (
r []string
p int
)
is := re.FindAllStringIndex(s, -1)
if is == nil {
return append(r, s)
}
for _, i := range is {
r = append(r, s[p:i[1]])
p = i[1]
}
return append(r, s[p:])
}
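The index-based approach above (cut the string at the end offset of every match) carries over to other regex engines. As a rough sketch, here is the same logic in JavaScript using String.prototype.matchAll. The function name splitAfter mirrors the Go version, and forcing the g flag is an assumption needed because matchAll requires a global regex.

```javascript
function splitAfter(s, re) {
  // matchAll requires the global flag, so add it when missing.
  const flags = re.flags.includes("g") ? re.flags : re.flags + "g";
  const global = new RegExp(re.source, flags);
  const parts = [];
  let pos = 0; // start of the piece currently being collected
  for (const m of s.matchAll(global)) {
    const end = m.index + m[0].length; // offset just past this match
    parts.push(s.slice(pos, end));
    pos = end;
  }
  parts.push(s.slice(pos)); // remainder after the last match
  return parts;
}

console.log(splitAfter("1.2.3.4.5.6.7.8.9", /\./));
// [ '1.', '2.', '3.', '4.', '5.', '6.', '7.', '8.', '9' ]
```

As with strings.SplitAfter, a string that ends in a match yields a trailing empty piece, and a string with no match comes back as a single-element result.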
Here I left a program to play with it.
I am working with ANTLR to support type checking, and I am stuck at one point. I will try to explain it with an example grammar. Suppose that I have the following:
@members {
private java.util.HashMap<String, String> mapping = new java.util.HashMap<String, String>();
}
var_dec
: type_specifiers d=dec_list? SEMICOLON
{
mapping.put($d.ids.get(0).toString(), $type_specifiers.type_name);
System.out.println("identext = " + $d.ids.get(0).toString() + " - " + $type_specifiers.type_name);
};
type_specifiers returns [String type_name]
: 'int' { $type_name = "int";}
| 'float' {$type_name = "float"; }
;
dec_list returns [List ids]
: ( a += ID brackets*) (COMMA ( a += ID brackets* ) )*
{$ids = $a;}
;
brackets : LBRACKET (ICONST | ID) RBRACKET;
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
LBRACKET : '[';
RBRACKET : ']';
In rule dec_list, you will see that I am returning List with ids. However, in var_dec when I try to put the first element of the list (I am using only get(0) just to see the return value from dec_list rule, I can iterate it later, that's not my point) into mapping I get a whole string like
[@4,6:6='a',<17>,1:6]
for an input
int a, b;
What I am trying to do is to get text of each ID, in this case a and b in the list of index 0 and 1, respectively.
Does anyone have any idea?
The += operator creates a List of Tokens, not just the text these Tokens match. You'll need to initialize the List in the @init{...} block of the rule and add the inner-text of the tokens yourself.
Also, you don't need to do this:
type_specifiers returns [String type_name]
: 'int' { $type_name = "int";}
| ...
;
simply access type_specifiers's text attribute from the rule you use it in and remove the returns statement, like this:
var_dec
: t=type_specifiers ... {System.out.println($t.text);}
;
type_specifiers
: 'int'
| ...
;
Try something like this:
grammar T;
var_dec
: type dec_list? ';'
{
System.out.println("type = " + $type.text);
System.out.println("ids = " + $dec_list.ids);
}
;
type
: Int
| Float
;
dec_list returns [List ids]
@init{$ids = new ArrayList();}
: a=ID {$ids.add($a.text);} (',' b=ID {$ids.add($b.text);})*
;
Int : 'int';
Float : 'float';
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
Space : ' ' {skip();};
which will print the following to the console:
type = int
ids = [a, b, foo]
If you run the following class:
import org.antlr.runtime.*;
public class Main {
public static void main(String[] args) throws Exception {
TLexer lexer = new TLexer(new ANTLRStringStream("int a, b, foo;"));
TParser parser = new TParser(new CommonTokenStream(lexer));
parser.var_dec();
}
}
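The difference between a list of tokens and a list of their texts can be illustrated without ANTLR. In the sketch below, FakeToken is a made-up stand-in for an ANTLR Token, with a simplified debug string. Only the distinction between toString() and the text is the point.

```javascript
// Hypothetical stand-in for an ANTLR Token: toString() gives a debug form,
// while the text property holds the matched characters.
class FakeToken {
  constructor(index, text) {
    this.index = index;
    this.text = text;
  }
  toString() {
    return `[@${this.index},'${this.text}']`;
  }
}

const ids = [new FakeToken(4, "a"), new FakeToken(7, "b")];

console.log(String(ids[0])); // [@4,'a'] (the debug form, as in the question)
console.log(ids[0].text);    // a (what the mapping needs)

// Build the name -> type mapping from the token texts.
const mapping = new Map();
for (const id of ids) mapping.set(id.text, "int");
```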