I have two optional values, and when both are present, a comma needs to be in between them. If one or both values are present, there may be a trailing comma, but if no values are present, no comma is allowed.
Valid examples:
(first,second,)
(first,second)
(first,)
(first)
(second,)
(second)
()
Invalid examples:
(first,first,)
(first,first)
(second,second,)
(second,second)
(second,first,)
(second,first)
(,first,second,)
(,first,second)
(,first,)
(,first)
(,second,)
(,second)
(,)
(,first,first,)
(,first,first)
(,second,second,)
(,second,second)
(,second,first,)
(,second,first)
I have EBNF code (XML-flavored) that suffices, but is there a way I can simplify it? I would like to make it more readable / less repetitive.
tuple ::= "(" ( ( "first" | "second" | "first" "," "second" ) ","? )? ")"
If it’s easier to understand in regex, here’s the equivalent code, but I need a solution in EBNF.
/\(((first|second|first\,second)\,?)?\)/
And here’s a helpful railroad diagram:
This question becomes even more complex when we abstract it to three terms: "first", "second", and "third" are all optional, but they must appear in that order, separated by commas, with an optional trailing comma. The best I can come up with is a brute-force method:
"(" (("first" | "second" | "third" | "first" "," "second" | "first" "," "third" | "second" "," "third" | "first" "," "second" "," "third") ","?)? ")"
Clearly, a solution involving O(2^n) complexity is not very desirable.
I found a way to simplify it, but not by much:
"(" ( ("first" ("," "second")? | "second") ","? )? ")"
For the three-term solution, take the two-term solution and prepend a first term:
"(" (("first" ("," ("second" ("," "third")? | "third"))? | "second" ("," "third")? | "third") ","?)? ")"
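For any (n+1)-term solution, take the n-term solution and prepend a first term. If each n-term solution is given its own named rule, the grammar grows as O(n), which is significantly better than O(2^n). Written inline as above, the n-term solution appears twice at every step, so the expression itself still roughly doubles with each added term.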
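The prepend-a-term construction can be sketched programmatically. The Python helper below is only an illustration: the name `ordered_optional_pattern` is made up here, and it emits a regex rather than EBNF. It follows the same recursion, building each step from the pattern for the later terms. Because that tail is inlined twice per step, the generated pattern text grows quickly; it is meant for checking the idea on a few terms, not as a compact grammar.

```python
import re

def ordered_optional_pattern(terms):
    """Regex for "(" + optional, ordered, comma-separated terms + optional trailing comma + ")"."""
    tail = ""  # matches a non-empty selection of the terms processed so far
    for term in reversed(terms):
        head = re.escape(term)
        # Either this term, optionally followed by "," and later terms, or later terms only.
        tail = f"(?:{head}(?:,{tail})?|{tail})" if tail else head
    body = f"(?:{tail},?)?" if tail else ""
    return re.compile(r"\(" + body + r"\)")

two = ordered_optional_pattern(["first", "second"])
three = ordered_optional_pattern(["first", "second", "third"])

valid = ["(first,second,)", "(first,second)", "(first,)", "(first)",
         "(second,)", "(second)", "()"]
invalid = ["(first,first)", "(second,first)", "(,first)", "(,)"]

print(all(two.fullmatch(s) for s in valid))      # True
print(any(two.fullmatch(s) for s in invalid))    # False
print(bool(three.fullmatch("(first,third,)")))   # True
print(bool(three.fullmatch("(third,second)")))   # False
```

`fullmatch` is used so that the whole string must fit the pattern, which plays the role of the anchors in the regex shown in the question.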
This expression might help you design a better one. You can do this using only capturing groups, matching from left to right against your possible inputs, maybe similar to this:
\((first|second|)(,|)(second|)([\)|,]+)
I'm just guessing that you wish to capture the middle comma. This may not be the exact expression you want, but it might show how this can be done in a simple way:
^(?!\(,)\((first|)(,|)(second|)([\)|,]+)$
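As a quick sanity check, the expression above can be run against the examples from the question. This is a sketch in Python, whose `re` module accepts the expression as written. The `unlisted` strings are chosen here for illustration and do not come from the question. They suggest that the permissive groups and the `[\)|,]+` class (which also contains a literal `|`) let through some inputs beyond the listed ones, so it may be worth testing strings outside the question's lists too.

```python
import re

# The expression from the answer, copied unchanged.
pattern = re.compile(r"^(?!\(,)\((first|)(,|)(second|)([\)|,]+)$")

listed_valid = ["(first,second,)", "(first,second)", "(first,)", "(first)",
                "(second,)", "(second)", "()"]
listed_invalid = ["(first,first,)", "(second,first)", "(,first,second)", "(,)"]
# Not in the question's lists; picked here to probe the edges of the pattern.
unlisted = ["(firstsecond)", "(first,,)", "(first|)"]

for label, samples in [("valid", listed_valid),
                       ("invalid", listed_invalid),
                       ("unlisted", unlisted)]:
    # One boolean per sample: True when the pattern matches the whole string.
    print(label, [bool(pattern.match(s)) for s in samples])
```

All listed valid strings match and the listed invalid ones do not, while the three unlisted strings also match.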
You can add more boundaries to the left and right of your expression, maybe similar to this expression:
This graph shows how the second expression would work:
Performance
This JavaScript snippet measures the performance of the second expression with a simple loop of 1 million iterations, and shows how it captures first and second using $1 and $3.
const repeat = 1000000;
const start = Date.now();
let match;
for (let i = 0; i < repeat; i++) {
  const string = "(first,second,)";
  // The regex literal is evaluated on every iteration, so its creation is included in the timing.
  const regex = /^(?!\(,)\((first|second|)(,|)(second|)([\)|,]+)$/gms;
  match = string.replace(regex, "$1 and $3");
}
const end = Date.now() - start; // elapsed time in milliseconds
console.log("YAAAY! \"" + match + "\" is a match 💚💚💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");
I'm not familiar with EBNF but I am familiar with BNF and parser grammars. The following is just a variation of what you have based on my own regex. I am assuming the unquoted parens are not considered tokens and are used to group related elements.
tuple ::= ( "(" ( "first,second" | "first" | "second" ) ","? ")" ) | "()"
It matches on either (first,second or (first or (second
Then it matches on an optional ,
Followed by a closing parens. )
or the empty parens grouping. ()
But I doubt this is an improvement.
Here is my Java test code. The first two lines of strings in the test data match. The others do not.
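import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; it only wraps the snippet so that it compiles.
public class TupleTest {
    public static void main(String[] args) {
        String[] testdata = {
            "(first,second,)", "(first,second)", "(first,)", "(first)",
            "(second,)", "(second)", "()",
            "(first,first,)", "(first,first)", "(second,second,)",
            "(second,second)", "(second,first,)", "(second,first)",
            "(,first,second,)", "(,first,second)", "(,first,)", "(,first)",
            "(,second,)", "(,second)", "(,)", "(,first,first,)",
            "(,first,first)", "(,second,second,)", "(,second,second)",
            "(,second,first,)", "(,second,first)"
        };
        // One of the three term combinations plus an optional trailing comma, or "()".
        String reg = "\\(((first,second)|first|second),?\\)|\\(\\)";
        Pattern p = Pattern.compile(reg);
        for (String t : testdata) {
            Matcher m = p.matcher(t);
            // matches() requires the entire string to fit the pattern.
            if (m.matches()) {
                System.out.println(t);
            }
        }
    }
}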
I want to remove some characters from a textbox. It works, but when I try to replace the "[" character it gives an error. Why?
Return Regex.Replace(html, "[", "").Replace(",", " ").Replace("]", "").Replace(Chr(34), " ")
When I delete the "[", "").Replace( part, it works great:
Return Regex.Replace(html, ",", " ").Replace("]", "").Replace(Chr(34), " ")
The problem is that the [ character has a special meaning in regex (it opens a character class), so it must be escaped to be used literally in a pattern. To escape it, add a \ before the character.
Therefore, this would be your proper regex code:
Return Regex.Replace(html, "\[", "").Replace(",", " ").Replace("]", "").Replace(Chr(34), " ")
Because [ is a reserved character in regex patterns. When the search text is meant to be matched literally, escape it with Regex.Escape(). This will find all reserved characters and escape them with a backslash.
Dim searchPattern = Regex.Escape("[")
Return Regex.Replace(html, searchPattern, ""). 'etc...
But why do you need to use regex anyway? Here's a better way of doing it, I think, using StringBuilder:
Dim sb = New StringBuilder(html) _
.Replace("[", "") _
.Replace(",", " ") _
.Replace("]", "") _
.Replace(Chr(34), " ")
Return sb.ToString()
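The two points made in these answers (a bare [ is not a valid pattern, and fixed strings need no regex at all) are not specific to .NET. The following is a small sketch in Python, used purely as an illustration: `re.escape` is Python's counterpart to the idea behind Regex.Escape, not the .NET API itself, and the sample `html` value is invented for the example.

```python
import re

html = '["a","b"]'  # made-up sample input

# An unescaped "[" opens a character class, so the pattern fails to compile.
try:
    re.sub("[", "", html)
    compiled = True
except re.error:
    compiled = False
print(compiled)                            # False

# Escaping turns "[" into a literal character.
print(re.escape("["))                      # \[
print(re.sub(re.escape("["), "", html))    # "a","b"]

# For fixed strings, plain replacement avoids regex entirely.
cleaned = (html.replace("[", "")
               .replace(",", " ")
               .replace("]", "")
               .replace('"', " "))
print(repr(cleaned))                       # ' a   b '
```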
I was handed some legacy code, and first I want to change
(int)a + b;
into
static_cast<int>(a) + b;
There are a lot of them and doing them manually is very time consuming. Is there a way to use vim to make this happen?
I tried something like
:%s/\(int\).* /static_cast<int>(\2)/g
but it doesn't work. Please advise.
Try this:
:%s/(\(.*\))\([^ ]*\)/static_cast<\1>(\2)/g
This regex, as per your question, assumes that there will be a space after the variable name.
Example: for the following test data:
(int)a + b
(float)x * y
(int)z+m
result will be
static_cast<int>(a) + b
static_cast<float>(x) * y
static_cast<int>(z+m)
Explaining the regex
(\(.*\)) - Match whatever is inside () and capture it
\([^ ]*\) - followed by anything which is not a space and capture it
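To see the two capture groups at work outside Vim, the same substitution can be reproduced in Python. This is a sketch under one assumption: the pattern is respelled for Python's `re` syntax, where `\(` is a literal parenthesis and `( ... )` captures, the reverse of Vim's default syntax.

```python
import re

lines = ["(int)a + b", "(float)x * y", "(int)z+m"]

# Python spelling of the Vim pattern (\(.*\))\([^ ]*\):
# group 1 is the type inside the parentheses, group 2 runs up to the next space.
c_cast = re.compile(r"\((.*)\)([^ ]*)")

converted = [c_cast.sub(r"static_cast<\1>(\2)", line) for line in lines]
for line in converted:
    print(line)
# static_cast<int>(a) + b
# static_cast<float>(x) * y
# static_cast<int>(z+m)
```

Because `.*` is greedy, a line containing two casts, such as `(int)a + (int)b`, is captured as one span from the first `(` to the last `)`. Narrowing the first group to something like `[^()]*` is one possible tweak, offered here as an assumption rather than as part of the answer above.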
You can use this:
%s/(int)\(a\)/static_cast<int>(\1)/g
This assumes the variable name is always a. If it is not, replace a with a pattern such as [a-z]\+ (one or more lowercase letters); [a-z] on its own matches only a single character.
I have several mappings for this task in lh-cpp.
In that case, it'll be ,,sc, or ,,rc, or ,,dc. (here, , is actually my <localleader>).
It's actually implemented as:
function! s:ConvertToCPPCast(cast_type)
  " Extract text to convert
  let save_a = @a
  normal! gv"ay
  " Strip the possible brackets around the expression
  let expr = matchstr(@a, '^(.\{-})\zs.*$')
  let expr = substitute(expr, '^(\(.*\))$', '\1', '')
  "
  " Build the C++-casting from the C casting
  let new_cast = substitute(@a, '(\(.\{-}\)).*',
        \ a:cast_type.'<\1>('.escape(expr, '\&').')', '')
  " Do the replacement
  exe "normal! gvs".new_cast."\<esc>"
  let @a = save_a
endfunction
vnoremap <buffer> <LocalLeader><LocalLeader>dc
\ <c-\><c-n>:'<,'>call <sid>ConvertToCPPCast('dynamic_cast')<cr>
nmap <buffer> <LocalLeader><LocalLeader>dc viw<LocalLeader><LocalLeader>dc
...