Can a regular expression in GeSHi include a keyword? - regex

I'm working on a language file for GeSHi and would like to highlight the (unknown) word following a keyword and some whitespace. I have tried something like
$language_data = array (
...
'KEYWORDS' => array(
1 => array(
'keyword1', 'keyword2'
)
),
...
'REGEXPS' => array(
0 => array(
GESHI_SEARCH => '(((keyword1|keyword2)\s+)([a-zA-Z_][a-zA-Z0-9_]*))',
GESHI_REPLACE => '\\4',
GESHI_MODIFIERS => '',
GESHI_BEFORE => '\\2',
GESHI_AFTER => ''
)
),
...
)
But the regular expression never matches. I assume that the keyword is already consumed by the parser when the regular expression is tested.
A lookbehind does not work in GESHI_SEARCH because the length of the real keywords and the separating whitespace is not fixed (see What's the technical reason for "lookbehind assertion MUST be fixed length" in regex?).
How can I highlight a word after a keyword in GeSHi?

Try this regex (it may work):
GeSHi\s*([\w]+)
Demo: https://regex101.com/r/SLzJ77/1
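As a sanity check outside GeSHi, the pattern from the question can be run in a plain regex engine. The sketch below uses Python's re module, and the helper name word_after_keyword is made up for illustration. It only shows that the pattern itself is well formed and that group 4 holds the identifier; it says nothing about the order in which GeSHi applies KEYWORDS and REGEXPS.

```python
import re

# Pattern copied from the question; group 2 is "keyword + whitespace",
# group 4 is the identifier that follows.
PATTERN = re.compile(r"(((keyword1|keyword2)\s+)([a-zA-Z_][a-zA-Z0-9_]*))")

def word_after_keyword(line):
    """Return the identifier following keyword1/keyword2, or None (hypothetical helper)."""
    m = PATTERN.search(line)
    return m.group(4) if m else None

print(word_after_keyword("keyword1   myName = 3"))
print(word_after_keyword("other myName"))
```

If this matches in isolation but not inside the language file, the difference most likely lies in how GeSHi preprocesses the source before the regex is applied.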

Related

Regex constraint to match MD5 hash in Zend routing

Below is the routing defined in my module.
'download' => array(
'type' => 'Segment',
'options' => array(
'route' => '/download[/:transferId][/:receiverId]',
'constraints' =>array(
'transferId' => '/^[a-f0-9]{32}$/i',
'receiverId' => '/^[a-f0-9]{32}$/i'
),
'defaults' => array(
'controller' => 'FileServer\Controller\Web',
'action' => 'download',
)
)
),
I expected these URLs to match: http://localhost/download/229def85ea0ccfcd6809053cb8fc4911 and http://localhost/download/229def85ea0ccfcd6809053cb8fc4911/229def85ea0ccfcd6809053cb8fc4911, but neither is matched.
Apart from the constraint /^[a-f0-9]{32}$/i, I also tried these, but none of them works:
^[a-fA-F0-9]{32}$
[a-fA-F0-9]{32}
[a-f0-9]{32}
What's wrong?
You probably want:
'constraints' => array(
'transferId' => '[a-f0-9]{32}',
'receiverId' => '[a-f0-9]{32}'
),
(Leave out the delimiters, anchors, and flags: ZF2 combines all the constraints into a single regex pattern.)
See the docs: http://framework.zend.com/manual/current/en/modules/zend.mvc.routing.html#zend-mvc-router-http-segment for more examples.
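To see why delimiters and anchors inside a constraint break the match, here is a rough Python sketch of the idea. The function build_route_regex is a hypothetical stand-in, not ZF2's actual implementation; it simply embeds each constraint as a named group in one larger pattern for the route /download[/:transferId][/:receiverId].

```python
import re

def build_route_regex(constraints):
    """Hypothetical stand-in for a segment router: embed every constraint in one pattern."""
    optional_parts = "".join(
        "(?:/(?P<%s>%s))?" % (name, pattern) for name, pattern in constraints.items()
    )
    return "^/download" + optional_parts + "$"

def route_matches(constraints, path):
    return re.match(build_route_regex(constraints), path) is not None

HASH = "229def85ea0ccfcd6809053cb8fc4911"

good = {"transferId": "[a-f0-9]{32}", "receiverId": "[a-f0-9]{32}"}
bad = {"transferId": "/^[a-f0-9]{32}$/i", "receiverId": "/^[a-f0-9]{32}$/i"}

print(route_matches(good, "/download/" + HASH))
print(route_matches(good, "/download/" + HASH + "/" + HASH))
print(route_matches(bad, "/download/" + HASH))
```

In the bad case the literal slashes, ^, and $ end up in the middle of the combined pattern, where they can never match.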
"Apart from this regex in the constraint /^[a-f0-9]{32}$/i I tried these as well but it's not working: ^[a-fA-F0-9]{32}$ [a-fA-F0-9]{32}"
This didn't work because [a-f0-9]{32} matches only a run of exactly 32 letters (a to f) or digits, whereas your string also contains a colon (:) and forward slashes (/).
You can make use of this pattern
http:\/\/\S+
this strictly ensures that http:// is present in the match
see Demo
if you want a more general pattern, use this
[a-z]+:\/\/\S+
see Demo

Regular expression to check "#" at the beginning of a word and it should be only once in a word

I need to match the # symbol using a regex. I've tried the regex /\B#(\w*)$/. It works, but in my use case # should appear only once, at the beginning of a word, and not more than once:
#test - Right one.
##test - Not a right one.
Here is the fiddle: http://jsfiddle.net/Kurshith/deepk1dm/1/. When you type # in the text area, it triggers the auto-complete, but it also triggers if you type # twice. In my case it should trigger only for input such as #test, #test #hello, or #. Please help me.
/^#[^#]*$/ will do. Your case can be stated as: begin with # and then anything else that is not # with 0 or more occurrences, and until the end of the string.
Testing with Ruby's irb:
irb(main):003:0> /^#[^#]*$/.match("#test")
=> #<MatchData "#test">
irb(main):004:0> /^#[^#]*$/.match("##test")
=> nil
irb(main):005:0> /^#[^#]*$/.match("#test#")
=> nil
irb(main):006:0> /^#[^#]*$/.match("#test#test")
=> nil
irb(main):007:0> /^#[^#]*$/.match("test")
=> nil
irb(main):008:0> /^#[^#]*$/.match("t#est")
=> nil
irb(main):009:0> /^#[^#]*$/.match("#")
=> #<MatchData "#">
irb(main):010:0> /^#[^#]*$/.match("##")
=> nil
I would use a whitespace boundary. Add anchors as necessary.
(?<!\S)#(\w*)(?!\S)
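The two suggestions behave differently on input with several tags, which matters for the "#test #hello" case from the question. A small Python sketch (the names WHOLE_STRING, TAG, and tags are chosen here for illustration):

```python
import re

WHOLE_STRING = re.compile(r"^#[^#]*$")       # first answer: whole input, single '#'
TAG = re.compile(r"(?<!\S)#(\w*)(?!\S)")     # second answer: whitespace boundaries

def tags(text):
    """Return the word part of every '#word' delimited by whitespace or string ends."""
    return TAG.findall(text)

for sample in ["#test", "##test", "#test #hello", "#", "t#est"]:
    print(repr(sample), bool(WHOLE_STRING.match(sample)), tags(sample))
```

The whole-string pattern rejects "#test #hello" because it contains a second #, while the whitespace-boundary pattern finds both tags and still rejects "##test".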

No Space Zend Validator Regex

What is the regex that allows everything but spaces?
I tried this validator and other similar ones:
'validators' => array(
array('regex', true,
array(
'pattern'=>'/[^\s]/',
'messages'=>array(
'regexNotMatch'=>'Your own custom error message'
)
)
)
)
I am using Zend Framework 1
The validation fails because the pattern accepts any string that contains at least one non-space character.
For example, these strings are accepted:
* 'hello world'
* 'a b'
* ' c '
You need to anchor the pattern so that it covers the whole string:
'pattern' => '/^[^\s]*$/'
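The difference between the two patterns is easy to check outside Zend. This is a minimal Python sketch; the helper passes is hypothetical and uses re.search to mimic a "match anywhere in the string" test such as preg_match.

```python
import re

def passes(pattern, value):
    """Hypothetical helper: True if the pattern is found anywhere in value."""
    return re.search(pattern, value) is not None

UNANCHORED = r"[^\s]"      # original pattern: one non-space character is enough
ANCHORED = r"^[^\s]*$"     # corrected pattern: every character must be non-space

for value in ["hello world", "a b", " c ", "hello", ""]:
    print(repr(value), passes(UNANCHORED, value), passes(ANCHORED, value))
```

Note that the anchored pattern with * also accepts the empty string; use + instead of * if at least one character is required.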
If you need only alphanumeric characters, Zend already has a built-in Alnum validator:
'validators' => array(
array(
'Alnum',
true,
array('allowWhiteSpace' => false)
)
)
Hope it helps

Regex help advanced

I have a regex problem
I need to validate a user-supplied string in a predefined format to check that it contains no mistakes. The test data below shows which strings must match and which must not.
What I already have, which works for most cases:
/^product:\[(.*?)\]|default:\[(.*?)\]$/
What still needs to be checked: there must be no whitespace anywhere except in the values between the [], and the | separator must be present between clauses but not at the end.
return array(
array(
'default:[6_400]',
TRUE
),
array(
'default:[bla_bla]',
TRUE
),
array(
'default:[bla _ bla]',
TRUE
),
array(
'product:8[8_400]|default:[6_400]',
TRUE
),
array(
'product:8[8_400]|default:[6_400]|product:10[10_400]',
TRUE
),
array(
'product:8[8_400]|product:12 [12_400]|default:[6_400]',
FALSE
),
array(
'roduct:8[8_400]|product:12[12_400]|default[6_400',
FALSE
),
array(
'default:6_400',
FALSE
),
array(
'product:8[8_400]',
FALSE
),
array(
'product:8[8_400]default:[6_400]',
FALSE
),
array(
'product:8[8_400]|default:[6_400]|',
FALSE
),
);
Looking at your examples, I think you mean that inside the brackets you want word characters or spaces (you probably don't want #$%^&&, null and other such stuff)...
\[[\w\s]+\]
This apparently can be preceded by either product:number or default:
((product:\d+)|(default:))(\[[\w\s]+\])
Clauses must be separated by | but the matching string must not end with '|'
((product:\d+)|(default:))(\[[\w\s]+\])(\|(?!$)|$)
This can occur one or more times
(((product:\d+)|(default:))(\[[\w\s]+\])(\|(?!$)|$))+
And we must have at least one full, legal default clause:
(?=.*?default:\[[\w\s]+\])(((product:\d+)|(default:))(\[[\w\s]+\])(\|(?!$)|$))+
and fill the whole line:
^(?=.*?default:\[[\w\s]+\])(((product:\d+)|(default:))(\[[\w\s]+\])(\|(?!$)|$))+$
Here it is in action http://regexr.com?3275i
Note that since I have not included any patterns that allow white-space anywhere other than in the brackets, nothing special needs to be done to prohibit it outside of the brackets
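The final pattern can be checked against the test data from the question. The sketch below ports it to Python's re module (the original targets PHP/PCRE, but nothing in this pattern is engine-specific as far as I can tell); the names PATTERN, CASES, and is_valid are chosen here for illustration.

```python
import re

PATTERN = re.compile(
    r"^(?=.*?default:\[[\w\s]+\])"                              # at least one full default clause
    r"(((product:\d+)|(default:))(\[[\w\s]+\])(\|(?!$)|$))+$"   # clauses separated by '|'
)

# (input, expected result) pairs taken from the question's test data
CASES = [
    ("default:[6_400]", True),
    ("default:[bla_bla]", True),
    ("default:[bla _ bla]", True),
    ("product:8[8_400]|default:[6_400]", True),
    ("product:8[8_400]|default:[6_400]|product:10[10_400]", True),
    ("product:8[8_400]|product:12 [12_400]|default:[6_400]", False),
    ("roduct:8[8_400]|product:12[12_400]|default[6_400", False),
    ("default:6_400", False),
    ("product:8[8_400]", False),
    ("product:8[8_400]default:[6_400]", False),
    ("product:8[8_400]|default:[6_400]|", False),
]

def is_valid(text):
    return PATTERN.search(text) is not None

failures = [text for text, expected in CASES if is_valid(text) != expected]
print("failures:", failures)
```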
Also note that I have created many capturing groups (for simplicity/readability) but you can eliminate them by placing ?: after any ( you don't want to capture. This improves performance somewhat. Besides testing things in regexr.com, this site is often helpful for learning/building regular expressions:
http://www.regular-expressions.info/
Try this regex:
/^(?=.*?default)(?:(?:product|default):\d*\[[^\]]*\](?:\|(?!$)|$))+$/
See on rubular
If you want to exclude the possibility that default might appear in the [], as Gabber pointed out, you could use:
/^(?:product:\d*\[[^\]]*\]\|)?(?:default:\d*\[[^\]]*\](?:\|(?!$)|$))(?:product:\d*\[[^\]]*\](?:\|(?!$)|$))?$/

RegExp pattern to capture around two-characters delimiter

I have a string which is something like:
prefix::key0==value0::key1==value1::key2==value2::key3==value3::key4==value4::
I want to retrieve the value associated to a key (say, key1). The following pattern:
::key1==([^:]*)
...will work only if there is no ':' character in the value. I want the match to stop only at the substring ::, but I can't find out how to do that, as most examples I see are about matching a single character.
How do I modify the regexp pattern to match all characters between "::key1==" and the next "::" ?
Thanks!
Can you do something like this : ::key1==(.*?)::? Assuming the language supports the lazy ? operator, this should work.
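A minimal Python sketch of the lazy approach, assuming the key is passed in as a plain string (the helper name value_for is made up here):

```python
import re

def value_for(key, text):
    """Return the text between '::<key>==' and the next '::', or None."""
    m = re.search("::" + re.escape(key) + r"==(.*?)::", text)
    return m.group(1) if m else None

s = "prefix::key0==value0::key1==value1::key2==value2::key3==value3::key4==value4::"
print(value_for("key1", s))
print(value_for("key1", "prefix::key1==a:b::key2==c::"))
```

A single ':' inside the value is fine, but the match stops at the first '::', so a value that itself contains '::' is cut short.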
As mentioned in my comment to your question, if the entirety of your string is
prefix::key0==value0::key1==value1::key2==value2::key3==value3::key4==value4::
I would suggest exploding/splitting the string at :: instead of using a regex, as it will usually be faster. You didn't specify a language, but here is a PHP example:
// string
$string = "prefix::key0==value0::key1==value1::key2==value2::key3==value3::key4==value4::";
// explode using :: as delimiter
$string = explode('::',$string);
// for each element...
foreach ($string as $value) {
// check if it has == in it
if (strpos($value,'==')!==false) $matches[] = $value;
}
// output
echo "<pre>";print_r($matches);
output:
Array
(
[0] => key0==value0
[1] => key1==value1
[2] => key2==value2
[3] => key3==value3
[4] => key4==value4
)
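Since the goal is to look up the value for one key, the same split idea can go one step further and build a dictionary. This is a Python sketch under the assumption that neither keys nor values contain '::'; the function name parse_pairs is hypothetical.

```python
def parse_pairs(text):
    """Split on '::' and collect every 'key==value' chunk into a dict."""
    pairs = {}
    for chunk in text.split("::"):
        key, sep, value = chunk.partition("==")
        if sep:  # skip chunks without '==', such as 'prefix' or the empty tail
            pairs[key] = value
    return pairs

s = "prefix::key0==value0::key1==value1::key2==value2::key3==value3::key4==value4::"
print(parse_pairs(s)["key1"])
```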
However, if you insist on the regex approach, here is a negative look-ahead alternative:
::((?:(?!::).)+)
php example
// string
$string = "prefix::key0==value0::key1==value1::key2==value2::key3==value3::key4==value4::";
preg_match_all('~::((?:(?!::).)+)~',$string,$matches);
//output
echo "<pre>";print_r($matches[1]); // group 1 holds the captured key==value pairs
output
Array
(
[0] => key0==value0
[1] => key1==value1
[2] => key2==value2
[3] => key3==value3
[4] => key4==value4
)
I think you're looking for a positive look-ahead:
::key0==(.*?)(?=::\w+==)
With the following:
prefix::key0==val::ue0::key1==value1::key2==value2::key3==value3::key4==value4::
It correctly finds val::ue0. This also assumes the keys conform to \w ([0-9A-Za-z_])
Also, a positive look-ahead may be a bit of overkill, but it will still work if the value contains ::.
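A Python sketch of the look-ahead approach. One caveat: as written, the look-ahead requires another key to follow, so it finds nothing for the last key in the string. The EXTENDED_TAIL variant below is my own addition, not part of the answer; it also accepts a trailing '::' at the end of the string. The names value_for, ANSWER_TAIL, and EXTENDED_TAIL are chosen for illustration.

```python
import re

ANSWER_TAIL = r"(?=::\w+==)"          # look-ahead from the answer
EXTENDED_TAIL = r"(?=::\w+==|::$)"    # assumption: also allow the final '::'

def value_for(key, text, tail=ANSWER_TAIL):
    """Return the value for key, stopping just before the look-ahead matches."""
    m = re.search("::" + re.escape(key) + "==(.*?)" + tail, text)
    return m.group(1) if m else None

s = "prefix::key0==val::ue0::key1==value1::key2==value2::key3==value3::key4==value4::"
print(value_for("key0", s))
print(value_for("key4", s))
print(value_for("key4", s, EXTENDED_TAIL))
```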