I want my form field input to pass through a validator that allows only letters, digits, and three symbols: - ' /
r'^[A-Za-z0-9\s-/]+$';
I have handled everything except the ' symbol. Once I add ', it is treated as the end of the string literal. How can I include the ' symbol?
If that single quote is still bothering you and nothing else worked, there is another way to achieve it. It is a little tedious, but it works well.
Try using the regex below:
^[^\u0000-\u001f\u0021-\u0026\u0028-\u002c.\u003a-\u0040\u005b-\u0060\u007b-\uffff]+$
Basically, this regex excludes every character range that is not valid in your character set. I can add a detailed explanation once you confirm it works for you, and it should, because the regex contains no single quote, which was the cause of the problem.
I had to use Unicode notation to exclude the unwanted Unicode characters.
Check this demo for valid and invalid matches
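To see which characters survive the exclusion ranges above, here is a minimal Python sketch. Python is my choice for illustration only (the question does not name its language), and the helper name is_allowed is hypothetical. The anchors are dropped because fullmatch already requires the whole string to match.

```python
import re

# Same exclusion-based character class as above; it contains no single quote.
# Python's re module understands \uXXXX escapes inside a pattern.
EXCLUDED = re.compile(
    r"[^\u0000-\u001f\u0021-\u0026\u0028-\u002c.\u003a-\u0040"
    r"\u005b-\u0060\u007b-\uffff]+"
)


def is_allowed(text: str) -> bool:
    """Return True if text is non-empty and no character is in an excluded range."""
    return EXCLUDED.fullmatch(text) is not None


print(is_allowed("O'Brien 12/3-4"))  # letters, digits, space, ' / and - pass
print(is_allowed("a.b"))             # the dot is excluded explicitly
```

What remains allowed is the space, the three symbols ' - / and ASCII letters and digits. Unlike the \s in the original pattern, tabs and newlines are rejected here because they fall in the first excluded range.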
First of all, always place - at the end of the character class; it is the safest way to use it inside brackets.
Next, adding ' to a single-quoted string literal is done with an escaped single quote, \'. Since this does not work, I suspect the problem is that you have curly quotes.
Also, consider using triple-quoted string literals, r"""<pattern>""". This is the most convenient way of writing patterns with quotes.
So you can consider using
pattern = r'''^[A-Za-z0-9\s/'‘’-]+$'''
If you get a warning, escape these special characters:
pattern = r'''^[A-Za-z0-9\s\/\'\‘\’\-]+$'''
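As an illustration of the two points above (hyphen last, quote inside a triple-quoted raw string), here is a small Python sketch. It assumes Python string syntax, which may differ from the asker's language, and it leaves out the curly quotes; is_valid is a hypothetical helper name.

```python
import re

# Triple-quoted raw string: the single quote needs no escaping, and the
# hyphen sits last in the class so it is a literal rather than a range.
PATTERN = re.compile(r"""[A-Za-z0-9\s/'-]+""")


def is_valid(value: str) -> bool:
    """Return True if value consists only of the allowed characters."""
    return PATTERN.fullmatch(value) is not None


print(is_valid("O'Neil-Smith 3/4"))
print(is_valid("a.b"))
```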
I have a string like this:
{"content":(uint32)123", "id":(uint64)111, "test":{"hi":"(uint32)456"}}
I want to get this result:
(uint32)123
(uint64)111
So I wrote a regex like this:
[^(?!\")](\(uint32\)|\(uint64\))(\d)+[^(?!\")$]
But the result is:
:(uint32)123
:(uint64)111,
Here the result also includes : and ,
I want the match not to be preceded by " and not to be followed by ". How should I change my regex?
(\(uint(?:32|64)\)\d+) works for me. It captures the entire string (uint[32/64])<any number of digits> without regard to the characters that come before or after.
I tested the following one in Python:
(?<!\")(\(uint32\)|\(uint64\))\d+(?!(\"|\d))
It looks like you were trying to use negative lookahead and negative lookbehind checks, but you made a couple of mistakes:
You put them inside a character class, like this: [^(?!\")]. What this regexp really means is "any character other than those listed inside the square brackets" (^ stands for "not"). It should instead be (?!\"), which means the character after the current position must not be a quote (note: this also succeeds if there is no character after the current position).
To check the character before the current position, you need a negative lookbehind, which has the syntax (?<!some_regexp). So it would be (?<!\").
You don't need checks for the start or end of the line. If you do, you can put them into separate negative lookahead/lookbehind assertions.
Here is the corrected example without line start/end checks:
(?<!\")(\(uint32\)|\(uint64\))(\d)+(?!\")(?!\d)
Note: you need to add (?!\d) at the end, because otherwise, when a quote follows the number, the regex would backtrack and match everything except the last digit.
Here is an example with start/end of line checks:
(?<!^)(?<!\")(\(uint32\)|\(uint64\))(\d)+(?!\")(?!\d)(?!$)
P.S.: depending on the language you are using, you might not need to escape the quote. The backslash before the quote is needed only when the string literal syntax requires it; it is not a regexp escape sequence.
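The lookbehind/lookahead idea can be tried with a short Python sketch. The sample string below is an assumption: it is the question's string with the stray quote after 123 removed, so that the first token really is unquoted. The name unquoted_tokens is hypothetical, and the pattern is a condensed variant of the ones above.

```python
import re

# Assumed sample: the question's string without the stray quote after 123.
SAMPLE = '{"content":(uint32)123, "id":(uint64)111, "test":{"hi":"(uint32)456"}}'

# (?<!")   : the token must not be preceded by a double quote
# (?!["\d]): after the digits there must be neither a quote nor another digit,
#            which stops the engine from backtracking to a shorter number
TOKEN = re.compile(r'(?<!")\(uint(?:32|64)\)\d+(?!["\d])')


def unquoted_tokens(text: str) -> list:
    """Return every (uint32)/(uint64) token that is not wrapped in quotes."""
    return [m.group(0) for m in TOKEN.finditer(text)]


print(unquoted_tokens(SAMPLE))  # ['(uint32)123', '(uint64)111']
```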
Using regexp_replace within PostgreSQL, I've developed (with a lot of help from SO) a pattern to match the first n characters, if the last character is not in a list of characters I don't want the string to end in.
regexp_replace(pf.long_description, '(^.{1,150}[^ -:])', '\1...')::varchar(2000)
I expected that to simply end the string in an ellipsis. What I get instead is the first 150 characters plus the ellipsis, but then the string continues all the way to the end.
Why is all that content not being eliminated?
Why is all that content not being eliminated?
Because you haven't requested that. You've asked to have the first 2-151 characters replaced with those same characters and an ellipsis; the rest of the string is not part of the match, so it is left untouched. If you modify the pattern to be (^.{1,150}[^ -:]).* (notice that the trailing .* makes regexp_replace work on the complete string, not just the prefix), you should get the desired effect.
Do you really want the range of characters between the space character and the colon: [^ -:]?
To include a literal - in a character class, put it first or last. Looks like you might actually want [^ :-] - that's just excluding the three characters listed.
Details about bracket expressions in the manual.
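To make the difference between the range and the three literal characters concrete, here is a small Python sketch. It uses Python's re module rather than PostgreSQL, on the assumption that both treat a hyphen between two characters in a bracket expression as a range; matched_chars is a hypothetical helper.

```python
import re

RANGE_CLASS = re.compile(r"[ -:]")    # range from space (0x20) to colon (0x3A)
LITERAL_CLASS = re.compile(r"[ :-]")  # exactly three characters: space, colon, hyphen


def matched_chars(pattern) -> list:
    """Collect the ASCII characters that a single-character pattern accepts."""
    return [chr(c) for c in range(128) if pattern.fullmatch(chr(c))]


print(len(matched_chars(RANGE_CLASS)))  # 27: digits and most punctuation included
print(matched_chars(LITERAL_CLASS))     # [' ', '-', ':']
```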
That would be (building on what @just already provided):
SELECT regexp_replace(pf.long_description, '(^.{1,150}[^ :-]).*$', '\1...');
But it should be cheaper to use substring() instead:
SELECT substring(pf.long_description, '^.{1,150}[^ :-]') || '...';
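The same "match the prefix, then swallow the rest" idea can be sketched in Python. This is an illustration only, not PostgreSQL; the function name truncate and the limit parameter are hypothetical.

```python
import re


def truncate(text: str, limit: int = 150) -> str:
    """Keep up to limit + 1 leading characters, ending on a character that is
    not a space, colon, or hyphen, and replace everything after it with '...'."""
    # The trailing .* is what removes the remainder of the string.
    pattern = rf"^(.{{1,{limit}}}[^ :-]).*$"
    return re.sub(pattern, r"\1...", text, count=1, flags=re.DOTALL)


print(truncate("hello world: foo", limit=11))  # hello world...
```

Note that a string shorter than the limit still gets the ellipsis appended, and a one-character string does not match at all, so it is returned unchanged.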
Hey guys, I am trying to find a way to display the letter I by itself, but I keep having trouble. This is what I have so far.
This is the text file that I open, tolls.txt:
Join Microsoft employees supporting I Inspire Youth Project and other youth causes #GivingHero: http://msft.it/6013jboz
Waze for #WindowsPhone is here: http://msft.it/6016jbp2 I
fid=fopen('tolls.txt');
counter=0;                          % running total of matches
getLine=fgetl(fid);
while ischar(getLine)               % fgetl returns -1 at end of file
    ct='I\s';
    How=regexp(getLine,ct,'match');
    counter=counter+length(How);
    getLine=fgetl(fid);
end
fclose(fid);
My problem is that I have to count every stand-alone capital I, including one with no space after it (at the end of a sentence) or no space before it (at the start of a sentence). In my ct variable I have ct='I\s', but I don't know if there is an or statement I can use to also incorporate \sI.
I hope I was clear about the question. Thank you in advance for the help.
What you'd need is something like:
ct = '(?<!\w)(I)(?!\w)';
Here (?<!\w) and (?!\w) denote a negative look-behind and a negative look-ahead respectively for a character from the word character class.
More information about the same may be found here.
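For readers who want to try the lookaround pattern outside MATLAB, here is a minimal Python sketch that counts stand-alone capital I in the two sample lines from the question. The function name count_standalone_i is hypothetical.

```python
import re

# Capital I with no word character immediately before or after it.
STANDALONE_I = re.compile(r"(?<!\w)I(?!\w)")

LINES = [
    "Join Microsoft employees supporting I Inspire Youth Project and other "
    "youth causes #GivingHero: http://msft.it/6013jboz",
    "Waze for #WindowsPhone is here: http://msft.it/6016jbp2 I",
]


def count_standalone_i(lines) -> int:
    """Sum the stand-alone capital I occurrences over all lines."""
    return sum(len(STANDALONE_I.findall(line)) for line in lines)


print(count_standalone_i(LINES))  # 2
```

The I in "Inspire" is skipped because a word character follows it, while the I at the very end of the second line counts because the lookahead also succeeds at the end of the string.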
@RoneyMichael's solution is fine (though possibly overkill), but there is an or statement. Here is how you could look for three distinct patterns (' I ', 'I ', or ' I'):
ct='(^I[\W]*\s)|(\sI[\W]*\s)|(\sI[\W]*$)';
How=regexp(getLine,ct,'match')
which returns:
How =
' I ' ' I'
The first and last patterns specifically match the letter 'I' if it occurs at the beginning or the end of the string, respectively. The '[\W]*' matches zero or more occurrences of non-word characters, i.e., punctuation. It's zero or more because of things like '...', '?!', etc. Alternatively, you could explicitly list the allowed punctuation by using something like '[\.\?\!]*' instead (just remember that things such as quotes, parentheses, brackets, etc. can also come at the end of a line). Also, you may want to match '"I' or ''I'. In that case you can simply use
ct='(^[\W]*I[\W]*\s)|(\s[\W]*I[\W]*\s)|(\s[\W]*I[\W]*$)';
There are other logical and conditional operators that you can use in regular expressions.
I'm looking to use a regular expression to parse a URL and get a specific section of it, or nothing if the pattern cannot be found.
A url example is
/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://value-value.jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5#c452fds-634d-f424fds-cdsa&bf_action=jildape
I wish to get the bolded text in it.
Currently I'm using the regex "d=([^#]*)", but the problem is that I'm also running across URLs of this pattern:
/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://value-value.jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5&bf_action=jildape
and I'm getting the bold section of it.
I would prefer that this URL produce no match, because it doesn't contain the #.
Regexes are not a magic tool that you should always use just because the problem involves a string. In this case, your language probably has a tool to break apart URLs for you. In PHP, this is parse_url(). In Perl, it's the URI::URL module.
You should almost always prefer an existing, well-tested solution to a common problem like this rather than writing your own.
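As a rough sketch of the "use a URL parser" approach, here is how it might look in Python with urllib.parse. Two assumptions are mine, not the asker's: that the id lives inside the nested URL carried by the jklfe parameter, and that "contains a #" means the outer URL has a non-empty fragment. The function name id_before_fragment is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

WITH_HASH = (
    "/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://value-value."
    "jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5"
    "#c452fds-634d-f424fds-cdsa&bf_action=jildape"
)
NO_HASH = (
    "/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://value-value."
    "jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5&bf_action=jildape"
)


def id_before_fragment(url: str, param: str = "jklfe"):
    """Return the nested id value, or None when the URL has no '#' part."""
    outer = urlsplit(url)
    if not outer.fragment:          # no '#': the asker wants no match at all
        return None
    nested = parse_qs(outer.query).get(param)
    if not nested:
        return None
    ids = parse_qs(urlsplit(nested[0]).query).get("id")
    return ids[0] if ids else None


print(id_before_fragment(WITH_HASH))  # e997aad4-92e0-j30e-a3c8-jfkaliejs5
print(id_before_fragment(NO_HASH))    # None
```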
So you want to match the value of the id parameter, but only if it has a trailing section containing a '#' symbol (without matching the '#' or what's after it)?
Not knowing the specifics of what style of regexes you're using, how about something like:
id=([^#&]*)#
regex = "id=([\\w-]+?)#"
This will grab everything between 'id=' and '#', assuming every character between them is in the character class [a-zA-Z_0-9-] (i.e., if an '&' is in there, the regex will fail).
id=
-Self-explanatory: this looks for the exact match of 'id='.
([\\w-]+?)
-This defines a character class, quantifies it, and groups the result. The \\w is an escaped \w. '\w' is a predefined character class in Java that is equal to [a-zA-Z_0-9]. I added '-' to this class because of the assumed pattern from your examples. The quantifier sits inside the parentheses so that the group captures the whole value rather than only its last character.
+?
-This is a reluctant quantifier that looks for the shortest possible match of the regex.
#
-The end of the regex: the last character we are looking for to match the pattern.
If you are looking to grab every character between 'id=' and the first '#' following it, the following will work and it uses the same logic as above, but replaces the character class [\\w-] with ., which matches anything.
regex = "id=(.+?)#"
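The difference between the restricted class and the dot can be checked with a short Python sketch. The three URLs are shortened, made-up samples in the shape of the question's URLs, and extract is a hypothetical helper; Python's \w is broader than Java's default, which does not matter for these samples.

```python
import re

# Made-up, shortened sample URLs.
WITH_HASH = "/topic?id=e997aad4-92e0#c452fds&bf_action=x"
NO_HASH = "/topic?id=e997aad4-92e0&bf_action=x"
AMP_THEN_HASH = "/topic?id=e997aad4-92e0&k=v#c452fds"

STRICT = re.compile(r"id=([\w-]+?)#")  # only word characters and '-' before '#'
LOOSE = re.compile(r"id=(.+?)#")       # anything up to the first '#'


def extract(pattern, url: str):
    """Return capture group 1, or None when the pattern does not match."""
    match = pattern.search(url)
    return match.group(1) if match else None


print(extract(STRICT, WITH_HASH))      # e997aad4-92e0
print(extract(STRICT, NO_HASH))        # None
print(extract(LOOSE, AMP_THEN_HASH))   # e997aad4-92e0&k=v
```

The last line shows the trade-off: the dot version also captures a following parameter when an '&' appears before the '#', while the restricted class returns no match for that URL.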
Hiho everyone! :)
I have an application, in which the user can insert a string into a textbox, which will be used for a String.Format output later. So the user's input must have a certain format:
I would like to replace exactly one placeholder, so the string should be of a form like this: "Text{0}Text". So it has to contain at least one '{0}', but no other statement between curly braces, for example no {1}.
For the text before and after the '{0}', I would allow any characters.
So I think, I have to respect the following restrictions: { must be written as {{, } must be written as }}, " must be written as \" and \ must be written as \.
Can somebody tell me, how I can write such a RegEx? In particular, can I do something like 'any character WITHOUT' to exclude the four characters ( {, }, " and \ ) above instead of listing every allowed character?
Many thanks!!
Nikki:)
I hate to be the guy who doesn't answer the question, but really it's poor usability to ask your user to format input to work with String.Format. Provide them with two input requests, so they enter the part before the {0} and the part after the {0}. Then you'll want to just concatenate the strings instead of using String.Format. Using String.Format on user-supplied text is just a bad idea.
[^(){}\r\n]+\{0}[^(){}\r\n]+
will match any text except (, ), {, } and linebreaks, then match {0}, then the same as before. There needs to be at least one character before and after the {0}; if you don't want that, replace + with *.
You might also want to anchor the regex to beginning and end of your input string:
^[^(){}\r\n]+\{0}[^(){}\r\n]+$
(Similar to Tim's answer)
Something like:
^[^{}()]*(\{0})[^{}()]*$
Tested at http://www.regular-expressions.info/javascriptexample.html
It sounds like you're looking for the [^CHARS_GO_HERE] construct. The exact regex you'd need depends on your regex engine, but it would resemble [^({})].
Check out the "Negated Character Classes" section of the Character Class page at Regular-Expressions.info.
I think your question can be answered by the regexp:
^((\{\{|\}\}|\\"|\\\\|[^\{\}\"\\])*(\{0\}))+(\{\{|\}\}|\\"|\\\\|[^\{\}\"\\])*$
Explanation:
The expression is built up as follows:
^(allowed chars {0})+(allowed chars)*$
one or more sequences of allowed chars followed by a {0} with optional allowed chars at the end.
allowed chars is built from the 4 escape sequences you mentioned (I assumed the \ escape is \\ instead of \.) plus any single character that is not one of the characters needing an escape:
(\{\{|\}\}|\\"|\\\\|[^\{\}\"\\])
combined they make up the regexp I started with.
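Here is a minimal Python sketch of that structure, for experimenting with inputs. It is an illustration under the same assumption as above (that \ is escaped as \\), it uses Python's re module rather than .NET, and the names ALLOWED, TEMPLATE, and is_valid_template are hypothetical.

```python
import re

# One "allowed" unit: an escaped brace pair, an escaped quote, an escaped
# backslash, or any single character that is not { } " or backslash.
ALLOWED = r'(?:\{\{|\}\}|\\"|\\\\|[^{}"\\])'

# One or more "allowed* {0}" sequences, then optional trailing allowed units.
TEMPLATE = re.compile("(?:" + ALLOWED + r"*\{0\})+" + ALLOWED + "*")


def is_valid_template(text: str) -> bool:
    """Return True if text has at least one {0} and no other bare braces."""
    return TEMPLATE.fullmatch(text) is not None


print(is_valid_template("Text{0}Text"))  # True
print(is_valid_template("Text{1}Text"))  # False
```

Because of the outer +, this accepts more than one {0}. If exactly one placeholder is required, the outer group could be used without the + quantifier.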