I want to check a field value for a set of special characters. How can I implement this with REG_MATCH? I want to search for special characters such as #, ^ and *, but the number of characters can increase. I don't want to use the INSTR function. Any help would be appreciated.
I don't think there is a predefined special-character set. You can list the individual special characters inside [ ], or you can check for anything other than letters and digits with [^a-zA-Z0-9].
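As a rough illustration of both suggestions, here is a minimal sketch in Python's re module rather than REG_MATCH itself. The names SPECIALS, has_special and has_non_alnum are hypothetical; the idea is that growing the list of characters only means editing one string.

```python
import re

# Hypothetical list of characters to look for; extend this string as needed.
SPECIALS = "#^*"

# Bracket expression built from the list; re.escape keeps ^ and * literal.
SPECIAL_RE = re.compile("[" + re.escape(SPECIALS) + "]")

# Alternative: anything that is not an ASCII letter or digit.
NON_ALNUM_RE = re.compile(r"[^a-zA-Z0-9]")

def has_special(value):
    """True if value contains any character listed in SPECIALS."""
    return SPECIAL_RE.search(value) is not None

def has_non_alnum(value):
    """True if value contains anything other than ASCII letters/digits."""
    return NON_ALNUM_RE.search(value) is not None
```

The second form also flags spaces and punctuation, so it is only appropriate if every non-alphanumeric character counts as special.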
Using regexp_replace within PostgreSQL, I've developed (with a lot of help from SO) a pattern to match the first n characters, if the last character is not in a list of characters I don't want the string to end in.
regexp_replace(pf.long_description, '(^.{1,150}[^ -:])', '\1...')::varchar(2000)
I expected that to simply end the string in an ellipsis. Instead, I get the first 150 characters plus the ellipsis, and then the string continues all the way to the end.
Why is all that content not being eliminated?
Because you haven't requested that. You've asked to have the first 2-151 characters replaced with those same characters and an ellipsis. If you modify the pattern to (^.{1,150}[^ -:]).* (note the trailing .*, which makes regexp_replace work on the complete string, not just the prefix), you should get the desired effect.
Do you really want the range of characters between the space character and the colon: [^ -:]?
To include a literal - in a character class, put it first or last. Looks like you might actually want [^ :-] - that's just excluding the three characters listed.
Details about bracket expressions in the manual.
That would be (building on what @just already provided):
SELECT regexp_replace(pf.long_description, '(^.{1,150}[^ :-]).*$', '\1...');
But it should be cheaper to use substring() instead:
SELECT substring(pf.long_description, '^.{1,150}[^ :-]') || '...';
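To see why the trailing .* matters, the same pattern can be tried outside the database. This is a minimal sketch in Python's re module, not PostgreSQL; the function name truncate and its limit parameter are made up for illustration, and re.DOTALL is an assumption so that . also matches newlines.

```python
import re

def truncate(text, limit=150):
    # Keep up to `limit` characters plus one more that is not a space,
    # colon, or hyphen. The trailing .* consumes the rest of the string,
    # so the replacement drops it instead of leaving it in place.
    pattern = r"^(.{1,%d}[^ :-]).*$" % limit
    return re.sub(pattern, r"\1...", text, flags=re.DOTALL)

print(truncate("hello world: and more", 11))  # hello world...
print(truncate("abcdefgh", 3))                # abcd...
```

Note that the captured prefix can be up to limit + 1 characters long, and that a string shorter than the limit still gets the ellipsis appended, just as in the SQL versions.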
I want to remove all numbers from a paragraph except from some words.
My attempt is using a negative look-ahead:
gsub('(?!ami.12.0|allo.12)[[:digit:]]+','',
c('0.12','1245','ami.12.0 00','allo.12 1'),perl=TRUE)
But this doesn't work. I get this:
"." "" "ami.. " "allo."
But my expected output is:
"." "" 'ami.12.0','allo.12'
You can't really use a negative lookahead here, since it will still replace when the cursor is at some point after ami.
What you can do is put back some matches:
(ami.12.0|allo.12)|[[:digit:]]+
gsub('(ami.12.0|allo.12)|[[:digit:]]+',"\\1",
c('0.12','1245','ami.12.0 00','allo.12 1'),perl=TRUE)
I kept the . since I'm not 100% sure what you have, but keep in mind that . is a wildcard and will match any character (except newlines) unless you escape it.
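The same "match the protected words first and put them back" idea can be sketched in Python for comparison. This is an illustration, not the R code above: the dots are escaped here (an assumption about the intent), and strip_digits is a hypothetical helper name.

```python
import re

# Either a protected word (captured in group 1) or a run of digits.
PATTERN = re.compile(r"(ami\.12\.0|allo\.12)|[0-9]+")

def strip_digits(text):
    # If group 1 matched, put the protected word back unchanged;
    # otherwise the match was a digit run, which is replaced by nothing.
    return PATTERN.sub(lambda m: m.group(1) or "", text)

samples = ["0.12", "1245", "ami.12.0 00", "allo.12 1"]
print([strip_digits(s) for s in samples])
# ['.', '', 'ami.12.0 ', 'allo.12 ']
```

Because the alternation is tried left to right at each position, the protected words are consumed whole before the digit branch ever sees the digits inside them.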
Your regex is actually finding every digit sequence that is not the start of "ami.12.0" or "allo.12". For example, in your third string, it gets to the 12 in ami.12.0 and looks ahead to see whether that 12 is the start of either of the two ignored strings. It is not, so the digits get replaced. It would be best to generalize this, but in your specific case you can probably achieve this by instead doing a negative lookbehind for any prefixes of the words (that can be followed by digit sequences) that you want to skip. Add a \b before the digits as well; otherwise, after the lookbehind rejects a match at the 1 of 12, the engine retries at the 2, where none of the listed prefixes precedes it, and removes that digit. So, you would use something like this:
gsub('(?<!ami\\.|ami\\.12\\.|allo\\.)\\b[[:digit:]]+','',
c('0.12','1245','ami.12.0 00','allo.12 1'),perl=TRUE)
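For comparison, here is a hedged Python sketch of the lookbehind approach. Python's re only accepts fixed-width lookbehinds, so the alternatives are written as three separate lookbehinds chained together. The \b before the digits is included as an assumption: it stops the engine from starting a match in the middle of a digit run (for example at the 2 of 12), where none of the prefixes would precede it. The name strip_unprotected_digits is hypothetical.

```python
import re

# One negative lookbehind per protected prefix, then a word boundary
# so a match can only begin at the start of a digit run.
PATTERN = re.compile(r"(?<!ami\.)(?<!ami\.12\.)(?<!allo\.)\b[0-9]+")

def strip_unprotected_digits(text):
    return PATTERN.sub("", text)

samples = ["0.12", "1245", "ami.12.0 00", "allo.12 1"]
print([strip_unprotected_digits(s) for s in samples])
# ['.', '', 'ami.12.0 ', 'allo.12 ']
```

This only protects digits that directly follow one of the listed prefixes, so every new protected word needs its own set of prefixes.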
Within a string I could have the following:
this is a string ::foo:bar:: ::baz:123abc:: ::bäz:üéü:: ::#$%%:4/4::
How can I get all the parts that start with :: and end with ::, and match what is in between?
Within those colons there are key/value pairs that I need to extract from the string.
If there were no special chars, the regex would look like this:
r'::([a-z0-9]+):([a-z0-9]+)::'
I could list those special chars manually, but I don't think that's the right way to do this.
thx
With not-colon:
r'::([^:]+):([^:]+)::'
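A minimal sketch of the not-colon pattern applied to the sample string from the question, assuming Python's re module (the r'' prefix suggests it). The variable names are illustrative only.

```python
import re

text = "this is a string ::foo:bar:: ::baz:123abc:: ::bäz:üéü:: ::#$%%:4/4::"

# [^:]+ accepts any run of characters except a colon, so umlauts,
# symbols, and slashes are all captured without listing them.
pairs = re.findall(r"::([^:]+):([^:]+)::", text)

print(pairs)
# [('foo', 'bar'), ('baz', '123abc'), ('bäz', 'üéü'), ('#$%%', '4/4')]
```

With two capture groups, re.findall returns a list of (key, value) tuples, which can be passed to dict() if the keys are unique.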
First you should mention the regex flavor/tool you'd like to use, but generally:
r'::([^:]+)::'
Should capture the special chars as well.
HTH
Rules for the regex in english:
min length = 3
max length = 6
only letters from ASCII table, non-numeric
My initial attempt:
[A-Za-z]{3-6}
A second attempt
\w{3-6}
This regex will be used to validate input strings from an HTML form (i.e. validating an input field).
A modification to your first one would be more appropriate
\b[A-Za-z]{3,6}\b
The \b marks the word boundaries and avoids matching, for example, 'abcdef' from 'abcdefgh'. Also note the comma between '3' and '6' instead of '-'.
The problem with your second attempt is that it would include numeric characters as well, it again has no word boundaries, and the hyphen between '3' and '6' is incorrect.
Edit: The regex I suggested is helpful if you are trying to match words within some text. For validation, where you want to decide whether a whole string matches your criteria, you will have to use
^[A-Za-z]{3,6}$
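A short sketch of the whole-string validation, written in Python as an assumption since the question does not name an engine. The function name is_valid is hypothetical; re.fullmatch plays the role of the ^ and $ anchors.

```python
import re

# 3 to 6 ASCII letters; note the comma in {3,6}.
WORD_RE = re.compile(r"[A-Za-z]{3,6}")

def is_valid(value):
    # fullmatch succeeds only if the entire string fits the pattern,
    # which is what the ^...$ anchors express.
    return WORD_RE.fullmatch(value) is not None

for candidate in ["abc", "AbCdEf", "ab", "abcdefg", "abc1"]:
    print(candidate, is_valid(candidate))
```

Only the first two candidates pass: "ab" is too short, "abcdefg" is too long, and "abc1" contains a digit.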
I don't know which regex engine you are using (this would be useful information in your question), but your initial attempt will match all alphabetic strings longer than three characters. You'll want to include word-boundary markers such as \<[A-Za-z]{3,6}\>.
The markers vary from engine to engine, so consult the documentation for your particular engine (or update your question).
First one should be modified as below
([A-Za-z]{3,6})
The second one will allow numbers, which I think you don't want.
The first one should work once the hyphen in {3-6} is replaced with a comma; the second one will include digits as well, but you want to check non-numeric strings.
Hiho everyone! :)
I have an application, in which the user can insert a string into a textbox, which will be used for a String.Format output later. So the user's input must have a certain format:
I would like to replace exactly one placeholder, so the string should be of a form like this: "Text{0}Text". So it has to contain at least one '{0}', but no other statement between curly braces, for example no {1}.
For the text before and after the '{0}', I would allow any characters.
So I think, I have to respect the following restrictions: { must be written as {{, } must be written as }}, " must be written as \" and \ must be written as \.
Can somebody tell me, how I can write such a RegEx? In particular, can I do something like 'any character WITHOUT' to exclude the four characters ( {, }, " and \ ) above instead of listing every allowed character?
Many thanks!!
Nikki:)
I hate to be the guy who doesn't answer the question, but really it's poor usability to ask your user to format input to work with String.Format. Provide them with two input requests, so they enter the part before the {0} and the part after the {0}. Then you'll want to just concatenate the strings instead of using String.Format; using String.Format on user-supplied text is just a bad idea.
[^(){}\r\n]+\{0}[^(){}\r\n]+
will match any text except (, ), {, } and linebreaks, then match {0}, then the same as before. There needs to be at least one character before and after the {0}; if you don't want that, replace + with *.
You might also want to anchor the regex to beginning and end of your input string:
^[^(){}\r\n]+\{0}[^(){}\r\n]+$
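The anchored pattern can be tried quickly in Python. This is only a sketch for checking the regex's behaviour, not the .NET code the question would ultimately need, and has_single_placeholder is a made-up name.

```python
import re

# Text without (, ), {, } or line breaks, then {0}, then the same again.
PATTERN = re.compile(r"^[^(){}\r\n]+\{0}[^(){}\r\n]+$")

def has_single_placeholder(value):
    return PATTERN.match(value) is not None

for candidate in ["Text{0}Text", "{0}Text", "Text{1}Text", "Te(xt{0}Text"]:
    print(candidate, has_single_placeholder(candidate))
```

Only "Text{0}Text" passes. "{0}Text" fails because + requires at least one character before the placeholder; switching + to * would accept it.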
(Similar to Tim's answer)
Something like:
^[^{}()]*(\{0})[^{}()]*$
Tested at http://www.regular-expressions.info/javascriptexample.html
It sounds like you're looking for the [^CHARS_GO_HERE] construct. The exact regex you'd need depends on your regex engine, but it would resemble [^({})].
Check out the "Negated Character Classes" section of the Character Class page at Regular-Expressions.info.
I think your question can be answered by the regexp:
^((\{\{|\}\}|\\"|\\\\|[^\{\}\"\\])*(\{0\}))+(\{\{|\}\}|\\"|\\\\|[^\{\}\"\\])*$
Explanation:
The expression is built up as follows:
^(allowed chars {0})+(allowed chars)*$
That is, one or more sequences of allowed chars, each followed by a {0}, with optional allowed chars at the end.
"allowed chars" is built from the 4 escape sequences you mentioned (I assumed the \ escape is \\ instead of \.) plus any character that is not one of the escape characters:
(\{\{|\}\}|\\"|\\\\|[^\{\}\"\\])
combined they make up the regexp I started with.
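The construction can be checked by assembling the pattern from its two parts. The sketch below uses Python's re as an assumption (the question targets .NET, where the same pattern syntax should apply); ALLOWED and is_valid_format are hypothetical names, and re.fullmatch stands in for the ^ and $ anchors.

```python
import re

# One "allowed" unit: {{, }}, \", \\ or any char other than { } " \
ALLOWED = r'(?:\{\{|\}\}|\\"|\\\\|[^{}"\\])'

# (allowed* {0})+ allowed*
FORMAT_RE = re.compile(r'(?:' + ALLOWED + r'*\{0\})+' + ALLOWED + r'*')

def is_valid_format(value):
    return FORMAT_RE.fullmatch(value) is not None

for candidate in ['Text{0}Text', 'Text{1}Text', 'a{{b}}{0}',
                  'no placeholder', 'say "hi" {0}', 'say \\"hi\\" {0}']:
    print(repr(candidate), is_valid_format(candidate))
```

As written, the pattern accepts more than one {0}, which matches the "at least one" wording in the question; dropping the + would restrict it to exactly one.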