Coldfusion string replace with ReReplace - regex

I'm trying to replace spaces in a string with underscores for slug creation using RegEx. It works fine when there is one space. But when there are two consecutive spaces, or a space followed by an underscore or vice versa (' _' or '_ '), the result is a double underscore (__). How can I overcome this? That is, I want a single underscore instead of a double or triple one. Any help would be appreciated.
My code for replacing is similar to this.
rereplace(lCase('this is a sample _string'),'[ ]','_','all')

This seems to do the trick, based on your revised requirement:
original = "string with_mix _ of spaces__and_ _underscores__ __to_ _test with";
updated = reReplace(original, "[ _]+", "_", "all");
writeOutput(updated);
Results in:
string_with_mix_of_spaces_and_underscores_to_test_with
Is that to spec?
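For comparison outside ColdFusion, the same character-class idea can be sketched in Python. This is only an illustration: slugify_spaces is a hypothetical helper name, and the lower-casing mirrors the lCase call in the question.

```python
import re

def slugify_spaces(text: str) -> str:
    """Collapse every run of spaces and/or underscores into one underscore."""
    # [ _]+ matches one or more consecutive spaces or underscores,
    # so " _ " and "__" each become a single "_".
    return re.sub(r"[ _]+", "_", text.lower())

original = "string with_mix _ of spaces__and_ _underscores__ __to_ _test with"
print(slugify_spaces(original))
# string_with_mix_of_spaces_and_underscores_to_test_with
```

The + after the class is what does the work: a whole run of separators is consumed by a single match and replaced once.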

Related

Swift regex for characters and empty spaces

I'm trying to write a regular expression that only allows letters and spaces for a full name field, e.g. Mr Bob Smith
What I've currently tried:
let textRegex = "[A-Za-z+\\s]"
let textRegex = "[A-Za-z ]"
let textRegex = "[A-Za-z+ ]"
let textRegex = "([A-Za-z ])"
It doesn't appear to be working.
Thanks
Your regular expression isn't working because you misplaced the + symbol.
This one will work:
([A-Za-z ]+)
I don't know how Swift handles regex, however, so keep in mind that if you strictly want space characters only, it is better to use a literal " " instead of \s, which also matches other whitespace such as tabs and newlines.
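As a rough illustration of the corrected pattern (in Python rather than Swift, since only the regex itself is at issue here), the sketch below checks a whole string against [A-Za-z ]+. The names NAME_PATTERN and is_valid_name are made up for the example.

```python
import re

# Letters and literal spaces only; "+" sits outside the brackets so it
# means "one or more of these characters".
NAME_PATTERN = re.compile(r"[A-Za-z ]+")

def is_valid_name(name: str) -> bool:
    # fullmatch requires the whole string to match, not just a part of it.
    return NAME_PATTERN.fullmatch(name) is not None

print(is_valid_name("Mr Bob Smith"))   # True
print(is_valid_name("Bob Smith 3rd"))  # False
```

Whatever API is used, the match has to cover the entire input. Otherwise a string such as "Bob Smith 3rd" still contains a matching substring.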

Regex parse a command line string but don't return spaces between quotes

I am using python to parse a string that is passed in by the optparse module.
I want to split the string on certain delimiters but not in between quote marks.
A sample string is:
--state-basedir /dir/dir/dir/ --cmd=\"param load $v2param\" --master=/dev/ttyUSB0 --console --map --out=udp:192.168.1.1:14550
This string is passed in as a single optparse argument, I am then going to pass it to another process.
I have been trying various things at http://pythex.org/
The closest I have gotten is:
`(?<!")[\s=](?![\s0-9a-zA-Z\$\\]*")`
The issue is that the = sign after --cmd and the space before --master are not matched.
In plain English, this is how I am reading my regex:
match either a space character or an equals sign, as long as it is not preceded by a quotation mark and as long as it is not followed by a combination of any other letters, numbers or punctuation and another quotation mark
I had a feeling that there was something else I was missing, like greediness, so I tried adding ? after my look-ahead and look-behind terms. If I put a ? after my look-behind one I can get the space before --master but if I put the ? after my look-ahead term I get the spaces in the quotation marks now, which I don't want.
The idea here is that I am going to use re.split to handle things.
Thanks for any explanations as to what I am doing wrong.
This is not a regex answer and it's also not pretty, but it is one line.
sum([[x] if '"' in x else re.split(' |=',x) for x in re.split('=(\".+?\" )',a)],[])
output:
['--state-basedir', '/dir/dir/dir/', '--cmd', '"param load $v2param" ', '--master', '/dev/ttyUSB0', '--console', '--map', '--out', 'udp:192.168.1.1:14550']
Starting from re.split('=(\".+?\" )',a): this splits out text surrounded by quotes (more specifically ="something another thing"). The split pieces are then split further with re.split(' |=',x) if they do not have a " in them, or are returned as is ([x]) if they do. The last step flattens the resulting list of lists with sum(two_d_list,[]).
I hope this answer helps, but I understand if it isn't what you're looking for.
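A different way to get similar tokens is to match the pieces you want instead of splitting on the delimiters. The sketch below swaps re.split for re.findall. It assumes the string holds plain double quotes (as the output above suggests) and that quoted values contain no escaped quotes; TOKEN is a name invented for the example.

```python
import re

a = ('--state-basedir /dir/dir/dir/ --cmd="param load $v2param" '
     '--master=/dev/ttyUSB0 --console --map --out=udp:192.168.1.1:14550')

# Either a double-quoted chunk (kept whole, quotes included) or a run of
# characters containing no whitespace, "=" or double quote.
TOKEN = re.compile(r'"[^"]*"|[^\s="]+')

tokens = TOKEN.findall(a)
print(tokens)
```

Because the quoted alternative is tried first, the spaces inside the quotes never get a chance to act as separators.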

Regex Expression for textfield

I want a regex that accepts only letters and certain special characters (like space, ‘, -) but rejects numeric values.
I tried this expression, /^[a-zA-Z ]*$/, but it only allows the space as a special character.
Please Help.
/^[a-zA-Z\s\-\'\"]*$/
Use this.
It allows any letter (upper or lower case), space, hyphen, " and '.
update
If you are using it inside NSPredicate, make sure that you put the - at the end, as it throws an error otherwise.
Move it to the end of the sequence so it is the last character before the closing square bracket ],
like this: [a-zA-Z '"-]
If you want only the alphabets and space, ' and - then:
/^[-a-zA-Z\s\']+$/
Notice the + from above instead of *. If you use * then it will match with empty string, where the + sign means to have at least one character in your input.
Now, if you want to match any letters together with any special characters (not only the three mentioned), then I'd suggest you use this one:
/^\D+$/
It means any characters other than digits!
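To make the difference between * and + concrete, here is a small Python sketch using the letters, whitespace, apostrophe and hyphen class from above. It is only an illustration; the variable names and sample strings are invented.

```python
import re

# Letters, whitespace, apostrophe and hyphen; the hyphen is placed first
# in the class so it cannot be read as a range.
allow_empty = re.compile(r"[-a-zA-Z\s']*")
require_one = re.compile(r"[-a-zA-Z\s']+")

for sample in ["Anne-Marie O'Neil", "", "Room 101"]:
    print(repr(sample),
          allow_empty.fullmatch(sample) is not None,
          require_one.fullmatch(sample) is not None)
```

The empty string passes only the * version, and "Room 101" fails both because digits are not in the class.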
Maybe try this:
\b[a-zA-Z \-\']+\b
http://regex101.com/r/oQ5nU9
You can use this; it will definitely work:
[a-zA-Z._^%$#!~#,-]+
This works fine for me, so give it a try.
// Use this to allow spaces as well as other special characters.
@"[a-zA-Z\\s\\-\\'\\\"]"
// The following link may help further.
http://www.raywenderlich.com/30288/nsregularexpression-tutorial-and-cheat-sheet
Thanks for your response. I finally resolved it with this:
NSString *characterRegex = @"^(\s[a-zA-Z]+(([\'\-\+\s]\s*[a-zA-Z])?[a-zA-Z])\s)+$";
NSPredicate *characterTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", characterRegex];
return [characterTest evaluateWithObject:inputString];

Problem: not replacing the minus sign (-) with a blank using regex

I am using this regular expression to replace some characters with "".
I used it as:
query=query.replace(/[^a-zA-Z 0-9 * ? : . + - ^ "" _]+/g,'');
When my query is +White+Diamond, I get +White+Diamond as expected, but when the query is -White+diamond I get White+diamond. The - is replaced by "", which I don't want.
Please tell me what the problem is.
Inside a character class, a - between two characters means "from ... to ..." (a range). Escape your - with a backslash: \-.
What SteeveDroz said:
query=query.replace(/[^a-zA-Z0-9*?:.+\-^"_ ]+/g,'');
I'm assuming you want to keep spaces as well. If not, remove the final space from the character class.
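The effect is easy to reproduce. The sketch below uses Python's re module, which reads this character class the same way, to compare the original and the corrected pattern; broken and fixed are names made up for the example.

```python
import re

# Original class: " - " sits between two spaces, so it is read as the
# range from space to space and "-" itself is NOT in the allowed set.
broken = re.compile(r'[^a-zA-Z 0-9 * ? : . + - ^ "" _]+')
# Corrected class: the hyphen is escaped, so it is a literal "-".
fixed = re.compile(r'[^a-zA-Z0-9*?:.+\-^"_ ]+')

print(broken.sub("", "-White+diamond"))  # White+diamond
print(fixed.sub("", "-White+diamond"))   # -White+diamond
```

Placing the hyphen first or last in the class would work as well, since it cannot form a range there.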

Regex for quoted string with escaping quotes

How do I get the substring " It's big \"problem " using a regular expression?
s = ' function(){ return " It\'s big \"problem "; }';
/"(?:[^"\\]|\\.)*"/
Works in The Regex Coach and PCRE Workbench.
Example of test in JavaScript:
var s = ' function(){ return " Is big \\"problem\\", \\no? "; }';
var m = s.match(/"(?:[^"\\]|\\.)*"/);
if (m != null)
alert(m);
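The same pattern can be tried against the string from the question. This is a minimal Python sketch; the doubled backslash in the literal is just Python's way of writing one real backslash.

```python
import re

# Python source for the text:  function(){ return " It's big \"problem "; }
s = ' function(){ return " It\'s big \\"problem "; }'

# Opening quote, then any mix of "not a quote or backslash" and
# "backslash plus any character", then the closing quote.
pattern = re.compile(r'"(?:[^"\\]|\\.)*"')

m = pattern.search(s)
print(m.group(0) if m else None)  # " It's big \"problem "
```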
This one comes from nanorc.sample available in many linux distros. It is used for syntax highlighting of C style strings
\"(\\.|[^\"])*\"
As provided by ePharaoh, the answer is
/"([^"\\]*(\\.[^"\\]*)*)"/
To have the above apply to either single quoted or double quoted strings, use
/"([^"\\]*(\\.[^"\\]*)*)"|\'([^\'\\]*(\\.[^\'\\]*)*)\'/
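As a quick check of this "unrolled" shape, here is a Python sketch that combines the double-quoted and single-quoted forms. Non-capturing groups are used so that findall returns whole matches; DOUBLE, SINGLE and QUOTED are illustrative names.

```python
import re

# Unrolled form: a run of ordinary characters, then zero or more
# (escape pair + run of ordinary characters) groups.
DOUBLE = r'"[^"\\]*(?:\\.[^"\\]*)*"'
SINGLE = r"'[^'\\]*(?:\\.[^'\\]*)*'"
QUOTED = re.compile(DOUBLE + "|" + SINGLE)

text = r'''a = "say \"hi\""; b = 'it\'s'; c = "plain"'''
print(QUOTED.findall(text))
```

Each character can be consumed in only one way here, which is why this form tends to need fewer matching steps than (A|B)*.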
Most of the solutions provided here use alternative repetition paths, i.e. (A|B)*.
You may encounter stack overflows on large inputs, since some pattern compilers implement this using recursion.
Java for instance: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6337993
Something like this:
"(?:[^"\\]*(?:\\.)?)*", or the one provided by Guy Bedford, reduces the number of parsing steps and avoids most stack overflows.
/(["\']).*?(?<!\\)(\\\\)*\1/is
should work with any quoted string
"(?:\\"|.)*?"
Alternating the \" and the . passes over escaped quotes while the lazy quantifier *? ensures that you don't go past the end of the quoted string. Works with .NET Framework RE classes
/"(?:[^"\\]++|\\.)*+"/
Taken straight from man perlre on a Linux system with Perl 5.22.0 installed.
As an optimization, this regex uses the 'possessive' form of both + and * to prevent backtracking, for it is known beforehand that a string without a closing quote wouldn't match in any case.
This one works perfectly on PCRE and does not fail with a stack overflow.
"(.*?[^\\])??((\\\\)+)?+"
Explanation:
Every quoted string starts with Char: " ;
It may contain any number of any characters: .*? {Lazy match}; ending with non escape character [^\\];
Statement (2) is Lazy(!) optional because string can be empty(""). So: (.*?[^\\])??
Finally, every quoted string ends with Char("), but it can be preceded by any number of escape sign pairs (\\\\)+; and it is Greedy(!) optional: ((\\\\)+)?+ {Greedy matching}, because the string can be empty or have no ending pairs!
An option that has not been touched on before is:
Reverse the string.
Perform the matching on the reversed string.
Re-reverse the matched strings.
This has the added bonus of being able to correctly match escaped open tags.
Let's say you had the following string: String \"this "should" NOT match\" and "this \"should\" match"
Here, \"this "should" NOT match\" should not be matched and "should" should be.
On top of that this \"should\" match should be matched and \"should\" should not.
First an example.
// The input string.
const myString = 'String \\"this "should" NOT match\\" and "this \\"should\\" match"';
// The RegExp.
const regExp = new RegExp(
// Match close
'([\'"])(?!(?:[\\\\]{2})*[\\\\](?![\\\\]))' +
'((?:' +
// Match escaped close quote
'(?:\\1(?=(?:[\\\\]{2})*[\\\\](?![\\\\])))|' +
// Match everything thats not the close quote
'(?:(?!\\1).)' +
'){0,})' +
// Match open
'(\\1)(?!(?:[\\\\]{2})*[\\\\](?![\\\\]))',
'g'
);
// Reverse the matched strings.
matches = myString
// Reverse the string.
.split('').reverse().join('')
// '"hctam "\dluohs"\ siht" dna "\hctam TON "dluohs" siht"\ gnirtS'
// Match the quoted
.match(regExp)
// ['"hctam "\dluohs"\ siht"', '"dluohs"']
// Reverse the matches
.map(x => x.split('').reverse().join(''))
// ['"this \"should\" match"', '"should"']
// Re order the matches
.reverse();
// ['"should"', '"this \"should\" match"']
Okay, now to explain the RegExp.
This regexp can be easily broken into three pieces, as follows:
# Part 1
(['"]) # Match a closing quotation mark " or '
(?! # As long as it's not followed by
(?:[\\]{2})* # A pair of escape characters
[\\] # and a single escape
(?![\\]) # As long as that's not followed by an escape
)
# Part 2
((?: # Match inside the quotes
(?: # Match option 1:
\1 # Match the closing quote
(?= # As long as it's followed by
(?:\\\\)* # A pair of escape characters
\\ # and a single escape
(?![\\]) # As long as that's not followed by an escape
)
)| # OR
(?: # Match option 2:
(?!\1). # Any character that isn't the closing quote
)
)*) # Match the group 0 or more times
# Part 3
(\1) # Match an open quotation mark that is the same as the closing one
(?! # As long as it's not followed by
(?:[\\]{2})* # A pair of escape characters
[\\] # and a single escape
(?![\\]) # As long as that's not followed by an escape
)
This is probably a lot clearer in image form: generated using Jex's Regulex
Image on github (JavaScript Regular Expression Visualizer.)
Sorry, I don't have a high enough reputation to include images, so, it's just a link for now.
Here is a gist of an example function using this concept that's a little more advanced: https://gist.github.com/scagood/bd99371c072d49a4fee29d193252f5fc#file-matchquotes-js
Here is one that works with both " and ', and you can easily add others at the start.
("|')(?:\\\1|[^\1])*?\1
It uses the backreference (\1) to match exactly what is in the first group (" or ').
http://www.regular-expressions.info/backref.html
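One caveat when porting this: in Python's re module a numeric escape inside a character class is treated as a character, not a backreference, so [^\1] does not mean "anything but the opening quote" there. A hedged Python sketch of the same idea, using a negative lookahead instead (QUOTED and the sample text are invented for the example):

```python
import re

# Group 1 captures the opening quote. Inside the string, accept either an
# escape pair or any character that is not the opening quote, then close
# with the same quote via the backreference \1.
QUOTED = re.compile(r"""(["'])(?:\\.|(?!\1).)*?\1""")

text = """say "it's" and 'a "b" c' and "x \\" y" """
print([m.group(0) for m in QUOTED.finditer(text)])
```

finditer with group(0) is used because findall would return only the captured quote character.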
One has to remember that regexps aren't a silver bullet for everything string-y. Some things are simpler to do with a cursor and linear, manual seeking. A CFL would do the trick pretty trivially, but there aren't many CFL implementations (afaik).
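A cursor-based scan along those lines might look like the Python sketch below. It is one possible reading of "linear, manual seeking", not a reference implementation: find_quoted is a hypothetical helper, and it assumes that a backslash always escapes the next character, both inside and outside a quoted string.

```python
def find_quoted(text, quotes="\"'"):
    """Return quoted substrings (quotes included) in one left-to-right pass."""
    results = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # escaped character outside a string: skip the pair
            continue
        if ch in quotes:
            start, j = i, i + 1
            while j < len(text) and text[j] != ch:
                # A backslash escapes the next character, whatever it is.
                j += 2 if text[j] == "\\" else 1
            if j >= len(text):
                break  # unterminated string: stop scanning
            results.append(text[start:j + 1])
            i = j + 1
            continue
        i += 1
    return results

sample = 'String \\"this "should" NOT match\\" and "this \\"should\\" match"'
print(find_quoted(sample))
```

Every character is visited at most once, so there is no backtracking and no recursion depth to worry about.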
A more extensive version of https://stackoverflow.com/a/10786066/1794894
/"([^"\\]{50,}(\\.[^"\\]*)*)"|\'[^\'\\]{50,}(\\.[^\'\\]*)*\'|“[^”\\]{50,}(\\.[^“\\]*)*”/
This version also contains
Minimum quote length of 50
Extra type of quotes (open “ and close ”)
If it is searched from the beginning, maybe this can work?
\"((\\\")|[^\\])*\"
I faced a similar problem trying to remove quoted strings that may interfere with parsing of some files.
I ended up with a two-step solution that beats any convoluted regex you can come up with:
line = line.replace("\\\"","\'"); // Replace escaped quotes with something easier to handle
line = line.replaceAll("\"([^\"]*)\"","\"x\""); // Simple is beautiful
Easier to read and probably more efficient.
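The same two steps translate almost directly to other languages. Here is a Python sketch of the idea; mask_quoted is a name made up for the example, and the "x" placeholder is taken from the Java lines above.

```python
import re

def mask_quoted(line: str) -> str:
    # Step 1: turn every escaped double quote (\") into a single quote so
    # that no double quote inside a string can end the match early.
    line = line.replace('\\"', "'")
    # Step 2: with escapes out of the way, a quoted string is simply
    # quote, non-quotes, quote. Replace each one with the placeholder "x".
    return re.sub(r'"[^"]*"', '"x"', line)

print(mask_quoted('key = "a \\"quoted\\" value"; other = "plain"'))
# key = "x"; other = "x"
```

Note that step 1 changes the text, so this fits cases where the quoted content is being discarded anyway.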
If your IDE is IntelliJ IDEA, you can forget all these headaches: store your regex in a String variable, and when you paste it inside the double quotes it is automatically escaped into a form the compiler accepts.
example in Java:
String s = "\"en_usa\":[^\\,\\}]+";
now you can use this variable in your regexp or anywhere.
(?<="|')(?:[^"\\]|\\.)*(?="|')
" It\'s big \"problem "
match result:
It\'s big \"problem
("|')(?:[^"\\]|\\.)*("|')
" It\'s big \"problem "
match result:
" It\'s big \"problem "
Messed around at regexpal and ended up with this regex: (Don't ask me how it works, I barely understand even tho I wrote it lol)
"(([^"\\]?(\\\\)?)|(\\")+)+"