I only want to allow
Numbers
Letters
Spaces
International Letters
Anything else I want to remove.
I am using ColdFusion. I haven't tried much because I have never really used regex before. I am trying to remove the "bad" characters.
Here is what I am doing so far:
<cfset theText = "Baum -$&*( 5 Steine hoch groß 3 Stück grün****">
<cfset test1 = rereplace(theText, '[\p{L}0-9 ]', ' ', 'all')>
<cfset test2 = rereplace(theText, '[^\p{L}0-9 ]', ' ', 'all')>
The results:
Original Text: Baum -$&*( 5 Steine hoch groß 3 Stück grün****
Test 1 Result: Baum -$&*( Steine hoch groß Stück grün****
Test 2 Result: 5 3
In the end, I wound up doing this, and it seems to give me what I need:
<cfset finalFile = varData.replaceAll('[^\p{L}0-9-.: ]',' ') />
Your question is a bit vague, but this regex sounds like it might fit your description.
[^\p{L}0-9 ]
You don't specify a language or flavor, so assuming \p{L} is supported, simply replace anything that matches this pattern with an empty string "".
Small demo: http://rubular.com/r/W4q5PFSJRg
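As a rough sketch of the suggestion above: ColdFusion strings are Java strings underneath (the question's final replaceAll call relies on that), so the same pattern can be tried in plain Java, whose regex engine supports \p{L}. The class and method names below are made up for illustration only.

```java
class KeepLettersDigitsSpaces {
    // Replaces every character that is NOT a Unicode letter, an ASCII digit,
    // or a space with the given replacement (hypothetical helper name).
    static String clean(String text, String replacement) {
        return text.replaceAll("[^\\p{L}0-9 ]", replacement);
    }

    public static void main(String[] args) {
        // Sample text from the question; the umlauts and the sharp s are written
        // as Unicode escapes so the file compiles under any source encoding.
        String theText = "Baum -$&*( 5 Steine hoch gro\u00df 3 St\u00fcck gr\u00fcn****";
        System.out.println(clean(theText, ""));   // disallowed characters dropped
        System.out.println(clean(theText, " "));  // disallowed characters turned into spaces
    }
}
```

Replacing with "" leaves any spaces that surrounded the removed characters, so runs of two or more spaces can remain; collapse them afterwards if that matters.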
I am using IsValid (here is the documentation). Below is the code where I am trying to validate that a text field contains only letters and spaces, using ColdFusion.
This doesn't work. What am I missing here, or is there another function that is easier to use? It should allow only alphabetical characters and spaces.
<cfif isdefined("Form.txtname")
and Form.txtname eq ""
or Form.txtname eq "Enter your name"
or FindNoCase("http://",Form.txtname) neq 0
or IsValid("regex", Form.txtname, "[A-Z][a-z] +") eq 1>
If you want to validate only alphabetical text and spaces, your regex should be
^[a-zA-Z ]*$
The * allows an empty text field; if empty input should be rejected, use + instead, which also makes the separate eq "" check unnecessary.
^ and $ are anchors that match the beginning and the end of the string, respectively. They make sure the text field contains nothing but what you want.
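To see why the anchors matter, here is a small Java sketch (Java only because its regex behaviour is easy to check; the class and method names are hypothetical). It assumes the validator does a "find anywhere" search, in which case an unanchored pattern is satisfied by any matching fragment of the input.

```java
class NameCheck {
    // Anchored: the whole input must be ASCII letters and spaces (may be empty).
    private static final java.util.regex.Pattern ANCHORED =
            java.util.regex.Pattern.compile("^[a-zA-Z ]*$");

    // Unanchored: satisfied by any fragment, even an empty one.
    private static final java.util.regex.Pattern UNANCHORED =
            java.util.regex.Pattern.compile("[a-zA-Z ]*");

    static boolean isLettersAndSpaces(String input) {
        return ANCHORED.matcher(input).find();
    }

    static boolean unanchoredFind(String input) {
        return UNANCHORED.matcher(input).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"John Smith", "John99", "http://example.com", ""}) {
            System.out.println("\"" + s + "\" anchored=" + isLettersAndSpaces(s)
                    + " unanchored=" + unanchoredFind(s));
        }
    }
}
```

The unanchored version reports a match for every input, which is why it cannot be used to reject anything.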
There is a variable being set as follows (through custom tag invocation)
<cfset str = Trim( THISTAG.GeneratedContent ) />
The contents of THISTAG.GeneratedContent looks like
FNAME|MNAME|LNAME Test|Test|Test
The code I am having trouble understanding is as follows:
<cfset str = str.ReplaceAll(
"(?m)^[\t ]+|[\t ]+$",
""
) />
<cfset arrRows = str.Split( "[\r\n]+" ) />
The above line of code should generate an array with the following contents:
arrRows[1] = FNAME|MNAME|LNAME
arrRows[2] = Test|Test|Test
But dumping the array shows the following output:
FNAME|MNAME|LNAME Test|Test|Test
I do not understand what both regular expressions are trying to achieve.
This one...
<cfset str = str.ReplaceAll(
"(?m)^[\t ]+|[\t ]+$",
""
) />
...removes any tabs/spaces at the beginning or end of each line. The (?m) turns on multiline mode, which causes ^ to match "start of line" (as opposed to its usual "start of content"); similarly, $ means "end of line" (rather than "end of content") in this mode.
This one...
<cfset arrRows = str.Split( "[\r\n]+" ) />
...is converting lines to an array, by splitting on any combination of consecutive carriage returns and/or newline characters.
Bonus Info
You can actually combine these two regexes into a single one, like so:
<cfset arrRows = str.split( '\s*\n\s*' ) />
The \s will match any whitespace character (including \r, \n, tab and space), and thus this combines the removal of spaces and tabs with turning the content into an array.
(Note that since it works by looking for newlines, the trim on GeneratedContent is still necessary for any preceding/trailing whitespace to be removed.)
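Since ReplaceAll and Split here are the underlying java.lang.String methods, both approaches can be compared in plain Java. This is only a sketch: the class name, method names and sample content are invented, and the input is assumed to be already trimmed, as in the question.

```java
class RowSplitter {
    // Two-step approach: strip tabs/spaces at line starts and ends, then split on line breaks.
    static String[] trimThenSplit(String content) {
        String trimmed = content.replaceAll("(?m)^[\\t ]+|[\\t ]+$", "");
        return trimmed.split("[\\r\\n]+");
    }

    // One-step approach: split on a newline together with any whitespace around it.
    static String[] splitOnNewlines(String content) {
        return content.split("\\s*\\n\\s*");
    }

    public static void main(String[] args) {
        // Hypothetical content: trailing spaces on line one, a tab and a space before line two.
        String content = "FNAME|MNAME|LNAME  \r\n\t Test|Test|Test";
        System.out.println(String.join(" / ", trimThenSplit(content)));
        System.out.println(String.join(" / ", splitOnNewlines(content)));
    }
}
```

Both methods only produce more than one row if the content really contains a line break; if it arrives as a single line, the array will have a single element.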
I have a list that I created in ColdFusion. Let's use the following list as an example:
<cfset arguments.tags = "battlefieldx, testx, wonderful, ererex">
What I would like to do is remove the "x" from the words that have an x at the end and keep the words in the list. Order doesn't matter. A regex would be fine or looping with coldfusion would be okay too.
Removing x from end of each list element...
To remove all x characters that precede a comma or the end of string, do:
rereplace( arguments.tags , "x(?=,|$)" , "" , "all" )
The (?= ) part here is a lookahead - it matches the position of its contents, but does not include them in what is replaced. The | is alternation - it'll try to match a literal , and if that fails it'll try to match the end of the string ($).
If you don't want to remove a lone x from, e.g. "x,marks,the,spot"...
If you want to make sure that x is at the end of a word (i.e. is not alone), you can use a non-word boundary check:
rereplace( arguments.tags , "\Bx(?=,|$)" , "" , "all" )
The \B will not match if there isn't a [a-zA-Z0-9_] before the x - for more complex/accurate rules on what constitutes "end of a word", you would need a lookbehind, which can't be done with rereplace, but is still easy enough by doing:
arguments.tags.replaceAll("(?<=\S)x(?=,|$)" , "" )
(That looks for a single non-whitespace character before the x to consider it part of a word, but you can put any limited-width expression within the lookbehind.)
Obviously, to do any letter, switch the x with [a-zA-Z] or whatever is appropriate.
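For anyone who wants to experiment with the lookahead and lookbehind variants, here is a minimal Java sketch using the same patterns through String.replaceAll (the method the last snippet above calls). The class and method names are purely illustrative.

```java
class TrailingXRemover {
    // Drops any x that sits directly before a comma or the end of the string.
    static String stripTrailingX(String list) {
        return list.replaceAll("x(?=,|$)", "");
    }

    // Same, but only when a non-whitespace character precedes the x,
    // so an element that is just "x" is left alone.
    static String stripTrailingXKeepLone(String list) {
        return list.replaceAll("(?<=\\S)x(?=,|$)", "");
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingX("battlefieldx, testx, wonderful, ererex"));
        System.out.println(stripTrailingX("x,marks,the,spot"));
        System.out.println(stripTrailingXKeepLone("x,marks,the,spot"));
    }
}
```

Because the lookahead and lookbehind only inspect their neighbours without consuming them, the commas and spaces in the list are left exactly as they were.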
The regex to grab the 'x' from the end of a word is pretty straightforward. Supposing you have a given element as a string, the regex you need is simply:
REReplace(myString, "x$", "")
This matches an x at the end of the given string and replaces it with an empty string.
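To do this for each substring in a comma-delimited list, try:
REReplace(myString, "x(,|$)", "\1", "ALL")
The \1 backreference puts back whatever followed the x (a comma, or nothing at the end of the string), so no stray trailing comma is added.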
REReplace(myString, "x$", "")
The $ symbol is going to be used to detect the end of the string. Thus detecting an 'x' at the end of your string. The empty quotes will replace it with nothing, thus removing the 'x'.
This has already been answered, but thought I'd post a ColdFusion only solution since you said you could use either. (The RegEx is obviously much easier, but this will work too)
<cfset arguments.tags = "battlefieldx, testx, wonderful, ererex">
<cfset temparray = []>
<cfloop list="#arguments.tags#" index="i">
<cfif right(i,1) EQ 'X'>
<cfset arrayappend(temparray,left(i,len(i) - 1))>
<cfelse>
<cfset arrayappend(temparray,i)>
</cfif>
</cfloop>
<cfset arguments.tags = arraytolist(temparray)>
If you have ColdFusion 9+ or Railo, you can simplify the loop using a ternary operator:
<cfloop list="#arguments.tags#" index="i">
<cfset arrayappend(temparray, right(i,1) EQ 'X' ? left(i,len(i) - 1) : i)>
</cfloop>
You could also convert arguments.tags to an array and loop that way:
<cfloop array="#listtoarray(arguments.tags)#" index="i">
<cfset arrayappend(temparray, right(i,1) EQ 'X' ? left(i,len(i) - 1) : i)>
</cfloop>
How do I remove spaces and other whitespace characters from inside a string? I don't want to remove the space just from the ends of the string, but throughout the entire string.
You can use a regular expression
<cfset str = reReplace(str, "[[:space:]]", "", "ALL") />
You can also simply use ColdFusion's Replace() if you don't want to use regular expressions for some reason, but don't forget the optional "ALL" parameter.
I've run into this in the past, trying to remove 5 spaces in the middle of a string - I would do something like:
<cfset str = Replace(str, " ", "")/>
Forgetting the "ALL" will only replace the first occurrence so I would end up with 4 spaces, if that makes sense.
Be sure to use:
<cfset str = Replace(str, " ", "", "ALL")/>
to replace multiple spaces. Hope this helps!
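The "first match only" versus "all matches" difference can be sketched in plain Java too. One caveat: unlike ColdFusion's Replace(), Java's String.replace always replaces every occurrence, so replaceFirst is used below to mimic leaving out "ALL". The class and method names are hypothetical.

```java
class WhitespaceRemover {
    // Regex version: \s covers spaces, tabs and line breaks anywhere in the string.
    static String removeAllWhitespace(String s) {
        return s.replaceAll("\\s+", "");
    }

    // Literal version: removes every plain space, but leaves tabs and newlines alone.
    static String removeSpaces(String s) {
        return s.replace(" ", "");
    }

    // Mimics forgetting "ALL": only the first space is removed.
    static String removeFirstSpace(String s) {
        return s.replaceFirst(" ", "");
    }

    public static void main(String[] args) {
        String sample = "a     b\tc";
        System.out.println("[" + removeAllWhitespace(sample) + "]");
        System.out.println("[" + removeSpaces(sample) + "]");
        System.out.println("[" + removeFirstSpace(sample) + "]");
    }
}
```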