ColdFusion ReReplace - regex

What I'm trying to do is add forward slash to the beginning and end of a string of text if the first and last character of the string is not /.
In my script I have:
if(!reFind('\/\S\/', myString)){
myString = '/' & arrayToList(listToArray(myString, '/\'), '/') & '/';
}
I want to run a ReReplace instead of listing to an array and then adding the slashes in.

Using array to list and list to array could possibly remove inner slashes, so you don't want to do that. Instead, replace leading and trailing slashes with a regex.
<cfscript>
string1 = "foobar";
string2 = "/foobar/";
string3 = "foo/bar";
string4 = "/foo/bar/";
function addSlashes (str) {
return "/" & reReplace(str,"^/|/$","","all") & "/";
}
writeDump(addSlashes(string1));
writeDump(addSlashes(string2));
writeDump(addSlashes(string3));
writeDump(addSlashes(string4));
</cfscript>
You can paste the above into http://www.trycf.com to try it.

You should just be able to replace ^/?(.*?)/?$ with /\1/.
See a visual explanation at http://www.regexper.com/
Note the pattern I use at www.regexper.com is slightly different, as I need to escape the / for a JS pattern; not so with CFML ones.
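For comparison, here is a minimal sketch of the same replacement in JavaScript, where the slashes do need escaping inside a regex literal. The function name addSlashes is borrowed from the earlier answer, and the sketch assumes a single-line string, since . does not match newlines by default.

```javascript
// Sketch: wrap a string in exactly one leading and one trailing slash.
function addSlashes(str) {
  // ^\/?   optional leading slash
  // (.*?)  lazily captured middle part
  // \/?$   optional trailing slash at the end of the string
  return str.replace(/^\/?(.*?)\/?$/, "/$1/");
}

console.log(addSlashes("foobar"));    // /foobar/
console.log(addSlashes("foo/bar"));   // /foo/bar/
console.log(addSlashes("/foo/bar/")); // /foo/bar/
```

Inner slashes are left alone because only the first and last characters are treated as optional.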

Related

Trying to remove newlines and carriage returns from a text. Why doesn't this code work? [duplicate]

I have a text in a textarea and I read it out using the .value attribute.
Now I would like to remove all linebreaks (the character that is produced when you press Enter) from my text now using .replace with a regular expression, but how do I indicate a linebreak in a regex?
If that is not possible, is there another way?
How you'd find a line break varies between operating system encodings. Windows would be \r\n, but Linux just uses \n and Apple uses \r.
I found this in JavaScript line breaks:
someText = someText.replace(/(\r\n|\n|\r)/gm, "");
That should remove all kinds of line breaks.
Line breaks (better: newlines) can be one of Carriage Return (CR, \r, on older Macs), Line Feed (LF, \n, on Unices incl. Linux) or CR followed by LF (\r\n, on WinDOS). (Contrary to another answer, this has nothing to do with character encoding.)
Therefore, the most efficient RegExp literal to match all variants is
/\r?\n|\r/
If you want to match all newlines in a string, use a global match,
/\r?\n|\r/g
respectively. Then proceed with the replace method as suggested in several other answers. (Probably you do not want to remove the newlines, but replace them with other whitespace, for example the space character, so that words remain intact.)
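A minimal sketch of that suggestion, replacing each newline with a space rather than deleting it. The function name newlinesToSpaces and the sample string are made up for illustration.

```javascript
// Sketch: replace every newline variant (CRLF, LF, lone CR) with one space.
function newlinesToSpaces(text) {
  // \r?\n covers CRLF and LF; the |\r alternative covers a lone CR.
  return text.replace(/\r?\n|\r/g, " ");
}

const sample = "one\r\ntwo\nthree\rfour";
console.log(newlinesToSpaces(sample)); // one two three four
```

Because \r?\n is tried first, a CRLF pair is replaced by one space, not two.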
var str = " \n this is a string \n \n \n"
console.log(str);
console.log(str.trim());
String.trim() removes whitespace from the beginning and end of strings... including newlines.
const myString = " \n \n\n Hey! \n I'm a string!!! \n\n";
const trimmedString = myString.trim();
console.log(trimmedString);
// outputs: "Hey! \n I'm a string!!!"
Here's an example fiddle: http://jsfiddle.net/BLs8u/
NOTE! it only trims the beginning and end of the string, not line breaks or whitespace in the middle of the string.
You can use \n in a regex for newlines, and \r for carriage returns.
var str2 = str.replace(/\n|\r/g, "");
Different operating systems use different line endings, with varying mixtures of \n and \r. This regex will replace them all.
The simplest solution would be:
let str = '\t\n\r this \n \t \r is \r a \n test \t \r \n';
str = str.replace(/\s+/g, ' ').trim();
console.log(str); // logs: "this is a test"
.replace() with the /\s+/g regexp changes every run of whitespace characters in the string to a single space; .trim() then removes any whitespace left at the start and end of the text.
The following are considered whitespace characters:
[ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]
If you want to remove all control characters, including CR and LF, you can use this:
myString.replace(/[^\x20-\x7E]/gmi, "")
It will remove all non-printable characters, that is, all characters NOT within the ASCII hex range 0x20-0x7E. Note that this also removes every non-ASCII character, such as accented letters. Feel free to modify the hex range as needed.
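If only control characters should go and other non-ASCII text should stay, one possible alternative is a Unicode property escape. This is a sketch, assuming an engine that supports \p{...} with the u flag (ES2018 or later); the function name stripControlChars is hypothetical.

```javascript
// Sketch: remove only characters in the Unicode "Control" category (Cc),
// which includes CR, LF and tab, while keeping letters such as é.
function stripControlChars(text) {
  return text.replace(/\p{Cc}/gu, "");
}

const input = "caf\u00e9\r\n\tbar";
console.log(stripControlChars(input));             // cafébar
// For comparison, the ASCII-range pattern also drops the accented letter:
console.log(input.replace(/[^\x20-\x7E]/g, ""));   // cafbar
```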
This will replace each line break with an empty string.
someText = someText.replace(/(\r\n|\n|\r)/gm,"");
Read more on this article.
var str = "bar\r\nbaz\nfoo";
str.replace(/[\r\n]/g, '');
>> "barbazfoo"
To remove new line chars use this:
yourString.replace(/\r?\n?/g, '')
Then you can trim your string to remove leading and trailing spaces:
yourString.trim()
USE THIS FUNCTION BELOW AND MAKE YOUR LIFE EASY
The easiest approach is using regular expressions to detect and replace newlines in the string. In this case, we use replace function along with string to replace with, which in our case is an empty string.
function remove_linebreaks(message) {
return message.replace( /[\r\n]+/gm, "" );
}
In the above expression, g and m are the global and multiline flags (m has no effect here, because the pattern contains no ^ or $).
I often use this regex for (html) strings inside jsons:
replace(/[\n\r\t\s]+/g, ' ')
The strings come from a html editor of a CMS or a i18n php. The common scenarios are:
- lorem(.,)\nipsum
- lorem(.,)\n ipsum
- lorem(.,)\n
ipsum
- lorem ipsum
- lorem\n\nipsum
- ... many others with mixed whitespaces (\t\s) and even \r
The regex avoids ugly results like these:
lorem\nipsum => loremipsum
lorem,\nipsum => lorem,ipsum
lorem,\n\nipsum => lorem, ipsum
...
Surely not for all use cases and not the fastest one, but enough for most textareas and texts for websites or webapps.
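To make the difference concrete, here is a small sketch comparing removal with collapsing. The function name collapseWhitespace is made up; it uses \s+ alone, which behaves the same as [\n\r\t\s]+ because \s already includes \n, \r and \t.

```javascript
// Sketch: collapse every run of whitespace into a single space.
function collapseWhitespace(text) {
  return text.replace(/\s+/g, " ");
}

const html = "lorem,\n\nipsum";
console.log(html.replace(/[\r\n]+/g, "")); // lorem,ipsum  (words glued together)
console.log(collapseWhitespace(html));     // lorem, ipsum
```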
The answer provided by PointedEars is everything most of us need. But by following Mathias Bynens's answer, I went on a Wikipedia trip and found this: https://en.wikipedia.org/wiki/Newline.
The following is a drop-in function that implements everything the above Wiki page considers "new line" at the time of this answer.
If something doesn't fit your case, just remove it. Also, if you're looking for performance this might not be it, but for a quick tool that does the job in any case, this should be useful.
// replaces all "new line" characters contained in `someString` with the given `replacementString`
const replaceNewLineChars = ((someString, replacementString = ``) => { // defaults to just removing
const LF = `\u{000a}`; // Line Feed (\n)
const VT = `\u{000b}`; // Vertical Tab
const FF = `\u{000c}`; // Form Feed
const CR = `\u{000d}`; // Carriage Return (\r)
const CRLF = `${CR}${LF}`; // (\r\n)
const NEL = `\u{0085}`; // Next Line
const LS = `\u{2028}`; // Line Separator
const PS = `\u{2029}`; // Paragraph Separator
const lineTerminators = [LF, VT, FF, CR, CRLF, NEL, LS, PS]; // all Unicode `lineTerminators`
let finalString = someString.normalize(`NFD`); // better safe than sorry? Or is it?
for (let lineTerminator of lineTerminators) {
if (finalString.includes(lineTerminator)) { // check if the string contains the current `lineTerminator`
let regex = new RegExp(lineTerminator.normalize(`NFD`), `gu`); // create the `regex` for the current `lineTerminator`
finalString = finalString.replace(regex, replacementString); // perform the replacement
};
};
return finalString.normalize(`NFC`); // return the `finalString` (without any Unicode `lineTerminators`)
});
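For cases where a single pass is enough, the same list of terminators can be sketched as one regular expression. The names NEWLINE_PATTERN and replaceNewlines are hypothetical, and the sketch leaves out the normalize() calls of the function above.

```javascript
// Sketch: one pattern for the same terminators (LF, VT, FF, CR, CRLF, NEL, LS, PS).
// \r\n is listed first so that a CRLF pair counts as one terminator.
const NEWLINE_PATTERN = /\r\n|[\n\v\f\r\u0085\u2028\u2029]/g;

function replaceNewlines(text, replacement = "") {
  return text.replace(NEWLINE_PATTERN, replacement);
}

console.log(replaceNewlines("a\r\nb\u2028c\u0085d", " ")); // a b c d
```

One behavioral difference: the loop above replaces LF and CR in separate passes, so a CRLF pair there yields the replacement twice, whereas this sketch yields it once.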
We can simply remove newlines by using text.replace(/\n/g, " "):
const text = 'Students next year\n GO \n For Trip \n';
console.log("Original : ", text);
var removed_new_line = text.replace(/\n/g, " ");
console.log("New : ", removed_new_line);
A linebreak in regex is \n, so your script would be
var test = 'this\nis\na\ntest\nwith\nnewlines';
console.log(test.replace(/\n/g, ' '));
I am adding my answer as an add-on to the above.
I tried all the \n options and they didn't work. I then saw that my text was coming from the server with a double backslash (a literal \\n rather than a real newline), so I used this:
var fixedText = yourString.replace(/(\r\n|\n|\r|\\n)/gm, '');
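A short sketch of the distinction, using a made-up sample string: in JavaScript source, "\\n" is two characters (a backslash and the letter n), while "\n" is a single line feed. The extra \\n alternative in the regex catches the first kind.

```javascript
// Sketch: the sample mixes a literal backslash-n with a real line feed.
const fromServer = "line one\\nline two\nline three";

// \\n in the regex matches a backslash followed by "n";
// the other alternatives match real CRLF, LF and CR.
const cleaned = fromServer.replace(/\r\n|\n|\r|\\n/g, " ");
console.log(cleaned); // line one line two line three
```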
Try the following code. It works on all platforms.
var break_for_winDOS = 'test\r\nwith\r\nline\r\nbreaks';
var break_for_linux = 'test\nwith\nline\nbreaks';
var break_for_older_mac = 'test\rwith\rline\rbreaks';
break_for_winDOS.replace(/(\r?\n|\r)/gm, ' ');
//output
'test with line breaks'
break_for_linux.replace(/(\r?\n|\r)/gm, ' ');
//output
'test with line breaks'
break_for_older_mac.replace(/(\r?\n|\r)/gm, ' ');
// Output
'test with line breaks'
If it happens that you don't want the HTML character &nbsp; while using str.replace(/(\r\n|\n|\r)/gm, ""), you can use this instead: str.split('\n').join('');
cheers
1st way:
const yourString = 'How are you \n I am fine \n Hah'; // Or textInput, something else
const newStringWithoutLineBreaks = yourString.replace(/(\r\n|\n|\r)/gm, "");
2nd way:
const yourString = 'How are you \n I am fine \n Hah'; // Or textInput, something else
const newStringWithoutLineBreaks = yourString.split('\n').join('');
On Mac, just use \n in the regexp to match line breaks, so the code will be string.replace(/\n/g, ''). P.S.: the trailing g means match all occurrences instead of just the first.
On Windows, the line break is \r\n.
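This will remove all your newlines, spaces and other unnecessary characters. Note that \W matches every non-word character, so spaces and punctuation inside the text are removed too.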
str = '\n \n\n\n\n\n\n\n\n\n\n\n\n \n \n \n \n Books\n \n \n \n \n\n\n'
console.log(str)
var output = str.replace(/\n|\r|\W/g, "");
console.log(output)
'Books'
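The pattern works for a single word like Books, but \W is aggressive. A small sketch with a made-up multi-word sample shows the effect, next to a gentler whitespace-only alternative:

```javascript
// Sketch: \W strips everything that is not a letter, digit or underscore.
const messy = "\n \n  My Books, vol. 2 \n\n";

console.log(messy.replace(/\n|\r|\W/g, ""));    // MyBooksvol2
// Collapsing whitespace and trimming keeps inner spaces and punctuation:
console.log(messy.replace(/\s+/g, " ").trim()); // My Books, vol. 2
```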
const text = 'test\nwith\nline\nbreaks'
const textWithoutBreaks = text.split('\n').join(' ')

Matlab: using regexp to get a string that has a whitespace in between

I want to use Regex to acquire some ID's in a cellstring array, the array looks like this:
myString = '([''US04650Y1001'', ''US90274P3029'', ''HON WI'', ''US41165F1012''])';
My pattern for regex is as follows:
pattern = '[A-Za-z0-9.^_]+';
newArr = regexp(myString, pattern,'match');
I'd like to get the ID called 'HON WI', but with my current pattern, its splitting it into two because my pattern can't deal with the whitespace properly. I would like to get the whole "HON WI", as well as my other strings, everything that's in '', these might have special characters like ^, . or _, but I don't know how to add the whitespace.
I already tried stuff like this, without success:
pattern = '[A-Za-z0-9.^_\s]+';
My new array should have, in each cell, the strings/ID's contained in myString (US04650Y1001, US90274P3029, HON WI and US41165F1012) with dimensions 1x4.
Another approach that seems to work but not entirely sure:
myString = strrep(myString,'([','');
myString = strrep(myString,'])','');
myString = regexp(myString,',','split');
myString = strrep(myString,'''','');
This seems to get me what I want, but I would like to know how can I alter the regex on my first approach.
Many thanks in advance.
You may use a mere '([^']+)' regex and use 'tokens' to get the captures:
myString = '([''US04650Y1001'', ''US90274P3029'', ''HON WI'', ''US41165F1012''])';
pattern = '''([^'']+)''';
tokens = regexp(myString, pattern, 'tokens'); % one nested cell per match
newArr = [tokens{:}]; % flatten into a 1x4 cell array
The newArr will look like
{
[1,1] = US04650Y1001
[1,2] = US90274P3029
[1,3] = HON WI
[1,4] = US41165F1012
}
Another option is to use lookaround assertions. The following will match any string made of alphanumeric characters or underscores (\w), spaces (' ') or the characters . or ^, that is located between quotes. This will specifically exclude the blank space next to the comma, in the separation between tokens, i.e. ', ' does not give a match.
Note that \s will match any blank space character (including tab, newline), this is why a space is preferred here:
pattern2='(?<='')[\w.^ ]+(?='')';
pattern2 =
(?<=')[\w.^ ]+(?=')
newArr = regexp(myString, pattern2,'match');
newArr'
ans =
'US04650Y1001'
'US90274P3029'
'HON WI'
'US41165F1012'
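For readers more at home in JavaScript, the same "everything between single quotes" idea can be sketched with a capture group. This is only an illustration of the regex, not MATLAB code; the variable names are made up, and it assumes an engine with String.prototype.matchAll.

```javascript
// Sketch: capture whatever sits between each pair of single quotes.
const idList = "(['US04650Y1001', 'US90274P3029', 'HON WI', 'US41165F1012'])";

// [^']+ allows spaces, dots, carets and underscores, so 'HON WI' stays whole.
const ids = Array.from(idList.matchAll(/'([^']+)'/g), (m) => m[1]);
console.log(ids.length); // 4
console.log(ids[2]);     // HON WI
```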

Add parameter to String#replace

I want to remove the last character from a string if it is a pipe. I have
.replace(/\|(\s+)?$/, '')
I want to add a parameter delim to replace since the last character changes. I am trying:
.replace(/\+delim +(\s+)?$/, '')
but no luck.
The code that uses this function:
rangeValues[cellRow][hn[j]] = rangeValues[cellRow][hn[j]].toString()
.split(frValues[i][0])
.join(frValues[i][1]).trim()
.replace(/\ + delim + (\s+)?$/, '');
You want to remove the last character using a regex.
You want the delimiter in that regex to come from the variable delim.
If my understanding of your question is correct, how about using RegExp?
Modified script:
var delim = "|";
var string = "\\" + delim + "(\\s+)?$";
var regex = new RegExp(string);
rangeValues[cellRow][hn[j]] = rangeValues[cellRow][hn[j]].toString()
.split(frValues[i][0])
.join(frValues[i][1]).trim()
.replace(regex, '');
Note :
When delim is |, regex becomes /\|(\s+)?$/.
Reference:
RegExp
If I misunderstand your question, I'm sorry.
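One thing to watch when the delimiter varies: prefixing it with a backslash works for |, but for a letter such as d the result \d would mean "any digit". A hedged sketch of a more general approach is to escape only regex metacharacters; the helper names escapeRegExp and stripTrailingDelim are hypothetical.

```javascript
// Sketch: escape regex metacharacters so the delimiter is matched literally.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Remove one trailing delimiter plus any whitespace after it.
function stripTrailingDelim(text, delim) {
  const regex = new RegExp(escapeRegExp(delim) + "\\s*$");
  return text.replace(regex, "");
}

console.log(stripTrailingDelim("a|b| ", "|")); // a|b
console.log(stripTrailingDelim("a.b.", "."));  // a.b
```

Here \s*$ plays the same role as (\s+)?$ in the original pattern.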

Exact Match using VBA Replace

I am trying to remove a string from another string using VBA replace function.
The string from which I am trying to remove looks like below which contains cell address concatenated by ;
"$B$1;$B$21;$B$2;$C$3;$B$20;$B$201"
and the string which I would like to replace is $B$2, with, say, xxx.
The replace function matches all occurrences of $B$2 in the string and gives me the output as below
$B$1;xxx1;xxx;$C$3;xxx0;xxx01
However I would like to search for $B$2 exactly in the string and expect an output like
$B$1;$B$21;xxx;$C$3;$B$20;$B$201
One way I could think of doing this is by splitting the string on the ; separator, looping over the parts and looking at each value, but I am looking for a more direct solution here, like pattern matching techniques or something else.
You can include the following ; in the replace operation, to make sure you only match "complete" references. You just need to take a precaution for also matching the last entry, by adding a dummy semicolon at the end:
s = "$B$1;$B$21;$B$2;$C$3;$B$20;$B$201"
find = "$B$2"
repl = "xxx"
result = Replace(s & ";", find & ";", repl & ";")
result = Left(result, Len(result)-1) ' Remove the final semicolon
Although this works for your case, in a more general exercise, you would also want to test for the preceding delimiter, and then the last two lines would look like this:
result = Replace(";" & s & ";", ";" & find & ";", ";" & repl & ";")
result = Mid(result, 2, Len(result)-2)
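The delimiter-wrapping idea is not specific to VBA. As an illustration only, here is the same technique sketched in JavaScript; the function name replaceExactEntry and its sep parameter are made up. It uses split and join so that the $ characters are treated literally.

```javascript
// Sketch: wrap the list and the search term in separators so that
// only complete entries can match, then strip the added separators.
function replaceExactEntry(list, find, repl, sep = ";") {
  const wrapped = sep + list + sep;
  const replaced = wrapped.split(sep + find + sep).join(sep + repl + sep);
  return replaced.slice(sep.length, replaced.length - sep.length);
}

const cells = "$B$1;$B$21;$B$2;$C$3;$B$20;$B$201";
console.log(replaceExactEntry(cells, "$B$2", "xxx"));
// $B$1;$B$21;xxx;$C$3;$B$20;$B$201
```

Note that two adjacent identical entries share a separator, so only the first of such a pair is replaced; the same holds for the VBA Replace version.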
You could use:
Function myReplace(strng As String, findStr As String, replacementStrng As String)
myReplace = Replace(strng & ";", findStr & ";", replacementStrng & ";")
myReplace = Left(myReplace, Len(myReplace) - 1)
End Function
to be used in your "main" sub as follows:
strng = "$B$1;$B$21;$B$2;$C$3;$B$20;$B$201"
MsgBox myReplace(strng, "$B$2", "xxx")

Remove a number from a comma separated string while properly removing commas

FOR EXAMPLE: Given a string... "1,2,3,4"
I need to be able to remove a given number and the comma after/before depending on if the match is at the end of the string or not.
remove(2) = "1,3,4"
remove(4) = "1,2,3"
Also, I'm using javascript.
As jtdubs shows, an easy way is to use a split function to obtain an array of elements without the commas, remove the required element from the array, and then rebuild the string with a join function.
For javascript something like this might work:
function remove(array,to_remove)
{
var elements=array.split(",");
var remove_index=elements.indexOf(to_remove);
if (remove_index !== -1) // indexOf returns -1 when the element is absent
elements.splice(remove_index,1);
var result=elements.join(",");
return result;
}
var string="1,2,3,4,5";
var newstring = remove(string,"4"); // newstring will contain "1,2,3,5"
document.write(newstring+"<br>");
newstring = remove(string,"5");
document.write(newstring+"<br>"); // will contain "1,2,3,4"
You also need to consider the behavior you want if you have repeats: say the string is "1,2,2,4" and I call remove(2), should it remove both instances or just the first? This function will remove only the first instance.
Just use multiple substitutions.
s/^$removed,//;
s/,$removed$//;
s/,$removed,/,/;
This will be easier than trying to invent a single replacement that handles all those cases.
string input = "1,2,3,4";
List<string> parts = new List<string>(input.Split(new char[] { ',' }));
parts.Remove("2"); // removes the first element equal to "2"
string output = String.Join(",", parts);
Instead of using regex, I would do something like:
- split on comma
- delete the right element
- join with comma
Here is a perl script that does the job:
#!/usr/bin/perl
use 5.10.1;
use strict;
use warnings;
my $toremove = 5;
my $string = "1,2,3,4,5";
my @tmp = split/,/, $string;
@tmp = grep{ $_ != $toremove }@tmp;
$string =join',', @tmp;
say $string;
Output:
1,2,3,4
Javascript has improved since this question was posted.
I use the following regex to remove items from a csv string
let searchStr = "359";
let regex = new RegExp("^" + searchStr + ",?|," + searchStr);
csvStr = csvStr.replace(regex, "");
If the searchStr is the start, middle, end, or only item, it is replaced.
If the searchStr is at the start of the csvStr it and any trailing comma is replaced. Else if the searchStr is anywhere else in the csvStr it must be preceded with a comma so the searchStr and its preceding comma are replaced by an empty string.
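One caveat with the pattern above is that it has no boundary after searchStr, so 359 would also match the start of 3590. A possible tightening, sketched here under the assumption that items never contain commas, uses a lookahead for a comma or the end of the string. The function name removeCsvItem is hypothetical.

```javascript
// Sketch: remove the first exact occurrence of an item from a CSV string.
function removeCsvItem(csvStr, item) {
  // Escape regex metacharacters so the item is matched literally.
  const escaped = item.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // The item must start at the string start or after a comma,
  // and must be followed by a comma or the end of the string.
  const regex = new RegExp("(^|,)" + escaped + "(?=,|$)");
  // If the first item was removed, drop the comma that is left in front.
  return csvStr.replace(regex, "").replace(/^,/, "");
}

console.log(removeCsvItem("1,2,3,4", "2"));      // 1,3,4
console.log(removeCsvItem("1,2,3,4", "4"));      // 1,2,3
console.log(removeCsvItem("3590,359,7", "359")); // 3590,7
```

Like the original, this removes only the first match; adding the g flag would not be enough on its own to handle adjacent duplicates cleanly, so a split, filter and join may be simpler in that case.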