How do I use regex in Google Apps Script to replace the special characters in my text, but only those between certain strings?
So if this was the text, where x represents random alphanumeric characters...
xx##xxxSTARTxxx###xxx$xxxENDxxxxx##££xxxSTARTxxxx££££xxx&&&&&xxxxENDxxx
what regex would I need so that I end up with
xx##xxxSTARTxxxxxxxxxENDxxxxx##££xxxSTARTxxxxxxxxxxxENDxxx
You may use a replace with a callback:
var text = "xx##xxxSTARTxxx###xxx$xxxENDxxxxx##££xxxSTARTxxxx££££xxx&&&&&xxxxENDxxx";
var regex = /(START)([\s\S]*?)(END)/g;
var result = text.replace(regex, function ($0, $1, $2, $3) {
return $1 + $2.replace(/[^\w\s]+/g, '') + $3;
});
console.log(result);
// => xx##xxxSTARTxxxxxxxxxENDxxxxx##££xxxSTARTxxxxxxxxxxxENDxxx
The first regex is a simple regex to match a string between two strings:
(START) - Group 1 ($1): START (may be replaced with any pattern)
([\s\S]*?) - Group 2 ($2): any 0+ chars, but as few as possible
(END) - Group 3 ($3): END (may be replaced with any pattern)
The regex I used here to match special chars is [^\w\s]+: it matches one or more chars other than ASCII letters, digits, _ and whitespace.
See Check for special characters in string for more variations of the special char regex.
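Since START and END "may be replaced with any pattern", here is a minimal sketch of the same callback approach for arbitrary literal delimiters. The helper names escapeRegExp and stripSpecialsBetween and the [[ ... ]] delimiters are assumptions made up for this illustration, not part of the original answer:

```javascript
// Escape regex metacharacters so a delimiter string is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: strip chars matching [^\w\s]+ only between start and end.
function stripSpecialsBetween(text, start, end) {
  const re = new RegExp(
    "(" + escapeRegExp(start) + ")([\\s\\S]*?)(" + escapeRegExp(end) + ")",
    "g"
  );
  // Keep both delimiters as they are and clean only the captured middle part.
  return text.replace(re, (whole, open, inner, close) =>
    open + inner.replace(/[^\w\s]+/g, "") + close
  );
}

const cleaned = stripSpecialsBetween("a#b[[c!d$e]]f#g[[h%]]i", "[[", "]]");
console.log(cleaned); // a#b[[cde]]f#g[[h]]i
```

The special chars outside the delimiters (the two # here) are left untouched because the outer regex never matches them.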
I need a Regex to match any word that contains letters: m+a+h+d together in any order
So Mohamed, Hamada and Mahmoud match, but hammer doesn't.
I tried the following (I'm new to regex!):
Regex reg=new Regex("[mahd]");
But obviously it is not the correct pattern
When you want to match some substrings in any order, you either use alternation where all possible variations are enumerated, or use anchored lookaheads.
In this case, I'd suggest using positive lookaheads that will ensure both free order of the letters in a word and their obligatory presence in the word matched.
Use
(?i)\b(?=\w*m)(?=\w*a)(?=\w*h)(?=\w*d)\w+
See the regex demo (NOTE: You may replace \w with \p{L} to only match letters).
Details:
(?i) - case insensitive mode on
\b - a leading word boundary
(?=\w*m) - after 0+ word chars (i.e. letters, digits or underscores), there must be m
(?=\w*a) - after 0+ word chars, there must be a
(?=\w*h) - after 0+ word chars, there must be h
(?=\w*d) - after 0+ word chars, there must be d
\w+ - 1 or more letters, digits or underscores (you may replace with \p{L} to only match letters).
C# demo:
using System;
using System.Linq;
using System.Text.RegularExpressions;

var str = "Mohamed, Hamada and Mahmoud match, but not hammer";
var letters = "mahd";
// Build one (?=\w*X) lookahead per required letter; @"..." is a verbatim string
var pat = string.Format(@"\b{0}\w+\b", string.Join("", letters.Select(s => string.Format(@"(?=\w*{0})", s))));
var result = Regex.Matches(str, pat, RegexOptions.IgnoreCase)
    .Cast<Match>()
    .Select(match => match.Value)
    .ToList();
Console.WriteLine(String.Join("\n", result)); // Demo line
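For comparison, the same lookahead-building idea can be sketched in JavaScript. The function name wordsContainingAll is made up for this example, and it assumes the required letters are plain ASCII letters, so they are not escaped before being put into the pattern:

```javascript
// Build one lookahead per required letter, e.g. "mahd" gives
// (?=\w*m)(?=\w*a)(?=\w*h)(?=\w*d), then match the whole word with \w+.
function wordsContainingAll(text, letters) {
  const lookaheads = Array.from(letters, (ch) => "(?=\\w*" + ch + ")").join("");
  const re = new RegExp("\\b" + lookaheads + "\\w+", "gi");
  return text.match(re) || [];
}

const found = wordsContainingAll(
  "Mohamed, Hamada and Mahmoud match, but not hammer",
  "mahd"
);
console.log(found); // [ 'Mohamed', 'Hamada', 'Mahmoud' ]
```

"match" and "hammer" are skipped because neither contains a d, so the (?=\w*d) lookahead fails for them.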
Is it possible to match only the letter from the following string?
RO41 RNCB 0089 0957 6044 0001 FPS21098343
What I want: FPS
What I'm trying: [0-9]{4}\s*\S+\s+(\S+)
What I get: FPS21098343
Any help is much appreciated! Thanks.
You can try with this:
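var input = "0258 6044 0001 FPS21098343";
// One or more 4-digit groups, then the letters (Group 1), then the trailing digits
var re = /^(?:\d{4} )+ *([a-zA-Z]+)(?:\d+)$/;
var match = re.exec(input);
console.log(match);                    // the full match array, or null if nothing matched
console.log(match ? match[1] : null);  // "FPS"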
You can match everything up to and including the first run of one or more letters with any of the following patterns:
^[^a-zA-Z]*([A-Za-z]+)
^.*?([A-Za-z]+)
^[\w\W]*?([A-Za-z]+)
(?s)^.*?([A-Za-z]+)
If the tool treats ^ as the start of a line, replace it with \A that always matches the start of string.
The point is to match
^ / \A - start of string
[^a-zA-Z]* - zero or more chars other than letters
([A-Za-z]+) - capture one or more letters into Group 1.
The .*? part matches any text (as short as possible) before the subsequent pattern(s). (?s) makes . match line break chars.
Replace A-Za-z in all the patterns with \p{L} to match any Unicode letters. Also, note that [^\p{L}] = \P{L}.
To extract every run of consecutive letters anywhere in the string, you can simply use:
([a-zA-Z]+)
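As a rough illustration of what these patterns return, here is a small JavaScript sketch. The variable names are only for the example. Note that on the full string from the question the "first letters" are RO, not FPS, so the first-run pattern only yields FPS on input that has no earlier letters:

```javascript
const firstLetters = /^[^a-zA-Z]*([A-Za-z]+)/;

const shortened = "0258 6044 0001 FPS21098343";
const full = "RO41 RNCB 0089 0957 6044 0001 FPS21098343";

// Group 1 holds the first run of letters found from the start of the string.
const fromShortened = firstLetters.exec(shortened)[1];
const fromFull = firstLetters.exec(full)[1];
console.log(fromShortened); // FPS
console.log(fromFull);      // RO

// With the g flag, match() returns every run of letters in the string.
const allRuns = full.match(/[a-zA-Z]+/g);
console.log(allRuns); // [ 'RO', 'RNCB', 'FPS' ]
```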
You could use a capture group to get FPS:
\b[0-9]{4}\s+\S+\s+([A-Z]+)
The pattern matches:
\b[0-9]{4} A word boundary to prevent a partial match, then 4 digits
\s+\S+\s+ Match 1+ whitespace chars, 1+ non-whitespace chars, and 1+ whitespace chars
([A-Z]+) Capture group 1, match 1+ chars A-Z
If the chars have to be followed by digits till the end of the string, you can add \d+$ to the pattern:
\b[0-9]{4}\s+\S+\s+([A-Z]+)\d+$
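A quick way to check both variants against the string from the question is a sketch like the following. The helper group1 is a hypothetical convenience function, and the second input with a trailing " X" is made up to show the effect of the \d+$ suffix:

```javascript
const loose = /\b[0-9]{4}\s+\S+\s+([A-Z]+)/;
const strict = /\b[0-9]{4}\s+\S+\s+([A-Z]+)\d+$/;

// Return the text captured by Group 1, or null when the regex does not match.
function group1(re, s) {
  const m = re.exec(s);
  return m ? m[1] : null;
}

const input = "RO41 RNCB 0089 0957 6044 0001 FPS21098343";
console.log(group1(loose, input));  // FPS
console.log(group1(strict, input)); // FPS

// The strict variant requires the digits after the letters to end the string.
const withTail = "6044 0001 FPS21098343 X";
console.log(group1(loose, withTail));  // FPS
console.log(group1(strict, withTail)); // null
```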
I'm trying to filter out strings in project code which have the following form
'alphanumeric.alphanumeric.alphanumeric.alphanumeric'
(surrounded by quote and has one or more dots between alphanumeric words)
and another regex to find strings with the form
'this is a regular sentence with space'
I'm new to regex and have the following pattern, which doesn't work. It is supposed to mean:
(' + anything + . + anything + ')
/'*[^.]*'
I need multiple words with . connecting them.
The pattern that you tried, /'*[^.]*', matches a /, then optional occurrences of ', followed by optional chars other than a dot, and then a ', so a dot can never be matched.
You could use 2 separate patterns, one with a dot and one with a space at the start of the repeated group, matching alphanumerics with [^\W_]+ (a word character excluding the underscore). The dot variant is shown below; for the sentence form, replace \. with a space.
'[^\W_]+(?:\.[^\W_]+)+'
Another option is to use a capture group matching either a dot or space and use a backreference in the repetition and match any letter or any number:
'[\p{L}\p{N}]+([.\p{Zs}\t])[\p{L}\p{N}]+(?:\1[\p{L}\p{N}]+)*'
' Match literally
[\p{L}\p{N}]+ Match 1+ alphanumerics
([.\p{Zs}\t])[\p{L}\p{N}]+ Capture group 1, match either . or a space or a tab, followed by 1+ alphanumerics
(?:\1[\p{L}\p{N}]+)* Optionally match what is captured in group 1 using the backreference \1 followed by 1+ alphanumerics
' Match literally
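To see how the backreference keeps the separator consistent, here is a small JavaScript sketch. It assumes an engine that supports Unicode property escapes with the u flag, and the sample source string and variable names are invented for the illustration:

```javascript
// Group 1 captures the first separator; \1 forces every later one to be the same.
const quoted = /'[\p{L}\p{N}]+([.\p{Zs}\t])[\p{L}\p{N}]+(?:\1[\p{L}\p{N}]+)*'/gu;

const source =
  "t('app.menu.file.open'); log('this is a regular sentence'); " +
  "x('mixed.one two'); y('single')";

const hits = Array.from(source.matchAll(quoted), (m) => ({
  text: m[0],
  separator: m[1],
}));
console.log(hits);
// 'app.menu.file.open' is found with separator "." and
// 'this is a regular sentence' with separator " ".
```

The string 'mixed.one two' is skipped because its separators differ, and 'single' is skipped because it has no separator at all. Checking the captured separator afterwards tells the dotted keys apart from the sentences.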
I have a regular expression that is allowing a string to be standalone, separated by hyphen and underscore.
I need help so the string only takes hyphen or underscore, but not both.
This is what I have so far.
^([a-z][a-z0-9]*)([-_]{1}[a-z0-9]+)*$
foo = passed
foo-bar = passed
foo_bar = passed
foo-bar-baz = passed
foo_bar_baz = passed
foo-bar_baz_qux = passed # but I don't want it to
foo_bar-baz-quz = passed # but I don't want it to
You may expand the pattern a bit and use a backreference to only match the same delimiter:
^[a-z][a-z0-9]*(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)?$
See the regex demo
Details:
^ - start of string
[a-z][a-z0-9]* - a lowercase letter followed by 0+ lowercase letters or digits
(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)? - an optional sequence of:
([-_]) - Capture group 1 matching either - or _
[a-z0-9]+ - 1+ lowercase letters or digits
(?:\1[a-z0-9]+)* - 0+ sequences of:
\1 - the same value as in Group 1
[a-z0-9]+ - 1 or more lowercase letters or digits
$ - end of string.
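The pattern can be checked against the sample names from the question with a short JavaScript sketch like this one (the variable names are only for the example):

```javascript
const sameDelimiter = /^[a-z][a-z0-9]*(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)?$/;

const samples = [
  "foo",
  "foo-bar",
  "foo_bar",
  "foo-bar-baz",
  "foo_bar_baz",
  "foo-bar_baz_qux", // mixed delimiters, should fail
  "foo_bar-baz-quz", // mixed delimiters, should fail
];

// Map each sample to true (passed) or false (rejected).
const results = samples.map((s) => sameDelimiter.test(s));
samples.forEach((s, i) => console.log(s + " = " + (results[i] ? "passed" : "failed")));
```

The first five names pass and the two with mixed delimiters fail, because \1 can only repeat the delimiter that Group 1 captured first.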
Here's a nice clean solution:
^([a-zA-Z-]+|[a-zA-Z_]+)$
Break it down!
^ start at the beginning of the text
[a-zA-Z-]+ match anything a-z or A-Z or -
| OR operator
[a-zA-Z_]+ match anything a-z or A-Z or _
$ end at the end of the text
Here's an example on regexr!
I need to create Laravel migrations, so I have converted my SQL script to the Laravel migration format using "replace in files" with regular expressions in Sublime Text.
My problem is that I have to replace the '#' character in the following string with the table name ('tablename' here), in about 70 tables:
Schema::table('tablename', function($table) {
$table->dropForeign('#_columnname_foreign');
});
Actually I can do this using the following expression:
(Schema::table\('([a-z]+)',[\s]*function\(\$table\)[\s]*{[\s]*\$table->dropForeign\(')#(_[a-z_]+'\);)
And in the replace field:
$1$2$3
but I don't know how to do it when the table has more than one foreign key:
Schema::table('tablename1', function($table) {
$table->dropForeign('#_field1_foreign');
$table->dropForeign('#_field2_foreign');
$table->dropForeign('#_field3_foreign');
$table->dropForeign('#_field4_foreign');
$table->dropForeign('#_field5_foreign');
$table->dropForeign('#_field6_foreign');
});
I have been using RegExr to validate my regular expressions.
It is not an easy task for a regex in Sublime Text. The only way to do it with a regex is to capture the function signature together with any number of already-processed $table->dropForeign lines (matched lazily), and replace the # on the line that follows.
The regex below requires clicking Replace All multiple times until all matches are found.
(Schema::table\('([a-z0-9]+)',\s*function\(\$table\)\s*{(?:\s*\$table->dropForeign\('[a-z0-9]+_\w+'\);)*?\s*\$table->dropForeign\(')#(_\w+'\);)
Replacement is $1$2$3. See this regex demo, where you may replace the # in the second block manually with the table name and see how the match goes further.
Details:
(Schema::table\('([a-z0-9]+)',\s*function\(\$table\)\s*{(?:\s*\$table->dropForeign\('[a-z0-9]+_\w+'\);)*?\s*\$table->dropForeign\(') - Group 1 capturing:
Schema::table\(' - literal Schema::table(' substring
([a-z0-9]+) - Group 2 capturing 1+ alphanumerics (do not check Match Case option to also match uppercase ASCII letters)
',\s* - a comma and 0+ whitespaces
function\(\$table\) - a literal text function($table)
\s* - 0+ whitespaces
{ - a literal { (in SublimeText 2, it requires escaping)
(?:\s*\$table->dropForeign\('[a-z0-9]+_\w+'\);)*? - 0+ sequences, but as few as possible, matching:
\s*\$table->dropForeign\(' - 0+ whitespaces and then a literal text $table->dropForeign('
[a-z0-9]+_\w+ - 1+ alphanumerics, _ and 1+ digits, letters or underscores (\w+)
'\); - a literal substring ');
\s* - 0+ whitespaces
\$table->dropForeign\(' - a literal text $table->dropForeign('
# - a matched # symbol to be replaced
(_\w+'\);) - Group 3 capturing:
_ - an underscore
\w+ - 1 or more letters, digits or underscores
'\); - a literal substring ');
NOTE: The issue I thought I found was related to an unescaped { that causes a regex failure in Sublime Text 2. In Sublime Text 3, the { in the regex does not have to be escaped.
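The "click Replace All multiple times" behaviour can be simulated outside the editor. The sketch below is JavaScript rather than Sublime Text, with the { escaped and the i flag standing in for an unchecked Match Case option. The table and column names are made up, and the loop simply repeats the replacement until the text stops changing:

```javascript
const re = /(Schema::table\('([a-z0-9]+)',\s*function\(\$table\)\s*\{(?:\s*\$table->dropForeign\('[a-z0-9]+_\w+'\);)*?\s*\$table->dropForeign\(')#(_\w+'\);)/gi;

let migration = [
  "Schema::table('users', function($table) {",
  "    $table->dropForeign('#_role_id_foreign');",
  "    $table->dropForeign('#_team_id_foreign');",
  "});",
].join("\n");

// Each pass fixes at most one # per Schema::table block, so repeat until stable.
let passes = 0;
let previous;
do {
  previous = migration;
  migration = migration.replace(re, "$1$2$3");
  passes += 1;
} while (migration !== previous);

console.log(migration);
console.log("passes:", passes); // 3 (two passes that replace, one that finds nothing)
```

Each pass can only fix the first remaining # in a block, because the lazy group has to skip over the lines that were already rewritten before it reaches the next #.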