Exact Match using VBA Replace - regex

I am trying to replace a substring within a string using the VBA Replace function.
The source string contains cell addresses joined by ; and looks like this:
"$B$1;$B$21;$B$2;$C$3;$B$20;$B$201"
and I would like to replace $B$2 with, say, xxx.
The Replace function matches every occurrence of $B$2 in the string and gives me the output below:
$B$1;xxx1;xxx;$C$3;xxx0;xxx01
However, I would like to match $B$2 exactly and get this output instead:
$B$1;$B$21;xxx;$C$3;$B$20;$B$201
One way I could think of is to split the string on the ; separator and loop over each value, but I am looking for a more direct solution, such as pattern matching.

You can include the following ; in the replace operation, to make sure you only match "complete" references. You just need to take a precaution for also matching the last entry, by adding a dummy semicolon at the end:
s = "$B$1;$B$21;$B$2;$C$3;$B$20;$B$201"
find = "$B$2"
repl = "xxxx"
result = Replace(s & ";", find & ";", repl & ";")
result = Left(result, Len(result)-1) ' Remove the final semicolon
Although this works for your case, in a more general exercise, you would also want to test for the preceding delimiter, and then the last two lines would look like this:
result = Replace(";" & s & ";", ";" & find & ";", ";" & repl & ";")
result = Mid(result, 2, Len(result)-2)
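For comparison, here is a minimal JavaScript sketch of the same delimiter-wrapping idea. The function name replaceExact and its sep parameter are made up for illustration. It uses split/join rather than String.replace so that the $ characters in the cell addresses are never interpreted as replacement patterns.

```javascript
// Hypothetical helper: replace only whole entries in a delimited list.
function replaceExact(list, find, repl, sep = ";") {
  // Wrap the list in separators so the first and last entries
  // are delimited on both sides like every other entry.
  const wrapped = sep + list + sep;
  // split/join performs a literal (non-regex) replace of sep+find+sep.
  const replaced = wrapped.split(sep + find + sep).join(sep + repl + sep);
  // Strip the two separators that were added above.
  return replaced.slice(sep.length, replaced.length - sep.length);
}

console.log(replaceExact("$B$1;$B$21;$B$2;$C$3;$B$20;$B$201", "$B$2", "xxx"));
// $B$1;$B$21;xxx;$C$3;$B$20;$B$201
```

Like the VBA version, this consumes the separator shared by two adjacent identical entries, so of two neighbouring $B$2 entries only the first would be replaced.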

You could use:
Function myReplace(strng As String, findStr As String, replacementStrng As String) As String
' Append ";" so that the last entry is also followed by a delimiter
myReplace = Replace(strng & ";", findStr & ";", replacementStrng & ";")
myReplace = Left(myReplace, Len(myReplace) - 1) ' Remove the final semicolon
End Function
and call it from your "main" sub as follows:
strng = "$B$1;$B$21;$B$2;$C$3;$B$20;$B$201"
MsgBox myReplace(strng, "$B$2", "xxx")

Related

Regex - change commas only in a portion of a string

I make a lot of changes to an original CSV string that contains many comma delimiters. I have to replace commas with ";" either only inside the expressions ||....|| or only outside them. I need this change so that the delimiter inside ||....|| differs from the one used in the rest of the string.
Example:
(.*)(?:\|\|)(?:.*)(,)(?:.*)\|\|
Then I use:
var regex = /myregex/g;
var str = str.replace(regex, ',');
Thanks.
You can use
const string = "aba,bjlj,alj,ljlj||name1,name2,name3||jflkj,glfgjlf,jflg,fjlfd||name1,name2||fd,sdfsfd,dfs||name1,name2,name3,name4,name5||";
console.log( string.replace(/\|{2}[\w\W]*?\|{2}/g, (x) => x.replace(/,/g, ';')) );
The regex is
/\|{2}.*?\|{2}/gs // matches any text between two double pipes
/\|{2}[\w\W]*?\|{2}/g // matches any text between two double pipes
/\|{2}.*?\|{2}/g // matches any text but line breaks between two double pipes
Note the . does not match line breaks without the s modifier flag.
The regex matches double pipe, then any zero or more chars, as few as possible up to the next double pipe.
Then, x, the whole match value, is passed as an argument to the anonymous callback function used as a replacement argument, and all commas are replaced with ; only inside the matches.
The "contrary" solution is to match and capture the strings between double pipes and only match commas in all other contexts so that you could keep the captures and replace those commas:
const string = "aba,bjlj,alj,ljlj||name1,name2,name3||jflkj,glfgjlf,jflg,fjlfd||name1,name2||fd,sdfsfd,dfs||name1,name2,name3,name4,name5||";
console.log( string.replace(/(\|{2}[\w\W]*?\|{2})|,/g, (x,y) => y || ';') );
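To make the difference between the two replacements easier to see, here is a small sketch that runs both on a shortened input. The sample string and the variable names are made up for illustration; the regexes are the ones shown above.

```javascript
// Shortened, hypothetical sample input.
const input = "a,b||x,y||c,d";

// Replace commas only INSIDE ||...|| blocks:
// match each block, then rewrite the commas within that match.
const inside = input.replace(/\|{2}[\w\W]*?\|{2}/g, (block) => block.replace(/,/g, ";"));

// Replace commas only OUTSIDE ||...|| blocks:
// a captured block is returned unchanged, a bare comma becomes ";".
const outside = input.replace(/(\|{2}[\w\W]*?\|{2})|,/g, (match, block) => block || ";");

console.log(inside);  // a,b||x;y||c,d
console.log(outside); // a;b||x,y||c;d
```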
Big thanks.
I also found:
var newStr = str.replace(/\|{2}.*?\|{2}/g, function(match) {
return match.replace(/,/g,";");
});
Do you think it is possible to do the opposite and change only the commas outside the ||...|| occurrences?

Matlab: using regexp to get a string that has a whitespace in between

I want to use a regex to extract some IDs from a string that looks like this (single quotes inside a MATLAB string must be doubled):
myString = '([''US04650Y1001'', ''US90274P3029'', ''HON WI'', ''US41165F1012''])';
My pattern for regex is as follows:
pattern = '[A-Za-z0-9.^_]+';
newArr = regexp(myString, pattern,'match');
I'd like to get the ID 'HON WI', but my current pattern splits it in two because it does not allow whitespace. I would like to get the whole "HON WI" as well as my other strings, i.e. everything between the quotes; these may contain special characters such as ^, . or _, but I don't know how to add the whitespace.
I already tried patterns like this, without success:
pattern = '[A-Za-z0-9.^_\s]+';
My new array should be 1x4 and hold, in each cell, one of the strings/IDs contained in myString (US04650Y1001, US90274P3029, HON WI and US41165F1012).
Another approach that seems to work, although I am not entirely sure about it:
myString = strrep(myString,'([','');
myString = strrep(myString,'])','');
myString = regexp(myString,',','split');
myString = strrep(myString,'''','');
This seems to get me what I want, but I would like to know how I can alter the regex in my first approach.
Many thanks in advance.
You can use the simple regex '([^']+)' together with the 'tokens' option to get the captures:
myString = '([''US04650Y1001'', ''US90274P3029'', ''HON WI'', ''US41165F1012''])';
pattern = '''([^'']+)''';
tokens = regexp(myString, pattern, 'tokens'); % one 1x1 cell per match
newArr = [tokens{:}]; % flatten into a 1x4 cell array of strings
The newArr will look like
{
[1,1] = 'US04650Y1001'
[1,2] = 'US90274P3029'
[1,3] = 'HON WI'
[1,4] = 'US41165F1012'
}
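For readers more familiar with JavaScript, the same capture-group idea can be sketched as follows. This is only an illustration of the pattern '([^']+)', not MATLAB code, and the variable names are made up.

```javascript
// Same input as above, written as a JavaScript string.
const myString = "(['US04650Y1001', 'US90274P3029', 'HON WI', 'US41165F1012'])";

// Capture everything between a pair of single quotes;
// m[1] is the first capture group, i.e. the text without the quotes.
const ids = Array.from(myString.matchAll(/'([^']+)'/g), (m) => m[1]);

console.log(ids.length); // 4
console.log(ids[2]);     // HON WI
```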
Another option is to use lookaround assertions. The following pattern matches any run of alphanumeric characters or underscores (\w), spaces (' '), or the characters . and ^ that is located between quotes. It specifically excludes the blank space after the comma that separates the IDs, i.e. ', ' does not give a match.
Note that \s will match any blank space character (including tab, newline), this is why a space is preferred here:
pattern2='(?<='')[\w.^ ]+(?='')';
pattern2 =
(?<=')[\w.^ ]+(?=')
newArr = regexp(myString, pattern2,'match');
newArr'
ans =
'US04650Y1001'
'US90274P3029'
'HON WI'
'US41165F1012'

Split string with specified delimiter in lua

I'm trying to create a split() function in Lua with a delimiter of my choice, where the default is a space.
The default works fine. The problem starts when I pass a delimiter to the function: for some reason it doesn't return the last substring.
The function:
function split(str,sep)
if sep == nil then
words = {}
for word in str:gmatch("%w+") do table.insert(words, word) end
return words
end
return {str:match((str:gsub("[^"..sep.."]*"..sep, "([^"..sep.."]*)"..sep)))} -- BUG!! doesnt return last value
end
I try to run this:
local str = "a,b,c,d,e,f,g"
local sep = ","
t = split(str,sep)
for i,j in ipairs(t) do
print(i,j)
end
and I get:
1 a
2 b
3 c
4 d
5 e
6 f
Can't figure out where the bug is...
When splitting strings, the easiest way to avoid corner cases is to append the delimiter to the string, when you know the string cannot end with the delimiter:
str = "a,b,c,d,e,f,g"
str = str .. ','
for w in str:gmatch("(.-),") do print(w) end
Alternatively, you can use a pattern with an optional delimiter:
str = "a,b,c,d,e,f,g"
for w in str:gmatch("([^,]+),?") do print(w) end
Actually, we don't need the optional delimiter since we're capturing non-delimiters:
str = "a,b,c,d,e,f,g"
for w in str:gmatch("([^,]+)") do print(w) end
Here's my go-to split() function:
-- split("a,b,c", ",") => {"a", "b", "c"}
function split(s, sep)
local fields = {}
local sep = sep or " "
local pattern = string.format("([^%s]+)", sep)
string.gsub(s, pattern, function(c) fields[#fields + 1] = c end)
return fields
end
"[^"..sep.."]*"..sep This is what causes the problem. You are matching a string of characters which are not the separator followed by the separator. However, the last substring you want to match (g) is not followed by the separator character.
The quickest way to fix this is to append the separator to the string before matching (str .. sep). Lua patterns have no \0 end-of-string marker to match against, but with the extra separator g, which would otherwise be followed only by the end of the string, is matched like every other field.
I'd say your approach is overly complicated in general. First, you can just match the individual substrings that do not contain the separator; second, you can do this in a for loop using the gmatch function:
local result = {}
-- gmatch returns an iterator over the matches (gsub does not)
for field in your_string:gmatch(("[^%s]+"):format(your_separator)) do
table.insert(result, field)
end
return result
EDIT: The above code, simplified a bit:
local pattern = "[^%" .. your_separator .. "]+"
for field in string.gmatch(your_string, pattern) do
-- ...and so on (The rest should be easy enough to understand)
EDIT2: Keep in mind that you should also escape your separator. A separator like % could cause problems if you don't escape it as %%:
function escape(str)
-- The parentheses drop gsub's second return value (the match count)
return (str:gsub("([%^%$%(%)%%%.%[%]%*%+%-%?])", "%%%1"))
end
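The same two ideas, matching runs of non-separator characters and escaping the separator first, can be sketched in JavaScript as well. This is an illustration only, not a Lua replacement; the helper names escapeForCharClass and splitFields are made up.

```javascript
// Hypothetical helper: escape the characters that are special
// inside a regex character class (\, ], [, ^ and -).
function escapeForCharClass(chars) {
  return chars.replace(/[\\\]\[^-]/g, "\\$&");
}

// Hypothetical helper: return every run of characters that are not
// separator characters. Like the Lua versions, it drops empty fields.
function splitFields(s, sep = " ") {
  const pattern = new RegExp("[^" + escapeForCharClass(sep) + "]+", "g");
  return s.match(pattern) || [];
}

console.log(splitFields("a,b,c,d,e,f,g", ",").length); // 7
console.log(splitFields("a^b", "^"));                  // [ 'a', 'b' ]
```

Without the escaping step, a separator such as ^ or ] would change the meaning of the character class instead of being matched literally.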

removing '<' '>' chars from string using regexp in matlab

In my Simulink model I have a propagated signal whose name looks like this:
<foo_boo>
and at the source it is
foo_boo
I would like to build a regular expression that returns foo_boo from
<foo_boo>
and leaves foo_boo unchanged.
In other words, I would like a regular expression that removes '>' and '<' from my string; the string can otherwise contain [a-zA-Z_0-9] characters.
Pretty easy. Use regexprep to search for the characters < or > in your input string and replace them with nothing. In other words:
out = regexprep(in, '<|>', '');
in would be the string you want to operate on (i.e. <foo_boo>) and out contains the processed string.
Example:
in = '<foo_boo>';
out = regexprep(in, '<|>', '')
out =
foo_boo
Since I think logical indexing is the answer to most things MATLAB (the other being bsxfun), I throw this in:
str = '<foo_boo>';
str( (str=='<') | (str=='>') ) = [];
It seems there's no need to use regex:
str = '<foo_boo>'
str([strfind(str,'<'),strfind(str,'>')]) = []

Remove a number from a comma separated string while properly removing commas

FOR EXAMPLE: Given a string... "1,2,3,4"
I need to be able to remove a given number and the comma after/before depending on if the match is at the end of the string or not.
remove(2) = "1,3,4"
remove(4) = "1,2,3"
Also, I'm using javascript.
As jtdubs shows, an easy way is to use a split function to obtain an array of elements without the commas, remove the required element from the array, and then rebuild the string with a join function.
For javascript something like this might work:
function remove(list, to_remove)
{
    var elements = list.split(",");
    var remove_index = elements.indexOf(to_remove);
    // indexOf returns -1 when the value is absent; splice(-1, 1) would
    // then delete the last element, so only splice on a real match.
    if (remove_index !== -1) {
        elements.splice(remove_index, 1);
    }
    return elements.join(",");
}
var string = "1,2,3,4,5";
var newstring = remove(string, "4");
document.write(newstring + "<br>"); // writes "1,2,3,5"
newstring = remove(string, "5");
document.write(newstring + "<br>"); // writes "1,2,3,4"
You also need to consider the behavior you want when there are repeats. Say the string is "1,2,2,4" and I call remove(2): should it remove both instances or just the first? This function removes only the first instance.
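If you would rather remove every instance, a minimal sketch could use filter instead of indexOf/splice. The function name removeAll is made up for illustration.

```javascript
// Hypothetical variant: drop ALL items equal to toRemove.
function removeAll(csv, toRemove) {
  return csv
    .split(",")                          // "1,2,2,4" -> ["1","2","2","4"]
    .filter((item) => item !== toRemove) // keep everything except matches
    .join(",");                          // rebuild the comma-separated string
}

console.log(removeAll("1,2,2,4", "2")); // 1,4
console.log(removeAll("1,2,3,4", "9")); // 1,2,3,4
```

Items are compared as strings, so pass "2" rather than the number 2.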
Just use multiple substitutions.
s/^$removed,//;
s/,$removed$//;
s/,$removed,/,/;
This will be easier than trying to invent a single replacement that handles all those cases.
string input = "1,2,3,4";
List<string> parts = new List<string>(input.Split(new char[] { ',' }));
parts.RemoveAt(2);
string output = String.Join(",", parts);
Instead of using regex, I would do something like:
- split on comma
- delete the right element
- join with comma
Here is a perl script that does the job:
#!/usr/bin/perl
use 5.10.1;
use strict;
use warnings;
my $toremove = 5;
my $string = "1,2,3,4,5";
my @tmp = split /,/, $string;
@tmp = grep { $_ != $toremove } @tmp;
$string = join ',', @tmp;
say $string;
Output:
1,2,3,4
JavaScript has improved since this question was posted.
I use the following regex to remove items from a CSV string:
let searchStr = "359";
let regex = new RegExp("^" + searchStr + ",?|," + searchStr);
csvStr = csvStr.replace(regex, "");
Whether searchStr is the first item, a middle item, the last item, or the only item, it is removed.
If searchStr is at the start of csvStr, it and any trailing comma are removed. Otherwise, searchStr must be preceded by a comma, and searchStr together with that preceding comma is replaced by an empty string. Note that the pattern does not check what follows searchStr, so searching for 35 would also match the start of 359.
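A sketch of a stricter variant, assuming you only want whole items to match: it escapes the search string and requires a comma or the end of the string after it. The function names escapeRegExp and removeItem are made up for illustration.

```javascript
// Escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: remove the first whole item equal to `item`.
function removeItem(csv, item) {
  const e = escapeRegExp(item);
  // At the start: the item plus its trailing comma (or the end of string).
  // Elsewhere: a leading comma plus the item, only if a comma or the
  // end of the string follows (checked with a lookahead, not consumed).
  const regex = new RegExp("^" + e + "(?:,|$)|," + e + "(?=,|$)");
  return csv.replace(regex, "");
}

console.log(removeItem("359,1,2", "359"));  // 1,2
console.log(removeItem("1,2,359", "359"));  // 1,2
console.log(removeItem("3591,359", "359")); // 3591
```

As in the original snippet there is no g flag, so only the first matching item is removed.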