removing '<' '>' chars from string using regexp in matlab - regex

In my simulink i have a propagate signal which look like this:
<foo_boo>
and at source
foo_boo
i would like to build a regular expression the return from
<foo_boo>
simply foo_boo and from foo_boo i would like to get foo_boo.
In other words, i would like a regular expression that remove '>' and '<' from my string and the string can include [a-zA-Z_0-9] chars.

Pretty easy. Use regexprep to search for symbols that contain < or > in your input string and replace them with nothing. In other words:
out = regexprep(in, '<|>', '');
in would be the string you want to operate on (i.e. <foo_boo>) and out contains the processed string.
Example:
in = '<foo_boo>';
out = regexprep(in, '<|>', '')
out =
foo_boo

Since I think logical indexing is the answer to most things MATLAB (the other being bsxfun), I throw this in:
str = '<foo_boo>';
str( (str=='<') | (str=='>') ) = [];

seems there's no need to use regex:
str = '<foo_boo>'
str([strfind(str,'<'),strfind(str,'>')]) = []

Related

Formatting regex in Dart on several lines

I have
Pattern pattern = r'^((?:19|20)\d\d)[- /.]
(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])$';
My editor shows an error on this regexp:
How can I fix it?
You entered a line break inside a string literal, that is why you get a syntax issue.
If you want to split a pattern into several lines, just use string concatenation:
Pattern pattern = r'^((?:19|20)\d\d)[- /.]' +
r'(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])$';
Or, since string literals separated only with whitespace characters are concatenated automatically:
Pattern pattern = r'^((?:19|20)\d\d)[- /.]'
r'(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])$';
Or, if you plan to re-use a long pattern, you may define this part as a variable, and just use string interpolation:
String d = r'((?:19|20)\d\d)';
String M = r'(0[1-9]|1[012])';
String y = r'(0[1-9]|[12][0-9]|3[01])';
String sep = r'[- /.]';
Pattern pattern = '^$d$sep$M$sep$y\$';

Matlab: using regexp to get a string that has a whitespace in between

I want to use Regex to acquire some ID's in a cellstring array, the array looks like this:
myString = '(['US04650Y1001', 'US90274P3029', 'HON WI', 'US41165F1012'])';
My pattern for regex is as follows:
pattern = '[A-Za-z0-9.^_]+';
newArr = regexp(myString, pattern,'match');
I'd like to get the ID called 'HON WI', but with my current pattern, its splitting it into two because my pattern can't deal with the whitespace properly. I would like to get the whole "HON WI", as well as my other strings, everything that's in '', these might have special characters like ^, . or _, but I don't know how to add the whitespace.
I already tried stuff like this, without success:
pattern = '[A-Za-z0-9.^_\s]+';
My new array should have, in each cell, the strings/ID's contained in myString (US04650Y1001, US90274P3029, HON WI and US41165F1012) with dimensions 1x4.
Another approach that seems to work but not entirely sure:
myString = strrep(myString,'([','');
myString = strrep(myString,'])','');
myString = regexp(myString,',','split');
myString = strrep(myString,'''','');
This seems to get me what I want, but I would like to know how can I alter the regex on my first approach.
Many thanks in advance.
You may use a mere '([^']+)' regex and use 'tokens' to get the captures:
myString = '([''US04650Y1001'', ''US90274P3029'', ''HON WI'', ''US41165F1012''])';
pattern = '''([^'']+)''';
newArr = regexp(myString, pattern,'match', 'tokens');
The newArr will look like
{
[1,1] = 'US04650Y1001'
[1,2] = 'US90274P3029'
[1,3] = 'HON WI'
[1,4] = 'US41165F1012'
}
You may option is to use lookaround assertions. The following will match any string made of alphanumeric character or underscore (\w), space (' ') or characters . or ^, that is located between quotes. This will specifically exclude the blank space next to the comma, in the separation between tokens, i.e. ', ' does not give a match.
Note that \s will match any blank space character (including tab, newline), this is why a space is preferred here:
pattern2='(?<='')[\w.^ ]+(?='')';
pattern2 =
(?<=')[\w.^ ]+(?=')
newArr = regexp(myString, pattern2,'match');
newArr'
ans =
'US04650Y1001'
'US90274P3029'
'HON WI'
'US41165F1012'

Exact Match using VBA Replace

I am trying to remove a string from another string using VBA replace function.
The string from which I am trying to remove looks like below which contains cell address concatenated by ;
"$B$1;$B$21;$B$2;$C$3;$B$20;$B$201"
and the string which I would like to remove is $B$2 by say xxx.
The replace function matches all occurrences of $B$2 in the string and gives me the output as below
$B$1;xxx1;xxx;$C$3;xxx0;xxx01
However I would like to search for $B$2 exactly in the string and expect an output like
$B$1;$B$21;xxx;$C$3;$B$20;$B$201
I one way I could think of doing this is by splitting up the string on ;(separator and looping and looking at each value) but I am looking at more direct solution here. Like using pattern matching techniques or something else.
You can include the following ; in the replace operation, to make sure you only match "complete" references. You just need to take a precaution for also matching the last entry, by adding a dummy semicolon at the end:
s = "$B$1;$B$21;$B$2;$C$3;$B$20;$B$201"
find = "$B$2"
repl = "xxxx"
result = Replace(s & ";", find & ";", repl & ";")
result = Left(result, Len(result)-1) ' Remove the final semicolon
Although this works for your case, in a more general exercise, you would also want to test for the preceding delimiter, and then the last two lines would look like this:
result = Replace(";" & s & ";", ";" & find & ";", ";" & repl & ";")
result = Mid(result, 2, Len(result)-2)
you cold use:
Function myReplace(strng As String, findStr As String, replacementStrng As String)
myReplace = Replace(strng & ";", findStr & ";", replacementStrng & ";")
myReplace = Left(myReplace, Len(myReplace) - 1)
End Function
to be exploited in your "main" sub like follows:
strng = "$B$1;$B$21;$B$2;$C$3;$B$20;$B$201"
MsgBox myReplace(strng, "$B$2", "xxx")

Using RegEx split the string

I have a string like '[1]-[2]-[3],[4]-[5],[6,7,8],[9]' or '[Computers]-[Apple]-[Laptop],[Cables]-[Cables,Connectors],[Adapters]', I'd like the Pattern to get the list result, but don't know how to figure out the pattern. Basically the comma is the split, but [6,7,8] itself contains the comma as well.
the string: [1]-[2]-[3],[4]-[5],[6,7,8],[9]
the result:
[1]-[2]-[3]
[4]-[5]
[6,7,8]
[9]
or
the string: [Computers]-[Apple]-[Laptop],[Cables]-[Cables,Connectors],[Adapters]
the result:
[Computers]-[Apple]-[Laptop]
[Cables]-[Cables,Connectors]
[Adapters]
,(?=\[)
This pattern splits on any comma that is followed by a bracket, but keeps the bracket within the result text.
The (?=*stuff*) is known as a "lookahead assertion". It acts as a condition for the match but is not itself part of the match.
In C# code:
String inputstring = "[Computers]-[Apple]-[Laptop],[Cables]-[Cables,Connectors],[Adapters]";
foreach(String s in Regex.Split(inputstring, #",(?=\[)"))
System.Console.Out.WriteLine(s);
In Java code:
String inputstring = "[Computers]-[Apple]-[Laptop],[Cables]-[Cables,Connectors],[Adapters]";
Pattern p = Pattern.compile(",(?=\\[)"));
for(String s : p.split(inputstring))
System.out.println(s);
Either produces:
[Computers]-[Apple]-[Laptop]
[Cables]-[Cables,Connectors]
[Adapters]
Although I believe the best approach here is to use split (as presented by #j__m's answer), here's an approach that uses matching rather than splitting.
Regex:
(\[.*?\](?!-))
Example usage:
String input = "[Computers]-[Apple]-[Laptop],[Cables]-[Cables,Connectors],[Adapters]";
Pattern p = Pattern.compile("(\\[.*?\\](?!-))");
Matcher m = p.matcher(input);
while (m.find()) {
System.out.println(m.group(1));
}
Resulting output:
[Computers]-[Apple]-[Laptop]
[Cables]-[Cables,Connectors]
[Adapters]
An answer that doesn't use regular expressions (if that's worth something in ease of understanding what's going on) is:
substitute "]#[" for "],["
split on "#"

Remove a number from a comma separated string while properly removing commas

FOR EXAMPLE: Given a string... "1,2,3,4"
I need to be able to remove a given number and the comma after/before depending on if the match is at the end of the string or not.
remove(2) = "1,3,4"
remove(4) = "1,2,3"
Also, I'm using javascript.
As jtdubs shows, an easy way is is to use a split function to obtain an array of elements without the commas, remove the required element from the array, and then rebuild the string with a join function.
For javascript something like this might work:
function remove(array,to_remove)
{
var elements=array.split(",");
var remove_index=elements.indexOf(to_remove);
elements.splice(remove_index,1);
var result=elements.join(",");
return result;
}
var string="1,2,3,4,5";
var newstring = remove(string,"4"); // newstring will contain "1,2,3,5"
document.write(newstring+"<br>");
newstring = remove(string,"5");
document.write(newstring+"<br>"); // will contain "1,2,3,4"
You also need to consider the behavior you want if you have repeats, say the string is "1,2,2,4" and I say "remove(2)" should it remove both instances or just the first? this function will remove only the first instance.
Just use multiple substitutions.
s/^$removed,//;
s/,$removed$//;
s/,$removed,/,/;
This will be easier than trying to invent a single replacement that handles all those cases.
string input = "1,2,3,4";
List<string> parts = new List<string>(input.Split(new char[] { ',' }));
parts.RemoveAt(2);
string output = String.Join(",", parts);
Instead of using regex, I would do something like:
- split on comma
- delete the right element
- join with comma
Here is a perl script that does the job:
#!/usr/bin/perl
use 5.10.1;
use strict;
use warnings;
my $toremove = 5;
my $string = "1,2,3,4,5";
my #tmp = split/,/, $string;
#tmp = grep{ $_ != $toremove }#tmp;
$string =join',', #tmp;
say $string;
Output:
1,2,3,4
Javascript has improved since this question was posted.
I use the following regex to remove items from a csv string
let searchStr = "359";
let regex = new RegExp("^" + searchStr + ",?|," + searchStr);
csvStr = csvStr.replace(regex, "");
If the child_id is the start, middle or end, or only item it is replaced.
If the searchStr is at the start of the csvStr it and any trailing comma is replaced. Else if the searchStr is anywhere else in the csvStr it must be preceded with a comma so the searchStr and its preceding comma are replaced by an empty string.