MATLAB regular expression denied to remove spaces at beginning of a string


Suppose that we have this string in MATLAB:
mm = [' 44412 (25.01%)'];
I want to remove only the leading space(s) from this string to get this output:
'44412 (25.01%)'
I'm using strrep(mm,'\^\s\s','') but it didn't work. What is the problem?

The issue with strrep is that it does not support regex patterns; it only replaces literal text. The first part of your pattern ('\^') also escapes the caret, so it tries to match a literal ^ instead of the start of the string, and your string does not contain one. If you remove the leading \ your pattern works with regexprep, but it removes exactly two leading whitespace characters and does nothing when there is only one.
Try this more general pattern with regexprep instead:
str = ' 44412 (25.01%)';
newstr = regexprep(str, '^\s+', '');
Which returns:
newstr =
44412 (25.01%)
What I've done here is match 1 or more whitespace characters at the beginning of the string. This syntax also allows us to use it on strings without any leading whitespace and not have it make any modification.
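The same anchored pattern carries over to other regex engines. As a minimal sketch in JavaScript (the helper name stripLeading is hypothetical and not part of the original answer), the behaviour on strings with and without leading whitespace could be checked like this:

```javascript
// Hypothetical helper: remove one or more whitespace characters
// anchored at the start of the string, leaving the rest untouched.
function stripLeading(s) {
  return s.replace(/^\s+/, '');
}

console.log(JSON.stringify(stripLeading('  44412 (25.01%)'))); // "44412 (25.01%)"
console.log(JSON.stringify(stripLeading('44412 (25.01%) ')));  // unchanged, trailing space kept
```

Because the pattern requires at least one whitespace character at the start, a string without leading whitespace simply produces no match and is returned as is.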
Edit: Here are some built-in alternatives!
You could use strtrim, but it strips leading and trailing whitespace:
newstr = strtrim(str);
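You can also use strjust to left-justify your string. Note that strjust keeps the string the same length, so the leading spaces end up as trailing padding: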
newstr = strjust(str, 'left');
If you want to be really creative, you could flip your array and use deblank, which strips trailing whitespace:
newstr = fliplr(deblank(fliplr(str)));

Related

Regex: not all BLANKS but allow certain characters, with limit

I'm trying to come up with a regex, or a combination of regexes, that returns false if the user a) entered only BLANK(s), or b) entered "non-legal" characters. In addition, the number of characters has a set limit.
The closest I have so far is below. It fails because it does not count leading spaces; only the non-BLANKs are counted. I'm using JavaScript.
const reg = /^([ ]*[!-~\u2018-\u201d\u2013\u2014]){1,10}$/;
EDIT: I think the above is incorrect, and I meant to post this:
const re4 = /^(?!\s*$)[!-~\u2018-\u201d\u2013\u2014]{1,10}$/;
EDIT 2: this has less clutter; allow space and all other 'standard' keyboard chars:
const re5 = /^(?!\s*$)[!-~]{1,10}$/;
So, this says you can enter a bunch of spaces, and must include at least 1 other character from the list following; but the {1,10} only counts the non-spaces and so I can end up with too many in total.
EDIT:
So, using re5 above --
s = ' '; // should fail
s = ' blah blah'; // should pass
s = '  blah blah'; // should fail, as there are 11 characters

Try ^(?:\s*\S){1,10}\s*$
This allows 1 to 10 non-whitespace characters, with optional whitespace around them; change \S to a character class of the allowed characters.

Update 2: After learning that you cannot invert the match result in code, here's one last suggestion using negative lookahead (like you already tried yourself).
This regex matches only strings of 1-10 non-banned characters that are not all whitespace:
const re4 = /^(?!\s+$)[^\!-\~\u2018-\u201d\u2013\u2014]{1,10}$/
Update 1: Use this regex to match all-whitespace string OR strings longer than 10 chars OR strings containing bad characters:
const re4 = /(^\s+$|^.{11,}$|[\!-\~\u2018-\u201d\u2013\u2014])/
I understand that you want to impose a length restriction via regex. I would suggest against that and recommend using str.length instead.
This regex will match whitespace-only strings and strings containing one or more bad characters:
const re4 = /(^\s+$|[\!-\~\u2018-\u201d\u2013\u2014])/;
Regarding the prohibition of all-whitespace strings: instead of packing it into a regex, you might consider something more explicit like if (s.trim().length == 0). IMO this makes your intention clearer and your code probably more readable, leaving you with this easy-to-read regex:
// matches any string containing a *bad* character
const re4 = /[\!-\~\u2018-\u201d\u2013\u2014]/;
If you use trim for the all-whitespace check, you might convert your regex into a positive assertion, even with length restriction:
// matches any string consisting of 1-10 characters not considered *bad*
const re4 = /^[^\!-\~\u2018-\u201d\u2013\u2014]{1,10}$/;

To match input that is 1 to 10 characters long and not all blanks, use a negative lookahead to assert that it is not all blanks, and anchor the end so that longer input is rejected:
^(?! *$).{1,10}$
If you want to restrict allowable chars, change the dot to a suitable character class of allowable chars.
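As a rough sketch of that suggestion in JavaScript, the dot can be swapped for a class of allowed characters. The class below is an assumption for illustration only: it treats printable ASCII including the space (U+0020 to U+007E) as "legal", which may not match the asker's real list.

```javascript
// Assumption: "legal" characters are printable ASCII, space included.
// (?! *$) rejects input made only of blanks (or empty input);
// {1,10} with both anchors counts every character, spaces too.
const valid = /^(?! *$)[ -~]{1,10}$/;

console.log(valid.test(' '));           // false: blanks only
console.log(valid.test(' blah blah'));  // true: 10 characters including spaces
console.log(valid.test('  blah blah')); // false: 11 characters
console.log(valid.test(''));            // false: empty input
```

Since the spaces are part of the counted character class, leading blanks contribute to the limit, which is the behaviour the question asks for.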

Split string by multiple delimiters and keep the delimiter in result [duplicate]

How can I split a string with Regex.Split and keep all separators?
I have a string: "substring1 delimiter1 substring2", where delimiter1 + substring2 is part of an address.
I also have two or more delimiters, delim1 and delim2, which are equivalent in meaning.
I want to get a string array like this:
arr[0]="substring1";
arr[1]="delim1 substring2";
or
arr[1]="delim2 substring2";
I have this pattern:
addrArr= Regex.Split(inputText, String.Concat("(?<=",delimeter1, "|",delimeter2, ")"), RegexOptions.None);
But it does not work well.
Can you help me create a valid pattern to do that?

You need a pattern with a lookahead only:
\s+(?=delim1|delim2)
The \s+ will match 1 or more whitespace characters (since your string contains whitespace). If there might be no whitespace, use \s* (but then you will need to remove empty entries from the result). If these delimiters must be whole words, use \b word boundaries: \s+(?=\b(?:delim1|delim2)\b).
In C#:
addrArr = Regex.Split(inputText, string.Format(@"\s+(?={0})", string.Join("|", delimeters)));
If the delimiters can contain special regex metacharacters, you will need to run Regex.Escape on your delimiters list.
A C# demo:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
var inputText = "substring1 delim1 substring2 delim2 substr3";
var delimeters = new List<string> { "delim1", "delim2" };
var addrArr = Regex.Split(inputText,
string.Format(@"\s+(?={0})", string.Join("|", delimeters.Select(Regex.Escape))));
Console.WriteLine(string.Join("\n", addrArr));
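The lookahead approach is not specific to .NET. A minimal JavaScript sketch of the same idea follows; the helper names escapeRegExp and splitKeepingDelimiters are hypothetical, and the escaping function stands in for what Regex.Escape does in the C# version.

```javascript
// Escape regex metacharacters so each delimiter is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: split on the whitespace in front of any delimiter.
// The lookahead does not consume the delimiter, so it stays attached
// to the text that follows it.
function splitKeepingDelimiters(text, delimiters) {
  const alternation = delimiters.map(escapeRegExp).join('|');
  return text.split(new RegExp('\\s+(?=' + alternation + ')'));
}

const parts = splitKeepingDelimiters(
  'substring1 delim1 substring2 delim2 substr3',
  ['delim1', 'delim2']
);
console.log(parts); // [ 'substring1', 'delim1 substring2', 'delim2 substr3' ]
```

Only the whitespace before a delimiter is consumed by the split, so spaces inside each resulting piece are preserved.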

I think you need to use a lookahead, not a lookbehind, for this to work (haven't tried it though).
Also, you have to be careful with the separators; they must be escaped to work correctly as patterns in the regex.
Try this:
addrArr= Regex.Split(inputText, string.Format("(?={0}|{1})", Regex.Escape(delimeter1), Regex.Escape(delimeter2)), RegexOptions.None);

Surrounding one group with special characters in using substitute in vim

Given string:
some_function(inputId = "select_something"),
(...)
some_other_function(inputId = "some_other_label")
I would like to arrive at:
some_function(inputId = ns("select_something")),
(...)
some_other_function(inputId = ns("some_other_label"))
The key change here is the element ns( ... ) that surrounds the string available in the "" after the inputId
Regex
So far, I have come up with this regex:
:%substitute/\(inputId\s=\s\)\(\"[a-zA-Z]"\)/\1ns(/2/cgI
However, when deployed, it produces an error:
E488: Trailing characters
A simpler version of that regex works, the syntax:
:%substitute/\(inputId\s=\s\)/\1ns(/cgI
would correctly insert ns( after finding inputId = and create the string
some_other_function(inputId = ns("some_other_label")
Challenge
I'm struggling to match the remaining part of the string, ex. "select_something") and return it as:
"select_something")).

You have many problems with your regex.
[a-zA-Z] will only match one letter. Presumably you want to match everything up to the next ", so you'll need a \+ and you'll also need to match underscores too. I would recommend \w\+. Unless more than [a-zA-Z_] might be in the string, in which case I would do .\{-}.
You have a /2 instead of \2. This is why you're getting E488.
I would do this:
:%s/\(inputId = \)\(".\{-}"\)/\1ns(\2)/cgI
Or use the start match atom: (that is, \zs)
:%s/inputId = \zs\".\{-}"/ns(&)/cgI

You can use a negated character class "[^"]*" to match a quoted string:
%s/\(inputId\s*=\s*\)\("[^"]*"\)/\1ns(\2)/g
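For comparison outside Vim, the same capture-and-wrap substitution can be sketched in JavaScript. This is an illustration only: the function name wrapInputId is hypothetical, and JavaScript writes group references as $1 and $2 where Vim uses \1 and \2.

```javascript
// Group 1 captures the 'inputId = ' prefix, group 2 the quoted value.
// "[^"]*" matches a double-quoted string with no embedded quotes.
function wrapInputId(line) {
  return line.replace(/(inputId\s*=\s*)("[^"]*")/g, '$1ns($2)');
}

console.log(wrapInputId('some_function(inputId = "select_something"),'));
// some_function(inputId = ns("select_something")),
console.log(wrapInputId('some_other_function(inputId = "some_other_label")'));
// some_other_function(inputId = ns("some_other_label"))
```

Keeping both quotes inside the second group is what lets the replacement close the ns( ... ) call after the closing quote.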

Regex match a string and allow specific character to appear randomly

I want to extract a portion of a string, allowing for the dash character to appear randomly throughout. In my match, I want the dash character occurrences to be included.
Let's say I have a scenario like so:
haystack = "arandomse-que-nce"
needle = "sequence"
and I want to come out on the other end with a string like se-que-nce. In this case, what would the regex pattern look like?

I would split the string and then join by -*; for example, in JavaScript:
var needle = "sequence"
var regex = new RegExp(needle.split('').join('-*'))
var result = "arandomse-que-nce".match(regex) // ["se-que-nce"]
var result2 = "a-bad-sequ_ence".match(regex) // null
You could also use a regex to insert -* between each character:
var regex = new RegExp(needle.replace(/(?!$|^)/g, '-*'))
Both the split/join method and the replace method return 's-*e-*q-*u-*e-*n-*c-*e' for the regex.
If your string contains characters that have a special meaning in regular expressions, such as *, escape each character before joining, so that the inserted -* is not escaped as well:
var regex = new RegExp(needle.split('')
.map(function (c) { return c.replace(/[-\\^$*+?.()|[\]{}]/g, '\\$&'); })
.join('-*'))
Then, if needle was 1+1, for example, it would give you 1-*\+-*1 for the regex.

s-*e-*q-*u-*e-*n-*c-*e-*
This assumes that multiple hyphens in a row are okay.
EDIT: Doorknob's split/join solution is good, but be aware that it only works for characters that aren't special characters (*, +, etc.)
I don't know what the specifications are, but if there are special characters, make sure to escape them (and only them, since a backslash before a letter such as s or n changes its meaning):
new RegExp(needle.split('').map(function(c) { return c.replace(/[-\\^$*+?.()|[\]{}]/g, '\\$&'); }).join('-*'))

You could try to use:
s-?e-?q-?u-?e-?n-?c-?e
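The practical difference between the -* and -? variants shows up when hyphens repeat. A small sketch, using a made-up haystack with a doubled hyphen (not taken from the question):

```javascript
const haystack = 'arandomse--que-nce'; // note the doubled hyphen

const star = /s-*e-*q-*u-*e-*n-*c-*e/;     // any number of hyphens between letters
const optional = /s-?e-?q-?u-?e-?n-?c-?e/; // at most one hyphen between letters

console.log(haystack.match(star)[0]);  // se--que-nce
console.log(haystack.match(optional)); // null
```

Which variant fits depends on whether consecutive hyphens should count as a valid occurrence of the needle.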

Regex to remove characters up to a certain point in a string

How do I use regex to convert
11111aA$xx1111xxdj$%%
to
aA$xx1111xxdj$%%
So, in other words, I want to remove (or match) the FIRST grouping of 1's.

Depending on the language, you should have a way to replace a string by regex. In Java, you can do it like this:
String s = "11111aA$xx1111xxdj$%%";
String res = s.replaceAll("^1+", "");
The ^ "anchor" indicates that the beginning of the input must be matched. The 1+ means a sequence of one or more 1 characters.
The same program in C#:
var rx = new Regex("^1+");
var s = "11111aA$xx1111xxdj$%%";
var res = rx.Replace(s, "");
Console.WriteLine(res);
In general, if you would like to make a match of anything only at the beginning of a string, add a ^ prefix to your expression; similarly, adding a $ at the end makes the match accept only strings at the end of your input.

If this is the beginning, you can use this:
^[1]*
As for replacing, it depends on the language. In PowerShell, I would do this (single quotes keep $xx1111xxdj from being expanded as a variable):
[regex]::Replace('11111aA$xx1111xxdj$%%','^[1]*','')
This will return:
aA$xx1111xxdj$%%

If you only want to replace consecutive "1"s at the beginning of the string, replace the following with an empty string:
^1+
If the consecutive "1"s won't necessarily be the first characters in the string (but you still only want to replace one group), replace the following with the contents of the first capture group (usually \1 or $1):
1+(.*)
Note that this is only necessary if you only have a "replace all" capability available to you, but most regex implementations also provide a way to replace only one instance of a match, in which case you could just replace 1+ with an empty string.
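The three variants described above can be compared side by side. The following is a sketch in JavaScript, where replace with a regex that lacks the g flag only touches the first match; the second input string is invented for illustration.

```javascript
const input = '11111aA$xx1111xxdj$%%';

// Anchored: only removes 1s that start the string.
const anchored = input.replace(/^1+/, '');
console.log(anchored); // aA$xx1111xxdj$%%

// Unanchored, no g flag: removes the first run of 1s wherever it is.
const firstRun = 'ab111cd1111'.replace(/1+/, '');
console.log(firstRun); // abcd1111

// Capture-group form: match the first run plus the rest, keep only the rest.
const viaGroup = 'ab111cd1111'.replace(/1+(.*)/, '$1');
console.log(viaGroup); // abcd1111
```

In all three cases the later run of 1s is left alone, which matches the requirement to remove only the first grouping.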

I'm not sure but you can try this
[^1](\w*\d*\W)* - match all as a single group except starting "1"(n) symbols

In JavaScript:
var str = '11111aA$xx1111xxdj$%%';
var patt = /^1+/g;
str = str.replace(patt,"");
