Match word by its prefix - regex

I'm trying to match a string by its prefix, where the match must end with a particular character. For example, if my word is "abcd" and the terminator is #, then any prefix of "abcd" should be matched as long as it is followed by #. Here are some examples to help illustrate the pattern:
Input: "ab#" gives true (as "ab" is a prefix of "abcd" and ends with a #).
Input: "abcd#" gives true (as "abcd" is a prefix of "abcd" and ends with a #).
Input: "bc#" gives false (as "bc" is not a prefix of "abcd").
Input: "ab" gives false (while "ab" is a prefix of "abcd", it doesn't end with #).
Input: "ac#" gives false (while "a" and "c" both appear in "abcd", "ac" is not a prefix of "abcd").
So far, I've managed to come up with the following expression which seems to be working fine:
/(abcd|abc|ab|a)#/
While this is working, it isn't very practical, as larger words of length n will make the expression quite large:
/(n|n-1|n-2| ... |1)#/
Is there a way to rewrite this expression so it is more scalable and concise?
Example of my attempt (in JS):
const regex = /(abcd|abc|ab|a)#/;
console.log(regex.test("abcd#")); // true
console.log(regex.test("ab#")); // true
console.log(regex.test("abc#")); // true
console.log(regex.test("abz#")); // false
console.log(regex.test("abc")); // false
Edit: Some of the solutions provided are nice and do what I'm after. However, for this particular question, I'm after a solution which uses pure regular expressions to match the prefix.

Just use String#startsWith and String#endsWith here:
String input = "abcd";
String prefix = "ab#";
if (input.startsWith(prefix.replaceAll("#$", "")) && prefix.endsWith("#")) {
System.out.println("MATCH");
}
else {
System.out.println("NO MATCH");
}
Edit: A JavaScript version of the above:
var input = "abcd";
var prefix = "ab#";
if (input.startsWith(prefix.replace(/#$/, "")) && prefix.endsWith("#")) {
console.log("MATCH");
}
else {
console.log("NO MATCH");
}

Try ^ab?c?d?#$
Explanation:
`^` - match beginning of a string
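`b?` - match zero or one `b`
The rest is analogous to the above.
Note that each optional letter is independent here, so this pattern also accepts inputs such as "ac#" or "ad#", which are not a prefix of "abcd" followed by #.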
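To see how a flat chain of optional letters behaves on the question's examples, here is a small check that runs it next to a nested-optional pattern. The variable names and the list of inputs are only illustrative assumptions.

```javascript
// Flat pattern: every letter after "a" is optional on its own.
const flat = /^ab?c?d?#$/;
// Nested pattern: a letter is only allowed if the previous one matched.
const nested = /^a(b(cd?)?)?#$/;

const inputs = ["a#", "ab#", "abcd#", "ac#", "ad#", "bc#", "ab"];
const results = inputs.map((input) => ({
  input,
  flat: flat.test(input),
  nested: nested.test(input),
}));

// Print one row per input so the two patterns can be compared side by side.
for (const row of results) {
  console.log(row.input, "flat:", row.flat, "nested:", row.nested);
}
```

The two patterns agree on every true prefix and differ only on inputs that skip a letter, such as "ac#" and "ad#".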

Here's a left-field JavaScript option. Build an array of valid prefixes, then use join on the array to make your regex pattern.
var validPrefixes = ["abcd",
"abc",
"ab",
"a",
"areallylongprefix"];
var regexp = new RegExp("^(" + validPrefixes.join("|") + ")#$");
console.log(regexp.test("abcd#"));// true
console.log(regexp.test("ab#")); // true
console.log(regexp.test("abc#")); // true
console.log(regexp.test("abz#")); // false
console.log(regexp.test("abc")); // false
console.log(regexp.test("areallylongprefix#")); //true
This can be adapted to the language of your choosing. It is also handy if your prefixes are dynamically retrieved from a database or similar.
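If the valid prefixes are all prefixes of one word, the array does not have to be typed out by hand. The sketch below derives it from the word; `escapeRegExp` and `prefixAlternation` are hypothetical helper names, and the `#` terminator is just the default taken from the question.

```javascript
// Hypothetical helper: escapes regex metacharacters so text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds /^(abcd|abc|ab|a)#$/ from the word "abcd".
function prefixAlternation(word, terminator = "#") {
  const prefixes = [];
  // Longest prefix first, mirroring the hand-written pattern.
  for (let len = word.length; len >= 1; len--) {
    prefixes.push(escapeRegExp(word.slice(0, len)));
  }
  return new RegExp(
    "^(" + prefixes.join("|") + ")" + escapeRegExp(terminator) + "$"
  );
}

const regexp = prefixAlternation("abcd");
console.log(regexp.source);        // the generated pattern
console.log(regexp.test("ab#"));   // prefix followed by the terminator
console.log(regexp.test("abz#"));  // not a prefix
```

The generated pattern still grows quadratically with the word length, so this only automates the verbose form rather than shortening it.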

Here's my C# attempt:
private static bool test(string v)
{
var pattern = "abcd#";
//No error handling
return v.EndsWith(pattern[pattern.Length-1])
&& pattern.Replace("#", "").StartsWith(v.Replace("#",""));
}
Console.WriteLine(test("abcd#")); // true
Console.WriteLine(test("ab#")); // true
Console.WriteLine(test("abc#")); // true
Console.WriteLine(test("abz#")); // false
Console.WriteLine(test("abc")); // false

/a(b(cd?)?)?#/
Or for a longer example, to match a prefix of "abcdefg#":
/a(b(c(d(e(fg?)?)?)?)?)?#/
Generating this regex isn't completely trivial, but some options are:
function createPrefixRegex(s) {
// This method creates an unnecessary set of parentheses
// around the last letter, but that won't harm anything.
return new RegExp(s.slice(0,-1).split('').join('(') + ')?'.repeat(s.length - 2) + '#');
}
function createPrefixRegex2(s) {
var r = s[0];
for (var i = 1; i < s.length - 2; ++i) {
r += '(' + s[i];
}
r += s[s.length - 2] + '?' + ')?'.repeat(s.length - 3) + '#';
return new RegExp(r);
}
function createPrefixRegex3(s) {
var recurse = function(i) {
if (i >= s.length - 1) {
return '';
}
if (i === s.length - 2) {
return s[i] + '?';
}
return '(' + s[i] + recurse(i + 1) + ')?';
}
return new RegExp(s[0] + recurse(1) + '#');
}
These may fail if the input string has no prefix before the '#' character, and they assume the last character in the string is '#'.
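As a further variation, here is a sketch that builds the nested pattern from the inside out, anchors it, escapes regex metacharacters, and handles a one-letter word. Unlike the functions above, it takes the word without the trailing '#'. The names `escapeRegExp` and `nestedPrefixRegex` are hypothetical, and using non-capturing groups is a choice made here for illustration.

```javascript
// Hypothetical helper: escapes regex metacharacters so text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: "abcd" -> /^a(?:b(?:c(?:d)?)?)?#$/
function nestedPrefixRegex(word, terminator = "#") {
  if (word.length === 0) {
    throw new Error("word must contain at least one character");
  }
  const chars = Array.from(word).map(escapeRegExp);
  // Start with the last character and wrap outwards, so each character
  // is only optional together with everything that follows it.
  let tail = "";
  for (let i = chars.length - 1; i >= 1; i--) {
    tail = "(?:" + chars[i] + tail + ")?";
  }
  return new RegExp("^" + chars[0] + tail + escapeRegExp(terminator) + "$");
}

const prefixRegex = nestedPrefixRegex("abcd");
console.log(prefixRegex.source);
console.log(prefixRegex.test("abc#")); // prefix followed by '#'
console.log(prefixRegex.test("ac#"));  // skips a letter
```

The pattern length grows linearly with the word, which is what makes the nested form more scalable than listing every prefix.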


How to do a camel case to sentence case in dart

Something is wrong with my attempt:
String camelToSentence(String text) {
var result = text.replaceAll(RegExp(r'/([A-Z])/g'), r" $1");
var finalResult = result[0].toUpperCase() + result.substring(1);
return finalResult;
}
void main(){
print(camelToSentence("camelToSentence"));
}
It just prints "CamelToSentence" instead of "Camel To Sentence".
Looks like the problem is here r" $1"; but I don't know why.
You can use
String camelToSentence(String text) {
  return text.replaceAllMapped(RegExp(r'^([a-z])|[A-Z]'),
      // Under null safety (Dart 2.12+), m[1] is nullable, hence the `!`.
      (Match m) => m[1] == null ? " ${m[0]}" : m[1]!.toUpperCase());
}
Here,
^([a-z])|[A-Z] - matches and captures into Group 1 a lowercase ASCII letter at the start of the string, or just matches an uppercase letter anywhere in the string
(Match m) => m[1] == null ? " ${m[0]}" : m[1]!.toUpperCase() - returns as the replacement the uppercased Group 1 value (if it was matched), or a space plus the matched value otherwise.
You should not use the / and /g in the pattern. Dart's RegExp takes the bare pattern, so the slashes and the g are treated as literal characters to match.
About the replaceAll method:
Notice that the replace string is not interpreted. If the replacement
depends on the match (for example on a RegExp's capture groups), use
the replaceAllMapped method instead.
As it does not match, the string is left unchanged: result[0] returns c and result.substring(1) contains amelToSentence, so you are concatenating an uppercased C with amelToSentence, giving CamelToSentence.
You can also use lookarounds
(?<!^)(?=[A-Z])
(?<!^) Assert not the start of the string
(?=[A-Z]) Assert an uppercase char A-Z to the right
For example
String camelToSentence(String text) {
var result = text.replaceAll(RegExp(r'(?<!^)(?=[A-Z])'), r" ");
var finalResult = result[0].toUpperCase() + result.substring(1);
return finalResult;
}
void main() {
print(camelToSentence("camelToSentence"));
}
Output
Camel To Sentence
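For comparison, the same lookaround pattern can be tried in JavaScript, where `String.prototype.replace` with the `g` flag plays the role of Dart's `replaceAll`. This is only a sketch: the empty-string guard is an addition, and it assumes an engine with lookbehind support (current Node.js and modern browsers).

```javascript
function camelToSentence(text) {
  if (text.length === 0) {
    return text; // nothing to capitalise
  }
  // Insert a space at every position that is not the start of the string
  // and is immediately followed by an uppercase ASCII letter.
  const spaced = text.replace(/(?<!^)(?=[A-Z])/g, " ");
  return spaced[0].toUpperCase() + spaced.slice(1);
}

console.log(camelToSentence("camelToSentence"));
```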

Backspace String Compare Leetcode Question

I have a question about the following problem on Leetcode:
Given two strings S and T, return if they are equal when both are typed into empty text editors. # means a backspace character.
Example 1:
Input: S = "ab#c", T = "ad#c"
Output: true
Explanation: Both S and T become "ac".
Example 2:
Input: S = "ab##", T = "c#d#"
Output: true
Explanation: Both S and T become "".
Example 3:
Input: S = "a##c", T = "#a#c"
Output: true
Explanation: Both S and T become "c".
Example 4:
Input: S = "a#c", T = "b"
Output: false
Explanation: S becomes "c" while T becomes "b".
Note:
1 <= S.length <= 200
1 <= T.length <= 200
S and T only contain lowercase letters and '#' characters.
Follow up:
Can you solve it in O(N) time and O(1) space?
My answer:
def backspace_compare(s, t)
if (s.match?(/[^#[a-z]]/) || t.match?(/[^#[a-z]]/)) || (s.length > 200 || t.length > 200)
return "fail"
else
rubular = /^[\#]+|([^\#](\g<1>)*[\#]+)/
if s.match?(/#/) && t.match?(/#/)
s.gsub(rubular, '') == t.gsub(rubular, '')
else
new_s = s.match?(/#/) ? s.gsub(rubular, '') : s
new_t = t.match?(/#/) ? t.gsub(rubular, '') : t
new_s == new_t
end
end
end
It works in the terminal and passes the given examples, but when I submit it on leetcode it tells me Time Limit Exceeded. I tried shortening it to:
rubular = /^[\#]+|([^\#](\g<1>)*[\#]+)/
new_s = s.match?(/#/) ? s.gsub(rubular, '') : s
new_t = t.match?(/#/) ? t.gsub(rubular, '') : t
new_s == new_t
But also the same error.
So far, I believe my code fulfills the O(n) time, because there are only two ternary operators, which overall is O(n). I'm making 3 assignments and one comparison, so I believe that fulfills the O(1) space complexity.
I have no clue how to proceed beyond this, been working on it for a good 2 hours..
Please point out if there are any mistakes in my code, and how I am able to fix it.
Thank you! :)
Keep in mind that with N <= 200, your problem is more likely to be the constant factor than the algorithmic complexity. Space is immaterial here; with at most 400 characters in total, memory is not an issue. You have six regex matches, two of which are redundant. More importantly, regex processing is slow for such a specific application.
For speed, drop the regex and use one of the straightforward, brute-force approaches: run through each string in order, applying the backspaces as you go. For instance, change both the backspace and the preceding letter to spaces. At the end, remove all the spaces to make a new string. Do this with both S and T, then compare the results for equality.
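A minimal sketch of that single pass, in JavaScript. Instead of marking characters with spaces it keeps the surviving letters in an array used as a stack, which is a swapped-in but equivalent technique; the function names are hypothetical.

```javascript
// Replays the keystrokes of one string and returns what is left in the editor.
function applyBackspaces(text) {
  const kept = [];
  for (const ch of text) {
    if (ch !== "#") {
      kept.push(ch); // an ordinary letter is typed
    } else {
      kept.pop(); // a backspace removes the last letter; no-op when empty
    }
  }
  return kept.join("");
}

function backspaceCompare(s, t) {
  return applyBackspaces(s) === applyBackspaces(t);
}

console.log(backspaceCompare("ab#c", "ad#c"));
console.log(backspaceCompare("a#c", "b"));
```

This runs in O(N) time but uses O(N) extra space, so it does not address the follow-up question about O(1) space.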
It may be easiest to start at the end of the string and work towards the beginning:
def process(str)
n = 0
str.reverse.each_char.with_object('') do |c,s|
if c == '#'
n += 1
else
n.zero? ? (s << c) : n -= 1
end
end.reverse
end
%w|ab#c ad#c ab## c#d# a##c #a#c a#c b|.each_slice(2) do |s1, s2|
puts "\"%s\" -> \"%s\", \"%s\" -> \"%s\" %s" %
[s1, process(s1), s2, process(s2), (process(s1) == process(s2)).to_s]
end
"ab#c" -> "ac", "ad#c" -> "ac" true
"ab##" -> "", "c#d#" -> "" true
"a##c" -> "c", "#a#c" -> "c" true
"a#c" -> "c", "b" -> "b" false
Let's look at a longer string.
require 'time'
alpha = ('a'..'z').to_a
#=> ["a", "b", "c",..., "z"]
s = (10**6).times.with_object('') { |_,s|
s << (rand < 0.4 ? '#' : alpha.sample) }
#=> "h####fn#fjn#hw###axm...#zv#f#bhqsgoem#glljo"
s.size
#=> 1000000
s.count('#')
#=> 398351
and see how long it takes to process.
require 'time'
start_time = Time.now
(u = process(s)).size
#=> 203301
puts (Time.now - start_time).round(2)
#=> 0.28 (seconds)
u #=> "ffewuawhfa...qsgoeglljo"
As u will be missing the 398351 pound signs in s, plus an almost equal number of other characters removed by the pound signs, we would expect u.size to be about:
10**6 - 2 * s.count('#')
#=> 203298
In fact, u.size #=> 203301, meaning that, at the end, 203301 - 203298 #=> 3 pound signs were unable to remove a character from s.
The process method can also be simplified. I leave that as an exercise for the reader.
A stack-based solution in Java:
import java.util.Stack;

class Solution {
    public boolean backspaceCompare(String s, String t) {
        // Stack has a list-style equals(): two stacks are equal when they
        // have the same size and equal elements in the same order.
        return convertToStack(s).equals(convertToStack(t));
    }

    // Replays the keystrokes: letters are pushed, '#' pops the most
    // recent letter, and '#' on an empty stack is ignored.
    public Stack<Character> convertToStack(String s) {
        Stack<Character> stack = new Stack<>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '#') {
                stack.push(c);
            } else if (!stack.isEmpty()) {
                stack.pop();
            }
        }
        return stack;
    }
}
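For the follow-up in the question (O(N) time and O(1) space), one common approach is to walk both strings from the end with two indices and skip characters that a later '#' would delete. The sketch below is one way to write it in JavaScript; `nextKept` and `backspaceCompareConstantSpace` are hypothetical names.

```javascript
// Returns the index of the nearest surviving character at or before i,
// or -1 if every remaining character is deleted.
function nextKept(text, i) {
  let skip = 0; // number of pending backspaces
  while (i >= 0) {
    if (text[i] === "#") {
      skip++;
      i--;
    } else if (skip > 0) {
      skip--; // this letter is erased by a pending backspace
      i--;
    } else {
      break; // this letter survives
    }
  }
  return i;
}

function backspaceCompareConstantSpace(s, t) {
  let i = s.length - 1;
  let j = t.length - 1;
  while (true) {
    i = nextKept(s, i);
    j = nextKept(t, j);
    if (i < 0 || j < 0) {
      return i < 0 && j < 0; // equal only if both are exhausted together
    }
    if (s[i] !== t[j]) {
      return false;
    }
    i--;
    j--;
  }
}

console.log(backspaceCompareConstantSpace("ab#c", "ad#c"));
console.log(backspaceCompareConstantSpace("a#c", "b"));
```

Each character is visited at most once per string, and only a few integer counters are kept, so no intermediate string or stack is built.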

Regular expression to match repeated strings

I am trying to make a regex that will
match if the string is constructed exclusively from strings in a given set of strings, and
not match if there is any other string in there.
Examples for the set of strings ['xyz', 'a', 'b']:
'xyzab' == true
'xyzxyzabbb' == true
'aaabb' == true
'' == true
'd' == false
'aabbbbd' == false
'zxy' == false
I am URL matching :/
You can try this regex: ^(?:xyz|[ab])*$
var regex = new RegExp('^(?:xyz|[ab])*$');
var input = ['xyzab', 'xyzxyzabbb', 'aaabb', '', 'd', 'aabbbbd', 'zxy'];
for (var i = 0, l = input.length; i < l; i++) {
console.log(input[i], '->', regex.test(input[i]));
}
Given a set of strings {"str1", "str2", ..., "strN"}, write the regex as follows:
^(str1|str2|...|strN)*$
Where
^ matches the beginning of the string
(...) matches any one of the strings
* means that the preceding group may be repeated zero or more times
$ matches the end of the string
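The recipe above can be turned into a small builder that takes the set as an array. This is a sketch under a few assumptions: `escapeRegExp` and `onlyFromSet` are hypothetical names, a non-capturing group is used in place of the capturing one, and the strings are escaped so they are matched literally.

```javascript
// Hypothetical helper: escapes regex metacharacters so text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: ['xyz', 'a', 'b'] -> /^(?:xyz|a|b)*$/
function onlyFromSet(strings) {
  const alternatives = strings.map(escapeRegExp).join("|");
  return new RegExp("^(?:" + alternatives + ")*$");
}

const regex = onlyFromSet(["xyz", "a", "b"]);
const inputs = ["xyzab", "xyzxyzabbb", "aaabb", "", "d", "aabbbbd", "zxy"];
for (const input of inputs) {
  console.log(JSON.stringify(input), "->", regex.test(input));
}
```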

Use Meteor Match and Regex to check strings

I'm checking an array of strings for a specific combination of patterns. I'm having trouble using Meteor's Match function and regex literal together. I want to check if the second string in the array is a url.
addCheck = function(line) {
var firstString = _.first(line);
var secondString = _.indexOf(line, 1);
console.log(secondString);
var urlRegEx = /((([A-Za-z]{3,9}:(?:\/\/)?)(?:[\-;:&=\+\$,\w]+#)?[A-Za-z0-9\.\-]+|(?:www\.|[\-;:&=\+\$,\w]+#)[A-Za-z0-9\.\-]+)((?:\/[\+~%\/\.\w\-]*)?\??(?:[\-\+=&;%#\.\w]*)#?(?:[\.\!\/\\\w]*))?)/g;
if ( firstString == "+" && Match.test(secondString, urlRegEx) === true ) {
console.log( "detected: + | line = " + line )
} else {
// do stuff if we don't detect a
console.log( "line = " + line );
}
}
Any help would be appreciated.
Match.test is used to test the structure of a variable. For example: "it's an array of strings, or an object including the field createdAt", etc.
RegExp.test on the other hand, is used to test if a given string matches a regular expression. That looks like what you want.
Try something like this instead:
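if ((firstString === '+') && urlRegEx.test(secondString)) {
...
}
Also note that _.indexOf(line, 1) returns the position of the value 1 in the array (or -1 if it is absent), not the second element. Use line[1] to get the second string.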
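One more thing to watch when calling RegExp.test repeatedly: a regex created with the g flag remembers where its last match ended, so consecutive calls on the same string can alternate between true and false. The sketch below illustrates this with a deliberately simplified URL pattern and an example.com address, both of which are stand-ins and not the pattern from the question.

```javascript
// Simplified stand-in patterns (assumptions for illustration only).
const withGlobal = /^https?:\/\/\S+$/g;
const withoutGlobal = /^https?:\/\/\S+$/;

const line = ["+", "https://example.com/page"];
const secondString = line[1]; // the second element of the array

// With the g flag, test() advances lastIndex after a successful match,
// so the next call starts searching from the end of the string.
const firstCall = withGlobal.test(secondString);
const secondCall = withGlobal.test(secondString);

// Without the g flag, every call starts from the beginning.
const stableFirst = withoutGlobal.test(secondString);
const stableSecond = withoutGlobal.test(secondString);

console.log(firstCall, secondCall);     // the results differ
console.log(stableFirst, stableSecond); // the results agree
```

If the regex is only used for a yes/no check, dropping the g flag avoids this statefulness.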

regex equals to something exactly but does not equal to something else

My regex query below does an exact match of a word, say Bob or Bill for example:
var regExp = new RegExp("^" + inputVal + "$", 'i');
What I want it to do is match anything exactly (Bob or Bill, etc.) but not match Fred.
So match anything exactly except for Fred, does that make sense?
Can anyone help me out as to how I do that?
Thanks
EDIT 2:
I thought I'd show my actual script instead. What I'm doing is searching a table, and on page load I want to hide rows that contain a string. So if the exclude length is greater than 0, hide that row...
function searchPagingTable(inputVal, tablename, fixedsearch, exclude) {
var table = $(tablename);
table.find('tr:not(.header)').each(function (index, row) {
var allCells = $(row).find('td');
if (allCells.length > 0) {
var found = false;
allCells.each(function (index, td) {
if (fixedsearch == 1) {
var regExp = new RegExp("^" + inputVal + "$", 'i');
}
else if (exclude.length > 0)
{
var regExp = new RegExp("^(?!" + exclude + ")", "i");
}
else {
var regExp = new RegExp(inputVal, 'i');
}
if (regExp.test($(td).text())) {
found = true;
return false;
}
});
if (found == true) $(row).show().removeClass('exclude'); else $(row).hide().addClass('exclude');
}
});
paginate();
}
That would be
var exclude = "Fred"
var regExp = new RegExp("^(?!.*" + exclude + ")", "i");
This regex matches any string except those that contain Fred. It doesn't actually match any characters in the string, but that's sufficient if you're just looking for a true/false result.
This will also find strings that contain Alfred or Fredo, so if you don't want that, you need to tell the regex only to look for entire words using word boundaries:
var regExp = new RegExp("^(?!.*\\b" + exclude + "\\b)", "i");
You need to make sure that your exclude string only contains ASCII letters/digits (or underscores) for this to work correctly.
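Because the exclude string is concatenated straight into the pattern, any regex metacharacters in it would change the meaning of the expression. A sketch of one way to guard against that is shown below; `escapeRegExp` and `excludesWord` are hypothetical helper names, and the word-boundary caveat above still applies.

```javascript
// Hypothetical helper: escapes regex metacharacters so text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds a regex that succeeds only when the text
// does not contain the excluded word as a whole word (case-insensitive).
function excludesWord(exclude) {
  return new RegExp("^(?!.*\\b" + escapeRegExp(exclude) + "\\b)", "i");
}

const notFred = excludesWord("Fred");
const names = ["Bob", "Fred", "fred flintstone", "Alfred", "Fredo"];
for (const name of names) {
  console.log(name, "->", notFred.test(name));
}
```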
You could populate a list of names you wish to match against:
var validNames = ['bob', 'bill'];
Then lowercase each input and match against the list:
if (validNames.indexOf(inputVal.toLowerCase()) != -1) {
// it's a good name
}
For older browsers you have to shim Array.prototype.indexOf().
var re = new RegExp('^\\s*Fred\\s*$','i');
if (inputVal.match(re)) {
// Fred has been found
} else {
// Anything other than Fred has been found
}