I am trying to extract part of a Spotify playlist link, namely all the chars between the last / and the ?, but I am getting nowhere with regex.
The link: https://open.spotify.com/playlist/37i9dQZF1DX60OAKjsWlA2?si=2NBcsO0bQS-CQclS1rNoCA
What I want: 37i9dQZF1DX60OAKjsWlA2
My code so far looks the following, but I am getting nothing out of it:
RegExp regExp = new RegExp(
"\w*\?",
caseSensitive: false,
multiLine: false,
);
When I print with
print("stringMatch : " +
regExp
.stringMatch(
"https://open.spotify.com/playlist/7x1ebdezDivH4mXAhUdR2S?si=TxHdzuvnTzuoCD5TFR4z_g")
.toString());
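It just prints an empty String. Where am I going wrong?
You need to match 1+ word chars, or chars other than /, up to a question mark excluding it.
Note that you need to double-escape backslashes in a regular string literal, or use single backslashes in a raw string literal. In a regular Dart string literal, "\w*\?" turns into the pattern w*?, a lazy match of zero or more w chars, which matches an empty string at the start of the input.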
In your current case, you may use
r"\w+(?=\?)"
See the regex demo
Or,
r"[^?/]+(?=\?)"
See this regex demo. Here, [^?/]+ matches 1+ chars other than ? and /.
A non-regex way is to split on ?, get the first item, then get the chunk of chars after the last /:
String s = "https://open.spotify.com/playlist/7x1ebdezDivH4mXAhUdR2S?si=TxHdzuvnTzuoCD5TFR4z_g";
String t=s.split("?")[0];
print(t.substring(t.lastIndexOf("/")+1));
Output: 7x1ebdezDivH4mXAhUdR2S
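For comparison, here is a minimal Python sketch of the same two approaches (the lookahead regex and the non-regex split). The function names are made up for illustration, and it assumes the ID is the last path segment before the query string.

```python
import re
from urllib.parse import urlsplit

def playlist_id_regex(url):
    # One or more chars other than ? and /, followed by a literal ? (not consumed).
    m = re.search(r"[^?/]+(?=\?)", url)
    return m.group(0) if m else None

def playlist_id_split(url):
    # Non-regex route: take the URL path and keep what follows the last /.
    return urlsplit(url).path.rsplit("/", 1)[-1]

url = "https://open.spotify.com/playlist/37i9dQZF1DX60OAKjsWlA2?si=2NBcsO0bQS-CQclS1rNoCA"
print(playlist_id_regex(url))
print(playlist_id_split(url))
```

The split-based version also works when the link has no ?si=... part, whereas the lookahead regex returns None in that case.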
Hi everyone, I have a problem. I'm trying to use regex to get all lines that do not end with
_0.jpg
_s.jpg
_m.jpg
_l.jpg
Example Lines:
9Uikt/ifehr54mg__0.jpg
9Uikt/idg4hdmg2_s.jpg
9Uikt/igdffgggfmg4_m.jpg
9Uikt/img3teg3gegg7_l.jpg
9Uikt/imgerhw45h70.jpg
9Uikt/imggq4ge37s.jpg
9Uikt/img3f37m.jpg
9Uikt/img34g3f7l.jpg
9Uikt/imgf3f34t4t73l_.jpg
9Uikt/imgf3f34t4t73l_2.jpg
The ones I am trying to get are those that do not end with one of these suffixes, i.e. the last six above (bold in the original post).
Between 9Uikt/ and .jpg any character can appear, except the characters that are not allowed in file names
"*:<>?/\|
I have tried this code
.*(_(?![0-9][a-zA-Z])).*\.jpg
You can use
^(?!.*_[0sml]\.jpg$).+\.jpg$
See the regex demo
Details
^ - start of string
(?!.*_[0sml]\.jpg$) - a negative lookahead that fails the match if, immediately to the right of the current location, there are
.* - any zero or more chars other than line break chars, as many as possible
_ - an underscore
[0sml] - a 0, s, m or l char
\. - a dot
jpg - jpg string
$ - end of string anchor
.+\.jpg$ - any one or more chars other than line break chars, as many as possible, .jpg string and end of string.
Or, if you can afford a lookbehind:
.*(?<!_[0sml])\.jpg$
See this regex demo. Details:
.* - any zero or more chars other than line break chars, as many as possible
(?<!_[0sml]) - no _ and 0, s, m or l char immediately on the left is allowed
\.jpg$ - .jpg at the end of string.
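As a rough illustration, both patterns can be tried in Python, whose re module supports lookaheads and fixed-width lookbehinds. The helper name keep_lines and the sample list are assumptions made for this sketch.

```python
import re

# Negative lookahead: fail if the line ends with _0.jpg, _s.jpg, _m.jpg or _l.jpg.
LOOKAHEAD = re.compile(r"^(?!.*_[0sml]\.jpg$).+\.jpg$")
# Lookbehind variant: .jpg must not be preceded by _ plus one of 0, s, m, l.
LOOKBEHIND = re.compile(r".*(?<!_[0sml])\.jpg$")

def keep_lines(lines, pattern=LOOKAHEAD):
    """Return the lines that the given pattern accepts."""
    return [line for line in lines if pattern.match(line)]

sample = [
    "9Uikt/img7_0.jpg", "9Uikt/img7_s.jpg", "9Uikt/img7_m.jpg",
    "9Uikt/img7_l.jpg", "9Uikt/img70.jpg", "9Uikt/img7s.jpg",
    "9Uikt/img7m.jpg", "9Uikt/img7l.jpg", "9Uikt/img7l_.jpg",
]
print(keep_lines(sample))
print(keep_lines(sample, LOOKBEHIND))
```

On this sample, both variants keep the same five lines: the ones where the char before .jpg is not preceded by an underscore, plus img7l_.jpg.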
Your problem statement is perhaps a little light on what other combinations you might want to reject. From what I can see, you don't want to match '7_', in which case a negative lookahead is the way to go.
A Perl regex might look like
if(m/img(?!7_).*jpg/) {print}
Note that SED and AWK don't support negative lookahead, but you did not say how you were executing the regex
You can simply use this regex, which I think is simpler to read and understand.
\w+\/?.*?_\w\.jpg
const testCases = [
"9Uikt/img7_0.jpg",
"9Uikt/img7_s.jpg",
"9Uikt/img7_m.jpg",
"9Uikt/img7_l.jpg",
"9Uikt/img70.jpg",
"9Uikt/img7s.jpg",
"9Uikt/img7m.jpg",
"9Uikt/img7l.jpg",
"9Uikt/img7l_.jpg"
];
const re = /\w+\/?.*?_\w\.jpg/gi;
testCases.forEach(tc => {
if (tc.match(re)) {
console.log("Matched: " + tc);
} else {
console.log("Not matched: " + tc);
}
});
I would like to mask the email passed to the maskEmail function. The problem I'm facing is that the asterisk * is not repeated when I replace groups 2 and 4 of my pattern.
Here is my code:
fun maskEmail(email: String): String {
return email.replace(Regex("(\\w)(\\w*)\\.(\\w)(\\w*)(#.*\\..*)$"), "$1*.$3*$5")
}
Here is the input:
tom.cat#email.com
cutie.pie#email.com
captain.america#email.com
Here is the current output of that code:
t*.c*#email.com
c*.p*#email.com
c*.a*#email.com
Expected output:
t**.c**#email.com
c****.p**#email.com
c******.a******#email.com
Edit:
I know this could be done easily with for loop but I would need this to be done in regex. Thank you in advance.
For your problem, you need to match each character in the email address that is not the first character in a word and occurs before the #. You can do that with a negative lookbehind for a word boundary and a positive lookahead for the # symbol:
(?<!\b)\w(?=.*?#)
The matched characters can then be replaced with *.
Note we use a lazy quantifier (?) on the .* to improve efficiency.
Demo on regex101
Note also, as pointed out by @CarySwoveland, you can replace (?<!\b) with \B, i.e.
\B\w(?=.*?#)
Demo on regex101
As pointed out by @Thefourthbird, this can be made more efficient still by replacing the .*? with [^\r\n#]*, i.e.
\B\w(?=[^\r\n#]*#)
Demo on regex101
Or, if you're only matching single strings, just [^#]*:
\B\w(?=[^#]*#)
Demo on regex101
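A minimal Python sketch of the last pattern, assuming single-line input and the # separator used in the examples above (the function name mask_email is made up here):

```python
import re

def mask_email(email):
    # \B\w        : a word char that is not at the start of a word
    # (?=[^#]*#)  : only if a # still lies somewhere ahead
    return re.sub(r"\B\w(?=[^#]*#)", "*", email)

for address in ("tom.cat#email.com", "cutie.pie#email.com", "captain.america#email.com"):
    print(mask_email(address))
```

Because each match is a single character, every masked character yields its own *, which is what makes the asterisks repeat.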
I suggest keeping any char at the start of the string and any char that directly follows a dot, and replacing with * every other char that is followed by a # somewhere later in the string:
((?:\.|^).)?.(?=.*#)
Replace with $1*. See the regex demo. This will handle emails that happen to contain chars other than just word (letter/digit/underscore) and . chars.
Details
((?:\.|^).)? - an optional capturing group matching a dot or start of string position and then any char other than a line break char
. - any char other than a line break char...
(?=.*#) - if followed with any 0 or more chars other than line break chars as many as possible and then #.
Kotlin code (with a raw string literal used to define the regex pattern so as not to have to double escape the backslash):
fun maskEmail(email: String): String {
return email.replace(Regex("""((?:\.|^).)?.(?=.*#)"""), "$1*")
}
See a Kotlin test online:
val emails = arrayOf<String>("captain.am-e-r-ica#email.com","my-cutie.pie+here#email.com","tom.cat#email.com","cutie.pie#email.com","captain.america#email.com")
for(email in emails) {
val masked = maskEmail(email)
println("${email}: ${masked}")
}
Output:
captain.am-e-r-ica#email.com: c******.a*********#email.com
my-cutie.pie+here#email.com: m*******.p*******#email.com
tom.cat#email.com: t**.c**#email.com
cutie.pie#email.com: c****.p**#email.com
captain.america#email.com: c******.a******#email.com
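The same replacement can be sketched in Python. This assumes Python 3.5 or later, where a backreference to a group that did not participate in the match is replaced with an empty string; the function name is hypothetical.

```python
import re

def mask_email_any(email):
    # Group 1 (optional): start of string or a dot, plus the char to keep.
    # The following . is the char to mask, provided a # lies ahead.
    # Assumes Python 3.5+, where an unmatched \1 expands to "".
    return re.sub(r"((?:\.|^).)?.(?=.*#)", r"\1*", email)

for address in ("captain.am-e-r-ica#email.com", "my-cutie.pie+here#email.com"):
    print(address, "->", mask_email_any(address))
```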
I have a few strings that I want to tokenize
for example:
123ae4rf468 to be split into [123,ae4rf,468]
878768stb4hgbjh354 to be split into [878768,stb4hgbjh,354]
I tried the code below, but it did not work. Kindly help.
def groupStrings(): Unit ={
val pattern: Regex = "\"[^A-Z0-9]+|(?<=[A-Z])(?=[0-9])|(?<=[0-9])(?=[A-Z])\"".r
for(patternMatch <- pattern.findAllMatchIn("12341abc1234"))
println(patternMatch.groupCount)
}
You can use this
(^\d+)(.*?)(?<=[a-z])(\d+)$
(^\d+) - Matches digits at start of string
(.*?) - Matches anything except a newline, zero or more times, as few times as possible
(?<=[a-z])(\d+)$ - Matches the digits at the end of the string, provided they are preceded by a lowercase letter (positive lookbehind)
Demo
On a side note: if you don't need the groups, you can change it to this
^\d+.*?(?<=[a-z])\d+$
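As a quick check of the grouped pattern, here is a small Python sketch (the function name split_three is an assumption). It returns the three captured groups, or None when the string does not have the digits, middle, digits shape.

```python
import re

# Leading digits, a lazily matched middle part, trailing digits preceded by a letter.
PATTERN = re.compile(r"(^\d+)(.*?)(?<=[a-z])(\d+)$")

def split_three(text):
    m = PATTERN.match(text)
    return list(m.groups()) if m else None

print(split_three("123ae4rf468"))
print(split_three("878768stb4hgbjh354"))
```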
try this (\d+|\D+)
or (\D+(?:\d*\D)*|\d+)
or (\D+(?:\d*\D+)?|\d+)
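A short Python sketch comparing the first two suggestions with re.findall (the function name tokenize is made up). Note that the plain \d+|\D+ alternation also splits at digits inside the middle part, while the second pattern keeps digits that sit between non-digits attached to their token.

```python
import re

def tokenize(text, pattern=r"\D+(?:\d*\D)*|\d+"):
    # Either a run starting and ending with a non-digit (inner digits allowed),
    # or a plain run of digits.
    return re.findall(pattern, text)

print(tokenize("123ae4rf468"))
print(tokenize("878768stb4hgbjh354"))
# The simpler alternation breaks the middle token apart:
print(tokenize("123ae4rf468", r"\d+|\D+"))
```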
I have this url http://localhost:64685/Forum/Runner/runner_job/24af786e
I would like the regex to check whether the url ends with a / followed by 8 letters or digits (as in the url above).
This is my best attempt so far, and I know it is not good or correct: /[^/A-Z]{9}/g
Could someone guide me in the right direction?
Edit
How I run the regex:
Regex regex = new Regex(@"/\/[^\W_]{8}$/");
Match match = regex.Match(url);
if (match.Success)
{
url.Replace(match.Value, "");
}
Use
Regex regex = new Regex(@"/[^\W_]{8}$");
// Or, to make it match only ASCII letters/digits:
// Regex regex = new Regex(@"/[^\W_]{8}$", RegexOptions.ECMAScript);
url = regex.Replace(url, "");
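No need to check for a match before replacing with a regex. Note that you used a String.Replace method, not a Regex.Replace one, and did not assign the new value to url (strings are immutable in C#). Also, drop the leading and trailing / around the pattern: .NET has no regex literal delimiters, so they are treated as literal slashes. See the regex demo.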
Details:
/ - a literal /
[^\W_]{8} - exactly 8 letters or digits ([^\W_] matches a char other than a non-word (\W) and _ chars)
$ - end of string.
Pass the RegexOptions.ECMAScript option if you need to only match ASCII letters/digits.
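For readers outside .NET, roughly the same replacement can be sketched in Python. The function name and the ascii_only parameter are assumptions for this example; re.ASCII plays a role similar to RegexOptions.ECMAScript here, in that it limits \W to the ASCII definition.

```python
import re

def strip_trailing_id(url, ascii_only=False):
    # Remove a final "/" + exactly 8 letters or digits at the end of the string.
    # [^\W_] is "a word char that is not an underscore".
    flags = re.ASCII if ascii_only else 0
    return re.sub(r"/[^\W_]{8}$", "", url, flags=flags)

print(strip_trailing_id("http://localhost:64685/Forum/Runner/runner_job/24af786e"))
```

When the url does not end with such a segment, the string is returned unchanged, so no separate match check is needed.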
Given a string like String a="- = - - What is your name?";
How to remove the leading equal, dash, space characters, to get the clean text,
"What is your name?"
If you want to remove the leading non-alphabetic characters, you can match:
^[^a-zA-Z]+
and replace it with '' (empty string).
Explanation:
first ^ - anchor to match at the beginning
[] - char class
second ^ - negation in a char class
+ - One or more of the previous match
So the regex matches one or more non-alphabetic characters at the beginning of the string.
In your case it will get rid of all the leading spaces, leading hyphens and leading equals signs. In short, everything before the first letter.
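A minimal sketch of this match-and-replace in Python (the function name is made up). Keep in mind that leading digits are removed as well, since they are not letters.

```python
import re

def strip_leading_non_letters(text):
    # ^ anchors at the start; [^a-zA-Z]+ is one or more non-letter chars.
    return re.sub(r"^[^a-zA-Z]+", "", text)

print(strip_leading_non_letters("- = - - What is your name?"))
```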
$a=~s/- = - - //;
In Javascript you could do it like this
var a = "- = - - What is your name?";
a = a.replace(/^([-=\s]*)([a-zA-Z0-9])/gm,"$2");
Java:
String replaced = a.replaceFirst("^[-= ]*", "");
Assuming Java try this regex:
/^\W*(.*)$/
Retrieve your string from captured group 1.
\W* matches all preceding non-word characters
(.*) then matches all characters to the end, beginning with the first word character
^ and $ are the boundaries; you could even do without $ in this case.
Tip: try the excellent Java regex tutorial for reference.
In Python:
>>> "- = - - What is your name?".lstrip("-= ")
'What is your name?'
To remove any kind of whitespace, use .lstrip("-= \t\r\n").
In Javascript, I needed to do this and did it using the following regex:
^[\s\-]+
and replace it with '' (empty string) like this:
yourStringValue.replace(/^[\s\-]+/, '');