I'm new to regex; this is the first time I've used it.
Given a string with multiple words, I need to extract the second word (word = any number of characters between two spaces).
For example: "hi baby you're my love"
I need to extract "baby"
I think I could start from this: (\b\w*\b) that matches every single word, but I don't know how to make it skip the first match.
Thanks for the suggestions, guys.
I modified your regex a little and finally found what I need:
(?<=\s)(.*?)(?=\s)
This one, (?<=.)(\b\w+\b), was also fairly good, but it fails on a string like "hi ba-by you're my love", splitting "ba-by" into "ba" and "by".
You can do it even without \b.
Use \w+\s+(\w+) and read the word from capturing group 1.
The regex above:
First matches a non-empty sequence of word characters (the first word).
Then it matches a non-empty sequence of whitespace characters (spaces) between words 1 and 2.
And finally, the capturing group captures just the second word.
Note that \s+(\w+) is wrong, because the source string can begin with a space, and in that case this regex would catch the first word.
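As a minimal sketch of the approach above, here it is in Python (chosen only for illustration; the function names second_word and second_token are hypothetical). re.search is used so that any leading spaces are skipped before the first word.

```python
import re

def second_word(text):
    """Return the second run of word characters in text, or None."""
    # \w+ = first word, \s+ = the gap, (\w+) = second word in group 1
    match = re.search(r"\w+\s+(\w+)", text)
    return match.group(1) if match else None

def second_token(text):
    """Same idea, but a word is any run of non-whitespace characters."""
    match = re.search(r"\S+\s+(\S+)", text)
    return match.group(1) if match else None

print(second_word("hi baby you're my love"))
print(second_token("hi ba-by you're my love"))
```

Because \w does not include the hyphen, second_word cuts "ba-by" down to "ba"; the \S variant keeps it whole, at the cost of also keeping any attached punctuation.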
I am new to regex. Basically, I'd like to check whether a word has ONLY one colon.
If it has two or more colons, it should return nothing.
If it has one colon, it should be returned as is. (The colon must be in the middle of the string, not at the beginning or end.)
(1)
a:bc:de #return nothing or error.
a:bc #return a:bc
a.b_c-12/:a.b_c-12/ #return a.b_c-12/:a.b_c-12/
(2)
This is my thinking so far, but it seems too complicated.
^[^:]*(\:[^:]*){1}$
^[-\w.\/]*:[-\w\/.]* #this will not throw error when there are 2 colons.
Any directions would be helpful, thank you!
This will find such "words" within a larger sentence:
(?<= |^)[^ :]+:[^ :]+(?= |$)
See live demo.
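Some engines, Python's built-in re among them, reject a lookbehind such as (?<= |^) because its alternatives differ in width. A rough equivalent, sketched here under the assumption that any whitespace (not just a space) separates words, uses negative lookarounds instead; the name ONE_COLON_WORD is hypothetical.

```python
import re

# (?<!\S) = not preceded by a non-whitespace character (start of text or a gap)
# (?!\S)  = not followed by a non-whitespace character (end of text or a gap)
ONE_COLON_WORD = re.compile(r"(?<!\S)[^\s:]+:[^\s:]+(?!\S)")

sentence = "keep a:bc but skip a:bc:de and :x and y:"
found = ONE_COLON_WORD.findall(sentence)
print(found)
```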
If you just want to test the whole input:
^[^ :]+:[^ :]+$
To restrict to only alphanumeric, underscore, dashes, dots, and slashes:
^[\w./-]+:[\w./-]+$
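A short Python sketch of the whole-input test, using the examples from the question. The function name single_colon is hypothetical, and fullmatch stands in for the ^ and $ anchors.

```python
import re

# word characters, dots, slashes and dashes on both sides of exactly one colon
WORD_WITH_ONE_COLON = re.compile(r"[\w./-]+:[\w./-]+")

def single_colon(word):
    """Return word if it has one colon with text on both sides, else None."""
    return word if WORD_WITH_ONE_COLON.fullmatch(word) else None

for sample in ["a:bc:de", "a:bc", "a.b_c-12/:a.b_c-12/", ":abc", "abc:"]:
    print(repr(sample), "->", single_colon(sample))
```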
I saw this as a good opportunity to brush up on my regex skills - so might not be optimal but it is shorter than your last solution.
This is the regex pattern: /^[^:]*:[^:]*$/gm and these are the strings I am testing against: 'oneco:on' (match) and 'one:co:on', 'oneco:on:', ':oneco:on' (these should all not match)
To explain what is going on, the ^ matches the beginning of the string, the $ matches the end of the string.
The [^:] bit says that any character that is not a colon will be matched.
In summary, ^[^:]* matches any number of non-colon characters (including none) from the start of the string, : matches a single colon, and [^:]*$ matches any number (*) of non-colon characters up to the end of the string. Because * also allows zero characters, a string with a single colon at its very beginning or end, such as ':on', still matches; use + in place of * to require at least one character on each side.
To elaborate, because the pattern is anchored at both the beginning and the end of the string and allows exactly one colon in between, only the first string, 'oneco:on', is a match.
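To see how the pattern behaves on edge cases, here is a small Python comparison. The variant with + is my own assumption about what "colon in the middle" requires, not part of the answer above. Since * allows zero characters, the original pattern accepts a lone colon at either end, while the + variant does not.

```python
import re

star = re.compile(r"^[^:]*:[^:]*$")  # pattern from the answer above
plus = re.compile(r"^[^:]+:[^:]+$")  # hypothetical stricter variant

samples = ["oneco:on", "one:co:on", "oneco:on:", ":oneco:on", ":on", "oneco:"]
for sample in samples:
    print(f"{sample!r}: star={bool(star.search(sample))} plus={bool(plus.search(sample))}")
```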
I am trying to write a regex that matches a specific word as long as it is not the only word in the string.
Below, I want to match the "one" in "I think one is cool" but not the standalone "one", because I only want it when it's not by itself.
Ex.
one
I think one is cool <--- I want this "one"
Any help would be greatly appreciated
For regex, the beginning of a string is typically signified with ^ (caret) and the end with $ (dollar sign).
Many flavors of regex allow you to do forward/backward lookarounds, so basically you want to find the word one that is not by itself, but part of a string.
You're looking for the word one, so you can use \b around the word, which is usually syntax for a word boundary. This helps you filter out searches like none.
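So here is a regex that would work for you:
(?!^one$)\bone\b
The negative lookahead (?!^one$) rejects only a string that consists of nothing but one. A stricter variant, (?<!^)\bone\b(?!$), also rejects one at the very start or very end of a longer string.
Of the following strings, the first pattern matches the one in the second and third only: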
one
is the one
one for all
i can none of
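A quick Python check of the strings listed above, comparing two ways to express "one, but not by itself". The names strict and lenient are hypothetical, and "the one ring" is an extra sample added for illustration.

```python
import re

# rejects "one" whenever it sits at the very start or very end of the string
strict = re.compile(r"(?<!^)\bone\b(?!$)")
# rejects only a string that is exactly "one"
lenient = re.compile(r"(?!^one$)\bone\b")

samples = ["one", "is the one", "one for all", "i can none of", "the one ring"]
for text in samples:
    print(f"{text!r}: strict={bool(strict.search(text))} lenient={bool(lenient.search(text))}")
```

In both patterns \b is what keeps "none" from matching.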
If I understand you correctly, you want to match lines containing your word, but not consisting of only your word.
Depending on your programming language there might be better ways to do this, but you can search for the regex /(\w+\sone|one\s\w+)/ to find lines containing something like a word, then a space, then "one", or "one", a space, and then something like a word. So this would match every line here:
one two three
this is one line
the number one
but no line here:
one
lonely
something else
If you want it to match something like "lonely", remove the whitespace escape sequences (\s). If you want to match more than word characters before and/or after, replace the \w with a dot (.).
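The two lists above can be checked with a few lines of Python; this is only a sketch, and the variable names are hypothetical.

```python
import re

# a word and a whitespace character before "one", or "one", whitespace, a word
pattern = re.compile(r"\w+\sone|one\s\w+")

lines = ["one two three", "this is one line", "the number one",
         "one", "lonely", "something else"]
matching = [line for line in lines if pattern.search(line)]
print(matching)
```

Note that this pattern has no word boundaries, so a line such as "i can none of" also matches through "one of"; wrapping one in \b avoids that.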
I'm trying to match the last four characters (alphanumeric) of all words beginning with the sequence &c.
For instance, in the string below, I'd like to match the pieces 2AC3 and DE4A:
Colour one is &cFF2AC3 and colour two is &c22DE4A.
Can anybody help me with the correct regex expression? I've spent hours on this great resource to no avail.
It looks like these are hexadecimal numbers, so use this pattern (\K, supported in PCRE and Perl, discards what has been matched so far from the result):
&c[0-9A-F]{2}\K([0-9A-F]{4})
DEMO
This:
/(?i)\s*&c(?:[a-z0-9]{2})([a-z0-9]{4})\b/
Append a g to the end of it if you want it to find all matches in a given text.
Try this
/(?:^| )&c\w*(\w{4})\b/
If you want to try it in the regex tester you linked to, make sure to use the g modifier to see all matches.
Explanation: (?:^| ) matches either a space or the start of the string, &c\w* matches the ampersand, the c, and however many characters of the word come first, and then \w{4} captures the last 4 characters. \b on the end asserts a word break (a "non-word" character or the end of the string).
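For illustration, here is how the capture-group approach might look in Python. Python's re module has no \K, so the hexadecimal variant below uses a capturing group instead; the variable names are hypothetical.

```python
import re

text = "Colour one is &cFF2AC3 and colour two is &c22DE4A."

# group 1 holds the last four word characters of each word starting with &c
last_four = re.findall(r"(?:^| )&c\w*(\w{4})\b", text)

# hexadecimal-only variant: two hex digits are skipped, the next four captured
hex_only = re.findall(r"&c[0-9A-F]{2}([0-9A-F]{4})", text)

print(last_four, hex_only)
```

With a single capturing group, findall returns just the captured text, which plays the role of the g modifier mentioned above.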
I am looking for a regex that matches first word in a sentence excluding punctuation and white space. For example: "This" in "This is a sentence." and "First" in "First, I would like to say \"Hello!\""
This doesn't work:
"""([A-Z].*?(?=^[A-Za-z]))""".r
(?:^|(?:[.!?]\s))(\w+)
Will match the first word in every sentence.
http://rubular.com/r/rJtPbvUEwx
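A minimal Python sketch of this pattern at work; the sample text and the name SENTENCE_START are my own.

```python
import re

# start of the text, or sentence-ending punctuation plus one whitespace
# character, followed by the word captured in group 1
SENTENCE_START = re.compile(r"(?:^|(?:[.!?]\s))(\w+)")

text = "This is a sentence. First, I would like to say hi! Then stop."
first_words = SENTENCE_START.findall(text)
print(first_words)
```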
This is an old thread but people might need this like I did.
None of the above works if your sentence starts with one or more spaces.
I did this to get the first (non empty) word in the sentence :
(?<=^[\s"']*)(\w+)
Explanation:
(?<=^[\s"']*) positive lookbehind in order to look for the start of the string, followed by zero or more spaces or punctuation characters (you can add more between the brackets), but do not include it in the match.
(\w+) the actual match of the word, which will be returned
The following words in the sentence are not matched as they do not satisfy the lookbehind.
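Variable-length lookbehind like the one above is not available in every engine; Python's built-in re, for example, requires fixed-width lookbehind. A sketch of the same idea with a plain capturing group, assuming Python and a hypothetical function name first_word:

```python
import re

def first_word(sentence):
    """Return the first word, skipping leading whitespace and quotes."""
    # re.match anchors at the start, so no explicit ^ is needed
    match = re.match(r"""[\s"']*(\w+)""", sentence)
    return match.group(1) if match else None

print(first_word('   "Hello!" she said'))
print(first_word("First, I would like to say hi"))
```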
You can use this regex: ^[^\s]+ or ^[^ ]+.
You can use this regex: ^\s*([a-zA-Z0-9]+).
The first word can be found at a captured group.
[a-z]+
This should be enough as it will get the first a-z characters (assuming case-insensitive).
In case it doesn't work, you could try [a-z]+\b, or even ^[a-z]+\b, but the last one assumes that the string starts with the word.
This seems like it should be trivial, but I'm not so good with regular expressions, and this doesn't seem to be easy to Google.
I need a regex that starts with the string 'dbo.' and ends with the string '_fn'
So far as I am concerned, I don't care what characters are in between these two strings, so long as the beginning and end are correct.
This is to match functions in a SQL server database.
For example:
dbo.functionName_fn - Match
dbo._fn_functionName - No Match
dbo.functionName_fn_blah - No Match
If you're searching for hits within a larger text, you don't want to use ^ and $ as some other responders have said; those match the beginning and end of the text. Try this instead:
\bdbo\.\w+_fn\b
\b is a word boundary: it matches a position that is either preceded by a word character and not followed by one, or followed by a word character and not preceded by one. This regex will find what you're looking for in any of these strings:
dbo.functionName_fn
foo dbo.functionName_fn bar
(dbo.functionName_fn)
...but not in this one:
foodbo.functionName_fnbar
\w+ matches one or more "word characters" (letters, digits, or _). If you need something more inclusive, you can try \S+ (one or more non-whitespace characters) or .+? (one or more of any characters except linefeeds, non-greedily). The non-greedy +? prevents it from accidentally matching something like dbo.func1_fn dbo.func2_fn as if it were just one hit.
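The cases discussed above can be checked with a short Python sketch; the variable names are hypothetical. It also shows the difference between the greedy and non-greedy forms on a line with two hits.

```python
import re

word_form = re.compile(r"\bdbo\.\w+_fn\b")
samples = ["dbo.functionName_fn", "foo dbo.functionName_fn bar",
           "(dbo.functionName_fn)", "foodbo.functionName_fnbar",
           "dbo._fn_functionName", "dbo.functionName_fn_blah"]
hits = [s for s in samples if word_form.search(s)]
print(hits)

two = "dbo.func1_fn dbo.func2_fn"
greedy = re.findall(r"\bdbo\..+_fn\b", two)   # one long match
lazy = re.findall(r"\bdbo\..+?_fn\b", two)    # two separate matches
print(greedy, lazy)
```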
^dbo\..*_fn$
This should work for you.
Well, the simple regex is this:
/^dbo\..*_fn$/
It would be better, however, to use the string manipulation functionality of whatever programming language you're using to slice off the first four and the last three characters of the string and check whether they're what you want.
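As a sketch of the string-function alternative in Python (the function name is_dbo_fn is hypothetical), startswith and endswith do the same job as slicing off the first four and last three characters:

```python
def is_dbo_fn(name):
    """True if name begins with 'dbo.' and ends with '_fn'."""
    return name.startswith("dbo.") and name.endswith("_fn")

for name in ["dbo.functionName_fn", "dbo._fn_functionName", "dbo.functionName_fn_blah"]:
    print(name, "->", is_dbo_fn(name))
```

This only suits the case where the whole string is the function name; for finding hits inside a larger text, a regex is still the easier tool.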
\bdbo\..*fn
I was looking through a ton of Java code for a specific library: car.csclh.server.isr.businesslogic.TypePlatform (although I only knew car and Platform at the time). Unfortunately, none of the other suggestions here worked for me, so I figured I'd post this.
Here's the regex I used to find it:
\bcar\..*Platform
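import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        String part = scanner.nextLine(); // the sequence to look for
        String line = scanner.nextLine(); // the text to search in
        String temp = "\\b" + part + "|" + part + "\\b"; // word starts with part, or ends with it
        Pattern pattern = Pattern.compile(temp.toLowerCase()); // lower-case both sides to ignore case
        Matcher matcher = pattern.matcher(line.toLowerCase());
        System.out.println(matcher.find() ? "YES" : "NO");
    }
}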
If you need to determine whether any word of the text starts or ends with the sequence, you can use this regex: \bsubstring|substring\b. It matches the first two strings below but not the third:
anythingsubstring
substringanything
anythingsubstringanything
The simplest thing that you can do is:
dbo.*_fn$
It searches for dbo, followed by any characters, and then requires _fn at the end.
If you know which character follows the final n, for example a space, you can replace $ with that character.