Regex matching beginning AND end strings

Regex matching beginning AND end strings - regex

This seems like it should be trivial, but I'm not so good with regular expressions, and this doesn't seem to be easy to Google.
I need a regex that starts with the string 'dbo.' and ends with the string '_fn'
So far as I am concerned, I don't care what characters are in between these two strings, so long as the beginning and end are correct.
This is to match functions in a SQL server database.
For example:
dbo.functionName_fn - Match
dbo._fn_functionName - No Match
dbo.functionName_fn_blah - No Match

If you're searching for hits within a larger text, you don't want to use ^ and $ as some other responders have said; those match the beginning and end of the text. Try this instead:
\bdbo\.\w+_fn\b
\b is a word boundary: it matches a position that is either preceded by a word character and not followed by one, or followed by a word character and not preceded by one. This regex will find what you're looking for in any of these strings:
dbo.functionName_fn
foo dbo.functionName_fn bar
(dbo.functionName_fn)
...but not in this one:
foodbo.functionName_fnbar
\w+ matches one or more "word characters" (letters, digits, or _). If you need something more inclusive, you can try \S+ (one or more non-whitespace characters) or .+? (one or more of any characters except linefeeds, non-greedily). The non-greedy +? prevents it from accidentally matching something like dbo.func1_fn dbo.func2_fn as if it were just one hit.

^dbo\..*_fn$
This should work you.

Well, the simple regex is this:
/^dbo\..*_fn$/
It would be better, however, to use the string manipulation functionality of whatever programming language you're using to slice off the first four and the last three characters of the string and check whether they're what you want.

\bdbo\..*fn
I was looking through a ton of java code for a specific library: car.csclh.server.isr.businesslogic.TypePlatform (although I only knew car and Platform at the time). Unfortunately, none of the other suggestions here worked for me, so I figured I'd post this.
Here's the regex I used to find it:
\bcar\..*Platform

Scanner scanner = new Scanner(System.in);
String part = scanner.nextLine();
String line = scanner.nextLine();
String temp = "\\b" + part + "|" + part + "\\b";
Pattern pattern = Pattern.compile(temp.toLowerCase());
Matcher matcher = pattern.matcher(line.toLowerCase());
System.out.println(matcher.find() ? "YES" : "NO");
If you need to determine if any of the words of this text start or end with the sequence, you can use this regex: \bsubstring|substring\b:
anythingsubstring
substringanything
anythingsubstringanything

The simplest thing that you can do is:
dbo.*_fn$
It searches with dbo, followed by any characters, and then ends with _fn.
If you can identify what’s the right next character after n if it’s space, you can replace $ with space .

Related

Regex to match string if it's the only thing on the line (except spaces)

I am new to regex and have done a lot of searching on this but can't fathom it.
I need to match a string (which may contain whitespace), lets say The Dog, but only when it is the only item on the line (followed by the \n character). For example, my input file may contain the following, but I am only interested in the first instance of The Dog (it being the only thing on the line is the only way I can distinguish it from potential other instances):
The Dog
property 1 : 5
property 2 : depends on The Dog
property 3 : more data
Where The Dog appears on the line on its own, it may or may not be preceded by whitespace.
I have found similar questions where the answers suggest things such as (^|[ ]), to check for either the beginning of the string or a preceding space, but all the suggestions I have found so far match both instances in my example.
Can anyone help? Thanks.

it really depends on the regex flavour you are using, but most of them allow for the m flag which ensures that the ^ and $ tags match for the beginning and end (respectively) of every line. With this, combined with \s* to consume any, if at all, whitespace, you can use the following:
^\s*The\s*Dog\s*$
Debuggex Demo

You can use the following:
^\s*The\s*Dog\s*$
See RegEx Demo

In javascript, you could do something like this:
var doesMatch = (" The Dog").match(/^\s*The Dog\s*$/);
if (doesMatch) {
//found a match
}

Thanks a lot, I used
^\sThe\sDog\s*$

Because "The Dog" is only an example according to your original post, I wuld suggest to use ^(?:\s*\w*)+\n.
This matches a line start, then it matches 1 to an unlimited repetion of zero or unlimited number of whitespace (\s*) followed by zero or an unlimited number of word characters (\w*) followed by a line break.
Because of the (?:) the part of the string, that matched won't be captured. This is useful if you only want to test if the string matches. If you want to capture, what part of the string matched, use ^((?:\s*\w*))+\n.
But as "R Nar" pointed out, it depends on the regex flavour. The above regex would work in javascript, but on in python for example. Tell us in which programming language you try to match the subject string and we can give you a working example.
In case you use javascript, here is how you would loop over all matches capturing the part of the string, that matches:
var myregexp = /^((?:\s*\w*)+)\n/g;
var match = myregexp.exec(subject);
while (match != null) {
// matched text: match[0]
// match start: match.index
// capturing group 1: match[1]
match = myregexp.exec(subject);
}

Regex to match words after dot until a whitespace occurs

Given the following string
span.a.b this.is.really.confusing
I need to return the matches a and b. I've been able to get close with the following regex:
(?<=\.)[\w]+
But it's also matching is, really, and confusing. When I include a negative lookahead I get even closer, but I'm still not there.
(?<=\.)[\w]+(?=\s) # matches b, confusing
How can I match words after a dot until a whitespace occurs?

How can I match words after a dot until a whitespace occurs?
NB: this is language agnostic pseudo-code, but should work.
regex = "^[^\s.]+.(\S+).*"
targets = <extracted_group>.split(".")
Regex explanation:
"^": beings with
"[^\s.]+." 1 or more non-whitespace, non-period characters, followed by a period.
"(\S+)": group and capture all of the following non-whitespace characters
".*": matches 0 or more of any non-newline character
If the split function takes a regex instead of a string, you'll need to escape the '.' or use a character class.
NB: You can do it without the split, but I think that the split is more transparent.

I am not sure if this is good enough for all your possible cases, but it should work with the provided example:
\.([\w]+)\.([\w]+)\s
$1 = a, $2 = b

How can I match the last 4 characters of a word beginning &c using PCRE regex?

I'm trying to match the last four characters (alphanumeric) of all words beginning with the sequence &c.
For instance, in the string below, I'd like to match the pieces in bold:
Colour one is &cFF2AC3 and colour two is &c22DE4A.
Can anybody help me with the correct regex expression? I've spent hours on this great resource to no avail.

it looks like hexadecimal numbers, so use this pattern
&c[0-9A-F]{2}\K([0-9A-F]{4})
DEMO

This:
/(?i)\s*&c(?:[a-z0-9]{2})([a-z0-9]{4})\b/
append a g to the end of it if you want it to find all matches in a given text

Try this
/(?:^| )&c\w*(\w{4})\b/
If you want to try it in the regex tester you linked to, make sure to use the g modifier to see all matches.
Explanation: (?:^| ) matches either a space or the start of the string, &c\w* matches the ampersand and the the first however many characters of the word, and then \w{4} captures the last 4 characters. \b on the end asserts a word break (a "non-word" character or the end of the string).

Regex for deleting characters before a certain character?

I'm very new at regex, and to be completely honest it confounds me. I need to grab the string after a certain character is reached in said string. I figured the easiest way to do this would be using regex, however like I said I'm very new to it. Can anyone help me with this or point me in the right direction?
For instance:
I need to check the string "23444:thisstring" and save "thisstring" to a new string.

If this is your string:
I'm very new at regex, and to be completely honest it confounds me
and you want to grab everything after the first "c", then this regular expression will work:
/c(.*)/s
It will return this match in the first matched group:
"ompletely honest it confounds me"
Try it at the regex tester here: regex tester
Explanation:
The c is the character you are looking for
.* (in combination with /s) matches everything left
(.*) captures what .* matched, making it available in $1 and returned in list context.

Regex for deleting characters before a certain character!
You can use lookahead like this
.*(?=x)
where x is a particular character or word or string.{using characters like .,$,^,*,+ have special meaning in regex so don't forget to escape when using it within x}
EDIT
for your sample string it would be
.*(?=thisstring)
.* matches 0 to many characters till thisisstring

Here is a one-line solution for matching everything after "before"
print $1."\n" if "beforeafter" =~ m/before(.*)/;
Edit:
While using lookbehind is possible, it's not required. Grouping provides an easier solution.

To get the string before : in your example, you have to use [^:][^:]*:\(.*\). Notice that you should have at least one [^:] followed by any number of [^:]s followed by an actual :, the character you are searching for.

Regular expression to allow spaces between words

I want a regular expression that prevents symbols and only allows letters and numbers. The regex below works great, but it doesn't allow for spaces between words.
^[a-zA-Z0-9_]*$
For example, when using this regular expression "HelloWorld" is fine, but "Hello World" does not match.
How can I tweak it to allow spaces?

tl;dr
Just add a space in your character class.
^[a-zA-Z0-9_ ]*$
Now, if you want to be strict...
The above isn't exactly correct. Due to the fact that * means zero or more, it would match all of the following cases that one would not usually mean to match:
An empty string, "".
A string comprised entirely of spaces, " ".
A string that leads and / or trails with spaces, " Hello World ".
A string that contains multiple spaces in between words, "Hello World".
Originally I didn't think such details were worth going into, as OP was asking such a basic question that it seemed strictness wasn't a concern. Now that the question's gained some popularity however, I want to say...
...use #stema's answer.
Which, in my flavor (without using \w) translates to:
^[a-zA-Z0-9_]+( [a-zA-Z0-9_]+)*$
(Please upvote #stema regardless.)
Some things to note about this (and #stema's) answer:
If you want to allow multiple spaces between words (say, if you'd like to allow accidental double-spaces, or if you're working with copy-pasted text from a PDF), then add a + after the space:
^\w+( +\w+)*$
If you want to allow tabs and newlines (whitespace characters), then replace the space with a \s+:
^\w+(\s+\w+)*$
Here I suggest the + by default because, for example, Windows linebreaks consist of two whitespace characters in sequence, \r\n, so you'll need the + to catch both.
Still not working?
Check what dialect of regular expressions you're using.* In languages like Java you'll have to escape your backslashes, i.e. \\w and \\s. In older or more basic languages and utilities, like sed, \w and \s aren't defined, so write them out with character classes, e.g. [a-zA-Z0-9_] and [\f\n\p\r\t], respectively.
* I know this question is tagged vb.net, but based on 25,000+ views, I'm guessing it's not only those folks who are coming across this question. Currently it's the first hit on google for the search phrase, regular expression space word.

One possibility would be to just add the space into you character class, like acheong87 suggested, this depends on how strict you are on your pattern, because this would also allow a string starting with 5 spaces, or strings consisting only of spaces.
The other possibility is to define a pattern:
I will use \w this is in most regex flavours the same than [a-zA-Z0-9_] (in some it is Unicode based)
^\w+( \w+)*$
This will allow a series of at least one word and the words are divided by spaces.
^ Match the start of the string
\w+ Match a series of at least one word character
( \w+)* is a group that is repeated 0 or more times. In the group it expects a space followed by a series of at least one word character
$ matches the end of the string

This one worked for me
([\w ]+)

Try with:
^(\w+ ?)*$
Explanation:
\w - alias for [a-zA-Z_0-9]
"whitespace"? - allow whitespace after word, set is as optional

I assume you don't want leading/trailing space. This means you have to split the regex into "first character", "stuff in the middle" and "last character":
^[a-zA-Z0-9_][a-zA-Z0-9_ ]*[a-zA-Z0-9_]$
or if you use a perl-like syntax:
^\w[\w ]*\w$
Also: If you intentionally worded your regex that it also allows empty Strings, you have to make the entire thing optional:
^(\w[\w ]*\w)?$
If you want to only allow single space chars, it looks a bit different:
^((\w+ )*\w+)?$
This matches 0..n words followed by a single space, plus one word without space. And makes the entire thing optional to allow empty strings.

This regular expression
^\w+(\s\w+)*$
will only allow a single space between words and no leading or trailing spaces.
Below is the explanation of the regular expression:
^ Assert position at start of the string
\w+ Match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
1st Capturing group (\s\w+)*
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
\s Match any white space character [\r\n\t\f ]
\w+ Match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
$ Assert position at end of the string

Just add a space to end of your regex pattern as follows:
[a-zA-Z0-9_ ]

This does not allow space in the beginning. But allowes spaces in between words. Also allows for special characters between words. A good regex for FirstName and LastName fields.
\w+.*$

For alphabets only:
^([a-zA-Z])+(\s)+[a-zA-Z]+$
For alphanumeric value and _:
^(\w)+(\s)+\w+$

If you are using JavaScript then you can use this regex:
/^[a-z0-9_.-\s]+$/i
For example:
/^[a-z0-9_.-\s]+$/i.test("") //false
/^[a-z0-9_.-\s]+$/i.test("helloworld") //true
/^[a-z0-9_.-\s]+$/i.test("hello world") //true
/^[a-z0-9_.-\s]+$/i.test("none alpha: ɹqɯ") //false
The only drawback with this regex is a string comprised entirely of spaces. "       " will also show as true.

It was my regex: #"^(?=.{3,15}$)(?:(?:\p{L}|\p{N})[._()\[\]-]?)*$"
I just added ([\w ]+) at the end of my regex before *
#"^(?=.{3,15}$)(?:(?:\p{L}|\p{N})[._()\[\]-]?)([\w ]+)*$"
Now string is allowed to have spaces.

This regex allow only alphabet and spaces:
^[a-zA-Z ]*$

Try with this one:
result = re.search(r"\w+( )\w+", text)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Regex matching beginning AND end strings - regex

^dbo\..*_fn$ This should work you.

Well, the simple regex is this: /^dbo\..*_fn$/ It would be better, however, to use the string manipulation functionality of whatever programming language you're using to slice off the first four and the last three characters of the string and check whether they're what you want.

The simplest thing that you can do is: dbo.*_fn$ It searches with dbo, followed by any characters, and then ends with _fn. If you can identify what’s the right next character after n if it’s space, you can replace $ with space .

Related

Regex to match string if it's the only thing on the line (except spaces)

Regex to match words after dot until a whitespace occurs

How can I match the last 4 characters of a word beginning &c using PCRE regex?

Regex for deleting characters before a certain character?

Regular expression to allow spaces between words

Categories

Resources