Backslashes in regexp pattern for PHP - regex

I'm trying to perform a regex operation in my PHP code (preg_replace). Now I'm working with:
|^http(s)?://[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i
That matches URLs like http://google.com, etc., but now I'm wondering how to also match a URL like this one:
http:\/\/asd.domain.com\/path\/of\/url\/something.else
I've tried with 2x backslashes and 4x backslashes and it doesn't seem to work.
Any advice?
Thanks in advance.

Well, if you simply want to match the string, add some backslashes:
^http(s)?:\\?/\\?/[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(\\?/.*)?$
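I used a ? quantifier so that it still matches the URLs it could match before. Since \ is the escape character, you need two of them: the first escapes the second so that it is matched literally. Keep in mind that a PHP string literal processes backslashes once more before the regex engine sees them, so in source code the safe spelling is four backslashes (\\\\?).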
See demo (I only escaped forward slashes there because of how the regex tester works -- the delimiters are slashes).
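As a rough cross-check outside PHP, the same pattern can be tried in Python, where a raw string removes the extra layer of string-literal escaping. This is only a sketch: the names URL_RE and matches_url are made up for illustration, and the dot between host labels is escaped here, which the pattern above does not do.

```python
import re

# Raw string: each "\\?" reaches the regex engine as-is and means
# "an optional literal backslash".
URL_RE = re.compile(
    r"^http(s)?:\\?/\\?/[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(\\?/.*)?$",
    re.IGNORECASE,
)

def matches_url(text):
    """Return True if text looks like a plain or backslash-escaped URL."""
    return URL_RE.match(text) is not None

print(matches_url("http://google.com"))
print(matches_url(r"http:\/\/asd.domain.com\/path\/of\/url\/something.else"))
```

Both calls should report a match; a string with a different scheme, such as ftp://example.com, should not.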

vim delete regex: pattern not found

This is my first time trying to use a regex for deletion.
The regex:
/net=.+\.net/
as shown here, matches a string that starts with net=, followed by some arbitrary characters, and ends with .net
However, when using it in vim:
:g/net=.+\.net/d
I simply get Pattern not found: net=.+\.net
I am guessing that vim uses a slightly different format, or do I need to escape the characters =, . and + ?
:help pattern is your friend. In your case, you need to escape the + (write \+) or prefix your whole pattern with \v to make it “very magic”.
Do not escape = in the default mode: \= would mean the same thing as {0,1} in some other regexp engines, namely a greedy optional match of the preceding atom.
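For comparison outside Vim, here is a small Python sketch of the same "delete every line that matches" operation. Python's re module uses the syntax the asker started from, where + is a quantifier without a backslash. The names NET_RE and drop_matching_lines are hypothetical.

```python
import re

# Same pattern as in the question; no backslash before "+" is needed here.
NET_RE = re.compile(r"net=.+\.net")

def drop_matching_lines(text):
    """Return text without the lines that contain a match for NET_RE."""
    kept = [line for line in text.splitlines() if not NET_RE.search(line)]
    return "\n".join(kept)

print(drop_matching_lines("keep me\nnet=example.net\nkeep me too"))
```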

Notepad++ regex replace - how to remove this string

I want to remove strings in the form of the following where some-text is a random text string.
$('#some-text').val();
I've tried various things but I think the $ sign is messing things up since it's used in regex.
You need to escape some characters.
Try this -
\$\('#[^']*'\)\.val\(\);
Try this regex by escaping special chars:
\$\(.*\)\.val\(\);
To avoid escaping the special characters, you can use \Q - \E pair to surround the part where you want the regex engine to interpret literally:
\Q$('\E<your-regex>\Q').val();\E
Replace <your-regex> with your regex to match the selector, or whatever it is.
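The same "quote the literal parts, keep the middle as a regex" idea can be sketched in Python. Python's re module has no \Q...\E, so this sketch swaps in re.escape for the literal pieces; the function name strip_val_calls and the selector pattern [^']* are assumptions for illustration.

```python
import re

# re.escape plays the role of \Q...\E for the literal prefix and suffix.
PREFIX = re.escape("$('#")
SUFFIX = re.escape("').val();")
VAL_CALL_RE = re.compile(PREFIX + r"[^']*" + SUFFIX)

def strip_val_calls(text):
    """Remove every $('#...').val(); occurrence from text."""
    return VAL_CALL_RE.sub("", text)

print(strip_val_calls("before $('#some-text').val(); after"))
```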

Regex for getting domain name from Referer

I am using the following regex to capture different parts of a referer URL. I want to capture the protocol and the domain and use them in different scenarios.
Pattern pr=new Pattern("^\w+://|[^\/:]+|[\w\W]*$");
But Eclipse is giving me an error:
Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )..
I am new to regex. Can anyone help me with this?
You're supplying the pattern as a string literal, so you need to escape the backslashes. Also note that Pattern has no public constructor; use Pattern.compile instead.
e.g.:
Pattern pr = Pattern.compile("^\\w+://|[^/:]+|[\\w\\W]*$");
Your regexp is probably not complete - you need to "group" the scheme and domain sections with brackets:
Pattern pr = Pattern.compile("^(\\w+)://([^/:]+)");
I've ignored everything after the next colon or slash - you said you only wanted the scheme and domain.
Regex uses "\" as the starting character of its own syntax (e.g., \w, \W, \d, \D). Java string literals also use "\" as an escape character. To get a literal "\" into a Java string you write "\\", where the first backslash escapes the second.
If your pattern does not behave as you expected, try testing it on "regexpal.com".
Remember, whenever you need a single backslash ("\") in the resulting regex, write a double backslash ("\\") in your code.
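The two layers of escaping can be seen side by side in a short Python sketch. The doubled-backslash literal mirrors what a Java string requires, while the raw string shows what the regex engine actually receives. The names DOUBLED, RAW and scheme_and_domain are hypothetical.

```python
import re

# The same regex written two ways. The doubled form mirrors a Java string
# literal; the raw string needs no doubling.
DOUBLED = "^(\\w+)://([^/:]+)"
RAW = r"^(\w+)://([^/:]+)"

def scheme_and_domain(referer):
    """Return (scheme, domain) from a referer URL, or None if it does not match."""
    m = re.match(RAW, referer)
    return (m.group(1), m.group(2)) if m else None

print(DOUBLED == RAW)
print(scheme_and_domain("http://example.com:8080/a/b"))
```

The URL here is a placeholder; anything after the first colon or slash following the domain is ignored, as in the grouped pattern above.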

Slashes and hashes in Perl and metacharacters

Thanks for the previous assistance, everyone! I have a question about regular expressions in Perl.
I know that when matching you can write m// or // or ## (you must include m or s if you use the latter). What confuses me is a book example on escaping characters. I believe most people escape lots of characters as a sure-fire way of making the program work without missing a metacharacter, e.g. \# when looking to match # in, say, an email address.
Here's my issue (I know what this script does):
$date = "15/12/99";
$date=~ s#(\d+)/(\d+)/(\d+)#$1/$2/$3#; << why are no forward slashes escaped??
print($date);
Yet the later example I have, shows it rewritten, as (which i also understand and they're escaped)
$date =~ s/(\d+)\/(\d+)\/(\d+)/$2\/$1\/$3/; <<<< which is escaping the forward slashes.
I know the slashes or hashes are programmer preference and their use. What I don't understand is why the second example, escapes the slashes, yet the first doesn't - I have tried and they work both ways. No escaping slashes with hashes? What's even MORE confusing is, looking at yet another book example I also have earlier to this one, using hashes again, they too escape the # symbol.
if ($address =~ m#\##) { print("That's an email address"); } or something similar
So what do you have to escape, and what don't you, when using hashes or slashes? I know you have to escape metacharacters to match them, but I'm confused.
When you build a regexp, you define a character as a delimiter for your regexp i.e. doing // or ##.
If you need to use that character inside your regexp, you will need to escape it so that the regexp engine does not see it as the end of the regexp.
If you build your regexp between forward slashes /, you will need to escape the forward slashes contained in your regexp, hence the escaping in your second example.
Of course, the same rule applies to any character you use as a regexp delimiter, not just forward slashes.
The forward slashes are not meta characters in themselves - only the use of them in the second example as expression separators makes them "special".
The format of a substitute expression is:
s<expression separator char><expression to look for><expression separator char><expression to replace with><expression separator char>
In the first example, using a hash as the first character after the =~ s, makes that character the expression separator, so forward slash is not special and does not require any escaping.
In the second example, the expression separator is indeed the forward slash, so it must be escaped within the expressions themselves.
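That the forward slash is not a metacharacter by itself can be seen in a language where the pattern is an ordinary string with no delimiters at all. The Python sketch below swaps day and month the way the second book example does; swap_day_month is a hypothetical name.

```python
import re

def swap_day_month(date):
    """Swap the first two numeric fields of a d/m/y style date string."""
    # No delimiter surrounds the pattern, so "/" needs no escaping.
    return re.sub(r"(\d+)/(\d+)/(\d+)", r"\2/\1/\3", date)

print(swap_day_month("15/12/99"))
```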
The regex match operator lets you choose a custom non-whitespace character as the separator.
In your first example '#' is used as the separator, so in this regex you don't need to escape the '/' because it has no special meaning. In the second regex the separator isn't changed, so the default '/' is used. Now you have to escape every '/' in your pattern; otherwise the parser is confused. :)
If you do not use slashes, the recommended practice is to use curly braces and the /x modifier.
$date=~ s{ (\d+) \/ (\d+) \/ (\d+) }{$1/$2/$3}x;
Escaping non-alphanumeric characters is also standard practice, even when they are not metacharacters. See perldoc -f quotemeta.
There is another depth to this question about escaping forward slashes with the s operator.
With my example the capturing becomes the problem.
$image_name =~ s/((http:\/\/.+\/)\/)/$2/g;
For this to work, the typo (the extra second forward slash) had to be captured as well.
Also, trying to match just the two slashes did not work, because a bare // also matches the // in http://. The first slash has to be preceded by more than one character.
Changing "http://world.com/Photos//space_shots/out_of_this_world.jpg"
To: "http://world.com/Photos/space_shots/out_of_this_world.jpg"
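A different way to get the same result, sketched in Python, is to collapse any run of slashes that does not directly follow a colon. This is not the capture-group approach used above but a lookbehind, and the name collapse_slashes is hypothetical.

```python
import re

def collapse_slashes(url):
    """Collapse repeated slashes in a URL, leaving the '://' after the scheme alone."""
    # (?<!:) refuses to start a match right after a colon, so "http://" survives.
    return re.sub(r"(?<!:)/{2,}", "/", url)

print(collapse_slashes("http://world.com/Photos//space_shots/out_of_this_world.jpg"))
```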

How does one escape backslashes and forward slashes in VIM find/search?

For instance, if I wanted to do a find and replace with strings containing backward or forward slashes, how would this be accomplished in vim?
Examples
Find & Replace is: :%s/foo/bar/g
what if I wanted to find all occurrences of <dog/> and replace it with <cat\>
Same way you escape characters most anywhere else in linuxy programs, with a backslash:
:%s/<dog\/>/<cat\\>
But note that you can select a different delimiter instead:
:%s#<dog/>#<cat\\>#
This saves you typing all those time-consuming, confusing backslashes in patterns with a ton of slashes.
From the documentation:
Instead of the / which surrounds the pattern and replacement string, you
can use any other single-byte character, but not an alphanumeric character,
\, " or |. This is useful if you want to include a / in the search
pattern or replacement string.
%s:<dog/>:<cat>
You can replace the / delimiters if they become annoying for certain patterns.
Quote them with a backslash. Also, it often helps to use another delimiter besides slash.
:%s#<dog/>#<cat\\>#
or if you have to use slash as the substitute command delimiter
:%s/<dog\/>/<cat\\>/
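The need to double the backslash in the replacement is not specific to Vim. As a small Python sketch (dog_to_cat is a hypothetical name), re.sub also treats a backslash in the replacement template as an escape character, while the forward slash needs no special treatment because there is no delimiter.

```python
import re

def dog_to_cat(text):
    """Replace every <dog/> with <cat\\> (a literal backslash before '>')."""
    # In the replacement template, "\\" stands for one literal backslash.
    return re.sub(r"<dog/>", r"<cat\\>", text)

print(dog_to_cat("a <dog/> b"))
```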
I was looking for something similar, to search for register values containing the / character (to record a macro). The solution was to search using the ? token instead of the /.
The syntax is:
:%s/<dog\/>/<cat\\>/g
To search for /* at the / prompt, type backslash slash backslash star:
/(<- the prompt)\/\*
so after you type it, it looks like
/\/\*