I want to parse some logs, and it is proving a bit hard.
The logs look like this:
/ajax/foto.php?whatever-session-info-here
/edit.php?path=blahblah-image-url.jpg
/catalog/whatever-text-here
/item/whatever-text-here
/gallery (without slash at the end)
So
/[a-zA-Z-]{0,}/
works well for text between slashes, and I get
/catalog/
/item/
after the regexp runs.
So the question is how to get output for this example that looks like:
/ajax/foto.php
/edit.php
/catalog/
/item/
/gallery
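One possible way to get that output, sketched in Python. The rule used here (keep a path ending in .php whole, otherwise keep only the first segment) is an assumption inferred from the examples, and the names PHP_SCRIPT, FIRST_SEGMENT, and normalize are made up for illustration.

```python
import re

# Assumption: a script path ends in ".php" and is followed by "?" or end of line.
PHP_SCRIPT = re.compile(r'^(?:/[^/?]+)*/[^/?]+\.php(?=\?|$)')
# The character class from the question, anchored at the start,
# with the trailing slash made optional so "/gallery" also matches.
FIRST_SEGMENT = re.compile(r'^/[a-zA-Z-]*/?')

def normalize(path):
    """Return the script path if there is one, otherwise the first segment."""
    m = PHP_SCRIPT.match(path) or FIRST_SEGMENT.match(path)
    return m.group(0) if m else path

logs = [
    '/ajax/foto.php?whatever-session-info-here',
    '/edit.php?path=blahblah-image-url.jpg',
    '/catalog/whatever-text-here',
    '/item/whatever-text-here',
    '/gallery',
]

for line in logs:
    print(normalize(line))
```

With the sample lines this prints the five wanted values. A deeper path such as /foto/300/b/5/4/19123312.jpg falls through to the second pattern and is reduced to /foto/.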
ADD:
I found this case too; here I need only the text between the first two slashes:
/foto/300/b/5/4/19123312.jpg
to get /foto/
/[a-zA-Z](/?)([a-zA-Z].[a-zA-Z]*)?
Tested on
http://gskinner.com/RegExr/
or
s = '/foto/300/b/5/4/19123312.jpg'
s.split('/')[1]
=> "foto"
Related
I am trying to create a regex in PCRE that will sanitize URLs containing repeated slashes, like the following:
https://www.domin.com/test1/////test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142/////
https://www.domin.com/test1/test2///somemoretests_67142
I want to replace each one with https://\2\4 so that the resulting link looks like this: https://www.domin.com/test1/test2/somemoretests_67142
I have been struggling with it for the past couple of days, so any regex guru help is more than welcome :)
I have tried the following and more:
(http|https):\/\/(.*)(\/\/+)(.*)
(http|https):\/\/(.*)(\/\/){2,}(.*)
(http|https):\/\/(.*)(\/\/{2})(.*)
I am going to use these in Akamai to sanitize our URLs through a Cloudlet.
You can try:
(?<!https:\/)(?<!http:\/)(\/+$|(?<=\/)\/+)
Then replace each match (the first group) with an empty string.
Regex demo.
This will produce this output:
https://www.domin.com/test1/test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142
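To see the substitution at work outside a regex tester, here is a small Python sketch. It uses the same pattern with the slash escapes dropped (Python does not need them). The names EXTRA_SLASHES and collapse_slashes are illustrative, and whether the Akamai Cloudlet accepts lookbehind is not something this sketch verifies.

```python
import re

# Same pattern as above: skip the "//" after the scheme, then match
# either trailing slashes or any slashes that follow another slash.
EXTRA_SLASHES = re.compile(r'(?<!https:/)(?<!http:/)(/+$|(?<=/)/+)')

def collapse_slashes(url):
    """Remove every match, which leaves single slashes and no trailing slash."""
    return EXTRA_SLASHES.sub('', url)

urls = [
    'https://www.domin.com/test1/////test2/somemoretests_67142',
    'https://www.domin.com/test1/test2/somemoretests_67142/////',
    'https://www.domin.com/test1/test2///somemoretests_67142',
]

for url in urls:
    print(collapse_slashes(url))
```

Each URL is processed on its own here, so `$` means the end of that URL. For multi-line input the pattern would need the multiline flag.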
I'm trying to write a regex for these URLs:
https://.*ABCD
https://.*ABCD.view
https://.*ABCD_1
https://.*ABCD_r
Given these 4 URLs, I'd like to match only these 2: https://.*ABCD and https://.*ABCD.view, and not https://.*ABCD_1 or https://.*ABCD_r.
I tried writing it like this:
https://.*ABCD.*
But this one also matches https://.*ABCD_1 and https://.*ABCD_r.
I googled it but had no luck.
How can I fix the regex?
This regex rejects any URL that contains an underscore or whitespace:
^https://[^_\s]+$
Try this tool for testing different regexes:
https://regex101.com/
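A quick way to check this pattern against the four cases is the Python sketch below. The example.com URLs are hypothetical stand-ins for the https://.*ABCD patterns in the question, and NO_UNDERSCORE is just an illustrative name.

```python
import re

# The pattern from the answer: "https://" followed only by characters
# that are neither an underscore nor whitespace.
NO_UNDERSCORE = re.compile(r'^https://[^_\s]+$')

# Hypothetical URLs standing in for the four patterns in the question.
urls = [
    'https://example.com/ABCD',
    'https://example.com/ABCD.view',
    'https://example.com/ABCD_1',
    'https://example.com/ABCD_r',
]

kept = [u for u in urls if NO_UNDERSCORE.match(u)]
print(kept)
```

Only the first two URLs are kept. Note that the pattern rejects an underscore anywhere in the URL, not only after ABCD, so it fits only if the wanted URLs never contain one.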
I need to use a regex to exclude URLs that contain 'folder', such as the following URL:
subdomain.domain.co.uk/section/folder/page
I'm using:
subdomain.domain.co.uk\/.*\/(?!folder\/).*
but it still matches URLs that contain 'folder'. Any ideas?
Try this regex:
^subdomain.domain.co.uk\/((?!folder).)*$
Demo here:
Regex101
First off, you need slashes around "folder", otherwise you'll also exclude "/anotherfolder/" and "/folder.jpg" etc.
Put the negative lookahead before the ".*" and add ".*" before "/folder/" inside it:
subdomain.domain.co.uk\/(?!.*\/folder\/).*
This won't match a URL with "/folder/" anywhere in it.
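The behaviour of that lookahead can be checked with a short Python sketch. The dots in the host name are escaped here so that they match literal dots, which is a small change from the pattern as posted. NO_FOLDER and allowed are illustrative names.

```python
import re

# Negative lookahead right after the host: fail if "/folder/" appears
# anywhere in the rest of the URL.
NO_FOLDER = re.compile(r'subdomain\.domain\.co\.uk/(?!.*/folder/).*')

def allowed(url):
    """True if the whole URL matches, i.e. no "/folder/" was found after the host."""
    return NO_FOLDER.fullmatch(url) is not None

print(allowed('subdomain.domain.co.uk/section/folder/page'))         # False
print(allowed('subdomain.domain.co.uk/section/anotherfolder/page'))  # True
print(allowed('subdomain.domain.co.uk/section/folder.jpg'))          # True
```

One thing to watch: because the slash after the host is consumed before the lookahead runs, a URL such as subdomain.domain.co.uk/folder/page still matches. If "folder" can appear as the first segment, the lookahead would have to be placed before that slash.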
I have a couple of strings that end with a dot (.), which I need to remove in Yahoo Pipes.
Example:
example.com.
companywebsite.co.uk.
anothersite.co.
I've tried the following from a couple of posts here on SO, but none have worked yet:
/\.$/
or
^(.*)\\.(.*)$","$1!$2
Neither of these options has worked.
I have also tried a very simple find of
.com. and replace with .com
and
.co. to replace with .co
But the latter affects .com as well, which is not ideal.
EDIT: Here is a visual of what my pipe looks like.
If you can do something like this: ^(.*)\\.(.*)$","$1!$2, then doing this should work: "^(.+?)\.?$", $1. This should match the first part of the URL and leave out the period at the end, should it exist.
EDIT:
As per your image, you should place this: ^(.+?)\.?$ in your replace field and this: $1 in your with field. I do not know if you need to do any escaping, so you might have to use ^(.+?)\\.?$ instead of ^(.+?)\.?$.
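For reference, the same replacement can be tried outside Yahoo Pipes. This Python sketch uses `\1` where Pipes uses `$1`; the names TRAILING_DOT and strip_trailing_dot are illustrative, and it says nothing about how Pipes itself handles escaping.

```python
import re

# Lazy capture of everything, then an optional literal dot at the very end.
TRAILING_DOT = re.compile(r'^(.+?)\.?$')

def strip_trailing_dot(text):
    """Drop one trailing dot if present; leave other dots alone."""
    return TRAILING_DOT.sub(r'\1', text)

for s in ['example.com.', 'companywebsite.co.uk.', 'anothersite.co.', 'example.com']:
    print(strip_trailing_dot(s))
```

The lazy `.+?` is what lets the optional `\.?` claim the final dot; with a greedy `.+` the dot would end up inside the captured group and nothing would be removed.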
I am trying to exclude Gmail's requests from Live HTTP Headers, but I can't seem to get the exclude regex to work.
My exclude regex is this: .gif$|.jpg$|.ico$|.css$|.js$|.*mail.google.com.*
Any ideas/suggestions?
I had the same problem and the solution was stupidly simple:
Have you enabled the check box ("exclude URL by RegExp", or similar; I only have the German version)?
Hint: you do not need to add the .* at the start and end of your expression, because a request is excluded if it contains the pattern (the pattern does not have to match the complete URL).
I think you should use "\." to match a literal dot. A dot without a backslash matches any character.
Like this:
\.gif$|\.jpg$|\.ico$|\.css$|\.js$|.*mail\.google\.com.*
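A small Python sketch of why the escaping matters. It assumes the extension applies the pattern as a "contains" search, as the hint above suggests, so the surrounding .* is left out. The URLs and the names EXCLUDE, UNESCAPED_JS, and excluded are hypothetical.

```python
import re

# Escaped dots match only a literal ".", so ".js" means a real file extension.
EXCLUDE = re.compile(r'\.gif$|\.jpg$|\.ico$|\.css$|\.js$|mail\.google\.com')
# For comparison: an unescaped dot matches any character before "js".
UNESCAPED_JS = re.compile(r'.js$')

def excluded(url):
    """True if the pattern is found anywhere in the URL (search, not full match)."""
    return EXCLUDE.search(url) is not None

print(excluded('https://mail.google.com/mail/u/0/'))   # True
print(excluded('https://example.com/logo.gif'))        # True
print(excluded('https://example.com/index.html'))      # False
print(excluded('https://example.com/nodejs'))          # False
print(UNESCAPED_JS.search('https://example.com/nodejs') is not None)  # True
```

The last two lines show the difference: the unescaped `.js$` would also exclude a URL that merely ends in "js".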