Incremental number in RegEx - regex

I'm using Bulk Image Downloader to download all the images in a forum thread,
but I need a regular expression that identifies the page number increments.
The URL string of the page is this:
/topic/2244447/+(page number goes here)
Here's the situation: the page numbers are incremented by 20. So the second page URL is /topic/2244447/+20, the third page is /topic/2244447/+40, and so on.
How can I write the regex for this?

\/topic\/2244447\/\+([0-9]*[02468])?0$
Just being careful:
I took a look at the documentation, page 28 of the Bulk Image Downloader user's guide.
I wonder whether your page numbers end with 0 or 1.
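A quick way to sanity-check the pattern above is to run it against a few offsets. This is only a sketch in Python (Bulk Image Downloader's own regex engine may behave slightly differently); the helper name is_page_url is made up for illustration, and the slashes are left unescaped because Python does not need them escaped.

```python
import re

# Pattern from the answer above: "+" followed by a multiple of 20.
# ([0-9]*[02468])? is an optional prefix ending in an even digit, then a final 0.
PAGE = re.compile(r"/topic/2244447/\+([0-9]*[02468])?0$")

def is_page_url(url):
    """Hypothetical helper: True if the URL ends in +<multiple of 20>."""
    return PAGE.search(url) is not None

for offset in (0, 20, 40, 100, 30, 25):
    url = f"/topic/2244447/+{offset}"
    print(url, is_page_url(url))
```

Offsets such as 30 or 25 are rejected because the digit before the final 0 has to be even.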

If the + does not appear anywhere else in the URL, then this should work, although it will match a number of any length after the +.
\+\d+
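Because \+\d+ accepts any number, one option is to capture the digits and check the step outside the regex. The sketch below assumes a capture group added to the pattern and two hypothetical helper names; it is an illustration in Python, not something Bulk Image Downloader does itself.

```python
import re

def page_offset(url):
    """Hypothetical helper: return the number after '+', or None if absent."""
    m = re.search(r"\+(\d+)", url)  # same pattern as above, plus a capture group
    return int(m.group(1)) if m else None

def is_valid_offset(url):
    """True only when the offset is a multiple of 20."""
    offset = page_offset(url)
    return offset is not None and offset % 20 == 0

print(page_offset("/topic/2244447/+40"))      # the captured offset
print(is_valid_offset("/topic/2244447/+30"))  # 30 is matched by \+\d+ but is not a page step
```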


Are word timestamps always immediately consecutive and always start from 0?

In google cloud speech to text, I'm getting the timestamps of the words as documented here using PHP.
Two issues:
The first word always starts at 0s, even if the audio file doesn't have any sound until after.
Each word timestamp is immediately followed by another, even when the speaker pauses between words.
Is it possible to get a more precise word timestamp with PHP?
Based on the documentation, it seems there isn’t an option to modify any parameter in order to get a more precise word timestamp.
However, you can report this issue by providing all the information requested within the form.

Using RegEx to Find a Block of Text

I'm attempting to block a long string of unnecessary text that's on every page of a document.
Ex: "36075 This is another page and this is the date March 4 2013"
I know this must be very simple, but I'm hoping there is a way to block text verbatim. Is the only way to block this text to use a lot of \d, \s, \w+ and so on, or is there a way to say, "match 36075 This is another page and this is the date March 4 2013"?
This would be SO HELPFUL to know. Thank you for helping!
From what you wrote, I assume you need to get the leading number from the string. To do that, you just need the pattern ^\d+ which, from this input:
36075 This is another page and this is the date March 4 2013
will return this:
36075
In future, when asking this kind of question, please provide an example string and the expected output, as well as what you have tried.
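Both readings of the question can be sketched in Python: ^\d+ pulls out the leading number, and re.escape builds a pattern that matches the whole line verbatim, which is closer to what the question literally asks for. The surrounding page text here is invented for the example.

```python
import re

line = "36075 This is another page and this is the date March 4 2013"

# Leading digits only, as in the answer above.
leading = re.match(r"^\d+", line).group()

# Verbatim match: re.escape neutralises any regex metacharacters in the text.
verbatim = re.compile(re.escape(line))

# Made-up page content containing the unwanted line.
page = "Some text. " + line + " More text."
cleaned = verbatim.sub("", page)

print(leading)
print(cleaned)
```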
I realized the issue I was having. I didn't need to use RegEx. The program I was using has the functionality to match specific words or groups of words and pronounce them differently. What I discovered is that it will not match the words unless the word groups are input exactly the way the program typically reads them.
Ergo --> The channel saw
the end of the British hold over
Would have to be listed as one group for, "The channel saw" and a second group for "the end of the British hold over"
In addition, there were some numbers --> 11960_30_o_ho_
and if the program naturally read 119 and then 60_3 and then _o_ho_ then three strings would need to be input for each section.
A few frustrating hours later, problem solved :) Thank you for your assistance.

Regex Check Facebook Video URL

I'm trying to validate Facebook video URLs using regex.
This is an example of a valid FB video URL:
https://www.facebook.com/video.php?v=100000000000000 (VALID)
This is an example of a valid FB video URL with a username:
https://www.facebook.com/{username}/videos/100000000000000
Note: {username} can be any string.
Examples:
https://www.facebook.com/username1/videos/100000000000000 (VALID)
https://www.facebook.com/username2/videos/100000000000000 (VALID)
But my regex still fails when I check an FB video URL with a username.
This is my regex:
^http(s)?://(www\.)?facebook.([a-z]+)/(?!(?:video\.php\?v=\d+|usernameFB/videos/\d+)).*$
You can run it here:
https://regex101.com/r/dF5iP1/6
This will work for you:
^(https?://www\.facebook\.com/(?:video\.php\?v=\d+|.*?/videos/\d+))$
Demo
https://regex101.com/r/sC6oR2/3
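As a rough check, the pattern above can be run in Python against the URLs from the question. The helper name is_fb_video and the two rejected URLs are made up for illustration.

```python
import re

# Pattern from the answer above, unchanged.
FB_VIDEO = re.compile(
    r"^(https?://www\.facebook\.com/(?:video\.php\?v=\d+|.*?/videos/\d+))$"
)

def is_fb_video(url):
    """Hypothetical helper: True if the whole URL matches the pattern."""
    return FB_VIDEO.match(url) is not None

for url in (
    "https://www.facebook.com/video.php?v=100000000000000",
    "https://www.facebook.com/username1/videos/100000000000000",
    "https://www.facebook.com/username1/photos/100000000000000",
    "https://www.facebook.com/username1/videos/100000000000000/",
):
    print(url, is_fb_video(url))
```

Note that the final $ means a trailing slash or a query string after the video ID is rejected, so this pattern is strict about the URL ending in digits.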
UPDATED October 2018
Neither of the two existing REGEX proposals worked for me, and there are more visible cases than the ones considered.
Here's my REGEX Proposal:
^(?:(?:https?:)?\/\/)?(?:www\.)?facebook\.com\/[a-z\.]+\/videos\/(?:[a-z0-9\.]+\/)?([0-9]+)\/?(?:\?.*)?$
^(?:(?:https?:)?\/\/)?(?:www\.)?facebook\.com\/[a-zA-Z0-9\.]+\/videos\/(?:[a-zA-Z0-9\.]+\/)?([0-9]+)
I ignored video.php, I think it's old enough to safely ignore it.
Matches:
https://www.facebook.com/aguardos.nocturnos/videos/vb.1614866072064590/1828228624061666/?type=2&theater
https://www.facebook.com/aguardos.nocturnos/videos/vb.1614866072064590/1828228624061666?type=2&theater
https://www.facebook.com/aguardos.nocturnos/videos/1828228624061666/
https://www.facebook.com/latavernadelssomnis/videos/1609038972452561/?hc_ref=NEWSFEED
//www.facebook.com/aguardos.nocturnos/videos/1828228624061666/
https://facebook.com/aguardos.nocturnos/videos/1828228624061666/
http://www.facebook.com/aguardos.nocturnos/videos/1828228624061666/
www.facebook.com/aguardos.nocturnos/videos/18282286240612666/
facebook.com/aguardos.nocturnos/videos/18282286240612666/
https://www.facebook.com/aguardos.nocturnos/videos/1828228624061666
https://www.facebook.com/WEAU13News/videos/588612391555522/UzpfSTEzMzAzMDk4NjM6MTAyMTMxMjMzNDE3ODE0MTI/
I do not own, nor have I watched, any of the videos. I just picked random ones that were on my Facebook feed.
Groups
Video ID.
Gotchas
One of the most common Facebook video formats is more complex than I'd like it to be and matching every case perfectly with REGEX would probably lead to a very messy query.
https://www.facebook.com/RolandGarros/videos/10155404760334920/FOO (valid)
https://www.facebook.com/RolandGarros/videos/FOO/10155404760334920 (valid)
https://www.facebook.com/RolandGarros/videos/10155404760334920/FOO/FOO (invalid)
The way this one seems to work is by retrieving the numeric value in the first or second part after videos/.
https://www.facebook.com/RolandGarros/videos/10155361533554920/1015536153355492134
What about this one where two valid numeric values are involved? It seems like the second one is the one that will prevail.
For this reason the REGEX solution above was softened[1] to match only the beginning of the Facebook URL, up to the video group that we're looking for. Considering that your goal is probably to extract the video ID rather than to verify the URL, I think that's a valid trade-off. At the end of the day, you'll be checking the video either way (through the API or by scraping) to extract the video information, since an ID doesn't mean that the video exists or that it's public.
[1] Not just softened, but also improved to match the test case format.
Test
You can easily test it yourself at Regex101.
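For a quick local check, the softened pattern (the second of the two above) can be tried in Python on a few of the listed URLs. This is a sketch: the helper name video_id is hypothetical, and the slashes are written unescaped because Python does not require escaping them.

```python
import re

# Softened pattern from above; group 1 is the video ID.
VIDEO_ID = re.compile(
    r"^(?:(?:https?:)?//)?(?:www\.)?facebook\.com/"
    r"[a-zA-Z0-9.]+/videos/(?:[a-zA-Z0-9.]+/)?([0-9]+)"
)

def video_id(url):
    """Hypothetical helper: return the captured video ID, or None."""
    m = VIDEO_ID.match(url)
    return m.group(1) if m else None

for url in (
    "https://www.facebook.com/aguardos.nocturnos/videos/vb.1614866072064590/1828228624061666/?type=2&theater",
    "facebook.com/aguardos.nocturnos/videos/18282286240612666/",
    "https://www.facebook.com/RolandGarros/videos/10155404760334920/FOO",
    "https://www.facebook.com/RolandGarros/videos/10155361533554920/1015536153355492134",
):
    print(video_id(url))
```

When two numeric segments follow videos/, the optional middle group consumes the first one, so the second number is the one captured, in line with the behaviour described in the gotchas.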
This is a little different than Pedro's, but it works well.
^http(?:s)?://(?:www\.)?facebook.(?:[a-z]+)/((?:video\.php\?v=\d+|username\d/videos/\d+)).*$
https://regex101.com/r/nV4rI3/1
Latest:
/(?:https?:\/\/)?(?:www.|web.|m.)?(facebook|fb).(com|watch)\/(?:video.php\?v=\d+|(\S+)|photo.php\?v=\d+|\?v=\d+)|\S+\/videos\/((\S+)\/(\d+)|(\d+))\/?/
That will help you
regexr.com/4tdur
you can use like this
const myURL = "https://www.facebook.com/video.php?v=100000000000000";
const res = /^https?:\/\/www\.facebook\.com.*\/(video(s)?|watch|story)(\.php?|\/).+$/gm.test(myURL);
console.log(res);
Facebook video URLs nowadays come in the following formats:
https://www.facebook.com/NowThisPolitics/videos/968643940204333/
https://www.facebook.com/chandni.nathani2/videos/10158204539960536/UzpfSTEwMDAwMTc3MzU1MjI2NzoyNzMxNDUyMTYzNTkwNTQy/
Also, since facebook can be replaced by fb in the domain, I created this regex:
/(?:https?:\/{2})?(?:w{3}\.)?(facebook|fb).com\/.*\/videos\/.*/
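One thing worth checking with this pattern is the unescaped dot before com, which matches any character. The Python sketch below compares the pattern as written with a variant that escapes the dot; the names LOOSE and STRICT and the bogus test URL are made up for the comparison.

```python
import re

# As written in the answer (the JavaScript /.../ delimiters are dropped).
LOOSE = re.compile(r"(?:https?:/{2})?(?:w{3}\.)?(facebook|fb).com/.*/videos/.*")

# Assumed variant: identical, except the dot before "com" is escaped.
STRICT = re.compile(r"(?:https?:/{2})?(?:w{3}\.)?(facebook|fb)\.com/.*/videos/.*")

good = "https://www.facebook.com/NowThisPolitics/videos/968643940204333/"
bogus = "https://www.facebookXcom/a/videos/1"  # invented, not a real domain

print(bool(LOOSE.search(good)), bool(STRICT.search(good)))
print(bool(LOOSE.search(bogus)), bool(STRICT.search(bogus)))
```

Both versions accept the real URL, but only the escaped variant rejects the bogus one.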

Regex - Extract number from a link

I have this link www.xxx.yy/yyy/zzzzzz/xyz-z-yzy-/93797038 and I want to take the number 93797038 in order to pass it into another link.
For example: I want afterwards something like www.m.xxx.yy/93797038 which is the same page as before but in its mobile version.
In general, I know that I have to type www.xxx.yy/(.*) to extract anything following the main URL, and then reference the captured group as www.m.xxx.yy/%1, which redirects to the same page in its mobile version.
Any ideas how to do it?
EDIT: The link www.xxx.yy/yyy/zzzzzz/xyz-z-yzy-/93797038 is generated automatically. The only part that stays the same each time is www.xxx.yy. Every time the system runs, it produces different URLs. Each time, I want to take the number from those URLs, e.g. 93797038 in my case.
\/(\d+?)$ will get the trailing digits after the final /.
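As an illustration, here is how that pattern could be used in Python to build the mobile link. The helper name to_mobile_url and the mobile_base parameter are assumptions for the sketch; the question does not say which tool or language performs the rewrite.

```python
import re

def to_mobile_url(url, mobile_base="www.m.xxx.yy"):
    """Hypothetical helper: move the trailing number onto the mobile host."""
    m = re.search(r"/(\d+?)$", url)  # digits after the final "/", up to end of string
    if m is None:
        return None  # no trailing number, nothing to rewrite
    return f"{mobile_base}/{m.group(1)}"

print(to_mobile_url("www.xxx.yy/yyy/zzzzzz/xyz-z-yzy-/93797038"))
```

The lazy quantifier makes no difference here: because of the $ anchor, the group still has to expand to cover all the trailing digits.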
Why do you want regex? You can use
// Last() requires: using System.Linq;
string str = @"www.xxx.yy/yyy/zzzzzz/xyz-z-yzy-/93797038";
string digit = str.Split('/').Last();
instead.

Regex to match time

I want my users to be able to enter a time in a form.
In case more info is needed: users use this to express how much time is needed to complete a task, and it will be saved in a database if filled in.
here is what I have:
/^$|^([0-1]?[0-9]|2[0-4]):([0-5][0-9])(:[0-5][0-9])?$/
It matches an empty form or 01:30 and 01:30:00 formatted times. I really won't need the seconds, as every task takes a minute at least, but when I tried removing that part it just crashed my code and removed support for the empty string. I really don't understand regex at all.
What I'd like is for it to also match simple minutes and simple hours, for instance 3:30, 3:00, 5. Is this possible? It would greatly improve the user experience and cut down on wasted typing. But I'd like to keep the leading zero optional in case some users find it natural to type it.
I think the following pattern does what you want:
p="((([01]?\d)|(2[0-4])):)?([0-5]\d)?(:[0-5]\d)?"
The first part:
((([01]?\d)|(2[0-4])):)?
is an optional group which deals with hours in format 00-24.
The second part:
([0-5]\d)?
is an optional group which deals with minutes if hours or seconds are present in your expression. The group also deals with expressions containing only minutes or only hours.
The third part:
(:[0-5]\d)?
is an optional group dealing with seconds.
The following samples show the pattern at work:
In [180]: re.match(p,'14:25:30').group()
Out[180]: '14:25:30'
In [182]: re.match(p,'2:34:05').group()
Out[182]: '2:34:05'
In [184]: re.match(p,'02:34').group()
Out[184]: '02:34'
In [186]: re.match(p,'59:59').group()
Out[186]: '59:59'
In [188]: re.match(p,'59').group()
Out[188]: '59'
In [189]: re.match(p,'').group()
Out[189]: ''
As every group is optional, the pattern also matches the empty string. I've tested it with Python, but I think it will work in other languages too with minimal changes.
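One caveat: re.match only anchors at the start, so with an all-optional pattern it always succeeds, sometimes on an empty prefix. The sketch below uses re.fullmatch to show that a bare hour such as 5 is not fully matched by the pattern above. It also tries a hypothetical alternative (alt) that makes the minutes optional and drops the seconds, on the assumption that seconds are not needed, as the question states. The helper names are made up.

```python
import re

# Pattern from the answer above.
p = r"((([01]?\d)|(2[0-4])):)?([0-5]\d)?(:[0-5]\d)?"

def prefix_match(text):
    """What re.match actually consumed (may be an empty string)."""
    return re.match(p, text).group()

def is_full_match(text):
    """True only if the whole input matches the pattern."""
    return re.fullmatch(p, text) is not None

# Hypothetical alternative: optional hours, then optional ":MM"; no seconds.
alt = r"(?:([01]?\d|2[0-4])(?::([0-5]\d))?)?"

def accepts(text):
    return re.fullmatch(alt, text) is not None

print(repr(prefix_match("5")), is_full_match("5"))
for sample in ("", "5", "3:00", "3:30", "01:30", "3:60", "25"):
    print(repr(sample), accepts(sample))
```

Like the original pattern, the alternative still allows 24 as an hour value; tightening that to 2[0-3] is a separate design choice.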