Regular expression: match youtube links, but not youtube embed code

Could you help me, please? I need a regular expression that matches a string like this:
http://www.youtube.com/watch?v=eE4qPqMYsp8
but not this:
<object width="500" height="700"><param name="movie" value="http://www.youtube.com/v/eE4qPqMYsp8&hl=ru&fs=1&rel=0" /><param name="wmode" value="transparent" /><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><embed src="http://www.youtube.com/v/eE4qPqMYsp8&hl=ru&fs=1&rel=0" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" wmode="transparent" width="500" height="700">
I have this code:
%(?:(http://){0,1}(www.){0,1}youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|(http://){0,1}(www.){0,1}youtu\.be/)([^"&?/ ]{11})%
I don't know how to exclude some parameters.

How about an expression like this:
(?:https?://)?(?:www\.)?youtube\.com/watch.+?\bv=[a-zA-Z0-9]+
You can certainly add in more options (e.g. (?:-nocookie)), but it might be specific enough like this already.
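To try a pattern like this against both inputs from the question, a minimal Python sketch might look like the following. The names WATCH_LINK and find_watch_ids are illustrative only, and the ID character class is widened to include "-" and "_" (my assumption; the expression above uses [a-zA-Z0-9]+).

```python
import re

# The pattern suggested above, with two assumed tweaks: the ID is captured
# in a group, and the ID class also allows "-" and "_".
WATCH_LINK = re.compile(
    r"(?:https?://)?(?:www\.)?youtube\.com/watch.+?\bv=([A-Za-z0-9_-]+)"
)


def find_watch_ids(text):
    """Return the video IDs of all watch-style links found in text."""
    return WATCH_LINK.findall(text)


link = "http://www.youtube.com/watch?v=eE4qPqMYsp8"
embed = (
    '<embed src="http://www.youtube.com/v/eE4qPqMYsp8&hl=ru&fs=1&rel=0" '
    'type="application/x-shockwave-flash">'
)

print(find_watch_ids(link))   # the watch link is matched
print(find_watch_ids(embed))  # the embed code contains no /watch URL
```

The embed code is skipped simply because the pattern requires youtube.com/watch, while the embed URL uses youtube.com/v/.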

Related

Ant task replace escaped URI param

I'm trying to use Ant to remove a url parameter in an xml file.
The line in the xml is similar to below.
<from uri="http://www.google.com?q=test&somethingElse=something" />
I want to remove the "&somethingElse=something". "something" could be different values so it must be generic.
I've tried
<replaceregexp file="somefile.xml" match="&somethingElse(.*)" replace='" />' flags="gs" byline="true" />
<replaceregexp file="somefile.xml" match="\&somethingElse(.*)" replace='" />' byline="true" />
<replaceregexp file="somefile.xml" match="(&)somethingElse(.*)" replace='" />' flags="gs" byline="true" />
but those don't seem to work.
$(ant.regexp.regexpimpl) is not set so the default engine is being used.

In order to get & you need to write &amp; in the Ant build file because it is XML. To match &somethingElse in the input of the replaceregexp Ant task you might therefore need to specify &amp;somethingElse in the Ant build file.
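The escaping step can be illustrated outside of Ant. The sketch below is Python, not an Ant build: it parses a hypothetical build-file fragment with an XML parser to show what the regex engine actually receives, then applies that pattern to the line from the question. The match pattern (which stops at the closing quote) and the empty replacement are my own simplification, not the asker's original attempt.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical build-file fragment: "&" must be written as "&amp;" because
# the build file is XML.
BUILD_SNIPPET = """<replaceregexp match='&amp;somethingElse=[^"]*' replace='' />"""

task = ET.fromstring(BUILD_SNIPPET)
pattern = task.get("match")  # the XML parser has already decoded &amp; to &
print(pattern)

# The line from the question, treated as plain text.
line = '<from uri="http://www.google.com?q=test&somethingElse=something" />'
print(re.sub(pattern, task.get("replace"), line))
```

The point is only that the regex engine sees the decoded attribute value, so a bare & in the build file never reaches it.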

Regular expression in java to extract URl from HTML

I am new to regexes. I need help.
My HTML source is
<img src ="planets.gif" width="145" height="126" alt="Planets" usemap ="#planetmap">
<map name="planetmap">
<area shape="rect" coords="0,0,82,126" href="http://www.sun.htm" alt="Sun">
<area shape="circle" coords="90,58,3" href="http://www.mercur.htm" alt="Mercury">
<area shape="circle" coords="124,58,8" href="http://www.www.venus.htm" alt="Venus">
</map>
I'm trying to extract all the href links, such as http://www.google.com.
Kindly help.
My regex is
"href=[\\\"\\'](http:\\/\\/|\\.\\/|\\/)?\\w+(\\.\\w+)*(\\/\\w+(\\.\\w+)?)*(\\/|\\?\\w*=\\w*(&\\w*=\\w*)*)?[\\\"\\']"
It will extract text like href="http://www.google.com",
but I need only the link http://www.google.com, without the href= part.

Please use an XML parser for this kind of thing.
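As a rough illustration of the parser approach, here is a sketch in Python rather than Java. It uses the standard library's HTML parser instead of a strict XML parser, since the markup in the question has unclosed tags. HrefCollector and extract_hrefs are names made up for this example.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the value of every href attribute it encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        for name, value in attrs:
            if name == "href" and value is not None:
                self.links.append(value)


def extract_hrefs(html):
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    return collector.links


SAMPLE = """<map name="planetmap">
<area shape="rect" coords="0,0,82,126" href="http://www.sun.htm" alt="Sun">
<area shape="circle" coords="90,58,3" href="http://www.mercur.htm" alt="Mercury">
<area shape="circle" coords="124,58,8" href="http://www.www.venus.htm" alt="Venus">
</map>"""

print(extract_hrefs(SAMPLE))
```

Because the parser hands over attribute values directly, there is no href= prefix or surrounding quote to strip afterwards.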

regex find word in string, replace word in new string (using Notepad++)

I posted a simplified version of this question before, but I think I might have simplified it too much, so here is the actual problem.
I want to use regex (in Notepad++ or similar) to find "a_dog" in the following (sorry about the wall):
<object classid="clsid:D27CDB6E-AE6D-11cf" id="FlashID">
<param name="movie" value="../flash/words/a_dog.swf">
<param name="quality" value="high">
<param name="wmode" value="opaque">
<param name="swfversion" value="6.0.65.0">
<!--[if !IE]>-->
<object data="../flash/words/a_dog.swf" type="application/x-shockwave-flash">
<!--<![endif]-->
<param name="quality" value="high">
<param name="wmode" value="opaque">
<param name="swfversion" value="6.0.65.0">
<!--[if !IE]>-->
</object>
<!--<![endif]-->
</object>
Then I want to use a back-reference to replace all instances of øø with a_dog in the following:
<input type="button" class="ButtonNormal" onClick="audio_func_øø()">
<script>
function audio_func_øø() {
var playAudio = document.getElementById("element_øø");
playAudio.play();
}
</script>
<audio id="element_øø">
<source src="../audio/words/øø.mp3" type='audio/mpeg'>
<source src="../audio/words/øø.wav" type='audio/wav'>
</audio>
So that only the second code is left (with a_dog instead of øø), and no trace of the first code remains.

I don't know how to do this in Notepad++, but you can do this in SublimeText using regex, snippets, and multiple selection:
First make a new snippet (guide) with the following in it:
<snippet>
<content><![CDATA[
<input type="button" class="ButtonNormal" onClick="audio_func_$1()">
<script>
function audio_func_$2() {
var playAudio = document.getElementById("element_$3");
playAudio.play();
}
</script>
<audio id="element_$4">
<source src="../audio/words/$5.mp3" type='audio/mpeg'>
<source src="../audio/words/$6.wav" type='audio/wav'>
</audio>
]]></content>
<!-- Optional: Set a tabTrigger to define how to trigger the snippet -->
<tabTrigger>audioSnippet</tabTrigger>
</snippet>
Save it as whatever you like in your User package. Follow the linked article if you have any questions on how/where to save it to get it working. I will discuss how this works later on.
Next use the following regex in Sublime Text by searching (with regex enabled) using the following pattern:
(?<=value="../flash/words/).+(?=\.swf)
And hit "Find All" - this will select all the names (e.g. 'a_dog', 'a_cat', 'a_plane') using multiple selection.
Copy the selected words (Ctrl+C or equivalent on your system)
In the menu, Selection->Expand to Paragraph (This will select where the <object> begins, to where </object> ends)
Hit Delete/Backspace to remove the <object>'s
Type in your snippet shortcut (above I've defined it to be "audioSnippet") and hit Tab
Paste in your copied text (Ctrl+V or equivalent on your system)
You will notice that you have only replaced the text in the snippet where $1 appears. You will need to hit Tab to jump to $2, paste the text again (Ctrl+V), and repeat until you get to tab stop $6.
I've made a screen capture that you can look at here: http://youtu.be/oo2MQV3X244 (unlisted video on YouTube)
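If a small script is an option instead of an editor, the same find-and-fill idea can be sketched in Python. This is only an illustration under assumptions: NAME reuses the lookaround pattern from the steps above (with the dots in "../" escaped and a lazy quantifier, both my changes), and convert is a hypothetical helper that returns the second block with øø replaced.

```python
import re

# Lookbehind/lookahead as in the editor search, so only the bare name matches.
NAME = re.compile(r'(?<=value="\.\./flash/words/).+?(?=\.swf)')

# Target markup from the question; "øø" marks where the name goes.
TEMPLATE = """<input type="button" class="ButtonNormal" onClick="audio_func_øø()">
<script>
function audio_func_øø() {
var playAudio = document.getElementById("element_øø");
playAudio.play();
}
</script>
<audio id="element_øø">
<source src="../audio/words/øø.mp3" type='audio/mpeg'>
<source src="../audio/words/øø.wav" type='audio/wav'>
</audio>"""


def convert(object_block):
    """Build the audio markup for the name found in one <object> block."""
    match = NAME.search(object_block)
    if match is None:
        raise ValueError("no ../flash/words/<name>.swf value found")
    return TEMPLATE.replace("øø", match.group(0))


old = (
    '<object classid="clsid:D27CDB6E-AE6D-11cf" id="FlashID">\n'
    '<param name="movie" value="../flash/words/a_dog.swf">\n'
    "</object>"
)
print(convert(old))
```

A plain str.replace is used rather than str.format because the template contains JavaScript braces.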

Set Ant property based on a regular expression in a file

I have the following in a file
version: [0,1,0]
and I would like to set an Ant property to the string value 0.1.0.
The regular expression is
version:[[:space:]]\[([[:digit:]]),([[:digit:]]),([[:digit:]])\]
and I need to then set the property to
\1.\2.\3
to get
0.1.0
I can't work out how to use the Ant tasks together to do this.
I have Ant-contrib so can use those tasks.

Based on matt's second solution, this worked for me for any (text) file, one line or not. It has no ant-contrib dependencies.
<loadfile property="version" srcfile="version.txt">
<filterchain>
<linecontainsregexp>
<regexp pattern="version:[ \t]\[([0-9]),([0-9]),([0-9])\]"/>
</linecontainsregexp>
<replaceregex pattern="version:[ \t]\[([0-9]),([0-9]),([0-9])\]" replace="\1.\2.\3" />
</filterchain>
</loadfile>
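To sanity-check the pattern and the \1.\2.\3 replacement outside of Ant, a minimal Python sketch could be used. It is not part of the build; extract_version is a hypothetical helper, and the pattern is copied from the regexp element above.

```python
import re

# Same pattern as in the filterchain above: one space or tab after the colon,
# then three single digits in square brackets.
VERSION_LINE = re.compile(r"version:[ \t]\[([0-9]),([0-9]),([0-9])\]")


def extract_version(text):
    """Return the dotted version string, or None if no version line is found."""
    match = VERSION_LINE.search(text)
    return match.expand(r"\1.\2.\3") if match else None


print(extract_version("name: demo\nversion: [0,1,0]\n"))
```

Note that each group accepts a single digit only, so a version such as [0,10,0] would not match without changing [0-9] to [0-9]+.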

Solved it with this:
<loadfile property="burning-boots-js-lib-build.lib-version" srcfile="burning-boots.js"/>
<propertyregex property="burning-boots-js-lib-build.lib-version"
override="true"
input="${burning-boots-js-lib-build.lib-version}"
regexp="version:[ \t]\[([0-9]),([0-9]),([0-9])\]"
select="\1.\2.\3" />
But it seems a little wasteful - it loads the whole file into a property!
If anyone has any better suggestions please post :)

Here's a way that doesn't use ant-contrib, using loadproperties and a filterchain (note that replaceregex is a "string filter" - see the tokenfilter docs - and not the replaceregexp task):
<loadproperties srcFile="version.txt">
<filterchain>
<replaceregex pattern="\[([0-9]),([0-9]),([0-9])\]" replace="\1.\2.\3" />
</filterchain>
</loadproperties>
Note that the regex is a bit different, because we're treating the file as a properties file.
Alternatively you could use loadfile with a filterchain, for instance if the file you wanted to load from wasn't in properties format.
For example, if the file contents were just [0,1,0] and you wanted to set the version property to 0.1.0, you could do something like:
<loadfile srcFile="version.txt" property="version">
<filterchain>
<replaceregex pattern="\s*\[([0-9]),([0-9]),([0-9])\]" replace="\1.\2.\3" />
</filterchain>
</loadfile>

Get a block of text in a list of blocks using Regular Expressions

Edit2: Only regex match solutions, please. Thank you!
Edit: I'm looking for a regex solution, if one exists. I have other blocks with the same data that are not XML, and I can't use Perl; I added the Perl tag because I'm more familiar with regexes in Perl. Thanks in advance!
I have a list like this:
<Param name="Application #" value="1">
<Param name="app_id" value="32767" />
<Param name="app_name" value="App01" />
<Param name="app_version" value="1.0.0" />
<Param name="app_priority" value="1" />
</Param>
<Param name="Application #" value="2">
<Param name="app_id" value="3221" />
<Param name="app_name" value="App02" />
<Param name="app_version" value="1.0.0" />
<Param name="app_priority" value="5" />
</Param>
<Param name="Application #" value="3">
<Param name="app_id" value="32" />
<Param name="app_name" value="App03" />
<Param name="app_version" value="1.0.0" />
<Param name="app_priority" value="2" />
</Param>
How can I get the block for one app if I only know, say, the value of app_name? For example, for App02 I want to get
<Param name="Application #" value="2">
<Param name="app_id" value="3221" />
<Param name="app_name" value="App02" />
<Param name="app_version" value="1.0.0" />
<Param name="app_priority" value="5" />
</Param>
Is it possible to get it, if other "name=" lines are not known (but there's always name="app_name" and Param name="Application #")?
Can it be done in a single regex match? (doesn't have to be, but feels like there's probably a way).

Since your content seems to be XML, why not use a real parser to do the task?
use XML::XPath;
use XML::XPath::XMLParser;
my $xp = XML::XPath->new(filename => 'test.xhtml');
my $nodeset = $xp->find('/Param[@name=\'Application #\']'); # find all applications
foreach my $node ($nodeset->get_nodelist) {
print "FOUND\n\n",
XML::XPath::XMLParser::as_string($node),
"\n\n";
}
You can read a bit more about XPath here and find the full reference at the W3C.
I advise you not to use regular expressions for this task, because the result will be complicated and hard to maintain.
Note: it is also possible to use the DOM API; it just depends on which one you prefer.
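For comparison, the same parser-based lookup can be sketched in Python with the standard library's ElementTree. This is an illustration, not a translation of the Perl above: find_app_blocks is a hypothetical helper, the fragment is wrapped in a synthetic root element because the list has several top-level elements, and app_name is assumed not to contain quote characters.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Param name="Application #" value="1">
<Param name="app_id" value="32767" />
<Param name="app_name" value="App01" />
</Param>
<Param name="Application #" value="2">
<Param name="app_id" value="3221" />
<Param name="app_name" value="App02" />
</Param>"""


def find_app_blocks(fragment, app_name):
    """Return the top-level Param elements whose app_name child matches."""
    # A well-formed XML document needs a single root, so add one.
    root = ET.fromstring("<root>" + fragment + "</root>")
    query = "Param[@name='app_name'][@value='" + app_name + "']"
    return [app for app in root.findall("Param") if app.find(query) is not None]


for block in find_app_blocks(SAMPLE, "App02"):
    print(ET.tostring(block, encoding="unicode"))
```

Unlike a regex, this does not depend on the order or number of the other child elements inside each block.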

This seems to be a sad case of bogus XML. A misguided attempt to create enterprisey software at best. The developers could have used a sane configuration file format such as:
[App03]
app_id = 32767
app_version = 1.0.0
...
but they decided to drive everyone insane with meaningless BSXML.
I would say, if this file is less than 10 MB in size, just go ahead and use XML::Simple. If the file indeed consists of nothing but repeated blocks of exactly what you posted, you can use the following solution:
#!/usr/bin/perl
use strict; use warnings;
my %apps;
{
local $/ = "</Param>\n";
while ( my $block = <DATA> ) {
last unless $block =~ /\S/;
my %appinfo = ($block =~ /name="([^"]+?)"\s+value="([^"]+?)"/g);
$apps{ $appinfo{app_name} } = \%appinfo;
}
}
use Data::Dumper;
print Dumper $apps{App03};
Edit: If you cannot use Perl and you won't tell us what you can use, there is not much I can do but point out that
/name="([^"]+?)"\s+value="([^"]+?)"/g
will give you all name-value pairs.
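The same pair-matching expression works in other regex engines as well. As a hedged sketch in Python (PAIR and info are illustrative names), applying it to one block yields a dictionary of its name-value pairs:

```python
import re

# Same expression as above, without Perl's delimiters and /g flag;
# findall plays the role of the global match.
PAIR = re.compile(r'name="([^"]+?)"\s+value="([^"]+?)"')

block = """<Param name="Application #" value="2">
<Param name="app_id" value="3221" />
<Param name="app_name" value="App02" />
<Param name="app_version" value="1.0.0" />
<Param name="app_priority" value="5" />
</Param>"""

# findall returns (name, value) tuples, which dict() accepts directly.
info = dict(PAIR.findall(block))
print(info)
```

Note that the outer "Application #" attribute pair is captured too, since it has the same name/value shape.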

I would prefer a parser solution, too. If you absolutely have to use a regex and understand all the disadvantages of this approach, then the following regex should work:
<Param name="Application #"[^>]*>\s+<Param[^>]*>\s+<Param name="app_name" value="App02" />\s+(?:<Param[^>]*>\s+){2}</Param>
This relies heavily on the structure present in your example. A re-ordering of tags, introduction of additional tags or (shudder) nesting of tags will break the regex.
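A quick way to try this expression is a small Python sketch like the one below. The helper block_pattern is hypothetical; it only substitutes the (escaped) app name into the regex above, and the sample data is copied from the question.

```python
import re

SAMPLE = """<Param name="Application #" value="1">
<Param name="app_id" value="32767" />
<Param name="app_name" value="App01" />
<Param name="app_version" value="1.0.0" />
<Param name="app_priority" value="1" />
</Param>
<Param name="Application #" value="2">
<Param name="app_id" value="3221" />
<Param name="app_name" value="App02" />
<Param name="app_version" value="1.0.0" />
<Param name="app_priority" value="5" />
</Param>"""


def block_pattern(app_name):
    """Build the regex above for a given app name (assumes the same layout)."""
    return re.compile(
        r'<Param name="Application #"[^>]*>\s+<Param[^>]*>\s+'
        r'<Param name="app_name" value="' + re.escape(app_name) + r'" />\s+'
        r"(?:<Param[^>]*>\s+){2}</Param>"
    )


match = block_pattern("App02").search(SAMPLE)
print(match.group(0) if match else "no match")
```

Because [^>]* cannot cross a tag boundary, the match cannot start at the App01 header and run on into the App02 block.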

It seems like it would be more appropriate to use an XML reader library, but I don't know Perl well enough to suggest one.

Perl's XML DOM Parser may be appropriate here.

I would suggest using one of the XML parsers, but if you cannot do so, then the following quick and dirty code should do:
my ($rez) = $data =~/\<Param\s+name\s*=\s*"Application\s#"\s+value\s*=\s*"2"\>((?:.|\n)*?)^\<\/Param\>/m;
print $rez;
(assuming $data contains your XML as a single string, possibly multiline)
