Regex mirc situation - regex

Hello, I have a script, but I can't capture the information I need. This is the part of the HTML code that I would like to get:
<div class="pretty-container " >
2100.0
</div>
So in my socket event I try this to get the 2100.0 (the number varies):
on *:sockread:foo: {
var %read
sockread %read
if ($regex(%read,<div class="pretty-container " >(.*)<\/div>)) {
echo -s price founded: $regml(1)
; - here i try to catch the 2100.0 number
sockclose $sockname
}
}
But this doesn't work. I believe it is because the number and the tags are on different lines. Can someone help me solve this, please?
Thanks in advance,
Carlos

The sockread command you use only reads one line at a time. Seeing as the data you're looking for is spread over multiple lines, your regular expression will never find a match.
A solution to this would be simply checking whether the current line contains <div class="pretty-container " >, and if so, store the data of the next line in another variable:
var %read, %number
sockread %read
if (<div class="pretty-container " > isin %read) {
sockread %number
echo -s Number: %number
sockclose $sockname
}
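The same "find the marker line, then take the next line" idea can be sketched outside mIRC. The following Python is only an illustration of the logic, not mIRC code; the function name value_after_marker and the sample list are assumptions made up for this example.

```python
from typing import Iterable, Optional

MARKER = '<div class="pretty-container " >'

def value_after_marker(lines: Iterable[str], marker: str = MARKER) -> Optional[str]:
    """Return the stripped line that follows the first line containing marker."""
    it = iter(lines)
    for line in it:
        if marker in line:
            # The value sits on the line right after the opening tag,
            # so read one more line from the same iterator.
            following = next(it, None)
            return following.strip() if following is not None else None
    return None

# Hypothetical input, one list element per line received from the socket.
html = ['<div class="pretty-container " >', '2100.0', '</div>']
print(value_after_marker(html))
```

Reading the next line from the same iterator mirrors calling sockread a second time inside the if block.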

awk, skip current rule upon sanity check

How to skip current awk rule when its sanity check failed?
{
if (not_applicable) skip;
if (not_sanity_check2) skip;
if (not_sanity_check3) skip;
# the rest of the actions
}
IMHO, it's much cleaner to write code this way than,
{
if (!not_applicable) {
if (!not_sanity_check2) {
if (!not_sanity_check3) {
# the rest of the actions
}
}
}
}
1;
I need to skip the current rule because I have a catch all rule at the end.
UPDATE, the case I'm trying to solve.
There are multiple match points in a file that I want to match and alter; however, there is no other obvious marker that I can use to match what I want.
Hmm, let me simplify it this way: I want to match and alter the first match, skip the rest of the matches, and print them as-is.
As far as I understand your requirement, you are looking for if / else if here. You could also use switch / case, which is available in newer versions of gawk.
Let's take an example Input_file:
cat Input_file
9
29
Following is the awk code here:
awk -v var="10" '{if($0<var){print "Line " FNR " is less than var"} else if($0>var){print "Line " FNR " is greater than var"}}' Input_file
This will print as follows:
Line 1 is less than var
Line 2 is greater than var
Looking at the code carefully, it checks:
First condition: if the current line is less than var, the if block is executed.
Second condition: in the else if block, if the current line is greater than var, the message is printed there.
I'm really not sure what you're trying to do, but if I focus on just the last sentence in your question, "I want to match & alter the first match and skip the rest of the matches and print them as-is", is this what you're trying to do?
{ s=1 }
s && /abc/ { $0="uvw"; s=0 }
s && /def/ { $0="xyz"; s=0 }
{ print }
e.g. to borrow @Ravinder's example:
$ cat Input_file
9
29
$ awk -v var='10' '
{ s=1 }
s && ($0<var) { $0="Line " FNR " is less than var"; s=0 }
s && ($0>var) { $0="Line " FNR " is greater than var"; s=0 }
{ print }
' Input_file
Line 1 is less than var
Line 2 is greater than var
I used the boolean flag variable name s for "sane", as you also mentioned in your question that the conditions being tested are sanity checks, so each condition can be read as "is the input sane so far, and is this next condition true?".
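The flag technique above is not specific to awk. Here is a minimal Python sketch of the same control flow, assuming a hypothetical transform function and a made-up list of (pattern, replacement) rules; it is an illustration, not a translation of any particular awk script.

```python
import re
from typing import List, Tuple

def transform(lines: List[str], rules: List[Tuple[str, str]]) -> List[str]:
    """Apply at most one rule per line, then fall through to a catch-all."""
    out = []
    for line in lines:
        sane = True  # plays the role of the awk flag `s`, reset for every line
        for pattern, replacement in rules:
            # A rule only fires while the flag is still set.
            if sane and re.search(pattern, line):
                line = replacement
                sane = False  # later rules are skipped for this line
        out.append(line)  # the catch-all "print" always runs
    return out

# Hypothetical rules mirroring /abc/ and /def/ above.
rules = [(r"abc", "uvw"), (r"def", "xyz")]
print(transform(["abc def", "def", "ghi"], rules))
```

The first line contains both abc and def, but only the first rule fires, because the flag is cleared as soon as one rule has matched.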

How to assert that a text ends with digits in protractor

I would like to assert in Protractor that a link text has the following form: text-1 (where text is a variable, and the number can consist of any digits).
I tried the following:
browser.wait(
ExpectedConditions.visibilityOf(
element(by.xpath(`//a[@class = 'collapsed' and starts-with(text(), '${text}') and ends-with(text(), '-(/d+)')]`))),
5000)
and
browser.wait(
ExpectedConditions.visibilityOf(
element(by.xpath(`//a[@class = 'collapsed' and starts-with(text(), '${text}') and ends-with(text(), '/^-(/d+)$/')]`))),
5000)
Unfortunately, neither of the above XPaths worked.
How can I fix this?
If you change the way you declare the variable and your second predicate, you can go with:
//a[@class='collapsed'][starts-with(text(),'" + text_variable + "')][number(replace(.,'^.*-(\d+)$','$1'))*0=0]
where [number(replace(.,'^.*-(\d+)$','$1'))*0=0] tests for the presence of a number at the end of the string. Note that replace() is an XPath 2.0 function, so this needs an engine that supports XPath 2.0.
Example. If you have :
<p>
<a class="collapsed">foofighters-200</a>
<a class="collapsed">foofighters</a>
<a class="collapsed">boofighters-200</a>
<a class="collapsed">boofighters-200abc</a>
</p>
The following XPath :
//a[@class='collapsed'][starts-with(text(),'foo')][number(replace(.,'^.*-(\d+)$','$1'))*0=0]
will output 1 element :
<a class="collapsed">foofighters-200</a>
So in Protractor you could have :
var text = "foo";
browser.wait(ExpectedConditions.visibilityOf(element(by.xpath("//a[@class='collapsed'][starts-with(text(),'" + text + "')][number(replace(.,'^.*-(\\d+)$','$1'))*0=0]"))), 5000);
...
You can use regexp for this:
await browser.wait(async () => {
return /^.*-\d+$/.test(await $('a.collapsed').getText());
}, 20000, 'Expected link text to contain number at the end');
Tune this regex here if needed:
https://regex101.com/r/9d9yaJ/1
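To make the intended check concrete, here is a small sketch of "starts with a given text and ends with a hyphen followed by digits", written in Python purely as an illustration. The helper name has_numbered_suffix is an assumption for this example; the sample strings are the ones from the answer above.

```python
import re

def has_numbered_suffix(link_text: str, prefix: str) -> bool:
    """True if link_text starts with prefix and ends with '-<digits>'."""
    # re.escape keeps characters in the variable prefix from being read as regex syntax.
    pattern = re.escape(prefix) + r".*-\d+"
    # fullmatch anchors the pattern at both ends of the string.
    return re.fullmatch(pattern, link_text) is not None

for candidate in ["foofighters-200", "foofighters", "boofighters-200", "boofighters-200abc"]:
    print(candidate, has_numbered_suffix(candidate, "foo"))
```

Anchoring at the end matters: without it, a text such as foofighters-200abc would also pass.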

Regex match scss function / mixin

I am trying to match a function or mixin used in an SCSS string so I may remove it but I am having a bit of trouble.
For those unfamiliar with SCSS this is an example of the things I am trying to match (from bootstrap 4).
@mixin _assert-ascending($map, $map-name) {
$prev-key: null;
$prev-num: null;
@each $key, $num in $map {
@if $prev-num == null {
// Do nothing
} @else if not comparable($prev-num, $num) {
@warn "Potentially invalid value for #{$map-name}: This map must be in ascending order, but key '#{$key}' has value #{$num} whose unit makes it incomparable to #{$prev-num}, the value of the previous key '#{$prev-key}' !";
} @else if $prev-num >= $num {
@warn "Invalid value for #{$map-name}: This map must be in ascending order, but key '#{$key}' has value #{$num} which isn't greater than #{$prev-num}, the value of the previous key '#{$prev-key}' !";
}
$prev-key: $key;
$prev-num: $num;
}
}
And a small function:
@function str-replace($string, $search, $replace: "") {
$index: str-index($string, $search);
@if $index {
@return str-slice($string, 1, $index - 1) + $replace + str-replace(str-slice($string, $index + str-length($search)), $search, $replace);
}
@return $string;
}
So far I have the following regex:
@(function|mixin)\s?[[:print:]]+\n?([^\}]+)
However, it only matches up to the first } that it finds, which makes it fail; it needs to find the last occurrence of the closing curly brace, the one that ends the block.
My thought is that a regex capable of matching a function definition could be adapted, but I can't find a good one with my Google-fu!
Thanks in advance!
I would not recommend using a regex for that, since a regex cannot handle recursion, which is what you need in this case.
For instance:
@mixin test {
body {
}
}
This includes two »levels« of scope ({ { } }), so your regex would need to count braces as they open and close in order to find the end of the mixin or function. But that is not possible with a regex.
This regex
/@mixin(.|\s)*\}/gm
will match the whole mixin, but if the input looks like this:
@mixin foo { … }
body { … }
It will match everything up to the last }, which includes the style definition for the body. That is because the regex cannot know which } closes the mixin.
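The over-matching is easy to reproduce. The following Python sketch applies the same greedy pattern to a made-up two-rule stylesheet (the source string is an assumption for this example) and shows that the match does not stop at the end of the mixin.

```python
import re

# Same shape as the pattern above: everything from "@mixin" to the LAST "}".
GREEDY_MIXIN = re.compile(r"@mixin(.|\s)*\}")

# Hypothetical input: a mixin followed by an unrelated rule.
source = "@mixin foo { color: red; }\nbody { margin: 0; }"

matched = GREEDY_MIXIN.search(source).group(0)

# The greedy quantifier consumes as much as it can, so the match
# also swallows the body rule that follows the mixin.
print("body" in matched)
print(matched == source)
```

Making the quantifier lazy would only shift the problem: the match would then stop at the first }, which is wrong as soon as the mixin contains a nested block.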
Have a look at this answer, it explains more or less the same thing but based on matching html elements.
Instead, you should use a parser to parse the whole stylesheet into a syntax tree, then remove the unneeded functions, and then write it back to a string.
In fact, as @philipp said, regex can't replace the syntax analysis that compilers do.
But here is a sed command which is a little ugly but could do the trick:
sed -r -e ':a' -e 'N' -e '$!ba' -e 's/\n//g' -e 's/}\s*@(function|mixin)/}\n@\1/g' -e 's/^@(function|mixin)\s*str-replace(\s|\()+.*}$//gm' <your file>
-e ':a' -e 'N' -e '$!ba' -e 's/\n//g' : Read the whole file in a loop and remove the newlines (see https://stackoverflow.com/a/1252191/7990687 for more information)
-e 's/}\s*@(function|mixin)/}\n@\1/g' : Make each @mixin or @function statement the start of a new line, and the preceding } the last character of the previous line
-e 's/^@(function|mixin)\s*str-replace(\s|\()+.*}$//gm' : Remove the line corresponding to the @function str-replace or @mixin str-replace declaration
But the output will lose its indentation, so you will have to reindent it afterwards.
I tried it on a file into which I copy/pasted the sample code you provided multiple times, so you will have to try it on your own file, because there could be cases where the regex matches more than wanted. If that is the case, provide us with a test file so we can try to resolve these issues.
After much headache here is the answer to my question!
The source needs to be split and read line by line, maintaining a count of the open / closed braces to determine when the count returns to 0.
$pattern = '/(?<remove>@(function|mixin)\s?[\w-]+[$,:"\'()\s\w\d]+)/';
$subject = file_get_contents('vendor/twbs/bootstrap/scss/_variables.scss'); // just a regular SCSS file containing what I've already posted.
$lines = explode("\n",$subject);
$total_lines = count($lines);
foreach($lines as $line_no=>$line) {
if(preg_match($pattern,$line,$matches)) {
$match = $matches['remove'];
$counter = 0;
$open_braces = $closed_braces = 0;
for($i=$line_no;$i<$total_lines;$i++) {
$current = $lines[$i];
$open_braces = substr_count($current,"{");
$closed_braces = substr_count($current,"}");
$counter += ($open_braces - $closed_braces);
if($counter==0) {
$start = $line_no;
$end = $i;
foreach(range($start,$end) as $a) {
unset($lines[$a]);
} // end foreach(range)
break; // leave the for loop: this block has been removed
} // end if($counter==0)
} // end for loop
} // end if(preg_match)
} // end foreach($lines)
And we have a $lines array without any functions or mixins.
There is probably a more elegant way to do this, but I don't have the time or the will to write an AST parser for SCSS.
This can quite easily be adapted into a hacked one, however!
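For comparison, the brace-counting idea can be sketched compactly in Python. This is a rough illustration under the same assumptions as the PHP above: the opening { is on the same line as the @function or @mixin keyword, and braces inside strings or comments are balanced on their own line. The name strip_definitions and the sample input are made up for this example.

```python
import re

# A line that starts a definition we want to drop.
DEFINITION = re.compile(r"^\s*@(function|mixin)\b")

def strip_definitions(source: str) -> str:
    """Remove every @function/@mixin block by tracking brace depth line by line."""
    kept = []
    depth = 0
    skipping = False
    for line in source.split("\n"):
        if not skipping and DEFINITION.match(line):
            skipping = True
            depth = 0
        if skipping:
            # Net change in nesting contributed by this line.
            depth += line.count("{") - line.count("}")
            if depth == 0:
                skipping = False  # the block's closing brace was on this line
            continue  # drop every line that belongs to the block
        kept.append(line)
    return "\n".join(kept)

sample = "$a: 1;\n@mixin test {\n  body {\n  }\n}\nbody { color: red; }"
print(strip_definitions(sample))
```

Interpolations such as #{$name} open and close on the same line, so they leave the depth unchanged.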

Error in writing output file through AWK scripting

I have an AWK script that writes values matching specific patterns to a .csv file.
The code is as follows:
BEGIN{print "Query Start,Query End, Target Start, Target End,Score, E,P,GC"}
/^\>g/ { Query=$0 }
/Query =/{
split($0,a," ")
query_start=a[3]
query_end=a[5]
query_end=gsub(/,/,"",query_end)
target_start=a[8]
target_end=a[10]
}
/Score =/{
split($0,a," ")
score=a[3]
score=gsub(/,/,"",score)
e=a[6]
e=gsub(/,/,"",e)
p=a[9]
p=gsub(/,/,"",p)
gc=a[12]
printf("%s,%s,%s,%s,%s,%s,%s,%s\n",query_start, query_end,target_start,target_end,score,e,p,gc)
}
The input file is as follows:
>gi|ABCDEF|
Plus strand results:
Query = 100 - 231, Target = 100 - 172
Score = 20.92, E = 0.01984, P = 4.309e-08, GC = 51
But I received the output in a .csv file as provided below:
100 0 100 172 0 0 0 51
The program failed to copy the values of:
Query end
Score
E
P
(Note: all the failed values are immediately followed by a comma (,))
Any help to obtain the right output will be great.
Best regards,
Amit
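As @Jidder mentioned, you don't need to call split(), and as @jaypal mentioned, you're using gsub() incorrectly (it modifies its target in place and returns the number of substitutions made, so assigning its result overwrites your value with a count); but you also don't need to call gsub() at all if you just include , in your FS.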
Try this:
BEGIN {
FS = "[[:space:],]+"
OFS = ","
print "Query Start","Query End","Target Start","Target End","Score","E","P","GC"
}
/^\>g/ { Query=$0 }
/Query =/ {
query_start=$4
query_end=$6
target_start=$9
target_end=$11
}
/Score =/ {
score=$4
e=$7
p=$10
gc=$13
print query_start,query_end,target_start,target_end,score,e,p,gc
}
Does that work? Note that the field numbers are bumped up by 1, because when you don't use the default FS, awk no longer skips leading white space, so there's an empty field before the leading white space in your input.
Obviously, you are not using your Query variable so the line that populates it is redundant.
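The field shift described above can be reproduced with any regex-based split. The Python sketch below assumes the input lines start with white space (the sample as shown in the question has none visible) and uses a hypothetical helper, split_fields, to mimic a regex field separator.

```python
import re

def split_fields(line: str) -> list:
    """Split on runs of white space and commas, like FS = "[[:space:],]+"."""
    return re.split(r"[\s,]+", line)

# Assumed input line with leading white space.
fields = split_fields("  Query = 100 - 231, Target = 100 - 172")

# Python lists are 0-based while awk fields are 1-based,
# so awk's $4 corresponds to fields[3], $6 to fields[5], and so on.
print(fields[0] == "")  # the empty field caused by the leading white space
print(fields[3], fields[5], fields[8], fields[10])
```

The comma after 231 is consumed by the separator, which is why no gsub() call is needed to clean the value.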

Qt - Regex to filter rich-text string and replace substrings [duplicate]

This question already has answers here:
RegEx match open tags except XHTML self-contained tags
(35 answers)
Closed 9 years ago.
I have a QString of rich text more or less in this format:
<span background-color="red"><a name='item1'></a> property1 </span> + <span background-color="blue"><a name='item2'></a> property2 </span>
It can have more tags, but all will have the same structure. Also, between each tag, operators will show up - this is a string that is supposed to represent a calculation.
I need a regex to traverse the string and extract both the item1, item2, ...; but also the property1, property2,... substrings so I can then retrieve a value which I have stored somewhere else.
Then, after retrieving these values, and if, for example, property1=value1 and property2=value2 , I need to create another string like:
value1 + value2
This string will be evaluated to compute the calculation.
What would be the regex to read the string?
What would be the regex to replace in a copied string?
NOTE I do not intend to parse HTML with these regexps. The string of rich-text I need to filter has at most the tags and structure represented above. It will not have other types of tags, nor will it have other attributes besides the ones in the example string above. It can only have more examples of that same tag structure: a span, containing an anchor tag with a name attribute and some text to display.
NOTE2 @Passerby posted in the comments of this question a link to a very close solution. I forgot one (hopefully small) detail about my objective: I also need to capture whatever is between the span tags as a string, instead of simply checking for a char as @Passerby (very well) suggested. Any ideas?
NOTE3 I actually still argue that this is not the same question as the one it was marked a duplicate of. While the strings I am filtering look like HTML, they are actually rich text. They will always have this rigid structure/format, so regex is perfectly viable for what I need to do. After some great comments I got from a few users, namely @Passerby, I decided to go for it, and this works perfectly for what I need:
Sample string:
<span background-color="red"><a name='item1'></a> property1 </span> + 300 * <span background-color="blue"><a name='item2'></a> property2 </span> + Math.sqrt(<span background-color="green"><a name='item3'></a> property3 </span>)
Regex:
/<span.*?><a name='(.*?)'><\/a>\s*(.*?)\s*<\/span>(((.*?)?)(?=<)|)/g
Outputs:
MATCH 1
1. [38-43] `item1`
2. [50-59] `property1`
3. [67-76] ` + 300 * `
4. [67-76] ` + 300 * `
5. [67-76] ` + 300 * `
MATCH 2
1. [115-120] `item2`
2. [127-136] `property2`
3. [144-157] ` + Math.sqrt(`
4. [144-157] ` + Math.sqrt(`
5. [144-157] ` + Math.sqrt(`
MATCH 3
1. [197-202] `item3`
2. [209-218] `property3`
3. [226-226] (null, matches any position)
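As a rough sketch of both steps (reading the items and properties, then building the expression string), here is the same idea in Python rather than Qt. The lookup table VALUES and the function names are assumptions for illustration, and the pattern is a simplified form of the regex above without the trailing operator group.

```python
import re

# Simplified form of the regex above: group 1 is the item, group 2 the property.
SPAN = re.compile(r"<span.*?><a name='(.*?)'></a>\s*(.*?)\s*</span>")

# Hypothetical lookup table standing in for the values "stored somewhere else".
VALUES = {"property1": 10, "property2": 20, "property3": 30}

def extract_pairs(rich_text: str) -> list:
    """Return (item, property) tuples in order of appearance."""
    return SPAN.findall(rich_text)

def to_expression(rich_text: str) -> str:
    """Replace each span with its stored value, leaving the operators untouched."""
    return SPAN.sub(lambda m: str(VALUES[m.group(2)]), rich_text)

sample = (
    "<span background-color=\"red\"><a name='item1'></a> property1 </span> + 300 * "
    "<span background-color=\"blue\"><a name='item2'></a> property2 </span> + Math.sqrt("
    "<span background-color=\"green\"><a name='item3'></a> property3 </span>)"
)
print(extract_pairs(sample))
print(to_expression(sample))
```

Substituting in place means the text between the spans never has to be captured at all; it simply stays where it is.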
This would probably be something like:
QRegExp rx("^(?:<span background-color=\"red\"><a name=')(\\w+)(?:'></a>)\\s*(\\d+)\\s*(?:</span>)\\s*(\\+)\\s*(?:<span background-color=\"blue\"><a name=')(\\w+)(?:'></a>)\\s*(\\d+)\\s*(?:</span>)$");
rx.indexIn(myText);
qDebug() << rx.cap(1) << rx.cap(2) << rx.cap(3) << rx.cap(4) << rx.cap(5);
// will print: item1 prop1 + item2 prop2
assuming each item is a single word and each property is a number. I did something very similar in a calculator for our software.
The trick is to start with small bits:
QRegExp rx("<a name='(\\w+)'>");
which captures just the item. Then add the next bit, and keep going until the pattern matches the whole line the way you want.
Regular expressions can be very powerful but also very frustrating.
Good luck
Edit: Every capturing bracket () can be accessed via \1, \2, ... in the replace function. (?:) brackets are not captured! So:
QString text = "My Text";
text.replace(QRegExp("^My( Text)$"), "His\\1");
// text is now: His Text
I don't understand regexps either. With this kind of parsing problem I would use a quick and (maybe) dirty solution like this:
QString str = "<span background-color='red'><a name='item1'></a> property1 </span> + <span background-color='blue'><a name='item2'></a> property2 </span>";
QStringList slist = str.split("<");
qDebug() << slist;
foreach (QString s, slist)
{
if (s.startsWith("/a"))
{
qDebug() << "property:" << s.split(" ")[1];
}
else if (s.startsWith("a name"))
{
qDebug() << "item:" << s.split("'")[1];
}
else if (s.startsWith("/span>"))
{
QString op = s.mid(6).trimmed();
if (op != "")
qDebug() << "operator:" << op;
}
}
And output is:
item: "item1"
property: "property1"
operator: "+"
item: "item2"
property: "property2"
Of course, this will break if the format changes. But so will the regexp.
If the format were any more complicated, I would try to change it to valid XML and then use Qt's XML classes to parse the data.
If you end up using this kind of solution, I really recommend adding some additional validity checks.