Remove invalid numbers in iMacros without printing the accompanying phrase - imacros

I have an iMacros script that takes data from a CSV file and posts it to a website.
It runs fine, but there is one problem.
My CSV file has 4 columns:
1 title 2 desc 3 price 4 phone
Column 3 contains digits only, and sometimes it holds a value such as 00, 00000, 000, 0, 1111111 or 222222222, which is not a valid price.
I want to replace such a value with "", but at the same time I don't want to print the "PRICE" phrase.
At the moment the output looks like this:
subject: HELLO WORLD
desc bla bla bla bla contact me 22212123
PRICE 00
Here is my code:
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:vbform ATTR=NAME:subject CONTENT={{!COL1}}
SET !VAR1 EVAL("'{{!COL3}}'.replace(/^0/gi,"")")
SET !VAR2 EVAL("'{{!COL4}}'.replace(/\+95/gi,"")")
TAG POS=1 TYPE=TEXTAREA FORM=NAME:vbform ATTR=ID:vB_Editor_001_textarea CONTENT=<SP><SP>desc<SP>{{!COL2}}<SP>contact<SP>me{{!VAR2}}<BR>PRICE<SP>{{!VAR1}}<BR>[/CENTER]
TAG POS=1 TYPE=INPUT:SUBMIT FORM=NAME:vbform ATTR=ID:vB_Editor_001_save
If the price is one of those invalid numbers, I expect output like this:
subject: HELLO WORLD
desc bla bla bla bla contact me 22212123
If the price is valid, the output should look like this:
subject: HELLO WORLD
desc bla bla bla bla contact me 22212123
PRICE 120
thanks

Play with this code sample:
SET price 00000
' your second line may look like this
SET !VAR1 EVAL("'{{price}}'.replace(/^0+/,"")")
PROMPT {{!VAR1}}
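The sample above only strips leading zeros; it does not by itself drop the "PRICE" label. The decision logic the question asks for can be sketched outside iMacros, here in Python. The rule for an invalid price (all zeros, or one digit repeated three or more times) is an assumption inferred from the examples in the question, and the names is_bogus_price and build_body are hypothetical.

```python
import re

# Assumed rule: a price is invalid if it is all zeros ("0", "00", ...)
# or a single digit repeated three or more times ("1111111").
BOGUS_PRICE = re.compile(r"0+|(\d)\1{2,}")


def is_bogus_price(price: str) -> bool:
    """Return True for empty prices and prices matching the assumed rule."""
    return price == "" or BOGUS_PRICE.fullmatch(price) is not None


def build_body(desc: str, phone: str, price: str) -> str:
    """Build the post body, adding the PRICE line only for a valid price."""
    price = price.strip()
    lines = [f"desc {desc} contact me {phone}"]
    if not is_bogus_price(price):
        lines.append(f"PRICE {price}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_body("bla bla bla bla", "22212123", "00"))
    print(build_body("bla bla bla bla", "22212123", "120"))
```

The same idea would carry over to an EVAL expression that returns either an empty string or the whole "PRICE ..." fragment, so that the label and the value appear or disappear together.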


Regex - delete everything but matching groups (using sublime 3)

How can I keep only matching groups and delete the rest of the text?
Using: Sublime 3 - Regex
My text looks like this:
1.5.1 Bla bla bla
text text text
text text text
1.5.2 Bla bla bla
text text text
I want to keep only this
1.5.1 Bla bla bla
1.5.2 Bla bla bla
I can manage to select only the groups, but not everything except them.
Link: https://regex101.com/r/pV9xU6/2
Thank you
According to the comments, it can be done in several ways:
Find: (?s)^(1\.5\.\d+[^\n]*\n[^\n]*\n)|. /gm
Replace: $1
or
Find (general way): (*SKIP)(*F)|.*\R*
Find: (1[.]5[.]\d+.*\n.*\n)(*SKIP)(*F)|.*\R*
Replace: nothing
or
Find: (^1\.5\.\d+.*\n.*\n)\K(?>.*\R)*?(?=(?1)|.*\z) /gm
Replace: nothing
Thanks for all your help.
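Outside the editor, the same result can be reached by collecting the matching lines rather than deleting everything else. A minimal Python sketch, assuming a heading is any line that starts with a dotted number such as 1.5.1 followed by a space (the name keep_headings is hypothetical):

```python
import re

# Assumed heading shape: dotted number, a space, then the rest of the line.
HEADING = re.compile(r"^\d+(?:\.\d+)+ .*$", re.MULTILINE)


def keep_headings(text: str) -> str:
    """Return only the heading lines of text, joined by newlines."""
    return "\n".join(HEADING.findall(text))


if __name__ == "__main__":
    sample = (
        "1.5.1 Bla bla bla\n"
        "text text text\n"
        "text text text\n"
        "1.5.2 Bla bla bla\n"
        "text text text"
    )
    print(keep_headings(sample))
```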

How to ignore specific lines of a file using regex and the diff utility ("-I regex" option)?

I am writing automation tests to compare HTML files. For the comparison I use the Linux diff utility.
First HTML file, 1.html:
<!-- just example -->
<html>
<div id="userdata_hidden">bla bla bla</div>
<div id="something else" >bla bla bla</div>
<div id="waiver_id" >bla bla bla</div>
</html>
Second HTML file, 2.html:
<!-- just example -->
<html>
<div id="userdata_hidden">bla bla bla DIFFERENCE </div>
<div id="something else" >bla bla bla</div>
<div id="waiver_id" >bla bla bla DIFFERENCE </div>
</html>
Command to compare the files:
diff -biw 1.html 2.html
Result:
3c3
< <div id="userdata_hidden">bla bla bla</div>
---
> <div id="userdata_hidden">bla bla bla DIFFERENCE </div>
5c5
< <div id="waiver_id" >bla bla bla</div>
---
> <div id="waiver_id" >bla bla bla DIFFERENCE </div>
The comparison works fine, but I need to ignore differences in lines that contain special words: waiver_id and userdata_hidden.
The diff command has an -I option for ignoring lines that match a regex:
To ignore insertions and deletions of lines that match a grep-style
regular expression, use the --ignore-matching-lines=regexp (-I regexp)
option. You should escape regular expressions that contain shell
metacharacters to prevent the shell from expanding them. For example,
‘diff -I '^[[:digit:]]'’ ignores all changes to lines beginning with a
digit.
However, -I only ignores the insertion or deletion of lines that
contain the regular expression if every changed line in the hunk—every
insertion and every deletion—matches the regular expression. In other
words, for each nonignorable change, diff prints the complete set of
changes in its vicinity, including the ignorable ones.
You can specify more than one regular expression for lines to ignore
by using more than one -I option. diff tries to match each line
against each regular expression.
So I can use a regex to ignore the comparison of lines that contain waiver_id or userdata_hidden. If the files have no differences, diff prints nothing (an empty string) to the console.
Question:
How do I write a regex that excludes strings containing the words waiver_id or userdata_hidden?
What should the correct diff command look like with the -I option and the regex?
P.S. Unfortunately, this variant does not work:
diff -biw -I '^(?!.*(?:userdata_hidden|waiver_id))' 1.html 2.html
To check that a string contains neither of the words waiver_id and userdata_hidden:
^(?!.*\bwaiver_id\b)(?!.*\buserdata_hidden\b)
Or, as a single alternation, if you don't want either string to be present:
^(?!.*\b(?:userdata_hidden|waiver_id)\b)
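Two caveats seem worth keeping in mind. The manual text quoted in the question describes -I as taking a grep-style regular expression, and that syntax has no lookahead, so the patterns above are unlikely to behave as intended when passed to diff. Also, -I ignores lines that match, so passing the words themselves (one -I per word, as the quoted manual allows) is probably closer to what is wanted than a "does not contain" pattern. As an alternative that avoids diff entirely, here is a Python sketch that drops the marked lines from both files before comparing them with difflib. Unlike diff -biw, it does not ignore case or whitespace, and the function names are hypothetical.

```python
import difflib

# Lines containing any of these substrings are left out of the comparison.
IGNORED_MARKERS = ("waiver_id", "userdata_hidden")


def significant_lines(text, markers=IGNORED_MARKERS):
    """Return the lines of text that contain none of the markers."""
    return [
        line
        for line in text.splitlines()
        if not any(marker in line for marker in markers)
    ]


def filtered_diff(old_text, new_text, markers=IGNORED_MARKERS):
    """Return unified-diff lines for the two texts, ignoring marked lines.

    An empty list means the remaining lines are identical.
    """
    old = significant_lines(old_text, markers)
    new = significant_lines(new_text, markers)
    return list(difflib.unified_diff(old, new, "1.html", "2.html", lineterm=""))


if __name__ == "__main__":
    with open("1.html") as f1, open("2.html") as f2:
        for line in filtered_diff(f1.read(), f2.read()):
            print(line)
```

Filtering first sidesteps the hunk rule quoted in the question, under which an ignorable change is still printed when it shares a hunk with a non-ignorable one.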

Extract string embedded in pattern with a regex

I've been using the bash command line with grep -e and sort -nr to filter and analyze some lines coming from a bunch of "data" files. So far I have come up with an output file like this:
25 The X value is: bla bla bla done
19 The X value is: foo done
19 The X value is: bar done
19 The X value is: bbb done
19 The X value is: xxx yyy zzz done
where you can see the frequency and the "data" part I am interested in.
I am not able to find a regex to be used by grep to "clean those lines". I mean: I can intercept those "data" lines with a regex like is:.*done (I know this pattern is unique in the files I am analyzing), but how can I clean those lines extracting exactly the stuff between "is:" and "done"?
Try sed instead:
$ sed -r 's/^.*: (.*) done$/\1/' outputfile.txt
bla bla bla
foo
bar
bbb
xxx yyy zzz
If you wanted to return:
bla bla bla
foo
bar
bbb
xxx yyy zzz
you can use
(?<=:)(.*)(?=done)
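For comparison, the same extraction can be sketched in Python with a capture group instead of lookarounds. Unlike the sed command above, this version also keeps the leading frequency, which is an addition of this sketch; the name extract is hypothetical, and the line format is assumed to match the sample output in the question.

```python
import re

# Assumed line shape: "<count> ... is: <data> done"
LINE = re.compile(r"^\s*(\d+)\s+.*?is:\s*(.*?)\s*done\s*$")


def extract(line: str):
    """Return (count, data) for a matching line, or None otherwise."""
    match = LINE.match(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    for raw in ("25 The X value is: bla bla bla done",
                "19 The X value is: xxx yyy zzz done"):
        print(extract(raw))
```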

AWK - Split file by value in specific column

I have the following AWK script (provided by Armali on this site) which splits a tab-delimited file by date (month/year) and saves each part as yyyymmm. I now have an additional condition by which the file should be split: by month/year and also by the unique value in column 3, saving each file as yyyymmm_Col3Uniquevalue.
The current script is
awk "NR>1{split($2,date,\"/\");print>date[3]strftime(\"%%b.txt\",(date[2]-1)*31*24*60*60)}" input.txt
Data Format:
Country Date Type
HongKong 31/01/2012 Television
Japan 14/01/2012 Press
Japan 05/01/2012 Television
Japan 16/02/2013 Press
Japan 15/02/2013 Television
Output will be 4 txt files:
2012Jan_Press - Containing record 2
2012Jan_Television - Containing record 1,3
2013Feb_Press - Containing record 4
2013Feb_Television - Containing record 5
Play with this for a bit to make sure you understand it:
$ cat file
Country Date Type
HongKong 31/01/2012 Television
Japan 14/01/2012 Press
Japan 05/01/2012 Television
Japan 16/02/2013 Press
Japan 15/02/2013 Television
$ cat tst.awk
NR>1 {
    split($2,a,"/")
    secs = mktime(a[3]" "a[2]" "a[1]" 0 0 0")
    mth = strftime("%b", secs)
    file = a[3] mth "_" $3
    print file
}
$ awk -f tst.awk file
2012Jan_Television
2012Jan_Press
2012Jan_Television
2013Feb_Press
2013Feb_Television
Look up mktime() and strftime() in the GNU awk manual.
Just change print file to print > file when you're done testing.
With TAB separated fields...:
awk -F\t "NR>1{split($2,date,\"/\");print>date[3]strftime(\"%%b_\"$3\".txt\",(date[2]-1)*31*24*60*60)}" input.txt
$3 had to be excluded from the quoted format string.
If the date field $2 also contains the time after a space, split by space as well as by "/" so that the year still ends up in date[3]:
awk -F\t "NR>1{split($2,date,\"[/ ]\");print>date[3]strftime(\"%%b_\"$3\".txt\",(date[2]-1)*31*24*60*60)}" input.txt
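The grouping logic can also be sketched in Python for readers who want to check the file names before writing anything. This assumes tab-separated fields with a header row and dd/mm/yyyy dates, as in the question; the month abbreviations are hard-coded to avoid locale differences, and the names group_rows and write_groups are hypothetical.

```python
from collections import defaultdict
from pathlib import Path

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def group_rows(lines):
    """Group data rows by '<year><Mon>_<column 3>', skipping the header."""
    groups = defaultdict(list)
    for line in lines[1:]:
        row = line.rstrip("\n")
        if not row:
            continue
        fields = row.split("\t")
        # Drop an optional time part after a space, then split dd/mm/yyyy.
        day, month, year = fields[1].split()[0].split("/")
        key = f"{year}{MONTHS[int(month) - 1]}_{fields[2]}"
        groups[key].append(row)
    return dict(groups)


def write_groups(groups, directory="."):
    """Write each group to '<key>.txt' inside directory."""
    for key, rows in groups.items():
        Path(directory, f"{key}.txt").write_text("\n".join(rows) + "\n")


if __name__ == "__main__":
    with open("input.txt") as handle:
        write_groups(group_rows(handle.readlines()))
```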

Regular expression to find text between strings WITHOUT a predefined string in the middle

I have text written like this:
11 bla gulp bla 22
11 bla bla bla 2211 bla
ble
bli 22
I need a regex to find all the text between each pair of "11" and "22" that DOESN'T contain "gulp".
If I search (?s)11.*?22 using TextCrawler, I find all the three strings:
bla gulp bla
bla bla bla
bla ble bli
Wrong! I'd like to obtain only:
bla bla bla
bla ble bli
because "bla gulp bla" contains "gulp", and I don't want it!
Any idea? :-)
Use a negative lookahead assertion:
11(?!.*?gulp.*?)(.*?)22
Word boundaries around gulp might be a good idea, because they distinguish gulp from gulping, gulped or ungulp(?):
11(?!.*?\bgulp\b.*?)(.*?)22
but putting them around everything:
\b11\b(?!.*?\bgulp\b.*?)(.*?)\b22\b
would exclude your other two results - not what you want.
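One thing to watch with the lookahead patterns above: the lookahead is not limited to the text before the closing 22, so a "gulp" that appears later on the line (or anywhere later in the text when dot matches newlines) can reject an earlier, otherwise valid pair. A commonly used alternative is to check the lookahead at every character between the delimiters. The Python sketch below applies that idea to the sample text; it is one possible approach, not the answer's method, and the name spans_without_gulp is hypothetical.

```python
import re

# Each character between 11 and 22 is consumed only if "gulp" does not
# start at that position, so the check stays inside the pair.
PAIR = re.compile(r"11((?:(?!gulp).)*?)22", re.DOTALL)


def spans_without_gulp(text: str):
    """Return the text between each 11...22 pair that contains no 'gulp'."""
    # Collapse runs of whitespace (including newlines) for readability.
    return [" ".join(found.split()) for found in PAIR.findall(text)]


if __name__ == "__main__":
    sample = "11 bla gulp bla 22\n11 bla bla bla 2211 bla\nble\nbli 22"
    print(spans_without_gulp(sample))
```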