Imacros Regex - Isolate name and surname - regex

I have been struggling with this for quite a while now.
I am extracting text from a website using iMacros, with this result:
Niklaus Hasling
There is whitespace before the first name and after the surname.
This string is stored in the variable !VAR2.
I would like to use a regex that isolates the first name in !VAR3 and the surname in !VAR4.
Can someone help me?
I can't figure out how to write the regex.
'Extract and Save Names
TAG XPATH="/html/body/main/div[1]/div[1]/div/div[1]/div[1]/div/dl/dt/span" EXTRACT=TXT
SET !EXTRACT EVAL("'{{!EXTRACT}}'.trim(REGEX'')")
SET !VAR3 {{!EXTRACT}}
SET !EXTRACT NULL
'Extract and save SurNames
TAG XPATH="/html/body/main/div[1]/div[1]/div/div[1]/div[1]/div/dl/dt/span" EXTRACT=TXT
SET !EXTRACT EVAL("'{{!EXTRACT}}'.trim(REGEX'')")
SET !VAR4 {{!EXTRACT}}
SET !EXTRACT NULL

I figured this could trim the whitespace before and after the names:
^[ \t]+|[ \t]+$
and an array could split the names, but I can't figure out how to do that with iMacros.

Hum, you seem to like XPATH and REGEX, ah-ah...!
Implementation without REGEX (that I don't like and never use):
'Extract and Save First (`!VAR3`) + Last (`!VAR4`) Names:
SET !EXTRACT NULL
TAG XPATH="/html/body/main/div[1]/div[1]/div/div[1]/div[1]/div/dl/dt/span" EXTRACT=TXT
SET !ERRORIGNORE YES
SET !VAR3 EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.trim(); y=x.split(' '); z=y[0]; z;")
SET !VAR4 EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.trim(); y=x.split(' '); z=y[1]; z;")
'>
'Debug:
SET Debug_Info EXTRACT:<BR>_{{!EXTRACT}}_<BR><BR>
ADD Debug_Info First_Name:<SP>_{{!VAR3}}_<BR>Last_Name:<SP>_{{!VAR4}}_
PROMPT {{Debug_Info}}
!ERRORIGNORE is "needed" in case y[1] does not "exist", or iMacros will throw a Runtime Error...
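For reference, here is the same splitting logic as plain JavaScript, outside of iMacros, as a minimal sketch. The function names splitName and splitNameRegex are made up for illustration. It assumes the first whitespace-separated token is the first name and everything after it is the surname, and it returns an empty surname instead of failing when there is only one token.

```javascript
// Hypothetical helpers, not part of iMacros.

// Split variant: trim the ends, then split on runs of whitespace.
function splitName(raw) {
  const parts = raw.trim().split(/\s+/);
  return {
    first: parts[0] || "",
    // Assumption: everything after the first token is the surname.
    last: parts.slice(1).join(" "),
  };
}

// Regex variant: a single match with two capture groups.
function splitNameRegex(raw) {
  const m = raw.match(/^\s*(\S+)\s+(.*\S)\s*$/);
  // No match means there was no second token; fall back to the trimmed input.
  return m ? { first: m[1], last: m[2] } : { first: raw.trim(), last: "" };
}

console.log(splitName("  Niklaus Hasling  "));
console.log(splitNameRegex("  Niklaus Hasling  "));
```

The regex variant is closer to what the question asked for. Whether it can be pasted into EVAL unchanged depends on how iMacros handles the backslashes, which I have not checked.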

Related

Picking data from Excel CSV using iMacros

I want to automate the submission of deleted pages in Google Search Console for a website that I manage.
This is what I wrote in iMacros for Chrome (I've replaced the real names with my-domain-name.com and my-file.csv, of course):
VERSION BUILD=1011 RECORDER=CR
SET !DATASOURCE C:\Users\MY-USERNAME\Desktop\my-file.csv
SET !DATASOURCE_COLUMNS 1
SET !TIMEOUT_STEP 0
SET !ERRORIGNORE YES
SET !EXTRACT_TEST_POPUP YES
SET !LOOP 1
TAB T=1
URL GOTO=https://search.google.com/search-console/removals?resource_id=https://www.my-domain-name.com/&hl=fr&utm_source=wmx&utm_medium=deprecation-pane&utm_content=url-removal
TAG POS=2 TYPE=SPAN ATTR=TXT:Nouvelle<SP>demande
WAIT SECONDS=4
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:newremovalform ATTR=NAME:urlt CONTENT={{!COL1}}
TAG POS=4 TYPE=SPAN ATTR=TXT:Suivante
TAG POS=4 TYPE=SPAN ATTR=TXT:Envoyer<SP>la<SP>demande
WAIT SECONDS=15
TAG POS=1 TYPE=INPUT:SUBMIT FORM=ACTION:/webmasters/tools/removals-submit-ac?hl=fr&siteUrl=https://www.my-domain-name.com/ ATTR=NAME:next
TAG POS=1 TYPE=INPUT:SUBMIT FORM=ID:the-form ATTR=ID:submit-button
WAIT SECONDS=3
But when I play the macro, I immediately get this error message:
SyntaxError: wrong format of SET command at line 2
Thanks in advance for your help.
Best regards,
Eva.
FCI not mentioned (check my Profile on how to ask Qt's about the iMacros Tag "a bit correctly"...), but you probably have some Space(s) in the Path for your DataSource, I reckon...
=> Probable FCI:
iMacros for CR v10.1.1, CR93/94(...?), Win7/10/11(...?).
=> Need to replace Spaces with <SP>...
... Or you can enclose the whole Path with Single or Double Quotes... (But the Backslashes then need to be escaped with another Backslash...)
What also would cause "a Problem" is if your Path (including the Filename) contains some Single Quote(s)..., ... that you then can try to escape also, or maybe easier is to make sure that your Path "avoids" Spaces and "Special" Chars...
It's all documented and explained (with Examples) in the Wiki for the !DATASOURCE Command...

Regex filter for iMacros

I'm trying to scrape the search result counter from the Google SERP. It works with Google Spreadsheets, ImportXML and RegExReplace, but not always, because of a Spreadsheets fault. So I'm trying to accomplish it with iMacros, and I can't get the scraped string filtered correctly.
In Google Spreadsheets I use
=REGEXREPLACE(IMPORTXML("https://www.google.com/search?q=test&hl=en&as_qdr=m","//div[@id='resultStats']"),".*?([0-9,]+) (w|r)esults?","$1")
The whole imported string in id="resultStats" is About 4,290,000 results. Here the regex .*?([0-9,]+) (w|r)esults? filters all the words out, so I get only the result number. As I said, it doesn't work reliably in Spreadsheets.
The question is: how do I use this regex with iMacros to get only the number? I use this iMacros code:
VERSION BUILD=8881205 RECORDER=FX
SET !TIMEOUT_STEP 0
SET !ERRORIGNORE YES
TAB T=1
SET !DATASOURCE sr1.csv
SET !DATASOURCE_COLUMNS 1
SET !LOOP 1
SET !DATASOURCE_LINE {{!LOOP}}
SET !VAR1 EVAL("var randomNumber=Math.floor(Math.random()*45 + 16); randomNumber;")
URL GOTO={{!COL1}}
WAIT SECONDS={{!VAR1}}
TAG POS=1 TYPE=DIV ATTR=ID:resultStats EXTRACT=TXT
ADD !EXTRACT {{!URLCURRENT}}
SET !EXTRACT EVAL("decodeURI('{{!EXTRACT}}');")
SAVEAS TYPE=EXTRACT FOLDER=* FILE=+{{!NOW:ddmmyyyy}}.csv
It's very simple to do:
' ... '
TAG POS=1 TYPE=DIV ATTR=ID:resultStats EXTRACT=TXT
SET !EXTRACT EVAL("'{{!EXTRACT}}'.match(/[0-9,]+/);")
' ... '
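To see what that match does outside of iMacros, here is a small plain-JavaScript sketch. The function names extractResultCount and toNumber are made up for illustration, and the pattern is slightly stricter than the one above: it assumes the number is followed by "result" or "results", so an unrelated number earlier in the string would be skipped.

```javascript
// Hypothetical helpers, not part of iMacros.

// Return the counter as it appears in the text (commas kept), or null.
function extractResultCount(text) {
  const m = text.match(/([0-9][0-9,]*)\s+results?/);
  return m ? m[1] : null;
}

// Strip the thousands separators and convert to a number.
function toNumber(countText) {
  return Number(countText.replace(/,/g, ""));
}

const count = extractResultCount("About 4,290,000 results");
console.log(count);           // 4,290,000
console.log(toNumber(count)); // 4290000
```

Note that match() returns an array (or null), so taking the first capture group explicitly avoids relying on how the array gets converted to a string.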

how to extract only first two characters with imacro?

So the HTML code is :
<span class="countdown">
01:22
</span>
The imacro code to extract the text (01:22) is :
TAG POS=1 TYPE=SPAN ATTR=CLASS:"countdown" EXTRACT=TXT
I want to extract only the first two characters and not the whole text. In the example I posted, the extracted text would be "01" and not "01:22".
You will need to use the imacro EVAL method and use a little bit of javascript and regex to break up the string and assign it to another variable so that you get the first 2 characters. Below is the solution:
TAG POS=1 TYPE=SPAN ATTR=CLASS:"countdown" EXTRACT=TXT
SET !VAR1 EVAL("var x=\"{{!EXTRACT}}\"; x=x.match(/^.{2}/).join(''); x;")
PROMPT {{!VAR1}}
Enjoy! If this was helpful, please mark as such, thanks!
Edit Here's a slightly better method, using the javascript split function. This will allow you to specify the 1st part (01) or the 2nd part (22) of 01:22
TAG POS=1 TYPE=SPAN ATTR=CLASS:"countdown" EXTRACT=TXT
' Below line will assign the first part before the colon (01) to VAR1
SET !VAR1 EVAL("var x=\"{{!EXTRACT}}\"; var y=x.split(':'); y[0];")
' Below line will assign the second part after the colon (22) to VAR2
SET !VAR2 EVAL("var x=\"{{!EXTRACT}}\"; var y=x.split(':'); y[1];")
PROMPT {{!VAR1}}
Answer updated due to recent comment/question to my answer.
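Both approaches can be tried in plain JavaScript first. This is a minimal sketch with made-up function names (firstTwo, splitCountdown). It assumes the extracted text may carry surrounding whitespace, since the span in the question has line breaks around 01:22, and trims it before doing anything else.

```javascript
// Hypothetical helpers, not part of iMacros.

// First two characters of the trimmed text.
function firstTwo(raw) {
  return raw.trim().slice(0, 2);
}

// Split "mm:ss" on the colon into its two parts.
function splitCountdown(raw) {
  const parts = raw.trim().split(":");
  return { minutes: parts[0], seconds: parts[1] };
}

console.log(firstTwo("\n01:22\n"));       // 01
console.log(splitCountdown("\n01:22\n")); // { minutes: '01', seconds: '22' }
```

The split version is the safer of the two if the part before the colon is not always exactly two characters long.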

Extract multiple occurrences in JSON file with iMacros

I'm using iMacros for Firefox and want to extract some id's from a JSON file. The JSON file looks like this:
"count":0,"id":"12345","time"
blabla
"count":0,"id":"12346","time"
The code I'm using in iMacros is:
URL GOTO=https://www.jsonurl.com
SEARCH SOURCE=REGEXP:"\"id\":\"(.[^\"]*)\"" EXTRACT="$1"
PROMPT {{!EXTRACT}}
SAVEAS TYPE=EXTRACT FOLDER=* FILE=*
With this code, it is only extracting 12345 from the above JSON example. How can I edit the code to extract all occurrences of id?
Sorry, no can do :(
Global, iterative matching is currently not supported, so only the first match on the page can be found and extracted.
Source (iMacros Wiki)
I would use a JavaScript solution.
var macro;
macro ="CODE:";
macro +="URL GOTO=https://www.jsonurl.com"+"\n";
macro +="TAG POS=1 TYPE=DIV ATTR=CLASS:what_ever_you_are_extracting EXTRACT=HTM"+"\n";
iimPlay(macro)
var text=iimGetLastExtract();
text=text.split('"id":"')[1];
text=text.split('",')[0];
text=text.trim();
alert(text);
Edit:
The command
TAG POS=1 TYPE=HTML ATTR=CLASS:* EXTRACT=HTM
Extracts everything on the page.
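The split calls above only pick out the first id. To collect every occurrence, a global regex in a loop is one option. This is a sketch with a made-up function name (extractAllIds); it assumes the text returned by iimGetLastExtract() contains the JSON fragment with literal double quotes, as in the sample from the question. If the quotes come back as HTML entities, the pattern would need adjusting.

```javascript
// Hypothetical helper, not part of iMacros.

// Collect the value of every "id":"..." pair found in the source text.
function extractAllIds(source) {
  const re = /"id":"([^"]*)"/g; // the g flag lets exec() walk through all matches
  const ids = [];
  let m;
  while ((m = re.exec(source)) !== null) {
    ids.push(m[1]);
  }
  return ids;
}

const sample =
  '"count":0,"id":"12345","time"\nblabla\n"count":0,"id":"12346","time"';
console.log(extractAllIds(sample)); // [ '12345', '12346' ]
```

In the script above, this would take the place of the two split lines, with the extracted text passed in as the argument.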

Imacros - Replace or delete all the apostrophes in a string of text

I need an iMacros EVAL function to replace "'" with "",
or just to delete all the ' in a string of text.
I've tried this, but I can't get it to work with apostrophes:
TAG POS=1 TYPE=DIV ATTR=CLASS:after_title EXTRACT=TXT
SET !VAR2 EVAL("var extr2=\"{{!EXTRACT}}\"; extr2.replace(\"'\",\"\"); ")
After doing some reading I tried this, and got an error:
TAG POS=1 TYPE=DIV ATTR=CLASS:after_title EXTRACT=TXT
SET !VAR2 EVAL("var extr2=\"{{!EXTRACT}}\"; extr2.replace(\'/g\,\"GHF\"); ")
I really hope someone can help, it's really doing my head in.
TAG POS=1 TYPE=DIV ATTR=CLASS:after_title EXTRACT=TXT
SET !VAR2 EVAL("var extr2=\"{{!EXTRACT}}\"; extr2.replace(/'/g,''); ")
Can you try this and let us know if it worked?
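For what it's worth, the difference between the first attempt and this one can be seen in plain JavaScript: replace() with a string pattern only replaces the first occurrence, while a regex with the g flag replaces all of them. A minimal sketch (the function name stripApostrophes is made up for illustration):

```javascript
// Hypothetical helper, not part of iMacros.

// Remove every apostrophe from the text.
function stripApostrophes(text) {
  return text.replace(/'/g, "");
}

const sample = "it's Bob's";
console.log(sample.replace("'", ""));  // its Bob's  (only the first one is removed)
console.log(stripApostrophes(sample)); // its Bobs
```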