Parsing email with Google Apps Script, regex issue? - regex

I used to be quite proficient in VBA with Excel, but I'm currently trying to do something with Google Apps Script and I am well and truly stuck.
Basically, I am trying to extract data from a standardised Gmail email into a Google Sheet. There are a couple of other threads on the subject which I have consulted so far, and I can get the body of the email into the sheet but cannot parse it.
I am new to regex, but my pattern tests OK on regex101.
I am also brand new to Google Script, and even the debugger seems to have stopped working now (it did before, so would be grateful if anyone can suggest why this is).
Here is my basic function:
function processInboxToSheet() {
var label = GmailApp.getUserLabelByName("NEWNOPS");
var threads = label.getThreads();
// Set destination sheet
var sheet = SpreadsheetApp.getActiveSheet();
// Get all emails labelled NEWNOPS
for (var i = 0; i < threads.length; i++) {
var tmp,
message = threads[i].getMessages()[1], // second message in thread
content = message.getPlainBody(); // remove html markup
if (content) {
// search email for 'of:' and capture next line of text as address
// tests OK at regex101.com
property = content.match(/of:[\n]([^\r\n]*)[\r\n]/);
// if no match, display error
var property = (tmp && tmp[1]) ? tmp[1].trim() : 'No property';
sheet.appendRow([property]);
} // End if
// remove label to avoid duplication
threads[i].removeLabel(label)
} // End for loop
}
I can append 'content' to the sheet Ok, but cannot extract the address text required by the regex. Content displays as follows:
NOPS for the purchase of:
123 Any Street, Anytown, AN1 1AN
DATE: 05/05/2017
PRICE: £241,000
Seller’s Details
NAME: Mrs Seller
Thanks for reading :)

The return value of .match() is an array. The first captured group, containing the address, will be at index 1.
Based on the following line after your call to .match(), it looks like the tmp variable should have been assigned that array, not the property variable.
var property = (tmp && tmp[1]) ? tmp[1].trim() : 'No property';
That line says, if .match() returned something that isn't null and has a value at index 1, then trim that value and assign to property, otherwise assign it the string 'No property'.
So, try changing this line:
property = content.match(/of:[\n]([^\r\n]*)[\r\n]/);
To this:
tmp = content.match(/of:[\n]([^\r\n]*)[\r\n]/);

Thanks Kevin, I think I must have changed it while debugging.
The problem was with my regexp in the end. After a bit of trial and error the following worked:
tmp = content.match(/of:[\r\n]+([^\r\n]+)/);
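For anyone hitting the same thing, here is a minimal sketch of why the second pattern is more forgiving. It assumes the plain body uses "\r\n" line endings, which the thread does not confirm but which would explain the behaviour. The helper name extractProperty and the sample string are made up for illustration, and this runs as plain JavaScript with no Gmail calls.

```javascript
// Hypothetical helper, not part of the original script.
function extractProperty(content) {
  // Accept any run of \r and \n after "of:", then capture the next line.
  var tmp = content.match(/of:[\r\n]+([^\r\n]+)/);
  return (tmp && tmp[1]) ? tmp[1].trim() : 'No property';
}

// Assumed sample body with Windows-style line endings.
var sample = "NOPS for the purchase of:\r\n" +
             "123 Any Street, Anytown, AN1 1AN\r\n" +
             "DATE: 05/05/2017\r\n";

// The original pattern expects "\n" directly after "of:", so "\r\n" defeats it.
var oldMatch = sample.match(/of:[\n]([^\r\n]*)[\r\n]/);

console.log(oldMatch);                 // null
console.log(extractProperty(sample));  // 123 Any Street, Anytown, AN1 1AN
```

Logging JSON.stringify(content) once is a quick way to see which line endings the body really contains.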

Related

Check if any cell in specified range meets 2 conditions

I'm putting together a macro that sends alert e-mails if two conditions are met.
The e-mails are being sent, but indiscriminately and not just when the conditions I want to set are being met.
The conditions: send an e-mail if any cell inside the range (I1:I9999) has white as background colour AND contains the text "QC".
This is what I have tried:
var QCJobRange = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("WIP").getRange("I1:I9999");
var Location = QCJobRange.getValue();
// Check for white cells with value=QC in Location column
if (Location = "QC") and (Background = "#ffffff");
// Fetch the email address
var emailRange = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("WIP").getRange("C2");
var emailAddress = emailRange.getValues();
// Send Alert Email.
var message = 'bla';
var subject = 'bla';
MailApp.sendEmail(emailAddress, subject, message);
I'm working directly in the script editor that you can open from Google sheets.
It seems that some operators are not being picked up, e.g. "and" is not even highlighted, and I get the following error message: "and" is not defined.
I've been combing the forums for a simple solution but am kind of stuck on the problem with "and".
Any suggestions?
Google Apps Script is based on JavaScript.
The syntax for "and" is &&.
The syntax for an if statement is if (condition1 && condition2) {...do something...}
The method getValue() applies to a single value (from a single cell), while getValues() is to be used for ranges and returns a 2-dimensional array.
If you want to compare two values, use the operator == (a single = is an assignment).
Here is a sample that modifies your code in order to send a message if the background of cell "I1" is white and its value is "QC":
function myFunction() {
// if you do not have 9999 rows full of data, please reduce your range - otherwise your code will be slow
var QCJobRange = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("WIP").getRange("I1:I9999");
var Locations = QCJobRange.getValues();
var firstLocation=Locations[0][0];
// Check for white cells with value=QC in Location column
if (firstLocation == "QC"&& QCJobRange.getBackgrounds()[0][0]== "#ffffff"){
// Fetch the email address
var emailRange = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("WIP").getRange("C2");
var emailAddress = emailRange.getValue();
// Send Alert Email.
var message = 'bla';
var subject = 'bla';
MailApp.sendEmail(emailAddress, subject, message);
}
}
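The sample above only looks at cell I1. To cover "any cell inside the range", one option is to loop over the arrays returned by getValues() and getBackgrounds(). Below is a rough sketch of that loop as a plain function, so it can be tried outside the sheet. The name findWhiteQcRows and the sample arrays are hypothetical, and it assumes white is reported as "#ffffff".

```javascript
// Hypothetical helper: returns the 1-based row numbers where the cell
// value is "QC" and the background colour is white.
function findWhiteQcRows(values, backgrounds) {
  var rows = [];
  for (var i = 0; i < values.length; i++) {
    // Both inputs are 2-D arrays with one column, so the cell is at [i][0].
    if (values[i][0] === "QC" && backgrounds[i][0] === "#ffffff") {
      rows.push(i + 1);
    }
  }
  return rows;
}

// Made-up data in the shape getValues() / getBackgrounds() return
// for a single-column range.
var sampleValues = [["QC"], ["Done"], ["QC"], [""]];
var sampleBackgrounds = [["#ff0000"], ["#ffffff"], ["#ffffff"], ["#ffffff"]];

console.log(findWhiteQcRows(sampleValues, sampleBackgrounds)); // [ 3 ]

// In the script itself the arrays would come from the same range, e.g.:
//   var rows = findWhiteQcRows(QCJobRange.getValues(), QCJobRange.getBackgrounds());
//   if (rows.length > 0) { MailApp.sendEmail(emailAddress, subject, message); }
```

Reading both arrays once and looping in memory avoids calling the spreadsheet service once per cell.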
Please consult the Apps Script tutorial for more samples and information.

Google App Script findText regex not working for new line character

I'm trying to locate/modify text in my Google Document where the text has been broken across a full line break. My regular expression below works when I manually find text in the Google document (CTRL+F) and then search via the regular expression dialog. What is baffling is why the exact same regex doesn't work in the code below on full line breaks, i.e. "\n" (note: soft line breaks, "\v", are OK).
The second approach finds the text, but I'm unable to do anything with it as I need the element object in order to manipulate the text.
//Test document 1Q6v8ipqA81LoPtpk71NdqTaIEqMjki1KIJbrm0bILBg contains the following text:
//
//This Agreement shall not be assigned by either party without the prior\n
//written consent of the parties hereto
var doc = DocumentApp.openById('1Q6v8ipqA81LoPtpk71NdqTaIEqMjki1KIJbrm0bILBg');
//Method 1 - does NOT locate the text
var body = doc.getBody();
var pattern = "prior[\s]*written";
var foundElement = body.findText(pattern);
while (foundElement != null) {
var foundText = foundElement.getElement().asText();
var start = foundElement.getStartOffset();
var end = foundElement.getEndOffsetInclusive();
foundElement = body.findText(pattern, foundElement);
}
//Method 2 - locates the text, but I cannot acquire the element object
var body2 = doc.getBody().getText();
var pattern2 = /prior[\s]*written/;
while (m=pattern2.exec(body2))
{
Logger.log(m[0]);
}
}
If this were ever going to work, you would need the regex to be in s (single line) mode. Per https://developers.google.com/apps-script/reference/document/body#findtextsearchpattern,
A subset of the JavaScript regular expression features are not fully supported, such as capture groups and mode modifiers.
So it looks like they have in fact chosen not to support multi-line matches in any way.
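One side note on Method 1, separate from the limitation above: findText() takes the pattern as a string, and in a JavaScript string literal "\s" is just "s", so the backslash has to be doubled. The sketch below shows the difference in plain JavaScript. The variable names are illustrative only, and it makes no claim that the escaped pattern will then match across a paragraph break in findText().

```javascript
// In a string literal, "\s" is an unknown escape and collapses to "s".
var asTyped = "prior[\s]*written";
// Doubling the backslash keeps the \s character class in the pattern.
var escaped = "prior[\\s]*written";

console.log(asTyped);  // prior[s]*written
console.log(escaped);  // prior[\s]*written

// Checked here with a plain RegExp rather than findText().
var text = "without the prior\nwritten consent";
console.log(new RegExp(asTyped).test(text));  // false
console.log(new RegExp(escaped).test(text));  // true
```

A regex literal such as /prior[\s]*written/ (Method 2) does not have this problem, which may be part of why the two methods behave differently.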

Apps Script findText() for Google Docs

I'm applying RegEx search to a Google Document text with some markdown code block ticks (```). Running the code below on my doc is returning a null result.
var codeBlockRegEx = '`{3}((?:.*?\s?)*?)`{3}'; // RegEx to find (lazily) all text between triple tick marks (/`/`/`), inclusive of whitespace such as carriage returns, tabs, newlines, etc.
var reWithCodeBlock = body.findText(codeBlockRegEx); // reWithCodeBlock evaluates to 'null'
I suspect that there's some element of regex in my code that is not supported by RE2, but the documentation has not shed light on this. Any ideas?
I received null as well. I was able to get the code below to work using three backticks on either side of the word "test" within a paragraph.
I did find this information:
findText method of objects of class Text in Apps Script, extending Google Docs. Documentation says “A subset of the JavaScript regular expression features are not fully supported, such as capture groups and mode modifiers.” In particular, it does not support lookarounds.
function findXtext() {
var body = DocumentApp.getActiveDocument().getBody();
var foundElement = body.findText("`{3}(test)`{3}");
while (foundElement != null) {
// Get the text object from the element
var foundText = foundElement.getElement().asText();
// Where in the element is the found text?
var start = foundElement.getStartOffset();
var end = foundElement.getEndOffsetInclusive();
// Set Bold
foundText.setBold(start, end, true);
// Change the background color to yellow
foundText.setBackgroundColor(start, end, "#FCFC00");
// Find the next match
foundElement = body.findText("`{3}(test)`{3}", foundElement);
}
}

IF Function - Google Scripts

I'm struggling to get my Script to run an IF function. Basically I want to run a script based on specific cell contents.
I would like an IF function to run based on this and have written the following code:
function sendemail () {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var s = SpreadsheetApp.getActiveSheet();
var targetSheet = ss.getSheetByName("Response");
var vCodes = ss.getSheetByName("Codes")
var vResults = targetSheet.getRange("E2").getValues();
var emailAddresses = targetSheet.getRange("B2").getValues()
var dataRange = vCodes.getRange(1, 1, vResults, 1).getValues();
var subject = "Here are your Wi-Fi Codes!";
var vLength = vCodes.getRange("C2").getValues();
if (vLength == "24 hours"){
MailApp.sendEmail(emailAddresses, subject, dataRange);
targetSheet.deleteRows(2);
vCodes.deleteRows(1,vResults);
}
}
If the value in C2 is "24 hours" I'd like it to send an e-mail. At the moment when I run the script there are no errors but it doesn't send any e-mail as the IF function obviously isn't running correctly.
If I edit the code to say:
if (vLength == "")
then the e-mail sends. It doesn't seem to recognise "24 hours" as valid data to look up.
Can anyone see what I'm doing wrong?
The value you get from the cell is not what you think, because you are using getValues() (with an 's'), which always returns an array of arrays, even when the range is a single cell.
You have 2 options:
use getValue() to get the string content of the cell
use getValues()[0][0] to get the first (and only) element of this array.
I would suggest the first solution, as I think it's generally a good idea to use the appropriate method: getValue() for a single cell and getValues() for multiple cells.
I didn't check further, but I'm pretty sure it will work with this change (it applies to vResults, emailAddresses and vLength).
It would also be prudent to ensure that vResults is a number, since you use it to define a range. You could use Number(vResults) as a safety measure.
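To make the shape difference concrete, here is a small sketch in plain JavaScript. The arrays stand in for what getValues() would return for a single cell, and the helper name singleCell is hypothetical. Loose == comparisons can coerce arrays to strings in some runtimes, so the sketch uses === to keep the check unambiguous.

```javascript
// Stand-ins for what getValues() returns for the single cells C2 and E2.
var lengthValues = [["24 hours"]];
var resultValues = [["3"]];

// Hypothetical helper: unwrap the only cell of a 1x1 getValues() result.
function singleCell(values) {
  return values[0][0];
}

// The 2-D array itself is never strictly equal to the string...
console.log(lengthValues === "24 hours");              // false
// ...but the unwrapped cell is.
console.log(singleCell(lengthValues) === "24 hours");  // true

// Converting the row count explicitly before using it to define a range.
var rowCount = Number(singleCell(resultValues));
console.log(rowCount);  // 3
```

With getValue() on the range there is nothing to unwrap, which is why it is the simpler choice for single cells.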

How to use regular expression in WatiN

I'm working with the WatiN automation tool and I'm having a problem with a regular expression. I have a situation where I have to enter some text and click a button in a popup window. I'm using the AttachToIE method and the URL attribute ("http://192.168.25.10:215/admin/SelectUsers.aspx?Type=FeedbackID=ef5ad7ef5490-4656-9669-32464aeba7cd") of the popup to attach to it.
The problem is that each time the popup appears, the ID value in the URL changes, so I'm not able to access the popup. Can anyone please help by giving me a regular expression for the changing ID value in the URL below?
("http://192.168.25.10:215/admin/SelectUsers.aspx?Type=FeedbackID=ef5ad7ef5490-4656-9669-32464aeba7cd")
Thank you.
It appears that you have a URL with 2 query string parameters Type and ID and your pattern is:
"http://192.168.25.10:215/admin/SelectUsers.aspx?Type=Feedback&ID={some id}"
You can use the Find.ByUrl() attribute constraint method and pass it to AttachToIE() as shown below with the regex for matching that pattern.
string url = "http://192.168.25.10:215/admin/SelectUsers.aspx?Type=Feedback&ID=";
// Escape the literal part so that '.' and '?' are not treated as regex operators
Regex regex = new Regex(Regex.Escape(url) + "[-a-z0-9]+", RegexOptions.IgnoreCase);
IE ie = IE.AttachToIE(Find.ByUrl(regex));
string baseUrl = "http://192.168.25.10:215/admin/SelectUsers.aspx?Type=FeedbackID=";
Regex urlIE = new Regex(Regex.Escape(baseUrl) + "[\\wd]+", RegexOptions.IgnoreCase);
IE ie = IE.AttachToIE(Find.ByUrl(urlIE));
I'm not familiar with WatiN, but it looks like it runs on .NET, so perhaps this might help?
var desiredId = "000000000000-0000-0000-000000000000";
var url = "http://192.168.25.10:215/admin/SelectUsers.aspx?Type=FeedbackID=ef5ad7ef5490-4656-9669-32464aeba7cd&someMoreStuff";
var pattern = @"(?i)(?<=FeedBackId=)[-a-z0-9]+";
var result = Regex.Replace(url, pattern, desiredId);
Console.WriteLine(result);
//Output: http://192.168.25.10:215/admin/SelectUsers.aspx?Type=FeedbackID=000000000000-0000-0000-000000000000&someMoreStuff
The following pattern should have the same effect but is more defensive. It should only match text in the query string, it requires the id to be exactly 35 characters long, and it won't match similar parameter names like "PreviousFeedBackId".
var pattern = @"(?i)(?<=\?.*\bFeedBackId=)[-a-z0-9]{35,35}\b";
If you just want to extract the id:
var id = Regex.Match(url, pattern).Value;
Console.WriteLine(id);
//output: ef5ad7ef5490-4656-9669-32464aeba7cd
WatiN has a feature that lets us match on the URL while ignoring the query string. Below is the code that is working fine for me.
string baseUrl = "http://192.168.25.10:215/admin/SelectUsers.aspx";
IE ie = IE.AttachToIE(Find.ByUrl(baseUrl,true));