Extracting mails from a spreadsheet - regex

I have a Google Spreadsheet with two columns.
First column includes the name of a referrer and second column includes a free format text where some referred email addresses are mentioned. There might be multiple email addresses in one cell, or none.
Ex:
Referrer | Referral
--------------------------------------------------------------------------
Mister X | I would like to refer somebody#gmail.com and somebodyelse#outlook.com
Miss Y | myfriend#mail.com
Mister Z | None!
etc | ...
I would like to reshape the data so that each referred address gets its own row, paired with its referrer.
Ex:
Referrer | Referral
--------------------------------------------------------------------------
Mister X | somebody#gmail.com
Mister X | somebodyelse#outlook.com
Miss Y | myfriend#mail.com
etc | ...
What is the best way of achieving this?

Here's your original data in a table.
Referrer | Referral
--------------------------------------------------------------------------
Mister X | I would like to refer somebody#gmail.com and somebodyelse#outlook.com
Miss Y | myfriend#mail.com
Mister Z | None!
Here are the same columns after they're overwritten.
Referrer | none
Mister X | somebody#gmail.com
Mister X | somebodyelse#outlook.com
Miss Y | myfriend#mail.com
Mister Z | none
And here's the code. Select the two columns as shown, and the script overwrites them in the format you requested. With such a limited dataset one can never be 100% sure, so further testing would be good. I included the menu and some of my display routines, which help me debug the program. You may want to change the range. Go for it. Have fun. I enjoyed writing it.
function onOpen()
{
var ui = SpreadsheetApp.getUi();
ui.createMenu('My Tools')
.addItem('Extract Emails','emailFishing')
.addToUi();
}
function emailFishing()
{
var rng = SpreadsheetApp.getActiveRange();
var rngA = rng.getValues();
var resultsA = [];
//var s = '[';
for(var i = 0;i < rngA.length; i++)
{
if(rngA[i][1])
{
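// Declare matchA locally and coerce the cell to a string so that .match() is always available
var matchA = extractEmails(String(rngA[i][1]));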
if(matchA)
{
for(var j = 0; j < matchA.length;j++)
{
resultsA.push([rngA[i][0], matchA[j]]);
//s += '[' + rngA[i][0] + ', ' + matchA[j] + '], '
}
}
else
{
resultsA.push([rngA[i][0],'none']);
//s += '[' + rngA[i][0] + ', \'none\'],'
}
}
}
//s += ']';
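// Nothing to write if no selected row had text in the second column
if (resultsA.length === 0) return;
var orng = SpreadsheetApp.getActiveSheet().getRange(rng.getRow(), rng.getColumn(), resultsA.length, resultsA[0].length);
orng.setValues(resultsA);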
//dispStatus('Results Array', s, 500, 400);
}
function extractEmails (text)
{
return text.match(/([a-zA-Z0-9._-]+#[a-zA-Z0-9._-]+\.[a-zA-Z0-9._-]+)/gi);
}
function dispStatus(title,html,width,height)
{
// Display a modeless dialog box with custom HtmlService content.
var title = typeof(title) !== 'undefined' ? title : 'No Title Provided';
var width = typeof(width) !== 'undefined' ? width : 250;
var height = typeof(height) !== 'undefined' ? height : 300;
var html = typeof(html) !== 'undefined' ? html : '<p>No html provided.</p>';
var htmlOutput = HtmlService
.createHtmlOutput(html)
.setWidth(width)
.setHeight(height);
SpreadsheetApp.getUi().showModelessDialog(htmlOutput, title);
}
The function extractEmails came from Leniel Macaferi, from the post "Extract all email addresses from bulk text using jquery", although I left out the jQuery part.
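The row expansion itself can be tried outside a spreadsheet. The following is a minimal sketch in plain JavaScript, not the answer's exact code: the function name expandReferrals and the constant EMAIL_PATTERN are made up for illustration, and the pattern keeps the '#' that the sample data uses in place of '@'. Unlike emailFishing above, it drops rows with no address instead of writing 'none', which matches the output the question asks for.

```javascript
// Same character classes as extractEmails above; '#' stands in for '@' as in the sample data.
var EMAIL_PATTERN = /[a-zA-Z0-9._-]+#[a-zA-Z0-9._-]+\.[a-zA-Z0-9._-]+/g;

// rows is a 2D array shaped like Range.getValues(): [[referrer, freeText], ...].
// Returns one [referrer, address] pair per address found.
function expandReferrals(rows) {
  var results = [];
  rows.forEach(function (row) {
    // match() returns null when nothing is found, so fall back to an empty list.
    var matches = String(row[1]).match(EMAIL_PATTERN) || [];
    matches.forEach(function (address) {
      results.push([row[0], address]);
    });
  });
  return results;
}

var sample = [
  ["Mister X", "I would like to refer somebody#gmail.com and somebodyelse#outlook.com"],
  ["Miss Y", "myfriend#mail.com"],
  ["Mister Z", "None!"]
];
var expanded = expandReferrals(sample);
```

In Apps Script the returned array could be passed to setValues on a range of the same size, as emailFishing does.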

Related

Google App Script IF function checking only one row, and applying the result to all rows

I seem to be going quite wrong somewhere.
I'm writing a script that will automatically send out a reminder email if a Google sheet cell turns to "Yes".
The problem is my script seems to read it as:
if the second row has a "yes" it will return true for all rows and send out an email to everyone, regardless of the other rows saying "yes" or "no".
if any other row has a yes, then it seems to be completely ignored.
Defining the range to check:
//looping through all of the rows
for (var i = 0; i < data.length; ++i) {
var row = data[i];
// Read the cells that the if statement checks
var ss = SpreadsheetApp.getActiveSheet();
var thisQuarter = ss.getRange("H2:H50").getValue();
The IF statement to check against:
// checking for this quarter
if (
thisQuarter == "Yes") {
var subject =
'Your BCP is due to expire this quarter: ';
MailApp.sendEmail(emailAddress, subject, message,);
Logger.log('this quarter');
}
}
}
If anyone could give me a couple pointers as to where I'm going wrong, that would be greatly appreciated.
Thank you,
Ideally, post a view-only copy of the sheet. I believe the problem is this section of code:
// checking for this quarter
if (
thisQuarter == "Yes") {
var subject =
'Your BCP is due to expire this quarter: ';
MailApp.sendEmail(emailAddress, subject, message,);
Logger.log('this quarter');
}
thisQuarter is assigned here:
var thisQuarter = ss.getRange("H2:H50").getValue();
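getValue() returns only the top-left cell of the range (H2 here), which is why row 2 decides the outcome for every row. Change that line to this:
var thisQuarter = ss.getRange("H2:H50").getValues();
so that thisQuarter is an array of values from the specified range. Then change the if statement to loop over that array and see if it helps:
for (var i = 0; i < thisQuarter.length; i++) {
if (thisQuarter[i][0] == "Yes") {
// send email
}
}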
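To see the difference without a spreadsheet, here is a small sketch in plain JavaScript. The array columnH is a made-up stand-in for what getValues() would return for a four-row slice of column H, and the helper names are hypothetical.

```javascript
// Hypothetical stand-in for ss.getRange("H2:H5").getValues(): one inner array per row.
var columnH = [["No"], ["Yes"], [""], ["Yes"]];

// getValue() on a multi-cell range yields only the top-left cell.
function firstCell(values) {
  return values[0][0];
}

// Collect the array indices of the rows whose cell reads "Yes".
function rowsDue(values) {
  var due = [];
  for (var i = 0; i < values.length; i++) {
    if (values[i][0] === "Yes") {
      due.push(i);
    }
  }
  return due;
}

var single = firstCell(columnH); // the only value the original if statement ever saw
var dueRows = rowsDue(columnH);  // the rows that should actually receive a reminder
```

Because the range starts at H2, array index i corresponds to sheet row i + 2.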

Gmail App search criteria

I have the following search criteria working very well in Gmail:
user#domain from:/mail delivery/ || /postmaster/ ||/Undeliverable/
I am trying to write Google Apps Script code to return the same results. Here is the code:
var thread=GmailApp.search("user#domain from:/mail delivery/ || /postmaster/ ||/Undeliverable/ ");
I am getting different results. I am new to both Regex and Google Apps.
Try Amit Agarwal's tutorial on Gmail Search with Google Apps Script which includes Using Regular Expressions to Find Anything in your Gmail Mailbox:
function Search() {
var sheet = SpreadsheetApp.getActiveSheet();
var row = 2;
// Clear existing search results
sheet.getRange(2, 1, sheet.getMaxRows() - 1, 4).clearContent();
// Which Gmail Label should be searched?
var label = sheet.getRange("F3").getValue();
// Get the Regular Expression Search Pattern
var pattern = sheet.getRange("F4").getValue();
// Retrieve all threads of the specified label
var threads = GmailApp.search("in:" + label);
for (var i = 0; i < threads.length; i++) {
var messages = threads[i].getMessages();
for (var m = 0; m < messages.length; m++) {
var msg = messages[m].getBody();
// Does the message content match the search pattern?
if (msg.search(pattern) !== -1) {
// Format and print the date of the matching message
sheet.getRange(row,1).setValue(
Utilities.formatDate(messages[m].getDate(),"GMT","yyyy-MM-dd"));
// Print the sender's name and email address
sheet.getRange(row,2).setValue(messages[m].getFrom());
// Print the message subject
sheet.getRange(row,3).setValue(messages[m].getSubject());
// Print the unique URL of the Gmail message
var id = "https://mail.google.com/mail/u/0/#all/"
+ messages[m].getId();
sheet.getRange(row,4).setFormula(
'=hyperlink("' + id + '", "View")');
// Move to the next row
row++;
}
}
}
}
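The same fetch-then-filter idea could be applied to the sender rather than the body. This is only a sketch in plain JavaScript: the pattern is assembled from the three terms in the question, the sender strings are invented sample data, and in Apps Script they would come from messages[m].getFrom().

```javascript
// Case-insensitive pattern built from the three terms in the question.
// No "g" flag, so test() keeps no state between calls.
var BOUNCE_SENDER = /mail delivery|postmaster|undeliverable/i;

// Keep only the sender strings that look like bounce notifications.
function filterBounceSenders(senders) {
  return senders.filter(function (sender) {
    return BOUNCE_SENDER.test(sender);
  });
}

// Invented sample data standing in for getFrom() results.
var sampleSenders = [
  "Mail Delivery Subsystem <mailer-daemon#example.com>",
  "Alice <alice#example.com>",
  "postmaster#example.org"
];
var bounces = filterBounceSenders(sampleSenders);
```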

Eliminate newlines in google app script using regex

I'm trying to write part of an add-on for Google Docs that eliminates newlines within selected text using replaceText. The obvious text.replaceText("\n",""); gives the error Invalid argument: searchPattern. I get the same error with text.replaceText("\r","");. The following attempts do nothing: text.replaceText("/\n/","");, text.replaceText("/\r/","");. I don't know why Google Apps Script does not recognize newlines in regex.
I am aware that there is an add-on that does this already, but I want to incorporate this function into my add-on.
This error occurs even with the basic
DocumentApp.getActiveDocument().getBody().replaceText("\n","");
My full function:
function removeLineBreaks() {
var selection = DocumentApp.getActiveDocument().getSelection();
if (selection) {
var elements = selection.getRangeElements();
for (var i = 0; i < elements.length; i++) {
var element = elements[i];
// Only deal with text elements
if (element.getElement().editAsText) {
var text = element.getElement().editAsText();
if (element.isPartial()) {
text.replaceText("\n","");
}
// Deal with fully selected text
else {
text.replaceText("\n","");
}
}
}
}
// No text selected
else {
DocumentApp.getUi().alert('No text selected. Please select some text and try again.');
}
}
It seems that in replaceText, to remove soft returns entered with Shift-ENTER, you can use \v:
.replaceText("\\v+", "")
If you want to remove all "other" control characters (C0, DEL and C1 control codes), you may use
.replaceText("\\p{Cc}+", "")
Note that \v is a construct supported by the JavaScript regex engine, and the RE2 regex library used in most Google products treats it as matching a vertical tab character (≡ \013).
The Google Apps Script function replaceText() still doesn't accept escape characters, but I was able to get around this by using getText(), then the generic JavaScript replace(), then setText():
var doc = DocumentApp.getActiveDocument();
var body = doc.getBody();
var bodyText = body.getText();
//DocumentApp.getUi().alert( "Does document contain \\t? " + /\t/.test( bodyText ) ); // \n true, \r false, \t true
bodyText = bodyText.replace( /\n/g, "" );
bodyText = bodyText.replace( /\t/g, "" );
body.setText( bodyText );
This worked within a Doc. Not sure if the same is possible within a Sheet (and, even if it were, you'd probably have to run this one cell at a time).
Here is my pragmatic solution for eliminating newlines in Google Docs or, more exactly, for eliminating newlines from Gmail's message.getPlainBody().
It looks as though Google uses '\r\n\r\n' as a plain EOL and '\r\n' as a manual line feed (Shift-Enter). The code should be self-explanatory.
It might help you get along with the newline problem in Docs.
The solution is possibly not very elegant, but it works like a charm :-)
function GetEmails2Doc() {
var doc = DocumentApp.getActiveDocument();
var body = doc.getBody();
var pc = 0; // Paragraph Counter
var label = GmailApp.getUserLabelByName("_Send2Sheet");
var threads = label.getThreads();
var i = threads.length;
// Loop over the threads, then over the messages within each thread
for (i=threads.length-1; i>=0; i--) {
var messages = threads[i].getMessages();
for (var j = 0; j < messages.length; j++) {
var message = messages[j];
/* Here I do some ...
body.insertParagraph(pc++, Utilities.formatDate(message.getDate(), "GMT",
"dd.MM.yyyy (HH:mm)")).setHeading(DocumentApp.ParagraphHeading.HEADING4)
str = message.getFrom() + ' to: ' + message.getTo();
if (message.getCc().length >0) str = str + ", Cc: " + message.getCc();
if (message.getBcc().length >0) str = str + ", Bcc: " + message.getBcc();
body.insertParagraph(pc++,str);
*/
// Body !!
var str = processBody(message.getPlainBody()).split("pEOL");
Logger.log(str.length + " EOLs");
for (var k=0; k<str.length; k++) body.insertParagraph(pc++,str[k]);
}
}
}
function processBody(tx) {
var s = tx.split(/\r\n\r\n/g);
// it looks like message.getPlainBody() [of mail] uses \r\n\r\n as EOL
// so, I first substitute the 'EOL's with the string pattern "pEOL"
// to be replaced with body.insertParagraph in the main function
tx = '';
for (var k=0; k<s.length; k++) tx = tx + s[k] + "pEOL";
// then replace all remaining simple \r\n with a blank
s = tx.split(/\r\n/g);
tx = '';
for (k=0; k<s.length; k++) tx = tx + s[k] + " ";
return tx;
}
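The two-pass split above can also be written without the "pEOL" marker string. The following is a sketch of a variant, not the answer's code: the function name splitIntoParagraphs is made up, and it assumes, as the answer does, that '\r\n\r\n' separates paragraphs and a lone '\r\n' is a soft break. It returns the paragraphs as an array directly, so it produces no trailing blank entry.

```javascript
// Split a plain-text mail body into paragraphs, turning soft breaks into spaces.
function splitIntoParagraphs(plainBody) {
  return plainBody
    .split(/\r\n\r\n/)                       // paragraph boundaries first
    .map(function (paragraph) {
      return paragraph.replace(/\r\n/g, " "); // remaining soft breaks become blanks
    });
}

var paragraphs = splitIntoParagraphs("Hello,\r\nthis is one paragraph.\r\n\r\nSecond paragraph.");
```

Each element could then be passed to body.insertParagraph in the main loop.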
I have now found out through much trial and error -- and some much needed help from Wiktor Stribiżew (see other answer) -- that there is a solution to this, but it relies on the fact that Google Script does not recognise \n or \r in regex searches. The solution is as follows:
function removeLineBreaks() {
var selection = DocumentApp.getActiveDocument()
.getSelection();
if (selection) {
var elements = selection.getRangeElements();
for (var i = 0; i < elements.length; i++) {
var element = elements[i];
// Only deal with text elements
if (element.getElement()
.editAsText) {
var text = element.getElement()
.editAsText();
if (element.isPartial()) {
var start = element.getStartOffset();
var finish = element.getEndOffsetInclusive();
// finish is an inclusive offset, so slice one past it to keep the last selected character
var oldText = text.getText()
.slice(start, finish + 1);
if (oldText.match(/\r/)) {
var number = oldText.match(/\r/g)
.length;
for (var j = 0; j < number; j++) {
var location = oldText.search(/\r/);
text.deleteText(start + location, start + location);
text.insertText(start + location, ' ');
var oldText = oldText.replace(/\r/, ' ');
}
}
}
// Deal with fully selected text
else {
text.replaceText("\\v+", " ");
}
}
}
}
// No text selected
else {
DocumentApp.getUi()
.alert('No text selected. Please select some text and try again.');
}
}
Explanation
Google Docs allows searching for vertical tabs (\v), which match newlines.
Partial text is a whole other problem. The solution to dealing with partially selected text above finds the location of newlines by extracting a text string from the text element and searching in that string. It then uses these locations to delete the relevant characters. This is repeated until the number of newlines in the selected text has been reached.
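The offset bookkeeping for a partial selection can be checked on an ordinary string. This is a sketch under the same assumption as the answer, that line breaks show up as '\r' in the element's text; the function names are hypothetical, and a plain string stands in for the Docs Text element.

```javascript
// Offsets (relative to the full text) of every "\r" between start and endInclusive.
function lineBreakOffsets(fullText, start, endInclusive) {
  var offsets = [];
  for (var i = start; i <= endInclusive && i < fullText.length; i++) {
    if (fullText.charAt(i) === "\r") {
      offsets.push(i);
    }
  }
  return offsets;
}

// Replace each line break inside the selection with a space, leaving the rest untouched.
function replaceLineBreaks(fullText, start, endInclusive) {
  var chars = fullText.split("");
  lineBreakOffsets(fullText, start, endInclusive).forEach(function (offset) {
    chars[offset] = " ";
  });
  return chars.join("");
}

var cleaned = replaceLineBreaks("ab\rcd\ref\rgh", 1, 6);
```

Since each break is swapped for a single space, the offsets stay valid throughout, which is also why the answer's deleteText and insertText pair can reuse start + location.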
This Stack Overflow answer specifically removes "\n". It may help; it certainly helped me.

google-apps-script multiple criteria writing over headers

I have taken a bit of script from Serge, which is great (original link here). I have added a second criterion to exclude certain rows and it works well, except that if there is no header in the sheet being copied to, it fails (error: "The coordinates or dimensions of the range are invalid."), and if I enter a header or some other data, it overwrites it. Can anyone assist, please? I have also found that if there is no match for the criteria, I get the following message: "TypeError: Cannot read property "length" from undefined."
Also, what change would I need to make to change the cell 'dataSheetLog[i][12]' to the status variable, i.e. "COPIED" after I have copied it across. I have tried writing a setValue line but it is obviously the wrong instruction for that syntax.
Code is:
{
var Spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
var sheetLog = Spreadsheet.getSheetByName("LOG");
var sheetMaint = Spreadsheet.getSheetByName("MAINTENANCE");
var Alast = sheetLog.getLastRow();
var criteria = "08 - Maintenance"
var status = "COPIED"
var dataSheetLog = sheetLog.getRange(2,1,Alast,sheetLog.getLastColumn()).getValues();
var outData = [];
for (var i in dataSheetLog) {
if (dataSheetLog[i][2]==criteria && dataSheetLog[i][12]!=status){
outData.push(dataSheetLog[i]);
}
}
sheetMaint.getRange(sheetMaint.getLastRow(),1,outData.length,outData[0].length).setValues(outData);
}
In:
sheetMaint.getRange(sheetMaint.getLastRow(),1,outData.length,outData[0].length).setValues(outData);
getLastRow() returns the last occupied row, so the first argument should be getLastRow() + 1 to keep from overwriting your headers and to avoid the other problems you describe.
Edited:
{
var Spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
var sheetLog = Spreadsheet.getSheetByName("LOG");
var sheetMaint = Spreadsheet.getSheetByName("MAINTENANCE");
var Alast = sheetLog.getLastRow(); // Log
var criteria = "08 - Maintenance"
var status = "COPIED"
var dataSheetLog = sheetLog.getRange(2,1,Alast,sheetLog.getLastColumn()).getValues(); //Log
var dataSheetLogStatusRange = sheetLog.getRange(2,13,Alast,1); //Log
var dataSheetLogStatus = dataSheetLogStatusRange.getValues(); //Log
var outData = [];
for (var i =0; i < dataSheetLog.length; i++) {
if (dataSheetLog[i][2]==criteria && dataSheetLog[i][12]!=status){
outData.push(dataSheetLog[i]);
dataSheetLogStatus[i][0] = "COPIED";
}
}
if(outData.length > 0) {
sheetMaint.getRange(sheetMaint.getLastRow() + 1,1,outData.length,outData[0].length).setValues(outData);
dataSheetLogStatusRange.setValues(dataSheetLogStatus);
}
}
You asked: "what change would I need to make to change the cell 'dataSheetLog[i][12]' to the status variable, i.e. "COPIED" after I have copied it across."
You were trying to update the value in the array that was extracted from the sheet and not the sheet itself. As arrays are zero based and spreadsheets are not, to translate, +1 must be added to array row and column indices. I am assuming status is in column M of your sheet.
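The index translation can be sketched in plain JavaScript. The helper name toSheetCell and its firstDataRow parameter are made up for illustration; firstDataRow is 2 here because the data range in the code above starts at row 2, below the header.

```javascript
// Map zero-based array indices to one-based sheet coordinates.
// firstDataRow is the sheet row that array index 0 was read from.
function toSheetCell(rowIndex, colIndex, firstDataRow) {
  return {
    row: rowIndex + firstDataRow, // array row 0 is sheet row firstDataRow
    column: colIndex + 1          // array column 0 is sheet column 1 (A)
  };
}

var statusCell = toSheetCell(0, 12, 2); // dataSheetLog[0][12]
```

With these values, dataSheetLog[0][12] corresponds to row 2, column 13, that is, column M, which is the column the edited code reads into dataSheetLogStatusRange.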

Restrict TextField to act like a numeric stepper

I am making a numeric stepper from scratch, so I want my text field to only accept numbers in this format: xx.x, x.x, x, or xx where x is a number. For example:
Acceptable numbers:
1
22
15.5
3.5
Unacceptable numbers:
213
33.15
4332
1.65
Maybe this will help somehow:
http://livedocs.adobe.com/flash/9.0/ActionScriptLangRefV3/flash/text/TextField.html#restrict
This is what I got so far:
var tx:TextField = new TextField();
tx.restrict="0-9."; //Maybe there is a regular expression string for this?
tx.type=TextFieldType.INPUT;
tx.border=true;
You can copy and paste this into Flash and it should work.
Thank you very much for your help good sirs.
Very similar to TheDarklin's answer, but a little more elegant. It actually renders _tf.restrict obsolete, but I would still recommend using it.
_tf.addEventListener(TextEvent.TEXT_INPUT, _onTextInput_validate);
Both of the event listeners below do exactly the same thing. One is written on a single line for those who like compact code; the other is for those who like to see what's going on line by line.
private function _onTextInput_validate(__e:TextEvent):void
{
if ( !/^\d{1,2}(?:\.(?:\d)?)?$/.test(TextField(__e.currentTarget).text.substring(0, TextField(__e.currentTarget).selectionBeginIndex) + __e.text + TextField(__e.currentTarget).text.substring(TextField(__e.currentTarget).selectionEndIndex)) ) __e.preventDefault();
}
Here is a more broken-down version of the event listener:
private function _onTextInput_validate(__e:TextEvent):void
{
var __reg:RegExp;
var __tf:TextField;
var __text:String;
// set the textfield thats causing the event.
__tf = TextField(__e.currentTarget);
// Set the regular expression.
__reg = new RegExp("^\\d{1,2}(?:\\.(?:\\d)?)?$");
// or depending on how you like to write it.
__reg = /^\d{1,2}(?:\.(?:\d)?)?$/;
// Set all text before the selection.
__text = __tf.text.substring(0, __tf.selectionBeginIndex);
// Set the text entered.
__text += __e.text;
// Set the text After the selection, since the entered text will replace any selected text that may be entered
__text += __tf.text.substring(__tf.selectionEndIndex);
// If test fails, prevent default
if ( !__reg.test(__text) )
{
__e.preventDefault();
}
}
I have had to allow xx. as a valid input; otherwise you would need to type 123, then go back a space and type . to get 12.3. That is JUST NOT NICE. So 12. is now technically valid.
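The pattern can be checked against the examples from the question. This sketch is in JavaScript rather than ActionScript (the regex literal is the same in both), and the helper names candidateText and isAllowed are made up; candidateText mirrors how the listener builds the text that would result from the keystroke.

```javascript
// Same pattern as in the listener: one or two digits, optionally a dot, optionally one more digit.
var STEPPER_PATTERN = /^\d{1,2}(?:\.(?:\d)?)?$/;

// Text that would result from typing `typed` over the current selection.
function candidateText(current, selectionBegin, selectionEnd, typed) {
  return current.substring(0, selectionBegin) + typed + current.substring(selectionEnd);
}

function isAllowed(text) {
  return STEPPER_PATTERN.test(text);
}

var acceptedResults = ["1", "22", "15.5", "3.5", "12."].map(isAllowed);
var rejectedResults = ["213", "33.15", "4332", "1.65"].map(isAllowed);
var afterTyping = candidateText("15", 2, 2, "."); // caret at the end, user types "."
```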
package
{
import flash.display.Sprite;
import flash.text.TextField;
import flash.text.TextFieldType;
import flash.events.TextEvent;
public class DecimalPlaces extends Sprite
{
public function DecimalPlaces()
{
var tf:TextField = new TextField();
tf.type = TextFieldType.INPUT;
tf.border = true;
tf.width = 200;
tf.height = 16;
tf.x = tf.y = 20;
tf.restrict = ".0-9"
tf.addEventListener(TextEvent.TEXT_INPUT, restrictDecimalPlaces);
addChild(tf);
}
function restrictDecimalPlaces(evt:TextEvent):void
{
var matches:Array = evt.currentTarget.text.match(/\./g);
var allowedDecimalPlaces:uint = 1;
if ((evt.text == "." && matches.length >= 1) ||
(matches.length == 1 && (evt.currentTarget.text.lastIndexOf(".") + allowedDecimalPlaces < evt.currentTarget.text.length)))
evt.preventDefault();
}
}
}