parse google document for text and copy result to sheet - regex

I want to parse a series of documents in a Google Drive folder using regular expressions.
The documents contain equipment model and serial numbers. I then want to copy the results to a Google Sheet, row by row. I have managed a similar task with emails, but have had no success with Google Docs.
Can anyone offer some guidance? I have tested the regular expressions in the 'Find and replace' menu in Google Docs and they work fine. The following is simply an attempt to see whether I can capture some data and write it to a cell in the active sheet.
function write() {
var ss= SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
var doc =
DocumentApp.openById('1ZNqJjSJo1wkD3eaCRTY64g98hYEY77D4MDU6XpvA4MI');
var body = doc.getBody();
var text = body.findText('(\W|^)GSS\d{2}H(\W|$)')
ss.getRange(1,1).setValue(text);
}

You want to retrieve all values matched by (\W|^)GSS\d{2}H(\W|$) in the document and put the results into the spreadsheet row by row. If my understanding is correct, how about this modification? There are several possible answers for your situation, so please think of this as one of them.
Modification points :
Retrieve text from document.
Retrieve all matched values using the regex.
For this situation, I used RegExp#exec.
Put the result to spreadsheet.
Modified script :
function write() {
  var ss = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  var doc = DocumentApp.openById('1ZNqJjSJo1wkD3eaCRTY64g98hYEY77D4MDU6XpvA4MI');
  // Modified script
  var text = doc.getBody().getText();
  var result = [];
  var r = /(\W|^)GSS\d{2}H(\W|$)/g;
  var res; // declared locally so it does not leak as a global
  while ((res = r.exec(text)) !== null) { // or while (res = r.exec(text)) {
    result.push([res[0]]);
  }
  // getRange() throws when asked for zero rows, so write only if something matched
  if (result.length > 0) {
    ss.getRange(ss.getLastRow() + 1, 1, result.length, 1).setValues(result);
  }
}
If this is not what you want, I'm sorry. In that case, could you please provide the sample input and output you need? I would then modify my answer.
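One detail worth checking off-sheet: the pattern's trailing (\W|$) consumes the separator, so when two model numbers are separated by a single character, the second one has no leading \W left to match and is skipped. The sketch below is one possible variation, not part of the original answer. It uses a lookahead for the trailing boundary and a capture group so that only the model number itself is kept. The helper name extractModelNumbers and the sample text are made up for illustration.

```javascript
// Pure helper: no Apps Script services are used, so it can be tried anywhere.
// Returns one single-cell row per match, the shape Range.setValues() expects.
function extractModelNumbers(text) {
  // (?:\W|^)  leading boundary, consumed but not captured
  // (GSS\d{2}H) the model number itself
  // (?=\W|$)  trailing boundary, checked but NOT consumed
  var pattern = /(?:\W|^)(GSS\d{2}H)(?=\W|$)/g;
  var rows = [];
  var match;
  while ((match = pattern.exec(text)) !== null) {
    rows.push([match[1]]);
  }
  return rows;
}

// Hypothetical document text for illustration only.
var sampleText = 'Units GSS12H GSS34H, and (GSS56H) were serviced; XGSS78H was not.';
var sampleRows = extractModelNumbers(sampleText);
```

Inside write(), the returned rows could be passed to setValues() in place of result, with text taken from doc.getBody().getText() as in the modified script.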

Related

Skip translating words with %% and [] in Google Sheets

I am using GoogleTranslate() with Sheets to translate some contents into different languages. In those contents, we have multiple hooks [ ] and % % in one string that do not need to translate. Example :
[name] [surname] looked at your profile %number% !
I do not need to translate hooks like [username] and %number%.
I'm looking for :
[name] [surname] a regardé ton profil %number% ! (in french for example)
A solution is already provided here for one character using REGEXREPLACE and REGEXEXTRACT. But I need either symbol [xxx] and %xxx% in one formula. Thank you.
Alternatively, instead of using GOOGLETRANSLATE with multiple nested functions, you can create a script bound to your spreadsheet file and paste in the simple custom script below. It defines a translate() function that is simpler to use on your sheet:
CUSTOM SCRIPT
function translate(range) {
  var container = [];
  //KEEP ALL %***% and [***] INTO A CONTAINER
  var regex = /(\[.*?])|(%.*?%)/g,
      stringTest = String(range), // cell values may arrive as numbers
      matched;
  while ((matched = regex.exec(stringTest)) !== null) {
    container.push(matched[0]);
  }
  //TRANSLATE TEXT TO FRENCH FROM ENGLISH W/O %***% and [***]
  var replacedData = stringTest.replace(regex, '#');
  var toTranslate = LanguageApp.translate(replacedData, 'en', 'fr');
  //REARRANGE THE TRANSLATED TEXT WITH %***% and [***] FROM CONTAINER
  var pieces = toTranslate.split('#'); // split once instead of on every iteration
  var res = "";
  for (var x = 0; x < pieces.length; x++) {
    // the last piece has no placeholder after it, so nothing is appended there
    res = res + pieces[x] + (x < container.length ? " " + container[x] : "");
  }
  //RETURN FINAL TRANSLATED TEXT WITH UNMODIFIED %***% and [***]
  return res.trim();
}
SAMPLE RESULT
After saving the script, put =translate(A1) in a sheet cell (assuming the text you want to translate is in cell A1). The script skips everything inside [***] and %***% and translates only the rest of the text to French.
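The mask-and-restore idea behind the script can be tried without calling the translation service. The following is a minimal sketch under some assumptions: the helper names (maskHooks, restoreHooks, translateKeepingHooks) are hypothetical, the translator is passed in as a function so a stand-in can be used for testing, and the text is assumed not to contain a literal # of its own.

```javascript
// Matches [name]-style and %number%-style hooks, as in the script above.
var HOOK_PATTERN = /\[.*?\]|%.*?%/g;

// Replace every hook with '#' and remember the hooks in order.
function maskHooks(text) {
  var hooks = [];
  var masked = String(text).replace(HOOK_PATTERN, function (hook) {
    hooks.push(hook);
    return '#';
  });
  return { masked: masked, hooks: hooks };
}

// Put the remembered hooks back, one per '#', in the original order.
function restoreHooks(translated, hooks) {
  var i = 0;
  return translated.replace(/#/g, function (marker) {
    return i < hooks.length ? hooks[i++] : marker;
  });
}

// translateFn is supplied by the caller. In Apps Script it could be:
//   function (s) { return LanguageApp.translate(s, 'en', 'fr'); }
function translateKeepingHooks(text, translateFn) {
  var parts = maskHooks(text);
  return restoreHooks(translateFn(parts.masked), parts.hooks);
}
```

Because the hooks are substituted back in place of each #, the spacing around them is whatever the translator returns, rather than a space added by the script.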
Try this:
=arrayformula(if(A1<>"",join("",if(isnumber(flatten(split(GOOGLETRANSLATE(join(" ",iferror(regexreplace(to_text(flatten(split(A1," "))),"(\[.*\])|(\%.*\%)","["&row(A$1:A)&"]"),)),"en","fr"),"[]"))),vlookup(flatten(split(GOOGLETRANSLATE(join(" ",iferror(regexreplace(to_text(flatten(split(A1," "))),"(\[.*\])|(\%.*\%)","["&row(A$1:A)&"]"),)),"en","fr"),"[]")),{sequence(len(regexreplace(A1,"[^\ ]",))+1,1),flatten(split(A1," "))},2,false),flatten(split(GOOGLETRANSLATE(join(" ",iferror(regexreplace(to_text(flatten(split(A1," "))),"(\[.*\])|(\%.*\%)","["&row(A$1:A)&"]"),)),"en","fr"),"[]")))),))
GOOGLETRANSLATE does not work with ARRAYFORMULA, but you can drag down this formula from cell B1 if you want to apply it to multiple rows in column A.
Individual steps taken:
Split text by space character, then flatten into one column.
Cell D1: =flatten(split(A1," "))
Replace [***] and %***% with [row#].
Cell E1: =arrayformula(iferror(regexreplace(to_text(flatten(split(A1," "))),"(\[.*\])|(\%.*\%)","["&row(A$1:A)&"]"),))
Join the rows into one cell.
Cell F1: =join(" ",E:E)
Apply Google Translate.
Cell G1: =GOOGLETRANSLATE(F1,"en","fr")
Split by [].
Cell H1: =flatten(split(G1,"[]"))
Where rows contain numbers, lookup item 1) above.
Cell I1: =arrayformula(if(isnumber(H1:H),vlookup(H1:H,{row(A$1:A),D:D},2,false),H1:H))
Join the rows into one cell.
Cell J1: =join(" ",I:I)

Been trying to automate the Find and Replace in Google Sheets but did not work

My sheet is a query sheet from a database. Some of the cells contain HTML hex color codes, which I have to remove manually with Edit > Find and replace every time the sheet is refreshed.
I am very new to Google Apps Script and have been trying to use the following code:
function Clearcode() {
var lookupone = new RegExp(/{color:#.{7}/);
var rep = "";
var spreadSheet = SpreadsheetApp.getActive();
var querySheet = spreadSheet.getSheetByName("QUERY");
var lastRow = querySheet.getLastRow();
var lastColumn = querySheet.getLastColumn();
var data = querySheet.getRange(2, 1, lastRow-1, lastColumn).getValues();
var textfinder = querySheet.createTextFinder(lookupone);
var found = textfinder.replaceAllWith(rep);
return (found);
}
Yet when I run this function in the sheet, it does not work. Any thoughts?
P.S. I plan to eliminate the "[color]" part of the hex code as well by creating a similar function.
P.S.2 I have attached a snapshot of a table as you requested. The red line is just for confidentiality of the data. Below the line is just a normal text.
Pay attention to types!
createTextFinder accepts a String as its argument, NOT a RegExp object.
To have the string treated as a regular expression, useRegularExpressions needs to be set to true:
querySheet.createTextFinder("\\{color:#?.{0,6}\\}") // at most 6 characters after the optional #
.useRegularExpressions(true)
.replaceAllWith("");
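To sanity-check what the pattern string removes before running it on the sheet, the same string can be compiled with a plain JavaScript RegExp. This is only a sketch: the regular-expression dialect used by Sheets is not identical to JavaScript's, although a simple pattern like this one should behave the same in both. The helper name stripColorTags and the sample cell text are made up, since the real data is not shown in the question.

```javascript
// Same pattern string as passed to createTextFinder above.
var COLOR_TAG_SOURCE = "\\{color:#?.{0,6}\\}";

// Remove every opening color tag such as {color:#ff0000} from a value.
function stripColorTags(value) {
  return String(value).replace(new RegExp(COLOR_TAG_SOURCE, "g"), "");
}

// Hypothetical cell contents for illustration.
var cleaned = stripColorTags("{color:#ff0000}Overdue{color} and {color:#00ff00}done");
```

Note that a bare closing tag such as {color} has no colon and is left in place by this pattern, so it would need a second find-and-replace pass.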

Remove Characters from Google Sheets up to # sign

I have a column of E-mail addresses in a Google Sheet and want to remove all of the domain names and '#' symbol and copy this to a new column. For example:
Column-A
test#test.com
testb#gmail.com
testc#yahoo.com
Copied and removing the domains to:
Column-B
test
testb
testc
all you need is:
=ARRAYFORMULA(IFNA(REGEXEXTRACT(A1:A&"", "(.+)#")))
Use this function in Google Apps Script:
function myFunction() {
  // Your spreadsheet
  var ss = SpreadsheetApp.getActive();
  // If you have only one sheet
  var sheet = ss.getSheets()[0];
  // This assumes your sheet has a header row; if not, replace 2 by 1 and (sheet.getLastRow()-1) by sheet.getLastRow()
  var valuesColumnA = sheet.getRange(2, 1, sheet.getLastRow() - 1).getValues();
  // Keep the text before "#" for each row, as a one-column 2D array
  var valuesColumnB = valuesColumnA.map(function (r) {
    return [String(r[0]).split("#")[0]];
  });
  // Write column B in a single call instead of one setValue() per row
  sheet.getRange(2, 2, valuesColumnB.length, 1).setValues(valuesColumnB);
}
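The formula and the script do not behave identically on unusual input, which may matter when choosing between them. The sketch below mirrors both in plain JavaScript: the greedy (.+)# keeps everything before the last #, and yields a blank when there is no # (as IFNA does in the formula), while split keeps everything before the first # and returns the whole value when there is none. The function names are hypothetical.

```javascript
// Mirrors REGEXEXTRACT(value, "(.+)#") wrapped in IFNA: greedy, blank on no match.
function beforeLastHash(value) {
  var m = /(.+)#/.exec(String(value));
  return m ? m[1] : "";
}

// Mirrors value.split("#")[0] from the script: text before the first "#".
function beforeFirstHash(value) {
  return String(value).split("#")[0];
}

// Both agree on ordinary addresses such as the ones in the question.
var ordinary = [beforeLastHash("testb#gmail.com"), beforeFirstHash("testb#gmail.com")];
// They differ when the value has two "#" characters or none at all.
var twoHashes = [beforeLastHash("a#b#c.com"), beforeFirstHash("a#b#c.com")];
var noHash = [beforeLastHash("nohash"), beforeFirstHash("nohash")];
```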
This answer is based on my understanding of the question, so please correct me if I have misunderstood.
You can use the split feature to get this:
Go to Data
Click on Split text to columns
Pick Custom from the drop-down
Enter # and you get your result.

Google Script - Find and Replace using "Search Within Formulas"

I have a google sheet that receives a list of phone numbers from an outside source. Phone numbers arrive in one of two formats:
Numbers that appear as 12345678901 are seen without error.
Numbers that appear as 1(234)567-8901 result in #ERROR!.
It seems that google sheets is reading the second set of numbers as a formula. When I click into an error cell, the phone number is preceded with "=+", as in "=+1(234)567-8901". I can fix this manually for the entire document by using Find and Replace with "Search within Formulas" checked.
Find: "=+"
Replace: " "
Is there any way to automate this within google apps scripts? I would like to run this function onEdit() so that #ERROR! phone numbers are fixed in real time.
You can remove the ()- characters using a spreadsheet formula, let's say the number was in cell A1, then in another cell you can put:
=CONCATENATE(SPLIT(A1, "()-" ))
which will remove the ()- characters.
If you would like to do this with a script, you can use replace to remove the ()- characters:
.replace(/[()-]/g, "")
Apply this to the values in your number column to format the numbers properly.
EDIT
This should work, change "A1:A" to your column
function onEdit(){
  var sheet = SpreadsheetApp.getActiveSheet();
  var range = sheet.getRange("A1:A" + sheet.getLastRow());
  var data = range.getValues();
  var formulas = range.getFormulas();
  for (var i = 0; i < formulas.length; i++) {
    // getFormulas() returns "" for cells without a formula, so test the cell, not the row array
    if (formulas[i][0] !== "") {
      data[i][0] = formulas[i][0].replace(/[=()+-]/g, "");
    }
  }
  range.setValues(data).setNumberFormat("0");
}
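The script above rewrites every formula in the column, so a genuine formula such as a SUM would also be stripped. One possible safeguard, sketched here as an assumption rather than as part of the original answer, is to clean only formulas that look like a phone number preceded by "=+". The helper names looksLikePhoneFormula and cleanPhoneFormula are hypothetical.

```javascript
// True only for strings like "=+1(234)567-8901": "=+" followed by
// digits, parentheses, spaces, or hyphens and nothing else.
function looksLikePhoneFormula(formula) {
  return /^=\+[\d()\s-]+$/.test(String(formula));
}

// Strip the "=", "+", parentheses, and hyphens, as the script above does,
// but leave anything that is not a phone-like formula untouched.
function cleanPhoneFormula(formula) {
  if (!looksLikePhoneFormula(formula)) {
    return formula;
  }
  return String(formula).replace(/[=()+-]/g, "");
}

var fixedNumber = cleanPhoneFormula("=+1(234)567-8901");
var realFormula = cleanPhoneFormula("=SUM(A1:A3)");
```

In the loop, the condition formulas[i][0] !== "" could then be replaced by looksLikePhoneFormula(formulas[i][0]).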

GoogleSheet script editor - onEdit event with conditions / if statement

guys!
I'm new to this website and also not good with coding. So I would really appreciate some help.
Right now I'm in need of a specific code to make a google sheet work perfectly.
To further explain:
I have a Google Sheet into which co-workers will enter some information. What I need is code that records the date in a specific cell and who made the input in another cell.
So far this is what I have:
function onEdit(event) {
var sheet = event.source.getSheetByName("Input");
// Note: actRng = return the last cell of the row modified
var actRng = event.source.getActiveRange();
var index = actRng.getRowIndex();
var cindex = actRng.getColumnIndex();
// Note: date = return date
// Note: user = return the user email
var userCell = sheet.getRange(index,14);
var dateCell = sheet.getRange(index,2);
var inputdate = Utilities.formatDate(new Date(), "GMT+0200", "yyyy-MM-dd");
// Note(with hour): var inputdate = Utilities.formatDate(new Date(), "GMT+0200", "yy-MM-dd HH:mm");
//var user = event.user; // Note: event.user will not give you collaborator's Id
var user = Session.getEffectiveUser();
// Note: setValue = Insert in the cell the date when this row was modified
if (userCell.Value == null) {
userCell.setValue(user);
dateCell.setValue(inputdate)
}
}
My main problems/questions are:
I don't exactly need the last modifier, but the person who first input info on the cells. Therefore I tried that last IF (If the cell that is supposed to have the last modifier e-mail is blank, it means that nobody changed that row before, so the code should add the user on the userCell), although it is not working since every change I make it ignores the verification.
I also want to add that the event will only happen if you add values, if you delete them, nothing happens. (so far even when I delete cells, it counts as modification)
Most of the sheet is protected to avoid that people by accident erase some of the formulas, so the cells that this code changes are also protected. Is there a way to make the code bypass cell protection?
Please, help me identify what I'm doing wrong and hopefully I'll get this working perfectly! Thanks for the help !
If you want to prevent the script from firing when a cell is deleted, check the value of the edited cell (e is the event object passed to onEdit, named event in your code):
var editedCell = SpreadsheetApp.getActiveSheet().getRange(e.range.getRow(), e.range.getColumn());
if (editedCell.getValue() === "") {
return;
}
I would change Session.getEffectiveUser() to Session.getActiveUser().
The last if statement is unnecessary. You want whoever most recently edited the field to be identified, along with the date.
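If the first person to fill in a row is wanted instead, as the question describes, the check in the original code needs attention: a Range has no Value property, so userCell.Value is always undefined, undefined == null is true, and the stamp is rewritten on every edit. Reading the cell with getValue() avoids that. The decision itself can be isolated in a small function, sketched below. The name shouldStamp is hypothetical and the function uses no Apps Script services.

```javascript
// A cell counts as empty when it is "", null, or undefined.
function isBlank(value) {
  return value === "" || value === null || value === undefined;
}

// Stamp the row only when something was entered (not deleted)
// and no user has been recorded for that row yet.
function shouldStamp(editedValue, existingUser) {
  return !isBlank(editedValue) && isBlank(existingUser);
}

// Assumed usage inside onEdit(event), with userCell and dateCell as in the question:
//   if (shouldStamp(event.range.getValue(), userCell.getValue())) {
//     userCell.setValue(user);
//     dateCell.setValue(inputdate);
//   }
```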