Google Script Split cell contents after / with a line break in email body - replace

In Google Apps Script I am referencing a cell to key into the body of my email; the cell is a joined consolidation of other cells with a delimiter of "/".
The result is e.g. A/B/C/D/E.
I would like to get the cell value but add a line break after each "/" so it reads in the body of my email as:
A
B
C
D
E
I tried this but it is not working:
var issues = e.source.getSheetByName("Form Responses").getRange(e.range.rowStart, 3).getDisplayValue().replace(/(/)/g,"\n");//
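For what it's worth, the likely culprit is the unescaped "/" inside the regex literal, which terminates the pattern early. A minimal sketch of just the replace step, using a literal stand-in for the getDisplayValue() call:

```javascript
// Stand-in for getDisplayValue(); "A/B/C/D/E" is the example value from the question
var issues = "A/B/C/D/E";
// Escape the slash so it does not end the regex literal prematurely
var withBreaks = issues.replace(/\//g, "\n");
```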

Related

Skip translating words with %% and [] in Google Sheets

I am using GoogleTranslate() with Sheets to translate some contents into different languages. In those contents, we have multiple hooks [ ] and % % in one string that do not need to be translated. Example:
[name] [surname] looked at your profile %number% !
I do not need to translate hooks like [username] and %number%.
I'm looking for:
[name] [surname] a regardé ton profil %number% ! (in French, for example)
A solution is already provided here for one character using REGEXREPLACE and REGEXEXTRACT, but I need both [xxx] and %xxx% handled in one formula. Thank you.
Alternatively, instead of using GOOGLETRANSLATE with multiple nested functions, you can create a script bound to your spreadsheet file and then copy/paste the simple custom script below, which contains a translate() function, for a more simplified use on your sheet:
CUSTOM SCRIPT
function translate(range) {
  var container = [];
  // Keep all %***% and [***] tokens in a container
  var regex = /(\[.*?\])|(%.*?%)/g;
  var matched;
  while ((matched = regex.exec(range)) !== null) {
    container.push(matched[0]);
  }
  // Translate the text from English to French without the %***% and [***] tokens
  var replacedData = range.replace(regex, '#');
  var toTranslate = LanguageApp.translate(replacedData, 'en', 'fr');
  // Rearrange the translated text with the %***% and [***] tokens from the container
  var parts = toTranslate.split('#');
  var res = '';
  for (var x = 0; x < parts.length; x++) {
    res += parts[x] + (x < container.length ? ' ' + container[x] : '');
  }
  // Return the final translated text with the tokens unmodified, spacing normalized
  return res.replace(/\s+/g, ' ').trim();
}
After saving the script, simply put =translate(A1) in a sheet cell (e.g. when the text you want to translate is in cell A1). The script will skip any words inside [***] and %***% and translate only the rest of the text to French.
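The placeholder round-trip the script relies on can be sketched outside Apps Script; fakeTranslate below is a hypothetical stand-in for LanguageApp.translate, which only exists inside the Apps Script runtime:

```javascript
// Hypothetical stand-in for LanguageApp.translate, for illustration only
function fakeTranslate(text) {
  return text.replace("looked at your profile", "a regardé ton profil");
}

// Protect [***] and %***% tokens behind '#', translate, then splice the tokens back in
function translateKeepingTokens(input) {
  var regex = /(\[.*?\])|(%.*?%)/g;
  var tokens = input.match(regex) || [];   // e.g. ["[name]", "[surname]", "%number%"]
  var masked = input.replace(regex, "#");  // tokens hidden from the translator
  var translated = fakeTranslate(masked);
  var parts = translated.split("#");
  var out = "";
  for (var i = 0; i < parts.length; i++) {
    out += parts[i] + (i < tokens.length ? tokens[i] : "");
  }
  return out.replace(/\s+/g, " ").trim();
}

var result = translateKeepingTokens("[name] [surname] looked at your profile %number% !");
```

The same mask-translate-splice order is what the custom script performs with the real translation service.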
Try this:
=arrayformula(if(A1<>"",join("",if(isnumber(flatten(split(GOOGLETRANSLATE(join(" ",iferror(regexreplace(to_text(flatten(split(A1," "))),"(\[.*\])|(\%.*\%)","["&row(A$1:A)&"]"),)),"en","fr"),"[]"))),vlookup(flatten(split(GOOGLETRANSLATE(join(" ",iferror(regexreplace(to_text(flatten(split(A1," "))),"(\[.*\])|(\%.*\%)","["&row(A$1:A)&"]"),)),"en","fr"),"[]")),{sequence(len(regexreplace(A1,"[^\ ]",))+1,1),flatten(split(A1," "))},2,false),flatten(split(GOOGLETRANSLATE(join(" ",iferror(regexreplace(to_text(flatten(split(A1," "))),"(\[.*\])|(\%.*\%)","["&row(A$1:A)&"]"),)),"en","fr"),"[]")))),))
GOOGLETRANSLATE does not work with ARRAYFORMULA, but you can drag down this formula from cell B1 if you want to apply it to multiple rows in column A.
Individual steps taken:
Split text by space character, then flatten into one column.
Cell D1: =flatten(split(A1," "))
Replace [***] and %***% with [row#].
Cell E1: =arrayformula(iferror(regexreplace(to_text(flatten(split(A1," "))),"(\[.*\])|(\%.*\%)","["&row(A$1:A)&"]"),))
Join the rows into one cell.
Cell F1: =join(" ",E:E)
Apply Google Translate.
Cell G1: =GOOGLETRANSLATE(F1,"en","fr")
Split by [].
Cell H1: =flatten(split(G1,"[]"))
Where rows contain numbers, look up the corresponding original word from step 1.
Cell I1: =arrayformula(if(isnumber(H1:H),vlookup(H1:H,{row(A$1:A),D:D},2,false),H1:H))
Join the rows into one cell.
Cell J1: =join(" ",I:I)

Google Apps Script: REGEX to fix malformed pipe delimited csv file runs too slowly

I have a Google Apps Script that processes this "csv" file daily.
The file is getting larger and it is starting to time out.
The pipe-delimited "csv" file includes newline characters in the comments fields of some records. This causes those records to break before the true end of record. The following code removes the extraneous line breaks when they fall in the middle of a record and formats the data in a useful csv format. Is there a more efficient way to write this code?
Here's the snippet:
function cleanCSV(csvFileId){
  // The file we receive has line breaks in the middle of the records;
  // this removes the line breaks and converts the file to a csv.
  var content = DriveApp.getFileById(csvFileId).getBlob().getDataAsString();
  var identifyNewLine = content.replace(/\r\n\d{1,5}\|/g, "~~$&"); // marks the beginning of a new record with double tildes before we remove all the line breaks
  var noReturnsContent = identifyNewLine.replace(/\r\n/g, ""); // removes returns
  var newContent = noReturnsContent.replace(/~~/g, "\r\n"); // returns one record per client
  var noEndQuote = newContent.replace(/'\|/g, "|"); // removes trailing single quote
  var csvContent = noEndQuote.replace(/\|'/g, "|"); // removes leading single quote
  //Logger.log(csvContent);
  var sheetId = DriveApp.getFolderById(csvFolderId).createFile(csvFileName, csvContent, MimeType.CSV).getId();
  return sheetId;
}
Here is a sample of the file:
The first three replace lines can be merged into one: you just want to remove all \r\n occurrences that are not followed by 1 to 5 digits and a |, i.e. .replace(/\r\n(?!\d{1,5}\|)/g,"").
The last two replace lines can also be merged into one if you use alternation: .replace(/'\||\|'/g,"|").
Use
function cleanCSV(csvFileId){
  // The file we receive has line breaks in the middle of the records;
  // this removes the line breaks and converts the file to a csv.
  var content = DriveApp.getFileById(csvFileId).getBlob().getDataAsString();
  var newContent = content.replace(/\r\n(?!\d{1,5}\|)/g, ""); // remove line endings not followed by 1-5 digits and |
  var csvContent = newContent.replace(/'\||\|'/g, "|"); // removes trailing/leading single quotes
  //Logger.log(csvContent);
  var sheetId = DriveApp.getFolderById(csvFolderId).createFile(csvFileName, csvContent, MimeType.CSV).getId();
  return sheetId;
}
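Since the question's sample file is not shown, here are the two merged replaces run over a small synthetic record (the record layout is an assumption):

```javascript
// Synthetic sample: record 1's comment field contains a mid-record line break
var content = "1|'Alice'|comment line one\r\ncomment line two\r\n2|'Bob'|ok\r\n";
// Remove line endings that are NOT followed by 1-5 digits and a pipe (i.e. mid-record breaks)
var newContent = content.replace(/\r\n(?!\d{1,5}\|)/g, "");
// Strip single quotes that sit next to a pipe delimiter
var csvContent = newContent.replace(/'\||\|'/g, "|");
```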

How to extract parts of logs based on identification numbers?

I am trying to extract and preprocess log data for a use case.
For instance, the log consists of problem numbers with information for each ID underneath. Each element starts with:
#!#!#identification_number###96245#!#!#change_log###
action
action1
change
#!#!#attribute###value_change
#!#!#attribute1###status_change
#!#!#attribute2###<None>
#!#!#attribute3###status_change_fail
#!#!#attribute4###value_change
#!#!#attribute5###status_change
#!#!#identification_number###96246#!#!#change_log###
action
change
change1
action1
#!#!#attribute###value_change
#!#!#attribute1###status_change_fail
#!#!#attribute2###value_change
#!#!#attribute3###status_change
#!#!#attribute4###value_change
#!#!#attribute5###status_change
I extracted the identification numbers and saved them as a .csv file:
import re

f = open(r'C:\Users\reszi\Desktop\Temp\output_new.txt', encoding="utf8")
change_log = f.read()  # read the whole file as one string so re.findall can search it
number = re.findall('#!#!#identification_number###(.+?)#!#!#change_log###', change_log)
Now what I am trying to achieve is that, for every ID in the .csv file, I can append the corresponding log content, which is:
action
change
#!#!#attribute###
Since I am rather new to Python and only started working with regex a few days ago, I was hoping for some help.
Each log for an ID starts with "#!#!#identification_number###" and ends with "#!#!#attribute5### <entry>".
I have tried the following code, but the result is empty:
In:
x = re.findall("\[^#!#!#identification_number###((.|\n)*)#!#!#attribute5###((.|\n)*)$]", str(change_log))
In:
print(x)
Out:
[]
Try this:
pattern='entification_number###(.+?)#!#!#change_log###(.*?)#!#!#id'
re.findall(pattern, string+'#!#!#id', re.DOTALL)
The DOTALL flag makes the dot match newlines too, so in the second capturing group you will find the logs.
If you want to get the attributes, for each identification number, you can parse the logs (got for the search above) of each id number with the following:
pattern='#!#!#attribute(.*?)###(.*?)#!#'
re.findall(pattern, string_for_each_log_match+'#!#', re.DOTALL)
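The first pattern can be checked against an abridged version of the sample log from the question, with nothing beyond Python's standard re module:

```python
import re

# Abridged sample log from the question
string = (
    "#!#!#identification_number###96245#!#!#change_log###\n"
    "action\naction1\nchange\n"
    "#!#!#attribute###value_change\n"
    "#!#!#attribute1###status_change\n"
    "#!#!#identification_number###96246#!#!#change_log###\n"
    "action\nchange\n"
    "#!#!#attribute###value_change\n"
)

# Append the '#!#!#id' sentinel so the last log is terminated like the others
pattern = 'entification_number###(.+?)#!#!#change_log###(.*?)#!#!#id'
logs = re.findall(pattern, string + '#!#!#id', re.DOTALL)
ids = [match[0] for match in logs]
```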
If you put each id into the regex when you search, using string.format(), you can grab the lines that contain the correct changelog.
with open(r'path\to\csv.csv', 'r') as f:
    ids = [line.strip() for line in f]  # strip trailing newlines so each id drops into the regex cleanly

with open(r'C:\Users\reszi\Desktop\Temp\output_new.txt', encoding="utf8") as f:
    change_log = f.readlines()

matches = {}
for id_no in ids:
    for i in range(len(change_log)):
        reg = '#!#!#identification_number###({})#!#!#change_log###'.format(id_no)
        if re.search(reg, change_log[i]):
            matches[id_no] = i
            break
This will create a dictionary with the structure {id_no:line_no,...}.
So once you have all of the lines that tell you where each log starts, you can grab the lines you want that come after these lines.
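One way to sketch that last step, assuming matches was built as above ({id_no: line_no}) and each log runs until the next identification line; the five-line change_log here is a made-up stand-in for the real file:

```python
# Made-up stand-in for the real change_log lines
change_log = [
    "#!#!#identification_number###96245#!#!#change_log###\n",
    "action\n",
    "#!#!#attribute###value_change\n",
    "#!#!#identification_number###96246#!#!#change_log###\n",
    "change\n",
]
matches = {"96245": 0, "96246": 3}  # {id_no: line_no}

# Each log spans from the line after its header to the next header (or end of file)
logs = {}
starts = sorted(matches.values())
for id_no, start in matches.items():
    later = [s for s in starts if s > start]
    end = min(later) if later else len(change_log)
    logs[id_no] = change_log[start + 1:end]
```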

Google Script - Find and Replace using "Search Within Formulas"

I have a google sheet that receives a list of phone numbers from an outside source. Phone numbers arrive in one of two formats:
Numbers that appear as 12345678901 are seen without error.
Numbers that appear as 1(234)567-8901 result in #ERROR!.
It seems that google sheets is reading the second set of numbers as a formula. When I click into an error cell, the phone number is preceded with "=+", as in "=+1(234)567-8901". I can fix this manually for the entire document by using Find and Replace with "Search within Formulas" checked.
Find: "=+"
Replace: " "
Is there any way to automate this within google apps scripts? I would like to run this function onEdit() so that #ERROR! phone numbers are fixed in real time.
You can remove the ()- characters using a spreadsheet formula, let's say the number was in cell A1, then in another cell you can put:
=CONCATENATE(SPLIT(A1, "()-" ))
which will remove the ()- characters.
If you would like to do this with a script then you can use replace to remove the ()-
.replace(/[()-]/gi, "")
Apply this over your number column range to properly format the numbers.
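Run on one of the malformed values from the question, the replace looks like this ("=" and "+" are included in the character class as well, since that is how the bad cells are stored once Sheets parses them as formulas):

```javascript
// One of the malformed cells, as Sheets stores it after parsing it as a formula
var raw = "=+1(234)567-8901";
// Strip the "=" and "+" formula prefix along with the "()" and "-" punctuation
var cleaned = raw.replace(/[=+()-]/g, "");
```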
EDIT
This should work; change "A1:A" to your column:
function onEdit(){
  var sheet = SpreadsheetApp.getActiveSheet();
  var range = sheet.getRange("A1:A" + sheet.getLastRow());
  var data = range.getValues();
  var formulas = range.getFormulas();
  for (var i = 0; i < formulas.length; i++) {
    if (formulas[i][0]) { // only touch cells whose content was parsed as a formula
      data[i][0] = formulas[i][0].replace(/[=()+-]/g, "");
    }
  }
  range.setValues(data).setNumberFormat("0");
}

Remove a line from list and all successive lines to N?

I have a list in R, which is a set of lines from a relatively unstructured document that I am scraping for data. At the top of each page is a page number, preceded by the string "Page", and several lines of header information which I would like to drop.
Each document has a different number of header lines. My solution so far:
RawFeed.1<- grep("Page",RawFeed)
RawFeed.1a<-length(RawFeed.1)
RawFeed.1<-RawFeed.1[-1]
Note the first instance is dropped here because the first page always has more header lines than the rest of the pages, and it's dropped later anyway.
y<-RawFeed.1[1]
ya<-c(y:length(RawFeed))
NSearch<-RawFeed[ya]
NSearch.1<-grep("Start", NSearch)
y1<-NSearch.1[1]
y1<-y1-1
y2<-c(0:y1)
As 'start' is always found on the line before the data begins, this consistently gives me the document specific number of header lines.
Next I attempt to remove them by:
PageBreak <- function(x, y) {
  RawFeed <- RawFeed[-x-y]
}
RawFeedTemp <- lapply(RawFeed.1, PageBreak, y=y2)
Which does work, sort of - I am left with an array such that RawFeedTemp[[n]] has the header information removed only for that page.
So how can I perform a similar operation so that I am left with a list where each page's header information has been removed? Or is there a way to combine the elements in the array such that it contains only one set of lines, excluding those I am trying to remove?
Edit: An example of the data
[306] N 46 10/08/12 10/08/12 Stuff :30 NM 0 $0.00"
[307] Week: 10/08/12 10/14/12 Other Stuff $6,500.00 0.00
[308] " Contract Agreement Between: Print Date 10/05/12 Page 5 of 6"
[309] ""
[310] ""
[311] " Contract / Revision Alt Order #"
[312] " Person
[313] " Address 1
[314] " Address 2
[315] " Address 3
[316] " Address 4
[317] ""
[318] " Original Date / Revision"
[319] ""
[320] "08/10/12 / 10/04/12"
[321] ""
[322] ""
[323] ""
[324] "* Line Ch Start Date End Date Description Start
[325] MORE DATA
Another file might have a different number of these headers. Also note that records occupy more than one line; most files finish a record before starting a new page, but a few insist on pushing the second line of the record to a new page, which is why I need to remove them all.
Thanks for your help!
Since you don't give a clear example of your data, I am not sure of the given solution.
If I understand correctly, you have a document with parts (headers) between 'Page' and 'start' that you want to remove. Here is a sample of your data with 2 headers:
str <- 'Page ...... ### header1
alalalala
lalalalalal
aalalala
lslslsls start ksksksks
keep me 1
keep me 2
Page ...... ### header 2
aalalala
lslslsls start ksksksks
keep me 3
keep me 4'
Here I am using readLines to read the document, find the header lines using grep, and remove the joined set of line indexes from the list of lines.
ll <- readLines(textConnection(str))
ids <- matrix(grep('Page|start',ll),ncol=2,byrow=TRUE)
ll[-unlist(apply(ids,1,function(x)seq(x[1],x[2])))]
[1] "keep me 1" "keep me 2" "keep me 3" "keep me 4"