I am trying to split text into date and event columns. Simply splitting on ". " is not possible, because some lines contain several sentences ending with ". ". Also, some lines don't start with a date. The idea of the script was to use a regexp to find lines starting with the pattern "one or two digits, space, letters, period, space" and then replace that "period, space" with a rare character, for example "#". If the line does not start with this pattern, "#" is added to the beginning instead. The array can then easily be split into two parts on this symbol ("#") and written to the sheet.
Unfortunately, something went wrong: match(re) is always null. I would appreciate help composing the correct regular expression and solving the problem.
Original text:
1 June. Astronomers report narrowing down the source of Fast Radio
Bursts (FRBs). It may now plausibly include "compact-object mergers
and magnetars arising from normal core collapse supernovae".[3][4]
The existence of quark cores in neutron stars is confirmed by Finnish
researchers.[5][6][7]
3 June. Researchers show that compared to rural populations urban red
foxes (pictured) in London are mirroring patterns of domestication
similar to domesticated dogs, as they adapt to their city
environment.[21]
The discovery of the oldest and largest structure in
the Maya region, a 3,000-year-old pyramid-topped platform Aguada
Fénix, with LiDAR technology is reported.
17 June. Physicists at the XENON dark matter research facility report
an excess of 53 events, which may hint at the existence of
hypothetical Solar axions.
Desired result:
Code:
function replace() {
const sheetName = "Sheet1";
const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(sheetName);
const lr = sheet.getLastRow();
// const range = sheet.getRange(2, 4, lr - 1);
const range = sheet.getRange(100, 4, 5);
const arr = range.getValues();
const newArr = [];
const re = new RegExp("^([0-9]{1,2}\s[a-z]+\.)\s");
for (let i = 0; i < arr.length; i++) {
const match = arr[i][0].match(re);
if (match == null) {
let newEntry = "#" + arr[i];
newArr.push(newEntry);
} else {
// let newEntry = "#" + arr[i];
// newArr.push(newEntry);
}
}
// range.offset(0,1).setValues(newArr);
// console.log(newArr);
}
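A likely reason match(re) is always null: inside a quoted string, "\s" and "\." lose their backslashes before RegExp ever sees them, and [a-z] cannot match the capital letter that starts the month name. The sketch below is plain JavaScript that also runs in Apps Script; splitDateEvent and DATE_PREFIX are hypothetical names, not part of the original script. It uses a regex literal instead and returns the date and event directly, so the "#" placeholder step is not needed.

```javascript
// What the original string-based pattern actually compiles to:
const BROKEN = new RegExp("^([0-9]{1,2}\s[a-z]+\.)\s");
console.log(BROKEN.source); // ^([0-9]{1,2}s[a-z]+.)s

// Regex literal: one or two digits, whitespace, letters, then ". ".
// [A-Za-z] is an assumption that month names use unaccented Latin letters.
const DATE_PREFIX = /^(\d{1,2}\s[A-Za-z]+)\.\s/;

// Hypothetical helper: returns [date, event]; the date is "" when the
// line does not start with a date.
function splitDateEvent(line) {
  const text = String(line);
  const match = text.match(DATE_PREFIX);
  if (match === null) {
    return ["", text];
  }
  // match[0] is the whole prefix including ". ", match[1] is just the date.
  return [match[1], text.slice(match[0].length)];
}

const sample = [
  "1 June. Astronomers report narrowing down the source of FRBs. It may include magnetars.",
  "The existence of quark cores in neutron stars is confirmed.",
];
console.log(sample.map(splitDateEvent));
```

Because each result is already a two-element row, the mapped array has the 2D shape that setValues() expects for a two-column range.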
function breakapart() {
const ms = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
const ss = SpreadsheetApp.getActive();
const sh = ss.getSheetByName('Sheet1');//Data Sheet
const osh = ss.getSheetByName('Sheet2');//Output Sheet
osh.clearContents();
const vs = sh.getRange(1, 1, sh.getLastRow(), sh.getLastColumn()).getDisplayValues().flat();
let oA = [];
vs.forEach(p => {
let f = p.split(/[. ]/);
if (!isNaN(f[0]) && ms.includes(f[1])) {
let s = p.slice(0, p.indexOf('.'));
let t = p.slice(p.indexOf('.')+2);
oA.push([s, t]);
} else {
oA.push(['',p]);
}
});
osh.getRange(1,1,oA.length,oA[0].length).setValues(oA);
}
I have some data in columns A to M, from row 10 down to row 59.
I have some dates in column F10:F that are text strings, converted to real dates in column N (the process is described in this question).
M3 is set to =NOW().
In cell N3 I have: =M3+14.
I want to delete all the rows, with a date in column N10:N that comes before [today + 2 weeks] (so cell N3).
When I run my script in Apps Script, the body of the if statement never executes, but if I comment the if out, the for loop runs and deletes the rows, so I'm pretty sure the problem is, again, date formatting.
In this question I ask: how do I compare the values of N10:N with N3, in order to delete all the rows that don't meet the condition if(datesNcol <= targetDate)? (in code is written as if (rowData[i] < flatArray))
I leave also a demo sheet with this problem explained in detail and two alternatives (getBackground condition and numeric days condition).
Attempts:
This is a simplified code example:
const gen = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Generatore');
const bVals = gen.getRange('B10:B').getValues();
const bFilt = bVals.filter(String);
const dataLastRow = bFilt.length;
function deleteExpired() {
dateCorrette(); //ignore, formula that puts corrected dates from N10 to dataLastRow
var dateCorrect = gen.getRange(10,14,dataLastRow,1).getValues();
var targetDate = gen.getRange('N3').getValues();
var flatArray = [].concat.apply([], targetDate);
for (var i = dateCorrect.length - 1; i >= 0; i--) {
var rowData = dateCorrect[i];
if (rowData[i] < flatArray) {
gen.deleteRow(i+10);
}
}
};
If I run the script, nothing is deleted.
If I comment out the if statement and its closing bracket, it deletes all the rows of the list one by one.
I can't manage to meet that condition.
Right now, it logs [Sun Jan 01 10:33:20 GMT-05:00 2023] as flatArray
and [Wed Dec 21 03:00:00 GMT-05:00 2022] as dateCorrect[49], which is the first row to delete (the 50th); the logged values are correct for all the dateCorrect[i] dates.
I tried calling getTime() on the targetDate variable, but it only works with getValue(), not getValues(), so I don't know how to use getTime() on rowData, which comes from dateCorrect[i] and has to use getValues(). It also doesn't accept the flatArray variable then, which has to be commented out (otherwise it logs [ ] for flatArray, not the corrected date).
I leave the other attempts in the demo sheet, because I want to prioritize this problem around the date and make it clear in my head.
Thanks for all the help.
DEMO SHEET, ITA Locale time
I don't know how the demo sheet works with Apps Script, so I suggest copying the code into a personal sheet.
UPDATE:
I've also tried adding an extra column, with a built-in IF function that writes "del" if the row has to be deleted.
=IF(O10>14;"del";"")
And then
var boba = gen.getRange(10,16,bLast,1).getDisplayValues();
.
.
if (boba[i] == 'del')
This does the job. But I can't understand why the other methods don't work.
Try this. It seems like you do a lot of things that aren't necessary. Unless I'm missing something.
A few notes. I typically do not use global variables unless absolutely necessary. I don't create a variable for the last row unless I have to use that value multiple times in my script; I use the method Sheet.getLastRow() instead. dateCorrect is a 2D array with one column, so the second index can only be [0]. And getRange('N3') is a single cell, so getValue() is good enough.
function deleteExpired() {
const gen = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Generatore');
var dateCorrect = gen.getRange(10,14,gen.getLastRow()-9,1).getValues();
var targetDate = gen.getRange('N3').getValue();
for (var i = dateCorrect.length - 1; i >= 0; i--) {
if (dateCorrect[i][0] < targetDate) {
gen.deleteRow(i+10);
}
}
}
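To see why the original condition never fired, the comparison can be tried on plain arrays outside a spreadsheet. This is a minimal sketch with made-up dates: the arrays stand in for what getValues() and getValue() return, and rowsToDelete is a hypothetical helper, not a Sheets API. In the question's loop, rowData[i] is undefined for every i greater than 0 (each row has only index 0), and comparing a Date with an array does not compare dates, so the test was always false.

```javascript
// Stand-in for gen.getRange(10, 14, n, 1).getValues(): 2D, one column wide.
const dateCorrect = [
  [new Date(2023, 0, 10)],
  [new Date(2022, 11, 21)],
  [new Date(2022, 11, 30)],
];
// Stand-in for gen.getRange('N3').getValue(): a single Date.
const targetDate = new Date(2023, 0, 1);

// The question's test, reproduced: false for every row.
const flatArray = [targetDate];
console.log(dateCorrect.map((rowData, i) => rowData[i] < flatArray));

// Hypothetical helper: sheet row numbers to delete, listed bottom-up so
// that deleting them in this order never shifts a row still to be deleted.
function rowsToDelete(values, target, firstRow) {
  const rows = [];
  for (let i = values.length - 1; i >= 0; i--) {
    const cell = values[i][0]; // the only column in each row
    // Skip blanks and text; compare real dates by their millisecond value.
    if (cell instanceof Date && cell.getTime() < target.getTime()) {
      rows.push(firstRow + i);
    }
  }
  return rows;
}

console.log(rowsToDelete(dateCorrect, targetDate, 10));
```

The instanceof check is a design choice for ranges such as N10:N that may include empty cells below the data; those come back as empty strings and are simply skipped.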
Try this:
function delRows() {
const ss = SpreadsheetApp.getActive();
const gsh = ss.getSheetByName('Generatore');
const colB = gsh.getRange('B10:B' + gsh.getLastRow()).getValues();
var colN = gsh.getRange('N10:N' + gsh.getLastRow()).getValues();
var tdv = new Date(new Date().getFullYear(), new Date().getMonth(), new Date().getDate() + 14).valueOf();//current date + 14
let d = 0;
colN.forEach((n, i) => {
if (new Date(n[0]).valueOf() < tdv) {
gsh.deleteRow(i + 10 - d++);
}
});
}
I found the following code to emulate the PROPER formula, but its syntax is wrong (maybe outdated), and as far as I understand, it should apply to all columns of a given sheet.
function PROPER_CASE(str) {
if (typeof str != "string")
throw `Expected string but got a ${typeof str} value.`;
str = str.toLowerCase();
var arr = str.split(/.-:?—/ );
return arr.reduce(function(val, current) {
return val += (current.charAt(0).toUpperCase() + current.slice(1));
}, "");
}
Here's an example of the input :
| A | B | C | D |
| --- | --- | --- | --- |
| ColumnA | ColumnB | ColumnC | ColumnD |
| EXCEL ACTION LIMIMTED (毅添有限公司) | 207/2018 | n/a | without-proper |
| Hang Wo Holdings | 205/2015 | 35/2020 | without-proper |
| central southwood limited | 308/2019 | n/a | without-proper |
This would be the desired output:
ColumnA ColumnB ColumnC ColumnD
Excel Action Limited (毅添有限公司) 207/2018 n/a without-proper
Hang Wo Holdings 205/2015 35/2020 without-proper
Central Southwood Limited 308/2019 n/a without-proper
And this is the error output of that function :
Erro
Expected string but got a undefined value.
PROPER_CASE # macros.gs:115
This is the only way I can see of reproducing your results. I don't see how to avoid capitalizing the first letter of the last two columns other than by skipping them:
function lfunko() {
const ss = SpreadsheetApp.getActive();
const sh = ss.getSheetByName("Sheet0");
if (sh.getLastRow() > 4) {
sh.getRange(6, 1, sh.getLastRow() - 5, sh.getLastColumn()).clearContent();
SpreadsheetApp.flush();
}
const vs = sh.getDataRange().getDisplayValues().map((r, i) => {
return r.map((c, j) => {
if (i > 0 && j < 1) {
let arr = c.toString().toLowerCase().split(/.-:?-/g);
return arr.reduce((val, current) => {
//Logger.log(current)
return val += current.charAt(0).toUpperCase() + current.slice(1);
}, '');
} else {
return c;
}
});
});
Logger.log(JSON.stringify(vs))
sh.getRange(sh.getLastRow() + 2, 1, vs.length, vs[0].length).setValues(vs);
}
| A | B | C | D |
| --- | --- | --- | --- |
| Data | | | |
| ColumnA | ColumnB | ColumnC | ColumnD |
| EXCEL ACTION LIMIMTED (毅添有限公司) | 207/2018 | n/a | without-proper |
| Hang Wo Holdings | 205/2015 | 35/2020 | without-proper |
| central southwood limited | 308/2019 | n/a | without-proper |
| Output | | | |
| ColumnA | ColumnB | ColumnC | ColumnD |
| Excel action limimted (毅添有限公司) | 207/2018 | n/a | without-proper |
| Hang wo holdings | 205/2015 | 35/2020 | without-proper |
| Central southwood limited | 308/2019 | n/a | without-proper |
I have tested your code and it works fine. It does convert the input string into a proper case.
However, take note that in Google Sheets, when you get values, your data is in 2D Array or Nested Array.
So to apply this to your Spreadsheet after getting the values you will have to target the column you want to replace and loop through each string in the array. You will then have to setValues() back to the specified range to replace it in the spreadsheet.
Solution 1:
Try:
With your function, try adding this script to apply to your spreadsheet.
function setToColumn(){
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getActiveSheet();
var dataRange = sheet.getRange(1,1,sheet.getLastRow()); //2ND Parameter is the column, replace if you want to edit different column
var allData = dataRange.getValues().flat();
var properData = []
allData.forEach(function(data){
properData.push([PROPER_CASE(data)])
});
dataRange.setValues(properData);
}
From:
Result:
Solution 2:
If you don't mind using a different script that needs only one function, you may use the script below:
function properCase() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getActiveSheet();
var dataRange = sheet.getRange(1,1,sheet.getLastRow()); //2ND Parameter is the column, replace if you want to edit different column (1 = Column A, 2 = Column B)
var allData = dataRange.getValues().flat();
var properData = []
allData.forEach(function(data){
properData.push([data.toLowerCase().replace(/\b[a-z]/ig, function(match) {return match.toUpperCase()})]);
});
dataRange.setValues(properData);
}
Reference for Solution 2:
Apps script how to format a cell to Proper Text (Case)
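The regex from Solution 2 can be tried outside a spreadsheet. The sketch below is plain JavaScript with hypothetical helper names (properCase, properCaseFirstColumn). It assumes the first row is a header and that only the first column should change, which matches the desired output in the question: applied to every column, the same regex would also turn "n/a" into "N/A" and "without-proper" into "Without-Proper".

```javascript
// Lower-case the text, then upper-case the first ASCII letter of each word.
function properCase(text) {
  return String(text)
    .toLowerCase()
    .replace(/\b[a-z]/g, (ch) => ch.toUpperCase());
}

// Hypothetical helper: `values` has the 2D shape returned by getValues().
// Row 0 is treated as a header; only column 0 of the other rows changes.
function properCaseFirstColumn(values) {
  return values.map((row, i) =>
    i === 0 ? row.slice() : [properCase(row[0]), ...row.slice(1)]
  );
}

const values = [
  ["ColumnA", "ColumnB", "ColumnC", "ColumnD"],
  ["EXCEL ACTION LIMIMTED (毅添有限公司)", "207/2018", "n/a", "without-proper"],
  ["central southwood limited", "308/2019", "n/a", "without-proper"],
];
console.log(properCaseFirstColumn(values));
```

The result keeps the same dimensions as the input, so in Apps Script it could be written back with setValues() on the range the values were read from.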
Trying to compare two columns in GoogleSheets with this formula in Column C:
=if(A1=B1,"","Mismatch")
Works fine, but I'm getting a lot of false positives:
| A | B | C |
| --- | --- | --- |
| MARY JO | Mary Jo | |
| JAY, TIM | TIM JAY | Mismatch |
| Sam Ron | Sam Ron | Mismatch |
| Jack *Ma | Jack MA | Mismatch |
Any ideas how to work this?
This uses a score based approach to determine a match. You can determine what is/isn't a match based on that score:
Score Formula = getMatchScore(A1,B1)
Match Formula = if(C1<.7,"mismatch",)
function getMatchScore(strA, strB, ignoreCase=true) {
strA = String(strA);
strB = String(strB)
const toLowerCase = ignoreCase ? str => str.toLowerCase() : str => str;
const splitWords = str => str.split(/\b/);
let [maxLenStr, minLenStr] = strA.length > strB.length ? [strA, strB] : [strB, strA];
maxLenStr = toLowerCase(maxLenStr);
minLenStr = toLowerCase(minLenStr);
const maxLength = maxLenStr.length;
const minLength = minLenStr.length;
const lenScore = minLength / maxLength;
const orderScore = Array.from(maxLenStr).reduce(
(oldItem, nItem, index) => nItem === minLenStr[index] ? oldItem + 1 : oldItem, 0
) / maxLength;
const maxKeyWords = splitWords(maxLenStr);
const minKeyWords = splitWords(minLenStr);
const keywordScore = minKeyWords.reduce(({ score, searchWord }, nItem) => {
const newSearchWord = searchWord?.replace(new RegExp(nItem, ignoreCase ? 'i' : ''), '');
score += searchWord.length != newSearchWord.length ? 1: 0;
return { score, searchWord: newSearchWord };
}, { score: 0, searchWord: maxLenStr }).score / minKeyWords.length;
const sortedMaxLenStr = Array.from(maxKeyWords.sort().join(''));
const sortedMinLenStr = Array.from(minKeyWords.sort().join(''));
const charScore = sortedMaxLenStr.reduce((oldItem, nItem, index) => {
const surroundingChars = [sortedMinLenStr[index-1], sortedMinLenStr[index], sortedMinLenStr[index+1]]
.filter(char => char != undefined);
return surroundingChars.includes(nItem)? oldItem + 1 : oldItem
}, 0) / maxLength;
const score = (lenScore * .15) + (orderScore * .25) + (charScore * .25) + (keywordScore * .35);
return score;
}
try:
=ARRAYFORMULA(IFERROR(IF(LEN(
REGEXREPLACE(REGEXREPLACE(LOWER(A1:A), "[^a-z ]", ),
LOWER("["&B1:B&"]"), ))>0, "mismatch", )))
Implementing fuzzy matching via Google Sheets formula would be difficult. I would recommend using a custom formula for this one or a full blown script (both via Google Apps Script) if you want to populate all rows at once.
Custom Formula:
function fuzzyMatch(string1, string2) {
string1 = string1.toLowerCase()
string2 = string2.toLowerCase();
var n = -1;
for (var i = 0, char; char = string2[i]; i++)
if (!~(n = string1.indexOf(char, n + 1)))
return 'Mismatch';
};
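What this does is check whether the characters of the 2nd string appear in the 1st string in the same order, ignoring case (other characters may sit in between). If any character cannot be found after the previous one, it returns 'Mismatch'; otherwise it returns nothing and the cell stays blank. See the sample data below for a case where it returns a mismatch.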
Output:
Note:
The last row is a mismatch because the 2nd string has an r in it that isn't found in the first string, so the required order is not met.
If this doesn't meet your test cases, add a more definitive list showing the expected output of the formula/function so this can be adjusted, or see player0's answer, which uses only a Google Sheets formula and is less strict with its conditions.
Reference:
https://stackoverflow.com/a/15252131/17842569
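The in-order character check described in this answer can be written a little more explicitly. This is a sketch in plain JavaScript, not the answer's exact code: isSubsequence is a hypothetical name, and it returns a boolean rather than 'Mismatch' so the behaviour is easy to try against the rows from the question.

```javascript
// True when every character of `needle` occurs in `haystack` in the same
// order, ignoring case. Other characters may appear in between.
function isSubsequence(haystack, needle) {
  const h = String(haystack).toLowerCase();
  const n = String(needle).toLowerCase();
  let pos = -1;
  for (const ch of n) {
    // Search only to the right of the previous match.
    pos = h.indexOf(ch, pos + 1);
    if (pos === -1) {
      return false;
    }
  }
  return true;
}

const pairs = [
  ["MARY JO", "Mary Jo"],
  ["JAY, TIM", "TIM JAY"],
  ["Sam Ron ", "Sam Ron"],
  ["Jack *Ma", "Jack MA"],
];
for (const [a, b] of pairs) {
  console.log(a, "|", b, "|", isSubsequence(a, b) ? "" : "Mismatch");
}
```

Note that the check is order-sensitive and one-directional: "JAY, TIM" against "TIM JAY" still fails, because the words are swapped.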
The main limitation of traditional fuzzy matching is that it doesn’t take into consideration similarities outside of the strings. Topic clustering requires semantic understanding. Goodlookup is a smart function for spreadsheet users that gets very close to semantic understanding. It’s a pre-trained model that has the intuition of GPT-3 and the join capabilities of fuzzy matching. Use it like vlookup or index match to speed up your topic clustering work in google sheets.
https://www.goodlookup.com/
I have a Google Sheets question I was hoping someone could help with.
I have a list of about 200 keywords which looks like the ones below:
**List 1**
Italy City trip
Italy Roundtrip
Italy Holiday
Hungary City trip
Czechia City trip
Croatia Montenegro Roundtrip
....
....
And I then have another list with jumbled keywords with around 1 million rows. The keywords in this list don't exactly match with the first list. What I need to do is search for the keywords in list 1 (above) in list 2 (below) and sum all corresponding cost values. As you can see in the list below the keywords from list 1 are in the second list but with other keywords around them. For example, I need a formula that will search for "Italy City trip" from list 1, in list 2 and sum the cost when that keyword occurs. In this case, it would be 6 total. Adding the cost of "Italy City trip April" and "Italy City trip June" together.
**List 2** Cost
Italy City trip April 1
Italy City trip June 5
Next week Italy Roundtrip 4
Italy Holiday next week 1
Hungary City holiday trip 9
....
....
I hope that makes sense.
Any help would be greatly appreciated
try:
=ARRAYFORMULA(QUERY({IFNA(REGEXEXTRACT(PROPER(C1:C),
TEXTJOIN("|", 1, SORT(PROPER(A1:A), 1, 0)))), D1:D},
"select Col1,sum(Col2)
where Col1 is not null
group by Col1
label sum(Col2)''", 0))
You want to establish whether keywords in one list (List#1) can be found in another list (List#2).
List#2 is 1,000,000 rows long, so I would recommend segmenting the list so that execution times are not exceeded. That's something you will be able to establish by trial and error.
The solution is to use the javascript method indexOf.
Paraphrasing from w3schools: indexOf() returns the position of the first occurrence of a specified value in a string. If the value is not found, it returns -1. So testing if (idx !=-1){ will only return List#1 values that were found in List#2. Note: The indexOf() method is case sensitive.
function so5864274503() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var srcname = "source";
var tgtname = "target";
var sourceSheet = ss.getSheetByName(srcname);
var targetSheet = ss.getSheetByName(tgtname);
// get the source list
var sourceLR = sourceSheet.getLastRow();
var srcData = sourceSheet.getRange(1,1,sourceLR).getValues();
//get the target list
var targetLR = targetSheet.getLastRow();
var tgtlist = targetSheet.getRange(1,1,targetLR,2).getValues();
var totalcostvalues = [];
// start looping through the keywords (list 1)
for (var s = 0;s<srcData.length;s++){
var totalcost = 0;
var value = srcData[s][0]
// start looping through the strings (List 2)
for (var i=0;i<tgtlist.length;i++){
// set cost to zero
var cumcost = 0;
// use indexOf to test if keyword is in the string
var idx = tgtlist[i][0].indexOf(value);
// value of -1 = no match, value > -1 indicates the position in the string where the keyword was found
if (idx !=-1){
var cost = tgtlist[i][1]
cumcost = cumcost + cost;
totalcost = totalcost+cost
}
}//end of loop - list2
//Logger.log("DEBUG: Summary: "+value+", totalcost = "+totalcost)
totalcostvalues.push([totalcost])
}// end of loop - list1
//Logger.log(totalcostvalues); //DEBUG
sourceSheet.getRange(1,2,sourceLR).setValues(totalcostvalues);
}
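The matching-and-summing logic can also be separated from the spreadsheet calls, which makes it easier to test on a small sample before running it on a million rows. The sketch below is plain JavaScript with a hypothetical function name (sumCostsByKeyword). Unlike the indexOf version above, it lower-cases both sides, so the match is case-insensitive; that is an assumption about what is wanted.

```javascript
// keywords: array of strings (List 1).
// entries: array of [text, cost] rows (List 2), the shape getValues() returns.
// Returns one total per keyword, in the same order as `keywords`.
function sumCostsByKeyword(keywords, entries) {
  // Normalise List 2 once instead of once per keyword.
  const lowered = entries.map(([text, cost]) => [
    String(text).toLowerCase(),
    Number(cost) || 0, // blank or non-numeric costs count as 0
  ]);
  return keywords.map((keyword) => {
    const needle = String(keyword).trim().toLowerCase();
    if (needle === "") {
      return 0; // an empty keyword would otherwise match every row
    }
    return lowered.reduce(
      (total, [text, cost]) => (text.includes(needle) ? total + cost : total),
      0
    );
  });
}

const list1 = ["Italy City trip", "Italy Roundtrip", "Italy Holiday", "Hungary City trip"];
const list2 = [
  ["Italy City trip April", 1],
  ["Italy City trip June", 5],
  ["Next week Italy Roundtrip", 4],
  ["Italy Holiday next week", 1],
  ["Hungary City holiday trip", 9],
];
console.log(sumCostsByKeyword(list1, list2));
```

A keyword only matches when it appears as one contiguous phrase, so "Hungary City trip" does not match "Hungary City holiday trip".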
I also got this one, but it's case sensitive a bit
function myFunction() {
var ss = SpreadsheetApp.getActive();
var sheet1 = ss.getSheets()[0];
var sheet2 = ss.getSheets()[1];
var valuesSheet1 = sheet1.getRange(2,1, (sheet1.getLastRow()-1), sheet1.getLastColumn()).getValues();
var valuesCol1Sheet1 = valuesSheet1.map(function(r){return r[0]});
var valuesCol2Sheet1 = valuesSheet1.map(function(r){return r[1]});
Logger.log(valuesCol2Sheet1);
var valuesSheet2 = sheet2.getRange(2,1, (sheet2.getLastRow()-1)).getValues();
var valuesCol1Sheet2 = valuesSheet2.map(function(r){return r[0]});
for (var i = 0; i<= valuesCol1Sheet2.length-1; i++){
var price = 0;
valuesCol1Sheet1.forEach(function(elt,index){
var position = elt.toLowerCase().indexOf(valuesCol1Sheet2[i].toLowerCase());
if(position >-1){
price = price + valuesCol2Sheet1[index];
}
});
sheet2.getRange((i+2),2).setValue(price);
};
}
I'm trying to retrieve a value from an array, based on an index parsed from a string of digits. I'm stuck on this error, and the other answers to similar questions in this forum appear to be for more advanced developers (this is my first iOS app).
The app will eventually look up weather reports ("MAFOR" groupings of 5 digits each) from a web site, parse each group and lookup values from arrays for wind direction, speed, forecast period etc using each character.
The playground code is below, appreciate any help on where I am going wrong (look for ***)
//: Playground - noun: a place where people can play
import UIKit
var str = "Hello, playground"
// create array for Forecast Period
let forecastPeriodArray = ["Existing conditions at beginning","3 hours","6 hours","9 hours","12 hours","18 hours","24 hours","48 hours","72 hours","Occasionally"]
// create array for Wind Direction
let windDirectionArray = ["Calm","Northeast","East","Southeast","South","Southwest","West","Northwest","North","Variable"]
// create array for Wind Velocity
let windVelocityArray = ["0-10 knots","11-16 knots","17-21 knots","22-27 knots","28-33 knots","34-40 knots","41-47 knots","48-55 knots","56-63 knots","64-71 knots"]
// create array for Forecast Weather
let forecastWeatherArray = ["Moderate or good visibility (> 3 nm.","Risk of ice accumulation (temp 0C to -5C","Strong risk of ice accumulkation (air temp < -5C)","Mist (visibility 1/2 to 3 nm.)","Fog (visibility less than 1/2 nm.)","Drizzle","Rain","Snow, or rain and snow","Squally weather with or without showers","Thunderstorms"]
// retrieve full MAFOR line of several information groups (this will be pulled from a web site)
var myMaforLineString = "11747 19741 13757 19751 11730 19731 11730 13900 11630 13637"
// split into array components wherever " " is encountered
var myMaforArray = myMaforLineString.components(separatedBy: " ")
let count = myMaforArray.count
print("There are \(count) items in the array")
// Go through each group and parse out the needed digits
for maforGroup in myMaforArray {
print("MAFOR group \(maforGroup)")
// get Forecast Period
var idx = maforGroup.index(maforGroup.startIndex, offsetBy: 1)
var periodInt = maforGroup[idx]
print("periodInt is \(periodInt)")
// *** here is where I am stuck... trying to use the periodInt index value to retrieve the description from the ForecastPeriodArray
var periodDescription = forecastPeriodArray(periodInt)
print("Forecast period = (forecastPeriodArray(periodInt)")
// get Wind Direction
idx = maforGroup.index(maforGroup.startIndex, offsetBy: 2)
var directionInt = maforGroup[idx]
print("directionInt is \(directionInt)")
// get Wind Velocity
idx = maforGroup.index(maforGroup.startIndex, offsetBy: 3)
var velocityInt = maforGroup[idx]
print("velocityInt is \(velocityInt)")
// get Weather Forecast
idx = maforGroup.index(maforGroup.startIndex, offsetBy: 4)
var weatherInt = maforGroup[idx]
print("weatherInt is \(weatherInt)")
}
@shallowThought was close.
You are trying to access an array by its index, therefore use the array[index] notation. But your index has to be of the correct type. forecastPeriodArray[periodInt] therefore does not work since periodInt is not an Int as the name would suggest. Currently it is of type Character which does not make much sense.
What you are probably trying to achieve is convert the character to an integer and use that to access the array:
var periodInt = Int(String(maforGroup[idx]))!
You might want to add error handling for the case when the character does not actually represent an integer.