Google Sheets Search and Sum in two lists - regex

I have a Google Sheets question I was hoping someone could help with.
I have a list of about 200 keywords which looks like the ones below:
**List 1**
Italy City trip
Italy Roundtrip
Italy Holiday
Hungary City trip
Czechia City trip
Croatia Montenegro Roundtrip
....
....
And I then have another list with jumbled keywords with around 1 million rows. The keywords in this list don't exactly match with the first list. What I need to do is search for the keywords in list 1 (above) in list 2 (below) and sum all corresponding cost values. As you can see in the list below the keywords from list 1 are in the second list but with other keywords around them. For example, I need a formula that will search for "Italy City trip" from list 1, in list 2 and sum the cost when that keyword occurs. In this case, it would be 6 total. Adding the cost of "Italy City trip April" and "Italy City trip June" together.
**List 2** Cost
Italy City trip April 1
Italy City trip June 5
Next week Italy Roundtrip 4
Italy Holiday next week 1
Hungary City holiday trip 9
....
....
I hope that makes sense.
Any help would be greatly appreciated

try:
=ARRAYFORMULA(QUERY({IFNA(REGEXEXTRACT(PROPER(C1:C),
TEXTJOIN("|", 1, SORT(PROPER(A1:A), 1, 0)))), D1:D},
"select Col1,sum(Col2)
where Col1 is not null
group by Col1
label sum(Col2)''", 0))
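For anyone who wants to see the idea behind the formula outside of Sheets, here is a rough JavaScript sketch. It is not a translation of the formula: it builds one alternation pattern from List 1, takes the first keyword found in each List 2 row, and sums the cost per keyword. The names `escapeRegex` and `sumByFirstMatch` are made up for illustration, and unlike the formula the sketch escapes regex metacharacters and uses a case-insensitive flag instead of PROPER.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// keywords: array of strings (List 1)
// rows: array of [text, cost] pairs (List 2)
// Returns a Map of keyword -> summed cost, keyed by the List 1 spelling.
function sumByFirstMatch(keywords, rows) {
  const totals = new Map();
  // Ignore blank keywords, then try longer keywords first so that a longer
  // keyword wins over a shorter one that is its prefix.
  const sorted = keywords
    .filter(k => k !== "")
    .sort((a, b) => b.length - a.length);
  if (sorted.length === 0) return totals;

  const pattern = new RegExp(sorted.map(escapeRegex).join("|"), "i");
  const canonical = new Map(sorted.map(k => [k.toLowerCase(), k]));

  for (const [text, cost] of rows) {
    const match = pattern.exec(String(text));
    if (!match) continue; // no keyword in this row
    const key = canonical.get(match[0].toLowerCase());
    totals.set(key, (totals.get(key) || 0) + Number(cost));
  }
  return totals;
}
```

With the sample rows from the question, "Italy City trip" ends up with 6 (1 + 5). Note that in this sketch each row is counted for at most one keyword, the first one the pattern finds.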

You want to establish whether keywords in one list (List#1) can be found in another list (List#2).
List#2 is 1,000,000 rows long, so I would recommend segmenting the list so that the script's execution time limit is not exceeded. The right segment size is something you will be able to establish by trial and error.
One solution is to use the JavaScript method indexOf.
Paraphrasing from w3schools: indexOf() returns the position of the first occurrence of a specified value in a string. If the value is not found, it returns -1. So testing if (idx !=-1){ will only return List#1 values that were found in List#2. Note: The indexOf() method is case sensitive.
function so5864274503() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var srcname = "source";
var tgtname = "target";
var sourceSheet = ss.getSheetByName(srcname);
var targetSheet = ss.getSheetByName(tgtname);
// get the source list
var sourceLR = sourceSheet.getLastRow();
var srcData = sourceSheet.getRange(1,1,sourceLR).getValues();
//get the target list
var targetLR = targetSheet.getLastRow();
var tgtlist = targetSheet.getRange(1,1,targetLR,2).getValues();
var totalcostvalues = [];
// start looping through the keywords (list 1)
for (var s = 0;s<srcData.length;s++){
var totalcost = 0;
var value = srcData[s][0]
// start looping through the strings (List 2)
for (var i=0;i<tgtlist.length;i++){
// use indexOf to test if keyword is in the string
var idx = tgtlist[i][0].indexOf(value);
// value of -1 = no match, value >-1 indicates position in the string where the keyword was found
if (idx !=-1){
var cost = tgtlist[i][1];
// add the cost of the matching row to the running total for this keyword
totalcost = totalcost + cost;
}
}//end of loop - list2
//Logger.log("DEBUG: Summary: "+value+", totalcost = "+totalcost)
totalcostvalues.push([totalcost])
}// end of loop - list1
//Logger.log(totalcostvalues); //DEBUG
sourceSheet.getRange(1,2,sourceLR).setValues(totalcostvalues);
}
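The nested loop above can also be tried without a spreadsheet. The following is a minimal sketch under that assumption: `sumCostsByKeyword` and its `ignoreCase` parameter are hypothetical names, and the input arrays stand in for what `getValues()` would return.

```javascript
// keywords: array of strings (List 1)
// rows: array of [text, cost] pairs (List 2)
// ignoreCase: when true, both sides are lowercased before indexOf is applied
// Returns one total per keyword, in the same order as keywords.
function sumCostsByKeyword(keywords, rows, ignoreCase) {
  const norm = s => (ignoreCase ? String(s).toLowerCase() : String(s));
  const texts = rows.map(r => norm(r[0]));
  return keywords.map(keyword => {
    const needle = norm(keyword);
    let total = 0;
    texts.forEach((text, i) => {
      // indexOf returns -1 when the keyword is absent; an empty keyword
      // would match every row, so it is skipped explicitly.
      if (needle !== "" && text.indexOf(needle) !== -1) {
        total += Number(rows[i][1]);
      }
    });
    return total;
  });
}

const sampleRows = [
  ["Italy City trip April", 1],
  ["Italy City trip June", 5],
  ["Next week Italy Roundtrip", 4],
  ["Italy Holiday next week", 1],
  ["Hungary City holiday trip", 9]
];
console.log(sumCostsByKeyword(["Italy City trip", "italy holiday"], sampleRows, false)); // [ 6, 0 ]
console.log(sumCostsByKeyword(["Italy City trip", "italy holiday"], sampleRows, true));  // [ 6, 1 ]
```

Here a row is counted for every keyword it contains, which matches the behaviour of the script above.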

I also got this one. It lowercases both strings before comparing, so the match is not case-sensitive.
function myFunction() {
var ss = SpreadsheetApp.getActive();
var sheet1 = ss.getSheets()[0];
var sheet2 = ss.getSheets()[1];
var valuesSheet1 = sheet1.getRange(2,1, (sheet1.getLastRow()-1), sheet1.getLastColumn()).getValues();
var valuesCol1Sheet1 = valuesSheet1.map(function(r){return r[0]});
var valuesCol2Sheet1 = valuesSheet1.map(function(r){return r[1]});
Logger.log(valuesCol2Sheet1);
var valuesSheet2 = sheet2.getRange(2,1, (sheet2.getLastRow()-1)).getValues();
var valuesCol1Sheet2 = valuesSheet2.map(function(r){return r[0]});
for (var i = 0; i<= valuesCol1Sheet2.length-1; i++){
var price = 0;
valuesCol1Sheet1.forEach(function(elt,index){
var position = elt.toLowerCase().indexOf(valuesCol1Sheet2[i].toLowerCase());
if(position >-1){
price = price + valuesCol2Sheet1[index];
}
});
sheet2.getRange((i+2),2).setValue(price);
};
}

Related

How to mark column with 0 or 1 if data exists in another table in Power BI?

I have 2 tables
1 is a set of employees (Table 1)
1 is a set of terminations (Table 2)
They will both match on an Employee ID column. I want to add a new calculated column to Table 1 that returns 1 if the employee is in Table 2 and returns 0 otherwise. I can't figure out how to write this in DAX. I feel like this should be extremely simple.
I tried
Column =
VAR X = RELATED(Table1[Employee ID])
VAR RES = IF(ISBLANK(X), "no data", X)
RETURN
RES
This just returns "#ERROR" in all values.
Make sure your IF statement returns the same data type for both the true and the false section:
use either
VAR RES = IF(ISBLANK(X), "no data", FORMAT(X, "#"))
or
VAR RES = IF(ISBLANK(X), 0, X)
And referring to the title of your question you should actually use
VAR RES = IF(ISBLANK(X), 0, 1)

power bi DAX hierarchical table concatenation of names

I've been stuck on a problem for two days and can't solve it, so I've come here to ask for some help.
I have this bit of DAX that takes the path of a hierarchical table (integers) and returns the names of the first two items in the path.
the names I use:
'HIERARCHY' the hierarchical table with names, id, path, nbrItems, string
mytable / addedcolumn1/2 the new table used to emulate the for loop
DisplayPath =
var __Path =PATH(ParentChild[id], ParentChild[parent_id])
var __P1 = PATHITEM(__Path,1) var __P2 = PATHITEM(__Path,2)
var l1 = LOOKUPVALUE(ParentChild[Place],ParentChild[id],VALUE(__P1))
var l2a = LOOKUPVALUE(ParentChild[Place],ParentChild[id],VALUE(__P2))
var l2 = if(ISBLANK(l2a), "", " -> " & l2a)
return CONCATENATE(l1,l2)
My problem is that I don't know the number of items in my path; it can go from 0 to, I guess, 15.
I've tried some things but can't figure out a solution.
First I added a new column called nbrItems, which calculates the number of items in the path.
The two columns:
Then I added this bit of code, which emulates a for loop over the items in the path. Inside the loop I'd like to:
get name of parameters
concatenate them in one string that I can return and get
string =
var n = 'HIERARCHY'[nbrItems]
var mytable = GENERATESERIES(1, n)
var addedcolumn1 = ADDCOLUMNS(mytable, "nom", /* missing part: get name */)
var addedcolumn2 = ADDCOLUMNS(addedcolumn1, "string", /* missing part: concatenate previous concatenated and new name */)
var mymax = MAXX(addedcolumn2, [Value])
RETURN MAXX(FILTER(addedcolumn2, [Value] = mymax), [string])
Full table:
Thanks for your help in advance!
OK, so after some research and a lot of trial and error, I've come up with a nice and simple solution.
The original problem was that I had a hierarchical table, but with all the data in the same table.
like so
What I did was, adding a new "parent" column with this dax:
parent =
var a = 'HIERARCHY'[id_parent]
var b = CALCULATE(MIN('HIERARCHY'[libelle]), FILTER(ALL('HIERARCHY'), 'HIERARCHY'[id_h] = a))
RETURN b
This gets the parent name from the id_parent (ref. screen).
then I could just use the path function, not on the id's but on the names... like so:
path = PATH('HIERARCHY'[libelle], 'HIERARCHY'[parent])
It made the problem easy because I didn't need to replace the ids with their names afterwards.
and finally to make it look nice, I used some substitution to remove the pipes:
formated_path = SUBSTITUTE('HIERARCHY'[path], "|", " -> ")
final result
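To illustrate what the three columns compute together, here is a small JavaScript sketch (not DAX) that walks from a row up through its parents and joins the names with " -> ". The column names `id_h`, `id_parent` and `libelle` come from the post; the function name `buildDisplayPath` and the sample rows are invented for the example.

```javascript
// rows: array of { id_h, id_parent, libelle }; id_parent is null for a root.
// Returns the names from the root down to the given id, joined with " -> ".
function buildDisplayPath(rows, id) {
  const byId = new Map(rows.map(r => [r.id_h, r]));
  const names = [];
  const seen = new Set(); // guards against accidental cycles in the data
  let current = byId.get(id);
  while (current && !seen.has(current.id_h)) {
    seen.add(current.id_h);
    names.unshift(current.libelle); // parents go in front of children
    current = byId.get(current.id_parent);
  }
  return names.join(" -> ");
}

const hierarchy = [
  { id_h: 1, id_parent: null, libelle: "World" },
  { id_h: 2, id_parent: 1, libelle: "Europe" },
  { id_h: 3, id_parent: 2, libelle: "France" }
];
console.log(buildDisplayPath(hierarchy, 3)); // World -> Europe -> France
```

The loop handles any depth, which is the part the fixed PATHITEM(…,1) / PATHITEM(…,2) version could not do.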

Extract the digits and append it in a different cell?

I am trying to automatically extract the digits (the AREA number) in Column 3 with a regular expression and append them, prefixed with the text 'A', to the Date INDEX in Column 1.
The problem is that I'm not yet familiar with Google Apps Script.
I looked for solutions to similar situations, but to no avail.
I don't know how to translate VBA to Apps Script.
I tried some code, but I still can't make it work.
Can anyone point me in the right direction?
Thank you if you can help me out. Thanks.
EDIT:
The scenario is that, in the office, I can't add a column for a formula.
It must be "behind the scene".
My googlesheets
//NOT WORKING code
function onEdit(e) {
var rg=e.range;
var sh=e.range.getSheet();
var area=sh.getName();
var regExp = new RegExp("\d*"); // Extract the digits
var dataIndex = regExp.exec(area)[1];
if(rg.columnStart==3) { // Observe column 3
var vA=rg.getValues();
for(var i=0;i<vA.length;i++){
if(vA[i][0]) {
sh.getRange(rg.rowStart + i,1).appendText((dataIndex) +'A'); // append to column 1 with 'A' and extracted digits
}
}
}
}
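Two details of the snippet above can be checked in isolation. In JavaScript, "\d" inside an ordinary string literal is just the letter d, so new RegExp("\d*") does not look for digits, and exec(...)[1] is undefined when the pattern has no capture group. The snippet also runs the pattern against the sheet name rather than the edited value. A minimal sketch of digit extraction, with `extractDigits` as a hypothetical helper name:

```javascript
// Returns the first run of digits in text as a string, or null if there is none.
function extractDigits(text) {
  const match = /\d+/.exec(String(text));
  return match ? match[0] : null; // match[0] is the whole match
}

// What the string-based pattern really contains: the backslash is dropped.
const stringPattern = new RegExp("\d*");
console.log(stringPattern.source);        // d*

console.log(extractDigits("AREA 12"));    // 12
console.log(extractDigits("no digits"));  // null
```

Using a regex literal (or writing "\\d+" in a string) avoids the lost backslash.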
This answer extends your approach of using a script with an OnEdit trigger. But there are a number of differences between the two sets of code.
The most significant difference is that I have used the Javascript split method (var fields = value.split(' ');) to get distinct values from the data entry.
Most of the other differences are error checking:
if(rg.columnStart === 3 && area === "work") {: test for sheet="work" as well as an edit on Column C
var value = e.value.toUpperCase();: anticipate that the text might be in lower case.
if (fields.length !=2){: test that there are two elements in the data entry.
if (fields[0] != "AREA"){: test that the first element of the entry is the word 'AREA'
if (num !=0 && numtype ==="number"){: test that the second element is a number, and that it is NOT zero.
if (colA.length !=0){: test that Column A is not empty
var newColA = colA+"A"+num;: construct the new value for Column A by concatenation; num was converted to a number beforehand with the unary operator '+'.
function onEdit(e){
// so5911459101
// test for edit in column C and sheet = work
var ss = SpreadsheetApp.getActiveSpreadsheet();
// get Event Objects
var rg=e.range;
var sh=e.range.getSheet();
var area=sh.getName();
var row = rg.getRow();
// test if the edit is in Column C of sheet = work
if(rg.columnStart === 3 && area === "work") { // Observe column 3 and sheet = work
//Logger.log("DEBUG: the edit is in Column C of 'Work'")
// get the edited value
var value = e.value.toUpperCase();
//Logger.log("DEBUG: the value = "+value+", length = "+value.length+", uppercase = "+value.toUpperCase());
// use Javascript split on the value
var fields = value.split(' ');
//Logger.log(fields);//DEBUG
// Logger.log("DEBUG: number of fields = "+fields.length)
// test if there are two fields in the value
if (fields.length !=2){
// Logger.log("DEBUG: the value doesn't have two fields")
}
else{
// Logger.log("DEBUG: the value has two fields")
// test if the first field = 'AREA'
if (fields[0] != "AREA"){
// Logger.log("DEBUG: do nothing because the value doesn't include area")
}
else{
// Logger.log("DEBUG: do something because the value does include area")
// get the second field - it should be a value
var num = fields[1];
num =+num
var numtype = typeof num;
// Logger.log("DEBUG: num= "+num+" type = "+numtype); //number
// test type of second field; typeof NaN is also "number", so exclude NaN explicitly
if (num !=0 && numtype ==="number" && !isNaN(num)){
// Logger.log("DEBUG: the second field IS a number")
// get the range for the cell in Column A
var colARange = sh.getRange(row,1);
// Logger.log("DEBUG: the ColA range = "+colARange.getA1Notation());
// get the value of Column A
var colA = colARange.getValue();
// Logger.log("DEBUG: Col A = "+colA+", length = "+colA.length);
// test if Column A is empty
if (colA.length !=0){
var newColA = colA+"A"+num;
// Logger.log("DEBUG: the new cola = "+newColA);
// update the value in Column A
colARange.setValue(newColA);
}
else{
// Logger.log("DEBUG: do nothing because column A is empty")
}
}
else{
// Logger.log("DEBUG: the second field isn't a number")
}
}
}
}
else{
//Logger.log("DEBUG: the edit is NOT in Column C of 'Work'")
}
}
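The chain of checks in the function above can be exercised without a trigger by moving it into a pure function. This is only a sketch: `appendAreaCode` is a hypothetical name and the sample Column A value is invented. It also shows why the number test needs care, since the unary plus turns non-numeric text into NaN, and typeof NaN is still "number".

```javascript
// colA: current Column A text; entry: the value typed into Column C.
// Returns the new Column A text, or null when the entry should be ignored.
function appendAreaCode(colA, entry) {
  const fields = String(entry).toUpperCase().split(" ");
  if (fields.length !== 2 || fields[0] !== "AREA") return null;

  const num = +fields[1]; // unary plus: "5" -> 5, "abc" -> NaN, "" -> 0
  if (Number.isNaN(num) || num === 0) return null;

  if (String(colA).length === 0) return null; // nothing to append to
  return colA + "A" + num;
}

console.log(typeof +"abc");                          // number
console.log(appendAreaCode("20191128-1", "area 5")); // 20191128-1A5
console.log(appendAreaCode("20191128-1", "area x")); // null
```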
REVISION
If the value in Column C is sourced from data validation, then there is no need for any testing except that the edit was in Column C and the sheet = "work".
Included two additional lines of code:
var colAfields = colA.split('-');
var colAdate = colAfields[0];
This has the effect of excluding any existing characters after the hyphen, and re-establishing the hyphen, row number plus "A" and the AREA numeral.
function onEdit(e){
// so5911459101 revised
// only one test - check for ColumnC and sheet="work"
// test for edit in column C and sheet = work
var ss = SpreadsheetApp.getActiveSpreadsheet();
// get Event Objects
var rg=e.range;
var sh=e.range.getSheet();
var area=sh.getName();
var row = rg.getRow();
// test if the edit is in Column C of sheet = work
if(rg.columnStart === 3 && area === "work") { // Observe column 3 and sheet = work
Logger.log("DEBUG: the edit is in Column C of 'Work'")
// get the edited value
var value = e.value
//Logger.log("DEBUG: the value = "+value+", length = "+value.length);
// use Javascript split on the value
var fields = value.split(' ');
//Logger.log(fields);//DEBUG
// get the second field - it should be a value
var num = fields[1];
// get the range for the cell in Column A
var colARange = sh.getRange(row,1);
// Logger.log("DEBUG: the ColA range = "+colARange.getA1Notation());
// get the value of Column A
var colA = colARange.getValue();
// Logger.log("DEBUG: Col A = "+colA+", length = "+colA.length);
// use Javascript split on Column A in case of existing value
var colAfields = colA.split('-');
var colAdate = colAfields[0];
// build new value
var newColA = colAdate+"-"+row+"A"+num;
// Logger.log("DEBUG: the new cola = "+newColA);
// update the value in Column A
colARange.setValue(newColA);
}
else{
Logger.log("DEBUG: the edit is NOT in Column C of 'Work'")
}
}
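The effect of the two extra split lines is easier to see in a standalone sketch. `rebuildColumnA` is a hypothetical name, and the sample Column A values assume the date part itself contains no hyphen; a value such as "2019-11-28" would be cut at its first hyphen.

```javascript
// colA: current Column A text; row: sheet row number; entry: Column C value.
// Keeps only the part of colA before the first hyphen, then rebuilds the suffix.
function rebuildColumnA(colA, row, entry) {
  const num = String(entry).split(" ")[1];     // second field, e.g. "2"
  const datePart = String(colA).split("-")[0]; // drop any earlier suffix
  return datePart + "-" + row + "A" + num;
}

console.log(rebuildColumnA("20191128", 4, "AREA 2"));     // 20191128-4A2
console.log(rebuildColumnA("20191128-4A2", 4, "AREA 7")); // 20191128-4A7
```

Because the old suffix is discarded first, editing Column C repeatedly replaces the suffix rather than appending to it.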

PowerBi subtracting two cells in different rows with condition

I am wondering whether what I would like to achieve is possible. Please look at the picture and read the description below:
I would like to add a column to the right where, if table[ActionType] = "TERMINATING", it calculates the difference between timestamps (timestamp for TERMINATING - timestamp for STARTING in the row below). If the result is positive (>0), store it in the corresponding row (e.g. next to the timestamp for TERMINATING); if the result is negative, don't store it. All of that should apply to the whole table.
I tried a conditional column, and I guess it cannot be done that way, or at least I couldn't make it work.
I will be very thankful for responses and tips!
Prerequisite: add an Index column using the query editor. Make sure the rows to be compared are adjacent to each other.
It is advisable to keep TimeStamp column as a DateTime Column itself.
So, if you can change your TimeStamp column to a DateTime column then try this :-
Difference =
Var Get_Action_Type = Table1[ActionType]
Var required_Index_1 = Table1[Index] + 1
Var required_Index = IF(required_Index_1 > MAX(Table1[Index]),required_Index_1-1, required_Index_1)
Var Current_Action_TimeStamp = Table1[TimeStamp]
Var next_Action_TimeStamp = CALCULATE(MAX(Table1[TimeStamp]),FILTER(Table1, Table1[Index] = required_Index))
Var pre_result = IF(Get_Action_Type = "TERMINATING", DATEDIFF(Current_Action_TimeStamp, next_Action_TimeStamp,SECOND), BLANK())
Var result = IF(pre_result > 0, pre_result, BLANK())
return result
And if you cannot change it to a Date Time, then try this calculated column,
Difference_2 =
Var Get_Action_Type = Table1[ActionType]
Var required_Index_1 = Table1[Index] + 1
Var required_Index = IF(required_Index_1 > MAX(Table1[Index]),required_Index_1-1, required_Index_1)
Var Current_Action_TimeStamp = Table1[Time_Stamp_Number]
Var next_Action_TimeStamp = CALCULATE(MAX(Table1[Time_Stamp_Number]),FILTER(Table1, Table1[Index] = required_Index))
Var pre_result = IF(Get_Action_Type = "TERMINATING", next_Action_TimeStamp - Current_Action_TimeStamp, BLANK())
Var result = IF(pre_result > 0, pre_result, BLANK())
return result
The Output looks as below :-
Kindly accept the answer if it helps and do let me know, how it works for you.
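As a cross-check of the logic in the two calculated columns, here is the same row-by-row rule as a JavaScript sketch (not DAX). The function name `differences` and the sample data are assumptions; timestamps are plain numbers of seconds, as in the Difference_2 variant.

```javascript
// rows: array of { actionType, timestamp } in index order; timestamp in seconds.
// Returns one entry per row: next row's timestamp minus this row's timestamp
// for TERMINATING rows when that value is positive, otherwise null (blank).
function differences(rows) {
  return rows.map((row, i) => {
    // The last row has no next row, so it stays blank.
    if (row.actionType !== "TERMINATING" || i + 1 >= rows.length) return null;
    const diff = rows[i + 1].timestamp - row.timestamp;
    return diff > 0 ? diff : null;
  });
}

const actions = [
  { actionType: "STARTING", timestamp: 0 },
  { actionType: "TERMINATING", timestamp: 10 },
  { actionType: "STARTING", timestamp: 25 },
  { actionType: "TERMINATING", timestamp: 40 },
  { actionType: "STARTING", timestamp: 30 }
];
console.log(differences(actions)); // [ null, 15, null, null, null ]
```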

Google Sheets - formula for expense sheet

I need a formula that will help me calculate the $ expense amount from my mileage rates.
Here is my sheet https://docs.google.com/spreadsheets/d/1PPFuFWbxWi9iIdtYBJvnSNB1j1cjjQutepF3hnpGoFM/edit?usp=sharing
On the tab that's expenses, I need to check the year of my date in Col A (because different years have different rates), then check Col D to see if it says Milage, then check Col E for what type of milage (there are 4 types), then return the number input in Col I times the appropriate rate for that year and type of milage. I have a table set up with the rates in another tab.
I'm thinking it will be a long IF and AND formula. Any help would be great!
In 'Essentials Log', I've changed the format of your year start-end dates to state the year you're referencing. I've added a column for "Milage".
Expense Reference Screenshot
In 'Expenses'!M2, I tried this and it seems to work:
=ARRAYFORMULA(iferror(if(isblank(A2:A),"",I2:I*vlookup(D2:D&year(A2:A)&E2:E,query({'Essentials Log'!Y2:Y&'Essentials Log'!Z2:Z&'Essentials Log'!AA2:AA,'Essentials Log'!AB2:AB},"Select *"),2,false))))
Working Example Screenshot
Have a look (I copied your example sheet): Milage expense
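The formula works by gluing type, year and details into a single lookup key. A small JavaScript sketch of that composite-key idea follows; the names `buildRateLookup` and `expenseAmount`, the object fields, and the rates are all placeholders for illustration, not values from the sheet.

```javascript
// rateTable: array of { type, year, details, rate } (stand-in for the rates tab).
// Returns a Map keyed by type + year + details, like the joined lookup column.
function buildRateLookup(rateTable) {
  const lookup = new Map();
  for (const r of rateTable) {
    lookup.set(r.type + r.year + r.details, r.rate);
  }
  return lookup;
}

// expense: { date, type, details, quantity }
// Returns quantity * rate, or "" when no rate matches (like IFERROR in the formula).
function expenseAmount(lookup, expense) {
  const key = expense.type + expense.date.getFullYear() + expense.details;
  const rate = lookup.get(key);
  return rate === undefined ? "" : expense.quantity * rate;
}

// Placeholder rates, chosen only to keep the arithmetic obvious.
const lookup = buildRateLookup([
  { type: "Milage", year: 2018, details: "Business", rate: 0.5 },
  { type: "Milage", year: 2019, details: "Business", rate: 0.25 }
]);
console.log(expenseAmount(lookup, {
  date: new Date(2018, 5, 1), type: "Milage", details: "Business", quantity: 100
})); // 50
```

A lookup table like this also avoids writing one branch per year and mileage type.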
I solved this by creating an Apps Script function.
I realized I didn't want to use a formula because in my $ col most entries will be keyed in and so I didn't want to copy and paste a formula every time.
Here is the code if anyone might need it. I added it to the menu so the calculation is done with a click.
function expenseMilageCalculation() {
var auditionsSheet = SpreadsheetApp.getActiveSpreadsheet();
var expensesTab = auditionsSheet.getActiveSheet();
var activeCell = expensesTab.getActiveCell();
var activeRow = activeCell.getRow();
var expenseDate = expensesTab.getRange(activeRow, 1).getValue();
var expenseYear = expenseDate.getFullYear();
var expenseType = expensesTab.getRange(activeRow, 4).getValue();
var expenseDetails = expensesTab.getRange(activeRow, 5).getValue();
var numMiles = expensesTab.getRange(activeRow, 9).getValue();
var essentialsLogTab = auditionsSheet.getSheetByName("Essentials Log")
//2018 Business Milage
if (expenseYear == 2018 && expenseType == "Milage" && expenseDetails == "Business") {
var milageBusinessRate2018 = essentialsLogTab.getRange(3, 27).getValue();
var dollarMilesDeduction = numMiles * milageBusinessRate2018
expensesTab.getRange(activeRow, 8).setValue(dollarMilesDeduction)
}
//2019 Business Milage
if (expenseYear == 2019 && expenseType == "Milage" && expenseDetails == "Business") {
var milageBusinessRate2019 = essentialsLogTab.getRange(8, 27).getValue();
var dollarMilesDeduction = numMiles * milageBusinessRate2019
expensesTab.getRange(activeRow, 8).setValue(dollarMilesDeduction)
}