How do I extract time from a sentence and transform it into a number? - regex

Can I use this method to extract the timelines from this string:
1 month in role 1 year 11 months in company
and transform them into a number of months?
E.g.
1 month = 1
1 year 11 months = 23
Any help greatly appreciated!
Have tried the =split formula but all sentences are slightly different

You could try using regular expressions.
=IFERROR(INDEX(SPLIT(REGEXEXTRACT(A1,"(\d+ years? (\d+ months? )?in company)"), " "), 0, 1), 0) * 12 + IFERROR(INDEX(SPLIT(REGEXEXTRACT(A1,"(\d+ months? in company)"), " "), 0, 1), 0)
(A1 in the formula above represents the cell containing the timeline string. This would need to be adjusted as needed)
Basically, this formula looks for "x year(s) (x month(s) )in company". If such a string is found, it is split on spaces and the first portion (the x in x years) is taken. If no such pattern is found (for example, when the string is "1 month in role 1 month in company"), the year part is ignored.
For robustness, it is necessary to check that the year is followed by an optional month component and then "in company". Otherwise, "1 year 1 month in role 2 years 11 months in company" would return 1 for the year, which is not what we want.
The second part of the formula looks for "x month(s) in company". If it is not found, the month portion is ignored (e.g. "1 year in role 1 year in company").
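For anyone who wants to check the logic outside of Sheets, the two lookups described above can be sketched in Python. This is only an illustration of the same idea, not a translation of the formula; the helper name months_in_company and the two pattern names are made up for this example.

```python
import re

# Hypothetical patterns mirroring the two REGEXEXTRACT calls in the formula.
# The year pattern requires "in company" after an optional month component,
# so a year that belongs to "in role" is never picked up.
YEAR_RE = re.compile(r"(\d+) years? (?:\d+ months? )?in company")
MONTH_RE = re.compile(r"(\d+) months? in company")


def months_in_company(text):
    """Return the "in company" duration of `text` as a number of months."""
    year_match = YEAR_RE.search(text)
    month_match = MONTH_RE.search(text)
    years = int(year_match.group(1)) if year_match else 0    # missing part counts as 0
    months = int(month_match.group(1)) if month_match else 0
    return years * 12 + months


print(months_in_company("1 month in role 1 year 11 months in company"))
```

A missing year or month simply contributes 0, which plays the same role as the IFERROR(..., 0) wrappers in the formula.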

formula in B2 cell:
=ARRAYFORMULA(IFERROR(IF(REGEXMATCH(A2:A, "role"),
IF(REGEXEXTRACT(A2:A, "(year|mont)")="mont", REGEXEXTRACT(A2:A, "\d+"),
IF(REGEXEXTRACT(A2:A, "(year|mont)")="year", REGEXEXTRACT(A2:A, "\d+")*12+
IFERROR(REGEXEXTRACT(A2:A, "(\d+) mont")), )), )))
formula in C2 cell:
=ARRAYFORMULA(IFERROR(
IF(REGEXEXTRACT(REGEXEXTRACT(A2:A, "role (.+)"), "(year|mont)")="mont",
REGEXEXTRACT(REGEXEXTRACT(A2:A, "role (.+)"), "\d+"),
IF(REGEXEXTRACT(REGEXEXTRACT(A2:A, "role (.+)"), "(year|mont)")="year",
REGEXEXTRACT(REGEXEXTRACT(A2:A, "role (.+)"), "\d+")*12+
IFERROR(REGEXEXTRACT(REGEXEXTRACT(A2:A, "role (.+)"), "(\d+) mont")), )),
IF(REGEXMATCH(A2:A, "company"),
IF(REGEXEXTRACT(A2:A, "(year|mont)")="mont", REGEXEXTRACT(A2:A, "\d+"),
IF(REGEXEXTRACT(A2:A, "(year|mont)")="year", REGEXEXTRACT(A2:A, "\d+")*12+
IFERROR(REGEXEXTRACT(A2:A, "(\d+) mont")), )), )))

Related

How to search a row for cell value(s) then output the header?

I'm new here and trying to automate. I have a work roster and would like it to output who is on duty on a daily basis. Ideally, it would check today's date, then search the corresponding table row for the relevant personnel each day.
Spreadsheet: here
Desired output:
On today's date, AM shifts are Person 1 (duty) Person 2, PM shifts are Person 3
Current formula:
="On "&textjoin("",TRUE,B7)&": AM shifts are "
&OFFSET(INDEX(A3:E3,MATCH("AM Duty",A3:E3,0)),1-row(vlookup(today(),A1:E5,2,0)),0)&" (duty) "
&OFFSET(INDEX(A3:E3,MATCH("AM Reg",A3:E3,0)),1-row(vlookup(today(),A1:E5,2,0)),0)
&", PM shifts are "
&OFFSET(INDEX(A3:E3,MATCH("PM Reg",A3:E3,0)),1-row(vlookup(today(),A1:E5,2,0)),0)
Some problems with formula:
Row needs to adjust according to today's date as it goes down the list, currently it's hardcoded A3:E3
Unsure how to capture repeated AM Reg in each row
Not sure if I'm overcomplicating things here, and open to better solutions. Thank you in advance!
try:
=INDEX(TEXT(TODAY(), "On dd mmmm yy: A\M \s\hift\s ar\e ")&
TEXTJOIN(", ", 1, IF(REGEXMATCH(VLOOKUP(TODAY(), A2:E5, {2,3,4,5}, ), "AM Duty"), B1:E1&" (duty), ", ))&
TEXTJOIN(", ", 1, IF(REGEXMATCH(VLOOKUP(TODAY(), A2:E5, {2,3,4,5}, ), "AM Reg"), B1:E1, ))&" and PM shifts are "&
TEXTJOIN(", ", 1, IF(REGEXMATCH(VLOOKUP(TODAY(), A2:E5, {2,3,4,5}, ), "PM Duty"), B1:E1&" (duty), ", ))&
TEXTJOIN(", ", 1, IF(REGEXMATCH(VLOOKUP(TODAY(), A2:E5, {2,3,4,5}, ), "PM Reg"), B1:E1, )))
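The idea behind the formula (find today's row, then collect the header names whose cell matches each shift label) can be sketched in Python. The data layout here is an assumption based on the question: one row per date, one column per person, and each cell holding a label such as "AM Duty", "AM Reg" or "PM Reg". The function name and the sample roster are hypothetical.

```python
from datetime import date


def describe_shifts(roster, people, day):
    """Build the duty sentence for `day`.

    roster: dict mapping a date to a list of shift labels (assumed layout)
    people: header names, in the same order as the labels in each row
    """
    shifts = roster[day]

    def names(label, suffix=""):
        # Keep every person whose cell equals the label, so repeated
        # "AM Reg" entries in one row are all captured.
        return [person + suffix for person, shift in zip(people, shifts) if shift == label]

    am = names("AM Duty", " (duty)") + names("AM Reg")
    pm = names("PM Duty", " (duty)") + names("PM Reg")
    return "On {}: AM shifts are {} and PM shifts are {}".format(
        day.isoformat(), ", ".join(am), ", ".join(pm)
    )


people = ["Person 1", "Person 2", "Person 3", "Person 4"]
roster = {date(2023, 1, 2): ["AM Duty", "AM Reg", "PM Reg", "Off"]}
print(describe_shifts(roster, people, date(2023, 1, 2)))
```

Because the row is looked up by date rather than by position, nothing needs to be hardcoded as the roster grows.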

PowerBI Create List of Month Dates

Hi, in Power BI I am trying to create a list of dates starting from a column in my table, [COD], and ending on a set date. Right now this just loops through 60 months from the column start date [COD]. Can I specify an end date for it to loop until?
List.Transform({0..60}, (x) =>
Date.AddMonths(
(Date.StartOfMonth([COD])), x))
Assuming
start=Date.StartOfMonth([COD]),
end = #date(2020,4,30),
One way is to add a custom column with the formula
= { Number.From(start) .. Number.From(end) }
then expand and convert to date format
or you could generate a list with List.Dates instead, and expand that
= List.Dates(start, Number.From(end) - Number.From(start)+1, #duration(1, 0, 0, 0))
Assuming you want start-of-month dates through June 2023: in the example below, I have 2023 and 6 hard-coded, but these could easily come from a parameter, e.g. Date.Year(DateParameter), or a column, e.g. Date.Month([EndDate]).
Get the count of months with this:
12 * (2023 - Date.Year([COD]) )
+ (6 - Date.Month([COD]) )
+ 1
Then just use this column in your formula:
List.Transform({0..[Month count]-1}, (x) =>
Date.AddMonths(Date.StartOfMonth([COD]), x)
)
You could also combine it all into one harder-to-read formula:
List.Transform(
{0..
(12 * ( Date.Year(DateParameter) - Date.Year([COD]) )
+ ( Date.Month(DateParameter) - Date.Month([COD]) )
)
}, (x) => Date.AddMonths(Date.StartOfMonth([COD]), x)
)
If there is a chance that COD could be after the End Date, you would want to include error checking in the Month count formula.
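To sanity-check the month-count arithmetic, here is the same calculation sketched in Python rather than M. The function name month_starts is hypothetical, and returning an empty list when COD falls after the end month is just one possible way to handle that case.

```python
from datetime import date


def month_starts(cod, end_year, end_month):
    """List the first day of every month from cod's month through end_year/end_month."""
    # Same arithmetic as the Month count column:
    # 12 * (year difference) + (month difference) + 1
    count = 12 * (end_year - cod.year) + (end_month - cod.month) + 1
    if count <= 0:
        # COD is after the end month: nothing to generate (assumed behaviour)
        return []
    result = []
    for offset in range(count):
        # Work in "months since year 0" so year rollover is handled by // and %
        total = cod.year * 12 + (cod.month - 1) + offset
        result.append(date(total // 12, total % 12 + 1, 1))
    return result


dates = month_starts(date(2022, 11, 15), 2023, 6)
print(len(dates), dates[0], dates[-1])
```

For a COD of 15 November 2022 and an end of June 2023, the count is 12 * 1 + (6 - 11) + 1 = 8, covering November 2022 through June 2023.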
Generate list:
let
Start = Date1
, End = Date2
, Mos = ElapsedMonths(End, Start) + 1
, Dates = List.Transform(List.Numbers(0,Mos), each Date.AddMonths(Start, _))
in
Dates
ElapsedMonths(D1, D2) function def:
(D1 as date, D2 as date) =>
let
DStart = if D1 < D2 then D1 else D2
, DEnd = if D1 < D2 then D2 else D1
, Elapsed = (12*(Date.Year(DEnd)-Date.Year(DStart))+(Date.Month(DEnd)-Date.Month(DStart)))
in
Elapsed
Of course, you can create a function rather than hard code startdate and enddate:
(StartDate as date, optional EndDate as date, optional Months as number)=>
let
Mos = if EndDate = null
then (if Months = null
then error Error.Record("Missing Parameter", "Specify either [EndDate] or [Months]", "Both are null")
else Months
)
else ElapsedMonths(StartDate, EndDate) + 1
, Dates = List.Transform(List.Numbers(0, Mos), each Date.AddMonths(StartDate, _))
in
Dates

Counting and adding multiple variables from single cell in sheets

I have a sheets document that has cells that users input data into. They know to input the data in a certain format; a 'number' and a 'letter', followed by a space, a SKU number, and then a comma.
I'd like to have a formula that counts the number of entries for each 'letter' and then adds up the 'numbers' for each letter.
There are only five 'letters' users can choose from; M, E, T, W, B.
The data they input isn't restricted to a set order, and there isn't a limit of how much they can input, as long as it follows the aforementioned syntax.
I attached a screenshot of an example of how this should look.
The yellow cell is the user-inputted data, and the green cells are data created by formula.
Or here's a link to a live version: link
I tried doing it with COUNTIF but that didn't work. I'm guessing it would be done with an array, but I don't know where to start. If I can see an example of something similar, I could probably do the rest.
yes:
=INDEX(REGEXREPLACE(SPLIT(REGEXREPLACE(FLATTEN(QUERY(TRANSPOSE(QUERY(TRANSPOSE(SORT(TRANSPOSE(QUERY(SPLIT(
FLATTEN(REGEXREPLACE(TRIM(SPLIT(A2:A9, ",")), "\b(\d+(?:\.\d+)?)(.+?)\b(.*)", ROW(A2:A9)&"×$1$2×$1×$2")), "×"),
"select count(Col2),sum(Col3) where Col2 is not null group by Col1 pivot Col4 label count(Col2)''")))),
"offset 1", 0)*1&TRIM(REGEXREPLACE(TRANSPOSE(SORT(FLATTEN(QUERY(SPLIT(
FLATTEN(REGEXREPLACE(TRIM(SPLIT(A2:A9, ",")), "\b(\d+(?:\.\d+)?)(.+?)\b(.*)", ROW(A2:A9)&"×$1$2×$1×$2")), "×"),
"select count(Col2),sum(Col3) where Col2 is not null group by Col1 pivot Col4 limit 0 label count(Col2)''")))),
".*sum", ))),,9^9)), "([^ ]+ [^ ]+) ", "$1×"), "×"), "(\d+(?:\.\d+)?)$", "($1)"))
I've added a new sheet ("Erik Help") with the following solution:
=ArrayFormula(FILTER( SPLIT("B E M T W", " ") & " (" & IFERROR(VLOOKUP(ROW(A1:A) & SPLIT("B E M T W", " "), QUERY(FLATTEN(SPLIT(QUERY(FLATTEN(IFERROR(REPT(ROW(A1:A) & REGEXEXTRACT(SPLIT(REGEXREPLACE(A1:A&",", "\d+,", ""), " ", 0, 1), "\D") & "~", 1*REGEXEXTRACT(SPLIT(REGEXREPLACE(A1:A&",", "\d+,", ""), " ", 0, 1), "\d+")))), "WHERE Col1 <>'' "), "~", 1, 1)), "Select Col1, COUNT(Col1) GROUP BY Col1"), 2, FALSE), 0)&")", A1:A<>""))
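Both formulas are dense, so here is the underlying logic sketched in Python for a single cell. It assumes entries look like "2M 1001," (a whole number, one of the five letters, a space, a SKU, then a comma), which is my reading of the format described in the question; the function name tally and the sample SKUs are made up.

```python
import re
from collections import defaultdict

# Assumed entry format: whole number, one letter from M/E/T/W/B, space, SKU
ENTRY_RE = re.compile(r"(\d+)\s*([METWB])\s+\S+")


def tally(cell):
    """Return (count of entries per letter, sum of numbers per letter)."""
    counts = defaultdict(int)
    totals = defaultdict(int)
    for entry in cell.split(","):
        match = ENTRY_RE.fullmatch(entry.strip())
        if match is None:
            continue  # skip blanks (e.g. after a trailing comma) and malformed entries
        quantity, letter = int(match.group(1)), match.group(2)
        counts[letter] += 1
        totals[letter] += quantity
    return dict(counts), dict(totals)


counts, totals = tally("2M 1001, 1E 1002, 3M 1003,")
print(counts, totals)
```

The order of the entries does not matter, and there is no limit on how many a cell can hold, since each comma-separated piece is handled independently.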

Google Script: Match RegEx into 2D array

I'm trying to extract information from Gmail into Google Spreadsheet. The information in the email has a table structure with the following columns List of Products, QTY Sold and the Subtotal for each product. These repeat N times.
When accessing the information using message.getPlainBody() I get the following text:
Product
Quantity
Price
Chocolate
1
$8.58
Apples
2
$40.40
Bananas
1
$95.99
Candy
1
$4.99
Subtotal:
$149.96
Progress
First I tried to use a regular expression to identify each row with all its elements:
Product name: Any amount of characters that don't include ':' (.*)[^:]
QTY Sold: Any number \d*
Anything that looks like a SubTotal [$]\d*.\d*
Wrapping everything up it looks like this
function ExtractDetail(message){
var mainbody = message.getPlainBody();
//RegEx
var itemListRegex = new RegExp(/(.*)[^:][\r\n]+(\d*)[\r\n]+[$](\d*\.\d*)[\r\n]+/g);
var itemList = mainbody.match(itemListRegex);
Logger.log(itemList);
}
And so far it works:
itemList: Chocolate 1 $8.58 ,Apples 2 $40.40 ,Bananas 1 $95.99
,Candy 1 $4.99
However, I'm getting the following result:
[Chocolate 1 $8.58]
[Apples 2 $40.40]
[Bananas 1 $95.99]
[Candy 1 $4.99]
Instead of:
[Chocolate] [ 1 ] [$8.58]
[Apples] [ 2 ] [$40.40]
[Bananas] [ 1 ] [$95.99]
[Candy] [ 1 ] [$4.99]
Question
My question is, how can I append new rows so that each row corresponds to one match found and each column corresponds to one property?
How do I turn the result of each match into an array? Is it possible or should I change my approach?
Update:
Since the result of my current attempt is a large string, I'm trying to find other options. This one popped up:
var array = Array.from(mainbody.matchAll(itemListRegex), m => m[1]);
Source: How do you access the matched groups in a JavaScript regular expression?
I'm still working on it. I still need to find how to add more columns and for some reason it starts on 'Apples' (following the examples), leaving 'Chocolates' behind.
Log:
Logger.log('array: ' + array);
If you want to use matchAll like Array.from(mainbody.matchAll(itemListRegex), m => m[1]), how about this modification?
In this case, /(.*[^:])[\r\n]+(\d*)[\r\n]+([$]\d*\.\d*)[\r\n]/g is used as the regex.
Modified script:
const itemListRegex = /(.*[^:])[\r\n]+(\d*)[\r\n]+([$]\d*\.\d*)[\r\n]/g;
var array = Array.from(mainbody.matchAll(itemListRegex), ([,b,c,d]) => [b,Number(c),d]);
Result:
[
["Chocolate",1,"$8.58"],
["Apples",2,"$40.40"],
["Bananas",1,"$95.99"],
["Candy",1,"$4.99"]
]
The result is the same as TheMaster's answer.
Test of script:
const mainbody = `
Product
Quantity
Price
Chocolate
1
$8.58
Apples
2
$40.40
Bananas
1
$95.99
Candy
1
$4.99
Subtotal:
$149.96
`;
const itemListRegex = /(.*[^:])[\r\n]+(\d*)[\r\n]+([$]\d*\.\d*)[\r\n]/g;
var array = Array.from(mainbody.matchAll(itemListRegex), ([,b,c,d]) => [b,Number(c),d]);
console.log(array)
Note:
Regarding "how can I append a new row so that each row corresponds to each match found and each column corresponds to each property?": does this mean putting the values into a Spreadsheet? If so, can you provide a sample of the result you expect?
References:
matchAll()
Array.from()
Map and split the resulting array by newlines:
const data = `Product
Quantity
Price
Chocolate
1
$8.58
Apples
2
$40.40
Bananas
1
$95.99
Candy
1
$4.99
Subtotal:
$149.96`;
const itemListRegex = /.*[^:][\r\n]+\d*[\r\n]+\$\d*\.\d*(?=[\r\n]+)/g;
const itemList = data.match(itemListRegex);
console.info(itemList.map(e => e.split(/\n/)));//map and split
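The capture-group approach is not specific to JavaScript. As a rough cross-check, the same extraction can be sketched in Python with re.findall, which returns one tuple of groups per match. The pattern below is my own variant (anchored per line with re.MULTILINE), not the regex from the answers above.

```python
import re

body = """Product
Quantity
Price
Chocolate
1
$8.58
Apples
2
$40.40
Bananas
1
$95.99
Candy
1
$4.99
Subtotal:
$149.96
"""

# Three consecutive lines: a name without ':', a whole number, a dollar amount.
ROW_RE = re.compile(r"^([^:\r\n]+)\r?\n(\d+)\r?\n(\$\d+\.\d+)$", re.MULTILINE)

# One inner list per match, one element per captured property.
rows = [[name, int(qty), price] for name, qty, price in ROW_RE.findall(body)]
print(rows)
```

The header lines are skipped because "Quantity" and "Price" are not numbers, and "Subtotal:" is excluded by the colon check.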

How to know which quarter the current month belongs to? (in Python)

I want to know which quarter (Q1, Q2, Q3, Q4) the current month belongs to in Python. I'm fetching the current date by importing the time module as follows:
import time
print "Current date " + time.strftime("%x")
any idea how to do it ?
Modifying your code, I get this:
import time
month = int(time.strftime("%m")) - 1 # minus one, so month starts at 0 (0 to 11)
quarter = month // 3 + 1 # integer division, then add one, so quarter starts at 1 (1 to 4)
quarter_str = "Q" + str(quarter) # convert to the "Qx" format string
print(quarter_str)
Or you could use the bisect module:
import time
import bisect
quarters = range(1, 12, 3) # First month of each quarter: 1, 4, 7, 10
month = int(time.strftime("%m"))
quarter = bisect.bisect(quarters, month) # number of quarter starts that are <= month
quarter_str = "Q" + str(quarter)
print(quarter_str)
strftime does not know about quarters, but you can calculate them from the month:
Use time.localtime to retrieve the current time in the current timezone. This function returns a named tuple with year, month, day of month, hour, minute, second, weekday, day of year, and time zone offset. You will only need the month (tm_mon).
Use the month to calculate the quarter. If the first quarter starts with January and ends with March, the second quarter starts with April and ends with June, etc., then this is as easy as subtracting 1 from the month, dividing by 3 without remainder, and adding 1 (for months 1..3, (month - 1) // 3 == 0 and 0 + 1 == 1; for months 4..6, (month - 1) // 3 == 1 and 1 + 1 == 2; etc.). If your definition of a quarter differs (e.g. companies may choose different start dates for their financial quarters), you have to adjust the calculation accordingly.
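The two steps can be sketched as follows. The function name quarter_of and its first_month parameter are my own additions, to show one way of handling a quarter definition that does not start in January. The calculation shifts the month to be zero-based relative to the first month of Q1 and then divides by 3 without remainder.

```python
import time


def quarter_of(month, first_month=1):
    """Return the quarter (1 to 4) for a month number (1 to 12).

    first_month is the month in which Q1 starts (hypothetical parameter;
    1 means ordinary calendar quarters).
    """
    # Months elapsed since the start of Q1, wrapped into 0..11,
    # then grouped in threes and shifted to 1..4.
    return ((month - first_month) % 12) // 3 + 1


current_month = time.localtime().tm_mon  # 1..12 in the local timezone
print("Q" + str(quarter_of(current_month)))
```

With the default, January to March map to 1 and October to December map to 4. With first_month=4 (a financial year starting in April), April maps to 1 and March maps to 4.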