Concatenate cells between dynamic start and end row - regex

I'm trying to concatenate a number of cells into one if they are between two cells with a certain string.
For example: In the Element column there are modalOpen and modalClose and in between those are modalFields. Between modalOpen and modalClose I need to add the Name of each row with Element modalField into the Output column for the modalOpen row.
The number of modalFields can vary from 2 - 20.

Delete everything in column C and paste this into cell C2:
=ARRAYFORMULA(TRIM(SUBSTITUTE(IFERROR(VLOOKUP(B2:B,
SPLIT(TRANSPOSE(SPLIT(QUERY(IF(B2:B<>"",
IF(A2:A="modalOpen", "♥"&B2:B&"♦"&B2:B&" with",
IF(A3:A="modalClose", "& <"&B2:B&">", "<"&B2:B&">,")), )
,,999^99), "♥")), "♦"), 2, 0)), ">, & ", "> & ")))

=ARRAYFORMULA(REGEXREPLACE(TRIM(TRANSPOSE(SPLIT(QUERY(FILTER(IF(A2:A="modalClose","",IF(A2:A="modalOpen","♠"&B2:B&" with ","<"&B2:B&">,")),A2:A<>""),,2^99),"♠"))),"(, )(\<[^<>]\>),$"," and $2"))
The result:
Test1 with <1>, <2> and <3>
Test2 with <1>, <2>, <3> and <4>
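For readers who want to check the logic outside the spreadsheet, the same grouping can be sketched in Python. This is only an illustration, not part of either formula: the function name `summarize_modals` and the sample rows are made up, and it assumes each row is an (Element, Name) pair in sheet order.

```python
def summarize_modals(rows):
    """Build one summary string per modalOpen ... modalClose block.

    rows: list of (element, name) pairs in sheet order (assumed layout).
    """
    results = []
    title, fields = None, []
    for element, name in rows:
        if element == "modalOpen":
            # Start a new block; the Name of this row becomes the prefix.
            title, fields = name, []
        elif element == "modalField" and title is not None:
            fields.append(f"<{name}>")
        elif element == "modalClose" and title is not None:
            # Join with commas, but put "and" before the last field.
            if len(fields) > 1:
                joined = ", ".join(fields[:-1]) + " and " + fields[-1]
            else:
                joined = "".join(fields)
            results.append(f"{title} with {joined}")
            title = None
    return results


rows = [
    ("modalOpen", "Test1"),
    ("modalField", "1"),
    ("modalField", "2"),
    ("modalField", "3"),
    ("modalClose", ""),
    ("modalOpen", "Test2"),
    ("modalField", "1"),
    ("modalField", "2"),
    ("modalField", "3"),
    ("modalField", "4"),
    ("modalClose", ""),
]

for line in summarize_modals(rows):
    print(line)
# Test1 with <1>, <2> and <3>
# Test2 with <1>, <2>, <3> and <4>
```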

Related

How to search a row for cell value(s) then output the header?

I'm new here and trying to automate. I have a work roster and would like it to output who is on duty each day. Ideally, it would check today's date and then search the corresponding table row for the relevant personnel.
Screenshot
Spreadsheet: here
Desired output:
On today's date, AM shifts are Person 1 (duty) Person 2, PM shifts are Person 3
Current formula:
="On "&textjoin("",TRUE,B7)&": AM shifts are "
&OFFSET(INDEX(A3:E3,MATCH("AM Duty",A3:E3,0)),1-row(vlookup(today(),A1:E5,2,0)),0)&" (duty) "
&OFFSET(INDEX(A3:E3,MATCH("AM Reg",A3:E3,0)),1-row(vlookup(today(),A1:E5,2,0)),0)
&", PM shifts are "
&OFFSET(INDEX(A3:E3,MATCH("PM Reg",A3:E3,0)),1-row(vlookup(today(),A1:E5,2,0)),0)
Some problems with formula:
Row needs to adjust according to today's date as it goes down the list, currently it's hardcoded A3:E3
Unsure how to capture repeated AM Reg in each row
Not sure if I'm overcomplicating things here, and open to better solutions. Thank you in advance!
try:
=INDEX(TEXT(TODAY(), "On dd mmmm yy: A\M \s\hift\s ar\e ")&
TEXTJOIN(", ", 1, IF(REGEXMATCH(VLOOKUP(TODAY(), A2:E5, {2,3,4,5}, ), "AM Duty"), B1:E1&" (duty), ", ))&
TEXTJOIN(", ", 1, IF(REGEXMATCH(VLOOKUP(TODAY(), A2:E5, {2,3,4,5}, ), "AM Reg"), B1:E1, ))&" and PM shifts are "&
TEXTJOIN(", ", 1, IF(REGEXMATCH(VLOOKUP(TODAY(), A2:E5, {2,3,4,5}, ), "PM Duty"), B1:E1&" (duty), ", ))&
TEXTJOIN(", ", 1, IF(REGEXMATCH(VLOOKUP(TODAY(), A2:E5, {2,3,4,5}, ), "PM Reg"), B1:E1, )))
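The lookup the formula performs (find the row for a date, then collect the column headers whose cell matches each shift label) can also be sketched in Python. Everything here is an assumption made for illustration: the function name `duty_summary`, the dictionary layout standing in for the sheet, and the sample roster.

```python
from datetime import date


def duty_summary(roster, day):
    """Describe who works which shift on the given day.

    roster: dict mapping a date to {person: shift label} (assumed layout,
    standing in for the date column and the header row of the sheet).
    """
    shifts = roster[day]

    def people(label, suffix=""):
        # Names of everyone whose cell for this day equals the label.
        return [person + suffix for person, shift in shifts.items() if shift == label]

    am = people("AM Duty", " (duty)") + people("AM Reg")
    pm = people("PM Duty", " (duty)") + people("PM Reg")
    return (
        f"On {day:%d %B %y}: AM shifts are {', '.join(am)}"
        f" and PM shifts are {', '.join(pm)}"
    )


# Hypothetical roster for a single day.
roster = {
    date(2022, 3, 1): {
        "Person 1": "AM Duty",
        "Person 2": "AM Reg",
        "Person 3": "PM Reg",
        "Person 4": "Off",
    }
}

print(duty_summary(roster, date(2022, 3, 1)))
```

Because every person with a matching label is collected, a repeated "AM Reg" in the same row is handled without extra work.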

Counting and adding multiple variables from single cell in sheets

I have a sheets document that has cells that users input data into. They know to input the data in a certain format; a 'number' and a 'letter', followed by a space, a SKU number, and then a comma.
I'd like to have a formula that counts the occurrences of each 'letter' and then sums the 'numbers' for each letter.
There are only five 'letters' users can choose from; M, E, T, W, B.
The data they input isn't restricted to a set order, and there isn't a limit of how much they can input, as long as it follows the aforementioned syntax.
I attached a screenshot of an example of how this should look.
The yellow cell holds the user-entered data, and the green cells hold data created by formulas.
Or here's a link to a live version: link
I tried doing it with COUNTIF but that didn't work. I'm guessing it would be done with an array, but I don't know where to start. If I can see an example of something similar, I could probably do the rest.
yes:
=INDEX(REGEXREPLACE(SPLIT(REGEXREPLACE(FLATTEN(QUERY(TRANSPOSE(QUERY(TRANSPOSE(SORT(TRANSPOSE(QUERY(SPLIT(
FLATTEN(REGEXREPLACE(TRIM(SPLIT(A2:A9, ",")), "\b(\d+(?:\.\d+)?)(.+?)\b(.*)", ROW(A2:A9)&"×$1$2×$1×$2")), "×"),
"select count(Col2),sum(Col3) where Col2 is not null group by Col1 pivot Col4 label count(Col2)''")))),
"offset 1", 0)*1&TRIM(REGEXREPLACE(TRANSPOSE(SORT(FLATTEN(QUERY(SPLIT(
FLATTEN(REGEXREPLACE(TRIM(SPLIT(A2:A9, ",")), "\b(\d+(?:\.\d+)?)(.+?)\b(.*)", ROW(A2:A9)&"×$1$2×$1×$2")), "×"),
"select count(Col2),sum(Col3) where Col2 is not null group by Col1 pivot Col4 limit 0 label count(Col2)''")))),
".*sum", ))),,9^9)), "([^ ]+ [^ ]+) ", "$1×"), "×"), "(\d+(?:\.\d+)?)$", "($1)"))
I've added a new sheet ("Erik Help") with the following solution:
=ArrayFormula(FILTER( SPLIT("B E M T W", " ") & " (" & IFERROR(VLOOKUP(ROW(A1:A) & SPLIT("B E M T W", " "), QUERY(FLATTEN(SPLIT(QUERY(FLATTEN(IFERROR(REPT(ROW(A1:A) & REGEXEXTRACT(SPLIT(REGEXREPLACE(A1:A&",", "\d+,", ""), " ", 0, 1), "\D") & "~", 1*REGEXEXTRACT(SPLIT(REGEXREPLACE(A1:A&",", "\d+,", ""), " ", 0, 1), "\d+")))), "WHERE Col1 <>'' "), "~", 1, 1)), "Select Col1, COUNT(Col1) GROUP BY Col1"), 2, FALSE), 0)&")", A1:A<>""))
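As a cross-check on either formula, the counting can be sketched in Python. This is a rough illustration under stated assumptions: the function name `tally` is made up, the sample cell text is invented, and each entry is assumed to look like `2M 1001` (number, letter, space, SKU) with entries separated by commas.

```python
import re
from collections import defaultdict

# number, one of the five allowed letters, whitespace, then the SKU
ENTRY = re.compile(r"(\d+)\s*([METWB])\s+\S+")


def tally(cell):
    """Return {letter: (count of entries, sum of numbers)} for one cell."""
    counts = defaultdict(int)
    totals = defaultdict(int)
    for part in cell.split(","):
        match = ENTRY.match(part.strip())
        if match is None:
            continue  # skip empty or malformed pieces, e.g. after a trailing comma
        quantity, letter = int(match.group(1)), match.group(2)
        counts[letter] += 1
        totals[letter] += quantity
    return {letter: (counts[letter], totals[letter]) for letter in sorted(counts)}


print(tally("2M 1001, 1E 1002, 3M 1003, 4B 1004,"))
# {'B': (1, 4), 'E': (1, 1), 'M': (2, 5)}
```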

How do you print a list of values in one column?

I have calculated a list of values from an equation and I want to print all 140 of the output values in a single column, so that I can convert it to a txt document with one column of data. When I say print(values), it prints the output in multiple columns.
For example:
N = [1,2,3,4,5]
print(N)
This is the result: [1, 2, 3, 4, 5]
I want these values in a single column.
l = [1, 2, 3, 4]  # let this be the list
for i in range(len(l)):
    print(l[i])  # print() already ends each value with a newline
N = [1, 2, 3, 4, 5]  # your list here
for i in range(0, len(N)):
    print(N[i], "\n")  # print() adds its own newline too; drop "\n" to avoid blank lines
Just remember to use a backslash: "\n" starts a new line, whereas "/n" is printed literally.
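Since the goal is a one-column text file, a slightly shorter route is to join the values with newlines and write the result directly. This is a minimal sketch; the helper name `to_column` and the output file name `values.txt` are made up for the example.

```python
import os
import tempfile


def to_column(values):
    """Return the values as text with one value per line."""
    return "\n".join(str(v) for v in values)


N = [1, 2, 3, 4, 5]

# Print one value per line; print(*N, sep="\n") gives the same output.
print(to_column(N))

# Write the same column to a text file (hypothetical path in the temp directory).
out_path = os.path.join(tempfile.gettempdir(), "values.txt")
with open(out_path, "w") as f:
    f.write(to_column(N) + "\n")
```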

Remove all words containing '#' from list in DataFrame

I have a DataFrame in which one column contains lists of words.
>>dataset.head(1)
>> contain
0 ["name", "Place", "ect#gtr", "nick"]
1 ["gf#e", "nobel", "play", "hi"]
I want to remove all the words which contain '#'. In the above example, I want to remove "ect#gtr" and "gf#e".
Try This one
ab= np.column_stack([~df[col].str.contains(r"#") for col in df])
new_df=df.loc[ab.any(axis=1)]
print(new_df)
Use list comprehension with filtering, regex here is not necessary:
df = pd.DataFrame({'contain':[['name', 'Place', 'ect#gtr', 'nick'],
['gf#e', 'nobel', 'play', 'hi']]})
print (df)
contain
0 [name, Place, ect#gtr, nick]
1 [gf#e, nobel, play, hi]
df.contain = df.contain.apply(lambda x: [y for y in x if '#' not in y])
Or:
df.contain = [[y for y in x if '#' not in y] for x in df.contain]
print (df)
contain
0 [name, Place, nick]
1 [nobel, play, hi]
EDIT: To remove values from strings instead of lists, use split with join:
df = pd.DataFrame({'contain':['name Place ect#gtr nick',"gf#e nobel play hi"]})
print (df)
contain
0 name Place ect#gtr nick
1 gf#e nobel play hi
df.contain = df.contain.apply(lambda x: ' '.join([y for y in x.split() if '#' not in y]))
print (df)
contain
0 name Place nick
1 nobel play hi

Delete similar rows

I have a list of three-word phrases with 90,000 rows. I need to delete a row if any other row contains two of the same words. For example:
Word1 word2 word3
word1 word2 word4 - delete
word1 word2 word5 - delete
word1 word6 word7 - keep, only 1 matching word compared to earlier rows
Is there any way to do this?
Step 1. Separate words into three columns (A, B, and C) using Text to Columns or formulas
Step 2. In columns D, E, and F, paste the following formulas to create all two-word combinations:
=A1&B1
=B1&C1
=A1&C1
Step 3. Put the following formula in G1 and fill it through columns H and I and all the rows:
=SUM(COUNTIF(OFFSET($D$1,0,0,ROW(D1),1),D1),COUNTIF(OFFSET($E$1,0,0,ROW(E1),1),D1),COUNTIF(OFFSET($F$1,0,0,ROW(F1),1),D1))-COUNTIF($D1:$F1,D1)
The spreadsheet should now look like this screenshot (besides the two rows I added to the end):
All rows with two words that match two words in a row above will have a value greater than 0 in columns G, H, or I.
Step 4. Finally, filter the entire table so that columns G, H, and I all equal 0. You can copy and paste (by value) the words to another sheet if desired.
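The idea behind the steps above (flag a row when one of its two-word combinations already appeared in an earlier row) can be sketched in Python for comparison. This is an illustration under assumptions: the function name `drop_similar` is made up, words are compared case-insensitively, word order within a pair is ignored, and pairs from deleted rows still count as "seen", as they do in the COUNTIF approach.

```python
from itertools import combinations


def drop_similar(rows):
    """Keep only rows that share fewer than two words with any earlier row.

    rows: list of three-word lists, in sheet order (assumed layout).
    """
    seen_pairs = set()
    kept = []
    for row in rows:
        words = [w.lower() for w in row]
        # All unordered two-word combinations of this row.
        pairs = {frozenset(p) for p in combinations(words, 2)}
        if not (pairs & seen_pairs):
            kept.append(row)
        # Record the pairs even if the row was dropped.
        seen_pairs |= pairs
    return kept


rows = [
    ["Word1", "word2", "word3"],
    ["word1", "word2", "word4"],
    ["word1", "word2", "word5"],
    ["word1", "word6", "word7"],
]

print(drop_similar(rows))
# [['Word1', 'word2', 'word3'], ['word1', 'word6', 'word7']]
```

A set lookup per pair keeps this linear in the number of rows, which matters at 90,000 rows.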
Are the three-word phrases in separate cells, or are they all in the same cell?
If they are in separate cells, you can use this macro:
Option Explicit
Sub DeleteDups()
    Dim colPhrase As Collection
    Dim colRows As Collection
    Dim V As Variant, vRes() As Variant
    Dim I As Long, J As Long
    Dim lDupCount As Long
    Dim rRes As Range 'results range
    V = Worksheets("sheet1").Range("a1", Cells(Rows.Count, "C").End(xlUp))
    Set colPhrase = New Collection
    Set colRows = New Collection
    Set rRes = Range("e1")
    'look for dups
    For I = 1 To UBound(V)
        lDupCount = 0
        On Error Resume Next
        For J = 1 To 3
            colPhrase.Add Item:=CStr(V(I, J)), Key:=CStr(V(I, J))
            If Err.Number <> 0 Then lDupCount = lDupCount + 1
            Err.Clear
        Next J
        On Error GoTo 0
        If lDupCount < 2 Then colRows.Add Item:=CStr(I)
    Next I
    ReDim vRes(1 To colRows.Count, 1 To 3)
    For I = 1 To colRows.Count
        For J = 1 To 3
            vRes(I, J) = V(colRows(I), J)
        Next J
    Next I
    Set rRes = rRes.Resize(UBound(vRes), 3)
    rRes.EntireColumn.Clear
    rRes = vRes
End Sub
If they are in the same cell, depending on how the phrases are separated, you would just need to add a line that separates them into three array elements.