I want to match a row in one sheet with a row in another (to use in conditional formatting), but the match is based on multiple columns.
https://docs.google.com/spreadsheets/d/18Cr13bQZ2ZZnb1Y2Nq6aMFhHXhFTZsioJ3M4S1fzURQ/edit?usp=sharing
Sheet 1
|Country|Year|Location|
|India |2001|D1 |
|Russia |1999|D3 |
|Kenya |1001|D4 |
|India |1999|D2 |
Sheet 2
|Country |Year|Destination|
|India |2000|DA1 |
|Bulgaria |1999|DA3 |
|Wakanda |1001|DA4 |
|India |1999|DA2 |
Only India-1999 should be highlighted
try:
=ARRAYFORMULA(REGEXMATCH($A2&$B2, TEXTJOIN("|", 1,
INDIRECT("Sheet1!A2:A")&INDIRECT("Sheet1!B2:B"))))
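To make the intent of the formula concrete, here is a minimal Python sketch of the same multi-column match, assuming the rule is applied to the rows of Sheet 2 and checked against Sheet 1. The names sheet1, sheet2, keys and highlight are made up for illustration. It compares (Country, Year) pairs instead of concatenated strings, which avoids accidental matches when two different pairs happen to concatenate to the same text.

```python
# Rows copied from the example tables: (country, year, third column).
sheet1 = [
    ("India", 2001, "D1"),
    ("Russia", 1999, "D3"),
    ("Kenya", 1001, "D4"),
    ("India", 1999, "D2"),
]
sheet2 = [
    ("India", 2000, "DA1"),
    ("Bulgaria", 1999, "DA3"),
    ("Wakanda", 1001, "DA4"),
    ("India", 1999, "DA2"),
]

# Composite keys present in Sheet 1.
keys = {(country, year) for country, year, _ in sheet1}

# A Sheet 2 row is highlighted only when both columns match a Sheet 1 row.
highlight = [(country, year) in keys for country, year, _ in sheet2]
print(highlight)  # [False, False, False, True]
```

Only the last row (India, 1999) is flagged, which is the behaviour asked for in the question.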
I need to write a regexp_replace query in spark.sql() and I'm not sure how to handle it. For readability purposes, I have to use SQL for it. I am trying to pull out the hashtags from the table. I know how to do this in Python, but most of my team are SQL users.
My dataframe example looks like so:
Insta_post
Today, Senate Dems vote to #SaveTheInternet. Proud to support similar #NetNeutrality legislation here in the House…
RT @NALCABPolicy: Meeting with @RepDarrenSoto . Thanks for taking the time to meet with @LatinoLeader ED Marucci Guzman. #NALCABPolicy2018.…
RT @Tharryry: I am delighted that @RepDarrenSoto will be voting for the CRA to overrule the FCC and save our #NetNeutrality rules. Find out…
My code:
I create a tempview:
post_df.createOrReplaceTempView("post_tempview")
post_df = spark.sql("""
select
regexp_replace(Insta_post, '.*?(.|'')(#)(\w+)', '$1') as a
from post_tempview
where Insta_post like '%#%'
""")
My end result:
+--------------------------------------------------------------------------------------------------------------------------------------------+
|a |
+--------------------------------------------------------------------------------------------------------------------------------------------+
|Today, Senate Dems vote to #SaveTheInternet. Proud to support similar #NetNeutrality legislation here in the House… |
|RT @NALCABPolicy: Meeting with @RepDarrenSoto . Thanks for taking the time to meet with @LatinoLeader ED Marucci Guzman. #NALCABPolicy2018.…|
|RT @Tharryry: I am delighted that @RepDarrenSoto will be voting for the CRA to overrule the FCC and save our #NetNeutrality rules. Find out…|
+--------------------------------------------------------------------------------------------------------------------------------------------+
desired result:
+---------------------------------+
|a |
+---------------------------------+
| #SaveTheInternet, #NetNeutrality|
| #NALCABPolicy2018 |
| #NetNeutrality |
+---------------------------------+
I haven't really used regexp_replace too much so this is new to me. Any help would be appreciated as well as an explanation of how to structure the subsets!
For Spark 3.1+, you can use the regexp_extract_all function to extract multiple matches:
post_df = spark.sql("""
select regexp_extract_all(Insta_post, '(#\\\\w+)', 1) as a
from post_tempview
where Insta_post like '%#%'
""")
post_df.show(truncate=False)
#+----------------------------------+
#|a |
#+----------------------------------+
#|[#SaveTheInternet, #NetNeutrality]|
#|[#NALCABPolicy2018] |
#|[#NetNeutrality] |
#+----------------------------------+
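If it helps to see the "extract all matches" idea outside Spark, here is a rough plain-Python equivalent using re.findall. It is only an illustration: HASHTAG, extract_hashtags, post, tags and joined are made-up names, and Python's regex engine is not the Java one Spark uses, although this simple pattern behaves the same in both.

```python
import re

# Same pattern as in the query above: a '#' followed by word characters.
HASHTAG = re.compile(r"#\w+")

def extract_hashtags(text):
    """Return every hashtag in text, in order of appearance."""
    return HASHTAG.findall(text)

post = ("Today, Senate Dems vote to #SaveTheInternet. Proud to support "
        "similar #NetNeutrality legislation here in the House…")

tags = extract_hashtags(post)
joined = ", ".join(tags)

print(tags)    # ['#SaveTheInternet', '#NetNeutrality']
print(joined)  # #SaveTheInternet, #NetNeutrality
```

The join step is what turns the list into the comma-separated string from the desired result; in Spark SQL the built-in array_join function should do the same job on the array column.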
For Spark <3.1, you can use regexp_replace to remove everything that doesn't match the hashtag pattern:
post_df = spark.sql("""
select trim(trailing ',' from regexp_replace(Insta_post, '.*?(#\\\\w+)|.*', '$1,')) as a
from post_tempview
where Insta_post like '%#%'
""")
post_df.show(truncate=False)
#+-------------------------------+
#|a |
#+-------------------------------+
#|#SaveTheInternet,#NetNeutrality|
#|#NALCABPolicy2018 |
#|#NetNeutrality |
#+-------------------------------+
Note the use of trim to remove the unnecessary trailing commas created by the replacement string '$1,'.
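Since the question asked how the pattern is structured, here is a small Python sketch of the same replace-then-trim idea. It is an approximation for illustration only (PATTERN and keep_hashtags are made-up names, and Python's re module stands in for Spark's Java regex). The first alternative lazily skips text up to the next hashtag and captures it; the second alternative swallows whatever is left once no hashtag remains.

```python
import re

# Alternative 1: skip lazily to the next hashtag and capture it in group 1.
# Alternative 2: consume the remaining text, leaving group 1 unset.
PATTERN = re.compile(r".*?(#\w+)|.*")

def keep_hashtags(text):
    # Replace each match with its captured hashtag (or nothing) plus a comma,
    # like the '$1,' replacement string in the SQL version.
    replaced = PATTERN.sub(lambda m: (m.group(1) or "") + ",", text)
    # The leftover matches produce extra commas at the end; strip them,
    # as trim(trailing ',' from ...) does in SQL.
    return replaced.rstrip(",")

post = ("Today, Senate Dems vote to #SaveTheInternet. Proud to support "
        "similar #NetNeutrality legislation here in the House…")

print(keep_hashtags(post))  # #SaveTheInternet,#NetNeutrality
```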
Do you really need a view? Because the following code might do it:
from pyspark.sql import functions as F

df = df.filter(F.col('Insta_post').like('%#%'))
col_trimmed = F.trim(F.regexp_replace('Insta_post', r'.*?(#\w+)|.+', '$1 '))
df = df.select(F.regexp_replace(col_trimmed, r'\s', ', ').alias('a'))
df.show(truncate=False)
# +--------------------------------+
# |a |
# +--------------------------------+
# |#SaveTheInternet, #NetNeutrality|
# |#NALCABPolicy2018 |
# |#NetNeutrality |
# +--------------------------------+
I ended up using two regexp_replace calls, so there may well be a better alternative; I just couldn't think of one.
I have a column in Google Sheets where each cell contains pre-defined logic. For example, something like the second column in this table:
| 1 | =A1*-1 |
| 2 | =B2*-1 |
| -3 | =C2*-1 |
Let's say later I want to add the same logic to each cell in column B. For example, make it such that it looks like:
| 1 | =MAX(A1*-1,0) |
| 2 | =MAX(B2*-1,0) |
| -3 | =MAX(C2*-1,0) |
What is the fastest way to do this, besides manually typing MAX(...,0) in each cell? Normal Sheets functions act on the value of the cell, not the logic, so I'm a bit lost.
To my knowledge there isn't a function that pipes in the logic from one cell to another ...
try:
=ARRAYFORMULA(IF(A1:A="",,IF(SIGN(A1:A)<0, A1:A*-1, 0)))
=ARRAYFORMULA(IF(A1:A="",,IF(SIGN(A1:A)>0, A1:A, 0)))
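For anyone wondering whether these really behave like MAX(...,0), here is a quick Python check of the logic, written as a sketch with made-up function names. The first formula corresponds to MAX(x*-1, 0) and the second to MAX(x, 0).

```python
def negate_then_floor(x):
    # Mirrors IF(SIGN(x)<0, x*-1, 0)
    return -x if x < 0 else 0

def floor_at_zero(x):
    # Mirrors IF(SIGN(x)>0, x, 0)
    return x if x > 0 else 0

values = [1, 2, -3]  # column A from the question

first = [negate_then_floor(v) for v in values]
second = [floor_at_zero(v) for v in values]

print(first)   # [0, 0, 3]
print(second)  # [1, 2, 0]
```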
I'm not sure how to do this in a dataframe context
I have the table below here with text information
TEXT |
-------------------------------------------|
"Get some new #turbo #stacks today!" |
"Is it one or three? #phone" |
"Mayhaps it be three afterall..." |
"So many new issues with phone... #iphone" |
And I want to edit it down to where only the words with a '#' symbol are kept, like in the result below.
TEXT |
-----------------|
"#turbo #stacks" |
"#phone" |
"" |
"#iphone" |
In some cases, I'd also like to drop the rows that end up empty (for example by checking for NaN, or with some other condition) to get this result:
TEXT |
-----------------|
"#turbo #stacks" |
"#phone" |
"#iphone" |
Python 2.7 and pandas for this.
You could try using regex and extractall:
df.TEXT.str.extractall(r'(#\w+)').groupby(level=0)[0].apply(' '.join)
Output:
0 #turbo #stacks
1 #phone
3 #iphone
Name: 0, dtype: object
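Note that extractall drops rows without any match, which gives the second result from the question. If you also want the first result, with the empty row kept as "", one possible variant is str.findall followed by str.join. This is a sketch, assuming the column is named TEXT as in the question; the variable names tags and non_empty are mine.

```python
import pandas as pd

df = pd.DataFrame({"TEXT": [
    "Get some new #turbo #stacks today!",
    "Is it one or three? #phone",
    "Mayhaps it be three afterall...",
    "So many new issues with phone... #iphone",
]})

# findall returns a list of hashtags per row (an empty list when none match);
# joining an empty list yields "", so every row is preserved.
tags = df["TEXT"].str.findall(r"#\w+").str.join(" ")

# Optionally drop the rows that ended up empty.
non_empty = tags[tags != ""]

print(tags.tolist())       # ['#turbo #stacks', '#phone', '', '#iphone']
print(non_empty.tolist())  # ['#turbo #stacks', '#phone', '#iphone']
```

Comparing against the empty string is enough here because rows without hashtags become "" rather than NaN.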
I am creating a listview with 5 columns in Gtk+ using C++, and I was able to do that. The problem is that I need subcolumns under the 2nd column, and I'm not sure how to proceed.
firstcolumn | second column | third |
|SC1 | SC2 | SC3| |
| | | | |
Is this possible? Can you suggest how to go about it?
I'm trying to capture the id of an element that will be randomly generated. I can successfully capture the value of my element id like this...
| storeAttribute | //div[1]@id | variableName |
Now my variable will be something like...
divElement-12345
I want to remove 'divElement-' so that the variable I am left with is '12345' so that I can use it later to select the 'form-12345' element associated with it...something like this:
| type | //tr[@id='form-${variableName}']/td/form/fieldset/p[1]/input | Type this |
How might I be able to accomplish this?
You have two options in Selenium: XPath and CSS Selector. I have read that CSS Selector is better for running tests in both Firefox and IE.
Using the latest version of Selenium IDE (3/5/2009), I've had success with storeEval, which evaluates JavaScript expressions and so gives you access to JavaScript string functions.
XPath:
storeAttribute | //div[1]@id | divID
storeEval | '${divID}'.replace("divElement-", "") | number
type | //tr[@id='form-${number}']/td/form/fieldset/p[1]/input | Type this
CSS Selector:
storeAttribute | css=div[1]@id | divID
storeEval | '${divID}'.replace("divElement-", "") | number
type | css=tr[id='form-${number}'] > td > form > fieldset > p[1] > input | Type this
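To show what the storeEval step computes, here is the same string handling as a small Python sketch. This is not Selenium IDE syntax, and form_locator is a hypothetical helper; it only mirrors the prefix removal and the XPath locator built from the result.

```python
def form_locator(div_id, prefix="divElement-"):
    """Build the XPath for the form input that belongs to a generated div id."""
    # Like JavaScript's replace() with a string argument,
    # only the first occurrence of the prefix is removed.
    number = div_id.replace(prefix, "", 1)
    return "//tr[@id='form-{}']/td/form/fieldset/p[1]/input".format(number)

print(form_locator("divElement-12345"))
# //tr[@id='form-12345']/td/form/fieldset/p[1]/input
```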
XPath has several string functions that should solve your problem. Assuming "divElement-" is a constant prefix that will not change, I would suggest substring-after, which is available in XPath 1.0 as well as 2.0:
substring-after(div[1]/@id,"divElement-")