Expression for YYYY-MM-DD~Success - regex

I have written a regex for replacing some text:
x.ServiceLogStatus
.replace(/Failed/g, "<div class='div1'>Error</div>")
.replace(/Match not found/g, "<div class='div1'>Error</div>");
I also want to add another replacement for these patterns:
YYYY-MM-DD
YYYY-MM-DD~Success
How is this possible?
For example, given 2001-12-31, I need to wrap that date in a div so that it is shown in red.
Likewise, given 2001-12-31~Success, I need to show it in red inside a div.
What should I do?

I fixed it myself:
x.ServiceLogStatus = x.ServiceLogStatus
.replace(/(\d+[-/]\d+[-/]\d+)/g, "<div class='div2'> $1</div>")
.replace(/(\d+[-/]\d+[-/]\d+)/g, "<div class='div4'> $1</div>")
.replace(/~Success/g, "<div class='div2'> ~Success</div>");
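For comparison, here is a minimal Python sketch (not the poster's JavaScript) that matches the date and the optional ~Success suffix with a single pattern, so each match is wrapped only once. The function name highlight_status is hypothetical, the class names div1 and div2 are taken from the snippets above, and it assumes dates are always written as YYYY-MM-DD.

```python
import re

# A date such as 2001-12-31, optionally followed by "~Success".
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}(?:~Success)?")


def highlight_status(status):
    """Wrap error messages and dates in divs (hypothetical helper)."""
    # Both error messages map to the same replacement, as in the question.
    status = re.sub(r"Failed|Match not found",
                    "<div class='div1'>Error</div>", status)
    # Wrap each date (with its optional suffix) exactly once.
    return DATE_RE.sub(
        lambda m: "<div class='div2'>" + m.group(0) + "</div>", status)


print(highlight_status("2001-12-31~Success"))
```

Because the suffix is optional inside the same pattern, a bare date and a date followed by ~Success both end up in one div rather than in nested ones.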

Related

Conditional Formatting Cells in a Column Only Containing Letters F-Z Excluding RR

How could I highlight cells using conditional formatting if I want to only highlight cells containing letters F-Z? The formula would also need to exclude RR.
For example, if a cell in that column contains RR or any letter from A to E, I don't want to highlight it. But, if it contains any letter from F all the way to Z, I want it automatically highlighted.
I tried using REGEXMATCH, but it's not working.
Use the following (multiplying the two conditions acts as a logical AND):
=REGEXMATCH(A2, "[F-Z]")*(REGEXMATCH(A2, "RR")=FALSE)
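The same two-part test can be sketched in Python to make the logic explicit. The function name should_highlight is hypothetical. Like REGEXMATCH, the check is case-sensitive, and the explicit RR exclusion is needed because R itself falls inside the F-Z range.

```python
import re


def should_highlight(cell):
    """True if the cell contains a letter F-Z and does not contain RR."""
    has_f_to_z = re.search(r"[F-Z]", cell) is not None
    return has_f_to_z and "RR" not in cell


for value in ["A", "E", "F", "Z", "RR", "R"]:
    print(value, should_highlight(value))
```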

Removing the last specific character from the results of my formula

I'm using some VLOOKUPs to pull in text from another tab on my spreadsheet using the formula below:
={"Product Category Test";ARRAYFORMULA(IF(ISBLANK(A2:A),"",
VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Product Category",'Import Template'!A1:DB1,0),false)
&"|"&IF(VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Automatic Categories",'Import Template'!A1:DB1,0),false)<>"",
VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Automatic Categories",'Import Template'!A1:DB1,0),false),"")))}
Example of results: Books|Coming Soon Images|
All of my results are delimited by a "|", which is also the final character. I need to remove the final "|" from the results, ideally without using a helper column. Is there a way to wrap another function around my formula to achieve this? I've played around with RIGHT and LEN but can't figure it out.
Thanks,
Use REGEXREPLACE to strip a trailing "|":
=ARRAYFORMULA({"Product Category Test"; REGEXREPLACE(""&IF(ISBLANK(A2:A),,
VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Product Category",'Import Template'!A1:DB1,0),)
&"|"&IF(VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Automatic Categories",'Import Template'!A1:DB1,0), )<>"",
VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Automatic Categories",'Import Template'!A1:DB1,0),),)), "\|$", )})
If this doesn't work, make sure there are no trailing spaces after the last |.
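To see what the "\|$" pattern does in isolation, here is a small Python sketch. The function name strip_trailing_pipe is hypothetical, and the added \s* is an assumption that goes slightly beyond the formula above: it also tolerates whitespace after the last |.

```python
import re


def strip_trailing_pipe(value):
    """Remove one trailing "|" plus any whitespace that follows it."""
    # \| is a literal pipe, \s* allows trailing spaces, $ anchors at the end.
    return re.sub(r"\|\s*$", "", value)


print(strip_trailing_pipe("Books|Coming Soon Images|"))
```

Only the final delimiter is removed; pipes in the middle of the value are left alone.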

Extract a list of unique text characters/ emojis from a cell

I have a text in cell (A1) like this:
✌😋👅👅☝️😉🍌🍪💧💧
I want to extract the unique emojis from this cell into separate cells:
✌😋👅☝️😉🍌🍪💧
Is this possible?
As I understand it, you want to put each character of ✌😋👅👅☝️😉🍌🍪💧💧 into its own cell by splitting the value with built-in Google Sheets functions.
Sample formula:
=SPLIT(REGEXREPLACE(A1,"(.)","$1#"),"#")
Here, ✌😋👅👅☝️😉🍌🍪💧💧 is in cell "A1".
REGEXREPLACE inserts # after each character, giving ✌#😋#👅#👅#☝#️#😉#🍌#🍪#💧#💧#.
SPLIT then splits the value at each #.
Note:
The value in your question includes the character \ufe0f (a variation selector), which cannot be displayed on its own. After splitting, cell "G1" therefore looks empty even though it holds a value, so please be careful with it. If you want to avoid it, use ✌😋👅👅☝😉🍌🍪💧💧 instead.
References:
REGEXREPLACE
SPLIT
Added:
marikamitsos's comment made me realize that I had misunderstood the question. The final formula, which comes from marikamitsos, is as follows:
=TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(REGEXREPLACE(A1,"(.)","$1#"),"#"))))
or try:
=TRANSPOSE(UNIQUE(TRANSPOSE(REGEXEXTRACT(A1, REPT("(.)", LEN(A1))))))
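For readers who want to check the expected result outside Sheets, a rough Python equivalent of the split-then-UNIQUE idea is sketched below. The function name unique_characters is hypothetical, and the sample string is written with escapes so that the invisible \ufe0f is visible in the source. Like the formulas, it works per code point, so \ufe0f comes out as its own entry.

```python
def unique_characters(text):
    """Return the distinct code points of text in first-seen order."""
    # dict keys keep insertion order, so repeats are dropped without reordering.
    return list(dict.fromkeys(text))


# The string from the question, including U+FE0F after U+261D.
sample = ("\u270C\U0001F60B\U0001F445\U0001F445\u261D\uFE0F"
          "\U0001F609\U0001F34C\U0001F36A\U0001F4A7\U0001F4A7")
result = unique_characters(sample)
print(len(sample), "code points,", len(result), "unique")
```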
Formula
It appears that one of the best formula solutions is:
=SPLIT(REGEXREPLACE(A1,"(.)","$1#"),"#")
You may also add some additional checks like skin tones & intermediate chars:
=TRANSPOSE(SPLIT(REGEXREPLACE(A2,"(.[🏻🏼🏽🏾🏿"&CHAR(8205)&CHAR(65039)&"]*)","#$1"),"#"))
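This keeps a base character together with any skin-tone modifier, zero-width joiner (CHAR(8205)) or variation selector (CHAR(65039)) that follows it, so some multi-part emojis stay joined as a single emoji.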
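The grouping idea behind that formula can be sketched in Python as follows. The names CLUSTER_RE and split_clusters are hypothetical. This mirrors the formula's pattern rather than implementing full Unicode grapheme segmentation: the emoji after a zero-width joiner still starts a new piece.

```python
import re

# One base character followed by any run of skin-tone modifiers
# (U+1F3FB to U+1F3FF), zero-width joiners (U+200D) or
# variation selectors (U+FE0F).
CLUSTER_RE = re.compile(".[\U0001F3FB-\U0001F3FF\u200D\uFE0F]*", re.DOTALL)


def split_clusters(text):
    """Split text into base characters with their trailing modifiers."""
    return CLUSTER_RE.findall(text)


# Thumbs-up with a skin tone, index finger with U+FE0F, winking face.
sample = "\U0001F44D\U0001F3FD\u261D\uFE0F\U0001F609"
print(len(sample), "code points ->", len(split_clusters(sample)), "pieces")
```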
Script
A more precise way is to use a script:
https://github.com/orling/grapheme-splitter/blob/master/index.js
Add the code from that file to the Script editor.
Then add code for sample usage:
function splitEmojis(string) {
  var splitter = new GraphemeSplitter();
  // split the string to an array of grapheme clusters (one string each)
  var graphemes = splitter.splitGraphemes(string);
  return graphemes;
}
Tests
Not 100% precise
1
Please note: some emojis are not correctly shown in sheets
🏴󠁧󠁢󠁷󠁬󠁳󠁿🏴󠁧󠁢󠁳󠁣󠁴󠁿🏴󠁧󠁢󠁥󠁮󠁧󠁿🏴
↑ These emojis (flag: England, flag: Scotland, flag: Wales, black flag) are the same for Google Sheets.
2
Vlookup function in #GoogleSheets and in #Excel thinks chars
#️⃣ and
*️⃣
are the same!

Google Sheets formula to add case-insensitive text + text in cell

I have some text in column A, and I want to type a value in cell E1 and filter column A by it. I tried this formula:
=Filter(A1:A10;ArrayFormula(E1 REGEXMATCH(A1:A10;E1)))
but I want it to match cells that CONTAIN the text, not only EXACT matches.
=filter(A1:A10;REGEXMATCH(A1:A10;"(i?) TEX"))
This works, but I want to use a cell value instead of hard-coded text, so I need to combine the two somehow.
If I put (?i)TEX in cell E1, it finds TEXT in column A, but I want the (?i) to be in the formula itself and can't figure out how to do that.
I tried
=Filter(A1:A10;ArrayFormula(E1 REGEXMATCH(A1:A10;"(i?) +"E1"")))
doesn't work
=Filter(A1:A10;ArrayFormula(E1 REGEXMATCH(A1:A10;"(i?)"+E1)))
doesn't work
=filter(A1:A10;REGEXMATCH(A1:A10;"(i?)&" "&E1"))
doesn't work
I really have no idea how to prepend (i?) to the cell value.
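To make a match case-insensitive you'll need (?i) instead of (i?); (i?) is just a group that matches an optional literal "i". Join the flag and the cell value with &, the string concatenation operator. I believe this should work: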
=filter(A1:A10;REGEXMATCH(A1:A10; "(?i)"&E1))
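The same idea, prefixing the search term with the inline flag (?i), can be sketched in Python. The function name filter_contains is hypothetical. The re.escape call is an assumption that the cell holds plain text rather than a regex; drop it if the term is meant to be a pattern.

```python
import re


def filter_contains(values, term):
    """Keep the values that contain term, ignoring case."""
    # (?i) at the start of the pattern turns on case-insensitive matching.
    pattern = re.compile("(?i)" + re.escape(term))
    return [value for value in values if pattern.search(value)]


print(filter_contains(["Some TEXT", "other", "text here"], "tex"))
```

Because search looks for the term anywhere in the value, this is a "contains" match rather than an exact one.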

How to extract text under specific headings from a pdf?

I want to extract the text under specific headings from a PDF using Python.
For example, I have a PDF with the headings Introduction, Summary and Contents. I need to extract only the text under the heading 'Summary'.
How can I do this?
This scenario is exactly what I am working on at my current company: we need to extract the text under a heading. I use a rule-based system. I read the entire document line by line and use a regex to identify all the numbered headings. Once I have the headings, I enter the name of the heading whose paragraph I want. This input is matched against the list of detected headings, and I use the Universal Sentence Encoder to find the nearest match. After that, I display all the content from that heading up to the next heading.
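The "nearest heading" step can be illustrated with a much simpler stand-in. The sketch below swaps the sentence encoder for the standard library's difflib, which compares spelling rather than meaning, so it is only an approximation of the approach described above. The function name nearest_heading and the sample headings are hypothetical.

```python
import difflib


def nearest_heading(query, headings):
    """Return the detected heading whose spelling is closest to query."""
    # Compare in lower case, but return the heading as originally written.
    lowered = {heading.lower(): heading for heading in headings}
    matches = difflib.get_close_matches(
        query.lower(), list(lowered), n=1, cutoff=0.0)
    return lowered[matches[0]] if matches else None


headings = ["1 Introduction", "2 Summary", "3 Contents"]
print(nearest_heading("summary", headings))
```

With cutoff=0.0 the function always returns the best candidate when any heading exists; a higher cutoff would let it return None for queries that match nothing well.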
A PDF contains unstructured text, so there are no tags from which to extract data directly. Instead, we use regular expressions to find the desired information in the text.
Extract the raw page text using the following code (fitz is the PyMuPDF module):
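import re
import fitz  # PyMuPDF

pdf_file = fitz.open("document.pdf")  # placeholder path, replace with your file
# Newer PyMuPDF releases rename these camelCase methods to snake_case (e.g. load_page)
page = pdf_file.loadPage(0)  # 0-based page index, up to n-1
dl = page.getDisplayList()
tp = dl.getTextPage()
tp_text = tp.extractText()
# Split the page text at numbered headings
re.split(r'\n\d+.+[ \t][a-zA-Z].+\n', tp_text)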
Then adapt the regular expression to your needs (this one worked for me, but you may have to change it).
Here is a detailed example of how this works:
re.findall(r'\n\d+.+[ \t][a-zA-Z].+\n', "some text\n1. heading 1\nparagraph 1\n1.2.3 Heading 2\nparagraph 2")
Output: ['\n1. heading 1\n', '\n1.2.3 Heading 2\n']
You can use re.split to split the text at the headings and retrieve the text under the heading you want.
re.split(r'\n\d+.+[ \t][a-zA-Z].+\n', "some text\n1. heading 1\nparagraph 1\n1.2.3 Heading 2\nparagraph 2")
Output: ['some text', 'paragraph 1', 'paragraph 2']
In other words, the text under the i-th heading in the findall result (counting from 0) is element i+1 of the split result.
The best method I found uses this regular expression:
import re
regex = r"^\d+(?:\.\d+)* .*(?:\r?\n(?!\d+(?:\.\d+)* ).*)*"
# samplestring holds the text extracted from the document
print(re.findall(regex, samplestring, re.M))
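Building on that pattern, here is a minimal sketch that turns the matches into a lookup from heading to body text. The function name sections_by_heading and the sample text are hypothetical. It assumes headings are numbered lines such as "2 Summary" or "1.2.3 Heading", so a body line that happens to start with a number and a space would be mistaken for a heading.

```python
import re

# A numbered heading line followed by every line up to the next numbered heading.
SECTION_RE = re.compile(
    r"^\d+(?:\.\d+)* .*(?:\r?\n(?!\d+(?:\.\d+)* ).*)*", re.M)


def sections_by_heading(text):
    """Map each numbered heading line to the text that follows it."""
    sections = {}
    for block in SECTION_RE.findall(text):
        # The first line of each block is the heading, the rest is the body.
        heading, _, body = block.partition("\n")
        sections[heading.strip()] = body.strip()
    return sections


sample = ("1 Introduction\nIntro text.\n"
          "2 Summary\nFirst line.\nSecond line.\n"
          "3 Contents\nList of items.")
print(sections_by_heading(sample)["2 Summary"])
```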