Find a pattern in a string and remove it from Excel cells without touching the pattern in the middle of the string - regex

I have a column which has "--" pattern in the beginning, middle and end of the string. For example:
-- myString
my -- String
myString --
I want to find these two types of cells:
-- myString
myString --
and remove the "--" pattern so the cells look clean. I am an amateur Excel user, but I can use functions if you suggest some. It should be possible to use Find and pass its results to Replace, but I do not know how to do that.
Please note: the answer should handle all the cells in the column, of which there are hundreds. I need one solution that changes all of them, not one that fixes a single cell.

EDIT: Just reread the request, per instruction from Gary'sStudent. This will remove all instances of "--", not only those at the beginning/end.
If the data is in A1, use the following formula:
=SUBSTITUTE(A1,"--","")

With data in A1, in B1 enter:
=IF(LEFT(A1,2)="--",MID(A1,3,9999),IF(RIGHT(A1,2)="--",MID(A1,1,LEN(A1)-2),A1))

OK, I found the answer. The answer from #Dubison helped me to find the right answer.
If the first two characters or the last two characters of the cell are "--", then substitute "--" with ""; otherwise do nothing. Note that SUBSTITUTE removes every "--" in such a cell, including one in the middle.
=IF(LEFT(A1,2)="--",SUBSTITUTE(A1,"--",""),IF(RIGHT(A1,2)="--",SUBSTITUTE(A1,"--",""), A1))

This is much the same as the previous answers, only with simpler logic: if the string's first or last character is "-", do nothing; otherwise replace "--" with "".
=IF(LEFT(A1,1)="-",A1,IF(RIGHT(A1,1)="-",A1, SUBSTITUTE(A1,"--","")))
UPDATE:
I noticed that I misread the question. The formula above removes the "--" only if it is in the middle. However, the original question was to remove "--" only if it is at the beginning or at the end, so the formula should be:
=IF(OR(LEFT(A1,2)="--",RIGHT(A1,2)="--"),SUBSTITUTE(A1,"--",""),A1)
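Since the title asks about regex: if the column can be exported and cleaned outside Excel, the "only at the beginning or end" rule can be written as one anchored pattern. This is only a sketch in Python, not an Excel formula; the function name strip_edge_dashes and the sample list are made up for illustration, and trimming the whitespace next to the "--" is my assumption about what "look fine" means.

```python
import re

# "--" (plus any whitespace next to it) anchored to the very start or the very end.
# A "--" in the middle is never matched because neither anchor applies there.
EDGE_DASHES = re.compile(r"^\s*--\s*|\s*--\s*$")

def strip_edge_dashes(cell: str) -> str:
    """Remove a leading and/or trailing "--" and leave any middle "--" alone."""
    return EDGE_DASHES.sub("", cell)

# Hypothetical column values taken from the question's examples.
cells = ["-- myString", "my -- String", "myString --"]
cleaned = [strip_edge_dashes(c) for c in cells]
print(cleaned)
```

Unlike the SUBSTITUTE-based formulas, this leaves a middle "--" untouched even when the same cell also starts or ends with "--".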

Related

Splitting name/value pairs with regex to ignore special characters based on surrounding characters

I have this regex that's worked well so far that splits 'name=value' pairs separated by a given character.
(?s)([^\s=]+)=(.*?)(?=\s+[^\s=]+=|\Z)
I know the separator, but the problem is in the example below (tab separated):
usrName=Wilma sev=4 cat=Detection CommandLine="C:\powershell.exe" -Enc 0ATQBpAG0AAcABDAHIAZQBkAHMAIgA= IOCValue= ProcessEndTime=2023-01-18 15:51:05
https://regex101.com/r/1wgVxs/5
Some names can have no value, as with 'IOCValue', which works as expected. However, for values like the CommandLine, I get everything up to -Enc as one match and the remainder, up to the next pair, as another.
What I'm hoping to get out from the above is:
usrName=Wilma
sev=4
cat=Detection
CommandLine="C:\powershell.exe" -Enc 0ATQBpAG0AAcABDAHIAZQBkAHMAIgA=
IOCValue=
ProcessEndTime=2023-01-18 15:51:05
But I'm getting:
usrName=Wilma
sev=4
cat=Detection
CommandLine="C:\powershell.exe" -Enc
0ATQBpAG0AAcABDAHIAZQBkAHMAIgA=
IOCValue=
ProcessEndTime=2023-01-18 15:51:05
Given I know the separator is a tab, I think what I need is to look for name=value pairs only when they are at the start of the line or preceded by the separator (tab). Is this possible?
Note: I can expect a space separator too, but I have a less performant and messy non-regex version I can send those to, so presume tab.
You may use this simplified regex:
(?s)([^\s=]+)=(.*?)(?=\t|\Z)
Here, the lookahead (?=\t|\Z) makes sure that the value part is followed by either a tab character or the end of the input.
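To see the tab-based lookahead on the sample line, here is a small check in Python. The question does not say which regex engine is in use, so Python's re module is an assumption (it accepts this pattern unchanged); the helper name split_pairs is made up, and the sample line is rebuilt with real tab characters between the pairs.

```python
import re

# The pattern from the answer: a value ends at the next tab or at the end of the input.
PAIR_RE = re.compile(r"(?s)([^\s=]+)=(.*?)(?=\t|\Z)")

def split_pairs(line: str) -> list[tuple[str, str]]:
    """Return (name, value) tuples from a tab-separated name=value line."""
    return [(m.group(1), m.group(2)) for m in PAIR_RE.finditer(line)]

# Sample data from the question, joined with actual tabs.
line = "\t".join([
    "usrName=Wilma",
    "sev=4",
    "cat=Detection",
    r'CommandLine="C:\powershell.exe" -Enc 0ATQBpAG0AAcABDAHIAZQBkAHMAIgA=',
    "IOCValue=",
    "ProcessEndTime=2023-01-18 15:51:05",
])

pairs = split_pairs(line)
for name, value in pairs:
    print(f"{name}={value}")
```

The CommandLine value now stays in one piece, including its trailing "=", because only a tab or the end of the input can close a value.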

Removing the last specific character from the results of my formula

I'm using some VLOOKUPs to pull in text from another tab on my spreadsheet using the below formula
={"Product Category Test";ARRAYFORMULA(IF(ISBLANK(A2:A),"",
VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Product Category",'Import Template'!A1:DB1,0),false)&"|"&
IF(VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Automatic Categories",'Import Template'!A1:DB1,0),false)<>"",
VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Automatic Categories",'Import Template'!A1:DB1,0),false),"")))}
Example of results: Books|Coming Soon Images|
All of my results will be delimited by a "|", which will also be the final character. I need to remove the final "|" from the results, ideally without using a helper column. Is there a way to wrap another function around my formula to achieve this? I've played around with RIGHT and LEN but can't figure it out.
Thanks,
use regex:
=ARRAYFORMULA({"Product Category Test"; REGEXREPLACE(""&IF(ISBLANK(A2:A),,
VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Product Category",'Import Template'!A1:DB1,0),)&"|"&
IF(VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Automatic Categories",'Import Template'!A1:DB1,0), )<>"",
VLOOKUP(A2:A,'Import Template'!A:DB,MATCH("Automatic Categories",'Import Template'!A1:DB1,0),),)), "\|$", )})
If this doesn't work, make sure there are no trailing spaces after the last |.
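The pattern "\|$" does the work here: an escaped pipe anchored to the end of the text. A quick way to see both the normal case and the trailing-space caveat is to try the same pattern outside Sheets. The Python sketch below is only an illustration (the function name drop_trailing_pipe is made up), and the more tolerant pattern at the end is my own variation, not part of the answer.

```python
import re

def drop_trailing_pipe(value: str) -> str:
    """Remove one "|" only when it is the very last character."""
    return re.sub(r"\|$", "", value)

# Example result from the question.
result = drop_trailing_pipe("Books|Coming Soon Images|")

# A trailing space after the last "|" stops the anchored pattern from matching.
with_space = drop_trailing_pipe("Books|Coming Soon Images| ")

# Variation (assumption): also swallow whitespace after the final pipe.
tolerant = re.sub(r"\|\s*$", "", "Books|Coming Soon Images| ")

print(repr(result), repr(with_space), repr(tolerant))
```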

Tcl - How to Add Text after last character through regex?

I need a tip or suggestion, followed by an example, of how I can add a .txt extension after the last character of a variable's value.
For example:
set txt " ONLINE ENGLISH COURSE - LESSON 5 "
set result [concat "$txt" .txt]
Print:
Note that there are spaces at the start, in the middle, and at the end of the string in the variable (txt). The spaces at the start and in the middle must be kept, but the last space, after the end of the sentence, should be replaced with the extension [.txt].
Tcl's built-in concat command does not achieve the desired effect.
The expected result was something like this:
ONLINE ENGLISH COURSE - LESSON 5.txt
I know I could remove spaces with string map but I don't know how to remove just the last occurrence on the line.
And otherwise I don’t know how to remove the last space to add the text [.txt]
If anyone can point me to one or more solutions, thank you in advance.
set result "[string trimright $txt].txt"
or
set result [regsub {\s*$} $txt ".txt"]
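Both one-liners do the same thing: drop the trailing whitespace, keep everything else, and append the extension. For readers who want to compare with another language, here is a rough Python counterpart of each. It is a sketch for comparison only, not Tcl, and the variable names are made up.

```python
import re

txt = " ONLINE ENGLISH COURSE - LESSON 5 "

# Counterpart of: set result "[string trimright $txt].txt"
# rstrip() removes trailing whitespace only; leading and inner spaces stay.
result_trim = txt.rstrip() + ".txt"

# Counterpart of: set result [regsub {\s*$} $txt ".txt"]
# count=1 mirrors regsub without -all: only the first match is replaced,
# which also avoids a second, empty match at the very end of the string.
result_regex = re.sub(r"\s*$", ".txt", txt, count=1)

print(repr(result_trim))
print(repr(result_regex))
```

In both cases the leading space and the spaces between words are preserved; only the trailing space is replaced.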

Get just X number of strings from a comma separated cell

I'm having real trouble finding a succinct solution to this simple problem. Currently I have cells that contain many comma-separated items. I just want the first 5.
ie. cell A1 =
text, another string, something else, here's another one, guess what another string here, and another, hello i'm another string, another string etc, etc, etccccc
and I'm trying to grab just the first 5 strings.
Beyond that, I wonder if I can incorporate a formula such as =LEN(A1)>20
Currently I do this with numerous formulas: =IFERROR(INDEX( SPLIT(C31,","),1)), then =IFERROR(INDEX( SPLIT(C31,","),2)), etc., and then run the LEN formula above.
Is there a simpler solution? Thanks so much.
Try,
=split(replace(A1, find("|", SUBSTITUTE(A1, ", ", "|", 5)), len(A1), ""), ", ", false)
For Excel, with data in A1, in B1 enter:
=TRIM(MID(SUBSTITUTE($A1,",",REPT(" ",999)),COLUMNS($A:A)*999-998,999))
and copy across:
To get all 5 substrings into a single cell, use:
=LEFT(A1,FIND(CHAR(1),SUBSTITUTE(A1,",",CHAR(1),5))-1)
=ARRAY_CONSTRAIN(SPLIT(A1,","),1,5)
=REGEXEXTRACT(A1,"((?:.*?,){5})")
=REGEXEXTRACT(A1,REPT("(.*?),",5))
SPLIT to split by delimiter
ARRAY_CONSTRAIN to constrain the array
REGEX to extract 5 comma separated values
. Any character
.*?, Any character repeated unlimited number of times (? as little as possible) followed by a ,
{5} Quantifier
REPT to repeat strings
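The two ideas above (split and constrain to five, or a lazy regex repeated five times) can be tried outside the spreadsheet. The Python sketch below is only an illustration of the same logic, not a Sheets or Excel formula; the variable names are made up and the cell text is the example from the question.

```python
import re

cell = ("text, another string, something else, here's another one, "
        "guess what another string here, and another, hello i'm another string, "
        "another string etc, etc, etccccc")

# Like ARRAY_CONSTRAIN(SPLIT(A1,","),1,5): split on commas, keep the first five.
first_five = [item.strip() for item in cell.split(",")[:5]]

# Like REGEXEXTRACT(A1,"((?:.*?,){5})"): five lazy "anything up to a comma" groups.
match = re.match(r"((?:.*?,){5})", cell)
prefix = match.group(1) if match else cell  # fall back if there are fewer than 5 commas
joined = prefix.rstrip(",")                 # the regex keeps the fifth comma; drop it

print(first_five)
print(joined)
```

The split version returns separate items, while the regex version returns one string, which mirrors the difference between the multi-cell and single-cell formulas.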

Remove Multiple Periods Up To Bracket From String

Would like to know how to create an Emacs macro that will
Find the first instance of multiple periods in string
Set mark
Move to the first closed bracket in string
Remove all chars between mark and closed bracket
Here is an example string. I'd like to go from this:
* [This is Chapter 1.......................................................... 1-83](chapter1.md)
To this:
* [This is Chapter 1](chapter1.md)
Can anyone assist?
Thanks
Here's the hacky way I accomplished it. I'm sure there is a cleaner way.
Start with cursor at the beg of line
M-x start-kbd-macro
C-s RET .. to search for first instance of ".." in the string
C-SPACE to set mark
C-s ] to search for first instance of "]" in the string
DEL to remove everything marked
BKSP BKSP to remove the final two ".."
DWN ARROW to get to next line
C-a to get to beg of line
M-x end-kbd-macro
I know it's lame, but it worked! I have ~100 pages of docs to do this to, so next I need to figure out how to reliably perform this on the entire doc.
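For running this over a whole document, one possible alternative to replaying the macro is a single regex substitution per line. The sketch below uses Python rather than Emacs, so treat it as a different technique, not a translation of the macro; the names DOT_LEADER and strip_dot_leader are made up, and the sample line uses a shorter run of periods than the original.

```python
import re

# Two or more periods, then everything up to and including the first "]".
DOT_LEADER = re.compile(r"\.{2,}[^\]]*\]")

def strip_dot_leader(line: str) -> str:
    """Drop the dot leader and page range, keeping the closing bracket."""
    return DOT_LEADER.sub("]", line, count=1)

# Sample line from the question (run of periods shortened here).
line = "* [This is Chapter 1.................... 1-83](chapter1.md)"
fixed = strip_dot_leader(line)
print(fixed)

# Applying it to every line of a document held in a string.
doc = "\n".join([line, "Plain text stays as it is."])
cleaned_doc = "\n".join(strip_dot_leader(l) for l in doc.splitlines())
```

Lines without a run of periods are returned unchanged, and the single period in chapter1.md is safe because the pattern needs at least two in a row.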