Match and highlight two sets of columns in VIM - regex

This SO post describes how to highlight all characters on a line in VIM past a given column (80, in this case).
I'd like to have two sets of highlighted characters, columns 81-100 highlighted with one background color, and columns 101+ with another background color.
Here's what I've tried so far:
" Light highlight characters past column 80. Red highlight past 100.
highlight OverLength1 ctermbg=red ctermfg=white guibg=#5b4f62
match OverLength1 /\%81v.\+/
highlight OverLength2 ctermbg=red ctermfg=white guibg=#990500
match OverLength2 /\%101v.\+/
as well as this variation on the 3rd line:
match OverLength1 /\%81v.\+($|100v)/
Neither works. The best I can get is to match 101+ alone; it seems like the second match overwrites the first match.
I don't like the colorcolumn option: I don't want to highlight empty columns, just text in the ranges specified.

Try
" Light highlight characters past column 80. Red highlight past 100.
highlight OverLength1 ctermbg=red ctermfg=white guibg=#5b4f62
match OverLength1 /\%81v.\+/
highlight OverLength2 ctermbg=red ctermfg=white guibg=#990500
2match OverLength2 /\%101v.\+/
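:match defines only a single match per window, so a second :match replaces the first; :2match (and :3match) set separate matches that can be active at the same time. Read more about it in :h 2match.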

Related

Google Sheets: Find match, check for text in adjacent cell to matched cell

Example data here.
I'd like to use conditional formatting to highlight cells where, if the cell's number is found in another sheet, and there is text in an adjacent cell, it is highlighted.
So, given Sheet2:
When a B-column cell in Sheet1 matches an A-column cell in Sheet2, it checks if there is text in the adjacent E-column cell (in Sheet2), and highlights if there is text.
try:
=REGEXMATCH(B2&"", "^"&TEXTJOIN("$|^", 1,
FILTER(INDIRECT("Sheet2!A2:A"), INDIRECT("Sheet2!E2:E")<>""))&"$")
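The logic of the formula can be sketched outside Sheets as follows. This is only an illustration in Python, not something Sheets runs: the function name and the sample values are hypothetical, and a set lookup stands in for the anchored `^a$|^b$` alternation the formula builds with TEXTJOIN.

```python
def cells_to_highlight(sheet1_b, sheet2_rows):
    """Return one True/False flag per Sheet1 B-cell.

    sheet2_rows is a list of (A-cell, E-cell) pairs from Sheet2 (hypothetical data).
    """
    # FILTER step: keep only the A values whose adjacent E cell has text.
    keys = {str(a) for a, e in sheet2_rows if e != ""}
    # REGEXMATCH step: B&"" turns the cell into text, then an exact match is required.
    return [str(b) in keys for b in sheet1_b]


flags = cells_to_highlight(
    [101, 102, 103],
    [(101, "note"), (102, ""), (104, "other")],
)
print(flags)
```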

RegexReplace the nth occurrence of a string of underscores

I'm having trouble getting a REGEXREPLACE working in a Google Sheets formula. I'm aiming to replicate a certain card game which is opposed to humankind. I have a cell containing a string which contains one, two or three occurrences of a series of underscores, e.g.
"_____ is the new _____"
And let's say I want to substitute in the strings "Orange" for the first occurrence, and "Black" for the second occurrence.
I don't know how many underscores will be in each string, it could be one or more, so it seems like a job for regex. I tried SUBSTITUTE and it didn't seem to recognise asterisks. Based on this link, I tried using {1} {2} and {3} to match the first/second/third occurrence, but I'm not doing something right:
=REGEXREPLACE(G16,".*(_*){1}.*",G17)
G16 is: _____ is the new _____.
G17 is: Orange
The output of the formula is: OrangeOrange.
Can anyone help me figure out the correct way to do this?
You may use
=REGEXREPLACE(REGEXREPLACE(G16,"^([^_]*)_+","$1Orange"), "^([^_]*)_+", "$1Black")
              |----- First occurrence -----------------|
 |----------------- Second occurrence ------------------------------------------|
Details
^ - start of string
([^_]*) - Capturing group 1 ($1 will refer to this group value): 0 or more chars other than an underscore
_+ - 1 or more underscores.
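For comparison, here is a minimal sketch of the same idea in Python, assuming the goal is to fill the blanks in order. The function name `fill_blanks` is made up for illustration; each pass uses the same `^([^_]*)_+` pattern and replaces only the first remaining run of underscores.

```python
import re


def fill_blanks(template, words):
    """Replace successive runs of underscores with the given words, in order."""
    result = template
    for word in words:
        # Group 1 keeps the text before the first run of underscores;
        # the callable replacement inserts `word` literally.
        result = re.sub(
            r"^([^_]*)_+",
            lambda m, w=word: m.group(1) + w,
            result,
            count=1,
        )
    return result


print(fill_blanks("_____ is the new _____", ["Orange", "Black"]))
```

Each loop iteration corresponds to one level of nesting in the REGEXREPLACE formula, so a third blank would need a third nested call there.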

Use RegEx to match data in cell in order to pull out to new rows in Excel

I have a spreadsheet containing numerous cells of data, but each cell contains numerous lines without carriage return or line feed. I want to create new rows by matching each occurrence of a ten digit number and grabbing the number and all text up until the next occurrence.
For example, this is one cell's text.
8770304350 PRINTER 4610-2CR W/IRON GRAY COVERS (2921) $750.75 2881057001 PAYMENT DEVICE - VERIFONE MX915 - WALMART CONSIGNE 8770242020 DISPLAY 4820-5GB USB W/ I/O SUPPORT IRON GRAY $907.27 8770242216 KEYPAD-MSR 3 TRACK IRON GREY $213.85 2881037020 CONSIGNED- SCANNER DS6878-SR20117WR IMAGER 2D BLUE
I want to split it into new rows each time there is a ten digit number so it would end up looking like this where each line is a new row.
8770304350 PRINTER 4610-2CR W/IRON GRAY COVERS (2921) $750.75
2881057001 PAYMENT DEVICE - VERIFONE MX915 - WALMART CONSIGNE
8770242020 DISPLAY 4820-5GB USB W/ I/O SUPPORT IRON GRAY $907.27
8770242216 KEYPAD-MSR 3 TRACK IRON GREY $213.85
2881037020 CONSIGNED- SCANNER DS6878-SR20117WR IMAGER 2D BLUE
I tried using RegEx on my own, but I was either matching just the number or the entire string, and it's very complicated to me.
For example, this attempt used a lookahead but ended up selecting all the text except the first number and the last entry.
(?<=[0-9]{10}).*(?=[0-9]{10})
You may use
\b\d{10}.*?(?=\s*\b\d{10}|$)
See the regex demo. If there can be line breaks, replace .*? with [\s\S]*?.
Details
\b - leading word boundary
\d{10} - 10 digits
.*? - any 0+ chars other than line break chars as few as possible
(?=\s*\b\d{10}|$) - a positive lookahead that, immediately to the right of the current location, requires
\s*\b\d{10} - 0+ whitespaces, word boundary and 10 digits
| - or
$ - end of string.
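As a quick check of the pattern, here is a small sketch in Python (not Excel; the question's environment would need its own regex engine call). The function name `split_rows` is an assumption for illustration, and the cell text is the sample from the question.

```python
import re

# Ten digits, then as little text as possible up to the next ten-digit number or the end.
ROW_PATTERN = re.compile(r"\b\d{10}.*?(?=\s*\b\d{10}|$)")


def split_rows(cell_text):
    """Return one string per ten-digit item found in the cell."""
    return ROW_PATTERN.findall(cell_text)


cell = (
    "8770304350 PRINTER 4610-2CR W/IRON GRAY COVERS (2921) $750.75 "
    "2881057001 PAYMENT DEVICE - VERIFONE MX915 - WALMART CONSIGNE "
    "8770242020 DISPLAY 4820-5GB USB W/ I/O SUPPORT IRON GRAY $907.27 "
    "8770242216 KEYPAD-MSR 3 TRACK IRON GREY $213.85 "
    "2881037020 CONSIGNED- SCANNER DS6878-SR20117WR IMAGER 2D BLUE"
)

for row in split_rows(cell):
    print(row)
```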

vim: substitute specific character, but only after nth occurrence

I need to do this exercise about regexes and text manipulation in vim.
So I have this file about the highest-scoring soccer players in history, with 50 entries looking like this:
1 Cristiano Ronaldo Portugal 88 121 0.73 03 Manchester United Real Madrid
The whitespace between the fields is tabs (\t).
Each field corresponds to a different category, etc.
The last field contains one or more clubs the player has played in (so not a fixed number of clubs).
The question: replace all tabs with a ';', except for the last field, where the clubs need to be separated by a ','.
So I thought: I just replace all of them with a comma, and then I replace the first 7 commas with a semicolon. But how do you do that? Everything - from regex to vim commands - is allowed.
The first part is easy: :2,$s/\t/,/g
But the second part, I can't seem to figure out.
Any help would be greatly appreciated.
Thanks, Zeno
This answer is similar to @Amadan's, but it makes use of the ability to provide an expression as the replace string to actually do the difficult bit of changing the first set of tabs to semicolons:
%s/\v(.{-}\t){7}/\=substitute(submatch('0'), '\t', ';', 'g')/|%s/\t/,/g
Broken down this is a set of three substitute commands. The first two are cobbled together with a sub-replace-expression:
%s/\v(.{-}\t){7}/\=substitute(submatch('0'), '\t', ';', 'g')/
What this does is find exactly seven occurrences ({7}) of any characters followed by a tab, matched non-greedily ((.{-}\t)). Then we replace this entire match (submatch(0)) with the result of the substitute expression (\=substitute(...)). The substitute expression is simple by comparison, as it just converts all tabs to semicolons.
The last substitute just changes any other tabs on the line to commas.
See :help sub-replace-expression
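The same two-step idea can be sketched in Python, where a callable passed to `re.sub` plays the role of Vim's sub-replace-expression. The function name `convert_line` and the tab positions in the sample line are assumptions made for illustration.

```python
import re


def convert_line(line, leading_fields=7):
    """Turn the first `leading_fields` tabs into ';' and the remaining tabs into ','."""
    # Match the first N tab-terminated fields non-greedily, like \v(.{-}\t){7}.
    head_pattern = r"^(?:.*?\t){%d}" % leading_fields
    # Rewrite only the tabs inside that match, like \=substitute(submatch(0), ...).
    line = re.sub(head_pattern, lambda m: m.group(0).replace("\t", ";"), line)
    # Any tabs left over belong to the club list.
    return line.replace("\t", ",")


sample = "1\tCristiano Ronaldo\tPortugal\t88\t121\t0.73\t03\tManchester United\tReal Madrid"
print(convert_line(sample))
```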
Here's one way you could do it:
:let @q=":s/\t/;\<cr>"
:2,$norm 7@q
:2,$s/\t/,/g
Explanation:
First, we define a macro 'q' that will replace one tab with a semicolon. Now, on any line we can simply run this macro n times to replace the first n tabs. To automatically do this to every line, we use the norm command:
:2,$norm 7@q
This is essentially the same thing as literally typing 7@q (i.e. "run macro 'q' seven times") on every line in the specified range. From there, we can simply replace every remaining tab with a comma.
:2,$s/\t/,/g
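For readers more at home outside Vim, the "replace the first n, then the rest" idea maps onto a count-limited replace. This is a rough Python equivalent, not a translation of the macro itself; the function name and the tab positions in the sample line are assumptions.

```python
def convert_line_by_count(line, n=7):
    """Replace the first n tabs with ';', then every remaining tab with ','."""
    # str.replace with a count stops after n replacements, like running the macro n times.
    return line.replace("\t", ";", n).replace("\t", ",")


sample_line = "1\tCristiano Ronaldo\tPortugal\t88\t121\t0.73\t03\tManchester United\tReal Madrid"
print(convert_line_by_count(sample_line))
```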
:2,$s/\t\(.*\t\)\@=/;/g
:2,$s/\t/,
Change any tabs where there is a tab later to ;
Change any remaining tabs to ,
EDIT: Misunderstood. Here is a fixed version:
:2,$s/\(\(\t.*\)\{7}\)\@<=\t/,/g
:2,$s/\t/;/g
Change any tabs where there's seven tabs before it to ,
Change any remaining tabs to ;
My PatternsOnText plugin has (among others) a :SubstituteSelected command that lets you specify the match positions. With this, you can easily replace the first 8 tabs with semicolons, and then use a regular substitute to change the remaining tabs into commas:
:2,$SubstituteSelected/\t/;/g 1-8
:2,$s/\t/,/g
We solved the issue by capturing the first 8 groups manually, ([^\t]*\t)(...)(...), separating them with semicolons (\1;\2;...;), and then replacing the remaining tabs with commas: | 2,$s/\t/,/g
Thanks to everyone trying to help!

How to convert a regular expression from OR to XOR

I wish to evaluate a structure similar to the following:
The house is green but my favorite colors are blue red and yellow
I determine the color of the house with a regular expression like this:
the house\s+(\w\s*)+(?=(cyan|green|red|blue))
What does it do? This expression returns the following match:
The house is green but my favorite colors are blue
That is, it matches up to the last color from the list that appears in the string: it consumes text until the appearance of "red", even though the first color that appears is "green".
What should I do? What I'm looking for is to take just the first color mentioned in the list and stop looking, that is, to be told that the house color is green and nothing else.
Q1: How do I scan the string until the first appearance of exactly one of the expressions I listed? That is, how do I convert the expression (cyan or green or blue or red) into a list that behaves like an XOR? Important: use regular expressions only, i.e. without any host language such as .NET, Java, Perl, etc.
Q2: Is there any alternative to regular expressions that I missed? That is, is the road I took the right one?
In advance, thank you all
It's returning the last match because your (\w\s*)+ is greedy; it matches as much as it can (i.e. all the way up to just before the 'red').
You could change it to non-greedy using +? instead of +
the house\s+(\w\s*)+?(?=(cyan|green|red|blue))
But I think you can do better than that.
Also, (\w\s*)+ potentially matches just a single letter at a time; why not match whole words with (\w+\s+)+ instead?
Also, why not just match up to the first colour?
the\s+house\s+(\w+\s+)+?(cyan|green|red|blue)
Then capturing group 2 (the second set of brackets) will contain the first occurrence of cyan, green, red, or blue (i.e. your colour list). Note the +? making sure that the word regex is non-greedy, meaning it won't gobble up instances of 'cyan', 'green', 'red' or 'blue'.
You could even just do
house.*?\b(cyan|green|red|blue)
Where the .*? is non-greedy, and just gobbles everything up, up to the first colour. The \b is a "word boundary" and just makes sure the regex doesn't match the 'red' in 'desired', for example.
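To see the difference between the greedy and non-greedy forms side by side, here is a small Python sketch using the sentence from the question. Python's `re` module is used purely for illustration, and the variable names are made up.

```python
import re

sentence = "The house is green but my favorite colors are blue red and yellow"
colors = r"(cyan|green|red|blue)"

# Greedy: .* runs to the end of the string and backtracks to the LAST listed color.
greedy = re.search(r"house.*\b" + colors, sentence)
# Non-greedy: .*? expands one character at a time and stops at the FIRST listed color.
lazy = re.search(r"house.*?\b" + colors, sentence)

print(greedy.group(1))
print(lazy.group(1))
```

Here the greedy form captures "red" and the non-greedy form captures "green".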
This is how I would do it in Python; I'm not sure whether other languages have an equivalent of the .search feature.
"What I'm looking for is to just take the first color mentioned in the list and stop looking, "
import re
s = 'The house is green but my favorite colors are blue red and yellow'
# search() scans the string and stops at the first listed color
print(re.search('(cyan|green|red|blue)', s).group(1))
# match() is anchored at the start, so the prefix has to be spelled out
print(re.match('The house is (cyan|green|red|blue)', s).group(1))
Note the lack of spaces inside (cyan|green|red|blue).
It prints this:
green
green