Have Tabularize ignore some lines and align the others - regex

I would like Tabularize to ignore lines that do not contain a particular character and then align (tabularize) the remaining lines:
text1_temp = text_temp;
temporary_line;
text2 = text_temp;
In the end I would like the following:
text1_temp = text_temp;
temporary_line;
text2      = text_temp;
// The 2nd "=" is spaced/tabbed with relation to the first "="
If I run ":Tabularize /=" for the 3 lines together I get:
text1_temp      = text_temp;
temporary_line;
text2           = text_temp;
Where the two lines with "=" are aligned with respect to the length of the middle line
Any suggestions .. ?
PS: I edited the post possibly to explain the need better ..

I am not sure how to do this with Tabular directly. You might be able to use Christian Brabandt's NrrwRgn plugin to filter out only the lines containing = using :NRP and then :NRM. This gives you a new buffer with only the lines containing =, so you can run :Tabularize/=/ and then save the buffer (:w, :x, etc.).
:g/=/NRP
:NRM
:Tabularize/=/
:x
The easiest option is probably vim-easy-align, which seems to support this behavior out of the box. Example using EasyAlign (with ga as EasyAlign's mapping):
gaip=

What about a simple replace, like :g/=/s/\t/ /g ?
If that doesn't work, you can try this too: :g/=/s/ \+= \+/ = /g
Explanation:
The :g/=/s will find all the lines that contain '=' and do the replacement for them.
So, s/\t/ /g will replace tabs with spaces. These two things combined will do what you need.
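Outside of Vim, the behavior asked for in the question (pad only the lines that contain the separator, and measure only those lines) can be sketched in a few lines of Python. This is a minimal illustration, not what Tabular or EasyAlign do internally; the function name align_on and its sep parameter are made up for this example.

```python
def align_on(lines, sep="="):
    """Align `sep` across the lines that contain it; leave other lines untouched."""
    # Split only the lines that contain the separator, at its first occurrence.
    parts = {i: line.split(sep, 1) for i, line in enumerate(lines) if sep in line}
    if not parts:
        return list(lines)
    # The column width is computed from the matching lines only,
    # so a long line without the separator does not push it to the right.
    width = max(len(left.rstrip()) for left, _ in parts.values())
    out = []
    for i, line in enumerate(lines):
        if i in parts:
            left, right = parts[i]
            out.append(left.rstrip().ljust(width) + " " + sep + " " + right.strip())
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    sample = ["text1_temp = text_temp;", "temporary_line;", "text2 = text_temp;"]
    print("\n".join(align_on(sample)))
```

The middle line is passed through unchanged, and the two "=" lines are aligned against each other only.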

Related

Swapping columns in vi with regex without using awk, read, etc

I have a file of 1000 lines, with 5 to 8 columns in each line separated by :
1:2:3:4:5:6:7:8
4g10:8s:45:9u5b:a:z1
I want to have all lines in some order 4:3:1:2:5:6:7...
How would I swap only first 4 columns with regex?
I think this would probably be easier to do with another approach, but you could use ex to do it, so be in command mode and enter:
:%s/^\([^:]\+\):\([^:]\+\):\([^:]\+\):\([^:]\+\):/\4:\3:\1:\2:/
which will create capture groups for the first 4 colon delimited fields, then replace them in a different order than they were there originally.
Here is a regex that should do what you are looking for:
newtext = re.sub("([^:]+):([^:]+):([^:]+):([^:]+)(:)?(.*)?",r"\4:\3:\1:\2\5\6",text)
The takeaway is that you'll want to use parentheses for capturing and then reorder the groups in the order you want them in the replacement. Each capture "group" is just one or more non-: characters, and the groups are separated by :. If there is a possibility of empty groups, change each + to a *.
Here is a sample in Python for clarity:
import re
textlist = [
"1:2:3:4:5:6:7:8",
"1:2:3:4:5",
"1:2:3:4",
]
for text in textlist:
    newtext = re.sub("([^:]+):([^:]+):([^:]+):([^:]+)(:)?(.*)?",r"\4:\3:\1:\2\5\6",text)
    print(newtext)
output:
4:3:1:2:5:6:7:8
4:3:1:2:5
4:3:1:2
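If a regex is not a hard requirement, the "another approach" hinted at above could be as simple as splitting each line on the separator and reordering the first four fields. The following is only a sketch; the helper name swap_first_four is hypothetical, and lines with fewer than four fields are returned unchanged as an assumption about what is wanted.

```python
def swap_first_four(line, sep=":"):
    """Reorder the first four fields as 4:3:1:2 and keep the rest as-is."""
    fields = line.split(sep)
    if len(fields) < 4:
        # Assumption: lines that are too short are left alone.
        return line
    first, second, third, fourth = fields[:4]
    return sep.join([fourth, third, first, second] + fields[4:])


if __name__ == "__main__":
    for text in ["1:2:3:4:5:6:7:8", "4g10:8s:45:9u5b:a:z1"]:
        print(swap_first_four(text))
```

Unlike the regex with +, this version also handles empty fields, because str.split keeps them.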

Regex to Capture and wrap outline formatted text

I have source text that is not particularly clean or well formed but I have a need to find text and wrap a line in a tag. The text is in outline format.
1. becomes a <h1> tag
A. becomes a <h2> tag
(1) becomes a <h3> tag
and so on...
Here are some examples of the source.
1. PREPARE FOR TEST A. Open the door. B. Turn on the light.
The desired result would be
<h1>1. PREPARE FOR TEST</h1>
<h2>A. Open the door.</h2>
<h2>B. Turn on the light.</h2>
Unfortunately, the text could be the same line or it could be on multiple lines or even have a different number of spaces between the outline number and the text. Another example
(1) Check air inlet and air outlet valves are shown open if OAT is above > 53.6 deg F., or closed if OAT is below
48.2 deg F.
In this case the desired result would be
<h3>(1) Check skin air inlet and skin air outlet valves are shown open if temperature is above 53.6 deg F., or closed if temperature is below 48.2 deg F.</h3>
My questions are
How do I find an entire line of text that is associated with an outline level, i.e., the 1., A., (1) and so on.
How do I then wrap that text with the appropriate tag.
I'm not particularly strong at regex, I have been able to do some of the simpler things required of this project but this has me stumped a bit. Here's what I used to try to find the H1 lines, but as anyone that knows regex can plainly see, this won't work past the first word.
\d{1,3}.\s+[A-Z]{2,}
I'm using Python at the moment but am better with PHP and can move to that if needed, and still may because I'm better at PHP than Python.
Thank you.
Since every regex needs a different substitution, you need to apply each regex in turn. Assuming that you want the match to always span an entire line, I'd suggest something like this:
import re
s = """1. becomes a h1 tag
A. becomes a h2 tag
(1) becomes a h3 tag
and so on..."""
regexes = {r"\d+\.": "h1",
           r"[A-Z]+\.": "h2",
           r"\(\d+\)": "h3",
          }
for regex in regexes:
    repl = regexes[regex]
    s = re.sub("(?m)^" + regex + ".*", "<" + repl + ">" + r"\g<0>" + "</" + repl + ">", s)
print(s)
Result:
<h1>1. becomes a h1 tag</h1>
<h2>A. becomes a h2 tag</h2>
<h3>(1) becomes a h3 tag</h3>
and so on...
Explanation:
Each of the regexes (which only match the actual identifiers) is modified to match from the start of the line until the end of the line:
"(?m)^" + regex + ".*" # (?m) allows ^ to match at the start of lines
The entire match is contained in group 0 which can be accessed in the replacement string via \g<0>.
"<" + repl + ">" + r"\g<0>" + "</" + repl + ">" # add tags around line
For future reference and to close this, what I eventually came up with was to run through the entire string of text and remove some trash first. There are actually 15 of these that I use for this step.
$regexes['lf'] = "/[\n\r]*/";
$regexes['tab-cr-lf'] = "/\t[\r\n]/";
preg_replace($regexes,"", $string);
I then discovered that I could count on space and \t after each header identifier, so then I run some more regexes on the string
$regexes['step1'] = "/(\d{1,2}\..\t)/";
$regexes['step2'] = "/([A-Z]\. \t)/";
$replacements['step1'] = "\n\n<step1>$0";
$replacements['step2'] = "\n\n<step2>$0";
preg_replace($this->headerRegexes, $replacements, $string);
These steps have given me some usable text that I can work with.
Thanks to everyone that chimed in, it gave me some things to think about as I tackled this problem.
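As a variation on the Python answer above, which applies each regex in turn, the three identifier patterns can also be combined into one alternation and the tag chosen in a replacement function, so the text is scanned only once. This is a sketch under the same assumption as that answer (each heading occupies a whole line); the names LEVELS, OUTLINE and wrap_outline are made up for illustration.

```python
import re

# (identifier pattern, tag) pairs, taken from the Python answer above.
LEVELS = [(r"\d+\.", "h1"), (r"[A-Z]+\.", "h2"), (r"\(\d+\)", "h3")]

# One capturing group per level; only the group that matched is set.
OUTLINE = re.compile(
    "^(?:" + "|".join("(" + pattern + ")" for pattern, _ in LEVELS) + ").*$",
    re.MULTILINE,
)


def wrap_outline(text):
    """Wrap every line that starts with an outline identifier in its tag."""
    def wrap(match):
        # lastindex is the number of the group that matched (1, 2 or 3).
        tag = LEVELS[match.lastindex - 1][1]
        return "<{0}>{1}</{0}>".format(tag, match.group(0))
    return OUTLINE.sub(wrap, text)


if __name__ == "__main__":
    print(wrap_outline("1. becomes a h1 tag\nA. becomes a h2 tag\n(1) becomes a h3 tag\nand so on..."))
```

Headings that share a line or wrap across several lines, as in the question's examples, would still need a cleanup pass first, along the lines of the PHP steps above.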

Swift 3: iosMath label removing all spaces

I'm trying to display text which may at times contain a math expression so I am using MTMathUILabel from iosMath. I generate the labels dynamically and add them to a stack as I pull the strings from the db. The problem is that all text which is not math appears with no spaces. i.e:
In db: Solve the following equation: (math here)
In label: Solvethefollowingequation: (math here)
Here is what I have tried so far:
for question in all_questions {
let finalString = question.question?.replacingOccurrences(of: " ", with: "\\space", options: .literal, range: nil)
let label = MTMathUILabel()
label.textColor = UIColor.black
label.latex = finalString
stack.addArrangedSubview(label)
}
But the problem is that it literally places two backslashes. And Xcode doesn't let me write just one \ because it is not escaped. However, if I just write
print("\\space")
Then it will print just one.
How can I fix this so I add only one \? If this cannot be done, how can I achieve what I want? Is there a better library out there?
After giving a quick look at MTMathUILabel's doc and LaTeX conventions, I believe you should replace your spaces with a tilde character "~". This will make them non-breaking spaces and avoid the backslash issue (which is probably due to \space not being understood by MTMathUILabel).
Systematic replacement of all spaces may yield undesirable results if the formula itself has legitimate spaces in it.
For example, a quadratic equation would be expressed as:
x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}
You will end up replacing spaces inside curly braces, and that may or may not be what you want:
x~=~\frac{-b~\pm~\sqrt{b^2-4ac}}{2a}

Replace end of string when start matched

This is what I have in first file:
16;01978B66;BC101;FALSE
17;0195B4E5;BC101;FALSE
18;019796C6;BC101;FALSE
19;0197D016;BC101;FALSE
This is what I have in 2nd file
16;01978B66;BC102;FALSE
17;0195B4E5;BC102;FALSE
18;019796C6;BC102;FALSE
19;0197D016;BC102;FALSE
What regex should I use if I want to replace end of every line starting with 16; and 18; , from ;FALSE to ;TRUE ? I would like to use notepad++ replace in files, so I can replace multiple lines 16; and 18; in all files without touching middle of the string with different values.
I understand regex once I get it explained but I searched for hours and I get lost in other examples...
This is what I should get:
16;01978B66;BC101;TRUE
17;0195B4E5;BC101;FALSE
18;019796C6;BC101;TRUE
19;0197D016;BC101;FALSE
and
16;01978B66;BC102;TRUE
17;0195B4E5;BC102;FALSE
18;019796C6;BC102;TRUE
19;0197D016;BC102;FALSE
I tried to capture in 3 groups with
^(17;)[a-zA-Z0-9\;]{9}[a-zA-Z0-9\;]{6}[a-zA-Z0-9\;]{5}
but replacing with ($3);TRUE leaves me with only ;TRUE, which is not good.
This must be piece of cake for someone who knows how to replace end of string.
btw 0197D016;BC101; is constant in length: 8 digits ; 2 letters and 3 numbers ;
Thanks in advance for help.
Please try the following:
Find what: ^((16|18);.+?)FALSE$
Replace with: $1TRUE
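The pattern anchors at the start of the line, captures the 16; or 18; prefix together with everything up to the final FALSE into group 1, and the replacement writes group 1 back followed by TRUE. To try the same idea outside Notepad++, here is a rough Python equivalent; the function name mark_true and the prefixes parameter are invented for this sketch, and Python writes the backreference as \1 rather than $1.

```python
import re


def mark_true(text, prefixes=("16", "18")):
    """Change a trailing FALSE to TRUE on lines starting with one of `prefixes`."""
    alternatives = "|".join(re.escape(p) for p in prefixes)
    # re.MULTILINE lets ^ and $ match at every line, not only at the ends of the text.
    pattern = re.compile(r"^((?:" + alternatives + r");.+?)FALSE$", re.MULTILINE)
    return pattern.sub(r"\1TRUE", text)


if __name__ == "__main__":
    data = "\n".join([
        "16;01978B66;BC101;FALSE",
        "17;0195B4E5;BC101;FALSE",
        "18;019796C6;BC101;FALSE",
        "19;0197D016;BC101;FALSE",
    ])
    print(mark_true(data))
```

Because the middle of the line is matched by .+? rather than by fixed text, the same call works for both files (BC101 and BC102).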

Notepad++ RegEx group capture syntax

I have a list of label names in a text file I'd like to manipulate using Find and Replace in Notepad++, they are listed as follows:
MyLabel_01
MyLabel_02
MyLabel_03
MyLabel_04
MyLabel_05
MyLabel_06
I want to rename them in Notepad++ to the following:
Label_A_One
Label_A_Two
Label_A_Three
Label_B_One
Label_B_Two
Label_B_Three
The Regex I'm using in the Notepad++'s replace dialog to capture the label name is the following:
((MyLabel_0)((1)|(2)|(3)|(4)|(5)|(6)))
I want to replace each capture group as follows:
\1 = Label_
\2 = A_One
\3 = A_Two
\4 = A_Three
\5 = B_One
\6 = B_Two
\7 = B_Three
My problem is that Notepad++ doesn't register the syntax of the regex above. When I hit Count in the Replace dialog, it returns 0 occurrences. Not sure what's missing in the syntax. And yes, I made sure the Regular Expression radio button is selected. Help is appreciated.
UPDATE:
Tried escaping the parenthesis, still didn't work:
\(\(MyLabel_0\)\((1\)|\(2\)|\(3\)|\(4\)|\(5\)|\(6\)\)\)
Ed's response has shown a working pattern since alternation isn't supported in Notepad++, however the rest of your problem can't be handled by regex alone. What you're trying to do isn't possible with a regex find/replace approach. Your desired result involves logical conditions which can't be expressed in regex. All you can do with the replace method is re-arrange items and refer to the captured items, but you can't tell it to use "A" for values 1-3, and "B" for 4-6. Furthermore, you can't assign placeholders like that. They are really capture groups that you are backreferencing.
To reach the results you've shown you would need to write a small program that would allow you to check the captured values and perform the appropriate replacements.
EDIT: here's an example of how to achieve this in C#
// Requires: using System.Collections.Generic; using System.IO; using System.Text.RegularExpressions;
var numToWordMap = new Dictionary<int, string>();
numToWordMap[1] = "A_One";
numToWordMap[2] = "A_Two";
numToWordMap[3] = "A_Three";
numToWordMap[4] = "B_One";
numToWordMap[5] = "B_Two";
numToWordMap[6] = "B_Three";
string pattern = @"\bMyLabel_(\d+)\b";
string filePath = @"C:\temp.txt";
string[] contents = File.ReadAllLines(filePath);
for (int i = 0; i < contents.Length; i++)
{
    contents[i] = Regex.Replace(contents[i], pattern,
        m =>
        {
            int num = int.Parse(m.Groups[1].Value);
            if (numToWordMap.ContainsKey(num))
            {
                return "Label_" + numToWordMap[num];
            }
            // key not found, use original value
            return m.Value;
        });
}
File.WriteAllLines(filePath, contents);
You should be able to use this easily. Perhaps you can download LINQPad or Visual C# Express to do so.
If your files are too large this might be an inefficient approach, in which case you could use a StreamReader and StreamWriter to read from the original file and write it to another, respectively.
Also be aware that my sample code writes back to the original file. For testing purposes you can change that path to another file so it isn't overwritten.
Bar bar bar - Notepad++ thinks you're a barbarian.
(obsolete - see update below.) No vertical bars in Notepad++ regex - sorry. I forget every few months, too!
Use [123456] instead.
Update: Sorry, I didn't read carefully enough; on top of the barhopping problem, @Ahmad's spot-on - you can't do a mapping replacement like that.
Update: Version 6 of Notepad++ changed the regular expression engine to a Perl-compatible one, which supports "|". AFAICT, if you have a version 5.x, auto-update won't update to 6.x - you have to explicitly download it.
A regular expression search and replace for
MyLabel_((01)|(02)|(03)|(04)|(05)|(06))
with
Label_(?2A_One)(?3A_Two)(?4A_Three)(?5B_One)(?6B_Two)(?7B_Three)
works on Notepad++ 6.3.2
The outermost pair of brackets is for grouping, they limit the scope of the first alternation; not sure whether they could be omitted but including them makes the scope clear. The pattern searches for a fixed string followed by one of the two-digit pairs. (The leading zero could be factored out and placed in the fixed string.) Each digit pair is wrapped in round brackets so it is captured.
In the replacement expression, the clause (?4A_Three) says that if capture group 4 matched something then insert the text A_Three, otherwise insert nothing. Similarly for the other clauses. As the 6 alternatives are mutually exclusive only one will match. Thus only one of the (?...) clauses will have matched and so only one will insert text.
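To make the logic of those conditional clauses concrete, here is a rough Python imitation: for each capture group number, the associated text is inserted only if that group took part in the match. Python's re module has no (?N...) replacement syntax, so a replacement function stands in for it; the names INSERTS and rename are made up for this sketch.

```python
import re

# Same pattern as in the Notepad++ search above; group 1 is the outer pair of brackets.
PATTERN = re.compile(r"MyLabel_((01)|(02)|(03)|(04)|(05)|(06))")

# Text to insert when the given capture group matched, mirroring (?2A_One) ... (?7B_Three).
INSERTS = {2: "A_One", 3: "A_Two", 4: "A_Three", 5: "B_One", 6: "B_Two", 7: "B_Three"}


def rename(text):
    """Replace MyLabel_0N with Label_ plus the text tied to the group that matched."""
    def replacement(match):
        # A group that did not take part in the match is None, so it inserts nothing.
        chosen = [word for group, word in INSERTS.items() if match.group(group) is not None]
        return "Label_" + "".join(chosen)
    return PATTERN.sub(replacement, text)


if __name__ == "__main__":
    print(rename("MyLabel_01\nMyLabel_02\nMyLabel_03\nMyLabel_04\nMyLabel_05\nMyLabel_06"))
```

Since the six alternatives are mutually exclusive, exactly one entry of INSERTS contributes text for each match.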
The easiest way to do this that I would recommend is to use AWK. If you're on Windows, look for the mingw32 precompiled binaries out there for free download (it'll be called gawk).
BEGIN {
    FS = "_0";
    a[1]="A_One";
    a[2]="A_Two";
    a[3]="A_Three";
    a[4]="B_One";
    a[5]="B_Two";
    a[6]="B_Three";
}
{
    printf("Label_%s\n", a[$2]);
}
Execute on Windows as follows:
C:\Users\Mydir>gawk -f test.awk awk.in
Label_A_One
Label_A_Two
Label_A_Three
Label_B_One
Label_B_Two
Label_B_Three