Remove the first character of each line and append using Vim - regex

I have a data file as follows.
1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065
1,13.2,1.78,2.14,11.2,100,2.65,2.76,.26,1.28,4.38,1.05,3.4,1050
1,13.16,2.36,2.67,18.6,101,2.8,3.24,.3,2.81,5.68,1.03,3.17,1185
1,14.37,1.95,2.5,16.8,113,3.85,3.49,.24,2.18,7.8,.86,3.45,1480
1,13.24,2.59,2.87,21,118,2.8,2.69,.39,1.82,4.32,1.04,2.93,735
Using vim, I want to reomve the 1's from each of the lines and append them to the end. The resultant file would look like this:
14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065,1
13.2,1.78,2.14,11.2,100,2.65,2.76,.26,1.28,4.38,1.05,3.4,1050,1
13.16,2.36,2.67,18.6,101,2.8,3.24,.3,2.81,5.68,1.03,3.17,1185,1
14.37,1.95,2.5,16.8,113,3.85,3.49,.24,2.18,7.8,.86,3.45,1480,1
13.24,2.59,2.87,21,118,2.8,2.69,.39,1.82,4.32,1.04,2.93,735,1
I was looking for an elegant way to do this.
Actually I tried it like
:%s/$/,/g
And then
:%s/$/^./g
But I could not make it to work.
EDIT : Well, actually I made one mistake in my question. In the data-file, the first character is not always 1, they are mixture of 1, 2 and 3. So, from all the answers from this questions, I came up with the solution --
:%s/^\([1-3]\),\(.*\)/\2,\1/g
and it is working now.

A regular expression that doesn't care which number, its digits, or separator you've used. That is, this would work for lines that have both 1 as their first number, or 114:
:%s/\([0-9]*\)\(.\)\(.*\)/\3\2\1/
Explanation:
:%s// - Substitute every line (%)
\(<something>\) - Extract and store to \n
[0-9]* - A number 0 or more times
. - Every char, in this case,
.* - Every char 0 or more times
\3\2\1 - Replace what is captured with \(\)
So: Cut up 1 , <the rest> to \1, \2 and \3 respectively, and reorder them.

This
:%s/^1,//
:%s/$/,1/
could be somewhat simpler to understand.

:%s/^1,\(.*\)/\1,1/
This will do the replacement on each line in the file. The \1 replaces everything captured by the (.*)

:%s/1,\(.*$\)/\1,1/gc
.........................

You could also solve this one using a macro. First, think about how to delete the 1, from the start of a line and append it to the end:
0 go the the start of the line
df, delete everything to and including the first ,
A,<ESC> append a comma to the end of the line
p paste the thing you deleted with df,
x delete the trailing comma
So, to sum it up, the following will convert a single line:
0df,A,<ESC>px
Now if you'd like to apply this set of modifications to all the lines, you will first need to record them:
qj start recording into the 'j' register
0df,A,<ESC>px convert a single line
j go to the next line
q stop recording
Finally, you can execute the macro anytime you want using #j, or convert your entire file with 99#j (using a higher number than 99 if you have more than 99 lines).
Here's the complete version:
qj0df,A,<ESC>pxjq99#j
This one might be easier to understand than the other solutions if you're not used to regular expressions!

Related

Modify position in a line if Regular Expression found

I need to modify the positions number 10 of every line that finds the word 'Example' (can´t use the actual data here) and add the string '(ID) '. It doesn´t necessarily have to begin with 9 numbers, it just needs to add the string to the position number 10.
For example, this line should be modified like this:
ORIGINAL: 123456789This line is being used as an Example
SOLUTION: 123456789(ID) This line is being used as an Example
So far I have this, to find the Example and copy the rest of the line as to not lose the text:
Find: (.*)Example
Bonus points if it works for two different words 'Example1' and 'Example2' in different sentences, the 'and also' part of this example would change in every line.
ORIGINAL: 123456789This line is being used as an Example1 and also Example2
SOLUTION: 123456789(ID) This line is being used as an Example1 and also Example2
This would have this search:
Find: (.*)Example1(.*)Example2
Thank you
You could try:
Find: (\d{9})(?=.*\bExample1\b.*\bExample2\b)
Replace: $(ID)
^^^ single space after (ID)
Demo
The regex pattern used matches and captures a 9 digit number (you may adjust to any width, or range of widths, which you want). It also uses a positive lookahead to assert that Example1 and Example2 in fact occur later in the same line:
(?=.*\bExample1\b.*\bExample2\b)
This is how you add characters in a certain position, even tho I accepted Tims answer because it´s very similar and made me figure it out:
^(\S{9})(?=.*\bExample1\b.*\bExample2\b)
As you can see, I only added '^' so it´s the position from the start of the line, and 'S' instead of 'd' so it counts characters that are not whitespace, instead of numbers. This should work for any type of line you have.

add character before first word in line

I want to add a minus sign "-" infront of the first word in a line on the editor VIM. The lines contains spaces for indentation. The indentation shall not be touched. E.g
As Is
list point 1
sub list point 2
and so on...
I want
- list point 1
- sub list point 2
- and so on...
I can find the first word, but i struggle with replacing it in the correct way.
^\s*\w
in Vim
/^\s*\w
But in the replacement I always remove the complete found part....
:s/^\s*\w/- \w/
Which leads to
- ist point 1
- ub list point 2
- nd so on...
Use & which is replaced with the matched string:
:%s/\w/- &
I'm late to the party but:
:%norm! I- <CR>
And another one with :s:
:%s/^\s*/&- /
An alternative to falsetrue's answer: You can capture the first word character and print it out along with the leading -:
%s/\(\w\)/- \1/
:normal cmd may help too:
:%norm! wi-
note that after - there is a space.

Vim: regular expression to delete all lines except those starting with a given list of numbers

I have a csv file where every line but the first starts with a number and looks like this:
subject,parameter1,parameter2,parameter3
1,blah,blah,blah
3,blah,blah,blah
2,blah,blah,blah
44,blah,blah,blah
12,blah,blah,blah
14,blah,blah,blah
11,blah,blah,blah
10,blah,blah,blah
11,blah,blah,blah
13,blah,blah,blah
3,blah,blah,blah
...
I would like to delete all lines except the first that start, say, with the numbers 1,6,12.
I was trying something like this:
:g!/^[1 6 12]\|^subject/d
But the 12 is interpreted as "1 or 2" so this also erases the lines that start with 2..
What am I missing, and what should be the most efficient way to do this?
Btw instead of 1, 6, 12, my list contains many multiple single and 2-digit numbers.
The character class [1 6 12] means "any single character that is in this class,
i.e. any one of ' ', 1, 2, 6 (the repeated 1 is ignored).
You could use
:g!/^1,\|^6,\|^12,\|^subject/d
which is close to your original syntax - but it works (tested with vim on Mac OS X).
Note - it is important to include the comma, so that the line starting with 1 doesn't "protect" 11, 12345, etc.
You might want to do this differently though - using grep.
Put all the "white listed" numbers in a file, one per line, like so:
^subject
^1,
^2,
^6,
^12,
then do
grep -f whitelist csvFile
and the output will be your "edited" file (which you can pipe to a new file).
If you are even more interested in "efficiency", you could make your text file (let's continue to call it whitelist) just
subject
1
2
6
12
and use the following command:
cat whitelist | xargs -I {} grep "^"{}"," cvsFile
This needs a bit of explaining.
xargs - take the input one line at a time
-I {} - and insert that line in the command that follows, at the {}
This means that the grep command will be run n times (once per line in the whitelist file), and each time the regular expression that is fed into grep will be the concatenation of
"^" - start of line
{} - contents of one line of the input file (whitelist)
"," - comma that follows the number
So this is a compact way of writing
grep "^subject," csvFile; grep "^1," csvFile; grep "^2," csvFile;
etc.
It has the advantage that you can now generate your whitelist any way you want - as long as it ends up in a file, one line at a time, you can use it; the disadvantage is that you are essentially running grep n times. If your files get very large, and you have a large number of items in your white list, that may start to be a problem; but since your OS is likely to put the file into cache after the first read-through, it is really quite fast. The use of the ^ anchor makes the regular expression very efficient - as soon as it doesn't find a match it goes on to the next line.
Use a global match:
:v/^\(subject\|1\|6\|12\),/ delete
For every line that does not match that regular expression, delete it.
It yields:
subject,parameter1,parameter2,parameter3
1,blah,blah,blah
12,blah,blah,blah
EDIT: Just now I realised that you were already using the global match. You error was in the character class. It matches any character inside it regardless of repeated ones, in your case numbers one, two, six and a space. You must separate them in different branches, like I did before.
a "functional" alternative:
:g/./if index([1,12,6],str2nr(split(getline("."),",")[0]))<0|exec 'normal! dd'|endif

VB.Net Beginner: Replace with Wildcards, Possibly RegEx?

I'm converting a text file to a Tab-Delimited text file, and ran into a bit of a snag. I can get everything I need to work the way I want except for one small part.
One field I'm working with has the home addresses of the subjects as a single entry ("1234 Happy Lane Somewhere, St 12345") and I need each broken down by Street(Tab)City(Tab)State(Tab)Zip. The one part I'm hung up on is the Tab between the State and the Zip.
I've been using input=input.Replace throughout, and it's worked well so far, but I can't think of how to untangle this one. The wildcards I'm used to don't seem to be working, I can't replace ("?? #####") with ("??" + ControlChars.Tab + "#####")...which I honestly didn't expect to work, but it's the only idea on the matter I had.
I've read a bit about using Regex, but have no experience with it, and it seems a bit...overwhelming.
Is Regex my best option for this? If not, are there any other suggestions on solutions I may have missed?
Thanks for your time. :)
EDIT: Here's what I'm using so far. It makes some edits to the line in question, taking care of spaces, commas, and other text I don't need, but I've got nothing for the State/Zip situation; I've a bad habit of wiping something if it doesn't work, but I'll append the last thing I used to the very end, if that'll help.
If input Like "Guar*###/###-####" Then
input = input.Replace("Guar:", "")
input = input.Replace(" ", ControlChars.Tab)
input = input.Replace(",", ControlChars.Tab)
input = "C" + ControlChars.Tab + strAccount + ControlChars.Tab + input
End If
input = System.Text.RegularExpressions.Regex.Replace(" #####", ControlChars.Tab + "#####") <-- Just one example of something that doesn't work.
This is what's written to input in this example
" Guar: LASTNAME,FIRSTNAME 999 E 99TH ST CITY,ST 99999 Tel: 999/999-9999"
And this is what I can get as a result so far
C 99999/9 LASTNAME FIRSTNAME 999 E 99TH ST CITY ST 99999 999/999-9999
With everything being exactly what I need besides the "ST 99999" bit (with actual data obviously omitted for privacy and professional whatnots).
UPDATE: Just when I thought it was all squared away, I've got another snag. The raw data gives me this.
# TERMINOLOGY ######### ##/##/#### # ###.##
And the end result is giving me this, because this is a chunk of data that was just fine as-is...before I removed the Tabs. Now I need a way to replace them after they've been removed, or to omit this small group of code from a document-wide Tab genocide I initiate the code with.
#TERMINOLOGY###########/##/########.##
Would a variant on rgx.Replace work best here? Or can I copy the code to a variable, remove Tabs from the document, then insert the variable without losing the tabs?
I think what you're looking for is
Dim r As New System.Text.RegularExpressions.Regex(" (\d{5})(?!\d)")
Dim input As String = rgx.Replace(input, ControlChars.Tab + "$1")
The first line compiles the regular expression. The \d matches a digit, and the {5}, as you can guess, matches 5 repetitions of the previous atom. The parentheses surrounding the \d{5} is known as a capture group, and is responsible for putting what's captured in a pseudovariable named $1. The (?!\d) is a more advanced concept known as a negative lookahead assertion, and it basically peeks at the next character to check that it's not a digit (because then it could be a 6-or-more digit number, where the first 5 happened to get matched). Another version is
" (\d{5})\b"
where the \b is a word boundary, disallowing alphanumeric characters following the digits.

Math in Vim search-and-replace

I have a file with times (minutes and seconds), which looks approximately as follows:
02:53 rest of line 1...
03:10 rest of line 2...
05:34 rest of line 3...
05:35 rest of line 4...
10:02 rest of line 5...
...
I would like to replace the time by its equivalent in seconds. Ideally, I would like to run some magical command like this:
:%s/^\(\d\d\):\(\d\d\) \(.*\)/(=\1*60 + \2) \3/g
...where the (=\1*60 + \2) is the magical part. I know I can insert results of evaluation with the special register =, but is there a way to do this in the subst part of a regex?
Something like this?
:%s/^\(\d\d\):\(\d\d\)/\=submatch(1)*60+submatch(2)/
When the replacement starts with a \= the replacment is interpreted as an expression.
:h sub-replace-expression is copied below
Substitute with an expression *sub-replace-expression*
*sub-replace-\=*
When the substitute string starts with "\=" the remainder is interpreted as an
expression. This does not work recursively: a substitute() function inside
the expression cannot use "\=" for the substitute string.
The special meaning for characters as mentioned at |sub-replace-special| does
not apply except for "<CR>", "\<CR>" and "\\". Thus in the result of the
expression you need to use two backslashes to get one, put a backslash before a
<CR> you want to insert, and use a <CR> without a backslash where you want to
break the line.
For convenience a <NL> character is also used as a line break. Prepend a
backslash to get a real <NL> character (which will be a NUL in the file).
When the result is a |List| then the items are joined with separating line
breaks. Thus each item becomes a line, except that they can contain line
breaks themselves.
The whole matched text can be accessed with "submatch(0)". The text matched
with the first pair of () with "submatch(1)". Likewise for further
sub-matches in ().
Use submatch() to refer to a grouped part in the substitution place:
:%s/\v^(\d{2}):(\d{2})>/\=submatch(1) * 60 + submatch(2)/
With your example yields:
173 rest of line 1...
190 rest of line 2...
334 rest of line 3...
335 rest of line 4...
602 rest of line 5...
Hopefully it would be helpful to someone else, but i had similar problem where i wanted to replace "id" with a different number, in-fact any other number
{"id":1,"first_name":"Ruperto","last_name":"Bonifayipio","gender":"Male","ssn":"318-69-4987"},
used expression
%s/\v(\d+),/\=submatch(1)*1111/g
which results into following new value
{"id":1111,"first_name":"Ruperto","last_name":"Bonifayipio","gender":"Male","ssn":"318-69-4987"},