Bring all rows to one row - regex

In Excel, I have rows like below:
1 2 3 4 5
6 7 8 9 0
9 8 7 6 5
...
I need to bring all of them to the first row:
1 2 3 4 5 6 7 8 9 0 9 8 7 6 5 ...
The numbers of rows and columns are fixed.
What is the fastest way I can achieve this?
Alternatively, can I solve this on a Textpad or Notepad++ using some REGEX grouping?

If you wanted to do it with an Excel formula, pasting the following, starting in column F, would produce it across the top row:
=INDIRECT("r"&CEILING(COLUMN()/5,1)&"c"&IF(MOD(COLUMN(),5)=0,5,MOD(COLUMN(),5)),FALSE)

If your table starts at A2 and the row values are to be copied from A1 onwards, the following should work:
=OFFSET($A$2, (COLUMN()-COLUMN($A$1))/5, MOD(COLUMN()-COLUMN($A$1), 5))
However, I think that for a small table of size 5x10, typing the '=' references manually would be the fastest.
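The index arithmetic behind both formulas can be checked outside Excel. Here is a minimal Python sketch, assuming a fixed width of 5 columns as in the question (the name `flatten_rows` is hypothetical, not part of any answer): output position k, counted from 0, reads source row k // 5 and source column k % 5.

```python
def flatten_rows(rows, width=5):
    """Lay fixed-width rows end to end, mirroring the INDIRECT/OFFSET arithmetic."""
    total = len(rows) * width
    # Output position k (0-based) comes from row k // width, column k % width.
    return [rows[k // width][k % width] for k in range(total)]


table = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 0],
    [9, 8, 7, 6, 5],
]
print(flatten_rows(table))
```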

I just select and drag them up there. When there was a pattern of projects I wrote a VBA function to do the job, but for most small unique projects select and drag worked for me.

In Notepad++, set Find what to \r\n and Replace with to a single space, choose Search Mode Extended, click Replace All, then copy the result into Excel.
With images :) here:
Replace Carriage Return and Line Feed in Notepad++.

Find and KEEP all DUPLICATE lines (instead of unique lines) in a text file

I want to identify and keep DUPLICATE, TRIPLICATE, etc. lines in Notepad++, i.e., all lines that occur more than once. In other words, how can I delete only the unique lines?
For example, here are seven (7) separate lists and the desired true duplicate lines of each list (shown as 7 columns; regard each column as an individual list or file!). (The lists are shown side by side only to save space; in real life, each of the 7 lists occurs alone and independently of the others, in separate files!)
list1 list2 list3 list4 list5 list6 list7
1 0 0 0 0 0 0
2 1 1 1 1 1 1
3 2 2 2 2 2 2
4 3 3 3 3 3 3
4 4 4 4 4 4 4
4 4 4 4 4 4 4
5 4 4 4 4 4 4
6 5 5 5 5 5 5
7 5 5 5 5 5 5
8 6 6 6 6 6 6
9 6 6 6 6 6 6
abc 7 7 7 7 7 7
abd 8 8 8 8 8 8
abd 9 9 9 9 9 9
abe <CR> 9 9 9 9
<CR> 99 99
<CR>
[Lines of multiple occurrence in the above lists:]
4 4 4 4 4 4 4
4 4 4 4 4 4 4
4 4 4 4 4 4 4
abd 5 5 5 5 5 5
abd 5 5 5 5 5 5
6 6 6 6 6 6
6 6 6 6 6 6
9 9 9 9
9 9 9 9
There are many solutions to eliminate duplicates (e.g., TextFX; notepad++ delete duplicate and original lines to keep unique lines), but I cannot find solutions that keep duplicates only.
((.*)\R(\2\R)+)*\K.+\R
@Lars Fischer: This script works nearly OK, except that the last entry of the (presorted) list needs to be a unique line followed by an empty line <CR>. One (suboptimal) workaround is to insert an artificial (helper) unique line (e.g., zzz) followed by an empty line <CR> as the last two lines.
(END OF QUESTION)
UPDATE 3: This question is reposted per the Stack Overflow "ask a new question" instruction. (@AdrianHHH, @B. Desai, @Paolo Forgia, @greg-449 and @Erik von Asmuth drew the incorrect conclusion that this question is a duplicate of "notepad++ delete duplicate and original lines to keep unique lines". This question is definitely not a duplicate of the one @AdrianHHH et al. quote.)
UPDATE 2: @AdrianHHH This question is not more "broad" (in fact, one can hardly be more specific) or less researched than other Notepad++ questions, including the one https://stackoverflow.com/questions/29303148 cited (wrongly) by @AdrianHHH et al. as the same question.
UPDATE:
@AdrianHHH, @B. Desai, @Paolo Forgia, @greg-449, @Erik von Asmuth
This question is different from:
https://stackoverflow.com/questions/29303148
because Q 29303148 (i) does not ask how to identify and keep only the lines of multiple occurrence, and (ii) has no solution for that in its answers. Q 29303148 asks "...I just need the unique lines."
Here is a solution based on regular expressions and bookmarks. It works for a sorted file (i.e. each duplicated line is immediately followed by its duplicates):
Open the Mark dialog (Search -> Mark...)
Click Clear all Marks on the right
Check Bookmark line
Check Wrap around
Find what: ((.*)\R(\2\R?)+)*\K.*
Check Regular expression and uncheck ". matches newline"
Click Mark All
Click Close
Search -> Bookmark -> Remove Bookmarked Lines
Explanation
The regular expression is made up of three parts:
((.*)\R(\2\R?)+)* : this is an optional block of duplicates consisting of one or more line blocks
The outer ( ... )* matches zero or more such blocks of duplicated lines (if, in your example, the three 4s were followed by two 5s, we would need this notion of a sequence of duplicate blocks)
(.*)\R(\2\R?)+ : \2 references the content of (.*), so this matches all duplicates of one line
The second \R is an optional (due to the ?) line break. Thus it is possible to match a duplicate in the last line of the file even if that line does not end with a line break
If there is a block of duplicated lines after the cursor position from which you start, this will match it.
\K : discards what has been matched so far (the duplicates) and "puts the cursor" before the first unique line
.* : matches the next (unique) line, which gets bookmarked
Using Mark All, we bookmark all such unique lines so that we can remove them using the entry in the Search -> Bookmark menu.
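For comparison outside Notepad++, the same result can be sketched in Python. This is a different technique from the regex above (Python's `re` module supports neither `\K` nor `\R`): it counts occurrences and keeps every line seen more than once, so the input does not need to be sorted. The function name `keep_repeated_lines` is hypothetical, and the sample is list1 from the question.

```python
from collections import Counter


def keep_repeated_lines(lines):
    """Return only the lines that occur more than once, in original order."""
    counts = Counter(lines)
    return [line for line in lines if counts[line] > 1]


# list1 from the question
sample = ["1", "2", "3", "4", "4", "4", "5", "6", "7", "8", "9",
          "abc", "abd", "abd", "abe"]
print(keep_repeated_lines(sample))
```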

How can I use regex to capture this specific set of ages?

I have a set of age data, like below:
1
2
3
4
5
6
7
8
9
10
1,1
1,2
1,3
2,12
11,13,15
7,8,12
12,15
14,16,17
15,6
13,11,10,2
And so on... I am trying to use regex to target a 'mixed' range of children's ages. The logic requires a combination of at least 2 children (so it needs one of the lines with a comma), with at least one aged under 10 (minimum 1) and at least one aged 10 or over (maximum 17).
My expected result from the above would be to return the lines below, and nothing else:
2,12
7,8,12
15,6
13,11,10,2
Any advice on how to resolve this would be appreciated. Thanks in advance; I am still trying to get it right.
You can use this regex to meet your requirements:
^(?=.*\b[1-9]\b)(?=.*\b1[0-7]\b)[0-9]+(?:,[0-9]+)+$
RegEx Demo
There are 2 lookaheads to assert that 2 numbers are present, one between 1 and 9 and another between 10 and 17:
\b[1-9]\b matches a number between 1 and 9
\b1[0-7]\b matches a number between 10 and 17
[0-9]+(?:,[0-9]+)+ in the regex matches the line itself: 2 or more comma-separated numbers.
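The pattern can be tried against the question's sample data. Here is a minimal sketch using Python's `re` module, which supports the lookaheads and `\b` used here; the variable names are illustrative only.

```python
import re

# The lookahead-based pattern from the answer, unchanged.
MIXED_AGES = re.compile(r"^(?=.*\b[1-9]\b)(?=.*\b1[0-7]\b)[0-9]+(?:,[0-9]+)+$")

lines = [
    "1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
    "1,1", "1,2", "1,3", "2,12", "11,13,15", "7,8,12",
    "12,15", "14,16,17", "15,6", "13,11,10,2",
]

# Keep only lines with one age in 1-9 and another in 10-17.
matches = [line for line in lines if MIXED_AGES.match(line)]
print(matches)
```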
You can do it with
\b\d,1[0-7]\b
provided the ages always are sorted (youngest to oldest).
If the age of 0 isn't allowed, change to
\b[1-9],1[0-7]\b
It checks for a single digit followed by a comma, then a 1 followed by a single digit in the range 0-7.
See it here at regex101.
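The sorting requirement matters: on the question's unsorted line 15,6 this shorter pattern finds nothing. Here is a minimal Python sketch, assuming the ages are sorted numerically before matching (the helper `has_mixed_ages` is hypothetical). Once sorted, the largest age under 10 sits directly before the smallest age of 10 or more, which is the pair the pattern looks for.

```python
import re

# The shorter pattern from the answer (the variant that excludes age 0).
SORTED_PAIR = re.compile(r"\b[1-9],1[0-7]\b")


def has_mixed_ages(line):
    """Sort the ages numerically, then search for an 'under 10, 10-17' pair."""
    ages = sorted(int(age) for age in line.split(","))
    return SORTED_PAIR.search(",".join(map(str, ages))) is not None


print(SORTED_PAIR.search("15,6"))  # the raw, unsorted line
print(has_mixed_ages("15,6"))      # the same line after sorting
```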

How do I remove first 5 characters in each line in a text file using vi?

How do I remove the first 5 characters in each line in a text file?
I have a file like this:
4 Alabama
4 Alaska
4 Arizona
4 Arkansas
4 California
54 Can
8 Carolina
4 Colorado
4 Connecticut
8 Dakota
4 Delaware
97 Do
4 Florida
4 Hampshire
47 Have
4 Hawaii
I'd like to remove the number and the space at the beginning of each line in my txt file.
:%s/^.\{0,5\}// should do the trick. It also handles cases where there are fewer than 5 characters.
Use the regular expression ^..... to match the first 5 characters of each line. Use it in a global substitution:
:%s/^.....//
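The difference between the two patterns shows up on short lines. Here is a minimal Python sketch of the `^.\{0,5\}` variant, assuming the numbers are padded to a fixed width (the flattened sample above no longer shows the padding, so the test data here is made up, and `strip_prefix` is a hypothetical name).

```python
import re


def strip_prefix(text, width=5):
    """Remove up to `width` leading characters from every line of `text`."""
    # (?m) lets ^ match at the start of each line. The dot never matches a
    # newline, so a line shorter than `width` is emptied, not merged.
    return re.sub(r"(?m)^.{0,%d}" % width, "", text)


sample = "   4 Alabama\n  54 Can\nab"
print(strip_prefix(sample))
```

With the fixed-width pattern `^.....`, the two-character last line would be left untouched because it cannot supply five characters.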
As all lines are lined up, you don't need a substitution to solve this problem.
Just bring the cursor to the top left position (gg), then:
CTRL+vGwlx
I think the easiest way is to use cut.
Just type cut -c N- <filename>, where N is the first character position to keep (6 to drop the first five characters).
Try this (it acts on the current line only; use :%s to cover every line):
:s/^.....//
You probably don't need the "^" (start of line), and there'd be shortcuts for the 5 characters - but simple is good :)
Since the text looks like it's columnar data, awk would usually be helpful. I'd use V to select the lines, then hit :! and use awk:
:'<,'>! awk '{ print $2 }'
to print out the second column of the data. Saves you from counting spaces altogether.
:%s/^.\{0,5\}//g with g for global, since we want to remove the first 5 columns of every line.
In my case, to delete the first 2 characters of each line I used :%s/^.\{0,2\}// and it works the same with or without g (the pattern is anchored with ^, so it can match only once per line).
I am on VIM - Vi IMproved 8.2, macOS version, normal version without GUI.

List a number's digits in J

I use the programming language: J.
I want to put all of the digits of a number in a list.
From:
12345
to:
1 2 3 4 5
What can I do?
The way I'd write this is
10&#.^:_1
which we can see in use with this sentence:
(10&#.^:_1) 123456789
1 2 3 4 5 6 7 8 9
That program relies on the reshaping built in to Base. It uses the (built-in) obverse of Base as a synonym for Antibase.
I found the answer:
intToList =: (".@;"0@":)
Another approach:
intToList =: 3 : '((>. 10 ^. y)#10) #: y'
This doesn't convert to string and back, which can be potentially costly, but counts the digits with a base-10 log, then uses anti-base (#:) to get each digit.
EDIT:
Better, safer version based on Dan Bron's comment:
intToList =: 3 : '10 #.^:_1 y'
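For readers who do not know J, the anti-base idea can be sketched in Python as repeated division by 10. This is only an illustration of the arithmetic, not a translation of the J internals, and `digits` is a hypothetical name. Unlike the log-based version, it never has to guess the digit count in advance: a count of ceil(log10 n) comes out one short for exact powers of ten such as 1000, which may be what the "safer" remark refers to.

```python
def digits(n, base=10):
    """Return the digits of a non-negative integer, most significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    out = []
    while True:
        n, d = divmod(n, base)  # peel off the least significant digit
        out.append(d)
        if n == 0:
            break
    return out[::-1]


print(digits(12345))
```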

conditionally remove portion of a line in delimited file

I have a ~ delimited text file with about 20 nullable columns.
I am trying to use SED (from cygwin) to "blank out" the value in column 11 if the following conditions are met...
Column 3 is a zero (0)
Column 11 is in date format mm/dd/yy (I'm not really concerned if it's a valid date)
Here's what I'm trying...
s/\([^~]*~[^~]*~0~[^~]*~[^~]*~[^~]*~[^~]*~[^~]*~[^~]*~[^~]*~[^~]*~\)\(\d{2}\/\d{2}\/\d{2}~\)\(.*$\)/\1~\3/
Here's a sample from the file:
Test A~7~1~~~~72742050~~~Z370~10/25/11~~~0~8.58563698~6.40910452~4.59198764~3.18239469~1.72955975~.23345372~-1.30891113~-2.89971394~1~0
Test B~7~0~~~~72742060~~~Z351~05/15/12~05/14/12~~0~18.88910518~12.69425528~9.96182381~6.76077612~6.76077612~3.86279298~.22449489~-.91021010~0~0
Test C~7~0~~~~72742060~~~Z352~06/12/12~ABC~~0~20.60845679~17.54889351~15.52912556~12.43279217~12.43279217~10.32033576~9.35296144~8.09245899~0~0
...and here's what I expect to get back
Test A~7~1~~~~72742050~~~Z370~10/25/11~~~0~8.58563698~6.40910452~4.59198764~3.18239469~1.72955975~.23345372~-1.30891113~-2.89971394~1~0
Test B~7~0~~~~72742060~~~Z351~05/15/12~~~0~18.88910518~12.69425528~9.96182381~6.76077612~6.76077612~3.86279298~.22449489~-.91021010~0~0
Test C~7~0~~~~72742060~~~Z352~06/12/12~ABC~~0~20.60845679~17.54889351~15.52912556~12.43279217~12.43279217~10.32033576~9.35296144~8.09245899~0~0
but the file comes through with line 2 completely unchanged.
You are trying to replace column 12 instead of 11:
\([^~]*~[^~]*~0~[^~]*~[^~]*~[^~]*~[^~]*~[^~]*~[^~]*~[^~]*~[^~]*~\)\(\d{2}\/\d{2}\/\d{2}~\)\(.*$\)
  1     2     3 4     5     6     7     8     9     10    11        12
If just removing one of the [^~]*~ from the end of the first group doesn't fix it, it could be because your version of sed doesn't support either \d or repetition with {2} (although escaping the curly brackets would probably fix that).
Here is a version that should work everywhere which replaces each \d{2} with [0-9][0-9] (and fixes the incorrect column issue mentioned above):
s/\([^~]*~[^~]*~0~[^~]*~[^~]*~[^~]*~[^~]*~[^~]*~[^~]*~[^~]*~\)\([0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]~\)\(.*$\)/\1~\3/
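The same rule can be expressed without counting `[^~]*~` groups by splitting each line on the delimiter. Here is a minimal Python sketch (the function name and parameters are hypothetical, and the test lines are shortened versions of the sample). One caveat: the question's text says column 11, but in the sample's expected output the blanked value on the Test B line (05/14/12) is the 12th `~`-separated field. The column is therefore left as a parameter here, with no claim about which one was intended.

```python
import re

DATE_RE = re.compile(r"\d{2}/\d{2}/\d{2}")  # mm/dd/yy shape only, not validated


def blank_date_column(line, flag_col=3, date_col=11, sep="~"):
    """Blank field `date_col` (1-based) if field `flag_col` is "0" and
    the target field looks like mm/dd/yy; otherwise return the line as-is."""
    fields = line.split(sep)
    if (len(fields) >= max(flag_col, date_col)
            and fields[flag_col - 1] == "0"
            and DATE_RE.fullmatch(fields[date_col - 1])):
        fields[date_col - 1] = ""
    return sep.join(fields)


# Shortened versions of the sample lines (trailing numeric fields dropped).
test_a = "Test A~7~1~~~~72742050~~~Z370~10/25/11~~~0"
test_b = "Test B~7~0~~~~72742060~~~Z351~05/15/12~05/14/12~~0"
test_c = "Test C~7~0~~~~72742060~~~Z352~06/12/12~ABC~~0"

print(blank_date_column(test_b, date_col=12))
```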