Remove line entirely (not to leave it empty) - regex

This is what I have in doc:
1;01878916;BC101;FALSE
16;01978916;BC101;FALSE
17;0195B4E5;BC101;FALSE
19;0197D016;BC101;FALSE
After I run Find & Replace with ^((1|17);.+?)$ and an empty replacement, it leaves
blankrow
16;01978916;BC101;FALSE
blankrow
19;0197D016;BC101;FALSE
and then I have to run find and replace with \s+$ to remove the empty line(s), and manually remove the first empty line.
I'm weak with regex; I tried to combine those two commands into one.
What should the regex be so that the matching rows are removed entirely, without leaving empty rows behind?
To get
16;01978916;BC101;FALSE
19;0197D016;BC101;FALSE
Thanks in advance. I need to have regex commands in order to run FIND and Replace in all open files, because I'm doing this in 10 files at once. Line operations > Remove blank lines is not an option.

The regex:
^(1|17);.+?\s+
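works well here because there is no whitespace at the beginning of the lines you want to keep: the trailing \s+ consumes the line break together with the matched line, so no empty row is left behind. If that's ever not the case, you can also do:
\s+^(1|17);.+?$
This variant consumes the line break before the match instead, so it cannot remove a matching line that is the very first line of the file.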
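For comparison outside Notepad++, here is a minimal Python sketch of the same idea: match the whole line together with its line terminator, so that deleting the match leaves no empty row. The function name remove_lines and the sample string are illustrative only, and Python's re module is not the regex engine Notepad++ uses, so treat this as a sketch rather than a drop-in replacement.

```python
import re

# Sample data copied from the question (joined with "\n" for illustration).
text = (
    "1;01878916;BC101;FALSE\n"
    "16;01978916;BC101;FALSE\n"
    "17;0195B4E5;BC101;FALSE\n"
    "19;0197D016;BC101;FALSE\n"
)


def remove_lines(text: str) -> str:
    """Delete every line starting with '1;' or '17;', including its line break."""
    # ^ anchors at each line start (MULTILINE); the final group consumes the
    # line terminator (or the end of the text), so no blank row remains.
    return re.sub(r"^(?:1|17);.*(?:\r?\n|\Z)", "", text, flags=re.MULTILINE)


result = remove_lines(text)
print(result, end="")
```

Consuming the terminator as part of the match is what avoids the second clean-up pass described in the question.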


Set Difference in Notepad++ with Regexes

Suppose I have two files main.txt and sub.txt. Suppose both files have unique lines, i.e. the same line of text does not occur twice in either file. Also suppose there are no empty lines in either file. Now, consider the files as sets of strings, with each member of the set occurring on a line. This is possible because of our uniqueness condition. Now suppose sub.txt is a subset of main.txt in this way. How do we compute the set difference of main.txt and sub.txt to produce a new file diff.txt? To be clear, the lines of diff.txt should be those that occur in main.txt but not sub.txt. There should be no empty lines in diff.txt. Order in diff.txt is irrelevant.
Example
main.txt:
Hello
World
How
You
Are
sub.txt:
World
Hello
diff.txt:
How
Are
You
Bonus Questions
How can I tell that one set is actually a subset of the other? This is an assumption in the question, but in practice we mightn't know this for sure and would want a way to check it automatically.
How can I tell if the lines in each file are truly unique?
How can I tell if there are no blank lines?
Bonus Answer
I'll answer the bonus questions first. Follow these steps in order to ensure the right conditions hold as stated in the question:
Open both files in Notepad++ and close any other files
Lexicographically sort each file: https://superuser.com/questions/762279/sorting-lines-in-notepad-without-the-textfx-plugin
Ensure that the following regex has no matches in either file, which will guarantee they're duplicate-free: ^(.+$\r\n)\1. If you want to remove duplicates, replace all occurrences of that regex with \1.
Ensure there are no blank lines in either file by searching for ^$. If any are found you can delete them manually.
Create a third file and paste the contents of both sub.txt and main.txt into this file. Then lexicographically sort it. Count the number of occurrences of the regex ^(.+$)\r\n\1$ to detect duplicate lines (the trailing $ keeps a line such as How from being counted as a duplicate of However). If the count matches the number of lines in sub.txt, then it's a subset of main.txt. Keep this file for later.
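The three checks above can also be sketched in a few lines of Python, which may be handy if a scripted check is acceptable. The helper names (has_duplicates, has_blank_lines, is_subset) are made up for this illustration, and the lists stand in for the contents of main.txt and sub.txt from the example.

```python
def has_duplicates(lines):
    """True if any line occurs more than once."""
    return len(set(lines)) != len(lines)


def has_blank_lines(lines):
    """True if any line is empty (the equivalent of a match for ^$)."""
    return any(line == "" for line in lines)


def is_subset(sub_lines, main_lines):
    """True if every line of sub also occurs in main."""
    return set(sub_lines) <= set(main_lines)


# Stand-ins for the lines of main.txt and sub.txt from the example.
main_lines = ["Hello", "World", "How", "You", "Are"]
sub_lines = ["World", "Hello"]

checks = {
    "main is duplicate-free": not has_duplicates(main_lines),
    "sub is duplicate-free": not has_duplicates(sub_lines),
    "no blank lines": not (has_blank_lines(main_lines) or has_blank_lines(sub_lines)),
    "sub is a subset of main": is_subset(sub_lines, main_lines),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```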
Main Answer
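In the third file you created in the last part, search for ^(.+$)\r\n\1$\r?\n? and replace with the empty string. The $ after \1 ensures the second line equals the first in full rather than merely starting with it. This will remove all elements of sub.txt from main.txt, leaving you with diff.txt.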
Note: This approach may leave you with a single blank line at the end of diff.txt, in the case where there was a duplicate found there. In that case, just delete it manually.
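If leaving Notepad++ is an option, the set difference itself is a short Python sketch. The function name set_difference is hypothetical, the lists stand in for the two files from the example, and the result is sorted only to make the output deterministic, since the question says order is irrelevant.

```python
def set_difference(main_lines, sub_lines):
    """Return the lines of main that do not occur in sub, sorted."""
    sub = set(sub_lines)
    # Assumption from the question: sub must be a subset of main.
    if not sub <= set(main_lines):
        raise ValueError("sub is not a subset of main")
    return sorted(line for line in main_lines if line not in sub)


# Stand-ins for the lines of main.txt and sub.txt from the example.
main_lines = ["Hello", "World", "How", "You", "Are"]
sub_lines = ["World", "Hello"]

diff_lines = set_difference(main_lines, sub_lines)
print("\n".join(diff_lines))
```

Joining with "\n" and no trailing terminator avoids the stray blank line mentioned in the note above.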

Notepad++ - Selecting or Highlighting multiple sections of repeated text IN 1 LINE

I have a text file in Notepad++ that contains about 66,000 words all in 1 line, and it is a set of 200 "lines" of output that are all unique and placed in 1 line in the basic JSON form {output:[{output1},{output2},...}]}.
There is a set of characters matching the RegEx expression "id":.........,"kind":"track" that occurs about 285 times in total, and I am trying to either single them out, or copy all of them at once.
Basically, without some super complicated RegEx terms, I am stuck because I can't figure out how to highlight all of them at once, and also the Remove Unbookmarked Lines feature does not apply because this is all in one line. I have only managed to be able to Mark every single occurrence.
So does this require a large number of steps to get the file into multiple lines and work from there, or is there something else I am missing?
Edit: I have come up with a set of Macro schemes that make the process of doing this manually work much faster. It's another alternative but still takes a few steps and quite some time.
Edit 2: I intended there to be an answer for actually just highlighting the different sections all at once, but I guess that is not possible. The answer here turns out to be more useful in my case, allowing me to have a list of IDs without everything else.
You seem to already have a regex which matches single instances of your pattern, so assuming it works and that we must use Notepad++ for this:
Replace .*?("id":.........,"kind":"track").*?(?="id":.........,"kind":"track"|$) with \1.
If this text file is valid JSON, this opens you up to other, non-Notepad++ options, like using Python with the json module.
Edited to remove unnecessary steps
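A minimal sketch of the Python route mentioned above, shown two ways: pulling out every regex match with re.findall, and parsing the text with the json module. The sample string, the "output" key and the ID format are assumptions made for illustration; the real file's structure is not shown in the question.

```python
import json
import re

# Hypothetical sample; the real file's keys and ID format are not known.
raw = (
    '{"output":[{"id":"abc1234","kind":"track"},'
    '{"id":"zzz9999","kind":"album"},'
    '{"id":"def5678","kind":"track"}]}'
)

# Regex route: the pattern from the question, with the nine dots written as .{9}.
matches = re.findall(r'"id":.{9},"kind":"track"', raw)

# JSON route: parse the document and keep the IDs of entries whose kind is "track".
data = json.loads(raw)
track_ids = [item["id"] for item in data["output"] if item.get("kind") == "track"]

print(matches)
print(track_ids)
```

The JSON route does not depend on the ID length or on the order of the keys, which the regex silently assumes.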

How to get string between two commands in vim

I'm trying to write a Vim script to get the string between two commands, e.g.:
\string{new strings}. I want to get the new string even if it contains empty lines or spaces.
:%s/\\string{[^}]*\n*[^}]*}/new/gec
Your requirement is not that clear, but if you want to
get the "new string" if it contain empty lines or space
Also you commented:
select the contents inside the curly brace of \string{} even if there is any space or line
This line does it. Note that it is a search command, not :s:
/\\string{\zs\_[^}]*
If you want to do some substitution on the content between \string{ and }, you can use the pattern:
%s/\\string{\zs\_[^}]*\ze}/whatever/g
Note that you can also write s/\\string{\zs\_[^}]*/whatever/g; adding \ze} only ensures that the closing brace is actually present, which may not be needed.
For details on \_[], see :h \_[
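For readers who want to try the idea outside Vim, here is a rough Python analogue. In Python's re module a negated class such as [^}] already matches newlines, so it plays the role of Vim's \_[^}]. The sample text and variable names are invented for this sketch, and the capture group stands in for \zs and \ze.

```python
import re

# Invented sample: the first \string{...} spans an empty line.
text = "before \\string{new\n\nstrings here} after \\string{x}"

# [^}]* matches any run of characters except '}', newlines included.
pattern = re.compile(r"\\string\{([^}]*)\}")

# Equivalent of the search: collect what sits between the braces.
contents = pattern.findall(text)

# Equivalent of the substitution: replace only the brace contents.
replaced = pattern.sub(lambda m: "\\string{whatever}", text)

print(contents)
print(replaced)
```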
use this code
%s/\\string{[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]*\n*[^}]
*}/new/gec

How to add matches from a search to a list without doing a substitute?

Example:
let hits = []
:5s/regex-search/\=join(add(hits, submatch(0)))/g
This adds all the matches in line 5 to a list.
However, it also performs the substitution in the text.
I tried adding the 'n' flag after the 'g',
but then the matches are not added to the list.
Is there any way to solve this?
Almost there. First I don't think you need the join. Second, add returns the list with the match added. So you can just select the last element of the list to be the replaced element. (This makes it seem like nothing got replaced)
s/regex-search/\=add(hits,submatch(0))[-1]/g
With a recent enough Vim version, you can prevent the actual substitution from taking place (and messing up your undo branches), while the expression on the right-hand side of the :s command is still evaluated.
You need at least Vim patch 7.3.627; then you can simply use
:s/foobar/\=add(hits, submatch(0))/gn
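The same pattern, a substitution whose replacement expression has a side effect, can be sketched in Python for comparison. The callback returns the match unchanged, which mirrors the [-1] trick above, and re.findall shows the direct route when no substitution is wanted at all. The sample line and the pattern foo\d+ are placeholders.

```python
import re

hits = []


def collect(match):
    """Record the match and return it unchanged, so the text is not altered."""
    hits.append(match.group(0))
    return match.group(0)


line = "foo1 bar foo22 baz foo333"  # placeholder text

# Substitution with a side effect: the line comes back identical.
unchanged = re.sub(r"foo\d+", collect, line)

# Direct route: no substitution involved.
direct = re.findall(r"foo\d+", line)

print(hits)
```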

Trying to remove the first column of a document.

I'm using this command below to remove the first column of a document:
%s/^[^\t]*\zs\t[^\t]*\ze//g
but it says command not found. Any idea?
Here's the quickest way to remove the first column:
Press gg to go to the first character in the document.
Hit Ctrl+V to enter visual block mode.
Hit G (that is, shift-g) to go to the end of the document
Hit x to delete the first column.
I like the block selection solution of @Peter, but if you want to use substitution you need this command:
:%s/^.//
Let's analyze why this works:
:%s executes a substitution on every line of the document
/^./ selects the first character of each line
/ and replaces it with... nothing.
If I understand you correctly, this should do the job:
:%s/^[^\t]*//
The command removes all leading characters that are not a tab.
Alternatively, if you're editing a tab-separated values document and want to remove everything up to and including the first tab (that is, the first "column"), then this should do it for you:
:%s/^[^\t]*\t//
The below command worked for me:
:%s/^\w*//
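As a cross-check outside Vim, the tab-separated variant can be sketched in Python. The function name drop_first_column and the sample data are invented for illustration. Note that in Python a negated class also matches newlines, so \n is excluded explicitly to keep each match within one line.

```python
import re


def drop_first_column(text: str) -> str:
    """Remove everything up to and including the first tab on each line."""
    # [^\t\n]* stays on the current line; lines without a tab are left alone.
    return re.sub(r"^[^\t\n]*\t", "", text, flags=re.MULTILINE)


sample = "id\tname\tcity\n1\tAnn\tOslo\n2\tBob\tRome\n"  # invented data
result = drop_first_column(sample)
print(result, end="")
```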