Trying to standardize exports - replace

I am trying to standardize my exports file. I need to remove all spaces and tabs between the first two fields and replace them with two tabs. I am using VI.
So I want to change
/vol/vol1/home1/xxx -rw=admin:app:admhosts
to
/vol/vol1/home1/xxx		-rw=admin:app:admhosts
so that the whitespace between the two fields becomes exactly two tabs.

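Use :s/[ ^I]\+/^I^I/ for a single line, or :%s/[ ^I]\+/^I^I/ for the whole file. Here ^I stands for a literal tab character, which you type by pressing Ctrl+I. Because there is no g flag, only the first run of spaces and tabs on each line is replaced, which is the gap between the first two fields.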
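If you ever need to do the same thing outside the editor, the substitution can be sketched in Python. This is only an illustration of the same idea, not part of the vi answer; the function names are made up for the example. Like the vi command, it also rewrites the first whitespace run on comment lines or on lines with leading whitespace.

```python
import re

def standardize_line(line: str) -> str:
    # Replace only the first run of spaces/tabs on the line,
    # i.e. the gap between the first and second field.
    return re.sub(r"[ \t]+", "\t\t", line, count=1)

def standardize_text(text: str) -> str:
    # Apply the substitution line by line, keeping the original line endings.
    return "".join(standardize_line(l) for l in text.splitlines(keepends=True))
```

Calling standardize_text on the contents of the exports file and writing the result back gives the whole-file equivalent of the :%s form.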

Related

C++ Select Multiple Occurrences of a variable?

I am new to C++, but I come from Python, and I am starting C++ in a new semester. I remember that in Python editors you can select multiple occurrences at once. At the moment I am using Code::Blocks.
So, when I start selecting a variable that pops up a few times, I see it highlights the other same variables in red, but I am only able to change one at a time. So I am looking for like a multiple-cursor type method where I can select all occurrences of a variable or word and rewrite them. For example, if I want to change 15 places where I used the variable name weight instead of weightsKG then I want to be able to select all at once, not CTRL-select. Or is that not possible?

In Code::Blocks, you can use the Replace... or Replace in files... command from the Search menu.
A dialog opens where you enter the text to be replaced and the new text.
Pressing the Replace button opens another dialog where you can confirm each replacement one by one, or replace all the occurrences in one go.
That's all.

Google Sheets Using RegEX To Reformat & Concatenate

Link To Spreadsheet
Sheet!1Name - Names are in Single Column
Sheet!2Names - Names are in First Name, Last Name columns.
What I'm trying to do is basically remove any suffixes, special characters, and spaces, capitalize that information, and combine it with information from another field.
I was able to figure out how to piece together some regex that seems to effectively get rid of suffixes and removes special characters. It's below. That's where my skill set stops.
={"PlayerKey";ARRAYFORMULA(UPPER(IF(ISBLANK(C2:C8),,PROPER(TRIM(REGEXREPLACE(C2:C8," Jr\.$| J$| Sr\.$| S$|IV$|III$|II$|\.|-|'",""))))))}
I'm having trouble nesting formulas. I believe I need to nest both CONCAT and SUBSTITUTE, but I'm not sure that's the right way to get the "Desired Output example" that is in the sheet. I'm also having trouble understanding what order to do things in, which I think is why I'm having trouble with 2Names.

How's this in A1 of the new tab called MK.Help?
=ARRAYFORMULA({"Player Key";UPPER(TRIM(REGEXREPLACE(IF(MID(C2:C8,2,1)=".",INDEX(SPLIT(C2:C8," "),,1),LEFT(C2:C8))&D2:D8," Jr\.$| J$| Sr\.$| S$|IV$|III$|II$|\.|-|'",""))&E2:E8)})
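To make the order of operations in that formula easier to follow, here is a rough Python sketch of the same steps for a single row. It assumes column C holds the first name, column D the last name, and column E some extra field that is appended unchanged; the function name and sample values are hypothetical. One difference: Sheets' TRIM also collapses repeated inner spaces, which strip() does not.

```python
import re

# Same pattern as in the formula: trailing suffixes, then dots, hyphens, apostrophes.
SUFFIX_AND_PUNCT = re.compile(r" Jr\.$| J$| Sr\.$| S$|IV$|III$|II$|\.|-|'")

def player_key(first: str, last: str, extra: str = "") -> str:
    # IF(MID(C,2,1)=".", first space-separated token, LEFT(C)):
    # keep initials such as "A.J." whole, otherwise take only the first letter.
    if len(first) > 1 and first[1] == ".":
        head = first.split(" ")[0]
    else:
        head = first[:1]
    # REGEXREPLACE on the joined name, then TRIM and UPPER.
    cleaned = SUFFIX_AND_PUNCT.sub("", head + last)
    # The extra field is appended after UPPER, so it is left as typed.
    return cleaned.strip().upper() + extra
```

The point is the nesting order: build the joined name first, strip suffixes and punctuation, trim, uppercase, and only then append the other field.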

Set Difference in Notepad++ with Regexes

Suppose I have two files main.txt and sub.txt. Suppose both files have unique lines, i.e. the same line of text does not occur twice in either file. Also suppose there are no empty lines in either file. Now, consider the files as sets of strings, with each member of the set occurring on a line. This is possible because of our uniqueness condition. Now suppose sub.txt is a subset of main.txt in this way. How do we compute the set difference of main.txt and sub.txt to produce a new file diff.txt? To be clear, the lines of diff.txt should be those that occur in main.txt but not sub.txt. There should be no empty lines in diff.txt. Order in diff.txt is irrelevant.
Example
main.txt:
Hello
World
How
You
Are
sub.txt:
World
Hello
diff.txt:
How
Are
You
Bonus Questions
How can I tell that one set is actually a subset of the other? This is an assumption in the question, but in practice we mightn't know this for sure and would want a way to check it automatically.
How can I tell if the lines in each file are truly unique?
How can I tell if there are no blank lines?

Bonus Answer
I'll answer the bonus questions first. Follow these steps in order to ensure the right conditions hold as stated in the question:
Open both files in Notepad++ and close any other files
Lexicographically sort each file: https://superuser.com/questions/762279/sorting-lines-in-notepad-without-the-textfx-plugin
Ensure that the following regex has no matches in either file, which will guarantee they're duplicate-free: ^(.+$\r\n)\1. If you want to remove duplicates, replace all occurrences of that regex with \1.
Ensure there are no blank lines in either file by searching for ^$. If any are found you can delete them manually.
Create a third file and paste the contents of both sub.txt and main.txt into this file. Then lexicographically sort it. Count the number of occurrences of the regex ^(.+$)\r\n\1$ to detect duplicate lines (the trailing $ keeps a line from matching a longer line that merely starts with the same text). If the count matches the number of lines in sub.txt, then it's a subset of main.txt. Keep this file for later.
Main Answer
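In the third file you created in the last part, search for ^(.+$)\r\n\1$\r?\n? and replace with the empty string. The $ after \1 makes sure the second line is an exact duplicate rather than a longer line with the same prefix. This removes every line that occurs in both sub.txt and main.txt, leaving you with diff.txt.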
Note: This approach may leave you with a single blank line at the end of diff.txt, in the case where there was a duplicate found there. In that case, just delete it manually.
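For comparison, the same checks and the set difference can be sketched in a few lines of Python. This is an alternative to the Notepad++ procedure, not a description of it; the function names are made up for illustration, and the inputs are assumed to be lists of lines already read from main.txt and sub.txt.

```python
def check_assumptions(main_lines, sub_lines):
    # Mirrors the bonus questions: uniqueness, no blank lines, subset relation.
    return {
        "main_unique": len(set(main_lines)) == len(main_lines),
        "sub_unique": len(set(sub_lines)) == len(sub_lines),
        "no_blank_lines": all(line != "" for line in main_lines + sub_lines),
        "is_subset": set(sub_lines) <= set(main_lines),
    }

def set_difference(main_lines, sub_lines):
    # Keep the lines of main.txt that do not occur in sub.txt, in main.txt order.
    sub = set(sub_lines)
    return [line for line in main_lines if line not in sub]
```

With the example files above, set_difference returns the lines How, You and Are, and every entry of check_assumptions is True.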

Notepad++ - Selecting or Highlighting multiple sections of repeated text IN 1 LINE

I have a text file in Notepad++ that contains about 66,000 words all in 1 line, and it is a set of 200 "lines" of output that are all unique and placed in 1 line in the basic JSON form {output:[{output1},{output2},...}]}.
There is a set of characters matching the RegEx expression "id":.........,"kind":"track" that occurs about 285 times in total, and I am trying to either single them out, or copy all of them at once.
Basically, without some super complicated RegEx terms, I am stuck because I can't figure out how to highlight all of them at once, and also the Remove Unbookmarked Lines feature does not apply because this is all in one line. I have only managed to be able to Mark every single occurrence.
So does this require a large number of steps to get the file into multiple lines and work from there, or is there something else I am missing?
Edit: I have come up with a set of Macro schemes that make the process of doing this manually work much faster. It's another alternative but still takes a few steps and quite some time.
Edit 2: I intended there to be an answer for actually just highlighting the different sections all at once, but I guess that is not possible. The answer here turns out to be more useful in my case, allowing me to have a list of IDs without everything else.

You seem to already have a regex which matches single instances of your pattern, so assuming it works and that we must use Notepad++ for this:
Replace .*?("id":.........,"kind":"track").*?(?="id":.........,"kind":"track"|$) with \1.
If this text file is valid JSON, this opens you up to other, non-Notepad++ options, like using Python with the json module.
Edited to remove unnecessary steps
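As a rough sketch of the Python route mentioned above: the sample string, its key names, and the ID values below are invented for illustration, since the real file is not shown. The regex variant reuses the pattern from the question; the json variant only works if the file really is valid JSON with a list under a top-level key (here assumed to be "output").

```python
import json
import re

# Hypothetical one-line input standing in for the real file contents.
text = ('{"output":[{"id":123456789,"kind":"track"},'
        '{"id":987654321,"kind":"track"},'
        '{"id":111111111,"kind":"album"}]}')

# Same idea as the question's pattern: "id": followed by exactly nine characters.
PATTERN = re.compile(r'"id":.{9},"kind":"track"')

def extract_with_regex(s):
    # Returns every matching snippet, in order of appearance.
    return PATTERN.findall(s)

def extract_with_json(s):
    # Parses the document and keeps the id of each entry whose kind is "track".
    data = json.loads(s)
    return [item["id"] for item in data["output"] if item.get("kind") == "track"]
```

Either function gives a list you can join with newlines and paste back into a new Notepad++ tab.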

Compare files and return only the differences using Notepad++

Notepad++ has a Compare Plugin tool for comparing text files, which operates like this:
Launch Notepad++ and open the two files you wish to run a comparison check on.
Click the “Plugins” menu,
Select “Compare” and click “Compare.”
The plugin will run a comparison check and display the two files side by side, with any differences in the text highlighted.
This is a nice feature, which I have used happily for some time. Now I am looking for a way to go further and keep only the highlighted differing lines (e.g. by deleting the non-highlighted ones), or vice versa, i.e. expunge the highlighted lines.
Is there a straightforward way to achieve this?

To subtract two files in Notepad++ (file1 - file2) you may follow this procedure:
Recommended: If possible, remove duplicates from both files, especially if the files are big. To do this: Edit => Line operations => Sort Lines Lexicographically Ascending (do it on both files)
Add ---------------------------- as a footer on file1 (add at least 10 dashes). This is the marker line that separates file1 content from file2.
Then copy the contents of file2 to the end of file1 (after the marker)
Control + H
Search: (?m-s)^(?:-{10,}+\R[\s\S]*+|(.*+)\R(?=(?:(?!^-{10,}$)-++|[^-]*+)*+^-{10,}+\R(?:^.*+\R)*?\1(?:\R|\z)))
(Note: set case sensitivity according to your needs.)
Replace by: (leave empty)
Select Regular expression radio button
Replace All
You can modify the marker if file1 or file2 could contain lines equal to the marker. In that case you will have to adapt the regular expression.
By the way, you could even record a macro to do all the steps (add the marker, switch to file2, copy its content to file1, apply the regex) with a single button press.
Edited:
Changed the regex to add some improvements:
Speed related:
Avoid as much backtracking as possible
Avoid searching after the mark
Usability:
Dashes are allowed for the lines. But the separator is still ^-{10,}$
Works with other characters besides words
Speed comparison (new method vs. old method):
Roughly 78 ms vs. 1.6 seconds, so a nice improvement. That makes comparing kilobyte-sized files practical.
Still, you may want to use a dedicated program for comparing or subtracting bigger files.

If the number of differences is not large, a quicker method might be to bookmark each differing line using keyboard shortcuts. Starting from the beginning of the file, press Alt+Page Down to jump to the first difference, and then press Ctrl+F2 to bookmark it. Keep alternating between Alt+Page Down and Ctrl+F2 until you reach the last difference.
With all the differing lines bookmarked, you can use any of the operations under "Search -> Bookmarks" menu:
Cut Bookmarked Lines
Copy Bookmarked Lines
Paste to (Replace) Bookmarked Lines
Remove Bookmarked Lines
Remove Unmarked Lines

I have a dirty workaround for this. It saves some time compared to Control+C, Alt+Tab, Control+V; Control+C, Alt+Tab, Control+V; ... but it may not be worth it on big files or if the two files differ a lot. For bigger files you may prefer using some other tool.
Typically this works best when comparing groups of 'words' and does not work with content that is indented with tabs (like source code).
So the workaround is:
Optional (depends on the content being compared): sort both files, which makes the later comparison easier. To do this: Edit => Line operations => Sort Lines Lexicographically Ascending (do it on both files)
Compare files with the plugin
Choose one file and inspect the lines you want to keep. Add one tab before each of those lines. Remember that you can select several lines and press Tab to indent them all at once. Alternatively, you may add tabs to the lines you want to remove instead
Sort the file. The tab-indented lines will come up first, so now you can copy-paste them (or copy-paste the unindented ones)

Move the files to a Linux box and then run the diff command:
$ diff file1.txt file2.txt > file_diff.txt
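If no Linux box is at hand, a similar result can be sketched with Python's standard difflib module. This is an assumption-laden stand-in rather than a replacement for diff: the function name is made up, and the output is two plain lists of lines rather than diff's own output format.

```python
import difflib

def only_differences(a_lines, b_lines):
    # ndiff prefixes each line with "- " (only in a), "+ " (only in b),
    # "  " (common) or "? " (intraline hints); keep only the first two kinds.
    only_a, only_b = [], []
    for entry in difflib.ndiff(a_lines, b_lines):
        if entry.startswith("- "):
            only_a.append(entry[2:])
        elif entry.startswith("+ "):
            only_b.append(entry[2:])
    return only_a, only_b
```

The inputs would typically come from something like Path("file1.txt").read_text().splitlines(), and either returned list can be written out as the "differences only" file.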
