regular expressions for selecting multiple lines - regex

I have a text file in a particular format:
!c_xyz|crby=112|crdate=12jun11|mdby=112|mddate=12jun11|Desc=xyz
asdasda........................................................
asddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd
!c_abc|crby=112|crdate=12jun11|mdby=112|mddate=12jun11|Desc=xyz...
I need a regular expression to reformat this file using Find and Replace in Visual Studio. The Desc field value has overflowed onto the following lines, and I need to move it back onto the line it belongs to. The final string should look like this:
!c_xyz|crby=112|crdate=12jun11|mdby=112|mddate=12jun11|Desc=xyzsdasda.........asdddddd..
!c_abc|crby=112|crdate=12jun11|mdby=112|mddate=12jun11|Desc=xyz...
I need an RE for "desc=" followed by anything until the next ! symbol

Find Desc=([^\|\r\n]+)[\r\n](([^!\r\n][^\r\n]+[\r\n])*), replace with Desc=\1\2, and repeat until every line starts with ! (you can test this by searching for ^[^!], which should find nothing).
Alternatively, find [\r\n]+ and replace with the empty string; then find ! and replace with \r\n!. This suggestion has two drawbacks: it temporarily produces very long lines, which your editor (notably VS) may have difficulties with, and it processes descriptions containing ! incorrectly.
Addendum:
Your input seems to be fixed-format up to the Desc section. If it is, you can apply step 1 of alternative #2, followed by a search/replace run using (!.{53}\|Desc=)/[\r\n]\1.
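For anyone who would rather script this than run the replace repeatedly in an editor, here is a minimal Python sketch of the same idea. It is not the approach above verbatim: it uses a negative lookahead to drop every line break that is not followed by !, in one pass. The function name join_continuations and the shortened sample data are made up for illustration, and like the second alternative it mishandles an overflow line that itself begins with !.

```python
import re

def join_continuations(text: str) -> str:
    """Glue overflow lines back onto the record that starts with '!'."""
    # Remove a line break unless the next character is '!' (a new record)
    # or the break is the very last thing in the text.
    return re.sub(r"\r?\n(?!!|\Z)", "", text)

# Shortened, hypothetical sample in the same shape as the question's data.
sample = (
    "!c_xyz|crby=112|crdate=12jun11|mdby=112|mddate=12jun11|Desc=xyz\n"
    "asdasda....\n"
    "asdddddd\n"
    "!c_abc|crby=112|crdate=12jun11|mdby=112|mddate=12jun11|Desc=xyz...\n"
)
print(join_continuations(sample), end="")
```

Because the lookahead consumes nothing, no character of the overflow text is lost when the lines are joined.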

As mentioned in the comments by @X3074861X, you can use Notepad++.
Input:
!c_xyz|crby=112|crdate=12jun11|mdby=112|mddate=12jun11|Desc=xyz
asdasda........................................................
asddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd
!c_abc|crby=112|crdate=12jun11|mdby=112|mddate=12jun11|Desc=xyz...
For the find and replace, select the mode as Regular expression with the options as follows:
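Find what: \r\n[^!]
Leave Replace with blank.
Note that [^!] also matches, and therefore deletes, the first character of each continuation line (visible in the output below, where the leading a of each joined line is missing). To keep that character, use \r\n(?!!) instead.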
Output:
!c_xyz|crby=112|crdate=12jun11|mdby=112|mddate=12jun11|Desc=xyzsdasda........................................................sddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd
!c_abc|crby=112|crdate=12jun11|mdby=112|mddate=12jun11|Desc=xyz...

Regex to extract all strings from source code used when calling a function

We have an old, grown project with thousands of php files and need to clean it up.
Throughout the whole project we do have a lot of function calls similar to:
trans('somestring1');
trans("SomeString2");
trans('more_string',$somevar);
trans("anotherstring4",$somevar);
trans($tx_key);
trans($anotherKey,$somevar);
All of those are embedded into the code and represent translation keys. I would like to find a way to extract all "translation keys" in all occurrences.
The PHP project is in VS Code, so a RegEx Search would be helpful to list the results.
Or I could search through the project with any other tool you would recommend
However I would also need to "export" just the strings to a textfile or similar.
The ideal result would be:
somestring1
SomeString2
more_string
anotherstring4
$tx_key
$anotherKey
As a bonus - if someone knows, how I could get the above list including filename where the result has been found - that would be really fantastic!
Any help would be greatly appreciated!
Update:
The RegEx I came up with:
/(trans)+\([^\)]*\)(\.[^\)]*\))?/gim
This lists the full occurrence. How can I get just the first part of the result (between single quotes, OR between double quotes, OR beginning with $)?
See here: regexr.com/548d4
Here are some steps to get exactly what you want. Using this you can do a find and replace on your search results!
So you could do sequential regex find/replaces in the right circumstances.
The replace can be just within the search results editor and not affect the underlying files at all - which is what you want.
You can also have the replace action actually edit the underlying files if you wish.
[Hint: This technique can also make doing a find item a / replace with b in files that contain term c much easier to do.]
(1) Open a new Search Editor: open the Command Palette with Ctrl+Shift+P and run the command that opens a new Search Editor.
(That command is currently unbound to a keybinding.)
(2) Paste this regex into the Search input box (with the regex option .* selected):
`(.*?)(\btrans\(['"]?)([^,'")]+)(.*)` - a relatively simple regex
See my other answer for a regex to work with up to 6 entries per line:
(\s*\d+:\s)?((.*?)(\btrans\(['"]?)([^,'")]*)((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?)(.*)
(3) You will get a list of files with the search results. Now open a Find widget (Ctrl+F) in this Search Editor.
(4) Put the same regex into that Find input. Regex option selected. Put $3 into the Replace field. This only replaces in this Search editor - not the original files (although that can be done if you want it in some case). Replace All.
If using the 1-6 version regex, replace with:
$1$5 $9 $13 $17 $21 $25
(5) Voila. You can now save this Search Editor as a file.
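If you would rather script the export (including the filename bonus from the question), here is a minimal Python sketch. It is an alternative to the Search Editor steps, not part of them. The names TRANS_KEY, extract_keys and scan_project are made up for illustration, and the pattern assumes the key is always the first argument and is either a quoted string or a plain $variable.

```python
import re
from pathlib import Path

# First argument of trans(...): a quoted string or a $variable.
TRANS_KEY = re.compile(r"""\btrans\(\s*(?:(['"])(.*?)\1|(\$\w+))""")

def extract_keys(source: str) -> list[str]:
    """Return every translation key found in a chunk of PHP source."""
    keys = []
    for m in TRANS_KEY.finditer(source):
        # Group 2 holds a quoted key, group 3 a $variable.
        keys.append(m.group(2) if m.group(2) is not None else m.group(3))
    return keys

def scan_project(root: str) -> list[tuple[str, str]]:
    """Collect (filename, key) pairs for all .php files under root."""
    found = []
    for path in sorted(Path(root).rglob("*.php")):
        text = path.read_text(encoding="utf-8", errors="replace")
        found.extend((str(path), key) for key in extract_keys(text))
    return found
```

Calling scan_project(".") from the project root and writing each pair to a text file gives the list with filenames. Because finditer walks every match, any number of trans(...) calls per line is handled.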
The first answer works for one desired capture per line as in the original question. But that relatively simple regex won't work if there are two or more per line.
The regex below works for up to 6 entries per line, like
trans('somestring1');
stuff trans("SomeString2"); some content trans("SomeString2a");more stuff [repeat, repeat]
But it doesn't for 7+ - you'll need a regex guru for that.
Here is the process again with a twist of using a snippet in the Search Editor instead of a Find/Replace. Using a snippet allows more control over the formatting of the final result.
(1) Open a new Search Editor: open the Command Palette with Ctrl+Shift+P and run the command that opens a new Search Editor. (That command is currently unbound to a keybinding.)
(2) Paste this regex into the Search input box (with the regex option .* selected):
`((.*?)(\btrans\(['"]?)([^,'")]*)((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?)(.*)`
(3) You will get a list of files with the search results. Now select all your results individually with Ctrl+Shift+L.
(4) Trigger this keybinding:
{
  "key": "alt+i", // whatever keybinding you like
  "command": "editor.action.insertSnippet",
  "when": "editorTextFocus",
  "args": {
    "snippet": "${TM_SELECTED_TEXT/((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*)((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*))?((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*))?((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*))?((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*))?((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*))?)(.*)/$4${8:+\n }$8${12:+\n }$12${16:+\n }$16${20:+\n }$20${24:+\n }$24/g}"
  }
},
That snippet will be applied to each selection in your search result. This part ${8:+\n } is a conditional which adds a newline and some spaces if there is a capture group 8 - which would be a second trans(...) on a line.

How would I copy and paste selected text using Regular Expressions and the Replace dialog in Notepad ++?

Delving straight into the problem: all I'm trying to do is duplicate a line and add a bracket at the end, using regular expressions and automating the process through the Replace dialog in Notepad++.
My issue visualized:
In the representation underneath, I have a bunch of instances of ["Mesh"] that all have different path values assigned to them. All I want to do is duplicate the path entry and add a bracket at the end, before the comma, in the duplicated one.
What I have right now:
...
["Mesh"] = Platform(
"models/ships/japan/Zuikaku.mmod",
...
What I'm trying to achieve:
...
["Mesh"] = Platform(
"models/ships/japan/Zuikaku.mmod",
"models/ships/japan/Zuikaku.mmod"),
...
Without getting too specific: since there are ~500 of these instances across the file I'm modifying, I do not want to go through each one pressing Ctrl+D to duplicate the line and then adding the bracket, as that would take literal ages.
I have some limited experience with Regular Expressions from previous uses, but very limited. I know I can select the entire line in the Search dialog using ".*" but that's as far as I've gotten.
Thank you in advance for your time!
You should be able to use this regex (disable . matches newline). I am using (\R+) to capture end-of-line characters (and reproduce them in the output) so that it will work on systems that use other than just newline to end lines.
(\["Mesh"\]\s*=\s*.*(\R+))(.*),$
Replace with
$1$3,$2$3\),
For the input of
...
["Mesh"] = Platform(
"models/ships/japan/Zuikaku.mmod",
...
This gives
...
["Mesh"] = Platform(
"models/ships/japan/Zuikaku.mmod",
"models/ships/japan/Zuikaku.mmod"),
...
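The same replacement can be tried outside Notepad++. Below is a minimal Python sketch, with two differences that are my own assumptions: Python's re module has no \R, so a single (\r?\n) stands in for (\R+), and the closing bracket needs no escaping in the replacement string. The names MESH and duplicate_mesh_path are made up for illustration.

```python
import re

# Same shape as the Notepad++ pattern; (\r?\n) replaces (\R+).
MESH = re.compile(r'(\["Mesh"\]\s*=\s*.*(\r?\n))(.*),$', re.MULTILINE)

def duplicate_mesh_path(text: str) -> str:
    # \1 = the ["Mesh"] line including its line break,
    # \2 = that line break on its own,
    # \3 = the path line without its trailing comma.
    return MESH.sub(r"\1\3,\2\3),", text)

before = '["Mesh"] = Platform(\n    "models/ships/japan/Zuikaku.mmod",\n'
print(duplicate_mesh_path(before), end="")
```

Any indentation in front of the path is part of group 3, so the duplicated line keeps it.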

Search/replace in block selection in Notepad++

Is there a way to limit search/replace only to a columnar block selection in Notepad++?
Here is what I am trying to do:
I am bulk-editing metadata extracted from large numbers of photos.
The metadata comes to me as a csv file with no quotes around fields in header line and no quotes around first field in each succeeding line.
I edit this file in Open Office calc which exports with quotes around all fields.
I can easily edit header row but the problem comes in stripping quotes from only first field in successive lines.
I can use notepad in columnar mode but, after selecting the first column, the 'search only in selection' option box is greyed out.
I can do this by hand but it means lots of hand-work and increased chance of error.
I know, this probably won't help you any more, but I just had the same problem and stumbled across this question.
I found moving the block in question to a new file and performing the find/replace there works quite decently. When moving the block back, be sure to select it in block mode (see this question).
No. Another editor may have this feature.
Sort of a late reply, but I had the same problem when I moved to a new machine with Notepad++ installed. Previously, I was using a text editor called Boxer that had this feature, which I found invaluable. It's not freeware, however.
You may not be able to Search/Replace within a columnar selection, but you can easily carry out your task within Notepad++. Use Find and Replace feature, with the Regular Expressions box checked.
If you want to remove quotes only from a target column, use the following regular expression in the Find field:
(^([^,]*,){i})"([^,\n\r]*)"(.*$)
Replace i with the position of the target column minus 1.
(i.e., use 2 if you want to strip the quotes from the third column, 0 for the first column, etc.)
In the Replace field use:
\1\3\4
Clicking "Replace All" will strip quotes from the target column.
If you want to blow away all quotes surrounding each element in your csv without prejudice, use the following regular expression in the Find field:
((?<=,)|(?<=^))"(.*?)"((?=$|,))
In the Replace field use:
\1\2\3
Clicking Replace All will strip quotes from the columns.
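The column-specific find/replace can also be run as a script. Here is a minimal Python sketch, assuming fields never contain embedded commas or quotes. The function name strip_quotes_in_column is made up, and I added \r\n to the first character class so the column prefix cannot spill across lines.

```python
import re

def strip_quotes_in_column(csv_text: str, column_index: int) -> str:
    """Remove the quotes around one zero-based column on every line."""
    # {%d} skips column_index fields, mirroring the {i} in the pattern above.
    pattern = r'(^([^,\r\n]*,){%d})"([^,\r\n]*)"(.*$)' % column_index
    return re.sub(pattern, r"\1\3\4", csv_text, flags=re.MULTILINE)

rows = '"0","1","2"\n"10","11","12"\n'
print(strip_quotes_in_column(rows, 1), end="")
```

Passing 0 as column_index handles the first field, which is the case described in the question.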
Example
Since you didn't provide an example csv file, I'll walk through my own working example. Below is my csv:
"0","1","2","3","4","5","6","7","8","9"
"10","11","12","13","14","15","16","17","18","19"
"20","21","22","23","24","25","26","27","28","29"
"30","31","32","33","34","35","36","37","38","39"
"40","41","42","43","44","45","46","47","48","49"
"50","51","52","53","54","55","56","57","58","59"
"60","61","62","63","64","65","66","67","68","69"
"70","71","72","73","74","75","76","77","78","79"
"80","81","82","83","84","85","86","87","88","89"
"90","91","92","93","94","95","96","97","98","99"
"100","101","102","103","104","105","106","107","108","109"
"110","111","112","113","114","115","116","117","118","119"
"120","121","122","123","124","125","126","127","128","129"
"130","131","132","133","134","135","136","137","138","139"
"140","141","142","143","144","145","146","147","148","149"
"150","151","152","153","154","155","156","157","158","159"
"160","161","162","163","164","165","166","167","168","169"
"170","171","172","173","174","175","176","177","178","179"
"180","181","182","183","184","185","186","187","188","189"
"190","191","192","193","194","195","196","197","198","199"
If I wanted to remove quotes from the second column, I would use the below Find and Replace fields
(^([^,]*,){1})"([^,\n\r]*)"(.*$)
\1\3\4
Clicking Replace All yields the below result:
"0",1,"2","3","4","5","6","7","8","9"
"10",11,"12","13","14","15","16","17","18","19"
"20",21,"22","23","24","25","26","27","28","29"
"30",31,"32","33","34","35","36","37","38","39"
"40",41,"42","43","44","45","46","47","48","49"
"50",51,"52","53","54","55","56","57","58","59"
"60",61,"62","63","64","65","66","67","68","69"
"70",71,"72","73","74","75","76","77","78","79"
"80",81,"82","83","84","85","86","87","88","89"
"90",91,"92","93","94","95","96","97","98","99"
"100",101,"102","103","104","105","106","107","108","109"
"110",111,"112","113","114","115","116","117","118","119"
"120",121,"122","123","124","125","126","127","128","129"
"130",131,"132","133","134","135","136","137","138","139"
"140",141,"142","143","144","145","146","147","148","149"
"150",151,"152","153","154","155","156","157","158","159"
"160",161,"162","163","164","165","166","167","168","169"
"170",171,"172","173","174","175","176","177","178","179"
"180",181,"182","183","184","185","186","187","188","189"
"190",191,"192","193","194","195","196","197","198","199"
My search on the internet, to see whether Notepad++ supports this, brought me here.
I have used TextPad and confirm that it supports find-and-replace within column selected block. Also TextPad is free for personal use.

EditPad: Need a regex that handles multiple possible data formats

First, I'm using EditPadPro for my regex cleaning, so any answers given should work within that environment.
I get a large spreadsheet full of data that I have to clean every day. I've managed to get it down to a couple of different regexes that I run, and this works... but I'm curious to see if it's possible to reduce down to a single regex.
Here is some sample data:
3-CPC_114851_70095_70095_CAN-bre
3-CPC_114851_70095_70095_CAN
b11-ao1-113775-bre
b7-ao-114441
b7-ao-114441-bre
b7-ao1-114441
b7-ao1-114441-bre
http://go.nlvid.com/results1/?http://bo
go.nlv/results1/?click
b4-sm-1359
b6-sm-1356-bre
1359_195_1453814569-bre
1356_104_1456856729
b15-rad-8905
b15-rad-8905-bre
Here is how the above data needs to end up:
114851-bre
114851
113775-bre
114441
114441-bre
114441
114441-bre
http://go.nlvid.com/results1/
go.nlv/results1/
sm-1359
sm-1356-bre
sm-1359-bre
sm-1356
rad-8905
rad-8905-bre
So, there are numerous rules, such as:
In cases of more than 2 underscores, the result needs to contain only the value immediately after the first underscore, and everything from the dash onwards.
In cases where the string contains "-ao-", "-ao1-", everything prior to the final numeric string should be removed.
If a question mark is present, everything from the mark onwards should be removed.
If the string contains "-sm-" or "-rad-", everything prior to those alpha strings should be removed.
If the string contains 2 underscores, everything after the first numeric string up to a dash (if present) should be removed, and the string "sm-" should be prepended.
Additionally there is other data that must be left untouched, including but not limited to:
113535|24905|24905
as well as many variations on this pattern of xxxxxx|yyyyy|zzzzz (and not always those string lengths)
This may be asking way too much of regex, I'm not sure as I'm not great with it. But I've seen some pretty impressive things done with it, so I thought I'd put this out to the community and see what you come back with.
Jonathan, I can wrap all of those into one regex, except the last one (where you prepend sm- to a string that does not contain sm). It is not possible in this context, because we cannot capture "sm" to reuse in the replacement, and because there is no "conditional replacement" syntax in EPP.
That being said, you can achieve what you want in EPP with two regexes and one macro to chain the two.
Here is how.
The solution below is tested in EPP.
Regex 1
1. Press Ctrl + Shift + F to enter Search / Replace mode.
2. Enter the following Search and Replace in the appropriate boxes.
3. At the top right of the Search bar, click the Favorite Searches pull-down, select "Add", give it a name, e.g. Regex 1.
Search:
(?mx)^
(?=(?:[^_\r\n]*?_){3})[^_\r\n]+?_([^_\r\n]+)[^-\r\n]+(-[^\r\n]+)?
|
[^\r\n]*?-ao1?-\D*([^\r\n]+)
|
([^\r\n?]*)(?=\?)[^\r\n]+
|
[^\r\n]*?-((?:sm|rad)-[^\r\n]+)
Replace:
\1\2\3\4\5
Regex 2
Same 1-2-3 steps as above.
Search
^(?!(?:[^_\r\n]*?_){3})(?=(?:[^_\r\n]*?_){2})(\d+)(?:[^-\r\n]+(-[^\r\n]+)?)
Replace
sm-\1\2
Chaining Regex 1 and Regex 2
Top menu: Macros, Record Macro, give it a name.
Click the Favorite searches pulldown, select Regex 1
Hit Replace All.
Click the Favorite searches pulldown, select Regex 2
Hit Replace All.
Macros, Stop recording.
Whenever you want to do your sequence of replacements, pull it by name under the Macros menu.
Testing This
I have tested my "Jonathan macro" on your input. Here is the result:
114851-bre
114851
113775-bre
114441
114441-bre
114441
114441-bre
http://go.nlvid.com/results1/
go.nlv/results1/
sm-1359
sm-1356-bre
sm-1359-bre
sm-1356
rad-8905
rad-8905-bre
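For readers without EPP, the two-pass chain can be approximated in Python. This is only a sketch: Python's re is not the EPP engine, so I pass MULTILINE explicitly for Regex 2 and rely on Python substituting an empty string for groups that did not take part in a match. The names REGEX_1, REGEX_2 and clean are made up for illustration.

```python
import re

# Regex 1, free-spacing as above; only the first alternative is anchored.
REGEX_1 = re.compile(r"""
    ^(?=(?:[^_\r\n]*?_){3})[^_\r\n]+?_([^_\r\n]+)[^-\r\n]+(-[^\r\n]+)?
  | [^\r\n]*?-ao1?-\D*([^\r\n]+)
  | ([^\r\n?]*)(?=\?)[^\r\n]+
  | [^\r\n]*?-((?:sm|rad)-[^\r\n]+)
""", re.MULTILINE | re.VERBOSE)

# Regex 2: exactly two underscores, so "sm-" has to be prepended.
REGEX_2 = re.compile(
    r"^(?!(?:[^_\r\n]*?_){3})(?=(?:[^_\r\n]*?_){2})(\d+)(?:[^-\r\n]+(-[^\r\n]+)?)",
    re.MULTILINE,
)

def clean(text: str) -> str:
    """Run both replacements in order, like the recorded macro."""
    text = REGEX_1.sub(r"\1\2\3\4\5", text)
    return REGEX_2.sub(r"sm-\1\2", text)

print(clean("b11-ao1-113775-bre\n1359_195_1453814569-bre\n113535|24905|24905"))
```

Lines such as 113535|24905|24905 match neither pattern and pass through unchanged.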
Try this:
Toggle the Search Panel : SHIFT+CTRL+F
SEARCH: .*?((?:sm-|rad-)?(?:(?:\d+|[\w\.]+\/.*?))(?:-\w+)?$)
REPLACE: $1
Check REGEX and WORDS
Click Replace All or Hit CTRL+ALT+F3

Vim: How to apply external command only to lines matching pattern

Two of my favorite Vim features are the ability to apply standard operators to lines matching a regex, and the ability to filter a selection or range of lines through an external command. But can these two ideas be combined?
For example, I have a text file that I use as a lab notebook, with notes from different dates separated by a line of dashes. I can do something like delete all the dash-lines with something like :% g/^-/d. But let's say I wanted to resize all the actual text lines, without touching those dash lines.
For a single paragraph, this would be something like {!}fmt. But how can this be applied to all the non-dash paragraphs? When I try what seems the logical thing, and just chain these two together with :% v/^-/!fmt, that doesn't work. (In fact, it seems to crash Vim...)
Is there a way to connect these two ideas, and only pass lines (not) matching a pattern into an external command like fmt?
Consider how the :global command works.
:global (and :v) make two passes through the buffer,
first marking each line that matches,
then executing the given command on the marked lines.
Thus if you can come up with a command – be it an Ex command or a command-line tool – and an associated range that can be applied to each matching line (and range), you have a winner.
For example, assuming that your text is soft-wrapped and your paragraphs are simply lines that don't begin with minus, here's how to reformat the paragraphs:
:v/^-/.!fmt -72
Here we used the range . "current line" and thus filtered every matching line through fmt. More complicated ranges work, too. For instance, if your text were hard-wrapped and paragraphs were defined as "from a line beginning with minus, up until the next blank line" you could instead use this:
:g/^-/.,'}!fmt -72
Help topics:
:h multi-repeat
:h :range!
:h :range
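To make the two-pass behaviour concrete outside Vim, here is a small Python sketch of the same idea: first mark the lines that match (or, like :v, the ones that do not), then run the "command" on the marked lines only. It is an illustration rather than a Vim feature. The function name reformat_matching is made up, and textwrap.wrap stands in for the external fmt -72.

```python
import re
import textwrap

def reformat_matching(lines: list[str], pattern: str, width: int = 72,
                      invert: bool = False) -> list[str]:
    rx = re.compile(pattern)
    # Pass 1: mark the lines to act on, like :g (or :v when invert=True).
    marked = [bool(rx.search(line)) != invert for line in lines]
    # Pass 2: run the "command" on each marked line only.
    out = []
    for line, hit in zip(lines, marked):
        if hit and line.strip():
            out.extend(textwrap.wrap(line, width=width))
        else:
            out.append(line)  # dash lines and blank lines pass through
    return out

notes = ["-----", "word " * 30, "-----", "short"]
for line in reformat_matching(notes, r"^-", invert=True):
    print(line)
```

With invert=True this mirrors :v/^-/.!fmt -72 for soft-wrapped text: each long line is wrapped on its own and the dash separators are left alone.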
One way to do it may be to apply the command to the lines matching the pattern 'not containing only dashes'.
The solution I would try is something like (not tested):
:g/\v^(-+)@!/normal V!fmt
EDIT: I was doing some experiments and I think a recursive macro should work for you.
first of all set nowrapscan:
set nowrapscan
To prevent the recursive macro executing more than you want.
Then you make a search:
/\v^(-+)@!
Test if pressing n and N works with your pattern and tune it up if needed.
After that, start recording the macro
qqn:.!awk '{print $2}'^M$
In this case I use awk as an example; .! means filter the current line through an external program.
Then, to make the macro recursive, just append the string '@q' to the register @q:
let @q .= '@q'
And move to the beginning of the buffer to apply the recursive macro and make the modifications:
gg@q
Then you are done. Hope this helps