Search in VS Code for multiple terms - regex

Suppose I search on VS Code the terms 'word1 word2'. Then it finds all the occurrences where 'word1' is followed by 'word2'. In reality I want to find all the files where word1 and word2 occur, but they don't have to be consecutive. How can I do it?

Use regex flag and search for (word1[\s\S\n]*word2)|(word2[\s\S\n]*word1)
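The behaviour of this pattern can be checked outside VS Code. Below is a minimal Python sketch, assuming Python's re module handles this simple pattern the same way as the VS Code search engine; the helper name contains_both and the sample strings are made up for illustration.

```python
import re

# Either order: word1 ... word2 or word2 ... word1, with anything
# (including newlines) in between, since [\s\S\n] matches any character.
BOTH_WORDS = re.compile(r"(word1[\s\S\n]*word2)|(word2[\s\S\n]*word1)")


def contains_both(text: str) -> bool:
    """Return True if both words occur somewhere in the text, in either order."""
    return BOTH_WORDS.search(text) is not None


if __name__ == "__main__":
    print(contains_both("word2 on the first line\nand word1 on the second"))
    print(contains_both("only word1 here"))
```

The two alternatives are needed because a plain regex has no notion of "in any order": each branch fixes one ordering.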
I made a small extension based on @tonix's regex:
https://marketplace.visualstudio.com/items?itemName=usernamehw.search

Here is also a simple way for simple needs - use this as regex
(word1)|(word2)|(word3)
It may not cover some cases, but has been working fine for me, and easy to remember to type it in.

VSCode has an open issue to support multiple searches. You may want to get on there and push them a little.

To apply logical and
(?=.*word1)(?=.*word2)(?=.*word3)
To apply logical or
(word1)|(word2)|(word3)
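To see how the two patterns differ, here is a small Python sketch (an illustration only; the function name classify and the sample lines are assumptions, not part of VS Code). Note that `.` does not cross line breaks, so the AND pattern only succeeds when all three words are on the same line.

```python
import re

# Logical AND: every lookahead must succeed from the same starting position.
AND_PATTERN = re.compile(r"(?=.*word1)(?=.*word2)(?=.*word3)")
# Logical OR: any one alternative is enough.
OR_PATTERN = re.compile(r"(word1)|(word2)|(word3)")


def classify(text: str) -> str:
    """Report whether the text satisfies the AND pattern, only the OR pattern, or neither."""
    if AND_PATTERN.search(text):
        return "all"
    if OR_PATTERN.search(text):
        return "some"
    return "none"


if __name__ == "__main__":
    for sample in ("word3 then word1 then word2", "just word2", "nothing here"):
        print(sample, "->", classify(sample))
```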

For you guys,
if you want to search for multiple words (more than 2) at once in a single file and all the words must appear in the file at least once (logical AND), you can use the following regex which leverages lookahead assertions:
^(?=[\s\S\n]*(word1))(?=[\s\S\n]*(word2))(?=[\s\S\n]*(word3))(?=[\s\S\n]*(word4))[\s\S\n]*$
A global search with this pattern will only return all the files that contain word1 AND word2 AND word3 AND word4 in any order (e.g. word4 may appear at the beginning and/or word2 may appear at the end of the file).
I also wrote a little Python CLI helper which creates the regex automatically for you given the patterns you want to AND (though creating the regex by hand is pretty straightforward).
Copy the following code, paste it into a new file, and save it somewhere on your machine (I've called it regex_and_lookahead.py). Then make the file executable with chmod +x ./regex_and_lookahead.py. Important: I used Python 3.6; the f-string literal (the rf prefix in rf'(?=[\s\S\n]*({arg}))') won't work in earlier versions.
#!/usr/bin/env python3
from sys import argv

# Every command-line argument is a pattern that must appear somewhere in the file.
args = argv[1:]
regex = '^'
for arg in args:
    # One lookahead per pattern; the raw f-string keeps the backslashes literal.
    regex += rf'(?=[\s\S\n]*({arg}))'
regex += r'[\s\S\n]*$'
print(regex)
Usage:
./regex_and_lookahead.py word1 word2 word3 word4
Will generate the above regex. You can also use it to generate more complex regexes because each parameter can have regex characters in it!
As an example:
./regex_and_lookahead.py "pattern with space" "option1|option2" "\bword3\b" "(repeated pattern\.){6}"
Will generate the following regex:
^(?=[\s\S\n]*(pattern with space))(?=[\s\S\n]*(option1|option2))(?=[\s\S\n]*(\bword3\b))(?=[\s\S\n]*((repeated pattern\.){6}))[\s\S\n]*$
Which will match a file if and only if all of the following conditions are true:
There's at least one occurrence of the string pattern with space;
There's at least one occurrence of either option1 or option2;
There's at least one occurrence of the word word3 delimited by word boundary assertions;
There is at least one occurrence of the string repeated pattern. repeated 6 times (i.e.: repeated pattern.repeated pattern.repeated pattern.repeated pattern.repeated pattern.repeated pattern.).
As you can see, the sky is the only limit. Have fun!
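If you want to sanity-check a generated regex before pasting it into the search box, something like the following sketch may help. It assumes Python's re module behaves like the VS Code engine for these patterns; build_and_regex and file_matches are hypothetical helper names, not part of the script above.

```python
import re


def build_and_regex(*patterns: str) -> str:
    """Build the lookahead-based AND regex from the given sub-patterns."""
    lookaheads = "".join(rf"(?=[\s\S\n]*({p}))" for p in patterns)
    return "^" + lookaheads + r"[\s\S\n]*$"


def file_matches(text: str, *patterns: str) -> bool:
    """Return True if the whole text satisfies every sub-pattern."""
    return re.match(build_and_regex(*patterns), text) is not None


if __name__ == "__main__":
    content = "word4 comes first\nsome filler\nword2 comes last\n"
    print(build_and_regex("word2", "word4"))
    print(file_matches(content, "word2", "word4"))
    print(file_matches(content, "word2", "word5"))
```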

This is now supported: you can search for the first term, click "Open in editor", and then use Ctrl + F to search within the search results. Thanks @pushkin.

The Find and Transform extension (I am the author) makes it quite easy to run any number of sequential searches across files, using only the files from the previous search results for each later search.
There is a variable ${resultsFiles} that resolves to those previous search result files and can be used in the "filesToInclude" argument. Here is a sample keybinding:
{
  "key": "alt+b",
  "command": "runInSearchPanel",
  "args": {
    "find": ["first", "second"],
    "delay": 2000, // necessary to allow results to populate
    // delay may need to be longer if you are searching a lot of files
    "replace": ["", "knuckles"], // optional
    "filesToInclude": ["", "${resultsFiles}"],
    "filesToExclude": "Users\\Mark\\AppData\\Roaming\\Code\\User\\keybindings.json",
    "isRegex": true,
    // so that the first search will be triggered and produce results
    "triggerSearch": true,
    "triggerReplaceAll": [false, true] // optional
  }
}
"find": ["first", "second"], : search for first and then search for second
"filesToInclude": ["", "${resultsFiles}"], : clear the filesToInclude on the first search, on second search use the resultFiles from the first search
You can do as many sequential searches as you like
The finds can be regex's and as complex as you wish
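The idea of feeding each search only the files that survived the previous one can be sketched in a few lines of Python. This is only an illustration of the narrowing logic, not how the extension is implemented; sequential_search and the in-memory files dictionary are assumptions made for the example.

```python
import re
from typing import Dict, Iterable, List


def sequential_search(files: Dict[str, str], finds: Iterable[str]) -> List[str]:
    """Apply each regex in turn, keeping only files that matched all previous ones."""
    results = list(files)  # the first search looks at every file
    for pattern in finds:
        compiled = re.compile(pattern)
        # Later searches only look at files from the previous results.
        results = [name for name in results if compiled.search(files[name])]
    return results


if __name__ == "__main__":
    sample_files = {
        "a.txt": "first and second",
        "b.txt": "first only",
        "c.txt": "second only",
    }
    print(sequential_search(sample_files, ["first", "second"]))
```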

The original question asked for a single search that finds files containing two separate words. Below is what I do to find two (or more) words in the same file by using multiple searches:
1. Search like you normally do.
2. Click on "Open in editor".
3. Adjust the context line count. (The higher the context count, the more room there is to find the second term, but the more irrelevant results you bring in.)
4. Hit Cmd + F (or the equivalent if not on a Mac) and search there. In my case this narrowed it down to 53 hits, which I can step through manually until I find the one I want.
Need even more fine-tuning?
1. Follow steps 1 to 3 above.
2. Copy the contents to a file. (I saved mine to a file called haystack.ts.)
3. Search there for a third word. (In my case this narrowed it down to 7 hits.)

Try the Open New Search Editor command from the Command Palette. You can map it to any keybinding you'd like in the Keybindings editor; I mapped it to cmd+shift+i.
This is helpful for me! There is one more way: pressing the up/down arrow keys in the search editor moves through the search history, which is also useful.
It takes a small mental shift to accept that this is equivalent to having multiple search editors (what IntelliJ etc. provide), but without persistence.

Related

Regex to extract all strings from source code used when calling a function

We have an old, grown project with thousands of php files and need to clean it up.
Throughout the whole project we do have a lot of function calls similar to:
trans('somestring1');
trans("SomeString2");
trans('more_string',$somevar);
trans("anotherstring4",$somevar);
trans($tx_key);
trans($anotherKey,$somevar);
All of those are embedded into the code and represent translation keys. I would like to find a way to extract all "translation keys" in all occurrences.
The PHP project is in VS Code, so a RegEx Search would be helpful to list the results.
Or I could search through the project with any other tool you would recommend
However I would also need to "export" just the strings to a textfile or similar.
The ideal result would be:
somestring1
SomeString2
more_string
anotherstring4
$tx_key
$anotherKey
As a bonus - if someone knows, how I could get the above list including filename where the result has been found - that would be really fantastic!
Any help would be greatly appreciated!
Update:
The RegEx I came up with:
/(trans)+\([^\)]*\)(\.[^\)]*\))?/gim
lists the full occurrence. How can I get just the first part of the result (between single quotes, OR between double quotes, OR beginning with $)?
See here: regexr.com/548d4
Here are some steps to get exactly what you want. Using this you can do a find and replace on your search results!
So you could do sequential regex find/replaces in the right circumstances.
The replace can be just within the search results editor and not affect the underlying files at all - which is what you want.
You can also have the replace action actually edit the underlying files if you wish.
[Hint: This technique can also make doing a find item a / replace with b in files that contain term c much easier to do.]
(1) Open a new search editor: press Ctrl+Shift+P to open the Command Palette and run the Open New Search Editor command.
(That command is currently unbound to a keybinding.)
(2) Paste this regex into the Search input box (with the regex option .* selected):
`(.*?)(\btrans\(['"]?)([^,'")]+)(.*)` - a relatively simple regex
regex101 demo
See my other answer for a regex to work with up to 6 entries per line:
(\s*\d+:\s)?((.*?)(\btrans\(['"]?)([^,'")]*)((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?)(.*)
(3) You will get a list of files with the search results. Now open a Find widget (Ctrl+F) in this Search editor.
(4) Put the same regex into that Find input. Regex option selected. Put $3 into the Replace field. This only replaces in this Search editor - not the original files (although that can be done if you want it in some case). Replace All.
If using the 1-6 version regex, replace with:
$1$5 $9 $13 $17 $21 $25
(5) Voila. You can now save this Search Editor as a file.
The first answer works for one desired capture per line as in the original question. But that relatively simple regex won't work if there are two or more per line.
The regex below works for up to 6 entries per line, like
trans('somestring1');
stuff trans("SomeString2"); some content trans("SomeString2a");more stuff [repeat, repeat]
But it doesn't for 7+ - you'll need a regex guru for that.
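If leaving VS Code is an option, the per-line limit goes away entirely, because a scripting language can iterate over every match on a line. The following is a minimal Python sketch, not part of either answer: the regex and the name extract_keys are assumptions, and it only covers the three call shapes shown in the question (single-quoted, double-quoted, or a $variable as first argument). It also records the file name and line number, which the question asked for as a bonus.

```python
import re
from typing import List, Tuple

# First argument of trans(...): 'single quoted', "double quoted", or a $variable.
TRANS_CALL = re.compile(r"""\btrans\(\s*(?:'([^']*)'|"([^"]*)"|(\$\w+))""")


def extract_keys(filename: str, source: str) -> List[Tuple[str, int, str]]:
    """Return (filename, line number, key) for every trans(...) call in the source."""
    found = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in TRANS_CALL.finditer(line):
            # Exactly one of the three groups is set for each match.
            key = next(g for g in match.groups() if g is not None)
            found.append((filename, lineno, key))
    return found


if __name__ == "__main__":
    php = "trans('somestring1');\nstuff trans(\"SomeString2\"); trans($tx_key);\n"
    for name, lineno, key in extract_keys("demo.php", php):
        print(f"{name}:{lineno}: {key}")
```

To cover a whole project you could call extract_keys for each .php file (for example via pathlib.Path.rglob) and write the results to a text file.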
Here is the process again with a twist of using a snippet in the Search Editor instead of a Find/Replace. Using a snippet allows more control over the formatting of the final result.
(1) Open a new search editor: press Ctrl+Shift+P to open the Command Palette and run the Open New Search Editor command. (That command is currently unbound to a keybinding.)
(2) Paste this regex into the Search input box (with the regex option .* selected):
`((.*?)(\btrans\(['"]?)([^,'")]*)((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?((.*?)(\btrans\(['"]?)([^,'")]*))?)(.*)`
regex101 demo
(3) You will get a list of files with the search results. Now select all your results individually with Ctrl+Shift+L.
(4) Trigger this keybinding:
{
  "key": "alt+i", // whatever keybinding you like
  "command": "editor.action.insertSnippet",
  "when": "editorTextFocus",
  "args": {
    "snippet": "${TM_SELECTED_TEXT/((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*)((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*))?((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*))?((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*))?((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*))?((.*?)(\\btrans\\([\\'\\\"]?)([^,\\'\\\")]*))?)(.*)/$4${8:+\n }$8${12:+\n }$12${16:+\n }$16${20:+\n }$20${24:+\n }$24/g}"
  }
},
That snippet will be applied to each selection in your search result. This part ${8:+\n } is a conditional which adds a newline and some spaces if there is a capture group 8 - which would be a second trans(...) on a line.
Demo: (unfortunately, it doesn't properly show the Ctrl+Shift+L selecting all lines individually or the Alt+i snippet trigger)

Vim regex matching multiple results on the same line

I'm working on a project where I'm converting an implementation of a binary tree to an AVL tree, so I have a few files that contain lines like:
Tree<int>* p = new Tree<int>(*t);
all over the place. The goal I have in mind is to use a vim regex to turn all instances of the string Tree into the string AVLTree, so the line above would become:
AVLTree<int>* p = new AVLTree<int>(*t);
the regex I tried was :%s/Tree/AVLTree/g, but the result was:
AVLTree<int>* p = new Tree<int>(*t);
It looks to me like when vim finds something to replace on a line it jumps to the next line, so is there a way to match multiple strings on the same line? I realize that this can be accomplished with multiple regexes, so my question is mostly academic.
Credit on this one goes to Marth for pointing this out. My issue was with vim's gdefault. By default it's set to 'off', which means you need the /g tag to make your search global, which is what I wanted. I think mine was set to 'on', which means without the tag the search is global, but with the tag the search is not. I found this chart from :help 'gdefault' helpful:
command     'gdefault' on   'gdefault' off
:s///       subst. all      subst. one
:s///g      subst. one      subst. all
:s///gg     subst. all      subst. one
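The "subst. one" versus "subst. all" distinction in the chart is simply whether the substitution stops after the first match on a line. As an analogy only (this is Python's re module, not Vim, and the function name substitute is made up), the two behaviours look like this on the line from the question:

```python
import re

LINE = "Tree<int>* p = new Tree<int>(*t);"


def substitute(line: str, replace_all: bool) -> str:
    """Replace 'Tree' with 'AVLTree', either once or for every occurrence on the line."""
    # In re.sub, count=0 means "replace every match"; count=1 stops after the first.
    count = 0 if replace_all else 1
    return re.sub(r"Tree", "AVLTree", line, count=count)


if __name__ == "__main__":
    print(substitute(LINE, replace_all=False))  # like "subst. one"
    print(substitute(LINE, replace_all=True))   # like "subst. all"
```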

Search for multiple strings in several files with Sublime 3 using AND

This previous (similar) question of mine Search for multiple strings in several files with Sublime 3 was answered with a way to search for multiple strings in multiple files in SublimeText, using the regex OR operator:
Find: (string1|string2)
Where: <open folders>
This works perfectly for searching files where either string1 OR string2 is present. What I need now is to search in lots of files for both strings present. I.e., I need to use the AND operator.
I looked around this question Regular Expressions: Is there an AND operator? and also this one Regex AND operator and came up with the following recipes:
(?=string1)(?=string2)
(?=.*string1)(?=.*string2)
(string1 string2)
(string1\&string2)
but none of them work.
So the question is: how can I search multiple strings in several files at once with SublimeText?
(I'm using SublimeText 3103)
Add: the strings are not necessarily in the same line. They can be located anywhere within each file. For example, this file:
string1 dfgdfg d dfgdf
sadasd
asdasd
dfgdfg string2 dfgdfg
should trigger a match.
Open Sublime Text and press
Shift+Ctrl+F
or click the Find in Files option under the Files tab; the keyboard shortcut above opens the same panel.
When you click the ... button in that panel, you get options such as Add Folder, Add Open Files and Add Open Folders.
To search strings that occur in the same line
Use the following regex for your and operation
(?=.*string1)(?=.*string2)
I am using the following regex
(?=.*def)(?=.*s)\w+ <-- \w+ will help in understanding which line is matched(will see later)
and I am searching within current open files
Make sure the Use Buffer option is enabled (one just before Find). It will display the matches in a new file. Also make sure the Show Context (one just before Use Buffer) option is enabled. This will display the exact line that matches. Now Click on Find on the right side.
In the output I get, the matched line has a different background color from its context lines (in my case line 1316 was the match and line 1315 was context, in the designation file).
A total of 6 files were open when I ran this regex.
For finding strings anywhere in file
Use
(?=[\s\S]*string1)(?=[\s\S]*string2)[\s\S]+
but it will kill Sublime as the number of lines increases.
If there are only two words that you need to find, the following will be much faster than the pattern above:
(\bstring1\b[\S\s]*\bstring2\b)|(\bstring2\b[\S\s]*\bstring1\b)
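The difference between the same-line and anywhere-in-file patterns can be checked against the sample file from the question. This is a small Python sketch, assuming Python's re module treats these patterns like Sublime's engine does; the constant names are made up for the example.

```python
import re

# The sample file from the question: the two strings are on different lines.
SAMPLE = "string1 dfgdfg d dfgdf\nsadasd\nasdasd\ndfgdfg string2 dfgdfg\n"

# '.' stops at line breaks, so both strings must share a line.
SAME_LINE = re.compile(r"(?=.*string1)(?=.*string2)")
# [\s\S] also matches line breaks, so the strings can be anywhere in the file.
ANYWHERE = re.compile(r"(?=[\s\S]*string1)(?=[\s\S]*string2)[\s\S]+")
# Two explicit orderings; cheaper than lookaheads but only practical for two words.
TWO_WORDS = re.compile(
    r"(\bstring1\b[\S\s]*\bstring2\b)|(\bstring2\b[\S\s]*\bstring1\b)"
)

if __name__ == "__main__":
    print("same line:", SAME_LINE.search(SAMPLE) is not None)
    print("anywhere: ", ANYWHERE.search(SAMPLE) is not None)
    print("two words:", TWO_WORDS.search(SAMPLE) is not None)
```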

Compare files and return only the differences using Notepad++

Notepad++ has a Compare Plugin tool for comparing text files, which operates like this:
Launch Notepad++ and open the two files you wish to run a comparison check on.
Click the “Plugins” menu.
Select “Compare” and click “Compare.”
The plugin will run a comparison check and display the two files side by side, with any differences in the text highlighted.
This is a nice feature, which I have used happily for some time. Now I am looking for an option to go further and select only the highlighted differing lines (e.g. by deleting the non-highlighted ones), or vice versa, i.e. expunge the highlighted lines.
Is there a straightforward way to achieve this?
To subtract two files in Notepad++ (file1 - file2) you may follow this procedure:
Recommended: If possible, remove duplicates on both files, especially if the files are big. To do this: Edit => Line operations => Sort Lines Lexicographically Ascending (do it on both files)
Add ---------------------------- as a footer on file1 (add at least 10 dashes). This is the marker line that separates file1 content from file2.
Then copy the contents of file2 to the end of file1 (after the marker)
Control + H
Search: (?m-s)^(?:-{10,}+\R[\s\S]*+|(.*+)\R(?=(?:(?!^-{10,}$)-++|[^-]*+)*+^-{10,}+\R(?:^.*+\R)*?\1(?:\R|\z))) note: use case sensitivity according to your needs
Replace by: (leave empty)
Select Regular expression radio button
Replace All
You can modify the marker if file1/file2 might contain lines equal to the marker. In that case you will have to adapt the regular expression.
By the way, you could even record a macro to do all the steps (add the marker, switch to file2, copy its content to file1, apply the regex) with a single button press.
Edited:
Changed the regex to add some improvements:
Speed related:
Avoid as much backtracking as possible
Avoid searching after the mark
Usability:
Dashes are allowed for the lines. But the separator is still ^-{10,}$
Works with other characters besides words
Speed comparison:
New method vs Old method
So basically 78ms vs 1.6seconds. So a nice improvement! That makes comparing Kilobyte-sized files possible.
Still, you may want to use a dedicated program for comparing or subtracting bigger files.
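As one example of doing the subtraction outside Notepad++, here is a minimal Python sketch of "file1 - file2" on a line basis. It is an assumption about what you need (exact, case-sensitive line comparison, original order kept), and subtract_lines is a made-up name, not an existing tool.

```python
from typing import Iterable, List


def subtract_lines(file1_lines: Iterable[str], file2_lines: Iterable[str]) -> List[str]:
    """Return the lines of file1 that do not appear anywhere in file2."""
    exclude = set(file2_lines)  # set lookup keeps this fast for big files
    return [line for line in file1_lines if line not in exclude]


if __name__ == "__main__":
    first = ["apple", "banana", "cherry", "banana"]
    second = ["banana", "date"]
    print(subtract_lines(first, second))
```

For real files you would read each one with something like open(path).read().splitlines() before passing the lines in.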
If the number of differences is not large, a quicker method might be just bookmarking each differing line using keyboard shortcuts. Starting from the beginning of the file, press Alt+Page Down to focus on the first difference, and then press Ctrl+F2 to bookmark it. Continue with alternatingly pressing Alt+Page Down and Ctrl+F2 until the last difference.
With all the differing lines bookmarked, you can use any of the operations under "Search -> Bookmarks" menu:
Cut Bookmarked Lines
Copy Bookmarked Lines
Paste to (Replace) Bookmarked Lines
Remove Bookmarked Lines
Remove Unmarked Lines
I have a dirty workaround for this. It saves some time compared to Control+C, Alt+Tab, Control+V; Control+C, Alt+Tab, Control+V; ... but it may not be worth it on big files or if there are many differences between the two files. For bigger files you may prefer using some other tool.
Typically this works best when comparing groups of 'words' and does not work with content that is already indented with tabs (like source code).
So the workaround is:
Optional: (depends on the content that's being compared) Sort both files (it will make the future comparison easier) To do this: Edit => Line operations => Sort Lines Lexicographically Ascending (do it on both files)
Compare files with the plugin
Choose one file and inspect the lines you want to keep. Add one tab before each of those lines. Remember that you can select several lines and press Tab to indent them all at once. Alternatively, you may add tabs to the lines you want to remove.
Sort the file. The tabulated lines will come up first. So now you can copy-paste them (or copy-paste the untabulated ones)
Move the files to a Linux box and then execute the diff command:
$ diff file1.txt file2.txt > file_diff.txt
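If no Linux box is at hand, a similar result can be had with Python's standard difflib module. This is a minimal sketch rather than a drop-in replacement: it produces unified-format output (like diff -u), and the function name diff_text and the default file names are assumptions for the example.

```python
import difflib


def diff_text(a: str, b: str, name_a: str = "file1.txt", name_b: str = "file2.txt") -> str:
    """Return a unified diff of two texts, or an empty string if they are identical."""
    lines = difflib.unified_diff(
        a.splitlines(keepends=True),
        b.splitlines(keepends=True),
        fromfile=name_a,
        tofile=name_b,
    )
    return "".join(lines)


if __name__ == "__main__":
    print(diff_text("alpha\nbeta\n", "alpha\ngamma\n"), end="")
```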

EditPad: Need a regex that handles multiple possible data formats

First, I'm using EditPadPro for my regex cleaning, so any answers given should work within that environment.
I get a large spreadsheet full of data that I have to clean every day. I've managed to get it down to a couple of different regexes that I run, and this works... but I'm curious to see if it's possible to reduce down to a single regex.
Here is some sample data:
3-CPC_114851_70095_70095_CAN-bre
3-CPC_114851_70095_70095_CAN
b11-ao1-113775-bre
b7-ao-114441
b7-ao-114441-bre
b7-ao1-114441
b7-ao1-114441-bre
http://go.nlvid.com/results1/?http://bo
go.nlv/results1/?click
b4-sm-1359
b6-sm-1356-bre
1359_195_1453814569-bre
1356_104_1456856729
b15-rad-8905
b15-rad-8905-bre
Here is how the above data needs to end up:
114851-bre
114851
113775-bre
114441
114441-bre
114441
114441-bre
http://go.nlvid.com/results1/
go.nlv/results1/
sm-1359
sm-1356-bre
sm-1359-bre
sm-1356
rad-8905
rad-8905-bre
So, there are numerous rules, such as:
In cases of more than 2 underscores, the result needs to contain only the value immediately after the first underscore, and everything from the dash onwards.
In cases where the string contains "-ao-", "-ao1-", everything prior to the final numeric string should be removed.
If a question mark is present, everything from the mark onwards should be removed.
If the string contains "-sm-" or "-rad-", everything prior to those alpha strings should be removed.
If the string contains 2 underscores, everything after the first numeric string up to a dash (if present) should be removed, and the string "sm-" should be prepended.
Additionally there is other data that must be left untouched, including but not limited to:
113535|24905|24905
as well as many variations on this pattern of xxxxxx|yyyyy|zzzzz (and not always those string lengths)
This may be asking way too much of regex, I'm not sure as I'm not great with it. But I've seen some pretty impressive things done with it, so I thought I'd put this out to the community and see what you come back with.
Jonathan, I can wrap all of those into one regex, except the last one (where you prepend sm- to a string that does not contain sm). It is not possible in this context, because we cannot capture "sm" to reuse in the replacement, and because there is no "conditional replacement" syntax in EPP.
That being said, you can achieve what you want in EPP with two regexes and one macro to chain the two.
Here is how.
The solution below is tested in EPP.
Regex 1
Press Ctrl + Shift + F to enter Search / Replace mode
Enter the following Search and Replace in the appropriate boxes
At the top right of the Search bar, click the Favorite Searches pull-down, select "Add", give it a name, e.g. Regex 1
Search:
(?mx)^
(?=(?:[^_\r\n]*?_){3})[^_\r\n]+?_([^_\r\n]+)[^-\r\n]+(-[^\r\n]+)?
|
[^\r\n]*?-ao1?-\D*([^\r\n]+)
|
([^\r\n?]*)(?=\?)[^\r\n]+
|
[^\r\n]*?-((?:sm|rad)-[^\r\n]+)
Replace:
\1\2\3\4\5
Regex 2
Same 1-2-3 steps as above.
Search
^(?!(?:[^_\r\n]*?_){3})(?=(?:[^_\r\n]*?_){2})(\d+)(?:[^-\r\n]+(-[^\r\n]+)?)
Replace
sm-\1\2
Chaining Regex 1 and Regex 2
Top menu: Macros, Record Macro, give it a name.
Click the Favorite searches pulldown, select Regex 1
Hit Replace All.
Click the Favorite searches pulldown, select Regex 2
Hit Replace All.
Macros, Stop recording.
Whenever you want to do your sequence of replacements, pull it by name under the Macros menu.
Testing This
I have tested my "Jonathan macro" on your input. Here is the result:
114851-bre
114851
113775-bre
114441
114441-bre
114441
114441-bre
http://go.nlvid.com/results1/
go.nlv/results1/
sm-1359
sm-1356-bre
sm-1359-bre
sm-1356
rad-8905
rad-8905-bre
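For comparison, the same rules can be written as ordinary code outside EditPad Pro, where the conditional "prepend sm-" step is not a problem. The following Python sketch is my reading of the rules in the question, not a translation of the two regexes above; the function name clean is made up, and strings that match no rule (such as 113535|24905|24905) are returned unchanged.

```python
import re


def clean(value: str) -> str:
    """Apply the cleaning rules from the question to a single value."""
    # Rule: a question mark removes everything from the mark onwards.
    if "?" in value:
        return value.split("?", 1)[0]

    parts = value.split("_")
    if len(parts) >= 3:  # two or more underscores
        tail = "_".join(parts[1:])
        dash = tail.find("-")
        suffix = tail[dash:] if dash != -1 else ""
        if len(parts) >= 4:
            # More than 2 underscores: value after the first underscore + dash suffix.
            return parts[1] + suffix
        # Exactly 2 underscores: first numeric string + dash suffix, with "sm-" prepended.
        return "sm-" + parts[0] + suffix

    # Rule: "-ao-" / "-ao1-" keeps only what follows the marker.
    ao = re.search(r"-ao1?-(.*)$", value)
    if ao:
        return ao.group(1)

    # Rule: "-sm-" / "-rad-" drops everything before the alpha string.
    tagged = re.search(r"-((?:sm|rad)-.*)$", value)
    if tagged:
        return tagged.group(1)

    return value  # anything else is left untouched


if __name__ == "__main__":
    for raw in ("3-CPC_114851_70095_70095_CAN-bre", "b7-ao1-114441", "1356_104_1456856729"):
        print(raw, "->", clean(raw))
```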
Try this:
Toggle the Search Panel : SHIFT+CTRL+F
SEARCH: .*?((?:sm-|rad-)?(?:(?:\d+|[\w\.]+\/.*?))(?:-\w+)?$)
REPLACE: $1
Check REGEX and WORDS
Click Replace All or Hit CTRL+ALT+F3