Visual Studio 2010 regular expressions for 'Find In Files' - regex

I have looked at many Stack Overflow posts concerning VS regular expressions and read the Microsoft page on regular expressions, but I still cannot determine where I am going wrong.
Microsoft VS regex
I want to find all lines which include the word, attribute, but which are not comment lines (do not contain the // symbol).
I have tried using the regular expression
~(^ *//).*attribute.*
meaning:
~(^ *//) --> exclude lines which begin with '//' preceded by zero or more spaces
.* --> match any character zero or more times
attribute --> match the word attribute
.* --> match any character that comes after the word attribute
I have tried several other regular expressions with about the same amount of failure. I am wondering if anyone can spot something obvious that I am not doing.
I also gave the below a try:
~( *//).*attribute.* (thinking maybe the caret was being taken as a literal instead of special)
~(//).*attribute.* (thinking maybe the * was being taken as a literal instead of special)
~(//)attribute (imminent failure but will try anything)
\s*~(//).*attributes.*
I saw quite a few posts suggesting to use the find command in batch. This can be done, but I would prefer to have the ability to double click on the findings so that the file will be opened and already scrolled to the correct location.

How about this one.
^(?=.*attribute.*\n)(?!.*//).*
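The lookahead idea in this answer can be tried outside the IDE. The following is only a sketch in Python's re module, not Visual Studio's own syntax (Visual Studio 2010 uses its own dialect, so this is an assumption about a lookahead-capable engine). It also drops the \n from the first lookahead so that a last line without a trailing newline can still match. The sample text is made up for illustration.

```python
import re

# Match whole lines that contain "attribute" but no "//" anywhere on the line.
# re.MULTILINE makes ^ and $ work per line.
PATTERN = re.compile(r"^(?=.*attribute)(?!.*//).*$", re.MULTILINE)

# Hypothetical sample source, not taken from the question.
source = "\n".join([
    "[attribute Serializable]",
    "// attribute in a comment",
    "int x = 1;",
    "    foo.attribute = 2; // trailing comment",
    "  bar(attribute);",
])

matches = PATTERN.findall(source)
for line in matches:
    print(line)
```

Note that (?!.*//) rejects any line containing //, including code lines with a trailing comment, which follows the "do not contain the // symbol" wording of the question.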

Related

Regex: Replace double double quotes (solved), but only in lines that contain a special string (subcondition unsolved)

1. Summary of the problem
I have a csv file where I want to replace normal quotes in text with typographic ones.
It was hard (because HTML is also included), but I have meanwhile created a good regex expression that does just the right thing: in three "capturing groups" I find the left and right quotation marks and the text inside. Replacing then is a piece of cake.
2. Regex engine
I can use the regex engine of Notepad++ (Boost) or a PCRE2-compatible one. For developing and testing purposes I have used https://regex101.com.
3. What I'm having a hard time with and just can't get right, where I need your help is here:
I want to add a sub condition so that the text in quotes is found only in certain lines. These lines are identified by the language, e.g. ENGLISH or FRENCH (see also the example in the screenshot).
Screenshot of a sample
The string indicating the language is always in the same line before the text to be found, BUT only the text in quotes (main condition) should be marked after matching the sub condition, so that I will be able to replace them.
It is about a few thousand records in the csv file, in the worst case I could also replace it manually. But I'm pretty sure that this should also work via regex.
4. What I have tried
Different approaches with look arounds and non-capturing groups didn't lead me to the desired result - possibly because I didn't really understand how they work.
An example can be found here: https://regex101.com/r/ketwwm/1
It contains only the regex that matches and marks the (three) groups, WITHOUT the sub condition I am looking for:
("")([^<>]*?)("")(?=(?:[^>]*?(?:<|$)))
Hopefully anyone in the community could help? (Hopefully I have not missed anything, it's my first post here )
5. Update 03/18/2022: Almost resolved with two slightly different approaches (thank you all!) What is still unsolved ..
Solution of @Thefourthbird (see answer 1)
^(?!.*?"ENGLISH")[^"]*".*(*SKIP)(*F)|("")([^<>]*?)("")(?=(?:[^>]*?(?:<|$)))
Nearly perfect, just missing matches in an HTML section. HTML sections in the csv file are always enclosed by double quotes and may have line feeds (LF). https://regex101.com/r/x5shnx/1
Solution of @Wiktor Stribiżew (see in comments below)
^.*?"ENGLISH".*?\K("")([^<>]*?)("")(?=(?:[^>]*?(?:<|$)))
The same with matches in HTML sections, see above. Plus: Doesn't match text in double double quotes if more than one such entry occurs within a text. https://regex101.com/r/I4NTdb/1
Screenshot (only to illustrate)
If you want to match multiple occurrences, you can use (*SKIP)(*F) to skip all lines that do not start with "FRENCH":
^"(?!FRENCH")[^"]*".*(*SKIP)(*F)|("")([^<>]*?)("")(?=(?:[^>]*?(?:<|$)))
The pattern matches:
^ Start of string
" Match literally
(?!FRENCH") Negative lookahead, assert not FRENCH" directly to the right
[^"]*" Match any char except " and match "
.*(*SKIP)(*F) Match the rest of the line and skip it
| Or
("")([^<>]*?)("")(?=(?:[^>]*?(?:<|$))) Your current pattern
Regex demo
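For engines without the (*SKIP)(*F) verbs, the same effect can be approximated in two steps: pick the lines by language first, then run the quote pattern from the question on those lines only. The sketch below does this with Python's re module, which is a swapped-in technique and not the PCRE pattern above. The function name, the sample rows and the choice of curly quotes as replacement are assumptions for illustration. It works line by line, so HTML fields that span several lines are not handled.

```python
import re

# The quote pattern from the question: group 2 is the text between "" and "".
QUOTED = re.compile(r'("")([^<>]*?)("")(?=(?:[^>]*?(?:<|$)))')

def replace_quotes(csv_text, language="FRENCH"):
    """Replace ""text"" with curly quotes, only on lines starting with "<language>"."""
    out = []
    for line in csv_text.split("\n"):
        if line.startswith('"' + language + '"'):
            line = QUOTED.sub("\u201c\\2\u201d", line)
        out.append(line)
    return "\n".join(out)

# Hypothetical sample rows, not taken from the original CSV file.
sample = (
    '"FRENCH","Il a dit ""bonjour"" hier"\n'
    '"GERMAN","Er sagte ""hallo"" gestern"'
)
result = replace_quotes(sample)
print(result)
```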

RegEx Expression for Eclipse that searches for all items that have not been dealt with

To help stop SQL Injection attacks, I am going through about 2000 parameter requests in my code to validate them. I validate them by determining what type of value (e.g. integer, double) they should return and then applying a function to them to sanitize the value.
Any requests I have dealt with look like this
*SecurityIssues.*(request.getParameter
where * signifies any number of characters on the same line.
What RegExp expression can I use in the Eclipse search (CTRL+H) which will help me search for all the ones I have not yet dealt with, i.e. all the times that the text request.getParameter appears when it is not preceded by the word SecurityIssues?
Examples for matches
The regular expression should match each of the following e.g.
int companyNo = StringFunctions.StringToInt(request.getParameter("COMPANY_NO"))
double percentage = StringFunctions.StringToDouble(request.getParameter("MARKETSHARE"))
int c = request.getParameter("DUMMY")
But should not match:
int companyNo = SecurityIssues.StringToIntCompany(request.getParameter("COMPANY_NO"))
With inspiration and the links provided by @michaeak (thank you), as well as testing in https://regex101.com/ I appear to have found the answer:
^((?!SecurityIssues).)*(request\.getParameter)
The advantage of this answer is that I can blacklist the word SecurityIssues, as opposed to having to whitelist the formats that I do want.
Note that it is relatively slow, and it slowed down my computer a lot while the search was running.
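The pattern can be checked against the example lines from the question. This is a sketch using Python's re module rather than Eclipse's Java regex engine. The constructs used here (negative lookahead, groups, escaped dot) behave the same in both as far as I know, but that is an assumption.

```python
import re

# Every character before request.getParameter must not start "SecurityIssues".
UNHANDLED = re.compile(r"^((?!SecurityIssues).)*(request\.getParameter)")

lines = [
    'int companyNo = StringFunctions.StringToInt(request.getParameter("COMPANY_NO"))',
    'double percentage = StringFunctions.StringToDouble(request.getParameter("MARKETSHARE"))',
    'int c = request.getParameter("DUMMY")',
    'int companyNo = SecurityIssues.StringToIntCompany(request.getParameter("COMPANY_NO"))',
]

unhandled = [line for line in lines if UNHANDLED.search(line)]
for line in unhandled:
    print(line)
```

The lookahead is evaluated once per character, which is one plausible reason for the slowness mentioned above.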
Try e.g.
=\s*?((?!SecurityIssues).)*?(request\.getParameter)\(
Notes
Parentheses ( and ) are special characters used for grouping. To match them literally, escape them with \.
.* is greedy: it matches as much as it can, including characters that you did not want it to consume. The reluctant form .*? matches as few characters as possible. This can be helpful if other items need to match after the wildcard.
There is a tutorial at https://docs.oracle.com/javase/tutorial/essential/regex/index.html , I think all of these should be available in eclipse. You can then deal with generic replacement also.
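The difference between the greedy and the reluctant wildcard from the notes above is easy to see on a short string. This is a sketch in Python's re module (the sample string is made up); Java's regex engine treats .* and .*? the same way.

```python
import re

text = "f(a) + g(b)"

# Greedy: .* runs to the end of the line and backs off to the LAST ")".
greedy = re.search(r"\(.*\)", text).group()

# Reluctant: .*? stops at the FIRST ")" that lets the pattern succeed.
reluctant = re.search(r"\(.*?\)", text).group()

print(greedy)
print(reluctant)
```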
Problem
From reading Regular expression that doesn't contain certain string and Regular expression to match a line that doesn't contain a word? it seems quite difficult to create a regex matching anything but not to contain a certain word.

Wildcard in Word 2013 to match zero or more whitespaces

What is the analog of regular expression's * modifier in Word 2013 wildcards?
In Word 2013 Find tool with wildcards enabled, apparently 0 is not a valid number as the number of matches. For example, if you type in the search box
fe{1,2}d
it will match fed and feed. However,
fe{0,2}d
will just produce an error message. What is the correct expression to match fd, fed, feed, feeed, etc.?
My motivation is to match a specific text when it is alone in a paragraph (i.e., surrounded by paragraph marks ^13) but possibly followed by whitespace:
^13hello world {0,}^13
which just produces an error message. I did not find any solution without enabling wildcards, but even with wildcards enabled I can't get it working.
Similarly,
^13hello world @^13
matches one or more spaces, but I need zero or more.
I don't believe Word has ever had an equivalent for the zero-or-more operator, so while I haven't checked in Word 2013, I wouldn't expect to see it there either. (This page is old, but as far as I know it's still pretty authoritative on wildcard searching in Word: http://word.mvps.org/faqs/general/usingwildcards.htm)
In general, I would suggest doing two searches, one without the character and one using the 1-or-more operator.
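The reason two searches are enough is that "zero or more" is just "none" combined with "one or more". Word's wildcard engine cannot be scripted here, so the sketch below only illustrates that equivalence with Python's re module; the paragraph list is made up.

```python
import re

# Hypothetical paragraphs, each already split at the paragraph marks.
paragraphs = ["hello world", "hello world  ", "hello world!", "say hello world"]

# Search 1: the text with nothing after it.
# Search 2: the text followed by one or more spaces.
two_searches = [
    p for p in paragraphs
    if re.fullmatch(r"hello world", p) or re.fullmatch(r"hello world +", p)
]

# What a zero-or-more operator would have found in a single search.
single_search = [p for p in paragraphs if re.fullmatch(r"hello world *", p)]

print(two_searches == single_search)
```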
ETA: Removed bad wildcard search.

Find and Replace with Regex in Microsoft Word 2013

I am editing an e-book document with a lot of unnecessary markup. I have a number of sections in the text with code similar to this:
<i>Some text here</i>
I am trying to run a regex find and replace that will find any phrase between the two i-tags, remove the i-tags, and apply a style to the text.
Here is what I'm using to search:
Find: (<i>)(*)(</i>)
Replace: \2
I'm also selecting Styles > i (for italic). This tells our conversion software to apply italics to the text. If I leave the i-tags, what ends up happening is ScribeNet's conversion process converts them to hex-values so that they show up as literal text in the e-book. Messy.
When I run this search, I get no results. I have "use wildcards" checked. What am I missing? According to Microsoft's help website, * is used to represent any number or type of characters, and individual strings are supposed to be enclosed in parentheses.
To search for a character that's defined as a wildcard, place a backslash (\) before that character. The * itself matches any string of characters, so add the range quantifier {1,} to require one or more characters:
Find: \<i\>(*{1,})\</i\>
Replace: \1
Search for \<i\>(*{1,})\</i\> and replace with \1. Don't forget to check "Use wildcards".
There is a reference table for Word's "regular expressions" here: http://office.microsoft.com/en-ca/word-help/find-and-replace-text-by-using-regular-expressions-advanced-HA102350661.aspx
< and > are special characters that need to be escaped
* means any character
{1,} means one or more times
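For comparison, this is what the same find and replace looks like in a standard regex engine. The sketch uses Python's re module, not Word, and the function name and sample sentence are made up. In standard syntax < and > need no escaping, and .+? plays the role of Word's *{1,}. Applying a Word style to the result is outside what this shows.

```python
import re

def strip_italic_tags(text):
    """Remove <i>...</i> tags and keep the text between them."""
    return re.sub(r"<i>(.+?)</i>", r"\1", text)

cleaned = strip_italic_tags("A <i>Some text here</i> and <i>more</i>.")
print(cleaned)
```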
There is a special tool for Microsoft Word called Multiple Find & Replace (see http://www.translatortools.net/products/transtoolsplus/word-multiplefindreplace) which lets you work around Word's wildcard limitations. This tool can use the standard regular expression syntax to search and replace any text within a Word document. For example, to search for any HTML tags, you can just use <[^>]+> which will find opening, closing and standalone HTML tags. You can add any number of expressions to a list and then search the document for all of them, replace everything, see all matches for all the search expressions entered, replace only selected matches, and a few more things.
I created it for translators and editors, but it is great for any advanced search/replace operations in Word, and I am sure you will find it very useful.
Stanislav

Visual Studio Find and Replace Regular Expressions help

I'd like to replace some assignment statements like:
int someNum = txtSomeNum.Text;
int anotherNum = txtAnotherNum.Text;
with
int someNum = Int32.Parse(txtSomeNum.Text);
int anotherNum = Int32.Parse(txtAnotherNum.Text);
Is there a good way to do this with Visual Studio's Find and Replace, using Regular Expressions? I'm not sure what the Regular expression would be.
I think in Visual Studio, you can mark expressions with curly braces {txtSomeNum.Text}. Then in the replacement, you can refer to it with \1. So the replacement line would be something like Int32.Parse(\1).
Update: via @Timothy003
VS 11 does away with the {} \1 syntax and uses () $1
Comprehensive guide
http://blog.goyello.com/2009/08/22/do-it-like-a-pro-%E2%80%93-visual-studio-find-and-replace/
This is what I was looking for:
Find: = {.*\.Text}
Replace: = Int32.Parse(\1)
Better regex for the original problem would be
find expr.: {:i\.Text}
replace expr.: Int32.Parse(\1)
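Both find/replace pairs above can be tried in a standard regex engine. The sketch below uses Python's re module, where groups are written with ( ) instead of the { } of older Visual Studio versions. The IDENT pattern is my approximation of what :i stands for and should be treated as an assumption.

```python
import re

source = (
    "int someNum = txtSomeNum.Text;\n"
    "int anotherNum = txtAnotherNum.Text;"
)

# Analogue of "Find: = {.*\.Text}" with "Replace: = Int32.Parse(\1)".
broad = re.sub(r"= (.*\.Text)", r"= Int32.Parse(\1)", source)

# Analogue of "{:i\.Text}": IDENT approximates :i (a C-style identifier).
IDENT = r"[A-Za-z_$][A-Za-z0-9_$]*"
narrow = re.sub(r"(" + IDENT + r"\.Text)", r"Int32.Parse(\1)", source)

print(broad)
print(narrow == broad)
```

The second form does not depend on the "= " prefix, so it would also wrap a .Text access that appears elsewhere in an expression.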
Check out:
http://msdn.microsoft.com/en-us/library/2k3te2cs%28v=vs.100%29.aspx
for the definitive guide to regex in VS.
I recently completed reformatting another programmer's C++ project from hell. He had completely and arbitrarily entered, or left out at random, spaces and tabs, indentation (or not), and an insane level of parentheses nesting, such that none of us used to coding standards of any type could even begin to read the code before I started. Used regex extensively to find and correct abnormal constructs. In a couple of hours, I was able to correct major problems in approximately 125,000 lines of code without actually looking at most of them. In one particular single find/replace I changed more than 22,000 lines of code in 125 files, total time under 10 seconds.
Particularly useful constructs in the regex:
:b+ == one or more blanks and/or tabs.
:i == matches a C-style variable name or keyword (e.g. while, if, pick3, bNotImportant)
:Wh == a whitespace char.; not just blank or tab
:Sm == any of the arithmetic symbols (+, -, >, =, etc.)
:Pu == any punctuation mark
\n == line break (useful for finding where he had inserted 8 or 10 blank lines)
^ == matches start of line ($ to match end)
While it would have been nice to match some other regex standard (duh), I did find a number of the MS extensions extremely useful for searching a code base, such as not having to define 'identifier' hundreds of times as "[A-Za-z0-9]+", instead just using ":i".