Search eclipse workspace where string not starting with - regex

In the Eclipse workspace search (Ctrl+H), what regular expression could be used to find occurrences of a word that are not preceded by two forward slashes (//), in other words, that are not commented out?
For example, //var_dump and // var_dump should not be matched, but var_dump should be.

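The following regex in the File Search tab would do it:
^(?:(?!//).)*var_dump
If you only care about var_dump as the first word on a line, this is enough:
^[ \t]*var_dump
The (?!//) part is a negative lookahead: it succeeds only at positions where the text that follows is not the contents of the brackets. Because it is checked before every character consumed, the match can never cross a //, so var_dump is found only when no // precedes it on the line.
A plain negative lookbehind such as (?<!//)[ \t]*var_dump looks simpler, but it still matches // var_dump: the match can start at the v, where the two preceding characters are "/ " rather than "//".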
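To sanity-check a pattern like this before running it across a workspace, it can help to try it on a few sample lines. The sketch below uses Python's re module as a stand-in for Eclipse's Java regex engine (lookarounds behave the same way for these patterns), and the sample lines are made up. It compares a plain lookbehind with a pattern that refuses any // earlier on the line.

```python
import re

# Plain negative lookbehind: rejects the match only when "//" sits directly before it.
LOOKBEHIND = re.compile(r"(?<!//)[ \t]*var_dump")
# Stricter variant: no "//" may appear anywhere between line start and var_dump.
NO_SLASHES_BEFORE = re.compile(r"^(?:(?!//).)*var_dump")

# Hypothetical sample lines, one per case worth checking.
samples = [
    "var_dump($x);",
    "//var_dump($x);",
    "// var_dump($x);",
    "$y = 1; // var_dump($y);",
    "/* var_dump($x); */",
]

for line in samples:
    lookbehind_hit = bool(LOOKBEHIND.search(line))
    strict_hit = bool(NO_SLASHES_BEFORE.search(line))
    print(f"{line!r:30} lookbehind={lookbehind_hit} strict={strict_hit}")
```

Both patterns still report the /* ... */ line, which is the case where a syntax-aware search does better.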
However, if you are looking for variables that are not commented rather than not preceded by the specific combo "//", you might be better off using a syntax-aware search, which depends on the language used. For example, this regex will still match /* var_dump */, which you may or may not want to happen. A syntax-aware search would know this is commented.
If you look in the Search window, you will see a "Java Search" tab and tabs for other plugins you might have (I have "C/C++ Search" and "Git Search", for example). In these tabs, you can choose to search only for functions, variables, classes, or whatever else makes sense in that language.
A further thing you might want is to hit Ctrl-Shift-G on a word and it will list all the references to this object in your code - very useful when trying to track down what is calling your function or using your variable.

If you're looking for a variable name (or the name of any other kind of Java element), you could use the Java search instead of the File search. You don't have regular expression support, but the search knows you're looking for a variable/class name and so you don't have to worry about surrounding syntax.

Related

Sigil editor: Regex string to look for a (hyphen) character in text, but not html attributes

My problem:
I use Sigil to edit xhtml files of an ebook.
When exporting from InDesign to ePub I tick option to remove forced line breaks.
This act removes all - hyphen characters which are auto-generated by InDesign, but the characters which were added manually during my word-break fine-tune remain in the text.
Current ability of Sigil search: searching by - parses everything, including css class names.
TODO: How to construct regex query which finds the - within the text, but not in the html code?
Thank you!
What I have already tried: https://www.mobileread.com/forums/showpost.php?p=4099971&postcount=169:
Here is a simple example that finds the word "title" when it is not inside a tag itself. It is the simplest regex search I could think of off the top of my head. It assumes there is no bare text in the body tag and that the xhtml is well formed.
I tried it and it appears to work. There are probably better more exhaustive regex, that can handle even broken xhtml.
Code:
title(?=[^>]*<)
This basically says search for "title" but lookahead to make sure there are no closing tag chars ">" before you find the next opening tag char "<".
There are probably lookbehind versions that could work with reverse logic. And there are ways to use regex to find two strings while ignoring any intervening tags.
Give it a try. You could add a saved search easily to do that. But again it will not handle find and replacement of text that crosses over elements (over nodes in the tree). That is the hard part unless you have one to one corresponding matching of matching substrings to replacement substrings which in general need not be the case.
And of course if you use < and > inside strings to show a "tag" or code snippet, these would be found by mistake so reviewing each find before the replace would be needed.
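As a rough illustration of how that lookahead behaves, here is a small Python sketch. Python's re module is used only for convenience (Sigil itself uses PCRE), and the sample markup is invented for the example.

```python
import re

# "title" followed by any run of non-">" characters and then a "<":
# this holds in text nodes, but not inside a tag, where ">" comes first.
TEXT_TITLE = re.compile(r"title(?=[^>]*<)")

# Hypothetical well-formed snippet: one "title" in an attribute, one in the text.
sample = '<p class="title">The title page</p>'

# Collect the start offset of every match.
matches = [m.start() for m in TEXT_TITLE.finditer(sample)]
print(matches)
```

Only the occurrence in "The title page" is reported; the one inside class="title" is rejected because a ">" appears before the next "<".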
In Sigil, PCRE regex engine is used.
Thus, you can use
<[^<>]*>(*SKIP)(*F)|-
See the regex demo.
Details:
<[^<>]*>(*SKIP)(*F) - matches <, zero or more chars other than < and > and then a >, and then skips the match and goes on to search for the next match from the position where the failure occurred
| - or
- - a hyphen.
NOTE: you might want to match any dashes with [\p{Pd}\x{00AD}] (to replace with -).
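Python's standard re module does not support the (*SKIP)(*F) verbs, so the pattern above cannot be pasted into it unchanged. The same "consume the tags, keep the rest" idea can be approximated with an alternation plus a capture group, as in this sketch. This is a swapped-in technique, not what Sigil runs; the function name mark_text_hyphens and the sample markup are hypothetical.

```python
import re

# First alternative swallows a whole tag; second captures a hyphen outside tags.
TAG_OR_HYPHEN = re.compile(r"<[^<>]*>|(-)")


def mark_text_hyphens(xhtml: str, marker: str = "[-]") -> str:
    """Replace hyphens in text nodes with a marker, leaving tags untouched."""
    def repl(match: re.Match) -> str:
        # Group 1 is set only when the hyphen alternative matched.
        return marker if match.group(1) else match.group(0)

    return TAG_OR_HYPHEN.sub(repl, xhtml)


sample = '<p class="no-indent">co-operate</p>'
print(mark_text_hyphens(sample))
```

The hyphen in the class name survives because the whole tag is matched (and put back unchanged) before the hyphen alternative gets a chance to see it.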

Regex: Find word that does not have a different target anywhere ahead of it in the file?

How do I find a word that does not have a specific target text anywhere ahead of it in a file?
Let's say I want to find "setting3" not preceded by a right square bracket "]" (which denotes a header). The following file would fail that test, due to [header]:
[header] #can be named anything
setting1=True
setting2=various setting values, can include any type of text
setting3=1
But a file with an orphaned setting should be a match:
setting3=1
Lookbehinds won’t work, because I may have arbitrary settings in between the header and the text I'm looking for. Because my terms span multiple lines, it makes it trickier.
For context, this is to set a rule with a tool that only offers one regex line (Ansible, which I think uses Python's regex engine). I don't believe I have access to special settings (global, etc.)
You might be able to use the following regular expression :
^[^[]*setting3=1
It matches from the start of the file up to the setting you're looking for, but only across characters that aren't [. This guarantees that the setting is matched only if it wasn't preceded by a header, since headers contain [.
Note that this could miss some settings that should be matched, in particular if comments preceding the setting contain a [ character.
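Since the question mentions Python's regex engine, the behaviour can be checked with a short Python sketch. The file contents below are invented samples, and the [ inside the character class is escaped purely for readability; the pattern is otherwise the one given above.

```python
import re

# Same idea as ^[^[]*setting3=1, with the "[" escaped inside the class.
# Without re.MULTILINE, ^ anchors only at the very start of the text,
# and the negated class happily spans newlines.
ORPHAN_SETTING = re.compile(r"^[^\[]*setting3=1")

# Hypothetical file contents for the three cases discussed.
with_header = "[header] #can be named anything\nsetting1=True\nsetting3=1\n"
orphaned = "setting1=True\nsetting3=1\n"
bracket_in_comment = "# see [docs]\nsetting3=1\n"

for name, content in [
    ("with_header", with_header),
    ("orphaned", orphaned),
    ("bracket_in_comment", bracket_in_comment),
]:
    print(name, bool(ORPHAN_SETTING.search(content)))
```

The third sample shows the caveat above: a [ in a comment blocks the match even though no header precedes the setting.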

Vim: Incsearch for replace queries

I noticed I could use regex functions with search in vim, and I could see hilights while I typed by setting incsearch. But that didn't work for search and replace queries like this one:
:%s/std::regex\s\([_a-zA-Z]*\)(/regex_t \1 = dregc(/gc
That one surprised me when it actually worked.
Are there settings or plugins for vim that, like incsearch but better, will highlight your replace query as you type? Just highlighting the matches would be pretty neat, but putting the old and new strings next to each other in different highlighting colors would be a godsend, because I might not be sure about the backreference.
Not a direct answer to your question, but traditionally in Vim you craft your search regex first, as in:
/regex
Then you hit enter to execute it. The settings :set hlsearch and :set incsearch make this easy to see visually. Then you can just do:
:%s//replace1
With no search specified, :%s (substitute acting on %, a shortcut meaning all lines in the file) will use the last search term you specified.
Going one step further, you could then do
:%s/~/replace2
Which replaces your last substitution (in this case, replace1) with replace2.
Unrelated, it may be useful for you to put this in your .vimrc:
set gdefault
Which will make all replaces global by default, so you don't need the /g flag after every :%s command.
You might be looking for vim-over?
This is a plugin that: (to clarify, let's say we're doing :%s/foo/bar/g.) i) highlights matches for substitutions in the buffer (foo) and optionally ii) previews what's after replacement (bar).

EditPad: How to replace multiple search criteria with multiple values?

I did some searching and found tons of questions about multiple replacements with Regex, but I'm working in EditPadPro and so need a solution that works with the regex syntax of that environment. Hoping someone has some pointers as I haven't been able to work out the solution on my own.
Additional disclaimer: I suck with regex. I mean really... it's bad. Like I barely know wtf I'm doing. So that being said, here is what I need to do and how I'm currently approaching it...
I need to replace two possible values, with their corresponding replacements. My two searches are:
(.*)-sm
(.*)-rad
Currently I run these separately and replace each with simple strings:
sm
rad
Basically I need to lop off anything that comes prior to "sm" so I just detect everything up to and including sm, and then replace it all with that string (and likewise for "rad").
But it seems like there should be a way to do this in a single search/replace operation. I can do the search part fine with:
(.*)-sm|(.*)-rad
But then how to replace each with its matching value? That's where I'm stuck. I tried:
sm|rad
but alas, that just becomes the literal complete string that is used for replacement.
Jonathan, first off let me congratulate you for using EPP Pro for regex in your text. It's my main text editor, and the main reason I chose it, as a regex lover, is that its support of regex syntax is vastly superior to competing editors. For instance Notepad++ is known for its shoddy support of regular expressions. The reason of course is that EPP's author Jan Goyvaerts is the author of the legendary RegexBuddy.
A picture is worth a thousand words... So here is how I would do your replacement. Just hit the "replace all button". The expression in the regex box assumes that anything before the dash that is not a whitespace character can be stripped, so if this is not what you want, we need to tune it.
Search for:
(.*)-(sm|rad)
Now, when you put something in parentheses in a regex, whatever it matches is stored in a temporary variable. So whatever matched (.*) is stored in \1 and whatever matched (sm|rad) is stored in \2. Therefore, you want to replace with:
\2
Note that the replacement variable may be different depending on what programming language you are using. In Perl, for example, I would have to use $2 instead.
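For comparison, here is the same search and replace as a Python sketch. Python's re.sub accepts \2 in the replacement string, like the EditPad form above; the input lines are made-up examples.

```python
import re

# Group 1 is everything before the dash; group 2 is the suffix to keep.
STRIP_PREFIX = re.compile(r"(.*)-(sm|rad)")

# Hypothetical input, one value per line ("." does not cross newlines).
text = "icon-large-sm\nborder-rad"

# Replace each whole match with just the captured suffix.
result = STRIP_PREFIX.sub(r"\2", text)
print(result)
```

Because (.*) is greedy, everything up to the last -sm or -rad on a line is dropped, which is what "lop off anything that comes prior" asks for.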

How many backslashes are required to escape regexps in emacs' "Customize" mode?

I'm trying to use emacs' customize-group packages to tweak some parts of my setup, and I'm stymied. I see things like this in my .emacs file after I make changes with customize:
'(tramp-backup-directory-alist (quote (("\\\\`.*\\\\'" . "~/.emacs.d/autobackups"))))
This was the result of putting the following into the customize text field:
Regexp matching filename: \\`.*\\'
This is a representative sample: I'm actually trying to change several things that want a regexp, and they all show this same problem. How many layers of quoting are there, really? I can't seem to find the magic number of backslashes to get the gosh-dang thing to do what I'm asking it to, even for the simplest regular expressions like .*. Right now, the given customization produces - nothing. It makes no change from emacs' default behavior.
Better yet, where on earth is this documented? It's a little difficult to Google for, but I've been trying quite a few things there as well as in the official documentation and the Emacs wiki. Where is an authoritative source for how many dang backslashes one needs to make a regular expression in customize-mode actually work - or at the very least, to fail with some kind of warning instead of failing silently?
EDIT: As so often happens with questions asked in anger, I was asking the wrong question. Fortunately the answers below led me to the answer to the question that I needed, which was about quoting rules. I'm going to try to write down what I learned here, because I find the documentation and Googleable resources to be maddeningly obscure about this. So here are the quoting rules I found by trial and error, and I hope that they help someone else, inspire correction, or both.
When an emacs customize-mode buffer asks you for a "Regexp matching filename", it is being, as emacs often is, both terse and idiosyncratic (how often the creator's personality is imparted to the creation!). It means, for one thing, a regexp that will be compared to the whole path of the file in search of a match, not just to the name of the file itself as you might assume from the term "filename". This is the same sense of "filename" used in emacs' buffer-file-name function, for example.
Further, although if you put foo in the field, you'll see "foo" (with double-quotes) written to the actual file, that's not enough quoting and not the right quoting. You need to quote your regexp with the quoting style that, as far as I can tell, only emacs uses: the `backtick-foo-single-quote' scheme. And then you need to escape that, making it \`backslash-backtick-foo-backslash-single-quote\' (and if you think that's a headache to input in Markdown, it's more so in emacs).
On top of this, emacs appears to have a rule that the . regexp special character does not match a / at the beginning of filenames, so, as was happening to me above, the classic .* pattern will appear to match nothing: to match "all files", you actually need the regexp /.*, which then you stuff into the quote format of customize-mode to produce \`/.*\', after which customize paints another layer of escaping onto it and writes it to the customization file.
The final result for one of my efforts - a setting such that #autosave# files don't gunk up the directory you're working in, but instead all live in one place:
(custom-set-variables
 '(auto-save-file-name-transforms (quote (
   ("\\`/[^/]*:\\([^/]*/\\)*\\([^/]*\\)\\'" "~/.emacs.d/autobackups/\\2" t)
   ("\\`/.*/\\(.*?\\)\\'" "~/.emacs.d/autobackups/\\1" t)
 ))))
Backslashes in elisp are a far greater threat to your sanity than parentheses.
EDIT 2: Time for me to be wrong again. I finally found the relevant documentation (through reading another Stack Overflow question, of course!): Regexp Backslash Constructs. The crucial point of confusion for me: the backtick and single quote are not quoting in this context: they're the equivalent of perl's ^ and $ special characters. The backslash-backtick construct matches an empty string anchored at the beginning of the string being checked for a match, and the backslash-single-quote construct matches the empty string at the end of the string-under-consideration. And by "string under consideration," I mean "buffer, which just happens to contain only a file path in this case, but you need to match the whole dang thing if you want a match at all, since this is elisp's global regexp behavior."
Swear to god, it's like dealing with an alien civilization.
EDIT 3: In order to avoid confusing future readers -
\` is the emacs regex for "the beginning of the buffer." (cf Perl's \A)
\' is the emacs regex for "the end of the buffer." (cf Perl's \Z)
^ is the common-idiom regex for "the beginning of the line." It can be used in emacs.
$ is the common-idiom regex for "the end of the line." It can be used in emacs.
Because regex searches across multi-line bodies of text are more common in emacs than elsewhere (e.g. M-x occur), emacs has the backtick and single-quote special characters. As best as I can tell, they're used in the context of customize-mode because generic unknown input to a customize-mode field could contain newlines, so the beginning and end of the input are not guaranteed to be the beginning and end of a line. That is why you want the beginning-of-buffer and end-of-buffer special characters there.
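The difference between whole-text anchors and line anchors is easy to see on a two-line input. The sketch below uses Python, which has no \` or \' constructs; \A and \Z are used as stand-ins on the assumption that they play the same role here, and the sample text is invented.

```python
import re

# Hypothetical two-line input, standing in for a buffer that contains a newline.
text = "first line\nsecond line"

# Line anchors match at every line boundary when MULTILINE is on.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)

# Whole-text anchors match only at the very start and very end,
# regardless of MULTILINE.
text_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)
text_end = re.findall(r"\w+\Z", text, flags=re.MULTILINE)

print(line_starts, line_ends)
print(text_start, text_end)
```

With input that may contain newlines, only the whole-text anchors guarantee that the pattern is tied to the actual beginning and end of the value.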
I am not sure whether to regret hijacking my own Stack Overflow question and essentially turning it into a blog post.
In the customize field, you'd enter the regexp according to the syntax described here. When customize writes the regexp into a string, any backslashes or double-quote chars in the regexp will be escaped, as per regular string escaping conventions.
So in short, just enter single backslashes in the regexp field, and they'll get correctly doubled up in the resulting custom-set-variables clause written to your .emacs.
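The doubling can be mimicked in a few lines of Python. This is only a sketch of the string-escaping convention described above, not Emacs's own code, and the function name as_elisp_string is hypothetical.

```python
def as_elisp_string(regexp: str) -> str:
    """Write a regexp as a double-quoted string literal, escaping \\ and \"."""
    escaped = regexp.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Single backslashes typed in the field come out doubled in the file.
typed_once = r"\`.*\'"
print(as_elisp_string(typed_once))

# Typing them already doubled yields four in the file, as in the question.
typed_twice = r"\\`.*\\'"
print(as_elisp_string(typed_twice))
```

The second call reproduces the four-backslash form from the question, which is what you get when the backslashes were already doubled by hand in the field.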
Also: since your regexp is for matching filenames, you might try opening up a directory containing files you'd like to match, and then run M-x re-builder RET. You can then enter the regexp in string-escaped format to confirm that it matches those files. By typing % m in a dired buffer, you can enter a regexp in unescaped format (ie. just like in the customize field), and dired will mark matching filenames.