I have a bunch of column names from a SQL query, and I want to get rid of everything before the AS using Emacs. In other words, I want to go from
MAX(CASE WHEN maintenance.work_order IS NULL THEN 1 ELSE 0 END) AS Has_work_order,
to
Has_work_order,
I used re-builder to create a simple regex: "\.\*AS " which highlights the appropriate parts of the buffer. However, when I select the entire buffer and run query-replace-regexp using M-x query-replace-regexp <RET> "\.\*AS " <RET> "" <RET>, Emacs displays a Replaced 0 occurrences message.
What am I doing wrong?
By using re-builder (which is a good idea) to create a regexp for interactive use, you are then getting confused between the different regexp syntax options. re-builder defaults to read syntax (which you would use when writing elisp code), whereas for interactive use you want string syntax.
Refer to Why do regular expressions created with the regex builder use syntax different from the interactive regular expressions? for explanation and clarification.
In read syntax, \.\*AS represents the regexp .*AS (because . and * are not special when reading strings, so those backslashes are redundant); but in string syntax \.\*AS is the regexp \.\*AS in which the . and * characters which are special to regexps have been escaped, and therefore lose their special meaning, and will instead match literal . and * characters in the text.
Note, however, that when entering a regexp interactively you should not include the surrounding double-quote characters " that are present in re-builder even for its string syntax mode. If you enter the " characters interactively, then the regexp will be matching text that contains those " characters.
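The difference between the two forms is easy to check outside Emacs. The following is a minimal Python sketch, not Emacs code: Python's re module happens to treat these two particular patterns the same way Emacs does in string syntax, and the helper name strip_before_as is made up for illustration.

```python
import re

LINE = "MAX(CASE WHEN maintenance.work_order IS NULL THEN 1 ELSE 0 END) AS Has_work_order,"

def strip_before_as(line, pattern):
    """Delete the first match of `pattern` from `line`."""
    return re.sub(pattern, "", line, count=1)

# Unescaped: "." and "*" act as regexp operators, so this matches
# everything up to and including "AS ".
print(strip_before_as(LINE, r".*AS "))

# Escaped: this only matches a literal ".*AS ", which the line does not
# contain, so nothing is replaced.
print(strip_before_as(LINE, r"\.\*AS "))
```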
I was able to do this with the following:
M-x query-replace-regexp <RET> \(.+\)AS <RET> <RET>
Note that the cursor must be above the line(s) that need replacing. I've not used this before, but it's interactive: you press 'y' for each replacement. It may be possible to do this automatically/globally, but I've not played around with it.
I need to be able to handle data that can look like:
set setting1 "bind button_x +actionslot1;bind button_y \" bind button_x +stance \" "
bind button_a jump
set setting2 1 1 0 1
toggle setting_3 " \"value 1\" \"value 2\" \"value 3\" "
These are what some of the commands for the console of a game look like, and I'm trying to write an emulator of sorts that will interpret the code the same way the game will.
The first thing that comes to mind is regex, but I'm not sure it's the best option. For example, when matching the value of a setting, I might try something like /set [\w_]+ "?(.+)"?/, but the wildcard matches the ending quote because it's not lazy. If I make it lazy, it stops at a quote inside the value. If I keep it greedy and stop it from matching quotes, it won't match the escaped quotes in the values.
Even if there are possible regex solutions, they seem like the wrong option. I had asked before about how programs like Visual Studio and Notepad++ know which parentheses and curly braces matched, and I was told there was something similar to regex in some ways but much more powerful.
The only other thing I can think of is to go through the lines of code character by character and use booleans to track the state of the current character.
What are my options here? What do game developers use to handle console commands?
edit: Here's another possible command which strongly deters me from using regex:
set setting4 "bind button_a \" bind button_b "\" set setting1 0 \" " \" "
The commands include not just escaped quotes, but quotes of the manner "\" inside escaped quotes.
I would suggest you read about Lexical Analysis, which is the process of tokenizing your text using a grammar.
I think it will help you with what you are trying to do.
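As a rough illustration of what a hand-written lexer for these commands could look like, here is a minimal Python sketch of the character-by-character approach. It is built on assumptions rather than on the game's real rules: tokens are separated by whitespace, a bare " opens or closes a quoted token, and \" inside any token stands for a literal quote. The function name tokenize is hypothetical, and the sketch does not attempt the nested "\" form from the edited question.

```python
def tokenize(line):
    """Split one console line into tokens (assumed rules, see above)."""
    tokens = []
    current = []          # characters of the token being built
    in_quotes = False     # True while inside an unescaped "..." pair
    has_token = False     # True once the current token has started
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line) and line[i + 1] == '"':
            # Escaped quote: keep a literal " and skip both characters.
            current.append('"')
            has_token = True
            i += 2
            continue
        if ch == '"':
            # Unescaped quote toggles quoted mode; "" yields an empty token.
            in_quotes = not in_quotes
            has_token = True
        elif ch.isspace() and not in_quotes:
            # Whitespace outside quotes ends the current token, if any.
            if has_token:
                tokens.append("".join(current))
                current = []
                has_token = False
        else:
            current.append(ch)
            has_token = True
        i += 1
    if in_quotes:
        raise ValueError("unterminated quote")
    if has_token:
        tokens.append("".join(current))
    return tokens


print(tokenize('bind button_a jump'))
print(tokenize(r'toggle setting_3 " \"value 1\" \"value 2\" \"value 3\" "'))
```

A quoted token that itself contains commands can be passed back through tokenize once it is time to execute it, which is one way to handle the nested bind examples.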
I don't want to keep you on the path of regex -- you are correct that there are non-regex solutions that may be more appropriate (I just don't know what they are). However, here is one possible regex that should fix your quotes issue:
/set [\w_]+ "?((\\"|[^"])+)"?/
I changed .+ to (\\"|[^"])+. Basically it's matching occurrences of \" OR of anything that isn't a quote. In other words, it will match anything except quotes that aren't escaped.
Again, if someone can suggest a more sophisticated non-regex solution, you should strongly consider it.
Edit: The updated example you've provided breaks this solution, and I think it would break any regex solution.
Edit 2: Here is a C# string version of your regex. It uses @ to tell the compiler to treat the string as a verbatim literal, which means it ignores \ as an escape character. The only caveat is that in order to represent " in a verbatim literal you have to type it as "", but it's still better than having slashes everywhere. Given the prevalence of escape sequences in regexes, I recommend using verbatim literals anywhere that you have to type a regex in a string.
string pattern = @"set [\w_]+ ""?((\\""|[^""])+)""?";
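For comparison, here is the same pattern as a Python raw string, which plays the same role as a C# verbatim literal. This is only a sketch to show what the pattern captures on the simpler example lines; the helper name setting_value is made up for illustration.

```python
import re

# The pattern from this answer; the r prefix keeps backslashes literal.
PATTERN = re.compile(r'set [\w_]+ "?((\\"|[^"])+)"?')

def setting_value(line):
    """Return the captured value of a `set` command, or None if no match."""
    match = PATTERN.match(line)
    return match.group(1) if match else None


print(setting_value('set setting2 1 1 0 1'))
print(setting_value(
    r'set setting1 "bind button_x +actionslot1;bind button_y \" bind button_x +stance \" "'
))
```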
I'd appreciate some tips on this problem: I'm working in LaTeX with very messy code generated by writer2latex (quite a good programme, anyway) and, using Emacs, I'm trying to query-replace multiple lines of code, for instance:
{\centering [Warning: Image ignored] % Unhandled or unsupported graphics:
%\includegraphics[width=11.104cm,height=8.23cm]{img34}
have to become:
\begin{figure}[tpb]
\begin{center}
\includegraphics[width=\textwidth]{img34}
Using M-x re-builder, I found out that I could underline the whole region I need to query-replace with the string: \{.*centering.*c-qc-j.*cm] but, if I M-x replace-regexp using this, I only get: Invalid regexp: "Invalid content of \\{\\}"
Any suggestion about how to perform the query? I have a HUGE amount of lines like these to replace... :-)
You're getting this error message because in Emacs' regular expressions the backslash-escaped curly braces \{ and \} have special meaning. These braces are used to specify that the part of the regexp immediately before the braces should be matched a certain number of times.
From the GNU Emacs documentation on regexps:
\{n\}
is a postfix operator specifying n repetitions [...]
\{n,m\}
is a postfix operator specifying between n and m repetitions [...]
If you want your regexp to actually match a curly brace, do not escape it with a leading backslash:
{.*centering.*C-q C-j.*cm]
In order to use a backslash in the replacement string you have to escape it with another backslash. (When doing this in code, it quickly becomes quite ugly because inside a double-quoted string backslashes themselves have to be escaped already. However, since you are doing your replacements interactively, the double escaping is not necessary and thus two backslashes are enough.)
M-C-% {.*centering.*C-q C-j.*cm] RET \\begin{figure}[tpb]C-q C-j\\begin{center}C-q C-j\\includegraphics[width=\\textwidth] RET
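With a huge number of such lines, it can also help to try the replacement on a sample outside Emacs first. Below is a minimal Python sketch of the same two-line match and replacement. Note that the brace escaping is the other way round from Emacs: in Python's re module \{ is a literal brace. The names PATTERN, REPLACEMENT and convert are made up for illustration.

```python
import re

# "{...centering..." on one line, then everything up to the last "cm]"
# on the next line. "." does not cross newlines, so "\n" is explicit.
PATTERN = re.compile(r"\{.*centering.*\n.*cm\]")

REPLACEMENT = (
    "\\begin{figure}[tpb]\n"
    "\\begin{center}\n"
    "\\includegraphics[width=\\textwidth]"
)

def convert(text):
    """Rewrite every writer2latex image stub found in `text`."""
    # A function replacement avoids backslash processing in the
    # replacement string.
    return PATTERN.sub(lambda _match: REPLACEMENT, text)


sample = (
    "{\\centering [Warning: Image ignored] % Unhandled or unsupported graphics:\n"
    "%\\includegraphics[width=11.104cm,height=8.23cm]{img34}\n"
)
print(convert(sample))
```

The {img34} argument is left untouched because the match stops at cm], just as in the interactive version.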
Make sure the re-builder syntax is "read" (you can change it with C-c TAB). Remove the initial backslash. The regexp should then work if you yank it into replace-regexp.
I am aware of nano's search and replace functionality, but is it capable of using regular expressions for matching and substitution (particularly substitutions that use a part of the match)? If so, can you provide some examples of the syntax used (both for matching and replacing)?
I cut my teeth on Perl-style regular expressions, but I've found that text editors will sometimes come up with their own syntax.
My version of nano has an option to switch to regex search with the meta key + R. In Cygwin on Windows, the meta key is Alt, so I hit Ctrl+\ to get into search-and-replace mode, and then Alt+R to switch to regex search.
You need to add, or un-comment, the following entry in your global nanorc file (on my machine, it was /etc/nanorc):
set regexp
Then fire up a new terminal, press CTRL + \ and do your replacements, which should now be regex-aware.
EDIT
Search for conf->(\S+)
Replace with \1_conf
Press a to replace all occurrences.
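To see what that search and replacement pair does to a line, here is a small Python sketch of the same substitution. It is only an illustration: Python's re module is not the regex engine nano uses, although this particular pattern and backreference behave the same way in both. The sample input is invented.

```python
import re

def move_conf_suffix(text):
    """Rewrite conf->NAME as NAME_conf everywhere in `text`."""
    # (\S+) captures the run of non-space characters after "conf->";
    # \1 puts that captured text back in the replacement.
    return re.sub(r"conf->(\S+)", r"\1_conf", text)


print(move_conf_suffix("x = conf->timeout + conf->retries"))
```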
nano uses POSIX Extended Regular Expressions, the notation also used by egrep and sed -r. This includes the metacharacters ., [ and ], ^, $, (, ), \1 to \9, *, { and }, ?, +, |, and character classes like [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:].
For more complete documentation, see the manual page: man 7 regex on Linux or man 7 re_format on OS X. This page may give you the same information as well: https://en.wikipedia.org/wiki/Regular_expression#POSIX_basic_and_extended
Unfortunately, in nano there seems to be no way to match anything that spans multiple lines.
This is a bit old, just updating the search index.
Nano 5.5 uses the ASCII column from this same table.
Thanks to @S P Arif Sahari Wibowo,
I found the answer here anyway (same wiki link):
https://en.wikipedia.org/wiki/Regular_expression#POSIX_basic_and_extended
I was recently faced with the problem of inserting text at the beginning of every line that started with a numerical digit. The only way to distinguish these lines from text I didn't want to change was the preceding newline.
Playing around with the information provided in this answer I was able to do it and decided to add it to the answer in case somebody else faces the same situation.
To search for the beginning of the line followed by a number and then insert "Text String" at the beginning of each line that starts with a number:
Press Ctrl+\, then enter "(^[0-9])" and press Return; then enter "Text String \1" and press Return. Select yes for the first match and, if it does what you want, press a to replace all the rest. Omit the " quotation marks.
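The same idea can be sketched in Python to show why the captured digit has to be put back with a backreference: the group consumes the first digit of the line, so a replacement without \1 would delete it. This is an illustration only, not nano's engine, and the function name prefix_numbered_lines is made up.

```python
import re

def prefix_numbered_lines(text, prefix="Text String "):
    """Insert `prefix` before every line that starts with a digit."""
    # re.MULTILINE makes ^ match at the start of each line, not just
    # at the start of the whole string. A function replacement keeps
    # `prefix` literal and re-inserts the captured digit.
    return re.sub(
        r"(^[0-9])",
        lambda match: prefix + match.group(1),
        text,
        flags=re.MULTILINE,
    )


print(prefix_numbered_lines("1 apple\nbanana\n2 cherries"))
```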
How do you do a query-replace-regexp in Emacs that will match across multiple lines?
As a trivial example, I'd want <p>\(.*?\)</p> to match
<p>foo
bar
</p>
M-x re-builder
is your friend. And it led me to this regular expression:
"<p>\\(.\\|\n\\)*</p>"
which is the string version of
<p>\(.\|^J\)*</p> ;# where you enter ^J by C-q C-j
And that works for me when I do re-search-forward, but not when I do 'query-replace-regexp. Unsure why...
Now, when doing a 're-search-forward (aka C-u C-s), you can type M-% which will prompt you for a replacement (as of Emacs 22). So, you can use that to do your search and replace with the above regexp.
Note that the above regexp will match until the last </p> found in the buffer, which is probably not what you want, so use re-builder to build a regexp that comes closer to what you want. Obviously regular expressions can't count parentheses, so you're on your own for that; it depends on how robust a solution you want.
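The greedy behaviour described above is easy to reproduce. Here is a minimal Python sketch (Python syntax, so the group and alternation are written without backslashes) comparing the greedy form with a non-greedy one on a two-paragraph sample; the sample text is invented for illustration.

```python
import re

TEXT = "<p>foo\nbar\n</p>\n<p>baz</p>"

# Greedy: (.|\n)* runs on to the LAST </p> in the text.
GREEDY = re.compile(r"<p>((?:.|\n)*)</p>")

# Non-greedy: *? stops at the first </p> after each <p>.
LAZY = re.compile(r"<p>((?:.|\n)*?)</p>")

print(GREEDY.findall(TEXT))
print(LAZY.findall(TEXT))
```

Emacs regexps also support the non-greedy *? operator, so <p>\(.\|^J\)*?</p> is the corresponding interactive form.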
Try character classes. As long as you're using only ASCII character set, you can use [[:ascii:]] instead of the dot. Using the longer [[:ascii:][:nonascii:]] ought to work for everything.
Suppose I have the following in a text file (or dired buffer) open in Emacs:
file_01.txt
file_02.txt
...
file_99.txt
I want to query-replace or replace the file names so that they become
01_file.txt, etc.
I want to use query-replace-regexp or replace-regexp, but don't know what to put in. For the search part I put in "file_..", but the ".." are read as literal periods in the replacement string. I'm beginning to learn regexps and don't know how to do this. Please help, thanks.
M-x replace-regexp invokes the function to replace with regular expressions.
For Replace regexp enter: \(file\)_\([0-9]+\)
This will create two groups, one that matches the 'file' part, and one that matches the number. The escaped parentheses \( ... \) are necessary to make the match available later in the replacement string.
For Replace with enter: \2_\1
This inserts the second match from the search string (the numeric part), adds the _ (underscore) and then adds the first match from the search string (the 'file').
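If you want to check the group numbering before running it on a real buffer, the same swap can be sketched in Python. Python writes groups as plain ( ... ) rather than \( ... \), but the \1 and \2 backreferences in the replacement work the same way. The function name swap_name_and_number is made up for illustration.

```python
import re

def swap_name_and_number(text):
    """Turn file_NN into NN_file everywhere in `text`."""
    # Group 1 is the literal "file", group 2 is the run of digits.
    return re.sub(r"(file)_([0-9]+)", r"\2_\1", text)


print(swap_name_and_number("file_01.txt\nfile_02.txt\nfile_99.txt"))
```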
For more information on Emacs' regular expressions, see Regexp Syntax and Regexp Replace.
Once you have mastered the regexp basics you might want to check out the Emacs ReBuilder tool with M-x re-builder, which lets you build regexes interactively.