Emacs Lisp and non-deterministic regexes

I've been spending too much time lately trying to debug some auto-complete-mode functionality in Emacs. The following search appears to be non-deterministic and has left me utterly confused.
(re-search-backward "\\(\\sw\\|\\s_\\|\\s\\.\\|\\s\\\\|[##|]\\)\\=")
The command is called in a while loop, searching backwards from the current point to find the full "word" that should be autocompleted. For reference, the actual code.
A bit of background and my investigations
I have been trying to set up autocompletion for JavaScript, using Slime to connect to a Node.js backend.
Autocomplete inside a Slime REPL connected to a Node.js backend is perfect.
Autocomplete inside a js2-mode buffer, connected to Slime, fails to look up completions from Slime and falls back to the words already in the buffer.
I've tracked this down to Slime's slime-beginning-of-symbol function.
Assume that I'm trying to complete fs.ch, where fs has been required and is already in scope, and the point is located after the h character.
In the slime repl buffer the beginning function moves the point all of the way back until it hits whitespace and matches fs.ch.
In the js2-mode buffer the beginning function moves the point only to the dot character and matches only ch.
Reproducing the problem
I've been testing this by evaluating (re-search-backward "\\(\\sw\\|\\s_\\|\\s\\.\\|\\s\\\\|[##|]\\)\\=") repeatedly in various buffers. In every example, the buffer contains fs.ch, the point starts at the end of the line, and it moves backwards until the search fails.
In the scratch buffer, the point ends on the c.
In the Slime REPL, the point ends on the f.
In the js2-mode buffer, the point ends on the c.
In an emacs-lisp-mode buffer, the point ends on the f.
I have no idea why this is happening
I'm going to assume that there's something in these modes that either sets or unsets a global regex var that then has this effect, but so far I've been unable to find or implicate anything.
I even tracked this down to the emacs c code, but at that point realised that I was in completely over my head and decided to ask for help.
Help?

You should replace \\s\\. with \\s. in your regexp.
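To see why that one-character change matters, here is a minimal sketch. In an Emacs regexp, \s takes the next character as a syntax class code, so the Lisp string "\\s\\." reads as "a character with escape syntax, followed by any character", while "\\s." reads as "a character with punctuation syntax". The helper names my/regex-demo-table and my/matches-p are made up for this illustration, and the syntax table is built explicitly so the result does not depend on any major mode.

```elisp
;; Hypothetical helpers, for illustration only.
(defun my/regex-demo-table ()
  "Return a syntax table where dot is punctuation and backslash is an escape."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?. "." table)    ; "." = punctuation class
    (modify-syntax-entry ?\\ "\\" table)  ; "\" = escape class
    table))

(defun my/matches-p (regexp string)
  "Return t if REGEXP matches somewhere in STRING under the demo table."
  (with-syntax-table (my/regex-demo-table)
    (and (string-match-p regexp string) t)))

;; Punctuation class: a lone dot matches.
(my/matches-p "\\s." ".")
;; Escape class followed by any character: a lone dot does not match ...
(my/matches-p "\\s\\." ".")
;; ... but a backslash followed by another character does.
(my/matches-p "\\s\\." "\\n")
```

So the original regexp never asked for punctuation at all, which is why the dot was only matched in buffers whose syntax table gives it some other matching class.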

I "fixed" the problem by redefining the source that gets added to auto complete's ac-sources.
I'm still learning my way around elisp so this is likely the most hack-like way of achieving what I need, but it works.
I changed the regex from:
\\(\\sw\\|\\s_\\|\\s\\.\\|\\s\\\\|[##|]\\)\\=
to
\\(\\sw\\|\\s_\\|\\s.\\|\\s\\\\|[##|]\\)\\=
(note the change of \\s\\.\\ to \\s.\\).
And then overrode the auto-complete setup in my init.el. (I'll probably find a hundred ways to refine this when I actually know elisp).
(defun js-slime-beginning-of-symbol ()
  "Move to the beginning of the CL-style symbol at point."
  (while (re-search-backward "\\(\\sw\\|\\s_\\|\\s.\\|\\s\\\\|[##|]\\)\\="
                             (when (> (point) 2000) (- (point) 2000))
                             t))
  (re-search-forward "\\=#[-+.<|]" nil t)
  (when (and (looking-at "#") (eq (char-before) ?\,))
    (forward-char)))
(defun js-slime-symbol-start-pos ()
  "Return the starting position of the symbol under point.
The result is unspecified if there isn't a symbol under the point."
  (save-excursion (js-slime-beginning-of-symbol) (point)))
(defvar ac-js-source-slime-simple
  '((init . ac-slime-init)
    (candidates . ac-source-slime-simple-candidates)
    (candidate-face . ac-slime-menu-face)
    (selection-face . ac-slime-selection-face)
    (prefix . js-slime-symbol-start-pos)
    (symbol . "l")
    (document . ac-slime-documentation)
    (match . ac-source-slime-case-correcting-completions))
  "Source for slime completion.")
(defun set-up-slime-js-ac (&optional fuzzy)
  "Add an optionally-fuzzy slime completion source to `ac-sources'."
  (interactive)
  (add-to-list 'ac-sources ac-js-source-slime-simple))
In response to my own question about regex global state: there is a lot of it.
Emacs regexes use the syntax table defined by the major mode to decide which characters a syntax class such as \sw or \s_ matches. I was seeing the dot match in the Lisp modes but not in js2-mode because their syntax tables differ: in the Lisp modes '.' has symbol syntax, while in js2-mode '.' has punctuation syntax.
As a consequence, an alternative way to fix the problem is to redefine .'s syntax in js2-mode. I tried this out and redefined . as a word with (modify-syntax-entry ?. "w"). However I decided not to stay with that result because it will probably break something down the line.
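The per-mode difference can be inspected with char-syntax, and the syntax change can also be confined to a temporary copy of the table instead of altering js2-mode itself. The following is only a sketch under the assumption that a plain skip-syntax-backward scan is good enough to find the completion prefix; the function name my/dotted-symbol-start is hypothetical and is not part of Slime or auto-complete.

```elisp
;; Inspect the syntax class of `.' under two tables.  Per the explanation
;; above, the Emacs Lisp table should report symbol syntax, while the
;; standard table reports punctuation.
(with-syntax-table emacs-lisp-mode-syntax-table (char-syntax ?.))
(with-syntax-table (standard-syntax-table) (char-syntax ?.))

(defun my/dotted-symbol-start ()
  "Return the start of the dotted name before point.
The dot is treated as a symbol constituent only inside this function;
the buffer's own syntax table is left untouched."
  (let ((table (copy-syntax-table (syntax-table))))
    (modify-syntax-entry ?. "_" table)
    (with-syntax-table table
      (save-excursion
        ;; Skip back over word and symbol constituents.
        (skip-syntax-backward "w_")
        (point)))))
```

With point at the end of a line such as "var x = fs.ch", this returns the position of the f rather than the c, without changing how any other command sees the dot.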
Also, I have to thank the people in #emacs. They really helped me out on this, teaching me about syntax tables and the horrors of elisp regex globals.

Related

When writing C++ in Emacs, why does it seemingly randomly indent in the middle of lines?

I'm writing a C++ program in Emacs, and I've encountered an irritating feature whereby when I type one of a set of seemingly random characters, it will indent the line I'm typing on.
For example:
cout<<"Case #"<<case<<": ";
Each time I typed the << operator, the line would be indented by two spaces, which I would then have to go back and remove. The same thing would occur when typing (. After a while, you start to get bored of this. Any idea why this might be happening?
The only style point I've changed from default is to set the 'style' variable to 'linux', and all I've got in my ~/.emacs is:
(setq backup-directory-alist '(("." . "~/.saves")))
(setq tab-width 4)
You can toggle electric indentation on and off with M-x c-toggle-electric-state. When it is turned off, you shouldn't get automatic indentation after typing << or (.
If you always want it off, you can turn it off in your c++-mode-hook, e.g.
(defun my-c++-mode-hook ()
  ;; ...
  (c-toggle-electric-state -1))
(add-hook 'c++-mode-hook 'my-c++-mode-hook)

setting a regular expression into the emacs header line

So I'm new to emacs lisp, and I've got a long file with walls of text broken up by dates. Sometimes I can't see what date I'm reading under without scrolling up and losing my position, and I decided I wanted to be able to see this at all times.
After skim-reading the manuals, borrowing code examples and making a wild stab in the dark, the following worked beautifully:
(add-hook 'text-mode-hook
(lambda ()
(setq header-line-format
'(:eval
;;(setq temp-point point)
(setq temp-string
(if (re-search-backward "../../.." nil t nil) (match-string 0) '("no date"))
)
;;(goto-char 'temp-point)
`(temp-string)
)
)
)
)
There's only one fly in the ointment: re-search-backward moves the point. I want it to stay put.
First of all, is there a function that can do a regexp search without moving the point and return the match?
Secondly, and whether or not that's true: as you can tell from the commented code I've been trying to work around it by saving the value of point and then resetting the position afterward. The order of the code is on the assumption that the last element of the list is the one that gets returned.
However that assumption doesn't seem to always hold: as soon as I uncomment the first line (which can actually be anything as long as it's valid code) the header gets set to a blank string instead.
If someone could tell me where I'm going wrong, that would be great. Also if I've got any bad habits or less-than-efficient ways of solving a problem, please point them out.
You're looking for the save-excursion macro. Wrap it around your search, and it will undo any movement. To quote the docstring:
(save-excursion &rest BODY)
Save point, mark, and current buffer; execute BODY; restore those things.
So in place of (re-search-backward "../../.." nil t nil) you'd put:
(save-excursion
(re-search-backward "../../.." nil t nil))
As to why uncommenting the first line would change the result, this has to do with the way :eval works in mode/header line format specs. The docstring for mode-line-format says that it's (:eval FORM) - with just one form - and apparently it silently ignores any forms after the first one. (Your intuition is correct: most of the time the last value inside the form is returned, and the behaviour of :eval is indeed surprising.)
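Putting both points together, one way to keep the :eval down to a single form is to move the search into a named function. This is a sketch rather than the asker's final code: the function name my/nearest-date-above is made up, and the regexp is tightened to two-digit groups on the assumption that the dates look like 01/02/03 (the original "../../.." would match any characters around the slashes).

```elisp
(defun my/nearest-date-above ()
  "Return the closest date before point as a string, or \"no date\".
Point and the caller's match data are left untouched."
  (save-excursion
    (save-match-data
      (if (re-search-backward "[0-9][0-9]/[0-9][0-9]/[0-9][0-9]" nil t)
          (match-string-no-properties 0)
        "no date"))))

(add-hook 'text-mode-hook
          (lambda ()
            ;; A single form after :eval, as the format spec expects.
            (setq header-line-format '(:eval (my/nearest-date-above)))))
```

Returning a plain string from the function also avoids the temporary global variable used in the question.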

Emacs Brace and Bracket Highlighting?

When entering code Emacs transiently highlights the matching brace or bracket. With existing code however is there a way to ask it to highlight a matching brace or bracket if I highlight its twin?
I am often trying to do a sanity check when dealing with compiler errors and warnings. I do enter both braces usually when coding before inserting the code in between, but have on occasion unintentionally commented out one brace when commenting out code while debugging.
Any advice with dealing with brace and bracket matching with Emacs?
OS is mostly Linux/Unix, but I do use it also on OS X and Windows.
If you're dealing with a language that supports it, give ParEdit a serious look. If you're not working with a Lisp dialect, it's not nearly as useful though.
For general brace/bracket/paren highlighting, look into highlight-parentheses mode (which color codes multiple levels of braces whenever point is inside them). You can also turn on show-paren-mode through customizations (that is M-x customize-variable show-paren-mode); that one strongly highlights the brace/bracket/paren matching one at point (if the one at point doesn't match anything, you get a different color).
My .emacs currently contains (among other things)
(require 'highlight-parentheses)
(define-globalized-minor-mode global-highlight-parentheses-mode highlight-parentheses-mode
  (lambda nil (highlight-parentheses-mode t)))
(global-highlight-parentheses-mode t)
as well as that show-paren-mode customization, which serves me well (of course, I also use paredit when lisping, but these are still marginally useful).
Apart from the answer straight from the manual or wiki, also have a look at autopair.
Tested on Emacs 26:
(show-paren-mode 1)
(setq show-paren-style 'mixed)
The first line enables paren highlighting. The second sets the style so that the matching brace character is highlighted when it is visible; if it is off-screen, the whole enclosed expression is highlighted instead.
To toggle point between the two matching parentheses, put this in your .emacs:
(defun swcbrace ()
  (interactive)
  (if (looking-at "(")
      (forward-list)
    (backward-char)
    (cond
     ((looking-at ")") (forward-char) (backward-list))
     ((looking-at ".)") (forward-char 2) (backward-list)))))
(global-set-key (kbd "<C-next>") 'swcbrace)
Pressing Control-PageDown then jumps back and forth between the two.
BTW, for the immediate question: M-x blink-matching-open will "re-blink" for an existing close paren, as if you had just inserted it. Another way to see the matching paren is to use M-C-b and M-C-f (which jump over matched pairs of parens), which are also very useful navigation commands.
I second ParEdit. It is very good, at least for Lisp development.
FWIW I use this function often to go to matching paren (back and forth).
;; goto-matching-paren
;; -------------------
;; If point is sitting on a parenthetic character, jump to its match.
;; This matches the standard parenthesis highlighting for determining which
;; one it is sitting on.
;;
(defun goto-matching-paren ()
"If point is sitting on a parenthetic character, jump to its match."
(interactive)
(cond ((looking-at "\\s\(") (forward-list 1))
((progn
(backward-char 1)
(looking-at "\\s\)")) (forward-char 1) (backward-list 1))))
(define-key global-map [(control ?c) ?p] 'goto-matching-paren) ; Bind to C-c p
Disclaimer: I am NOT the author of this function; I copied it from the internet.
If you just want to check for balanced delimiters, be they parentheses, square brackets or curly braces, you can use backward-sexp (bound to Ctrl+Alt+B) and forward-sexp (bound to Ctrl+Alt+F) to skip backward and forward to the corresponding delimiter. These commands are very handy for navigating through source files, skipping structures and function definitions, without any buffer modifications.
You can set the below in your init.el:
(setq show-paren-delay 0)
(show-paren-mode 1)
to ensure matching parentheses are highlighted.
Note that (setq show-paren-delay 0) needs to be set before (show-paren-mode 1) so that there's no delay in highlighting, as per the wiki.
If you want to do a quick check to see whether brackets in the current file are balanced:
M-x check-parens
Both options tested on Emacs 27.1

Emacs compilation-mode regex for multiple lines

So I have a tool that lints the Python changes I've made and produces errors and warnings. I would like this to be usable in compilation mode in Emacs, but I have an issue. The file name is output only once at the beginning, and then only line numbers appear with the errors and warnings. Here's an example:
Linting file.py
E0602: 37: Undefined variable 'foo'
C6003: 42: Unnecessary parens after 'print' keyword
2 new errors, 2 total errors in file.py.
It's very similar to pylint, but there's no output-format=parseable option. I checked the documentation for compilation-error-regexp-alist, and found something promising:
If FILE, LINE or COLUMN are nil or that index didn't match, that
information is not present on the matched line. In that case the
file name is assumed to be the same as the previous one in the
buffer, line number defaults to 1 and column defaults to
beginning of line's indentation.
So I tried writing a regexp that would optionally match the file line and pull it out in a group, and then the rest would match the other lines. I assumed that it would first match
Linting file.py
E0602: 37: Undefined variable 'foo'
and be fine. Then it would continue and match
C6003: 42: Unnecessary parens after 'print' keyword
with no file. Since there was no file, it should use the file name from the previous match right? Here's the regexp I'm using:
(add-to-list 'compilation-error-regexp-alist 'special-lint)
(add-to-list 'compilation-error-regexp-alist-alist
'(special-lint
"\\(Linting \\(.*\\)\\n\\)?\\([[:upper:]][[:digit:]]+:\\s-+\\([[:digit:]]\\)+\\).*"
2 4 nil nil 3))
I've checked it with re-builder and manually in the scratch buffer, and it behaves as expected: the 2nd group is the file name, the 4th is the line number, and the 3rd is what I want highlighted. Whenever I try this, I get the error:
signal(error ("No match 2 in highlight (2 compilation-error-face)"))
I have a workaround for this that involves transforming the output before the compile module looks at it, but I'd prefer to get rid of that and have a "pure" solution. I would appreciate any advice or pointing out any dumb mistakes I may have made.
EDIT
Thomas' pseudo-code below worked quite well. He mentioned that doing a backward regexp search could mess up the match data, and it did. But that was solved by wrapping the save-excursion in the save-match-data macro.
FILE can also have the form (FILE FORMAT...), where the FORMATs
(e.g. "%s.c") will be applied in turn to the recognized file name,
until a file of that name is found. Or FILE can also be a function
that returns (FILENAME) or (RELATIVE-FILENAME . DIRNAME). In the
former case, FILENAME may be relative or absolute.
You could try to write a regex that doesn't match the file name at all, only the line number. Then for the file, write a function that searches backwards for the file name. Perhaps not as efficient, but it should have the advantage that you can move upwards through the error messages and it will still identify the correct file when you cross file boundaries.
I don't have the necessary stuff installed to try this out, but take the following pseudo-code as an inspiration:
(add-to-list 'compilation-error-regexp-alist-alist
             '(special-lint
               "^\\S-+\\s-+\\([0-9]+\\):.*" ;; is .* necessary?
               ;; No quote here: the whole entry is already quoted.
               special-lint-backward-search-filename 1))
(defun special-lint-backward-search-filename ()
  (save-excursion
    (when (re-search-backward "^Linting \\(.*\\)$" (point-min) t)
      (list (match-string 1)))))
(It could be that using a search function inside special-lint-backward-search-filename will screw up the sub-group matching of the compilation-error-regexp, which would suck.)
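For what it's worth, the question's EDIT reports that exactly this happened and that save-match-data fixed it. A sketch of that variant follows; the names my/special-lint and my/special-lint-find-file are hypothetical, and the regexp is an assumption based on the sample output shown in the question (an upper-case letter, digits, a colon, whitespace, then the line number).

```elisp
(require 'compile)

(defun my/special-lint-find-file ()
  "Return (FILENAME) from the nearest \"Linting FILE\" line above point.
The caller's match data is preserved, so compilation mode can still
read the line-number group of its own match afterwards."
  (save-match-data
    (save-excursion
      (when (re-search-backward "^Linting \\(.*\\)$" nil t)
        (list (match-string-no-properties 1))))))

(add-to-list 'compilation-error-regexp-alist-alist
             '(my/special-lint
               "^[[:upper:]][[:digit:]]+:\\s-+\\([[:digit:]]+\\):"
               my/special-lint-find-file 1))
(add-to-list 'compilation-error-regexp-alist 'my/special-lint)
```

Because the file name is looked up afresh for every message, this should keep working when the output contains several "Linting" sections.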
I don't think you can make compilation do what you want here, because it won't assume that a subsequent error relates to a previously-seen filename. But here's an alternative; write a flymake plugin. Flymake always operates on the current file, so you only need to tell it how to find line (and, optionally, column) numbers.
Try hacking something like this, and you'll likely be pleasantly surprised.

How do I bind a regular expression to a key combination in emacs?

For context, I am something of an emacs newbie. I haven't used it for very long, but have been using it more and more (I like it a lot). Also I'm comfortable with lisp, but not super familiar with elisp.
What I need to do is bind a regular expression to a keyboard combination because I use this particular regex so often.
What I've been doing:
M-C-s ^.*Table\(\(.*\n\)*?GO\)
Note, I used \n above, but I've found that for isearch-forward-regexp, you really need to replace the \n in the regular expression with the result of C-q C-j. This inserts a literal newline (without ending the command), enabling me to put a newline into the expression and match across lines.
How can I bind this to a key combination?
I vaguely understand that I need to create an elisp function which executes isearch-forward-regexp with the expression, but I'm fuzzy on the details. I've searched google and found most documentation to be a tad confusing.
How can I bind a regular expression to a key combination in emacs?
Mike Stone had the best answer so far -- not exactly what I was looking for but it worked for what I needed
Edit - this sort of worked, but after storing the macro, when I went back to use it later, I couldn't use it with C-x e. (i.e., if I reboot emacs and then type M-x macro-name, and then C-x e, I get a message in the minibuffer like 'no last kbd macro' or something similar)
@Mike Stone - Thanks for the information. I tried creating a macro like so:
C-x( M-C-s ^.*Table\(\(.*C-q C-J\)*?GO\) C-x)
This created my macro, but when I executed my macro I didn't get the same highlighting that I ordinarily get when I use isearch-forward-regexp. Instead it just jumped to the end of the next match of the expression. So that doesn't really work for what I need. Any ideas?
Edit: It looks like I can use macros to do what I want, I just have to think outside the box of isearch-forward-regexp. I'll try what you suggested.
You can use macros, just do C-x ( then do everything for the macro, then C-x ) to end the macro, then C-x e will execute the last defined macro. Then, you can name it using M-x name-last-kbd-macro which lets you assign a name to it, which you can then invoke with M-x TESTIT, then store the definition using M-x insert-kbd-macro which will put the macro into your current buffer, and then you can store it in your .emacs file.
Example:
C-x( abc *return* C-x)
Will define a macro to type "abc" and press return.
C-x e e e
Executes the above macro immediately, 3 times (first e executes it, then following 2 e's will execute it twice more).
M-x name-last-kbd-macro testit
Names the macro to "testit"
M-x testit
Executes the just named macro (prints "abc" then return).
M-x insert-kbd-macro
Puts the following in your current buffer:
(fset 'testit
[?a ?b ?c return])
Which can then be saved in your .emacs file to use the named macro over and over again after restarting emacs.
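Since the original question was about a key combination, the saved macro can also be bound to a key. A small sketch, assuming the macro from the example above; the name my/testit and the key C-c m are arbitrary choices for illustration.

```elisp
;; A symbol whose function definition is a keyboard macro (a string or
;; vector of events) counts as a command, so it can be bound like one.
(fset 'my/testit
      [?a ?b ?c return])
(global-set-key (kbd "C-c m") 'my/testit)
```

After this is loaded from .emacs, C-c m replays the macro in the same way M-x my/testit does.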
I've started with solving your problem literally,
(defun search-maker (s)
  `(lambda ()
     (interactive)
     (let ((regexp-search-ring (cons ,s regexp-search-ring)) ; add regexp to history
           (isearch-mode-map (copy-keymap isearch-mode-map)))
       ;; make the last key repeat the search
       (define-key isearch-mode-map (vector last-command-event) 'isearch-repeat-forward)
       (isearch-forward-regexp))))
;; "\n" (not "\\n") puts a literal newline into the regexp.
(global-set-key (kbd "C-. t") (search-maker "^.*Table\\(\\(.*\n\\)*?GO\\)"))
(global-set-key (kbd "<f6>") (search-maker "HELLO WORLD"))
The keyboard sequence from (kbd ...) starts a new blank search. To actually search for your string, you press the last key again as many times as you need, so C-. t t t or <f6> <f6> <f6>. The solution is basically a hack, but I'll leave it here in case you want to experiment with it.
The following is probably the closest to what you need,
(defmacro define-isearch-yank (key string)
  `(define-key isearch-mode-map ,key
     (lambda ()
       (interactive)
       (isearch-yank-string ,string))))
;; "\n" (not "\\n") puts a literal newline into the regexp.
(define-isearch-yank (kbd "C-. t") "^.*Table\\(\\(.*\n\\)*?GO\\)")
(define-isearch-yank (kbd "<f6>") "HELLO WORLD")
The key combos now only work in isearch mode. You start the search normally, and then press key combos to insert your predefined string.
@Justin:
When executing a macro, it's a little different... incremental searches will just happen once, and you will have to execute the macro again if you want to search again. You can do more powerful and complex things though, such as search for a keyword, jump to the beginning of the line, mark, go to end of the line, M-w (to copy), then jump to another buffer, then C-y (paste), then jump back to the other buffer and end your macro. Then, each time you execute the macro you will be copying a line to the next buffer.
The really cool thing about emacs macros is it will stop when it sees the bell... which happens when you fail to match an incremental search (among other things). So the above macro, you can do C-u 1000 C-x e which will execute the macro 1000 times... but since you did a search, it will only copy 1000 lines, OR UNTIL THE SEARCH FAILS! Which means if there are 100 matches, it will only execute the macro 100 times.
EDIT: Check out C-h f highlight-lines-matching-regexp, which shows the help for a command that highlights every line matching a regex. I don't know how to undo the highlighting though. Anyway, you could use a stored macro to highlight everything matching the regex, and then another macro to find the next match.
FURTHER EDIT: M-x unhighlight-regexp will undo the highlighting, you have to enter the last regex though (but it defaults to the regex you used to highlight)
In general, to define a custom keybinding in Emacs, you'd write
(define-key global-map (kbd "C-c C-f") 'function-name)
define-key is, unsurprisingly, the function to define a new key. global-map is the global keymap, as opposed to individual maps for each mode. (kbd "C-c C-f") returns a string representing the key sequence C-c C-f. There are other ways of doing this, including inputting the string directly, but this is usually the most straightforward since it takes the normal written representation. 'function-name is a symbol that's the name of the function.
Now, unless your function is already defined, you'll want to define it before you use this. To do that, write
(defun function-name (args)
(interactive)
stuff
...)
defun defines a function - use C-h f defun for more specific information. The (interactive) there isn't really a function call; it declares the function to be a command, which is what allows the user to call it with M-x function-name and via keybindings.
Now, for interactive searching in particular, this is tricky; the isearch module doesn't really seem to be set up for what you're trying to do. But you can use this to do something similar.
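As one possible shape for that "something similar", here is a sketch of a plain (non-incremental) search command bound to a key. It gives up isearch's highlighting and simply moves point to the end of the next match. The names my/table-block-regexp and my/search-table-block and the key C-c t are made up for this example; the regexp is the one from the question, with a literal newline written as "\n".

```elisp
(defvar my/table-block-regexp "^.*Table\\(\\(.*\n\\)*?GO\\)"
  "Regexp for a block running from a line containing Table to the next GO.")

(defun my/search-table-block ()
  "Move point to the end of the next Table ... GO block, if there is one.
Return the new position, or nil (leaving point alone) when nothing matches."
  (interactive)
  (or (re-search-forward my/table-block-regexp nil t)
      (progn (message "No further Table ... GO block") nil)))

;; C-c followed by a letter is reserved for user bindings.
(define-key global-map (kbd "C-c t") 'my/search-table-block)
```

Pressing the key repeatedly steps from one block to the next, which is roughly what repeating C-M-s would do.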