Regex Searching in Emacs - regex

I'm trying to write some Elisp code to format a bunch of legacy files.
The idea is that if a file contains a section like "<meta name=\"keywords\" content=\"\\(.*?\\)\" />", then I want to insert a section that contains existing keywords. If that section is not found, I want to insert my own default keywords into the same section.
I've got the following function:
(defun get-keywords ()
(re-search-forward "<meta name=\"keywords\" content=\"\\(.*?\\)\" />")
(goto-char 0) ;The section I'm inserting will be at the beginning of the file
(or (march-string 1)
"Rubber duckies and cute ponies")) ;;or whatever the default keywords are
When the function fails to find its target, it returns Search failed: "[regex here]" and prevents the rest of evaluation. Is there a way to have it return the default string, and ignore the error?

Use the extra options for re-search-forward and structure it more like
(if (re-search-forward "<meta name=\"keywords\" content=\"\\(.*?\\)\" />" nil t)
(match-string 1)
"Rubber duckies and cute ponies")

Also, consider using the nifty "rx" macro to write your regex; it'll be more readable.
(rx "<meta name=\"keywords\" content=\""
(group (minimal-match (zero-or-more nonl)))
"\" />")

Related

How to not highlight values in comments in Emacs?

I want to highlight true and false values in some configuration files. I've
done it this way:
(defun my/highlight-in-properties-files ()
"Highlight regexps in PROPERTIES files."
(when (string-match-p ".properties" (file-name-nondirectory buffer-file-name))
(highlight-regexp "true" 'hi-green)
(highlight-regexp "false" 'hi-pink)))
But it also highlight those values in comments:
Is there a way to exclude those highlightings?
UPDATE -- highlight-regexp is an alias for ‘hi-lock-face-buffer’ in ‘hi-lock.el’. And string-match-p is a compiled Lisp function in ‘subr.el’.
You can just add the regexps via font-lock-add-keywords, which will already account for the comment syntax in the buffer, eg.
(defun my-font-lock-additions ()
(require 'hi-lock) ; fonts
(font-lock-add-keywords
nil
'(("\\btrue\\b" . 'hi-green)
("\\bfalse\\b" . 'hi-pink)))
(font-lock-flush))
And call (font-lock-refresh-defaults) to revert back to OG settings.
Sticking to a purely regexp solution with highlight-regexp will undoubtebly produce some mistakes in odd cases, but, I think just customizing your regex to check for a comment prefix would probably work well enough as well,
(defun my/highlight-in-properties-files ()
"Highlight regexps in PROPERTIES files."
(when (string-match-p ".properties" (file-name-nondirectory buffer-file-name))
(comment-normalize-vars) ; ensure comment variables are setup
(let ((cmt-re (format "^[^%s]*" (regexp-quote (string-trim comment-start)))))
(highlight-regexp (format "%s\\(\\_<true\\_>\\)" cmt-re) 'hi-green 1)
(highlight-regexp (format "%s\\(\\_<false\\_>\\)" cmt-re) 'hi-pink 1))))
A way would be to simply add a $ at the end of your regex to not match the comments, as the true/false are always at the end, while those in comments are always situated in the middle of a sentence
(defun my/highlight-in-properties-files ()
"Highlight regexps in PROPERTIES files."
(when (string-match-p ".properties" (file-name-nondirectory buffer-file-name))
(highlight-regexp "true$" 'hi-green)
(highlight-regexp "false$" 'hi-pink)))

Clojure: java.lang.Integer cannot be cast to clojure.lang.IFn

I know there are a lot of questions out there with this headline, but I can't glean my answer from them, so here goes.
I'm an experienced programmer, but fairly new to Clojure. I'm trying to parse a RTF file by converting it to a HTML file then calling the html parser.
The converter I'm using (unrtf) always prints to stdout, so I need to capture the output and write the file myself.
(defn parse-rtf
"Use unrtf to parse a rtf file"
[#^java.io.InputStream istream charset]
(let [rtffile (File/createTempFile "parse" ".rtf" (File. "/vault/tmp/"))
htmlfile (File/createTempFile "parse" ".ohtml" (File. "/vault/tmp/"))
command (str "/usr/bin/unrtf "
(.getPath rtffile)
)
]
(try
(with-open [rtfout (FileOutputStream. rtffile)]
(IOUtils/copy istream rtfout))
(let [ proc (.exec (Runtime/getRuntime) command)
ostream (.getInputStream proc)
result (.waitFor proc)]
(if (> result 0)
(
(println "unrtf failed" command result)
; throwing an exception causes a parse failure to be logged
(throw (Exception. (str "RTF to HTML conversion failed")))
)
(
(with-open [htmlout (FileOutputStream. htmlfile)]
(IOUtils/copy ostream htmlout))
; since we now have html, run it through the html parser
(parse-html (FileInputStream. htmlfile) charset)
)
)
)
(finally
(.delete rtffile)
(.delete htmlfile)
)
)))
The exception points to the line with
(IOUtils/copy ostream htmlout))
which really confuses me, since I used that form earlier (just after the try:) and it seems to be OK there. I can't see the difference.
Thanks for any help you can give.
As others have correctly pointed out, you can't just add extra parentheses for code organization to group forms together. Parentheses in a Clojure file are tokens that delimit a list in the corresponding code; lists are evaluated as s-expressions - that is, the first form is evaluated and the result is invoked as a function (unless it names a special form such as if or let).
In this case you have the following:
(
(with-open [htmlout (FileOutputStream. htmlfile)]
(IOUtils/copy ostream htmlout))
; since we now have html, run it through the html parser
(parse-html (FileInputStream. htmlfile) charset)
)
The IOUtils/copy function has an integer return value (the number of bytes copied). This value is then returned when the surrounding with-open macro is evaluated. Since the with-open form is the first in a list, Clojure will then try to invoke the integer return value from IOUtils/copy as a function, resulting in the exception that you see.
To evaluate multiple forms for side-effects without invoking the result from the first one, wrap them in a do form; this is a special form that evaluates each expression and returns the result of the final expression, discarding the result from all others. Many core macros and special forms such as let, when, and with-open (among many others) accept multiple expressions and evaluate them in an implicit do.
I didnt try to run your code, just had a look at it, and after the if (> result 0) you have ((println ...)(throw ...)) without a do. Having an extra parens causes the returned value from the inner parens to be treated as a function and get executed.
try to include it, like this (do (println ...) (throw ...))

How to get to end of line but before start of comment in emacs C++ mode?

In practice I usually encounter below situation:
statement_line /*..some comments..*/
Sometimes I need to jump to just pass the end of the statement_line to: add some code, or delete the entire comment region.
Is there a keyboard shortcut to do it? (I know C-s ; Ret works for statement_line ending with a ";" but this is not all the case. e.g. if(...) /*...comments...*/)
Thanks in advance.
Try this code:
(defun end-of-statement ()
(interactive)
(beginning-of-line)
(if (comment-search-forward (line-end-position) t)
(re-search-backward "//\\|/\\*")
(end-of-line))
(skip-chars-backward " \t"))
I just wrote it, so it might require some tweaks, but it looks
fine at the moment.
Not a full answer to your question, but for the case where you want to remove the comment, you can use C-u M-; (aka comment-kill).
I have a command to bounce between the end of line and end of code. It works in most programming modes.
(defun end-of-line-code ()
(interactive "^")
(require 'newcomment)
(if comment-start-skip
(save-match-data
(let* ((bolpos (line-beginning-position)))
(end-of-line)
(if (comment-search-backward bolpos 'noerror)
(search-backward-regexp comment-start-skip bolpos 'noerror))
(skip-syntax-backward " " bolpos)))
(end-of-line)))
(defun end-of-line-or-code ()
"Move to EOL, or if already there, to EOL sans comments."
(interactive "^")
(if (eolp) ;; test me here
(end-of-line-code)
(end-of-line)))
(put 'end-of-line-or-code 'CUA 'move)
(defun jpk/prog-mode-hook ()
...
(local-set-key (kbd "C-e") 'end-of-line-or-code)
...
)
(add-hook 'prog-mode-hook 'jpk/prog-mode-hook)
I have a similar command that bounces between the beginning of line and beginning of text, which I can post if you're interested.

google-style file for C++ for emacs

I use the google-style file for emacs. It also looks like a good one to start learning some emacs lisp, not that long. However there is sth I am trying configure in that file, maybe some already did that before, for coding a class, I wrote,
namespace A
{
class A_A
{
public:
A_A();
private:
int a;
};
}
however public/private keywords are not at the right places, I did not understand why it places them like this out of the box, how can fix this? I am not good at emacs lisp yet unfortunately.
EDIT: I wanted to have sth like
namespace A
{
class A_A
{
public:
A_A();
private:
int a;
};
}
To get indent you like use such debug techniques:
(setq c-echo-syntactic-information-p t)
When you press TAB for indenting you will see something like:
syntax: ((inclass 33) (access-label 33))
As you can see access-label identify how indent priv/pub modifiers.
So change to what you want:
(defconst my-c-style
'(
(c-tab-always-indent . t)
(c-offsets-alist
. (
(access-label . /) ; XXXXXX LOOK HERE!!!!!!!
))
)
"My C Programming Style")
(defun my-c-mode-style-hook ()
(c-add-style "my" my-c-style t)
;; If set 'c-default-style' before 'c-add-style'
;; "Undefined style: my" error occured from 'c-get-style-variables'.
(setq c-default-style
'(
(java-mode . "my") (c-mode . "my") (csharp-mode . "my") (c++-mode . "my")
(other . "my")
))
)
(add-hook 'c-mode-common-hook 'my-c-mode-style-hook)
In example I remove half-level indent as inclass add one full indent (to get 1/2 of indent. For offset syntax read C-h v c-offsets-alist RET. For example:
If OFFSET is one of the symbols `+', `-', `++', `--', `*', or `/'
then a positive or negative multiple of `c-basic-offset' is added to
the base indentation; 1, -1, 2, -2, 0.5, and -0.5, respectively.
Probably nowadays a good solution is given by the config file provided in Github by Google.
In the repository styleguide
there is the config file google-c-style.el
that, as described in the file,
;; Provides the google C/C++ coding style. You may wish to add
;; `google-set-c-style' to your `c-mode-common-hook' after requiring this
;; file. For example:
;;
;; (add-hook 'c-mode-common-hook 'google-set-c-style)
;;
;; If you want the RETURN key to go to the next line and space over
;; to the right place, add this to your .emacs right after the load-file:
;;
;; (add-hook 'c-mode-common-hook 'google-make-newline-indent)
The file is also distributed via MELT package system as google-c-style.el.

C++ Templates and Emacs: Customizing Indentation

As far as I know in emacs, there is no way of customizing the indentation level of the closing '>' character of a template list in C++. Currently my emacs indentation scheme does this:
template <
typename T1,
typename T2,
typename T3
>
class X;
What I want is something like this:
template <
typename T1,
typename T2,
typename T3
>
class X;
Setting the indent variable template-args-cont to zero will indent the '>' character properly, but at the cost of unindenting the actual body of the template argument list.
Any suggestions from the emacs gurus out there?
EDIT:
I got it somewhat working with the following hack:
(defun indent-templates (elem)
(c-langelem-col elem t)
(let ((current-line
(buffer-substring-no-properties
(point-at-bol) (point-at-eol))))
(if (string-match-p "^\\s-*>" current-line)
0
'+)))
And then setting template-args-cont to indent-templates in my custom theme, ala:
(c-add-style "my-style"
'("stroustrup"
;; ... Other stuff ...
(template-args-cont . indent-templates))))
But it's still pretty buggy. It works most of the time, but sometimes emacs gets confused at thinks a template list is an arglist, and then hilarity ensues.
The best solution that I have found is writing a custom (and relatively straightforward) indentation function.
The Code
(defun c++-template-args-cont (langelem)
"Control indentation of template parameters handling the special case of '>'.
Possible Values:
0 : The first non-ws character is '>'. Line it up under 'template'.
nil : Otherwise, return nil and run next lineup function."
(save-excursion
(beginning-of-line)
(if (re-search-forward "^[\t ]*>" (line-end-position) t)
0)))
(add-hook 'c++-mode-hook
(lambda ()
(c-set-offset 'template-args-cont
'(c++-template-args-cont c-lineup-template-args +))))
This handles all of the cases that I have come across even with templates nested several levels deep.
How It Works
For indenting code, if a list of indentation functions is provided, then Emacs will try them in order and if the one currently being executed returns nil, it will invoke the next one. What I have done is added a new indentation function to the beginning of the list that detects whether the first non-whitespace character on the line is '>', and if it is, set the indentation to position 0 (which will line it up with the opening template). This also covers the case where you have template-template parameters as follows:
template <
template <
typename T,
typename U,
typename... Args
> class... CS
>
because it doesn't care what's after the '>'. So as a result of how the list of indentation functions works, if '>' is not the first character, the function returns nil and the usual indentation function gets invoked.
Comments
I think part of the problem you experience is that when you instantiate templates, emacs CC mode views it with the same template-args-cont structure. So, taking this into account, I expanded on your original idea and tried to make it suit my liking; I made the code verbose so that hopefully everyone can understand my intention. :) This should not cause problems when you instantiate, and it also appears to work for template template parameters! Try this out until someone with more Elisp skills can provide a better solution!
If you experience any 'fighting' (i.e. alternating or broken indentation), try reloading the cpp file C-xC-vEnter and indenting again. Sometimes with template template parameters emacs shows the inner arguments as arglist-cont-nonempty and even alternates back and forth with template-args-const, but the reload always restored state.
Usage
To do what you want try this out by using the code below and adding to your c-offsets-alist an entry:
(template-args-cont . brian-c-lineup-template-args)
and set the variable
(setq brian-c-lineup-template-closebracket t)
I actually prefer a slightly different alignment:
(setq brian-c-lineup-template-closebracket 'under)
Code
(defvar brian-c-lineup-template-closebracket 'under
"Control the indentation of the closing template bracket, >.
Possible values and consequences:
'under : Align directly under (same column) the opening bracket.
t : Align at the beginning of the line (or current indentation level.
nil : Align at the same column of previous types (e.g. col of class T).")
(defun brian-c-lineup-template--closebracket-p ()
"Return t if the line contains only a template close bracket, >."
(save-excursion
(beginning-of-line)
;; Check if this line is empty except for the trailing bracket, >
(looking-at (rx (zero-or-more blank)
">"
(zero-or-more blank)))))
(defun brian-c-lineup-template--pos-to-col (pos)
(save-excursion
(goto-char pos)
(current-column)))
(defun brian-c-lineup-template--calc-open-bracket-pos (langelem)
"Calculate the position of a template declaration opening bracket via LANGELEM."
(save-excursion
(c-with-syntax-table c++-template-syntax-table
(goto-char (c-langelem-pos langelem))
(1- (re-search-forward "<" (point-max) 'move)))))
(defun brian-c-lineup-template--calc-indent-offset (ob-pos)
"Calculate the indentation offset for lining up types given the opening
bracket position, OB-POS."
(save-excursion
(c-with-syntax-table c++-template-syntax-table
;; Move past the opening bracket, and check for types (basically not space)
;; if types are on the same line, use their starting column for indentation.
(goto-char (1+ ob-pos))
(cond ((re-search-forward (rx
(or "class"
"typename"
(one-or-more (not blank))))
(c-point 'eol)
'move)
(goto-char (match-beginning 0))
(current-column))
(t
(back-to-indentation)
(+ c-basic-offset (current-column)))))))
(defun brian-c-lineup-template-args (langelem)
"Align template arguments and the closing bracket in a semi-custom manner."
(let* ((ob-pos (brian-c-lineup-template--calc-open-bracket-pos langelem))
(ob-col (brian-c-lineup-template--pos-to-col ob-pos))
(offset (brian-c-lineup-template--calc-indent-offset ob-pos)))
;; Optional check for a line consisting of only a closebracket and
;; line it up either at the start of indentation, or underneath the
;; column of the opening bracket
(cond ((and brian-c-lineup-template-closebracket
(brian-c-lineup-template--closebracket-p))
(cond ((eq brian-c-lineup-template-closebracket 'under)
(vector ob-col))
(t
0)))
(t
(vector offset)))))
It's a different approach then changing the tabs, but what about using a snippet system like Yasnippet (see examples here).
The only issue is that if you reformat the doc "M-x index-region" (or that section), it will probably go back to the other tab rules.