Context-sensitive discontinuous font-locking in Emacs


I'm trying to set up font-locking for a major mode. Here is some example code:
USING: foo bar bar ;
IN: fuelcolors
TUPLE: font < super-class
name
size
bold?
italic?
{ foreground initial: COLOR: black }
{ background initial: COLOR: white } ;
TUPLE: rgb red green blue ;
: foobar ( seq -- seq )
hello there { 1 2 3 } ;
The following symbols should be highlighted with some face (doesn't matter which, my problem is the matching part): name, size, bold?, italic?, foreground, background, red, green, blue. They represent names of slots in tuples.
I know a single regexp won't do it because the matched region isn't contiguous: italic? and foreground should be matched, but not the { character between them. So instead I thought I could write a font-lock matcher function, similar to the one Dmitri offered here: Context-sensitive font-locking in emacs for a very similar problem.
But as far as I can tell, his solution takes advantage of the fact that the "sequence" of items to highlight is inside parentheses, which is not the case here.
Font-lock has trouble with situations like these (Unknown number of matches in regex and font-lock), but I'm still hoping for some "good enough" solution even if it requires hacking font-lock internals.

A matcher function seems like a good match for this.
I would use two functions, one that you can use to match TUPLE and to set the limits of the search of the members, and an inner matcher to find each element. The inner function could be written to be aware of the { name ... } construct.
The trick is that the inner function is called once for each element, so there is never a situation where you have an unknown number of matches.
The rule would look something like:
'(my-match-a-tuple
  (1 'font-lock-keyword-face)          ;; Highlight the word TUPLE
  (my-match-tuple-member
   ;; Pre-match form (can be used to move point around and to set search limits)
   nil
   ;; Post-match form (can be used to move point around afterwards)
   nil
   (1 'font-lock-variable-name-face)))
You can look at my package for fontifying cmake scripts, it makes heavy use of matcher functions: https://github.com/Lindydancer/cmake-font-lock
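As a rough illustration of the two-function approach described above, the matchers could be sketched as follows. This is not taken from cmake-font-lock: the names my-symbol-re, my-tuple-end and my-tuple-keywords are hypothetical, and the regexps assume that the mode's syntax table treats characters such as "?" and "-" as word or symbol constituents.

```elisp
;; Sketch only; helper names are hypothetical (see the note above).
(defconst my-symbol-re "\\(?:\\sw\\|\\s_\\)+"
  "Regexp matching one symbol, relying on the buffer's syntax table.")

(defun my-match-a-tuple (limit)
  "Find the next \"TUPLE: name [< parent]\" header before LIMIT.
Group 1 is the word TUPLE; point is left after the header."
  (re-search-forward
   (concat "\\(TUPLE\\):[ \t\n]+" my-symbol-re
           "\\(?:[ \t\n]+<[ \t\n]+" my-symbol-re "\\)?")
   limit t))

(defun my-tuple-end ()
  "Return the position of the \";\" that ends the tuple, or nil."
  (save-excursion
    (when (re-search-forward "[ \t\n];" nil t)
      (1- (point)))))

(defun my-match-tuple-member (limit)
  "Match the next slot name before LIMIT, leaving it in group 1."
  (skip-chars-forward " \t\n" limit)
  (when (< (point) limit)
    (cond
     ;; "{ name ... }": keep the name, then jump past the closing brace.
     ((looking-at (concat "{[ \t\n]+\\(" my-symbol-re "\\)"))
      (let ((data (match-data)))
        (goto-char (match-end 0))
        (unless (search-forward "}" limit t)
          (goto-char limit))
        ;; Restore the match data so group 1 is still the slot name.
        (set-match-data data))
      t)
     ;; Bare slot name.
     ((looking-at (concat "\\(" my-symbol-re "\\)"))
      (goto-char (match-end 0))
      t))))

(defconst my-tuple-keywords
  '((my-match-a-tuple
     (1 'font-lock-keyword-face)
     (my-match-tuple-member
      (my-tuple-end)   ; pre-match form: its value becomes the search limit
      nil
      (1 'font-lock-variable-name-face)))))

;; In the major mode's setup (or a mode hook):
;; (setq-local font-lock-multiline t)
;; (font-lock-add-keywords nil my-tuple-keywords)
```

Because the pre-match form returns the position of the terminating ";", the inner matcher is free to run across several lines; without a returned position, font-lock limits an anchored matcher to the end of the current line.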

Here is the solution I ended up with:
(,"\\(TUPLE\\):[ \n]+\\(\\(?:\\sw\\|\\s_\\)+\\)\\(?:[ \n]+<[ \n]+\\(\\(?:\\sw\\|\\s_\\)+\\)\\)?"
(1 'factor-font-lock-parsing-word)
(2 'factor-font-lock-type-name)
(3 'factor-font-lock-type-name nil t)
("\\(\\(?:\\sw\\|\\s_\\)+\\)\\|\\(?:{[ \n]+\\(\\(?:\\sw\\|\\s_\\)+\\)[^}]+\\)"
((lambda (&rest foo)
(save-excursion
(re-search-forward " ;" nil t)
(1- (point)))))
nil
(1 'factor-font-lock-symbol nil t)
(2 'factor-font-lock-symbol nil t)))

Related

Common Lisp Applying Regex-like Patterns to Keys in PLIST

I am wondering if it is possible to apply Regex-like pattern matching to keys in a plist.
That is, suppose we have a list like this (:input1 1 :input2 2 :input3 3 :output1 10 :output2 20 ... :expand "string here")
The code I need to write is something along the lines of:
"If there is :expand and (:input* or :output*) in the list's keys, then do something and also return the :expand and (:output* or :input*)".
Obviously, this can be accomplished via cond but I do not see a clear way to write this elegantly. Hence, I thought of possibly using a Regex-like pattern on keys and basing the return on the results from that pattern search.
Any suggestions are appreciated.

Normalize your input
A possible first step that will simplify the rest of your problem is to normalize your input so that it keeps the same information in a structured form, instead of inside symbol names. Here I convert keys from symbols to either symbols or lists. You could also define your own class to represent inputs and outputs, and write generic functions that work for both.
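(defun normalize-key (key)
  (or (cl-ppcre:register-groups-bind (symbol number)
          ;; Non-greedy \\w+? so that every trailing digit goes to NUMBER
          ;; (e.g. :input10 becomes (:input 10), not (:input1 0)).
          ("^(\\w+?)(\\d+)$" (symbol-name key))
        (list (intern symbol "KEYWORD")
              (parse-integer number)))
      key))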
(defun test-normalize ()
(assert (eq (normalize-key :expand) :expand))
(assert (equal (normalize-key :input1) '(:input 1))))
The above normalize-key deconstructs :inputN into a list (:input N), with N parsed as a number. Using the above function, you can normalize the whole list (you could do that recursively too for values, if you need it):
(defun normalize-plist (plist)
(loop
for (key value) on plist by #'cddr
collect (normalize-key key)
collect value))
(normalize-plist
'(:input1 1 :input2 2 :input3 3 :output1 10 :output2 20 :expand "string here"))
=> ((:INPUT 1) 1
(:INPUT 2) 2
(:INPUT 3) 3
(:OUTPUT 1) 10
(:OUTPUT 2) 20
:EXPAND "string here")
From there, you should be able to implement your logic more easily.

How this code translates a number to English?

I'm new to Clojure and found a piece of code like the following
user=> (def to-english (partial clojure.pprint/cl-format nil
"~@(~@[~R~]~^ ~A.~)"))
#'user/to-english
user=> (to-english 1234567890)
"One billion, two hundred thirty-four million, five hundred sixty-seven
thousand, eight hundred ninety"
at https://clojuredocs.org/clojure.core/partial#example-542692cdc026201cdc326ceb. I know what partial does and I checked the clojure.pprint/cl-format docs, but I still don't understand how it translates an integer to English words. I guess the secret is hidden in "~@(~@[~R~]~^ ~A.~)", but I couldn't find a guide for reading it.
Any help will be appreciated!

The doc mentions it, but one good resource is A Few FORMAT Recipes from Seibel's Practical Common Lisp.
Also, check §22.3 Formatted Output from the HyperSpec.
In Common Lisp:
CL-USER> (format t "~R" 10)
ten
~@(...~^...~) is case conversion, where the @ modifier means to capitalize only the first word. It contains the escape-upward directive ~^, which in this context marks the end of what is case-converted. It also exits the current context when there are no more arguments available.
~@[...~] is a conditional format: the inner format is applied to a value only if it is non-nil.
The final ~A means that the function can accept one more argument and print it.
In fact, your example looks like the one in §22.3.9.2:
If ~^ appears within a ~[ or ~( construct, then all the commands up to
the ~^ are properly selected or case-converted, the ~[ or ~(
processing is terminated, and the outward search continues for a ~{ or
~< construct to be terminated. For example:
(setq tellstr "~@(~@[~R~]~^ ~A!~)")
=> "~@(~@[~R~]~^ ~A!~)"
(format nil tellstr 23) => "Twenty-three!"
(format nil tellstr nil "losers") => " Losers!"
(format nil tellstr 23 "losers") => "Twenty-three losers!"

google-style file for C++ for emacs

I use the google-style file for Emacs. It also looks like a good starting point for learning some Emacs Lisp, since it is not that long. However, there is something I am trying to configure in that file (maybe someone has already done this). For a class, I wrote:
namespace A
{
class A_A
{
public:
A_A();
private:
int a;
};
}
However, the public/private keywords are not placed where I want them. I do not understand why they are placed like this out of the box; how can I fix this? Unfortunately, I am not good at Emacs Lisp yet.
EDIT: I wanted to have something like
namespace A
{
class A_A
{
public:
A_A();
private:
int a;
};
}

To get the indentation you want, a debugging technique like this helps:
(setq c-echo-syntactic-information-p t)
When you press TAB to indent, you will see something like:
syntax: ((inclass 33) (access-label 33))
As you can see, access-label determines how the private/public labels are indented.
So change it to what you want:
(defconst my-c-style
  '((c-tab-always-indent . t)
    (c-offsets-alist
     . ((access-label . /))))          ; <-- the relevant setting
  "My C Programming Style")
(defun my-c-mode-style-hook ()
  (c-add-style "my" my-c-style t)
  ;; Setting `c-default-style' before calling `c-add-style' triggers an
  ;; "Undefined style: my" error from `c-get-style-variables'.
  (setq c-default-style
        '((java-mode . "my") (c-mode . "my") (csharp-mode . "my")
          (c++-mode . "my") (other . "my"))))
(add-hook 'c-mode-common-hook 'my-c-mode-style-hook)
In this example I remove half a level of indentation: inclass adds one full indent, so the access labels end up at half an indent. For the offset syntax, read C-h v c-offsets-alist RET. For example:
If OFFSET is one of the symbols `+', `-', `++', `--', `*', or `/'
then a positive or negative multiple of `c-basic-offset' is added to
the base indentation; 1, -1, 2, -2, 0.5, and -0.5, respectively.
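If defining a whole style is more than needed, the same offset can probably be set on its own with c-set-offset. A minimal sketch, assuming C++ buffers only; the function name my-access-label-setup is hypothetical:

```elisp
;; Sketch: `my-access-label-setup' is a hypothetical name.
(defun my-access-label-setup ()
  "Shift access labels half a `c-basic-offset' to the left."
  ;; `/' stands for -0.5 * c-basic-offset, per the docstring quoted above.
  (c-set-offset 'access-label '/))

(add-hook 'c++-mode-hook #'my-access-label-setup)
```

Replacing '/ with 0 or '- gives the other common placements (level with the class members, or one full step to the left of them).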

Nowadays a good solution is probably the config file that Google provides on GitHub.
The styleguide repository contains the file google-c-style.el, which describes itself as follows:
;; Provides the google C/C++ coding style. You may wish to add
;; `google-set-c-style' to your `c-mode-common-hook' after requiring this
;; file. For example:
;;
;; (add-hook 'c-mode-common-hook 'google-set-c-style)
;;
;; If you want the RETURN key to go to the next line and space over
;; to the right place, add this to your .emacs right after the load-file:
;;
;; (add-hook 'c-mode-common-hook 'google-make-newline-indent)
The file is also distributed via the MELPA package archive as google-c-style.

C++ Templates and Emacs: Customizing Indentation

As far as I know in emacs, there is no way of customizing the indentation level of the closing '>' character of a template list in C++. Currently my emacs indentation scheme does this:
template <
    typename T1,
    typename T2,
    typename T3
    >
class X;
What I want is something like this:
template <
    typename T1,
    typename T2,
    typename T3
>
class X;
Setting the indent variable template-args-cont to zero will indent the '>' character properly, but at the cost of unindenting the actual body of the template argument list.
Any suggestions from the emacs gurus out there?
EDIT:
I got it somewhat working with the following hack:
(defun indent-templates (elem)
(c-langelem-col elem t)
(let ((current-line
(buffer-substring-no-properties
(point-at-bol) (point-at-eol))))
(if (string-match-p "^\\s-*>" current-line)
0
'+)))
And then setting template-args-cont to indent-templates in my custom style, like so:
(c-add-style "my-style"
'("stroustrup"
;; ... Other stuff ...
(template-args-cont . indent-templates))))
But it's still pretty buggy. It works most of the time, but sometimes Emacs gets confused and thinks a template list is an arglist, and then hilarity ensues.

The best solution that I have found is writing a custom (and relatively straightforward) indentation function.
The Code
(defun c++-template-args-cont (langelem)
  "Control indentation of template parameters handling the special case of '>'.
Possible Values:
0   : The first non-ws character is '>'. Line it up under 'template'.
nil : Otherwise, return nil and run next lineup function."
  (save-excursion
    (beginning-of-line)
    (if (re-search-forward "^[\t ]*>" (line-end-position) t)
        0)))
(add-hook 'c++-mode-hook
          (lambda ()
            (c-set-offset 'template-args-cont
                          '(c++-template-args-cont c-lineup-template-args +))))
This handles all of the cases that I have come across even with templates nested several levels deep.
How It Works
For indenting code, if a list of indentation functions is provided, then Emacs will try them in order and if the one currently being executed returns nil, it will invoke the next one. What I have done is added a new indentation function to the beginning of the list that detects whether the first non-whitespace character on the line is '>', and if it is, set the indentation to position 0 (which will line it up with the opening template). This also covers the case where you have template-template parameters as follows:
template <
template <
typename T,
typename U,
typename... Args
> class... CS
>
because it doesn't care what's after the '>'. So as a result of how the list of indentation functions works, if '>' is not the first character, the function returns nil and the usual indentation function gets invoked.

Comments
I think part of the problem you experience is that when you instantiate templates, emacs CC mode views it with the same template-args-cont structure. So, taking this into account, I expanded on your original idea and tried to make it suit my liking; I made the code verbose so that hopefully everyone can understand my intention. :) This should not cause problems when you instantiate, and it also appears to work for template template parameters! Try this out until someone with more Elisp skills can provide a better solution!
If you experience any 'fighting' (i.e. alternating or broken indentation), try reloading the cpp file with C-x C-v RET and indenting again. Sometimes with template template parameters Emacs shows the inner arguments as arglist-cont-nonempty and even alternates back and forth with template-args-cont, but the reload always restored the state.
Usage
To do what you want try this out by using the code below and adding to your c-offsets-alist an entry:
(template-args-cont . brian-c-lineup-template-args)
and set the variable
(setq brian-c-lineup-template-closebracket t)
I actually prefer a slightly different alignment:
(setq brian-c-lineup-template-closebracket 'under)
Code
(defvar brian-c-lineup-template-closebracket 'under
"Control the indentation of the closing template bracket, >.
Possible values and consequences:
'under : Align directly under (same column) the opening bracket.
t : Align at the beginning of the line (or current indentation level).
nil : Align at the same column of previous types (e.g. col of class T).")
(defun brian-c-lineup-template--closebracket-p ()
"Return t if the line contains only a template close bracket, >."
(save-excursion
(beginning-of-line)
;; Check if this line is empty except for the trailing bracket, >
(looking-at (rx (zero-or-more blank)
">"
(zero-or-more blank)))))
(defun brian-c-lineup-template--pos-to-col (pos)
(save-excursion
(goto-char pos)
(current-column)))
(defun brian-c-lineup-template--calc-open-bracket-pos (langelem)
"Calculate the position of a template declaration opening bracket via LANGELEM."
(save-excursion
(c-with-syntax-table c++-template-syntax-table
(goto-char (c-langelem-pos langelem))
(1- (re-search-forward "<" (point-max) 'move)))))
(defun brian-c-lineup-template--calc-indent-offset (ob-pos)
"Calculate the indentation offset for lining up types given the opening
bracket position, OB-POS."
(save-excursion
(c-with-syntax-table c++-template-syntax-table
;; Move past the opening bracket, and check for types (basically not space)
;; if types are on the same line, use their starting column for indentation.
(goto-char (1+ ob-pos))
(cond ((re-search-forward (rx
(or "class"
"typename"
(one-or-more (not blank))))
(c-point 'eol)
'move)
(goto-char (match-beginning 0))
(current-column))
(t
(back-to-indentation)
(+ c-basic-offset (current-column)))))))
(defun brian-c-lineup-template-args (langelem)
"Align template arguments and the closing bracket in a semi-custom manner."
(let* ((ob-pos (brian-c-lineup-template--calc-open-bracket-pos langelem))
(ob-col (brian-c-lineup-template--pos-to-col ob-pos))
(offset (brian-c-lineup-template--calc-indent-offset ob-pos)))
;; Optional check for a line consisting of only a closebracket and
;; line it up either at the start of indentation, or underneath the
;; column of the opening bracket
(cond ((and brian-c-lineup-template-closebracket
(brian-c-lineup-template--closebracket-p))
(cond ((eq brian-c-lineup-template-closebracket 'under)
(vector ob-col))
(t
0)))
(t
(vector offset)))))

It's a different approach than changing the tabs, but what about using a snippet system like YASnippet (see examples here)?
The only issue is that if you reformat the document with M-x indent-region (or just that section), it will probably go back to the other tab rules.

Emacs - override indentation

I have a multiply nested namespace:
namespace first {namespace second {namespace third {
// emacs indents three times
// I want to indent here
} } }
so emacs indents to the third position. However I just want a single indentation.
Is it possible to accomplish this effect simply?

Use an absolute indentation column inside namespace:
(defconst my-cc-style
'("gnu"
(c-offsets-alist . ((innamespace . [4])))))
(c-add-style "my-cc-style" my-cc-style)
Then use c-set-style to use your own style.
Note that this only works in c++-mode, c-mode doesn't know 'innamespace'.
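Activating the style could look like the sketch below. It assumes the c-add-style call above has already been evaluated; the function name my-use-cc-style is hypothetical.

```elisp
;; Sketch: `my-use-cc-style' is a hypothetical name.
(defun my-use-cc-style ()
  "Switch the current buffer to the \"my-cc-style\" style."
  (c-set-style "my-cc-style"))

(add-hook 'c++-mode-hook #'my-use-cc-style)
```

The style can also be selected by hand in a single buffer with M-x c-set-style RET my-cc-style RET.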

With c++-mode in Emacs 23, I had to do like this:
(defun my-c-setup ()
(c-set-offset 'innamespace [4]))
(add-hook 'c++-mode-hook 'my-c-setup)
To disable the indentation in namespaces altogether, change [4] to 0.

OK so this seems to work in both emacs 21 and 22 at least:
(defun followed-by (cases)
(cond ((null cases) nil)
((assq (car cases)
(cdr (memq c-syntactic-element c-syntactic-context))) t)
(t (followed-by (cdr cases)))))
(c-add-style "foo"
`(( other . personalizations )
(c-offsets-alist
( more . stuff )
(innamespace
. (lambda (x)
(if (followed-by
'(innamespace namespace-close)) 0 '+))))))
(The first solution doesn't support constructs like
namespace X { namespace Y {
class A;
namespace Z {
class B;
}
}}
)

If you simply want to input a literal tab, rather than changing emacs' indentation scheme, C-q TAB should work.

Unfortunately, I don't think Emacs has a separate style for a namespace inside another namespace. If you go to the inner line and press C-c C-o, you can change the topmost-intro style, and if you run customize-variable c-offsets-alist you can edit all the different indentation options Emacs has, but one doesn't exist for your specific use case. You would need to write it manually in Elisp.

This works for me: inherit from the "cc-mode" style and set the namespace indentation to 0, i.e. disable its indentation.
(defconst my-cc-style
'("cc-mode"
(c-offsets-alist . ((innamespace . [0])))))
(c-add-style "my-cc-mode" my-cc-style)
