AND operator for text search in Emacs - regex

I am new to Emacs. I can search for text and show all lines in a separate buffer using "M-x occur". I can also search for multiple text items using OR operator as : one\|two , which will find lines with "one" or "two" (as explained on Emacs occur mode search for multiple strings). How can I search for lines with both "one" and "two"? I tried using \& and \&& but they do not work. Will I need to create a macro or function for this?
Edit:
I tried writing a function for the above in Racket (a Scheme derivative). The following works:
#lang racket
(define text '("this is line number one"
"this line contains two keyword"
"this line has both one and two keywords"
"this line contains neither"
"another two & one words line"))
(define (srch . lst) ; takes variable number of arguments
(for ((i lst))
(set! text (filter (λ (x) (string-contains? x i)) text)))
text)
(srch "one" "two")
Output:
'("this line has both one and two keywords" "another two & one words line")
But how can I put this in Emacs Lisp?

Regex doesn't support "and" because it has very limited usefulness and weird semantics when you try to use it in any nontrivial regex. The usual fix is to just search for one.*two\|two.*one ... or in the case of *Occur* maybe just search for one and then M-x delete-non-matching-lines two.
(You have to mark the *Occur* buffer as writable before you can do this. read-only-mode is a toggle; the default keybinding is C-x C-q. At least in my Emacs, you have to move the cursor away from the first line or you'll get "Text is read-only".)
(defun occur2 (regex1 regex2)
  "Search for lines matching both REGEX1 and REGEX2 by way of `occur'.
We first (occur regex1) and then do (delete-non-matching-lines regex2) in the
*Occur* buffer."
  (interactive "sFirst term: \nsSecond term: ")
  (occur regex1)
  (save-excursion
    (other-window 1)
    (let ((buffer-read-only nil))
      (forward-line 1)
      (delete-non-matching-lines regex2))))
The save-excursion and other-window is a bit of a wart but it seemed easier than hardcoding the name of the *Occur* buffer (which won't always be true; you can have several occur buffers) or switching there just to fetch the buffer name, then Doing the Right Thing with set-buffer etc.
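For completeness, the one.*two\|two.*one pattern and the list-filtering idea from the question's Racket code can both be sketched in Emacs Lisp. This is a minimal sketch, not part of the answer above: the names my-and-regexp and my-lines-matching-all are made up for illustration, and the terms are treated as literal strings (via regexp-quote) rather than as regexps.

```elisp
(require 'seq)

(defun my-and-regexp (term1 term2)
  "Return a regexp matching lines that contain TERM1 and TERM2 in either order.
Hypothetical helper; both terms are quoted, so they match literally."
  (let ((a (regexp-quote term1))
        (b (regexp-quote term2)))
    (concat a ".*" b "\\|" b ".*" a)))

(defun my-lines-matching-all (lines &rest terms)
  "Return the strings in LINES that contain every string in TERMS.
Hypothetical helper mirroring the Racket `srch' function from the question."
  (seq-filter
   (lambda (line)
     ;; Keep LINE only if each term occurs somewhere in it.
     (seq-every-p (lambda (term)
                    (string-match-p (regexp-quote term) line))
                  terms))
   lines))
```

With these definitions, (occur (my-and-regexp "one" "two")) lists the lines that contain both words, and (my-lines-matching-all text "one" "two") filters a plain list of strings the way the Racket function does.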

Related

emacs isearch with automatic application of a regex between characters

I would like to have a hook into Emacs's isearch-forward function to make it automatically apply a regex between the input characters while searching a string. For example, I would like to set this regex to [-=<>]. If I now type foobar into isearch, it should match foo<bar, fo=ob=>ar, f-o-o-b-a-r, etc.
Is such a functionality already available? I looked into ELPA and MELPA without success. In case this is not available, and since my Elisp abilities are very limited: How could this be implemented?
OK, I found a solution by myself after inspecting hexl.el from Emacs.
Here's the code.
(defun my-isearch-function ()
"Make isearch skip characters -=<> while searching."
(if (not isearch-regexp)
(lambda (string &optional bound noerror count)
(funcall
(if isearch-forward
're-search-forward
're-search-backward)
(mapconcat (lambda (c) (regexp-quote (string c))) string
"\\(?:[-=<>]*\\)?")
bound
noerror
count))
(isearch-search-fun-default)))
(defun toggle-my-isearch ()
"Toggle my search mode.
If activated, incremental search skips characters -=<> while
searching.
For example, searching `foobar' matches `foo-bar' or `f-o-o=b<a>r'."
(interactive)
(if (eq isearch-search-fun-function 'isearch-search-fun-default)
(progn
(setq isearch-search-fun-function 'my-isearch-function)
(message "my isearch on"))
(setq isearch-search-fun-function 'isearch-search-fun-default)
(message "my isearch off")))
(global-set-key (kbd "s-s") 'toggle-my-isearch)
I wrote a package called flex-isearch, that basically inserts ".*" in between each character of the search string (it's a bit more complicated than that) and switches to regexp searching. It does this automatically when the isearch fails.
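The idea of joining the characters of the search string with a filler regexp can be sketched as follows. This is only an illustration of the technique, not the actual code of flex-isearch; the function name my-flex-regexp is hypothetical.

```elisp
(defun my-flex-regexp (string &optional filler)
  "Return a regexp matching the characters of STRING with FILLER between them.
FILLER defaults to \".*\".  Hypothetical helper for illustration only."
  (mapconcat (lambda (c) (regexp-quote (string c)))  ; quote each character
             string
             (or filler ".*")))
```

For example, (my-flex-regexp "foobar") gives a regexp that allows anything between the letters, while (my-flex-regexp "foobar" "[-=<>]*") only allows the characters -=<> between them, as in the first answer.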

Open file in racket and use regex on said file to print matches

I have been trying to use regular expressions in Racket on a text file full of random words separated by the end-of-line character \n. I'm trying to read in the file as a string or list (whichever is easiest and most intuitive) and use a regex to print all the words in the file of length 6 that do not contain a certain letter (in this case the letter t). Below you can see how I read in the file, but I am not sure how to use its resulting list because of the lack of variables. You can also see below a test with a regex whose actual outcome is #f, when I want the words grumpy and foobar returned, excluding stumpy.
#lang racket
(require 2htdp/batch-io)
(require racket/match)
;(file->string "words.txt");;reads in a file to a string
;(file->list "words.txt");; reads in a file to a list
(define (listMatches)
(regexp-match #rx"\b[^<t> | ^<T> | ^<\n>]{<6>}\b" "grumpy\nstumpy\nfoobar" )
)
I am very new to Racket and would love some input, useful links, and any other help.
I would not use a regex at all, but rather use for/list in combination with string-length and string-contains? to solve the problem. The overall solution looks something like this:
(call-with-input-file* "words.txt"
(lambda (f)
(for/list ([i (in-lines f)]
#:when (and (= (string-length i) 6)
(not (string-contains? i "t"))))
i)))
call-with-input-file* takes a procedure and, in this case, binds f to an input port opened on the file. The port is closed automatically when the procedure returns, so we do not need to close the file ourselves when we are done with it.
Finally, string-contains? was added relatively recently to Racket. If you need to support older versions of Racket, you can use regexp-match to just search for "t", which is a much easier regex to write.
One of the things Racket regular expressions can take as a value to match a regular expression against is an input port. This means you can look for matches in a file without having to first read from it; the matching code will do that part for you. Combine with using multi-line mode so that ^ and $ match after and before newlines as well as the very beginning and end of the input, and you get a simple approach using regexp-match* and a RE that matches 6 non-t characters on a line by themselves:
#lang racket/base
(require racket/port)
;;; Using a string port to demonstrate
(define input "grumpy\nstumpy\nfoobar")
(define (list-matches inp)
(map bytes->string/utf-8 (regexp-match* #px"(?m:^[^t]{6}$)" inp)))
(println (call-with-input-string input list-matches)) ; '("grumpy" "foobar")
The big thing to remember about using an input port is that what it returns are byte strings; you have to convert them to strings yourself.

Remove Spaces From List of Lists in Racket

I'm working on a PL logic resolver, and I need to make sure the input either has no spaces or is evenly spaced. I think removing the spaces will be easier, so I'm writing a function that removes the spaces from an input.
So far I have:
;sample input
(define KB&!alpha
'((Girl)
(~ Boy)
(~~Boy)
( ~(FirstGrade ^ ~ ~ Girl))
(Boy / Child)))
(define formatted null)
;formatting function
(define (Format_Spaces KB&!alpha)
(for/list ((item KB&!alpha))
(cond
((list? item)(Format_Spaces item))
((not (eq? item " "))(set! formatted (append formatted (list item))))
((eq? item " ")(Format_Spaces (cdr KB&!alpha)))
)
)
)
But it's clearly giving me the wrong output.
Not only are the spaces still there, the output is a weird combination of the input. Can anybody help me out on this?
I want to get something like this:
'((FirstGrade)
(FirstGrade=>Child)
(Child^Male=>Boy)
(Kindergarten=>Child)
(Child^Female=>Girl)
(Female))
Thanks for reading.
EDIT: I'm trying to make the input uniform in format. In the new sample input, (~ Boy) is parsed as 2 symbols, (~~Boy) as 1 symbol, and (~ ~ Girl) as 3. I think this will be difficult to parse. Especially with different variations of symbols/operators/spaces. (ie. is "Child^" to be parsed as "Child","^" or as "Child^" a whole symbol?)
RE-EDIT:
Based on the comments you've made below, it looks to me like you're actually going to be writing this algorithm in Racket.
In that case, I have a much simpler prescription for you: Don't Do Anything. In particular, your input doesn't currently contain any spaces at all. The spaces you see are being inserted as part of Racket's display mechanism, in much the same way that a database printer might print fields separated with commas or tabs.
Rather than worrying about the commas, focus on the resolution algorithm. What does it take, and what does it produce?

Emacs lisp escaping regexp

As a first experience in defining a function for Emacs, I would like to write a function that takes all occurrences of argv[some number] and renumbers them in order.
This is done inside emacs with replace-regexp, entering as search/replace strings
argv\[\([0-9]+\)\]
argv[\,(+ 1 \#)]
Now, I want to write this in my .emacs so I understand I need to escape also for Lisp special characters. So in my opinion it should write
(defun argv-order ()
(interactive)
(goto-char 1)
(replace-regexp "argv\\[[0-9]+\\]" "argv[\\,\(+ 1 \\#\)]")
)
The search string works fine, but the replacement string gives me the error "invalid use of \ in replacement text". I've tried adding or removing some \ but with no success.
Any ideas?
Quoting the help from replace-regexp (the bold is mine):
In interactive calls, the replacement text may contain `\,'
You are not using it interactively in your defun, hence the error message. Another quote from the same help that helps solving your problem:
This function is usually the wrong thing to use in a Lisp program.
What you probably want is a loop like this:
(while (re-search-forward REGEXP nil t)
(replace-match TO-STRING nil nil))
which will run faster and will not set the mark or print anything.
And a solution based on that:
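(defun argv-order ()
  (interactive)
  (goto-char (point-min))   ; start from the top, as in the question
  (let ((count 1))          ; the question's (+ 1 \#) numbers from 1
    (while (re-search-forward "argv\\[[0-9]+\\]" nil t)
      (replace-match (format "argv[%d]" count) nil nil)
      (setq count (1+ count)))))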

regexp for elisp

In Emacs I would like to write some regexp that does the following:
First, return a list of all dictionary words that can be formed within "hex space". By this I mean:
#000000 - #ffffff
so #00baba would be a word (that can be looked up in the dictionary)
so would #baba00
and #abba00
and #0faded
...where trailing and leading 0's are considered irrelevant. How would I write this? Is my question clear enough?
Second, I would like to generate a list of words that can be made using numbers as letters:
0 = o
1 = i
3 = e
4 = a
...and so on. How would I write this?
First, load your dictionary. I'll assume that you're using /usr/share/dict/words, which is nearly always installed by default when you're running Linux. It lists one word per line, which is a very handy format for this sort of thing.
Next run M-x keep-lines. It'll ask you for a regular expression and then delete any line that doesn't match it. Use the regex ^[a-f]\{,6\}$ and it will filter out anything that can't be part of a color.
Specifically, the ^ anchors the regex at the beginning of the line, [a-f] matches any one character between a and f (inclusive), \{,6\} lets it match between 0 and 6 instances of the previous item (in this case the character class [a-f]), and finally the $ says that the next thing must be the end of the line.
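The same filter can be tried from Lisp on a small made-up word list. This is a minimal sketch, assuming the sample words below stand in for the dictionary buffer; the function name my-hex-words is hypothetical. Note that inside a Lisp string the backslashes of the regex are doubled.

```elisp
(defun my-hex-words (words)
  "Return the strings in WORDS made only of the letters a-f, at most 6 long.
Hypothetical helper; it runs `keep-lines' in a temporary buffer."
  (with-temp-buffer
    ;; One word per line, like the dictionary file.
    (insert (mapconcat #'identity words "\n") "\n")
    (goto-char (point-min))
    ;; Called from Lisp, keep-lines works from point to the end of the buffer.
    (keep-lines "^[a-f]\\{,6\\}$")
    (split-string (buffer-string) "\n" t)))

(my-hex-words '("abba" "zebra" "faded" "decade" "defaced"))
```

Here "zebra" is dropped because it contains letters outside a-f, and "defaced" because it is seven letters long, so the call should evaluate to ("abba" "faded" "decade").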
This will return a list of all instances of #000000 - #ffffff in the buffer, although this pattern may not be restrictive enough for your purposes.
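(let ((my-colour-list nil))
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward "#[0-9a-fA-F]\\{6\\}" nil t)
      ;; `add-to-list' is not meant for let-bound variables, so collect
      ;; each colour once with `member' and `push' instead.
      (let ((colour (match-string-no-properties 0)))
        (unless (member colour my-colour-list)
          (push colour my-colour-list)))))
  my-colour-list)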
I'm not actually certain that this is what you were asking for. What do you mean by "dictionary"?
A form that will return you a hash table with all the elements you specify in it could be this:
(let ((hash-table (make-hash-table :test 'equal)))
  ;; `expt' is exponentiation; `exp' takes a single argument.
  (dotimes (i (expt 256 3))
    (puthash (concat "#" (format "%06x" i)) t hash-table))
  hash-table)
I'm not sure how Emacs will cope with that many elements (about 16 million). Since you want to ignore the zeros, you can generate the space without the zero-padded format and strip the leading and trailing 0's. I don't know what you want to do with the rest of the digits. You can then write the function step by step like this:
(defun find-hex-space ()
  (let (return-list)
    ;; `expt' is exponentiation; `exp' takes a single argument.
    (dotimes (i (expt 256 3))
      (let* ((hex-number (strip-zeros (format "%x" i)))
             (found-word (gethash hex-number *dictionary*)))
        (if found-word (push found-word return-list))))
    return-list))
Function strip-zeros is easy to write, and here I suppose your words are in a hash called *dictionary*. strip-zeros could be something like this:
(defun strip-zeros (string)
  (let ((sm (string-match "^0*\\(.*?\\)0*$" string)))
    (if sm (match-string 1 string) string)))
I don't quite understand your second question. Would those words also come from the hex space? Would you then consider only words formed entirely from digits, or would you also include the letters in the word?