I am trying to match only the numbers after the following strings:
sequentialGrid: 650274
parallelGrid: 650274
My goal is to highlight only the numbers, via M-x highlight-regexp, on lines beginning with sequentialGrid: and parallelGrid:
Here was my attempt, using a Perl-like approach:
^sequentialGrid: \([0-9]*\).*/$1/
Unfortunately, Emacs does not support Perl functionality. Thus, I hope my request is not impossible or perhaps someone can offer a convenient workaround.
BTW I verified that ^sequentialGrid: \([0-9]*\).* highlights the entire line. I just need to extract the number.

If your goal is to add font-lock highlighting, the following expression will work:
2 font-lock-warning-face)))
The nil MODE parameter applies it to the current buffer only; alternatively, you could specify the mode name as a symbol. See the manual and the wiki for more on font-lock-add-keywords and font-lock-remove-keywords.
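The 2 in the expression above selects a sub-group of the match to highlight instead of the whole line. The same idea can be sketched outside Emacs. The following is only an illustration in Python, not Emacs Lisp; the sample text (including the extra `other:` line) and the variable names are assumptions made for the example.

```python
import re

# Sample input; the "other:" line is an assumed non-matching line.
log = "sequentialGrid: 650274\nparallelGrid: 650274\nother: 42\n"

# Group 1 captures only the digits after either prefix.
pattern = re.compile(r"^(?:sequentialGrid|parallelGrid): ([0-9]+)", re.MULTILINE)

# span(1) gives the start/end offsets of the captured number, not of the whole line.
spans = [m.span(1) for m in pattern.finditer(log)]
numbers = [log[start:end] for start, end in spans]
print(numbers)
```

The offsets in `spans` cover just the digits, which is the region a highlighter would need.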


Regex to match everything except a pattern

Regex noob here struggling with this, which I know will be easy for some of you regex gods out there!
Given the following:
title: Some title
date: 2022-08-15
tags: <value to extract>
identifier: 1234567
Some text
some more text
I would like a regex to match everything except the value of tags (ie the "<value to extract>" text).
For context, this is supposed to run on emacs (in case it matters).
EDIT: Just to clarify as per @phils' question, all I care about is extracting the tags value. However, this is via a package setting that asks for a regex string, and I don't have much control over how it gets used. It seems to expect a regex that strips what I don't need from the string rather than matching what I do want, which is slightly annoying. Also, since it seems to match everything with \\(.\\), I'm guessing it's using the global flag?
Please let me know if any of this isn't clear.
Emacs regular expressions can't trivially express "not foo" for arbitrary values of foo. (The likes of PCRE have non-regular extensions for zero-width negative look-ahead/behind assertions, but in Emacs that sort of functionality is generally done with the support of lisp code [1].)
You can still do it purely with regexp matching, but it's simply very cumbersome. An Emacs regexp which matches any line which does not begin with tags: is:
or if you need to enter it in the elisp double-quoted read syntax for strings:
[1] In lisp code you would instead simply check each line to see whether it does start with tags: and, if so, skip it (which is why Emacs generally gets away without the feature you're looking for, but of course that doesn't help you here).
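For comparison, here is a minimal sketch of how the negative look-ahead mentioned above behaves in an engine that has it. This uses Python's re module, not Emacs, so it does not solve the package-setting problem; the pattern and variable names are assumptions for illustration.

```python
import re

text = """title: Some title
date: 2022-08-15
tags: <value to extract>
identifier: 1234567
Some text
some more text
"""

# First alternative: any whole line that does NOT start with "tags:".
# Second alternative: the "tags:" key itself plus following whitespace.
strip_pattern = re.compile(r"^(?!tags:).*\n?|^tags:\s*", re.MULTILINE)

# Stripping every match leaves only the tags value.
remaining = strip_pattern.sub("", text)
print(repr(remaining))
```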
After playing around with it for a bit and taking inspiration from @phils' answer, I've come up with the following:
I've also added an extra \\(#\\+\\)? to account for org meta keys which would usually have the format #+key: value.

regular expression matching filename with multiple extensions

Is there a regular expression to match the some.prefix part of both of the following filenames?
xyz can be any character of [a-z0-9-_\ ]
some.prefix part can be any character in [a-zA-Z0-9-_\.\ ].
I intentionally included a . in some.prefix.
I have tried many combinations. For example:
It works with abc.def.csv by catching abc.def, but fails to catch it in abc.def.csv.gz.
I primarily use Python, but I thought the regex itself should apply to many languages.
Update: It's not possible, see discussion with @nowox below.
I think your regex works pretty well. I recommend trying it on regex101 with your example:
The expression
^(?i)[ \w-]+\.[ \w-]+
should work in your case:
som e.prefix.xyz.xyz
And in Python you can use:
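import re

# Sample filenames taken from the question and the example above.
text = """some.prefix.xyz.xyz
som e.prefix.xyz.xyz
abc.def.csv.gz"""
# The inline flag goes first: Python 3.11+ rejects global flags placed after ^.
print(re.findall(r'(?i)^[ \w-]+\.[ \w-]+', text, re.MULTILINE))
Which will display:
['some.prefix', 'som e.prefix', 'abc.def']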
I think you may be a bit confused about your requirement. To summarize, you have a pathname made of characters and dots, such as:
How would you separate these strings into a basename and an extension? Here we recognize some known patterns: .tar.gz is definitely an extension, but is .bar.baz.0 the extension, or is it only .0?
The answer is not easy, and no regex in the world can guess the correct answer 100% of the time without some hints.
For example you can list the acceptable extensions and make some criteria:
An extension matches the regex \.\w{1,4}$
Several extensions may be concatenated together: (\.\w{1,4}){1,4}$
The remainder is called the basename
From this you can build this regular expression:
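One way such hints might be encoded, as a hedged Python sketch rather than the expression this answer intended: it assumes the acceptable extensions are known in advance (here only csv and gz, a hypothetical list), and the helper name split_name is invented for the example.

```python
import re

# Hypothetical allow-list of extensions; extend it to suit your data.
KNOWN_EXTENSIONS = ("csv", "gz")

# Lazy basename, then one or more known extensions up to the end of the name.
PATTERN = re.compile(
    r"^(?P<base>.+?)(?P<ext>(?:\.(?:%s))+)$" % "|".join(KNOWN_EXTENSIONS)
)

def split_name(filename):
    """Return (basename, extensions); extensions is '' when none is recognized."""
    m = PATTERN.match(filename)
    if m is None:
        return filename, ""
    return m.group("base"), m.group("ext")

print(split_name("abc.def.csv"))
print(split_name("abc.def.csv.gz"))
```

With the generic \.\w{1,4} criterion instead of an allow-list, def would itself qualify as an extension, which is exactly the ambiguity described above.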
Try this: [a-z0-9-_\\]+\.[a-z0-9-_\\]+[a-zA-Z0-9-_\.\\]+

Regular expression to remove comment

I am trying to write a regular expression which finds all the comments in text.
For example, everything between /* and */.
/* Hello */
When I try this: /\*.*\*/, it behaves oddly and nothing is shown. What is wrong with it?
EDIT: The comments can be spread across multiple lines
Unlike the example posted above, you were trying to match comments that span multiple lines. By default, . does not match a line break. Thus you have to enable the mode in which . also matches line breaks (often called dot-all or single-line mode; multi-line mode usually only changes ^ and $) to match multi-line comments.
Also, you probably need to use .*? instead of .*. Otherwise it will make the largest match possible, which will be everything between the first open comment and the last close comment.
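A small sketch of both points, using Python's re module rather than Sublime Text (the sample string and variable names are made up for illustration). re.DOTALL is Python's name for the flag that lets . match line breaks.

```python
import re

source = "a = 1; /* first\ncomment */ b = 2; /* second */ c = 3;"

# Greedy: one match running from the first /* to the last */.
greedy = re.findall(r"/\*.*\*/", source, re.DOTALL)

# Non-greedy: each comment is matched separately.
lazy = re.findall(r"/\*.*?\*/", source, re.DOTALL)

# Without DOTALL, the comment that spans a line break is missed.
no_dotall = re.findall(r"/\*.*?\*/", source)

print(len(greedy), len(lazy), no_dotall)
```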
I don't know how to make . match line breaks in Sublime Text 2. I'm not sure it is available as a mode. However, you can insert a line break into the actual pattern by using CTRL + Enter. So, I would suggest this alternative:
If Sublime Text 2 doesn't recognize the \n, you could alternatively use CTRL + Enter to insert a line break in the pattern, in place of \n.
I encountered this problem several years ago and wrote an entire article about it.
If you don't have access to non-greedy matching (not all regex libraries support non-greedy) then you should use this regex:
If you do have access to non-greedy matching then you can use:
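As a hedged illustration of the two variants, here is a Python sketch. The patterns are commonly used ones and are not necessarily the exact expressions from the article; the names NO_LAZY and LAZY and the sample text are assumptions.

```python
import re

# Without non-greedy quantifiers: consume non-stars, then runs of stars,
# continuing only while the star run is not followed by a slash.
NO_LAZY = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/")

# With a non-greedy quantifier; [\s\S] matches any character, line breaks included.
LAZY = re.compile(r"/\*[\s\S]*?\*/")

sample = "x = 1; /* one */ y = 2; /* two\n * lines **/ z = 3;"

print(NO_LAZY.findall(sample))
print(LAZY.findall(sample))
```

On this sample both patterns find the same two comments.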
Also, keep in mind that regular expressions are just a heuristic for this problem. Regular expressions can't handle cases in which something appears to be a comment to the regular expression but actually isn't:
someString = "An example comment: /* example */";
// The comment around this code has been commented out.
// /*
// */
Just want to add that for HTML comments it is this:
Just an additional note about using regex to remove comments inside a programming language file.
Doing this, you must not forget the case where the string /* or */ appears inside a string in the code, like var string = "/*"; (you never know, especially if you parse a large codebase that is not yours)!
So the best approach is to parse the document with a programming language and keep a boolean that tracks whether a string is open (and ignore any match inside an open string).
Again, a string delimited by " can contain a \", so pay attention with the regex!
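A minimal sketch of that idea in Python, under stated assumptions: only double-quoted strings and /* */ comments are handled, and the function name strip_block_comments is invented for the example. Single quotes, // comments, regex literals and template strings are deliberately ignored here.

```python
def strip_block_comments(code):
    """Remove /* ... */ comments, leaving double-quoted strings untouched."""
    out = []
    i = 0
    n = len(code)
    in_string = False  # True while we are inside a "..." literal
    while i < n:
        ch = code[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character too, so \" does not close the string.
                out.append(code[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif code.startswith("/*", i):
            # Skip to just past the closing */ (or to the end if unterminated).
            end = code.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

print(strip_block_comments('var s = "/*"; /* real comment */ x'))
```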
You cannot write a regular expression that would be able to correctly find all comments, or even one type of comments - single-line or multiline.
Regular expressions can only provide a partial match, one that would cover perhaps 90% of all cases, but that's it.
The syntax for regular expression literals is so complex that they can only be identified correctly in 100% of cases by doing a full expression evaluation, which in turn is based on tokenizing the code. The latter is a huge task, which is implemented by all AST parsers today. See AST Explorer.
Only a properly written AST parser can tell you precisely where all regular expressions are located in your code. You would have to write a parser then based on that.
Or, you could use one of the existing libraries that already do all that, like decomment.
RegEx examples where any head-on approach is going to stumble, being unable to tell a regular expression from a comment block:
/\// - it will think this reg-ex is a single-line comment
/\/*/ - it will think this reg-ex opens a multi-line comment
The answer which user1919238 wrote works. Just corroborating that here, although the many upvotes probably do give you a clue.
It got rid of all these annoying block comments, put here just to show the usefulness/thank user1919238 for saving time:
/*# sourceMappingURL=data:application/json;base64,eyJ2ZXJzaW9uIjozLCJzb3VyY2VzIjpbIndlYnBhY2s6Ly9zdHlsZXMvZ2xvYmFscy5jc3MiXSwibmFtZXMiOltdLCJtYXBwaW5ncyI6IkFBQUE7O0VBRUUsVUFBVTtFQUNWLFNBQVM7RUFDVDt3RUFDc0U7QUFDeEU7O0FBRUE7RUFDRSxjQUFjO0VBQ2QscUJBQXFCO0FBQ3ZCOztBQUVBO0VBQ0Usc0JBQXNCO0FBQ3hCIiwic291cmNlc0NvbnRlbnQiOlsiaHRtbCxcbmJvZHkge1xuICBwYWRkaW5nOiAwO1xuICBtYXJnaW46IDA7XG4gIGZvbnQtZmFtaWx5OiAtYXBwbGUtc3lzdGVtLCBCbGlua01hY1N5c3RlbUZvbnQsIFNlZ29lIFVJLCBSb2JvdG8sIE94eWdlbixcbiAgICBVYnVudHUsIENhbnRhcmVsbCwgRmlyYSBTYW5zLCBEcm9pZCBTYW5zLCBIZWx2ZXRpY2EgTmV1ZSwgc2Fucy1zZXJpZjtcbn1cblxuYSB7XG4gIGNvbG9yOiBpbmhlcml0O1xuICB0ZXh0LWRlY29yYXRpb246IG5vbmU7XG59XG5cbioge1xuICBib3gtc2l6aW5nOiBib3JkZXItYm94O1xufVxuIl0sInNvdXJjZVJvb3QiOiIifQ== */
If you want to remove the obnoxious comments from Flutter's main.dart:
Press Cmd+R on Mac or Ctrl+R on Windows,
type //.* into the top box and leave the box below it empty,
click .* in the replace dialog to activate regex,
then click Replace All. This will remove all your comments; you can do this to remove all comments in any file in a Flutter project.
Additionally, to reformat main.dart:
press Cmd+A on Mac or Ctrl+A on Windows,
then press Cmd+Alt(Option)+L or Ctrl+Alt+L; this will reformat the code.
I will attach a picture of main.dart; the green .* at the top of the page is what you press to activate the regex.
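The effect of the //.* replacement can be previewed with a short Python sketch. The Dart snippet below is invented for illustration. Note that the pattern is purely textual, so it also cuts into string literals that contain //, such as URLs.

```python
import re

# Hypothetical Dart source used only to demonstrate the replacement.
dart_source = '''// This is a comment
void main() {
  print("hello"); // trailing comment
  var url = "https://example.com"; // a note
}
'''

# Same pattern as in the replace dialog: everything from // to end of line.
stripped = re.sub(r"//.*", "", dart_source)
print(stripped)
```

The url line loses part of its string literal, so it is worth reviewing the result before saving.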

Help with an Emacs Regular Expression

I have statements like this all over my code:
LogWrite (String1,
L"=======format string======",
I want to change each of these to:
LogWrite (String1,
L"format string",
I'm trying to write the regexp required to do this using the Emacs function query-replace-regexp, but without much success so far. Help please!
1) In case it is not clear, this question is emacs specific.
2) I would like to match the entire code chunk starting from Log... ending at );
3) I used the following reg-exp to match the code chunk:
I used re-builder to match this regexp. The \n is used because I found that otherwise Emacs would stop matching at the newline. The problem is that I don't know how to select the format string and save it for use in the replacement, hence the ==.* part in the regexp. That part needs to be modified to save the format string.
If you don't have multiple (or escaped) double quotes in those format string lines, you can
Update: Removed the lazy quantifier (thanks @tim). Make sure that the regex is not multiline; the greedy * will lead to pretty bad results if . matches newlines.
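The missing piece in the question is a capture group plus a back-reference in the replacement. In Emacs syntax a group is written \(...\) and referenced as \1 in the replacement. The same mechanism is sketched below in Python, whose syntax differs, as an illustration only; the pattern is an assumption and not necessarily the one this answer proposed.

```python
import re

# The two lines shown in the question (the statement is truncated there as well).
code = 'LogWrite (String1,\n    L"=======format string======",\n'

# Group 1 captures the text between the runs of '=' inside the L"..." literal.
pattern = re.compile(r'L"=+([^"]*?)=+"')

# \1 in the replacement re-inserts the captured format string.
fixed = pattern.sub(r'L"\1"', code)
print(fixed)
```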
A great tool to figure out emacs regular expressions is:
M-x re-builder
A brief description from the documentation:
When called up `re-builder' attaches itself to the current buffer which becomes its target buffer, where all the matching is done. The active window is split so you have a view on the data while authoring the RE. If the edited expression is valid the matches in the target buffer are marked automatically with colored overlays (for non-color displays see below) giving you feedback over the extents of the matched (sub) expressions. The (non-)validity is shown only in the modeline without throwing the errors at you. If you want to know the reason why RE Builder considers it as invalid call `reb-force-update' ("\C-c\C-u") which should reveal the error.
It comes built into Emacs (since 21)
And for the syntax of Emacs regular expressions, you can read these info pages:
Syntax of Regular Expressions
Backslash in Regular Expressions
this should do.

Regex to change to sentence case

I'm using Notepad++ to do some text replacement in a 5453-row language file. The format of the file's rows is:
variable.name = Variable Value Over Here, that''s for sure, Really
Double apostrophe is intentional.
I need to convert the value to sentence case, except for the words "Here" and "Really" which are proper and should remain capitalized. As you can see, the case within the value is typically mixed to begin with.
I've worked on this for a little while. All I've got so far is:
(. )([A-Z])(.+)
which seems to at least select the proper strings. The replacement piece is where I'm struggling.
Find: (. )([A-Z])(.+)
Replace: \1\U\2\L\3
In Notepad++ 6.0 or better (which comes with built-in PCRE support).
Regex replacement cannot execute functions (like capitalization) on matches. You'd have to script that, e.g. in PHP or JavaScript.
Update: See Jonas' answer.
I built myself a Web page called Text Utilities to do that sort of thing:
paste your text
go in "Find, regexp & replace" (or press Ctrl+Shift+F)
enter your regex (mine would be ^(.*?\=\s*\w)(.*)$)
check the "^$ match line limits" option
choose "Apply JS function to matches"
add arguments (first is the match, then sub patterns), here s, start, rest
change the return statement to return start + rest.toLowerCase();
The final function in the text area looks like this:
return function (s, start, rest) {
    return start + rest.toLowerCase();
};
Maybe add some code to capitalize some words like "Really" and "Here".
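A rough sketch of that extra step, written in Python instead of the page's JavaScript. The function name sentence_case_line and the PROPER table are assumptions for illustration; the key/value separator is taken to be " = " as in the question's sample row.

```python
import re

# Words that must keep their capital letter (from the question).
PROPER = {"here": "Here", "really": "Really"}

def sentence_case_line(line):
    """Lower-case the value after ' = ', capitalize its first letter,
    and restore the proper words listed in PROPER."""
    key, sep, value = line.partition(" = ")
    if not sep:
        return line  # not a "name = value" row; leave it untouched
    lowered = value.lower()
    # Replace whole lower-case words that appear in the PROPER table.
    lowered = re.sub(r"[a-z]+", lambda m: PROPER.get(m.group(0), m.group(0)), lowered)
    return key + sep + lowered[:1].upper() + lowered[1:]

print(sentence_case_line("variable.name = Variable Value Over Here, that''s for sure, Really"))
```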
In Notepad++ you can use a plugin called PythonScript to do the job. If you install the plugin, create a new script like so:
Then you can use the following script, replacing the regex and function variables as you see fit:
import re
#change these
regex = r"[a-z]+sym"
function = str.upper
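def perLine(line, num, total):
    # Rewrite every match on this line with the chosen function.
    # Assumes the function preserves length (as str.upper does),
    # so the match offsets stay valid after each replacement.
    for match in re.finditer(regex, line):
        s, e = match.start(), match.end()
        line = line[:s] + function(line[s:e]) + line[e:]
    editor.replaceWholeLine(num, line)

# Apply perLine to every line of the current document.
editor.forEachLine(perLine)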
This particular example works by finding all the matches in a particular line, then applying the function to each match. If you need multiline support, the Python Script "Context-Help" explains all the functions offered, including the pymlsearch/pymlreplace functions defined under the 'editor' object.
When you're ready to run your script, go to the file you want it to run on first, then go to "Scripts >" in the Python Script menu and run yours.
Note: while you will probably be able to use notepad++'s undo functionality if you mess up, it might be a good idea to put the text in another file first to verify it works.
P.S. You can 'find' and 'mark' every occurrence of a regular expression using notepad++'s built-in find dialog, and if you could select them all you could use TextFX's "Characters->UPPER CASE" functionality for this particular problem, but I'm not sure how to go from marked or found text to selected text. But, I thought I would post this in case anyone does...
Edit: In Notepad++ 6.0 or higher, you can use "PCRE (Perl Compatible Regular Expression) Search/Replace" (source: http://sourceforge.net/apps/mediawiki/notepad-plus/?title=Regular_Expressions). So this could have been solved using a regex like (. )([A-Z])(.+) with a replacement argument like \1\U\2\L\3.
The questioner had a very specific case in mind. As a general "change to sentence case" in Notepad++, the first regexp suggestion did not work properly for me. While not perfect, here is a tweaked version which was a big improvement on the original for my purposes:
find: ([\.\r\n][ ]*)([A-Za-z\r])([^\.^\r^\n]+)
replace: \1\U\2\L\3
You still have a problem with lower case nouns, names, dates, countries etc. but a good spellchecker can help with that.