Match Regex to a new line that has no characters preceding it - regex

example text

example text
I was wondering if there is a way to match the line break in the middle of these two bits of text.
I was using \n, but it matches both at the end of "example text" and on the blank line.
I am using this in a text-to-speech program called Voicedream so that it says out loud that the text has progressed to a new line.

I suggest that you only match a newline that is preceded by another newline.
Use a positive lookbehind (?<=\n):
(?<=\n)\n
^^^^^^^
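As a rough illustration, here is how the lookbehind behaves in Python's re module. This is an assumption for demonstration only, since the text-to-speech program may use a different regex engine; the sample string and the " new line " marker are made up.

```python
import re

# Two lines of text separated by one blank line (sample data for this sketch).
text = "example text\n\nexample text"

# Match a newline only when the character before it is also a newline.
blank_line_break = re.compile(r"(?<=\n)\n")

# A bare \n matches twice; the lookbehind version matches only the second one.
plain_positions = [m.start() for m in re.finditer(r"\n", text)]
blank_positions = [m.start() for m in blank_line_break.finditer(text)]

# Replace the matched newline with a (hypothetical) marker to be read aloud.
spoken = blank_line_break.sub(" new line ", text)

print(plain_positions, blank_positions)
print(repr(spoken))
```

The newline that ends the first line is left untouched because it is preceded by a letter, not by another newline.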

Related

Regex with lazy match (for first occurrence only) in Notepad++

I have this regex to find all lines starting from the word "chapter" up to the first blank line: ^chapter.*^\s*$. However, I want it to match the first occurrence only, so I tried adding '.?' or '(.+?)' to the end. But I am not sure how to apply the lazy quantifier here.
Example text:
Chapter 1: some text
more than one line,
next line.

Chapter 2: text text text
other text

Chapter 3: more text
more lines
more lines
So the regex should match from the first word "Chapter" up to the blank line before the next chapter.
You can use /Chapter((?!\n\n).)*/s, or on Windows Chapter((?!\R\R)(.|\R))*\R?
Chapter matches the literal text at the beginning of a chapter.
((?!\n\n).)* matches one character at a time, as long as the current position is not immediately followed by two newlines (because of the negative lookahead (?!\n\n)).
Note the s option, which makes the dot . match a newline. If you don't have such an option in Notepad++, you can use Chapter((?!\n\n)(.|\n))*, and on Windows Chapter((?!\R\R)(.|\R))*\R? to match the newline.
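The following sketch tries the first pattern in Python's re module rather than Notepad++, so treat it as an approximation. re.S stands in for the s option, and the sample text assumes a blank line between chapters and LF line endings.

```python
import re

# Sample text with a blank line between chapters (assumed layout).
text = (
    "Chapter 1: some text\nmore than one line,\nnext line.\n"
    "\n"
    "Chapter 2: text text text\nother text\n"
    "\n"
    "Chapter 3: more text\nmore lines\nmore lines\n"
)

# re.S plays the role of the /s option: the dot also matches newlines.
# The lookahead stops the repetition just before two consecutive newlines.
pattern = re.compile(r"Chapter((?!\n\n).)*", re.S)

# re.search returns only the first occurrence.
first_chapter = pattern.search(text).group(0)
print(repr(first_chapter))
```

The match ends right after "next line." because the next position is followed by two newlines.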
You may use this regex to select the first set of lines starting with Chapter:
\AChapter.*\r?\n(?:.*?\S.*\r?\n)*
RegEx Details:
\A: Start anchor (matches once per document)
Chapter.*\r?\n: Match text Chapter followed by any text till line break
(?:.*?\S.*\r?\n)*: Match 0 or more following lines containing at least one non-space character
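A minimal sketch of this pattern in Python's re module, which also supports \A. The sample text and its blank-line layout are assumptions. Note that \A ties the match to the very start of the document, so nothing is found if other text precedes the first chapter.

```python
import re

# Sample text with a blank line between chapters (assumed layout).
text = (
    "Chapter 1: some text\nmore than one line,\nnext line.\n"
    "\n"
    "Chapter 2: text text text\nother text\n"
)

# \A anchors at the start of the whole string; each following line must
# contain at least one non-whitespace character (\S) to be included.
pattern = re.compile(r"\AChapter.*\r?\n(?:.*?\S.*\r?\n)*")

match = pattern.search(text)
first_block = match.group(0) if match else ""

# With leading text, \A can no longer match at "Chapter".
no_match = pattern.search("intro\n" + text)

print(repr(first_block))
print(no_match)
```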

Notepad++ N text lines separated by blank lines?

I searched a bit, but didn't find a solution for this specific situation. I need to combine groups of non-blank lines into single lines, while preserving the blank lines. For example, the input:
Hi, My name is
Max

What are you
doing
Right now?

Hi

Hello
World
should be output as:
Hi, My name is Max

What are you doing Right now?

Hi

Hello World
Thanks in advance to all who respond.
You could try replacing
(?<![\n\r])[\n\r](?![\n\r])
with a space.
Explanation:
(?<![\n\r]) is a negative lookbehind: whatever is matched must not be preceded by a newline or a carriage return (treat the latter as just another newline).
[\n\r] is the newline or carriage return that is matched (and later replaced with a space).
(?![\n\r]) is a negative lookahead: the matched newline must not be followed by another newline or carriage return.
In essence, this replaces every line break that is neither preceded nor followed by another line break with a space, so blank lines are left alone.
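Here is a small sketch of that replacement using Python's re.sub instead of Notepad++. The sample input assumes blank lines between the groups and single-character (LF) line endings. With CRLF endings neither \r nor \n qualifies, because each one sits next to the other, so such text would need to be normalized first.

```python
import re

# Groups of lines separated by blank lines (assumed layout, LF endings).
text = (
    "Hi, My name is\nMax\n\n"
    "What are you\ndoing\nRight now?\n\n"
    "Hi\n\n"
    "Hello\nWorld"
)

# A line break with no other line break directly before or after it.
single_break = re.compile(r"(?<![\n\r])[\n\r](?![\n\r])")

joined = single_break.sub(" ", text)
print(joined)

# With CRLF endings nothing is replaced: \r is followed by \n,
# and \n is preceded by \r.
crlf_result = single_break.sub(" ", "a\r\nb")
```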
You can try this too:
(?m)(?!^\s*$)(^[^\n]*)\n(?!^\s*$)
This matches every line that is not empty and is not followed by an empty line, so that the matched newline character (\n) can be removed.
In Notepad++, however, you must also account for the carriage return (\r) before the newline (\n). Thus:
(?m)(?!^\s*$)(^[^\n]*)\r\n(?!^\s*$)

Regex for matching text between two regex patterns

I am looking for a way to capture text and its paragraph title from a text document.
Text File:
paraTitle-1
--------
Lines and words
empty....
more lines
still part of paraTitle-1
paraTitle-2
--------
Lines and words
empty....
more lines
still part of paraTitle-2
I want to capture both the titles and the text below them.
array = [paraTitle-1: <text...below paraTitle-1>,
paraTitle-2: <text below paraTitle-2>]
I made a few attempts with pattern (?<=(.*))\n----*\n(?=(.*)) to no avail. Any guidance would be awesome.
The following regex will do:
(?!--------\R)(.*)\R--------\R((?:\R?(?!.*\R--------\R).*)+)
The title separator line (--------) can also be specified as -{8}, which is easier to adjust to variable length if needed, e.g. instead of exactly 8 dashes, it could be 6 or more: -{6,}
Explanation:
Capture a line of text (paragraph title):
(.*)\R
The . doesn't match line break characters
\R matches line breaks, including the Windows CRLF pair. If your regex engine doesn't support \R, use \r?\n as a simple alternative.
Make sure the captured text is not the title separator line:
(?!--------\R)
Skip the mandatory title separator line:
--------\R
Capture the paragraph text, as a repeating group of lines:
((?:xxx)+)
A line has an optional leading line break (first line doesn't have one):
\R?.*
But make sure the line is not the title of the next paragraph, i.e. it's not a line followed by the title separator line.
(?!.*\R--------\R)
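The pieces above can be assembled and tried out in Python as a rough sketch. Python's re module does not support \R, so the \r?\n alternative mentioned in the answer is used instead. The NL helper and the dictionary output are choices made for this example, not part of the original answer.

```python
import re

# Sample document taken from the question (no trailing newline).
text = (
    "paraTitle-1\n--------\nLines and words\nempty....\nmore lines\n"
    "still part of paraTitle-1\n"
    "paraTitle-2\n--------\nLines and words\nempty....\nmore lines\n"
    "still part of paraTitle-2"
)

# Stand-in for \R, which Python's re does not support (helper for this sketch).
NL = r"(?:\r?\n)"

pattern = re.compile(
    # Title line (not the separator itself), then the separator line.
    rf"(?!--------{NL})(.*){NL}--------{NL}"
    # Body lines, stopping before a line that is followed by a separator.
    rf"((?:{NL}?(?!.*{NL}--------{NL}).*)+)"
)

# findall returns (title, body) tuples, one per paragraph.
paragraphs = {title: body for title, body in pattern.findall(text)}

for title, body in paragraphs.items():
    print(title, "->", repr(body))
```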

Multi-line regular expressions in Visual Studio Code

I cannot figure out a way to make a regular expression match continue past the end of a line, up to the end of the file, in VS Code. Is it a tool limitation, or is there some kind of pattern that I am not aware of?
It seems the CR is not matched with [\s\S]. Add \r to this character class:
[\s\S\r]+
will match any 1+ chars.
Other alternatives that proved to work are [^\r]+ and [\w\W]+.
If you want to make any character class match line breaks, be it a positive or negative character class, you need to add \r in it.
Examples:
Any text between the two closest a and b chars: a[^ab\r]*b
Any text between START and the closest STOP words:
START[\s\S\r]*?STOP
START[^\r]*?STOP
START[\w\W]*?STOP
Any text between the closest START and STOP words:
START(?:(?!START)[\s\S\r])*?STOP
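To see the difference between the last two kinds of pattern, here is a sketch in Python's re module. This is not VS Code's engine, and in Python [\s\S] already matches \r, so the extra \r is harmless but not needed. The sample string is made up.

```python
import re

# Two START markers before the first STOP (sample data for this sketch).
text = "START a START b\nc STOP d STOP"

# Lazy match: begins at the first START and ends at the first STOP after it.
lazy = re.search(r"START[\s\S\r]*?STOP", text).group(0)

# The (?!START) check forbids another START inside the match, so the
# result begins at the START that is closest to the STOP.
closest = re.search(r"START(?:(?!START)[\s\S\r])*?STOP", text).group(0)

print(repr(lazy))
print(repr(closest))
```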
To match a multi-line text block starting from aaa and ending with the first bbb (lazy quantifier):
aaa(.|\n)+?bbb
To find a multi-line text block starting from aaa and ending with the last bbb (greedy quantifier):
aaa(.|\n)+bbb
If you want to exclude certain characters from the "in between" text, you can do that too. This only finds blocks where the character "c" doesn't occur between "aaa" and "bbb":
aaa([^c]|\n)+?bbb
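The three patterns can be compared side by side with a short Python sketch. Python's re module is used here as a stand-in for the VS Code search box, and the sample strings are invented for the demonstration.

```python
import re

# One aaa followed by two bbb markers on different lines (sample data).
text = "aaa 1\nbbb 2\nbbb"

# Lazy: stops at the first bbb.
lazy = re.search(r"aaa(.|\n)+?bbb", text).group(0)

# Greedy: runs on to the last bbb.
greedy = re.search(r"aaa(.|\n)+bbb", text).group(0)

# Excluding "c": the first aaa...bbb block contains a c, so it is skipped.
no_c = re.search(r"aaa([^c]|\n)+?bbb", "aaa c bbb aaa d bbb").group(0)

print(repr(lazy), repr(greedy), repr(no_c))
```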

Regex - Ignore newlines - match anything until end of text or numbers

Hello and thanks for reading. I am once again playing with regex and trying to learn more about it. This is a regex question, so please don't offer other solutions; I can easily do this with other VB methods. I love it every time I improve my regex brain.
Take NEWLIne to mean a carriage return. I have a textbox in this format:
NEWLIne
NEWLIne
hello some text
NEWLIne
some more text
NEWLIne
NEWLIne
I would like to match the data
hello some text
NEWLIne
some more text
The match should ignore every newline until it reaches a number or letter (plus a few special characters), accept ONE newline, and then continue matching the new text until it reaches another line break.
Here is what I have: (?i)(?<=\n+)[a-z0-9 :\-\n]+(?=\n+)
But it still matches everything. I guess it's because of the \n]+ part.
With the assumption that a new line is specified by CR, try: (?im)^.+\n.+$
m in (?im) specifies multi-line mode, which should be useful for your needs.
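A quick way to check the suggested pattern is the Python sketch below. It assumes that each NEWLIne in the question is a single \n character, and it uses Python's re module rather than the .NET engine that VB would use, so behavior with \r\n line breaks may differ.

```python
import re

# Leading and trailing blank lines around two lines of text
# (assumed reading of the NEWLIne sample in the question).
text = "\n\nhello some text\nsome more text\n\n"

# ^.+ needs a non-empty line, \n allows exactly one line break,
# and .+$ needs a second non-empty line.
match = re.search(r"(?im)^.+\n.+$", text)
data = match.group(0) if match else ""

print(repr(data))
```

The blank lines before and after are skipped because .+ requires at least one character that is not a line break.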