REGEX, get text and character across multiple lines - regex

From the following code
await ƒS.Speech.tell(
characters.narrator,
"Some text here"
);
await ƒS.Speech.tell(characters.anothername, "Some other text"
);
I would like to extract the following via regex:
characters.narrator, "Some text here"
characters.anothername, "Some other text"
However, so far I have only been able to get the text between the quotes via
"(.*?)"
How could the REGEX be extended to get that?

You could extend the pattern like:
\w+(?:\.\w+)+,\s+"[^"]*"
Explanation
\w+ Match 1+ word chars
(?:\.\w+)+ Repeat 1+ times matching . and again 1+ word chars
,\s+ Match a comma and 1+ whitespace chars (which can also match a newline)
"[^"]*" Match "..."
See a regex101 demo.
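As a quick check of the pattern above, here is a minimal Python sketch that runs it over the sample from the question. The variable names are illustrative, and collapsing the whitespace after the comma is an extra step added here (an assumption about the desired output), not part of the pattern itself.

```python
import re

# Sample input from the question (the call spans several lines).
source = '''await ƒS.Speech.tell(
characters.narrator,
"Some text here"
);
await ƒS.Speech.tell(characters.anothername, "Some other text"
);'''

# Dotted name, comma, whitespace (may include a newline), quoted string.
pattern = re.compile(r'\w+(?:\.\w+)+,\s+"[^"]*"')

# Collapse the whitespace after the first comma so each result is one line.
results = [re.sub(r',\s+', ', ', m, count=1) for m in pattern.findall(source)]

for line in results:
    print(line)
```

Because \s+ also matches a newline, the first call is found even though its arguments sit on separate lines.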

Related

Regex to match numbers, special chars, spaces and a specific whole word?

I'm trying to create a Regex to match numbers, special chars, spaces and a specific whole word ("ICT").
Example for the string:
[Columbia (ICT-59)]
Currently I've this Regex to match the numbers, special chars and spaces:
[\W\s\d]
And this one for the word "ICT":
(ICT)
How can I match both of these in one regular expression?
Use the regex:
/(?<=\[)[a-zA-Z]+\s\(ICT-\d+\)(?=\])/g
or
/^\[[a-zA-Z]+\s\(ICT-\d+\)\]$/g
You can use \w+ in place of [a-zA-Z]+ if you also want to allow digits and underscores in the location name.
demo:
https://regex101.com/r/ZS0jeO/1
https://regex101.com/r/hpQok3/1
You could capture the part that you want in a capture group right after the opening [ and match the rest of the format.
\[([^()]+?)\s*\(ICT-\d+\)]
\[ Match [
([^()]+?) Capture group 1, match 1+ chars other than ( or ), as few as possible
\s* Match optional whitespace chars
\(ICT-\d+\) Match (ICT- 1+ digits and )
] Match ] literally
Regex demo
Or matching just a single word using \w+
\[(\w+)\s*\(ICT-\d+\)]
Regex demo
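To compare the two approaches above on the example string, here is a small Python sketch. The variable names are illustrative; the patterns are the capture-group and lookaround versions given in the answers.

```python
import re

text = "[Columbia (ICT-59)]"

# Capture-group version: group 1 holds the location name.
capture_pattern = re.compile(r'\[([^()]+?)\s*\(ICT-\d+\)]')

# Lookaround version: the brackets are asserted but not part of the match.
lookaround_pattern = re.compile(r'(?<=\[)[a-zA-Z]+\s\(ICT-\d+\)(?=\])')

location = capture_pattern.search(text).group(1)
inner = lookaround_pattern.search(text).group(0)

print(location)
print(inner)
```

The capture-group version is the one to pick when only the location is needed; the lookaround version returns everything between the square brackets.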

Move all text between two "\t" in Notepad++

I have more than a million lines of text in this format:
AAAA BBBBBBBBBBBBBBB CCCC
Separated by \t
I want to have it in a format
AAAA_CCCC BBBBBBBBBBBBBBB
But I cannot seem to figure out how to do it using regular expressions in Notepad++
You may try the following find and replace, in regex mode:
Find: ^(\S+)\t(\S+)\t(\S+)$
Replace: $1_$3 $2
Here is a demo.
If the separator is a tab, you can use
^[^\r\n\t]+\K\t([^\r\n\t]+)\t([^\r\n\t]+)$
The pattern matches:
^ Start of the line
[^\r\n\t]+ Match 1+ chars other than a tab or newline
\K\t Forget what is matched so far using \K and match a tab
([^\r\n\t]+) Capture group 1, match 1+ chars other than a newline or tab
\t Match a tab
([^\r\n\t]+) Capture group 2, match 1+ chars other than a newline or tab
$ End of the line
In the replacement use the 2 capture groups with an underscore in between.
_$2 $1
See a regex demo.
The result of the replacement:
AAAA_CCCC BBBBBBBBBBBBBBB
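For anyone scripting this outside Notepad++, the same reordering can be sketched in Python. Python's re module does not support \K, so this sketch uses three capture groups instead. The second sample line is made up, and using a tab as the output separator is an assumption, since the question does not show which separator is wanted.

```python
import re

# Two tab-separated sample lines (the second line is a made-up example).
lines = "AAAA\tBBBBBBBBBBBBBBB\tCCCC\nDDDD\tEEEE\tFFFF"

# Three fields per line, each made of chars other than tab or newline.
pattern = re.compile(r'^([^\r\n\t]+)\t([^\r\n\t]+)\t([^\r\n\t]+)$', re.MULTILINE)

# Join field 1 and field 3 with an underscore, then a tab (assumed), then field 2.
reordered = pattern.sub(r'\1_\3\t\2', lines)

print(reordered)
```

With re.MULTILINE, ^ and $ anchor at each line, so the whole text can be processed in a single call.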

Swift parse URI from card with regex

I need to get the URI of the photo. The URI spans multiple lines with delimiters. What should the right side of the pattern be to capture everything up to the uppercase letters followed by a colon, such as END:, FN: or N:?
let pattern = "URI:(.*)"
let text = "BEGIN:VCARD\r\nVERSION:3.0\r\nPRODID:-//Apple Inc.//iPhone OS 13.6//EN\r\nN:;John;;;\r\nFN:John\r\nTEL;TYPE=CELL,VOICE,pref:+71234567890\r\nPHOTO;VALUE=URI:https://imgurl.com/download/photo.2A2472\r\n 0C-745E-4B17-AE46-B575B81C9490.afeb521a-9397-484a-8703-3e246b6d526d.19D320\r\n 9E-EC79-48D5-AE65-0E29CF208278\r\nEND:VCARD"
Expected Result is https://imgurl.com/download/photo.2A2472\r\n0C-745E-4B17-AE46-B575B81C9490.afeb521a-9397-484a-8703-3e246b6d526d.19D320\r\n9E-EC79-48D5-AE65-0E29CF208278\r\n
You could use:
(?s)URI:(.*?)\s*[A-Z]+:
Regex demo | Swift demo code
Explanation
(?s) Inline modifier, make the dot match a newline
URI: Match literally
(.*?) Capture group 1, match any char, as few chars as possible
\s*[A-Z]+: Match 0+ whitespace chars, 1+ uppercase chars A-Z followed by :
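Alternatively, use this pattern (the dot must be allowed to match newlines, for example with the s flag, because the URI spans several lines):
https?:.*(?=END)
See Demo in PCRE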
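The lazy-capture idea can be tried outside Swift as well. Below is a minimal Python sketch on the vCard text from the question, with the colon after URI included in the literal part so that group 1 starts at https. The final unfolding step (removing each CRLF plus the single leading space or tab of a continuation line) is an addition that goes beyond what the question asked for.

```python
import re

vcard = (
    "BEGIN:VCARD\r\nVERSION:3.0\r\n"
    "PRODID:-//Apple Inc.//iPhone OS 13.6//EN\r\n"
    "N:;John;;;\r\nFN:John\r\n"
    "TEL;TYPE=CELL,VOICE,pref:+71234567890\r\n"
    "PHOTO;VALUE=URI:https://imgurl.com/download/photo.2A2472\r\n"
    " 0C-745E-4B17-AE46-B575B81C9490.afeb521a-9397-484a-8703-3e246b6d526d.19D320\r\n"
    " 9E-EC79-48D5-AE65-0E29CF208278\r\n"
    "END:VCARD"
)

# Lazy capture up to the next run of uppercase letters followed by a colon.
# re.DOTALL plays the role of the (?s) inline modifier.
match = re.search(r'URI:(.*?)\s*[A-Z]+:', vcard, re.DOTALL)
folded_uri = match.group(1)

# Optional: unfold the continuation lines into a single URI.
uri = re.sub(r'\r\n[ \t]', '', folded_uri)

print(uri)
```

Note that \s* consumes the final CRLF before END:, so the captured group does not include it.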

Regex match all text except trailing parenthesis with number

This is probably easy, but I can't seem to grasp regex properly.
I need to match the characters in a string from the start up to a parenthesis with a digit inside (if one exists). If the parenthesis is followed by more text, the entire string should match.
Test string (abc) = match "Test string (abc)"
Test string (abc) test = match "Test string (abc) test"
Test string (1) = match "Test string"
Test string (1) Test = match "Test string (1) Test"
I have this, but it doesn't care what is inside the parentheses, so it only matches "Test string" no matter what.
^[^\(\d\)]+
Can anyone help me out? Thanks a lot!
EDITED: Added extra test string (#4) to my question and Wictor's regex in the comment matches this as well:
^.*?(?=\s*\(\d+\)$|$)
If you are extracting with -match, you may use
^.*?(?=\s*\(\d+\)|$)
See the regex demo
Details
^ - start of string
.*? - any 0+ chars other than newline, as few as possible
(?=\s*\(\d+\)|$) - a positive lookahead that matches a location immediately followed with 0+ whitespaces, (, 1+ digits, ) or end of string.
Note you may try a replacing approach with
$line -replace '\s*\(\d+\).*'
where \s*\(\d+\).* matches 0+ whitespaces, (, 1+ digits, ) and then the whole rest of the line with .*.
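The two lookahead patterns in this thread differ only in the $ after the closing parenthesis, and that changes the result for the fourth test string. A minimal Python sketch (variable names are illustrative) makes the difference visible:

```python
import re

tests = [
    "Test string (abc)",
    "Test string (abc) test",
    "Test string (1)",
    "Test string (1) Test",
]

# Version from the question's edit: the parenthesis must end the string.
anchored = re.compile(r'^.*?(?=\s*\(\d+\)$|$)')

# Version without the inner $: stops at the first "(digits)" anywhere.
unanchored = re.compile(r'^.*?(?=\s*\(\d+\)|$)')

anchored_results = [anchored.match(s).group(0) for s in tests]
unanchored_results = [unanchored.match(s).group(0) for s in tests]

for s, a, u in zip(tests, anchored_results, unanchored_results):
    print(f"{s!r}: anchored={a!r}, unanchored={u!r}")
```

Only the anchored version keeps "Test string (1) Test" whole, which is what the fourth requirement in the question asks for.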

How to write nested regex to find words below some string?

I am converting a PDF to text with xpdf and then looking for some words with regex and preg_match_all.
In the pdftotext output, each label is separated from its value by a colon.
Below is my pdftotext output:
In respect of Shareholders
Name: xyx
Residential address: dublin
No of Shares: 2
Name: abc
Residential address: canada
No of Shares: 2
So I wrote a regex that shows me the words after the colon in the text:
$regex = '/(?<=: ).+/';
preg_match_all($regex, $string, $matches);
But now I want a regex that returns all the data after In respect of Shareholders.
So I wrote $regex = '/(?<=In respect of Shareholders).*?(?=\s)/';
But it shows me only:
Name: xyx
I want to first find all the data after In respect of Shareholders and then use another regex to find the words after the colon.
You may use
if (preg_match_all('~(?:\G(?!\A)|In respect of Shareholders)\s*[^:\r\n]+:\h*\K.*~', $string, $matches)) {
print_r($matches[0]);
}
See the regex demo
Details
(?:\G(?!\A)|In respect of Shareholders) - either the end of the previous successful match or In respect of Shareholders text
\s* - 0+ whitespaces
[^:\n\r]+ - 1 or more chars other than :, CR and LF
: - a colon
\h* - 0+ horizontal whitespaces
\K - match reset operator that discards all text matched so far
.* - the rest of the line (0 or more chars other than line break chars).
In your regex (?<=: ).+ you will match any character 1+ times after a colon and a space. To capture all that follows any additional spaces or tabs in a group, you could use (?<=: )[\t ]*(.+)
Another way to match the texts using a capturing group could be:
^.*?:[ \t]+(\w+)
Explanation
^ Assert start of the string
.*?: Match any character non greedy followed by a :
[ \t]+ Match 1+ times a space or a tab
(\w+) Capture in a group 1+ word characters
Regex demo | Php demo
Or use \K to forget what was matched if that is supported:
^.*?:\h*\K\w+
Regex demo
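The two-step idea from the question (first take everything after the marker, then pull the values after each colon) can also be sketched without \G, \K or \h, which not every regex engine supports. The Python version below is only an illustration: the first line of the sample text is a hypothetical entry added to show that content before the marker is ignored.

```python
import re

text = (
    "Company: Example Ltd\n"  # hypothetical line before the marker
    "In respect of Shareholders\n"
    "Name: xyx\n"
    "Residential address: dublin\n"
    "No of Shares: 2\n"
    "Name: abc\n"
    "Residential address: canada\n"
    "No of Shares: 2"
)

# Step 1: keep only the text after the marker (empty if the marker is absent).
_, _, section = text.partition("In respect of Shareholders")

# Step 2: for each "label: value" line, capture the value after the colon.
values = re.findall(r'^[^:\r\n]+:[ \t]*(.*)$', section, re.MULTILINE)

print(values)
```

Unlike the \G-based pattern, which stops at the first line that does not fit the label: value shape, this sketch scans every line after the marker.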