REGEX: How to split string with space and double quote - regex

I have a input of string with spaces and double quotes as below:
Input :
18 17 16 "Arc 10 12 11 13" "Segment 10 23 33 32 12" 23 76 21
Expected Output:
18
17
16
Arc 10 12 11 13
Segment 10 23 33 32 12
23
76
21
How can I do this using Regex? Thank you in advance

You can use next regexp(see example):
("[^"]+")|\S+
("[^"]+") - quoted sequence.
\S+ - non whitespace sequence.
Probably order of groups is depend from regexp implementation. In the demo engine matching stared from left to right. Also do not forget escape special characters with double slash.

"(.+?)"|(\w+(?=\s|$))
check here

Related

Regex match multiple numbers stop at string (word) despite more matches exist

Goal;
Match all variations of phone numbers with 8 digits + (optional) country code.
Stop match when "keyword" is found, even if more matches exist after the "keyword".
Need this in a one-liner and have tried a plethora of variations with lookahead/behind and negate [^keyword] but I am unable to understand how to achieve this.
Example of text;
abra 90998855
kadabra 04 94 84 54
cat 132 23 564
oh the nice Hat +41985 32 565
+17 98 56 32 56
Ladida
keyword
I Want It To Stop Matching Here Or Right Before The "keyword"
more nice text with some matches
cat 132 23 564
oh the nice Hat +41985 32 565
+17 98 56 32 56
Example of regex;
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})
-> This matches all numbers also below the keyword
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})[^keyword]
-> This matches all numbers also below the keyword
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})(?!keyword)
-> This matches all numbers also below the keyword
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})(?=keyword)
-> This matches nothing
((\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})(?:(?!keyword))*)
-> This matches all numbers also below the keyword

add number as prefix

I have list of number:
19
20
21
22
23
24
25
26
many more numbers...
I want to add one number to all of then as prefix so thay will all becam etree digit numbers:
219
220
221
222
223
224
225
226
It should go lik this in find section: \S{2,} than what should I put in replace section? 2$1 or what I em not expert.
Find all two digits and capture them (with parentheses).
\b(\d\d)\b
Replace captured groups with an additional 2 in front.
2$1

Matching across multiple lines regular expression

I have several lists in a single text file that look like below. It always starts with 0 and it always ends with the word Unique at the start of a newline. I would like to get rid of all of it apart from the line with Unique on it. I looked through stackoverflow and tried the following but it returns the whole text file (there are other strings in the file that I haven't put in this example). Basically the problem is how to account for the newlines in the regex selection
^0(.|\n)*
Input:
0 145
1 139
2 175
3 171
4 259
5 262
6 293
7 401
8 430
9 417
10 614
11 833
12 1423
13 3062
14 10510
15 57587
16 5057575
17 10071
18 375
19 152
20 70
21 55
22 46
23 31
24 25
25 22
26 25
27 14
28 16
29 16
30 8
31 10
32 8
33 21
34 8
35 51
36 65
37 605
38 32
39 2
40 1
41 2
44 1
48 2
51 1
52 1
57 1
63 2
68 1
82 1
94 1
95 1
101 3
102 7
103 1
110 1
111 1
119 1
123 1
129 2
130 3
131 2
132 1
135 1
136 2
137 7
138 4
Unique: 252851
Expected output:
Unique: 252851
You need to use something like
^0[\s\S]*?[\n\r]Unique:
and replace with Unique:.
^ - start of a line
0 - a literal 0
[\s\S]*? - zero or more characters incl. a newline as few as possible
[\n\r] - a linebreak symbol
Unique: - a whole word Unique:
Another possible regex is:
^0[^\r]*(?:\r(?!Unique:)[^\r]*)*
where \r is the line endings in the current file. Replace with an empty string.
Note that you could also use (?m)^0.*?[\r\n]Unique: regex (to replace with Unique:) with the (?m) option:
m: multi-line (dot(.) match newline)
Your method of matching newlines should work, although it's not optimal (alternation is rather slow); the next problem is to make sure the match stops before Unique:
(?s)^0.*(?=Unique:)
should work if there is only one Unique: in your file.
Explanation:
(?s) # Start "dot matches all (including newlines) mode
^0 # Match "0" at the start of the file
.* # Match as many characters as possible
(?=Unique:) # but then backtrack until you're right before "Unique:"

Regex, replacing for newline with group replace

675185538end432 204 9/9 4709 908 2
343269172end430 3 43 9335 975 7
590144128end89 7 29 3-5-4 420 2
337460105end8Y5 7A 78 2 23
292484648end70 A53 03 9235 93
These are the strings that I am working with. I want to find a regex to replace the above strings as follows
675185538
432 204 9/9 4709 908 2
343269172
430 3 43 9335 975 7
590144128
89 7 29 3-5-4 420 2
337460105
8Y5 7A 78 2 23
292484648
70 A53 03 9235 93
Wherever end comes, \r\n should be introduced.
The string before end is numeric and after end is alphanumeric with whiteline characters.
I am using notepad++.
To make the match strict, try this:
Find: ^(\d+)end(\w)
Replace: \1\r\n\2
This captures, then puts back via back references, the preceding number between start of line and "end" and the following digit/letter. This won't match "end" elsewhere.
Kludgery:
Find (\d\d\d\d\d\d\d\d\d)end(\d)
Replace \1\r\n\2
Find creates two capture groups:
each group is bounded by an ( and a )
one capture group matches exactly nine numerals
the other capture group matches exactly one numeral.
In the replace:
the first capture group is referenced with \1
and the second group with \2.

Trying to extract the last number on a line, with sets of numbers delimited by spaces

So I've extracted the digits from a log file and it looks like this:
2011 04 13 23 54 14 601 04 13 23 54 14 10 35 1 14 8080 59 250
What I'm trying to get is the last number (250), and it will loop through each line of the log. Once I get the last number from each line, I will do some calculations...I just can't extract that last number at the end of the line. Thanks!
while (<>) {
my ($last) = /(\d+)$/;
}
If your data is an array, #digits, then the last one is $digits[-1].
If your data is in a string, use the split to get it into an array.