I am looking for a way to know if a specific word or digit is last in the line, followed by nothing, not even a space [duplicate] - regex

This question already has answers here:
regex to get the number from the end of a string
(2 answers)
Closed 3 years ago.
set vv "abc 123 456 "
regexp {abc[\s][\d]+[\s][\d]+} $vv
1
regexp {abc[\s][\d]+[\s][\d]+(?! )} $vv
1
This should return 0, because the line contains an extra space (or extra characters) at the end.
From a list of lines, I am trying to find out which lines have a space at the end and which do not.
Lines can be of any format. For instance, I need to extract lines 1 and 3 but not 2 and 4.
"abc 123 456"
"abc 123 456 abc 999"
"xyz 123 999"
"xyz 123 999 zzz 222"

You could use a repeating pattern matching a space and digits to make sure that the line ends with digits only:
^abc(?: \d+)+$
Or a bit broader match using word characters \w if the lines can be of any format:
^\w+(?: \w+)+$

I am not sure about Tcl regex, but I think you have to add an end anchor:
abc\s\d+\s\d+$

It can be summarized as the end of the line ($) preceded by a word character (\w).
puts [regexp {\w$} $vv]

If all you need is to find out if a line ends with a space or not, use this for a regex:
\s$
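For comparison outside Tcl, here is a minimal Python sketch of both checks: detecting trailing whitespace, and requiring that a line is a word followed only by groups of digits. The function names are made up for illustration, and Python's `re` syntax is assumed rather than Tcl's regex flavor. `\Z` is used instead of `$` because Python's `$` also matches just before a final newline.

```python
import re

def has_trailing_space(line: str) -> bool:
    # True when the very last character of the line is whitespace.
    return re.search(r"\s\Z", line) is not None

def ends_with_digits_only(line: str) -> bool:
    # A word, then one or more " <digits>" groups, and nothing else.
    return re.fullmatch(r"\w+(?: \d+)+", line) is not None

lines = [
    "abc 123 456",
    "abc 123 456 abc 999",
    "xyz 123 999",
    "xyz 123 999 zzz 222",
    "abc 123 456 ",
]
for line in lines:
    print(repr(line), has_trailing_space(line), ends_with_digits_only(line))
```

With the sample lines from the question, only the first and third lines pass `ends_with_digits_only`, and only the last line is flagged by `has_trailing_space`.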

The regular expression would be {^abc.*\d$}: a digit followed by the end of the string.
% regexp {^abc.*\d$} $vv
0
The glob pattern would be {abc*[0-9]}
% string match {abc*[0-9]} $vv
0
% string match {abc*[0-9] } $vv
1
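The same glob idea can be tried in Python with the standard `fnmatch` module, which supports `*` and `[0-9]` much like Tcl's `string match`. This is only a sketch for comparison; the variable name mirrors the one in the question.

```python
from fnmatch import fnmatchcase

vv = "abc 123 456 "

# The whole string must match the glob, so the trailing space matters.
ends_in_digit = fnmatchcase(vv, "abc*[0-9]")
ends_in_digit_then_space = fnmatchcase(vv, "abc*[0-9] ")
print(ends_in_digit, ends_in_digit_then_space)
```

As in the Tcl session, the first pattern fails on the string with a trailing space and the second one succeeds.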

Related

Regex - Group string with space

I need to group a string into groups of 3 characters.
Examples:
In: 900123456 -> Out: 900 123 456
In: 90012345 -> Out: 900 123 45
In: 90012 -> Out: 900 12
Is there any way to do this with regex?
Thank you very much.
Have a go with /\d{3}(?!\b)/gm as the pattern and $0 followed by a space as the replacement.
Explanation:
\d matches a digit. We want 3 of them, so it becomes \d{3}.
We would like to replace the match with itself followed by a space. But we should not do that at the end of the line, because we don't want to add a trailing space. This can be avoided with a negative lookahead for a word boundary \b, which gives (?!\b).
You can test it here: https://regex101.com/r/MIQnF3/1
let input = document.getElementById('input');
let output = document.getElementById('output');
// In JS I had to capture the 3 digits in a group since $0 did not work.
let pattern = /(\d{3})(?!\b)/gm;
output.innerHTML = input.innerHTML.replace(pattern, '$1 ');
<p>Input:</p>
<pre><code id="input">900123456
90012345
90012</code></pre>
<p>Output:</p>
<pre><code id="output"></code></pre>
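For readers who want to try the same pattern outside the browser, a minimal Python sketch follows. The function name `group_by_three` is an assumption made for this example; the pattern is the one from the answer, with a capture group so the match can be reused in the replacement.

```python
import re

def group_by_three(digits: str) -> str:
    # Add a space after each run of 3 digits, unless a word boundary
    # (here: the end of the number) follows immediately.
    return re.sub(r"(\d{3})(?!\b)", r"\1 ", digits)

for sample in ["900123456", "90012345", "90012"]:
    print(sample, "->", group_by_three(sample))
```

The three inputs from the question come out as 900 123 456, 900 123 45 and 900 12.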

Ruby Regex - If the string is more than 10 characters, remove the first character if it is a "1"

Without using a gem, I just want to write a simple regex to remove the first character from a string if it is a 1 and the string has more than 10 characters in total. I never expect more than 11 characters; 11 should be the max. But if there are exactly 10 characters and the string begins with "1", I don't want to remove it.
str = "19097147835"
str&.remove(/\D/).sub(/^1\d{10}$/, "\1").to_i
Returns 0
I'm looking for it to return "9097147835"
You could use your pattern, but add a capture group around the 10 digits to use the group in the replacement.
\A1(\d{10})\z
For example
str = "19097147835"
puts str.gsub(/\D/, '').sub(/\A1(\d{10})\z/, '\1').to_i
Output
9097147835
Another option is to remove all the non-digits and match the last 10 digits:
\A1\K\d{10}\z
\A Start of string
1\K Match 1 and forget what is matched so far
\d{10} Match 10 digits
\z End of string
str = "19097147835"
str.gsub(/\D/, '').match(/\A1\K\d{10}\z/) do |match|
puts match[0].to_i
end
Output
9097147835
You can use
str.gsub(/\D/, '').sub(/\A1(?=\d{10})/, '').to_i
The regex matches
\A - start of string
1 - a 1
(?=\d{10}) - immediately to the right of the current location, there must be 10 digits.
Non regex example:
str = str[1..] if (str.start_with?("1") and str.size > 10)
Regexes are powerful, but not easy to maintain.
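The lookahead approach carries over to other regex engines almost unchanged. Below is a hedged Python sketch; the function name `normalize_phone` is hypothetical, and the lookahead is additionally anchored with `\Z` so that the leading 1 is dropped only when exactly 10 digits follow.

```python
import re

def normalize_phone(raw: str) -> str:
    # Strip everything that is not a digit.
    digits = re.sub(r"\D", "", raw)
    # Drop a leading 1 only when exactly 10 more digits follow it.
    return re.sub(r"\A1(?=\d{10}\Z)", "", digits)

print(normalize_phone("19097147835"))
print(normalize_phone("1909714783"))
```

An 11-digit number starting with 1 loses its first digit, while a 10-digit number starting with 1 is left alone.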

notepad++ regex search string and other string in previous line [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 2 years ago.
I am looking for a regex pattern that matches a string only if another string matches in the previous few lines,
for example:
abc----1
pqr----2
123----3
xyz----4
lll----5
pqr----6
123----7
qqq----8
So here, say I want to find 123 only if, going upward, we first find xyz and not abc. So the output should match only line no. 7 and not line 3.
Thanks, Tim, for the answer.
One additional criterion: I want to replace only 123, not the whole text found by this pattern.
You may try searching on the following pattern:
xyz((?!abc).)*?123
This matches xyz which is then not followed by abc anywhere before encountering 123 later in the text. You should run the above pattern in dot-all mode, so that . can match across newlines.
Edit:
To replace the 123 with some other content, say 456, you may capture everything leading up to the 123, then replace with the captured quantity followed by the new text:
Find: (xyz(?:(?!abc).)*?)123
Replace: ${1}456
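The capture-and-replace idea can be checked with a short Python sketch on the sample text from the question. Python's `re.DOTALL` flag plays the role of dot-all mode, and the `\g<1>` form keeps the group reference separate from the digits that follow it. The variable names are assumptions for this example.

```python
import re

text = "\n".join([
    "abc----1", "pqr----2", "123----3", "xyz----4",
    "lll----5", "pqr----6", "123----7", "qqq----8",
])

# xyz, then as few characters as possible that never start "abc", then 123.
pattern = re.compile(r"(xyz(?:(?!abc).)*?)123", re.DOTALL)

# Keep everything captured before 123 and put 456 in its place.
result = pattern.sub(r"\g<1>456", text)
print(result)
```

Only the 123 on line 7 is replaced; the one on line 3 comes before any xyz and is left untouched.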

How can I delete lines which have less than 11 numbers but more than 8 numbers in one line in Notepad++

How can I delete lines which have fewer than 11 numbers but more than 8 numbers in one line in Notepad++? The numbers are separated from each other by letters, spaces, etc.
Your requirement says to remove lines having 9 or 10 digits, but not more or less than this. You may try using lookaheads to handle this. In regex mode, try finding the following pattern:
^(?!.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d)(?=.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d.*\d).*
Then just replace that with an empty string (nothing).
Edit:
Here is another pattern you may use, without lookaheads, which is a bit easier on the eyes:
^\D*\d\D*\d\D*\d\D*\d\D*\d\D*\d\D*\d\D*\d\D*\d\D*\d?\D*$
This again says to match any line which contains either 9 or 10 digits, but not more or less than this.
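To sanity-check the "9 or 10 digits" reading on a list of lines, here is a small Python sketch. It uses a counted repetition instead of spelling out every digit; the names `NINE_OR_TEN_DIGITS` and `drop_lines` are hypothetical, and the filtering is done per line in Python rather than with a Notepad++ replace.

```python
import re

# A line made of 9 or 10 digits, each optionally preceded by non-digits,
# with optional non-digits at the end.
NINE_OR_TEN_DIGITS = re.compile(r"(?:\D*\d){9,10}\D*")

def drop_lines(text: str) -> str:
    # Keep only the lines that do NOT contain exactly 9 or 10 digits.
    kept = [ln for ln in text.splitlines()
            if not NINE_OR_TEN_DIGITS.fullmatch(ln)]
    return "\n".join(kept)

sample = "\n".join([
    "1234567", "12345678", "123456789", "1234567890",
    "12345678901", "a1b2c3d4e5f6g7h8", "a1b2c3d4e5f6g7h8i9",
])
print(drop_lines(sample))
```

Under this strict reading, lines with 8 or 11 digits are kept and only those with 9 or 10 digits are dropped.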
Ctrl+H
Find what: ^(?:\D*\d){8}(?:\D*\d){0,3}(?:\R|$)
Replace with: LEAVE EMPTY
check Wrap around
check Regular expression
Replace all
Explanation:
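^ # beginning of line
(?:\D*\d){8} # non-capturing group: 0 or more non-digits and 1 digit, repeated exactly 8 times
(?:\D*\d){0,3} # non-capturing group: 0 or more non-digits and 1 digit, repeated 0 to 3 times
(?:\R|$) # non-capturing group: line break or end of file
Note that this removes lines containing 8 to 11 digits, a looser reading of the requirement; use {9} and {0,1} instead to remove only lines with 9 or 10 digits.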
Given:
1234567
12345678
123456789
1234567890
12345678901
123456789012
a1b2c3d4e5f6g7
a1b2c3d4e5f6g7h8
a1b2c3d4e5f6g7h8i9
a1b2c3d4e5f6g7h8i9j0k1l2
Result for given example:
1234567
123456789012
a1b2c3d4e5f6g7
a1b2c3d4e5f6g7h8i9j0k1l2

REGEX Search and keep specific characters

I have hundreds of References in the following format
HCVSAM0123BK
c35UNI0321RS
scruni0321
XXXXXX ZZZZ WW
6 characters 4 digits 2 characters
I want to keep the 4 digits after the first 6 characters, but in some cases it doesn't have the last 2 characters
My goal is to get only ZZZZ (the 4 digits)
ex: from HCVSAM0123BK to 0123
Thank You
You can match the following:
^\w{6}(\d+)(\w{2})?$
and the first captured group \1 is what you want.
Demo: http://regex101.com/r/qT0lY8
Answer to updated question:
^(?!\d+$)\w{6}(\d+)(\w{2})?$
(?!\d+$) is a negative look ahead, that will fail the match if the line is only digits, and \w stands for [0-9a-zA-Z_].
search : ^.{6}(.{4}).*
and replace with : \1
demo here : http://regex101.com/r/kZ7dS8
output :
0123
0321
0321
using branch reset :
search : (?|.*(\d{4}).*)
and replace with : \1
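As a way to try the extraction on the sample references, here is a hedged Python sketch. It assumes the stricter layout described in the question (6 word characters, exactly 4 digits, then 2 optional word characters); the names `REFERENCE` and `extract_digits` are made up for illustration.

```python
import re
from typing import Optional

# 6 word characters, 4 captured digits, optional 2-character suffix.
REFERENCE = re.compile(r"\w{6}(\d{4})(?:\w{2})?\Z")

def extract_digits(ref: str) -> Optional[str]:
    # Return the 4 digits after the first 6 characters, or None if the
    # reference does not follow the expected layout.
    m = REFERENCE.match(ref)
    return m.group(1) if m else None

for ref in ["HCVSAM0123BK", "c35UNI0321RS", "scruni0321"]:
    print(ref, "->", extract_digits(ref))
```

References that do not fit the layout return `None` instead of a partial match, which makes malformed entries easy to spot.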