Regular expression character class with parenthesis with grep command - regex

I have a question about regular expressions with the grep command. For example, let's say I have a file called regular.txt which contains date text like the following:
$ cat regular.txt
july
jul
Fourth
4th
4
So I am trying to match all of this text from the input file, using the methods below:
Method 1: Match only Fourth|4th|4
$ egrep '(Fourth|4th|4)' regular.txt
output method 1:
Fourth
4th
4
Method 2: Match only Fourth|4th|4 using optional parenthesis
$ egrep '(Fourth|4(th)?)' regular.txt
output method 2:
Fourth
4th
4
Method 3: Match the entire file: july, jul, Fourth, 4th, 4. I am using the command below:
$ egrep 'july? (Fourth|4(th)?)' regular.txt
output method 3: Nothing is matched here. How can I do this?
Could you please help me with this?
Thanks,

Your july? (Fourth|4(th)?) regex matches a sequence of patterns: jul followed by an optional y, then a space, and then one of two alternatives: Fourth, or 4 optionally followed by th. No line in the file contains such a sequence, so nothing matches.
If you plan to match jul or july as a third alternative, add it to the alternation instead:
'Fourth|4(th)?|july?'
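As a quick check, the sketch below recreates the sample file and runs the combined alternation. The temporary file created with mktemp is an assumption for illustration (the question uses regular.txt), and grep -E is used as the equivalent of egrep.

```shell
# Recreate the sample input from the question in a temporary file.
tmpfile=$(mktemp)
printf '%s\n' july jul Fourth 4th 4 > "$tmpfile"

# A line is printed if it matches any one of the three alternatives.
matched=$(grep -E 'Fourth|4(th)?|july?' "$tmpfile")
printf '%s\n' "$matched"

rm -f "$tmpfile"
```

All five lines should be printed, because each line satisfies one alternative on its own.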

Related

Regex Stop at the First Occurrence of a Word

How would I change my Regex to stop after the first match of the word?
My text is:
-rwxr--r-- 1 bob123 bob123 0 Nov 10 22:48 /path/for/bob123/dir/to/file.txt
There is a variable called owner, the first arg from cmd:
owner=$1
My regex is: ^.*${owner}
My match ends up being:
-rwxr--r-- 1 bob123 bob123 0 Nov 10 22:48 /path/for/bob123
But I only want it to be: -rwxr--r-- 1 bob123.
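Add a question mark: ^.*?${owner}. This makes the * quantifier non-greedy (lazy), so the match stops at the first occurrence of the owner name. Lazy quantifiers are not supported by grep's default syntax, so use the -P option (grep -P) to enable Perl-compatible regular expressions.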
https://regex101.com/r/ThGpcq/1.
You do not need a regex here, use string manipulation and concatenation:
text='-rwxr--r-- 1 bob123 bob123 0 Nov 10 22:48 /path/for/bob123/dir/to/file.txt'
owner='bob123'
echo "${text%%$owner*}$owner"
# => -rwxr--r-- 1 bob123
See the online Bash demo.
The ${text%%$owner*} expansion removes the longest suffix matching $owner* (due to %%), that is, everything from the first occurrence of $owner to the end of the string. Since the $owner text itself is removed too, "...$owner" adds it back.
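To see why %% rather than % is needed here, the following sketch compares the two operators on the same input. The variable names first and last are only for illustration, and $owner is quoted inside the expansion so that it is treated as literal text rather than as a glob pattern.

```shell
text='-rwxr--r-- 1 bob123 bob123 0 Nov 10 22:48 /path/for/bob123/dir/to/file.txt'
owner='bob123'

# %% strips the longest matching suffix: everything from the FIRST "bob123" on.
first="${text%%"$owner"*}$owner"

# %  strips the shortest matching suffix: everything from the LAST "bob123" on.
last="${text%"$owner"*}$owner"

echo "$first"   # -rwxr--r-- 1 bob123
echo "$last"    # -rwxr--r-- 1 bob123 bob123 0 Nov 10 22:48 /path/for/bob123
```

The % variant reproduces the unwanted greedy result from the question, while %% gives the desired one.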

Can we use regex to concat chars at different index

Say for example here is a string input:
D8DB2F1F0F21R123
We need to extract the characters at indexes 4, 8, 6, 1 (assuming indexing starts at 0), i.e. '2', '0', '1', '8', and concatenate them.
Final output should be: 2018
Can we achieve the above desired result, just by a regex ?
Using bash :
$ x=D8DB2F1F0F21R123
$ echo ${x:4:1}${x:8:1}${x:6:1}${x:1:1}
2018
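If the list of indexes varies, the same substring expansion can be wrapped in a small helper. This is only a sketch: pick_chars is a hypothetical name, not a standard command, and it relies on bash-specific features (${s:i:1} and +=).

```shell
#!/usr/bin/env bash
# pick_chars STRING INDEX...
# Prints the characters of STRING at the given 0-based indexes, in that order.
pick_chars() {
  local s=$1 out='' i
  shift
  for i in "$@"; do
    out+=${s:i:1}   # one character starting at offset i
  done
  printf '%s\n' "$out"
}

pick_chars D8DB2F1F0F21R123 4 8 6 1   # prints 2018
```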

Use recognised data for replacing

I have column of dates in my Notepad++:
2017-06-12
2017-06-13
2017-06-14
2017-06-15
2017-06-16
2017-06-17
2017-06-18
2017-06-19
2017-06-20
2017-06-20
2017-06-21
2017-06-22
2017-06-23
2017-06-24
2017-06-25
2017-06-26
2017-06-27
2017-06-28
2017-06-29
2017-06-30
2017-07-01
2017-07-02
2017-07-03
2017-07-04
2017-07-05
2017-07-06
2017-07-07
2017-07-08
2017-07-09
2017-07-10
I need to cut it into weeks by placing \r\n after each week, like:
2017-06-12
2017-06-13
2017-06-14
2017-06-15
2017-06-16
2017-06-17
2017-06-18

2017-06-19
2017-06-20
2017-06-20
2017-06-21
2017-06-22
2017-06-23
2017-06-24
2017-06-25

2017-06-26
2017-06-27
2017-06-28
2017-06-29
2017-06-30
2017-07-01
2017-07-02

2017-07-03
2017-07-04
2017-07-05
2017-07-06
2017-07-07
2017-07-08
2017-07-09

2017-07-10
I do the replacement using a regex. I find 7 days with:
\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n\d\d\d\d-\d\d-\d\d\r\n
And now I would like to add \r\n after them.
But how do I replace the matched text with itself plus \r\n?
If you are sure that the first date is a Monday, you could do this:
Ctrl+H
Find what: (?:\d{4}-\d\d-\d\d\R){7}
Replace with: $0\r\n
Replace all
In your example input some lines are doubled, e.g. 2017-06-20. In your example output this line is also doubled, so that week block consists of eight lines: seven unique lines and one duplicate of 2017-06-20. I assume that all lines in the input are sorted, so duplicate lines are adjacent. Additionally, I assume that the first line is the first day of a week.
Do a regular expression find/replace like this:
Open Replace Dialog
Find What: (((.*\R)\3*){7})
Replace With: \1\r\n
Select Regular expression and leave . matches newline unchecked
Click Replace or Replace All
Explanation
Let's explain (((.*\R)\3*){7}) from the inside out, starting at the third (innermost) group. In the following, x and y stand for regex parts, not literal characters.
(.*\R) the third group is just one line, from start to end, including its line break
(y\3*) a y followed by zero or more repetitions of whatever the third group captured; since y is the third group itself, this means a line followed by any number of identical copies of that line, which deals with the 2017-06-20 case
(x{7}) seven repetitions of x, which here means seven unique lines, each of which may be repeated within the block, so 8 lines with one line doubled is fine
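Outside Notepad++, the same idea can be sketched as a small shell filter. This is a different technique from the regex above (a line-by-line loop, not a backreference), split_weeks is a hypothetical name, and it makes the same assumptions: the input is sorted, duplicates are adjacent, and the first line starts a week. Unlike the regex, it inserts the blank line just before the next week starts, so no blank line follows a final complete week.

```shell
# split_weeks: read lines on stdin and print a blank line after every
# 7 distinct lines; adjacent duplicates stay in the same block.
split_weeks() (
  prev=''
  count=0
  while IFS= read -r line; do
    if [ "$line" != "$prev" ]; then
      # A new distinct line: start a new block if 7 were already printed.
      if [ "$count" -eq 7 ]; then
        echo
        count=0
      fi
      count=$((count + 1))
    fi
    printf '%s\n' "$line"
    prev=$line
  done
)

# Tiny demonstration with placeholder lines d1..d8 and one duplicate.
printf '%s\n' d1 d2 d2 d3 d4 d5 d6 d7 d8 | split_weeks
```

For the real data it could be run as split_weeks < dates.txt, where dates.txt is an assumed file name holding the date column.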

adding a space after each 4th number/digit Oracle 11G

I am trying to insert a space after every 4th digit (not character). This is what I came up with:
newStudentNumber := regexp_replace(newStudentNumber, '[[:digit:]](....)', '\1 ');
dbms_output.put_line(newStudentNumber);
result:
NL 2345 7894 TUE
What I actually want:
NL 1234 5678 944 TUE
My code replaces the digit at every 4th place with a space, instead of adding a space as in the wanted result above.
Can anyone explain this to me?
Thanks in advance
You can use the following regex:
([[:digit:]]{4})
and keep the replacement you are using now: \1 followed by a space.
Why is yours not working?
Your regex matches a digit and then captures the next 4 characters (any characters, not only digits). The replacement writes back only the captured group, so the digit that was matched but not captured is dropped. The problem is not that the space cannot be inserted.
Explanation for input = NL 12345678944 TUE and regex = [[:digit:]](....):
NL 12345678944 TUE (it matches the digit "1" and captures "2345", so "12345" becomes "2345 ")
See DEMO
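The corrected pattern can also be tried outside Oracle. The sketch below uses sed -E instead of regexp_replace, which is an assumption made purely for illustration; the POSIX class [[:digit:]] and the \1 backreference behave the same way in both.

```shell
input='NL 12345678944 TUE'

# Capture each run of exactly four digits and write it back followed by a space.
spaced=$(printf '%s\n' "$input" | sed -E 's/([[:digit:]]{4})/\1 /g')

printf '%s\n' "$spaced"   # NL 1234 5678 944 TUE
```

The trailing 944 is left alone because it is shorter than four digits.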

Regex: How to match a unix datestamp?

I'd like to be able to match this entire line (to highlight this sort of thing in vim): Fri Mar 18 14:10:23 ICT 2011. I'm trying to do it by finding a line that contains ICT 20 (the time zone plus the first two digits of the year), like this: syntax match myDate /^*ICT 20*$/, but I can't get it working. I'm very new to regex. Basically, what I want to say is: find a line that contains "ICT 20", with anything on either side of it, and match that whole line. Is there an easy way to do this?
.*ICT 20.*
should do the trick. . is a wildcard that matches any character, and * means you can have 0 or more of the pattern it follows (e.g. ba(na)* will match ba, banana, bananananana and so on).
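The pattern can be checked from a shell before putting it into a vim syntax rule. This sketch uses grep -o, which prints only the matched part of each line; the sample lines other than the datestamp are made up for illustration.

```shell
sample=$(printf '%s\n' \
  'some other line' \
  'Fri Mar 18 14:10:23 ICT 2011' \
  'ICT without a year')

# .* on both sides stretches the match to cover the whole line.
dates=$(printf '%s\n' "$sample" | grep -o '.*ICT 20.*')

printf '%s\n' "$dates"   # Fri Mar 18 14:10:23 ICT 2011
```

Only the datestamp line is printed, and it is printed in full, which shows that the match spans the entire line.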