How to rename and reorganize file using regex and rename-cli?

How to rename and reorganize file using regex and rename-cli? - regex

I have a large amount of article on my blog I want to rename and reorganize.
The current structure is this one:
2005_03_19_this_is_the_filename.md
2007_07_23_another_filename.md
2021_01_12_filename.md
And here's what I want to achieve:
2005/03/this_is_the_filename/index.md
2007/07/another_filename/inedx.md
2021/01/filename/index.md
Here's the regex I want to use but I do not know how to execute it.
/(\d{4})_(\d{2})_(\d{2})_(.*).md/gm
substitution: $1/$2/$4/index.md
I'm currently trying to execute it with rename-cli but can't figure out how to write the command.

I manage to it with the following command:
$ rename -p 's~(\d{4})_(\d{2})_\d{2}_(.+)(\.md)$~$1/$2/$3/index.md~' *.md
Thanks to #anubhava

Also use
rename -p 's,([0-9]{4})_([0-9]{2})_[0-9]{2}_(.*)(?=\.md$),$1/$2/$3/index,' *.md
See regex proof
EXPLANATION
NODE EXPLANATION
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
[0-9]{4} any character of: '0' to '9' (4 times)
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
_ '_'
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
[0-9]{2} any character of: '0' to '9' (2 times)
--------------------------------------------------------------------------------
) end of \2
--------------------------------------------------------------------------------
_ '_'
--------------------------------------------------------------------------------
[0-9]{2} any character of: '0' to '9' (2 times)
--------------------------------------------------------------------------------
_ '_'
--------------------------------------------------------------------------------
( group and capture to \3:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
) end of \3
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
md 'md'
--------------------------------------------------------------------------------
$ before an optional \n, and the end of
the string
--------------------------------------------------------------------------------
) end of look-ahead

Related

Regex to match inverse using negative lookahead ignores matches

I am trying to create a regex to match any tags not including [first].
# Trying to match:
# [second]
# [first.second]
# [first.third]
[first]
# something = else
[second]
test = yes
[first.second]
[first.third]
I was trying ^\[((?!first).*)\]$
https://regex101.com/r/1fz1CW/1
And this seems to match [second] in the example above but I can't figure out why it doesn't match [first.second] or [first.third] I was thinking I may need word boundaries, but I can't seem to get them to work.

Use
^\[((?!first\])[^\]\[]*)\]$
See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
\[ '['
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------------------
first 'first'
--------------------------------------------------------------------------------
\] ']'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
[^\]\[]* any character except: '\]', '\[' (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
\] ']'
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string

Inverse Regular Expressions Match in Webpack SVGO-Loader

I have a regular expression looking for .svg files under the icons-wc/icons directory for a Webpack svgo-loader.
/icons-wc\\icons\\.*\.svg$/
I'd now like to find all .svg files outside the icons-wc/icons directory but I'm not sure how to approach it. I've tried something like this but that doesn't seem to work. It seems to be too over eager to select
/(?<!icons-wc\\icons)\\.*\.svg$/

Use
/^(?!.*(?:^|[\\\/])icons-wc[\\\/]icons[\\\/]).*\.svg$/
See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
(?: group, but do not capture:
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
[\\\/] any character of: '\\', '\/'
--------------------------------------------------------------------------------
) end of grouping
--------------------------------------------------------------------------------
icons-wc 'icons-wc'
--------------------------------------------------------------------------------
[\\\/] any character of: '\\', '\/'
--------------------------------------------------------------------------------
icons 'icons'
--------------------------------------------------------------------------------
[\\\/] any character of: '\\', '\/'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
svg 'svg'
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string

How to remove the hash tag from this regex

My current regex:
(\d.{17})[^#]*(\D+)(\d+)gr(\d+)
In group 2, they are still having the hashtag, I want to remove it from there. What should I change from my current regex?
201223E0MWJPJD2230#AdeSaputra290gr99000
2101023CNV6TT1109J#Fefe430gr142000
2101183EDTFPSA0128#Jessica500gr112000
201221E2QKWRY11413#EssyYosita880gr233500
2101123G9XQ7R41705#Meily1120gr329000
201228ECEWTJT50859#WidyaNatali1720gr457230
201227EEBX1K9K1020#Excelio112gr58900
2101112N4YNFB12016#DebyNath520gr156220
2101072R8A0QB22347#AlycieHandoTan700gr85000
Output:
group 1: 201223E0MWJPJD2230
group 2: #AdeSaputra
group 3: 290
group 4: 99000

remove [^] from your regex
(\d.{17})#*(\D+)(\d+)gr(\d+)
see this

\D matches any non-digits and it matches # in your input.
Add # to the pattern instead of the opposite [^#] so as to match it.
Use
^(\d.{17})#(\D+)(\d+)gr(\d+)$
See proof it works. Adding anchors to match entire strings.
Explanation
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
\d digits (0-9)
--------------------------------------------------------------------------------
.{17} any character except \n (17 times)
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
# '#'
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
\D+ non-digits (all but 0-9) (1 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
) end of \2
--------------------------------------------------------------------------------
( group and capture to \3:
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
) end of \3
--------------------------------------------------------------------------------
gr 'gr'
--------------------------------------------------------------------------------
( group and capture to \4:
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
) end of \4
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string

regex for escaping '\{{expression}}'?

I'd like to use mustache style for expression: {{abc}}, it's pretty easy to write /({{[a-z]+}})/.
However I cannot get it right to handle \{{abc}}, for which I'd like to skip them on match list. I tried /((?!\\)({{[a-z]+}}))/ but it doesn't work.
https://regex101.com/r/67nXA2/2

Use
(?<!\\)(?:\\\\)*\K{{[a-z]+}}
See proof
Explanation
--------------------------------------------------------------------------------
(?<! look behind to see if there is not:
--------------------------------------------------------------------------------
\\ '\'
--------------------------------------------------------------------------------
) end of look-behind
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
\\ '\'
--------------------------------------------------------------------------------
\\ '\'
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
\K match reset operator
--------------------------------------------------------------------------------
{{ '{{'
--------------------------------------------------------------------------------
[a-z]+ any character of: 'a' to 'z' (1 or more
times (matching the most amount possible))
--------------------------------------------------------------------------------
}} '}}'

Multiple regex matching within a lookaround

i am trying to do a multiple match within a lookbehind and a look forward
Let's say i have the following string:
$ Hi, this is an example #
Regex: (?<=$).+a.+(?=#)
I am expecting it to return both 'a' within those boundaries, is there a way to do it with only one regex?
Engine: Python

If you can use quantifiers in the lookbehind, use
(?<=\$[^$#]*?)a(?=[^#]*#)
See proof.
Explanation
--------------------------------------------------------------------------------
(?<= look behind to see if there is:
--------------------------------------------------------------------------------
\$ '$'
--------------------------------------------------------------------------------
[^$#]*? any character except: '$' and '#' (0 or more
times (matching the least amount possible))
--------------------------------------------------------------------------------
) end of look-behind
--------------------------------------------------------------------------------
a 'a'
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
[^#]* any character except: '#' (0 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
# '#'
--------------------------------------------------------------------------------
) end of look-ahead
PCRE pattern:
(?:\G(?<!^)|\$)[^$]*?\Ka(?=[^#]*#)
See another proof
Explanation
--------------------------------------------------------------------------------
(?: group, but do not capture:
--------------------------------------------------------------------------------
\G where the last m//g left off
--------------------------------------------------------------------------------
(?<! look behind to see if there is not:
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
) end of look-behind
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
\$ '$'
--------------------------------------------------------------------------------
) end of grouping
--------------------------------------------------------------------------------
[^$#]*? any character except: '$' and '#' (0 or more times
(matching the least amount possible))
--------------------------------------------------------------------------------
\K match reset operator
--------------------------------------------------------------------------------
a 'a'
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
[^#]* any character except: '#' (0 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
# '#'
--------------------------------------------------------------------------------
) end of look-ahead

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

How to rename and reorganize file using regex and rename-cli? - regex

I manage to it with the following command: $ rename -p 's~(\d{4})_(\d{2})_\d{2}_(.+)(\.md)$~$1/$2/$3/index.md~' *.md Thanks to #anubhava

Related

Regex to match inverse using negative lookahead ignores matches

Inverse Regular Expressions Match in Webpack SVGO-Loader

How to remove the hash tag from this regex

regex for escaping '\{{expression}}'?

Multiple regex matching within a lookaround

Categories

Resources