Find all file names that match a pattern - regex

I am trying to find a way to list all file names in a folder that match this pattern:
20131106XXXXX.pdf
The prefix is the date; the content and length of XXXXX vary across files, and I only care about PDF files.
Could anyone advise a way to do this?

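Try this:
list.files(path="./yourdir", pattern="^[[:digit:]]{8}.*\\.pdf$")
The ^ and $ anchors restrict the match to names that start with eight digits and end in .pdf; without them the pattern can match anywhere inside a name.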

You can use a regular expression, anchored at both ends:
files <- dir(pattern="^[0-9]{8}.*\\.pdf$")
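
To see what the anchors change, here is a minimal sketch in Python (used instead of R only so the pattern can be checked without a real directory). The file names are made up for illustration, and the trailing $ is an assumption about the intent rather than something stated in the question.

```python
import re

# Hypothetical file names, invented for illustration.
names = [
    "20131106ABCDE.pdf",
    "20131106report_final.pdf",
    "20131106ABCDE.pdf.bak",
    "x20131106ABCDE.pdf",
    "20131106ABCDE.txt",
]

# Eight digits at the start, anything in between, ".pdf" at the very end.
anchored = re.compile(r"^[0-9]{8}.*\.pdf$")
# Same pattern without anchors: it may match anywhere inside the name.
unanchored = re.compile(r"[0-9]{8}.*\.pdf")

anchored_hits = [n for n in names if anchored.search(n)]
unanchored_hits = [n for n in names if unanchored.search(n)]

print(anchored_hits)
print(unanchored_hits)
```

The unanchored version also accepts names such as the .pdf.bak backup and the one with a leading x, which is probably not what is wanted here.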

Related

Regex expression help needed to filter filenames in Mirth Connect

I need a way, in a Mirth file reader channel, to pick up all files except the one with a given name. I can use a regular expression in the Filename Filter Pattern box.
Most files are of the format #######.brf. I need to pick up any file that isn't named 0050450.brf. Can someone help with this?
Thanks
Rut
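I think this one should work: ^(?!(0050450\.brf))\w*\.brf
The negative lookahead (?!...) rejects the one unwanted name, and \w*\.brf then matches every other .brf file.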
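
A quick way to check the lookahead is to run it over a few names. This is only a sketch in Python; Mirth uses a different regex engine, so it assumes the lookahead behaves the same way there. All names other than 0050450.brf are invented.

```python
import re

# Pattern from the answer: skip the one unwanted name, accept other .brf files.
pattern = re.compile(r"^(?!(0050450\.brf))\w*\.brf")

# Hypothetical file names, invented for illustration.
names = ["0050450.brf", "0050451.brf", "1234567.brf", "notes.txt"]

picked = [n for n in names if pattern.match(n)]
print(picked)
```

Only the two other .brf files should be picked up; 0050450.brf is excluded by the lookahead and notes.txt by the extension.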

delete file in pentaho

I need to delete some ".text" files, but I cannot delete all files with this extension. I wanted to use regular expressions to search by filename. Can you help me?
There are several steps you can use to achieve that. The easiest one is the job step Delete Files. You can specify multiple folders and a RegExp for each individual folder.
A RegExp that ignores the numbers: hs_err_pid.*\.txt
This matches any file name that begins with hs_err_pid and ends in .txt, with any number of characters in between.
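
The pattern can be tried on a few names before pointing the delete step at a real folder. The sketch below is in Python and assumes the pattern is compared against the whole file name; the names themselves are invented.

```python
import re

# Pattern from the answer above.
pattern = re.compile(r"hs_err_pid.*\.txt")

# Hypothetical file names, invented for illustration.
names = [
    "hs_err_pid1234.txt",
    "hs_err_pid.txt",
    "hs_err_pid99.log",
    "keep_me.txt",
]

# fullmatch: the whole name has to fit the pattern (an assumption about the step).
to_delete = [n for n in names if pattern.fullmatch(n)]
print(to_delete)
```

Files that do not start with hs_err_pid, or that have another extension, are left alone.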

Special RegEx Patterns

I am trying to search for a specific pattern using regex, but I am having a difficult time. I have over 4,000 images in three different file sets, named like so:
0001234_name-of-file.jpeg
0001235_name-of-file_100.jpeg
0001236_name-of-file_200.jpeg
What I want to do is JUST search for the files like 0001234_name-of-file.jpeg
I do NOT want any of the files that have the _100 or _200 at the end before the extension.
I would go with this:
^((?!_[21]00).)*$
It matches strings that contain neither _100 nor _200.
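
Here is a small Python sketch of that pattern applied to the three names from the question, plus one hypothetical _300 name to show where the pattern stops filtering.

```python
import re

# Each character is consumed only if "_100" or "_200" does not start at it.
pattern = re.compile(r"^((?!_[21]00).)*$")

names = [
    "0001234_name-of-file.jpeg",
    "0001235_name-of-file_100.jpeg",
    "0001236_name-of-file_200.jpeg",
    "0001237_name-of-file_300.jpeg",  # hypothetical, not in the question
]

kept = [n for n in names if pattern.match(n)]
print(kept)
```

The _300 name is kept because the pattern only excludes _100 and _200; a different suffix would need the character class adjusted.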

Regex for file name in a directory

I have two files in a directory, FileAbc_1.xml and FileAbc.xml. I want to write a regex that only selects FileAbc_1.xml.
My regex is: FileAbc.*.xml
It is picking up both file names, but I only want FileAbc_1.xml. Any help would be a great favor.
This will work for you (note the escaped dot, so that it only matches a literal "."):
FileAbc_[0-9]+\.xml
That should just be: FileAbc_\d\.xml
(assuming there's never more than one digit after the underscore)
You can go with this for anything that starts with FileAbc, has at least one more character, and ends with .xml: FileAbc.+\.xml
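
To compare the suggestions side by side, here is a Python sketch that runs each pattern against the two names from the question and one extra hypothetical name. The dot before xml is escaped in every suggested pattern here, and the whole name is required to match; both are assumptions about how the patterns are meant to be used.

```python
import re

# The last name is hypothetical, added to show the one-digit limit.
names = ["FileAbc.xml", "FileAbc_1.xml", "FileAbc_12.xml"]

patterns = {
    "question": r"FileAbc.*.xml",       # original attempt, unescaped dot
    "digits": r"FileAbc_[0-9]+\.xml",   # one or more digits after "_"
    "one_digit": r"FileAbc_\d\.xml",    # exactly one digit after "_"
    "any_suffix": r"FileAbc.+\.xml",    # at least one character after "FileAbc"
}

results = {
    label: [n for n in names if re.fullmatch(rx, n)]
    for label, rx in patterns.items()
}
for label, hits in results.items():
    print(label, hits)
```

The original attempt accepts FileAbc.xml because .* can match nothing and the unescaped . then matches the literal dot; each of the other three requires something between FileAbc and .xml.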

Target file names using Regex

If I have a list of file names in an XML and want to remove all instances where the file name doesn't have a file extension, how can I do this using regular expressions? I need to do the replace in TextWrangler and have no other option unfortunately.
For example, if I have such a list in an XML as:
<name>AAA_A026C032_150522_R4RO.mov</name>
<name>BBB_A016D032_150809_R4RO.aiff</name>
<name>CCC_A026C038_151010_R4RO</name>
<name>DDGS_A006C132_150409_R4RO.mp3</name>
<name>EFFD_B026C001_150607_R4RO</name>
<name>FGHG_A026C032_141215_R4RO.cine</name>
How can the files without a file extension be targeted using regular expressions? I would like to replace these (clear them) in the output document.
Thanks in advance,
Matt
'(?!>\w+\.[a-zA-Z0-9]+)>(\w+)'
This pattern captures the names of the files without extensions in its first capturing group. I don't know how to use TextWrangler, but once the file names are captured you can probably figure out the replacement.
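
As a rough check of the pattern (without the surrounding quotes), here is a Python sketch run over the list from the question. How to enter the replacement in TextWrangler is not covered; replacing each match with a bare ">" to leave an empty <name></name> element is just one assumed reading of "clear them".

```python
import re

xml = """<name>AAA_A026C032_150522_R4RO.mov</name>
<name>BBB_A016D032_150809_R4RO.aiff</name>
<name>CCC_A026C038_151010_R4RO</name>
<name>DDGS_A006C132_150409_R4RO.mp3</name>
<name>EFFD_B026C001_150607_R4RO</name>
<name>FGHG_A026C032_141215_R4RO.cine</name>"""

# A ">" that is NOT followed by "name.extension", then the bare name in group 1.
pattern = re.compile(r"(?!>\w+\.[a-zA-Z0-9]+)>(\w+)")

# Names that have no extension.
no_ext = pattern.findall(xml)
print(no_ext)

# Assumed meaning of "clear": keep the ">" and drop the name after it.
cleared = pattern.sub(">", xml)
print(cleared)
```

Entries with an extension are untouched, and the two without one end up as empty elements.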