Reading dates from filenames - c++

I want to extract dates from the suffixes of files in a particular folder. The contents of such a folder look something like:
Packed_Folder_1_2016.06.10
Packed_Folder_1_2016.08.06
Packed_Folder_1_2015.09.03
packed_Folder_1_2015.01.08
... (so on and so forth, always in the same path just different suffixes)
There is no pattern to the dates. I need to build a form in Visual Studio 2013 that reads the file names and stores the differences between the dates.

Notice that the filenames always follow the same pattern: Packed_Folder_1_####.##.##, where the last part is a date.
So list the file names in the folder and keep the ones that match the pattern. You could use a regular expression to match each filename (something like R"(Packed_Folder_1_\d{4}\.\d{2}\.\d{2})"). Wrapping the digit runs in capture groups, as in R"(Packed_Folder_1_(\d{4})\.(\d{2})\.(\d{2}))", gives you the year, month, and day directly.
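As a rough illustration of this approach, here is a minimal sketch in Python (chosen for brevity; the same expression should also work with std::regex in C++). The function names extract_dates and date_differences are hypothetical, and the case-insensitive match is an assumption made because one of the sample names starts with a lowercase "p".

```python
import re
from datetime import date

# Pattern from the answer, with capture groups added for year, month, day.
# IGNORECASE is an assumption: one sample name starts with a lowercase "p".
NAME_RE = re.compile(r"Packed_Folder_1_(\d{4})\.(\d{2})\.(\d{2})", re.IGNORECASE)


def extract_dates(names):
    """Return the sorted dates found in names that fully match the pattern."""
    dates = []
    for name in names:
        m = NAME_RE.fullmatch(name)
        if m is None:
            continue  # not one of the packed folders
        year, month, day = map(int, m.groups())
        try:
            dates.append(date(year, month, day))
        except ValueError:
            continue  # digits matched, but they do not form a real calendar date
    return sorted(dates)


def date_differences(dates):
    """Return the gaps in days between consecutive dates."""
    return [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
```

In practice the names would come from a directory listing (for example os.listdir on the folder). For the four sample names in the question, the sorted dates are 2015-01-08, 2015-09-03, 2016-06-10 and 2016-08-06, and the gaps are 238, 281 and 57 days.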

You mention Forms, so I assume you can use Visual C++ with .NET. If so, take a look at the FileSystemWatcher class.
You instantiate it with a given path (file or directory), and it raises events when the target changes (modification, creation, rename; you can select which ones). You could then update your stored results whenever a change is relevant to you.

Related

Rename files sequentially keeping "connected" file names pattern using regular expressions

I'm trying to reorganize photo library that contains edited files as well as originals. I already achieved desired folder structure using Exif Sorter, i.e %UserProfile%\Photos\%year%\%month%\%day%.
Each %day% folder contains photo image files with a little bit different name pattern:
IMG_0001.jpg
ZMGM00002.jpg
ZMGM00003 (Edited).jpg
ZMGM00003.jpg
IMG_0002 (Edited).jpg
IMG_0002.jpg
IMG_0004.jpg
I'd like files to be named sequentially but keeping relevant " (Edited)" suffix:
DSC_0001.jpg
DSC_0002.jpg
DSC_0002 (Edited).jpg
DSC_0003.jpg
DSC_0004 (Edited).jpg
DSC_0004.jpg
DSC_0005.jpg
So far I have come up with a regular expression to rename "*.jpg" and "* (Edited).jpg" files while preserving the " (Edited)" suffix when it is there (sorry, I use RegexRenamer because I'm a beginner):
match string ^(\D+)(_)?(\d+)(Edited)?
replace string DSC_$#$4
However I get sequential numbering across all files and thus the relevance of edited files is lost:
DSC_0001.jpg
DSC_0002.jpg
DSC_0003 (Edited).jpg
DSC_0004.jpg
DSC_0005 (Edited).jpg
DSC_0006.jpg
DSC_0007.jpg
Is there any way I can rename files and preserve filename "connection" pattern between them, i.e. so I get DSC_0002 (Edited).jpg & DSC_0002.jpg instead of DSC_0002 (Edited).jpg & DSC_0003.jpg?
Since I've got thousands of folders, the renaming should be recursive and the sequence should restart with each new folder. I believe this requires PowerShell or batch scripting to check the required condition, but I'm not sure where to start. I am open to other ideas; maybe I could process the file names in Excel first and then batch-rename from a TXT/CSV file.
P.S. I've got like 80000 family photos since late 90's, it would take ages to process by hand. I can run anything in Windows and macOS to solve this (would prefer Windows though).
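One way this could be approached, sketched in Python under stated assumptions rather than as a tested solution: strip the optional " (Edited)" suffix to get a base name, give each distinct base name one sequence number, and restart the count in every folder. The helper names plan_renames and plan_tree, the four-digit DSC_ format, and numbering the groups by sorted base name are all assumptions. The code only computes an old-to-new mapping and does not rename anything.

```python
import re
from pathlib import Path

# Splits "IMG_0002 (Edited).jpg" into base name, optional suffix, and extension.
FILE_RE = re.compile(
    r"^(?P<stem>.+?)(?P<edited> \(Edited\))?(?P<ext>\.jpg)$", re.IGNORECASE
)


def plan_renames(filenames):
    """Map each original name to a DSC_#### name.

    An edited copy gets the same number as its original.
    """
    parsed = []
    for name in filenames:
        m = FILE_RE.match(name)
        if m:
            parsed.append(
                (name, m.group("stem"), m.group("edited") or "", m.group("ext"))
            )

    # One sequence number per distinct base name, in sorted order (assumption).
    stems = sorted({stem for _, stem, _, _ in parsed})
    number = {stem: i for i, stem in enumerate(stems, start=1)}

    return {
        name: f"DSC_{number[stem]:04d}{edited}{ext}"
        for name, stem, edited, ext in parsed
    }


def plan_tree(root):
    """Yield (old_path, new_path) pairs, restarting the numbering per folder."""
    root = Path(root)
    folders = [root] + sorted(p for p in root.rglob("*") if p.is_dir())
    for folder in folders:
        names = [f.name for f in folder.iterdir() if f.is_file()]
        for old, new in plan_renames(names).items():
            yield folder / old, folder / new
```

Before applying such a plan, it would be worth checking that no new name collides with an existing file (for instance by renaming through temporary names), and sorting by capture time instead of by name if that is the order you want.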

Can I run a data pipeline on directories with a specific name in Data Fusion?

I'm using Data Fusion on Google Cloud Platform.
Assume the bucket's path is as follows:
test_buk/...
The test_buk bucket contains four files:
20190901, 20190902
20191001, 20191002
Let's say there is a directory inside test_buk called dir.
One group of files shares the prefix 201909 (e.g., 20190901, 20190902),
and another group shares the prefix 201910 (e.g., 20191001, 20191002).
I'd like to run the data pipeline for the 201909 and 201910 groups.
Here's what I've tried:
I ran the data pipeline with the regex path filter set to
gs://test_buk/dir//2019
When the regex path filter is set, no input is read, and consequently there is no output.
How do I build a data pipeline in Data Fusion that processes only a specific group of files in a directory?
If you put the raw path (gs://test_buk/dir/) directly in the regex, its special characters may not be escaped correctly. That could be why no input file matches your filter.
Instead, I suggest using ".*" to match the initial part of the path (since you also specify the path, no files in other folders will match the filter).
Therefore, I would use the following expressions depending on the group of files you want to use (feel free to change the extension of the files):
path = gs://test_buk/dir/
regex path filter = .*201909.*\.csv or .*201910.*\.csv
If you would like to know more about the regex used, you can take a look at (1)
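To see why the leading ".*" matters, the filter can be tried against full object paths. The sketch below uses Python's re.fullmatch as a stand-in for the Data Fusion filter. It assumes the filter has to match the complete path and that the files carry a .csv extension, neither of which is confirmed in the question. The object names and the helper select are hypothetical.

```python
import re


def select(paths, pattern):
    """Return the paths that the pattern matches in full."""
    rx = re.compile(pattern)
    return [p for p in paths if rx.fullmatch(p)]


# Hypothetical object names, assuming a .csv extension.
paths = [
    "gs://test_buk/dir/20190901.csv",
    "gs://test_buk/dir/20190902.csv",
    "gs://test_buk/dir/20191001.csv",
    "gs://test_buk/dir/20191002.csv",
]

september = select(paths, r".*201909.*\.csv")
october = select(paths, r".*201910.*\.csv")
both = select(paths, r".*(201909|201910).*\.csv")

# Without the leading ".*" nothing matches, because the path starts with "gs://".
no_prefix = select(paths, r"201909.*\.csv")
```

Under these assumptions, september and october each hold two paths, both holds all four, and no_prefix is empty.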

kettle wildcard subdirectory regex

I'm trying to process a file in a Kettle transformation. The target file has a static name, say TARGETED.LOG, and it sits in a subdirectory whose name contains a variable date component. So the full path will be something like:
c:\username\kettleworkspace\report_[DDMMYYYY]\TARGETED.LOG.
Any advice?
Use the Get File Names step with the include subfolders option, and drop the resulting list of files in your Text File Input with the Accept filenames from previous step option.
Of course, between these two steps you would probably want to add a Filter step.
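The condition such a filter would apply can be sketched outside Kettle. The Python snippet below is only an illustration of the logic, not Kettle configuration: it keeps a path when the file is named TARGETED.LOG and its parent folder looks like report_ followed by eight digits. The helper name is_target and the sample paths are hypothetical.

```python
import re
from pathlib import PureWindowsPath

# Parent folder must be "report_" followed by DDMMYYYY (eight digits).
DIR_RE = re.compile(r"report_\d{8}")


def is_target(path):
    """True if path is TARGETED.LOG directly inside a report_DDMMYYYY folder."""
    p = PureWindowsPath(path)
    return p.name == "TARGETED.LOG" and DIR_RE.fullmatch(p.parent.name) is not None


candidates = [
    r"c:\username\kettleworkspace\report_24022018\TARGETED.LOG",
    r"c:\username\kettleworkspace\report_24022018\OTHER.LOG",
    r"c:\username\kettleworkspace\archive\TARGETED.LOG",
]
matches = [c for c in candidates if is_target(c)]
```

Here only the first candidate survives the filter.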

Pick up a particular file from a directory using regex in Talend

My directory contains files named WM_PersonFile_22022018, WM_PersonFile_23022018, WM_PersonFile_24022018, and WM_PersonFile_25022018, and new files arrive daily. I am using tFileList to iterate through the files.
What should be my regex in my Filemask to pick up the most recent file? Should the Use Global Expressions as Filemask be unchecked?
I tried "*.txt" which is picking up all the files.
A regex would help you filter for the correct files.
Some other logic is needed to get the newest file. If you use tFileList, you might be able to sort by date and take only the first result.
Alternatively, if you also want to check that the date in the filename is correct, you might need to add a little logic with a tMap, tAssert, tJava or tJavaRow.
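The "little logic" could look roughly like the following, shown in Python purely as an illustration (inside Talend it would be written in Java, for example in a tJava component). It parses the DDMMYYYY suffix, skips names whose date is invalid, and returns the newest file. Sorting the names as plain text would not work here, because the day comes first. The function name newest_file and the optional extension are assumptions.

```python
import re
from datetime import datetime

# Eight-digit DDMMYYYY suffix; the optional extension is an assumption.
NAME_RE = re.compile(r"WM_PersonFile_(\d{8})(\.\w+)?")


def newest_file(names):
    """Return the name with the latest valid date, or None if none qualify."""
    dated = []
    for name in names:
        m = NAME_RE.fullmatch(name)
        if m is None:
            continue  # not a person file
        try:
            day = datetime.strptime(m.group(1), "%d%m%Y").date()
        except ValueError:
            continue  # eight digits, but not a real date
        dated.append((day, name))
    return max(dated)[1] if dated else None
```

For example, given WM_PersonFile_25022018 and WM_PersonFile_01032018, this picks the second one (1 March 2018), even though it sorts first alphabetically.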

Is there any way to rename the "Source" button to something like "HTML"?

Is there any way to rename the "Source" button to something like "HTML"? I ask because users are confused about how to add HTML code using the editor.
Yes. Inside the "lang" folder you will see all of the language files.
In my case, and probably yours, you will want to edit the file "en.js". The file is minified, so it may be hard to read, but changing one string is still not too difficult. If you plan on changing multiple strings, you will most likely want to run the file through a JavaScript formatter first.
Search for the following segment of code. It was one of the very last lines in the file.
"sourcearea":{"toolbar":"Source"}
Change it to
"sourcearea":{"toolbar":"HTML"}
Avoid This Method Unless Required
As a strongly discouraged alternative, if you can't modify the language files for some reason, you can modify the ckeditor.js file and force a specific label.
Inside "ckeditor.js", change the line below
a.ui.addButton("Source",{label:a.lang.sourcearea.toolbar,command:"source",toolbar:"mode,10"});
to the following code
a.ui.addButton("Source",{label:"HTML",command:"source",toolbar:"mode,10"});
The only thing modified is the "label" value in the line above. We remove the reference to a.lang.sourcearea.toolbar and insert a string in its place.