I'm looking for tools to detect naming patterns across many files so that I can correct any inconsistencies. Specifically, the tool would help generate a renaming scheme.
A correct name would be a mix of fixed patterns and increments, followed by arbitrary text, and then an authorized extension.
The number of files ranges from 10,000 to 100,000, so the solution should minimize user intervention.
I have a directory like this:
testDir/
├── corrupttestfile100.ext
├── testfil0000.ext
├── testfil0001.ext
├── testfil0002.ext
├── testfil0003.ext
├── testfil0004.ext
├── testfil0005.ext
├── testfil0006.ext
├── testfil0007.ext
├── testfil0008.ext
├── testfil0009.ext
├── testfile010.ext
├── testfile011.ext
├── testfile012.ext
├── testfile013.ext
├── testfile014.ext
├── testfile015.ext
├── testfile016.ext
├── testfile017.ext
├── testfile018.ext
├── testfile019.ext
├── testfile020.ext
├── testfile021.ext
├── testfile022.ext
├── testfile023.ext
├── testfile024.ext
├── testfile025.ext
├── testfile026.ext
├── testfile027.ext
├── testfile028.ext
├── testfile029.ext
├── testfile030.ext
├── testfile031.ext
├── testfile032.ext
├── testfile033.ext
├── testfile034.ext
├── testfile035.ext
├── testfile036.ext
├── testfile037.ext
├── testfile038.ext
├── testfile039.ext
├── testfile040.ext
├── testfile041.ext
├── testfile042.ext
├── testfile043.ext
├── testfile044.ext
├── testfile045.ext
├── testfile046.ext
├── testfile047.ext
├── testfile048.ext
├── testfile049.ext
├── testfile050.ext
├── testfile051.ext
├── testfile052.ext
├── testfile053.ext
├── testfile054.ext
├── testfile055.ext
├── testfile056.ext
├── testfile057.ext
├── testfile058.ext
├── testfile059.ext
├── testfile060.ext
├── testfile061.ext
├── testfile062.ext
├── testfile063.ext
├── testfile064.ext
├── testfile065.ext
├── testfile066.ext
├── testfile067.ext
├── testfile068.ext
├── testfile069.ext
├── testfile080.ext
├── testfile081.ext
├── testfile082.ext
├── testfile083.ext
├── testfile084.ext
├── testfile085.ext
├── testfile086.ext
├── testfile087.ext
├── testfile088.ext
├── testfile089.ext
├── testfile090.ext
├── testfile091.ext
├── testfile092.ext
├── testfile093.ext
├── testfile094.ext
├── testfile095.ext
├── testfile096.ext
├── testfile097.ext
├── testfile098.ext
├── testfile099.ext
├── testfile101.ext2
└── testfileNotRelevant.ext
I'd like to find suitable tools to produce the following feedback, or something equivalent:
1. pattern ^testfil matching 92/93
Exception :
corrupttestfile100.ext
2. pattern ^testfile matching 82/93
Exception :
corrupttestfile100.ext
testfil0000.ext
testfil0001.ext
testfil0002.ext
testfil0003.ext
testfil0004.ext
testfil0005.ext
testfil0006.ext
testfil0007.ext
testfil0008.ext
testfil0009.ext
3. Increment pattern breaks [70-79]
4. Increment pattern 3 digits 92/93
5. Increment pattern 4 digits 10/93
6. Extension exception : ext2 1/93
I began scripting in bash using grep, find, sed and rename, as it seemed natural, but I'm wondering whether I will hit some walls and whether I'm reinventing the wheel. I can't find any relevant source of help, and similar existing tools are neither open source nor data agnostic.
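The kind of report described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not an existing tool: it assumes every well-formed name looks like "non-digit prefix, run of digits, dot, extension", and the names `NAME_SHAPE` and `summarize` are made up for this example. It groups names by exact prefix rather than by the `^testfil` style regexes shown in the question, so the counts it prints are not the same as the ones in the desired feedback.

```python
import re
from collections import Counter

# Assumed shape of a "correct" name: non-digit prefix, digits, dot, extension.
NAME_SHAPE = re.compile(r"^(?P<prefix>\D*)(?P<digits>\d+)\.(?P<ext>[^.]+)$")


def summarize(names):
    """Count prefixes, digit widths and extensions, and list numbering gaps."""
    prefixes, widths, extensions = Counter(), Counter(), Counter()
    numbers, unparsed = set(), []
    for name in names:
        m = NAME_SHAPE.match(name)
        if m is None:
            # Names without a digit run before the extension are reported as-is.
            unparsed.append(name)
            continue
        prefixes[m["prefix"]] += 1
        widths[len(m["digits"])] += 1
        extensions[m["ext"]] += 1
        numbers.add(int(m["digits"]))

    # Collapse the missing numbers between neighbours into inclusive ranges.
    ordered = sorted(numbers)
    gaps = [(a + 1, b - 1) for a, b in zip(ordered, ordered[1:]) if b - a > 1]
    return {
        "prefixes": dict(prefixes),
        "widths": dict(widths),
        "extensions": dict(extensions),
        "gaps": gaps,
        "unparsed": unparsed,
    }


# Rebuild the 93 names from the listing in the question.
names = (
    ["corrupttestfile100.ext", "testfile101.ext2", "testfileNotRelevant.ext"]
    + [f"testfil{i:04d}.ext" for i in range(10)]
    + [f"testfile{i:03d}.ext" for i in range(10, 70)]
    + [f"testfile{i:03d}.ext" for i in range(80, 100)]
)
report = summarize(names)
for key, value in report.items():
    print(key, value)
```

For real use, `names` would come from something like `os.listdir("testDir")`. The minority groups in each counter are the candidates for renaming, and a single pass like this should stay fast at 100,000 names.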
Related
I'm trying to use libxl in my C++ project in Visual Studio Code. This is my directory layout:
├── ken.code-workspace
└── src
    ├── include
    │   ├── AutoFilterA.h
    │   ├── AutoFilterW.h
    │   ├── BookA.h
    │   ├── BookW.h
    │   ├── enum.h
    │   ├── FilterColumnA.h
    │   ├── FilterColumnW.h
    │   ├── FontA.h
    │   ├── FontW.h
    │   ├── FormatA.h
    │   ├── FormatW.h
    │   ├── FormControlA.h
    │   ├── FormControlW.h
    │   ├── handle.h
    │   ├── libxl.h
    │   ├── RichStringA.h
    │   ├── RichStringW.h
    │   ├── setup.h
    │   ├── SheetA.h
    │   └── SheetW.h
    ├── ken.cpp
    └── lib
        └── libxl.so
And this is my ken.cpp:
#include <iostream>
#include <vector>
#include <string>
#include "include/libxl.h"
using namespace std;
using namespace libxl;
int main()
{
    cout << "Hello babe :)" << endl;
    Book* book = xlCreateBook();
}
But I get these errors while trying to compile:
/run/media/mahdi/Mahdi/Projects/ken/src/ken.cpp:7:17: error: ‘libxl’ is not a namespace-name
7 | using namespace libxl;
| ^~~~~
/run/media/mahdi/Mahdi/Projects/ken/src/ken.cpp: In function ‘int main()’:
/run/media/mahdi/Mahdi/Projects/ken/src/ken.cpp:12:5: error: ‘Book’ was not declared in this scope
12 | Book* book = xlCreateBook();
| ^~~~
/run/media/mahdi/Mahdi/Projects/ken/src/ken.cpp:12:11: error: ‘book’ was not declared in this scope; did you mean ‘bool’?
12 | Book* book = xlCreateBook();
| ^~~~
| bool
I've never added an external library to a Visual Studio Code project before. What am I doing wrong?
I am trying to ignore files whose names end with a dot followed by digits, such as ".133443". I tried this regex, but Unison still copies the files:
unison -batch -owner -group -times -ignore "Regex ^*\.[0-9]+$" /hadoop/bigdata/giin/data ssh://cnp31ginhortonen1.giin.recouv//hadoop/bigdata/giin/data -prefer newer
Source :
[root#cnp31ginhortonen1 .unison]# tree /hadoop/bigdata/giin/data/
/hadoop/bigdata/giin/data/
├── aefd.csv
├── aefd.log
├── aefd.xml
├── subdir
│   ├── aefd.csv
│   ├── aefd.log
│   ├── aefd.xml
│   └── TB5E.B01.117.210409074
├── TB5E.B01.117.10409074
└── TB5E.B01.117.210409074
I need to write a script that copies a directory recursively, but only the subdirectories and files matched by a certain regex. For instance, for a tree like this:
.
└── toCopy
    ├── A
    │   ├── 123
    │   ├── D
    │   │   └── rybka23
    │   ├── file
    │   ├── file1
    │   └── random
    ├── B
    ├── C
    │   ├── file_25
    │   └── somefile
    └── E1
        └── something
For a RegEx
.*[0-9]+
I need to get a new directory:
newDir
├── A
│   ├── 123
│   ├── D
│   │   └── rybka23
│   └── file1
├── C
│   └── file_25
└── E1
So my first thought was something like this:
find toCopy -regex ".*[0-9]+" -exec cp -R '{}' newDir \;
But that doesn't really work, because I'm only getting the paths to the files/directories I need to copy and I have no idea how to build the tree from them.
I would really appreciate any hints on how to do that.
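You can do that using the find command and looping through the results:
#!/usr/bin/env bash
cd toCopy || exit 1
# Read NUL-delimited paths so names with spaces or newlines are handled safely.
while IFS= read -rd '' elem; do
    if [[ -d $elem ]]; then
        # Matching directory: recreate it (without its contents) in the destination.
        mkdir -p ../newDir/"$elem"
    else
        # Matching file: create its parent directories first, then copy it.
        d="${elem%/*}"
        mkdir -p ../newDir/"$d"
        cp "$elem" ../newDir/"$d"
    fi
done < <(find . -name '*[0-9]*' -print0)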
This requires bash as we are using process substitution.
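For comparison, here is a rough Python sketch of the same idea. It rests on a few assumptions: the regex is applied to each entry's base name with a full match, `copy_matching` is a made-up helper name, and the destination lives outside the source tree. The demo rebuilds the example tree in a temporary directory and treats `123` as a file, which the question does not actually specify.

```python
import os
import re
import shutil
import tempfile


def copy_matching(src, dst, pattern):
    """Copy entries under src whose base name fully matches pattern.

    Parent directories of a matching entry are created as needed, so the
    tree layout is preserved. dst is assumed to lie outside src.
    """
    rx = re.compile(pattern)
    for root, dirs, files in os.walk(src):
        target = os.path.normpath(os.path.join(dst, os.path.relpath(root, src)))
        for d in dirs:
            if rx.fullmatch(d):
                # A matching directory is recreated empty, like mkdir -p above.
                os.makedirs(os.path.join(target, d), exist_ok=True)
        for f in files:
            if rx.fullmatch(f):
                os.makedirs(target, exist_ok=True)
                shutil.copy2(os.path.join(root, f), os.path.join(target, f))


# Demo: rebuild the tree from the question in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "toCopy")
    for rel in ["A/123", "A/D/rybka23", "A/file", "A/file1", "A/random",
                "C/file_25", "C/somefile", "E1/something"]:
        path = os.path.join(src, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    os.makedirs(os.path.join(src, "B"))

    dst = os.path.join(tmp, "newDir")
    copy_matching(src, dst, r".*[0-9]+")

    # Collect everything that ended up in newDir, relative to newDir.
    copied = sorted(
        os.path.relpath(os.path.join(r, n), dst).replace(os.sep, "/")
        for r, ds, fs in os.walk(dst)
        for n in ds + fs
    )
    print(copied)
```

With the example tree, `copied` lists A, A/123, A/D, A/D/rybka23, A/file1, C, C/file_25 and E1, which is the newDir layout asked for in the question.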
I have a problem with pdf.js and Qt 5.8. I tried to reproduce the code from the linked question "Using pdf.js with Qt5.8" in my application, but it doesn't work and I don't know why. Qt shows me this JavaScript message:
"js: Uncaught TypeError: Cannot read property 'PDFJS' of undefined".
This is my code in mainwindow:
QWebEngineView *view;
QString pdfFileURL;
QString pathToPDFjs = QString("file:///"+qApp->applicationDirPath()+"/libraries/PDF/viewer.html");
pdfFileURL = "file:///C:/Users/Administrateur/Desktop/CV.pdf";
view = new QWebEngineView();
this->setCentralWidget(view);
view->load(QUrl::fromUserInput(pathToPDFjs + QString("?file=") + pdfFileURL));
view->show();
I would recommend downloading the source code from here.
Then copy the entire extracted folder into a folder within your project (in my case 3rdParty):
.
├── 3rdParty
│   └── pdfjs-1.7.225-dist
│       ├── build
│       │   ├── pdf.js
│       │   └── pdf.worker.js
│       ├── LICENSE
│       └── web
│           ├── cmaps
│           ├── {another files}
│           ├── viewer.css
│           ├── viewer.html
│           └── viewer.js
├── CV.pdf
├── main.cpp
├── mainwindow.cpp
├── mainwindow.h
├── mainwindow.ui
└── pdfjsExample.pro
Another recommendation is to add a copy command to the .pro file so that the library is copied next to the executable, which avoids folder-location problems (CV.pdf is the PDF I use for testing).
COPY_CONFIG = 3rdParty CV.pdf
copy_cmd.input = COPY_CONFIG
copy_cmd.output = ${QMAKE_FILE_IN_BASE}${QMAKE_FILE_EXT}
copy_cmd.commands = $$QMAKE_COPY_DIR ${QMAKE_FILE_IN} ${QMAKE_FILE_OUT}
copy_cmd.CONFIG += no_link_no_clean
copy_cmd.variable_out = PRE_TARGETDEPS
QMAKE_EXTRA_COMPILERS += copy_cmd
And the code would look like this:
QWebEngineView *view;
QString pdfFileURL;
QString pathToPDFjs = QString("file:///%1/%2")
.arg(QDir::currentPath())
.arg("3rdParty/pdfjs-1.7.225-dist/web/viewer.html");
pdfFileURL = QString("file:///%1/%2").arg(QDir::currentPath()).arg("CV.pdf");
view = new QWebEngineView();
setCentralWidget(view);
QUrl url = QUrl::fromUserInput(pathToPDFjs + QString("?file=") + pdfFileURL);
view->load(url);
NOTE: I changed applicationDirPath to currentPath so that moving the executable to another location does not cause problems. For the application to work correctly, the 3rdParty folder and the executable must be in the same directory.
The complete code is here.
If you want to hide the print button and the open button, comment out the following lines:
viewer.html [line 178]
<!--button id="openFile" class="toolbarButton openFile hiddenLargeView" title="Open File" tabindex="32" data-l10n-id="open_file">
<span data-l10n-id="open_file_label">Open</span>
</button>
<button id="print" class="toolbarButton print hiddenMediumView" title="Print" tabindex="33" data-l10n-id="print">
<span data-l10n-id="print_label">Print</span>
</button-->
viewer.js [line 3058]
/*items.openFile.addEventListener('click', function (e) {
eventBus.dispatch('openfile');
});
items.print.addEventListener('click', function (e) {
eventBus.dispatch('print');
});*/
I'm trying to connect to a ClojureScript browser REPL, and I'm having trouble with clojure.browser.repl/connect. My compiled JavaScript throws a TypeError trying to call appendChild on a null object in the block of Google Closure code at the top. I'm following the instructions in ClojureScript: Up and Running (Chapter 9, p.78, available in the preview), and wondering if the tooling for this has changed since it was published.
I'm using Leiningen 2.0.0, Java 1.6.0_37, OS X 10.7.5, plus the dependencies in my project.clj:
(defproject brepl-hello "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.4.0"]
                 [org.clojure/clojurescript "0.0-1552"]
                 [compojure "1.1.5"]
                 [ring/ring-jetty-adapter "1.1.8"]]
  :plugins [[lein-cljsbuild "0.3.0"]]
  :source-paths ["src/clj"]
  :cljsbuild {:builds [{
              :source-paths ["src/cljs"]
              :compiler {
                :output-to "resources/public/brepl-hello.js"
                :optimizations :whitespace
                :pretty-print true}}]})
Here's the only ClojureScript source file, src/cljs/brepl_hello/brepl-hello.cljs:
(ns brepl-hello
(:require [clojure.browser.repl :as repl]))
(repl/connect "http://localhost:9000/repl")
This compiles to the file resources/public/brepl-hello.js, which I've inserted into index.html in the same directory:
<!DOCTYPE html>
<html>
<head>
<title></title>
<script type="text/javascript" src="brepl-hello.js"></script>
</head>
<body>
</body>
</html>
I've been serving this on port 3000 with Ring/Jetty from the REPL or with Python's SimpleHTTPServer. When I open this page in Chrome, the dev console shows Uncaught TypeError: Cannot call method 'appendChild' of null, with a traceback to this if/else block in the Google Closure code at the top of the compiled JS file, where parentElm (passed in to the containing function as a parameter) is null.
if(goog.userAgent.GECKO || goog.userAgent.WEBKIT) {
  window.setTimeout(goog.bind(function() {
    parentElm.appendChild(iframeElm);
    iframeElm.src = peerUri.toString();
    goog.net.xpc.logger.info("peer iframe created (" + iframeId + ")")
  }, this), 1)
}else {
  iframeElm.src = peerUri.toString();
  parentElm.appendChild(iframeElm);
  goog.net.xpc.logger.info("peer iframe created (" + iframeId + ")")
}
This seems to be a problem with clojure.browser.repl/connect. Swapping out this line in the ClojureScript source for something like:
(ns brepl-hello
(:require [clojure.browser.repl :as repl]))
(.write js/document "Hello World!")
Will compile and run in the browser just fine. I suspect something is misconfigured in my build settings or directory structure, or I'm making a noob mistake somewhere in all this. What's changed since the time the instructions I'm following were published? I found a couple references to this problem in the #clojure irc logs, but no solution.
Finally, here's an abbreviated directory tree for reference:
├── out
│   ├── cljs
│   │   ├── core.cljs
│   │   └── core.js
│   ├── clojure
│   │   └── browser
│   │       ├── event.cljs
│   │       ├── event.js
│   │       ├── net.cljs
│   │       ├── net.js
│   │       ├── repl.cljs
│   │       └── repl.js
│   └── goog
│       └── [...]
├── pom.xml
├── project.clj
├── resources
│   └── public
│       ├── brepl-hello.js
│       └── index.html
├── src
│   ├── clj
│   │   └── brepl_hello
│   │       └── core.clj
│   └── cljs
│       └── brepl_hello
│           └── brepl-hello.cljs
└── target
    ├── brepl-hello-0.1.0-SNAPSHOT.jar
    ├── classes
    ├── cljsbuild-compiler-0
    │   ├── brepl_hello
    │   │   └── brepl-hello.js
    │   ├── cljs
    │   │   ├── core.cljs
    │   │   └── core.js
    │   └── clojure
    │       └── browser
    │           ├── event.cljs
    │           ├── event.js
    │           ├── net.cljs
    │           ├── net.js
    │           ├── repl.cljs
    │           └── repl.js
    └── stale
        └── extract-native.dependencies
Well, it's open source, and looking at the code, it seems that document.body is null at the time the REPL's hidden iframe is being added to it (the connect call leads to this point).
You should make this connect call on DOM ready or on body load, and it should work fine.
Take a look at:
https://github.com/magomimmo/modern-cljs/blob/master/doc/tutorial-02.md
or, for a better brepl experience, here
https://github.com/magomimmo/modern-cljs/blob/master/doc/tutorial-18.md