I am creating a Python script that uses PyPDF2. I am trying to open and append a file using a wildcard in the file name, but the * is being taken literally as part of the file name.
Is there a way to use wildcards with the open and merge functionality in PyPDF2? And if so, how?
Why not use the glob.glob function to find the list of matching files, then append each individually? That is a far cleaner separation of concerns than expecting PyPDF to guess when you mean a literal filename and when you mean a wildcard.
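A minimal sketch of that suggestion: expand the wildcard with glob.glob first, then append each match individually. The function names find_pdfs and merge_pdfs are made up for illustration, and the merge step assumes PyPDF2 2.x or 3.x, where the merger class is called PdfMerger (older 1.x releases call it PdfFileMerger).

```python
import glob


def find_pdfs(pattern):
    """Expand a wildcard pattern into a sorted list of matching paths."""
    # glob.glob returns matches in arbitrary order, so sort them to get
    # a predictable merge order.
    return sorted(glob.glob(pattern))


def merge_pdfs(pattern, output_path):
    """Append every PDF matching `pattern` into a single output file."""
    # Assumption: PyPDF2 2.x or 3.x. On 1.x the class is PdfFileMerger.
    # Imported here so find_pdfs stays usable without PyPDF2 installed.
    from PyPDF2 import PdfMerger

    paths = find_pdfs(pattern)
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")

    merger = PdfMerger()
    try:
        for path in paths:
            merger.append(path)  # one literal file name per call
        merger.write(output_path)
    finally:
        merger.close()
    return paths
```

Something like merge_pdfs("reports/2020-*.pdf", "combined.pdf") would then merge every match (both paths are placeholders). If the output file lives in the same folder and matches the pattern, it will be picked up on the next run, so writing it somewhere else is probably safer.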
I am using the glob module to parse through a bunch of text files. Here is the line of that code:
for file in g.glob('*.TXT'):
    for col in csv.DictReader(open(file,'rU')):
It works fine, but is there a way to grab the names of the files that it iterates through? I'm thinking this is not possible since it just looks for any files with the suffix '.TXT'. But I just thought I would ask.
Since glob.glob() returns only a list of matching file names, it's not possible to fetch all the file names considered using just this function. That would be a job for os.listdir().
If you only want to keep track of the matched files, you can store the return value of glob() before iterating over it:
filenames = g.glob('*.TXT')
for filename in filenames:
    for col in csv.DictReader(open(filename,'rU')):
        ...
Also note that I changed the name file to filename, because that's more precise.
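To see both lists side by side, here is a rough sketch that keeps the glob matches and compares them against os.listdir(). The function name read_rows_by_file is hypothetical. It also opens files with newline="" instead of 'rU', since the csv documentation recommends that and the 'U' mode was removed in Python 3.11.

```python
import csv
import glob
import os


def read_rows_by_file(directory, pattern="*.TXT"):
    """Return (matched names, skipped names, rows per matched file)."""
    matched = sorted(glob.glob(os.path.join(directory, pattern)))
    matched_names = {os.path.basename(path) for path in matched}

    # os.listdir sees every entry in the directory, so the difference
    # is the set of names that glob considered but did not match.
    skipped = sorted(set(os.listdir(directory)) - matched_names)

    rows = {}
    for filename in matched:
        # newline="" lets the csv module handle line endings itself.
        with open(filename, newline="") as handle:
            rows[os.path.basename(filename)] = list(csv.DictReader(handle))
    return sorted(matched_names), skipped, rows
```

Keying the rows by file name means each row can always be traced back to the file it came from.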
I have a large XML file with many references to different file names, all PDF files. I want to replace all the different file names with one specific file name. I am using Notepad++.
For example:
cat.pdf
dog.pdf
bird.pdf
Replace all these with whale.pdf.
I have googled, searched, tried and failed for so long right now, and I cannot make it work. I don't know what I am doing wrong.
If you specifically intend to match several names, you can do it this way (with Search Mode set to Regular expression):
(cat|dog|bird)\.pdf\b
To match any PDF file name, you can try
\w+\.pdf\b
Replace with whale.pdf.
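Both patterns can be tried out away from the editor. The sketch below uses Python's re module, which is an assumption made for illustration: Notepad++ has its own regex engine, although these two patterns use only syntax the two share. The names SPECIFIC, ANY_PDF and replace_pdf_names are invented for the example.

```python
import re

# The two patterns from the answers above.
SPECIFIC = re.compile(r"(cat|dog|bird)\.pdf\b")
ANY_PDF = re.compile(r"\w+\.pdf\b")


def replace_pdf_names(text, pattern, replacement="whale.pdf"):
    """Replace every match of `pattern` in `text` with `replacement`."""
    return pattern.sub(replacement, text)


sample = '<doc href="cat.pdf"/><doc href="horse.pdf"/>'

# Only the listed names are rewritten.
only_listed = replace_pdf_names(sample, SPECIFIC)

# Every name made of word characters is rewritten.
everything = replace_pdf_names(sample, ANY_PDF)
```

Two details worth checking against your own data: the first pattern has no leading \b, so bobcat.pdf would become bobwhale.pdf, and \w+ does not cover hyphens, so my-file.pdf would become my-whale.pdf.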
This is probably real simple, but I can't seem to figure out how to do it.
I have an application in R (Shiny) where a user uploads a *.zip file that contains all the components of an ESRI shapefile. I unpack these files into their own directory. This folder may or may not contain a *.shp.xml file. At some point in my R code, I need to find the exact name of the *.shp file that has been unpacked, and distinguish it from the *.shp.xml file. How do I write the expression that will do that? I was thinking of using list.files, but I am unsure how to write the rest of the expression.
thanks!
With R regex patterns, "$" has a special meaning: it anchors the match to the end of the string. The dots need to be escaped with \\, so:
shpfils <- list.files(path, pattern="\\.shp$")
This should isolate your file -
Sys.glob("*shp")
as compared to
Sys.glob("*shp*")
which should give both files
or
Sys.glob("*shp.xml")
which should give the .shp.xml file
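To see why the "$" anchor and the escaped dot in the list.files pattern matter, here is the same regex idea sketched in Python rather than R, purely as an illustration. The file names are made up, and in Python the dot needs only a single backslash inside a raw string.

```python
import re

# Escaped dot plus end-of-string anchor, as in the list.files pattern.
SHP_ONLY = re.compile(r"\.shp$")

# Same pattern without the escape: "." now matches any character.
SHP_LOOSE = re.compile(r".shp$")


def shapefiles(names, pattern=SHP_ONLY):
    """Keep only the names that the pattern matches."""
    return [name for name in names if pattern.search(name)]


unpacked = ["roads.shp", "roads.shp.xml", "roads.shx", "roads.dbf", "myshp"]

strict = shapefiles(unpacked)
loose = shapefiles(unpacked, SHP_LOOSE)
```

The anchored pattern rejects roads.shp.xml because the name does not end in .shp, and the unescaped variant also lets a name like myshp through.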
I've got folder with about 1300 png icons. What I need is html file with all of them inside like:
<img src="path-to-image.png" alt="file name without .png" id="file-name-without-.png" class="icon"/>
It's easy as hell, but with that number of files it's a pure waste of time to do it manually. Do you have any ideas how to automate it?
If you need it just once, then do a "dir" or "ls" and redirect it to a file, then use an editor with macro support like Notepad++ to record modifying a single line the way you want, then play the macro for the remainder of the file. If it's dynamic, use PHP.
I would not use C++ to do this. I would use vi, honestly, because running regular expressions repeatedly is all that is needed for this.
But you can do this in C++. I would start with a plain text file with all the file names, generated by dir or ls on the command prompt.
Then write code that takes a line of input and turns it into a line formatted the way you want. Test this and get it working on a single line first.
The RE engine of C++ is probably overkill (and is not all that well supported in compilers), but substr and basic find and replace is all you need. Is there a string library you are familiar with? std::string would do.
To generate the file name without PNG, check the last four characters and see if they exist and are .PNG (if not report an error). Then strip them. To remove dashes, copy characters to a new string but if you are reading a dash write a space. Everything else is just string concatenation.
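The steps above translate almost line for line into a short script. This is a sketch in Python rather than C++, following the tag template from the question; make_img_tag and make_icon_page are hypothetical names, and the use of html.escape is an added precaution rather than something the answers ask for.

```python
import html
import os


def make_img_tag(path):
    """Build one <img> tag from a .png path."""
    filename = os.path.basename(path)
    stem, extension = os.path.splitext(filename)
    # Report an error if the name does not end in .png (any letter case).
    if extension.lower() != ".png":
        raise ValueError(f"not a PNG file: {path!r}")

    # alt text: dashes become spaces; id: the stem is kept as-is.
    alt_text = stem.replace("-", " ")
    return '<img src="{}" alt="{}" id="{}" class="icon"/>'.format(
        html.escape(path, quote=True),
        html.escape(alt_text, quote=True),
        html.escape(stem, quote=True),
    )


def make_icon_page(paths):
    """Join one tag per file, in sorted order, one tag per line."""
    return "\n".join(make_img_tag(path) for path in sorted(paths))
```

Feeding it the output of glob.glob("icons/*.png") (the folder name is a placeholder) and writing the result to an .html file would cover all 1300 icons in one run.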
I have a lot of java files:
Foo01.java
Foo02.java
Foo03.java
Foo04.java
Foo05.java
Foo01Bar.java
Foo02Bar.java
Foo03Bar.java
Foo04Bar.java
Foo05Bar.java
And I need to replace an expression in and only in FooXX.java classes.
Using CTRL + H in Eclipse, in the file name pattern, I tried Foo(\d\d).java, but it does not work. If I write Foo*.java, every FooXXBar.java file also appears, which I don't want.
What's the way to do it?
I don't think Eclipse has the capability to do full regular expressions on file names. As far as I know, you can use * to match any string and ? to match any single character in a file name. As a result, if your file list is similar to the above, you can search for:
Foo??.java
For more complex file searches you probably need to use a combination of the Unix/Windows command line tools (depending on your OS).
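To check what a pattern like Foo??.java selects before running the replace, the same wildcard rules can be tried in Python's fnmatch module. This is only an illustration that assumes Eclipse's * and ? behave like ordinary shell wildcards, as described above. The helper names are hypothetical, and the regex variant shows the stricter two-digit match that the file name field does not offer.

```python
import fnmatch
import re

# Stricter alternative: exactly two digits between "Foo" and ".java".
FOO_DIGITS = re.compile(r"Foo\d\d\.java")


def match_wildcard(names, pattern="Foo??.java"):
    """Shell-style match: each ? stands for exactly one character."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


def match_two_digits(names):
    """Regex match that accepts digits only in the two middle positions."""
    return [name for name in names if FOO_DIGITS.fullmatch(name)]


files = ["Foo01.java", "Foo02.java", "Foo01Bar.java", "FooAB.java"]

by_wildcard = match_wildcard(files)
by_regex = match_two_digits(files)
```

The wildcard version leaves out Foo01Bar.java but would still pick up a name like FooAB.java, which is harmless for the file list in the question but worth knowing.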