Boost-Python: Load python module with unicode chars in path - c++

I'm working on a game project and use Python 2.7.2 for scripting. My application works fine when the path to the .exe contains no Unicode characters, but it can't load scripts from a path that does contain them using
boost::python::import (import_path.c_str());
I also tried the example from
5.3. Pure Embedding http://docs.python.org/extending/embedding.html#embedding-python-in-c
It can't handle a Unicode path either. I link Python as a DLL.
Please explain how to handle such paths.

boost::python::import takes a std::string, so chances are that import_path has lost some characters in the conversion.
Do you have to support multiple platforms? On Windows, you could call GetShortPathName to retrieve the 8.3 file name and use that to load your DLL.
You can make a quick test:
Rename your extension to "JaiDéjàTestéÇaEtJaiDétestéÇa.pyd".
At the command line, type dir /x *.pyd to get the short file name (JAIDJT~1.PYD on my computer).
Use the short name to load your extension.
The file name above is French for "I already tested this and I didn't like it". It is a rhyme that takes the edge off working with Unicode ;)

This isn't really an answer that will suit your needs, but maybe it will give you something to go on.
I ran into a very similar problem with Python; in my case the application is pure Python. I noticed that if my application was installed to a directory whose path could not be encoded in MBCS (which, as far as I understand, is what Python prior to 3.2 converts paths to internally for imports), the Python interpreter would fail, claiming that no module of that name existed.
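To see whether a given install path is likely to hit this, you can check whether it survives encoding to an ANSI code page. The sketch below is only an illustration: the helper name encodable is made up, and cp1252 is used as an assumed stand-in for "mbcs", because the mbcs codec only exists on Windows and maps to whatever the system code page is.

```python
# -*- coding: utf-8 -*-
def encodable(path, codec="cp1252"):
    """Return True if the text `path` can be encoded with `codec`.

    cp1252 is an assumed stand-in for the Windows "mbcs" codec.
    """
    try:
        path.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Accented Latin letters exist in cp1252, CJK characters do not.
print(encodable(u"C:\\Games\\D\u00e9j\u00e0"))
print(encodable(u"C:\\Games\\\u6e38\u620f"))
```

A path for which this returns False is the kind of "trouble directory" described here.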
What I had to do was write an Import Hook to trick it into loading those files anyway.
Here's what I came up with:
import imp, os, sys
class UnicodeImporter(object):
    def find_module(self,fullname,path=None):
        if isinstance(fullname,unicode):
            fullname = fullname.replace(u'.',u'\\')
            exts = (u'.pyc',u'.pyo',u'.py')
        else:
            fullname = fullname.replace('.','\\')
            exts = ('.pyc','.pyo','.py')
        if os.path.exists(fullname) and os.path.isdir(fullname):
            return self
        for ext in exts:
            if os.path.exists(fullname+ext):
                return self
    def load_module(self,fullname):
        if fullname in sys.modules:
            return sys.modules[fullname]
        else:
            sys.modules[fullname] = imp.new_module(fullname)
        if isinstance(fullname,unicode):
            filename = fullname.replace(u'.',u'\\')
            ext = u'.py'
            initfile = u'__init__'
        else:
            filename = fullname.replace('.','\\')
            ext = '.py'
            initfile = '__init__'
        if os.path.exists(filename+ext):
            try:
                with open(filename+ext,'U') as fp:
                    mod = imp.load_source(fullname,filename+ext,fp)
                    sys.modules[fullname] = mod
                    mod.__loader__ = self
                    return mod
            except:
                print 'fail', filename+ext
                raise
        mod = sys.modules[fullname]
        mod.__loader__ = self
        mod.__file__ = os.path.join(os.getcwd(),filename)
        mod.__path__ = [filename]
        #init file
        initfile = os.path.join(filename,initfile+ext)
        if os.path.exists(initfile):
            with open(initfile,'U') as fp:
                code = fp.read()
            exec code in mod.__dict__
        return mod
sys.meta_path = [UnicodeImporter()]
I still run into two issues when using this:
Double-clicking the launcher file (a .pyw file) in Windows Explorer does not work when the application is installed in a trouble directory. I believe this has to do with how Windows file associations pass arguments to pythonw.exe (my guess is that Windows passes the full path string, which includes the non-encodable character, as the argument to the exe). If I create a batch file that calls the Python executable with just the file name of my launcher, and ensure it's launched from the same directory, it launches fine. Again, I'm betting this is because I can now use a relative path as the argument for python.exe and avoid the trouble characters in the path.
When I package my application using py2exe, the resulting exe will not run if placed in one of these trouble paths. I think this has to do with the zipimporter module, which unfortunately is a compiled Python module, so I cannot easily modify it (I would have to recompile, etc.).

Related

Running script with fixed internal commands arguments to include Qwebengine pepflashplayer

I made a PyQt5 QWebEngine app that I want to make portable.
I found out that Flash wasn't working in the app.
After a lot of reading I found out that it works when pepflashplayer64_*.dll and manifest.json are in the folder
C:\Windows\System32\Macromed\Flash\
However, I want to ship pepflashplayer with the app, and
adding a custom Flash folder to the PATH environment variable has no effect, nor does
sys.path.insert()
The command
myapp.py --ppapi-flash-path=C:\Flash\pepflashplayer64_27_0_0_187.dll
works, but how do I pass extra arguments internally when the script is launched?
I tried a dirty hack to run the sys.argv[0] script with the extra argument, but with no success.
if __name__ == "__main__":
    # print sys.argv
    flash = (' --ppapi-flash-path=C:\Flash\pepflashplayer64_27_0_0_187.dll').split()
    # print flash
    noooo = (sys.argv[0] + flash[0]).split()
    import sys
    app = QtWidgets.QApplication(noooo)
    # ... the rest of your handling: `sys.exit(app.exec_())`, etc.
Okay, I got it to work, so I can make the app with the browser portable, and the solution was simpler than I thought.
Pass a second, internal argument like this:
if __name__ == "__main__":
    programname = os.path.dirname(sys.argv[0])  # folder part of the current script's path
    pepperpflash = ' --ppapi-flash-path=' + programname + '/Flash/pepflashplayer64_27_0_0_187.dll'
    try:
        app = QtWidgets.QApplication(sys.argv + [pepperpflash])
    except:
        app = QtWidgets.QApplication(sys.argv)
    # ... the rest of your handling: `sys.exit(app.exec_())`, etc.
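The same idea can be pulled into a small helper that only builds the argument list, which makes it easy to check without starting Qt. This is a sketch rather than the code from the answer: the helper name with_flash_arg and the "Flash" subfolder next to the script are assumptions for illustration. The switch is built here without a leading space, on the assumption that Qt forwards the argument unchanged.

```python
import os
import sys

def with_flash_arg(argv, dll_name="pepflashplayer64_27_0_0_187.dll"):
    """Return a copy of argv with a --ppapi-flash-path switch appended.

    The DLL is assumed to live in a "Flash" folder next to the script.
    """
    script_dir = os.path.dirname(os.path.abspath(argv[0]))
    dll_path = os.path.join(script_dir, "Flash", dll_name)
    return list(argv) + ["--ppapi-flash-path=" + dll_path]

if __name__ == "__main__":
    # Show the argument list that would be handed to QApplication.
    print(with_flash_arg(sys.argv))
```

In the application it would be used as app = QtWidgets.QApplication(with_flash_arg(sys.argv)).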

Create version number variations for info.plist using #define and clang?

Years ago, when compiling with GCC, the following defines in a #include .h file could be pre-processed for use in info.plist:
#define MAJORVERSION 2
#define MINORVERSION 6
#define MAINTVERSION 4
<key>CFBundleShortVersionString</key> <string>MAJORVERSION.MINORVERSION.MAINTVERSION</string>
...which would turn into "2.6.4". That worked because GCC supported the "-traditional" flag. (see Tech Note TN2175 Info.plist files in Xcode Using the C Preprocessor, under "Eliminating whitespace between tokens in the macro expansion process")
However, fast-forward to 2016 and Clang 7.0.2 (Xcode 7.2.1) apparently does not support either "-traditional" or "-traditional-cpp" (or support it properly), yielding this string:
"2 . 6 . 4"
(see Bug 12035 - Preprocessor inserts spaces in macro expansions, comment 4)
Because there are so many different variations (CFBundleShortVersionString, CFBundleVersion, CFBundleGetInfoString), it would be nice to work around this clang problem, and define these once, and concatenate / stringify the pieces together. What is the commonly-accepted pattern for doing this now? (I'm presently building on MacOS but the same pattern would work for IOS)
Here is the Python script I use to increment my build number, whenever a source code change is detected, and update one or more Info.plist files within the project.
It was created to solve the issue raised in this question I asked a while back.
You need to create buildnum.ver file in the source tree that looks like this:
version 1.0
build 1
(you will need to manually increment version when certain project milestones are reached, but buildnum is incremented automatically).
NOTE: the .ver file must be located in the root of the source tree (see SourceDir, below), because this script looks for modified files in that directory. If any are found, the build number is incremented. "Modified" means a source file was changed after the .ver file was last updated.
Then create a new Xcode target to run an external build tool and run something like:
tools/bump_buildnum.py SourceDir/buildnum.ver SourceDir/Info.plist
(make it run in ${PROJECT_DIR})
and then make all the actual Xcode targets dependent upon this target, so it runs before any of them are built.
#!/usr/bin/env python
#
# Bump build number in Info.plist files if a source file has changed.
#
# usage: bump_buildnum.py buildnum.ver Info.plist [ ... Info.plist ]
#
# andy#trojanfoe.com, 2014.
#
import sys, os, subprocess, re
def read_verfile(name):
    version = None
    build = None
    verfile = open(name, "r")
    for line in verfile:
        match = re.match(r"^version\s+(\S+)", line)
        if match:
            version = match.group(1).rstrip()
        match = re.match(r"^build\s+(\S+)", line)
        if match:
            build = int(match.group(1).rstrip())
    verfile.close()
    return (version, build)
def write_verfile(name, version, build):
    verfile = open(name, "w")
    verfile.write("version {0}\n".format(version))
    verfile.write("build {0}\n".format(build))
    verfile.close()
    return True
def set_plist_version(plistname, version, build):
    if not os.path.exists(plistname):
        print("{0} does not exist".format(plistname))
        return False
    plistbuddy = '/usr/libexec/PlistBuddy'
    if not os.path.exists(plistbuddy):
        print("{0} does not exist".format(plistbuddy))
        return False
    cmdline = [plistbuddy,
        "-c", "Set CFBundleShortVersionString {0}".format(version),
        "-c", "Set CFBundleVersion {0}".format(build),
        plistname]
    if subprocess.call(cmdline) != 0:
        print("Failed to update {0}".format(plistname))
        return False
    print("Updated {0} with v{1} ({2})".format(plistname, version, build))
    return True
def should_bump(vername, dirname):
    verstat = os.stat(vername)
    allnames = []
    for dirname, dirnames, filenames in os.walk(dirname):
        for filename in filenames:
            allnames.append(os.path.join(dirname, filename))
    for filename in allnames:
        filestat = os.stat(filename)
        if filestat.st_mtime > verstat.st_mtime:
            print("{0} is newer than {1}".format(filename, vername))
            return True
    return False
def upver(vername):
    (version, build) = read_verfile(vername)
    if version is None or build is None:
        print("Failed to read version/build from {0}".format(vername))
        # Return a pair so the caller can still unpack the result.
        return (None, None)
    # Bump the build number if any files in the same directory as the version file
    # have changed, including sub-directories.
    srcdir = os.path.dirname(vername)
    bump = should_bump(vername, srcdir)
    if bump:
        build += 1
        print("Incremented to build {0}".format(build))
        write_verfile(vername, version, build)
        print("Written {0}".format(vername))
    else:
        print("Staying at build {0}".format(build))
    return (version, build)
if __name__ == "__main__":
    # Xcode sets ACTION=clean when cleaning; do nothing in that case.
    if os.environ.get('ACTION') == 'clean':
        print("{0}: Not running while cleaning".format(sys.argv[0]))
        sys.exit(0)
    if len(sys.argv) < 3:
        print("Usage: {0} buildnum.ver Info.plist [... Info.plist]".format(sys.argv[0]))
        sys.exit(1)
    vername = sys.argv[1]
    (version, build) = upver(vername)
    if version is None or build is None:
        sys.exit(2)
    for i in range(2, len(sys.argv)):
        plistname = sys.argv[i]
        set_plist_version(plistname, version, build)
    sys.exit(0)
First, I would like to clarify what each key is meant to do:
CFBundleShortVersionString
A string describing the released version of an app, using semantic versioning. This string will be displayed in the App Store description.
CFBundleVersion
A string specifying the build version (released or unreleased). It is a string, but Apple recommends using numbers instead.
CFBundleGetInfoString
Seems to be deprecated, as it is no longer listed in the Information Property List Key Reference.
During development, CFBundleShortVersionString isn't changed that often, and I normally set it manually in Xcode. The only string I change regularly is CFBundleVersion, because you can't submit a new build to iTunes Connect/TestFlight unless CFBundleVersion has changed.
To change the value, I use a Rake task with PlistBuddy to write a time stamp (year, month, day, hour, and minute) to CFBundleVersion:
desc "Bump bundle version"
task :bump_bundle_version do
  bundle_version = Time.now.strftime "%Y%m%d%H%M"
  sh %Q{/usr/libexec/PlistBuddy -c "Set CFBundleVersion #{bundle_version}" "DemoApp/DemoApp-Info.plist"}
end
You can use PlistBuddy, if you need to automate CFBundleShortVersionString as well.
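For readers who drive their build from Python rather than Rake, the timestamp part of the task above can be sketched as follows. The function name bundle_version is hypothetical, and writing the value into the plist (for example with PlistBuddy, as in the Rake task) is left out.

```python
from datetime import datetime

def bundle_version(now=None):
    """Format a timestamp as YYYYMMDDHHMM for use as CFBundleVersion."""
    if now is None:
        now = datetime.now()
    return now.strftime("%Y%m%d%H%M")

# A fixed moment, 4 March 2016 at 05:06, gives a 12-digit string.
print(bundle_version(datetime(2016, 3, 4, 5, 6)))  # prints 201603040506
```

Because the fields run from year down to minute, a later build always yields a larger number, as long as two builds are not made within the same minute.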

Django: No such file or directory

I have a process that scans a tape library and looks for media that has expired, so the tapes can be removed and reused before being sent to an offsite vault. (We have some 7-day policies that never make it offsite.) This process takes around 20 minutes to run, so I didn't want it to run on demand when loading or refreshing the page. Instead, I set up a django-cron job (I know I could have done this in Linux cron, but I wanted the project to be as self-contained as possible) that runs the scan and creates a file in /tmp. I've verified that this works: the file exists in /tmp from this morning's execution. The problem is that I now want to display a list of those expired (scratch) media on my web page, but the script says that it can't find the file. When the file was created, I used the absolute filename "/tmp/scratch.2015-11-13.out" (for example), but here's the error I get in the browser:
IOError at /
[Errno 2] No such file or directory: '/tmp/corpscratch.2015-11-13.out'
My assumption is that this is a "web root" issue, but I just can't figure it out. I tried copying the file to the /static/ and /media/ directories configured in Django, and even to the Django root directory and the project root directory, but nothing seems to work. When it says it can't find /tmp/file, where is it really looking?
def sample():
    """ Just testing """
    today = datetime.date.today()  # str(today) has the format YYYY-MM-DD
    inputfile = "/tmp/corpscratch.%s.out" % str(today)
    with open(inputfile) as fh:  # This is the line reporting the error
        lines = [line.strip('\n') for line in fh]
    print(lines)
The print statement was used for testing in the shell (which works, I might add), but the browser gives an error.
And the file does exist:
$ ls /tmp/corpscratch.2015-11-13.out
/tmp/corpscratch.2015-11-13.out
Thanks.
Edit: was mistaken, doesn't work in python shell either. Was thinking of a previous issue.
Use this instead:
today = datetime.datetime.today().date()
inputfile = "/tmp/corpscratch.%s.out" % str(today)
Or:
today = datetime.datetime.today().strftime('%Y-%m-%d')
inputfile = "/tmp/corpscratch.%s.out" % today # No need to use str()
See the difference:
>>> str(datetime.datetime.today().date())
'2015-11-13'
>>> str(datetime.datetime.today())
'2015-11-13 15:56:19.578569'
I ended up finding this elsewhere:
today = datetime.date.today() #format 2015-11-31
inputfilename = "tmp/corpscratch.%s.out" % str(today)
inputfile = os.path.join(settings.PROJECT_ROOT, inputfilename)
With settings.py containing the following:
PROJECT_ROOT = os.path.abspath(os.path.dirname(__file__))
Completely resolved my issues.
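A minimal sketch of the path construction used in that fix, kept independent of Django so it can be run on its own. The helper name scratch_path and the root "/srv/myproject" are placeholders; in the project the root would come from settings.PROJECT_ROOT.

```python
import datetime
import os

def scratch_path(project_root, day):
    """Build <project_root>/tmp/corpscratch.<YYYY-MM-DD>.out for a date."""
    filename = "corpscratch.%s.out" % day.isoformat()
    return os.path.join(project_root, "tmp", filename)

# "/srv/myproject" is a made-up root used only for this demonstration.
path = scratch_path("/srv/myproject", datetime.date(2015, 11, 13))
print(path)
# Checking for the file before open() gives a clearer failure message.
print(os.path.exists(path))
```

Printing the full path that is about to be opened is a quick way to confirm that the reader and the cron job agree on both the directory and the file name.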

Registered Trademark: Why does strip remove ® but replace can't find it? How do I remove symbols from folder and file names?

If the registered trademark symbol does not appear at the end of a file or folder name, strip cannot be used. Why doesn't replace work?
I have some old files and folders named with a registered trademark symbol that I want to remove.
The files don't have an extension.
folder: "\data\originals\Word Finder®"
file 1: "\data\originals\Word Finder® DA"
file 2: "\data\originals\Word Finder® Thesaurus"
For the folder, os.rename(p,p.strip('®')) works. However, os.rename(p,p.replace('®','')) does not work on either the folder or the files.
replace works on strings fed to it directly, i.e.:
print 'Registered® Trademark®'.replace('®','')
Is there a reason the paths don't follow this same logic?
note:
I'm using os.walk() to get the folder and file names
I have been unable to recreate your issue, so I'm not sure why it isn't working for you. Here is a workaround though: instead of using the registered character in your source code with the string methods, try being more explicit with something like this:
import os
for root, folders, files in os.walk(os.getcwd()):
    for fi in files:
        oldpath = os.path.join(root, fi)
        newpath = os.path.join(root, fi.decode("utf-8").replace(u'\u00AE', '').encode("utf-8"))
        os.rename(oldpath, newpath)
Explicitly specifying the Unicode code point you're looking for reduces the number of places where your code could be going wrong. The interpreter no longer has to worry about the encoding of your source code itself.
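The difference between strip and replace can be checked on plain text strings, without touching the file system. This is a small sketch with a hypothetical helper name, remove_registered; it works on text (unicode) strings, so under Python 2 a byte-string name returned by os.walk would first have to be decoded with the file system's encoding, which is an assumption about the setup.

```python
# -*- coding: utf-8 -*-
REGISTERED = u"\u00AE"  # the registered trademark sign

def remove_registered(name):
    """Remove every U+00AE from a text file or folder name."""
    return name.replace(REGISTERED, u"")

# replace() removes the sign wherever it occurs in the name.
print(remove_registered(u"Word Finder\u00AE DA") == u"Word Finder DA")
# strip() only trims the ends, so a sign in the middle survives it.
print(u"Word Finder\u00AE DA".strip(REGISTERED) == u"Word Finder\u00AE DA")
```

Both lines print True, which matches the observation that strip only helps when the sign is at the very end of the name.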
My original question 'Registered Trademark: Why does strip remove ® but replace can't find it?' is no longer applicable. The problem isn't strip or replace, but how os.rename() deals with unicode characters. So, I added to my question.
Going off of what Cameron said, os.rename() seems like it doesn't work with unicode characters. (please correct me if this is wrong - I don't know much about this). shutil.move() ultimately gives the same result that os.rename() should have.
Despite ScottLawson's suggestion to use u'\u00AE' instead of '®', I could not get it to work.
Basically, use shutil.move(old_name,new_name) instead.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import shutil
import os
# from this answer: https://stackoverflow.com/q/1033424/3889452
def remove(value):
    deletechars = '®'
    for c in deletechars:
        value = value.replace(c,'')
    return value
for root, folders, files in os.walk(r'C:\Users\myname\da\data\originals\Word_4_0'):
    for f in files:
        rename = remove(f)
        shutil.move(os.path.join(root,f),os.path.join(root,rename))
    for folder in folders:
        rename = remove(folder)
        shutil.move(os.path.join(root,folder),os.path.join(root,rename))
This also works for the immediate directory (based off of this) and catches more symbols, chars, etc. that aren't included in string.printable and ® doesn't have to appear in the python code.
import shutil
import os
import string
directory_path = r'C:\Users\myname\da\data\originals\Word_4_0'
for file_name in os.listdir(directory_path):
    new_file_name = ''.join(c for c in file_name if c in string.printable)
    shutil.move(os.path.join(directory_path,file_name),os.path.join(directory_path,new_file_name))

How to tell whether a file is executable on Windows in Python?

I'm writing a grepath utility that finds executables in %PATH% that match a pattern.
I need to determine whether a given filename in the path is executable (the emphasis is on command-line scripts).
Based on "Tell if a file is executable" I've got:
import os
from pywintypes import error
from win32api import FindExecutable, GetLongPathName
def is_executable_win(path):
    try:
        _, executable = FindExecutable(path)
        ext = lambda p: os.path.splitext(p)[1].lower()
        if (ext(path) == ext(executable) # reject *.cmd~, *.bat~ cases
            and samefile(GetLongPathName(executable), path)):
            return True
        # path is a document with assoc. check whether it has extension
        # from %PATHEXT%
        pathexts = os.environ.get('PATHEXT', '').split(os.pathsep)
        return any(ext(path) == e.lower() for e in pathexts)
    except error:
        return None # not an exe or a document with assoc.
Where samefile is:
try: samefile = os.path.samefile
except AttributeError:
    def samefile(path1, path2):
        rp = lambda p: os.path.realpath(os.path.normcase(p))
        return rp(path1) == rp(path2)
How could is_executable_win be improved in the given context? Which functions from the Win32 API could help?
P.S.
time performance doesn't matter
subst drives and UNC, unicode paths are not under consideration
C++ answer is OK if it uses functions available on Windows XP
Examples
notepad.exe is executable (as a rule)
which.py is executable if it is associated with some executable (e.g., python.exe) and .PY is in %PATHEXT% i.e., 'C:\> which' could start:
some\path\python.exe another\path\in\PATH\which.py
somefile.doc most probably is not executable (when it is associated with Word for example)
another_file.txt is not executable (as a rule)
ack.pl is executable if it is associated with some executable (most probably perl.exe) and .PL is in %PATHEXT% (i.e. I can run ack without specifying the extension if it is in the path)
What is "executable" in this question
def is_executable_win_destructive(path):
    #NOTE: it assumes `path` <-> `barename` for the sake of example
    barename = os.path.splitext(os.path.basename(path))[0]
    p = Popen(barename, stdout=PIPE, stderr=PIPE, shell=True)
    stdout, stderr = p.communicate()
    return p.poll() != 1 or stdout != '' or stderr != error_message(barename)
Where error_message() depends on language. English version is:
def error_message(barename):
    return "'%(barename)s' is not recognized as an internal" \
           " or external\r\ncommand, operable program or batch file.\r\n" \
           % dict(barename=barename)
The value that is_executable_win_destructive() returns defines whether the path points to an executable for the purpose of this question.
Example:
>>> path = r"c:\docs\somefile.doc"
>>> barename = "somefile"
After that it executes %COMSPEC% (cmd.exe by default):
c:\cwd> cmd.exe /c somefile
If output looks like this:
'somefile' is not recognized as an internal or external
command, operable program or batch file.
Then the path is not an executable; otherwise it is (let's assume there is a one-to-one correspondence between path and barename for the sake of example).
Another example:
>>> path = r'c:\bin\grepath.py'
>>> barename = 'grepath'
If .PY in %PATHEXT% and c:\bin is in %PATH% then:
c:\docs> grepath
Usage:
grepath.py [options] PATTERN
grepath.py [options] -e PATTERN
grepath.py: error: incorrect number of arguments
The above output is not equal to error_message(barename) therefore 'c:\bin\grepath.py' is an "executable".
So the question is how to find out whether the path will produce the error without actually running it. Which Win32 API function and which conditions are used to trigger the 'is not recognized as an internal..' error?
shoosh beat me to it :)
If I remember correctly, you should try to read the first 2 characters in the file. If you get back "MZ", you have an exe.
hnd = open(file,"rb")
if hnd.read(2) == "MZ":
    print "exe"
I think this should be sufficient:
check whether the file extension is in PATHEXT, i.e. whether the file is directly executable
using the cmd.exe command "assoc .ext" you can see whether the file is associated with some executable (some executable will be launched when you launch this file). You can capture and parse the output of assoc without arguments, collect all extensions that are associated, and check the tested file's extension against them.
other file extensions will trigger the error "command is not recognized ...", therefore you can assume that such files are NOT executable.
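The first check in the list above, comparing the extension against PATHEXT, can be sketched like this. The helper name in_pathext and the sample PATHEXT value are made up for illustration; the association lookup via assoc is not covered.

```python
import os

def in_pathext(path, pathext=None):
    """Return True if the extension of `path` is listed in PATHEXT.

    `pathext` is a semicolon-separated string such as ".COM;.EXE;.BAT";
    when omitted it is read from the PATHEXT environment variable.
    """
    if pathext is None:
        pathext = os.environ.get("PATHEXT", "")
    extensions = set(e.lower() for e in pathext.split(";") if e)
    return os.path.splitext(path)[1].lower() in extensions

sample = ".COM;.EXE;.BAT;.CMD;.PY"  # hypothetical PATHEXT value
print(in_pathext("which.py", sample))      # True
print(in_pathext("somefile.doc", sample))  # False
```

The comparison is case-insensitive because PATHEXT entries are conventionally upper case while file names often are not.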
I don't really understand how you can tell the difference between somefile.py and somefile.txt because association can be really the same. You can configure system to run .txt files the same way as .py files.
A windows PE always starts with the characters "MZ". This includes however also any kind of DLLs which are not necessarily executables.
To check for this however you'll have to open the file and read the header so that's probably not what you're looking for.
Here's the grepath.py that I've linked in my question:
#!/usr/bin/env python
"""Find executables in %PATH% that match PATTERN.
"""
#XXX: remove --use-pathext option
import fnmatch, itertools, os, re, sys, warnings
from optparse import OptionParser
from stat import S_IMODE, S_ISREG, ST_MODE
from subprocess import PIPE, Popen
def warn_import(*args):
    """pass '-Wd' option to python interpreter to see these warnings."""
    warnings.warn("%r" % (args,), ImportWarning, stacklevel=2)
class samefile_win:
    """
    http://timgolden.me.uk/python/win32_how_do_i/see_if_two_files_are_the_same_file.html
    """
    @staticmethod
    def get_read_handle (filename):
        return win32file.CreateFile (
            filename,
            win32file.GENERIC_READ,
            win32file.FILE_SHARE_READ,
            None,
            win32file.OPEN_EXISTING,
            0,
            None
        )
    @staticmethod
    def get_unique_id (hFile):
        (attributes,
         created_at, accessed_at, written_at,
         volume,
         file_hi, file_lo,
         n_links,
         index_hi, index_lo
        ) = win32file.GetFileInformationByHandle (hFile)
        return volume, index_hi, index_lo
    @staticmethod
    def samefile_win(filename1, filename2):
        """Whether filename1 and filename2 represent the same file.
        It works for subst, ntfs hardlinks, junction points.
        It works unreliably for network drives.
        Based on GetFileInformationByHandle() Win32 API call.
        http://timgolden.me.uk/python/win32_how_do_i/see_if_two_files_are_the_same_file.html
        """
        if samefile_generic(filename1, filename2): return True
        try:
            hFile1 = samefile_win.get_read_handle (filename1)
            hFile2 = samefile_win.get_read_handle (filename2)
            are_equal = (samefile_win.get_unique_id (hFile1)
                         == samefile_win.get_unique_id (hFile2))
            hFile2.Close ()
            hFile1.Close ()
            return are_equal
        except win32file.error:
            return None
def canonical_path(path):
    """NOTE: it might return wrong path for paths with symbolic links."""
    return os.path.realpath(os.path.normcase(path))
def samefile_generic(path1, path2):
    return canonical_path(path1) == canonical_path(path2)
class is_executable_destructive:
    @staticmethod
    def error_message(barename):
        r"""
        "'%(barename)s' is not recognized as an internal or external\r\n
        command, operable program or batch file.\r\n"
        in Russian:
        """
        return '"%(barename)s" \xad\xa5 \xef\xa2\xab\xef\xa5\xe2\xe1\xef \xa2\xad\xe3\xe2\xe0\xa5\xad\xad\xa5\xa9 \xa8\xab\xa8 \xa2\xad\xa5\xe8\xad\xa5\xa9\r\n\xaa\xae\xac\xa0\xad\xa4\xae\xa9, \xa8\xe1\xaf\xae\xab\xad\xef\xa5\xac\xae\xa9 \xaf\xe0\xae\xa3\xe0\xa0\xac\xac\xae\xa9 \xa8\xab\xa8 \xaf\xa0\xaa\xa5\xe2\xad\xeb\xac \xe4\xa0\xa9\xab\xae\xac.\r\n' % dict(barename=barename)
    @staticmethod
    def is_executable_win_destructive(path):
        # assume path <-> barename that is false in general
        barename = os.path.splitext(os.path.basename(path))[0]
        p = Popen(barename, stdout=PIPE, stderr=PIPE, shell=True)
        stdout, stderr = p.communicate()
        # error_message is a static method, so it must be reached through the class
        expected = is_executable_destructive.error_message(barename)
        return p.poll() != 1 or stdout != '' or stderr != expected
def is_executable_win(path):
    """Based on:
    http://timgolden.me.uk/python/win32_how_do_i/tell-if-a-file-is-executable.html
    Known bugs: treat some "*~" files as executable, e.g. some "*.bat~" files
    """
    try:
        _, executable = FindExecutable(path)
        return bool(samefile(GetLongPathName(executable), path))
    except error:
        return None # not an exe or a document with assoc.
def is_executable_posix(path):
    """Whether the file is executable.
    Based on which.py from stdlib
    """
    #XXX it ignores effective uid, guid?
    try: st = os.stat(path)
    except os.error:
        return None
    isregfile = S_ISREG(st[ST_MODE])
    isexemode = (S_IMODE(st[ST_MODE]) & 0111)
    return bool(isregfile and isexemode)
try:
    #XXX replace with ctypes?
    from win32api import FindExecutable, GetLongPathName, error
    is_executable = is_executable_win
except ImportError, e:
    warn_import("is_executable: fall back on posix variant", e)
    is_executable = is_executable_posix
try: samefile = os.path.samefile
except AttributeError, e:
    warn_import("samefile: fallback to samefile_win", e)
    try:
        import win32file
        samefile = samefile_win.samefile_win
    except ImportError, e:
        warn_import("samefile: fallback to generic", e)
        samefile = samefile_generic
def main():
    parser = OptionParser(usage="""
%prog [options] PATTERN
%prog [options] -e PATTERN""", description=__doc__)
    opt = parser.add_option
    opt("-e", "--regex", metavar="PATTERN",
        help="use PATTERN as a regular expression")
    opt("--ignore-case", action="store_true", default=True,
        help="""[default] ignore case when --regex is present; for \
non-regex PATTERN both FILENAME and PATTERN are first \
case-normalized if the operating system requires it otherwise \
unchanged.""")
    opt("--no-ignore-case", dest="ignore_case", action="store_false")
    opt("--use-pathext", action="store_true", default=True,
        help="[default] whether to use %PATHEXT% environment variable")
    opt("--no-use-pathext", dest="use_pathext", action="store_false")
    opt("--show-non-executable", action="store_true", default=False,
        help="show non executable files")
    (options, args) = parser.parse_args()
    if len(args) != 1 and not options.regex:
        parser.error("incorrect number of arguments")
    if not options.regex:
        pattern = args[0]
    del args
    if options.regex:
        filepred = re.compile(options.regex, options.ignore_case and re.I).search
    else:
        fnmatch_ = fnmatch.fnmatch if options.ignore_case else fnmatch.fnmatchcase
        for file_pattern_symbol in "*?":
            if file_pattern_symbol in pattern:
                break
        else: # match in any place if no explicit file pattern symbols supplied
            pattern = "*" + pattern + "*"
        filepred = lambda fn: fnmatch_(fn, pattern)
    if not options.regex and options.ignore_case:
        filter_files = lambda files: fnmatch.filter(files, pattern)
    else:
        filter_files = lambda files: itertools.ifilter(filepred, files)
    if options.use_pathext:
        pathexts = frozenset(map(str.upper,
            os.environ.get('PATHEXT', '').split(os.pathsep)))
    seen = set()
    for dirpath in os.environ.get('PATH', '').split(os.pathsep):
        if os.path.isdir(dirpath): # assume no expansion needed
            # visit "each" directory only once
            # it is unaware of subst drives, junction points, symlinks, etc
            rp = canonical_path(dirpath)
            if rp in seen: continue
            seen.add(rp); del rp
            for filename in filter_files(os.listdir(dirpath)):
                path = os.path.join(dirpath, filename)
                isexe = is_executable(path)
                if isexe == False and is_executable == is_executable_win:
                    # path is a document with associated program
                    # check whether it is a script (.pl, .rb, .py, etc)
                    if not isexe and options.use_pathext:
                        ext = os.path.splitext(path)[1]
                        isexe = ext.upper() in pathexts
                if isexe:
                    print path
                elif options.show_non_executable:
                    print "non-executable:", path
if __name__=="__main__":
    main()
Parse the PE format.
http://code.google.com/p/pefile/
This is probably the best solution you will get other than using python to actually try to run the program.
Edit: I see you also want files that have associations. This will require mucking in the registry which I don't have the information for.
Edit2: I also see that you differentiate between .doc and .py. This is a rather arbitrary differentiation which must be specified with manual rules, because to windows, they are both file extensions that a program reads.
Your question can't be answered. Windows can't tell the difference between a file which is associated with a scripting language and one associated with some other arbitrary program. As far as Windows is concerned, a .PY file is simply a document which is opened by python.exe.