Can't open a file with a Japanese filename in Python

Why doesn't this work in the Python interpreter? I am running the Python 2.7 version of python.exe on Windows 7. My locale is en_GB.
open(u'黒色.txt')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IOError: [Errno 22] invalid mode ('r') or filename: u'??.txt'
The file does exist, and is readable.
And if I try
name = u'黒色.txt'
name
the interpreter shows
u'??.txt'
Additional:
Okay, I was trying to simplify my problem for the purposes of this forum. Originally the filename was arriving in a cgi script from a web page with a file picker. The idea was to let the web page user upload files to a server:
import cgi
import os
form = cgi.FieldStorage()
fileItems = form['attachment[]']
for fileItem in fileItems:
    if fileItem.file:
        fileName = os.path.split(fileItem.filename)[1]
        f = open(fileName, 'wb')
        while True:
            chunk = fileItem.file.read(100000)
            if not chunk:
                break
            f.write(chunk)
        f.close()
but the files created at the server side had corrupted names. I started investigating this in the Python interpreter, reproduced the problem (so I thought), and that is what I put into my original question. However, I think now that I managed to create a separate problem.
Thanks to the answers below, I fixed the cgi script by making sure the file name is treated as unicode:
fileName = unicode(os.path.split(fileItem.filename)[1])
I never got my example in the interpreter to work. I suspect that is because my PC has the wrong locale for this.
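The fix above depends on the file name being a text (Unicode) string before it reaches open(). Here is a minimal sketch of that idea with the decoding made explicit. The helper name to_text_filename and the UTF-8 default are assumptions for illustration and are not part of the original script; the real encoding depends on how the browser submitted the form.

```python
import os


def to_text_filename(raw_name, encoding='utf-8'):
    """Return the base name of raw_name as a text (Unicode) string.

    Hypothetical helper: 'encoding' is an assumption about how the
    client encoded the name, not something the cgi module reports.
    """
    if isinstance(raw_name, bytes):
        # Decode byte strings explicitly instead of relying on the
        # interpreter's default codec.
        raw_name = raw_name.decode(encoding)
    # Drop any directory part, as the original script does.
    return os.path.split(raw_name)[1]
```

In the script above it would be called as fileName = to_text_filename(fileItem.filename).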

Here's an example script that reads and writes the file. You can use any encoding for the source file that supports the characters you are writing but make sure the #coding line matches. You can use any encoding for the data file as long as the encoding parameter matches.
#coding:utf8
import io
with io.open(u'黒色.txt','w',encoding='utf8') as f:
    f.write(u'黒色.txt content')
with io.open(u'黒色.txt',encoding='utf8') as f:
    print f.read()
Output:
黒色.txt content
Note the print will only work if the terminal running the script supports Japanese; otherwise, you'll likely get a UnicodeEncodeError. I am on Windows and use an IDE that supports UTF-8 output, since the Windows console uses a legacy US-OEM encoding that doesn't support Japanese.
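If the terminal cannot display the characters, one way to avoid the UnicodeEncodeError is to escape whatever the terminal's encoding lacks before printing. This is only a sketch: the helper name printable is hypothetical, and falling back to ASCII when sys.stdout reports no encoding is an assumption.

```python
import sys


def printable(text, encoding=None):
    """Return text with unsupported characters shown as backslash escapes."""
    # Assumption: fall back to ASCII when the stream reports no encoding.
    target = encoding or getattr(sys.stdout, 'encoding', None) or 'ascii'
    # 'backslashreplace' substitutes an escape sequence instead of
    # raising UnicodeEncodeError for characters the codec cannot encode.
    return text.encode(target, 'backslashreplace').decode(target)
```

For example, print(printable(u'黒色.txt content')) prints the text unchanged on a UTF-8 terminal and prints escape sequences on a console that lacks Japanese.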

Run IDLE if you want to work with Unicode strings interactively in Python. Then inputting or printing any characters will just work.

Related

File opening operation misbehaving in Python 2.7

I am learning about exceptions and so performing some file operations and testing various parts of code that can possibly generate exceptions while working with files in Python. I am executing this Python 2.7 code on Canopy.
#!/usr/bin/python
import os
try:
    fp = open('testfile', 'r')
except IOError:
    print 'File not opened successfully'
else:
    print 'File opened successfully'
    try:
        fp.write('Hello!')
    except IOError:
        print 'Write not allowed on this file'
    else:
        print 'Write successful'
        try:
            fp.close()
        except IOError:
            print 'File not closed properly'
        else:
            print 'File closed successfully'
finally:
    if os.path.exists(fp.name):
        os.remove(fp.name)
When I execute this code, I get the following output:
File not opened successfully
NameError Traceback (most recent call last)
/home/sr/Python/tcs.py in ()
--> 185 if os.path.exists(fp.name)
NameError: name 'fp' is not defined
But if I change the access mode of the file to 'w', then everything seems to work properly, with the correct output:
File opened successfully
Write successful
File closed successfully
I cannot understand why the 'r' mode is not making the file open properly and thus the fp file object is not created. Please help me figure the problem out.
P.S.: Also I would like to know if there is a better way of implementing the same thing. But this is optional.

Explanation
The error combined with your printout should be pretty self-explanatory: the variable fp does not exist if you can't open the file.
The mode 'r' indicates that you want to open the file for reading. You cannot read something that is not there, so you end up going to the finally block in your code after processing the IOError. But the error occurs before fp was set, so there is no variable fp, hence the error. [Solutions below]
The mode 'w' indicates that you want to open for writing, but from scratch. There is also an 'a' mode to append if the file already exists. You can write to a non-existent file just fine, so your code does not fail. In fact, if the file did exist in 'w' mode, it would be truncated and any previous contents would be lost.
Try creating an empty file and running with mode 'r'. You should get an exception that prints 'Write not allowed on this file'. That is because, as your error message correctly indicates, writing to a file opened in read mode is not allowed.
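The behaviour of the three modes can be checked with a small experiment. This is a sketch, not part of the original code: the function name demo_modes and the use of a scratch directory from tempfile are choices made for illustration.

```python
import os
import shutil
import tempfile


def demo_modes():
    """Exercise 'r', 'w' and 'a' on a scratch file and report the results."""
    folder = tempfile.mkdtemp()
    path = os.path.join(folder, 'testfile')  # does not exist yet
    results = {}
    try:
        try:
            open(path, 'r')  # reading a missing file fails
        except IOError:
            results['read_missing_file'] = 'IOError'
        with open(path, 'w') as fp:  # 'w' creates the file
            fp.write('first')
        with open(path, 'a') as fp:  # 'a' keeps the old contents
            fp.write(' second')
        with open(path, 'r') as fp:
            results['after_append'] = fp.read()
            try:
                fp.write('Hello!')  # writing in read mode is refused
            except IOError:
                results['write_in_read_mode'] = 'IOError'
        with open(path, 'w'):  # 'w' on an existing file truncates it
            pass
        with open(path, 'r') as fp:
            results['after_truncate'] = fp.read()
    finally:
        shutil.rmtree(folder)  # remove the scratch directory
    return results
```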
Improvements
There are two major improvements that you can make to your code. One is fixing the logical flaws, the other is a major stylistic improvement using with statements.
You have two major logic errors. The first is in the outermost finally block that you already saw. The simplest fix is moving the contents of the finally block into the else, since you don't have any action to take if the file was not opened. Another solution is to refer to the file name you are trying to open in the first place. For example, you could store the file name into a variable and use that:
filename = 'testfile'
try:
    fp = open(filename, 'r')
    ...
finally:
    if os.path.exists(filename):
        os.remove(filename)
The second major logic error is that you do not close the file if your write fails. Notice that you call fp.close() only in the else clause of your try block. It should instead appear in a finally block. The print statement should of course stay in the else. Change
else:
    print 'Write successful'
    try:
        fp.close()
    ...
to
else:
    print 'Write successful'
finally:
    try:
        fp.close()
    ...
The entire code can be improved stylistically by using with blocks to manage your file operations. The obvious way to do this is as follows:
fname = 'testfile'
with open(fname, 'r') as fp:
    fp.write('Hello!')
if os.path.exists(fname):
    os.remove(fname)
You will not get as many detailed messages when things fail, but overall, this code is cleaner, shorter and more robust than what you have. It guarantees that the file will be closed whether or not an exception occurred anywhere along the way. If you need the detailed error output that you currently have, keep using the current try blocks. The reason that most people will prefer with is that any error that occurs will have a detailed description and the line number it happened on, so you basically get all the necessary information with a lot less work.
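If you want some of the status messages and the guaranteed close at the same time, the two approaches can be combined. The following is a sketch under assumptions: the function name write_then_remove and the idea of collecting messages in a list instead of printing them are illustrative choices, not part of the original code.

```python
import os


def write_then_remove(filename, mode='r'):
    """Try to write to filename, then delete it; return status messages."""
    messages = []
    try:
        # The with block closes the file even if write() raises.
        with open(filename, mode) as fp:
            messages.append('File opened successfully')
            fp.write('Hello!')
            messages.append('Write successful')
    except IOError as exc:
        # Covers both a failed open and a refused write.
        messages.append('File operation failed: %s' % exc)
    finally:
        # Uses the name, not fp, so it is safe when open() failed.
        if os.path.exists(filename):
            os.remove(filename)
    return messages
```

Calling write_then_remove('testfile', 'r') on a missing file returns a single failure message, while mode 'w' returns the two success messages.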
Here are some additional resources to help you understand with and context managers:
Understanding Python's "with" statement (from effbot.org)
Official documentation for with

Open file inside Python

How can I open a file once I am inside Python, that is, once I have typed "python" in the terminal? I know how to open a file by typing something similar to the following in a script, and then running it:
from sys import argv
script, filename = argv
txt = open(filename)
print txt.read()
But I have no idea how to do it once I'm inside the Python interpreter. I've tried to type open (file.txt) and also open ("file.txt"), but I get a long error message either way.
Which is the correct way to do this?

Pass the file name as a quoted string: txt = open('filename.txt', 'r') opens it for reading (use 'w' or 'a' for writing or appending). The mode is optional and defaults to 'r'. I just tried it, it works :)
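In an interactive session, two things commonly go wrong: the name is not quoted (Python reads file.txt as the attribute txt of an object named file), or the relative path does not resolve against the current working directory. A small sketch that makes the second case visible follows; the helper name read_text is hypothetical.

```python
import os


def read_text(name):
    """Open name relative to the current directory and return its contents."""
    full_path = os.path.abspath(name)  # where Python will actually look
    if not os.path.isfile(full_path):
        raise IOError('no such file: %s (current directory is %s)'
                      % (full_path, os.getcwd()))
    with open(full_path) as txt:  # mode defaults to 'r'
        return txt.read()
```

At the prompt this would be used as print(read_text('file.txt')).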

Python subprocess module giving an OSError while running UNIX commands

Here's the context:
I am using python 2.7.5. And I would like to run UNIX commands as well as maven commands in a python script.
I managed to do this using os.system("cmd"), but I need to work with the result of the given command. After reading the docs and some threads here, I decided to use the subprocess module and redirect the command's stdout using PIPE. Unexpectedly, I am getting an OSError, as shown in the attached image. Your help will be much appreciated.
In addition to the given sample in the attached image, I have tried:
p = os.popen("java -version")
result = subprocess.check_output(p, shell=True)
subprocess.call("ls /usr", shell=True)
p.s. Using shell=True is strongly discouraged (doc), since it can be dangerous when coupled with unsanitized input.
Also, I took a look at the script named in the error message, /usr/lib64/python2.7/subprocess.py, lines 711 and 1327, but didn't learn more than what is mentioned in the error message: raise child_exception
[Image: Subprocess Terminal Output]

You aren't using subprocess.check_output correctly. You're trying to pass a pipe file object (the return value of os.popen) to check_output but it's expecting a command argument or argument vector.
Also, the subprocess.call function won't capture the executed command's output, so you would only use that if you want the output of ls /usr (or whatever) to be seen by the user running the script interactively. (Which is pretty much the same as os.system.)
Try this instead (showing with and without the shell):
import subprocess
out1a = subprocess.check_output(['java', '-version'], stderr=subprocess.STDOUT)
print(out1a)
out1b = subprocess.check_output('java -version', stderr=subprocess.STDOUT, shell=True)
print(out1b)
out2a = subprocess.check_output(['ls', '/usr'])
print(out2a)
out2b = subprocess.check_output('ls /usr', shell=True)
print(out2b)
# Cannot capture output this way, but it will be visible to user
subprocess.call('ls /usr', shell=True)
Note that in the case of the java -version command, the version info gets printed to the command's standard error output so you must redirect that in order to capture it as the returned value of check_output (hence the stderr=subprocess.STDOUT).
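Since check_output raises CalledProcessError when the command exits with a non-zero status, it can help to wrap the call so the script always gets both the exit code and the captured text. This is a sketch; the helper name run_and_capture is hypothetical. An OSError (for example, when the executable cannot be found) is deliberately left to propagate.

```python
import subprocess


def run_and_capture(argv):
    """Run argv (a list) and return (exit_code, combined stdout and stderr)."""
    try:
        # stderr=subprocess.STDOUT merges error output into the result.
        output = subprocess.check_output(argv, stderr=subprocess.STDOUT)
        return 0, output
    except subprocess.CalledProcessError as exc:
        # The command ran but reported failure; its output is still available.
        return exc.returncode, exc.output
```

For example, code, output = run_and_capture(['java', '-version']) captures the version text that java writes to standard error.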

Run LIWC as external program to python - subprocess

I would like to run LIWC (installed on my Mac) from within a Python 2.7 script.
I have been reading about subprocess (Popen and check_output seem the way to go), but I do not get the syntax for:
opening the program;
getting a text file to be analysed;
running the program;
getting the output (analysis) and storing it in a text file.
This is my first approach to subprocess, is this possible?
I appreciate the suggestions.
EDIT
This is the closest to implementing a solution (still does not work):
I can open the application.
subprocess.call(['open', '/file.app'])
But cannot make it process the input file and get an output one.
subprocess.Popen(['/file.app', '-input', 'input.txt', '-output', 'output.txt'])
Nothing comes out of this code.
EDIT 2
After reading dozens of posts, I am still very confused about the syntax for the solution.
Following How do I pipe a subprocess call to a text file?
I came up with this code:
g = open('in_file.txt', 'rb', 0)
f = open('out_file.txt', 'wb')
subprocess.call(['open', "file.app"], stdin=g, stdout=f)
The output file comes out empty.
EDIT 3
Following http://www.cplusplus.com/forum/unices/40680/
When I run the following shell script on the Terminal:
cat input.txt | /Path/LIWC > output.txt
The output txt file is empty.
EDIT 4
When I run:
subprocess.check_call(['/PATH/LIWC', 'PATH/input.txt', 'PATH/output.txt'])
It opens LIWC, does not create an output file and freezes.
EDIT 5
When I run:
subprocess.call(['/PATH/LIWC', 'PATH/input.txt', 'PATH/output.txt'])
It runs LIWC, creates an empty output.txt file and freezes (the process does not end).

The problem with using 'open' in subprocess.call(['open', "file.app"], stdin=g, stdout=f) is that it requests that the application be opened through a system service, and doesn't attach it directly to your Python process. Instead, you'll need to use the path to the LIWC executable itself. I'm not sure that it supports reading from stdin, though, so you might even need to pass in the path to the file you'd like it to open.
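For a program that does read standard input and write standard output, the redirection itself can be sketched as follows. Whether LIWC behaves this way is not known from this thread, so treat it as an assumption; the helper name run_with_files is hypothetical.

```python
import subprocess


def run_with_files(argv, in_path, out_path):
    """Run argv with stdin read from in_path and stdout written to out_path."""
    with open(in_path, 'rb') as src, open(out_path, 'wb') as dst:
        # The child process is started directly (no 'open'), so it
        # inherits these two file handles as its stdin and stdout.
        return subprocess.call(argv, stdin=src, stdout=dst)
```

It would be called as run_with_files(['/PATH/LIWC'], 'input.txt', 'output.txt'), with /PATH/LIWC standing in for the real executable inside the application bundle.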

How to change permission of $Extend directory? Python? C++?

I want to hide a text file by moving it to the $Extend directory (What is this directory?). So I ran cmd as Administrator and ran the command below:
C:\Windows\system32>copy I:\ToHide.txt I:\$Extend
Access is denied.
0 file(s) copied.
C:\Windows\system32>
As you can see, the copy failed with an Access Denied error. So I tried to takeown the destination directory ($Extend) and change its ACLs as below:
C:\Windows\system32>takeown /f I:\$Extend
SUCCESS: The file (or folder): "I:\$Extend" now owned by user "Abraham-VAIO\Abraham".
C:\Windows\system32>cacls I:\$Extend /G Abraham:F
Are you sure (Y/N)?Y
The system cannot find the file specified.
C:\Windows\system32>
Q1: Why couldn't cacls see this directory, while takeown could?
After that, I use the below python code :
import win32api
import win32con
import win32security
FILENAME = "I:\\$Extend"
open (FILENAME, "w").close ()
print "I am", win32api.GetUserNameEx (win32con.NameSamCompatible)
sd = win32security.GetFileSecurity (FILENAME, win32security.OWNER_SECURITY_INFORMATION)
owner_sid = sd.GetSecurityDescriptorOwner ()
name, domain, type = win32security.LookupAccountSid (None, owner_sid)
print "File owned by %s\\%s" % (domain, name)
And I receive Access Denied again :
>>> ================================ RESTART ================================
>>>
Traceback (most recent call last):
File "C:\Users\Abraham\Desktop\teste.py", line 6, in <module>
open (FILENAME, "w").close ()
IOError: [Errno 13] Permission denied: 'I:\\$Extend'
>>>
Q2: Is this Python code equivalent to takeown, or is it an alternative to cacls?
Q3: Why do I receive access denied even though I ran IDLE (and after that, Python on the command line) as Administrator?
Last questions:
Q4: Why can't I open this directory using Windows Explorer, while I can open it using WinRAR? Does Windows restrict some APIs for Explorer that are available to other software?
By the way, is there any way to achieve my goal (hiding something in the $Extend directory) using Python, C++, or something else?

In general, you can access the MFT directly by opening \\.\PhysicalDriveX, which is the underlying physical disk (X is the number of the disk you want to open), and then parsing the disk directly: find the partition offset from the Master Boot Record, then parse the first NTFS sector, and from there find the location of the MFT.
There is a great open source sample of how to parse the MFT in the ntfsfastfind project, see here:
http://home.comcast.net/~lang.dennis/console/ntfsfastfind/ntfsfastfind.html
I also recommend that you read about NTFS internals here:
http://technet.microsoft.com/en-us/library/cc781134(v=ws.10).aspx
http://ntfs.com/ntfs-mft.htm
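The first step, finding the partition offset from the Master Boot Record, can be sketched as follows. The sketch assumes a classic MBR-partitioned disk (not GPT) with 512-byte sectors, and the function name parse_mbr_partitions is hypothetical. On Windows the sector would come from something like open(r'\\.\PhysicalDrive0', 'rb').read(512), which requires Administrator rights.

```python
import struct

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def parse_mbr_partitions(sector):
    """Return the non-empty primary partition entries of an MBR sector."""
    if len(sector) < SECTOR_SIZE or sector[510:512] != b'\x55\xaa':
        raise ValueError('not a valid MBR sector')
    partitions = []
    for index in range(4):
        # The table starts at byte 446 and holds four 16-byte entries.
        start = 446 + 16 * index
        entry = sector[start:start + 16]
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped),
        # first sector (LBA) and sector count, both little-endian.
        status, ptype, first_lba, count = struct.unpack('<B3xB3xII', entry)
        if ptype != 0:  # type 0 marks an unused slot
            partitions.append({
                'type': ptype,  # 0x07 is used for NTFS
                'bootable': status == 0x80,
                'first_lba': first_lba,
                'sectors': count,
                'byte_offset': first_lba * SECTOR_SIZE,
            })
    return partitions
```

The byte_offset of the NTFS partition is where the first NTFS sector mentioned above begins.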
