How to prevent IPTCInfo3 from creating a copy of the original after saving keywords?

I use IPTCInfo3 in my Python app to write keywords to the IPTC metadata of an image.
For some reason if I use info.save() it creates a copy of the original, e.g. it will write the keyword to Clean.JPG but will also create Clean.JPG~ without the keywords.
If I use info.save_as('Clean.jpg') instead (try to force it to overwrite the original), it doesn't write the keywords to the file.
Is there a solution to this?
import iptcinfo3

new_keyword = ["cool", "sad", "blah"]
info = iptcinfo3.IPTCInfo('C:/Tmp/IPTCINFO/Clean.JPG')
for keyword in new_keyword:
    if keyword.encode('UTF-8') not in info['keywords']:
        info['keywords'].append(keyword)
info.save()

This solution worked for me.
You will have to comment out 2 lines in the iptcinfo3.py file, inside the "save_as" method, around line 695 of the file:
else:
    tmpfh.close()
    #if os.path.exists(newfile):
    #    shutil.move(newfile, newfile + '~')
    shutil.move(tmpfn, newfile)
This writes the keywords into the original file and no longer leaves a ".jpg~" backup copy behind.
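If you would rather not edit the installed library, a possible alternative is to let info.save() create the backup and delete it afterwards. This is a minimal sketch, assuming the backup is always named by appending "~" to the original path, as described in the question. remove_backup is a hypothetical helper name, not part of IPTCInfo3:

```python
import os


def remove_backup(image_path):
    """Delete the 'image_path~' backup if it exists.

    Returns True if a backup was removed, False if there was none.
    """
    backup_path = image_path + '~'
    if os.path.exists(backup_path):
        os.remove(backup_path)
        return True
    return False


# Possible usage after saving (requires iptcinfo3, so shown as a comment):
# info.save()
# remove_backup('C:/Tmp/IPTCINFO/Clean.JPG')
```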

Related

How to get the number of lines of code of a file in a remote repo using PyGithub / GitHub search API?

commit = repo.get_commit(sha="0adf369fda5c2d4231881d66e3bc0bd12fb86c9a")
print(commit.stats.total)
i = commit.files[0].filename
I can get the filename, and even the file SHA, but I can't seem to get the LOC (lines of code) of the file. Any pointers?
So let's see this line
commit = repo.get_commit(sha="0adf369fda5c2d4231881d66e3bc0bd12fb86c9a")
Here the commit is of type github.Commit.Commit
Now when you pick a file, it's of the type github.File.File
If you check that type, you'll see that there is no direct way of getting the lines of code. But there is one important field: raw_url.
This gives you the raw URL of the file, which you can then fetch, perhaps like this:
import requests

url = commit.files[0].raw_url
r = requests.get(url)
r.text
This will give you the raw data of the file and you can use it to get the number of lines of code.
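Counting the lines from that raw text could look like the sketch below. count_lines and its skip_blank flag are hypothetical names introduced here, and the sample string stands in for r.text:

```python
def count_lines(text, skip_blank=False):
    """Return the number of lines in text, optionally ignoring blank ones."""
    lines = text.splitlines()
    if skip_blank:
        lines = [line for line in lines if line.strip()]
    return len(lines)


sample = "import os\n\nprint(os.getcwd())\n"  # stand-in for r.text
print(count_lines(sample))                   # prints 3
print(count_lines(sample, skip_blank=True))  # prints 2
```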

Read text file content with Python in Zapier

I have problems getting the content of a txt-file into a Zapier object using https://zapier.com/help/code-python/. Here is the code I am using:
with open('file', 'r') as content_file:
    content = content_file.read()
I'd be glad if you could help me with this. Thanks for that!
David here, from the Zapier Platform team.
Your code as written doesn't work because the first argument for the open function is the filepath. There's no file at the path 'file', so you'll get an error. You access the input via the input_data dictionary.
That being said, the input is a url, not a file. You need to use urllib to read that url. I found the answer here.
I've got a working copy of the code like so:
import urllib2  # the lib that handles the url stuff

result = []
data = urllib2.urlopen(input_data['file'])
for line in data:  # file lines are iterable
    result.append(line)  # keep each line, or parse, etc.
return {'lines': result}
The key takeaway is that you need to return a dictionary from the function, so make sure you somehow squish your file into one.
Let me know if you've got any other questions!
@xavid, did you test this in Zapier?
It fails miserably because urllib2 doesn't exist in the Zapier Python environment.
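Since urllib2 only exists in Python 2, a Python 3 environment would need urllib.request from the standard library instead. This is a minimal sketch, assuming the Zapier code step runs Python 3 and exposes the URL as input_data['file'] as in the answer above. read_lines is a hypothetical helper name:

```python
from urllib.request import urlopen


def read_lines(url):
    """Fetch url and return its content as a list of decoded text lines."""
    with urlopen(url) as response:
        # Assumes the file is UTF-8 encoded text.
        return response.read().decode('utf-8').splitlines()


# In the Zapier code step this might be used as:
# return {'lines': read_lines(input_data['file'])}
```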

Finding out the file name in a FileUploadHandler

I am rolling my own fileupload handler in django and would like to know the file name. I am supporting more than one file format and want to do different processing in the receive_data_chunk method depending on which file format the uploaded file has. I thought I would be pragmatic and just judge file format based on file ending but I can't figure out how to get hold of the file name. If I try to extract the file name with something like the following code (before that method is called):
if request.method == 'POST':
    p = re.compile(r'^.*\.sdf$', re.IGNORECASE)
    if p.search(request.FILES['filecontent'].name):
        self.sdf = True
    else:
        self.sdf = False
It seems I never reach the receive_data_chunk method. I presume the call to request.FILES triggers the loading somehow, and then it's already done? How can I do different processing based on file ending in my receive_data_chunk method?
Have you tried using
data = request.POST.copy()
and then working on the copy? I have used this for other things, but it may work in this case as well.
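Another possibility is to read the name inside the handler itself: Django calls a handler's new_file() method with the uploaded file's name before it starts passing chunks to receive_data_chunk(). The sketch below records the extension there. It is written as a mixin with no imports of its own. SdfDetectionMixin and the sdf attribute are illustrative names, and the class is meant to be combined with django.core.files.uploadhandler.FileUploadHandler:

```python
class SdfDetectionMixin:
    """Remember whether the file being uploaded has an .sdf extension."""

    sdf = False

    def new_file(self, field_name, file_name, *args, **kwargs):
        # Called once per uploaded file, before any receive_data_chunk() call.
        self.sdf = file_name.lower().endswith('.sdf')
        return super().new_file(field_name, file_name, *args, **kwargs)


# Intended use (requires Django, so shown as a comment):
#
# from django.core.files.uploadhandler import FileUploadHandler
#
# class SdfUploadHandler(SdfDetectionMixin, FileUploadHandler):
#     def receive_data_chunk(self, raw_data, start):
#         if self.sdf:
#             pass  # SDF-specific processing goes here
#         return raw_data
#
#     def file_complete(self, file_size):
#         return None
```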

How to access a tempfile object across 2 separate requests (was:views) in Django

Can't find a direct, head on answer to this. Is there a way to access a tempfile in Django across 2 distinct views? Say I have the following code:
# view#1
def view1(request):
    temp_file = tempfile.NamedTemporaryFile()
    write_book.save(temp_file)
    temp_file_name = temp_file.name
    print(temp_file_name)
    request.session['output_file_name'] = temp_file_name
    request.session.modified = True
    return  # something or other

# view#2
def view2(request):
    temp_file_name = request.session['output_file_name']
    temp_file = open(str(temp_file_name))
    # do something with 'temp_file' here
My problem comes in specifically on view#2, the 2nd line "open(temp_file_name)". Django complains this file/pathway doesn't exist, which is consistent with my understanding of the tempfile module (that the file is 'hidden' and only available to Django).
Is there a way for me to access this file? In case it matters, I ONLY need to read from it (technically serve it for download).
I'd think of this as how to access a NamedTemporaryFile across different requests, rather than different views. Looking at this documentation on NamedTemporaryFile, it says that the file can be opened across the same process, but not necessarily across multiple processes. Perhaps your other view is being called in a different Django process.
My suggestion would be to abandon the use of NamedTemporaryFile and instead just write it as a permanent file, then delete the file in the other view.
Thanks seddonym for attempting to answer. My partner clarified this for me...seddonym is correct for the Django version of NamedTemporaryFile. By calling the python version (sorry, don't have enough cred to post hyperlinks. Stupid rule) you CAN access across requests.
The trick is setting the delete=False parameter, and closing the file before 'returning' at the end of the request. Then, in the subsequent request, just open(file_name). Pseudocode below:
>>> import tempfile
>>> file = tempfile.NamedTemporaryFile(delete=False)
>>> file.name
'c:\\users\\(blah)\(blah)\(blah)\\temp\\tmp9drcz9'
>>> file.close()
>>> file
<closed file '<fdopen>', mode 'w+b' at 0x00EF5390>
>>> f = open(file.name)
>>> f
<open file 'c:\users\ymalik\appdata\local\temp\tmp9drcz9', mode 'r' at 0x0278C128>
This is, of course, done in the console, but it works in django as well.
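The same idea as two small functions, one per request, might look like the sketch below. write_temp and read_and_remove are hypothetical names. Because delete=False leaves the file on disk, the second step removes it explicitly once it has been read:

```python
import os
import tempfile


def write_temp(data):
    """Write bytes to a temp file that survives close(); return its path."""
    with tempfile.NamedTemporaryFile(delete=False) as temp_file:
        temp_file.write(data)
    # The file is closed here but still exists because delete=False.
    return temp_file.name


def read_and_remove(path):
    """Read the file written earlier, then delete it."""
    with open(path, 'rb') as f:
        data = f.read()
    os.remove(path)
    return data


# First request: store the path, e.g. in request.session['output_file_name'].
path = write_temp(b'report contents')
# Second request: look the path up again and serve the data.
contents = read_and_remove(path)
```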

Django upload file into specific directory that depends on the POST URI

I'd like to store uploaded files into a specific directory that depends on the URI of the POST request. Perhaps, I'd also like to rename the file to something fixed (the name of the file input for example) so I have an easy way to grep the file system, etc. and also to avoid possible security problems.
What's the preferred way to do this in Django?
Edit: I should clarify that I'd be interested in possibly doing this as a file upload handler to avoid writing a large file twice to the file system.
Edit2: I suppose one can just 'mv' the tmp file to a new location. That's a cheap operation if on the same file system.
Fixed olooney's example. It is working now:
import os

from django.core.files.base import ContentFile
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def upload_video_file(request):
    folder = 'tmp_dir2/'  # request.path.replace("/", "_")
    uploaded_filename = request.FILES['file'].name
    BASE_PATH = '/home/'
    # create the folder if it doesn't exist.
    try:
        os.mkdir(os.path.join(BASE_PATH, folder))
    except OSError:
        pass
    # save the uploaded file inside that folder.
    full_filename = os.path.join(BASE_PATH, folder, uploaded_filename)
    fout = open(full_filename, 'wb+')
    file_content = ContentFile(request.FILES['file'].read())
    try:
        # Iterate through the chunks.
        for chunk in file_content.chunks():
            fout.write(chunk)
        fout.close()
        html = "<html><body>SAVED</body></html>"
        return HttpResponse(html)
    except Exception:
        html = "<html><body>NOT SAVED</body></html>"
        return HttpResponse(html)
Django gives you total control over where (and if) you save files. See: http://docs.djangoproject.com/en/dev/topics/http/file-uploads/
The below example shows how to combine the URL and the name of the uploaded file and write the file out to disk:
import os

def upload(request):
    folder = request.path.replace("/", "_")
    uploaded_filename = request.FILES['file'].name
    # create the folder if it doesn't exist (BASE_PATH is defined elsewhere).
    try:
        os.mkdir(os.path.join(BASE_PATH, folder))
    except OSError:
        pass
    # save the uploaded file inside that folder.
    full_filename = os.path.join(BASE_PATH, folder, uploaded_filename)
    fout = open(full_filename, 'wb+')
    # Iterate through the chunks of the uploaded file.
    for chunk in request.FILES['file'].chunks():
        fout.write(chunk)
    fout.close()
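A variation on the example above, as a sketch only: os.makedirs(..., exist_ok=True) can replace the try/mkdir/except block, and the file can be stored under a fixed name instead of the client-supplied one, as the question suggests. destination_path and its parameters are hypothetical names, and base_path stands in for BASE_PATH:

```python
import os


def destination_path(base_path, request_path, fixed_name):
    """Create a folder derived from the request path; return the target file path."""
    # Unlike the example above, leading and trailing slashes are dropped first,
    # so '/videos/upload/' becomes 'videos_upload'.
    folder = request_path.strip('/').replace('/', '_')
    target_dir = os.path.join(base_path, folder)
    os.makedirs(target_dir, exist_ok=True)  # no error if it already exists
    return os.path.join(target_dir, fixed_name)


# Possible use inside the view (shown as a comment because it needs a request):
# full_filename = destination_path(BASE_PATH, request.path, 'file')
```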
Edit: How to do this with a FileUploadHandler? I traced down through the code, and it seems like you need to do four things to repurpose the TemporaryFileUploadHandler to save outside of FILE_UPLOAD_TEMP_DIR:
1. Extend TemporaryUploadedFile and override __init__() to pass through a different directory to NamedTemporaryFile. It can use the try/mkdir/except/pass pattern I showed above.
2. Extend TemporaryFileUploadHandler and override new_file() to use the above class.
3. Also extend the handler's __init__() to accept the directory where you want the folder to go.
4. Dynamically add the request handler, passing through a directory determined from the URL:
request.upload_handlers = [ProgressBarUploadHandler(request.path.replace('/', '_'))]
While non-trivial, it's still easier than writing a handler from scratch: In particular, you won't have to write a single line of error-prone buffered reading. Steps 3 and 4 are necessary because FileUploadHandlers are not passed request information by default, I believe, so you'll have to tell it separately if you want to use the URL somehow.
I can't really recommend writing a custom FileUploadHandler for this. It's really mixing layers of responsibility. Relative to the speed of uploading a file over the internet, doing a local file copy is insignificant. And if the file's small, Django will just keep it in memory without writing it out to a temp file. I have a bad feeling that you'll get all this working and find you can't even measure the performance difference.