gdal.driver.create is removing another file in the directory - python-2.7

I've been working with GDAL in Python for a few years, and in the past few days I've found what I suspect may be a bug in the GDAL driver's Create command. I'm working with Landsat imagery, and have tried the code below on a few scenes with the same result each time. In certain situations, when I call Create, it deletes another file in the directory (always the MTL file).
import gdal
path = '.../LC80110112013243LGN00/' #path to where ever your landsat scene is
outfile = path+path[-22:-1]+'_B5_test.tif'
#outfile = path + 'TestB5.tif'
infile = path+path[-22:-1]+'_B5.tif'
infile_open = gdal.Open(infile)
infile_array = infile_open.GetRasterBand(1).ReadAsArray()
dtype=gdal.GDT_Float32
outfile = gdal.GetDriverByName('GTiff').Create(outfile, infile_array.shape[1], infile_array.shape[0], 1, dtype)
infile_open = None
outfile = None
infile_array = None
If I use the first outfile name, which matches the naming of the other Landsat band files, and the file "outfile" already exists, it is replaced (expected behavior) and the MTL file is deleted (unexpected behavior). If I use the second outfile name, which does not match the Landsat band filename format, and "outfile" already exists, running the code simply replaces the old file (expected behavior). I have not been able to find any other reference to this happening. Any ideas what's going on?

I see the same thing.
GDAL Version: GDAL 2.1.3, released 2017/20/01
Platform: Ubuntu 16.04 LTS
I worked around it like this:
. . .
if os.path.exists(outputFileName):
    os.remove(outputFileName)
dst_ds = driver.Create(outputFileName, width, height, bands_value, gdal.GDT_Float32)
. . .
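To confirm exactly which files disappear during a call, one option is to compare the directory listing before and after it. The following is a minimal sketch using only the standard library; the helper name removed_files is hypothetical and not part of GDAL.

```python
import os


def removed_files(directory, action):
    """Run action() and return the names that vanished from directory.

    Hypothetical diagnostic helper: it only compares two directory
    listings and knows nothing about GDAL itself.
    """
    before = set(os.listdir(directory))
    action()
    after = set(os.listdir(directory))
    return sorted(before - after)
```

Assuming the variables from the question, the Create call could be wrapped in a lambda and passed as action; the returned list would then show whether the MTL file was among the names removed.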

Related

GitPython: How can I access the contents of a file in a commit in GitPython

I am new to GitPython and I am trying to get the content of a file within a commit. I am able to list each file from a specific commit, but I get an error each time I run the command on it. I know that the file exists in the repository, but each time I run my program, I get the following error:
returned non-zero exit status 1
I am using Python 2.7.6 and Ubuntu Linux 14.04.
I know that the file exist, since I also go directly into Git from the command line, check out the respective commit, search for the file, and find it. I also run the cat command on it, and the file contents are displayed. Many times when the error shows up, it says that the file in question does not exist. I am trying to go through each commit with GitPython, get every blob or file from each individual commit, and run an external Java program on the content of that file. The Java program is designed to return a string to Python. To capture the string returned from my Java code, I am also using subprocess.check_output. Any help will be greatly appreciated.
I tried passing in the command as a list:
cmd = ['java', '-classpath', '/home/rahkeemg/workspace/CSCI499_Java/bin/:/usr/local/lib/*:', 'java_gram.mainJava','absolute/path/to/file']
subprocess.check_output(cmd, stderr=subprocess.STDOUT, shell=False)
And I have also tried passing the command as a string:
subprocess.check_output('java -classpath /home/rahkeemg/workspace/CSCI499_Java/bin/:/usr/local/lib/*: java_gram.mainJava {file}'.format(file=entry.abspath.strip()), shell=True)
Is it possible to access the contents of a file from GitPython?
For example, say there is a commit and it has one file foo.java
In that file is the following lines of code:
foo.java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
public class foo{
public static void main(String[] args) throws Exception{}
}
I want to access everything in the file and run an external program on it.
Any help would be greatly appreciated. Below is a piece of the code I am using to do so
#!/usr/bin/env python
__author__ = 'rahkeemg'
from git import *
import git, json, subprocess, re
git_dir = '/home/rahkeemg/Documents/GitRepositories/WhereHows'
# make an instance of the repository from specified path
repo = Repo(path=git_dir)
heads = repo.heads # obtain the branch heads of the repository
master = heads.master # get the master branch
print master
# get all of the commits on the master branch
commits = list(repo.iter_commits(master))
cmd = ['java', '-classpath', '/home/rahkeemg/workspace/CSCI499_Java/bin/:/usr/local/lib/*:', 'java_gram.mainJava']
# start at the very 1st commit, or start at commit 0
for i in range(len(commits) - 1, 0, -1):
    commit = commits[i]
    commit_num = len(commits) - 1 - i
    print commit_num, ": ", commit.hexsha, '\n', commit.message, '\n'
    for entry in commit.tree.traverse():
        if re.search(r'\.java', entry.path):
            current_file = str(entry.abspath.strip())
            # add the current file or blob to the list for the command to run
            cmd.append(current_file)
            print entry.abspath
            try:
                # This is the scenario where I pass arguments into command as a string
                print subprocess.check_output('java -classpath /home/rahkeemg/workspace/CSCI499_Java/bin/:/usr/local/lib/*: java_gram.mainJava {file}'.format(file=entry.abspath.strip()), shell=True)
                # scenario where I pass arguments into command as a list
                j_response = subprocess.check_output(cmd, stderr=subprocess.STDOUT, shell=False)
            except subprocess.CalledProcessError as e:
                print "Error on file: ", current_file
            # Use pop on list to remove the last string, which is the selected file at the moment, to make place for the next file.
            cmd.pop()
First of all, when you traverse the commit history like this, the files are not checked out. All you get is a filename, which may or may not point to an existing file on disk, and it will certainly not point to the file as it was in any revision other than the one currently checked out.
However, there is a solution to this. Remember that in principle, anything you could do with some git command, you can do with GitPython.
To get file contents from specific revision, you can do the following, which I've taken from that page:
git show <treeish>:<file>
therefore, in GitPython:
file_contents = repo.git.show('{}:{}'.format(commit.hexsha, entry.path))
However, that still wouldn't make the file appear on disk. If you need some real path for the file, you can use tempfile:
import os
import tempfile
f = tempfile.NamedTemporaryFile(delete=False)
f.write(file_contents)
f.close()
# at this point file with name f.name contains contents of
# the file from path entry.path at revision commit.hexsha
# your program launch goes here, use f.name as filename to be read
os.unlink(f.name) # delete the temp file
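The temp-file steps above can be wrapped in a small helper so that the file is removed even if the external program fails. This is only a sketch under stated assumptions: run_on_temp_copy and its parameters are hypothetical names, and contents stands for whatever repo.git.show(...) returned.

```python
import os
import tempfile


def run_on_temp_copy(contents, func, suffix=''):
    """Write contents to a temp file, call func(path), then delete the file.

    Hypothetical helper: func is any callable that takes a filename,
    for example one that launches the external Java program on it.
    """
    if not isinstance(contents, bytes):
        # text is encoded before writing, since the temp file is binary
        contents = contents.encode('utf-8')
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        tmp.write(contents)
        tmp.close()  # close first so another process can open the file
        return func(tmp.name)
    finally:
        tmp.close()
        os.unlink(tmp.name)  # always remove the temp file
```

With the loop from the question, func could be a function that appends the given path to the java command list and calls subprocess.check_output on it.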

Os.walk - WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect:

I'm new to Python and looking for some help with a problem I am having with os.walk. I have had a solid look around and cannot find the right solution to my problem.
What the code does:
Scans a user-selected hard drive or folder and returns all the filenames, subdirectories and sizes. This is then manipulated in pandas (not in the code below) and exported to an Excel spreadsheet in the formatting I want.
However, in the first part of the code, in Python 2.7, I am currently experiencing the below error:
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: 'E:\03. Work\Bre\Files\folder2\icons greyscale flatten\._Icon_18?10 Stainless Steel.psd'
I have explored using a raw string (r'...') but to no avail. Perhaps I am writing it wrong.
I will note that I never get this in 3.5, or in 2.7 on selected folders whose names are clean. Due to pandas and PyInstaller problems with 3.5, I am hoping to stick with 2.7 until those are resolved.
import pandas as pd
import xlsxwriter
import os
from io import StringIO
#Lists for Pandas Dataframes
fpath = []
fname = []
fext = []
sizec = []
# START #Select file directory to scan
filed = raw_input("\nSelect a directory to scan: ")
#Scan the Hard-Drive and add to lists for Pandas DataFrames
print "\nGetting details..."
for root, dirs, files in os.walk(filed):
    for filename in files:
        f = os.path.abspath(root) #File path
        fpath.append(f)
        fname.append(filename) #File name
        s = os.path.splitext(filename)[1] #File extension
        s = str(s)
        fext.append(s)
        p = os.path.join(root, filename) #File size
        si = os.stat(p).st_size
        sizec.append(si)
print "\nDone!"
Any help would be greatly appreciated :)
In order to traverse filenames with unicode characters, you need to give os.walk a unicode path name.
Your path contains a unicode character, which is being displayed as ? in the exception.
If you pass in a unicode path, like this: os.walk(unicode(filed)), you should not get that exception.
As noted in "Convert python filenames to unicode", you will sometimes still get a bytestring if the path is "undecodable" by Python 2.
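Putting the advice above together, here is a minimal sketch of the scan loop that decodes a byte-string path before walking and skips entries that cannot be stat'ed. The function name scan_tree and the tuple layout are assumptions for illustration; on Python 3 the decode step only applies if a bytes path is passed in.

```python
import os
import sys


def scan_tree(top):
    """Return (folder, name, extension, size) for every file below top."""
    if isinstance(top, bytes):
        # On Python 2 a plain str is bytes; decode so os.walk yields unicode
        top = top.decode(sys.getfilesystemencoding() or 'utf-8')
    rows = []
    for root, dirs, files in os.walk(top):
        for name in files:
            full = os.path.join(root, name)
            try:
                size = os.stat(full).st_size
            except OSError:
                # skip names the OS refuses (e.g. undecodable entries)
                continue
            rows.append((os.path.abspath(root), name,
                         os.path.splitext(name)[1], size))
    return rows
```

The returned rows can be split into the four lists used in the question or passed directly to a pandas DataFrame constructor.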

Django: No such file or directory

I have a process that scans a tape library and looks for media that has expired, so the tapes can be removed and reused before being sent to an offsite vault. (We have some 7 day policies that never make it offsite.) This process takes around 20 minutes to run, so I didn't want it to run on-demand when loading/refreshing the page. Instead, I set up a django-cron job (I know I could have done this in Linux cron, but wanted the project to be as self-contained as possible) that runs the scan and creates a file in /tmp. I've verified that this works -- the file exists in /tmp from this morning's execution. The problem I'm having is that now I want to display a list of those expired (scratch) media on my web page, but the script is saying that it can't find the file. When the file was created, I used the absolute filename "/tmp/scratch.2015-11-13.out" (for example), but here's the error I get in the browser:
IOError at /
[Errno 2] No such file or directory: '/tmp/corpscratch.2015-11-13.out'
My assumption is that this is a "web root" issue, but I just can't figure it out. I tried copying the file to the /static/ and /media/ directories configured in Django, and even to the Django root directory and the project root directory, but nothing seems to work. When it says it can't find /tmp/file, where is it really looking?
def sample():
    """ Just testing """
    today = datetime.date.today() #format 2015-11-31
    inputfile = "/tmp/corpscratch.%s.out" % str(today)
    with open(inputfile) as fh: # This is the line reporting the error
        lines = [line.strip('\n') for line in fh]
    print(lines)
The print statement was used for testing in the shell (which works, I might add), but the browser gives an error.
And the file does exist:
$ ls /tmp/corpscratch.2015-11-13.out
/tmp/corpscratch.2015-11-13.out
Thanks.
Edit: was mistaken, doesn't work in python shell either. Was thinking of a previous issue.
Use this instead:
today = datetime.datetime.today().date()
inputfile = "/tmp/corpscratch.%s.out" % str(today)
Or:
today = datetime.datetime.today().strftime('%Y-%m-%d')
inputfile = "/tmp/corpscratch.%s.out" % today # No need to use str()
See the difference:
>>> str(datetime.datetime.today().date())
'2015-11-13'
>>> str(datetime.datetime.today())
'2015-11-13 15:56:19.578569'
I ended up finding this elsewhere:
today = datetime.date.today() #format 2015-11-31
inputfilename = "tmp/corpscratch.%s.out" % str(today)
inputfile = os.path.join(settings.PROJECT_ROOT, inputfilename)
With settings.py containing the following:
PROJECT_ROOT = os.path.abspath(os.path.dirname(__file__))
Completely resolved my issues.
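For comparison, here is a minimal sketch that builds the dated filename in one place and checks that the file exists before opening it, so a missing report does not surface as an IOError in the view. The function names and the base_dir parameter are hypothetical; base_dir could be settings.PROJECT_ROOT or "/tmp", depending on where the cron job actually writes.

```python
import datetime
import os


def scratch_report_path(base_dir, day=None):
    """Return the path of the scratch report for the given day (default: today)."""
    day = day or datetime.date.today()
    # strftime makes the date format explicit instead of relying on str()
    return os.path.join(base_dir, "corpscratch.%s.out" % day.strftime("%Y-%m-%d"))


def read_report(path):
    """Return the report lines, or None if the file is not there."""
    if not os.path.isfile(path):
        return None
    with open(path) as fh:
        return [line.rstrip("\n") for line in fh]
```

Printing the value returned by scratch_report_path from inside the view is also a quick way to see which absolute path the web process is really trying to open.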

Boost-Python: Load python module with unicode chars in path

I'm working on a game project and use Python 2.7.2 for scripting. My application works fine when the path to the .exe contains no Unicode characters, but it can't load scripts from a path that contains Unicode characters using
boost::python::import (import_path.c_str());
I tried this example
5.3. Pure Embedding http://docs.python.org/extending/embedding.html#embedding-python-in-c
It can't handle a Unicode path either. I linked Python as a DLL.
Please explain how to handle such a path.
boost::python::import needs a std::string, so chances are that import_path misses some characters.
Do you have to work on multiple platforms? On Windows, you could call GetShortPathName to retrieve the 8.3 filename and use that to load your DLL.
You can make a quick test:
Rename your extension to "JaiDéjàTestéÇaEtJaiDétestéÇa.pyd".
At the command line, type dir /x *.pyd to get the short file name (JAIDJT~1.PYD on my computer)
Use the short name to load your extension.
The file name above is French for "I already tested this and I didn't like it". It is a rhyme that takes the edge off working with Unicode ;)
This isn't really an answer that will suit your needs, but maybe it will give you something to go on.
I ran into a very similar problem with Python; in my case the application is pure Python. I noticed that if my application was installed to a directory whose path string could not be encoded in MBCS (what Python converts to internally for imports, at least prior to Python 3.2 as far as I understand), the Python interpreter would fail, claiming no module of that name existed.
What I had to do was write an Import Hook to trick it into loading those files anyway.
Here's what I came up with:
import imp, os, sys
class UnicodeImporter(object):
    def find_module(self,fullname,path=None):
        if isinstance(fullname,unicode):
            fullname = fullname.replace(u'.',u'\\')
            exts = (u'.pyc',u'.pyo',u'.py')
        else:
            fullname = fullname.replace('.','\\')
            exts = ('.pyc','.pyo','.py')
        if os.path.exists(fullname) and os.path.isdir(fullname):
            return self
        for ext in exts:
            if os.path.exists(fullname+ext):
                return self
    def load_module(self,fullname):
        if fullname in sys.modules:
            return sys.modules[fullname]
        else:
            sys.modules[fullname] = imp.new_module(fullname)
        if isinstance(fullname,unicode):
            filename = fullname.replace(u'.',u'\\')
            ext = u'.py'
            initfile = u'__init__'
        else:
            filename = fullname.replace('.','\\')
            ext = '.py'
            initfile = '__init__'
        if os.path.exists(filename+ext):
            try:
                with open(filename+ext,'U') as fp:
                    mod = imp.load_source(fullname,filename+ext,fp)
                    sys.modules[fullname] = mod
                    mod.__loader__ = self
                    return mod
            except:
                print 'fail', filename+ext
                raise
        mod = sys.modules[fullname]
        mod.__loader__ = self
        mod.__file__ = os.path.join(os.getcwd(),filename)
        mod.__path__ = [filename]
        #init file
        initfile = os.path.join(filename,initfile+ext)
        if os.path.exists(initfile):
            with open(initfile,'U') as fp:
                code = fp.read()
            exec code in mod.__dict__
        return mod
sys.meta_path = [UnicodeImporter()]
I still run into two issues when using this:
Double clicking on the launcher file (a .pyw file) in windows explorer does not work when the application is installed in a trouble directory. I believe this has to do with how Windows file associations passes the arguments to pythonw.exe (my guess is Windows passes the full path string, which includes the non-encodeable character, as the argument to the exe). If I create a batch file and have the batch file call the Python executable with just the file name of my launcher, and ensure it's launched from the same directory, it launches fine. Again, I'm betting this is because now I can use a relative path as the argument for python.exe, and avoid those trouble characters in the path.
Packaging my application using py2exe, the resulting exe will not run if placed in one of these trouble paths. I think this has to do with the zipimporter module, which unfortunately is a compiled Python module so I cannot easily modify it (I would have to recompile, etc etc).
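For readers on Python 3, where the imp module and the find_module/load_module protocol have been replaced, the same idea of a meta path hook can be sketched with importlib. This is not the code from the answer above: the class name DirectoryFinder and its base_dir argument are assumptions, and Python 3 usually copes with such paths without any hook, so the sketch only illustrates the mechanism.

```python
import importlib.abc
import importlib.util
import os
import sys


class DirectoryFinder(importlib.abc.MetaPathFinder):
    """Locate modules and packages below one base directory."""

    def __init__(self, base_dir):
        self.base_dir = base_dir

    def find_spec(self, fullname, path=None, target=None):
        # map "pkg.mod" to <base_dir>/pkg/mod
        stem = os.path.join(self.base_dir, *fullname.split('.'))
        init_file = os.path.join(stem, '__init__.py')
        if os.path.isfile(init_file):
            # a package: tell the import system where submodules live
            return importlib.util.spec_from_file_location(
                fullname, init_file, submodule_search_locations=[stem])
        if os.path.isfile(stem + '.py'):
            return importlib.util.spec_from_file_location(fullname, stem + '.py')
        return None  # let the other finders try
```

The finder is installed with sys.meta_path.insert(0, DirectoryFinder(some_dir)), which keeps the standard finders in place rather than replacing the whole list.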

How do you shift all pages of a PDF document right by one inch?

I want to shift all the pages of an existing pdf document right one inch so they can be three hole punched without hitting the content. The pdf documents will be already generated so changing the way they are generated is not possible.
It appears iText can do this from a previous question.
What is an equivalent library (or way do this) for C++ or Python?
If it is platform dependent I need one that would work on Linux.
Update: Figured I would post a little script I wrote to do this in case anyone else finds this page and needs it.
Working code thanks to Scott Anderson's suggestion:
rightshift.py
#!/usr/bin/python2
import sys
import os
from pyPdf import PdfFileReader, PdfFileWriter
#not sure what default user space units are.
# just guessed until current document i was looking at worked
uToShift = 50;
if (len(sys.argv) < 3):
    print "Usage rightshift [in_file] [out_file]"
    sys.exit()
if not os.path.exists(sys.argv[1]):
    print "%s does not exist." % sys.argv[1]
    sys.exit()
pdfInput = PdfFileReader(file( sys.argv[1], "rb"))
pdfOutput = PdfFileWriter()
pages=pdfInput.getNumPages()
for i in range(0,pages):
    p = pdfInput.getPage(i)
    for box in (p.mediaBox, p.cropBox, p.bleedBox, p.trimBox, p.artBox):
        box.lowerLeft = (box.getLowerLeft_x() - uToShift, box.getLowerLeft_y())
        box.upperRight = (box.getUpperRight_x() - uToShift, box.getUpperRight_y())
    pdfOutput.addPage( p )
outputStream = file(sys.argv[2], "wb")
pdfOutput.write(outputStream)
outputStream.close()
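On the units question in the script's comment: the default PDF user space unit is 1/72 inch, so a one-inch shift corresponds to 72 units rather than the guessed 50. A minimal sketch of the arithmetic follows; the helper names are hypothetical, and boxes are represented as plain (llx, lly, urx, ury) tuples rather than pyPdf objects.

```python
POINTS_PER_INCH = 72.0  # default PDF user space: 1 unit = 1/72 inch


def inches_to_units(inches):
    """Convert inches to default PDF user space units."""
    return inches * POINTS_PER_INCH


def shift_box(box, dx):
    """Move a page box left by dx units, which shifts the content right."""
    llx, lly, urx, ury = box
    return (llx - dx, lly, urx - dx, ury)
```

In the script above, setting uToShift to inches_to_units(1) would give the one-inch margin, assuming the document does not define a custom user unit.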
You can try the pypdf library. In 2022 PyPDF2 was merged back into pypdf.
Two ways to perform this task in Linux:
1: use Ghostscript through gsview
look in your /root or /home for the hidden file .gsview.ini
go to section:
[pdfwrite Options]
Options=
Xoffset=0
Yoffset=0
change the value for the X axis, setting a convenient value (values are in PostScript points, 1 inch = 72 PostScript points)
so:
[pdfwrite Options]
Options=
Xoffset=72
Yoffset=0
close .gsview.ini
open your pdf file with gsview
file / convert / pdfwrite
select first odd pages and print to a new file (you can name this as odd.pdf)
now repeat same steps for even pages
open your pdf file with gsview
[pdfwrite Options]
Options=
Xoffset=-72
Yoffset=0
file / convert / pdfwrite
select first even pages and print to a new file (you can name this as even.pdf)
now you need to mix these two pdf with odd and even pages
you can use:
Pdf Transformer
http://sourceforge.net/projects/pdf-transformer/
java -jar ./pdf-transformer-0.4.0.jar <INPUT_FILE_NAME1> <INPUT_FILE_NAME2> <OUTPUT_FILE_NAME> merge -j
2: use podofobox + pdftk
first step: with pdftk, split the whole pdf document into two pdf files, one with only the odd pages and one with only the even pages
pdftk file.pdf cat 1-endodd output odd.pdf && pdftk file.pdf cat 1-endeven output even.pdf
now with podofobox, included into podofo utils
http://podofo.sourceforge.net/about.html
podofobox file.pdf odd.pdf crop -3600 0 width height for odd pages and
podofobox file.pdf even.pdf crop 3600 0 width height for even pages
width and height are in PostScript points x 100 and can be found with pdfinfo
e.g. if your pdf file has pagesize 482x680, then you enter
./podofobox file.pdf odd.pdf crop -3600 0 48200 68000
./podofobox file.pdf even.pdf crop 3600 0 48200 68000
then you can merge the odd and even pages into a single file with the already-cited
Pdf Transformer
http://sourceforge.net/projects/pdf-transformer/
With pdfjam, the command to translate all pages 1 inch to the right is
pdfjam --offset '1in 0in' doc.pdf
The transformed document is saved to doc-pdfjam.pdf. For further options, type pdfjam --help. Currently pdfjam requires a Unix-like command prompt (Linux, Mac, or Cygwin). In Ubuntu, it can be installed with
sudo apt install pdfjam
Not a full answer, but you can use LaTeX with pdfpages:
http://www.ctan.org/tex-archive/macros/latex/contrib/pdfpages/
Several command-line Linux tools also use this approach; for instance, pdfjam uses it:
http://www2.warwick.ac.uk/fac/sci/statistics/staff/academic-research/firth/software/pdfjam
Maybe pdfjam already provides what you need.
Here is a modified version for python3.x.
First install pypdf2 via pip install pypdf2
import sys
import os
from PyPDF2 import PdfFileReader, PdfFileWriter
uToShift = 40; # amount to shift contents by. +ve shifts right
if (len(sys.argv) < 3):
    print ("Usage rightshift [in_file] [out_file]")
    sys.exit()
if not os.path.exists(sys.argv[1]):
    print ("%s does not exist." % sys.argv[1])
    sys.exit()
path=os.path.dirname(os.path.realpath(__file__))
with open(("%s\\%s" % (path, sys.argv[1])), "rb") as pdfin:
    with open(("%s\\%s" % (path, sys.argv[2])), "wb") as pdfout:
        pdfInput = PdfFileReader(pdfin)
        pdfOutput = PdfFileWriter()
        pages=pdfInput.getNumPages()
        for i in range(0,pages):
            p = pdfInput.getPage(i)
            for box in (p.mediaBox, p.cropBox, p.bleedBox, p.trimBox, p.artBox):
                box.lowerLeft = (box.getLowerLeft_x() - uToShift, box.getLowerLeft_y())
                box.upperRight = (box.getUpperRight_x() - uToShift, box.getUpperRight_y())
            pdfOutput.addPage( p )
        pdfOutput.write(pdfout)