Calling rsync with pexpect: glob string not working - python-2.7

I'm attempting to rsync some files with pexpect. It appears the glob string argument I'm providing to identify all the source files is not working.
The gist of it is something like this...
import pexpect
import sys
glob_str = (
    "[0-9]" * 4 + "-" +
    "[0-9]" * 2 + "-" +
    "[0-9]" * 2 + "-" +
    "[A-B]" + "*"
)
SRC = "../data/{}".format(glob_str)
DES = "user#host:" + "/path/to/dest/"
args = [
"-avP",
SRC,
DES,
]
print "rsync" + " ".join(args)
# Execute the transfer
child = pexpect.spawn("rsync", args)
child.logfile_read = sys.stdout # log what the child sends back
child.expect("Password:")
child.sendline("#######")
child.expect(pexpect.EOF)
Fails with this...
building file list ...
rsync: link_stat "/Users/U6020643/git/ue-sme-query-logs/code/../data/[0-9][0-9][0-9][0-9]\-[0-9][0-9]\-[0-9][0-9]\-[A-B]*" failed: No such file or directory (2)
0 files to consider
...
The same command run in the shell works just fine
rsync -avP ../data/[0-9][0-9][0-9][0-9]\-[0-9][0-9]\-[0-9][0-9]\-[A-B].csv username@host:/path/to/dest/
The pexpect documentation mentions this
Remember that Pexpect does NOT interpret shell meta characters such as redirect, pipe, or wild cards (>, |, or *). This is a common mistake. If you want to run a command and pipe it through another command then you must also start a shell.
But doing so...
...
args = [
    "rsync",
    "-avP",
    SRC,
    DES,
]
...
child = pexpect.spawn("/bin/bash", args) # have to use a shell for glob expansion to work
...
Runs into this error instead (bash treats the first argument, rsync, as a script file to interpret rather than a command to run):
/usr/bin/rsync: /usr/bin/rsync: cannot execute binary file

To run rsync with bash you have to use bash -c "cmd...":
args = ["-c", "rsync -avP {} {}".format(SRC, DES)]
child = pexpect.spawn('/bin/bash', args=args)
And I think you can also try rsync --include=PATTERN.
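Putting the pieces together, a minimal sketch of the whole transfer (host, paths, and password are placeholders from the question):

import sys
import pexpect

glob_str = (
    "[0-9]" * 4 + "-" +
    "[0-9]" * 2 + "-" +
    "[0-9]" * 2 + "-" +
    "[A-B]" + "*"
)
SRC = "../data/{}".format(glob_str)
DES = "user@host:/path/to/dest/"

# Let bash expand the glob, then hand the expanded paths to rsync.
child = pexpect.spawn("/bin/bash", ["-c", "rsync -avP {} {}".format(SRC, DES)])
child.logfile_read = sys.stdout   # log what the child sends back
child.expect("Password:")
child.sendline("#######")         # placeholder password from the question
child.expect(pexpect.EOF)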

Related

submit job via pipe to SGE's qsub using python subprocess

I would like to submit jobs to a computer cluster via the scheduler SGE using a pipe:
$ echo -e 'date; sleep 2; date' | qsub -cwd -j y -V -q all.q -N test
(The queue might be different depending on the particular cluster.)
Running this command-line in a bash terminal works for me on the cluster I have access to, with GNU bash version 3.2.25, GE version 6.2u5 and Linux 2.6 x86_64.
In Python 2.7.2, here are my commands (the whole script is available as a gist):
import subprocess
queue = "all.q"
jobName = "test"
cmd = "date; sleep 2; date"
echoArgs = ["echo", "-e", "'%s'" % cmd]
qsubArgs = ["qsub", "-cwd", "-j", "y", "-V", "-q", queue, "-N", jobName]
Case 1: using shell=True makes it work:
wholeCmd = " ".join(echoArgs) + " | " + " ".join(qsubArgs)
out = subprocess.Popen(wholeCmd, shell=True, stdout=subprocess.PIPE)
out = out.communicate()[0]
jobId = out.split()[2]
But I would like to avoid that for security reasons explained in the official documentation.
Case 2: using the same code as above but with shell=False results in the following error message, so that the job is not even submitted:
Traceback (most recent call last):
  File "./test.py", line 22, in <module>
    out = subprocess.Popen(cmd, shell=False, stdout=subprocess.PIPE)
  File "/share/apps/lib/python2.7/subprocess.py", line 679, in __init__
    errread, errwrite)
  File "/share/apps/lib/python2.7/subprocess.py", line 1228, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
Case 3: therefore, following the official documentation as well as a related answer on SO, here is one proper way to do it:
echoProc = subprocess.Popen(echoArgs, stdout=subprocess.PIPE)
out = subprocess.check_output(qsubArgs, stdin=echoProc.stdout)
echoProc.wait()
The job is successfully submitted, but it returns the following error message:
/opt/gridengine/default/spool/compute-2-27/job_scripts/3873705: line 1: echo 3; date; sleep 2; date: command not found
This is something I don't understand.
Case 4: another proper way to do it, following another SO answer, is:
echoProc = subprocess.Popen(echoArgs, stdout=subprocess.PIPE)
qsubProc = subprocess.Popen(qsubArgs, stdin=echoProc.stdout, stdout=subprocess.PIPE)
echoProc.stdout.close()
out = qsubProc.communicate()[0]
echoProc.wait()
Here again the job is successfully submitted, but returns the following error message:
/opt/gridengine/default/spool/compute-2-32/job_scripts/3873706: line 1: echo 4; date; sleep 2; date: command not found
Did I make mistakes in my Python code? Could the problem come from the way Python or SGE were compiled and installed?
You're getting "command not found" because 'echo 3; date; sleep 2; date' is being interpreted as a single command.
Just change this line:
echoArgs = ["echo", "-e", "'%s'" % cmd]
to:
echoArgs = ["echo", "-e", "%s" % cmd]
(I.e., remove the single quotes.) That should make both Case 3 and Case 4 work (though it will break 1 and 2).
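For instance, Case 3 with the quotes removed would look like this (a sketch, reusing the names from the question):

import subprocess

cmd = "date; sleep 2; date"
echoArgs = ["echo", "-e", cmd]  # no extra quoting: the list already keeps cmd as one argument
qsubArgs = ["qsub", "-cwd", "-j", "y", "-V", "-q", "all.q", "-N", "test"]

echoProc = subprocess.Popen(echoArgs, stdout=subprocess.PIPE)
out = subprocess.check_output(qsubArgs, stdin=echoProc.stdout)
echoProc.wait()
jobId = out.split()[2]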
Your specific case could be implemented in Python 3 as:
#!/usr/bin/env python3
from subprocess import check_output
queue_name = "all.q"
job_name = "test"
cmd = b"date; sleep 2; date"
job_id = check_output('qsub -cwd -j y -V'.split() +
                      ['-q', queue_name, '-N', job_name],
                      input=cmd).split()[2]
You could adapt it for Python 2, using Popen.communicate().
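For example, a sketch of a Python 2 version (check_output() in 2.7 has no input argument, so the data is fed through Popen.communicate() instead):

from subprocess import Popen, PIPE

queue_name = "all.q"
job_name = "test"
cmd = "date; sleep 2; date"

proc = Popen('qsub -cwd -j y -V'.split() +
             ['-q', queue_name, '-N', job_name],
             stdin=PIPE, stdout=PIPE)
out, _ = proc.communicate(cmd)  # write cmd to qsub's stdin, read its stdout
job_id = out.split()[2]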
As I understand it, whoever controls the input cmd can already run arbitrary commands, so there is not much point in avoiding shell=True here:
#!/usr/bin/env python
from pipes import quote as shell_quote
from subprocess import check_output
pipeline = 'echo -e {cmd} | qsub -cwd -j y -V -q {queue_name} -N {job_name}'
job_id = check_output(pipeline.format(
    cmd=shell_quote(cmd),
    queue_name=shell_quote(queue_name),
    job_name=shell_quote(job_name)),
    shell=True).split()[2]
Implementing the pipeline by hand is error-prone. If you don't want to run the shell, you could use the plumbum module, which supports a similar pipeline syntax embedded in pure Python:
#!/usr/bin/env python
from plumbum.cmd import echo, qsub # $ pip install plumbum
qsub_args = '-cwd -j y -V -q'.split() + [queue_name, '-N', job_name]
job_id = (echo['-e', cmd] | qsub[qsub_args])().split()[2]
# or (qsub[qsub_args] << cmd)()
See How do I use subprocess.Popen to connect multiple processes by pipes?

How to run a line in Powershell in Python 2.7?

I have a line of Powershell script that runs just fine when I enter it in Powershell's command line. In my Python application which I run from Powershell, I am trying to send this line of script to Powershell.
powershell -command ' & {. ./uploadImageToBigcommerce.ps1; Process-Image '765377' '.jpg' 'C:\Images' 'W:\product_images\import'}'
I know that the script works because I've been able to implement it on its own from the Powershell command line. However, I haven't been able to get Python to send this line to the shell without getting a "non-zero exit status 1."
import subprocess

product = "765377"
scriptPath = "./uploadImageToBigcommerce.ps1"

def process_image(sku, fileType, searchDirectory, destinationPath, scriptPath):
    psArgs = "Process-Image '"+sku+"' '"+fileType+"' '"+searchDirectory+"' '"+destinationPath+"'"
    subprocess.check_call([create_PSscript_call(scriptPath, psArgs)], shell=True)

def create_PSscript_call(scriptPath, args):
    line = "powershell -command ' & {. "+scriptPath+"; "+args+"}'"
    print(line)
    return line

process_image(product, ".jpg", "C:\Images", "C:\webDAV", scriptPath)
Does anyone have any ideas to help? I've tried:
subprocess.check_call()
subprocess.call()
subprocess.Popen()
And maybe it is just a syntax issue, but I haven't been able to find enough documentation to confirm that.
Using single quotes inside a single-quoted string breaks the string. Use double quotes outside and single quotes inside (or vice versa) to avoid that. This statement:
powershell -command '& {. ./uploadImageToBigcommerce.ps1; Process-Image '765377' '.jpg' 'C:\Images' 'W:\product_images\import'}'
should rather look like this:
powershell -command "& {. ./uploadImageToBigcommerce.ps1; Process-Image '765377' '.jpg' 'C:\Images' 'W:\product_images\import'}"
Also, I'd use subprocess.call (and a quoting function), like this:
import subprocess

product = '765377'
scriptPath = './uploadImageToBigcommerce.ps1'

def qq(s):
    return "'%s'" % s

def process_image(sku, fileType, searchDirectory, destinationPath, scriptPath):
    psCode = '. ' + scriptPath + '; Process-Image ' + qq(sku) + ' ' + \
             qq(fileType) + ' ' + qq(searchDirectory) + ' ' + qq(destinationPath)
    subprocess.call(['powershell', '-Command', '& {' + psCode + '}'], shell=True)

process_image(product, '.jpg', 'C:\Images', 'C:\webDAV', scriptPath)
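Since the arguments are already passed as a list, a variant without shell=True should also work; a sketch (process_image_noshell is a hypothetical name, and subprocess then launches powershell directly rather than through cmd.exe):

import subprocess

def process_image_noshell(sku, fileType, searchDirectory, destinationPath, scriptPath):
    psCode = '. ' + scriptPath + '; Process-Image ' + \
             ' '.join("'%s'" % s for s in (sku, fileType, searchDirectory, destinationPath))
    # With an argument list and shell=False there is no extra layer of shell quoting.
    return subprocess.call(['powershell', '-Command', '& {' + psCode + '}'])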

How can i execute my django custom command with cron in a python script

I have created a Django custom command. I would like to run this command at specific intervals of time (say, every 5 minutes). How can I do this from my script or from the terminal?
My django custom command in periodic_tasks.py:
class Command(BaseCommand):
    help = 'Displays Data....'

    def handle(self, *args, **options):
        hostip = '192.168.1.1'
        cmd = 'sudo nmap -sV -T4 -O -F --version-light -oX - ' + hostip
        scandate = timezone.now()
        #scandate = datetime.datetime.now()
        self.addToLog('Detailed Scan', hostip, scandate, 'Started')
        child = pexpect.spawn(cmd, timeout=60)
        index = child.expect(['password:', pexpect.EOF, pexpect.TIMEOUT])
        child.sendline('abcdef')
        scandate = timezone.now()
        self.addToLog('Detailed Scan', hostip, scandate, 'Finished')
        print 'before xml.....'
        with open('portscan.xml', 'w') as fObj:
            fObj.write(child.before)
        print 'in xml.....'
        print 'after xml.....'
        portscandata = self.parsexml()
        self.addToDB(portscandata, hostip)
In my script I am trying to do this:
test = subprocess.Popen(["*/5","*","*","*", "*", "/usr/local/bin/python2.6","periodic_tasks"], stdout=subprocess.PIPE)
output = test.communicate()[0]
I am trying to run this from terminal like this:
*/5 * * * * root /usr/local/bin/python2.6 /home/sat034/WorkSpace/SAFEACCESS/NetworkInventory/manage.py periodic_tasks
It is saying:
bash: */5: No such file or directory
Please suggest what I am missing. Thanks in advance.
Your string
*/5 * * * * root /usr/local/bin/python2.6 /home/sat034/WorkSpace/SAFEACCESS/SynfosysNetworkInventory/manage.py periodic_tasks
looks like a cron configuration line, not a shell command, which is why bash complains about */5. You can add it to cron with crontab -e (an editor opens so you can paste the line in; note that a per-user crontab has no user field, so drop root from the line there).
The line means "run the following command every 5 minutes".
Before adding it to cron, I suggest testing the command by running it without the */5 * * * * prefix.
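For example, you could first verify the command itself from Python (a sketch; the interpreter and project paths are taken from the question):

import subprocess

cmd = ["/usr/local/bin/python2.6",
       "/home/sat034/WorkSpace/SAFEACCESS/NetworkInventory/manage.py",
       "periodic_tasks"]
subprocess.call(cmd)  # cron later runs this same command on its own schedule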

How to copy files using subprocess in python?

I have a list of files:
file_list=['test1.txt','test2.txt','test3.txt']
I want to find and copy these files to a destination folder. I have the following code:
for files in file_list:
    subprocess.call(["find", "test_folder/",
                     "-iname", files,
                     "-exec", "cp", "{}",
                     "dest_folder/",
                     "\;"])
But I keep getting the error:
find: missing argument to `-exec
The shell command looks something like this:
$ find test_folder/ -iname 'test1.txt' -exec cp {} dest_folder/ \;
Am I doing anything wrong?
You don't need to escape the semicolon. Here's what is working for me:
import shlex
import subprocess

file_list = ['test1.txt', 'test2.txt', 'test3.txt']
cmd = 'find test_folder -iname %s -exec cp {} dest_folder ;'
for files in file_list:
    subprocess.Popen(shlex.split(cmd % files))
Also see:
Python equivilant to find -exec
find command with exec in python subprocess gives error
Hope that helps.
You don't need to escape the arguments; the subprocess module invokes the find command directly, without a shell. Replace "\;" with ";" and your command will work as is.
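That is, the loop from the question with only the last argument changed (a sketch):

import subprocess

for files in file_list:
    subprocess.call(["find", "test_folder/",
                     "-iname", files,
                     "-exec", "cp", "{}",
                     "dest_folder/",
                     ";"])  # a plain ";" -- there is no shell, so nothing to escape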
You could combine the search into a single command:
from subprocess import call

expr = [a for file in file_list for a in ['-iname', file, '-o']]
expr.pop()  # remove last `-o`
rc = call(["find", "test_folder/", "("] + expr + [")", "-exec",
           "cp", "-t", "dest_folder/", "--", "{}", "+"])
You could also combine the expr list into a single -iregex argument if desired; a sketch follows.
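A sketch of that -iregex variant (GNU find is assumed, since -regextype, like the cp -t above, is a GNU option; re.escape guards regex metacharacters in the names):

import re
from subprocess import call

# -iregex matches the whole path case-insensitively, so allow any directory prefix.
pattern = r'.*/(%s)' % '|'.join(map(re.escape, file_list))
rc = call(["find", "test_folder/", "-regextype", "posix-extended",
           "-iregex", pattern,
           "-exec", "cp", "-t", "dest_folder/", "--", "{}", "+"])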
You don't need the find command at all; you could implement the copying in pure Python using os.walk, re.match, and shutil.copy:
import os
import re
import shutil

found = re.compile('(?i)^(?:%s)$' % '|'.join(map(re.escape, file_list))).match
for root, dirs, files in os.walk('test_folder/'):
    for filename in files:
        if found(filename):
            shutil.copy(os.path.join(root, filename), "dest_folder/")

How to replace string in multiple files in the folder?

I'm trying to read two files and replace the content of one file with the content of the other in all files present in a folder, which also has subdirectories.
But it fails, telling me subprocess is not defined.
I'm new to Python and shell scripting; can anybody help me with this, please?
import os
import sys
import os.path
f = open("file1.txt", 'r')
g = open("file2.txt", 'r')
text1 = f.readlines()
text2 = g.readlines()
i = 0
for line in text1:
    l = line.replace("\r\n", "")
    t = text2[i].replace("\r\n", "")
    args = "find . -name *.tml"
    Path = subprocess.Popen(args, shell=True)
    os.system(" sed -r -i 's/" + l + "/" + t + "/g' " + Path)
    i = i + 1
To specifically address your actual error, you need to import the subprocess module as you are making use of it (oddly) in your code:
import subprocess
After that, you will find more problems. I will try to keep my suggestions as simple as possible: code first, then a breakdown. Keep in mind there are more robust ways to accomplish this task, but I am doing my best to respect your experience level and to stay as close as possible to your current approach.
import subprocess
import sys

# 1
results = subprocess.Popen("find . -name '*.tml'",
                           shell=True, stdout=subprocess.PIPE)
if results.wait() != 0:
    print "error trying to find tml files"
    sys.exit(1)

# 2
tml_files = []
for tml in results.stdout:
    tml_files.append(tml.strip())
if not tml_files:
    print "no tml files found"
    sys.exit(0)
tml_string = " ".join(tml_files)

# 3
with open("file1.txt") as f, open("file2.txt") as g:
    while True:
        # 4
        f_line = f.readline()
        if not f_line:
            break
        g_line = g.readline()
        if not g_line:
            break
        f_line = f_line.strip()
        g_line = g_line.strip()
        if not f_line or not g_line:
            continue
        # 5
        cmd = "sed -i -e 's/%s/%s/g' %s" % \
              (f_line, g_line, tml_string)
        ret = subprocess.Popen(cmd, shell=True).wait()
        if ret != 0:
            print "error doing string replacement"
            sys.exit(1)
You do not need to read your entire files in at once; if they are large, that could use a lot of memory. You can consume a line at a time instead, and you can make use of "context managers" when you open the files, which ensures they are closed properly no matter what happens. The numbered comments in the code correspond to the points below:
1. We start with a subprocess command that is run only once, to find all your .tml files. Your version ran the same command multiple times; if the search path is the same, we only need it once. We check the exit code of the command and quit if it failed.
2. We loop over the stdout of that subprocess command and add the stripped lines to a list. This is a more robust version of your replace("\r\n"): it removes any surrounding whitespace. (A "list comprehension" would be better suited here, down the line.) If we didn't find any tml files, we have no work to do, so we exit. Otherwise, we join them into a single space-separated string suitable for the command later.
3. These are "context managers": the files are open for the length of the with block and will be closed properly no matter what happens. We then loop forever and break when appropriate.
4. We pull one line at a time from each file. If either read comes back empty, we have reached the end of that file and cannot do any more work, so we break out. We then strip the newlines; if either string is empty (a blank line), we still can't do any work, but we simply continue to the next available line.
5. A modified version of your sed command. On each loop we build the command string from the source and replacement strings and tack on the tml file string. Bear in mind this is a very naive approach to the replacement: it expects your strings to contain only characters that are safe inside sed's s///g format. We run it with another subprocess command; wait() simply waits for the return code, which we check for an error. This approach replaces your os.system() version.
Hope this helps. Eventually you can improve this to do more checking and safe operations.
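For example, one improvement would be escaping sed metacharacters before building the command; a sketch that plugs into the loop above (the exact escaping rules for sed's s///g are an assumption worth verifying for your sed):

import re

def sed_escape_pattern(s):
    # Escape the delimiter plus basic-regex metacharacters in the search text.
    return re.sub(r'([/\\.^$*\[\]&])', r'\\\1', s)

def sed_escape_replacement(s):
    # In the replacement text only the delimiter, backslash, and & are special.
    return re.sub(r'([/\\&])', r'\\\1', s)

cmd = "sed -i -e 's/%s/%s/g' %s" % (sed_escape_pattern(f_line),
                                    sed_escape_replacement(g_line),
                                    tml_string)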