How to copy files using subprocess in Python? - python-2.7

I have a list of files:
file_list=['test1.txt','test2.txt','test3.txt']
I want to find and copy these files to a destination folder. I have the following code:
for files in file_list:
    subprocess.call(["find", "test_folder/",
                     "-iname", files,
                     "-exec", "cp", "{}",
                     "dest_folder/",
                     "\;"])
But I keep getting the error:
find: missing argument to `-exec
The equivalent shell command looks something like this:
$ find test_folder/ -iname 'test1.txt' -exec cp {} dest_folder/ \;
Am I doing anything wrong?

You don't need to escape the semicolon. Here's what works for me:
import shlex
import subprocess
file_list = ['test1.txt','test2.txt','test3.txt']
cmd = 'find test_folder -iname %s -exec cp {} dest_folder ;'
for files in file_list:
    subprocess.Popen(shlex.split(cmd % files))
Also see:
Python equivalent to find -exec
find command with exec in python subprocess gives error
Hope that helps.

You don't need to escape the arguments; the subprocess module calls the find command directly, without the shell. Replace "\;" with ";" and your command will work as is.
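Applied to the loop from the question, that looks like:
import subprocess

for files in file_list:
    subprocess.call(["find", "test_folder/",
                     "-iname", files,
                     "-exec", "cp", "{}", "dest_folder/",
                     ";"])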
You could combine the search into a single command:
from subprocess import call
expr = [a for file in file_list for a in ['-iname', file, '-o']]
expr.pop() # remove last `-o`
rc = call(["find", "test_folder/", "("] + expr + [")", "-exec",
           "cp", "-t", "dest_folder/", "--", "{}", "+"])
You could also combine the expr list into a single -iregex argument if desired; a sketch follows below.
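For instance, a minimal sketch of the -iregex variant, assuming GNU find (which provides -regextype):
import re
from subprocess import call

# -iregex matches the whole path case-insensitively, hence the leading .*/
pattern = ".*/(%s)" % "|".join(map(re.escape, file_list))
rc = call(["find", "test_folder/", "-regextype", "posix-extended",
           "-iregex", pattern,
           "-exec", "cp", "-t", "dest_folder/", "--", "{}", "+"])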
You don't need the find command at all; you could implement the copying in pure Python using os.walk, re.match, and shutil.copy:
import os
import re
import shutil
found = re.compile('(?i)^(?:%s)$' % '|'.join(map(re.escape, file_list))).match
for root, dirs, files in os.walk('test_folder/'):
    for filename in files:
        if found(filename):
            shutil.copy(os.path.join(root, filename), "dest_folder/")

Related

Shell command execution on Python 2.7

I want to verify that a site domain contains "com". Assume I have shell variables such as
export FIRST_URL="http://www.11111.com"
export SECOND_URL="http://www.22222.org"
The user calls the Python script with a parameter (the partial shell variable name) as
python2.7 FIRST # OR
python2.7 SECOND
The Python script is:
import sys, os, subprocess
PART_URL = sys.argv[1]
print( "PART_URL=",PART_URL)
COMPLETE_URL = PART_URL+'_URL' # Formed a full shell varibale
cmd_str = 'echo {} | grep "com"'.format(COMPLETE_URL) # intended to be equivalent to: echo $FIRST_URL | grep com
my_site = subprocess.check_output(cmd_str, shell=True) # note: subprocess.run() is not available in Python 2.7
print("The validated Site is ", my_site)
The output should be "The validated Site is http://www.11111.com"
Refer to How to access environment variable values?
$ export FIRST_VAR="http://www.11111.com"
$ python
>>> import os
>>> foo = 'FIRST'
>>> print os.environ[foo + '_VAR']
http://www.11111.com
>>>
Figured it out,
COMPLETE_URL = PART_URL+'_URL'
cmd_str = 'echo ${} | grep com'.format(COMPLETE_URL)
my_site=subprocess.check_output(cmd_str, shell=True)
Alternatively, we can use
my_site = subprocess.call(shlex.split(cmd_str))
though note that without shell=True the $ variable expansion and the pipe will not work, and call() returns the exit code rather than the output.
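For reference, a minimal pure-Python sketch of the same check using os.environ, so no shell command is needed at all:
import os
import sys

PART_URL = sys.argv[1]            # e.g. "FIRST"
COMPLETE_URL = PART_URL + '_URL'  # e.g. "FIRST_URL"
url = os.environ.get(COMPLETE_URL, '')
if 'com' in url:
    print("The validated Site is " + url)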

How to change the values of a parameter in multiple files using python

I am a new user of Python. I have learned a way of changing the value of a parameter in a single file. The script:
#####test.py##########
from sys import argv
script,filename,sigma = argv
file_data = open(filename,'r')
txt = file_data.read()
txt=txt.replace('3.7',sigma)
file_data = open(filename,'w')
file_data.write(txt)
file_data.close()
It's run on the command line with test.txt as
test.py test.txt 2
As a result, 3.7 is replaced by 2 in test.txt.
Now if I want to do the same for all the .txt files in the directory e.g.
test.py *.txt 2
what are the suggested modifications?
Your suggestions are highly appreciated.
Hafiz.
bash (or whatever your shell is) will expand the *.txt (to test0.txt test1.txt ..., or whatever the *.txt files in your current directory are called) before passing it to your Python script. Your script will therefore receive many arguments, not just the two you expect. Print sys.argv to inspect them.
You could solve that in bash itself with something like
for name in *.txt; do test.py ${name} 2; done
Otherwise you would need to treat sys.argv differently in Python and allow for more than two arguments, as sketched below.
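A minimal sketch of that second approach: take the sigma value as the last argument and treat everything before it as filenames (the shell has already expanded *.txt):
from sys import argv

sigma = argv[-1]             # the last argument is the new value
for filename in argv[1:-1]:  # every argument in between is a file
    file_data = open(filename, 'r')
    txt = file_data.read()
    file_data.close()
    file_data = open(filename, 'w')
    file_data.write(txt.replace('3.7', sigma))
    file_data.close()
Run it as test.py *.txt 2, just as intended.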
Importing glob solved that issue. But I've got some queries.
Query 1:
I'm rewriting my code as:
#####test.py##########
from sys import argv
script,filename,sigma = argv
file_data = open(filename,'r')
txt = file_data.read()
txt=txt.replace('3.7'|'3',sigma) #gives syntax error
file_data = open(filename,'w')
file_data.write(txt)
file_data.close()
I want to replace 3.7 or 3 by sigma. What will be the corrected code?
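A minimal sketch of one fix, using re.sub: the | alternation belongs inside a regular expression, not between Python string literals, and the longer literal goes first so 3.7 is not split into 3 plus .7:
import re

txt = re.sub(r'3\.7|3', sigma, txt)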
Query 2:
I'm rewriting it in the following manner:
#####test.py##########
from sys import argv
script,filename,sigma = argv
file_data = open(filename,'r')
txt = file_data.read()
txt=txt.replace('x="2"','x=sigma')
file_data = open(filename,'w')
file_data.write(txt)
file_data.close()
With
py test.py test.txt 3
I get x=sigma, but I want to get x=3.
What would be the modification?
Regards,
Hafiz
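For Query 2, the replacement text needs the value of sigma interpolated rather than the literal word sigma; a minimal sketch (assuming the quotes around the value should be kept):
# interpolate the command-line value instead of the literal word "sigma"
txt = txt.replace('x="2"', 'x="%s"' % sigma)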

Error while triggering a Python script using os.system. Script takes sys.argv arguments

A simple script1.py takes arguments and calls script2.py by passing them to os.system():
#! /usr/bin/env python
import os
import sys
os.system("script2.py sys.argv[1] sys.argv[2]")
Running this:
./script1.py "arg1" "arg2"
I get this single error:
sh: 1: script2.py: not found
Both scripts are present in the same directory.
I applied chmod 777 to both script1.py and script2.py, so both are executable.
Both scripts use the same interpreter via /usr/bin/env python.
When I try these:
os.system("./script2.py sys.argv[1] sys.argv[2]")
os.system("python script2.py sys.argv[1] sys.argv[2]")
The sys.argv[1] and sys.argv[2] are treated as literal strings instead of being expanded to the actual argument values.
Have you tried with:
./script2.py "arg1" "arg2"
inside the os.system() call?
UPDATE 2
Try with:
import os
import sys

call_with_args = "./script2.py '%s' '%s'" % (sys.argv[1], sys.argv[2])
os.system(call_with_args)
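A safer variant, sketched with subprocess, passes the arguments through without any shell quoting concerns:
import subprocess
import sys

subprocess.call(["./script2.py", sys.argv[1], sys.argv[2]])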

Calling rsync with pexpect: glob string not working

I'm attempting to rsync some files with pexpect. It appears the glob string argument I'm providing to identify all the source files is not working.
The gist of it is something like this...
import pexpect
import sys
glob_str = (
    "[0-9]" * 4 + "-" +
    "[0-9]" * 2 + "-" +
    "[0-9]" * 2 + "-" +
    "[A-B]" + "*"
)
SRC = "../data/{}".format(glob_str)
DES = "user@host:" + "/path/to/dest/"
args = [
    "-avP",
    SRC,
    DES,
]
print "rsync" + " ".join(args)
# Execute the transfer
child = pexpect.spawn("rsync", args)
child.logfile_read = sys.stdout # log what the child sends back
child.expect("Password:")
child.sendline("#######")
child.expect(pexpect.EOF)
Fails with this...
building file list ...
rsync: link_stat "/Users/U6020643/git/ue-sme-query-logs/code/../data/[0-9][0-9][0-9][0-9]\-[0-9][0-9]\-[0-9][0-9]\-[A-B]*" failed: No such file or directory (2)
0 files to consider
...
The same command run in the shell works just fine
rsync -avP ../data/[0-9][0-9][0-9][0-9]\-[0-9][0-9]\-[0-9][0-9]\-[A-B].csv username@host:/path/to/dest/
The pexpect documentation mentions this
Remember that Pexpect does NOT interpret shell meta characters such as redirect, pipe, or wild cards (>, |, or *). This is a common mistake. If you want to run a command and pipe it through another command then you must also start a shell.
But doing so...
...
args = [
    "rsync",
    "-avP",
    SRC,
    DES,
]
...
child = pexpect.spawn("/bin/bash", args) # have to use a shell for glob expansion to work
...
Runs into this error (bash is trying to execute the rsync binary as a shell script):
/usr/bin/rsync: /usr/bin/rsync: cannot execute binary file
To run rsync with bash you have to use bash -c "cmd...":
args = ["-c", "rsync -avP {} {}".format(SRC, DES)]
child = pexpect.spawn('/bin/bash', args=args)
And I think you can also try rsync --include=PATTERN; a sketch follows below.
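A minimal sketch of that idea: with --include plus --exclude='*', rsync does the name matching itself, so no shell is needed and pexpect can spawn rsync directly (the pattern, password, and destination are the placeholders from the question):
import sys
import pexpect

pattern = "[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[A-B]*"
child = pexpect.spawn("rsync", ["-avP",
                                "--include=%s" % pattern,
                                "--exclude=*",  # everything not included is skipped
                                "../data/",
                                "user@host:/path/to/dest/"])
child.logfile_read = sys.stdout  # log what the child sends back
child.expect("Password:")
child.sendline("#######")
child.expect(pexpect.EOF)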

How to replace string in multiple files in the folder?

I'm trying to read two files and, in the files present in a folder (which also has subdirectories), replace content of one file with content of the other file.
But it tells me subprocess is not defined.
I'm new to Python and shell scripting; can anybody help me with this, please?
import os
import sys
import os.path
f = open("file1.txt", 'r')
g = open("file2.txt", 'r')
text1 = f.readlines()
text2 = g.readlines()
i = 0
for line in text1:
    l = line.replace("\r\n", "")
    t = text2[i].replace("\r\n", "")
    args = "find . -name *.tml"
    Path = subprocess.Popen(args, shell=True)
    os.system(" sed -r -i 's/" + l + "/" + t + "/g' " + Path)
    i = i + 1
To specifically address your actual error, you need to import the subprocess module as you are making use of it (oddly) in your code:
import subprocess
After that, you will find more problems. I will try to keep my suggestions as simple as possible. Code first, then I will break it down. Keep in mind, there are more robust ways to accomplish this task, but I am doing my best to keep your experience level in mind and to match your current approach as closely as possible.
import subprocess
import sys

# 1
results = subprocess.Popen("find . -name '*.tml'",
                           shell=True, stdout=subprocess.PIPE)
if results.wait() != 0:
    print "error trying to find tml files"
    sys.exit(1)

# 2
tml_files = []
for tml in results.stdout:
    tml_files.append(tml.strip())
if not tml_files:
    print "no tml files found"
    sys.exit(0)
tml_string = " ".join(tml_files)

# 3
with open("file1.txt") as f, open("file2.txt") as g:
    while True:
        # 4
        f_line = f.readline()
        if not f_line:
            break
        g_line = g.readline()
        if not g_line:
            break
        f_line = f_line.strip()
        g_line = g_line.strip()
        if not f_line or not g_line:
            continue
        # 5
        cmd = "sed -i -e 's/%s/%s/g' %s" % \
              (f_line, g_line, tml_string)
        ret = subprocess.Popen(cmd, shell=True).wait()
        if ret != 0:
            print "error doing string replacement"
            sys.exit(1)
You do not need to read in your entire files at once; if they are large, this could use a lot of memory. You can consume a line at a time, and you can also make use of what are called "context managers" when you open the files. This will ensure the files close properly no matter what happens:
1. We start with a subprocess command that is run only once to find all your .tml files. Your version ran the same command multiple times; if the search path is the same, we only need it once. This checks the exit code of the command and quits if it failed.
2. We loop over stdout of the subprocess command and add the stripped lines to a list. This is a more robust version of your replace("\r\n"): it removes surrounding whitespace. (A "list comprehension" would be better suited here, down the line.) If we didn't find any tml files, then we have no work to do, so we exit. Otherwise, we join them together into a space-separated string suitable for our command later.
3. These are "context managers": you can open files in a way that guarantees they are closed properly no matter what. The files stay open for the length of that code block. We are going to loop forever and break when appropriate.
4. We pull a line at a time from each file. If either line is empty, we have reached the end of that file and cannot do any more work, so we break out. We then strip the newlines, and if either string is empty (a blank line) we still can't do any work, but we just continue to the next available line.
5. A modified version of your sed command. On each loop we construct the command string from the source and replacement strings and tack on the tml file string. Bear in mind this is a very naive approach: it expects your replacement strings to contain safe characters that do not break the s///g sed format. We run it with another subprocess command. The wait() simply waits for the return code, and we check it for an error. This approach replaces your os.system() version.
Hope this helps. Eventually you can improve this to do more checking and safer operations.
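For completeness, a hedged pure-Python sketch of the whole task that avoids both find and sed (and, with them, the shell quoting pitfalls); it assumes the replacements should be applied as literal strings, line for line, as in the original:
import os

# Pair up corresponding lines of the two files, skipping blanks
with open("file1.txt") as f, open("file2.txt") as g:
    pairs = [(a.strip(), b.strip()) for a, b in zip(f, g)
             if a.strip() and b.strip()]

# Walk the tree and rewrite every .tml file in place
for root, dirs, files in os.walk("."):
    for name in files:
        if not name.endswith(".tml"):
            continue
        path = os.path.join(root, name)
        with open(path) as fh:
            text = fh.read()
        for old, new in pairs:
            text = text.replace(old, new)
        with open(path, "w") as fh:
            fh.write(text)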