Easy way to set on_delete across entire application - regex

I've been using the -Wd argument for Python and discovered tons of changes I need to make in order to prepare my upgrade to Django 2.0
python -Wd manage.py runserver
The main thing is that on_delete is due to become a required argument.
RemovedInDjango20Warning: on_delete will be a required arg for ForeignKey in Django 2.0. Set it to models.CASCADE on models and in existing migrations if you want to maintain the current default behavior.
See https://docs.djangoproject.com/en/1.9/ref/models/fields/#django.db.models.ForeignKey.on_delete
Is there an easy regex (or way) I can use to put on_delete into all of my foreign keys?

Use with care
You can use
(ForeignKey|OneToOneField)\(((?:(?!on_delete|ForeignKey|OneToOneField)[^\)])*)\)
This matches every foreign key that does not already define what happens upon deletion, and it skips places where you have overridden ForeignKey.
It captures everything inside the brackets, which lets you replace the inner text with the capture group plus the on_delete argument:
$1($2, on_delete=models.CASCADE)
A blind replace-all with the above is not advised: step through each match to make sure no issues are introduced (such as PEP 8 line-length warnings).
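As a rough illustration, the same pattern and replacement can be tried with Python's re module on a few sample lines before touching real files. The model lines below are made up for the example, and `\1`/`\2` are Python's spelling of `$1`/`$2`.

```python
import re

# Pattern from the answer above: a ForeignKey/OneToOneField call whose
# argument list contains neither on_delete nor a closing parenthesis.
PATTERN = re.compile(
    r'(ForeignKey|OneToOneField)\(((?:(?!on_delete|ForeignKey|OneToOneField)[^\)])*)\)'
)
REPLACEMENT = r'\1(\2, on_delete=models.CASCADE)'


def add_on_delete(line):
    """Return the line with on_delete appended to matching field calls."""
    return PATTERN.sub(REPLACEMENT, line)


# Hypothetical model lines, for illustration only.
samples = [
    "author = models.ForeignKey(User)",
    "profile = models.OneToOneField(Profile, null=True)",
    "owner = models.ForeignKey(User, on_delete=models.PROTECT)",
]
for sample in samples:
    print(add_on_delete(sample))
```

Arguments that contain their own parentheses, such as verbose_name=_('owner'), end the match at the first `)` and get rewritten incorrectly, which is one more reason to review each replacement by hand.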

I had to do this, and Sayse's solution worked:
import re
import fileinput
import os, fnmatch
import glob
from pathlib import Path
# https://stackoverflow.com/questions/41571281/easy-way-to-set-on-delete-across-entire-application
# https://stackoverflow.com/questions/11898998/how-can-i-write-a-regex-which-matches-non-greedy
# https://stackoverflow.com/a/4719629/433570
# https://stackoverflow.com/a/2186565/433570
regex = r'(.*?)(ForeignKey|OneToOneField)\(((?:(?!on_delete|ForeignKey|OneToOneField)[^\)])*)\)(.*)'
index = 0
for filename in Path('apps').glob('**/migrations/*.py'):
    print(filename)
    # fileinput on older Pythons does not accept Path objects, so pass a tuple of str
    filename = (os.fspath(filename), )
    for line in fileinput.FileInput(filename, inplace=1):
        a = re.search(regex, line)
        if a:
            print('{}{}({}, on_delete=models.CASCADE){}'.format(a.group(1), a.group(2), a.group(3), a.group(4)))
        else:
            print(line, end='')

I made this bash script that may help you.
#!/bin/bash
FK=()
IFS=$'\n'
count=0
for fk in $(cat "$1" | egrep -i --color -o 'models\.ForeignKey\((.*?)');
do
    FK[$count]=$fk
    #FK+=$fk
    count=$(($count + 1))
done
for c in "${FK[@]}";
do
    r=`echo "${c}" | sed -e 's/)$/,on_delete=models.CASCADE)/g'`
    a="${c}"
    sed -i "s/${c}/${r}/g" "$1"
done
If you want a safer approach, change sed -i to sed -e and redirect the output to a file that you can compare against your original models.py.
Happy coding!!

Related

Making re.sub print whole input along with its modifications

All I need is to perform a parameter replacement in a config file:
#!/usr/bin/python3
import re
current_config = '''
#USERNAME
[name = foo]
#-USERNAME
#DATABASE
[config = old]
#-DATABASE
#MORE_FIELDS
[moresettings]
#-MORE_FIELDS
'''
new_config ='''
#DATABASE
[config = new]
#-DATABASE
'''
final = re.sub('#DATABASE}}(.*)#-#{{#-DATABASE}}', current_config, new_config)
print(final)
The regex itself works fine, the problem is that it prints out only the (correctly) modified part, ignoring the rest of the file:
dev-sandbox@workstation:~/PycharmProjects/test$ python3 test.py
#DATABASE
[config = new]
#-DATABASE
What I want to achieve is the entire "current_config" variable including the new settings for the DATABASE block. I got it working with SED on shell, but I need this in python3.
Any help is much appreciated; the documentation for re.sub suggests it can't be done with it. Thank you!
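For what it's worth, here is a minimal sketch of how re.sub can return the whole text with only the DATABASE block swapped. It assumes the block is delimited by the #DATABASE and #-DATABASE marker lines shown in the question; the pattern is my own, not the one from the question, and new_block is a hypothetical name.

```python
import re

current_config = '''
#USERNAME
[name = foo]
#-USERNAME
#DATABASE
[config = old]
#-DATABASE
#MORE_FIELDS
[moresettings]
#-MORE_FIELDS
'''

# Hypothetical replacement block (no surrounding blank lines).
new_block = '''#DATABASE
[config = new]
#-DATABASE'''

# re.sub takes (pattern, replacement, string): the full text goes last.
# re.DOTALL lets "." match newlines so the pattern can span the block.
pattern = re.compile(r'#DATABASE\n.*?#-DATABASE', re.DOTALL)

# Using a function as the replacement keeps any backslashes in new_block literal.
final = pattern.sub(lambda match: new_block, current_config)
print(final)
```

re.sub returns a copy of its third argument with the matches replaced, so the text that should survive has to be passed as the string, not as the replacement.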

How to change the values of a parameter in multiple files using python

I am a new user of Python. I got to learn a way of changing value of a parameter in a single file. The script:
#####test.py##########
from sys import argv
script,filename,sigma = argv
file_data = open(filename,'r')
txt = file_data.read()
txt=txt.replace('3.7',sigma)
file_data = open(filename,'w')
file_data.write(txt)
file_data.close()
It's run in command line with test.txt as
test.py test.txt 2.
3.7 is replaced by 2 in test.txt, as a result.
Now if I want to do the same for all the .txt files in the directory e.g.
test.py *.txt 2
what are the suggested modifications?
Your suggestions are highly appreciated.
Hafiz.
Bash (or whatever your shell is) expands *.txt (to test0.txt test1.txt ..., or whatever the .txt files in your current directory are called) before passing it to your Python script. Your script therefore receives many arguments, not just the two you expect. Print sys.argv to inspect them.
You could solve that in bash itself with something like
for name in *.txt; do test.py "${name}" 2; done
Otherwise you would need to treat sys.argv differently in Python and allow for more than two arguments.
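Handling the expanded argument list inside Python could look roughly like this. It is only a sketch: it assumes the last argument is always sigma and everything before it is a file name, and replace_in_file and main are names introduced here for illustration.

```python
import sys


def replace_in_file(filename, old, new):
    """Rewrite filename in place with every occurrence of old replaced by new."""
    with open(filename, 'r') as handle:
        text = handle.read()
    with open(filename, 'w') as handle:
        handle.write(text.replace(old, new))


def main(argv):
    # argv looks like: [script, file1, file2, ..., sigma]
    if len(argv) < 3:
        print('usage: test.py FILE [FILE ...] SIGMA')
        return 1
    *filenames, sigma = argv[1:]
    for filename in filenames:
        replace_in_file(filename, '3.7', sigma)
    return 0


if __name__ == '__main__':
    main(sys.argv)
```

With this, test.py *.txt 2 works because the shell-expanded file names all end up in filenames.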
Importing glob solved that issue. But I've got some queries.
Query 1:
I'm rewriting my code as:
#####test.py##########
from sys import argv
script,filename,sigma = argv
file_data = open(filename,'r')
txt = file_data.read()
txt=txt.replace('3.7'|'3',sigma) #gives syntax error
file_data = open(filename,'w')
file_data.write(txt)
file_data.close()
I want to replace 3.7 or 3 by sigma. What will be the corrected code?
Query 2:
I'm rewriting it in the following manner:
#####test.py##########
from sys import argv
script,filename,sigma = argv
file_data = open(filename,'r')
txt = file_data.read()
txt=txt.replace('x="2"','x=sigma')
file_data = open(filename,'w')
file_data.write(txt)
file_data.close()
With
py test.py test.txt 3.
I get x=sigma, but I want to get x=3
What'd be the modification?
Regards,
Hafiz
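Both queries come down to how the arguments of the replacement are built. A small sketch, with replace_values and replace_x as hypothetical helper names: alternation needs a regular expression (str.replace takes one literal string), and the value of sigma has to be concatenated into the replacement instead of being written inside the quotes.

```python
import re


def replace_values(text, sigma):
    """Replace 3.7 or 3 with sigma (Query 1)."""
    # Alternation belongs in a regular expression, not in str.replace.
    # 3\.7 is listed first so that "3.7" is not consumed as "3" plus ".7".
    return re.sub(r'3\.7|3', sigma, text)


def replace_x(text, sigma):
    """Replace x="2" with x=<value of sigma> (Query 2)."""
    # Build the replacement from the variable instead of writing the
    # name sigma inside the string literal.
    return text.replace('x="2"', 'x=' + sigma)


print(replace_values('a=3.7 b=3', '2'))
print(replace_x('x="2"', '3'))
```

Note that the Query 1 pattern also replaces a 3 that is part of a longer number such as 13, so it may need tightening for real data.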

Rewrite YAML frontmatter with regular expression

I want to convert my WordPress website to a static site on GitHub using Jekyll.
I used a plugin that exports my 62 posts to GitHub as Markdown. I now have these posts with extra frontmatter at the beginning of each file. It looks like this:
---
ID: 51
post_title: Here's my post title
author: Frank Meeuwsen
post_date: 2014-07-03 22:10:11
post_excerpt: ""
layout: post
permalink: >
  https://myurl.com/slug
published: true
sw_timestamp:
  - "399956"
sw_open_thumbnail_url:
  - >
    https://myurl.com/wp-content/uploads/2014/08/Featured_image.jpg
sw_cache_timestamp:
  - "408644"
swp_open_thumbnail_url:
  - >
    https://myurl.com/wp-content/uploads/2014/08/Featured_image.jpg
swp_open_graph_image_data:
  - '["https://i0.wp.com/myurl.com/wp-content/uploads/2014/08/Featured_image.jpg?fit=800%2C400&ssl=1",800,400,false]'
swp_cache_timestamp:
  - "410228"
---
This block isn't parsed right by Jekyll, plus I don't need all this frontmatter. I would like to have each file's frontmatter converted to
---
ID: 51
post_title: Here's my post title
author: Frank Meeuwsen
post_date: 2014-07-03 22:10:11
layout: post
published: true
---
I would like to do this with regular expressions. But my knowledge of regex is not that great. With the help of this forum and lots of Google searches I didn't get very far. I know how to find the complete piece of frontmatter but how do I replace it with a part of it as specified above?
I might have to do this in steps, but I can't wrap my head around how to do this.
I use Textwrangler as the editor to do the search and replace.
YAML (and other relatively free formats like HTML, JSON, XML) is best not transformed using regular expressions: it is easy to get one example working and then break on the next one that has extra whitespace, different indentation, etc.
Using a YAML parser in this situation is not trivial, as many either expect a single YAML document in the file (and barf on the Markdown part as extraneous stuff) or expect multiple YAML documents in the file (and barf because the Markdown is not YAML). Moreover, most YAML parsers throw away useful things like comments and reorder mapping keys.
I have used a similar format (YAML header, followed by reStructuredText) for many years for my ToDo items, and use a small Python program to extract and update these files. Given input like this:
---
ID: 51 # one of the key/values to preserve
post_title: Here's my post title
author: Frank Meeuwsen
post_date: 2014-07-03 22:10:11
post_excerpt: ""
layout: post
permalink: >
  https://myurl.com/slug
published: true
sw_timestamp:
  - "399956"
sw_open_thumbnail_url:
  - >
    https://myurl.com/wp-content/uploads/2014/08/Featured_image.jpg
sw_cache_timestamp:
  - "408644"
swp_open_thumbnail_url:
  - >
    https://myurl.com/wp-content/uploads/2014/08/Featured_image.jpg
swp_open_graph_image_data:
  - '["https://i0.wp.com/myurl.com/wp-content/uploads/2014/08/Featured_image.jpg?fit=800%2C400&ssl=1",800,400,false]'
swp_cache_timestamp:
  - "410228"
---
additional stuff that is not YAML
and more
and more
And this program ¹:
import sys
import ruamel.yaml
from pathlib import Path
def extract(file_name, position=0):
    doc_nr = 0
    if not isinstance(file_name, Path):
        file_name = Path(file_name)
    yaml_str = ""
    with file_name.open() as fp:
        for line_nr, line in enumerate(fp):
            if line.startswith('---'):
                if line_nr == 0:  # don't count --- on first line as next document
                    continue
                else:
                    doc_nr += 1
            if position == doc_nr:
                yaml_str += line
    return ruamel.yaml.round_trip_load(yaml_str, preserve_quotes=True)
def reinsert(ofp, file_name, data, position=0):
    doc_nr = 0
    inserted = False
    if not isinstance(file_name, Path):
        file_name = Path(file_name)
    with file_name.open() as fp:
        for line_nr, line in enumerate(fp):
            if line.startswith('---'):
                if line_nr == 0:
                    ofp.write(line)
                    continue
                else:
                    doc_nr += 1
            if position == doc_nr:
                if inserted:
                    continue
                ruamel.yaml.round_trip_dump(data, ofp)
                inserted = True
                continue
            ofp.write(line)
data = extract('input.yaml')
for k in list(data.keys()):
    if k not in ['ID', 'post_title', 'author', 'post_date', 'layout', 'published']:
        del data[k]
reinsert(sys.stdout, 'input.yaml', data)
You get this output:
---
ID: 51 # one of the key/values to preserve
post_title: Here's my post title
author: Frank Meeuwsen
post_date: 2014-07-03 22:10:11
layout: post
published: true
---
additional stuff that is not YAML
and more
and more
Please note that the comment on the ID line is properly preserved.
¹ This was done using ruamel.yaml, a YAML 1.2 parser of which I am the author; it tries to preserve as much information as possible on round-trips.
Editing my post because I misinterpreted the question the first time: I failed to understand that the actual post was in the same file, right after the ---.
Using egrep and GNU sed, so not the bash built-in, it's relatively easy:
# create a working copy
mv file file.old
# get only the fields you need from the frontmatter and redirect that to a new file
egrep '(---|ID|post_title|author|post_date|layout|published)' file.old > file
# get everything from the old file, but discard the frontmatter
cat file.old |gsed '/---/,/---/ d' >> file
# remove working copy
rm file.old
And if you want it all in one go:
for i in `ls`; do mv $i $i.old; egrep '(---|ID|post_title|author|post_date|layout|published)' $i.old > $i; cat $i.old |gsed '/---/,/---/ d' >> $i; rm $i.old; done
For good measure, here's what I wrote as my first response:
===========================================================
I think you're making this way too complicated.
A simple egrep will do what you want:
egrep '(---|ID|post_title|author|post_date|layout|published)' file
redirect to a new file:
egrep '(---|ID|post_title|author|post_date|layout|published)' file > newfile
a whole dir at once:
for i in `ls`; do egrep '(---|ID|post_title|author|post_date|layout|published)' $i > $i.new; done
In cases like yours it is better to use an actual YAML parser and a scripting language. Cut the metadata out of each file into a standalone file (or string), then use a YAML library to load it. Once the metadata is loaded, you can modify it safely. Then use the serialize method from the same library to create the new metadata and finally put the file back together.
Something like this:
<?php
list ($before, $metadata, $after) = preg_split("/\n----*\n/ms", file_get_contents($argv[1]));
$yaml = yaml_parse($metadata);
$yaml_copy = [];
foreach ($yaml as $k => $v) {
    // copy the data you wish to preserve to $yaml_copy
    if (...) {
        $yaml_copy[$k] = $yaml[$k];
    }
}
file_put_contents('new/'.$argv[1], $before."\n---\n".yaml_emit($yaml_copy)."\n---\n".$after);
(It is just an untested draft with no error checks.)
You could do it with gawk like this:
gawk 'BEGIN {RS="---"; FS="\000" } (FNR == 2) { print "---"; split($1, fm, "\n"); for (line in fm) { if ( fm[line] ~ /^(ID|post_title|author|post_date|layout|published):/) {print fm[line]} } print "---" } (FNR > 2) {print}' post1.html > post1_without_frontmatter_fields.html
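The same line-based filtering can be sketched in plain Python, without a YAML library. This is only a sketch under assumptions: the front matter starts on the first line, the keys to keep have single-line values, and filter_front_matter and KEEP are names made up for the example.

```python
KEEP = ('ID', 'post_title', 'author', 'post_date', 'layout', 'published')


def filter_front_matter(text, keep=KEEP):
    """Keep only the wanted top-level keys of a leading front matter block."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != '---':
        return text  # no front matter, leave the text alone
    output = [lines[0]]
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == '---':
            # Closing delimiter: copy it and the rest of the file unchanged.
            output.extend(lines[index:])
            return ''.join(output)
        key = line.split(':', 1)[0]
        # Continuation lines of dropped keys never equal a kept key name.
        if ':' in line and key in keep:
            output.append(line)
    return text  # closing delimiter missing, leave the text alone


sample = (
    "---\n"
    "ID: 51\n"
    "permalink: >\n"
    "  https://myurl.com/slug\n"
    "published: true\n"
    "---\n"
    "The post body follows here.\n"
)
print(filter_front_matter(sample), end='')
```

Like the gawk one-liner, this treats the front matter as plain lines, so it shares the fragility of any approach that does not parse the YAML.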
You basically want to edit the file. That is what sed (stream editor) is for.
sed -E '1,/^---$/{/^(---|(ID|post_title|author|post_date|layout|published):)/!d;}' file
You also can use python-frontmatter:
import frontmatter
import glob
# Where are the files to modify
path = "*.markdown"
# Keys to keep in the front matter
keep = ['ID', 'post_title', 'author', 'post_date', 'layout', 'published']
# Loop through all files
for fname in glob.glob(path):
    with open(fname, 'r', encoding='utf8') as f:
        # Parse file's front matter
        post = frontmatter.load(f)
    # Iterate over a copy of the keys, since entries are deleted in the loop
    for k in list(post.metadata):
        if k not in keep:
            del post.metadata[k]
    # Save the modified file (frontmatter.dump writes encoded bytes)
    with open(fname, 'wb') as newfile:
        frontmatter.dump(post, newfile)
If you want to see more examples visit this page
Hope it helps.

Multiple Command Line Arguments in Python

In my python script, I am reading one text file. For that file, I am giving path to command line in UNIX as follows:
python My_script.py --d /fruit/apple/data1.txt
I am going to read one more file in same script. So I just wanted to know how to pass 2 arguments to get path to 2 files.
I have following code which is working perfectly for one argument.
parser=argparse.ArgumentParser()
parser.add_argument('--d', '--directory', required=True, action='store', dest='directory', default=False, help="provide directory name")
args=parser.parse_args()
file_apple=args.directory
A=open(file_apple)
file1=A.read()
so in my unix command line I write following and script runs successfully
python My_script.py --d /fruit/apple/data1.txt
Goal is to provide second argument as follows and want to read that file as the first one.
python My_script.py --d /fruit/apple/data1.txt --d /fruit/orange/data2.txt
I will appreciate your help on this.
You can make use of nargs.
parser=argparse.ArgumentParser()
parser.add_argument('-d', '--directory', nargs='+', required=True, action='store', dest='directory', default=False, help="provide directory name")
args=parser.parse_args()
file_apple=args.directory
print(file_apple)
...
I have set nargs to +, which means one or more arguments for that option, so you have to give at least one file path.
If you are sure that you will always have exactly two, or some other fixed number, you can specify that instead, e.g. nargs=3.
Now file_apple will be a list containing the paths you passed.
$ python My_script.py -d /fruit/apple/data1.txt
['/fruit/apple/data1.txt']
and:
$ python My_script.py -d /fruit/apple/data1.txt /fruit/orange/data2.txt
['/fruit/apple/data1.txt', '/fruit/orange/data2.txt']
PS: By convention, a single dash is used for single-character flags and a double dash for multi-character ones, like -d or --directory.
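If you would rather repeat the flag, as in the command line from the question, argparse's action='append' is an alternative to nargs. A minimal sketch, parsing an explicit argument list instead of sys.argv so it can be run as-is (build_parser is a name introduced here):

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # action='append' collects one value per occurrence of the flag.
    parser.add_argument('-d', '--directory', action='append', required=True,
                        dest='directory',
                        help='path to an input file (repeatable)')
    return parser


# Parsing an explicit list here instead of sys.argv, for illustration.
args = build_parser().parse_args(
    ['-d', '/fruit/apple/data1.txt', '-d', '/fruit/orange/data2.txt'])
print(args.directory)
```

Either way args.directory ends up as a list of paths, so the reading code can loop over it.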

Django makemessages ignore switch doesn't work for me

I have problems localizing a django-nonrel project, which is deployed to GAE. Because of GAE I have to put everything into my project folder, so it looks something like this
project
+ django
+ dbindexer
+ registration
+ myapp
...
+ locale
+ templates
I have strings to localize in templates directory, and in the myapp directory.
When I run python manage.py makemessages -l en --ignore django\* from the project dir, it crawls through all the directories of the project, including django, so I get quite a big po file. My strings from the templates are there, along with all of the strings from the django directory.
After --ignore (or just -i) I tried to put django and django/*, but nothing changed.
Any ideas?
./manage.py help makemessages
-i PATTERN, --ignore=PATTERN
Ignore files or directories matching this glob-style
pattern. Use multiple times to ignore more.
I have just tested it, and this command successfully ignored my application:
./manage.py makemessages -l da -i "django*"
But beware that before you test it, you should delete the old .po file, as I think it will not automatically remove the translation lines from your previous makemessages execution.
The problem is with the pattern - maybe the shell was expanding it for you.
In general - it is good to avoid path separators (whether / or \) in the pattern.
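To get a feel for why a separator-free pattern behaves better, here is a small illustration using fnmatch from the standard library. It does not reproduce Django's own matching logic; it only assumes that patterns end up being compared against bare directory names, and ignored is a helper name made up for the example.

```python
from fnmatch import fnmatchcase

# Top-level directory names from the question's project layout.
directories = ['django', 'dbindexer', 'registration', 'myapp', 'locale', 'templates']


def ignored(names, pattern):
    """Return the names that a glob-style pattern would match."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(ignored(directories, 'django*'))   # pattern without a path separator
print(ignored(directories, 'django/*'))  # pattern with a path separator
```

A bare name like django can never match a pattern that requires a literal / in it, which fits the observation that django/* changed nothing.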
If you need to always pass specific options to the makemessages command, you could consider your own wrapper, like this one, which I use myself:
from django.conf import settings
from django.core.management.base import BaseCommand
from django.core.management import call_command
class Command(BaseCommand):
    help = "Scan i18n messages without going into externals."
    def handle(self, *args, **options):
        call_command('makemessages',
                     all=True,
                     extensions=['html', 'inc'],
                     ignore_patterns=['externals*'])
This saves you typing, and gives a common entry point for scanning messages across the project (your translator colleague will not destroy translations by missing out some parameter).
Don't delete the old .po file once you have cleared it of the totally unwanted messages (i.e. those from the 'django' directory). This allows gettext to recycle old unused messages once they are used again (or similar ones, which will be marked as #, fuzzy).
Edit - as mt4x noted - the wrapper above doesn't allow for passing the options to the wrapped command. This is easy to fix:
from django.core.management import call_command
from django.core.management.commands.makemessages import (
    Command as MakeMessagesCommand
)
class Command(MakeMessagesCommand):
    help = "Scan i18n messages without going into externals."
    def handle(self, *args, **options):
        options['all'] = True
        options['extensions'] = ['html', 'inc']
        if 'ignore_patterns' not in options:
            options['ignore_patterns'] = []
        options['ignore_patterns'] += ['externals*']
        call_command('makemessages', **options)
Thus - you can fix what needs to be fixed, and flex the rest.
This need not be a blind override like the one above; it can also be a conditional edit of the parameters passed to the command, such as appending something to a list or adding a value only when it is missing.