Create/Edit MS Word & Word Perfect docs in Django? - django

Is it possible to create and/or edit MS Word and WordPerfect documents with Django? I'd like to allow the user to fill out a form and have the form fields inserted into an MS Word/WordPerfect document. Or, the form fields are used to create a new MS Word/WordPerfect document. The user can then send that document via email to others who may not have access to the Django web-app.
I have a client who needs this functionality and I'd like to keep it all within the web-app.
Any ideas?
Thanks!

For MS Word, you could use docx-mailmerge. Run the following commands to install lxml (a dependency required by docx-mailmerge) and docx-mailmerge:
conda install lxml
pip install docx-mailmerge
In order for docx-mailmerge to work correctly, you need to create a standard Word document and define the appropriate merge fields. The examples below are for Word 2010. Other versions of Word should be similar. It actually took me a while to figure out this process but once you do it a couple of times, it is pretty simple.
Start Word and create the basic document structure. Then place the cursor in the location where the merged data should be inserted and choose Insert -> Quick Parts -> Field..:
(Screenshot: Word Quick Parts menu)
From the Field dialog box, select the “MergeField” option from the Field Names list. In the Field Name, enter the name you want for the field. In this case, we are using Business Name.
(Screenshot: Word Field dialog)
Once you click OK, you should see the merge field (something like «Business») in the Word document. You can go ahead and create the document with all the needed fields.
from __future__ import print_function
from mailmerge import MailMerge
from datetime import date

template = "Practical-Business-Python.docx"
document = MailMerge(template)
document.merge(
    status='Gold',
    city='Springfield',
    phone_number='800-555-5555',
    Business='Cool Shoes',
    zip='55555',
    purchases='$500,000',
    shipping_limit='$500',
    state='MO',
    address='1234 Main Street',
    date='{:%d-%b-%Y}'.format(date.today()),
    discount='5%',
    recipient='Mr. Jones')
document.write('test-output.docx')
More at http://pbpython.com/python-word-template.html
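If you need to hand the merged file back to the user from within the Django app (as the question asks), a minimal sketch of a view could look like the one below. The form field names, template path and view name here are assumptions for illustration, not part of docx-mailmerge.
from io import BytesIO
from django.http import HttpResponse
from mailmerge import MailMerge

def merged_letter(request):
    # Hypothetical: the merge values come from a submitted form (request.POST)
    document = MailMerge("Practical-Business-Python.docx")
    document.merge(
        Business=request.POST.get('business', ''),
        recipient=request.POST.get('recipient', ''))
    # Write the result to an in-memory buffer and return it as a download
    buf = BytesIO()
    document.write(buf)
    response = HttpResponse(
        buf.getvalue(),
        content_type='application/vnd.openxmlformats-officedocument.wordprocessingml.document')
    response['Content-Disposition'] = 'attachment; filename="letter.docx"'
    return response
The user could then download the file and forward it by email, or you could attach the same buffer to a Django EmailMessage.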

I do not know how to do exactly what you ask for, but I would suggest that you also look into creating PDFs with Django. If you only want to send information in a particular format, then PDF might be better because it is more portable across platforms. You may also want to look at this documentation for how to send emails from Django.
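For example, a minimal PDF view along the lines of the ReportLab example in the Django documentation might look like this (the view name and file name are placeholders):
import io
from django.http import FileResponse
from reportlab.pdfgen import canvas

def pdf_view(request):
    buffer = io.BytesIO()
    # Draw some text onto a single PDF page
    p = canvas.Canvas(buffer)
    p.drawString(100, 750, "Hello from Django.")
    p.showPage()
    p.save()
    buffer.seek(0)
    return FileResponse(buffer, as_attachment=True, filename="hello.pdf")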

Related

OTRS search via deep link possible?

I want to link to our OTRS (version 5) -- is it possible to create a URL in such a way that a special parametrized search is performed within OTRS?
I'd like to link from a webpage to something like:
https://otrs.charite.de?Ralf.Hildebrandt#charite.de
and that should display all tickets in the queue XYZ and customeruser == Ralf.Hildebrandt#charite.de
Unfortunately that's not possible out of the box - at least not in the way you probably intend to use it. All real search functions in OTRS (that I'm aware of) use HTTP POST. A parameterized POST request via a URL is possible in principle, but strongly discouraged, and it wouldn't really work if you intend to store those searches as bookmarks or something like that.
The good news - you can create a saved search and trigger that via a URL like this:
https://url.to.otrs.de/otrs/index.pl?Action=AgentTicketSearch;Subaction=Search;TakeLastSearch=1;SaveProfile=1;Profile=current%20Changes
In this case, Profile=current%20Changes would be replaced by the name of your search profile (special characters must be URL-encoded).
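If you build such links programmatically, the only tricky part is the encoding of the profile name. A small sketch (the base URL and profile name are placeholders):
from urllib.parse import quote

base = "https://url.to.otrs.de/otrs/index.pl"
profile = "current Changes"  # name of the saved search profile
url = (base + "?Action=AgentTicketSearch;Subaction=Search;"
       "TakeLastSearch=1;SaveProfile=1;Profile=" + quote(profile))
print(url)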

How can I improve this piece of code (scraping with Python)?

I'm quite new to programming, so I apologise if my question is too trivial.
I've recently taken some Udacity courses like "Intro to Computer Science", "Programming foundations with Python" and some others.
The other day my boss asked me to collect some email addresses from certain websites. Some of them had many addresses on the same page, so a bell rang and I thought of creating my own code to do the repetitive task of collecting the emails and pasting them into a spreadsheet.
So, after reviewing some of the lessons from those courses plus some videos on YouTube, I came up with this code.
Notes: It's written in Python 2.7.12 and I'm using Ubuntu 16.04.
import xlwt
from bs4 import BeautifulSoup
import urllib2

def emails_page2excel(url):
    # Create html file from a given url
    sauce = urllib2.urlopen(url).read()
    soup = BeautifulSoup(sauce, 'lxml')
    # Create the spreadsheet book and a page in it
    wb = xlwt.Workbook()
    sheet1 = wb.add_sheet('Contacts')
    # Find the emails and write them in the spreadsheet table
    count = 0
    for url in soup.find_all('a'):
        link = url.get('href')
        if link.find('mailto') != -1:
            start_email = link.find('mailto') + len('mailto:')
            email = link[start_email:]
            sheet1.write(count, 0, email)
            count += 1
    wb.save('This is an example.xls')
The code runs fine and it's quite quick. However I'd like to improve it in these ways:
I got the feeling that the for loop could be done in a more elegant way. Is there any other way to look for the email besides the string find? Maybe in a similar way to how I found the 'a' tags?
I'd like to be able to evaluate this code with a list of websites (most likely in a spreadsheet) instead of evaluating it only with a URL string. I haven't had time to research how to do this yet, but any suggestion is welcome.
Last but not least, I'd like to ask if there's any way to implement this script in some sort of user-friendly mini-programme. For instance, my boss is totally bad at computers: I can't imagine her opening a terminal shell and executing the Python code. Instead I'd like to create some programme where she could just paste the URL, or upload a spreadsheet with the websites she wants to extract the emails from, select whether she wants to extract emails or any other information, maybe some more features, and then click a button and get the result.
I hope I've expressed myself clearly.
Thanks in advance,
Anqin
As far as BeautifulSoup goes, you can search for emails in a tags in three ways:
1) Use find_all with a lambda to search all tags that are a, have href as an attribute, and whose href value contains mailto:.
for email in soup.find_all(lambda tag: tag.name == "a" and "href" in tag.attrs and "mailto:" in tag.attrs["href"]):
    print(email["href"][7:])
2) Use find_all with regex to find mailto: in an a tag.
for email in soup.find_all("a", href=re.compile("mailto:")):
print (email["href"][7:])
3) Use select to find an a tag on which its href attribute starts with mailto:.
for email in soup.select('a[href^=mailto]'):
    print(email["href"][7:])
This is my personal preference, but I prefer requests over urllib. It's far simpler, better at error handling, and safer when threading.
As for your other questions, you can create a method for fetching, parsing and returning the results that you want, and pass a URL as a parameter. You would then only need to loop over your list of URLs and call that method; a sketch is shown below.
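Here is one way that could look, assuming requests is installed and the URL list is already a plain Python list (reading it from a spreadsheet is left out):
import re
import requests
from bs4 import BeautifulSoup

def emails_from_page(url):
    # Fetch one page and return the addresses found in mailto: links
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'lxml')
    return [a["href"][7:] for a in soup.find_all("a", href=re.compile("mailto:"))]

urls = ["http://example.com/contact", "http://example.org/about"]
for url in urls:
    print(emails_from_page(url))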
As for your boss, you should build a GUI.
Have fun coding.

Google Analytics Filter Set of Pages

I need to create a filter on Google Analytics to include only a set of pages, for example, the view will have a filter to collect data only from
www.example.com/page1.html
www.example.com/page2.html
www.example.com/page3.html
I am trying to achieve this by using a Custom Filter to Include -> Request URI and using a regex as the Filter Pattern.
My problem is that the regex exceeds the 255-character limitation, even after I tried to optimise the regex to be as small as possible.
Creating more than one Include filter does not work because that way no data would be collected, so I am wondering how I could achieve this. Thank you
This is the original regex
/es/investigacion/lace\.html|/en/research/lace\.html|/es/investigacion/lace/programa-maestrias\.html|/es/investigacion/lace/alumni/latin-american-forum-entrepreneurs\.html|/es/investigacion/lace/alumni/fondo-angel-investment\.html|/es/investigacion/lace/alumni\.html|/es/investigacion/lace/fondo-inversion\.html|/es/investigacion/lace/investigacion\.html|/es/investigacion/lace/acerca-del-centro\.html|/es/investigacion/lace/alumni/estudiantes-del-pais\.html|/en/research/investigation\.html|/en/research/about-the-center\.html|/es/investigacion/lace/alumni/mentoring\.html|/es/investigacion/lace/alumni/reatu-entrepreneur-award\.html|/en/research/lace/master-program\.html|/en/research/lace/alumni\.html|/en/research/investment-fund\.html
Edit: first try to compress the regex
/es/investigacion/lace/(programa-maestrias|alumni|investigacion|programa-maestrias|alumni/latin-american-forum-entrepreneurs|alumni/fondo-angel-investment|fondo-inversion|investigacion|acerca-del-centro|alumni/estudiantes-del-pais|alumni/mentoring|alumni/incae-entrepreneur-award)\.html|
Edit: the reason for this is that I need to create a new user profile on GA, and this new profile will have access to the information from a set of URLs only; so what occurred to me is to create a new View that only captures the information for this set of URLs, and then assign the profile to this view with "Read/Analyze" permissions.
There are definitely more ways to optimise the regex. For example, since all string options end with .html, you could do something like this:
/(es/investigacion/lace|en/research/lace)\.html
by taking the .html out.
You could also take out
/es/investigacion/lace
and weave in the variable part of that using |'s, eg.
/es/investigacion/lace/(programa-maestrias|alumni|investigacion)\.html
But try a few of those optimisation techniques and you should be able to fit more in.
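If you want to check a compacted pattern before pasting it into GA, a quick local test like the one below helps. The pattern here is only an illustration of the grouping idea (it matches alumni sub-pages more loosely than the original list), and the page list is a small sample of the paths from the question.
import re

pattern = (r"/es/investigacion/lace(/(programa-maestrias|alumni(/[a-z-]+)?|"
           r"fondo-inversion|investigacion|acerca-del-centro))?\.html"
           r"|/en/research/(lace(/(master-program|alumni))?|investigation|"
           r"about-the-center|investment-fund)\.html")

pages = [
    "/es/investigacion/lace.html",
    "/en/research/lace.html",
    "/es/investigacion/lace/alumni/mentoring.html",
]

print(len(pattern))                               # GA filter patterns must stay under 255 characters
print(all(re.search(pattern, p) for p in pages))  # every sample page should still match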

Mediawiki mass user delete/merge/block

I have 500 or so spambots and about 5 actual registered users on my wiki. I have used nuke to delete their pages but they just keep reposting. I have spambot registration under control using reCaptcha. Now, I just need a way to delete/block/merge about 500 users at once.
You could just delete the accounts from the user table manually, or at least disable their authentication info with a query such as:
UPDATE /*_*/user SET
    user_password = '',
    user_newpassword = '',
    user_email = '',
    user_token = ''
WHERE
    /* condition to select the users you want to nuke */
(Replace /*_*/ with your $wgDBprefix, if any. Oh, and do make a backup first.)
Wiping out the user_password and user_newpassword fields prevents the user from logging in. Also wiping out user_email prevents them from requesting a new password via email, and wiping out user_token drops any active sessions they may have.
Update: Since I first posted this, I've had further experience of cleaning up large numbers of spam users and content from a MediaWiki installation. I've documented the method I used (which basically involves first deleting the users from the database, then wiping out all the now-orphaned revisions, and finally running rebuildall.php to fix the link tables) in this answer on Webmasters Stack Exchange.
Alternatively, you might also find Extension:RegexBlock useful:
"RegexBlock is an extension that adds special page with the interface for blocking, viewing and unblocking user names and IP addresses using regular expressions."
There are risks involved in applying the solution in the accepted answer. The approach may damage your database! It removes users incompletely, does nothing to preserve referential integrity, and will almost certainly cause display errors.
Here a much better solution is presented (a prerequisite is that you have installed the User Merge extension):
I have a little awkward way to accomplish the bulk merge through a work-around. Hope someone finds it useful! (You must have some string concatenation skills in spreadsheets, or you may use a Python or similar script, or a text editor with bulk replacement features.)
1. Prepare a list of all spam user IDs and store them in a spreadsheet or text file. The list may be prepared from the user creation logs. If you have DB access, the wiki user table can be imported into a local list.
2. The POST method used for submitting the Merge & Delete User form (by clicking the button) should be converted to a GET method. This will give us a long URL. See the second comment (by Matthew Simoneau, dated 13/Jan/2009) at http://www.mathworks.com/matlabcentral/newsreader/view_thread/242300 for the method.
3. The resulting URL string should be something like the one below:
http://(Your Wiki domain)/Special:UserMerge?olduser=(OldUserNameHere)&newuser=(NewUserNameHere)&deleteuser=1&token=0d30d8b4033a9a523b9574ccf73abad8%2B\
Now, divide this URL into four sections:
A: http://(Your Wiki domain)/Special:UserMerge?olduser=
B: (OldUserNameHere)
C: &newuser=(NewUserNameHere)&deleteuser=1
D: &token=0d30d8b4033a9a523b9574ccf73abad8%2B\
4. Now, using a text editor or spreadsheet, prefix each spam user ID with part A and suffix each with parts C and D. Part C will include the new user (a specially created single dummy user ID). Part D, the token string, is a session-dependent token that changes per user per session, so you will need to get a new token every time a new session/batch of work is started.
5. With the above step, you should get a long list of URLs, each good for one Merge & Delete operation for one user. We can now create a simple HTML file, view it, and use a batch downloader like DownThemAll in Firefox.
6. Add two more pieces to each line: <a href=" at the beginning and ">Linktext</a> at the end. Also add <html><body> at the top and </body></html> at the bottom, and save the file as (for example) userlist.html.
7. Open the file in Firefox, use the DownThemAll add-on, and download all the files! Effectively, you are visiting the Merge & Delete page for each user and clicking the button!
Although this might look like a lengthy and tricky job at first, once you follow this method, you can remove tens of thousands of users without much manual effort.
You can verify that the operation is going well by opening some of the downloaded HTML files (or by looking through the recent changes in another window).
One advantage is that it does not directly edit the MySQL tables, nor does it require direct database access.
I did a bit of rewriting to the quoted text, since the original text contains some flaws.
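Since the quoted method itself suggests that a Python (or similar) script could do the string concatenation, here is a minimal sketch under that assumption. The wiki domain, dummy target user, token and spam-user list are placeholders you would substitute with your own values, and the token still has to be refreshed per session.
from urllib.parse import quote

wiki = "https://wiki.example.org"                 # placeholder wiki domain
new_user = "SpamDump"                             # the single dummy target account
token = "0d30d8b4033a9a523b9574ccf73abad8%2B\\"   # session token, copied from a fresh form submission
spam_users = ["Spammer1", "Spammer2"]             # read these from your spreadsheet/text file

with open("userlist.html", "w") as f:
    f.write("<html><body>\n")
    for user in spam_users:
        url = (wiki + "/Special:UserMerge?olduser=" + quote(user)
               + "&newuser=" + quote(new_user)
               + "&deleteuser=1&token=" + token)
        f.write('<a href="%s">%s</a><br>\n' % (url, user))
    f.write("</body></html>\n")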

Django url pattern to retrieve query string parameters

I was about ready to start giving a jqgrid in a django app greater functionality (pagination, searching, etc). In order to do this it looks as though jqgrid sends its parameters in the GET to the server. I plan to write an urlpattern to pull out the necessary stuff (page number, records per page, search term, etc) so I can pass it along to my view to return the correct rows to the grid. Has anyone out there already created this urlpattern I am in search of?
Thanks much.
The answer to this was simpler than I realized. As stated in Chapter 7 of the djangobook, in the section titled "Query string parameters", one can simply do something like the following, where "someParam" is the parameter in the query string you want to retrieve. However, Django is designed to keep the address bar at the top of the page clean, so you should only use this option if you must.
The query string might look something like this.
http://somedomainname.com/?someParam=1
The view might look like this.
def someView(request):
    if 'someParam' in request.GET and request.GET['someParam']:
        someParam = request.GET['someParam']
Hopefully this is of some help to someone else down the road.
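Side note (not from the original answer): request.GET is a QueryDict, so an equivalent and slightly shorter form, using the same hypothetical parameter name, is:
from django.http import HttpResponse

def some_view(request):
    # .get() returns the default value when the parameter is missing
    some_param = request.GET.get('someParam', '')
    return HttpResponse("someParam = %s" % some_param)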