How to include multiple items in a tweepy status update - python-2.7

Hobbyist Pythoner here. I am using Tweepy to build a Twitter bot that responds to requests for links to research documents. I would like the response to look like this:
Here you go #UserName: http://thelinkgoeshere.com
In the code below the variable "link" refers to a URL looked up from a list.
The code I have been using is
api.update_status("Here you go: #" + tweet.user.screen_name, link)
But that isn't working. I can get it to tweet the first part with the user's screen name, or the link by itself, but not both. What am I doing wrong?

Try this:
api.update_status("Here you go #{}: {}".format(tweet.user.screen_name, link))
Here's a link to help understand what format is doing: https://pyformat.info/.
I suspect that api.update_status takes the status text as a single string; your code passes link as a second positional argument, which tweepy does not append to the tweet.
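For completeness, here is roughly how the reply might look in context (a sketch, assuming api is an authenticated tweepy.API instance and that tweet and link come from your existing lookup code):

# Build the whole status as one string, then post it in a single call.
status_text = "Here you go #{}: {}".format(tweet.user.screen_name, link)
# in_reply_to_status_id is optional; it threads the reply under the
# original tweet (supported as a keyword argument in tweepy 3.x).
api.update_status(status_text, in_reply_to_status_id=tweet.id)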

Related

Regex in add_rewrite_rule for WordPress for year and episode name

I think this is my last problem to solve before everything clicks into place:
I have a homepage with a custom plugin that sends some data to another page, and I am building a theme for this website that works with the plugin.
So in the theme's functions.php I have added:
function myvar($vars) {
    $vars[] = 'ep_year';
    $vars[] = 'ep_name';
    return $vars;
}
add_filter('query_vars', 'myvar');
It works a charm; I can send those values over to a custom page with a custom template assigned.
This has a permalink as follows:
http://ngofwp.local/episode/
and when I send the data over it looks like so:
http://ngofwp.local/episode/?ep_year=2011&ep_name=silly_donkey
I know I have to use an add_rewrite_rule so I've started coding that as follows:
function custom_rewrite_rule()
{
    add_rewrite_rule('^episode/([^/]*)/([^/]*)/([^/]*)?', 'ep_year=$matches[1]&ep_name=$matches[2]', 'top');
}
add_action('init', 'custom_rewrite_rule');
(this is pasted from an example I found)
But now, for the life of me, I have no clue about the formula to get it to work. I have read the regex rules but I'm not sure how to apply them here.
What I'd like it to look like is this:
http://ngofwp.local/episode/2011/silly_donkey
The
http://ngofwp.local/episode/
part is given by WordPress's permalink settings.
Would anyone regex savvy be able to help me out please?
We are a podcast that talks about new games on old platforms, and I am migrating the site from Node.
Thank you very much in advance

How can I improve this piece of code (scraping with Python)?

I'm quite new to programming so I apologise if my question is too trivial.
I've recently taken some Udacity courses like "Intro to Computer Science", "Programming foundations with Python" and some others.
The other day my boss asked me to collect some email addresses from certain websites. Some of them had many addresses on the same page, so a bell rang and I thought of writing my own code to do the repetitive task of collecting the emails and pasting them into a spreadsheet.
So, after reviewing some of the lessons from those courses plus some videos on YouTube, I came up with this code.
Notes: It's written in Python 2.7.12 and I'm using Ubuntu 16.04.
import xlwt
from bs4 import BeautifulSoup
import urllib2

def emails_page2excel(url):
    # Create html file from a given url
    sauce = urllib2.urlopen(url).read()
    soup = BeautifulSoup(sauce, 'lxml')
    # Create the spreadsheet book and a page in it
    wb = xlwt.Workbook()
    sheet1 = wb.add_sheet('Contacts')
    # Find the emails and write them in the spreadsheet table
    count = 0
    for url in soup.find_all('a'):
        link = url.get('href')
        if link.find('mailto') != -1:
            start_email = link.find('mailto') + len('mailto:')
            email = link[start_email:]
            sheet1.write(count, 0, email)
            count += 1
    wb.save('This is an example.xls')
The code runs fine and it's quite quick. However, I'd like to improve it in these ways:
1) I get the feeling that the for loop could be done in a more elegant way. Is there any other way to look for the email besides the string find? Perhaps in a similar way to how I found the 'a' tags?
2) I'd like to be able to run this code over a list of websites (most likely kept in a spreadsheet) instead of a single url string. I haven't had time to research how to do this yet, but any suggestion is welcome.
3) Last but not least, I'd like to ask if there's any way to wrap this script in some sort of friendly-to-use mini-programme. For instance, my boss is totally bad at computers: I can't imagine her opening a terminal shell and executing the Python code. Instead I'd like to create some programme where she could just paste the url, or upload a spreadsheet with the websites she wants to extract the emails from, select whether she wants emails or other information, maybe some more features, and then click a button and get the result.
I hope I've expressed myself clearly.
Thanks in advance,
Anqin
As far as BeautifulSoup goes, you can search for emails in a tags in three ways:
1) Use find_all with a lambda to match tags that are named a, have an href attribute, and whose href value contains mailto:.
for email in soup.find_all(lambda tag: tag.name == "a" and "href" in tag.attrs and "mailto:" in tag.attrs["href"]):
    print(email["href"][7:])
2) Use find_all with a regex to find mailto: in an a tag (this one needs import re at the top of the script).
for email in soup.find_all("a", href=re.compile("mailto:")):
    print(email["href"][7:])
3) Use select with a CSS attribute selector to find a tags whose href value starts with mailto:.
for email in soup.select('a[href^=mailto]'):
    print(email["href"][7:])
This is personal preference, but I'd use requests over urllib2: it's far simpler, better at error handling, and safer when threading.
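For example, the fetch at the top of your function could become something like this (a sketch; requests is a third-party package, so you would pip install requests first):

import requests
from bs4 import BeautifulSoup

response = requests.get(url, timeout=10)
response.raise_for_status()  # raise an exception on HTTP error codes
soup = BeautifulSoup(response.text, 'lxml')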
As for your other questions: you can write a function that fetches, parses, and returns the results you want, taking a url as a parameter. Then you only need to loop over your list of urls and call that function, as sketched below.
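A minimal sketch of that refactor (scrape_emails is a hypothetical name; it reuses the imports from your script, plus re):

import re
import urllib2
from bs4 import BeautifulSoup

def scrape_emails(url):
    # Fetch and parse one page, returning the email addresses found on it.
    sauce = urllib2.urlopen(url).read()
    soup = BeautifulSoup(sauce, 'lxml')
    # [7:] strips the leading 'mailto:' from each matching href
    return [a['href'][7:] for a in soup.find_all('a', href=re.compile('mailto:'))]

urls = ['http://example.com/contact', 'http://example.org/about']  # your list here
for url in urls:
    for email in scrape_emails(url):
        print(email)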
For your boss question, you should wrap the script in a GUI, along these lines:
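For instance, a bare-bones Tkinter window (Tkinter ships with Python 2.7, so there is nothing extra to install; this sketch assumes a scrape_emails function like the one above):

import Tkinter as tk  # the module is spelled tkinter in Python 3
import tkMessageBox

def run_scrape():
    emails = scrape_emails(url_entry.get())  # hypothetical helper from above
    tkMessageBox.showinfo('Done', 'Found {} email(s)'.format(len(emails)))

root = tk.Tk()
root.title('Email scraper')
url_entry = tk.Entry(root, width=50)
url_entry.pack(padx=10, pady=5)
tk.Button(root, text='Scrape', command=run_scrape).pack(pady=5)
root.mainloop()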
Have fun coding.

Twitter search with urllib2 failing

I am trying to search Twitter for a given search term with the following code:
from bs4 import BeautifulSoup
import urllib2
link = "https://twitter.com/search?q=stackoverflow%20since%3A2014-11-01%20until%3A2015-11-01&src=typd&vertical=default"
page = urllib2.urlopen(link).read()
soup = BeautifulSoup(page)
first = soup.find_all('p')
(Replace "stackoverflow" in link with any search term you want.) However, when I do this (and every time I have tried for the past few days, thinking Twitter might be too bogged down), I get this error:
No results.
Twitter may be over capacity or experiencing a momentary hiccup.
(HTML in the BeautifulSoup results omitted for easier viewing.)
This code used to work for me, but now it does not. Additionally, plugging link directly into a browser gives the correct result, and the Twitter status page shows all is well.
Thoughts?
I was able to reproduce your results. I believe Twitter is serving this message to discourage scraping. That makes sense: they have taken the time to publish an API for people to access their data, so they have little reason to support scraping the HTML.
My advice is to use their API, which is documented here: https://dev.twitter.com/overview/documentation
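With tweepy, for example, the same search might look roughly like this (a sketch: the four credential values are placeholders you get by registering an app, and in newer tweepy versions the method is called search_tweets rather than search):

import tweepy

# Placeholder credentials from your registered Twitter app
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)

# The API accepts the same since:/until: operators as the web search box
for status in api.search(q='stackoverflow since:2014-11-01 until:2015-11-01'):
    print(status.text)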

How can I retrieve all comments on a Facebook post using the php SDK?

I'm building an app which allows users to post articles to their Facebook wall. When an article is posted, I retrieve the post id and store it in the database along with the rest of the article details. Now I want to show the comments made on that post when someone views the article on my site; I would also like to allow users to add comments to the post from my site.
I know that the user is always logged into Facebook when they are viewing the article, as the system checks for that earlier on.
I've been using the PHP SDK, and thought all I had to do was something like:
$post_comments = $facebook->api('/' . $post_id . '/comments');
However, when I do this, I get the following error:
Fatal error: Uncaught GraphMethodException: Unsupported get request. thrown in /APP_PATH/facebook/src/facebook.php on line 560
I really don't have much of a clue what I'm doing here, to be honest, as I'm very new to the Facebook Graph API, and I can't seem to find a lot of documentation on it.
Can anyone tell me what I should be doing here, or point me to some documentation I could read about it?
Thanks!
It should work.
Here is the code I am using, which works for me.
$comments = $facebook->api($postid . '/comments');
Make sure your postid is a valid one.
Alternatively, you can type that URL directly into your browser to get the details, like this:
https://graph.facebook.com/<postedid>/comments
Please refer to this link for further reference:
http://developers.facebook.com/docs/reference/api/Comment/
I don't know what your PHP library is doing, but you can actually access comments by reading graph.facebook.com/<post_id>/comments. Indeed, try the example from the docs.
Are you sure of your post id? Try calling the buggy function with 19292868552_118464504835613 as the post id. It has to work.

Django url pattern to retrieve query string parameters

I was about ready to start giving a jqGrid in a Django app greater functionality (pagination, searching, etc.). To do this, it looks as though jqGrid sends its parameters in the GET request to the server. I plan to write a urlpattern to pull out the necessary stuff (page number, records per page, search term, etc.) so I can pass it along to my view and return the correct rows to the grid. Has anyone out there already created the urlpattern I am in search of?
Thanks much.
The answer to this was simpler than I realized: query string parameters never reach the urlpattern at all, because Django's URL resolver matches only the path, and the parameters are available on the request object instead. As stated in Chapter 7 of the djangobook, in the section titled "Query string parameters", one can simply do something as follows, where someParam is the query string parameter you want to retrieve. However, Django is designed to keep the address bar clean, so you should only use this option if you must.
The query string might look something like this.
http://somedomainname.com/?someParam=1
The view might look like this.
def someView(request):
    if 'someParam' in request.GET and request.GET['someParam']:
        someParam = request.GET['someParam']
Hopefully this is of some help to someone else down the road.
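Applied to jqGrid specifically, the view might pull out the grid's default parameter names like this (a sketch; grid_rows is a hypothetical view name, and jqGrid's parameter names are configurable via its prmNames option, so check yours):

import json
from django.http import HttpResponse

def grid_rows(request):
    page = int(request.GET.get('page', 1))    # current page number
    rows = int(request.GET.get('rows', 10))   # records per page
    sidx = request.GET.get('sidx', 'id')      # column to sort on
    sord = request.GET.get('sord', 'asc')     # 'asc' or 'desc'
    search = request.GET.get('searchString', '')  # set when _search=true
    start, end = (page - 1) * rows, page * rows
    # ...filter and order your queryset with these values, slice it
    # [start:end], then serialise it in the shape jqGrid expects...
    return HttpResponse(json.dumps({'page': page, 'rows': []}),
                        content_type='application/json')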