How to include an argument while requesting for api in django? - django

I am trying to fetch live cricket scores from the Sportmonks cricket API and CricAPI using Django.
Everything works fine until I request an endpoint with a unique id that is stored in a variable.
Doing so always raises a KeyError.
My request:
resL = json.loads(requests.get(
'https://cricket.sportmonks.com/api/v2.0/fixtures/id?api_token').text)
here 'id' is a variable for a particular fixture.
It works fine while putting any numerical value instead of a variable.
The same happens with this URL:
resS = json.loads(requests.get(
    'http://cricapi.com/api/fantasySummary/?apikey=123&unique_id=id').text)
I don't understand what I am doing wrong here.

You need to indicate that the id param in your url is a variable. The way you have it written, it's just a string with the characters id.
One way to accomplish this is with f-strings:
id = 543  # or whatever your id is
api_token = 123
resL = json.loads(requests.get(
    f'https://cricket.sportmonks.com/api/v2.0/fixtures/{id}?api_token={api_token}').text)
resS = json.loads(requests.get(
    f'http://cricapi.com/api/fantasySummary/?apikey={api_token}&unique_id={id}').text)
When python parses these strings, it will replace {id} with the id you've defined as a variable elsewhere. Please note that you have to include the f at the beginning of the string.
Another way is with the older 'str'.format() method, which works as follows:
id = 543  # or whatever your id is
api_token = 123
resL = json.loads(requests.get(
    'https://cricket.sportmonks.com/api/v2.0/fixtures/{}?api_token={}'.format(id, api_token)).text)
resS = json.loads(requests.get(
    'http://cricapi.com/api/fantasySummary/?apikey={}&unique_id={}'.format(api_token, id)).text)
Here, the arguments to format() are substituted into the empty brackets in order, so they must be passed in the same order as the placeholders appear in the URL. (You do not need the f prefix here.)
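As a further option, the query string can be assembled with the standard library rather than by hand, which also escapes any special characters in the values. The following is only a sketch: fixture_url and fixture_id are hypothetical names, the api_token parameter name is taken from the URL in the question, and the resulting string would be passed to requests.get() exactly as before.

```python
from urllib.parse import urlencode

# Base endpoint copied from the question.
BASE = 'https://cricket.sportmonks.com/api/v2.0/fixtures'


def fixture_url(fixture_id, api_token):
    """Build the fixture URL for one id (hypothetical helper)."""
    # urlencode turns the dict into "api_token=..." and escapes the value.
    query = urlencode({'api_token': api_token})
    return f'{BASE}/{fixture_id}?{query}'


print(fixture_url(543, 123))
```

Naming the variable fixture_id rather than id also avoids shadowing Python's built-in id() function.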

Related

Python Data Scraping (using Xpath) - Returning empty lists and stripping characters

I am attempting to scrape information from the website:
http://www.forexfactory.com/#tradesPositions
Now, I used to have one up and running which this forum helped me get going, but I think something has changed on the website and the script I had no longer works.
What do I need?
I would like to scrape the number of 'short' and 'long' positions for AUDUSD, EURUSD, GBPUSD, USDJPY, USDCAD, NZDUSD and USDCHF.
NOT the percentages, the actual number of traders.
What have I done?
This is for EURUSD
import lxml.html
from selenium import webdriver
driver = webdriver.Chrome(r"C:\Users\MY NAME\Downloads\Chrome Driver\chromedriver.exe")
url = ('http://www.forexfactory.com/#tradesPositions')
driver.get(url)
tree = lxml.html.fromstring(driver.page_source)
results_short = tree.xpath('//*[@id="flexBox_flex_trades/positions_tradesPositionsCopy1"]/div[1]/table/tbody/tr/td[2]/div[1]/ul[1]/li[2]/span/text()')
results_long = tree.xpath('//*[@id="flexBox_flex_trades/positions_tradesPositionsCopy1"]/div[1]/table/tbody/tr/td[2]/div[1]/ul[1]/li[1]/span/text()')
print "Forex Factory"
print "Traders Short EURUSD:",results_short
print "Traders Long EURUSD:",results_long
driver.quit()
This returns
Forex Factory
Traders Short EURUSD: ['337 Traders ', ' ']
Traders Long EURUSD: [' 259 Traders']
I would like to strip everything away from the result except for the numbers. I've tried .strip() and .replace() but neither work on a list. Which will come as no surprise to you guys I don't think!
Empty List
When I apply the same technique to AUDUSD I get an empty list.
import lxml.html
from selenium import webdriver
driver = webdriver.Chrome(r"C:\Users\Andrew G\Downloads\Chrome Driver\chromedriver.exe")
url = ('http://www.forexfactory.com/#tradesPositions')
driver.get(url)
tree = lxml.html.fromstring(driver.page_source)
results_short = tree.xpath('//*[@id="flexBox_flex_trades/positions_tradesPositionsCopy1"]/div[6]/table/tbody/tr/td[2]/div[1]/ul[1]/li[2]/span/text()')
results_long = tree.xpath('//*[@id="flexBox_flex_trades/positions_tradesPositionsCopy1"]/div[6]/table/tbody/tr/td[2]/div[1]/ul[1]/li[1]/span/text()')
s2 = results_short
l2 = results_long
print "Traders Short AUDUSD:",s2
print "Traders Long AUDUSD:",l2
This returns
Traders Short AUDUSD: []
Traders Long AUDUSD: []
What gives? Is the XPath not working? I just used Chrome's 'inspect element' feature, navigated to the desired number, and copied the path. I used the same method for EURUSD.
Ideally, I would like to set up a list of div numbers to insert into tree.xpath instead of repeating the same lines of code for every currency. So, in the XPath where it has:
/div[number]/
it would be nice to have a list, e.g. [1,2,3,4,5,6], to insert there, because the rest of the XPath is the same for all currencies. That is an optional bonus, though; the priority is to get a result for all the currencies listed.
THANKS
You can remove the surrounding whitespace from your results with the strip method, as you mentioned. Deleting items from a list while looping over its indexes can raise an IndexError, so it is safer to build a new list. Here is my sample code:
# Strip each entry and drop the ones that end up empty.
results_short = [text.strip() for text in results_short if text.strip()]
results_long = [text.strip() for text in results_long if text.strip()]
You get no result for AUDUSD because its values are not loaded into the page until you click the "expand" button. However, I found that you can get the result from the following page: http://www.forexfactory.com/trades.php
So you can change the value of url to:
url = ('http://www.forexfactory.com/trades.php')
On this page the id of the container element is different, so you need to update your XPath expressions to:
results_short = tree.xpath('//*[@id="flexBox_flex_trades/positions_tradesPositions"]/div[6]/table/tbody/tr/td[2]/div[1]/ul[1]/li[2]/span/text()')
results_long = tree.xpath('//*[@id="flexBox_flex_trades/positions_tradesPositions"]/div[6]/table/tbody/tr/td[2]/div[1]/ul[1]/li[1]/span/text()')
Then apply strip as described above, and you should get the correct results.
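For the two remaining wishes in the question (numbers only, and one XPath reused for every currency), a rough sketch could look like the following. It assumes the XPath from this answer, written with the standard @id attribute test, and it assumes each text node contains at most one number; trader_count and build_xpaths are hypothetical helper names, and the mapping of div numbers to currencies would still have to be checked against the live page.

```python
import re

# XPath from the answer, with the div and li positions left as placeholders.
XPATH_TEMPLATE = (
    '//*[@id="flexBox_flex_trades/positions_tradesPositions"]'
    '/div[{div}]/table/tbody/tr/td[2]/div[1]/ul[1]/li[{li}]/span/text()'
)


def trader_count(texts):
    """Return the first integer found in a list of text nodes, or None."""
    for text in texts:
        match = re.search(r'\d+', text)
        if match:
            return int(match.group())
    return None


def build_xpaths(div_numbers):
    """Map each div number to its (long, short) XPath pair.

    In the question, li[1] holds the long count and li[2] the short count.
    """
    return {
        div: (XPATH_TEMPLATE.format(div=div, li=1),
              XPATH_TEMPLATE.format(div=div, li=2))
        for div in div_numbers
    }


# The sample output from the question, reduced to plain integers.
print(trader_count(['337 Traders ', ' ']))
print(trader_count([' 259 Traders']))
```

Each pair returned by build_xpaths can then be passed to tree.xpath() in a loop, and the resulting lists to trader_count().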

Python, is there a easier way to add values to a default key?

The program I am working does the following:
Grabs stdout from a .perl program
Builds a nested dict from the output
I'm using the AutoVivification approach found here to build a default nested dictionary. I'm using this method of defaultdict because it's easier for me to follow as a new programmer.
I'd like to add one value to a declared key on each pass of the for loop in the code below. Is there an easier way to add values to a key than building a [list] of values and then adding those values as a group?
import os
import pprint
import re
import subprocess

class Vividict(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

reg = 'NtUser'
od = Vividict()
od[reg]

def run_rip():
    os.chdir('/Users/ME/PycharmProjects/RegRipper2.8')  # Path to regripper dir
    # ntDict is defined elsewhere and is not shown in the question
    for k in ntDict:
        run_command = "".join(["./rip.pl", " -r /Users/ME/Desktop/Reg/NTUSER.DAT -p ", str(k)])
        process = subprocess.Popen(run_command,
                                   shell=True,
                                   stdout=subprocess.PIPE,
                                   stderr=subprocess.PIPE)
        out, err = process.communicate()  # wait for the process to terminate
        parse(out)
        # errcode = process.returncode // used in future for errorcode checking
        ntDict.popitem(last=False)

def parse(data):
    pattern = re.compile('lastwrite|(\d{2}:\d{2}:\d{2})|alert|trust|Value')
    grouping = re.compile('(?P<first>.+?)(\n)(?P<second>.+?)([\n]{2})(?P<rest>.+[\n])', re.MULTILINE | re.DOTALL)
    if pattern.findall(data):
        match = re.search(grouping, data)
        global first
        first = re.sub("\s\s+", " ", match.group('first'))
        od[reg][first]
        second = re.sub("\s\s+", " ", match.group('second'))
        parse_sec(second)

def parse_sec(data):
    pattern = re.compile(r'^(\(.*?\)) (.*)$')
    date = re.compile(r'(.*?\s)(.*\d{2}:\d{2}:\d{2}.*)$')
    try:
        if pattern.match(data):
            result = pattern.match(data)
            hive = result.group(1)
            od[reg][first]['Hive'] = hive
            desc = result.group(2)
            od[reg][first]['Description'] = desc
        elif date.match(data):
            result = date.match(data)
            hive = result.group(1)
            od[reg][first]['Hive'] = hive
            time = result.group(2)
            od[reg][first]['Timestamp'] = time
        else:
            od[reg][first]['Finding'] = data
    except IndexError:
        print('error w/pattern match')

run_rip()
pprint.pprint(od)
Sample Input:
bitbucket_user v.20091020
(NTUSER.DAT) TEST - Get user BitBucket values
Software\Microsoft\Windows\CurrentVersion\Explorer\BitBucket
LastWrite Time Sat Nov 28 03:06:35 2015 (UTC)
Software\Microsoft\Windows\CurrentVersion\Explorer\BitBucket\Volume
LastWrite Time = Sat Nov 28 16:00:16 2015 (UTC)
If I understand your question correctly, you want to change the lines where you're actually adding values to your dictionary (e.g. the od[reg][first]['Hive'] = hive line and the similar one for desc and time) to create a list for each reg and first value and then extend that list with each item being added. Your dictionary subclass takes care of creating the nested dictionaries for you, but it won't build a list at the end.
I think the best way to do this is to use the setdefault method on the inner dictionary:
od[reg][first].setdefault("Hive", []).append(hive)
setdefault adds its second argument (the "default", here an empty list) to the dictionary if the first argument doesn't exist as a key. It preempts the dictionary's __missing__ method creating the item, which is good, since we want the value to be a list rather than another layer of dictionary. The method returns the value for the key in all cases (whether it added a new value or there was one already), so we can chain it with append to add our new hive value to the list.
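A small self-contained check of this behaviour, reusing the Vividict class from the question, might look like this. The key 'bitbucket_user v.20091020' comes from the sample input; the second hive value, '(SOFTWARE)', is made up purely for illustration.

```python
class Vividict(dict):
    """Nested dict that creates missing levels on access (from the question)."""

    def __missing__(self, key):
        value = self[key] = type(self)()
        return value


od = Vividict()
reg = 'NtUser'
first = 'bitbucket_user v.20091020'

# Each pass appends one value; setdefault creates the list on the first pass
# and returns the existing list on later passes.
for hive in ['(NTUSER.DAT)', '(SOFTWARE)']:  # '(SOFTWARE)' is illustrative
    od[reg][first].setdefault('Hive', []).append(hive)

print(od[reg][first]['Hive'])
```

The outer levels (od[reg] and od[reg][first]) are still created by __missing__; only the innermost value is a list.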

Dictionary error in Python 2.7

I have a file in the format:
0000 | a1_1,a3_2 | b2_1, b3_2
0001 | a1_3 | b4_1
and I'm trying to create a dictionary which has
{ 'a1' : set(['b2', 'b3', 'b4']), 'a3': set(['b2', 'b3']) }
and this is what my code looks like:
def get_ids(row, col):
    ids = set()
    x = row.strip().split('|')
    for a in x[col].split(','):
        ids.add(a.split('_')[0])
    return ids

def add_to_dictionary(funky_dictionary, key, values):
    if key in funky_dictionary:
        funky_dictionary[key].update(values)
    else:
        funky_dictionary[key] = values

def get_dict(input_file):
    funky_dictionary = {}
    with open(input_file, 'r') as ip:
        for row in ip:
            a_ids = get_ids(row, 1)
            b_ids = get_ids(row, 2)
            for key in a_ids:
                add_to_dictionary(funky_dictionary, key, b_ids)
    return funky_dictionary
My problem is that when I look up the values for certain keys in the dictionary, I get back far more values than expected.
For the example above, the expected value of a3 would be set(['b2', ' b3']).
However, with this code I'm getting set(['b2', ' b3', 'b4']).
I can't figure out what's wrong with the code. Any help?
The issue you have is that many of your dictionary's values are in fact references to the same set instances. In your example data, when the first line is processed, 'a1' and 'a3' both get mapped to the same set object (containing 'b2' and 'b3'). When you process the second line and call update on that set via the key 'a1', you'll see the added value through 'a3' too, since both values are references to the same set.
You need to change the code so that each value is a separate set object. I'd suggest getting rid of add_to_dictionary and just using the dictionary's own setdefault method, like this:
for key in a_ids:
    funky_dictionary.setdefault(key, set()).update(b_ids)
This code always starts with a new empty set for a new key, and always updates it with new values (rather than adding a reference to the b_ids set to the dictionary directly).
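The difference can be reproduced without the input file. In this sketch the two rows of the example are given as already-parsed (a_ids, b_ids) pairs; make_rows, build_shared and build_separate are hypothetical names, with build_shared mirroring the logic of add_to_dictionary and build_separate using setdefault as suggested.

```python
def make_rows():
    """The two example rows, already parsed into (a_ids, b_ids) pairs."""
    return [({'a1', 'a3'}, {'b2', 'b3'}), ({'a1'}, {'b4'})]


def build_shared(rows):
    """Same logic as add_to_dictionary: new keys store b_ids itself."""
    result = {}
    for a_ids, b_ids in rows:
        for key in a_ids:
            if key in result:
                result[key].update(b_ids)
            else:
                result[key] = b_ids  # every key of this row shares one set
    return result


def build_separate(rows):
    """setdefault gives each key its own set before updating it."""
    result = {}
    for a_ids, b_ids in rows:
        for key in a_ids:
            result.setdefault(key, set()).update(b_ids)
    return result


shared = build_shared(make_rows())
separate = build_separate(make_rows())
print(shared['a1'] is shared['a3'])      # the same object under both keys
print(sorted(shared['a3']))              # 'b4' leaks into a3
print(sorted(separate['a3']))            # only the values from the first row
```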

Python, Using repeated field in API

I have an API defined as follows. If the API is called with only one value in params (which is a repeated field), everything works as intended. But if params holds multiple values, I get the error: No endpoint found for path.
INPUT = endpoints.ResourceContainer(
    params=messages.IntegerField(1, repeated=True, variant=messages.Variant.INT32))

@endpoints.method(INPUT,
                  response_type.CustomResponse,
                  path='foo/{params}',
                  http_method='POST',
                  name='foo')
def foo(self, request):
    # foo body is irrelevant
    return response
How can I fix this? With something like path = 'foo/{params[]}'?
Thank you for your help.
If 'params' is expected as part of the query string and not the path, you can just omit it from the path, e.g.:
path = 'foo'
or
path = 'myApi/foo'
The example given in the docs uses a ResourceContainer for a single non-repeated path argument. Given the nature of repeated properties, it doesn't look like you can use them as path arguments, only as query string arguments. A repeated field in a query string would look like this (easy to deal with):
POST http://app.appspot.com/_ah/api/myApi/v1/foo?param=bar&param=baz ...
But a repeated field in a path argument would look like this (not so much):
POST http://app.appspot.com/_ah/api/myApi/v1/foo/bar/baz....
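To see how a repeated field travels in a query string, the Python standard library can encode and decode one. This only illustrates the URL encoding, not the Endpoints framework itself; the field name params is taken from the question and the integer values are made up.

```python
from urllib.parse import parse_qs, urlencode

# doseq=True emits one key=value pair per list element.
query = urlencode({'params': [1, 2, 3]}, doseq=True)
print(query)

# Parsing the query string groups the repeated key back into one list.
print(parse_qs(query))
```

Because every value carries its own key, the server can tell where one value ends and the next begins, which is not possible when the values are simply appended to the path.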

regex to return all values not just first found one

I'm learning Pig Latin and am using regular expressions. Not sure if the regex is language agnostic or not but here is what I'm trying to do.
If I have a table with two fields: tweet id and tweet, I'd like to go through each tweet and pull out all mentions up to 3.
So if a tweet goes something like "@tim bla @sam @joe something bla bla" then the line item for that tweet will have tweet id, tim, sam, joe.
The raw data has twitter ids, not the actual handles, so this regex seems to return a mention: (.*)@user_(\\S{8})([:| ])(.*)
Here is what I have tried:
a = load 'data.txt' AS (id:chararray, tweet:chararray);
b = foreach a generate id, LOWER(tweet) as tweet;
-- filter data so only tweets with mentions
c = FILTER b BY tweet MATCHES '(.*)@user_(\\S{8})([:| ])(.*)';
-- try to pull out the mentions.
d = foreach c generate id,
    REGEX_EXTRACT(tweet, '((.*)@user_(\\S{8})([:| ])(.*)){1}',3) as mention1,
    REGEX_EXTRACT(tweet, '((.*)@user_(\\S{8})([:| ])(.*)){1,2}',3) as mention2,
    REGEX_EXTRACT(tweet, '((.*)@user_(\\S{8})([:| ])(.*)){2,3}',3) as mention3;
e = limit d 20;
dump e;
So in that try I was playing with quantifiers, trying to return the first, second and 3rd instance of a match in a tweet {1}, {1,2}, {2,3}.
That did not work, mention 1-3 are just empty.
So I tried changing d:
d = foreach c generate id,
    REGEX_EXTRACT(tweet, '(.*)@user_(\\S{8})([:| ])(.*)',2) as mention1,
    REGEX_EXTRACT(tweet, '(.*)@user_(\\S{8})([:| ])(.*)@user_(\\S{8})([:| ])(.*)',5) as mention2,
    REGEX_EXTRACT(tweet, '(.*)@user_(\\S{8})([:| ])(.*)@user_(\\S{8})([:| ])(.*)@user_(\\S{8})([:| ])(.*)',8) as mention3;
But instead of returning each user mentioned, this returned the same mention 3 times. I had expected that by copying and pasting the expression again I'd get the second match, and that pasting it a third time would get the third match.
I'm not sure how well I've managed to word this question but to put it another way, imagine that the function regex_extract() returned an array of matched terms. I would like to get mention[0], mention[1], mention[2] on a single line item.
Whenever you use the REGEX_EXTRACT or REGEX_EXTRACT_ALL UDF, keep in mind that it is just plain regex handled by Java.
It is easier to test the regex through a local Java test. Here is the regex I found to be acceptable:
// uses java.util.regex.Pattern and java.util.regex.Matcher
Pattern p = Pattern.compile("@(\\S+).*?(?:@(\\S+)(?:.*?@(\\S+))?)?");
String input = "So if a tweet goes something like @tim bla @sam @joe @bill something bla bla";
Matcher m = p.matcher(input);
if (m.find()) {
    for (int i = 0; i <= m.groupCount(); i++) {
        System.out.println(i + " -> " + m.group(i));
    }
}
With this regex, if there is at least one mention, it returns three fields, the second and/or third being null if a second or third mention is not found.
Therefore, you may use the following Pig code:
d = foreach c generate id, REGEX_EXTRACT_ALL(
    tweet, '@(\\S+).*?(?:@(\\S+)(?:.*?@(\\S+))?)?');
You do not even need to filter the data first.
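The "array of matched terms" the question asks for can also be prototyped quickly outside Pig. The sketch below uses Python's re module rather than Java's regex engine, assumes that mentions start with @, and uses a hypothetical helper name, first_mentions; it is meant for checking the expected result on sample tweets, not as a replacement for the Pig script.

```python
import re


def first_mentions(tweet, limit=3):
    """Return up to `limit` mentioned names, in order of appearance."""
    # findall returns every non-overlapping match of the capture group.
    return re.findall(r'@(\S+)', tweet)[:limit]


print(first_mentions('@tim bla @sam @joe @bill something bla bla'))
print(first_mentions('no mentions here'))
```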