Huge training error with pybrain - python-2.7

This is my training function:
def train(input_layer_data, output_layer_data, dnn, stn):
    ds = SupervisedDataSet(len(input_layer_data), len(output_layer_data))
    ds.addSample(input_layer_data, output_layer_data)
    if 'network' in dnn[stn]:
        net_dumped = dnn[stn]['network']
        net = pickle.loads(net_dumped)
    else:
        net = buildNetwork(len(input_layer_data), 50, len(output_layer_data), hiddenclass=SigmoidLayer, outclass=SigmoidLayer)
    trainer = BackpropTrainer(net, ds)
    trainer.trainEpochs(1)
    trnresult = percentError(trainer.testOnClassData(), input_layer_data)
    print "epoch: %4d" % trainer.totalepochs, \
        " train error: %5.2f%%" % trnresult
    return net
I call this function repeatedly, each time with a single input/output pair.
This is the output it generates:
inp=[48, 48, 8, 69, 69, 8, 57, 57, 8, 67, 67, 8, 71, 71, 8, 75, 75, 8, 71, 71, 8]
out=[27, 27, 8, 71, 71, 8, 75, 75, 8, 71, 71, 8, 67, 67, 8, 57, 57, 8, 69, 69, 8]
epoch: 0 train error: 2100.00%
FeedForwardNetwork-152
Modules:
[<BiasUnit 'bias'>, <LinearLayer 'in'>, <SigmoidLayer 'hidden0'>, <SigmoidLayer 'out'>]
Connections:
[<FullConnection 'FullConnection-148': 'bias' -> 'out'>, <FullConnection 'FullConnection-149': 'bias' -> 'hidden0'>, <FullConnection 'FullConnection-150': 'in' -> 'hidden0'>, <FullConnection 'FullConnection-151': 'hidden0' -> 'out'>]
I don't understand why the error is so huge.
The error stays this high throughout the whole program (the output above is from just one call).
How do I reduce the error?

Divide (and replace) numbers extracted from a string in Google Sheets

I'm trying to convert numbers that were previously percentages to a decimal format by dividing them by 100 in Google Sheets. Basically, I have:
<polygon points="48, 6, 43, 7, 38, 9, 34, 12, 29, 16, 24, 22, 22, 30, 22, 44, 23, 50, 23, 65, 25, 72, 28, 77, 32, 82, 35, 86, 40, 90, 43, 92, 50, 93, 55, 91, 62, 87, 70, 76, 74, 69, 75, 64, 75, 54, 74, 49, 74, 40, 74, 32, 71, 23, 66, 15, 59, 9, 53, 6" />
And I want:
<polygon points=".48, .06, .43, .07, .38, .09, .34, .12, .29, .16, .24, .22, .22, .30, .22, .44, .23, .50, .23, .65, .25, .72, .28, .77, .32, .82, .35, .86, .40, .90, .43, .92, .50, .93, .55, .91, .62, .87, .70, .76, .74, .69, .75, .64, .75, .54, .74, .49, .74, .40, .74, .32, .71, .23, .66, .15, .59, .09, .53, .06" />
Is there any way to extract numbers, do an operation on them, then replace them in the previous string? I tried to use a regex token in REGEXREPLACE but it doesn't seem to be supported.
=(REGEXREPLACE(A2,"[^[:digit:]]",($/10)))
You cannot apply any function to the string replacement pattern in REGEXREPLACE. In this concrete case, you may simply append a 0 before single-digit numbers and then add dots before each sequence of 1 or more digits:
=REGEXREPLACE(REGEXREPLACE(A1,"\b\d\b", "0$0"), "\d+", ".$0")
NOTES:
REGEXREPLACE(A1,"\b\d\b", "0$0") - finds a single digit that is neither preceded nor followed by a letter, digit, or _, and adds a 0 in front of it ($0 is the placeholder for the whole match)
REGEXREPLACE(..., "\d+", ".$0") - prepends a dot to each run of one or more digits.
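For readers who want to check the two-step idea outside Google Sheets, here is a minimal sketch of the same substitutions in Python. The function name percent_to_decimal is made up for illustration, and Python writes the whole-match placeholder as \g<0> rather than $0.

```python
import re

def percent_to_decimal(text):
    # Hypothetical helper mirroring the nested REGEXREPLACE calls above.
    # Step 1: pad standalone single digits with a leading zero ("6" -> "06").
    padded = re.sub(r"\b\d\b", r"0\g<0>", text)
    # Step 2: put a dot in front of every run of digits ("06" -> ".06").
    return re.sub(r"\d+", r".\g<0>", padded)

points = "48, 6, 43, 7, 38, 9"
print(percent_to_decimal(points))  # .48, .06, .43, .07, .38, .09
```

Like the spreadsheet formula, this only gives correct results when every number is between 0 and 99; a value such as 100 would become .100 instead of 1.00.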

Getting a list as the result of a function in pandas

I have data frame in pandas and I have written a function to use the information in each row to generate a new column. I want the result to be in a list format:
A B C
3 4 1
4 2 5
def Computation(row):
    if row['B'] >= 3:
        return [s for s in range(row['C'], 50)]
    else:
        return [s for s in range(row['C'] + 2, 50)]
df['D'] = df.apply(Computation, axis=1)
However, I am getting the following error:
"could not broadcast input array from shape (308) into shape (9)"
Could you please tell me how to solve this problem?
Say you start with
In [25]: df = pd.DataFrame({'A': [3, 4], 'B': [4, 2], 'C': [1, 5]})
Then there are at least two ways to do it.
You can apply twice on the C column, but switch on the B column:
In [26]: np.where(df.B >= 3, df.C.apply(lambda c: [s for s in range(c, 50)]), df.C.apply(lambda c: [s for s in range(c + 2, 50)]))
Out[26]:
array([ [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
[7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]], dtype=object)
Or you can apply on the entire row and switch on the B value per row:
In [27]: df.apply(lambda r: [s for s in range(r.C, 50)] if r.B >= 3 else [s for s in range(r.C + 2, 50)], axis=1)
Out[27]:
0 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14...
1 [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, ...
Note that the return types are different, but, in each case, you can still write
df['foo'] = <each one of the above options>

Creating correct for-loop/iteration for a unique list

I have a list of the numbers 1,2,3 and 4.
I wish to print them out in the following manner:
1
2
3
4
11
12
13
14
21
22
23
24
31
..and so on.
How can I do this?
Thanks
from itertools import product
maximumDigits = 2
digits = '1234'
for l in range(1, maximumDigits + 1):
    for n in product(digits, repeat=l):
        print(''.join(n))
Gives you:
1
2
3
4
11
12
13
14
21
22
23
24
31
32
33
34
41
42
43
44
Non-itertools solution:
>>> digits = (1, 2, 3, 4)
>>> nums = newNums = list(digits)
# calculate 2-digit numbers
>>> newNums = [n * 10 + m for n in newNums for m in digits]
>>> nums.extend(newNums)
>>> nums
[1, 2, 3, 4, 11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34, 41, 42, 43, 44]
# calculate 3-digit numbers
>>> newNums = [n * 10 + m for n in newNums for m in digits]
>>> nums.extend(newNums)
>>> nums
[1, 2, 3, 4, 11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34, 41, 42, 43, 44, 111, 112, 113, 114, 121, 122, 123, 124, 131, 132, 133, 134, 141, 142, 143, 144, 211, 212, 213, 214, 221, 222, 223, 224, 231, 232, 233, 234, 241, 242, 243, 244, 311, 312, 313, 314, 321, 322, 323, 324, 331, 332, 333, 334, 341, 342, 343, 344, 411, 412, 413, 414, 421, 422, 423, 424, 431, 432, 433, 434, 441, 442, 443, 444]
# this repeats for each new digit you want
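The "repeat for each new digit" step can be wrapped in a loop. The following is only a sketch of that idea; the function name numbers_up_to and its max_len parameter are assumptions introduced here, not part of the answer above.

```python
def numbers_up_to(digits, max_len):
    """Return all numbers built from `digits` with 1 to `max_len` digits."""
    current = list(digits)   # numbers with the current digit count
    result = list(current)   # everything collected so far
    for _ in range(max_len - 1):
        # Append every digit to every number from the previous round.
        current = [n * 10 + m for n in current for m in digits]
        result.extend(current)
    return result

print(numbers_up_to((1, 2, 3, 4), 2))
```

With max_len=2 this yields the same 20 numbers as the first REPL step above; each extra level multiplies the size of the newest group by the number of digits.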

O(1) Django ORM strategy to query related objects of related objects

The relationship between Foo and Bar is through Baz as follows:
class Foo(Model):
    pass  # stuff
class Bar(Model):
    pass  # stuff
class Baz(Model):
    foos = ManyToManyField("Foo")
    bar = ForeignKey("Bar")
I basically need to generate the following dict representing the Bars that are related to each Foo through Baz (in dict comprehension pseudo-code):
{ foo.id: [list of unique bars related to the foo through any baz] for foo in all foos}
I can currently generate my data structure with O(N) queries (1 query per Foo), but with lots of data this is a bottleneck, and I need it optimized to O(1) (not a single query per se, but a fixed number of queries irrespective of data size of any of the models), while also minimizing iterations of the data in python.
If you can drop to SQL, you can use a single query (prefix every table name with the app name):
select distinct foo.id, bar.id
from baz_foos
join baz on baz_foos.baz_id = baz.id
join foo on baz_foos.foo_id = foo.id
join bar on baz.bar_id = bar.id
baz_foos is the many-to-many table Django creates.
@Alasdair's solution is probably more readable (although if you're doing this for performance reasons, that might not be the most important factor). His solution uses exactly two queries, which is hardly different from one. The only problem I see arises when you have a large number of Baz objects, since the generated SQL looks like this:
SELECT "foobar_baz"."id", "foobar_baz"."bar_id", "foobar_bar"."id"
FROM "foobar_baz"
INNER JOIN "foobar_bar" ON ("foobar_baz"."bar_id" = "foobar_bar"."id")
SELECT
("foobar_baz_foos"."baz_id") AS "_prefetch_related_val",
"foobar_foo"."id"
FROM "foobar_foo"
INNER JOIN "foobar_baz_foos" ON ("foobar_foo"."id" = "foobar_baz_foos"."foo_id")
WHERE "foobar_baz_foos"."baz_id" IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54,
55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74,
75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94,
95, 96, 97, 98, 99, 100, 101)
If you have only a few Bars and a few hundred Foos, I would do:
from django.db import connection
from collections import defaultdict
# foos = {f.id: f for f in Foo.objects.all()}
bars = {b.id: b for b in Bar.objects.all()}
c = connection.cursor()
c.execute(sql) # from above
d = defaultdict(set)
for f_id, b_id in c.fetchall():
    d[f_id].add(bars[b_id])
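To try the single-query approach without a Django project, here is a rough sketch using the standard-library sqlite3 module. The unprefixed table names follow the SQL shown above, while the schema details and sample rows are invented for illustration. It collects bar ids rather than Bar instances.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
# Assumed minimal schema; Django's real tables carry an app-name prefix.
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY);
    CREATE TABLE bar (id INTEGER PRIMARY KEY);
    CREATE TABLE baz (id INTEGER PRIMARY KEY, bar_id INTEGER);
    CREATE TABLE baz_foos (baz_id INTEGER, foo_id INTEGER);
    INSERT INTO foo (id) VALUES (1), (2);
    INSERT INTO bar (id) VALUES (10), (20);
    INSERT INTO baz (id, bar_id) VALUES (100, 10), (101, 10), (102, 20);
    INSERT INTO baz_foos (baz_id, foo_id)
        VALUES (100, 1), (101, 1), (102, 1), (102, 2);
""")

sql = """
    SELECT DISTINCT foo.id, bar.id
    FROM baz_foos
    JOIN baz ON baz_foos.baz_id = baz.id
    JOIN foo ON baz_foos.foo_id = foo.id
    JOIN bar ON baz.bar_id = bar.id
"""

# One query, one pass over the rows: group bar ids by foo id.
bars_by_foo = defaultdict(set)
for foo_id, bar_id in conn.execute(sql):
    bars_by_foo[foo_id].add(bar_id)

print(dict(bars_by_foo))
```

In the sample data, foo 1 reaches bar 10 through two different Baz rows, yet bar 10 appears only once in its set, which is the uniqueness the question asks for.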
Using select_related and prefetch_related, I think you can build the required data structure with 2 queries:
out = {}
bazes = Baz.objects.select_related('bar').prefetch_related('foos')
for baz in bazes:
    for foo in baz.foos.all():
        out.setdefault(foo.id, set()).add(baz.bar)
The values of the output dictionary are sets, not lists as in your question, to ensure uniqueness.

Extended tuple unpacking in Python 2

Is it possible to simulate extended tuple unpacking in Python 2?
Specifically, I have a for loop:
for a, b, c in mylist:
which works fine when mylist is a list of tuples of size three. I want the same for loop to work if I pass in a list of size four.
I think I will end up using named tuples, but I was wondering if there is an easy way to write:
for a, b, c, *d in mylist:
so that d eats up any extra members.
You can't do that directly, but it isn't terribly difficult to write a utility function to do this:
>>> def unpack_list(a, b, c, *d):
...     return a, b, c, d
...
>>> unpack_list(*range(100))
(0, 1, 2, (3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99))
You could apply it to your for loop like this:
for sub_list in mylist:
    a, b, c, d = unpack_list(*sub_list)
You could define a wrapper function that converts your list to a four tuple. For example:
def wrapper(thelist):
    for item in thelist:
        yield (item[0], item[1], item[2], item[3:])
mylist = [(1,2,3,4), (5,6,7,8)]
for a, b, c, d in wrapper(mylist):
    print a, b, c, d
The code prints:
1 2 3 (4,)
5 6 7 (8,)
For the heck of it, generalized to unpack any number of elements:
lst = [(1, 2, 3, 4, 5), (6, 7, 8), (9, 10, 11, 12)]
def unpack(seq, n=2):
    for row in seq:
        yield [e for e in row[:n]] + [row[n:]]
for a, rest in unpack(lst, 1):
    pass
for a, b, rest in unpack(lst, 2):
    pass
for a, b, c, rest in unpack(lst, 3):
    pass
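To make the behaviour of the generalized helper concrete, this small sketch collects what each iteration receives for n=2. The unpack function is restated so the snippet runs on its own; the results list is an illustrative addition.

```python
def unpack(seq, n=2):
    # Yield the first n items of each row, followed by the remainder.
    for row in seq:
        yield list(row[:n]) + [row[n:]]

lst = [(1, 2, 3, 4, 5), (6, 7, 8), (9, 10, 11, 12)]

results = [(a, b, rest) for a, b, rest in unpack(lst, 2)]
print(results)  # [(1, 2, (3, 4, 5)), (6, 7, (8,)), (9, 10, (11, 12))]
```

Because the rows are tuples, the remainder is a tuple slice of the original row, so a row with nothing left over yields an empty tuple.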
You can write a very basic function that mimics Python 3 extended unpacking. It is slightly verbose for legibility. Note that 'rest' is the position of the asterisk, counting from 1, not 0.
def extended_unpack(seq, n=3, rest=3):
    res = []; cur = 0
    lrest = len(seq) - (n - 1)  # length of 'rest' of sequence
    while cur < len(seq):
        if cur != rest - 1:  # not at the starred position ('rest' is 1-based)
            res.append(seq[cur])  # append current element to result
        else:  # at the starred position
            res.append(seq[cur : lrest + cur])  # leave the rest
            cur = cur + lrest - 1  # move current index past the rest
        cur = cur + 1  # update current position
    return res
Python 3 solution for those who landed here via a web search:
You can use itertools.zip_longest, like this:
from itertools import zip_longest
max_params = 4
lst = [1, 2, 3, 4]
a, b, c, d = next(zip(*zip_longest(lst, range(max_params))))
print(f'{a}, {b}, {c}, {d}') # 1, 2, 3, 4
lst = [1, 2, 3]
a, b, c, d = next(zip(*zip_longest(lst, range(max_params))))
print(f'{a}, {b}, {c}, {d}') # 1, 2, 3, None
For Python 2.x you can follow this answer.
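For comparison, Python 3 accepts the starred target from the question directly in a for loop. A minimal sketch follows; the sample data and the rows list are invented for illustration.

```python
mylist = [(1, 2, 3), (4, 5, 6, 7)]

rows = []
for a, b, c, *d in mylist:
    # d is always a list: empty when the tuple has exactly three items.
    rows.append((a, b, c, d))

print(rows)  # [(1, 2, 3, []), (4, 5, 6, [7])]
```

Unlike the zip_longest approach, this collects any surplus items into a list instead of padding missing ones with None, so each item still needs at least three elements.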