overload the bracket operator with multiple parameters - python-2.7

I have built a wrapper around a numpy array for simplification purposes. I will display only the part necessary to show the error:
class Matrix(object):
    """wrap around numpy array
    """
    def __init__(self, shape, fill_value):
        self.matrix = np.full(shape, fill_value)

    def __getitem__(self, a, b):
        return self.matrix[a, b]

m = Matrix((10, 10), 5)
print(m[5, 5])
The print statement generates the following error:
TypeError: __getitem__() takes exactly 3 arguments (2 given)
What's the fix to access m using the [] operator, like the following:
m[1, 1]

Currently, you have a class Matrix with an attribute matrix which is a numpy array. Therefore you would need to reference the attribute first and then pass the indices:
>>> m.matrix[5,5]
5
At this point, you have not wrapped around a numpy array. Depending on what you want to do, this could be a step in the right direction:
class Matrix(np.ndarray):
    def __new__(cls, shape, fill_value=0):
        return np.full(shape, fill_value)

>>> m = Matrix((10, 10), 5)
>>> print(m[5, 5])
5
However, this essentially does nothing more than m = np.full(shape, fill_value). I suppose you are going to want to add custom attributes and methods to a numpy array, in which case you should check out the subclassing example in the numpy documentation.
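For reference, a minimal sketch of that subclassing pattern, loosely following the numpy documentation example (the info attribute is purely illustrative):

import numpy as np

class Matrix(np.ndarray):
    """ndarray subclass carrying a custom attribute."""
    def __new__(cls, shape, fill_value=0, info=None):
        # Build the filled array and view it as our subclass.
        obj = np.full(shape, fill_value).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # Called when views/slices are created; propagate the custom attribute.
        if obj is None:
            return
        self.info = getattr(obj, 'info', None)

m = Matrix((10, 10), 5, info="example")
print(m[5, 5])   # 5
print(m.info)    # example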

The solution is for __getitem__ to take a single parameter; when you write m[5, 5], Python passes the indices as one tuple:
class Matrix(object):
    """wrap around numpy array
    """
    def __init__(self, shape, fill_value):
        self.matrix = np.full(shape, fill_value)

    def __getitem__(self, a):
        # we could also do: return self.matrix[a[0], a[1]]
        return self.matrix[a]

m = Matrix((10, 10), 5)
print(m[5, 5])
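Because numpy accepts the index tuple directly, slicing keeps working as well, for example:

print(m[2:4, 5])   # [5 5] -- __getitem__ receives the tuple (slice(2, 4, None), 5)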

object returning memory location instead of value

So I have this class:
#!/usr/bin/python3
class MyClass(object):
    def __init__(self, length):
        self._list = length

    def get(self, index):
        try:
            return self._list[index]
        except IndexError:
            return None
which takes in a list and returns a value (the element at a given index, I think). I am trying to get that value:
def my_function(a_list):
    a_list = MyClass
    for x in (10**p for p in range(1, 9)):
        if a_list:
            print(a_list)

def main():
    length = my_function(MyClass([i for i in range(0, 543)]))
but I keep getting only the memory location of the list, and I think this is supposed to return an int.
I am hoping this is a workable bit of code, but I am struggling with the concept of passing an "object" to a class; it doesn't make any sense to me.
Here is a test I am supposed to use:
def test_large_list():
    s_list = My_Class([i for i in xrange(0, 100000)])
    assert len(s_list._list) == list_length(s_list)
OK, here is my full function that works; it is done. How do I do this so that the first line takes an argument?
#!/usr/bin/python3
#def list_length(single_method_list): This is what I am supposed to work with
from single_method_list import SingleMethodList

def my_function(): # This is how I have done it and it works.
    a_list = MyClass([i for i in range(0, 234589)])
    for x in (10**p for p in range(1, 8)):
        if a_list.get(x):
            print("More than", x)
            first = x
        else:
            print("Less than", x)
            last = x
            break
    answer = False
    while not answer:
        result = (first + last)/2
        result = int(round(result))
        print(result)
        if a_list.get(result):
            first = result
            print('first', result)
        else:
            last = result
            print('last', result)
        if a_list.get(result) and not a_list.get(result + 1):
            answer = True
            print(result + 1)

my_function()
I don't know what more I can give to explain where I am stuck; it is the OOP part of this that I don't know. I need the same results here, just passing the object to the function instead of creating it inside the function, which I did in order to do the algorithm.
Well, your class does something else. MyClass is designed to take a list at initialization, so the name length is not a good idea.
The get() method of this class takes in a number and returns the element located at that particular index in the initialized self._list.
Your logic should be like:
def my_function(a_list):
    a_list = MyClass(a_list)
    ...

def main():
    length = my_function([i for i in range(0, 543)])
Just to clarify some misunderstanding that you might have.
A class does not return anything. It is a blueprint for creating objects.
What can return a value is a method (function). For instance, if you want to write a function which returns the length of some list:
def my_function(some_list):
    return len(some_list)
Or in your case:
def my_function(a_list):
    return len(a_list._list)
Note that you should not call your variables list. It's a built-in function in Python which creates lists.
And as you can see, there is another built-in function, len, in Python which returns the length of a list, tuple, dictionary, etc.
Hope this helps, although it's still a bit unclear what you're trying to achieve.
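To connect this with the original goal of passing the object in as an argument, here is a minimal sketch of a list_length function in the spirit of the question's test; the probing/bisection body is just one possible approach, not taken from the question:

def list_length(single_method_list):
    """Estimate the length of the wrapped list using only its get() method."""
    if single_method_list.get(0) is None:
        return 0
    # Grow an upper bound until get() starts returning None.
    first, last = 0, 1
    while single_method_list.get(last) is not None:
        first, last = last, last * 10
    # Bisect between the last valid index and the first invalid one.
    while last - first > 1:
        mid = (first + last) // 2
        if single_method_list.get(mid) is not None:
            first = mid
        else:
            last = mid
    return last

s_list = MyClass([i for i in range(0, 543)])
print(list_length(s_list))  # 543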

ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 0.0

I have applied logistic regression on the train set after splitting the data set into test and train sets, but I got the above error. I tried to work it out, and when I tried to print my response vector y_train in the console it printed integer values like 0 or 1. But when I wrote it to a file I found that the values were floats like 0.0 and 1.0. If that's the problem, how can I overcome it?
lenreg = LogisticRegression()
print y_train[0:10]
y_train.to_csv(path='ytard.csv')
lenreg.fit(X_train, y_train)
y_pred = lenreg.predict(X_test)
print metrics.accuracy_score(y_test, y_pred)
The stack trace is as follows:
Traceback (most recent call last):
  File "/home/amey/prog/pd.py", line 82, in <module>
    lenreg.fit(X_train, y_train)
  File "/usr/lib/python2.7/dist-packages/sklearn/linear_model/logistic.py", line 1154, in fit
    self.max_iter, self.tol, self.random_state)
  File "/usr/lib/python2.7/dist-packages/sklearn/svm/base.py", line 885, in _fit_liblinear
    " class: %r" % classes_[0])
ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 0.0
Meanwhile I've come across this link, which was unanswered. Is there a solution?
The problem here is that your y_train vector, for whatever reason, only has zeros. It is actually not your fault, and it's kind of a bug (I think). The classifier needs 2 classes or else it throws this error.
It makes sense: if your y_train vector only has zeros (i.e. only 1 class), then the classifier doesn't really need to do any work, since all predictions should just be the one class.
In my opinion the classifier should still complete, just predict the one class (all zeros in this case) and throw a warning, but it doesn't. It throws the error instead.
A way to check for this condition is like this:
import numpy as np

lenreg = LogisticRegression()
print y_train[0:10]
y_train.to_csv(path='ytard.csv')

if np.sum(y_train) in [len(y_train), 0]:
    print "all one class"
    # do something else
else:
    # OK to proceed
    lenreg.fit(X_train, y_train)
    y_pred = lenreg.predict(X_test)
    print metrics.accuracy_score(y_test, y_pred)
To overcome the problem more easily, I would recommend just including more samples in your test set, like 100 or 1000 instead of 10.
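As a related precaution (not part of the original answer), a stratified split keeps both classes represented in each subset. A minimal sketch, assuming X and y are the full feature matrix and labels and a reasonably recent scikit-learn (older versions import train_test_split from sklearn.cross_validation and may lack the stratify parameter):

from sklearn.model_selection import train_test_split

# stratify=y preserves the 0/1 class proportions in both splits,
# so neither side ends up with only a single class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)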
I had the same problem using learning_curve:
train_sizes, train_scores, test_scores = learning_curve(estimator,
    X, y, cv=cv, n_jobs=n_jobs, train_sizes=train_sizes,
    scoring="f1", random_state=RANDOM_SEED, shuffle=True)
Add the shuffle parameter, which will randomize the sets.
This doesn't prevent the error from happening, but it's a way to increase the chances of having both classes in the subsets used by the function.
I found it to be because only 1's or 0's wound up in my y_test, since my sample size was really small. Try changing your test_size value.
# python3
import numpy as np
from sklearn.svm import LinearSVC

def upgrade_to_work_with_single_class(SklearnPredictor):
    class UpgradedPredictor(SklearnPredictor):
        def __init__(self, *args, **kwargs):
            self._single_class_label = None
            super().__init__(*args, **kwargs)

        @staticmethod
        def _has_only_one_class(y):
            return len(np.unique(y)) == 1

        def _fitted_on_single_class(self):
            return self._single_class_label is not None

        def fit(self, X, y=None):
            if self._has_only_one_class(y):
                self._single_class_label = y[0]
            else:
                super().fit(X, y)
            return self

        def predict(self, X):
            if self._fitted_on_single_class():
                return np.full(X.shape[0], self._single_class_label)
            else:
                return super().predict(X)

    return UpgradedPredictor

LinearSVC = upgrade_to_work_with_single_class(LinearSVC)
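For illustration, a small made-up usage example of the wrapped class:

import numpy as np

X = np.array([[0.0], [1.0], [2.0]])
y = np.array([0, 0, 0])                 # only one class present

clf = LinearSVC()                       # LinearSVC was rebound to the upgraded class above
clf.fit(X, y)                           # no ValueError; the single label is remembered
print(clf.predict(np.array([[3.0]])))   # [0]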
or the hard way (arguably more correct):
import numpy as np
from sklearn.svm import LinearSVC
from copy import deepcopy, copy
from functools import wraps

def copy_class(cls):
    copy_cls = type(f'{cls.__name__}', cls.__bases__, dict(cls.__dict__))
    for name, attr in cls.__dict__.items():
        try:
            hash(attr)
        except TypeError:
            # Assume lack of __hash__ implies mutability. This is NOT
            # a bullet proof assumption but good in many cases.
            setattr(copy_cls, name, deepcopy(attr))
    return copy_cls

def upgrade_to_work_with_single_class(SklearnPredictor):
    SklearnPredictor = copy_class(SklearnPredictor)
    original_init = deepcopy(SklearnPredictor.__init__)
    original_fit = deepcopy(SklearnPredictor.fit)
    original_predict = deepcopy(SklearnPredictor.predict)

    @staticmethod
    def _has_only_one_class(y):
        return len(np.unique(y)) == 1

    def _fitted_on_single_class(self):
        return self._single_class_label is not None

    @wraps(SklearnPredictor.__init__)
    def new_init(self, *args, **kwargs):
        self._single_class_label = None
        original_init(self, *args, **kwargs)

    @wraps(SklearnPredictor.fit)
    def new_fit(self, X, y=None):
        if self._has_only_one_class(y):
            self._single_class_label = y[0]
        else:
            original_fit(self, X, y)
        return self

    @wraps(SklearnPredictor.predict)
    def new_predict(self, X):
        if self._fitted_on_single_class():
            return np.full(X.shape[0], self._single_class_label)
        else:
            return original_predict(self, X)

    setattr(SklearnPredictor, '_has_only_one_class', _has_only_one_class)
    setattr(SklearnPredictor, '_fitted_on_single_class', _fitted_on_single_class)
    SklearnPredictor.__init__ = new_init
    SklearnPredictor.fit = new_fit
    SklearnPredictor.predict = new_predict
    return SklearnPredictor

LinearSVC = upgrade_to_work_with_single_class(LinearSVC)
You can find the indexes of the first (or any) occurrence of each of the classes, concatenate those samples on top of the training arrays, and delete them from their original positions; that way there will be at least one instance of each class in the training set.
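A minimal sketch of that idea, assuming X and y are numpy arrays and the training portion is taken from the front of the data:

import numpy as np

# index of the first occurrence of each class label
first_idx = np.array([np.where(y == label)[0][0] for label in np.unique(y)])

# move those samples to the front so they land in the training portion,
# and remove them from their original positions
X = np.concatenate([X[first_idx], np.delete(X, first_idx, axis=0)])
y = np.concatenate([y[first_idx], np.delete(y, first_idx, axis=0)])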
This error is related to the dataset you are using: the dataset contains only one class, for example 1/benign, whereas it must contain two classes, 1 and 0 or Benign and Attack.

Rolling Window on Dataframe, multiple columns input and output

I have a function myfunc, which does calculations on two pandas DataFrame columns. Output is a Numpy array.
def myfunc(df, args):
    import numpy
    return numpy.array([df.iloc[:,args[0]].sum,df.iloc[:,args[1]].sum])
This function is called within rolling_df_apply:
def rolling_df_apply(df, myfunc, window, *args):
    import pandas
    result = pandas.concat(pandas.DataFrame(myfunc(df.iloc[i:window+i],args), index=[df.index[i+window-1]]) for i in xrange(0,len(df)-window+1))
    return result
Running this via
import numpy
import pandas
df=pandas.DataFrame(numpy.random.randint(5,size=(5,2)))
window=3
args = [0,1]
result = rolling_df_apply(df, myfunc, window, *args)
gives ValueError within pandas.concat(): Shape of passed values is (1, 2), indices imply (1, 1).
What must be changed to get this running?
Which indices imply shape 1,1? Shape of all dataframes to concatenate should be 1,2, though.
In myfunc, .sum should be .sum().
Since myfunc returns an array of length 2,
pandas.DataFrame(myfunc(df.iloc[i:window+i],args), index=[df.index[i+window-1]])
is essentially the same as
pd.DataFrame([0,1], index=[0])
which raises
ValueError: Shape of passed values is (1, 2), indices imply (1, 1)
The error is saying that the value [0,1] implies 1 row and 2 columns,
while the index implies 1 row and 1 column.
One way to fix this would be to pass a dict instead of a list:
In [191]: pd.DataFrame({'a':0,'b':1}, index=[0])
Out[191]:
   a  b
0  0  1
So, to fix your code with minimal changes,
import pandas as pd
import numpy as np

def myfunc(df, args):
    return {'a':df.iloc[:,args[0]].sum(), 'b':df.iloc[:,args[1]].sum()}

def rolling_df_apply(df, myfunc, window, *args):
    frames = [pd.DataFrame(myfunc(df.iloc[i:window+i],args),
                           index=[df.index[i+window-1]])
              for i in xrange(0,len(df)-window+1)]
    result = pd.concat(frames)
    return result

np.random.seed(2015)
df = pd.DataFrame(np.random.randint(5,size=(5,2)))
window = 3
args = [0,1]
result = rolling_df_apply(df, myfunc, window, *args)
print(result)
yields
   a  b
2  7  6
3  7  5
4  3  3
However, it would be much more efficient to replace myfunc and rolling_df_apply with a call to pd.rolling_sum:
result = pd.rolling_sum(df, window=3).dropna(axis=0)
yields the same result.
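(Side note, not part of the original answer: pd.rolling_sum was removed in later pandas releases; in newer versions the equivalent call is the rolling method on the DataFrame itself.)

# pandas >= 0.18
result = df.rolling(window=3).sum().dropna(axis=0)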

Split Array with looping on Python 2.7.5.1

def split(self):
    assert input_array >= 0
    if input_array == 0:
        return [0]
    array = []
    while input_array > 0:
        array.append(int(input_array % 10))
        input_array = input_array // 10
        print input_array
        return input_array
    else:
        print "END"
Is there any way to split the input array with looping? I tried using selection but it just doesn't work.
Are you trying to get the individual digits from a number? Try converting it into a string, iterating over it, and converting back to int.
>>> x = 2342
>>> [int(digit) for digit in str(x)]
[2, 3, 4, 2]
I'm guessing what you want is a list of the digits making up a certain number (input_array in this case).
First, the main issues:
You declare a variable called array, and if you are a good observer you will notice you never return it.
print "END" has no purpose here.
input_array == 0 can be treated like any other number > 0.
Try not to modify input_array.
The solution:
Since I see you're working with a class, I will code a solution for you using a class as well.
class SomeClass:
    def __init__(self, input_array):
        """ Constructor """
        self.input_array = input_array

    def split(self):
        array = []
        number = self.input_array  # Don't modify input_array.
        while number > 0:
            array.append(number % 10)
            number = number // 10
        array.reverse()  # This is easy ;)
        return array

    def split_1(self):
        """ Kevin's solution. When you become more skilled this is the way to go. """
        return [int(digit) for digit in str(self.input_array)]
>>> some_instance = SomeClass(12345)
>>> print(some_instance.split())
[1, 2, 3, 4, 5]

python 2.7 - is there a more succinct way to do this series of yield statements (in python 3, "yield from" would help)

Situation:
Python 2.7 code that contains a number of "yield" statements. But the specs have changed.
Each yield calls a function that used to always return a value. Now the result is sometimes a value that should be yielded, but sometimes no value should be yielded.
Dumb Example:
BEFORE:
def always(x):
    return 11 * x

def do_stuff():
    # ... other code; each yield is buried inside an if or other flow construct ...
    # ...
    yield always(1)
    # ...
    yield always(6)
    # ...
    yield always(5)

print( list( do_stuff() ) )
=>
[11, 66, 55]
AFTER (if I could use Python 3, but that is not currently an option):
def maybe(x):
    """ only keep odd value; returns list with 0 or 1 elements. """
    result = 11 * x
    return [result] if bool(result & 1) else []

def do_stuff():
    # ...
    yield from maybe(1)
    # ...
    yield from maybe(6)
    # ...
    yield from maybe(5)
=>
[11, 55]
AFTER (in Python 2.7):
def maybe(x):
    """ only keep odd value; returns list with 0 or 1 elements. """
    result = 11 * x
    return [result] if bool(result & 1) else []

def do_stuff():
    # ...
    for x in maybe(1): yield x
    # ...
    for x in maybe(6): yield x
    # ...
    for x in maybe(5): yield x
NOTE: In the actual code I am translating, the "yields" are buried inside various flow-control constructs. And the "maybe" function has two parameters, and is more complex.
MY QUESTION:
Observe that each call to "maybe" returns either 1 value to yield, or 0 values to yield.
(It would be fine to change "maybe" to return the value, or to return None when there is no value, if that helps.)
Given this 0/1 situation, is there any more succinct way to code?
If as you say you can get away with returning None, then I'd leave the code as it was in the first place:
def maybe(x):
    """ only keep odd value; returns either element or None """
    result = 11 * x
    if result & 1: return result

def do_stuff():
    yield maybe(1)
    yield maybe(6)
    yield maybe(5)
but use a wrapped version instead which tosses the Nones, like:
def do_stuff_use():
    return (x for x in do_stuff() if x is not None)
You could even wrap the whole thing up in a decorator, if you wanted:
import functools

def yield_not_None(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return (x for x in f(*args, **kwargs) if x is not None)
    return wrapper

@yield_not_None
def do_stuff():
    yield maybe(1)
    yield maybe(6)
    yield maybe(5)
after which
>>> list(do_stuff())
[11, 55]