What could be the reason for "TypeError: 'StratifiedShuffleSplit' object is not iterable"? - python-2.7

I have to deliver a machine learning project, and I received a file called tester.py. After finishing my code in another file, I have to run tester.py to see the results, but I am getting an error: TypeError: 'StratifiedShuffleSplit' object is not iterable
I have researched this error in other topics and on other websites, and the solution is always the same: import GridSearchCV from sklearn.model_selection. I have been doing that from the beginning, but tester.py still does not run.
The part of tester.py where the problem occurs is:
def main():
    ### load up student's classifier, dataset, and feature_list
    clf, dataset, feature_list = load_classifier_and_data()
    ### Run testing script
    test_classifier(clf, dataset, feature_list)
if __name__ == '__main__':
    main()
My own code works fine.
Any help?

Try changing the following lines of tester.py.
The current version of StratifiedShuffleSplit works differently from the one tester.py was written for: the number of splits is now passed to the constructor, and the data is passed to its split() method instead.
[..]
from sklearn.model_selection import StratifiedShuffleSplit
[..]
#cv = StratifiedShuffleSplit(labels, folds, random_state = 42)
cv = StratifiedShuffleSplit(n_splits=folds, random_state=42)
[..]
#for train_idx, test_idx in cv:
for train_idx, test_idx in cv.split(features, labels):
[..]
I hope you find it useful
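To illustrate why the old loop fails, here is a minimal sketch that does not use scikit-learn at all. ToyShuffleSplit is a hypothetical stand-in (it shuffles but does not stratify) that, like the splitter in sklearn.model_selection, takes n_splits in its constructor and only yields index pairs from split(). Looping over the object itself raises the same kind of TypeError, while looping over cv.split(features, labels) works.

```python
import random


class ToyShuffleSplit(object):
    """Hypothetical stand-in for a splitter; NOT scikit-learn's implementation."""

    def __init__(self, n_splits=3, test_size=0.25, random_state=None):
        self.n_splits = n_splits
        self.test_size = test_size
        self.random_state = random_state

    def split(self, features, labels=None):
        # Yield (train_indices, test_indices) pairs, one per split.
        rng = random.Random(self.random_state)
        indices = list(range(len(features)))
        n_test = max(1, int(round(len(indices) * self.test_size)))
        for _ in range(self.n_splits):
            rng.shuffle(indices)
            yield sorted(indices[n_test:]), sorted(indices[:n_test])


features = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8]]
labels = [0, 1, 0, 1, 0, 1, 0, 1]
cv = ToyShuffleSplit(n_splits=3, random_state=42)

# Old-style loop: the object defines no __iter__, so this raises TypeError.
try:
    for train_idx, test_idx in cv:
        pass
    old_style_error = None
except TypeError as error:
    old_style_error = str(error)

# New-style loop: iterate over what split() yields.
folds = list(cv.split(features, labels))

print(old_style_error)
print(len(folds))
```

The same pattern applies to the edit in tester.py: build the splitter once, then iterate over the result of split().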

Related

web.py running main twice, ignoring changes

I have a simple web.py app that reads a config file and serves two URL paths. However, I get two strange behaviors. One, changes made to data in Main are not reflected in the results of GET. Two, Main appears to run twice.
The desired behavior is that modifying data in Main causes the methods to see the modified data, and that Main does not re-run.
Questions:
1. What is really happening here, that mydict is not modified in either GET?
2. Why am I getting some code running twice?
3. Simplest path to desired behavior (most important)
4. Pythonic path to desired behavior (least important)
From pbuck (Accepted Answer): The answer to 3.) is to replace
app = web.application(urls, globals())
with:
app = web.application(urls, globals(), autoreload=False)
The behavior is the same on Linux (CentOS 6, Python 2.6.6) and on a MacBook (brew Python 2.7.12).
When started I get:
$ python ./foo.py 8080
Initializing mydict
Modifying mydict
http://0.0.0.0:8080/
When queried with:
wget http://localhost:8080/node/first/foo
wget http://localhost:8080/node/second/bar
Which results in (notice a second "Initializing mydict"):
Initializing mydict
firstClass.GET called with clobber foo
firstClass.GET somevalue is something static
127.0.0.1:52480 - - [17/Feb/2017 17:30:42] "HTTP/1.1 GET /node/first/foo" - 200 OK
secondClass.GET called with clobber bar
secondClass.GET somevalue is something static
127.0.0.1:52486 - - [17/Feb/2017 17:30:47] "HTTP/1.1 GET /node/second/bar" - 200 OK
Code:
#!/usr/bin/python
import web
urls = (
    '/node/first/(.*)', 'firstClass',
    '/node/second/(.*)', 'secondClass'
)
# Initialize web server, start it later at "app.run()"
#app = web.application(urls, globals())
# Running web.application in Main or above does not change behavior
# Static Initialize mydict
print "Initializing mydict"
mydict = {}
mydict['somevalue'] = "something static"
class firstClass:
    def GET(self, globarg):
        print "firstClass.GET called with clobber %s" % globarg
        print "firstClass.GET somevalue is %s" % mydict['somevalue']
        return mydict['somevalue']
class secondClass:
    def GET(self, globarg):
        print "secondClass.GET called with clobber %s" % globarg
        print "secondClass.GET somevalue is %s" % mydict['somevalue']
        return mydict['somevalue']
if __name__ == '__main__':
    app = web.application(urls, globals())
    # read configuration files for initializations here
    print "Modifying mydict"
    mydict['somevalue'] = "something dynamic"
    app.run()
Short answer: avoid using globals, as they don't do what you think they do, especially when you eventually deploy this under nginx / apache, where there will (likely) be multiple processes running.
Longer answer
Why am I getting some code running twice?
Code at module level in app.py runs twice. The first time is when the script starts, as it normally does. The second time is within the web.application(urls, globals()) call. Really, that call sets up module loading / re-loading, and part of that is re-loading all modules (including app.py). If you set autoreload=False in the web.application() call, it won't do that.
What is really happening here, that mydict is not modified in either GET?
mydict is set to 'something dynamic', but is then reset to 'something static' on the second load. Again, set autoreload=False and it will work as you expect.
Shortest path?
autoreload=False
Pythonic path?
.... well, I wonder why you have mydict['somevalue'] = 'something static' and mydict['somevalue'] = 'something dynamic' in your module this way: why not just set it once under '__main__'?
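The double execution can be reproduced without web.py. The following is only a sketch of the effect, not web.py's actual reloader: it assumes a hypothetical script named demo_app.py that imports itself by name, so the same file is loaded a second time as a separate module object. The module-level code runs twice, and the copy loaded second never sees the change made under '__main__'.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Contents of the hypothetical demo_app.py (illustrative, not the asker's app).
SCRIPT = textwrap.dedent('''
    print("Initializing mydict")
    mydict = {"somevalue": "something static"}

    if __name__ == "__main__":
        mydict["somevalue"] = "something dynamic"
        import demo_app  # loads this same file again under its real name
        print("__main__ sees: " + mydict["somevalue"])
        print("re-imported module sees: " + demo_app.mydict["somevalue"])
''')


def run_demo():
    # Write the script to a temporary directory and run it as a program.
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo_app.py")
    with open(path, "w") as handle:
        handle.write(SCRIPT)
    output = subprocess.check_output([sys.executable, path], cwd=workdir)
    return output.decode("utf-8").splitlines()


lines = run_demo()
for line in lines:
    print(line)
```

In this sketch the handlers would correspond to the re-imported copy, which is why they keep seeing 'something static'.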

Returning error string from a method in python

I was reading a similar question, Returning error string from a function in python. While experimenting with something similar in an object-oriented style, so that I could learn a few more things, I got lost.
I am using Python 2.7 and I am a beginner at object-oriented programming.
I cannot figure out how to make it work.
Sample code checkArgumentInput.py:
#!/usr/bin/python
__author__ = 'author'
class Error(Exception):
    """Base class for exceptions in this module."""
    pass
class ArgumentValidationError(Error):
    pass
    def __init__(self, arguments):
        self.arguments = arguments
    def print_method(self, input_arguments):
        if len(input_arguments) != 3:
            raise ArgumentValidationError("Error on argument input!")
        else:
            self.arguments = input_arguments
            return self.arguments
And on the main.py script:
#!/usr/bin/python
import checkArgumentInput
__author__ = 'author'
argsValidation = checkArgumentInput.ArgumentValidationError(sys.argv)
if __name__ == '__main__':
    try:
        result = argsValidation.validate_argument_input(sys.argv)
        print result
    except checkArgumentInput.ArgumentValidationError as exception:
        # handle exception here and get error message
        print exception.message
When I execute the main.py script it prints two blank lines, whether or not I provide any arguments as input.
So my question is: how do I make it work?
I know that there is a module, argparse, that can check argument input for me, but I want to implement something that I could also use in other cases (try, except).
Thank you in advance for the time and effort spent reading and replying to my question.
OK. So, sys.argv is a list, not a function, and it is usually indexed with a number between brackets, like sys.argv[1]. It holds your command-line input. For example, sys.argv[0] is the name of the file.
main.py 42
In this case main.py is sys.argv[0] and 42 is sys.argv[1].
You need to identify which string you are going to take from the command line.
I think that this is the problem.
For more info: https://docs.python.org/2/library/sys.html
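As a small sketch of the indexing described above: describe_arguments is a hypothetical helper, and the list passed to it stands in for what sys.argv would contain when running main.py 42.

```python
import sys


def describe_arguments(argv):
    # argv[0] is the script name; everything after it is user input.
    script_name = argv[0]
    user_arguments = argv[1:]
    return script_name, user_arguments


# Simulated command line for "main.py 42"; a real script would pass sys.argv.
script_name, user_arguments = describe_arguments(["main.py", "42"])
print(script_name)
print(user_arguments)
```

Note that every element is a string, so 42 arrives as "42" and has to be converted explicitly if a number is needed.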
I did some research and found this useful question/answer that helped me understand my error: Manually raising (throwing) an exception in Python
I am posting the corrected, working code below, in case someone benefits from it in the future.
Sample code checkArgumentInput.py:
#!/usr/bin/python
__author__ = 'author'
class ArgumentLookupError(LookupError):
    pass
    def __init__(self, *args): # *args because I do not know the number of args (input from terminal)
        self.output = None
        self.argument_list = args
    def validate_argument_input(self, argument_input_list):
        if len(argument_input_list) != 3:
            raise ValueError('Error on argument input!')
        else:
            self.output = "Success"
            return self.output
The second part main.py:
#!/usr/bin/python
import sys
import checkArgumentInput
__author__ = 'author'
argsValidation = checkArgumentInput.ArgumentLookupError(sys.argv)
if __name__ == '__main__':
    try:
        result = argsValidation.validate_argument_input(sys.argv)
        print result
    except ValueError as exception:
        # handle exception here and get error message
        print exception.message
This code prints: Error on argument input! as expected, because I am violating the condition.
Anyway, thank you all for your time and effort. I hope this answer will help someone else in the future.
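A possible variation, offered only as a sketch: keep the validation logic in an ordinary class and use the exception class purely for signalling the error. ArgumentValidator, its validate method, and check are hypothetical names, not part of the code above. The message is read with str(error) rather than error.message, since the message attribute was deprecated in Python 2.6 and no longer exists in Python 3.

```python
class ArgumentValidationError(ValueError):
    """Raised when the argument list does not have the expected length."""


class ArgumentValidator(object):
    """Hypothetical validator that keeps the checking logic out of the exception."""

    def __init__(self, expected_count):
        self.expected_count = expected_count

    def validate(self, arguments):
        if len(arguments) != self.expected_count:
            raise ArgumentValidationError(
                "Expected %d arguments, got %d"
                % (self.expected_count, len(arguments))
            )
        return "Success"


def check(arguments):
    # Return the success value, or the error text when validation fails.
    validator = ArgumentValidator(3)
    try:
        return validator.validate(arguments)
    except ArgumentValidationError as error:
        return str(error)


print(check(["main.py", "a", "b"]))
print(check(["main.py"]))
```

In a real script, check(sys.argv) would be called under the '__main__' guard.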

How does the import statement execute in Python?

I read about the import statement in the Python docs. It says it executes in two steps:
(1) find a module, and initialize it if necessary; (2) define a name or names in the local namespace (of the scope where the import statement occurs). The first form (without from) repeats these steps for each identifier in the list. The form with from performs step (1) once, and then performs step (2) repeatedly.
I understood some bits of it, but it's still not completely clear to me. I am mainly confused about the initialization step and about the last part, which talks about repeating some steps. The only thing I understood is that if we use, for example:
import sys
then, to use functions of this module in our script, we need to call them as sys.fun_name(), because the functions are not made available locally by this import statement.
But when we use
from sys import argv
we can simply use argv directly, as it is made available locally in our script.
Can someone please explain how this works and let me know whether my understanding is correct?
I also tried to import one of my scripts into another script, and it gave a strange result which I know has something to do with the first step of the import statement (initialization).
##### ex17.py #####
def print_two(*args):
    arg1, arg2 = args
    print "arg1: %r, arg2: %r" %(arg1, arg2)
def print_two_again(arg1, arg2):
    print "arg1: %r, arg2: %r" %(arg1, arg2)
def print_one(arg1):
    print "arg1: %r" %arg1
def print_none():
    print "I got nothing."
print_two("Gaurav","Pareek")
print_two_again("Gaurav","Pareek")
print_one("First!")
print_none()
####### ex18.py ######
import ex17
ex17.print_none()
The output I get when executing ex18.py is as follows:
arg1: 'Gaurav', arg2: 'Pareek'
arg1: 'Gaurav', arg2: 'Pareek'
arg1: 'First!'
I got nothing.
I got nothing.
Why is it like this? It should print "I got nothing." only once.
It prints "I got nothing." twice because the function print_none is being invoked twice. Once when loading the ex17 module (since it's imported in ex18) and once when it's called in the ex18 module. If you don't want the function calls in ex17 to execute but only the function defs to be loaded, then you may write them as follows
## in ex17.py
if __name__ == '__main__':
    print_two("Gaurav","Pareek")
    print_two_again("Gaurav","Pareek")
    print_one("First!")
    print_none()
Now this code will only be executed if it's run as a script, i.e. $ python ex17.py, but not when it's imported into some other module. More about __main__ here
About the excerpt from the docs: it simply describes how the two import forms differ. Step 1 is responsible for finding and initializing the module, and step 2 for adding the names to the local namespace. So in the case of
import sys
both step 1 and 2 will be executed once. But in case of,
from sys import argv, stdout
step 1 will be executed just once, but step 2 will be executed twice as it needs to add both argv and stdout to the local namespace.
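A small sketch of the two forms, using the standard json module purely as an example: after both imports, the module object exists once in sys.modules, and the names bound by the from form refer to the very same objects as the module's attributes.

```python
import sys

import json                    # step 1: find/initialize json; step 2: bind the name "json"
from json import dumps, loads  # step 1 once; step 2 twice (binds "dumps" and "loads")

# The module was initialized once and is cached in sys.modules.
module_is_cached = sys.modules["json"] is json

# The from form only adds local names that point at the same objects.
names_are_same_objects = dumps is json.dumps and loads is json.loads

round_trip = loads(dumps({"answer": 42}))

print(module_is_cached)
print(names_are_same_objects)
print(round_trip)
```

This is also why importing ex17 runs its module-level calls: step 1 executes the module body the first time the module is loaded.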

Python zipfile module erroneously thinks I have a zipfile that spans multiple disks, throws BadZipfile error

I have a 1.4GB zip file and am trying to yield each member in succession. The zipfile module keeps throwing a BadZipfile exception, stating that
"zipfile.BadZipfile: zipfiles that span multiple disks are not supported".
Here is my code:
import zipfile
def iterate_members(zip_file_like_object):
    zflo = zip_file_like_object
    assert zipfile.is_zipfile(zflo) # Here is where the error happens.
    # If I comment out the assert, the same error gets thrown on this next line:
    with zipfile.ZipFile(zflo) as zip:
        members = zip.namelist()
        for member in members:
            yield member
fn = "filename.zip"
iterate_members(open(fn, 'rb'))
I'm using Python 2.7.3. I tried on both Windows 8 and ubuntu with the same result. Any help very much appreciated.
I get the same error on a similar file, although I am using Python 3.4.
I was able to fix it by editing line 205 of the zipfile.py source code, changing:
if diskno != 0 or disks != 1:
    raise BadZipFile("zipfiles that span multiple disks are not supported")
to:
if diskno != 0 or disks > 1:
Hope this helps
Quick fix: install zipfile38 using:
pip install zipfile38
Then use it in your code the same way as before:
import zipfile38 as zipfile
#your code goes here
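To try any of these fixes on a small case, the member-iterating generator from the question can be exercised against an in-memory archive. This is only a sketch: it builds a tiny two-file zip with io.BytesIO (the member names are made up), so it does not reproduce the multi-disk error itself. Note that a generator does nothing until it is iterated, so the result is consumed with list().

```python
import io
import zipfile


def iterate_members(zip_file_like_object):
    # Yield each member name of the archive in turn.
    if not zipfile.is_zipfile(zip_file_like_object):
        raise ValueError("not a zip archive")
    with zipfile.ZipFile(zip_file_like_object) as archive:
        for name in archive.namelist():
            yield name


# Build a small archive in memory (illustrative member names).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("a.txt", "first")
    archive.writestr("b.txt", "second")
buffer.seek(0)

# The generator body only runs once something iterates over it.
names = list(iterate_members(buffer))
print(names)
```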

No indexers created by Djapian for Django

I am working through the tutorial for setting up Djapian and am trying to use the indexshell (as demonstrated in this step). When I run the command 'list' I get the following output:
Installed spaces/models/indexers:
- 0: 'global'
I therefore cannot run any queries:
>>> query
No index selected
Which leads me to attempt:
>>> use 0
Illegal index alias '0'. See 'list' command for available aliases
My index.py is as follows:
from djapian import space, Indexer, CompositeIndexer
from cms.models import Article
class ArticleIndexer(Indexer):
    fields = ['body']
    tags = [
        ('title', 'title'),
        ('author', 'author'),
        ('pub_date', 'pub_date',),
        ('category', 'category')
    ]
space.add_index(Article, ArticleIndexer, attach_as='indexer')
Update: I moved the djapian folder from site-packages to within my project folder, and I moved index.py from the project root to within the djapian folder. When I run 'list' in the indexshell, the following is now returned:
>>> list
Installed spaces/models/indexers:
- 0: 'global'
- 0.0 'cms.Article'
-0.0.0: 'djapian.space.defaultcmsarticleindexer'
I still cannot do anything though as when I try to select an index I still get the following error:
>>> use 0.0
Illegal index alias '0'. See 'list' command for available aliases
Update 2: I had a problem with my setting for DJAPIAN_DATABASE_PATH which is now fixed. I can select an indexer using the command 'use 0.0.0' but when I try to run a query it raises the following ValueError: "Empty slice".
Have you fixed the problem of the ValueError: Empty Slice?
I'm having the exact same problem using the Djapian tutorial. At first I wondered whether my database entries were right, but now I think it might have something to do with the actual querying of the Xapian install.
Since I haven't had to point to the install at all, I wonder whether I placed it in the right directory and whether Djapian knows where to find it.
-- Edit
I've found the solution, at least for me. The tutorial is not up to date, and the query command also expects a number of results. So if you use 'query mykeyword 5' you get 5 results and the ValueError: Empty Slice disappears. It's a known issue and it will be fixed soon, from what I read.
Perhaps you're not loading indexes?
You could try placing the following in your main urls.py:
import djapian
djapian.load_indexes()
In a comment to your question you write that you've placed the index.py file in the project root. It should actually reside within an app, alongside models.py.
One more thing (which is very unlikely to be the cause of your problems); you've got a stray comma on the following line:
('pub_date', 'pub_date',),
                       ^