Design python like interactive shell - c++

What is the design pattern behind a Python-like interactive shell? I want to build one for my server, but I keep ending up with a long if-then-else chain.
For example, when I start python interpreter I get something like this
Python 2.6.7 (r267:88850, Feb 2 2012, 23:50:20)
[GCC 4.5.3] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> help
After typing help, the prompt changes to help>
Welcome to Python 2.6! This is the online help utility.
If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://docs.python.org/tutorial/.
Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules. To quit this help utility and
return to the interpreter, just type "quit".
To get a list of available modules, keywords, or topics, type "modules",
"keywords", or "topics". Each module also comes with a one-line summary
of what it does; to list the modules whose summaries contain a given word
such as "spam", type "modules spam".
help>
I think this is some kind of read-eval loop design.

For a REPL, you need a context (an object which stores the REPL's state), a command parser (which parses input and produces an AST), and a way to map commands to actions (actions are generally just functions that modify the context and/or produce side effects).
A simple REPL can be implemented like the following, where context is implemented using a simple dictionary, AST is just the inputted commands split on whitespaces, and a dictionary is used to map commands to actions:
# Python 2 syntax (print statement, raw_input, exec ... in)
context = {}
commands = {}

def register(func):
    """ convenience function to put `func` into commands map """
    # in C++, you cannot introspect the function's name so you would
    # need to map the function name to function pointers manually
    commands[func.__name__] = func
    return func  # keep the decorated name bound to the function

def parse(s):
    """ given a command string `s` produce an AST """
    # the simplest parser is just splitting the input string,
    # but you can also use a more complicated grammar
    # to produce a more complicated syntax tree
    return s.split()

def do(cmd, commands, context):
    """ evaluate the AST, producing an output and/or side effect """
    # here, we simply use the first item in the list to choose which function to call
    # in more complicated ASTs, the type of the root node can be used to pick actions
    return commands[cmd[0]](context, cmd)

@register
def assign(ctx, args):
    ctx[args[1]] = args[2]
    return '%s = %s' % (args[1], args[2])

@register
def printvar(ctx, args):
    print ctx[args[1]]
    return None

@register
def defun(ctx, args):
    body = ' '.join(args[2:])
    ctx[args[1]] = compile(body, '', 'exec')
    return 'def %s(): %s' % (args[1], body)

@register
def call(ctx, args):
    exec ctx[args[1]] in ctx
    return None

# more commands here

context['PS1'] = "> "
while True:
    # READ
    inp = raw_input(context["PS1"])
    # EVAL
    cmd = parse(inp)
    out = do(cmd, commands, context)
    # PRINT
    if out is not None:
        print out
    # LOOP
A sample session:
> assign d hello
d = hello
> printvar d
hello
> assign PS1 $
PS1 = $
$defun fun print d + 'world'
def fun(): print d + 'world'
$call fun
helloworld
With a little more trickery, you could even merge the context and commands dictionaries, allowing the shell's set of commands to be modified from within the shell's own language.
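As a rough sketch of that merged-dictionary idea (written for Python 3, and not part of the shell above): commands and variables live in one dictionary called env, and a hypothetical alias command binds a new command name from inside the shell. The names env, alias and run are assumptions made for illustration.

```python
def parse(line):
    # simplest possible "AST": the whitespace-separated words
    return line.split()

def assign(env, args):
    env[args[1]] = args[2]
    return '%s = %s' % (args[1], args[2])

def alias(env, args):
    # bind a new command name to an existing command, from inside the shell
    env[args[1]] = env[args[2]]
    return '%s -> %s' % (args[1], args[2])

def do(cmd, env):
    # variables and commands share one namespace, so check for callability
    action = env.get(cmd[0])
    if not callable(action):
        return 'unknown command: %s' % cmd[0]
    return action(env, cmd)

def run(lines):
    env = {'assign': assign, 'alias': alias}
    return [do(parse(line), env) for line in lines if line.strip()]

outputs = run(['alias let assign', 'let x 42', 'bogus'])
print(outputs)
```

Because unknown names fall through a single lookup, no if-then-else chain is needed; adding a command is just adding a key.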
The name of this design pattern, if it has a name, is Read-Eval-Print Loop design pattern; so yeah, your question sorta answers itself.


Scrapy convert from unicode to utf-8

I've written a simple script to extract data from a site. The script works as expected, but I'm not pleased with the output format.
Here is my code
class ArticleSpider(Spider):
    name = "article"
    allowed_domains = ["example.com"]
    start_urls = (
        "http://example.com/tag/1/page/1",  # trailing comma makes this a tuple
    )

    def parse(self, response):
        next_selector = response.xpath('//a[@class="next"]/@href')
        url = next_selector[1].extract()
        # url is like "tag/1/page/2"
        yield Request(urlparse.urljoin("http://example.com", url))
        item_selector = response.xpath('//h3/a/@href')
        for url in item_selector.extract():
            yield Request(urlparse.urljoin("http://example.com", url),
                          callback=self.parse_article)

    def parse_article(self, response):
        item = ItemLoader(item=Article(), response=response)
        # here i extract title of every article
        item.add_xpath('title', '//h1[@class="title"]/text()')
        return item.load_item()
I'm not pleased with the output, something like:
[scrapy] DEBUG: Scraped from <200 http://example.com/tag/1/article_name>
{'title': [u'\xa0"\u0412\u041e\u041e\u0411\u0429\u0415-\u0422\u041e \u0421\u0412\u041e\u0411\u041e\u0414\u0410 \u0417\u0410\u041a\u0410\u041d\u0427\u0418\u0412\u0410\u0415\u0422\u0421\u042f"']}
I think I need to use custom ItemLoader class but I don't know how. Need your help.
TL;DR I need to convert text, scraped by Scrapy from unicode to utf-8
As you can see below, this isn't much of a Scrapy issue but more of Python itself. It could also marginally be called an issue :)
$ scrapy shell http://censor.net.ua/resonance/267150/voobscheto_svoboda_zakanchivaetsya
In [7]: print response.xpath('//h1/text()').extract_first()
 "ВООБЩЕ-ТО СВОБОДА ЗАКАНЧИВАЕТСЯ"
In [8]: response.xpath('//h1/text()').extract_first()
Out[8]: u'\xa0"\u0412\u041e\u041e\u0411\u0429\u0415-\u0422\u041e \u0421\u0412\u041e\u0411\u041e\u0414\u0410 \u0417\u0410\u041a\u0410\u041d\u0427\u0418\u0412\u0410\u0415\u0422\u0421\u042f"'
What you see is two different representations of the same thing - a unicode string.
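A small sketch of the same point in plain Python 3, without Scrapy (the session above is Python 2, where the escaped form also carries a u prefix). The variable names are illustrative, and the title is shortened.

```python
import ast

# One string value; only its display differs.
title = '\xa0"\u0412\u041e\u041e\u0411\u0429\u0415-\u0422\u041e"'

# ascii() gives the escaped representation, similar to what repr() shows in Python 2
escaped = ascii(title)
print(escaped)

# Reading the escaped literal back yields the identical string
assert ast.literal_eval(escaped) == title

# Encoding to UTF-8 is only needed when writing bytes out; it round-trips losslessly
assert title.encode('utf-8').decode('utf-8') == title
```

So nothing needs "converting": the escaped output in the log is just a debugging representation of the same text.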
What I would suggest is run crawls with -L INFO or add LOG_LEVEL='INFO' to your settings.py in order to not show this output in the console.
One annoying thing is that when you save as JSON, you get escaped unicode JSON e.g.
$ scrapy crawl example -L INFO -o a.jl
gives you:
$ cat a.jl
{"title": "\u00a0\"\u0412\u041e\u041e\u0411\u0429\u0415-\u0422\u041e \u0421\u0412\u041e\u0411\u041e\u0414\u0410 \u0417\u0410\u041a\u0410\u041d\u0427\u0418\u0412\u0410\u0415\u0422\u0421\u042f\""}
This is correct, but it takes more space, and most applications handle non-escaped JSON equally well.
Adding a few lines in your settings.py can change this behaviour:
from scrapy.exporters import JsonLinesItemExporter

class MyJsonLinesItemExporter(JsonLinesItemExporter):
    def __init__(self, file, **kwargs):
        super(MyJsonLinesItemExporter, self).__init__(file, ensure_ascii=False, **kwargs)

FEED_EXPORTERS = {
    'jsonlines': 'myproject.settings.MyJsonLinesItemExporter',
    'jl': 'myproject.settings.MyJsonLinesItemExporter',
}
Essentially what we do is just setting ensure_ascii=False for the default JSON Item Exporters. This prevents escaping. I wish there was an easier way to pass arguments to exporters but I can't see any since they are initialized with their default arguments around here. Anyway, now your JSON file has:
$ cat a.jl
{"title": " \"ВООБЩЕ-ТО СВОБОДА ЗАКАНЧИВАЕТСЯ\""}
which is better-looking, equally valid and more compact.
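To see the effect of that flag in isolation, here is a minimal sketch using only the standard library's json module (Python 3). It does not touch Scrapy's exporter classes; it only shows what ensure_ascii changes, and the sample item is made up.

```python
import json

item = {"title": "\u0421\u0412\u041e\u0411\u041e\u0414\u0410"}

escaped = json.dumps(item)                       # default: ensure_ascii=True
readable = json.dumps(item, ensure_ascii=False)  # non-ASCII characters kept as-is

# Both forms decode to the same data; the readable one is shorter
assert json.loads(escaped) == json.loads(readable) == item
assert len(readable) < len(escaped)
```

When writing the readable form to a file yourself, open the file with an explicit UTF-8 encoding so the characters are stored correctly.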
There are 2 independent issues affecting the display of unicode strings.
If you return a list of strings, the output file will have issues with them, because the ascii codec is used by default to serialize list elements. You can work around it as below, but it's more appropriate to use extract_first() as suggested by @neverlastn:
class Article(Item):
    title = Field(serializer=lambda x: u', '.join(x))
The default implementation of the repr() method serializes unicode strings to their escaped version (\uxxxx). You can change this behaviour by overriding this method in your item class:
class Article(Item):
    def __repr__(self):
        data = self.copy()
        for k in data.keys():
            if type(data[k]) is unicode:
                data[k] = data[k].encode('utf-8')
        # use the base-class repr on the copy, so this method does not call itself
        return super(Article, data).__repr__()

Passing an argument to main that calls a function in python

I'm trying to pass arguments to my python script using argparse and consequently call functions. Any ideas where I might be going wrong?
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('-d','--d', dest='action', action='store_const',const=do_comparison,
                        help="Diff the current and most recent map file memory information)")
    options = parser.parse_args()
    return options

def do_comparison(parsed_args):
    # do things

def main(args):
    options = parse_args()

if __name__ == '__main__':
    sys.exit(main())
In my comment I missed the fact that you are using store_const and const=do_comparison. So you are trying some sort of callback.
options from parse_args is an argparse.Namespace object. This is a simple object, similar to a dictionary. In fact vars(options) returns a dictionary.
When main is run (with -d), options.action will be set to the const, a function. But remember, in Python, functions are first class objects, and can be set to variables, etc just like numbers and strings. To be used the function has to be 'called'.
options.action()
should end up calling do_comparison. Actually since that function requires an argument, you should use
options.action(options)
or some other way of providing a variable or object to the function.
Of course you'll have to be careful about the case where you don't specify -d. Then options.action will have the default value (e.g. None). If the default isn't a callable, then this call will produce an error.
The argparse documentation illustrates this kind of action in the section dealing with subparsers (subcommands). I vaguely recall a tutorial that set an argument value to functions like add and multiply, creating a simple arithmetic expression evaluator.
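A sketch in the spirit of that subcommand pattern (Python 3): each subparser stores its own function with set_defaults, and the caller simply invokes it. The add and multiply functions and the evaluate helper are made-up names for illustration, not taken from the tutorial mentioned above.

```python
import argparse

def add(ns):
    return ns.x + ns.y

def multiply(ns):
    return ns.x * ns.y

def build_parser():
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(dest='command')
    for name, func in (('add', add), ('multiply', multiply)):
        sub = subparsers.add_parser(name)
        sub.add_argument('x', type=int)
        sub.add_argument('y', type=int)
        # store the function itself in the namespace for this subcommand
        sub.set_defaults(func=func)
    return parser

def evaluate(argv):
    ns = build_parser().parse_args(argv)
    return ns.func(ns)  # call the stored function with the parsed namespace

print(evaluate(['add', '2', '3']))
print(evaluate(['multiply', '4', '5']))
```

If no subcommand is given, the namespace has no func attribute in this sketch, so a real script would want to check for that before calling.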
Usually the values in the Namespace are strings, or numbers, and to use them you test for string equality. e.g.
if options.action is None:
    pass  # default action
elif options.action == 'print':
    print(options)
else:
    pass  # do some other backup or error
A callback kind of action is possible, and may be convenient in some cases, but it isn't the usual arrangement.
You asked about storing a string that follows -d, to be used as the function argument, with:
parser.add_argument('-d','--d', dest='action', dest='function_input', action='store_const', const=diff_map)
A 'store_const' action does not take an argument (in effect nargs=0). It's more like store_true. In fact store_true is just a store_const that has default=False and const=True.
What you need is another argument, which could occur either before or after the -d. argparse tries to be order flexible.
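The store_true / store_const equivalence mentioned above can be checked with a few lines (Python 3; the flag names -a and -b are arbitrary):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-a', action='store_true')
parser.add_argument('-b', action='store_const', const=True, default=False)

absent = parser.parse_args([])
present = parser.parse_args(['-a', '-b'])

# Both flags behave identically, and neither consumes a following value
print(absent.a, absent.b)
print(present.a, present.b)
```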
Here's a simple script with a callable argument, and flexible positional argument.
import argparse

def action1(*args):
    print 'action1',args

def action0(*args):
    print 'action0',args

parser = argparse.ArgumentParser()
parser.add_argument('-d', dest='action', action='store_const', const=action1, default=action0)
parser.add_argument('args', nargs='*')
args = parser.parse_args()
args.action(args.args)
resulting runs
1238:~/mypy$ python stack32214076.py
action0 ([],)
1238:~/mypy$ python stack32214076.py one two three
action0 (['one', 'two', 'three'],)
1238:~/mypy$ python stack32214076.py one two three -d
action1 (['one', 'two', 'three'],)
1239:~/mypy$ python stack32214076.py -d one two three
action1 (['one', 'two', 'three'],)
1239:~/mypy$ python stack32214076.py -d
action1 ([],)
To make -d value perform some action on value, try:
parser.add_argument('-d','--action')
The default action type stores one value (e.g. action='store', nargs=None)
args = parser.parse_args()
if args.action:  # or is not None
    do_comparison(args.action)
If -d is not given args.action will have default None value, and nothing happens here.
If -d astr is given, args.action will have the string value 'astr'. The if statement just calls the do_comparison function with this value. It's the presence of this (nondefault) value that triggers the function call.
This is a rather straightforward use of a parser and an argument.

Capture the output from function in real time python

I didn't find quite what I was looking for.
I want to obtain the output (stdout) from a python function in real time.
The actual problem is that I want to plot a graph (with cplot from sympy) with a progress bar in my UI. The argument verbose makes cplot output the progress to stdout.
sympy.mpmath.cplot(lambda z: z, real, imag, verbose=True)
The output would be something like:
0 of 71
1 of 71
2 of 71
...
And so on.
I want to capture line by line so I can make a progress bar. (I realize this might not be possible without implementing multithreading). I'm using python2.7 (mainly because I need libraries that aren't in python3)
So, how do I achieve that?
You can capture stdout by monkeypatching sys.stdout. A good way to do it is using a context manager, so that it gets put back when you are done (even if the code raises an exception). If you don't use a context manager, be sure to put the original sys.stdout back using a finally block.
You'll need an object that is file-like, that takes the input and does what you want with it. Subclassing StringIO is a good start. Here's an example of a context manager that captures stdout and stderr and puts them in the result of the bound variable.
import sys
from contextlib import contextmanager

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3

class CapturedText(object):
    pass

@contextmanager
def captured(disallow_stderr=True):
    """
    Context manager to capture the printed output of the code in the with block
    Bind the context manager to a variable using `as` and the result will be
    in the stdout property.
    >>> from tests.helpers import captured
    >>> with captured() as c:
    ...     print('hello world!')
    ...
    >>> c.stdout
    'hello world!\n'
    """
    stdout = sys.stdout
    stderr = sys.stderr
    sys.stdout = outfile = StringIO()
    sys.stderr = errfile = StringIO()
    c = CapturedText()
    try:
        yield c
    finally:
        # always restore the real streams, even if the with block raised
        c.stdout = outfile.getvalue()
        c.stderr = errfile.getvalue()
        sys.stdout = stdout
        sys.stderr = stderr
    if disallow_stderr and c.stderr:
        raise Exception("Got stderr output: %s" % c.stderr)
It works as shown in the docstring. You can replace StringIO() with your own class that writes the progress bar.
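As a sketch of what such a replacement class could look like (Python 3): a file-like object that watches for "N of M" lines and calls a callback as each line arrives. ProgressWriter, run_with_progress and noisy are hypothetical names; noisy is only a stand-in for the cplot call, whose exact output format is assumed from the question.

```python
import re
import sys

class ProgressWriter(object):
    """File-like object that turns 'N of M' lines into callback calls."""
    pattern = re.compile(r'(\d+) of (\d+)')

    def __init__(self, callback):
        self.callback = callback
        self.buffer = ''

    def write(self, text):
        self.buffer += text
        # print() may deliver one line in several pieces, so act on complete lines only
        while '\n' in self.buffer:
            line, self.buffer = self.buffer.split('\n', 1)
            match = self.pattern.search(line)
            if match:
                self.callback(int(match.group(1)), int(match.group(2)))
        return len(text)

    def flush(self):
        pass

def run_with_progress(func, callback):
    original = sys.stdout
    sys.stdout = ProgressWriter(callback)
    try:
        return func()
    finally:
        sys.stdout = original  # restored even if func raises

def noisy():
    # stand-in for a function that prints its progress
    for i in range(3):
        print(i, 'of', 71)

seen = []
run_with_progress(noisy, lambda done, total: seen.append((done, total)))
print(seen)
```

The callback runs while the function is still executing, but a GUI will only repaint if its event loop gets a chance to run, which is why a worker thread may still be needed.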
Another possibility would be to monkeypatch sympy.mpmath.visualization.print, since cplot uses print to print the output, and it uses from __future__ import print_function.
First, make sure you are using from __future__ import print_function if you aren't using Python 3, as this will otherwise be a SyntaxError.
Then something like
def progressbar_print(*args, **kwargs):
    # Take *args and convert it to a progress output
    # (progress is a placeholder for your own progress-bar update function)
    progress(*args)
    # If you want to still print the output, do it here
    print(*args, **kwargs)

sympy.mpmath.visualization.print = progressbar_print
You might want to monkeypatch it in a custom function that puts it back, as other functions in that module might use print as well. Again, remember to do this using either a context manager or a finally block so that it gets put back even if an exception is raised.
Monkeypatching sys.stdout is definitely the more standard way of doing this, but I like this solution in that it shows that having print as a function can actually be useful.
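Here is a sketch of the "patch print and put it back" idea as a context manager (Python 3). To keep it self-contained it patches a throwaway module named demo rather than sympy; patched_print and demo are hypothetical names, and whether the real module behaves the same way is an assumption based on the description above.

```python
import contextlib
import types

# Stand-in module whose function reports progress with print()
demo = types.ModuleType('demo')
exec("def work(n):\n    for i in range(n):\n        print(i, 'of', n)\n", demo.__dict__)

@contextlib.contextmanager
def patched_print(module, replacement):
    missing = object()
    original = module.__dict__.get('print', missing)
    # a module-level name shadows the builtin for code defined in that module
    module.print = replacement
    try:
        yield
    finally:
        # put things back even if the body raised
        if original is missing:
            del module.print
        else:
            module.print = original

seen = []
with patched_print(demo, lambda *args, **kwargs: seen.append(args)):
    demo.work(3)
print(seen)
```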

python 2.7 or 3.2(classes and instances)

I'm a beginner in Python. My question is: when writing a project in Python, how do I make a user-input variable an attribute?
For example:
class supermarket:
    num=int(input('enter a no.'))
    def __init__(self,num):
        self.ini=''
    def odd_even(self,num):
        if num%2==0:
            self.ini='even'
        else:
            self.ini='odd'
#calling
pallavi=supermarket()
pallavi.(num)
Here, it's showing the error that there is no attribute called num.
What should I do?
This is just a summary and leaves a lot out, but basically, your num should go inside the __init__() call as self.num. So:
class supermarket:
    def __init__(self):
        self.ini = ''
        self.num = int(input('enter a no.'))
    # etc.
Then to access the attribute:
pallavi = supermarket()
pallavi.num # No parentheses needed
There's lots more to classes in Python that I don't have time to go into right now, but I'll touch on one thing: until you know what you're doing, all assignments in a class should go inside a function, not in the class definition itself. If you have a statement with a = sign in it that's in the class, not in a function (like the num=int(input("enter a no.")) statement in your example), it's going to fail and you won't understand why.
The reason why goes into the difference between "class variables" and "instance variables", but it might be too soon for you to wrestle with that concept. Still, it might be worth taking a look at the Python tutorial's chapter on classes. If you don't understand parts of that tutorial, don't worry about it yet -- just learn a few concepts, keep on writing code, then go back later and read the tutorial again and a few more concepts may become clear to you.
Good luck!
You have numerous problems here:
num = int(input(...)) assigns a class attribute - this code runs when the class is defined, not when an instance is created, and the attribute will be shared by all instances of the class;
Despite defining a second num parameter to __init__, you call pallavi = supermarket() without passing the argument;
Also, why is num a parameter of odd_even - if it's an attribute, access it via self; and
pallavi.(num) is not correct Python syntax - attribute access syntax is object.attr, the parentheses are a SyntaxError.
I think what you want is something like:
class Supermarket():  # note PEP-8 naming

    # no class attributes

    def __init__(self, num):
        self.num = num  # assign instance attribute
        self.ini = 'odd' if num % 2 else 'even'  # don't need separate method

    @classmethod  # method of the class, rather than of an instance
    def from_input(cls):
        while True:
            try:
                num = int(input('Enter a no.: '))  # try to get an integer
            except ValueError:
                print("Please enter an integer.")  # require valid input
            else:
                return cls(num)  # create class from user input
This separates out the request for user input from the actual initialisation of the instance, and would be called like:
>>> pallavi = Supermarket.from_input()
Enter a no.: foo
Please enter an integer.
Enter a no.: 12
>>> pallavi.num
12
>>> pallavi.ini
'even'
As you mention 3.2 and 2.7, note that input should be replaced with raw_input when using 2.x.
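For the class-attribute versus instance-attribute distinction raised in both answers, a tiny sketch may help (Python 3; the class names Shared and Separate are made up for illustration):

```python
class Shared:
    values = []            # class attribute: one list shared by every instance

    def add(self, item):
        self.values.append(item)

class Separate:
    def __init__(self):
        self.values = []   # instance attribute: a new list for each instance

    def add(self, item):
        self.values.append(item)

a, b = Shared(), Shared()
a.add(1)

c, d = Separate(), Separate()
c.add(1)

print(b.values)  # b sees the item added through a
print(d.values)  # d is unaffected by c
```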

How to alter a python script with arcpy.GetParameterAsText when run as a stand alone script?

I have created a python script that runs from an ArcMap 10.1 session; however, I would like to modify it to run as a stand alone script, if possible. The problem is I don't see a workaround for prompting the user for the parameters when executed outside ArcMap.
Can this even be reasonably done? If so, how would I approach it? Below is a sample of my script. How can I modify this to prompt the user at the command line for the path names of parameters 0 and 1?
import arcpy
arcpy.env.overwriteOutput = True
siteArea = arcpy.GetParameterAsText(0)
tempGDB_Dir = arcpy.GetParameterAsText(1)
tempGDB = tempGDB_Dir + "\\tempGDB.gdb"
# Data from which records will be extracted
redWoods = "D:\\Data\\GIS\\Landforms\\Tress.gdb\\Redwoods"
# List of tree names that will be used in join
treesOfInterest = "C:\\Data\\GIS\\Trees\\RedwoodList.dbf"
inFeature = [redWoods, siteArea]
tempFC = tempGDB_Dir + "\\TempFC"
tempFC_Layer = "TempFC_Layer"
output_dbf = tempGDB_Dir + "\\Output.dbf"
# Make a temporary geodatabase
arcpy.CreateFileGDB_management(tempGDB_Dir, "tempGDB.gdb")
# Intersect trees with site area
arcpy.Intersect_analysis([redWoods, siteArea], tempFC, "ALL", "", "INPUT")
# Make a temporary feature layer of the results
arcpy.MakeFeatureLayer_management(tempFC, tempFC_Layer)
# Join redwoods data layer to list of trees
arcpy.AddJoin_management(tempFC_Layer, "TreeID", treesOfInterest, "TreeID", "KEEP_COMMON")
# Frequency analysis - keeps only distinct species values
arcpy.Frequency_analysis(tempFC_Layer, output_dbf, "tempFC.TreeID;tempFC.TreeID", "")
# Delete temporary files
arcpy.Delete_management(tempFC_Layer)
arcpy.Delete_management(tempGDB)
This is as much a philosophical question as it is a programmatic one. I am interested in whether this can be done and the amount of effort to do it this way. Is the effort worth the convenience of not opening up a map document?
Check to see if the parameters were specified. If they were not specified, do one of the following:
Use Python's raw_input() function to prompt the user.
Print a "usage" message, which instructs the user to enter parameters on the command line, and then exit.
Prompting the user could look like this:
siteArea = arcpy.GetParameterAsText(0)
tempGDB_Dir = arcpy.GetParameterAsText(1)
if (not siteArea):
    arcpy.AddMessage("Enter the site area:")
    siteArea = raw_input()
if (not tempGDB_Dir):
    arcpy.AddMessage("Enter the temp GDB dir:")
    tempGDB_Dir = raw_input()
Printing a usage message could look like this:
siteArea = arcpy.GetParameterAsText(0)
tempGDB_Dir = arcpy.GetParameterAsText(1)
if (not (siteArea and tempGDB_Dir)):
    arcpy.AddMessage("Usage: myscript.py <site area> <temp GDB dir>")
else:
    # the rest of your script goes here
If you prompt for input with raw_input(), make sure to make all parameters required when adding to your toolbox in ArcGIS for Desktop. Otherwise, you'll get this error from raw_input() when running in Desktop:
EOFError: EOF when reading a line
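The prompting approach can also be factored into a small helper that is easy to test without ArcGIS. This is only a sketch in Python 3: fill_missing is a hypothetical name, the values list stands for whatever arcpy.GetParameterAsText returned for each index, and ask would be raw_input on Python 2.

```python
def fill_missing(values, prompts, ask=input):
    """Return values with every empty entry replaced by an answer from ask()."""
    return [value if value else ask(prompt)
            for value, prompt in zip(values, prompts)]

# Simulated run: the first parameter was supplied, the second was left empty
answers = iter(['C:\\temp'])
result = fill_missing(['site.shp', ''],
                      ['Enter the site area: ', 'Enter the temp GDB dir: '],
                      ask=lambda prompt: next(answers))
print(result)
```

Passing the prompt function in as an argument means the same logic can be exercised in a test with canned answers, as done here.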
Hell yeah, it's worth the convenience of not opening up ArcMap. I like to use the optparse module to create command line tools. arcpy.GetParameter(0) is only useful for Esri GUI integration (e.g. script tools). Here is a nice example of a Python command-line tool:
http://www.jperla.com/blog/post/a-clean-python-shell-script
I include a unittest class in my tools for testing and automation. I also keep all arcpy.GetParameterAsText statements outside of any real business logic. I like to include at the bottom:
if __name__ == '__main__':
    if arcpy.GetParameterAsText(0):
        params = parse_arcpy_parameters()
        main_business_logic(params)
    else:
        unittest.main()
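A sketch of the same separation for command-line use, with argparse swapped in for optparse (argparse ships with Python 2.7 and later). The function bodies are placeholders and parse_command_line is a made-up name; no arcpy calls are shown, so the real geoprocessing steps would go where the placeholder string is built.

```python
import argparse

def main_business_logic(site_area, temp_gdb_dir):
    # placeholder for the geoprocessing steps; it only receives plain values
    return 'processing %s into %s' % (site_area, temp_gdb_dir)

def parse_command_line(argv):
    parser = argparse.ArgumentParser(description='Run the analysis outside ArcMap.')
    parser.add_argument('site_area', help='path to the site area feature class')
    parser.add_argument('temp_gdb_dir', help='folder for the temporary geodatabase')
    return parser.parse_args(argv)

def main(argv):
    args = parse_command_line(argv)
    return main_business_logic(args.site_area, args.temp_gdb_dir)

print(main(['site.shp', 'C:/temp']))
```

Because the business logic takes ordinary arguments, it can be called from a script tool, the command line, or a unit test without change.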