Guidance on documenting Pyomo model formulations - pyomo

Does anyone have suggestions or best practices for how to document the formulation of Pyomo models (mainly constraints) directly within the code? Ideally, I'd like to find a way to write the constraint math in LaTeX in the docstring of the helper functions that I use to define constraints and then have those docstrings picked up by Sphinx autodoc.
Currently, I'm using the @decorator style of writing Pyomo components, such as:
import pyomo.environ as pyo

class Model:
    def __init__(self, **kwargs):
        self.model = pyo.ConcreteModel()
        self.model.T = pyo.RangeSet(1, 24)
        self.model.x = pyo.Var(self.model.T, within=pyo.NonNegativeReals)

        @self.model.Constraint()
        def Example_Constraint(model):
            r"""This is the docstring that I would ideally be able to `autodoc`.

            .. math::

               \sum_{t \in T} x_{t} \geq 100
            """
            return sum(model.x[t] for t in model.T) >= 100
However, Sphinx cannot "see" inner functions, such as the function used to construct Example_Constraint inside the __init__ method.
Options:

1. Move away from the @decorator syntax and define the helper functions as instance, static, or class methods that are called in the __init__ method (e.g., self.model.Test_Constraint = pyo.Constraint(self.model.T, rule=self.test_constraint_rule)). The downside of this approach (and one of the reasons we originally adopted the @decorator syntax) is that it feels like duplication to define both the Pyomo component (self.model.Example_Constraint) and a separate Python function to construct it (self.test_constraint_rule). A sketch of this option follows the list.
2. @GiorgioBalestrieri's package https://github.com/GiorgioBalestrieri/pyomo-sphinx-docs, which works by pulling the Pyomo components' doc attribute. This seems like an OK option, but not ideal given that it's an experiment and light on formatting options.
3. Other approaches that I'm not aware of?
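For illustration, a minimal sketch of option 1 (method names like example_constraint_rule are placeholders, not from an actual codebase): the rule lives on the class as a static method whose docstring Sphinx autodoc can pick up, and __init__ attaches it to the model.

import pyomo.environ as pyo

class Model:
    def __init__(self, **kwargs):
        self.model = pyo.ConcreteModel()
        self.model.T = pyo.RangeSet(1, 24)
        self.model.x = pyo.Var(self.model.T, within=pyo.NonNegativeReals)
        # attach the documented rule to the model
        self.model.Example_Constraint = pyo.Constraint(
            rule=self.example_constraint_rule)

    @staticmethod
    def example_constraint_rule(model):
        r"""Require total production to be at least 100.

        .. math::

           \sum_{t \in T} x_{t} \geq 100
        """
        return sum(model.x[t] for t in model.T) >= 100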

Related

How to mock spacy models / Doc objects for unit tests?

Loading spacy models slows down running my unit tests. Is there a way to mock spacy models or Doc objects to speed up unit tests?
An example of a current slow test:
import spacy

nlp = spacy.load("en_core_web_sm")

def test_entities():
    text = u"Google is a company."
    doc = nlp(text)
    assert doc.ents[0].text == u"Google"
Based on the docs, my approach is constructing the Vocab and Doc manually and setting the entities as tuples:
from spacy.vocab import Vocab
from spacy.tokens import Doc

def test():
    alphanum_words = u"Google Facebook are companies".split(" ")
    labels = [u"ORG"]
    words = alphanum_words + [u"."]
    spaces = len(words) * [True]
    spaces[-1] = False
    spaces[-2] = False
    vocab = Vocab(strings=(alphanum_words + labels))
    doc = Doc(vocab, words=words, spaces=spaces)

    def get_hash(text):
        return vocab.strings[text]

    entity_tuples = tuple([(get_hash(labels[0]), 0, 1)])
    doc.ents = entity_tuples
    assert doc.ents[0].text == u"Google"
Is there a cleaner, more Pythonic solution for mocking spaCy objects in unit tests for entities?
This is a great question actually! I'd say your instinct is definitely right: If all you need is a Doc object in a given state and with given annotations, always create it manually wherever possible. And unless you're explicitly testing a statistical model, avoid loading it in your unit tests. It makes the tests slow, and it introduces too much unnecessary variance. This is also very much in line with the philosophy of unit testing: you want to be writing independent tests for one thing at a time (not one thing plus a bunch of third-party library code plus a statistical model).
Some general tips and ideas:
- If possible, always construct a Doc manually. Avoid loading models or Language subclasses.
- Unless your application or test specifically needs the doc.text, you do not have to set the spaces. In fact, I leave this out in about 80% of the tests I write, because it really only becomes relevant when you're putting the tokens back together.
- If you need to create a lot of Doc objects in your test suite, you could consider using a utility function, similar to the get_doc helper we use in the spaCy test suite. (That function also shows you how the individual annotations are set manually, in case you need it.)
- Use (session-scoped) fixtures for the shared objects, like the Vocab. Depending on what you're testing, you might want to explicitly use the English vocab. In the spaCy test suite, we do this by setting up an en_vocab fixture in the conftest.py (a minimal sketch of that fixture follows the next code example).
- Instead of setting the doc.ents to a list of tuples, you can also make it a list of Span objects. This looks a bit more straightforward, is easier to read, and in spaCy v2.1+, you can also pass a string as a label:
from spacy.tokens import Doc, Span

def test_entities(en_vocab):
    doc = Doc(en_vocab, words=["Hello", "world"])
    doc.ents = [Span(doc, 0, 1, label="ORG")]
    assert doc.ents[0].text == "Hello"
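For the fixture tip above, a minimal conftest.py sketch might look like this (assuming spaCy v2 and the same get_lang_class helper used in the next example):

# conftest.py
import pytest
import spacy

@pytest.fixture(scope="session")
def en_vocab():
    # shared English vocab, built once per test session
    return spacy.util.get_lang_class("en")().vocab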
If you do need to test a model (e.g. in the test suite that makes sure that your custom models load and run as expected) or a language class like English, put them in a session-scoped fixture. This means that they'll only be loaded once per session instead of once per test. Language classes are lazy-loaded and may also take some time to load, depending on the data they contain. So you only want to do this once.
import pytest
import spacy

# Note: You probably don't have to do any of this, unless you're testing your
# own custom models or language classes.

@pytest.fixture(scope="session")
def en_core_web_sm():
    return spacy.load("en_core_web_sm")

@pytest.fixture(scope="session")
def en_lang_class():
    lang_cls = spacy.util.get_lang_class("en")
    return lang_cls()

def test(en_lang_class):
    doc = en_lang_class("Hello world")

How to group traits together, encapsulating them as a group

I have a coordinate system that it makes sense to treat as a whole group: the coordinates initialize, change, and reset simultaneously. I'd also like to avoid re-rendering once for every coordinate when they change together. Here is a simplified version of what I have in mind, but I can't quite get there. Thanks.
Cleaner code is better in my case even if it uses more advanced features. Could the class 'Coord' be wrapped as a trait itself?
from traits.api import *

class Coord(HasTraits):
    x = Float(1.0)
    y = Float(1.0)

    def __init__(self, **traits):
        HasTraits.__init__(self, **traits)

class Model:
    coord = Instance(Coord)

    @on_trait_change('coord')  # I would so have liked this to "just work"
    def render(self):
        # re-render whenever coordinates change
        ...

class Visualization:
    model = Instance(Model)

    def increment_x(self):
        self.model.coord.x += 1  # should play well with Model.render

    def new_coord(self):
        self.model.coord = Coord(x=2, y=2)  # should play well with Model.render
There are a couple of issues with your source code. Model and Visualization both need to be HasTraits classes for the listener to work.
Also, it is rare to actually need to write the __init__ method of a HasTraits class. Traits is designed to work without it. That said, if you do write an __init__ method, make sure to use super to properly traverse the method resolution order. (Note that you will find this inconsistently implemented in the extant documentation and examples.)
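For example, a minimal sketch of the question's Coord with a cooperative __init__:

from traits.api import Float, HasTraits

class Coord(HasTraits):
    x = Float(1.0)
    y = Float(1.0)

    def __init__(self, **traits):
        # use super() so the full method resolution order runs,
        # rather than calling HasTraits.__init__ directly
        super(Coord, self).__init__(**traits)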
Finally, use the 'anytrait' name to listen for any trait:
from traits.api import Float, HasTraits, Instance, on_trait_change

class Coord(HasTraits):
    x = Float(1.0)
    y = Float(1.0)

class Model(HasTraits):
    coord = Instance(Coord, ())

    @on_trait_change('coord.anytrait')  # listens for any trait on `coord`
    def render(self):
        print("I updated")

class Visualization(HasTraits):
    model = Instance(Model, ())

    def increment_x(self):
        self.model.coord.x += 1  # plays well with Model.render

    def new_coord(self):
        self.model.coord = Coord(x=2, y=2)  # plays well with Model.render
Here's my output:
>>> v = Visualization()
>>> v.increment_x()
I updated
>>> v.new_coord()
I updated

Unmocking a mocked object in Django unit tests

I have several TestCase classes in my Django application. On some of them, I mock out a function which calls external resources by decorating the class with @mock.patch, which works great. One TestCase in my test suite, let's call it B(), depends on that external resource, so I don't want it mocked out and I don't add the decorator. It looks something like this:
@mock.patch("myapp.external_resource_function", new=mock.MagicMock)
class A(TestCase):
    ...  # tests here

class B(TestBase):
    ...  # tests here, which depend on external_resource_function
When I test B independently, things work as expected. However, when I run both tests together, A runs first but the function is still mocked out in B. How can I unmock that call? I've tried reloading the module, but it didn't help.
Patchers have start and stop methods. Based on the code you have provided, I would remove the decorator and start and stop the patcher in the setUp and tearDown methods of your classes:
import mock  # or: from unittest import mock

class A(TestCase):
    def setUp(self):
        self.patcher1 = mock.patch('myapp.external_resource_function', new=mock.MagicMock)
        self.MockClass1 = self.patcher1.start()

    def tearDown(self):
        self.patcher1.stop()

    def test_something(self):
        ...

>>> A('test_something').run()
Great answer. With regard to Ethereal's question, patch objects are pretty flexible in their use.
Here's one way to approach tests that require different patches. You could still use setUp and tearDown, but not to do the patch.start/stop bit.
You start() the patches in each test and use a finally clause to make sure they get stopped().
Patches also support the context manager protocol, so that's another option; a minimal sketch of that form follows the example below.
class A(TestCase):
    patcher1 = patch('myapp.external_resource_function', new=mock.MagicMock)
    patcher2 = patch('myapp.something_else', new=mock.MagicMock)

    def test_something(self):
        li_patcher = [self.patcher1]
        for patcher in li_patcher:
            patcher.start()
        try:
            pass
        finally:
            for patcher in li_patcher:
                patcher.stop()

    def test_something_else(self):
        li_patcher = [self.patcher1, self.patcher2]
        for patcher in li_patcher:
            patcher.start()
        try:
            pass
        finally:
            for patcher in li_patcher:
                patcher.stop()
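And the context-manager form mentioned above, as a minimal sketch using the same patch target and imports as the snippet just shown:

class A(TestCase):
    def test_with_context_manager(self):
        # the patch is active only inside the with block and is
        # stopped automatically on exit, even if the test fails
        with patch('myapp.external_resource_function', new=mock.MagicMock):
            ...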

Extending SWIG builtin classes

The -builtin option of SWIG has the advantage of being faster, and of being exempt from a bug with multiple inheritance.
The drawback is that I can't set any attribute on the generated classes or any subclass:
I can extend a Python builtin type like list, without hassle, by subclassing it:

class Thing(list):
    pass

Thing.myattr = 'anything'  # No problem

However, using the same approach on a SWIG builtin type, the following happens:

class Thing(SWIGBuiltinClass):
    pass

Thing.myattr = 'anything'
AttributeError: type object 'Thing' has no attribute 'myattr'
How could I work around this problem?
I found a solution quite by accident. I was experimenting with metaclasses, thinking I could manage to override the setattr and getattr functions of the builtin type in the subclass.
Doing this, I discovered the builtins already have a metaclass (SwigPyObjectType), so my metaclass had to inherit from it.
And that's it. This alone solved the problem. I would be glad if someone could explain why:

SwigPyObjectType = type(SWIGBuiltinClass)

class Meta(SwigPyObjectType):
    pass

class Thing(SWIGBuiltinClass):
    __metaclass__ = Meta

Thing.myattr = 'anything'  # Works fine this time
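Note that the __metaclass__ attribute is Python 2 syntax; Python 3 ignores it, and the equivalent there is the metaclass keyword argument:

class Thing(SWIGBuiltinClass, metaclass=Meta):
    pass

Thing.myattr = 'anything'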
The problem comes from how swig implements the classes under -builtin to be just like builtin classes (hence the name).
Builtin classes are not extensible: try to add or modify a member of str and Python won't let you modify the attribute dictionary.
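For example:

>>> str.foo = 'bar'
TypeError: can't set attributes of built-in/extension type 'str'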
I do have a solution I've been using for several years. I'm not sure I can recommend it because:

- It's arguably evil: the moral equivalent of casting away const-ness in C/C++.
- It's unsupported and could break in future Python releases.
- I haven't tried it with Python 3.
- I would be a bit uncomfortable using "black magic" like this in production code: it could break and is certainly obscure. But at least one giant corporation IS using it in production code.

But... I love how well it works to solve some obscure features we wanted for debugging.
The original idea is not mine; I got it from https://gist.github.com/mahmoudimus/295200 by Mahmoud Abdelkader.
The basic idea is to access the const dictionary in the swig-created type object as a non-const dictionary and add/override any desired methods.
FYI, the technique of runtime modification of classes is called monkeypatching; see https://en.wikipedia.org/wiki/Monkey_patch
First - here's "monkeypatch.py":
''' monkeypatch.py:
I got this from https://gist.github.com/mahmoudimus/295200 by Mahmoud Abdelkader,
his comment: "found this from Armin R. on Twitter, what a beautiful gem ;)"
I made a few changes for coding style preferences
- Rudy Albachten April 30 2015
'''
import ctypes
from types import DictProxyType, MethodType
# figure out the size of _Py_ssize_t
_Py_ssize_t = ctypes.c_int64 if hasattr(ctypes.pythonapi, 'Py_InitModule4_64') else ctypes.c_int
# python without tracing
class _PyObject(ctypes.Structure):
pass
_PyObject._fields_ = [
('ob_refcnt', _Py_ssize_t),
('ob_type', ctypes.POINTER(_PyObject))
]
# fixup for python with tracing
if object.__basicsize__ != ctypes.sizeof(_PyObject):
class _PyObject(ctypes.Structure):
pass
_PyObject._fields_ = [
('_ob_next', ctypes.POINTER(_PyObject)),
('_ob_prev', ctypes.POINTER(_PyObject)),
('ob_refcnt', _Py_ssize_t),
('ob_type', ctypes.POINTER(_PyObject))
]
class _DictProxy(_PyObject):
_fields_ = [('dict', ctypes.POINTER(_PyObject))]
def reveal_dict(proxy):
if not isinstance(proxy, DictProxyType):
raise TypeError('dictproxy expected')
dp = _DictProxy.from_address(id(proxy))
ns = {}
ctypes.pythonapi.PyDict_SetItem(ctypes.py_object(ns), ctypes.py_object(None), dp.dict)
return ns[None]
def get_class_dict(cls):
d = getattr(cls, '__dict__', None)
if d is None:
raise TypeError('given class does not have a dictionary')
if isinstance(d, DictProxyType):
return reveal_dict(d)
return d
def test():
import random
d = get_class_dict(str)
d['foo'] = lambda x: ''.join(random.choice((c.upper, c.lower))() for c in x)
print "and this is monkey patching str".foo()
if __name__ == '__main__':
test()
Here's a contrived example using monkeypatch. I have a class myclass in module mystuff, wrapped with swig -python -builtin, and I want to add an extra runtime method namelen that returns the length of the name returned by myclass.getName():
import mystuff
import monkeypatch

# add a "namelen" method to all "myclass" objects
def namelen(self):
    return len(self.getName())

d = monkeypatch.get_class_dict(mystuff.myclass)
d['namelen'] = namelen

x = mystuff.myclass("xxxxxxxx")
print "namelen:", x.namelen()
Note that this can also be used to extend or override methods on builtin Python classes, as is demonstrated by the test in monkeypatch.py: it adds a method "foo" to the builtin str class that returns a copy of the original string with random upper/lower case letters.
I would probably replace:

# add a "namelen" method to all "myclass" objects
def namelen(self):
    return len(self.getName())

d = monkeypatch.get_class_dict(mystuff.myclass)
d['namelen'] = namelen

with:

# add a "namelen" method to all "myclass" objects
monkeypatch.get_class_dict(mystuff.myclass)['namelen'] = lambda self: len(self.getName())

to avoid extra global variables.

DRY in django queries, reverse queries, and predicates

I am frustrated that in Django I often end up having to write methods on a custom Manager:
class EntryManager(Manager):
    def filter_beatle(self, beatle):
        return self.filter(headline__contains=beatle)
... and repeat pretty much the same method in a different Manager for a reverse query:
class BlogManager(Manager):
    def filter_beatle(self, beatle):
        return self.filter(entry__headline__contains=beatle)
... and a predicate on Entry:
def headline_contains(self, beatle):
    return self.headline.find(beatle) != -1
[Note that the predicate on Entry will work on Entry objects that haven't even been saved yet.]
This feels like a violation of DRY. Is there some way to express this once and use it in all three places?
What I would like to be able to do is write something like:
q = Q(headline__contains="Lennon")
lennon_entries = Entry.objects.filter(q)
lennon_blogs = Blog.objects.filter(q.reverse(Entry))
is_lennon = entry.would_filter(q)
... where 'headline__contains="Lennon"' expresses exactly once what it means to be 'an Entry about "Lennon"', and this can be used to construct reverse queries and a predicate.
The best place for this is a custom manager. According to Django's guidelines, a manager class is the best place for code that affects more than one object of a class.
class EntryManager(models.Manager):
    def filter_lennons(self):
        return self.get_query_set().filter(headline__contains='Lennon')

class Entry(models.Model):
    headline = models.CharField(max_length=100)
    objects = EntryManager()

lennons = Entry.objects.filter_lennons()
You should rarely have to do the following:
    if entry.headline.find('Lennon') >= 0:
because the filter should take care of restricting the result set to the instances you're interested in.
If you're going to be using the same filter multiple times, you can create a custom manager or a simple class method.
class Entry(models.Model):
    ...

    # this really should be on a custom manager, but this was quicker to demonstrate
    @classmethod
    def find_headlines(cls, text):
        return cls.objects.filter(headline__contains=text)

entries = Entry.find_headlines('Lennon')
But really, the DRYness has already been contained within the Queryset API. How often are you really going to be hard coding the string 'Lennon' into a query? Usually, the search parameter will be passed into a view from a GET or POST. Perfectly DRY.
So, what is the actual problem? Other than exploring the queryset API, have you ever had to hard code lookup values in multiple queries like your question?
For the "reverse filter" case you can use a subquery:
    Blog.objects.filter(entries__in=Entry.objects.filter_beatle("Lennon"))
Reusing or generating predicates is not possible (in general) as there are predicates that cannot be expressed as queries and queries that cannot be expressed as predicates without db access.
My most common use for the predicate seems to be in asserts. Often something like:
class Thing(Model):
    class QuerySet(query.QuerySet):
        def need_to_be_whacked(self):
            # ... code ...
            ...

    def needs_to_be_whacked(self):
        return Thing.objects.need_to_be_whacked().filter(id=self.id).exists()

    def whack(self):
        assert self.needs_to_be_whacked()

for thing in Thing.objects.need_to_be_whacked():
    thing.whack()
I want to make sure that no other code is calling whack() in a state where it doesn't need to be whacked. It costs a database hit, but it works.