Discovering keys using h5py in python3 - python-2.7

In Python 2.7, I can inspect an HDF5 file's keys using:
$ python
>>> import h5py
>>> f = h5py.File('example.h5', 'r')
>>> f.keys()
[u'some_key']
However, in python3.4, I get something different:
$ python3 -q
>>> import h5py
>>> f = h5py.File('example.h5', 'r')
>>> f.keys()
KeysViewWithLock(<HDF5 file "example.h5" (mode r)>)
What is KeysViewWithLock, and how can I examine my HDF5 keys in Python3?

From h5py's website (http://docs.h5py.org/en/latest/high/group.html#dict-interface-and-links):
When using h5py from Python 3, the keys(), values() and items()
methods will return view-like objects instead of lists. These objects
support containership testing and iteration, but can’t be sliced like
lists.
This explains why we can't view them. The simplest answer is to convert them to a list:
>>> list(f.keys())
Unfortunately, I run things in iPython, and it uses the command 'l'. That means that approach won't work.
In order to actually view them, we need to take advantage of containership testing and iteration. Containership testing means we'd have to already know the keys, so that's out. Fortunately, it's simple to use iteration:
>>> [key for key in f.keys()]
['mins', 'rects_x', 'rects_y']
I've created a simple function that does this automatically:
def keys(f):
    return [key for key in f.keys()]
Then you get:
>>> keys(f)
['mins', 'rects_x', 'rects_y']
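The view behaviour is not specific to h5py: a plain Python 3 dict returns the same kind of object from keys(). As a rough sketch, the snippet below uses an ordinary dict as a stand-in for the open file (the dict and its key names are assumptions for illustration, not h5py calls) to show what a view supports and how to turn it into a list.

```python
# A plain dict stands in for an open h5py.File here (assumption for
# illustration); in Python 3 its keys() also returns a view object.
fake_file = {"mins": 1, "rects_x": 2, "rects_y": 3}

view = fake_file.keys()

# Views support membership tests and iteration...
has_mins = "mins" in view
names = sorted(view)

# ...but they cannot be indexed or sliced like a list.
try:
    view[0]
    sliceable = True
except TypeError:
    sliceable = False

# Converting to a list gives something you can print, index and slice.
key_list = list(view)
print(key_list)
```

The same list(...) call should work on the object returned by f.keys(), since the h5py documentation quoted above says those objects support iteration.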

Is tf.Variable a tensor or not?

I've read some answers on this question here and here, however I'm still a bit puzzled by tf.Variable being and/or not being a tf.Tensor.
The linked answers deal with the mutability of tf.Variable and mention that tf.Variables maintain their state (when instantiated with the default parameter trainable=True).
What still confuses me is a test case I came across when writing simple unit tests using tf.test.TestCase.
Consider the following code snippet. We have a simple class called Foo which has only one property, a tf.Variable initialized to w:
import tensorflow as tf
import numpy as np
class Foo:
    def __init__(self, w):
        self.w = tf.Variable(w)
Now, let's say you want to test that the instance of Foo has w initialized with tensor of the same dimension as passed in via w. The simplest test case could be written as follows:
import tensorflow as tf
import numpy as np
from foo import Foo
class TestFoo(tf.test.TestCase):
    def test_init(self):
        w = np.random.rand(3,2)
        foo = Foo(w)
        init = tf.global_variables_initializer()
        with self.test_session() as sess:
            sess.run(init)
            self.assertShapeEqual(w, foo.w)
if __name__ == '__main__':
    tf.test.main()
Now when you run the test you'll get the following error:
======================================================================
ERROR: test_init (__main__.TestFoo)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_foo.py", line 12, in test_init
    self.assertShapeEqual(w, foo.w)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/test_util.py", line 1100, in assertShapeEqual
    raise TypeError("tf_tensor must be a Tensor")
TypeError: tf_tensor must be a Tensor
----------------------------------------------------------------------
Ran 2 tests in 0.027s
FAILED (errors=1)
You can "get around" this unit test error by doing something like this (i.e. note assertShapeEqual was replaced with assertEqual):
self.assertEqual(list(w.shape), foo.w.get_shape().as_list())
What I'm interested in, though, is the tf.Variable vs tf.Tensor relationship.
What the test error seems to be suggesting is that foo.w is NOT a tf.Tensor, meaning you probably can't use tf.Tensor API on it. Consider, however, the following interactive python session:
$ python3
Python 3.6.3 (default, Oct 4 2017, 06:09:15)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> import numpy as np
>>> w = np.random.rand(3,2)
>>> var = tf.Variable(w)
>>> var.get_shape().as_list()
[3, 2]
>>> list(w.shape)
[3, 2]
>>>
In the session above, we create a variable and run the get_shape() method on it to retrieve its shape dimensions. Now, get_shape() method is a tf.Tensor API method as you can see here.
So, to get back to my question: which parts of the tf.Tensor API does tf.Variable implement? If the answer is ALL of them, why does the above test case fail at
self.assertShapeEqual(w, foo.w)
with
raise TypeError("tf_tensor must be a Tensor")
I'm pretty sure I'm missing something fundamental here, or maybe it's a bug in assertShapeEqual? I would appreciate it if someone could shed some light on this.
I'm using the following version of tensorflow on macOS with python3:
tensorflow (1.4.1)
That testing utility function checks whether the variable is an instance of tf.Tensor:
>>> import tensorflow as tf
>>> v = tf.Variable('v')
>>> v
<tf.Variable 'Variable:0' shape=() dtype=string_ref>
>>> isinstance(v, tf.Tensor)
False
The answer appears to be 'no'.
Update:
According to the documentation that is correct:
https://www.tensorflow.org/programmers_guide/variables
Unlike tf.Tensor objects, a tf.Variable exists outside the context of
a single session.run call.
Although:
A tf.Variable represents a tensor whose value can be changed by
running ops on it.
(Not quite sure what 'represents a tensor' means - sounds like a design 'feature')

Handling map function in python2 & python3

I recently came across a problem and I'm confused about a possible solution.
The relevant code is:
# code part in result reader
result = map(int, input())
# consumer call
result_consumer(result)
This isn't about how these functions work. The problem is that in Python 2 the exception is raised at the point where the result is fetched, so the result reader can handle it. In Python 3, however, a map object is returned, so the exception only surfaces when the consumer iterates over it, and only the consumer can handle it.
Is there a solution that keeps the map function and handles the exception in the same place in both Python 2 and Python 3?
python3
>>> d = map(int, input())
1,2,3,a
>>> d
<map object at 0x7f70b11ee518>
>>>
python2
>>> d = map(int, input())
1,2,3,'a'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'a'
>>>
The behavior of map is not the only difference between Python 2 and Python 3; input is also different. You need to keep the basic differences between the two in mind to make code compatible with both:
python 3 vs python 2
map = itertools.imap
zip = itertools.izip
filter = itertools.ifilter
range = xrange
input = raw_input
So, to write code for both, you can use alternatives such as list comprehensions that work the same in both versions. For features without easy alternatives, you can write new functions and/or use conditional renames, for example:
my_input = input
try:
    raw_input
except NameError: #we are in python 3
    my_input = lambda msj="": eval(input(msj))
(or with your favorite way to check which version of python is in execution)
# code part in result reader
result = [ int(x) for x in my_input() ]
# consumer call
result_consumer(result)
That way your code does the same thing regardless of which version of Python runs it.
But, as jsbueno mentioned, eval and Python 2's input are dangerous, so use the more secure raw_input or Python 3's input:
try:
    input = raw_input
except NameError: #we are in python 3
    pass
(or with your favorite way to check which version of python is in execution)
then if your plan is to provide your input as 1,2,3 add an appropriate split
# code part in result reader
result = [ int(x) for x in input().split(",") ]
# consumer call
result_consumer(result)
If you always need the exception to occur at the same place you can always force the map object to yield its results by wrapping it in a list call:
result = list(map(int, input()))
If an error occurs in Python 2 it will be during the call to map while, in Python 3, the error is going to surface during the list call.
The slight downside is that in the case of Python 2 you'll create a new list. To avoid this you could alternatively branch based on sys.version and use the list only in Python 3 but that might be too tedious for you.
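As a minimal sketch of the list-wrapping approach, the reader below forces the map inside its own try block so a bad item is caught there on either Python version. The function name read_ints and the comma-separated input format are assumptions made for this example, not part of the original question.

```python
def read_ints(raw):
    """Parse comma-separated integers, failing inside the reader."""
    try:
        # list() consumes the map immediately, so a bad item raises here
        # on Python 3 as well, not later in the consumer.
        return list(map(int, raw.split(",")))
    except ValueError:
        # Hypothetical error policy for the sketch: signal failure with None.
        return None


good = read_ints("1,2,3")
bad = read_ints("1,2,a")
print(good, bad)
```

Returning None on failure is only one possible policy; re-raising or logging inside the except block would keep the error handling in the reader just as well.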
I usually use my own version of map in these situations to avoid any problems that may occur:
def my_map(func,some_list):
    done = []
    for item in some_list:
        done.append( func(item) )
    return done
and my own version of input too
def getinput(text):
    import sys
    ver = sys.version[0]
    if ver=="3":
        return input(text)
    else:
        return raw_input(text)
If you are working on a big project, add them to a Python file and import them whenever you need them, as I do.

Confused about __import__ in Python

I'm trying to import a module with __import__ like this:
>>> mod = __import__('x.y.z')
But I only get x:
>>> print mod
<module 'x' from '...'>
What should I do to import z? I tried the following; it works, but I don't know why.
>>> mod = __import__('x.y.z', {}, {}, [''])
>>> print mod
<module 'x.y.z' from '...'>
I'm really confused about this, and I also have no idea what the globals and locals parameters are for.
Thanks a lot!
Relevant notes from the docs (__import__):
When the name variable is of the form package.module, normally, the top-level package (the name up till the first dot) is returned, not the module named by name. However, when a non-empty fromlist argument is given, the module named by name is returned.
Hence, it's similar to writing import x.y.z which also makes x available (as well as x.y and x.y.z).
Use the importlib module instead; a bare-bones version of it is available in 2.7. Note that a relative name needs a leading dot when a package is given as the second argument.
import importlib
z = importlib.import_module(".z", "x.y")
# equivalent to
from x.y import z
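To see the difference on a package that ships with Python, here is a small sketch that uses xml.etree.ElementTree in place of the hypothetical x.y.z (the choice of package is an assumption for illustration). It compares what __import__ returns with and without a fromlist against what importlib.import_module returns.

```python
import importlib

# __import__ with a dotted name returns the top-level package.
top = __import__("xml.etree.ElementTree")

# A non-empty fromlist makes it return the named submodule instead.
leaf = __import__("xml.etree.ElementTree", fromlist=["ElementTree"])

# importlib.import_module returns the named submodule directly.
absolute = importlib.import_module("xml.etree.ElementTree")

# A relative name needs a leading dot plus the anchor package.
relative = importlib.import_module(".ElementTree", "xml.etree")

print(top.__name__, leaf.__name__, absolute is relative)
```

All three of leaf, absolute and relative refer to the same module object, while top is the xml package itself.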

Manually building a deep copy of a ConfigParser in Python 2.7

Just starting in on my Python learning curve, and hitting a snag in porting some code up to Python 2.7. It appears that in Python 2.7 it is no longer possible to perform a deepcopy() on instances of ConfigParser. It also appears that the Python team isn't terribly interested in restoring such a capability:
http://bugs.python.org/issue16058
Can someone propose an elegant solution for manually constructing a deepcopy/duplicate of an instance of ConfigParser?
Many thanks, -Pete
This is just an example implementation of Jan Vlcinsky's answer written in Python 3 (I don't have enough reputation to post this as a comment on Jan's answer). Many thanks to Jan for the push in the right direction.
To make a full (deep) copy of base_config into new_config, just do the following:
import io
import configparser
config_string = io.StringIO()
base_config.write(config_string)
# We must reset the buffer ready for reading.
config_string.seek(0)
new_config = configparser.ConfigParser()
new_config.read_file(config_string)
Based on @Toenex's answer, modified for Python 2.7:
import StringIO
import ConfigParser
# Create a deep copy of the configuration object
config_string = StringIO.StringIO()
base_config.write(config_string)
# We must reset the buffer to make it ready for reading.
config_string.seek(0)
new_config = ConfigParser.ConfigParser()
new_config.readfp(config_string)
The previous solution doesn't work in all python3 use cases. Specifically if the original parser is using Extended Interpolation the copy may fail to work correctly. Fortunately, the easy solution is to use the pickle module:
import configparser
import pickle
def deep_copy(config: configparser.ConfigParser) -> configparser.ConfigParser:
    """Deep copy config by round-tripping it through pickle."""
    rep = pickle.dumps(config)
    new_config = pickle.loads(rep)
    return new_config
If you need a new, independent copy of a ConfigParser, one option is:
1. start with the original ConfigParser instance
2. serialize the config into a temporary file or a StringIO buffer
3. use that temporary file or StringIO buffer to create a new ConfigParser.
And you have it done.
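The serialize-and-reload steps above can be sketched as a small Python 3 helper. The function name copy_config and the sample section are assumptions for this example; the new parser is created with default settings, so a source parser with custom options (for example a non-default interpolation) would need those passed to the constructor as well.

```python
import configparser
import io


def copy_config(source):
    """Return an independent ConfigParser built from source's contents."""
    buffer = io.StringIO()
    source.write(buffer)   # serialize the original into memory
    buffer.seek(0)         # rewind before reading it back
    duplicate = configparser.ConfigParser()
    duplicate.read_file(buffer)
    return duplicate


original = configparser.ConfigParser()
original["Section A"] = {"key1": "value1"}

duplicate = copy_config(original)
duplicate["Section A"]["key1"] = "changed"

# The original is untouched by changes made to the copy.
print(original["Section A"]["key1"], duplicate["Section A"]["key1"])
```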
If you are using Python 3 (3.2+) you can use the Mapping Protocol Access to copy (actually deep copy) the sections and options of a source configuration to another ConfigParser object.
You can use read_dict() to copy the state of a configuration parser.
Here is a demo:
import configparser
# the configuration to deep copy:
src_cfg = configparser.ConfigParser()
src_cfg.add_section("Section A")
src_cfg["Section A"]["key1"] = "value1"
src_cfg["Section A"]["key2"] = "value2"
# the destination configuration
dst_cfg = configparser.ConfigParser()
dst_cfg.read_dict(src_cfg)
dst_cfg.add_section("Section B")
dst_cfg["Section B"]["key3"] = "value3"
To display the resulting configuration, you can try:
import io
output = io.StringIO()
dst_cfg.write(output)
print(output.getvalue())
You get:
[Section A]
key1 = value1
key2 = value2
[Section B]
key3 = value3
After reading this thread, I am more familiar with config.ini files.
Here is what I ended up with, for the record:
import io
import configparser
def copy_config_demo():
    with io.StringIO() as memory_file:
        memory_file.write(str(test_config_data.__doc__))  # original_config.write(memory_file)
        memory_file.seek(0)
        new_config = configparser.ConfigParser(interpolation=configparser.ExtendedInterpolation())
        new_config.read_file(memory_file)
    # below is just for test
    for section_name, list_item in [(section_name, new_config.items(section_name)) for section_name in new_config.sections()]:
        print('\n[' + section_name + ']')
        for key, value in list_item:
            print(f'{key}: {value}')
def test_config_data():
    """
    [Common]
    home_dir: /Users
    library_dir: /Library
    system_dir: /System
    macports_dir: /opt/local
    [Frameworks]
    Python: >=3.2
    path: ${Common:system_dir}/Library/Frameworks/
    [Arthur]
    name: Carson
    my_dir: ${Common:home_dir}/twosheds
    my_pictures: ${my_dir}/Pictures
    python_dir: ${Frameworks:path}/Python/Versions/${Frameworks:Python}
    """
output:
[Common]
home_dir: /Users
library_dir: /Library
system_dir: /System
macports_dir: /opt/local
[Frameworks]
python: >=3.2
path: /System/Library/Frameworks/
[Arthur]
name: Carson
my_dir: /Users/twosheds
my_pictures: /Users/twosheds/Pictures
python_dir: /System/Library/Frameworks//Python/Versions/>=3.2
hoping it is helpful to you.

haystack whoosh no results - rebuild_index shows Indexing [number] <django.utils.functional.__proxy__ object at [memory location] >

When I run ./manage.py rebuild_index, I get output such as:
Indexing 4574 <django.utils.functional.__proxy__ object at 0x1aab690>
Having seen other users' output, I believe this should show the name of the search index/model instead. I am wondering whether this could help explain why I get no search results on the website, and why no objects appear to be indexed when I run:
>>> from haystack.query import SearchQuerySet
>>> sqs = SearchQuerySet().all()
>>> sqs.count()
I did not initially have a
def __unicode__(self):
    return self.name
on the models I am indexing, but after I added it and ran rebuild_index again, nothing seemed to change.
This was GitHub pull request #746 for Django Haystack, which has now been merged.
I was seeing this same issue on my local (dev) setup. Updating solved the "functional proxy" placeholder issue for me.
I ran the following command:
pip install -e git+git://github.com/toastdriven/django-haystack.git@master#egg=django-haystack
You may need to tweak the command to suit your own needs and/or environment.