Error converting Tensorflow Object Detection Api model to TFLite - python-2.7

I trained an SSDLite-MobileNetV2 model with Tensorflow, following the documentation provided with the Object Detection API. I then exported the model by running the export_tflite_ssd_graph script, which generated a .pb and a .pbtxt file. Finally, I tried to convert the model to tflite format using the tflite_convert command, but I got the following error:
Traceback (most recent call last):
File "/usr/local/bin/tflite_convert", line 11, in
sys.exit(main())
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/lite/python/tflite_convert.py", line 412, in main
app.run(main=run_main, argv=sys.argv[:1])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/lite/python/tflite_convert.py", line 408, in run_main
_convert_model(tflite_flags)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/lite/python/tflite_convert.py", line 100, in _convert_model
converter = _get_toco_converter(flags)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/lite/python/tflite_convert.py", line 87, in _get_toco_converter
return converter_fn(**converter_kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/lite/python/lite.py", line 340, in from_saved_model
output_arrays, tag_set, signature_key)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/lite/python/convert_saved_model.py", line 239, in freeze_saved_model
meta_graph = get_meta_graph_def(saved_model_dir, tag_set)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/lite/python/convert_saved_model.py", line 61, in get_meta_graph_def
return loader.load(sess, tag_set, saved_model_dir)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 197, in load
return loader.load(sess, tags, import_scope, **saver_kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 350, in load
**saver_kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 275, in load_graph
meta_graph_def = self.get_meta_graph_def_from_tags(tags)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 251, in get_meta_graph_def_from_tags
" could not be found in SavedModel. To inspect available tag-sets in"
RuntimeError: MetaGraphDef associated with tags set(['serve']) could not be found in SavedModel. To inspect available tag-sets in the SavedModel, please use the SavedModel CLI: saved_model_cli
It seems that the conversion script did not include the SERVING tag constant. How can I fix this?
I am using tensorflow-gpu 1.12.0
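The error suggests tflite_convert was pointed at a SavedModel directory, but export_tflite_ssd_graph writes a frozen GraphDef (tflite_graph.pb), not a SavedModel tagged with serve. A minimal sketch (not a verified fix) is to convert from the frozen graph instead, which avoids the tag lookup entirely; the tensor names and the 1x300x300x3 input shape below are the usual defaults for SSD exports and may differ for your model:

import tensorflow as tf  # TF 1.12, so the converter lives under tf.contrib.lite

converter = tf.contrib.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="tflite_graph.pb",              # output of export_tflite_ssd_graph
    input_arrays=["normalized_input_image_tensor"],
    output_arrays=["TFLite_Detection_PostProcess",
                   "TFLite_Detection_PostProcess:1",
                   "TFLite_Detection_PostProcess:2",
                   "TFLite_Detection_PostProcess:3"],
    input_shapes={"normalized_input_image_tensor": [1, 300, 300, 3]})
converter.allow_custom_ops = True  # the detection post-processing op is a custom op
open("detect.tflite", "wb").write(converter.convert())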

Related

Compatibility with GCP aiplatform, bigquery and cloud-storage on hyperparameter tuning docker image

I am doing hyperparameter tuning on GCP using this scikit docker image. When I add the aiplatform package as a dependency, things break. The error comes from the bigquery import.
from google.cloud import bigquery
The error message is below.
The replica workerpool0-0 exited with a non-zero status of 1.
Traceback (most recent call last):
[...]
File "/root/.local/lib/python3.7/site-packages/trainer/task.py", line 7, in
from google.cloud import storage, bigquery
File "/usr/local/lib/python3.7/dist-packages/google/cloud/bigquery/__init__.py", line 35, in
from google.cloud.bigquery.client import Client
File "/usr/local/lib/python3.7/dist-packages/google/cloud/bigquery/client.py", line 60, in
from google.cloud.bigquery import _pandas_helpers
File "/usr/local/lib/python3.7/dist-packages/google/cloud/bigquery/_pandas_helpers.py", line 40, in
from google.cloud.bigquery import schema
File "/usr/local/lib/python3.7/dist-packages/google/cloud/bigquery/schema.py", line 19, in
from google.cloud.bigquery_v2 import types
File "/usr/local/lib/python3.7/dist-packages/google/cloud/bigquery_v2/__init__.py", line 23, in
from google.cloud.bigquery_v2 import types
File "/usr/local/lib/python3.7/dist-packages/google/cloud/bigquery_v2/types.py", line 23, in
from google.cloud.bigquery_v2.proto import encryption_config_pb2
File "/usr/local/lib/python3.7/dist-packages/google/cloud/bigquery_v2/proto/encryption_config_pb2.py", line 64, in
file=DESCRIPTOR,
File "/root/.local/lib/python3.7/site-packages/google/protobuf/descriptor.py", line 560, in __new__
_message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
From the logs, I can see the system downloading google-cloud-aiplatform v1.17.0. According to the scikit docker image, google-cloud-storage v1.35.0 is installed, but google-cloud-aiplatform pulls in v2.5.0.
I suspect I need to downgrade google-cloud-aiplatform to a specific version. Does anyone know which version, or how else to resolve this problem?
UPDATE: FWIW, if I downgrade to google-cloud-aiplatform==1.15.1 the problem above goes away. However, the problem below appears instead.
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/root/.local/lib/python3.7/site-packages/trainer/hpt.py", line 170, in
staging_bucket=f'{args.bucket_uri}'
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform/initializer.py", line 138, in init
backing_tensorboard=experiment_tensorboard,
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform/metadata/metadata.py", line 235, in set_experiment
experiment_name=experiment, description=description
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform/metadata/experiment_resources.py", line 247, in get_or_create
project=project, location=location, credentials=credentials
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform/metadata/metadata_store.py", line 283, in ensure_default_metadata_store_exists
encryption_spec_key_name=encryption_key_spec_name,
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform/metadata/metadata_store.py", line 123, in get_or_create
credentials=credentials,
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform/metadata/metadata_store.py", line 241, in _get
credentials=credentials,
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform/metadata/metadata_store.py", line 73, in __init__
self._gca_resource = self._get_gca_resource(resource_name=metadata_store_name)
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform/base.py", line 617, in _get_gca_resource
return getattr(self.api_client, self._getter_method)(
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform/utils/__init__.py", line 425, in __getattr__
return getattr(self._clients[self._default_version], name)
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform/utils/__init__.py", line 359, in __getattr__
client_info=self._client_info,
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform_v1/services/metadata_service/client.py", line 547, in __init__
api_audience=client_options.api_audience,
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform_v1/services/metadata_service/transports/grpc.py", line 190, in __init__
("grpc.max_receive_message_length", -1),
File "/root/.local/lib/python3.7/site-packages/google/cloud/aiplatform_v1/services/metadata_service/transports/grpc.py", line 241, in create_channel
**kwargs,
File "/root/.local/lib/python3.7/site-packages/google/api_core/grpc_helpers.py", line 318, in create_channel
default_host=default_host,
File "/root/.local/lib/python3.7/site-packages/google/api_core/grpc_helpers.py", line 239, in _create_composite_credentials
credentials, scopes=scopes, default_scopes=default_scopes
TypeError: with_scopes_if_required() got an unexpected keyword argument 'default_scopes'
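One way to attack both errors (a sketch under assumptions, not a verified set of pins) is to constrain the trainer package's dependencies so AI Platform installs mutually compatible versions on the worker. The pins below follow the error message's own suggestion (protobuf below 4.x) and the versions mentioned above; the google-auth floor is an assumption aimed at the default_scopes error from the UPDATE, since that keyword only exists in newer google-auth releases.

# setup.py for the trainer package -- the version pins are illustrative assumptions
from setuptools import find_packages, setup

setup(
    name="trainer",
    version="0.1",
    packages=find_packages(),
    install_requires=[
        "protobuf<4",                       # workaround 1 from the TypeError message
        "google-cloud-aiplatform==1.15.1",  # the version that made the first error go away
        "google-auth>=1.25.0",              # assumption: with_scopes_if_required(default_scopes=...) needs a newer google-auth
        "google-cloud-bigquery",
        "google-cloud-storage",
    ],
)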

Using Tensorflow 2.X model on OpenCV

I have to use a Tensorflow 2.X model with the OpenCV framework (v.4.X with C++).
To do this, I need a single .pb file, or a .pb and a .pbtxt file, instead of the Tensorflow SavedModel that I have.
So my question is: is there a way to convert a SavedModel into a format that OpenCV can read? Maybe a Caffe model?
I tried MMdnn, but it gives me a strange error:
Traceback (most recent call last):
File "/usr/local/bin/mmconvert", line 8, in <module>
sys.exit(_main())
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/_script/convert.py", line 102, in _main
ret = convertToIR._convert(ir_args)
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/_script/convertToIR.py", line 62, in _convert
from mmdnn.conversion.tensorflow.tensorflow_parser import TensorflowParser
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/tensorflow/tensorflow_parser.py", line 15, in <module>
from tensorflow.tools.graph_transforms import TransformGraph
ImportError: No module named 'tensorflow.tools.graph_transforms'
I suppose this is because MMdnn was developed and tested against Tensorflow 1.X.
Edit: I also have the corresponding Keras model (now that Keras is integrated with Tensorflow 2), but it is incompatible with the OpenCV DNN framework too. Trying to convert it with MMdnn, I get this error:
Traceback (most recent call last):
File "/usr/local/bin/mmconvert", line 8, in <module>
sys.exit(_main())
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/_script/convert.py", line 102, in _main
ret = convertToIR._convert(ir_args)
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/_script/convertToIR.py", line 46, in _convert
parser = Keras2Parser(model)
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/keras/keras2_parser.py", line 126, in __init__
model = self._load_model(model[0], model[1])
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/keras/keras2_parser.py", line 78, in _load_model
'DepthwiseConv2D': layers.DepthwiseConv2D})
File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 664, in model_from_json
return deserialize(config, custom_objects=custom_objects)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/__init__.py", line 168, in deserialize
printable_module_name='layer')
File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 147, in deserialize_keras_object
list(custom_objects.items())))
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1056, in from_config
process_layer(layer_data)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1042, in process_layer
custom_objects=custom_objects)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/__init__.py", line 168, in deserialize
printable_module_name='layer')
File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 149, in deserialize_keras_object
return cls.from_config(config['config'])
File "/usr/local/lib/python3.5/dist-packages/keras/engine/base_layer.py", line 1179, in from_config
return cls(**config)
File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/convolutional.py", line 484, in __init__
**kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/convolutional.py", line 117, in __init__
self.kernel_initializer = initializers.get(kernel_initializer)
File "/usr/local/lib/python3.5/dist-packages/keras/initializers.py", line 515, in get
return deserialize(identifier)
File "/usr/local/lib/python3.5/dist-packages/keras/initializers.py", line 510, in deserialize
printable_module_name='initializer')
File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 140, in deserialize_keras_object
': ' + class_name)
ValueError: Unknown initializer: GlorotUniform
Edit 04/2021: Now the ONNX converter mentioned in the comments works properly with OpenCV 4.5.1 (Version 4.5.0 has a bug with some ONNX networks).
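For the record, the ONNX route can look roughly like this (a sketch, assuming the SavedModel has already been converted with tf2onnx; file names and input size are illustrative). The same readNetFromONNX / blobFromImage calls exist in the C++ API:

# First, outside this script: python -m tf2onnx.convert --saved-model ./saved_model --output model.onnx
import cv2

net = cv2.dnn.readNetFromONNX("model.onnx")                 # illustrative path
image = cv2.imread("test.jpg")
blob = cv2.dnn.blobFromImage(image, 1.0 / 255, (224, 224))  # input size depends on the model
net.setInput(blob)
out = net.forward()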
If you have the .h5 file, you can try this approach with TensorFlow instead of MMdnn: convert the current session into a static computation graph that captures the current state (variables frozen into constants), then write the graph in .pb format using tf.train.write_graph.
You can load the pretrained model with model = load_model('./model/keras_model.h5') before you freeze the graph. There is also a blog post with further explanation.
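A minimal sketch of that approach, assuming the .h5 model can be loaded through the TF1-style graph/session API (in TF 2.x this means disabling eager execution first); the paths are illustrative:

import tensorflow as tf

tf.compat.v1.disable_eager_execution()  # run Keras in graph mode so a session exists

from tensorflow.keras.models import load_model

def freeze_session(session, output_names, clear_devices=True):
    """Replace variables with constants and return a frozen GraphDef."""
    graph_def = session.graph.as_graph_def()
    if clear_devices:
        for node in graph_def.node:
            node.device = ""
    return tf.compat.v1.graph_util.convert_variables_to_constants(
        session, graph_def, output_names)

model = load_model('./model/keras_model.h5')
session = tf.compat.v1.keras.backend.get_session()
frozen_graph = freeze_session(
    session, output_names=[out.op.name for out in model.outputs])
tf.io.write_graph(frozen_graph, './model', 'frozen_graph.pb', as_text=False)

The resulting .pb can then be loaded with OpenCV's readNetFromTensorflow.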

Q: Sonos Python Self Test error: No handlers could be found for logger "smapi"

I am trying to run the SONOS self test for a music service on Sonos.
After getting the dependencies and filling out the config file, I try to run the Python Sonos self test. However, it runs into an error, and I have no clue what the underlying issue might be:
No handlers could be found for logger "smapi"
Traceback (most recent call last):
File "suite_selftest.py", line 226, in <module>
nightly_mode(parser.config_file)
File "suite_selftest.py", line 51, in nightly_mode
development_mode(config_file)
File "suite_selftest.py", line 186, in development_mode
fixtures.append(getlastupdate.PollingIntervalTest(suite.client, suite.smapiservice))
File "/Users/thomas/Desktop/PythonSelfTest/smapi/content_workflow/getlastupdate.py", line 20, in __init__
self.poll_interval = self.smapiservice.get_polling_interval()
File "../../sonos-1.1.0.dev_r300235-py2.7.egg/sonos/smapi/smapiservice.py", line 465, in get_polling_interval
File "/usr/local/Cellar/python#2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ConfigParser.py", line 362, in getfloat
return self._get(section, float, option)
File "/usr/local/Cellar/python#2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ConfigParser.py", line 356, in _get
return conv(self.get(section, option))
ValueError: could not convert string to float:
I found the fix already: I had forgotten to add the polling interval in the config file.
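For reference, a tiny reproduction of what went wrong (the section and option names are hypothetical; the real ones live in the Sonos self-test config file): ConfigParser.getfloat raises exactly this ValueError when the option exists but has no value.

from ConfigParser import ConfigParser  # Python 2.7; the module is configparser on Python 3
from StringIO import StringIO

broken = ConfigParser()
broken.readfp(StringIO("[service]\npollinterval =\n"))   # value left empty
try:
    broken.getfloat("service", "pollinterval")
except ValueError as exc:
    print("fails like the self test: %s" % exc)           # could not convert string to float:

fixed = ConfigParser()
fixed.readfp(StringIO("[service]\npollinterval = 3600\n"))
print(fixed.getfloat("service", "pollinterval"))           # 3600.0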

Program fails for SPARK_HOME at foreachPartition after moving the database load utility from one module to another

I am working on a Python Spark project. Initially I had written a script to load a dataframe into Postgres for a particular client, which included a utility function that loads data to Postgres.
df.rdd.repartition(self.__max_conn).foreachPartition(
    lambda iterator: load_utils.load_tab_postgres(conn_prop=conn_prop,
                                                  tab_name=<tablename>,
                                                  iterator=iterator))
Initially the entire code, including the snippet above and load_utils(), was in a single module, and it worked perfectly fine.
Later I had to extract the common code, including load_utils, into a base module that could be used by different client modules. That is when the same code failed with the error below:
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 764, in foreachPartition
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 1004, in count
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 995, in sum
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 869, in fold
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 771, in collect
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 45, in deco
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 18 in stage 126.0 failed 4 times, most recent failure: Lost task 18.3 in stage 126.0 (TID 24028, tbsatad6r15g24.company.co.us, executor 242): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 98, in main
command = pickleSer._read_with_length(infile)
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 164, in _read_with_length
return self.loads(obj)
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 422, in loads
return pickle.loads(obj)
File "build/bdist.linux-x86_64/egg/basemodule/__init__.py", line 12, in <module>
import basemodule.entitymodule.base
File "build/bdist.linux-x86_64/egg/basemodule/entitymodule/base.py", line 12, in <module>
File "build/bdist.linux-x86_64/egg/basemodule/contexts.py", line 17, in <module>
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/conf.py", line 104, in __init__
SparkContext._ensure_initialized()
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 245, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway()
File "/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/java_gateway.py", line 48, in launch_gateway
SPARK_HOME = os.environ["SPARK_HOME"]
File "/dhcommon/dhpython/python/lib/python2.7/UserDict.py", line 23, in __getitem__
raise KeyError(key)
KeyError: 'SPARK_HOME'
Below is the spark-submit command used to run the code:
spark-submit --master yarn --deploy-mode cluster \
    --driver-class-path postgresql-42.2.4.jre6.jar \
    --jars spark-csv_2.10-1.4.0.jar,commons-csv-1.4.jar,postgresql-42.2.4.jre6.jar \
    --py-files project.egg driver_file.py
In both scenarios, the load_utils file containing the load_tab_postgres method is bundled in project.egg.
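For what it's worth, the inner traceback shows the failure happening while the executor unpickles the lambda: importing basemodule pulls in basemodule/contexts.py, which builds a SparkConf at import time, and that in turn needs SPARK_HOME on the executor. A sketch of one way around it, assuming the base module can defer context creation (the function and variable names are illustrative, not from the original project):

# basemodule/contexts.py (sketch) -- nothing Spark-related runs at import time,
# so executors can import the package while unpickling the foreachPartition closure.
from pyspark import SparkConf, SparkContext

_sc = None

def get_spark_context(app_name="client-loader"):
    """Create the SparkContext lazily, on the driver only."""
    global _sc
    if _sc is None:
        _sc = SparkContext(conf=SparkConf().setAppName(app_name))
    return _sc

With this layout, load_utils.load_tab_postgres should only import what the executors actually need (for example the Postgres driver), not anything that triggers module-level Spark initialization.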

Pipeline fails on GCP when writing tensorflow transform metadata

I hope somebody here can help. I've been googling this error like crazy but haven't found anything.
I have a pipeline that works perfectly when executed locally but it fails when executed on GCP. The following are the error messages that I get.
Workflow failed. Causes: S03:Write transform fn/WriteMetadata/ResolveBeamFutures/CreateSingleton/Read+Write transform fn/WriteMetadata/ResolveBeamFutures/ResolveFutures/Do+Write transform fn/WriteMetadata/WriteMetadata failed., A work item was attempted 4 times without success. Each time the worker eventually lost contact with the service. The work item was attempted on:
Traceback (most recent call last): File "preprocess.py", line 491,
in
main() File "preprocess.py", line 487, in main
transform_data(args,pipeline_options,runner) File "preprocess.py", line 451, in transform_data
eval_data |= 'Identity eval' >> beam.ParDo(Identity()) File "/Library/Python/2.7/site-packages/apache_beam/pipeline.py", line 335,
in exit
self.run().wait_until_finish() File "/Library/Python/2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py",
line 897, in wait_until_finish
(self.state, getattr(self._runner, 'last_error_msg', None)), self) apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException:
Dataflow pipeline failed. State: FAILED, Error: Traceback (most recent
call last): File
"/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py",
line 582, in do_work
work_executor.execute() File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py",
line 166, in execute
op.start() File "apache_beam/runners/worker/operations.py", line 294, in apache_beam.runners.worker.operations.DoOperation.start
(apache_beam/runners/worker/operations.c:10607)
def start(self): File "apache_beam/runners/worker/operations.py", line 295, in
apache_beam.runners.worker.operations.DoOperation.start
(apache_beam/runners/worker/operations.c:10501)
with self.scoped_start_state: File "apache_beam/runners/worker/operations.py", line 300, in
apache_beam.runners.worker.operations.DoOperation.start
(apache_beam/runners/worker/operations.c:9702)
pickler.loads(self.spec.serialized_fn)) File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py",
line 225, in loads
return dill.loads(s) File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 277, in
loads
return load(file) File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 266, in
load
obj = pik.load() File "/usr/lib/python2.7/pickle.py", line 858, in load
dispatchkey File "/usr/lib/python2.7/pickle.py", line 1083, in load_newobj
obj = cls.new(cls, *args) TypeError: new() takes exactly 4 arguments (1 given)
Any ideas??
Thanks,
Pedro
If the pipeline works locally but fails on GCP, it's possible that you're running into a version mismatch.
Which TF, tf.Transform, and Beam versions are you running locally and on GCP?
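If it is a version mismatch, one common way to keep the GCP workers in line with the local environment (a sketch; the project, bucket, and pinned versions are placeholders) is to ship the dependency list with the pipeline so Dataflow installs the same tensorflow / tensorflow-transform / apache-beam versions on the workers:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",                 # placeholder
    temp_location="gs://my-bucket/tmp",   # placeholder
)
# requirements.txt pins the same versions used locally, e.g. tensorflow-transform==x.y
options.view_as(SetupOptions).requirements_file = "requirements.txt"

with beam.Pipeline(options=options) as p:
    ...                                   # build the preprocessing pipeline here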