I am using ubuntu 14.04
I am trying to get a python program to get speech to text from microphone.
For this, I have installed sphinxbase and pocketsphinx. pocketsphinx_continuous works.
thekindlyone#deepthought:.../lib$ pocketsphinx_continuous -inmic yes
INFO: cmd_ln.c(691): Parsing command line:
pocketsphinx_continuous \
-inmic yes
Current configuration:
[NAME] [DEFLT] [VALUE]
-adcdev
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-argfile
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm
-infile
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctl
-lmname default default
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-5 1.000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-time no no
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-usewdphones no no
-uw 1.0 1.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: cmd_ln.c(691): Parsing command line:
\
-nfilt 20 \
-lowerf 1 \
-upperf 4000 \
-wlen 0.025 \
-transform dct \
-round_filters no \
-remove_dc yes \
-svspec 0-12/13-25/26-38 \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-cmninit 56,-3,1 \
-varnorm no
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 56,-3,1
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.000000e+00
-ncep 13 13
-nfft 512 512
-nfilt 40 20
-remove_dc no yes
-round_filters yes no
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-svspec 0-12/13-25/26-38
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 4.000000e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.500000e-02
INFO: acmod.c(246): Parsed model-specific feature parameters from /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/feat.params
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(167): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(517): Reading model definition: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: mdef.c(528): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: bin_mdef.c(513): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(903): Loading senones from dump file /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/sendump
INFO: s2_semi_mgau.c(927): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(1022): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1296): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: dict.c(317): Allocating 137543 * 32 bytes (4298 KiB) for word entries
INFO: dict.c(332): Reading main dictionary: /usr/share/pocketsphinx/model/lm/en_US/cmu07a.dic
INFO: dict.c(211): Allocated 1010 KiB for strings, 1664 KiB for phones
INFO: dict.c(335): 133436 words read
INFO: dict.c(341): Reading filler dictionary: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/noisedict
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(344): 11 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 60400 bytes (58 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 60400 bytes (58 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
INFO: ngram_model_dmp.c(196): ngrams 1=5001, 2=436879, 3=418286
INFO: ngram_model_dmp.c(242): 5001 = LM.unigrams(+trailer) read
INFO: ngram_model_dmp.c(288): 436879 = LM.bigrams(+trailer) read
INFO: ngram_model_dmp.c(314): 418286 = LM.trigrams read
INFO: ngram_model_dmp.c(339): 37293 = LM.prob2 entries read
INFO: ngram_model_dmp.c(359): 14370 = LM.bo_wt2 entries read
INFO: ngram_model_dmp.c(379): 36094 = LM.prob3 entries read
INFO: ngram_model_dmp.c(407): 854 = LM.tseg_base entries read
INFO: ngram_model_dmp.c(463): 5001 = ascii word strings read
INFO: ngram_search_fwdtree.c(99): 788 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 60 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 60 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 13428
INFO: ngram_search_fwdtree.c(338): after: 457 root, 13300 non-root channels, 26 single-phone words
INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: continuous.c(371): pocketsphinx_continuous COMPILED ON: Dec 22 2013, AT: 20:43:21
Then I ran livedemo.py from pocketsphinx/src/gst-plugin This is the error I get:
thekindlyone#deepthought:~/.../gst-plugin$ python livedemo.py
Using pygtkcompat and Gst from gi
Traceback (most recent call last):
File "livedemo.py", line 102, in <module>
app = DemoApp()
File "livedemo.py", line 31, in __init__
self.init_gst()
File "livedemo.py", line 53, in init_gst
+ '! pocketsphinx configured=true ! fakesink')
gi._glib.GError: no element "pocketsphinx"
thekindlyone#deepthought:~/.../gst-plugin$
I found that I have export a new path as per cmusphinx wiki. But /usr/local/lib/gstreamer-1.0 is not present. What should I do next?
output of gst-inspect-1.0 pocketsphinx
No such element or plugin 'pocketsphinx'
output of gst-inspect pocketsphinx
Factory Details:
Long name: PocketSphinx
Class: Filter/Audio
Description: Convert speech to text
Author(s): David Huggins-Daines <dhuggins#cs.cmu.edu>
Rank: none (0)
Plugin Details:
Name: pocketsphinx
Description: PocketSphinx plugin
Filename: /usr/lib/gstreamer-0.10/libgstpocketsphinx.so
Version: 0.8
License: BSD
Source module: pocketsphinx
Binary package: PocketSphinx
Origin URL: http://cmusphinx.sourceforge.net/
GObject
+----GstObject
+----GstElement
+----GstPocketSphinx
Pad Templates:
SINK template: 'sink'
Availability: Always
Capabilities:
audio/x-raw-int
width: 16
depth: 16
signed: true
endianness: 1234
channels: 1
rate: 8000
SRC template: 'src'
Availability: Always
Capabilities:
text/plain
Element Flags:
no flags set
Element Implementation:
Has change_state() function: gst_element_change_state_func
Has custom save_thyself() function: gst_element_save_thyself
Has custom restore_thyself() function: gst_element_restore_thyself
Element has no clocking capabilities.
Element has no indexing capabilities.
Element has no URI handling capabilities.
Pads:
SRC: 'src'
Implementation:
Has custom eventfunc(): gst_pad_event_default
Has custom queryfunc(): gst_pad_query_default
Has custom iterintlinkfunc(): gst_pad_iterate_internal_links_default
Has getcapsfunc(): gst_pad_get_fixed_caps_func
Has acceptcapsfunc(): gst_pad_acceptcaps_default
Pad Template: 'src'
SINK: 'sink'
Implementation:
Has chainfunc(): 0x7f4e0c00c4f0
Has custom eventfunc(): 0x7f4e0c00c1b0
Has custom queryfunc(): gst_pad_query_default
Has custom iterintlinkfunc(): gst_pad_iterate_internal_links_default
Has getcapsfunc(): gst_pad_get_fixed_caps_func
Has acceptcapsfunc(): gst_pad_acceptcaps_default
Pad Template: 'sink'
Element Properties:
name : The name of the object
flags: readable, writable
String. Default: "pocketsphinx0"
hmm : Directory containing acoustic model parameters
flags: readable, writable
String. Default: null
lm : Language model file
flags: readable, writable
String. Default: null
lmctl : Language model control file (for class LMs)
flags: readable, writable
String. Default: null
lmname : Language model name (to select LMs from lmctl)
flags: readable, writable
String. Default: "default"
dict : Dictionary File
flags: readable, writable
String. Default: null
mllr : MLLR file
flags: readable, writable
String. Default: null
fsg : Finite state grammar file
flags: readable, writable
String. Default: null
fsg-model : Finite state grammar object (fsg_model_t *)
flags: writable
Pointer. Write only
fwdflat : Enable Flat Lexicon Search
flags: readable, writable
Boolean. Default: false
bestpath : Enable Graph Search
flags: readable, writable
Boolean. Default: false
maxhmmpf : Maximum number of HMMs searched per frame
flags: readable, writable
Integer. Range: 1 - 100000 Default: 2000
maxwpf : Maximum number of words searched per frame
flags: readable, writable
Integer. Range: 1 - 100000 Default: 20
beam : Beam width applied to every frame in Viterbi search
flags: readable, writable
Float. Range: -1 - 1 Default: 0
wbeam : Beam width applied to phone transitions
flags: readable, writable
Float. Range: -1 - 1 Default: 0
pbeam : Beam width applied to phone transitions
flags: readable, writable
Float. Range: -1 - 1 Default: 0
dsratio : Evaluate acoustic model every N frames
flags: readable, writable
Integer. Range: 1 - 10 Default: 1
latdir : Output Directory for Lattices
flags: readable, writable
String. Default: null
lattice : Word lattice object for most recent result
flags: readable
Boxed pointer of type "PSLattice"
nbest : N-best results
flags: readable
Array of GValues of type "gchararray"
nbest-size : Number of hypothesis in the N-best list
flags: readable, writable
Integer. Range: 1 - 1000 Default: 10
decoder : The underlying decoder
flags: readable
Boxed pointer of type "PSDecoder"
configured : Set this to finalize configuration
flags: readable, writable
Boolean. Default: false
Element Signals:
"partial-result" : void user_function (GstElement* object,
gchararray arg0,
gchararray arg1,
gpointer user_data);
"result" : void user_function (GstElement* object,
gchararray arg0,
gchararray arg1,
gpointer user_data);
UPDATES:
I downloaded fresh copies from github and installed, no change.
sphinxbase build
sphinxbase install
pocketsphinx build
pocketsphinx install
5th attempt on clean install worked. /usr/local/lib/gstreamer1.0 created. Adding this to GST_PLUGIN_PATH worked.
Related
I've provisions a Keyspace on AWS and in order to make sure it can achieve our desired performance I'm trying to run the cassandra-stress tool on it and compare it to other architectures we're experimenting with.
I managed to connect to it using the following cqlshrc:
[connection]
port = 9142
factory = cqlshlib.ssl.ssl_transport_factory
[ssl]
validate = true
certfile = /root/.cassandra/AmazonRootCA1.pem
And the following command (hoping that soon enough there will be Python3 support, the development was completed this February according to their Jira ticket):
cqlsh cassandra.eu-central-1.amazonaws.com 9142 -u "myuser-at-722222222222" -p "12/12ZmHmtD1klsDk9cgqt/XXXXXXXXxUz6Sy687z/U=" --ssl --cqlversion="3.4.4"
Surprisingly or not, when using the official AWS guides things tend to work.
So I went on and tried connecting the cassandra-stress tool (I have it inside a Docker container, I'd rather keep my OS Java free) to the same Keyspace.
First I converted the AWS AmazonRootCA1.pem into cassandra_truststore.jks using the following commands (explained here):
openssl x509 -outform der -in AmazonRootCA1.pem -out temp_file.der
keytool -import -alias cassandra -keystore cassandra_truststore.jks -file temp_file.der
Now when I'm trying to run the actual tool like this:
./cassandra-stress write -node cassandra.eu-central-1.amazonaws.com -port native=9142 thrift=9142 jmx=9142 -transport truststore=/root/.cassandra/cassandra_truststore.jks truststore-password=mypassword -mode native cql3 user="myuser-at-722222222222" password="12/12ZmHmtD1klsDk9cgqt/XXXXXXXXxUz6Sy687z/U="
I'm getting the following error:
******************** Stress Settings ********************
Command:
Type: write
Count: -1
No Warmup: false
Consistency Level: LOCAL_ONE
Target Uncertainty: 0.020
Minimum Uncertainty Measurements: 30
Maximum Uncertainty Measurements: 200
Key Size (bytes): 10
Counter Increment Distibution: add=fixed(1)
Rate:
Auto: true
Min Threads: 4
Max Threads: 1000
Population:
Sequence: 1..1000000
Order: ARBITRARY
Wrap: true
Insert:
Revisits: Uniform: min=1,max=1000000
Visits: Fixed: key=1
Row Population Ratio: Ratio: divisor=1.000000;delegate=Fixed: key=1
Batch Type: not batching
Columns:
Max Columns Per Key: 5
Column Names: [C0, C1, C2, C3, C4]
Comparator: AsciiType
Timestamp: null
Variable Column Count: false
Slice: false
Size Distribution: Fixed: key=34
Count Distribution: Fixed: key=5
Errors:
Ignore: false
Tries: 10
Log:
No Summary: false
No Settings: false
File: null
Interval Millis: 1000
Level: NORMAL
Mode:
API: JAVA_DRIVER_NATIVE
Connection Style: CQL_PREPARED
CQL Version: CQL3
Protocol Version: V4
Username: myuser-at-722222222222
Password: *suppressed*
Auth Provide Class: null
Max Pending Per Connection: 128
Connections Per Host: 8
Compression: NONE
Node:
Nodes: [cassandra.eu-central-1.amazonaws.com]
Is White List: false
Datacenter: null
Schema:
Keyspace: keyspace1
Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
Replication Strategy Pptions: {replication_factor=1}
Table Compression: null
Table Compaction Strategy: null
Table Compaction Strategy Options: {}
Transport:
factory=org.apache.cassandra.thrift.TFramedTransportFactory; truststore=/root/.cassandra/cassandra_truststore.jks; truststore-password=mypassword; keystore=null; keystore-password=null; ssl-protocol=TLS; ssl-alg=SunX509; store-type=JKS; ssl-ciphers=TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA;
Port:
Native Port: 9142
Thrift Port: 9142
JMX Port: 9142
Send To Daemon:
*not set*
Graph:
File: null
Revision: unknown
Title: null
Operation: WRITE
TokenRange:
Wrap: false
Split Factor: 1
java.lang.RuntimeException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: cassandra.eu-central-1.amazonaws.com/3.127.48.183:9142 (com.datastax.driver.core.exceptions.TransportException: [cassandra.eu-central-1.amazonaws.com/3.127.48.183] Channel has been closed))
at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:220)
at org.apache.cassandra.stress.settings.SettingsSchema.createKeySpacesNative(SettingsSchema.java:79)
at org.apache.cassandra.stress.settings.SettingsSchema.createKeySpaces(SettingsSchema.java:69)
at org.apache.cassandra.stress.settings.StressSettings.maybeCreateKeyspaces(StressSettings.java:228)
at org.apache.cassandra.stress.StressAction.run(StressAction.java:57)
at org.apache.cassandra.stress.Stress.run(Stress.java:143)
at org.apache.cassandra.stress.Stress.main(Stress.java:62)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: cassandra.eu-central-1.amazonaws.com/3.127.48.183:9142 (com.datastax.driver.core.exceptions.TransportException: [cassandra.eu-central-1.amazonaws.com/3.127.48.183] Channel has been closed))
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:233)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:79)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1424)
at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:403)
at org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:160)
at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:211)
... 6 more
I've tried changing some parameters such as the jks password etc. (Just in case I was wrong) but I got a different error message so it's probably not the case.
Did I miss something?
Try using TLP Stress instead.
tlp-stress run RandomPartitionAccess -d 10m --host cassandra.us-east-1.amazonaws.com --port 9142 --username alice --password fLyWYFlTCD5J2gzGAZ –ssl --max-requests 4000 --dc us-east-2 --threads 10
https://thelastpickle.com/tlp-stress/
Hi I am newbie to tensorflow. My aim is to convert .pb file to .tflite from pretrain model for my understanding. I have download mobilenet_v1_1.0_224 Model. Below is structure for model
mobilenet_v1_1.0_224.ckpt.data-00000-of-00001 - 66312kb
mobilenet_v1_1.0_224.ckpt.index - 20kb
mobilenet_v1_1.0_224.ckpt.meta - 3308kb
mobilenet_v1_1.0_224.tflite - 16505kb
mobilenet_v1_1.0_224_eval.pbtxt - 520kb
mobilenet_v1_1.0_224_frozen.pb - 16685kb
I know model already has .tflite file, but for my understanding I am trying to convert it.
My First Step : Creating frozen Graph file
import tensorflow as tf
imported_meta = tf.train.import_meta_graph(base_dir + model_folder_name + meta_file,clear_devices=True)
graph_ = tf.get_default_graph()
with tf.Session() as sess:
#saver = tf.train.import_meta_graph(base_dir + model_folder_name + meta_file, clear_devices=True)
imported_meta.restore(sess, base_dir + model_folder_name + checkpoint)
graph_def = sess.graph.as_graph_def()
output_graph_def = graph_util.convert_variables_to_constants(sess, graph_def, ['MobilenetV1/Predictions/Reshape_1'])
with tf.gfile.GFile(base_dir + model_folder_name + './my_frozen.pb', "wb") as f:
f.write(output_graph_def.SerializeToString())
I have successfully created my_frozen.pb - 16590 kb . But original file size is 16,685kb, which is clearly visible in folder structure above. So this is my first question why file size is different, Am I following some wrong path.
My Second Step : Creating tflite file using bazel command
bazel run --config=opt tensorflow/contrib/lite/toco:toco -- --input_file=/path_to_folder/my_frozen.pb --output_file=/path_to_folder/model.tflite --inference_type=FLOAT --input_shape=1,224,224,3 --input_array=input --output_array=MobilenetV1/Predictions/Reshape_1
This commands give me model.tflite - 0 kb.
Trackback for bazel Command
INFO: Analysed target //tensorflow/contrib/lite/toco:toco (0 packages loaded).
INFO: Found 1 target...
Target //tensorflow/contrib/lite/toco:toco up-to-date:
bazel-bin/tensorflow/contrib/lite/toco/toco
INFO: Elapsed time: 0.369s, Critical Path: 0.01s
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/tensorflow/contrib/lite/toco/toco '--input_file=/home/ubuntu/DEEP_LEARNING/Prashant/TensorflowBasic/mobilenet_v1_1.0_224/frozengraph.pb' '--output_file=/home/ubuntu/DEEP_LEARNING/Prashant/TensorflowBasic/mobilenet_v1_1.0_224/float_model.tflite' '--inference_type=FLOAT' '--input_shape=1,224,224,3' '--input_array=input' '--output_array=MobilenetV1/Predictions/Reshape_1'
2018-04-12 16:36:16.190375: I tensorflow/contrib/lite/toco/import_tensorflow.cc:1265] Converting unsupported operation: FIFOQueueV2
2018-04-12 16:36:16.190707: I tensorflow/contrib/lite/toco/import_tensorflow.cc:1265] Converting unsupported operation: QueueDequeueManyV2
2018-04-12 16:36:16.202293: I tensorflow/contrib/lite/toco/graph_transformations/graph_transformations.cc:39] Before Removing unused ops: 290 operators, 462 arrays (0 quantized)
2018-04-12 16:36:16.211322: I tensorflow/contrib/lite/toco/graph_transformations/graph_transformations.cc:39] Before general graph transformations: 290 operators, 462 arrays (0 quantized)
2018-04-12 16:36:16.211756: F tensorflow/contrib/lite/toco/graph_transformations/resolve_batch_normalization.cc:86] Check failed: mean_shape.dims() == multiplier_shape.dims()
Python Version - 2.7.6
Tensorflow Version - 1.5.0
Thanks In advance :)
The error Check failed: mean_shape.dims() == multiplier_shape.dims()
was an issue with resolution of batch norm and has been resolved in:
https://github.com/tensorflow/tensorflow/commit/460a8b6a5df176412c0d261d91eccdc32e9d39f1#diff-49ed2a40acc30ff6d11b7b326fbe56bc
In my case the error occurred using tensorflow v1.7
Solution was to use tensorflow v1.15 (nightly)
toco --graph_def_file=/path_to_folder/my_frozen.pb \
--input_format=TENSORFLOW_GRAPHDEF \
--output_file=/path_to_folder/my_output_model.tflite \
--input_shape=1,224,224,3 \
--input_arrays=input \
--output_format=TFLITE \
--output_arrays=MobilenetV1/Predictions/Reshape_1 \
--inference-type=FLOAT
I am trying to use delayed_job to process large videos and audio files in the background. For the most part everything works. The only time it runs into any hiccups is when larger files are uploaded (~>200MB)
app/models/userfile.rb
has_attached_file :userfile,
path: ':dir_path/:style_:filename',
use_timestamp: false, styles: lambda { |a| UserfileStyles.get(a.instance)[:styles] },
only_process: lambda { |a| UserfileStyles.get(a.instance)[:foreground] },
source_file_options: { all: '-auto-orient' }
validates_attachment_content_type :userfile, content_type: /.*/
process_in_background :userfile,
url_with_processing: false,
only_process: lambda { |a| UserfileStyles.get(a.instance)[:background] }
app/models/userfile_styles.rb
module UserfileStyles
def self.get userfile
if userfile.video?
{
styles: {
screenshot: ['300x300', :jpg],
thumbnail: {
gemoetry: '100x100#',
format: :jpg,
convert_options: '-thumbnail 100%'
},
preview: {
format: 'mp4',
convert_options: {
output: { ss: '0', t: '10' }
},
processors: [:transcoder]
},
mp4: {
format: 'mp4',
convert_options: {
output: {
vcodec: 'libx264',
vb: '1000k',
'profile:v': 'baseline',
vf: 'scale=-2:480',
acodec: 'aac',
ab: '128k',
preset: 'slow',
threads: 0,
movflags: 'faststart',
}
},
processors: [:transcoder]
}
},
foreground: [:screenshot, :thumbnail],
background: [:preview, :mp4]
}
end
end
end
Example (the first file is being converted from the second file, and the third file is being converted from the fourth file):
v2#web1 ~/divshare-v2 $ ls -alh /tmp
-rw------- 1 v2 v2 70M Sep 10 00:01 2158940a8739e7219125179e0d1528c120160909-14061-8dqfx020160909-14061-egeyeq.mp4
-rw------- 1 v2 v2 515M Sep 9 23:57 2158940a8739e7219125179e0d1528c120160909-14061-8dqfx0.mp4
-rw------- 1 v2 v2 145M Sep 9 23:33 76ba144beb8a14b6cf542225ef885a7c20160909-12733-1ui03vo20160909-12733-y7ywn.mp4
-rw------- 1 v2 v2 604M Sep 9 23:27 76ba144beb8a14b6cf542225ef885a7c20160909-12733-1ui03vo.mp4
I have tried uploading a couple times and with different files. Always gets caught around the same point. However everything works perfectly when smaller videos (~100-200MB).
This is the command being ran:
v2#web1 ~/divshare-v2 $ ps ux | grep ffmpeg
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
v2 14588 26.4 2.9 849840 240524 ? Sl Sep09 12:00 ffmpeg -i /tmp/2158940a8739e7219125179e0d1528c120160909-14061-8dqfx0.mp4 -acodec aac -strict experimental -vcodec libx264 -vb 1000k -profile:v baseline -vf scale=-2:480 -acodec aac -ab 128k -preset slow -threads 0 -movflags faststart -y /tmp/2158940a8739e7219125179e0d1528c120160909-14061-8dqfx020160909-14061-egeyeq.mp4
Any sort of help debugging this would be awesome.
NOTE: I copied the above command and manually ran it so that I could see some logs from ffmpeg, and worked flawlessly.
I am trying out the pyo for python. I installed the pyo for ubuntu using these commands from the homepage:
sudo apt-get install libjack-jackd2-dev libportmidi-dev portaudio19-dev liblo-dev
sudo apt-get install libsndfile-dev python-dev python-tk
sudo apt-get install python-imaging-tk python-wxgtk3.0
git clone https://github.com/belangeo/pyo.git
cd pyo
sudo python setup.py install --install-layout=deb --use-jack --use-double
Howerver when i try the very first example to Play a sound:
>>> from pyo import *
>>> s = Server().boot()
>>> s.start()
>>> sf = SfPlayer("path/to/your/sound.aif", speed=1, loop=True).out()
i get these errors:
>>> from pyo import *
pyo version 0.7.9 (uses single precision)
>>> s = Server().boot()
ALSA lib pcm_dsnoop.c:614:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm_dmix.c:1024:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm.c:2267:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2267:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2267:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm_dmix.c:1024:(snd_pcm_dmix_open) unable to open slave
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
Expression 'parameters->channelCount <= maxChans' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 1514
Expression 'ValidateParameters( inputParameters, hostApi, StreamDirection_In )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2818
portaudio error in Pa_OpenStream: Invalid number of channels
Portaudio error: Invalid number of channels
Server not booted.
Can anyone help?
PS: I am running ubuntu 15.10
Step 1. You should list your audio hardware:
from pyo import *
print("Audio host APIS:")
pa_list_host_apis()
pa_list_devices()
print("Default input device: %i" % pa_get_default_input())
print("Default output device: %i" % pa_get_default_output())
On my system result is:
Audio host APIS:
index: 0, id: 8, name: ALSA, num devices: 10, default in: 9, default out: 9
index: 1, id: 7, name: OSS, num devices: 0, default in: -1, default out: -1
AUDIO devices:
0: OUT, name: HDA Intel HDMI: 0 (hw:0,3), host api index: 0, default sr: 44100 Hz, latency: 0.005805 s
1: OUT, name: HDA Intel HDMI: 1 (hw:0,7), host api index: 0, default sr: 44100 Hz, latency: 0.005805 s
2: OUT, name: HDA Intel HDMI: 2 (hw:0,8), host api index: 0, default sr: 44100 Hz, latency: 0.005805 s
3: OUT, name: HDA Intel HDMI: 3 (hw:0,9), host api index: 0, default sr: 44100 Hz, latency: 0.005805 s
4: OUT, name: HDA Intel HDMI: 4 (hw:0,10), host api index: 0, default sr: 44100 Hz, latency: 0.005805 s
5: IN, name: HDA Intel PCH: CS4208 Analog (hw:1,0), host api index: 0, default sr: 44100 Hz, latency: 0.005805 s
6: OUT, name: HDA Intel PCH: CS4208 Digital (hw:1,1), host api index: 0, default sr: 44100 Hz, latency: 0.005805 s
7: OUT, name: hdmi, host api index: , default sr: 44100 Hz, latency: 0.005805 s
8: IN, name: pulse, host api index: 0, default sr: 44100 Hz, latency: 0.008707 s
8: OUT, name: pulse, host api index: 0, default sr: 44100 Hz, latency: 0.008707 s
9: IN, name: default, host api index: 0, default sr: 44100 Hz, latency: 0.008707 s
9: OUT, name: default, host api index: 0, default sr: 44100 Hz, latency: 0.008707 s
Default input device: 9
Default output device: 9
Step 2. Choose preferred device. In my case device 9 is ok.
from pyo import *
s = Server(duplex=0)
s.setOutputDevice(9) # Use device from the previous step
s.boot()
s.start()
# Try to play sound
a = Sine(mul=0.01).out()
Got it working on Ubuntu 20.04
After trying several things and a lot of frustration... the following worked:
sudo apt install python3-pyo
and the test:
#/usr/bin/env python3
from pyo import *
s = Server()
s.boot()
s.start()
a = Sine(freq=440, mul=0.5)
a.out()
time.sleep(2)
a.stop()
s.stop()
produces a 2 second 440Hz sine sound as desired. Maybe a reboot was needed.
The Ubuntu package must be installing some missing binary dependencies, without which pyo was throwing PyoServerStateException.
More details at: Pyo server.boot() fails with pyolib._core.PyoServerStateException on Ubuntu 14.04
I am currently trying to train my first net with Caffe. I get the following output:
caffe train --solver=first_net_solver.prototxt
I0515 09:01:06.577710 15331 caffe.cpp:117] Use CPU.
I0515 09:01:06.578014 15331 caffe.cpp:121] Starting Optimization
I0515 09:01:06.578097 15331 solver.cpp:32] Initializing solver from parameters:
test_iter: 1
test_interval: 1
base_lr: 0.01
display: 1
max_iter: 2
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0
snapshot: 1
snapshot_prefix: "first_net"
solver_mode: CPU
net: "first_net.prototxt"
I0515 09:01:06.578203 15331 solver.cpp:70] Creating training net from net file: first_net.prototxt
E0515 09:01:06.578348 15331 upgrade_proto.cpp:609] Attempting to upgrade input file specified using deprecated transformation parameters: first_net.prototxt
I0515 09:01:06.578533 15331 upgrade_proto.cpp:612] Successfully upgraded file specified using deprecated data transformation parameters.
E0515 09:01:06.578549 15331 upgrade_proto.cpp:614] Note that future Caffe releases will only support transform_param messages for transformation fields.
E0515 09:01:06.578574 15331 upgrade_proto.cpp:618] Attempting to upgrade input file specified using deprecated V1LayerParameter: first_net.prototxt
I0515 09:01:06.578635 15331 upgrade_proto.cpp:626] Successfully upgraded file specified using deprecated V1LayerParameter
I0515 09:01:06.578729 15331 net.cpp:42] Initializing net from parameters:
name: "first_net"
input: "data"
input_dim: 1
input_dim: 5
input_dim: 41
input_dim: 41
state {
phase: TRAIN
}
layer {
name: "data"
type: "ImageData"
top: "data2"
top: "data-idx"
transform_param {
mirror: false
crop_size: 41
}
image_data_param {
source: "/home/moose/GitHub/first-net/data-images.txt"
}
}
layer {
name: "label-mask"
type: "ImageData"
top: "label-mask"
top: "label-idx"
transform_param {
mirror: false
crop_size: 41
}
image_data_param {
source: "/home/moose/GitHub/first-net/labels-images.txt"
}
}
layer {
name: "assert-idx"
type: "EuclideanLoss"
bottom: "data-idx"
top: "loss"
}
What does
Attempting to upgrade input file specified using deprecated transform parameters / V1LayerParameter
mean? Where exactly did I use something deprecated? What should I use instead?
Recently, input transformation (scaling/cropping etc.) was separated from the IMAGE_DATA layer into a separate object: data transformer. This change affected the protobuffer syntax and the syntax of the IMAGE_DATA layer.
It appears as if your first_net.prototxt is in the old format and Caffe converts it for you to the new format.
You can do this conversion manually yourself using ./build/tools/upgrade_net_proto_text (for prototxt files) and ./build/tools/upgrade_net_proto_binary (for binaryproto files).