I'm using YOLOv5 to detect multiple objects in every frame of a webcam video. I would like to track objects instead of detecting them in every frame, so I tried YOLOv5-DeepSort. There is a big problem, though: YOLOv5 can be compiled with TensorRT, which makes it quite fast on an embedded board (50 FPS), but it seems DeepSort cannot be compiled in the same way.
So I'm now looking for an alternative that is not too expensive and that can improve my detection by tracking objects. Any ideas? I already tried the KCF tracker from OpenCV and motpy, but both performed very poorly.
DISCLAIMER: I am the main https://github.com/mikel-brostrom/Yolov5_DeepSort_OSNet contributor.
Sadly, there is no TensorRT export option at the moment. You could try using https://github.com/abewley/sort. This is essentially DeepSORT without the deep appearance descriptor, so the tracking is based only on motion, which, depending on your use case, could be good enough.
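To give an idea of what motion-only association can look like, here is a minimal sketch in Python. It is not SORT itself (SORT combines a Kalman filter with Hungarian assignment); it only shows the simpler idea of matching the last known boxes of existing tracks to new detections by IoU overlap, using a greedy match. The function names `iou` and `associate` and the 0.3 threshold are assumptions made for this illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, iou_threshold=0.3):
    """Greedily match track boxes to detection boxes by descending IoU.

    tracks: dict mapping track id -> last known box
    detections: list of boxes from the current frame
    Returns (matches, unmatched), where matches maps track id ->
    detection index and unmatched lists unused detection indices.
    """
    pairs = []
    for tid, tbox in tracks.items():
        for di, dbox in enumerate(detections):
            score = iou(tbox, dbox)
            if score >= iou_threshold:
                pairs.append((score, tid, di))
    # Highest-overlap pairs are assigned first.
    pairs.sort(key=lambda p: p[0], reverse=True)
    matches, used = {}, set()
    for score, tid, di in pairs:
        if tid in matches or di in used:
            continue
        matches[tid] = di
        used.add(di)
    unmatched = [i for i in range(len(detections)) if i not in used]
    return matches, unmatched
```

In a full tracker, unmatched detections would typically start new tracks, and tracks that stay unmatched for several frames would be dropped.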
Another option could be to export the models to ONNX which is relatively easy and then load them with TensorRT following some tutorial like: https://learnopencv.com/how-to-convert-a-model-from-pytorch-to-tensorrt-and-speed-up-inference/
Aug 6 2022 EDIT -------------------
I added a ReID-specific export script to my repo. It generates ONNX, OpenVINO and TFLite models from MobileNet and ResNet50 .pt models. I also added a multi-backend model loader and inferencer that supports the three aforementioned model types. I am planning to add TensorRT in the near future.
A small tutorial can be found here
Sept 9 2022 EDIT -------------------
TensorRT export and inference now supported. Example usage:
python3 reid_export.py --weights /datadrive/mikel/Yolov5_StrongSORT_OSNet/weights/osnet_x0_25_msmt17.pt --include onnx engine --dynamic --device 0 --batch-size 30
python3 track.py --source 0 --strong-sort-weights weights/osnet_x0_25_msmt17.engine --imgsz 640 --yolo-weights weights/yolov5m.engine --device 0 --class 0
Related
I want to do transfer learning with YOLOv3 in Darknet: I want to take the pre-trained YOLOv3 model that was trained on the COCO dataset and train it further on my own dataset to detect additional objects. What steps should I follow? How can I label my data so that it can be used in Darknet? Please help, because this is my first time using Darknet and YOLO.
It's all explained here: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
Note that annotation must be consistent. Any object that is left unannotated will result in poor learning and therefore poor predictions.
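As a rough illustration of the labeling step, the linked README describes one .txt file per image with one line per object in the form `<object-class> <x_center> <y_center> <width> <height>`, where the coordinates are relative to the image width and height. A minimal Python sketch of that conversion, assuming your boxes are stored as pixel corners (the function name `to_darknet_label` is hypothetical, not part of Darknet):

```python
def to_darknet_label(class_id, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a Darknet label line."""
    x_min, y_min, x_max, y_max = box
    # Box centre and size, normalised to the 0..1 range by the image size.
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / float(img_w)
    height = (y_max - y_min) / float(img_h)
    return "%d %.6f %.6f %.6f %.6f" % (class_id, x_center, y_center, width, height)


# Example: one object of class 0 in a 640x480 image.
print(to_darknet_label(0, (100, 200, 300, 400), 640, 480))
```

Every object of a trained class that is visible in an image would get its own line in that image's .txt file.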
This question was answered in "Fine-tuning and transfer learning by the example of YOLO".
The answer given by gameon67 suggests this:
If you are using AlexeyAB's darknet repo (not darkflow), he suggests to do Fine-Tuning instead of Transfer Learning by setting this param in cfg file: stopbackward=1.
Then input
./darknet partial yourConfigFile.cfg yourWeightsFile.weights outPutName.LastLayer# LastLayer#
such as:
./darknet partial cfg/yolov3.cfg yolov3.weights yolov3.conv.81 81
It will create yolov3.conv.81 and will freeze the lower layer, then you can train by using weights file yolov3.conv.81 instead of original darknet53.conv.74.
References : https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
I am running my first TensorFlow job (object detection training) right now, using the TensorFlow API. I am using the SSD MobileNet network from the model zoo. I used >>ssd_mobilenet_v1_0.75_depth_quantized_300x300_coco14_sync.config<< as the config file and the >>ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03<< checkpoint as the fine-tune checkpoint.
I started my training with the following command:
PIPELINE_CONFIG_PATH='/my_path_to_tensorflow/tensorflow/models/research/object_detection/models/model/ssd_mobilenet_v1_0.75_depth_quantized_300x300_coco14_sync.config'
MODEL_DIR='/my_path_to_tensorflow/tensorflow/models/research/object_detection/models/model/train'
NUM_TRAIN_STEPS=200000
SAMPLE_1_OF_N_EVAL_EXAMPLES=1
python object_detection/model_main.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--num_train_steps=${NUM_TRAIN_STEPS} \
--sample_1_of_n_eval_examples=$SAMPLE_1_OF_N_EVAL_EXAMPLES \
--alsologtostderr
Now, coming to my problem, which I hope the community can help me with: I trained the network overnight. It trained for 1400 steps and then started evaluating per image, which ran the entire night. The next morning I saw that the network had only been evaluating and the training was still at 1400 steps. You can see part of the console output in the image below.
Console output from evaluation
I tried to take control by using the eval config parameter in the config file.
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
num_examples: 5000
}
I added max_evals = 1, because the documentation says that I can limit the evaluation this way. I also changed eval_interval_secs = 3600, because I only wanted one evaluation every hour. Neither option had any effect.
I also tried other config files from the model zoo, with no luck. I searched Google for hours, only to find answers telling me to change the parameters I had already changed. So I am coming to Stack Overflow to find help in this matter.
Can anybody help me, or has anyone had the same experience? Thanks in advance for all your help!
Environment information
$ pip freeze | grep tensor
tensorboard==1.11.0
tensorflow==1.11.0
tensorflow-gpu==1.11.0
$ python -V
Python 2.7.12
I figured out a solution to the problem. In TensorFlow 1.10 and later, you can no longer set the checkpoint steps or checkpoint seconds in the config file as before. By default, TensorFlow 1.10 and later saves a checkpoint every 10 minutes. If your hardware is not fast enough and evaluation takes more than 10 minutes, you are stuck in a loop.
So to change the number of seconds or training steps until a new checkpoint is saved (which triggers the evaluation), you have to navigate to model_main.py in the following folder:
tensorflow/models/research/object_detection/
Once you opened model_main.py, navigate to line 62. Here you will find
config = tf.estimator.RunConfig(model_dir=FLAGS.model_dir)
To trigger the checkpoint save after 2500 steps for example, change the entry to this:
config = tf.estimator.RunConfig(model_dir=FLAGS.model_dir, save_checkpoints_steps=2500)
Now the model is saved every 2500 steps and afterwards an evaluation is done.
There are multiple parameters you can pass through this option. You can find the documentation here:
tensorflow/tensorflow/contrib/learn/python/learn/estimators/run_config.py.
From Line 231 to 294 you can see the parameters and documentation.
I hope I can help you with this and you don't have to look for an answer as long as I did.
Could it be that evaluation takes more than 10 minutes in your case? Since 10 minutes is the default interval between evaluations, it may simply keep evaluating.
Unfortunately, the current API doesn't easily support altering the time interval for evaluation.
By default, evaluation happens after every checkpoint saving, which by default is set to 10 minutes.
Therefore, you can change the checkpoint interval by specifying save_checkpoint_secs or save_checkpoint_steps as an input to the instance of MonitoredSession (or MonitoredTrainingSession). Unfortunately, and to the best of my knowledge, these parameters cannot be set as flags to model_main.py or from the config file. You can therefore either hard-code their values or expose them as flags so that they become available.
An alternative, which does not change how often a checkpoint is saved, is to modify the evaluation frequency, which is specified as throttle_secs in tf.estimator.EvalSpec.
See my explanation here as to how to export this parameter to model_main.py.
I've read the Caffe2 tutorials and tried the pre-trained models. I know Caffe2 will leverage the GPU to run the model/net, but the input data always seems to be supplied from CPU (i.e. host) memory. For example, in Loading Pre-Trained Models, after the model is loaded, we can run prediction on an image with
result = p.run([img])
However, the image "img" has to be read in CPU scope. What I am looking for is a framework that can pipeline the images (which are decoded from a video and still reside in GPU memory) directly to the prediction model, instead of copying them from GPU to CPU scope and then transferring them to the GPU again to predict the result. Does Caffe or Caffe2 provide such functions or interfaces for Python or C++? Or would I need to patch Caffe to do so? Thanks a lot.
Here is my solution:
I found that the function ShareExternalPointer() in tensor.h does exactly what I want.
Feed the GPU data this way:
pInputTensor->ShareExternalPointer(pGpuInput, InputSize);
then run the predict net through
pPredictNet->Run();
where pInputTensor is the input tensor of the predict net pPredictNet.
I don't think you can do it in Caffe with the Python interface.
But I think it can be accomplished using C++: in C++ you have access to the Blob's mutable_gpu_data(). You can write code that runs on the device and "fills" the input Blob's mutable_gpu_data() directly from the GPU. Once you have made this update, Caffe should be able to continue its net->forward() from there.
UPDATE
On Sep 19th, 2017 PR #5904 was merged into master. This PR exposes GPU pointers of blobs via the python interface.
You may access blob._gpu_data_ptr and blob._gpu_diff_ptr directly from python at your own risk.
As you've noted, using a Python layer forces data in and out of the GPU, and this can cause a huge hit to performance. This is true not just for Caffe, but for other frameworks too. To elaborate on Shai's answer, you could look at this step-by-step tutorial on adding C++ layers to Caffe. The example given should touch on most issues dealing with layer implementation. Disclosure: I am the author.
I am trying to extract features from a new dataset by using a pre-trained network such as classify_image_graph_def.pb, released by Google with TensorFlow (inception-2015-12-05.tgz). I was successful with that, as there is a tutorial at transfer_learning which uses classify_image_graph_def.pb (inception_v3.pb) to extract features from the new dataset.
However, in the new release of pre-trained models, TensorFlow provides checkpoint files (e.g. resnet_v1_152.ckpt) instead of a GraphDef (e.g. resnet_v1_152.pb). I was wondering how I could use these checkpoint files to extract features as in transfer_learning. Could anyone give me some directions?
Just follow the official model save/restore doc here.
The idea of incremental learning, as I understand it, is that after training I save my model, and when I have new data, instead of retraining on the old data together with the new data, I just load the saved model and train again using only the new data, so that the newly trained model builds on top of the old one.
I have searched for this in WEKA and found that it can be done using "incremental algorithms". I know that the Hoeffding Tree is an incremental version of the J48 algorithm, but I am not sure how to do the incremental learning.
Could anybody explain whether this is possible in WEKA and how it could be done?
In order to do incremental learning in WEKA, you have to choose a classifier that implements the UpdateableClassifier interface. There are 10 classifiers that can do this. Note that this can only be done through code or the command line.
You first have to build your model from the training data and then save the model. After that, you load the same model and train it further.
Using the HoeffdingTree algorithm, it would be something like this:
java weka.classifiers.trees.HoeffdingTree -L 2 -S 0 -E 1.0E-7 -H 0.1 -M 0.01 -G 200.0 -N 0.0 -t Training.arff -no-cv -d ht.model
java weka.classifiers.trees.HoeffdingTree -t Training.arff -T Testing.arff -l ht.model -d ht.updated.model
Of course, there is no need to specify the training parameters again when updating the model, because these settings are already saved in the model.
For more information:
http://weka.8497.n7.nabble.com/WEKA-Incremental-Learning-Training-td35691.html
https://weka.wikispaces.com/Classification-Train/test%20set#Classification-Building a Classifier-Incremental