How do I export TFRecord with data in Intel CVAT? - computer-vision

I've annotated roughly 15 minutes of video with Intel's CVAT - https://github.com/opencv/cvat
When exporting to TFRecord, the file is only about 4 MB (it should be closer to 200 MB at least) and doesn't appear to actually contain any image data. How can I export a TFRecord with the image data along with the annotation data?

As of 12/1/2019, this is not currently supported in Intel CVAT.
I was able to achieve my goal and create TFRecords containing both annotation data and image data by using a combination of ffmpeg, to split my original .mov into frames, and create_pascal_tf_record.py, to generate the TFRecord.
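In case it helps anyone else, here is a minimal sketch of the second half of that pipeline (the frame path, label name and box coordinates are made up for illustration, and the feature keys mirror the ones create_pascal_tf_record.py writes): after splitting the video into frames with ffmpeg, each frame's encoded bytes plus its annotations go into a tf.train.Example, and the examples are written to a TFRecord.

import tensorflow as tf

# Hypothetical frame and a single bounding-box annotation for it.
frame_path = 'frames/frame_00001.png'
xmins, xmaxs, ymins, ymaxs = [0.1], [0.4], [0.2], [0.6]   # normalised box coords
classes_text, classes = [b'person'], [1]                  # label name and id

def _bytes(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=values))

def _floats(values):
    return tf.train.Feature(float_list=tf.train.FloatList(value=values))

def _ints(values):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))

with tf.io.gfile.GFile(frame_path, 'rb') as f:
    encoded_image = f.read()   # the raw PNG bytes are stored in the record

example = tf.train.Example(features=tf.train.Features(feature={
    'image/encoded': _bytes([encoded_image]),
    'image/format': _bytes([b'png']),
    'image/object/bbox/xmin': _floats(xmins),
    'image/object/bbox/xmax': _floats(xmaxs),
    'image/object/bbox/ymin': _floats(ymins),
    'image/object/bbox/ymax': _floats(ymaxs),
    'image/object/class/text': _bytes(classes_text),
    'image/object/class/label': _ints(classes),
}))

with tf.io.TFRecordWriter('train.tfrecord') as writer:
    writer.write(example.SerializeToString())

Because the encoded image bytes are embedded, the resulting file size scales with the number and resolution of the frames.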

Related

Google Cloud Vision not automatically splitting images for training/test

It's weird: for some reason GCP Vision won't allow me to train my model. I have met the minimum of 10 images per label, no images are unlabeled, and I tried uploading a CSV pointing to 3 of this label's images as VALIDATION images. Yet I get this error:
Some of your labels (e.g. ‘Label1’) do not have enough images assigned to your Validation sets. Import another CSV file and assign those images to those sets.
Any ideas would be appreciated.
This error generally occurs when you have not labelled all the images: AutoML divides your images, including the unlabelled ones, into the sets, and the error is triggered when unlabelled images end up in the VALIDATION set.
According to the documentation, 1000 images per label are recommended. However, the minimum is 10 images per label, or 50 for complex cases. In addition,
The model works best when there are at most 100x more images for the most common label than for the least common label. We recommend removing very low frequency labels.
Furthermore, AutoML Vision uses 80% of your content documents for training, 10% for validation, and 10% for testing. Since your images were not divided into these three categories, you should manually assign them to TRAIN, VALIDATION and TEST. You can do that by uploading your images to a GCS bucket and referencing each labelled image in a .csv file, as follows:
TRAIN, gs://my_bucket/image1.jpeg,cat
As you can see above, it follows the format [SET],[GCS image path],[Label]. Note that you will be dividing your dataset manually and the split should respect the percentages already mentioned, so that you have enough data in each category. You can follow the steps for preparing your training data here and here.
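For illustration, a rough Python sketch of building such a CSV with an approximate 80/10/10 split (the bucket paths and labels below are placeholders):

import csv
import random

# Placeholder list of (GCS path, label) pairs; replace with your own images.
images = [('gs://my_bucket/image1.jpeg', 'cat'),
          ('gs://my_bucket/image2.jpeg', 'dog')]

random.seed(0)  # reproducible split
with open('automl_import.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for path, label in images:
        r = random.random()
        split = 'TRAIN' if r < 0.8 else 'VALIDATION' if r < 0.9 else 'TEST'
        writer.writerow([split, path, label])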
Note: please be aware that your .csv file is case sensitive.
Lastly, in order to validate your dataset and inspect labelled/unlabelled images, you can export the created dataset and check the exported .csv file, as described in the documentation. After exporting, download it and verify each SET (TRAIN, VALIDATION and TEST).

save Large RasterBrick to file for later use

I have a Large RasterBrick, created through compiling a large number of .nc files and then manipulating in a few ways (cropping, collapsing, naming layers). I want to save this brick to a file on my laptop, so that I can access it without having to import all data and manipulate anew.
How do I do this? I think it should involve writeRaster, but I'm not sure how to specify the options.
My RasterBrick is 18 by 25, with 14975 layers, each named with the relevant date.
I tried this code from Save multi layer RasterBrick to harddisk:
outfile <- writeRaster(windstack_mn, filename='dailywindgrid.tif', format="GTiff", overwrite=TRUE,options=c("INTERLEAVE=BAND","COMPRESS=LZW"))
However, this code produces a .tif file that holds a single 18 by 25 layer. I think it saved only the 1st layer of my RasterBrick, because if I bring in the saved .tif file and plot it, it looks identical to plotting the 1st layer of the original RasterBrick.
Did you look at outfile? Can you show it to us?
You should show what you do to "bring in the saved .tif". I am guessing that you do
raster('dailywindgrid.tif')
whereas you should be doing
brick('dailywindgrid.tif')
The comment/answer from Robert solves my issue, with the one addition that one needs to specify the raster format. So I am now saving the file with this code:
writeRaster(StackName, filename='FileNAme.grd', format="raster", overwrite=TRUE,options=c("INTERLEAVE=BAND","COMPRESS=LZW"))
And that .grd file can later be opened using this code:
ImportName <- brick("FileNAme.grd")

how to create & encode a label file using sample images for tensorflow

We are trying to detect special characters like + and - inside an image using TensorFlow, by extending the MNIST sample code -> https://github.com/opensourcesblog/tensorflow-mnist
We have also been able to create a binary encoded file from our sample images, needed for training the neural network, by using the sample code -> https://github.com/jkarnows/idx-formatter
But we cannot find a way to create a label file for our images and then turn it into a binary encoded label file.
Both of these files are essential for proceeding further.
Anybody with an idea is most welcome to share it with us.
The label files are gzip-compressed IDX data, decoded by extract_labels: an 8-byte header (magic number and item count) followed by one unsigned byte per label. You can use numpy.frombuffer to turn the raw bytes back into a uint8 array.
Alternatively, you can just write your own extract_labels which reads labels in whatever format you see fit.
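If you want to stay with the IDX format those readers expect, a minimal sketch of writing (and re-reading) a gzip-compressed label file looks like this; the file name and label values are placeholders:

import gzip
import struct
import numpy as np

labels = np.array([0, 1, 2, 1, 0], dtype=np.uint8)   # one class id per image

# IDX label layout: magic number 2049, item count, then one byte per label.
with gzip.open('my-labels-idx1-ubyte.gz', 'wb') as f:
    f.write(struct.pack('>II', 2049, len(labels)))   # big-endian header
    f.write(labels.tobytes())

# Sanity check: read it back the same way extract_labels does.
with gzip.open('my-labels-idx1-ubyte.gz', 'rb') as f:
    magic, n = struct.unpack('>II', f.read(8))
    recovered = np.frombuffer(f.read(n), dtype=np.uint8)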

Caffe GoogleNet classification.cpp gives random outputs

I used the Caffe GoogleNet model to train on my own data (10k images, 2 classes). I stopped it at the 400,000th iteration with an accuracy of ~80%.
If I run the below command:
./build/examples/cpp_classification/classification.bin
models/bvlc_googlenet/deploy.prototxt
models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
data/ilsvrc12/imagenet_mean.binaryproto
data/ilsvrc12/synset_words.txt
1.png
it gives me a different -- apparently random -- result each time (i.e. if I run it n times, I get n different results). Why? Did my training fail? Does it still use the old data from the reference model?
I don't think it is a problem with the training. Even if the training had gone badly, the network should give the same (possibly wrong) output every time. If you are getting random results, it indicates that the weights are not being loaded properly.
When you load a .caffemodel against a .prototxt, Caffe will load the weights of all the layers in the prototxt whose names match the ones in the caffemodel. For the other layers, it will do a random initialisation (Gaussian, Xavier, etc., according to the specification in the prototxt).
So the best thing for you to do now is to check that the model was trained using the same prototxt you are using now.
I see that you are passing the GoogleNet deploy prototxt together with the reference_caffenet caffemodel. Is this intentional?
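One quick way to check is to load your own snapshot against the deploy prototxt you trained with in pycaffe and see which layers actually received parameters (the paths below are hypothetical):

import caffe

# Point these at the deploy prototxt you trained with and the snapshot written
# by *your* solver, not at the bvlc_reference_caffenet weights.
net = caffe.Net('models/my_googlenet/deploy.prototxt',
                'models/my_googlenet/snapshot_iter_400000.caffemodel',
                caffe.TEST)

# Only layers whose names match between the prototxt and the caffemodel get the
# trained weights; everything else is randomly re-initialised on every run.
print(list(net.params.keys()))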
When you want to deploy the fine-tuned model, you should check two main things:
Inputs:
Input images use BGR channel order instead of RGB (e.g. when loaded with OpenCV); a preprocessing sketch follows after this answer.
Mean file: is it the same mean file that was used during training?
Prototxt:
When fine-tuning a model you usually rename some layers in the original prototxt, so check whether the deploy prototxt uses the same layer names as the trained caffemodel.
There are also some Fine-tune tricks and the CS231n_transfer_learning notes, which are very useful for fine-tuning.
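As a rough illustration of the input checks above, this is the kind of preprocessing the standard pycaffe classification example sets up; the paths, the mean values, the 224x224 input size and the 'prob' output name are assumptions, so substitute the values from your own training:

import numpy as np
import caffe

net = caffe.Net('deploy.prototxt', 'finetuned.caffemodel', caffe.TEST)
net.blobs['data'].reshape(1, 3, 224, 224)   # single image, GoogleNet-sized input

# Per-channel BGR mean; in practice derive it from the mean file used at training time.
mu = np.array([104.0, 117.0, 123.0])

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))      # HxWxC -> CxHxW
transformer.set_mean('data', mu)                  # subtract the training mean
transformer.set_raw_scale('data', 255)            # load_image returns values in [0, 1]
transformer.set_channel_swap('data', (2, 1, 0))   # RGB -> BGR, matching training

image = caffe.io.load_image('1.png')              # RGB float image in [0, 1]
net.blobs['data'].data[...] = transformer.preprocess('data', image)
probs = net.forward()['prob'][0]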

access data from files on disc in *real time*

I have the following problem to solve. I have to build a graph viewer to view a massive data set.
We have some files in a particular format that has millions of records representing the result of an experiment. Each record represents a sample point on a large graph plot. The biggest file I have seen has 43.7 Million records.
An average file contains 10 million records. Each record is small (76 bytes + an optional 12 bytes). The complete data cannot be loaded into main memory as it is too large. I have built a new file format that compresses the data to 48 bytes per record and organises the data into chunks that are associated with each other. I want to "view" the data by displaying the records in a 2D/3D plot. As the data is very dense, I would like to progressively increase the level of detail by loading more data and removing data that is not shown in the view from main memory.
I would also like to access groups of associated records in real time and pre-load similar records in order to keep the loading time to a bare minimum. This will give the user smooth control over viewing the data, instead of an experience similar to viewing a video on YouTube over a very slow internet connection. The user cannot jump around randomly and has to use the controls to navigate, and I would like to use this information to load the relevant records into main memory.
The data has to be loaded progressively from the disk based on what is currently in main memory. Records in main memory that are not required in the current context can be removed and, if required, reloaded.
How do I access data from the disk at high speed based on some hash number?
How do I manage main memory if the data to be viewed in the current context is too large? If your answer is level of detail, then how do I build it for a large data set, and should this data be part of the file?
I have been working on this for the last two weeks and I seem to be stuck due to I/O speed.
I am working in native C++ and I cannot use anything licensed under the GPL. If you need any more info, let me know.
Ram
Under most modern operating systems (Linux, other Unixes, Windows) you can map a file into memory.
This means you can access the content of the file as if it were entirely in memory (e.g. you can use data[i++], strchr(data,..), etc.) and it is the operating system that does the mapping between used memory and the file. When you want to read some data that is not already in memory, the OS will fetch it from the file.
You should read this question's answer: Mmap() an entire large file
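Just to illustrate the idea (the same approach applies to mmap() or CreateFileMapping in native C++), here is a small Python sketch of memory-mapping a file of fixed-size records; the file name is a placeholder and the 48-byte record size comes from your description:

import mmap

RECORD_SIZE = 48  # bytes per record in the custom format described above

def read_record(mm, i):
    # Slicing a memory-mapped file only touches the pages holding record i;
    # the operating system pages the data in from disk on demand.
    start = i * RECORD_SIZE
    return mm[start:start + RECORD_SIZE]

with open('samples.bin', 'rb') as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    total = mm.size() // RECORD_SIZE
    visible = [read_record(mm, i) for i in range(min(total, 1000))]
    mm.close()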
I think you are looking for an organisation similar to what is used to store level geometry in games, except that (depending on how your program works and what data you need to show) you may only need one dimension. See Quadtree and similar methods (bottom of that article).
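For what it's worth, a bare-bones point-quadtree sketch of that idea (the capacity and coordinates are arbitrary): each node holds a bounded number of sample points and splits into four children when full, so a viewer can query only the points that intersect the current viewport.

CAPACITY = 64  # points stored per node before it splits

class QuadTree:
    def __init__(self, x, y, w, h):
        self.x, self.y, self.w, self.h = x, y, w, h   # node bounds
        self.points = []
        self.children = None

    def _contains(self, px, py):
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h

    def insert(self, px, py):
        if not self._contains(px, py):
            return False
        if self.children is None:
            if len(self.points) < CAPACITY:
                self.points.append((px, py))
                return True
            self._split()
        return any(c.insert(px, py) for c in self.children)

    def _split(self):
        hw, hh = self.w / 2.0, self.h / 2.0
        self.children = [QuadTree(self.x,      self.y,      hw, hh),
                         QuadTree(self.x + hw, self.y,      hw, hh),
                         QuadTree(self.x,      self.y + hh, hw, hh),
                         QuadTree(self.x + hw, self.y + hh, hw, hh)]
        for px, py in self.points:
            any(c.insert(px, py) for c in self.children)
        self.points = []

    def query(self, qx, qy, qw, qh, out):
        # Skip subtrees that do not overlap the viewport rectangle.
        if qx + qw <= self.x or self.x + self.w <= qx or \
           qy + qh <= self.y or self.y + self.h <= qy:
            return out
        out.extend(p for p in self.points
                   if qx <= p[0] < qx + qw and qy <= p[1] < qy + qh)
        if self.children is not None:
            for c in self.children:
                c.query(qx, qy, qw, qh, out)
        return out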