How to use Pascal VOC annotation files to train a YOLO model - computer-vision

I'm attempting to train a YOLO model on a dataset where the labels are given in Pascal VOC format (https://www.kaggle.com/datasets/cici118/swimming-pool-detection-algarves-landscape?select=images).
I'm a beginner. Can someone please help me understand whether I have to use some library to get the information out of the Pascal VOC files and then train the model, or whether there is some way for an ML model to output a Pascal VOC file directly?
Alternatively, after some literature survey (googling), I also found that a similar model called SSD exists. Would it have any advantage, mostly in ease of use? I'm less concerned about speed and accuracy and more concerned about ease of coding from scratch and training time, since I'm a student.
Some example code for training an ML model from scratch on Pascal VOC data would be appreciated.
Thanks in advance
I've searched on Google but have been confused by the vast amount of information and the difficult code available online.
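For what it's worth, Pascal VOC annotations are plain XML, so they can be read with Python's built-in xml.etree.ElementTree and converted to the txt format YOLO expects (one line per box: class id plus normalized center x, center y, width, height). A minimal sketch follows; the class list and file paths are made up and assume the standard VOC layout with a size element and one bndbox per object:

    # Minimal sketch: convert one Pascal VOC XML annotation to a YOLO label file.
    # Assumes the standard VOC layout: <size><width>/<height> and one
    # <object> with <name> and <bndbox><xmin|ymin|xmax|ymax> per annotated box.
    import xml.etree.ElementTree as ET

    CLASSES = ["pool"]  # hypothetical class list; adapt to your dataset

    def voc_to_yolo(xml_path, txt_path):
        root = ET.parse(xml_path).getroot()
        img_w = float(root.find("size/width").text)
        img_h = float(root.find("size/height").text)
        lines = []
        for obj in root.findall("object"):
            cls_id = CLASSES.index(obj.find("name").text)
            box = obj.find("bndbox")
            xmin = float(box.find("xmin").text)
            ymin = float(box.find("ymin").text)
            xmax = float(box.find("xmax").text)
            ymax = float(box.find("ymax").text)
            x_c = (xmin + xmax) / 2.0 / img_w      # normalized box center x
            y_c = (ymin + ymax) / 2.0 / img_h      # normalized box center y
            w = (xmax - xmin) / img_w              # normalized box width
            h = (ymax - ymin) / img_h              # normalized box height
            lines.append(f"{cls_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
        with open(txt_path, "w") as f:
            f.write("\n".join(lines))

    voc_to_yolo("labels/example.xml", "labels/example.txt")  # hypothetical paths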

Related

Input Arrays Into Weka

I am fairly new to machine learning and I am trying to use WEKA (GUI) to implement a neural network on a sports data set. My issue is that I want my inputs to be Arrays (each Array is a contestant with stats such as speed, winrate, etc). I am wondering how I can tell WEKA that each input is an array of values.
You can define it in an .arff file; see the ARFF documentation for detailed information, and the example below.
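For example, a minimal .arff sketch in which each contestant (one "array" of stats) becomes one row and each stat becomes one numeric attribute; the attribute names and values here are made up for illustration:

    @relation contestants

    @attribute speed   numeric
    @attribute winrate numeric
    @attribute matches numeric
    @attribute outcome {win, lose}

    @data
    7.2,0.61,120,win
    5.9,0.48,95,lose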
Or, after opening your data in Weka, you can convert it with the help of some filters. I do not know the current format of your data; however, if you can open it in Weka, you can edit it with many filters. Note that artificial neural networks only accept numerical values, and among these filters there are some that convert nominal data to numerical data. I share an image of these filters below. If you are new to this area, I recommend watching the WekaMOOC videos (run by the Weka developers); I think they will be very useful. Good luck.
[screenshot: list of Weka filters]

TF-IDF vectorizer doesn't work better than CountVectorizer (scikit-learn)

I am working on a multilabel text classification problem with 10 labels.
The dataset is small: roughly 7000 items with roughly 7500 labels in total. I am using Python scikit-learn, and something strange came up in the results. As a baseline I started out with CountVectorizer and was actually planning to move to TfidfVectorizer, which I thought would work better. But it doesn't: with CountVectorizer I get an F1 score about 0.1 higher (0.76 vs 0.65).
I cannot wrap my head around why this could be the case.
There are 10 categories and one is called miscellaneous; this one in particular gets much lower performance with TF-IDF.
Does anyone know when TF-IDF could perform worse than plain counts?
The question is: why not? They are simply different approaches.
What is your dataset, how many words does it contain, how are the items labelled, and how do you extract your features?
CountVectorizer simply counts the words; if that does a good job, so be it.
There is no reason why IDF would give more information for a classification task. It performs well for search and ranking, but classification needs to gather similarity, not singularities.
IDF is meant to spot the singularity of one sample versus the rest of the corpus, whereas what you are looking for is the singularity of one sample versus the other clusters. IDF smooths out the intra-cluster TF similarity.
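A quick sanity check is to keep everything else fixed and only swap the vectorizer. A rough sketch with placeholder texts and labels (swap in your own data and classifier):

    # Compare CountVectorizer vs TfidfVectorizer with the same classifier,
    # using micro-averaged F1 under cross-validation on a multilabel problem.
    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
    from sklearn.model_selection import cross_val_score
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import MultiLabelBinarizer
    from sklearn.svm import LinearSVC

    # Placeholder corpus, repeated so every CV fold contains both classes.
    texts = [
        "the team won the game", "the election results are in",
        "the player scored twice", "the president gave a speech",
        "a great match last night", "parliament passed the bill",
        "coach praised the defence", "voters went to the polls",
    ] * 5
    labels = [
        ["sports"], ["politics"], ["sports"], ["politics"],
        ["sports"], ["politics"], ["sports"], ["politics"],
    ] * 5
    y = MultiLabelBinarizer().fit_transform(labels)  # multilabel indicator matrix

    for vec in (CountVectorizer(), TfidfVectorizer()):
        clf = make_pipeline(vec, OneVsRestClassifier(LinearSVC()))
        scores = cross_val_score(clf, texts, y, cv=5, scoring="f1_micro")
        print(type(vec).__name__, round(scores.mean(), 3))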

Speech Recognition for small vocabulary (about 20 words)

I am currently working on a project for my university. The task is to write a speech recognition system that will run on a phone in the background, waiting for a few commands (like "call 0 123 ...").
It's a 2-month project, so it does not have to be very accurate. The amount of noise can be assumed to be small, and words will be separated by moments of silence.
I am currently at the point of loading a sample word encoded as raw 16-bit PCM, splitting it into chunks (about 50 per second), and running an FFT on each chunk to get its frequency spectrum.
Things to solve are:
1) going through the longer recording and splitting it into words.
2) finding the best match for the word
For 1), I was thinking about just checking chunk after chunk and, if I encounter a few chunks with higher amplitudes in the human-voice frequency range, assuming that a word has started (a rough sketch of this idea follows this list). In any case, I am looking for resources that may help with this.
For 2), this one seems a little bit tougher. Is it necessary to use HMMs for a system like this, or are there simpler methods, given that the vocabulary is so small (20 words)?
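A rough sketch of the energy-threshold idea from 1), assuming raw 16-bit PCM samples loaded into a NumPy array; the chunk size, threshold, and minimum word length are arbitrary and would need tuning:

    # Rough word segmentation by energy threshold: split the recording into
    # ~20 ms chunks, mark chunks whose RMS energy exceeds a threshold as
    # "speech", and merge consecutive speech chunks into word segments.
    import numpy as np

    def split_into_words(samples, sample_rate=16000, chunk_ms=20,
                         threshold=500.0, min_chunks=3):
        chunk = int(sample_rate * chunk_ms / 1000)
        n = len(samples) // chunk
        energies = [np.sqrt(np.mean(samples[i*chunk:(i+1)*chunk]
                                    .astype(np.float64) ** 2))
                    for i in range(n)]
        words, start = [], None
        for i, e in enumerate(energies + [0.0]):    # sentinel closes the last word
            if e > threshold and start is None:
                start = i
            elif e <= threshold and start is not None:
                if i - start >= min_chunks:         # ignore very short blips
                    words.append(samples[start*chunk:i*chunk])
                start = None
        return words

    # e.g. samples = np.fromfile("recording.raw", dtype=np.int16)  # hypothetical file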
Edit:
The point of the project is writing the system on my own, so I cannot use ready-made libraries like Sphinx or HTK.
Regards,
Karol
If anybody has the same question in the future, look for two main keywords:
MFCC - mel-frequency cepstral coefficients, used to calculate a series of coefficients for each word template
DTW - dynamic time warping, used to match a captured word against the templates
A good enough description of DTW can be found on Wikipedia.
This approach was good enough to reach around 80% accuracy on a 20-word dictionary and give a good demo during the class.
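For anyone looking for a starting point, a bare-bones DTW distance between two feature sequences (e.g. per-chunk MFCC or spectrum vectors) can be written with plain NumPy. This is just a sketch of the standard dynamic-programming recurrence, not tuned for speed:

    # Bare-bones dynamic time warping: cost of the best alignment between two
    # sequences of feature vectors (shape: frames x coefficients). A captured
    # word is assigned the label of the template with the lowest DTW cost.
    import numpy as np

    def dtw_distance(a, b):
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])   # local frame distance
                D[i, j] = cost + min(D[i - 1, j],            # insertion
                                     D[i, j - 1],            # deletion
                                     D[i - 1, j - 1])        # match
        return D[n, m]

    def recognize(word_features, templates):
        """templates: dict mapping word label -> template feature sequence."""
        return min(templates,
                   key=lambda lbl: dtw_distance(word_features, templates[lbl]))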
To recognize commands on the phone you can use Pocketsphinx. A tutorial covering speech recognition applications on Android is available on the CMUSphinx website.

Topic Identification with WEKA

I am completely new to the field of Data mining and WEKA tool (just installed it today).
I need to do topic identification based on short text sentences.
Let's say I have several categories:
- politics
- sports
- other
I am thinking of doing the following:
Have a list of terms that I compare the text to:
Sports:
NFL
NBA
Touch down
etc
Politics:
election
president
Obama
etc
Also, I would like to add more categories.
Then I would apply some algorithm, such as SVM or Naive Bayes, with the help of WEKA.
Any idea on how to start doing this with WEKA?
I have searched through some WEKA tutorials but I can't seem to find any examples similar to what I am trying to do.
Any help to start me up will be appreciated.
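One way to start (a sketch of a possible workflow, not the only one): put the raw sentences and their categories into an ARFF file with a string attribute, load it in the WEKA Explorer, apply the unsupervised StringToWordVector filter to turn the sentences into word features, and then train a classifier such as NaiveBayesMultinomial or SMO (WEKA's SVM). A made-up example file:

    @relation topics

    @attribute text  string
    @attribute class {politics, sports, other}

    @data
    'Obama wins the election', politics
    'NFL touchdown in the last minute', sports
    'New cafe opened downtown', other

StringToWordVector builds the word features for you, so the hand-made term lists are not strictly needed to get a first baseline.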

Weka cross validation wrong results

I am classifying 5 minutes of EEG data from 4 classes using a Bayesian network.
When applying cross-validation I get 100% correct results, whereas when I use training and supplied test data (the first 3.7 minutes for training, 1.3 minutes for testing) in a separate file I get really low results (30%).
I am new to Weka and do not know how this is possible. Any help would be highly appreciated :)