How to combine dataset with mxnet?

I have two separate folders containing 3D arrays (data); each folder holds files of a single class. I used mxnet.gluon.data.ArrayDataset() to create a dataset for each label. Is there a way to combine these two datasets into a final training dataset that covers both classes? The two datasets are of different sizes.
e.g.
A_data = mx.gluon.data.ArrayDataset(list2, label_A)
noA_data = mx.gluon.data.ArrayDataset(list, label_noA)
^ I want to combine A_data and noA_data into one complete dataset.
Additionally, is there an easier way to build a single mxnet dataset directly from the two folders and their classes from the get-go? That would also solve my problem.

You could create a single ArrayDataset that contains both. If list and list2 are both Python lists, you could do something like
full_data = mx.gluon.data.dataset.ArrayDataset(list + list2, label_noA + label_A)
where len(label_noA) == len(list) and len(label_A) == len(list2).
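
A minimal sketch of the concatenation described above. The helper names combine_with_labels and to_gluon_dataset, the label values 0 and 1, and the tiny nested lists standing in for the 3D arrays are all assumptions for illustration and are not from the original post. The mxnet import is kept inside its own function so the list handling can be checked without mxnet installed.

```python
def combine_with_labels(samples_noA, samples_A, label_noA=0, label_A=1):
    """Concatenate two sample lists and build one label per sample."""
    data = list(samples_noA) + list(samples_A)
    # Labels follow the same order as the data: noA samples first, then A.
    labels = [label_noA] * len(samples_noA) + [label_A] * len(samples_A)
    return data, labels


def to_gluon_dataset(data, labels):
    """Wrap the combined lists in a Gluon ArrayDataset (requires mxnet)."""
    import mxnet as mx  # imported here so the rest runs without mxnet
    return mx.gluon.data.ArrayDataset(data, labels)


# Tiny nested lists stand in for the 3D arrays loaded from each folder.
noA_samples = [[[0.0]], [[0.1]], [[0.2]]]
A_samples = [[[1.0]], [[1.1]]]

data, labels = combine_with_labels(noA_samples, A_samples)
print(labels)
```

With mxnet available, to_gluon_dataset(data, labels) should give a dataset whose items are (sample, label) pairs. The two classes end up in consecutive blocks, so shuffling at load time is probably worth enabling.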

Related

Creating a single variable using the data in three separate columns

I have results from three different tests for each participant:
Vocabulary > scores out of 104
Cloze_Test > scores out of 22
Read_hours > in terms of hours (ranges from 4 to 50 hours)
So the scores have different scales. I want to create a single variable using the data from these three columns. What is the best way? Standardize them and combine? If yes, how do I combine them?
I standardized using
data$Vocabulary <- as.data.frame(scale(data$sVocabulary))
but can't combine them. I have tried this:
data.Pax %>% mutate(Reading = cbind(sVocab, scloze, sreadhours))
I also tried PCA, but I'm not sure if this is the right way to use PCA...
scoredata <- data[,c("Vocabulary","Cloze_Test","Read_hours")]
pca_profscore <- prcomp(scoredata, center = T, scale.= T)
data.Pax["Reading"] <- vectorizedscore <- pca_profscore$x[,1]
summary(pca_profscore)
print(pca_profscore)
pca_profscore$rotation[,1]

Merge 2 object lists in java

I have two lists, listA and listB, of type object:
ListA[name=abc, age=34, weight=0, height=0] data collected from an Excel sheet
ListB[name=null, age=0, weight=70, height=6] data collected from a database
Now I want to combine both lists into a single list:
MergedList[name=abc, age=34, weight=70, height=6]
Note: my object class has more than 15 properties, so copying each property one by one using getProperty() would be time-consuming. Is there a better way?

Convert them to a Map where the key is the name of the object (your notation name=abc suggests the elements are name/value pairs).
Map<String,MyMysteriousObject> converted = list.stream().collect( Collectors.toMap(MyMysteriousObject::getName, Function.identity() ) );
(Replace getName with whatever function you use to get the name of your object.)
And then just merge the maps. How to merge maps is described here for example.
While at it, consider replacing the List with a Map in your entire code. It will surely save a lot of work elsewhere too.
But if you have to have a list again, just use List<MyMysteriousObject> resultList = new ArrayList<>(resultMap.values());

Python 2.7 - How to call individual columns from transposed csv file

I understand that the csv module exists, however for my current project we are not allowed to use the module to call csv files.
My code is as follows;
table = []
for line in open("data.csv"):
    data = line.split(",")
    table.append(data)
transposed = [[table[j][i] for j in range(len(table))] for i in range(len(table[0]))]
rows = transposed[1][1:]
rows = [float(i) for i in rows]
I'm really new to Python, so this is probably a very basic question. I've been scouring the internet all day and have struggled to find a solution. All I need is to be able to pull the data from any individual column so I can analyse it. Thanks

Your data is organized as a list of lists. Each sublist represents a row. To better illustrate this, I would avoid list comprehensions because they are more difficult to read. I would also avoid variable names like 'i' and 'j' and instead use more descriptive names like row or column. Here is a simple example of how I would accomplish this:
def read_csv():
    table = []
    with open("data.csv") as fileobj:
        for line in fileobj.readlines():
            # strip the trailing newline before splitting into cells
            data = line.strip().split(',')
            table.append(data)
    return table
def get_column_data(data, column_index):
    column_data = []
    for row in data:
        cell_data = row[column_index]
        column_data.append(cell_data)
    return column_data
data = read_csv()
get_column_data(data, column_index=2) #example usage
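
The question also converts a column to floats after skipping its first entry. A small sketch of that step, following the same explicit-loop style, might look like the following. The function names parse_lines and numeric_column, the has_header flag, and the inline sample rows are assumptions for illustration; the sample rows replace data.csv so the snippet runs on its own.

```python
def parse_lines(lines):
    """Split comma-separated lines into a list of rows (lists of strings)."""
    table = []
    for line in lines:
        stripped = line.strip()
        if stripped:  # ignore blank lines
            table.append(stripped.split(","))
    return table


def numeric_column(table, column_index, has_header=True):
    """Return one column as floats, optionally skipping a header row."""
    column_data = []
    rows = table[1:] if has_header else table
    for row in rows:
        column_data.append(float(row[column_index]))
    return column_data


# Hypothetical sample data standing in for the contents of data.csv.
sample_lines = ["name,height,weight\n", "a,1.5,60\n", "b,1.8,72.5\n"]
table = parse_lines(sample_lines)
heights = numeric_column(table, 1)
print(heights)
```

To use it on a real file, the lines could come from open("data.csv") instead of sample_lines. Note that float() will raise a ValueError on any non-numeric cell.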

Top n outliers in ResultWriter

I am dealing with a large, high-dimensional dataset, so I need to get just the top N outliers from the output of ResultWriter.
Is there an option in ELKI to get just the top N outliers from this output?

The ResultWriter is some of the oldest code in ELKI, and needs to be rewritten. It's rather generic - it tries to figure out how to best serialize output as text.
If you want some specific format, or a specific subset, the proper way is to write your own ResultHandler. There is a tutorial for writing a ResultHandler.
If you want to find the input coordinates in the result,
Database db = ResultUtil.findDatabase(baseResult);
Relation<NumberVector> rel = db.getRelation(TypeUtil.NUMBER_VECTOR_VARIABLE_LENGTH);
will return the first relation containing numeric vectors.
To iterate over the objects sorted by their outlier score, use:
OrderingResult order = outlierResult.getOrdering();
DBIDs ids = order.order(order.getDBIDs());
for (DBIDIter it = ids.iter(); it.valid(); it.advance()) {
    // Output as desired.
}

Adding a new row to a list

I have:
row1 = [1,'a']
row2 = [2,'b']
I want to create 'allrows' to look like these two rows concatenated together. In fact, I want to start with an empty list and add rows.
append does not do the job, it just creates a long horizontal list.
How do I create a list or other structure that holds each row as a ROW?
For two rows, I want the result to be:
[[1,'a']
[2,'b']]
I am not sure I need the outer brackets, but I put them there assuming the final structure is itself a list. I suppose any other structure that holds these lists, like an "array" of lists, will be fine, as long as I can write out specific rows using:
for line in allrows:
print line
Thanks!

I'm guessing that you are coding in Python.
A list can hold other lists, so you can do this:
allrows = [row1, row2]
for row in allrows:
    print (row)
Output would be
[1, 'a']
[2, 'b']
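
Since the question asks about starting from an empty list, here is a short sketch of that case. It assumes the "long horizontal list" came from extend or +=, which the question does not actually state. Calling append with the row itself keeps each row as a single element.

```python
row1 = [1, 'a']
row2 = [2, 'b']

# append adds each row as a single element, giving a list of rows.
allrows = []
allrows.append(row1)
allrows.append(row2)

# extend (like +=) copies the row's elements instead, flattening the result.
flat = []
flat.extend(row1)
flat.extend(row2)

for row in allrows:
    print(row)
```

Here allrows holds two rows, while flat holds four separate values.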
