ML.NET: how to make the input model generic?

I have 3 use cases for multiclass classification, and their InputModels are all different because they have different columns and data structures. How can I refactor the method below so that it can predict with any kind of InputModel, without copying the method 3 times just to cater for 3 different input data structures?
private List<MulticlassClassificationPrediction> Predict(string modelName, string testDataPath)
{
PredictionEngine<InputModel, MulticlassClassificationPrediction> predEngine;
predEngine = _predEnginePool.GetPredictionEngine(modelName: modelName);
IDataView dataView = _mlContext.Data.LoadFromTextFile<InputModel>(
path: testDataPath,
hasHeader: true,
separatorChar: ',',
allowQuoting: true,
allowSparse: false);
// Use first line of dataset as model input
// You can replace this with new test data (hardcoded or from end-user application)
List<InputModel> testDataList = _mlContext.Data.CreateEnumerable<InputModel>(dataView, false).ToList();
List<MulticlassClassificationPrediction> predictionList = new List<MulticlassClassificationPrediction>();
foreach (InputModel testData in testDataList)
{
MulticlassClassificationPrediction result = predEngine.Predict(testData);
predictionList.Add(result);
}
return predictionList;
}

If I understand your question correctly, have you tried something like this?
private List<MulticlassClassificationPrediction> Predict<TInputModel>(string modelName, string testDataPath) where TInputModel: class, new()
{
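// Note: this assumes _predEnginePool can hand back an engine typed for TInputModel,
// e.g. a PredictionEnginePool<TInputModel, MulticlassClassificationPrediction> (one pool per input model type).
PredictionEngine<TInputModel, MulticlassClassificationPrediction> predEngine;
predEngine = _predEnginePool.GetPredictionEngine(modelName: modelName);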
IDataView dataView = _mlContext.Data.LoadFromTextFile<TInputModel>(
path: testDataPath,
hasHeader: true,
separatorChar: ',',
allowQuoting: true,
allowSparse: false);
// Materialize every row of the file as TInputModel and predict each one in turn
// You can replace this with new test data (hardcoded or from an end-user application)
var testDataList = _mlContext.Data.CreateEnumerable<TInputModel>(dataView, false).ToList();
List<MulticlassClassificationPrediction> predictionList = new List<MulticlassClassificationPrediction>();
foreach (var testData in testDataList)
{
MulticlassClassificationPrediction result = predEngine.Predict(testData);
predictionList.Add(result);
}
return predictionList;
}

Google Script Run Function IF text in another sheet's column contains a 'specific text'

I've done extensive search for this, but none of them seems to work. They all just give me a blank sheet.
Sample sheet
Basically, I have a function that extracts data from Col. B in DATA to Result, then does some other things (split, trim, etc.).
I want to run this function only when the text in Col. A in DATA is 250P.
So it would be like: IF DATA!A1:A contains the text "250p", then run function EXTRACT.
This is the code I have as of now:
//this extract works fine but I just need this to work for only those with value 250 in Col A//
function EXTRACT() {
var spreadsheet = SpreadsheetApp.getActive();
spreadsheet.getRange('A1').setFormula('=EXTRACTDATA(DATA!A1:A)');
}
function IF250() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getSheetByName('DATA');
var range = sheet.getRange('DATA!A1:A');
var values = range.getValues();
if (values[i] == "250g") {
EXTRACT();
}
}
Better yet, I would like to have the data set split into 2 separate sheets: the 250s in one sheet and the 500s in another. But this is not necessary.
After reviewing your sheet, this is a possible solution:
Code.gs
const sS = SpreadsheetApp.getActiveSpreadsheet()
function grabData() {
const sheetIn = sS.getSheetByName('data')
const sheetOut = sS.getSheetByName('Desired Outcome')
const range = 'A2:B'
/* Grab all the data from columns A and B and filter it */
const values = sheetIn.getRange(range).getValues().filter(n => n[0])
/* Retrieve only the names if the type contains 250p */
/* In format [[a], [b], ...] */
const parsedValues = values.map((arr) => {
const [type, name] = arr
if (type.toLowerCase().includes('250p')) {
return name.split('\n')
}
})
.filter(n => n)
.flat()
.map(n => [n])
/* Add the values to the Desired Outcome Sheet */
sheetOut
.getRange(sheetOut.getLastRow() + 1, 1, parsedValues.length)
.setValues(parsedValues)
}
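The filtering pipeline in grabData can be tried outside Apps Script on a plain array. The following is a minimal sketch, not the answer's exact code: extractNames and sampleRows are hypothetical names, and the rows only imitate the 2D array shape that getValues() returns for columns A:B.

```javascript
// Hypothetical helper: keeps the names from rows whose type cell contains
// `needle` (case-insensitive) and returns them as single-cell rows.
function extractNames(rows, needle) {
  const wanted = needle.toLowerCase();
  return rows
    // Keep only rows whose first cell contains the wanted text.
    .filter(([type]) => String(type).toLowerCase().includes(wanted))
    // A cell may hold several names separated by line breaks.
    .flatMap(([, names]) => String(names).split('\n'))
    // setValues() expects a 2D array, so wrap each name in its own row.
    .map(name => [name]);
}

// Made-up sample data in the same shape as getValues() output.
const sampleRows = [
  ['250P', 'Alice\nBob'],
  ['500P', 'Carol'],
  ['250p', 'Dave'],
];

const extracted = extractNames(sampleRows, '250p');
console.log(JSON.stringify(extracted));
```

Using filter before flatMap avoids the intermediate undefined entries that the map-then-filter version has to strip out.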
Try changing:
var values = range.getValues();
to
var values = range.getDisplayValues()
getDisplayValues() reads the value exactly as it is shown in the cell. Try logging the values with both to see why! (Blank)
You are also not currently iterating, or looping, your values.
If you're just looking to see if the column contains a cell containing the value 250p, try:
function IF250() {
const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(`DATA`)
const valueExists = sheet.getRange(`A1:A`)
.getDisplayValues()
.filter(String)
.some(row => row.includes(`250P`))
if (valueExists) EXTRACT()
}
Commented:
function IF250() {
const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(`DATA`)
const valueExists = sheet.getRange(`A1:A`)
.getDisplayValues()
// Remove empty cells (not strictly necessary)
.filter(String)
// Return true if any row holds a cell that is exactly `250P`
// (Array.prototype.includes compares whole cells, case-sensitively).
.some(row => row.includes(`250P`))
// If valueExists is true, run EXTRACT:
if (valueExists) EXTRACT()
}
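Note that row.includes only matches a cell whose entire text equals the search value. If a case-insensitive "contains" check is wanted instead, something like the sketch below may help. columnContains and displayValues are hypothetical names, and the array only imitates what getDisplayValues() returns.

```javascript
// Hypothetical helper: true if any cell contains `needle`, ignoring case.
function columnContains(values, needle) {
  const wanted = needle.toLowerCase();
  return values.some(row =>
    row.some(cell => String(cell).toLowerCase().includes(wanted))
  );
}

// Made-up sample shaped like getDisplayValues() output for one column.
const displayValues = [['100P'], [''], ['Pack 250p']];

// Whole-cell, case-sensitive match (what row.includes does).
const exactMatch = displayValues.some(row => row.includes('250P'));
// Substring, case-insensitive match.
const looseMatch = columnContains(displayValues, '250P');

console.log(exactMatch, looseMatch);
```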

How to get the correct count for a Lucid Model's Paginate when joining with additional tables

I have 2 Lucid models: Ad and Campaign, which are associated using a Many:Many relationship. They have a pivot table which manages the relationship which has additional information, so my table structure is as follows:
ads
id
...
campaign_ads
campaign_id
ad_id
spend
sent
clicks
leads
ftds
campaigns
id
...
I am trying to fetch the results of a paginate query using the Ad model's query function, but in addition to the Ad model's fields, I would also like to fetch the sum of spend, sent, clicks, leads and ftds from the related Campaign models' pivot rows.
I have come up with the following code, which returns the correct information in the collection, but returns an incorrect value for the count:
const Ad = use('App/Models/Ad');
const query = Ad.query()
.leftJoin('campaign_ads', 'ads.id', 'campaign_ads.ad_id')
.select('ads.*')
.sum('campaign_ads.spend as spend')
.sum('campaign_ads.sent as sent')
.sum('campaign_ads.clicks as clicks')
.sum('campaign_ads.leads as leads')
.sum('campaign_ads.ftds as ftds')
.groupBy('ads.id')
.paginate()
I assume that this is related to how the paginate function rewrites or performs the query, but I have no idea how to fix it.
Here is some example usage based on the answer:
const _ = require('lodash');
const Ad = use('App/Models/Ad');
const query = Ad.query()
.leftJoin('campaign_ads', 'ads.id', 'campaign_ads.ad_id')
.select('ads.*')
.sum('campaign_ads.spend as spend')
.sum('campaign_ads.sent as sent')
.sum('campaign_ads.clicks as clicks')
.sum('campaign_ads.leads as leads')
.sum('campaign_ads.ftds as ftds')
.groupBy('ads.id')
const paginate = async (query, page = 1, perPage = 20) => {
// Types of statements which are going to filter from the count query
const excludeAttrFromCount = ['order', 'columns', 'limit', 'offset', 'group']
// Clone the original query which we are paginating
const countByQuery = query.clone();
// Cast page and perPage to numbers
page = Number(page)
perPage = Number(perPage)
// Filter out the statement types listed above so we have a query which can run cleanly for counting
countByQuery.query._statements = _.filter(countByQuery.query._statements, (statement) => {
return excludeAttrFromCount.indexOf(statement.grouping) < 0
})
// Since in my case, i'm working with a left join, i'm going to ensure that i'm only counting the unique models
countByQuery.countDistinct([Ad.table, 'id'].join('.'));
const counts = await countByQuery.first()
const total = parseInt(counts.count);
let data;
// If we get a count of 0, there's no point in delaying processing for an additional DB query
if (0 === total) {
data = [];
}
// Use the query's native `fetch` method, which already creates instances of the models and eager loads any relevant data
else {
const {rows} = await query.forPage(page, perPage).fetch();
data = rows;
}
// Create the results object that you would normally get
const result = {
total: total,
perPage: perPage,
page: page,
lastPage: Math.ceil(total / perPage),
data: data
}
// Create the meta data which we will pass to the pagination hook + serializer
const pages = _.omit(result, ['data'])
if (Ad.$hooks) {
await Ad.$hooks.after.exec('paginate', data, pages)
}
// Create and return the serialized versions
const Serializer = Ad.resolveSerializer()
return new Serializer(data, pages);
}
paginate(query, 1, 20)
.then(results => {
// do whatever you want to do with the results here
})
.catch(error => {
// do something with the error here
})
So, as I noted before, the problem I was having was caused by how Lucid's query builder handles the paginate function, so I was forced to "roll my own". Here's what I came up with:
async paginate (query, page = 1, perPage = 20) {
// Types of statements which are going to filter from the count query
const excludeAttrFromCount = ['order', 'columns', 'limit', 'offset', 'group']
// Clone the original query which we are paginating
const countByQuery = query.clone();
// Cast page and perPage to numbers
page = Number(page)
perPage = Number(perPage)
// Filter out the statement types listed above so we have a query which can run cleanly for counting
countByQuery.query._statements = _.filter(countByQuery.query._statements, (statement) => {
return excludeAttrFromCount.indexOf(statement.grouping) < 0
})
// Since in my case, i'm working with a left join, i'm going to ensure that i'm only counting the unique models
countByQuery.countDistinct([this.#model.table, 'id'].join('.'));
const counts = await countByQuery.first()
const total = parseInt(counts.count);
let data;
// If we get a count of 0, there's no point in delaying processing for an additional DB query
if (0 === total) {
data = [];
}
// Use the query's native `fetch` method, which already creates instances of the models and eager loads any relevant data
else {
const {rows} = await query.forPage(page, perPage).fetch();
data = rows;
}
// Create the results object that you would normally get
const result = {
total: total,
perPage: perPage,
page: page,
lastPage: Math.ceil(total / perPage),
data: data
}
// Create the meta data which we will pass to the pagination hook + serializer
const pages = _.omit(result, ['data'])
// this.#model references the Model (not the instance). I reference it like this because this function is part of a larger class
if (this.#model.$hooks) {
await this.#model.$hooks.after.exec('paginate', data, pages)
}
// Create and return the serialized versions
const Serializer = this.#model.resolveSerializer()
return new Serializer(data, pages);
}
I only use this version of pagination when I detect a group by in my query. It follows Lucid's own paginate function pretty closely and returns identical output. While it's not a 100% drop-in solution, it's good enough for my needs.
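The two pure pieces of that function, stripping statements for the count query and building the page metadata, can be sketched without a database. This is an illustration only: stripForCount, buildPageMeta and the statement objects are hypothetical, simplified stand-ins for the query builder's internal _statements array, not its real structure.

```javascript
// Statement groupings that should not take part in the count query.
const EXCLUDED_GROUPINGS = ['order', 'columns', 'limit', 'offset', 'group'];

// Hypothetical helper: drop the statements that would distort a count.
function stripForCount(statements) {
  return statements.filter(s => !EXCLUDED_GROUPINGS.includes(s.grouping));
}

// Hypothetical helper: build the pagination metadata from a known total.
function buildPageMeta(total, page = 1, perPage = 20) {
  const size = Number(perPage);
  return {
    total,
    perPage: size,
    page: Number(page),
    lastPage: Math.ceil(total / size),
  };
}

// Simplified stand-ins for the statements of the grouped query above.
const statements = [
  { grouping: 'columns', value: ['ads.*'] },
  { grouping: 'join', table: 'campaign_ads' },
  { grouping: 'group', value: ['ads.id'] },
  { grouping: 'where', column: 'ads.active' },
];

const keptGroupings = stripForCount(statements).map(s => s.grouping);
const meta = buildPageMeta(45, '2', '20');
console.log(keptGroupings, meta);
```

Joins and where clauses are kept so the count still reflects the filtered set, while the group by that caused the wrong total is removed.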

parse and replace a list of object in kotlin

I currently have two lists of objects, passed to a function defined as:
fun updateList(tools: List<Tool>, updateTools: List<Updated>){
... code below
}
the Tool data class is defined as:
data class Tool(
var id: String = ""
var description: String = ""
var assignedTo: String = ""
)
the Updated data class is defined as:
data class Updated(
var id: String = ""
var assignedTo: String = ""
)
Basically, I iterate over the list updateTools, and if I find an id match in tools, I replace the assignedTo field of the matching Tool object with the one from updateTools:
fun updateList(tools: List<Tool>, updateTools: List<Updated>){
updateTools.forEach{
val idToSearch = it.id
val nameToReplace = it.name
tools.find(){
if(it.id == idToSearch){it.name=nameToReplace}
}
}
return tools
}
It's not working, and I don't see how to make it work more simply. I just started Kotlin and I feel this is not a good way to do it.
Any ideas?
Thanks
First of all, you're not assigning assignedTo, you're assigning name, and you're doing so in the predicate passed to find. That predicate should only return a Boolean value to filter elements and should probably not have any side effects; those should be done afterwards with a call to, e.g., forEach.
Additionally, your constructor parameters to the data class are normal parameters, and as such, need commas between them!
Your last code block, corrected, would be:
updateTools.forEach {
val idToSearch = it.id
val nameToReplace = it.assignedTo
tools.filter { it.id == idToSearch }.forEach { it.assignedTo = nameToReplace }
}
return tools
I'd do it like this (shorter):
updateTools.forEach { u -> tools.filter { it.id == u.id }.forEach { it.assignedTo = u.assignedTo } }
This loops through each update, filters tools for tools with the right ID, and sets assignedTo on each of these tools.
I use forEach as filter returns a List<Tool>.
If you can guarantee that id is unique, you can do it like this instead:
updateTools.forEach { u -> tools.find { it.id == u.id }?.assignedTo = u.assignedTo }
find returns the first element matching the condition, or null if there is none. It is equivalent to firstOrNull; its implementation just calls firstOrNull.
The ?. safe call operator returns null if the left operand is null; otherwise, it accesses the member.
For an assignment, using the safe call operator means the assignment is simply skipped if the left operand is null.
If we combine these, it effectively sets assignedTo on the first element which matches this condition.
First, you're missing commas between the properties in your data classes, so it should be:
data class Tool(
var id: String = "",
var description: String = "",
var assignedTo: String = ""
)
data class Updated(
var id: String = "",
var assignedTo: String = ""
)
As for the second problem, there are probably a number of ways to do that, but I've only corrected your idea:
fun updateList(tools: List<Tool>, updateTools: List<Updated>): List<Tool> {
updateTools.forEach{ ut ->
tools.find { it.id == ut.id }?.assignedTo = ut.assignedTo
}
return tools
}
Instead of assigning values to variables, you can name the parameter of the forEach lambda and use it in the rest of the loop.

Reflection on EmberJS objects? How to find a list of property keys without knowing the keys in advance

Is there a way to retrieve the set-at-creation properties of an EmberJS object if you don't know all your keys in advance?
Via the inspector I see all the object properties which appear to be stored in the meta-object's values hash, but I can't seem to find any methods to get it back. For example object.getProperties() needs a key list, but I'm trying to create a generic object container that doesn't know what it will contain in advance, but is able to return information about itself.
I haven't used this in production code, so your mileage may vary, but reviewing the Ember source suggests two functions that might be useful to you, or at least worth reviewing the implementation:
Ember.keys: "Returns all of the keys defined on an object or hash. This is useful when inspecting objects for debugging. On browsers that support it, this uses the native Object.keys implementation." Object.keys documentation on MDN
Ember.inspect: "Convenience method to inspect an object. This method will attempt to convert the object into a useful string description." Source on Github
I believe the simple answer is: you don't find a list of props. At least I haven't been able to.
However, I noticed that Ember's internal props appear to be prefixed with __ember, which led me to solve it like this:
for (var f in App.model) {
if (App.model.hasOwnProperty(f) && f.indexOf('__ember') < 0) {
console.log(f);
}
};
And it seems to work. But I don't know whether it's 100% certain to not get any bad props.
EDIT: Adam's gist, provided in the comments: https://gist.github.com/1817543
var getOwnProperties = function(model){
var props = {};
for(var prop in model){
if( model.hasOwnProperty(prop)
&& prop.indexOf('__ember') < 0
&& prop.indexOf('_super') < 0
&& Ember.typeOf(model.get(prop)) !== 'function'
){
props[prop] = model[prop];
}
}
return props;
}
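The same filtering idea can be tried on a plain object, without Ember loaded. This is only a sketch under that assumption: ownDataKeys and fakeModel are hypothetical names, the object merely imitates the kind of keys described above, and a typeof check stands in for Ember.typeOf.

```javascript
// Hypothetical helper: own enumerable keys, minus Ember-internal-looking
// keys, _super, and anything whose value is a function.
function ownDataKeys(obj) {
  return Object.keys(obj).filter(key =>
    !key.includes('__ember') &&
    !key.includes('_super') &&
    typeof obj[key] !== 'function'
  );
}

// Made-up object imitating the mix of keys seen on a model instance.
const fakeModel = {
  name: 'wmarbut',
  age: null,
  __ember123: {},
  _super: function () {},
  greet: function () { return 'hi'; },
};

const dataKeys = ownDataKeys(fakeModel);
console.log(dataKeys);
```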
Neither of these answers is reliable, unfortunately, because any keys paired with a null or undefined value will not be visible.
e.g.
MyClass = Ember.Object.extend({
name: null,
age: null,
weight: null,
height: null
});
test = MyClass.create({name: 'wmarbut'});
console.log( Ember.keys(test) );
Is only going to give you
["_super", "name"]
The solution that I came up with is:
/**
* Method to get keys out of an object into an array
* @param object obj_proto The dumb javascript object to extract keys from
* @return array an array of keys
*/
function key_array(obj_proto) {
var keys = [];
for (var key in obj_proto) {
keys.push(key);
}
return keys;
}
/*
* Put the structure of the object that you want into a dumb JavaScript object
* instead of directly into an Ember.Object
*/
MyClassPrototype = {
name: null,
age: null,
weight: null,
height: null
}
/*
* Extend the Ember.Object using your dumb javascript object
*/
MyClass = Ember.Object.extend(MyClassPrototype);
/*
* Set a hidden field for the keys the object possesses
*/
MyClass.reopen({__keys: key_array(MyClassPrototype)});
Using this method, you can now access the __keys field and know which keys to iterate over. This does not, however, solve the problem of objects whose structure isn't known beforehand.
I use this:
Ember.keys(Ember.meta(App.YOUR_MODEL.proto()).descs)
None of those answers worked for me. I already had a solution for Ember Data; I was just after one for Ember.Object. I found the following to work just fine. (Remove Ember.getProperties if you only want the keys, not a hash of key/value pairs.)
var getPojoProperties = function (pojo) {
return Ember.getProperties(pojo, Object.keys(pojo));
},
getProxiedProperties = function (proxyObject) {
// Three levels, first the content, then the prototype, then the properties of the instance itself
var contentProperties = getPojoProperties(proxyObject.get('content')),
prototypeProperties = Ember.getProperties(proxyObject, Object.keys(proxyObject.constructor.prototype)),
objectProperties = getPojoProperties(proxyObject);
return Ember.merge(Ember.merge(contentProperties, prototypeProperties), objectProperties);
},
getEmberObjectProperties = function (emberObject) {
var prototypeProperties = Ember.getProperties(emberObject, Object.keys(emberObject.constructor.prototype)),
objectProperties = getPojoProperties(emberObject);
return Ember.merge(prototypeProperties, objectProperties);
},
getEmberDataProperties = function (emberDataObject) {
var attributes = Ember.get(emberDataObject.constructor, 'attributes'),
keys = Ember.get(attributes, 'keys.list');
return Ember.getProperties(emberDataObject, keys);
},
getProperties = function (object) {
if (object instanceof DS.Model) {
return getEmberDataProperties(object);
} else if (object instanceof Ember.ObjectProxy) {
return getProxiedProperties(object);
} else if (object instanceof Ember.Object) {
return getEmberObjectProperties(object);
} else {
return getPojoProperties(object);
}
};
In my case Ember.keys(someObject) worked, without doing someObject.toJSON().
I'm trying to do something similar, i.e. render a generic table of rows of model data to show columns for each attribute of a given model type, but let the model describe its own fields.
If you're using Ember Data, then this may help:
http://emberjs.com/api/data/classes/DS.Model.html#method_eachAttribute
You can iterate the attributes of the model type and get meta data associated with each attribute.
This worked for me (from an ArrayController):
fields: function() {
var doc = this.get('arrangedContent');
var fields = [];
var content = doc.content;
content.forEach(function(attr, value) {
var data = Ember.keys(attr._data);
data.forEach(function(v) {
if( typeof v === 'string' && $.inArray(v, fields) == -1) {
fields.push(v);
}
});
});
return fields;
}.property('arrangedContent')

EF4: Get the linked column names from NavigationProperty of an EDMX

I am generating POCOs (let's say they are subclasses of MyEntityObject) by using a T4 template from an EDMX file.
I have 3 entities, e.g.:
MyTable1 (PrimaryKey: MyTable1ID)
MyTable2 (PrimaryKey: MyTable2ID)
MyTable3 (PrimaryKey: MyTable3ID)
These entities have the following relations:
MyTable1.MyTable1ID <=> MyTable2.MyTable1ID (MyTable1ID is the foreign key to MyTable1)
MyTable2.MyTable2ID <=> MyTable3.MyTable2ID (MyTable2ID is the foreign key to MyTable2)
Or in another view:
MyTable1 <= MyTable2 <= MyTable3
I want to extract all foreign key relations:
NavigationProperty[] foreignKeys = entity.NavigationProperties.Where(np => np.DeclaringType == entity && ((AssociationType)np.RelationshipType).IsForeignKey).ToArray();
foreach (NavigationProperty foreignKey in foreignKeys)
{
// generate code....
}
My Question: How can I extract the column names that are linked between two entities?
Something like this:
void GetLinkedColumns(MyEntityObject table1, MyEntityObject table2, out string fkColumnTable1, out string fkColumnTable2)
{
// do the job
}
In the example
string myTable1Column;
string myTable2Column;
GetLinkedColumns(myTable1, myTable2, out myTable1Column, out myTable2Column);
the result should be
myTable1Column = "MyTable1ID";
myTable2Column = "MyTable2ID";
The first answer works if your foreign key columns are exposed as properties in your conceptual model. Also, the GetSourceSchemaTypes() method is only available in some of the text templates included with EF, so it is helpful to know what this method does.
If you want to always know the column names, you will need to load the AssociationType from the storage model as follows:
// Obtain a reference to the navigation property you are interested in
var navProp = GetNavigationProperty();
// Load the metadata workspace
MetadataWorkspace metadataWorkspace = null;
bool allMetadataLoaded = loader.TryLoadAllMetadata(inputFile, out metadataWorkspace);
// Get the association type from the storage model
var association = metadataWorkspace
.GetItems<AssociationType>(DataSpace.SSpace)
.Single(a => a.Name == navProp.RelationshipType.Name);
// Then look at the referential constraints
var toColumns = String.Join(",",
association.ReferentialConstraints.SelectMany(rc => rc.ToProperties));
var fromColumns = String.Join(",",
association.ReferentialConstraints.SelectMany(rc => rc.FromProperties));
In this case, loader is a MetadataLoader defined in EF.Utility.CS.ttinclude and inputFile is a standard string variable specifying the name of the .edmx file. These should already be declared in your text template.
Not sure exactly whether you want to generate code using the columns or not, but this may partly help to answer your question (How can I extract the column names that are linked between two entities?) ...
NavigationProperty[] foreignKeys = entity.NavigationProperties
.Where(np => np.DeclaringType == entity &&
((AssociationType)np.RelationshipType).IsForeignKey).ToArray();
foreach (NavigationProperty foreignKey in foreignKeys)
{
foreach(var rc in GetSourceSchemaTypes<AssociationType>()
.Single(x => x.Name == foreignKey.RelationshipType.Name)
.ReferentialConstraints)
{
foreach(var tp in rc.ToProperties)
WriteLine(tp.Name);
foreach(var fp in rc.FromProperties)
WriteLine(fp.Name);
}
}
This code works fine in my Visual Studio 2012:
<#@ template language="C#" debug="true" hostspecific="true"#>
<#@ include file="EF.Utility.CS.ttinclude"#>
<#
string inputFile = @"DomainModel.edmx";
MetadataLoader loader = new MetadataLoader(this);
EdmItemCollection ItemCollection = loader.CreateEdmItemCollection(inputFile);
foreach (EntityType entity in ItemCollection.GetItems<EntityType>().OrderBy(e => e.Name))
{
foreach (NavigationProperty navProperty in entity.NavigationProperties)
{
AssociationType association = ItemCollection.GetItems<AssociationType>().Single(a => a.Name == navProperty.RelationshipType.Name);
string fromEntity = association.ReferentialConstraints[0].FromRole.Name;
string fromEntityField = association.ReferentialConstraints[0].FromProperties[0].Name;
string toEntity = association.ReferentialConstraints[0].ToRole.Name;
string toEntityField = association.ReferentialConstraints[0].ToProperties[0].Name;
}
}
#>