How can I combine a cell array and two scalars to obtain a cell array of {string, scalar1, scalar2} elements in Octave, avoiding loops?

I have a cell array like P and two other float variables
P = {"GG+G[G]", "GG", "GG-GGG", "GG[][GG]", "[G[GG-G]]", "G[GG]+[G]"};
val1 = 0.01;
val2 = 0.3;
And I would like to build the following data structure without using a loop, because the P cell array can contain a large number of elements:
Inputparam =
{
[1,1] = {
[1,1] = "GG+G[G]"
[1,2] = 0.01
[1,3] = 0.3
}
[1,2] = {
[1,1] = "GG"
[1,2] = 0.01
[1,3] = 0.3
}
[1,3] = {
[1,1] = "GG-GGG"
[1,2] = 0.01
[1,3] = 0.3
}
[1,4] = {
[1,1] = "GG[][GG]"
[1,2] = 0.01
[1,3] = 0.3
}
[1,5] = {
[1,1] = "[G[GG-G]]"
[1,2] = 0.01
[1,3] = 0.3
}
[1,6] = {
[1,1] = "G[GG]+[G]"
[1,2] = 0.01
[1,3] = 0.3
}
}
I've tried several options, but with most of them what I got was a concatenation rather than a combination of the elements.
The purpose of this structure is to serve as the argument of the parcellfun function, which is why I need each element of P paired with the val1 and val2 values.
I'm also considering using an anonymous function instead of allocating all this data in memory. Does that make sense?
Thanks in advance.

I suggest that, instead of a cell array of cell arrays, you create a 2D cell array, just because this is much easier to generate and, for large arrays, it takes up less memory too:
P = {"GG+G[G]", "GG", "GG-GGG", "GG[][GG]", "[G[GG-G]]", "G[GG]+[G]"};
P(2,:) = {0.01};
P(3,:) = {0.3};
This cell array is indexed as P{1,5}, rather than P{5}{1} as with the nested cell array in the OP.
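For instance, continuing the snippet above:
P{1,5}   % "[G[GG-G]]", the 5th string
P{2,5}   % 0.01, i.e. val1 for the 5th element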
Another alternative is to use struct arrays:
P = {"GG+G[G]", "GG", "GG-GGG", "GG[][GG]", "[G[GG-G]]", "G[GG]+[G]"};
P = struct('name',P,'val1',0.01,'val2',0.3);
The struct is indexed as P(5).name rather than P{5}{1} (and P(5).val1 instead of P{5}{2}, etc.).
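Finally, if you do want exactly the nested cell array shown in the question, here is a minimal sketch using cellfun; note that cellfun still iterates internally, so this avoids writing the loop rather than its cost:
% Wrap each element of P together with val1 and val2 into its own cell array
Inputparam = cellfun(@(s) {s, val1, val2}, P, "UniformOutput", false);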

Related

Merge lists with 0 1 encoding

I have the following case in python:
a = [[0,0,1,0],
[0,0,0,1],
[1,0,0,1],
[1,0,1,1]]
b = [[1,1,0,0],
[1,0,0,1],
[0,1,0,0]]
c = [[1,0,1,0],
[0,0,1,0],
[0,1,0,0]]
d = [[1,0,1,0],
[0,0,1,0],
[0,0,0,0],
[0,0,0,1],
[1,0,0,0]]
a has length 4, b has length 3, c has length 3, d has length 5, and I have several more lists of variable length.
What I want is to construct a function that can merge the "sub lists" considering the columns, for example:
def combine(foo):
    ...
    print(foo)
combine(a) = [1,0,1,1]
combine(b) = [1,1,0,1]
combine(c) = [1,1,1,0]
combine(d) = [1,0,1,1]
How can I do it?
Thanks for your help.
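Since each expected output is the column-wise logical OR of the sublists, a minimal sketch (assuming all sublists within one input have equal length) could be:
def combine(rows):
    # a column is 1 if any row has a 1 in that position
    return [int(any(col)) for col in zip(*rows)]

print(combine(a))  # [1, 0, 1, 1]
print(combine(b))  # [1, 1, 0, 1]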

Find the total number of 1's in the binary form of a group of numbers in a list in Python 3

I want to count the total number of '1's in the binary format of each number in a list.
z = ['0b111000','0b1000011'] # z is a list
d = z.count('1')
print(d)
The output is 0, whereas the required output should be [3, 3], i.e. the number of ones in every element that z contains.
Here it is:
z = ['0b111000','0b1000011']
finalData = []
for word in z:
    finalData.append(word.count('1'))
print(finalData)
The problem with your code was that list.count('1') looks for list elements equal to '1', not for characters inside the strings. You first need to get each string out of the list and then use the string's count() method on it.
Hope this helps :)
z = ['0b111000','0b1000011']
d = z.count('1')
This counts how many elements of z are equal to the string '1'. This obviously returns 0, since z contains '0b111000' and '0b1000011'.
You should iterate over every string in z and count the numbers of '1' in every string:
z = ['0b111000','0b1000011']
output = [string.count('1') for string in z]
print(output)
# [3, 3]
list.count(x) counts the number of elements of the list that are equal to x.
Use a list comprehension to loop through each string and count the number of 1s, such as:
z = ['0b111000','0b1000011']
d = [x.count("1") for x in z]
print(d)
This will output:
[3, 3]
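If the list held actual integers rather than '0b...' strings, a similar sketch using the built-in bin() would work:
z = [0b111000, 0b1000011]            # integers this time
d = [bin(x).count('1') for x in z]   # bin(x) gives e.g. '0b111000'
print(d)                             # [3, 3]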

How to manually initialize the values for the weights?

I would like to experiment the weights initialization recommended by Karpathy in his lecture notes,
the recommended heuristic is to initialize each neuron's weight vector
as: w = np.random.randn(n) / sqrt(n), where n is the number of its
inputs
source: http://cs231n.github.io/neural-networks-2/#init
I'm a beginner in Python, and I don't know how to implement this:
weights = tf.Variable(??)
Please help? ...
For a single value, use:
weights = tf.Variable(10)
For a matrix of random values scaled by 1/sqrt(n), where n is the number of inputs to each neuron (the fan-in, shape[0] here):
shape = [784, 625]
n = shape[0]  # number of inputs per neuron
weights = tf.Variable(tf.random_normal(shape) / (n ** 0.5))
Please note that you need to call sess.run to evaluate the variables.
Also, please check out other Random Tensors: https://www.tensorflow.org/versions/r0.8/api_docs/python/constant_op.html#random-tensors
import numpy as np
import tensorflow as tf

n = 10
init_x = np.random.randn(n) / np.sqrt(n)  # the heuristic from the question
x = tf.Variable(init_x)
sess = tf.InteractiveSession()
sess.run(tf.initialize_all_variables())
print(sess.run(x))
I do it in the following way (vs here is assumed to be TensorFlow's variable_scope module):
self.w_full, self.b_full = [], []
n_fc_layers = len(structure)
structure.insert(0, self.n_inputs)
with vs.variable_scope(self.scope):
    for lr_idx in range(n_fc_layers):
        n_in, n_out = structure[lr_idx], structure[lr_idx + 1]
        # Xavier/Glorot-style uniform initialization for the weights
        self.w_full.append(
            vs.get_variable(
                "FullWeights{}".format(lr_idx),
                [n_in, n_out],
                dtype=tf.float32,
                initializer=tf.random_uniform_initializer(
                    minval=-tf.sqrt(tf.constant(6.0) / (n_in + n_out)),
                    maxval=tf.sqrt(tf.constant(6.0) / (n_in + n_out))
                )
            )
        )
        # biases start at zero
        self.b_full.append(
            vs.get_variable(
                "FullBiases{}".format(lr_idx),
                [n_out],
                dtype=tf.float32,
                initializer=tf.constant_initializer(0.0)
            )
        )
After
structure.insert(0, self.n_inputs)
you'll have [n_inputs, 1st FC layer size, 2nd FC layer size, ..., output layer size].
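For a full weight matrix rather than a single neuron's vector, a minimal sketch of the heuristic from the question, scaled by the fan-in (using numpy plus the old TF API from the answers above):
import numpy as np
import tensorflow as tf

n_in, n_out = 784, 625
# w = np.random.randn(n) / sqrt(n), with n taken as the fan-in
init_w = (np.random.randn(n_in, n_out) / np.sqrt(n_in)).astype(np.float32)
weights = tf.Variable(init_w)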

Rcpp Create DataFrame with Variable Number of Columns

I am interested in using Rcpp to create a data frame with a variable number of columns. By that, I mean that the number of columns will be known only at runtime. Some of the columns will be standard, but others will be repeated n times where n is the number of features I am considering in a particular run.
I am aware that I can create a data frame as follows:
IntegerVector i1(3); i1[0]=4;i1[1]=2134;i1[2]=3453;
IntegerVector i2(3); i2[0]=4123;i2[1]=343;i2[2]=99123;
DataFrame df = DataFrame::create(Named("V1")=i1,Named("V2")=i2);
but in this case it is assumed that the number of columns is 2.
To simplify the explanation of what I need, assume that I would like pass a SEXP variable specifying the number of columns to create in the variable part. Something like:
RcppExport SEXP myFunc(SEXP n, SEXP <other stuff>)
IntegerVector i1(3); <compute i1>
IntegerVector i2(3); <compute i2>
for(int i=0;i<n;i++){compute vi}
DataFrame df = DataFrame::create(Named("Num")=i1,Named("ID")=i2,...,other columns v1 to vn);
where n is passed as an argument. The final data frame in R would look like
Num ID V1 ... Vn
1 2 5 'aasda'
...
(In reality, the column names will not be of the form "Vx", but they will be known at runtime.) In other words, I cannot use a static list of
Named()=...
since the number will change.
I have tried skipping the "Named()" part of the constructor and then naming the columns at the end, but the results are junk.
Can this be done?
If I understand your question correctly, it seems like it would be easiest to take advantage of the DataFrame constructor that takes a List as an argument (since the size of a List can be specified directly), and set the names of your columns via .attr("names") and a CharacterVector:
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::DataFrame myFunc(int n, Rcpp::List lst,
Rcpp::CharacterVector Names = Rcpp::CharacterVector::create()) {
Rcpp::List tmp(n + 2);
tmp[0] = Rcpp::IntegerVector(3);
tmp[1] = Rcpp::IntegerVector(3);
Rcpp::CharacterVector lnames = Names.size() < lst.size() ?
lst.attr("names") : Names;
Rcpp::CharacterVector names(n + 2);
names[0] = "Num";
names[1] = "ID";
for (int i = 0; i < n; i++) {
// tmp[i + 2] = do_something(lst[i]);
tmp[i + 2] = lst[i];
if (std::string(lnames[i]).compare("") != 0) {
names[i + 2] = lnames[i];
} else {
names[i + 2] = "V" + std::to_string(i);
}
}
Rcpp::DataFrame result(tmp);
result.attr("names") = names;
return result;
}
There's a little extra going on there to allow the Names vector to be optional - e.g. if you just use a named list you can omit the third argument.
lst1 <- list(1L:3L, 1:3 + .25, letters[1:3])
##
> myFunc(length(lst1), lst1, c("V1", "V2", "V3"))
# Num ID V1 V2 V3
#1 0 0 1 1.25 a
#2 0 0 2 2.25 b
#3 0 0 3 3.25 c
lst2 <- list(
Column1 = 1L:3L,
Column2 = 1:3 + .25,
Column3 = letters[1:3],
Column4 = LETTERS[1:3])
##
> myFunc(length(lst2), lst2)
# Num ID Column1 Column2 Column3 Column4
#1 0 0 1 1.25 a A
#2 0 0 2 2.25 b B
#3 0 0 3 3.25 c C
Just be aware of the 20-length limit for this signature of the DataFrame constructor, as pointed out by @hrbrmstr.
It's an old question, but I think more people are struggling with this, like me. Starting from the other answers here, I arrived at a solution that isn't limited by the 20 column limit of the DataFrame constructor:
// [[Rcpp::plugins(cpp11)]]
#include <Rcpp.h>
#include <string>
#include <sstream>
using namespace Rcpp;
// [[Rcpp::export]]
List variableColumnList(int numColumns = 30) {
  List retval;
  for (int i = 0; i < numColumns; i++) {
    std::ostringstream colName;
    colName << "V" << i + 1;
    // push_back with a second argument sets the element's name
    retval.push_back(IntegerVector::create(100 * i, 100 * i + 1), colName.str());
  }
  return retval;
}
// [[Rcpp::export]]
DataFrame variableColumnListAsDF(int numColumns = 30) {
  Function asDF("as.data.frame");
  return asDF(variableColumnList(numColumns));
}
// [[Rcpp::export]]
DataFrame variableColumnListAsTibble(int numColumns = 30) {
  Function asTibble("tbl_df");
  return asTibble(variableColumnList(numColumns));
}
So build a C++ List first by pushing columns onto an empty List. (I generate the values and the column names on the fly here.) Then, either return that as an R list, or use one of two helper functions to convert them into a data.frame or tbl_df. One could do the latter from R, but I find this cleaner.
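For completeness, a hypothetical R session using the code above (the file name varcols.cpp is just an example):
Rcpp::sourceCpp("varcols.cpp")
df <- variableColumnListAsDF(30)  # a 2-row data.frame with columns V1..V30
dim(df)                           # 2 30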

Deleting duplicate x values and their corresponding y values

I am working with a list of points in Python 2.7 and running some interpolations on the data. My list has over 5000 points, and some "x" values repeat with different corresponding "y" values. I want to get rid of these repeating points so that my interpolation function will work, because repeating "x" values with different "y" values raise an error: the data no longer satisfies the criteria of a function. Here is a simple example of what I am trying to do:
Input:
x = [1,1,3,4,5]
y = [10,20,30,40,50]
Output:
xy = [(1,10),(3,30),(4,40),(5,50)]
The interpolation function I am using is InterpolatedUnivariateSpline(x, y)
Have a variable where you store the previous x value; if it is the same as the current value, skip the current point.
For example:
from scipy.interpolate import InterpolatedUnivariateSpline

previous_x = None
x_clean, y_clean = [], []
for x_val, y_val in zip(x, y):
    if x_val == previous_x:
        continue  # skip the duplicate x and its y
    x_clean.append(x_val)
    y_clean.append(y_val)
    previous_x = x_val  # the x value that will be "previous" in the next iteration
spline = InterpolatedUnivariateSpline(x_clean, y_clean)
Note that this assumes the x values are sorted, so that duplicates are adjacent.
A bit late, but if anyone is interested, here's a solution with numpy and pandas:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

x = [1,1,3,4,5]
y = [10,20,30,40,50]
# convert lists into numpy arrays:
array_x, array_y = np.array(x), np.array(y)
# sort x and y by x value:
order = np.argsort(array_x)
xsort, ysort = array_x[order], array_y[order]
# create a dataframe and add 2 columns for your x and y data:
df = pd.DataFrame()
df['xsort'] = xsort
df['ysort'] = ysort
# new dataframe (mean) with no duplicate x values; duplicates are replaced by their mean y:
mean = df.groupby('xsort').mean()
df_x = mean.index
df_y = mean['ysort']
# poly1d creates a polynomial from the coefficients returned by polyfit:
trend = np.polyfit(df_x, df_y, 14)
trendpoly = np.poly1d(trend)
# plot the polyfit line ('colour' and the figure name are placeholders):
plt.plot(df_x, trendpoly(df_x), linestyle=':', dashes=(6, 5), linewidth=0.8,
         color=colour, zorder=9, figure=[name of figure])
Also, if you just use argsort() to put the values in order of x, the interpolation should work even without having to delete the duplicate x values. Trying it on my own dataset, I compared three plots:
polyfit on its own
sorting the data in order of x first, then polyfit
sorting the data, deleting duplicates, then polyfit
...and the last two give the same result.
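As a compact alternative that matches the question's expected output (keeping the first y for each repeated x), here is a sketch using an ordered dict, since the OP is on Python 2.7:
from collections import OrderedDict

x = [1, 1, 3, 4, 5]
y = [10, 20, 30, 40, 50]
pairs = OrderedDict()
for x_val, y_val in zip(x, y):
    pairs.setdefault(x_val, y_val)  # only the first y for each x is kept
xy = list(pairs.items())
print(xy)  # [(1, 10), (3, 30), (4, 40), (5, 50)]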