What is the syntax to be used for feed_dict in TensorFlow C++?

I want to build and train a graph in TensorFlow C++ that consists of two layers, and to feed it with a given matrix as an input.
I have two different examples for the syntax:
The official C++ example (line # 129)
An old answer in StackOverflow
It seems they contradict each other with respect to the exact syntax of the "input" parameter to tensorflow::Session::Run().
Should it be "placeholder_name:0" or "placeholder_name"?

Either one works. The name gets passed through ParseTensorName, where names without a colon are assumed to have an output index of 0. To verify this, we can add a ":0" to the end of the feed name in DirectSessionMinusAXTest::TestFeed:
std::vector<std::pair<string, Tensor>> inputs = {{x_, t}};
becomes
std::vector<std::pair<string, Tensor>> inputs = {{x_ + ":0", t}};
and it still passes.
The only case where passing an output index is required (more accurately the only case where it should be required; there may be some code lacking canonicalization) is if you're feeding a Tensor which is not the zeroth output of an operation (e.g. "unique:1"). This is quite rare, since constant and placeholder ops are the most likely feed targets and only have a single output.

Related

Extracting number of bits in a macroblock from VVC VTM reference software

I am new to VVC and I am going through the reference software's code trying to understand it. I have encoded and decoded videos using the reference software. I want to extract the bitstream from it; specifically, I want to know the number of bits in each macroblock. I am not sure which class I should be working with; for now I am looking at mv.cpp, QuantRDOQ.cpp, and TrQuant.cpp.
I am afraid to mess the code up completely; I don't know where to add what lines of code.
[Images: "Result after calculating and displaying the difference" (start and final states)]
P.S. The linked pictures are from after my problem was solved; I attached them because of my query in the comments.
As the error says, getNumBins() is not supported by the CABAC estimator. So you should make sure you call it "only" during the encoding, and not during the RDO.
This should do the job:
if (isEncoding())
{
  before = m_BinEncoder.getNumBins();
}
coding_unit( cu, partitioner, cuCtx );
if (isEncoding())
{
  after = m_BinEncoder.getNumBins();
  diff  = after - before;
}
The simplest solution that I'm aware of is at the encoder side.
The trick is to compute the difference in the number of written bits "before" and "after" encoding a Coding Unit (CU) (aka macroblock). This happens in the CABACWriter.cpp file.
You should go to the coding_tree() function, where the coding_unit() function is called, which is responsible for context-coding all syntax elements in the current CU.
There, you may call the function getNumBins() twice: once before and once after coding_unit(). The difference of the two values should do the job for you.

UserWarning pymc3: What does reparameterize mean?

I built a pymc3 model using the DensityDist distribution. I have four parameters, of which 3 use Metropolis and one uses NUTS (this is automatically chosen by pymc3). However, I get two different UserWarnings:
1. Chain 0 contains number of diverging samples after tuning. If increasing target_accept does not help try to reparameterize.
May I know what reparameterize means here?
2. The acceptance probability in chain 0 does not match the target. It is , but should be close to 0.8. Try to increase the number of tuning steps.
Digging through a few examples, I used 'random_seed', 'discard_tuned_samples', 'step = pm.NUTS(target_accept=0.95)' and so on, and got rid of these user warnings. But I couldn't find details of how these parameter values are decided. I am sure this has been discussed in various contexts, but I am unable to find solid documentation for it. I was doing trial and error as below.
with patten_study:
    # SEED = 61290425  # 51290425
    step = pm.NUTS(target_accept=0.95)
    trace = pm.sample(step=step)  # e.g. pm.sample(4000, tune=10000, step=step, discard_tuned_samples=False, random_seed=SEED)
I need to run these on different datasets. Hence I am struggling to fix these parameter values for each dataset I am using. Is there any way where I give these values or find the outcome (if there are any user warnings and then try other values) and run it in a loop?
Pardon me if I am asking something stupid!
In this context, re-parametrization basically means finding a different but equivalent model that is easier to compute. There are many things you can do, depending on the details of your model:
Instead of using a Uniform distribution, use a Normal distribution with a large variance.
Change from a centered hierarchical model to a non-centered one (see the sketch below).
Replace a Gaussian with a Student-T.
Model a discrete variable as a continuous one.
Marginalize variables, like in this example.
Whether these changes make sense or not is something that you should decide based on your knowledge of the model and the problem.
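For example, the centered-to-non-centered change can be sketched as follows. This is a minimal illustration with made-up priors and shapes, not your DensityDist model:
import pymc3 as pm

# Centered form (often produces divergences when sigma gets small):
#   theta ~ Normal(mu, sigma)
# Non-centered form (equivalent, but usually much easier for NUTS):
#   theta = mu + sigma * theta_raw, with theta_raw ~ Normal(0, 1)
with pm.Model() as non_centered:
    mu = pm.Normal('mu', mu=0.0, sd=5.0)
    sigma = pm.HalfNormal('sigma', sd=1.0)
    theta_raw = pm.Normal('theta_raw', mu=0.0, sd=1.0, shape=8)
    theta = pm.Deterministic('theta', mu + sigma * theta_raw)
    step = pm.NUTS(target_accept=0.95)
    trace = pm.sample(step=step)
The sampler then explores the independent standard normals theta_raw instead of the strongly correlated theta, which is what removes most divergences.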

Abaqus Python 'getClosest' command

I'm using the getClosest command to find a vertex.
ForceVertex1 = hatInstance.vertices.getClosest(coordinates=((x, y, z),))
This is a dictionary object with key 0 and a tuple of two values (hatInstance.vertices[1] and the coordinates of the vertex). The specific output:
{0: (mdb.models['EXP-100'].rootAssembly.instances['hatInstance-100'].vertices[1], (62.5242172081597, 101.192447407436, 325.0))}
Whenever I try to create a set, the vertex isn't accepted
mainAssembly.Set(vertices=ForceVertex1[0][0],name='LoadSet1')
I also tried a different way:
tolerance = 1.0e-3
foundVertices = []
for vertex in hatInstance.vertices:
    x = vertex.pointOn[0][0]
    y = vertex.pointOn[0][1]
    z = vertex.pointOn[0][2]
    print x, y, z
    if abs(x - xTarget) < tolerance and abs(y - yTarget) < tolerance and abs(z - zTarget) < tolerance:
        foundVertices.append(hatInstance.vertices[vertex.index:vertex.index + 1])
xTarget etc. being my target coordinates. Despite this, I still don't get a vertex object.
For those struggling with this, I solved it.
Don't use the getClosest command, as it returns a dictionary object, despite the manual recommending it. I couldn't convert this dictionary object (specifically, a key and a value within it) to a standalone vertex object.
Instead use Instance.vertices.getByBoundingSphere(center=, radius=)
The center is basically a tuple of the coordinates and the radius is the tolerance. This returns an array of vertices, as sketched below.
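A minimal sketch of that approach, reusing hatInstance, mainAssembly, and the target coordinates from the question (the radius value here is an assumption; pick your own tolerance):
# Returns a VertexArray of all vertices within the sphere, which Set() accepts.
verts = hatInstance.vertices.getByBoundingSphere(center=(x, y, z), radius=1.0e-3)
mainAssembly.Set(vertices=verts, name='LoadSet1')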
If you want the geometrical object you just have to access the dictionary.
One way to do it is:
ForceVertex1 = hatInstance.vertices.getClosest(coordinates=((x, y, z),))[0][0]
This will return the vertex object only, which you can assign to a set or whatever.
Edit: Found a solution to actually address the original question:
part=mdb.models[modelName].parts[partName]
v=part.vertices.getClosest(coordinates=(((x,y,z)),))
Note the formatting requirement for coordinates ((( )),), three sets of parenthesis with a comma. This will find the vertex closest to the specified point. In order to use this to create a set, I found you need to massage the Abaqus Python interface to return the vertex in a format that uses their "getSequenceFromMask" method. In order to create a set, the edges, faces, and/or vertices need to be of type "Sequence", which is internal to Abaqus. To do this, I then use the following code:
v2=part.vertices.findAt((((v[0][1])),))
part.Set(name='setName', vertices=v2)
Note, v[0][1] will give you the point at which the vertex lies. Note again the format of the specified point using the findAt method, (((point)),), with three sets of parenthesis and a comma. This will return a vertex that uses the getSequenceFromMask method in Abaqus (you can check by typing v2 then Enter in the Python box at the bottom of CAE; works with Abaqus 2020). This is type "Sequence" (you can check by typing type(v2)) and this can be used to create a set.
If you do not format the point in findAt correctly (e.g., findAt(v[0][1]), without the parenthesis and comma), it will return an identical vertex to the one you get by accessing the dictionary returned by getClosest (e.g., v[0][0]). This is type 'Vertex' and cannot be used to create a set, even though the keyword asks for a vertex.
If you know the exact point where the vertex is, then you do not need the first step; you can simply use the findAt method with the correct formatting. However, the tolerance for findAt is very small (1e-6) and it will return an empty sequence if nothing is found within the tolerance. If you only have a ballpark idea of where the vertex is located, then you need to use the getClosest method first. This indeed gets the closest vertex to the specified point, which may or may not be the one you are interested in.
Original post:
None of these answers work for a similar problem I am having while trying to create a set of faces within some range near a point. If I use getClosest as follows
f=mdb.models['Model-1'].parts['Part-1'].faces.getClosest(coordinates=((0,0,0),), searchTolerance=1)
mdb.models['Model-1'].parts['Part-1'].Set(faces=f, name='faceSet')
I get an error "TypeError: Keyword error on faces".
If I access the dictionary via face=f[0], I get error "Feature Creation Failed". If I access the tuple within the dictionary via f[0][0], I get the error "TypeError: keyword error on faces" again.
The option to use .getByBoundingSphere doesn't work either, because the faces in my model are massive, and the faces have to be completely contained within the sphere for Abaqus to "get" them, basically requiring me to create a sphere that encompasses the entire model.
My solution was to create my own script as follows:
import numpy as np

model = mdb.models['Model-1']
part = model.parts['Part-1']
faceSave = []
faceSave2 = []
x = np.arange(-1, 1, 0.1)
y = np.arange(-1, 1, 0.1)
z = np.arange(-1, 1, 0.1)
for x1 in x:
    for y1 in y:
        for z1 in z:
            f = part.faces.findAt(((x1, y1, z1),))
            if len(f) > 0 and f[0] not in faceSave2:
                faceSave.append(f)
                faceSave2.append(f[0])
part.Set(faces=faceSave, name='faceSet')
This works, but it's extraordinarily slow, in part because findAt throws a warning to the console whenever it doesn't find a face, and it usually doesn't find a face with this approach. The code above basically looks within a small cube for any faces and puts them in the list faceSave. faceSave2 is there to ensure that duplicate faces aren't added to the list. Accessing the tuple (e.g., f[0] in the code above) gives the unique information about the face, whereas f is just a reference to the findAt result. Strangely, you can use f to create a Set, but you cannot use the actual face object f[0] to create a set.
The problem with this approach for general use is that the tolerance for findAt is super small, so you either have to be confident about where things are located in your model, or make the step size 1e-6 in np.arange() to ensure you don't miss a face that's in the cube. With a tiny step size, expect the code to take forever.
At any rate, I can use a tuple (or a list of tuples) obtained via "findAt" to create a Set in Abaqus. However, I cannot use the tuple obtained via "getClosest" to make a set, even though I see no difference between the two objects. It's unfortunate, because getClosest gives me the exact info I need effectively immediately without my jumbled mess of for-loops.
#anarchoNobody:
Thank you so much for your edited answer!
This workaround works great, also with faces. I spent a lot of hours trying to figure out why .getClosest does not provide a working result for creating a set, but with the workaround and the number of brackets it works.
If applied with several faces, the code has to be slightly modified:
faces = ((mdb.models['Model-1'].rootAssembly.instances['TT-1'].faces.getClosest(
              coordinates=((10.0, 10.0, 10.0),), searchTolerance=2)),
         (mdb.models['Model-1'].rootAssembly.instances['TT-1'].faces.getClosest(
              coordinates=((-10.0, 10.0, 10.0),), searchTolerance=2)),)
faces1 = (mdb.models['Model-1'].rootAssembly.instances['Tube-1'].faces.findAt(
              (((faces[0][0][1])),)),
          mdb.models['Model-1'].rootAssembly.instances['Tube-1'].faces.findAt(
              (((faces[1][0][1])),)),)
mdb.models['Model-1'].rootAssembly.Surface(name='TT-inner-surf', side1Faces=faces1)

Building Speech Dataset for LSTM binary classification

I'm trying to do binary LSTM classification using theano.
I have gone through the example code however I want to build my own.
I have a small set of "Hello" and "Goodbye" recordings that I am using. I preprocess these by extracting the MFCC features and saving them to text files. I have 20 speech files (10 of each word) and I am generating a text file for each recording, so 20 text files that contain the MFCC features. Each file is a 13x56 matrix.
My problem now is: How do I use this text file to train the LSTM?
I am relatively new to this. I have gone through some literature on it as well but have not gained a really good understanding of the concept.
Any simpler way using LSTM's would also be welcome.
There are many existing implementations, for example a TensorFlow implementation and a Kaldi-focused implementation with all the scripts; it is better to check them first.
Theano is too low-level; you might try Keras instead, as described in its tutorial. You can run the tutorial "as is" to understand how things go.
Then, you need to prepare a dataset. You need to turn your data into sequences of data frames and for every data frame in sequence you need to assign an output label.
Keras supports two types of RNN layers - layers returning sequences and layers returning single values. You can experiment with both; in code you just use return_sequences=True or return_sequences=False, as in the sketch below.
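For example (a minimal sketch; the layer size is arbitrary):
from keras.layers import LSTM

seq_layer = LSTM(16, return_sequences=True)   # one output vector per input frame
vec_layer = LSTM(16, return_sequences=False)  # one output vector per whole sequence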
To train with sequences, you can assign a dummy label to all frames except the last one, where you assign the label of the word you want to recognize. You need to place the inputs and output labels into arrays. So it will be:
X = [[word1frame1, word1frame2, ..., word1framen],[word2frame1, word2frame2,...word2framen]]
Y = [[0,0,...,1], [0,0,....,2]]
In X every element is a vector of 13 floats. In Y every element is just a number - 0 for intermediate frames and the word ID for the final frame.
To train with just labels, you again place the inputs and output labels into arrays, but the output array is simpler. So the data will be:
X = [[word1frame1, word1frame2, ..., word1framen],[word2frame1, word2frame2,...word2framen]]
Y = [[0,0,1], [0,1,0]]
Note that the output is one-hot encoded (np_utils.to_categorical) to turn labels into vectors instead of just numbers.
Then you create the network architecture. You can have 13 floats as input and a vector as output. In the middle you might have one fully connected layer followed by one LSTM layer. Do not use layers that are too big; start with small ones.
Then you feed this dataset into model.fit and it trains the model. You can estimate model quality on a held-out set after training.
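A minimal sketch of the whole pipeline, assuming the 13x56 MFCC matrices from the question (note they must be transposed to frames-by-features); the layer sizes, epoch count, and random placeholder data are assumptions for illustration:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM
from keras.utils import np_utils

n_frames, n_mfcc, n_classes = 56, 13, 2   # 56 frames of 13 MFCCs, 2 words

model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(n_frames, n_mfcc)))
model.add(LSTM(16, return_sequences=False))   # one label per recording
model.add(Dense(n_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# Placeholder data; in practice, load the 20 MFCC text files here instead
# and transpose each 13x56 matrix to shape (56, 13).
X = np.random.randn(20, n_frames, n_mfcc)
y = np_utils.to_categorical(np.repeat([0, 1], 10), n_classes)
model.fit(X, y, epochs=10, batch_size=4)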
You will have a problem with convergence since you have just 20 examples. You need way more examples, preferably thousands, to train an LSTM; with so few you will only be able to use very small models.

RapidMiner: Can I use a wildcard as an attribute value for training a decision tree model?

I am working on a fairly simple process in RapidMiner 5.3.013, which reads a CSV file and uses it as a training set to train the decision tree classifier. The result of the process is the model. A second CSV is read and used as the unlabeled set. The model (calculated earlier) is applied to the unlabeled test set, in an effort to label it properly.
Each line of the CSVs contains a few attributes, for example:
15, 0, 1555, abc*15, label1
but some lines of the training set may be like this:
15, 0, *, abc*15, label2
This is done because the third value may take various values, so the creator of the training set used a star as a wildcard in the place of the value.
What I would like to do is let the decision tree know that the star there means "match anything", so that it does not literally only match a star.
Notes:
the star in the 4th field (abc*15) should be matched literally and not as a wildcard.
if the 3rd field always contained stars, I could just not include it in the attributes, but that's not the case. Sometimes the 3rd field contains integer values, which should be matched literally.
I tried leaving the field blank, but it doesn't work
So, is there a way to use regular expressions, or at least a simple wildcard while training the classifier or using the model?
A different way to put it is: Can I instruct the classifier to not use some of the attributes in some of the entries (lines in the CSV)?
Thanks!
I would process the data so the missing value is valid in its own right and I would discretize the valid numbers to be in ranges.
In more detail, what I meant by missing is the situation where the value of an attribute is something like *. I would simply allow this to be one valid value that the attribute takes. For all the other values of this attribute, these are numerical so they need to be converted to a nominal value to be compatible with the now valid *.
It's fairly fiddly and I haven't tried this, but I would start with the operator Declare Missing Value to detect the *s and mark them as missing. From there, I would use the operator Discretize by Binning to convert the numbers into nominal values. Finally, I would use Replace Missing Values to change the missing values to a nominal value like Missing. You might ask why bother with the first Declare Missing Value step at all? The reason is that it allows the discretizing operation to work: it will be operating on numbers alone, given that the non-numbers are marked as missing. A rough sketch of the same idea is shown below.
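To make the three steps concrete, here is the same transformation sketched in Python with pandas. This is an illustration of the idea only, not RapidMiner's API; the column name, example values, and bin count are assumptions:
import pandas as pd

# Example column mixing integers and the '*' wildcard (hypothetical data).
df = pd.DataFrame({'attr3': ['15', '0', '*', '1555', '7']})

# Step 1 (Declare Missing Value): treat '*' as missing.
vals = pd.to_numeric(df['attr3'], errors='coerce')  # '*' becomes NaN

# Step 2 (Discretize by Binning): convert numbers to nominal ranges.
binned = pd.cut(vals, bins=3).astype(str)

# Step 3 (Replace Missing Values): give missing entries their own nominal value.
df['attr3_nominal'] = binned.where(vals.notna(), 'Missing')
print(df)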
The resulting example set can then be passed to a model in the normal way. Obviously, the model has to be able to cope with nominal attributes (decision trees can).
It occurred to me that some modelling operators are more tolerant of missing data. I think k-nearest-neighbours may be one. In this case, you could simply mark the missing ones as above and not bother with the discretizing step.
The whole area of missing data does need care because it's important to understand the source of missingness. If missing data is correlated with other attributes or with the label itself, handling it inappropriately can skew results.