Fenics subdomain indicator in mesh file - python-2.7

In the FEniCS documentation, it is mentioned that
DirichletBC takes three arguments: the first is our function space V, the second is the boundary condition value, and the third is the subdomain indicator, which is information stored in the mesh.
Where is the subdomain indicator in the mesh file? How do I change its value?
Context: I am solving on a domain that has multiple boundary parts, with a constant Dirichlet condition on each part.
The mesh file I'm using was generated with Triangle and converted to an XML file with dolfin-convert.
It is my understanding that meshing tools such as GMSH natively provide the option to mark boundaries, but I would rather not resort to another mesher, since I am used to Triangle.

I didn't figure out how to modify the mesh to add boundary markers, but I did find a workaround to part of the problem on page 9 of this document.
The idea is to define a boundary condition for each boundary:
    u_1 = Constant(0.0)
    def b_1(x, on_boundary):
        return on_boundary and \
            near(x[0]*x[0]+x[1]*x[1], 1, 1e-2)
and then define a list of boundary conditions that can be passed to the solve function:
    bcs = [DirichletBC(V, u_1, b_1), ...]
However, this only works if each boundary can be described by an equation, so it is not a general solution to the problem.
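One possible direction for boundaries that have no equation, as a rough sketch only: Triangle can write one boundary marker per vertex in its .node output, so the marked coordinates could be collected and used inside a boundary function instead of a formula. The helper name read_node_markers and the sample file contents below are made up for illustration, and whether dolfin-convert carries these markers into the XML file is not something this sketch checks.

```python
from collections import defaultdict


def read_node_markers(node_text):
    """Group vertex coordinates by Triangle boundary marker.

    Assumes the Triangle .node layout: a header line
    '<#vertices> <dim> <#attributes> <#boundary markers>' followed by
    '<vertex #> <x> <y> [attributes] [boundary marker]' lines.
    """
    # Drop comments (everything after '#') and blank lines.
    rows = [line.split("#", 1)[0].split() for line in node_text.splitlines()]
    rows = [row for row in rows if row]
    n_vertices, _dim, n_attrs, n_markers = (int(v) for v in rows[0])
    if n_markers != 1:
        raise ValueError("this .node data carries no boundary markers")
    by_marker = defaultdict(list)
    for row in rows[1:1 + n_vertices]:
        x, y = float(row[1]), float(row[2])
        marker = int(row[3 + n_attrs])  # marker follows the attributes
        by_marker[marker].append((x, y))
    return dict(by_marker)


# Hypothetical 4-vertex file: two vertices on boundary 1, one on
# boundary 2, and one interior vertex (marker 0).
SAMPLE = """\
4 2 0 1
1 0.0 0.0 1
2 1.0 0.0 1
3 1.0 1.0 2
4 0.5 0.5 0
"""

markers = read_node_markers(SAMPLE)
print(markers)
```

A boundary function for part 1 could then test whether x is close to one of the points in markers[1], which removes the need for an analytic description of each boundary part.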

Matlab: What's the most efficient approach to parse a large table or cell array with regexp when sometimes there is no match?

I am working with a messy manually maintained "database" that has a column containing a string with name,value pairs. I am trying to parse the entire column with regexp to pull out the values. The column is huge (>100,000 entries). As a proxy for my actual data, let's use this code:
line1={'''thing1'': ''-583'', ''thing2'': ''245'', ''thing3'': ''246'', ''morestuff'':, '''''};
line2={'''thing1'': ''617'', ''thing2'': ''239'', ''morestuff'':, '''''};
line3={'''thing1'': ''unexpected_string(with)parens5'', ''thing2'': 245, ''thing3'':''246'', ''morestuff'':, '''''};
mycell=vertcat(line1,line2,line3);
This captures the general issues encountered in the database. I want to extract what thing1, thing2, and thing3 are in each line using cellfun to output a scalar cell array. They should normally be 3 digit numbers, but sometimes they have an unexpected form. Sometimes thing3 is completely missing, without the name even showing up in the line. Sometimes there are minor formatting inconsistencies, like single quotes missing around the value, spaces missing, or dashes showing up in front of the three digit value. I have managed to handle all of these, except for the case where thing3 is completely missing.
My general approach has been to use expressions like this:
expr1='(?<=thing1''):\s?''?-?([\w\d().]*?)''?,';
expr2='(?<=thing2''):\s?''?-?([\w\d().]*?)''?,';
expr3='(?<=thing3''):\s?''?-?([\w\d().]*?)''?,';
This looks behind for thingX' and then tries to match : followed by zero or one spaces, followed by 0 or 1 single quote, followed by zero or one dash, followed by any combination of letters, numbers, parentheses, or periods (this is defined as the token), using a lazy match, until zero or one single quote is encountered, followed by a comma. I call regexp as regexp(___,'tokens','once') to return the matching token.
The problem is that when there is no match, regexp returns an empty array. This prevents me from using, say,
out=cellfun(@(x) regexp(x,expr3,'tokens','once'),mycell);
unless I call it with 'UniformOutput',false. The problem with that is twofold. First, I need to then manually find the rows where there was no match. For example, I can do this:
emptyout=cellfun(@(x) isempty(x),out);
emptyID=find(emptyout);
backfill=cell(length(emptyID),1);
[backfill{:}]=deal('Unknown');
out(emptyID)=backfill;
In this example, emptyID has a length of 1, so this code is overkill, but I believe this is the correct way to generalize for when it is longer. This code replaces every empty cell in out with the string Unknown. But this leads to the second problem: I've now got a 'messy' cell array of non-scalar values. I cannot, for example, check unique(out) as a result.
Pardon the long-windedness but I wanted to give a clear example of the problem. Now my actual question is in a few parts:
Is there a way to accomplish what I'm trying to do without using 'UniformOutput',false? For example, is there a way to have regexp pass a custom string if there is no match (e.g. pass 'Unknown' if there is no match)? I can think of one 'cheat', which would be to use the | operator in the expression, and if the first token is not matched, look for something that is ALWAYS found. I would then still need to double back through the output and change every instance of that result to 'Unknown'.
If I take the 'UniformOutput',false approach, how can I recover a scalar cell array at the end to easily manipulate it (e.g. pass it through unique)? I will admit I'm not 100% clear on scalar vs nonscalar cell arrays.
If there is some overall different approach that I'm not thinking of, I'm also open to it.
Tangential to the main question, I also tried using a single expression to run regexp using 3 tokens to pull out the values of thing1, thing2, and thing3 in one pass. This seems to require 'UniformOutput',false even when there are no empty results from regexp. I'm not sure how to get a scalar cell array using this approach (e.g. an Nx1 cell array where each cell is a 3x1 cell).
At the end of the day, I want to build a table using these results:
mytable=table(out1,out2,out3);
Edit: Using celldisp sheds some light on the problem:
celldisp(out)
out{1}{1} =
246
out{2} =
Unknown
out{3}{1} =
246
I assume that I need to change the structure of out so that the contents of out{1}{1} and out{3}{1} are instead just out{1} and out{3}. But I'm not sure how to accomplish this if I need 'UniformOutput',false.
Note: I've not used MATLAB and this doesn't answer the "efficient" aspect, but...
How about forcing there to always be a match?
Just thinking about you really wanting a match to skip this problem, how about an empty match?
Looking at the MATLAB help page here, I can see an 'emptymatch' option; perhaps this is something to try.
E.g.
the_thing_i_want_to_find|
Match "the_thing_i_want_to_find" or an empty match, note the | character.
In capture group it might look like this:
(the_thing_i_want_to_find|)
As a workaround, I have found that regexprep can be used to find entries where thing3 is missing and fill them in. For example:
replace='$1 ''thing3'': ''Unknown'', ''morestuff''';
missingexpr='(?<=thing2'':\s?)(''?-?[\w\d().]*?''?,) ''morestuff''';
regexprep(mycell{2},missingexpr,replace)
ans =
''thing1': '617', 'thing2': '239', 'thing3': 'Unknown', 'morestuff':, '''
Applying it to the entire array:
fixedcell=cellfun(@(x) regexprep(x,missingexpr,replace),mycell,'UniformOutput',false);
out=cellfun(@(x) regexp(x,expr3,'tokens','once'),fixedcell,'UniformOutput',false);
This feels a little roundabout, but it works.
cellfun can be replaced with a plain old for loop. Your code will be equally fast, or maybe even faster. cellfun is implemented with a loop anyway, so there is no advantage to using it other than fewer lines of code. In your explicit loop, you can then check the output of regexp and build your output array any way you like.
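The explicit-loop idea can be sketched as follows. The sketch is in Python rather than MATLAB, purely to show the control flow: run the expression on each row, check whether it matched, and append either the token or a fallback string, so the result is a flat list of strings. The function name extract and its default argument are illustrative, and the expression is a direct transcription of expr1 to expr3 from the question.

```python
import re

# The three sample rows from the question, written as plain strings.
lines = [
    "'thing1': '-583', 'thing2': '245', 'thing3': '246', 'morestuff':, ''",
    "'thing1': '617', 'thing2': '239', 'morestuff':, ''",
    "'thing1': 'unexpected_string(with)parens5', 'thing2': 245, "
    "'thing3':'246', 'morestuff':, ''",
]


def extract(rows, name, default="Unknown"):
    """Return one string per row: the captured value, or `default`."""
    pattern = re.compile(r"(?<=%s'):\s?'?-?([\w().]*?)'?," % re.escape(name))
    out = []
    for row in rows:
        match = pattern.search(row)
        # No match means the name is missing from this row entirely.
        out.append(match.group(1) if match else default)
    return out


thing1 = extract(lines, "thing1")
thing2 = extract(lines, "thing2")
thing3 = extract(lines, "thing3")
print(thing3)
```

Because every entry is a plain string, the three lists can be de-duplicated or placed side by side as table columns without an extra unwrapping step.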

SVG.js How to update pattern scale and position?

Hi, I'm trying to fill a polygon with a pattern here: https://codesandbox.io/s/pmqv6vxvw0 (see drawPath() and anymPath() in HelloWorld.vue).
It looks like the only option is to update the element, not the pattern's parameters.
Is there a way to change the scale as well?
And the pattern alignment, is it possible to move the pattern into the middle of the container?

converting a sentence to an embedding representation

If I have a sentence, e.g. “get out of here”,
and I want to represent it using word2vec embeddings, I found three different ways to do that:
1- For each word, compute the average of its embedding vector, so each word is replaced by a single value.
2- As in 1, but using the standard deviation of the embedding vector's values.
3- Use the embedding vectors as they are. With 300-dimensional embedding vectors, the example above ends up as a final vector of length 1200 (300 * 4 words) representing the sentence.
Which of these is most suitable, specifically for sentence similarity applications?
The way you describe option (1) makes it sound like each word becomes a single number. That wouldn't work.
The simple approach that's often used is to average all word-vectors for words in the sentence together - so with 300-dimensional word-vectors, you still wind up with a 300-dimensional sentence-average vector. Perhaps that's what you mean by your option (1).
(Sometimes, all vectors are normalized to unit-length before this operation, but sometimes not - because the non-normalized vector lengths can sometimes indicate the strength of a word's meaning. Sometimes, word-vectors are weighted by some other frequency-based indicator of their relative importance, such as TF/IDF.)
I've never seen your option (2) used and don't quite understand what you mean or how it could possibly work.
Your option (3) would be better described as "concatenating the word-vectors". It gives different-sized vectors depending on the number of words in the sentence. Slight differences in word placement, such as comparing "get out of here" and "of here get out", would result in very different vectors, that usual methods of comparing vectors (like cosine-similarity) would not detect as being 'close' at all. So it doesn't make sense, and I've not seen it used.
So, only your option (1), as properly implemented to (weighted-)average word-vectors, is a good baseline for sentence-similarities.
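To make the contrast between averaging and concatenating concrete, here is a small sketch with made-up 3-dimensional word vectors (the numbers in VECS are purely illustrative, not taken from any trained model). It compares the two sentences from the example above under both representations.

```python
import math

# Hypothetical word vectors; real word2vec vectors would have
# e.g. 100 or 300 dimensions.
VECS = {
    "get": [1.0, 0.0, 2.0],
    "out": [0.0, 1.0, 1.0],
    "of": [0.5, 0.5, 0.0],
    "here": [2.0, 1.0, 0.0],
}


def average(words):
    """Element-wise mean of the word vectors: same size as one word vector."""
    vectors = [VECS[w] for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def concatenate(words):
    """Word vectors laid end to end: size grows with the sentence length."""
    return [x for w in words for x in VECS[w]]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


s1 = "get out of here".split()
s2 = "of here get out".split()

avg_sim = cosine(average(s1), average(s2))
cat_sim = cosine(concatenate(s1), concatenate(s2))
print("averaged:", avg_sim)
print("concatenated:", cat_sim)
```

The averaged vectors are identical for the two word orders, so their cosine similarity is 1 up to rounding, while the concatenated vectors come out far less similar even though the sentences contain exactly the same words.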
But, it's still fairly basic and there are many other ways to compare sentences using text-vectors. Here are just a few:
One algorithm closely related to word2vec itself is called 'Paragraph Vectors', and is often called Doc2Vec. It uses a very word2vec-like process to train vectors for full ranges of text (whether they're phrases, sentences, paragraphs, or documents) that work kind of like 'floating document-ID words' over the full text. It sometimes offers a benefit over just averaging word-vectors, and in some modes can produce both doc-vectors and word-vectors that are also comparable to each other.
If your interest isn't just pairwise sentence similarities, but some sort of downstream classification task, then Facebook's 'FastText' refinement of word2vec has a classification mode, where the word-vectors are trained not just to predict neighboring words, but to be good at predicting known text classes, when simply added/averaged together. (Text-vectors constructed from such classification vectors might be good at similarities too, depending on how well the training-classes capture salient contrasts between texts.)
Another way to compute pairwise similarities, using just word-vectors, is "Word Mover's Distance". Rather than averaging all the word-vectors for a text together into a single text-vector, it considers each word-vector as a sort of "pile of meaning". Compared to another sentence, it calculates the minimum routing work (distance along lots of potential word-to-word paths) to move all the "piles" from one sentence into the configuration of another sentence. It can be expensive to calculate, but usually represents sentence-contrasts better than the simple single-vector-summary that naive word-vector averaging achieves.
`
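import numpy as np
from gensim.models import Word2Vec

# sentences: a list of tokenized sentences (each one a list of words)
model = Word2Vec(sentences, vector_size=100, min_count=1)

def sent_vectorizer(sent, model):
    # Average the vectors of the words the model knows; skip the rest.
    vecs = [model.wv[w] for w in sent if w in model.wv]
    if not vecs:
        # No known words: return a zero vector instead of dividing by zero.
        return np.zeros(model.wv.vector_size)
    return np.mean(vecs, axis=0)

X = []
for sentence in sentences:
    X.append(sent_vectorizer(sentence, model))
print("========================")
print(X)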
`

Getting Beyond Compare to Match Similar Lines Properly

I am using Beyond Compare 4.1.6 to diff text configuration files. There is one configuration parameter per line, and each line is formatted as follows:
<name>:<type>=<value>
I would like to configure Beyond Compare such that it will align lines only when the <name>: portion of the line is exactly the same in both files. Put differently, everything from the beginning of the line up to and including the colon must match exactly for the two lines to be aligned. Note that a colon cannot occur in <name>, so the colon I want Beyond Compare to base its alignment decision on will always be the first colon in the line.
An example is:
# FILE 1
abcdefgh:string=5
# FILE 2
abcdefkh:string=5
Beyond Compare aligns these two lines even though I don't want it to.
I've been unable to coerce Beyond Compare to compare lines as desired by editing its grammar rules or by tweaking other features.
How may I get Beyond Compare to match lines as described above?
Thank you!
You can compare the files with a Table Compare session.
Set = as the field separator.
Once you have done this, you have two columns, and the first is the key column (if it is not, you can define it as such).
After this you get the result you want (if I understood your question correctly).
If you need this often, you can store the settings in a file format.

Gomoku data representation in C

I'm working on a Gomoku game. I'm done with the GUI etc., and now I need to code the AI and the rule checker (for optional rules such as captures, forbidden patterns, etc.).
I was planning on representing the board with an int array, something like:
uint goban[361];
This would represent a 19 * 19 goban (board). Let's say we split a 32-bit integer into 4 bytes and store metadata in each byte, for example:
1st byte: Is this cell empty/black/white?
2nd byte: Is this cell part of a special pattern?
3rd byte: In which position of the pattern am I?
4th byte: Am I capturable?
I don't know whether this kind of solution is suitable for a Gomoku AI, but my main problem is how to write it properly. Let's take the pattern:
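As a rough illustration of the byte layout listed above, here is a sketch in Python. It assumes the "1st byte" is the least significant one and that each field fits in 0..255; the names pack, unpack, index and the stone constants are invented for the example.

```python
SIZE = 19
EMPTY, BLACK, WHITE = 0, 1, 2


def pack(stone, pattern=0, position=0, capturable=0):
    """Pack four one-byte fields into a single 32-bit integer."""
    for field in (stone, pattern, position, capturable):
        if not 0 <= field <= 0xFF:
            raise ValueError("each field must fit in one byte")
    return stone | (pattern << 8) | (position << 16) | (capturable << 24)


def unpack(cell):
    """Return (stone, pattern, position, capturable) from a packed cell."""
    return (
        cell & 0xFF,
        (cell >> 8) & 0xFF,
        (cell >> 16) & 0xFF,
        (cell >> 24) & 0xFF,
    )


def index(row, col):
    """Map board coordinates to an offset in the flat 361-entry array."""
    return row * SIZE + col


goban = [pack(EMPTY)] * (SIZE * SIZE)

# Hypothetical values: a black stone, pattern id 3, second slot of that
# pattern (position 1), currently capturable.
goban[index(9, 9)] = pack(BLACK, pattern=3, position=1, capturable=1)
print(unpack(goban[index(9, 9)]))
```

The same shifts and masks carry over directly to an unsigned 32-bit integer in C.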
-OO-O-
It's an open and free three: it has space inside and at both ends. How am I supposed to link this pattern to a static representation without coordinates?
Another concern is when and how I should update patterns, because with 361 cells it can take a long time. If I update the previous figure to this:
XOO-O-
I have to update all four cells, so I don't think it's appropriate; plus, it can affect many other vertical/diagonal patterns.
Should I instead keep a list of the patterns currently on the map, like this:
std::list<ThreatList> tlist;
and make the map a simple tribool or char array?
I want my data representation to give me as much information as possible so that I can quickly update the influence map, which would be filled in by my evaluation function. I've read a few things about threat-space search and other Gomoku algorithms, but they don't discuss data representation, and I don't see how to do it correctly. Can you please help me find a clean way to represent patterns and to update them?
Thank you.
Take a look at this open source Gomoku:
https://github.com/garretraziel/gomoku
I think you will find a lot of interesting ideas in there.
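On the update-cost concern from the question, one option worth trying is to keep the board as a plain char array and, after each move, rescan only the four lines that cross the played cell instead of all 361 cells. The sketch below is not taken from the linked project; the function names, the reach of 5 cells and the pattern name "open three" are assumptions made for illustration.

```python
SIZE = 19
# Horizontal, vertical and the two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def line_through(board, row, col, drow, dcol, reach=5):
    """Cells within `reach` steps of (row, col) along one direction, as a string."""
    chars = []
    for step in range(-reach, reach + 1):
        r, c = row + step * drow, col + step * dcol
        if 0 <= r < SIZE and 0 <= c < SIZE:
            chars.append(board[r][c])
    return "".join(chars)


def patterns_after_move(board, row, col, patterns):
    """Return (name, direction) for each pattern found on a line crossing the cell."""
    found = []
    for drow, dcol in DIRECTIONS:
        segment = line_through(board, row, col, drow, dcol)
        for name, shape in patterns.items():
            if shape in segment:
                found.append((name, (drow, dcol)))
    return found


PATTERNS = {"open three": "-OO-O-"}

board = [["-"] * SIZE for _ in range(SIZE)]
for col in (4, 5, 7):
    board[9][col] = "O"
hits_before = patterns_after_move(board, 9, 7, PATTERNS)

# The opponent blocks one end, giving XOO-O- as in the question.
board[9][3] = "X"
hits_after = patterns_after_move(board, 9, 3, PATTERNS)
print(hits_before, hits_after)
```

Each move then costs four short string scans, and the coordinates of a pattern are implied by the played cell and the direction, so they do not need to be stored per cell.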