How does the minizinc pentominoes regular constraint example work? - regex

The minizinc benchmarks repository contains several pentomino examples.
Here is the data for the first example:
width = 5;
height = 4;
filled = 1;
ntiles = 5;
size = 864;
tiles = [|63,6,1,2,0,
|9,6,1,2,378,
|54,6,1,2,432,
|4,6,1,2,756,
|14,6,1,2,780,
|];
dfa = [7,5,5,5,5,3,0,2,2,2,2,2,7,5,5,5,5,3,19,4,4,4,4,3,30,4,4,4,4,3,0,10,10,10,10,10,46,8,8,8,8,0,0,12,12,12,12,13,0,15,15,15,15,14,0,16,16,16,16,16,0,18,18,18,18,17,0,20,20,20,20,20,0,21,21,21,21,21,0,22,22,22,22,22,0,23,23,23,23,23,0,28,28,28,28,0,47,22,22,22,22,22,47,23,23,23,23,23,46,11,11,11,11,24,0,26,26,26,26,26,0,25,25,25,25,25,0,27,27,27,27,25,0,29,29,29,29,26,0,31,31,31,31,31,32,0,0,0,0,0,33,0,0,0,0,0,34,0,0,0,0,0,35,0,0,0,0,0,36,0,0,0,0,0,46,9,9,9,9,6,47,16,16,16,16,16,0,35,35,35,35,0,60,35,35,35,35,0,0,37,37,37,37,39,0,39,39,39,39,39,60,37,37,37,37,39,0,40,40,40,40,40,0,41,41,41,41,41,0,42,42,42,42,42,0,43,43,43,43,43,0,45,45,45,45,45,0,47,47,47,47,47,60,47,47,47,47,47,48,0,0,0,0,0,49,44,44,44,44,0,53,38,38,38,38,38,60,0,0,0,0,0,0,50,50,50,50,50,0,51,51,51,51,0,0,52,52,52,52,52,0,54,54,54,54,54,0,55,55,55,55,55,0,56,56,56,56,56,0,57,57,57,57,57,0,60,60,60,60,0,0,58,58,58,58,58,0,59,59,59,59,59,61,55,55,55,55,0,62,0,0,0,0,0,63,0,0,0,0,0,0,62,62,62,62,0,0,63,63,63,63,0,0,2,2,2,2,2,3,4,3,3,3,3,2,0,2,2,2,2,3,4,3,3,3,3,5,9,5,5,5,5,6,0,6,6,6,6,7,0,7,7,7,7,8,0,8,8,8,8,0,9,0,0,0,0,2,0,2,2,2,2,4,4,14,4,4,5,2,2,0,2,2,2,3,3,10,3,3,5,3,3,12,3,3,5,4,4,14,4,4,5,8,8,0,8,8,0,9,9,0,9,9,13,11,11,0,11,11,11,11,11,22,11,11,11,7,7,15,7,7,11,13,13,0,13,13,13,6,6,15,6,6,0,0,0,22,0,0,0,6,6,25,6,6,0,17,17,29,17,17,16,19,19,0,19,19,19,20,20,0,20,20,20,21,21,0,21,21,21,22,22,0,22,22,0,23,23,0,23,23,24,24,24,0,24,24,24,26,26,0,26,26,0,26,26,27,26,26,0,0,0,27,0,0,0,18,18,29,18,18,0,0,0,30,0,0,0,28,28,0,28,28,0,30,30,0,30,30,0,32,32,0,32,32,32,33,33,0,33,33,33,34,34,0,34,34,0,35,35,0,35,35,35,36,36,0,36,36,36,0,0,37,0,0,0,31,31,40,31,31,0,0,0,45,0,0,0,39,39,0,39,39,39,41,41,0,41,41,41,42,42,0,42,42,42,43,43,0,43,43,0,44,44,0,44,44,44,45,45,0,45,45,0,38,38,46,38,38,0,0,0,50,0,0,0,0,0,51,0,0,0,47,47,0,47,47,47,49,49,0,49,49,49,51,51,0,51,51,0,48,48,52,48,48,0,0,0,53,0,0,0,0,0,54,0,0,0,53,53,0,53,53,0,54,54,0,54,54,0,2,2,0,2,2,2,3,3,3,4,3,3,2,2,2,0,2,2,3,3,3,4,3,3,2,2,2,0,2,2,3,3,3,3,8
,3,2,2,2,2,0,2,3,3,3,3,8,3,5,5,5,5,0,5,6,6,6,6,0,6,7,7,7,7,0,7,0,0,0,0,9,0,4,4,4,4,13,4,10,10,10,10,0,10,11,11,11,11,0,11,12,12,12,12,0,12,13,13,13,13,0,13,0,0,0,0,14,0,2,2,2,2,0,2,]
As far as I understand it, the goal is to fill a 5 x 4 board with 5 pentominoes. Some overlap and/or exclusion of tiles seems to be required, which is unusual.
Here is the minizinc solution:
include "globals.mzn";
int: Q = 1;
int: S = 2;
int: Fstart = 3;
int: Fend = 4;
int: Dstart = 5;
int: width;
int: height;
int: filled;
int: ntiles;
int: size;
array[1..ntiles,1..Dstart] of int: tiles;
array[1..size] of int: dfa;
array[1..width*height] of var filled..ntiles+1: board;
constraint forall (h in 1..height, w in 1..width-1) (
board[(h-1)*width+w] != ntiles+1);
constraint forall (h in 1..height) (
board[(h-1)*width+width] = ntiles+1);
constraint
forall (t in 1..ntiles)(
let {
int: q = tiles[t,Q],
int: s = tiles[t,S],
set of int: f = tiles[t,Fstart]..tiles[t,Fend],
array[1..q,1..s] of int: d =
array2d(1..q,1..s,
[ dfa[i] | i in tiles[t,Dstart]+1..tiles[t,Dstart]+q*s] )
}
in
regular(board,q,s,d,1,f)
);
solve :: int_search(board, input_order, indomain_min, complete) satisfy;
output [show(board)];
I've not been able to find much documentation on the minizinc benchmarks. They were part of the minizinc challenge for a few years but not anymore.
Chapter 3 of Mikael Lagerkvist's thesis is perhaps partially relevant. It describes placing pentominoes using the regular constraint in the gecode toolkit.
Section 3.2 illustrates a string representation for placing the L pentomino using a regular expression string of 0s and 1s: 1s where each square of the board overlaps a square of the L pentomino. Piece rotations are handled in section 3.3 using disjunctions of regular expressions. In general, there are 8 symmetries to consider for each pentomino (2 mirrorings and 4 rotations).
The minizinc data above does not use disjunctions of 8 binary strings to represent pentomino tiles but the minizinc code does use the regular constraint.
I realise gecode and minizinc work differently, and in this case the minizinc model has chosen an alternative to the difficult-to-read-and-write disjunctions of binary-string regular expressions. The 864-number-long dfa array is probably the core part of the minizinc solution I'm missing. The rest of the solution (removing board symmetries) I can probably figure out after that.
I don't see how to fill a 5 x 4 board with 5 pentominoes without overlaps and/or exclusions. What is the goal of this example?
How does the minizinc pentomino tile and dfa representation work?
How does pentomino rotation and mirroring work in this minizinc representation?
Here is the only board solution from the above code:
[1, 1, 1, 2, 6, 3, 3, 1, 2, 6, 3, 5, 5, 5, 6, 3, 3, 3, 4, 6]
Here is the solution reformatted into a 5 x 4 board:
[1, 1, 1, 2, 6,
3, 3, 1, 2, 6,
3, 5, 5, 5, 6,
3, 3, 3, 4, 6]
Note the 6s. What do they represent?
See this web page for the complete set of 50 5 x 4 pentomino tilings without overlaps, exclusions or holes.
There are alternative approaches for solving this sort of problem with minizinc.
The geost predicate is one possibility. None of these alternatives are relevant for this question.
Alternative software suggestions or discussion, beyond minizinc and gecode, are again not relevant.

Both the original MiniZinc model and the one in the repository in the comment are ones I wrote. While my licentiate thesis and the linked repository use regular expressions to express the constraints, the original MiniZinc challenge model was written when MiniZinc only had support for DFA inputs, as this is what the regular constraint inside solvers actually uses (§). The DFAs were in fact generated by taking the Gecode model and writing a small program (lost to time) that printed out the DFA for the regular expressions in the Gecode example file using the Gecode regular expression to DFA translation. A nice thing about the translation is that for a piece that has some symmetry, the DFA minimization will remove the symmetries. The instance generator linked was written this year, and uses the modern MiniZinc feature that accepts regular expressions. It was just easier to write that way.
So, in order to understand the long list of numbers, you have to view it as a DFA. The list represents a matrix whose indexes are the current state and the next input symbol, and whose values are the next state to go to. The other arguments to the regular constraint indicate the number of states, the number of symbols in the alphabet, and the starting and accepting states of the DFA.
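To make that encoding concrete, here is a minimal Python sketch that walks such a flat, row-major transition table the way regular does, with 1-based states and symbols and state 0 as a dead state. The two-state DFA below is a made-up toy, not one taken from the instance data:

```python
def dfa_accepts(dfa, n_symbols, start, accepting, word):
    """Run a DFA stored as a flat, row-major transition table.

    The entry for (state q, symbol s) is dfa[(q - 1) * n_symbols + (s - 1)];
    states and symbols are 1-based, and state 0 is a dead (reject) state.
    """
    q = start
    for s in word:
        if q == 0:          # dead state: reject immediately
            return False
        q = dfa[(q - 1) * n_symbols + (s - 1)]
    return q in accepting

# Toy example: accept words over {1, 2} containing an even number of 2s.
# State 1 = "even so far" (accepting), state 2 = "odd so far".
flat = [1, 2,   # from state 1: on symbol 1 -> 1, on symbol 2 -> 2
        2, 1]   # from state 2: on symbol 1 -> 2, on symbol 2 -> 1

print(dfa_accepts(flat, 2, 1, {1}, [1, 2, 2, 1]))  # True
print(dfa_accepts(flat, 2, 1, {1}, [2]))           # False
```

The dfa array in the instance is read the same way, just with 6 symbols (5 tiles plus the marker) and the per-tile state counts, start state 1, and accepting sets given by the tiles matrix.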
As for the 6s at the end of each row, these are end-of-line markers. They are there to make sure that a piece is not split apart. Consider the simple piece XXX on a 4 by 4 board with no other pieces (so X is 1, and empty is 0). With the expression 0*1110*, all placements of the piece are modeled, but so are placements like
_ _ _ _
_ _ X X
X X _ _
_ _ _ _
In order to avoid that, an additional end-of-line column is added to the board. In this particular case, the end-of-line marker is its own unique value, which means that the model could even handle disjoint pieces. If all pieces are connected and the board is not full, then the end-of-line markers could be the same as the empty square.
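The effect is easy to reproduce with an ordinary regular expression in Python. The symbols here (0 = empty, 1 = piece square, 2 = end-of-line marker) are a made-up encoding for illustration, not the instance's actual values:

```python
import re

# Piece XXX on a 3-wide board, flattened row by row, with a hypothetical
# end-of-line marker '2' appended to each row. The pattern is 0*1110*
# extended to also allow the marker on either side of the piece.
piece = re.compile(r'[02]*111[02]*')

with_marker_ok  = "0002" "1112" "0002"   # piece intact in row 2
with_marker_bad = "0112" "1002" "0002"   # piece split across rows 1 and 2

print(bool(piece.fullmatch(with_marker_ok)))   # True
print(bool(piece.fullmatch(with_marker_bad)))  # False: the marker breaks '111'
```

Without the marker column, the split placement would flatten to a string that still contains three consecutive 1s and would wrongly be accepted.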
I have a couple of other papers that use a similar construction of placing parts if you find this interesting, as well as the original publication.
Footnote: (§) Technically, most direct implementations of Pesant's algorithm can handle NFAs as well (disregarding epsilon transitions), which could be used to optimize the representation. However, DFA minimization is a known and fast method, while NFA minimization is much much harder. The fewer the states in the FA, the faster the propagation will be.

Related

tANS Mininum Size of State Set to Safely Encode a Symbol Frame

Hi I'm trying to implement tANS in a compute shader, but I am confused about the size of the state set. Also apologies but my account is too new to embed pictures of latex formatted equations.
Imagine we have a symbol frame S comprised of symbols s₁ to sₙ:
S = {s₁, s₂, s₁, s₂, ..., sₙ}
|S| = 2ᵏ
and the probability of each symbol is
pₛₙ = frequency(sₙ) / |S|
∑ pₛ₁ + pₛ₂ + ... pₛₙ = 1
According to Jarek Duda's slides (which can be found here) the first step in constructing the encoding function is to calculate the number of states L:
L = |S|
so that we can create a set of states
𝕃 = {L, ..., 2L - 1}
from which we can construct the encoding table. In our example, this is simply L = |S| = 2ᵏ. However, we don't want L to necessarily equal |S| because |S| could be enormous, and constructing an encoding table of size |S| would be counterproductive to compression. Jarek's solution is to create a quantization function so that we can choose an
L : L < |S|
which approximates the symbol probabilities
Lₛ / L ≈ pₛₙ
However as L decreases, the quality of the compression decreases, so I have two questions:
How small can we make L while still achieving compression?
What is a "good" way of determining the size of L for a given |S|?
In Jarek's ANS toolkit he uses the depth of a Huffman tree created from S to get the size of L, but this seems like a lot of work when we already know the upper bound of L (|S|; as I understand it when L = |S| we are at the Shannon entropy; thus making L > |S| would not increase compression). Instead it seems like it would be faster to choose an L that is both less than |S| and above some minimum L. A "good" size of L therefore would achieve some amount of compression, but more importantly would be easy to calculate. However we would need to determine the minimum L. Based on the pictures of sample ANS tables it seems like the minimum size of L could be the frequency of the most probable symbol, but I don't know enough about ANS to confirm this.
After mulling it over for a while, both questions have very simple answers. The smallest L that still achieves lossless compression is L = |A|, where A is the alphabet of symbols to be encoded (I apologize, the lossless criterion should have been included in the original question). If L < |A| then we are pigeonholing symbols, thus losing information. When L = |A| what we essentially have is a fixed-length code, where each symbol has an equal probability weighting in our encoding table.
The answer to the second question is even simpler now that we know the answer to the first. L can be pretty much whatever you want so long as it's greater than the size of the alphabet to be encoded. Usually we want L to be a power of two for computational efficiency, and we want L greater than |A| to achieve better compression, so a very common choice is twice the smallest power of two that is greater than or equal to the alphabet size. This can easily be found with something like this:
int alphabetSize = SizeOfAlphabet();
int L = (int) pow(2, ceil(log2(alphabetSize)) + 1);
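The same heuristic in Python (choose_L is a hypothetical helper name, and this is one common choice rather than the only valid one):

```python
import math

def choose_L(alphabet_size):
    # Twice the smallest power of two >= alphabet_size: round the log up
    # to the next integer, then add one more doubling for headroom.
    return 1 << (math.ceil(math.log2(alphabet_size)) + 1)

print(choose_L(3))  # 8
print(choose_L(5))  # 16
```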

Word2Vec is it for word only in a sentence or for features as well?

I would like to ask more about Word2Vec:
I am currently trying to build a program that checks the embedding vectors for a sentence. At the same time, I am also building a feature extraction using scikit-learn to extract lemma 0, lemma 1, lemma 2 from the sentence.
From my understanding;
1) Feature extractions : Lemma 0, lemma 1, lemma 2
2) Word embedding: vectors are embedded to each character (this can be achieved by using gensim word2vec(I have tried it))
More explanation:
Sentence = "I have a pen".
Word = token of the sentence, for example, "have"
1) Feature extraction
"I have a pen" --> lemma 0:I, lemma_1: have, lemma_2:a.......lemma 0:have, lemma_1: a, lemma_2:pen and so on.. Then when try to extract the feature by using one_hot then will produce:
[[0,0,1],
[1,0,0],
[0,1,0]]
2) Word embedding(Word2vec)
"I have a pen" ---> "I", "have", "a", "pen"(tokenized) then word2vec from gensim will produced matrices for example if using window_size = 2 produced:
[[0.31235,0.31345],
[0.31235,0.31345],
[0.31235,0.31345],
[0.31235,0.31345],
[0.31235,0.31345]
]
The floating-point and integer numbers are for explanation purposes and the original data will vary depending on the sentence. These are just dummy data to explain.
Questions:
1) Is my understanding about Word2Vec correct? If yes, what is the difference between feature extraction and word2vec?
2) I am curious whether can I use word2vec to get the feature extraction embedding too since from my understanding, word2vec is only to find embedding for each word and not for the features.
Hopefully someone could help me in this.
It's not completely clear what you're asking, as you seem to have many concepts mixed-up together. (Word2Vec gives vectors per word, not character; word-embeddings are a kind of feature-extraction on words, rather than an alternative to 'feature extraction'; etc. So: I doubt your understanding is yet correct.)
"Feature extraction" is a very general term, meaning any and all ways of taking your original data (such as a sentence) and creating a numerical representation that's good for other kinds of calculation or downstream machine-learning.
One simple way to turn a corpus of sentences into numerical data is to use a "one-hot" encoding of which words appear in each sentence. For example, if you have the two sentences...
['A', 'pen', 'will', 'need', 'ink']
['I', 'have', 'a', 'pen']
...then you have 7 unique case-flattened words...
['a', 'pen', 'will', 'need', 'ink', 'i', 'have']
...and you could "one-hot" the two sentences as a 1-or-0 for each word they contain, and thus get the 7-dimensional vectors:
[1, 1, 1, 1, 1, 0, 0] # A pen will need ink
[1, 1, 0, 0, 0, 1, 1] # I have a pen
Even with this simple encoding, you can now compare sentences mathematically: a euclidean-distance or cosine-distance calculation between those two vectors will give you a summary distance number, and sentences with no shared words will have a high 'distance', and those with many shared words will have a small 'distance'.
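A minimal sketch of that encoding and distance computation, using the 7-word vocabulary above:

```python
import math

vocab = ['a', 'pen', 'will', 'need', 'ink', 'i', 'have']

def one_hot(words):
    # 1 if the vocabulary word occurs in the sentence, else 0.
    present = set(words)
    return [1 if w in present else 0 for w in vocab]

v1 = one_hot(['a', 'pen', 'will', 'need', 'ink'])  # A pen will need ink
v2 = one_hot(['i', 'have', 'a', 'pen'])            # I have a pen

# Euclidean distance between the two sentence vectors.
dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))
print(v1)    # [1, 1, 1, 1, 1, 0, 0]
print(v2)    # [1, 1, 0, 0, 0, 1, 1]
print(dist)  # sqrt(5), about 2.236
```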
Other very-similar possible alternative feature-encodings of these sentences might involve counts of each word (if a word appeared more than once, a number higher than 1 could appear), or weighted-counts (where words get an extra significance factor by some measure, such as the common "TF/IDF" calculation, and thus values scaled to be anywhere from 0.0 to values higher than 1.0).
Note that you can't encode a single sentence as a vector that's just as wide as its own words, such as "I have a pen" into a 4-dimensional [1, 1, 1, 1] vector. That then isn't comparable to any other sentence. They all need to be converted to the same-dimensional-size vector, and in "one hot" (or other simple "bag of words") encodings, that vector is of dimensionality equal to the total vocabulary known among all sentences.
Word2Vec is a way to turn individual words into "dense" embeddings with fewer dimensions but many non-zero floating-point values in those dimensions. This is instead of sparse embeddings, which have many dimensions that are mostly zero. The 7-dimensional sparse embedding of 'pen' alone from above would be:
[0, 1, 0, 0, 0, 0, 0] # 'pen'
If you trained a 2-dimensional Word2Vec model, it might instead have a dense embedding like:
[0.236, -0.711] # 'pen'
All the 7 words would have their own 2-dimensional dense embeddings. For example (all values made up):
[-0.101, 0.271] # 'a'
[0.236, -0.711] # 'pen'
[0.302, 0.293] # 'will'
[0.672, -0.026] # 'need'
[-0.198, -0.203] # 'ink'
[0.734, -0.345] # 'i'
[0.288, -0.549] # 'have'
If you have Word2Vec vectors, then one alternative simple way to make a vector for a longer text, like a sentence, is to average together all the word-vectors for the words in the sentence. So, instead of a 7-dimensional sparse vector for the sentence, like:
[1, 1, 0, 0, 0, 1, 1] # I have a pen
...you'd get a single 2-dimensional dense vector like:
[ 0.28925, -0.3335 ] # I have a pen
And again different sentences may be usefully comparable to each other based on these dense-embedding features, by distance. Or these might work well as training data for a downstream machine-learning process.
So, this is a form of "feature extraction" that uses Word2Vec instead of simple word-counts. There are many other more sophisticated ways to turn text into vectors; they could all count as kinds of "feature extraction".
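A tiny sketch of the averaging step, using the made-up 2-dimensional vectors from the table above (a real model's values would differ):

```python
# Made-up 2-d word vectors (the same dummy values as the table above).
vecs = {
    'i':    [0.734, -0.345],
    'have': [0.288, -0.549],
    'a':    [-0.101, 0.271],
    'pen':  [0.236, -0.711],
}

def sentence_vector(sentence):
    # Average the per-word dense vectors into one sentence vector.
    words = sentence.lower().split()
    dims = len(next(iter(vecs.values())))
    return [sum(vecs[w][d] for w in words) / len(words) for d in range(dims)]

print(sentence_vector("I have a pen"))  # approximately [0.28925, -0.3335]
```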
Which works best for your needs will depend on your data and ultimate goals. Often the most-simple techniques work best, especially once you have a lot of data. But there are few absolute certainties, and you often need to just try many alternatives, and test how well they do in some quantitative, repeatable scoring evaluation, to find which is best for your project.

Understanding; for i in range, x,y = [int(i) in i.... Python3

I am stuck trying to understand the mechanics behind this combined input(), loop & list-comprehension; from Codegaming's "MarsRover" puzzle. The sequence creates a 2D line, representing a cut-out of the topology in an area 6999 units wide (x-axis).
Understandably, my original question was put on hold for being too broad. I am trying to shorten and narrow the question: I understand list comprehension basically, and I'm reasonably experienced with for-loops.
Like list comp:
land_y = [int(j) for j in range(k)]
if k = 5; land_y = [0, 1, 2, 3, 4]
For-loops:
for i in range(4):
    a = 2*i
    ab.append(a)  # ab becomes [0, 2, 4, 6]
But here, it just doesn't add up (in my head):
6999 points are created along the x-axis, from 6 points(x,y).
surface_n = int(input())
for i in range(surface_n):
land_x, land_y = [int(j) for j in input().split()]
I do not understand where "i" makes a difference.
I do not understand how the data is "packaged" inside the input. I have split strings of integers on another task in almost exactly the same code, and I could easily create new lists and work with them - as I understood the structure I was unpacking (pretty simple, being one datatype with one purpose).
The fact that this line follows within the "game"-while-loop confuses me more, as it updates dynamically as the state of the game changes.
x, y, h_speed, v_speed, fuel, rotate, power = [int(i) for i in input().split()]
Maybe someone could give an example of how this could be written in javascript, haskell or c#? No need to be syntax-correct, I'm just struggling with the concept here.
input() takes a line from the standard input. So it’s essentially reading some value into your program.
The way that code works, it makes very hard assumptions on the format of the input strings. To the point that it gets confusing (and difficult to verify).
Let’s take a look at this line first:
land_x, land_y = [int(j) for j in input().split()]
You said you already understand list comprehension, so this is essentially equal to this:
inputs = input().split()
results = []
for j in inputs:
    results.append(int(j))
land_x, land_y = results
This is a combination of multiple things that happen here. input() reads a line of text into the program, split() separates that string into multiple parts, splitting it whenever a white space character appears. So a string 'foo bar' is split into ['foo', 'bar'].
Then, the list comprehension happens, which essentially just iterates over every item in that split input string and converts each item into an integer using int(j). So an input of '2 3' is first converted into ['2', '3'] (a list of strings), and then into [2, 3] (a list of ints).
Finally, the line land_x, land_y = results is evaluated. This is called iterable unpacking and essentially assumes that the iterable on the right has exactly as many items as there are variables on the left. If that’s the case then it’s just a nice way to write the following:
land_x = results[0]
land_y = results[1]
So basically, the whole list comprehension assumes that there is an input of two numbers separated by whitespace, it then splits those into separate strings, converts those into numbers and then assigns each number to a separate variable land_x and land_y.
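You can see both the happy path and the "hard assumptions" failure mode by substituting a fixed string for input():

```python
# Simulating the parsing with a fixed string in place of input():
line = "7 3"
land_x, land_y = [int(j) for j in line.split()]
print(land_x, land_y)  # 7 3

# With the wrong number of fields, the unpacking raises ValueError
# before anything is assigned:
try:
    land_x, land_y = [int(j) for j in "7 3 9".split()]
except ValueError as err:
    print(err)  # e.g. "too many values to unpack (expected 2)"
```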
Exactly the same thing happens again later with the following line:
x, y, h_speed, v_speed, fuel, rotate, power = [int(i) for i in input().split()]
It’s just that this time, it expects the input to have seven numbers instead of just two. But then it’s exactly the same.

Python weird behavior in list comprehension [duplicate]

This question already has answers here:
List comprehension rebinds names even after scope of comprehension. Is this right?
(6 answers)
Closed 9 years ago.
def nrooks(n):
    # make board
    print n  # prints 4
    arr = [0 for n in range(n)]  # if "0 for n" becomes "0 for x", it works fine
    print n  # prints 3 instead of 4

nrooks(4)
How come the second n becomes 3 , different from the given parameter?
Python 2
The n variable used in the list comprehension is the same n as is passed in.
The comprehension sets it to 0, 1, 2, and then finally 3.
Instead, change it to
arr = [0 for _ in range(n)]
or (surprisingly!)
arr = list(0 for n in range(n))
Python 3
This has been fixed.
From the BDFL himself:
We also made another change in Python 3, to improve equivalence
between list comprehensions and generator expressions. In Python 2,
the list comprehension "leaks" the loop control variable into the
surrounding scope:
x = 'before'
a = [x for x in 1, 2, 3]
print x # this prints '3', not 'before'
This was an artifact of the original implementation of list
comprehensions; it was one of Python's "dirty little secrets" for
years. It started out as an intentional compromise to make list
comprehensions blindingly fast, and while it was not a common pitfall
for beginners, it definitely stung people occasionally. For generator
expressions we could not do this. Generator expressions are
implemented using generators, whose execution requires a separate
execution frame...
However, in Python 3, we decided to fix the "dirty little secret" of
list comprehensions by using the same implementation strategy as for
generator expressions. Thus, in Python 3, the above example (after
modification to use print(x) :-) will print 'before'.
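The original nrooks example therefore behaves as expected under Python 3, where the comprehension variable is local:

```python
def nrooks(n):
    print(n)                     # prints 4
    arr = [0 for n in range(n)]  # this n is local to the comprehension
    print(n)                     # still prints 4 in Python 3
    return arr

nrooks(4)
```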

Time based rotation

I'm trying to figure out the best way of doing the following:
I have a list of values: L
I'd like to pick a subset of this list, of size N, and get a different subset (if the list has enough members) every X minutes.
I'd like the values to be picked sequentially, or randomly, as long as all the values get used.
For example, I have a list: [google.com, yahoo.com, gmail.com]
I'd like to pick X (2 for this example) values and rotate those values every Y(60 for now) minutes:
minute 0-59: [google.com, yahoo.com]
minute 60-119: [gmail.com, google.com]
minute 120-179: [google.com, yahoo.com]
etc.
Random picking is also fine, i.e:
minute 0-59: [google.com, gmail.com]
minute 60-119: [yahoo.com, google.com]
Note: The time epoch should be 0 when the user sets the rotation up, i.e, the 0 point can be at any point in time.
Finally: I'd prefer not to store a set of "used" values or anything like that, if possible. i.e, I'd like this to be as simple as possible.
Random picking is actually preferred to sequential, but either is fine.
What's the best way to go about this? Python/Pseudo-code or C/C++ is fine.
Thank you!
You can use the itertools standard module to help:
import itertools
import random
import time
a = ["google.com", "yahoo.com", "gmail.com"]
combs = list(itertools.combinations(a, 2))
random.shuffle(combs)
for c in combs:
    print(c)
    time.sleep(3600)
EDIT: Based on your clarification in the comments, the following suggestion might help.
What you're looking for is a maximal-length sequence of integers within the range [0, N). You can generate this in Python using something like:
def modseq(n, p):
    r = 0
    for i in range(n):
        r = (r + p) % n
        yield r
Given an integer n and a prime number p that is not a factor of n (choosing p greater than n guarantees this), you will get a sequence of all the integers from 0 to n-1:
>>> list(modseq(10, 13))
[3, 6, 9, 2, 5, 8, 1, 4, 7, 0]
From there, you can filter this list to include only the integers that have the desired number of 1 bits set (see Best algorithm to count the number of set bits in a 32-bit integer? for suggestions). Then choose the elements from your set based on which bits are set to 1. In your case, you would pass n as 2^N, where N is the number of elements in your set.
This sequence is deterministic given a time T (from which you can find the position in the sequence), a number N of elements, and a prime P.
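As a simpler variant in the same stateless spirit, you can make the current subset a pure function of elapsed time over the precomputed combinations (shuffling is omitted here so the example stays deterministic, but a seeded random.shuffle of combos would also work):

```python
import itertools
import time

sites = ["google.com", "yahoo.com", "gmail.com"]
N = 2          # subset size
period = 3600  # seconds per rotation (60 minutes)

# All size-N subsets, in itertools' deterministic lexicographic order.
combos = list(itertools.combinations(sites, N))

def current_subset(start_epoch, now=None):
    # Stateless: the subset is a pure function of elapsed time,
    # so nothing "used" needs to be stored.
    now = time.time() if now is None else now
    idx = int((now - start_epoch) // period) % len(combos)
    return combos[idx]

print(current_subset(0, 0))     # ('google.com', 'yahoo.com')
print(current_subset(0, 3600))  # ('google.com', 'gmail.com')
```

Every subset gets used once per full cycle of len(combos) periods, which satisfies the "all values get used" requirement without any bookkeeping.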