Hi, I'm trying to implement tANS in a compute shader, but I am confused about the size of the state set. Apologies in advance: my account is too new to embed images of LaTeX-formatted equations.
Imagine we have a symbol frame S comprised of symbols s₁ to sₙ:
S = {s₁, s₂, s₁, s₂, ..., sₙ}
|S| = 2ᵏ
and the probability of each symbol is
pₛₙ = frequency(sₙ) / |S|
pₛ₁ + pₛ₂ + ... + pₛₙ = 1
According to Jarek Duda's slides (which can be found here) the first step in constructing the encoding function is to calculate the number of states L:
L = |S|
so that we can create a set of states
𝕃 = {L, ..., 2L - 1}
from which we can construct the encoding table. In our example this is simple: L = |S| = 2ᵏ. However, we don't necessarily want L to equal |S|, because |S| could be enormous, and constructing an encoding table of size |S| would be counterproductive to compression. Jarek's solution is to create a quantization function so that we can choose an
L : L < |S|
which approximates the symbol probabilities, where Lₛ is the number of states assigned to symbol s:
Lₛ / L ≈ pₛ
However, as L decreases, the quality of the compression decreases, so I have two questions:
How small can we make L while still achieving compression?
What is a "good" way of determining the size of L for a given |S|?
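To make the quantization step concrete, here is a minimal Python sketch (Python is used only for illustration). The function name `quantize_counts` and the rounding strategy are my own assumptions and are not taken from Jarek's slides or toolkit. It scales the symbol counts so that they sum to a chosen L while giving every occurring symbol at least one slot, so that Lₛ / L ≈ pₛ.

```python
from collections import Counter

def quantize_counts(frame, L):
    """Scale symbol counts so they sum to L, with at least 1 slot per symbol.

    Hypothetical helper: the rounding rule is an assumption, not Duda's method.
    """
    counts = Counter(frame)
    if L < len(counts):
        raise ValueError("L must be at least the number of distinct symbols")
    total = len(frame)
    # Start from the rounded-down ideal share, but never below one slot.
    slots = {s: max(1, (c * L) // total) for s, c in counts.items()}
    # Repair the sum one slot at a time, always moving toward L.
    while sum(slots.values()) != L:
        if sum(slots.values()) < L:
            # The most under-represented symbol gets one more slot.
            s = max(slots, key=lambda t: counts[t] / total - slots[t] / L)
            slots[s] += 1
        else:
            # The most over-represented symbol that can spare a slot loses one.
            spare = [t for t in slots if slots[t] > 1]
            s = max(spare, key=lambda t: slots[t] / L - counts[t] / total)
            slots[s] -= 1
    return slots
```

For the frame "aaaaaabc" and L = 4, this assigns 2 slots to a and 1 each to b and c, so a is approximated as 2/4 instead of its true 6/8; that loss of precision is what costs compression as L shrinks.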
In Jarek's ANS toolkit he uses the depth of a Huffman tree created from S to get the size of L, but this seems like a lot of work when we already know the upper bound of L (|S|; as I understand it when L = |S| we are at the Shannon entropy; thus making L > |S| would not increase compression). Instead it seems like it would be faster to choose an L that is both less than |S| and above some minimum L. A "good" size of L therefore would achieve some amount of compression, but more importantly would be easy to calculate. However we would need to determine the minimum L. Based on the pictures of sample ANS tables it seems like the minimum size of L could be the frequency of the most probable symbol, but I don't know enough about ANS to confirm this.
After mulling it over for a while, both questions turn out to have very simple answers. The smallest L that still achieves lossless compression is L = |A|, where A is the alphabet of symbols to be encoded (I apologize, the lossless criterion should have been included in the original question). If L < |A|, then we are pigeonholing symbols and thus losing information. When L = |A|, what we essentially have is a fixed-length code, where each symbol has an equal probability weighting in our encoding table.
The answer to the second question is even simpler now that we know the answer to the first. L can be pretty much whatever you want, so long as it is at least the size of the alphabet to be encoded. Usually we want L to be a power of two for computational efficiency, and we want L to be greater than |A| to achieve better compression, so a very common choice of L is 2 times the smallest power of two equal to or greater than the size of the alphabet. This can easily be found with something like this:
int alphabetSize = SizeOfAlphabet();
// Twice the smallest power of two >= alphabetSize.
int L = (int)pow(2, ceil(log2(alphabetSize)) + 1);
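The same value can also be computed with integer arithmetic only, which sidesteps any floating-point rounding in the logarithm. A minimal sketch in Python (for illustration only; `table_size` is a hypothetical helper name):

```python
def table_size(alphabet_size):
    """Return twice the smallest power of two >= alphabet_size."""
    if alphabet_size < 1:
        raise ValueError("alphabet must contain at least one symbol")
    # (n - 1).bit_length() equals ceil(log2(n)) for every integer n >= 1.
    return 1 << ((alphabet_size - 1).bit_length() + 1)
```

For example, an alphabet of 256 symbols gives L = 512, and an alphabet of 257 symbols gives L = 1024.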
I'm new to Prolog. I managed to learn C and Java relatively quickly, but Prolog is giving me a lot of trouble. My trouble is understanding lists and writing functions. For example, we have this automaton:
I can do this task in C and Java, no problems. But the course wants Prolog. With my current knowledge I could do things like this:
% 1. Check whether all integers of the list are < 10.
less_than_10([]).
less_than_10([Head|Tail]) :-
    Head < 10,
    less_than_10(Tail).
Just so you know where my knowledge is at. Very basic. I did read the list chapter in Learn Prolog Now but it's still confusing me. They gave us a hint:
Every node should be presented like:
delta(1, d, 2)
% or
alpha(2, a, 2)
They also told us to pass the list in question to a predicate that returns true if the list fits the automaton and false if not:
accept([d,a,b,a,b,b,b,c,d,c]).
The output is true.
Where to go from here? I'm guessing the first step is to check if the Head of the list is 1. How do I do that? Also, should I add every node as fact into the knowledge base?
So that's pretty easy. Super-direct, much more than if you were using C or Java.
Let's write an interpreter for this graph that:
Is given a list of named transitions;
Walks a path through the given graph by following those transitions;
Accepts (succeeds on) the list if we end up at a final state;
Rejects (fails on) the list if we do not;
And, let's say, throws an exception if the list cannot be generated by the given graph.
Prolog gives us nondeterminism for free in case there are several paths. Which is nice.
We do not have a class to describe the automaton. In a sense, the Prolog program is the automaton. We just have a set of predicates which describe the automaton via inductive definitions. Actually, if you slap a module definition around the source below, you do have the object.
First describe the graph. This is just a set of Prolog facts.
As required, we give the transitions (labeled by atoms) between nodes (labeled by integers), plus we indicate which are the start and end nodes. There is no need to list the nodes or edges themselves.
delta(1,d,2).
delta(2,a,2).
delta(2,b,2).
delta(2,d,4).
delta(2,e,5).
delta(2,c,3).
delta(3,d,6).
delta(6,c,5).
start(1).
end(4).
end(5).
A simple database. This is just one possible representation of course.
And now for the graph walker. We could use Definite Clause Grammars here because we are handling a list, but let's not.
First, a predicate which "accepts" or "rejects" a list of transitions.
It looks like:
% accepts(+Transitions)
It starts in a start state, then "walks" by removing transitions off the list until the list is empty. Then it checks whether it is at an end state.
accepts(Ts) :-                  % accept the list of transitions if...
    start(S),                   % you can accept the list starting
    accepts_from(S,Ts).         % from a start state
accepts_from(S,[T|Ts]) :-       % accept the transitions when at S if...
    delta(S,T,NextS),           % there is a transition S->NextS via T
    accepts_from(NextS,Ts).     % and you can accept the remaining Ts from NextS (inductive definition)
accepts_from(S,[]) :-           % if there is no transition left, we accept if...
    end(S).                     % we are at a final state
Ah, we wanted to throw if the path was impossible for that graph. So a little modification:
accepts(Ts) :-                  % accept the list of transitions if...
    start(S),                   % you can accept the list starting
    accepts_from(S,Ts).         % from a start state
accepts_from(S,[T|Ts]) :-       % accept the transitions when at S if...
    delta(S,T,NextS),           % there is a transition S->NextS via T
    accepts_from(NextS,Ts).     % and you can accept the remaining Ts from NextS
accepts_from(S,[T|Ts]) :-       % throw when at S if...
    \+ delta(S,T,_),            % there is NO transition from S via T
    format(string(Txt),"No transition at ~q to reach ~q",[S,[T|Ts]]),
    throw(Txt).
accepts_from(S,[]) :-           % if there is no transition left, we accept if...
    end(S).                     % we are at a final state
And so:
?- accepts([d,a,b,a,b,b,b,c,d,c]).
true ; % yup, accepts but maybe there are other paths?
false. % nope
?- accepts([d,a,a,a,a,e]).
true ;
false.
?- accepts([d,a,a,a,a]).
false.
?- accepts([d,c,e,a]).
ERROR: Unhandled exception: "No transition at 3 to reach [e,a]"
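For readers coming from an imperative background, the behaviour of the walker above can be mirrored in a few lines of Python. This is only an illustrative sketch, not a translation of the Prolog semantics: the names `DELTA`, `START`, `END` and `NoTransition` are mine, and it relies on this particular automaton being deterministic (at most one transition per state and label), so it does not need Prolog's backtracking.

```python
# Transition table of the automaton: (state, label) -> next state.
DELTA = {
    (1, "d"): 2,
    (2, "a"): 2, (2, "b"): 2, (2, "d"): 4, (2, "e"): 5, (2, "c"): 3,
    (3, "d"): 6,
    (6, "c"): 5,
}
START = 1
END = {4, 5}

class NoTransition(Exception):
    """Raised when the list cannot be walked on this graph at all."""

def accepts(transitions):
    state = START
    for i, label in enumerate(transitions):
        if (state, label) not in DELTA:
            # Mirrors the throwing clause: no edge leaves `state` via `label`.
            raise NoTransition(
                f"No transition at {state} to reach {transitions[i:]}")
        state = DELTA[(state, label)]
    # The whole list was consumed: accept only in a final state.
    return state in END
```

Calling `accepts(list("dababbbcdc"))` returns True, `accepts(list("daaaa"))` returns False, and `accepts(list("dcea"))` raises `NoTransition` at state 3, matching the queries above.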
The above code should also be able to find acceptable paths through the graph. But it does not:
?- accepts(T).
... infinite loop
This is not nice.
The primary reason for that is that accepts_from/2 will immediately generate an infinite path looping at state 2 via the self-loop transitions a and b. So one needs to add a "depth limiter" (the keyword is "iterative deepening").
The second reason is that the test \+ delta(S,T,NextS) would succeed at node 4, for example (because there is nowhere to go from that node), and cause an exception before trying out the possibility of going nowhere (the last clause). So when generating, throwing is a hindrance; one just wants to reject.
Addendum: Also generate
The following only accepts/rejects and does not throw, but can also generate.
:- use_module(library(clpfd)).
accepts(Ts,L) :-                % Accept the list of transitions Ts of length L if
    start(S),                   % ...starting from a start state S
    accepts_from(S,Ts,L).       % ...you can accept the Ts of length L.
accepts_from(S,[T|Ts],L) :-     % Accept the transitions [T|Ts] when at S if
    (   nonvar(L)
    ->  L >= 1
    ;   true
    ),                          % ...L (if it is bound) is at least 1 (this can be replaced by L #> 0)
    delta(S,T,SN),              % ...and there is a transition S->SN via T
    Lm #= L-1,                  % ...and the new length is **constrained to be** 1 less than the previous length
    accepts_from(SN,Ts,Lm).     % ...and you can accept the remaining Ts of length Lm from SN.
accepts_from(S,[],0) :-         % If there is no transition left, length L must be 0 and we accept if
    end(S).                     % ...we are at a final state.
delta(1,d,2).
delta(2,a,2).
delta(2,b,2).
delta(2,d,4).
delta(2,e,5).
delta(2,c,3).
delta(3,d,6).
delta(6,c,5).
start(1).
end(4).
end(5).
generate :-
    between(0,7,L),
    findall(Ts,accepts(Ts,L),Bag),
    length(Bag,BagLength),
    format("Found ~d paths of length ~d through the graph\n",[BagLength,L]),
    maplist({L}/[Ts]>>format("~d : ~q\n",[L,Ts]),Bag).
And so:
?- accepts([d,a,b,a,b,b,b,c,d,c],_).
true ;
false.
?- accepts([d,a,a,a,a],_).
false.
?- accepts([d,c,e,a],_).
false.
?- generate.
Found 0 paths of length 0 through the graph
true ;
Found 0 paths of length 1 through the graph
true ;
Found 2 paths of length 2 through the graph
2 : [d,d]
2 : [d,e]
true ;
Found 4 paths of length 3 through the graph
3 : [d,a,d]
3 : [d,a,e]
3 : [d,b,d]
3 : [d,b,e]
true ;
Found 9 paths of length 4 through the graph
4 : [d,a,a,d]
4 : [d,a,a,e]
4 : [d,a,b,d]
4 : [d,a,b,e]
4 : [d,b,a,d]
4 : [d,b,a,e]
4 : [d,b,b,d]
4 : [d,b,b,e]
4 : [d,c,d,c]
true
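The length-bounded enumeration can be cross-checked outside Prolog. The following Python sketch (illustrative only; `paths` and `count_by_length` are hypothetical names, and the edge list is copied from the delta/3 facts above) enumerates accepted transition lists of an exact length. Fixing the length plays the same role as the L argument in accepts/2: it is what stops the self-loops at state 2 from producing an endless search.

```python
# Edges copied from the delta/3 facts, in the same order: (from, label, to).
DELTA = [
    (1, "d", 2),
    (2, "a", 2), (2, "b", 2), (2, "d", 4), (2, "e", 5), (2, "c", 3),
    (3, "d", 6),
    (6, "c", 5),
]
START = 1
END = {4, 5}

def paths(length, state=START):
    """Yield every accepted transition list with exactly `length` labels."""
    if length == 0:
        # Nothing left to consume: accept only in a final state.
        if state in END:
            yield []
        return
    for src, label, dst in DELTA:
        if src == state:
            for rest in paths(length - 1, dst):
                yield [label] + rest

def count_by_length(max_length):
    """Number of accepted paths for each length 0..max_length."""
    return [len(list(paths(n))) for n in range(max_length + 1)]
```

`count_by_length(4)` gives `[0, 0, 2, 4, 9]`, which agrees with the counts printed by generate above.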
Here's my answer. I sought to completely separate the data from the logic.
There are rules to infer the possible paths and the start and end nodes.
The edge/3 predicate stands for either an alpha or a delta line.
The path//2 DCG nonterminal describes a list of edge labels that ends at an end node.
The start and end nodes are inferred using the start_node/1 and end_node/1 predicates.
Finally, phrase/2 is used to describe the lists of steps that are valid paths through the automaton.
delta(1, d, 2).
delta(2, d, 4).
delta(2, e, 5).
delta(2, c, 3).
delta(3, d, 6).
delta(6, c, 5).
alpha(2, a, 2).
alpha(2, b, 2).
edge(Node, Node, Via) :-
    alpha(Node, Via, Node).
edge(From, To, Via) :-
    delta(From, Via, To).
path(From, To) -->
    {   end_node(To),
        dif(From, To),
        edge(From, To, Via)
    },
    [Via].
path(From, To) -->
    { edge(From, Mid, Via) },
    [Via],
    path(Mid, To).
start_node(Node) :-
    node_aux(start_node_aux, Node).
end_node(Node) :-
    node_aux(end_node_aux, Node).
start_node_aux(Node) :-
    edge(Node, _, _),
    \+ edge(_, Node, _).
node_aux(Goal, Node) :-
    setof(Node, call(Goal, Node), Nodes),
    member(Node, Nodes).
end_node_aux(Node) :-
    edge(_, Node, _),
    \+ edge(Node, _, _).
automaton -->
    { start_node(Start) },
    path(Start, _End).
accept(Steps) :-
    length(Steps, _N),
    phrase(automaton, Steps).
I suspect that David did not use Definite Clause Grammars because you should be familiar with the basics before learning DCGs.
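The idea of inferring the start and end nodes from the edges alone, as start_node/1 and end_node/1 do, can be sketched in Python as well. This is only an illustration under my own naming (`EDGES`, `start_nodes`, `end_nodes` are hypothetical): a start node has outgoing but no incoming edges, and an end node has incoming but no outgoing edges.

```python
# (from, to, label) triples, mirroring edge/3 over the delta and alpha facts.
EDGES = [
    (1, 2, "d"), (2, 4, "d"), (2, 5, "e"), (2, 3, "c"),
    (3, 6, "d"), (6, 5, "c"),
    (2, 2, "a"), (2, 2, "b"),  # the alpha/3 self-loops
]

def start_nodes(edges):
    """Nodes that appear as a source but never as a target."""
    sources = {src for src, _, _ in edges}
    targets = {dst for _, dst, _ in edges}
    return sorted(sources - targets)

def end_nodes(edges):
    """Nodes that appear as a target but never as a source."""
    sources = {src for src, _, _ in edges}
    targets = {dst for _, dst, _ in edges}
    return sorted(targets - sources)
```

For the edges above, `start_nodes(EDGES)` is `[1]` and `end_nodes(EDGES)` is `[4, 5]`. Note that this inference only works when the start node has no incoming edge and the final nodes have no outgoing ones, which happens to hold for this automaton.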
I'm very confused about how to filter out the element (1,1) from the list in the code below.
take 10 [ (i,j) | i <- [1,2],
j <- [1..] ]
yields
[(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),(1,8),(1,9),(1,10)]
My thought was to use something like filter, but I'm not sure where to apply it.
My attempt was Filter ((i,j) /=0) "the list"
Thanks
Your attempt
Filter ((i,j) /=0) "the list"
has a few problems, which can be fixed.
First, the function is called filter (lowercase). Second, its first argument must be a function, so you can use \(i,j) -> ... to take a pair as input. Third, you want (i,j) /= (1,1): you can't compare a pair (i,j) to a single number 0.
You should now be able to correct your code.
As an alternative to using filter, you can also specify that you don't want (1,1) as an element within your list comprehension by adding a guard expression (i,j) /= (1,1):
take 10 [ (i,j) | i <- [1,2], j <- [1..], (i,j) /= (1,1) ]
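The same two options, filtering afterwards or guarding inside the comprehension, exist in other languages with lazy sequences. As a rough illustration in Python rather than Haskell (`pairs` and `take` are hypothetical helper names), a generator plays the role of the infinite list:

```python
from itertools import count, islice

def pairs():
    """Lazy analogue of [ (i,j) | i <- [1,2], j <- [1..] ]."""
    return ((i, j) for i in [1, 2] for j in count(1))

def take(n, iterable):
    """Analogue of Haskell's take: the first n items as a list."""
    return list(islice(iterable, n))

# Option 1: filter with a function that takes a pair.
with_filter = take(10, filter(lambda p: p != (1, 1), pairs()))

# Option 2: a guard inside the comprehension itself.
with_guard = take(
    10,
    ((i, j) for i in [1, 2] for j in count(1) if (i, j) != (1, 1)),
)
```

Both variants start at (1,2) and run to (1,11), since j never runs out and i therefore never advances to 2.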
This is similar to how you might write a set comprehension (which list comprehensions mimic).
This answer gives a nice example ([x | i <- [0..10], let x = i*i, x > 20]) of the three types of expression you can have in the tail end of a list comprehension:
Generators, e.g. i <- [0..10], provide the sources of values.
Guards, e.g. x > 20, are arbitrary predicates: for any given values from the generators, the value will only be included in the result if all the predicates hold.
Local declarations, e.g. let x = i*i, perform the same task as normal let/where statements.
The names for the different expressions are taken from the syntax reference (the qual production).