Akka : UnboundedPriorityMailbox - Is it possible to prioritize messages by complicated type? - akka

UnboundedPriorityMailbox has the option to prioritize messages by type (like int string etc.). Is it possible to prioritize messages by type of class property?
I mean, I know that option is available:
case x: Int => 1
// String Messages
case x: String => 0
// Long messages
case x: Long => 2
// other messages
case _ => 3
And the order (priority) of the messages will be:
myPriorityActor ! "Hello"
myPriorityActor ! "I am priority actor"
myPriorityActor ! "I process string messages first,then integer, long and others"
myPriorityActor ! 5
myPriorityActor ! 3
myPriorityActor ! 1
myPriorityActor ! 6.0
myPriorityActor ! 5.0
I would like to know if I can prioritize the queue by type of Class property.
For example:
case x: Student.name=> 1
// String Messages
case x: Student.salary=> 0
// Long messages
case x: Student.School=> 2
Is such a thing possible?

Fundamentally, this has nothing to do with Akka. All you are doing is returning a partial function that has the interface of accepting an Any and returning an Int. Yes, behind the scenes that function will be used to determine the priority of a message, but from Akka's perspective it doesn't know anything about your function. The Mailbox just calls the function you provide to determine a relative priority for each message it receives.
The point being that this is really a Scala question, and you might be better off asking it with just a Scala tag: you want to write a partial function that looks at a "class property" and returns an Int.
I think the answer to your question is no, but I'm not 100% sure I'm understanding your question. I think the key to the answer is understanding what I just said above though: there is no magic here. You are just providing a partial function that accepts an Any and returns an Int (or isn't applicable). The Scala pattern matching is doing some automatic detection of types and conversion of types, but you could do the same (in more lines of code) just by doing a bunch of .getClass calls within if statements yourself.
So the only real question you have to answer is "could I write an if statement that does this?"
Thus, if my function receives the input "Thomas Jefferson", could you write if statements that convert that into the Int value corresponding to its message priority? Instinctively, I don't think you could, because I would expect that Student.name and Student.School are both String types and you would have a hard time distinguishing between them. There's no if statement I could write that could tell "Thomas Jefferson" the person from "Thomas Jefferson" the school. But, on the other hand, it all depends on how you are defining your types. Maybe you've wrapped the school name in its own type (String itself is final and cannot be subclassed), in which case you could look at the types and tell the difference.
But, again, the bottom line is that this is just a function that converts Anys into Ints. If you can write a function that does that, Akka will use it to determine the priority of the message.
Although I'd also assert that this problem is somewhat moot in the real world. Prioritized mailboxes are pretty rare in real world applications and in cases where you have them you are probably not sending simple types. Most messages are likely some kind of envelope and won't be a simple type. You'll easily be able to tell the difference between a "HighPriorityCancel" message and a "GetGrade" message via the envelope.
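To make the "it is just a function from any message to an Int" point concrete, here is a minimal sketch in Python rather than Scala. It is not Akka's API: the wrapper classes StudentName, StudentSalary and StudentSchool and the helper drain_in_priority_order are hypothetical names invented for illustration, and the heap only imitates what a priority mailbox does with the numbers the function returns.

```python
import heapq
import itertools
from dataclasses import dataclass


# Hypothetical wrapper types: two strings only become distinguishable
# once each one is wrapped in its own class.
@dataclass(frozen=True)
class StudentName:
    value: str


@dataclass(frozen=True)
class StudentSalary:
    value: int


@dataclass(frozen=True)
class StudentSchool:
    value: str


def priority(message: object) -> int:
    """Map any message to an int; lower numbers are handled first."""
    if isinstance(message, StudentSalary):
        return 0
    if isinstance(message, StudentName):
        return 1
    if isinstance(message, StudentSchool):
        return 2
    return 3  # everything else, including bare strings


def drain_in_priority_order(messages):
    """Return the messages ordered by priority, arrival order within a priority."""
    counter = itertools.count()  # tie-breaker, so messages are never compared
    heap = [(priority(m), next(counter), m) for m in messages]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

A bare "Thomas Jefferson" string falls through to the default priority, which is exactly the ambiguity described above; only the wrapped values can be ranked separately.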


Concurrency in Erlang

The problem I am trying to solve is as follows:
Write an Erlang function named print_message that takes no arguments. The function should wait to receive a message. When the message is received (it can be any Erlang term), print the message using io:format(). If 42 seconds pass without receiving a message, print a message that says "Too late.".
The code that I wrote for the problem is down below:
print_message() ->
    receive
        X -> io:format("~p~n", [X])
    after 42000 ->
        io:format("Too late.~n")
    end.
In my question, it says 'it can be any Erlang term'. Does using X in my code fulfill that requirement? Or do I need to use the Erlang built in function of any() as stated in the below reference manual:
Yes, your code fulfils the requirement. The pattern X matches any Erlang term.
Compare with the following, which matches only when the incoming message is a 2-tuple starting with ok:
print_message() ->
    receive
        {ok, X} ->
            ...
Or with this, which matches only if the incoming message is an integer:
print_message() ->
    receive
        X when is_integer(X) ->
            ...
Or with this, which matches only if the incoming message is equal to the function argument:
print_message(X) ->
    receive
        X ->
            ...
(Since the variable names are the same, this turns into a selective receive where all other messages are ignored.)
Type specs are an optional part of the Erlang language. You could specify that your function takes an integer and returns a string:
-spec my_function(integer()) -> string().
my_function(N) ->
You could then use Dialyzer to check for type errors.
However, type specs are only used at compile time; they don't actually perform any checks at run time. Also, they cannot be used to specify types for messages being sent or received; only function arguments and return values are covered.
Your code fulfills the requirement.
Erlang is dynamically typed. So the type of X will be determined only on the reception of the first message, and therefore it can be any Erlang term.
To my knowledge, it is not possible to specify the type of X in your code.
Erlang does have type specifications, but they are used for function parameters, return values, and record definitions.
These type definitions can be used later for documentation or by Dialyzer.

Questions on SML type checking and inference

First of all, since the question is somehow related to a school project I don't think that posting my code is appropriate. Plus, as I explain later on I only have a modified version of the code in question.
Let me explain. I have to implement a version of Dijkstra's algorithm using a priority queue. I thought that a simple functional way to do so is to define a dijkstra function that takes the queue and the target node as inputs, plus a helper function that enqueues the neighbors of the node that was just dequeued. Unfortunately, the helper function didn't typecheck: Unresolved Flex Record.
So far it may seem that the code is important, but allow me to add one more detail. Since the graph was 4-canonical (meaning each node has exactly four neighbors), I represented it as a matrix using modulus arithmetic. In order to simplify my algorithm I used this fact to rewrite it with 4 extra helper functions, one for each possible move, instead of four ifs inside the first helper function. Each of the four move functions returns true if we should visit this node (meaning the cost via this route is smaller than the current cost) and false if not. The first helper simply returns a tuple of four booleans. Finally, I copied the enqueue code that wasn't working in my first try into the body of the dijkstra code and suddenly it did typecheck.
I understand that it may still be unclear and perhaps you can only speculate about what was going on. But I am truly very puzzled. I searched this site and the SML Basis documentation as well and found that this kind of error occurs in the following case:
f (x,y,z) = ...
where z isn't used, so the checker can't deduce what it is.
I am sure this is not the case in my problem, since I just copy-pasted the code (not a very good technique, I know, but OK). Hence, I concluded that the problem was the typechecker not working across function calls. I searched again and found an explanation of the Hindley-Milner algorithm. From what I understood, every time it encounters a function it will assume the type is a->b as a first step, and later on it will go to the definition of the function and complete the task. So I was back to square one and decided to ask this question here, looking for a better understanding of type inference or for a hint about what was going on.
P.S. 1) Even though I tried my best to explain the question, if it is still unclear or too broad let me know and I will delete it, no problem.
P.S. 2) A smaller and simpler question: I read that using #1 to take the 1st element of a tuple is not advisable, that sometimes it doesn't even typecheck, and that pattern matching should be used instead. Could you explain that?
P.S. 3) Someone may wonder why I asked this question since I solved the problem with my second try. Personally, I don't consider it solved, just hidden.
Thanks in advance and sorry for the size of the question.
SML/NJ Errors
UPDATED: After some extra searching I have a guess about what was wrong. I was implementing a general priority queue, not one customized for my problem. So the type of the priority queue was being inferred when I first enqueued an element. But after enqueueing my source node and calling dijkstra, the queue would be empty once more (my dijkstra was dequeueing the first element and checking whether it is the target node), and the first call of the helper function that adds nodes would have an empty queue as one of its arguments. Perhaps the empty queue has no type and that was causing the error?
I'm taking a guess at what you're asking.
I have a function enqueue that does not work in one context, but it does work in another. Why? It uses the #1 macro, and I read that using #1 to take the 1st element of a tuple is not advisable, that sometimes it doesn't even typecheck, and that pattern matching should be used instead.
In Standard ML, #1 is a macro. It behaves like a function, but unlike functions, it is overloaded for any tuple/record with a 1 field in it. If you do not specify what kind of tuple you're passing to a function, using #1 will not disambiguate this. For example,
- fun f pair = #1 pair;
! Toplevel input:
! fun f pair = #1 pair;
! ^^
! Unresolved record pattern
But giving it the type (either through explicit type annotation, or in a context where the type can be inferred by other means) works well.
- fun f (pair : int * int) = #1 pair;
> val f = fn : int * int -> int
I don't know if I'd label #1 as a definite no-go and pattern matching as the only option, [edit: ... but this Stack Overflow answer that Ionuț G. Stan linked to has some arguments.]
There are advantages and disadvantages with both. Alternatively you can make unambiguous getters that only work on the type of tuple you're working with. For example,
fun fst (x, _) = x
fun snd (_, y) = y

Should I unit-test with data that should not be passed in a function (invalid input)?

I am trying to use TDD in my coding practice. Should I test a function with data that should never be passed to it, BUT that could break the program if it were?
Here is an easy example to illustrate what I am asking:
A ROBOT function has one INT parameter. I know that the valid range is only 0-100. If -1 or 101 is passed, the function will break.
function ROBOT (int num){
    ...
    return result;
}
So I decided on some automated test cases for this function...
1. function ROBOT with input argument 0
2. function ROBOT with input argument 1
3. function ROBOT with input argument 10
4. function ROBOT with input argument 100
But should I write test cases with input argument -1 or 101 for this ROBOT function IF I guard against those values in the other functions that call ROBOT?
5. function ROBOT with input argument -1
6. function ROBOT with input argument 101
I don't know if it is necessary, because I think it is redundant to test -1 and 101. And if it really is necessary to cover all the cases, I have to write more code to guard against -1 and 101.
So, in the common practice of TDD, would you write test cases for -1 and 101 as well?
Yes, you should test those invalid inputs. BUT, if your language has accessibility modifiers and ROBOT() is private you shouldn't be testing it; you should only test public functions/methods.
The functional testing technique is called Boundary Value Analysis.
If your range is 0-100, your boundary values are 0 and 100. You should test, at least:
below the boundary value
the boundary value
above the boundary value
In this case: -1, 0, 1 and 99, 100, 101.
You assume everything below -1 to -infinity behaves the same, everything between 1-99 behaves the same and everything above 101 behaves the same. This is called Equivalence Partitioning. The ranges outside and between the boundary values are called partitions and you assume that they will have equivalent behaviour.
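As a sketch of boundary value analysis for the 0-100 range, the tests below probe just below, on, and just above each boundary. The question does not say what ROBOT computes or how it fails, so the robot function here is a hypothetical stand-in that returns its argument and raises ValueError for out-of-range input; both choices are assumptions.

```python
import unittest


def robot(num: int) -> int:
    """Hypothetical stand-in for ROBOT: accepts 0..100, rejects the rest."""
    if not 0 <= num <= 100:
        raise ValueError(f"num must be in 0..100, got {num}")
    return num  # placeholder result; the real computation is not given


class RobotBoundaryTest(unittest.TestCase):
    def test_values_on_and_just_inside_the_boundaries(self):
        for num in (0, 1, 99, 100):
            with self.subTest(num=num):
                self.assertEqual(robot(num), num)

    def test_values_just_outside_the_boundaries(self):
        for num in (-1, 101):
            with self.subTest(num=num):
                with self.assertRaises(ValueError):
                    robot(num)
```

Saved in a module, these can be run with python -m unittest. One value from inside each partition (say 50) could be added, but under the equivalence assumption it should behave like 1 and 99.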
You should always consider using -1 as a test case to make sure nothing funny happens with negative numbers and a text string if the parameter is not strongly typed.
If the expected outcome is that an exception is thrown with invalid input values, then a test that the exceptions get properly thrown would be appropriate.
As I noted in my comment below, if these cases will break your application, you should throw an exception. If it really is logically impossible for these cases to occur, then I would say no, you don't need to throw an exception, and you don't need test cases to cover it.
Note that if your system is well componentized, and this function is one component, the fact that it is logically impossible now doesn't mean it will always be logically impossible. It may be used differently down the road.
In short, if it can break, then you should test it. Also validate data at the earliest point possible.
The answer depends on whether you control the inputs passed to Robot. If Robot is an internal class (C#) and values only flow in from RobotClientX, which is a public type, then I'd put the guard checks in RobotClientX and write tests for it. I'd not write tests for Robot, because invalid values cannot materialize in between.
e.g. if I put my validations in the GUI such that all invalid values are filtered off at the source, then I don't check for invalid values in all classes below the GUI (unless I've also exposed a public API which bypasses the GUI).
On the other hand, if Robot is publicly visible, i.e. anyone can call Robot with any value they please, then I need tests that document its behavior given specific kinds of input, invalid being one of them. e.g. if you pass an out-of-range value, it'd throw an ArgumentException.
You said your method will raise an exception if the argument is not valid.
So, yes you should, because you should test that the exception gets raised.
If other code guards against calling that method incorrectly, and no one else will be writing code to call that method, then I don't see a reason to test with invalid values. To me, it would seem a waste of time.
The programming by contract style of design and implementation draws attention to the fact that a single function (method) should be responsible for only some things, not for everything. The other functions that it calls (delegates to) and which call it also have responsibilities. This partition of responsibilities is at the heart of dividing the task of programming into smaller tasks that can be performed separately. The contract part of programming by contract is that the specification of a function says what a function must do if and only if the caller of the function fulfills the responsibilities placed on the caller by that specification. The requirement that the input integer is within the range [0,100] is that kind of requirement.
Now, unit tests should not test implementation details. They should test that the function conforms to its specification. This enables the implementation to change without the tests breaking. It makes refactoring possible.
Combining those two ideas, how can we write a test for a function that is given some particular invalid input? We should check that the function behaves according to the specification. But the specification does not say what the function must do in this case. So we can not write any checks of the program state after the invalid function call; the behaviour is undefined. So we can not write such a test at all.
My answer is that, no, you don't want exceptions, you don't want to have to have ROBOT() check for out of range input. The clients should be so well behaved that they don't pass garbage values in.
You might want to document this - Just say that clients must be careful about the values they pass in.
Besides where are you going to get invalid values from? Well, user input or by converting strings to numbers. But in those cases it should be the conversion routines that perform the checks and give feedback about whether the values are valid or not. The values should be guaranteed to be valid long before they get anywhere near ROBOT()!
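The idea of checking at the conversion step can be sketched as follows. The function name parse_robot_input and the choice of ValueError are assumptions made for illustration; the answer above does not prescribe either.

```python
def parse_robot_input(text: str) -> int:
    """Hypothetical conversion routine: turn user text into an int in 0..100."""
    try:
        value = int(text.strip())
    except ValueError:
        # Re-raise with a message that names the offending input.
        raise ValueError(f"not an integer: {text!r}") from None
    if not 0 <= value <= 100:
        raise ValueError(f"out of range 0..100: {value}")
    return value
```

With the check living here, a value that reaches ROBOT() has already been validated, and it is this routine, not ROBOT(), that gets the invalid-input tests.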

What type of input check can be performed against binary data in C++?

let's say I have a function like this in C++, which I wish to publish to third parties. I want to make it so that the user will know what happened, should he/she feeds invalid data in and the library crashes.
Let's say that, if it helps, I can change the interface as well.
int doStuff(unsigned char *in_someData, int in_data_length);
Apart from application specific input validation (e.g. see if the binary begins with a known identifier etc.), what can be done? E.g. can I let the user know, if he/she passes in in_someData that has only 1 byte of data but passes in 512 as in_data_length?
Note: I already asked a similar question here, but let me ask from another angle..
It cannot be checked whether the parameter in_data_length passed to the function has the correct value. If this were possible, the parameter would be redundant and thus needless.
But a vector from the standard template library solves this:
int doStuff(const std::vector<unsigned char>& in_someData);
So, there is no possibility of a "NULL buffer" or an invalid data length parameter.
If you could tell how many bytes in_someData points to, why would you need in_data_length at all?
Actually, you can only check in_someData for NULL and in_data_length for positive value. Then return some error code if needed. If a user passed some garbage to your function, this problem is obviously not yours.
In C++, the magic word you're looking for is "exception". That gives you a method to tell the caller something went wrong. You'll end up with code something like
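#include <stdexcept>

// (C++03 would also allow an exception specification,
// throw(std::invalid_argument), after the parameter list.)
int doStuff(unsigned char * inSomeData, int inDataLength) {
    // do a test
    if (inDataLength == 0)
        throw std::invalid_argument("Length can't be 0");
    // only gets here if it passed the test
    // do other good stuff
    return theResult;
}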
Now, there's another problem with your specific example, because there's no universal way in C or C++ to tell how long an array of primitives really is. It's all just bits, with inSomeData being the address of the first bits. Strings are a special case, because there's a general convention that a zero byte ends a string, but you can't depend on that for binary data -- a zero byte is just a zero byte.
This has currently picked up some downvotes, apparently by people misled by the comment that exception specifications had been deprecated. As I noted in a comment below, this isn't actually true: while exception specifications will be deprecated in C++11, they are still part of the language now, so unless the questioner is a time traveler writing in 2014, an exception specification (spelled throw(...) in C++, not throws as in Java) is still a valid way to write it.
Also note that the original questioner says "I want to make it so that the user will know what happened, should he/she feeds [sic] invalid data in and the library crashes." Thus the question is not just what can I do to validate the input data (answer: not much unless you know more about the inputs than was stated), but then how do I tell the caller they screwed up? And the answer to that is "use the exception mechanism" which has certainly not been deprecated.

Short example of regular expression converted to a state machine?

In the Stack Overflow podcast #36 (https://blog.stackoverflow.com/2009/01/podcast-36/), this opinion was expressed:
Once you understand how easy it is to set up a state machine, you’ll never try to use a regular expression inappropriately ever again.
I've done a bunch of searching. I've found some academic papers and other complicated examples, but I'd like to find a simple example that would help me understand this process. I use a lot of regular expressions, and I'd like to make sure I never use one "inappropriately" ever again.
A rather convenient way to look at this is to use Python's little-known re.DEBUG flag on any pattern:
>>> re.compile(r'<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)</\1>', re.DEBUG)
literal 60
subpattern 1
range (65, 90)
max_repeat 0 65535
range (65, 90)
range (48, 57)
at at_boundary
max_repeat 0 65535
not_literal 62
literal 62
subpattern 2
min_repeat 0 65535
any None
literal 60
literal 47
groupref 1
literal 62
The numbers after 'literal' and 'range' refer to the integer values of the ASCII characters they're supposed to match.
Sure, although you'll need more complicated examples to truly understand how REs work. Consider the following RE:
^[A-Za-z][A-Za-z0-9_]*$
which is a typical identifier (must start with an alphabetic character and can have any number of alphanumeric and underscore characters following, including none). The following pseudo-code shows how this can be done with a finite state machine:
state = FIRSTCHAR
for char in all_chars_in(string):
    if state == FIRSTCHAR:
        if char is not in the set "A-Z" or "a-z":
            error "Invalid first character"
        state = SUBSEQUENTCHARS
        next char
    if state == SUBSEQUENTCHARS:
        if char is not in the set "A-Z" or "a-z" or "0-9" or "_":
            error "Invalid subsequent character"
        next char
Now, as I said, this is a very simple example. It doesn't show how to do greedy/nongreedy matches, backtracking, matching within a line (instead of the whole line) and other more esoteric features of state machines that are easily handled by the RE syntax.
That's why REs are so powerful. The actual finite state machine code required to do what a one-liner RE can do is usually very long and complex.
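For comparison, the identifier pseudo-code above can be written as a small runnable Python function. This is a sketch, not a general regex engine: the function name is_identifier and the state names are made up here, and it returns False instead of raising an error.

```python
import string

FIRST_CHAR, SUBSEQUENT_CHARS = "FIRST_CHAR", "SUBSEQUENT_CHARS"
LETTERS = set(string.ascii_letters)
IDENT_CHARS = LETTERS | set(string.digits) | {"_"}


def is_identifier(text: str) -> bool:
    """Accept a letter followed by any number of letters, digits or underscores."""
    state = FIRST_CHAR
    for char in text:
        if state == FIRST_CHAR:
            if char not in LETTERS:
                return False  # invalid first character
            state = SUBSEQUENT_CHARS
        elif char not in IDENT_CHARS:
            return False  # invalid subsequent character
    # An empty string never leaves FIRST_CHAR, so it is rejected.
    return state == SUBSEQUENT_CHARS
```

Even for this tiny pattern the hand-written machine is several times longer than the one-line RE, which is the trade-off described above.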
The best thing you could do is grab a copy of some lex/yacc (or equivalent) code for a specific simple language and see the code it generates. It's not pretty (it doesn't have to be since it's not supposed to be read by humans, they're supposed to be looking at the lex/yacc code) but it may give you a better idea as to how they work.
Make your own on the fly!
This is a really nicely put together tool which visualises regular expressions as FSMs. It doesn't have support for some of the syntax you'll find in real-world regular expression engines, but certainly enough to understand exactly what's going on.
Is the question "How do I choose the states and the transition conditions?", or "How do I implement my abstract state machine in Foo?"
How do I choose the states and the transition conditions?
I usually use FSMs for fairly simple problems and choose them intuitively. In my answer to another question about regular expressions, I just looked at the parsing problem as one of being either inside or outside a tag pair, and wrote out the transitions from there (with a beginning and ending state to keep the implementation clean).
How do I implement my abstract state machine in Foo?
If your implementation language supports a structure like C's switch statement, then you switch on the current state and process the input to see which action and/or transition to perform next.
Without switch-like structures, or if they are deficient in some way, you use if-style branching. Ugh.
Written all in one place in C, the example I linked would look something like this:
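token_t token;
state_t state=BEGIN_STATE;
do {
   switch ( state.getValue() ) {
   /* ... one case per state, each switching on the token ... */
      switch ( token.getValue() ) {
      /* ... one case per token type, setting the next state ... */
         state = IN_STATE;
      }
   }
} while (state != END_STATE);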
which is pretty messy, so I usually rip the state cases out to separate functions.
I'm sure someone has better examples, but you could check this post by Phil Haack, which has an example of a regular expression and a state machine doing the same thing (there's a previous post with a few more regex examples in there as well I think..)
Check the "HenriFormatter" on that page.
I don't know what academic papers you've already read, but it really isn't that difficult to understand how to implement a finite state machine. There is some interesting mathematics, but the idea is actually very easy to understand. The easiest way to understand an FSM is through input and output (actually, this comprises most of the formal definition, which I won't describe here). A "state" is essentially just describing a set of inputs and outputs that have occurred and can occur from a certain point.
Finite state machines are easiest to understand via diagrams. For example:
[Figure: state diagram of a finite state machine with states q0 through q3; original image at http://img6.imageshack.us/img6/7571/mathfinitestatemachinedco3.gif]
All this is saying is that if you begin in some state q0 (the one with the Start symbol next to it) you can go to other states. Each state is a circle. Each arrow represents an input or output (depending on how you look at it). Another way to think of a finite state machine is in terms of "valid" or "acceptable" input. There are certain output strings that are NOT possible for certain finite state machines; this is what allows you to "match" expressions.
Now suppose you start at q0. Now, if you input a 0 you will go to state q1. However, if you input a 1 you will go to state q2. You can see this by the symbols above the input/output arrows.
Let's say you start at q0 and get this input
0, 1, 0, 1, 1, 1
This means you have gone through states (no input for q0, you just start there):
q0 -> q1 -> q0 -> q1 -> q0 -> q2 -> q3
Trace the picture with your finger if it doesn't make sense. Notice that q3 goes back to itself for both inputs 0 and 1.
Another way to say all this is "If you are in state q0 and you see a 0 go to q1 but if you see a 1 go to q2." If you make these conditions for each state you are nearly done defining your state machine. All you have to do is have a state variable and then a way to pump input in and that is basically what is there.
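A state variable plus a transition table is all it takes, as this minimal Python sketch shows. Only the transitions spelled out in the text are included; the q1-on-1 and q2-on-1 arrows are inferred from the example trace, and the diagram's remaining arrows (q1 on 0, q2 on 0) are left out because the image is not reproduced here. The names TRANSITIONS and run are illustrative.

```python
# (current state, input symbol) -> next state.
# q1 on 1 and q2 on 1 are inferred from the example trace in the text.
TRANSITIONS = {
    ("q0", 0): "q1",
    ("q0", 1): "q2",
    ("q1", 1): "q0",
    ("q2", 1): "q3",
    ("q3", 0): "q3",
    ("q3", 1): "q3",
}


def run(inputs, start="q0"):
    """Return the list of states visited, beginning with the start state."""
    state = start
    visited = [state]
    for symbol in inputs:
        key = (state, symbol)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition listed for {state} on input {symbol}")
        state = TRANSITIONS[key]
        visited.append(state)
    return visited


print(" -> ".join(run([0, 1, 0, 1, 1, 1])))
```

For six inputs the machine visits seven states: the start state plus one per input symbol.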
Ok, so why is this important regarding Joel's statement? Well, building the "ONE TRUE REGULAR EXPRESSION TO RULE THEM ALL" can be very difficult, and the result is also difficult to maintain, modify, or even for others to come back to and understand. Also, in some cases a state machine is more efficient.
Of course, state machines have many other uses. Hope this helps in some small way. Note, I didn't bother going into the theory but there are some interesting proofs regarding FSMs.