Catching exceptions in ML - sml

Is it possible in ML to catch every possible exception, for example when I don't know which exception might be raised?

The handle expression lets you pattern match on exceptions, so you can use a wildcard pattern such as handle _ to match any exception, e.g.
hd [] handle _ => 0

Related

What regexp to look for "5 digits or more" EXCEPT if starting with a chain of character

My application reads the subject lines of emails and tries to find the file reference of a user account in them.
The problem is that the emails are sent by users, so the reference can appear in many different positions, alongside dates and wrong references that must be skipped.
We are using a basic regexp that usually works pretty well for:
DOSSIER 4491128 - PPI Claim Benefit Calculation
4471631
Leistungsnr. 4445929
=> we catch respectively
4491128
4471631
4445929
And we can make use of these references to call our systems to retrieve information about users.
But we have a few cases like these where it does not work at all:
WG: SCM1177278 9910808067RSV Meldung
WG: SCM1161874 9909827071
WG: SCM1165728 9910855395 RSV
=> Here I want to skip SCM1177278, SCM1161874 or SCM1165728 and catch only the second number: 9910808067, 9909827071 or 9910855395.
In 'WG: SCM1177278 9910808067RSV Meldung' I succeed in skipping SCM, but I only catch the first number '1177278'. I want to skip that one and catch the next sequence of 5 digits or more.
So I'm desperately trying to find the right regexp to do this...
I tried
(?!scm|SCM)([0-9][0-9][0-9][0-9][0-9]+)
Our basic regexp (not optimized at all lol) is: ([0-9][0-9][0-9][0-9][0-9]+)

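You can use a lookbehind-powered regex like
(?<!scm|SCM|\d)[0-9]{5,}
(?<!scm|SCM)(?<!\d)[0-9]{5,}
Both patterns should behave the same; the second one is for engines such as Boost or Python's re that require fixed-width lookbehind patterns. The negative lookahead in your attempt cannot help, because it inspects the text to the right of the current position, where the digits start, rather than the SCM prefix to the left.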
Details:
(?<!scm|SCM|\d) - a negative lookbehind that fails the match if there is scm, SCM or digit immediately to the left of the current location
[0-9]{5,} - five or more digits.
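As a rough illustration, the fixed-width variant can be tried with Python's re module on the subject lines from the question. The helper name find_reference is made up for this sketch and is not part of the original application.

```python
import re

# Fixed-width variant from the answer: Python's re rejects a lookbehind whose
# alternatives differ in width, so the digit check is a separate lookbehind.
REFERENCE = re.compile(r"(?<!scm|SCM)(?<!\d)[0-9]{5,}")


def find_reference(subject):
    """Return the first run of 5+ digits that does not directly follow
    'scm', 'SCM' or another digit, or None if there is no such run.
    (Hypothetical helper name, used only for this sketch.)"""
    match = REFERENCE.search(subject)
    return match.group(0) if match else None


if __name__ == "__main__":
    for subject in (
        "DOSSIER 4491128 - PPI Claim Benefit Calculation",
        "Leistungsnr. 4445929",
        "WG: SCM1177278 9910808067RSV Meldung",
        "WG: SCM1165728 9910855395 RSV",
    ):
        print(subject, "->", find_reference(subject))
```

The (?<!\d) part is what stops the engine from restarting the match at the second digit of the SCM number; without it, '177278' would be reported instead.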

Is there a way to test if my regex is vulnerable to catastrophic backtracking?

There are many questions on this topic, but I'm not sure if my regex is vulnerable or not.
The following regex is what I use for email validation:
/^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/.test(email)
Because I'm using a * in a few places, I suspect it could be.
I'd like to be able to test any number of occurrences in my code for problems.
I'm using Node.js so this could shut down my server entirely given the single threaded nature of the event loop.

Good question. Yes, given the right input, it's vulnerable and a runaway regex is able to block the entire node process, making the service unavailable.
The basic example of a regex prone to catastrophic backtracking looks like
^(\w+)*$
a pattern which can be found multiple times in the given regex.
When the regex contains nested or optional quantifiers and the input contains a long sequence that cannot be matched in the end, the JS regex engine has to backtrack a lot and burns CPU, for a practically unbounded time if the input is long enough. (You can play with this on regex101 as well using your regex by adjusting the timeout value in the settings.)
In general,
avoid monstrosities,
use HTML5 input validation whenever possible (in the front-end),
use established validation libraries for common input, e.g. validator.js,
try to detect potentially catastrophic exponential-time regular expressions ahead of time using tools like safe-regex & vuln-regex-detector (those offer pretty much what you had in mind),
and know your stuff 1, 2, 3 (I think the third link explains the issue best).
More drastic approaches to mitigating catastrophic backtracking in Node.js are wrapping your regex work in a child process or a vm context and setting a meaningful timeout. (In a perfect world JavaScript's RegExp constructor would have a timeout param, maybe someday.)
The approach of using a child process is described here on SO.
The VM context (sandboxing) approach is described here.
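The child-process idea can be sketched roughly as follows. This is Python rather than Node.js, and search_with_timeout is a hypothetical helper written for this sketch, not an API of any library mentioned above. It assumes that starting one interpreter process per check is acceptable, which is slow but keeps a runaway match out of the main process.

```python
import subprocess
import sys

# Code executed in the child process: exit status 0 on a match, 3 on no match.
# Any other status (e.g. 1 after an uncaught re.error) is treated as a failure.
_CHILD_CODE = (
    "import re, sys; "
    "sys.exit(0 if re.search(sys.argv[1], sys.argv[2]) else 3)"
)


def search_with_timeout(pattern, text, timeout_s):
    """Run re.search(pattern, text) in a separate Python process.

    Returns True or False for match / no match, or None when the child did
    not finish within timeout_s seconds (a likely runaway regex).
    """
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD_CODE, pattern, text],
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return None
    if done.returncode == 0:
        return True
    if done.returncode == 3:
        return False
    raise RuntimeError(f"regex child failed with status {done.returncode}")


if __name__ == "__main__":
    # The textbook pattern from above against input that cannot match.
    print(search_with_timeout(r"^(\w+)*$", "a" * 64 + "!", 1.0))
    print(search_with_timeout(r"^\w+$", "hello", 10.0))
```

The important design point is that the timeout is enforced from outside the process doing the matching, since a backtracking engine cannot be interrupted from the same thread.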

Here's another way to do it: use a library! :)
const Joi = require('@hapi/joi');
// Returns true for a valid address, otherwise [false, message].
// Note: the array is truthy, so compare the result with === true.
function isEmail(emailAsStr) {
  const schema = Joi.object({ email: Joi.string().email() });
  const result = schema.validate({ email: emailAsStr });
  if (!result.error) return true;
  return [false, result.error.details[0].message];
}
I tested it against common catastrophic-backtracking inputs.
The answer to my original question is to use the npm library safe-regex, but I thought I'd share another example of how to solve this problem without a regex.

If-Else conditions with APL?

Is it possible to write an if-statement in APL? If so, how?
Here's my code
'Please enter a number to count to: '
number ←⎕
⍳number
How do I write an if-statement so that, if the user enters a number over 100, it prints "too high" and ends, and if the number is 100 or under it just continues?
Thanks!

In Dyalog APL you have this neat little thing called guards.
They can be used in dfns and evaluate code when a certain condition matches.
func ← {⍵>100 : 'too high' ⋄ 1 : 'number is ok'}

If your APL supports control structures then this should work:
∇ generateAll number
:If number>100
⎕←'Too high'
:else
⎕←⍳ number
:endif
∇
If it does NOT support control structures (like APL2) you will need to branch:
∇ generateAll number
→(number>100)/error
⎕←⍳ number
→0
error:
⎕←'Too high'
∇
You can also use tricks like execute but this is less readable.

A "classical" way of doing error handling* in APL2 is with ⎕ES or ⎕EA.
Your code would look something like this:
⎕ES(NUMBER>100)/'Too high'
⍳NUMBER
What happens here is that IF the parentheses evaluate to true, THEN the ⎕ES will halt the execution and echo the quoted string.
If you don't want your THEN to terminate, have a look at ⎕EA in some APL documentation.
Please note that I'm on APL2 in a GreenOnBlack environment, so there are likely more neat ways of doing this in a more modern dialect like Dyalog.
*I know you're asking about conditionals and not error handling, but since your example terminates execution, it might as well be error handling.
There is a crucial difference between this and what MBaas suggests: his solution will gracefully exit the current function, which might return a value. Using ⎕ES or ⎕EA will terminate all execution.

It depends on the dialect you're using. Some APL implementations support control structures, so you could write something like
:If number>100
⎕←'Too high'
→0
:endif
⍳number
In "traditional APL" you would probably do something like
⍎(number>100)/'⎕←''Too high'' ⋄ →0'
⍳number

How to replace characters in string Erlang?

I have this piece of code that gets the session ID, makes it a string, and then creates a set in Redis with a key such as {{1401,873063,143916},<0.16443.0>}. I'm trying to replace the { characters in this session ID with the letter "a".
OldSessionID= io_lib:format("~p",[OldSession#session.sid]),
StringForOldSessionID = lists:flatten(OldSessionID),
ejabberd_redis:cmd([["SADD", StringForSessionID, StringForUserInfo]]);
I've tried this:
re:replace(N,"{","a",[global,{return,list}]).
Is this a good way of doing this? I read that using regexps in Erlang is not advised.

Your solution works, and if you are comfortable with it, you should keep it.
On my side I prefer a list comprehension:
[case X of ${ -> $a; _ -> X end || X <- StringForOldSessionID].
(just because I don't have to check the function documentation :o)

re:replace(N,"{","a",[global,{return,list}]).
Is this a good way of doing this? I read that regexp in Erlang is not
a advised way of doing things.
According to official documentation:
2.5 Myth: Strings are slow
Actually, string handling could be slow if done improperly. In Erlang, you'll have to think a little more about how the strings are used and choose an appropriate representation and use the re module instead of the obsolete regexp module if you are going to use regular expressions.
So, either you use re for strings, or:
leave { behind(using pattern matching)
if, say, N is {{1401,873063,143916},<0.16443.0>}, then
{{A,B,C},Pid} = N
And then format A,B,C,Pid into string.

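Since Erlang/OTP 20.0 you can use the string:replace/3,4 functions from the string module.
string:replace/4 replaces SearchPattern in String with Replacement. The fourth argument indicates whether the leading, the trailing, or all occurrences of SearchPattern are replaced; string:replace/3 replaces only the leading one.
string:replace(Input, "{", "a", all).
The result is chardata that may be a nested list, so pass it through lists:flatten/1 if you need a flat string.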

Measure the "matching"?

Is there a mechanism to measure or compare how closely a pattern corresponds to a given string? By pattern I mean a regex or something similar. For example, we have the string "foobar" and two regexes: "fooba." and ".*". Both patterns match the string. Is it possible to determine that "fooba." is a more appropriate pattern for the given string than ".*"?

There are metrics and heuristics for string 'distance'. See edit distance, for example: http://en.wikipedia.org/wiki/Edit_distance
Here is one Java implementation that came up in a Google search:
http://www.merriampark.com/ldjava.htm
Some metrics are expensive to compute so look around and find one that fits your needs.
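For reference, a minimal sketch of one such metric, the Levenshtein edit distance, using the standard dynamic-programming approach. It is written in Python for brevity and is not the implementation from the linked pages; the function name edit_distance is chosen only for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions that turn a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete char_a
                    current[j - 1] + 1,                   # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("foobar", "foobaz"))
```

Note that this compares two strings, not a string and a pattern, so it only helps if each pattern can be reduced to a representative literal.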
As for your specific example, IIRC, Java's regex engine tries alternatives from left to right and takes the first one that leads to a match, so if you use something like "(foobar)|(.*)", it will match the first alternative, and you can determine this by examining the results returned for the two capture groups.

How about this for an idea: Use the length of your regular expression: length("fooba.") > length(".*"), so "fooba." is more specific...
However, it depends on where the regular expressions come from and how precise you need to be as "fo.*|.*ba" would be longer than "fooba.", so the solution will not always work.

What you're asking for isn't really a property of regular expressions.
Create an enum that measures "closeness", and create a class that will hold a given regex, and a closeness value. This requires you to determine which regex is considered "more close" than another.
Instantiate your various classes, and let them loose on your code, and compare the matched objects, letting the "most closeness" one rise to the top.
pseudo-code, without actually comparing anything, or resembling any sane language:
enum Closeness
    Exact
    PrettyClose
    Decent
    NotSoClose
    WayOff
    CouldBeAnything
mune
class RegexCloser
    property Closeness Close()
    property String Regex()
ssalc
var foo = new RegexCloser(Closeness := Exact, Regex := "foobar")
var bar = new RegexCloser(Closeness := CouldBeAnything, Regex := ".*")
var target = "foobar";
if Regex.Match(target, foo)
    print String.Format("foo {0}", foo.Closeness)
fi
if Regex.Match(target, bar)
    print String.Format("bar {0}", bar.Closeness)
fi
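The same idea as a runnable sketch in Python. The names Closeness, RankedPattern and closest_match are hypothetical and exist only for this example, and the ranking of each pattern is still assigned by hand, exactly as in the pseudo-code.

```python
import re
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, Optional


class Closeness(IntEnum):
    """Hand-assigned specificity levels; a lower value means a tighter pattern."""
    EXACT = 0
    PRETTY_CLOSE = 1
    DECENT = 2
    NOT_SO_CLOSE = 3
    WAY_OFF = 4
    COULD_BE_ANYTHING = 5


@dataclass(frozen=True)
class RankedPattern:
    regex: str
    closeness: Closeness


def closest_match(target: str, patterns: Iterable[RankedPattern]) -> Optional[RankedPattern]:
    """Return the matching pattern with the best (lowest) closeness, or None."""
    matching = [p for p in patterns if re.fullmatch(p.regex, target)]
    return min(matching, key=lambda p: p.closeness, default=None)


if __name__ == "__main__":
    candidates = [
        RankedPattern(r".*", Closeness.COULD_BE_ANYTHING),
        RankedPattern(r"fooba.", Closeness.PRETTY_CLOSE),
    ]
    print(closest_match("foobar", candidates))
```

This does not measure anything about the regexes themselves; it only makes the manual ranking explicit so the tightest matching pattern can be picked.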
