Is there a way to construct tests in Rust to throw a warning when not exhaustive?

Is there a way to construct tests in Rust to throw a warning when not exhaustive? Certainly, I don't expect a solution for this in general, but I'm looking for a solution that would work when the arguments to a function are enumerated types. I'd like to check that all combinations are used, in the same way a match statement checks that all combinations are covered. For example, consider the code:
// Terrible numerical type
#[derive(Debug,PartialEq)]
pub enum Num {
Int(i32),
Float(f32),
}
// Mathematical operation on this terrible type
pub fn myadd(x: Num, y: Num) -> Num {
match (x,y) {
(Num::Int(x),Num::Int(y)) => Num::Int(x+y),
(Num::Int(x),Num::Float(y)) => Num::Float((x as f32) + y),
(Num::Float(x),Num::Int(y)) => Num::Float(x+(y as f32)),
(Num::Float(x),Num::Float(y)) => Num::Float(x+y),
}
}
// Add testing
#[cfg(test)]
mod test{
use super::*;
#[test]
fn int_int() {
assert_eq!(myadd(Num::Int(1),Num::Int(2)),Num::Int(3));
}
#[test]
fn float_int() {
assert_eq!(myadd(Num::Float(1.),Num::Int(2)),Num::Float(3.));
}
#[test]
fn int_float() {
assert_eq!(myadd(Num::Int(1),Num::Float(2.)),Num::Float(3.));
}
}
Here, we're missing the test float_float. I'd like a way to throw a warning to denote this test is missing. If we forgot the Float,Float case in pattern matching, we'd get the error:
error[E0004]: non-exhaustive patterns: `(Float(_), Float(_))` not covered
--> src/lib.rs:10:11
|
10 | match (x,y) {
| ^^^^^ pattern `(Float(_), Float(_))` not covered
|
= help: ensure that all possible cases are being handled, possibly by adding wildcards or more match arms
I'm trying to get something similar for the testing combinations. In case it matters, I don't care if all of the tests are combined to a single function rather than split into four different tests. I didn't know if there was a trick using pattern matching to achieve this or some other mechanism.

This is by no means an ideal solution, but you can do this by using a pretty short macro:
macro_rules! exhaustive_tests {
(
$group_name:ident = match $type:ty {
$( $test_name:ident = $pattern:pat => $body:block )*
}
) => {
paste::item!{
#[allow(warnings)]
fn [< exhaustive_check_ $group_name >]() {
let expr: $type = unreachable!();
match expr {
$( $pattern => unreachable!(), )*
}
}
}
$(
#[test] fn $test_name() { $body }
)*
};
}
The macro will let you declare a group of tests, with a pattern for each test that must be exhaustive to compile.
exhaustive_tests!{
my_tests = match (Num, Num) {
int_int = (Num::Int(_), Num::Int(_)) => {
assert_eq!(myadd(Num::Int(1),Num::Int(2)),Num::Int(3));
}
float_int = (Num::Float(_), Num::Int(_)) => {
assert_eq!(myadd(Num::Float(1.),Num::Int(2)),Num::Float(3.));
}
int_float = (Num::Int(_), Num::Float(_)) => {
assert_eq!(myadd(Num::Int(1),Num::Float(2.)),Num::Float(3.));
}
}
}
In addition to generating each test (#[test] fn $test_name() { $body }), it will create a single function exhaustive_check_... with a match statement with the pattern for each test. The match statement must cover all patterns (as usual), otherwise compilation will fail:
// generated by macro
fn exhaustive_check_my_tests() {
let expr: (Num, Num) = unreachable!();
match expr {
(Num::Int(_), Num::Int(_)) => unreachable!(),
(Num::Float(_), Num::Int(_)) => unreachable!(),
(Num::Int(_), Num::Float(_)) => unreachable!(),
// error[E0004]: non-exhaustive patterns: `(Float(_), Float(_))` not covered
}
}
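If pulling in the paste crate is not an option, the same idea can be sketched with a single runner function. This is only a sketch under assumptions: the macro name exhaustive_cases and the runner name run_myadd_cases are hypothetical, and the individual test names are lost because all cases run inside one function. The nested, never-called exhaustive_check function carries the match, so a missing combination still fails to compile with E0004.

```rust
#[derive(Debug, PartialEq)]
pub enum Num {
    Int(i32),
    Float(f32),
}

pub fn myadd(x: Num, y: Num) -> Num {
    match (x, y) {
        (Num::Int(x), Num::Int(y)) => Num::Int(x + y),
        (Num::Int(x), Num::Float(y)) => Num::Float((x as f32) + y),
        (Num::Float(x), Num::Int(y)) => Num::Float(x + (y as f32)),
        (Num::Float(x), Num::Float(y)) => Num::Float(x + y),
    }
}

// Hypothetical macro (not from the answer above): generates one runner
// function instead of one #[test] per case, so no identifier pasting is needed.
macro_rules! exhaustive_cases {
    (
        $runner:ident = match $ty:ty {
            $( $pattern:pat => $body:block )*
        }
    ) => {
        fn $runner() -> usize {
            // Never called: it exists only so the compiler checks that the
            // patterns cover every value of the matched type.
            #[allow(warnings)]
            fn exhaustive_check(value: $ty) {
                match value {
                    $( $pattern => {} )*
                }
            }
            // Run every case body in order and count how many ran.
            let mut ran = 0;
            $(
                $body;
                ran += 1;
            )*
            ran
        }
    };
}

exhaustive_cases! {
    run_myadd_cases = match (Num, Num) {
        (Num::Int(_), Num::Int(_)) => {
            assert_eq!(myadd(Num::Int(1), Num::Int(2)), Num::Int(3));
        }
        (Num::Float(_), Num::Int(_)) => {
            assert_eq!(myadd(Num::Float(1.), Num::Int(2)), Num::Float(3.));
        }
        (Num::Int(_), Num::Float(_)) => {
            assert_eq!(myadd(Num::Int(1), Num::Float(2.)), Num::Float(3.));
        }
        (Num::Float(_), Num::Float(_)) => {
            assert_eq!(myadd(Num::Float(1.), Num::Float(2.)), Num::Float(3.));
        }
    }
}
```

Calling run_myadd_cases() from a single #[test] function runs all four cases; deleting any arm from the invocation should turn into a compile error rather than a silently missing test.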

Related

How do I improve my code to get integer values from regex named capture groups?

My Rust code parses a log file and accumulates some information:
use regex::Regex;
fn parse(line: &str) {
let re_str = concat!(
r"^\s+(?P<qrw1>\d+)\|(?P<qrw2>\d+)",//qrw 0|0
r"\s+(?P<arw1>\d+)\|(?P<arw2>\d+)",//arw 34|118
);
let re = Regex::new(re_str).unwrap();
match re.captures(line) {
Some(caps) => {
let qrw1 = caps.name("qrw1").unwrap().as_str().parse::<i32>().unwrap();
let qrw2 = caps.name("qrw2").unwrap().as_str().parse::<i32>().unwrap();
let arw1 = caps.name("arw1").unwrap().as_str().parse::<i32>().unwrap();
let arw2 = caps.name("arw2").unwrap().as_str().parse::<i32>().unwrap();
}
None => todo!(),
}
}
This works as expected, but I think those long chained calls which I created to get integer values of regex capture groups are a bit ugly. How do I make them shorter/nicer?
One thing you could do is extract the parsing into a closure internal_parse:
fn parse(line: &str) -> Option<(i32, i32, i32, i32)> {
let re_str = concat!(
r"^\s+(?P<qrw1>\d+)\|(?P<qrw2>\d+)",//qrw 0|0
r"\s+(?P<arw1>\d+)\|(?P<arw2>\d+)",//arw 34|118
);
let re = Regex::new(re_str).unwrap();
match re.captures(line) {
Some(caps) => {
let internal_parse = |key| {
caps.name(key).unwrap().as_str().parse::<i32>().unwrap()
};
let qrw1 = internal_parse("qrw1");
let qrw2 = internal_parse("qrw2");
let arw1 = internal_parse("arw1");
let arw2 = internal_parse("arw2");
Some((qrw1, qrw2, arw1, arw2))
}
None => None,
}
}
However, you should keep in mind that parse::<i32> may fail. (Consider e.g. the string " 00|45 57|4894444444444444444444444 ".)
You could also try to solve this problem with a parser combinator library (the crates nom, pest or combine come to mind) that traverses the string and produces the i32s directly, so that you do not have to parse manually after matching via regex.
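To illustrate the point about parse::<i32> failing, here is a minimal sketch that propagates the failure instead of unwrapping. To stay self-contained it does not use the regex crate at all: it assumes the two fields are separated by whitespace and splits them by hand, so it is looser than the original pattern (for example, it does not require leading whitespace). The helper names parse_pair and parse_line are hypothetical.

```rust
/// Parses one "a|b" field into two integers.
/// Returns None if the separator is missing or a number does not fit in i32.
fn parse_pair(field: &str) -> Option<(i32, i32)> {
    let (a, b) = field.split_once('|')?;
    Some((a.parse().ok()?, b.parse().ok()?))
}

/// Parses a line such as "  0|0  34|118" into (qrw1, qrw2, arw1, arw2).
/// Any missing field or failed integer parse yields None instead of a panic.
fn parse_line(line: &str) -> Option<(i32, i32, i32, i32)> {
    let mut fields = line.split_whitespace();
    let (qrw1, qrw2) = parse_pair(fields.next()?)?;
    let (arw1, arw2) = parse_pair(fields.next()?)?;
    Some((qrw1, qrw2, arw1, arw2))
}
```

With this shape, the overflowing input from above returns None rather than panicking, and the caller decides how to report the bad line.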

Regex: is there a oneliner for this?

I want to search inside multiple big text files (200MB each) as fast as possible. I am using the command line tool ripgrep and I want to call it only once.
In the following string:
***foo***bar***baz***foo***bar***baz
(*** stands for a different type and number of characters.)
I want to match baz, but only if it follows the first occurrence of foo***bar***.
So in ***foo***bar***baz***foo***bar***baz it matches the first baz
and in ***foo***bar***qux***foo***bar***baz it shall match nothing.
I tried several solutions, but none of them worked. Can this be done with a single regular expression?
I'm pretty sure that a regex is overkill in this case. A simple series of find calls can do the job:
fn find_baz(input: &str) -> Option<usize> {
const FOO: &str = "foo";
const BAR: &str = "bar";
// 1: we find the occurrences of "foo", "bar" and "baz":
let foo = input.find(FOO)?;
let bar = input[foo..].find(BAR).map(|i| i + foo)?;
let baz = input[bar..].find("baz").map(|i| i + bar)?;
// 2: we verify that there is no other "foo" and "bar" between:
input[bar..baz]
.find(FOO)
.map(|i| i + bar)
.and_then(|foo| input[foo..baz].find(BAR))
.xor(Some(baz))
}
#[test]
fn found_it() {
assert_eq!(Some(15), find_baz("***foo***bar***baz***foo***bar***baz"));
}
#[test]
fn found_it_2() {
assert_eq!(Some(27), find_baz("***foo***bar***qux***foo***baz"));
}
#[test]
fn not_found() {
assert_eq!(None, find_baz("***foo***bar***qux***foo***bar***baz"));
}
#[test]
fn not_found_2() {
assert_eq!(None, find_baz("***foo***bar***qux***foo***"));
}
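The xor in step 2 is compact but easy to misread. A sketch of the same check written with an explicit boolean may be easier to follow; the name find_baz_explicit is hypothetical, and the behavior is intended to match find_baz on the four test inputs above.

```rust
fn find_baz_explicit(input: &str) -> Option<usize> {
    // 1: locate the first "foo", the first "bar" after it,
    //    and the first "baz" after that "bar" (byte offsets into `input`).
    let foo = input.find("foo")?;
    let bar = foo + input[foo..].find("bar")?;
    let baz = bar + input[bar..].find("baz")?;

    // 2: reject the match if another "foo" followed by "bar"
    //    appears between the first "bar" and the "baz".
    let between = &input[bar..baz];
    let second_pair = between
        .find("foo")
        .map_or(false, |f| between[f..].contains("bar"));

    if second_pair {
        None
    } else {
        Some(baz)
    }
}
```

The returned value is a byte offset, so it can be used directly to slice the input, for example &input[offset..offset + 3].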

Scala pattern matching with undefined number of parameters

I am developing a string parser in Scala. I am facing an issue where the number of parameters to match is not always the same.
To be more clear, my code is as follows:
line match {
case regex(first, second, third, ...) => // sometimes 2 arguments, sometimes more
// do stuff
case _ =>
println("Wrong parsing")
}
As you can see, I need to define my arguments dynamically. Do you have any idea how to achieve this? I tried to use a list, but I had no success.
PS : my regex is dynamically generated
UPDATE : thanks to sheunis' answer I found the solution.
line match {
case regex(args @ _*) =>
println(args(0))
println(args(1))
println(args(2))
// ... as many as you have
case _ => println("Wrong parsing")
}
case class Regex(args: String*)
val test = Regex("a", "b", "c")
test match {
case Regex(args @ _*) => for (arg <- args) println(arg)
case _ => println("Wrong parsing")
}

Scala Regex Parser throws weird error

I have a simple RegexParser that matches {key}={value} repeating for several times:
object CommandOptionsParser extends RegexParsers {
private val key: Parser[String] = "[^= ]+".r
private val value: Parser[String] = "[^ ]*".r
val pair: Parser[Option[(String, Option[String])]] =
(key ~ ("=".r ~> value).?).? ^^ {
case None => None
case Some(k ~ v) => Some(k.trim -> v.map(_.trim))
}
val pairs: Parser[Map[String, Option[String]]] = phrase(repsep(pair, whiteSpace)) ^^ {
case v =>
Map(v.flatten: _*)
}
def apply(input: String): Map[String, Option[String]] = parseAll(pairs, input) match {
case Success(plan, _) => plan
case x => sys.error(x.toString)
}
}
However, the matching of value seems to fail on more than one capturing group (even though the regex doesn't limit it). When I try to match against "token=abc again=abc", I get the following error:
[1.11] failure: string matching regex `\z' expected but `a' found
token=abc again=abc'
^
Why does RegexParsers behave so strangely?
The fix for your unexpected behavior is quite easy, just change the value of skipWhitespace:
object CommandOptionsParser extends RegexParsers {
override val skipWhitespace = false
// ... rest of the parser unchanged
}
From description of RegexParsers:
The parsing methods call the method skipWhitespace (defaults to
true) and, if true, skip any whitespace before each parser is
called.
So here is what happened: your first pair was matched, then the whitespace was skipped automatically, and because repsep could no longer find a whitespace separator, it assumed the repetition was over. phrase then expected the end of input, hence the "\z" expected message.
Also, I can't help but note that the whole Parser approach seems overcomplicated for such a simple task; simple regexps would suffice.
UPD: Also your parsers can be a bit simpler:
val pair: Parser[Option[(String, Option[String])]] =
(key ~ ("=" ~> value).?).? ^^ (_.map {case (k ~ v) => k.trim -> v.map(_.trim)})
val pairs: Parser[Map[String, Option[String]]] = phrase(repsep(pair, whiteSpace)) ^^
{ l => Map(l.flatten: _*)}

How to pattern match using regular expression in Scala?

I would like to be able to find a match between the first letter of a word, and one of the letters in a group such as "ABC". In pseudocode, this might look something like:
case Process(word) =>
word.firstLetter match {
case([a-c][A-C]) =>
case _ =>
}
}
But how do I grab the first letter in Scala instead of Java? How do I express the regular expression properly? Is it possible to do this within a case class?
You can do this because regular expressions define extractors but you need to define the regex pattern first. I don't have access to a Scala REPL to test this but something like this should work.
val Pattern = "([a-cA-C])".r
word.firstLetter match {
case Pattern(c) => // c is bound to the capture group here
case _ =>
}
Since version 2.10, one can use Scala's string interpolation feature:
implicit class RegexOps(sc: StringContext) {
def r = new util.matching.Regex(sc.parts.mkString, sc.parts.tail.map(_ => "x"): _*)
}
scala> "123" match { case r"\d+" => true case _ => false }
res34: Boolean = true
Even better one can bind regular expression groups:
scala> "123" match { case r"(\d+)$d" => d.toInt case _ => 0 }
res36: Int = 123
scala> "10+15" match { case r"(\d\d)${first}\+(\d\d)${second}" => first.toInt+second.toInt case _ => 0 }
res38: Int = 25
It is also possible to set more detailed binding mechanisms:
scala> object Doubler { def unapply(s: String) = Some(s.toInt*2) }
defined module Doubler
scala> "10" match { case r"(\d\d)${Doubler(d)}" => d case _ => 0 }
res40: Int = 20
scala> object isPositive { def unapply(s: String) = s.toInt >= 0 }
defined module isPositive
scala> "10" match { case r"(\d\d)${d @ isPositive()}" => d.toInt case _ => 0 }
res56: Int = 10
An impressive example of what's possible with Dynamic is shown in the blog post Introduction to Type Dynamic:
object T {
class RegexpExtractor(params: List[String]) {
def unapplySeq(str: String) =
params.headOption flatMap (_.r unapplySeq str)
}
class StartsWithExtractor(params: List[String]) {
def unapply(str: String) =
params.headOption filter (str startsWith _) map (_ => str)
}
class MapExtractor(keys: List[String]) {
def unapplySeq[T](map: Map[String, T]) =
Some(keys.map(map get _))
}
import scala.language.dynamics
class ExtractorParams(params: List[String]) extends Dynamic {
val Map = new MapExtractor(params)
val StartsWith = new StartsWithExtractor(params)
val Regexp = new RegexpExtractor(params)
def selectDynamic(name: String) =
new ExtractorParams(params :+ name)
}
object p extends ExtractorParams(Nil)
Map("firstName" -> "John", "lastName" -> "Doe") match {
case p.firstName.lastName.Map(
Some(p.Jo.StartsWith(fn)),
Some(p.`.*(\\w)$`.Regexp(lastChar))) =>
println(s"Match! $fn ...$lastChar")
case _ => println("nope")
}
}
As delnan pointed out, the match keyword in Scala has nothing to do with regexes. To find out whether a string matches a regex, you can use the String.matches method. To find out whether a string starts with an a, b or c in lower or upper case, the regex would look like this:
word.matches("[a-cA-C].*")
You can read this regex as "one of the characters a, b, c, A, B or C followed by anything" (. means "any character" and * means "zero or more times", so ".*" is any string).
To expand a little on Andrew's answer: The fact that regular expressions define extractors can be used to decompose the substrings matched by the regex very nicely using Scala's pattern matching, e.g.:
val Process = """([a-cA-C])([^\s]+)""".r // define first, rest is non-space
for (p <- Process findAllIn "aha bah Cah dah") p match {
case Process("b", _) => println("first: 'b', some rest")
case Process(_, rest) => println("some first, rest: " + rest)
// etc.
}
String.matches is the way to do pattern matching in the regex sense.
But as a handy aside, word.firstLetter in real Scala code looks like:
word(0)
Scala treats Strings as sequences of Chars, so if for some reason you wanted to explicitly get the first character of the String and match it, you could use something like this:
"Cat"(0).toString.matches("[a-cA-C]")
res10: Boolean = true
I'm not proposing this as the general way to do regex pattern matching, but it's in line with your proposed approach to first find the first character of a String and then match it against a regex.
EDIT:
To be clear, the way I would do this is, as others have said:
"Cat".matches("^[a-cA-C].*")
res14: Boolean = true
Just wanted to show an example as close as possible to your initial pseudocode. Cheers!
First, note that a regular expression can be used on its own. Here is an example:
import scala.util.matching.Regex
val pattern = "Scala".r // <=> val pattern = new Regex("Scala")
val str = "Scala is very cool"
val result = pattern findFirstIn str
result match {
case Some(v) => println(v)
case _ =>
} // output: Scala
Second, combining regular expressions with pattern matching is very powerful. Here is a simple example:
val date = """(\d\d\d\d)-(\d\d)-(\d\d)""".r
"2014-11-20" match {
case date(year, month, day) => "hello"
} // output: hello
Regular expressions are already powerful on their own; Scala's pattern matching makes them more convenient to use. More examples are in the Scala documentation: http://www.scala-lang.org/files/archive/api/current/index.html#scala.util.matching.Regex
Note that the approach from @AndrewMyers's answer matches the entire string against the regular expression, which has the effect of anchoring it at both ends of the string with ^ and $. Example:
scala> val MY_RE = "(foo|bar).*".r
MY_RE: scala.util.matching.Regex = (foo|bar).*
scala> val result = "foo123" match { case MY_RE(m) => m; case _ => "No match" }
result: String = foo
scala> val result = "baz123" match { case MY_RE(m) => m; case _ => "No match" }
result: String = No match
scala> val result = "abcfoo123" match { case MY_RE(m) => m; case _ => "No match" }
result: String = No match
And with no .* at the end:
scala> val MY_RE2 = "(foo|bar)".r
MY_RE2: scala.util.matching.Regex = (foo|bar)
scala> val result = "foo123" match { case MY_RE2(m) => m; case _ => "No match" }
result: String = No match