I ran cargo clippy to get some feedback on my code, and clippy told me that I can somehow collapse an if let.
Here is the exact "warning":
warning: this `if let` can be collapsed into the outer `if let`
--> src\main.rs:107:21
|
107 | / if let Move::Normal { piece, from, to } = turn {
108 | | if i8::abs(from.1 - to.1) == 2 && piece.getColor() != *color && to.0 == x {
109 | | let offsetX = x - to.0;
110 | |
... |
116 | | }
117 | | }
| |_____________________^
I thought I could maybe just append the inner if using &&, but then I get an error (`let` expressions in this position are experimental; I am using Rust 1.57.0, not nightly).
Any idea what clippy wants me to do?
Edit:
the outer if let is itself again inside another if let:
if let Some(turn) = board.getLastMove() {
And it seems you can indeed combine them like so:
if let Some(Move::Normal { piece, from, to }) = board.getLastMove() {
In my opinion, the clippy lint should include the line above, as it is otherwise, at least for me, somewhat confusing.
Edit 2:
Turns out I just can't read; below the warning listed above was some more information telling me exactly what to do.
= note: `#[warn(clippy::collapsible_match)]` on by default
help: the outer pattern can be modified to include the inner pattern
--> src\main.rs:126:29
|
126 | if let Some(turn) = board.getLastMove() {
| ^^^^ replace this binding
127 | if let Move::Normal { piece, from, to } = turn {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ with this pattern
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#collapsible_match
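For anyone who finds this later, here is a minimal, self-contained version of the collapse on a toy enum (the names are made up, not my actual chess types):

#[allow(dead_code)]
enum Move {
    Normal { from: i8, to: i8 },
    Castle,
}

fn last_move() -> Option<Move> {
    Some(Move::Normal { from: 1, to: 3 })
}

fn main() {
    // Before: two nested `if let`s; clippy flags this as collapsible_match.
    if let Some(m) = last_move() {
        if let Move::Normal { from, to } = m {
            println!("normal move: {} -> {}", from, to);
        }
    }

    // After: the inner pattern replaces the outer binding.
    if let Some(Move::Normal { from, to }) = last_move() {
        println!("normal move: {} -> {}", from, to);
    }
}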
I'm learning Antlr. At this point, I'm writing a little stack-based language as part of my learning process -- think PostScript or Forth. An RPN language. For instance:
10 20 mul
This would push 10 and 20 on the stack and then perform a multiply, which pops two values, multiplies them, and pushes 200. I'm using the visitor pattern. And I find myself writing some code that's kind of insane. There has to be a better way.
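To make the stack discipline concrete, here's a toy C++ illustration of that example (just an illustration, not my actual interpreter):

#include <iostream>
#include <stack>

int main() {
    std::stack<int> s;
    // "10 20" pushes the two operands.
    s.push(10);
    s.push(20);
    // "mul" pops two values, multiplies them, and pushes the result.
    int b = s.top(); s.pop();
    int a = s.top(); s.pop();
    s.push(a * b);
    std::cout << s.top() << "\n"; // prints 200
}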
Here's a section of my WaveParser.g4 file:
any_operator:
    value_operator |
    stack_operator |
    logic_operator |
    math_operator |
    flow_control_operator;
value_operator:
    BIND | DEF
;
stack_operator:
    DUP |
    EXCH |
    POP |
    COPY |
    ROLL |
    INDEX |
    CLEAR |
    COUNT
;
BIND is just the bind keyword, etc. So my visitor has this method:
antlrcpp::Any WaveVisitor::visitAny_operator(Parser::Any_operatorContext *ctx);
And now here's where I'm getting to the very ugly code I'm writing, which leads to the question.
Value::Operator op = Value::Operator::NO_OP;

WaveParser::Value_operatorContext *valueOp = ctx->value_operator();
WaveParser::Stack_operatorContext *stackOp = ctx->stack_operator();
WaveParser::Logic_operatorContext *logicOp = ctx->logic_operator();
WaveParser::Math_operatorContext *mathOp = ctx->math_operator();
WaveParser::Flow_control_operatorContext *flowOp = ctx->flow_control_operator();

if (valueOp) {
    if (valueOp->BIND()) {
        op = Value::Operator::BIND;
    }
    else if (valueOp->DEF()) {
        op = Value::Operator::DEF;
    }
}
else if (stackOp) {
    if (stackOp->DUP()) {
        op = Value::Operator::DUP;
    }
    ...
}
...
I'm supporting approximately 50 operators, and it's insane to have this series of if statements to figure out which operator this is. There must be a better way to do this. I couldn't find a field on the context that mapped to something I could use as a hashmap key.
I don't know if I should give every one of my operators a separate rule and use the corresponding method in my visitor, or what else I'm missing.
Is there a better way?
With ANTLR, it's usually very helpful to label components of your rules, as well as the high-level alternatives.
If part of a parser rule can only be one thing with a single type, the default accessors are usually just fine. But if you have several alternatives that are essentially alternatives for the "same thing", or you reference the same sub-rule more than once in a parser rule and want to differentiate them, it's pretty handy to give them names. (Once you start doing this and see the impact on the Context classes, it'll become pretty obvious where they provide value.)
Also, when rules have multiple top-level alternatives, it's very handy to give each of them a label. This will cause ANTLR to generate a separate Context class for each alternative, instead of dumping everything from every alternative into a single class.
(making some stuff up just to get a valid compile)
grammar WaveParser
;
any_operator
: value_operator # val_op
| stack_operator # stack_op
| logic_operator # logic_op
| math_operator # math_op
| flow_control_operator # flow_op
;
value_operator: op = ( BIND | DEF);
stack_operator
: op = (
DUP
| EXCH
| POP
| COPY
| ROLL
| INDEX
| CLEAR
| COUNT
)
;
logic_operator: op = (AND | OR);
math_operator: op = (ADD | SUB);
flow_control_operator: op = (FLOW1 | FLOW2);
AND: 'and';
OR: 'or';
ADD: '+';
SUB: '-';
FLOW1: '>>';
FLOW2: '<<';
BIND: 'bind';
DEF: 'def';
DUP: 'dup';
EXCH: 'exch';
POP: 'pop';
COPY: 'copy';
ROLL: 'roll';
INDEX: 'index';
CLEAR: 'clear';
COUNT: 'count';
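With the op label in place, the visitor can read the matched token directly instead of probing each accessor. A sketch of the idea (assuming your Value::Operator enum and the accessor names the C++ target generates; adjust to your actual generated code):

antlrcpp::Any WaveVisitor::visitValue_operator(WaveParser::Value_operatorContext *ctx) {
    // `op` is the token matched by the `op = (...)` label.
    switch (ctx->op->getType()) {
        case WaveParser::BIND: return Value::Operator::BIND;
        case WaveParser::DEF:  return Value::Operator::DEF;
        default:               return Value::Operator::NO_OP;
    }
}

Since token types are plain integers, you could also build one std::unordered_map<size_t, Value::Operator> and share it across all five operator rules instead of writing five switches.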
I am trying a daily programmer problem to shuffle a list of arguments and output them.
I'm not sure if this is the correct approach but it sounded like a good idea: remove the element from the args vector so it doesn't get repeated, and insert it into the result vector.
extern crate rand; // 0.7.3

use std::io;
use std::cmp::Ordering;
use std::env;
use rand::Rng;

fn main() {
    let mut args: Vec<_> = env::args().collect();
    let mut result: Vec<_> = Vec::with_capacity(args.capacity());
    if args.len() > 1 {
        println!("There are(is) {} argument(s)", args.len() - 1)
    }
    for x in args.iter().skip(1) {
        let mut n = rand::thread_rng().gen_range(1, args.len());
        result.push(&args.swap_remove(n));
    }
    for y in result.iter() {
        println!("{}", y);
    }
}
I get the error:
error[E0716]: temporary value dropped while borrowed
--> src/main.rs:18:22
|
18 | result.push(&args.swap_remove(n));
| ^^^^^^^^^^^^^^^^^^^ - temporary value is freed at the end of this statement
| |
| creates a temporary which is freed while still in use
...
21 | for y in result.iter() {
| ------ borrow later used here
|
= note: consider using a `let` binding to create a longer lived value
Older compilers said:
error[E0597]: borrowed value does not live long enough
--> src/main.rs:18:42
|
18 | result.push(&args.swap_remove(n));
| ------------------- ^ temporary value dropped here while still borrowed
| |
| temporary value created here
...
24 | }
| - temporary value needs to live until here
|
= note: consider using a `let` binding to increase its lifetime
Let's start with a smaller example. This is called a Minimal, Reproducible Example, and is very valuable both for you as a programmer and for us to answer your question. Additionally, it can run on the Rust Playground, which is convenient.
fn main() {
    let mut args = vec!["a".to_string()];
    let mut result = vec![];

    for _ in args.iter() {
        let n = args.len() - 1; // Pretend this is a random index
        result.push(&args.swap_remove(n));
    }

    for y in result.iter() {
        println!("{}", y);
    }
}
The problem arises because when you call swap_remove, the item is moved out of the vector and given to you - the ownership is transferred. You then take a reference to the item and try to store that reference in the result vector. The problem is that the item is dropped after the loop iteration has ended because nothing owns it. If you were allowed to take that reference, it would be a dangling reference, one that points to invalid memory. Using that reference could cause a crash, so Rust prevents it.
The immediate fix is to not take a reference, but instead transfer ownership from one vector to the other. Something like:
for _ in args.iter() {
    let n = args.len() - 1; // Pretend this is a random index
    result.push(args.swap_remove(n));
}
The problem with this is that you will get
error[E0502]: cannot borrow `args` as mutable because it is also borrowed as immutable
--> src/main.rs:7:21
|
5 | for _ in args.iter() {
| -----------
| |
| immutable borrow occurs here
| immutable borrow later used here
6 | let n = args.len() - 1;
7 | result.push(args.swap_remove(n));
| ^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
See the args.iter()? That creates an iterator that refers to the vector. If you changed the vector, the iterator would become invalid and could allow access to an item that's no longer there, another potential crash that Rust prevents.
I'm not making any claim that this is a good way to do it, but one solution would be to iterate while there are still items:
while !args.is_empty() {
    let n = args.len() - 1; // Pretend this is a random index
    result.push(args.swap_remove(n));
}
I'd solve the overall problem by using shuffle:
use rand::seq::SliceRandom; // 0.8.3
use std::env;

fn main() {
    let mut args: Vec<_> = env::args().skip(1).collect();
    args.shuffle(&mut rand::thread_rng());

    for y in &args {
        println!("{}", y);
    }
}
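And for completeness, the original remove-and-push approach can be made to work end to end once ownership is transferred; a sketch, assuming rand 0.8's range-based gen_range:

use rand::Rng; // 0.8
use std::env;

fn main() {
    let mut args: Vec<_> = env::args().skip(1).collect();
    let mut result = Vec::with_capacity(args.len());
    let mut rng = rand::thread_rng();

    // Move items out one at a time; no references into `args` are kept.
    while !args.is_empty() {
        let n = rng.gen_range(0..args.len());
        result.push(args.swap_remove(n));
    }

    for y in &result {
        println!("{}", y);
    }
}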
I have a DataFrame inside of a function:
using DataFrames

myservs = DataFrame(serverName = ["elmo", "bigBird", "Oscar", "gRover", "BERT"],
                    ipAddress = ["12.345.6.7", "12.345.6.8", "12.345.6.9", "12.345.6.10", "12.345.6.11"])
myservs
5x2 DataFrame
| Row | serverName | ipAddress |
|-----|------------|---------------|
| 1 | "elmo" | "12.345.6.7" |
| 2 | "bigBird" | "12.345.6.8" |
| 3 | "Oscar" | "12.345.6.9" |
| 4 | "gRover" | "12.345.6.10" |
| 5 | "BERT" | "12.345.6.11" |
How can I write the function to take a single parameter called server, case-insensitively match it against the myservs[:serverName] DataArray, and return the matching row's ipAddress?
In R this can be done by using
myservs$ipAddress[grep(server, myservs$serverName, ignore.case = T)]
I don't want it to matter if someone uses ElMo or Elmo as the server, or if the serverName is saved as elmo or ELMO.
I referenced how to accomplish the task in R and tried to do it using the DataFrames pkg, but only because I'm coming from R and am just learning Julia. I asked my coworkers a lot of questions, and the following is what we came up with:
This task is much cleaner if I stop thinking in terms of R vectors; Julia runs plenty fast iterating through a loop. Even so, looping wouldn't be the best solution here. I was told to look into Dicts (check here for an example). Dict(), zip(), haskey(), and get() blew my mind; these have many applications.
My solution doesn't even need the DataFrames pkg, but instead uses Julia's Matrix and Array data representations. By using let, we keep the global environment clutter-free, and the server name/IP list stays hidden from those who only run the function.
In the sample code I'm recreating the server matrix every time, but in practice I'll have a permission-restricted delimited file that gets read each time. This is OK for now since the delimited files are small, but it may not be the most efficient approach.
# ONLY ALLOW THE FUNCTION TO BE SEEN IN THE GLOBAL ENVIRONMENT
let
    global myIP

    # SERVER MATRIX
    myservers = ["elmo" "12.345.6.7"; "bigBird" "12.345.6.8";
                 "Oscar" "12.345.6.9"; "gRover" "12.345.6.10";
                 "BERT" "12.345.6.11"]

    # SERVER DICT
    servDict = Dict(zip(pmap(lowercase, myservers[:, 1]), myservers[:, 2]))

    # GET SERVER IP FUNCTION: INPUT = SERVER NAME; OUTPUT = IP ADDRESS
    function myIP(servername)
        sn = lowercase(servername)
        get(servDict, sn, "That name isn't in the server list.")
    end
end
# Test it out
myIP("SLIMEY")
#>"That name isn't in the server list."
myIP("elMo")
#>"12.345.6.7"
Here's one way:
julia> using DataFrames
julia> myservs = DataFrame(serverName = ["elmo", "bigBird", "Oscar", "gRover", "BERT"],
                           ipAddress = ["12.345.6.7", "12.345.6.8", "12.345.6.9", "12.345.6.10", "12.345.6.11"])
5x2 DataFrames.DataFrame
| Row | serverName | ipAddress |
|-----|------------|---------------|
| 1 | "elmo" | "12.345.6.7" |
| 2 | "bigBird" | "12.345.6.8" |
| 3 | "Oscar" | "12.345.6.9" |
| 4 | "gRover" | "12.345.6.10" |
| 5 | "BERT" | "12.345.6.11" |
julia> grep{T <: String}(pat::String, dat::DataArray{T}, opts::String = "") = Bool[isna(d) ? false : ismatch(Regex(pat, opts), d) for d in dat]
grep (generic function with 2 methods)
julia> myservs[:ipAddress][grep("bigbird", myservs[:serverName], "i")]
1-element DataArrays.DataArray{ASCIIString,1}:
"12.345.6.8"
EDIT
This grep works faster on my platform.
julia> function grep{T <: String}(pat::String, dat::DataArray{T}, opts::String = "")
           myreg = Regex(pat, opts)
           return convert(Array{Bool}, map(d -> isna(d) ? false : ismatch(myreg, d), dat))
       end
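(Note for readers on current Julia: DataArrays and the grep{T <: String} method syntax above are long gone. A rough modern equivalent, assuming DataFrames 1.x, might look like:)

using DataFrames

myservs = DataFrame(serverName = ["elmo", "bigBird", "Oscar", "gRover", "BERT"],
                    ipAddress = ["12.345.6.7", "12.345.6.8", "12.345.6.9", "12.345.6.10", "12.345.6.11"])

# Case-insensitive regex match on the name column, like the R snippet in the question.
function myIP(df, server)
    hits = df.ipAddress[occursin.(Regex(server, "i"), df.serverName)]
    isempty(hits) ? "That name isn't in the server list." : first(hits)
end

myIP(myservs, "elMo")   # => "12.345.6.7"
myIP(myservs, "SLIMEY") # => "That name isn't in the server list."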
This is how I am doing my unit test in Groovy.
public void testSomeMethod() {
    doSomething(1, 2, 3, 4);     // this is the first test
    doSomething(11, 22, 33, 44); // this is the second test
}

private void doSomething(a, b, c, d) {
    assertEquals(a, actual)
}
Basically I am calling doSomething twice with different values under the same test.
It might not be a good way to test, but I just want to try it out.
The problem is that if the first test fails, the second doesn't get executed.
Is there a way I can force it to print the failure message and move on to the next one?
This is a good time for you to use Spock, where you can do data-driven testing and the second case will not be gated by the first. You get more flexibility, like what you've asked for, and more.
Eventually, the test would look something like:
void "test something"(){
when:
def result = doSomething(a, b, c, d)
then:
result == expectedResult
where:
a | b | c | d || expectedResult
1 | 2 | 3 | 4 || 100
11 | 22 | 33 | 44 || 1000
}
private doSomething(a, b, c, d){...}
You can find more details in the Spock framework documentation, and/or have a look at these questions.
BTW, the above test example can be simplified to make it groovier. ;)
void "test something"(){
expect:
result == doSomething(a, b, c, d)
where:
a | b | c | d || result
1 | 2 | 3 | 4 || 100
11 | 22 | 33 | 44 || 1000
}
Although I agree with the advice to use Spock (great framework), you can also use JUnit parameterized tests if you don't want to include additional dependencies; see the sketch below.
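A minimal JUnit 4 sketch of the same table (doSomething and the expected values are placeholders taken from the question, not a real implementation):

import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;
import java.util.Arrays;
import java.util.Collection;
import static org.junit.Assert.assertEquals;

@RunWith(Parameterized.class)
public class SomeMethodTest {
    private final int a, b, c, d, expected;

    public SomeMethodTest(int a, int b, int c, int d, int expected) {
        this.a = a; this.b = b; this.c = c; this.d = d; this.expected = expected;
    }

    @Parameters
    public static Collection<Object[]> data() {
        return Arrays.asList(new Object[][] {
            { 1, 2, 3, 4, 100 },
            { 11, 22, 33, 44, 1000 },
        });
    }

    @Test
    public void testSomething() {
        // Each row runs as its own test, so one failure doesn't stop the rest.
        assertEquals(expected, doSomething(a, b, c, d));
    }

    private int doSomething(int a, int b, int c, int d) {
        return 0; // placeholder for the real implementation
    }
}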
I'm in the midst of creating my school project for our programming class.
I'm making a Medical Care system console app, and I want to implement this kind of feature:
When a user enters what they are feeling (like feeling sick, having a sore throat, etc.), I want a C text-analysis library to help me analyze and parse the info given by the user (which has been saved into a string) and determine the medicine to be given. (I'll be the one to decide which medicine is for which; I just want the library to help me analyze the user's input.)
Thanks!
A good example would be this one:
http://www.codeproject.com/Articles/32175/Lucene-Net-Text-Analysis
Unfortunately it's for C#
Update:
Is there any C library that can help me, even just with simple tokenizing and indexing of the words? I know I could do it with brute-force coding, but a reliable and stable API would be better. Thanks!
Analyzing natural language text is one of the most difficult problems you could possibly pick.
Most likely your solution will come down to simply looking for keywords like "sick", "sore throat", etc., which can be accomplished with a simple dictionary of keywords and results.
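To make that concrete, here is a minimal sketch of the keyword-dictionary approach; the symptom/medicine pairs are made-up placeholders, and the matching is naive substring search:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

struct rule { const char *keyword; const char *medicine; };

static const struct rule rules[] = {
    { "sore throat", "lozenges" },
    { "sick",        "rest and fluids" },
    { "headache",    "paracetamol" },
};

/* Lowercase a copy of the input so matching is case-insensitive. */
static void lower(char *dst, const char *src, size_t n) {
    size_t i;
    for (i = 0; i + 1 < n && src[i]; i++)
        dst[i] = (char)tolower((unsigned char)src[i]);
    dst[i] = '\0';
}

int main(void) {
    const char *input = "I have a sore throat and feel sick";
    char buf[256];
    lower(buf, input, sizeof buf);

    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++)
        if (strstr(buf, rules[i].keyword))
            printf("matched '%s' -> suggest %s\n",
                   rules[i].keyword, rules[i].medicine);
    return 0;
}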
As far as truly "understanding" what the user typed though - good luck with that.
EDIT:
A few technologies worth pointing out:
Regarding your question about a lexer - you can easily use flex if you feel you need something like that. Probably faster (in terms of execution speed AND development speed) than trying to code the multi-token search by hand.
On Mac there is a very cool framework called Latent Semantic Mapping. There is a WWDC 2011 video on it - and it's awesome. You basically feed it a ton of example inputs and train it on what result you want. It may be as close as you're going to get. It is C-based.
http://en.wikipedia.org/wiki/Latent_semantic_mapping
https://developer.apple.com/library/mac/#documentation/TextFonts/Reference/LatentSemanticMapping/index.html
This is what wakkerbot makes of your question. (The scores are low, because wakkerbot/Hubert is all Dutch.)
But the tokeniser seems to do fine on English:
[ 6]: | 29/ 27| 4.792 | weight |
------|--------+----------+---------+--------+
0 11| 15645 | 10/ 9 | 0.15469 | 0.692 |'to'
1 0| 19416 | 10/10 | 0.12504 | 0.646 |'i'
2 10| 10483 | 4/ 3 | 0.10030 | 0.84 |'and'
3 3| 3292 | 5/ 5 | 0.09403 | 1.4 |'be'
4 7| 27363 | 3/ 3 | 0.06511 | 1.4 |'one'
5 12| 36317 | 3/ 3 | 0.06511 | 8.52 |'this'
6 2| 35466 | 2/ 2 | 0.05746 | 10.7 |'just'
7 4| 12258 | 2/ 2 | 0.05301 | 0.56 |'info'
8 18| 81898 | 2/ 2 | 0.04532 | 20.1 |'ll'
9 20| 67009 | 3/ 3 | 0.04124 | 48.8 |'text'
10 13| 70575 | 2/ 2 | 0.03897 | 156 |'give'
11 19| 16806 | 2/ 2 | 0.03426 | 1.13 |'c'
12 14| 5992 | 2/ 2 | 0.03376 | 0.914 |'for'
13 1| 3940 | 1/ 1 | 0.02561 | 1.12 |'my'
14 5| 7804 | 1/ 1 | 0.02561 | 2.94 |'class'
15 17| 7920 | 1/ 1 | 0.02561 | 7.35 |'feeling'
16 15| 20429 | 3/ 2 | 0.01055 | 3.93 |'com'
17 16| 36544 | 2/ 1 | 0.00433 | 4.28 |'www'
To support my lex/nonlex tokeniser argument, this is the relevant part of wakkerbot's tokeniser:
for (pos = 0; str[pos]; ) {
    switch (*sp) {
    case T_INIT: /* initial */
        if (myisalpha(str[pos])) { *sp = T_WORD; pos++; continue; }
        if (myisalnum(str[pos])) { *sp = T_NUM; pos++; continue; }
        /* if (strspn(str+pos, "-+")) { *sp = T_NUM; pos++; continue; } */
        *sp = T_ANY; continue;
        break;
    case T_ANY: /* either whitespace or meuk: eat it */
        pos += strspn(str+pos, " \t\n\r\f\b");
        if (pos) { *sp = T_INIT; return pos; }
        *sp = T_MEUK; continue;
        break;
    case T_WORD: /* inside word */
        while (myisalnum(str[pos])) pos++;
        if (str[pos] == '\0') { *sp = T_INIT; return pos; }
        if (str[pos] == '.') { *sp = T_WORDDOT; pos++; continue; }
        *sp = T_INIT; return pos;
...
As you can see, most of the time will be spent in the line while (myisalnum(str[pos])) pos++;, which catches all the words. myisalnum() is a static function, which will probably be inlined. (There are similar tight loops for numbers and whitespace, of course.)
UPDATE: for completeness, the definition for myisalpha():
static int myisalpha(int ch)
{
    /* with <ctype.h>, this is a table lookup, too */
    int ret = isalpha(ch);
    if (ret) return ret;
    /* don't parse, just assume valid utf8 */
    if (ch == -1) return 0;
    if (ch & 0x80) return 1;
    return 0;
}
Yes, there's a C++ data science toolkit called MeTA (ModErn Text Analysis toolkit). Here are its features:
text tokenization, including deep semantic features like parse trees
inverted and forward indexes with compression and various caching strategies
a collection of ranking functions for searching the indexes
topic models
classification algorithms
graph algorithms
language models
CRF implementation (POS-tagging, shallow parsing)
wrappers for liblinear and libsvm (including libsvm dataset parsers)
UTF8 support for analysis on various languages
multithreaded algorithms
It comes with tests and examples. In your case, I think statistical classifiers, like Bayes, will do the job perfectly, but you can also do manual classification. It was the best fit for my personal case. Hope it helps.
Here's the link https://meta-toolkit.org/
Best Regards,