How do I perform a replacement using a formatted string from a regex capture group? - regex

I am doing multiple replacements at once using the regex crate:
extern crate regex;
use regex::{Captures, Regex};
fn transform(string: &str) {
let rgx = Regex::new(r"(\n)|(/\w+)").unwrap();
let res = rgx.replace_all(string, |caps: &Captures| {
if caps.get(1).is_some() {
return " ";
}
match caps.get(2).map(|m: regex::Match| m.as_str()) {
Some(z) => return "nope", // how to return formatted z instead?
None => (),
}
unreachable!();
});
println!("{}", res);
}
fn main() {
transform("no errors");
transform("big\nbad\n/string");
}
Output as expected:
no errors
big bad nope
Instead of "nope", I would like to return z formatted in some way instead. format! doesn't seem like it can be used here due to String / lifetime issues:
match caps.get(2).map(|m: regex::Match| m.as_str()) {
Some(z) => return format!("cmd: {}", z),
None => (),
}
error[E0308]: mismatched types
--> src/main.rs:12:31
|
12 | Some(z) => return format!("cmd: {}", z),
| ^^^^^^^^^^^^^^^^^^^^^ expected &str, found struct `std::string::String`
|
= note: expected type `&str`
found type `std::string::String`
= note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
What should be done instead?

Note in the error message:
expected &str
It expects a &str because that's the first type returned by your closure:
return " ";
A closure / function can have only one return type, not two.
The simplest fix is to return a String in both cases:
let res = rgx.replace_all(string, |caps: &Captures| {
if caps.get(1).is_some() {
return String::from(" ");
}
let m = caps.get(2).unwrap();
format!("cmd: {}", m.as_str())
});
To be slightly more efficient, you can avoid the String allocation for the space character:
use std::borrow::Cow;
let res = rgx.replace_all(string, |caps: &Captures| {
if caps.get(1).is_some() {
return Cow::from(" ");
}
let m = caps.get(2).unwrap();
Cow::from(format!("cmd: {}", m.as_str()))
});
I've also replaced the match, with its => () arm and the trailing unreachable!, with the shorter unwrap.
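As a minimal sketch of why Cow helps here, the snippet below uses only the standard library and leaves the regex crate out. The function replacement and its parameters are hypothetical stand-ins for the closure body: newline plays the role of "capture group 1 matched", and cmd is the text of capture group 2.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for the closure body (not part of the regex crate).
/// Both branches produce the same type, so the single-return-type rule holds.
fn replacement(newline: bool, cmd: &str) -> Cow<'static, str> {
    if newline {
        // Borrowed variant: points at a string literal, no heap allocation.
        Cow::Borrowed(" ")
    } else {
        // Owned variant: wraps the freshly formatted String.
        Cow::Owned(format!("cmd: {}", cmd))
    }
}
```

Calling replacement(true, "") gives a borrowed " ", while replacement(false, "/string") gives an owned "cmd: /string". Either value dereferences to &str, so the caller does not need to care which variant it received.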
See also:
Cannot use `replace_all` from the regex crate: expected (), found String
Using str and String interchangably
Return local String as a slice (&str)

Related

Validating the query parameter and parsing it using regular expression

I am new to regex. Can you please tell me how to parse a query parameter that supports all the combinations below?
(ParamName=Operator:ParamValue) is the basic unit of my query parameter value. Units are separated with ; (AND) or , (OR), and I want to group them within parentheses, as in the example below.
Ex: http://<host:port>/get?search=(date=gt:2020-02-06T00:00:00.000Z;(name=eq:Test,department=co:Prod))
Here the date should be greater than 2020-02-06, and either name equals Test or department contains Prod.
How can I parse these query parameters? Please suggest.
Thanks, Vijay
So, I wrote a solution in JavaScript, but it should be adaptable in other languages as well, with a bit of research.
It's quite a bit of code, but what you're looking to achieve is not super easy!
So here's the code below. It's thoroughly commented, but if there is something you don't understand, ask away, and I'll be happy to answer you :)
//
// The first 2 regexes match a parameter, which looks like date=gt:2020-02-06T00:00:00.000Z for example.
// The difference between those 2 is that the 1st one has **named capture groups**.
// For example '(?<operator>...)' is a capture group named 'operator'.
// This will come in handy in the code, to keep things clean
//
const RX_NAMED_PARAMETER = /(?:(?<param>\w+)=(?<operator>\w+):(?<value>[\w-:.]+))/
const parameter = "((\\w+)=(\\w+):([\\w-:.]+)|(true|false))"
//
// The 3rd regex matches an operation between 2 parameters
//
const RX_OPERATION = new RegExp(`\\((?<param1>${parameter})(?:(?<and_or>[,;])(?<param2>${parameter}))?\\)`, '');
// '---------.---------' '-------.------' '----------.---------'
// 1st parameter AND or OR 2nd parameter
const my_data = {
date: new Date(2000, 1, 1), // months are 0-based, so this is 1 February 2000
name: 'Joey',
department: 'Production'
}
/**
* This function compares the 2 elements and returns true if the first one is greater.
* The elements might be dates, numbers, or anything that can be compared.
* The elements **need** to be of the same type
*/
function isGreaterThan(elem1, elem2) {
if (elem1 instanceof Date) {
const date = new Date(elem2).getTime();
if (isNaN(date))
throw new Error(`${elem2} - Not a valid date`);
return elem1.getTime() > date;
}
if (typeof elem1 === 'number') {
const num = Number(elem2);
if (isNaN(num))
throw new Error(`${elem2} - Not a number`);
return elem1 > num;
}
return elem1 > elem2;
}
/**
* Makes an operation as you defined them in your
* post, you might want to change that to suit your needs
*/
function operate(param, operator, value) {
if (!(param in my_data))
throw new Error(`${param} - Invalid parameter!`);
switch (operator) {
case 'eq':
return my_data[param] == value;
case 'co':
return my_data[param].includes(value);
case 'lt':
// "Not greater than", so this is also true when the values are equal
return !isGreaterThan(my_data[param], value);
case 'gt':
return isGreaterThan(my_data[param], value);
default:
throw new Error(`${operator} - Unsupported operation`);
}
}
/**
* This parses the URL, and returns a boolean
*/
function parseUri(uri) {
let finalResult;
// As long as there are operations (of the form <param1><; or ,><param2>) on the URL
while (RX_OPERATION.test(uri)) {
// We replace the 1st operation by the result of this operation ("true" or "false")
uri = uri.replace(RX_OPERATION, rawOperation => {
// As long as there are parameters in the operations (e.g. "name=eq:Bob")
while (RX_NAMED_PARAMETER.test(rawOperation)) {
// We replace the 1st parameter by its value ("true" or "false")
rawOperation = rawOperation.replace(RX_NAMED_PARAMETER, rawParameter => {
const res = RX_NAMED_PARAMETER.exec(rawParameter);
return '' + operate(
res.groups.param,
res.groups.operator,
res.groups.value,
);
// The "res.groups.xxx" syntax is made possible by the
// named capture groups. See the top of the file.
});
}
// At this point, the rawOperation should look like
// (true,false) or (false;false) for example
const res = RX_OPERATION.exec(rawOperation);
let operation;
if (res.groups.param2 === undefined)
operation = res.groups.param1; // In case this is an isolated operation
else
operation = res.groups.param1 + ({',': ' || ', ';': ' && '}[res.groups.and_or]) + res.groups.param2;
finalResult = eval(operation);
return '' + finalResult;
});
}
return finalResult;
}
let res;
res = parseUri("http://,host:port>/get?search=(date=gt:2020-02-06T00:00:00.000Z;(name=eq:Test,department=co:Prod))");
console.log(res);
res = parseUri("http://,host:port>/get?search=(date=lt:2020-02-06T00:00:00.000Z)");
console.log(res);

How do I get the X window class given a window ID with rust-xcb?

I'm trying to use rust-xcb to get a window's class given a window ID.
fn get_class(conn: &xcb::Connection, id: &i32) {
let window: xcb::xproto::Window = *id as u32;
let class_prop: xcb::xproto::Atom = 67; // XCB_ATOM_WM_CLASS from xproto.h
let cookie = xcb::xproto::get_property(&conn, false, window, class_prop, 0, 0, 2);
match cookie.get_reply() {
Ok(reply) => {
let x: &[std::os::raw::c_void] = reply.value();
println!("reply is {:?}", x[0]);
}
Err(err) => println!("err {:?}", err),
}
}
The documentation is kind of sparse and hasn't been incredibly helpful, though I did find this bit about the GetPropertyReply and the xcb_get_property_reply_t it wraps.
I looked at this answer in JavaScript but I don't know what the ctypes equivalent in Rust is. I tried just casting the &[c_void] as a &str or String:
...
Ok(reply) => {
let len = reply.value_len() as usize;
let buf = reply.value() as &str;
println!("{}", buf.slice_unchecked(0, len)); // this seems redundant
}
...
but it returns
error: non-scalar cast: `&[_]` as `&str`
I tried casting the &[c_void] as a &[u8] and then collecting the Vec into a String, which sort of works:
...
Ok(reply) => {
let value : &[u8] = reply.value();
let buf : String = value.into_iter().map(|i| *i as char).collect();
println!("\t{:?}", buf);
}
...
but I'm now getting weird results. For example, when I use xprop on Chrome I see "google-chrome", but my code only shows "google-c", and "roxterm" shows up as "roxterm\u{0}". I'm guessing "\u{0}" is something Unicode related, but I'm not sure, and I don't know why the value is being truncated either. Maybe I have to check the reply again?
Here's my updated function:
fn get_class(conn: &Connection, id: &i32) -> String {
let window: xproto::Window = *id as u32;
let long_length: u32 = 8;
let mut long_offset: u32 = 0;
let mut buf = Vec::new();
loop {
let cookie = xproto::get_property(
&conn,
false,
window,
xproto::ATOM_WM_CLASS,
xproto::ATOM_STRING,
long_offset,
long_length,
);
match cookie.get_reply() {
Ok(reply) => {
let value: &[u8] = reply.value();
buf.extend_from_slice(value);
match reply.bytes_after() {
0 => break,
_ => {
let len = reply.value_len();
long_offset += len / 4;
}
}
}
Err(err) => {
println!("{:?}", err);
break;
}
}
}
let result = String::from_utf8(buf).unwrap();
let results: Vec<&str> = result.split('\0').collect();
results[0].to_string()
}
There were three main parts to this question:
I put xproto::get_property() in a loop so I could check reply.bytes_after() and accordingly adjust long_offset. I think with an appropriate long_length there will usually only be one read, but just being safe.
As @peter-hall said, converting &[u8] -> String should be done using String::from_utf8, which needs a Vec; so I declare let mut buf = Vec::new(), call buf.extend_from_slice inside the loop, and then create the result string with String::from_utf8(buf).unwrap().
According to this random page WM_CLASS is actually two consecutive null-terminated strings, so I split the result by \0 and grab the first value.
I might've just been looking in the wrong place, but xcb has absolutely terrible documentation.
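The last point can be sketched without any X connection. This minimal example assumes the raw property value has already been read into a byte slice and that it holds ASCII or UTF-8 text. The helper parse_wm_class is hypothetical and not part of rust-xcb; it returns both strings (instance name and class name) instead of only the first.

```rust
/// Splits a raw WM_CLASS value into (instance, class).
/// Hypothetical helper for illustration; not part of rust-xcb.
fn parse_wm_class(raw: &[u8]) -> Option<(String, String)> {
    // Each part ends with a NUL byte, so splitting on 0 yields
    // [instance, class, ""] for a well-formed value.
    let mut parts = raw.split(|&b| b == 0);
    let instance = parts.next()?;
    // A truncated value with no NUL separator has no second part.
    let class = parts.next()?;
    Some((
        // Assumption: the bytes are ASCII/UTF-8; invalid bytes are replaced.
        String::from_utf8_lossy(instance).into_owned(),
        String::from_utf8_lossy(class).into_owned(),
    ))
}
```

For the bytes b"roxterm\0Roxterm\0" this returns the pair ("roxterm", "Roxterm"), and for a truncated value such as b"google-c" it returns None, which makes a short read easy to notice.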

How to minimize optionals

I want to input a number as a string and am using readLine, which returns a String?. Then I want to convert that inputted String to an Int, and that conversion returns an Int?. If either optional is nil, print an error; otherwise, use the Int. The following code works, but there has to be a better way. Any ideas?
print ("Enter number: ", terminator:"")
let number = readLine ()
if number != nil && Int (number!) != nil
{
let anInt = Int (number!)!
}
else
{
print ("Input Error")
}
You can combine unwrapping the readLine result and checking that the conversion to Int succeeded into a single guard statement, e.g.,
guard let string = readLine(), let number = Int(string) else {
print("input error")
return
}
// use `number`, which is an `Int`, here
You can obviously spin that around if you want:
if let string = readLine(), let number = Int(string) {
// use `number`, which is an `Int`, here
} else {
print("input error")
}

Convert sweet.js argument into string

How would you create a string from an argument to a sweet.js macro? For example:
let foo = macro {
rule {
$name
} => {
console.log('$name', $name);
}
}
var x = 42;
foo x
Will output:
console.log(x, x);
When I'd prefer it to output:
console.log('x', x);
So the first argument has quotes around it.
You can use a case macro:
let foo = macro {
case {_
$name
} => {
letstx $name_str = [makeValue(unwrapSyntax(#{$name}), #{here})];
return #{
console.log($name_str, $name);
}
}
}
var x = 42;
foo x
The basic idea is that you make a new string token (via makeValue) using the string value of the identifier matched by $name (unwrapSyntax gives us the value of the given syntax object; in the case of an identifier, it is the identifier string). Then letstx allows us to bind our newly created syntax object for use inside the #{} template.

Parsing Perl regex with golang

http://play.golang.org/p/GM0SWo0qGs
This is my code and playground.
func insert_comma(input_num int) string {
temp_str := strconv.Itoa(input_num)
var validID = regexp.MustCompile(`\B(?=(\d{3})+$)`)
return validID.ReplaceAllString(temp_str, ",")
}
func main() {
fmt.Println(insert_comma(1000000000))
}
Basically, my desired output is 1,000,000,000.
The regular expression works in JavaScript, but I do not know how to make this Perl regex work in Go. I would greatly appreciate any help. Thanks,
Since Go's regexp package uses RE2 syntax, which does not support lookahead assertions, I'm providing a different algorithm with no regexp:
Perl code:
sub insert_comma {
my $x=shift;
my $l=length($x);
# Compare against the current length: $x grows with every inserted comma
for (my $i=$l%3==0?3:$l%3;$i<length($x);$i+=3) {
substr($x,$i++,0)=',';
}
return $x;
}
print insert_comma(1000000000);
Go code: Disclaimer: I have zero experience with Go, so bear with me if I have errors and feel free to edit my post!
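package main
import (
"fmt"
"strconv"
"strings"
)
// insert_comma adds thousands separators; it is intended for non-negative numbers.
func insert_comma(input_num int) string {
temp_str := strconv.Itoa(input_num)
var result []string
// i is the index of the next digit that must be preceded by a comma
i := len(temp_str) % 3
if i == 0 {
i = 3
}
for index, element := range strings.Split(temp_str, "") {
if i == index {
result = append(result, ",")
i += 3
}
result = append(result, element)
}
return strings.Join(result, "")
}
func main() {
fmt.Println(insert_comma(1000000000))
}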
http://play.golang.org/p/7pvo7-3G-s