I have a string "foo bar baz", and would like to turn it into "foo\ bar\ baz". Short of the hand-hacking method (call split, then re-join with the appropriate separator), is there another way I can do this? Does something like a replace function exist in Phobos?
Yep, std.array.replace
import std.array, std.stdio;
void main()
{
replace("foo bar baz", " ", "\\ ").writeln();
}
I have this string:
(40.959953710949506, -74.18210638344726),(40.95891663745299, -74.10606039345703),(40.917472246121065, -74.09582940498359),(40.921752754230255, -74.16397897163398),(40.95248644043785, -74.21067086616523)
I need to grab the commas inside the parentheses for further processing, and I want the commas splitting the groups to remain.
Let's say I want to replace the target commas by FOO, the result should be:
(40.959953710949506 FOO -74.18210638344726),(40.95891663745299 FOO -74.10606039345703),(40.917472246121065 FOO -74.09582940498359),(40.921752754230255 FOO -74.16397897163398),(40.95248644043785 FOO -74.21067086616523)
I want a Regular Expression that is not language specific.
You can just use a negative lookbehind to find every , that is not preceded by a ), like this:
(?<!\)),
I don't want some language specific functions for this
The format of the above regex is not language specific as can be seen in the following Code Snippet or this regex101 snippet:
const x = '(40.959953710949506, -74.18210638344726),(40.95891663745299, -74.10606039345703),(40.917472246121065, -74.09582940498359),(40.921752754230255, -74.16397897163398),(40.95248644043785, -74.21067086616523)';
const rgx = /(?<!\)),/g;
console.log(x.replace(rgx, ' XXX'));
For example:
import re
s = "(40.959953710949506, -74.18210638344726),(40.95891663745299, -74.10606039345703),(40.917472246121065, -74.09582940498359),(40.921752754230255, -74.16397897163398),(40.95248644043785, -74.21067086616523)"
s = re.sub(r",(?=[^()]+\))", " FOO", s)
print(s)
# (40.959953710949506 FOO -74.18210638344726),(40.95891663745299 FOO -74.10606039345703),(40.917472246121065 FOO -74.09582940498359),(40.921752754230255 FOO -74.16397897163398),(40.95248644043785 FOO -74.21067086616523)
We use a positive lookahead to only replace commas where ) comes before ( ahead in the string.
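The two patterns suggested so far can be checked side by side. The following is a minimal Python sketch, assuming a shortened two-tuple input (the coordinates are made up for brevity); it applies the negative lookbehind and the positive lookahead to the same string.

```python
import re

# Shortened sample input, assumed for illustration only.
s = "(1.5, -2.5),(3.0, 4.0)"

# Negative lookbehind: a comma not immediately preceded by ")".
by_lookbehind = re.sub(r"(?<!\)),", " FOO", s)

# Positive lookahead: a comma followed by non-parenthesis characters, then ")".
by_lookahead = re.sub(r",(?=[^()]+\))", " FOO", s)

print(by_lookbehind)
print(by_lookahead)
```

Both calls give the same result on this input. The lookbehind depends on each closing parenthesis being directly followed by the separating comma; if whitespace could appear between them, the lookahead form is likely the safer choice.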
Use re.sub with a callback function:
import re
inp = "(40.959953710949506, -74.18210638344726),(40.95891663745299, -74.10606039345703),(40.917472246121065, -74.09582940498359),(40.921752754230255, -74.16397897163398),(40.95248644043785, -74.21067086616523)"
output = re.sub(r'\((-?\d+(?:\.\d+)?),\s*(-?\d+(?:\.\d+)?)\)', lambda m: r'(' + m.group(1) + r' FOO ' + m.group(2) + r')', inp)
print(output)
This prints:
(40.959953710949506 FOO -74.18210638344726),(40.95891663745299 FOO -74.10606039345703),(40.917472246121065 FOO -74.09582940498359),(40.921752754230255 FOO -74.16397897163398),(40.95248644043785 FOO -74.21067086616523)
The strategy here is to capture the two numbers in each tuple in separate groups. Then, we replace by connecting the two numbers with FOO instead of the original comma.
I'm trying to match the two characters after a specific character. The trailing values may contain the specified character, which is ok, but I also need to capture that specified character as the beginning of the next capture group.
This code should illustrate what I mean:
extern crate regex;
use regex::Regex;
pub fn main() {
let re = Regex::new("(a..)").unwrap();
let st = String::from("aba34jf baacdaab");
println!("String to match: {}", st);
for cap in re.captures_iter(&st) {
println!("{}", cap[1].to_string());
// Prints "aba" and "aac",
// Should print "aba", "a34", "aac", "acd", "aab"
}
}
How do I get overlapping captures without using look around (which the regex crate doesn't support in Rust)? Is there something similar to what is in Python (as mentioned here) but in Rust?
Edit:
Using onig as BurntSushi5 suggested, we get the following:
extern crate onig;
use onig::*;
pub fn main() {
let re = Regex::new("(?=(a.{2}))").unwrap();
let st = String::from("aba34jf baacdaab");
println!("String to match: {}", st);
for ch in re.find_iter(&st) {
print!("{} ", &st[ch.0..=ch.1+2]);
// aba a34 aac acd aab, as it should.
// but we have to know how long the capture is.
}
println!("");
}
Now the problem with this is that you have to know how long the match is, because the look ahead group doesn't capture. Is there a way to get the look ahead regex captured without knowing the length beforehand? How would we print it out if we had something like (?=(a.+)) as the regex?
You can't. Your only recourse is to either find a different approach entirely, or use a different regex engine that supports look-around like onig or pcre2.
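For comparison, here is how overlapping matches look in an engine that does support lookahead. This is a minimal sketch using Python's re module, not the Rust regex crate; the second input string is made up for illustration. Because the capture group sits inside the lookahead, the matched text can be read from the group without knowing its length in advance.

```python
import re

st = "aba34jf baacdaab"

# The lookahead itself matches the empty string, but the group inside it
# still records text, so findall returns the captured substrings.
fixed = re.findall(r"(?=(a..))", st)
print(fixed)

# Variable-length variant on a shorter, made-up input.
variable = re.findall(r"(?=(a.+))", "abab")
print(variable)
```

Each match is zero-width, so the scan advances one character at a time and can report captures that overlap.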
I found a solution, unfortunately not regex though:
pub fn main() {
print_char_matches ("aba34jf baacdaab", 'a', 2);
//aba a34 aac acd aab, as it should.
}
pub fn print_char_matches( st:&str, char_match:char, match_length:usize ) {
let chars:Vec<_> = st.char_indices().collect();
println!("String to match: {}", st);
for i in 0..chars.len().saturating_sub(match_length) { // saturating_sub avoids a usize underflow panic on short inputs
if chars[i].1 == char_match {
for j in 0..=match_length {
print!("{}", chars[i+j].1);
}
print!(" ");
}
}
println!("");
}
This is a bit more generalizable, but ASCII only. It matches the character provided plus the specified number of characters after it.
I have a string in which I need to replace text using a replacement vector, and I would like to use a regex for it. Is this possible?
txt='foo bar'
nchar(txt)
ix='foo'
gsub(ix,'bar', txt) #### output
gsub(pattern = '[^ix]', replacement = 'bar', txt)
Output desired is 'bar bar'
Here ix is the character vector; how to use it as the pattern in a regex is my real question.
We can use paste0 to join a string object with another string.
sub(paste0('^',ix), 'bar', txt)
#[1] "bar bar"
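NOTE: Using ^ inside [ ], i.e. '[^ix]', has a different meaning: it negates the character class, so the pattern matches any single character other than i or x.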
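The same distinction, between building the pattern from a variable's value and writing the variable's name inside the pattern, can be sketched in Python. This is an illustration in a different language, not R code; the variable names simply mirror the question.

```python
import re

txt = "foo bar"
ix = "foo"

# Build the pattern from the variable's value; re.escape guards against
# regex metacharacters that ix might contain.
anchored = re.sub("^" + re.escape(ix), "bar", txt)
print(anchored)

# By contrast, "[^ix]" is a negated character class: it matches every
# single character that is neither "i" nor "x".
negated = re.sub("[^ix]", "bar", txt)
print(negated)
```

The second call replaces all seven characters of the input one by one, which is why the literal '[^ix]' pattern does not give the desired output.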
I have a simple pattern that I am trying to do a find and replace on. It needs to replace all dashes with periods when they are surrounded by numbers.
Replace the dash in these:
3-54
32-11
111-4523mhz
Like so:
3.54
32.11
111.4523mhz
However, I do not want to replace the dash inside anything like these:
Example-One
A-Test
I have tried using the following:
preg_replace('/[0-9](-)[0-9]/', '.', $string);
However, this will replace the entire match instead of just the middle. How do you only replace a portion of the match?
preg_replace('/([0-9])-([0-9])/', '$1.$2', $string);
Should do the trick :)
Edit: some more explanation:
By using ( and ) in a regular expression, you create a group. That group can be used in the replacement. $1 gets replaced with the first matched group, $2 gets replaced with the second matched group and so on.
That means that if you would (just for example) change '$1.$2' to '$2.$1', the operation would swap the two numbers.
That isn't useful behavior in your case, but perhaps it helps you understand the principle better.
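The group mechanism described above can be tried out with a minimal sketch. It is written in Python rather than PHP, so the replacement uses \1 and \2 where preg_replace uses $1 and $2; the sample strings are the ones from the question.

```python
import re

samples = ["3-54", "32-11", "111-4523mhz", "Example-One", "A-Test"]

# \1 and \2 refer back to the first and second captured digit.
fixed = [re.sub(r"([0-9])-([0-9])", r"\1.\2", s) for s in samples]
print(fixed)

# Swapping the references swaps the two captured digits.
swapped = re.sub(r"([0-9])-([0-9])", r"\2.\1", "3-54")
print(swapped)
```

Only the digit on each side of the dash is captured, so the swap turns 3-54 into 5.34 rather than 54.3.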
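Depending on the regex implementation you're using, you can use lookaround assertions (a lookbehind and a lookahead), which check for the surrounding digits without making them part of the match: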
preg_replace('/(?<=[0-9])-(?=[0-9])/', '.', $string);
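One practical difference between the two approaches shows up when dashes are chained. The sketch below uses Python with a made-up input, 1-2-3; it assumes nothing beyond the two patterns already shown in this thread.

```python
import re

# Made-up input with two dashes sharing the middle digit.
chained = "1-2-3"

# Capturing groups consume the digits, so the "2" is used up by the first
# match and the second dash no longer has a digit available before it.
with_groups = re.sub(r"([0-9])-([0-9])", r"\1.\2", chained)
print(with_groups)

# Lookarounds only inspect the neighbours, so every dash is tested
# independently of the previous match.
with_lookarounds = re.sub(r"(?<=[0-9])-(?=[0-9])", ".", chained)
print(with_lookarounds)
```

For inputs like the ones in the question, where each dash has its own pair of digits, both forms behave the same.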
You can use back-references to retain the parts of the match you want to keep:
preg_replace('/([0-9])-([0-9])/', '$1.$2', $string);
Below are a couple of ways of achieving the same thing.
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
/**
* Created by Rituraj_Jain on 7/14/2017.
*/
public class Main {
private static final Pattern pattern = Pattern.compile("(\\S)-(\\S)");
private static final String input = "Test-TestBad-ManRituRaj-Jain";
public static void main(String[] args) {
Matcher matcher = pattern.matcher(input);
String s = matcher.replaceAll("$1 $2");
System.out.println(s);
System.out.println(input.replaceAll(pattern.pattern(), "$1 $2"));
}
public static void main1(String[] args) {
Matcher matcher = pattern.matcher(input);
StringBuffer finalString = new StringBuffer();
while(matcher.find()){
String replace = matcher.group().replace("-", " ");
matcher.appendReplacement(finalString, replace);
}
matcher.appendTail(finalString);
System.out.println(input);
System.out.println(finalString.toString());
}
}
I want to split a string like this:
abc//def//ghi
into a part before and after the first occurrence of //:
a: abc
b: //def//ghi
I'm currently using this regex:
(?<a>.*?)(?<b>//.*)
Which works fine so far.
However, sometimes the // is missing in the source string and obviously the regex fails to match. How is it possible to make the second group optional?
An input like abc should be matched to:
a: abc
b: (empty)
I tried (?<a>.*?)(?<b>//.*)? but that left me with lots of NULL results in Expresso so I guess it's the wrong idea.
Try a ^ at the beginning of your expression to match the beginning of the string and a $ at the end to match the end of the string (this will make the ungreedy match work).
^(?<a>.*?)(?<b>//.*)?$
A proof of Stevo3000's answer (Python):
import re
test_strings = ['abc//def//ghi', 'abc//def', 'abc']
regex = re.compile("(?P<a>.*?)(?P<b>//.*)?$")
for ts in test_strings:
match = regex.match(ts)
print('a:', match.group('a'), 'b:', match.group('b'))
a: abc b: //def//ghi
a: abc b: //def
a: abc b: None
Why use group matching at all? Why not just split by "//", either as a regex or a plain string?
use strict;
my $str = 'abc//def//ghi';
my $short = 'abc';
print "The first:\n";
my @groups = split(/\/\//, $str, 2);
foreach my $val (@groups) {
print "$val\n";
}
print "The second:\n";
@groups = split(/\/\//, $short, 2);
foreach my $val (@groups) {
print "$val\n";
}
gives
The first:
abc
def//ghi
The second:
abc
[EDIT: Fixed to return max 2 groups]
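A similar non-regex approach can be sketched in Python with str.partition, which splits on the first occurrence of a separator only. The helper name split_first is hypothetical; re-attaching the separator to the second part reproduces the a/b layout asked for in the question, including the empty b when // is missing.

```python
def split_first(s, sep="//"):
    """Split s at the first sep, keeping sep at the start of the second part."""
    # partition returns (head, sep, rest), or (s, "", "") if sep is absent.
    head, found, rest = s.partition(sep)
    return head, found + rest


print(split_first("abc//def//ghi"))
print(split_first("abc"))
```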