How do I find and replace substrings using regular expressions in MATLAB?

I am using MATLAB's regular expression function regexprep() to find and replace strings in a .c file.
I am trying to take the following strings in the .c file:
var[12] = powmacro(var[11],"name11",var[25],"name25");
var[13] = divmacro(var[23],"name23",var[12],"name12");
...and convert them to the following format:
var[12] = var[11]^var[25]
var[13] = var[23]/var[12]
Any ideas on how I can do this?

Here is some example code:
% regular expression patterns
re_sp = '\s*';
re_str = [re_sp '"[^"]*"' re_sp];
re_var = [re_sp '(var\[\d+\])' re_sp];
re_pow = [re_var '=' re_sp 'powmacro' re_sp '\(' ...
re_var ',' re_str ',' re_var ',' re_str '\)' re_sp ';'];
re_div = [re_var '=' re_sp 'divmacro' re_sp '\(' ...
re_var ',' re_str ',' re_var ',' re_str '\)' re_sp ';'];
% replace patterns in strings
str = {'var[12] = powmacro(var[11],"name11",var[25],"name25");' ;
'var[13] = divmacro(var[23],"name23",var[12],"name12");'};
str = regexprep(str, re_pow, '$1 = $2 ^ $3');
str = regexprep(str, re_div, '$1 = $2 / $3');
disp(str)
The result:
>> str
str =
'var[12] = var[11] ^ var[25]'
'var[13] = var[23] / var[12]'
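For comparison, the same capture-group rewrite can be sketched in Python. This is only an illustration under stated assumptions: the names VAR, STR, MACRO, OPS and rewrite_macros are hypothetical, and the two macros are folded into one pattern with a lookup table instead of two separate regexprep calls.

```python
import re

# Building blocks mirroring re_var and re_str from the MATLAB code above.
VAR = r"\s*(var\[\d+\])\s*"
STR = r'\s*"[^"]*"\s*'

# One pattern for both macros; group 2 captures the macro name.
MACRO = re.compile(
    VAR + r"=\s*(powmacro|divmacro)\s*\("
    + VAR + "," + STR + "," + VAR + "," + STR
    + r"\)\s*;"
)

# Hypothetical lookup table from macro name to operator.
OPS = {"powmacro": "^", "divmacro": "/"}


def rewrite_macros(line):
    """Rewrite 'x = macro(a,"na",b,"nb");' as 'x = a OP b'."""
    return MACRO.sub(
        lambda m: f"{m.group(1)} = {m.group(3)} {OPS[m.group(2)]} {m.group(4)}",
        line,
    )


if __name__ == "__main__":
    print(rewrite_macros('var[12] = powmacro(var[11],"name11",var[25],"name25");'))
    print(rewrite_macros('var[13] = divmacro(var[23],"name23",var[12],"name12");'))
```

Lines that do not match the pattern are returned unchanged, so the function can be applied to every line of the .c file.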

Related

Could regex be used in this PowerShell script?

I have the following code, used to remove spaces and other characters from a string $m, and replace them with periods ('.'):
Function CleanupMessage([string]$m) {
$m = $m.Replace(' ', ".") # spaces to dot
$m = $m.Replace(",", ".") # commas to dot
$m = $m.Replace([char]10, ".") # linefeeds to dot
while ($m.Contains("..")) {
$m = $m.Replace("..",".") # multiple dots to dot
}
return $m
}
It works OK, but it seems like a lot of code that could probably be simplified. I've read that regex can work with patterns, but I'm not clear whether that would work in this case. Any hints?
Use a regex character class:
Function CleanupMessage([string]$m) {
return $m -replace '[ ,.\n]+', '.'
}
EXPLANATION
--------------------------------------------------------------------------------
[ ,.\n]+ any character of: ' ', ',', '.', '\n' (newline)
(1 or more times (matching the most amount
possible))
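The character-class idea carries over to other regex engines. A minimal Python sketch follows; cleanup_message is a hypothetical name chosen to mirror the PowerShell function.

```python
import re


def cleanup_message(message):
    """Collapse any run of spaces, commas, dots or newlines into one dot."""
    return re.sub(r"[ ,.\n]+", ".", message)


if __name__ == "__main__":
    print(cleanup_message("qwe asd,zxc\nufc..omg"))
```

Because the class is followed by +, a mixed run such as ", " is replaced by a single dot in one pass, so no loop for repeated dots is needed.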
Solution for this case:
cls
$str = "qwe asd,zxc`nufc..omg"
Function CleanupMessage([String]$m)
{
$m -replace "( |,|`n|\.\.)", '.'
}
CleanupMessage $str
# qwe.asd.zxc.ufc.omg
A more general solution: list everything you want to replace in $toReplace:
cls
$str = "qwe asd,zxc`nufc..omg+kfc*fox"
Function CleanupMessage([String]$m)
{
$toReplace = " ", ",", "`n", "..", "+", "fox"
.{
$d = New-Guid
$regex = [Regex]::Escape($toReplace-join$d).replace($d,"|")
$m -replace $regex, '.'
}
}
CleanupMessage $str
# qwe.asd.zxc.ufc.omg.kfc*.
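The same "escape each literal, then join with |" idea can be sketched in Python with re.escape. The names TO_REPLACE and cleanup_message are hypothetical, and sorting the alternatives longest-first is an extra precaution added here, not something the PowerShell version does.

```python
import re

# Literal substrings to replace with a dot (same list as in the example above).
TO_REPLACE = [" ", ",", "\n", "..", "+", "fox"]

# Escape each literal so characters such as '+' and '.' lose their regex
# meaning, then join into one alternation. Longest-first ordering makes
# sure a longer literal wins over a shorter one that is its prefix.
PATTERN = re.compile(
    "|".join(re.escape(t) for t in sorted(TO_REPLACE, key=len, reverse=True))
)


def cleanup_message(message):
    """Replace every listed literal with a single dot."""
    return PATTERN.sub(".", message)


if __name__ == "__main__":
    print(cleanup_message("qwe asd,zxc\nufc..omg+kfc*fox"))
```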

Handling different escaping sequences?

I'm using ANTLR with Presto grammar in order to parse SQL queries.
This is the original string definition I've used to parse queries:
STRING
: '\'' ( '\\' .
| ~[\\'] // match anything other than \ and '
| '\'\'' // match ''
)*
'\''
;
This worked ok for most queries until I saw queries with different escaping rules. For example:
select
table1(replace(replace(some_col,'\\'',''),'\"' ,'')) as features
from table1
So I've modified my String definition and now it looks like:
STRING
: '\'' ( '\\' .
| '\\\\' . {HelperUtils.isNeedSpecialEscaping(this)}? // match \ followed by any char
| ~[\\'] // match anything other than \ and '
| '\'\'' // match ''
)*
'\''
;
However, this won't work for the query mentioned above as I'm getting
'\\'',''),'
as a single string.
The predicate returns True for the following query.
Any idea how I can handle this query as well?
Thanks,
Nir.
In the end I was able to solve it. This is the expression I was using:
STRING
: '\'' ( '\\\\' . {HelperUtils.isNeedSpecialEscaping(this)}?
| '\\' (~[\\] | . {!HelperUtils.isNeedSpecialEscaping(this)}?)
| ~[\\'] // match anything other than \ and '
| '\'\'' // match ''
)*
'\''
;
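The difference between the original rule and the final one can be approximated with ordinary regular expressions. The sketch below is not ANTLR: it assumes the predicate HelperUtils.isNeedSpecialEscaping always returns true, and the names QUERY, ORIGINAL and SPECIAL are made up for illustration.

```python
import re

# The query fragment from the question, kept verbatim in a raw string.
QUERY = r"""replace(replace(some_col,'\\'',''),'\"' ,'')"""

# Approximation of the original rule:
#   backslash + any char | anything other than \ and ' | ''
ORIGINAL = re.compile(r"'(?:\\.|[^\\']|'')*'")

# Approximation of the final rule with the predicate assumed true:
#   two backslashes + any char is tried first, then
#   backslash + any non-backslash char, then the two remaining alternatives.
SPECIAL = re.compile(r"'(?:\\\\.|\\[^\\]|[^\\']|'')*'")

first_original = ORIGINAL.search(QUERY).group(0)
tokens_special = SPECIAL.findall(QUERY)

if __name__ == "__main__":
    print(first_original)   # the over-long string reported in the question
    print(tokens_special)   # the four separate string literals
```

With the original alternatives, the two quotes after the escaped backslash are read as an escaped quote, so the string runs on; trying the two-backslash alternative first closes the string where the query intends.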
grammar Question;
sql
@init {System.out.println("Question last update 2352");}
: replace+ EOF
;
replace
: REPLACE '(' expr ')'
;
expr
: ( replace | ID ) ',' STRING ',' STRING
;
REPLACE : 'replace' DIGIT? ;
ID : [a-zA-Z0-9_]+ ;
DIGIT : [0-9] ;
STRING : '\'' '\\\\\'' '\'' // '\\''
| '\'' '\'\'' '\'' // ''''
| '\'' ~[\\']* '\'\'' ~[\\']* '\'' // 'it is 8 o''clock'
| '\'' .*? '\'' ;
NL : '\r'? '\n' -> channel(HIDDEN) ;
WS : [ \t]+ -> channel(HIDDEN) ;
File input.txt (not having more examples, I can only guess) :
replace1(replace(some_col,'\\'',''),'\"' ,'')
replace2(some_col,'''','')
replace3(some_col,'abc\tdef\tghi','xyz')
replace4(some_col,'abc\ndef','xyz')
replace5(some_col,'it is 8 o''clock','8')
Execution :
$ alias a4='java -jar /usr/local/lib/antlr-4.9-complete.jar'
$ alias grun='java org.antlr.v4.gui.TestRig'
$ a4 Question.g4
$ javac Question*.java
$ grun Question sql -tokens input.txt
[@0,0:7='replace1',<REPLACE>,1:0]
[@1,8:8='(',<'('>,1:8]
[@2,9:15='replace',<REPLACE>,1:9]
[@3,16:16='(',<'('>,1:16]
[@4,17:24='some_col',<ID>,1:17]
[@5,25:25=',',<','>,1:25]
[@6,26:30=''\\''',<STRING>,1:26]
[@7,31:31=',',<','>,1:31]
[@8,32:33='''',<STRING>,1:32]
[@9,34:34=')',<')'>,1:34]
[@10,35:35=',',<','>,1:35]
[@11,36:39=''\"'',<STRING>,1:36]
[@12,40:40=' ',<WS>,channel=1,1:40]
[@13,41:41=',',<','>,1:41]
[@14,42:43='''',<STRING>,1:42]
[@15,44:44=')',<')'>,1:44]
[@16,45:45='\n',<NL>,channel=1,1:45]
[@17,46:53='replace2',<REPLACE>,2:0]
[@18,54:54='(',<'('>,2:8]
[@19,55:62='some_col',<ID>,2:9]
[@20,63:63=',',<','>,2:17]
[@21,64:67='''''',<STRING>,2:18]
[@22,68:68=',',<','>,2:22]
[@23,69:70='''',<STRING>,2:23]
[@24,71:71=')',<')'>,2:25]
[@25,72:72='\n',<NL>,channel=1,2:26]
[@26,73:80='replace3',<REPLACE>,3:0]
[@27,81:81='(',<'('>,3:8]
[@28,82:89='some_col',<ID>,3:9]
[@29,90:90=',',<','>,3:17]
[@30,91:105=''abc\tdef\tghi'',<STRING>,3:18]
[@31,106:106=',',<','>,3:33]
[@32,107:111=''xyz'',<STRING>,3:34]
[@33,112:112=')',<')'>,3:39]
[@34,113:113='\n',<NL>,channel=1,3:40]
[@35,114:121='replace4',<REPLACE>,4:0]
[@36,122:122='(',<'('>,4:8]
[@37,123:130='some_col',<ID>,4:9]
[@38,131:131=',',<','>,4:17]
[@39,132:141=''abc\ndef'',<STRING>,4:18]
[@40,142:142=',',<','>,4:28]
[@41,143:147=''xyz'',<STRING>,4:29]
[@42,148:148=')',<')'>,4:34]
[@43,149:149='\n',<NL>,channel=1,4:35]
[@44,150:157='replace5',<REPLACE>,5:0]
[@45,158:158='(',<'('>,5:8]
[@46,159:166='some_col',<ID>,5:9]
[@47,167:167=',',<','>,5:17]
[@48,168:185=''it is 8 o''clock'',<STRING>,5:18]
[@49,186:186=',',<','>,5:36]
[@50,187:189=''8'',<STRING>,5:37]
[@51,190:190=')',<')'>,5:40]
[@52,191:191='\n',<NL>,channel=1,5:41]
[@53,192:191='<EOF>',<EOF>,6:0]
Question last update 2352

How to parse/identify a double-quoted string within a larger expression using Marpa::R2 in Perl

I am having trouble parsing/identifying a double-quoted string within a larger expression.
use strict;
use Marpa::R2;
use Data::Dumper;
my $grammar = Marpa::R2::Scanless::G->new({
default_action => '[values]',
source => \(<<'END_OF_SOURCE'),
:start ::= expression
expression ::= expression OP expression
expression ::= expression COMMA expression
expression ::= func LPAREN PARAM RPAREN
expression ::= PARAM
PARAM ::= STRING | REGEX_STRING
:discard ~ sp
sp ~ [\s]+
COMMA ~ [,]
STRING ~ [^ \/\(\),&:\"~]+
REGEX_STRING ~ yet to identify
OP ~ ' - ' | '&'
LPAREN ~ '('
RPAREN ~ ')'
func ~ 'func'
END_OF_SOURCE
});
my $recce = Marpa::R2::Scanless::R->new({grammar => $grammar});
# Parses correctly: foo and bar are taken as STRING lexemes.
my $input1 = "func(foo)&func(bar)";
# Here foo should be parsed as a REGEX_STRING lexeme, i.e. anything enclosed in double quotes.
my $input2 = "\"foo\"";
# Expected: func as the func lexeme, ( as LPAREN, ) as RPAREN, foo as REGEX_STRING, - as OP, and the same for func("bar").
my $input3 = "func(\"foo\") - func(\"bar\")";
# Expected: func as the func lexeme, ( as LPAREN, ) as RPAREN, foo as REGEX_STRING.
my $input4 = "func(\"foo\")";
my $input = $input4; # the input under test
print "Trying to parse:\n$input\n\n";
$recce->read(\$input);
my $value_ref = ${$recce->value};
print "Output:\n".Dumper($value_ref);
What I tried:
1st method:
My REGEX_STRING should be something like: REGEX_STRING ~ '\"([^:]*?)\"'
If I put the above REGEX_STRING in the code with the input expression my $input4 = "func(\"foo\")"; I get an error like:
Error in SLIF parse: No lexeme found at line 1, column 5
* String before error: func(
* The error was at line 1, column 5, and at character 0x0022 '"', ...
* here: "foo")
Marpa::R2 exception
2nd method:
I tried including rules like:
PARAM ::= STRING | REGEX_STRING
REGEX_STRING ::= '"' QUOTED_STRING '"'
STRING ~ [^ \/\(\),&:\"~]+
QUOTED_STRING ~ [^ ,&:\"~]+
The problem here is that the input is given using:
my $input4 = "func(\"foo\")";
This gives an error because there are now two ways to parse the expression: either the whole thing between the double quotes, func(\"foo\"), is taken as a QUOTED_STRING, or func is taken as the func lexeme and so on.
Please help me fix this.
use 5.026;
use strictures;
use Data::Dumper qw(Dumper);
use Marpa::R2 qw();
my $grammar = Marpa::R2::Scanless::G->new({
bless_package => 'parsetree',
source => \(<<'END_OF_SOURCE'),
:default ::= action => [values] bless => ::lhs
lexeme default = bless => ::name latm => 1
:start ::= expression
expression ::= expression OP expression
expression ::= expression COMMA expression
expression ::= func LPAREN PARAM RPAREN
expression ::= PARAM
PARAM ::= STRING | REGEXSTRING
:discard ~ sp
sp ~ [\s]+
COMMA ~ [,]
STRING ~ [^ \/\(\),&:\"~]+
REGEXSTRING ::= '"' QUOTEDSTRING '"'
QUOTEDSTRING ~ [^ ,&:\"~]+
OP ~ ' - ' | '&'
LPAREN ~ '('
RPAREN ~ ')'
func ~ 'func'
END_OF_SOURCE
});
# say $grammar->show_rules;
for my $input (
'func(foo)&func(bar)', '"foo"', 'func("foo") - func("bar")', 'func("foo")'
) {
my $r = Marpa::R2::Scanless::R->new({
grammar => $grammar,
# trace_terminals => 1
});
$r->read(\$input);
say "# $input";
say Dumper $r->value;
}
The 2nd method posted in the question worked for me. I just had to include:
lexeme default = latm => 1
in my code.

Perl Remove outer Brackets

I want to use a Perl regex to remove the outer brackets of a function, but I can't construct a regex that doesn't interfere with the inner brackets. Here is an example:
void init(){
if(true){
//do something
}
}
into
void init()
if(true){
//do something
}
Is there a regex that can do this?
Write a parser for the language. Here's a simplified example using Marpa::R2:
#!/usr/bin/perl
use warnings;
use strict;
use Marpa::R2;
my $input = << '__IN__';
void init(){
if(true){
//do something
}
}
__IN__
my $dsl = << '__DSL__';
:default ::= action => concat
lexeme default = latm => 1
FuncDef ::= type name Arglist ('{') Body ('}')
Arglist ::= '(' Args ')'
Args ::= Arg* separator => comma
Arg ::= type name
Body ::= Block+
Block ::= nonbrace
| '{' nonbrace '}'
nonbrace ~ [^{}]*
comma ~ ','
type ~ 'void'
name ~ [\w]+
space ~ [\s]+
:discard ~ space
__DSL__
sub concat { shift; join ' ', @_ }
my $grammar = 'Marpa::R2::Scanless::G'->new({ source => \$dsl });
my $value = $grammar->parse(\$input, { semantics_package => 'main' });
print $$value;
The curly brackets at FuncDef are parenthesized, which tells Marpa to discard them.
Here it is:
my $s = "void init(){ if(true){ //do something }}";
$s =~ s/^([^{]+)\{(.*)\}([^{]*)$/$1$2$3/s;
print "$s\n";
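The substitution above removes the first opening brace and the last closing brace. The same effect can be sketched without a regex; this is only an illustration, assuming the input holds a single function and that no braces appear inside comments or string literals. The name strip_outer_braces is hypothetical.

```python
def strip_outer_braces(source):
    """Remove the first '{' and the last '}' from source, if both exist."""
    first = source.find("{")
    last = source.rfind("}")
    # Leave the text untouched when there is no enclosing pair.
    if first == -1 or last < first:
        return source
    return source[:first] + source[first + 1:last] + source[last + 1:]


if __name__ == "__main__":
    print(strip_outer_braces("void init(){ if(true){ //do something }}"))
```

For anything beyond this simple shape (several functions, braces in strings), a real parser such as the Marpa::R2 grammar shown earlier is the safer route.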

How to convert two-character sequences of '-' and '_' to JSON format

I want to convert the following two-character sequences (built from "-" and "_") to JSON punctuation in Perl, so that the result can be used as JSON.
My attempt failed, and I want to know how to convert these sequences correctly.
Format:
'--' -> ':'
'-_' -> '{'
'_-' -> '}'
'__' -> ','
Here is my program
#!/usr/local/bin/perl
use strict;
use warnings;
sub toJsonFormat {
my $self = shift;
my $str = shift;
$str =~ s/-_/{/g;
$str =~ s/_-/}/g;
$str =~ s/--/:/g;
$str =~ s/__/,/g;
return $str;
}
Here is a sample call:
toJsonFormat('-_service---_key--value_-__-_key--value_-__service---_key--value_-_-')
Expected
"{service:{key:value},{key:value},service:{key:value}}"
Got
'{service:{key:value_{_{key:value_{_service:{key:value_{-'
If you have any idea how to get the expected output, please tell me.
Thanks in advance.
This should work:
my %h = (
'--' => ':',
'-_' => '{',
'_-' => '}',
'__' => ',',
);
my $rx = qr(-_|_-|--|__);
sub toJsonFormat {
my $str = shift;
$str =~ s/($rx)/$h{$1}/g;
return $str;
}
print toJsonFormat('-_service---_key--value_-__-_key--value_-__service---_key--value_-_-');
So make a regex from all the keys you're matching and replace with the corresponding values...
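The single-pass idea (one alternation, one lookup per match) can also be sketched in Python. The names TOKENS, PATTERN and to_json_format are hypothetical and only mirror the Perl answer.

```python
import re

# Two-character codes and the punctuation they stand for.
TOKENS = {"--": ":", "-_": "{", "_-": "}", "__": ","}

# One alternation built from the keys; each match is looked up in TOKENS.
PATTERN = re.compile("|".join(re.escape(code) for code in TOKENS))


def to_json_format(text):
    """Replace every two-character code in a single left-to-right pass."""
    return PATTERN.sub(lambda m: TOKENS[m.group(0)], text)


if __name__ == "__main__":
    sample = "-_service---_key--value_-__-_key--value_-__service---_key--value_-_-"
    print(to_json_format(sample))
```

Scanning once from left to right is what fixes the original problem: the four sequential substitutions let an earlier replacement consume a character that belonged to a later code, for example the "-_" inside "_-__-_".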