perl - extract only from function definitions and not from function declarations

FILE CONTAINS FUNCTION DEFINITIONS AND FUNCTION DECLARATIONS
eErrorT ChainCtrlUpdateCameraRoute(ChainCtrlT* pChainCtrl, RouteListItemT* pNewRoute, RouteListItemT* pCurrRoute);
eErrorT ChainCtrlSetJpgSnapshotFile(ChainCtrlT* pChainCtrl, RouteListItemT* pRoute, char * dst_chain, char *jpg_file_path)
{
}
MY CODE
use strict;
use warnings;
use vars qw(@temp $index $i);
open(my $FILE, "< a.c") or die $!;
my @arr = <$FILE>;
foreach (@arr) {
    if ($_ =~ /^ \S+ \s+ \S+ \s* \( (.+?) \) /xsmg) { # extracts function arguments
        my $arg = $1;
        my @arr = map /(\w+)$/, split /\W*?,\W*/, $arg;
        print @temp = map "$_\n", @arr;
    }
}
GIVES OUTPUT
pChainCtrl
pNewRoute
pCurrRoute
pChainCtrl
pRoute
dst_chain
jpg_file_path
OUTPUT NEEDED
pChainCtrl
pRoute
dst_chain
jpg_file_path
I need to extract arguments only from the function definition (ChainCtrlSetJpgSnapshotFile) and not from the declaration (ChainCtrlUpdateCameraRoute).
I need to check that the line containing (..) does not also contain ";", but I am unable to write a regex for it.

Looking for the technical (lexical) difference, I see a ";" at the end of the declaration and no ";" at the end of the definition. In addition, the definition has a { on the next line.
Using these features (if they are consistent, fingers crossed):
# ignore lines with ; (followed by optional spaces) at the end
print @temp = map "$_\n", @arr
    unless $_ =~ /;\s*$/;
Another option is to look for the { on the next line, and print the previous line's arguments only if it matched and the current line starts with a {. (Left as an exercise.)
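The brace-based option can be sketched as follows. This is only an illustration under a few assumptions: the whole source is held in one string, parameter lists contain no nested parentheses, and the opening brace follows the closing parenthesis with nothing but whitespace in between. The helper name definition_args is hypothetical and not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the parameter
# names of every function whose signature is followed by an opening brace.
sub definition_args {
    my ($source) = @_;
    my @names;
    # Signature at the start of a line, then optional whitespace and '{'.
    # A declaration ends in ';' instead, so it never matches.
    while ($source =~ /^ \S+ \s+ \S+ \s* \( ([^()]*) \) \s* \{ /xmg) {
        my $args = $1;
        for my $param (split /,/, $args) {
            # The parameter name is taken to be the last word of each piece.
            push @names, $1 if $param =~ /(\w+)\s*$/;
        }
    }
    return @names;
}

my $source = <<'END_C';
eErrorT ChainCtrlUpdateCameraRoute(ChainCtrlT* pChainCtrl, RouteListItemT* pNewRoute, RouteListItemT* pCurrRoute);
eErrorT ChainCtrlSetJpgSnapshotFile(ChainCtrlT* pChainCtrl, RouteListItemT* pRoute, char * dst_chain, char *jpg_file_path)
{
}
END_C

print "$_\n" for definition_args($source);
```

Because the brace is part of the pattern, no separate ";" test is needed, and a definition whose brace sits on the same line as the signature is handled the same way.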

Related

Extract words between begin and end, \G \K

This pattern does the work
(?:\G(?!\A)|begin).*?\K(keyword)(?=.*end)
String:
begin
keyword
keyword
end
I get what I want (keyword keyword) in just one capture group, but if the string has this:
begin
keyword
keyword
end
keyword
end
I get three matches. How can I stop at the first end?
Can this pattern be improved or optimized?

I would hate to run across such a regex in code. Any small change and it's broken.
I'd open a filehandle on a reference to the string then read its lines. Skip everything until you run into the starting line, then read everything up to the ending line:
use v5.26;
my $string =<<~'HERE';
begin
keyworda
keywordb
end
keywordc
end
HERE
open my $fh, '<', \$string;
while( <$fh> ) { last if /\Abegin/ }
my @keywords;
while( <$fh> ) {
    last if /^end/;
    chomp;
    push @keywords, $_;
}
say join "\n", @keywords;
This outputs:
keyworda
keywordb
Or, break it up into two regexes. One sets the starting position, then you repeatedly match as long as the line isn't the ending line. This is a bit cleaner, but some people may be confused by the global matching in scalar context:
use v5.26;
my $string =<<~'HERE';
begin
keyworda
keywordb
end
keywordc
end
HERE
my @keywords;
if( $string =~ / ^ begin \R /gmx ) {
    while( $string =~ /\G (?!end \R) (\N+) \R /gx ) {
        push @keywords, $1;
    }
}
say join "\n", @keywords;

Use a regular expression and store the matches in an array:
my @result = $data =~ /begin\n(.*?)\nend/sg;
then print them to the console:
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my $data = do { local $/; <DATA> };
my @result = $data =~ /begin\n(.*?)\nend/sg;
say '-' x 35 . "\n" . $_ for @result;
__DATA__
begin
keyword 1
keyword 2
end
keyword
end
keyword
begin
keyword 3
keyword 4
end
keyword
keyword
Output
-----------------------------------
keyword 1
keyword 2
-----------------------------------
keyword 3
keyword 4

You can use a negative lookahead inside the group to fetch the data from begin to end:
my @keyws = ($data=~/begin((?:(?!begin|end).)*)end/sg);
use Data::Dumper;
print Dumper @keyws;
This is the approach I use when processing LaTeX.

A non-greedy Perl regular expression

I need to write a script which does the following:
$ cat testdata.txt
this is my file containing data
for checking pattern matching with a patt on the back!
only one line contains the p word.
$ ./mygrep5 pat th testdata.txt
this is my file containing data
for checking PATTERN MATCHING WITH a PATT ON THe back!
only one line contains the p word.
I have been able to print the amended line, but with the "a" capitalized as well. I have no idea how to uppercase only what is needed.
I have been messing around (below is my script so far) and all I manage to return is the "PATT ON TH" part.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use Data::Dump 'pp';
my ($f, $s, $t) = @ARGV;
my @output_lines;
open(my $fh, '<', $t);
while (my $line = <$fh>) {
if ($line =~ /$f/ && $line =~ /$s/) {
$line =~ s/($f.+?$s)/$1/g;
my $sub_phrase = uc $1;
$line =~ s/$1/$sub_phrase/g;
print $line;
}
#else {
# print $line;
#}
}
close($fh);
which returns: "for checking pattern matching with a PATT ON THe back!"
How can I fix this problem?

It sounds like you want to capitalize from pat to th except for instances of a surrounded by spaces. The easiest way is to uppercase the whole thing, and then fix any instances of A surrounded by spaces.
sub capitalize {
my $s = shift;
my $uc = uc($s);
$uc =~ s/ \s \K A (?=\s) /a/xg;
return $uc;
}
s{ ( \Q$f\E .* \Q$s\E ) }{ capitalize($1) }xseg;
The downside is that it will replace any existing A surrounded by spaces with a. The following is more complicated, but it doesn't suffer from that problem:
sub capitalize {
    my $s = shift;
    my @parts = $s =~ m{ \G ( \s+ | \S+ ) }xg;
    for (@parts) {
        $_ = uc($_) if $_ ne "a";
    }
    return join('', @parts);
}
s{ ( \Q$f\E .* \Q$s\E ) }{ capitalize($1) }xseg;
The rest of the code can be simplified:
#!/usr/bin/perl
use strict;
use warnings;
sub capitalize { ... }
my $f = shift;
my $s = shift;
while (<>) {
s{ ( \Q$f\E .* \Q$s\E ) }{ capitalize($1) }xseg;
print;
}

So, if you want to match each sequence that starts with pat and ends with th, non-greedily, and uppercase that sequence, you can simply use an expression on the right side of your substitution:
$line =~ s/($f.+?$s)/uc($1)/eg;
And that's it.
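As a small self-contained sketch of the non-greedy substitution above: the helper name upcase_between is hypothetical, and \Q...\E is added on the assumption that both command-line arguments are meant as literal text rather than as regex fragments.

```perl
use strict;
use warnings;

# Hypothetical helper: uppercase every shortest run that starts with $from
# and ends with $to. \Q...\E makes both arguments literal text.
sub upcase_between {
    my ($line, $from, $to) = @_;
    $line =~ s/(\Q$from\E.+?\Q$to\E)/uc($1)/eg;
    return $line;
}

print upcase_between("for checking pattern matching with a patt on the back!\n", 'pat', 'th');
# prints: for checking PATTERN MATCHING WITH a PATT ON THe back!
```

With the sample line, the lone "a" stays lowercase simply because it falls between two separate non-greedy matches, not because it is treated specially.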

Perl parsing JavaScript file regex, to catch quotes only at the beginning and end of the returned string

I'm just starting to learn Perl. I need to parse JavaScript file. I came up with the following subroutine, to do it:
sub __settings {
my ($_s) = @_;
my $f = $config_directory . "/authentic-theme/settings.js";
if ( -r $f ) {
for (
split(
'\n',
$s = do {
local $/ = undef;
open my $fh, "<", $f;
<$fh>;
}
)
)
{
if ( index( $_, '//' ) == -1
&& ( my @m = $_ =~ /(?:$_s\s*=\s*(.*))/g ) )
{
my $m = join( '\n', @m );
$m =~ s/[\'\;]//g;
return $m;
}
}
}
}
I have the following regex, that removes ' and ; from the string:
s/[\'\;]//g;
It works, but if those characters (' and ;) also appear inside the string, they are removed too. This is undesirable, and that's where I'm stuck: I'm not sure how to change the regex above so that it does only the following:
Remove only the first ' in the string
Remove only the last ' in the string
Remove only the last ; in the string, if it exists
Any help, please?

You can use the following to match:
^'|';?$|;$
And replace with '' (empty string)
See DEMO

Remove only first ' in string
Remove only last ' in string
^[^']*\K'|'(?=[^']*$)
Try this. See demo.
https://regex101.com/r/oF9hR9/8
Remove only last ; in string if exists
;(?=[^;]*$)
Try this. See demo.
https://regex101.com/r/oF9hR9/9
All three in one
^[^']*\K'|'(?=[^']*$)|;(?=[^;]*$)
See Here

You can use this code:
#!/usr/bin/perl
$str = "'string; 'inside' another;";
$str =~ s/^'|'?;?$//g;
print $str;
IDEONE demo
The main idea is to use anchors: ^ matches the beginning of the string, $ the end of the string, and ;? matches the ";" symbol at the end only if it is present (the ? quantifier makes the pattern preceding it optional). EDIT: Also, ; will get removed even if there is no preceding '.
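The same anchoring idea can also be written as three separate substitutions, which some readers may find easier to follow than a single alternation. This is a sketch that assumes the value begins and ends with the quote (apart from an optional trailing semicolon); the helper name strip_js_value is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: strip one trailing semicolon and one pair of
# enclosing single quotes, leaving quotes and semicolons inside untouched.
sub strip_js_value {
    my ($value) = @_;
    $value =~ s/;\s*\z//;   # drop one trailing semicolon, if present
    $value =~ s/\A'//;      # drop a leading quote, if present
    $value =~ s/'\z//;      # drop a trailing quote, if present
    return $value;
}

print strip_js_value("'string; 'inside' another';"), "\n";
# prints: string; 'inside' another
```

Each step is anchored, so none of them can touch a character in the middle of the string.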

I suggest that your original code should look more like this. It is much more idiomatic Perl and I think more straightforward to follow
sub __settings {
    my ($_s) = @_;
    my $file = "$config_directory/authentic-theme/settings.js";
    return unless -r $file;
    open my $fh, '<', $file or die qq{Unable to open "$file" for input: $!};
    my @file = <$fh>;
    chomp @file;
    for ( @file ) {
        next if m{//};
        if ( my @matches = $_ =~ /(?:$_s\s*=\s*(.*))/g ) {
            my $matches = join "\n", @matches;
            $matches =~ tr/';//d;
            return $matches;
        }
    }
}

perl - extracting function arguments from multi line function definitions

eErrorT ChainCtrlInitChains(ChainCtrlT* pChainCtrl,
char* name,
int instance)
{
....
}
eErrorT ChainCtrlInit(ChainCtrlT* pChainCtrl, void* pOwner)
{
....
}
MY CODE
open(my $FILE, "< a.c") or die $!;
my @arr = <$FILE>;
foreach (@arr) {
    if ($_ =~ /^ \S+ \s+ \S+ \s* \( (.+?) \) /xsmg) {
        my $arg = $1;
        my @arr = map /(\w+)$/, split /\W*?,\W*/, $arg;
        print my @temp = map "$_\n", @arr
            unless $_ =~ /;\s*$/;
    }
}
Basically I am extracting function arguments from the function definitions.
But in this case I am only able to extract arguments for the 2nd function (ChainCtrlInit) and not for the 1st function (ChainCtrlInitChains).
DESIRED OUTPUT
pChainCtrl
name
instance
pChainCtrl
pOwner
OUTPUT I AM GETTING
pChainCtrl
pOwner

The problem is that in a "foreach" loop, $_ is set to each element of the array in turn, i.e. one line at a time.
For example,
foreach(@arr).. on the first iteration $_ holds "eErrorT ChainCtrlInitChains(ChainCtrlT* pChainCtrl,\n", which has no closing parenthesis, so your regex fails.
You also use temporary variables that serve little purpose. I have improved the code as follows:
use feature 'say';
my $arr = do { local $/; <$FILE> }; # slurp the whole file; copied from one of the comments above
# note there's a slight difference in the 'while' regex compared to your code
while ($arr =~ /^ \S+ \s+ \S+ \s* (\( .+? \)) /xsmg) {
    my @args = $1 =~ /(\w+)[,)]/g; # This assumes what you want always ends in
                                   # a ',' or a ')', hence the gentle modification in the 'while' regex.
    local $" = "\n";
    say "@args";
}

The problem is that you are reading the file line by line, so the regex can never extend across multiple lines. If you load the file and then make it a single string with embedded newlines, it will work.
E.g. a quick hack gives
open(my $FILE, "< a.c") or die $!;
my @arr = <$FILE>;
my $file = join('',@arr);
my @matches = $file =~ /^ \S+ \s+ \S+ \s* \( (.+?) \) /xsmg;
foreach (@matches) {
    my $arg = $_;
    my @arr = map /(\w+)$/, split /\W*?,\W*/, $arg;
    print my @temp = map "$_\n", @arr
        unless $_ =~ /;\s*$/;
    print "\n";
}

@Ad-vic, in addition to @atleypnorth's solution of slurping the whole file into a string, there could be a problem within your split-map statement.
This
my @arr = map /(\w+)$/, split /\W*?,\W*/, $arg;
should be this
my @arr = map /(\w+)\W*$/, split /\W*?,\W*/, $arg;
because the last element from split can carry residual characters.
Once you get the hang of Perl, you could streamline it into this:
local $/;    # slurp mode (an empty string would select paragraph mode instead)
open(my $FILE, "< a.c") or die $!;
my $str = <$FILE>;
foreach ( $str =~ /^ \S+ \s+ \S+ \s* \( (.+?) \) /xsmg ) {
    print map "$_\n", /(\w+) [^\w,]* (?:,|$)/xg ;
    print "\n";
}
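Putting the points from these answers together (match against the whole file as one string, and tolerate trailing non-word characters after each name), a minimal sketch might look like the following. The helper name extract_args is hypothetical, and the source text is passed in as a string rather than read from a.c.

```perl
use strict;
use warnings;

# Hypothetical helper: collect parameter names from every signature found
# in a C source string, including signatures that span several lines.
sub extract_args {
    my ($source) = @_;
    my @names;
    # /s lets '.' cross newlines, /m lets '^' match at each line start.
    while ($source =~ /^ \S+ \s+ \S+ \s* \( (.+?) \) /xsmg) {
        my $args = $1;
        for my $param (split /,/, $args) {
            # Last word of each piece, ignoring trailing non-word characters.
            push @names, $1 if $param =~ /(\w+)\W*$/;
        }
    }
    return @names;
}

my $source = <<'END_C';
eErrorT ChainCtrlInitChains(ChainCtrlT* pChainCtrl,
char* name,
int instance)
{
....
}
eErrorT ChainCtrlInit(ChainCtrlT* pChainCtrl, void* pOwner)
{
....
}
END_C

print "$_\n" for extract_args($source);
```

The pattern is the one from the question, so it shares its limits: anything that looks like "word word(...)" at the start of a line is treated as a signature.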

perl - extracting arguments from function definitions and putting it as comment above it

............
########### NEED TO PUT ARGUMENTS HERE AS COMMENT #########
eErrorT ChainCtrlInitChains(ChainCtrlT* pChainCtrl,
char* name,
int instance,
void* pOwner,
)
{
....
}
.........
I want to extract the arguments and put them above the function definition as a comment. There are many similar function definitions.
open(my $FILE1, "< a.c") or die $!;
@arr = <$FILE1>;
foreach (@arr) {
    if ($_ =~ /^ \S+ \s+ \S+ \s* \( (.+?) \) /xsmg) {
        my $arg = $1;
        my @arr = map /(\w+)$/, split /\W*?,\W*/, $arg;
        print @temp = map ' * @param[in/out] '."$_\n", @arr
            unless $_ =~ /;\s*$/;
    }
}
It works when I read the file into $str, but then I can't splice the arguments in as a comment above the function definitions.
$str = <$FILE1>;

Here you go:
use File::Copy;
open my $FILE,'<','a.c' or die "open failed: $!\n";
$file_slurp = do { local $/;<$FILE>};
$file_slurp =~ s{ ^ ( \w+ \s+ \w+ \s* \( (.+?) \) )}{&print_args($2,$1)}xmesg;
close($FILE) or die "Couldn't close file: $!\n";
copy "a.c","a.c.bak" or die "Copy failed: $!\n";
open my $NEW_FILE,'>','a.c' or die "Truncating a.c failed: $!\n";
print $NEW_FILE $file_slurp and unlink "a.c.bak";
sub print_args {
    ($args,$proto) = @_;
    @comments = map { ' * @param[in/out] '."$_" } split /\s*(?:^|,)\s*/,$args;
    return join "\n",(@comments,$proto)
}
Test the code first by removing the unlink so that a backup copy of your source is kept on disk. When you're confident it does what you want, you can put back the unlink so that it seems that your original file was modified in place.
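For experimenting without touching any file, the substitution can be tried on a string first. The sketch below is a variation rather than the answer's exact code: it keeps only the parameter names, wraps them in a /** ... */ block (an assumed comment style), and requires an opening brace after the signature so that declarations are left alone. The names annotate_definitions and with_param_comment are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the comment block for one signature and
# returns it followed by the untouched signature.
sub with_param_comment {
    my ($signature, $args) = @_;
    my $comment = "/**\n";
    for my $param (split /,/, $args) {
        # Pieces without a word (e.g. after a trailing comma) are skipped.
        $comment .= " * \@param[in/out] $1\n" if $param =~ /(\w+)\s*$/;
    }
    return $comment . " */\n" . $signature;
}

# Hypothetical helper: annotate every definition in a C source string.
sub annotate_definitions {
    my ($source) = @_;
    $source =~ s{ ^ ( \w+ \s+ \w+ \s* \( ([^()]*) \) \s* \{ ) }
                { with_param_comment($1, $2) }xmge;
    return $source;
}

print annotate_definitions(<<'END_C');
eErrorT ChainCtrlInitChains(ChainCtrlT* pChainCtrl,
char* name,
int instance,
void* pOwner,
)
{
}
END_C
```

Once the output looks right on a sample string, the same function could be applied to the slurped file contents before writing them back.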
