multiple addition of header file even after checking - regex

I want to add #include <stdint.h> to a .C file in case it's not already present, using Perl.
MY CODE SNIPPET
my $flag = 0;
my $pos = 0;
open(FILE, $input) or die $!;
my @lines = <FILE>;
foreach(@lines)
{
$pos++;
#checks for #include where it can add stdint.h
if ($_ =~ (m/#include/))
{
#prevents multiple addition for each header file
if($flag == 0)
{
#checks whether stdint already present or not
unless($_ =~ m/#include <stdint.h>/ )
{
splice @lines,$pos,0,"#include <stdint.h>"."\n";
$flag = 1;
}
}
}
}
But my code adds stdint.h every time it runs, so the include is duplicated on every run.
What's wrong with the code? The check
unless($_ =~ m/#include <stdint.h>/){
doesn't work, even if I use
unless($_ =~ m/<stdint.h>/){

Imagine you have this C file:
#include <stdio.h>
#include <stdint.h>
int main(int argc, char ** argv) {
return 0;
}
What is supposed to happen when this goes through your script?
Nothing, because <stdint.h> is already included.
What actually happens though? This is where learning to use the Perl debugger or simply tracing by hand is really useful.
$flag and $pos are initialized to 0. The first line in the file is #include <stdio.h>, which is not #include <stdint.h>, so your code immediately assumes <stdint.h> is missing and adds it.
In other words, your code inserts #include <stdint.h> after the first #include line that is not <stdint.h>, regardless of whether <stdint.h> is actually included somewhere else in the file. Since the first #include is almost always some other header, the line is added on every run.
What you should do instead is gather all of the #include lines in an array, then search that array for <stdint.h>, adding it only if it isn't in the complete list.
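A minimal sketch of that idea, assuming the file has already been read into a list of lines. The helper name ensure_stdint is made up for illustration, and placing the new line after the last #include is just one reasonable choice:

```perl
use strict;
use warnings;

# Hypothetical helper: takes the lines of a C file and returns them with
# "#include <stdint.h>" added after the last #include line, unless some
# line already includes it.
sub ensure_stdint {
    my @lines = @_;

    # Look at every line before deciding anything.
    return @lines if grep { /^\s*#\s*include\s*<stdint\.h>/ } @lines;

    # Index of the last #include line, or -1 if there is none.
    my $last = -1;
    for my $i (0 .. $#lines) {
        $last = $i if $lines[$i] =~ /^\s*#\s*include\b/;
    }

    # Insert after the last #include (or at the top when there is none).
    splice @lines, $last + 1, 0, "#include <stdint.h>\n";
    return @lines;
}

my @with    = ensure_stdint("#include <stdio.h>\n", "#include <stdint.h>\n", "int x;\n");
my @without = ensure_stdint("#include <stdio.h>\n", "int x;\n");
print scalar(@with), " lines, ", scalar(@without), " lines\n";
```

Because the check runs over the whole list first, running it a second time on its own output changes nothing.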

Here is a way to do it:
open(my $FILE, '<', $input) or die $!;
my @lines = <$FILE>;
my $pos = 0;
my $insert_pos = 1; # add stdint.h even if there are no other includes
foreach (@lines) {
    $pos++;
    if (/#include/) {
        $insert_pos = $pos; # remember the position after the latest #include
        if (/#include <stdint\.h>/) {
            $insert_pos = 0; # already present, nothing to insert
            last;
        }
    }
}
if ($insert_pos) {
    splice @lines, $insert_pos, 0, "#include <stdint.h>"."\n";
}

This is an awful thing to be doing to a C project.
What you have coded adds #include <stdint.h> right after the first #include line, and has no effect on files that don't #include anything.
However, if you want to "edit" a file using Perl, then you should use Tie::File
The code in your question would look like this
use strict;
use warnings;
use Tie::File;
my ($input) = @ARGV;
tie my @c_file, 'Tie::File', $input or die qq{Unable to open C file "$input": $!};
for my $i (0 .. $#c_file) {
    next unless $c_file[$i] =~ /#include/;
    splice @c_file, $i, 0, '#include <stdint.h>';
    last;
}

How to apply negative regex on array in perl?

Having this:
foo.pl:
#!/usr/bin/perl -w
@heds = map { /_h.+/ and s/^(.+)_.+/$1/ and "$_.hpp" } @ARGV;
@fls = map { !/_h.+/ and "$_.cpp" } @ARGV;
print "heds: @heds\nfls: @fls";
I want to separate headers from source files, and when I give input:
$./foo.pl a b c_hpp d_hpp
heds: e.hpp f.hpp
fls: e.cpp f.cpp a.cpp b.cpp
The headers are separated correctly, but every file ends up in the source list. Why? I applied the negated match !/_h.+/ in the map, so names containing _h should be skipped, yet they are not. How can I fix it?
Even this does not work:
@fls = map { if(!/_h.+/){ "$_.cpp" } } @ARGV;
It still takes every file, despite the condition.
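The map { } for @heds contains a substitution that operates on $_. Inside map, $_ is an alias for the current element of @ARGV, so the substitution strips the _hpp suffix from @ARGV itself, and by the time the second map runs nothing matches /_h.+/ any more. Just reorder the maps so that @fls is built first and you get the desired result. Note that @ARGV is still modified by the substitution, so if you need it after these maps it is no longer the original @ARGV.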
#!/usr/bin/perl -w
@fls = map { !/_h.+/ and "$_.cpp" } @ARGV;
@heds = map { /_h.+/ and s/^(.+)_.+/$1/ and "$_.hpp" } @ARGV;
print "heds: @heds\nfls: @fls\n";
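For what it's worth, here is a sketch of a variant that never touches the input list. It assumes Perl 5.14 or later for the non-destructive s///r modifier, and the helper name split_names is made up for illustration. grep does the selecting and map does the renaming, so neither list picks up the empty strings that a false "and" expression leaves behind in a map:

```perl
use strict;
use warnings;

# Hypothetical helper: splits names into headers and sources without
# modifying the caller's list.
sub split_names {
    my @names = @_;

    # grep selects the names, map transforms them.
    # s///r returns a modified copy and leaves $_ alone (Perl 5.14+).
    my @heds = map { my $base = s/^(.+)_.+/$1/r; "$base.hpp" }
               grep { /_h.+/ } @names;
    my @fls  = map { "$_.cpp" } grep { !/_h.+/ } @names;

    return (\@heds, \@fls);
}

my ($heds, $fls) = split_names(qw(a b c_hpp d_hpp));
print "heds: @$heds\nfls: @$fls\n";
```

With the arguments a b c_hpp d_hpp this prints c.hpp d.hpp for the headers and a.cpp b.cpp for the sources, and the order of the two statements no longer matters.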

Perl: comparing elements from a regex result

So i have a data file that contains a couple of lines that I am using for testing purposes. The data file contains this:
typedef enum A
{
enum A = 0;
enum B = 1;
}
A;
typedef enum B
{
enum A = 1;
enum B = 2;
}
B;
My code consists of this:
open(DATA, "<file.txt") or die "Couldn't open file file.txt, $!";
while(<DATA>){
print if /^(.*)(enum)(.*)$/;
}
I want to compare the values of A and A in each typedef and print an error suggesting a compile error. How can I store these regex search results as variables?
Thanks
Just store the values returned from the match in list context (see perlop for details). Keep a hash of declared names and report an error if an already declared name appears again:
my %declared;
while (<DATA>) {
if (my ($name) = /\b enum \s+ (\w+)/x) {
die "$name re-declared" if $declared{$name};
$declared{$name} = 1;
}
}
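If you would rather collect every clash than die on the first one, something along these lines might work. It is only a sketch: the helper name find_redeclared is made up, and it assumes the entries always look like "enum NAME = VALUE;" as in your sample, so the "typedef enum A" lines are ignored:

```perl
use strict;
use warnings;

# Hypothetical helper: scans lines shaped like "enum NAME = VALUE;" and
# returns one message for each name that was already seen earlier.
sub find_redeclared {
    my @lines = @_;
    my (%first_seen, @errors);
    my $line_no = 0;

    for (@lines) {
        $line_no++;

        # List-context match: the captures land in ($name, $value).
        if (my ($name, $value) = /^\s*enum\s+(\w+)\s*=\s*(\d+)\s*;/) {
            if (exists $first_seen{$name}) {
                push @errors, "line $line_no: $name = $value re-declared "
                            . "(first seen on line $first_seen{$name})";
            }
            else {
                $first_seen{$name} = $line_no;
            }
        }
    }
    return @errors;
}

my @sample = (
    "typedef enum A\n", "{\n", "enum A = 0;\n", "enum B = 1;\n", "}\n", "A;\n",
    "typedef enum B\n", "{\n", "enum A = 1;\n", "enum B = 2;\n", "}\n", "B;\n",
);
print "$_\n" for find_redeclared(@sample);
```

On the sample data this reports the second A (line 9) and the second B (line 10).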

Writing a bubble sort using Perl regular expressions

I'm beginning to learn perl and I'm writing a simple bubble sort using regular expressions. However, I can't get it to sort properly (alphabetically, delimiting by whitespace). It just ends up returning the same string. Can someone help? I'm sure it's something really simple. Thanks:
#!/usr/bin/perl
use warnings;
use strict;
my $document=<<EOF;
This is the beginning of my text...#more text here;
EOF
my $continue = 1;
my $swaps = 0;
my $currentWordNumber = 0;
while($continue)
{
$document =~ m#^(\w+\s+){$currentWordNumber}#g;
if($document =~ m#\G(\w+)(\s+)(\w+)#)
{
if($3 lt $1)
{
$document =~ s#\G(\w+)(\s+)(\w+)#$3$2$1#;
$swaps++;
}
else
{
pos($document) = 0;
}
$currentWordNumber++;
}
else
{
$continue = 0 if ($swaps == 0);
$swaps = 0;
$currentWordNumber = 0;
}
}
print $document;
SOLVED: I figured out the problem. I wasn't taking into account punctuation after a word.
If you just want to sort all the words, you don't have to use regular expressions... Simply splitting up the text by newlines and white spaces should be much faster:
sub bsort {
    my @x = @_;
    for my $i (0..$#x) {
        for my $j (0..$i) {
            @x[$i, $j] = @x[$j, $i] if $x[$i] lt $x[$j];
        }
    }
    return @x;
}
print join (" ", bsort(split(/\s+/, $document)));
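And if the point is the result rather than the exercise, Perl's built-in sort does the same job. A small sketch, with a made-up helper name, that also pulls out only the \w+ runs so trailing punctuation does not stick to the words:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the words of a string in string order.
# Call it in list context.
sub sorted_words {
    my ($text) = @_;
    my @words = $text =~ /(\w+)/g;       # words only, punctuation is dropped
    return sort { $a cmp $b } @words;    # plain string comparison
}

print join(" ", sorted_words("This is the beginning of my text...")), "\n";
```

Note that cmp compares by character code, so capitalized words sort before lowercase ones; compare lc($a) with lc($b) if you want a case-insensitive order.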

perl script to read content between marks

In Perl, how do I read the contents between two markers? The source data looks like this:
START_HEAD
ddd
END_HEAD
START_DATA
eee|234|ebf
qqq| |ff
END_DATA
--Generate at 2011:23:34
I only want to get the data between "START_DATA" and "END_DATA". How can I do this?
sub readFile(){
open(FILE, "<datasource.txt") or die "file is not found";
while(<FILE>){
if(/START_DATA/){
record(\*FILE);#start record;
}
}
}
sub record($){
my $fileHandle = $_[0];
while(<fileHandle>){
print $_."\n";
if(/END_DATA/) return ;
}
}
I wrote this code, but it doesn't work. Do you know why?
Thanks
You can use the range operator:
perl -ne 'print if /START_DATA/ .. /END_DATA/'
The output will include the *_DATA lines, too, but it should not be so hard to get rid of them.
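One way to drop the marker lines, sketched here with a made-up helper name: in scalar context the range operator returns a sequence number, which is 1 on the line that opens the range and has "E0" appended on the line that closes it (this is described in perlop).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines strictly between the markers.
sub between_markers {
    my @lines = @_;
    my @out;
    for (@lines) {
        # Scalar-context range operator: false outside the range,
        # 1 on the START line, a value ending in "E0" on the END line.
        my $seq = /START_DATA/ .. /END_DATA/;
        next unless $seq;                      # outside the range
        next if $seq == 1 or $seq =~ /E0$/;    # skip the marker lines
        push @out, $_;
    }
    return @out;
}

my @sample = (
    "START_HEAD\n", "ddd\n", "END_HEAD\n",
    "START_DATA\n", "eee|234|ebf\n", "qqq| |ff\n", "END_DATA\n",
    "--Generate at 2011:23:34\n",
);
print between_markers(@sample);
```

The range operator keeps its state between calls, so this assumes every START_DATA in the input is eventually followed by an END_DATA.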
Besides a few typos, your code is not too far off. Had you used
use strict;
use warnings;
You might have figured it out yourself. Here's what I found:
Don't use prototypes unless you need them and know what they do. A sub declaration with a prototype looks like sub my_function (prototype) {, but you can leave the prototype out and just write sub my_function {.
while (<fileHandle>) { is missing the $ sigil, so Perl reads from a bareword filehandle named fileHandle instead of your scalar variable. It should be <$fileHandle>.
print $_."\n"; adds an extra newline, because each line you read still ends with its own newline. Just print; will do what you expect.
if(/END_DATA/) return; is a syntax error. In Perl the braces around the block are not optional here, unless you reverse the statement and use the statement-modifier form.
Use either:
return if (/END_DATA/);
or
if (/END_DATA/) { return }
Below is the cleaned up version. I commented out your open() while testing, so this would be a functional code example.
use strict;
use warnings;
readFile();
sub readFile {
#open(FILE, "<datasource.txt") or die "file is not found";
while(<DATA>) {
if(/START_DATA/) {
recordx(\*DATA); #start record;
}
}
}
sub recordx {
my $fileHandle = $_[0];
while(<$fileHandle>) {
print;
if (/END_DATA/) { return }
}
}
__DATA__
START_HEAD
ddd
END_HEAD
START_DATA
eee|234|ebf
qqq| |ff
END_DATA
--Generate at 2011:23:34
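This is a pretty simple thing to do with regular expressions once the whole file is in a single string: just use the /s flag. /s allows . to match newlines (while /m makes ^ and $ match at line boundaries), so you can do /start_data(.+?)end_data/is. The non-greedy .+? stops at the first end marker, and /i makes the match case-insensitive.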
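A sketch of that approach, assuming the whole file fits comfortably in memory. The helper name extract_data is made up, and the string below stands in for a file you would normally slurp with something like "local $/; my $text = <$fh>;":

```perl
use strict;
use warnings;

# Hypothetical helper: takes the whole file as one string and returns the
# text between the marker lines, or undef if the markers are not found.
sub extract_data {
    my ($text) = @_;

    # /s lets . match newlines; /m lets ^ and $ match at line boundaries.
    # .*? is non-greedy, so the capture stops at the first END_DATA line.
    my ($body) = $text =~ /^START_DATA\n(.*?)^END_DATA$/ms;
    return $body;
}

my $text = "START_HEAD\nddd\nEND_HEAD\n"
         . "START_DATA\neee|234|ebf\nqqq| |ff\nEND_DATA\n"
         . "--Generate at 2011:23:34\n";
print extract_data($text);
```

Anchoring the markers with ^ and $ keeps the marker lines themselves out of the capture.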

Why isn't my Perl script finding bad indentation with my regex match?

My work's coding standard uses this bracket indentation:
some declaration
{
stuff = other stuff;
};
control structure, function, etc()
{
more stuff;
for(some amount of time)
{
do something;
}
more and more stuff;
}
I'm writing a perl script to detect incorrect indentation. Here's what I have in the body of a while(<some-file-handle>):
# $prev holds the previous line in the file
# $current holds the current in the file
if($prev =~ /^(\t*)[^;]+$/ and $current =~ /^(?<=!$1\t)[\{\}].+$/) {
print "$file # line ${.}: Bracket indentation incorrect\n";
}
Here, I'm trying to match:
$prev: A line not ended with a semi-colon, followed by...
$current: A line not having the number of leading tabs+1 of the previous line.
This doesn't seem to match anything, at the moment.
The $prev pattern needs some modification.
It should be something like \t*, then .+, then not ending in a semicolon.
Also, the $current pattern should be something like:
anything ending in ; or { or } that does not have the number of leading tabs+1 of the previous line.
EDIT
Perl code to try the $prev pattern:
#!/usr/bin/perl -l
open(FP,"example.cpp");
while(<FP>)
{
if($_ =~ /^(\t*)[^;]+$/) {
print "got the line: $_";
}
}
close(FP);
//example.cpp
for(int i = 0;i<10;i++)
{
//not this;
//but this
}
//output
got the line: {
got the line: //but this
got the line: }
It did not detect the line with the for loop...
Am I missing something?
I see a couple of problems:
Your $prev regex matches only lines that do not contain a ; anywhere, which breaks on lines like for (int x = 1; x < 10; x++).
If the indent of the opening { is incorrect, you will not detect that.
Try this instead. It only cares whether the line ends in ; or { (followed by any whitespace):
/^(\s*).*[^{;]\s*$/
Now change your strategy: if you see a line that does not end in { or ;, increment the indent counter.
If you see a line that ends in }; or }, decrement the indent counter.
Compare all lines against this:
/^\t{$counter}[^\s]/
so...
$counter = 0; # initialize once, before the loop over the lines
if (!($curr =~ /^\t{$counter}[^\s]/)) {
    # error detected
}
if ($curr =~ /\};?\s*$/) {
    # line ends in } or };
    $counter--;
} elsif ($curr =~ /^(\s*).*[^{;]\s*$/) {
    $counter++;
}
sorry for not styling my code according to your standards... :)
And you intend to only count tabs (not spaces) for indentation?
Writing this kind of checker is complicated. Just think about all the possible constructs that use braces but should not change indentation:
s{some}{thing}g
qw{ a b c }
grep { defined } @a
print "This is just a { provided to confuse";
print <<END;
This {
$is = not $code
}
END
But anyway, if the issues above aren't important to you, consider whether the semicolon is important at all in your regex. After all, writing
while($ok)
{
sort { some_op($_) }
grep { check($_) }
my_func(
map { $_->[0] } @list
);
}
Should be possible.
Have you considered looking at Perltidy?
Perltidy is a Perl script that reformats Perl code to a set of standards. Granted, what you have isn't part of the Perl standard, but you can probably tweak the curly braces via the configuration file Perltidy uses. If all else fails, you can hack through the code. After all, Perltidy is just a Perl script.
I haven't really used it, but it might be worth looking into. Your problem is locating all the various edge cases and making sure you handle them correctly. You can parse 100 programs only to find that the 101st reveals a problem in your formatter. Perltidy has been used by thousands of people on millions of lines of code. If there is an issue, it has probably already been found.