Extract hex codes using regex [duplicate]

I have a file with hex codes. Example:
my $O1ol1ooI = "";
my $lOI100 = "";
my $oO10OI0 = 99;
my @olIIO1 = `df -h`;
chomp(my $I0110 = `hostname`);
foreach (@olIIO1) {
if (m/(\d+)% (.*)/ && $1 > $oO10OI0) {
$lOI100 = "\x66\x75\x6C\x6C";
}
}
my $O1ol1ooI = "";
my $lOI100 = "";
my $oO10OI0 = 99;
my @olIIO1 = `df -ih`;
chomp(my $I0110 = `hostname`);
foreach (@olIIO1) {
if (m/(\d+)% (.*)/ && $1 > $oO10OI0) {
$lOI100 = "\x66\x75\x6C\x6C";
}
}
if ($ol0IOoO eq "\x2D\x2D\x64\x65\x62\x75\x67\x3D\x6F\x6E") {
$OloOlOlII->show_progress;
}
What I need is a regex that can extract the hex codes from this file, such as \x66\x75\x6C\x6C.
Note: The real file is much longer.
Thanks in advance.

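A regular expression to match a hex code like that is straightforward. Note that the sample uses uppercase digits (\x6C), so the character class has to cover both cases unless you match case-insensitively:
\\x[0-9A-Fa-f][0-9A-Fa-f]
or
\\x[[:xdigit:]]{2}
Is that enough to solve your problem?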
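To see the pattern in action on the sample lines, here is a minimal sketch in Python rather than Perl. The names extract_hex_runs and decode_hex_run are made up for illustration, and grouping consecutive escapes into one run is my assumption about what "extract" should return.

```python
import re

# One or more consecutive \xHH escapes; hex digits may be upper- or lowercase.
HEX_RUN = re.compile(r'(?:\\x[0-9A-Fa-f]{2})+')

def extract_hex_runs(text):
    """Return every run of hex escapes found in text, in order of appearance."""
    return HEX_RUN.findall(text)

def decode_hex_run(run):
    """Turn a run of hex escapes into the characters it encodes."""
    return bytes.fromhex(run.replace('\\x', '')).decode('latin-1')

# Two lines copied from the question's sample file.
sample = r'''$lOI100 = "\x66\x75\x6C\x6C";
if ($ol0IOoO eq "\x2D\x2D\x64\x65\x62\x75\x67\x3D\x6F\x6E") {'''

for run in extract_hex_runs(sample):
    print(run, '->', decode_hex_run(run))
```

Decoding is optional, but it makes it easy to check that the runs were captured whole.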


Bash for manipulating curly bracket delimited config files [closed]

Currently I have a config file of the following form:
under Time {
TimeStep = 0.001;
MaxTime = 0.2;
MaxIts = 400;
Type = Implicit;
under Implicit {
Type = ForwardEuler;
Jacobian = FiniteDifference;
under Newton {
MaxIts = 20;
Eps = 0.01;
}
}
}
First Question: I want to write a set of bash scripts that can
set property = value in a file; add it if it is not there.
get property from such a file.
Line-by-line editing is not suitable here: take MaxIts for example, where the script needs to distinguish between Time.MaxIts and Time.Implicit.Newton.MaxIts.
Second Question: I want to write a bash script that transforms above into:
Time.TimeStep = 0.001;
Time.MaxTime = 0.2;
Time.MaxIts = 400;
Time.Type = Implicit;
Time.Implicit.Type = ForwardEuler;
Time.Implicit.Jacobian = FiniteDifference;
Time.Implicit.Newton.MaxIts = 20;
Time.Implicit.Newton.Eps = 0.01;
so that sed or awk can do the job simply.
Here's how to do the 2nd part:
$ cat tst.awk
function descend(name) {
    while ( (getline > 0) && !/}/ ) {
        if ( /{/ ) {
            descend(name "." $2)
        }
        else {
            sub(/^[[:space:]]+/,"")
            print name "." $0
        }
    }
}
{ descend($2) }
$ awk -f tst.awk file
Time.TimeStep = 0.001;
Time.MaxTime = 0.2;
Time.MaxIts = 400;
Time.Type = Implicit;
Time.Implicit.Type = ForwardEuler;
Time.Implicit.Jacobian = FiniteDifference;
Time.Implicit.Newton.MaxIts = 20;
Time.Implicit.Newton.Eps = 0.01;
I'm sure you can write a script to do the reverse mapping and then you can just do all the manipulation related to your first question on the flat format above.
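For the first question, once the file is in the flat form, get and set reduce to matching the dotted name. The following is a rough Python sketch of that idea, not the awk script above. The function names flatten, get_prop and set_prop are hypothetical, and it assumes every block opens with a line of the form "under Name {" and closes with a lone "}".

```python
import re

def flatten(text):
    """Convert nested 'under Name { ... }' blocks into dotted 'A.B.key = value;' lines."""
    path, out = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        opener = re.fullmatch(r'under\s+(\S+)\s*\{', line)
        if opener:
            path.append(opener.group(1))   # descend into a block
        elif line == '}':
            path.pop()                     # leave the current block
        else:
            out.append('.'.join(path + [line]))
    return out

def get_prop(flat, name):
    """Return the value for a dotted property name, or None if it is absent."""
    for line in flat:
        key, _, value = line.partition('=')
        if key.strip() == name:
            return value.strip().rstrip(';')
    return None

def set_prop(flat, name, value):
    """Replace the property in place, or append it if it is not there."""
    new_line = f'{name} = {value};'
    for i, line in enumerate(flat):
        if line.partition('=')[0].strip() == name:
            flat[i] = new_line
            return
    flat.append(new_line)

CONFIG = """under Time {
MaxIts = 400;
under Implicit {
under Newton {
MaxIts = 20;
}
}
}"""

flat = flatten(CONFIG)
print(get_prop(flat, 'Time.MaxIts'), get_prop(flat, 'Time.Implicit.Newton.MaxIts'))
```

Because the full dotted name is the key, the two MaxIts entries no longer collide. The reverse mapping back to the nested form is still left open, as in the answer.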

How can I elegantly handle state when parsing line oriented files using regex?

I have a perl script that I use to extract data from a raw data/log file. I need help on making the script dynamic. First, let me show you the part of the perl script and raw data file.
Perl script:
if ( /Catalyst tester (\S+)\S+/ )
{
$DETAILS{tester_name} = $1;
}
if ( /(CATALYST_TH\s*1)/ )
{
$FOUND_CAT = 1;
$DETAILS{test_head} = $1;
$TEST_HEAD = $1;
}
if ($FOUND_CAT)
{
if ( /(BACKPLANE\s*A)/ )
{
$FRAME = $TEST_HEAD .' '. $1;
$FOUND_BACKPLANE_A = 1;
}
if ( /(BACKPLANE\s*B)/ )
{
$FRAME = $TEST_HEAD . ' ' . $1;
$FOUND_BACKPLANE_B = 1;
}
}
if ( /END/ )
{
$FOUND_CAT = 0;
$FOUND_BACKPLANE_A = 0;
$FOUND_BACKPLANE_B = 0;
$FOUND_PRECISION_1 = 0;
$FOUND_PRECISION_2 = 0;
$FOUND_UB_SPS = 0;
$FOUND_HSD100_1 = 0;
$FOUND_HSD100_2 = 0;
$FOUND_HSD100_3 = 0;
$FOUND_TSY = 0;
$FOUND_TIME_SUB = 0;
}
if ($FOUND_BACKPLANE_A)
{
if ( /(\d+)\s+(\S+)\s+(\w+)\s+\w+\s+\d*\s+\#\s+\S+\s+(?:\d+\s+){2}((?!.*EMPTY\b).+)$/ )
{
push @{$DETAILS{frame}}, $FRAME;
push @{$DETAILS{slot}}, $1;
push @{$DETAILS{part_no}}, $2;
push @{$DETAILS{serial_no}}, $3;
push @{$DETAILS{board_name}}, $4;
}
}
if ($FOUND_BACKPLANE_B)
{
if ( /(\d+)\s+(\S+)\s+(\w+)\s+\w+\s+\d*\s+\#\s+\S+\s+((?!.*EMPTY\b).+)$/ )
{
push @{$DETAILS{frame}}, $FRAME;
push @{$DETAILS{slot}}, $1;
push @{$DETAILS{part_no}}, $2;
push @{$DETAILS{serial_no}}, $3;
push @{$DETAILS{board_name}}, $4;
}
}
if( /(PRECISION\_AC\s*1)/ )
{
$FOUND_PRECISION_1 = 1;
$FRAME = $1;
}
if ($FOUND_PRECISION_1)
{
if ( /(\d+)\s+(\S+)\s+(\w+)\s+\w+\s+\d*\s+\#\s+\S+\s+((?!.*EMPTY\b).+)/ )
{
push @{$DETAILS{frame}}, $FRAME;
push @{$DETAILS{slot}}, $1;
push @{$DETAILS{part_no}}, $2;
push @{$DETAILS{serial_no}}, $3;
push @{$DETAILS{board_name}}, $4;
}
}
## And the rest of the script follows the same format
In my Perl script, the logic is: if the line/word/header (as I prefer to call it) is found, set a flag variable to 1. Then, in another if statement, if that flag is 1, search for the needed data using a regex and store it in a hash.
My main problem is that this is not dynamic. As you may have noticed, I wrote an if statement for every header, and the flag variable is different for every header; if it's Catalyst tester, the variable would be $FOUND_CAT = 1;.
Some things to take note of: for the header CATALYST_TH 1 specifically, there will always be a BACKPLANE A, or it could be BACKPLANE B. If there is a BACKPLANE B, I have to write another if statement and push everything into the hash again. It's tedious because other log files may go up to C or D, which I do not know of yet, and that makes my script hard to maintain.
Other headers only need one line, like PRECISION_AC 1. Only CATALYST_TH 1 will always have a backplane. I mention this in case it affects any answers.
So, any help on this? Is there any way to reduce the number of variables, or even the number of if statements? I've tried, but then it wouldn't push other data into the hash when the flag isn't true. Suggestions would be greatly appreciated.
P.S. Ignore the comments with one '#' symbol, those are part of the log file. The ones with two '#' symbols, like '##' are the comments I have added in.
Since your parsing carries a lot of state that depends on what your program has already seen, I would switch from regexes to Parse::RecDescent, which handles that kind of state well.
It's a steep learning curve at first though. There's a tutorial on it here, and an older, simpler tutorial here.
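If a full grammar feels like too much, a lighter option (not the Parse::RecDescent approach recommended above) is to keep a single "current frame" variable instead of one flag per header. The sketch below is in Python and makes several assumptions: the real log format is not shown in the question, so the BOARD pattern and the sample lines are invented and much simpler than the question's regexes.

```python
import re
from collections import defaultdict

# Header patterns; one "current frame" variable replaces the per-header flags.
TOP_HEADER = re.compile(r'(CATALYST_TH\s*\d+|PRECISION_AC\s*\d+)')
BACKPLANE = re.compile(r'(BACKPLANE\s*[A-Z])')   # covers A, B, C, D, ...
# Hypothetical, simplified data line: slot, part number, serial number, board name.
BOARD = re.compile(r'^\s*(\d+)\s+(\S+)\s+(\w+)\s+(?!EMPTY\b)(.+?)\s*$')

def parse(lines):
    details = defaultdict(list)
    head = None    # last top-level header seen
    frame = None   # frame that data lines are currently filed under
    for line in lines:
        if re.search(r'\bEND\b', line):
            head = frame = None            # one reset instead of many flags
            continue
        m = TOP_HEADER.search(line)
        if m:
            head = frame = m.group(1)
            continue
        m = BACKPLANE.search(line)
        if m and head:
            frame = f'{head} {m.group(1)}'
            continue
        m = BOARD.match(line)
        if m and frame:
            fields = ('slot', 'part_no', 'serial_no', 'board_name')
            for key, value in zip(fields, m.groups()):
                details[key].append(value)
            details['frame'].append(frame)
    return dict(details)

log = [
    'CATALYST_TH 1',
    'BACKPLANE A',
    '1 805-123-00 A1B2C3 Digital Board',
    '2 805-456-00 D4E5F6 EMPTY',
    'BACKPLANE B',
    '3 805-789-00 G7H8I9 Analog Board',
    'END',
    'PRECISION_AC 1',
    '4 805-000-11 J1K2L3 Precision Source',
    'END',
]
print(parse(log)['frame'])
```

A new backplane letter or a new single-line header then needs a pattern change at most, not a new flag and a new if block.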

How to remove asterisk from this spin syntax code?

Here is my code; it is a text spinner (synonym):
public function fetchContent($keyword)
{
$customContent = $this->getOption('custom_content_text');
$this->_setHttpStatusCode(200);
if (!$customContent)
{
$this->_setContentStatus(self::CONTENT_STATUS_NO_RESULTS);
return false;
}
if (preg_match_all('/({\*)(.*?)(\*})/', $customContent, $result))
{
if (is_array($result[0]))
{
foreach ($result[0] as $index => $group_string)
{
//replace the first or next pattern match with a replaceable token
$customContent = preg_replace('/(\{\*)(.*?)(\*\})/', '{#'.$index.'#}', $customContent, 1);
$words = explode('|', $result[2][$index]);
//clean and trim all words
$finalPhrase = array();
foreach ($words as $word)
{
if (preg_match('/\S/', $word))
{
$word = preg_replace('/{%keyword%}/i', $keyword, $word);
$finalPhrase[] = trim($word);
}
}
$finalPhrase = $finalPhrase[rand(0, count($finalPhrase) - 1)];
//now inject it back to where the token was
$customContent = str_ireplace('{#' . $index . '#}', $finalPhrase, $customContent);
}
$this->_setContentStatus(self::CONTENT_STATUS_PASSED);
}
}
return $customContent;
}
}
The regex expects braces with asterisks, like this:
{*spin1|spin2|spin3*}
Here are the regexes from the snippet above:
if (preg_match_all('/({\*)(.*?)(\*})/', $customContent, $result))
$customContent = preg_replace('/(\{\*)(.*?)(\*\})/', '{#'.$index.'#}', $customContent, 1);
I would like to remove the * so that the format is just {spin1|spin2|spin3}, which is more compatible with most spinners.
I tried some regexes that I found online, and I tried removing the * from both regexes, without success.
Thank you very much for your help.
Remove \* instead of just * – Lucas Trzesniewski
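To illustrate the comment: with the escaped asterisks (\*) dropped, the pattern matches plain {a|b|c} groups. Here is a small Python sketch of the same spinning logic, not the PHP from the question. The spin function and its choose parameter are hypothetical, and replacing {%keyword%} before spinning is my own assumption, made so that the keyword token is not mistaken for a spin group.

```python
import random
import re

# Spin groups without asterisks: {a|b|c}. Excluding braces inside the group
# keeps one group from running into the next.
SPIN = re.compile(r'\{([^{}]*)\}')

def spin(text, keyword='', choose=random.choice):
    """Resolve {a|b|c} groups; choose picks one option from each group."""
    # Substitute the keyword token first so it cannot be read as a spin group.
    text = re.sub(r'\{%keyword%\}', lambda m: keyword, text, flags=re.IGNORECASE)

    def pick(match):
        options = [w.strip() for w in match.group(1).split('|') if w.strip()]
        return choose(options) if options else ''

    return SPIN.sub(pick, text)

print(spin('{spin1|spin2|spin3} text'))
```

Passing a deterministic choose, for example lambda options: options[0], makes the behaviour easy to check.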

Parsing Microsoft Office 2013 MRU Lists in Registry using Perl

I am currently trying to parse the keys in a Windows 7 registry that contain the MRU lists for Microsoft Office 2013. However, when I attempt to run the Perl script in RegRipper, it says the plugin was not successfully run. I'm not sure if there is a syntax error in my code or if it is unable to parse the registry the way I have written it. The biggest problem is that one of the keys is named after the user's LiveId (it appears as LiveId_XXXXXXX), and this changes from user to user, so I would like this plugin to work no matter what the user's LiveId is. Thanks!
my $reg = Parse::Win32Registry->new($ntuser);
my $root_key = $reg->get_root_key;
# ::rptMsg("officedocs2013_File_MRU v.".$VERSION); # 20110830 [fpi] - redundant
my $tag = 0;
my $key_path = "Software\\Microsoft\\Office\\15.0";
if (defined($root_key->get_subkey($key_path))) {
$tag = 1;
}
if ($tag) {
::rptMsg("MSOffice version 2013 located.");
my $key_path = "Software\\Microsoft\\Office\\15.0";
my $of_key = $root_key->get_subkey($key_path);
if ($of_key) {
# Attempt to retrieve Word docs
my $word_mru_key_path = 'Software\\Microsoft\\Office\\15.0\\Word\\User MRU';
my $word_mru_key = $of_key->get_subkey($word_mru_key_path);
foreach ($word_mru_key->get_list_of_subkeys())
{
if ($key->as_string() =~ /LiveId_\w+/)
{
$word = join($key->as_string(),'\\File MRU');
::rptMsg($key_path."\\".$word);
::rptMsg("LastWrite Time ".gmtime($word_key->get_timestamp())." (UTC)");
my @vals = $word_key->get_list_of_values();
if (scalar(@vals) > 0) {
my %files
# Retrieve values and load into a hash for sorting
foreach my $v (@vals) {
my $val = $v->get_name();
if ($val eq "Max Display") { next; }
my $data = getWinTS($v->get_data());
my $tag = (split(/Item/,$val))[1];
$files{$tag} = $val.":".$data;
}
# Print sorted content to report file
foreach my $u (sort {$a <=> $b} keys %files) {
my ($val,$data) = split(/:/,$files{$u},2);
::rptMsg(" ".$val." -> ".$data);
}
}
else {
::rptMsg($key_path.$word." has no values.");
}
else {
::rptMsg($key_path.$word." not found.");
}
::rptMsg("");
}
}
The regex
LiveId_(\w+)
will capture the string after LiveId_. In Perl the captured text is available as $1 after a successful match; \1 is the backreference form used inside the pattern itself.
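As an illustration of matching the key name whatever the ID happens to be, here is a minimal Python sketch. It does not touch the registry or Parse::Win32Registry; file_mru_paths is a hypothetical helper that only takes a list of subkey names, and the sample ID is made up.

```python
import re

# Whole-name match: LiveId_ followed by one or more word characters.
LIVE_ID = re.compile(r'^LiveId_(\w+)$')

def file_mru_paths(user_mru_path, subkey_names):
    """Build the 'File MRU' path under every LiveId_* subkey, whatever the ID is."""
    paths = []
    for name in subkey_names:
        m = LIVE_ID.match(name)
        if m:
            # m.group(1) is the captured ID, the counterpart of $1 in Perl.
            paths.append((m.group(1), f'{user_mru_path}\\{name}\\File MRU'))
    return paths

base = r'Software\Microsoft\Office\15.0\Word\User MRU'
for live_id, path in file_mru_paths(base, ['LiveId_0123456789ABCDEF', 'Other']):
    print(live_id, path)
```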

regex validating telephone number, but chops white space using perl

So I have an HTML field in a form that takes in a phone number. It validates correctly when I use () or / or -; however, if I put in, say, 555 123 4567, it returns 555. As always, your help is greatly appreciated.
Here is my code
my $userName=param("userName");
my $password=param("password");
my $phoneNumber=param("phoneNumber");
my $email=param("email");
my $onLoad=param("onLoad");
my $userNameReg = "[a-zA-Z0-9_]+";
my $passwordReg = "([a-zA-Z]*)([A-Z]+)([0-9]+)";
my $phoneNumberReg = "((\(?)([2-9]{1}[0-9]{2})(\/|-|\)|\s)?([2-9]{1}[0-9]{2})(\/|-|\s)?([0-9]{4}))";
my $emailReg = "([a-zA-Z0-9_]{2,})(@)([a-zA-Z0-9_]{2,})(.)(com|COM)";
if ($onLoad !=1)
{
@controlValue = ($userName, $password, $phoneNumber, $email);
@regex = ($userNameReg, $passwordReg, $phoneNumberReg, $emailReg);
@validated;
for ($i=0; $i<4; $i++)
{
$retVal= validatecontrols ($controlValue[$i], $regex[$i]);
if ($retVal)
{
$count++;
}
if (!$retVal)
{
$validated[$i]="*"
}
}
sub validatecontrols
{
my $ctrlVal = shift();
my $regexVal = shift();
if ($ctrlVal =~ /^$regexVal$/)
{
return 1;
}
return 0;
}
}
*html code is here*
I realize that this is part of an assignment, so you may be working under specific constraints. However, your attempt to abstract out your data validation is honestly just making things messy and harder to follow. It also ties you down to regex tests specifically, which may not actually be the best bet. As has already been said, email validation should be done via a module.
Also, for this phone validation, an easier solution is just to strip out anything that isn't a number, and then do your validation test. The below code demonstrates what I'm talking about:
my $userName = param("userName");
my $password = param("password");
my $phoneNumber = param("phoneNumber");
my $email = param("email");
my $onLoad = param("onLoad");
my $error = 0;
if ($onLoad != 1)
{
    # Variable names are case-sensitive: the parameter was read into $userName.
    if ($userName !~ /^[a-zA-Z0-9_]+$/) {
        $userName = '*';
        $error++;
    }
    if ($password !~ /^[a-zA-Z]*[A-Z]+[0-9]+$/) {
        $password = '*';
        $error++;
    }
    # Strip everything that is not a digit, then validate the digits alone.
    (my $phoneNumOnly = $phoneNumber) =~ s/\D//g;
    if ($phoneNumOnly !~ /^1?[2-9]{1}\d{2}[2-9]{1}\d{6}$/) {
        $phoneNumber = '*';
        $error++;
    }
    if ($email !~ /^\w{2,}\@\w{2,}\.com$/i) {
        $email = '*';
        $error++;
    }
}
*html code is here*
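The strip-then-validate idea can be tried in isolation. Here is a minimal Python sketch using the same digit pattern as the Perl above; valid_phone is a hypothetical name.

```python
import re

def valid_phone(raw):
    """Drop every non-digit, then check the remaining digits against the pattern."""
    digits = re.sub(r'\D', '', raw)
    return re.fullmatch(r'1?[2-9]\d{2}[2-9]\d{6}', digits) is not None

for candidate in ['(555) 234-5678', '555 234 5678', '1-555-234-5678', '555 123 4567']:
    print(candidate, valid_phone(candidate))
```

Note that 555 123 4567 is rejected by this pattern regardless of the separators, because the exchange part (123) starts with 1 and the pattern requires 2 to 9 there.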
That regex you're using looks overly complicated. You have a lot of capturing groups in there, but I get the feeling you're mostly using them to define "OR" alternatives with the vertical bar. It's usually a lot easier to use a bracketed character class for this purpose if you're only selecting single characters. Also, it's not a good idea to use \s for normal spaces, since it matches any whitespace character (including tabs and newlines). Maybe try something like this:
(?:\(?[2-9]\d{2}\)?[-\/ ]?)?[2-9]\d{2}[-\/ ]?\d{4}
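A quick way to check what this pattern accepts is to run it against a few inputs. The sketch below uses Python's re.fullmatch, which plays the role of the ^...$ anchors in the Perl code; the helper name matches is made up.

```python
import re

# The suggested pattern; inside a character class the slash needs no escape in Python.
PHONE = re.compile(r'(?:\(?[2-9]\d{2}\)?[-/ ]?)?[2-9]\d{2}[-/ ]?\d{4}')

def matches(number):
    """True if the whole string is a phone number according to PHONE."""
    return PHONE.fullmatch(number) is not None

for candidate in ['555 234 5678', '(555)234-5678', '555/234/5678', '234-5678', '555\t234\t5678']:
    print(repr(candidate), matches(candidate))
```

The tab-separated input fails because the class lists a literal space instead of \s, and the area code is optional because the first group is followed by ?.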