Regex to get password from a long string of mess - regex

I am using PowerShell and am getting the output below from my program.
I am having problems getting the password out of the mess of other things. Ideally I need to get Hiva!!66 by itself. I am using a regex to accomplish this and it's just not working. The password will always be 8 characters and have an uppercase letter, a lowercase letter and a special character. I have created the split and everything else I need, but the regex part is messing with me.
I am aware that there are a lot of questions about regex and passwords, but those don't seem to have a lot of mess before and after the password. Any help would be appreciated.
My best attempt so far is:
"(?=.*\d)(?=.*[A-Z])(?=.*[!##\$%\^&\*\~()_\+\-={}\[\]\\:;`"'<>,./]).{8}$"
C:\Users\<username>\AppData\Roaming\Crystal Point\OutsideView\Macro\CONNECTEXP.VCB:5:For intTmp = 1 To 4
C:\Users\<username>\AppData\Roaming\Crystal Point\OutsideView\Macro\CONNECTEXP.VCB:8:cboCOMPort.SelectString 1, "1"
C:\Users\<username>\AppData\Roaming\Crystal Point\OutsideView\Macro\CONNECTEXP.VCB:11:str2CRLF = Chr(13) & Chr(10) & Chr(13) & Chr(10)
C:\Users\<username>\AppData\Roaming\Crystal Point\OutsideView\Macro\CONNECTEXP.VCB:14: & "include emulation type (currently Tandem), the I/O method (currently Async) and host connection information
for the session (currently COM9, 8N1)" _
C:\Users\<username>\AppData\Roaming\Crystal Point\OutsideView\Macro\CONNECTEXP.VCB:15: & " to the correct values for your target host (e.g., TCP/IP and host IP name or address) and save the
IOSet "CHARSIZE", "8"
PASS="Hiva!!66" If DDEAppReturnCode() <> 0 Then
If DDEAppReturnCode() <> 0 Then
C:\Users\<username>\AppData\Roaming\Crystal Point\OutsideView\Macro\DDEtoXL.vcb:28: MsgBox "Could not load " & txtWorkSheet.text, 48
C:\Users\<username>\AppData\Roaming\Crystal Point\OutsideView\Macro\DDEtoXL.vcb:37:DDESheetChan = -1
C:\Users\<username>\AppData\Roaming\Crystal Point\OutsideView\Macro\DDEtoXL.vcb:38:DDESystemChan = -2

If you can't count on the quotes or the PASS= being there, you'll have to rely on the password's composition to do everything. The following regex matches a string of eight consecutive characters of the allowed types, with the lookahead and lookbehind to make sure there aren't more than eight.
$regex = [regex] @'
(?x)
(?<![!@#$%^&*~()_+\-={}\[\]\\:;`<>,./A-Za-z0-9])
(?:
[!@#$%^&*~()_+\-={}\[\]\\:;`<>,./]()
|
[A-Z]()
|
[a-z]()
|
[0-9]()
){8}
\1\2\3\4
(?![!@#$%^&*~()_+\-={}\[\]\\:;`<>,./A-Za-z0-9])
'@
It also verifies that there's at least one of each character type: uppercase letter, lowercase letter, digit and special. The lookahead approach used in your regex won't work because it can look too far ahead, beyond the end of the word you're trying to match. Instead, I put an empty group in each branch to act like check boxes. If a backreference to one of those groups fails, it means that branch didn't participate in the match, meaning in turn that the associated character type was not present.
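The check-box idea can be tried outside PowerShell as well. The following is a minimal sketch in Python, assuming the character classes are adapted from the pattern above and that the sample text is a shortened stand-in for the real output; the names SPECIAL, ALLOWED, PASSWORD and find_passwords are made up for illustration.

```python
import re

# Character classes adapted from the answer's pattern (assumption: '@' is
# one of the allowed special characters).
SPECIAL = r"!@#$%^&*~()_+\-={}\[\]\\:;`<>,./"
ALLOWED = SPECIAL + "A-Za-z0-9"

PASSWORD = re.compile(
    rf"(?<![{ALLOWED}])"                                # no allowed character before
    rf"(?:[{SPECIAL}]()|[A-Z]()|[a-z]()|[0-9]()){{8}}"  # empty groups 1-4 are the check boxes
    r"\1\2\3\4"                                         # fails if any branch was never used
    rf"(?![{ALLOWED}])"                                 # no allowed character after
)

def find_passwords(text):
    """Return every 8-character run that contains all four character types."""
    return [m.group(0) for m in PASSWORD.finditer(text)]

# Shortened, hypothetical stand-in for the program output.
sample = 'IOSet "CHARSIZE", "8"\nPASS="Hiva!!66" If DDEAppReturnCode() <> 0 Then'
print(find_passwords(sample))
```

Note that "CHARSIZE" is also exactly eight allowed characters between quotes, but it is rejected because the special, lowercase and digit boxes were never ticked.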

Did you try the following regex?
^PASS="(.{8})"

Just use this
(?<=PASS=").+(?=")
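One thing to watch with this pattern is that .+ is greedy, so it runs to the last quote on the line rather than the first. A small Python sketch of the difference, assuming a hypothetical line that contains a second quoted string after the password:

```python
import re

# Hypothetical line with another quoted value after the password.
line = 'PASS="Hiva!!66" If x = "y"'

# Greedy .+ stops at the LAST quote that can still satisfy the lookahead.
greedy = re.search(r'(?<=PASS=").+(?=")', line).group(0)

# Excluding quotes from the match stops at the FIRST closing quote.
bounded = re.search(r'(?<=PASS=")[^"]+(?=")', line).group(0)

print(greedy)   # Hiva!!66" If x = "y
print(bounded)  # Hiva!!66
```

In the output shown in the question the password line has no later quote, so both variants would return the same value there.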

You can extract the password from that output with something like this:
... | ? { $_ -cmatch 'PASS="(.{8})"' } | % { $matches[1] }
or like this (in PowerShell v3):
... | Select-String -Case 'PASS="(.{8})"' | % { $_.Matches.Groups[1].Value }
In PowerShell v2 you'll have to do something like this if you want to use Select-String:
... | Select-String -Case 'PASS="(.{8})"' | select -Expand Matches |
select -Expand Groups | select -Last 1 | % { $_.Value }

Related

Powershell Regex question. Escape parenthesis

Been beating my head around this one all day and I'm getting close but not quite getting there. I have a small subset of my much larger script for just the regex part. Here is the script so far:
$CCI_ID = @(
"003417 AR-2.1"
"003425 AR-2.9"
"003392 AP-1.12"
"009012 APP-1(21).1"
)
[regex]::matches($CCI_ID, '(\d{1,})|([a-zA-Z]{2}[-][\d][\(?\){0,1}[.][\d]{1,})') |
ForEach-Object {
if($_.Groups[1].Value.length -gt 0){
write-host $('CCI-' + $_.Groups[1].Value.trim())}
else{$_.Groups[2].Value.trim()}
}
CCI-003417
AR-2.1
CCI-003425
AR-2.9
CCI-003392
AP-1.12
CCI-009012
PP-1(21
CCI-1
The output is correct for all but the last one. It should be:
CCI-009012
APP-1(21).1
Thanks for any advice.
Instead of describing and quantifying the (optional) opening and closing parenthesis separately, group them together and then make the whole group optional:
(?:\(\d+\))?
The whole pattern thus ends up looking like:
[regex]::Matches($CCI_ID, '(\d{1,})|([a-zA-Z]{2,3}[-][\d](?:\(\d+\))?[.][\d]{1,})')
Your pattern uses an alternation (|), but judging by the example data you can instead match the digits, then one or more whitespace characters, then the second part, all in a single match.
If the pattern matches, group 1 always contains one or more digits, so you don't have to check Value.length.
The pattern with the optional digits between parentheses:
\b(\d+)\s+([a-zA-Z]{2,}-\d(?:\(\d+\))?\.\d+)\b
See a regex101 demo.
$CCI_ID = @(
"003417 AR-2.1"
"003425 AR-2.9"
"003392 AP-1.12"
"009012 APP-1(21).1"
)
[regex]::matches($CCI_ID, '\b(\d+)\s+([a-zA-Z]{2,}-\d(?:\(\d+\))?\.\d+)\b') |
ForEach-Object {
write-host $( 'CCI-' + $_.Groups[1].Value.trim() )
write-host $_.Groups[2].Value.trim()
}
Output
CCI-003417
AR-2.1
CCI-003425
AR-2.9
CCI-003392
AP-1.12
CCI-009012
APP-1(21).1
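For comparison, here is the same pattern exercised in Python as a rough sketch. It applies the regex line by line instead of to the joined array; format_cci and CCI_LINE are illustrative names, not part of the original script.

```python
import re

# Digits, whitespace, then letters-digit with an optional "(digits)" group.
CCI_LINE = re.compile(r"\b(\d+)\s+([a-zA-Z]{2,}-\d(?:\(\d+\))?\.\d+)\b")

cci_ids = ["003417 AR-2.1", "003425 AR-2.9", "003392 AP-1.12", "009012 APP-1(21).1"]

def format_cci(lines):
    """Return the 'CCI-<number>' and control id for every line that matches."""
    out = []
    for line in lines:
        m = CCI_LINE.search(line)
        if m:
            out.append("CCI-" + m.group(1))
            out.append(m.group(2))
    return out

print("\n".join(format_cci(cci_ids)))
```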
As you are experiencing here, regular expressions can become very complex and unreadable.
Therefore it is often a good idea to view your problem from two different angles:
Try matching the part(s) you want, or
Try matching the part(s) you don't want
In your case it is probably easier to match the part that you don't want (the delimiter, i.e. the space) and split your string on that, which is apparently what you want to achieve:
$CCI_ID | Foreach-Object {
$Split = $_ -Split '\s+', 2
'CCI-' + $Split[0]
$Split[1]
}
$_ -Split '\s+', 2 splits the string on one or more whitespace characters (you might also consider a literal space: -Split ' '). The , 2 prevents the string from being split into more than 2 parts, meaning that the second part will not be split further even if it contains spaces.
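The limited split translates directly to other languages. A minimal Python sketch, where split_cci is a made-up helper name; note that Python counts splits rather than parts, so PowerShell's limit of 2 parts corresponds to maxsplit=1:

```python
import re

def split_cci(line):
    """Split on the first run of whitespace only; the rest stays intact."""
    number, control = re.split(r"\s+", line.strip(), maxsplit=1)
    return "CCI-" + number, control

print(split_cci("009012 APP-1(21).1"))
```

Because only the first whitespace run is used as a delimiter, a second part that itself contains spaces is returned unchanged.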

Change 3rd octet of IP in string format using PowerShell

Think I've found the worst way to do this:
$ip = "192.168.13.1"
$a,$b,$c,$d = $ip.Split(".")
[int]$c = $c
$c = $c+1
[string]$c = $c
$newIP = $a+"."+$b+"."+$c+"."+$d
$newIP
But what is the best way? It has to be a string when completed. I'm not bothered about validating that it's a legit IP.
Using your example for how you want to modify the third octet, I'd do it pretty much the same way, but I'd compress some of the steps together:
$IP = "192.168.13.1"
$octets = $IP.Split(".") # or $octets = $IP -split "\."
$octets[2] = [string]([int]$octets[2] + 1) # or other manipulation of the third octet
$newIP = $octets -join "."
$newIP
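The same split, modify, join sequence looks like this in Python. This is only a sketch: bump_octet and its index and delta parameters are hypothetical, and like the original it does no validation.

```python
def bump_octet(ip, index=2, delta=1):
    """Add delta to one octet (0-based index) and return the IP as a string."""
    octets = ip.split(".")
    octets[index] = str(int(octets[index]) + delta)
    return ".".join(octets)

print(bump_octet("192.168.13.1"))  # 192.168.14.1
```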
You can simply use the -replace operator of PowerShell and a look ahead pattern. Look at this script below
Set-StrictMode -Version "2.0"
$ErrorActionPreference="Stop"
cls
$ip1 = "192.168.13.123"
$tests=@("192.168.13.123" , "192.168.13.1" , "192.168.13.12")
foreach($test in $tests)
{
$patternRegex="\d{1,3}(?=\.\d{1,3}$)"
$newOctet="420"
$ipNew=$test -replace $patternRegex,$newOctet
$msg="OLD ip={0} NEW ip={1}" -f $test,$ipNew
Write-Host $msg
}
This will produce the following:
OLD ip=192.168.13.123 NEW ip=192.168.420.123
OLD ip=192.168.13.1 NEW ip=192.168.420.1
OLD ip=192.168.13.12 NEW ip=192.168.420.12
How to use the -replace operator?
https://powershell.org/2013/08/regular-expressions-are-a-replaces-best-friend/
Understanding the pattern that I have used
The (?=...) in \d{1,3}(?=\.\d{1,3}$) is a lookahead: it checks what follows without making it part of the match.
The (?=\.\d{1,3}$) part requires that a literal dot and 1-3 digits follow, ending at the end of the string.
The leading \d{1,3} is an instruction to specifically match 1-3 digits.
All combined in plain English: "Give me 1-3 digits that are followed by a period and 1-3 digits at the right-hand end of the string."
Look ahead regex
https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference
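To see that the assertion after the digits really is a lookahead and leaves the fourth octet untouched, the same pattern can be run in Python. A small sketch, with replace_third_octet as an illustrative name:

```python
import re

# 1-3 digits that are followed by ".<1-3 digits>" at the end of the string.
THIRD_OCTET = re.compile(r"\d{1,3}(?=\.\d{1,3}$)")

def replace_third_octet(ip, new_octet):
    """Swap in new_octet; the lookahead text is checked but not consumed."""
    return THIRD_OCTET.sub(new_octet, ip)

for test in ("192.168.13.123", "192.168.13.1", "192.168.13.12"):
    print(test, "->", replace_third_octet(test, "420"))
```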
If you have PowerShell Core (v6.1 or higher), you can combine -replace with a script block-based replacement:
PS> '192.168.13.1' -replace '(?<=^(\d+\.){2})\d+', { 1 + $_.Value }
192.168.14.1
Positive look-behind assertion (?<=^(\d+\.){2}) matches everything up to, but not including, the 3rd octet - without considering it part of the overall match to replace.
(?<=...) is the look-behind assertion, \d+ matches one or more (+) digits (\d), \. a literal ., and {2} matches the preceding subexpression ((...)) 2 times.
\d+ then matches just the 3rd octet; since nothing more is matched, the remainder of the string (. and the 4th octet) is left in place.
Inside the replacement script block ({ ... }), $_ refers to the result of the match, in the form of a [System.Text.RegularExpressions.Match] instance; its .Value is the matched string, i.e. the 3rd octet, to which 1 can be added.
Data type note: by using 1, an implicit [int], as the LHS, the RHS (the .Value string) is implicitly coerced to [int] (you may choose to use an explicit cast).
On output, whatever the script block returns is automatically coerced back to a string.
If you must remain compatible with Windows PowerShell, consider Jeff Zeitlin's helpful answer.
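A callback-based replacement exists in other regex libraries too. The sketch below uses Python's re.sub with a function as the replacement. Python's re module only accepts fixed-width look-behind, so this version captures the first two octets in a group and puts them back instead; increment_third_octet is a hypothetical name.

```python
import re

def increment_third_octet(ip):
    """Add 1 to the third octet using a replacement callback."""
    return re.sub(
        r"^((?:\d+\.){2})(\d+)",                          # group 1: first two octets, group 2: third
        lambda m: m.group(1) + str(int(m.group(2)) + 1),  # rebuild prefix + incremented octet
        ip,
    )

print(increment_third_octet("192.168.13.1"))  # 192.168.14.1
```

As in the PowerShell version, nothing after the third octet is matched, so the rest of the string is left in place.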
To complete your method, but more briefly:
$a,$b,$c,$d = "192.168.13.1".Split(".")
$IP="$a.$b.$([int]$c+1).$d"
function Replace-3rdOctet {
Param(
[string]$GivenIP,
[string]$New3rdOctet
)
$GivenIP -match '(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})' | Out-Null
$Output = "$($matches[1]).$($matches[2]).$New3rdOctet.$($matches[4])"
Return $Output
}
Copy to a ps1 file and dot source it from command line, then type
Replace-3rdOctet -GivenIP '100.201.190.150' -New3rdOctet '42'
Output: 100.201.42.150
From there you could add extra error handling etc for random input etc.
here's a slightly different method. [grin] i managed to not notice the answer by JeffZeitlin until after i finished this.
[edit - thanks to JeffZeitlin for reminding me that the OP wants the final result as a string. oops! [*blush*]]
what it does ...
splits the string on the dots
puts that into an [int] array & coerces the items into that type
increments the item in the targeted slot
joins the items back into a string with a dot for the delimiter
converts that to an IP address type
adds a line to convert the IP address to a string
here's the code ...
$OriginalIPv4 = '1.1.1.1'
$TargetOctet = 3
$OctetList = [int[]]$OriginalIPv4.Split('.')
$OctetList[$TargetOctet - 1]++
$NewIPv4 = [ipaddress]($OctetList -join '.')
$NewIPv4
'=' * 30
$NewIPv4.IPAddressToString
output ...
Address : 16908545
AddressFamily : InterNetwork
ScopeId :
IsIPv6Multicast : False
IsIPv6LinkLocal : False
IsIPv6SiteLocal : False
IsIPv6Teredo : False
IsIPv4MappedToIPv6 : False
IPAddressToString : 1.1.2.1
==============================
1.1.2.1

Match the word "bar" if found anywhere in a field

I am trying to use a CASE statement in Google Data Studio to return a Boolean result if a given string is found within an existing field.
As Google Data Studio uses RE2 RegEx syntax, I believe the following would work, but it returns a could not parse formula error:
CASE
WHEN REGEXP_MATCH(Foo, '(\W|^)bar(\W|$)') THEN 1
ELSE 0
END
I have tried many different combinations of RegEx syntax, but can't work it out. Any help would be much appreciated as this should be a simple REGEXP_MATCH?
The Boolean result should be true if the string is found anywhere within the field:
+---------------------------+----------------+
| Foo | Boolean Result |
+---------------------------+----------------+
| blah bar / boo doo | True |
| but is / should not match | False |
| but match / here bar | True |
+---------------------------+----------------+
You need to make sure you match the whole string with the pattern that you want to use in a REGEXP_MATCH and when using regex escapes, make sure to double escape them:
CASE WHEN REGEXP_MATCH(Foo, '(.*\\W|^)bar(\\W.*|$)') THEN 1 ELSE 0 END
If there are line breaks in Foo, add (?s) at the start of the pattern.
Details
(.*\\W|^) - either any 0+ chars as many as possible followed with a non-word char or start of a string
bar - the word
(\\W.*|$) - either a non-word char followed with any 0+ chars as many as possible or end of a string
See the regex demo.
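The whole-string requirement can be illustrated with Python's re.fullmatch, which is used here only as a stand-in for how REGEXP_MATCH is described above; the rows are the three sample values from the question, and has_bar is an illustrative name.

```python
import re

rows = [
    "blah bar / boo doo",
    "but is / should not match",
    "but match / here bar",
]

# Pattern from the answer: .* on both sides lets it cover the whole field.
WORD = re.compile(r"(.*\W|^)bar(\W.*|$)")
# Pattern from the question: fine for a partial search, too short for a full match.
PARTIAL = re.compile(r"(\W|^)bar(\W|$)")

def has_bar(value):
    """True when the whole value matches, i.e. 'bar' appears as a word."""
    return WORD.fullmatch(value) is not None

print([has_bar(r) for r in rows])  # [True, False, True]
```

The question's pattern finds "bar" when searching inside the first row, but it cannot match that row in full, which is why the extra .* is needed.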
A Boolean field can be created using the single REGEXP_MATCH Calculated Field below, where \\b on either side of bar represents a Word Boundary thus matching bar but not bark, embark or embar:
REGEXP_MATCH(Foo, ".*(\\bbar\\b).*")

extract a variable value from the middle of a string

I have been trying to figure this out for quite some time. How do I get the PID value from the following string using PowerShell? I thought regex was the way to go, but I can't quite figure out the syntax.
For what it is worth everything except for the PID will remain the same.
$foo = '<VALUE>I am just a string and the string is the thing. PID:25973. After this do that and blah blah.</VALUE>'
I have tried the following in regex
[regex]::Matches($foo, 'PID:.*') | % {$_.Captures[0].Groups[1].value}
[regex]::Matches($foo, 'PID:*?>') | % {$_.Captures[0].Groups[1].value}
[regex]::Matches($foo, 'PID:*?>(.+).') | % {$_.Captures[0].Groups[1].value}
For your regex you'll want to indicate what's before and after the portion you're looking for. PID:.* will find everything from the PID to the end of the string.
And to use a capture group you'll want to have some ( and ) in your regex, which defines a group.
So try this on for size:
[regex]::Matches($foo,'PID:(\d+)') | % {$_.Captures[0].Groups[1].value}
I'm using a regex of PID:(\d+). The \d+ means "one or more digits". The parentheses around that (\d+) identifies it as a group I can access using Captures[0].Groups[1].
Here's another option. Basically it replaces everything with the first capture group (which is the digits after 'PID:'):
$foo -replace '^.+PID:(\d+).+$','$1'
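Both approaches, capturing the digits and replacing the whole string with the captured group, can be compared side by side. A short Python sketch using the string from the question; the variable names are illustrative:

```python
import re

foo = ("<VALUE>I am just a string and the string is the thing. "
       "PID:25973. After this do that and blah blah.</VALUE>")

# Approach 1: search for the marker and read capture group 1.
pid_by_capture = re.search(r"PID:(\d+)", foo).group(1)

# Approach 2: match the entire string and keep only capture group 1.
pid_by_replace = re.sub(r"^.+PID:(\d+).+$", r"\1", foo)

print(pid_by_capture, pid_by_replace)  # 25973 25973
```

The replace variant returns the input unchanged when there is no PID, whereas the search variant returns no match object, so the two differ in how a missing value shows up.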

Regex: how to determine odd/even number of occurrences of a char preceding a given char?

I would like to replace the | with OR only in unquoted terms, eg:
"this | that" | "the | other" -> "this | that" OR "the | other"
Yes, I could split on space or quote, get an array and iterate through it, and reconstruct the string, but that seems ... inelegant. So perhaps there's a regex way to do this by counting "s preceding | and obviously odd means the | is quoted and even means unquoted. (Note: Processing doesn't start until there is an even number of " if there is at least one ").
It's true that regexes can't count, but they can be used to determine whether there's an odd or even number of something. The trick in this case is to examine the quotation marks after the pipe, not before it.
str = str.replace(/\|(?=(?:(?:[^"]*"){2})*[^"]*$)/g, "OR");
Breaking that down, (?:[^"]*"){2} matches the next pair of quotes if there is one, along with the intervening non-quotes. After you've done that as many times as possible (which might be zero), [^"]*$ consumes any remaining non-quotes until the end of the string.
Of course, this assumes the text is well-formed. It doesn't address the problem of escaped quotes either, but it can if you need it to.
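The lookahead carries over unchanged to other regex flavours. Here is a minimal Python sketch, assuming well-formed input with no escaped quotes, as noted above; pipes_to_or is a made-up name.

```python
import re

# A pipe counts as unquoted when an even number of quotes follows it.
UNQUOTED_PIPE = re.compile(r'\|(?=(?:(?:[^"]*"){2})*[^"]*$)')

def pipes_to_or(query):
    """Replace every pipe that sits outside double quotes with OR."""
    return UNQUOTED_PIPE.sub("OR", query)

print(pipes_to_or('"this | that" | "the | other"'))
# "this | that" OR "the | other"
```

Unquoted terms work as well, since a pipe with no quotes after it has zero (an even number of) quotes following it.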
Regexes do not count. That's what parsers are for.
You might find the Perl FAQ on this issue relevant.
#!/usr/bin/perl
use strict;
use warnings;
my $x = qq{"this | that" | "the | other"};
print join('" OR "', split /" \| "/, $x), "\n";
You don't need to count, because you don't nest quotes. This will do:
#!/usr/bin/perl
my $str = '" this \" | that" | "the | other" | "still | something | else"';
print "$str\n";
while($str =~ /^((?:[^"|\\]*|\\.|"(?:[^\\"]|\\.)*")*)\|/) {
$str =~ s/^((?:[^"|\\]*|\\.|"(?:[^\\"]|\\.)*")*)\|/$1OR/;
}
print "$str\n";
Now, let's explain that expression.
^ -- always match from the beginning of the string; otherwise the match might start inside a quote and break everything.
(...)\| -- match a certain pattern, followed by a | (escaped here); replacing it with $1OR keeps everything captured and replaces only the |.
(?:...)* -- a non-capturing group, which can be repeated multiple times; the group lets us repeat a set of alternative patterns.
[^"|\\]* -- the first pattern: anything that isn't a pipe, an escape character or a quote.
\\. -- the second pattern: an escape character and whatever follows it.
"(?:...)*" -- the third pattern: an opening quote, followed by another non-capturing group repeated multiple times, followed by a closing quote.
[^\\"] -- the first pattern in the second non-capturing group: anything except an escape character or a quote.
\\. -- the second pattern in the second non-capturing group: an escape character and whatever follows it.
The result is as follows:
" this \" | that" | "the | other" | "still | something | else"
" this \" | that" OR "the | other" OR "still | something | else"
Another approach (similar to Alan M's working answer):
str = str.replace(/(".+?"|\w+)\s*\|\s*/g, '$1 OR ');
The part inside the first group (spaced for readability):
".+?" | \w+
... basically means, something quoted, or a word. The remainder means that it was followed by a "|" wrapped in optional whitespace. The replacement is that first part ("$1" means the first group) followed by " OR ".
Perhaps you're looking for something like this:
(?<=^([^"]*"[^"]*")+[^"|]*)\|
Thanks everyone. Apologies for neglecting to mention this is in javascript and that terms don't have to be quoted, and there can be any number of quoted/unquoted terms, eg:
"this | that" | "the | other" | yet | another -> "this | that" OR "the | other" OR yet OR another
Daniel, it seems that's in the ballpark, ie basically a matching/massaging loop. Thanks for the detailed explanation. In js, it looks like a split, a forEach loop on the array of terms, pushing a term (after changing a | term to OR) back into an array, and a re join.
@Alan M, works nicely, escaping not necessary due to the sparseness of sqlite FTS capabilities.
@epost, accepted solution for brevity and elegance, thanks. It merely needed to be put in a more general form for unicode etc.
(".+?"|[^\"\s]+)\s*\|\s*
My solution in C# to count the quotes and then regex to get the matches:
// Count the number of quotes.
var quotesOnly = Regex.Replace(searchText, @"[^""]", string.Empty);
var quoteCount = quotesOnly.Length;
if (quoteCount > 0)
{
// If the quote count is an odd number there's a missing quote.
// Assume a quote is missing from the end - executive decision.
if (quoteCount%2 == 1)
{
searchText += @"""";
}
// Get the runs of text between quotes. The quotes themselves are excluded.
// e.g. The following line:
// "this and that" or then and "this or other"
// will result in the following values after trimming:
// 1. this and that
// 2. or then and
// 3. this or other
var matches = Regex.Matches(searchText, @"([^\""]*)", RegexOptions.Singleline);
var list = new List<string>();
foreach (var match in matches.Cast<Match>())
{
var value = match.Groups[0].Value.Trim();
if (!string.IsNullOrEmpty(value))
{
list.Add(value);
}
}
// TODO: Do something with the list of strings.
}