I need to validate a hexadecimal string value (one containing only the characters A-F, a-f, or 0-9, in any combination).
I have searched various forums, including SO, and found some solutions, but none of them is satisfactory; each one fails to give the correct result for some inputs.
Below are some samples.
translate(upper(<VALUE-TO-CHECK>), '0123456789ABCDEF', '.') != '..'
The code above gives an incorrect result for values such as '1234567890ABCDEF', '000000', or '100000'.
REGEXP_LIKE(LTRIM(RTRIM(<VALUE-TO-CHECK>)), '[a-f|A-F|0-9].*');
The code above gives an incorrect result for the value 'Q1W'.
hex_num := TO_NUMBER(<VALUE-TO-CHECK>, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX');
EXCEPTION
WHEN value_error THEN -- When value_error that means not convertible to HEX value
RETURN FALSE;
The code above gives an incorrect result for a 64-character hexadecimal value, e.g. 'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC'.
Can anyone please help me validate hexadecimal values?
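select
  case
    -- whole string must be one or more hex digits; a negated class such as
    -- [^g-zG-Z] would also accept spaces and punctuation
    when regexp_like(:str, '^[0-9A-Fa-f]+$') then 'Hex'
    else 'NotHex'
  end typ
from dual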
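For comparison, the same idea can be sketched outside the database. This is a minimal Python illustration, not part of the original answer; the helper name is_hex is made up for the example. It shows why anchoring matters: the unanchored REGEXP_LIKE pattern in the question accepts 'Q1W' because the string merely contains a hex digit.

```python
import re

# One or more hex digits; fullmatch() requires the WHOLE string to match,
# which is the equivalent of anchoring with ^ and $.
HEX_PATTERN = re.compile(r"[0-9A-Fa-f]+")


def is_hex(value: str) -> bool:
    """Return True if value consists solely of hexadecimal digits (hypothetical helper)."""
    return HEX_PATTERN.fullmatch(value) is not None


for sample in ["1234567890ABCDEF", "000000", "Q1W", "C" * 64]:
    print(sample[:16], is_hex(sample))
```

Because the check is purely textual, the length of the value does not matter, unlike a conversion to a number.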
data = "000000000000000117c80378b8da0e33559b5997f2ad55e2f7d18ec1975b9717"
result1 = data.decode('hex')[::-1]
As I understand it, the hex data is decoded to decimal, which is 6,860,217,587,554,922,525,607,992,740,653,361,396,256,930,700,588,249,487,127.
Then the decimal number 6,860,217,587,554,922,525,607,992,740,653,361,396,256,930,700,588,249,487,127 is converted to bits, its order is reversed (little-endian), and the result is stored in the result1 variable as a bit array?
Is this exactly what happens with that code, or did I misunderstand something?
So the result1 variable is a bit array?
If it's just an integer variable, how can it hold such a long decimal value?
Strings in Python are written using double or single quotes, so the variable data contains a string.
You can check the type of a variable directly in Python:
data = "000000000000000117c80378b8da0e33559b5997f2ad55e2f7d18ec1975b9717"
type(data)
which outputs
str
meaning that the variable is a string.
When you call decode('hex') on a string in Python 2, you obtain another string:
data.decode('hex')
'\x00\x00\x00\x00\x00\x00\x00\x01\x17\xc8\x03x\xb8\xda\x0e3U\x9bY\x97\xf2\xadU\xe2\xf7\xd1\x8e\xc1\x97[\x97\x17'
Every character in your original string is interpreted as a hexadecimal digit, and every pair of hexadecimal digits, e.g. "17", is converted into the single character with that hexadecimal code, which is displayed using the escape sequence \x as "\x17". Characters that are printable ASCII are displayed as themselves, which is why the pair "78" shows up as x rather than \x78 in the output above.
When you write "\x41" you are telling Python to produce the single ASCII character whose hexadecimal code is 41.
The ASCII table contains the hexadecimal, decimal and octal values associated to the ascii characters.
If you try for example
"48454C4C4F".decode('hex')
you obtain the string "HELLO"
Lastly when you use [::-1] on a string you reverse it:
"48454C4C4F".decode('hex')[::-1]
produces the string "OLLEH"
You can find more about escape characters in the Python documentation.
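The snippets above are Python 2. As a rough sketch of the same steps in Python 3 (an assumption on my part, since the question does not use it), bytes.fromhex plays the role of decode('hex') and the result is a bytes object rather than a str. No decimal number or bit array is involved at any point; the integer conversion at the end is a separate operation, shown only for contrast.

```python
data = "000000000000000117c80378b8da0e33559b5997f2ad55e2f7d18ec1975b9717"

# One byte per pair of hex digits.
raw = bytes.fromhex(data)

# Slicing with [::-1] reverses the order of the bytes, not of the bits.
result1 = raw[::-1]
print(type(result1).__name__)

# The small example from the answer, in Python 3 form.
print(bytes.fromhex("48454C4C4F")[::-1])  # b'OLLEH'

# Interpreting the hex digits as a number is a different conversion.
# Python integers have arbitrary precision, so the size is not a problem.
as_int = int(data, 16)
print(as_int)
```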
One of my variables was coded as a messy mix of numeric values, text, parentheses, and so on. I only need to extract the leading numeric value, which is recorded as, for example, 12345 (the number of digits varies), followed by || and then a description that might also contain numeric values. When I applied the SAS compress function, newvar = compress(oldvar, '', 'a'), newvar kept ALL the numbers from oldvar, so it looks like 12345|||(789)|| etc. The number of '|' signs (which is a control character to indicate line breaks etc.?) varies, though.
I only need to extract the first numeric value before the '|' sign. Any help, please?
Thanks in advance.
Use the SCAN() function to extract the values. It will result in a character value and converting to a numeric should be straightforward.
new_var = input(scan(old_var, 1, "|"), best12.);
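To make the logic of that one-liner explicit, here is a small Python sketch of the same two steps (take the first '|'-delimited field, then convert it to a number). It is only an illustration; the function name first_number is hypothetical, and it assumes the first field contains nothing but digits and optional surrounding spaces.

```python
def first_number(old_var: str) -> int:
    """Return the number that precedes the first '|' (hypothetical helper)."""
    # Equivalent of scan(old_var, 1, "|"): keep only the first field.
    first_field = old_var.split("|", 1)[0]
    # Equivalent of input(..., best12.): convert the text to a number.
    return int(first_field.strip())


print(first_number("12345||description (789)||more text"))  # 12345
```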
This should do it:
substr("12345||45||89||...",1,find("12345||45||89||...","|",1)-1)
I need to ensure that the text entered in a textbox has a specific format: a number taken from a variable, then a decimal point, then any other number (1.10, 2.6, etc.). The important bit is that the first number must come from a variable, followed by a decimal point and then another number.
I have not been able to find anything specific enough, and the regex functionality looks like it needs some investigation to understand... A quick answer here would be great, though!
I instinctively (although I didn't expect it to work) tried:
If System.Text.RegularExpressions.Regex.IsMatch(txbCriterionNo.Text, OutcomeNo.ToString() + "." + "^[0-9]+$") Then
...
where OutcomeNo is an integer variable - so I hope you can see what I am aiming to get. So, the format MUST be integer variable - decimal point then another integer value.
What should work:
1.5 or 5.42 or 10.5
What shouldn't work:
.14 or a.1 or 1.c
etc...
Thanks!
Chris85 pointed me in the right direction, but I also needed to ensure that the first number matched a variable value so I have arrived at the following which works a treat...
If System.Text.RegularExpressions.Regex.IsMatch(txbCriterionNo.Text, "^\d+\.\d+$") AndAlso txbCriterionNo.Text.Substring(0, txbCriterionNo.Text.IndexOf(".")) = OutcomeNo Then
Here we first use the regex "^\d+\.\d+$" to make sure the format is correct, [number][decimal point][number]. The second check then finds the position of the decimal point and uses it to take the leading substring, which is compared against my variable OutcomeNo. AndAlso short-circuits, so the substring is only taken once the format check has passed.
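The two checks can also be folded into a single pattern by building the regex from the variable. The following is a Python sketch of that idea rather than VB.NET, and the function name matches_outcome is invented for the example; the same construction should carry over to Regex.IsMatch with Regex.Escape, though I have not tested that here.

```python
import re


def matches_outcome(text: str, outcome_no: int) -> bool:
    """True if text is '<outcome_no>.<digits>' (hypothetical helper)."""
    # re.escape guards against any regex metacharacters in the prefix;
    # \. matches a literal dot, [0-9]+ requires at least one digit after it.
    pattern = re.escape(str(outcome_no)) + r"\.[0-9]+"
    return re.fullmatch(pattern, text) is not None


print(matches_outcome("10.5", 10))  # True
print(matches_outcome("10.5", 1))   # False
print(matches_outcome(".14", 1))    # False
```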
Thanks all!!
This will allow only digits and a dot to be entered in the TextBox, and the value will have to start with a digit.
Private Sub txtValue_KeyPress(ByVal sender As Object, ByVal e As System.Windows.Forms.KeyPressEventArgs) Handles txtValue.KeyPress
    Dim box As TextBox = DirectCast(sender, TextBox)
    ' Allow digits, control keys (e.g. Backspace), and a single dot.
    If Not (Char.IsDigit(e.KeyChar) OrElse Char.IsControl(e.KeyChar) OrElse (e.KeyChar = "."c AndAlso box.Text.IndexOf(".") < 0)) Then
        e.Handled = True
        ' Clear the box if it starts with a dot instead of a digit.
        If box.Text.StartsWith(".") Then
            box.Text = ""
        End If
    End If
End Sub
In ColdFusion, URLDecode() decodes the URL-encoded string (URL encoding formats some characters with a percent sign and the two-character hexadecimal representation of the character).
Example: %3A is the hex equivalent of ":" (colon). When URLDecode() is applied to different strings, the results are as below:
URLDecode("%3A%") = :% -- Valid
URLDecode("%EE") = � -- this is because EE has no equivalent character.
But when I try to decode the string "%ara%", which is invalid, the result I get is "%ar%". The second occurrence of the character "a" is missing. Can anyone explain why this is happening?
EDIT: I should note that I want a general case for any hex array, not just the google one I provided.
EDIT BACKGROUND: Background is networking: I'm parsing a DNS packet and trying to get its QNAME. I'm taking in the whole packet as a string, and every character represents a byte. Apparently this problem looks like a Pascal string problem, and using the struct module seems like the way to go.
I have a char array in Python 2.7 which includes octal values. For example, let's say I have an array
DNS = "\03www\06google\03com\0"
I want to get:
www.google.com
What's an efficient way to do this? My first thought would be iterating through the DNS char array and adding chars to my new array answer. Every time I see a '\' char, I would ignore the '\' and the two chars after it. Is there a way to get the resulting www.google.com without using a new array?
My disgusting implementation (my answer is an array of chars, which is not what I want; I want just the string www.google.com):
DNS = "\\03www\\06google\\03com\\0"
answer = []
i = 0
while i < len(DNS):
    if DNS[i] == '\\' and DNS[i+1] != 0:
        i += 3
    elif DNS[i] == '\\' and DNS[i+1] == 0:
        break
    else:
        answer.append(DNS[i])
        i += 1
Now that you've explained your real problem, none of the answers you've gotten so far will work. Why? Because they're all ways to remove sequences like \03 from a string. But you don't have sequences like \03, you have single control characters.
You could, of course, do something similar, just replacing any control character with a dot.
But what you're really trying to do is not replace control characters with dots, but parse DNS packets.
DNS is defined by RFC 1035. The QNAME in a DNS packet is:
a domain name represented as a sequence of labels, where each label consists of a length octet followed by that number of octets. The domain name terminates with the zero length octet for the null label of the root. Note that this field may be an odd number of octets; no padding is used.
So, let's parse that. If you understand how "labels consisting of a length octet followed by that number of octets" relates to "Pascal strings", there's a quicker way. Also, you could write this more cleanly and less verbosely as a generator. But let's do it the dead-simple way:
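import struct

def parse_qname(packet):
    components = []
    offset = 0
    while True:
        # Each label starts with a one-byte length.
        length, = struct.unpack_from('B', packet, offset)
        offset += 1
        if not length:
            # A zero-length label terminates the name.
            break
        # unpack_from returns a tuple, so unpack its single element.
        component, = struct.unpack_from('{}s'.format(length), packet, offset)
        offset += length
        components.append(component)
    return components, offset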
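To get from the labels to the dotted name the question asks for, they just need to be joined. Below is a self-contained sketch of the same length-prefixed walk, written for Python 3 (where a packet is a bytes object and indexing yields integers) rather than the Python 2.7 of the question. The function name qname_to_hostname is made up for illustration, and the sketch deliberately ignores DNS name compression pointers, which a real parser would have to handle.

```python
def qname_to_hostname(packet: bytes, offset: int = 0):
    """Return (dotted name, offset just past the QNAME). Hypothetical helper."""
    labels = []
    while True:
        length = packet[offset]  # length octet of the next label
        offset += 1
        if length == 0:          # zero-length root label ends the name
            break
        labels.append(packet[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels), offset


name, end = qname_to_hostname(b"\x03www\x06google\x03com\x00")
print(name, end)  # www.google.com 16
```

Returning the end offset alongside the name lets the caller continue parsing the fields that follow the QNAME.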
import re
DNS = "\\03www\\06google\\03com\\0"
m = re.sub("\\\\([0-9,a-f]){2}", "", DNS)
print(m)
Maybe something like this?
#!/usr/bin/python3
import re
def convert(adorned_hostname):
    result1 = re.sub(r'^\\03', '', adorned_hostname)
    result2 = re.sub(r'\\0[36]', '.', result1)
    result3 = re.sub(r'\\0$', '', result2)
    return result3

def main():
    adorned_hostname = r"\03www\06google\03com\0"
    expected_result = 'www.google.com'
    actual_result = convert(adorned_hostname)
    print(actual_result, expected_result)
    assert actual_result == expected_result

main()
For the question as originally asked, replacing the backslash-hex sequences in strings like "\\03www\\06google\\03com\\0" with dots…
If you want to do this with a regular expression:
\\ matches a backslash.
[0-9A-Fa-f] matches any hex digit.
[0-9A-Fa-f]+ matches one or more hex digits.
\\[0-9A-Fa-f]+ matches a backslash followed by one or more hex digits.
You want to find each such sequence, and replace it with a dot, right? If you look through the re docs, you'll find a function called sub which is used for replacing a pattern with a replacement string:
re.sub(r'\\[0-9A-Fa-f]+', '.', DNS)
I suspect these may actually be octal, not hex, in which case you want [0-7] rather than [0-9A-Fa-f], but nothing else would change.
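Put together, and assuming the sequences really are octal, a quick sketch looks like this. Note that the substitution also turns the leading and trailing sequences into dots, so they have to be stripped afterwards; and a label that itself begins with a digit from 0 to 7 would be partly swallowed by the pattern, which is one reason this approach is fragile.

```python
import re

# Raw string: the backslashes are literal characters, as in the question.
DNS = r"\03www\06google\03com\0"

# Replace every backslash-plus-octal-digits run with a dot.
dotted = re.sub(r"\\[0-7]+", ".", DNS)
print(dotted)  # .www.google.com.

# Drop the dots produced by the first and last sequences.
hostname = dotted.strip(".")
print(hostname)  # www.google.com
```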
A different way to do this is to recognize that these are valid Python escape sequences. And, if we unescape them back to where they came from (e.g., with DNS.decode('string_escape')), this turns into a sequence of length-prefixed (aka "Pascal") strings, a standard format that you can parse in any number of ways, including the stdlib struct module. This has the advantage of validating the data as you read it, and not being thrown off by any false positives that could show up if one of the string components, say, had a backslash in the middle of it.
Of course that's presuming more about the data. It seems likely that the real meaning of this is "a sequence of length-prefixed strings, concatenated, then backslash-escaped", in which case you should parse it as such. But it could be just a coincidence that it looks like that, in which case it would be a very bad idea to parse it as such.
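If the data really is backslash-escaped, the unescape-then-parse route might look roughly like the sketch below. It is written for Python 3, which has no 'string_escape' codec, so 'unicode_escape' is used here as a stand-in; that substitution, and the assumption that the input is pure ASCII, are mine rather than part of the answer above.

```python
escaped = r"\03www\06google\03com\0"

# Undo the backslash escapes, then go back to raw bytes.
# latin-1 maps code points 0-255 one-to-one onto byte values.
raw = escaped.encode("ascii").decode("unicode_escape").encode("latin-1")
print(raw)  # b'\x03www\x06google\x03com\x00'

# Walk the length-prefixed ("Pascal") strings until the zero-length terminator.
labels = []
offset = 0
while raw[offset] != 0:
    length = raw[offset]
    labels.append(raw[offset + 1:offset + 1 + length].decode("ascii"))
    offset += 1 + length

hostname = ".".join(labels)
print(hostname)  # www.google.com
```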