Regex to extract first two octets of IP address in hive - regex

I want to extract only the first two octets of an IP address in Hive.
Can anybody please tell me the equivalent regex supported in Hive?
For example, extract '192.96.0.0' from the ip_address '192.96.45.33'.

192\\.96\\.\\d{1,3}\\.\\d{1,3}
I guess this should work, since Hive uses Java regex syntax.
or
192\\.96\\.(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.)(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
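Neither pattern above actually produces '192.96.0.0'; they only match addresses in 192.96.x.x. A minimal sketch of the extraction itself, written in Python rather than Hive, captures the first two octets and appends '.0.0'. The helper name `first_two_octets` is made up for illustration. In Hive the same capture-group idea could presumably be used with `regexp_extract` or `regexp_replace` and doubled backslashes, but that is an assumption not tested here.

```python
import re
from typing import Optional

# Capture the first two octets; the last two are matched but discarded.
FIRST_TWO = re.compile(r"^(\d{1,3}\.\d{1,3})\.\d{1,3}\.\d{1,3}$")


def first_two_octets(ip_address: str) -> Optional[str]:
    """Return 'a.b.0.0' for an input 'a.b.c.d', or None if it does not look like IPv4."""
    match = FIRST_TWO.match(ip_address)
    if match is None:
        return None
    return match.group(1) + ".0.0"


print(first_two_octets("192.96.45.33"))  # 192.96.0.0
```

The pattern only checks the shape of the address (one to three digits per octet), not that each octet is at most 255.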

Related

Extract the Source IP Address from two different log samples with regex

I have a regular expression as follows:
"id.resp_h"|"rx_hosts":(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}),
I am trying to extract the Source IP Address from two different log samples. "id.orig_h" and "tx_hosts" are two different fields for the Source IP. How do I ignore the quotation marks and square brackets? I just want to extract the IP addresses.
schema_id=17127524534057985804:skip_writers="":{"_path":"conn","_system_name":"hostname","_write_ts":"2020-01-12T22:09:28.853417Z","ts":"2020-01-12T22:07:14.642074Z","uid":"Cm4cbmvRjlmd2I52c","id.orig_h":"192.168.1.1","id.orig_p":xxx,"id.resp_h":"192.168.1.2","id.resp_p":xxx,"proto":"udp",
schema_id=17223896091372211545:skip_writers="":{"_path":"files","_system_name":"Hostname","_write_ts":"2020-01-12T22:09:00.016260Z","ts":"2020-01-12T22:07:14.108217Z","fuid":"FnmzOv3Fkhr8lP0qL","tx_hosts":["192.168.1.1","192.168.1.1"],"rx_hosts":["192.168.1.10"],
Any help would be gratefully appreciated :-)
Thanks,
JM
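Try this if you want to solve it with regex:
(?:"id\.resp_h"|"rx_hosts"):\[?"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})
The IP address ends up in capture group 1. The \[?" part skips the optional square bracket and the opening quotation mark. For the Source IP fields, swap in "id.orig_h" and "tx_hosts".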
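As a rough illustration of the capture-group approach, here is a Python sketch that targets the two Source IP fields named in the question ("id.orig_h" and "tx_hosts"). The pattern and the helper `source_ip` are assumptions written for this example, and the sample lines are shortened from the logs above.

```python
import re

# Field name, closing quote, colon, optional '[', opening quote, then the IP in group 1.
SOURCE_IP = re.compile(
    r'"(?:id\.orig_h|tx_hosts)":\[?"(\d{1,3}(?:\.\d{1,3}){3})"'
)


def source_ip(line):
    """Return the first source IP found in a log line, or None."""
    match = SOURCE_IP.search(line)
    return match.group(1) if match else None


conn_line = '"uid":"Cm4cbmvRjlmd2I52c","id.orig_h":"192.168.1.1","id.orig_p":xxx,"id.resp_h":"192.168.1.2",'
files_line = '"fuid":"FnmzOv3Fkhr8lP0qL","tx_hosts":["192.168.1.1","192.168.1.1"],"rx_hosts":["192.168.1.10"],'

print(source_ip(conn_line))   # 192.168.1.1
print(source_ip(files_line))  # 192.168.1.1
```

For "tx_hosts" this returns only the first address in the list; collecting every entry would need a second pass over the bracketed list.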

get two ip separately

My log files contain two IPs: src-ip:132.23.35.1, dest-ip:10.23.56.1.
I'm using the regex:
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
It matches both IPs. If I want to retrieve only the IP address of src-ip (in this case, 132.23.35.1), how do I do that?
I expect to get the src-ip and dest-ip addresses separately.
You could try
(?<=src-ip:)(.*)(?=,)
The regex code has been adapted from: Regex Match all characters between two strings
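An alternative to the lookaround is to anchor on the labels and capture each address in its own group. The following Python sketch assumes the line always has the form shown in the question; the group names `src` and `dest` are chosen here for illustration.

```python
import re

IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

# Each label is followed by a named group holding its address.
PAIR = re.compile(
    r"src-ip:(?P<src>" + IPV4 + r"),\s*dest-ip:(?P<dest>" + IPV4 + r")"
)

line = "src-ip:132.23.35.1, dest-ip:10.23.56.1."
match = PAIR.search(line)

src_ip = match.group("src")
dest_ip = match.group("dest")
print(src_ip)   # 132.23.35.1
print(dest_ip)  # 10.23.56.1
```

Unlike the greedy (.*) in the lookaround version, the explicit IPv4 pattern stops at the end of the address even if more commas appear later in the line.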

How to write a regular expression in c++

I have this string dummy_data:\m192.168.1.125\pApp and I want to extract the IP address from the given string.
I have used the following regular Expression:
\\\\m([\\d\\w\\.]+)\\\\?
This returns \m192.168.1.125, but I want only 192.168.1.125
Do you have any suggestions on how to achieve this?
This one is simple:
[0-9][0-9]?[0-9]?\.[0-9][0-9]?[0-9]?\.[0-9][0-9]?[0-9]?\.[0-9][0-9]?[0-9]?
It only works for IPv4 addresses.
This one also worked for your string:
([0-9]{1,3}\.){3}[0-9]{1,3}
I tested both on a random regex-testing page, so I cannot tell you how reliable they are.
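The pattern in the question already wraps the address in a capture group, so the likely cause of the extra \m is reading the whole match (index 0) instead of the first group (index 1). With C++ std::smatch, index 0 is likewise the full match and index 1 the first group. The idea is sketched here in Python rather than C++, with a slightly simplified character class.

```python
import re

text = r"dummy_data:\m192.168.1.125\pApp"

# A literal backslash, 'm', then the address captured in group 1.
pattern = re.compile(r"\\m([\d.]+)")

match = pattern.search(text)
whole_match = match.group(0)  # includes the \m prefix
address = match.group(1)      # only the captured address

print(whole_match)
print(address)
```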

Regular expression for isolating Comcast IP addresses in access log file for Apache

Really the fact I want to use this for my Apache access log file is arbitrary and irrelevant, but it gives context to the situation.
I need to filter out records associated with Comcast IP addresses. Here's a list of the dynamic IP address ranges that Comcast assigns. I need a regular expression that can match all of those, and only those. I'll work on it on my own in the meantime, but I figured there would be some regex guru out there on SO who would enjoy the problem.
A regex solution is possible but very cumbersome, since the subnet masks are not multiples of 8. You would need to write a function to process the list and convert it into a regex.
It is better to use a regex to grab the IP address and then test that address against Comcast's list of IP ranges. A simple implementation would be a sorted set, which lets you search for the nearest range start that is smaller than or equal to the argument.
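The sorted-lookup idea can be sketched in Python with the standard `bisect` and `ipaddress` modules. Only 24.0.0.0/12 comes from this thread; the second block is a placeholder from the documentation range, and the sketch assumes the listed ranges do not overlap.

```python
import bisect
import ipaddress

# 24.0.0.0/12 is from the discussion; 203.0.113.0/24 is a made-up placeholder entry.
CIDR_BLOCKS = ["24.0.0.0/12", "203.0.113.0/24"]

# Sorted (first_address, last_address) pairs, stored as integers.
RANGES = sorted(
    (int(net.network_address), int(net.broadcast_address))
    for net in map(ipaddress.ip_network, CIDR_BLOCKS)
)
STARTS = [start for start, _ in RANGES]


def in_listed_range(ip: str) -> bool:
    """Return True if ip falls inside one of the listed (non-overlapping) ranges."""
    value = int(ipaddress.ip_address(ip))
    # Index of the last range whose start is <= value, or -1 if there is none.
    index = bisect.bisect_right(STARTS, value) - 1
    return index >= 0 and value <= RANGES[index][1]


print(in_listed_range("24.15.255.255"))  # True
print(in_listed_range("24.16.0.0"))      # False
```

Each lookup is a binary search, so it stays cheap even with a few hundred ranges.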
That is a lot of IP addresses.
For example, 24.0.0.0/12 defines the IP range 24.0.0.0 - 24.15.255.255. To match these numeric ranges with a regex:
24: 24
0-15: [0-9]|1[0-5]
0-255: [0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]
Which gives
(24)\.([0-9]|1[0-5])\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])
And that's just for 24.0.0.0/12, 293 to go.
If you really want to do this you should write a small script to convert each IP range into a regex automatically.
Another approach would be to match any IP address and feed it into a callback that does the matching using an appropriate module / framework / API.
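A minimal Python sketch of that last approach: a loose regex pulls out anything shaped like an IPv4 address, and the standard `ipaddress` module decides whether it belongs to a listed network. Only 24.0.0.0/12 is taken from this thread; the sample log lines and the function names are invented for the example.

```python
import ipaddress
import re

# Loose shape check only; validity and range membership are decided below.
IPV4_CANDIDATE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

# A single range from the discussion; a real list would have many more entries.
NETWORKS = [ipaddress.ip_network("24.0.0.0/12")]


def is_listed(candidate: str) -> bool:
    """Return True if candidate is a valid address inside one of NETWORKS."""
    try:
        address = ipaddress.ip_address(candidate)
    except ValueError:
        return False  # e.g. 999.1.1.1 has the right shape but is not an address
    return any(address in network for network in NETWORKS)


def lines_from_listed_ranges(lines):
    """Keep only the lines that mention at least one listed address."""
    return [
        line for line in lines
        if any(is_listed(ip) for ip in IPV4_CANDIDATE.findall(line))
    ]


sample = [
    '24.3.200.150 - - "GET / HTTP/1.1" 200',
    '24.16.0.1 - - "GET / HTTP/1.1" 200',
    '999.1.1.1 - - "GET / HTTP/1.1" 200',
]
print(lines_from_listed_ranges(sample))
```

This keeps the regex short and leaves the subnet arithmetic to a library that already handles masks that are not multiples of 8.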

Regex to see if ip starts with 156.21.x.x

I'm writing a regex for Google Analytics and I need to block any IP from 156.21.x.x. I don't care about the last two octets, just the first two. I would like to keep the regex to as few characters as possible, as Google only allows 255 characters and my regex is already pretty large.
I'm not sure what flavor of regex or what language you're using, but this will work on most regex engines:
156\.21\.\d{1,3}\.\d{1,3}
Of course, this will match invalid IPs like 156.21.777.888, but if the list you're parsing doesn't contain invalid IP addresses, you should be OK. Or:
156\.21(\.\d{1,3}){2}
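A quick Python check, offered as a sketch, suggests the two forms accept the same strings, including the invalid 156.21.777.888 mentioned above. Google Analytics uses its own regex engine, so behavior there is assumed rather than verified.

```python
import re

LONG_FORM = re.compile(r"156\.21\.\d{1,3}\.\d{1,3}")
SHORT_FORM = re.compile(r"156\.21(\.\d{1,3}){2}")

samples = ["156.21.1.1", "156.21.777.888", "156.22.1.1"]

# For each sample: (matches long form, matches short form), using full-string matching.
results = {
    s: (bool(LONG_FORM.fullmatch(s)), bool(SHORT_FORM.fullmatch(s)))
    for s in samples
}
for sample, outcome in results.items():
    print(sample, outcome)
```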
If you are running short on space, this would work, though you would match non-IP addresses as well. If you can assume Google will give you valid IP addresses, this is your shortest option:
^156\.21\.
Matches things like: 156.21.1.1, 156.21.1000.1000, 156.21.ABC
But does not match: http://156.21.1.1, ehlo 156.21.1000.1000
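The effect of the ^ anchor can be checked with a short Python sketch using the examples listed above; the helper name `starts_with_prefix` is made up for illustration.

```python
import re

# Anchored at the start of the string; nothing after the second dot is checked.
PREFIX = re.compile(r"^156\.21\.")


def starts_with_prefix(value: str) -> bool:
    return PREFIX.search(value) is not None


for value in [
    "156.21.1.1",
    "156.21.1000.1000",
    "156.21.ABC",
    "http://156.21.1.1",
    "ehlo 156.21.1000.1000",
]:
    print(value, starts_with_prefix(value))
```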
The following regex would match (almost) valid IPv4 addresses that start with 156.21:
(156\.21(?:\.[\d]{1,3}){2})