Use regex with grep to filter data from the output of a verbose command - regex

I am working with a cloud environment, and there is a command that displays all available information about the running VMs. Here is an example of some of the lines that pertain to one VM.
RESERVATION r-6D0F464B 170506678332 GroupD
INSTANCE i-E9B444A9 emi-376642D8 999.99.999.999 88.888.88.888 running lock_key 0 c1.xlarge 2013-06-17T18:40:56.270Z cluster01 eki-E7E242A3 monitoring-disabled 999.99.999.999 88.888.88.888 ebs
I need to be able to pull the i-********, emi-********, both IP addresses, its status, the lock_key, the c1.xlarge, and the monitoring-disabled/enabled.
I have been able to pull the whole line with some super simple regex but all of this is well beyond me. If there is another easier method of grabbing this data any suggestions are welcome.

Let's take it step by step. First, redirect the command's output to a file. In Unix-like environments you do it like this:
your-command > filename.txt
Second, you need to read the file line by line. I would use a Python or Perl script if you know either of those, or whatever language suits you.
Third, you can get the values in two different ways:
Read columns by position. You can match each column with a regex like: [^\s]+
Write a regular expression for every specific column. For an IP address you could use something like ([0-9]{1,3}\.){3}[0-9]{1,3}, for monitoring monitoring-([^\s]+), and so on.

As long as the fields will always be in the same order, all you need to do is split on whitespace.
Pseudocode (well, it's ruby, but hopefully you get the idea):
vms = {}
File.readlines('vm-info').each do |line|
  # split on a regex literal; the string '\s+' would be treated as literal text
  fields = line.split(/\s+/)
  field_map = {}
  vm_name = fields[<index_of_vm_name>]
  field_map['emi'] = fields[<index_of_emi>]
  field_map['ip_address'] = fields[<index_of_ip_address>]
  .
  .
  .
  vms[vm_name] = field_map
end
After this, vms will be initialized to contain information about each vm. You can simply print them all out at this point, or continue running data manipulation on them.
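For comparison, here is a minimal Python sketch of the same split-on-whitespace idea. The column positions in FIELDS are read off the sample INSTANCE line in the question and are an assumption: they only hold if the real output keeps that column order. The key names (ip_1, ip_2, and so on) are made up for illustration.

```python
SAMPLE = (
    "RESERVATION r-6D0F464B 170506678332 GroupD\n"
    "INSTANCE i-E9B444A9 emi-376642D8 999.99.999.999 88.888.88.888 running "
    "lock_key 0 c1.xlarge 2013-06-17T18:40:56.270Z cluster01 eki-E7E242A3 "
    "monitoring-disabled 999.99.999.999 88.888.88.888 ebs\n"
)

# Assumed column positions, taken from the sample line above (0 is "INSTANCE").
FIELDS = {
    "emi": 2,
    "ip_1": 3,
    "ip_2": 4,
    "state": 5,
    "key": 6,
    "type": 8,
    "monitoring": 12,
}


def parse_instances(text):
    """Return {instance_id: {field_name: value}} for every INSTANCE line."""
    vms = {}
    for line in text.splitlines():
        fields = line.split()  # split() with no argument splits on any whitespace
        if len(fields) <= max(FIELDS.values()) or fields[0] != "INSTANCE":
            continue  # skip RESERVATION lines and anything too short
        vms[fields[1]] = {name: fields[idx] for name, idx in FIELDS.items()}
    return vms


if __name__ == "__main__":
    for vm_id, info in parse_instances(SAMPLE).items():
        print(vm_id, info)
```

In practice you would feed it the text of the redirected output file instead of SAMPLE.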

Related

TCL Regex Skipping Over a Set of Characters and Matching to a New line

I'm working with expect scripting in order to ssh into a device and pull information off of it. However, I'm facing issues parsing the expect_out(buffer) for the data from the commands I send.
This is the contents of my expect_out(buffer):
"mca-cli-op info\r\n\r\nModel: UAP-AC-Lite\r\nVersion: 6.0.21.13673\r\nMAC Address: 10:9f:5r:20:c5:7e\r\nIP Address: 123.123.1.123\r\nHostname: UAP-AC-Lite\r\nUptime: 152662 seconds\r\n\r\nStatus: Connected (http://base_controller<url;>/inform)\r\nUAP-AC-Lite-BZ.6.0.21# "
Right now I'm trying to get the Model (UAP-AC-LITE) without the Model tag.
So the regex expression I'm using is,
expect -re {(?=(Model: ))+[.*\$]}
set model "$expect_out(0,string)"
puts $model
The command doesn't work, but my thought process was that I would perform a look ahead for the Model tag, then match only the subsequent characters after it to the new line. I've tried replacing the "$" with \r\n but that doesn't work either. Can anyone explain what I'm doing wrong? Thanks for the help!
Note: If possible, I wouldn't want to include the newline either, as it might mess up commands that I run which use these variables.
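You're close, but the regex is incorrect: the lookahead (?=...) matches without consuming any text, and [.*\$] is a bracket expression that matches a single literal character, not "everything up to the end of the line". Try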
expect -re {Model:\s+([^\r]+)}
set model $expect_out(1,string)
The 1 in $expect_out(1,string) means the first set of capturing parentheses.
Regexes are documented at http://www.tcl-lang.org/man/tcl8.6/TclCmd/re_syntax.htm
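If you want to try the pattern outside of expect, the same capture-group idea can be checked in Python. This is only a sketch: BUFFER is a shortened copy of the buffer from the question, and Python's re module stands in for Tcl's regex engine, which is an assumption that happens to hold for this simple pattern.

```python
import re

# Shortened copy of the expect_out(buffer) contents from the question.
BUFFER = (
    "mca-cli-op info\r\n\r\nModel: UAP-AC-Lite\r\n"
    "Version: 6.0.21.13673\r\nHostname: UAP-AC-Lite\r\n"
)


def extract_model(buffer):
    """Return the text after 'Model:' up to (not including) the carriage return."""
    match = re.search(r"Model:\s+([^\r]+)", buffer)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_model(BUFFER))
```

Because [^\r]+ stops at the first carriage return, the captured value carries no trailing newline.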

Change WiFi WPA2 passkey from a script

I'm using Raspbian Wheezy, but this is not a Raspberry Pi specific question.
I am developing a C application, which allows the user to change their WiFi Password.
I did not find a ready script/command for this, so I'm trying to use sed.
I pass the SSID name and new key to a bash script, and the key is replaced for that SSID block within /etc/wpa_supplicant/wpa_supplicant.conf.
My application runs as root.
A sample block is shown below.
network={
ssid="MY_SSID"
scan_ssid=1
psk="my_ssid_psk"
}
So far I've tried the following (I've copied wpa_supplicant.conf to wpa.txt for testing):
(1) This tries to do the replacement within a range, starting when my SSID is detected and ending at the closing brace followed by a newline.
SSID="TRIMURTI"
PSK="12345678"
sed -n "1 !H;1 h;$ {x;/ssid=\"${SSID}\"/,/}\n/ s/[[:space:]]*psk=.*\n/\n psk=\"${PSK}\"\n/p;}" wpa.txt
and
(2) This tries to 'remember' the matched pattern, and reproduce it in the output, but with the new key.
SSID="TRIMURTI"
PSK="12345678"
sed -n "1 !H; 1 h;$ {x;s/\(ssid=\"${SSID}\".*psk=\).*\n/\1\"${PSK}\"/p;}" wpa.txt
I have used hold & pattern buffers as the pattern can span multiple lines.
Above, the first example seems to ignore the range & replaces the 1st instance, and then truncates the rest of the file.
The second example replaces the last found psk value & truncates the file thereafter.
So I need help in correcting the above code, or trying a different solution.
If we can assume the fields will always be in a strict order where the ssid= goes before psk=, all you really need is
sed "/^[[:space:]]*ssid=\"$SSID\"[[:space:]]*$/,/}/s/^\([[:space:]]*psk=\"\)[^\"]*/\1$PSK/" wpa.txt
This is fairly brittle, though. If the input is malformed, or if the ssid goes after the psk in your block, it will break. The proper solution (which however is severe overkill in this case) is to have a proper parser for the input format; while that is in theory possible in sed, it would be much simpler if you were to switch to a higher-level language like Python or Perl, or even Awk.
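To give an idea of what the higher-level route could look like, here is a rough Python sketch that collects each network={ ... } block first and only then rewrites the psk line, so the order of ssid and psk inside the block no longer matters. The function name set_psk is made up, and the sketch assumes the simple one-setting-per-line layout shown in the question; it is not a full parser for the wpa_supplicant format.

```python
import re


def set_psk(conf_text, ssid, new_psk):
    """Return conf_text with the psk replaced in the block whose ssid matches."""
    ssid_re = re.compile(r'\s*ssid="%s"\s*$' % re.escape(ssid))
    psk_re = re.compile(r"^(\s*psk=)[^\r\n]*")
    out, block = [], None
    for line in conf_text.splitlines(keepends=True):
        if block is None:
            if re.match(r"\s*network\s*=\s*\{", line):
                block = [line]  # start collecting a network block
            else:
                out.append(line)
            continue
        block.append(line)
        if line.strip() == "}":  # block complete: rewrite it if the ssid matches
            if any(ssid_re.match(item) for item in block):
                block = [
                    psk_re.sub(lambda m: m.group(1) + '"' + new_psk + '"', item)
                    for item in block
                ]
            out.extend(block)
            block = None
    if block:  # unterminated block: pass it through unchanged
        out.extend(block)
    return "".join(out)


if __name__ == "__main__":
    sample = 'network={\n    ssid="MY_SSID"\n    scan_ssid=1\n    psk="my_ssid_psk"\n}\n'
    print(set_psk(sample, "MY_SSID", "12345678"))
```

Writing the result back to the real file, and handling passphrases that contain double quotes, are left out on purpose.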
The most practical way to update a password or other configuration value is to use wpa_cli. E.g.:
wpa_cli -i "wlan0" set_network "0" psk "\"Some5Strong1Pass\""
wpa_cli -i "wlan0" save_config
The save_config command is required to write the change to the configuration file: /etc/wpa_supplicant/wpa_supplicant.conf

Looking for a Google script that will perform CTRL+F replace for a string

I have looked at multiple solutions here for similar tasks, and tried them in different ways.
Essentially, I have cells with long, somewhat similar strings of text, and I want to isolate specific text markers in order to be able to split on those markers. The specific string I am looking for is "MHPP" and I want to replace it with "][MHPP " so I can use the split function to split on the "]".
I was able to get it to work by manually Finding and Replacing (CTRL+F and selecting parameters for the replace), but I want to be able to script it because I won't be the one running the script and need to simplify the process for low-information users.
Using =replace(find("MHPP"),7,"][MHPP ") only finds the first instance of the find value, and there may be multiple usages of the term throughout the cell.
Any suggestions? I suppose there might be a way to write the cell to a string, and replace within the array, but the logic of that process is escaping me at the moment.
I'm not asking for the entire code. I can activate the sheet, get the range, and work from there, but I just don't know how to write the specific function findAndReplace() that would actually locate all repetitions of the string and replace them all.
I'm also open to importing the .csv into a different format, running a function there, and returning it back out to a .csv, but that hasn't proven to be very fruitful either in my searches.
Thanks for any guidance you can offer to get me on the right path.
You can use the string replace function on every cell while iterating over the whole sheet. Do that at array level (read all values, modify them in memory, write them back once) to keep it fast and simple.
The code itself can be very short and straightforward, like this:
function myFunction() {
  var sh = SpreadsheetApp.getActive();
  var data = sh.getDataRange().getValues(); // get all data
  for (var n = 0; n < data.length; n++) {
    for (var m = 0; m < data[0].length; m++) {
      if (typeof(data[n][m]) == 'string') { // if it is a string
        data[n][m] = data[n][m].replace(/MHPP/g, '][MHPP'); // use the regex replace with /g parameter meaning "globally"
      }
    }
  }
  sh.getDataRange().setValues(data); // update sheet values
}
This could be improved to take care of certain situations where the script would be executed twice (or more) to prevent replacement if '][' is already present... I'll let you manage these details.
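One possible way to handle the "run twice" case is a negative lookbehind, so that an MHPP already preceded by "][" is left alone. The sketch below shows the idea in Python rather than Apps Script, and the function name mark_mhpp is made up for illustration; whether the same lookbehind syntax is available in your script runtime is something you would need to check.

```python
import re


def mark_mhpp(cell):
    """Prefix every MHPP with '][' unless it already has that prefix."""
    return re.sub(r"(?<!\]\[)MHPP", "][MHPP", cell)


if __name__ == "__main__":
    once = mark_mhpp("first MHPP text, second MHPP text")
    twice = mark_mhpp(once)
    print(once)
    print(once == twice)  # running it again changes nothing
```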

Regexp pattern matching IP and UserAgent in a Huge File

I have a huge log file that has a structure like this:
ip=X.X.X.X
userAgent=Firefox
-----
Referer=hxxp://www.bla.org
I want to create a custom output like this:
ip:userAgent
for ex:
X.X.X.X:Firefox
and the pattern will ignore lines which don't start with ip= and userAgent=. (These two must form a pair, as I mentioned above.)
I am a newbie administrator and our client needs a sorted file immediately.
Any help will be wonderful.
Thanks.
^ip=(\d+(?:\.\d+){3})[\r\n]+userAgent=(.+)$
Apply in global + multiline mode.
Group 1 will contain the IP, group 2 will contain the user agent string.
Edit: The above expression can be simplified a bit, we can remove the IP address format checking - assuming that there will be nothing but real IP addresses in the log file:
^ip=((?:\d+\.?)+)[\r\n]+userAgent=(.+)$
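As a quick way to try the first, stricter expression, here is a small Python sketch that applies it in multiline mode and builds the ip:userAgent lines. The function name and sample text are made up; it also reads the whole input into memory, which may not be what you want for a truly huge file.

```python
import re

# The stricter expression from above, compiled in multiline mode.
PAIR_RE = re.compile(r"^ip=(\d+(?:\.\d+){3})[\r\n]+userAgent=(.+)$", re.MULTILINE)


def ip_agent_lines(text):
    """Return 'ip:userAgent' strings for every ip= line directly followed by userAgent=."""
    return [
        # rstrip drops a stray carriage return if the log uses \r\n line endings
        "%s:%s" % (m.group(1), m.group(2).rstrip("\r"))
        for m in PAIR_RE.finditer(text)
    ]


if __name__ == "__main__":
    sample = "ip=10.0.0.1\nuserAgent=Firefox\n-----\nReferer=hxxp://www.bla.org\n"
    print(ip_agent_lines(sample))
```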
You can use:
^ip=((?:[0-9]{1,3}\.){3}[0-9]{1,3})$
And
^userAgent=(.*)$
Get the group 1 for both and you will have the desired data.
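For a huge file it may be easier to apply these two expressions line by line instead of loading everything at once. A minimal sketch, assuming a userAgent= line only counts when it comes directly after an ip= line (the function name is hypothetical):

```python
import re

IP_RE = re.compile(r"^ip=((?:[0-9]{1,3}\.){3}[0-9]{1,3})$")
UA_RE = re.compile(r"^userAgent=(.*)$")


def ip_agent_pairs(lines):
    """Yield 'ip:userAgent' for each ip= line immediately followed by a userAgent= line."""
    pending_ip = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        ip_match = IP_RE.match(line)
        if ip_match:
            pending_ip = ip_match.group(1)
            continue
        ua_match = UA_RE.match(line)
        if ua_match and pending_ip is not None:
            yield "%s:%s" % (pending_ip, ua_match.group(1))
        pending_ip = None  # anything else breaks the pair


if __name__ == "__main__":
    demo = ["ip=10.0.0.1\n", "userAgent=Firefox\n", "-----\n", "Referer=hxxp://www.bla.org\n"]
    for pair in ip_agent_pairs(demo):
        print(pair)
```

Since an open file object iterates line by line, passing it as lines keeps memory use flat regardless of file size.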
Give this a try (it is in no way robust if there are lines where your log file differs from the example snippet above):
sed -n -e '/^ip=/ {s///
N
s/\nuserAgent=/:/
p
}' HugeFile > customoutput

Use cases for regular expression find/replace

I recently discussed editors with a co-worker. He uses one of the less popular editors and I use another (I won't say which ones since it's not relevant and I want to avoid an editor flame war). I was saying that I didn't like his editor as much because it doesn't let you do find/replace with regular expressions.
He said he's never wanted to do that, which was surprising since it's something I find myself doing all the time. However, off the top of my head I wasn't able to come up with more than one or two examples. Can anyone here offer some examples of times when they've found regex find/replace useful in their editor? Here's what I've been able to come up with since then as examples of things that I've actually had to do:
Strip the beginning of a line off of every line in a file that looks like:
Line 25634 :
Line 632157 :
Taking a few dozen files with a standard header which is slightly different for each file and stripping the first 19 lines from all of them all at once.
Piping the result of a MySQL select statement into a text file, then removing all of the formatting junk and reformatting it as a Python dictionary for use in a simple script.
In a CSV file with no escaped commas, replace the first character of the 8th column of each row with a capital A.
Given a bunch of GDB stack traces with lines like
#3 0x080a6d61 in _mvl_set_req_done (req=0x82624a4, result=27158) at ../../mvl/src/mvl_serv.c:850
strip out everything from each line except the function names.
Does anyone else have any real-life examples? The next time this comes up, I'd like to be more prepared to list good examples of why this feature is useful.
Just last week, I used regex find/replace to convert a CSV file to an XML file.
Simple enough to do really, just chop up each field (luckily it didn't have any escaped commas) and push it back out with the appropriate tags in place of the commas.
Regex makes it easy to replace whole words using word boundaries.
(\b\w+\b)
So you can replace unwanted words in your file without disturbing words like Scunthorpe.
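A tiny Python illustration of the word-boundary idea; the words and the function name are invented for the example, and an editor's own regex dialect may differ slightly.

```python
import re


def replace_word(text, word, replacement):
    """Replace word only where it stands alone, not inside a longer word."""
    return re.sub(r"\b%s\b" % re.escape(word), replacement, text)


if __name__ == "__main__":
    print(replace_word("cat concatenate cat.", "cat", "dog"))
```

Without the \b anchors, the "cat" inside "concatenate" would be replaced as well.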
Yesterday I took a create table statement I made for an Oracle table and converted the fields to setString() method calls using JDBC and PreparedStatements. The table's field names were mapped to my class properties, so regex search and replace was the perfect fit.
Create Table text:
...
field_1 VARCHAR2(100) NULL,
field_2 VARCHAR2(10) NULL,
field_3 NUMBER(8) NULL,
field_4 VARCHAR2(100) NULL,
....
My Regex Search:
/([a-z_0-9]+) .*/
My Replacement:
pstmt.setString(1, \1);
The result:
...
pstmt.setString(1, field_1);
pstmt.setString(1, field_2);
pstmt.setString(1, field_3);
pstmt.setString(1, field_4);
....
I then went through and manually set the position int for each call and changed the method to setInt() (and others) where necessary, but it came in handy for me. I actually used it three or four times for similar field to method call conversions.
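The same conversion can be reproduced outside an editor. This is a sketch in Python that uses a pattern capturing the whole field name, ([a-z_0-9]+), which is an assumption about what the editor search was meant to do. The position argument is left at 1, as in the result above.

```python
import re

DDL = """field_1 VARCHAR2(100) NULL,
field_2 VARCHAR2(10) NULL,
field_3 NUMBER(8) NULL,
field_4 VARCHAR2(100) NULL,"""


def to_setters(ddl):
    """Turn each 'name TYPE ...' line into a pstmt.setString(1, name); call."""
    return re.sub(
        r"^([a-z_0-9]+) .*$",           # group 1 is the field name, the rest of the line is dropped
        r"pstmt.setString(1, \1);",
        ddl,
        flags=re.MULTILINE,
    )


if __name__ == "__main__":
    print(to_setters(DDL))
```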
I like to use regexps to reformat lists of items like this:
int item1
double item2
to
public void item1(int item1){
}
public void item2(double item2){
}
This can be a big time saver.
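As a rough illustration, the reformatting above can be done with two capture groups, one for the type and one for the name. The sketch is in Python and assumes each line is exactly "type name" separated by a single space.

```python
import re


def to_methods(declarations):
    """Rewrite 'type name' lines as empty 'public void name(type name){ }' stubs."""
    return re.sub(
        r"^(\w+) (\w+)$",                 # group 1 = type, group 2 = name
        r"public void \2(\1 \2){\n}",
        declarations,
        flags=re.MULTILINE,
    )


if __name__ == "__main__":
    print(to_methods("int item1\ndouble item2"))
```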
I use it all the time when someone sends me a list of patient visit numbers in a column (say 100-200) and I need them in a '0000000444','000000004445' format. Works wonders for me!
I also use it to pull out email addresses in an email. I send out group emails often and all the bounced returns come back in one email. So, I regex to pull them all out and then drop them into a string var to remove from the database.
I even wrote a little dialog prog to apply regex to my clipboard. It grabs the contents applies the regex and then loads it back into the clipboard.
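Pulling the addresses out of a bounce email might look roughly like this in Python. The pattern is a deliberately loose approximation of an email address, not a validator, and the sample text and function name are invented.

```python
import re

# Loose pattern: local part, '@', then at least two dot-separated domain labels.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_addresses(text):
    """Return the distinct addresses found in text, in order of first appearance."""
    seen = []
    for address in EMAIL_RE.findall(text):
        if address not in seen:
            seen.append(address)
    return seen


if __name__ == "__main__":
    bounce = "Delivery failed: bob@example.com; alice.smith@mail.example.org. Also bob@example.com"
    print(extract_addresses(bounce))
```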
One thing I use it for in web development all the time is stripping some text of its HTML tags. This might need to be done to sanitize user input for security, or for displaying a preview of a news article. For example, if you have an article with lots of HTML tags for formatting, you can't just do LEFT(article_text,100) + '...' (plus a "read more" link) and render that on a page without risking breaking the page by splitting apart an HTML tag.
Also, I've had to strip img tags in database records that link to images that no longer exist. And let's not forget web form validation. If you want to make sure a user has entered a correct email address (syntactically speaking) into a web form, this is about the only way of checking it thoroughly.
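The preview case could be sketched like this in Python. The pattern is a naive "anything between angle brackets" match, which is fine for a preview but is an assumption-laden shortcut and should not be relied on as a security sanitizer; the function name and limit are made up.

```python
import re


def preview(html, limit=100):
    """Strip tags first, then truncate, so a tag is never cut in half."""
    text = re.sub(r"<[^>]+>", "", html)  # naive: drops anything that looks like a tag
    return text if len(text) <= limit else text[:limit] + "..."


if __name__ == "__main__":
    print(preview("<p>Hello <b>world</b>, this is an <i>article</i>.</p>", limit=20))
```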
I've just pasted a long character sequence into a string literal, and now I want to break it up into a concatenation of shorter string literals so it doesn't wrap. I also want it to be readable, so I want to break only after spaces. I select the whole string (minus the quotation marks) and do an in-selection-only replace-all with this regex:
/.{20,60} /
...and this replacement:
/$0"¶ + "/
...where the pilcrow is an actual newline, and the number of spaces varies from one incident to the next. Result:
String s = "I recently discussed editors with a co-worker. He uses one "
+ "of the less popular editors and I use another (I won't say "
+ "which ones since it's not relevant and I want to avoid an "
+ "editor flame war). I was saying that I didn't like his "
+ "editor as much because it doesn't let you do find/replace "
+ "with regular expressions.";
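The same greedy "20 to 60 characters, then a space" trick can be tried in Python. In this sketch the function just returns the pieces, and wrapping them in quotes and plus signs is left to the caller. It assumes the input contains no newlines.

```python
import re


def split_at_spaces(text):
    """Break text after a space, in greedy runs of 20-60 characters plus that space."""
    marked = re.sub(r".{20,60} ", lambda m: m.group(0) + "\n", text)
    return marked.split("\n")


if __name__ == "__main__":
    sentence = (
        "I recently discussed editors with a co-worker. He uses one of the "
        "less popular editors and I use another."
    )
    for piece in split_at_spaces(sentence):
        print('"%s"' % piece)
```

Joining the returned pieces gives back the original string, so nothing is lost by the split.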
The first thing I do with any editor is try to figure out its regex oddities. I use it all the time. Nothing really crazy, but it's handy when you've got to copy/paste stuff between different types of text - SQL <-> PHP is the one I do most often - and you don't want to fart around making the same change 500 times.
Regex is very handy any time I am trying to replace a value that spans multiple lines. Or when I want to replace a value with something that contains a line break.
I also like that you can match things in a regular expression and not replace the full match using the $# syntax to output the portion of the match you want to maintain.
I agree with you on points 3, 4, and 5 but not necessarily points 1 and 2.
In some cases 1 and 2 are easier to achieve using an anonymous keyboard macro.
By this I mean doing the following:
Position the cursor on the first line
Start a keyboard macro recording
Modify the first line
Position the cursor on the next line
Stop record.
Now all that is needed to modify the next line is to repeat the macro.
I could live without support for regex but could not live without anonymous keyboard macros.