Find and Replace using Notepad++ - regex

I'm trying to use Notepad++ to do some find and replace as I'm dealing with up to few thousand of lines of data.
The below is the example of the data structure that i am dealing with.
A = Can be any Aplabet
X = Can be any Number 0-9
RX = Number that I want to replace with another value.
AAAAA X.XXXXXX X.XXXXX X X X X X XX:XX:XX:XX.XXX XXX RXRXRXRXRXRX XXXXXX XXXXXX
Actual Example
werwer 2.178924 1.17892 1 1 1 1 1 12:14:44:59.123 123 0123123 123345 123123
gret 2.178975 1.15731 1 1 1 1 1 12:14:44:59.123 123 0123 123345 123123
sdfwe 2.123245 1.15171 1 1 1 1 1 12:14:44:59.123 123 0555312 123345 123123
Is there a shortcut I can use?

N++ is not the tool for the job as it has very limited regexp capabilities. In a decent editor, you could replace
((?:[a-zA-Z0-9:\.]+\s+){10})\d+(.*)
with
\1your_text\2
but notepad++ regex syntax supports neither (?:) nor {10}.
There are lots of regex tools out there, choose whichever.
P.S. I also tried repeating the first pattern ten times to emulate {10}, it still did not work strangely.

This looks like the sort of job that is perfect for awk.
awk '{print "$1 $2 $3 $4 $5 $6 $7 $8 $9 NUMBERS I'\''M CHANGING $11 $12"}' < file.txt > newfile.txt
You can also try vim's Block Highlighting and Insert/Change functionality. I don't know if notepad++ has anything similar to it.

press ctrl+F and then go to Replace tab.

refer to https://superuser.com/questions/214079/find-and-replace-using-notepad
whitequark solution

Related

get number value between two strings using regex

I have a string with multiple value outputs that looks like this:
SD performance read=1450kB/s write=872kB/s no error (0 0), ManufactorerID 27 Date 2014/2 CardType 2 Blocksize 512 Erase 0 MaxtransferRate 25000000 RWfactor 2 ReadSpeed 22222222Hz WriteSpeed 22222222Hz MaxReadCurrentVDDmin 3 MaxReadCurrentVDDmax 5 MaxWriteCurrentVDDmin 3 MaxWriteCurrentVDDmax 1
I would like to output only the read value (1450kB/s) using bash and sed.
I tried
sed 's/read=\(.*\)kB/\1/'
but that outputs read=1450kB but I only want the number.
Thanks for any help.
Sample input shortened for demo:
$ echo 'SD performance read=1450kB/s write=872kB/s no error' | sed 's/read=\(.*\)kB/\1/'
SD performance 1450kB/s write=872/s no error
$ echo 'SD performance read=1450kB/s write=872kB/s no error' | sed 's/.*read=\(.*\)kB.*/\1/'
1450kB/s write=872
$ echo 'SD performance read=1450kB/s write=872kB/s no error' | sed 's/.*read=\([0-9]*\)kB.*/\1/'
1450
Since entire line has to be replaced, add .* before and after search pattern
* is greedy, will try to match as much as possible, so in 2nd example it can be seen that it matched even the values of write
Since only numbers after read= is needed, use [0-9] instead of .
Running
sed 's/read=\(.*\)kB/\1/'
will replace read=[digits]kB with [digit]. If you want to replace the whole string, use
sed 's/.*read=\([0-9]*\)kB.*/\1/'
instead.
As Sundeep noticed, sed doesn't support non-greedy pattern, updated for [0-9]* instead

Can adding a particular number to a bunch of "time" strings, be done in Regex

I have a "srt" file(like standard movie-subtitle format) like shown in below link:http://pastebin.com/3k8a53SC
Excerpt:
1
00:00:53,000 --> 00:00:57,000
<any text that may span multiple lines>
2
00:01:28,000 --> 00:01:35,000
<any text that may span multiple lines>
But right now the subtitles timing is all wrong, as it lags behind by 9 seconds.
Is it possible to add 9 seconds(+9) to every time entry with regex ?
Even if the milliseconds is set to 000 then it's fine, but the addition of 9 seconds should adhere to "60 seconds = 1 minute & 60 minutes = 1 hour" rules.
Also the subtitle text after timing entry must not get altered by regex.
By the way the time format for each time string is "Hours:Minutes:Seconds.Milliseconds".
Quick answer is "no", that's not an application for regex. A regular expression lets you MATCH text, but not change it. Changing things is outside the scope of the regex itself, and falls to the language you're using -- perl, awk, bash, etc.
For the task of adjusting the time within an SRT file, you could do this easily enough in bash, using the date command to adjust times.
#!/usr/bin/env bash
offset="${1:-0}"
datematch="^(([0-9]{2}:){2}[0-9]{2}),[0-9]{3} --> (([0-9]{2}:){2}[0-9]{2}),[0-9]{3}"
os=$(uname -s)
while read line; do
if [[ "$line" =~ $datematch ]]; then
# Gather the start and end times from the regex
start=${BASH_REMATCH[1]}
end=${BASH_REMATCH[3]}
# Replace the time in this line with a printf pattern
linefmt="${line//[0-2][0-9]:[0-5][0-9]:[0-5][0-9]/%s}\n"
# Calculate new times
case "$os" in
Darwin|*BSD)
newstart=$(date -v${offset}S -j -f "%H:%M:%S" "$start" '+%H:%M:%S')
newend=$(date -v${offset}S -j -f "%H:%M:%S" "$end" '+%H:%M:%S')
;;
Linux)
newstart=$(date -d "$start today ${offset} seconds" '+%H:%M:%S')
newend=$(date -d "$end today ${offset} seconds" '+%H:%M:%S')
;;
esac
# And print the result
printf "$linefmt" "$newstart" "$newend"
else
# No adjustments required, print the line verbatim.
echo "$line"
fi
done
Note the case statement. This script should auto-adjust for Linux, OSX, FreeBSD, etc.
You'd use this script like this:
$ ./srtadj -9 < input.srt > output.srt
Assuming you named it that, of course. Or more likely, you'd adapt its logic for use in your own script.
No, sorry, you can’t. Regex are a context free language (see Chomsky e.g. https://en.wikipedia.org/wiki/Chomsky_hierarchy) and you cannot calculate.
But with a context sensitive language like perl it will work.
It could be a one liner like this ;-)))
perl -n -e 'if(/^(\d\d:\d\d:\d\d)([-,\d\s\>]*)(\d\d:\d\d:\d\d)(.*)/) {print plus9($1).$2.plus9($3).$4."\n";}else{print $_} sub plus9{ ($h,$m,$s)=split(/:/,shift); $t=(($h*60+$m)*60+$s+9); $h=int($t/3600);$r=$t-($h*3600);$m=int($r/60);$s=$r-($m*60);return sprintf "%02d:%02d:%02d", $h, $m, $s;}‘ movie.srt
with move.srt like
1
00:00:53,000 --> 00:00:57,000
hello
2
00:01:28,000 --> 00:01:35,000
I like perl
3
00:02:09,000 --> 00:02:14,000
and regex
you will get
1
00:01:02,000 --> 00:01:06,000
hello
2
00:01:37,000 --> 00:01:44,000
I like perl
3
00:02:18,000 --> 00:02:23,000
and regex
You can change the +9 in the "sub plus9{...}", if you want another delta.
How does it work?
We are looking for lines that matches
dd:dd:dd something dd:dd:dd something
and then we call a sub, which add 9 seconds to the matched group one ($1) and group three ($3). All other lines are printed unchanged.
added
If you want to put the perl oneliner in a file, say plus9.pl, you can add newlines ;-)
if(/^(\d\d:\d\d:\d\d)([-,\d\s\>]*)(\d\d:\d\d:\d\d)(.*)/) {
print plus9($1).$2.plus9($3).$4."\n";
} else {
print $_
}
sub plus9{
($h,$m,$s)=split(/:/,shift);
$t=(($h*60+$m)*60+$s+9);
$h=int($t/3600);
$r=$t-($h*3600);
$m=int($r/60);
$s=$r-($m*60);
return sprintf "%02d:%02d:%02d", $h, $m, $s;
}
Regular expressions strictly do matching and cannot add/substract. You can match each datetime string using python, for example, add 9 seconds to that, and then rewrite the string in the appropriate spot. The regular expression I would use to match it would be the following:
(?<hour>\d+):(?<minute>\d+):(?<second>\d+),(?<msecond>\d+)
It has labeled capture groups so it's really easy to get each section (you won't need msecond but it's there for visualization, I guess)
Regex101

bash: Batch reformatting using sed + date?

I have a bunch of data that looks like this:
"2004-03-23 20:11:55" 3 3 1
"2004-03-23 20:12:20" 1 1 1
"2004-03-31 02:20:04" 15 15 1
"2004-04-07 14:33:48" 141 141 1
"2004-04-15 02:08:31" 2 2 1
"2004-04-15 07:56:01" 1 2 1
"2004-04-16 12:41:22" 4 4 1
and I need to feed this data to a program which only accepts time in UNIX (Epoch) format. Is there a way I can change all the dates in bash? My first instinct tells me to do something like this:
sed 's/"(.*)"/`date -jf "%Y-%m-%d %T" "\1" "+%s"`'
But I am not entirely sure that the \1 inside the date call will properly backreference the regex matched by sed. In fact, when I run this, I get the following response:
sed: 1: "s/(".*")/`date -jf "% ...": unterminated substitute in regular expression
Can anyone guide me in the right direction on this? Thank you.
Nothing is going to be expanded between single quotes. Also, no, the shell expansions are going to happen before the sed \1 expansion, so your code isn't going to work. How about something like this (untested):
while IFS= read -r date time a b c
do
date --date "${date:1} ${time::-1}" # Cut the variables to remove the literal quotes
printf " %s %s %s\n" "$a" "$b" "$c"
done < file

Regular expression to match second or last decimal number in a string

String:
<LF><CR>A214 pH/ISE,X00066,2.59,ABCDE,10/16/13 22:06:59,ABC1,CH-1,pH,7.00,pH,0.0, mV,25.0,C,100.0,%,M100,#35<LF><CR>
I need to match only the 7.00 - This number could be anywhere from 0.00 - 14.00 (its a pH reading).
Right now I can only come up with [0-9]{1,2}\.[0-9]{2} which also matches the software revision number which appears earlier in the string (2.59)
Any help is greatly appreciated.
EDIT: Thanks everyone. I figured it out by using [0-9]{1,2}\.[0-9]{2}(?=,p)
Simply find all entries and get the last:
>>> s = "A214 pH/ISE,X00066,2.59,ABCDE,10/16/13 22:06:59,ABC1,CH-1,pH,7.00,pH,0.0, mV,25.0,C,100.0,%,M100,#35"
>>> re.findall("[0-9]{1,2}.[0-9]{2}", s)[-1]
'7.00'
You can improve that regex by using the information that PH is between 0-14(first digit can only by one etc). Or better, just split by commas or use csv module.
maybe you can use that:
pH,(([0-9]|1[0-4])\.\d{2}),pH
group 1 match number that you need. And that control data
If the format of the string is fixed, i.e. the data is in the 9th position if you split on , Use e.g. awk:
$ awk -F, '{print $8, $9}' input
pH 7.00
or using perl in awk-mode:
$ perl -F, -lane 'print $F[8]' input
7.00
Or this regexp
pH,(\d+\.\d{2})
See it line on http://www.rubular.com/r/3kkWNVBAi8

RegEx to extract numeric value immediately before a search character /string found

what would be the best regEx to extract all the number (only numbers) before a search string ?
ABC Y C S 1 $ 46CC MAN 25/ 31
Need to extract 25 in this case, but its not fixed length ? Any help ?
'\d+(?=/)'
should work. see test with grep:
kent$ echo "ABC Y C S 1 $ 46CC MAN 25/ 31 "|grep -Po '\d+(?=/)'
25
Perl regex:
while ($subject =~ m!\d+(?=.*/)!g) {
# matched text = $&
}
Output:
1
46
25
So basically keep matching, as long as a / exist somewhere later.