I am having trouble with a short bash script. It seems that all forward slashes need to be escaped. How can the required characters in expanded (environment) variables be escaped before perl reads them? Or is there some other method that perl understands?
This is what I am trying to do, but this will not work properly.
eval "perl -pi -e 's/$HOME\/_TV_rips\///g'" '*$videoID.info.json'
That is part of a longer script where videoID=$1. (And for some reason perl expands variables both within single and double quotes.)
This simple workaround with no forward slash in the expanded environment variable $USER works. But I would like to not have /Users/ hard coded:
eval "perl -pi -e 's/\/Users\/$USER\/_TV_rips\///g'" '*$videoID.info.json'
This is probably solvable in some better way, such as fetching the home directory for the files. The goal is to remove the folder name in youtube-dl's JSON data.
I am using perl just because it can handle extended regex. But perl is not required. Any better substitute for extended regex on macOS is welcome.
You are building the following Perl program:
s//home/username\/_TV_rips\///g
That's quite wrong.
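To see why, it can help to print the string the shell produces before Perl ever runs. A small sketch, using a made-up demo_home variable in place of $HOME:

```shell
# demo_home is a hypothetical stand-in for $HOME.
demo_home='/home/username'

# Inside double quotes the shell expands the variable but leaves \/ untouched.
program="s/$demo_home\/_TV_rips\///g"

# Show the Perl code exactly as perl would receive it.
printf '%s\n' "$program"
```

The unescaped slashes that came from the variable end the pattern early, so Perl sees an empty pattern followed by text it cannot parse as intended.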
You shouldn't be attempting to build Perl code from the shell in the first place. There are a few ways you could pass values to the Perl code instead of generating Perl code. Since the value is conveniently in the environment, we can use
perl -i -pe's/\Q$ENV{HOME}\E\/_TV_rips\///' *"$videoID.info.json"
or better yet
perl -i -pe's{\Q$ENV{HOME}\E/_TV_rips/}{}' *"$videoID.info.json"
(Also note the lack of eval and the fixed quoting on the glob.)
Just assembling the ideas from the comments, this should achieve what you expected:
perl -pi -e 's{$ENV{HOME}/_TV_rips/}{}g' *$videoID.info.json
@ikegami thanks for your comment! It is indeed safer with \Q...\E, in case $HOME contains characters like $.
All regex delimiters must of course be escaped in the input string.
But as Stefen stated, you can use other delimiters in perl, like % or §.
Special characters:
# (Perl comment character): don't use this as a delimiter
?, [], {}, $, ^, . (regex control characters): must be escaped in the regex
Using a different delimiter makes things easier if you have many slashes in your string.
You should always write a comment to make clear you are using different delimiters, because this makes your regex hard to read for inexperienced users.
Try out your RegEx here: https://regex101.com/r/cIWk1o/1
I have this regex for now
It should catch something like this
org.package;version="[1.0.41, 1.0.51)" followed optionally by "," if it is not the last element.
Also, after the package name I added .* because the package could be "org.package.util.something" up to ";version".
I tried it in an online regex tool and it works like this:
org.package(.*.*)?;version="[[0-9].[0-9].[0-9][0-9],\s[0-9].[0-9].[0-9][0-9])",?
but I don't know what I should change so that it works in bash:
package="org.package"
sed -i "s/"$$package.*;version="\[[0-9].[0-9].[0-9][0-9],[[:space:]][0-9].[0-9].[0-9][0-9]\)",?"//g" "$file"
Replace the double quotes around the sed command with single quotes. To expand $package, close the single quotes and put the variable in double quotes. Also add -E, so that \) and ? keep the meaning they had in the online tester:
package="org.package"
sed -i -E 's/'"$package"'.*;version="\[[0-9].[0-9].[0-9][0-9],[[:space:]][0-9].[0-9].[0-9][0-9]\)",?//g' "$file"
Before using the command with the -i option, check that the output is correct.
There is more than one problem:
$$ will be replaced by bash with its PID, which is probably not what you want
online regex evaluators usually use extended regex or perl regex syntax
sed -r will enable extended regex mode (for grep there's -E and -P)
You use . when you want to match literal dots. However you should be using \., because . actually means "any character" in regular expressions.
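Putting those points together, a sketch might look like the following. The sample line and the package_re helper variable are made up for illustration, the version pattern is loosened to one or more digits per component (an assumption about the data), and -E is used as the extended-regex switch:

```shell
package='org.package'
# Hypothetical input line.
line='org.package.util;version="[1.0.41, 1.0.51)",next'

# Escape the dots in the package name so they only match literal dots.
package_re=$(printf '%s' "$package" | sed 's/\./\\./g')

# Single quotes protect the regex; the variable is spliced in with double quotes.
result=$(printf '%s\n' "$line" |
  sed -E 's/'"$package_re"'[^;]*;version="\[[0-9]+\.[0-9]+\.[0-9]+,[[:space:]]*[0-9]+\.[0-9]+\.[0-9]+\)",?//g')

printf '%s\n' "$result"
```

Running it without -i on a sample line first makes it easy to check the result before touching a real file.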
I have a working (in macOS app Patterns) RegExp that reformats GeoJSON MultiPolygon coordinates, but don't know how to escape it for sed.
The file I'm working on is over 90 MB in size, so the bash terminal looks like the ideal place and sed the perfect tool for the job.
Search Text Example:
[[[379017.735,6940036.7955],[379009.8431,6940042.5761],[379000.4869,6940048.9545],[378991.5455,6940057.8128],[378984.0665,6940066.0744],[378974.7072,6940076.2152],[378962.8639,6940090.5283],[378954.5822,6940101.4028],[378947.9369,6940111.3128],[378941.4564,6940119.5094],[378936.2565,6940128.1229],[378927.6089,6940141.4764],[378919.6611,6940154.0312],[378917.21,6940158.7053],[378913.7614,6940163.4443],[378913.6515,6940163.5893],[378911.4453,6940166.3531],
Desired outcome:
[[[37.9017735,69.400367955],[37.90098431,69.400425761],[37.90004869,69.400489545],[37.89915455,69.400578128],[37.89840665,69.400660744],[37.89747072,69.400762152],[37.89628639,69.400905283],[37.89545822,69.401014028],[37.89479369,69.401113128],[37.89414564,69.401195094],[37.89362565,69.401281229],[37.89276089,69.401414764],[37.89196611,69.401540312],[37.891721,69.401587053],[37.89137614,69.401634443],[37.89136515,69.401635893],[37.89114453,69.401663531],
My current RegExp:
((?:\[)[0-9]{2})([0-9]+)(\.)([0-9]+)(,)([0-9]{2})([0-9]+)(\.)([0-9]+(?:\]))
and reformatting:
$1\.$2$4,$6.$7$9
The command should be something along these lines:
sed -i -e 's/ The RegExp escaped /$1\.$2$4,$6.$7$9/g' large_file.geojson
But what should be escaped in the RegExp to make it work?
My attempts always complain of being unbalanced.
I'm sorry if this has already been answered elsewhere, but I couldn't find an answer even after extensive searching.
Edit: 2017-01-07: I didn't make it clear that the file contains properties other than just the GPS-points. One of the other example values picked from GeoJSON Feature properties is "35.642.1.001_001", which should be left unchanged. The braces check in my original regex is there for this reason.
That regex is not legal in sed; since it uses Perl syntax, my recommendation would be to use perl instead. The regular expression works exactly as-is, and even the command line is almost the same; you just need to add the -p option to get perl to operate in filter mode (which sed does by default). I would also recommend adding an argument suffix to the -i option (whether using sed or perl), so that you have a backup of the original file in case something goes horribly wrong. As for quoting, all you need to do is put the substitution command in single quotation marks:
perl -p -i.bak -e \
's/((?:\[)[0-9]{2})([0-9]+)(\.)([0-9]+)(,)([0-9]{2})([0-9]+)(\.)([0-9]+(?:\]))/$1\.$2$4,$6.$7$9/g' \
large_file.geojson
If your data is just like you showed, you needn't worry about the brackets. You may use a POSIX ERE enabled with -E (or -r in some other distributions) like this:
sed -i -E 's/([0-9]{2})([0-9]*)\.([0-9]+)/\1.\2\3/g' large_file.geojson
Or a POSIX BRE:
sed -i 's/\([0-9]\{2\}\)\([0-9]*\)\.\([0-9]\+\)/\1.\2\3/g' large_file.geojson
See an online demo.
You may see how this regex works here (just a demo, not proof).
Note that in POSIX BRE you need to escape { and } in limiting / range quantifiers and ( and ) in grouping constructs, and the + quantifier, else they denote literal symbols. In POSIX ERE, you do not need to escape the special chars to make them special, this POSIX flavor is closer to the modern regexes.
Also, you need to use \n notation inside the replacement pattern, not $n.
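As a quick check of both flavors on a shortened sample (the variable names are only for illustration). The BRE variant here spells the one-or-more quantifier as [0-9][0-9]* instead of \+, since \+ in BRE is a GNU extension:

```shell
# Shortened, made-up sample taken from the coordinates in the question.
sample='[[[379017.735,6940036.7955],[378917.21,6940158.7053]]]'

# POSIX ERE: groups and interval quantifiers are written without backslashes.
ere=$(printf '%s\n' "$sample" | sed -E 's/([0-9]{2})([0-9]*)\.([0-9]+)/\1.\2\3/g')

# POSIX BRE: groups and interval quantifiers need backslashes.
bre=$(printf '%s\n' "$sample" | sed 's/\([0-9]\{2\}\)\([0-9]*\)\.\([0-9][0-9]*\)/\1.\2\3/g')

printf '%s\n%s\n' "$ere" "$bre"
```

Both commands should print the same line, with the backreferences written as \1, \2, \3 in either flavor.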
A simple sed will do it:
$ echo "$var"
[[[379017.735,6940036.7955],[379009.8431,6940042.5761],[379000.4869,6940048.9545],[378991.5455,6940057.8128],[378984.0665,6940066.0744],[378974.7072,6940076.2152],[378962.8639,6940090.5283],[378954.5822,6940101.4028],[378947.9369,6940111.3128],[378941.4564,6940119.5094],[378936.2565,6940128.1229],[378927.6089,6940141.4764],[378919.6611,6940154.0312],[378917.21,6940158.7053],[378913.7614,6940163.4443],[378913.6515,6940163.5893],[378911.4453,6940166.3531],
$ echo "$var" | sed 's/\([0-9]\{3\}\)\./.\1/g'
[[[379.017735,6940.0367955],[379.0098431,6940.0425761],[379.0004869,6940.0489545],[378.9915455,6940.0578128],[378.9840665,6940.0660744],[378.9747072,6940.0762152],[378.9628639,6940.0905283],[378.9545822,6940.1014028],[378.9479369,6940.1113128],[378.9414564,6940.1195094],[378.9362565,6940.1281229],[378.9276089,6940.1414764],[378.9196611,6940.1540312],[378.91721,6940.1587053],[378.9137614,6940.1634443],[378.9136515,6940.1635893],[378.9114453,6940.1663531],
I have a perl one-liner that works when log statements are on a single line:
find -type f -iname "*java" | xargs -d'\n' -n 1 perl -i -pe 's{(log.*((info)|(debug)).*)}{//$1}gi'
But trying to modify this to work on multiple lines is tricky. I know that the s modifier will match newlines, but how do I get it to comment out subsequent lines (i.e. up to ; assuming the log string doesn't have it)?
I'm fine with a solution that makes multi-line log statements into single-line log statements too. I'll also accept C-style comments (though it would be nice to find a solution for C++ style comments).
(Don't tell me to turn off logging. Anyone who's actually tried that will realize how brutally complicated that is in non-trivial applications.)
Just a general idea (please adapt to your case...)
...perl -i -p0e 's{(log.*?((info)|(debug)).*?;)}{ $1 =~ s!^|\n!\n//!gr }gsei'
where:
.*? instead of .* to behave non-greedy
-p0e to process the full text as a single record
$1 =~ s!^|\n!\n//!gr to make extra processing of internal newlines
Please test it before application...
You can use the range operator, start .. stop:
perl -i -pe 's!^!//! if /log.*(info|debug)/ .. /;/'