In bash, I want to get the name of the last folder in a folder path.
For instance, given ../parent/child/, I want "child" as the output.
In a language other than bash, the regex .*\/(.*)\/$ works.
Here's one of my attempts in bash:
echo "../parent/child/" | sed "s_.*/\(.*?\)/$_\1_p"
This gives me the error:
sed: -e expression #1, char 17: unterminated `s' command
What have I failed to understand?
One problem with your script is that inside the "s_.*/\(.*?\)/$_\1_p" the $_ is interpreted by the shell as a variable name.
You could either replace the double-quotes with single-quotes or escape the $.
Once that's fixed, the .*? may or may not work with your implementation of sed. It will be more robust to write something roughly equivalent that's more widely supported, for example:
sed -e 's_.*/\([^/]*\)/$_\1_'
Note that I dropped the p flag of sed to avoid printing the result twice.
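To try the corrected command, it can be wrapped in a small function. This is only a sketch: the name last_dir_sed is made up for illustration, and it assumes the path ends in a slash and contains at least one other slash.

```shell
# Print the last directory component of a path that ends in a slash.
# (last_dir_sed is a hypothetical helper name, not part of any tool.)
last_dir_sed() {
  # Single quotes keep the shell from expanding $_ inside the sed script.
  printf '%s\n' "$1" | sed -e 's_.*/\([^/]*\)/$_\1_'
}

last_dir_sed ../parent/child/   # prints: child
```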
A much simpler solution is to use the basename command:
$ basename ../parent/child/
child
Finally, a native Bash solution is also possible using parameter expansion:
path=../parent/child/
path=${path%/}
path=${path##*/}
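The two expansions above can be bundled into a function so the caller's variable is not overwritten. A minimal sketch, where last_dir is a hypothetical name:

```shell
# Print the last component of a path using only parameter expansion.
# (last_dir is a hypothetical helper name.)
last_dir() {
  local path="$1"
  path=${path%/}      # drop one trailing slash, if present
  path=${path##*/}    # drop everything up to and including the last slash
  printf '%s\n' "$path"
}

last_dir ../parent/child/   # prints: child
last_dir ../parent/child    # prints: child
```

No external process is started, which matters if this runs in a tight loop.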
You can use cut too, as long as the directory you want is always the third /-separated field:
echo '../parent/child/' | cut -d/ -f3
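The cut command above depends on a fixed field number. If the depth varies, one possible alternative (using awk instead of cut, so a different tool from the one suggested above) is to print the last non-empty field. The name last_field is hypothetical:

```shell
# Print the last non-empty /-separated field of a path, whatever its depth.
# (last_field is a hypothetical helper name; awk is swapped in for cut here.)
last_field() {
  printf '%s\n' "$1" |
    awk -F/ '{ n = NF; while (n > 1 && $n == "") n--; print $n }'
}

last_field ../parent/child/   # prints: child
last_field a/b/c/d/e/         # prints: e
```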
I have a working (in macOS app Patterns) RegExp that reformats GeoJSON MultiPolygon coordinates, but don't know how to escape it for sed.
The file I'm working on is over 90 Mb in size, so bash terminal looks like the ideal place and sed the perfect tool for the job.
Search Text Example:
[[[379017.735,6940036.7955],[379009.8431,6940042.5761],[379000.4869,6940048.9545],[378991.5455,6940057.8128],[378984.0665,6940066.0744],[378974.7072,6940076.2152],[378962.8639,6940090.5283],[378954.5822,6940101.4028],[378947.9369,6940111.3128],[378941.4564,6940119.5094],[378936.2565,6940128.1229],[378927.6089,6940141.4764],[378919.6611,6940154.0312],[378917.21,6940158.7053],[378913.7614,6940163.4443],[378913.6515,6940163.5893],[378911.4453,6940166.3531],
Desired outcome:
[[[37.9017735,69.400367955],[37.90098431,69.400425761],[37.90004869,69.400489545],[37.89915455,69.400578128],[37.89840665,69.400660744],[37.89747072,69.400762152],[37.89628639,69.400905283],[37.89545822,69.401014028],[37.89479369,69.401113128],[37.89414564,69.401195094],[37.89362565,69.401281229],[37.89276089,69.401414764],[37.89196611,69.401540312],[37.891721,69.401587053],[37.89137614,69.401634443],[37.89136515,69.401635893],[37.89114453,69.401663531],
My current RegExp:
((?:\[)[0-9]{2})([0-9]+)(\.)([0-9]+)(,)([0-9]{2})([0-9]+)(\.)([0-9]+(?:\]))
and reformatting:
$1\.$2$4,$6.$7$9
The command should be something along these lines:
sed -i -e 's/ The RegExp escaped /$1\.$2$4,$6.$7$9/g' large_file.geojson
But what should be escaped in the RegExp to make it work?
My attempts always complain of being unbalanced.
I'm sorry if this has already been answered elsewhere, but I couldn't find it even after extensive searching.
Edit: 2017-01-07: I didn't make it clear that the file contains properties other than just the GPS-points. One of the other example values picked from GeoJSON Feature properties is "35.642.1.001_001", which should be left unchanged. The braces check in my original regex is there for this reason.
That regex is not legal in sed; since it uses Perl syntax, my recommendation would be to use perl instead. The regular expression works exactly as-is, and even the command line is almost the same; you just need to add the -p option to get perl to operate in filter mode (which sed does by default). I would also recommend adding an argument suffix to the -i option (whether using sed or perl), so that you have a backup of the original file in case something goes horribly wrong. As for quoting, all you need to do is put the substitution command in single quotation marks:
perl -p -i.bak -e \
's/((?:\[)[0-9]{2})([0-9]+)(\.)([0-9]+)(,)([0-9]{2})([0-9]+)(\.)([0-9]+(?:\]))/$1\.$2$4,$6.$7$9/g' \
large_file.geojson
If your data is just like you showed, you needn't worry about the brackets. You may use a POSIX ERE, enabled with -E (or -r in older GNU sed versions), like this:
sed -i -E 's/([0-9]{2})([0-9]*)\.([0-9]+)/\1.\2\3/g' large_file.geojson
Or a POSIX BRE:
sed -i 's/\([0-9]\{2\}\)\([0-9]*\)\.\([0-9]\+\)/\1.\2\3/g' large_file.geojson
Note that in POSIX BRE you need to escape { and } in limiting / range quantifiers and ( and ) in grouping constructs, else they denote literal symbols; the \+ quantifier is a GNU extension to BRE. In POSIX ERE, you do not need to escape these characters to make them special; this flavor is closer to modern regexes.
Also, you need to use \n notation inside the replacement pattern, not $n.
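To see the two flavors side by side, both can be run as filters on a short sample. This is a sketch: the function names are hypothetical, and the BRE variant writes [0-9][0-9]* instead of [0-9]\+ on the assumption that it should also run on a sed without the GNU \+ extension.

```shell
# ERE: grouping and {n} are special without backslashes.
# (shift_ere and shift_bre are hypothetical helper names.)
shift_ere() {
  sed -E 's/([0-9]{2})([0-9]*)\.([0-9]+)/\1.\2\3/g'
}

# BRE: grouping and {n} need backslashes; "one or more" is spelled out.
shift_bre() {
  sed 's/\([0-9]\{2\}\)\([0-9]*\)\.\([0-9][0-9]*\)/\1.\2\3/g'
}

sample='[[[379017.735,6940036.7955],[379009.8431,6940042.5761]]]'
printf '%s\n' "$sample" | shift_ere
printf '%s\n' "$sample" | shift_bre
# both print: [[[37.9017735,69.400367955],[37.90098431,69.400425761]]]
```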
A simple sed will do it:
$ echo "$var"
[[[379017.735,6940036.7955],[379009.8431,6940042.5761],[379000.4869,6940048.9545],[378991.5455,6940057.8128],[378984.0665,6940066.0744],[378974.7072,6940076.2152],[378962.8639,6940090.5283],[378954.5822,6940101.4028],[378947.9369,6940111.3128],[378941.4564,6940119.5094],[378936.2565,6940128.1229],[378927.6089,6940141.4764],[378919.6611,6940154.0312],[378917.21,6940158.7053],[378913.7614,6940163.4443],[378913.6515,6940163.5893],[378911.4453,6940166.3531],
$ echo "$var" | sed 's/\([0-9]\{3\}\)\./.\1/g'
[[[379.017735,6940.0367955],[379.0098431,6940.0425761],[379.0004869,6940.0489545],[378.9915455,6940.0578128],[378.9840665,6940.0660744],[378.9747072,6940.0762152],[378.9628639,6940.0905283],[378.9545822,6940.1014028],[378.9479369,6940.1113128],[378.9414564,6940.1195094],[378.9362565,6940.1281229],[378.9276089,6940.1414764],[378.9196611,6940.1540312],[378.91721,6940.1587053],[378.9137614,6940.1634443],[378.9136515,6940.1635893],[378.9114453,6940.1663531],
I have a perl one-liner that works when log statements are on a single line:
find -type f -iname "*java" | xargs -d'\n' -n 1 perl -i -pe 's{(log.*((info)|(debug)).*)}{//$1}gi'
But trying to modify this to work on multiple lines is tricky. I know that the s modifier will match newlines, but how do I get it to comment out subsequent lines (i.e. up to ; assuming the log string doesn't have it)?
I'm fine with a solution that makes multi-line log statements into single-line log statements too. I'll also accept C-style comments (though it would be nice to find a solution for C++ style comments).
(Don't tell me to turn off logging. Anyone who's actually tried that will realize how brutally complicated that is in non-trivial applications.)
Just a general idea (please adapt to your case...)
...perl -i -p0e 's{(log.*?((info)|(debug)).*?;)}{ $1 =~ s!^|\n!\n//!gr }gsei'
where:
.*? instead of .* to behave non-greedy
-p0e to process the full text as a single record
$1 =~ s!^|\n!\n//!gr to make extra processing of internal newlines
Please test it before application...
You can use the range operator, start .. stop:
perl -i -pe 's!^!//! if /log.*(info|debug)/ .. /;/'
I'm trying to parse a command's help file to grab all the arguments the command accepts.
Here is some text from the help file:
* --digest:
Set the digest for fingerprinting (defaults to the digest used when
signing the cert). Valid values depends on your openssl and openssl ruby
extension version.
* --debug:
Enable full debugging.
* --help:
Print this help message
* --verbose:
Enable verbosity.
* --version:
Print the puppet version number
I want to just grab --argument and nothing else.
I almost got it with this command, but it's still including the ":" which I want to exclude:
puppet cert --help | egrep '^* --(.*):$' | awk '{print $2}'
--all:
--allow-dns-alt-names:
--digest:
--debug:
--help:
--verbose:
--version:
Why is '^* --(.*):$' including the ":"? Shouldn't it be matching everything between '^* --' and ':$'?
shouldn't it be matching everything between ^* -- and :$ ?
Actually, no. You're capturing a group, but grep won't print just the group. I suggest using the -P flag to enable Perl-compatible regexes, together with \K and lookarounds. In your case, this might be enough:
$ cert --help | grep -Po '^\* \K--\w+'
Note that I also used the -o option, to print only the matched content, not the whole line. This eliminates the need for awk.
A more complete line based on your initial thoughts and more lookarounds:
$ cert --help | grep -Po '^\* \K--.*(?=:)'
Edit: as noted in the comments and fine answer by mklement0, this requires GNU grep. You can however do the same with Perl itself, which is probably already installed on your system.
$ cert --help | perl -nle 'print $1 if /^\* (--\w+)/'
This works like a line of code inside a loop that is generated automatically by -nle: -n for the input loop, -l for automatic line-ending handling, and -e to supply the line of code.
The line of Perl code prints the first captured group if the line matches the regex. So it combines ideas from your original solution too.
For a complete POSIX-compliant answer, check the answer by mklement0 on this page.
To provide a POSIX-compliant alternative to sidyll's elegant GNU grep answer (which also explains why the OP's approach didn't work):
Update: Avinash Raj points out in a comment that sed is an option, which indeed allows for a POSIX-compliant single-tool solution: sed allows us to match entire lines of interest and replace them with the contents of a capture group (the part of the line of interest):
puppet cert --help | sed -n 's/^\* \(--.*\):$/\1/p'
Note that since sed is used without the - nonstandard - -r / -E option, a basic regular expression must be used, where ( and ) must be \-escaped to act as capture-group delimiters.
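The sed command can be checked against a few lines of the help text quoted in the question, without needing puppet installed. A sketch; list_options_sed is a hypothetical name:

```shell
# Print only the --option part of lines shaped like "* --option:".
# (list_options_sed is a hypothetical helper name.)
list_options_sed() {
  sed -n 's/^\* \(--.*\):$/\1/p'
}

list_options_sed <<'EOF'
* --debug:
Enable full debugging.
* --help:
Print this help message
EOF
# prints:
# --debug
# --help
```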
Original answer:
puppet cert --help | egrep '^\* --.+:$' | awk -F '\\* |:' '{print $2}'
Note:
^* was replaced with ^\* so as to ensure that * is matched as a literal, and (.*) was replaced with .+, because (a) there is nothing to be gained by a capture group here, and (b) it's fair to assume that at least one letter follows the --.
-F '\\* |:' uses either literal *<space> or : as the field separator, which ensures that only the --... token (the second field) is printed.
My following regex in sed doesn't give me the file path without the #70 substring.
Could you please help pointing out what I am missing here?
[machine]# echo "//dir1/dir2/dir3/component/file.rb#70" | sed 's/\(.*rb\)#\d+$/\1/g'
Output: //dir1/dir2/dir3/component/file.rb#70
What I want is simply: //dir1/dir2/dir3/component/file.rb without #70 substring.
Thanks in advance
PL
The flavor of regular expression understood by sed by default doesn't include either \d for digits or + for "1 or more".
This will work:
sed 's/\(.*\.rb\)#[0-9][0-9]*$/\1/g'
Or you could turn on "extended" regular expression syntax with -E, which makes the + work (though still not \d), and swaps the meaning of backslashed vs non-backslashed parentheses:
sed -E 's/(.*\.rb)#[0-9]+$/\1/g'
Both of the above commands will work even on non-GNU sed, as you get by default on BSD and Mac OS X systems. In normal mode (without the -E), GNU sed also understands \+ to mean the same as bare + in extended mode, but BSD sed does not.
If all you're trying to do is get rid of the #digits, though, you can do it more simply. Sed regexes aren't anchored to the start of the line, so you don't have to include the filename - just replace the part you don't want with nothing at all:
sed 's/#[0-9][0-9]*$//'
or
sed -E 's/#[0-9]+$//'
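For a quick check, the portable form can be wrapped in a function and run on the path from the question plus a path that should stay untouched. A sketch; strip_rev and the second sample path are made up for illustration:

```shell
# Remove a trailing "#digits" suffix; leave everything else alone.
# (strip_rev is a hypothetical helper name.)
strip_rev() {
  printf '%s\n' "$1" | sed 's/#[0-9][0-9]*$//'
}

strip_rev '//dir1/dir2/dir3/component/file.rb#70'   # prints: //dir1/dir2/dir3/component/file.rb
strip_rev '//dir1/notes#draft.rb'                   # prints: //dir1/notes#draft.rb
```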
If your real problem does require the fancy version, though, you could also use Perl, which has the advantage that there are relatively few (almost no) changes in regex syntax across versions. It also understands that \d syntax you tried to use:
perl -pe 's/(.*\.rb)#\d+$/$1/g'
With GNU sed, your command works if you use -E and change \d to [0-9] or [[:digit:]]:
echo "//dir1/dir2/dir3/component/file.rb#70" | sed -E 's/(.*rb)#[0-9]+$/\1/g'
//dir1/dir2/dir3/component/file.rb
Depending on the context, you may be able to use a simpler command, such as
sed 's/#[0-9]\+//g'
You got the answer but have you considered simply:
$ echo "//dir1/dir2/dir3/component/file.rb#70" | cut -d'#' -f1
//dir1/dir2/dir3/component/file.rb
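If the value is already in a shell variable, parameter expansion gives the same result as the cut command without starting another process. A sketch with a hypothetical function name; like cut -f1, it keeps everything before the first #:

```shell
# Keep everything before the first "#".
# (before_hash is a hypothetical helper name.)
before_hash() {
  printf '%s\n' "${1%%#*}"   # %% removes the longest suffix matching "#*"
}

before_hash '//dir1/dir2/dir3/component/file.rb#70'   # prints: //dir1/dir2/dir3/component/file.rb
```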