sed remove digits at end of the line - regex

I need to find out how to delete up to 10 digits that are at the end of the line in my text file using sed.
For example if I have this:
ajsdlfkjasldf1234567890
asdlkjfalskdjf123456
adsf;lkjasldfkjas123
it should become:
ajsdlfkjasldf
asdlkjfalskdjf
adsf;lkjasldfkjas
can anyone help?
I have this, but it's not working:
sed 's/[0-9]{10}$//g'

Have you tried this (it removes all trailing digits, however many there are):
sed -E 's/[0-9]+$//'
Your command only matches exactly 10 digits at the end of the line, and only if extended regular expressions are enabled (-E or -r, depending on your version of sed). Without that option, { and } are literal characters.
To remove between 1 and 10 digits, try:
sed -r 's/[0-9]{1,10}$//'

The following should work:
sed 's/[0-9]\{1,10\}$//' file
Regex syntax in sed requires backslashes before the braces to use them for repetition, unless you use an extended regex option (-E or -r).
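As a quick check of the two syntaxes, here is a minimal sketch. The function names strip_bre and strip_ere are made up for illustration, and -E is assumed to be available (GNU and BSD sed both accept it).

```shell
# BRE (default): interval braces must be backslash-escaped.
strip_bre() { sed 's/[0-9]\{1,10\}$//'; }

# ERE (-E): the braces are written without backslashes.
strip_ere() { sed -E 's/[0-9]{1,10}$//'; }

# The three sample lines from the question, run through both variants.
printf '%s\n' 'ajsdlfkjasldf1234567890' 'asdlkjfalskdjf123456' 'adsf;lkjasldfkjas123' | strip_bre
printf '%s\n' 'ajsdlfkjasldf1234567890' 'asdlkjfalskdjf123456' 'adsf;lkjasldfkjas123' | strip_ere
```

Both pipelines should print the three lines with their trailing digits removed.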

You could try this:
$ sed 's/[0-9]\{0,10\}$//g'
{ } should be escaped, unless you switch to extended regex syntax:
$ sed -r 's/[0-9]{0,10}$//g'
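One edge case worth knowing: because the interval is capped at 10 and anchored with $, a line ending in more than 10 digits loses only its last 10. The g flag is also not needed, since there is only one end of line to match. A small sketch, with strip_up_to_10 as a hypothetical helper name:

```shell
# Hypothetical helper: remove at most 10 trailing digits from each input line.
strip_up_to_10() { sed -E 's/[0-9]{1,10}$//'; }

# 12 trailing digits: only the last 10 are removed, so "abc12" remains.
printf '%s\n' 'abc123456789012' | strip_up_to_10

# No trailing digits: the line passes through unchanged.
printf '%s\n' 'no digits here' | strip_up_to_10
```

If every trailing digit should go regardless of count, [0-9]+$ with -E is the simpler pattern.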

Related

Why does this regex work in grep but not sed?

I have two regular expressions:
$ grep -E '\-\- .*$' *.sql
$ sed -E '\-\- .*$' *.sql
(I am trying to grep lines in sql files that have comments and remove lines in sql files that have comments)
The grep command works using this regex; however, the sed returns the following error:
sed: -e expression #1, char 7: unterminated address regex
What am I doing incorrectly with sed?
(The space after the two hyphens is required for sql comments if you are unfamiliar with MySql comments of this type)
You're trying to use:
sed -E '\-\- .*$' *.sql
This sed command is incorrect because it never tells sed what to do.
It should be:
sed -n '/-- /p' *.sql
and equivalent grep would be:
grep -- '-- ' *.sql
or even better with a fixed string search:
grep -F -- '-- ' *.sql
The -- marks the end of grep's options, so the pattern '-- ' that follows is not parsed as an option.
There is no need to escape - in a regex if it is outside a bracket expression (or character class), i.e. [...].
Based on the comments below, it seems the OP's intent is to remove the comments that start with two hyphens from all *.sql files.
You can use this sed command for that:
sed -i 's/-- .*//g' *.sql
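To try these commands without touching real files, here is a minimal sketch that reads from a pipe instead of using -i. The sample SQL and the function names are invented for illustration.

```shell
# Made-up sample input standing in for a .sql file.
sample() {
  printf '%s\n' 'SELECT 1; -- trailing comment' '-- full-line comment' 'SELECT 2;'
}

# Print only the lines that contain "-- " (same idea as grep -F -- '-- ').
show_comment_lines() { sed -n '/-- /p'; }

# Remove everything from "-- " to the end of the line; nothing is edited on disk.
strip_comments() { sed 's/-- .*//'; }

sample | show_comment_lines
sample | strip_comments
```

Note that strip_comments leaves an empty line where a full-line comment used to be, and it would also cut a "-- " that appears inside a string literal, so it is worth reviewing the output before adding -i.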
The problem here is not the regex, the problem is that sed requires a command. The equivalent of your grep would be:
sed -n '/\-\- .*$/p'
The -n option suppresses the default output, the slashes wrap the regex you search for, and the p after the last slash prints the matching lines.
P.S.: As Anub pointed out, escaping the hyphens - inside the regex is unnecessary.
You are using sed's \cregexpc syntax: by writing \-<...> you tell sed that the delimiter character is a dash (-), but you never terminate the regex with a closing dash, as in \-<...>-. You also need to add the d command to delete those lines:
sed '\-\-\-.*$-d' infile
See man sed:
\cregexpc
Match lines matching the regular expression regexp. The c may be any character.
With the default / delimiter none of this is needed:
sed '/--.*$/d' infile
or simply:
sed '/^--/d' infile
and more accurately:
sed '/^[[:blank:]]*--/d' infile
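The custom-delimiter form and the default / form select the same lines. A minimal sketch, using a comma as the delimiter to keep it visually distinct from the hyphens in the pattern; the sample input and function names are made up:

```shell
# Made-up sample input with a plain and an indented comment line.
sample() { printf '%s\n' '-- comment' '  -- indented comment' 'SELECT 1;'; }

# Default delimiter: delete lines starting with optional blanks followed by "--".
delete_default() { sed '/^[[:blank:]]*--/d'; }

# Custom delimiter: the leading backslash tells sed that "," delimits the regex.
delete_custom() { sed '\,^[[:blank:]]*--,d'; }

sample | delete_default
sample | delete_custom
```

Both calls should print only the SELECT line.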

Replace some dots(.) with commas(,) with RegEx and awk or sed

I want to replace dots with commas for some but not all matches:
hostname_metric (Index: 1) to hostname;metric (avg);22.04.2015 13:40:00;3.0000;22.04.2015 02:05:00;2.0000;22.04.2015 02:00:00;650.7000;2.2594;
The outcome should look like this:
hostname_metric (Index: 1) to hostname;metric (avg);22.04.2015 13:40:00;3,0000;22.04.2015 02:05:00;2,0000;22.04.2015 02:00:00;650,7000;2,2594;
I was able to identify the RegEx which should work to find the correct dots.
;[0-9]{1,}\.[0-9]{4}
But how can I replace them with a comma with awk or sed?
Thanks in advance!
Adding some capture groups to the regex in your question, you can use this sed one-liner:
sed -r 's/(;[0-9]{1,})\.([0-9]{4})/\1,\2/g' file
This matches and captures the part before and after the . and uses them in the replacement string.
On some versions of sed, you may need to use -E instead of -r to enable Extended Regular Expressions. If your version of sed doesn't understand either switch, you can use basic regular expressions and add a few escape characters:
sed 's/\(;[0-9]\{1,\}\)\.\([0-9]\{4\}\)/\1,\2/g' file
With GNU sed (\+ is a GNU extension to basic regular expressions), sed 's/\(;[0-9]\+\)\.\([0-9]\{4\}\)/\1,\2/g' should do the trick.
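A short sketch of the capture-group substitution, wrapped in a hypothetical function dots_to_commas and fed a shortened line in the same shape as the sample. It assumes a sed that accepts -E.

```shell
# Hypothetical helper: replace the dot in ";<digits>.<4 digits>" with a comma.
# \1 is the part before the dot, \2 the four digits after it.
dots_to_commas() { sed -E 's/(;[0-9]{1,})\.([0-9]{4})/\1,\2/g'; }

# The date keeps its dots because ";22." is followed by only two digits.
printf '%s\n' 'a;22.04.2015 13:40:00;3.0000;650.7000;' | dots_to_commas
```

The date is left alone because the pattern demands exactly four digits straight after the dot.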

Bash: sed regex pattern won't match strings

I have tested this particular regex in RegExr.com:
/(\*)*((\s)?(\w)*)/g
to match the following:
* Global Links contained...etc
* Change User, contact list...etc
(everything from ... on is just extra words in the sentence, not a literal ...etc)
I tried to use this regex in a sed command as part of a bash script like so:
sed "/(\*)*((\s)?(\w)*)/d" test.txt > stripped.txt
But these two lines still remain in stripped.txt. Is there something I'm not accounting for in the regex or in the file? Before these two lines is the start of a block comment (/**), and the end of the block comment (*/) comes after them; both are on their own lines. Am I missing something obscure with newlines, or is the sed command/regex wrong?
You aren't accounting for the regex dialect sed uses by default. In a BRE (basic regular expression), unescaped ( ) and ? are literal characters, so the pattern does not mean what it means on RegExr.
You need to tell sed to use EREs (extended regular expressions).
For GNU sed that is the -r flag and for BSD sed that is the -E flag (though -r is often available as a compat flag, and recent GNU sed also accepts -E).
sed -r "/(\*)*((\s)?(\w)*)/d" test.txt > stripped.txt
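One caveat: every part of that pattern is optional, so as an ERE it also matches the empty string, and the d command would then delete every line. The sketch below is an assumption about what was intended, not the original poster's regex. It anchors the match so that only lines beginning with optional whitespace, an asterisk, and a whitespace character are removed. The function name and sample text are invented.

```shell
# Hypothetical helper: delete " * text" lines inside a block comment,
# leaving the "/**" and "*/" delimiter lines and all other lines alone.
strip_star_lines() { sed -E '/^[[:space:]]*\*[[:space:]]/d'; }

printf '%s\n' '/**' ' * Global Links contained' ' * Change User, contact list' ' */' 'int x;' | strip_star_lines
```

The POSIX class [[:space:]] is used here instead of \s because \s and \w are GNU extensions.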

How to use sed and regex?

I need to use sed to look for all lines in a file with pattern "[whatever]|[whatever]" so I'm using the following regex:
sed '/\"[a-zA-Z0-9]+\|[a-zA-Z0-9]+\"/p' test2.txt
But it's not working: for this file it returns the following line when it shouldn't:
RTV0031605951US|3160595|20/03/2013|0|"Laurie Graham"|"401"
Does anybody know which regex I should use? Thanks in advance
I see three problems with your regular expression:
+ is not a metacharacter in basic regular expressions, so you need to escape it (\+, a GNU extension) to get its special meaning.
The pipe has a similar issue: it is not a metacharacter either, so don't escape it if you want to match it literally.
By default sed prints every line, so when you use the p command add -n to suppress that. Otherwise the matching lines appear twice in the output.
sed matches any line that contains a partial match.
To match only lines that match your regex in full, anchor it with ^ and $ (and use -n -E so that only matches are printed and + works as a quantifier):
sed -n -E '/^\"[a-zA-Z0-9]+\|[a-zA-Z0-9]+\"$/p' test2.txt
sed '/\B\"[ [:alnum:]]\+\"|\"[ [:alnum:]]\+\"\B/!d' file
If you use this in a sed script, do not escape double quotes.
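Putting the points above together, here is a minimal sketch. It assumes the goal is to print lines containing two alphanumeric runs separated by a pipe inside a single pair of double quotes; that is one reading of the question's pattern. The function name is made up.

```shell
# Hypothetical helper: -n suppresses the default output, -E makes + a quantifier,
# and [|] matches a literal pipe without relying on escaping rules.
quoted_pair_lines() { sed -n -E '/"[a-zA-Z0-9]+[|][a-zA-Z0-9]+"/p'; }

# The line from the question is not printed; the second line is.
printf '%s\n' 'RTV0031605951US|3160595|20/03/2013|0|"Laurie Graham"|"401"' 'x|"abc|123"|y' | quoted_pair_lines
```

The bracket expression [|] sidesteps the difference between BRE and ERE, where \| means alternation in GNU BRE but a literal pipe in ERE.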

Replace certain strings from text with SED and REGEX

I have the following strings in a text file (big one, more like these and different):
79A18D7F-1517-5981-8446-3A0452727B06
7842A72D-1517-5281-84E4-EAEF09B743F7
6040BEE7-1517-5982-84C1-419B224E647E
615F2747-1517-5981-84AF-787C34967FB2
7468A3E3-1517-5931-84B3-3FC3F701C269
I can find them using grep and regex:
'[0-9A-F]{8}-[0-9]{4}-[0-9]{4}-[0-9A-F]{4}-[0-9A-F]{12}'
What's the sed regex syntax to delete them? This:
sed "s/[0-9A-F]{8}-[0-9]{4}-[0-9]{4}-[0-9A-F]{4}-[0-9A-F]{12}//g"
doesn't seem to work.
Thanks!
Use sed -r. You are relying on extended regular expression syntax features without escaping them, but with sed -r you don't have to. If you want to actually delete the lines instead of just clearing them, you can use:
sed -r "/regex/d"
In addition, for regular sed (BRE) you would need to escape the curly braces:
sed 's/[0-9A-F]\{8\}-[0-9]\{4\}-[0-9]\{4\}-[0-9A-F]\{4\}-[0-9A-F]\{12\}//g' file
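The difference between clearing the match and deleting the whole line can be seen in a small sketch. The variable id_pattern and the function names are invented for illustration, and -E is assumed to be available.

```shell
# The pattern from the question, kept in a variable so both commands share it.
id_pattern='[0-9A-F]{8}-[0-9]{4}-[0-9]{4}-[0-9A-F]{4}-[0-9A-F]{12}'

# s///g removes the matched text but keeps the (possibly empty) line.
blank_ids() { sed -E "s/$id_pattern//g"; }

# /regex/d removes every line that contains a match.
delete_id_lines() { sed -E "/$id_pattern/d"; }

printf '%s\n' 'id=79A18D7F-1517-5981-8446-3A0452727B06;' | blank_ids
printf '%s\n' '7842A72D-1517-5281-84E4-EAEF09B743F7' 'keep me' | delete_id_lines
```

The first call should print id=; and the second should print only the line without an ID.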