How to show only the first 3 numbers in a string - applescript-objc

I use a shell command that returns me the following:
Version-02520201604
I would like to know how I could show the first 3 digits separated by dots,
i.e. 0.2.5
I tried with sed and awk but couldn't get it to work.
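One possible approach, sketched with sed. It assumes the three digits wanted are the first three digits anywhere in the string, and the function name first_three_digits is made up for illustration:

```shell
# Hypothetical helper: print the first three digits of each input line,
# separated by dots. Lines with fewer than three digits pass through unchanged.
first_three_digits() {
    # Skip leading non-digits, capture three single digits, drop the rest.
    sed 's/^[^0-9]*\([0-9]\)\([0-9]\)\([0-9]\).*/\1.\2.\3/'
}

printf '%s\n' 'Version-02520201604' | first_three_digits
```

In practice the output of the shell command would be piped into the function instead of the printf shown here.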

Related

Awk 3 Spaces + 1 space or hyphen

I have a rather large chart to parse. Each column is separated by either 4 spaces or by 3 spaces and a hyphen (since the numbers in the chart can be negative).
cat DATA.txt | awk "{ print match($0,/\s\s/) }"
does nothing but print a slew of 0's. I'm trying to understand AWK and when to escape, etc, but I'm not getting the hang of it. Help is appreciated.
One line:
1979 1 -0.176 -0.185 -0.412 0.069 -0.129 0.297 -2.132 -0.334 -0.019
1979 1 -0.176 0.185 -0.412 0.069 -0.129 0.297 -2.132 -0.334 -0.019
I would like to get just, say, the second column. I copied the line, but I'd like to see -0.185 and 0.185.
You need to start by thinking about bash quoting, since it is bash which interprets the argument to awk which will be the awk program. Inside double-quoted strings, bash expands $0 to the name of the bash executable (or current script); that's almost certainly not what you want, since it will not be a quoted string. In fact, you almost never want to use double quotes around the awk program argument, so you should get into the habit of writing awk '...'.
Also, awk regular expressions don't understand \s (although GNU awk handles it as an extension). And match returns the position of the match, which I don't think you care about either.
Since by default, awk considers any sequence of whitespace a field separator, you don't really need to play any games to get the fourth column. Just use awk '{print $4}'
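A small sketch of the single-quoted form with default field splitting. The sample rows, and in particular the lengths of their space runs, are assumptions, since the original spacing of the chart is not shown:

```shell
# Assumed sample rows: columns separated by 4 spaces, or 3 spaces and a hyphen.
sample='1979 1   -0.176   -0.185   -0.412
1979 1   -0.176    0.185   -0.412'

# Single quotes keep $4 away from the shell, so awk sees it as "field 4".
# With the default separator, any run of blanks counts as one delimiter.
printf '%s\n' "$sample" | awk '{ print $4 }'
```

With these rows the fourth whitespace-separated field is the value of interest, regardless of whether it is preceded by three or four spaces.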
Why not just use this simple awk
awk '$0=$4' Data.txt
-0.185
0.185
It sets $0 to value in $4 and does the default action, print.
PS: don't use cat with a program that can read the file itself, like awk.
If field 4 may contain 0, the assignment evaluates as false and the line is not printed. You can make it more robust like this:
awk '{$0=$4}1' Data.txt
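The difference between the two forms can be checked with a row whose fourth field is 0. The row below is invented for the demonstration:

```shell
# Invented row whose fourth field is 0.
row='1979 1 -0.176 0 -0.412'

# Pattern-only form: the assigned value is 0, which is false, so nothing prints.
short_form=$(printf '%s\n' "$row" | awk '$0=$4')

# Action plus an always-true pattern: the field is printed even when it is 0.
robust_form=$(printf '%s\n' "$row" | awk '{ $0 = $4 } 1')

printf 'short form:  [%s]\n' "$short_form"
printf 'robust form: [%s]\n' "$robust_form"
```

The same caveat applies to an empty fourth field, which is also treated as false by the pattern-only form.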
If you're trying to split the input according to 3 or 4 spaces then you will get the expected output only from column 3.
$ awk -v FS=" {3,4}" '{print $3}' file
-0.185
0.185
FS=" {3,4}" passes a regex as the FS value, so the field separator becomes a run of three or four spaces. In a regex, {min,max} is a range quantifier (interval expression) that repeats the previous token from min to max times. Some older awk implementations do not support interval expressions.

Shell script, split string into everything before and after the last whitespace character

I'm having some issues with separating a string in a shell script. I've been trying similar bits of code I've found online for RegEx, perl, awk, grep etc... but I can't seem to get the required result.
Basically I have a number of strings. Most are in the following format:
long string, space, number e.g.
Something!Something_Something_#Something_Something 10
However a small number aren't all the one string (they should be!) but they have spaces instead of underscores, e.g.
Something!Something_Something_#Something Something 10
or
Something!Something - Something_#Something Something 10
Each string is then formatted as follows:
... |awk '{printf "%-100s %10d\n", $1, $2}' > file.out
which prints the correct result for the strings which contain no spaces
Something!Something_Something_#Something_Something 10
However in the case of the first example it only prints the following due to the space delimiter:
Something!Something_Something_#Something 10
So basically I need a way to pull out everything before the last " " space and assign it to $1 in the awk printf statement. Any help would be greatly appreciated!!!
It's a Solaris 5.10 server by the way.
It's a bit of a hack, but this should work (decrementing NF rebuilds $0 in gawk and mawk, though not in every awk):
awk '{x=$NF;NF--;printf "%-100s %10d\n", $0, x}'
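If decrementing NF does not work with the awk at hand, a variant that strips the last field with sub() is another option. This is only a sketch: the function name format_name_count is made up, and on Solaris it would probably need nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk.

```shell
# Hypothetical helper: left-justify everything before the last field in 100
# columns and right-justify the last field (the number) in 10 columns.
format_name_count() {
    awk '{
        count = $NF                        # last whitespace-separated field
        sub(/[ \t]+[^ \t]+[ \t]*$/, "")    # remove that field and the blanks before it
        printf "%-100s %10d\n", $0, count
    }'
}

printf '%s\n' 'Something!Something - Something_#Something Something 10' | format_name_count
```

Because the text before the number is taken from $0 rather than from $1, any spaces inside it are preserved as they are.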

Use SED to replace string of fixed length at certain position - arbitrary pattern

Trying to replace a string of fixed length at certain position (a string of arbitrary numbers) with a specified string.
I have to :
For every line beginning with 1, replace the existing value in columns 4-13 with " 123456789", where column 4 is a space.
so a sample file looks like this in the first line:
110 000000000000000000000000000000000000000
and i want
110 123456789000000000000000000000000000000
So far I have:
sed -i "/^1/ s/(.{10})/ 123456789/4" $DEST/$FILE_NAME$DATE.txt
This doesn't do anything though...
With sed:
sed '/^1/s/\(.\{4\}\)\(.\{9\}\)/\1123456789/' "$DEST/$FILE_NAME$DATE.txt"
The preceding regex /^1/ makes the following substitute command apply only to lines starting with a 1.
The substitute command itself captures the first 4 chars (110<space>) and the following 9 chars (000000000) into separate groups, then keeps the first 4 chars and replaces the following 9 with 123456789.
Btw, if you have GNU sed, you might simplify the command to:
sed -r '/^1/s/(.{4})(.{9})/\1123456789/'
... which is easier to read, but the -r option is not portable across all sed versions.
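Using awk, a simple-to-understand solution:
awk '/^1/ {$0 = substr($0,1,4) "123456789" substr($0,14)} 1' file
110 123456789000000000000000000000000000000
If a line starts with 1, it is rebuilt from its first 4 characters + 123456789 + the rest of the line from position 14 onward. The trailing 1 prints every line, so lines that do not start with 1 pass through unchanged instead of being dropped.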

Use grep to find a specific pattern in a line

I am trying to find a specific pattern in a text file using grep inside a bourne shell script
The style is: word1 word2 word3
I want to print everything that is not of that style. So far I used
grep -e '[[:space:]]\{2,\}' somefile
to find two or more whitespace characters in a row between the words, but I cannot figure out how to make it so that the 3 word per line limit is retained.
My other method would be to also count how many words there are per line and if it exceeds 3, to print the line. Or to check for a white space at the end of the 3rd word, but I am unsure how that would be formatted.
I'm not sure if that's what you wanted, but here:
$ cat input
one
one two
one two three
one two three four
$ grep -v -e "^[^[:space:]]\+ [^[:space:]]\+ [^[:space:]]\+$" input
one
one two
one two three four
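The pattern matches:
one or more non-space characters: [^[:space:]]\+
followed by a single space
three such words in total, with no space after the last one
all of this must span the whole line: ^...$
and the -v option inverts the match, so only lines that do not fit are printed
Note that \+ is a GNU extension in basic regular expressions; \{1,\} is the portable spelling.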
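The same idea can be written with extended regular expressions, where + needs no backslash. A minimal sketch, with a made-up function name (not_three_words) and an extra test line containing a double space:

```shell
# Hypothetical helper: print lines that are NOT exactly three words
# separated by single spaces. Extra arguments are passed to grep as file names.
not_three_words() {
    grep -Ev '^[^[:space:]]+ [^[:space:]]+ [^[:space:]]+$' "$@"
}

printf '%s\n' 'one' 'one two' 'one two three' 'one  two three' 'one two three four' |
    not_three_words
```

Only the line "one two three" is filtered out; the line with two spaces between the first two words is reported, because the pattern allows exactly one space between words.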

Regular Expression to parse Common Name from Distinguished Name

I am attempting to parse (with sed) just First Last from the following DN(s) returned by the DSCL command in OSX terminal bash environment...
CN=First Last,OU=PCS,OU=guests,DC=domain,DC=edu
I have tried multiple regexs from this site and others with questions very close to what I wanted... mainly this question... I have tried following the advice to the best of my ability (I don't necessarily consider myself a newbie...but definitely a newbie to regex..)
DSCL returns a list of DNs, and I would like to only have First Last printed to a text file. I have attempted using sed, but I can't seem to get the correct function. I am open to other commands to parse the output. Every line begins with CN= and then there is a comma between Last and OU=.
Thank you very much for your help!
I think all of the regular expression answers provided so far are buggy, insofar as they do not properly handle quoted ',' characters in the common name. For example, consider a distinguishedName like:
CN=Doe\, John,CN=Users,DC=example,DC=local
Better to use a real library able to parse the components of a distinguishedName. If you're looking for something quick on the command line, try piping your DN to a command like this:
echo "CN=Doe\, John,CN=Users,DC=activedir,DC=local" | python3 -c 'import sys; import ldap.dn; print(ldap.dn.explode_dn(sys.stdin.read().strip(), notypes=1)[0])'
(depends on having the python-ldap library installed). You could cook up something similar with PHP's built-in ldap_explode_dn() function.
Two cut commands is probably the simplest (although not necessarily the best):
DSCL | cut -d, -f1 | cut -d= -f2
First, split the output from DSCL on commas and print the first field ("CN=First Last"); then split that on equal signs and print the second field.
Using sed:
sed 's/^CN=\([^,]*\).*/\1/' input_file
^ matches start of line
CN= literal string match
\([^,]*\) everything until a comma
.* rest
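If the common name may contain backslash-escaped commas (for example CN=Doe\, John), the capture group can be widened to accept escaped characters. This is a sketch only: it assumes a sed that supports -E (GNU and BSD/macOS sed do), it does not handle double-quoted values, and the function name cn_from_dn is made up.

```shell
# Hypothetical helper: print the CN value from DNs that start with "CN=".
cn_from_dn() {
    # The value is a run of either an escaped character (backslash plus any
    # character) or any character that is neither a comma nor a backslash.
    sed -E 's/^CN=((\\.|[^,\\])*).*/\1/'
}

printf '%s\n' \
    'CN=Doe\, John,CN=Users,DC=example,DC=local' \
    'CN=First Last,OU=PCS,OU=guests,DC=domain,DC=edu' | cn_from_dn
```

The escaping backslash is kept in the output, so the first line comes out as Doe\, John.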
http://www.gnu.org/software/gawk/manual/gawk.html#Field-Separators
awk -v RS=',' -v FS='=' '$1=="CN"{print $2}' foo.txt
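I like awk too, so I print the substring starting at the fourth character (-F, sets the separator before the first line is read; assigning FS inside the main block would only take effect from the second line):
DSCL | awk -F, '{print substr($1,4)}' > filterednames.txt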
This regex will parse a distinguished name, giving name and val capture groups for each match.
When DN strings contain commas, they are meant to be quoted. This regex correctly handles both quoted and unquoted strings, and also handles escaped quotes in quoted strings:
(?:^|,\s?)(?:(?<name>[A-Z]+)=(?<val>"(?:[^"]|"")+"|[^,]+))+
Here it is nicely formatted:
(?:^|,\s?)
(?:
    (?<name>[A-Z]+)=
    (?<val>"(?:[^"]|"")+"|[^,]+)
)+
Here's a link so you can see it in action:
https://regex101.com/r/zfZX3f/2
If you want a regex to get only the CN, then this adapted version will do it:
(?:^|,\s?)(?:CN=(?<val>"(?:[^"]|"")+"|[^,]+))