How to remove time date stamp from string - regex

Hello sed/awk/bash experts,
I have thousands of certs to report on and I want to remove the time:
www.bob.com | Jul 28 19:22:38 2015 | Jul 27 19:22:38 2017
How can I (easily) remove 19:22:38 & 19:22:38 so I just have:
www.bob.com | Jul 28 2015 | Jul 27 2017

If you want to edit the file in place rather than just outputting to the screen, use a modified version of anubhava's command:
sed -E 's/[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2}[[:blank:]]+//g' file > file.tmp && mv file.tmp file
This approach also has the added benefit of not wiping your original file should sed fail.

If you're using an older version of sed you could try the following:
$ echo "www.bob.com | Jul 28 19:22:38 2015 | Jul 27 19:22:38 2017" | sed 's/\([a-zA-Z]*[ ]*[0-9]*[ ]*\)[0-9:]*\([ ]*[0-9]*\)/\1\2/g'
www.bob.com | Jul 28 2015 | Jul 27 2017
Or perhaps to just remove the time, you could instead use:
echo "www.bob.com | Jul 28 19:22:38 2015 | Jul 27 19:22:38 2017" | sed 's/[0-9][0-9]:[0-9][0-9]:[0-9][0-9]//g'
www.bob.com | Jul 28 2015 | Jul 27 2017

awk '{$5=""; $10=""; print}' file
www.bob.com | Jul 28 2015 | Jul 27 2017

Using sed:
sed -E 's/[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2}[[:blank:]]+//g' file
www.bob.com | Jul 28 2015 | Jul 27 2017
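
For anyone post-processing the report outside the shell, the same substitution can be sketched in Python. This is only an illustration of the idea, not a replacement for the sed commands above; the names `TIME_RE` and `strip_times` are made up for this example.

```python
import re

# HH:MM:SS followed by one or more spaces or tabs, mirroring the
# [[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2}[[:blank:]]+ expression used with sed -E.
TIME_RE = re.compile(r"\d{2}:\d{2}:\d{2}[ \t]+")


def strip_times(line: str) -> str:
    """Return the line with every HH:MM:SS timestamp (and trailing blanks) removed."""
    return TIME_RE.sub("", line)


if __name__ == "__main__":
    sample = "www.bob.com | Jul 28 19:22:38 2015 | Jul 27 19:22:38 2017"
    print(strip_times(sample))
```

Removing the trailing blanks together with the time is what keeps a single space between the day and the year.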


Create a custom indicator using DAX on power BI

FOR ALL POWER BI USERS
I have created a table visual from the table "Example", given below as raw data (similar to the result table, except for the Ind column). I want to create an indicator based on the column total using DAX.
Company | Rev 2018 | Rev 2019 | Rev YoY(%) |
-----------------------------------------------------------
A | 440,980,812 | 321,015,626 | -27.20% |
B | 587,171,150 | 248,150,205 | -57.74% |
C | 693,692,632 | 255,633,145 | -63.15% |
D | 753,951,313 | 266,033,862 | -64.71% |
E | 387,652,076 | 393,439,270 | 1.49% |
Total | 2,863,447,983 | 1,484,272,108 | -48.16% |
My current measure calculation is given below
Rev 2018 = CALCULATE(sum(Example[Rev]),Example[Year]=2018)
Rev 2019 = CALCULATE(sum(Example[Rev]),Example[Year]=2019)
Rev YoY(%) = ([Rev 2019]-[Rev 2018])/[Rev 2018]
I want to create an indicator (Ind) that shows 1 if a company's Rev YoY(%) (-27.20% for company A) is greater than or equal to (>=) the overall/total Rev YoY(%) (-48.16% for the total), and 0 otherwise. It should change based on slicer selections: if Jan is selected, the values should update based on Jan, and similarly for other selections such as Feb, Mar, etc.
As of now, the values are based on YTD values. When a month such as Jan or Feb is selected, the revenue and YoY values are updated, and the indicator measure should be updated at the same time.
Final result will look like below
Company | Rev 2018 | Rev 2019 | Rev YoY(%) | Ind
-----------------------------------------------------------
A | 440,980,812 | 321,015,626 | -27.20% | 1
B | 587,171,150 | 248,150,205 | -57.74% | 0
C | 693,692,632 | 255,633,145 | -63.15% | 0
D | 753,951,313 | 266,033,862 | -64.71% | 0
E | 387,652,076 | 393,439,270 | 1.49% | 1
Total | 2,863,447,983 | 1,484,272,108 | -48.16% | 1
Please help in case you have the solution.
I tried to use a filter (ALLCROSSFILTERED), but it makes the value constant, so it does not change with slicers:
Rev_total% =
CALCULATE(
[Rev YoY(%)],
ALLCROSSFILTERED(Example)
)
Base Raw data
Company Year Month Rev
A 2018 Jan 3715518
A 2018 Feb 62195456
A 2018 Mar 47896563
A 2018 Apr 30397293
A 2018 May 13316124
A 2018 Jun 54702783
A 2018 Jul 23559246
A 2018 Aug 56357008
A 2018 Sep 91266366
A 2018 Oct 7826397
A 2018 Nov 30081453
A 2018 Dec 19666605
A 2019 Jan 20525691
A 2019 Feb 55636582
A 2019 Mar 70832178
A 2019 Apr 51101460
A 2019 May 71658353
A 2019 Jun 51261362
B 2018 Jan 70866878
B 2018 Feb 16605125
B 2018 Mar 77399457
B 2018 Apr 93675100
B 2018 May 24187836
B 2018 Jun 17141132
B 2018 Jul 23189326
B 2018 Aug 1228527
B 2018 Sep 77025448
B 2018 Oct 69069603
B 2018 Nov 61201073
B 2018 Dec 55581645
B 2019 Jan 49529171
B 2019 Feb 30268530
B 2019 Mar 58895051
B 2019 Apr 16378441
B 2019 May 63289350
B 2019 Jun 29789662
C 2018 Jan 28386565
C 2018 Feb 55081195
C 2018 Mar 98650639
C 2018 Apr 13600972
C 2018 May 79286377
C 2018 Jun 97910757
C 2018 Jul 59601906
C 2018 Aug 60499979
C 2018 Sep 10555754
C 2018 Oct 21239252
C 2018 Nov 79278588
C 2018 Dec 89600648
C 2019 Jan 27489712
C 2019 Feb 8085774
C 2019 Mar 33489287
C 2019 Apr 52598275
C 2019 May 50816690
C 2019 Jun 83153407
D 2018 Jan 69955023
D 2018 Feb 1684049
D 2018 Mar 44503967
D 2018 Apr 91505045
D 2018 May 74480545
D 2018 Jun 70038948
D 2018 Jul 28811752
D 2018 Aug 82052925
D 2018 Sep 97215945
D 2018 Oct 48093159
D 2018 Nov 96939697
D 2018 Dec 48670258
D 2019 Jan 68414609
D 2019 Feb 34593576
D 2019 Mar 28277668
D 2019 Apr 46146140
D 2019 May 83794133
D 2019 Jun 4807736
E 2018 Jan 21180873
E 2018 Feb 14552267
E 2018 Mar 27409537
E 2018 Apr 68894164
E 2018 May 24608038
E 2018 Jun 12774844
E 2018 Jul 13193433
E 2018 Aug 89921780
E 2018 Sep 34581806
E 2018 Oct 52068148
E 2018 Nov 11374013
E 2018 Dec 17093173
E 2019 Jan 21748970
E 2019 Feb 95983245
E 2019 Mar 49661560
E 2019 Apr 90056699
E 2019 May 72277971
E 2019 Jun 63710825
You can use the following:
Indicator = if(Example[Rev YoY(%)] >= CALCULATE(Example[Rev YoY(%)];ALL(Example[Company]));1;0)
ALL does the trick: it tells the engine to pick up the data for all companies while still keeping all other filters.
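
To make the comparison concrete, here is a small plain-Python sketch (not DAX) that recomputes the indicator from the summary table in the question. It only illustrates the arithmetic of "company YoY versus total YoY"; it does not model slicers or filter context, and the names `revenue`, `yoy` and `indicators` are invented for this example.

```python
# Revenue per company from the question's summary table: (Rev 2018, Rev 2019).
revenue = {
    "A": (440_980_812, 321_015_626),
    "B": (587_171_150, 248_150_205),
    "C": (693_692_632, 255_633_145),
    "D": (753_951_313, 266_033_862),
    "E": (387_652_076, 393_439_270),
}


def yoy(prev: int, curr: int) -> float:
    """Year-over-year change as a fraction, e.g. -0.272 for -27.2%."""
    return (curr - prev) / prev


def indicators(rev: dict) -> dict:
    """Return 1 for companies whose YoY is >= the YoY of the column totals, else 0."""
    total_prev = sum(p for p, _ in rev.values())
    total_curr = sum(c for _, c in rev.values())
    total_yoy = yoy(total_prev, total_curr)
    return {name: int(yoy(p, c) >= total_yoy) for name, (p, c) in rev.items()}


if __name__ == "__main__":
    print(indicators(revenue))
```

In the measure, the total YoY is what the CALCULATE with ALL over the Company column evaluates to inside each company's row, which is why the remaining filters (such as a month slicer) still apply to it.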
Some advice:
I would work with real dates by combining the year and month columns;
this makes it much easier to work with future data.
You now have 2018 and 2019, but what if your data grows and more years are added? It is better to think in terms of PrevYear/NextYear. You can add a column RevNextYear to your data, so that your reports always work:
RevNextYear = CALCULATE(sum(RawRevenue[Rev]);
FILTER(RawRevenue;RawRevenue[Company] = EARLIER(RawRevenue[Company]) &&
RawRevenue[Month] = EARLIER(RawRevenue[Month]) &&
RawRevenue[Year] = EARLIER(RawRevenue[Year]) + 1)
)

PowerShell RegEx Select Line and Parse the Fields

I've a file that contains information about the programs.
What I want is to get some information about a particular program.
This is the structure of the file:
sometext...program.EXE;Thu, 04 May 2017 08:58:48 -0700;Wed, 27 Sep 2017 10:50:00 -0700;Wed, 04 Oct 2017 00:00:31 -0700;True;False, 17:38:05.810;30...somtext
I want to get the following details from the above file. Each field is separated by ;
p = program.exe
dt1 = Thu, 04 May 2017 08:58:48 -0700
dt2 = Wed, 27 Sep 2017 10:50:00 -0700
dt3 = Wed, 04 Oct 2017 00:00:31 -0700
d1 = True
d2 = False
Get-Content .\file.txt
So far I have \W*((?i)program.exe(?-i))\W* to match it.
But I don't know how to move forward, read all of the fields and parse it.
>> sometext...program.EXE;Thu, 04 May 2017 08:58:48 -0700;Wed, 27 Sep 2017 10:50:00 -0700;Wed, 04 Oct 2017 00:00:31 -0700;True;False, 17:38:05.810;30...somtext
>> '@|
>> Select-String '(?i)\W*(program\.exe)\W*(.*?;)(.*?;)(.*?;)(.*?;)(.*?;)'|
>> % {$_.Matches}|
>> % {$p=$_.Groups[1].Value;$dt1=$_.Groups[2].Value;$dt2=$_.Groups[3].Value;$dt3=$_.Groups[4].Value;$d1=$_.Groups[5].Value;$d2=$_.Groups[6].Value}
:\> $p
program.EXE
:\> $dt1
Thu, 04 May 2017 08:58:48 -0700;
:\> $dt2
Wed, 27 Sep 2017 10:50:00 -0700;
:\> $dt3
Wed, 04 Oct 2017 00:00:31 -0700;
:\> $d1
True;
:\> $d2
False, 17:38:05.810;
:\>
OR
IN: \> "sometext...program.EXE;Thu, 04 May 2017 08:58:48 -0700;Wed, 27 Sep 2017 10:50:00 -0700;Wed, 04 Oct 2017 00:00:31 -0700;True;False, 17:38:05.810;30...somtext"|
IN: >> Select-String '(?i)\W*(program\.exe)\W*(.*?;)(.*?;)(.*?;)(.*?;)(.*?;)' -OutVariable o
OUT:
sometext...program.EXE;Thu, 04 May 2017 08:58:48 -0700;Wed, 27 Sep 2017 10:50:00 -0700;Wed, 04 Oct 2017 00:00:31 -0700;True;False, 17:38:05.810;30...somtext
IN: \> $f,$p,$dt1,$dt2,$dt3,$d1,$d2=% -inputObject $o.Matches.Groups {$_.Value}
IN: \> $d2
OUT: False, 17:38:05.810;
I assigned each group to the required variable. See if this works. Apologies for any naivety; I'm not well versed in PowerShell.
Next try
$p,$dt1,$dt2,$dt3,$d1,$d2=@'
sometext...program.EXE;Thu, 04 May 2017 08:58:48 -0700;Wed, 27 Sep 2017 10:50:00 -0700;Wed, 04 Oct 2017 00:00:31 -0700;True;False, 17:38:05.810;30...somtext
'@|
Select-String '(program.exe)[^;]*(?:;([^;]+)){3}(?:;(true|false)){2},' -AllMatches|
ForEach-Object {$_.Matches}|
ForEach-Object {$_.Groups[1..3]}|
ForEach-Object {$_.Captures}|
Select-Object -ExpandProperty Value
$p,$dt1,$dt2,$dt3,$d1,$d2
Can this help you?
@'
sometext...program.EXE;Thu, 04 May 2017 08:58:48 -0700;Wed, 27 Sep 2017 10:50:00 -0700;Wed, 04 Oct 2017 00:00:31 -0700;True;False, 17:38:05.810;30...somtext
'@|
Select-String '([^;]+)' -AllMatches|
ForEach-Object {$_.Matches}|
ForEach-Object {$_.Groups[1].Value}
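
The field-by-field extraction can also be sketched in Python, which may help when checking what each capture should contain. This is a rough sketch under the assumption that the record always has five `;`-terminated fields right after the program name, as in the sample line; `parse_program_record` is a hypothetical helper name, not something from the answers above.

```python
import re


def parse_program_record(text, program="program.exe"):
    """Return the fields that follow `program` in a ';'-separated record, or None."""
    # Program name (matched case-insensitively), then five fields each ending in ';'.
    pattern = re.compile(
        "(" + re.escape(program) + r");((?:[^;]*;){5})", re.IGNORECASE
    )
    match = pattern.search(text)
    if match is None:
        return None
    dt1, dt2, dt3, d1, d2_raw = match.group(2).split(";")[:5]
    return {
        "p": match.group(1),
        "dt1": dt1,
        "dt2": dt2,
        "dt3": dt3,
        "d1": d1,
        # The sample's last field is "False, 17:38:05.810"; keep the part before the comma.
        "d2": d2_raw.split(",")[0].strip(),
    }
```

Unlike the `(.*?;)` groups, splitting on `;` leaves no trailing separator on the values, and `re.escape` keeps the dot in the program name from matching any character.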

Select specific columns from a record using only 'sed' without using 'awk'

Here is some sample input I obtained from running ls -l:
-rwxr-xr-x 1 root root 1779 Jan 10 2014 zcmp
-rwxr-xr-x 1 root root 5766 Jan 10 2014 zdiff
-rwxr-xr-x 1 root root 142 Jan 10 2014 zegrep
-rwxr-xr-x 1 root root 142 Jan 10 2014 zfgrep
-rwxr-xr-x 1 root root 2133 Jan 10 2014 zforce
-rwxr-xr-x 1 root root 5940 Jan 10 2014 zgrep
lrwxrwxrwx 1 root root 8 Dec 5 2015 ypdomainname -> hostname
I would like to print out the last column and 5th column using ONLY sed like this:
zcmp 1779
zdiff 5766
zegrep 142
zfgrep 142
zforce 2133
zgrep 5940
ypdomainname -> hostname 8
I'm trying to find a regex to match but have not succeeded. And I'm not allowed to use awk or cut either.
Thank you in advance.
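Try this (GNU sed). It captures the 5th field and everything after the three date fields, so file names containing spaces, such as the symlink line, stay intact; -n with the p flag drops lines that do not match, such as the "total" line:
ls -l | sed -nE 's/^(\S+\s+){4}(\S+)\s+(\S+\s+){3}(.*)$/\4 \2/p'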
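
The question is restricted to sed, so the following is only a way to check the field layout logic against the sample lines, written as a Python regex. It assumes the `ls -l` layout shown in the question, where the date always occupies exactly three fields; `LS_LINE` and `name_and_size` are names invented for this sketch.

```python
import re

# Four leading fields, the size (5th field), three date fields, then the rest of the line.
LS_LINE = re.compile(r"^(?:\S+\s+){4}(\S+)\s+(?:\S+\s+){3}(.*)$")


def name_and_size(line):
    """Return 'name size' for an ls -l style line, or None if the line does not fit."""
    match = LS_LINE.match(line)
    if match is None:
        return None
    return f"{match.group(2)} {match.group(1)}"


if __name__ == "__main__":
    for sample in (
        "-rwxr-xr-x 1 root root 1779 Jan 10 2014 zcmp",
        "lrwxrwxrwx 1 root root 8 Dec 5 2015 ypdomainname -> hostname",
    ):
        print(name_and_size(sample))
```

Capturing "everything after the date" as one group is what keeps `ypdomainname -> hostname` together instead of splitting on its last space.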

How to grep lines with date formats?

I have a log file that is created from a bash script that uses $(date), so there are dates in such a format:
Fri Apr 24 22:10:39 CEST 2015
The log file looks like this:
Using SCRIPTS_ROOTDIR: /home/gillin/moses/scripts
Using multi-thread GIZA
using gzip
(1) preparing corpus # Fri Apr 24 22:10:39 CEST 2015
Executing: mkdir -p /media/2tb/ccexp/phrase-clustercat-mgiza/work.en-ru/training/corpus
(1.0) selecting factors # Fri Apr 24 22:10:39 CEST 2015
Forking...
(1.2) creating vcb file /media/2tb/ccexp/phrase-clustercat-mgiza/work.en-ru/training/corpus/en.vcb # Fri Apr 24 22:10:39 CEST 2015
(1.1) running mkcls # Fri Apr 24 22:10:39 CEST 2015
/home/gillin/moses/training-tools/mkcls -c50 -n2 -p/media/2tb/ccexp/corpus.exp/train-clean.en -V/media/2tb/ccexp/phrase-clustercat-mgiza/work.en-ru/training/corpus/en.vcb.classes opt
Executing: /home/gillin/moses/training-tools/mkcls -c50 -n2 -p/media/2tb/ccexp/corpus.exp/train-clean.en -V/media/2tb/ccexp/phrase-clustercat-mgiza/work.en-ru/training/corpus/en.vcb.classes opt
(1.1) running mkcls # Fri Apr 24 22:10:39 CEST 2015
/home/gillin/moses/training-tools/mkcls -c50 -n2 -p/media/2tb/ccexp/corpus.exp/train-clean.ru -V/media/2tb/ccexp/phrase-clustercat-mgiza/work.en-ru/training/corpus/ru.vcb.classes opt
Executing: /home/gillin/moses/training-tools/mkcls -c50 -n2 -p/media/2tb/ccexp/corpus.exp/train-clean.ru -V/media/2tb/ccexp/phrase-clustercat-mgiza/work.en-ru/training/corpus/ru.vcb.classes opt
Is there a way I can grep all the lines that contain the output of $(date)?
Currently I'm using this regex:
[a-z].*[1-9] [0-2][1-9]:[0-6][0-9]:[0-6][0-9] CEST 2015
And it catches line like
preparing corpus # Fri Apr 24 22:10:39 CEST 2015
But I need the full line:
(1) preparing corpus # Fri Apr 24 22:10:39 CEST 2015
Also, the year and time are sort of hard-coded. Is there a better regex or Unix tool that can extract lines with $(date) output?
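Try this. grep already prints the whole matching line; only the highlighted part is the match itself, so turn the colouring off. The digit ranges are also widened so that hours such as 10 or 20 and days ending in 0 match:
unalias grep
grep --color=never '.*[a-z].*[0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9] CEST 2015' file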
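
If hard-coding the zone and year is the concern, a more general matcher could look like the sketch below. It assumes the default `date` output in an English/C locale (weekday, month, day, time, an alphabetic zone abbreviation, four-digit year), as in the log sample; other locales or numeric zones would need a different pattern. `DATE_RE` and `lines_with_date` are illustrative names.

```python
import re

# Default `date` layout in an English/C locale, e.g. "Fri Apr 24 22:10:39 CEST 2015".
DATE_RE = re.compile(
    r"\b(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"[ 0-3]\d "                       # day of month, space-padded below 10
    r"[0-2]\d:[0-5]\d:(?:[0-5]\d|60) "  # HH:MM:SS (60 allows a leap second)
    r"[A-Z]{2,5} "                     # zone abbreviation (assumed alphabetic)
    r"\d{4}\b"                         # year
)


def lines_with_date(lines):
    """Return the full lines that contain a `date`-style timestamp."""
    return [line for line in lines if DATE_RE.search(line)]


if __name__ == "__main__":
    log = [
        "using gzip",
        "(1) preparing corpus # Fri Apr 24 22:10:39 CEST 2015",
        "Forking...",
    ]
    for line in lines_with_date(log):
        print(line)
```

Because the whole line is returned whenever the pattern is found anywhere in it, there is no need for leading `.*` in the expression.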

Confusion on grep pattern search

Consider this log file
SN PID Date Status
1 P01 Fri Feb 14 19:32:36 IST 2014 Alive
2 P02 Fri Feb 14 19:32:36 IST 2014 Alive
3 P03 Fri Feb 14 19:32:36 IST 2014 Alive
4 P04 Fri Feb 14 19:32:36 IST 2014 Alive
5 P05 Fri Feb 14 19:32:36 IST 2014 Alive
6 P06 Fri Feb 14 19:32:36 IST 2014 Alive
7 P07 Fri Feb 14 19:32:36 IST 2014 Alive
8 P08 Fri Feb 14 19:32:36 IST 2014 Alive
9 P09 Fri Feb 14 19:32:36 IST 2014 Alive
10 P010 Fri Feb 14 19:32:36 IST 2014 Alive
When I do => grep "P01" File
the output is (as expected):
1 P01 Fri Feb 14 19:32:36 IST 2014 Alive
10 P010 Fri Feb 14 19:32:36 IST 2014 Alive
But when I do => grep " P01 " File (notice the space before and after P01)
I do not get any output!
Question: grep matches a pattern within a line, so " P01 " (with spaces around it) should match the first PID of P01, since it has spaces around it. But it seems this logic is wrong. What obvious thing am I missing here?
If the log uses tabs not spaces, your grep pattern won't match. I would add word boundaries to the word you want to find:
grep '\<P01\>' file
If you really want to use whitespace in your pattern, use one of:
grep '[[:blank:]]P01[[:blank:]]' file # horizontal whitespace, tabs and spaces
grep -P '\sP01\s' file # using Perl regex
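
The difference between a literal-space match and a word-boundary match can be reproduced in a few lines of Python. The sketch assumes, as the answer suggests, that the log columns are separated by tabs; the sample lines and function names are made up for illustration.

```python
import re


def match_with_spaces(line: str) -> bool:
    """Literal match on ' P01 ', like grep ' P01 '."""
    return " P01 " in line


def match_word(line: str) -> bool:
    """Word-boundary match, comparable to grep '\\<P01\\>'."""
    return re.search(r"\bP01\b", line) is not None


if __name__ == "__main__":
    # Assumed tab-separated versions of two rows from the log.
    row_1 = "1\tP01\tFri Feb 14 19:32:36 IST 2014\tAlive"
    row_10 = "10\tP010\tFri Feb 14 19:32:36 IST 2014\tAlive"
    print(match_with_spaces(row_1), match_word(row_1), match_word(row_10))
```

The word-boundary form does not care which whitespace character surrounds the PID, and it also excludes P010, which a plain "P01" search picks up.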