Deleting the last characters in specific columns - regex

I have a sample text file with the following columns:
scff2 54 92 aa_bb_c4_1024_0_2 scff2 30 18 aa_bb_c4_1024_0_2
scff8 80 96 aa_bb_c4_24_0_2 scff8 14 42 aa_bb_c4_24_0_2
scff1 20 25 aa_bb_c4_98_0_1 scff4 11 25 aa_bb_c4_13_0_1
scff6 16 61 aa_bb_c4_84_0_1 scff6 15 16 aa_bb_c4_84_0_2
I would like to remove the last characters in column 4 and column 8, like the following, using awk:
scff2 54 92 aa_bb_c4_1024_0 scff2 30 18 aa_bb_c4_1024_0
scff8 80 96 aa_bb_c4_24_0 scff8 14 42 aa_bb_c4_24_0
scff1 20 25 aa_bb_c4_98_0 scff4 11 25 aa_bb_c4_13_0
scff6 16 61 aa_bb_c4_84_0 scff6 15 16 aa_bb_c4_84_0
I tried sed -i.bak 's/_[0-9]*$//' sample.txt, but it removed the characters after the last underscore in the 8th column and not in the 4th column, since the $ anchor only matches at the end of the line. Can someone guide me in achieving my desired output? Thanks in advance.

You can use sub() in awk to perform a substitution in a specific field.
awk '{sub(/_[0-9]*$/, "", $4); sub(/_[0-9]*$/, "", $8); print}' sample.txt
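If you want in-place editing like your sed -i attempt, and your awk is GNU awk 4.1 or later (an assumption about your setup), the inplace extension does it:
gawk -i inplace '{sub(/_[0-9]*$/, "", $4); sub(/_[0-9]*$/, "", $8); print}' sample.txt
One thing to be aware of: modifying a field makes awk rebuild the record using OFS (a single space by default), which is harmless for single-space-separated data like yours.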

It looks like all you need is:
$ sed 's/_[0-9]\( \|$\)/\1/g' file
scff2 54 92 aa_bb_c4_1024_0 scff2 30 18 aa_bb_c4_1024_0
scff8 80 96 aa_bb_c4_24_0 scff8 14 42 aa_bb_c4_24_0
scff1 20 25 aa_bb_c4_98_0 scff4 11 25 aa_bb_c4_13_0
scff6 16 61 aa_bb_c4_84_0 scff6 15 16 aa_bb_c4_84_0
or if your sed supports -E to enable EREs (which I expect yours does since you're using -i):
$ sed -E 's/_[0-9]( |$)/\1/g' file
scff2 54 92 aa_bb_c4_1024_0 scff2 30 18 aa_bb_c4_1024_0
scff8 80 96 aa_bb_c4_24_0 scff8 14 42 aa_bb_c4_24_0
scff1 20 25 aa_bb_c4_98_0 scff4 11 25 aa_bb_c4_13_0
scff6 16 61 aa_bb_c4_84_0 scff6 15 16 aa_bb_c4_84_0
or as @GlennJackman pointed out in the comments, with GNU sed (the above would work with other seds too, e.g. OSX sed), it'd be:
sed 's/_[0-9]\>//g'

Sometimes it's useful to store the result of a substitution, which gawk's gensub() returns:
$ awk '{$4=gensub(/_[0-9]$/, "", 1, $4); $8=gensub(/_[0-9]$/, "", 1, $8)}1' file
Output:
scff2 54 92 aa_bb_c4_1024_0 scff2 30 18 aa_bb_c4_1024_0
scff8 80 96 aa_bb_c4_24_0 scff8 14 42 aa_bb_c4_24_0
scff1 20 25 aa_bb_c4_98_0 scff4 11 25 aa_bb_c4_13_0
scff6 16 61 aa_bb_c4_84_0 scff6 15 16 aa_bb_c4_84_0
But @Barmar's solution is smarter/shorter/lighter.
Note that gensub() is not available in all awk implementations: it is a GNU awk extension, so it won't work in nawk and may not work in mawk.

In GNU awk, everything matching _[0-9]+ at the end of a word is removed (\> is GNU awk's end-of-word anchor):
$ awk '{gsub(/_[0-9]+\>/,"")}1' file
scff2 54 92 aa_bb_c4_1024_0 scff2 30 18 aa_bb_c4_1024_0
scff8 80 96 aa_bb_c4_24_0 scff8 14 42 aa_bb_c4_24_0
...

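Another gsub() variant that happens to work for this particular data, since every value of interest ends in the literal _0_ plus one more character (which the pattern below strips back to _0):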
awk '{gsub(/_0_./,"_0")}1' file
scff2 54 92 aa_bb_c4_1024_0 scff2 30 18 aa_bb_c4_1024_0
scff8 80 96 aa_bb_c4_24_0 scff8 14 42 aa_bb_c4_24_0
scff1 20 25 aa_bb_c4_98_0 scff4 11 25 aa_bb_c4_13_0
scff6 16 61 aa_bb_c4_84_0 scff6 15 16 aa_bb_c4_84_0

Issue generating DAG using CausalFS cpp package

Link to CausalFS GitHub. I'm using v2.0 of the CausalFS C++ package by Kui Yu.
Upon running the structure learning algorithms, my DAG and MB outputs do not match.
I'm trying to generate a DAG based on the data given in CDD/data/data.txt and CDD/data.txt via some of the local-to-global structure learning algorithms mentioned in the manual (PCMB-CSL, STMB-CSL, etc.), running the commands as given by the manual (pg. 18 of 26).
But my resulting DAG is just filled with zeros (for the most part), which looks suspicious given that this is an example dataset. Upon then checking CDD/mb/mb.out, I find that the Markov blankets for the variables do not agree with the DAG output.
For example, running ./main ./data/data.txt ./data/net.txt 0.01 PCMB-CSL "" "" -1 gives a 1 at position (1,22) (one-indexed) only; relaxing the alpha value to 0.1 (kept at 0.01 in the example) gives just one more 1. However, this doesn't agree with the output MB for each variable, which looks like this (upon running IAMB as ./main ./data/data.txt ./net/net.txt 0.01 IAMB all "" ""):
0 21
1 22 26 28
2 29
3 14 21
4 5 12
5 4 12
6 8 12
7 8 12
8 6 7 12
9 11 15
10 35
11 9 12 15 33
12 4 6 7 8 11 13
13 8 12 14 15 17 30 34
14 3 13 20
15 8 9 11 13 14 17 30
16 15
17 13 15 18 27 30
18 17 19 20 27
19 18 20
20 14 18 21 28
21 0 3 20 26
22 1 21 23 24 28
23 1 22 24
24 5 22 23 25
25 24
26 1 21 22
27 17 18 28 29
28 1 18 21 22 27 29
29 2 27
30 13 14 15 17
31 34
32 15 18 34
33 11 12 32 35 36
34 30 31 32 35 36
35 10 33 34
36 33 34 35
Such an MB profile suggests the DAG should be much more connected.
I would love to hear suggestions from people who've managed to get the package to behave appropriately; I just do not understand where my error is. (I'm running on PopOS 20.04.)
Thanks a bunch <3
P.S. The output files are appended to on every rerun of the code, so make sure to delete them appropriately.

Is there a way to make a dummy variable in SAS for a Country in my SAS Data Set?

I am looking to create a dummy variable for Latin American countries in my data set, which I need for a log-log model. I know how to log all of the variables for my later regression. Any suggestion or help on how to make a dummy variable for the Latin American countries with my data would be appreciated.
data HW6;
input country : $25. midyear sancts lprots lfrac ineql pop;
cards;
CHILE 1955 58 44 65 57 6.743
CHILE 1960 19 34 65 57 7.785
CHILE 1965 27 24 65 57 8.510
CHILE 1970 36 29 65 57 9.369
CHILE 1975 38 58 65 57 10.214
COSTA_RICA 1955 16 7 54 60 1.024
COSTA_RICA 1960 6 1 54 60 1.236
COSTA_RICA 1965 2 1 54 60 1.482
COSTA_RICA 1970 3 1 54 60 1.732
COSTA_RICA 1975 2 3 54 60 1.965
INDIA 1955 81 134 47 52 404.478
INDIA 1960 101 190 47 52 445.857
INDIA 1965 189 845 47 52 494.882
INDIA 1970 133 915 47 52 553.619
INDIA 1975 132 127 47 52 616.551
JAMICA 1955 11 12 47 62 1.542
JAMICA 1960 9 2 47 62 1.629
JAMICA 1965 8 6 47 62 1.749
JAMICA 1970 1 1 47 62 1.877
JAMICA 1975 7 1 47 62 2.043
PHILIPPINES 1955 26 123 48 56 24.0
PHILIPPINES 1960 20 38 48 56 27.898
PHILIPPINES 1965 9 5 48 56 32.415
PHILIPPINES 1970 79 25 48 56 37.540
SRI_LANKA 1955 29 2 73 52 8.679
SRI_LANKA 1960 75 35 73 52 9.879
SRI_LANKA 1965 25 63 73 52 11.202
SRI_LANKA 1970 34 14 73 52 12.532
TURKEY 1955 79 1 67 61 24.145
TURKEY 1960 138 19 67 61 28.217
TURKEY 1965 36 51 67 61 31.951
TURKEY 1970 51 22 67 61 35.743
URUGUAY 1955 8 4 57 48 2.372
URUGUAY 1960 12 1 57 48 2.538
URUGUAY 1965 16 14 57 48 2.693
URUGUAY 1970 21 19 57 48 2.808
URUGUAY 1975 24 45 57 48 2.829
VENEZUELA 1955 38 14 76 65 6.110
VENEZUELA 1960 209 23 76 65 7.632
VENEZUELA 1965 100 162 76 65 9.119
VENEZUELA 1970 9 27 76 65 10.709
VENEZUELA 1975 4 12 76 65 12.722
;
data newData;
set HW6;
sancts = log (sancts);
lprots = log (lprots);
lfrac = log (lfrac);
ineql = log (ineql);
pop = log (pop);
run;
The GLMSELECT procedure is one simple way of creating dummy variables; there is a nice article about how to use it to generate dummy variables:
data newData;
set HW6;
sancts = log (sancts);
lprots = log (lprots);
lfrac = log (lfrac);
ineql = log (ineql);
pop = log (pop);
Y = 0; *-- Create a fake response variable --*;
run;
proc glmselect data=newData noprint outdesign(addinputvars)=want(drop=Y);
class country;
model Y = country / noint selection=none;
run;
If needed in a further step, use the macro variable &_GLSMOD created by the procedure, which contains the names of the dummy variables.
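For instance, a later regression could reference those dummies through the macro variable; a sketch only, with an illustrative (not recommended) model specification:
proc reg data=want;
model lprots = &_GLSMOD sancts lfrac ineql pop;
run;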
The real question here is not related to SAS; it is how to get the region of a country by its name.
I would give ISO 3166 a try, which lists all countries and their geographical location.
Getting that list is straightforward: import that list into SAS, use a merge by country, and finally identify the countries in Latin America.
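Alternatively, for a fixed list this small, a plain data step flag is enough. A minimal sketch, assuming these are the Latin American countries in your data (country names copied as they appear in the input):
data HW6_dummy;
set HW6;
/* 1 for Latin American countries, 0 otherwise */
latam = country in ('CHILE', 'COSTA_RICA', 'URUGUAY', 'VENEZUELA');
/* add 'JAMICA' to the list if you count Jamaica as Latin America */
run;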

Magick++ `depth` does not behave the same as `convert -depth`

I discovered via my previous question that depth appears to work differently when I use it in ImageMagick's convert vs Magick++.
CLI version and result
Using:
$ convert /foo/bar.ppm -depth 1 /foo/out.ppm
I get an output image which, upon inspection, shows a 1-bit color depth:
$ identify /foo/out.ppm
out.ppm PPM (blah blah) 1-bit sRGB (blah blah)
C++ version and result
Using the code:
#include <Magick++.h>

int main(int argc, char **argv) {
    Magick::InitializeMagick(*argv);
    Magick::Image img;
    img.read("/foo/bar.ppm");
    Magick::Image temp_img(img);
    temp_img.depth(1);
    temp_img.write("/foo/out.ppm");
    return 0;
}
Compiled using the command:
g++ -std=c++17 test.cpp -o test `Magick++-config --cppflags --cxxflags --ldflags --libs`
Produces the output:
$ identify /foo/out.ppm
out.ppm PPM (blah blah) 8-bit sRGB (blah blah)
Hardware
I have run this with the same result on:
Raspberry Pi - Raspbian 10 (buster)
Laptop - Ubuntu 18.04 (bionic beaver)
Software (on the RPi)
$ apt list --installed | grep magick
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
graphicsmagick-libmagick-dev-compat/stable,now 1.4+really1.3.35-1~deb10u1 all [installed]
imagemagick-6-common/stable,now 8:6.9.10.23+dfsg-2.1+deb10u1 all [installed,automatic]
imagemagick-6.q16/now 8:6.9.10.23+dfsg-2.1 armhf [installed,upgradable to: 8:6.9.10.23+dfsg-2.1+deb10u1]
imagemagick/now 8:6.9.10.23+dfsg-2.1 armhf [installed,upgradable to: 8:6.9.10.23+dfsg-2.1+deb10u1]
libgraphics-magick-perl/stable,now 1.4+really1.3.35-1~deb10u1 armhf [installed,automatic]
libgraphicsmagick++-q16-12/stable,now 1.4+really1.3.35-1~deb10u1 armhf [installed,automatic]
libgraphicsmagick++1-dev/stable,now 1.4+really1.3.35-1~deb10u1 armhf [installed,automatic]
libgraphicsmagick-q16-3/stable,now 1.4+really1.3.35-1~deb10u1 armhf [installed,automatic]
libgraphicsmagick1-dev/stable,now 1.4+really1.3.35-1~deb10u1 armhf [installed,automatic]
libmagick++-6-headers/stable,now 8:6.9.10.23+dfsg-2.1+deb10u1 all [installed,auto-removable]
libmagick++-6.q16-8/stable,now 8:6.9.10.23+dfsg-2.1+deb10u1 armhf [installed,auto-removable]
libmagickcore-6-arch-config/stable,now 8:6.9.10.23+dfsg-2.1+deb10u1 armhf [installed,auto-removable]
libmagickcore-6-headers/stable,now 8:6.9.10.23+dfsg-2.1+deb10u1 all [installed,auto-removable]
libmagickcore-6.q16-6-extra/stable,now 8:6.9.10.23+dfsg-2.1+deb10u1 armhf [installed,automatic]
libmagickcore-6.q16-6/stable,now 8:6.9.10.23+dfsg-2.1+deb10u1 armhf [installed,automatic]
libmagickwand-6-headers/stable,now 8:6.9.10.23+dfsg-2.1+deb10u1 all [installed,auto-removable]
libmagickwand-6.q16-6/stable,now 8:6.9.10.23+dfsg-2.1+deb10u1 armhf [installed,automatic]
The Picture
I've tested with multiple input files of sRGB type. I convert everything to NetPBM format before starting my test, e.g.:
convert yourimage.jpg /foo/bar.ppm
The Question
Why is the C++ version different from the CLI version? They should be linking to the exact same code in the background. The input value for depth does not need to be a special type (Magick::Image::depth takes a size_t). Is there something in my installation which is messing this up? I know most of it is based on ImageMagick v6 because Debian repos are notoriously slow, but nothing has changed (to my knowledge) in the source code which should affect the depth.
What else doesn't work?
Quantization
Adding:
temp_img.quantizeColorSpace(Magick::GRAYColorspace);
temp_img.quantizeColors(1);
temp_img.quantize();
to the code should also reduce the color depth. Again, this results in an 8-bit image in C++.
Monochrome
This results in an 8-bit image in both the CLI and C++ versions.
The closest solution I can think of is to use the "PBM" format.
Generate a test image with the following:
convert -size 10x10 plasma: input.jpg && convert input.jpg input.ppm
Then just use the Magick::Image::magick method:
#include <Magick++.h>

int main(int argc, char **argv) {
    Magick::InitializeMagick(*argv);
    Magick::Image img;
    img.read("input.ppm");
    Magick::Image temp_img(img);
    temp_img.magick("PBM");
    temp_img.depth(1);
    temp_img.write("output.ppm");
    return 0;
}
We get the following file structure...
$ hexdump -C output.ppm
00000000 50 34 0a 31 30 20 31 30 0a 00 00 00 00 00 00 06 |P4.10 10........|
00000010 00 ff 80 ff c0 ff c0 ff c0 ff c0 ff c0 |.............|
0000001d
If we want the ASCII representation of the binary data, just disable the compression.
Magick::Image temp_img(img);
temp_img.compressType(Magick::NoCompression);
temp_img.magick("PBM");
temp_img.depth(1);
temp_img.write("output.ppm");
Which would yield the following...
$ hexdump -C output2.ppm
00000000 50 31 0a 31 30 20 31 30 0a 30 20 30 20 30 20 30 |P1.10 10.0 0 0 0|
00000010 20 30 20 30 20 30 20 30 20 30 20 30 20 0a 30 20 | 0 0 0 0 0 0 .0 |
00000020 30 20 30 20 30 20 30 20 30 20 30 20 30 20 30 20 |0 0 0 0 0 0 0 0 |
00000030 30 20 0a 30 20 30 20 30 20 30 20 30 20 30 20 30 |0 .0 0 0 0 0 0 0|
00000040 20 30 20 30 20 30 20 0a 30 20 30 20 30 20 30 20 | 0 0 0 .0 0 0 0 |
00000050 30 20 31 20 31 20 30 20 30 20 30 20 0a 31 20 31 |0 1 1 0 0 0 .1 1|
00000060 20 31 20 31 20 31 20 31 20 31 20 31 20 31 20 30 | 1 1 1 1 1 1 1 0|
00000070 20 0a 31 20 31 20 31 20 31 20 31 20 31 20 31 20 | .1 1 1 1 1 1 1 |
00000080 31 20 31 20 31 20 0a 31 20 31 20 31 20 31 20 31 |1 1 1 .1 1 1 1 1|
00000090 20 31 20 31 20 31 20 31 20 31 20 0a 31 20 31 20 | 1 1 1 1 1 .1 1 |
000000a0 31 20 31 20 31 20 31 20 31 20 31 20 31 20 31 20 |1 1 1 1 1 1 1 1 |
000000b0 0a 31 20 31 20 31 20 31 20 31 20 31 20 31 20 31 |.1 1 1 1 1 1 1 1|
000000c0 20 31 20 31 20 0a 31 20 31 20 31 20 31 20 31 20 | 1 1 .1 1 1 1 1 |
000000d0 31 20 31 20 31 20 31 20 31 20 0a |1 1 1 1 1 .|
000000db
I don't know if that's exactly what you need, but it should get you on track. It might also be worth reviewing the WritePNMImage method in the coders/pnm.c file (same file for ImageMagick-6).
Solution
It appears that this problem was solved by removing problem packages previously installed from the Debian apt repository. It is difficult to nail down which was the offending part, but I removed:
sudo apt remove graphicsmagick-libmagick-dev-compat imagemagick-6-common imagemagick-6.q16 imagemagick
Next, I built ImageMagick from source, following the directions here.
Explanation
The solution was not simply a version change, which would be an understandable conclusion since, during the source build, I upgraded from the v6 of ImageMagick still in the Debian repository to v7. However, tests by @emcconville were performed on both v6 and v7 without reproducing the errors I experienced. Presumably, since he is involved with ImageMagick development, he uses a copy built from source rather than what is available from apt-get. Therefore, we can safely assume that the problem is either in one of the Debian packages or caused by some incorrect combination of packages on the affected machine.
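If you hit something similar, one quick sanity check (a guess at the failure mode, not a confirmed diagnosis) is to see which Magick library the compiled binary actually linked against, since the graphicsmagick-libmagick-dev-compat package in the list above provides a compatibility Magick++-config that can resolve to GraphicsMagick instead of ImageMagick:
$ ldd ./test | grep -i magick
If GraphicsMagick libraries show up in that output, the binary was never using ImageMagick's Magick++ in the first place.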

How to parse & read from .txt file (NOAA Weather)

I am trying to parse and read a .txt file from the NOAA Weather site. How can I search the file to find certain text and insert it into the database?
I'm trying to search for Greenville, pull the conditions and temp, and then push that into my database, along with other cities. Any code that you are willing to share would be appreciated.
Code:
<cffile action="read" file="#expandPath('./NWS Raleigh Durham.txt')#" variable="myFile">
<cfdump var="#myfile#">
Content:
National Weather Service Text Product Display Skip Navigation
Regional Weather Roundup
Issued by NWS Raleigh/Durham, NC
Home | Current Version | Previous Version | Graphics & Text | Print | Product List | Glossary
On Versions: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
000
ASUS42 KRAH 081410
RWRRAH
NORTH CAROLINA WEATHER ROUNDUP
NATIONAL WEATHER SERVICE RALEIGH NC
900 AM EST THU MAR 08 2018
NOTE: "FAIR" INDICATES FEW OR NO CLOUDS BELOW 12,000 FEET WITH NO SIGNIFICANT WEATHER AND/OR OBSTRUCTIONS TO VISIBILITY.
NCZ001-053-055-056-065-067-081500-
WESTERN NORTH CAROLINA
CITY SKY/WX TMP DP RH WIND PRES REMARKS
ASHEVILLE FAIR 35 23 61 VRB3 29.92F
JEFFERSON FLURRIES 26 16 65 W9 29.83F WCI 16
MORGANTON FAIR 37 25 64 NW3 29.97F
HICKORY CLOUDY 35 24 64 SW5 29.94F WCI 31
RUTHERFORDTON CLOUDY 37 27 67 W6 29.97S WCI 33
MOUNT AIRY FAIR 37 21 53 NW8 29.94F WCI 31
BOONE PTSUNNY 27 16 63 NW13G18 29.85F WCI 16
$$
NCZ021-022-025-041-071-084-088-081500-
CENTRAL NORTH CAROLINA
CITY SKY/WX TMP DP RH WIND PRES REMARKS
CHARLOTTE CLOUDY 38 27 64 W5 29.97F WCI 34
GREENSBORO PTSUNNY 38 24 57 W8 29.93S WCI 32
WINSTON-SALEM FAIR 38 20 48 W8 29.94F WCI 32
RALEIGH-DURHAM PTSUNNY 36 26 67 CALM 29.96R
FORT BRAGG CLOUDY 39 23 52 NW5 29.97R
FAYETTEVILLE CLOUDY 38 28 67 W6 29.98R WCI 33
BURLINGTON CLOUDY 39 25 57 SW5 29.94S
LAURINBURG CLOUDY 38 28 67 NW8 29.99R WCI 32
$$
NCZ011-015-027-028-043-044-047-080-103-081500-
NORTHEASTERN NORTH CAROLINA
CITY SKY/WX TMP DP RH WIND PRES REMARKS
ROCKY MT-WILSO PTSUNNY 40 24 53 NW6 29.96R
GREENVILLE FAIR 41 23 48 N6 29.97S
WASHINGTON FAIR 41 25 51 NW9 29.94F
ELIZABETH CITY PTSUNNY 40 27 59 NW7 29.92S
MANTEO CLOUDY 42 28 58 N9 29.91S
CAPE HATTERAS FAIR 45 33 63 N5 29.90S
$$
NCZ078-087-090-091-093-098-101-081500-
SOUTHEASTERN NORTH CAROLINA
CITY SKY/WX TMP DP RH WIND PRES REMARKS
LUMBERTON CLOUDY 40 29 64 NW8 29.99R WCI 34
GOLDSBORO CLOUDY 39 25 57 NW5 29.94S
KINSTON PTSUNNY 43 25 49 NW8 29.96S
KENANSVILLE CLOUDY 39 27 60 NW7 29.96S WCI 34
NEW BERN FAIR 41 27 57 NW8 29.95S
CHERRY POINT NOT AVBL
BEAUFORT FAIR 45 28 51 NW13 29.93S
JACKSONVILLE CLOUDY 43 27 53 NW9 29.95S
WILMINGTON FAIR 44 27 51 NW13 29.96S
$$
National Weather Service Raleigh, NC Weather Forecast Office
1005 Capability Drive, Suite 300 Centennial Campus
Raleigh, NC 27606-5226
(919) 326-1042
Page Author: RAH Webmaster
Web Master's E-mail: rah.webmaster@noaa.gov
Page last modified: Jan 10th, 2018 19:57 UTC
Disclaimer Credits Glossary Privacy Policy About Us Career Opportunities
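Since the product keeps a fixed CITY SKY/WX TMP ... column layout, a regex can pull a city's row apart. A minimal sketch, assuming that layout holds; the weather datasource and conditions table are hypothetical:
<cffile action="read" file="#expandPath('./NWS Raleigh Durham.txt')#" variable="myFile">
<!--- Capture the condition word and temperature that follow the city name --->
<cfset m = reFindNoCase("GREENVILLE\s+([A-Z]+)\s+(\d+)", myFile, 1, true)>
<cfif arrayLen(m.pos) gte 3>
    <cfset cond = mid(myFile, m.pos[2], m.len[2])>
    <cfset temp = mid(myFile, m.pos[3], m.len[3])>
    <cfquery datasource="weather">
        INSERT INTO conditions (city, cond, temp)
        VALUES (
            <cfqueryparam value="GREENVILLE" cfsqltype="cf_sql_varchar">,
            <cfqueryparam value="#cond#" cfsqltype="cf_sql_varchar">,
            <cfqueryparam value="#temp#" cfsqltype="cf_sql_integer">
        )
    </cfquery>
</cfif>
Looping over an array of city names and reusing the same pattern extends this to the other cities; note that multi-word entries like NOT AVBL would need a looser capture group.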

How do you Resample daily with a conditional statement in pandas

I have the pandas data frame below (it does have other columns, but these are the important ones); the Date column is the index:
Number_QA_VeryGood Number_Valid_Cells Time
Date
2015-01-01 91 92 18:55
2015-01-02 6 6 18:00
2015-01-02 13 13 19:40
2015-01-03 106 106 18:45
2015-01-05 68 68 18:30
2015-01-06 111 117 19:15
2015-01-07 89 97 18:20
2015-01-08 86 96 19:00
2015-01-10 9 16 18:50
I need to resample daily; the first two columns should be resampled with sum.
The last column needs to look at the highest daily value for the Number_Valid_Cells column and use that time for the value.
Example output should be (2015-01-02 is the line which changed):
Number_QA_VeryGood Number_Valid_Cells Time
Date
2015-01-01 91 92 18:55
2015-01-02 19 19 19:40
2015-01-03 106 106 18:45
2015-01-05 68 68 18:30
2015-01-06 111 117 19:15
2015-01-07 89 97 18:20
2015-01-08 86 96 19:00
2015-01-10 9 16 18:50
What is the best way to get this to work?
You can try:
df.groupby(df.index).agg({'Number_QA_VeryGood':'sum','Number_Valid_Cells':'sum','Time':'last'})
Out[276]:
Time Number_QA_VeryGood Number_Valid_Cells
Date
2015-01-01 18:55 91 92
2015-01-02 19:40 19 19
2015-01-03 18:45 106 106
2015-01-05 18:30 68 68
2015-01-06 19:15 111 117
2015-01-07 18:20 89 97
2015-01-08 19:00 86 96
2015-01-10 18:50 9 16
Update: sort_values first, so that 'last' picks up the Time of each day's maximum Number_Valid_Cells:
df.sort_values('Number_Valid_Cells').groupby(df.sort_values('Number_Valid_Cells').index)\
.agg({'Number_QA_VeryGood':'sum','Number_Valid_Cells':'sum','Time':'last'})
Out[314]:
Time Number_QA_VeryGood Number_Valid_Cells
Date
1/1/2015 18:55 91 92
1/10/2015 18:50 9 16
1/2/2015 16:40#here.changed 19 19
1/3/2015 18:45 106 106
1/5/2015 18:30 68 68
1/6/2015 19:15 111 117
1/7/2015 18:20 89 97
1/8/2015 19:00 86 96
Data input:
Number_QA_VeryGood Number_Valid_Cells Time
Date
1/1/2015 91 92 18:55
1/2/2015 6 6 18:00
1/2/2015 13 13 16:40#I change here
1/3/2015 106 106 18:45
1/5/2015 68 68 18:30
1/6/2015 111 117 19:15
1/7/2015 89 97 18:20
1/8/2015 86 96 19:00
1/10/2015 9 16 18:50
You can use groupby sum for the first two columns; if the values of Number_Valid_Cells are sorted, then keeping the last duplicate per date gives the Time:
ndf = df.reset_index().groupby('Date').sum()
ndf['Time'] = df.reset_index().drop_duplicates(subset='Date',keep='last').set_index('Date')['Time']
Number_QA_VeryGood Number_Valid_Cells Time
Date
2015-01-01 91 92 18:55
2015-01-02 19 19 19:40
2015-01-03 106 106 18:45
2015-01-05 68 68 18:30
2015-01-06 111 117 19:15
2015-01-07 89 97 18:20
2015-01-08 86 96 19:00
2015-01-10 9 16 18:50
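If the data isn't pre-sorted, a variant with idxmax avoids sorting altogether; a sketch, assuming df is the frame from the question with the same column names:
# Sum the numeric columns per day, then take Time from the row
# holding each day's maximum Number_Valid_Cells.
tmp = df.reset_index()
out = tmp.groupby('Date').agg({'Number_QA_VeryGood': 'sum',
                               'Number_Valid_Cells': 'sum'})
out['Time'] = (tmp.loc[tmp.groupby('Date')['Number_Valid_Cells'].idxmax(),
                       ['Date', 'Time']]
                  .set_index('Date')['Time'])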