Dates and Between statement - sas

I am using SAS E.G. 7.1
I have the following code:
data time_dim_monthly;
do i = 0 to 200;
index_no = i;
year_date = year(intnx('month','01JAN2008'd,i));
month_date = month(intnx('month','01JAN2008'd,i));
SOM = put(intnx('month', '01JAN2008'd, i, 'b'),date11.) ;
EOM = put(intnx('month', '01JAN2008'd, i, 'e'),date11.) ;
days_in_month = INTCK('day',intnx('month', '01JAN2008'd, i, 'b'),
intnx('month', '01JAN2008'd, i, 'e'));
output;
end;
run;
followed by
proc sql;
create table calendar as
select year_date, month_date, index_no, put(today(),date11.) as todays_dt, som, eom
from time_dim_monthly
where put(today(),date11.) between som and eom
/*or datepart((INTNX('month',today(),-1)) between som and eom)*/
order by index_no
;
quit;
The output looks like this:
year_date month_date index_no todays_dt SOM EOM
2008 10 9 31-MAY-2017 01-OCT-2008 31-OCT-2008
2009 10 21 31-MAY-2017 01-OCT-2009 31-OCT-2009
2010 10 33 31-MAY-2017 01-OCT-2010 31-OCT-2010
2011 10 45 31-MAY-2017 01-OCT-2011 31-OCT-2011
2012 10 57 31-MAY-2017 01-OCT-2012 31-OCT-2012
2013 10 69 31-MAY-2017 01-OCT-2013 31-OCT-2013
2014 10 81 31-MAY-2017 01-OCT-2014 31-OCT-2014
2015 10 93 31-MAY-2017 01-OCT-2015 31-OCT-2015
2016 10 105 31-MAY-2017 01-OCT-2016 31-OCT-2016
2017 5 112 31-MAY-2017 01-MAY-2017 31-MAY-2017
2017 10 117 31-MAY-2017 01-OCT-2017 31-OCT-2017
2018 5 124 31-MAY-2017 01-MAY-2018 31-MAY-2018
2018 10 129 31-MAY-2017 01-OCT-2018 31-OCT-2018
2019 5 136 31-MAY-2017 01-MAY-2019 31-MAY-2019
2019 10 141 31-MAY-2017 01-OCT-2019 31-OCT-2019
2020 5 148 31-MAY-2017 01-MAY-2020 31-MAY-2020
2020 10 153 31-MAY-2017 01-OCT-2020 31-OCT-2020
2021 5 160 31-MAY-2017 01-MAY-2021 31-MAY-2021
2021 10 165 31-MAY-2017 01-OCT-2021 31-OCT-2021
2022 5 172 31-MAY-2017 01-MAY-2022 31-MAY-2022
2022 10 177 31-MAY-2017 01-OCT-2022 31-OCT-2022
2023 5 184 31-MAY-2017 01-MAY-2023 31-MAY-2023
2023 10 189 31-MAY-2017 01-OCT-2023 31-OCT-2023
2024 5 196 31-MAY-2017 01-MAY-2024 31-MAY-2024
While I'd expected that it would only give me one line:
2017 5 112 31-MAY-2017 01-MAY-2017 31-MAY-2017
Would appreciate help in understanding why this is happening.
Thank you

This is your mistake:
SOM = put(intnx('month', '01JAN2008'd, i, 'b'),date11.) ;
EOM = put(intnx('month', '01JAN2008'd, i, 'e'),date11.) ;
where put(today(),date11.) between som and eom
put creates a character variable. You shouldn't really use between with character variables unless you really know what you're doing (it will compare in alphabetical order).
Use numeric variables. Get rid of the put. Instead use a format statement to make the variables look nice, but still be numeric.
SOM = intnx('month', '01JAN2008'd, i, 'b') ;
EOM = intnx('month', '01JAN2008'd, i, 'e') ;
format som eom date11.;
later
where today() between som and eom
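To see why the character comparison returns exactly those rows, here is a minimal Python sketch (not SAS) that rebuilds the 201 month rows and applies both tests. The helper names date11 and month_bounds are made up for this illustration, and the run date is taken from the todays_dt column in the question. It assumes the text comparison is a plain character-by-character one.

```python
from datetime import date, timedelta

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def date11(d):
    """Mimic the text produced by put(d, date11.), e.g. 31-MAY-2017."""
    return f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year}"


def month_bounds(start, offset):
    """First and last day of the month `offset` months after `start`."""
    index = start.year * 12 + (start.month - 1) + offset
    year, month0 = divmod(index, 12)
    next_year, next_month0 = divmod(index + 1, 12)
    first = date(year, month0 + 1, 1)
    last = date(next_year, next_month0 + 1, 1) - timedelta(days=1)
    return first, last


today = date(2017, 5, 31)  # the run date shown in the question's output
rows = [month_bounds(date(2008, 1, 1), i) for i in range(201)]

# Character comparison: what the original WHERE clause effectively does.
by_text = [i for i, (som, eom) in enumerate(rows)
           if date11(som) <= date11(today) <= date11(eom)]

# Comparison on real date values: what the corrected WHERE clause does.
by_value = [i for i, (som, eom) in enumerate(rows) if som <= today <= eom]

print(len(by_text), by_text[:3])
print(by_value)
```

Every SOM string starts with "01-", which sorts before "31-", so the lower bound always passes. The upper bound passes whenever EOM starts with "31-" and its month abbreviation sorts after "MAY" (only "OCT"), or equals "MAY" with a year of 2017 or later.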


Problem with if condition in my fortran code

I have been trying to extract the atom numbers corresponding to the atom name "OW" from a file, using an if condition in my Fortran code. But when I use the if condition, the values are not written to the output file. Could anybody help me see what I am doing wrong?
implicit none
character(len=100)::head,grofile
character(len=5):: res_nm,at_name
integer :: n,i,ierror,at_num
write(*,*) 'enter the name of gro file'
read(*,*) grofile
open(unit=10,file=grofile,status='old',action='read')
openif : if (ierror == 0) then
!open was ok. Read values.
read(10,*)head
read(10,*)n
do i=1,n
read(10,200) at_name,at_num
if (at_name == 'OW') then
write(44,*)at_num
200 format (10x,a5,i5)
endif
enddo
endif openif
end program name
and the input file that I am using is
CNT in water
44316
1LIG C 1 2.814 2.448 2.231 -0.2002 0.0645 -0.2005
1LIG C 2 2.783 2.584 2.233 0.4146 0.2083 -0.1403
1LIG C 3 2.769 2.658 2.350 -0.4678 -0.0886 -0.0500
1LIG C 4 2.687 2.772 2.348 -0.7671 -0.3032 -0.0624
1LIG C 5 2.619 2.795 2.228 -0.2327 -0.2483 -0.3593
1LIG C 6 2.486 2.837 2.238 -0.0621 0.2349 -0.0781
................
1LIG H 1006 2.613 1.972 12.082 -1.2767 0.0570 0.2045
1LIG H 1007 2.804 2.173 12.099 -0.4228 1.8734 1.9762
1LIG H 1008 2.862 2.377 12.097 -0.7176 -2.2587 1.0804
2water OW 1009 2.221 1.281 6.853 -0.6831 -0.3395 0.1402
2water HW1 1010 2.191 1.215 6.789 -1.2195 0.6304 -0.6225
2water HW2 1011 2.143 1.333 6.871 -0.5687 -0.7024 1.7263
2water MW 1012 2.206 1.279 6.847 -0.7389 -0.2594 0.2489
3water OW 1013 2.826 4.482 12.736 -0.2852 0.1750 0.1277
3water HW1 1014 2.735 4.490 12.707 -0.3265 -0.3844 0.1046
3water HW2 1015 2.860 4.406 12.689 0.4937 0.9762 -0.6120
3water MW 1016 2.818 4.473 12.726 -0.1879 0.2065 0.0267
4water OW 1017 3.510 2.042 10.165 0.1154 -0.0258 -0.0813
4water HW1 1018 3.530 2.105 10.095 3.0124 -0.2562 0.4945
4water HW2 1019 3.434 1.993 10.132 -0.4188 1.8748 -1.7521
...............
4.90369 4.90369 14.25892
Also, the code produces no error and no output.
The commands that I am using:
gfortran br_br_gofr_smooth_dlp4.f90 -o read -I /usr/local/include/ -lgmxfort -g -fcheck=all -fbounds-check
./read
You have two errors:
1. You are using ierror uninitialised, as noted in the comments. The open statement has no iostat= specifier, so nothing ever sets it.
2. at_name is a length 5 character variable. You are comparing it with a 2 letter character variable. For this to be true the leftmost 2 characters of at_name have to be the same as those in the two character variable, and the remaining characters have to be blanks. Unfortunately, as your code is written, it reads the atom name into the rightmost 2 characters of at_name. Thus the test fails.
The code below shows a way of fixing the above, and does what I think you want. Especially for point 2 there are other ways.
ijb@ijb-Latitude-5410:~/work/stack$ cat pdb.f90
implicit none
character(len=100)::head,grofile
character(len=5):: res_nm,at_name
integer :: n,i,ierror,at_num
write(*,*) 'enter the name of gro file'
read(*,*) grofile
open(unit=10,file=grofile,status='old',action='read',iostat=ierror)
openif : if (ierror == 0) then
!open was ok. Read values.
read(10,*)head
read(10,*)n
do i=1,n
read(10,200) at_name,at_num
if (Adjustl(at_name) == 'OW') then
write(44,*)at_num
200 format (10x,a5,i5)
endif
enddo
endif openif
end program
ijb@ijb-Latitude-5410:~/work/stack$ gfortran -std=f2008 -Wall -Wextra -fcheck=all -g -O pdb.f90
pdb.f90:3:27:
3 | character(len=5):: res_nm,at_name
| 1
Warning: Unused variable ‘res_nm’ declared at (1) [-Wunused-variable]
ijb@ijb-Latitude-5410:~/work/stack$ cat stuff
CNT in water
20
1LIG C 1 2.814 2.448 2.231 -0.2002 0.0645 -0.2005
1LIG C 2 2.783 2.584 2.233 0.4146 0.2083 -0.1403
1LIG C 3 2.769 2.658 2.350 -0.4678 -0.0886 -0.0500
1LIG C 4 2.687 2.772 2.348 -0.7671 -0.3032 -0.0624
1LIG C 5 2.619 2.795 2.228 -0.2327 -0.2483 -0.3593
1LIG C 6 2.486 2.837 2.238 -0.0621 0.2349 -0.0781
1LIG H 1006 2.613 1.972 12.082 -1.2767 0.0570 0.2045
1LIG H 1007 2.804 2.173 12.099 -0.4228 1.8734 1.9762
1LIG H 1008 2.862 2.377 12.097 -0.7176 -2.2587 1.0804
2water OW 1009 2.221 1.281 6.853 -0.6831 -0.3395 0.1402
2water HW1 1010 2.191 1.215 6.789 -1.2195 0.6304 -0.6225
2water HW2 1011 2.143 1.333 6.871 -0.5687 -0.7024 1.7263
2water MW 1012 2.206 1.279 6.847 -0.7389 -0.2594 0.2489
3water OW 1013 2.826 4.482 12.736 -0.2852 0.1750 0.1277
3water HW1 1014 2.735 4.490 12.707 -0.3265 -0.3844 0.1046
3water HW2 1015 2.860 4.406 12.689 0.4937 0.9762 -0.6120
3water MW 1016 2.818 4.473 12.726 -0.1879 0.2065 0.0267
4water OW 1017 3.510 2.042 10.165 0.1154 -0.0258 -0.0813
4water HW1 1018 3.530 2.105 10.095 3.0124 -0.2562 0.4945
4water HW2 1019 3.434 1.993 10.132 -0.4188 1.8748 -1.7521
4.90369 4.90369 14.25892
ijb@ijb-Latitude-5410:~/work/stack$ ls -lrt | tail
-rw-rw-r-- 1 ijb ijb 1106 Jun 11 05:43 pi_orig.f90
-rw-rw-r-- 1 ijb ijb 958 Jun 11 05:56 pi_ijb.f90~
-rw-rw-r-- 1 ijb ijb 1805 Jun 11 06:07 pi_ijb.f90
-rw-rw-r-- 1 ijb ijb 1106 Jun 11 06:29 pi2.f90~
-rw-rw-r-- 1 ijb ijb 1305 Jun 11 06:34 pi2.f90
-rw-rw-r-- 1 ijb ijb 537 Jun 14 08:39 pdb.f90~
-rw-rw-r-- 1 ijb ijb 1462 Jun 14 08:40 stuff~
-rw-rw-r-- 1 ijb ijb 1425 Jun 14 08:41 stuff
-rw-rw-r-- 1 ijb ijb 560 Jun 14 08:43 pdb.f90
-rwxrwxr-x 1 ijb ijb 20520 Jun 14 08:49 a.out
ijb@ijb-Latitude-5410:~/work/stack$ ./a.out
enter the name of gro file
stuff
ijb@ijb-Latitude-5410:~/work/stack$ ls -lrt | tail
-rw-rw-r-- 1 ijb ijb 958 Jun 11 05:56 pi_ijb.f90~
-rw-rw-r-- 1 ijb ijb 1805 Jun 11 06:07 pi_ijb.f90
-rw-rw-r-- 1 ijb ijb 1106 Jun 11 06:29 pi2.f90~
-rw-rw-r-- 1 ijb ijb 1305 Jun 11 06:34 pi2.f90
-rw-rw-r-- 1 ijb ijb 537 Jun 14 08:39 pdb.f90~
-rw-rw-r-- 1 ijb ijb 1462 Jun 14 08:40 stuff~
-rw-rw-r-- 1 ijb ijb 1425 Jun 14 08:41 stuff
-rw-rw-r-- 1 ijb ijb 560 Jun 14 08:43 pdb.f90
-rwxrwxr-x 1 ijb ijb 20520 Jun 14 08:49 a.out
-rw-rw-r-- 1 ijb ijb 39 Jun 14 08:49 fort.44
ijb@ijb-Latitude-5410:~/work/stack$ cat fort.44
1009
1013
1017
ijb@ijb-Latitude-5410:~/work/stack$
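Point 2 can be illustrated outside Fortran with a small Python sketch. It assumes the atom records follow the usual fixed-width .gro layout (5 columns each for residue number, residue name, atom name and atom number), which is also what format (10x,a5,i5) implies. The helper names are invented for this example, and the velocity columns are left out for brevity.

```python
def gro_atom_line(resnum, resname, atomname, atomnum, x, y, z):
    """Format one atom record using the assumed fixed-width .gro layout."""
    return "%5d%-5s%5s%5d%8.3f%8.3f%8.3f" % (
        resnum, resname, atomname, atomnum, x, y, z)


def fortran_equal(a, b):
    """Compare the way Fortran does: pad the shorter string with blanks."""
    width = max(len(a), len(b))
    return a.ljust(width) == b.ljust(width)


def adjustl(s):
    """Like Fortran ADJUSTL: move leading blanks to the end, same length."""
    return s.lstrip(" ").ljust(len(s))


line = gro_atom_line(2, "water", "OW", 1009, 2.221, 1.281, 6.853)

# Equivalent of format (10x,a5,i5): skip 10 columns, then a 5-character field.
at_name = line[10:15]
at_num = int(line[15:20])

print(repr(at_name))                          # '   OW'
print(fortran_equal(at_name, "OW"))           # False
print(fortran_equal(adjustl(at_name), "OW"))  # True
```

The atom name is right-justified in its 5-column field, so the raw field is three blanks followed by OW, while the literal 'OW' is padded on the right. Left-adjusting the field first makes the two agree.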

How can I delete objects without complete data by using stata

I have a large panel dataset that looks as follows.
input id age high weight str6 daily_drink
1 10 110 35 water
1 10 110 35 coffee
1 11 120 38 water
1 11 120 38 coffee
1 12 130 50 water
1 12 130 50 coffee
2 11 118 31 water
2 11 118 31 coffee
2 11 118 31 milk
2 12 123 38 water
2 12 123 38 coffee
2 12 123 38 milk
3 10 98 55 water
3 11 116 36 water
3 12 129 39 water
4 12 125 40 water
end
However, I would like to use Stata to keep only the ids that have observations for all three ages 10, 11, and 12. The result should look like this.
id age high weight daily_drink
1 10 110 35 water
1 10 110 35 coffee
1 11 120 38 water
1 11 120 38 coffee
1 12 130 50 water
1 12 130 50 coffee
3 10 98 55 water
3 11 116 36 water
3 12 129 39 water
However, none of the rows has missing data, so I cannot simply delete rows with missing values. Is there any way to do it? Any suggestion will help. Thanks in advance.
You can use bysort and egen for this. Something along the lines of
bysort id: egen has10 = total(age==10)
bysort id: egen has11 = total(age==11)
bysort id: egen has12 = total(age==12)
keep if (has10 != 0) & (has11 != 0) & (has12 != 0)
should work (untested). See help egen for more info. Install gtools if you have very large data (ssc install gtools) and then replace egen by gegen.
A solution that works if 10, 11, 12 are the only age values possible:
bysort id (age) : gen nvals = sum(age != age[_n-1])
by id : replace nvals = nvals[_N]
keep if nvals == 3
Consider also
bysort id (age) : gen OK1 = age[1] == 10 & age[_N] == 12
by id : egen OK2 = max(age == 11)
keep if OK1 & OK2
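For readers who want to check the logic of the answers above independently of Stata, here is a minimal Python sketch of the same idea: collect the set of ages seen for each id, then keep the ids whose set contains 10, 11 and 12. The data are the rows from the question; the variable names are only illustrative.

```python
from collections import defaultdict

# (id, age, high, weight, daily_drink) rows from the question
rows = [
    (1, 10, 110, 35, "water"), (1, 10, 110, 35, "coffee"),
    (1, 11, 120, 38, "water"), (1, 11, 120, 38, "coffee"),
    (1, 12, 130, 50, "water"), (1, 12, 130, 50, "coffee"),
    (2, 11, 118, 31, "water"), (2, 11, 118, 31, "coffee"),
    (2, 11, 118, 31, "milk"),
    (2, 12, 123, 38, "water"), (2, 12, 123, 38, "coffee"),
    (2, 12, 123, 38, "milk"),
    (3, 10, 98, 55, "water"), (3, 11, 116, 36, "water"),
    (3, 12, 129, 39, "water"),
    (4, 12, 125, 40, "water"),
]

required_ages = {10, 11, 12}

# Collect the distinct ages observed for each id.
ages_by_id = defaultdict(set)
for id_, age, *_ in rows:
    ages_by_id[id_].add(age)

# An id is complete when every required age appears at least once.
complete_ids = {id_ for id_, ages in ages_by_id.items()
                if required_ages <= ages}

kept = [row for row in rows if row[0] in complete_ids]
for row in kept:
    print(*row)
```

Unlike the distinct-count approach, this does not depend on 10, 11 and 12 being the only possible ages.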

How to shift value of column as new variable name?

I have a dataset that looks like this
ID Model_Value Count_Model
111 24 2
222 12 9
234 88 6
111 88 8
222 24 10
222 88 17
I want it to look like this:
ID Model_12 Model_24 Model_88
111 0 2 8
222 9 10 17
234 0 0 6
I don't think I am searching online for the correct terms. I initially thought a transpose might work, but I still want each row to represent an ID, not a model.
How do I go about creating this output from what I have?
Ok I believe this is it! Thank you @mjsqu !!
I was able to do this with the help of this link: http://www.sascommunity.org/mwiki/images/d/dd/PROC_Transpose_slides.pdf
data test_transpose ;
input ID_P Model_Value Count_Model ; /* list input: fields are separated by blanks */
cards;
111 24 2
222 12 9
234 88 6
111 88 8
222 24 10
222 88 17
run;
proc print data=test_transpose;
run;
proc sort data=test_transpose out=test_transpose_S;
By ID_P;
run;
proc transpose
data = test_transpose_S
out = test_transpose_result (drop=_name_)
prefix=Model_Value;
var Count_Model;
BY ID_P;
id Model_Value;
run;
proc print data=test_transpose_result ;
run;
Output of the original sorted dataset and the transpose!
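The reshaping itself (long to wide, one column per model value) can be sketched in a few lines of plain Python. This is only an illustration of the idea, not a replacement for PROC TRANSPOSE. Here the absent ID/model combinations are filled with 0 explicitly to match the table asked for in the question, whereas a transpose would be expected to leave them missing.

```python
# (ID, Model_Value, Count_Model) rows from the question
rows = [
    (111, 24, 2),
    (222, 12, 9),
    (234, 88, 6),
    (111, 88, 8),
    (222, 24, 10),
    (222, 88, 17),
]

# One output column per distinct model value, in ascending order.
models = sorted({model for _, model, _ in rows})

# One output row per ID, starting from 0 for every model.
wide = {}
for id_, model, count in rows:
    wide.setdefault(id_, {m: 0 for m in models})[model] = count

print("ID", *[f"Model_{m}" for m in models])
for id_ in sorted(wide):
    print(id_, *[wide[id_][m] for m in models])
```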

How to sum by group and add new variable dependent by the other two variables in SAS SQL

data work.want2;
input Y M $ ID $ volume;
datalines;
2009 JAN A1 100
2009 FEB A1 20
2009 FEB A1 80
2009 JAN A2 100
2009 JAN A2 100
2009 FEB A2 20
2009 FEB A2 80
2009 JAN A3 100
2009 FEB A3 150
2009 MAR A3 100
2011 DEC A1 100
2011 DEC A1 20
2011 DEC A2 20
2011 DEC A3 120
2011 DEC A3 80
2011 OCT A1 100
2011 OCT A2 20
2011 OCT A2 100
;
proc print data=want2;
run;
/*Code 2--> to sum by Y M ID*/
PROC SQL;
create table want3 as SELECT
Y,
M,
ID,
sum(volume) AS sumvolume
FROM want2
GROUP BY Y, M ,ID;
QUIT;
/*Code 3 -->get sum by Y M*/
PROC SQL;
SELECT
Y,
M,
sum(sumvolume) AS sumvolume_MO
FROM want3
GROUP BY Y, M;
QUIT;
I have used SAS SQL (code 2) to sum volume by Y, M and ID. I want to add a new variable, the monthly volume, that depends only on Y and M. I have used "code 3" to get that result.
Is it possible to combine code 2 and code 3 to get the following results? I always get errors.
Thanks in advance.
Y M ID sumvolume sumvolume_MO
2009 FEB A1 100 350
2009 FEB A2 100 350
2009 FEB A3 150 350
2009 JAN A1 100 400
2009 JAN A2 200 400
2009 JAN A3 100 400
2009 MAR A3 100 100
2011 DEC A1 120 340
2011 DEC A2 20 340
2011 DEC A3 200 340
2011 OCT A1 100 220
2011 OCT A2 120 220
Updated to reflect that the wanted result is sum(volume) rather than the raw volume.
In general you would want to use sub queries. You could calculate the sum over the different groupings in separate subqueries and merge the results back together.
select a.y,a.m,a.id,a.sumvolume,b.sumvolume_mo
from
(select y,m,id,sum(volume) as sumvolume
from have
group by 1,2,3
) a
natural join
(select y,m,sum(volume) as sumvolume_mo
from have
group by 1,2
) b
;
But PROC SQL in SAS will also let you include non-group and non-aggregate variables in the SELECT and automatically remerge the data for you. So you could get SUMVOLUME_MO by adding up the values of SUMVOLUME.
select y,m,id,sumvolume,sum(sumvolume) as sumvolume_mo
from
(select y,m,id,sum(volume) as sumvolume
from have
group by 1,2,3
)
group by 1,2
;
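As a cross-check of what the two queries should return, here is a small Python sketch that computes both levels of totals from the question's data and joins them back together. It follows the same logic as the subquery version: one total per (Y, M, ID), one per (Y, M), matched on Y and M. The variable names are illustrative only.

```python
from collections import defaultdict

# (Y, M, ID, volume) rows from the question
rows = [
    (2009, "JAN", "A1", 100), (2009, "FEB", "A1", 20),
    (2009, "FEB", "A1", 80), (2009, "JAN", "A2", 100),
    (2009, "JAN", "A2", 100), (2009, "FEB", "A2", 20),
    (2009, "FEB", "A2", 80), (2009, "JAN", "A3", 100),
    (2009, "FEB", "A3", 150), (2009, "MAR", "A3", 100),
    (2011, "DEC", "A1", 100), (2011, "DEC", "A1", 20),
    (2011, "DEC", "A2", 20), (2011, "DEC", "A3", 120),
    (2011, "DEC", "A3", 80), (2011, "OCT", "A1", 100),
    (2011, "OCT", "A2", 20), (2011, "OCT", "A2", 100),
]

sum_by_id = defaultdict(int)      # totals per (Y, M, ID)
sum_by_month = defaultdict(int)   # totals per (Y, M)
for y, m, id_, volume in rows:
    sum_by_id[(y, m, id_)] += volume
    sum_by_month[(y, m)] += volume

# Attach the monthly total to every (Y, M, ID) row.
result = [(y, m, id_, total, sum_by_month[(y, m)])
          for (y, m, id_), total in sorted(sum_by_id.items())]

for row in result:
    print(*row)
```

Sorting by (Y, M, ID) as text reproduces the row order of the table in the question.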
Thanks to Tom's answer, I can get the results with the following code.
PROC SQL;
create table newwant2 as
select y,m,id, sum(volume) as sumvolume_mo2,sumvolume_mo
from newwant
group by Y,M,id
;
Then I use the following code to delete the duplicate rows and keep the last row of each duplicate.
data newwant3;
set newwant2;
by Y M ID sumvolume_mo2 ;
if last.ID;
run;
proc print data=newwant3;
run;

Create date variable from time (Using SAS 9.3)

Using SAS 9.3
I have files with two variables (time and pulse), one file for each person.
For each person, I know the date on which measuring started.
Now I want to create a date variable that changes at midnight. How can I do this?
Example from text files:
23:58:02 106
23:58:07 105
23:58:12 103
23:58:17 98
23:58:22 100
23:58:27 97
23:58:32 99
23:58:37 100
23:58:42 99
23:58:47 104
23:58:52 95
23:58:57 96
23:59:02 98
23:59:07 96
23:59:12 104
23:59:17 109
23:59:22 105
23:59:27 111
23:59:32 111
23:59:37 104
23:59:42 110
23:59:47 100
23:59:52 106
23:59:57 114
00:00:02 123
00:00:07 130
00:00:12 130
00:00:17 125
00:00:22 119
00:00:27 116
00:00:32 122
00:00:37 116
00:00:42 119
00:00:47 117
00:00:52 114
00:00:57 114
00:01:02 110
00:01:07 103
00:01:12 98
00:01:17 98
00:01:22 102
00:01:27 97
00:01:32 99
00:01:37 93
00:01:42 97
00:01:47 103
00:01:52 96
00:01:57 93
00:02:02 93
00:02:07 95
00:02:12 106
00:02:17 99
00:02:22 102
00:02:27 96
00:02:32 93
00:02:37 97
00:02:42 102
00:02:47 101
00:02:52 95
00:02:57 92
00:03:02 100
00:03:07 95
00:03:12 102
00:03:17 102
00:03:22 109
00:03:27 109
00:03:32 107
00:03:37 111
00:03:42 112
00:03:47 113
00:03:52 115
Regex:
\d{2}:\d{2}:\d{2} \d*
See here for an example and play around with regex:
https://regex101.com/r/xF1fQ5/1
EDIT: and have a look at the SAS regex tip sheet: http://support.sas.com/rnd/base/datastep/perl_regexp/regexp-tip-sheet.pdf
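As a rough illustration of how that pattern splits one record, here is a Python sketch. The capturing groups and the parse_record helper are additions made for this example, and the pulse is matched with \d+ rather than \d* so that an empty value is rejected.

```python
import re

# Same shape as the pattern above, with capturing groups added.
RECORD = re.compile(r"(\d{2}):(\d{2}):(\d{2}) (\d+)")


def parse_record(line):
    """Return ((hour, minute, second), pulse), or None if the line does not match."""
    match = RECORD.fullmatch(line.strip())
    if match is None:
        return None
    hour, minute, second, pulse = map(int, match.groups())
    return (hour, minute, second), pulse


print(parse_record("23:58:02 106"))
print(parse_record("00:00:02 123"))
print(parse_record("not a record"))
```

This only parses the lines; it does not decide which date a record belongs to.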
Something like this:
Date lastDate = startDate;
List<NData> ListData = new ArrayList<NData>();
for(FileData fdat:ListFileData){
Date nDate = this.getDate(lastDate,fdat.gettime());
NData nData = new NData(nDate,fdat.getMeasuring());
ListData.add(nData);
lastDate = nDate;
}
.
.
.
.
Date getDate(Date ld,String time){
Calendar cal = Calendar.getInstance();
cal.setTime(ld);
int year = cal.get(Calendar.YEAR);
int month = cal.get(Calendar.MONTH)+1;
int day = cal.get(Calendar.DAY_OF_MONTH);
int hourOfDay = this.getHour(time);
int minuteOfHour = this.getMinute(time);
org.joda.time.LocalDateTime lastDate = new org.joda.time.LocalDateTime(ld);
org.joda.time.LocalDateTime newDate = new org.joda.time.LocalDateTime(year,month,day,hourOfDay,minuteOfHour);
if(newDate.isBefore(lastDate)){
newDate = newDate.plusDays(1);
}
return newDate.toDate();
}
It's hard to provide a complete answer without sample code, but the SAS lag() function might be enough to do what you need. Your data step would include lines like the following, assuming your time variable is called time and your date variable is called date:
retain date;
if time < lag(time) then date = date + 1;
This assumes you never have any 24 hour gaps (but it appears you'd have to assume that anyway).
This answer also assumes that the time field is already in a SAS time format.
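The same rollover rule can be tried out in a short Python sketch: carry the current date forward and add one day whenever a time is earlier than the one before it. The start date below is a made-up example, since the question does not give one, and assign_dates is a hypothetical helper name.

```python
from datetime import date, time, timedelta


def assign_dates(start_date, times):
    """Pair each time with a date, advancing the date when the clock wraps."""
    current = start_date
    previous = None
    dated = []
    for t in times:
        # A time earlier than the previous one means midnight was crossed.
        if previous is not None and t < previous:
            current += timedelta(days=1)
        dated.append((current, t))
        previous = t
    return dated


# A few readings around midnight, taken from the sample in the question.
sample = [time(23, 59, 52), time(23, 59, 57), time(0, 0, 2), time(0, 0, 7)]

# Hypothetical start date; in practice this comes from the per-person records.
result = assign_dates(date(2015, 3, 1), sample)
for d, t in result:
    print(d.isoformat(), t.isoformat())
```

As with the lag() approach, this relies on consecutive readings never being 24 hours or more apart.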