Timepart Constraint Issue with 00:30:00 (12:30am) - sas

Good morning. I need to identify time frames, but I am having an issue with 12:30 a.m. (00:30:00).
The timestamp shows as 18:00:01.
Below is the code being used. If I evaluate each constraint separately, the 6 p.m. constraint works as expected. However, when evaluating 18:00:01 against < '00:30:00't, it is not doing what I need. Any thoughts on how to get this statement to work?
timepart(submit_time) >='18:00:00't and timepart(submit_time) < '00:30:00't then '6pm-12:30am'

Your test will always be false: a single value cannot be both greater than 18 hours and less than half an hour past midnight. Use OR instead of AND:
timepart(submit_time) >= '18:00:00't or timepart(submit_time) < '00:30:00't
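To illustrate the wrap-around logic outside SAS, here is a minimal Python sketch (the function name is illustrative): a time in a window that crosses midnight satisfies one bound or the other, never both.

from datetime import time

def in_window(t, start=time(18, 0), end=time(0, 30)):
    # The window crosses midnight, so membership means being at or past
    # the start OR before the end; no value can satisfy both bounds.
    return t >= start or t < end

print(in_window(time(18, 0, 1)))  # True  (18:00:01)
print(in_window(time(0, 15)))     # True  (00:15:00)
print(in_window(time(12, 0)))     # False (noon)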

Related

XSLT 1.0 - Compare date values and offset by -15 days

I am trying to compare two date values that are being passed into XSLT in the format MMDDYY, for example: 010220 (Jan 2, 2020). I am using the below snippet of code in the test:
<xsl:if test="(E_Payclass = 'FT') and (E_Status = 'TERMINATED') and (E_TermDate &lt; E_PayEnd_Date)">
In an example, the dates are:
E_TermDate: 072017
E_PayEnd_Date: 020120
When I run this file, 072017 is returned even though it is actually earlier than the current date. It seems the test is comparing the values literally as numbers (which is most likely what is supposed to happen) and returning the record because 072017 is numerically greater than 020120.
Overall, I want the test to treat E_PayEnd_Date as an actual date and only return E_TermDate if it falls within the 15 days prior to it.
Does anyone know how to do that in XSLT 1.0 within an if test statement?
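No answer is recorded here, but the root cause is that MMDDYY strings do not sort chronologically, so a numeric comparison is meaningless. Below is a hedged Python sketch of the logic the stylesheet needs (the function name and window length are illustrative); in XSLT 1.0 itself the same effect can be achieved by rebuilding each value into a sortable YYMMDD number with substring() before comparing.

from datetime import datetime, timedelta

def term_within_window(term_mmddyy, payend_mmddyy, days=15):
    # MMDDYY compares wrongly as a number (072017 > 020120), so parse
    # the strings into real dates before comparing.
    term = datetime.strptime(term_mmddyy, "%m%d%y")
    payend = datetime.strptime(payend_mmddyy, "%m%d%y")
    return payend - timedelta(days=days) <= term < payend

print(term_within_window("072017", "020120"))  # False: July 2017 is years earlier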

Annotate with Count yields incorrect values with NullBooleanField

I am performing the following query with a few annotations:
(AwardIssueProcess.objects.filter(grant__client=client, completed=completed_status)
.annotate(total_awarded=Sum('award__awardissuedactivity__units_awarded'),
num_accepted=Count(Case(When(award__accepted=True, then=1))),
num_rejected=Count(Case(When(award__accepted=False, then=1))),
num_unaccepted=Count(Case(When(award__accepted=None, then=1)))))
However, the num_unaccepted annotation yields incorrect values. Firstly, if I have awards, the number is sometimes double what I expect. But if I remove
total_awarded=Sum('award__awardissuedactivity__units_awarded')
from the annotation, then the doubling problem goes away.
Secondly, num_unaccepted has a value of 1 when there are no awards at all. When there are awards, the value is correct (but not in all cases, due to the doubling problem I mentioned previously). For this second issue, I suspect the query is evaluating the award itself to be None, but what I really want is for it to check whether the accepted field is None, and if the award doesn't exist, simply not count it. The accepted field is a NullBooleanField.
How should I be writing this differently to solve these two problems with num_unaccepted?
EDIT
I removed total_awarded=Sum('award__awardissuedactivity__units_awarded') from the annotation and created a separate function to get the total_awarded amount that I need. This addresses my first problem, but the second issue still remains.
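One hedged sketch of a fix, assuming the models implied by the query above (not a tested answer): counting distinct award primary keys guards against row multiplication from extra joins, and an explicit award__isnull=False stops a missing award from being counted as unaccepted.

from django.db.models import Case, Count, F, When

qs = (AwardIssueProcess.objects
      .filter(grant__client=client, completed=completed_status)
      .annotate(
          # Counting distinct pks collapses duplicate rows created by joins.
          num_accepted=Count(Case(When(award__accepted=True,
                                       then=F('award__pk'))), distinct=True),
          num_rejected=Count(Case(When(award__accepted=False,
                                       then=F('award__pk'))), distinct=True),
          # Require an award to exist before treating NULL as "unaccepted".
          num_unaccepted=Count(Case(When(award__isnull=False,
                                         award__accepted=None,
                                         then=F('award__pk'))), distinct=True)))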

Weka Resample to balance instances in binary dataset

I've only been using Weka for a couple of weeks but I am absolutely blown away by how great it is!
But I have a question: I have a dataset with a target column that is either True or False.
6709 instances in my dataset are True
25318 instances are False.
I want to randomly add duplicates of my True instances to produce a new dataset with 25318 True and 25318 False.
The only filter I can find which does this is the supervised Resample filter; however, I am having trouble understanding what parameters I should use.
(There might be a better filter to do what I want.)
I've had some success with these parameters:
biasToUniformClass = 1.0
invertSelection = False
noReplacement = False
randomSeed = 1
sampleSizePercent = 157.5 (a magic number I've arrived at by trial and error)
This produces 25277 True and 25165 False. Not exactly what I want, but quite close.
The problem is that I can't figure out how to arrive at the magic number. I'm also not getting exactly the numbers of instances that I really want.
Is there a better filter for this purpose?
If not, is there a way to calculate the sampleSizePercent magic number?
Any help is greatly appreciated :)
Supplemental question: am I best to run NominalToBinary on my boolean columns to ensure they are binary? I'm using a NaiveBayes classifier (at the moment) and I don't have any missing values.
Jason
I think the tricky part of this question is getting a perfect balance using the Resample filter. This is because, as stated in its description, it 'Produces a random sub-sample of a dataset using either sampling with replacement or without replacement'. If the cases are drawn randomly, there is no guarantee that you will get an equal split between the two classes.
As for the magic number, it corresponds to the total number of cases you would like to have after the filter is applied. In your case, that is 50636 (25318 + 25318) instead of the current 32027 (6709 + 25318), so the magic number would be 100 * 50636 / 32027 ≈ 158.1 for sampleSizePercent (which is expressed as a percentage). However, as stated above, you may not get an exact match of true and false cases.
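The same arithmetic as a quick Python sketch, using the class counts from the question:

n_true, n_false = 6709, 25318
current_total = n_true + n_false          # 32027
desired_total = 2 * n_false               # 50636: both classes at 25318
sample_size_percent = 100 * desired_total / current_total
print(round(sample_size_percent, 1))      # 158.1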
If you really need an exact figure, you could use your favourite spreadsheet and preprocess the data. One possible method is to randomise the true cases (in a separate column), then sort and copy cases until their number matches the false ones. It's not an automated solution, and it sits outside of Weka, but I have used this method before and it does the job reasonably quickly.
Hope this helps!

How to record total values with rrdtool

I'm pretty sure this question has been asked several times, but either I did not find the correct answer or I didn't understand the solution.
To my current problem:
I have a sensor which measures the time a motor is running.
The sensor is reset after reading.
I'm not interested in the time the motor was running during the last five minutes.
I'm more interested in how long the motor has been running since the very beginning (or since the last reset).
When storing the values in an rrd, depending on the aggregate function, several values are recorded.
When working with GAUGE, the value read is 3000 (tenths of a second) every five minutes.
When working with ABSOLUTE, the value is 10 every five minutes.
But what I would like to get is something like:
3000 after the first 5 minutes
6000 after the next 5 minutes (last value + 3000)
9000 after another 5 minutes (last value + 3000)
The accuracy of the older values (and slopes) is not so important, but the last value should reflect the time in seconds since the beginning as accurately as possible.
Is there a way to accomplish this?
I don't know whether it is useful for your need or not, but maybe the TREND/TRENDNAN CDEF function is what you want. Have a look here:
TREND CDEF function
I have now created a small SQLite database with one table and one column in that table.
The table has one row. I update that row every time my cron job runs, adding the latest sensor reading to the stored value. The value in that single row and column is therefore the accumulated value of my sensor, and this is what is fed into the rrd.
Any other (better) ideas?
The way that I'd tackle this (in Linux) is to write the value to a plain-text file and then use the value from that file for the RRDTool graph. Using SQLite (or any other SQL database) just to keep track of something like this seems unnecessarily heavy.
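A minimal Python sketch of that plain-text-file approach (the paths are hypothetical placeholders, not part of the answer): the cron job adds each fresh reading to a running total kept in a file and then pushes the total into the RRD.

#!/usr/bin/env python3
import subprocess
from pathlib import Path

TOTAL_FILE = Path("/var/lib/motor/total.txt")   # hypothetical path
RRD_FILE = "/var/lib/motor/motor.rrd"           # hypothetical path

def update(reading):
    # The sensor resets after each read, so each reading is a delta
    # to add onto the running total.
    total = int(TOTAL_FILE.read_text()) if TOTAL_FILE.exists() else 0
    total += reading
    TOTAL_FILE.write_text(str(total))
    # Feed the accumulated total (not the delta) into the RRD.
    subprocess.run(["rrdtool", "update", RRD_FILE, f"N:{total}"], check=True)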

Simple query working for years, then suddenly very slow

I've had a query that has been running fine for about 2 years. The database table has about 50 million rows, and is growing slowly. This last week one of my queries went from returning almost instantly to taking hours to run.
Rank.objects.filter(site=Site.objects.get(profile__client=client, profile__is_active=False)).latest('id')
I have narrowed the slow query down to the Rank model. It seems to have something to do with using the latest() method. If I just ask for a queryset, it returns an empty queryset right away.
# count() returns 0 and is fast
Rank.objects.filter(site=Site.objects.get(profile__client=client, profile__is_active=False)).count() == 0
Rank.objects.filter(site=Site.objects.get(profile__client=client, profile__is_active=False)) == []  # also very fast
Here are the results of running EXPLAIN. http://explain.depesz.com/s/wPh
And EXPLAIN ANALYZE: http://explain.depesz.com/s/ggi
I tried vacuuming the table, no change. There is already an index on the "site" field (ForeignKey).
Strangely, if I run this same query for another client that already has Rank objects associated with her account, the query once again returns very quickly. So it seems this is only a problem when there are no Rank objects for that client.
Any ideas?
Versions:
Postgres 9.1,
Django 1.4 svn trunk rev 17047
Well, you've not shown the actual SQL, so it's difficult to be sure, but the explain output suggests the planner thinks the quickest way to find a match is to scan an index on "id" backwards until it finds the client in question.
Since you say it has been fast until recently, this is probably not a silly choice. However, there is always the chance that a particular client's record will be right at the far end of this search.
So - try two things first:
Run ANALYZE on the table in question, and see if that gives the planner enough info.
If not, increase the statistics target (ALTER TABLE ... SET STATISTICS) on the columns in question and re-analyze. See if that does it.
http://www.postgresql.org/docs/9.1/static/planner-stats.html
If that's still not helping, then consider an index on (client, id) and drop the index on id (if it's not needed elsewhere). That should give you lightning-fast answers.
latest() is normally used for date comparisons; maybe you should try ordering by id descending and then limiting to one.
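Spelled out against the question's own query (a sketch; slicing is used instead of .first(), since that helper arrived in later Django versions than the 1.4 trunk mentioned above):

site = Site.objects.get(profile__client=client, profile__is_active=False)
# Explicit descending order plus LIMIT 1, in place of latest('id').
ranks = Rank.objects.filter(site=site).order_by('-id')[:1]
rank = ranks[0] if ranks else None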