Multiple Group by clause - grouping

I need to count the number of students based on class_code and course_code and at the same time count the number of students only based on course. I need two results.
Here is my code
SELECT
Class_code,Course_code, Count(OEN)
FROM section52D
GROUP BY Class_code, Course_code
It solves the first part but I still need count based on course_code.

Assuming you are using Sql Server 2008, I hope this helps you:
SELECT Class_code,Course_code, Count(OEN) as count_basedOn_classAndCourse
into #temp
FROM section52D
GROUP BY Class_code, Course_code
SELECT Course_code, Count(OEN) as count_basedOn_Course
into #temp2
FROM section52D
GROUP BY Course_code
select * from #temp1
select * from #temp2
select t1.Course_Code , t1.Class_Code,
t1.count_basedOn_classAndCourse,
t2.count_basedOn_Course
from #temp1 t1
left outer join #temp2 t2 on t1.Course_Code = t2.Course_Code

You can try:
SELECT Class_code,
Course_code,
(select Count(OEN) FROM section52D) as totalOEN
from section52D
GROUP BY Class_code, Course_code

Related

Django ORM and GROUP BY

Newcommer to Django here.
I'm currently trying to fetch some data from my model with a query that need would need a GROUP BY in SQL.
Here is my simplified model:
class Message(models.Model):
mmsi = models.CharField(max_length=16)
time = models.DateTimeField()
point = models.PointField(geography=True)
I'm basically trying to get the last Message from every distinct mmsi number.
In SQL that would translates like this for example:
select a.* from core_message a
inner join
(select mmsi, max(time) as time from core_message group by mmsi) b
on a.mmsi=b.mmsi and a.time=b.time;
After some tries, I managed to have something working similarly with Django ORM:
>>> mf=Message.objects.values('mmsi').annotate(Max('time'))
>>> Message.objects.filter(mmsi__in=mf.values('mmsi'),time__in=mf.values('time__max'))
That works, but I find my Django solution quite clumsy. Not sure it's the proper way to do it.
Looking at the underlying query this looks like this :
>>> print(Message.objects.filter(mmsi__in=mf.values('mmsi'),time__in=mf.values('time__max')).query)
SELECT "core_message"."id", "core_message"."mmsi", "core_message"."time", "core_message"."point"::bytea FROM "core_message" WHERE ("core_message"."mmsi" IN (SELECT U0."mmsi" FROM "core_message" U0 GROUP BY U0."mmsi") AND "core_message"."time" IN (SELECT MAX(U0."time") AS "time__max" FROM "core_message" U0 GROUP BY U0."mmsi"))
I'd appreciate if you could propose a better solution for this problem.
Thanks !
You only need something like this:
Message.objects.all().distinct('mmsi').values('mmsi', 'time').order_by('mmsi','-id')
or like this:
Message.objects.all().values('mmsi').annotate(date_last=Max('time'))
Note: the last is translate by Django in this sql query:
SELECT "message"."mmsi", MAX("message"."time") AS "date_last" FROM "message" GROUP BY "message"."mmsi", "message"."time" ORDER BY "message"."time" DESC
Using the answers and comments, I managed to solve this using a subquery or a simple distinct order by.
Simple distinct order by solution inspired by #Oriphiel answer:
Message.objects.distinct('mmsi').order_by('mmsi','-time')
The underlying SQL query looks like this :
SELECT DISTINCT ON ("core_message"."mmsi") "core_message"."id", "core_message"."mmsi", "core_message"."time", "core_message"."point"::bytea
FROM "core_message"
ORDER BY "core_message"."mmsi" ASC, "core_message"."time" DESC
Simple and straightforward.
Subquery solution inspired by #DanielRoseman comment:
time_order=Message.objects.filter(mmsi=OuterRef('mmsi')).order_by('-time')
Message.objects.filter(id__in=Subquery(time_order.values('id')[:1]))
The underlying SQL query looks like this :
SELECT "core_message"."id", "core_message"."mmsi", "core_message"."time", "core_message"."point"::bytea
FROM "core_message"
WHERE "core_message"."id" IN
(SELECT U0."id" FROM "core_message" U0 WHERE U0."mmsi" = ("core_message"."mmsi") ORDER BY U0."time" DESC LIMIT 1)
A tad more complex but it gives more flexibility. If I wanted to get first five messages for every MMSI, I'd just need to change the LIMIT value. In Django, it would look like this :
Message.objects.filter(id__in=Subquery(time_order.values('id')[:5]))

Django 1.11 annotation GROUP BY is wrong

With model relations like this
Risk <-- RiskGroup --> Group
I am trying to achieve a query resembling this
SELECT risk.id,
(
SELECT array_agg(bg.name) as names
FROM hazards_riskgroup rgroup
JOIN (SELECT * FROM base_group ORDER BY name) as bg
on rgroup.group_id = bg.id
WHERE risk_id = risk.id
GROUP BY risk_id
ORDER BY risk_id
) AS conf
FROM hazards_risk risk
ORDER BY conf NULLS LAST
;
And I have gotten as far as
group_names = (
RiskGroup.objects
.filter(risk_id=OuterRef('pk'))
.annotate(names=ArrayAgg('group__name'))
.order_by()
.order_by("group__name")
.values('names')
)
qs = Risk.objects.annotate(
conf=Subquery(group_names)
).order_by(F('conf').asc(nulls_last=True))
But that wil produce something along the lines of this query
SELECT risk.id,
(SELECT ARRAY_AGG(U2."name") AS "names"
FROM "hazards_riskgroup" U0
INNER JOIN "base_group" U2 ON (U0."group_id" = U2."id")
WHERE U0."risk_id" = (risk.id)
GROUP BY U0."id", U2."name"
ORDER BY U2."name" ASC) AS "conf"
FROM hazards_risk risk
ORDER BY conf NULLS LAST
Notice that the generated GROUP BY becomes GROUP BY U0."id", U2."name", where I want just GROUP BY U2."name".
Best I've been able to gather, is that it is related to some default ordering on models, but as you can tell, I've already accounted for that by inserting .order_by().
So I'm a little lost. I tried annotating with Raw sql instead, but in that case, I can't seem to be able to reference risk.id which is important to get the right group names for each risk.

Redshift: How to list all users in a group

Getting the list of users belonging to a group in Redshift seems to be a fairly common task but I don't know how to interpret BLOB in grolist field.
I am literally getting "BLOB" in grolist field from TeamSQL. Not so sure this is specific to TeamSQL but I kind of remember thatI got a list of IDs there instead previously in other tool
This worked for me:
select usename
from pg_user , pg_group
where pg_user.usesysid = ANY(pg_group.grolist) and
pg_group.groname='<YOUR_GROUP_NAME>';
SELECT usename, groname
FROM pg_user, pg_group
WHERE pg_user.usesysid = ANY(pg_group.grolist)
AND pg_group.groname in (SELECT DISTINCT pg_group.groname from pg_group);
This will provide the usernames along with the respective groups.
this worked better for me:
SELECT
pu.usename,
pg.groname
FROM
pg_user pu
left join pg_group pg
on pu.usesysid = ANY(pg.grolist)
order by pu.usename

How to add multiple customers to one group on opencart at once?

Is there any way to add many customers to one group at once on opencart? Its too much of a work to add 7500 customers to different groups by one by one.
If you need to add all customers to one group while you know the group ID, you can execute this SQL query in your preferred MySQL administration tool:
INSERT INTO <DB_PREFIX>customer_to_group (customer_id, group_id)
SELECT customer_id, <GROUP_ID>
FROM <DB_PREFIX>customer
Replace <DB_PREFIX> with your DB table name prefix (or none if you are not using it).
Replace <GROUP_ID> by the numeric representation of the group you want the customers to be assigned to.
You can use a similar approach if you want to insert only few customers - but again you need to know their ID's or email addresses (i.e. some unique value that could identify each customer):
INSERT INTO <DB_PREFIX>customer_to_group (customer_id, group_id)
SELECT customer_id, <GROUP_ID>
FROM <DB_PREFIX>customer
WHERE customer_id IN (<ID1>, <ID2>, <ID3>, '<ID...>')
or
INSERT INTO <DB_PREFIX>customer_to_group (customer_id, group_id)
SELECT customer_id, <GROUP_ID>
FROM <DB_PREFIX>customer
WHERE email IN ('email#address.1', 'email#address.2', 'email#address.3', '...')
Let's say you want to assign only those customers living in USA (where the country ID is <COUNTRY_ID>):
INSERT INTO <DB_PREFIX>customer_to_group (customer_id, group_id)
SELECT c.customer_id, <GROUP_ID>
FROM <DB_PREFIX>customer c
LEFT JOIN <DB_PREFIX>customer_address ca ON ca.customer_id = c.customer_id
WHERE ca.country_id = <COUNTRY_ID>
GROUP BY c.customer_id

Analyzing tweeter with hive, regex extract

I am trying to analyze what are the most popular hashtags of July. So far I am able to select tweets from July, or display the most popular tweets, but I didn't sucess in putting them together. I am thinking about creating a intermediate table with july tweets, then display the popular hashtags, but I don't know how, can you help me? What about a 2 level select (select a from select b from table) ?
SELECT hashtags.text, count(*) as total FROM tweets
WHERE regexp_extract(created_at, "(Tue) (Jul)*", 2) = "Jul"
LATERAL VIEW EXPLODE(entities.hashtags) t1 AS hashtags
GROUP BY LOWER(hashtags.text), created_at
ORDER BY total_count DESC
LIMIT 200
Regards, K.
So far, I did this, which is pretty much what I want, but is there any mean to achieve this differently ?
Working nested query:
SELECT
LOWER(hashtags.text),
COUNT(*) AS total_count
FROM (
SELECT * FROM tweets WHERE regexp_extract(created_at,"(Tue Jul)*",1) = "Tue Jul"
) tweets
LATERAL VIEW EXPLODE(entities.hashtags) t1 AS hashtags
GROUP BY LOWER(hashtags.text)
ORDER BY total_count DESC
LIMIT 15
EDIT:
Ok, so if you want you can also do it by a temporary table:
CREATE TABLE tmpdb (
id BIGINT,
created_at STRING,
source STRING,
favorited BOOLEAN,
retweet_count INT,
retweeted_status STRUCT<
text:STRING,
user:STRUCT<screen_name:STRING,name:STRING>>,
entities STRUCT<
urls:ARRAY<STRUCT<expanded_url:STRING>>,
user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
hashtags:ARRAY<STRUCT<text:STRING>>>,
text STRING,
user STRUCT<
screen_name:STRING,
name:STRING,
friends_count:INT,
followers_count:INT,
statuses_count:INT,
verified:BOOLEAN,
utc_offset:INT,
time_zone:STRING>,
in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
Then you update it:
INSERT OVERWRITE TABLE tmpdb
SELECT * FROM tweets WHERE regexp_extract(created_at,"(Tue Jul)*",1) = "Tue Jul"
And the request become as simple as this:
SELECT
LOWER(hashtags.text),
COUNT(*) AS total_count
FROM tmpdb
LATERAL VIEW EXPLODE(entities.hashtags) t1 AS hashtags
GROUP BY LOWER(hashtags.text)
ORDER BY total_count DESC
LIMIT 15
The pro/cons about the second method:
You need to update the table if you want accurate requests, so it is not suited for one-shot request, but if you need to do multiple requests on the current state of the database, then this method is better.
Don't forget that, copying a database is a costly operation ! So know when to use it :)