Redshift - Case insensitive grouping - grouping

Is there a way to make 'group by' in redshift case insensitive. I have contents in a table which have values like below. I'm trying to figure out a way to make the following 2 fields get grouped as one.
Pricewaterhousecoopers LLP
PricewaterhouseCoopers LLP
Thanks

You could group using the LOWER function, e.g.
SELECT LOWER(company) AS company, SUM(sales) AS total
FROM yourTable
GROUP BY LOWER(company);

Related

Querying Redshift Permissions via Groups

Question:
"For each user, show me the groups they belong (1) to and the tables they can query as a result of being part of that group (2)."
Details: This is on redshift.
I'm curious if anyone has done anything like this.
The first part isn't hard, I did that with this:
SELECT usename, groname
FROM pg_user,
pg_group
WHERE pg_user.usesysid = ANY (pg_group.grolist)
AND pg_group.groname in (SELECT DISTINCT pg_group.groname from pg_group)
The second part is much harder I think. Showing table permissions isn't, but that's not really what's being asked. That can be accomplished with this:
SELECT has_table_privilege(user, table, 'select'::text) as select...
But that does not show which group gives the user that permission.
Any advice is great!

Redshift: How to list all users in a group

Getting the list of users belonging to a group in Redshift seems to be a fairly common task but I don't know how to interpret BLOB in grolist field.
I am literally getting "BLOB" in grolist field from TeamSQL. Not so sure this is specific to TeamSQL but I kind of remember thatI got a list of IDs there instead previously in other tool
This worked for me:
select usename
from pg_user , pg_group
where pg_user.usesysid = ANY(pg_group.grolist) and
pg_group.groname='<YOUR_GROUP_NAME>';
SELECT usename, groname
FROM pg_user, pg_group
WHERE pg_user.usesysid = ANY(pg_group.grolist)
AND pg_group.groname in (SELECT DISTINCT pg_group.groname from pg_group);
This will provide the usernames along with the respective groups.
this worked better for me:
SELECT
pu.usename,
pg.groname
FROM
pg_user pu
left join pg_group pg
on pu.usesysid = ANY(pg.grolist)
order by pu.usename

Using another table in BigQuery Regex

I would like to map a string column to a category based on a regular expression match.
Is it possible to use another bigquery table containing the regular expressions and corresponding category for this? This would make it easier for me to update only a table when adding new categories/updating the regex, instead of having to update all queries that would use this lookup.
Query:
CASE
-- Use the entries from another table here
WHEN REGEXP_MATCH(string_to_check, cat1regex) THEN cat1
WHEN REGEXP_MATCH(string_to_check, cat2regex) THEN cat2
etc.
END
Mapping table:
Regex category
pagex|pagey xy
pagez|page1 z1
It's also possible there is another simple way to do something similar that I'm not thinking of, answers pointing those out are welcome too.
Any help would be appreciated.
Below is for BigQuery Standard SQL
#standardSQL
SELECT
string_to_check,
MAX(IF(REGEXP_CONTAINS(string_to_check, reg), category, NULL)) AS category
FROM yourTable
CROSS JOIN mappingTable
GROUP BY string_to_check
You can test / play with it using below dummy date from your question
#standardSQL
WITH `mappingTable` AS (
SELECT r'pagex|pagey' AS reg, 'xy' AS category UNION ALL
SELECT r'pagez|page1', 'z1'
),
`yourTable` AS (
SELECT string_to_check
FROM UNNEST(["pagex.com", "pagez#example.org", "page.example.net"]) AS string_to_check
)
SELECT
string_to_check,
MAX(IF(REGEXP_CONTAINS(string_to_check, reg), category, NULL)) AS category
FROM yourTable
CROSS JOIN mappingTable
GROUP BY string_to_check

REGEX help needed in Oracle

How to get all the table names from the below Sql? My sql returns only the last table name.
with t as
(select 'select col1,
(select max(col3) from dd3) max_timestamp
from dd1,
dd2
where dd1.col1 = dd2.col1
and dd1.col1 in(select col1 from dd4)' sql_text from dual)
select regexp_substr(regexp_substr(upper(sql_text), '\sFROM\s*(\w|\.|_)*'), '(\w|_|\.)+', 1,2)
from t
Thanks,
DD.
This is a more of a regex question than an Oracle question.
If you can run the sql through REPLACE(REPLACE(sql,CHR(13),' '),CHR(10),NULL) to replace all newlines with a space, so that the query fits on a single line, here is regex that will return all the tables in group 1 (for the ones after FROM) and group 3 for subsequent items in a list:
/FROM ([A-Z0-9$#_]+)(,[\s]*([A-Z0-9$#_]+))*/gi
Having multiple groups is not ideal, so I would look at the full match instead, see https://regex101.com/r/OZUalH/1/ for an example (see full match on the right, where every match has from followed by one or more tables).
But let me warn you this is not going to be robust, as these valid FROM clause expressions are not handled:
"my_table"
MY_TABLE AS A
MY_TABLE AS "a"
etc...
If it were me, I would write a function to run the query through explain plan (execute immediate 'explain plan for ...') and extract the tables from the plan tables (or possibly using SYS.DBMS_XPLAN)

How to tweak LISTAGG to support more than 4000 character in select query?

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production.
I have a table in the below format.
Name Department
Johny Dep1
Jacky Dep2
Ramu Dep1
I need an output in the below format.
Dep1 - Johny,Ramu
Dep2 - Jacky
I have tried the 'LISTAGG' function, but there is a hard limit of 4000 characters. Since my db table is huge, this cannot be used in the app. The other option is to use the
SELECT CAST(COLLECT(Name)
But my framework allows me to execute only select queries and no PL/SQL scripts.Hence i dont find any way to create a type using "CREATE TYPE" command which is required for the COLLECT command.
Is there any alternate way to achieve the above result using select query ?
You should add GetClobVal and also need to rtrim as it will return delimiter in the end of the results.
SELECT RTRIM(XMLAGG(XMLELEMENT(E,colname,',').EXTRACT('//text()')
ORDER BY colname).GetClobVal(),',') from tablename;
if you cant create types (you can't just use sql*plus to create on as a one off?), but you're OK with COLLECT, then use a built-in array. There's several knocking around in the RDBMS. run this query:
select owner, type_name, coll_type, elem_type_name, upper_bound, length
from all_coll_types
where elem_type_name = 'VARCHAR2';
e.g. on my db, I can use sys.DBMSOUTPUT_LINESARRAY which is a varray of considerable size.
select department,
cast(collect(name) as sys.DBMSOUTPUT_LINESARRAY)
from emp
group by department;
A derivative of #anuu_online but handle unescaping the XML in the result.
dbms_xmlgen.convert(xmlagg(xmlelement(E, name||',')).extract('//text()').getclobval(),1)
For IBM DB2, Casting the result to a varchar(10000) will give more than 4000.
select column1, listagg(CAST(column2 AS VARCHAR(10000)), x'0A') AS "Concat column"...
I end up in another approach using the XMLAGG function which doesn't have the hard limit of 4000.
select department,
XMLAGG(XMLELEMENT(E,name||',')).EXTRACT('//text()')
from emp
group by department;
You can use:
SELECT department
, REGEXP_REPLACE(XMLCAST(XMLAGG(XMLELEMENT(x, name, ',')) AS CLOB), ',$')
FROM emp
GROUP BY department
it will return CLOB that has no size limit, handles correctly XML entity escapes and separators.
Instead of REGEXP_REPLACE(..., ',$')) you can use RTRIM(..., ','), which should be faster, but will remove all separators from the end of the result (including those that can appear in name at the end, or previous ones if last names are empty).