We get the user's friends' location as follows:
location = {id = 104048449631599; name = "Menlo Park, California";};
On our side, location is split into city and state. In our database we store the city and state in two different columns (or tables). We are building the app for the US market only. Would it be safe to assume the location name has only two components, city and state name, with "," as the delimiter? As per my understanding the id is unique; is there an FB service which, given the id, will provide the city and state information separately?
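For the splitting part, a minimal sketch of the comma-delimiter assumption using MySQL-style string functions (SUBSTRING_INDEX); the staging table and column names here are hypothetical:
-- Split a name like "Menlo Park, California" into city (before the first comma)
-- and state (after the last comma)
SELECT
    fb_location_id,
    TRIM(SUBSTRING_INDEX(location_name, ',', 1))  AS city,
    TRIM(SUBSTRING_INDEX(location_name, ',', -1)) AS state
FROM fb_locations_staging;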
I am learning how to query BigQuery (especially STRUCTs and ARRAYs).
I have a table structured as in the reference below:
Table Name: Addresses
Name (STRING), Age (INT), Address (RECORD, REPEATED)
The columns within Address are: address1, address2, city, zipcode
Question:
How do I select all columns except zipcode?
I tried querying as follows
SELECT
* EXCEPT(zipcode)
FROM Addresses, UNNEST(Address)
The above query retrieves the address record columns twice.
Also, the subsequent select command that runs is as follows:
SELECT
Name,
Age,
Address
from temp
The address should have all columns except zipcode.
If you explode your address array, you need to rebuild it. I'm not an expert in SQL, but this code works:
select name,
age,
ARRAY_AGG(STRUCT(a.address1 as address1, a.address2 as address2, a.city as city)) as address
from Addresses, unnest(Address) a
group by name, age
You can't use SELECT * EXCEPT here because you need to group by your keys to rebuild the array with ARRAY_AGG.
Consider the query below; it leaves all columns as is, except for Address, from which zipcode is removed:
select * except(address),
array(select as struct * except(zipcode) from t.address) address
from Addresses t
I have CloudWatch entries that may be grouped with respect to a certain field. To be clear, assume that field is city. I would like to count the entries with respect to cities. This is the easy part:
fields city
|stats count(*) by city
However, I also want to get the maximum, minimum, and average of this count, but I cannot. Is it possible to have such a query, i.e.:
fields city
|stats avg(count(*) by city)
The console returns an error for such a query: mismatched input 'by' expecting {SYM_COMMA, SYM_RParen}
Here's how you'd do it: you'd first get the count per city (which you already figured out) and then get the metrics you want by calling the relevant functions on that count, like so:
fields city
|stats count(*) as cityCount by city
|stats avg(cityCount), max(cityCount), min(cityCount)
I have a data warehouse where a lot of values are stored as coded values. Coded columns store a numeric value that relates to a row in the CODE_VALUE table; that row contains descriptive information for the code. For example, the ADDRESS table has an ADDRESS_TYPE_CD column, and the address type can be home/office/postal etc. The output from selecting these columns is a list of numbers such as 121234.0/2323234.0/2321344.0, so we need to query the CODE_VALUE table to get their descriptions.
We have created a function which hits the CODE_VALUE table and gets the description for these codes. But when I use the function to change codes to their descriptions, it takes almost 15 minutes for a query that otherwise takes a few seconds. So I was thinking of loading the table permanently into a cache. Any suggestions on how this can be dealt with?
A solution being used by another system is described below
I have been using Cerner to query the database, which uses user access routines (UARs) to convert these code values, and they are very quick. Generally they are written in C++. The routine uses the global code cache to look up the display value for the code_value that is passed to it. That UAR never hits Oracle directly. The code cache does pull the values from the CODE_VALUE table and load them into memory. So the code cache system is hitting Oracle and doing memory swapping to cache the code values, but the UAR is hitting that cached data instead of querying the CODE_VALUE table.
EXAMPLE :
Person table
person_id(PK)
person_type_cd
birth_dt_cd
deceased_cd
race_cd
name
Visit table
visit_id(PK)
person_id(FK)
visit_type_cd
hospital_cd
visit_dt_tm
disch_dt_tm
reason_visit_cd
Address table
address_id(PK)
person_id(FK)
address_type_cd
street
suburb_cd
state_cd
country_cd
code_value table
code_value
code_set
description
DATA:
code_value table
code_value   code_set   description
visit_type:
121212       12         admitted
122233       12         emergency
121233       12         outpatient
address_type:
1234434      233        home
23234        233        office
343434       233        postal
ALTER function [dbo].[getDispByCv](@cv int)
returns varchar(80)
as begin
    -- Returns the code value display
    declare @ret varchar(80)
    select @ret = cv.DESCRIPTION
    from CODE_VALUE cv
    where cv.code_value = @cv
    and cv.active_ind = 1
    return isnull(@ret, 0)
end;
Final query :
SELECT
v.PERSON_ID as PersonID
, v.ENCNTR_ID as EncntrID
, [EMR_DWH].dbo.[getDispByCv](v.hospital_cd) as Hospital
, [EMR_DWH].dbo.[getDispByCv](v.visit_type_cd) as VisitType
from visit v
SELECT
v.PERSON_ID as PersonID
, v.ENCNTR_ID as EncntrID
, [EMR_DWH].dbo.[getDispByCv](v.hospital_cd) as Hospital
, [EMR_DWH].dbo.[getDispByCv](v.visit_type_cd) as VisitType
, [EMR_DWH].dbo.[getDispByCv](p.person_type_cd) as PersonType
, [EMR_DWH].dbo.[getDispByCv](p.deceased_cd) as Deceased
, [EMR_DWH].dbo.[getDispByCv](a.address_type_cd) as AddressType
, [EMR_DWH].dbo.[getDispByCv](a.country_cd) as Country
from visit v
,person p
,address a
where v.visit_id = 102288.0
and v.person_id = p.person_id
and p.person_id = a.person_id
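For comparison, here is a sketch of the same lookup written with joins against CODE_VALUE instead of the scalar function (table and column names are taken from the example above); this is not the Cerner code-cache approach, just a set-based alternative that avoids calling the function once per row and column:
-- Join CODE_VALUE once per coded column instead of calling a scalar UDF per row
select
      v.PERSON_ID as PersonID
    , v.ENCNTR_ID as EncntrID
    , hosp.description as Hospital
    , vtype.description as VisitType
from visit v
left join CODE_VALUE hosp
    on hosp.code_value = v.hospital_cd and hosp.active_ind = 1
left join CODE_VALUE vtype
    on vtype.code_value = v.visit_type_cd and vtype.active_ind = 1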
I've a table "City" with more than 100k records.
The field "name" contains strings like "Roma", "La Valletta".
I receive a file with the city name, all in upper case as in "ROMA".
I need to get the id of the record that contains "Roma" when I search for "ROMA".
In SQL, I must do something like:
select id from city where upper(name) = upper(%name%)
How can I do this in Kettle?
Note: if the city is not found, I use an Insert/Update step to create it, so I must avoid duplicates generated by case-sensitive names.
You can make use of the String Operations step in Pentaho Kettle. Set the Lower/Upper option to upper case.
Pass the city (name) from the City table to the String Operations step, which will upper-case your data stream, i.e. the city name. Then join/lookup with the received file and get the required id.
More on the String Operations step in the Pentaho wiki.
You can use a 'Database join' step. Here you can write the SQL:
select id from city where upper(name) = upper(?)
and specify the city field name from the text file as parameter. With 'Number of rows to return' and 'Outer join?' you can control the join behaviour.
This solution doesn't work well with a large number of rows, as it will execute one query per row. In those cases Rishu's solution is better.
This is how I did it:
First, a 'Modified JavaScript value' step to create the query:
var queryDest="select coalesce( (select id as idcity from city where upper(name) = upper('"+replace(mycity,"'","\'\'")+"') and upper(cap) = upper('"+mycap+"') ), 0) as idcitydest";
Then I use this string as the query in a 'Dynamic SQL row' step.
After that,
IF idcitydest == 0 then
insert new city;
else
use the found record
This approach runs one query per row of the file, but it uses very little cache memory.
I have a list of more than 10,000 cities, and I need to find the corresponding state and country for each city. If any built-in service or web service is available for this, please let me know.
Your help will be appreciated.
Thank you in advance.
Maybe this free database can help you:
http://www.maxmind.com/en/worldcities
Includes the following fields:
- Country Code
- ASCII City Name
- City Name
- Region
- Population
- Latitude
- Longitude
If you need the states, you can use the database from here: http://download.geonames.org/export/dump/ (the file cities1000.zip has a LOT of cities)
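If you go the GeoNames route, one way to resolve the state is to load cities1000.txt together with admin1CodesASCII.txt (from the same dump directory, keyed as countrycode.admin1code) and join them. A rough sketch, assuming the two files have been imported into tables named geonames_cities and geonames_admin1 with the column names used below:
-- cities1000.txt columns include name, country_code, admin1_code (tab-separated)
-- admin1CodesASCII.txt columns include code ('US.CA' style) and name
SELECT
    c.name         AS city,
    a.name         AS state,
    c.country_code AS country
FROM geonames_cities c
LEFT JOIN geonames_admin1 a
    ON a.code = CONCAT(c.country_code, '.', c.admin1_code);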