I need a conditional formula to validate data that is put in the location column to only accept valid street addresses. For instance, to prevent someone from putting an email address in the location column. The location column so far accepts the email address as a street address.
You can use the following column validation formula to reject any value that contains "#":
=AND(IF(ISERROR(FIND("#",Address)),TRUE))
I am learning how to query BigQuery (especially STRUCTs and ARRAYs).
I have a table structured as follows:
Table Name: Addresses
Name (STRING), Age (INT), Address (RECORD, REPEATED)
The columns within Address are: address1, address2, city, zipcode
Question:
How do I select all columns except zipcode?
I tried querying as follows
SELECT
* EXCEPT(zipcode)
FROM Address, UNNEST(address)
The above query is retrieving the address record field column twice.
Also, the subsequent select command that runs is as follows
SELECT
Name,
Age,
Address
from temp
The address should have all columns except zipcode.
If you explode your address array, you need to rebuild it. I'm not an expert in SQL, but this code works:
select name,
age,
ARRAY_AGG(STRUCT(a.address1 as address1, a.address2 as address2, a.city as city)) as address
from Address, unnest(address) a
group by name, age
You can't use * EXCEPT here, because you need to group by your key columns to rebuild the array with ARRAY_AGG.
Consider the query below. It leaves all columns as they are, except for address, where zipcode is removed:
select * except(address),
array(select as struct * except(zipcode) from t.address) address
from Addresses t
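To make the effect of that last query concrete, here is a minimal Python sketch of the same transformation on an in-memory row. This is only an illustration, not BigQuery code: the sample values and the helper name drop_zipcode are made up, and only the column names come from the question.

```python
from typing import Any


def drop_zipcode(row: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the row whose address entries no longer carry zipcode."""
    cleaned = dict(row)  # shallow copy, so name and age stay as they are
    cleaned["address"] = [
        {key: value for key, value in addr.items() if key != "zipcode"}
        for addr in row["address"]
    ]
    return cleaned


# Hypothetical sample row shaped like the Addresses table in the question.
sample = {
    "name": "Ann",
    "age": 30,
    "address": [
        {"address1": "1 Main St", "address2": "", "city": "Austin", "zipcode": "73301"},
        {"address1": "9 Oak Ave", "address2": "Apt 2", "city": "Dallas", "zipcode": "75001"},
    ],
}

result = drop_zipcode(sample)
```

The repeated field stays one array per row, which is why no GROUP BY is needed in the array(select as struct ...) version.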
I am working with customer data, part of which involves customers' email addresses. Unfortunately there are next to no controls on the fields where customer data is entered in the system, so the data requires scrubbing.
Using the current email field, I want to create a new field populated with the customer's email address based on the condition "if # exists" and then if it doesn't exist, I will populate the email address with a blank value.
For example:
Current Email Address      New Email Address
customer1#business1.com    customer1#business1.com
customer2#business2.com    customer2#business2.com
customer3business3.com     (blank)
Can anyone help - I have scoured the internet and cannot find anything that would do this!!
Thanks
You'd probably want more controls than this to validate an email address, but here you go:
data have;
infile cards;
input cur_email:$50.;
cards4;
customer1#business1.com
customer2#business2.com
customer3business3.com
;;;;
run;
data want;
set have;
if index(cur_email,"#") then new_email=cur_email;
run;
If you want to search for a string within the email address like 'gmail' then you can use this:
if COMPRESS(TRANWRD(cur_email,'gmail','~'),'~','k')='~' then new_email=cur_email;
or to be in keeping with the first answer:
if INDEX(TRANWRD(cur_email,'gmail','~'),'~') then new_email=cur_email;
I've a table "City" with more than 100k records.
The field "name" contains strings like "Roma", "La Valletta".
I receive a file with the city name, all in upper case as in "ROMA".
I need to get the id of the record that contains "Roma" when I search for "ROMA".
In SQL, I must do something like:
select id from city where upper(name) = upper(%name%)
How can I do this in kettle?
Note: if the city is not found, I use an Insert/update field to create it, so I must avoid duplicates generated by case-sensitive names.
You can use the String Operations step in Pentaho Kettle. Set the Lower/Upper option to Y.
Pass the city name from the City table to the String Operations step, which upper-cases that field in your data stream. Then join/look up against the received file to get the required id.
More on String Operations step in pentaho wiki.
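Outside of Kettle, the idea of upper-casing the table side once and then looking up each incoming name can be sketched like this in Python. It is only an illustration of the technique, with made-up ids and a hypothetical helper name build_city_index; Python's upper() may also treat some non-ASCII characters differently from your database.

```python
from typing import Iterable, Optional


def build_city_index(cities: Iterable[tuple[int, str]]) -> dict[str, int]:
    """Map the upper-cased city name to its id; the first id wins on duplicates."""
    index: dict[str, int] = {}
    for city_id, name in cities:
        index.setdefault(name.strip().upper(), city_id)
    return index


# Hypothetical rows read from the City table as (id, name).
cities = [(1, "Roma"), (2, "La Valletta")]
index = build_city_index(cities)

# Names from the incoming file are already upper case.
found: Optional[int] = index.get("ROMA")
missing: Optional[int] = index.get("MILANO")  # None means the city must be created
```

The table is read once, so this avoids one query per incoming row.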
You can use a 'Database join' step. Here you can write the sql:
select id from city where upper(name) = upper(?)
and specify the city field name from the text file as parameter. With 'Number of rows to return' and 'Outer join?' you can control the join behaviour.
This solution doesn't work well with a large number of rows, as it will execute one query per row. In those cases Rishu's solution is better.
This is how I did it:
First, a "Modified JavaScript value" step to create the query:
var queryDest="select coalesce( (select id as idcity from city where upper(name) = upper('"+replace(mycity,"'","\'\'")+"') and upper(cap) = upper('"+mycap+"') ), 0) as idcitydest";
Then I use this string as a query in a Dynamic SQL row.
After that,
IF idcitydest == 0 then
insert new city;
else
use the found record
This approach runs one query per row of the file, but it uses very little memory for caching.
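The same look-up-then-insert flow can be sketched with bound parameters instead of string concatenation, which removes the need to escape quotes by hand. This is a sketch under assumptions, not Kettle code: it uses Python's sqlite3 with an in-memory table, and the table layout (id, name, cap) and the function name get_or_create_city are guesses based on the query above.

```python
import sqlite3


def get_or_create_city(conn: sqlite3.Connection, name: str, cap: str) -> int:
    """Return the id of the matching city (case-insensitive), inserting it if absent."""
    row = conn.execute(
        "SELECT id FROM city WHERE upper(name) = upper(?) AND upper(cap) = upper(?)",
        (name, cap),
    ).fetchone()
    if row is not None:
        return row[0]  # use the found record
    cursor = conn.execute("INSERT INTO city (name, cap) VALUES (?, ?)", (name, cap))
    return cursor.lastrowid  # id of the newly inserted city


# Assumed schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, cap TEXT)")
conn.execute("INSERT INTO city (name, cap) VALUES ('Roma', '00100')")

first = get_or_create_city(conn, "ROMA", "00100")          # existing row
second = get_or_create_city(conn, "Sant'Angelo", "00010")  # new row, quote is safe
third = get_or_create_city(conn, "SANT'ANGELO", "00010")   # same row as second
```

Note that SQLite's built-in upper() only folds ASCII letters, so other databases may behave differently for accented names.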
Say for example in column C I have all emails which contain the same domain. This field is populated by a form.
I need a function to remove the #domain.com from the field every time a new record is inserted in the column.
pseudo code:
=REGEXREPLACE(<this-cell-value>,"#domain.com","")
Assuming your data starts in row 2, try this in D2:
=ArrayFormula(iferror(regexextract(C2:C, "(.+)#")))
This should extract from col C everything that is before the #.
See if that works.
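If it helps to see what that pattern matches, here is the same extraction as a small Python sketch. The function name local_part and the sample values are made up for illustration. The pattern is greedy, so when a value contains more than one "#" it keeps everything before the last one, and a value with no "#" gives a blank, like iferror does.

```python
import re


def local_part(value: str) -> str:
    """Return everything before the last '#', or a blank if there is no '#'."""
    match = re.search(r"(.+)#", value)
    return match.group(1) if match else ""


# Hypothetical form submissions from column C.
column_c = ["alice#domain.com", "bob#domain.com", "no-marker-here"]
column_d = [local_part(value) for value in column_c]
```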
We get the user's/friends' location as follows:
location = {id = 104048449631599; name = "Menlo Park, California";};
On our side, location is split into city and state, which we store in two different columns or tables in our database. We are building the app for the US market only. Would it be safe to assume that the location name has only two components, city and state name, with "," as the delimiter? As I understand it, the id is unique. Is there an FB service that, given the id, returns the city and state information separately?
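For reference, the comma-splitting approach being asked about would look roughly like this in Python. It is only a sketch of the assumption in the question, not a confirmation that the assumption is safe: the function name split_location is made up, and it returns None when the name has no comma so that such cases can be handled separately.

```python
from typing import Optional


def split_location(name: str) -> Optional[tuple[str, str]]:
    """Split 'City, State' on the last comma; return None if there is no comma."""
    city, separator, state = name.rpartition(",")
    if not separator:
        return None  # the name does not follow the assumed 'City, State' shape
    return city.strip(), state.strip()


parts = split_location("Menlo Park, California")
unknown = split_location("California")
```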