BigQuery: create view with comments added to the top of the query

BigQuery: create view with comments added to the top of the query - google-cloud-platform

I want to create a view in BigQuery that also shows comments such as the author, date created etc.
But if I try this in the UI the comments are left out.
Is there a way to do this in the BigQuery UI?
Or are there other ways using bq client or python, or ...?
So for example if I run this:
CREATE OR REPLACE VIEW `my_project_id.my_dataset.my_view_name`
AS
-- this is my important comment. This will be a long and extensive comment.
SELECT 1 as column_a
;
BigQuery will not show the comments in the view when the UI is used to create that view:

Bigquery is skipping the "---- this is my important comment. This will be a long and extensive comment." because it is treated as a comment on the SQL query and it is not treated as a separate string to be included in the creation of view.
Another option is to use bq command as shown below.
bq mk \
--use_legacy_sql=false \
--expiration 3600 \
--description "This is my view" \
--label organization:development \
--view \
'-- this is my important comment. This will be a long and extensive comment.
SELECT 1 as column_a ' \
your-dataset.your-view
My sample output:

I don't know how to do this in the UI, but with the python API you can do as follows:
from google.cloud import bigquery
bq_client = bigquery.Client()
view_id = "my_project_id.my_dataset.my_view_name"
view = bigquery.Table(view_id)
query = """
-- this is my important comment. This will be a long and extensive comment.
SELECT 1 as column_a
"""
view.view_use_legacy_sql = False
view.view_query = query
# if your view already exists
bq_client.delete_table(view)
# your query will now show the comment at the top
bq_client.create_table(view)
This results in the following view:
See also: https://cloud.google.com/bigquery/docs/views#python

Related

Can you integrate GCP labels into dbt sql?

I am trying to add GCP labels to my dbt sql files and running into some errors.
In the BigQuery UI you can prepend any query with label key/value pairs like this:
SET ##query_label = "key1:value1,key2:value2";
I know the keys and values I am using are correct because they work in the UI.
Is there anything in how dbt parses/processes the sql that could be failing due to the ## symbols?
I tried adding SET ##query_label = "key1:value1,key2:value2"; after the config section and it fails

You can specify labels for the views and tables you create. These labels need to be provided in either the dbt_project.yml file, or within the model config. See the examples below:
# this is your dbt_project.yml
models:
your_project:
marketing:
+labels:
domain: marketing
another_key: another_value
finance:
+labels:
domain: finance
-- in the config of the model
{{
config(
materialized = "table",
labels = {'domain': 'finance'}
)
}}
select * from {{ ref('fct_revenue') }}
You can find more information in the docs here.

Primary key validation after Django's sqlsequencereset [duplicate]

I'm following up in regards to a question that I asked earlier in which I sought to seek a conversion from a goofy/poorly written mysql query to postgresql. I believe I succeeded with that. Anyways, I'm using data that was manually moved from a mysql database to a postgres database. I'm using a query that looks like so:
UPDATE krypdos_coderound cru
set is_correct = case
when t.kv_values1 = t.kv_values2 then True
else False
end
from
(select cr.id,
array_agg(
case when kv1.code_round_id = cr.id
then kv1.option_id
else null end
) as kv_values1,
array_agg(
case when kv2.code_round_id = cr_m.id
then kv2.option_id
else null end
) as kv_values2
from krypdos_coderound cr
join krypdos_value kv1 on kv1.code_round_id = cr.id
join krypdos_coderound cr_m
on cr_m.object_id=cr.object_id
and cr_m.content_type_id =cr.content_type_id
join krypdos_value kv2 on kv2.code_round_id = cr_m.id
WHERE
cr.is_master= False
AND cr_m.is_master= True
AND cr.object_id=%s
AND cr.content_type_id=%s
GROUP BY cr.id
) t
where t.id = cru.id
""" % ( self.object_id, self.content_type.id)
)
I have reason to believe that this works well. However, this has lead to a new issue. When trying to submit, I get an error from django that states:
IntegrityError at (some url):
duplicate key value violates unique constraint "krypdos_value_pkey"
I've looked at several of the responses posted on here and I haven't quite found the solution to my problem (although the related questions have made for some interesting reading). I see this in my logs, which is interesting because I never explicitly call insert- django must handle it:
STATEMENT: INSERT INTO "krypdos_value" ("code_round_id", "variable_id", "option_id", "confidence", "freetext")
VALUES (1105935, 11, 55, NULL, E'')
RETURNING "krypdos_value"."id"
However, trying to run that results in the duplicate key error. The actual error is thrown in the code below.
# Delete current coding
CodeRound.objects.filter(
object_id=o.id, content_type=object_type, is_master=True
).delete()
code_round = CodeRound(
object_id=o.id,
content_type=object_type,
coded_by=request.user, comments=request.POST.get('_comments',None),
is_master=True,
)
code_round.save()
for key in request.POST.keys():
if key[0] != '_' or key != 'csrfmiddlewaretoken':
options = request.POST.getlist(key)
for option in options:
Value(
code_round=code_round,
variable_id=key,
option_id=option,
confidence=request.POST.get('_confidence_'+key, None),
).save() #This is where it dies
# Resave to set is_correct
code_round.save()
o.status = '3'
o.save()
I've checked the sequences and such and they seem to be in order. At this point I'm not sure what to do- I assume it's something on django's end but I'm not sure. Any feedback would be much appreciated!

This happend to me - it turns out you need to resync your primary key fields in Postgres. The key is the SQL statement:
SELECT setval('tablename_id_seq', (SELECT MAX(id) FROM tablename)+1);

It appears to be a known difference of behaviour between the MySQL and SQLite (they update the next available primary key even when inserting an object with an explicit id) backends, and other backends like Postgres, Oracle, ... (they do not).
There is a ticket describing the same issue. Even though it was closed as invalid, it provides a hint that there is a Django management command to update the next available key.
To display the SQL updating all next ids for the application MyApp:
python manage.py sqlsequencereset MyApp
In order to have the statement executed, you can provide it as the input for the dbshell management command. For bash, you could type:
python manage.py sqlsequencereset MyApp | python manage.py dbshell
The advantage of the management commands is that abstracts away the underlying DB backend, so it will work even if later migrating to a different backend.

I had an existing table in my "inventory" app and I wanted to add new records in Django admin and I got this error:
Duplicate key value violates unique constraint "inventory_part_pkey"
DETAIL: Key (part_id)=(1) already exists.
As mentioned before, I run the code below to get the SQL command to reset the id-s:
python manage.py sqlsequencereset inventory
Piping the python manage.py sqlsequencereset inventory | python manage.py dbshell to the shell was not working
So I copied the generated raw SQL command
Then opened pgAdmin3 https://www.pgadmin.org for postgreSQL and opened my db
Clicked on the 6. icon (Execute arbitrary SQL queries)
Copied the statement what was generated
In my case the raw SQL command was:
BEGIN;
SELECT setval(pg_get_serial_sequence('"inventory_signup"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "inventory_signup";
SELECT setval(pg_get_serial_sequence('"inventory_supplier"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "inventory_supplier";
COMMIT;
Executed it with F5.
This fixed everything.

In addition to zapphods answer:
In my case the indexing was indeed incorrect, since I had deleted all migrations, and the database probably 10-15 times when developing as I wasn't in the stage of migrating anything.
I was getting an IntegrityError on finished_product_template_finishedproduct_pkey
Reindex the table and restart runserver:
I was using pgadmin3 and for whichever index was incorrect and throwing duplicate key errors I navigated to the constraints and reindexed.
And then reindexed.

The solution is that you need to resync your primary key fields as reported by "Hacking Life" who wrote an example SQL code but, as suggested by "Ad N" is better to run the Django command sqlsequencereset to get the exact SQL code that you can copy and past or run with another command.
As a further improvement to these answers I would suggest to you and other reader to dont' copy and paste the SQL code but, more safely, to execute the SQL query generated by sqlsequencereset from within your python code in this way (using the default database):
from django.core.management.color import no_style
from django.db import connection
from myapps.models import MyModel1, MyModel2
sequence_sql = connection.ops.sequence_reset_sql(no_style(), [MyModel1, MyModel2])
with connection.cursor() as cursor:
for sql in sequence_sql:
cursor.execute(sql)
I tested this code with Python3.6, Django 2.0 and PostgreSQL 10.

If you want to reset the PK on all of your tables, like me, you can use the PostgreSQL recommended way:
SELECT 'SELECT SETVAL(' ||
quote_literal(quote_ident(PGT.schemaname) || '.' || quote_ident(S.relname)) ||
', COALESCE(MAX(' ||quote_ident(C.attname)|| '), 1) ) FROM ' ||
quote_ident(PGT.schemaname)|| '.'||quote_ident(T.relname)|| ';'
FROM pg_class AS S,
pg_depend AS D,
pg_class AS T,
pg_attribute AS C,
pg_tables AS PGT
WHERE S.relkind = 'S'
AND S.oid = D.objid
AND D.refobjid = T.oid
AND D.refobjid = C.attrelid
AND D.refobjsubid = C.attnum
AND T.relname = PGT.tablename
ORDER BY S.relname;
After running this query, you will need to execute the results of the query. I typically copy and paste into Notepad. Then I find and replace "SELECT with SELECT and ;" with ;. I copy and paste into pgAdmin III and run the query. It resets all of the tables in the database. More "professional" instructions are provided at the link above.

If you have manually copied the databases, you may be running into the issue described here.

I encountered this error because I was passing extra arguments to the save method in the wrong way.
For anybody who encounters this, try forcing UPDATE with:
instance_name.save(..., force_update=True)
If you get an error that you cannot pass force_insert and force_update at the same time, you're probably passing some custom arguments the wrong way, like I did.

This question was asked about 9 years ago, and lots of people gave their own ways to solve it.
For me, I put unique=True in my email custom model field, but while creating superuser I didn't ask for the email to be mandatory.
Now after creating a superuser my email field is just saved as blank or Null. Now this is how I created and saved new user
obj = mymodel.objects.create_user(username='abc', password='abc')
obj.email = 'abc#abc.com'
obj.save()
It just threw the error saying duplicate-key-value-violates in the first line because the email was set to empty by default which was the same with the admin user. Django spotted a duplicate !!!
Solution
Option1: Make email mandatory while creating any user (for superuser as well)
Option2: Remove unique=True and run migrations
Option3: If you don't know where are the duplicates, you either drop the column or you can clear the database using python manage.py flush
It is highly recommended to know the reason why the error occurred in your case.

I was getting the same error as the OP.
I had created some Django models, created a Postgres table based on the models, and added some rows to the Postgres table via Django Admin. Then I fiddled with some of the columns in the models (changing around ForeignKeys, etc.) but had forgotten to migrate the changes.
Running the migration commands solved my problem, which makes sense given the SQL answers above.
To see what changes would be applied, without actually applying them:
python manage.py makemigrations --dry-run --verbosity 3
If you're happy with those changes, then run:
python manage.py makemigrations
Then run:
python manage.py migrate

I was getting a similar issue and nothing seemed to be working. If you need the data (ie cant exclude it when doing dump) make sure you have turned off (commented) any post_save receivers. I think the data would be imported but it would create the same model again because of these. Worked for me.

You just have to go to pgAdmin III and there execute your script with the name of the table:
SELECT setval('tablename_id_seq', (SELECT MAX(id) FROM tablename)+1);

Based on Paolo Melchiorre's answer, I wrote a chunk as a function to be called before any .save()
from django.db import connection
def setSqlCursor(db_table):
sql = """SELECT pg_catalog.setval(pg_get_serial_sequence('"""+db_table+"""', 'id'), MAX(id)) FROM """+db_table+""";"""
with connection.cursor() as cursor:
cursor.execute(sql)

This is the right statement. Mostly, It happens when we insert rows with id field.
SELECT setval('tablename_id_seq', (SELECT MAX(id) FROM tablename));

Using Firestore with Flask/Jinja, correct method to query and display results to DOM

#app.route('/view_case/<case_name>')
def view_case(case_name):
query = db.collection('cases').document(case_name).collection('documents').get()
documents = []
for _document in query:
documents.append(_document)
return render_template('views/view_case.html', documents=documents)
Is the above method the correct way to query a group of documents and send them to the DOM as a list to be iterated over by jinja to display?
Side question, I notice the results dont include the documents ID, is there a way to attach the id to the document?

Just amended to use list comprehension
from google.cloud import firestore
db = firestore.Client()
collection_ref = db.collection(u'collection').get()
documents = list(doc.to_dict() for doc in collection_ref)

Tweepy API search doesn't have keyword

I am working with Tweepy (python's REST API client) and I'm trying to find tweets by several keywords and without url included in tweet.
But search results are not up to our satisfaction. Looks like query has erros and was stopped. Additionally we had observed that results were returned one-by-one not (as previously) in bulk packs of 100.
Could you please tell me why this search does not work properly?
We wanted to get all tweets mentioning 'Amazon' without any URL links in the text.
We used search shown below. Search results were still containing tweets with URLs or without 'Amazon' keyword.
Could you please let us know what we are doing wrong?
auth = tweepy.AppAuthHandler(consumer_key, consumer_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
searchQuery = 'Amazon OR AMAZON OR amazon filter:-links' # Keyword
new_tweets = api.search(q=searchQuery, count=100,
result_type = "recent",
max_id = sinceId,
lang = "en")

The minus sign should be put before "filter", not before "links", like this:
searchQuery = 'Amazon OR AMAZON OR amazon -filter:links'
Also, I doubt that the count = 100 option is a valid one, since it is not listed on the API documentation (which may not be very up-to-date, though). Try to replace that with rpp = 100 to get tweets in bulk packs.
I am not sure why some of the tweets you find do not contain the "Amazon" keyword, but a possibility is that "Amazon" is contained within the username of the poster. I do not know if you can filter that directly in the query, or even if you would want to filter it, since it would mean you would reject tweets from the official Amazon accounts. I would suggest that, for each tweet the query returns, you check it to make sure it does contain "Amazon".

Pulling data from datastore and converting it in Json in python(Google Appengine)

I am creating an apllication using google appengine, in which i am fetching a data from the website and storing it in my Database (Data store).Now whenever user hits my application url as "application_url\name =xyz&city= abc",i am fetching the data from the DB and want to show it as json.Right now i am using a filter to fetch data based on the name and city but getting output as [].I dont know how to get data from this.My code looks like this:
class MainHandler(webapp2.RequestHandler):
def get(self):
commodityname = self.request.get('veg',"Not supplied")
market = self.request.get('market',"No market found with this name")
self.response.write(commodityname)
self.response.write(market)
query = commoditydata.all()
logging.info(commodityname)
query.filter('commodity = ', commodityname)
result = query.fetch(limit = 1)
logging.info(result)
and the db structure for "commoditydata" table is
class commoditydata(db.Model):
commodity= db.StringProperty()
market= db.StringProperty()
arrival= db.StringProperty()
variety= db.StringProperty()
minprice= db.StringProperty()
maxprice= db.StringProperty()
modalprice= db.StringProperty()
reporteddate= db.DateTimeProperty(auto_now_add = True)
Can anyone tell me how to get data from the db using name and market and covert it in Json.First getting data from db is the more priority.Any suggestions will be of great use.

If you are starting with a new app, I would suggest to use the NDB API rather than the old DB API. Your code would look almost the same though.
As far as I can tell from your code sample, the query should give you results as far as the HTTP query parameters from the request would match entity objects in the datastore.
I can think of some possible reasons for the empty result:
you only think the output is empty, because you use write() too early; app-engine doesn't support streaming of response, you must write everything in one go and you should do this after you queried the datastore
the properties you are filtering are not indexed (yet) in the datastore, at least not for the entities you were looking for
the filters are just not matching anything (check the log for the values you got from the request)
your query uses a namespace different from where the data was stored in (but this is unlikely if you haven't explicitly set namespaces anywhere)
In the Cloud Developer Console you can query your datastore and even apply filters, so you can see the results with-out writing actual code.
Go to https://console.developers.google.com
On the left side, select Storage > Cloud Datastore > Query
Select the namespace (default should be fine)
Select the kind "commoditydata"
Add filters with example values you expect from the request and see how many results you get
Also look into Monitoring > Log which together with your logging.info() calls is really helpful to better understand what is going on during a request.
The conversion to JSON is rather easy, once you got your data. In your request handler, create an empty list of dictionaries. For each object you get from the query result: set the properties you want to send, define a key in the dict and set the value to the value you got from the datastore. At the end dump the dictionary as JSON string.
class MainHandler(webapp2.RequestHandler):
def get(self):
commodityname = self.request.get('veg')
market = self.request.get('market')
if commodityname is None and market is None:
# the request will be complete after this:
self.response.out.write("Please supply filters!")
# everything ok, try query:
query = commoditydata.all()
logging.info(commodityname)
query.filter('commodity = ', commodityname)
result = query.fetch(limit = 1)
logging.info(result)
# now build the JSON payload for the response
dicts = []
for match in result:
dicts.append({'market': match.market, 'reporteddate': match.reporteddate})
# set the appropriate header of the response:
self.response.headers['Content-Type'] = 'application/json; charset=utf-8'
# convert everything into a JSON string
import json
jsonString = json.dumps(dicts)
self.response.out.write( jsonString )

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

BigQuery: create view with comments added to the top of the query - google-cloud-platform

Related

Can you integrate GCP labels into dbt sql?

Primary key validation after Django's sqlsequencereset [duplicate]

Using Firestore with Flask/Jinja, correct method to query and display results to DOM

Tweepy API search doesn't have keyword

Pulling data from datastore and converting it in Json in python(Google Appengine)

Categories

Resources