Trouble authenticating and writing to database locally - google-cloud-platform

I'm having trouble authenticating and writing data to a spanner database locally. All imports are up to date - google.cloud, google.auth2, etc. I have tried having someone else run this and it works fine, so the problem seems to be something on my end - something wrong or misconfigured on my computer, maybe where the credentials are stored or something?
Anyone have any ideas?
from google.cloud import spanner
from google.api_core.exceptions import GoogleAPICallError
from google.api_core.datetime_helpers import DatetimeWithNanoseconds
import datetime
from google.oauth2 import service_account
def write_to(database):
record = [[
1041613562310836275,
'test_name'
]]
columns = ("id", "name")
insert_errors = []
try:
with database.batch() as batch:
batch.insert_or_update(
table = "guild",
columns = columns,
values = record,
)
except GoogleAPICallError as e:
print(f'error: {e}')
insert_errors.append(e.message)
pass
return insert_errors
if __name__ == "__main__":
credentials = service_account.Credentials.from_service_account_file(r'path\to\a.json')
instance_id = 'instance-name'
database_id = 'database-name'
spanner_client = spanner.Client(project='project-name', credentials=credentials)
print(f'spanner creds: {spanner_client.credentials}')
instance = spanner_client.instance(instance_id)
database = instance.database(database_id)
insert_errors = write_to(database)
some credential tests:
creds = service_account.Credentials.from_service_account_file(a_json)
<google.oauth2.service_account.Credentials at 0x...>
spanner_client.credentials
<google.auth.credentials.AnonymousCredentials at 0x...>
spanner_client.credentials.signer_email
AttributeError: 'AnonymousCredentials' object has no attribute 'signer_email'
creds.signer_email
'...#....iam.gserviceaccount.com'
spanner.Client().from_service_account_json(a_json).credentials
<google.auth.credentials.AnonymousCredentials object at 0x...>

The most common reason for this is that you have accidentally set (or forgot to unset) the environment variable SPANNER_EMULATOR_HOST. If this environment variable has been set, the client library will try to connect to the emulator instead of Cloud Spanner. This will cause the client library to wait for a long time while trying to connect to the emulator (assuming that the emulator is not running on your machine). Unset the environment variable to fix this problem.
Note: This environment variable will only affect Cloud Spanner client libraries, which is why other Google Cloud product will work on the same machine. The script will also in most cases work on other machines, as they are unlikely to have this environment variable set.

Related

Airflow SSHOperator: How To Securely Access Pem File Across Tasks?

We are running Airflow via AWS's managed MWAA Offering. As part of their offering they include a tutorial on securely using the SSH Operator in conjunction with AWS Secrets Manager. The gist of how their solution works is described below:
Run a Task that fetches the pem file from a Secrets Manager location and store it on the filesystem at /tmp/mypem.pem.
In the SSH Connection include the extra information that specifies the file location
{"key_file":"/tmp/mypem.pem"}
Use the SSH Connection in the SSHOperator.
In short the workflow is supposed to be:
Task1 gets the pem -> Task2 uses the pem via the SSHOperator
All of this is great in theory, but it doesn't actually work. It doesn't work because Task1 may run on a different node from Task2, which means Task2 can't access the /tmp/mypem.pem file location that Task1 wrote the file to. AWS is aware of this limitation according to AWS Support, but now we need to understand another way to do this.
Question
How can we securely store and access a pem file that can then be used by Tasks running on different nodes via the SSHOperator?
I ran into the same problem. I extended the SSHOperator to do both steps in one call.
In AWS Secrets Manager, two keys are added for airflow to retrieve on execution.
{variables_prefix}/airflow-user-ssh-key : the value of the private key
{connections_prefix}/ssh_airflow_user : ssh://replace.user#replace.remote.host?key_file=%2Ftmp%2Fairflow-user-ssh-key
from typing import Optional, Sequence
from os.path import basename, splitext
from airflow.models import Variable
from airflow.providers.ssh.operators.ssh import SSHOperator
from airflow.providers.ssh.hooks.ssh import SSHHook
class SSHOperator(SSHOperator):
"""
SSHOperator to execute commands on given remote host using the ssh_hook.
:param ssh_conn_id: :ref:`ssh connection id<howto/connection:ssh>`
from airflow Connections.
:param ssh_key_var: name of Variable holding private key.
Creates "/tmp/{variable_name}.pem" to use in SSH connection.
May also be inferred from "key_file" in "extras" in "ssh_conn_id".
:param remote_host: remote host to connect (templated)
Nullable. If provided, it will replace the `remote_host` which was
defined in `ssh_hook` or predefined in the connection of `ssh_conn_id`.
:param command: command to execute on remote host. (templated)
:param timeout: (deprecated) timeout (in seconds) for executing the command. The default is 10 seconds.
Use conn_timeout and cmd_timeout parameters instead.
:param environment: a dict of shell environment variables. Note that the
server will reject them silently if `AcceptEnv` is not set in SSH config.
:param get_pty: request a pseudo-terminal from the server. Set to ``True``
to have the remote process killed upon task timeout.
The default is ``False`` but note that `get_pty` is forced to ``True``
when the `command` starts with ``sudo``.
"""
template_fields: Sequence[str] = ("command", "remote_host")
template_ext: Sequence[str] = (".sh",)
template_fields_renderers = {"command": "bash"}
def __init__(
self,
*,
ssh_conn_id: Optional[str] = None,
ssh_key_var: Optional[str] = None,
remote_host: Optional[str] = None,
command: Optional[str] = None,
timeout: Optional[int] = None,
environment: Optional[dict] = None,
get_pty: bool = False,
**kwargs,
) -> None:
super().__init__(
ssh_conn_id=ssh_conn_id,
remote_host=remote_host,
command=command,
timeout=timeout,
environment=environment,
get_pty=get_pty,
**kwargs,
)
if ssh_key_var is None:
key_file = SSHHook(ssh_conn_id=self.ssh_conn_id).key_file
key_filename = basename(key_file)
key_filename_no_extension = splitext(key_filename)[0]
self.ssh_key_var = key_filename_no_extension
else:
self.ssh_key_var = ssh_key_var
def import_ssh_key(self):
with open(f"/tmp/{self.ssh_key_var}", "w") as file:
file.write(Variable.get(self.ssh_key_var))
def execute(self, context):
self.import_ssh_key()
super().execute(context)
The answer by holly is good. I am sharing a different way I solved this problem. I used the strategy of converting the SSH Connection into a URI and then input that into Secrets Manager under the expected connections path, and everything worked great via the SSH Operator. Below are the general steps I took.
Generate an encoded URI
import json
from airflow.models.connection import Connection
from pathlib import Path
pem = Path(“/my/pem/file”/pem).read_text()
myconn= Connection(
conn_id="connX”,
conn_type="ssh",
host="10.x.y.z,
login=“mylogin”,
extra=json.dumps(dict(private_key=pem)),
print(myconn.get_uri())
Input that URI under the environment's configured path in Secrets Manager. The important note here is to input the value in the plaintext field without including a key. Example:
airflow/connections/connX and under Plaintext only include the URI value
Now in the SSHOperator you can reference this connection Id like any other.
remote_task = SSHOperator(
task_id="ssh_and_execute_command",
ssh_conn_id="connX"
command="whoami",
)

How to configure firebase admin at Django server?

I am trying to add custom token for user authentication (phone number and password) and as per reference documents I would like to configure server to generate custom token.
I have installed, $ sudo pip install firebase-admin
and also setup an environment : export GOOGLE_APPLICATION_CREDENTIALS="[to json file at my server]"
I am using Django project at my server where i have created all my APIs.
I am stucked at this point where it says to initialize app:
default_app = firebase_admin.initialize_app()
Where should i write the above statement within Django files? and how should i generate endpoint to get custom token?
Regards,
PD
pip install firebase-admin
credentials.json file includes some private keys. So, you can’t add to your project directly. If you’re using the git version system and you want to host this file in your project folder, you must add the file name to your “.gitignore”.
Set your operation system environ variable. You can use for MacOSX or Linux distributions; For set a variable in window os (https://www.computerhope.com/issues/ch000549.htm).
$ export GOOGLE_APPLICATION_CREDENTIALS='/path/to/credentials.json'
This part is important, google package (It came with firebase_admin package) looking at some conditions for credentials. One of them is os.environ.get(‘GOOGLE_APPLICATION_CREDENTIALS’). If you set this file, than you don’t need to anythig fot initialize firebase. Otherwise you should define manually.
For initial firebase look at. Set up configurations (https://firebase.google.com/docs/firestore/quickstart#initialize )
Create a file named “firebase.py”.
$ touch firebase.py
Now we can use the “firebase_admin” package for querying. Our firebase.py seems like;
import time
from datetime import timedelta
from uuid import uuid4
from firebase_admin import firestore, initialize_app
__all__ = ['send_to_firebase', 'update_firebase_snapshot']
initialize_app()
def send_to_firebase(raw_notification):
db = firestore.client()
start = time.time()
db.collection('notifications').document(str(uuid4())).create(raw_notification)
end = time.time()
spend_time = timedelta(seconds=end - start)
return spend_time
def update_firebase_snapshot(snapshot_id):
start = time.time()
db = firestore.client()
db.collection('notifications').document(snapshot_id).update(
{'is_read': True}
)
end = time.time()
spend_time = timedelta(seconds=end - start)
return spend_time
You Refer this link(https://medium.com/#canadiyaman/how-to-use-firebase-with-django-project-34578516bafe)

InvalidQueryException: Consistency level LOCAL_ONE is not supported for this operation. Supported consistency levels are: LOCAL_QUORUM

import org.apache.spark._
import org.apache.spark.SparkContext._
import com.datastax.spark.connector._
import com.datastax.spark.connector.cql.CassandraConnector
val conf = new SparkConf()
.setMaster("local[*]")
.setAppName("XXXX")
.set("spark.cassandra.connection.host" ,"cassandra.us-east-2.amazonaws.com")
.set("spark.cassandra.connection.port", "9142")
.set("spark.cassandra.auth.username", "XXXXX")
.set("spark.cassandra.auth.password", "XXXXX")
.set("spark.cassandra.connection.ssl.enabled", "true")
.set("spark.cassandra.connection.ssl.trustStore.path", "/home/nihad/.cassandra/cassandra_truststore.jks")
.set("spark.cassandra.connection.ssl.trustStore.password", "XXXXX")
.set("spark.cassandra.output.consistency.level", "LOCAL_QUORUM")
val connector = CassandraConnector(conf)
val session = connector.openSession()
sesssion.execute("""INSERT INTO "covid19".delta_by_states (state_code, state_value, date ) VALUES ('kl', 5, '2020-03-03');""")
session.close()
i amn trying to write data to AWS Cassandra Keyspace using Spark App set in my local system.
Problem is when i execute above code, I get Exception like below:
"com.datastax.oss.driver.api.core.servererrors.InvalidQueryException:
Consistency level LOCAL_ONE is not supported for this operation.
Supported consistency levels are: LOCAL_QUORUM"
As you can see from the above code I have already set cassandra.output.consistency.level as LOCAL_QUORUM in Spark Conf. Also I am using datastax cassandra driver.
But when I read data from AWS Cassandra, it works fine. Also I tried same INSERT command in AWS Keyspace cqlsh. It is working fine there too. So Query is valid.
Can someone help me how to set consistency via datastax.CassandraConnector?
Cracked it.
Instead of setting cassandra consistency via spark config. I created an application.conf file in src/main/resources directory.
datastax-java-driver {
basic.contact-points = [ "cassandra.us-east-2.amazonaws.com:9142"]
advanced.auth-provider{
class = PlainTextAuthProvider
username = "serviceUserName"
password = "servicePassword"
}
basic.load-balancing-policy {
local-datacenter = "us-east-2"
}
advanced.ssl-engine-factory {
class = DefaultSslEngineFactory
truststore-path = "yourPath/.cassandra/cassandra_truststore.jks"
truststore-password = "trustorePassword"
}
basic.request.consistency = LOCAL_QUORUM
basic.request.timeout = 5 seconds
}
and created cassandra session like below
import com.datastax.oss.driver.api.core.config.DriverConfigLoader
import com.datastax.oss.driver.api.core.CqlSession
val loader = DriverConfigLoader.fromClassPath("application.conf")
val session = CqlSession.builder().withConfigLoader(loader).build()
sesssion.execute("""INSERT INTO "covid19".delta_by_states (state_code, state_value, date ) VALUES ('kl', 5, '2020-03-03');""")
It finally worked. No need to mess with spark config
Doc for Driver Config https://docs.datastax.com/en/drivers/java/4.0/com/datastax/oss/driver/api/core/config/DriverConfigLoader.html#fromClasspath-java.lang.String-
datastax configuration doc https://docs.datastax.com/en/developer/java-driver/4.6/manual/core/configuration/reference/

How can I get reports from Google Cloud Storage using the Google's API

I have to create a program that get informations on a daily basis about installations of a group of apps on the AppStore and the PlayStore.
For the PlayStore, using Google Cloud Storage I followed the instructions on this page using the client library and a Service Account method and the Python code example :
https://support.google.com/googleplay/android-developer/answer/6135870?hl=en&ref_topic=7071935
I slightly changed the given code to make it work since documentation looks not up-to-date. I made it possible to connect to the API and it seems to connect correctly.
My problem is that I don't understand what object I get and how to use it. It's not a report it just looks like files properties in a dict.
This is my code (private data "hidden") :
import json
from httplib2 import Http
from oauth2client.service_account import ServiceAccountCredentials
from googleapiclient.discovery import build
client_email = '************.iam.gserviceaccount.com'
json_file = 'PATH/TO/MY/JSON/FILE'
cloud_storage_bucket = 'pubsite_prod_rev_**********'
report_to_download = 'stats/installs/installs_****************_202005_app_version.csv'
private_key = json.loads(open(json_file).read())['private_key']
credentials = ServiceAccountCredentials.from_json_keyfile_name(json_file, scopes='https://www.googleapis.com/auth/devstorage.read_only')
storage = build('storage', 'v1', http=credentials.authorize(Http()))
supposed_to_be_report = storage.objects().get(bucket=cloud_storage_bucket, object=report_to_download).execute()
When I print the supposed_to_be_report - which is a dictionary- I only get what I understand as Metadata about he report like this:
{'kind': 'storage#object', 'id': 'pubsite_prod_rev_***********/stats/installs/installs_****************_202005_app_version.csv/1591077412052716',
'selfLink': 'https://www.googleapis.com/storage/v1/b/pubsite_prod_rev_***********/o/stats%2Finstalls%2Finstalls_*************_202005_app_version.csv',
'mediaLink': 'https://storage.googleapis.com/download/storage/v1/b/pubsite_prod_rev_***********/o/stats%2Finstalls%2Finstalls_****************_202005_app_version.csv?generation=1591077412052716&alt=media',
'name': 'stats/installs/installs_***********_202005_app_version.csv',
'bucket': 'pubsite_prod_rev_***********',
'generation': '1591077412052716',
'metageneration': '1',
'contentType': 'text/csv;
charset=utf-16le', 'storageClass': 'STANDARD', 'size': '378', 'md5Hash': '*****==', 'contentEncoding': 'gzip'......
I am not sure I'm using it correctly. Could you please explain me where am I wrong and/or how to get installs reports correctly ?
Thanks.
I can see that you are using googleapiclient.discovery client, this is not an issue, but the recommended way to access Google Cloud APIs programmatically is by using the client libraries.
Second, you are just retrieving the object's metadata. You can download the object to have access to the file contents, this is a sample using the client library.
from google.cloud import storage
def download_blob(bucket_name, source_blob_name, destination_file_name):
"""Downloads a blob from the bucket."""
# bucket_name = "your-bucket-name"
# source_blob_name = "storage-object-name"
# destination_file_name = "local/path/to/file"
storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)
blob = bucket.blob(source_blob_name)
blob.download_to_filename(destination_file_name)
print(
"Blob {} downloaded to {}.".format(
source_blob_name, destination_file_name
)
)
Sample taken from official docs.

Encoding is not kept

I am using Python 2.7 and using google plus public API to get activity data in a file. I am encountering issues to maintain the json encoding in my file. Double quotes are coming as u'' in my file. Below is my code:
from apiclient import discovery
API_KEY = 'MY API KEY'
service = discovery.build("plus", "v1", developerKey=API_KEY)
activities_resource = service.activities()
request = activities_resource.search(query='India versus South Africa', maxResults=1, orderBy='best',)
while request!= None:
activities_document = request.execute()
if 'items' in activities_document:
with open("output.json", mode='a') as file:
data = str(activities_document['items'])
file.write(data +"\n\n")
request = service.activities().list_next(request, activities_document)
Output:
[{u'kind': u'plus#activity', u'provider': {u'title': u'Google+'}, u'titl.......
I am expecting [{"kind": "plus#activity", .....
I am running my code on windows and I have tried both on DOS and pycharm IDE. I have also run the code on ubuntu machine but same output. Please let me know what I am doing wrong.
The json module is used for generating JSON. Use it.