I'm having trouble authenticating and writing data to a Spanner database locally. All imports are up to date (google.cloud, google.oauth2, etc.). Someone else ran the same script and it works fine for them, so the problem seems to be on my end: something misconfigured on my computer, maybe where the credentials are stored?
Does anyone have any ideas?
from google.cloud import spanner
from google.api_core.exceptions import GoogleAPICallError
from google.api_core.datetime_helpers import DatetimeWithNanoseconds
import datetime
from google.oauth2 import service_account


def write_to(database):
    record = [[
        1041613562310836275,
        'test_name'
    ]]
    columns = ("id", "name")
    insert_errors = []
    try:
        with database.batch() as batch:
            batch.insert_or_update(
                table="guild",
                columns=columns,
                values=record,
            )
    except GoogleAPICallError as e:
        print(f'error: {e}')
        insert_errors.append(e.message)
    return insert_errors


if __name__ == "__main__":
    credentials = service_account.Credentials.from_service_account_file(r'path\to\a.json')
    instance_id = 'instance-name'
    database_id = 'database-name'
    spanner_client = spanner.Client(project='project-name', credentials=credentials)
    print(f'spanner creds: {spanner_client.credentials}')
    instance = spanner_client.instance(instance_id)
    database = instance.database(database_id)
    insert_errors = write_to(database)
Some credential tests:
creds = service_account.Credentials.from_service_account_file(a_json)
<google.oauth2.service_account.Credentials at 0x...>
spanner_client.credentials
<google.auth.credentials.AnonymousCredentials at 0x...>
spanner_client.credentials.signer_email
AttributeError: 'AnonymousCredentials' object has no attribute 'signer_email'
creds.signer_email
'...@....iam.gserviceaccount.com'
spanner.Client().from_service_account_json(a_json).credentials
<google.auth.credentials.AnonymousCredentials object at 0x...>
The most common reason for this is that you have accidentally set (or forgotten to unset) the environment variable SPANNER_EMULATOR_HOST. If this environment variable is set, the client library tries to connect to the emulator instead of Cloud Spanner. In that mode the client swaps the supplied credentials for anonymous ones, which matches the AnonymousCredentials shown in the question. The client library will also wait for a long time while trying to connect to the emulator (assuming that the emulator is not running on your machine). Unset the environment variable to fix this problem.
Note: This environment variable only affects the Cloud Spanner client libraries, which is why other Google Cloud products work on the same machine. The script will also work on most other machines, as they are unlikely to have this environment variable set.
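A quick way to check for this from Python is sketched below. It is only an illustration: the helper names are made up for this example, and treating every variable ending in `_EMULATOR_HOST` as suspicious is an assumption (the only name taken from the answer above is SPANNER_EMULATOR_HOST).

```python
import os

# Assumption: any variable whose name ends in "_EMULATOR_HOST" is worth flagging.
# SPANNER_EMULATOR_HOST is the one that matters for the Spanner client.
EMULATOR_SUFFIX = "_EMULATOR_HOST"


def find_emulator_overrides(env):
    """Return the emulator-related variables that are set to a non-empty value."""
    return {
        name: value
        for name, value in env.items()
        if name.endswith(EMULATOR_SUFFIX) and value
    }


def clear_spanner_emulator(env):
    """Remove SPANNER_EMULATOR_HOST from the mapping; return True if it was set."""
    return env.pop("SPANNER_EMULATOR_HOST", None) is not None


if __name__ == "__main__":
    # Report what is set in the current process environment.
    for name, value in sorted(find_emulator_overrides(os.environ).items()):
        print(f"{name}={value}")
    # Clearing it here only affects this Python process, not the shell.
    if clear_spanner_emulator(os.environ):
        print("Removed SPANNER_EMULATOR_HOST for this process only.")
```

This has to run before `spanner.Client(...)` is created. For a permanent fix, remove the variable from wherever it is exported (shell profile, IDE run configuration, and so on).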
I have the following code:
import time
from google.protobuf.timestamp_pb2 import Timestamp
from google.cloud import bigquery_datatransfer_v1


def runQuery(parent, requested_run_time):
    client = bigquery_datatransfer_v1.DataTransferServiceClient()
    projectid = '[enter your projectId here]'  # Enter your projectID here
    transferid = '[enter your transferId here]'  # Enter your transferId here
    parent = client.project_transfer_config_path(projectid, transferid)
    start_time = bigquery_datatransfer_v1.types.Timestamp(seconds=int(time.time() + 10))
    response = client.start_manual_transfer_runs(parent, requested_run_time=start_time)
    print(response)
We have used it in a few different projects and cases and everything works fine. Today I deployed another function using this code and keep getting the following error:
AttributeError: 'DataTransferServiceClient' object has no attribute 'project_transfer_config_path'
What am I missing?
Thank you!
You are probably using a newer version (2.0.0 or 2.1.0) of the google-cloud-bigquery-datatransfer client library. In these versions, most utility methods have been removed, one of them being project_transfer_config_path.
You can use the method transfer_config_path of the client to achieve the same result.
I would strongly suggest that you study the Migration Guide to 2.0.0 as there might be other changes that you need to make too.
If you are on version 2.0.0 rather than 2.1.0, I would recommend upgrading to the latest release, since there are breaking changes between the two; for example, the import paths that were changed in 2.0.0 were reverted in 2.1.0.
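If the same function has to run against both old and new versions of the library, one option is a small helper that picks whichever method the installed client offers. This is a sketch, not part of the library: `transfer_config_name` is a hypothetical name, and the hand-formatted fallback assumes the usual `projects/<project>/transferConfigs/<id>` resource name.

```python
def transfer_config_name(client, project_id, transfer_id):
    """Build the transfer config resource name across library versions."""
    # Newer clients expose transfer_config_path; older ones expose
    # project_transfer_config_path (the name used in the question).
    for method_name in ("transfer_config_path", "project_transfer_config_path"):
        method = getattr(client, method_name, None)
        if callable(method):
            return method(project_id, transfer_id)
    # Fallback (assumption): format the resource name by hand.
    return f"projects/{project_id}/transferConfigs/{transfer_id}"
```

In the question's function this would replace the `client.project_transfer_config_path(projectid, transferid)` line. Note that, as far as I know, the 2.x releases also changed how `start_manual_transfer_runs` takes its arguments (a single `request` object or dict rather than positional arguments), so check the Migration Guide for that call as well.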
I'm working on retrieving my HIT results from my local computer. I followed the get_results.py template, entered my key_id and access_key correctly, and installed xmltodict, but I still get an error message. Could anyone help me figure out why? Here is my HIT address, in case anyone needs the format of my HIT: https://workersandbox.mturk.com/mturk/preview?groupId=3MKP0VNPM2VVY0K5UTNZX9OO9Q8RJE
import boto3
mturk = boto3.client('mturk',
    aws_access_key_id = "PASTE_YOUR_IAM_USER_ACCESS_KEY",
    aws_secret_access_key = "PASTE_YOUR_IAM_USER_SECRET_KEY",
    region_name='us-east-1',
    endpoint_url = MTURK_SANDBOX
)
# You will need the following library
# to help parse the XML answers supplied from MTurk
# Install it in your local environment with
# pip install xmltodict
import xmltodict
# Use the hit_id previously created
hit_id = 'PASTE_IN_YOUR_HIT_ID'
# We are only publishing this task to one Worker
# So we will get back an array with one item if it has been completed
worker_results = mturk.list_assignments_for_hit(HITId=hit_id, AssignmentStatuses=['Submitted'])
I want to use SFTPToGCSOperator in a GCP Composer environment (Airflow 1.10.6). I know there is a limitation: the operator is present only in the latest version of Airflow, not in the latest Composer version (1.10.6).
See the reference:
https://airflow.readthedocs.io/en/latest/howto/operator/gcp/sftp_to_gcs.html
I found an alternative and created a plugin class for the operator, but then I ran into an issue with the SFTPHook class, so I am now using the older version of SFTPHook.
See the reference below:
from airflow.contrib.hooks.sftp_hook import SFTPHook
https://airflow.apache.org/docs/stable/_modules/airflow/contrib/hooks/sftp_hook.html
I created a plugin class and import it in my DAG script. It works fine only when moving a single file, in which case we need to pass the complete file path with extension.
Please see the example below (this scenario works fine):
DIR = "/test/sftp_dag_test/source_dir"
OBJECT_SRC_1 = "file.csv"
source_path=os.path.join(DIR, OBJECT_SRC_1),
But if we use a wildcard, i.e. if we want to move all the files from a directory, I get an error for the get_tree_map method.
Please see the DAG code below:
import os
from airflow import models
from airflow.models import Variable
from PluginSFTPToGCSOperator import SFTPToGCSOperator
#from airflow.contrib.operators.sftp_to_gcs import SFTPToGCSOperator
from airflow.utils.dates import days_ago

default_args = {"start_date": days_ago(1)}
DIR_path = "/main_dir/sub_dir/"
BUCKET_SRC = "test-gcp-bucket"

with models.DAG(
    "dag_sftp_to_gcs", default_args=default_args, schedule_interval=None
) as dag:
    copy_sftp_to_gcs = SFTPToGCSOperator(
        task_id="t_sftp_to_gcs",
        sftp_conn_id="test_sftp_conn",
        gcp_conn_id="google_cloud_default",
        source_path=os.path.join(DIR_path, "*.gz"),
        destination_bucket=BUCKET_SRC,
    )
    copy_sftp_to_gcs
Here we are using the wildcard * in the DAG script. Please see the plugin class below.
import os
from tempfile import NamedTemporaryFile
from typing import Optional, Union
from airflow.plugins_manager import AirflowPlugin
from airflow import AirflowException
from airflow.contrib.hooks.gcs_hook import GoogleCloudStorageHook
from airflow.models import BaseOperator
from airflow.contrib.hooks.sftp_hook import SFTPHook
from airflow.utils.decorators import apply_defaults
WILDCARD = "*"
class SFTPToGCSOperator(BaseOperator):
    template_fields = ("source_path", "destination_path", "destination_bucket")

    @apply_defaults
    def __init__(
        self,
        source_path: str,
        destination_bucket: str = "destination_bucket",
        destination_path: Optional[str] = None,
        gcp_conn_id: str = "google_cloud_default",
        sftp_conn_id: str = "sftp_conn_plugin",
        delegate_to: Optional[str] = None,
        mime_type: str = "application/octet-stream",
        gzip: bool = False,
        move_object: bool = False,
        *args,
        **kwargs
    ) -> None:
        super().__init__(*args, **kwargs)
        self.source_path = source_path
        self.destination_path = self._set_destination_path(destination_path)
        print('destination_bucket : ', destination_bucket)
        self.destination_bucket = destination_bucket
        self.gcp_conn_id = gcp_conn_id
        self.mime_type = mime_type
        self.delegate_to = delegate_to
        self.gzip = gzip
        self.sftp_conn_id = sftp_conn_id
        self.move_object = move_object
    def execute(self, context):
        print("inside execute")
        gcs_hook = GoogleCloudStorageHook(
            google_cloud_storage_conn_id=self.gcp_conn_id, delegate_to=self.delegate_to
        )
        sftp_hook = SFTPHook(self.sftp_conn_id)
        if WILDCARD in self.source_path:
            total_wildcards = self.source_path.count(WILDCARD)
            if total_wildcards > 1:
                raise AirflowException(
                    "Only one wildcard '*' is allowed in source_path parameter. "
                    "Found {} in {}.".format(total_wildcards, self.source_path)
                )
            print('self.source_path : ', self.source_path)
            prefix, delimiter = self.source_path.split(WILDCARD, 1)
            print('prefix : ', prefix)
            base_path = os.path.dirname(prefix)
            print('base_path : ', base_path)
            files, _, _ = sftp_hook.get_tree_map(
                base_path, prefix=prefix, delimiter=delimiter
            )
            for file in files:
                destination_path = file.replace(base_path, self.destination_path, 1)
                self._copy_single_object(gcs_hook, sftp_hook, file, destination_path)
        else:
            destination_object = (
                self.destination_path
                if self.destination_path
                else self.source_path.rsplit("/", 1)[1]
            )
            self._copy_single_object(
                gcs_hook, sftp_hook, self.source_path, destination_object
            )

    def _copy_single_object(
        self,
        gcs_hook: GoogleCloudStorageHook,
        sftp_hook: SFTPHook,
        source_path: str,
        destination_object: str,
    ) -> None:
        """
        Helper function to copy single object.
        """
        self.log.info(
            "Executing copy of %s to gs://%s/%s",
            source_path,
            self.destination_bucket,
            destination_object,
        )
        with NamedTemporaryFile("w") as tmp:
            sftp_hook.retrieve_file(source_path, tmp.name)
            print('before upload self det object : ', self.destination_bucket)
            gcs_hook.upload(
                self.destination_bucket,
                destination_object,
                tmp.name,
                self.mime_type,
            )
        if self.move_object:
            self.log.info("Executing delete of %s", source_path)
            sftp_hook.delete_file(source_path)
    @staticmethod
    def _set_destination_path(path: Union[str, None]) -> str:
        if path is not None:
            return path.lstrip("/") if path.startswith("/") else path
        return ""

    @staticmethod
    def _set_bucket_name(name: str) -> str:
        bucket = name if not name.startswith("gs://") else name[5:]
        return bucket.strip("/")


class SFTPToGCSOperatorPlugin(AirflowPlugin):
    name = "SFTPToGCSOperatorPlugin"
    operators = [SFTPToGCSOperator]
I import this plugin class in my DAG script, and it works fine when we use a file name, because the code goes into the else branch.
But when we use a wildcard, execution goes into the if branch and I get an error for the get_tree_map method.
See the error below:
ERROR - 'SFTPHook' object has no attribute 'get_tree_map'
I found the reason for this error: the method itself is not present in Composer (Airflow 1.10.6):
https://airflow.apache.org/docs/stable/_modules/airflow/contrib/hooks/sftp_hook.html
The method is present in the latest version of Airflow:
https://airflow.readthedocs.io/en/latest/_modules/airflow/providers/sftp/hooks/sftp.html
What should I try now? Is there any alternative to this method, or any alternative to this operator class?
Does anyone know if there is a solution for this?
Thanks in advance.
Please ignore any typos or indentation errors in this post. There are no indentation errors in my actual code.
"providers" packages are only available from Airflow 2.0, which is not yet available in Cloud Composer (as I write this post, the latest available Airflow image is 1.10.14, released this morning).
However, you can install backport packages, which let you use these new packages on the earlier 1.10.* versions.
My requirements.txt:
apache-airflow-backport-providers-ssh==2020.10.29
apache-airflow-backport-providers-sftp==2020.10.29
pysftp>=0.2.9
paramiko>=2.6.0
sshtunnel<0.2,>=0.1.4
You can install PyPI packages directly in your Composer environment from the console.
With these dependencies, I could use the newest airflow.providers.ssh.operators.ssh.SSHOperator (formerly airflow.contrib.operators.ssh_operator.SSHOperator) and the new airflow.providers.google.cloud.transfers.gcs_to_sftp.GCSToSFTPOperator (which had no equivalent in contrib operators).
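If the same plugin or DAG file has to work both with and without the backport packages installed, a fallback import is one way to handle it. The sketch below is an illustration only: `import_first_available` and `SFTP_HOOK_CANDIDATES` are hypothetical names, while the two module paths are the ones mentioned in this thread.

```python
import importlib

# Candidate locations for SFTPHook, newest first. The first path is provided by
# the backport/providers package, the second by Airflow 1.10.x contrib.
SFTP_HOOK_CANDIDATES = (
    ("airflow.providers.sftp.hooks.sftp", "SFTPHook"),
    ("airflow.contrib.hooks.sftp_hook", "SFTPHook"),
)


def import_first_available(candidates):
    """Return the first attribute importable from a list of (module, name) pairs."""
    errors = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_name}.{attr_name}: {exc}")
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


# Usage inside an Airflow environment (left commented out so the sketch
# also runs where Airflow is not installed):
# SFTPHook = import_first_available(SFTP_HOOK_CANDIDATES)
```

Keep in mind that the two hook classes do not have identical methods (get_tree_map, for example, only exists in the newer one), so code that falls back to the contrib hook still has to avoid the newer methods.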
Enjoy!
To use SFTPToGCSOperator in Google Cloud Composer on Airflow 1.10.6, we need to create a plugin and, in effect, "hack" Airflow by copying the operator and hook code into one file, so that SFTPToGCSOperator can use code from Airflow 1.10.10.
The latest Airflow version has a new airflow.providers directory, which does not exist in earlier versions. This is why you saw the following error: No module named airflow.providers. All the changes I made are described here:
I prepared a working plugin, which you can download here. Before using it, we have to install the following PyPI libraries in the Cloud Composer environment: pysftp, paramiko, sshtunnel.
I copied the full SFTPToGCSOperator code, which starts at line 792. You can see that this operator uses GCSHook:
from airflow.providers.google.cloud.hooks.gcs import GCSHook
which also needs to be copied into the plugin (it starts at line 193).
GCSHook inherits from the GoogleBaseHook class, which we can replace with GoogleCloudBaseHook, available in Airflow 1.10.6, and import it:
from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
Finally, the SFTPHook code needs to be copied into the plugin (it starts at line 39). It inherits from the SSHHook class; we can use the one from Airflow 1.10.6 by changing the import statement:
from airflow.contrib.hooks.ssh_hook import SSHHook
At the end of the file, you can find the definition of the plugin:
class SFTPToGCSOperatorPlugin(AirflowPlugin):
    name = "SFTPToGCSOperatorPlugin"
    operators = [SFTPToGCSOperator]
Creating a plugin is needed because the built-in operator is not currently available in Airflow 1.10.6 (the latest version in Cloud Composer). You can keep an eye on the Cloud Composer version lists to see when a newer version of Airflow becomes available.
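If copying the newer hook is more than you need, the wildcard branch of the question's operator could probably also be handled without get_tree_map, by listing the directory and filtering the names. The sketch below shows only the filtering step. `match_remote_files` is a hypothetical helper, and it assumes the older contrib SFTPHook offers a `list_directory(path)` method that returns bare file names (please verify this against your Airflow version). Unlike get_tree_map, it does not recurse into subdirectories.

```python
import fnmatch
import posixpath


def match_remote_files(file_names, source_path):
    """Expand a single-directory wildcard such as '/dir/*.gz' against a listing.

    file_names: bare names, as returned by a directory listing.
    Returns the full remote paths of the matching names, sorted.
    """
    base_path, pattern = posixpath.split(source_path)
    if "*" in base_path:
        raise ValueError("wildcards are only supported in the last path component")
    # fnmatchcase keeps the match case-sensitive on every platform.
    matched = [name for name in file_names if fnmatch.fnmatchcase(name, pattern)]
    return sorted(posixpath.join(base_path, name) for name in matched)
```

Inside `execute`, the names would come from something like `sftp_hook.list_directory(posixpath.dirname(self.source_path))`, and each returned path could then be passed to `_copy_single_object` as before.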
I hope you find the above pieces of information useful.
I have created an automation framework using toolium with Appium which works for both iOS and Android. Toolium is a Python wrapper that I've used to facilitate page object modelling. Basically, the UI is separated from the test case so that the same test case can be used on Android as well as iOS.
I now need to get the framework working with iOS 10 (with the XCUITest framework). So I have changed the iOS elements to support XCUITest (in places where XPath is used and there is no other means of element identification). There is no change in the folder structure or execution mechanism whatsoever. But with the new framework I get an import error from toolium.
The code from toolium's mobile page object module looks something like this:
# -*- coding: utf-8 -*-
import importlib
from toolium.driver_wrapper import DriverWrappersPool
from toolium.pageobjects.page_object import PageObject


class MobilePageObject(PageObject):
    def __new__(cls, driver_wrapper=None):
        """Instantiate android or ios page object from base page object depending on driver configuration

        Base, Android and iOS page objects must be defined with following structure:
            FOLDER/base/MODULE_NAME.py
                class BasePAGE_OBJECT_NAME(MobilePageObject)
            FOLDER/android/MODULE_NAME.py
                class AndroidPAGE_OBJECT_NAME(BasePAGE_OBJECT_NAME)
            FOLDER/ios/MODULE_NAME.py
                class IosPAGE_OBJECT_NAME(BasePAGE_OBJECT_NAME)

        :param driver_wrapper: driver wrapper instance
        :returns: android or ios page object instance
        """
        if cls.__name__.startswith('Base'):
            __driver_wrapper = driver_wrapper if driver_wrapper else DriverWrappersPool.get_default_wrapper()
            __os_name = 'ios' if __driver_wrapper.is_ios_test() else 'android'
            __class_name = cls.__name__.replace('Base', __os_name.capitalize())
            try:
                return getattr(importlib.import_module(cls.__module__), __class_name)(__driver_wrapper)
            except AttributeError:
                __module_name = cls.__module__.replace('.base.', '.{}.'.format(__os_name))
                print __module_name
                print __class_name
                print __driver_wrapper
                return getattr(importlib.import_module(__module_name), __class_name)(__driver_wrapper)
        else:
            return super(MobilePageObject, cls).__new__(cls)
I follow the folder structure described by toolium. Basically, I have a pageobjects folder, under which I have a base folder, an ios folder and an android folder. All my methods are in the base class. The elements are picked up from either the ios folder or the android folder at run time, based on the driver type.
Below is the error from the import module:
name = 'pageobjects.ios.intro', package = None

    def import_module(name, package=None):
        """Import a module.

        The 'package' argument is required when performing a relative import. It
        specifies the package to use as the anchor point from which to resolve the
        relative import to an absolute import.
        """
        if name.startswith('.'):
            if not package:
                raise TypeError("relative imports require the 'package' argument")
            level = 0
            for character in name:
                if character != '.':
                    break
                level += 1
            name = _resolve_name(name[level:], package, level)
        __import__(name)
E   ImportError: No module named ios.intro
When I print the module name and class name, this is what I get:
module name = pageobjects.ios.intro
class name = IosIntroduction
intro is one of the modules. I access it like this:
from pageobjects.base.intro import BaseIntroduction
On the same machine, the old framework works without any problem. I have checked environment variables, permissions, etc., but I can't figure out why the import is failing.
PS: I am running this on macOS and I use a virtualenv for Python.
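One thing that may be worth ruling out: in Python 2, "No module named ios.intro" generally means that the pageobjects package itself was found but the ios subpackage inside it was not. Typical causes are a missing `__init__.py` in one of the folders, or a different pageobjects package earlier on sys.path. The sketch below checks the first of these. It is only a diagnostic aid under those assumptions; `missing_package_markers` is a hypothetical helper and not part of toolium.

```python
import os


def missing_package_markers(root, dotted_name):
    """List package directories on the way to dotted_name that lack __init__.py.

    root: directory that should contain the top-level package.
    dotted_name: for example 'pageobjects.ios.intro' (the last part is the module).
    """
    missing = []
    current = root
    # Every component except the last one must be a package directory.
    for part in dotted_name.split(".")[:-1]:
        current = os.path.join(current, part)
        if not os.path.isfile(os.path.join(current, "__init__.py")):
            missing.append(current)
    return missing
```

Calling it with the project root and 'pageobjects.ios.intro' returns the folders that still need an `__init__.py`. If the list is empty, printing `pageobjects.__file__` from inside the failing test would show whether the expected copy of the package is the one being imported.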