After installing the gcp package in my Airflow (1.10.9) setup, I tried to use the GSheetsHook:
https://airflow.readthedocs.io/en/latest/_api/airflow/providers/google/suite/hooks/sheets/index.html
but I get the error No module named 'airflow.providers'.
Looking into the installed Python packages for Airflow, I do not find the providers package.
Is the gcp Airflow package working, or am I missing a step before I can use it?
EDIT: I have installed the gcp package using pip: pip install apache-airflow[gcp]
and here is the list of installed packages
The "providers" package is only available in Airflow Master. We plan to release each provider as a separate packages as "backport packages", most likely in a week or two from today.
PR to do that: https://github.com/apache/airflow/pull/8807
You should be checking https://airflow.apache.org/docs/1.10.9/ for the Airflow 1.10.9 docs. You are looking at the docs for "latest", which track master.
It looks like backport packages (providers) can now be installed on v1.10.* with the following pip command: pip install apache-airflow-backport-providers-PACKAGE-NAME
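For example, to get the Google Sheets hook on 1.10.x, the provider bundle to install should be the google one (a minimal sketch; verify against your own environment):
pip install apache-airflow-backport-providers-google
# quick check that the new namespace now resolves
python -c "from airflow.providers.google.suite.hooks.sheets import GSheetsHook"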
I have written a DAG which uses the mongoexport command in a BashOperator. By default mongoexport is not installed in Composer, so I need to install it with the command below:
sudo apt install mongo-tools
PyPI packages can be installed directly in Composer, since the GCP console has an option for that.
But how can I install an apt package on all nodes of a Composer environment? The package should also remain in effect when the environment autoscales.
Deploying to App Engine with gcloud app deploy fails if cron.yaml or cron.xml contains timezone information, when run with Cloud SDK 297.0.0.
For example:
[INFO] GCLOUD: ERROR: (gcloud.app.deploy) An error occurred while parsing file: [/path/to/file/cron.yaml]
[INFO] GCLOUD: Unable to assign value 'America/New_York' to attribute 'timezone':
[INFO] GCLOUD: timezone 'America/New_York' is unknown
The workaround is to downgrade to version 296.0.1 of the Cloud SDK. (Substitute 296.0.1 for VERSION below.)
If you installed the SDK directly (outside of a package manager), you should use gcloud components to update: gcloud components update --version VERSION. This includes all installation mechanisms on this page (including the interactive installer, static versions, Windows installer, and Homebrew) but excludes the two following bullets.
If you installed via the rapture repo for Debian/Ubuntu: sudo apt-get update && sudo apt-get install google-cloud-sdk=VERSION-0
If you installed via the rapture repo for RedHat/CentOS: sudo yum downgrade google-cloud-sdk-VERSION
If for any reason none of the above work, manually download an older version from the versioned archives and install it: https://cloud.google.com/sdk/docs/downloads-versioned-archives
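For example, with a direct (non-package-manager) installation, pinning back to the working release looks like this:
# roll the SDK back to 296.0.1 via the component manager
gcloud components update --version 296.0.1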
I ran into the same problem and filed a ticket with GCP support.
It seems they are not yet aware of the bug.
Airflow installation with the following command is failing:
sudo pip3 install apache-airflow[gcp_api]
Everything was working fine yesterday. Today I see the following error:
Could not find a version that satisfies the requirement apache-beam[gcp]==2.3.0 (from google-cloud-dataflow->apache-airflow[gcp_api]) (from versions: 0.6.0, 2.0.0, 2.1.0, 2.1.1, 2.2.0)
No matching distribution found for apache-beam[gcp]==2.3.0 (from google-cloud-dataflow->apache-airflow[gcp_api])
Can someone help me on this?
Thanks in advance
I faced the same problem :(
Why?
Most likely it happened because the new version (2.3.0) of apache-beam added a restriction that excludes Python 3:
https://pypi.python.org/pypi/apache-beam/2.3.0
Requires Python: >=2.7,<3.0
The previous releases didn't have this restriction, which is why it worked before (if you didn't use Dataflow from GCP).
You probably have the latest version of google-cloud-dataflow (https://pypi.python.org/pypi/google-cloud-dataflow/2.3.0), which depends on the updated apache-beam package.
How to fix?
Uninstall google-cloud-dataflow:
pip3 uninstall google-cloud-dataflow
and install version 2.2.0, which depends on the old version of apache-beam:
pip3 install google-cloud-dataflow==2.2.0
This fixed the problem for me; I hope it helps you as well.
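To confirm which versions you ended up with, something like this should show apache-beam back on a pre-2.3.0 release:
# list only the two packages involved
pip3 freeze | grep -i -e apache-beam -e google-cloud-dataflow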
This has been resolved in the master branch of Apache Airflow on GitHub by Pull Request #3273.
You can install the latest development version using one of the commands below:
pip install git+https://github.com/apache/incubator-airflow
pip install git+https://github.com/apache/incubator-airflow#egg=apache-airflow[gcp_api]
I am unable to run the yum command in the DSX environment. I need yum to install some packages.
Here's the error I see when I run !yum install sox in a DSX notebook:
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
ImportError: No module named site
This is a possible duplicate of:
Can I use MeCab on IBM Data Science Experience
You cannot use yum in a DSX notebook attached to the Apache Spark service on Bluemix.
The Apache Spark service on Bluemix does not allow users to install root-level packages, which is what yum is typically used for.
The only alternative is to download the source using !wget or !curl and try to compile it yourself. If the package doesn't technically require root permissions, you should be able to compile and install it with make, as in the sketch below.
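As a rough sketch of a user-level source build (the URL and version here are illustrative, not verified; check the project's download page):
# download and unpack the source (no root needed)
wget https://downloads.sourceforge.net/project/sox/sox/14.4.2/sox-14.4.2.tar.gz
tar xzf sox-14.4.2.tar.gz
cd sox-14.4.2
# install under the home directory instead of system paths
./configure --prefix=$HOME/.local
make && make install
# make the newly built binary visible
export PATH=$HOME/.local/bin:$PATH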
You can also raise a feature enhancement request to get this package installed by default:
http://ibm.biz/dsxideas
Thanks,
Charles.
I'm on their github page: https://github.com/GoogleCloudPlatform/google-cloud-python
Their first command is pip install --upgrade google-cloud
This gives me:
Collecting google-cloud
Could not find a version that satisfies the requirement google-cloud (from versions: )
No matching distribution found for google-cloud
I downloaded and installed their Google Cloud SDK and ran gcloud init, but I can't seem to import their Python library. Starting Python and typing:
from google.cloud import datastore
gives me an error that the module doesn't exist. This is all from their GitHub, so I'm not sure what I'm doing wrong.
The issue is that the team is transitioning from gcloud to google-cloud, and the transition is still incomplete.
All you need to do is install gcloud using pip and you should be fine.
Pro tip: python -m pip install --upgrade gcloud will install it for the Python interpreter that runs the command.
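As a quick check (assuming the old package still exposes the gcloud namespace rather than google.cloud):
python -m pip install --upgrade gcloud
# the pre-rename layout imports from gcloud, not google.cloud
python -c "from gcloud import datastore; print('ok')"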
First, make sure you have installed the gcloud SDK on your system, then run the commands like this:
First: gcloud components update in your terminal.
Then: pip install google-cloud
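Afterwards you can test the import from the question directly:
# should exit silently if the package installed correctly
python -c "from google.cloud import datastore"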
And for the import error:
I had a similar problem. Adding "--ignore-installed" to my pip command made it work for me.
This might be a bug in pip - see this page for more details: https://github.com/pypa/pip/issues/2751
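In other words, something like this (combining the command from the question with the flag above):
# reinstall, ignoring the copy pip thinks is already present
pip install --upgrade google-cloud --ignore-installed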