1) !pip install python-dotenv
2) from dotenv import load_dotenv, find_dotenv
3) # find .env automatically by walking up directories until it's found
dotenv_path = find_dotenv()
# load up the entries as environment variables
load_dotenv(dotenv_path)
4) import os
KAGGLE_USERNAME = os.environ.get("KAGGLE_USERNAME")
print(KAGGLE_USERNAME)
Output: None
But the expected output is the KAGGLE_USERNAME value from my .env file.
What is the issue here?
I recently faced this issue.
The problem was that I was running this inside a virtual environment, and the dotenv package fails to locate the .env file with find_dotenv(). To overcome this, use
dotenv_path = find_dotenv(usecwd=True)
Hopefully this will work.
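For reference, a minimal sketch of the fix in context (same variables as in the question):
# Sketch of the fix: search for .env starting from the current working directory
import os
from dotenv import load_dotenv, find_dotenv

dotenv_path = find_dotenv(usecwd=True)  # walk up from the CWD instead of the calling file
load_dotenv(dotenv_path)

print(os.environ.get("KAGGLE_USERNAME"))  # should now print the value from .env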
Hope you are doing well.
I wanted to check if anyone has gotten up and running with dbt on AWS MWAA (Airflow).
I have tried this one and this Python package without success; they fail for one reason or another (can't find the dbt path, etc.).
Has anyone managed to use MWAA (Airflow 2) and dbt without having to build a Docker image and place it somewhere?
Thank you!
I've managed to solve this by doing the following steps:
Add dbt-core==0.19.1 to your requirements.txt
Add the dbt CLI executable into plugins.zip:
#!/usr/bin/env python3
# EASY-INSTALL-ENTRY-SCRIPT: 'dbt-core==0.19.1','console_scripts','dbt'
__requires__ = 'dbt-core==0.19.1'
import re
import sys
from pkg_resources import load_entry_point
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(
        load_entry_point('dbt-core==0.19.1', 'console_scripts', 'dbt')()
    )
And from here you have two options:
Setting the dbt_bin operator argument to /usr/local/airflow/plugins/dbt (see the sketch after the plugins.zip listing below)
Add /usr/local/airflow/plugins/ to the $PATH by following the docs
Environment variable setter example:
from airflow.plugins_manager import AirflowPlugin
import os

os.environ["PATH"] = os.getenv("PATH") + ":/usr/local/airflow/.local/lib/python3.7/site-packages:/usr/local/airflow/plugins/"

class EnvVarPlugin(AirflowPlugin):
    name = 'env_var_plugin'
The plugins zip content:
plugins.zip
├── dbt (DBT cli executable)
└── env_var_plugin.py (environment variable setter)
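For option 1, a minimal sketch of wiring the bundled executable into a DAG, assuming you use the airflow-dbt package's DbtRunOperator and that your dbt project lives under /usr/local/airflow/dags/dbt/ (the project path is an assumption, not part of the answer above):
# Sketch only: option 1, pointing the operator at the dbt executable shipped in plugins.zip
from datetime import datetime

from airflow import DAG
from airflow_dbt.operators.dbt_operator import DbtRunOperator

with DAG(dag_id="dbt_run_example", schedule_interval=None, start_date=datetime(2021, 1, 1)) as dag:
    dbt_run = DbtRunOperator(
        task_id="dbt_run",
        dbt_bin="/usr/local/airflow/plugins/dbt",     # the executable added to plugins.zip
        profiles_dir="/usr/local/airflow/dags/dbt/",  # assumed project location
        dir="/usr/local/airflow/dags/dbt/",
    )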
Using the PyPI airflow-dbt-python package has simplified the setup of dbt on MWAA for us, as it avoids needing to amend the PATH environment variable in the plugins file. However, I've yet to have a successful dbt run via either the airflow-dbt or airflow-dbt-python packages, as the MWAA worker filesystem seems to be read-only, so as soon as dbt tries to compile to the target directory, the following error occurs:
File "/usr/lib64/python3.7/os.py", line 223, in makedirs
mkdir(name, mode)
OSError: [Errno 30] Read-only file system: '/usr/local/airflow/dags/dbt/target'
This is how I managed to do it:
import os

from airflow.decorators import dag, task

@dag(**default_args)
def dbt_dag():
    @task()
    def run_dbt():
        # import inside the task so dbt is resolved on the worker at run time
        from dbt.main import handle_and_check

        # point dbt at writable directories (the dags folder on MWAA is read-only)
        os.environ["DBT_TARGET_DIR"] = "/usr/local/airflow/tmp/target"
        os.environ["DBT_LOG_DIR"] = "/usr/local/airflow/tmp/logs"
        os.environ["DBT_PACKAGE_DIR"] = "/usr/local/airflow/tmp/packages"

        succeeded = True
        try:
            args = ['run', '--whatever', 'bla']
            results, succeeded = handle_and_check(args)
            print(results, succeeded)
        except SystemExit as e:
            if e.code != 0:
                raise e
        if not succeeded:
            raise Exception("DBT failed")

    run_dbt()

# register the DAG (default_args is assumed to be defined elsewhere in the file)
dbt_dag = dbt_dag()
Note that my dbt_project.yml has the following paths; this is to avoid an OS exception when dbt tries to write to read-only paths:
target-path: "{{ env_var('DBT_TARGET_DIR', 'target') }}" # directory which will store compiled SQL files
log-path: "{{ env_var('DBT_LOG_DIR', 'logs') }}" # directory which will store dbt logs
packages-install-path: "{{ env_var('DBT_PACKAGE_DIR', 'packages') }}" # directory which will store dbt packages
Combining the answers from @Yonatan Kiron and @Ofer Helman works for me.
I just needed to fix these 3 files:
requirements.txt
plugins.zip
dbt_project.yml
In my requirements.txt I specify the versions I want; it looks like this:
airflow-dbt==0.4.0
dbt-core==1.0.1
dbt-redshift==1.0.0
Note that, as of v1.0.0, pip install dbt is no longer supported and will raise an explicit error. Since v0.13, the PyPI package named dbt has been a simple "pass-through" of dbt-core (see https://docs.getdbt.com/dbt-cli/install/pip#install-dbt-core-only).
For my plugins.zip I add a file env_var_plugin.py that looks like this
from airflow.plugins_manager import AirflowPlugin
import os

os.environ["DBT_LOG_DIR"] = "/usr/local/airflow/tmp/logs"
os.environ["DBT_PACKAGE_DIR"] = "/usr/local/airflow/tmp/dbt_packages"
os.environ["DBT_TARGET_DIR"] = "/usr/local/airflow/tmp/target"

class EnvVarPlugin(AirflowPlugin):
    name = 'env_var_plugin'
And finally I add this in my dbt_project.yml
log-path: "{{ env_var('DBT_LOG_DIR', 'logs') }}" # directory which will store dbt logs
packages-install-path: "{{ env_var('DBT_PACKAGE_DIR', 'dbt_packages') }}" # directory which will store dbt packages
target-path: "{{ env_var('DBT_TARGET_DIR', 'target') }}" # directory which will store compiled SQL files
And, as stated in the airflow-dbt GitHub README (https://github.com/gocardless/airflow-dbt#amazon-managed-workflows-for-apache-airflow-mwaa), configure the dbt task like below:
dbt_bin='/usr/local/airflow/.local/bin/dbt',
profiles_dir='/usr/local/airflow/dags/{DBT_FOLDER}/',
dir='/usr/local/airflow/dags/{DBT_FOLDER}/'
I have a Django project I have been working on offline. Now I have hosted it on Heroku, and it works well on Heroku but fails on my local machine with this error:
File "/usr/lib/python3.9/os.py", line 679, in __getitem__
raise KeyError(key) from None
KeyError: 'DEBUG'
I think it is because I used environment variables like this:
from boto.s3.connection import S3Connection
import os
DEBUG = S3Connection(os.environ['DEBUG'], os.environ['DEBUG'])
I also have a .env file in my root (project folder) with the environment variables, like this:
export JWT_SECRET_KEY = "dfge..."
export DEBUG = 1
What is the right way to store the environment variables on my local machine?
I have a local file secret.py, added to .gitignore, with all the keys and env values needed:
#secret.py
DEBUG = 1
Then in settings.py:
# settings.py
try:
    import secret
    DEBUG = secret.DEBUG
except ModuleNotFoundError:
    DEBUG = S3Connection(os.environ['DEBUG'], os.environ['DEBUG'])
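Alternatively, since you already have a .env file in the project root, here is a minimal sketch using python-dotenv (assumptions: the package is installed and the .env sits next to manage.py):
# settings.py (sketch): load the existing .env with python-dotenv (assumed to be installed)
import os
from pathlib import Path

from dotenv import load_dotenv

BASE_DIR = Path(__file__).resolve().parent.parent
load_dotenv(BASE_DIR / ".env")  # puts JWT_SECRET_KEY, DEBUG, ... into os.environ

DEBUG = os.environ.get("DEBUG", "0") == "1"           # env vars are strings, so compare to "1"
JWT_SECRET_KEY = os.environ.get("JWT_SECRET_KEY", "")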
Python 2.7
Django 1.10
settings.ini file (located at "/opts/myproject/settings.ini"):
[settings]
DEBUG: True
SECRET_KEY: '5a88V*GuaQgAZa8W2XgvD%dDogQU9Gcc5juq%ax64kyqmzv2rG'
In my Django settings file I have:
import os
from ConfigParser import RawConfigParser
config = RawConfigParser()
config.read('/opts/myproject/settings.ini')
SECRET_KEY = config.get('settings', 'SECRET_KEY')
DEBUG = config.get('settings', 'DEBUG')
The setup works fine locally, but when I deploy to my server I get the following error if I try to run any Django management commands:
ConfigParser.NoSectionError: No section: 'settings'
If I go into a Python shell locally, type in the above imports, and read the file, I get back:
['/opts/myproject/settings.ini']
On the server I get back:
[]
I have tried changing "config.read()" to "config.readfp()" as suggested on here, but it didn't work.
Any help or advice is appreciated.
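A hedged debugging sketch (paths as in the question): ConfigParser's read() silently returns an empty list when it cannot find or open the file, so checking the path and permissions explicitly makes the real failure visible:
# Debugging sketch: make ConfigParser's silent read failure explicit (Python 2.7)
import os
from ConfigParser import RawConfigParser

SETTINGS_PATH = '/opts/myproject/settings.ini'

if not os.path.isfile(SETTINGS_PATH):
    raise RuntimeError('settings file not found: %s' % SETTINGS_PATH)
if not os.access(SETTINGS_PATH, os.R_OK):
    raise RuntimeError('settings file not readable by this user: %s' % SETTINGS_PATH)

config = RawConfigParser()
if not config.read(SETTINGS_PATH):
    raise RuntimeError('ConfigParser could not parse: %s' % SETTINGS_PATH)

# note: config.get() returns strings, so DEBUG here is the string 'True', not a bool
SECRET_KEY = config.get('settings', 'SECRET_KEY')
DEBUG = config.get('settings', 'DEBUG')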
I have two files.
test.py
from pyspark import SparkContext
from pyspark import SparkConf
from pyspark import SQLContext
class Connection():
    conf = SparkConf()
    conf.setMaster("local")
    conf.setAppName("Remote_Spark_Program - Leschi Plans")
    conf.set('spark.executor.instances', 1)
    sc = SparkContext(conf=conf)
    sqlContext = SQLContext(sc)
    print('all done.')
con = Connection()
test_test.py
from test import Connection
sparkConnect = Connection()
When I run test.py the connection is made successfully, but with test_test.py it gives:
raise KeyError(key)
KeyError: 'SPARK_HOME'
The KeyError arises if SPARK_HOME is not set or is invalid. So it's better to add it to your bashrc, and check and reload it in your code. Add this at the top of your test.py:
import os
import sys

import pyspark
from pyspark import SparkContext, SparkConf, SQLContext

# Create a variable for our root path and fail early if it is missing
SPARK_HOME = os.environ.get('SPARK_HOME', None)
if SPARK_HOME is None:
    raise EnvironmentError("SPARK_HOME is not set; export it in ~/.bashrc and reload your shell")

# Add the PySpark/py4j libraries to the Python path
sys.path.insert(0, os.path.join(SPARK_HOME, "python", "lib"))
sys.path.insert(0, os.path.join(SPARK_HOME, "python"))

# Make sure the pyspark-shell flag is passed to spark-submit
pyspark_submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS", "")
if "pyspark-shell" not in pyspark_submit_args:
    pyspark_submit_args += " pyspark-shell"
os.environ["PYSPARK_SUBMIT_ARGS"] = pyspark_submit_args
Also add this at the end of your ~/.bashrc file (open it with vim ~/.bashrc if you are using any Linux-based OS):
# needed for Apache Spark
export SPARK_HOME="/opt/spark"
export IPYTHON="1"
export PYSPARK_PYTHON="/usr/bin/python3"
export PYSPARK_DRIVER_PYTHON="ipython3"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
export PYTHONPATH="$SPARK_HOME/python/:$PYTHONPATH"
export PYTHONPATH="$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH"
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
export CLASSPATH="$CLASSPATH:/opt/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar"
Note:
In the above bashrc code I have set SPARK_HOME to /opt/spark; use the location where you keep your Spark folder (the one downloaded from the website).
Also, I'm using python3; change it to python in the bashrc if you are using Python 2.x.
I was using IPython for easy testing during runtime, e.g. loading the data once and testing the code many times. If you are using a plain old text editor, let me know and I will update the bashrc accordingly.
I created several servers, without any issue, with the stack nginx - uwsgi - flask using virtualenv.
With the current one, uwsgi is throwing the error cannot import name "appl".
Here is the myapp directory structure:
/srv/www/myapp
+ run.py
+ venv/  # virtualenv
+ myapp/
    + __init__.py
    + other modules/
+ logs/
Here is the /etc/uwsgi/apps-available/myapp.ini:
[uwsgi]
# Variables
base = /srv/www/myapp
app = run
# Generic Config
# plugins = http, python
# plugins = python
home = %(base)/venv
pythonpath = %(base)
socket = /tmp/%n.sock
module = %(app)
callable = appl
logto = %(base)/logs/uwsgi_%n.log
And this is run.py:
#!/usr/bin/env python
from myapp import appl
if __name__ == '__main__':
    DEBUG = True if appl.config['DEBUG'] else False
    appl.run(debug=DEBUG)
appl is defined in myapp/__init__.py as an instance of Flask().
I carefully checked the Python code, and indeed if I manually activate the virtualenv and execute run.py everything works like a charm, but uwsgi keeps throwing the import error.
Any suggestions on what I should check next?
Fixed it: it was just a read-permissions issue. The whole Python app was readable by my user but not by the group, so uwsgi could not find it.
This was a bit tricky because I had deployed successfully many times with the same script and never had permission issues.
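For anyone hitting the same thing, a quick diagnostic sketch (the project root below is an assumption taken from the directory layout in the question) to list files and directories the group cannot read:
# Diagnostic sketch: find files/dirs under the app that the group cannot read
# APP_ROOT is an assumption based on the directory structure in the question.
import os
import stat

APP_ROOT = "/srv/www/myapp"

def group_readable(path):
    mode = os.stat(path).st_mode
    if os.path.isdir(path):
        # a directory needs group read + execute to be listed and traversed
        return bool(mode & stat.S_IRGRP) and bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IRGRP)

for dirpath, dirnames, filenames in os.walk(APP_ROOT):
    for path in [dirpath] + [os.path.join(dirpath, f) for f in filenames]:
        if not group_readable(path):
            print("missing group permission:", path)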