Error importing PyDeequ package on Glue 3.0 - amazon-web-services

I am trying to import the pydeequ library in an AWS environment while building a job with Glue. I put a zip file of pydeequ in the Python library path and the JAR files in the Dependent JARs path. My script is the following:
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
import pydeequ
from pydeequ.analyzers import *
import findspark
findspark.init()
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
spark = (SparkSession\
.builder\
.config("spark.jars.packages", pydeequ.deequ_maven_coord)\
.config("spark.jars.excludes", pydeequ.f2j_maven_coord)\
.getOrCreate())
sc = SparkContext()
glueContext = GlueContext(sc)
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
But I couldn't import the pydeequ library, and I got the following error:
2022-12-21 17:50:31,717 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(73)): Error from Python:Traceback (most recent call last):
File "/tmp/Test_Pydeequ.py", line 7, in <module>
import pydeequ
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
File "/tmp/pydeequ.zip/pydeequ/__init__.py", line 21, in <module>
from pydeequ.configs import DEEQU_MAVEN_COORD
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
File "/tmp/pydeequ.zip/pydeequ/configs.py", line 37, in <module>
DEEQU_MAVEN_COORD = _get_deequ_maven_config()
File "/tmp/pydeequ.zip/pydeequ/configs.py", line 28, in _get_deequ_maven_config
spark_version = _get_spark_version()
File "/tmp/pydeequ.zip/pydeequ/configs.py", line 23, in _get_spark_version
spark_version = output.stdout.decode().split("\n")[-2]
IndexError: list index out of range
I need to work with the pydeequ library inside the AWS environment and I don't know why I get this error.
I would appreciate it very much if someone could help me solve this problem.

I solved the problem by doing two things:
First step
I had to open pydeequ's configs.py file and change the code in the _get_spark_version() function.
@lru_cache(maxsize=None)
def _get_spark_version() -> str:
    # Get version from a subprocess so we don't mess up with existing SparkContexts.
    command = [
        "python",
        "-c",
        "from pyspark import SparkContext; print(SparkContext.getOrCreate()._jsc.version())",
    ]
    output = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    #spark_version = output.stdout.decode().split("\n")[-2]
    spark_version = '3.1.1'
    return spark_version
I simply commented out the original spark_version assignment and hard-coded '3.1.1', which is the Spark version used in Glue 3.0.
Second step
I was also using the wrong JAR file version for deequ. The version was 1.0.3, which is not compatible with the Spark 3.1.1 used by Glue 3.0, so I had to download the JAR file for deequ 2.0.0.
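The IndexError in the question comes from taking `[-2]` of the subprocess output after splitting it on newlines: when the subprocess prints nothing to stdout, the split yields a single empty string and there is no second-to-last element. Below is a minimal sketch of a more defensive parse, as an alternative to hard-coding the version unconditionally. It relies only on the line shown in the traceback; `parse_spark_version` and its `fallback` parameter are hypothetical names and not part of pydeequ.

```python
def parse_spark_version(stdout: bytes, fallback: str) -> str:
    """Return the last non-empty line of the subprocess output.

    Hypothetical helper: falls back to a caller-supplied version string
    when the subprocess printed nothing to stdout.
    """
    lines = [line.strip() for line in stdout.decode().splitlines() if line.strip()]
    return lines[-1] if lines else fallback


# Empty output no longer raises; the fallback (Glue 3.0 uses Spark 3.1.1) is used.
print(parse_spark_version(b"", fallback="3.1.1"))
# Normal output: the version is the last non-empty line.
print(parse_spark_version(b"some log line\n3.1.1\n", fallback="unknown"))
```

With this shape the hard-coded value is only used when detection fails, so the same patched file would still report the real version in environments where the subprocess works.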

Related

Django ImportError: cannot import name 'python_2_unicode_compatible' from 'django.utils.encoding'

I know others have had similar issues and gotten this same error, but I think my situation is unique.
I am running Django 3.1.4 and on my local machine, I can run python manage.py shell with no issue.
On the server instance, running what should be the same project, and the same version of Django, I get:
Django ImportError: cannot import name 'python_2_unicode_compatible'
from 'django.utils.encoding'
When trying to run manage.py shell. To make things more cryptic, if I open the shell on my local machine and run:
from django.utils.encoding import python_2_unicode_compatible
I get the same error. So for some reason when I call manage.py shell from my local machine it doesn't try to import python_2_unicode_compatible, but when I run it from the server it does. I can't find where the discrepancy is.
Here is the full stacktrace if that is helpful:
Traceback (most recent call last):
File "manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/home/chase/Env/mantis/lib/python3.8/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
utility.execute()
File "/home/chase/Env/mantis/lib/python3.8/site-packages/django/core/management/__init__.py", line 377, in execute
django.setup()
File "/home/chase/Env/mantis/lib/python3.8/site-packages/django/__init__.py", line 24, in setup
apps.populate(settings.INSTALLED_APPS)
File "/home/chase/Env/mantis/lib/python3.8/site-packages/django/apps/registry.py", line 114, in populate
app_config.import_models()
File "/home/chase/Env/mantis/lib/python3.8/site-packages/django/apps/config.py", line 211, in import_models
self.models_module = import_module(models_module_name)
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/chase/Env/mantis/lib/python3.8/site-packages/request/models.py", line 7, in <module>
from django.utils.encoding import python_2_unicode_compatible
ImportError: cannot import name 'python_2_unicode_compatible' from 'django.utils.encoding' (/home/chase/Env/mantis/lib/python3.8/site-packages/django/utils/encoding.py)
Any ideas on where to start poking around?
You can try this.
First, install six:
pip install six
Then open Django's django/utils/encoding.py file and import python_2_unicode_compatible from six there, like this:
from six import python_2_unicode_compatible
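A change made inside the installed Django package is lost whenever Django is reinstalled or upgraded. python_2_unicode_compatible was removed from django.utils.encoding in Django 3.0, and the traceback in the question shows the failing import sitting in the third-party request/models.py, so another option is a compatibility import in the code that actually needs the decorator. This is a minimal sketch, assuming that code can be edited; the identity fallback at the end is an illustration that only makes sense on Python 3.

```python
# Compatibility import: try the old Django location first, then six,
# and finally fall back to a no-op decorator (sufficient on Python 3,
# where __str__ already returns text).
try:
    from django.utils.encoding import python_2_unicode_compatible
except ImportError:
    try:
        from six import python_2_unicode_compatible
    except ImportError:
        def python_2_unicode_compatible(cls):
            """No-op stand-in used when neither Django < 3.0 nor six is available."""
            return cls


@python_2_unicode_compatible
class Example:
    def __str__(self):
        return "example"


print(str(Example()))
```

If a newer release of the third-party package no longer imports the decorator, upgrading that package is usually the cleaner fix.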

Web application could not be started by the Phusion Passenger application server, ModuleNotFoundError - Django [SOLVED]

When I install a third-party application from GitHub with pip install -e git+https://github.com/breduin/das.git#egg=django_ajax_selects, my site doesn't start and the following error is raised:
Web application could not be started by the Phusion Passenger application server.
/usr/share/passenger/helper-scripts/wsgi-loader.py:26: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import sys, os, re, imp, threading, signal, traceback, socket, select, struct, logging, errno
Traceback (most recent call last):
File "/usr/share/passenger/helper-scripts/wsgi-loader.py", line 369, in <module>
app_module = load_app()
File "/usr/share/passenger/helper-scripts/wsgi-loader.py", line 76, in load_app
return imp.load_source('passenger_wsgi', startup_file)
File "/opt/python/python-3.8.6/lib/python3.8/imp.py", line 171, in load_source
module = _load(spec)
File "<frozen importlib._bootstrap>", line 702, in _load
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 783, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/var/www/u1234567/data/www/mysite/passenger_wsgi.py", line 7, in <module>
application = get_wsgi_application()
File "/var/www/u1234567/data/env/lib/python3.8/site-packages/django/core/wsgi.py", line 12, in get_wsgi_application
django.setup(set_prefix=False)
File "/var/www/u1234567/data/env/lib/python3.8/site-packages/django/__init__.py", line 24, in setup
apps.populate(settings.INSTALLED_APPS)
File "/var/www/u1234567/data/env/lib/python3.8/site-packages/django/apps/registry.py", line 91, in populate
app_config = AppConfig.create(entry)
File "/var/www/u1234567/data/env/lib/python3.8/site-packages/django/apps/config.py", line 90, in create
module = import_module(entry)
File "/opt/python/python-3.8.6/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'ajax_select'
In the directory env/lib/python3.8/site-packages/, where packages from PyPI are usually placed, there is an egg-link file,
django-ajax-selects.egg-link
which contains the path
/var/www/u1234567/data/env/src/django-ajax-selects
The path is correct; the django-ajax-selects application is located there.
Without this application from GitHub, and with other applications from PyPI, my site starts fine.
It seems the server doesn't recognize the egg-link or the path, but I couldn't find out how to correct this.
The SO effect: while writing the question I found the answer myself. One needs to add the path to the package (the one stored in the egg-link file) in passenger_wsgi.py:
# -*- coding: utf-8 -*-
import os, sys
sys.path.insert(0, '/var/www/u1234567/data/www/mysite/mysite')
sys.path.insert(1, '/var/www/u1234567/data/env/lib/python3.8/site-packages')
sys.path.insert(2, '/var/www/u1234567/data/env/src/django-ajax-selects') #<---- this path
os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'
from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()
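Hard-coding the src path works. A variation is to let the standard-library `site` module add site-packages, because `site.addsitedir` also processes the `.pth` files in that directory, which is where editable installs normally record their source paths. Whether this particular install wrote such a `.pth` entry is an assumption, since only the egg-link is shown. The sketch below uses temporary directories in place of the real paths, and `demo.pth` is a made-up file name.

```python
import os
import site
import sys
import tempfile

# Stand-ins for env/src/<package> and env/lib/python3.8/site-packages.
src_dir = tempfile.mkdtemp()
site_packages = tempfile.mkdtemp()

# A .pth file lists one extra path per line, like an editable install would.
with open(os.path.join(site_packages, "demo.pth"), "w") as fh:
    fh.write(src_dir + "\n")

# addsitedir appends the directory itself and every existing path
# listed in its .pth files to sys.path.
site.addsitedir(site_packages)

print(src_dir in sys.path)
```

In passenger_wsgi.py the call would take the real site-packages path instead of the temporary one, replacing the separate insert for the src directory.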

django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet when loading wsgi.py

I have a problem with Django and wsgi that I cannot pinpoint. The app works fine on my local test server and it also works fine on a local Apache WAMP setup (without any venvs). After deploying it to our Linux server, the local test server again runs fine there (as do makemigrations, migrate and check):
python3 manage.py runserver
/home/www-test/myapp-venv/lib/python3.7/site-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.25.8) or chardet (2.0.3) doesn't match a supported version!
RequestsDependencyWarning)
/home/www-test/myapp-venv/lib/python3.7/site-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.25.8) or chardet (2.0.3) doesn't match a supported version!
RequestsDependencyWarning)
Performing system checks...
System check identified no issues (0 silenced).
March 18, 2020 - 16:22:20
Django version 2.2.11, using settings 'myapp.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
python3 manage.py check
/home/www-test/myapp-venv/lib/python3.7/site-packages/requests/__init__.py:91:
RequestsDependencyWarning: urllib3 (1.25.8) or chardet (2.0.3) doesn't match a supported version!
RequestsDependencyWarning)
System check identified no issues (0 silenced).
However when I try to deploy it with wsgi/Apache on the Linux machine I get
Traceback (most recent call last):
File "myapp/wsgi.py", line 20, in <module>
application = get_wsgi_application()
File "/home/www-test/.local/lib/python3.7/site-packages/django/core/wsgi.py", line 12, in get_wsgi_application
django.setup(set_prefix=False)
File "/home/www-test/.local/lib/python3.7/site-packages/django/__init__.py", line 24, in setup
apps.populate(settings.INSTALLED_APPS)
File "/home/www-test/.local/lib/python3.7/site-packages/django/apps/registry.py", line 91, in populate
app_config = AppConfig.create(entry)
File "/home/www-test/.local/lib/python3.7/site-packages/django/apps/config.py", line 90, in create
module = import_module(entry)
File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/www-test/myapp/myapp/maintenance.py", line 9, in <module>
from imageplus.models import ImagePlus
File "/home/www-test/myapp/imageplus/models.py", line 6, in <module>
from userprofile.models import Userprofile
File "/home/www-test/myapp/userprofile/models.py", line 5, in <module>
from django.contrib.auth.models import User
File "/home/www-test/.local/lib/python3.7/site-packages/django/contrib/auth/models.py", line 2, in <module>
from django.contrib.auth.base_user import AbstractBaseUser, BaseUserManager
File "/home/www-test/.local/lib/python3.7/site-packages/django/contrib/auth/base_user.py", line 47, in <module>
class AbstractBaseUser(models.Model):
File "/home/www-test/.local/lib/python3.7/site-packages/django/db/models/base.py", line 103, in __new__
app_config = apps.get_containing_app_config(module)
File "/home/www-test/.local/lib/python3.7/site-packages/django/apps/registry.py", line 252, in get_containing_app_config
self.check_apps_ready()
File "/home/www-test/.local/lib/python3.7/site-packages/django/apps/registry.py", line 135, in check_apps_ready
raise AppRegistryNotReady("Apps aren't loaded yet.")
django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet.
I suspect it has something to do with venv or with missing paths, but I don't know where to start. wsgi.py looks like this:
"""
WSGI config for myapp project.
It exposes the WSGI callable as a module-level variable named ``application``.
For more information on this file, see
https://docs.djangoproject.com/en/2.2/howto/deployment/wsgi/
"""
import os, sys
from django.core.wsgi import get_wsgi_application
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myapp.settings')
sys.path.append('/home/www-test/myapp/myapp')
sys.path.append('/home/www-test/myapp')
sys.path.append('/home/www-test/myapp-venv/lib/python3.7/site-packages')
application = get_wsgi_application()
Any suggestions as to where to even start hunting this error would be very much appreciated.
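One place to start: the wsgi traceback loads Django from /home/www-test/.local/lib/python3.7/site-packages, while the runserver warnings come from /home/www-test/myapp-venv/lib/python3.7/site-packages, so the two runs may not be importing the same packages. The sketch below reports where a module would be loaded from without importing it. `module_origin` is a hypothetical helper name, and `json` is only a stand-in for `django` so the example runs anywhere.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a module would be loaded from, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised for dotted names whose parent package is missing.
        return None
    return spec.origin if spec is not None else None


# In wsgi.py one would log module_origin("django") after the sys.path changes.
print(module_origin("json"))
print(module_origin("no_such_module_xyz"))
```

Because wsgi.py imports django.core.wsgi before the sys.path.append calls, the venv path added afterwards cannot affect which Django is picked up.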
[edit:]
I traced the error further to apps.py of myapp which looks like this:
from django.apps import AppConfig
from threading import Thread
from django.utils import timezone
import time
import schedule
import os
from decouple import config
class SebastianConfig(AppConfig):
    name = 'myapp'
    def ready(self):
        if os.environ.get('RUN_MAIN', None) == 'true':
            return
        import myapp.signals
        if config('AUTOMAINTENANCE', cast=bool)==False:
            print ('[myApp]: auto maintenance is disabled')
            return
        from myapp.maintenance import maintenance
        def scheduler_demon():
            while True:
                schedule.run_pending()
                time.sleep(60)
        def aliveping():
            print(f'[myApp]:ping {timezone.now()}')
        schedule.every(5).minutes.do(aliveping)
        schedule.every(24).hours.do(maintenance)
        schedule.run_all()
        worker=Thread(target=scheduler_demon,args=(),)
        worker.daemon=True
        worker.start()
In theory this should not cause problems, as far as my understanding of Django goes. It's also the ONLY spot where maintenance is imported.
Now the funny thing is that I removed that passage and wsgi worked like a charm. I added the passage back and, SURPRISE, it still works like a charm?!

ModuleNotFoundError Django views

I have created a separate module (sources.py) in my django project in order to hold functions and keep my views module from turning into a complete nightmare.
When I import sources.py into views.py, call the function it holds, and run the code, with a print statement to make sure what is called is accurately being imported, everything works fine and I see printed what I expect.
However, when I try to run the app in the local server I get a ModuleNotFoundError.
If I include the code from the function in views and then run the local server, everything works just fine.
The module obviously is being imported otherwise the print statement wouldn't work. However, something is lost when trying to run on the local server.
I have gone through a ton of Stack Overflow questions, Python and Django module tutorials, and questions located elsewhere, and nothing has solved the problem. Below is all the code.
sources.py and views.py are both in the same directory. (betrTV/betrTV_app/file.py)
I have also tried "import sources *" with original = sources.video(), which gave the same error, as well as "import . sources".
from django.shortcuts import render
from django.http import HttpResponse
from datetime import date
from sources import video
''' returns a dict of unique video identifiers (last 11 chars)'''
original = video()
print(original)
sources = []
# strip out all except the unique source code
for i in range(len(original)):
    source = original[i]
    source = source[30:]
    sources.append(source)
# Checked out... print(sources)
# create day of week to pull video
day = date.today()
num = str(day)[9]
# pull source from list by using 'num' from 'day'
source = sources[int(num)]
def index(request):
    ''' The home page for betrTV_app '''
    video = 'class="video-frame" width="1200" height="600"\
+ src="https://www.youtube.com/embed/%s"\
+ frameborder="0" allow="accelerometer; autoplay; encrypted-media; \
+ gyroscope; picture-in-picture" allowfullscreen' %source
    html = '<html><body><link href="https://fonts.googleapis.com/css?family=Gugi"\
+ rel="stylesheet"><h1 style="color: grey; font-family: Gugi; font-size: \
+ 20px;">BetrTV... Empowerment on Demand</h1>\
<iframe %s></iframe></body></html>' %video
    return HttpResponse(html)
INSTALLED_APPS = [
...
# My Apps
'betrTV_app',
]
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "C:\Users\Prometheus\Dropbox\LVL6\Programming\Python\betrTV\betrTV\urls.py", line 21, in <module>
path('', include('betrTV_app.urls')),
File "C:\Users\Prometheus\Envs\betr_env\lib\site-packages\django\urls\conf.py", line 34, in include
urlconf_module = import_module(urlconf_module)
File "C:\Users\Prometheus\Envs\betr_env\lib\importlib\__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "C:\Users\Prometheus\Dropbox\LVL6\Programming\Python\betrTV\betrTV_app\urls.py", line 4, in <module>
from . import views
File "C:\Users\Prometheus\Dropbox\LVL6\Programming\Python\betrTV\betrTV_app\views.py", line 4, in <module>
from sources import video
ModuleNotFoundError: No module named 'sources'
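The traceback shows views.py being imported as part of the betrTV_app package (via `from . import views`). In that situation `from sources import video` is an absolute import that looks for a top-level module named sources, which only resolves when the app directory itself is on sys.path, as it is when views.py is run directly. The sketch below reproduces the difference with a throwaway package; `demo_app`, `demo_sources` and the two views modules are made-up names used only for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Build a tiny package on disk: demo_app/{__init__,demo_sources,views_*}.py
project_root = tempfile.mkdtemp()
app_dir = os.path.join(project_root, "demo_app")
os.mkdir(app_dir)
files = {
    "__init__.py": "",
    "demo_sources.py": "def video():\n    return ['abc']\n",
    "views_absolute.py": "from demo_sources import video\n",
    "views_relative.py": "from .demo_sources import video\n",
}
for filename, body in files.items():
    with open(os.path.join(app_dir, filename), "w") as fh:
        fh.write(body)

# Only the project root is importable, as with manage.py runserver.
sys.path.insert(0, project_root)
importlib.invalidate_caches()

try:
    importlib.import_module("demo_app.views_absolute")
    absolute_import_worked = True
except ModuleNotFoundError:
    absolute_import_worked = False

relative_views = importlib.import_module("demo_app.views_relative")
print(absolute_import_worked, relative_views.video())
```

Applied to the question, the corresponding change in views.py would be `from .sources import video` (or `from betrTV_app.sources import video`).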

ImportError: cannot import name DateRange

I am setting up a new Django project with PostgreSQL as the backend. I am using pip to install the packages. The following requirements are installed in the new environment:
Django==1.9
argparse==1.2.1
djangorestframework==3.3.3
psycopg2==2.6.1
wsgiref==0.1.2
I don't know where I made a mistake. Please help me configure the new project, ideally with a step-by-step procedure for setting up the new environment.
The error stack is
from psycopg2.extras import DateRange, DateTimeTZRange, NumericRange
ImportError: cannot import name DateRange
my@box:~ mkvirtualenv env
(env)my@box:~ pip install -r requirements.txt
(env)my@box:~ django-admin startproject proj
(env)my@box:~ ./proj/manage.py shell
>>> from psycopg2.extras import DateRange, DateTimeTZRange, NumericRange
>>>
Works for me. Can you import psycopg2 and import psycopg2.extras?
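To follow up on that check, it can help to see which psycopg2 distribution the active environment actually has installed, since an older or different copy than the one pinned in requirements.txt would explain the failing import. This is a minimal sketch using the standard-library importlib.metadata (Python 3.8+); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# The import name is psycopg2 in both cases, but the distribution may be
# published as either psycopg2 or psycopg2-binary.
for name in ("psycopg2", "psycopg2-binary"):
    print(name, installed_version(name))
```

Running this inside the virtualenv and comparing the result with the pinned psycopg2==2.6.1 shows whether the environment matches the requirements file.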
I know this is way too late to the party, but using psycopg2-binary worked for me.
This was with Django 2.2.9 and Djongo 1.3.0, which failed to import psycopg2 with the following error:
Traceback (most recent call last):
File "manage.py", line 21, in <module>
main()
File "manage.py", line 17, in main
execute_from_command_line(sys.argv)
File "/venv/lib/python3.6/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
utility.execute()
File "/venv/lib/python3.6/site-packages/django/core/management/__init__.py", line 357, in execute
django.setup()
File "/venv/lib/python3.6/site-packages/django/__init__.py", line 24, in setup
apps.populate(settings.INSTALLED_APPS)
File "/venv/lib/python3.6/site-packages/django/apps/registry.py", line 114, in populate
app_config.import_models()
File "/venv/lib/python3.6/site-packages/django/apps/config.py", line 211, in import_models
self.models_module = import_module(models_module_name)
File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 678, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/segnet-webapp/webapp/label/models.py", line 2, in <module>
from psycopg2.extras import DateRange, DateTimeTZRange, NumericRange
ModuleNotFoundError: No module named 'psycopg2'
I'm posting this here since it is one of the very few places that show up when searching for this particular error.
All I had to do was:
$(venv) pip install psycopg2-binary