How to install Apache Atlas on a single EC2 node? - amazon-web-services

I tried installing Apache Atlas on a single EC2 node, but it fails to start:
wget http://www-eu.apache.org/dist/atlas/1.0.0/apache-atlas-1.0.0-sources.tar.gz
tar xvfz apache-atlas-1.0.0-sources.tar.gz
cd apache-atlas-sources-1.0.0/
export MAVEN_OPTS="-Xms2g -Xmx2g"
mvn clean -DskipTests package -Pdist,embedded-hbase-solr
python atlas_start.py
/tmp/apache-atlas-sources-1.0.0/distro/src/conf/atlas-env.sh: line 59: MANAGE_LOCAL_HBASE=${hbase.embedded}: bad substitution
/tmp/apache-atlas-sources-1.0.0/distro/src/conf/atlas-env.sh: line 62: MANAGE_LOCAL_SOLR=${solr.embedded}: bad substitution
/tmp/apache-atlas-sources-1.0.0/distro/src/conf/atlas-env.sh: line 65: MANAGE_EMBEDDED_CASSANDRA=${cassandra.embedded}: bad substitution
/tmp/apache-atlas-sources-1.0.0/distro/src/conf/atlas-env.sh: line 68: MANAGE_LOCAL_ELASTICSEARCH=${elasticsearch.managed}: bad substitution
Exception: [Errno 2] No such file or directory
Traceback (most recent call last):
File "atlas_start.py", line 163, in <module>
returncode = main()
File "atlas_start.py", line 73, in main
mc.expandWebApp(atlas_home)
File "/tmp/apache-atlas-sources-1.0.0/distro/src/bin/atlas_config.py", line 160, in expandWebApp
jar(atlasWarPath)
File "/tmp/apache-atlas-sources-1.0.0/distro/src/bin/atlas_config.py", line 213, in jar
process = runProcess(commandline)
File "/tmp/apache-atlas-sources-1.0.0/distro/src/bin/atlas_config.py", line 249, in runProcess
p = subprocess.Popen(commandline, stdout=stdoutFile, stderr=stderrFile, shell=shell)
File "/usr/lib64/python2.7/subprocess.py", line 390, in __init__
errread, errwrite)
File "/usr/lib64/python2.7/subprocess.py", line 1025, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
How do I install Apache Atlas on a single AWS EC2 instance?
Thanks.

I agree you should check the script. However, the notes are not very clear: you also need to have configured Atlas. That means deciding whether or not you are using a pre-built ZooKeeper (ZK) installation. More importantly, Atlas uses HBase as its store by default, so you MUST have HDFS available too, and you must change the config to point to the HDFS NameNode (usually on port 9000).
Hope this helps.

Please check whether JAVA_HOME is set and has the correct value. Setting it to a valid value solved the issue for me.
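A quick way to test the JAVA_HOME theory before rerunning the start script is sketched below. It assumes the script locates the jar tool under JAVA_HOME/bin, which is a guess based on the traceback failing inside jar(...). The function name java_home_problems is made up for this example.

```python
import os


def java_home_problems(environ=None):
    """Return a list of human-readable problems with JAVA_HOME (empty if none found)."""
    environ = os.environ if environ is None else environ
    java_home = environ.get("JAVA_HOME")
    if not java_home:
        return ["JAVA_HOME is not set"]
    problems = []
    if not os.path.isdir(java_home):
        problems.append("JAVA_HOME does not point to a directory: " + java_home)
    # Assumption: the start script needs both tools under JAVA_HOME/bin.
    bin_dir = os.path.join(java_home, "bin")
    for tool in ("java", "jar"):
        candidates = [os.path.join(bin_dir, tool + ext) for ext in ("", ".exe")]
        if not any(os.path.isfile(c) for c in candidates):
            problems.append("no '%s' executable under %s" % (tool, bin_dir))
    return problems


if __name__ == "__main__":
    for problem in java_home_problems():
        print(problem)
```

If this prints nothing, JAVA_HOME is probably not the cause and the Atlas configuration itself is the next thing to check.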

Related

AWS SAM Accelerate fails to resolve dependencies when building application

I'm trying to use SAM Accelerate as recommended by AWS. However, the sam sync command fails with:
PythonPipBuilder:ResolveDependencies - Could not satisfy the requirement: jsonpickle==2.1.0
The requirement for jsonpickle is included in the requirements.txt file, and it's installed locally.
foo@bar:~/sam-project$ pip freeze | grep jsonpickle
jsonpickle==2.1.0
The exact same error occurs when I use sam build, but I'm able to use the sam build -u to use a container and make the build work. Unfortunately that doesn't seem to be an option for sam sync.
I have found a few occurrences of a similar issue, but none of them address the root cause, so I am unsure how to fix this.
Full output
foo@bar:~/sam-project$ sam sync --watch
The SAM CLI will use the AWS Lambda, Amazon API Gateway, and AWS StepFunctions APIs to upload your code without
performing a CloudFormation deployment. This will cause drift in your CloudFormation stack.
**The sync command should only be used against a development stack**.
Confirm that you are synchronizing a development stack.
Enter Y to proceed with the command, or enter N to cancel:
[Y/n]: y
Queued infra sync. Wating for in progress code syncs to complete...
Starting infra sync.
Manifest file is changed (new hash: 1719a58de4024a0928ae0e3ddf42ac82) or dependency folder (.aws-sam/deps/ce2e5caa-e309-401a-8ab1-425d3c3e399d) is missing for (CoreLayer), downloading dependencies and copying/building source
Building layer 'CoreLayer'
Running PythonPipBuilder:CleanUp
Clean up action: .aws-sam/deps/ce2e5caa-e309-401a-8ab1-425d3c3e399d does not exist and will be skipped.
Running PythonPipBuilder:ResolveDependencies
Build Failed
Failed to sync infra. Code sync is paused until template/stack is fixed.
Traceback (most recent call last):
File "aws_lambda_builders/workflows/python_pip/actions.py", line 54, in execute
File "aws_lambda_builders/workflows/python_pip/packager.py", line 156, in build_dependencies
File "aws_lambda_builders/workflows/python_pip/packager.py", line 258, in build_site_packages
File "aws_lambda_builders/workflows/python_pip/packager.py", line 282, in _download_dependencies
File "aws_lambda_builders/workflows/python_pip/packager.py", line 365, in _download_all_dependencies
File "aws_lambda_builders/workflows/python_pip/packager.py", line 717, in download_all_dependencies
aws_lambda_builders.workflows.python_pip.packager.NoSuchPackageError: Could not satisfy the requirement: jsonpickle==2.1.0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "aws_lambda_builders/workflow.py", line 301, in run
File "aws_lambda_builders/workflows/python_pip/actions.py", line 57, in execute
aws_lambda_builders.actions.ActionFailedError: Could not satisfy the requirement: jsonpickle==2.1.0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "samcli/lib/build/app_builder.py", line 760, in _build_function_in_process
File "aws_lambda_builders/builder.py", line 164, in build
File "aws_lambda_builders/workflow.py", line 95, in wrapper
File "aws_lambda_builders/workflow.py", line 308, in run
aws_lambda_builders.exceptions.WorkflowFailedError: PythonPipBuilder:ResolveDependencies - Could not satisfy the requirement: jsonpickle==2.1.0
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "samcli/commands/build/build_context.py", line 248, in run
File "samcli/lib/build/app_builder.py", line 221, in build
File "samcli/lib/build/build_strategy.py", line 358, in build
File "samcli/lib/build/build_strategy.py", line 78, in build
File "samcli/lib/build/build_strategy.py", line 361, in _build_layers
File "samcli/lib/build/build_strategy.py", line 380, in _run_builds_async
File "samcli/lib/utils/async_utils.py", line 131, in run_async
File "samcli/lib/utils/async_utils.py", line 90, in run_given_tasks_async
File "asyncio/base_events.py", line 587, in run_until_complete
File "samcli/lib/utils/async_utils.py", line 58, in _run_given_tasks_async
File "concurrent/futures/thread.py", line 57, in run
File "samcli/lib/build/build_strategy.py", line 388, in build_single_layer_definition
File "samcli/lib/build/build_strategy.py", line 546, in build_single_layer_definition
File "samcli/lib/build/build_strategy.py", line 430, in build_single_layer_definition
File "samcli/lib/build/build_strategy.py", line 218, in build_single_layer_definition
File "samcli/lib/build/app_builder.py", line 552, in _build_layer
File "samcli/lib/build/app_builder.py", line 763, in _build_function_in_process
samcli.lib.build.exceptions.BuildError: PythonPipBuilder:ResolveDependencies - Could not satisfy the requirement: jsonpickle==2.1.0
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "samcli/lib/sync/watch_manager.py", line 190, in _execute_infra_sync
File "samcli/lib/sync/watch_manager.py", line 142, in _execute_infra_context
File "samcli/commands/build/build_context.py", line 308, in run
samcli.commands.exceptions.UserException: PythonPipBuilder:ResolveDependencies - Could not satisfy the requirement: jsonpickle==2.1.0
samcli.commands.exceptions.UserException: PythonPipBuilder:ResolveDependencies - Could not satisfy the requirement: jsonpickle==2.1.0
Unfortunately no one was able to assist here, so I opened an issue on GitHub.
Eventually the issue became clear, and it's not actually an issue with SAM. The problem is that I use an AWS CodeArtifact feed, so any sam build or sam sync action will try to pull packages from that feed. However, the token for that feed expires after 12 hours.
The issue remains open and the SAM team are investigating the error that is displayed, and hopefully they will implement a solution that will surface the underlying error message, which would have made diagnosing this issue a whole lot easier.
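Until the underlying error is surfaced, one way to spot this situation is to look at which index pip is configured to use. The sketch below assumes the feed is an AWS CodeArtifact repository with the token embedded in the index URL. The function name describe_index and the example URL are made up for illustration. It cannot tell whether a token has expired, only that a token-based private index is in play.

```python
from urllib.parse import urlsplit


def describe_index(index_url):
    """Summarise a pip index URL without printing any embedded token."""
    parts = urlsplit(index_url)
    host = parts.hostname or ""
    return {
        "host": host,
        # True when the URL has the form https://user:token@host/...
        "has_embedded_credentials": parts.password is not None,
        # Assumption: CodeArtifact endpoints contain ".codeartifact." in the host name.
        "looks_like_codeartifact": ".codeartifact." in host,
    }


if __name__ == "__main__":
    # Hypothetical URL in the shape CodeArtifact uses for pip.
    example = ("https://aws:EXAMPLE-TOKEN@my-domain-111122223333"
               ".d.codeartifact.eu-west-1.amazonaws.com/pypi/my-repo/simple/")
    print(describe_index(example))
```

The configured URL can be read with pip config get global.index-url. If it points at a private feed, refreshing the token (for CodeArtifact, aws codeartifact login --tool pip with your own domain and repository) before running sam sync is worth trying.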

gcloud version 306.0.0 causing "No module named 'urllib2'" errors

After updating gcloud from version 290.0.1 to version 306.0.0, I'm getting an error when I run a gsutil cp command:
Traceback (most recent call last):
File "/usr/lib/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
gsutil.RunMain()
File "/usr/lib/google-cloud-sdk/platform/gsutil/gsutil.py", line 122, in RunMain
import gslib.__main__
File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 53, in <module>
import boto
File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/vendored/boto/boto/__init__.py", line 1216, in <module>
boto.plugin.load_plugins(config)
File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/vendored/boto/boto/plugin.py", line 93, in load_plugins
_import_module(file)
File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/vendored/boto/boto/plugin.py", line 75, in _import_module
return imp.load_module(name, file, filename, data)
File "/usr/lib/python3.6/imp.py", line 235, in load_module
return load_source(name, filename, file)
File "/usr/lib/python3.6/imp.py", line 172, in load_source
module = _load(spec)
File "/usr/share/google/boto/boto_plugins/compute_auth.py", line 18, in <module>
import urllib2
ModuleNotFoundError: No module named 'urllib2'
Following the downgrade instructions at https://cloud.google.com/sdk/docs/downloads-apt-get#downgrading_cloud_sdk_versions temporarily fixes the issue:
sudo apt-get update && sudo apt-get install google-cloud-sdk=290.0.1-0
But I'd like to know how to get this working with the latest version.
I installed version 306.0.0 and ran a gsutil cp command, but I didn't hit the issue. The urllib2 module only exists in Python 2, and your traceback shows gsutil running under Python 3.6. Looking into causes of ModuleNotFoundError: No module named 'urllib2', it always seems to be related to a Python library that isn't working correctly, as you can see in these two examples here and here.
However, on further searching, this plugin is usually used within Compute Engine and in startup scripts for VMs with Python, more specifically in relation to the file compute_auth.py, which appears in the error message. You can find more information about this file here.
Considering that, the new version of the Cloud SDK brings some updates to Compute Engine that could be causing the error. If you are indeed using Python within your applications, I would try the solution from this case here, which is to update the file compute_auth.py, changing the line import urllib2 to import urllib.request as urllib2.
If this doesn't fix it, raising a bug in Google's Issue Tracker will be the best option for further investigation.
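The suggested import change can also be written so that the same file loads under both Python 2 and Python 3. This is only a sketch of the import pattern. Whether the rest of compute_auth.py runs under Python 3 is not something this guarantees, since the file may use other Python 2 constructs.

```python
try:
    # Python 2: the module exists under its original name.
    import urllib2
except ImportError:
    # Python 3: the closest equivalent lives in urllib.request.
    import urllib.request as urllib2

# Names such as urlopen and Request are available under either interpreter.
opener = urllib2.build_opener()
```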
I had a similar case: Travis CI was giving the error below. I fixed it by adding the script below to the before_script section of my .travis.yml file.
Error:
Traceback (most recent call last):
File "/usr/lib/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
gsutil.RunMain()
File "/usr/lib/google-cloud-sdk/platform/gsutil/gsutil.py", line 121, in RunMain
import gslib.__main__
File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 83, in <module>
import httplib2
ModuleNotFoundError: No module named 'httplib2'
error Command failed with exit code 1
Fix:
before_script:
- pip install httplib2 crcmod

Fresh Windows 10 install of anaconda and jupyter - Kernel Error (Python 2.7 and 3.5)

I used ipython, and a little bit of jupyter, for quite some time a while ago. After not having used it for almost 6 months, I wanted to start using it again.
I installed the newest version of jupyter, updated my python 2.7 install, got pip working and installed the necessary packages:
pip install jupyter
pip install notebook
and so on. After having done that, I tried to open an old notebook (written in 2.7), but there was no connection to the kernel. I thought, well wth, why not just update to the newest python 3 version and try that. That resulted in the same problem.
I went ahead and installed anaconda and created two virtual envs, one with python 2.7 and one with python 3.5. Both installed like this:
conda create --name py27 python=2.7 anaconda
conda create --name py35 python=3.5 anaconda
After that I made sure that both envs had jupyter installed by activating each one and installing the package. (This was done for both py27 and py35, the env names from the commands above.)
activate py27
conda install jupyter
After that I tried to run:
jupyter notebook
I created a new notebook file to see if I had access to the kernel. However, the following error made it clear that I didn't:
Traceback (most recent call last):
File "E:\Anaconda3\envs\py35\lib\site-packages\notebook\base\handlers.py", line 458, in wrapper
result = yield gen.maybe_future(method(self, *args, **kwargs))
File "E:\Anaconda3\envs\py35\lib\site-packages\tornado\gen.py", line 1008, in run
value = future.result()
File "E:\Anaconda3\envs\py35\lib\site-packages\tornado\concurrent.py", line 232, in result
raise_exc_info(self._exc_info)
File "<string>", line 3, in raise_exc_info
File "E:\Anaconda3\envs\py35\lib\site-packages\tornado\gen.py", line 1014, in run
yielded = self.gen.throw(*exc_info)
File "E:\Anaconda3\envs\py35\lib\site-packages\notebook\services\sessions\handlers.py", line 58, in post
sm.create_session(path=path, kernel_name=kernel_name))
File "E:\Anaconda3\envs\py35\lib\site-packages\tornado\gen.py", line 1008, in run
value = future.result()
File "E:\Anaconda3\envs\py35\lib\site-packages\tornado\concurrent.py", line 232, in result
raise_exc_info(self._exc_info)
File "<string>", line 3, in raise_exc_info
File "E:\Anaconda3\envs\py35\lib\site-packages\tornado\gen.py", line 1014, in run
yielded = self.gen.throw(*exc_info)
File "E:\Anaconda3\envs\py35\lib\site-packages\notebook\services\sessions\sessionmanager.py", line 73, in create_session
self.kernel_manager.start_kernel(path=kernel_path, kernel_name=kernel_name)
File "E:\Anaconda3\envs\py35\lib\site-packages\tornado\gen.py", line 1008, in run
value = future.result()
File "E:\Anaconda3\envs\py35\lib\site-packages\tornado\concurrent.py", line 232, in result
raise_exc_info(self._exc_info)
File "<string>", line 3, in raise_exc_info
File "E:\Anaconda3\envs\py35\lib\site-packages\tornado\gen.py", line 282, in wrapper
yielded = next(result)
File "E:\Anaconda3\envs\py35\lib\site-packages\notebook\services\kernels\kernelmanager.py", line 87, in start_kernel
super(MappingKernelManager, self).start_kernel(**kwargs)
File "E:\Anaconda3\envs\py35\lib\site-packages\jupyter_client\multikernelmanager.py", line 109, in start_kernel
km.start_kernel(**kwargs)
File "E:\Anaconda3\envs\py35\lib\site-packages\jupyter_client\manager.py", line 244, in start_kernel
**kw)
File "E:\Anaconda3\envs\py35\lib\site-packages\jupyter_client\manager.py", line 190, in _launch_kernel
return launch_kernel(kernel_cmd, **kw)
File "E:\Anaconda3\envs\py35\lib\site-packages\jupyter_client\launcher.py", line 108, in launch_kernel
proc = Popen(cmd, **kwargs)
File "E:\Anaconda3\envs\py35\lib\subprocess.py", line 950, in __init__
restore_signals, start_new_session)
File "E:\Anaconda3\envs\py35\lib\subprocess.py", line 1220, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
This was obviously tried on my py35 venv; however, I get the same error on my py27 venv. I have tried a few things, such as running the kernelspec, but none with any success.
Does anyone have a suggestion as to what might be wrong?
I had the same problem. You need to create a kernelspec for the jupyter notebook. Follow this link to solve it:
How to start an ipython shell (not notebook) within a conda or virtualenv
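For reference, creating a kernelspec for an environment usually comes down to running the ipykernel installer with that environment's interpreter. The sketch below assumes ipykernel is installed in the activated env. The helper names kernel_install_command and register_kernel are made up for this example.

```python
import subprocess
import sys


def kernel_install_command(env_name, display_name=None, python=sys.executable):
    """Build the command that registers `python` as a Jupyter kernel for the current user."""
    return [
        python, "-m", "ipykernel", "install",
        "--user",
        "--name", env_name,
        "--display-name", display_name or "Python (%s)" % env_name,
    ]


def register_kernel(env_name, display_name=None):
    """Run the registration with the interpreter that is executing this script."""
    subprocess.check_call(kernel_install_command(env_name, display_name))
```

Run it once per environment, after activating that environment, for example register_kernel("py35") inside py35 and register_kernel("py27") inside py27. The generated kernel.json then points at an interpreter that actually exists, which is what the FileNotFoundError above suggests was missing.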

virtualenv returns error 'Operation not Permitted'

I was using the command virtualenv --no-site-packages django-env but I encountered the following error:
Traceback (most recent call last):
File "/usr/local/bin/virtualenv", line 9, in <module>
load_entry_point('virtualenv==12.0.7', 'console_scripts', 'virtualenv')()
File "/usr/local/lib/python2.7/dist-packages/virtualenv-12.0.7-py2.7.egg/virtualenv.py", line 825, in main
symlink=options.symlink)
File "/usr/local/lib/python2.7/dist-packages/virtualenv-12.0.7-py2.7.egg/virtualenv.py", line 985, in create_environment
site_packages=site_packages, clear=clear, symlink=symlink))
File "/usr/local/lib/python2.7/dist-packages/virtualenv-12.0.7-py2.7.egg/virtualenv.py", line 1416, in install_python
os.symlink(py_executable_base, full_pth)
OSError: [Errno 1] Operation not permitted
So I thought of using the command sudo virtualenv --no-site-packages django-env in my terminal to avoid any permission conflicts with the operating system, but it throws the following error:
Traceback (most recent call last):
File "/usr/local/bin/virtualenv", line 9, in <module>
load_entry_point('virtualenv==12.0.7', 'console_scripts', 'virtualenv')()
File "/usr/local/lib/python2.7/dist-packages/virtualenv-12.0.7-py2.7.egg/virtualenv.py", line 825, in main
symlink=options.symlink)
File "/usr/local/lib/python2.7/dist-packages/virtualenv-12.0.7-py2.7.egg/virtualenv.py", line 985, in create_environment
site_packages=site_packages, clear=clear, symlink=symlink))
File "/usr/local/lib/python2.7/dist-packages/virtualenv-12.0.7-py2.7.egg/virtualenv.py", line 1204, in install_python
copyfile(stdinc_dir, inc_dir, symlink)
File "/usr/local/lib/python2.7/dist-packages/virtualenv-12.0.7-py2.7.egg/virtualenv.py", line 479, in copyfile
copyfileordir(src, dest, symlink)
File "/usr/local/lib/python2.7/dist-packages/virtualenv-12.0.7-py2.7.egg/virtualenv.py", line 454, in copyfileordir
shutil.copytree(src, dest, symlink)
File "/usr/lib/python2.7/shutil.py", line 208, in copytree
raise Error, errors
shutil.Error: [('/usr/include/python2.7/numpy', 'django-env/include/python2.7/numpy', '[Errno 1] Operation not permitted')]
I am using Ubuntu 14.04 and Python 2.7.6
I am unable to figure out what is causing the error.
Regarding ownership of the development folder:
I had a similar error when running virtualenv on a virtualbox mounted drive. Switching over to a directory on the virtual machine ran fine.
All the best.
By default, VirtualBox forbids creating symlinks in mounted shared folders for security reasons.
You can, however, enable it manually with the following command:
VBoxManage setextradata VM_NAME VBoxInternal2/SharedFoldersEnableSymlinksCreate/SHARE_NAME 1
After that, the virtual environment should be bootstrapped correctly. Do not forget to shut down the VM for this setting to be picked up.
You can refer to this VirtualBox ticket for more details: https://www.virtualbox.org/ticket/10085.
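To confirm that symlink creation is really what fails in a given folder, a small probe like the one below can be run against it. This is a minimal sketch and the function name supports_symlinks is made up. It creates and removes a scratch directory inside the folder being tested.

```python
import os
import tempfile


def supports_symlinks(directory):
    """Return True if a symlink can be created inside `directory`, False otherwise."""
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        target = os.path.join(scratch, "target")
        link = os.path.join(scratch, "link")
        open(target, "w").close()
        try:
            os.symlink(target, link)
        except (OSError, NotImplementedError):
            # Errno 1 (Operation not permitted) from the question lands here.
            return False
        return os.path.islink(link)
```

Calling supports_symlinks(".") from the shared folder and again from a directory on the VM's own disk shows whether the two locations behave differently.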
I had a similar error when running virtualenv on a mounted drive. For me, the "--always-copy" option resolved the issue.
If you're using the venv module (as you're supposed to nowadays), create the virtual environment in some non-shared folder. The stupid feature of that module is that even if you specify the --copies flag, it still tries to create symlinks, just between two folders in the same directory.
Create the virtual environment in a folder that is not shared between your VM and host, remove the symlink, and duplicate the lib folder via copy or another command.
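The venv route can also be driven from Python. The sketch below uses the standard library's venv.EnvBuilder with symlinks=False, which corresponds to the --copies flag. The function name create_copy_venv is made up, and with_pip=False is chosen here only to keep the example independent of ensurepip. On 64-bit Linux the builder may still create a lib64 symlink, so the target should be a non-shared folder, as described above.

```python
import os
import tempfile
import venv


def create_copy_venv(env_dir):
    """Create a virtual environment at `env_dir`, copying the interpreter instead of symlinking it."""
    # with_pip=False skips ensurepip; switch it on if pip is needed inside the env.
    builder = venv.EnvBuilder(symlinks=False, with_pip=False)
    builder.create(env_dir)
    # pyvenv.cfg is written by the builder and marks a finished environment.
    return os.path.join(env_dir, "pyvenv.cfg")
```

For example, create_copy_venv(os.path.expanduser("~/envs/django-env")) builds the environment under the home directory on the VM's own disk rather than on the mounted drive.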
In your home directory, try these commands:
1. sudo easy_install virtualenv
2. mkdir virt_env
3. virtualenv virt_env/test1
4. source virt_env/test1/bin/activate
5. pip install django==1.7.4
After that:
django-admin.py startproject project_name

virtualenv is not compatible with this system or executable

I am rather new to Linux (Ubuntu) and to installing (Python) packages. I'm having trouble with mkvirtualenv and cannot solve it:
~$ mkvirtualenv mysite70
New python executable in mysite70/bin/python
Traceback (most recent call last):
File "/usr/lib/python2.7/site.py", line 562, in <module>
main()
File "/usr/lib/python2.7/site.py", line 544, in main
known_paths = addusersitepackages(known_paths)
File "/usr/lib/python2.7/site.py", line 271, in addusersitepackages
user_site = getusersitepackages()
File "/usr/lib/python2.7/site.py", line 246, in getusersitepackages
user_base = getuserbase() # this will also set USER_BASE
File "/usr/lib/python2.7/site.py", line 236, in getuserbase
USER_BASE = get_config_var('userbase')
File "/usr/lib/python2.7/sysconfig.py", line 577, in get_config_var
return get_config_vars().get(name)
File "/usr/lib/python2.7/sysconfig.py", line 476, in get_config_vars
_init_posix(_CONFIG_VARS)
File "/usr/lib/python2.7/sysconfig.py", line 355, in _init_posix
raise IOError(msg)
IOError: invalid Python installation: unable to open /home/sietse/.virtualenvs/mysite70/local/include/python2.7/pyconfig.h (No such file or directory)
ERROR: The executable mysite70/bin/python is not functioning
ERROR: It thinks sys.prefix is u'/home/usr/.virtualenvs' (should be u'/home/usr/.virtualenvs/mysite70')
ERROR: virtualenv is not compatible with this system or executable
Did I install something wrong?
This is likely a permissions error for the user you are currently logged in as on the Linux machine.
Try
sudo mkvirtualenv mysite70
This will usually prompt for your password.
If that does not work, you may want to look at the article below:
http://noelusion.com/2013/Fixing-the-mysterious-virtualenv-error-IOError-invalid-Python-installation/
Note, however, that the article describes a hack for a fairly specific case.
I think I messed up the installation. I reinstalled Ubuntu, virtualenv etc. It works fine now.
Check whether your username contains accents or special characters. If it does, change the directory in which environments are created by setting the environment variable WORKON_HOME to the new path, e.g. C:\Envs
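The username check can be sketched in a few lines. The function names non_ascii_chars and suggest_workon_home are made up for this example, and the C:\Envs fallback is simply the path from the answer above.

```python
import os


def non_ascii_chars(path):
    """Return the non-ASCII characters found in `path`, in order of first appearance."""
    seen = []
    for ch in path:
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return seen


def suggest_workon_home(home=None, fallback=r"C:\Envs"):
    """Return None if the home path is plain ASCII, otherwise the fallback path to use."""
    home = os.path.expanduser("~") if home is None else home
    return fallback if non_ascii_chars(home) else None
```

If suggest_workon_home() returns a path, set WORKON_HOME to it before creating the environment again.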