How can I enable SSO login to Apache Zeppelin on AWS EMR?

I created an AWS EMR cluster following http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-launch.html (I chose the application "Spark: Spark 2.1.0 on Hadoop 2.7.3 YARN with Ganglia 3.7.2 and Zeppelin 0.7.0" while creating the cluster), and I am able to access Apache Zeppelin.
Now I want to give Zeppelin access to a new user using their Gmail/Google SSO or any other login. How can I do this? Please point me to any documentation or steps.
*The SAML/SSO logins give access only to the AWS console, not to applications like Zeppelin that are hosted on the master node.

Zeppelin uses
Apache Shiro
and there are some libraries and examples for using OAuth in Shiro:
shiro-oauth
OAuth2Realm.java
pac4j security library for Shiro: OAuth, CAS, SAML, OpenID Connect, LDAP, JWT...
But Zeppelin doesn't support OAuth extensions currently (0.8.0-SNAPSHOT) as far as I know. You might have to extend Zeppelin yourself.
Docs: Zeppelin Shiro Configuration for Realm

Single sign-on can be implemented using Apache Knox; KnoxSSO support was recently added to Zeppelin.
For configuration options, check out this link.
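As a rough sketch of what the KnoxSSO setup looks like in Zeppelin's conf/shiro.ini (based on the Zeppelin Shiro documentation; the provider URL, gateway paths, and public-key path below are placeholders for your own Knox deployment):

[main]
# Realm that validates the hadoop-jwt cookie issued by KnoxSSO
knoxJwtRealm = org.apache.zeppelin.realm.jwt.KnoxJwtRealm
knoxJwtRealm.providerUrl = https://knox.example.com/
knoxJwtRealm.login = gateway/knoxsso/knoxauth/login.html
knoxJwtRealm.logout = gateway/knoxssout/api/v1/webssout
knoxJwtRealm.cookieName = hadoop-jwt
knoxJwtRealm.publicKeyPath = /etc/zeppelin/conf/knox-sso.pem
# Filter that redirects unauthenticated requests to the Knox login page
authc = org.apache.zeppelin.realm.jwt.KnoxAuthenticationFilter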

Related

In a containerized application that runs in AWS/Azure but needs access to gcloud commands, what is the best way to set up gcloud authentication?

I am very new to GCP and I would greatly appreciate some help here ...
I have a Docker containerized application that runs in AWS/Azure but needs to access the gcloud SDK as well as the Google Cloud client libraries.
What is the best way to set up gcloud authentication from an application that runs outside of GCP?
In my Dockerfile, I have this (cut short for brevity)
ENV CLOUDSDK_INSTALL_DIR /usr/local/gcloud/
RUN curl -sSL https://sdk.cloud.google.com | bash
ENV PATH $PATH:$CLOUDSDK_INSTALL_DIR/google-cloud-sdk/bin
RUN gcloud components install app-engine-java kubectl
This container is currently provisioned from an Azure App Service and AWS Fargate. When a new container instance is spawned, we would like it to be gcloud-enabled with a service account already attached, so our application can deploy to GCP using its Deployment Manager.
I understand gcloud requires us to run gcloud auth login to authenticate to an account. How can we automate the provisioning of our container if this step has to be manual?
Also, from what I understand, for the Cloud Client Libraries we can store the path to a service account key JSON file in an environment variable (GOOGLE_APPLICATION_CREDENTIALS). So this file either has to be stored inside the Docker image itself, or has to be mounted from external storage at the very least?
How safe is it to store this service account key file in external storage? What are the best practices around this?
There are two main means of authentication in Google Cloud Platform:
User Accounts: belong to people, represent the people involved in your project, and are associated with a Google Account.
Service Accounts: used by an application or an instance.
Learn more about their differences here.
Therefore, you are not required to use the gcloud auth login command to run gcloud commands.
You should use gcloud auth activate-service-account instead, along with the --key-file=<path-to-key-file> flag, which allows you to authenticate without having to sign in to a Google Account with access to your project every time you need to call an API.
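For example (a minimal sketch; the service account email, key path, and project ID are placeholders):

# Authenticate gcloud with a service account key instead of an interactive login
gcloud auth activate-service-account deployer@my-project.iam.gserviceaccount.com \
    --key-file=/path/to/key.json
gcloud config set project my-project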
This key should be stored securely, preferably encrypted, in the platform of your choice. Learn how to do that in GCP here, following these steps as an example.
Take a look at these useful links for storing secrets in Microsoft Azure and AWS.
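Putting this together for a container on AWS, one approach is to fetch the key from a secret store at container startup instead of baking it into the image. A hedged sketch (the secret name gcp-deployer-key is hypothetical):

# Hypothetical entrypoint snippet: pull the service account key from AWS
# Secrets Manager at startup, then authenticate both gcloud and the
# Google Cloud client libraries with it.
aws secretsmanager get-secret-value --secret-id gcp-deployer-key \
    --query SecretString --output text > /tmp/key.json
gcloud auth activate-service-account --key-file=/tmp/key.json
export GOOGLE_APPLICATION_CREDENTIALS=/tmp/key.json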
On the other hand, you can deploy services to GCP programmatically, either using the Cloud Client Libraries with your programming language of choice, or with Terraform, which is very intuitive if you prefer it over using the Google Cloud SDK through the CLI.
Hope this helped.

Launch Jupyter Notebooks in AWS SageMaker from a Custom Web Application

We have a requirement where we are building a web portal/platform that will use services from AWS and Git, as both will host certain content to allow users to search for certain artifacts.
We also want to allow a user, after they have searched for certain artifacts (let's say certain Jupyter notebooks), to be able to launch those notebooks from our web application. Note the notebooks are in a different domain, i.e., the AWS Console application hosts them.
Now, when a user clicks on a notebook link from the web portal search, it should open the Jupyter notebook in a notebook instance in a new tab.
We understand there is an integration between AWS SageMaker and Git, so repos that store notebooks can be configured. When a user performs a search in the web app, it will pick up the results from a GitHub API call.
The same repos can also be added to the SageMaker-GitHub integration through the AWS Console, so when a user launches a notebook they will see the GitHub repos as well.
I understand we call the SageMaker API either through the SDK or a REST API (not sure there is a REST API interface; I am exploring that). See a CLI call example:
aws sagemaker create-presigned-notebook-instance-url --notebook-instance-name notebook-sagemaker-git
This gives me a response URL: "AuthorizedUrl": "https://notebook-sagemaker-git.notebook.us-east-2.sagemaker.aws?authToken=eyJhbGciOiJIUzI1NiJ9.eyJmYXNDcmVkZW50aWFscyI6IkFZQURlQlR1NHBnZ2dlZGc3VTJNcjZKSmN3UUFYd0FCQUJWaGQzTXRZM0o1Y0hSdkxYQjFZbXhwWXkxclpYa0FSRUZvUVZadGMxSjFSVzV6V1hGVGJFWmphRXhWUTNwcVlucDZaR2x5ZDNGQ1RsZFplV1YyUkRoTGJubHRWRzVQT1dWM1RTdDBTR0p6TjJoYVdXeDJabnBrUVQwOUFBRUFCMkYzY3kxcmJYTUFTMkZ5YmpwaGQzTTZhMjF6T25WekxXVmhjM1F0TWpvMk5qZzJOek15TXpJMk5UUTZhMlY1THpObFlUaGxNMk14TFRSaU56a3RORGd4T0
However, when I open this URL it again asks me for the AWS console username and password. I feel that in the web app, a user who has logged in has already authenticated through the AWS API as well as the Git API.
So there should be no need for them to re-authenticate when they connect to the AWS Console to access their notebooks.
Is this something that can be circumvented using single sign-on, etc.?
thanks,
Aakash
The URL that you get from a call to CreatePresignedNotebookInstanceUrl is valid only for 5 minutes. If you try to use the URL after the 5-minute limit expires, you are directed to the AWS console sign-in page. See https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreatePresignedNotebookInstanceUrl.html
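One way to avoid the sign-in page, then, is to generate the presigned URL server-side on demand and redirect the user to it immediately. A minimal sketch with the AWS CLI (the 1800-second session length is just an example value):

# Generate the URL and hand it to the browser right away; the authToken in
# the URL is only valid for 5 minutes after creation. The session flag
# controls how long the notebook session stays valid once opened.
URL=$(aws sagemaker create-presigned-notebook-instance-url \
    --notebook-instance-name notebook-sagemaker-git \
    --session-expiration-duration-in-seconds 1800 \
    --query AuthorizedUrl --output text)
# Redirect the user to "$URL" from your web portal's backend, in a new tab.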
Jun

Install custom plugin for Kibana on AWS Elasticsearch instance

I want to know if it is possible to add a custom plugin to Kibana running on an AWS instance, as mentioned in this link.
From the command line we can type:
bin/kibana-plugin install some-plugin
But in the case of the AWS Elasticsearch Service, there is no command prompt/terminal, as it is just a managed service and we can't SSH into it. We just have the management console. How do we add a custom plugin for Kibana in this scenario?
From the AWS Elasticsearch Supported Plugins page:
Amazon ES comes prepackaged with several plugins that are available from the Elasticsearch community. Plugins are automatically deployed and managed for you.
The page referenced above lists all the plugins supported on each ES version.
Side note: Kibana is installed and fully managed, but it runs as a Node.js application (not as a plugin).

Enabling User Authorization in Hue on AWS EMR

We have been struggling to configure Hue on AWS EMR to use user/role-based authorization via Apache Sentry.
We are following Cloudera's documentation (https://www.cloudera.com/documentation/enterprise/5-6-x/topics/sg_sentry_service_config.html).
We have been able to start the Sentry service on the cluster and create/grant privileges to users using the Sentry shell, but after that beeline stopped working and we couldn't proceed any further.
If anyone has installed and configured Apache Sentry, any help would be appreciated.

Synchronize Secure Vault in WSO2 ESB

I have clustered WSO2 ESBs (4.9.0) with deployment synchronization enabled, and I have enabled Secure Vault. All deployments are synced to the worker nodes, but how can I sync my Secure Vault credentials with the worker nodes?
I tried copying the wso2carbon.jks file and the cipher-text.properties file, but it didn't work.
So how can I sync my Secure Vault with the other worker nodes?
Yes, if you have clustered the environment correctly it should automatically get synchronized. Steps to follow:
Add a Secure Vault entry on the ESB manager node.
Check the Secure Vault entries on the ESB worker node (if it is not running in -Dworker mode).
If the workers are running in -Dworker mode, you can instead check wso2carbon.log for the corresponding entries right after adding them to the Secure Vault.
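If you do need to copy the Secure Vault files manually between nodes, here is a minimal sketch; the paths assume a default WSO2 ESB 4.9.0 layout with $ESB_HOME pointing at the installation directory, so adjust for your setup:

# Secure Vault depends on all of these files together; copy each one to the
# same location on every worker node, then restart the worker.
scp $ESB_HOME/repository/resources/security/wso2carbon.jks worker:$ESB_HOME/repository/resources/security/
scp $ESB_HOME/repository/conf/security/cipher-text.properties worker:$ESB_HOME/repository/conf/security/
scp $ESB_HOME/repository/conf/security/cipher-tool.properties worker:$ESB_HOME/repository/conf/security/
scp $ESB_HOME/repository/conf/security/secret-conf.properties worker:$ESB_HOME/repository/conf/security/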
When you are deploying an ESB cluster, you can use Puppet and Hiera to make the configuration changes. WSO2 already provides Puppet modules to deploy WSO2 product clusters, and you can use the existing WSO2 ESB Puppet module to achieve your requirement. Refer to the "Running WSO2 Enterprise Service Bus with Secure Vault" section of the README of the WSO2 Enterprise Service Bus Puppet Module to configure the Secure Vault related settings across the cluster.