Socket not created by this factory with Spark Streaming and AWS

I am hitting this exception when accessing S3 from Spark Streaming:
java.lang.IllegalStateException: Socket not created by this factory
at org.apache.http.util.Asserts.check(Asserts.java:34)
at org.apache.http.conn.ssl.SSLSocketFactory.isSecure(SSLSocketFactory.java:435)...
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)...
at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists ...
I have tried the following steps (from other threads such as "AWS Socket Not created by this factory" and "Cannot use AWS SDK in Spring Boot Application (Socket not created by this factory)"), and none of them worked:
Shading all the amazonaws and Apache HTTP JARs.
Upgrading httpclient and httpcore explicitly to 4.5.x.
Upgrading the Amazon SDK to 1.11.x.
Upgrading Hadoop to 3.1.x.
I have been stuck for two days now and have already tried most of the solutions I could find on Stack Overflow and elsewhere. Any other ideas?

This is "new", but Spark has upgraded to a version of httpclient that breaks s3a in Hadoop 2.8 in some cases. This sounds like one of the symptoms.
I would recommend grabbing the ASF Hadoop download consistent with the Hadoop JARs in the Spark release you are using, and then dropping in the (hadoop-aws, aws-, http) JARs from that release. Or build Spark yourself with the -Phadoop-cloud profile and let Maven do the work.
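Before swapping JARs, it can help to confirm that the classpath really contains mismatched versions. The following is a minimal sketch, not part of the original answer: it groups JAR file names by artifact and reports any artifact that appears in more than one version. The function name find_version_conflicts and the "<artifact>-<version>.jar" naming assumption are illustrative only.

```python
import re
from collections import defaultdict

# Assumes the common "<artifact>-<version>.jar" naming, where the
# version part starts with a digit (e.g. httpclient-4.5.2.jar).
JAR_PATTERN = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def find_version_conflicts(jar_names):
    """Return {artifact: [versions]} for artifacts present in more than one version."""
    versions = defaultdict(set)
    for name in jar_names:
        match = JAR_PATTERN.match(name)
        if match:  # silently skip files that do not look like versioned JARs
            versions[match.group("artifact")].add(match.group("version"))
    return {artifact: sorted(found)
            for artifact, found in versions.items()
            if len(found) > 1}
```

A possible use is to pass it the combined os.listdir() output of the Spark jars directory and whatever JARs the application ships, then look for httpclient, httpcore, hadoop-aws, or aws-java-sdk entries in the result. Classes bundled inside a shaded or fat JAR will not show up in a file-name check like this one.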

Related

Cloud Composer worker fails to connect to external database

I am attempting to take my existing Cloud Composer environment and connect it to a remote SQL database (Azure SQL). I've been banging my head against this for a few days and I'm hoping someone can point out where my problem lies.
Following the documentation found here, I've spun up a GKE Service and a SQL Proxy workload. I then created a new Airflow connection, as shown here, using the full name of the service, azure-sqlproxy-service.
I test-ran one of my DAG tasks and got the following:
Unable to connect: Adaptive Server is unavailable or does not exist
Not sure what the issue was, I decided to remote directly into one of the workers, whitelist that IP on the remote DB firewall, and try to connect to the server. With no command-line MSSQL client installed, I launched Python on the worker and attempted to connect to the database with the following:
import pymssql
connection = pymssql.connect(host='database.url.net', user='sa', password='password', database='database')
I get the same error as above whether I enter the Service or the remote IP as the host. Even ignoring the service/proxy, shouldn't this Airflow worker be able to reach the remote database? I can ping websites, but when I check the remote logs, the DB doesn't show any failed logins. With such a generic error and few ideas about what to do next, I'm stuck. A few Google results have suggested switching libraries, but I'm not sure how to do that within Airflow, or whether I even need to.
What troubleshooting steps could I take next to get at least a single worker communicating with the DB before moving on to the service/proxy?
After much pain, I've found that Cloud Composer uses Ubuntu 18.04, which currently breaks pymssql, as described here:
https://github.com/pymssql/pymssql/issues/687
I tried downgrading to 2.1.4 without success. Needing to get this done, I followed the instructions outlined in this post to use pyodbc instead:
Google Composer- How do I install Microsoft SQL Server ODBC drivers on environments
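One way to separate a network problem from a driver problem in a situation like this is to test raw TCP reachability from the worker before involving any database library. The sketch below uses only the Python standard library. The function name can_reach is hypothetical, and port 1433 is assumed because it is the default SQL Server port; adjust it if the server listens elsewhere.

```python
import socket


def can_reach(host: str, port: int = 1433, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers DNS failures, refused connections, and timeouts
        return False
```

Calling can_reach('database.url.net') from a Python prompt on the worker (using the placeholder host from the question) gives a quick answer. False points to firewall, routing, or DNS. True with a failing pymssql.connect points to the client library or its TDS settings rather than the network.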

Corda apps not visible on 10004 port

I'm working on Corda on Azure Cloud.
I have deployed a Corda blockchain (4 nodes, 1 notary and 1 network manager) in Corda 2.0.
I have tried to follow the tutorial https://docs.corda.net/azure-vm.html.
When I go to http://(public IP address):10004/, I don't see my CorDapps.
I have 2 installed (jar files in /opt/corda/plugins) on each node: corda-finance (already installed by Azure) and yo!app (version M11).
I see:
Installed CorDapps
No installed custom CorDapps.
If I go to http://(public IP address):(port)/web/yo, I get:
Corda O=Organisation 2 (Corda 2.0.0), L=London, C=GB
HTTP ERROR 404
Problem accessing /web/yo. Reason:
Not Found
Powered by Jetty:// 9.4.7.v20170914
Does anyone know why?
I found the problem.
The yo!app version M11 doesn't work with Corda V2. Nothing in the tutorial says what to do (I think it is outdated), but I found an updated version of yo!app on https://explore.corda.zone/. You can download it into the plugins folder on your node with
wget http://ci-artifactory.corda.r3cev.com/artifactory/cordapp-showcase/yo-4.jar
Strangely, the "corda-finance.jar" file doesn't seem to work either. Either the jar file is also outdated, or it is not a "real" CorDapp and therefore doesn't appear on the web-service page.
Hope this can be helpful to someone else.

Pentaho DI can't connect to AWS Redshift - Amazon Error 100021

According to Pentaho's documentation, we should be using RedshiftJDBC4.jar rather than the 4.1 version. I downloaded the driver and placed it in the lib/ directory. After relaunching spoon.sh, it no longer complains that it cannot find the com.amazon.redshift.jdbc4 driver class, as it did when I was using the 4.1 driver. However, it still cannot establish the connection.
Error connecting to database [aws_redshift] :
org.pentaho.di.core.exception.KettleDatabaseException: Error occurred
while trying to connect to the database
Error connecting to database: (using class
com.amazon.redshift.jdbc4.Driver) Amazon Error setting
default driver property values.
Can anyone help on this?
On the flip side, I can connect to my endpoint using SQLWorkbench/J, a SQL client tool.
Somehow I managed to get it working. None of the AWS Redshift driver versions (4, 4.1, or 4.2) worked for me when placed in the lib/ directory with Redshift chosen as the connection type (in the Database Connection setup).
Instead, I chose PostgreSQL as the connection type, using JDBC. In the host name field, I entered the endpoint WITHOUT the port number 5439 at the end, so the endpoint should just end with ...amazonaws.com. Then fill in the database name, the port number 5439, and the username and password. If this does not work, try downloading the latest PostgreSQL JDBC driver, placing it in the lib/ directory, and trying again.
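For anyone who copies the endpoint straight from the AWS console (where it typically ends in :5439 and sometimes a database name), here is a small sketch of how the fields above map onto a standard PostgreSQL JDBC URL. The function name and the example host are made up for illustration. The URL shape jdbc:postgresql://host:port/database is the usual PostgreSQL JDBC format, and it is an assumption that Pentaho ends up building something equivalent from the form fields.

```python
def postgres_jdbc_url_for_redshift(endpoint: str, database: str, port: int = 5439) -> str:
    """Build a PostgreSQL-style JDBC URL from a Redshift endpoint.

    Any trailing ":port" or "/database" copied along with the endpoint is
    dropped, so the host part ends with "...amazonaws.com".
    """
    host = endpoint.strip().split("/", 1)[0].split(":", 1)[0]
    return f"jdbc:postgresql://{host}:{port}/{database}"


# Hypothetical endpoint, for illustration only
print(postgres_jdbc_url_for_redshift(
    "example-cluster.abc123.us-east-1.redshift.amazonaws.com:5439", "dev"))
```

The host portion of the printed URL is what goes in the host name field, with the port and database name entered separately.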

Installing AD and MSExchange 2016 in AWS EC2

I am trying to install an MSExchange 2016 in an EC2 instance from scratch without success. By from scratch, I mean I start from a new EC2 instance without any AD yet installed.
I am not very familiar with Windows Server. I ran into a lot of problems during the installation. By searching the web I fixed many of them, but I think I am still missing something needed to complete the installation. Any help would be greatly appreciated.
Here is the procedure I followed:
I created an EC2 Windows Server 2012 R2 instance
I created a simple Active Directory in AWS.
I provided the AD DNS to my Windows Server (via Network and Sharing Center, properties of Internet Protocol v4)
I joined the server into that AD (Via Control Panel > System and Security > System, change computer workgroup to the domain defined in my AWS Simple AD)
Restart computer
Log into the server as Administrator, with the AD domain
Download Exchange from here
Set up Active Directory, as in this procedure: https://judeperera.wordpress.com/2015/07/24/step-by-step-guide-for-installing-exchange-server-2016-preview/
Step 4.1 of that procedure says to run the following command:
Setup.exe /PrepareSchema /IAcceptExchangeServerLicenseTerms
When I execute it, I get the following error:
I do not understand what I need to do/fix to continue the installation.
Thanks in advance for your help!
The issue you are encountering is that Simple AD is not an actual Active Directory product; it is powered by Samba v4. What you need is to set up Microsoft Active Directory (Enterprise Edition), also called Microsoft AD, which is powered by Windows Server 2012 R2. Simple AD is only Active Directory compatible and does not support the added schema features that Exchange Server 2016 needs.
The other option is to back away from hosting your own Exchange server and instead take a look at AWS WorkMail. It is an Exchange-like service that supports ActiveSync with Outlook 2007+ and all current mobile smart devices, such as Android and iOS. I currently use it, and it took a lot of the headache out of managing my own mail server: the complexities are offloaded to the AWS environment, and all you need to do is add mail accounts and group addresses.
Either option should solve your issue.

Amazon Linux 2 AMI with .NET Core

I recently spun up an EC2 instance from AWS - "Amazon Linux 2 AMI with .NET Core 2.2" - and do not see a web server running (maybe I'm just missing something here).
Does this mean that, despite it having .NET Core, I still need to install Apache and then do the whole Kestrel and reverse proxy setup in httpd.conf? That is the approach described here:
https://gooroo.io/GoorooThink/Article/17422/Deploy-ASPNET-Core-Application-On-EC2-Amazon-Linux-Instance/32558#.XZ1NGi-ZPUI
Just needed to do a gut check and make sure there wasn't an easier way before I make a mistake.
Thank you everyone!