Delta Lake is not working with Glue 4.0.
Glue 3.0 works perfectly with delta-core_2.12-1.0.1.jar.
Which delta-core version is supported by Glue 4.0?
I have tried most of the current delta-core jars but received several error messages.
To my knowledge, AWS Glue 4.0 supports only Delta Lake 2.1.0, since it runs Spark 3.3.
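For what it's worth, on Glue 4.0 you typically don't ship your own delta-core jar at all; the documented route is the --datalake-formats job parameter plus the usual Delta Spark conf. A sketch (job name, role, and script location are placeholders):

```shell
# Hypothetical job name/role/script path; --datalake-formats and the two
# Spark conf keys below are the documented way to enable the bundled
# Delta Lake (2.1.0) on Glue 4.0 instead of supplying your own jar.
aws glue create-job \
  --name my-delta-job \
  --role MyGlueRole \
  --glue-version "4.0" \
  --command '{"Name":"glueetl","ScriptLocation":"s3://my-bucket/scripts/job.py"}' \
  --default-arguments '{
    "--datalake-formats": "delta",
    "--conf": "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"
  }'
```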
I am using the yfinance library to get some values like market_cap.
Code:
import yfinance as yf
com = yf.Ticker('1140.SR')
print(com.fast_info['market_cap'])
I've updated the library to the latest version (0.2.9), both locally and on AWS MWAA (Apache Airflow).
Locally, I'm able to run the code.
But on Amazon MWAA Airflow, I'm getting the error 'near "without": syntax error'.
Based on my searching, I believe the issue is caused by a lower SQLite version. One person was able to resolve it by upgrading SQLite - https://github.com/ranaroussi/yfinance/issues/1372
Locally, here is the configuration:
Python version - 3.10.8
yfinance - 0.2.9
SQLite3 - 3.37.2
and my Amazon MWAA Airflow configuration is:
Python version - 3.10.8 (main, Jan 17 2023, 22:57:31) [GCC 7.3.1 20180712 (Red Hat 7.3.1-15)]
yfinance - 0.2.9
SQLite3 - 3.7.17
SQLite is not a Python library; it's a system-level library that needs to be upgraded manually.
I'm able to do that locally, but how do I upgrade SQLite to a version >3.34.x on Amazon MWAA?
Can anyone help me?
I tried upgrading the SQLite version with the requirements file, but it didn't work:
apache-airflow-providers-sqlite==3.3.1
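Note that apache-airflow-providers-sqlite is just the Airflow provider package; the version yfinance cares about is the SQLite C library that Python's sqlite3 module was compiled against, which no pip requirement can change. You can confirm what a worker actually has with:

```python
import sqlite3

# Version of the SQLite C library the interpreter was linked against.
# This is what yfinance's cache layer actually uses; installing a
# different pip package does not change it.
print(sqlite3.sqlite_version)
```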
Will there be any issues with my current notebooks and jobs if I upgrade my Databricks Runtime version from 9.1 LTS to 10.4 LTS?
I haven't tried upgrading yet. If I upgrade, will I be able to change back to the previous version?
It's really a very broad question - the exact answer depends on the features and libraries/connectors that you're using in your code. You can refer to the Databricks Runtime 10.x migration guide and the Spark 3.2.1 migration guide for more information about the upgrade.
Usually, the correct approach is to run your job with the new runtime in a test environment, where your production data won't be affected.
So far I'm using Scala 2.11 with Java 8 to build the library used by the Glue ETL job. We're planning to upgrade to Scala 2.12 with Java 11, but I'm not sure whether they are supported by Glue ETL.
The Glue versions are listed here. The latest version supports Spark 2.4.3.
In Spark 2.4.3, the default version of Scala is 2.11.
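Given that, the safest bet is to pin your library build to the Scala line Spark 2.4.3 ships with. A minimal build.sbt sketch (the library name is a placeholder):

```scala
// build.sbt -- hypothetical library build pinned to the Scala line
// that Spark 2.4.3 (and therefore this Glue version) uses by default.
name := "my-glue-lib"        // placeholder name
scalaVersion := "2.11.12"    // last 2.11.x release; matches Spark 2.4.3's default

// "provided" because Glue supplies Spark at runtime.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3" % "provided"
```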
I am getting an exception when using the AWS S3 SDK with OpenStack to get all buckets/containers.
Expected Behavior
I should be able to get the list of buckets from OpenStack using the AWS S3 SDK on .NET Core, as it works fine on .NET Framework 4.7.2.
Current Behavior
I'm getting an exception while fetching the list of buckets from OpenStack using the S3 SDK.
The AWS S3 SDK works fine with .NET Framework 4.7.2.
Possible Solution
Is there any incompatibility between .NET Core and OpenStack?
Steps to Reproduce (for bugs)
Create a simple .NET Core application and try to fetch the list of buckets using AWS S3 credentials.
Context
.NET Framework 4.7.2 was working fine, but after migrating to .NET Core it is not working.
Your Environment
AWSSDK.Core version used: 3.3.100
Service assembly and version used:
Operating System and version: Windows 10
Visual Studio version: VS 2015
Targeted .NET platform: .NET Core
.NET Core Info
.NET Core version used for development: .NET Core 2.1
I got the following response on the GitHub issue:
https://github.com/aws/aws-sdk-net/issues/1246
"Hi. We do not make any guarantees about how the SDK behaves with non-S3 services. Please re-open the issue if you can reproduce the problem against S3."
Hadoop 3 is already 15 months old, and EMR official release labels still support only Hadoop 2.
I couldn't find any quick documentation on how to set up Hadoop 3.1.2 on EMR. Are most people not using it? It seems more difficult than it should be - what am I missing?
EMR did come out with official support for Hadoop 3.1 in September as part of the EMR 6 beta release. [1]
It also includes support for Amazon Linux 2 and Amazon Corretto JDK 8.
[1]EMR6-beta: https://aws.amazon.com/about-aws/whats-new/2019/09/simplify-your-spark-application-dependency-management-with-docker-and-hadoop-3-with-emr-6-0-0-beta/
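In practice that means selecting the beta release label at cluster creation. A minimal sketch (instance type, count, and key name are placeholders):

```shell
# Hypothetical cluster settings; emr-6.0.0-beta is the release label from
# the announcement above and brings Hadoop 3.1, Amazon Linux 2, and Corretto 8.
aws emr create-cluster \
  --release-label emr-6.0.0-beta \
  --applications Name=Hadoop Name=Spark \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --use-default-roles \
  --ec2-attributes KeyName=my-key
```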