Compress and uncompress using different versions of the snappy-java library

Can using different versions of the xerial snappy-java library cause a compression error? Is the format not backward compatible?
I am compressing using:
org = "org.xerial.snappy",
name = "snappy-java",
rev = "1.1.7.3",
and uncompressing using:
org = "org.xerial.snappy",
name = "snappy-java",
rev = "1.0.3.2",
and I am getting the following error:
SEVERE: java.io.IOException: FAILED_TO_UNCOMPRESS(5)
at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:98)
at org.xerial.snappy.SnappyNative.rawUncompress(Native Method)
at org.xerial.snappy.Snappy.rawUncompress(Snappy.java:474)
at org.xerial.snappy.Snappy.uncompress(Snappy.java:513)
at org.xerial.snappy.SnappyInputStream.readFully(SnappyInputStream.java:147)
at org.xerial.snappy.SnappyInputStream.readHeader(SnappyInputStream.java:99)
at org.xerial.snappy.SnappyInputStream.<init>(SnappyInputStream.java:59)
at com.twitter.ads.common.io.SnappyCompressionCodec.deserialize(SnappyCompressionCodec.java:85)

Related

Fatal Error: init_fs_encoding: failed to get the Python codec of the filesystem encoding

I am trying to run Python 3.11 on CentOS 7 and keep getting this error block any time I invoke python3.11:
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Python path configuration:
PYTHONHOME = (not set)
PYTHONPATH = (not set)
program name = 'python3.11'
isolated = 0
environment = 1
user site = 1
safe_path = 0
import site = 1
is in build tree = 0
stdlib dir = '/usr/local/lib/python3.11'
sys._base_executable = '/usr/local/bin/python3.11'
sys.base_prefix = '/usr/local'
sys.base_exec_prefix = '/usr/local'
sys.platlibdir = 'lib'
sys.executable = '/usr/local/bin/python3.11'
sys.prefix = '/usr/local'
sys.exec_prefix = '/usr/local'
sys.path = [
'/usr/local/lib/python311.zip',
'/usr/local/lib/python3.11',
'/usr/local/lib/python3.11/lib-dynload',
]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'
Current thread 0x00007f11582b4740 (most recent call first):
<no Python frame>
I have tried setting paths and deconflicting the references to other python versions on the device.

CRSError when opening a shapefile using geopandas

I am getting an error when opening a shapefile that I downloaded from ArcGIS Hub. I am using the geopandas library in Python, and the code used is:
data = geopandas.read_file('filepath/name.shp')
I get the following error message:
CRSError: Invalid projection: epsg:4326: (Internal Proj Error: proj_create: SQLite error on SELECT name, type, coordinate_system_auth_name, coordinate_system_code, datum_auth_name, datum_code, area_of_use_auth_name, area_of_use_code, text_definition, deprecated FROM geodetic_crs WHERE auth_name = ? AND code = ?: no such column: area_of_use_auth_name)
I downloaded the shapefile from the following link:
https://hub.arcgis.com/search?collection=App%2CMap
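This particular SQLite error ("no such column: area_of_use_auth_name") usually means PROJ is reading a proj.db that is older than the schema the installed PROJ library expects, most often because a PROJ_LIB environment variable (set by another installation, such as Anaconda) points at a different PROJ data directory. A minimal diagnostic sketch, assuming a recent pyproj; the printed paths show which proj.db is actually being used:
import os
import pyproj

print(pyproj.__version__)              # version of the pyproj package
print(pyproj.proj_version_str)         # version of the underlying PROJ library
print(pyproj.datadir.get_data_dir())   # directory whose proj.db PROJ is reading
print(os.environ.get("PROJ_LIB"))      # if set, this can override pyproj's bundled data data directory
If PROJ_LIB points at an older installation, unsetting it (or upgrading pyproj so that the library and its bundled proj.db match) is the usual fix.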

Changing the input type for linear learner to CSV

I am trying to run linear learner on a simple dataset. My CSV data is uploaded to an S3 bucket. The problem is that when I run it, I get the following error:
UnexpectedStatusException: Error for Training job linear-learner-2020-05-23-22-31-40-894: Failed. Reason: ClientError: Unable to read data channel 'train'. Requested content-type is 'application/x-recordio-protobuf'. Please verify the data matches the requested content-type. (caused by MXNetError)
Caused by: [22:34:37] /opt/brazil-pkg-cache/packages/AIAlgorithmsCppLibs/AIAlgorithmsCppLibs-2.0.2746.0/AL2012/generic-flavor/src/src/aialgs/io/iterator_base.cpp:100: (Input Error) The header of the MXNet RecordIO record at position 0 in the dataset does not start with a valid magic number.
I did some googling, and the advice is to change the content_type to 'text/csv'. My question is: how do I do this? Or does anyone know how to get this working? Thanks! Here is my linear learner code:
container = get_image_uri(boto3.Session().region_name, 'linear-learner')
linear = sagemaker.estimator.Estimator(container,
                                       role,
                                       train_instance_count=1,
                                       train_instance_type='ml.c4.xlarge',
                                       output_path=output_location,
                                       sagemaker_session=sess)
linear.set_hyperparameters(predictor_type='regressor',
                           mini_batch_size=200)
You can set the content type through SageMaker input channels:
train_data = sagemaker.inputs.TrainingInput(
    "s3://my-bucket/path/to/train",
    distribution="FullyReplicated",
    content_type="text/csv",
    s3_data_type="S3Prefix",
    record_wrapping=None,
    compression=None
)
validation_data = sagemaker.inputs.TrainingInput(
    "s3://my-bucket/path/to/validation",
    distribution="FullyReplicated",
    content_type="text/csv",
    s3_data_type="S3Prefix",
    record_wrapping=None,
    compression=None
)
linear.fit({"train": train_data, "validation": validation_data})
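Note that sagemaker.inputs.TrainingInput is the name in version 2 of the SageMaker Python SDK. The get_image_uri and train_instance_count spellings in the question suggest SDK v1, where the equivalent channel class is sagemaker.session.s3_input with the same keyword arguments; a minimal sketch under that assumption (the S3 path is a placeholder):
import sagemaker

# SDK v1 equivalent of TrainingInput: attach the CSV content type to the channel
train_data = sagemaker.session.s3_input(
    "s3://my-bucket/path/to/train",
    distribution="FullyReplicated",
    content_type="text/csv",
    s3_data_type="S3Prefix",
)
linear.fit({"train": train_data})
Also note that the built-in linear learner expects CSV input with no header row and with the target variable in the first column.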

How to transfer data from one system to another system's HDFS (connected through LAN) using Flume?

I have a computer on a LAN connection. I need to transfer data from this system to another system's HDFS location using Flume.
I have tried using the IP address of the sink system, but it didn't work. Please help.
Regards,
Athiram
This can be achieved by using the Avro mechanism.
Flume has to be installed on both machines. A config file with the following contents has to be created and run on the source system, where the logs are generated:
a1.sources = tail-file
a1.channels = c1
a1.sinks = avro-sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.sources.tail-file.type = spooldir
a1.sources.tail-file.spoolDir = <location of spool directory>
a1.sources.tail-file.channels = c1
a1.sinks.avro-sink.type = avro
# Hostname of the destination system; the port must match the one the
# destination's Avro source binds to.
a1.sinks.avro-sink.hostname = <IP address of the destination system>
a1.sinks.avro-sink.port = 44444
a1.sinks.avro-sink.channel = c1
A config file with the following contents has to be created and run on the destination system, where the data will be written to HDFS:
a2.sources = avro-collection-source
a2.sinks = hdfs-sink
a2.channels = mem-channel
a2.channels.mem-channel.type = memory
a2.channels.mem-channel.capacity = 1000
a2.sources.avro-collection-source.type = avro
# Bind to all interfaces, not localhost, so that the source system can
# connect over the LAN.
a2.sources.avro-collection-source.bind = 0.0.0.0
a2.sources.avro-collection-source.port = 44444
a2.sources.avro-collection-source.channels = mem-channel
a2.sinks.hdfs-sink.type = hdfs
a2.sinks.hdfs-sink.hdfs.path = hdfs://localhost:54310/user/hduser/
a2.sinks.hdfs-sink.hdfs.writeFormat = Text
a2.sinks.hdfs-sink.hdfs.filePrefix = testing
a2.sinks.hdfs-sink.channel = mem-channel
Now the data from the log files on the source system will be written to HDFS on the destination system.
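With both config files in place, start the destination agent first, so that its Avro source is listening before the source agent tries to connect. A typical invocation (the config file names here are placeholders for wherever the two files above were saved):
flume-ng agent --conf conf --conf-file dest.conf --name a2
flume-ng agent --conf conf --conf-file source.conf --name a1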
Regards,
Athiram

Liquidsoap - Use PulseAudio With Static Image

I am using Liquidsoap to stream audio from my Murmur server, along with a static image, to Icecast. However, I can't seem to get Liquidsoap to use my JPEG file:
Failed to register plugin /usr/lib/frei0r-1/facebl0r.so: Frei0r.Not_a_plugin
Failed to register plugin /usr/lib/frei0r-1/facedetect.so: Frei0r.Not_a_plugin
Invalid value at line 6, char 14-43:
Could not get a valid media file of kind {audio=0;video=1;midi=0} from "/home/iandun/stream/test.jpg".
The file does exist, and I used GIMP to create it. My code (quite short) is below:
#!/usr/bin/liquidsoap
set("frame.video.width", 800)
set("frame.video.height", 600)
video_file = "/home/iandun/stream/test.jpg"
video = single(video_file)
source = mux_video(video=video, input.pulseaudio(device="stream.monitor"))
output.icecast(%ogg(%vorbis, %theora), host="duncan.usr.sh", port=8000,
               password="my_password", mount="test.ogv",
               source, fallible=true)
What am I doing wrong?