cannot access s3 bucket using vertica - amazon-web-services

I am running vertica-ce in Docker and I created a table. Now I want to export that table as Parquet to an S3 bucket. When I tried to export using EXPORT TO PARQUET(directory = 's3://s3-bucket-name/data') I got this error:
ERROR 8198: Unable to verify if directory [s3://eucloid-vertica-migration/data/] exists due to 'Access Denied'
I know why I'm getting this error: I need to provide the access_key and secret_key, but I'm unable to use the AWS CLI inside my Docker container. I ran docker exec -it vertica-ce bash -l to get into the container and tried to install the awscli, but apt-get, yum, apk, nothing is working.
If anyone has a solution for this, please let me know!

You have a few options.
Set AWS parameters in the session.
There are a number of S3 settings you can configure. For instance, if all you need to set is the access key and secret key, you can do this:
=> ALTER SESSION SET AWSAuth='access_key:secret_key';
=> EXPORT TO PARQUET(directory = 's3://s3-bucket-name/data');
Depending on your setup, you may need to set additional config options (e.g. region, endpoint url, etc). All of the settings are documented here: https://www.vertica.com/docs/12.0.x/HTML/Content/Authoring/AdministratorsGuide/ConfiguringTheDB/S3Parameters.htm
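For example, the region and endpoint are set the same way (AWSRegion and AWSEndpoint from that parameters page); the values below are just placeholders for your own setup:
=> ALTER SESSION SET AWSRegion='us-east-1';
=> ALTER SESSION SET AWSEndpoint='s3.us-east-1.amazonaws.com';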
Create a new version of the image that has the AWS CLI in it.
The vertica-ce image is currently CentOS-based, so it uses the yum package manager. You can create a new image using a Dockerfile like this sample:
FROM vertica/vertica-ce:latest
USER root
RUN set -x \
&& yum -q -y makecache \
&& yum install -y unzip \
&& yum clean all \
&& rm -rf /var/cache/yum \
&& cd /tmp \
&& curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscli.zip \
&& unzip awscli.zip \
&& /tmp/aws/install -i /usr/bin/aws-cli -b /usr/bin
USER dbadmin
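To use it, a rough sketch of the build-and-run step would be (the image tag vertica-ce-aws is just an example name):
docker build -t vertica-ce-aws .
docker run -p 5433:5433 --name vertica_ce vertica-ce-aws
After that, docker exec -it vertica_ce aws configure lets you store your credentials inside the container.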
Mount a volume so that you can access the exported files from your host PC.
With this approach you can use the AWS CLI installed on your host to copy the exported files to an S3 bucket.
Run this docker command to start the CE container, persisting everything written to /data in the container to a local vertica-data directory.
docker run -p 5433:5433 \
--mount type=bind,source=$(pwd)/vertica-data,target=/data \
--name vertica_ce \
vertica/vertica-ce:latest
Run EXPORT TO PARQUET using the in-container path /data.
Access the Parquet files on your host in the local vertica-data directory.
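Putting those last two steps together, a minimal sketch (the table name my_table and the export subdirectory are placeholders) would be:
Inside the container, in vsql:
=> EXPORT TO PARQUET(directory = '/data/export') AS SELECT * FROM my_table;
On the host:
$ aws s3 cp ./vertica-data/export s3://s3-bucket-name/data --recursive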

Related

pull access denied repo does not exist or may require authorization: server message:insufficient_scope: authorization failed"host=registry-1.docker.io

My Docker container works perfectly locally, using the default context and the command "docker compose up". I'm trying to run my Docker image on ECS in AWS following this guide - https://aws.amazon.com/blogs/containers/deploy-applications-on-amazon-ecs-using-docker-compose/
I've followed all of the steps in the guide and set the context to my new context (I've tried all 3 options). After I run "docker compose up" I get the above error, shown again here in detail:
INFO trying next host error="pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed" host=registry-1.docker.io
pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
I've also set the user and added all of the permissions I can think of - image below
I've looked everywhere and I can't find traction, please help :)
The image is located on AWS ECS and Docker hub - I've tried both
Here is my Docker file:
FROM php:7.4-fpm
# Arguments defined in docker-compose.yml
ARG user
ARG uid
# Install system dependencies
RUN apt-get update && apt-get install -y \
git \
curl \
libpng-dev \
libonig-dev \
libxml2-dev \
zip \
unzip
# Clear cache
RUN apt-get clean && rm -rf /var/lib/apt/lists/*
RUN curl -sS https://getcomposer.org/installer | php -- \
    --install-dir=/usr/local/bin --filename=composer
# Install PHP extensions
RUN docker-php-ext-install pdo_mysql mbstring exif pcntl bcmath gd
# Get latest Composer
COPY --from=composer:latest /usr/bin/composer /usr/bin/composer
# Create system user to run Composer and Artisan Commands
# RUN useradd -G www-data,root -u $uid -d /home/$user $user
RUN mkdir -p /home/$user/.composer && \
chown -R $user:$user /home/$user
# Set working directory
WORKDIR /var/www
USER $user

Connecting to a s3 bucket from docker container

I am building a Docker container which transforms some data in a specific folder. I would like to put those output files into an S3 bucket, within a specific folder. I have been going through the AWS CLI documentation but I am not sure how this needs to be approached.
I have installed it without errors by using:
# AWS CLI installation
RUN apk add --no-cache \
python3 \
py3-pip \
&& pip3 install --upgrade pip \
&& pip3 install \
awscli \
&& rm -rf /var/cache/apk/*
RUN aws --version
I have read about adding a YAML configuration to point to the bucket, but I am not sure how the process needs to be done. Could someone with a similar project in mind point me to some documentation or explain how to approach it? I am very much a layman in Docker.
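For context, the kind of command involved would be something like the following sketch, with the credentials assumed to be passed as environment variables at run time; the image name, local path, bucket, and folder are all placeholders:
docker run --rm \
  -e AWS_ACCESS_KEY_ID=<your-key-id> \
  -e AWS_SECRET_ACCESS_KEY=<your-secret-key> \
  my-transform-image \
  aws s3 cp /data/output s3://my-bucket/my-folder/ --recursive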

running script to upload file to AWS S3 works, but running the same script via jenkins job doesn't work

The simple goal:
I would like to have two containers both running on my local machine: one Jenkins container and one SSH server container. Then, a Jenkins job could connect to the SSH server container and execute an aws command to upload a file to S3.
My workspace directory structure:
a docker-compose.yml (details see below)
a directory named centos/,
Inside centos/ I have a Dockerfile for building the SSH server image.
The docker-compose.yml:
In my docker-compose.yml I declared the two containers (services):
One Jenkins container, named jenkins.
One SSH server container, named remote_host.
version: '3'
services:
jenkins:
container_name: jenkins
image: jenkins/jenkins
ports:
- "8080:8080"
volumes:
- $PWD/jenkins_home:/var/jenkins_home
networks:
- net
remote_host:
container_name: remote_host
image: remote-host
build:
context: centos7
networks:
- net
networks:
net:
The Dockerfile for the remote_host is like this (Notice the last RUN installs the AWS CLI):
FROM centos
RUN yum -y install openssh-server
RUN useradd remote_user && \
echo remote_user:1234 | chpasswd && \
mkdir /home/remote_user/.ssh && \
chmod 700 /home/remote_user/.ssh
COPY remote-key.pub /home/remote_user/.ssh/authorized_keys
RUN chown remote_user:remote_user -R /home/remote_user/.ssh/ && \
chmod 600 /home/remote_user/.ssh/authorized_keys
RUN ssh-keygen -A
RUN rm -rf /run/nologin
RUN yum -y install unzip
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && unzip awscliv2.zip && ./aws/install
Current situation with the above setup:
I run docker-compose build and docker-compose up. Both the jenkins container and the remote_host (SSH server) container are up and running successfully.
I can go inside the jenkins container by:
$ docker exec -it jenkins bash
jenkins@7551f2fa441d:/$
I can successfully ssh to the remote_host container by:
jenkins@7551f2fa441d:/$ ssh -i /tmp/remote-key remote_user@remote_host
Warning: the ECDSA host key for 'remote_host' differs from the key for the IP address '172.19.0.2'
Offending key for IP in /var/jenkins_home/.ssh/known_hosts:1
Matching host key in /var/jenkins_home/.ssh/known_hosts:2
Are you sure you want to continue connecting (yes/no)? yes
[remote_user@8c203bbdcf72 ~]$
Inside the remote_host container, I have also configured my AWS access key and secret key under ~/.aws/credentials:
[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
I can successfully run an aws command to upload a file from the remote_host container to my AWS S3 bucket, like:
[remote_user@8c203bbdcf72 ~]$ aws s3 cp myfile s3://mybucket123asx/myfile
What the issue is
Now, I would like my Jenkins job to execute the aws command to upload the file to S3. So I created a shell script inside my remote_host container; the script is like this:
#!/bin/bash
BUCKET_NAME=$1
aws s3 cp /tmp/myfile s3://$BUCKET_NAME/myfile
In my Jenkins, I have configured the SSH connection, and in my Jenkins job configuration I have:
As you can see, it simply runs the script located in the remote_host container.
When I build the Jenkins job, I always get this error in the console: upload failed: ../../tmp/myfile to s3://mybucket123asx/myfile Unable to locate credentials.
Why does the same s3 command work when executed in the remote_host container but not when run from the Jenkins job?
I also tried explicitly exporting the AWS key ID and secret key in the script (bear in mind that I have ~/.aws/credentials configured in remote_host, which works without explicitly exporting the keys):
#!/bin/bash
BUCKET_NAME=$1
export aws_access_key_id=AKAARXL1CFQNN4UV5TIO
export aws_secret_access_key=MY_SECRETE_KEY
aws s3 cp /tmp/myfile s3://$BUCKET_NAME/myfile
OK, I solved my issue by changing the export statements to upper case. The cause of the issue is that when Jenkins runs the script, it runs as remote_user on remote_host. Although I have ~/.aws/credentials set up on remote_host, it was configured under the root user's home directory, so remote_user cannot read it:
[root@8c203bbdcf72 /]# ls -l ~/.aws/
total 4
-rw-r--r-- 1 root root 112 Sep 25 19:14 credentials
That's why, when Jenkins runs the script to upload the file to S3, it fails with Unable to locate credentials: the credentials file can't be read by remote_user. So I still have to keep the lines that export the AWS key ID and secret key. @Marcin's comment was helpful: the variable names need to be capital letters, otherwise it does not work.
So, overall, what I did to fix the issue was to update my script with:
export AWS_ACCESS_KEY_ID=AKAARXL1CFQNN4UV5TIO
export AWS_SECRET_ACCESS_KEY=MY_SECRET_KEY
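For reference, the whole updated script would then look roughly like this (the secret key value is a placeholder, as above):
#!/bin/bash
BUCKET_NAME=$1
# the AWS CLI only picks these variables up when their names are upper case
export AWS_ACCESS_KEY_ID=AKAARXL1CFQNN4UV5TIO
export AWS_SECRET_ACCESS_KEY=MY_SECRET_KEY
aws s3 cp /tmp/myfile s3://$BUCKET_NAME/myfile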

What is correct working IBM repository for Liberty profile docker image

I am trying to build a Docker image with the Liberty profile, using the Dockerfile at the location below.
https://github.com/WASdev/ci.docker/blob/master/ga/developer/kernel/Dockerfile
FROM ibmjava:8-jre
RUN apt-get update \
&& apt-get install -y --no-install-recommends unzip \
&& rm -rf /var/lib/apt/lists/*
#Install WebSphere Liberty
ENV LIBERTY_VERSION 16.0.0_03
ARG LIBERTY_URL
ARG DOWNLOAD_OPTIONS=""
RUN LIBERTY_URL=${LIBERTY_URL:-$(wget -q -O - https://public.dhe.ibm.com/ibmdl/export/pub/software/websphere/wasdev/downloads/wlp/index.yml | grep $LIBERTY_VERSION -A 6 | sed -n 's/\s*kernel:\s//p' | tr -d '\r' )} \
&& wget $DOWNLOAD_OPTIONS $LIBERTY_URL -U UA-IBM-WebSphere-Liberty-Docker -O /tmp/wlp.zip \
&& unzip -q /tmp/wlp.zip -d /opt/ibm \
&& rm /tmp/wlp.zip
ENV PATH=/opt/ibm/wlp/bin:$PATH
# Set Path Shortcuts
ENV LOG_DIR=/logs \
WLP_OUTPUT_DIR=/opt/ibm/wlp/output
RUN mkdir /logs \
&& ln -s $WLP_OUTPUT_DIR/defaultServer /output \
&& ln -s /opt/ibm/wlp/usr/servers/defaultServer /config
# Configure WebSphere Liberty
RUN /opt/ibm/wlp/bin/server create \
&& rm -rf $WLP_OUTPUT_DIR/.classCache /output/workarea
COPY docker-server /opt/ibm/docker/
EXPOSE 9080 9443
CMD ["/opt/ibm/docker/docker-server", "run", "defaultServer"]**
When I build a Docker image using this code, I get the error below. It looks like this repository is not active anymore. Can anyone provide a valid repository?
CWWKF1219E: The IBM WebSphere Liberty Repository cannot be reached. Verify that your computer has network access and firewalls are configured correctly, then try the action again. If the connection still fails, the repository server might be temporarily unavailable.
The URL is correct.
As the error message indicates, try checking your network config. To do that, try to reach this link in a web browser (this URL is taken straight from the script):
https://public.dhe.ibm.com/ibmdl/export/pub/software/websphere/wasdev/downloads/wlp/index.yml
Also, you could test your connection to the repository outside of the Docker environment by running:
$WLP_HOME/bin/installUtility testConnection
If you are able to ping the repo from your computer, but not within the docker container, then perhaps your docker container has no internet access.
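One quick way to check that from the same base image (a sketch; it assumes wget is available in ibmjava:8-jre, which the Dockerfile above also relies on):
docker run --rm ibmjava:8-jre wget -q -O /dev/null \
  https://public.dhe.ibm.com/ibmdl/export/pub/software/websphere/wasdev/downloads/wlp/index.yml \
  && echo "repository reachable from inside Docker"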
To fix the "docker can't access internet" issue, it looks like the solution from the above link was to do:
service docker restart

connecting sftp server with in AWS

I am trying to create a job that connects to an SFTP server from AWS services and brings files into S3 storage in AWS. It will be an automated job which runs every night and brings data into S3. I have seen documentation about how to connect to AWS and import data into S3 manually, but I have found nothing about connecting to an external SFTP server to bring data into S3. I don't know if it is doable?
You can now use the managed SFTP service by AWS (AWS Transfer for SFTP). It provides a fully managed SFTP server which is easy to set up and is reliable, scalable and durable. It uses S3 as the backend for storing files.
Use S3FS to configure sftp connection directly to S3.
All you need to do is install S3FS
https://github.com/s3fs-fuse/s3fs-fuse/wiki/Installation-Notes
Install the build dependencies for fuse and s3fs.
CentOS/RHEL Users:
# yum install gcc libstdc++-devel gcc-c++ curl-devel libxml2-devel openssl-devel mailcap
Ubuntu Users:
$ sudo apt-get install build-essential libcurl4-openssl-dev libxml2-dev mime-support
Download and Compile latest fuse
# wget https://github.com/libfuse/libfuse/releases/download/fuse-2.9.7/fuse-2.9.7.tar.gz
# tar xzf fuse-2.9.7.tar.gz
# cd fuse-2.9.7
# ./configure --prefix=/usr/local
# make && make install
# export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
# ldconfig
# modprobe fuse
Download and Compile latest S3FS
https://code.google.com/archive/p/s3fs/downloads
# cd /usr/src/
# wget https://s3fs.googlecode.com/files/s3fs-1.74.tar.gz
# tar xzf s3fs-1.74.tar.gz
# cd s3fs-1.74
# ./configure --prefix=/usr/local
# make && make install
Setup Access Keys
# echo AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY > ~/.passwd-s3fs
# chmod 600 ~/.passwd-s3fs
Mount S3 Bucket
# mkdir /tmp/cache
# mkdir /s3mnt
# chmod 777 /tmp/cache /s3mnt
# s3fs -o use_cache=/tmp/cache mydbbackup /s3mnt
Make your mount point the SFTP user's home directory; this will direct files transferred using SFTP into S3.
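As a rough sketch of that step (the user name sftpuser is a placeholder; /s3mnt and the mydbbackup bucket are the examples from above), you could point the SFTP user's home at the mount:
# useradd -d /s3mnt sftpuser
or, for an existing user:
# usermod -d /s3mnt sftpuser
Anything that user uploads over SFTP then lands in the mydbbackup bucket through the s3fs mount.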
NOTE: Do not forget to add permissions to your S3 bucket to allow authenticated AWS users.