Image permission in jenkins docker agent - amazon-web-services

I am using a normal Jenkins installation (NOT THE DOCKER IMAGE) on a standard AWS EC2 instance, with Docker Engine installed alongside Jenkins.
I have a simple jenkins pipeline like this:
pipeline {
    agent none
    stages {
        stage('Example Build') {
            agent {
                docker {
                    image 'cypress/base:latest'
                    args '--privileged --env CYPRESS_CACHE_FOLDER=~/.cache'
                }
            }
            steps {
                sh 'ls'
                sh 'node --version'
                sh 'yarn install'
                sh 'make e2e-test'
            }
        }
    }
}
This makes the pipeline fail at the yarn install step while installing Cypress, even though all of its dependencies are satisfied by the Cypress image.
ERROR LOG FROM JENKINS
error /var/lib/jenkins/workspace/Devops-Capstone-Project_master/node_modules/cypress: Command failed.
Exit code: 1
Command: node index.js --exec install
Arguments:
Directory: /var/lib/jenkins/workspace/Devops-Capstone-Project_master/node_modules/cypress
Output:
Cypress cannot write to the cache directory due to file permissions
See discussion and possible solutions at
https://github.com/cypress-io/cypress/issues/1281
----------
Failed to access /.cache:
EACCES: permission denied, mkdir '/.cache'
After some investigation I found that, although I provided the environment variable "CYPRESS_CACHE_FOLDER=~/.cache" to override the default cache location in the root directory, and also passed "--privileged", it still fails because Jenkins and Docker force their own arguments and user mapping (the Jenkins user's UID/GID from the host) onto the container.
I have also tried passing "-u 1000:1000" to override the user mapping, but it didn't work.
What could possibly be wrong? Any recommendations or workarounds for this issue?
Thanks.

I have found a workaround: build the image from a Dockerfile and pass the Jenkins user ID and group to it as build arguments, as described in this thread.
However, this is not guaranteed to work on multi-node (master->agents) Jenkins installations, as the Jenkins user ID and group may differ between nodes.
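A minimal sketch of that workaround, assuming the agent runs on the same host as the Jenkins controller; the file name Dockerfile.cypress, the image tag, and the user name are illustrative, and the UID/GID values must be adjusted if they clash with users already present in the base image:
# Write an illustrative Dockerfile that creates a user matching the Jenkins host user,
# so the container has a real home directory and a writable Cypress cache.
cat > Dockerfile.cypress <<'EOF'
FROM cypress/base:latest
ARG UID=1000
ARG GID=1000
RUN groupadd -g "$GID" jenkins && useradd -m -u "$UID" -g "$GID" jenkins
USER jenkins
ENV CYPRESS_CACHE_FOLDER=/home/jenkins/.cache
EOF

# Pass the Jenkins user's UID/GID from the host as build arguments.
docker build \
  --build-arg UID="$(id -u jenkins)" \
  --build-arg GID="$(id -g jenkins)" \
  -t cypress-jenkins:latest \
  -f Dockerfile.cypress .
The resulting cypress-jenkins:latest image can then replace cypress/base:latest in the docker agent block above.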

Related

Kubectl against GKE Cluster through Terraform's local-exec?

I am trying to automatically migrate workloads between two node pools in a GKE cluster. I am running Terraform in a GitLab pipeline. When a new node pool is created, the local-exec runs, and I want to cordon and drain the old node pool so that the pods are rescheduled on the new one. I am using the registry.gitlab.com/gitlab-org/terraform-images/releases/1.1:v0.43.0 image for my GitLab jobs. python3 is installed with apk add, as is the gcloud CLI - by downloading the tar and using the gcloud binary executable from the google-cloud-sdk/bin directory.
I am able to use commands like ./google-cloud-sdk/bin/gcloud auth activate-service-account --key-file=<key here>.
The problem is that I am not able to use kubectl against my cluster.
Although I have installed the gke-gcloud-auth-plugin with ./google-cloud-sdk/bin/gcloud components install gke-gcloud-auth-plugin --quiet, once in the CI job and a second time in the local-exec script in the HCL code, I get the following errors:
module.create_gke_app_cluster.null_resource.node_pool_provisioner (local-exec): E0112 16:52:04.854219 259 memcache.go:238] couldn't get current server API group list: Get "https://<IP>/api?timeout=32s": getting credentials: exec: executable <hidden>/google-cloud-sdk/bin/gke-gcloud-auth-plugin failed with exit code 1
module.create_gke_app_cluster.null_resource.node_pool_provisioner (local-exec): Unable to connect to the server: getting credentials: exec: executable <hidden>/google-cloud-sdk/bin/gke-gcloud-auth-plugin failed with exit code 1
When I check the version of the gke-gcloud-auth-plugin with gke-gcloud-auth-plugin --version
I am getting the following error:
/bin/sh: eval: line 253: gke-gcloud-auth-plugin: not found
This clearly means that the plugin is not installed, or at least not found on the PATH.
The image that I am using is based on alpine for which there is no way to install the plugin via package manager, unfortunately.
Edit: gcloud components list shows gke-gcloud-auth-plugin as installed too.
The solution was to use the google/cloud-sdk image, install Terraform in it, and use that image for the job in question.
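A rough sketch of that approach, assuming the Debian-based google/cloud-sdk image; the Terraform version, the plugin package name, and the variable names below are assumptions to adapt:
# Run inside a job that uses the google/cloud-sdk image (Debian-based).
# Install the GKE auth plugin and a pinned Terraform release.
apt-get update && apt-get install -y curl unzip google-cloud-sdk-gke-gcloud-auth-plugin

TERRAFORM_VERSION=1.3.7   # placeholder version
curl -fsSLo terraform.zip \
  "https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip"
unzip terraform.zip -d /usr/local/bin && rm terraform.zip

# Authenticate and fetch cluster credentials so kubectl (and the
# cordon/drain commands in local-exec) can reach the cluster.
gcloud auth activate-service-account --key-file="$SERVICE_ACCOUNT_KEY_FILE"
gcloud container clusters get-credentials "$CLUSTER_NAME" --region "$REGION"
kubectl get nodes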

Jenkins for C++ Project with Docker

I want to integrate my C++ project with Jenkins using Docker.
First I created a setup for Jenkins using:
Dockerfile:
FROM jenkins/jenkins:latest
USER root
RUN apt-get update && apt-get install -y git make g++
ENV JAVA_OPTS -Djenkins.install.runSetupWizard=false
ENV CASC_JENKINS_CONFIG /var/jenkins_home/casc.yaml
ENV JENKINS_ADMIN_ID=admin
ENV JENKINS_ADMIN_PASSWORD=password
COPY plugins.txt /usr/share/jenkins/ref/plugins.txt
RUN jenkins-plugin-cli -f /usr/share/jenkins/ref/plugins.txt
COPY casc.yaml /var/jenkins_home/casc.yaml
plugins.txt
ant:latest
antisamy-markup-formatter:latest
authorize-project:latest
build-timeout:latest
cloudbees-folder:latest
configuration-as-code:latest
credentials-binding:latest
email-ext:latest
git:latest
github-branch-source:latest
gradle:latest
ldap:latest
mailer:latest
matrix-auth:latest
pam-auth:latest
pipeline-github-lib:latest
pipeline-stage-view:latest
ssh-slaves:latest
timestamper:latest
workflow-aggregator:latest
ws-cleanup:latest
casc.yaml
jenkins:
  securityRealm:
    local:
      allowsSignup: false
      users:
        - id: ${JENKINS_ADMIN_ID}
          password: ${JENKINS_ADMIN_PASSWORD}
  authorizationStrategy:
    globalMatrix:
      permissions:
        - "USER:Overall/Administer:admin"
        - "GROUP:Overall/Read:authenticated"
  remotingSecurity:
    enabled: true
security:
  queueItemAuthenticator:
    authenticators:
      - global:
          strategy: triggeringUsersAuthorizationStrategy
unclassified:
  location:
    url: http://127.0.0.1:8080/
So far, so good. I used docker build -t jenkins:jcasc . and docker run --name jenkins --rm -p 8080:8080 jenkins:jcasc, and Jenkins starts normally. I entered the username (admin) and the password, and it worked.
There are, however, two issues that Jenkins displays:
-> It appears that your reverse proxy set up is broken.
-> Building on the built-in node can be a security issue. You should set up distributed builds. See the documentation.
I ignored them so far, and continued with my project and installed Blue Ocean.
After installing Blue Ocean I connected it to my GitHub repository (I created a token for this) and the pipeline was created, but it does not produce any result.
The log says:
Branch indexing
Connecting to https://api.github.com with no credentials, anonymous access
Jenkins-Imposed API Limiter: Current quota for Github API usage has 37 remaining (1 over budget). Next quota of 60 in 36 min. Sleeping for 4 min 43 sec.
Jenkins is attempting to evenly distribute GitHub API requests. To configure a different rate limiting strategy, such as having Jenkins restrict GitHub API requests only when near or above the GitHub rate limit, go to "GitHub API usage" under "Configure System" in the Jenkins settings.
My C++ project looks like this:
HelloWorld
|-- HelloWorld.cpp
|-- scripts
|   |-- Linux-Build.sh
|   `-- Linux-Run.sh
`-- Jenkinsfile
HelloWorld.cpp:
#include <iostream>
int main() {
    std::cout << "Hello World!" << std::endl;
    return 0;
}
Linux-Build.sh
#!/bin/bash
vendor/bin/premake/premake5 gmake2
make
Linux-Run.sh
#!/bin/bash
bin/Debug/HelloWorld
Jenkinsfile
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh 'echo "Building..."'
                sh 'chmod +x scripts/Linux-Build.sh'
                sh 'scripts/Linux-Build.sh'
                archiveArtifacts artifacts: 'bin/Debug/*', fingerprint: true
            }
        }
        stage('Test') {
            steps {
                sh 'echo "Running..."'
                sh 'chmod +x scripts/Linux-Run.sh'
                sh 'scripts/Linux-Run.sh'
            }
        }
    }
}
So, where might the problem be? I don't know if the problem is in my C++ code or in the Docker configuration.
I am also a big fan of CMake rather than plain make, and I would prefer to use CMake here, but I am not sure how to integrate it and make it work.
EDIT:
In the tutorial that I followed there was a "vendor" directory but the instructor didn't show what was in that directory.
EDIT:
The build took 4 minutes, but it eventually did something, and the error is:
+ scripts/Linux-Build.sh
scripts/Linux-Build.sh: line 3: vendor/bin/premake/premake5: No such file or directory
make: *** No targets specified and no makefile found. Stop.
script returned exit code 2
So, the problem is in the C++ project, not Docker.
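Since the failure comes from the missing vendor/bin/premake/premake5 binary, and CMake was mentioned as the preferred tool anyway, one option is to replace the premake call in Linux-Build.sh with a CMake-based build. This is only a sketch: it assumes a CMakeLists.txt exists at the project root, that cmake is installed in the Jenkins image (e.g. added next to make and g++ in the Dockerfile), and that the runtime output directory is set to bin/Debug so Linux-Run.sh and archiveArtifacts keep working:
#!/bin/bash
set -e

# Configure and build with CMake instead of premake + make.
cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug \
      -DCMAKE_RUNTIME_OUTPUT_DIRECTORY="$(pwd)/bin/Debug"
cmake --build build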

Permission Error Running Container in AWS CodeBuild

I'm attempting to run the following command in CodeBuild:
- docker run --rm -v $(pwd)/SQL:/flyway/SQL -v $(pwd)/conf:/flyway/conf flyway/flyway -enterprise -url=jdbc:postgresql://xxx.xxx.us-east-1.rds.amazonaws.com:5432/hamshackradio -dryRunOutput="/flyway/SQL/output.sql" migrate
I get the following error:
ERROR: Unable to use /flyway/SQL/output.sql as a dry run output:
/flyway/SQL/output.sql (Permission denied) Caused by:
java.io.FileNotFoundException: /flyway/SQL/output.sql (Permission
denied)
The goal is to capture the output.sql file. Running the exact same command locally on Windows (adjusting the paths of course) works without error. The issue isn't Flyway or the overall command structure. It's something to do with the internals of running the Docker container on Ubuntu on AWS CodeBuild and permissions there (maybe permissions, maybe something else, I'm open).
Does anyone have a good idea on how to address this?
The container doesn't have write access to the host. You could try the following, which saves the artifact to the container and uses docker cp to copy the artifact to the host.
container=$(docker create -v <flywaymigrationspathonhost>:/flyway/sql flyway/flyway migrate -dryRunOutput=/flyway/reports/changes.sql -schemas=dbo -user=youruser -password=yourpass -url=<yourjdbcurl> -licenseKey=<licensekey>)
docker start -a ${container}
docker cp ${container}:/flyway/reports/changes.sql <hostpath>
You need to enable Privileged Mode for your CodeBuild project.
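If the build goes the privileged-mode route, that flag lives on the CodeBuild project's environment. A hedged sketch with the AWS CLI (the project name and image are placeholders, and --environment replaces the whole environment block, so the existing type, image, and compute type must be restated):
aws codebuild update-project \
  --name my-flyway-project \
  --environment "type=LINUX_CONTAINER,image=aws/codebuild/standard:7.0,computeType=BUILD_GENERAL1_SMALL,privilegedMode=true"
The same setting can be toggled in the console under the project's Environment section, or set as privileged_mode / PrivilegedMode in Terraform or CloudFormation definitions of the project.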

AWS: CodeDeploy for a Docker Compose project?

My current objective is to have Travis deploy our Django+Docker-Compose project upon successful merge of a pull request to our Git master branch. I have done some work setting up our AWS CodeDeploy, since Travis has built-in support for it. When I got to the AppSpec and actual deployment part, at first I tried to have an AfterInstall script do docker-compose build and then have an ApplicationStart script do docker-compose up. The containers that have images pulled from the web are our PostgreSQL container (named db, image aidanlister/postgres-hstore, which is the usual postgres image plus the hstore extension), the Redis container (uses the redis image), and the Selenium container (image selenium/standalone-firefox). The other two containers, web and worker, which are the Django server and the Celery worker respectively, use the same Dockerfile to build an image. The main command is:
CMD paver docker_run
which uses a pavement.py file:
from paver.easy import task
from paver.easy import sh

@task
def docker_run():
    migrate()
    collectStatic()
    updateRequirements()
    startServer()

@task
def migrate():
    sh('./manage.py makemigrations --noinput')
    sh('./manage.py migrate --noinput')

@task
def collectStatic():
    sh('./manage.py collectstatic --noinput')

# find any updates to existing packages, install any new packages
@task
def updateRequirements():
    sh('pip install --upgrade -r requirements.txt')

@task
def startServer():
    sh('./manage.py runserver 0.0.0.0:8000')
Here is what I (think I) need to make happen each time a pull request is merged:
Have Travis deploy changes using CodeDeploy, based on deploy section in .travis.yml tailored to our CodeDeploy setup
Start our Docker containers on AWS after successful deployment using our docker-compose.yml
How do I get this second step to happen? I'm pretty sure ECS is actually not what is needed here. My current status right now is that I can get Docker started with sudo service docker start but I cannot get docker-compose up to be successful. Though deployments are reported as "successful", this is only because the docker-compose up command is run in the background in the Validate Service section script. In fact, when I try to do docker-compose up manually when ssh'd into the EC2 instance, I get stuck building one of the containers, right before the CMD paver docker_run part of the Dockerfile.
This took a long time to work out, but I finally figured out a way to deploy a Django+Docker-Compose project with CodeDeploy without Docker-Machine or ECS.
One thing that was important was to make an alternate docker-compose.yml that excluded the selenium container--all it did was cause problems and was only useful for local testing. In addition, it was important to choose an instance type that could handle building containers. The reason why containers couldn't be built from our Dockerfile was that the instance simply did not have the memory to complete the build. Instead of a t1.micro instance, an m3.medium is what worked. It is also important to have sufficient disk space--8GB is far too small. To be safe, 256GB would be ideal.
It is important to have an After Install script run service docker start when doing the necessary Docker installation and setup (including installing Docker-Compose). This is to explicitly start running the Docker daemon--without this command, you will get the error Could not connect to Docker daemon. When installing Docker-Compose, it is important to place it in /opt/bin/ so that the binary is used via /opt/bin/docker-compose. There are problems with placing it in /usr/local/bin (I don't exactly remember what problems, but it's related to the particular Linux distribution for the Amazon Linux AMI). The After Install script needs to be run as root (runas: root in the appspec.yml AfterInstall section).
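A rough sketch of such an AfterInstall hook script on Amazon Linux (the script name, the pinned docker-compose version, and the exact install commands are assumptions; the hook must be registered with runas: root in appspec.yml):
#!/bin/bash
# after_install.sh (illustrative name), run as root via appspec.yml.
set -e

# Install Docker and start the daemon explicitly; without this, later
# docker-compose calls fail with "Could not connect to Docker daemon".
yum install -y docker
service docker start

# Install Docker-Compose into /opt/bin rather than /usr/local/bin, as noted above.
COMPOSE_VERSION=1.29.2   # placeholder version
mkdir -p /opt/bin
curl -fsSL "https://github.com/docker/compose/releases/download/${COMPOSE_VERSION}/docker-compose-$(uname -s)-$(uname -m)" -o /opt/bin/docker-compose
chmod +x /opt/bin/docker-compose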
Additionally, the final phase of deployment, which is starting up the containers with docker-compose up (more specifically /opt/bin/docker-compose -f docker-compose-aws.yml up), needs to be run in the background with stdin and stdout redirected to /dev/null:
/opt/bin/docker-compose -f docker-compose-aws.yml up -d > /dev/null 2> /dev/null < /dev/null &
Otherwise, once the server is started, the deployment will hang because the final script command (in the ApplicationStart section of my appspec.yml in my case) doesn't exit. This will probably result in a deployment failure after the default deployment timeout of 1 hour.
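For completeness, a sketch of an ApplicationStart hook script wrapping that command (the script name and the application's install path are illustrative):
#!/bin/bash
# start_server.sh (illustrative name), referenced from the ApplicationStart section of appspec.yml.
cd /path/to/deployed/app   # placeholder: the CodeDeploy destination directory

# Detach and redirect all standard streams so this script exits immediately
# and the deployment does not hang until the timeout.
/opt/bin/docker-compose -f docker-compose-aws.yml up -d > /dev/null 2> /dev/null < /dev/null &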
If all goes well, then the site can finally be accessed at the instance's public DNS and port in your browser.

eb deploy always fails

I have created a Ruby environment on Amazon Elastic Beanstalk, but when I try to deploy my Rails app from the command line using eb deploy, I get this error:
Don't run Bundler as root. Bundler can ask for sudo if it is needed, and
installing your bundle as root will break this application for all non-root
users on this machine.
You need to install git to be able to use gems from git repositories. For help
installing git, please refer to GitHub's tutorial at
https://help.github.com/articles/set-up-git (Executor::NonZeroExitStatus)
[2015-08-09T15:50:38.513Z] INFO [4217] - [CMD-AppDeploy/AppDeployStage0/AppDeployPreHook/10_bundle_install.sh] : Activity failed.
[2015-08-09T15:50:38.513Z] INFO [4217] - [CMD-AppDeploy/AppDeployStage0/AppDeployPreHook] : Activity failed.
[2015-08-09T15:50:38.513Z] INFO [4217] - [CMD-AppDeploy/AppDeployStage0] : Activity failed.
[2015-08-09T15:50:38.514Z] INFO [4217] - [CMD-AppDeploy] : Completed activity. Result:
Command CMD-AppDeploy failed.
So, should I install git directly on the Amazon instance via bash? Will this affect autoscaling?
I don't know if you fixed this, but you need to tell Elastic Beanstalk to install git.
In the root directory of your project, add a folder called .ebextensions.
Create a file inside that folder called (something like) install_git.config (the .config is important).
Add the following lines to that file:
packages:
  yum:
    git: []
Then redeploy your application, and you shouldn't see that error anymore.
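A quick usage sketch of those steps from the project root (the file name is just the example above; eb deploy deploys the latest commit, so the new file should be committed first):
mkdir -p .ebextensions
cat > .ebextensions/install_git.config <<'EOF'
packages:
  yum:
    git: []
EOF
git add .ebextensions/install_git.config
git commit -m "Tell Elastic Beanstalk to install git"
eb deploy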