chef - restart a service on another machine after connecting to it - amazon-web-services

I'm new to Chef and struggling with this concept. I'm not sure whether it's possible and, if it is, how to achieve it. So far my Google-fu has failed me.
I have a Chef client and a Chef server, and I'm uploading cookbooks to run on my client. I trigger the code by running chef-client on the client machine.
What I'm then trying to do is connect to several other boxes via SSH, stop a service, clear out a directory, and restart the service.
Once this is working, I'll be targeting the Solr service.
My basic test code is as follows:
ruby_block 'do something' do
  block do
    Chef::Log.info('Do Something')
    Chef::Resource::Notification.new('stop', :run, self)
  end
end
#define the service - does nothing
service 'atd' do
  action :nothing
end
#do something that triggers
execute 'start' do
  command 'pwd'
  Chef::Log.info('triggers start')
  action :nothing
  notifies :start, run_context.resource_collection.find(:service => 'atd')
end
execute 'stop' do
  # some stuff
  # on success...
  command 'pwd'
  Chef::Log.info('triggers restart')
  notifies :stop, run_context.resource_collection.find(:service => 'atd'), :immediately
  notifies :run, 'execute[start]'
end
In the ruby_block I plan to iterate through a list of machines that I have and SSH to each of them. Is it then possible to run the remainder of the code on those machines?
Thanks for any help / advice.
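For context, Chef resources such as service and execute act on the node where chef-client is running, so opening an SSH session inside a ruby_block does not redirect the later resources to the remote machine. One option is to run chef-client on the target nodes themselves (for example via knife ssh). Another is to send the commands over SSH directly. The following is a minimal sketch of the second option, not a definitive implementation: the service name, the directory, the function names and the host names in the usage line are all hypothetical, and it assumes key-based SSH access, passwordless sudo and GNU find on the remote hosts.

```shell
#!/usr/bin/env bash
# Sketch only: stop a service, empty a directory, and start the service again
# on each remote host over SSH. Service name and directory are hypothetical.

SERVICE="solr"                      # hypothetical service name
CLEAR_DIR="/var/solr/data/index"    # hypothetical directory to empty

# Builds the command line that will be executed on the remote machine.
remote_command() {
  printf 'sudo service %s stop && sudo find %s -mindepth 1 -delete && sudo service %s start' \
    "$SERVICE" "$CLEAR_DIR" "$SERVICE"
}

# Runs the remote command on every host given as an argument.
restart_on_hosts() {
  local host
  for host in "$@"; do
    echo "Restarting ${SERVICE} on ${host}"
    ssh "$host" "$(remote_command)" || echo "failed on ${host}" >&2
  done
}
```

It could be called as restart_on_hosts solr1.example.com solr2.example.com. The same command string could also be run from an execute resource, if the loop has to live in a recipe.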

Related

Checking for the result of the AWS CLI 'run-task' command, task stopped successfully or from an error?

I'm currently moving an application off static EC2 servers to ECS; until now the release process has been to SSH into the server and git pull/migrate the database.
I've created everything I need using Terraform to deploy my code from my organisation's Elastic Container Registry. I have a cluster, some services and task definitions.
I can now deploy the app successfully for any given version; my main problem is finding a way to run migrations.
My approach so far has been to split the application into 3 services. My 'web' service handles all HTTP traffic (serving the frontend, responding to API requests). My 'cron' service handles things like sending emails/push notifications at specific times/events. My 'migrate' service is just the 'cron' service with the container's entryPoint overridden to run only the migrations (I don't need any of the apache2 stuff for this container, and I didn't see a reason to build another image just for migrations).
The problem with this was that the 'migrate' service would constantly try to schedule more tasks to migrate the database, even though it only needed to be done once. So I've scrapped it as a service but kept it as a task definition, so that I can still run it in my cluster.
As part of the deploy process I'm writing, I run that task inside the cluster via a bash script so I can wait until the migrations finish before deciding whether to take the application out of maintenance mode (if the migrations fail) or to deploy the new 'web'/'cron' containers (once the migrations have completed).
Currently this is inside a shell script (run by GitHub Actions) that looks like this:
#!/usr/bin/env bash
CLUSTER_NAME=$1
echo "$CLUSTER_NAME"
OUTPUT=$(aws ecs run-task --cluster "${CLUSTER_NAME}" --task-definition saas-app-migrate)
# $? holds the exit status of the run-task call above
if [ $? -ne 0 ]; then
    >&2 echo "$OUTPUT"
    exit 1
fi
# -r prints each taskArn without the surrounding JSON quotes
TASKS=$(echo "$OUTPUT" | jq -r '.tasks[].taskArn')
for task in $TASKS
do
    # check for task to be done
    : # no-op placeholder so the empty loop body is valid bash
done
Because $TASKS contains the taskArn of every task spawned by this command, I can freely query each task; however, I don't know what information I'm looking for.
The AWS documentation says I should use the 'describe-tasks' command to find out why a task has reached the 'STOPPED' status, as it provides 'stopCode' and 'stoppedReason' properties in the response. However, it doesn't say what these values would be if the task stopped successfully. I don't want to introduce a manual step in my deployment where I wait until the migrations are done, with the application unusable, and then tell my release process to continue.
Is there documentation I might have missed that lists the values I'm searching for, or an alternative way to handle this case?
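One possible way to fill in the "check for task to be done" step, sketched under assumptions rather than taken from the question: the AWS CLI has a waiter, aws ecs wait tasks-stopped, and aws ecs describe-tasks reports an exitCode for each container, so an exit code of 0 can be treated as a successful migration. The function name migration_succeeded is hypothetical, and the sketch assumes the task definition has a single container.

```shell
#!/usr/bin/env bash
# Sketch only: wait for an ECS task to stop, then judge success by the
# container exit code. Assumes one container per task (hypothetical setup).

# Returns 0 when the task's first container exited with code 0.
migration_succeeded() {
  local cluster="$1" task_arn="$2" exit_code

  # Blocks until the task reaches STOPPED; the waiter itself fails if it
  # gives up before that happens.
  aws ecs wait tasks-stopped --cluster "$cluster" --tasks "$task_arn" || return 1

  # exitCode is missing (printed as "None") if the container never started.
  exit_code=$(aws ecs describe-tasks --cluster "$cluster" --tasks "$task_arn" \
    --query 'tasks[0].containers[0].exitCode' --output text) || return 1

  [ "$exit_code" = "0" ]
}
```

Inside the for loop this could be used as migration_succeeded "$CLUSTER_NAME" "$task" || exit 1, so the deploy script stops (leaving maintenance mode handling to the caller) when a migration container exits non-zero.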

Wildfly 10 restart issue on AWS EC2

I am running my Wildfly 10.1.0 server on Linux on an Amazon EC2 instance. I have written start and stop scripts for the server. Whenever I stop the server and restart it after some time, I get the following exception:
WFLYCTL0013: Operation ("add") failed - address: ([("deployment" => "rapid.ear")]) - failure description: "WFLYSRV0137: No deployment content with hash dd66eee901c4bf79dd6659873df918e1b639bc1b is available in the deployment content repository for deployment 'rapid.ear'. This is a fatal boot error. To correct the problem, either restart with the --admin-only switch set and use the CLI to install the missing content or remove it from the configuration, or remove the deployment from the xml configuration file and restart."
When I remove the entry for that deployment from standalone.xml I am able to restart the server, but I need a more permanent solution.
The start script is:
nohup /data/wildfly-10.1.0.Final/bin/standalone.sh -Djavax.net.ssl.trustStore="/usr/java/jdk1.8.0_121/jre/lib/security/jssecacerts" --server-config=standalone.xml &
And the stop script is:
sh /data/wildfly-10.1.0.Final/bin/jboss-cli.sh --connect command=:shutdown
It may not be quite as efficient in terms of I/O, but if you've got a standalone instance you can take advantage of the deployment scanner, as I have. I have:
<subsystem xmlns="urn:jboss:domain:deployment-scanner:2.0">
    <deployment-scanner name="myapp" path="/home/wildfly/sites/www.mysite.tld" scan-interval="60000" auto-deploy-exploded="true"/>
</subsystem>
in my standalone-full.xml (you may or may not need the "-full" part). I then deploy my webapp to "/home/wildfly/sites/www.mysite.tld" and can update it as needed. With the scan-interval shown (60000 ms), the scanner only checks the directory once a minute, so it isn't terrible on I/O.
Again, your deployment may differ from mine.

AWS CodePipeline advanced tutorial with jenkins

I'm running through the AWS CodePipeline tutorial, and one step says that I have to create a Jenkins job running a bash script that connects to the EC2 instance (not the one where Jenkins is running, but the one where the code was deployed earlier).
It says that I have to connect to the EC2 instance by running this command in the bash script:
TEST_IP_ADDRESS=192.168.0.4 rake test
But my gut feeling says that this step is completely wrong.
There is no variable with this name, and there is no way to connect to an external instance just like that.
I've completed all the other steps successfully, but this one is obviously wrong.
The bash script will run on your Jenkins instance, and it will make an HTTP request to the instance you configured in TEST_IP_ADDRESS.
When you add the "build step", and choose "Execute shell", you'll enter this:
TEST_IP_ADDRESS=192.168.0.4 rake test
You are defining the TEST_IP_ADDRESS variable, so it's up to you to give it an appropriate value.
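The NAME=value command form is ordinary shell syntax: the assignment is placed in the environment of that one command only. A small demonstration follows, with sh -c standing in for rake test (an assumption made so the example runs without Ruby or the tutorial project):

```shell
#!/usr/bin/env bash
# A variable assignment placed before a command is exported to that command only.
unset TEST_IP_ADDRESS

# `sh -c` stands in for `rake test` here; the child process sees the variable.
seen_by_child=$(TEST_IP_ADDRESS=192.168.0.4 sh -c 'echo "$TEST_IP_ADDRESS"')
echo "child saw: ${seen_by_child}"

# Back in the calling shell the variable is still unset.
echo "parent has: ${TEST_IP_ADDRESS:-<unset>}"
```

So nothing "connects" at the shell level; the test process simply reads the address from its environment and decides what to do with it.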
At first I had the same confusion; then I looked at the source code, and it is pretty self-explanatory:
#!/usr/bin/env ruby
require 'net/http'
require 'minitest/autorun'
require 'socket'
class JenkinsSampleTest < MiniTest::Unit::TestCase
  def setup
    uri_params = {
      :host => ENV['TEST_IP_ADDRESS'] || 'localhost',
      :port => (ENV['TEST_PORT'] || '80').to_i,
      :path => '/index.html'
    }
    @webpage = Net::HTTP.get(URI::HTTP.build(uri_params))
  end
  def test_congratulations
    assert(@webpage =~ /Congratulations/)
  end
end

Cron job issue with Rails - gem whenever

I am trying to set up a simple cron task using the whenever gem for Rails.
How can I tell whenever to trigger a controller action?
What I want to do:
every 1.minute do
  runner "Mycontroller.index", :environment => 'development'
end
I want to trigger the index action in my DataController every minute. The index action triggers a mailer.
I run: whenever --update-crontab football
Also, when I start or restart my server I get a tiny message like the following:
You have new mail in /var/mail/Antoine
[2015-04-12 18:38:17] INFO WEBrick 1.3.1
[2015-04-12 18:38:17] INFO ruby 2.1.3 (2014-09-19) [x86_64-darwin14.0]
[2015-04-12 18:38:17] INFO WEBrick::HTTPServer#start: pid=22476 port=3000
^C[2015-04-12 18:38:52] INFO going to shutdown ...
[2015-04-12 18:38:52] INFO WEBrick::HTTPServer#start done.
Exiting
You have new mail in /var/mail/Antoine
Okay, I figured it out:
every 1.day, :at => '4:30 am' do
  command "curl http://localhost:3000", :environment => 'development'
end
I use curl to request the route whose controller action I want to trigger.
I also learned that I need to run whenever -w to write the cron task. I can then run crontab -l to list my current cron tasks, and if I want to remove them I just run crontab -r (which deletes the entire crontab).

xcodebuild running tests headless?

As we all know by now, the only way to run tests on iOS is by using the simulator. My problem is that we are running Jenkins and the iOS builds run on a slave (via SSH); as a result, xcodebuild can't start the simulator (as it runs headless). I've read somewhere that it should be possible to get this to work with SimLauncher (gem sim_launcher), but I can't find any info on how to set this up with xcodebuild. Any pointers are welcome.
Headless and xcodebuild do not mix well. Please consider this alternative:
You can configure the slave node to launch via jnlp (webstart). I use a bash script with the .command extension as a login item (System Preferences -> Users -> Login Items) with the following contents:
#!/bin/bash
slave_url="https://gardner.company.com/jenkins/jnlpJars/slave.jar"
max_attempts=40 # retries, with a 5-second sleep between them
# First attempt to download the agent jar
curl -fO "${slave_url}" >>slave.log
rc=$?
# Retry until the download succeeds or the attempts run out
while [ $rc -ne 0 -a $max_attempts -gt 0 ]; do
    echo "Waiting to try again. curl returned $rc"
    sleep 5
    curl -fO "${slave_url}" >>slave.log
    rc=$?
    if [ $rc -eq 0 ]; then
        # Verify that the downloaded jar is a valid archive
        zip -T slave.jar
        rc=$?
    fi
    let max_attempts-=1
done
# Simulator
java -jar slave.jar -jnlpUrl https://gardner.company.com/jenkins/computer/buildmachine/slave-agent.jnlp -secret YOUR_SECRET_KEY
The build user is set to log in automatically. You can see the arguments to the slave.jar app by executing:
gardner:~ buildmachine$ java -jar slave.jar --help
"--help" is not a valid option
java -jar slave.jar [options...]
 -auth user:pass                 : If your Jenkins is security-enabled, specify
                                   a valid user name and password.
 -connectTo HOST:PORT            : make a TCP connection to the given host and
                                   port, then start communication.
 -cp (-classpath) PATH           : add the given classpath elements to the
                                   system classloader.
 -jar-cache DIR                  : Cache directory that stores jar files sent
                                   from the master
 -jnlpCredentials USER:PASSWORD  : HTTP BASIC AUTH header to pass in for making
                                   HTTP requests.
 -jnlpUrl URL                    : instead of talking to the master via
                                   stdin/stdout, emulate a JNLP client by
                                   making a TCP connection to the master.
                                   Connection parameters are obtained by
                                   parsing the JNLP file.
 -noReconnect                    : Doesn't try to reconnect when a communication
                                   fail, and exit instead
 -proxyCredentials USER:PASSWORD : HTTP BASIC AUTH header to pass in for making
                                   HTTP authenticated proxy requests.
 -secret HEX_SECRET              : Slave connection secret to use instead of
                                   -jnlpCredentials.
 -slaveLog FILE                  : create local slave error log
 -tcp FILE                       : instead of talking to the master via
                                   stdin/stdout, listens to a random local
                                   port, write that port number to the given
                                   file, then wait for the master to connect to
                                   that port.
 -text                           : encode communication with the master with
                                   base64. Useful for running slave over 8-bit
                                   unsafe protocol like telnet
gardner:~ buildmachine$
For a discussion about OSX slaves and how the master is launched please see this Jenkins bug: https://issues.jenkins-ci.org/browse/JENKINS-21237
Erik - I ended up doing the items documented here:
Essentially:
The first problem is that the user who runs the builds must also be logged in to the console on that Mac build machine. The build needs to be able to pop up the simulator, and it will fail if no user is logged in, as it can't do this entirely headless without a display.
Secondly, the Xcode developer tools require elevated privileges in order to execute all of the tasks in the unit tests. You may sometimes miss it, but without these privileges the Simulator will show an authentication prompt that never clears.
A first solution to this (on Mavericks) is to run:
sudo security authorizationdb write system.privilege.taskport allow
This will eliminate one class of these authentication popups. You’ll also need to run:
sudo DevToolsSecurity --enable
Per Apple’s man page on this tool:
On normal user systems, the first time in a given login session that
any such Apple-code-signed debugger or performance analysis tools are
used to examine one of the user’s processes, the user is queried for
an administator password for authorization. DevToolsSecurity tool to
change the authorization policies, such that a user who is a member of
either the admin group or the _developer group does not need to enter
an additional password to use the Apple-code-signed debugger or
performance analysis tools.
The only issue is that these same steps seem to have broken once I upgraded to Xcode 6. Back to the drawing board...