How to use gcloud with Babashka - clojure

I'm trying to use Babashka to replace a few Bash scripts I use to deploy functions on GCP Cloud Functions.
The script below is working, but I wonder if there is a better way to execute the gcloud shell command:
#!/usr/bin/env bb
(require '[cheshire.core :as json]
         '[clojure.java.shell :refer [sh]])

(let [package-json    (json/parse-string (slurp "package.json") true)
      name            (:name package-json)
      entry-point     "entryPoint"
      region          "europe-west3"
      memory          "128MB"
      runtime         "nodejs14"
      source          "dist"
      service-account "sa-function-invoker@prj-kitchen-sink.iam.gserviceaccount.com"
      timeout         "10s"]
  (println "deploy function" name "with entry point" entry-point "to GCP Cloud Functions." "Attach service account" service-account)
  (let [output (sh "gcloud" "functions" "deploy" name "--region" region "--entry-point" entry-point "--memory" memory "--runtime" runtime "--service-account" service-account "--source" source "--trigger-http" "--timeout" timeout)]
    (if (= "" (:err output))
      (println (:out output))
      (println (:err output)))))
As a comparison, the Bash script I was using is easier to read:
#!/bin/bash
set -euo pipefail

FUNCTION_NAME=$(cat package.json | jq '{name}' | jq '.name' | sed 's/"//g')
FUNCTION_ENTRY_POINT=entryPoint
ATTACHED_SA=sa-function-invoker@prj-kitchen-sink.iam.gserviceaccount.com
MEMORY=128MB

echo "deploy function `${FUNCTION_NAME}` with entry point `${FUNCTION_ENTRY_POINT}` to GCP Cloud Functions. Attach service account `${ATTACHED_SA}`"

gcloud functions deploy ${FUNCTION_NAME} \
  --project ${GCP_PROJECT_ID} \
  --region ${GCP_REGION} \
  --memory ${MEMORY} \
  --runtime nodejs14 \
  --service-account ${ATTACHED_SA} \
  --source dist \
  --entry-point ${FUNCTION_ENTRY_POINT} \
  --timeout 10s
I guess my question is not very specific to Babashka or gcloud, but it's about how to construct commands with clojure.java.shell in general...

If you want to execute the shell command and see the direct output as it appears, I recommend using babashka.process/process or babashka.tasks/shell:
@(babashka.process/process ["ls" "-la"] {:out :inherit :err :inherit})
@(babashka.process/process ["ls" "-la"] {:inherit true})
(babashka.tasks/shell "ls -la")
The above invocations do pretty much the same thing, but shell also applies babashka.process/check, which throws if the exit code was non-zero. The @ sign before the invocation is the same as calling deref, which means: wait for the process to finish. If you don't prepend it, the process is going to run asynchronously.
More info:
https://github.com/babashka/process
https://book.babashka.org/#tasks
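Applied to the script in the question, a minimal sketch using babashka.tasks/shell could look like this (the flag values are copied from the question; shell streams gcloud's output directly to the terminal and throws if it exits non-zero):

#!/usr/bin/env bb
(require '[cheshire.core :as json]
         '[babashka.tasks :refer [shell]])

(let [fn-name (:name (json/parse-string (slurp "package.json") true))]
  (println "deploy function" fn-name "to GCP Cloud Functions")
  ;; stdout/stderr are inherited, so gcloud's progress appears live
  (shell "gcloud" "functions" "deploy" fn-name
         "--region" "europe-west3"
         "--entry-point" "entryPoint"
         "--memory" "128MB"
         "--runtime" "nodejs14"
         "--service-account" "sa-function-invoker@prj-kitchen-sink.iam.gserviceaccount.com"
         "--source" "dist"
         "--trigger-http"
         "--timeout" "10s"))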

One trick I use to simplify calling out to the shell is shown by this helper function:
(shell-cmd cmd-str)
Runs a command string in the default OS shell (/bin/bash); returns the result in a Clojure map. Example:
(shell-cmd "ls -ldF *")
  ;=> {:exit 0     ; unix exit status (0 -> normal)
       :err  ""    ; text from any errors
       :out  "..." ; text output as would be printed to the console
      }
It allows you to write a single command string, instead of having to manually tokenize all the parts of the string. The implementation is quite simple:
(require '[clojure.java.shell :as shell]
         '[tupelo.core :refer [grab vals->map]]) ; grab and vals->map come from the tupelo library

(def ^:dynamic *os-shell* "/bin/bash") ; could also use /bin/zsh, etc.

(defn shell-cmd
  [cmd-str]
  (let [result (shell/sh *os-shell* "-c" cmd-str)]
    (if (= 0 (grab :exit result))
      result
      (throw (ex-info "shell-cmd: clojure.java.shell/sh failed, cmd-str:"
                      (vals->map cmd-str result))))))
So it lets you send a command string directly to /bin/bash and have it parse the args as normal.
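For example (a sketch reusing the bindings from the question), the whole gcloud invocation could then be written as a single string:
(shell-cmd (str "gcloud functions deploy " name
                " --region " region
                " --entry-point " entry-point
                " --memory " memory
                " --runtime " runtime
                " --service-account " service-account
                " --source " source
                " --trigger-http --timeout " timeout))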
I used this extensively a few years back to control AWS RDS hosts (creating, snapshots, selecting, deleting) via the AWS CLI and it was very easy to use.

Related

How to execute a command on multiple servers

I have a set of servers (150) used for logging, and a command (to get disk space). How can I execute this command on each server?
If the script takes 1 minute to get the report for a single server, how can I produce the report for all the servers every 10 minutes?
use strict;
use warnings;
use Net::SSH::Perl;
use Filesys::DiskSpace;

# i have almost more than 100 servers..
my %hosts = (
    'localhost' => {
        user     => "z",
        password => "qumquat",
    },
    '129.221.63.205' => {
        user     => "z",
        password => "aardvark",
    },
    '129.221.63.205' => {
        user     => "z",
        password => "aardvark",
    },
);

# file system /home or /dev/sda5
my $dir = "/home";
my $cmd = "df $dir";

foreach my $host (keys %hosts) {
    my $ssh = Net::SSH::Perl->new($host, port => 22, debug => 1, protocol => '2,1');
    $ssh->login($hosts{$host}{user}, $hosts{$host}{password});
    my ($out) = $ssh->cmd($cmd);
    print "$out\n";
}
It has to report the disk space output for each server.
Is there a reason this needs to be done in Perl? There is an existing tool, dsh, which provides precisely this functionality of using ssh to run a shell command on multiple hosts and report the output from each. It also has the ability, with the -c (concurrent) switch to run the command at the same time on all hosts rather than waiting for each one to complete before going on to the next, which you would need if you want to monitor 150 machines every 10 minutes, but it takes 1 minute to check each host.
To use dsh, first create a file in ~/.dsh/group/ containing a list of your servers. I'll put mine in ~/.dsh/group/test-group with the content:
galera-1
galera-2
galera-3
Then I can run the command
dsh -g test-group -c 'df -h /'
And get back the result:
galera-3: Filesystem Size Used Avail Use% Mounted on
galera-3: /dev/mapper/debian-system 140G 36G 99G 27% /
galera-1: Filesystem Size Used Avail Use% Mounted on
galera-1: /dev/mapper/debian-system 140G 29G 106G 22% /
galera-2: Filesystem Size Used Avail Use% Mounted on
galera-2: /dev/mapper/debian-system 140G 26G 109G 20% /
(They're out-of-order because I used -c, so the command was sent to all three servers at once and the results were printed in the order the responses were received. Without -c, they would appear in the same order the servers are listed in the group file, but then it would wait for each response before connecting to the next server.)
But, really, with the talk of repeating this check every 10 minutes, it sounds like what you really want is a proper monitoring system such as Icinga (a high-performance fork of the better-known Nagios), rather than just a way to run commands remotely on multiple machines (which is what dsh provides). Unfortunately, configuring an Icinga monitoring system is too involved for me to provide an example here, but I can tell you that monitoring disk space is one of the checks that are included and enabled by default when using it.
There is a ready-made tool called Ansible for exactly this purpose. There you can define your list of servers, group them, and execute commands on all of them.

How to implement blue/green deployments in AWS with Terraform without losing capacity

I have seen multiple articles discussing blue/green deployments and they consistently involve forcing recreation of the Launch Configuration and the Autoscaling Group. For example:
https://groups.google.com/forum/#!msg/terraform-tool/7Gdhv1OAc80/iNQ93riiLwAJ
This works great in general except that the desired capacity of the ASG gets reset to the default. So if my cluster is under load then there will be a sudden drop in capacity.
My question is this: is there a way to execute a Terraform blue/green deployment without a loss of capacity?
I don't have a full terraform-only solution to this.
The approach I have is to run a small script to get the current desired capacity, set a variable, and then use that variable in the ASG.
handle-desired-capacity:
    @echo "Handling current desired capacity"
    @echo "---------------------------------"
    @if [ "$(env)" == "" ]; then \
        echo "Cannot continue without an environment"; \
        exit -1; \
    fi
    $(eval DESIRED_CAPACITY := $(shell aws autoscaling describe-auto-scaling-groups --profile $(env) | jq -SMc '.AutoScalingGroups[] | select((.Tags[]|select(.Key=="Name")|.Value) | match("prod-asg-app")).DesiredCapacity'))
    @if [ "$(DESIRED_CAPACITY)" == '' ]; then \
        echo Could not determine desired capacity.; \
        exit -1; \
    fi
    @if [ "$(DESIRED_CAPACITY)" -lt 2 -o "$(DESIRED_CAPACITY)" -gt 10 ]; then \
        echo Can only deploy between 2 and 10 instances.; \
        exit -1; \
    fi
    @echo "Desired Capacity is $(DESIRED_CAPACITY)"
    @sed -i.bak 's!desired_capacity = [0-9]*!desired_capacity = $(DESIRED_CAPACITY)!g' $(env)/terraform.tfvars
    @rm -f $(env)/terraform.tfvars.bak
    @echo ""
Clearly, this is as ugly as it gets, but it does the job.
I am looking to see if we can get the name of the ASG as an output from the remote state that I can then use on the next run to get the desired capacity, but I'm struggling to understand this enough to make it useful.
As a second answer, I wrapped the AWS CLI + jq into a Terraform module.
https://registry.terraform.io/modules/digitickets/cli/aws/latest
module "current_desired_capacity" {
source = "digitickets/cli/aws"
assume_role_arn = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/OrganizationAccountAccessRole"
role_session_name = "GettingDesiredCapacityFor${var.environment}"
aws_cli_commands = ["autoscaling", "describe-auto-scaling-groups"]
aws_cli_query = "AutoScalingGroups[?Tags[?Key==`Name`]|[?Value==`digitickets-${var.environment}-asg-app`]]|[0].DesiredCapacity"
}
and
module.current_desired_capacity.result gives you the current desired capacity of the ASG you have nominated in the aws_cli_query.
Again, this is quite ugly, but the formalisation of this means you can now access a LOT of properties from AWS that are not yet available within Terraform.
This is a gentle hack. No resources are passed around and it was written purely with read-only for single scalar values in mind, so please use it with care.
As the author, I'd be happy to explain anything about this via the GitHub Issues page at https://github.com/digitickets/terraform-aws-cli/issues

clojure core.async - unexpected inconsistencies

I haven't done any Clojure for a couple of years, so I decided to go back and not ignore core.async this time around. Pretty cool stuff, that - but it surprised me almost immediately. Now, I understand that there's inherent indeterminism when multiple threads are involved, but there's something bigger than that at play here.
The source code for my oh-so-simple example, where I am trying to copy lines from STDIN to a file:
(require '[clojure.core.async :as a :refer [go >! <!]]) ; implied by the aliases used below

(defn append-to-file
  "Write a string to the end of a file"
  ([filename s]
   (spit filename (str s "\n")
         :append true))
  ([s]
   (append-to-file "/tmp/journal.txt" s)))

(defn -main
  "I don't do a whole lot ... yet."
  [& args]
  (println "Initializing..")
  (let [out-chan (a/chan)]
    (loop [line (read-line)]
      (if (empty? line) :ok
          (do
            (go (>! out-chan line))
            (go (append-to-file (<! out-chan)))
            (recur (read-line)))))))
except, of course, this turned out to be not so simple. I think I've narrowed it down to something that's not properly cleaned up. Basically, running the main function produces inconsistent results. Sometimes I run it 4 times and see 12 lines in the output. But sometimes 4 runs will produce just 10 lines. Or, like below, 3 runs, 6 lines:
akamac.home ➜ coras git:(master) ✗ make clean
cat /dev/null > /tmp/journal.txt
lein clean
akamac.home ➜ coras git:(master) ✗ make compile
lein uberjar
Compiling coras.core
Created /Users/akarpov/repos/coras/target/uberjar/coras-0.1.0-SNAPSHOT.jar
Created /Users/akarpov/repos/coras/target/uberjar/coras-0.1.0-SNAPSHOT-standalone.jar
akamac.home ➜ coras git:(master) ✗ make run
java -jar target/uberjar/coras-0.1.0-SNAPSHOT-standalone.jar < resources/input.txt
Initializing..
akamac.home ➜ coras git:(master) ✗ make run
java -jar target/uberjar/coras-0.1.0-SNAPSHOT-standalone.jar < resources/input.txt
Initializing..
akamac.home ➜ coras git:(master) ✗ make run
java -jar target/uberjar/coras-0.1.0-SNAPSHOT-standalone.jar < resources/input.txt
Initializing..
akamac.home ➜ coras git:(master) ✗ make check
cat /tmp/journal.txt
line a
line z
line b
line a
line b
line z
(Basically, sometimes a run produced 3 lines, sometimes 0, sometimes 1 or 2).
The fact that lines appear in random order doesn't bother me - go blocks do things in a concurrent/threaded manner, and all bets are off. But why don't they do all of the work all the time? (Because I am misusing them somehow, but where?)
Thanks!
There are many problems with this code; let me walk through them real quick:
1) Every time you call (go ...) you're spinning off a new "thread" that will be executed in a thread pool. It is undefined when this thread will run.
2) You aren't waiting for the completion of these threads, so it's possible (and very likely) that you will end up reading several lines of input, and writing several lines to the channel, before a read even occurs.
3) You are firing off multiple calls to append-to-file at the same time (see #2). These functions are not synchronized, so it's possible that multiple threads will append at once. Since access to files in most OSes is uncoordinated, it's possible for two threads to write to your file at the same time, overwriting each other's results.
4) Since you are creating a new go block for every line read, it's possible they will execute in a different order than you expect, which means the lines in the output file may be out of order.
I think all this can be fixed by avoiding a rather common anti-pattern with core.async: don't create go blocks (or threads) inside unbounded or large loops. Often this is doing something you don't expect. Instead create one core.async/thread with a loop that reads the input (since it's doing IO, never do IO inside a go block) and writes to the channel, and one that reads from the channel and writes to the output file.
View this as an assembly line build out of workers (go blocks) and conveyor belts (channels). If you built a factory you wouldn't have a pile of people and pair them up saying "you take one item, when you're done hand it to him". Instead you'd organize all the people once, with conveyors between them and "flow" the work (or data) between the workers. Your workers should be static, and your data should be moving.
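To make that concrete, here is a minimal sketch of that structure (the channel and binding names are illustrative, not from the original post): one core.async/thread reads stdin and puts lines onto the channel, another takes from the channel and appends to the file, and closing the channel lets the consumer drain and stop.
(require '[clojure.core.async :as a])

(defn -main [& args]
  (let [out-chan (a/chan)]
    ;; producer: read-line is blocking IO, so it runs on a real thread, not in a go block
    (a/thread
      (loop [line (read-line)]
        (if (empty? line)
          (a/close! out-chan)
          (do (a/>!! out-chan line)
              (recur (read-line))))))
    ;; consumer: take lines until the channel is closed and drained
    (let [done (a/thread
                 (loop []
                   (when-let [line (a/<!! out-chan)]
                     (append-to-file line)
                     (recur))))]
      ;; wait for the consumer to finish so nothing is lost when the process exits
      (a/<!! done))))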
.. and of course, this was a misuse of core.async on my part:
If I care about seeing all the data in the output, I must use a blocking 'take' on the channel, when I want to pass the value to my I/O code -- and, as it was pointed out, that blocking call should not be inside a go block. A single line change was all I needed:
from:
(go (append-to-file (<! out-chan)))
to:
(append-to-file (<!! out-chan))

How to execute a program from Clojure without any external libraries, showing its output in real time?

My attempt:
(import 'java.lang.Runtime)
(. (Runtime/getRuntime) exec (into-array ["youtube-dl" "--no-playlist" "some youtube video link"]))
I also tried sh. But neither approach does what I want - running a program the way a shell does (sh waits until the program exits, exec launches it and doesn't wait for it to exit; neither prints anything to standard output). I want to see the process output live, e.g. when I run youtube-dl I want to see the progress of the video download.
How do I do this simple task in Clojure?
You must start the process and listen to its output stream. One solution is:
(require '[clojure.java.io :as io])

(let [cmd  ["yes" "1"]
      proc (.exec (Runtime/getRuntime) (into-array cmd))]
  (with-open [rdr (io/reader (.getInputStream proc))]
    (doseq [line (line-seq rdr)]
      (println line))))
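If the goal is to see the child's output live in the terminal rather than re-printing it line by line, another sketch (not from the original answer) is java.lang.ProcessBuilder, whose inheritIO makes the child write straight to the parent's stdout/stderr, so progress bars such as youtube-dl's render as they would in a shell:
(let [pb   (doto (ProcessBuilder. ["youtube-dl" "--no-playlist" "some youtube video link"])
             (.inheritIO)) ; child writes directly to this terminal
      proc (.start pb)]
  (.waitFor proc)) ; block until the process finishes, returning its exit code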

Intermittent error serving a binary file with Clojure/Ring

I am building an event collector in Clojure for Snowplow (using Ring/Compojure) and am having some trouble serving a transparent pixel with Ring. This is my code for sending the pixel:
(ns snowplow.clojure-collector.responses
  (:import (org.apache.commons.codec.binary Base64)
           (java.io ByteArrayInputStream)))

(def pixel-bytes (Base64/decodeBase64 (.getBytes "R0lGODlhAQABAPAAAAAAAAAAACH5BAEAAAAALAAAAAABAAEAAAICRAEAOw==")))
(def pixel (ByteArrayInputStream. pixel-bytes))

(defn send-pixel
  []
  {:status  200
   :headers {"Content-Type" "image/gif"}
   :body    pixel})
When I start up my server, the first time I hit the path for send-pixel, the pixel is successfully delivered to my browser. But the second time - and every time afterwards - Ring sends no body (and content-length 0). Restart the server and it's the same pattern.
A few things it's not:
I have replicated this using wget, to confirm the intermittent-ness isn't a browser caching issue
I generated the "R0lGOD..." base64 string at the command line (cat original.gif | base64), so I know there is no issue there
When the pixel is successfully sent, I have verified its contents are correct (diff original.gif received-pixel.gif)
I'm new to Clojure - my guess is there's some embarrassing dynamic gremlin in my code, but I need help spotting it!
I figured out the problem in the REPL shortly after posting:
user=> (import (org.apache.commons.codec.binary Base64) (java.io ByteArrayInputStream))
java.io.ByteArrayInputStream
user=> (def pixel-bytes (Base64/decodeBase64 (.getBytes "R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==")))
#'user/pixel-bytes
user=> (def pixel (ByteArrayInputStream. pixel-bytes))
#'user/pixel
user=> (slurp pixel-bytes)
"GIF89a!�\n,L;"
user=> (slurp pixel-bytes)
"GIF89a!�\n,L;"
user=> (slurp pixel)
"GIF89a!�\n,L;"
user=> (slurp pixel)
""
So basically the problem was that the ByteArrayInputStream was getting emptied after the first call. Mutable data structures!
I fixed the bug by generating a new ByteArrayInputStream for each response, with:
:body (ByteArrayInputStream. pixel-bytes)}))
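For completeness, the corrected handler could look like this (a sketch reusing pixel-bytes from above):
(defn send-pixel
  []
  {:status  200
   :headers {"Content-Type" "image/gif"}
   ;; build a fresh stream per response, so one request can no longer exhaust it for the next
   :body    (ByteArrayInputStream. pixel-bytes)})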
The problem is that your pixel variable holds a stream. Once it has been read, there is no way to read it again.
Moreover, you do not need to deal with encoding issues. Ring serves static files as well. Just return:
(file-response "/path/to/pixel.gif")
It handles non-existing files as well. See the docs also.
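A sketch of that approach (the file path and handler name are assumptions), using ring.util.response:
(require '[ring.util.response :as resp])

(defn send-pixel
  []
  ;; assumes the 1x1 gif has been saved to disk, e.g. under resources/
  (-> (resp/file-response "resources/pixel.gif")
      (resp/content-type "image/gif")))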