How to create two otherwise identical containers that extend different images - dockerfile

I have this Dockerfile:
FROM python:2
# Large number of commands
I want to create Dockerfile-py3 that extends python:3 instead of python:2 but is otherwise the same. Aside from
FROM python:3
# Copy of large number of commands
what solution do I have in order to avoid the redundancy?

You can use the --build-arg option of docker build together with the ARG and FROM instructions of the Dockerfile to get the desired result, like so:
Example of Dockerfile:
ARG REPOSITORY_AND_TAG=python:2.7
FROM ${REPOSITORY_AND_TAG}
# Large number of commands
Now, if built without any --build-arg it will default to python:2.7 as declared in the Dockerfile. But if built with --build-arg, you can choose a substitute value like so:
docker build --build-arg REPOSITORY_AND_TAG=python:latest -t my-docker-image .
In that case it will override the default value (python:2.7) from the Dockerfile and you will end up with python:latest instead.
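So for the question as asked, one sketch is to build both variants from the same Dockerfile (the image names and tags here are made up):
docker build --build-arg REPOSITORY_AND_TAG=python:2 -t my-image:py2 .
docker build --build-arg REPOSITORY_AND_TAG=python:3 -t my-image:py3 .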
Additional documentation:
about ARG usage in the FROM instruction, in the official Dockerfile reference
about --build-arg usage, also in the official documentation

How to pass command line arguments into the implementation js file in gauge project?

I am using gauge-js for running Puppeteer scripts and I am trying to pass in a custom argument from the command line.
While running my gauge run spec command to run the test cases, I want to pass in a custom argument like gauge run spec --username=test and read that value inside my implementation files.
You cannot pass custom arguments to Gauge. However, you can use environment variables to pass any additional information you need in your implementation files.
For example, you can run gauge as
(on mac/*nix)
username=test gauge run spec
or
(on windows)
set username=test
gauge run specs
and use the environment variable in your implementation file using process.env.username.
You can additionally set the variable in the .properties files in the env folder. These get picked up as environment variables as well.
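For example, a minimal sketch of such a properties file (the file path and key are just for illustration):
# env/default/default.properties
username = test
The value is then readable in the implementation file via process.env.username, exactly as above.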

EMR hdfs transparently backed by s3

With Hadoop I can use S3 as a storage URL. But currently I have a lot of applications using hdfs://... and I would like to migrate the whole cluster and apps to EMR and S3. Do I have to change the URL in every single app from hdfs://... to s3://..., or is it possible to somehow tell EMR to store HDFS content on S3, so that each application can still use hdfs://... but it will in fact point to S3? If so, how?
That's a very good question. Is there such a thing as protocol spoofing? Could you actually achieve this behavior by writing something that overrides how protocols are handled? Honestly, that kind of solution gives me the heebie-jeebies, because if someone doesn't know that's happening and then gets unexpected pathing, and can't really diagnose or fix it, that's worse than the original problem.
If I were you, I'd do a find-and-replace over all my apps to just update the protocol.
Let's say you had all of your apps in a directory:
-- myApps
|-- app1.txt
|-- app2.txt
and you wanted to find and replace hdfs:// with s3:// in all of those apps, I'd just do something like this:
sed -i .original 's|hdfs://|s3://|g' *
which produces:
-- myApps
|-- app1.txt
|-- app1.txt.original
|-- app2.txt
|-- app2.txt.original
and now app1.txt has s3:// everywhere rather than hdfs://
Isn't that enough?
The applications should be refactored so that the input and output paths are not hard-coded. Instead, they should be injected into the applications after being read from configuration files or parsed from command-line arguments.
Take the following Pig script for example:
loaded_records =
LOAD '$input'
USING PigStorage();
--
-- ... magic processing ...
--
STORE processed_records
INTO '$output'
USING PigStorage();
We can then have a wrapper script like this:
#!/usr/bin/env bash
config_file=${1:?"Missing config_file"}
[[ -f "$config_file" ]] && source "$config_file" || { echo "Failed to source config file $config_file"; exit 1; }
pig -p input="${input_root:?'Missing parameter input_root in config_file'}/my_input_path" -p output="${output:?'Missing parameter output_root in config_file'}/my_output_path" the_pig_script.pig
In the config file:
input_root="s3://mybucket/input"
output_root="s3://mybucket/output"
If you have this kind of setup, you only have to change the configuration to switch between HDFS and S3.
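For example, pointing the same jobs back at a local HDFS would only require a different config file (these paths are hypothetical):
input_root="hdfs:///input"
output_root="hdfs:///output"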

Stata: How to reference subfolder after setting directory with cd

In SPSS, you can set a directory or path, like cd 'C:\MyData' and later refer to any subfolders within that directory, like get file 'Subfolder1\Some file.sav'.
How do you do this in Stata? Assume I have this folder structure:
C:\MyData\
Subfolder1\
data1.dta
data2.dta
Subfolder2\
data3.dta
data4.dta
Can I do:
cd "C:\MyData"
and then
use Subfolder1\data1.dta
[a bunch of code ...]
use Subfolder2\data3.dta
[a bunch of code]
I'm basically trying to avoid having to respecify the higher level folder I established with the initial cd command.
This is valid Stata syntax:
clear
set more off
cd "D:/Datos/rferrer/Desktop/statatemps"
use "test/cauto.dta"
You could also do something like:
clear
set more off
local dirstub "D:/Datos/rferrer/Desktop/statatemps"
use "`dirstub'/test/cauto.dta"
That is, define a directory stub using a local, and use it whenever needed. Unlike the first example, this form doesn't actually produce a directory change.
I think you should be able to use a period as a directory component in a path to represent the current directory, like this:
use "./Subfolder1/data1.dta"

How to implement my product resource into a Pods structure?

Reading http://www.ember-cli.com/#pod-structure
Let's say I have a product resource, which currently has the following directory structure:
app/controllers/products/base.js
app/controllers/products/edit.js
app/controllers/products/new.js
app/controllers/products/index.js
With pods, is all the logic in these files put into a single file app/products/controller.js?
At the same time, my routes and templates for these resources currently look like:
app/routes/products/base.js
app/routes/products/edit.js
app/routes/products/new.js
app/routes/products/index.js
app/templates/products/-form.hbs
app/templates/products/edit.hbs
app/templates/products/index.hbs
app/templates/products/new.hbs
app/templates/products/show.hbs
How should this be converted to Pods?
You can use ember generate --pod --dry-run to help with that:
$ ember g -p -d route products/base
version: 0.1.6
The option '--dryRun' is not supported by the generate command. Run `ember generate --help` for a list of supported options.
installing
You specified the dry-run flag, so no changes will be written.
create app/products/base/route.js
create app/products/base/template.hbs
installing
You specified the dry-run flag, so no changes will be written.
create tests/unit/products/base/route-test.js
$
(I don't know why it complains yet it honours the option, might be a bug).
So you'd end up with a structure like:
app/products/base/route.js
app/products/edit/route.js
etc.
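Once the dry run looks right, you could repeat the generator without the -d flag for the remaining resources, for example (the names are just a sketch, adjust to your app):
ember g -p route products/edit
ember g -p route products/new
ember g -p route products/index
ember g -p controller products/base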

git log suppress refs matching specified pattern

I regularly use the following git-log command:
git log --oneline --graph --decorate --all
The command is perfect for me, with one exception. I maintain a set of refs in refs/arch/ that I want to keep around ("arch" stands for "archive"), but I do not want to see them every time I look at my git log. I don't mind them showing up if they are an ancestor of an existing branch or tag, but I really do not want to see series of commits that would not otherwise show up in the git log but for the fact that they are in the commit history of a given refs/arch/* ref.
For example, in the image below, the left-hand side is an illustration of what I see currently when I run git log --oneline --graph --decorate --all. As you can see, the commit referred to by refs/arch/2 would not show up in the log if that ref didn't exist. (Assume there are no refs that are not shown in the left-hand side image.) Now, the right-hand side is an illustration of two alternative log graphs, either of which would be perfectly fine. I don't mind seeing anything matching refs/arch/* so long as it is in the commit history of a branch or tag. But, in the image below, I definitely do not want to see the commit referred to by refs/arch/2.
How can my git-log command be modified to suppress refs/arch/* in either of the senses depicted in the illustration?
What you want is:
git log --oneline --graph --decorate --exclude 'refs/arch/*' --all
The --exclude option is new in git 1.9.0.
From the git-log manual page:
--exclude=<glob-pattern>
Do not include refs matching <glob-pattern> that the next --all, --branches, --tags, --remotes, or --glob would otherwise consider. Repetitions of this option accumulate exclusion patterns up to the next --all, --branches, --tags, --remotes, or --glob option (other options or arguments do not clear accumulated patterns).
The patterns given should not begin with refs/heads, refs/tags, or refs/remotes when applied to --branches, --tags, or --remotes, respectively, and they must begin with refs/ when applied to --glob or --all. If a trailing /* is intended, it must be given explicitly.
If you are on some flavor of Ubuntu you can upgrade git from the Ubuntu Git Maintainers team ppa.
sudo add-apt-repository ppa:git-core/ppa
sudo apt-get update
sudo apt-get upgrade
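If you use the filtered log often, one option (just a sketch) is to store it as a git alias so the exclusion doesn't have to be retyped:
git config --global alias.graph "log --oneline --graph --decorate --exclude=refs/arch/* --all"
After that, git graph shows the log without the refs/arch/* clutter.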