Apply one Terraform module at a time

efx/
  ...
  aws_account/
    nonprod/
      account-variables.tf
      dev/
        account-variables.tf
        common.tf
        app1.tf
        app2.tf
        app3.tf
        ...
  modules/
    tf_efxstack_app1
    tf_efxstack_app2
    tf_efxstack_app3
    ...
In a given environment (dev in the example above), we have multiple modules (app1, app2, app3, etc.) which are based on individual applications we are running in the infrastructure.
I am trying to update the state of one module at a time (e.g. app1.tf). I am not sure how I can do this.
Use case: I would like only one module's launch configuration (LC) to be updated to use the latest AMI or security group.
I tried the -target option in terraform, but this does not seem to work because it does not check the terraform remote state file.
terraform plan -target=app1.tf
terraform apply -target=app1.tf
Therefore, no changes take place. I believe this is a bug in terraform.
Any ideas how I can accomplish this?

Terraform's -target should be for exceptional use cases only, and you should really know what you're doing when you use it. If you genuinely need to regularly target different parts at a time, then you should separate your applications into different directories so you can easily apply a whole directory at a time.
This might mean you need to use data sources or rethink the structure of things a bit more, but it also means you limit the blast radius of any single Terraform action, which is always useful.
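For what it's worth, -target expects a resource or module address rather than a .tf file name, which would explain why nothing matched. Assuming app1.tf instantiates a module labelled app1 (an assumption; substitute your actual module label), a targeted run would look like:
terraform plan -target=module.app1
terraform apply -target=module.app1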

Related

Terraform, Main.tf, refresh and Drift

I have an AWS Terraform repo where I have the architecture for an AWS solution.
Over time, people have gone onto the management console and made changes to the architecture without changing the Terraform code, causing a drift between the repo and the actual architecture on AWS.
Is there a way I can detect the drift and update my main.tf file to match the new architecture? I know you can use terraform apply -refresh to update the state file, but does this affect the main.tf file as well? Does anyone have a solution for a problem like this so that all my files are updated correctly? Thanks!
does this affect the main.tf file as well
Sadly no. main.tf is not affected.
Does anyone have a solution for a problem like this so that all my files are updated correctly?
Such a solution does not exist unless you develop your own. You have to manually update your main.tf to match the state of your resources.
However, a bit of help can come from Former2, which can scan your resources and produce Terraform code.
Terraform's work of evaluating the given configuration to determine the desired state is inherently lossy. The desired state used to produce a plan, and the updated state obtained by applying that plan, include only the final values resulting from evaluating any expressions, and it isn't possible in general to reverse updated values back to updated expressions that would produce those values.
For example, imagine that you have an argument like this:
foo = sha1("hello")
This produces a SHA-1 checksum of the string "hello". If someone changes the checksum in the remote system, Terraform can see that the checksum no longer matches but it cannot feasibly determine what new string must be provided to sha1 to produce that new checksum. This is an extreme example using an inherently irreversible function, but this general problem applies to any argument whose definition is more than just a literal value.
Instead, terraform plan -refresh-only will show you the difference between the previous run result and the refreshed state, so you can see how the final results for each argument have changed. You'll need to manually update your configuration so that it will somehow produce a value that matches that result, which is sometimes as simple as just copying the value literally into your configuration but is often more complicated because arguments in a resource block can be derived from data elsewhere in your module and transformed arbitrarily using Terraform functions.
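In command form, the workflow is roughly the following, assuming a Terraform version recent enough to support the -refresh-only mode:
terraform plan -refresh-only    # review how the refreshed state differs from the last run
terraform apply -refresh-only   # optionally accept the refreshed values into the state file
terraform plan                  # after hand-editing the configuration, confirm no diff remains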

AWS EC2 to run a Python program using LaTeX and OpenCV

A friend and I are working on a machine learning project together. We've managed to collect about 5,000 TeX documents (we hope to get up to around 100,000 soon). We have a Python script that we run on each document to do some text manipulation, extract particular parts of the TeX code, compile the parts, convert the compiled parts to cropped PNG images, and search a converted PNG of the full TeX document for the cropped images using OpenCV. The code takes between 30 seconds and 2 minutes per document on the ones we've tried so far, so we really need to speed it up.
I've been tasked with gaining access to a computer cluster and figuring out how to implement our code on such a cluster. Someone suggested I look into using AWS, so I've made an account and have been trying to figure out how to use EC2 for the past few hours. Am I on the right track, or is there some other part of AWS or something else entirely that would be better suited to my task?
Whatever I use, it has to have access to the various Python libraries in our code and to pdflatex and the full set of TeX packages. Is this possible on EC2? I have almost no idea how to go about using EC2 (I've managed to start some instances, but how do I use them to run my script? Do I need to change my Python script to accommodate the parallel processing, or does EC2 take care of that somehow? Is it as easy as starting a Linux instance and installing the programs I need like I would on any other Linux machine?). None of the tutorials are immediately useful, and I'm still not even sure if EC2 is capable of doing what I'm looking for. Any advice is appreciated.
I wouldn't normally answer this kind of question, but it sounds like you are doing something interesting, so let's have a go.
Q1.
"We have a python script that we run on each document to do some text
manipulation, extract particular parts of the tex code, compile the
parts, convert the compiled parts to cropped PNG images, and search a
converted PNG of the full tex for the cropped images using OpenCV.. we
really need to speed it up"
You could probably split the 100,000 documents into 10 parts, set up 10 instances of the processing software, and do the run in parallel.
To set up 10 identical instances there are many methods, but one of the simpler ways is to set up one machine as desired, take a snapshot, make an AMI, and then use the AMI to launch as many copies as you need.
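That launch step can be scripted with the AWS CLI along these lines; the AMI ID, instance type, and key pair name below are placeholders:
aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --count 10 \
    --instance-type c5.xlarge \
    --key-name my-key-pair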
There might be an extra step of putting the results of the search into some kind of central database.
I don't know anything about OpenCV, but there are several suggestions that it might go faster on a G3 instance type (which has a GPU). Google for "OpenCV on AWS".
Q2.
"trying to figure out how to use EC2 for the past few hours. Am I on
the right track, or is there some other part of AWS or something else
entirely that would be better suited to my task?"
EC2 is a general-purpose virtual machine, so if you already have code that runs on some other machine, it is easy to move it to EC2.
EC2 has many features, but one you might find interesting is "spot instances": short-lived but cheap (typically around 10% of the on-demand price) instances.
Q3.
"Whatever I use, it has to have access to the various python libraries in our code and to pdflatex and the full set of tex packages. Is this possible on EC2?"
Yes: you can pip install the Python libraries and install pdflatex and the TeX packages from OS packages, just like on any other system.
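On an Ubuntu-based instance, for example, the setup might be as simple as the following (package names assumed; adjust for your distribution):
sudo apt-get update
sudo apt-get install -y texlive-full python3-pip   # texlive-full includes pdflatex and the full TeX package set
pip3 install opencv-python                         # OpenCV bindings for Python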
Q4.
"How do I use them to run my script? Do I need to change my Python script to accommodate the parallel processing, or does EC2 take care of that somehow? Is it as easy as starting a Linux instance and installing the programs I need like I would on any other Linux machine?"
As described above, your basic task seems to scale well; you may need a step to collate the results. And yes, it is basically the same as any other Linux machine.
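As a sketch, each instance's driver could be a simple loop over its own slice of the documents (the paths and script name here are made up):
#!/bin/sh
# Process this instance's slice of documents one at a time
for doc in /data/slice-01/*.tex; do
    python3 process_doc.py "$doc" >> /data/results/slice-01.log
done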

Change stored macro SAS

In SAS, using the SASMSTORE option, I can specify a place where the SASMACR catalog will exist. Some macros will reside in this catalog.
At some point I may need to change a macro, and this may occur while the macro (and therefore the catalog) is in use by another user. The catalog will then be locked and unavailable for modification.
How can I avoid such a situation?
If you're using a SAS Macro catalog as a public catalog that is shared among colleagues, a few options exist.
First, use SVN or a similar source control option so that you and your colleagues each have a local copy of the macro catalog. This is my preferred option. I'd do this, and also probably not use stored compiled macros (SCMs) - I'd just set it up as autocall macros, personally - because that makes it easy to resolve conflicts (as you have separate files for each macro). With SCMs you won't be able to resolve conflicts, so you'll have to make sure everyone is very well behaved about always downloading the newest copy before making any changes, and discusses any changes so you don't have two competing changes made at about the same time. If SCMs are important for your particular use case, you could version control the macro source files that build the SCM and rebuild the SCM yourself every time you refresh your local copy of the sources.
Second, you could and should separate development from production here. Even if you have a shared library located on a shared network folder, you should have a development copy as well that is explicitly not locked by anyone except when developing a new macro for it (or updating a currently used macro). Then make your changes there, and on a consistent schedule push them out once they've been tested and verified (preferably in a test environment, so you have the classic three: dev, test, and prod environments). Something like this:
Changes in Dev are pushed to Test on Wednesdays. Anyone who's got something ready to go by Wednesday 3pm puts it in a folder (the macro source code, that is), and it's compiled into the test SCM automatically.
Test is then verified Thursday and Friday. Anything that is verified in Test by 3pm Friday is pushed to the Prod source code folder at that time, paying attention to any potential conflicts with other new code in test (nothing's pushed to prod if something currently in test but not verified could conflict with it).
Production is then built at 3pm Friday. Everyone has to be out of the SCM by then.
I suggest not using Friday for prod if you have something that runs over the weekend, of course, as it risks you having to fix something over the weekend.
Create two folders, e.g. maclib1 and maclib2, and a dataset which stores the current library number.
When you want to rebuild your library, query the current number, increment (or reset to 1 if it's already 2), assign your macro library path to the corresponding folder, compile your macros, and then update the dataset with the new library number.
When it comes to assigning your library, query the current library number from the dataset, and assign the library path accordingly.
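A minimal sketch of that scheme, assuming a control dataset ctl.current with a numeric variable libnum and macro folders /shared/maclib1 and /shared/maclib2 (all names and paths here are placeholders):

/* Rebuild step: compile into the folder NOT currently in use */
proc sql noprint;
    select libnum into :cur trimmed from ctl.current;  /* trimmed needs SAS 9.3+ */
quit;
%let new = %eval(3 - &cur);  /* 1 -> 2, 2 -> 1 */

libname newlib "/shared/maclib&new";
options mstored sasmstore=newlib;
/* %include the macro source files here; each macro defined with the / store option */

/* Flip the pointer so sessions assigning the library pick up the fresh copy */
data ctl.current;
    libnum = &new;
run;

/* Assignment step, run by each user session */
proc sql noprint;
    select libnum into :cur trimmed from ctl.current;
quit;
libname maclib "/shared/maclib&cur" access=readonly;
options mstored sasmstore=maclib;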

Is it possible to set a variable in one build configuration and refer to it in subsequent, dependent configurations?

What I'm trying to achieve is the following:
I have multiple dependent configurations for a single, logical build. The very first configuration runs a script that does a bit of work and returns a value. You can think of this configuration as the setup step. I need to be able to store this value and use it in subsequent steps. All dependent configurations for a single build should receive the same value.
Setup() computes a value x. I then have configurations B(x) and A(x) that run after Setup() and need to be fed the calculated value x.
Previously, I've managed to do something similar for things that are calculated as part of the TeamCity configuration. E.g. I generated a unique build id for the entire build chain and was able to access it via %dep.{team_city_configuration_id}.system.build.number%.
This time, the value I need to propagate is calculated in the guts of a build script and not as part of the TeamCity plumbing. I've managed to wrap the setup script in question and grep out the value I need, but I don't know how to propagate it between configurations.
Is this even possible, or am I barking up the wrong tree? If I cannot do this in a non-insane way, is there a better alternative I'm missing?
Thanks
Can a mod close this, please? It's a dupe. My colleague found this, which does exactly what we wanted.

Referencing information in builds specified in a run parameter [Hudson]

Day 1 of using Hudson for our CI build. Slowly but surely getting up to speed.
My question is about run parameters. I've seen that I can use them to reference a particular run of a particular project - that's all fine.
What I don't understand (and can't find any documentation on - there's nothing at Parameterized Build) is how I refer to anything in the run defined by the run parameter.
Essentially I want to reference the %BUILD_NUMBER% and %SVN_REVISION% of the run that is selected in the run parameter.
How can I do that?
Do you really need to add extra property values or extra parameters to your job?
Since BUILD_NUMBER and SVN_REVISION are already defined as environment variables (see Building a software project), you can use those in your job.
"When a Hudson job executes, it sets some environment variables that you may use in your shell script, batch command, or Ant script"
This illustrates that you already have those values at your disposal.
You can then use them to define other environment variables/properties within your shell or ant script.
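For example, in an "Execute shell" build step (the artifact name below is made up):
# BUILD_NUMBER and SVN_REVISION are set by Hudson for every run
echo "Building #${BUILD_NUMBER} at SVN revision ${SVN_REVISION}"
ARTIFACT_NAME="myapp-${BUILD_NUMBER}-r${SVN_REVISION}.tar.gz"   # derive further values as needed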
When it comes to passing a variable value from one job to another, the Parameterized Trigger Plugin should do the trick (see the sketch after the list below):
The parameters section can contain a combination of one or more of the following:
a set of predefined properties
properties from a properties file read from the workspace of the triggering build
the parameters of the current build
"Subversion revision": makes sure the triggered projects are built with the same revision(s) of the triggering build.
You still have to make sure those projects are actually configured to check out the right Subversion URLs.
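As a sketch of the "predefined properties" option, the trigger configuration can pass the current build's values downstream under names of your choosing (the names here are made up):
UPSTREAM_BUILD_NUMBER=$BUILD_NUMBER
UPSTREAM_SVN_REVISION=$SVN_REVISION
The downstream job then receives these as build parameters it can reference like any other variable.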
Note: there might be an issue with the Join Plugin, which might not work when the Parameterized Trigger is in action.