Creating a DC/OS service with artifacts from HDFS

I'm trying to create DC/OS services that download artifacts (custom config files, etc.) from HDFS. I was using a simple FTP server for this before, but I want to switch to HDFS. Using "hdfs://" in an artifact URI is allowed, but it doesn't work correctly.
The artifact fetch fails because there is no "hadoop" command. Odd. I read that I need to provide my own Hadoop installation for this.
So I downloaded Hadoop and set up the necessary variables in /etc/profile. I can run "hadoop" without any problem when SSH'ing into the node, but the service still fails with the same error.
It seems the environment variables configured in the service are applied only after the artifact fetch, because they have no effect at all. It also looks like services completely ignore the /etc/profile file.
So my question is: how do I set everything up so my service can fetch artifacts stored on HDFS?

The Mesos fetcher supports local Hadoop clients; please check your agent configuration, in particular your --hadoop_home setting.
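As a sketch, assuming Hadoop is unpacked under /opt/hadoop on the node: Mesos agent flags can also be set through MESOS_-prefixed environment variables, and on a typical DC/OS install the agent reads extra environment from a file like the one below (the file path and unit name are assumptions about your install):

```shell
# Tell the Mesos agent where Hadoop lives so the fetcher can
# resolve hdfs:// artifact URIs, then restart the agent.
echo 'MESOS_HADOOP_HOME=/opt/hadoop' | sudo tee -a /var/lib/dcos/mesos-slave-common
sudo systemctl restart dcos-mesos-slave
```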

Related

How to retrieve heapdump in PCF using SMB

I need the -XX:+HeapDumpOnOutOfMemoryError and -XX:HeapDumpPath options in the PCF manifest.yml to create a heap dump on OutOfMemory.
I understand I can use SMB or NFS in the VM args, but how do I retrieve the heap dump file when the app goes OutOfMemory and is not accessible?
Kindly help.
I need the -XX:+HeapDumpOnOutOfMemoryError and -XX:HeapDumpPath options in the PCF manifest.yml to create a heap dump on OutOfMemory
You don't need to set these options. The Java buildpack will take care of this for you. By default, it installs a jvmkill agent which will automatically do this.
https://github.com/cloudfoundry/java-buildpack/blob/main/docs/jre-open_jdk_jre.md#jvmkill
In addition, the jvmkill agent is smart enough that if you bind a SMB or NFS volume service to your application, it will automatically save the heap dumps to that location. From the doc link above...
If a Volume Service with the string heap-dump in its name or tag is bound to the application, terminal heap dumps will be written with the pattern <CONTAINER_DIR>/<SPACE_NAME>-<SPACE_ID[0,8]>/<APPLICATION_NAME>-<APPLICATION_ID[0,8]>/<INSTANCE_INDEX>--<INSTANCE_ID[0,8]>.hprof
The key is that you name the bound volume service appropriately, i.e. the name must contain the string heap-dump.
You may also do the same thing with non-terminal heap dumps using the Java Memory Agent that the Java buildpack can install for you upon request.
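As a sketch of the naming convention, assuming an NFS volume service is available in your marketplace (the service, plan, share, mount path, and app names below are placeholders, not part of the original answer):

```shell
# Create a volume service whose name contains "heap-dump" so the
# jvmkill agent writes terminal heap dumps to it once bound.
cf create-service nfs Existing my-heap-dump-volume \
  -c '{"share": "nfs.example.com/export/dumps"}'
cf bind-service my-app my-heap-dump-volume -c '{"mount": "/var/heap-dumps"}'
cf restage my-app
```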
I understand I can use SMB or NFS in the VM args, but how do I retrieve the heap dump file when the app goes OutOfMemory and is not accessible?
To retrieve the heap dumps you need to somehow access the file server. I say "somehow" because it entirely depends on what you are allowed to do in your environment.
You may be permitted to mount the SMB/NFS volume directly to your PC. You could then access the files directly.
You may be able to retrieve the files through some other protocol like HTTP or FTP or SFTP.
You may be able to mount the SMB or NFS volume to another application, perhaps using the static file buildpack, to serve up the files for you.
You may need to request the files from an administrator with access.
Your best bet is to talk with the admin for your SMB or NFS server. He or she can inform you about the options available to you in your environment.

How to achieve multiple gcs backends in terraform

Within our team, we each have our own dev project, plus shared test and prod environments.
We are currently migrating from Deployment Manager and the gcloud CLI to Terraform. However, we haven't been able to figure out a way to create isolated backends with the GCS backend. We have noticed that remote backends support setting a dedicated workspace, but we haven't been able to set up something similar with GCS.
Is it possible to state that Terraform resource A will have a configurable backend that we can adjust per project, or is the equivalent possible with workspaces?
That way we could use tfvars and var parameters to switch between projects.
As it stands, every time we attempt to make the backend configurable through vars, terraform init fails with:
Error: Variables not allowed
How does one go about creating isolated backends for each project?
Or, if that isn't possible, how can we guarantee that with multiple projects a shared backend state will not collide, corrupting the state?
Your backend, meaning your backend bucket, must be known when you run your terraform init command.
If you don't want to use workspaces, you have to customize the backend value before running init. We use make to achieve this: according to the environment, make creates a backend.tf file with the correct backend name and then runs the init command.
EDIT 1
We have this piece of (sh) script which creates the backend file before triggering the terraform command (it's our Makefile that runs this):
cat > $TF_export_dir/backend.tf << EOF
terraform {
  backend "gcs" {
    bucket = "$TF_subsidiary-$TF_environment-$TF_deployed_application_code-gcs-tfstatebackend"
    prefix = "terraform/state"
  }
}
EOF
Of course, the bucket name pattern depends on our project. $TF_environment is the most important variable: depending on the value set, a different bucket is reached.
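An alternative to generating backend.tf is Terraform's partial backend configuration: declare an empty gcs backend block and supply the per-environment values as -backend-config options at init time (the bucket naming below is an illustrative assumption):

```shell
# backend.tf contains only the backend type:
#   terraform {
#     backend "gcs" {}
#   }
# The per-environment values are then passed at init time:
terraform init \
  -backend-config="bucket=$TF_environment-gcs-tfstatebackend" \
  -backend-config="prefix=terraform/state"
```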

How to handle private configuration file when deploying?

I am deploying a Django application using the following steps:
Push updates to Git
Log into AWS
Pull updates from Git
The issue I am having is with my production.py settings file. I have it in my .gitignore so it does not get uploaded to GitHub, for security. This, of course, means it is not available when I pull updates onto my server.
What is a good approach for making this file available to my app on the server without uploading it to GitHub, where it would be exposed?
It is definitely a good idea not to check secrets into your repository. However, there's nothing wrong with checking in configuration that is not secret if it's an intrinsic part of your application.
In large scale deployments, typically one sets configuration using a tool for that purpose like Puppet, so that all the pieces that need to be aware of a particular application's configuration can be generated from one source. Similarly, secrets are usually handled using a secret store like Vault and injected into the environment when the process starts.
If you're just running a single server, it's probably fine to adjust your configuration or application to read secrets from the environment (or possibly a separate file) and set those values on the server. You can then include other configuration settings (secrets excluded) as a file in the repository. If you need more flexibility later, you can pick up other tools then.
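A minimal sketch of the environment-variable approach in a Django settings module (the variable names are illustrative assumptions, not Django requirements):

```python
import os

def load_setting(name: str, default: str = "") -> str:
    """Read one configuration value from the environment, falling
    back to a default when the variable is not set."""
    return os.environ.get(name, default)

# Secrets come from the environment; non-secret settings can stay
# in the checked-in settings file.
SECRET_KEY = load_setting("DJANGO_SECRET_KEY")
DEBUG = load_setting("DJANGO_DEBUG", "false").lower() == "true"
```

On the server, the values can then be set once (e.g. in the service's unit file or an env file outside the repository) instead of living in GitHub.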

Running .net core web app on AWS Beanstalk - file write permissions

Ok, so I've got a web application written in .NET Core, which I've deployed to AWS Elastic Beanstalk. That was pretty easy, but I've already hit a snag.
The application fetches JSON data from an external source and writes it to a local file, currently wwwroot/data/data.json under the project root. Once deployed to AWS, this functionality throws an access-denied exception when it tries to write the file.
I've seen something about creating a folder called .ebextensions containing a config file with container commands to set permissions on certain paths/files after deployment. I've tried doing that, but it does not seem to do anything for me. I don't even know whether those commands are being executed, so I have no idea what's happening, if anything.
This is the config file I created under the .ebextensions folder:
{
  "container_commands": {
    "01-aclchange": {
      "command": "icacls \"C:/inetpub/AspNetCoreWebApps/app/wwwroot/data\" /grant DefaultAppPool:(OI)(CI)"
    }
  }
}
The name of the .config file matches the application name in AWS, but I also read somewhere that the name does not matter, as long as it has the .config extension.
Has anyone successfully done something like this? Any pointers appreciated.
Rather than trying to fix permission issues writing to the local storage within AWS Elastic Beanstalk, I would instead suggest using something like Amazon S3 for storing files. Some benefits would be:
Not having to worry about file permissions.
S3 files are persistent.
You may run into issues with losing local files when you republish your application.
If you ever move to using something like containers, you will lose the file every time the container is taken down.
S3 is incredibly cheap to use.

Cloud Foundry Change Buildpack from Command Line

I have a Jenkins app running on Cloud Foundry for a POC. Since it's Jenkins it uses a bound service for file persistence.
I had to make a change to the Java Buildpack and would like Jenkins to use the updated buildpack.
I could pull the Jenkins source from GitHub and push it again with updated references to the new buildpack in the manifest.yml file or via a command-line option. In theory, the bound file system service's state would remain intact. However, I haven't validated this assumption and am concerned I might lose the state.
I've looked through the client CLI to see if there's a way to explicitly swap buildpacks without another push. However, I didn't see anything.
Is anyone aware of a way to change the buildpack of an existing application without re-pushing it to Cloud Foundry?
After some research I couldn't find any way to swap the buildpack without a push. I did discover that my bound file system service remained intact and I didn't lose any work.
Answer: re-push to change the buildpack.
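For reference, the re-push can point at the updated buildpack explicitly from the command line (the app name and buildpack URL below are placeholders):

```shell
# Re-push the existing app with an explicit buildpack; bound service
# instances stay attached to the app across the push.
cf push jenkins -b https://github.com/example/java-buildpack.git
```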