What does 'Logging' do in Dockerrun.aws.json?

I'm struggling to work out what the Logging tag does in the Dockerrun.aws.json file for a Single Container Docker Configuration. All the official docs say about it is: "Logging – Maps the log directory inside the container."
This sounds like it essentially creates a volume from /var/log on the EC2 instance to the directory in the container's filesystem specified by Logging. I have the following Dockerrun.aws.json file:
{
  "AWSEBDockerrunVersion": "1",
  ...
  "Logging": "/var/log/supervisor"
}
However, when I go to the AWS Console and request the logs for my instance, none of my custom log files located in /var/log/supervisor are in the log bundles. Can anyone explain what the purpose of this Logging tag is and how I can use it (or not) to retrieve my custom logs?
EDIT
Here are the Volume mappings for my container (didn't think to check that):
"Volumes": {
"/var/cache/nginx": "/var/lib/docker/vfs/dir/ff6ecc190ba3413660a946c557f14a104f26d33ecd13a1a08d079a91d2b5158e",
"/var/log/supervisor": "/var/log/eb-docker/containers/eb-current-app"
},
"VolumesRW": {
"/var/cache/nginx": true,
"/var/log/supervisor": true
}
It turns out that /var/log/supervisor maps to /var/log/eb-docker/containers/eb-current-app on the host, rather than to /var/log as I originally suspected. It'd be nice if this were clearer in the documentation.
But it also turns out that I was running the wrong Docker Image which explains why my log files weren't appearing anywhere! Doh!
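If you want to confirm where the Logging directory actually lands on the host, a quick check over SSH on the Elastic Beanstalk instance is to look at the host-side path from the volume mapping above and at the container's volume configuration. A minimal sketch (container ID and paths come from this particular setup):

# On the Elastic Beanstalk EC2 instance: host-side directory backing the Logging path
ls -l /var/log/eb-docker/containers/eb-current-app/

# Inspect the running container's volume mappings
docker ps
docker inspect <container-id> | grep -A 5 '"Volumes"'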


Get container image label without pulling the image from GCR

I am trying to create a dataset for at least 250 container images built by Docker and pushed to a single GCP project on Google Container Registry (GCR). The registry is highly active, so image versions change quite frequently, hence the automation.
All of these images get a certain label applied at push time by the CI system, and I want to include those labels in the dataset. I tried accessing the label and its value after pulling each image, but pulling 250+ images and then inspecting them takes too many resources for this automation and may not even be feasible.
So in short, I just want to know if there's any gcloud API (REST or CLI) that can fetch the label metadata without pulling the image first.
I looked in the docs but couldn't find anything. The following command only gives the SHA256 digest and the repository details, not the labels:
gcloud container images describe gcr.io/[PROJECT-ID]/[IMAGE]
# Output
image_summary:
  digest: sha256:[SHA_DIGEST_HERE]
  fully_qualified_digest: gcr.io/[PROJECT-ID]/[IMAGE]@sha256:[SHA_DIGEST_HERE]
  registry: gcr.io
  repository: [PROJECT-ID]/[IMAGE]
Update:
I tried a curl request against the registry API with an access token, which gave me the image manifest (config digest and layers) instead:
$> curl https://gcr.io:443/v2/[PROJECT-ID]/[IMAGE]/manifests/latest -H "Authorization: Bearer {token}"
// output
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": [size],
    "digest": "sha256:[SHA_256_DIGEST]"
  },
  "layers": [
    // different layers here
  ]
}
I'm not sure how to actually get from this manifest to the image metadata (the labels) and look into it.
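For what it's worth, the labels live in the image config blob that the manifest's config.digest points to, and that blob can be fetched over the same Registry v2 API without pulling any layers. A sketch (assumes the same bearer token works, that gcr.io serves the config blob directly or via a follow-able redirect, and that jq is installed; skopeo is an alternative tool, not something from the original question):

# 1) Fetch the image config blob referenced by config.digest in the manifest above;
#    labels set at build time live under .config.Labels in that JSON.
curl -sL "https://gcr.io/v2/[PROJECT-ID]/[IMAGE]/blobs/sha256:[SHA_256_DIGEST]" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  | jq '.config.Labels'

# 2) Alternatively, a registry client such as skopeo does both steps for you,
#    again without pulling the layers (GCR accepts an oauth2 access token as credentials):
skopeo inspect --creds "oauth2accesstoken:$(gcloud auth print-access-token)" \
  docker://gcr.io/[PROJECT-ID]/[IMAGE]:latest | jq '.Labels'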
I want something like what this question is asking, but for GCR instead of dockerhub.
At the moment, only Artifact Registry has the Label Repositories option to identify and group related repositories.
If you want to work with labels, you may want to migrate from Google Container Registry to Artifact Registry in order to use Label Repositories.
Another option is to file this as a feature request. Be advised that a feature request doesn't come with a specific ETA, but you can keep track of its progress by following the thread once the ticket has been created.

What precautions do I need to take when sharing an AWS Amplify project publicly?

I'm creating a security camera IoT project that uploads images to S3 and will soon offer a UI to review those images. AWS Amplify is being used to make this happen quickly.
As I get started on the Amplify side of things, I'm noticing a config file with very specifically named attributes and values. The team-provider-info.json file in particular, which isn't git-ignored, is very specific:
{
  "dev": {
    "awscloudformation": {
      "AuthRoleName": "amplify-twintigersecurityweb-dev-123456-authRole",
      "UnauthRoleArn": "arn:aws:iam::111164163333:role/amplify-twintigersecurityweb-dev-123456-unauthRole",
      "AuthRoleArn": "arn:aws:iam::111164163333:role/amplify-twintigersecurityweb-dev-123456-authRole",
      "Region": "us-east-1",
      "DeploymentBucketName": "amplify-twintigersecurityweb-dev-123456-deployment",
      "UnauthRoleName": "amplify-twintigersecurityweb-dev-123456-unauthRole",
      "StackName": "amplify-twintigersecurityweb-dev-123456",
      "StackId": "arn:aws:cloudformation:us-east-1:111164163333:stack/amplify-twintigersecurityweb-dev-123456/88888888-8888-8888-8888-888838f58888",
      "AmplifyAppId": "dddd7dx2zipppp"
    }
  }
}
May I post this to my public repository without worry? Is there a chance for conflict in naming? How would one pull this in for use in their new project?
Per AWS Amplify documentation:
If you want to share a project publicly and open source your serverless infrastructure, you should remove or put the amplify/team-provider-info.json file in gitignore file.
At a glance, everything else generated by amplify init that is NOT already in the .gitignore file is OK to share, e.g. project-config.json and backend-config.json.
Add this to .gitignore:
# not to share if public
amplify/team-provider-info.json
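One caveat worth noting: if team-provider-info.json has already been committed, adding it to .gitignore alone won't stop Git from tracking it (and it stays in history). A minimal sketch of untracking it going forward; purging it from old commits is a separate exercise:

# Stop tracking the file while keeping the local copy
git rm --cached amplify/team-provider-info.json
git commit -m "Remove team-provider-info.json from version control"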

Is there a way to get info at runtime about the Spark metrics configuration?

I added a metrics.properties file with a CSV sink to the resources directory of my Maven project. Everything is fine when I run the Spark app locally - the metrics appear. But when I run the same fat jar on Amazon EMR, I don't see any attempt to write metrics to the CSV sink. So I want to check at runtime which settings the Spark metrics subsystem has actually loaded. Is there any way to do this?
I looked into SparkEnv.get.metricsSystem but didn't find anything useful there.
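One pragmatic check on EMR is to look at the rendered configuration files on the cluster nodes rather than poking at the running application; a sketch (paths assume EMR's default Spark layout):

# On the EMR master node: what metrics configuration did Spark actually render?
cat /etc/spark/conf/metrics.properties
# Any metrics-related overrides in spark-defaults?
grep -i metrics /etc/spark/conf/spark-defaults.conf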
The missing metrics are basically down to Spark on EMR not picking up your custom metrics.properties file from the resources directory of the fat jar.
On EMR, the preferred way to configure this is through the EMR Configurations API, where you pass the classification and properties as embedded JSON.
For the Spark metrics subsystem, here is an example that modifies a couple of metrics settings:
[
  {
    "Classification": "spark-metrics",
    "Properties": {
      "*.sink.csv.class": "org.apache.spark.metrics.sink.CsvSink",
      "*.sink.csv.period": "1"
    }
  }
]
You can use this JSON when creating the EMR cluster through the Amazon Console or through the CLI/SDK.
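A minimal sketch of passing the classification above from a file when launching a cluster with the AWS CLI (the cluster name, release label, and instance settings here are placeholders, not values from the original question):

# spark-metrics.json contains the classification JSON shown above
aws emr create-cluster \
  --name "spark-with-csv-metrics" \
  --release-label emr-5.30.0 \
  --applications Name=Spark \
  --configurations file://spark-metrics.json \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --use-default-roles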

How to set up different uploaded file storage locations for Laravel 5.2 in local deployment and AWS EB w/ S3?

I'm working on a Laravel 5.2 application where users can send a file by POST, the application stores that file in a certain location and retrieves it on demand later. I'm using Amazon Elastic Beanstalk. For local development on my machine, I would like the files to store in a specified local folder on my machine. And when I deploy to AWS-EB, I would like it to automatically switch over and store the files in S3 instead. So I don't want to hard code something like \Storage::disk('s3')->put(...) because that won't work locally.
What I'm trying to do here is similar to what I was able to do for environment variables for database connectivity... I was able to find some great tutorials where you create an .env.elasticbeanstalk file, create a config file at ~/.ebextensions/01envconfig.config to automatically replace the standard .env file on deployment, and modify a few lines of your database.php to automatically pull the appropriate variable.
How do I do something similar with file storage and retrieval?
Ok. Got it working. In /config/filesystems.php, I changed:
'default' => 'local',
to:
'default' => env('DEFAULT_STORAGE') ?: 'local',
In my .env.elasticbeanstalk file (see the original question for an explanation of what this is), I added the following (I'm leaving out my actual key and secret values):
DEFAULT_STORAGE=s3
S3_KEY=[insert your key here]
S3_SECRET=[insert your secret here]
S3_REGION=us-west-2
S3_BUCKET=cameraflock-clips-dev
Note that I had to specify my region as us-west-2 even though S3 shows my environment as Oregon.
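For completeness, a sketch of how the s3 disk entry in config/filesystems.php can consume those environment variables (the env key names match the ones above; the rest follows Laravel 5.2's stock disk layout, so adjust to your actual config):

's3' => [
    'driver' => 's3',
    'key'    => env('S3_KEY'),     // credentials from .env / .env.elasticbeanstalk
    'secret' => env('S3_SECRET'),
    'region' => env('S3_REGION'),
    'bucket' => env('S3_BUCKET'),
],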
In my upload controller, I don't specify a disk. Instead, I use:
\Storage::put($filePath, $filePointer, 'public');
This way, it always uses my "default" disk for the \Storage operation. If I'm in my local environment, that's my public folder. If I'm in AWS-EB, then my Elastic Beanstalk .env file goes into effect and \Storage defaults to S3 with appropriate credentials.

Elasticsearch Snapshot Fails With RepositoryMissingException

Three node ElasticSearch cluster on AWS. Bigdesk and Head both show a healthy cluster. All three nodes are running ES 1.3, and the latest Amazon Linux updates. When I fire off a snapshot request like:
http://localhost:9200/_snapshot/taxanalyst/201409031540-snapshot?wait_for_completion=true
the server churns away for several minutes before responding with the following:
{
  "snapshot": {
    "snapshot": "201409031521-snapshot",
    "indices": [
      "docs",
      "pdflog"
    ],
    "state": "PARTIAL",
    "start_time": "2014-09-03T19:21:36.034Z",
    "start_time_in_millis": 1409772096034,
    "end_time": "2014-09-03T19:28:48.685Z",
    "end_time_in_millis": 1409772528685,
    "duration_in_millis": 432651,
    "failures": [
      {
        "node_id": "ikauhFYEQ02Mca8fd1E4jA",
        "index": "pdflog",
        "reason": "RepositoryMissingException[[faxmanalips] missing]",
        "shard_id": 0,
        "status": "INTERNAL_SERVER_ERROR"
      }
    ],
    "shards": {
      "total": 10,
      "failed": 1,
      "successful": 9
    }
  }
}
These are three nodes on three different virtual EC2 machines, but they're able to communicate via 9300/9200 without any problems. Indexing and searching works as expected. There doesn't appear to be anything in the elasticsearch log files that speaks to the server error.
Does anyone know what's going on here, or at least where a good place to start would be?
UPDATE: Turns out that each node in the cluster needs to have a snapshot directory that matches the location specified when you register the repository with the Elasticsearch cluster.
I guess the next question is: when you want to tgz up the snapshot directory so you can archive it, or provision a backup cluster, is it sufficient to just tgz the snapshot directory on the master node? Or do you have to somehow consolidate the snapshot directories of all the nodes? (That can't be right, can it?)
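A quick way to see which location the registered repository actually points to (so you can verify that the same path really is available on every node) is to query the repository definition; a sketch against the node from the question:

# Show the settings of the registered snapshot repository
curl -XGET 'http://localhost:9200/_snapshot/taxanalyst?pretty'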
Elasticsearch supports a shared file system repository, which uses a shared file system to store snapshots.
In order to register a shared file system repository, you have to mount the same shared filesystem to the same location on all master and data nodes.
You also need to put that same repository path into the elasticsearch.yml of all 3 nodes via path.repo, for example:
path.repo: ["/my_repository"]
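Once the shared mount and path.repo are in place, registering the repository itself is a single API call; a sketch (the repository name and path here are placeholders):

curl -XPUT 'http://localhost:9200/_snapshot/my_fs_repo' -d '{
  "type": "fs",
  "settings": {
    "location": "/my_repository"
  }
}'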
I think you are looking for the AWS cloud plugin for Elasticsearch (I guess you already installed it to configure your cluster): https://github.com/elasticsearch/elasticsearch-cloud-aws#s3-repository
It will allow you to create a repository mapped to an S3 bucket.
To use (create/restore/whatever) a snapshot, you need to create a repository first. Then, whenever you act on a snapshot, Elasticsearch will manage it directly in your S3 bucket.
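For reference, a sketch of what that looks like with the cloud-aws plugin (the bucket, region, and repository/snapshot names are placeholders):

# Register an S3-backed repository
curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repo' -d '{
  "type": "s3",
  "settings": {
    "bucket": "my-snapshot-bucket",
    "region": "us-east-1"
  }
}'

# Then create a snapshot into it
curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repo/snapshot_1?wait_for_completion=true'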