I want to create a separate log stream for each file matching the pattern sync-*.log (for example, for the file sync-site1.log the stream name would be server-sync-site1). There are a lot of log files matching the sync-*.log pattern, so creating them manually one by one is a bad option for me. How can I do it? I'm using an older agent.
Here is my current config:
[server-sync.log]
datetime_format = %Y-%m-%dT%H:%M:%S,%f
file = /home/ec2-user/log/sync/sync-*.log
buffer_duration = 5000
log_stream_name = server-sync
initial_position = end_of_file
multi_line_start_pattern = {datetime_format}
log_group_name = server
There is a well-defined API call for this purpose, CreateLogStream; the corresponding call in boto3 is create_log_stream:
import boto3

client = boto3.client('logs')
response = client.create_log_stream(
    logGroupName='string',
    logStreamName='string'
)
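Since there are many files, the same call can simply be looped over the matching file names. Here is a minimal sketch, assuming the files live under /home/ec2-user/log/sync/ (as in the config above) and that stream names should follow the server-sync-<site> pattern from the question:
import glob
import os
import boto3

client = boto3.client('logs')

# Run this where the log files are visible (e.g. on the server itself).
# Creates one stream per file matching sync-*.log, named server-sync-<site>.
for path in glob.glob('/home/ec2-user/log/sync/sync-*.log'):
    site = os.path.basename(path)[len('sync-'):-len('.log')]
    client.create_log_stream(
        logGroupName='server',              # log_group_name from the config above
        logStreamName='server-sync-' + site
        # calling this again for an existing stream raises ResourceAlreadyExistsException
    )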
There is a Terraform resource for this as well, aws_cloudwatch_log_stream:
resource "aws_cloudwatch_log_group" "yada" {
name = "Yada"
}
resource "aws_cloudwatch_log_stream" "foo" {
name = "SampleLogStream1234"
log_group_name = aws_cloudwatch_log_group.yada.name
}
I am trying to set up Terraform so that I can have an array of cluster parameters and then do a for_each in a Redshift module to create them all, like so:
module "redshift" { # hypothetical module name
  for_each = local.env[var.tier][var.region].clusters
  source   = "terraform-aws-modules/redshift/aws"

  cluster_identifier    = "${each.value.name}"
  allow_version_upgrade = true
  node_type             = "dc2.large"
  number_of_nodes       = 2
  database_name         = "${each.value.database}"
  master_username       = "${each.value.admin_user}"

  create_random_password = false
  master_password        = "${each.value.admin_password}"

  encrypted            = true
  kms_key_arn          = xxxxx
  enhanced_vpc_routing = false

  vpc_security_group_ids = xxxxxx
  subnet_ids             = xxxxxx
  publicly_accessible    = true
  iam_role_arns          = xxxxxx

  # Parameter group
  parameter_group_name = xxxxxx

  # Subnet group
  create_subnet_group = false
  subnet_group_name   = xxxxxx

  # Maintenance
  preferred_maintenance_window = "sat:01:00-sat:01:30"

  # Backup Details
  automated_snapshot_retention_period = 30
  manual_snapshot_retention_period    = -1
}
But I also want to add an additional user aside from the admin user to each of these clusters. I am struggling to find a way to do this in terraform. Any advice would be appreciated! Thanks!
There are two ways to do this:
You can use the Terraform Redshift Provider, which allows you to create a redshift_user.
Use local-exec to invoke JDBC, Python, or ODBC tools that create your user with SQL commands; a sketch of the Python route follows below.
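For the second option, here is a minimal sketch of creating the extra user with SQL from Python using psycopg2; the endpoint, database name, credentials, and the new user's name are placeholders and would come from your module outputs or locals:
import psycopg2

# Placeholder connection details; in practice take them from the module outputs
conn = psycopg2.connect(
    host='my-cluster.xxxxxx.us-east-1.redshift.amazonaws.com',
    port=5439,
    dbname='mydb',
    user='admin_user',
    password='admin_password',
)
conn.autocommit = True

with conn.cursor() as cur:
    # Create the additional, non-admin user (name and password are examples)
    cur.execute("CREATE USER app_user PASSWORD 'Str0ngPassword'")

conn.close()
This could equally be wrapped in a local-exec provisioner or run as a separate step after terraform apply.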
How do I send a cloud-init script to a GCP instance using Terraform?
Documentation is very sparse around this topic.
You need the following:
A cloud-init file (say 'conf.yaml')
#cloud-config
# Create an empty file on the system
write_files:
- path: /root/CLOUD_INIT_WAS_HERE
A cloudinit_config data source; gzip and base64_encode must be set to false (they are true by default).
data "cloudinit_config" "conf" {
gzip = false
base64_encode = false
part {
content_type = "text/cloud-config"
content = file("conf.yaml")
filename = "conf.yaml"
}
}
A metadata section under the google_compute_instance resource
metadata = {
  user-data = "${data.cloudinit_config.conf.rendered}"
}
I would like to find out whether a Terraform data source can place its output in a text file. I looked for this online but was not able to find anything. The plan is to get the load balancer name, and then our automation script will run an aws-cli command using the load balancer name taken from the data source.
If your CLB name is auto-generated by Terraform, you can save it to a file using local_file:
resource "aws_elb" "clb" {
availability_zones = ["ap-southeast-2a"]
listener {
instance_port = 8000
instance_protocol = "http"
lb_port = 80
lb_protocol = "http"
}
}
resource "local_file" "foo" {
content = <<-EOL
${aws_elb.clb.name}
EOL
filename = "${path.module}/clb_name.txt"
}
output "clb_name" {
value = aws_elb.clb.name
}
But maybe it would be easier to get the output value directly as json:
clb_name=$(terraform output -json clb_name | jq -r .)
echo ${clb_name}
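If the consuming automation is Python rather than shell, here is a small sketch of reading that output and using the name with boto3 instead of the aws-cli; the describe call is only an example of what "using the name" might look like:
import json
import subprocess
import boto3

# Read the CLB name from terraform output; run this in the Terraform directory
clb_name = json.loads(
    subprocess.check_output(['terraform', 'output', '-json', 'clb_name'])
)

# Example use of the name: look up the classic load balancer's DNS name
elb = boto3.client('elb')
resp = elb.describe_load_balancers(LoadBalancerNames=[clb_name])
print(resp['LoadBalancerDescriptions'][0]['DNSName'])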
Despite using the depends_on directive, it looks like the zip is not created before Terraform tries to put it in the bucket. Judging by the pipeline output, it simply omits archiving the files before firing the upload to the bucket. Both files (index.js and package.json) exist.
resource "google_storage_bucket" "cloud-functions" {
project = var.project-1-id
name = "${var.project-1-id}-cloud-functions"
location = var.project-1-region
}
resource "google_storage_bucket_object" "start_instance" {
name = "start_instance.zip"
bucket = google_storage_bucket.cloud-functions.name
source = "${path.module}/start_instance.zip"
depends_on = [
data.archive_file.start_instance,
]
}
data "archive_file" "start_instance" {
type = "zip"
output_path = "${path.module}/start_instance.zip"
source {
content = file("${path.module}/scripts/start_instance/index.js")
filename = "index.js"
}
source {
content = file("${path.module}/scripts/start_instance/package.json")
filename = "package.json"
}
}
Terraform has been successfully initialized!
$ terraform apply -input=false "planfile"
google_storage_bucket_object.stop_instance: Creating...
google_storage_bucket_object.start_instance: Creating...
Error: open ./start_instance.zip: no such file or directory
on cloud_functions.tf line 41, in resource "google_storage_bucket_object" "start_instance":
41: resource "google_storage_bucket_object" "start_instance" {
LOGS:
2020-11-18T13:02:56.796Z [DEBUG] plugin.terraform-provider-google_v3.40.0_x5: 2020/11/18 13:02:56 [WARN] Failed to read source file "./start_instance.zip". Cannot compute md5 hash for it.
2020/11/18 13:02:56 [WARN] Provider "registry.terraform.io/hashicorp/google" produced an invalid plan for google_storage_bucket_object.stop_instance, but we are tolerating it because it is using the legacy plugin SDK.
The following problems may be the cause of any confusing errors from downstream operations:
- .detect_md5hash: planned value cty.StringVal("different hash") does not match config value cty.NullVal(cty.String)
2020/11/18 13:02:56 [WARN] Provider "registry.terraform.io/hashicorp/google" produced an invalid plan for google_storage_bucket_object.start_instance, but we are tolerating it because it is using the legacy plugin SDK.
The following problems may be the cause of any confusing errors from downstream operations:
- .detect_md5hash: planned value cty.StringVal("different hash") does not match config value cty.NullVal(cty.String)
I had exactly the same issue with a GitLab CI/CD pipeline. After some digging I found a discussion explaining that, with this setup, the plan and apply stages run in separate containers and the archiving step is executed in the plan stage.
A workaround is to create a dummy trigger with null_resource and force the archive_file to depend on it, so that it is executed in the apply stage.
resource "null_resource" "dummy_trigger" {
  triggers = {
    timestamp = timestamp()
  }
}

resource "google_storage_bucket" "cloud-functions" {
  project  = var.project-1-id
  name     = "${var.project-1-id}-cloud-functions"
  location = var.project-1-region
}

resource "google_storage_bucket_object" "start_instance" {
  name   = "start_instance.zip"
  bucket = google_storage_bucket.cloud-functions.name
  source = "${path.module}/start_instance.zip"

  depends_on = [
    data.archive_file.start_instance,
  ]
}

data "archive_file" "start_instance" {
  type        = "zip"
  output_path = "${path.module}/start_instance.zip"

  source {
    content  = file("${path.module}/scripts/start_instance/index.js")
    filename = "index.js"
  }

  source {
    content  = file("${path.module}/scripts/start_instance/package.json")
    filename = "package.json"
  }

  depends_on = [
    null_resource.dummy_trigger,
  ]
}
I'm running the awslogs agent on a server, and when I look in CloudWatch logs in the AWS console, the logs are about 60 minutes behind. Our server produces about 650MB of data per hour, and it appears that the agent is not able to keep up.
Here is our abbreviated config file:
[application.log]
datetime_format = %Y-%m-%d %H:%M:%S
time_zone = UTC
file = var/output/logs/application.json.log*
log_stream_name = {hostname}
initial_position = start_of_file
log_group_name = ApplicationLog
[service_log]
datetime_format = %Y-%m-%dT%H:%M:%S
time_zone = UTC
file = var/output/logs/service.json.log*
log_stream_name = {hostname}
initial_position = start_of_file
log_group_name = ServiceLog
Is there a common way to speed up the awslogs agent?
The amount of data (about 0.2 MB/s) is not an issue for the agent; it can handle roughly 3 MB/s per log file. However, if you use the same log stream for multiple log files, the writers for those files all push to the same stream and end up blocking each other. The throughput more than halves when you share a stream between log files.
Also, there are a few properties that can be configured that may have an impact on performance:
buffer_duration = <integer>
batch_count = <integer>
batch_size = <integer>
To solve my issue, I did two things:
Drastically increased the batch size (it defaults to 32768 bytes)
Used a different log stream for each log file
And the agent had no problems keeping up. Here's my final config file:
[application.log]
datetime_format = %Y-%m-%d %H:%M:%S
time_zone = UTC
file = var/output/logs/application.json.log*
log_stream_name = {hostname}-app
initial_position = start_of_file
log_group_name = ApplicationLog
batch_size = 524288
[service_log]
datetime_format = %Y-%m-%dT%H:%M:%S
time_zone = UTC
file = var/output/logs/service.json.log*
log_stream_name = {hostname}-service
initial_position = start_of_file
log_group_name = ServiceLog
batch_size = 524288
The awslogs agent supports log rotation, so this:
file = var/output/logs/application.json.log*
would pick up too many files. Try:
file = var/output/logs/application.json.log
to speed up the process.