I'm using the latest version of AWS OpenSearch, but when I go to the Trace Analytics dashboard it does not show the traces sent by Data Prepper.
Manually instrumented OpenTelemetry application
Data Prepper is running in Docker (opensearchproject/data-prepper:latest)
OpenSearch is running on the latest version
Sample Configuration
data-prepper-config.yaml
ssl: false
pipelines.yaml
entry-pipeline:
  delay: "100"
  source:
    otel_trace_source:
      ssl: false
  sink:
    - pipeline:
        name: "raw-pipeline"
    - pipeline:
        name: "service-map-pipeline"
raw-pipeline:
  delay: "100"
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - otel_trace_raw:
  sink:
    - opensearch:
        hosts: [ "https://opensearch-domain" ]
        username: "admin"
        password: "admin"
        index_type: trace-analytics-raw
service-map-pipeline:
  delay: "100"
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - service_map_stateful:
  sink:
    - opensearch:
        hosts: ["https://opensearch-domain"]
        username: "admin"
        password: "admin"
        index_type: trace-analytics-service-map
remote-collector.yaml
...
exporters:
  otlp/data-prepper:
    endpoint: data-prepper-address:21890
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/data-prepper]
When I go to the Query Workbench and run the query SELECT * FROM otel-v1-apm-span, I get the list of received trace spans. But I can't see any charts on the Trace Analytics dashboard (neither Traces nor Services); it's just an empty dashboard.
I'm also getting a warning:
WARN org.opensearch.dataprepper.plugins.processor.oteltrace.OTelTraceRawProcessor - Missing trace group for SpanId: xxxxxxxxxxxx
The traceGroupFields are also empty.
"traceGroupFields": {
"endTime": null,
"durationInNanos": null,
"statusCode": null
}
Is there something wrong with my setup? Any help is appreciated.
I am using an AWS instance and I am trying to run Promtail in order to fetch logs and forward them to the Loki server. Promtail, Loki and Grafana are all run through Docker.
The Loki server is running on port 3100, Promtail on 3400, and Grafana on 8001. Since it is an AWS instance, what needs to be done so that Promtail stops throwing errors against the http://43.206.43.87:3100/loki/api/v1/push endpoint?
Here is my promtail-config.yaml
server:
  http_listen_port: 3400
  grpc_listen_port: 0
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://43.206.43.87:3100/loki/api/v1/push
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - 43.206.43.87
        labels:
          job: varlogs
          __path__: /var/log/*log
Here is my loki-config.yaml
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 0
common:
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    instance_addr: 43.206.43.87
    kvstore:
      store: inmemory
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
ruler:
  alertmanager_url: http://localhost:9093
Please help me out
All I had to do was change the loki-config.yaml file so that it reads
instance_addr: localhost
and everything worked.
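For reference, a sketch of the changed common/ring block from the loki-config.yaml above (only instance_addr changes):

common:
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    instance_addr: localhost   # changed from 43.206.43.87
    kvstore:
      store: inmemory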
Below is my filebeat.yml file, which should send logs only from the /home/ubuntu/logs/test-app/path.log path mentioned below. But it is sending all the logs, including /var/log/syslog and /var/log/auth.log. Please give me clarification on how to avoid sending system logs.
filebeat.yml
filebeat.inputs:
  - type: syslog
    enabled: false
  - type: log
    enabled: true
    paths:
      - home/ubuntu/logs/test-app/path.log
logging:
  level: info
  to_files: true
  to_syslog: false
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
output.logstash:
  hosts: ["ip:5044"]
Check whether you have the system module enabled:
filebeat modules list | head
cat /etc/filebeat/modules.d/system.yml
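If the system module shows up as enabled there, disabling it should stop Filebeat from shipping /var/log/syslog and /var/log/auth.log. Assuming a package-based install managed by systemd, something like:

# disable the system module and restart Filebeat to apply it
sudo filebeat modules disable system
sudo systemctl restart filebeat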
Also, use the filestream input instead of log, as the latter will be deprecated (a sketch follows the link below):
https://www.elastic.co/guide/en/beats/filebeat/8.2/filebeat-input-filestream.html
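A minimal sketch of the equivalent filestream input, assuming the /home/ubuntu/logs/test-app/path.log path from your description:

filebeat.inputs:
  - type: filestream
    id: test-app-logs          # arbitrary id for this input, required in newer versions
    enabled: true
    paths:
      - /home/ubuntu/logs/test-app/path.log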
How can I solve this problem? I am trying to run an Akka cluster on minikube, but the cluster fails to form.
17:46:49.093 [appka-akka.actor.default-dispatcher-12] WARN akka.management.cluster.bootstrap.internal.HttpContactPointBootstrap - Probing [http://172-17-0-3.default.pod.cluster.local:8558/bootstrap/seed-nodes] failed due to: Tcp command [Connect(172-17-0-3.default.pod.cluster.local:8558,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
My config is:
akka {
  actor {
    provider = cluster
  }
  cluster {
    shutdown-after-unsuccessful-join-seed-nodes = 60s
  }
  coordinated-shutdown.exit-jvm = on
  management {
    cluster.bootstrap {
      contact-point-discovery {
        discovery-method = kubernetes-api
      }
    }
  }
}
My deployment YAML:
kind: Deployment
metadata:
  labels:
    app: appka
  name: appka
spec:
  replicas: 2
  selector:
    matchLabels:
      app: appka
  template:
    metadata:
      labels:
        app: appka
    spec:
      containers:
        - name: appka
          image: akkacluster:latest
          imagePullPolicy: Never
          readinessProbe:
            httpGet:
              path: /ready
              port: management
            periodSeconds: 10
            failureThreshold: 10
            initialDelaySeconds: 20
          livenessProbe:
            httpGet:
              path: /alive
              port: management
            periodSeconds: 10
            failureThreshold: 10
            initialDelaySeconds: 20
          ports:
            - name: management
              containerPort: 8558
              protocol: TCP
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: remoting
              containerPort: 25520
              protocol: TCP
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-reader
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "watch", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: read-pods
subjects:
  - kind: User
    name: system:serviceaccount:default:default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
Unfortunately, my cluster is not forming:
kubectl logs pod/appka-7c4b7df7f7-5v7cc
17:46:32.026 [appka-akka.actor.default-dispatcher-3] INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
SLF4J: A number (1) of logging calls during the initialization phase have been intercepted and are
SLF4J: now being replayed. These are subject to the filtering rules of the underlying logging system.
SLF4J: See also http://www.slf4j.org/codes.html#replay
17:46:33.644 [appka-akka.actor.default-dispatcher-3] INFO akka.remote.artery.tcp.ArteryTcpTransport - Remoting started with transport [Artery tcp]; listening on address [akka://appka#172.17.0.4:25520] with UID [-8421566647681174079]
17:46:33.811 [appka-akka.actor.default-dispatcher-3] INFO akka.cluster.Cluster - Cluster Node [akka://appka#172.17.0.4:25520] - Starting up, Akka version [2.6.14] ...
17:46:34.491 [appka-akka.actor.default-dispatcher-3] INFO akka.cluster.Cluster - Cluster Node [akka://appka#172.17.0.4:25520] - Registered cluster JMX MBean [akka:type=Cluster]
17:46:34.512 [appka-akka.actor.default-dispatcher-3] INFO akka.cluster.Cluster - Cluster Node [akka://appka#172.17.0.4:25520] - Started up successfully
17:46:34.883 [appka-akka.actor.default-dispatcher-3] INFO akka.cluster.Cluster - Cluster Node [akka://appka#172.17.0.4:25520] - No downing-provider-class configured, manual cluster downing required, see https://doc.akka.io/docs/akka/current/typed/cluster.html#downing
17:46:34.884 [appka-akka.actor.default-dispatcher-3] INFO akka.cluster.Cluster - Cluster Node [akka://appka#172.17.0.4:25520] - No seed nodes found in configuration, relying on Cluster Bootstrap for joining
17:46:39.084 [appka-akka.actor.default-dispatcher-11] INFO akka.management.internal.HealthChecksImpl - Loading readiness checks [(cluster-membership,akka.management.cluster.scaladsl.ClusterMembershipCheck), (sharding,akka.cluster.sharding.ClusterShardingHealthCheck)]
17:46:39.090 [appka-akka.actor.default-dispatcher-11] INFO akka.management.internal.HealthChecksImpl - Loading liveness checks []
17:46:39.104 [appka-akka.actor.default-dispatcher-3] INFO ClusterListenerActor$ - started actor akka://appka/user - (class akka.actor.typed.internal.adapter.ActorRefAdapter)
17:46:39.888 [appka-akka.actor.default-dispatcher-3] INFO akka.management.scaladsl.AkkaManagement - Binding Akka Management (HTTP) endpoint to: 172.17.0.4:8558
17:46:40.525 [appka-akka.actor.default-dispatcher-3] INFO akka.management.scaladsl.AkkaManagement - Including HTTP management routes for ClusterHttpManagementRouteProvider
17:46:40.806 [appka-akka.actor.default-dispatcher-3] INFO akka.management.scaladsl.AkkaManagement - Including HTTP management routes for ClusterBootstrap
17:46:40.821 [appka-akka.actor.default-dispatcher-3] INFO akka.management.cluster.bootstrap.ClusterBootstrap - Using self contact point address: http://172.17.0.4:8558
17:46:40.914 [appka-akka.actor.default-dispatcher-3] INFO akka.management.scaladsl.AkkaManagement - Including HTTP management routes for HealthCheckRoutes
17:46:44.198 [appka-akka.actor.default-dispatcher-3] INFO akka.management.cluster.bootstrap.ClusterBootstrap - Initiating bootstrap procedure using kubernetes-api method...
17:46:44.200 [appka-akka.actor.default-dispatcher-3] INFO akka.management.cluster.bootstrap.ClusterBootstrap - Bootstrap using `akka.discovery` method: kubernetes-api
17:46:44.226 [appka-akka.actor.default-dispatcher-3] INFO akka.management.scaladsl.AkkaManagement - Bound Akka Management (HTTP) endpoint to: 172.17.0.4:8558
17:46:44.487 [appka-akka.actor.default-dispatcher-6] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Locating service members. Using discovery [akka.discovery.kubernetes.KubernetesApiServiceDiscovery], join decider [akka.management.cluster.bootstrap.LowestAddressJoinDecider], scheme [http]
17:46:44.490 [appka-akka.actor.default-dispatcher-6] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Looking up [Lookup(appka,None,Some(tcp))]
17:46:44.493 [appka-akka.actor.default-dispatcher-6] INFO akka.discovery.kubernetes.KubernetesApiServiceDiscovery - Querying for pods with label selector: [app=appka]. Namespace: [default]. Port: [None]
17:46:45.626 [appka-akka.actor.default-dispatcher-12] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Looking up [Lookup(appka,None,Some(tcp))]
17:46:45.627 [appka-akka.actor.default-dispatcher-12] INFO akka.discovery.kubernetes.KubernetesApiServiceDiscovery - Querying for pods with label selector: [app=appka]. Namespace: [default]. Port: [None]
17:46:48.428 [appka-akka.actor.default-dispatcher-13] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Located service members based on: [Lookup(appka,None,Some(tcp))]: [ResolvedTarget(172-17-0-4.default.pod.cluster.local,None,Some(/172.17.0.4)), ResolvedTarget(172-17-0-3.default.pod.cluster.local,None,Some(/172.17.0.3))], filtered to [172-17-0-4.default.pod.cluster.local:0, 172-17-0-3.default.pod.cluster.local:0]
17:46:48.485 [appka-akka.actor.default-dispatcher-22] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Located service members based on: [Lookup(appka,None,Some(tcp))]: [ResolvedTarget(172-17-0-4.default.pod.cluster.local,None,Some(/172.17.0.4)), ResolvedTarget(172-17-0-3.default.pod.cluster.local,None,Some(/172.17.0.3))], filtered to [172-17-0-4.default.pod.cluster.local:0, 172-17-0-3.default.pod.cluster.local:0]
17:46:48.586 [appka-akka.actor.default-dispatcher-12] INFO akka.management.cluster.bootstrap.LowestAddressJoinDecider - Discovered [2] contact points, confirmed [0], which is less than the required [2], retrying
17:46:49.092 [appka-akka.actor.default-dispatcher-12] WARN akka.management.cluster.bootstrap.internal.HttpContactPointBootstrap - Probing [http://172-17-0-4.default.pod.cluster.local:8558/bootstrap/seed-nodes] failed due to: Tcp command [Connect(172-17-0-4.default.pod.cluster.local:8558,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
17:46:49.093 [appka-akka.actor.default-dispatcher-12] WARN akka.management.cluster.bootstrap.internal.HttpContactPointBootstrap - Probing [http://172-17-0-3.default.pod.cluster.local:8558/bootstrap/seed-nodes] failed due to: Tcp command [Connect(172-17-0-3.default.pod.cluster.local:8558,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
17:46:49.603 [appka-akka.actor.default-dispatcher-22] INFO akka.management.cluster.bootstrap.LowestAddressJoinDecider - Discovered [2] contact points, confirmed [0], which is less than the required [2], retrying
17:46:49.682 [appka-akka.actor.default-dispatcher-21] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Looking up [Lookup(appka,None,Some(tcp))]
17:46:49.683 [appka-akka.actor.default-dispatcher-21] INFO akka.discovery.kubernetes.KubernetesApiServiceDiscovery - Querying for pods with label selector: [app=appka]. Namespace: [default]. Port: [None]
17:46:49.726 [appka-akka.actor.default-dispatcher-12] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Located service members based on: [Lookup(appka,None,Some(tcp))]: [ResolvedTarget(172-17-0-4.default.pod.cluster.local,None,Some(/172.17.0.4)), ResolvedTarget(172-17-0-3.default.pod.cluster.local,None,Some(/172.17.0.3))], filtered to [172-17-0-4.default.pod.cluster.local:0, 172-17-0-3.default.pod.cluster.local:0]
17:46:50.349 [appka-akka.actor.default-dispatcher-21] WARN akka.management.cluster.bootstrap.internal.HttpContactPointBootstrap - Probing [http://172-17-0-3.default.pod.cluster.local:8558/bootstrap/seed-nodes] failed due to: Tcp command [Connect(172-17-0-3.default.pod.cluster.local:8558,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
17:46:50.504 [appka-akka.actor.default-dispatcher-11] WARN akka.management.cluster.bootstrap.internal.HttpContactPointBootstrap - Probing [http://172-17-0-4.default.pod.cluster.local:8558/bootstrap/seed-nodes] failed due to: Tcp command [Connect(172-17-0-4.default.pod.cluster.local:8558,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
You are missing the akka.remote settings block. Something like:
akka {
  actor {
    # provider=remote is possible, but prefer cluster
    provider = cluster
  }
  remote {
    artery {
      transport = tcp # See Selecting a transport below
      canonical.hostname = "127.0.0.1"
      canonical.port = 25520
    }
  }
}
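Note that 127.0.0.1 only works for a single local node; when running in Kubernetes, the canonical hostname usually has to be the pod's own IP so that other pods (and the bootstrap probes on port 8558) can reach it. A common pattern, sketched below with a hypothetical POD_IP environment variable injected via the Downward API (fieldRef: status.podIP), looks like:

akka {
  remote {
    artery {
      transport = tcp
      # POD_IP is a hypothetical env var, injected via the Kubernetes Downward API
      canonical.hostname = ${?POD_IP}
      canonical.port = 25520
    }
  }
}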
We deployed our site behind GCLB.
LB -> Cloud Run -> App Engine API
Cloud Run is hosting a React site, and App Engine hosts a Golang API.
After 12 hours we started to see a decline in the number of clicks via Google Analytics, but traffic stayed pretty much the same.
Our assumption is that we "lost" traffic somehow. I can see two main issues in the logs:
1. 404s with addresses of old site components.
2. client disconnected before any response errors.
I can understand the 404 errors; they are cached requests looking for old site components.
But I don't understand the client disconnected error and whether it's related to our "lost" traffic.
Any suggestions on how to analyze our "lost" traffic?
UPDATE:
I found some correlation with the client disconnected error.
The requestUrl contains image resources, for example:
images/zoom.png?v1.0
The backend service name is empty: backend_service_name: ""
Not sure how it can be empty; I mapped all the resources and hosts.
LOG
{
  "insertId": "cs2fmdg2eo8nba",
  "jsonPayload": {
    "cacheId": "FRA-1209ea83",
    "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry",
    "statusDetails": "client_disconnected_before_any_response"
  },
  "httpRequest": {
    "requestMethod": "GET",
    "requestUrl": "https://travelpricedrops.com/images/aero.png?v1.0",
    "requestSize": "78",
    "userAgent": "Mozilla/5.0 (iPhone; CPU iPhone OS 14_8 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1",
    "remoteIp": "109.104.52.1",
    "referer": "https://travelpricedrops.com/passthru?tab=front&vert=flights&origin-iata=LEJ&destination-iata=JFK&departure-time=2021-12-26T11%3A00%3A00Z&cabin-class=economy&num-adults=1&num-youth=0&rental-duration=6&dta=48&return-time=2022-01-01T11%3A00%3A00Z&f=cf&fuid=1102&b=k&buid=1043",
    "cacheLookup": true,
    "latency": "0.071958s"
  },
  "resource": {
    "type": "http_load_balancer",
    "labels": {
      "zone": "global",
      "backend_service_name": "",
      "forwarding_rule_name": "tpd-int-https-ipv4",
      "target_proxy_name": "int-tpd-target-proxy-2",
      "url_map_name": "int-tpd",
      "project_id": "tpdrops"
    }
  },
  "timestamp": "2021-11-09T06:13:55.121455Z",
  "severity": "INFO",
  "logName": "projects/tpdrops/logs/requests",
  "trace": "projects/tpdrops/traces/13821ba38ae9e3191381f3f64b0a7b1a",
  "receiveTimestamp": "2021-11-09T06:13:55.343086132Z",
  "spanId": "a5ae86336a24bc32"
}
Config
**gcloud compute forwarding-rules describe tpd-int-https-ipv4**
IPAddress: 34.149.93.11
IPProtocol: TCP
creationTimestamp: '2021-08-30T11:49:06.047-07:00'
description: ''
fingerprint: CIAg3TcEb9Y=
id: '1815919129513727693'
kind: compute#forwardingRule
labelFingerprint: 42WmSpB8rSM=
loadBalancingScheme: EXTERNAL
name: tpd-int-https-ipv4
networkTier: PREMIUM
portRange: 443-443
selfLink: https://www.googleapis.com/compute/v1/projects/tpdrops/global/forwardingRules/tpd-int-https-ipv4
target: https://www.googleapis.com/compute/v1/projects/tpdrops/global/targetHttpsProxies/int-tpd-target-proxy-2
**gcloud compute backend-services describe tpd-prod-back**
affinityCookieTtlSec: 0
backends:
- balancingMode: UTILIZATION
  capacityScaler: 0.0
  group: https://www.googleapis.com/compute/v1/projects/tpdrops/regions/us-central1/networkEndpointGroups/tpd-front
cdnPolicy:
  cacheKeyPolicy:
    includeHost: true
    includeProtocol: true
    includeQueryString: true
  cacheMode: CACHE_ALL_STATIC
  clientTtl: 3600
  defaultTtl: 3600
  maxTtl: 86400
  negativeCaching: false
  requestCoalescing: true
  serveWhileStale: 86400
  signedUrlCacheMaxAgeSec: '0'
connectionDraining:
  drainingTimeoutSec: 0
creationTimestamp: '2021-10-25T04:09:29.908-07:00'
description: ''
enableCDN: true
fingerprint: 5FNZk6GXJTw=
iap:
  enabled: false
id: '6357784085114072710'
kind: compute#backendService
loadBalancingScheme: EXTERNAL
logConfig:
  enable: true
  sampleRate: 1.0
name: tpd-prod-back
port: 80
portName: http
protocol: HTTP
selfLink: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-prod-back
sessionAffinity: NONE
timeoutSec: 30
**gcloud compute url-maps describe int-tpd**
creationTimestamp: '2021-08-29T06:08:35.918-07:00'
defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-prod-back
fingerprint: trtG9xBMlvE=
hostRules:
- hosts:
  - acpt.travelpricedrops.com
  pathMatcher: path-matcher-2
- hosts:
  - int.travelpricedrops.com
  pathMatcher: path-matcher-1
- hosts:
  - api.acpt.travelpricedrops.com
  pathMatcher: path-matcher-3
- hosts:
  - api.int.travelpricedrops.com
  pathMatcher: path-matcher-4
- hosts:
  - api.travelpricedrops.com
  pathMatcher: path-matcher-5
- hosts:
  - travelpricedrops.com
  pathMatcher: path-matcher-6
id: '6018005644614187068'
kind: compute#urlMap
name: int-tpd
pathMatchers:
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-acpt-back
  name: path-matcher-2
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-int-http
  name: path-matcher-1
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-api-acpt
  name: path-matcher-3
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-api-int
  name: path-matcher-4
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-api
  name: path-matcher-5
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-prod-back
  name: path-matcher-6
selfLink: https://www.googleapis.com/compute/v1/projects/tpdrops/global/urlMaps/int-tpd
**gcloud compute target-http-proxies describe int-tpd-target-proxy-2**
ERROR: (gcloud.compute.target-http-proxies.describe) Could not fetch resource:
- The resource 'projects/tpdrops/global/targetHttpProxies/int-tpd-target-proxy-2' was not found
Your load balancer's configuration looks OK; you have an SSL-secured HTTPS frontend on port 443 pointing to an HTTP backend on port 80, which means that SSL is terminated at the load balancer and traffic is sent as plain HTTP to your backend.
The error you're getting means (per the documentation) that the client disconnected before the load balancer could reply:
client_disconnected_before_any_response - The connection to the client was broken before the load balancer sent any response.
Now to answer your questions.
1. Since the images are served directly by your app (I didn't see any host/path rules saying otherwise), make sure the application can serve images in time. Set your application response timeout to 10 seconds or more and this should solve the issue. Have a look at this discussion, which may be quite useful for you.
1.1 There's also a configurable request timeout for Cloud Run services - you can check it by running gcloud run services describe SERVICE_NAME (a sketch of the relevant commands is at the end of this answer).
2. The backend_service_name: "" label you mentioned may be empty - nothing to worry about - this is expected behavior.
Additionally, have a look at the backend service timeout section in Timeouts and retries for external load balancing, which may also shed some light on your case.
Lastly, have a look at How to debug failed requests with client_disconnected_before_any_response.
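Building on points 1 and 1.1, here is a rough sketch of how both timeouts could be inspected and raised with gcloud. SERVICE_NAME is a placeholder and the us-central1 region is an assumption based on the NEG in your backend output; substitute your own values:

# Check the current Cloud Run request timeout
gcloud run services describe SERVICE_NAME --region us-central1 \
  --format="value(spec.template.spec.timeoutSeconds)"

# Raise the Cloud Run request timeout, e.g. to 60 seconds
gcloud run services update SERVICE_NAME --region us-central1 --timeout=60

# Raise the load balancer's backend service timeout to match (currently timeoutSec: 30 above)
gcloud compute backend-services update tpd-prod-back --global --timeout=60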