How to configure istio-proxy to log traceId?

I am using Istio version 1.3.5. Is there any configuration that allows istio-proxy to log the traceId? I have Jaeger tracing (with the Zipkin protocol) enabled. There is one thing I want to accomplish with traceId logging:
- Log correlation across multiple upstream services, so that I can filter all logs by a given traceId.

According to the Envoy proxy documentation for Envoy v1.12.0, which is used by Istio 1.3:
Trace context propagation
Envoy provides the capability for reporting tracing information regarding communications between services in the mesh. However, to be able to correlate the pieces of tracing information generated by the various proxies within a call flow, the services must propagate certain trace context between the inbound and outbound requests.
Whichever tracing provider is being used, the service should propagate the x-request-id to enable logging across the invoked services to be correlated.
The tracing providers also require additional context, to enable the parent/child relationships between the spans (logical units of work) to be understood. This can be achieved by using the LightStep (via OpenTracing API) or Zipkin tracer directly within the service itself, to extract the trace context from the inbound request and inject it into any subsequent outbound requests. This approach would also enable the service to create additional spans, describing work being done internally within the service, that may be useful when examining the end-to-end trace.
Alternatively the trace context can be manually propagated by the service:
- When using the LightStep tracer, Envoy relies on the service to propagate the x-ot-span-context HTTP header while sending HTTP requests to other services.
- When using the Zipkin tracer, Envoy relies on the service to propagate the B3 HTTP headers (x-b3-traceid, x-b3-spanid, x-b3-parentspanid, x-b3-sampled, and x-b3-flags). The x-b3-sampled header can also be supplied by an external client to either enable or disable tracing for a particular request. In addition, the single b3 header propagation format is supported, which is a more compressed format.
- When using the Datadog tracer, Envoy relies on the service to propagate the Datadog-specific HTTP headers (x-datadog-trace-id, x-datadog-parent-id, x-datadog-sampling-priority).
TLDR: the trace context is not propagated for you; the service itself has to forward the B3 HTTP headers (including x-b3-traceid) from each inbound request to its outbound requests (a minimal sketch follows below).
Additional information: https://github.com/openzipkin/b3-propagation
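To make that concrete, here is a minimal sketch, assuming a Flask service calling a downstream service with the requests library (neither of which is from the question), of copying the B3 headers from an inbound request onto an outbound one so the sidecars can stitch the spans into one trace:

import requests
from flask import Flask, request

app = Flask(__name__)

# Headers Envoy/Istio expects the application to forward unchanged.
TRACE_HEADERS = [
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "x-ot-span-context",
]

def forward_trace_headers(incoming_headers):
    # Copy whichever trace headers are present on the inbound request.
    return {h: incoming_headers[h] for h in TRACE_HEADERS if h in incoming_headers}

@app.route("/frontend")
def frontend():
    # Pass the same trace context on to the downstream service; its sidecar
    # will then report spans under the same traceId.
    resp = requests.get(
        "http://backend:8080/work",  # hypothetical downstream service
        headers=forward_trace_headers(request.headers),
        timeout=3.0,
    )
    return resp.text, resp.status_code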

Related

OpenCensus distributed tracing propagation using both Flask & requests

I'm attempting to implement distributed tracing via OpenCensus across different services where each service is using Flask for serving and sends downstream requests using... er.. requests to build its reply. The tracing platform is GCP Cloud Trace.
I'm using FlaskMiddleware, which traces the initial call correctly, but there's no propagation of span info between the source and target service, even when the middleware has a propagation defined (I've tried a few):
#middleware = FlaskMiddleware(app, exporter=exporter, sampler=sampler, excludelist_paths=['healthz', 'metrics'], propagator=google_cloud_format.GoogleCloudFormatPropagator())
middleware = FlaskMiddleware(app, exporter=exporter, sampler=sampler, excludelist_paths=['healthz', 'metrics'], propagator=b3_format.B3FormatPropagator())
#middleware = FlaskMiddleware(app, exporter=exporter, sampler=sampler, excludelist_paths=['healthz', 'metrics'], propagator=trace_context_http_header_format.TraceContextPropagator())
I guess the question is: does anyone have an example of setting up tracing that traverses several downstream calls, when each service serves via Flask and makes downstream requests via requests?
Currently I end up with each service having its own trace with a single span.
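For reference, one way to handle the outbound half manually is to serialize the active span context with the same propagator and attach it to each outgoing call. A minimal sketch, assuming opencensus-python's execution_context and the B3FormatPropagator from the snippet above (the wrapper name and downstream URL are illustrative):

import requests
from opencensus.trace import execution_context
from opencensus.trace.propagation import b3_format

propagator = b3_format.B3FormatPropagator()

def traced_get(url, **kwargs):
    # FlaskMiddleware stores the tracer for the inbound request in the
    # execution context; serialize its span context into B3 headers.
    tracer = execution_context.get_opencensus_tracer()
    headers = propagator.to_headers(tracer.span_context)
    headers.update(kwargs.pop("headers", {}) or {})
    return requests.get(url, headers=headers, **kwargs)

# Usage inside a Flask handler:
#   reply = traced_get("http://downstream-service/api")  # hypothetical URL

Depending on the opencensus version, there may also be a requests integration that patches outgoing calls automatically, which would make a wrapper like this unnecessary.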

Making GCP Load-balancer HTTP logs integrate with Cloud Trace (in GCP Logs Explorer)?

Cloud Trace and Cloud Logging integrate quite nicely in most cases, described in https://cloud.google.com/trace/docs/trace-log-integration
Unfortunately, this doesn't seem to include the HTTP request logs generated by a Load Balancer when request logging is enabled.
The LB logs show the traces icon, and are correctly associated with an overall trace in the Cloud Trace system, but the context menu 'show trace details' is greyed out for those log items.
A similar problem arose with my application level logging/tracing, and was solved by setting the traceSampled attribute on the LogEntry, but this can't work for LB logs, since I'm not in control of their generation.
In this instance I'm tracing 100% of requests since the service is M2M and fairly low volume, but in the general case it makes sense that the LB can't know if something is actually generating traces without being told.
I can't find any good references in the docs, but in theory a response header indicating it was sampled could be observed by the LB and cause it to issue the appropriate log.
Any ideas if such a feature exists, in this form or any other?
(Last-ditch workaround might be to use Logs Router to feed LB logs into a pubsub queue (and exclude them from normal logging sinks), and resubmit them to the normal sink(s) with fields appropriately set by some Cloud Function or other pubsub consumer, but that seems like a lot of work and complexity for this purpose)
There is currently a Feature Request created for this; you can follow its status in the following link 1.
As a workaround, you could implement target proxies along with your Load Balancer, according to the documentation for a Global external HTTP(S) load balancer:
The proxies set HTTP request/response headers as follows:
- Via: 1.1 google (requests and responses)
- X-Forwarded-Proto: [http | https] (requests only)
- X-Cloud-Trace-Context: <trace-id>/<span-id>;<trace-options> (requests only). Contains parameters for Cloud Trace.
- X-Forwarded-For: [<supplied-value>,]<client-ip>,<load-balancer-ip> (see X-Forwarded-For header) (requests only)
You can find the complete documentation about external HTTP(S) load balancers and target proxies here 2.
And finally, take a look at the documentation on how to use and configure target proxies here 3.
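For reference, the X-Cloud-Trace-Context value the proxy sets has the shape <trace-id>/<span-id>;<trace-options> as listed above. A minimal parsing sketch; the "o=1" convention for the options part comes from the Cloud Trace documentation and is an assumption here, not from the quoted text:

def parse_cloud_trace_context(header_value):
    # "<trace-id>/<span-id>;<trace-options>" -> (trace_id, span_id, sampled)
    trace_part, _, options = header_value.partition(";")
    trace_id, _, span_id = trace_part.partition("/")
    sampled = options.strip() == "o=1"
    return trace_id, span_id, sampled

# Example:
#   parse_cloud_trace_context("105445aa7843bc8bf206b12000100000/1;o=1")
#   -> ("105445aa7843bc8bf206b12000100000", "1", True)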

Should django health-check endpoint /ht/ be accessible from everybody?

From the documentation reported here I read:
This project checks for various conditions and provides reports when anomalous behavior is detected. The following health checks are bundled with this project: cache, database, storage, disk and memory utilization (via psutil), AWS S3 storage, Celery task queue, Celery ping, RabbitMQ, Migrations
and from the use case section:
The primary intended use case is to monitor conditions via HTTP(S), with responses available in HTML and JSON formats. When you get back a response that includes one or more problems, you can then decide the appropriate course of action, which could include generating notifications and/or automating the replacement of a failing node with a new one
And then
The /ht/ endpoint will respond with HTTP 200 if all checks passed and HTTP 500 if any of the tests failed.
From a security point of view: should this URL (https://example.com/ht) be reachable by everybody? It seems to give away quite a bit of information.
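For context, the monitoring use case described in the quoted documentation boils down to a probe like the following sketch (the host is the example.com placeholder from the question; requesting JSON via the Accept header is an assumption about the setup):

import requests

resp = requests.get(
    "https://example.com/ht/",
    headers={"Accept": "application/json"},
    timeout=5,
)
if resp.status_code == 200:
    print("healthy:", resp.text)        # 200 = all checks passed
else:
    print("unhealthy:", resp.status_code, resp.text)  # 500 = at least one failed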

Java micro service distributed tracing with Istio

Kubernetes and Istio are already installed in the cluster. Three microservices are deployed as Pods. The flow is:
- Microservice A calls Microservice B => HTTP
- Microservice B calls Microservice C => via Kafka
- Microservice A exposes an HTTP API to the outside
I guess that when a client hits the ingress, Istio generates a traceId and spanId in the HTTP headers before the request enters Service A.
Are the spanId and traceId propagated to Microservices B and C without using a separate API like Spring Cloud Sleuth?
No, Istio does not propagate the tracing headers for you. However, this can be handled on the application side without using third-party APIs.
According to Istio documentation:
Istio leverages Envoy’s distributed tracing feature to provide tracing integration out of the box. Specifically, Istio provides options to install various tracing backend and configure proxies to send trace spans to them automatically. See Zipkin, Jaeger and LightStep task docs about how Istio works with those tracing systems.
Istio documentation also has an example of application-side header propagation for the bookinfo demo application:
Trace context propagation
Although Istio proxies are able to automatically send spans, they need some hints to tie together the entire trace. Applications need to propagate the appropriate HTTP headers so that when the proxies send span information, the spans can be correlated correctly into a single trace.
To do this, an application needs to collect and propagate the following headers from the incoming request to any outgoing requests:
x-request-id
x-b3-traceid
x-b3-spanid
x-b3-parentspanid
x-b3-sampled
x-b3-flags
x-ot-span-context
Additionally, tracing integrations based on OpenCensus (e.g. Stackdriver) propagate the following headers:
x-cloud-trace-context
traceparent
grpc-trace-bin
If you look at the sample Python productpage service, for example, you see that the application extracts the required headers from an HTTP request using OpenTracing libraries:
def getForwardHeaders(request):
    headers = {}

    # x-b3-*** headers can be populated using the opentracing span
    span = get_current_span()
    carrier = {}
    tracer.inject(
        span_context=span.context,
        format=Format.HTTP_HEADERS,
        carrier=carrier)

    headers.update(carrier)

    # ...

    incoming_headers = ['x-request-id']

    # ...

    for ihdr in incoming_headers:
        val = request.headers.get(ihdr)
        if val is not None:
            headers[ihdr] = val

    return headers
The reviews application (Java) does something similar:
@GET
@Path("/reviews/{productId}")
public Response bookReviewsById(@PathParam("productId") int productId,
                                @HeaderParam("end-user") String user,
                                @HeaderParam("x-request-id") String xreq,
                                @HeaderParam("x-b3-traceid") String xtraceid,
                                @HeaderParam("x-b3-spanid") String xspanid,
                                @HeaderParam("x-b3-parentspanid") String xparentspanid,
                                @HeaderParam("x-b3-sampled") String xsampled,
                                @HeaderParam("x-b3-flags") String xflags,
                                @HeaderParam("x-ot-span-context") String xotspan) {
    if (ratings_enabled) {
        JsonObject ratingsResponse = getRatings(Integer.toString(productId), user, xreq, xtraceid, xspanid, xparentspanid, xsampled, xflags, xotspan);
        // ...
When you make downstream calls in your applications, make sure to include these headers.
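As a usage sketch, the headers collected by getForwardHeaders() above are simply attached to every outbound request so that each sidecar reports its span under the same traceId (the downstream URL and service name here are illustrative, not taken from the bookinfo sample):

import requests

def get_reviews(request, product_id):
    # Forward the incoming trace context on the outgoing call.
    headers = getForwardHeaders(request)
    url = "http://reviews:9080/reviews/" + str(product_id)  # hypothetical downstream URL
    res = requests.get(url, headers=headers, timeout=3.0)
    return res.json() if res.status_code == 200 else None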

WSO2 API Manager 2.1 : Gateway not enforcing Throttling Limits

We have deployed API-M 2.1 in a distributed way (each component (GW, TM, KM) running in its own Docker image) on top of DC/OS 1.9 (Mesos).
We have issues getting the gateway to enforce throttling policies (be it subscription tiers or app-level policies). Here is what we have established so far:
The Traffic Manager itself does its job: it receives the event streams, analyzes them on the fly, and pushes an event onto the JMS topic throttledata.
The Gateway reads the message properly.
So basically we have ruled out a communication issue.
However we found two potential issues:
In the event pushed to the TM component, the value of appTenant is null (instead of carbon.super). We have a single tenant defined.
When the gateway receives the throttling message, it lets the request through because it thinks stopOnQuotaReach is set to false, when it is actually set to true (we checked the value in the database).
Digging into the source code, we traced both issues to a single source: both values above are read from the authContext and are apparently set incorrectly there. We are stuck, running out of ideas of things to try, and would appreciate pointers to potential sources of the problem and things to check.
Can somebody help please?
Thanks - Isabelle.
Are there two TMs with HA enabled in the system?
If the TM is HA-enabled, how do the gateways publish data to the TMs? Is it load-balanced or failover data publishing?
Did you follow the articles below to configure the environment for your deployment?
http://wso2.com/library/articles/2016/10/article-scalable-traffic-manager-deployment-patterns-for-wso2-api-manager-part-1/
http://wso2.com/library/articles/2016/10/article-scalable-traffic-manager-deployment-patterns-for-wso2-api-manager-part-2/
Is throttling not working at all in your environment?
Have you noticed any JMS connection-related logs on the gateway nodes?
In these tests, we have disabled HA to avoid possible complications. Neither subscription nor app-level throttling policies are working, in both cases because parameters that should carry values (appTenant, stopOnQuotaReach) do not have the correct values.
Our scenario is far more basic. If we go with one instance of each component, it fails as Isabelle described. And the only thing we know is that both parameters come from the Authentication Context.
Thank you!