Kubernetes and Istio are already installed in the cluster, and three microservices are deployed as pods. The flow is:
Microservice A to microservice B: HTTP
Microservice B to microservice C: via Kafka
Microservice A exposes an HTTP API to the outside.
I assume that when a client hits the ingress, Istio generates a traceId and spanId in the HTTP headers before the request enters service A.
Do these spanId and traceId propagate to microservices B and C without using a separate API like Spring Cloud Sleuth?
No, Istio does not propagate tracing headers for you. However, propagation can be implemented on the application side without third-party APIs.
According to Istio documentation:
Istio leverages Envoy’s distributed tracing feature to provide tracing integration out of the box. Specifically, Istio provides options to install various tracing backends and configure proxies to send trace spans to them automatically. See the Zipkin, Jaeger and LightStep task docs about how Istio works with those tracing systems.
Istio documentation also has an example of application side header propagation for the bookinfo demo application:
Trace context propagation
Although Istio proxies are able to automatically send spans, they need some hints to tie together the entire trace. Applications need to propagate the appropriate HTTP headers so that when the proxies send span information, the spans can be correlated correctly into a single trace.
To do this, an application needs to collect and propagate the following headers from the incoming request to any outgoing requests:
x-request-id
x-b3-traceid
x-b3-spanid
x-b3-parentspanid
x-b3-sampled
x-b3-flags
x-ot-span-context
Additionally, tracing integrations based on OpenCensus (e.g. Stackdriver) propagate the following headers:
x-cloud-trace-context
traceparent
grpc-trace-bin
If you look at the sample Python productpage service, for example, you see that the application extracts the required headers from an HTTP request using OpenTracing libraries:
def getForwardHeaders(request):
    headers = {}

    # x-b3-*** headers can be populated using the opentracing span
    span = get_current_span()
    carrier = {}
    tracer.inject(
        span_context=span.context,
        format=Format.HTTP_HEADERS,
        carrier=carrier)

    headers.update(carrier)

    # ...

    incoming_headers = ['x-request-id']

    # ...

    for ihdr in incoming_headers:
        val = request.headers.get(ihdr)
        if val is not None:
            headers[ihdr] = val

    return headers
The reviews application (Java) does something similar:
@GET
@Path("/reviews/{productId}")
public Response bookReviewsById(@PathParam("productId") int productId,
                                @HeaderParam("end-user") String user,
                                @HeaderParam("x-request-id") String xreq,
                                @HeaderParam("x-b3-traceid") String xtraceid,
                                @HeaderParam("x-b3-spanid") String xspanid,
                                @HeaderParam("x-b3-parentspanid") String xparentspanid,
                                @HeaderParam("x-b3-sampled") String xsampled,
                                @HeaderParam("x-b3-flags") String xflags,
                                @HeaderParam("x-ot-span-context") String xotspan) {
  if (ratings_enabled) {
    JsonObject ratingsResponse = getRatings(Integer.toString(productId), user, xreq, xtraceid, xspanid, xparentspanid, xsampled, xflags, xotspan);
When you make downstream calls in your applications, make sure to include these headers.
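Since the original question's B-to-C hop goes over Kafka rather than HTTP, the same headers would have to travel as Kafka record headers; the Envoy sidecars generate spans for HTTP traffic, not for the Kafka leg, so carrying the context yourself is what lets service C's work join the same trace. A hypothetical sketch (not from the Istio docs), assuming the kafka-python client and a made-up broker address:

# Hypothetical sketch: service B copies the tracing headers it received
# over HTTP into Kafka record headers for service C to pick up.
from kafka import KafkaProducer

TRACE_HEADERS = [
    'x-request-id', 'x-b3-traceid', 'x-b3-spanid',
    'x-b3-parentspanid', 'x-b3-sampled', 'x-b3-flags', 'x-ot-span-context',
]

producer = KafkaProducer(bootstrap_servers='kafka:9092')  # assumed address

def publish_with_trace(topic, payload, incoming_http_headers):
    # Kafka record headers are (str, bytes) tuples
    kafka_headers = [(name, incoming_http_headers[name].encode('utf-8'))
                     for name in TRACE_HEADERS if name in incoming_http_headers]
    producer.send(topic, value=payload, headers=kafka_headers)

On the consumer side, service C would read the record headers back and attach them to any further outbound HTTP calls.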
I'm attempting to implement distributed tracing via OpenCensus across different services, where each service uses Flask for serving and sends downstream requests using... er... requests to build its reply. The tracing platform is GCP Cloud Trace.
I'm using FlaskMiddleware, which traces the initial call correctly, but there's no propagation of span info between the source and target service, even when the middleware has a propagator defined (I've tried a few):
#middleware = FlaskMiddleware(app, exporter=exporter, sampler=sampler, excludelist_paths=['healthz', 'metrics'], propagator=google_cloud_format.GoogleCloudFormatPropagator())
middleware = FlaskMiddleware(app, exporter=exporter, sampler=sampler, excludelist_paths=['healthz', 'metrics'], propagator=b3_format.B3FormatPropagator())
#middleware = FlaskMiddleware(app, exporter=exporter, sampler=sampler, excludelist_paths=['healthz', 'metrics'], propagator=trace_context_http_header_format.TraceContextPropagator())
I guess the question is: does anyone have an example of setting up tracing that traverses several downstream calls, where each service serves via Flask and makes its downstream requests via requests?
Currently I end up with each service having its own trace with a single span.
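That symptom is consistent with only the inbound side being instrumented: FlaskMiddleware traces incoming requests, but each outgoing requests call has to inject the current context itself, otherwise every service starts a fresh trace. A minimal sketch of the outbound half, assuming the opencensus package's W3C TraceContext propagator (the wrapper name traced_get is made up):

# Minimal sketch: inject the active OpenCensus span context into
# outgoing HTTP calls so the downstream FlaskMiddleware can continue
# the same trace.
import requests
from opencensus.trace import execution_context
from opencensus.trace.propagation import trace_context_http_header_format

propagator = trace_context_http_header_format.TraceContextPropagator()

def traced_get(url, **kwargs):
    tracer = execution_context.get_opencensus_tracer()
    with tracer.span(name='downstream call'):
        # Serialize the current span context into a traceparent header
        headers = propagator.to_headers(tracer.span_context)
        kwargs.setdefault('headers', {}).update(headers)
        return requests.get(url, **kwargs)

Both sides need to agree on the propagator (B3 on both ends, or TraceContext on both ends); mixing formats is another common reason traces fail to join. There is also an OpenCensus requests integration that patches outgoing calls automatically, if your version ships it.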
Cloud Trace and Cloud Logging integrate quite nicely in most cases, as described in https://cloud.google.com/trace/docs/trace-log-integration
Unfortunately, this doesn't seem to include the HTTP request logs generated by a Load Balancer when request logging is enabled.
The LB logs show the traces icon, and are correctly associated with an overall trace in the Cloud Trace system, but the context menu 'show trace details' is greyed out for those log items.
A similar problem arose with my application level logging/tracing, and was solved by setting the traceSampled attribute on the LogEntry, but this can't work for LB logs, since I'm not in control of their generation.
In this instance I'm tracing 100% of requests since the service is M2M and fairly low volume, but in the general case it makes sense that the LB can't know if something is actually generating traces without being told.
I can't find any good references in the docs, but in theory a response header indicating it was sampled could be observed by the LB and cause it to issue the appropriate log.
Any ideas if such a feature exists, in this form or any other?
(Last-ditch workaround might be to use Logs Router to feed LB logs into a pubsub queue (and exclude them from normal logging sinks), and resubmit them to the normal sink(s) with fields appropriately set by some Cloud Function or other pubsub consumer, but that seems like a lot of work and complexity for this purpose)
There is currently a Feature Request created for this; you can follow its status in the following link 1.
As a workaround, you could implement target proxies along with your Load Balancer. According to the documentation for a global external HTTP(S) load balancer:
The proxies set HTTP request/response headers as follows:
Via: 1.1 google (requests and responses)
X-Forwarded-Proto: [http | https] (requests only)
X-Cloud-Trace-Context: <trace-id>/<span-id>;<trace-options> (requests only). Contains parameters for Cloud Trace (a small parsing sketch follows this list).
X-Forwarded-For: [<supplied-value>,]<client-ip>,<load-balancer-ip> (see X-Forwarded-For header) (requests only)
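The X-Cloud-Trace-Context value is plain text, so it is straightforward to pick apart if you need the IDs on the application side; a minimal sketch (the sample header value is made up):

# Minimal sketch: split "<trace-id>/<span-id>;o=<options>" into parts.
def parse_cloud_trace_context(value):
    trace_part, _, options = value.partition(';')
    trace_id, _, span_id = trace_part.partition('/')
    sampled = options == 'o=1'
    return trace_id, span_id, sampled

# Example with a made-up header value:
print(parse_cloud_trace_context('105445aa7843bc8bf206b12000100000/123456;o=1'))
# -> ('105445aa7843bc8bf206b12000100000', '123456', True)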
You can find the complete documentation about external HTTP(S) load balancers and target proxies here 2.
And finally, take a look at the documentation on how to use and configure target proxies here 3.
I am using Istio version 1.3.5. Is there any configuration needed to make istio-proxy log the traceId? I have Jaeger tracing (with the Zipkin protocol) enabled. There is one thing I want to accomplish with traceId logging:
- log correlation across multiple upstream services, so that I can filter all logs by a certain traceId.
According to the Envoy proxy documentation for Envoy v1.12.0, which is used by Istio 1.3:
Trace context propagation
Envoy provides the capability for reporting tracing information regarding communications between services in the mesh. However, to be able to correlate the pieces of tracing information generated by the various proxies within a call flow, the services must propagate certain trace context between the inbound and outbound requests.
Whichever tracing provider is being used, the service should propagate the x-request-id to enable logging across the invoked services to be correlated.
The tracing providers also require additional context, to enable the parent/child relationships between the spans (logical units of work) to be understood. This can be achieved by using the LightStep (via OpenTracing API) or Zipkin tracer directly within the service itself, to extract the trace context from the inbound request and inject it into any subsequent outbound requests. This approach would also enable the service to create additional spans, describing work being done internally within the service, that may be useful when examining the end-to-end trace.
Alternatively the trace context can be manually propagated by the service:
When using the LightStep tracer, Envoy relies on the service to propagate the x-ot-span-context HTTP header while sending HTTP requests to other services.
When using the Zipkin tracer, Envoy relies on the service to propagate the B3 HTTP headers (x-b3-traceid, x-b3-spanid, x-b3-parentspanid, x-b3-sampled, and x-b3-flags). The x-b3-sampled header can also be supplied by an external client to either enable or disable tracing for a particular request. In addition, the single b3 header propagation format is supported, which is a more compressed format.
When using the Datadog tracer, Envoy relies on the service to propagate the Datadog-specific HTTP headers (x-datadog-trace-id, x-datadog-parent-id, x-datadog-sampling-priority).
TL;DR: the application must propagate the B3 HTTP headers manually; Envoy will not forward them between inbound and outbound requests for you.
Additional information: https://github.com/openzipkin/b3-propagation
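For completeness, a minimal sketch of that manual propagation, assuming a Flask-style inbound request object and the requests library for outbound calls (forward_trace_headers is a made-up helper name):

# Minimal sketch: copy the B3 tracing headers from the inbound request
# onto the outbound request so Envoy can stitch the spans together.
import requests

B3_HEADERS = [
    'x-request-id', 'x-b3-traceid', 'x-b3-spanid',
    'x-b3-parentspanid', 'x-b3-sampled', 'x-b3-flags', 'b3',
]

def forward_trace_headers(inbound_headers):
    # Keep only the tracing headers that actually arrived
    return {h: inbound_headers[h] for h in B3_HEADERS if h in inbound_headers}

def call_downstream(request, url):
    return requests.get(url, headers=forward_trace_headers(request.headers))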
I am relatively new to Google Cloud Platform, and I am able to create app services and manage databases. I am attempting to create a handler within Google Cloud Tasks (similar to the NodeJS sample found in this documentation).
However, the documentation does not clearly address how the deployed service is connected to whatever is making the request. Out of necessity I have more than one service in my project (one in Node for managing REST, and another in Python for managing geospatial data as asynchronous tasks).
My question: when running multiple services, how does Google Cloud Tasks know which service to direct a task to?
Screenshot below as proof that I am able to submit tasks to a queue.
When using App Engine routing for your tasks, Cloud Tasks routes them to the "default" service. However, you can override this by defining appEngineRouting (selecting your service, version, and instance) inside the AppEngineHttpRequest field.
The sample shows a task routed to the default service's /log_payload endpoint.
const task = {
  appEngineHttpRequest: {
    httpMethod: 'POST',
    relativeUri: '/log_payload',
  },
};
You can update this to:
const task = {
  appEngineHttpRequest: {
    httpMethod: 'POST',
    relativeUri: '/log_payload',
    appEngineRouting: {
      service: 'non-default-service'
    }
  },
};
Learn more about configuring routes.
I wonder which "services" you are talking about, because it is always the current service. These HTTP requests are dispatched with the HTTP headers HTTP_X_APPENGINE_QUEUENAME and HTTP_X_APPENGINE_TASKNAME, as you can see in the screenshot with sample-tasks and some random numbers. If you want to task other services, those will have to have their own task queue(s).
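To make those dispatch headers concrete (HTTP_X_APPENGINE_QUEUENAME is the CGI-style name of the X-AppEngine-QueueName header), here is a minimal hypothetical handler; Flask is an assumption, and the route matches the /log_payload sample above:

# Minimal sketch: a task handler reading the headers App Engine sets
# when it dispatches a Cloud Tasks request.
from flask import Flask, request

app = Flask(__name__)

@app.route('/log_payload', methods=['POST'])
def log_payload():
    queue = request.headers.get('X-AppEngine-QueueName')
    task = request.headers.get('X-AppEngine-TaskName')
    print(f'Handling task {task} from queue {queue}')
    return 'OK', 200

App Engine strips X-AppEngine-* headers from external requests, so their presence also serves as a sanity check that the call really came from Cloud Tasks.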
Let's say my web service is located at http://localhost:8080/foo/mywebservice and my WSDL is at http://localhost:8080/foo/mywebservice?wsdl.
Is http://localhost:8080/foo/mywebservice the endpoint, i.e., is it the same as the URI of my web service, or is it where the SOAP messages are received and unmarshalled?
Could you please explain to me what it is and what the purpose of it is?
This is a shorter and hopefully clearer answer...
Yes, the endpoint is the URL where your service can be accessed by a client application. The same web service can have multiple endpoints, for example in order to make it available using different protocols.
Updated answer, from Peter in the comments:
This is the old terminology; WSDL 2.0 uses the term "endpoint" directly (WSDL 2.0 renamed "port" to "endpoint").
Maybe you will find an answer in this document: http://www.w3.org/TR/wsdl.html
A WSDL document defines services as collections of network endpoints, or ports. In WSDL, the abstract definition of endpoints and messages is separated from their concrete network deployment or data format bindings. This allows the reuse of abstract definitions: messages, which are abstract descriptions of the data being exchanged, and port types which are abstract collections of operations. The concrete protocol and data format specifications for a particular port type constitutes a reusable binding. A port is defined by associating a network address with a reusable binding, and a collection of ports define a service. Hence, a WSDL document uses the following elements in the definition of network services:
Types – a container for data type definitions using some type system (such as XSD).
Message – an abstract, typed definition of the data being communicated.
Operation – an abstract description of an action supported by the service.
Port Type – an abstract set of operations supported by one or more endpoints.
Binding – a concrete protocol and data format specification for a particular port type.
Port – a single endpoint defined as a combination of a binding and a network address.
Service – a collection of related endpoints.
http://www.ehow.com/info_12212371_definition-service-endpoint.html
The endpoint is a connection point where HTML files or active server pages are exposed. Endpoints provide information needed to address a Web service endpoint. The endpoint provides a reference or specification that is used to define a group or family of message addressing properties and give end-to-end message characteristics, such as references for the source and destination of endpoints, and the identity of messages to allow for uniform addressing of "independent" messages. The endpoint can be a PC, PDA, or point-of-sale terminal.
A web service endpoint is the URL that another program would use to communicate with your program. To see the WSDL you add ?wsdl to the web service endpoint URL.
Web services are for program-to-program interaction, while web pages are for program-to-human interaction.
So:
Endpoint is: http://www.blah.com/myproject/webservice/webmethod
Therefore,
WSDL is: http://www.blah.com/myproject/webservice/webmethod?wsdl
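To make the relationship concrete, a trivial sketch (the URL is the made-up example above):

import requests

endpoint = 'http://www.blah.com/myproject/webservice/webmethod'
response = requests.get(endpoint + '?wsdl')  # fetch the service description
print(response.text[:200])  # beginning of the WSDL XML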
To expand further on the elements of a WSDL, I always find it helpful to compare them to code:
A WSDL has 2 portions (abstract & physical).
Abstract Portion:
Definitions - variables - ex: myVar, x, y, etc.
Types - data types - ex: int, double, String, myObjectType
Operations - methods/functions - ex: myMethod(), myFunction(), etc.
Messages - method/function input parameters & return types
ex: public myObjectType myMethod(String myVar)
Porttypes - classes (i.e. they are a container for operations) - ex: MyClass{}, etc.
Physical Portion:
Binding - these connect to the porttypes and define the chosen protocol for communicating with this web service (a protocol is a form of communication: text/SMS vs. phone vs. email, etc.).
Service - this lists the address where another program can find your web service (i.e. your endpoint).
Note that this matches the W3C definition quoted above: types, messages, operations, and port types form the abstract portion, while bindings and services form the concrete (physical) portion.
In past projects I worked on, the endpoint was a relative property. That is to say it may or may not have been appended to, but it always contained the protocol://host:port/partOfThePath.
If the service being called had a dynamic part to it, for example a ?param=dynamicValue, then that part would get added to the endpoint. But many times the endpoint could be used as is without having to be amended.
What's important to understand is what an endpoint is not, and how it helps. For example, an alternative way to pass the information stored in an endpoint would be to store the different parts of the endpoint in separate properties. For example:
hostForServiceA=someIp
portForServiceA=8080
pathForServiceA=/some/service/path
hostForServiceB=someIp
portForServiceB=8080
pathForServiceB=/some/service/path
Or if the same host and port across multiple services:
host=someIp
port=8080
pathForServiceA=/some/service/path
pathForServiceB=/some/service/path
In those cases the full URL would need to be constructed in your code as such:
String url = "http://" + host + ":" + port + pathForServiceA + "?" + dynamicParam + "=" + dynamicValue;
In contrast, this can be stored as an endpoint as such:
serviceAEndpoint=http://host:port/some/service/path?dynamicParam=
And yes, many times we stored the endpoint up to and including the '='. This led to code like this:
String url = serviceAEndpoint + dynamicValue;
Hope that sheds some light.
Simply put, an endpoint is one end of a communication channel. When an API interacts with another system, the touch-points of this communication are considered endpoints. For APIs, an endpoint can include a URL of a server or service. Each endpoint is the location from which APIs can access the resources they need to carry out their function.
APIs work using ‘requests’ and ‘responses.’ When an API requests information from a web application or web server, it will receive a response. The place where APIs send requests and where the resource lives is called an endpoint.
Reference:
https://smartbear.com/learn/performance-monitoring/api-endpoints/
An endpoint is specified as a relative or absolute URL that usually results in a response. That response is usually the result of a server-side process that could, for instance, produce a JSON string. That string can then be consumed by the application that made the call to the endpoint. So, in general, endpoints are predefined access points used within TCP/IP networks to initiate a process and/or return a response. Endpoints may contain parameters passed within the URL as key-value pairs, with multiple pairs separated by an ampersand, allowing the endpoint to call, for example, an update/insert process. So endpoints don't always need to return a response, but a response is always useful, even if only to indicate the success or failure of an operation.
An endpoint is the URL for a web service.
The Simple Object Access Protocol (SOAP) endpoint is a URL. It identifies the location on the built-in HTTP service where the web services listener listens for incoming requests.
Reference: https://www.ibm.com/support/knowledgecenter/SSSHYH_7.1.0.4/com.ibm.netcoolimpact.doc/dsa/imdsa_web_netcool_impact_soap_endpoint_c.html