How to get the data for the attribute vocabulary in Istio

According to https://istio.io/v1.7/docs/reference/config/policy-and-telemetry/mixer-overview/#attributes,
a given Istio deployment has a fixed vocabulary of attributes that it
understands. The specific vocabulary is determined by the set of
attribute producers being used in the deployment. The primary
attribute producer in Istio is Envoy, although specialized Mixer
adapters can also generate attributes.
I'd like to know how Istio gets this data (the attribute vocabulary) from Envoy (or a Mixer adapter), and how Envoy exports this data in detail.
I'm asking because I want to develop a WASM plugin for logging, and I need to define custom log data equivalent to the logging available with Telemetry V1 in Istio...

As for how Istio gets this data (the attribute vocabulary) from Envoy (or a Mixer adapter) and how Envoy exports it in detail:
According to the Banzai Cloud documentation:
Because Istio Telemetry V2 lacks a central component (Mixer) with access to K8s metadata, the proxies themselves require the metadata necessary to provide rich metrics. Additionally, features provided by Mixer had to be added to the Envoy proxies to replace the Mixer-based telemetry. Istio Telemetry V2 uses two custom Envoy plugins to achieve just that.
In-proxy service-level metrics in Telemetry V2 are provided by two custom plugins, metadata-exchange and stats.
By default, in Istio 1.5, Telemetry V2 is enabled as filters compiled into the Istio proxy, mainly for performance reasons. The same filters are also compiled to WebAssembly (WASM) modules and shipped with the Istio proxy.
You can find more useful information in the documentation above.
As for wanting to develop a WASM plugin for logging and define custom log data equivalent to Telemetry V1 logging in Istio:
I've found this documentation on the Envoy site; it lists all the attributes you might use.
There is an example of how to get these values via the Wasm ABI.
Path expressions allow access to inner fields in structured attributes via a sequence of field names, map, and list indexes following an attribute name. For example, get_property({"node", "id"}) in Wasm ABI extracts the value of id field in node message attribute, while get_property({"request", "headers", "my-header"}) refers to the comma-concatenated value of a particular request header.

Related

How to query Istio metadata exchange data in the Envoy access log?

I know that with the latest Istio version, the metadata-exchange extension is integrated into the Envoy proxy, so that when pods communicate with each other, Envoy can use the "x-envoy-peer-metadata" header to exchange pod metadata and save it in the metadata-exchange extension cache.
So my questions are:
Is there a way that I can list the existing data in this metadata-exchange extension cache?
I know the Envoy proxy supports DYNAMIC_METADATA in its access log format; is it related to the metadata-exchange extension?
Finally, how can I query the metadata in the extension and log it in the Envoy access log? For example, besides DOWNSTREAM_REMOTE_ADDRESS, I also wish to output the downstream pod name.

Unable to delete a custom plugin from a Data Fusion instance

I tried uploading a custom JAR as a CDAP plugin and it has a few errors in it. I want to delete that particular plugin and upload a new one. What is the process for that? I tried looking for documentation, but it was not very informative.
Thanks in advance!
You can click on the hamburger menu, then click Control Center at the bottom of the left panel. In the Control Center, click Filter by and select the checkbox for Artifacts. After that, you should see the artifact listed in the Control Center, where you can then delete it.
Alternatively, we suggest that while developing, the version of the artifact be suffixed with -SNAPSHOT (i.e. 1.0.0-SNAPSHOT). Any -SNAPSHOT version can be overwritten simply by re-uploading. This way, you don't have to delete first before deploying a patched plugin JAR.
Each Data Fusion instance actually runs in a GCP tenant project inside a fully isolated area, with all orchestration actions, pipeline lifecycle management tasks and coordination handled as part of GCP-managed scenarios. You can therefore perform user-defined actions either within the dedicated Data Fusion UI or against the execution environment via CDAP REST API HTTP calls.
The purpose of the Data Fusion UI is to provide a visual way to design data pipelines and control ETL data processing through the different phases of execution; you can do the same by calling the corresponding CDAP API directly.
Looking into the original CDAP documentation, you can find the Artifact HTTP RESTful API, which offers a set of HTTP methods you can use to manage custom plugin operations.
Referencing the GCP documentation, there are a few simple steps to prepare the environment, supplying the CDAP_ENDPOINT variable for the target Data Fusion instance so that you can invoke API functions with HTTP calls against the CDAP endpoint, i.e.:
# Target Data Fusion instance
export INSTANCE_ID=your-instance-id
# Resolve the CDAP API endpoint for the instance
export CDAP_ENDPOINT=$(gcloud beta data-fusion instances describe \
    --location=us-central1 \
    --format="value(apiEndpoint)" \
    ${INSTANCE_ID})
# Access token used in the Authorization header of CDAP REST calls
export AUTH_TOKEN=$(gcloud auth print-access-token)
When you are done with the above steps, you can issue the particular HTTP call for the action you need.
For plugin deletion, try the following HTTP DELETE request:
curl -X DELETE -H "Authorization: Bearer ${AUTH_TOKEN}" "${CDAP_ENDPOINT}/v3/namespaces/system/artifacts/<artifact-name>/versions/<artifact-version>"
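If you would rather script this than use curl, here is a minimal Python sketch that performs the same DELETE call. It assumes the requests and google-auth packages; the endpoint, artifact name and version below are placeholders you would replace with your own values.

import google.auth
import google.auth.transport.requests
import requests

# Placeholders: take CDAP_ENDPOINT from the gcloud command above,
# and fill in your artifact name and version.
CDAP_ENDPOINT = "https://<your-cdap-endpoint>/api"
ARTIFACT_NAME = "my-plugin"
ARTIFACT_VERSION = "1.0.0-SNAPSHOT"

# Obtain an access token via Application Default Credentials
credentials, _ = google.auth.default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# Same path as the curl example: delete one version of the artifact
url = f"{CDAP_ENDPOINT}/v3/namespaces/system/artifacts/{ARTIFACT_NAME}/versions/{ARTIFACT_VERSION}"
response = requests.delete(url, headers={"Authorization": f"Bearer {credentials.token}"})
response.raise_for_status()
print("Deleted", ARTIFACT_NAME, ARTIFACT_VERSION)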

How to use Google Compute Python API to create custom machine type or instance with GPU?

I am just looking into using GCP for cloud computing. So far I have been using AWS and the boto3 library, and I was trying to use the Google Python client API for launching instances.
So an example I came across was from their docs here. The instance machine type is specified as:
machine_type = "zones/%s/machineTypes/n1-standard-1" % zone
and then it is passed to the configuration as:
config = {
'name': name,
'machineType': machine_type,
....
I wonder how one goes about specifying machines with GPUs, custom RAM, processors, etc. from the Python API?
The Python API is basically a wrapper around the REST API, so in the example code you are using, the config object is being built using the same schema as would be passed in the insert request.
Reading that document shows that the guestAccelerators structure is the relevant one for GPUs.
Custom RAM and CPUs are more interesting. There is a format for specifying a custom machine type name (you can see it in the gcloud documentation for creating a machine type). The format is:
[GENERATION]custom-[NUMBER_OF_CPUs]-[RAM_IN_MB]
Generation refers to the "n1" or "n2" in the predefined names. For n1, this block is empty; for n2, the prefix is "n2-". That said, experimenting with gcloud seems to indicate that "n1-" as a prefix also works as you would expect.
So, for a 1-CPU n1 machine with 5 GB of RAM, it would be custom-1-5120. This is what you would replace the n1-standard-1 in your example with.
You are, of course, subject to the limits of how to specify a custom machine such as the fact that RAM must be a multiple of 256MB.
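Putting the pieces together, here is a minimal sketch of how that might look with the Python client library (googleapiclient). The project, zone, instance name and accelerator type are placeholders, and the config is deliberately incomplete (disks, network interfaces, etc. are as in the original example):

from googleapiclient import discovery

project = "my-project"    # placeholder
zone = "us-central1-a"    # placeholder
name = "custom-gpu-vm"    # placeholder

compute = discovery.build("compute", "v1")

config = {
    "name": name,
    # Custom machine type: 1 vCPU and 5 GB (5120 MB) of RAM
    "machineType": "zones/%s/machineTypes/custom-1-5120" % zone,
    # One GPU; the accelerator type name is a placeholder, list the real ones
    # with `gcloud compute accelerator-types list`
    "guestAccelerators": [{
        "acceleratorType": "zones/%s/acceleratorTypes/nvidia-tesla-t4" % zone,
        "acceleratorCount": 1,
    }],
    # GPU instances cannot live-migrate, so host maintenance must terminate them
    "scheduling": {"onHostMaintenance": "TERMINATE", "automaticRestart": True},
    # ... disks, networkInterfaces, etc. as in the original example
}

operation = compute.instances().insert(project=project, zone=zone, body=config).execute()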
Finally, there's a neat little feature at the bottom of the console "create instance" page: clicking on the relevant link will show you the exact REST object you need to create the machine you have defined in the console at that very moment, so it can be very useful for seeing how a particular parameter is used.
You can create a Compute Engine instance using the Compute Engine API. Specifically, we can use the insert API request. This accepts a JSON payload in a REST request that describes the VM instance you desire. A full specification of the request is found in the docs. It includes:
machineType - specs of different (common) machines including CPUs and memory
disks - specs of disks to be added including size and type
guestAccelerators - specs for GPUs to add
many more options ...
One can also create a template description of the machine structure you want and simplify the creation of an instance by naming the template to use, thereby abstracting the configuration details out of code and into configuration.
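As a rough sketch of that approach (assuming an existing instance template called my-template; the project and zone are placeholders), the insert call can reference the template and only override what you need:

from googleapiclient import discovery

compute = discovery.build("compute", "v1")

# Create an instance from an existing instance template, overriding only the name
operation = compute.instances().insert(
    project="my-project",      # placeholder
    zone="us-central1-a",      # placeholder
    body={"name": "vm-from-template"},
    sourceInstanceTemplate="global/instanceTemplates/my-template",
).execute()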
Beyond using REST requests (which can be issued from Python), you also have the capability to create Compute Engine instances from:
GCP Console - web interface
gcloud - command line (which I suspect can also be driven from within Python)
Deployment Manager - configuration driven deployment which includes Python as a template language
Terraform - popular environment for creating Infrastructure as Code environments

GKE Stackdriver Trace Reporting by Cluster, by Environment, by Service, by Service Version

We have multiple Spring Boot and Python apps running on top of GKE, and for the Spring Boot applications I am using spring-cloud-gcp-starter-trace to log traces to Stackdriver so that I can debug those traces via the Stackdriver UI.
I am not able to figure out how to add labels like service_name, service_version and cluster_name so that I can filter only those traces for reporting purposes. Right now we have Istio configured on one cluster, and even with a one-percent sampling rate it generates tons of telemetry data; with no filters available (or perhaps I am missing some configuration), the trace UI has become almost useless for me.
I had a look at the documentation for spring-cloud-gcp-starter-trace; it doesn't have any properties through which I can set these fields. I am setting the app name and app version via the metadata tags of the Kubernetes deployment template, but they aren't getting picked up.
Can someone please let me know how I can achieve this?
You can add custom tags using the brave.SpanCustomizer. Just autowire it in as the bean already exists in the application context.
You can then add tags like this:
@Autowired
SpanCustomizer spanCustomizer;  // bean already present in the application context
...
// anywhere a span is active:
spanCustomizer.tag("my-tag", "my tag value");
These will turn into labels on your traces in Stackdriver Trace, on which you can search.
If you're using OpenCensus, you can use annotations to pass metadata into the Trace backend:
https://cloud.google.com/trace/docs/setup/java#custom_spans.
However, I don't see anything in the spring-cloud-gcp-starter-trace documentation (what little I could find) regarding annotations.
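Since the question also mentions Python apps: with the OpenCensus Python libraries (the package names opencensus and opencensus-ext-stackdriver are assumptions here; adjust to whatever tracing library you actually use), a minimal sketch of adding searchable attributes to a span exported to Stackdriver Trace would look roughly like this. The attribute values are placeholders you would populate from your deployment metadata or environment variables.

from opencensus.ext.stackdriver.trace_exporter import StackdriverExporter
from opencensus.trace.samplers import ProbabilitySampler
from opencensus.trace.tracer import Tracer

# Export to Stackdriver Trace with a 1% sampling rate, as in the question
tracer = Tracer(
    exporter=StackdriverExporter(),
    sampler=ProbabilitySampler(rate=0.01),
)

with tracer.span(name="handle-request") as span:
    # These become labels on the span, which you can filter on in the Trace UI;
    # the values are placeholders (e.g. read them from environment variables)
    span.add_attribute("service_name", "my-python-service")
    span.add_attribute("service_version", "1.2.3")
    span.add_attribute("cluster_name", "my-gke-cluster")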

Kubernetes: how to properly change apiserver runtime settings

I'm using kube-aws to run a Kubernetes cluster on AWS, and everything works as expected.
Now, I realize that cron jobs aren't turned on in the version I'm using (v1.7.10_coreos.0), while the documentation for Kubernetes only states the following:
For previous versions of cluster (< 1.8) you need to explicitly enable batch/v2alpha1 API by passing --runtime-config=batch/v2alpha1=true to the API server (see Turn on or off an API version for your cluster for more).
And the documentation that text points to only states this (this is the actual, full documentation):
Specific API versions can be turned on or off by passing --runtime-config=api/ flag while bringing up the API server. For example: to turn off v1 API, pass --runtime-config=api/v1=false. runtime-config also supports 2 special keys: api/all and api/legacy to control all and legacy APIs respectively. For example, for turning off all API versions except v1, pass --runtime-config=api/all=false,api/v1=true. For the purposes of these flags, legacy APIs are those APIs which have been explicitly deprecated (e.g. v1beta3).
I have been unsuccessful in finding information about how to change the configuration of a running cluster, and I, of course, don't want to try to re-run the command on the apiserver manually.
Note that kube-aws still uses hyperkube, not kubeadm. Also, the /etc/kubernetes/manifests directory only contains the ssl directory.
The setting I want to apply is this: --runtime-config=batch/v2alpha1=true
What is the proper way, preferably using kubectl, to apply this setting and have the apiservers restarted?
Thanks.
batch/v2alpha1=true is set by default in kube-aws. You can find it here.