Display slice length in kubectl custom columns output - kubectl

Let's say I want to list pods, and show their name and the number of containers they're running. If I just wanted the image tags themselves, I could do something like
λ kubectl get pods --output custom-columns='NAME:.metadata.namespace,IMAGES:.spec.containers[*].image'
NAME IMAGES
prometheus-system quay.io/prometheus/prometheus:v2.21.0,quay.io/prometheus-operator/prometheus-config-reloader:v0.42.1,jimmidyson/configmap-reload:v0.4.0
prometheus-system quay.io/prometheus-operator/prometheus-operator:v0.42.1
But how do I make it display just the number of containers? In other words, what do I put for the selector to get the length of the slice, to give me output like this instead?
λ kubectl get pods --output custom-columns='NAME:.metadata.namespace,CONTAINERS:<what goes here?>'
NAME CONTAINERS
prometheus-system 3
prometheus-system 1
(Eventually, I want to put this on a CRD to display the length of a list in its default output, but I figure this use case is more reproducible, and therefore easier to relate to. IIUC - but please correct me if I'm wrong! - a solution that works for this question will also work for the display columns of a CRD...)
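As far as I know, the JSONPath subset used by custom-columns has no length function, so one possible workaround (a sketch, not necessarily the only approach) is to switch to Go-template output, where the built-in len works on a slice:
λ kubectl get pods --output go-template='{{range .items}}{{.metadata.namespace}}{{"\t"}}{{len .spec.containers}}{{"\n"}}{{end}}'
This prints one line per pod with the namespace and the container count, though it bypasses custom-columns (and its column headers) entirely.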

Related

AWS GroundTruth text labeling - hide columns in the data, and checking quality of answers

I am new to SageMaker. I have a large csv dataset which I would like labelled:
sentence_id    sentence            pre_agreed_label
148392         A sentence          0
383294         Another sentence    1
For each sentence, I would like a) a yes/no binary classification in response to a question, and b) on a scale of 1-3, how obvious the classification was. I need the sentence id to map to other parts of the dataset, and will use the pre-agreed labels to assess accuracy.
I have identified SageMaker GroundTruth labelling jobs as a possible way to do this. Is this the best way? In trying to set it up I have run into a few problems.
The first problem is I can't find a way to display only the sentence column to the labellers, hiding the sentence_id and pre_agreed_labels.
The second is that there is either single labelling or multi labelling, but I would like a way to have two sets of single-selection labels:
Select one for binary classification:
Yes
No
Select one for difficulty of classification:
Easy
Medium
Hard
It seems as though this can be done using custom HTML, but I don't know how to do this - the template it gives you doesn't even render.
Finally, having not used mechanical turk before, are there ways of ensuring people take the work seriously and don't just select random answers? I can see there's an option to have x number of people answer the same question, but is there also a way to put in an obvious question to which we already have a 'pre_agreed_label' every nth question, and kick people off the task if they get it wrong? There also appears to be a maximum of $1.20 per task which seems odd.
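For what it's worth, here is a rough sketch of a custom worker template using the Crowd HTML Elements; the question text and the name attributes below are placeholders, not a tested template. If the input manifest only carries the sentence in its source field, workers never see sentence_id or pre_agreed_label, which would also address the first problem:
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <!-- The sentence from the input manifest's "source" field -->
  <p>{{ task.input.source }}</p>

  <p>Your yes/no question here (select one):</p>
  <crowd-radio-group>
    <crowd-radio-button name="label_yes">Yes</crowd-radio-button>
    <crowd-radio-button name="label_no">No</crowd-radio-button>
  </crowd-radio-group>

  <p>How obvious was the classification? (select one)</p>
  <crowd-radio-group>
    <crowd-radio-button name="difficulty_easy">Easy</crowd-radio-button>
    <crowd-radio-button name="difficulty_medium">Medium</crowd-radio-button>
    <crowd-radio-button name="difficulty_hard">Hard</crowd-radio-button>
  </crowd-radio-group>
</crowd-form>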

What gcloud command can be used to obtain a list of default compute engine service accounts across your organisation?

I have tried this command:
gcloud alpha scc assets list <ORGANISATION-ID> --filter "security_center_properties.resource.type="google.iam.ServiceAccount" AND resource_properties.name:\"Compute Engine default service account\""
but I am receiving the following error:
(gcloud.alpha.scc.assets.list) INVALID_ARGUMENT: Invalid filter.
When I remove the filter after AND, I don't get an error message, but I just see a >.
Any ideas where I am going wrong?
I have reviewed this documentation to support me building the command but not sure which is the right filter to use.
I wonder if I should be filtering on the email of a Compute Engine default service account, which ends in "-compute@developer.gserviceaccount.com", but I can't identify the right filter for this.
The problem is the use of " inside the filter.
You need to type --filter and then supply the filter like this: "FILTER_EXPRESSION".
One filter expression could be: security_center_properties.resource_type="google.compute.Instance"
But you cannot put a double quote inside a double-quoted block, so you need to escape it with a backslash (\); otherwise, the shell interprets the first inner double quote as the end of the filter.
On the other hand, if you delete part of the command, the prompt shows you '>' because there is a double-quoted block that has not been closed, and the shell is waiting for you to finish the command.
So the filter that you want has to be like this, for example:
gcloud alpha scc assets list <ORGANIZATION ID> \
--filter "security_center_properties.resource_type=\"google.compute.Instance\" AND security_center_properties.resource_type=\"google.cloud.resourcemanager.Organization\""
I hope that this explanation could help you!
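As a side note, another way to avoid the backslash escaping (assuming a Unix-like shell) is to single-quote the whole filter, so the inner double quotes pass through literally. Keeping the field names from the question, that would look something like:
gcloud alpha scc assets list <ORGANISATION-ID> \
--filter 'security_center_properties.resource_type="google.iam.ServiceAccount" AND resource_properties.name:"Compute Engine default service account"'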

Terraform output from 2 strings to list

In Terraform I have two data sources:
data "aws_instances" "daas_resolver_ip_1" {
instance_tags = {
Name = "${var.env_type}.${var.environment}.ns1.${var.aws_region}.a."
}
}
data "aws_instances" "daas_resolver_ip_2" {
instance_tags = {
Name = "${var.env_type}.${var.environment}.ns2.${var.aws_region}.b."
}
}
I want to get the private_ip from each of those, combine them into a list, and use it as follows:
dhcp_options_domain_name_servers = ["${data.aws_instances.daas_resolver_ip_1.private_ip}", "${data.aws_instances.daas_resolver_ip_2.private_ip}"]
How can I achieve this? At the moment this is the error I get:
Error: module.pmc_environment.module.pmc_vpc.aws_vpc_dhcp_options.vpc: domain_name_servers: should be a list
I believe what you've encountered here is a common limitation of Terraform 0.11. If this is a new configuration then starting with Terraform 0.12 should avoid the problem entirely, as this limitation was addressed in the Terraform 0.12 major release.
The underlying problem here is that the private_ip values of at least one of these resources is unknown during planning (it will be selected by the remote system during apply) but then Terraform 0.11's type checker is failing because it cannot prove that these unknown values will eventually produce a list of strings as the dhcp_options_domain_name_servers requires.
Terraform 0.12 addresses this by tracking type information for unknown values and propagating types through expressions so that e.g. in this case it could know that the result is a list of two strings but the strings themselves are not known yet. From Terraform 0.11's perspective, this is just an unknown value with no type information at all, and is therefore not considered to be a list, causing this error message.
A workaround for Terraform 0.11 is to use the -target argument to ask Terraform to deal with the operations it needs to learn the private_ip values first, and then run Terraform again as normal once those values are known:
terraform apply -target=module.pmc_environment.module.pmc_vpc.data.aws_instances.daas_resolver_ip_1 -target=module.pmc_environment.module.pmc_vpc.data.aws_instances.daas_resolver_ip_2
terraform apply
The first terraform apply with -target set should deal with the two data resources, and then the subsequent terraform apply with no arguments should then be able to see what the two IP addresses are.
This will work only if all of the values contributing to the data resource configurations remain stable after the initial creation step. You'd need to repeat this two-step process on subsequent changes if any of var.env_type, var.environment, or var.aws_region become unknown as a result of other planned actions.
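For reference, the same assignment in Terraform 0.12 syntax would be a plain list expression (keeping the attribute references from the question; depending on your provider version, the aws_instances data source may instead expose a private_ips list, which you could use directly):
dhcp_options_domain_name_servers = [
  data.aws_instances.daas_resolver_ip_1.private_ip,
  data.aws_instances.daas_resolver_ip_2.private_ip,
]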

How to report a list in BehaviorSpace NetLogo?

I am running a NetLogo model in BehaviorSpace, varying the number of runs each time. I have a turtle breed pigs, and each pig accumulates a table with patch types as keys and the number of visits to each patch type as values.
In the end I calculate a list of the mean number of visits over all pigs. The list always has the same length, as long as the original table has the same number of keys (number of patch types). I would like to export this mean number of visits to each patch type with BehaviorSpace.
Perhaps I could write a separate csv file (I tried this - it creates many files, so there's a lot of work later putting them together). But I would rather have everything in the same output file after a run.
I could make a global variable for each patch type, but this seems crude and wrong, especially if I load a different patch configuration.
I tried just exporting the list, but then in Excel I see it with brackets, e.g. [49 0 31.5 76 7 0].
So my question Q1: is there a proper way to export a list of values so that in BehaviorSpace table output csv there is a column for each value?
Q2: Or perhaps there is an example of how to output a single csv that looks exactly as I want it from BehaviorSpace?
PS: In my case the patch types are costs. And I might change those in the future and rerun everything. Ideally I would like to have as output: a graph of costs vs frequency of visits.
Thanks
If the lists are a fixed length that doesn't vary from run to run, you can get the items into separate columns by using one metric for each item. So in your BehaviorSpace experiment definition, instead of putting mylist, put item 0 mylist and item 1 mylist and so on.
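For example, with the six patch types from the [49 0 31.5 76 7 0] example and a (hypothetical) global named mean-visits-list, the reporters in the experiment definition would be listed one per line:
item 0 mean-visits-list
item 1 mean-visits-list
item 2 mean-visits-list
item 3 mean-visits-list
item 4 mean-visits-list
item 5 mean-visits-list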
If the lists aren't always the same length, you're out of luck. BehaviorSpace isn't flexible that way. You would have to write a separate program (in the programming language of your choice, perhaps NetLogo itself, perhaps an Excel macro, perhaps something else) to postprocess the BehaviorSpace output and make it look how you want.

OPTICSXi - ELKI ResultWriter

I'm using ELKI to cluster, in a hierarchical way, a dataset of geolocations using OPTICSXi.
The result of the execution of the algorithm is a set of files.
The content of a file could be:
# Cluster: nameOfCluster
# OPTICSModel
# Parents: nameOfParents (this element doesn't exist for the root cluster)
# Children: nameOfChild_0, nameOfChild_1 ... nameOfChild_n, (optional)
ID=1 lat0 lon0 reachability=?
ID=3062 lat1 lon1 reachability=1.30972586 predecessor=1
ID=7383 lat2 lon2 reachability=2.56784445 predecessor=3062
ID=42839 lat3 lon3 reachability=4.05510623 predecessor=1
I don't understand whether the elements in each file (in the example there are four elements) belong to the same cluster or could belong to different clusters. In the latter case, do I need to write some code that builds the clusters (for example, by looking at the predecessor of each node), or are there some parameters I could specify in ELKI to obtain each single cluster?
By default, ELKI will produce a directory with one file per cluster. Unless the output file already exists, in which case you will get all the clusters written into the same file, separated with comments as seen above.
With a hierarchical result, such as that of OPTICSXi, you should however also treat all members of the child clusters as part of the parent. These are clusters nested inside the parent. They are not repeated in the parent, to reduce redundancy in the output.
Compare the output of OPTICSXi to OPTICS output. What the Xi approach does, is split the data for you, based on sudden drops in reachability-distance. All clusters of Xi should be subsequences of the original OPTICS cluster order.
In your case, you may have chosen minPts too small, if your cluster has just 4 elements. (Although, you may have truncated the file, or you may have a lot of elements in child clusters; so the output may be fine).
Also note that you will usually want to validate whether you want the first element(s) of your cluster to belong to the cluster or not; similarly the last elements. OPTICSXi tends to err on the first elements, but not in a systematic way that would be trivial to fix. The first and last elements are those that bridge the gap from one cluster to another. You really should verify these manually (which is a good reason to not choose minPts too small).
I strongly recommend to build/use a visualization for your specific use case. Then you could just load such a cluster into your visualization and visually inspect if the result makes sense to you. I have used OPTICSXi on geographic data, and that worked very well for me.
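If you do end up assembling the full memberships yourself, here is a rough Python sketch. It only parses the ResultWriter text layout shown above (one file per cluster) and assumes the names on the "# Children:" line match the "# Cluster:" names of the other files; file names and extensions may differ in your setup:
import os

def parse_cluster_file(path):
    # Returns (cluster name, list of child cluster names, list of point IDs).
    name, children, ids = None, [], []
    with open(path) as f:
        for line in f:
            if line.startswith("# Cluster:"):
                name = line.split(":", 1)[1].strip()
            elif line.startswith("# Children:"):
                children = [c.strip() for c in line.split(":", 1)[1].split(",") if c.strip()]
            elif line.startswith("ID="):
                ids.append(int(line.split()[0].split("=", 1)[1]))
    return name, children, ids

def full_membership(directory):
    # Collect each cluster's own IDs plus the IDs of all (transitive) child clusters.
    clusters = {}
    for fn in os.listdir(directory):
        path = os.path.join(directory, fn)
        if not os.path.isfile(path):
            continue
        name, children, ids = parse_cluster_file(path)
        if name is not None:
            clusters[name] = (children, ids)

    def collect(name):
        children, ids = clusters[name]
        members = list(ids)
        for child in children:
            if child in clusters:
                members.extend(collect(child))
        return members

    return {name: collect(name) for name in clusters}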
So, if I've understood correctly, in the example above the cluster is composed of the elements
ID=1, ID=3062, ID=7383, ID=42839, and all the elements in nameOfChild_0, nameOfChild_1 ... nameOfChild_n.
Maybe I don't have to join the children into the root element, because I guess I'd obtain one big cluster containing all my geo-locations; in fact I have 903 child elements and 18795 nodes (IDs).
I've done a lot of tests, choosing minPts = {2, 5, 10} and xi = {0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001}. I use a visualization of my clusters, but I can't find a good result. I'm having a lot of trouble.
Thanks to your reply I've understood that I was splitting my elements too much, in the sense that I treated each file as its own cluster, and for this reason I didn't consider the child elements as part of the parent but as separate clusters.
Moreover, I noticed that the first and the last elements are sometimes wrong. I thought of verifying whether these elements are the predecessor of at least one element in the cluster, or whether at least one element in the cluster is a predecessor of them. Does this make sense?