Vertex Pipeline Metric values not being added to metrics artifact? - google-cloud-ml

We are trying to return some metrics from our Vertex Pipeline, such that they are visible in the Run Comparison and Metadata tools in the Vertex UI.
I saw here that we can use the Output[Metrics] output type and the subsequent metrics.log_metric("metric_name", metric_val) method to add metrics, and from the available documentation it seemed that this would be enough.
We want to use the reusable component method, as opposed to the Python function-based components that the example is built around. So we implemented it within our component code like so:
We added the output in the component.yaml:
outputs:
- name: metrics
  type: Metrics
  description: evaluation metrics path
then added the output to the command in the implementation:
command: [
  python3, main.py,
  --gcs-test-data-path, {inputValue: gcs_test_data_path},
  --gcs-model-path, {inputValue: gcs_model_path},
  --gcs-output-bucket-id, {inputValue: gcs_output_bucket_id},
  --project-id, {inputValue: project_id},
  --timestamp, {inputValue: timestamp},
  --batch-size, {inputValue: batch_size},
  --img-height, {inputValue: img_height},
  --img-width, {inputValue: img_width},
  --img-depth, {inputValue: img_depth},
  --metrics, {outputPath: metrics},
]
Next, in the component's main Python script, we parse this argument with argparse:
PARSER.add_argument('--metrics',
                    type=Metrics,
                    required=False,
                    help='evaluation metrics output')
and pass it to the component's main function:
if __name__ == '__main__':
    ARGS = PARSER.parse_args()
    evaluation(gcs_test_data_path=ARGS.gcs_test_data_path,
               gcs_model_path=ARGS.gcs_model_path,
               gcs_output_bucket_id=ARGS.gcs_output_bucket_id,
               project_id=ARGS.project_id,
               timestamp=ARGS.timestamp,
               batch_size=ARGS.batch_size,
               img_height=ARGS.img_height,
               img_width=ARGS.img_width,
               img_depth=ARGS.img_depth,
               metrics=ARGS.metrics,
               )
In the declaration of the component function, we then typed this metrics parameter as Output[Metrics]:
from kfp.v2.dsl import Output, Metrics

def evaluation(gcs_test_data_path: str,
               gcs_model_path: str,
               gcs_output_bucket_id: str,
               metrics: Output[Metrics],
               project_id: str,
               timestamp: str,
               batch_size: int,
               img_height: int,
               img_width: int,
               img_depth: int):
Finally, we call the log_metric method within this evaluation function:
metrics.log_metric('accuracy', acc)
metrics.log_metric('precision', prec)
metrics.log_metric('recall', recall)
metrics.log_metric('f1-score', f_1)
When we run this pipeline, we can see this metric artifact materialised in the DAG:
And Metrics Artifacts are listed in the Metadata UI in Vertex:
However, clicking through to view the artifact's JSON, there is no metadata listed:
In addition, no metadata is visible when comparing runs in the pipeline UI:
Finally, navigating to the object's URI in GCS, we are met with 'Requested entity was not found.', which I assume indicates that nothing was written to GCS:
Are we doing something wrong with this implementation of metrics in the reusable components? From what I can tell, this all seems right to me, but it's hard to tell given that the docs at this point seem to focus primarily on examples with Python function-based components.
Do we perhaps need to proactively write this Metrics object to an OutputPath?
Any help is appreciated.
----- UPDATE ----
I have since been able to get the artifact metadata and URI to update. In the end we used the KFP SDK to generate a YAML file based on a @component-decorated Python function, and then adapted this format for our reusable components.
Our component.yaml now looks like this:
name: predict
description: Prepare and create predictions request
implementation:
  container:
    args:
    - --executor_input
    - executorInput: null
    - --function_to_execute
    - predict
    command:
    - python3
    - -m
    - kfp.v2.components.executor_main
    - --component_module_path
    - predict.py
    image: gcr.io/PROJECT_ID/kfp/components/predict:latest
inputs:
- name: input_1
  type: String
- name: input_2
  type: String
outputs:
- name: output_1
  type: Dataset
- name: output_2
  type: Dataset
With this change to the YAML, we can now successfully update the artifact's metadata dictionary, and its URI through artifact.path = '/path/to/file'. These updates are displayed in the Vertex UI.
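For reference, a minimal sketch of what the predict.py module referenced by this spec could look like under the kfp.v2 executor pattern; the function and parameter names mirror the component spec, while the payload and metadata values are purely illustrative:

# predict.py - sketch only; names mirror the component spec above, values are hypothetical.
import os
from kfp.v2.dsl import Dataset, Output

def predict(input_1: str, input_2: str,
            output_1: Output[Dataset], output_2: Output[Dataset]):
    # Write the artifact payload to the system-provided path; the executor records its URI.
    os.makedirs(os.path.dirname(output_1.path), exist_ok=True)
    with open(output_1.path, 'w') as f:
        f.write('predictions would be written here')  # placeholder payload
    # Metadata set here is what shows up on the artifact in the Vertex UI.
    output_1.metadata['num_rows'] = 100  # hypothetical metadata entry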
I am still unsure why the component.yaml format specified in the Kubeflow documentation does not work - I think this may be a bug with Vertex Pipelines.

From what I can see in the code you are running, everything should work without a problem; but, as you commented, I would recommend writing the metrics object out to a path so that it ends up somewhere within your project.
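As an illustration only (not something I have verified end to end), "writing the metrics out to a path" could be as simple as treating the --metrics argument as a plain output file path and dumping the logged values there as JSON; write_metrics_file is just an illustrative helper name:

import json
import os

def write_metrics_file(metrics_path, metric_values):
    # metrics_path is the string passed via --metrics {outputPath: metrics};
    # metric_values is an ordinary dict, e.g. {'accuracy': acc, 'recall': recall}.
    os.makedirs(os.path.dirname(metrics_path), exist_ok=True)
    with open(metrics_path, 'w') as f:
        json.dump(metric_values, f)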

Related

How to pass the experiment configuration to a SagemakerTrainingOperator while training?

Idea:
To use Experiments and Trials to log the training parameters and artifacts in SageMaker, while using MWAA as the pipeline orchestrator.
I am using training_config to create the dict with the training configuration from the TensorFlow estimator, but there is no parameter to pass the experiment configuration.
# Imports for this snippet
import sagemaker
from sagemaker.tensorflow import TensorFlow
from sagemaker.workflow.airflow import training_config

tf_estimator = TensorFlow(entry_point='train_model.py',
                          source_dir=source,
                          role=sagemaker.get_execution_role(),
                          instance_count=1,
                          framework_version='2.3.0',
                          instance_type=instance_type,
                          py_version='py37',
                          script_mode=True,
                          enable_sagemaker_metrics=True,
                          metric_definitions=metric_definitions,
                          output_path=output)

model_training_config = training_config(
    estimator=tf_estimator,
    inputs=input,
    job_name=training_jobname,
)
training_task = SageMakerTrainingOperator(
    task_id=test_id,
    config=model_training_config,
    aws_conn_id="airflow-sagemaker",
    print_log=True,
    wait_for_completion=True,
    check_interval=60,
)
You can use experiment_config in estimator.fit. A more detailed example can be found here.
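For illustration, a sketch of what that could look like with an estimator like the one above; the experiment and trial names are placeholders:

# Sketch: attach the training job to an Experiment/Trial directly via estimator.fit.
tf_estimator.fit(
    inputs=input,
    experiment_config={
        "ExperimentName": "my-experiment",             # placeholder
        "TrialName": "my-trial",                       # placeholder
        "TrialComponentDisplayName": "model-training"  # placeholder
    },
)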
The only way that I found right now is to use the CreateTrainingJob API (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrainingJob.html#sagemaker-CreateTrainingJob-request-RoleArn). The following points are worth noting:
I am not sure if this will work with the bring-your-own-script method, e.g. with a TensorFlow estimator.
It works with a build-your-own-container approach.
Using the CreateTrainingJob API, I created the config, which in turn includes all the needed configs such as training, experiment, algorithm, etc., and passed that to SageMakerTrainingOperator.
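A rough sketch of that approach: the config dict follows the CreateTrainingJob request shape (including ExperimentConfig) and is handed straight to SageMakerTrainingOperator. Every name, ARN and S3 path below is a placeholder.

# Sketch: a CreateTrainingJob-shaped config that includes ExperimentConfig.
create_training_job_config = {
    "TrainingJobName": "my-training-job",
    "AlgorithmSpecification": {
        "TrainingImage": "<account>.dkr.ecr.<region>.amazonaws.com/my-training-image:latest",
        "TrainingInputMode": "File",
    },
    "RoleArn": "arn:aws:iam::<account>:role/my-sagemaker-role",
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/output"},
    "ResourceConfig": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 30,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    "ExperimentConfig": {
        "ExperimentName": "my-experiment",
        "TrialName": "my-trial",
        "TrialComponentDisplayName": "model-training",
    },
}

training_task = SageMakerTrainingOperator(
    task_id="train_with_experiment",
    config=create_training_job_config,
    aws_conn_id="airflow-sagemaker",
    wait_for_completion=True,
    check_interval=60,
)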

How do I control Drupal 8 monolog dblog log-levels

I'm using Drupal 8 with the Monolog module (1.3) for external log support and Sentry logging, while preserving core database logging (dblog/watchdog). This all works, except that I can't find a way to control the watchdog logging level (so I'm getting a lot of debug logging in production environments).
Here is my monolog configuration (monolog.services.yml):
parameters:
  monolog.channel_handlers:
    default: ['drupal.dblog', 'drupal.raven']
    custom1: ['rotating_file_custom1']
    custom2: ['rotating_file_custom2']
    custom3: ['rotating_file_custom3']

services:
  monolog.handler.rotating_file_custom1:
    class: Monolog\Handler\RotatingFileHandler
    arguments: ['file://../logs/custom1.log', 25, 'monolog.level.debug']
  monolog.handler.rotating_file_custom2:
    class: Monolog\Handler\RotatingFileHandler
    arguments: ['file://../logs/custom2.log', 25, 'monolog.level.debug']
  monolog.handler.rotating_file_custom3:
    class: Monolog\Handler\RotatingFileHandler
    arguments: ['file://../logs/custom3.log', 25, 'monolog.level.debug']
I tried adding a new services handler for drupal.dblog using the DrupalHandler, but I couldn't figure out what arguments to use (arguments are required and it is expecting a LoggerInterface implementation as the first argument).
I've read through the module documentation, but it almost exclusively focuses on file/external logging.
Any suggestions?
TIA
I was also trying to accomplish this and it was not clear to me either. The solution was to add the core dblog service with the '@' notation as the first argument. It should also be mentioned that log levels are documented as 'monolog.level.debug' but should be specified without the 'monolog.level.' prefix, because the monolog parameters are not loaded correctly as they should be. Here is a working example:
parameters:
  monolog.channel_handlers:
    default: ['rotating_file_handler', 'db_warning_handler']
    locale: ['null']
  monolog.processors: ['message_placeholder', 'current_user', 'request_uri', 'ip', 'referer', 'memory_usage']

services:
  monolog.handler.db_warning_handler:
    class: Drupal\monolog\Logger\Handler\DrupalHandler
    arguments: ['@logger.dblog', 'warning']
  monolog.handler.rotating_file_handler:
    class: Monolog\Handler\RotatingFileHandler
    arguments: ['private://logs/debug.log', 30, 'debug']

Building Envoy WASM network filter

I'm trying to build a TCP-level WASM network filter for Envoy. I'm testing with the following filter chain:
filter_chains:
- filters:
  - name: envoy.filters.wasm
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.network.wasm.v3.Wasm
      config:
        name: "myfilter"
        vm_config:
          runtime: "envoy.wasm.runtime.v8"
          code:
            local:
              filename: "/opt/envoy/filter.wasm"
          allow_precompiled: true
  - name: envoy.filters.network.echo
My C++ code compiles OK and Envoy starts fine, but even with just
FilterStatus MyFilterContext::onNewConnection() {
  LOG_DEBUG("onNewConn");
  return FilterStatus::StopIteration;
}

FilterStatus MyFilterContext::onDownstreamData(size_t, bool) {
  LOG_DEBUG("onDownstream");
  return FilterStatus::StopIteration;
}
I would expect the connection to never reach the echo service, yet it does every time, and there is nothing logged from the wasm filter side in the Envoy logs, apart from trace-level logs showing that the wasm module always returns the same logical value:
[29][trace][wasm] [source/extensions/common/wasm/wasm_vm.cc:40] [host->vm] proxy_on_new_connection(2)
[29][trace][wasm] [source/extensions/common/wasm/wasm_vm.cc:40] [host<-vm] proxy_on_new_connection return: 0
While there are many for HTTP filters, I was unable to locate any examples of a network filter implementation, which makes me wonder if anyone has this working, and if so, how exactly.
I also tried implementing it in Rust, with no success: it compiles, but within Envoy it fails with
Function: proxy_on_context_create failed: Uncaught RuntimeError: unreachable
Proxy-Wasm plugin in-VM backtrace:
0: 0x164c - __rust_start_panic
Has anyone actually implemented a working network filter for Envoy? Example code would be great, as all the examples I found are for HTTP filters, which do me no good.
Yes, I've implemented a working NETWORK_FILTER for Envoy. https://github.com/solo-io/proxy-runtime/commit/0b7fec73a36a979b53a05393b3868f1661af8dff was required for it to work in AssemblyScript. The commit message describes the YAML configuration I used with an Istio-flavored Envoy, but it depends on which version of Envoy you're using. Each language's runtime may or may not support it, but the C++ runtime certainly does, if you don't want to use AssemblyScript.
See also https://www.envoyproxy.io/docs/envoy/latest/configuration/listeners/network_filters/wasm_filter.html

Pass CDK context values per deployment environment

I am using context to pass values to CDK. Is there currently a way to define a project context file per deployment environment (dev, test), so that as the number of values I have to pass grows, they will be easier to manage than passing them on the command line:
cdk synth --context bucketName1=my-dev-bucket1 --context bucketName2=my-dev-bucket2 MyStack
It would be possible to use one cdk.json context file and only pass the environment as a context value on the command line, and depending on its value select the correct values:
{
  ...
  "context": {
    "devBucketName1": "my-dev-bucket1",
    "devBucketName2": "my-dev-bucket2",
    "testBucketName1": "my-test-bucket1",
    "testBucketName2": "my-test-bucket2"
  }
}
But preferably, I would like to split it into separate files, e.g. cdk.dev.json and cdk.test.json, which would contain their corresponding values, and use the correct one depending on the environment.
According to the documentation, CDK will look for context in one of several places. However, there's no mention of defining multiple/additional files.
The best solution I've been able to come up with is to make use of JSON to separate context out per environment:
"context": {
"dev": {
"bucketName": "my-dev-bucket"
}
"prod": {
"bucketName": "my-prod-bucket"
}
}
This allows you to access the different values programmatically depending on which environment CDK is deploying to.
let myEnv = "dev" // This could be passed in as a property of the class instead and accessed via props.myEnv

const myBucket = new s3.Bucket(this, "MyBucket", {
  bucketName: app.node.tryGetContext(myEnv).bucketName
})
You can also do so programmatically in your code:
For instance, I have a context variable of deploy_tag: cdk deploy Stack\* -c deploy_tag=PROD
Then in my code, I retrieve that deploy_tag variable and make the decisions there, such as (using Python, but the idea is the same):
bucket_name = BUCKET_NAME_PROD if deploy_tag == 'PROD' else BUCKET_NAME_DEV
This can give you a lot more control, and if you set up a constants file in your code you can keep that up to date with far less in your cdk.json, which may otherwise become very cluttered with larger stacks and multiple environments. If you go this route, you can have your Prod and Dev constants files, and your context variable can inform your CDK app which file to load for a given deployment.
I also tend to create a new class object with all my deployment properties either assigned or derived, and pass that object into each stack, retrieving what I need out of there.
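A minimal sketch of that pattern in Python; the constants modules, the DeployProps class, and the constant names are all illustrative:

# app.py - sketch only; constants_dev/constants_prod and DeployProps are illustrative names.
from aws_cdk import core
import constants_dev
import constants_prod

class DeployProps:
    """Deployment properties, assigned or derived once, then passed into each stack."""
    def __init__(self, deploy_tag, constants):
        self.deploy_tag = deploy_tag
        self.bucket_name = constants.BUCKET_NAME

app = core.App()
deploy_tag = app.node.try_get_context("deploy_tag") or "DEV"
constants = constants_prod if deploy_tag == "PROD" else constants_dev
props = DeployProps(deploy_tag, constants)
# Each stack then reads what it needs from props, e.g. MyStack(app, "MyStack", props=props)
app.synth()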

How to connect kubeflow pipeline components

I want to establish a pipeline connection between the components by passing any kind of data, just to make it look organized, like a flowchart with arrows. Right now it is like below.
Irrespective of whether the Docker container generates output or not, I would want to pass some sample data between the components. However, if any changes are required in the Docker container code or the .yaml, please let me know.
KFP Code
import os
from pathlib import Path
import requests
import kfp

# Load the components
component1 = kfp.components.load_component_from_file('comp_typed.yaml')
component2 = kfp.components.load_component_from_file('component2.yaml')
component3 = kfp.components.load_component_from_file('component3.yaml')
component4 = kfp.components.load_component_from_file('component4.yaml')

# Use the components as part of the pipeline
@kfp.dsl.pipeline(name='Document Processing Pipeline', description='Document Processing Pipeline')
def data_passing():
    task1 = component1()
    task2 = component2(task1.output)
    task3 = component3(task2.output)
    task4 = component4(task3.output)
comp_typed.yaml code
name: DPC
description: This is an example
implementation:
  container:
    image: gcr.io/pro1-in-us/dpc_comp1@sha256:3768383b9cd694936ef00464cb1bdc7f48bc4e9bbf08bde50ac7346f25be15de
    command: [python3, /dpc_comp1.py]
component2.yaml
name: Custom_Plugin_1
description: This is an example
implementation:
  container:
    image: gcr.io/pro1-in-us/plugin1@sha256:16cb4aa9edf59bdf138177d41d46fcb493f84ce798781125dc7777ff5e1602e3
    command: [python3, /plugin1.py]
I tried this and this but could not achieve anything except errors. I am new to Python and Kubeflow. What code changes should I make to pass data between all 4 components using the KFP SDK? The data can be a file or a string.
Let's suppose component 1 downloads a .pdf file from a GCS bucket; can I feed the same file into the next downstream component? Component 1 downloads the file to the '/tmp/doc_pages' location of component 1's Docker container, which I believe is local to that particular container, so the downstream components cannot read it?
This notebook, which describes how to pass data between KFP components, may be useful. It includes the concept of 'small data', which is passed directly, versus 'large data', which you write to a file; as shown in the example notebook, the paths for the input and output files are chosen by the system and are passed into the function (as strings).
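For illustration, a small sketch of that 'large data' pattern with function-based components (the function names here are made up):

# Sketch of the 'large data' pattern: the system chooses the file paths and passes them in as strings.
from kfp.components import InputPath, OutputPath, create_component_from_func

def write_numbers(numbers_path: OutputPath(str), count: int = 10):
    with open(numbers_path, 'w') as f:
        for i in range(count):
            f.write(f'{i}\n')

def sum_numbers(numbers_path: InputPath(str), result_path: OutputPath(str)):
    with open(numbers_path) as f_in, open(result_path, 'w') as f_out:
        f_out.write(str(sum(int(line) for line in f_in)))

write_numbers_op = create_component_from_func(write_numbers, base_image='python:3.9')
sum_numbers_op = create_component_from_func(sum_numbers, base_image='python:3.9')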
If you don't need to pass data between steps, but want to specify a step ordering dependency (e.g. op2 doesn't run until op1 is finished) you can indicate this in your pipeline definition like so:
op2.after(op1)
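For example, inside a pipeline function this could look as follows (a sketch, assuming component1 and component2 take no inputs):

@kfp.dsl.pipeline(name='Ordering example')
def ordering_example():
    op1 = component1()
    op2 = component2()
    op2.after(op1)  # op2 waits for op1 even though no data is passed between them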
In addition to Amy's excellent answer:
Your pipeline is correct. The best way to establish a dependency between components is to establish a data dependency.
Let's look at your pipeline code:
task2 = component2(task1.output)
You're passing the output of task1 to component2. This should result in the dependency that you want. But there are a couple of problems (and your pipeline will show compilation errors if you try to compile it):
component1 needs to have an output
component2 needs to have an input
component2 needs to have an output (so that you can pass it to component3)
Etc.
Let's add them:
name: DPC
description: This is an example
outputs:
- name: output_1
implementation:
  container:
    image: gcr.io/pro1-in-us/dpc_comp1@sha256:3768383b9cd694936ef00464cb1bdc7f48bc4e9bbf08bde50ac7346f25be15de
    command: [python3, /dpc_comp1.py, --output-1-path, {outputPath: output_1}]
name: Custom_Plugin_1
description: This is an example
inputs:
- name: input_1
outputs:
- name: output_1
implementation:
  container:
    image: gcr.io/pro1-in-us/plugin1@sha256:16cb4aa9edf59bdf138177d41d46fcb493f84ce798781125dc7777ff5e1602e3
    command: [python3, /plugin1.py, --input-1-path, {inputPath: input_1}, --output-1-path, {outputPath: output_1}]
With these changes, your pipeline should compile and display the dependencies that you want.
Please check the tutorial about creating components from command-line programs.
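To make the command lines above concrete, here is a rough sketch of what the program inside the container (e.g. /plugin1.py) could look like; the argument names match the command above, everything else is illustrative:

# Sketch of /plugin1.py: reads the file at --input-1-path and writes a result to --output-1-path.
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument('--input-1-path', type=str, required=True)
parser.add_argument('--output-1-path', type=str, required=True)
args = parser.parse_args()

with open(args.input_1_path) as f:
    data = f.read()

# KFP provides the output path string but not the directory, so create it first.
os.makedirs(os.path.dirname(args.output_1_path), exist_ok=True)
with open(args.output_1_path, 'w') as f:
    f.write(data.upper())  # placeholder transformation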
If you don't want to use a dependency through outputs or pass any data between components, you can refer to the PVC from the previous step to explicitly call out a dependency.
Example:
You can create a PVC for storing data.
vop = dsl.VolumeOp(name="pvc",
                   resource_name="pvc",
                   size=<size>,
                   modes=dsl.VOLUME_MODE_RWO)
Use it in a component:
download = dsl.ContainerOp(name="download",
                           image="",
                           command=[" "], arguments=[" "],
                           pvolumes={"/data": vop.volume})
Now you can call out the dependency between download and train as follows:
train = dsl.ContainerOp(name="train",
                        image="",
                        command=[" "], arguments=[" "],
                        pvolumes={"/data": download.pvolumes["/data"]})