I have the following pipeline in AWS
API-Gateway -> Lambda -> Kafka(Amazon MSK) -> Consumers
I need to load test the entire pipeline and each component in the pipeline (to identify bottlenecks)
I don't have any prior experience in load testing. So, didn't know where to start. Some blogs mentioned JMeter can be used to do the load testing. But later got to know that the pipeline is asynchronous and it can't be done using JMeter.
How can I load test the pipeline? Is there any standard way to do it?
Any help is greatly appreciated. Thanks!
You can use any load testing tool which is capable of sending message to the API gateway and then reading it from Kafka.
When it comes to JMeter it's capable of both:
API Gateway: Building a WebService Test Plan
Kafka consumer: Apache Kafka - How to Load Test with JMeter
If you want to measure the cumulative duration of the request from API till it gets "consumed" - Transaction Controller
I'm trying to use GCP Data Fusion Basic Edition with Private IP option, buth when I try to create a pipeline every action gives me this error
No discoverable found for request POST /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/default/validations/stage HTTP/1.1
any suggestion on how to solve this issue
Thanks
This error is indicative of Pipeline Studio service being down. Check the status of Pipeline Studio in System Admin and look at the logs as described here.
You can restart the pipeline studio service by going to System Admin > Configuration > Make HTTP Call.
Change the method to POST and set path to namespaces/system/apps/pipeline/services/studio/start
You can validate your pipeline once pipeline studio status becomes green.
According to the google doc, a service running in the flexible enviroment can be the target of a push task:
Outside of the standard environment, you can't add tasks to push
queues, but a service running in the flexible environment can be the
target of a push task. You can specify this using the target parameter
when adding a task to queue or by specifying the default target for
the queue in queue.yaml.
However, when I tried to do it I get 404 errors in the flexible service.
That's totally normal due to the required endpoint (/_ah/queue/deferred) for task queues is it not defined in the flexible service.
How do I become a flexible service in a valid target for task queues?
Do I have to define that endpoint in my code in some way?
Usually, you'll need to write a handler in your worker service to do the processing after receiving a task. In the case of push tasks, the service will send HTTP requests to your whatever url you specify. If no url is specified the default URL /_ah/queue/[QUEUE_NAME] will be used.
Now, from the endpoint you mention, it seems you are using deferred tasks, which are a somewhat special kind. Please, see this thread for a workaround by adding the needed url entry. It mentions Managed VMS but it should still work.
I registered my task app in Spring Cloud Data Flow, created a definition for it and the status shows 'unknown'. I created the stream and trying to launch the task through task-sink and I get an error:
java.lang.IllegalStateException: failed to resolve MavenResource:
How to launch a task from the task-sink? Am I missing something? Any help is appreciated. Another question I have is how do I access the payload sent via TaskLaunchRequest in my task?
S1 http | step1: transformer-rabbit | log
S2 :S1.step1 > filter --expression=payload.contains('CUSTADDRMODRQ_V15') | task-processor | task-sink
task-sink is launching the task provided by the uri in the TaskLaunchRequest. It is looking for the resource as shown in the log
OUT Using manager EnhancedLocalRepositoryManager with priority 10.0 for /home/vcap/.m2/repository
OUT Using transporter HttpTransporter with priority 5.0 for https://repo.spring.io/libs-snapshot and finally failing.
The task is deployed in our repository and as mentioned I registered and created the definition for it as well.
This one is in cf environment and I am using SCDF server 1.0.0.M4.
In the application.properties for the task-sink i am providing maven.remote.repositories.snapshots.url=**
task create fis-ifx-event-task --definition "fis-event-task"
My goal is launching the task from the stream.
Thanks for the information. I am in fact using the BUILD-SNAPSHOT as I am unable to enable taks in 1.0.0M4 version. Here is the one I am using spring-cloud-dataflow-server-cloudfoundry-1.0.0.BUILD-20160808.144306-116. I am able to register and create task definitions. The status of the task definition is showing as 'unknown' even when I am using the sample task module provided by your team. But when I initiate the flow of the stream and when task-sink tries to launch the task, it is unable to find the maven resource. When I create the task definition, does the task module gets deployed? I don't see any app in Pivotal Apps Manager. As mentioned earlier, I provided maven.remote.repositories.snapshot.url in the application.properties file for the task-sink application. Another thing I observed is when I launch the task manually from dataflow shell it gives an error CF-UnprocessableEntity(10008): The request is semantically invalid: Unknown field(s): 'staging_disk_in_mb', 'staging_memory_in_mb' and also a message saying 'Source is empty'. Presently the task is supposed to print the timestamp and is not dependent on any input.
TaskProcessor code:
#EnableBinding(Processor.class)
#EnableConfigurationProperties(TaskProcessorProperties.class)
public class TaskProcessor {
#Autowired
private TaskProcessorProperties processorProperties;
public TaskProcessor() {
}
#Transformer(inputChannel = Processor.INPUT, outputChannel = Processor.OUTPUT)
#ELI(level = "info", eventType = ELIEventType.INBOUND)
public Object setupRequest(String message) {
Map<String, String> properties = new HashMap<String, String>();
properties.put("payload", message);
TaskLaunchRequest request = new TaskLaunchRequest(processorProperties.getUri(), null, properties, null);
return new GenericMessage<>(request);
}
}
TaskSink code:
#SpringBootApplication
#EnableTaskLauncher
#EnableBinding(Sink.class)
#EnableConfigurationProperties(TaskSinkProperties.class)
public class FisIfxEventTaskSinkApplication {
public static void main(String[] args) {
SpringApplication.run(FisIfxEventTaskSinkApplication.class, args);
}
}
I provided the stream I am using earlier in the post. Sink is receiving the TaskLaunchRequest with uri and payload as you can see here and unable to launch the task.
OUT registering [40, java.io.File] with serializer org.springframework.integration.codec.kryo.FileSerializer
2016-08-10T16:08:55.02-0600 [APP/0]
OUT Launching Task for the following resource TaskLaunchRequest{uri='maven://com.xxx:fis.ifx.event-task:jar:1.0-SNAPSHOT', commandlineArguments=[], environmentProperties={payload={"statusCode":0,"fisT
opic":"CustomerDataUpdated","payloadId":"CUSTADDRMODR``Q_V15","customerIds":[1597304]}}, deploymentProperties={}}
Before I begin, you have a number of questions here. In the future, it's better to break them up into multiple questions so that they are easier to find by other users and easier to answer. That being said:
A little context on the current state of things
In order to understand how things will work, it's important to understand the current state of things. The current releases of the software involved are:
Pivotal Cloud Foundry (PCF) - 1.7.12. This version is required for any task support.
Spring Cloud Task (SCT) - 1.0.2.RELEASE
Spring Cloud Data Flow CF (SCDF) - 1.0.0.BUILD-SNAPSHOT (current as of the date of this post).
Currently PCF 1.7.12+ has all the capabilities to run tasks. You can create v3 applications (the type of application used to launch a task), run it as a task, etc. However, the tooling around that functionality is not currently complete. There is no support for v3 applications in Apps Manager or the CLI. There is a plugin for the CLI that is more of a dev tool that can be used to help with some functions (it will show you logs, etc), but it is not fully functional and requires a specific version of the CLI to work [1]. This is one of the reasons that the task functionality within PCF is still considered experimental.
Spring Cloud Task is currently GA and supports all the functionality needed to effectively run tasks on CF. However, it's important to note that SCT doesn't handle orchestration so the actual launching of tasks on CF is the responsibility of either the user, or Spring Cloud Data Flow (the easier route).
Spring Cloud Data Flow's Cloud Foundry server implementation currently has functionality to launch tasks on PCF in the latest snapshots. We have validated this against 1.7.12 as well as the development branch of 1.8.
The task workflow within SCDF
Tasks are fundamentally different from stream applications within the context of SCDF. When you create a stream definition, you are given the option to deploy it. What this does is it actually downloads the Spring Boot über jars and deploys them to PCF as long running processes. If they go down, PCF, will relaunch them as expected, etc.
Tasks on the other hand, are not deployed. They are launched. The difference is that while you create a task definition, there is nothing deployed until you click launch. And when the task completes, the software is shut down and cleaned up. So while a stream definition may have states, it's really a one to one relationship between the definition and the deployed software. Where with a task, you can launch a task definition as many times as you want.
Your issues
Reading through your post, I see a few things that you are struggling with. Let me see if I can help:
Task Definitions within SCDF and launching them via a stream - When launching a task from a stream, the task registry within SCDF is not used. The sink expects the URL for the resource to be within the TaskLauchRequest.
Apps Manager and tasks - As mentioned above, there is no support for v3 applications in Apps Manager yet so you won't be able to see your tasks there.
Viewing the logs - In order to debug what's going wrong with launching your task on CF, you're going to want to view the logs. To do so, use the v3 CLI plugin mentioned above to view them. It's important to note that you can only tail live logs with the plugin, not view logs that have previously been rendered. Because of that, when testing, you'll want to tail the logs as soon as the app is created, before it's launched.
Error in SCDF Shell - The error you received from the SCDF shell (CF-UnprocessableEntity(10008):...) leads me to wonder if you have both the correct version of PCF (1.7.12+) and the correct version of the following other libraries:
spring-cloud-deployer-cloudfoundry - The latest snapshots
cf-java-client - 2.0.0.M10+
reactor-core - 3.0.0.RC1+
I hope this helps!
[1] https://github.com/cloudfoundry/v3-cli-plugin
Task support is not available in 1.0.0.M4 release of SCDF's CF-server. In this release, the task commands/REST-APIs should be disabled - see here. And for that reason, you wouldn't see any docs related to Tasks in the 1.0.0.M4 reference guide.
That said, the Task support is available/enabled in the BUILD-SNAPSHOT release. If you're locally building the CF-server and upon pushing it to CF, you could take advantage the task commands in the shell to create and launch task definitions.
I get the following error:Service Invocation Exeption i am working with Version 8.7 IBM InfoSphere DataStage and QualityStage Designer and using a server job and there, i have 1 sequential file, web service, sequential file.
Any idea what could be the reason of this error ?
Make sure you have chosen proper DataStage job type and your stage that operates on web service is configured properly.
You should also check the DataStage logs to get more information about root cause of the error.