VS2017 Azure WebJob Extensions - How do I deploy a TimerTrigger WebJob?

I have the following WebJob project where I'm trying to deploy a TimerTrigger WebJob function, however I cannot get it to run on a scheduled basis when deploying it via "Publish As Azure WebJob..." in Visual Studio 2017.
Program.cs
class Program
{
    static void Main()
    {
        var config = new JobHostConfiguration();

        if (config.IsDevelopment)
        {
            config.UseDevelopmentSettings();
        }

        // Requires the Microsoft.Azure.WebJobs.Extensions NuGet package.
        config.UseTimers();

        var host = new JobHost(config);
        host.RunAndBlock();
    }
}
Functions.cs
public class Functions
{
    public static async Task ProcessAsync([TimerTrigger("0 */3 * * * *")] TimerInfo timerInfo, TextWriter log)
    {
        ...
    }
}
webjob-publish-settings.json
{
"$schema": "http://schemastore.org/schemas/json/webjob-publish-settings.json",
"webJobName": "TestWebJob",
"runMode": "OnDemand"
}
Settings.job
{ "schedule": "0 */3 * * * *" }
The documentation for this is pretty non-existent, and it's baffling why Azure supports scheduled CRON TimerTriggers but doesn't actually include them as an option when deploying.
Is this possible?

If you have ever created a scheduled WebJob manually, you have probably noticed that the process generates a settings.job file to hold the schedule; the SCHEDULE column in the portal reads this file and displays it. If you deploy a TimerTrigger WebJob with VS2017, this file is not generated, because you have already defined the schedule in the TimerTrigger function.
I ran some tests to confirm this. First I created a WebJob with a TimerTrigger and deployed it; it showed the same result as yours, with n/a in the SCHEDULE column. Then I killed the WebJob process, uploaded a settings.job file, and refreshed the page (the browser refresh, not the refresh button in the portal), after which the SCHEDULE column changed to the CRON expression. If you delete the file, it changes back.
As for the logs, in my opinion the difference is also caused by settings.job: if the file is present, the WebJob itself is triggered every x minutes, whereas without it the function is triggered every x minutes inside a single running WebJob.
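If you want the file to be deployed with the WebJob rather than uploaded by hand, one option (a sketch, assuming a classic .NET Framework project and that the file is named Settings.job) is to include it in the project and copy it to the output directory so "Publish as Azure WebJob..." picks it up:

<!-- In the WebJob's .csproj: copy Settings.job next to the compiled binaries -->
<ItemGroup>
  <None Include="Settings.job">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </None>
</ItemGroup>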
If you still have questions, please let me know.

It seems that the above code is working. However, it runs completely differently from how you would imagine if you're familiar with running "Scheduled" WebJobs manually.
If you were to run them manually, you would usually see the Schedule at the top level, along with the Status updating every x minutes, and you would also see the logs update at the parent level.
However, when deploying it using the above method via Visual Studio 2017, you only ever get the WebJob running once for the duration of its lifetime. As a result, you only ever get one parent entry in the logs list too.
Though if you click into it, you will see an individual log entry for each scheduled function invocation.
Hopefully this will make sense for other people who are looking into setting up WebJobs :)
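Note that a TimerTrigger needs the WebJobs host to stay alive between invocations, which is why the job shows up as one long-running execution. A minimal sketch of webjob-publish-settings.json that matches this behavior by using "Continuous" rather than "OnDemand":

{
  "$schema": "http://schemastore.org/schemas/json/webjob-publish-settings.json",
  "webJobName": "TestWebJob",
  "runMode": "Continuous"
}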

Related

Running Batch python processes on Google Cloud

I have a couple of Python scripts which I would like to schedule to run once a month on Google Cloud. The scripts basically trigger DLP jobs and extract Data Catalog information to a file in GCS. These batch workloads would hardly run for 30 minutes, so I don't want to use services like GKE, Composer, etc., which are very resource intensive.
For these batch workloads I would like to know the best options available in GCP. Looking at some blog posts, I found the article below, which uses Cloud Scheduler -> Pub/Sub -> Cloud Functions -> Create VM (using a startup script).
https://medium.com/google-cloud/running-a-serverless-batch-workload-on-gcp-with-cloud-scheduler-cloud-functions-and-compute-86c2bd573f25
I have the following question about the above design:
1) How long does the Cloud Function run as it starts the VM? I know Cloud Functions have a timeout of 9 minutes. What happens if the VM takes longer than 9 minutes to process the startup script?
Any other design ideas are much appreciated.
Thanks
I'm the author of that medium post.
1) How long does the Cloud Function run as it starts the VM?
You can change the Cloud Function code so it doesn't wait for the response; it's NodeJS, so you simply don't wait for the Promise.
Also, in that solution the Cloud Function's only job is to trigger the VM creation:
// Assumes the legacy @google-cloud/compute Node.js client used in the post;
// vmName and vmConfig are defined earlier in the function.
const Compute = require('@google-cloud/compute');
const compute = new Compute();

compute
  .zone('us-central1-a') // placeholder zone
  .createVM(vmName, vmConfig)
  .then(data => {
    // Operation pending.
    const vm = data[0];
    const operation = data[1];
    console.log(`VM being created: ${vm.id}`);
    console.log(`Operation info: ${operation.id}`);
    // Returning operation.promise() waits for VM creation to complete.
    // To return right away with the VM in its pending state, finish your
    // logic here and skip the wait; you only need it if you want the
    // completion message below logged for debugging purposes.
    return operation.promise();
  })
  .then(() => {
    const message = 'VM created with success, Cloud Function finished execution.';
    console.log(message);
  });
Using that same code, in the worst case (if it takes more than 9 minutes), the Cloud Function will time out, but the VM creation will continue.
The design that I suggest is: Cloud Scheduler + Pub/Sub + Compute Engine.
This design in a few words (see the example after this list):
- your Compute Engine instance runs a utility that listens to a Cloud Pub/Sub topic
- this utility executes upon receiving a new event from the topic and runs a cron job on the instance
- Cloud Scheduler is used here to push messages to the Pub/Sub topic on the schedule that you specify in your job
By using Pub/Sub to decouple the task-scheduling logic from the logic running the commands on Compute Engine, you can update your cron scripts as needed without updating the Cloud Scheduler configuration. You can also change your task schedule without updating the utility service on your Compute Engine instances.
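For instance, the scheduling side can be a single gcloud command (a sketch; the job name, topic, and message body are placeholders, and the cron expression fires at 09:00 on the 1st of every month):

gcloud scheduler jobs create pubsub monthly-batch \
    --schedule="0 9 1 * *" \
    --topic=batch-tasks \
    --message-body="run-dlp-export"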
You can find a full explanation of this design and sample code by following this and this.
Let me know if anything is unclear.

How to apply Singleton Attribute for NonTriggered method in Azure Webjobs

If I apply the [Singleton] and [NoAutomaticTrigger] attributes and publish the WebJob, it goes into the Pending Restart state.
We want to solve a multiple-instance issue which is occurring in a method.
Please help.
it goes to pending restart state.
In your case, you need to check the reason why the WebJob goes into the Pending Restart state.
There are lots of possible reasons for the Pending Restart state; it may be due to an error, or the WebJob thread finished and needs to restart. We can check this with the WebJob log.
Before publishing to Azure, make sure the job works correctly locally, and add the app settings AzureWebJobsDashboard and AzureWebJobsStorage with a storage connection string; then we can get the WebJob log from the WebJobs dashboard.
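For example (a sketch; the account name and key are placeholders for your own storage account), in the WebJob's App.config:

<connectionStrings>
  <!-- Both settings point at an Azure Storage account; the dashboard
       connection is what the WebJobs dashboard reads execution logs from. -->
  <add name="AzureWebJobsDashboard"
       connectionString="DefaultEndpointsProtocol=https;AccountName={account};AccountKey={key}" />
  <add name="AzureWebJobsStorage"
       connectionString="DefaultEndpointsProtocol=https;AccountName={account};AccountKey={key}" />
</connectionStrings>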
If you publish it as a continuous WebJob and the method runs to completion, the status will change to Pending Restart; that is normal behavior.
The [Singleton] and [NoAutomaticTrigger] attributes do work correctly; please refer to the following demo code.
static void Main()
{
    JobHost host = new JobHost();
    host.Call(typeof(Functions).GetMethod("CreateQueueMessage"), new { value = "Hello world!" + Guid.NewGuid() });
}

[Singleton]
[NoAutomaticTrigger]
public static void CreateQueueMessage(TextWriter logger, string value, [Queue("outputqueue")] out string message)
{
    message = value;
    logger.WriteLine("Creating queue message: {0}", message);
    Console.WriteLine(message);
}

Azure webjob with runMode "OnDemand" keeps running

The Azure WebJob with runMode set to "OnDemand" keeps running, and I am not able to stop it.
I don't see anything in the job that would still need to be handled.
{
"$schema": "http://schemastore.org/schemas/json/webjob-publish-settings.json",
"webJobName": "ScheduledJob",
"runMode": "OnDemand"
}
NAME: ScheduledJob | TYPE: Triggered | STATUS: Running | SCHEDULE: n/a
The only way to restart it is by restarting the web service and then starting the job manually. Then it keeps running again; it does not stop.
What is going on with this WebJob?
Update 1:
I am using the code from the PnP Partner Pack, which can be found here.
As the code is too long, I am just providing the code from the Program.cs file.
For the rest, please have a look at the link I posted above.
static void Main()
{
    var job = new PnPPartnerPackProvisioningJob();
    job.UseThreading = false;
    job.AddSite(PnPPartnerPackSettings.InfrastructureSiteUrl);
    job.UseAzureADAppOnlyAuthentication(
        PnPPartnerPackSettings.ClientId,
        PnPPartnerPackSettings.Tenant,
        PnPPartnerPackSettings.AppOnlyCertificate);
    job.Run();

#if DEBUG
    // Only blocks in DEBUG builds; a Release deployment exits once Run() returns.
    Console.ReadLine();
#endif
}
In your code, the PnPPartnerPackProvisioningJob class inherits from the TimerJob class.
The TimerJob class has no stop method, and once a timer job has started executing you cannot really stop it unless you restart the WebJob. For more details, you could refer to this article:
"So if your requirement is to cancel a job, you will need to delete the timer job definition. However, if the timer job has started executing, you cannot really STOP it unless you reset IIS or stop the SharePoint Windows Timer Service."

How to set Continuous Webjob as Singleton?

I've seen some guidance around using settings.job for this but it's not working - in the console I see:
WebJob singleton setting is False
How can I go about preventing scale-outs from running multiple instances of my webjob?
As far as I know, to set a continuous job as a singleton, you can create a file called settings.job with the content { "is_singleton": true } and put it at the root of the WebJob directory.
You can then fetch the continuous job's settings to confirm whether it is a singleton:
GET /api/continuouswebjobs/{job name}/settings
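For example, via the Kudu REST API (a sketch; replace the app name and job name, and authenticate with your deployment credentials):

curl -u {deployUser} https://{yourapp}.scm.azurewebsites.net/api/continuouswebjobs/{job name}/settings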
in the console I see:
WebJob singleton setting is False
Please use the Kudu console to check whether settings.job exists at the root of the WebJob directory and what the actual value of the "is_singleton" property is.
If you can use the WebJobs SDK, I prefer to use the Singleton attribute:
https://learn.microsoft.com/en-us/azure/app-service/webjobs-sdk-how-to#singleton-attribute
[Singleton]
public static async Task ProcessImage([BlobTrigger("images")] Stream image)
{
    // Process the image.
}
If you have something like a ServiceBus trigger you should use [Singleton(Mode = SingletonMode.Listener)] in combination with the other host settings.
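A sketch of what that looks like (the queue name "myqueue" is a placeholder; Listener mode ensures only one instance holds the listener even when the app scales out):

[Singleton(Mode = SingletonMode.Listener)]
public static void ProcessQueueMessage([ServiceBusTrigger("myqueue")] string message)
{
    // With Listener mode, only one host instance across all scaled-out
    // instances runs the listener for this trigger.
}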

Unable to launch task from a spring cloud data flow stream

I registered my task app in Spring Cloud Data Flow and created a definition for it, and the status shows 'unknown'. I created the stream and am trying to launch the task through task-sink, and I get an error:
java.lang.IllegalStateException: failed to resolve MavenResource:
How do I launch a task from the task-sink? Am I missing something? Any help is appreciated. Another question I have is: how do I access the payload sent via TaskLaunchRequest in my task?
S1 http | step1: transformer-rabbit | log
S2 :S1.step1 > filter --expression=payload.contains('CUSTADDRMODRQ_V15') | task-processor | task-sink
task-sink is launching the task provided by the URI in the TaskLaunchRequest. It is looking for the resource, as shown in the log:
OUT Using manager EnhancedLocalRepositoryManager with priority 10.0 for /home/vcap/.m2/repository
OUT Using transporter HttpTransporter with priority 5.0 for https://repo.spring.io/libs-snapshot
and finally failing.
The task is deployed in our repository, and as mentioned, I registered it and created the definition for it as well.
This is in a CF environment, and I am using SCDF server 1.0.0.M4.
In the application.properties for the task-sink I am providing maven.remote.repositories.snapshots.url=**
task create fis-ifx-event-task --definition "fis-event-task"
My goal is to launch the task from the stream.
Thanks for the information. I am in fact using the BUILD-SNAPSHOT, as I am unable to enable tasks in the 1.0.0.M4 version. The one I am using is spring-cloud-dataflow-server-cloudfoundry-1.0.0.BUILD-20160808.144306-116. I am able to register apps and create task definitions, but the status of the task definition shows as 'unknown' even when I am using the sample task module provided by your team.

When I initiate the flow of the stream and task-sink tries to launch the task, it is unable to find the Maven resource. When I create the task definition, does the task module get deployed? I don't see any app in Pivotal Apps Manager. As mentioned earlier, I provided maven.remote.repositories.snapshot.url in the application.properties file for the task-sink application.

Another thing I observed: when I launch the task manually from the Data Flow shell, it gives an error CF-UnprocessableEntity(10008): The request is semantically invalid: Unknown field(s): 'staging_disk_in_mb', 'staging_memory_in_mb', and also a message saying 'Source is empty'. Presently the task is only supposed to print the timestamp and is not dependent on any input.
TaskProcessor code:
@EnableBinding(Processor.class)
@EnableConfigurationProperties(TaskProcessorProperties.class)
public class TaskProcessor {

    @Autowired
    private TaskProcessorProperties processorProperties;

    public TaskProcessor() {
    }

    @Transformer(inputChannel = Processor.INPUT, outputChannel = Processor.OUTPUT)
    @ELI(level = "info", eventType = ELIEventType.INBOUND)
    public Object setupRequest(String message) {
        Map<String, String> properties = new HashMap<String, String>();
        properties.put("payload", message);
        TaskLaunchRequest request = new TaskLaunchRequest(processorProperties.getUri(), null, properties, null);
        return new GenericMessage<>(request);
    }
}
TaskSink code:
@SpringBootApplication
@EnableTaskLauncher
@EnableBinding(Sink.class)
@EnableConfigurationProperties(TaskSinkProperties.class)
public class FisIfxEventTaskSinkApplication {

    public static void main(String[] args) {
        SpringApplication.run(FisIfxEventTaskSinkApplication.class, args);
    }
}
I provided the stream I am using earlier in the post. The sink is receiving the TaskLaunchRequest with the URI and payload, as you can see here, and is unable to launch the task:
OUT registering [40, java.io.File] with serializer org.springframework.integration.codec.kryo.FileSerializer
2016-08-10T16:08:55.02-0600 [APP/0]
OUT Launching Task for the following resource TaskLaunchRequest{uri='maven://com.xxx:fis.ifx.event-task:jar:1.0-SNAPSHOT', commandlineArguments=[], environmentProperties={payload={"statusCode":0,"fisTopic":"CustomerDataUpdated","payloadId":"CUSTADDRMODRQ_V15","customerIds":[1597304]}}, deploymentProperties={}}
Before I begin, you have a number of questions here. In the future, it's better to break them up into multiple questions so that they are easier to find by other users and easier to answer. That being said:
A little context on the current state of things
In order to understand how things will work, it's important to understand the current state of things. The current releases of the software involved are:
Pivotal Cloud Foundry (PCF) - 1.7.12. This version is required for any task support.
Spring Cloud Task (SCT) - 1.0.2.RELEASE
Spring Cloud Data Flow CF (SCDF) - 1.0.0.BUILD-SNAPSHOT (current as of the date of this post).
Currently PCF 1.7.12+ has all the capabilities to run tasks. You can create v3 applications (the type of application used to launch a task), run it as a task, etc. However, the tooling around that functionality is not currently complete. There is no support for v3 applications in Apps Manager or the CLI. There is a plugin for the CLI that is more of a dev tool that can be used to help with some functions (it will show you logs, etc), but it is not fully functional and requires a specific version of the CLI to work [1]. This is one of the reasons that the task functionality within PCF is still considered experimental.
Spring Cloud Task is currently GA and supports all the functionality needed to effectively run tasks on CF. However, it's important to note that SCT doesn't handle orchestration so the actual launching of tasks on CF is the responsibility of either the user, or Spring Cloud Data Flow (the easier route).
Spring Cloud Data Flow's Cloud Foundry server implementation currently has functionality to launch tasks on PCF in the latest snapshots. We have validated this against 1.7.12 as well as the development branch of 1.8.
The task workflow within SCDF
Tasks are fundamentally different from stream applications within the context of SCDF. When you create a stream definition, you are given the option to deploy it. What this actually does is download the Spring Boot über-jars and deploy them to PCF as long-running processes. If they go down, PCF will relaunch them as expected, etc.
Tasks, on the other hand, are not deployed; they are launched. The difference is that while you create a task definition, nothing is deployed until you click launch, and when the task completes, the software is shut down and cleaned up. So while a stream definition may have states, it's really a one-to-one relationship between the definition and the deployed software, whereas with a task you can launch a task definition as many times as you want.
Your issues
Reading through your post, I see a few things that you are struggling with. Let me see if I can help:
Task definitions within SCDF and launching them via a stream - When launching a task from a stream, the task registry within SCDF is not used. The sink expects the URL for the resource to be within the TaskLaunchRequest.
Apps Manager and tasks - As mentioned above, there is no support for v3 applications in Apps Manager yet so you won't be able to see your tasks there.
Viewing the logs - In order to debug what's going wrong with launching your task on CF, you're going to want to view the logs. To do so, use the v3 CLI plugin mentioned above to view them. It's important to note that you can only tail live logs with the plugin, not view logs that have previously been rendered. Because of that, when testing, you'll want to tail the logs as soon as the app is created, before it's launched.
Error in SCDF Shell - The error you received from the SCDF shell (CF-UnprocessableEntity(10008):...) leads me to wonder if you have both the correct version of PCF (1.7.12+) and the correct version of the following other libraries:
spring-cloud-deployer-cloudfoundry - The latest snapshots
cf-java-client - 2.0.0.M10+
reactor-core - 3.0.0.RC1+
I hope this helps!
[1] https://github.com/cloudfoundry/v3-cli-plugin
Task support is not available in the 1.0.0.M4 release of SCDF's CF server. In this release, the task commands/REST APIs are disabled - see here. For that reason, you won't see any docs related to tasks in the 1.0.0.M4 reference guide.
That said, task support is available/enabled in the BUILD-SNAPSHOT release. If you build the CF server locally and push it to CF, you can take advantage of the task commands in the shell to create and launch task definitions.
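For example, from the Data Flow shell ("timestamp" is the sample task app referenced above; the definition name is a placeholder):

dataflow:> task create my-task --definition "timestamp"
dataflow:> task launch my-task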