Using Obsidian Maintenance Jobs

When we began development on Obsidian Scheduler, one of its key objectives was transparency. Quoting my own post,

transparent means letting us know what is going on. We want to have available to us information such as when was the job scheduled to run, when did it start, when did it complete, which host ran it (in pool scenarios). We want to know if it failed, what the problem was. We want the option to be notified either being emailed, paged, and/or messaged when all or specific jobs fails. We want detailed information available to us should we wish to investigate problems in our executing jobs. And we want all of this without having to write code, create our own interface, parse log files or do extra configuration. The scheduler should just work that way.

We’re very happy that we achieved that goal and Obsidian gives us the transparency we sought.

As you likely know, Obsidian stores all this information in a database. This allows you to retain as much of this history as you desire or are required by your business group’s policy. In theory, you could retain all of this history indefinitely in the live database ensuring this information would be available at any time simply by accessing the Obsidian UI. In practice, this is often not required. Many organizations simply maintain an indefinite archive of database backups and can retrieve and restore these to an environment they would use specifically to perform any desired historical investigations.

Obsidian is bundled with two maintenance jobs that can be used to keep the Obsidian database trim and holding only the necessary recent history. These maintenance jobs take advantage of Obsidian’s built-in ability for parametrization allowing you to choose the desired retention period. These jobs are not scheduled by default, so let’s take a look at how we can configure them to run.

JobHistoryCleanupJob

This job cleans up the execution history of your jobs, retaining the most recent history. When scheduling a new job in Obsidian, you will find the Job Class com.carfey.ops.job.maint.JobHistoryCleanupJob in the selection list. Once chosen, you will see that it supports a single required parameter – maxAgeDays. Assuming you want job execution history retained for the most recent 90 days, set this value to 90 days. Here’s a screenshot of what it would look like.
JobHistoryCleanupUsage

LogCleanupJob

This job cleans up the audit and information logs that track all activity in Obsidian. As described in our wiki, these logs contain everything from scheduler system activity such as spawning and execution to user activity such as changing a schedule, its configuration or adding new jobs. These logs are classified by severity level and the Job Class com.carfey.ops.job.maint.LogCleanupJob is configured using the same required parameter maxAgeDays, but has an additional required parameter – level. It defaults to ALL and allows multiple values, but you can also retain this log data for different retention periods by severity by scheduling and configuring this job for each classification. Notice these samples:
LogCleanupUsage1
LogCleanupUsage2

That’s all there is to it! In both cases, the absence of a configured maintenance job or in the case of the LogCleanupJob, configured for a given severity level, means that data will be retained indefinitely. Of course, you can always decide later on to configure such a job and at that time Obsidian will take care of it for you.

Is there something else you’d like to see Obsidian do? Drop us a line or leave a comment below. We listen carefully to all our customers’ feature requests and give priority to customer’s needs in our product roadmap. We also appreciate hearing how Obsidian is helping you.

Job Chaining in Quartz and Obsidian Scheduler

In this post I’m going to cover how to do job chaining in Quartz versus Obsidian Scheduler. Both are Java job schedulers, but they have different approaches so I thought I’d highlight them here and give some guidance to users using both options.

It’s very common when using a job scheduler to need to chain one job to another. Chaining in this case refers to executing a specific job after a certain job completes (or maybe even fails). Often we want to do this conditionally, or pass on data to the target job so it can receive it as input from the original job.

We’ll start with demonstrating how to do this in Quartz, which will take a fair bit of work. Obsidian will come after since it’s so simple.

Chaining in Quartz

Quartz is the most popular job scheduler out there, but unfortunately it doesn’t provide any way to give you chaining without you writing some code. Quartz is a low-level library at heart, and it doesn’t try to solve these types of problems for you, which in my mind is unfortunate since it puts the onus on developers. But despite this, many teams still end up using Quartz, so hopefully this is useful to some of you.

I’m going to outline probably the most basic way to perform chaining. It will allow a job to chain to another, passing on its JobDataMap (for state). This is simpler than using listeners, which would require extra configuration, but if you want to take a look, check out this listener for a starting point.

Sample Code

This will rely on an abstract class that will provided basic flow and chaining functionality to any subclasses. It acts as a very simple Template class.

First, let’s create the abstract class that gives us chaining behaviour:

import static org.quartz.JobBuilder.newJob;
import static org.quartz.TriggerBuilder.newTrigger;
import org.quartz.*;
import org.quartz.impl.*;

public abstract class ChainableJob implements Job {
   private static final String CHAIN_JOB_CLASS = "chainedJobClass";
   private static final String CHAIN_JOB_NAME = "chainedJobName";
   private static final String CHAIN_JOB_GROUP = "chainedJobGroup";
   
   @Override
   public void execute(JobExecutionContext context) throws JobExecutionException {
      // execute actual job code
      doExecute(context);

      // if chainJob() was called, chain the target job, passing on the JobDataMap
      if (context.getJobDetail().getJobDataMap().get(CHAIN_JOB_CLASS) != null) {
         try {
            chain(context);
         } catch (SchedulerException e) {
            e.printStackTrace();
         }
      }
   }
   
   // actually schedule the chained job to run now
   private void chain(JobExecutionContext context) throws SchedulerException {
      JobDataMap map = context.getJobDetail().getJobDataMap();
      @SuppressWarnings("unchecked")
      Class jobClass = (Class) map.remove(CHAIN_JOB_CLASS);
      String jobName = (String) map.remove(CHAIN_JOB_NAME);
      String jobGroup = (String) map.remove(CHAIN_JOB_GROUP);
      
      
      JobDetail jobDetail = newJob(jobClass)
            .withIdentity(jobName, jobGroup)
            .usingJobData(map)
            .build();
         
      Trigger trigger = newTrigger()
            .withIdentity(jobName + "Trigger", jobGroup + "Trigger")
                  .startNow()      
                  .build();
      System.out.println("Chaining " + jobName);
      StdSchedulerFactory.getDefaultScheduler().scheduleJob(jobDetail, trigger);
   }

   protected abstract void doExecute(JobExecutionContext context) 
                                    throws JobExecutionException;
   
   // trigger job chain (invocation waits for job completion)
   protected void chainJob(JobExecutionContext context, 
                          Class jobClass, 
                          String jobName, 
                          String jobGroup) {
      JobDataMap map = context.getJobDetail().getJobDataMap();
      map.put(CHAIN_JOB_CLASS, jobClass);
      map.put(CHAIN_JOB_NAME, jobName);
      map.put(CHAIN_JOB_GROUP, jobGroup);
   }
}

There’s a fair bit of code here, but it’s nothing too complicated. We create the basic flow for job chaining by creating an abstract class which calls a doExecute() method in the child class, then chains the job if it was requested by calling chainJob().

So how do we use it? Check out the job below. It actually chains to itself to demonstrate that you can chain any job and that it can be conditional. In this case, we will chain the job to another instance of the same class if it hasn’t already been chained, and we get a true value from new Random().nextBoolean().

import java.util.*;
import org.quartz.*;

public class TestJob extends ChainableJob {

   @Override
   protected void doExecute(JobExecutionContext context) 
                                   throws JobExecutionException {
      JobDataMap map = context.getJobDetail().getJobDataMap();
      System.out.println("Executing " + context.getJobDetail().getKey().getName() 
                         + " with " + new LinkedHashMap(map));
      
      boolean alreadyChained = map.get("jobValue") != null;
      if (!alreadyChained) {
         map.put("jobTime", new Date().toString());
         map.put("jobValue", new Random().nextLong());
      }
      
      if (!alreadyChained && new Random().nextBoolean()) {
         chainJob(context, TestJob.class, "secondJob", "secondJobGroup");
      }
   }
   
}

The call to chainJob() at the end will result in the automatic job chaining behaviour in the parent class. Note that this isn’t called immediately, but only executes after the job completes its doExecute() method.

Here’s a simple harness that demonstrates everything together:

import org.quartz.*;
import org.quartz.impl.*;

public class Test {
   
   public static void main(String[] args) throws Exception {

      // start up scheduler
      StdSchedulerFactory.getDefaultScheduler().start();

      JobDetail job = JobBuilder.newJob(TestJob.class)
             .withIdentity("firstJob", "firstJobGroup").build();

      // Trigger our source job to triggers another
      Trigger trigger = TriggerBuilder.newTrigger()
            .withIdentity("firstJobTrigger", "firstJobbTriggerGroup")
            .startNow()
            .withSchedule(
                  SimpleScheduleBuilder.simpleSchedule().withIntervalInSeconds(1)
                  .repeatForever()).build();

      StdSchedulerFactory.getDefaultScheduler().scheduleJob(job, trigger);
      Thread.sleep(5000);   // let job run a few times

      StdSchedulerFactory.getDefaultScheduler().shutdown();
   }
   
}

Sample Output

Executing firstJob with {}
Chaining secondJob
Executing secondJob with {jobValue=5420204983304142728, jobTime=Sat Mar 02 15:19:29 PST 2013}
Executing firstJob with {}
Executing firstJob with {}
Chaining secondJob
Executing secondJob with {jobValue=-2361712834083016932, jobTime=Sat Mar 02 15:19:31 PST 2013}
Executing firstJob with {}
Chaining secondJob
Executing secondJob with {jobValue=7080718769449337795, jobTime=Sat Mar 02 15:19:32 PST 2013}
Executing firstJob with {}
Chaining secondJob
Executing secondJob with {jobValue=7235143258790440677, jobTime=Sat Mar 02 15:19:33 PST 2013}
Executing firstJob with {}

Deficiencies

Well, we’re up and chaining, but there are some problems with this approach:

  • It doesn’t integrate with a container like Spring to use configured jobs. More code would be required.
  • It forces you to know up front which jobs you want to chain, and write code for it.
  • Configuration is fixed, unless, once again, you write more code.
  • No real-time changes (unless you write more code).
  • A fair bit of code to maintain , and high likelihood you will have to expand it for more functionality.

The theme here is that it’s doable, but it’s up to you to do the work to make it happen. Obsidian avoids these problems by making chaining configurable, instead of it being a feature of the job itself. Read on to find out how.

Chaining in Obsidian

In contrast to Quartz, chaining in Obsidian requires no code and no up-front knowledge of which jobs will chain or how you might want to chain them later. Chaining is a form of configuration, and like all job-related configuration in Obsidian, you can make live changes at any time without a build or any code at all. Job configuration can use a native REST API or the web UI that’s included with Obsidian.

The following chaining features are available for free:

  • No code and no redeploy to add or remove chains.
  • You can chain specific configurations of job classes.
  • You can chain only on certain states, including failure.
  • Chain conditionally based on source job saved state (equivalent to Quartz’s JobDataMap), including multiple conditions. Regexp/Equals/Greater than, etc.
  • Chain only when matching a schedule.

Check out the feature and UI documentation to find out more.

Now that we know what’s possible, let’s see an example. Once you have your jobs configured, just create a new chain using the UI. REST API support will be here shortly but as of 1.5.1 chaining isn’t included in the API. If you need to script this right now, we can provide pointers.

In the UI, it looks like the following:

Chaining UI

Easy, huh? All configuration is stored in a database, so it’s easy to replicate it in various environments or to automate it via scripting. As a bonus, Obsidian tracks and shows you all chaining state including what job triggered a chained job. It will even tell you why a job chain didn’t fire, whether it’s because the job status didn’t match, or one of your conditions didn’t.

Conclusion

That summarizes how you can go about chaining in Quartz and Obsidian. Quartz definitely has a minimalist approach, but that leaves developers with a lot of work to do.

Meanwhile, Obsidian provides rich functionality out of the box to keep developers working on their own rich functionality, instead of the plumbing that so often seems to consume their time. If you have any suggestions or feature requests for Obsidian, drop us a note by leaving a comment or by contacting us.

Comparing Job Development in Quartz and Obsidian

Getting your program code to the point that it satisfies the functional requirements provided is a milestone for developers, one that hopefully brings satisfaction and a sense of accomplishment. If that code must be executed on a schedule perhaps for multiple uses with custom schedules and configurable parameters, this can mean a whole new set of problems.
We’re going to compare how we would write a job in Quartz and one in Obsidian that would satisfy the above requirements. We’ll use the example scenario of a recurring report. In this scenario, the report has the following dynamic criteria: it is emailed to a specified user, the report format can be selected, either PDF or Excel, and of course the execution frequency varies by user.

The following will be the sample class we’ll use to satisfy these requirements.

public class MyReportClass {
    public void emailReport(String emailAddress, String reportFormat) {
	…generate report in desired format
	…email report to user
    }
}

For the purpose of this exercise, we will leave this class alone and write a wrapper class for scheduling, allowing for its continued use in non-scheduled contexts.

Let’s start with Obsidian. All Obsidian jobs start with implementing a single interface: SchedulableJob. Our Obsidian job class will look something like this:

import com.carfey.ops.job.Context;
import com.carfey.ops.job.SchedulableJob;
import com.carfey.ops.job.param.Configuration;
import com.carfey.ops.job.param.Parameter;
import com.carfey.ops.job.param.Type;

@Configuration(knownParameters={
		@Parameter(name= MyScheduledReportClass.EMAIL, type=Type.STRING, required=true),
		@Parameter(name= MyScheduledReportClass.REPORT_FORMAT, type=Type.STRING, defaultValue="PDF", required=true)
}) 
public class MyScheduledReportClass implements SchedulableJob {
	public static final String EMAIL = "email";
	public static final String REPORT_FORMAT = "reportFormat";

	public void execute(Context context) throws Exception {
		String email = context.getConfig().getString(EMAIL);
		String reportFormat = context.getConfig().getString(REPORT_FORMAT);
		new MyReportClass().emailReport(email, reportFormat);
	}
}

You’ll notice we can annotate the class with the required parameters. This ensures that when this job is scheduled for execution, the email and reportFormat parameters will always be available. Obsidian will not allow the job to be configured without these values and will also ensure their type. But we wouldn’t mind going a step further. We’d like to validate the reportFormat is valid. How can we do so before the job is run?
We can change our class to implement ConfigValidatingJob and implement the necessary method.

Now our class looks like this:

import com.carfey.ops.job.ConfigValidatingJob;
import com.carfey.ops.job.Context;
import com.carfey.ops.job.config.JobConfig;
import com.carfey.ops.job.param.Configuration;
import com.carfey.ops.job.param.Parameter;
import com.carfey.ops.job.param.Type;
import com.carfey.ops.parameter.ParameterException;
import com.carfey.suite.action.ValidationException;

@Configuration(knownParameters={
		@Parameter(name= MyScheduledReportClass.EMAIL, type=Type.STRING, required=true),
		@Parameter(name= MyScheduledReportClass.REPORT_FORMAT, type=Type.STRING, defaultValue="PDF", required=true)
}) 
public class MyScheduledReportClass implements ConfigValidatingJob {
	public static final String EMAIL = "email";
	public static final String REPORT_FORMAT = "reportFormat";

	public void execute(Context context) throws Exception {
		String email = context.getConfig().getString(EMAIL);
		String reportFormat = context.getConfig().getString(REPORT_FORMAT);
		new MyReportClass().emailReport(email, reportFormat);
	}

	public void validateConfig(JobConfig config) throws ValidationException, ParameterException {
		String reportFormat = config.getString(REPORT_FORMAT);
		if (!"PDF".equalsIgnoreCase(reportFormat) && !"EXCEL".equalsIgnoreCase(reportFormat)) {
			throw new ValidationException("Report format must be either PDF or EXCEL");
		}
	}

}

That’s it! Our job will now only accept being scheduled with an email address specified and a valid report format specified. You could easily extend this to other types of custom validation, such as ensuring the email address is valid or perhaps that is in an allowable domain.

Now for Quartz. Let’s first of all identify some differences. Quartz doesn’t provide any mechanisms for ensuring parameters are specified or are valid before runtime. And since Quartz doesn’t provide an execution context, the best you can do when you write your own code to do so is validate the parameters on startup. Our sample below will follow the easiest approach in Quartz, to simply fail the job at runtime if the report format is invalid.

import org.quartz.Job;
import org.quartz.JobDataMap;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;


public class MyScheduledReportClass implements Job {
	public static final String EMAIL = "email";
	public static final String REPORT_FORMAT = "reportFormat";

	public void execute(JobExecutionContext context) throws JobExecutionException {

        JobDataMap data = context.getMergedJobDataMap();
        String email = data.getString(EMAIL);
        String reportFormat = data.getString(REPORT_FORMAT);
		if (!"PDF".equalsIgnoreCase(reportFormat) && !"EXCEL".equalsIgnoreCase(reportFormat)) {
			throw new JobExecutionException("Report format must be either PDF or EXCEL");
		}
		new MyReportClass().emailReport(email, reportFormat);
	}

}

You may be thinking that the classes seem fairly comparable and I would agree. But with the Obsidian job, there’s nothing else that needs to be done. Since setting the runtime schedule and specifying parameters tend to be fluid, those are not done in code or even in static configuration. Using Obsidian’s UI or REST API, you specify the schedule and parameters for each instance or version of the job that is needed.

Obsidian always provides an execution context that can be standalone or be embedded as a part of an existing execution context.

Quartz never provides an execution context. Unless you are deploying in a servlet container, you always need to initialize the scheduling environment. Even when using a servlet container, you must help Quartz along. That means that With Quartz, you’ve only a portion of the code and/or configuration you’ll need.

Initialize the scheduler:

SchedulerFactory sf = new StdSchedulerFactory();
Scheduler scheduler = sf.getScheduler();
scheduler.start();

No administration console and no REST API means code and/or config to schedule and parameterize your job.

JobDetail job = newJob(MyScheduledReportClass.class).withIdentity("joe's report", "group1").usingJobData(MyScheduledReportClass.EMAIL, "joe@****.com").usingJobData(MyScheduledReportClass.REPORT_FORMAT, "PDF").build();
Trigger trigger = newTrigger().withIdentity("trigger1", "group1").startNow().withSchedule(dailyAtHourAndMinute(1, 30)).build();
scheduler.scheduleJob(job, trigger);

Now this may not seem too bad, but now imagine that Joe says he wants the report in Excel, not PDF. Are you really going to say that it requires code changes, followed by a build, followed by testing, acceptance, and promoting a new release?

True, some of the above can be moved to configuration files. While that may avoid a build cycle, it does present its own set of issues. You still have to push new configuration files, restart the jvm process and deal with potential mistakes in the new configuration files that could potentially derail all scheduling.

This also doesn’t get into the issues surrounding misfires, Job Concurrency and execution exception handling and recoverability discussed here.

What do you think? Share your experiences using Quartz for scheduling in your java projects by leaving a comment. We’d like to hear from you.

Configuring Clustering in Quartz and Obsidian Schedulers

Job scheduling is used on many software projects to enable both internal jobs and third-party integration. Clustering can provide a huge boost to reliability by providing fail-over and load-sharing. I believe that clustering should be implemented for reliability on just about all software projects, so I’ve decided to outline how to go about doing that in two popular cluster-enabled Java job schedulers. This post is going to cover how to set up clustering for Quartz and Obsidian. It will explain what work is required to configure each, and help you watch for some common pitfalls. This guide will assume you have both schedulers running in the base configuration already.

Both Quartz and Obsidian have their strong points, and this post won’t debate which is better, but it will provide the information you need to cluster either one.

Quartz

Quartz is the most popular open-source job scheduling option, and it allows for you to cluster scheduler instances via the JDBCJobStore. Though it’s also possible to do clustering with Terracotta without a persistent job store, I recommend against it for most projects since job execution history is very useful to ongoing operations and troubleshooting, and database-clustering performs adequately in all but extreme cases.

Obsidian

Obsidian is a commercial job scheduler which provides free individual instances, and a single clustering licence free for a year. It provides a similar type of clustering as Quartz, and it also provides a full UI, a REST API, downtime recovery, any many other advanced configuration options.

Configurating Obsidian for Clustering

I’ll cover Obsidian first simply because there’s little to do in comparison to Quartz. Since running Obsidian always uses a database, if you have an instance running, there will be no database configuration to update. If you haven’t set it up yet, check out the Getting Started guide – the installation package comes with an interactive Ant tool to build the properties file for you.

So here’s the thing: to cluster Obsidian, simply start up additional instances. Obsidian doesn’t have a non-clustered mode, and all instances automatically handle adjusting the load-sharing algorithm when new members join the cluster (or drop out). Your system clocks should be roughly in sync, but if they differ by a few seconds, it’s not a problem since Obsidian gracefully handles this. Still, you may synchronize your server times if desired.

Note to those using the free version: right after download, you can start two instances and they will automatically join the same cluster. If you do not have adequate licences, new members will not be able to join the cluster.

There’s no need to deploy different properties files or anything like that. As an example, if you use Amazon’s EC2 service, you could use the same image for multiple nodes in the cluster and everything would work correctly. Each cluster member will automatically assign itself a unique instance name based on its local host name and a unique suffix if required.

However, we recommend you assign each cluster member an explicit name which will help with troubleshooting if there are issues on a specific host. To do so, simply set the Java system property schedulerDesignation to the host name of your choice. For example, if starting an instance using the standalone scheduler using java directly, simply add the value -DschedulerDesignation=myHostName to the end of the command.

That’s it! Clustering Obsidian is literally as easy as copying an installation and starting it on another server or virtual machine. As a bonus, unlike other schedulers, in Obsidian you can set a job to only run on a specific host, even with clustering enabled.

In summary, the steps are:

  1. Grab a copy of your Obsidian installation (WAR, standalone package or embedded application bundle).
  2. Start it up! At this step you can provide the schedulerDesignation you wish to use.

Configurating Quartz for Clustering

For Quartz, we’re only going to cover configuring database clustering since we believe it is the right choice to most projects.

Note before you get started: If you have a non-clustered version of Quartz running, you must shut it down before starting any cluster-enabled instances.

Another note about timing: Since Quartz uses very aggressive timing, you must ensure your different instances have precisely synchronized times. Even a second difference will cause your load-sharing to be effectively disabled and all jobs will run on a single host.

Setting up clustering for Quartz can be a bit overwhelming since it exposes so many properties, and it’s hard to figure out which must be added to your properties file to get up and running, but I will try to simplify the process as much as possible.

Quartz is configured via the quartz.properties file. Here’s a sample file that outlines the bare minimum properties you will have to configure for clustering. If you want to see full details on any portion of the config, see the Quartz configuration reference. Note that the properties file should be identical on all hosts, with the one exception being the org.quartz.scheduler.instanceId which is used to identify different hosts.


# Basic Quartz configuration to provide an adequate pool of threads for execution

org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 25

# Datasource for JDBCJobStore
org.quartz.dataSource.myDS.driver =com.mysql.jdbc.Driver
org.quartz.dataSource.myDS.URL = jdbc:mysql://localhost:3306/scheduler
org.quartz.dataSource.myDS.user = myUser
org.quartz.dataSource.myDS.password = myPassword

# JDBCJobStore
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.dataSource = myDS

# Turn on clustering
org.quartz.jobStore.isClustered = true

org.quartz.scheduler.instanceName = ClusteredScheduler
# If instanceIf is set to AUTO, if will auto-generate an id automatically.
# I recommend giving explicit names to each clustered host for easy identification.
org.quartz.scheduler.instanceId = Host1

Note about JDBC settings: For whatever reason, you have to configure both the JDBC driver class and the job store’s “driver delegate” class. These will have to be set to the appropriate value for your database platform.

As I mentioned, these are the bare minimum properties you will have to configure to get clustering running. The one big pain with this configuration is that the instanceId which identifies hosts resides in the same properties file as all the other properties which should all remain the same. Keeping different versions in sync can problematic, and isn’t required for Obsidian. You can use the “AUTO” setting to avoid having to set explicit instance IDs, but as with Obsidian, we recommend you give explicit names as it can help you locate where issues are happening more quickly if you know up front which host is which.

So the main steps to enabling clustering are:

  1. Prepare properties for each cluster member. Ensure only the org.quartz.scheduler.instanceId varies in the properties file.
  2. Turn off all running instances by shutting down the application which is running it, or disabling just the Quartz process (see here).
  3. Start your application, or start the Quartz process (see here).
  4. Ensure that you never start a non-clustered instance again!

Conclusion

I hope this helps you get off the ground quickly with your new or existing project. Clustering is a great feature in any scheduler, and I feel it provides a lot of value that software and operations teams might be missing out on.

Why you shouldn’t use Quartz Scheduler

If you need to schedule jobs in Java, it is fairly common in the industry to use Quartz directly or via Spring integration. Quartz’ home page at the time of writing claims that using quartz is a simple 3-step process: Download, add to app, execute jobs when you need to. For any of you that actually have experience with Quartz, this is truly laughable.

First of all, adding the quartz library to your app does not begin to ready your application to schedule jobs. Getting your code to run in a schedule with quartz is anything but straightforward. You have to write your implementation of the job interface, then you have to construct large xml configuration files or add code to your application to build new instances of JobDetails, Triggers using complex api such as
 .withIdentity("myTrigger", "group1").startNow().withSchedule(simpleSchedule()
.withIntervalInSeconds(40).repeatForever()).build()
and then schedule them using Schedule instance from a ScheduleFactory. All this is code you have to write for each job or the equivalent xml configuration. Quite the headache for something that was supposed to simply be “execute jobs when you need to”. Even in Quartz’ tutorial, it takes 6 lessons to setup a job. What happens when you need to make a change to a job’s schedule? Temporarily disable a job? Change the parameters bound to a job? All these require a build/test/deploy cycle which is impractical for any organization.

Quartz is also deficient in its feature set. Out of the box, it is just a code library for job execution. No monitoring console for reviewing errors and history, no useful and reasonably searchable logging, no support for multiple execution nodes, no adminstration interface, no alerts or notifications, inflexible and buggy recovery mechansims for failed jobs and missed jobs.

Quartz does provide add-on support for multiple nodes, but it requires additional advanced configuration. Quartz also provides an add-on called Quartz Manager, it too needs additional advanced configuration, is a flash app and is incrediby cumbersome and impractical to use.

Simply put, Quartz doesn’t meet these basic needs:

  • No out of the box support for multiple execution nodes (pooling or clustering)
  • No adminstration UI that allows all job scheduling and configuration to be done outside of code
  • No monitoring
  • No alerts
  • Insufficient mechanisms for dealing with errors/failures and recovery

All this means Quartz is not really a justifiable choice as an enterprise scheduler. It is feature poor and has high implementation and ongoing utilization costs in terms of time and energy.

Obsidian Scheduler really is the best choice for your java-based applications. You truly can be up and running the same day you download it. We have a live, interactive demo where you can try out the interface and see first-hand how easy it is to add/change/disable jobs, to monitor all the node activity, disable/enable nodes and even take advantage of advanced schedule configuration such as chaining and sticky nodes.

In addition to our product website, we’ve discussed Obsidian’s standout features many times here on our blog. Download it today and give it a try!

Scheduler Monitoring Done Right

One of the most important features of Obsidian is the ability to be notified of application events and also quickly locate and correct any issues that arise. While products like Quartz and cron4j give us the basic scheduling, we have to develop our own solutions for monitoring and notification of events. Unfortunately, in many corporate environments, developers are very limited in how much time they can work on infrastructure pieces and things like error handling and monitoring end up getting far too little focus. When you are developing in-house or revenue-generating software, managers just want the product out the door, and aren’t worried about maintenance implications until far too late.

With these realities in mind, we’ve built in 3 levels of functionality to facilitate monitoring. Based on decades of software and operations experience, we’ve identified the primary areas where developers and support teams spend their time, and the ways that we could reduce costs so that using our scheduler would be a huge gain for our clients.

Level One – Logging Console

The first problem we identified is many applications rely on text logs generated by frameworks like log4j to troubleshoot and locate issues. The problem is, these logs can be very verbose, especially over time, and if not correctly configured, the information you need to troubleshoot an issue just may not be there. It is also a major pain to find what you are looking for within these logs. Add in issues with access control for production environments and delays from change requests, and you have a major headache on your hands.

To alleviate this problem, we developed a dashboard console within our scheduler web application. This dashboard has records of dozens of types of application events. This dashboard can be made accessible to read-only users, and can even be used by support teams with little software experience to see issues when they arise. Events can be filtered by type, time, severity and the contents of the message. Any time an event worth logging occurs, it is put into the dashboard history and can then be navigated from the dashboard console.

The real beauty of this approach is that the information is available at your fingertips and can easily be located again and again with a few simple clicks. All events are logged so there is no chance of missing information because it is at too fine a level, and you can easily configure our default job to clean old dashboard information. It also avoids the issue of granting production access to support teams, or having to contact administrators to provide logs.

Obsidian Log Dashboard

Level Two – Notifications

Another big issue of system monitoring and maintenance is knowing immediately when things go wrong, or when unexpected things happen. We knew we needed some kind of notification support, but we had to carefully consider the best way to provide this to our users so they have the flexibility and simplicity they require to maintain their applications inexpensively.

Since we had such a clean and detailed system to log application events, we leveraged this to provide event notifications. Every event may be tied to a specific entity – this may be a scheduled job, a job configuration, etc. We also have severity for every event which range from Trace to Fatal (inspired by logging levels).

The natural way to approach notification was to allow users to subscribe to a certain severity level for events of a certain type. Not only can they be alerted when errors occur, but they can also subscribe to informational events or warnings. Users can also target a specific entity to limit when they are notified – for example, they may wish to know when any job fails, or when a specific job fails. We outlined every type of subscribable event and exposed a clean and simple interface within our administration console so that users could quickly change their event subscriptions, and even temporarily disable them.

We include email notification support out of the box currently and this also gives virtually all mobile users access to SMS notifications as well – see Wikipedia’s list of SMS email gateways to see the address to use for your carrier.

Obsidian Notifications

Level Three – Full and Complete Application Logging

While we log just about any event of any relevance within Obsidian, we also know that organizations have current standards and infrastructure for logging in place. In response to this, we use log4j to log all events logged to the dashboard, plus additional diagnostic information. Since we use the well-established log4j framework, you have access to all its appender types including text appenders, network appenders and JDBC appenders. In this way, developers can have customize Obsidian’s logging to fit their needs. We still believe our notification and event logging approach is tough to beat, but we want to make life as easy for our customers as possible. This approach also makes it easy for our clients to embed Obsidian within their applications while keeping logging consistent and simple.

If you have any questions about our experience or would like to suggest a feature for Obsidian’s monitoring and logging, leave a comment.

Obsidian Scheduler 1.2 Released!

We’ve added more to Obsidian Scheduler. Version 1.2 is now available here.

Features added in this release include

  • Description annotation on job class enables inline help in admin web app
  • Scheduled one-time runs
  • New Job Status supporting any unscheduled events (ad hoc & chaning)
  • Bundled Winstone support for quick start
  • Option to wait indefinitely for job completions on graceful shutdown
  • Licence key release on graceful shutdown

As always, our live online demo has been updated with this latest release. Fully functional and completely open for your use at http://demo.carfey.com.

Scheduler Fault Tolerance & Load Balancing

Obsidian Scheduler provides enterprise scheduling features while natively supporting pooling and clustering, or in other words, load balancing and fault tolerance. But Obsidian does so in a way that is painless and non-invasive. In fact, you don’t have to do anything. Load balancing and fault tolerance are built into each instance of Obsidian Scheduler whether you choose to run it inside the web admin app, embedded in your own application, as a standalone or any combination of these. This is critical for a scheduler since you could encounter software/hardware faults, unanticipated load or any number of other things that could cripple or bring down a scheduler instance that would otherwise impact critical items from firing. This is where pooling & clustering fits so well.

In fact, we are so passionate about fault tolerance and load balancing, that we don’t offer a single node version of Obsidian. All licences are a minimum of two nodes and your fully functional trial allows you to see two nodes running without any functional restriction. We want you to have, at minimum, a second instance running to ensure your scheduled jobs run on time and that a failure doesn’t prevent other scheduled items from completing or subsequent instances from firing.

Many enterprise server solutions support pooling and clustering but often utilize a variety of complex configuration strategies and/or pool participant inter communication approaches. Obsidian doesn’t need any of these. Every Obsidian Scheduler instance of any type automatically joins the existing pool/cluster or establishes it if it is the first one on the scene. No extra configuration required. No communication between servers necessary. No multicast, no replication of data between servers. This means that you can easily swap out hardware in case of failure or add a new member for load sharing with ease. In fact, if you have standby hardware, you can have it running, awaiting availability of a node licence and it will automatically take over as soon as a node licence is available.

Obsidian also supports fault tolerance of individual jobs. If a job stalls, fails to complete because the instance failed, fails with an exception, didn’t run because no nodes were running, was conflicted by another job – all these are job failure modes that Obsidian provides recovery and tolerance mechanisms for and are all configurable and managed via the web interface. You can even configure specialized job chaining using source chain job state. In an upcoming release, Obsidian will expose internally fully manageable workflow based on source job state and/or its output/results, really, any condition or criteria you may have. You can also use the web interface to subscribe to server and job events at a high level or just target the events you are concerned with so that you’re kept up-to-date without having to login, parse and review log files, etc.

We know that running software in production environments can be unpredictable at times and that all too frequently, bad things happen. We want Obsidian Scheduler to keep you safe and to help you feel secure. Share with us your stories or let us know if you can think of any other ways we can make Obsidian better able to adapt to scheduling problems.

Scheduler Management

When we started working on our Obsidian Scheduler, one of the primary motivations was to give our clients full control over the configuration of their server runtime and total control over job configuration, including schedule, state, parameterization, etc. More than just full control, we wanted the configuration of said items to take effect immediately across all instances, without the need for restarts and without any necessary code or config file changes.

The established means of handling this is having a user interface that fronts the configuration items. But many scheduler solutions including cron4j, default Quartz and Spring’s simple scheduler support do not anticipate needs beyond the scheduling of a job. Even from a development perspective, a developer would reap productivity gains from not having to to crack open code or config files, stop and start the server runtime to make these types of changes and have them take immediate effect.

Liveness of these changes across all running scheduler hosts is accomplished using a single data store to store/view the configuration, ensuring proper concurrency safety to avoid dirty writes, deadlocks and the like. As previously discussed on this blog, using an optimistic locking strategy globally has negligible impact on performance while protecting against dirty writes. By choosing such for Obsidian and making strategic use of pessimistic locks, semaphores and reentrant locks, we can handle 1000s of concurrent jobs and dozens of concurrent hosts on minimal hardware without collisions or failures.

One of the the great features of Obsidian is its administration user interface. A webapp that can run with or without and embedded scheduler instance, this is the control panel, if you will, of your scheduler pool and your runnable jobs.

Here are some examples of configuration items that you can control from Obsidian Scheduler’s Admin UI and whose changes take effect immediately across all pooled instances.

  • Job State – including disabling/enabling a job
  • Job Schedule – cron pattern
  • Future State/Schedules – e.g. disable a job at midnight tomorrow or change a job to run hourly as of next week
  • Job Parameters
  • The script itself in Script Jobs
  • Scheduling a job
  • Remove a running host from the pool – Without shutting it down
  • Job Chaining
  • Job Conflict Configuration – including prioritization

A convenient byproduct of putting all this management control in an interface is that it is very easy to have a group of people have read-only access (e.g. production support) to these configuration items. Obsidian does just that, including a read-only role in both native DB and LDAP controlled modes. Server-level controls are also exposed in Obsidian’s admin interface, secured to the admin role. These items provide fine-grained control over advanced Obsidian Scheduler features.

I’ve mentioned the Obsidian pooling support in passing several times in this post. This is another hallmark feature of Obsidian Scheduler. In our next post, we’ll discuss what makes this feature so special.

Obsidian Scheduler 1.1 Released!

We’ve been working hard to make Obsidian Scheduler even better. Version 1.1 is now available.

Features added in this release include

  • Perfectly distributed load balancing across runnings hosts within seconds of pool membership changes
  • Header icons indicating the number of active hosts
  • Ability to disable targeted scheduler instances without shutting them down
  • Ad hoc job submission support
  • Job State/Schedule UI enhancements

Our demo has been updated with this latest release. Check it out at http://demo.carfey.com and download the update here.