Scheduling a Job in Quartz Versus Obsidian

We frequently compare Quartz and Obsidian in our blog, and today we’re going to see the difference in how you would schedule a job for recurring execution in both pieces of software.

A Quick Note on Job Configuration

Before we show the APIs you would use for Quartz and Obsidian, I’ll first mention that an API typically isn’t the best approach to scheduling a job. Quartz provides a mechanism to configure jobs via XML, and Obsidian provides a full administration and monitoring web application for you to schedule jobs.

However, some use cases definitely suggest API integration, so let’s move forward with that.

Quartz Example

Let’s see what scheduling a job to run every half hour looks like in Quartz.


import org.quartz.*;
import org.quartz.impl.StdSchedulerFactory;

// First, create the scheduler
SchedulerFactory sf = new StdSchedulerFactory();
Scheduler scheduler = sf.getScheduler();

// Build up the job detail
JobDetail job = JobBuilder.newJob(HelloWorld.class)
    .withIdentity("HelloWorldJob", "HelloWorldGroup")
    .build();

// Add some configuration
job.getJobDataMap().put("planet", "Earth");

// Create the schedule (fields: seconds minutes hours day-of-month month day-of-week)
CronScheduleBuilder schedule = CronScheduleBuilder.cronSchedule("0 0/30 * * * ?");

// Create the trigger
CronTrigger trigger = TriggerBuilder.newTrigger()
    .withIdentity("HelloWorldTrigger", "HelloWorldGroup")
    .withSchedule(schedule)
    .build();

// Schedule the job with the created trigger.
scheduler.scheduleJob(job, trigger);

scheduler.start(); // This is how you start the scheduler itself if you need to

// Later, we can shut down the scheduler 
scheduler.shutdown();

As you can see, first you have to get a handle on the scheduler instance (or create it), then you create the JobDetail, CronScheduleBuilder and CronTrigger, and then you can finally schedule the job itself.

There are a few seemingly superfluous steps here, and some extraneous properties like job identities, trigger names, trigger groups, etc. that you have to provide, but this is the basic template you’d use.

Obsidian Example

Now let’s see how it’s done in Obsidian. We are using the same job class (let’s assume it satisfies both Quartz and Obsidian’s job interfaces), and we’ll use the same every-half-hour schedule.
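For reference, here is a sketch of what such a dual-purpose HelloWorld class might look like. This exact class is an assumption made for these examples, not something either product ships; the two execute() overloads are distinguished by their parameter types:

public class HelloWorld implements org.quartz.Job, com.carfey.ops.job.SchedulableJob {

   // Quartz entry point
   public void execute(org.quartz.JobExecutionContext context)
         throws org.quartz.JobExecutionException {
      sayHello(context.getMergedJobDataMap().getString("planet"));
   }

   // Obsidian entry point
   public void execute(com.carfey.ops.job.Context context) throws Exception {
      sayHello(context.getConfig().getString("planet"));
   }

   private void sayHello(String planet) {
      System.out.println("Hello, " + planet + "!");
   }
}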

// Create our configuration parameters
List<ConfigurationParameter> parameters = Arrays.asList(
        new ConfigurationParameter().withName("planet")
                                    .withType(ParameterType.STRING)
                                    .withValue("Earth")
);

// Set up the job configuration
JobCreationRequest request = new JobCreationRequest()
        .withNickname("HelloWorld")
        .withJobClass(HelloWorld.class.getName())
        .withState(JobStatus.ENABLED)
        .withSchedule("0/30 * * * *")
        .withRecoveryType(JobRecoveryType.LAST) // how to recover failed jobs
        .withParameters(parameters);

// Now actually save this configuration, which becomes active on all schedulers.
JobDetail addedJob = new JobManager().addJob(request, "Audit User");
System.out.println("Added new job: " + addedJob);

// If you need to start an embedded scheduler, use this line to initialize it.
SchedulerStarter scheduler = SchedulerStarter.get(SchedulerMode.EMBEDDED);

// Later, we can gracefully shut down the scheduler
scheduler.shutDown();

As you can see, Obsidian keeps things simple and does away with extraneous properties which don’t help you develop and manage your jobs. You simply create a JobCreationRequest with the required fields above, including any ConfigurationParameters, and call JobManager.addJob() with the job configuration and an optional audit user, which is used to track changes.

This API call saves your configuration to the Obsidian database, so it is instantly propagated to all schedulers in your cluster, and it outlives restarts. Many of our users take advantage of this API to perform one-time initialization of job schedules, after which they use our powerful web application to make changes as they are required.

This API was carefully designed to expose the powerful features our users demand, while still remaining simple to use. This sample will give you the basic template for how to schedule a job in Obsidian, but if you need more detail or want to use other extended features, you can check out our full Embedded API documentation.

If you have any questions about Quartz versus Obsidian, or perhaps about Obsidian features, please leave a note here or contact us and we’ll be happy to reply.

Comparing Quartz, cron4j and Obsidian Scheduler

We’ve all worked on projects that required us to do very basic tasks at periodic intervals. Perhaps we chose a basic ScheduledThreadPoolExecutor. If we’re already using Spring, maybe we tried their TaskExecutor/TaskScheduler support.
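For instance, a minimal fixed-rate task with ScheduledThreadPoolExecutor looks something like this (a sketch; the class and task names are illustrative):

import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BasicScheduling {
   public static void main(String[] args) {
      ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(2);
      // run every 30 minutes, starting immediately
      executor.scheduleAtFixedRate(new Runnable() {
         public void run() {
            System.out.println("Running periodic task");
         }
      }, 0, 30, TimeUnit.MINUTES);
   }
}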

But once we encounter any number of situations such as an increased quantity of tasks, new interdependencies between tasks, unexpected problems in task execution or the like, we will likely start to consider a more extensive scheduling solution.

Our website has a fairly exhaustive feature comparison of the most commonly used Java schedulers, so we won’t go into that in this post, but we do encourage you to take a look.

Features aside, are there other criteria that should come into play? Factors such as development team responsiveness to feature requests and bug reports certainly can be critical for many organizations. If you head over to the Quartz Download page, you’ll see that they haven’t had a release in over a year, despite there being many active unresolved issues. Cron4j hasn’t had a release in over 2 years. While Spring has made some changes to the design of their TaskExecutor/TaskScheduler support in recent releases, their true priorities lie elsewhere as they have not really done much to expand the feature set.

Obsidian Scheduler on the other hand is actively maintained, actively supported (with free online support!) and responsive to our user community. In the past year, we’ve averaged a release per month delivering a blend of features, enhancements and fixes, proof that we’re a nimble and responsive organization. We encourage you to give Obsidian a try today!

Java Enterprise Software Versus What it Should Be

A lot of developers end up in the Java “enterprise” world at some point in their careers. I know the term alone conjures up all kinds of reactions, and rightly so. Environments full of interesting technical challenges often end up being the ones nobody wants to work on because they are brittle, hard to change and just no fun. Often the problems that crop up in big projects are due to management, but I’ve seen developers make lots of bad decisions that lead to awful pieces of software, all in the name of “enterprise”.

What is Enterprise?

You could argue that the term can mean just about anything, and that’s true, but for the sake of this article I’m going to define it in a way that I think lines up with the common usage. The average enterprise project has these attributes:

  • typically in a large corporate environment
  • multiple layers of management/direction involved
  • preference for solutions from large vendors like Red Hat, IBM or Microsoft
  • preference for well-known, established (though sometimes deficient) products and standards
  • concerns about scaling and performance

Now that I’ve defined what type of project we are talking about, let’s see what they usually end up looking like.

The Typical Enterprise Java Project

Most of us have seen the hallmarks of enterprise projects. It would help if we had an example, so let’s pretend it’s an e-commerce platform with some B2B capabilities. Here’s what it might look like:

  • EJB3 plus JPA and JSF – these fit a “standard” and everyone uses them, so it’s a safe choice.
  • SOAP – it’s standard and defines how things like security should work, so there’s less to worry about.
  • JMS Message-Driven Beans – it fits into the platform and offers reliability and load-balancing.
  • Quartz for job scheduling – a “safe” choice (better the devil you know than the devil you don’t).
  • Deployed on JBoss – it has the backing of a large company and paid support channels.

Now, the problem with a project like this isn’t necessarily the individual pieces of technology selected. I definitely have issues with some of the ones in my example, but the real issue is how the choices are made and what the motivations are for using certain technologies.

The stack of software above is notoriously more difficult to manage and work with compared to other choices. Development will take longer to get off the ground, changes will be more difficult to make as requirements evolve, and the project will ultimately end up more complicated than other possible solutions.

Enterprise Decision-Making

The goals that enterprise projects usually fixate on when making choices are:

  • Low-risk technology – make a “safe” choice that won’t result in blow-back, even if it is known to have serious drawbacks.
  • Standards obsession – worrying more about having well-defined specifications like EJB3 or SOAP than offering the simplest solution that does the job effectively.
  • Need for paid support with an SLA, often without concern for the quality or timeliness of responses.
  • Designing out of fear of unknown future requirements.

These goals aren’t bad ones, with the exception of the last one, but they tend to overshadow the real goals of every software project. The primary objective of all software projects is to deliver a product that:

  • is on-time;
  • meets requirements;
  • is reliable;
  • performs well; and
  • is easy to maintain and extend.

These should be what decision-makers focus on in software projects, whether small or large. It’s obvious that sometimes special organizational needs factor into the choices that are made, but fundamentally good choices generally apply in all types of organizations.

So what if we were to reimagine our project with those goals in mind instead?

A Reimagined Enterprise Project

First, a little disclaimer: there are many ways one can go in any project, and I won’t claim that the technologies below are better than the ones mentioned previously. Tools need to be evaluated according to your needs, and everyone’s are different.

What I will try to do is demonstrate an example technology stack along with the reasoning for each choice. This will show how well-designed systems can be built that survive in the enterprise world without succumbing to the bad choices that are often made.

Here’s the suggested stack:

  • Spring MVC using Thymeleaf – stable history, lots of development resources, quick development and flexibility. Don’t be afraid of using platforms or libraries, but try to avoid too much “buy-in” to their stack that you might regret.
  • Simple database layer using jOOQ for persistence where useful. This lets us manage performance in a more fine-grained way, while still getting easy database interaction and avoiding ORM pitfalls.
  • REST using Jackson JSON processor – REST and JSON are both popular because they are easy to use and understand, cheap to develop, use simple standards and are familiar to developers. Lock-in isn’t much of a problem either – we could easily switch to another JSON processor without much difficulty, unlike SOAP. This can be easily secured with SSL and basic authentication.
  • JMS messaging using JSON-encoded messages on ActiveMQ – loose coupling, reliability and load-balancing, without the nastiness of being stuck with Message-Driven Beans.
  • Obsidian Scheduler – simple-to-use, offers excellent monitoring and reduces burden on developers. Once again, the goal is to simplify and reduce costs where possible.
  • Deployed on Tomcat – no proprietary features used. This helps us follow standards, avoid upgrade issues and keep things working in the future. Who needs support with SLAs when things aren’t inexplicably breaking all the time?

I think the stack and corresponding explanations above give an idea of what an enterprise project can be if it’s approached from the right angle. The idea is to show that even enterprise projects can be simple and nimbly built – bloated frameworks and platforms aren’t a necessary part, and rarely offer any significant tangible benefit.

In Closing

Recent trends in development towards technologies like REST have been very encouraging, and inroads are being made into the enterprise world. Development groups are realizing that simplicity leads to reliability and cost-effective solutions, so long as the underlying technology choices meet the performance, security and other needs of the project.

The software world moves quickly, and is showing promising signs that it’s moving in the right direction. I can only hope that one day the memories of bloated enterprise platforms fade into obscurity.

Why You Still Shouldn’t Use Quartz Scheduler

When I first wrote on this topic, I lamented that the de facto standard for scheduling jobs in Java applications is use of the Quartz scheduling library. Why? At the time, I explained:

  • Enabling your application to schedule jobs with Quartz takes significant code and/or configuration.
  • Changes in job schedule or other parameters require code/configuration changes with a corresponding software build/test/deploy cycle.
  • Quartz is just a library and not a scheduling solution. It has no built-in UI for configuration or monitoring, no useful and reasonably searchable logging or historical review, no support for multiple execution nodes, no administration interface, no alerts or notifications, not to mention buggy recovery mechanisms for failed jobs and missed jobs.

Quartz hasn’t changed much in the two years since I wrote that post. A single minor release and two maintenance releases have done nothing to address its pain points.

It’s true that Quartz has some free and even commercial add-ons to address some of these issues, but these require additional configuration, add complexity to your software environment and are by nature afterthoughts; they will never be able to fundamentally change what Quartz is – a library. Obsidian Scheduler has been designed from the ground up to address all of these shortcomings and more, and is free to use for a single scheduling instance.

Consider some reasons why Quartz could disappoint you should you choose this “standard” for your project.

Scheduling Configuration

Quartz requires code or XML configuration or some combination thereof to schedule jobs. This has multiple drawbacks, the biggest of which is this: verifying your code or configuration changes requires either deploying your software environment in its target environment (WAR, EAR, standalone Spring application, etc.) or some appropriate test harness before finally verifying the results. If there is a severe issue, you must then decipher an API exception message and attempt to make the necessary changes and then restart the verification cycle. This process is long and tedious, especially if you have to make multiple passes to correct issues.

Obsidian, on the other hand, requires no configuration or code to change schedules or other job execution parameters. These changes are validated and applied in real time, or can even themselves be scheduled ahead of time. You can even see a preview of when jobs will run, so you aren’t left wondering if you got your schedule correct. You are free to do this using the UI or the REST API.

[Screenshots: Configure Job, Conflict Management, Chaining]

Execution Context

Quartz is a library. This means that it is nothing more than an API. It does not provide an execution context. Why is an execution context important? An execution context is required to interact with your scheduling environment in real time. Disable a job? Change a schedule? See what jobs are running? Subscribe to scheduling events? Some of these are not possible without an execution context, while others are impractical. Even if you already have an execution context in your application, such as Spring, you need to expose the Quartz API functions in your context and then make them available for invocation in some UI, or expose them in some new or existing context API.
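To illustrate, here is a minimal sketch of what exposing a single Quartz operation in your own context might involve. SchedulerAdminService is a hypothetical class, and you would still need to surface it in a UI or endpoint yourself:

import org.quartz.JobKey;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;

// Hypothetical wrapper bean; neither Quartz nor Spring ships this.
public class SchedulerAdminService {
   private final Scheduler scheduler;

   public SchedulerAdminService(Scheduler scheduler) {
      this.scheduler = scheduler;
   }

   // "Disable a job": pause its triggers via the Quartz API
   public void disableJob(String name, String group) throws SchedulerException {
      scheduler.pauseJob(new JobKey(name, group));
   }
}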

Obsidian includes an execution context that can either run as standalone or embedded into an existing context (such as Spring). This execution context provides many of Obsidian’s useful features, including clustering, interactive scheduling changes via GUI or REST, real-time monitoring, scheduler pausing (global or host-specific) and so on.

Operational Roles

With Quartz, everything relating to job scheduling becomes a function of the development team. Job failed or didn’t trigger when expected? Developer must investigate in the log files. Scheduling change? Developer must make the change, test it and support pushing the change to the target environments. Need to disable the scheduler? More code changes.

With Obsidian Scheduler, we make it possible to reduce the burden on the development team and allocate functions to more appropriate owners. Developers really shouldn’t have any more access to your live environment than is absolutely necessary. If they do need an Obsidian account for support, give them a read-only account that allows them to support the operations and business groups in their work. Or give them the limited-read role if your job configurations contain sensitive information such as IDs or passwords. Business teams or other appropriate designates could be given write access to make job scheduling and parameter changes or submit ad-hoc jobs. Your operations staff can be given the admin role to configure the Obsidian run-time itself (thread pool sizes, system recovery, etc.).

If you’ve ever used Quartz then you’ve likely suffered through these and other issues. Download Obsidian today and experience how productive and efficient job scheduling can be!

Integrating Spring and Obsidian Scheduler

The Spring framework has woven its way into a large percentage of Java projects of all sorts. It has grown well beyond a dependency injection framework into a large interconnected technology stack providing “plumbing” in a great number of areas. Since it’s so often a part of our lives, we’ll discuss Obsidian’s support of Spring and how to quickly integrate your use of Spring into Obsidian Scheduler.

Version Indifferent

First, Obsidian does not include any Spring libraries and is not dependent on Spring to run. Given that Spring is a very active project and has frequent releases, and at times projects have constraints that require using older versions of certain libraries, we felt it would be best if Obsidian just worked with whatever version of Spring you’re using. Obsidian supports versions 2.5 and later of Spring.

Configuring Job Components

The first step in integrating Obsidian Scheduler is often already in place in your project. Obsidian expects to find @Component annotations, or any Spring or custom extensions of those annotations (such as @Service and @Repository), on your job classes. If you’re using these annotations to wire your classes, you likely don’t have to do anything else to complete this step. If you’re not, you will need to add the annotation to serve as a marker for Obsidian. If you happen to have multiple instances of a given bean class in your Spring context, Obsidian will need to know how to uniquely identify the job within the Spring context. In these cases, use the annotation’s value attribute to provide the bean name. For example:

@Component("mySpringComponent")
public class SpringComponent implements SchedulableJob {
    // ... job implementation
}

Providing the Spring Context to Obsidian

The next step is to let Obsidian know about your Spring context. Included with Obsidian is an implementation of ApplicationContextAware. If you’re using classpath scanning in Spring, you can just add the Obsidian package to your configuration. Here is a sample:

<context:component-scan base-package="com.my.package, com.carfey.ops.job.di"/>

If you’re wiring explicitly in a configuration XML file, here is a sample configuration:

<bean id="obsidianAware" class="com.carfey.ops.job.di.SpringContextAware"/>

Startup & Shutdown Options

The third step, which is optional and depends on your desired usage, relates to starting and stopping Obsidian. If you’re configuring Spring integration for use in the Obsidian standalone Web Administration application (with no scheduler running), you would not include this configuration. In all other cases, such as embedded scheduler or Web Administration application with bundled scheduler, you can use this sample:

<bean id="obsidianStarter" class="com.carfey.ops.job.di.SpringSchedulerStarter"/>

This bean should be wired as late as possible in the Spring context startup, using depends-on or other appropriate strategies. This class does not possess a @Component annotation, so it cannot be auto-wired via Spring classpath scanning.
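For example, assuming a hypothetical appInitializer bean that must be in place before the scheduler starts, the wiring might look like this:

<bean id="obsidianStarter" class="com.carfey.ops.job.di.SpringSchedulerStarter"
      depends-on="appInitializer"/>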

Now your Add/Edit Job list in the Web Administration UI will include all implementations of com.carfey.ops.job.SchedulableJob that you’ve wired through Spring. If you’re using Spring 3.0 or later, even jobs that use Obsidian’s annotations com.carfey.ops.job.SchedulableJob.Schedulable and com.carfey.ops.job.SchedulableJob.ScheduledRun are included. See our wiki for more information on implementing jobs using the interface or the annotations.

Good to Go!

You now have complete Spring integration with Obsidian. All jobs wired via Spring annotations will be loaded from the Spring container when performing validation and execution. Note that these could be either prototypes or singletons according to your needs, though we recommend prototypes if you have any kind of internal state within the job.
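If you go the prototype route, here is one way to mark a scanned job class, a sketch using Spring’s standard @Scope annotation:

@Component("mySpringComponent")
@Scope("prototype") // a fresh instance per use, so internal job state isn't shared
public class SpringComponent implements SchedulableJob {
    // ... job implementation
}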

Of course Spring is not the only dependency injection framework available in Java. On the roadmap for Obsidian is similar integration with Google Guice. Stay tuned! Get in touch with us if you have any questions on Spring integration or any other Obsidian Scheduler features.

Reusable Jobs with Obsidian Chaining

A lot of Obsidian users write custom jobs in Java and leverage their existing codebase to perform things like file archiving or file transfer right inside the job. While it’s great to reuse that existing code, sometimes users end up implementing the same logic in multiple jobs, which isn’t ideal, since it means extra QA and potential for inconsistencies and uncaught bugs.

Fortunately, Obsidian’s configurable chaining support combined with using job results (i.e. output) lets developers write a single job as a reusable component and then chain to it where required.

To demonstrate this, we will go through a fairly common type of situation where you have a job which generates zero or more files which must be transferred to a remote FTP site and which also must be archived. We could chain to an FTP job which chains to an archive job, but for the sake of making this example simpler, we will bundle them into the same job.

File Generating Job

First, we’ll demonstrate how to save job results in our source job to make them available to our chained job. Here’s the execute() method of the job:

public void execute(Context context) throws Exception {
   // grab the configured FTP config key for the job 
   // and pass it on to the chained FTP/archive job
   context.saveJobResult("ftpConfigKey", 
         context.getConfig().getString("ftpConfigKey"));

   for (File file : toGenerate) { // toGenerate: files this job produces (declaration elided)
      // ... some code to generate the file

      // when successful, save the output 
      // (multiple saved to the same name is fine)
      context.saveJobResult("generatedFile", file.getAbsolutePath());
   }
}

Pretty simple stuff. The most interesting thing here is the first line. To make the chained FTP/archive job truly reusable, we have configured our file job with a key which we can use to load the appropriate FTP configuration used to transfer the files. We then pass this configuration value on to the FTP job as a job result, so that we don’t have to configure a separate FTP job for every FTP endpoint. However, configuring a separate FTP job for each FTP site is another option available to you, in which case you wouldn’t have to configure the file job with the config key or include that first line.

Next we’ll see how to access this output in the FTP/archive job and after that, how to set up the chaining configuration.

FTP/Archive Job

This job has two key features:

  1. It loads FTP config based on the FTP key passed in by the source job.
  2. It iterates through all files that were generated and deals with them accordingly.

Note that all job results keep their Java type when loaded in the chained job, though they are returned as List<Object>. Primitives are supported as output values, as well as any type that has a public constructor that takes a String (toString() is used to save the values).

public void execute(Context context) throws Exception {
   Map<String, List<Object>> sourceJobResults = context.getSourceJobResults();
   List<Object> fullFilePaths = sourceJobResults.get("generatedFile");
   
   if (fullFilePaths != null) {
      if (sourceJobResults.get("ftpConfigKey") == null) {
         // ... maybe fail here depending on your needs
      }
      String ftpConfigKey = (String) sourceJobResults.get("ftpConfigKey").get(0);
      FTPConfig config = loadFTPConfig(ftpConfigKey);
	  
      for (Object filePath : fullFilePaths) {
         File f = new File((String) filePath);
         // ... some code to transfer and archive the file

         // Note that this step ideally can deal with already processed files
         // in case we need to resubmit this job after failing half way through.
      }
   }
}

Again, this is pretty simple. We grab the saved results from the source job and build our logic around them. As mentioned in the comments, one thing to consider in an implementation like this is handling the case where the job fails after processing only some of the results. You may wish to just resubmit the failed job in a case like that, so it should be able to re-run without causing issues. Note that this isn’t a concern if you only ever have a single file to process.
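One way to make the loop re-runnable, assuming a hypothetical alreadyArchived() helper that consults your archive location:

for (Object filePath : fullFilePaths) {
   File f = new File((String) filePath);
   // alreadyArchived() is a hypothetical check that lets a resubmitted
   // job skip files it already transferred and archived on a prior run
   if (alreadyArchived(f)) {
      continue;
   }
   // ... some code to transfer and archive the file
}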

Chaining Configuration

Now that the reusable jobs are in place, let’s set up the chaining. Here’s what it looks like in the UI:

[Screenshot: chaining configuration]

We use conditional chaining here to indicate we only need to chain the job when values for generatedFile exist. In addition, we ensure that an ftpConfigKey value is set. The real beauty of this is that Obsidian tracks why it didn’t chain a job if the chaining setup wasn’t met. For example, if the ftpConfigKey wasn’t set up, the FTP/archive job would still have a detailed history record with the “Chain Skipped” state and a detailed message like this:

[Screenshot: chain result detail]

Note that in this example conditional chaining isn’t strictly required, since our FTP/archive job handles the case where there are no values for generatedFile, but it’s still a good practice in case you have notifications that go out when a job completes. It also makes your detailed history more informative, which may help you with troubleshooting. If you don’t wish to use conditional chaining, you could simply chain on the Completed state instead.

Conclusion

Obsidian provides powerful chaining features that were deliberately designed to maximize productivity and reliability. Our job is to make your life easier as a developer, operator or system administrator, and we are continually searching for ways to improve the product and provide value to our users.

If you have any questions about the examples above, let us know in the comments.

Using Obsidian Maintenance Jobs

When we began development on Obsidian Scheduler, one of its key objectives was transparency. Quoting my own post,

transparent means letting us know what is going on. We want to have available to us information such as when the job was scheduled to run, when it started, when it completed and which host ran it (in pool scenarios). If it failed, we want to know what the problem was. We want the option to be notified (emailed, paged and/or messaged) when all or specific jobs fail. We want detailed information available to us should we wish to investigate problems in our executing jobs. And we want all of this without having to write code, create our own interface, parse log files or do extra configuration. The scheduler should just work that way.

We’re very happy that we achieved that goal and Obsidian gives us the transparency we sought.

As you likely know, Obsidian stores all this information in a database. This allows you to retain as much of this history as you desire or are required by your business group’s policy. In theory, you could retain all of this history indefinitely in the live database ensuring this information would be available at any time simply by accessing the Obsidian UI. In practice, this is often not required. Many organizations simply maintain an indefinite archive of database backups and can retrieve and restore these to an environment they would use specifically to perform any desired historical investigations.

Obsidian is bundled with two maintenance jobs that can be used to keep the Obsidian database trim, holding only the necessary recent history. These maintenance jobs take advantage of Obsidian’s built-in support for parameterization, allowing you to choose the desired retention period. These jobs are not scheduled by default, so let’s take a look at how we can configure them to run.

JobHistoryCleanupJob

This job cleans up the execution history of your jobs, retaining the most recent history. When scheduling a new job in Obsidian, you will find the Job Class com.carfey.ops.job.maint.JobHistoryCleanupJob in the selection list. Once chosen, you will see that it supports a single required parameter – maxAgeDays. Assuming you want job execution history retained for the most recent 90 days, set this value to 90. Here’s a screenshot of what it would look like.
[Screenshot: JobHistoryCleanupJob configuration]

LogCleanupJob

This job cleans up the audit and information logs that track all activity in Obsidian. As described in our wiki, these logs contain everything from scheduler system activity, such as spawning and execution, to user activity, such as changing a schedule, its configuration or adding new jobs. These logs are classified by severity level, and the Job Class com.carfey.ops.job.maint.LogCleanupJob is configured using the same required parameter maxAgeDays, but has an additional required parameter – level. It defaults to ALL and allows multiple values, and you can apply different retention periods by severity by scheduling and configuring this job once for each classification. Notice these samples:
[Screenshots: LogCleanupJob configurations]

That’s all there is to it! In both cases, if no maintenance job is configured (or, in the case of the LogCleanupJob, none is configured for a given severity level), that data will be retained indefinitely. Of course, you can always decide later on to configure such a job, and at that time Obsidian will take care of it for you.

Is there something else you’d like to see Obsidian do? Drop us a line or leave a comment below. We listen carefully to all our customers’ feature requests and give priority to customers’ needs in our product roadmap. We also appreciate hearing how Obsidian is helping you.

Job Chaining in Quartz and Obsidian Scheduler

In this post I’m going to cover how to do job chaining in Quartz versus Obsidian Scheduler. Both are Java job schedulers, but they have different approaches, so I thought I’d highlight them here and give some guidance to users of both options.

It’s very common when using a job scheduler to need to chain one job to another. Chaining in this case refers to executing a specific job after a certain job completes (or maybe even fails). Often we want to do this conditionally, or pass on data to the target job so it can receive it as input from the original job.

We’ll start with demonstrating how to do this in Quartz, which will take a fair bit of work. Obsidian will come after since it’s so simple.

Chaining in Quartz

Quartz is the most popular job scheduler out there, but unfortunately it doesn’t provide any way to give you chaining without you writing some code. Quartz is a low-level library at heart, and it doesn’t try to solve these types of problems for you, which in my mind is unfortunate since it puts the onus on developers. But despite this, many teams still end up using Quartz, so hopefully this is useful to some of you.

I’m going to outline probably the most basic way to perform chaining. It will allow a job to chain to another, passing on its JobDataMap (for state). This is simpler than using listeners, which would require extra configuration, but if you want to take a look, check out this listener for a starting point.

Sample Code

This will rely on an abstract class that provides basic flow and chaining functionality to any subclasses. It acts as a very simple Template class.

First, let’s create the abstract class that gives us chaining behaviour:

import static org.quartz.JobBuilder.newJob;
import static org.quartz.TriggerBuilder.newTrigger;
import org.quartz.*;
import org.quartz.impl.*;

public abstract class ChainableJob implements Job {
   private static final String CHAIN_JOB_CLASS = "chainedJobClass";
   private static final String CHAIN_JOB_NAME = "chainedJobName";
   private static final String CHAIN_JOB_GROUP = "chainedJobGroup";
   
   @Override
   public void execute(JobExecutionContext context) throws JobExecutionException {
      // execute actual job code
      doExecute(context);

      // if chainJob() was called, chain the target job, passing on the JobDataMap
      if (context.getJobDetail().getJobDataMap().get(CHAIN_JOB_CLASS) != null) {
         try {
            chain(context);
         } catch (SchedulerException e) {
            e.printStackTrace();
         }
      }
   }
   
   // actually schedule the chained job to run now
   private void chain(JobExecutionContext context) throws SchedulerException {
      JobDataMap map = context.getJobDetail().getJobDataMap();
      @SuppressWarnings("unchecked")
      Class<? extends Job> jobClass = (Class<? extends Job>) map.remove(CHAIN_JOB_CLASS);
      String jobName = (String) map.remove(CHAIN_JOB_NAME);
      String jobGroup = (String) map.remove(CHAIN_JOB_GROUP);

      JobDetail jobDetail = newJob(jobClass)
            .withIdentity(jobName, jobGroup)
            .usingJobData(map)
            .build();
         
      Trigger trigger = newTrigger()
            .withIdentity(jobName + "Trigger", jobGroup + "Trigger")
            .startNow()
            .build();
      System.out.println("Chaining " + jobName);
      StdSchedulerFactory.getDefaultScheduler().scheduleJob(jobDetail, trigger);
   }

   protected abstract void doExecute(JobExecutionContext context) 
                                    throws JobExecutionException;
   
   // request a job chain (the chained job fires after this job's doExecute() completes)
   protected void chainJob(JobExecutionContext context, 
                          Class<? extends Job> jobClass, 
                          String jobName, 
                          String jobGroup) {
      JobDataMap map = context.getJobDetail().getJobDataMap();
      map.put(CHAIN_JOB_CLASS, jobClass);
      map.put(CHAIN_JOB_NAME, jobName);
      map.put(CHAIN_JOB_GROUP, jobGroup);
   }
}

There’s a fair bit of code here, but it’s nothing too complicated. We create the basic flow for job chaining by creating an abstract class which calls a doExecute() method in the child class, then chains the job if it was requested by calling chainJob().

So how do we use it? Check out the job below. It actually chains to itself to demonstrate that you can chain any job and that it can be conditional. In this case, we will chain the job to another instance of the same class if it hasn’t already been chained and we get a true value from new Random().nextBoolean().

import java.util.*;
import org.quartz.*;

public class TestJob extends ChainableJob {

   @Override
   protected void doExecute(JobExecutionContext context) 
                                   throws JobExecutionException {
      JobDataMap map = context.getJobDetail().getJobDataMap();
      System.out.println("Executing " + context.getJobDetail().getKey().getName() 
                         + " with " + new LinkedHashMap(map));
      
      boolean alreadyChained = map.get("jobValue") != null;
      if (!alreadyChained) {
         map.put("jobTime", new Date().toString());
         map.put("jobValue", new Random().nextLong());
      }
      
      if (!alreadyChained && new Random().nextBoolean()) {
         chainJob(context, TestJob.class, "secondJob", "secondJobGroup");
      }
   }
   
}

The call to chainJob() at the end will result in the automatic job chaining behaviour in the parent class. Note that this isn’t called immediately, but only executes after the job completes its doExecute() method.

Here’s a simple harness that demonstrates everything together:

import org.quartz.*;
import org.quartz.impl.*;

public class Test {
   
   public static void main(String[] args) throws Exception {

      // start up scheduler
      StdSchedulerFactory.getDefaultScheduler().start();

      JobDetail job = JobBuilder.newJob(TestJob.class)
             .withIdentity("firstJob", "firstJobGroup").build();

      // Trigger our source job, which may chain to another
      Trigger trigger = TriggerBuilder.newTrigger()
            .withIdentity("firstJobTrigger", "firstJobTriggerGroup")
            .startNow()
            .withSchedule(
                  SimpleScheduleBuilder.simpleSchedule().withIntervalInSeconds(1)
                  .repeatForever()).build();

      StdSchedulerFactory.getDefaultScheduler().scheduleJob(job, trigger);
      Thread.sleep(5000);   // let job run a few times

      StdSchedulerFactory.getDefaultScheduler().shutdown();
   }
   
}

Sample Output

Executing firstJob with {}
Chaining secondJob
Executing secondJob with {jobValue=5420204983304142728, jobTime=Sat Mar 02 15:19:29 PST 2013}
Executing firstJob with {}
Executing firstJob with {}
Chaining secondJob
Executing secondJob with {jobValue=-2361712834083016932, jobTime=Sat Mar 02 15:19:31 PST 2013}
Executing firstJob with {}
Chaining secondJob
Executing secondJob with {jobValue=7080718769449337795, jobTime=Sat Mar 02 15:19:32 PST 2013}
Executing firstJob with {}
Chaining secondJob
Executing secondJob with {jobValue=7235143258790440677, jobTime=Sat Mar 02 15:19:33 PST 2013}
Executing firstJob with {}

Deficiencies

Well, we’re up and chaining, but there are some problems with this approach:

  • It doesn’t integrate with a container like Spring to use configured jobs. More code would be required.
  • It forces you to know up front which jobs you want to chain, and write code for it.
  • Configuration is fixed, unless, once again, you write more code.
  • No real-time changes (unless you write more code).
  • A fair bit of code to maintain, and a high likelihood you will have to expand it for more functionality.

The theme here is that it’s doable, but it’s up to you to do the work to make it happen. Obsidian avoids these problems by making chaining configurable, instead of it being a feature of the job itself. Read on to find out how.

Chaining in Obsidian

In contrast to Quartz, chaining in Obsidian requires no code and no up-front knowledge of which jobs will chain or how you might want to chain them later. Chaining is a form of configuration, and like all job-related configuration in Obsidian, you can make live changes at any time without a build or any code at all. Job configuration can use a native REST API or the web UI that’s included with Obsidian.

The following chaining features are available for free:

  • No code and no redeploy to add or remove chains.
  • You can chain specific configurations of job classes.
  • You can chain only on certain states, including failure.
  • Chain conditionally based on source job saved state (equivalent to Quartz’s JobDataMap), including multiple conditions (regexp, equals, greater than, etc.).
  • Chain only when matching a schedule.

Check out the feature and UI documentation to find out more.

Now that we know what’s possible, let’s see an example. Once you have your jobs configured, just create a new chain using the UI. REST API support is coming shortly, but as of 1.5.1, chaining isn’t included in the API. If you need to script this right now, we can provide pointers.

In the UI, it looks like the following:

[Screenshot: chaining UI]

Easy, huh? All configuration is stored in a database, so it’s easy to replicate it in various environments or to automate it via scripting. As a bonus, Obsidian tracks and shows you all chaining state including what job triggered a chained job. It will even tell you why a job chain didn’t fire, whether it’s because the job status didn’t match, or one of your conditions didn’t.

Conclusion

That summarizes how you can go about chaining in Quartz and Obsidian. Quartz definitely has a minimalist approach, but that leaves developers with a lot of work to do.

Meanwhile, Obsidian provides rich functionality out of the box to keep developers working on their own rich functionality, instead of the plumbing that so often seems to consume their time. If you have any suggestions or feature requests for Obsidian, drop us a note by leaving a comment or by contacting us.

Comparing Job Development in Quartz and Obsidian

Getting your program code to the point that it satisfies the functional requirements provided is a milestone for developers, one that hopefully brings satisfaction and a sense of accomplishment. If that code must be executed on a schedule, perhaps for multiple users with custom schedules and configurable parameters, this can mean a whole new set of problems.

We’re going to compare how we would write a job in Quartz and in Obsidian that satisfies those requirements. We’ll use the example scenario of a recurring report with the following dynamic criteria: it is emailed to a specified user, the report format can be either PDF or Excel, and of course the execution frequency varies by user.

The following will be the sample class we’ll use to satisfy these requirements.

public class MyReportClass {
    public void emailReport(String emailAddress, String reportFormat) {
        // ... generate report in desired format
        // ... email report to user
    }
}

For the purpose of this exercise, we will leave this class alone and write a wrapper class for scheduling, allowing for its continued use in non-scheduled contexts.

Let’s start with Obsidian. All Obsidian jobs start with implementing a single interface: SchedulableJob. Our Obsidian job class will look something like this:

import com.carfey.ops.job.Context;
import com.carfey.ops.job.SchedulableJob;
import com.carfey.ops.job.param.Configuration;
import com.carfey.ops.job.param.Parameter;
import com.carfey.ops.job.param.Type;

@Configuration(knownParameters={
		@Parameter(name= MyScheduledReportClass.EMAIL, type=Type.STRING, required=true),
		@Parameter(name= MyScheduledReportClass.REPORT_FORMAT, type=Type.STRING, defaultValue="PDF", required=true)
}) 
public class MyScheduledReportClass implements SchedulableJob {
	public static final String EMAIL = "email";
	public static final String REPORT_FORMAT = "reportFormat";

	public void execute(Context context) throws Exception {
		String email = context.getConfig().getString(EMAIL);
		String reportFormat = context.getConfig().getString(REPORT_FORMAT);
		new MyReportClass().emailReport(email, reportFormat);
	}
}

You’ll notice we can annotate the class with the required parameters. This ensures that when this job is scheduled for execution, the email and reportFormat parameters will always be available. Obsidian will not allow the job to be configured without these values and will also ensure their type. But we wouldn’t mind going a step further: we’d like to verify that the reportFormat is one we support. How can we do so before the job is run?

We can change our class to implement ConfigValidatingJob and implement the necessary method.

Now our class looks like this:

import com.carfey.ops.job.ConfigValidatingJob;
import com.carfey.ops.job.Context;
import com.carfey.ops.job.config.JobConfig;
import com.carfey.ops.job.param.Configuration;
import com.carfey.ops.job.param.Parameter;
import com.carfey.ops.job.param.Type;
import com.carfey.ops.parameter.ParameterException;
import com.carfey.suite.action.ValidationException;

@Configuration(knownParameters={
		@Parameter(name= MyScheduledReportClass.EMAIL, type=Type.STRING, required=true),
		@Parameter(name= MyScheduledReportClass.REPORT_FORMAT, type=Type.STRING, defaultValue="PDF", required=true)
}) 
public class MyScheduledReportClass implements ConfigValidatingJob {
	public static final String EMAIL = "email";
	public static final String REPORT_FORMAT = "reportFormat";

	public void execute(Context context) throws Exception {
		String email = context.getConfig().getString(EMAIL);
		String reportFormat = context.getConfig().getString(REPORT_FORMAT);
		new MyReportClass().emailReport(email, reportFormat);
	}

	public void validateConfig(JobConfig config) throws ValidationException, ParameterException {
		String reportFormat = config.getString(REPORT_FORMAT);
		if (!"PDF".equalsIgnoreCase(reportFormat) && !"EXCEL".equalsIgnoreCase(reportFormat)) {
			throw new ValidationException("Report format must be either PDF or EXCEL");
		}
	}

}

That’s it! Our job will now only accept being scheduled with an email address specified and a valid report format. You could easily extend this to other types of custom validation, such as ensuring the email address is valid or perhaps that it is in an allowable domain.
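For instance, validateConfig() could be extended along these lines; the shape check and the example.com domain are purely illustrative:

public void validateConfig(JobConfig config) throws ValidationException, ParameterException {
	String reportFormat = config.getString(REPORT_FORMAT);
	if (!"PDF".equalsIgnoreCase(reportFormat) && !"EXCEL".equalsIgnoreCase(reportFormat)) {
		throw new ValidationException("Report format must be either PDF or EXCEL");
	}
	String email = config.getString(EMAIL);
	// illustrative extra checks: basic shape, then an allowable domain
	if (email == null || !email.contains("@")) {
		throw new ValidationException("A valid email address is required");
	}
	if (!email.toLowerCase().endsWith("@example.com")) {
		throw new ValidationException("Email must belong to an allowable domain");
	}
}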

Now for Quartz. Let’s first identify some differences. Quartz doesn’t provide any mechanism for ensuring parameters are specified or valid before runtime. And since Quartz doesn’t provide an execution context, the best you can do is write your own code to validate the parameters on startup. Our sample below will follow the easiest approach in Quartz: simply fail the job at runtime if the report format is invalid.

import org.quartz.Job;
import org.quartz.JobDataMap;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;


public class MyScheduledReportClass implements Job {
	public static final String EMAIL = "email";
	public static final String REPORT_FORMAT = "reportFormat";

	public void execute(JobExecutionContext context) throws JobExecutionException {
		JobDataMap data = context.getMergedJobDataMap();
		String email = data.getString(EMAIL);
		String reportFormat = data.getString(REPORT_FORMAT);
		if (!"PDF".equalsIgnoreCase(reportFormat) && !"EXCEL".equalsIgnoreCase(reportFormat)) {
			throw new JobExecutionException("Report format must be either PDF or EXCEL");
		}
		new MyReportClass().emailReport(email, reportFormat);
	}

}

You may be thinking that the classes seem fairly comparable and I would agree. But with the Obsidian job, there’s nothing else that needs to be done. Since setting the runtime schedule and specifying parameters tend to be fluid, those are not done in code or even in static configuration. Using Obsidian’s UI or REST API, you specify the schedule and parameters for each instance or version of the job that is needed.

Obsidian always provides an execution context, which can be standalone or embedded as part of an existing execution context.

Quartz never provides an execution context. Unless you are deploying in a servlet container, you always need to initialize the scheduling environment. Even when using a servlet container, you must help Quartz along. That means that with Quartz, you’ve only written a portion of the code and/or configuration you’ll need.

Initialize the scheduler:

SchedulerFactory sf = new StdSchedulerFactory();
Scheduler scheduler = sf.getScheduler();
scheduler.start();

No administration console and no REST API means code and/or config to schedule and parameterize your job.

// assumes static imports of JobBuilder.newJob, TriggerBuilder.newTrigger
// and CronScheduleBuilder.dailyAtHourAndMinute
JobDetail job = newJob(MyScheduledReportClass.class)
        .withIdentity("joe's report", "group1")
        .usingJobData(MyScheduledReportClass.EMAIL, "joe@****.com")
        .usingJobData(MyScheduledReportClass.REPORT_FORMAT, "PDF")
        .build();

Trigger trigger = newTrigger()
        .withIdentity("trigger1", "group1")
        .startNow()
        .withSchedule(dailyAtHourAndMinute(1, 30))
        .build();

scheduler.scheduleJob(job, trigger);

Now this may not seem too bad, but now imagine that Joe says he wants the report in Excel, not PDF. Are you really going to say that it requires code changes, followed by a build, followed by testing, acceptance, and promoting a new release?

True, some of the above can be moved to configuration files. While that may avoid a build cycle, it presents its own set of issues. You still have to push new configuration files, restart the JVM process and deal with potential mistakes in the new files that could derail all scheduling.
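For reference, here is an abridged sketch of what that might look like using Quartz’s XML scheduling data file. Element names follow the standard job_scheduling_data schema, but verify the details against your Quartz version:

<schedule>
  <job>
    <name>joesReport</name>
    <group>group1</group>
    <job-class>MyScheduledReportClass</job-class>
    <job-data-map>
      <entry><key>email</key><value>joe@****.com</value></entry>
      <entry><key>reportFormat</key><value>PDF</value></entry>
    </job-data-map>
  </job>
  <trigger>
    <cron>
      <name>trigger1</name>
      <group>group1</group>
      <job-name>joesReport</job-name>
      <job-group>group1</job-group>
      <cron-expression>0 30 1 * * ?</cron-expression>
    </cron>
  </trigger>
</schedule>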

This also doesn’t get into the issues surrounding misfires, job concurrency, execution exception handling and recoverability discussed here.

What do you think? Share your experiences using Quartz for scheduling in your Java projects by leaving a comment. We’d like to hear from you.

Configuring Clustering in Quartz and Obsidian Schedulers

Job scheduling is used on many software projects to enable both internal jobs and third-party integration. Clustering can provide a huge boost to reliability by providing fail-over and load-sharing. I believe that clustering should be implemented for reliability on just about all software projects, so I’ve decided to outline how to go about doing that in two popular cluster-enabled Java job schedulers. This post is going to cover how to set up clustering for Quartz and Obsidian. It will explain what work is required to configure each, and help you watch for some common pitfalls. This guide will assume you have both schedulers running in the base configuration already.

Both Quartz and Obsidian have their strong points, and this post won’t debate which is better, but it will provide the information you need to cluster either one.

Quartz

Quartz is the most popular open-source job scheduling option, and it allows for you to cluster scheduler instances via the JDBCJobStore. Though it’s also possible to do clustering with Terracotta without a persistent job store, I recommend against it for most projects since job execution history is very useful to ongoing operations and troubleshooting, and database-clustering performs adequately in all but extreme cases.

Obsidian

Obsidian is a commercial job scheduler which provides free individual instances, and a single clustering licence free for a year. It provides a similar type of clustering as Quartz, and it also provides a full UI, a REST API, downtime recovery, and many other advanced configuration options.

Configuring Obsidian for Clustering

I’ll cover Obsidian first simply because there’s little to do in comparison to Quartz. Since running Obsidian always uses a database, if you have an instance running, there will be no database configuration to update. If you haven’t set it up yet, check out the Getting Started guide – the installation package comes with an interactive Ant tool to build the properties file for you.

So here’s the thing: to cluster Obsidian, simply start up additional instances. Obsidian doesn’t have a non-clustered mode, and all instances automatically handle adjusting the load-sharing algorithm when new members join the cluster (or drop out). Your system clocks should be roughly in sync, but if they differ by a few seconds, it’s not a problem since Obsidian gracefully handles this. Still, you may synchronize your server times if desired.

Note to those using the free version: right after download, you can start two instances and they will automatically join the same cluster. If you do not have adequate licences, new members will not be able to join the cluster.

There’s no need to deploy different properties files or anything like that. As an example, if you use Amazon’s EC2 service, you could use the same image for multiple nodes in the cluster and everything would work correctly. Each cluster member will automatically assign itself a unique instance name based on its local host name and a unique suffix if required.

However, we recommend you assign each cluster member an explicit name, which will help with troubleshooting if there are issues on a specific host. To do so, simply set the Java system property schedulerDesignation to the host name of your choice. For example, if starting a standalone scheduler instance using java directly, simply add -DschedulerDesignation=myHostName to the end of the command.

That’s it! Clustering Obsidian is literally as easy as copying an installation and starting it on another server or virtual machine. As a bonus, unlike other schedulers, in Obsidian you can set a job to only run on a specific host, even with clustering enabled.

In summary, the steps are:

  1. Grab a copy of your Obsidian installation (WAR, standalone package or embedded application bundle).
  2. Start it up! At this step you can provide the schedulerDesignation you wish to use.

Configuring Quartz for Clustering

For Quartz, we’re only going to cover configuring database clustering, since we believe it is the right choice for most projects.

Note before you get started: If you have a non-clustered version of Quartz running, you must shut it down before starting any cluster-enabled instances.

Another note about timing: Since Quartz uses very aggressive timing, you must ensure your different instances have precisely synchronized times. Even a second difference will cause your load-sharing to be effectively disabled and all jobs will run on a single host.

Setting up clustering for Quartz can be a bit overwhelming since it exposes so many properties, and it’s hard to figure out which must be added to your properties file to get up and running, but I will try to simplify the process as much as possible.

Quartz is configured via the quartz.properties file. Here’s a sample file that outlines the bare minimum properties you will have to configure for clustering. If you want to see full details on any portion of the config, see the Quartz configuration reference. Note that the properties file should be identical on all hosts, with the one exception being the org.quartz.scheduler.instanceId which is used to identify different hosts.


# Basic Quartz configuration to provide an adequate pool of threads for execution

org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 25

# Datasource for JDBCJobStore
org.quartz.dataSource.myDS.driver = com.mysql.jdbc.Driver
org.quartz.dataSource.myDS.URL = jdbc:mysql://localhost:3306/scheduler
org.quartz.dataSource.myDS.user = myUser
org.quartz.dataSource.myDS.password = myPassword

# JDBCJobStore
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.dataSource = myDS

# Turn on clustering
org.quartz.jobStore.isClustered = true

org.quartz.scheduler.instanceName = ClusteredScheduler
# If instanceId is set to AUTO, Quartz will generate an ID automatically.
# I recommend giving explicit names to each clustered host for easy identification.
org.quartz.scheduler.instanceId = Host1

Note about JDBC settings: For whatever reason, you have to configure both the JDBC driver class and the job store’s “driver delegate” class. These will have to be set to the appropriate value for your database platform.
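For example, on PostgreSQL both values change together (these are the stock class names shipped with the PostgreSQL driver and Quartz):

# PostgreSQL equivalents of the MySQL values above
org.quartz.dataSource.myDS.driver = org.postgresql.Driver
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate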

As I mentioned, these are the bare minimum properties you will have to configure to get clustering running. The one big pain with this configuration is that the instanceId which identifies hosts resides in the same properties file as all the other properties, which should remain the same across hosts. Keeping different versions in sync can be problematic, and isn’t required for Obsidian. You can use the “AUTO” setting to avoid having to set explicit instance IDs, but as with Obsidian, we recommend you give explicit names, as it can help you locate where issues are happening more quickly if you know up front which host is which.

So the main steps to enabling clustering are:

  1. Prepare properties for each cluster member. Ensure only the org.quartz.scheduler.instanceId varies in the properties file.
  2. Turn off all running instances by shutting down the application which is running it, or disabling just the Quartz process (see here).
  3. Start your application, or start the Quartz process (see here).
  4. Ensure that you never start a non-clustered instance again!

Conclusion

I hope this helps you get off the ground quickly with your new or existing project. Clustering is a great feature in any scheduler, and I feel it provides a lot of value that software and operations teams might be missing out on.