Effortless Batch Task Execution in Node.js

Queue up those tasks...

One of Apex Designer's superpowers is the ability to package patterns and integrations as libraries that can be reused across applications. One of the latest examples is the Scheduled System Task library. This blog post covers the motivation for the library and the evolutions that extended it to cover additional use cases.

Introduction

Most reasonably complex applications need to execute timer-based or event-based system tasks. For example, many of the applications we develop for ourselves and our clients using Apex Designer include some kind of daily post to Slack with usage statistics. These are typically set to run once per day at midnight. An application behavior runs on application start and uses setTimeout to schedule (and then reschedule) the logic that runs a query and posts the results to Slack using the Slack Library.
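The timer approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from our applications: the function names are made up for the example, and the job passed in stands in for the query and the Slack post.

```typescript
// Milliseconds from `now` until the next local midnight.
function msUntilNextMidnight(now: Date): number {
  const next = new Date(now);
  next.setHours(24, 0, 0, 0); // hour 24 rolls over to 00:00 of the following day
  return next.getTime() - now.getTime();
}

// Hypothetical job type: stands in for "run a query and post the results to Slack".
type DailyJob = () => Promise<void>;

// Run the job at the next midnight, then schedule it again for the night after.
function scheduleDaily(job: DailyJob): void {
  setTimeout(async () => {
    try {
      await job();
    } catch (err) {
      console.error("daily job failed", err);
    } finally {
      scheduleDaily(job); // schedule again
    }
  }, msUntilNextMidnight(new Date()));
}
```

Every running instance that calls scheduleDaily on startup sets its own timer, so the job runs once per instance.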

This works great in development, but production applications achieve high availability and horizontal scaling by running multiple instances of the application. With this simple logic, each instance dutifully generates the Slack post and you end up with duplicate messages.

This may not be a big deal for Slack activity summaries, but it is a real problem for applications that need to spread the load across multiple nodes or ensure that multiple nodes are not trying to process the same information. Also, the current state of the queue and its processing cannot be held by an individual instance, because instances can be added or removed at any time by the pod manager.

There are third-party Node.js libraries for distributed queuing (for example Bull). Also, most cloud infrastructure vendors have a Scheduler Service (e.g. AWS CloudWatch Events). However, Apex Designer applications already have a database, and so we were able to design a more elegant solution for this problem.

Scheduled System Task Library

Enter the Scheduled System Task library. This library introduces a Scheduled System Task business object to your application, encapsulating all the logic needed to define, claim, and execute scheduled tasks.

For the activity summary use case described above, the app replaces the setTimeout logic with one call to schedule a system task. This includes the "what" and "when":

  1. Business Object Name
  2. Instance Id (optional)
  3. Behavior Name
  4. Inputs (JSON string of an array)
  5. Run After (date with time)
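As a rough illustration, the five fields above could be represented like this. The interface and property names are our own shorthand for this post rather than the library's actual API, and the business object name, behavior name, and channel value are invented for the example.

```typescript
// Hypothetical shape mirroring the five fields listed above.
interface ScheduledTaskRequest {
  businessObjectName: string;
  instanceId?: number;   // optional: omitted for class-level behaviors
  behaviorName: string;
  inputs: string;        // JSON string of an array
  runAfter: Date;        // date with time
}

// Build the "what" and "when" for the daily activity summary.
function buildDailySummaryTask(runAfter: Date): ScheduledTaskRequest {
  return {
    businessObjectName: "UsageReport",      // assumed name
    behaviorName: "postDailySummary",       // assumed name
    inputs: JSON.stringify(["#usage-stats"]), // assumed single input
    runAfter,
  };
}
```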

The library’s worker logic claims and executes the task at the designated time, handling errors and retries as needed. This ensures that only one instance of your application performs the task, avoiding duplicate posts and enabling load sharing across multiple nodes.
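To make the claim-and-retry idea concrete, here is a simplified in-memory sketch. It is not the library's implementation: the row shape, the function names, the retry limit, and the retry delay are all assumptions for illustration. In a real deployment the claim would be a single conditional database update so that only one instance can win it.

```typescript
type TaskStatus = "pending" | "running" | "done" | "failed";

// Hypothetical task row; field names are assumptions.
interface TaskRow {
  id: number;
  runAfter: Date;
  status: TaskStatus;
  claimedBy: string | null;
  attempts: number;
}

// Stands in for an atomic conditional update, e.g.
//   UPDATE task SET status = 'running', claimed_by = :worker
//   WHERE id = :id AND status = 'pending' AND run_after <= :now
// The database guarantees that only one instance changes the row.
function tryClaim(task: TaskRow, workerId: string, now: Date): boolean {
  const due = task.runAfter.getTime() <= now.getTime();
  if (task.status !== "pending" || task.claimedBy !== null || !due) {
    return false;
  }
  task.status = "running";
  task.claimedBy = workerId;
  return true;
}

// Record the result of one attempt; failed attempts are released for a retry
// until the (assumed) attempt limit is reached.
function recordOutcome(
  task: TaskRow,
  succeeded: boolean,
  now: Date,
  maxAttempts: number = 3,
  retryDelayMs: number = 60 * 1000
): void {
  task.attempts += 1;
  if (succeeded) {
    task.status = "done";
    return;
  }
  if (task.attempts >= maxAttempts) {
    task.status = "failed";
    return;
  }
  task.status = "pending";
  task.claimedBy = null;
  task.runAfter = new Date(now.getTime() + retryDelayMs);
}
```

If two instances call tryClaim on the same due task, the first gets true and the second gets false, which is what prevents the duplicate Slack post.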

Delayed System Task Capability

After using the library for some time, we came across another use case that was not completely covered by the capabilities described so far. The OpenAI library has a mixin that lets the application configure one or more business objects to automatically push formatted information to OpenAI for retrieval-augmented generation (RAG) search.

Most of the user experiences we build automatically save updates the user makes after a slight pause in typing. This provides a great experience for the user because if they lose connectivity, all their changes are saved up to that point. However, triggering a server-side operation after every pause is impractical.

To address this use case, we added a new behavior to the Scheduled System Task library. It lets the application define how long to wait before the information is pushed to OpenAI (the default is 60 seconds). The library checks for a pending task with the same inputs. If none exists, a task is scheduled to run after that delay. If one does exist, the library pushes its run time out by a further delay period instead of creating a second task. A burst of saves therefore results in a single push once the user has paused for long enough, which balances user experience with server-side efficiency.
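The check-then-reschedule logic can be sketched as below. This is an in-memory illustration under a few assumptions: the names are invented, a Map stands in for the table of pending tasks, and the delay is measured from the most recent save.

```typescript
// Hypothetical pending-task record.
interface PendingTask {
  key: string;
  runAfter: Date;
}

// Tasks count as "the same" when the target, behavior, and inputs all match.
function taskKey(
  businessObjectName: string,
  instanceId: number,
  behaviorName: string,
  inputs: unknown[]
): string {
  return JSON.stringify([businessObjectName, instanceId, behaviorName, inputs]);
}

// Schedule a task after `delayMs`, or push back the matching pending task.
function scheduleDelayed(
  pending: Map<string, PendingTask>,
  key: string,
  now: Date,
  delayMs: number = 60 * 1000
): PendingTask {
  const runAfter = new Date(now.getTime() + delayMs);
  const existing = pending.get(key);
  if (existing !== undefined) {
    existing.runAfter = runAfter; // reschedule instead of adding a duplicate
    return existing;
  }
  const task: PendingTask = { key, runAfter };
  pending.set(key, task);
  return task;
}
```

With the default delay, a save at 12:00:00 followed by another at 12:00:30 leaves one pending task due at 12:01:30.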

Predecessor Task Capability

The third evolution of the library introduces predecessor relationships between tasks. An application can define tasks that must be executed in a specific order, while tasks that do not depend on each other can still run in parallel across multiple nodes.

Imagine you have a series of data processing steps that must occur in a defined sequence. The library can schedule these tasks so that each step waits for its predecessor to complete before starting, which means a later step never runs against the incomplete output of an earlier one.
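One way to picture the predecessor check is the sketch below. The task shape and function name are assumptions for illustration, not the library's data model: a task is ready to be claimed only when every task it lists as a predecessor is done.

```typescript
// Hypothetical task record with predecessor links.
interface StepTask {
  id: string;
  predecessorIds: string[];
  status: "pending" | "done";
}

// Return the pending tasks whose predecessors have all completed.
function readyTasks(tasks: StepTask[]): StepTask[] {
  const doneIds = new Set(
    tasks.filter((t) => t.status === "done").map((t) => t.id)
  );
  return tasks.filter(
    (t) =>
      t.status === "pending" &&
      t.predecessorIds.every((id) => doneIds.has(id))
  );
}
```

For a pipeline where two transform steps both follow an extract step and a load step follows both transforms, only the extract is ready at first. Once it is done, both transforms become ready together and can be claimed by different instances.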

Conclusions

The Scheduled System Task library is a great example of an Apex Designer library. It packages a set of easy-to-use features that address real-world challenges and makes it easy to add them to any application.

Below is a summary of the key benefits provided by the Scheduled System Task library:

  • Simplifies task scheduling, reducing code complexity
  • Prevents duplicate task execution, ensuring efficient resource use
  • Improves application performance and availability by spreading the load
  • Increases reliability by automatically handling and retrying failed tasks
  • Reduces server load and improves efficiency by batching updates
  • Facilitates complex workflows and parallel task execution across instances
  • Saves development time and effort, promoting code reuse and consistency

You can read more about Apex Designer or check out the documentation.

David Knapp
