Temporal is a workflow engine that we use for durable execution of overdraft and loan processes, which consist of multiple steps and can last up to several months. The engine provides reliable, deterministic execution of these steps and allows idempotent interaction with clients.


Before working with the workflows, it is recommended to become acquainted with the basics of Temporal. The following sections assume that the reader already has some familiarity with the tooling. If not, plenty of information can be found in the following sources:

  1. The official Temporal documentation

  2. Temporal workshop videos on YouTube

  3. Temporal community forum

Workflow Structure

The workflow-related code can be broken down into the following parts:

  1. WorkflowClient, WorkflowServiceStubs - classes provided by Temporal for interacting with the workflows (start, signal, etc.)

  2. Worker - an executor of Temporal tasks that actually runs the workflows

  3. WorkflowIdResolver - a helper for resolving workflowId by accountId

  4. workflow.definition - overdraft and loans workflow classes that define actual execution logic

  5. workflow.activity - overdraft and loans activities implementation

  6. workflow.dto - various DTO classes for workflow and activity method parameters

All the workflows have some common methods.

  1. init - initializes the workflow from the loan-manager side and creates a mapping entry of accountId to workflowId in the DB, or ends the execution if such a mapping already exists. To make this mapping work, i.e. to be able to resolve an active workflowId by an accountId, we impose the restriction that only one workflow of each type can be running for a single accountId at a time.

  2. destroy - deactivates the aforementioned mapping in the DB.

  3. checkIdempotency - checks whether an idempotency key has already been used, and skips the request if it has.

  4. provideReply - registers a reply received from an external source so that the execution can continue. This is part of the request-reply pattern implementation that is used for synchronizing asynchronous communication with TM over Kafka.
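
The mapping and idempotency rules described above can be illustrated with a minimal in-memory sketch. It is not the loan-manager implementation: the class name, the method signatures, and the use of maps instead of DB tables are assumptions made for illustration only, and "deactivating" a mapping is modelled here simply as removing it.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the DB-backed accountId -> workflowId mapping.
class WorkflowMappingSketch {
    // Key is "<workflowType>:<accountId>", value is the active workflowId.
    private final Map<String, String> active = new ConcurrentHashMap<>();
    // Idempotency keys that have already been processed.
    private final Set<String> usedKeys = ConcurrentHashMap.newKeySet();

    // init: registers the mapping. Returns false if an active workflow of this
    // type already exists for the account, in which case the caller ends the execution.
    boolean init(String workflowType, String accountId, String workflowId) {
        return active.putIfAbsent(key(workflowType, accountId), workflowId) == null;
    }

    // destroy: deactivates the mapping (modelled as removal in this sketch).
    void destroy(String workflowType, String accountId) {
        active.remove(key(workflowType, accountId));
    }

    // Resolves the active workflowId for an account, if there is one.
    Optional<String> resolve(String workflowType, String accountId) {
        return Optional.ofNullable(active.get(key(workflowType, accountId)));
    }

    // checkIdempotency: returns true if the key is new and the request should be
    // processed, false if the key was already used and the request should be skipped.
    boolean checkIdempotency(String idempotencyKey) {
        return usedKeys.add(idempotencyKey);
    }

    private static String key(String workflowType, String accountId) {
        return workflowType + ":" + accountId;
    }

    public static void main(String[] args) {
        WorkflowMappingSketch mapping = new WorkflowMappingSketch();
        System.out.println(mapping.init("overdraft", "acc-1", "wf-1")); // first workflow is accepted
        System.out.println(mapping.init("overdraft", "acc-1", "wf-2")); // second one of the same type is rejected
        System.out.println(mapping.init("loan", "acc-1", "wf-3"));      // a different type is allowed
    }
}
```

Keying the map by type and accountId is what makes the "one running workflow of each type per accountId" restriction enforceable with a single insert-if-absent operation.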

For overdrafts, we also use idle/ready flags to ignore client requests while some other operation is in progress, so that no accidental data inconsistency can occur. For example, a user cannot make a repayment while a fee application is in progress.
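
A minimal sketch of such a flag is shown below. The class and method names are hypothetical, and the real workflows may track more than one flag. No locking is used because Temporal runs the threads of a single workflow one at a time; a request arriving while an operation is blocked (for example on an activity) is simulated here by a nested call.

```java
// Hypothetical sketch of an idle flag that guards overdraft operations.
class OverdraftIdleFlagSketch {
    private boolean idle = true;

    // Runs the operation only if no other operation is in progress.
    // Returns true if the operation ran, false if the request was ignored.
    boolean tryRun(Runnable operation) {
        if (!idle) {
            return false; // another operation is in progress, ignore the request
        }
        idle = false;
        try {
            operation.run();
        } finally {
            idle = true; // become ready for the next request
        }
        return true;
    }

    public static void main(String[] args) {
        OverdraftIdleFlagSketch workflow = new OverdraftIdleFlagSketch();
        // A repay request that arrives while the fee application is running is ignored.
        workflow.tryRun(() -> {
            boolean repayAccepted = workflow.tryRun(() -> System.out.println("repay"));
            System.out.println("repay accepted during fee application: " + repayAccepted);
        });
    }
}
```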

Besides that, large parts of the overdrafts code were extracted into separate WorkflowTasks. This was done solely to make the code more manageable; technically they are still part of the workflow implementation and could be inlined into the main class.

Conventions

Workflow/activity methods use DTOs as parameters

This is because Temporal uses gRPC and serializes arguments with Jackson. Using DTOs is easier in terms of compatibility, since you can easily add new fields without breaking running workflows.

Note that arguments can get deserialized as maps, so one has to be cautious when handling polymorphic types or writing tests. Although different approaches were used for different functionality, the one that worked best was a custom Jackson type resolver that deserializes DTOs to the correct subtypes.
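
To illustrate the problem without pulling in Jackson, the sketch below resolves a raw map to the correct DTO subtype by hand, using a discriminator field. It is only a stand-in for what a Jackson type resolver does: the DTO classes, the "type" field, and its values are all hypothetical.

```java
import java.math.BigDecimal;
import java.util.Map;

// Hypothetical DTO hierarchy and a hand-written subtype resolver.
class DtoSubtypeSketch {
    interface OperationDto {}

    static final class RepayDto implements OperationDto {
        final BigDecimal amount;
        RepayDto(BigDecimal amount) { this.amount = amount; }
    }

    static final class FeeDto implements OperationDto {
        final String feeCode;
        FeeDto(String feeCode) { this.feeCode = feeCode; }
    }

    // Picks the subtype from the "type" discriminator instead of passing
    // the untyped map further into the workflow code.
    static OperationDto fromMap(Map<String, Object> raw) {
        Object type = raw.get("type");
        if ("repay".equals(type)) {
            return new RepayDto(new BigDecimal(String.valueOf(raw.get("amount"))));
        }
        if ("fee".equals(type)) {
            return new FeeDto(String.valueOf(raw.get("feeCode")));
        }
        throw new IllegalArgumentException("Unknown DTO type: " + type);
    }

    public static void main(String[] args) {
        OperationDto dto = fromMap(Map.of("type", "fee", "feeCode", "MONTHLY"));
        System.out.println(dto.getClass().getSimpleName());
    }
}
```

Failing loudly on an unknown discriminator is a deliberate choice in this sketch: a silently mis-typed argument is much harder to diagnose in a long-running workflow than an explicit error.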

Request-reply

In many cases we need to communicate with TM to perform some operations. Since this communication is asynchronous, we needed a synchronization mechanism to make it work. The present approach consists of sending a posting instruction to TM using a regular activity method (or rather in a retry block), and awaiting a reply. This reply is registered with the provideReply() workflow method, which is called from a posting handler once a matching Kafka message is consumed. Since we need to know the workflowId to send a provideReply() signal, some specific data is required in the incoming posting instructions: namely, an accountId or overdraftId to resolve the workflowId, and the requestId that is generated when sending a request to TM and that serves as a correlation id.
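
The correlation part of this pattern can be sketched in plain Java as follows. The class models only the workflow-side state; the method names other than provideReply and the String reply payload are assumptions. In an actual workflow, the hasReply condition would be what the workflow blocks on, for example via Workflow.await(() -> hasReply(requestId)).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the workflow-side state behind the request-reply pattern.
class RequestReplySketch {
    // Replies received so far, keyed by requestId (the correlation id).
    private final Map<String, String> replies = new HashMap<>();

    // Called by the posting handler once a matching Kafka message is consumed.
    void provideReply(String requestId, String reply) {
        replies.put(requestId, reply);
    }

    // The condition a workflow waits on after sending the request to TM.
    boolean hasReply(String requestId) {
        return replies.containsKey(requestId);
    }

    // Returns the reply and forgets it, so that the state does not grow.
    String takeReply(String requestId) {
        return replies.remove(requestId);
    }

    public static void main(String[] args) {
        RequestReplySketch workflow = new RequestReplySketch();
        String requestId = "req-1"; // generated when the request is sent to TM
        workflow.provideReply("req-2", "reply for another request");
        System.out.println("reply ready: " + workflow.hasReply(requestId));
        workflow.provideReply(requestId, "ACCEPTED");
        System.out.println("reply ready: " + workflow.hasReply(requestId));
    }
}
```

Because the reply is delivered to the workflow by id rather than to a waiting thread, it does not matter which loan-manager instance consumes the Kafka message.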

NOTE: The previous implementation was based on mimicking JMSTemplate using asynchronous activities and multithreading for waiting for replies. However, this solution did not work well with multiple instances of loan-manager, because one instance could send a request and wait for a reply, but another instance would receive that reply. Or, more precisely, replies were delivered to a different Kafka partition that was assigned to a different loan-manager instance.

Promises

When there are activities that can be run simultaneously, like sending notifications, messages, and events, it is best performance-wise to wrap them into promises and let Temporal handle all of them asynchronously, so that one does not block the other. However, these should not be mixed with activities that handle business logic or other concerns.
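
The fan-out idea can be shown with a plain-Java analogy based on CompletableFuture. This is only an analogy: CompletableFuture must not be used inside workflow code, where the corresponding Temporal constructs are Async.function / Async.procedure and Promise.allOf. The three side effects below are placeholders.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Plain-Java analogy of running independent side effects in parallel.
class ParallelSideEffectsSketch {
    static List<String> runAll() {
        List<String> done = new CopyOnWriteArrayList<>();
        // Start all three independent operations without waiting in between.
        CompletableFuture<Void> notification = CompletableFuture.runAsync(() -> done.add("notification"));
        CompletableFuture<Void> message = CompletableFuture.runAsync(() -> done.add("message"));
        CompletableFuture<Void> event = CompletableFuture.runAsync(() -> done.add("event"));
        // Wait until all of them have completed; the completion order is not defined.
        CompletableFuture.allOf(notification, message, event).join();
        return done;
    }

    public static void main(String[] args) {
        System.out.println(runAll().size() + " side effects completed");
    }
}
```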

Termination and Clean-Up

If there is an active workflowId stored in the DB, but there is no running workflow in Temporal with the same id, then this DB entry is deactivated upon workflowId resolution. Thus, it is enough to terminate the workflow only in Temporal, for example, using the Temporal UI.
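
A minimal sketch of this lazy clean-up is given below. The class is hypothetical: the map stands in for the DB table, the predicate stands in for a query to Temporal, and returning an empty result after deactivating a stale entry is an assumption about the resolver's behaviour.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical resolver that deactivates stale mapping entries on lookup.
class StaleMappingCleanupSketch {
    private final Map<String, String> activeByAccount = new HashMap<>();
    // Stand-in for asking Temporal whether a workflow with this id is running.
    private final Predicate<String> isRunningInTemporal;

    StaleMappingCleanupSketch(Predicate<String> isRunningInTemporal) {
        this.isRunningInTemporal = isRunningInTemporal;
    }

    void register(String accountId, String workflowId) {
        activeByAccount.put(accountId, workflowId);
    }

    Optional<String> resolve(String accountId) {
        String workflowId = activeByAccount.get(accountId);
        if (workflowId == null) {
            return Optional.empty();
        }
        if (!isRunningInTemporal.test(workflowId)) {
            // The workflow was terminated in Temporal only: deactivate the stale entry.
            activeByAccount.remove(accountId);
            return Optional.empty();
        }
        return Optional.of(workflowId);
    }

    boolean hasActiveEntry(String accountId) {
        return activeByAccount.containsKey(accountId);
    }

    public static void main(String[] args) {
        // No workflow is running in this demo, so the entry is cleaned up on lookup.
        StaleMappingCleanupSketch resolver = new StaleMappingCleanupSketch(id -> false);
        resolver.register("acc-1", "wf-1");
        System.out.println("resolved: " + resolver.resolve("acc-1").isPresent());
        System.out.println("entry still active: " + resolver.hasActiveEntry("acc-1"));
    }
}
```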

Versioning

When there are running workflows in production, one must be careful with making changes to the workflow code, as the workflows have to be deterministic and it must be possible to replay them from the very beginning to the current state. This means that which activities are called, their order, etc. have to remain the same. If they do not, then one has to add versioning checks to the code (see getVersion() in the documentation). Or, if using many getVersion() checks is difficult, it is also possible to write entirely new versions of WorkflowTasks.
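
The effect of a getVersion() check can be illustrated with a simplified model. The class below is not the Temporal SDK: it only imitates the documented behaviour of Workflow.getVersion(changeId, minSupported, maxSupported), where a first-time execution records maxSupported and a replay of a history without a marker yields Workflow.DEFAULT_VERSION (-1). The change id and the activity names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of version markers; not the Temporal SDK itself.
class GetVersionModelSketch {
    static final int DEFAULT_VERSION = -1; // same value as Workflow.DEFAULT_VERSION

    private final Map<String, Integer> markers; // changeId -> version recorded in history
    private final boolean replaying;

    GetVersionModelSketch(Map<String, Integer> markers, boolean replaying) {
        this.markers = markers;
        this.replaying = replaying;
    }

    int getVersion(String changeId, int minSupported, int maxSupported) {
        Integer recorded = markers.get(changeId);
        int version;
        if (recorded != null) {
            version = recorded;          // history already has a marker
        } else if (replaying) {
            version = DEFAULT_VERSION;   // history predates the change
        } else {
            version = maxSupported;      // first execution: record the newest version
            markers.put(changeId, version);
        }
        if (version < minSupported || version > maxSupported) {
            throw new IllegalStateException("Unsupported version " + version + " for " + changeId);
        }
        return version;
    }

    // Returns the activities that would be called, in order.
    String applyFee() {
        int version = getVersion("fee-notification", DEFAULT_VERSION, 1);
        if (version == DEFAULT_VERSION) {
            return "applyFee";               // original behaviour, kept for old runs
        }
        return "applyFee,notifyCustomer";    // behaviour added by the change
    }

    public static void main(String[] args) {
        System.out.println(new GetVersionModelSketch(new HashMap<>(), true).applyFee());
        System.out.println(new GetVersionModelSketch(new HashMap<>(), false).applyFee());
    }
}
```

Workflows started before the change keep replaying the old sequence of activities, while new executions take the new branch and record that decision, so both remain deterministic.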