Here you will find documentation for each of the components and operators that are included within Dagserver. You will also find information about how to implement your own operator within the tool.
Dagserver includes a variety of operators that facilitate different tasks within the system. These operators are categorized into several groups: Process, Remote, MQ, and External. Below is the documentation for each operator within these categories.
The "PROCESS" category refers to operators that are used to process data within the DAG execution flow. The following operators are found in the "Process" category.
The CmdOperator is designed to execute BATCH (DOS Windows) commands.
cmd: Command (code) to be executed by the operator
When executed, the CmdOperator runs the provided BATCH command and captures the output. The output is then returned in a DataFrame format with a single column named output.
If a problem is encountered while running this operator, a DomainException will be thrown. The details can be reviewed in the System Exceptions view.
This operator functions correctly only when Dagserver is running on Windows platforms.
The DummyOperator is a simple operator that is mainly used for testing and demonstration purposes. It does not perform any specific operation other than logging its initiation and returning a status DataFrame.
When executed, the DummyOperator logs its start and end, along with any arguments provided. It then returns a DataFrame with a single status column indicating "ok".
If a problem is encountered during execution, a DomainException will be thrown. The details can be reviewed in the System Exceptions view.
The ExcelOperator is designed to read from and write to Excel files. This operator can be used to import data from Excel sheets into a DataFrame or to export DataFrame data to an Excel sheet.
filePath: Path to the Excel file
mode: Operation mode (read or write)
sheetName: Name of the sheet to read from or write to
startRow: Starting row for reading or writing
startColumn: Starting column for reading or writing
xcom: Name of the DataFrame to be written to the Excel file (used in write mode)
endRow: Ending row for reading (used in read mode)
endColumn: Ending column for reading (used in read mode)
includeTitles: Whether to include column titles when writing (true or false)
When executed, the ExcelOperator will perform the specified operation (read or write) on the provided Excel file. If reading, it will import the specified sheet's data into a DataFrame. If writing, it will export the provided DataFrame to the specified sheet.
If a problem is encountered during execution, a DomainException will be thrown. The details can be reviewed in the System Exceptions view.
The FakerOperator generates mock data using the JavaFaker library. It constructs DataFrames filled with randomized data based on the specified fields and locale.
fakerjson: JSON array specifying fields to generate using JavaFaker methods
count: Number of rows of data to generate
locale: Locale setting for generating localized data (e.g., en-US, es, fr)
When executed, the FakerOperator uses the JavaFaker library to create mock data based on the specified fields and locale. It generates a DataFrame containing randomized data for each specified field, repeated for the specified count of rows.
If a problem occurs during execution, a DomainException will be thrown. Details of the error can be reviewed in the System Exceptions view.
The FileOperator allows reading from and writing to files with configurable options for handling structured and unstructured data.
mode: Operation mode (read or write)
filepath: Path to the file to read from or write to
firstRowTitles: Whether the first row contains titles (true or false)
xcom: Name of the DataFrame to be written to or read from (used in write or read mode)
rowDelimiter: Delimiter for parsing rows (used in read mode)
When executed, the FileOperator will perform the specified operation (read or write) on the provided file. If reading, it can handle structured data with defined row delimiters or unstructured data by reading the entire content as a single column DataFrame.
If a problem is encountered during execution, a DomainException will be thrown. The details can be reviewed in the System Exceptions view.
If the rowDelimiter parameter is not defined, the FileOperator will read the file as unstructured data, generating a single-column DataFrame.
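The two read modes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Dagserver's implementation: it assumes that `rowDelimiter` separates the fields within each line, and the names `read_as_rows`, `content`, and `col0`, `col1`, ... are hypothetical. A list of dictionaries stands in for a DataFrame.

```python
from typing import Dict, List, Optional


def read_as_rows(text: str, row_delimiter: Optional[str] = None,
                 first_row_titles: bool = False) -> List[Dict[str, str]]:
    """Turn file content into rows (a list of dicts stands in for a DataFrame)."""
    if not row_delimiter:
        # Unstructured mode: the whole content becomes one cell in one column.
        # The column name "content" is a hypothetical choice.
        return [{"content": text}]

    # Structured mode (assumption: the delimiter splits fields within a line).
    split_lines = [line.split(row_delimiter)
                   for line in text.splitlines() if line.strip()]
    if not split_lines:
        return []

    if first_row_titles:
        titles, data = split_lines[0], split_lines[1:]
    else:
        # No header row: fall back to generated column names (hypothetical).
        titles = [f"col{i}" for i in range(len(split_lines[0]))]
        data = split_lines

    return [dict(zip(titles, values)) for values in data]
```

Calling `read_as_rows(text)` without a delimiter yields the single-column result, while passing a delimiter together with `first_row_titles=True` uses the first line as column names.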
The GroovyOperator allows dynamic execution of Groovy scripts, integrating with other operators and DAGs in the DAG Server.
source: Groovy script source code
When executed, the GroovyOperator evaluates the provided Groovy script (source). It supports integrating with the "operator api" to execute other operators and the "dag api" to execute other DAGs.
If the script returns a List or Map, it will be converted into a DataFrame. If it returns a DataFrame directly, it will be used as the output. If null or an empty result is returned, a status DataFrame indicating "empty" will be created.
If a problem is encountered during execution, a DomainException will be thrown. The details can be reviewed in the System Exceptions view.
The GroovyOperator allows executing other operators using the "operator api" and also executing other DAGs using the "dag api". This capability enhances integration and workflow flexibility within DAG Server.
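The return-value rules described for the GroovyOperator can be modeled as a small normalization function. The sketch below is written in Python purely for illustration and is not the operator's code. The `DataFrame` class is a hypothetical stand-in, and two details are assumptions: a Map becomes a single row, and scalar list items are wrapped in a column named `value`.

```python
from typing import Any, Dict, List


class DataFrame:
    """Minimal stand-in for a DataFrame: just a list of row dictionaries."""

    def __init__(self, rows: List[Dict[str, Any]]):
        self.rows = list(rows)


def normalize_result(result: Any) -> DataFrame:
    """Map a script's return value onto a DataFrame."""
    if isinstance(result, DataFrame):
        # A DataFrame is used as the output directly.
        return result
    if result is None or (isinstance(result, (list, dict)) and not result):
        # Null or empty result: status DataFrame indicating "empty".
        return DataFrame([{"status": "empty"}])
    if isinstance(result, dict):
        # Assumption: a Map is converted into a single row.
        return DataFrame([dict(result)])
    if isinstance(result, list):
        # Assumption: Map items become rows, scalars go into a "value" column.
        return DataFrame([dict(item) if isinstance(item, dict) else {"value": item}
                          for item in result])
    # How other types are handled is not documented; this sketch rejects them.
    raise TypeError(f"unsupported result type: {type(result).__name__}")
```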
The JavaOperator allows executing external Java classes dynamically loaded from JAR files. It supports invoking Callable instances that integrate with DAG Server's execution environment.
classpath: Path to the directory containing JAR files
className: Fully qualified name of the Java class to execute
When executed, the JavaOperator dynamically loads JAR files from the specified classpath and executes the designated Java class. The class must implement the Callable interface for integration with DAG Server.
If a problem occurs during execution, a DomainException will be thrown and details can be reviewed in the System Exceptions view.
The JavaOperator supports executing external Java classes that implement the Callable interface. It integrates with DAG Server by allowing setting parameters using the methods setXcom and setArgs, enhancing flexibility in integrating custom Java functionality within workflows.
The PathDirOperator retrieves metadata information about files and directories located at the specified path. It returns details such as filename, size, accessibility, and type for each file or directory found.
path: The full directory path to retrieve metadata from
If the specified path is invalid or does not exist, the operator throws a DomainException, which can be reviewed in the System Exceptions view.
The PathDirOperator retrieves file metadata based on the provided path. Ensure the path is correctly formatted and accessible to avoid DomainException errors. It returns a DataFrame containing information about each file or directory found.
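To make the shape of the result more concrete, here is a rough Python equivalent of what such a directory listing might look like. It is a sketch, not the operator's code: the column names (`filename`, `size`, `readable`, `type`) are guesses based on the description above, and a `ValueError` stands in for the DomainException.

```python
import os
from typing import Dict, List


def list_path_metadata(path: str) -> List[Dict[str, object]]:
    """Return one row of metadata per entry found directly under `path`."""
    if not os.path.isdir(path):
        # Stand-in for the DomainException raised on an invalid path.
        raise ValueError(f"invalid or missing directory: {path}")

    rows = []
    with os.scandir(path) as entries:
        # Sort by name so that the output order is deterministic.
        for entry in sorted(entries, key=lambda e: e.name):
            rows.append({
                "filename": entry.name,
                "size": entry.stat().st_size,
                "readable": os.access(entry.path, os.R_OK),
                "type": "directory" if entry.is_dir() else "file",
            })
    return rows
```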
The QualityOperator evaluates data quality based on criteria provided in a JSON format (`qualityjson`). It checks each field in the DataFrame stored in `xcom` for compliance with specified data types and raises errors for mismatches.
qualityjson: JSON object defining expected data types for each field
xcom: Name of the cross-communication data object containing the DataFrame to evaluate
If the `xcom` specified does not exist for the given `dagname`, a DomainException is thrown, which can be reviewed in the System Exceptions view.
The QualityOperator evaluates data quality based on the specified criteria in `qualityjson`. Ensure the `xcom` contains the DataFrame to evaluate, and that `qualityjson` accurately defines expected data types. It returns a DataFrame with additional columns indicating the quality status and messages for each field.
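The exact schema of `qualityjson` is not shown here, so the following Python sketch only illustrates the general idea of a type check driven by a JSON object. Everything specific in it is an assumption: the type names (`string`, `integer`, `number`, `boolean`), the added column names (`quality_status`, `quality_message`), and the choice to report problems per row.

```python
import json
from typing import Any, Dict, List

# Hypothetical type names; the real operator may use different ones.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
}


def check_quality(rows: List[Dict[str, Any]], quality_json: str) -> List[Dict[str, Any]]:
    """Return the rows with two extra columns describing type mismatches."""
    expected = json.loads(quality_json)  # e.g. {"name": "string", "age": "integer"}
    checked = []
    for row in rows:
        problems = []
        for field, type_name in expected.items():
            if field not in row:
                problems.append(f"{field}: missing")
            elif not TYPE_CHECKS[type_name](row[field]):
                problems.append(f"{field}: expected {type_name}")
        checked.append({
            **row,
            "quality_status": "error" if problems else "ok",
            "quality_message": "; ".join(problems),
        })
    return checked
```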
The "REMOTE" category refers to operators that are used to interact with remote repositories or file systems. The following operators are found in the "Remote" category.
The Remote API implementation only allows standard operations, such as listing directory contents and uploading and downloading files. Commands are defined using the "Remote" UI in the operator parameter editor.
The FTPOperator facilitates FTP operations such as listing files, uploading files, and downloading files between the specified FTP server and local system.
host: The FTP server hostname or IP address
port: The FTP server port number
ftpUser: The FTP username
ftpPass: The FTP password
commands: Semi-colon separated list of FTP commands formatted as "command argument1 argument2"
If any errors occur during FTP operations, a DomainException is thrown, which can be reviewed in the System Exceptions view.
The FTPOperator supports FTP operations for file management. Ensure the `host`, `port`, `ftpUser`, `ftpPass`, and `commands` are correctly configured for seamless FTP interaction. Review logs for detailed operation results.
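The `commands` format (semicolon-separated entries, each written as "command argument1 argument2") can be illustrated with a small parser. This Python sketch only demonstrates the string format; it is not how Dagserver parses it, and the command names used in the example (`list`, `download`) are hypothetical, since the actual names are defined through the "Remote" UI.

```python
from typing import List, Tuple


def parse_commands(commands: str) -> List[Tuple[str, List[str]]]:
    """Split a semicolon-separated command string into (name, arguments) pairs."""
    parsed = []
    for chunk in commands.split(";"):
        tokens = chunk.split()  # whitespace separates the command and its arguments
        if not tokens:
            continue  # skip empty entries, e.g. after a trailing semicolon
        parsed.append((tokens[0], tokens[1:]))
    return parsed


# Hypothetical command names, used only to show the format.
example = parse_commands("list /in; download /in/a.csv /tmp/a.csv;")
```

The same format applies to the `commands` parameter of the other operators in the "Remote" category.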
The SFTPOperator facilitates SFTP operations such as listing files, uploading files, and downloading files between the specified SFTP server and local system.
host: The SFTP server hostname or IP address
port: The SFTP server port number
sftpUser: The SFTP username
sftpPass: The SFTP password
commands: Semi-colon separated list of SFTP commands formatted as "command argument1 argument2"
If any errors occur during SFTP operations, a DomainException is thrown, which can be reviewed in the System Exceptions view.
The SFTPOperator supports SFTP operations for secure file management. Ensure the `host`, `port`, `sftpUser`, `sftpPass`, and `commands` are correctly configured for seamless SFTP interaction. Review logs for detailed operation results.
The Samba2Operator facilitates operations with Samba (SMB) shares, enabling functionalities such as listing files, uploading files, and downloading files between the specified Samba server and the local system.
host: The Samba server hostname or IP address
smbUser: The Samba username
smbPass: The Samba password
smbDomain: The Samba domain (if applicable)
smbSharename: The name of the Samba share to interact with
commands: Semi-colon separated list of Samba commands formatted as "command argument1 argument2"
If any errors occur during Samba operations, a DomainException is thrown, which can be reviewed in the System Exceptions view.
The Samba2Operator provides robust integration with Samba (SMB) shares for efficient file management. Ensure the `host`, `smbUser`, `smbPass`, `smbDomain`, `smbSharename`, and `commands` are correctly configured for seamless interaction. Review logs for detailed operation results.
The WebDAVOperator facilitates operations with WebDAV servers, allowing functionalities such as listing files, uploading files, and downloading files between the specified server and the local system.
host: The host name or IP address of the WebDAV server
port: The port number of the WebDAV server
username: The username for authentication on the server (leave empty if not required)
password: The password for authentication on the server (leave empty if not required)
commands: A list of commands separated by semicolons, in the format "command argument1 argument2"
If errors occur during WebDAV operations, a DomainException will be thrown, which can be reviewed in the System Exceptions view.
The WebDAVOperator offers efficient integration with WebDAV servers for file management. Make sure that `host`, `port`, `username`, `password`, and `commands` are configured correctly for seamless interaction. Check the logs for detailed outcomes of the operations.
The "MQ" category refers to operators that integrate with different message queues. The following operators are found in the "MQ" category.
The ActiveMQOperator facilitates interaction with ActiveMQ message queues, supporting both message production and consumption based on specified modes.
mode: Specifies the mode of operation, either "produce" for message production or "consume" for message consumption
brokerURL: The URL of the ActiveMQ broker
queueName: The name of the queue to interact with
xcom (optional): Name of the cross-communication object (XCom) associated with the data frame to be produced
timeout (optional): Timeout value in milliseconds for consuming messages (default is 10000)
If errors occur during ActiveMQ operations, a DomainException will be thrown, which can be reviewed in the System Exceptions view.
The ActiveMQOperator provides robust integration with ActiveMQ message queues for effective message handling. Make sure that `mode`, `brokerURL`, and `queueName` are configured correctly for your operational requirements. Refer to the logs for detailed insights into the messaging operations.
The KafkaOperator enables interaction with Kafka messaging queues, supporting both message production and consumption based on specified modes.
mode: Specifies the mode of operation, either "produce" for message production or "consume" for message consumption
bootstrapServers: Comma-separated list of host:port pairs that the Kafka client uses to establish its initial connection to the Kafka cluster
topic: The Kafka topic name to interact with
timeoutSeconds: Timeout value in seconds for producer/consumer operations
xcom (optional): Name of the cross-communication object (XCom) associated with the data frame to be produced
poll (optional): Polling timeout in milliseconds for consuming messages (default is 10000)
groupId (optional): The consumer group id for the Kafka consumer (optional in general, but required when mode is "consume")
If errors occur during Kafka operations, a DomainException will be thrown, which can be reviewed in the System Exceptions view.
The KafkaOperator provides robust integration with Kafka messaging queues for efficient message handling. Make sure that `mode`, `bootstrapServers`, `topic`, and `timeoutSeconds` are configured according to your application's requirements. Refer to the logs for detailed insights into the messaging operations.
The RabbitMQOperator facilitates interaction with RabbitMQ messaging queues, supporting operations such as message publishing and consumption based on specified modes.
host: The hostname or IP address of the RabbitMQ server
username: The username for authentication on the RabbitMQ server
password: The password for authentication on the RabbitMQ server
port: The port number for the RabbitMQ server
mode: Specifies the mode of operation, either "publish" for message publishing or "consume" for message consumption
xcom (optional): Name of the cross-communication object (XCom) associated with the data frame to be published
exchange (optional): The RabbitMQ exchange to interact with when publishing messages
routingKey (optional): The routing key used for message publishing
queue (optional): The queue name to consume messages from
body (optional): The message body content to be published (source code format)
If errors occur during RabbitMQ operations, a DomainException will be thrown, which can be reviewed in the System Exceptions view.
The RabbitMQOperator offers seamless integration with RabbitMQ messaging queues for efficient message handling. Make sure that `host`, `username`, `password`, `port`, and `mode` are configured appropriately for your use case. Refer to the logs for detailed information on message operations.
The "EXTERNAL" category refers to operators that integrate with various third-party systems and tools. The following operators are found in the "External" category.
The ChatGPTOperator integrates with the ChatGPT API for natural language processing tasks, allowing interactions based on provided prompts.
apiKey: The API key for accessing the ChatGPT API (password type)
prompt: The prompt or source code snippet to interact with ChatGPT, supporting named parameters in the format ":paramName"
xcom (optional): Name of the cross-communication object (XCom) containing prompts and results for batch processing
If errors occur during ChatGPT interactions, a DomainException will be thrown, which can be reviewed in the System Exceptions view.
The ChatGPTOperator allows seamless integration with the ChatGPT API for natural language processing tasks. Make sure to provide the correct `apiKey` and `prompt` parameters, including named parameters for dynamic prompts. Review the logs for detailed information on each interaction.
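Named parameters in the ":paramName" format amount to a simple template substitution, where each parameter is replaced by the value of the matching field. The Python sketch below illustrates that idea only; it is not the operator's code. The function name `render_prompt` is hypothetical, and leaving unknown names untouched is an assumption.

```python
import re
from typing import Dict

# A parameter is a colon followed by an identifier, e.g. ":title".
_PARAM = re.compile(r":([A-Za-z_][A-Za-z0-9_]*)")


def render_prompt(template: str, row: Dict[str, object]) -> str:
    """Replace each :paramName in the template with the value from `row`."""
    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        # Assumption: names without a matching value are left unchanged.
        return str(row[name]) if name in row else match.group(0)

    return _PARAM.sub(replace, template)
```

For example, rendering "Summarize :title in :lang" against a row with `title` and `lang` fields produces one concrete prompt per row.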
The HttpOperator facilitates HTTP requests to external APIs or services, supporting methods such as GET, POST, PUT, and DELETE.
url: The URL endpoint for the HTTP request
method: The HTTP method to use (GET, POST, PUT, DELETE)
timeout: Timeout duration in milliseconds for the HTTP connection
contentType: The content type of the request payload (e.g., application/json)
xcom (optional): Name of the cross-communication object (XCom) containing request body data for POST requests
authorizationHeader (optional): Authorization token or credentials for authenticated requests
If errors occur during HTTP requests, a DomainException will be thrown, which can be reviewed in the System Exceptions view.
The HttpOperator supports various HTTP methods for interaction with external APIs. Make sure to set up the `url`, `method`, `timeout`, and `contentType` parameters correctly. For POST requests, provide the request body via XCom and handle authentication with the `authorizationHeader` option.
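As a rough illustration of how these parameters relate to an actual HTTP request, the following Python sketch builds an equivalent request with the standard library. It is not how the operator is implemented. The function names are hypothetical, and sending the body as JSON is an assumption made for the example.

```python
import json
import urllib.request
from typing import Any, Optional, Tuple


def build_request(url: str, method: str, content_type: str,
                  body: Optional[Any] = None,
                  authorization_header: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a request from operator-style parameters."""
    # Assumption: the body is serialized as JSON when one is given.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method.upper())
    request.add_header("Content-Type", content_type)
    if authorization_header:
        request.add_header("Authorization", authorization_header)
    return request


def send(request: urllib.request.Request, timeout_ms: int) -> Tuple[int, str]:
    """Send the request; urlopen expects the timeout in seconds."""
    with urllib.request.urlopen(request, timeout=timeout_ms / 1000) as response:
        return response.status, response.read().decode("utf-8")
```

Building and sending are kept separate so that the request can be inspected without making a network call.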
The JdbcOperator facilitates interaction with JDBC-compliant databases, supporting SQL queries and updates.
url: The JDBC URL to connect to the database
user: The username for database authentication
pwd: The password for database authentication
driver: The JDBC driver class name
driverPath: The path to the JDBC driver JAR files
query: The SQL query to execute, supports SELECT, INSERT, UPDATE, DELETE queries
xcom (optional): Name of the cross-communication object (XCom) containing data for parameterized queries
If errors occur during database interactions, a DomainException will be thrown, which can be reviewed in the System Exceptions view.
The JdbcOperator supports SQL queries such as SELECT, INSERT, UPDATE, and DELETE against JDBC-compliant databases. Make sure to set up the `url`, `user`, `pwd`, `driver`, `driverPath`, and `query` parameters correctly. For parameterized queries, use the `xcom` option to pass dynamic data.
This operator supports named parameters in SQL queries. Use :parameterName syntax in your SQL queries to dynamically substitute values from the DataFrame rows.
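The idea of binding `:parameterName` placeholders once per DataFrame row can be tried out with Python's built-in `sqlite3` module, which happens to accept the same placeholder syntax. This is an analogy rather than the operator's JDBC code. The table, column names, and sample rows are made up for the example.

```python
import sqlite3
from typing import Dict, List


def run_for_each_row(conn: sqlite3.Connection, sql: str,
                     rows: List[Dict[str, object]]) -> None:
    """Execute `sql` once per row, binding :name placeholders from the row."""
    conn.executemany(sql, rows)
    conn.commit()


# Rows as they might come from an XCom DataFrame (sample data).
rows = [{"name": "Ann", "age": 31}, {"name": "Bob", "age": 27}]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
run_for_each_row(conn, "INSERT INTO people (name, age) VALUES (:name, :age)", rows)

stored = conn.execute("SELECT name, age FROM people ORDER BY name").fetchall()
```

Each dictionary key must match a placeholder name in the query, which mirrors the requirement that the DataFrame columns match the named parameters.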
The MailOperator sends an email using SMTP protocol. It supports authentication, plaintext or TLS encryption, and allows optional attachments and carbon copy (CC) recipients.
host: SMTP server hostname or IP address
port: Port number for SMTP server
userSmtp: SMTP username for authentication
pwdSmtp: SMTP password for authentication
fromMail: Sender's email address
toEmail: Recipient's email address
subject: Subject of the email
protocol: Email protocol, choose from "plaintext" or "TLSv1.2"
body (optional): Optional email body content in HTML format
xcom (optional): Optional name of cross-communication data (XCom) to append to the email body
attachedFilename (optional): Optional filename for attachment
stepAttachedFilename (optional): Optional name of cross-communication data (XCom) containing attachment content
ccList (optional): Optional semicolon-separated list of email addresses for CC
If errors occur during email sending, a DomainException will be thrown.
The MailOperator supports sending emails with optional attachments and CC recipients. Make sure to configure the `host`, `port`, `userSmtp`, `pwdSmtp`, `fromMail`, `toEmail`, `subject`, and `protocol` parameters correctly for successful email delivery. Use the `body`, `xcom`, `attachedFilename`, `stepAttachedFilename`, and `ccList` options as needed.
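To show how the message-related parameters fit together, here is a Python sketch that assembles a comparable email with the standard library. It does not reflect the operator's implementation. The function name is hypothetical, and the generic `application/octet-stream` attachment type is an assumption.

```python
from email.message import EmailMessage
from typing import Optional


def build_message(from_mail: str, to_email: str, subject: str, body_html: str,
                  cc_list: str = "",
                  attached_filename: Optional[str] = None,
                  attachment: Optional[bytes] = None) -> EmailMessage:
    """Assemble an HTML email with optional CC recipients and one attachment."""
    msg = EmailMessage()
    msg["From"] = from_mail
    msg["To"] = to_email
    msg["Subject"] = subject

    # ccList is a semicolon-separated string of addresses.
    cc = [address.strip() for address in cc_list.split(";") if address.strip()]
    if cc:
        msg["Cc"] = ", ".join(cc)

    msg.set_content(body_html, subtype="html")

    if attached_filename and attachment is not None:
        # Assumption: the attachment is sent as generic binary content.
        msg.add_attachment(attachment, maintype="application",
                           subtype="octet-stream", filename=attached_filename)
    return msg
```

Delivery itself would go through an SMTP client configured with the host, port, credentials, and protocol; that step is left out here.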
The MongoDBOperator facilitates interactions with MongoDB databases, supporting read, insert, and delete operations.
hostname: MongoDB server hostname or IP address
port: Port number for MongoDB server
mode: Operation mode, choose from "READ", "INSERT", or "DELETE"
database: Name of the MongoDB database
collection: Name of the MongoDB collection
timeout: Timeout in milliseconds for connection
username (optional): MongoDB username for authentication
password (optional): MongoDB password for authentication
filter (optional): Optional JSON filter for read operations
xcom (optional): Name of the cross-communication object (XCom) containing data for operations
If errors occur during MongoDB operations, a DomainException will be thrown.
The MongoDBOperator supports read, insert, and delete operations against MongoDB databases. Make sure to set up the `hostname`, `port`, `mode`, `database`, `collection`, and `timeout` parameters correctly. Use the `username`, `password`, `filter`, and `xcom` options as needed for authentication, query filtering, and cross-communication.
The RedisOperator facilitates interactions with Redis databases, supporting read, save, and delete operations.
hostname: Redis server hostname or IP address
port: Port number for Redis server
mode: Operation mode, choose from "READ", "SAVE", or "DELETE"
redisCluster: Boolean indicating whether Redis operates in cluster mode
keyObject: Key name or identifier for Redis data manipulation
xcom (optional): Name of the cross-communication object (XCom) containing data for operations
body (optional): Body or content for save operations
If errors occur during Redis operations, a DomainException will be thrown.
The RedisOperator supports read, save, and delete operations against Redis databases. Make sure to set up the `hostname`, `port`, `mode`, `redisCluster`, and `keyObject` parameters correctly. Use the `xcom` and `body` options as needed for cross-communication and data manipulation.
The SshOperator executes remote commands via SSH on a specified host.
host: SSH server hostname or IP address
user: Username for SSH authentication
port: Port number for SSH connection
cmd: Command to execute on the remote SSH server (Shell script or command)
pwd (optional): Password for SSH authentication
knowhostfile (optional): Path to a known hosts file for SSH
privateKeyFile (optional): Path to the private key file for SSH authentication
If errors occur during SSH command execution, a DomainException will be thrown.
The SshOperator supports executing commands on remote servers using SSH. Make sure to configure the `host`, `user`, `port`, and `cmd` parameters correctly. Use the `pwd`, `knowhostfile`, and `privateKeyFile` options as needed for authentication and known hosts management.
The ZookeeperOperator interacts with Apache ZooKeeper for managing distributed configuration and synchronization.
host: ZooKeeper server hostname or IP address
mode: Operation mode (READ, INSERT, DELETE)
path: ZooKeeper node path
timeout: Timeout in milliseconds for ZooKeeper connection
xcom (optional): Cross-communication variable name
If errors occur during interaction with ZooKeeper, a DomainException will be thrown.
The ZookeeperOperator supports managing ZooKeeper nodes with operations like read, insert, and delete. Configure `host`, `mode`, `path`, and `timeout` parameters correctly. Optionally, use `xcom` for cross-dag communication.
This section describes the technical details of the different execution channels of a DAG.
This channel is always active and cannot be deactivated. It is the main form of DAG execution, in which the DAG runs on a schedule.
This channel is always active and cannot be deactivated. It is used when a DAG is executed on demand via an HTTP call.
This channel can be activated or deactivated. From the input channels menu, you can configure the repository that this channel listens to and the DAG that it executes.
This channel can be activated or deactivated. From the input channels menu, you can configure the queue that this channel listens to and the DAG that it executes.
This channel can be activated or deactivated. From the input channels menu, you can configure the subscription that this channel listens to and the DAG that it executes.
This channel can be activated or deactivated. From the input channels menu, you can configure the subscription that this channel listens to and the DAG that it executes.
This channel can be activated or deactivated. From the input channels menu, you can configure the queue that this channel listens to and the DAG that it executes.
Additional features of the application will be documented here.
The scheduler used internally by Dagserver is Quartz. Quartz supports clustering through a database. Currently, only the MySQL engine (MariaDB) is supported.
The database connection must be configured in the quartz.properties file. To do so, set the connection URL and credentials in the "org.quartz.dataSource.quartzDS" parameters.
By default, Dagserver sets up an in-memory H2 database.
By default, Dagserver keeps XCom data and system exceptions in memory. This is not recommended in a production environment.
Disk persistence can be enabled instead by configuring two MapDB files: one to store the XComs and another to store the exceptions.
You must replace the "storage-hashmap" profile with the "storage-map-db" profile in the spring.profiles.active variable within the application.properties file.
In addition, the following variables must be configured in the same file:
Dagserver can be configured to work together with Keycloak (Single Sign-On).
You must replace the "auth-internal" profile with the "auth-keycloak" profile in the spring.profiles.active variable within the application.properties file.
In addition, the following variables must be configured in the same file: