Tasks and Scheduling
Tasks allow you to define arbitrary SQL scripts that run automatically at scheduled times. You can think of them as similar to cron jobs.
In addition to that, tasks can run on startup to configure DuckDB as needed.
Schedule tasks to:
- Load, clean up, and transform data
- Write data to remote sources
- Get creative with DuckDB community extensions, for example by using the HTTP client extension to send notifications to Slack
Run tasks on startup to:
- Install DuckDB extensions
- Attach databases
- Create views
How to Define Tasks
You can define tasks as files or via the UI.
Any file ending with .task.sql is considered a task.
To create a task via the UI, click “New” in the menu, then switch the type at the top from “Dashboard” to “Task”.
Unlike dashboards, which are restricted to read-only SQL, tasks may contain any DuckDB SQL statement except SET and PRAGMA statements.
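As a minimal sketch of what a task file can look like, the following combines a schedule statement with write statements that would not be allowed in a dashboard. The file name `cleanup_events.task.sql`, the tables `raw_events` and `daily_event_counts`, and the column `created_at` are hypothetical and only serve as illustration.

```sql
-- Hypothetical file: cleanup_events.task.sql
-- The first statement declares the schedule (here: every hour).
SELECT INTERVAL '1 hour'::SCHEDULE;

-- Create the target table on the first run; later runs skip this.
CREATE TABLE IF NOT EXISTS daily_event_counts (
    day DATE PRIMARY KEY,
    event_count BIGINT
);

-- Recompute per-day counts from the (assumed) raw_events table.
-- INSERT OR REPLACE overwrites rows whose primary key already exists.
INSERT OR REPLACE INTO daily_event_counts
SELECT CAST(created_at AS DATE) AS day, count(*) AS event_count
FROM raw_events
GROUP BY ALL;

-- Drop raw rows older than 90 full days. Cutting at a day boundary
-- keeps the counts above from being overwritten with partial days.
DELETE FROM raw_events
WHERE created_at < today() - INTERVAL '90 days';
```

The schedule statement is explained in the section that follows; everything after it is ordinary DuckDB SQL.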
Schedule tasks
The first SQL statement of a task defines its schedule by returning a single value of type SCHEDULE, which is either an INTERVAL or a TIMESTAMP.
Run every 5 minutes:
SELECT INTERVAL '5 minutes'::SCHEDULE;
Run every day at 1:00 AM (midnight today plus 25 hours is 1:00 AM tomorrow):
SELECT today() + INTERVAL '25h'::SCHEDULE;
Run every week on Monday at 1:00 AM (the start of the current week plus 7 days and 1 hour is next Monday at 1:00 AM):
SELECT date_trunc('week', now()) + INTERVAL '7days 1h'::SCHEDULE;
Startup Tasks
To run a task on startup, use 'init'::SCHEDULE:
SELECT 'init'::SCHEDULE;
Multiple init-tasks run in alphabetical order, and top-level tasks run before tasks in folders.
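A startup task covering the typical uses (installing an extension, attaching a database, creating a view) might look like the sketch below. The file name, the database path, the URL, and the view names are assumptions made for illustration; the numeric prefix is just one way to take advantage of the alphabetical ordering.

```sql
-- Hypothetical file: 01_setup.task.sql
-- The numeric prefix makes this sort before other init-tasks.
SELECT 'init'::SCHEDULE;

-- Install and load the httpfs extension so remote files can be read.
INSTALL httpfs;
LOAD httpfs;

-- Attach an existing DuckDB file (assumed path) under the alias "warehouse".
-- IF NOT EXISTS avoids an error if the alias is already attached.
ATTACH IF NOT EXISTS 'data/warehouse.duckdb' AS warehouse (READ_ONLY);

-- Expose a convenient view over an (assumed) table in the attached database.
CREATE OR REPLACE VIEW recent_orders AS
SELECT *
FROM warehouse.orders
WHERE ordered_at >= today() - INTERVAL '30 days';

-- Expose a remote Parquet file (placeholder URL) as a view.
CREATE OR REPLACE VIEW remote_products AS
SELECT *
FROM read_parquet('https://example.com/data/products.parquet');
```

Using `CREATE OR REPLACE` and `IF NOT EXISTS` keeps the script safe to run again on every startup.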
Memory Mode
When the duckdb option is set to :memory:, a new database is initialized for every dashboard. In that case, not all init-tasks run: only the init-tasks in the same folder as the dashboard and in its parent folders are executed.
Non-scheduled Scripts
You can also run a task's SQL script directly in the UI by pressing the “Run” button. This is useful for testing and for scripts you only want to trigger manually.
If you do not want a task to run automatically, omit the ::SCHEDULE statement.
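For instance, a one-off export that should only run on demand could be written without any schedule statement. This is a sketch; the file name, the table `raw_events`, and the output path are hypothetical.

```sql
-- Hypothetical file: export_events.task.sql
-- There is no ::SCHEDULE statement, so this script never runs on its own.
-- It only runs when triggered with the "Run" button in the UI.

-- Write the contents of the (assumed) raw_events table to a Parquet file.
COPY (
    SELECT *
    FROM raw_events
    ORDER BY created_at
) TO 'exports/raw_events.parquet' (FORMAT parquet);
```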
Failed tasks and monitoring
If a task fails, it is retried automatically the next time it is scheduled to run. If determining the next scheduled run time fails, the task is not retried automatically.
The time and status of the last task run are shown in the UI.
To make sure that tasks work correctly and run at the right times, you still need to monitor Shaper. For more on monitoring, see the Deploy Docs.
Tasks when running Shaper in a cluster of multiple nodes
Scheduled tasks and manually triggered tasks run on only one node in the cluster, while init-tasks run on all nodes.
When running Shaper in a cluster, don’t store data in Shaper’s built-in DuckDB.
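One way to follow this advice, sketched here under the assumption that a shared PostgreSQL database is available, is to attach the external database in an init-task (so every node has it) and have scheduled tasks write only to that database. The file names, connection string, schema, and table names are hypothetical, and the target table `order_summary` is assumed to exist already.

```sql
-- Hypothetical file: attach_pg.task.sql
-- Init-tasks run on all nodes, so each node attaches the shared database.
SELECT 'init'::SCHEDULE;

INSTALL postgres;
LOAD postgres;

-- Assumed connection details; supply credentials as appropriate for your setup.
ATTACH IF NOT EXISTS 'dbname=analytics host=db.internal user=shaper'
    AS pg (TYPE postgres);

-- Hypothetical second file: refresh_summary.task.sql
-- Scheduled tasks run on one node only, so the refresh happens once.
SELECT INTERVAL '15 minutes'::SCHEDULE;

-- Rebuild the summary inside the shared database, not the local DuckDB.
DELETE FROM pg.public.order_summary;

INSERT INTO pg.public.order_summary
SELECT customer_id, count(*) AS order_count
FROM pg.public.orders
GROUP BY ALL;
```

The two scripts are shown in one block for brevity but would live in separate `.task.sql` files.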
Disable Tasks Functionality
You can disable all tasks functionality by setting the flag --no-tasks or the environment variable SHAPER_NO_TASKS=true.
This also disables existing tasks but does not delete them, so if you later remove the flag, the tasks become available again.