MariaDB Recursive Table Expressions

Published on Oct 7, 2021
Topics: Coding

I recently learned about recursive expressions in MariaDB. This offers a pretty good solution for dealing with hierarchy data, it isn't perfect but it can save some you some coding.

Let use a fairly simple example, we need to track employees and their supervisors. That isn't too bad to do, in our system we want employees to have multiple supervisors so we need a pivot table.

Old Method - Flattened Table

Thanks to this Stack Overflow answer I have usually used a table to track direct relationships and also maintained a flattened table.

Here is a simplified set of tables to illustrate what would happen.

employees
name

employee_supervisor
employee
supervisor

employee_supervisor_all
employee
supervisor

employee_supervisor_all table would contain the entire tree structure flattened to simplify look ups. But populating it is kind of a pain.

Given a hierarchy like..

  • Roger supervises Don
  • Don supervises Peggy
  • Peggy supervises Ginsberg

Our direct supervisor table looks like...
supe

And our full supervisor table looks like...
supe

This works great to quickly answer a question like, "Does Roger supervise Peggy?"

SELECT COUNT(*) FROM employee_supervisor_all WHERE supervisor = 'Roger' and employee = 'Peggy';

We don't need to do any special code to handle it, or recursively work through a list of people that Roger supervises. You, of course, have to write some recursive functions to fill out that employee_supervisor_all table. Also you need to decide how to update that table. Are you going to update the entire table with every supervisor change? With a background task every few minutes? Are you going to try and write some precise update logic so that if a supervisor two levels up changes it can handle it quickly?

It is a real pain. In my last project I ended up writing functions to update the whole table at once - starting with employee that had no supervisors and working down to employees with no underlings. I then scheduled a job to update that table every 10 minutes. It works well enough - but definitely has downfalls.

New Method - Recursive Table Expressions

We can eliminate the need for the employee_supervisor_all table by using a recursive table expression.

To get answer the question, "Does Roger supervise Peggy?" we use a query like...

WITH recursive emp_sup as (
    SELECT *
        FROM employee_supervisor WHERE employee = 'Peggy'
    UNION
    SELECT es.*
        FROM employee_supervisor AS es, emp_sup
        WHERE es.employee = emp_sup.supervisor
)
SELECT COUNT(*) FROM emp_sup WHERE supervisor = 'Roger';

It is not beautiful, but I find it much easier than writing code to populate the flattened list.

To get the list of all of Ginsberg's supervisors you would do...

WITH recursive emp_sup as (
    SELECT *
        FROM employee_supervisor WHERE employee = 'Ginsberg'
    UNION
    SELECT es.*
        FROM employee_supervisor AS es, emp_sup
        WHERE es.employee = emp_sup.supervisor
)
SELECT supervisor FROM emp_sup

Of course if you want just direct supervisors, just do the query the old fashioned way

SELECT supervisor FROM employee_supervisor WHERE employee = 'Ginsberg';

You can look up all underlings for a supervisor too

WITH recursive emp_sup as (
    SELECT *
        FROM employee_supervisor WHERE supervisor = 'Don'
    UNION
    SELECT es.*
        FROM employee_supervisor AS es, emp_sup
        WHERE es.supervisor = emp_sup.employee
)
SELECT employee FROM emp_sup

Notice that change to the union when switching from supervisor look up to underling look up?

That union is really where the magic happens. That explains how to traverse the tree. It says to keep going up the tree, grabbing anyone with a supervisor that matches the employee supervised by 'Don'.

Laravel Integration

I am looking for a way to integrate this seamlessly within Laravel. The best came I up with right now is to have a method that fetches the id fields and then query the model. Using a little local caching to help prevent multiple queries.

<?php

namespace App\Models;

use Illuminate\Database\Eloquent\Collection;
use Illuminate\Database\Eloquent\Factories\HasFactory;
use Illuminate\Database\Eloquent\Model;
use Illuminate\Support\Facades\DB;

/**
 * @property int $id
 */
class Employee extends Model {

    public function supervisors()
    {
        return $this->belongsToMany(Employee::class, 'employee_supervisor', 'employee_id', 'supervisor_id');
    }
    
    protected static $allSupervisorIds = [];
    public function getAllSupervisorIds() : array
    {
        if(!isset(static::$allSupervisorIds[$this->id])) {
            $records = DB::select(DB::raw(
                "WITH recursive emp_super as (
                SELECT *
                    FROM employee_supervisor
                    WHERE employee_id = {$this->id}
                UNION
                SELECT es.*
                    FROM employee_supervisor AS es, emp_super
                    WHERE es.employee_id = emp_super.supervisor_id
            )
            SELECT * FROM emp_super;"
            ));
    
            static::$allSupervisorIds[$this->id] = array_map(fn($r) => $r->supervisor_id, $records);
        }
    
        return static::$allSupervisorIds[$this->id];
    }
}

Then in a controller or elsewhere you can do stuff like this...

<?php

$bob = Employee::findOrFail('Bob');
$directSupervisors = $bob->supervisors;
$allSupervisors = Employee::whereIn('id', $bob->getAllSupervisorIds());

$roger = Employee:findOrFail('Roger');
$doesRogerSuperviseJohn = in_array($roger->id, $bob->getAllSupervisorIds());

Performance Notes

I didn't see any huge performance hits with recursive table functions in my data sets. On my set of a few hundred employees with some complex scenario situations it took around 3ms to query the worst case scenario (grab all underlings for the person who supervises everyone). If these queries are not performant enough on your system I would suggest using them to help maintain your employee_supervisor_all table mentioned above.