At incident.io, we are strong advocates for database migrations. Mastering the ability to execute them swiftly and securely is crucial for keeping up with the growth of our product.
While we have previously shared our guidelines for handling migrations safely, today we are delving into a specific topic – migrating JSONB columns.
It’s important to note that although we use Postgres, this approach applies to any SQL database with a JSON column type.
So, what happens when the structure of a JSON-stored type changes?
Following our playbook, the goal is to migrate all existing data into the new format. But how do we manage both formats simultaneously during the transition period?
Let’s consider a recent example: OR Conditions.
Throughout our application, we support the concept of conditions – “If A and B, do X”.
Recently, we introduced support for OR logic. For instance, “If (A and B) or (C and D), do X”.
Internally, we label each set of OR’d conditions as a condition group.
In the previous system, we stored our conditions as a JSON array of objects, with each object representing a condition. With the introduction of condition groups, we needed to store an array of arrays of objects.
Below is our original domain type representing the schema: `conditions` is a JSONB Postgres column, which is unmarshalled into a slice of Go structs.
```go
type EngineConfig struct {
	ID         string     `json:"id" gorm:"type:text;primaryKey"`
	Conditions Conditions `json:"conditions" gorm:"type:jsonb"`
}

type Conditions []Condition

type Condition struct {
	Operation string `json:"operation"`
	// ...
}
```
And here is our new domain type, now storing condition groups:
```go
type EngineConfig struct {
	ID              string          `json:"id" gorm:"type:text;primaryKey"`
	ConditionGroups ConditionGroups `json:"conditions" gorm:"column:conditions;type:jsonb"`
}

type ConditionGroups []ConditionGroup

type ConditionGroup struct {
	Conditions []Condition `json:"conditions"`
}
```
How do we transition data from the old format, `[]Condition`, to the new `[]ConditionGroup`?
The solution lies in the `sql.Scanner` and `driver.Valuer` interfaces.
```go
type Scanner interface {
	Scan(src any) error
}

type Valuer interface {
	Value() (Value, error)
}
```
By implementing `Scan` and `Value` on a column’s type, we can read and write data to and from the database, respectively.
Thus, we can provide custom marshaling logic to manage data in both formats securely.
```go
func (g *ConditionGroups) Scan(val any) error {
	bytes, ok := val.([]byte)
	if !ok {
		return errors.New("could not parse condition groups as []byte")
	}

	// First, try the new format: an array of condition groups.
	if err := json.Unmarshal(bytes, g); err == nil {
		if len(*g) > 0 && (*g)[0].Conditions != nil {
			return nil
		}
	}

	// Otherwise, fall back to the old format: a flat array of conditions.
	conditions := []Condition{}
	if err := json.Unmarshal(bytes, &conditions); err != nil {
		return errors.New("could not unmarshal bytes into []Condition")
	}

	if len(conditions) == 0 {
		*g = ConditionGroups{}
		return nil
	}

	// Wrap the legacy conditions in a single group.
	*g = ConditionGroups{
		ConditionGroup{
			Conditions: conditions,
		},
	}

	return nil
}

func (g ConditionGroups) Value() (driver.Value, error) {
	return json.Marshal(g)
}
```
With our custom scanner in place, we can write data in the new format while still being able to read both old and new data.
This allows us to continue work on the rest of the codebase, enabling the change to progress through the backend and frontend simultaneously with the remaining migration tasks.
The subsequent step in the runbook involves backfilling all old data in an asynchronous process to unify the data state.
Once again, the scanner proves invaluable – we can simply read the data and immediately rewrite it.
Performing this operation in a batch with an `ON CONFLICT DO UPDATE` clause significantly improves performance.
Please note that we have also introduced a temporary indexed column `conditions_migrated` to assist with the backfill process.
```go
func backfillConditions(ctx context.Context, db *gorm.DB) error {
	maxExpectedRows := 300000
	batchSize := 1000
	maxIts := maxExpectedRows / batchSize

	for i := 0; i < maxIts; i++ {
		existing, err := domain.NewQuerier[domain.EngineConfig](db).
			Filter(domain.EngineConfigBuilder(
				domain.EngineConfigBuilder.ConditionsMigrated(false),
			)).
			Limit(batchSize).
			Find(ctx)
		if err != nil {
			return err
		}
		if len(existing) == 0 {
			break
		}

		payloads := []domain.EngineConfig{}
		for _, e := range existing {
			payloads = append(payloads, domain.EngineConfigBuilder(
				domain.EngineConfigBuilder.ID(e.ID),
				domain.EngineConfigBuilder.OrganisationID(e.OrganisationID),
				domain.EngineConfigBuilder.Conditions(e.Conditions),
				domain.EngineConfigBuilder.ConditionsMigrated(true),
			).Build())
		}

		res := db.
			WithContext(ctx).
			Clauses(clause.OnConflict{
				Columns: []clause.Column{{Name: "id"}},
				DoUpdates: clause.AssignmentColumns([]string{
					"conditions",
					"conditions_migrated",
				}),
			}).
			Create(payloads)
		if res.Error != nil {
			return res.Error
		}
	}

	fmt.Println("Finished!")
	return nil
}
```
Once the backfill operation is complete, the data will be in a consistent state. You can then remove the custom scanner and proceed with your tasks.