default/override: An Elegant Schema for User Settings

JJ Maxwell

6 February 2025 (UTC)

Engineering

Share this Article

Table of Contents

Common Mistakes: What *Not* To Do

A Better SQL Schema: default/override

How to Use It

Pros

Cons

In Short: Ditch Giant Tables, Use The Default-Override Method

Every app, even basic ones, needs to manage user settings. It always starts simple, right? "Does the user want the Weekly Email?" But then features grow, and suddenly you're dealing with tons of settings: themes, notifications, currencies, time zones, the list goes on. If you're building something complex, like Double – which is for investing – settings get really important.

We have to manage things like fraud settings, drift limits and wash sale prevention settings. Some of these we want to be user controlled, some should only be editable by an admin user. How you store and handle these settings really matters for your app's speed, how well it scales, and how easy it is to maintain. I've tried a bunch of ways to manage user settings, and some are way better than others. Let's look at common mistakes first, and then I'll show you a schema that we came up with recently that's been working really well for us.

Common Mistakes: What Not To Do

1. The Wide Table Trap

One common mistake to avoid is the wide table trap, where the initial thought for managing new settings is to simply keep adding columns. It may seem easy at first; for example, setting up daily emails might look as simple as adding a wants_daily_email boolean column to your users table with a quick ALTER TABLE command. However, this approach quickly leads to columns getting out of control.

Before long, you'll need columns for newsletters, app notifications, dark mode, time zones, and other features, causing your users table to become a massive, wide structure. Wide tables can negatively impact performance, as queries take longer because the database has to process huge rows, many of which are sparsely populated. Indexes become more complicated, and backups and restores take significantly longer.

Schema changes also become a pain point. Adding another setting involves another risky ALTER TABLE operation in a live application. Remembering the purpose of each column becomes a documentation nightmare, and changing or removing old settings becomes a major project.

Finally, because most users only modify a few settings, your users table becomes filled with wasted space in the form of numerous NULL values, further slowing down queries and wasting storage.

2. The Key-Value Mess

Another approach that may seem appealing after rejecting wide tables is the "grab bag" table, often implemented as a generic user_settings table with columns like user_id, setting_name, and setting_value.

At first glance, this key-value structure appears flexible, because it promises easy addition of new settings without altering the main table. However, bulk updates soon become a significant challenge. For example, adding a new setting to all users could require millions of insert operations, resulting in large, risky database transactions. This can lead to server instability and data inconsistencies, often necessitating complex error handling, retry mechanisms, and background jobs to ensure data integrity.

Data quality also becomes a concern because the setting_value field can accommodate any data type – text, numbers, booleans – making it difficult for the database to enforce consistency. Consequently, all data validation and type handling logic must be implemented in the application code, increasing complexity and the potential for bugs. Finally, user signup processes can become slower.

Setting default settings for new users may require inserting numerous rows into the user_settings table, adding latency to the signup flow and increasing the chances of failures during this critical operation.

3. The Schemaless Temptation

Another tempting approach, especially with the rise of NoSQL databases like MongoDB, is to store settings as JSON documents directly within each user's record. The schemaless nature of NoSQL databases initially appears advantageous, and using JSON for settings feels like a natural fit. Adding a new setting seems as simple as adding a new key to the JSON document. However, querying across users for specific settings becomes challenging.

While document databases excel at retrieving all data for a single user, asking questions about a particular setting across all users becomes slow and inefficient. For instance, answering "How many users use dark mode?" could require a full database scan, which does not scale efficiently.

Furthermore, the perceived benefit of being "schemaless" diminishes when data consistency is required. Ensuring that every user has a specific setting or validating data types becomes the application's responsibility, as the database provides no inherent mechanisms for these rules.

Finally, while NoSQL promises flexibility, settings migrations over time remain complex. Updating settings across numerous NoSQL documents often necessitates custom scripts and background jobs, mirroring the same challenges encountered with the "grab bag" SQL table approach in terms of complexity and potential issues.

A Better SQL Schema: default/override

After struggling with these problems, I've found a SQL schema that actually works well. It balances flexibility, performance, and keeps your data sane. It uses two key tables: default and override.

default Table: The Master List of Settings

Think of default as the single place where you define all possible settings in your app. It's like the blueprint for your settings system.

Column	Data Type	Description	Example
`name`	`VARCHAR(255)`	Unique name for the setting (primary key).	`rebalance_cadence`
`value`	`TEXT`	The default value everyone gets.	'monthly'
`value_type`	`ENUM('boolean', 'string', 'integer', 'enum')`	What is the data type of this setting?	'enum'
`allowed_values`	`JSON`	(Optional, for 'enum' type) List of valid choices as JSON.	`['daily', 'weekly', 'monthly']` (for rebalance cadence)
`is_user_editable`	`BOOLEAN`	Can users change this setting themselves?	`TRUE` (for rebalance cadence), `FALSE` (for internal flags)

override Table: User-Specific Changes

override only stores settings where a user has changed the default. It's kept lean and efficient.

Column	Data Type	Description	Example
`user_id`	`INTEGER`	Links to the `users` table (part of primary key).	12345
`name`	`VARCHAR(255)`	Links to `default.name` (part of primary key).	`rebalance_cadence`
`value`	`TEXT`	The user's chosen value for the setting.	'weekly'
`created_at`	`TIMESTAMP`	When the user made this change.	`2023-10-27 10:00:00`
`updated_at`	`TIMESTAMP`	When the user last updated this setting.	`2023-10-27 10:00:00`

How to Use It

This system is designed to be simple and effective in practice. Adding a new setting is straightforward; you only need to insert a single row into the default table. Here, you define the setting's name, its default value, the data type, any valid options, and whether users are allowed to edit this setting. Crucially, this process requires no database schema alterations and avoids mass data modifications.

To retrieve a user's setting, the COALESCE function (or its equivalent in your ORM) efficiently handles the logic. It prioritizes checking the override table. If a user-specific setting exists there, that value is used. Otherwise, the function falls back to the default value defined in the default table.

SELECT COALESCE(so.value, sd.value) AS setting_value
FROM default sd
LEFT JOIN override so ON sd.name = so.name AND so.user_id = :user_id
WHERE sd.name = :setting_name;

Retrieving multiple settings for a user simultaneously remains efficient through the use of joins and filters. Furthermore, data validation is enforced at the SQL level. When a setting is modified, a new row is inserted into the override table. Before saving this override, validation can be performed based on the value_type and allowed_values columns defined in default, allowing the system to reject invalid setting changes proactively.

For user settings management, users can easily retrieve all their settings, encompassing both their personalized overrides and the system defaults, or filter specifically for settings they are permitted to modify. Users can then update their settings via the application's API, which incorporates built-in validation checks and persists valid changes to the override table.

Pros

Pros: This approach offers several advantages. It scales well, allowing for fast addition and retrieval of settings. Since only user modifications are stored, the database remains efficient. Data consistency is ensured by default which establishes data types and valid options, with database-level checks promoting reliability.

Settings management is simplified with default acting as a central list, eliminating ambiguity about setting definitions. The design inherently maintains data consistency, negating the need for supplementary background processes to synchronize settings.

The system is clean and organized, clearly separating defaults from user-specific changes, enhancing overall system clarity and usability. is_user_editable provides control over user modifications, enabling specification of settings users can adjust, which is beneficial for sophisticated applications.

Finally, it facilitates efficient bulk operations and the creation of user-facing settings interfaces, making it easy to retrieve numerous settings at once.

Cons

Cons: This system also presents some drawbacks. The primary disadvantage is its complexity compared to simpler methods like wide tables or key-value based approaches.

Additionally, gaining easy insight into historical settings is limited. As user settings rely on two tables, tracking a setting's value at a specific past time is not straightforward, requiring periodic snapshots of both tables for reconstruction, which is less than ideal.

Lastly, retrieving settings necessitates querying the application's backend to access the database (or use a cache). While slightly slower than direct column access from a users table, the benefits generally outweigh this minor performance difference. Database views can also be implemented to mitigate this speed concern in many scenarios.

In Short: Ditch Giant Tables, Use The Default-Override Method

For apps that need to be reliable and handle lots of users, especially complex ones like investing platforms, how you store settings really matters. Moving away from simple wide tables and using a structured schema like default/override has been a great move for us here at Double. It's cleaner, scales better, easier to maintain, and keeps our data in good shape.

What's your experience with handling user settings? Run into similar problems? Found different solutions?

Don't miss an article.

Get the most recent news and updates straight from Double Team!

Our Tools

See All

Fee Calculator

With zero AUM fees on Double, see how much savings a portfolio might accumulate compared to a traditional wealth advisor fee or ETF expense ratio.

0.25%

0.1%

1% Typical Fee

0.89%

Tax Bracket Estimator

Estimate the taxes across federal, state, short term and long term for your situation.

JJ Maxwell

6 February 2025 (UTC)

Engineering

Share this Article

Table of Contents

Common Mistakes: What *Not* To Do

A Better SQL Schema: default/override

How to Use It

Pros

Cons

In Short: Ditch Giant Tables, Use The Default-Override Method

Our Tools

See All

Fee Calculator

With zero AUM fees on Double, see how much savings a portfolio might accumulate compared to a traditional wealth advisor fee or ETF expense ratio.

0.25%

0.1%

1% Typical Fee

0.89%

Tax Bracket Estimator

Estimate the taxes across federal, state, short term and long term for your situation.

Don't miss an article.

Get the most recent news and updates straight from Double Team!