When performance problems appear in a database, the first reaction is often:
“Let’s add an index.”
Sometimes that works. Sometimes it changes nothing. And in some cases, it even makes the query slower.
The problem is not indexes themselves — it’s the misunderstanding of how indexes actually work.
This article explains database indexing in a practical and beginner-friendly way using simple SQL examples. Instead of memorizing rules, you’ll understand why indexes help, when they fail, and how to design them correctly.
The Example Table
Let’s start with a simple orders table containing hundreds of thousands of rows.
CREATE TABLE orders
(
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
firstName VARCHAR(255) NOT NULL,
lastName VARCHAR(255) NOT NULL,
address VARCHAR(255) NOT NULL,
status ENUM ('created', 'preparing', 'prepared', 'shipped', 'delivered') NOT NULL,
canPreparingStart TINYINT(1) NOT NULL,
created_at TIMESTAMP NULL,
updated_at TIMESTAMP NULL,
deleted_at TIMESTAMP NULL
)
COLLATE = utf8mb4_unicode_ci;
Now imagine the application frequently runs this query:
SELECT *
FROM orders
WHERE firstName = 'Pavel';
At first glance, the query looks harmless.
But with hundreds of thousands (or millions) of rows, performance starts becoming a problem.
Understanding the Problem with EXPLAIN
Before optimizing queries, we should understand how the database executes them.
That’s where EXPLAIN becomes useful.
EXPLAIN
SELECT *
FROM orders
WHERE firstName = 'Pavel';
Without an index, the database performs a full table scan.
That means:
Every row is read
Every row is checked
The database scans the entire table
Even if only a few rows match.
This is expensive.
What Is an Index?
An index helps the database find rows faster without scanning the entire table.
The most common type is the B-Tree index.
Think of it like a phone book.
If you want to find someone named “Pavel”:
You don’t start reading from page 1
You jump somewhere near the middle
Then narrow the search step by step
That’s essentially how a B-Tree index works.
Adding the First Index
Let’s create an index on firstName.
CREATE INDEX orders_firstName_index
ON orders (firstName);
Now run the same query again:
SELECT *
FROM orders
WHERE firstName = 'Pavel';
Performance improves dramatically because the database now performs an:
Index lookup
instead of aFull table scan
The index allows the database to jump directly to matching rows.
B-Tree Indexes Work Left to Right
At this point, many developers think:
“Great. Indexes solved the problem.”
But here comes the first important limitation.
Consider this query:
SELECT *
FROM orders
WHERE firstName LIKE '%vel';
Or:
SELECT *
FROM orders
WHERE firstName LIKE '%ave%';
Even with the index present, the database often ignores it and performs a full table scan.
Why?
Because B-Tree indexes work from left to right.
Why Leading Wildcards Break Indexes
A B-Tree index stores data in sorted order.
For example:
Adam
Alex
Daniel
Pavel
Peter
The database can efficiently search:
LIKE 'Pa%'
because it knows where names starting with “Pa” begin.
But with:
LIKE '%vel'
there’s no starting point.
The database has no choice but to scan everything.
What B-Tree Indexes Handle Well
B-Tree indexes are excellent for:
Exact matches
WHERE firstName = 'Pavel'
Prefix searches
WHERE firstName LIKE 'Pa%'
Range queries
WHERE created_at BETWEEN '2025-01-01' AND '2025-12-31'
What B-Tree Indexes Do NOT Handle Well
B-Tree indexes struggle with:
Suffix searches
LIKE '%text'
Contains searches
LIKE '%text%'
Functions on columns
WHERE LOWER(firstName) = 'pavel'
Different Queries Need Different Indexes
This does not mean those queries cannot be optimized.
It simply means they require different index types, such as:
Full-text indexes
Functional indexes
Trigram indexes
The key lesson:
B-Tree indexes are powerful, but they are not universal solutions.
The Second Limitation: Data Distribution
Now let’s look at another example.
Imagine this query:
SELECT *
FROM orders
WHERE deleted_at IS NULL
LIMIT 20000 OFFSET 0;
A common assumption is:
“Let’s index
deleted_at.”
So we do:
CREATE INDEX orders_deleted_at_index
ON orders (deleted_at);
Surprisingly, performance may actually become worse.
Why?
The Data Matters More Than the Index
Suppose almost every row has:
deleted_at = NULL
This means the column has low selectivity.
In simple terms:
The index cannot effectively narrow down results
Too many rows match the condition
The database still has to process a huge amount of data.
Phone Book Analogy Again
Imagine a phone book where almost everyone has the same name.
Even though the book is sorted, searching becomes inefficient because the result set is massive.
That’s exactly what happens with low-selectivity indexes.
High Selectivity Makes Indexes Powerful
Now imagine we soft-delete most records.
Suddenly:
WHERE deleted_at IS NULL
matches only a small percentage of rows.
Now the index becomes extremely useful because it can quickly isolate a tiny subset of records.
This is why:
Good indexing is not only about queries — it’s also about data distribution.
Composite Indexes
Now let’s move to composite indexes.
Suppose we frequently run:
SELECT *
FROM orders
WHERE firstName = 'Pavel'
AND lastName = 'Komin';
A composite index can help significantly.
CREATE INDEX orders_first_last_index
ON orders (firstName, lastName);
This is usually much better than creating two separate indexes.
Why Composite Indexes Work Better
With separate indexes:
INDEX(firstName)
INDEX(lastName)
the database may still need to combine results internally.
But with a composite index:
(firstName, lastName)
the database can directly navigate to the exact combination.
This reduces work dramatically.
Column Order Is Critical
Composite indexes still follow the same left-to-right rule.
So this index:
(firstName, lastName)
works well for:
WHERE firstName = 'Pavel'
and:
WHERE firstName = 'Pavel'
AND lastName = 'Komin'
But not efficiently for:
WHERE lastName = 'Komin'
because the search starts from the second column, breaking the left-to-right chain.
Choosing Column Order Correctly
The first column in a composite index should ideally be:
Frequently queried
Highly selective
This allows the database to eliminate as many rows as possible early in the search process.
The Biggest Indexing Mistake
One of the most common mistakes is creating indexes without understanding:
Query patterns
Data distribution
Selectivity
Index structure
Indexes are not magic.
Poorly designed indexes can:
Waste storage
Slow down inserts and updates
Increase maintenance overhead
Make queries slower
Key Takeaways
1. B-Tree indexes work left to right
They are optimized for:
Exact matches
Prefix searches
Range queries
2. Leading wildcards break B-Tree indexes
These queries are problematic:
LIKE '%text'
LIKE '%text%'
3. Data selectivity matters
Indexes on low-diversity columns often provide little benefit.
4. Composite indexes depend on column order
The first column is the most important.
5. Good indexing requires understanding your workload
Always analyze:
Query frequency
Filtering patterns
Cardinality/selectivity
Execution plans (
EXPLAIN)
before adding indexes.
Final Thoughts
Indexes are one of the most powerful tools for database optimization — but only when used correctly.
Understanding:
how B-Tree indexes work,
why selectivity matters,
and how composite indexes behave
will help you optimize databases far more effectively than simply “adding indexes everywhere.”
The best database engineers don’t just create indexes.
They understand why the database chooses to use them — or ignore them.