Introduction to SQL Indexing
Database performance can make or break an application. Even the most elegantly designed system will frustrate users if queries take too long to execute. Among the most powerful tools in a database administrator's arsenal for performance optimization is proper indexing.
This guide covers what you need to know about SQL indexing to boost database performance, from basic concepts to advanced strategies, with practical examples and ways to benchmark the results.
What You'll Learn
This guide covers both fundamental and advanced indexing concepts, with practical examples you can apply immediately. You'll learn to identify when indexes are appropriate, which type to use, and how to measure their impact on performance.
Indexing Fundamentals
At its core, an index is a data structure that improves the speed of data retrieval operations on a database table. Think of it like the index at the back of a book—instead of reading through every page to find information, you can quickly look up specific topics.
How Indexes Work
Without an index, the database engine must perform a full table scan, examining every row to find the ones that match. This becomes progressively more expensive as tables grow larger.
An index creates a separate structure that contains:
- The indexed column(s) values
- Pointers to the actual rows in the table
This structure is typically stored in a B-tree format, which keeps the values sorted and allows rapid lookups.
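To make the "sorted values plus pointers" idea concrete, here is a minimal Python sketch. It is not a real B-tree (which is a multi-level tree of pages); it uses a sorted list and binary search to show the same principle. The `rows` data and both function names are invented for illustration.

```python
import bisect

# Hypothetical table: a row's position in the list acts as its pointer.
rows = [
    {"id": 1, "last_name": "Nguyen"},
    {"id": 2, "last_name": "Smith"},
    {"id": 3, "last_name": "Adams"},
    {"id": 4, "last_name": "Smith"},
]

def full_scan(value):
    """Check every row, as a query without an index must."""
    return [pos for pos, row in enumerate(rows) if row["last_name"] == value]

# The "index": (value, row position) pairs kept in sorted order.
index = sorted((row["last_name"], pos) for pos, row in enumerate(rows))

def index_lookup(value):
    """Binary-search the sorted entries, then collect the matching pointers."""
    start = bisect.bisect_left(index, (value, -1))
    matches = []
    for key, pos in index[start:]:
        if key != value:
            break  # entries are sorted, so no later entry can match
        matches.append(pos)
    return matches

print(full_scan("Smith"))
print(index_lookup("Smith"))
```

Both calls return the same row positions, but the lookup touches only the entries near the match instead of every row.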

Common Index Types
Different databases support various index types, but most offer these common options:
Index Type | Description | Best Use Cases |
---|---|---|
B-tree (Balanced tree) | Standard index type that sorts values in a tree structure | General-purpose indexing for equality and range queries |
Clustered | Determines physical order of data in table | Primary keys, frequently accessed range queries |
Non-clustered | Separate structure with pointers to data | Secondary access paths, covering indexes |
Unique | Enforces uniqueness of indexed columns | Primary keys, unique constraints |
Composite | Index on multiple columns | Queries filtering on multiple columns together |
Full-text | Specialized for text searching | Content searching, document indexing |
When to Create Indexes
While indexes dramatically improve query performance, they aren't free—they consume storage space and add overhead to write operations. Here's when you should consider creating an index:
Good Index Candidates
- Columns frequently used in WHERE clauses
- Columns used in JOIN conditions
- Columns used in ORDER BY or GROUP BY operations
- Columns with high cardinality (many unique values)
Poor Index Candidates
- Small tables (full table scans may be faster)
- Columns with low cardinality (few unique values)
- Columns rarely used in queries
- Columns on tables with frequent INSERT/UPDATE/DELETE operations, unless an important query depends on them
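One rough way to judge cardinality is to compare the number of distinct values in a column with the total row count. The sketch below does this with Python's built-in sqlite3 module; the `customers` table, its data, and the `selectivity` helper are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO customers (email, status) VALUES (?, ?)",
    [(f"user{i}@example.com", "active" if i % 10 else "inactive") for i in range(1000)],
)

def selectivity(table, column):
    """Fraction of distinct values: near 1.0 is high cardinality, near 0 is low."""
    # Identifiers are interpolated directly, so pass only trusted names.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM {table}"
    ).fetchone()
    return distinct / total

print(selectivity("customers", "email"))   # every value is unique
print(selectivity("customers", "status"))  # only two distinct values
```

A column like `email` here is a strong index candidate, while `status` on its own is not. A partial index on a rare status value can still be worthwhile.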
Performance Trade-off
Remember that each index speeds up reads but slows down writes. For write-heavy applications, be particularly selective about which indexes to create.
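The storage side of this trade-off is easy to observe. The following sketch, assuming SQLite and a hypothetical `events` table, compares the number of database pages before and after creating an index. Every later write to the table also has to maintain those extra pages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
    [(i % 500, "x" * 50) for i in range(20000)],
)
conn.commit()

def page_count():
    """Total number of pages currently allocated to the database."""
    return conn.execute("PRAGMA page_count").fetchone()[0]

pages_before = page_count()
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
conn.commit()
pages_after = page_count()

print(f"pages before index: {pages_before}, after: {pages_after}")
```

The exact numbers depend on the page size and SQLite version, but the count after the index is created should be higher.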
Creating Effective Indexes
Let's explore how to create different types of indexes with practical SQL examples.
Basic Index Creation
Here's the syntax for creating a simple index:
-- Basic non-clustered index
CREATE INDEX idx_last_name
ON employees (last_name);
-- Unique index
CREATE UNIQUE INDEX idx_employee_id
ON employees (employee_id);
-- Specific index type (syntax varies by database system)
-- PostgreSQL: GIN index over a tsvector expression for full-text search
CREATE INDEX idx_description
ON products
USING GIN (to_tsvector('english', description));
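The difference between a plain and a unique index can be checked with a small experiment. This sketch runs the first two statements above against an in-memory SQLite database (an assumption, since the article targets several systems) and shows that only the unique index rejects duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, last_name TEXT)")
conn.execute("CREATE INDEX idx_last_name ON employees (last_name)")
conn.execute("CREATE UNIQUE INDEX idx_employee_id ON employees (employee_id)")

conn.execute("INSERT INTO employees VALUES (1, 'Smith')")
# The plain index on last_name allows a second 'Smith'.
conn.execute("INSERT INTO employees VALUES (2, 'Smith')")

try:
    # Reusing employee_id 1 violates the unique index.
    conn.execute("INSERT INTO employees VALUES (1, 'Jones')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(duplicate_rejected, row_count)
```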
Composite (Multi-Column) Indexes
When queries frequently filter on multiple columns together, a composite index can be more efficient than separate indexes:
-- Composite index for filtering by state and city
CREATE INDEX idx_location
ON customers (state, city);
-- This index helps queries like:
SELECT * FROM customers
WHERE state = 'California' AND city = 'San Francisco';
Column Order Matters!
In a composite index, the order of columns is critical. The index above will help queries filtering on state alone or state+city, but not queries filtering only on city.
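The effect of column order can be checked by asking the database for its query plan. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module. The extra `name` column and the `plan` helper are assumptions added for the example, and other database systems word their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(customer_id INTEGER PRIMARY KEY, name TEXT, state TEXT, city TEXT)"
)
conn.execute("CREATE INDEX idx_location ON customers (state, city)")

def plan(sql):
    """Join the 'detail' column of each EXPLAIN QUERY PLAN row into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

plan_state = plan("SELECT * FROM customers WHERE state = 'California'")
plan_both = plan(
    "SELECT * FROM customers WHERE state = 'California' AND city = 'San Francisco'"
)
plan_city = plan("SELECT * FROM customers WHERE city = 'San Francisco'")

print("state only:", plan_state)
print("state+city:", plan_both)
print("city only: ", plan_city)
```

In this setup the first two plans mention `idx_location`, while the city-only query falls back to scanning the table.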
Covering Indexes
A covering index includes all columns referenced in a query, allowing the database to retrieve data directly from the index without accessing the table:
-- Covering index for a specific query pattern
CREATE INDEX idx_product_lookup
ON products (category_id, price, product_name);
-- This query can be resolved entirely from the index
SELECT product_name, price
FROM products
WHERE category_id = 5
ORDER BY price;
Covering indexes can dramatically improve performance by eliminating table lookups.
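SQLite reports explicitly when an index covers a query, which makes the idea easy to verify. In this sketch the `description` column and the `plan` helper are assumptions added for the example, and the index and first query mirror the ones above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY, category_id INTEGER, "
    "price REAL, product_name TEXT, description TEXT)"
)
conn.execute(
    "CREATE INDEX idx_product_lookup ON products (category_id, price, product_name)"
)

def plan(sql):
    """Join the 'detail' column of each EXPLAIN QUERY PLAN row into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Every referenced column is in the index.
covered_plan = plan(
    "SELECT product_name, price FROM products WHERE category_id = 5 ORDER BY price"
)
# description is not in the index, so the table must be visited as well.
uncovered_plan = plan(
    "SELECT product_name, price, description FROM products "
    "WHERE category_id = 5 ORDER BY price"
)

print("covered:  ", covered_plan)
print("uncovered:", uncovered_plan)
```

The first plan should mention a covering index and need no separate sort step, because the index already stores rows for a category in price order. The second plan still uses the index but must also read the table.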
Advanced Indexing Strategies
Filtered/Partial Indexes
In some database systems, you can create indexes on a subset of rows, which saves space and reduces maintenance overhead:
-- SQL Server filtered index
CREATE INDEX idx_active_orders
ON orders (order_date, customer_id)
WHERE status = 'active';
-- PostgreSQL partial index
CREATE INDEX idx_high_value_orders
ON orders (order_date)
WHERE total_amount > 1000;
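A partial index only helps queries whose WHERE clause guarantees that every matching row is in the index. SQLite also supports partial indexes, so the sketch below uses it to illustrate the PostgreSQL example above. The table layout, the equality filter on `order_date`, and the `plan` helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, "
    "customer_id INTEGER, total_amount REAL, status TEXT)"
)
conn.execute(
    "CREATE INDEX idx_high_value_orders ON orders (order_date) "
    "WHERE total_amount > 1000"
)

def plan(sql):
    """Join the 'detail' column of each EXPLAIN QUERY PLAN row into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# The query repeats the index condition, so the partial index is usable.
plan_match = plan(
    "SELECT * FROM orders WHERE total_amount > 1000 AND order_date = '2023-06-01'"
)
# Orders between 500 and 1000 are not in the index, so it cannot be used.
plan_other = plan(
    "SELECT * FROM orders WHERE total_amount > 500 AND order_date = '2023-06-01'"
)

print("matching condition: ", plan_match)
print("different condition:", plan_other)
```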
Included Columns (SQL Server)
In SQL Server, you can add non-key columns to an index to create covering indexes more efficiently (PostgreSQL 11 and later supports the same INCLUDE clause):
-- Index with included columns
CREATE INDEX idx_order_lookup
ON orders (order_date)
INCLUDE (customer_id, total_amount);
-- This query can use the index exclusively
SELECT order_date, customer_id, total_amount
FROM orders
WHERE order_date > '2023-01-01';
Ensuring Indexes Are Used
Creating an index doesn't guarantee the query optimizer will use it. Here are some factors that might prevent index usage:
- Functions applied to indexed columns
- Implicit data type conversions
- Complex expressions in WHERE clauses
- Very small tables, or predicates that match a large share of the rows, where a table scan is cheaper
For example, this query cannot use an index on last_name effectively:
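-- Index won't be used efficiently: the function hides the indexed column
SELECT * FROM employees
WHERE UPPER(last_name) = 'SMITH';
-- Better version that can use the index
-- (returns the same rows only if names are stored in a consistent case
-- or the column uses a case-insensitive collation)
SELECT * FROM employees
WHERE last_name = 'Smith';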
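The two queries above can be compared directly by looking at their plans. This sketch assumes SQLite and a hypothetical three-column `employees` table. The last step, an index on the expression itself, is not part of the original example. It is an alternative that SQLite, PostgreSQL, and some other systems offer when the function call cannot be removed from the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees "
    "(employee_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("CREATE INDEX idx_last_name ON employees (last_name)")

def plan(sql):
    """Join the 'detail' column of each EXPLAIN QUERY PLAN row into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

WRAPPED = "SELECT * FROM employees WHERE UPPER(last_name) = 'SMITH'"
PLAIN = "SELECT * FROM employees WHERE last_name = 'Smith'"

plan_wrapped = plan(WRAPPED)  # function on the column: index not usable
plan_plain = plan(PLAIN)      # bare column: index usable

# Alternative (an addition to the article's example): index the expression.
conn.execute("CREATE INDEX idx_last_name_upper ON employees (UPPER(last_name))")
plan_expr = plan(WRAPPED)

print("wrapped:         ", plan_wrapped)
print("plain:           ", plan_plain)
print("expression index:", plan_expr)
```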
Measuring Index Performance
To determine if your indexes are effective, you need to analyze query execution:
Reading Execution Plans
Most database systems provide tools to visualize query execution plans:
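- SQL Server: SET SHOWPLAN_XML ON (estimated plan), SET STATISTICS XML ON (actual plan), or Execution Plan in Management Studio
- MySQL: EXPLAIN
- PostgreSQL: EXPLAIN, or EXPLAIN ANALYZE to run the query and report actual timings
- Oracle: EXPLAIN PLAN FOR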
Here's an example of using EXPLAIN in MySQL:
EXPLAIN SELECT *
FROM orders
WHERE customer_id = 12345;
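Look for these key indicators in execution plans:
- "Index Seek" operations (good), where the engine jumps directly to the matching entries
- "Index Scan" operations, which read the entire index and are only sometimes cheaper than a table scan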
- "Table Scan" or "Full Table Scan" operations (potentially problematic for large tables)
- The number of rows examined vs. returned
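The same before-and-after reading works in any system that exposes its plans. As a small runnable illustration, the sketch below uses SQLite, where a full table scan is reported as SCAN and an index lookup as SEARCH. The `orders` layout, the index name, and the `plan_details` helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders "
    "(order_id INTEGER PRIMARY KEY, customer_id INTEGER, total_amount REAL)"
)

QUERY = "SELECT * FROM orders WHERE customer_id = 12345"

def plan_details(sql):
    """Return the human-readable 'detail' column of each plan row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_before = plan_details(QUERY)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan_details(QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```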

Benchmarking Index Impact
To quantify the impact of an index, you can:
- Measure query execution time before adding the index
- Create the index
- Measure execution time again
-- Before index - measure time (SET STATISTICS TIME is SQL Server syntax)
SET STATISTICS TIME ON;
SELECT * FROM large_table WHERE non_indexed_column = 'value';
SET STATISTICS TIME OFF;
-- Create index
CREATE INDEX idx_column ON large_table (non_indexed_column);
-- After index - measure time
SET STATISTICS TIME ON;
SELECT * FROM large_table WHERE non_indexed_column = 'value';
SET STATISTICS TIME OFF;
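The same three steps can be scripted when the database has no built-in timing switch. The sketch below is one possible approach using Python's sqlite3 module and `time.perf_counter`. The generated data, the `filler` column, and the `best_time` helper are assumptions, and the timings will vary from machine to machine.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE large_table "
    "(id INTEGER PRIMARY KEY, non_indexed_column TEXT, filler TEXT)"
)
conn.executemany(
    "INSERT INTO large_table (non_indexed_column, filler) VALUES (?, ?)",
    [(f"value{i % 5000}", "x" * 40) for i in range(100000)],
)
conn.commit()

QUERY = "SELECT * FROM large_table WHERE non_indexed_column = 'value42'"

def best_time(sql, repeats=5):
    """Run the query several times and keep the fastest run to reduce noise."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        timings.append(time.perf_counter() - start)
    return min(timings), len(rows)

before, rows_before = best_time(QUERY)
conn.execute("CREATE INDEX idx_column ON large_table (non_indexed_column)")
after, rows_after = best_time(QUERY)

print(f"before: {before * 1000:.2f} ms, after: {after * 1000:.2f} ms, rows: {rows_after}")
```

Checking that both runs return the same number of rows is a cheap sanity test that the comparison measures the same work.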
Real-world Testing
Always test index changes in a development environment before applying them to production. The impact can vary significantly depending on data distribution and server configuration.
Index Maintenance
Indexes require ongoing maintenance to remain effective:
Addressing Fragmentation
As data changes, indexes can become fragmented, reducing their efficiency. Regularly check fragmentation levels:
-- SQL Server fragmentation check
SELECT
object_name(ips.object_id) AS TableName,
i.name AS IndexName,
ips.index_type_desc,
ips.avg_fragmentation_in_percent
FROM
sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) ips
JOIN
sys.indexes i ON ips.object_id = i.object_id AND ips.index_id = i.index_id
WHERE
ips.avg_fragmentation_in_percent > 30
ORDER BY
ips.avg_fragmentation_in_percent DESC;
Address fragmentation with reorganization or rebuilding:
-- For moderate fragmentation (5-30%)
ALTER INDEX idx_name ON table_name REORGANIZE;
-- For high fragmentation (>30%)
ALTER INDEX idx_name ON table_name REBUILD;
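If the fragmentation check feeds a maintenance script, the thresholds above translate into a small decision function. This is a sketch only: `maintenance_statement` is a hypothetical helper, and the cut-offs are the ones quoted in the comments above, which you may want to tune for your workload.

```python
def maintenance_statement(index_name, table_name, fragmentation_percent):
    """Pick a SQL Server maintenance statement from a fragmentation percentage.

    Returns None when fragmentation is below 5% and no action is needed.
    """
    if fragmentation_percent > 30:
        action = "REBUILD"       # high fragmentation
    elif fragmentation_percent >= 5:
        action = "REORGANIZE"    # moderate fragmentation (5-30%)
    else:
        return None
    return f"ALTER INDEX {index_name} ON {table_name} {action};"

print(maintenance_statement("idx_last_name", "employees", 42.0))
print(maintenance_statement("idx_last_name", "employees", 12.5))
print(maintenance_statement("idx_last_name", "employees", 1.0))
```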
Identifying Unused Indexes
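Unused indexes waste space and slow down write operations. Periodically identify and remove them. In SQL Server, the usage counters are reset when the instance restarts, so evaluate them over a representative period before dropping anything: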
-- SQL Server unused index query
SELECT
o.name AS TableName,
i.name AS IndexName,
i.type_desc AS IndexType,
us.user_seeks,
us.user_scans,
us.user_lookups,
us.user_updates
FROM
sys.indexes i
JOIN
sys.objects o ON i.object_id = o.object_id
LEFT JOIN
sys.dm_db_index_usage_stats us ON i.object_id = us.object_id AND i.index_id = us.index_id
AND us.database_id = DB_ID() -- the view is server-wide, so restrict it to the current database
WHERE
o.type = 'U' -- User tables only
AND i.type_desc != 'HEAP'
AND (us.user_seeks = 0 OR us.user_seeks IS NULL)
AND (us.user_scans = 0 OR us.user_scans IS NULL)
AND (us.user_lookups = 0 OR us.user_lookups IS NULL)
ORDER BY
o.name, i.name;
Real-world Case Studies
Case Study: Order Processing System
A retail company's order processing system was experiencing slowdowns during peak hours. The main orders table had grown to over 10 million rows. Here's how indexing solved the problem:
Problem Query
SELECT o.order_id, o.order_date, o.customer_id, c.customer_name, o.total_amount
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.status = 'processing'
AND o.order_date >= DATEADD(day, -7, GETDATE())
ORDER BY o.order_date DESC;
The query was taking over 15 seconds to execute. After analyzing the execution plan, we identified these issues:
- No index on the status column, forcing a full table scan
- Inefficient sorting operation for ORDER BY
- Suboptimal join performance
Solution implemented:
-- Created a filtered index for orders in 'processing' status
CREATE INDEX idx_processing_orders
ON orders (order_date DESC)
WHERE status = 'processing';
-- Added covering information
CREATE INDEX idx_customer_lookup
ON customers (customer_id)
INCLUDE (customer_name);
After implementing these indexes, the query execution time dropped from over 15 seconds to under 200 ms, roughly a 75x improvement.
Case Study: Reporting Database
A financial reporting system was experiencing timeout issues when generating monthly reports. The main transactions table contained 50+ million rows with data going back several years.
The solution involved a combination of strategies:
- Partitioning the table by date ranges
- Creating appropriate indexes on each partition
- Implementing a covering index for commonly run reports
-- Covering index for monthly summaries
CREATE INDEX idx_transaction_reporting
ON transactions (transaction_date, account_id, category_id)
INCLUDE (amount, description);
The result was a 95% reduction in report generation time, bringing the system back within acceptable performance parameters.
Common Indexing Mistakes
Avoid these common pitfalls in your indexing strategy:
Over-indexing
Creating too many indexes can be as problematic as having too few:
- Every index increases storage requirements
- Each write operation must update all affected indexes
- The query optimizer has more work to determine the optimal plan
Aim for a minimal set of indexes that cover your most important queries.
Incorrect Column Order in Composite Indexes
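As a rule of thumb, order the columns in a composite index like this:
- Equality columns first (WHERE column = value)
- Then inequality columns (WHERE column > value)
- Finally, columns used in ORDER BY
Keep in mind that a sort column placed after a range column generally cannot eliminate the sort. When avoiding the sort matters more than narrowing the range, put the ORDER BY column directly after the equality columns instead.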
-- Good composite index order
CREATE INDEX idx_product_search
ON products (category_id, subcategory_id, price);
-- For queries like:
SELECT * FROM products
WHERE category_id = 5 AND subcategory_id = 10
ORDER BY price;
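Whether an index also takes care of the ORDER BY shows up in the query plan. The sketch below, assuming SQLite and a hypothetical `products` layout, compares an index on the two filter columns only with the three-column index above. SQLite reports an explicit sort step as a temporary B-tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY, category_id INTEGER, "
    "subcategory_id INTEGER, price REAL, product_name TEXT)"
)

QUERY = (
    "SELECT * FROM products "
    "WHERE category_id = 5 AND subcategory_id = 10 ORDER BY price"
)

def plan(sql):
    """Join the 'detail' column of each EXPLAIN QUERY PLAN row into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Index on the filter columns only: rows are found quickly but must be sorted.
conn.execute("CREATE INDEX idx_filter_only ON products (category_id, subcategory_id)")
plan_filter_only = plan(QUERY)
conn.execute("DROP INDEX idx_filter_only")

# Index that ends with the ORDER BY column: rows come back already sorted.
conn.execute(
    "CREATE INDEX idx_product_search ON products (category_id, subcategory_id, price)"
)
plan_full = plan(QUERY)

print("filter columns only:", plan_filter_only)
print("with price appended:", plan_full)
```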
Not Analyzing Index Usage
Indexes should evolve with your application. Regularly review and adjust your indexing strategy based on:
- Changing query patterns
- Growing data volumes
- New features or requirements
Conclusion
Effective SQL indexing is both an art and a science. It requires understanding your data, your query patterns, and the specific characteristics of your database system. By implementing the strategies outlined in this guide, you can dramatically improve database performance while maintaining a balanced approach that considers both read and write operations.
Remember these key principles:
- Index the right columns based on query patterns
- Understand the trade-offs between different index types
- Measure performance before and after adding indexes
- Maintain indexes regularly
- Monitor and evolve your indexing strategy over time
With a thoughtful indexing strategy in place, your database will be prepared to handle growth and deliver consistent performance even as your application scales.