SQL Indexes - The Definitive Guide
Indexes in SQL are an important part of the language.
But why is that? What are SQL indexes? And why should you know about them?
I’ll cover all of that in this guide.
I'll also show you the different types of indexes, and share many different sites that have examples and further information on indexes.

A Simple Explanation of Indexes in SQL
Have you ever looked at a big book, like a textbook, and tried to look for a specific topic?
Let’s say you have a textbook on the history of human civilisation, and you wanted to find the section on Julius Caesar.
You might have some idea of what page this topic is discussed on. It was in Roman times, so it might be somewhere in the middle of the book? You flip somewhere to the middle and look for it, and keep scanning pages until you find it mentioned.
Or, you read the entire book looking for information on this one topic. You would start on page 1 which discusses the start of human evolution, and read every page until you get to the current year.
Well, that would take a while so you probably wouldn’t do it. But we can agree it’s not a very efficient way of finding what you need.
Reading every page until you find the topic you’re looking for is equivalent to the database's method of finding specific records in a table.
When you run a query, the database looks through every record in the table to find the records you need.
Unless… you have an index.
Let’s go back to our textbook example. Let’s say this textbook had a section at the very end of the book called Index. It listed all of the topics that were covered in the book.
This could take up a lot of pages, depending on how big the book is. But, in this index section, there would be an entry for Julius Caesar:
Julius Caesar….. pg 415, 502-510
This shows the topic, as well as the specific page numbers where this topic is mentioned.
Now, all you need to do is turn to those page numbers and learn more about Julius Caesar.
This is the same way that indexes work in databases.
So what is an index in SQL?
An index is an object in the database that allows you to find specific data in a table faster.
Like the index section in a book, they take up a bit more space, but the benefits that they bring are often worth it.
Also, like in a book, indexes work best when you have a specific term to search for.
You don’t need to ask the database to use an index if it’s already there. Most of the time, the database will work out that it needs to use it.
Why Should Developers Know About Indexes?
So, as a developer, why do you need to know about indexes?
You might be thinking, “Oh, performance tuning? That’s for the DBA to work out. I just want to write my query to get my data."
But you’ll need your queries to run as fast as you can.
Whether you’re an ETL developer working on populating and improving a data warehouse, an application developer working on creating new software, or anything else, you’ll want your queries to be fast.
As a developer, you’re in the best position to understand what your needs are and what the data requirements are. Understanding what an index does can really improve your knowledge of the database and the performance of the query and application.
What Types of Indexes Are There?
In SQL databases, there are many different types of indexes. In this guide, I’ll cover these indexes:
- B-tree index
- Bitmap index
- Function-based index
- Clustered Index
- Non-Clustered Index
I’ll explain what these index types are later in this guide.
There are some other index types that exist in other databases, which I’ll explain briefly in this guide. These are:
- Reverse key
- B-tree cluster
- Descending index
What Privileges Do I Need to Create Indexes?
To create an index on an Oracle database, you need at least one of these privileges:
- The table to be indexed must be in your schema
- You must have the INDEX object privilege on the table
- You must have the CREATE ANY INDEX privilege
You'll need similar privileges in other databases.
Further Reading
- Oracle Privileges (Oracle Docs)
- Index Privilege example (PDF)
- Oracle SQL query: show all privileges for a user
- 10 Tips for SQL Developers (including why developers should understand indexes)
The Most Common Type of Index and How to Create It: The B-Tree Index
The most common type of SQL index is a b-tree index.
It’s also the “default” index type, or the type of index that is created if you don’t add any modifiers to the statement (which we’ll look at shortly).
B-tree stands for “balanced tree”. It’s called “balanced” because it can access all elements with the same number of steps.
It’s called a tree because of the way the index is used by the database. It works like a tree with branches and leaves.
So, how does it work?
Let’s say we’re looking for a record in a table with a specific ID value of 109. The ID is a column in this table and it has a b-tree index on it.
A b-tree index would look like this:
The steps to find the record with an ID of "B" would be:
- Look at the first level of the index. Find the entry, or node on this level, that covers the value of "B". There is only one here (the "A" at the top).
- Move to the second level of the index that comes from the first level identified in the previous step.
- Find the entry, or node, that includes the value of "B" in its range. In the example above, we have two nodes on the second level: "A" and "N". "A" covers the values starting with "A" and ending with "M", because the next value is "N". "N" covers the range of values from "N" to "Z". It's a simple example, and real values would be more complicated, but it demonstrates that the possible values (A to Z) are split between A-M and N-Z.
- Move to the third level of the index that was identified in the previous step.
- Find the entry, or node, that includes the value of "B" in its range. The nodes are labelled here as A, G, N, and R. Which one is it? It's the "A" node, because that covers the range of "A" to the just before next node (value of G) which means it ends at "H".
- Find the record that matches ID "B" and return that record.
As you can see, there is a repeating sequence of “look in the range of values” and “go to the next level”. This is repeated for the number of levels in the index. In most indexes, even those with millions of records, there is usually less than five levels.
Create a B-Tree Index
So I’ve just explained how they work. But how do you create a b-tree index in a database?
You use the CREATE INDEX command. It looks like this:
1CREATE INDEX index_name
2ON table_name (columns);
There are only a few things you need to specify to create a b-tree index:
- index_name: The name of the new index you’re creating, which must be a unique name.
- table_name: The name of the table you want to create the index for.
- columns: A single column name, or a comma-separated list of the columns you want to create the index on.
The index name needs to be unique and can be up to 30 characters.
This code will work on Oracle, MySQL, SQL Server, and PostgreSQL.
Also, while it’s not required, I like to name the indexes I create with a prefix and a descriptive name, so I know more about the index by just looking at the name.
I name my indexes in this format:
1idx_table_cols
I start with the letters “idx”, then underscore, then the name of the table (or a shorthand version of it, if it’s too long), then another underscore, then the name of the column or columns (or a shortened version).
Let’s take a look at an example.
1CREATE INDEX idx_emp_id
2ON employee (id);
This would create an index on the ID column in the employee table, and the index would be called idx_emp_id. I could tell by looking at the name that it’s an index, on the employee table, and on the ID column.
It’s a b-tree index, as that’s the default index type in Oracle, SQL Server, MySQL, and PostgreSQL. In SQL Server, it's also called a non-clustered index, as that is the default type.
Later in this article, I’ll explain when you should create a b-tree index over other index types.
Further Reading
- Full CREATE INDEX syntax on Oracle Docs
- Further analysis on indexes
- Indexing NULL values
- B-Tree index theory on Wikipedia
- Internals of Oracle's B-Tree index
A Great Index Alternative for Columns with Less Distinct Values: Bitmap Indexes
Another type of index in a database is a bitmap index.
A bitmap index looks like a two-dimensional “map” or table of values.
Each row is a separate ROWID, the unique identifier for the row.
Each column relates to a separate value of a specific column.
And the intersection of rows and columns is a single bit, which indicates that the specific ROWID has a value in that column.
Let’s take a look at an example.
An employee table may have a column called status, and the two values that may exist in this table are “Active” or “Inactive”.
A bitmap index created on this column could be visualised like this:
ROWID | Active | Inactive |
---|---|---|
AAA1 | 1 | 0 |
AAA2 | 0 | 1 |
AAA3 | 1 | 0 |
AAA4 | 0 | 1 |
AAA5 | 0 | 1 |
You can see that each ROWID is listed down the left in the first column.
The next column then shows the first possible value for the status column, which is “Active”. In this column, any rows in the table (as identified by their ROWID) that have a status value that matches this column header (a value of “Active”) will have a “bit” or a 1 in this column.
For the next column, the header of the column is the next value of the status column, which is “Inactive”. The same logic is then applied - any ROWID that has a status value of Inactive has a bit or value of 1 in this column.
So, this bitmap index is then used to improve the performance of any queries on this table. There’s no guarantee that this index will be used though.
As you might have guessed, the more possible values in the column being index, the more “columns” in the bitmap index and the larger the index will be, meaning it will be less effective.
So, bitmap indexes are usually best created on columns with a small number of distinct values. For example, you might create it on a “status" column, but not on a “country” column

Bitmap Index Syntax and Example
Let’s take a look at the syntax and an example of creating bitmap indexes in SQL.
Bitmap indexes are included in Oracle.
They are not included in MySQL, SQL Server, or PostgreSQL. PostgreSQL has something called Bitmap Index Scan which is an operation done when a query is run and not a predefined bitmap index. SQL Server also has a bitmap operation that is performed when a query is run.
In Oracle, To create a bitmap index:
1CREATE BITMAP INDEX index_name
2ON table_name (columns);
There are only a few things you need to specify to create a bitmap index:
- Keyword BITMAP: The only difference in syntax between this index and a “normal” b-tree index is the word BITMAP in the syntax. It comes after CREATE and before INDEX.
- index_name: The name of the new index you’re creating, which must be a unique name.
- table_name: The name of the table you want to create the index for.
- columns: A comma-separated list of the columns you want to create the index on.
As mentioned earlier, the index name must be unique and up to 30 characters. I also like to follow a certain format when naming my indexes. If your organisation has a format, I suggest following it as well.
Here’s how I name my indexes:
1idx_table_cols
I start with the letters “idx”, then underscore, then the name of the table (or a shorthand version of it, if it’s too long), then another underscore, then the name of the column or columns (or a shortened version).
Let’s take a look at an example of a bitmap index. I’ll use the gender column that I mentioned earlier.
1CREATE BITMAP INDEX idx_emp_gender
2ON employee(gender);
This creates a new bitmap index on the employee table on the gender column.
Further Reading
- An overview of Bitmap indexes
- B-Tree vs Bitmap Indexes with examples
- Bitmap Index on Wikipedia
- Oracle's document on using bitmap indexes in data warehouses
- Indexing an Oracle Data Warehouse
Improving Queries That Use Functions: Function-Based Indexes
A function-based index is one that is created on the results of a function or expression.
How is that different from a b-tree index?
Well, it’s because a b-tree index on a column is not used if the query performs a function on that column.
If a function-based index is created on the same function that the query uses, then the index will likely be used in the optimisation.
Let me explain this further.
Functions and Function-Based Indexes
Let’s say I wrote a query that found all employees that had a monthly salary greater than 10,000:
1SELECT employee_id,
2annual_salary,
3annual_salary / 12 AS monthly_salary
4FROM employee
5WHERE annual_salary / 12 >= 10000;
To find the monthly salary, I divided the annual salary by 12.
Now, let’s say I have no indexes on this table. I can run the query and it will work, but it might be slow.
For this example, let’s say I want to create an index on it. I see it’s using the order_date in the WHERE clause, so I create a b-tree index on this column.
1CREATE INDEX idx_emp_ansal
2ON employee(annual_salary);
I can run the SELECT query again. But it will still be slow. The index I created won’t be used.
Why is that?
It’s because the index is on the annual_salary column, and the value being compared against in the QUERY is actually the result of (annual_salary / 12).
Even though no function has been applied, it still performs some mathematical function or modification on the values.
This means any query that uses a function or expression on a column in the WHERE clause will almost never use an index created on that column without the function or expression. (There are some times where the database may use an index if it decides it will be cheaper than the alternatives.)
This is where the function-based index comes in.
Syntax and Example of a Function-Based Index
I can create function-based indexes in SQL in a similar way to creating a b-tree index (I can also create a function-based index as a bitmap index, which I’ll get to shortly).
To create a function-based index, simply use your function (or expression) inside the INDEX definition.
1CREATE INDEX index_name
2ON table_name(function(column_name));
This works the same as a b-tree index, except we add in a function.
So, the function-based index for our example could look like this:
1CREATE INDEX idx_emp_mthsal
2ON employee(annual_salary / 12);
A new index is created. Now, let’s run our original query.
1SELECT employee_id, annual_salary, annual_salary / 12 AS monthly_salary
2FROM employee
3WHERE annual_salary / 12 >= 10000;
This query should now use the index we have just created.
Why Create a Function-Based Index?
There are some situations where you may want to create a function-based index:
- Comparing strings using the same case using UPPER or LOWER on string columns for case-insensitive searches.
- Limiting data based on a specific date component (e.g. all data for a single month). There are alternative ways to do this, that may involve partitioning or a BETWEEN keyword.
Consider any function you’re using in the WHERE clause of your queries as a possibility of creating an index.
But indexes are not a “silver bullet” for performance. I’ll explain more about this later in this article.
Further Reading
- About Function-Based Indexes
- Richard Foote on Function-Based Indexes
- StackOverflow question on Function-Based Indexes and NULL values
- AskTOM post about function-based indexes and index-organised tables
- StackExchange question about bitmap indexes and cardinality
Clustered and Non-Clustered Indexes in SQL Server
What is a clustered index? What is a non-clustered index? And how are they different?
They are two types of indexes in SQL Server.
Clustered Index
A clustered index is a type of index that defines how data is stored for the table. The data in a table is stored on a disk on the server, and a clustered index specifies that the data in the table is stored in the same order as the data in the index.
Or, more specifically, the leaf level (the lowest level) of the clustered index is the table rows. This means the clustered index could be described as being the table.
Using the library example, the clustered index would be the Dewey Decimal System ID number. Let's say you wanted to find the book with the ID of 021.41092. You would first look for the bookshelf that had the books ranging from 000-025 by reading the signs at the end of the rows.
You would then scan the shelves, looking for books that were close to the one you were looking for, until you find the book with the ID you were looking for: 021.41092.
This is how a clustered index works. The books are stored in the same order as their IDs (which is the indexed value), and a clustered index is stored in the same way as the lowest level values of the index.
Why is this important? Because, if the data is stored in the same order in the clustered index and the table, it's faster for the database to find the data in the table.
The only drawback is that there can only be one clustered index on a table. This is because the clustered index defines how the data is stored. There can't be two ways to store the data in the table. Or, using the library example, there can't be two ways to organise the same books on the bookshelves.
Non-Clustered Index
A non-clustered index is a type of index that stores a separate list that refers to the rows in the table. The list is separate and it does not define how the records are stored in the table.
Using our library example, let's say you were looking for a book but didn't know what the Dewey Decimal System ID number was, but you knew the author or the title. Libraries used to have a set of drawers, like a filing cabinet, called a Card Catalog. These had a set of cards, one for each book, with the book and the ID number on it. These cards were sorted in a specific order, such as alphabetically by author.
So, if you knew the author, you could look up the author in this card catalogue, find the book ID number, then walk to that spot on the shelf and get the book.
The benefit of a non-clustered index is that there can be more than one on a table, as they don't define how the data is stored in the tables.
Using the library example, you could create a copy of all of the cards in the card catalogue and sort them alphabetically by title. This would be a second "non clustered index". It would help if you were searching by title, instead of the author.
The more non-clustered indexes you have, the easier it will be to find records (e.g. finding books by publisher or publication date). But it slows down the process to insert or update records.
Using the library example, if you add a new book, you would have to create one card for each catalogue and put it in the right place.
What is the Difference Between a Clustered and Non-Clustered Index?
There are a few differences between a clustered index and a non-clustered index. A clustered index impacts how data is stored in a table. Only one clustered index can exist on a table, but you can have multiple non-clustered indexes on a table. Finally, the non-clustered index is the default type of index (if no keyword is used).
Here's a summary of the differences.
Criteria | Clustered | Non-Clustered |
---|---|---|
Impacts table storage | Yes | No |
Number allowed for a table | 1 | Multiple |
Default | No | Yes |
Let's take a look at some examples of creating indexes in SQL Server.
Example: Clustered Index
To create a clustered index in SQL Server, we can use the CREATE INDEX command:
1CREATE CLUSTERED INDEX index_name ON table (column);
An example would be:
1CREATE CLUSTERED INDEX ix_cust_id ON customer (cust_id);
This will create a clustered index on the cust_id column in the customer table. The index is called ix_cust_id.
Because there can only be one clustered index on a table, we'll get an error if we try to create a second clustered index (even if it's on a different column).
1CREATE CLUSTERED INDEX ix_cust_id2 ON customer (cust_name);
Cannot create more than one clustered index on table 'customer'. Drop the existing clustered index 'ix_cust_id' before creating another.
Example: Non-Clustered Index
Creating a non-clustered index in SQL Server is similar:
1CREATE [NONCLUSTERED] INDEX index_name ON table (column);
The NONCLUSTERED keyword is optional. You can include it to make it clear that the index is non-clustered, or leave it out.
An example would be:
1CREATE INDEX ix_cust_name ON customer (cust_name);
This index is successful, even if there is already a clustered index on the table.
You can create an index using the optional NONCLUSTERED keyword as well, which does the same thing:
1CREATE NONCLUSTERED INDEX ix_cust_name2 ON customer (cust_name);
So that's how you can create a non-clustered index in SQL Server.
A Quick Overview of All the Other SQL Index Types
There are several other types of indexes in SQL databases, which I’ll discuss briefly here.
While you might not use them that often (if at all), it can be good to at least know what they are.
Reverse Key Index (Oracle)
A reverse key index is similar to a standard b-tree index, except the bytes that the index are stored as are reversed.
This means that inserts are spread out over several database blocks.
Descending Index (Oracle)
A b-tree index is stored so that it can be quickly traversed in ascending or descending order.
A descending index is an index on multiple columns, where one or more of those columns are sorted in descending order compared to the other columns.
It's an index that is not used very often.
Bitmap Join Index (Oracle)
A bitmap join index is a bitmap index on the join of two tables. For each row that exists in one column, the index will store the rowid of the related row in the other table.
They can sometimes be more efficient than materialised views.
Application Domain Index (Oracle)
This index is created by a user for data in an application-specific domain, such as image processing.
Unique Index (MySQL, PostgreSQL)
A unique index is an index where all of the values must be unique. It's like a unique constraint on the column but an index is added as well.
Fulltext Index (MySQL)
Fulltext indexes are used on CHAR, VARCHAR, and TEXT columns, and are ideal for searching and looking up text values.
Spatial Index (MySQL)
Spatial indexes are designed for Point and Geometry data types and work well with these data types.
Further Reading
- Richard Foote on Reverse Indexes
- Using reverse key indexes to solve buffer busy wait problems
- Short article on Descending Indexes and why they may not be useful
- Bitmap Join indexes
- Building Domain Indexes
- MySQL Create Index
Understand All the Different Ways Indexes are Used in Queries
We’ve now learned a few of the main index types, and how to create them.
Now let’s learn how they are used when running queries.
This is important to know if you’re trying to tune your queries (improve the performance of them).
What is “index scanning”?
In databases, index scanning is where the database retrieves the information it needs for a row from the index instead of the table.
Reading data from an index is usually faster than reading data from a table. So, if the database can read the data faster from the index than from the table, it will do so. This is essentially how indexes are able to perform better than reading from tables.
There are several different types of index scans. I’ll explain what they are.
But how do you see what type of index scan you’re using?
You can find this by performing an Explain Plan on your SQL statement. This can be done by either writing “EXPLAIN PLAN FOR” before your SELECT statement:
1EXPLAIN PLAN FOR
2SELECT first_name, last_name
3FROM all_customers
4WHERE customer_id = 54;
Or, by clicking the Explain Plan button in your IDE (such as SQL Developer):
So, that’s how you can see what type of index scan is being used. Let’s take a look at what the different index scans are, and what they mean.
This table summarises the different index scanning methods. I'll go into detail on them in a moment.
Index Scan Type | Description |
---|---|
Index Full Scan | Database reads the data from the index, not the table, and it reads in order |
Index Fast Full Index Scan | Database reads the data from the index, not the table, and in no order |
Index Range Scan | A scan performed in order. Often seen in WHERE clauses with multiple results |
Index Unique Scan | A scan performed on columns with unique values. Often seen on primary keys |
Index Skip Scan | Performed when a composite index is available but not all of the fields are specified |
Full Table Access | No index has been used (even though one may exist) |
What is an "index full scan"?
A full index scan, or index full scan, is where the database reads the entire index in order.
It can be used in any of the following situations:
- An ORDER BY clause is used, which has all of the columns from the index in the same order as the index
- A Sort Merge Join, where the query has all of the columns from the index in the same order as the index
- A GROUP BY clause is used, which contains all of the columns in the index
It can also eliminate sorting because the data in the index is ordered by the index key.
Let’s say you have a query that looks like this:
1SELECT first_name, last_name
2FROM all_customers
3ORDER BY first_name;
The table has an index on the first_name and last_name columns, and the first_name column has a NOT NULL constraint on it.
The explain plan could look like this:
1-------------------------------------------------------------------------------
2| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
3-------------------------------------------------------------------------------
4| 0 | SELECT STATEMENT | | 1000K| 13M| 3457 (1)| 00:00:42 |
5| 1 | INDEX FULL SCAN | IDX_AC_FLN | 1000K| 13M| 3457 (1)| 00:00:42 |
6-------------------------------------------------------------------------------
To find the data for this query, the database will perform a full scan of the index in sorted order. This is often faster than querying the table.
What is an "index fast full scan"?
A fast full index scan is similar to the full index scan, but faster…
Actually, it’s where the database reads the data in the index without accessing the table, and in no particular order.
A fast full index scan is used as an alternative to a full index scan when both of these conditions are met:
- The index must contain all columns needed for the query
- A row containing all NULL values must not appear in the query result set. This means a column in the index must have either a NOT NULL constraint, or a predicate applied that prevents NULLs from being considered.
Let’s take a look at an example query:
1SELECT DISTINCT first_name
2FROM all_customers;
Notice that there is no ORDER BY clause. Also, let’s assume that there is an index on column first_name.
The explain plan for this query would look like this:
1-----------------------------------------------------------------------------------------
2| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
3-----------------------------------------------------------------------------------------
4| 0 | SELECT STATEMENT | | 198 | 1386 | 702 (5)| 00:00:09 |
5| 1 | HASH UNIQUE | | 198 | 1386 | 702 (5)| 00:00:09 |
6| 2 | INDEX FAST FULL SCAN| IDX_CUSTCOPY_FN | 1000K| 6835K| 676 (1)| 00:00:09 |
7-----------------------------------------------------------------------------------------
When running this query, the database will scan the index for the required values, rather than reading the table.
What is an "index range scan"?
An index range scan is where the index is scanned in order.
It can be used where a WHERE clause is used in a query, and there are multiple records in the table that match each value of the column in the WHERE clause.
For example, let’s say the first_name column in the employee table has an index. And there is a query that has a WHERE clause on this first_name column:
1SELECT first_name, last_name
2FROM all_customers
3WHERE first_name = 'William';
The explain plan shows:
1-------------------------------------------------------------------------------
2| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
3-------------------------------------------------------------------------------
4| 0 | SELECT STATEMENT | | 5051 | 70714 | 20 (0)| 00:00:01 |
5|* 1 | INDEX RANGE SCAN| IDX_AC_FLN | 5051 | 70714 | 20 (0)| 00:00:01 |
6-------------------------------------------------------------------------------
This query may use an index range scan if there is more than one record that matches each of those first names in the table.
What is an "index unique scan"?
An index unique scan is similar to an index range scan.
However, each index key must have either 0 or 1 table rows (ROWIDs) associated with it.
In other words, it’s a scan that is performed on indexes that have unique values. It stops processing after it finds the first record because it knows that there is no second record.
These scans are often performed on indexes on primary key columns.
An example of a query that may use an index unique scan is:
1SELECT first_name, last_name
2FROM all_customers
3WHERE customer_id = 54;
This would return a single row.
The explain plan looks like this:
1---------------------------------------------------------------------------------------------
2| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
3---------------------------------------------------------------------------------------------
4| 0 | SELECT STATEMENT | | 1 | 19 | 3 (0)| 00:00:01 |
5| 1 | TABLE ACCESS BY INDEX ROWID| ALL_CUSTOMERS | 1 | 19 | 3 (0)| 00:00:01 |
6|* 2 | INDEX UNIQUE SCAN | PK_ACUST_ID | 1 | | 2 (0)| 00:00:01 |
7---------------------------------------------------------------------------------------------
What is an "index skip scan"?
An index skip scan is used when a composite index is available but all of the fields in the composite index are not specified.
A composite index is split into several indexes for the purposes of the query, and each of these indexes are then scanned for the matching value.
The database might use an index skip scan where:
- A composite index exists (an index on two or more fields)
- The first field of the index contains a low number of distinct values, and is not specified in the query
- The second field of the index contains a high number of distinct values, and is specified in the query
Let’s take a look at a sample query:
1SELECT first_name, last_name
2FROM all_customers
3WHERE last_name = 'Smith';
In this table, there is an index on (first_name, last_name).
The explain plan could look like this:
1-------------------------------------------------------------------------------
2| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
3-------------------------------------------------------------------------------
4| 0 | SELECT STATEMENT | | 4132 | 57848 | 200 (0)| 00:00:03 |
5|* 1 | INDEX SKIP SCAN | IDX_AC_FLN | 4132 | 57848 | 200 (0)| 00:00:03 |
6-------------------------------------------------------------------------------
The database could perform an index skip scan, even though the business_size column was not specified in the query’s WHERE condition.
What is a "full table scan"?
A full table scan is where the database reads the data from the table without using any indexes.
This can happen even if an index exists on the table - and even if you're expecting the database to use it!
Full Table Scans are often thought of as the worst kind of table access, as they search on every row in the table, which can be slow.
However, it's not always the worst method.
Most modern database optimisers are very good at working out the quickest way to run a query. If you find that your query uses a full table scan, it could be for one of two reasons:
- Your table is small and no indexes will help the query run faster.
- Your table has no indexes, and a full table scan is slow, which means your query needs to be optimised.
An example of a query that uses a full table scan is this one:
1SELECT *
2FROM all_customers;
The explain plan for this query looks like this:
1-----------------------------------------------------------------------------------
2| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
3-----------------------------------------------------------------------------------
4| 0 | SELECT STATEMENT | | 1000K| 34M| 1583 (1)| 00:00:19 |
5| 1 | TABLE ACCESS FULL| ALL_CUSTOMERS | 1000K| 34M| 1583 (1)| 00:00:19 |
6-----------------------------------------------------------------------------------
As you can see, it shows a Table Access Full line in the explain plan output.
Further Reading
- Indexes from Oracle Docs
- AskTOM post on Full Index Scan vs Fast Full Index Scan
- Index scanning
- More information and examples on Index Skip Scan
- Blog post with more information on index access methods
How to Make Changes to Indexes After You Create Them
We looked at how to create an index earlier in this guide.
But what if you want to make changes to an index that already exists?
In this section, I’ll show you how to rename, drop, enable, and disable indexes in SQL.
How to Rename an Index
If you want to change the name of an index, you can do that in SQL.
To rename an index, you use the ALTER command.
1ALTER INDEX index_name
2RENAME TO new_index_name;
The parameters for this statement are:
- index_name: The current name of the index.
- new_index_name: The new name you want to give to the index.
How to Drop an Index
To delete an index from the database, you drop it. This is done using the DROP command.
1DROP INDEX index_name;
The index_name is the name of the index you wish to drop.
It’s a simple command to run - only one parameter.
How to Disable an Index
A handy feature of a database is the ability to disable an index. This means the index still exists in the database, along with the usage statistics.
However, the index is not used in the optimiser.
This is helpful if you wish to test the performance of the system without a specific index, without deleting the index.
To disable indexes in Oracle and SQL Server:
1ALTER INDEX index_name ON table_name DISABLE;
This will disable the index. Future queries won’t use this index. Any queries that have an index hint that uses this index will fail.
MySQL and PostgreSQL don't support the ability to disable an index.
How to Enable an Index
After you disable an index, you may want to enable it.
To do this, you need to rebuild it. There is no ALTER INDEX ENABLE command.
To rebuild an index in Oracle and SQL Server:
1ALTER INDEX index_name ON table_name REBUILD;
The index will be re-enabled and the usage stats will be reset.
MySQL and PostgreSQL don't support the ability to enable an index.
What Is "Compute Statistics" in Oracle And How To Do It
Statistics on an index are values that reflect the data used in the index, such as number of rows and unique values.
The good news is that statistics are calculated automatically by the Oracle database whenever you create a new index.
If you ever need to calculate statistics manually, you can use the DBMS_STATS.GATHER_INDEX_STATS command. You may want to do this if:
- You add a new column to the table, and you’re using it to join to another table.
- You create a new table. However, Oracle has a process called “dynamic sampling” which usually means you don’t need statistics on a new table.
To compute statistics on an index, you can use the DBMS_STATS.GATHER_INDEX_STATS command.
1DBMS_STATS.GATHER_INDEX_STATS;
The COMPUTE STATISTICS parameter was deprecated in Oracle 10g in favour of this command.
Further Reading
- Brent Ozar on Disabling and Dropping Indexes
- Richard Foote on index statistics
- Examples of the ALTER INDEX command
- Stanford and Oracle post on managing indexes (Oracle 10g version)
- AskTOM thread
Create Effective Indexes With These Tips
When I started working with databases, I didn’t really understand indexes. If I found a slow running query, my first thought was to create some indexes and see if it works.
It seems like this is a common approach to slow queries. “Just create an index, it will be fine!”
However, once you understand indexes (after reading this guide), and once you get to know your data, you should be able to know whether creating an index is a good idea or not.
You should consider a few things when creating indexes:
- Consider indexing columns that are used often in WHERE clauses.
- Consider indexing columns that are used often in JOIN clauses to join tables.
- Index columns that have a high percentage of unique records (this percentage is called "selectivity")
- If a column has only a few distinct values, then don’t create a B-Tree index on the column. Consider creating a bitmap index instead.
- Don’t create indexes on columns that are updated frequently, as this can slow down the performance of the UPDATE, INSERT, and DELETE statements.
- If your query uses a function in the WHERE clause, consider creating a function-based index. Indexes without the function will not be used on queries that use a function.
- Consider indexing foreign key columns as these are not indexed automatically. These may already be covered by the point above which mentions join clauses, as foreign keys are used in joins.
- Consider creating a composite index (an index on two or more columns) where all columns are used in a SELECT query (which means the data can be loaded from the index and not the table).
Whenever you create your index, you should compare the advantages of the performance of the SELECT statements against the increase in processing time for INSERT/UPDATE/DELETE statements and the storage taken up by the index. Sometimes a minor increase in SELECT performance is not worth it when an index is large or causes INSERT queries to slow down.
Why can't I just create indexes on every column?
If you have too many indexes on a table, or have created an index to "see if it will work", then this can cause more problems than it solves.
The database will need to store the data for the index somewhere, which can take up space. Sure, disk space is cheap in theory, but some databases are on shared servers and have limited space.
Also, every time a row is inserted, updated, or deleted, the index needs to be updated. This can cause these operations to run slower, which can slow down your whole system.
So, it's much better to be specific and particular about which indexes you need to create.
Two factors that really help with determining this are cardinality and selectivity.
Cardinality vs Selectivity
Cardinality and selectivity are two terms that are often used in the SQL world when working with indexes. I’ll explain what they are and how they are different.
Cardinality is the number of distinct values in a particular column.
Selectivity is the ratio of cardinality on an indexed column to the total number of rows in a table.
Cardinality could be a value from 1 (if all values in a table for a particular column are the same) up to thousands or millions.
Selectivity will always be between 0 and 1, as it is a ratio.
Let’s see an example. Let’s say we have a table called customers and a column called business_size:
CUSTOMER_ID | BUSINESS_SIZE |
---|---|
1 | S |
2 | L |
3 | M |
4 | M |
5 | S |
For the business_size column index, the cardinality would be 3, as there are 3 distinct values (S, M, and L).
The selectivity would be 0.6. This is calculated as cardinality/total rows = 3/5 = 0.6.
How does this relate to indexes?
A selectivity value of a query closer to 1 is said to have high selectivity, and closer to 0 is called low selectivity.
If a query has a high selectivity, then it is normally fast to select using an index. If a query has low selectivity, it is often not worth the time to use the index in the query, and the database may not use it.
So, how do you ensure you have high selectivity?
You can add a composite index, which is an index on two columns instead of one. This will ensure the index has more unique values, which has high selectivity.
Further Reading
- Cardinality and Selectivity example in MySQL
- Increasing selectivity example
- Example of adding a MySQL index on a WordPress table
Conclusion
So, this guide has introduced you to all you need to know about indexes. It has explained what an index is, showed the syntax of creating indexes, explained the different types of indexes, and how to modify the indexes.
In most cases, you'll be creating b-tree indexes. Bitmap indexes can also be useful, and function-based indexes are helpful if you use functions or expressions.
Don't create indexes on every column as it can slow your database down. Understand the queries you need to run regularly, and optimise around those.
Do you have any questions on indexes in databases? Leave your comments below!
Lastly, if you want to improve your SQL skills, and get a bonus PDF file with a diagram showing you when to create different types of indexes, enter your email below!
Special thanks to Jeff Kemp for some corrections and feedback on this article. You can read his Oracle blog here.
