FB

SQL Merge: The Complete Guide

Do you need to insert or update data depending on if it already exists?

You can do this with the SQL MERGE statement.

What is the MERGE Statement?

The MERGE statement is a type of statement that lets you either insert data or update data, depending on if it already exists. It lets you merge two tables in SQL. It’s a bit smarter than an INSERT INTO SELECT statement.

SQL MERGE is available in Oracle, SQL Server, and Postgres (not MySQL).

You can analyse records to see if they match between a source and a target table. If the record is found, then an update can be performed. If the record is not found, then an insert can be performed.

It’s often called an “upsert”, which is a combination of the word “update” and “insert”, which is what the MERGE statement does.

 

Why is the MERGE Statement Useful?

The MERGE statement is useful for ETL processes, where you need to move large amounts of data.

It’s an alternative to a combination of INSERT and UPDATE statements, and it’s faster, and easier to code.

It’s often useful when you need to synchronise data regularly from a source table (e.g. daily, weekly, monthly), and you want to update some record and insert new records.

 

MERGE Statement Syntax

The syntax of a MERGE statement in SQL looks like this:

MERGE INTO table_name
USING table_name
ON (condition)
WHEN MATCHED THEN update_clause
DELETE where_clause
WHEN NOT MATCHED THEN insert_clause
[LOG ERRORS log_errors_clause reject_limit <integer | unlimited> ]

This syntax includes:

  • Where the data comes from
  • Where the data is being inserted or updated to
  • How to match the data
  • What happens when a match is found
  • What happens when a match is not found
  • How to log errors, if needed

Let’s look at each of these clauses.

MERGE INTO table_name

This clause lets you specify the target table name or view name that you are updating or inserting data into.

If you’re using a view, the view needs to be an updatable view.

You also need to have INSERT and UPDATE privileges on this table to be able to perform a MERGE on it.

USING table_name

This is where you define where the data comes from. It can be a table, a view, or a subquery.

ON (condition)

This is where you specify the condition that determines if the MERGE statement updates or inserts a row.

Each row in the source table is checked against this condition. If it is true, then the record in the target table is updated, based on what you specify in the update clause. If it is false, then the record is inserted, based on what you specify in the insert clause.

This is often done on some kind of ID field, but can be done on other fields that identify the record.

WHEN MATCHED THEN update_clause

This clause specifies the columns and their new values in the target table.

The WHEN MATCHED means that this clause only occurs if the ON condition is TRUE for a particular row. For example, if the source row already exists in the target table.

You can add a WHERE clause into this update clause to update data if another condition is true. If this condition is true, the data is updated, otherwise no data is updated.

DELETE where_clause

This clause lets you delete data from the target table after it is updated, if it meets a condition you specify.

This allows you to clean up data while you’re updating it.

It only applies to rows that meet the WHERE clause specified here AND the ON condition specified earlier. If a row meets this where clause but not the ON condition, then it is not deleted.

Also, this WHERE clause looks at the value after the update statement occurs, not the old value before it was updated.

It saves you from having to write a separate delete statement.

WHEN NOT MATCHED THEN insert_clause

This clause lets you specify values to insert into the target table, if the ON condition is not met.

You should specify the column names and the values here, just like a regular INSERT statement. If you omit the column names, then the columns in the table must match those in the VALUES clause – which can be subject to change in the future!

A WHERE clause can be added here, to allow the database to only insert data that matches the WHERE clause as well as returning false from the ON condition. If the WHERE clause is met but the ON condition returns true, then the data is not inserted.

LOG ERRORS

This clause lets you log any errors that are encountered into a separate table, for you to analyse later.

The syntax of this clause looks like this:

LOG ERRORS [INTO table_name] [ (simple_expression) ] [REJECT LIMIT integer|UNLIMITED ]

What do all of these keywords mean?

  • INTO table_name: This is where you specify the name of the table you want to log errors into. This is an optional clause. If you don’t specify the INTO table_name, then the database will use the default error logging table, where the name is generated by the DBMS_ERRLOG package. The default name is ERR$_, followed by the first 25 characters of the table being updated.
  • simple_expression: This is used to identify errors from this statement in the logging table. It can be a literal text or number value, or a general SQL expression.
  • REJECT LIMIT: This is where you can specify an upper limit for the number of errors to be logged before the MERGE statement ends and rolls back any changes. The default reject limit is 0.

There are some restrictions on this error logging feature:

  • The following conditions cause the MERGE statement to fail and roll back, and not use the error logging capability:
    • Deferred constraints that are violated
    • Any MERGE operation that raises a unique constraint or index violation
  • You can’t track errors in the error logging table ofr LONG, LOB, or object type columns.
  • If the error logging table contains a column of an unsupported type and that column name relates to a column in the target table, then the merge statement will fail when it is parsed.

Merge SQL Examples

Right now you might be thinking this statement seems helpful but complicated.

So, let’s take a look at some examples to help explain it.

We’re going to use a customer table that stores names, states, and emails for customers. We also have a customer_import table which is where our source data comes from.

This is a simplified scenario of loading a table into a data warehouse, where you have a source table from the main system and a target table of the data warehouse table.

In this example, which uses the same data and structure as my Introduction to SQL course, we have this table:

CREATE TABLE customer (
  customer_id NUMBER(5),
  first_name VARCHAR2(100),
  last_name VARCHAR2(100),
  address_state VARCHAR2(10),
  email_address VARCHAR2(350),
  CONSTRAINT pk_cust PRIMARY KEY (customer_id)
);

Now we’ll insert some data.

INSERT INTO customer (customer_id, first_name, last_name, address_state, email_address)
VALUES (1, 'Teresa', 'Hudson', 'NY', '[email protected]');
INSERT INTO customer (customer_id, first_name, last_name, address_state, email_address) 
VALUES (2, 'Fred', 'Montgomery', 'CA', '[email protected]');
INSERT INTO customer (customer_id, first_name, last_name, address_state, email_address) 
VALUES (3, 'Lois', 'Lawson', 'OR', '[email protected]');
INSERT INTO customer (customer_id, first_name, last_name, address_state, email_address) 
VALUES (4, 'Alice', 'Perry', 'SC', 'aliceperry');
INSERT INTO customer (customer_id, first_name, last_name, address_state, email_address) 
VALUES (5, 'Ralph', 'Montgomery', 'TX', '[email protected]');
INSERT INTO customer (customer_id, first_name, last_name, address_state, email_address) 
VALUES (6, 'Dorothy', 'Armstrong', 'OR', '[email protected]');
INSERT INTO customer (customer_id, first_name, last_name, address_state, email_address) 
VALUES (7, 'Fred', 'Wallace', 'NY', '[email protected]');
INSERT INTO customer (customer_id, first_name, last_name, address_state, email_address) 
VALUES (8, 'Joseph', 'Bell', 'FL', '[email protected]');
INSERT INTO customer (customer_id, first_name, last_name, address_state, email_address) 
VALUES (9, 'Lois', 'Martinez', 'CALIF', '[email protected]');
INSERT INTO customer (customer_id, first_name, last_name, address_state, email_address) 
VALUES (10, 'Robert', 'Rice', 'IN', 'robrice123');

You might notice that some of the state values and email addresses don’t meet the format you might expect. We’ll see how these change during the MERGE process.

Now, let’s say we have a customer_import table using the same table syntax.

CREATE TABLE customer_import (
  customer_id NUMBER(5),
  first_name VARCHAR2(100),
  last_name VARCHAR2(100),
  address_state VARCHAR2(10),
  email_address VARCHAR2(350)
);

This table is populated with the following data:

INSERT INTO customer_import (customer_id, first_name, last_name, address_state, email_address) 
VALUES (15, 'John', 'Stock', 'ND', '[email protected]');
INSERT INTO customer_import (customer_id, first_name, last_name, address_state, email_address) 
VALUES (12, 'Joseph', 'Bell', 'OR', '[email protected]');
INSERT INTO customer_import (customer_id, first_name, last_name, address_state, email_address) 
VALUES (3, 'Lois', 'Lawson', 'WA', '[email protected]');
INSERT INTO customer_import (customer_id, first_name, last_name, address_state, email_address) 
VALUES (8, 'Joseph', 'Bell', 'FL', '[email protected]');
INSERT INTO customer_import (customer_id, first_name, last_name, address_state, email_address) 
VALUES (1, 'Teresa', 'Hudson', 'NY', '[email protected]');
INSERT INTO customer_import (customer_id, first_name, last_name, address_state, email_address) 
VALUES (4, 'Alice', 'Perry', 'SC', '[email protected]');

It doesn’t matter that the values aren’t in any particular order.

We now have our two tables: customer, and customer_import. They look like this:

customer:

customer _id first_name last_name address _state email_address
1 Teresa Hudson NY [email protected]
2 Fred Montgomery CA [email protected]
3 Lois Lawson OR [email protected]
4 Alice Perry SC aliceperry
5 Ralph Montgomery TX [email protected]
6 Dorothy Armstrong OR [email protected]
7 Fred Wallace NY [email protected]
8 Joseph Bell FL [email protected]
9 Lois Martinez CALIF [email protected]
10 Robert Rice IN robrice123

customer_import:

customer _id first_name last_name address _state email_address
15 John Stock ND [email protected]
12 Joseph Bell OR [email protected]
3 Lois Lawson WA [email protected]
8 Joseph Bell FL [email protected]
1 Teresa Hudson NY [email protected]
4 Alice Perry SC [email protected]

Let’s use the MERGE statement to load the data from the customer_import table into the customer table.

We’ll use the customer_id field to see if the record already exists. We could use any of the other fields, such as first_name and last_name, but in this example the customer_id is the identifier.

So, our statement would look like this:

MERGE INTO customer c
USING customer_import i
ON (c.customer_id = i.customer_id)
WHEN MATCHED THEN
UPDATE SET
c.first_name = i.first_name,
c.last_name = i.last_name,
c.address_state = i.address_state,
c.email_address = i.email_address
WHEN NOT MATCHED THEN
INSERT (c.customer_id, c.first_name, c.last_name, c.address_state, c.email_address)
VALUES (i.customer_id, i.first_name, i.last_name, i.address_state, i.email_address);

Now, let’s see the updated customer table.

customer _id first_name last_name address _state email_address
1 Teresa Hudson NY [email protected]
2 Fred Montgomery CA [email protected]
3 Lois Lawson WA [email protected]
4 Alice Perry SC [email protected]
5 Ralph Montgomery TX [email protected]
6 Dorothy Armstrong OR [email protected]
7 Fred Wallace NY [email protected]
8 Joseph Bell FL [email protected]
9 Lois Martinez CALIF [email protected]
10 Robert Rice IN robrice123
12 Joseph Bell OR [email protected]
15 John Stock ND [email protected]

I have highlighted the data that is new or changed here.

The changes that came from the customer_import table can be explained:

  • ID 15 was inserted as it’s a new row (no row with ID 15 exists).
  • ID 12 was inserted as it’s a new row (no row with ID 12 exists). Even though it’s the same name as ID 8, it’s a new row because the matching is done on customer_id and not on the name.
  • ID 3 had the address_state updated as it was the only field that was different.
  • ID 8 had the email_address updated.
  • ID 1 was in the import table but no data changed, as it was already the same.
  • ID 4 had the email_address updated as well.

So with this simple example, we can see that some data is updated and other data is inserted, based on if a match is found.

Insert Only and Update Only

Can you do an insert-only or an update-only MERGE statement?

Yes, you can. There are two ways to do this:

  1. Omit one of the clauses
  2. Use a condition that is always true or always false

Why would you do this as a MERGE instead of an INSERT or UPDATE statement?

Well, if you only want to update data that already exists, or insert data that does not exist, then you might find it easier to write an SQL statement using MERGE than writing an INSERT or an UPDATE statement that looks for matches using a subquery.

Let’s look at the two methods.

Method 1: Omit one of the clauses

Because the WHEN MATCHED and WHEN NOT MATCHED clauses are optional, one of them can be omitted from the MERGE statement.

If you leave out the WHEN MATCHED clause, it means no data will be updated, and it will be an insert-only MERGE statement.

If you leave out the WHEN NOT MATCHED clause, it means no data will be inserted, and it will be an update-only MERGE statement.

Using the previous example, writing an insert-only MERGE statement would look like this:

MERGE INTO customer c
USING customer_import i
ON (c.customer_id = i.customer_id)
WHEN NOT MATCHED THEN
INSERT (c.customer_id, c.first_name, c.last_name, c.address_state, c.email_address)
VALUES (c.customer_id, c.first_name, c.last_name, c.address_state, c.email_address);

Writing an update-only MERGE statement would look like this:

MERGE INTO customer c
USING customer_import i
ON (c.customer_id = i.customer_id)
WHEN MATCHED THEN
UPDATE SET
c.first_name = i.first_name,
c.last_name = i.last_name,
c.address_state = i.address_state,
c.email_address = i.email_address;

Method 2: Use a condition that’s always true or false

Another way to write a MERGE statement that only inserts or updates is to use a condition in the MERGE statement that always returns true or false.

If the condition always returns true, then the WHEN MATCHED will always run. Alternatively, if the condition always returns false, then the WHEN NOT MATCHED will always run.

An example of a true condition would be this:

MERGE INTO customer
USING customer_import
ON (1=1)

An example of a false condition would be this:

MERGE INTO customer
USING customer_import
ON (1=0)

What is the advantage of writing it this way? There is no join performed to the second table, which means it could perform faster.

Can the Database Update Multiple Tables as Part of the MERGE Statement?

No, this is not possible. Both the MERGE statement and the UPDATE statement only allow you to update one table as part of a query.

To update two tables, you could write some procedural code to get the data you need to update, and update two tables inside a loop.

Things to Be Aware Of With the SQL MERGE Statement

There are a few things to be aware of if you’re using the MERGE statement in SQL.

  • You need to have the INSERT and UPDATE privilege on the target table, and the SELECT privilege on the source table for the MERGE statement to work.
  • You can’t update any of the columns you are matching on in the ON clause. Using the example above, if you try to update the customer_id, you’ll get an error. In Oracle, this is: ORA-38104: Columns referenced in the ON clause cannot be updated.
  • You can only update a row in the target table once.
  • Oracle does not implement fine-grained access control during MERGE statements.
  • The MERGE statement will record the row as “merged” if the source and target record is identical and the merge condition is true.
  • MERGE does not tell you the number of rows that have been inserted or updated. It only shows the total number of rows merged.

Conclusion

The MERGE statement is quite useful when updating records in a table from another table. It’s often used in ETL or data warehousing scenarios but is helpful whenever you need to perform an insert and update based on a condition.

17 thoughts on “SQL Merge: The Complete Guide”

  1. Thank you very much, this is really helpful to me.
    You have done justice to MERGE SQL oracle by this Article.

    However I will appreciate your further supports as am preparing to write my SQL ORACLE certification Exams

  2. Don’t think that the following is correct:
    WHEN NOT MATCHED THEN
    INSERT (c.customer_id, c.first_name, c.last_name, c.address_state, c.email_address)
    VALUES (c.customer_id, c.first_name, c.last_name, c.address_state, c.email_address);

    Should be inserting values in c from i.
    WHEN NOT MATCHED THEN
    INSERT (c.customer_id, c.first_name, c.last_name, c.address_state, c.email_address)
    VALUES (i.customer_id, i.first_name, i.last_name, i.address_state, i.email_address);

  3. Hi, Thank you for the informative article. I am using a view & table in the select part of the merge and running into performance issue, although the table is indexed. Can you give some tips on how to improve performance when using merge

    1. Hi Priya,
      I would suggest running the SELECT statement by itself and analysing the performance.
      You can take a look at the EXPLAIN PLAN to see where issues may be appearing.
      Could you send me the SQL query and I could take a look? You can use the contact form to email me.
      Thanks,
      Ben

  4. Varsheeth T V

    Hi Bro,
    i have a Question; can i use MERGE statement to check where a field in table is NULL or NOT-NULL and insert new value if it’s null or add the new value to existing value if it’s not null.

  5. Hi, I’m using MERGE in my ETL to update a table from external datasource.
    However my table is versioned so I run only WHEN NOT MATCHED and in test condition every column is tested for match except VERSION.
    New VERSION as SYSDATE is used with INSERT when values do not match.
    I then use separate UPDATE clause to put records with 0 numerical values to indicate ones removed from the source.
    In this way I can ask my table for data snapshot from any point of time.
    Is there any way to utilize same MERGE to update records to 0 when they are not found in source table?

  6. Hi,
    some above you stated the DELETE can only be used as part of the match after the update but i think it can perform match-before the update as well depending on how its strutured.

  7. RANJITH Kumar

    Thanks for your notes on merge . it made me to understand with clear explanation in the first attempt

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Table of Contents