SQL: If Exists Update Else Insert


This is a pretty common situation that comes up when performing
database operations.  A stored procedure is called and the data needs
to be updated if it already exists and inserted if it does not.  If we
refer to the Books Online documentation, it gives examples that are
similar to:

IF EXISTS (SELECT * FROM Table1 WHERE Column1=’SomeValue’)
    UPDATE Table1 SET (…) WHERE Column1=’SomeValue’
ELSE
    INSERT INTO Table1 VALUES (…)

This
approach does work, however it might not always be the best approach. 
This will do a table/index scan for both the SELECT statement and the
UPDATE statement.  In most standard approaches, the following statement
will likely provide better performance.  It will only perform one
table/index scan instead of the two that are performed in the previous
approach.

UPDATE Table1 SET (…) WHERE Column1=’SomeValue’
IF @@ROWCOUNT=0
    INSERT INTO Table1 VALUES (…)

The saved table/index scan can increase performance quite a bit as the number of rows in the targeted table grows.

Just
remember, the examples in the MSDN documentation are usually the
easiest way to implement something, not necessarily the best way.  Also
(as I re-learned recently), with any database operation, it is
good to performance test the different approaches that you take. 
Sometimes the method that you think would be the worst might actually
outperform the way that you think would be the better way.
 

Comments (51)

  1. jackbond says:

    Thanks for the tip, I have a lot of code with the inferior approach that I will be updating. Fortunately, SQL 2008’s Merge statement will make this a non-issue.

  2. Sent to me by Chris:

    "Since I don’t have a blog and you don’t allow anonymous comments I thought I’d shoot a quick email with a question/concern. Regarding your post "SQL: If Exists Update Else Insert" with the alternative method of doing the Update and then checking the @@ROWCOUNT as to whether to perform an insert or not… I definitely would not have thought of it that way either. However, I was wondering, with the scope of @@ROWCOUNT being global is there the possibility that the @@ROWCOUNT value could be incorrect within the local scope if there is heavy use on the database? Or am I incorrect on the scope being global?  The reason I thought of this goes back to the use of SCOPE_IDENTITY() over @@IDENTITY  to get the last identity in the same scope.

    Thanks for your time."

  3. I have enabled anonymous comments.  I thought that I had done that before, but I guess not.

    Chris,

     Great question!  

     The scope of @@ROWCOUNT (or ROWCOUNT_BIG() in the case of a VERY large result) is limited to the current scope that contains the statement that was previously executed.  So you are guaranteed the correct result when using @@ROWCOUNT.  

     Like you mentioned, this is not the case with @@IDENTITY.  You can get some really unexpected results if it is used improperly.  

    Jeremiah

  4. Will this code work for any SQL compliant server?

  5. Andrew,

     The basic idea should work with other types of SQL servers (however, I have not verified this).  You should just be able to use the @@ROWCOUNT equivalent for the system that you are using.

    For example:

    MySql uses "row_count()"

    Oracle uses "sql%rowcount"

  6. Guy says:

    If you are thinking of MySQL there is a function called ROW_COUNT() that serves the same purpose (available as of version 5). Theoretically Jeremiah’s suggestion should result in improved performance on MySQL also, but you should test it to make sure.

  7. Richard Ayotte says:

    MySQL does something much better. It’s not part of the standard but still very useful.

    REPLACE [LOW_PRIORITY | DELAYED]

       [INTO] tbl_name [(col_name,…)]

       VALUES ({expr | DEFAULT},…),(…),…

    It does what you describe. If the record exists, it is updated, else inserted.

    http://dev.mysql.com/doc/refman/5.0/en/replace.html

  8. Richard,

    Thanks for the comment.  I was not aware of the REPLACE statement in MySQL.

    Looking at the documentation, it does effectively do the same thing.  But it seems to do it in a different manner.  If a row with the same PK or UNIQUE index exists, it deletes the old row and then inserts the new row.  For anything that does not already exist in the table, it inserts.

    I don’t know if the deleting is more or less efficient than doing an update in MySQL.  I would advise trying both ways and then compare the performance to see which solution better fits your needs.

  9. Looking into the MySql documentation a little more I found this statement:

    INSERT … ON DUPLICATE KEY UPDATE

    This seems to be closer to the upsert code that I initially wrote about.

  10. JohnEric says:

    RE: @@identity

    You can use Scope_Identity() instead.  That guarantees the correct value.

  11. Marc Brooks says:

    I vastly prefer this sort of game, for example a HitCounter for some item that automatically inserts new rows houry:

    INSERT INTO dbo.HitCounter(ItemID, TimeSlot)

    SELECT TOP 1

    @ItemID AS ID

    ,DateAdd(hh, DateDiff(hh, 0, GetUtcDate()), 0) AS TimeSlot

    FROM (SELECT 1 AS FakeColumn) AS FakeTable

    WHERE NOT EXISTS (SELECT * FROM dbo.HitCounter

                     WHERE ItemID = @ItemID

                     AND TimeSlot = DateAdd(hh, DateDiff(hh, 0, GetUtcDate()), 0))

    UPDATE dbo.HitCounter

    SET Hits = Hits + 1

    WHERE ItemID = @ItemID

    AND TimeSlot = DateAdd(hh, DateDiff(hh, 0, GetUtcDate()), 0)

    SELECT ID

    FROM dbo.HitCounter

    WHERE ItemID = @ItemID

  12. Gabriel O. Perez says:

    This is a great approach that I never thought of, and the only concern I had was answered within the comments section. Thanks!

  13. gOODiDEA says:

    Debug:DebuggingIIS7.0WebapplicationsremotelywithVisualStudio2008EmbeddingASP.NETServer…

  14. Joe Davies says:

    This is great!  I shaved lots of time off my bulk insert/updates.  

    Thanks!

  15. Thanks for the tip–just tried it and it worked perfectly.

    Is this more efficient than doing an "IF SELECT COUNT(*) FROM … > 0 ELSE … "

    That would select only one value, I’m just not fully sure of the overhead created when SQL Server executes the SELECT in this vs. an UPDATE that makes no changes.  I’m not really sure how to really test this either and for my small app, it’s really negligible anyway.

    Either way, yours worked seamlessly, so thanks!

  16. Using REPLACE in mysql is problematic because it DELETEs the existing row before INSERTing.

    The INSERT … ON DUPLICATE KEY UPDATE you described is the better solution for mysql, until SQL2003’s MERGE is implemented widely.

  17. John says:

    What about the performance difference in doing a delete and insert vs. an update?

    I have an application that will probably always update 4 rows in the table (after the first iteration).  Performance wise, should I do a delete for those four rows and then insert each row or simply do four updates?

  18. Brad says:

    This was a great help Jeremiah!  Our DBA recommended checking @@ERROR=0 in addition to @@ROWCOUNT=0 just in case the UPDATE was the right thing to do, but didn’t work for whatever reason.

  19. PAKO says:

    I CANTO USE DE IF CONDITION IN MYSQL SERVER 5.0

  20. Martin says:

    I’m trying to do this in a way that will work on SQL Server and DB2. Unfortunately DB2 is extremely primitive and as soon as you involve ‘IF’ you are into requiring a procedure, can’t just run it in a script. The other side of the coin is that MERGER, which will do the job nicely on DB2, isn’t supported in SQL Server before SQL 2008.

    Has anyone a solution that will run in a script on both DB2 and SQL Server 2005?

  21. Mithun says:

    How does this behave in a multi-threaded environment? How to solve in that case?

  22. Terry says:

    My scenario was that I needed to SELECT a journal number (if it existed), otherwise create the journal number, and INSERT it.  (I didn’t need to do any updates in my scenario).  Here is my quick solution:

    ————————————–

    select @JournalNumber = JournalNumber  from JournalHeaderTable with (nolock) where BatchName = @BatchName

    IF @@ROWCOUNT=0

    BEGIN

         exec  taGetNextJournalEntry  @O_vJournalEntryNumber = @JournalNumber output

         BEGIN TRANSACTION

               insert into JournalHeaderTable(BatchName, JournalNumber, Reference, TrxDate, TrxType)

               values (@BatchName, @JournalNumber, @Refrence, @TransactionDate, 0)

         COMMIT TRANSACTION

    END

  23. Karthik says:

    In case of SQL 2008, we can use MERGE for this..just implemented that within my SSIS package to handle new and modified data, and it works like a charm..

  24. Karim Sultan says:

    Hey thanks Jeremiah, for SQL 2005, your solution is the best I’ve seen yet (after trolling and review 3 other suggestions).  I’ll gladly give up portability (it’s a quick port to MySQL and Oracle anyways) for a performance gain.  In fact, it’s not so much a performance gain as it is a reduction in queries (max 2 versus max 3) that makes it much more appealing over the long run.  Thanks for the insight, and keep sharing.

  25. Pete says:

    This is a great tip no doubt, however one observation is, it is being compared with bad code in the first instance.

    Here’s the original example

    IF EXISTS (SELECT * FROM Table1 WHERE Column1=’SomeValue’)

       UPDATE Table1 SET (…) WHERE Column1=’SomeValue’

    ELSE

       INSERT INTO Table1 VALUES (…)

    yes this will be slow. The major error is in the select * statement.

    Most Developers know not to use this, but for the wrong reasons.

    It’s not because there are more columns returned and therefore bytes per row (which is true, but has minimal effect for most queries based on normailsed tables)

    No, the real problem is Select * forces sql to tablescan unless every column returned is in an index. In other words, every row in the table is read – that’s why it’s slow.

    If you use Select ‘PrimaryKeyColumn’ or Select ‘IndexedColumn(s),….’ performance is improved by a huge factor.

    Just try it on a large table with a bulk update. or use SLQ profiler and view the queryplan – you’ll see select * using 99% more resources when returning the exact same query (except for the columns) limited by the same where clause

    Never the less, the solution offered by Jerimihah is a great idea and outside the box. I think adding an errorcheck as suggested by one DBA is also a good idea.

    just my 2 cents worth

  26. pooja says:

    this is not working on mysql..plz comment why

  27. chris says:

    Hello,

    Is this code thread safe?

    Would…

    UPDATE Table1 SET (…) WHERE Column1=’SomeValue’

    IF @@ROWCOUNT=0

       INSERT INTO Table1 VALUES (…)

    …need wrapping in a transaction? i.e if two processes read @@ rowcount as 0 (as the row didn’t exist and then both try and INSERT you’d get an error? Or would SQL server run the whole statement as an atomic operation?

  28. chris,

     Although I cannot find the official documentation that backs this up, I am slmost 100% sure that @@rowcount is scoped to the query/session/connection that you are using.  As far as I know, it is "thread safe" as pertaining to your question.

  29. emmy says:

    hi Jeremiah,

    i have a problem inserting data only once.

    i knew that the data is inserted twice as i’m grasping these data from finger print device.for example i grasp data from 28/10/2009 from the buffer of the machine.i just want to compare datetime field in database with today’s date.suppose

    min(datetime1) is 28/10/2009 and max(datetime1) is 13/12/2009 i want to insert 14,15 & 16/12.if tommorrow is 17/12/2009 .

    i tried to make condition to  insert only today’s date but it keeps inserting it.i mean i eant to make sure it’s inserted only once.if it’s exists i may make update to the existing data.

    i tried to make this procedure: it didnt work out.

    create procedure sp_attendance5 @EnrollNo int=null,@name1 varchar(20)=null,

    @datetime1 datetime=NULL,@check1 varchar(20)=null,@num int= null

    as

    begin

    IF EXISTS (select EnrollNo,name1,datetime1,check1 from attendance)

    update attendance set (EnrollNo=@EnrollNo,name1=@name1,datetime1=@datetime1,check1=@check1) where num=@num

    else

    if datetime1<=dateadd(dd,DATEDIFF(dd,0,getdate()),0)                                

    insert into attendance(EnrollNo,name1,datetime1,check1) values (@EnrollNo,@name1,@datetime1,@check1)

    end

    go

     the primary key in attendance database is Num it’s auto increment

    Num,EnrollNo,name1,datetime1,check1

    1 134 Adam Hurry 28/10/2009 14:29:49 check in

    2 102 Noel Lipson 28/10/2009 17:21:12 check in

    3 102 Noel Lipson 28/10/2009 17:21:19 Check Out

  30. Brian says:

    I had a need for a similar use, but on MySQL.  Instead of @@ROWCOUNT you can use ROW_COUNT(), but in using it, I revealed another interesting problem…

    Affected rows (in mySQL anyways) counts an affected row only if it CHANGES.  This means that if a row exists and your update was successful, it may still return 0 affected rows.  If you don’t know if the row exists, chances are you might not know if it’s changing either.

    This means this method may add a duplicate row anyways!

    At least this is what I encountered in mySQL.  I would have liked to do it all in one Query, but until I know a way that works better, I’ll have to settle!

    Brian

  31. emmy says:

    but i’m not using my sql.it’s sql server 2005!.

  32. Vibhore says:

    Just for sake of completeness 2 things

    1. Isnt Update a Blocking call? In case the user expect most of the time Insert will happen he should go with if exists(select) insert else update. Also in some cases only an insert is required if record doesnt exist, the basic if not exists(select) insert , will work best

    2. but if most of the time update is expected the way you mentioned works better.

    Had been using these for past few months and one of my friend Just stumbled on your blog and even for the case where update is never expected  or going to happen only 5-10% of the time, he used your way. So thought will mention these. Would love to know your views

  33. Vibhore,

     While it is true that the update statement is doing nothing if you are just doing inserts, it is no less performant that using if (exists).  In the case of an insert, both will perform one table scan (or index search) to perform the insert.  In the case of Update first, it is essentially a no-op.

     The benefit is really seen when you are doing both inserts and updates, since the if (exists) method will perform 2 scans, whereas the update (@@Rowcount) method will only perform 1.

     So to sum up, if you know that you are only doing inserts, then this doesn’t apply.  But if you perform any updates in addidtion to inserts, then the reduced scans are quite beneficial.

     My advice would be to use SQL Profiler and run a sample set of data using both methods.  

    Thanks for the comment.

  34. saloni says:

    hi,

    your post is very helpful article as im working on same scenario.

    but im new to pl/sql can you please write full query.

  35. Rash says:

    Your Tutorial is very nice,as your post i m creating procedure but getting error ,please tell me where i m getting error;

    Error Descrption:You hve an error  in your sql syntax: ckeck  the manual that corresponds to your MYSQL

    server version fro the right syntax to use near 'END' at line 13

    DELIMITER $$

    DROP PROCEDURE IF EXISTS `myhealthcamp`.`area` $$

    CREATE PROCEDURE `myhealthcamp`.`area`

    (

     IN id INT,

     IN ar VARCHAR(45)

    )

    BEGIN

     if exists (select area_id from area where area_id = id)

     Then

       update area set areaname = ar where area_id=id;

     else

       insert into area(area_id, areaname) values(id,ar);

    END IF

    END $$

    DELIMITER ;

  36. SkyNET says:

    Fortunately if you really kow your stuff, you'll have a logical mind to deduce such outcomes.

    Not claiming to have a logical mind like I hear most people, when really they can't work out simple problems. This is database querying, let's not get too carried away people. Quite frankly I can make my own data storage solutions that perform better than SQL anyway, just not as convenient!

  37. Alby says:

    Wouldn't the above example have a problem under high volume environments? I'm not an expert in SQL but it appears that Update does not lock if the row does not exist.

  38. Zote says:

    Does SQLServer supports something like Syabse SQL Anywhere on existing update clause? If not, do you have a plan to implement it?

  39. Justin says:

    MySQL has another approach for the same

    dev.mysql.com/…/insert-on-duplicate.html

  40. Fede says:

    I love you ! you are great!

  41. baptx says:

    Everything in one query with INSERT … ON DUPLICATE KEY UPDATE (MySQL) dev.mysql.com/…/insert-on-duplicate.html

  42. Ritu says:

    Thanks , Nice post

  43. Alexey says:

    Cool ;P Thx you!

  44. jaan says:

    great idea, thanks

  45. SGM says:

    UPDATE Table1 SET (…) WHERE Column1='SomeValue'

    IF @@ROWCOUNT=0

       INSERT INTO Table1 VALUES (…)

    Does it put the extra Load (pressure) on Transaction Log? Just because at time TWO statements gets logged in Transaction Log

    WHERE AS

    IF EXISTS (SELECT * FROM Table1 WHERE Column1='SomeValue')

       UPDATE Table1 SET (…) WHERE Column1='SomeValue'

    ELSE

       INSERT INTO Table1 VALUES (…)

    Please suggest???  

  46. LukePet says:

    Exists any free tool that dynamically generate the "If Exist Update Else Insert" statement for all records of a specific table?

  47. Bhanu Naidu says:

    Very simple and valuable post.  Thank you Jeremiah

  48. Mary says:

    I poured data from a relational database table to another table Staging DB. But first I need to remove the name of the database and add it to the table DicDB Staging database, but in case if in DicDB is entered in the name of the database with a specific ID, then the name need not be entered again and to have its existing ID. How can I write in Execute SQL Task tab SQL Statement?

  49. Coolguy says:

    Thank you Jeremiah, u are a true gentleman and a scholar

  50. Johnny Boy says:

    This here is a great article on the *fastest* way to insert a record where one doesn't already exist:

    cc.davelozinski.com/…/fastest-way-to-insert-new-records-where-one-doesnt-already-exist

    for those speed freaks amongst us. 🙂