Removing Duplicate Records in a SQL Server Table
Step 1: Creating a Table
/* Create Table with 7 entries - 3 are duplicate entries */ CREATE TABLE DuplicateRcordTable (Col1 INT, Col2 INT)
Step 2: Inserting data into the table
INSERT INTO DuplicateRcordTable SELECT 1, 1 UNION ALL SELECT 1, 1 --duplicate UNION ALL SELECT 1, 1 --duplicate UNION ALL SELECT 1, 2 UNION ALL SELECT 1, 2 --duplicate UNION ALL SELECT 1, 3 UNION ALL SELECT 1, 4 GO
Step 3: Selecting Data
/* It should give you 7 rows */ SELECT * FROM DuplicateRcordTable GO
Step 4: Deleting the Duplicate Records
/* Delete Duplicate records */ WITH CTE (COl1,Col2, DuplicateCount) AS ( SELECT COl1,Col2, ROW_NUMBER() OVER(PARTITION BY COl1,Col2 ORDER BY Col1) AS DuplicateCount FROM DuplicateRcordTable ) DELETE FROM CTE WHERE DuplicateCount > 1 GO
a CTE acts as a virtual table, I am able to process data modification statements against it, and the underlying table will be affected. In this case, I am removing any record from the SalesCTE that is ranked higher than 1. This will remove all of my duplicate records.
Step 5: Selecting the records after deleting the duplicate records
/* It should give you Distinct 4 records */ SELECT * FROM DuplicateRcordTable GO
----------------------------------------------------------------------------
2 way
Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.
DELETE
FROM MyTable WHERE ID NOT IN
(SELECT MAX(ID)FROM MyTable GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)
--------------------------------------------------------------------------------------------
No comments:
Post a Comment