Find duplicate records in a table and remove them

Find duplicate records in a table and remove them

Why duplicates?

It is surprisingly easy to inadvertently create duplicate records in a table. More often than not, it is because we either didn’t check for an existing record or we didn’t look for the right records by messing up the keys in some way.

How do we delete the duplicates?

This first step is to find the duplicate records. I use this code: –

with x as   (select  *,rn = row_number()
            over(PARTITION BY fieldName order by FieldName)
            from    mytable)
select * from x
where rn > 1

This lists all instances where there are duplicates of ‘fieldName’ in the table ‘mytable’.

Of course, you can change the field and table names to suit your use case.

The great thing about this query is that you can remove the duplicates from the table by replacing the ‘select * from x’ line with ‘delete x’ so the code will look like this: –

with x as   (select  *,rn = row_number()
            over(PARTITION BY FieldName order by FieldName)
            from    mytable)
delete x
where rn > 1

Stephen

Hi, my name is Stephen Finchett. I have been a software engineer for over 30 years and worked on complex, business critical, multi-user systems for all of my career. For the last 15 years, I have been concentrating on web based solutions using the Microsoft Stack including ASP.Net, C#, TypeScript, SQL Server and running everything at scale within Kubernetes.