Last active
August 29, 2015 14:13
-
-
Save tkMageztik/5170668457bae1e48d1b to your computer and use it in GitHub Desktop.
Delete duplicates rows, except 1.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
First of all, you need to decide what is the duplicate criteria that you will use. For example I want to check if the phone number is duplicate in the same Org Nro, company Name and postalcode. I can do this: | |
First table for test | |
CREATE TABLE TEST | |
( | |
id int, | |
phonenumber int, | |
orgno int, | |
companyname varchar(100), | |
postalcode varchar(4) | |
) | |
Then Test data | |
insert into TEST | |
values (1,4533660,1,'COMPANY 1',1234) | |
insert into TEST | |
values (2,4533660,1,'COMPANY 1',1234) | |
insert into TEST | |
values (3,954189547,1,'COMPANY 2',4444) | |
insert into TEST | |
values (4,954189547,1,'COMPANY 2',4444) | |
insert into TEST | |
values (5,3652591,1,'COMPANY 3',4444) | |
insert into TEST | |
values (6,4201580,1,'COMPANY 4',4444) | |
insert into TEST | |
values (7,3337788,1,'COMPANY 5',4444) | |
Finally, the query for duplicates, with the criteria selected. | |
Select phonenumber, | |
orgno, | |
companyname, | |
postalcode, | |
COUNT(*) | |
from test | |
group by phonenumber, | |
orgno, | |
companyname, | |
postalcode | |
HAVING COUNT(*) > 1 | |
With this query, you can easy find duplicates, most important is review the criteria of duplicate. | |
Edit | |
If you want delete all duplicate rows except 1, you can do this: | |
DELETE A | |
FROM TEST A | |
INNER JOIN | |
( | |
SELECT ROW_NUMBER()OVER(PARTITION BY a.phonenumber, | |
a.orgno, | |
a.companyname, | |
a.postalcode ORDER BY a.id)AS POS, | |
a.phonenumber, | |
a.orgno, | |
a.companyname, | |
a.postalcode, | |
a.id | |
FROM TEST A | |
JOIN | |
( | |
SELECT | |
phonenumber, | |
orgno, | |
companyname, | |
postalcode, | |
COUNT(*) AS CONTADOR | |
FROM test | |
GROUP BY phonenumber, | |
orgno, | |
companyname, | |
postalcode | |
HAVING COUNT(*) > 1 | |
) TB | |
ON A.companyname = TB.companyname | |
AND A.orgno = TB.orgno | |
AND A.phonenumber = TB.phonenumber | |
AND A.postalcode = TB.postalcode | |
) TB_2 | |
ON | |
a.id = tb_2.id | |
--http://stackoverflow.com/questions/27907438/script-to-remove-duplicate-from-database/27907712#27907712 | |
and tb_2.pos > 1 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment