Board index » delphi » find and delete duplicate records

find and delete duplicate records

Hi Group - Is there is a way to use a stored procedure to find and delete
duplicate records in a table? Does anybody have example code? Regards Quin
 

Re:find and delete duplicate records


"qWin"

Quote
> Hi Group - Is there is a way to use a stored procedure to find and delete
> duplicate records in a table? Does anybody have example code? Regards Quin

SELECT field1, field2, ... , fieldN, COUNT(*)
FROM yourtable
GROUP BY field1, field2, ... , fieldN
HAVING COUNT(*) > 1

HTH.

Ciao,
    Max

Re:find and delete duplicate records


The real problem comes when you want delete duplicated records because,
since they don't have unique identifier, you cant delete it with generic SQL
sentence.

The best bet you have is:

1. Create a new unique field on your table and update it with unique values:

    ALTER TABLE ... ADD ID NUMERIC(18, 0);
    CREATE GENERATOR ID_GEN;
    COMMIT;

    UPDATE TABLE ...
    SET ID = GEN_ID(ID_GEN, 1);
    COMMIT;

2. Write the SP such:

    CREATE PROCEDURE ... AS
        DECLARE ...
    BEGIN
        FOR
            SELECT K1, K2 ... /* Key fields */
            FROM ...
            GROUP BY K1, K2 ... /* Key fields */
            HAVING COUNT(*) > 1
            INTO :K1, :K2 ...
        DO
            DUPS = 0;
            FOR
                SELECT ID
                FROM ...
                WHERE K1 = :K1 AND K2 = :K2 ... /* Key fields */
                INTO :ID
            DO
                /* Skip first record */
                IF DUPS > 0 THEN
                     DELETE FROM ...
                        WHERE ID = :ID;
                DUPS = DUPS + 1;
            END
        END
    END
--
Sergio Samayoa
Lgica Software
http://www.geocities.com/logicasw/

Re:find and delete duplicate records


Sorry,
there is an error in:

DELETE FROM foo a
  WHERE EXISTS (SELECT * FROM foo b WHERE  b.A1 = a.A1 AND b.A2 = a.A2)

because it equivalent to DELETE FROM foo

Alexander

Quote
Alexandre Kozlov wrote:
> Hi,
> if you know column names of this table ( fo ex. A1 and A2) you can use the
> following:

> DELETE FROM foo a
>   WHERE EXISTS (SELECT * FROM foo b WHERE  b.A1 = a.A1 AND b.A2 = a.A2)

> or (in some cases) you can use the following (less time consuming) procedure:
> 1) create table foo1 with the same structure;
> 2) INSERT INTO foo1 SELECT DISTINCT * FROM foo;
> 3) 'rebuild all references' from table foo to table foo1;
> 4) drop table foo;
> 5) CREATE VIEW foo AS SELECT * from foo1;
> 6) give all necessary grants to view foo (as it were table foo)

> Alexander

> qWin wrote:

> > Hi Group - Is there is a way to use a stored procedure to find and delete
> > duplicate records in a table? Does anybody have example code? Regards Quin

Re:find and delete duplicate records


Hi,
if you know column names of this table ( fo ex. A1 and A2) you can use the
following:

DELETE FROM foo a
  WHERE EXISTS (SELECT * FROM foo b WHERE  b.A1 = a.A1 AND b.A2 = a.A2)

or (in some cases) you can use the following (less time consuming) procedure:
1) create table foo1 with the same structure;
2) INSERT INTO foo1 SELECT DISTINCT * FROM foo;
3) 'rebuild all references' from table foo to table foo1;
4) drop table foo;
5) CREATE VIEW foo AS SELECT * from foo1;
6) give all necessary grants to view foo (as it were table foo)

Alexander

Quote
qWin wrote:
> Hi Group - Is there is a way to use a stored procedure to find and delete
> duplicate records in a table? Does anybody have example code? Regards Quin

Re:find and delete duplicate records


You really can't delete it with DELETE statement - you only can SELECT DISTINCT
and copy it to another table with the same structure - and if you really don't
want duplications
in this new table in future add UNIQUE constraints: UNIQUE ( field1, field2,...)
Quote
Alexandre Kozlov wrote:
> Hi,
> if you know column names of this table ( fo ex. A1 and A2) you can use the
> following:

> DELETE FROM foo a
>   WHERE EXISTS (SELECT * FROM foo b WHERE  b.A1 = a.A1 AND b.A2 = a.A2)

> or (in some cases) you can use the following (less time consuming) procedure:
> 1) create table foo1 with the same structure;
> 2) INSERT INTO foo1 SELECT DISTINCT * FROM foo;
> 3) 'rebuild all references' from table foo to table foo1;
> 4) drop table foo;
> 5) CREATE VIEW foo AS SELECT * from foo1;
> 6) give all necessary grants to view foo (as it were table foo)

> Alexander

> qWin wrote:

> > Hi Group - Is there is a way to use a stored procedure to find and delete
> > duplicate records in a table? Does anybody have example code? Regards Quin

Re:find and delete duplicate records


Thanks for the replies. The reason duplicates appear is user does a certain
copy data option twice by mistake or user makes typo inputting range
selection data. Results in same data records existing twice or more. Can I
use UNIQUE as you suggest to simply prevent that? Is there any overhead or
other performance sacrifice for doing that? How can I handle error message
or hide error message when duplicate record is rejected? Thanks q

Quote
"Alexandre Kozlov" <Alexandre_Koz...@uml.edu> wrote in message

news:3B093896.A3471005@uml.edu...
Quote
> You really can't delete it with DELETE statement - you only can SELECT
DISTINCT
> and copy it to another table with the same structure - and if you really
don't
> want duplications
> in this new table in future add UNIQUE constraints: UNIQUE ( field1,
field2,...)

> Alexandre Kozlov wrote:

> > Hi,
> > if you know column names of this table ( fo ex. A1 and A2) you can use
the
> > following:

> > DELETE FROM foo a
> >   WHERE EXISTS (SELECT * FROM foo b WHERE  b.A1 = a.A1 AND b.A2 = a.A2)

> > or (in some cases) you can use the following (less time consuming)
procedure:
> > 1) create table foo1 with the same structure;
> > 2) INSERT INTO foo1 SELECT DISTINCT * FROM foo;
> > 3) 'rebuild all references' from table foo to table foo1;
> > 4) drop table foo;
> > 5) CREATE VIEW foo AS SELECT * from foo1;
> > 6) give all necessary grants to view foo (as it were table foo)

> > Alexander

> > qWin wrote:

> > > Hi Group - Is there is a way to use a stored procedure to find and
delete
> > > duplicate records in a table? Does anybody have example code? Regards
Quin

Other Threads