Non-destructive duplicate removal from a MySQL table

in Development


  1. Create a new table

    CREATE TABLE records_new LIKE records;
  2. Add composite UNIQUE constraint preventing duplicates in the future

    ALTER TABLE records_new ADD UNIQUE idx_composite_UNIQUE (site_id ASC, session_id ASC, query ASC, remote_addr ASC);
  3. Copy records over.

    INSERT INTO records_new SELECT * FROM records ORDER BY id ON DUPLICATE KEY UPDATE records_new.created=NOW();

    The above updates created to NOW() on every duplicate hit, but if you don't want that, change it to ON DUPLICATE KEY UPDATE records_new.id=records_new.id (where id is the table's PRIMARY KEY), which does nothing

  4. Swap the table names

    RENAME TABLE records TO records_old, records_new TO records;

This has been to a large extent based on César Revert-Gomar's answer, so credits to the author.

#mysql