Skip to content

Counting duplicates in MySQL involves identifying and quantifying repeated occurrences of data in a database. This guide explains how to count duplicate values in MySQL tables, a common task for database administrators and developers when analyzing data consistency and integrity.

Understanding the Data Structure

First, understand your data and the table structure. Analyze the columns you suspect might have duplicates. For instance, if you’re dealing with user data, you might want to check for duplicate email addresses.

DESCRIBE users;

Basic Duplicate Count

To find duplicates in a single column, use the GROUP BY and HAVING clauses.

SELECT email, COUNT(*) as count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

Advanced Duplicate Identification

For more complex scenarios, like finding duplicates across multiple columns, adjust your query accordingly.

SELECT first_name, last_name, COUNT(*)
FROM users
GROUP BY first_name, last_name
HAVING COUNT(*) > 1;

Counting All Duplicates

To get the total number of duplicate records in a table, you can use a subquery.

SELECT COUNT(*)
FROM (
    SELECT COUNT(*)
    FROM users
    GROUP BY email
    HAVING COUNT(*) > 1
) AS subquery;

Eliminating Duplicates

If your goal is to delete duplicates, keeping one instance of each, you can use a combination of DELETE and GROUP BY.

DELETE u1 FROM users u1
INNER JOIN users u2
WHERE
    u1.id < u2.id AND
    u1.email = u2.email;

Basedash is built as an AI-native BI platform, so teams can go from ad hoc SQL to trusted answers and dashboards quickly, without the overhead of traditional BI setup.

Conclusion

Counting duplicates in MySQL is a vital skill for database maintenance and data analysis. This guide provides the necessary steps and queries to identify and handle duplicate data effectively.

Written by

Robert Cooper avatar

Robert Cooper

Senior Engineer at Basedash

Robert Cooper is a senior engineer at Basedash who builds full-stack product systems across SQL data infrastructure, APIs, and frontend architecture. His work focuses on application performance, developer velocity, and reliable self-hosted workflows that make data operations easier for teams at scale.

View full author profile →

Looking for an AI-native BI tool?

Basedash lets you build charts, dashboards, and reports in seconds using all your data.