MySQL is a database application that stores data in rows and columns of different tables to avoid duplication. Duplicate values can occur, which can impact MySQL performance.
This guide will show you how to find duplicate values in a MySQL database.
- An existing installation of MySQL
- Root user account credentials for MySQL
- A command line / terminal window
Setting up a Sample Table (Optional)
This step will help you create a sample table to work with. If you already have a database to work on, skip to the next section.
Open a terminal window, and switch to the MySQL shell:
mysql –u root –p
Note: If you get ‘ERROR 1698’, try opening MySQL using
sudo mysql instead. This error occurs on some Ubuntu installations and is caused by authentication settings.
List existing databases:
Create a new database that doesn’t already exist:
CREATE database sampledb;
Select the table you just created:
Create a new table with the following fields:
CREATE TABLE dbtable ( id INT PRIMARY KEY AUTO_INCREMENT, date_x VARCHAR(10) NOT NULL, system_x VARCHAR(50) NOT NULL, test VARCHAR(50) NOT NULL );
Insert rows into the table:
INSERT INTO dbtable (date_x,system_x,test) VALUES ('01/03/2020','system1','hard_drive'), ('01/04/2020','system2','memory'), ('01/10/2020','system2','processor'), ('01/14/2020','system3','hard drive'), ('01/10/2020','system2','processor'), ('01/20/2020','system4','hard drive'), ('01/24/2020','system5','memory'), ('01/29/2020','system6','hard drive'), ('02/02/2020','system7','motherboard'), ('02/04/2020','system8','graphics card'), ('02/02/2020','system7','motherboard'), ('02/08/2020','system9','hard drive');
Run the following SQL query:
SELECT * FROM dbtable ORDER BY date_x;
Finding Duplicates in MySQL
Find Duplicate Values in a Single Column
GROUP BY function to identify all identical entries in one column. Follow up with a
COUNT() HAVING function to list all groups with more than one entry.
SELECT test, COUNT(test) FROM dbtable GROUP BY test HAVING COUNT(test) > 1;
Find Duplicate Values in Multiple Columns
You may want to list exact duplicates, with the same information in all three columns.
SELECT date_x, COUNT(date_x), system_x, COUNT(system_x), test, COUNT(test) FROM dbtable GROUP BY date_x, system_x, test HAVING COUNT(date_x)>1 AND COUNT(system_x)>1 AND COUNT(test)>1;
This query works by selecting and testing for the >1 condition on all three columns. The result is that only rows with duplicate values are returned in the output.
Check for Duplicates in Multiple Tables With INNER JOIN
Use the INNER JOIN function to find duplicates that exist in multiple tables.
Sample syntax for an
INNER JOIN function looks like this:
SELECT column_name FROM table1 INNER JOIN table2 ON table1.column_name = table2.column name;
To test this example, you need a second table that contains some information duplicated from the sampledb table we created above.
SELECT dbtable.date_x FROM dbtable INNER JOIN new_table ON dbtable.date_x = new_table.date_x;
This will display any duplicate dates that exist between the existing data and the new_table.
DISTINCT command can be used to return results while ignoring duplicates. Also, newer versions of MySQL use a strict mode, which can affect operations that attempt to select all columns. If you get an error, make sure that you’re selecting specific individual columns.
Now you can check for duplicates in MySQL data in one or multiple tables and understand the
INNER JOIN function. Make sure you created the tables correctly and that you select the right columns.
Now that you have found duplicate values, learn how to remove MySQL duplicate rows.