MySQL DISTINCT
The DISTINCT keyword in MySQL removes duplicate rows from the result set of a query, so the returned data contains only unique values, or unique combinations of values, for the selected columns.
Syntax
SELECT DISTINCT column1, column2, ...
FROM table_name;
column1, column2, ...: The columns for which to return distinct values. If multiple columns are specified, DISTINCT applies to the combination: each combination of values in these columns appears only once in the result set.
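To make the difference between one and several columns concrete, here is a minimal sketch on hypothetical sample data. The table name demo_employees and its rows are assumptions made for illustration, not part of any real schema. Row order in the results is not guaranteed without ORDER BY.

```sql
-- Hypothetical sample data; table name, columns, and rows are assumptions.
CREATE TEMPORARY TABLE demo_employees (
    id INT PRIMARY KEY,
    department VARCHAR(50),
    job_title VARCHAR(50)
);

INSERT INTO demo_employees (id, department, job_title) VALUES
    (1, 'Sales', 'Manager'),
    (2, 'Sales', 'Representative'),
    (3, 'Sales', 'Representative'),
    (4, 'IT', 'Developer'),
    (5, 'IT', 'Developer');

-- One column: each department appears once (2 rows: Sales, IT).
SELECT DISTINCT department FROM demo_employees;

-- Two columns: each (department, job_title) pair appears once
-- (3 rows: Sales/Manager, Sales/Representative, IT/Developer).
SELECT DISTINCT department, job_title FROM demo_employees;
```

Note that DISTINCT is not applied per column: in the second query, Sales appears twice because it is paired with two different job titles.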
Key Concepts
Selecting Unique Rows
To retrieve unique rows based on one or more columns:
SELECT DISTINCT department FROM employees;
This query returns a list of unique departments from the employees table, eliminating any duplicate department names.
Multiple Columns
To get unique combinations of values across multiple columns:
SELECT DISTINCT department, job_title FROM employees;
This query returns unique pairs of department and job_title, meaning each combination of department and job title appears only once.
Using DISTINCT with Aggregate Functions
You can use DISTINCT inside aggregate functions to count unique values:
SELECT COUNT(DISTINCT department) AS unique_departments FROM employees;
This query counts the number of unique departments in the employees table. Rows where department is NULL are not counted.
Handling NULL Values
For the purposes of DISTINCT, all NULL values are treated as equal to one another. If multiple rows have NULL in the same column, DISTINCT collapses them into a single NULL row:
SELECT DISTINCT department FROM employees;
If there are multiple rows where the department column is NULL, the result will include a single NULL value.
Performance Considerations
- Indexing: Using DISTINCT can impact query performance, especially with large datasets. Indexing the columns used with DISTINCT can help improve performance.
- Query Complexity: For complex queries with multiple joins, the use of DISTINCT can be computationally expensive. Optimize queries to minimize the impact on performance.
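The indexing advice above can be checked with EXPLAIN. The following is a sketch that assumes an employees table with a department column, as in the earlier queries. The index name idx_employees_department is hypothetical. Whether the optimizer actually uses the index depends on the MySQL version, the storage engine, and the data.

```sql
-- Assumes an existing employees table with a department column.
-- The index name is a hypothetical choice for illustration.
CREATE INDEX idx_employees_department ON employees (department);

-- EXPLAIN shows the plan MySQL chooses for the query.
-- Compare the key and Extra columns before and after adding the index.
EXPLAIN SELECT DISTINCT department FROM employees;
```

If the plan still shows a full table scan or a temporary table, the index is not helping this query and may not be worth its write overhead.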
Examples
Single Column Distinct
SELECT DISTINCT city FROM customers;
This query retrieves a list of unique cities from the customers table.
Multiple Columns Distinct
SELECT DISTINCT first_name, last_name FROM employees;
This query retrieves unique combinations of first_name and last_name from the employees table.
Distinct with Conditions
SELECT DISTINCT city FROM customers WHERE country = 'USA';
This query retrieves unique cities from the customers table where the country is 'USA'.
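The NULL handling and COUNT(DISTINCT ...) behavior described under Key Concepts can also be tried on a small data set. This is a minimal sketch: the table demo_staff and its rows are hypothetical and exist only for illustration.

```sql
-- Hypothetical sample data; table name and rows are assumptions.
CREATE TEMPORARY TABLE demo_staff (
    id INT PRIMARY KEY,
    department VARCHAR(50)  -- nullable on purpose
);

INSERT INTO demo_staff (id, department) VALUES
    (1, 'Sales'),
    (2, 'Sales'),
    (3, NULL),
    (4, NULL),
    (5, 'IT');

-- The two NULL rows collapse into one: 3 rows (Sales, IT, NULL).
SELECT DISTINCT department FROM demo_staff;

-- COUNT(DISTINCT ...) skips NULL, so this returns 2.
SELECT COUNT(DISTINCT department) AS unique_departments FROM demo_staff;
```

The two results differ by one because SELECT DISTINCT keeps a NULL row while COUNT(DISTINCT ...) ignores NULL values.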
Best Practices
- Use with Care: Use DISTINCT only when it is genuinely needed to eliminate duplicates, so that you avoid unnecessary performance overhead.
- Index Columns: Index columns used with DISTINCT to improve performance, particularly for large tables.
- Combine with Other Clauses: Combine DISTINCT with other clauses like WHERE, JOIN, and ORDER BY to refine and sort the results.
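As a sketch of combining DISTINCT with WHERE, JOIN, and ORDER BY, the query below lists each USA city that has at least one order. The orders table, its customer_id column, and the customers.id column are assumptions introduced for illustration and do not appear in the examples above.

```sql
-- Hypothetical schema: customers(id, city, country) and orders(customer_id, ...).
-- The join produces one row per order, so a city with many orders repeats;
-- DISTINCT reduces the output to one row per city.
SELECT DISTINCT c.city
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
WHERE c.country = 'USA'
ORDER BY c.city;
```

The ORDER BY column is also in the select list here. Sorting a DISTINCT query by a column that is not selected can be rejected or give ambiguous results, depending on the server's SQL mode.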