How to Concatenate Dataframes and Remove Duplicate Rows in Python
To concatenate dataframes and remove duplicate rows in Python, you can use the pandas pd.concat() function together with the DataFrame drop_duplicates() method.
The following example shows how to concatenate dataframes and remove duplicate rows in Python.
Using the concat() and drop_duplicates() Functions
pd.concat() stacks the dataframes row-wise into one dataframe, and drop_duplicates() then removes every row that is identical in all columns to an earlier row.
Suppose we have the following dataframes:
# Import pandas library
import pandas as pd
# Create two dataframes
df1 = pd.DataFrame({'Product_Code': ['A-101', 'A-102', 'A-103'],
                    'Product_Name': ['Laptop', 'Mobile', 'Printer'],
                    'Price': [4500, 550, 250],
                    'Status': [1, 1, 1]})
df2 = pd.DataFrame({'Product_Code': ['A-101', 'B-102', 'B-104'],
                    'Product_Name': ['Laptop', 'Scanner', 'Mouse'],
                    'Price': [4500, 350, 50],
                    'Status': [1, 1, 1]})
# Combine dataframes
combined_df = pd.concat([df1, df2], sort=False)
# Remove duplicate rows
combined_df.drop_duplicates(inplace=True)
# Show dataframe
print(combined_df)
Output: 👇️
  Product_Code Product_Name  Price  Status
0        A-101       Laptop   4500       1
1        A-102       Mobile    550       1
2        A-103      Printer    250       1
1        B-102      Scanner    350       1
2        B-104        Mouse     50       1
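In this example, we use the pd.concat() function to combine df1 and df2 into a single dataframe, combined_df. We then call drop_duplicates() to remove duplicate rows from the combined dataframe. The row for A-101 appears in both df1 and df2 with identical values, so only its first occurrence is kept. Each remaining row keeps the index label it had in its original dataframe, which is why the labels 1 and 2 appear twice in the output.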
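If a clean 0-based index is preferred, or if rows should count as duplicates when only some columns match, the same two calls can be adjusted slightly. The following is a minimal sketch, not part of the original example: it rebuilds df1 and df2 so that it runs on its own, and the names clean_df and by_code_df are illustrative.

```python
import pandas as pd

# Same sample data as in the example above
df1 = pd.DataFrame({'Product_Code': ['A-101', 'A-102', 'A-103'],
                    'Product_Name': ['Laptop', 'Mobile', 'Printer'],
                    'Price': [4500, 550, 250],
                    'Status': [1, 1, 1]})
df2 = pd.DataFrame({'Product_Code': ['A-101', 'B-102', 'B-104'],
                    'Product_Name': ['Laptop', 'Scanner', 'Mouse'],
                    'Price': [4500, 350, 50],
                    'Status': [1, 1, 1]})

# Concatenate, drop rows that are identical in every column,
# then renumber the index from 0 (hypothetical name: clean_df)
clean_df = pd.concat([df1, df2]).drop_duplicates().reset_index(drop=True)

# Variant: treat rows as duplicates when Product_Code alone matches,
# keeping the first occurrence (hypothetical name: by_code_df)
by_code_df = (pd.concat([df1, df2])
              .drop_duplicates(subset=['Product_Code'], keep='first')
              .reset_index(drop=True))

print(clean_df)
print(by_code_df)
```

Chaining the calls returns a new dataframe instead of modifying one in place, so combined_df from the earlier example would be left untouched. With this sample data both results contain the same five rows, because the only repeated Product_Code is also a fully identical row.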
Conclusion
We can use the pd.concat() function along with the drop_duplicates() function to concatenate dataframes and remove duplicate rows in Python.
This method provides a convenient way to stack dataframes that share the same columns and ensure that each row appears only once in the resulting dataframe.