How to Concatenate Dataframes and Remove Duplicate Rows in Python

To concatenate pandas dataframes and remove duplicate rows in Python, you can use the pd.concat() function along with the DataFrame.drop_duplicates() method.

The following example shows how to concatenate dataframes and remove duplicate rows in Python.

Using concat() and drop_duplicates() Functions

The pd.concat() function stacks the rows of the dataframes on top of each other, and drop_duplicates() then removes any row that repeats an earlier one.

Suppose we have the following dataframes:

# Import pandas library
import pandas as pd

# Create two dataframes
df1 = pd.DataFrame({'Product_Code': ['A-101', 'A-102', 'A-103'],
                    'Product_Name': ['Laptop', 'Mobile', 'Printer'],
                    'Price': [4500, 550, 250],
                    'Status': [1, 1, 1]})

df2 = pd.DataFrame({'Product_Code': ['A-101', 'B-102', 'B-104'],
                    'Product_Name': ['Laptop', 'Scanner', 'Mouse'],
                    'Price': [4500, 350, 50],
                    'Status': [1, 1, 1]})

# Combine dataframes
combined_df = pd.concat([df1, df2], sort=False)

# Remove duplicate rows
combined_df.drop_duplicates(inplace=True)

# Show dataframe
print(combined_df)

Output: 👇️

  Product_Code Product_Name  Price  Status
0        A-101       Laptop   4500       1
1        A-102       Mobile    550       1
2        A-103      Printer    250       1
1        B-102      Scanner    350       1
2        B-104        Mouse     50       1

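In this example, we use the pd.concat() function to stack df1 and df2 into a single dataframe, combined_df. We then call drop_duplicates(), which by default removes every row whose values match an earlier row in all columns and keeps the first occurrence. The A-101 row from df2 is identical to the one in df1, so it is dropped. Because each row keeps its original index label, the index of the result reads 0, 1, 2, 1, 2.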
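If the repeated index labels in the output are not wanted, one option is to reset the index after removing duplicates. The following is a minimal sketch using the same df1 and df2 as above. It chains the calls instead of using inplace=True, which is a style choice rather than a requirement.

```python
import pandas as pd

# Same dataframes as in the example above
df1 = pd.DataFrame({'Product_Code': ['A-101', 'A-102', 'A-103'],
                    'Product_Name': ['Laptop', 'Mobile', 'Printer'],
                    'Price': [4500, 550, 250],
                    'Status': [1, 1, 1]})

df2 = pd.DataFrame({'Product_Code': ['A-101', 'B-102', 'B-104'],
                    'Product_Name': ['Laptop', 'Scanner', 'Mouse'],
                    'Price': [4500, 350, 50],
                    'Status': [1, 1, 1]})

# Stack the rows, drop exact duplicates, then renumber the index from 0
combined_df = (
    pd.concat([df1, df2])
    .drop_duplicates()
    .reset_index(drop=True)
)

print(combined_df)
```

With reset_index(drop=True), the old index labels are discarded and the five remaining rows are numbered 0 through 4.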

Conclusion

We can use the pd.concat() function along with the drop_duplicates() method to concatenate dataframes and remove duplicate rows in Python.

This approach provides a convenient way to combine dataframes and ensure that no two rows in the resulting dataframe are identical across all columns.
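Note that "unique rows" here means unique across every column. If uniqueness should instead be judged by a key column, drop_duplicates() accepts the subset and keep parameters. The sketch below uses hypothetical data, not the dataframes from the example above, and assumes that Product_Code is meant to identify a product, which the original example does not state.

```python
import pandas as pd

# Hypothetical data: 'A-101' appears in both dataframes with a different price
df1 = pd.DataFrame({'Product_Code': ['A-101', 'A-102'],
                    'Price': [4500, 550]})

df2 = pd.DataFrame({'Product_Code': ['A-101', 'B-102'],
                    'Price': [4300, 350]})

combined_df = pd.concat([df1, df2], ignore_index=True)

# Default behavior: rows must match in every column to count as duplicates,
# so both 'A-101' rows survive
full_row = combined_df.drop_duplicates()

# Compare on Product_Code only (assumed key) and keep the first occurrence
by_code = combined_df.drop_duplicates(subset=['Product_Code'], keep='first')

print(len(full_row))
print(len(by_code))

# duplicated() flags repeats, so any() confirms whether any remain
print(by_code['Product_Code'].duplicated().any())
```

Here the default call leaves both A-101 rows in place because their prices differ, while the subset call keeps only the first one, which comes from df1.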