How to Create a New Dataframe Without Binary Data in Python

Binary data here means columns that hold True or False values (the bool data type). Dropping these columns can make a dataset easier to work with when they are not relevant to the analysis. In Python, the pandas select_dtypes() method does this in a single call.

The following example shows how to create a new dataframe without binary data in Python.

Removing Binary Data Using select_dtypes()

The select_dtypes() method in pandas returns a subset of a DataFrame's columns based on their data types, controlled by its include and exclude parameters. Passing exclude=['bool'] drops every column with the bool data type, which removes the binary data.
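As a quick illustration of how the two parameters mirror each other, here is a minimal sketch on a small made-up DataFrame. The name sample and its columns are hypothetical and are not part of the product example that follows.

```python
import pandas as pd

# Hypothetical three-column frame: one text, one integer, one boolean column
sample = pd.DataFrame({
    'name': ['a', 'b'],
    'qty': [1, 2],
    'active': [True, False],
})

# exclude drops every column whose dtype matches
without_bool = sample.select_dtypes(exclude=['bool'])

# include keeps only the columns whose dtype matches
only_bool = sample.select_dtypes(include=['bool'])

print(list(without_bool.columns))  # ['name', 'qty']
print(list(only_bool.columns))     # ['active']
```

Both calls return new DataFrames, so sample itself still has all three columns.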

Let’s consider a DataFrame containing product details, where one of the columns stores binary data (True or False).

# Import pandas library
import pandas as pd

# Declare dataframe
df = pd.DataFrame({
    'Date': ['01-03-2023', '01-03-2023', '01-03-2023', '01-03-2023', '02-03-2023', '02-03-2023'],
    'Product_Code': ['A-101', 'A-102', 'A-103', 'B-101', 'B-102', 'B-104'],
    'Product_Name': ['Laptop', 'Mobile', 'Printer', 'Keyboard', 'Scanner', 'Mouse'],
    'Price': [4500, 550, 250, 50, 350, 50],
    'Status': [True, True, True, False, True, True]
})

# Select columns that are not boolean
new_df = df.select_dtypes(exclude=['bool'])

# Display the new dataframe
print(new_df)

Output: 👇️

         Date Product_Code Product_Name  Price
0  01-03-2023        A-101       Laptop   4500
1  01-03-2023        A-102       Mobile    550
2  01-03-2023        A-103      Printer    250
3  01-03-2023        B-101     Keyboard     50
4  02-03-2023        B-102      Scanner    350
5  02-03-2023        B-104        Mouse     50

In this example, select_dtypes() creates a new DataFrame, new_df, that leaves out the boolean Status column of the existing DataFrame df. The original df is not modified.

The output shows new_df with the four remaining columns: Date, Product_Code, Product_Name, and Price.
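Note that select_dtypes() decides by column data type, not by the values stored in the column. The sketch below, which uses a hypothetical check_df with a made-up In_Stock column, shows one way to list which columns were removed. It also shows that a flag stored as the integers 0 and 1 is kept, because its data type is integer rather than bool.

```python
import pandas as pd

# Hypothetical frame: 'Status' is a true boolean column,
# 'In_Stock' holds the same kind of flag as integers 0/1
check_df = pd.DataFrame({
    'Price': [4500, 550, 250],
    'Status': [True, False, True],
    'In_Stock': [1, 0, 1],
})

result = check_df.select_dtypes(exclude=['bool'])

# Columns present in the original but missing from the result
removed = [col for col in check_df.columns if col not in result.columns]

print(removed)               # ['Status']
print(list(result.columns))  # ['Price', 'In_Stock']
print(check_df.shape)        # (3, 3) - the original is unchanged
```

If integer flags like this should be removed as well, one option is to convert them with astype(bool) before calling select_dtypes(), or to drop them by name with drop(columns=[...]).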

Conclusion

By using the select_dtypes() function, you can effectively clean your DataFrame and exclude binary data with minimal effort.

  • The method works on DataFrames with mixed data types and returns a new DataFrame, so the original stays unchanged.
  • Removing boolean columns you do not need simplifies the dataset and keeps the analysis focused on the remaining columns.