How to Create a New Dataframe Without Binary Data in Python
Filtering out binary data (columns with True or False values) can help clean up your dataset for more focused analysis. In pandas, this can be done with the DataFrame.select_dtypes() method.
The following example shows how to create a new dataframe without binary data in Python.
Removing Binary Data Using select_dtypes()
The select_dtypes() method in pandas lets you include or exclude columns based on their data type. Excluding the bool data type removes the columns that hold True or False values.
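As a quick illustration of the include and exclude parameters, here is a minimal sketch on a small made-up DataFrame. The variable and column names (sample, name, qty, in_stock) are hypothetical and are not part of the product example in this article.

```python
import pandas as pd

# Hypothetical three-column frame, used only for illustration
sample = pd.DataFrame({
    'name': ['pen', 'ink'],       # string column
    'qty': [3, 7],                # integer column
    'in_stock': [True, False],    # bool column
})

# include keeps only columns of the listed dtypes
only_bool = sample.select_dtypes(include=['bool'])

# exclude keeps every column except those of the listed dtypes
no_bool = sample.select_dtypes(exclude=['bool'])

print(list(only_bool.columns))  # ['in_stock']
print(list(no_bool.columns))    # ['name', 'qty']
```

Both calls return a new DataFrame, so the two results can be used side by side.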
Let’s consider a DataFrame containing product details, where one of the columns stores binary data (True or False).
# Import pandas library
import pandas as pd
# Declare dataframe
df = pd.DataFrame({
'Date': ['01-03-2023', '01-03-2023', '01-03-2023', '01-03-2023', '02-03-2023', '02-03-2023'],
'Product_Code': ['A-101', 'A-102', 'A-103', 'B-101', 'B-102', 'B-104'],
'Product_Name': ['Laptop', 'Mobile', 'Printer', 'Keyboard', 'Scanner', 'Mouse'],
'Price': [4500, 550, 250, 50, 350, 50],
'Status': [True, True, True, False, True, True]
})
# Select columns that are not boolean
new_df = df.select_dtypes(exclude=['bool'])
# Display the new dataframe
print(new_df)
Output: 👇️
         Date  Product_Code  Product_Name  Price
0  01-03-2023         A-101        Laptop   4500
1  01-03-2023         A-102        Mobile    550
2  01-03-2023         A-103       Printer    250
3  01-03-2023         B-101      Keyboard     50
4  02-03-2023         B-102       Scanner    350
5  02-03-2023         B-104         Mouse     50
In this example, we call the select_dtypes() method on the existing dataframe df with exclude=['bool']. It returns a new dataframe, new_df, that contains every column except the boolean Status column, and df itself is left unchanged.
The output shows the new dataframe without binary data.
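If you want to confirm the result programmatically rather than by reading the printed output, a check along the following lines may help. This is only a sketch: it rebuilds a shortened version of the product DataFrame so that it can run on its own.

```python
import pandas as pd

# Shortened version of the product DataFrame from the example above
df = pd.DataFrame({
    'Product_Name': ['Laptop', 'Mobile', 'Printer'],
    'Price': [4500, 550, 250],
    'Status': [True, True, False],
})

new_df = df.select_dtypes(exclude=['bool'])

# The original dataframe still has its boolean column
print('Status' in df.columns)      # True

# The new dataframe does not
print('Status' in new_df.columns)  # False

# No remaining column has the bool dtype
has_bool = any(str(dtype) == 'bool' for dtype in new_df.dtypes)
print(has_bool)                    # False
```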
Conclusion
By using the select_dtypes() method, you can exclude boolean columns from a DataFrame in a single call.
- The method selects columns by data type, so it works on datasets with mixed data types without listing column names by hand.
- Removing boolean columns leaves a smaller dataset that is easier to inspect when those flags are not relevant to the analysis.
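One caveat: select_dtypes() looks at each column's data type, not its values, so a column that stores binary data as the integers 0 and 1 is kept when you exclude bool. The sketch below shows one possible way to handle that case. The helper drop_binary_columns and the sample data are hypothetical. They are not part of pandas.

```python
import pandas as pd

def drop_binary_columns(frame):
    """Hypothetical helper: drop columns whose non-missing values are only 0/1 or True/False."""
    keep = []
    for col in frame.columns:
        values = set(frame[col].dropna().unique())
        # True == 1 and False == 0 in Python, so bool columns match as well
        is_binary = len(values) > 0 and values <= {0, 1}
        if not is_binary:
            keep.append(col)
    return frame[keep]

# Made-up data: 'discounted' stores binary data as integers
sample = pd.DataFrame({
    'name': ['pen', 'ink', 'cap'],
    'qty': [3, 7, 1],
    'in_stock': [True, False, True],
    'discounted': [0, 1, 1],
})

by_dtype = sample.select_dtypes(exclude=['bool'])
by_value = drop_binary_columns(sample)

print(list(by_dtype.columns))  # ['name', 'qty', 'discounted']
print(list(by_value.columns))  # ['name', 'qty']
```

Note that a value-based check like this would also drop a numeric column that merely happens to contain only 0s and 1s, so it should be applied with some care.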