Columns can be removed from a pandas DataFrame by name with the drop() method. The two arguments used most often are the labels to remove and the axis (0 for rows, 1 for columns). Passing the labels through the columns keyword instead removes the need to specify an axis.
Here is an example of how to use drop() to remove columns with specific names from a DataFrame:
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['John', 'Mike', 'Sara'],
'Age': [25, 30, 35],
'Address': ['New York', 'Chicago', 'Los Angeles'],
'Phone': [123, 456, 789]}
df = pd.DataFrame(data)
# Print the original DataFrame
print(df)
# Remove the 'Address' and 'Phone' columns
df = df.drop(columns=['Address', 'Phone'])
# Print the modified DataFrame
print(df)
The output of the first print(df) statement will be:
   Name  Age      Address  Phone
0  John   25     New York    123
1  Mike   30      Chicago    456
2  Sara   35  Los Angeles    789
And the output of the second print(df) statement will be:
   Name  Age
0  John   25
1  Mike   30
2  Sara   35
As you can see, the 'Address' and 'Phone' columns have been removed from the DataFrame.
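The two calling styles of drop() can be compared side by side. The sketch below reuses the sample data from above; the 'Fax' column name is hypothetical and is included only to show how errors='ignore' behaves when a label is missing.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Mike', 'Sara'],
    'Age': [25, 30, 35],
    'Address': ['New York', 'Chicago', 'Los Angeles'],
    'Phone': [123, 456, 789],
})

# Labels plus axis=1 and the columns keyword produce the same result
by_axis = df.drop(['Address', 'Phone'], axis=1)
by_keyword = df.drop(columns=['Address', 'Phone'])
print(by_axis.equals(by_keyword))  # True

# 'Fax' is a hypothetical label that does not exist in df.
# errors='ignore' skips missing labels instead of raising a KeyError.
safe = df.drop(columns=['Phone', 'Fax'], errors='ignore')
print(list(safe.columns))  # ['Name', 'Age', 'Address']
```

In both cases drop() returns a new DataFrame and leaves df unchanged, which is why the examples in this article assign the result back to a variable.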
Another way to remove a column is to use the del keyword on the DataFrame:
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['John', 'Mike', 'Sara'],
'Age': [25, 30, 35],
'Address': ['New York', 'Chicago', 'Los Angeles'],
'Phone': [123, 456, 789]}
df = pd.DataFrame(data)
# Print the original DataFrame
print(df)
# Remove the 'Address' column
del df['Address']
# Print the modified DataFrame
print(df)
This also removes the 'Address' column from the DataFrame. Unlike drop(), del modifies df in place rather than returning a new DataFrame.
Note that del removes only one column at a time. Passing a list of column names to del raises an error, so to remove several columns at once, use drop() instead:
df = df.drop(columns=['Address', 'Phone'])
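As a minimal sketch of removing several columns, the snippet below applies del to one label at a time in a loop and compares the result with a single drop() call. The variable names trimmed and dropped are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Mike', 'Sara'],
    'Age': [25, 30, 35],
    'Address': ['New York', 'Chicago', 'Los Angeles'],
    'Phone': [123, 456, 789],
})

# del accepts a single column label, so loop over the labels to remove
trimmed = df.copy()
for col in ['Address', 'Phone']:
    del trimmed[col]

# The same result with one drop() call
dropped = df.drop(columns=['Address', 'Phone'])

print(trimmed.equals(dropped))  # True
```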
In addition to removing columns by exact name, you can also remove columns based on a condition. For example, you can remove all columns that contain a specific string in their name.
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['John', 'Mike', 'Sara'],
'Age': [25, 30, 35],
'Address_1': ['New York', 'Chicago', 'Los Angeles'],
'Phone_1': [123, 456, 789],
'Address_2': ['New York', 'Chicago', 'Los Angeles'],
'Phone_2': [123, 456, 789]}
df = pd.DataFrame(data)
# Print the original DataFrame
print(df)
# Remove all columns that contain the string '_1'
df = df.drop(columns=[col for col in df.columns if '_1' in col])
# Print the modified DataFrame
print(df)
The output of the first `print(df)` statement will be:
   Name  Age    Address_1  Phone_1    Address_2  Phone_2
0  John   25     New York      123     New York      123
1  Mike   30      Chicago      456      Chicago      456
2  Sara   35  Los Angeles      789  Los Angeles      789
And the output of the second `print(df)` statement will be:
   Name  Age    Address_2  Phone_2
0  John   25     New York      123
1  Mike   30      Chicago      456
2  Sara   35  Los Angeles      789
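The same name-based filter can also be written with a boolean mask instead of a list comprehension. This is a sketch of an alternative approach, not a replacement for the drop() call above. It uses the string methods available on df.columns, with regex=False so that '_1' is matched as literal text.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Mike', 'Sara'],
    'Age': [25, 30, 35],
    'Address_1': ['New York', 'Chicago', 'Los Angeles'],
    'Phone_1': [123, 456, 789],
    'Address_2': ['New York', 'Chicago', 'Los Angeles'],
    'Phone_2': [123, 456, 789],
})

# True for every column whose name contains the literal text '_1'
mask = df.columns.str.contains('_1', regex=False)

# Keep only the columns where the mask is False
kept = df.loc[:, ~mask]
print(list(kept.columns))  # ['Name', 'Age', 'Address_2', 'Phone_2']
```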
You can also remove columns based on their data type. For example, you can remove all columns that have a numeric data type.
# Create a sample DataFrame
data = {'Name': ['John', 'Mike', 'Sara'],
'Age': [25, 30, 35],
'Address': ['New York', 'Chicago', 'Los Angeles'],
'Phone': [123, 456, 789],
'Salary': [2000, 3000, 4000]
}
df = pd.DataFrame(data)
# Print the original DataFrame
print(df)
# Remove all numeric columns
df = df.drop(columns=df.select_dtypes(['int64', 'float64']).columns)
# Print the modified DataFrame
print(df)
The output of the first `print(df)` statement will be:
   Name  Age      Address  Phone  Salary
0  John   25     New York    123    2000
1  Mike   30      Chicago    456    3000
2  Sara   35  Los Angeles    789    4000
And the output of the second `print(df)` statement will be:
   Name      Address
0  John     New York
1  Mike      Chicago
2  Sara  Los Angeles
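Listing 'int64' and 'float64' explicitly misses other numeric widths such as int32 or float32. A slightly more general sketch, assuming the goal is to discard every numeric column regardless of width, uses the 'number' selector of select_dtypes() with the exclude parameter:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Mike', 'Sara'],
    'Age': [25, 30, 35],
    'Address': ['New York', 'Chicago', 'Los Angeles'],
    'Phone': [123, 456, 789],
    'Salary': [2000, 3000, 4000],
})

# exclude='number' keeps every column that is not numeric
non_numeric = df.select_dtypes(exclude='number')
print(list(non_numeric.columns))  # ['Name', 'Address']
```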
It's also possible to remove a column by its position. drop() works with labels, so passing the integer 0 directly would target the label 0 rather than the first column. Instead, look up the label at that position with df.columns[0] and pass it to drop().
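Removal by position can be sketched as follows, assuming the four-column sample data used earlier in this article. The variable names first_removed and several_removed are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Mike', 'Sara'],
    'Age': [25, 30, 35],
    'Address': ['New York', 'Chicago', 'Los Angeles'],
    'Phone': [123, 456, 789],
})

# df.columns[0] is the label of the first column ('Name')
first_removed = df.drop(columns=df.columns[0])
print(list(first_removed.columns))  # ['Age', 'Address', 'Phone']

# Indexing df.columns with a list of positions returns several labels at once
several_removed = df.drop(columns=df.columns[[0, 2]])
print(list(several_removed.columns))  # ['Age', 'Phone']
```

Because the position is translated into a label before dropping, a DataFrame with duplicate column names would lose every column that shares that label.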
Popular questions
- How can I remove a specific column from a DataFrame in pandas?
- You can remove a specific column from a DataFrame in pandas by using the drop() function and passing the column name as an argument. For example:
df = df.drop(columns='column_name')
- How can I remove multiple columns from a DataFrame in pandas?
- You can remove multiple columns from a DataFrame in pandas by using the drop() function and passing a list of column names as an argument. For example:
df = df.drop(columns=['column_1', 'column_2', 'column_3'])
- How can I remove all columns that contain a specific string in their name from a DataFrame in pandas?
- You can remove all columns that contain a specific string in their name by using a list comprehension to build a list of the matching column names and passing that list to the drop() function. For example:
df = df.drop(columns=[col for col in df.columns if 'specific_string' in col])
- How can I remove all columns with a specific data type from a DataFrame in pandas?
- You can remove all columns with a specific data type by using the select_dtypes() function to select the columns of that type, then passing the resulting column labels to the drop() function. For example:
df = df.drop(columns=df.select_dtypes(['int', 'float']).columns)
- Can a column be removed by its index in a DataFrame?
- Yes. Look up the column label at that position with df.columns[index] and pass it to the drop() function. For example:
df = df.drop(df.columns[index], axis=1)
Here index is the zero-based position of the column, and axis=1 tells drop() to operate on columns rather than rows.