How to remove columns in Pandas DataFrames

Following our previous introductory post about the Pandas library in Python, we will now see how to manipulate data. The first thing we usually do after creating a new DataFrame from an Excel file is to cleanse the data by dropping unnecessary columns.

Create a new DataFrame

To keep things simple, instead of importing the data from an Excel file, the example below creates a small sample DataFrame from scratch. This way, you can easily follow along by copying the code into your Python environment.

# First, we need to import the Pandas library
import pandas as pd

# Create some sample data (hypothetical products with cost and stock)
data = {
    'Product Name': ['Product A', 'Product B', 'Product C'],
    'Cost / Unit': [10, 12, 17],
    'Warranty Period': ['2 Years', '2 Years', '2 Years'],
    'Current Stock': [100, 128, 85],
    'Supplier': ['Factory A', 'Factory A', 'Factory A']
}

# Create the DataFrame
df = pd.DataFrame(data)

# Let's print the DataFrame so we can see the outcome
print(df)

Python will print the following result:

[Figure: Python IDLE output showing the sample DataFrame]
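Before removing anything, it can help to confirm which columns the DataFrame actually contains. The snippet below is a minimal sketch that rebuilds the same sample data and inspects it with `df.columns` and `df.shape`; the variable names `column_names`, `n_rows` and `n_cols` are only illustrative.

```python
import pandas as pd

# Same sample data as in the example above
data = {
    'Product Name': ['Product A', 'Product B', 'Product C'],
    'Cost / Unit': [10, 12, 17],
    'Warranty Period': ['2 Years', '2 Years', '2 Years'],
    'Current Stock': [100, 128, 85],
    'Supplier': ['Factory A', 'Factory A', 'Factory A']
}
df = pd.DataFrame(data)

# List the column labels in their current order
column_names = df.columns.tolist()
print(column_names)

# shape is a (rows, columns) tuple
n_rows, n_cols = df.shape
print(n_rows, n_cols)  # 3 5
```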

Remove columns in Pandas DataFrames

Now it’s time to see how to remove some columns. In the sample data above, the Warranty Period is 2 Years for every product. Let’s assume that a 2-year warranty is standard for all products, so the column adds no information and we decide to remove it.

# We have already created the DataFrame 'df' (above)
# drop() returns a new DataFrame, so we assign it back to 'df'
# to replace it with a version without the 'Warranty Period' column
df = df.drop(columns='Warranty Period')

# Let's see the end result
print(df)

And here’s the result:

[Figure: Python IDLE output showing the DataFrame without the Warranty Period column]
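With only three rows it is easy to see that Warranty Period never changes, but on larger data you may prefer to verify this in code before dropping the column. The sketch below is one possible approach: it uses `Series.nunique()` to count distinct values and also shows that `drop` leaves the original DataFrame untouched unless you reassign it. The names `is_constant` and `trimmed` are illustrative only.

```python
import pandas as pd

data = {
    'Product Name': ['Product A', 'Product B', 'Product C'],
    'Cost / Unit': [10, 12, 17],
    'Warranty Period': ['2 Years', '2 Years', '2 Years'],
    'Current Stock': [100, 128, 85],
    'Supplier': ['Factory A', 'Factory A', 'Factory A']
}
df = pd.DataFrame(data)

# nunique() counts distinct values; 1 means every row holds the same value
is_constant = df['Warranty Period'].nunique() == 1
print(is_constant)  # True

# Drop the column only if it carries no information
trimmed = df.drop(columns='Warranty Period') if is_constant else df

# drop() returned a new DataFrame; 'df' itself still has the column
print('Warranty Period' in trimmed.columns)  # False
print('Warranty Period' in df.columns)       # True
```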

But what if we want to remove more than one column? In this case, we pass a list of column names instead of a single string to the .drop method. Suppose that in our example we would like to remove both the “Warranty Period” and “Supplier” columns.

Here is the code:

# Recreate the df DataFrame first, because the previous example
# has already removed the "Warranty Period" column.
df = pd.DataFrame(data)

df = df.drop(columns=['Warranty Period', 'Supplier'])

# Let's see the end result
print(df)

Python will print the following:

[Figure: Python IDLE output showing the DataFrame without the Warranty Period and Supplier columns]
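Two variations on the same idea may be useful in practice. By default, `drop` raises a `KeyError` if a listed column does not exist; passing `errors='ignore'` skips missing names instead. Alternatively, when it is easier to name the columns you want to keep, you can select them directly. The sketch below assumes the same sample data; the `'Discount'` column is hypothetical and deliberately absent, and `df_safe` and `df_keep` are illustrative names.

```python
import pandas as pd

data = {
    'Product Name': ['Product A', 'Product B', 'Product C'],
    'Cost / Unit': [10, 12, 17],
    'Warranty Period': ['2 Years', '2 Years', '2 Years'],
    'Current Stock': [100, 128, 85],
    'Supplier': ['Factory A', 'Factory A', 'Factory A']
}
df = pd.DataFrame(data)

# 'Discount' is a hypothetical column that is not in the data;
# errors='ignore' drops the names that exist and skips the rest
df_safe = df.drop(columns=['Warranty Period', 'Discount'], errors='ignore')
print(df_safe.columns.tolist())

# Alternative: select the columns to keep rather than the ones to drop
df_keep = df[['Product Name', 'Cost / Unit', 'Current Stock']]
print(df_keep.columns.tolist())
```

Selecting the columns to keep tends to read better when most columns are being discarded, while `drop` is usually clearer when only a few are removed.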
