# How to sort data with Pandas DataFrames

## Intro

This post is part of my Pandas DataFrames series. In these posts I explain (with examples) how someone can manipulate lots of data with Python and the Pandas Library. The examples have practical implications in a business / work environment.

Today I will show you, how we can easily sort our data with Pandas and the use of some easy to understand examples.

For our example, we suppose that we already have a DataFrame loaded in memory (you can either import data or create a DataFrame directly within your Python environment).

## Examples

Our hypothesis is that we have a DataFrame called **“df”** which has a collection of products with data such as: *prices*, *sizes *(let’s assume for example they are clothes),* product features* and those products are classified into *categories *and *subcategories *(for example the type of clothes or brands).

We can sort the data with either one, two or more criteria. In our first example we would want to sort the data with two criteria & that criteria have numerical values. We choose to sort them for example with price & size in an ascending way.

### Code

We use the method **“.sort_values”**. When there are more than one criteria, the *order *of the criteria also determines the order the sorting takes place. Below is the example of the code in which we firstly sort the data based on prices and *then *based on sizes:

`df = df.sort_values(by=['price','size'], ascending=[True, True])`

By “playing” with the parameters we can sort them in various ways. For example, the sort can be descending:

*#Descending order*
df = df.sort_values(by=['price','size'], ascending=[False, False])
*# or mixed (ascending in price & descending in size)
*df = df.sort_values(by=['price','size'], ascending=[True, False])

Now let’s assume that we also want to sort our data based on a specific order that we would like to impose. This is very common when our data are not numerical (or the numbers does not mean a thing).

In our example let’s suppose we want to sort the data firstly by *category*, then *size *and then *price*. However, the category does not contain numbers but text instead. For example, we have the following (product) categories: *trousers*, *coats*, *hats*, *shirts*, *socks*, *shoes* & *ties*.

We want to sort the products in the following order based on their *category*:

- hats
- coats
- shirts
- ties
- trousers
- socks
- shoes

Here is our code:

*# We create a categorical variable that contains our custom sort order*
category_order['category'] = pd.Categorical(df['category'], ['hats','coats','shirts','ties','trousers','socks','shoes'])
*# We then apply it in the ".sort_values" method*
# *together with our rest criteria* *as below:*
df= df.sort_values(by=['category_order','size','price'], ascending=[True, True, True])