How to sort pandas dataframe | Sorting pandas dataframes

In this post, we will mainly focus on all features related to sort pandas dataframe. Pandas is a highly used library in python for data analysis. Mainly because of its enriched set of functionalities.

At the end of this post you will learn,

  • Sorting pandas dataframe based on indexes
  • Ascending and Descending Sorting on a single column
  • Sort pandas dataframe with multiple columns
  • Sorting dataframe by placing missing values first
  • Pandas dataframe sorting with multi level indexes and columns

So, let’s explore more about pandas sort functionality.

Import libraries

Create pandas dataframe from lists

accommodateshost_namelisting_countreview_count
2014-04-012Maarten125
2012-02-272Fronk & Lieve71
2013-05-113Elke14
2012-04-172Francis259
2012-10-132Kristof13
2011-01-294Liesbet189
2014-03-152Chloé13
2014-05-225Klara12
2012-07-244Maarten17
2014-02-172Kristina114

You can apply two type of sorting on any dataframe in pandas,

  • Sorting on Index (using pandas sort_index function)
  • Sorting on Columns (using pandas sort_values function)

Earlier pandas library was supporting sorted function But it function is deprecated in recent releases of pandas.

Sorting pandas dataframe based on indexes

We can use pandas sort_index function to sort above dataframe based index.

accommodateshost_namelisting_countreview_count
2014-05-225Klara12
2014-04-012Maarten125
2014-03-152Chloé13
2014-02-172Kristina114
2013-05-113Elke14
2012-10-132Kristof13
2012-07-244Maarten17
2012-04-172Francis259
2012-02-272Fronk & Lieve71
2011-01-294Liesbet189

Ascending and Descending Sorting on a single column

For sorting dataframe based on the values of a single column, we can specifying the column name as an argument in pandas sort_values() function.

accommodateshost_namelisting_countreview_count
2014-03-152Chloé13
2013-05-113Elke14
2012-04-172Francis259
2012-02-272Fronk & Lieve71
2014-05-225Klara12

For reverse sorting (sorting in descending order) we can set ascending=False in sort_values() method.

accommodateshost_namelisting_countreview_count
2014-03-152Chloé13
2013-05-113Elke14
2012-04-172Francis259
2012-02-272Fronk & Lieve71
2014-05-225Klara12

Sort pandas dataframe with multiple columns

With pandas sort functionality you can also sort multiple columns along with different sorting orders. For that, we have to pass list of columns to be sorted with argument by=[]. Additionally, in the same order we can also pass a list of boolean to argument ascending=[] specifying sorting order.

  • While using by and ascending together, length of both list must be same
  • Specify True for ascending and False for descending order sorting
  • Sorting sequence will be from columns left to right from the list

Let’s sort our dataframe by lowest accommodates and higher review count.

accommodateshost_namelisting_countreview_count
2014-05-225Klara12
2012-07-244Maarten17
2011-01-294Liesbet189
2013-05-113Elke14
2012-02-272Fronk & Lieve71
2012-10-132Kristof13
2014-03-152Chloé13
2014-02-172Kristina114
2014-04-012Maarten125
2012-04-172Francis259

Sorting dataframe by placing missing values first

Let’s create the same dataframe with missing values in review count.

accommodateshost_namelisting_countreview_count
2014-04-012Maarten125.0
2012-02-272Fronk & Lieve7NaN
2013-05-113Elke14.0
2012-04-172Francis259.0
2012-10-132Kristof13.0
2011-01-294Liesbet189.0
2014-03-152Chloé13.0
2014-05-225Klara12.0
2012-07-244Maarten17.0
2014-02-172Kristina1NaN

In any sorting order, NaN values will always be placed at last. To change that default behavior we can specify na_position=’first’ as an argument.

accommodateshost_namelisting_countreview_count
2012-02-272Fronk & Lieve7NaN
2014-02-172Kristina1NaN
2014-05-225Klara12.0
2012-10-132Kristof13.0
2014-03-152Chloé13.0
2013-05-113Elke14.0
2012-07-244Maarten17.0
2014-04-012Maarten125.0
2012-04-172Francis259.0
2011-01-294Liesbet189.0

Pandas dataframe sorting with multi level indexes and columns

Let’s create dataframe with multi level indexes.

accommodateslisting_countreview_count
host_namesjoining_year
Maarten20142125.0
Fronk & Lieve201227NaN
Elke2013314.0
Francis20122259.0
Kristof2012213.0
Liesbet20114189.0
Chloé2014213.0
Klara2014512.0
Maarten2012417.0
Kristina201421NaN

In above dataframe we have two level index.

  • At level 0 index is host name
  • At level 1 index is year of joining

Now lets sort dataframe decending host names indexe with ascending year of joining index.

accommodateslisting_countreview_count
host_namesjoining_year
Maarten2012417.0
20142125.0
Liesbet20114189.0
Kristof2012213.0
Kristina201421NaN
Klara2014512.0
Fronk & Lieve201227NaN
Francis20122259.0
Elke2013314.0
Chloé2014213.0

In above dataframe you can see that there were two hosts named ‘Maarten’ joined in years 2012 and 2014.

Hope you njoyed the post. Stay tuned for such more posts on pandas.!!!!!

 

Leave a Reply

Your email address will not be published. Required fields are marked *