Row selection

Each row in a DataFrame corresponds to a data point in our problem space. Row selection creates a subset of those data points. The subset can be created in one of two ways:

  • By specifying their position 
  • By specifying a filter

A subset of rows can be retrieved by its position as follows:

>>> df.iloc[1:3,:]
  id    name  age  decision
1  2   Elena   23     False
2  3  Steven   40      True

Note that the preceding code returns the second and third rows (positions 1 and 2) and all columns. iloc uses zero-based positions, and the end of the slice, 3, is excluded.
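The following is a minimal sketch of position-based selection. The construction of df is an assumption, reconstructed from the outputs shown in this section; the variable names are illustrative only.

```python
import pandas as pd

# Assumed contents of df, reconstructed from the outputs in this section.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["Fares", "Elena", "Steven"],
    "age": [32, 23, 40],
    "decision": [True, False, True],
})

# Positions are zero-based and the stop of a slice is excluded,
# so 1:3 selects positions 1 and 2.
second_and_third = df.iloc[1:3, :]

# To get the first two rows, slice from the start up to (not including) 2.
first_two = df.iloc[:2, :]

# A list of positions also works; [-1] selects the last row as a DataFrame.
last_row = df.iloc[[-1], :]
```

Passing a list such as [-1] rather than a bare integer keeps the result a DataFrame; a bare integer would return a single row as a Series.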

To create a subset by specifying a filter, we use one or more columns to define the selection criterion. For example, subsets of data elements can be selected by this method as follows:

>>> df[df.age>30]
  id    name  age  decision
0  1   Fares   32      True
2  3  Steven   40      True

>>> df[(df.age<35)&(df.decision==True)]
  id   name  age  decision
0  1  Fares   32      True

Note that each filter returns only the rows that satisfy its condition. The second filter combines two conditions with &, so a row must satisfy both to be selected.
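The filters can be sketched step by step as boolean masks. The construction of df is again an assumption based on the outputs shown in this section, and the names mask, young_and_approved, and not_approved are hypothetical.

```python
import pandas as pd

# Assumed contents of df, reconstructed from the outputs in this section.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["Fares", "Elena", "Steven"],
    "age": [32, 23, 40],
    "decision": [True, False, True],
})

# A comparison on a column yields a boolean Series, one value per row.
older_than_30 = df[df["age"] > 30]

# Each comparison needs its own parentheses, because & binds more tightly
# than < or ==. A boolean column can be used directly, without == True.
mask = (df["age"] < 35) & df["decision"]
young_and_approved = df.loc[mask]

# ~ negates a mask: rows where decision is False.
not_approved = df[~df["decision"]]
```

Bracket access such as df["age"] behaves the same as df.age here, but it also works when a column name contains spaces or clashes with a DataFrame attribute.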