# How to Use MultiIndex for Hierarchical Data Organization in Pandas

Let's learn how to use MultiIndex in Pandas to organize and work with hierarchical data.

## Preparation

We need the Pandas package, so make sure it is installed. You can install it with the following command:

pip install pandas

Then, let's learn how to handle MultiIndex data in Pandas.

## Using MultiIndex in Pandas

MultiIndex in Pandas refers to an index with multiple levels on a DataFrame or Series. It is helpful when we work with higher-dimensional data in a 2D tabular structure: with a MultiIndex, each row is identified by a combination of keys, one per level, which makes the data easier to organize. Let's use an example dataset to understand it better.

import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)],
    names=['Category', 'Number']
)

df = pd.DataFrame({
    'Value': [10, 20, 30, 40]
}, index=index)

print(df)

The output:

                 Value
Category Number
A        1          10
         2          20
B        1          30
         2          40

As you can see, the DataFrame above has a two-level index, with Category as the outer level and Number as the inner level.
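To confirm the structure programmatically, the index can be inspected directly. The snippet below is a minimal sketch that rebuilds the same example DataFrame; the variable names (`n_levels`, `level_names`, and so on) are only illustrative.

```python
import pandas as pd

# Rebuild the example DataFrame from above.
index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)],
    names=['Category', 'Number']
)
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=index)

# Number of levels and their names.
n_levels = df.index.nlevels
level_names = list(df.index.names)

# The labels stored in each level, one entry per row.
categories = df.index.get_level_values('Category').tolist()
numbers = df.index.get_level_values('Number').tolist()

print(n_levels)     # 2
print(level_names)  # ['Category', 'Number']
print(categories)   # ['A', 'A', 'B', 'B']
print(numbers)      # [1, 2, 1, 2]
```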

It's also possible to set the MultiIndex with the existing columns in our DataFrame.

data = {
    'Category': ['A', 'A', 'B', 'B'],
    'Number': [1, 2, 1, 2],
    'Value': [10, 20, 30, 40]
}
df = pd.DataFrame(data)
df.set_index(['Category', 'Number'], inplace=True)

print(df)

The output:

                 Value
Category Number
A        1          10
         2          20
B        1          30
         2          40

Even with different methods, we get the same result. That's how we can create a MultiIndex in our DataFrame.
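As a quick check that the two approaches really produce the same DataFrame, we can build both and compare them. This is a small sketch; the names `df_tuples` and `df_columns` are made up for the comparison, and it uses `set_index` without `inplace=True` so that each result is kept in its own variable.

```python
import pandas as pd

# Approach 1: build the MultiIndex explicitly from tuples.
index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)],
    names=['Category', 'Number']
)
df_tuples = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=index)

# Approach 2: promote existing columns to index levels.
df_columns = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B'],
    'Number': [1, 2, 1, 2],
    'Value': [10, 20, 30, 40]
}).set_index(['Category', 'Number'])

# Compare the values and index labels, then the level names.
same_data = df_tuples.equals(df_columns)
same_names = list(df_tuples.index.names) == list(df_columns.index.names)

print(same_data, same_names)
```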

If you already have a MultiIndex DataFrame, it's possible to swap the order of the levels with the following code.

print(df.swaplevel())

The output:

                 Value
Number Category
1      A            10
2      A            20
1      B            30
2      B            40
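Notice that `swaplevel` only exchanges the levels: the rows stay in their original order, and `df` itself is not modified because a new DataFrame is returned. If you want the rows grouped by the new outer level, one option is to chain `sort_index`, as in this sketch (the name `swapped` is illustrative).

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)],
    names=['Category', 'Number']
)
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=index)

# Swap the two levels, then sort so rows are grouped by Number.
swapped = df.swaplevel().sort_index()

print(swapped.index.tolist())     # [(1, 'A'), (1, 'B'), (2, 'A'), (2, 'B')]
print(swapped['Value'].tolist())  # [10, 30, 20, 40]

# The original DataFrame keeps its level order.
print(list(df.index.names))       # ['Category', 'Number']
```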

We can also turn the MultiIndex levels back into regular columns with the following code:

print(df.reset_index())

The output:

  Category  Number  Value
0        A       1     10
1        A       2     20
2        B       1     30
3        B       2     40
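`reset_index` can also move just one level back to a column by passing the `level` argument, which leaves the other level as the index. The sketch below shows this, along with a round trip back to the original MultiIndex; `partial` and `restored` are illustrative names.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)],
    names=['Category', 'Number']
)
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=index)

# Move only the Number level into a column; Category stays as the index.
partial = df.reset_index(level='Number')
print(list(partial.columns))  # ['Number', 'Value']
print(partial.index.name)     # Category

# Round trip: flatten everything, then rebuild the MultiIndex.
restored = df.reset_index().set_index(['Category', 'Number'])
print(restored.equals(df))    # True
```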

So, how do we access MultiIndex data in a Pandas DataFrame? We can use the .loc indexer for that. For example, we can select every row under one key of the first level.

print(df.loc['A'])

The output:

        Value
Number
1          10
2          20

We can also select a single row by passing a tuple with one key for each level.

print(df.loc[('A', 1)])

The output:

Value    10
Name: (A, 1), dtype: int64
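The tuple lookup above returns the whole row as a Series. Two related lookups are often handy: adding a column label to `.loc` to get a single scalar, and using `xs` to select by the inner level. The following is a minimal sketch on the same example data, with illustrative variable names.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)],
    names=['Category', 'Number']
)
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=index)

# Row key as a tuple plus a column label returns a single value.
single_value = df.loc[('A', 1), 'Value']
print(single_value)  # 10

# Cross-section on the inner level: all rows where Number == 1.
# The selected level is dropped from the result by default.
number_one = df.xs(1, level='Number')
print(number_one.index.tolist())     # ['A', 'B']
print(number_one['Value'].tolist())  # [10, 30]
```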

Lastly, we can perform statistical aggregation with MultiIndex using the .groupby method.

print(df.groupby(level=['Category']).sum())

The output:

          Value
Category
A            30
B            70
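The same pattern works for any level and for more than one aggregation at a time. As a short sketch (the names `by_number` and `stats` are illustrative), we can group by the inner level instead, or compute several statistics per category with `agg`.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)],
    names=['Category', 'Number']
)
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=index)

# Aggregate over the inner level instead of the outer one.
by_number = df.groupby(level='Number').sum()
print(by_number['Value'].tolist())  # [40, 60]

# Several statistics per category in one call.
stats = df.groupby(level='Category')['Value'].agg(['sum', 'mean'])
print(stats)
```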

Mastering MultiIndex in Pandas will help you organize hierarchical data and gain insight from it.