
pandas Series 📈


Prerequisites: Introduction to pandas, Python Lists and Dictionaries

Mentor's Note: A Series is the simplest building block in pandas. Before you can understand a DataFrame (a full table), you need to understand a Series (a single column). Think of it as a Python list that knows the name of every item. 💡

What You'll Learn

By the end of this tutorial, you'll know:

  • What a pandas Series is and how it differs from a Python list
  • How to create a Series from a list (default index) and from a dictionary (custom index)
  • How to filter, slice, and do math on a Series without writing any loops
  • The common mistake of assuming s[0] always works when you have a custom index

🌟 The Scenario: The Marks Register Column

Your teacher has a register book. The left column has student names, the right column has marks. If you tear out just the marks column and keep the names as labels, you have a pandas Series.

  • The Logic:
    • Values: [95, 82, 70] (the actual data)
    • Index: ['Vishnu', 'Ankit', 'Priya'] (the labels)
    • Name: 'marks' (what this column represents)
  • The Result: A 1D array where every value has a meaningful label. ✅
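
The register column above can be sketched in code. This is a minimal illustration, assuming you start from two plain lists rather than a dictionary; the variable names `names` and `scores` are made up for the example, while `index=` and `name=` are regular parameters of `pd.Series`.

```python
import pandas as pd

names = ['Vishnu', 'Ankit', 'Priya']   # labels (the index)
scores = [95, 82, 70]                  # values (the data)

# Pair each value with its label and name the column in one call
marks = pd.Series(scores, index=names, name='marks')

print(marks)
print(marks.name)          # marks
print(list(marks.index))   # ['Vishnu', 'Ankit', 'Priya']
```

Passing `index=` is handy when the labels and values arrive separately, for example as two columns read from a file.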

📖 Concept Explanation

1. What is a Series?

A pandas Series is a one-dimensional labelled array. It has:

  • Values: the data (numbers, strings, booleans, etc.)
  • Index: labels for each value (default: 0, 1, 2... or custom strings)
  • dtype: the data type of the values (int64, float64, object, etc.)
  • name: an optional name for the Series
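
To see how pandas chooses the dtype and name, here is a small sketch. The three Series and their variable names are illustrative only; the point is that pandas infers the dtype from the values you pass in, and that `name` stays `None` unless you set it.

```python
import pandas as pd

whole = pd.Series([95, 82, 70])
decimal = pd.Series([95.5, 82.0, 70.25])
flags = pd.Series([True, False, True], name='passed')

print(whole.dtype)     # an integer dtype (int64)
print(decimal.dtype)   # float64
print(flags.dtype)     # bool
print(flags.name)      # passed
print(whole.name)      # None, because no name was given
```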

2. Default Index vs Custom Index

| | Default Index | Custom Index |
| --- | --- | --- |
| How | Created from a list | Created from a dictionary |
| Labels | 0, 1, 2, ... | 'Vishnu', 'Ankit', ... |
| Access | s[0] | s['Vishnu'] |

3. Series vs Python List

| Feature | Python List | pandas Series |
| --- | --- | --- |
| Labels | ❌ Index only | ✅ Custom labels |
| Vectorised math | ❌ Needs loop | ✅ s * 2, s + 10 |
| Built-in stats | ❌ Needs code | ✅ .mean(), .max() |
| Filter by condition | Verbose | s[s > 80] |
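
The comparison can be made concrete with a side-by-side sketch. It runs the same three operations (double, filter, average) on a plain list and on a Series; the variable names are illustrative.

```python
import pandas as pd

marks_list = [95, 82, 70]
marks_series = pd.Series(marks_list)

# Plain list: every operation needs an explicit loop or comprehension
doubled_list = [m * 2 for m in marks_list]
above_80_list = [m for m in marks_list if m > 80]
mean_list = sum(marks_list) / len(marks_list)

# Series: the same three operations, one expression each
doubled_series = marks_series * 2
above_80_series = marks_series[marks_series > 80]
mean_series = marks_series.mean()

print(doubled_list, doubled_series.tolist())
print(above_80_list, above_80_series.tolist())
print(mean_list, mean_series)
```

Note that `marks_list * 2` on a plain list repeats the list instead of doubling each value, which is one more reason the Series form is less error-prone for numeric work.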

🎨 Visual Logic

graph LR
    subgraph Default Index
        A0["Index: 0"] --> V0["Value: 95"]
        A1["Index: 1"] --> V1["Value: 82"]
        A2["Index: 2"] --> V2["Value: 70"]
    end
    subgraph Custom Index
        B0["Index: 'Vishnu'"] --> W0["Value: 95"]
        B1["Index: 'Ankit'"] --> W1["Value: 82"]
        B2["Index: 'Priya'"] --> W2["Value: 70"]
    end

💻 Implementation

import pandas as pd

# Create a Series from a list
marks = pd.Series([95, 82, 70])

print(marks)
# 0    95
# 1    82
# 2    70
# dtype: int64

print(marks.index)   # RangeIndex(start=0, stop=3, step=1)
print(marks.values)  # [95 82 70]
print(marks.dtype)   # int64

import pandas as pd

# Dictionary keys become the index (labels)
marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70})

print(marks)
# Vishnu    95
# Ankit     82
# Priya     70
# dtype: int64

# Access by label
print(marks['Vishnu'])  # 95

# Give the Series a name
marks.name = 'marks'

import pandas as pd

marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

# Statistical methods
print(marks.mean())   # 83.75
print(marks.max())    # 95
print(marks.min())    # 70
print(marks.sum())    # 335

# Vectorised operations (no loop needed!)
bonus = marks + 5     # Add 5 to every value
scaled = marks * 1.1  # Increase every mark by 10% (result is float64)

# Boolean filtering: select marks above 80
toppers = marks[marks > 80]
print(toppers)
# Output:
# Vishnu    95
# Ankit     82
# Sara      88
# dtype: int64
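
One detail worth checking yourself is what vectorised math does to the dtype and to the original Series. The sketch below reuses the same marks; the `.round(1)` call is an addition for tidy output and is not part of the example above.

```python
import pandas as pd

marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

bonus = marks + 5                  # integer + integer stays an integer dtype
scaled = (marks * 1.1).round(1)    # multiplying by a float gives float64

print(bonus.to_dict())    # {'Vishnu': 100, 'Ankit': 87, 'Priya': 75, 'Sara': 93}
print(scaled.dtype)       # float64
print(marks.to_dict())    # the original Series is unchanged
```

Each operation returns a new Series, so assign the result to a variable if you want to keep it.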

Try this in your Python shell; no file needed.

>>> import pandas as pd
>>> marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70})
>>> marks
Vishnu    95
Ankit     82
Priya     70
dtype: int64
>>> marks.mean()
82.33333333333333
>>> marks[marks > 80]
Vishnu    95
Ankit     82
dtype: int64
>>> marks['Vishnu']
95
>>> marks + 5
Vishnu    100
Ankit      87
Priya      75
dtype: int64

New to the REPL?

Type python3 in your terminal to start. Each >>> is what you type; the line below is Python's response. Type exit() to quit.

import pandas as pd

marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

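# Slice by position (end is exclusive, like a list); .iloc makes this explicit
print(marks.iloc[0:2])
# Vishnu    95
# Ankit     82
# dtype: int64

# Slice by label (inclusive on both ends); .loc makes this explicit
print(marks.loc['Ankit':'Sara'])
# Ankit    82
# Priya    70
# Sara     88
# dtype: int64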

📊 Sample Dry Run

Series: marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

Expression: marks[marks > 80]

| Step | Action | Result |
| --- | --- | --- |
| 1 | Evaluate marks > 80 | Vishnu: True, Ankit: True, Priya: False, Sara: True |
| 2 | Use boolean mask to filter | Keep rows where mask is True |
| 3 | Return filtered Series | Vishnu: 95, Ankit: 82, Sara: 88 |
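
The dry run can be reproduced by storing the intermediate mask in its own variable. This is only a sketch of the same steps; the name `mask` is illustrative.

```python
import pandas as pd

marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

# Step 1: the comparison produces a boolean Series with the same index
mask = marks > 80
print(mask.to_dict())

# Steps 2 and 3: indexing with the mask keeps only the rows marked True
toppers = marks[mask]
print(toppers.to_dict())
```

Printing `mask` before filtering is a quick way to debug a condition that returns more or fewer rows than you expected.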

🎯 Practice Lab 🧪

Task: City Population Analysis

Create a pandas Series of 5 Indian city populations (in millions):

  • Surat: 7.8, Mumbai: 20.7, Delhi: 31.2, Chennai: 10.9, Kolkata: 14.9

Then answer these questions using Series methods:

  1. Which city has the highest population? (.idxmax())
  2. What is the average population? (.mean())
  3. Filter cities with population above 12 million.
  4. Add 0.5 million to every city (vectorised addition).

Hint: Pass a dictionary to pd.Series() with city names as keys.
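
If you want a starting point, the setup might look like the sketch below. It only builds the Series and shows one warm-up call; the name `population_millions` is an arbitrary choice, and the four questions are left for you to answer.

```python
import pandas as pd

# Populations in millions, as given in the task
population = pd.Series(
    {'Surat': 7.8, 'Mumbai': 20.7, 'Delhi': 31.2, 'Chennai': 10.9, 'Kolkata': 14.9},
    name='population_millions',
)

# Warm-up: idxmin() returns the label of the smallest value
print(population.idxmin())   # Surat
```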


❓ Frequently Asked Questions

Q: I created a Series with string labels. Why does s[0] give a KeyError or a FutureWarning?

When you create a Series from a dictionary, the keys become the index. If your index is ['Vishnu', 'Ankit', 'Priya'], then s[0] asks for a label 0, which doesn't exist. pandas 2.x releases from 2.1 onwards still fall back to the first position but emit a FutureWarning; once integer keys are always treated as labels, the same call raises a KeyError. Use s.iloc[0] to access by position, or s['Vishnu'] to access by label. This is one of the most common beginner mistakes.
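
A short sketch of the two explicit access styles, which behave the same way regardless of how a given pandas version treats s[0]. The variable names are illustrative.

```python
import pandas as pd

s = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70})

first_by_position = s.iloc[0]      # position 0, whatever the labels are
first_by_label = s.loc['Vishnu']   # label lookup, the explicit form of s['Vishnu']

print(first_by_position, first_by_label)   # 95 95
print(0 in s.index)                        # False: there is no label 0
```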

Q: What's the difference between label-based slicing and position-based slicing?

With a custom string index, s['Ankit':'Sara'] includes both endpoints (inclusive). With integer position slicing s[0:2], the end is exclusive (like a Python list). This inconsistency trips up many beginners, so use .loc[] for label-based and .iloc[] for position-based access to be explicit.
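
The difference in endpoints is easy to verify with a small sketch. Both slices start at 'Ankit', but only the label-based one includes its end.

```python
import pandas as pd

s = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

by_position = s.iloc[1:3]          # positions 1 and 2; position 3 is excluded
by_label = s.loc['Ankit':'Sara']   # 'Ankit' through 'Sara', both included

print(list(by_position.index))   # ['Ankit', 'Priya']
print(list(by_label.index))      # ['Ankit', 'Priya', 'Sara']
```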

Q: Can a Series hold mixed data types?

Yes, but the dtype will become object, meaning each element is stored as a generic Python object (as in a list). Performance and memory use suffer because pandas can't use optimised numeric operations. Try to keep each Series uniform: all integers, all floats, or all strings.

Q: CBSE exam: how do you create a Series from a dictionary?

s = pd.Series({'key1': val1, 'key2': val2}). Dictionary keys become the index labels. This is the standard CBSE answer.
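
You can check the dtype change for yourself with the sketch below. The values are arbitrary; what matters is that mixing an integer, a string, and a float in one Series forces the object dtype.

```python
import pandas as pd

uniform = pd.Series([1, 2, 3])
mixed = pd.Series([1, 'two', 3.0])

print(uniform.dtype)   # an integer dtype (int64)
print(mixed.dtype)     # object

# Each element of the mixed Series keeps its own Python type
print([type(v).__name__ for v in mixed])   # ['int', 'str', 'float']
```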


✅ Summary

In this tutorial, you've learned:

  • ✅ A pandas Series is a 1D labelled array: values plus an index
  • ✅ Create from a list (default 0-based index) or a dictionary (keys become labels)
  • ✅ Vectorised operations like s + 5 or s * 1.1 work on every element, with no loop needed
  • ✅ Boolean filtering s[s > 80] returns a new Series with only matching values
  • ✅ s[0] is not reliable with a custom string index: use s.iloc[0] for position, s['label'] for labels

💡 Interview & Exam Tips

Q: What is the default index of a Series created from a list?

Integer-based RangeIndex starting from 0: 0, 1, 2, ...

Q: What is the difference between a Python list and a pandas Series?

A Series has a labelled index, supports vectorised math (s * 2), and has built-in statistical methods (.mean(), .max()). A list has none of these.

Q: How do you create a Series with custom labels?

Pass a dictionary: pd.Series({'Vishnu': 95, 'Ankit': 82}). The keys become the index labels.

Q: What does s[s > 80] return?

A new Series containing only the values where the condition is True. This is called boolean indexing.

