pandas Series
Prerequisites: Introduction to pandas, Python Lists and Dictionaries
Mentor's Note: A Series is the simplest building block in pandas. Before you can understand a DataFrame (a full table), you need to understand a Series (a single column). Think of it as a Python list that knows the name of every item.
What You'll Learn
By the end of this tutorial, you'll know:
- What a pandas Series is and how it differs from a Python list
- How to create a Series from a list (default index) and from a dictionary (custom index)
- How to filter, slice, and do math on a Series without writing any loops
- The common mistake of assuming s[0] always works when you have a custom index
The Scenario: The Marks Register Column
Your teacher has a register book. The left column has student names, the right column has marks. If you tear out just the marks column and keep the names as labels, you have a pandas Series.
- The Logic:
  - Values: [95, 82, 70] (the actual data)
  - Index: ['Vishnu', 'Ankit', 'Priya'] (the labels)
  - Name: 'marks' (what this column represents)
- The Result: A 1D array where every value has a meaningful label.
Concept Explanation
1. What is a Series?
A pandas Series is a one-dimensional labelled array. It has:
- Values: the data (numbers, strings, booleans, etc.)
- Index: labels for each value (default: 0, 1, 2... or custom strings)
- dtype: the data type of the values (int64, float64, object, etc.)
- name: an optional name for the Series
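The four parts listed above can be inspected directly on a Series. The following is a small sketch that reuses the marks data from the scenario; the variable names `values` and `labels` are only illustrative.

```python
import pandas as pd

# Build the marks column from the scenario: values, labels, and a name.
marks = pd.Series([95, 82, 70], index=['Vishnu', 'Ankit', 'Priya'], name='marks')

values = marks.to_list()        # the data as a plain Python list
labels = marks.index.to_list()  # the index labels as a plain Python list

print(values)       # [95, 82, 70]
print(labels)       # ['Vishnu', 'Ankit', 'Priya']
print(marks.dtype)  # an integer dtype, typically int64
print(marks.name)   # marks
```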
2. Default Index vs Custom Index
| | Default Index | Custom Index |
|---|---|---|
| How | Created from a list | Created from a dictionary |
| Labels | 0, 1, 2, ... | 'Vishnu', 'Ankit', ... |
| Access | s[0] | s['Vishnu'] |
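As a quick illustration of the two columns in the table, here is a minimal sketch that creates one Series of each kind. The names `from_list` and `from_dict` are made up for this example.

```python
import pandas as pd

# A list gives the default index 0, 1, 2.
from_list = pd.Series([95, 82, 70])

# A dictionary gives a custom index: the keys become the labels.
from_dict = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70})

first_by_number = from_list[0]       # label 0 exists in the default index
first_by_name = from_dict['Vishnu']  # label 'Vishnu' exists in the custom index

print(from_list.index.to_list())  # [0, 1, 2]
print(from_dict.index.to_list())  # ['Vishnu', 'Ankit', 'Priya']
```

A list can also get custom labels by passing index=[...] to pd.Series(), so the dictionary route is a convenience rather than the only option.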
3. Series vs Python List
| Feature | Python List | pandas Series |
|---|---|---|
| Labels | Positional index only | Custom labels |
| Vectorised math | Needs a loop | s * 2, s + 10 |
| Built-in stats | Needs code | .mean(), .max() |
| Filter by condition | Verbose | s[s > 80] |
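To make the comparison concrete, the sketch below performs the same three operations on a plain list and on a Series. The variable names are illustrative only.

```python
import pandas as pd

marks_list = [95, 82, 70]
marks = pd.Series(marks_list)

# Plain list: each operation needs an explicit loop or comprehension.
doubled_list = [m * 2 for m in marks_list]
mean_list = sum(marks_list) / len(marks_list)
above_80_list = [m for m in marks_list if m > 80]

# Series: the same operations are single expressions.
doubled = marks * 2
mean = marks.mean()
above_80 = marks[marks > 80]

print(doubled.to_list())   # [190, 164, 140]
print(above_80.to_list())  # [95, 82]
```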
Visual Logic
graph LR
subgraph Default Index
A0["Index: 0"] --> V0["Value: 95"]
A1["Index: 1"] --> V1["Value: 82"]
A2["Index: 2"] --> V2["Value: 70"]
end
subgraph Custom Index
B0["Index: 'Vishnu'"] --> W0["Value: 95"]
B1["Index: 'Ankit'"] --> W1["Value: 82"]
B2["Index: 'Priya'"] --> W2["Value: 70"]
end
Implementation
import pandas as pd
marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})
# Statistical methods
print(marks.mean()) # 83.75
print(marks.max()) # 95
print(marks.min()) # 70
print(marks.sum()) # 335
# Vectorised operations (no loop needed!)
bonus = marks + 5 # Add 5 to every value
scaled = marks * 1.1 # Increase every mark by 10% (the result is float64)
# Boolean filtering: select marks above 80
toppers = marks[marks > 80]
print(toppers)
# Output:
# Vishnu 95
# Ankit 82
# Sara 88
# dtype: int64
Try this in your Python shell; no file needed.
>>> import pandas as pd
>>> marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70})
>>> marks
Vishnu 95
Ankit 82
Priya 70
dtype: int64
>>> marks.mean()
82.33333333333333
>>> marks[marks > 80]
Vishnu 95
Ankit 82
dtype: int64
>>> marks['Vishnu']
95
>>> marks + 5
Vishnu 100
Ankit 87
Priya 75
dtype: int64
New to the REPL?
Type python3 in your terminal to start. Each >>> is what you type; the line below is Python's response. Type exit() to quit.
Sample Dry Run
Series: marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})
Expression: marks[marks > 80]
| Step | Action | Result |
|---|---|---|
| 1 | Evaluate marks > 80 | Vishnu: True, Ankit: True, Priya: False, Sara: True |
| 2 | Use boolean mask to filter | Keep rows where mask is True |
| 3 | Return filtered Series | Vishnu: 95, Ankit: 82, Sara: 88 |
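The dry run can be reproduced by storing the intermediate boolean mask in its own variable. This is a sketch of the same steps; the names `mask` and `toppers` are chosen for the example.

```python
import pandas as pd

marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

# Step 1: the comparison produces a boolean Series with the same index.
mask = marks > 80

# Steps 2 and 3: indexing with the mask keeps only the rows where it is True.
toppers = marks[mask]

print(mask)
print(toppers)
```

Printing `mask` on its own is a handy debugging habit when a filter returns unexpected rows.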
Practice Lab
Task: City Population Analysis
Create a pandas Series of 5 Indian city populations (in millions):
- Surat: 7.8, Mumbai: 20.7, Delhi: 31.2, Chennai: 10.9, Kolkata: 14.9
Then answer these questions using Series methods:
- Which city has the highest population? (idxmax())
- What is the average population? (.mean())
- Filter cities with population above 12 million.
- Add 0.5 million to every city (vectorised addition).
Hint: Pass a dictionary to pd.Series() with city names as keys.
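If idxmax() is new to you, here is a short sketch of how it behaves on the marks data from earlier in this tutorial (not on the city data, so the task stays yours to solve). It returns the label of the largest value, while max() returns the value itself.

```python
import pandas as pd

marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

top_student = marks.idxmax()  # label of the largest value
top_mark = marks.max()        # the largest value itself

print(top_student)  # Vishnu
print(top_mark)     # 95
```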
Frequently Asked Questions
Q: I created a Series with string labels. Why does s[0] give a KeyError?
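When you create a Series from a dictionary, the keys become the index. If your index is ['Vishnu', 'Ankit', 'Priya'], then s[0] looks for a label 0, which doesn't exist. Use s.iloc[0] to access by position, or s['Vishnu'] to access by label. Older pandas versions fell back to positional access for s[0] on a non-integer index (later 2.x releases did so with a deprecation warning), so the exact behaviour depends on your version, whereas .iloc and label access behave the same everywhere. This is one of the most common beginner mistakes.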
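The difference between position and label can be checked safely with .iloc and .loc. The sketch below uses .loc[0] rather than s[0] to show the failed label lookup, because .loc is strictly label-based; the flag `label_zero_found` is a hypothetical name for this example.

```python
import pandas as pd

s = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70})

by_position = s.iloc[0]     # first element, regardless of labels
by_label = s.loc['Vishnu']  # element labelled 'Vishnu'

# .loc only looks at labels, and there is no label 0 in this index.
try:
    s.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

print(by_position, by_label, label_zero_found)  # 95 95 False
```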
Q: What's the difference between label-based slicing and position-based slicing?
With a custom string index, s['Ankit':'Sara'] includes both endpoints (inclusive). With integer position slicing s[0:2], the end is exclusive (like a Python list). This inconsistency trips up many beginners, so use .loc[] for label-based and .iloc[] for position-based access to be explicit.
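Here is a small sketch of both slicing styles written explicitly with .loc and .iloc, using the four-student marks Series from the Implementation section.

```python
import pandas as pd

marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

label_slice = marks.loc['Ankit':'Sara']  # label-based: both endpoints included
position_slice = marks.iloc[0:2]         # position-based: end position excluded

print(label_slice.index.to_list())     # ['Ankit', 'Priya', 'Sara']
print(position_slice.index.to_list())  # ['Vishnu', 'Ankit']
```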
Q: Can a Series hold mixed data types?
Yes, but the dtype will become object (like a Python list). Performance and memory use suffer because pandas can't use optimised numeric operations. Try to keep Series columns uniform β all integers, all floats, or all strings.
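You can see the dtype change by comparing a uniform Series with a mixed one. The values below are invented for illustration.

```python
import pandas as pd

uniform = pd.Series([95, 82, 70])        # all integers
mixed = pd.Series([95, 'absent', 70.5])  # an integer, a string, and a float

print(uniform.dtype)  # an integer dtype, typically int64
print(mixed.dtype)    # object
```

Checking .dtype right after loading or building a Series is a quick way to catch a stray string hiding in a numeric column.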
Q: CBSE exam: how do you create a Series from a dictionary?
s = pd.Series({'key1': val1, 'key2': val2}). The dictionary keys become the index labels. This is the standard CBSE answer.
Summary
In this tutorial, you've learned:
- A pandas Series is a 1D labelled array: values plus an index
- Create from a list (default 0-based index) or a dictionary (keys become labels)
- Vectorised operations like s + 5 or s * 1.1 work on every element, with no loop needed
- Boolean filtering s[s > 80] returns a new Series with only matching values
- s[0] fails with a custom string index; use .iloc[0] for position, s['label'] for labels
Interview & Exam Tips
Q: What is the default index of a Series created from a list?
Integer-based RangeIndex starting from 0: 0, 1, 2, ...
Q: What is the difference between a Python list and a pandas Series?
A Series has a labelled index, supports vectorised math (s * 2), and has built-in statistical methods (.mean(), .max()). A list has none of these.
Q: How do you create a Series with custom labels?
Pass a dictionary: pd.Series({'Vishnu': 95, 'Ankit': 82}). The keys become the index labels.
Q: What does s[s > 80] return?
A new Series containing only the values where the condition is True. This is called boolean indexing.
Further Reading
Continue your learning path:
- Previous: Introduction to pandas (install pandas and understand why it exists)
- Next: DataFrame Basics (a DataFrame is just multiple Series sharing an index)
Go deeper:
- Official pandas docs, Series: full method reference
- Indexing & Selection: master .loc[] and .iloc[] to avoid index confusion
- CSV with pandas: load real data and each column becomes a Series automatically