pandas Series
Mentor's Note: A Series is the simplest building block in pandas. Before you can understand a DataFrame (a full table), you need to understand a Series (a single column). Think of it as a Python list that knows the name of every item.
By the end of this tutorial, you'll know:
- What a pandas Series is and how it differs from a Python list
- How to create a Series from a list (default index) and from a dictionary (custom index)
- How to filter, slice, and do math on a Series without writing any loops
- The common mistake of assuming `s[0]` always works when you have a custom index
The Scenario: The Marks Register Column
Your teacher has a register book. The left column has student names, the right column has marks. If you tear out just the marks column and keep the names as labels, that's a pandas Series.
- The Logic:
  - Values: `[95, 82, 70]`, the actual data
  - Index: `['Vishnu', 'Ankit', 'Priya']`, the labels
  - Name: `'marks'`, what this column represents
- The Result: A 1D array where every value has a meaningful label.
Concept Explanation
1. What is a Series?
A pandas Series is a one-dimensional labelled array. It has:
- Values: the data (numbers, strings, booleans, etc.)
- Index: labels for each value (default: 0, 1, 2... or custom strings)
- dtype: the data type of the values (`int64`, `float64`, `object`, etc.)
- name: an optional name for the Series
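As a rough sketch of these four parts, the snippet below builds the register column from the scenario using a plain list plus the `index=` and `name=` arguments of `pd.Series`, then reads each part back. The variable names (`register`, `values`, `labels`, `kind`, `title`) are illustrative choices, not part of pandas.

```python
import pandas as pd

# Build a Series from a list, supplying the labels and a name explicitly
register = pd.Series([95, 82, 70], index=['Vishnu', 'Ankit', 'Priya'], name='marks')

values = register.to_numpy()   # the data, as a NumPy array
labels = list(register.index)  # the index labels
kind = register.dtype          # the dtype of the values
title = register.name          # the optional name

print(values)
print(labels)
print(kind)
print(title)
```

Passing `index=` alongside a list is one way to get custom labels without going through a dictionary.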
2. Default Index vs Custom Index
| | Default Index | Custom Index |
|---|---|---|
| How | Created from a list | Created from a dictionary |
| Labels | 0, 1, 2, ... | 'Vishnu', 'Ankit', ... |
| Access | `s[0]` | `s['Vishnu']` |
3. Series vs Python List
| Feature | Python List | pandas Series |
|---|---|---|
| Labels | ❌ Index only | ✅ Custom labels |
| Vectorised math | ❌ Needs loop | ✅ `s * 2`, `s + 10` |
| Built-in stats | ❌ Needs code | ✅ `.mean()`, `.max()` |
| Filter by condition | Verbose | `s[s > 80]` |
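To make the comparison concrete, here is a small side-by-side sketch that does the same three jobs (doubling, averaging, filtering) once with a plain list and once with a Series. The variable names are made up for illustration.

```python
import pandas as pd

raw_marks = [95, 82, 70]

# Plain list: arithmetic and filtering need a loop or comprehension
doubled_list = [m * 2 for m in raw_marks]
average_list = sum(raw_marks) / len(raw_marks)
above_80_list = [m for m in raw_marks if m > 80]

# Series: the same work is written as whole-column operations
s = pd.Series(raw_marks)
doubled_series = s * 2
average_series = s.mean()
above_80_series = s[s > 80]

print(doubled_list, doubled_series.tolist())
print(average_list, average_series)
print(above_80_list, above_80_series.tolist())
```

Both approaches give the same numbers; the Series version simply avoids the explicit iteration.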
Implementation
- 1. From a List (Default Index)
- 2. From a Dictionary (Custom Index)
- 3. Key Methods & Operations
- 4. Interactive REPL
- 5. Slicing
1. From a List (Default Index)

```python
import pandas as pd

# Create a Series from a list
marks = pd.Series([95, 82, 70])
print(marks)
# 0    95
# 1    82
# 2    70
# dtype: int64

print(marks.index)   # RangeIndex(start=0, stop=3, step=1)
print(marks.values)  # [95 82 70]
print(marks.dtype)   # int64
```
2. From a Dictionary (Custom Index)

```python
import pandas as pd

# Dictionary keys become the index (labels)
marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70})
print(marks)
# Vishnu    95
# Ankit     82
# Priya     70
# dtype: int64

# Access by label
print(marks['Vishnu'])  # 95

# Give the Series a name
marks.name = 'marks'
```
3. Key Methods & Operations

```python
import pandas as pd

marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

# Statistical methods
print(marks.mean())  # 83.75
print(marks.max())   # 95
print(marks.min())   # 70
print(marks.sum())   # 335

# Vectorised operations (no loop needed!)
bonus = marks + 5     # Add 5 to every value
scaled = marks * 1.1  # Increase every mark by 10% (result is float64)

# Boolean filtering: select marks above 80
toppers = marks[marks > 80]
print(toppers)
# Output:
# Vishnu    95
# Ankit     82
# Sara      88
# dtype: int64
```
4. Interactive REPL

Try this in your Python shell; no file needed.

```python
>>> import pandas as pd
>>> marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70})
>>> marks
Vishnu    95
Ankit     82
Priya     70
dtype: int64
>>> marks.mean()
82.33333333333333
>>> marks[marks > 80]
Vishnu    95
Ankit     82
dtype: int64
>>> marks['Vishnu']
95
>>> marks + 5
Vishnu    100
Ankit      87
Priya      75
dtype: int64
```

Type python3 in your terminal to start. Each >>> line is what you type; the lines below it are Python's response. With NumPy 2 or later, scalar results may display as np.float64(82.33333333333333) and np.int64(95) instead of the bare numbers. Type exit() to quit.
5. Slicing

```python
import pandas as pd

marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

# Slice by position (like a list; the end is excluded)
print(marks[0:2])
# Vishnu    95
# Ankit     82

# Slice by label (inclusive on both ends)
print(marks['Ankit':'Sara'])
# Ankit    82
# Priya    70
# Sara     88
```
Sample Dry Run
Series: marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})
Expression: marks[marks > 80]
| Step | Action | Result |
|---|---|---|
| 1 | Evaluate marks > 80 | Vishnu: True, Ankit: True, Priya: False, Sara: True |
| 2 | Use boolean mask to filter | Keep rows where mask is True |
| 3 | Return filtered Series | Vishnu: 95, Ankit: 82, Sara: 88 |
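The dry run above can be reproduced by splitting the one-line expression into its two stages. This is only a sketch; the intermediate name `mask` is an illustrative choice.

```python
import pandas as pd

marks = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

# Step 1: the comparison yields a boolean Series with the same index
mask = marks > 80

# Steps 2 and 3: indexing with the mask keeps only the rows where it is True
toppers = marks[mask]

print(mask)
print(toppers)
```

Printing `mask` on its own is a handy way to debug a filter that returns unexpected rows.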
Practice Lab
Create a pandas Series of 5 Indian city populations (in millions):
- Surat: 7.8, Mumbai: 20.7, Delhi: 31.2, Chennai: 10.9, Kolkata: 14.9
Then answer these questions using Series methods:
- Which city has the highest population? (`.idxmax()`)
- What is the average population? (`.mean()`)
- Filter cities with population above 12 million.
- Add 0.5 million to every city (vectorised addition).
Hint: Pass a dictionary to pd.Series() with city names as keys.
Frequently Asked Questions
Q: I created a Series with string labels. Why does s[0] give a KeyError?
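When you create a Series from a dictionary, the keys become the index. If your index is ['Vishnu', 'Ankit', 'Priya'], then s[0] looks for a label 0, which doesn't exist, so pandas 3.0 and later raise a KeyError. Older versions fell back to positional access instead (with a deprecation warning from pandas 2.1), so the same code can behave differently depending on your install. Use s.iloc[0] to access by position, or s['Vishnu'] to access by label. This is one of the most common beginner mistakes.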
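A minimal sketch of the explicit alternatives follows. It uses `.loc[0]` rather than `s[0]` to show the missing-label case, because `.loc` always treats `0` as a label; the flag name `label_zero_found` is just for illustration.

```python
import pandas as pd

s = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70})

first_by_position = s.iloc[0]     # position 0, whatever its label is
first_by_label = s.loc['Vishnu']  # the row labelled 'Vishnu'

# .loc treats 0 as a label, and no such label exists in this index
try:
    s.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

print(first_by_position, first_by_label, label_zero_found)
```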
Q: What's the difference between label-based slicing and position-based slicing?
With a custom string index, s['Ankit':'Sara'] includes both endpoints (inclusive). With integer position slicing s[0:2], the end is exclusive (like a Python list). This inconsistency trips up many beginners, so use .loc[] for label-based and .iloc[] for position-based access to be explicit.
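Written with the explicit accessors, the two kinds of slice look like this (a small sketch; the result names are illustrative):

```python
import pandas as pd

s = pd.Series({'Vishnu': 95, 'Ankit': 82, 'Priya': 70, 'Sara': 88})

by_position = s.iloc[0:2]         # positions 0 and 1; position 2 is excluded
by_label = s.loc['Ankit':'Sara']  # 'Ankit' through 'Sara', both included

print(by_position)
print(by_label)
```

Spelling out `.iloc` or `.loc` makes it obvious to a reader which endpoint rule applies.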
Q: Can a Series hold mixed data types?
Yes, but the dtype will become object (like a Python list). Performance and memory use suffer because pandas can't use optimised numeric operations. Try to keep Series columns uniform: all integers, all floats, or all strings.
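You can see the dtype change for yourself with a quick check like the one below. The sample values (including the string `'absent'`) are invented for illustration.

```python
import pandas as pd

uniform = pd.Series([95, 82, 70])        # all integers
mixed = pd.Series([95, 'absent', 70.5])  # int, str and float together

print(uniform.dtype)  # an integer dtype
print(mixed.dtype)    # object
```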
Q: CBSE exam: how do you create a Series from a dictionary?
s = pd.Series({'key1': val1, 'key2': val2}). The dictionary keys become the index labels. This is the standard CBSE answer.
Summary
In this tutorial, you've learned:
- ✅ A pandas Series is a 1D labelled array: values plus an index
- ✅ Create from a list (default 0-based index) or a dictionary (keys become labels)
- ✅ Vectorised operations like `s + 5` or `s * 1.1` work on every element, no loop needed
- ✅ Boolean filtering `s[s > 80]` returns a new Series with only matching values
- ✅ `s[0]` is not a safe way to get the first item when the index holds strings: use `.iloc[0]` for position and `s['label']` for labels
Interview & Exam Tips
Q: What is the default index of a Series created from a list?
Integer-based RangeIndex starting from 0: 0, 1, 2, ...
Q: What is the difference between a Python list and a pandas Series?
A Series has a labelled index, supports vectorised math (s * 2), and has built-in statistical methods (.mean(), .max()). A list has none of these.
Q: How do you create a Series with custom labels?
Pass a dictionary: pd.Series({'Vishnu': 95, 'Ankit': 82}). The keys become the index labels.
Q: What does s[s > 80] return?
A new Series containing only the values where the condition is True. This is called boolean indexing.
Further Reading
Continue your learning path:
- Previous: Introduction to pandas (install pandas and understand why it exists)
- Next: DataFrame Basics (a DataFrame is just multiple Series sharing an index)
Go deeper:
- Official pandas docs, Series: full method reference
- Indexing & Selection: master `.loc[]` and `.iloc[]` to avoid index confusion
- CSV with pandas: load real data and each column becomes a Series automatically