
VNSGU BCA Sem 2: Data Analysis Using Python (205_04) Practical Solutions - April 2025 Set F

Paper Details
  • Subject: Data Analysis Using Python (DAUP)
  • Subject Code: 205_04
  • Set: F
  • Semester: 2
  • Month/Year: April 2025
  • Max Marks: 25
  • Time Recommendation: 45 Minutes

Questions & Solutions

All questions are compulsory

Q1: CSV Data Processing Pipeline

Max Marks: 20

Write a Python script that performs the following:

  1. Create a students.csv file that contains rno, name, city, address, mob, per.
  2. Convert the above CSV file into a DataFrame.
  3. Display the column names of students.csv.
  4. Display only name and city.
  5. Fill empty values with 'Nan'.

1. CSV File Creation

Generate the source data file with the required fields.

Hint

You can use the csv module or simply write a string to a file. Ensure you leave some fields empty to test the 'Nan' filling logic later.

View Solution & Output
import pandas as pd
import csv

# [1] Create students.csv file
data = [
['rno', 'name', 'city', 'address', 'mob', 'per'],
[1, 'Aarav', 'Surat', 'Adajan', '9876543210', 85.5],
[2, 'Diya', 'Ahmedabad', 'Satellite', '9876543211', 78.0],
[3, 'Krish', 'Surat', '', '9876543212', 92.0], # Missing address
[4, 'Mira', 'Baroda', 'Alkapuri', '', 65.4], # Missing mobile
[5, 'Aryan', 'Surat', 'Vesu', '9876543214', None] # Missing percentage
]

# newline='' stops the csv module from writing blank lines between rows on Windows
with open('students.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(data)

print("students.csv created successfully.")

Step-by-Step Explanation:

  1. Initialization: Define the raw data as a nested list, with the header row first and one inner list per student.
  2. Logic Flow: Use Python's csv.writer to write all rows to students.csv. Empty strings and None are both written as empty fields, which gives the deliberate gaps needed later.
  3. Completion: Print a confirmation message once the file has been written to the current directory.
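
The hint mentions that the file can also be produced by writing a plain string. A minimal sketch of that alternative is shown here; the file name students_alt.csv and the temp-folder location are assumptions made for illustration, so the real students.csv is left untouched.

```python
import os
import tempfile

# Sample rows as one CSV-formatted string; an empty field is just two adjacent commas.
csv_text = (
    "rno,name,city,address,mob,per\n"
    "1,Aarav,Surat,Adajan,9876543210,85.5\n"
    "3,Krish,Surat,,9876543212,92.0\n"  # missing address
    "5,Aryan,Surat,Vesu,9876543214,\n"  # missing percentage
)

# Hypothetical file name, placed in the system temp folder.
path = os.path.join(tempfile.gettempdir(), "students_alt.csv")

with open(path, "w", newline="") as f:
    f.write(csv_text)

# Read the file back to confirm what was written.
with open(path, newline="") as f:
    lines = f.read().splitlines()

print(lines[0])
print(len(lines) - 1, "data rows")
```

This works for small, simple data. The csv module remains the safer choice when a field may itself contain a comma or a quote, because it handles quoting automatically.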

2. Loading & Column Inspection

Transform the CSV into a Pandas DataFrame and explore its structure.

Hint

Use pd.read_csv() for loading and the .columns attribute to view header names.

View Solution & Output
# [2] Converting CSV into Data Frame
df = pd.read_csv('students.csv')

# [3] Display columns name
print("\nColumn Names:")
print(df.columns.tolist())

Step-by-Step Explanation:

  1. Initialization: Rely on the pandas library (imported as pd in step 1) for CSV-to-DataFrame conversion.
  2. Logic Flow: Read the students.csv file into memory using pd.read_csv().
  3. Completion: Extract and display all column labels as a list for structural verification.
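
Beyond the column names, a few other attributes help verify a freshly loaded DataFrame. The sketch below uses an inline string in place of students.csv (an assumption, so the example runs on its own) and inspects shape, inferred types, and missing-value counts. One side effect worth knowing: because mob has a missing value, pandas reads that column as floating-point numbers by default; passing dtype={'mob': str} is one way to keep it as text.

```python
import io
import pandas as pd

# Inline sample standing in for students.csv (same columns, fewer rows).
csv_text = (
    "rno,name,city,address,mob,per\n"
    "1,Aarav,Surat,Adajan,9876543210,85.5\n"
    "3,Krish,Surat,,9876543212,92.0\n"
    "4,Mira,Baroda,Alkapuri,,65.4\n"
)

df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)            # (number of rows, number of columns)
print(list(df.columns))    # header names as a plain Python list
print(df.dtypes)           # type pandas inferred for each column
print(df.isnull().sum())   # count of missing values per column

# Optional: read the mobile number as text instead of a float.
df_text = pd.read_csv(io.StringIO(csv_text), dtype={"mob": str})
print(df_text["mob"].iloc[0])
```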

3. Data Selection & Cleanup

Extract specific information and handle missing data.

Hint
Hint

  • Select specific columns: df[['name', 'city']]
  • Fill missing values: df.fillna('Nan')

View Solution & Output
# [4] Display only name and city
print("\nStudent Name and City List:")
print(df[['name', 'city']])

# [5] Fill empty value with 'Nan'
df_filled = df.fillna('Nan')

print("\nData Frame after filling empty values:")
print(df_filled)

Step-by-Step Explanation:

  1. Initialization: Start from the DataFrame df loaded in the previous step.
  2. Logic Flow: Select the name and city columns by passing a list of labels, then call fillna('Nan') to replace missing values with the string 'Nan'.
  3. Completion: Print df_filled to verify that every empty field was replaced. The original df is left unchanged because fillna() returns a new DataFrame.

Concept Deep Dive: Missing Data (NaN)

In pandas, NaN (Not a Number) is the default marker for missing data. Pandas provides tools such as isnull(), dropna(), and fillna() to detect, remove, or replace these gaps. The question asks to fill with the string 'Nan', but filling a numeric column such as per with a string means it can no longer be used directly in calculations. In real analysis, numeric gaps are more often filled with the mean or median of the column so that later statistics still work.
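
As an illustration of that point, the sketch below fills each column with a value suited to its type by passing a dictionary to fillna(). The small DataFrame and the placeholder 'Unknown' are assumptions chosen for the example, not part of the exam answer.

```python
import numpy as np
import pandas as pd

# Small illustrative DataFrame with one gap in a text column and one in a numeric column.
df = pd.DataFrame({
    "name": ["Aarav", "Diya", "Aryan"],
    "address": ["Adajan", None, "Vesu"],
    "per": [80.0, 90.0, np.nan],
})

# Fill per column: a placeholder for text, the column mean for numbers.
filled = df.fillna({"address": "Unknown", "per": df["per"].mean()})
print(filled)
print(filled["per"].dtype)   # still a numeric type

# Filling everything with a string puts text into the numeric column.
as_text = df.fillna("Nan")
print(as_text["per"].tolist())
```

With the mean fill, filled['per'].mean() can still be computed afterwards. With the string fill, the per column holds a mix of numbers and text.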

Q2: Viva Preparation

Max Marks: 5

Potential Viva Questions
  1. Q: What is the difference between NaN and None in Pandas?
    • A: NaN is a floating-point "Not a Number" value used for missing numerical data, while None is Python's built-in null object. In numeric columns pandas converts None to NaN, and isnull() treats both as missing.
  2. Q: How do you select multiple columns in Pandas?
    • A: By passing a list of column names inside double square brackets: df[['col1', 'col2']].
  3. Q: What does df.columns return?
    • A: It returns an Index object containing all the column labels of the DataFrame.
  4. Q: How can you find the data type of each column?
    • A: Use the df.dtypes attribute.
  5. Q: What is the difference between dropna() and fillna()?
    • A: dropna() removes rows or columns with missing values, while fillna() replaces them with a specified value.
  6. Q: How do you check if any value is missing in the whole DataFrame?
    • A: Use df.isnull().values.any().
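
The answers above can be checked quickly in an interpreter. The following is a small sketch on made-up sample data that touches each of them in turn.

```python
import numpy as np
import pandas as pd

# None placed in a numeric Series is stored as NaN.
s = pd.Series([1.0, None, 3.0])
print(s.isnull().tolist())   # [False, True, False]

df = pd.DataFrame({
    "name": ["Aarav", "Diya", "Krish"],
    "per": [85.5, np.nan, 92.0],
})

print(type(df.columns))            # an Index object, not a list
print(df.dtypes)                   # data type of each column
print(df.isnull().values.any())    # True if any value is missing

dropped = df.dropna()     # removes the row that has a missing value
replaced = df.fillna(0)   # keeps every row, replaces the gap with 0
print(len(dropped), len(replaced))
```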

Common Pitfalls
  • Double Brackets: Writing df['name', 'city'] with a single pair of brackets passes a tuple as one column label and raises a KeyError. Always use df[['name', 'city']] to select multiple columns.
  • Inplace Parameter: df.fillna() returns a new DataFrame and leaves the original unchanged. To update the original, reassign the result (df = df.fillna('Nan')) or use df.fillna('Nan', inplace=True).
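
Both pitfalls are easy to reproduce. The sketch below, on made-up sample data, triggers the KeyError deliberately and then shows that fillna() only takes effect once its result is assigned.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Aarav", "Mira"],
    "city": ["Surat", "Baroda"],
    "per": [85.5, np.nan],
})

# Pitfall 1: single brackets with two names are treated as one tuple label.
got_key_error = False
try:
    df["name", "city"]
except KeyError as exc:
    got_key_error = True
    print("KeyError:", exc)

subset = df[["name", "city"]]   # correct: a list of labels inside the brackets

# Pitfall 2: calling fillna() without keeping the result changes nothing.
df.fillna(0)
missing_before = int(df["per"].isnull().sum())

df = df.fillna(0)               # reassign to keep the filled version
missing_after = int(df["per"].isnull().sum())

print(missing_before, missing_after)
```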


Last Updated: April 2026

VD Computer Tuition