VNSGU BCA Sem 2: Programming Skills - Advanced (204_02) Practical Solutions - Set C

Paper Details

Subject: Programming Skills - Advanced (PKUP)
Subject Code: 204_02
Set: C
Semester: 2
Month/Year: April 2025
Max Marks: 25
Time Recommendation: 45 Minutes

Questions & Solutions

All questions are compulsory

Q1A: Movie Data Pipeline

Max Marks: 20

Using Python, perform the following tasks for MOVIE data management:

1. Create a dictionary (or a set of sets) containing movie details with the following fields: MOVIE_ID, TITLE, GENRE, DIRECTOR, RATING, RELEASE_YEAR. Add at least 10 movie records.
2. Convert the dictionary into a pandas Data Frame.
3. Save the Data Frame as a CSV file named movie_data.csv.
4. Load the CSV file and display:
   a) All movies in the "Action" genre
   b) Movies with a rating above 8.5
   c) Movies released after 2015

1. Dictionary & DataFrame Creation

Initialize the movie dataset and transform it into a Pandas DataFrame.

Hint

A dictionary in which each key is a column name and each value is a list of that column's entries maps directly onto a DataFrame. All lists must have the same length. Use pd.DataFrame() for the conversion.

Flowchart: Movie Dictionary --> pd.DataFrame Conversion --> Pandas DataFrame

Solution

import pandas as pd

# [1] Create dictionary with movie details
movie_dict = {
    'MOVIE_ID': [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    'TITLE': ['Inception', 'The Dark Knight', 'Interstellar', 'Parasite', 'Avengers', 'Joker', 'The Matrix', 'Avatar', 'John Wick', 'Mad Max'],
    'GENRE': ['Sci-Fi', 'Action', 'Sci-Fi', 'Drama', 'Action', 'Drama', 'Sci-Fi', 'Sci-Fi', 'Action', 'Action'],
    'DIRECTOR': ['Nolan', 'Nolan', 'Nolan', 'Bong Joon-ho', 'Whedon', 'Phillips', 'Wachowskis', 'Cameron', 'Stahelski', 'Miller'],
    'RATING': [8.8, 9.0, 8.6, 8.5, 8.0, 8.4, 8.7, 7.8, 7.4, 8.1],
    'RELEASE_YEAR': [2010, 2008, 2014, 2019, 2012, 2019, 1999, 2009, 2014, 2015]
}

# [2] Convert dictionary into pandas DataFrame
df = pd.DataFrame(movie_dict)
print("DataFrame successfully initialized.")
print(df.head())

Step-by-Step Explanation:

1. Initialization: Import pandas and define movie_dict with 10 records, using one list per column.
2. Logic Flow: Pass movie_dict to pd.DataFrame() to turn the dictionary into a tabular, queryable structure stored in df.
3. Completion: Print the first five rows with df.head() to confirm that the DataFrame was built as expected.
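The column-per-key layout described above is not the only input pd.DataFrame() accepts. The following is a minimal sketch, assuming a shortened three-movie sample and illustrative names (columns, rows, df_cols, df_rows), that builds the same table from a list of per-movie dictionaries and shows what happens when the column lists differ in length.

```python
import pandas as pd

# Column-oriented: each key is a column name, each list holds that column's values.
columns = {
    'MOVIE_ID': [101, 102, 103],
    'TITLE': ['Inception', 'The Dark Knight', 'Interstellar'],
    'RATING': [8.8, 9.0, 8.6],
}
df_cols = pd.DataFrame(columns)

# Row-oriented alternative (illustrative): one dictionary per movie record.
rows = [
    {'MOVIE_ID': 101, 'TITLE': 'Inception', 'RATING': 8.8},
    {'MOVIE_ID': 102, 'TITLE': 'The Dark Knight', 'RATING': 9.0},
    {'MOVIE_ID': 103, 'TITLE': 'Interstellar', 'RATING': 8.6},
]
df_rows = pd.DataFrame(rows)

# Both constructions describe the same table.
print(df_cols.equals(df_rows))
print(df_cols.shape)  # (rows, columns)

# Column lists of unequal length are rejected with a ValueError.
try:
    pd.DataFrame({'MOVIE_ID': [101, 102], 'TITLE': ['Inception']})
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print("Unequal lengths rejected:", mismatch_rejected)
```

The column-oriented form is usually shorter to type in an exam setting, while the row-oriented form reads more naturally when records are collected one at a time.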

2. File Storage (CSV)

Persist the movie records into a CSV file for long-term storage.

Hint

Use df.to_csv('filename.csv', index=False). Disabling the index ensures the CSV doesn't have an extra column for row numbers.

Flowchart: Pandas DataFrame --> Save to CSV --> movie_data.csv

Solution

# [3] Save DataFrame as CSV file
df.to_csv('movie_data.csv', index=False)
print("File 'movie_data.csv' created successfully.")

Step-by-Step Explanation:

1. Initialization: Start from the DataFrame df that holds the movie records.
2. Logic Flow: Call df.to_csv() with index=False to write the data to 'movie_data.csv' without the row-number column.
3. Completion: The message is printed once to_csv() returns without raising an error. Because the path is relative, the file is written to the current working directory.
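To see what index=False changes, the sketch below writes a two-row sample with and without the index and reads both versions back. It assumes in-memory io.StringIO buffers in place of real files so that nothing is left on disk; the variable names are illustrative.

```python
import io
import pandas as pd

df = pd.DataFrame({
    'MOVIE_ID': [101, 102],
    'TITLE': ['Inception', 'The Dark Knight'],
})

# In-memory text buffers stand in for files in this sketch.
with_index = io.StringIO()
without_index = io.StringIO()

df.to_csv(with_index)                  # default: index=True, row labels are written
df.to_csv(without_index, index=False)  # row labels are left out

print(with_index.getvalue())
print(without_index.getvalue())

# Rewind the buffers and load both versions back.
with_index.seek(0)
without_index.seek(0)
cols_with = list(pd.read_csv(with_index).columns)
cols_without = list(pd.read_csv(without_index).columns)

print(cols_with)     # includes an extra 'Unnamed: 0' column
print(cols_without)  # only the original columns
```

If a file was already saved with the index, passing index_col=0 to pd.read_csv() is one way to read that first column back as the index instead of as data.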

3. Data Loading & Filtering

Load the movie data and perform specific analytics.

Hint

Use boolean indexing to filter the rows. Example: df[df['RATING'] > 8.5].

Flowchart: Load read_csv --> Apply Filters --> Show Results

Solution

# [4] Load CSV file and perform queries
load_df = pd.read_csv('movie_data.csv')

# Query 1: Action Genre
print("\n[a] Action Movies:")
print(load_df[load_df['GENRE'] == 'Action'])

# Query 2: Rating > 8.5
print("\n[b] Top Rated Movies (Rating > 8.5):")
print(load_df[load_df['RATING'] > 8.5])

# Query 3: Release Year after 2015
print("\n[c] Recent Movies (After 2015):")
print(load_df[load_df['RELEASE_YEAR'] > 2015])

Step-by-Step Explanation:

1. Initialization: Load the data from movie_data.csv back into Python using pd.read_csv().
2. Logic Flow: Use comparison operators to build one filter per query: GENRE equal to 'Action', RATING strictly greater than 8.5 (a movie rated exactly 8.5 is excluded), and RELEASE_YEAR strictly greater than 2015 (movies from 2015 itself are excluded).
3. Completion: Print each filtered DataFrame under its own label, matching parts a), b) and c) of the question.
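The three filters can be checked by hand against the sample data. The sketch below is an illustration only: it rebuilds the relevant columns in memory rather than reading movie_data.csv, and the names movies, mask, action, top_rated and recent are assumptions made for this example.

```python
import pandas as pd

movies = pd.DataFrame({
    'TITLE': ['Inception', 'The Dark Knight', 'Interstellar', 'Parasite', 'Avengers',
              'Joker', 'The Matrix', 'Avatar', 'John Wick', 'Mad Max'],
    'GENRE': ['Sci-Fi', 'Action', 'Sci-Fi', 'Drama', 'Action',
              'Drama', 'Sci-Fi', 'Sci-Fi', 'Action', 'Action'],
    'RATING': [8.8, 9.0, 8.6, 8.5, 8.0, 8.4, 8.7, 7.8, 7.4, 8.1],
    'RELEASE_YEAR': [2010, 2008, 2014, 2019, 2012, 2019, 1999, 2009, 2014, 2015],
})

# A comparison on a column produces a boolean Series (the "mask"),
# one True/False value per row.
mask = movies['RATING'] > 8.5
print(mask.tolist())

# .loc[mask, 'TITLE'] keeps the rows where the mask is True and only the TITLE column.
action = movies.loc[movies['GENRE'] == 'Action', 'TITLE'].tolist()
top_rated = movies.loc[mask, 'TITLE'].tolist()
recent = movies.loc[movies['RELEASE_YEAR'] > 2015, 'TITLE'].tolist()

print(action)     # ['The Dark Knight', 'Avengers', 'John Wick', 'Mad Max']
print(top_rated)  # ['Inception', 'The Dark Knight', 'Interstellar', 'The Matrix']
print(recent)     # ['Parasite', 'Joker']
```

Parasite (rated exactly 8.5) and Mad Max (released in 2015) fall on the boundaries and are left out because both comparisons are strict.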

Q2: Viva Preparation

Max Marks: 5

Potential Viva Questions

Q: How do you filter multiple conditions in Pandas?
A: Combine the conditions with the element-wise operators & (AND) and | (OR), wrapping each condition in its own parentheses, e.g., df[(df['GENRE'] == 'Action') & (df['RATING'] > 8.0)].
Q: What is the difference between head() and tail()?
A: head() returns the first 5 rows and tail() the last 5 rows of the DataFrame by default. Pass a number, e.g., head(10), to change the count.
Q: How can you find the mean rating of all movies?
A: Use df['RATING'].mean().
Q: What does the shape attribute return?
A: A tuple representing the dimensions of the DataFrame (rows, columns).
Q: How do you delete a column from a DataFrame?
A: Use df.drop(columns=['column_name']). This returns a new DataFrame, so assign the result back to df to keep the change.
Q: What is the purpose of reset_index()?
A: It converts the current index into a regular column and resets the row labels to consecutive integers starting from zero. Pass drop=True to discard the old index instead of keeping it as a column.
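The answers above can be tried out together in one short session. This is a minimal sketch on an assumed four-movie sample; all variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'TITLE': ['Inception', 'The Dark Knight', 'Avengers', 'John Wick'],
    'GENRE': ['Sci-Fi', 'Action', 'Action', 'Action'],
    'RATING': [8.8, 9.0, 8.0, 7.4],
})

# Multiple conditions: parenthesize each comparison, then join with & or |.
good_action = df[(df['GENRE'] == 'Action') & (df['RATING'] > 8.0)]
print(good_action['TITLE'].tolist())  # ['The Dark Knight']

# head() and tail() take an optional row count (default 5).
first_two = df.head(2)
last_one = df.tail(1)

# Mean of a numeric column, rounded for display.
mean_rating = round(df['RATING'].mean(), 2)
print(mean_rating)

# shape is an attribute, not a method: (rows, columns).
print(df.shape)  # (4, 3)

# drop() returns a new DataFrame; df itself is unchanged.
without_genre = df.drop(columns=['GENRE'])
print(list(without_genre.columns), list(df.columns))

# Filtering keeps the original row labels; reset_index() renumbers them.
action_only = df[df['GENRE'] == 'Action']           # labels 1, 2, 3
renumbered = action_only.reset_index(drop=True)     # labels 0, 1, 2, old index discarded
kept = action_only.reset_index()                    # old labels kept in an 'index' column
print(list(renumbered.index), list(kept.columns))
```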

Common Pitfalls

Case Sensitivity: Filtering for "Action" will not find "action". Ensure your data casing is consistent.
Header Row: When loading a CSV, Pandas expects the first row to be headers. If your CSV lacks headers, you must specify header=None.
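Both pitfalls are easy to reproduce. The sketch below assumes a tiny sample with inconsistent casing and a headerless CSV held in a string; the column names passed to names= are chosen for this illustration.

```python
import io
import pandas as pd

# Pitfall 1: string comparison is case-sensitive.
df = pd.DataFrame({
    'TITLE': ['John Wick', 'Mad Max'],
    'GENRE': ['Action', 'action'],
})
exact = df[df['GENRE'] == 'Action']                   # matches only 'Action'
normalized = df[df['GENRE'].str.lower() == 'action']  # matches both spellings
print(len(exact), len(normalized))  # 1 2

# Pitfall 2: a CSV without a header row.
raw = "101,Inception,8.8\n102,The Dark Knight,9.0\n"

# By default the first data row is consumed as the header.
wrong = pd.read_csv(io.StringIO(raw))
print(list(wrong.columns), len(wrong))

# header=None keeps every row as data; names= supplies the column labels.
right = pd.read_csv(
    io.StringIO(raw),
    header=None,
    names=['MOVIE_ID', 'TITLE', 'RATING'],
)
print(list(right.columns), len(right))
```

Normalizing the column once, for example with df['GENRE'] = df['GENRE'].str.title(), is an alternative when the same column is filtered many times.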


Last Updated: April 2025
