VNSGU BCA Sem 2: Programming Skills - Advanced (204_02) Practical Solutions - Set C
Paper Details
- Subject: Programming Skills - Advanced (PKUP)
- Subject Code: 204_02
- Set: C
- Semester: 2
- Month/Year: April 2025
- Max Marks: 25
- Time Recommendation: 45 Minutes
Questions & Solutions
All questions are compulsory
Q1A: Movie Data Pipeline
Max Marks: 20
Using Python, perform the following tasks for MOVIE data management:
1. Create a dictionary (or a set of sets) containing movie details with the following fields: MOVIE_ID, TITLE, GENRE, DIRECTOR, RATING, RELEASE_YEAR. Add at least 10 movie records.
2. Convert the dictionary into a pandas Data Frame.
3. Save the Data Frame as a CSV file named movie_data.csv.
4. Load the CSV file and display:
a) All movies in the "Action" genre
b) Movies with a rating above 8.5
c) Movies released after 2015
1. Dictionary & DataFrame Creation
Initialize the movie dataset and transform it into a Pandas DataFrame.
Hint
A dictionary in which each key is a column name and each value is a list of that column's values maps directly onto a DataFrame. Pass it to pd.DataFrame() for the conversion; all lists must have the same length.
flowchart TD
dict[Movie Dictionary]
df[Pandas DataFrame]
conv[pd.DataFrame Conversion]
dict --> conv
conv --> df
View Solution & Output
import pandas as pd
# [1] Create dictionary with movie details
movie_dict = {
    'MOVIE_ID': [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    'TITLE': ['Inception', 'The Dark Knight', 'Interstellar', 'Parasite', 'Avengers', 'Joker', 'The Matrix', 'Avatar', 'John Wick', 'Mad Max'],
    'GENRE': ['Sci-Fi', 'Action', 'Sci-Fi', 'Drama', 'Action', 'Drama', 'Sci-Fi', 'Sci-Fi', 'Action', 'Action'],
    'DIRECTOR': ['Nolan', 'Nolan', 'Nolan', 'Bong Joon-ho', 'Whedon', 'Phillips', 'Wachowskis', 'Cameron', 'Stahelski', 'Miller'],
    'RATING': [8.8, 9.0, 8.6, 8.5, 8.0, 8.4, 8.7, 7.8, 7.4, 8.1],
    'RELEASE_YEAR': [2010, 2008, 2014, 2019, 2012, 2019, 1999, 2009, 2014, 2015]
}
# [2] Convert dictionary into pandas DataFrame
df = pd.DataFrame(movie_dict)
print("DataFrame successfully initialized.")
print(df.head())
Step-by-Step Explanation:
1. Initialization: Define the movie_dict with 10 records and import pandas to manage the tabular data.
2. Logic Flow: Use pd.DataFrame() to structure the dictionary data into a clean, queryable format stored in df.
3. Completion: Display the first few records to confirm the initialization was successful.
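The question also allows the records to be stored in another layout. As a minimal sketch, assuming a list with one dictionary per movie (the names by_column and by_record are illustrative, not part of the paper), the following shows that pd.DataFrame() accepts both layouts and builds the same table:

```python
import pandas as pd

# Column-oriented: each key is a column name, each value is a list of cells.
by_column = {
    'MOVIE_ID': [101, 102],
    'TITLE': ['Inception', 'The Dark Knight'],
    'RATING': [8.8, 9.0],
}

# Record-oriented: one dictionary per movie (illustrative alternative layout).
by_record = [
    {'MOVIE_ID': 101, 'TITLE': 'Inception', 'RATING': 8.8},
    {'MOVIE_ID': 102, 'TITLE': 'The Dark Knight', 'RATING': 9.0},
]

df_columns = pd.DataFrame(by_column)
df_records = pd.DataFrame(by_record)

# Compare the two tables and show the (rows, columns) dimensions.
print(df_columns.equals(df_records))
print(df_columns.shape)
```

The column-oriented form is shorter to type for a fixed set of fields, while the record-oriented form is easier to extend one movie at a time.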
2. File Storage (CSV)
Persist the movie records into a CSV file for long-term storage.
Hint
Use df.to_csv('filename.csv', index=False). Disabling the index ensures the CSV doesn't have an extra column for row numbers.
flowchart TD
df[Pandas DataFrame]
save[Save to CSV]
file[movie_data.csv]
df --> save
save --> file
View Solution & Output
# [3] Save DataFrame as CSV file
df.to_csv('movie_data.csv', index=False)
print("File 'movie_data.csv' created successfully.")
Step-by-Step Explanation:
1. Initialization: Identify the DataFrame df containing the movie information.
2. Logic Flow: Execute to_csv() with index=False to write the data to 'movie_data.csv'.
3. Completion: A success message confirms that the file has been created on the system.
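To see what index=False changes, the sketch below writes a two-row DataFrame to CSV text in memory instead of to a file. The names df_small, with_index, and without_index are illustrative, and io.StringIO stands in for a file on disk:

```python
import io
import pandas as pd

df_small = pd.DataFrame({
    'MOVIE_ID': [101, 102],
    'TITLE': ['Inception', 'The Dark Knight'],
})

# With no path argument, to_csv() returns the CSV text as a string.
with_index = df_small.to_csv()
without_index = df_small.to_csv(index=False)

# The first header starts with an empty field that holds the row labels.
print(with_index.splitlines()[0])
print(without_index.splitlines()[0])

# Reading the indexed version back produces an extra unnamed column.
reloaded = pd.read_csv(io.StringIO(with_index))
print(list(reloaded.columns))
```

If a file was already saved with the index, reading it with pd.read_csv(path, index_col=0) restores the row labels instead of creating the extra column.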
3. Data Loading & Filtering
Load the movie data from the CSV file and apply the three filters the question asks for.
Hint
Use boolean indexing to filter the rows. Example: df[df['RATING'] > 8.5].
flowchart TD
load[Load read_csv]
filter[Apply Filters]
show[Show Results]
load --> filter
filter --> show
View Solution & Output
# [4] Load CSV file and perform queries
load_df = pd.read_csv('movie_data.csv')
# Query 1: Action Genre
print("\n[a] Action Movies:")
print(load_df[load_df['GENRE'] == 'Action'])
# Query 2: Rating > 8.5
print("\n[b] Top Rated Movies (Rating > 8.5):")
print(load_df[load_df['RATING'] > 8.5])
# Query 3: Release Year after 2015
print("\n[c] Recent Movies (After 2015):")
print(load_df[load_df['RELEASE_YEAR'] > 2015])
Step-by-Step Explanation:
1. Initialization: Load the data from movie_data.csv back into Python using pd.read_csv().
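2. Logic Flow: Use comparison operators to build boolean masks: GENRE equal to 'Action', RATING strictly greater than 8.5, and RELEASE_YEAR strictly greater than 2015. Records that sit exactly on a boundary (a rating of 8.5 or a 2015 release) are excluded.
3. Completion: Print each filtered DataFrame under its own label so the three results can be checked separately.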
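The wording "above 8.5" and "after 2015" calls for strict comparisons. A small sketch, using a four-row subset of the dataset (the variable names are illustrative), shows how > and >= differ on boundary values:

```python
import pandas as pd

sample = pd.DataFrame({
    'TITLE': ['Inception', 'Parasite', 'Joker', 'Mad Max'],
    'RATING': [8.8, 8.5, 8.4, 8.1],
    'RELEASE_YEAR': [2010, 2019, 2019, 2015],
})

# Strict comparison: a rating of exactly 8.5 does not qualify.
above = sample[sample['RATING'] > 8.5]
# Inclusive comparison: a rating of exactly 8.5 qualifies.
at_least = sample[sample['RATING'] >= 8.5]
# Strict comparison on the year: a 2015 release does not qualify.
after_2015 = sample[sample['RELEASE_YEAR'] > 2015]

print(above['TITLE'].tolist())
print(at_least['TITLE'].tolist())
print(after_2015['TITLE'].tolist())
```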
Q2: Viva Preparation
Max Marks: 5
Potential Viva Questions
- Q: How do you filter multiple conditions in Pandas?
- A: Use the bitwise operators & (AND) and | (OR) inside the brackets, with each condition wrapped in parentheses, e.g., df[(df['GENRE'] == 'Action') & (df['RATING'] > 8.0)].
- Q: What is the difference between head() and tail()?
- A: By default, head() shows the first 5 rows, while tail() shows the last 5 rows of the DataFrame.
- Q: How can you find the mean rating of all movies?
- A: Use df['RATING'].mean().
- Q: What does the shape attribute return?
- A: A tuple representing the dimensions of the DataFrame (rows, columns).
- Q: How do you delete a column from a DataFrame?
- A: Use df.drop(columns=['column_name']). It returns a new DataFrame and leaves the original unchanged.
- Q: What is the purpose of reset_index()?
- A: By default, it converts the current index into a regular column and resets the row labels to consecutive integers starting from zero.
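The answers above can be tried out on a small table. This is a sketch only; the four-row movies DataFrame and the variable names are illustrative:

```python
import pandas as pd

movies = pd.DataFrame({
    'TITLE': ['Inception', 'The Dark Knight', 'Avengers', 'John Wick'],
    'GENRE': ['Sci-Fi', 'Action', 'Action', 'Action'],
    'RATING': [8.8, 9.0, 8.0, 7.4],
})

# Multiple conditions: parenthesize each comparison and join them with &.
strong_action = movies[(movies['GENRE'] == 'Action') & (movies['RATING'] > 8.0)]
print(strong_action['TITLE'].tolist())

# head(n) and tail(n) accept a row count; the default is 5.
print(movies.head(2))
print(movies.tail(2))

# Mean of one column and the (rows, columns) tuple.
print(movies['RATING'].mean())
print(movies.shape)

# drop() returns a new DataFrame; movies itself keeps the GENRE column.
without_genre = movies.drop(columns=['GENRE'])
print(list(without_genre.columns))

# Filtering keeps the original row labels; reset_index() moves them into
# a column named 'index' and renumbers the rows from zero.
renumbered = strong_action.reset_index()
print(list(strong_action.index), list(renumbered.index))
print(list(renumbered.columns))
```

Passing drop=True to reset_index() discards the old labels instead of keeping them as a column.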
Common Pitfalls
- Case Sensitivity: Filtering for "Action" will not find "action". Ensure your data casing is consistent.
- Header Row: When loading a CSV, Pandas expects the first row to be headers. If your CSV lacks headers, you must specify header=None.
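Both pitfalls can be reproduced in a few lines. The sketch below assumes a GENRE column with inconsistent casing and a headerless CSV held in a string; the data and the names mixed, raw, and no_header are made up for illustration:

```python
import io
import pandas as pd

# Pitfall 1: string comparison is case-sensitive.
mixed = pd.DataFrame({
    'TITLE': ['John Wick', 'Mad Max'],
    'GENRE': ['Action', 'action'],
})
exact = mixed[mixed['GENRE'] == 'Action']
# Lower-casing the column before comparing matches both spellings.
normalized = mixed[mixed['GENRE'].str.lower() == 'action']
print(len(exact), len(normalized))

# Pitfall 2: a CSV without a header row.
raw = "109,John Wick\n110,Mad Max\n"

# Default behaviour: the first data row is consumed as column names.
wrong = pd.read_csv(io.StringIO(raw))
print(wrong.shape)

# header=None keeps every row as data; names= supplies the column labels.
no_header = pd.read_csv(io.StringIO(raw), header=None, names=['MOVIE_ID', 'TITLE'])
print(no_header.shape)
```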
Last Updated: April 2025