VNSGU BCA Sem 2: Programming Skills - Advanced (204_02) Practical Solutions - Set F¶

Paper Details

Subject: Programming Skills - Advanced (PKUP)
Subject Code: 204_02
Set: F
Semester: 2
Month/Year: April 2025
Max Marks: 25
Time Recommendation: 45 Minutes
Paper: View Paper | Download PDF

Questions & Solutions¶

All questions are compulsory¶

Q1A: Employee Data Pipeline¶

Max Marks: 20

Using Python, perform the following tasks for EMPLOYEE data management: 1. Create a dictionary (or a set of sets) containing details of employees with the following keys/fields: EMP_ID, NAME, DEPARTMENT, SALARY, JOIN_YEAR. Add at least 10 employee records. 2. Convert the dictionary into a pandas DataFrame. 3. Save the DataFrame as a CSV file named employee_data.csv. 4. Load the CSV file and display: a) All employees from the "IT" department b) All employees with salary greater than 60,000 c) Employees who joined after 2020

1. Dictionary & DataFrame Creation¶

Initialize the employee database and convert it into a Pandas DataFrame.

Hint

Construct a dictionary where keys are your column names and values are lists of 10 items. Then use pd.DataFrame() to structure it.

flowchart TD
dict[Employee Dictionary]
df[Pandas DataFrame]
conv[pd.DataFrame Conversion]

dict --> conv
conv --> df

View Solution & Output

import pandas as pd

# [1] Create dictionary with employee details
employee_dict = {
    'EMP_ID': [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010],
    'NAME': ['Raj', 'Simran', 'Amit', 'Priya', 'Karan', 'Neha', 'Vikram', 'Sneha', 'Rahul', 'Anjali'],
    'DEPARTMENT': ['IT', 'HR', 'IT', 'Sales', 'Finance', 'IT', 'Marketing', 'Sales', 'IT', 'HR'],
    'SALARY': [75000, 55000, 80000, 45000, 65000, 70000, 60000, 48000, 90000, 52000],
    'JOIN_YEAR': [2019, 2021, 2018, 2022, 2020, 2021, 2019, 2023, 2017, 2022]
}

# [2] Convert dictionary into pandas DataFrame
df = pd.DataFrame(employee_dict)
print("DataFrame successfully initialized.")
print(df.head())

Step-by-Step Explanation: 1. Initialization: Import the pandas library as pd and define an employee_dict dictionary with 10 records. 2. Logic Flow: Convert the dictionary into a tabular DataFrame df using the pd.DataFrame() method. 3. Completion: Display the first few rows of the DataFrame using head() to confirm the successful initialization.

2. File Storage (CSV)¶

Export the employee records to a persistent CSV file.

Hint

Use df.to_csv('employee_data.csv', index=False).

flowchart TD
df[Pandas DataFrame]
save[Save to CSV]
file[employee_data.csv]

df --> save
save --> file

View Solution & Output

# [3] Save DataFrame as CSV file
df.to_csv('employee_data.csv', index=False)
print("File 'employee_data.csv' created successfully.")

Step-by-Step Explanation: 1. Initialization: Utilize the existing df DataFrame. 2. Logic Flow: Apply the to_csv() method with index=False to export data. 3. Completion: Print a success message confirming the creation of 'employee_data.csv'.

3. Data Loading & Analytics¶

Load the CSV and perform targeted queries.

Hint

Use filtering like df[df['DEPARTMENT'] == 'IT'].

flowchart TD
load[Load read_csv]
filter[Apply Filters]
show[Show Results]

load --> filter
filter --> show

View Solution & Output

# [4] Load CSV file and perform queries
load_df = pd.read_csv('employee_data.csv')

# Query 1: IT Department
print("\n[a] IT Department Employees:")
print(load_df[load_df['DEPARTMENT'] == 'IT'])

# Query 2: Salary > 60,000
print("\n[b] High Salary Employees (> 60,000):")
print(load_df[load_df['SALARY'] > 60000])

# Query 3: Join Year > 2020
print("\n[c] Recent Joiners (After 2020):")
print(load_df[load_df['JOIN_YEAR'] > 2020])

Step-by-Step Explanation: 1. Initialization: Load the data from 'employee_data.csv' using pd.read_csv(). 2. Logic Flow: Use Boolean indexing to filter the DataFrame for "IT" department, high salaries, and recent joiners. 3. Completion: Display the filtered results for each specific query.

Q2: Viva Preparation¶

Max Marks: 5

Potential Viva Questions

Q: How can you check for duplicate records in a DataFrame?
A: Use df.duplicated().sum() to count duplicates or df.drop_duplicates() to remove them.
Q: What is the default index in a DataFrame?
A: A RangeIndex starting from 0 and incrementing by 1 for each row.
Q: How do you change the value of a specific cell?
A: Use .at or .loc, e.g., df.at[0, 'SALARY'] = 80000.
Q: What is the difference between read_csv and to_csv?
A: read_csv is used to load data from a file into a DataFrame, while to_csv is used to export a DataFrame into a file.
Q: How do you find the total salary expense for a department?
A: Use df.groupby('DEPARTMENT')['SALARY'].sum().
Q: Can you perform SQL-like joins in Pandas?
A: Yes, using the pd.merge() function.

Common Pitfalls

Column Naming: Ensure the keys in your dictionary match the columns you use for filtering (e.g., JOIN_YEAR vs join_year).
Path Resolution: When loading the CSV, ensure the script is running in the same directory where the CSV was saved.

Set	Link
Set A	Solutions
Set B	Solutions
Set C	Solutions
Set D	Solutions
Set E	Solutions
Set F	Current Page

Last Updated: April 2025

VNSGU BCA Sem 2: Programming Skills - Advanced (204_02) Practical Solutions - Set F¶

Questions & Solutions¶

All questions are compulsory¶

Q1A: Employee Data Pipeline¶

1. Dictionary & DataFrame Creation¶

2. File Storage (CSV)¶

3. Data Loading & Analytics¶

Q2: Viva Preparation¶

Quick Navigation¶

Related Solutions¶