VNSGU BCA Sem 2: Programming Skills - Advanced (204_02) Practical Solutions - Set F¶
Paper Details
- Subject: Programming Skills - Advanced (PKUP)
- Subject Code: 204_02
- Set: F
- Semester: 2
- Month/Year: April 2025
- Max Marks: 25
- Time Recommendation: 45 Minutes
- Paper: View Paper | Download PDF
Questions & Solutions¶
All questions are compulsory¶
Q1A: Employee Data Pipeline¶
Max Marks: 20
Using Python, perform the following tasks for EMPLOYEE data management:
1. Create a dictionary (or a set of sets) containing details of employees with the following keys/fields: EMP_ID, NAME, DEPARTMENT, SALARY, JOIN_YEAR. Add at least 10 employee records.
2. Convert the dictionary into a pandas DataFrame.
3. Save the DataFrame as a CSV file named employee_data.csv.
4. Load the CSV file and display:
a) All employees from the "IT" department
b) All employees with salary greater than 60,000
c) Employees who joined after 2020
1. Dictionary & DataFrame Creation¶
Initialize the employee database and convert it into a Pandas DataFrame.
Hint
Construct a dictionary where keys are your column names and values are lists of 10 items. Then use pd.DataFrame() to structure it.
flowchart TD
dict[Employee Dictionary]
df[Pandas DataFrame]
conv[pd.DataFrame Conversion]
dict --> conv
conv --> df
View Solution & Output
import pandas as pd
# [1] Create dictionary with employee details
employee_dict = {
'EMP_ID': [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010],
'NAME': ['Raj', 'Simran', 'Amit', 'Priya', 'Karan', 'Neha', 'Vikram', 'Sneha', 'Rahul', 'Anjali'],
'DEPARTMENT': ['IT', 'HR', 'IT', 'Sales', 'Finance', 'IT', 'Marketing', 'Sales', 'IT', 'HR'],
'SALARY': [75000, 55000, 80000, 45000, 65000, 70000, 60000, 48000, 90000, 52000],
'JOIN_YEAR': [2019, 2021, 2018, 2022, 2020, 2021, 2019, 2023, 2017, 2022]
}
# [2] Convert dictionary into pandas DataFrame
df = pd.DataFrame(employee_dict)
print("DataFrame successfully initialized.")
print(df.head())
Step-by-Step Explanation:
1. Initialization: Import the pandas library as pd and define an employee_dict dictionary with 10 records.
2. Logic Flow: Convert the dictionary into a tabular DataFrame df using the pd.DataFrame() method.
3. Completion: Display the first few rows of the DataFrame using head() to confirm the successful initialization.
2. File Storage (CSV)¶
Export the employee records to a persistent CSV file.
Hint
Use df.to_csv('employee_data.csv', index=False).
flowchart TD
df[Pandas DataFrame]
save[Save to CSV]
file[employee_data.csv]
df --> save
save --> file
View Solution & Output
# [3] Save DataFrame as CSV file
df.to_csv('employee_data.csv', index=False)
print("File 'employee_data.csv' created successfully.")
Step-by-Step Explanation:
1. Initialization: Utilize the existing df DataFrame.
2. Logic Flow: Apply the to_csv() method with index=False to export data.
3. Completion: Print a success message confirming the creation of 'employee_data.csv'.
3. Data Loading & Analytics¶
Load the CSV and perform targeted queries.
Hint
Use filtering like df[df['DEPARTMENT'] == 'IT'].
flowchart TD
load[Load read_csv]
filter[Apply Filters]
show[Show Results]
load --> filter
filter --> show
View Solution & Output
# [4] Load CSV file and perform queries
load_df = pd.read_csv('employee_data.csv')
# Query 1: IT Department
print("\n[a] IT Department Employees:")
print(load_df[load_df['DEPARTMENT'] == 'IT'])
# Query 2: Salary > 60,000
print("\n[b] High Salary Employees (> 60,000):")
print(load_df[load_df['SALARY'] > 60000])
# Query 3: Join Year > 2020
print("\n[c] Recent Joiners (After 2020):")
print(load_df[load_df['JOIN_YEAR'] > 2020])
Step-by-Step Explanation:
1. Initialization: Load the data from 'employee_data.csv' using pd.read_csv().
2. Logic Flow: Use Boolean indexing to filter the DataFrame for "IT" department, high salaries, and recent joiners.
3. Completion: Display the filtered results for each specific query.
Q2: Viva Preparation¶
Max Marks: 5
Potential Viva Questions
- Q: How can you check for duplicate records in a DataFrame?
- A: Use
df.duplicated().sum()to count duplicates ordf.drop_duplicates()to remove them. - Q: What is the default index in a DataFrame?
- A: A RangeIndex starting from 0 and incrementing by 1 for each row.
- Q: How do you change the value of a specific cell?
- A: Use
.ator.loc, e.g.,df.at[0, 'SALARY'] = 80000. - Q: What is the difference between
read_csvandto_csv? - A:
read_csvis used to load data from a file into a DataFrame, whileto_csvis used to export a DataFrame into a file. - Q: How do you find the total salary expense for a department?
- A: Use
df.groupby('DEPARTMENT')['SALARY'].sum(). - Q: Can you perform SQL-like joins in Pandas?
- A: Yes, using the
pd.merge()function.
Common Pitfalls
- Column Naming: Ensure the keys in your dictionary match the columns you use for filtering (e.g.,
JOIN_YEARvsjoin_year). - Path Resolution: When loading the CSV, ensure the script is running in the same directory where the CSV was saved.
Quick Navigation¶
Related Solutions¶
| Set | Link |
|---|---|
| Set A | Solutions |
| Set B | Solutions |
| Set C | Solutions |
| Set D | Solutions |
| Set E | Solutions |
| Set F | Current Page |
Last Updated: April 2025