VNSGU BCA Sem 2: Data Analysis Using Python (205_04) Practical Solutions - Set B
Paper Details
- Subject: Data Analysis Using Python (DAUP)
- Subject Code: 205_04
- Set: B
- Semester: 2
- Month/Year: April 2025
- Max Marks: 25
- Time Recommendation: 45 Minutes
Questions & Solutions
All questions are compulsory
Q1: Employee Data Analysis Pipeline
Max Marks: 20
Create a table EMPLOYEE in SQLite with the following columns: EMP_ID, EMP_NAME, DEPARTMENT, SALARY, JOIN_DATE using Python.
1. Insert 10 records into the table.
2. Display records where the salary is greater than 50,000.
3. Export employee data and convert it into a Data Frame.
4. Convert the Data Frame into employee.csv file and save it.
1. SQLite Table Creation & Data Entry
Initialize the database and populate the employee records.
Hint
Use sqlite3.connect() to create the database file. Ensure EMP_ID is defined as PRIMARY KEY.
flowchart TD
start[Start]
conn[Connect to office.db]
tbl[Create EMPLOYEE Table]
ins[Bulk Insert 10 Records]
done[Commit & Close]
start --> conn
conn --> tbl
tbl --> ins
ins --> done
View Solution & Output
import sqlite3
# [1] Create database and table
conn = sqlite3.connect('office.db')
cursor = conn.cursor()
cursor.execute('''
CREATE TABLE IF NOT EXISTS EMPLOYEE (
EMP_ID INTEGER PRIMARY KEY,
EMP_NAME TEXT,
DEPARTMENT TEXT,
SALARY REAL,
JOIN_DATE TEXT
)
''')
# Insert 10 records
employee_data = [
(101, 'Amit', 'IT', 55000, '2022-01-10'),
(102, 'Priya', 'HR', 45000, '2021-03-15'),
(103, 'Rahul', 'Sales', 60000, '2023-05-20'),
(104, 'Sia', 'IT', 72000, '2020-11-05'),
(105, 'Aryan', 'Finance', 48000, '2022-08-12'),
(106, 'Neha', 'HR', 52000, '2021-07-18'),
(107, 'Vikram', 'IT', 65000, '2019-09-30'),
(108, 'Sneha', 'Marketing', 42000, '2023-02-14'),
(109, 'Karan', 'Sales', 58000, '2021-03-22'),
(110, 'Mira', 'IT', 49000, '2022-12-01')
]
cursor.executemany('INSERT INTO EMPLOYEE VALUES (?,?,?,?,?)', employee_data)
conn.commit()
print("Table created and 10 records inserted.")
Step-by-Step Explanation:
1. Initialization: Import sqlite3 and connect to 'office.db'.
2. Logic Flow: Create the EMPLOYEE table and use executemany() to insert 10 sample records.
3. Completion: Commit the transaction to save data to the disk.
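As a variant on the solution above, the connection object itself can act as a context manager so the transaction is committed automatically on success (and rolled back on error). A minimal sketch, using an in-memory database and a trimmed two-column table so it leaves no file behind:

```python
import sqlite3

# ':memory:' keeps this sketch self-contained; the real solution uses 'office.db'
conn = sqlite3.connect(':memory:')

with conn:  # commits automatically if the block succeeds, rolls back on error
    conn.execute('CREATE TABLE EMPLOYEE (EMP_ID INTEGER PRIMARY KEY, EMP_NAME TEXT)')
    conn.execute('INSERT INTO EMPLOYEE VALUES (?, ?)', (101, 'Amit'))

count = conn.execute('SELECT COUNT(*) FROM EMPLOYEE').fetchone()[0]
print(count)  # 1
conn.close()
```

Note that the `with` block manages only the transaction; the connection must still be closed explicitly.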
2. High Salary Query
Fetch and filter employees based on salary criteria.
Hint
Use the SQL WHERE clause: SELECT * FROM EMPLOYEE WHERE SALARY > 50000.
flowchart TD
q[Query: SALARY > 50000]
f[Fetch Records]
p[Print Results]
q --> f
f --> p
View Solution & Output
# [2] Display records where salary > 50,000
cursor.execute('SELECT * FROM EMPLOYEE WHERE SALARY > 50000')
high_salary_emps = cursor.fetchall()
print("\nEmployees with Salary > 50,000:")
for row in high_salary_emps:
print(row)
Step-by-Step Explanation:
1. Initialization: Use the existing database cursor.
2. Logic Flow: Execute a SELECT query to filter employees with SALARY > 50000 and retrieve them using fetchall().
3. Completion: Iterate through the result set and print each record.
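The threshold can also be supplied as a `?` placeholder instead of being hard-coded into the SQL string, which keeps the query reusable and guards against injection. A small self-contained sketch, assuming an in-memory database with three sample rows:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE EMPLOYEE (EMP_ID INTEGER PRIMARY KEY, EMP_NAME TEXT, SALARY REAL)')
conn.executemany('INSERT INTO EMPLOYEE VALUES (?,?,?)',
                 [(101, 'Amit', 55000), (102, 'Priya', 45000), (103, 'Rahul', 60000)])

threshold = 50000  # value supplied at run time, not pasted into the SQL text
rows = conn.execute('SELECT EMP_NAME, SALARY FROM EMPLOYEE WHERE SALARY > ?',
                    (threshold,)).fetchall()
print(rows)  # [('Amit', 55000.0), ('Rahul', 60000.0)]
conn.close()
```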
3. Data Export to Pandas & CSV
Convert database records into a tabular format using Pandas and save to disk.
Hint
pd.read_sql_query() maps the table structure directly to a DataFrame, preserving column names.
flowchart TD
sql[Read from SQL]
df[Convert to Pandas DF]
csv[Save to employee.csv]
sql --> df
df --> csv
View Solution & Output
import pandas as pd
# [3] Export to Data Frame
df = pd.read_sql_query('SELECT * FROM EMPLOYEE', conn)
print("\nData Frame View:")
print(df.head())
# [4] Convert to CSV
df.to_csv('employee.csv', index=False)
print("\nFile 'employee.csv' saved successfully.")
conn.close()
Step-by-Step Explanation:
1. Initialization: Import pandas and use pd.read_sql_query() to fetch all employee data.
2. Logic Flow: Convert the SQL result set directly into a DataFrame and then use to_csv() to save it.
3. Completion: Print confirmation messages and close the database connection.
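A quick way to verify the export is a round trip: read the CSV back with `pd.read_csv()` and compare it to the original frame. A minimal sketch, using a hypothetical two-row frame standing in for the full employee table:

```python
import pandas as pd

# Hypothetical subset of the employee table
df = pd.DataFrame({'EMP_ID': [101, 102],
                   'EMP_NAME': ['Amit', 'Priya'],
                   'SALARY': [55000.0, 45000.0]})
df.to_csv('employee.csv', index=False)

# Round-trip check: the file should reload with identical columns and rows
df_back = pd.read_csv('employee.csv')
print(df_back.equals(df))  # True
```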
Concept Deep Dive: Databases vs DataFrames
SQLite is excellent for storing data safely and permanently on disk. Pandas DataFrames are best for analyzing and manipulating data in memory. This exercise demonstrates the common data pipeline: Store (SQL) -> Load (Pandas) -> Export (CSV).
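The pipeline also runs in reverse: `DataFrame.to_sql()` writes a frame back into a SQLite table, so data can move freely between disk storage and in-memory analysis. A minimal sketch with a hypothetical two-row frame and an in-memory database:

```python
import sqlite3
import pandas as pd

df = pd.DataFrame({'EMP_ID': [101, 102], 'EMP_NAME': ['Amit', 'Priya']})
conn = sqlite3.connect(':memory:')

# Store: write the DataFrame into a SQL table named EMPLOYEE
df.to_sql('EMPLOYEE', conn, index=False)

# Load: pull it back out as a DataFrame
loaded = pd.read_sql_query('SELECT * FROM EMPLOYEE', conn)
print(len(loaded))  # 2
conn.close()
```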
Q2: Viva Preparation
Max Marks: 5
Potential Viva Questions
- Q: How do you handle decimal numbers in SQLite?
- A: Use the `REAL` or `NUMERIC` data type.
- Q: What is the purpose of `import sqlite3`?
- A: It provides the API needed to interact with SQLite databases directly from Python scripts.
- Q: How do you read a CSV file back into Pandas?
- A: Use `pd.read_csv('filename.csv')`.
- Q: What is the difference between `executemany()` and calling `execute()` in a loop?
- A: `executemany()` is more efficient as it optimizes the database transaction for multiple records.
- Q: How do you find the highest salary in the SQLite table via Python?
- A: `cursor.execute('SELECT MAX(SALARY) FROM EMPLOYEE')`, then `cursor.fetchone()`.
- Q: What does `conn.rollback()` do?
- A: It undoes all changes made in the current transaction if an error occurs before committing.
Common Pitfalls
- Data Persistence: Changes are not saved to the `.db` file until you call `conn.commit()`.
- Header Mismatch: When exporting, ensure your SQL query selects all required columns to match your DataFrame expectations.
Related Solutions
| Set | Link |
|---|---|
| Set A | Solutions |
| Set B | Current Page |
| Set C | Solutions |
| Set D | Solutions |
| Set E | Solutions |
| Set F | Solutions |
Last Updated: April 2025