Skip to content

VNSGU BCA Sem 2: Data Analysis Using Python (205_04) Practical Solutions - Set C

Paper Details

  • Subject: Data Analysis Using Python (DAUP)
  • Subject Code: 205_04
  • Set: C
  • Semester: 2
  • Month/Year: April 2025
  • Max Marks: 25
  • Time Recommendation: 45 Minutes
  • Paper: View Paper | Download PDF

Questions & Solutions

All questions are compulsory

Q1: Order Data Analysis Pipeline

Max Marks: 20

Create a table ORDERS in SQLite with the following columns: ORDER_ID, CUSTOMER_NAME, PRODUCT_NAME, QUANTITY, TOTAL_PRICE, ORDER_DATE using Python. 1. Insert 10 records into the table. 2. Display records where the total price is greater than 500. 3. Export order data and convert it into a Data Frame. 4. Convert the Data Frame into orders.csv file and save it.

1. SQLite Table Creation & Data Entry

Set up the sales database and populate it with order history.

Hint

In SQLite, ORDER is a reserved keyword. Use double quotes "ORDERS" or a different table name to avoid syntax errors.

flowchart TD
start[Start]
conn[Connect to sales.db]
tbl[Create ORDERS Table]
ins[Bulk Insert 10 Records]
done[Commit & Close]

start --> conn
conn --> tbl
tbl --> ins
ins --> done
View Solution & Output
import sqlite3

# [1] Create database and table
conn = sqlite3.connect('sales.db')
cursor = conn.cursor()

cursor.execute('''
    CREATE TABLE IF NOT EXISTS ORDERS (
        ORDER_ID INTEGER PRIMARY KEY,
        CUSTOMER_NAME TEXT,
        PRODUCT_NAME TEXT,
        QUANTITY INTEGER,
        TOTAL_PRICE REAL,
        ORDER_DATE TEXT
    )
''')

# Insert 10 records
orders_data = [
    (101, 'Rahul', 'Laptop', 1, 55000, '2024-05-01'),
    (102, 'Priya', 'Mouse', 2, 800, '2024-05-02'),
    (103, 'Amit', 'Keyboard', 1, 1200, '2024-05-03'),
    (104, 'Sia', 'USB Hub', 1, 450, '2024-05-04'),
    (105, 'Aryan', 'Monitor', 1, 8500, '2024-05-05'),
    (106, 'Neha', 'Cables', 3, 300, '2024-05-06'),
    (107, 'Vikram', 'Printer', 1, 12000, '2024-05-07'),
    (108, 'Sneha', 'Webcam', 1, 2500, '2024-05-08'),
    (109, 'Karan', 'Pen Drive', 5, 1500, '2024-05-09'),
    (110, 'Mira', 'Earphones', 1, 499, '2024-05-10')
]

cursor.executemany('INSERT INTO ORDERS VALUES (?,?,?,?,?,?)', orders_data)
conn.commit()
print("Table created and 10 records inserted.")

Step-by-Step Explanation: 1. Initialization: Establish a connection to sales.db and create a cursor object for database operations. 2. Logic Flow: Execute a SQL command to create the ORDERS table and use executemany() to insert 10 sample records efficiently. 3. Completion: Commit the transaction to save changes and print a success message.

2. High Value Order Query

Identify and display orders exceeding 500 in total price.

Hint

Use the SQL WHERE clause: SELECT * FROM ORDERS WHERE TOTAL_PRICE > 500.

flowchart TD
q[Query: TOTAL_PRICE > 500]
f[Fetch All Rows]
p[Print Rows]

q --> f
f --> p
View Solution & Output
# [2] Display records where total price > 500
cursor.execute('SELECT * FROM ORDERS WHERE TOTAL_PRICE > 500')
high_value_orders = cursor.fetchall()

print("\nOrders with Total Price > 500:")
for row in high_value_orders:
    print(row)

Step-by-Step Explanation: 1. Initialization: Use the existing database cursor to prepare a selection query. 2. Logic Flow: Execute a SELECT statement with a WHERE clause to filter orders where TOTAL_PRICE exceeds 500. 3. Completion: Fetch all matching results and iterate through them to display each record.

3. Data Export to Pandas & CSV

Generate a portable CSV file for reporting using the Pandas library.

Hint

df.to_csv() saves the current state of the DataFrame to a text-based CSV format.

flowchart TD
sql[Read SQL Table]
df[Load into DataFrame]
csv[Save to orders.csv]

sql --> df
df --> csv
View Solution & Output
import pandas as pd

# [3] Export to Data Frame
df = pd.read_sql_query('SELECT * FROM ORDERS', conn)
print("\nData Frame View:")
print(df.head())

# [4] Convert to CSV
df.to_csv('orders.csv', index=False)
print("\nFile 'orders.csv' saved successfully.")

conn.close()

Step-by-Step Explanation: 1. Initialization: Import the pandas library to handle data manipulation and conversion. 2. Logic Flow: Use pd.read_sql_query() to pull all data from the ORDERS table directly into a DataFrame. 3. Completion: Save the DataFrame to a file named orders.csv using to_csv() and close the database connection.

Concept Deep Dive: Reserved Keywords

In SQL, words like ORDER, SELECT, and TABLE are "reserved" for commands. If you name your table ORDER, you might get syntax errors. It's best practice to use plural names like ORDERS or enclose them in double quotes "ORDER" to tell SQLite it's a name, not a command.

Q2: Viva Preparation

Max Marks: 5

Potential Viva Questions
  1. Q: How do you check the data types of columns in a DataFrame?
  2. A: Use df.dtypes.
  3. Q: What is the difference between read_sql and read_sql_query?
  4. A: read_sql is a wrapper around both read_sql_table and read_sql_query. read_sql_query is specifically for executing SQL strings.
  5. Q: How do you rename a column in Pandas?
  6. A: Use df.rename(columns={'OLD_NAME': 'NEW_NAME'}).
  7. Q: Why use REAL instead of INTEGER for total price?
  8. A: REAL supports floating-point (decimal) numbers, which are necessary for currency and prices.
  9. Q: How do you select rows with multiple conditions in SQL?
  10. A: Use the AND or OR keywords, e.g., SELECT * FROM ORDERS WHERE TOTAL_PRICE > 500 AND QUANTITY < 5.
  11. Q: What is the purpose of df.to_dict()?
  12. A: It converts the DataFrame back into a Python dictionary, which can be useful for integration with other scripts or APIs.

Common Pitfalls

  • Floating Point Comparison: When filtering prices like > 500, ensure the data is stored as REAL or FLOAT to handle decimals correctly.
  • CSV Path: Ensure you have write permissions in the folder where you are saving the CSV.

Quick Navigation

Set Link
Set A Solutions
Set B Solutions
Set C Current Page
Set D Solutions
Set E Solutions
Set F Solutions

Last Updated: April 2025