CSE/ISE 305
Spring 2002
Stony Brook
Principles of Database Systems
Annie Liu
Project 3
Handout P3
Mar. 18, 2002
Due Apr. 4

Querying and Updating Data.

In this project assignment, we will do queries and updates based on the database schema we created in Project 2. We plan to give a sample solution to Project 2 later; you may update your database schema based on ours if you wish.

What to do? Write SQL statements that do the following sequence of query and update operations. Your SQL statements should be able to handle any database instance, i.e., for any set of users, daily commutes, and other trips, etc.

  1. Delete all users, all daily commutes, all other trips, and all related information. That is, empty your the database.
  2. Insert the following users and related information:
    (u1, passwd1, Johnson, Mary, student resident, Steeley, A101, biology, freshman, 631-111-2222, mary@yahoo.com, 17, F, owns Honda Civic 2000, non smoker, share driving)
    (u2, passwd2, Smith, John, student commuter, East Avenue, Good Town, NY, 12345, Inter Street, computer science, senior, 111-222-2222, john@yahoo.com, 21, M, owns Ford Taurus 2001, non smoker, drive only, $80)
    (u3, passwd3, Brown, Bill, faculty, East Avenue, Good Town, NY, 12345, Inter Street, computer science, professor, 111-222-3333, bill@yahoo.com, 35, M, owns Toyota Camry 2001, non smoker, share driving, $120)
    (u4, passwd4, Kennedy, Lisa, staff, East Avenue, Good Town, NY, 12345, Inter Street, music, administrator, 111-222-4444, lisa@yahoo.com, 55, F, owns Chevy Cavalier 1999 and Toyota Celica 1995, non smoker, share driving, $100)
  3. Insert the following daily commutes:
    (u2, start 8:30am, end 5pm, within 40min earlier, within 30min later, 2002, spring)
    (u3, start 9am, end 4:30pm, within 60min earlier, within 60min later, 2002, spring)
    (u4, start 8:30am, end 4:30pm, within 30min earlier, within 30min later, 2002, spring)
  4. Insert the following other trips:
    (trip1, u1, 03/31/2002, visiting friends, from Steeley A101, to East Avenue Good Town NY 12345 at Inter Street, depart at 4:30pm, arrive at 5:30pm, within 30min earlier departure and 45min earlier arrival, ride offered, 2 passengers capacity, $15)
    (trip2, u2, 03/30/2002, festival, from East Avenue Good Town NY 12345 at Inter Street, to South Station Boston MA 01800 at South Station, depart at 9am, arrive at 3pm, within 20min earlier departure and 60min earlier arrival, ride offered, 3 passengers capacity, $20)
    (trip3, u2, 03/31/2002, meeting classmates after festival, from South Station Boston MA 01800 at South Station, to Steeley A102, depart at 10:30am, arrive at 4:20pm, within 60min earlier departure and 20min earlier arrival, ride offered, 3 passengers capacity, $20)
    (trip4, u4, 03/30/2002, festival, from East Avenue Good Town NY 12345 at Inter Street, to South Station Boston MA 01800 at South Station, depart at 9am, arrive at 3pm, within 60min earlier departure and 60min earlier arrival, ride offered, 3 passengers capacity, ride also requested, for 1 passenger, $25)
    (trip5, u4, 03/31/2002, back from festival, from South Station Boston MA 01800 at South Station, to East Avenue Good Town NY 12345 at Inter Street, depart at 11am, arrive at 6pm, within 60min earlier departure and 60min earlier arrival, ride offered, 3 passengers capacity, ride also requested, for 1 passenger, $25)
  5. Return all required information and optional information about a user (given login name, e.g., use u1).
  6. Return all information about the daily commute, if it exists, posted by a user (given login name, e.g., use u2).
  7. Return all information about all other trips, if any, posted by a user (given login name, e.g., use u4).
  8. Count the total numbers of users, students, faculty/staff, students living on campus, users living off campus, daily commutes, and other trips, respectively.
  9. Return all matching users (login names) for a daily commute (given login name, i.e., use u3), where matching means with the same residence address, within time constraints, and with compatible preference for driving or riding; sort the result first by cost compensation and then by the sum of minimum wait time, both in increasing order.
    Clarification: Given the daily commute of user A, suppose it matches the daily commute of user B.
    with compatible preference for driving or riding: if not both A and B prefer "drive only" and not both A and B prefer "ride only" (this interprets the preference of "share driving" as flexible).
    within time constrains: if the allowed range of arrival time of A (between the start time specified and the allowed minutes earlier arrival specified) overlaps with the allowed range of arrival time of B, and similarly for departure time.
    sum of minimum wait time for A with respect to B: the total time A must wait because of going with B for both starting and ending at school. For example, if we have
    A: start 9am, end 4:30pm, within 60min earlier, within 60min later
    B: start 8:30am, end 5pm, within 40min earlier, within 30min later
    then the sum of minimum wait time for A with respect to B is 1 hour (30min for starting plus 30min for ending). For example, for starting, B has range 7:50-8:30am, but A wants 9am, so the minimum between B and A is 30min. On the other hand, if B is given, and A matches it, then the sum of minimum wait time for B with respect to A is 0min.
  10. Return all matching trips (trip id) for a given trip (given trip id, e.g., use trip4), where matching means with the same date, same from and to addresses (Stony Brook campus considered the same address), within time constraints, and matching requests and offers; sort the result first by cost compensation and then by sum of minimum difference from the desired times, both in increasing order.
    Clarification: offers and requests are explicit here so matches are obvious; matching within time constraints is similar as for daily commute; the sum of minimum difference from the desired times is also similar as the sum of minimum wait time for daily commute. For example, if we have
    A: depart 9am, within 30min earlier, arrive 4:30pm, within 60min earlier
    B: depart 8:30am, within 60min earlier, arrive 4pm, within 40min earlier
    then the sum of minimum difference from the desired times for A with respect to B is 1 hour (30min for departure plus 30min for arrival). For example, for departure, B has range 7:30-8:30am, but A wants 9am, so the minimum between B and A is 30min. On the other hand, if B is given, and A matches it, then the sum of minimum difference from the desired times for B with respect to A is 0min.
  11. Return all pairs of two consecutive trips (trip ids) that together match a given trip (given trip id, e.g., use trip5). All other aspects are similar to the query above.
  12. Compute the average age of freshmen, seniors, graduate students, and faculty/staff, respectively.
  13. Return a list of pairs of the form (number of cars owned, number of people own that many cars), sorted by increasing order of the number of cars owned). For example, (1 100) (2 10) (3 1) means that 100 people own 1 car each, 10 people own 2 car each, and 1 person owns 3 cars.
  14. Return all users (login names) that live at the same residence address as a given user (given login name, e.g., use u2).
  15. Delete a daily commute of a user (given login name, e.g., use u2).
  16. Delete a trip of a user (given trip id, e.g., use trip1).
  17. Delete a user (given login name, e.g., use u3) and all postings by the user.
  18. Delete all obsolete postings of daily commutes and other trips. For daily commutes, assume that spring semester ends on April 31, summer ends on August 31, and Fall ends on December 31.

You need to run your SQL program to perform the queries and updates.

What to deliver? Hand in printout of the following items: your SQL code for the query and update operations (with comments for anything not obvious), the results (dump) of executing each operation, and the content of the tables at the end.

Team work: Continue working with the partner you had for Project 2; if you did not try to look for a partner for Project 2, you should do it now, for this project and the final project.

Each team should hand in only one copy of the assignment solution; include identification information for both team members.

Bonus: (5% extra credit) Is it possible to write an SQL query that finds all any-length consecutive trips that together match a given trip, regardless of the database content? Justify your answer.

(25% extra credit) As the bonus part in Project 2, copied below, plus the only additional requirement is that you run the queries for this assignment on the large data set as well and report results of the queries (when the results are too big, report only statistics about the results, including minimally the count of result sets); in particular, report any interesting performance observations.

A large group of random data is not required for this assignment, but you will get 25% extra credit if you can generate data for 1000 users, 1000 commuting requests, and 1000 other trip requests and hand in your results with this assignment (we don't care how you generated the data except that your code must be parametric, i.e., 1000 must be a parameter than one can change to any other reasonable number). Your results for this part should include the code that you use to generate the data and load the data into the database, statistics (including minimally the row count of each table) of your database obtained using appropriate SQL queries, together with the code for these SQL queries.

If you have done the bonus part of Project 2, you can improve it and say how you improved; this allows you to get full extra credit if you had not. You will also get 10% extra credit for running the queries for this assignment and reporting the results on the large data set.