Homework-#4-java

Setup
Update your repository to gain access to the files necessary for this assignment: BufferPool.java, Permissions.java, and an updated version of Database.java. This assignment does rely on a working version of assignment 1. Specifically, you will be reading and writing tuples, and manipulating HeapFiles and HeapPages. If you are concerned that you implementation for assignment 1 is not correct, please meet with the instructor as soon as possible to discuss your options.

Overview

In this assignment, you will implement locking that can be used with transactions. These transactions will use the two phase locking routine that we have discussed in class. These transactions are specifically expected to implement strict two phase locking, meaning that they will acquire all locks before performing any modifications on data. Locks should generally be kept until the transaction is complete (either aborted or committed), though it may be possible to release some locks earlier than that.

Note that you are not being tasked with creating transactions themselves. We are simply providing the methods that transactions would use to access data. It is then the transaction’s responsibility to use these methods properly.

BufferPool

The majority of the work for this assignment is in the BufferPool. A BufferPool can be thought of as an object that manages access to pages within our database. It will manage locks, make sure that a transaction has the necessary locks before reading or writing data, and manage the data access (reads and writes) to the data itself.

It does this by managing a cache of HeapPages. Any HeapPages that need to be read or written to will first be stored in this cache. If a page is modified, we will mark that page dirty so that we know that it is not currently the same as the data on disk. This is important, as if we were to modify the data directly on disk then it would be much more difficult to rollback a transaction upon abort.

Examine the methods in BufferPool and make sure you understand what their purpose is before you begin. Some implementation hints follow:

* You will need to decide on a structure for the page cache. We also need a way of marking whether pages are dirty or not, i.e. if they have been modified.

* You will also need to decide on a structure to manage what locks currently exist. All locks should be done on the page level. It is possible for more than one transaction to have a read (shared) lock on a page, but it is only possible for one write lock to exist for a page. Write locks cannot be acquired if any other locks already exist for a given page (read or write). It is possible (and may be convenient) to upgrade a lock from a read lock to a write lock. Typically this structure would be considered part of the database Catalog.

* In order for a page to be read or written to, it must be in the BufferPool cache. Therefore, it is possible for all lock acquisitions to take place in the getPage method. If a transaction is only reading the data, then this method is all that is necessary. If a transaction wishes to modify data, then it will call either insertTuple or deleteTuple which must verify that the given transaction has the write lock for the requested page before performing the operation. This write lock will have been acquired via a previous call to the getPage() method.

* Since space in the BufferPool cache is limited, you may have to sometimes evict pages when reading in a new page. We want to ensure that we do not evict any dirty pages, as these pages are currently being worked on by a transaction. Complete your evict() method such that it evicts the first non-dirty page that it can find. If no such page exists, throw an exception.

* It is possible that a transaction requests a lock that is not currently available. If this occurs, you should block, i.e. periodically try to acquire the lock again until it is recieved (or the transaction is aborted).

* If a transaction has completed successfully (commit), then the BufferPool should make sure that each of the pages modified by the transaction are written to disk. It should then mark each of those pages as clean (not dirty) so that they can be used by other transactions, if necessary. It should also release all locks held by the transaction. The flushPage and releaseLock methods exist for these purposes.

* If a transaction completes unsuccessfully (abort), then the BufferPool should discard any pages that were modified by that transaction. This is as simple as replacing the modified copy of the page in cache with a fresh copy of the page from disk.

* It may be tempting to remove the page from the BufferPool cache when it is flushed after a transaction. It is actually better to keep this page in cache in case another transaction wishes to use it.

* Please keep in mind that BufferPool is primarily in charge of monitoring data access. It does not actually perform the read/write operations that are requested, rather it should call methods in HeapPage or HeapFile to handle this (or perhaps the HeapFile methods should call BufferPool? This is an implementation decision that you will need to make.). Depending on your implementation, this means that you may not be calling those methods directly, rather, any data access could now go through the BufferPool (again this is an implementation decision). Depending on your implementation, this could require some modification to the code you wrote for previous homework assignments (specifically executing queries from assignment 2) but you are not required to make those changes for this assignment.

* Care is needed when inserting a tuple into a table. We do not know what page that tuple will be inserted into, since the HeapFile must first scan the pages and find an empty slot. Note, however, that the addTuple method returns the page that contains the inserted tuple. Therefore, you can use this method to insert the tuple, mark the resulting page as dirty, and insert that page into the buffer pool cache (if necessary) so that it can be properly flushed when the transaction is complete.

* It may be necessary to make modifications to HeapFile, HeapPage, and/or Catalog to complete this assignment. These modifications should be relatively small.

Deadlock

As discussed in class, it is possible for deadlock to occur when working with multiple transactions. We also discussed a few techniques that can be used to avoid deadlocks. You should choose a deadlock avoidance technique and implement it, so that if and when deadlock does occur, at least one of the transactions involved is allowed to proceed.

You may need to abort a transaction as a result of deadlock resolution. You may assume that this transaction will be manually restarted at a later time. In other words, you are not required to restart an aborted transaction.

As a hint, the simplest form of deadlock avoidance would likely involve some sort of timeout – if a lock cannot be acquired in a certain period of time then assume deadlock has occured and abort the transaction.

Testing

You have been provided some tests that you can use to check basic functionality of each of the objects. These tests are NOT comprehensive. Just because you pass the tests does not mean that you will get a perfect score on this assignment. Please feel free to write some of your own tests to check your implementation. Specifically, these tests only test single transactions running one after the other. It does not test multiple transactions running simultaneously or deadlock resolution.

Your tests will not be part of your grade. When we do grade your assignment, we will replace the existing tests with some more thorough tests and record the output.

Rubric

The below rubric is intended as a guideline and will be followed as closely as possible, however submissions that deviate too much from the specification may be graded using an alternate rubric.

Page tracking structure (10 points), Lock tracking structure (10 points), Deadlock resolution (10 points)

BufferPool: Constructor and instance variables (10 points), getPage (10points), releasePage (5 points), committing transactions (10 points), aborting transactions (10 points), holdsLock (5 points), insertTuple (5 points), deleteTuple (5 points), flushPage (5 points), evictPage(5 points)

Total: 100 points

Submission

To submit your homework, simply commit and push your code to your repository by the due date. Be sure to put your names at the top of each file. Any submissions that are committed and pushed after the due date will be discarded.

Acknowledgements

This lab is a heavily modified version of a lab provided from MIT Open Courseware. I am grateful to them for providing this service.

Sam Madden, Robert Morris, Michael Stonebraker, Carlo Curino, 6.830 / 6.814 Database Systems, Fall 2010. (MIT OpenCourseWare: Massachusetts Institute of Technology), https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-830-database-systems-fall-2010/index.htm (Accessed 9/11/17). License: Creative commons BY-NC-SA

Leave a Reply

Your email address will not be published. Required fields are marked *