Chapter 32
Persistence
by Eric Williams
Persistence in an object-oriented programming language
deals with the ability of objects to exist beyond the lifetime
of the program in which they were created. This chapter addresses
the topic of persistence from a number of perspectives.
First, it looks at what persistence is and what it means for Java
objects to be persistent. An overview of several forms of persistence
is presented.
Then the chapter delves into implementing file-based persistence,
a strategy in which the programmer does most of the work to store
objects persistently in a file. A Persistent
framework is also introduced to give developers a structure
in which to implement persistence in their own classes.
Finally, the chapter covers the subject of Persistent Java (PJava),
a research project at the University of Glasgow. This project's
stated goals include building a prototype persistent storage interface
for implementing orthogonal persistence in Java. An overview of
persistent stores is presented before the discussion of PJava.
Persistence describes something that exists beyond its
expected lifetime. As applied to an object-oriented programming
language, persistence describes objects that exist for an extended
period of time, often beyond the lifetime of the original program
that created the objects.
Object Lifetime
New Java programmers learn that objects have a lifetime.
An object begins its life when created by the new
operator (for example, new String("hi")).
After it is created, the object exists until destroyed by the
Java virtual machine's garbage collector. (An object can be garbage
collected only when the Java program no longer holds a reference
to the object.) Objects can also be destroyed implicitly, when
the Java program ends. This code snippet demonstrates the essential
concepts of Java object lifetimes:
{
    Date d = new Date(); // Date object starts its life
    System.out.println(d.toString());
}
// Date object is no longer reachable, and may be destroyed
In this example, a new Date
is created within a program block ({})
and stored in a variable (d)
local to that block. After reaching the ending curly brace (}),
the local variable d exists
no longer. From that moment, the Date
object that was created is no longer reachable and may be garbage
collected.
Persistence as Extending an Object's Lifetime
Persistence is a way to extend the lifetime of an object beyond
the lifetime of the program that created it. To understand why
it is useful to have persistent objects, consider an AddressBook
class that contains names, addresses, and telephone numbers:
public class AddressBook {
    public String[] names = null;
    public String[] addresses = null;
    public String[] phonenums = null;
}
A person writes information in an address book so that it is available
at a later date, when the information is needed. Most people are
unlikely to remember addresses and telephone numbers, so they
write that information into a book. If you try to use the AddressBook
class to represent a real address book, you will find that it
does not support the "save it now, use it later" paradigm.
All instances of the AddressBook
class are destroyed when the Java program ends.
To be useful, an AddressBook
object must exist for an extended period of time. It must be persistent
(probably for years). Every time the user looks up, adds, or modifies
address information, the AddressBook
object is needed. Because the program that uses the AddressBook
isn't always running, the AddressBook
must be preserved during the time the program is not running.
Persistence is usually implemented by preserving the state (attributes)
of an object between executions of the program. To preserve its
state, the object is converted to a sequence of bytes and stored
on a form of long-term media (usually a disk). When the object
is needed again, it is restored from the long-term media; the
restoration process creates a new Java object that is identical
to the original. Although the restored object is not "the
same object," its state and behavior are identical. (Object
identity in a persistent system is an important issue, and is
discussed in greater detail later in this chapter.) The following
example outlines an API for a helper class that might be used
to provide save and restore capabilities for AddressBook
objects:
class AddressBookHelper {
    public static void store(AddressBook book, File file) {...}
    public static AddressBook restore(File file) {...}
}
To save an AddressBook to
a file, you must explicitly write a few lines of code to store
the object. The code might look like the following:
File output = new File("address.book"); // persistent media
AddressBookHelper.store(addrBook, output);
Restoring an AddressBook
from a file would look similar:
File input = new File("address.book"); // persistent media
AddressBook addrBook = AddressBookHelper.restore(input);
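One way the helper might be implemented is sketched below. It is only an illustration, not the chapter's code: the file layout (an entry count followed by a name, address, and phone string for each entry) is an assumption, as is the use of DataOutputStream and DataInputStream from java.io, which are introduced later in the chapter. The sketch also assumes the three arrays are non-null and of equal length, and it repeats the AddressBook class so that it is self-contained.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class AddressBook {
    public String[] names = null;
    public String[] addresses = null;
    public String[] phonenums = null;
}

class AddressBookHelper {
    // Assumed layout: entry count, then (name, address, phone) per entry.
    public static void store(AddressBook book, File file) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
            out.writeInt(book.names.length);
            for (int i = 0; i < book.names.length; i++) {
                out.writeUTF(book.names[i]);
                out.writeUTF(book.addresses[i]);
                out.writeUTF(book.phonenums[i]);
            }
        }
    }

    // Builds a new AddressBook whose state matches what store() wrote.
    public static AddressBook restore(File file) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            int count = in.readInt();
            AddressBook book = new AddressBook();
            book.names = new String[count];
            book.addresses = new String[count];
            book.phonenums = new String[count];
            for (int i = 0; i < count; i++) {
                book.names[i] = in.readUTF();
                book.addresses[i] = in.readUTF();
                book.phonenums[i] = in.readUTF();
            }
            return book;
        }
    }
}
```

The restored AddressBook is a different Java object from the one that was stored, but it carries the same names, addresses, and telephone numbers.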
There are several forms of persistence available to Java programmers.
The forms discussed in this chapter include file-based persistence,
relational databases, and object databases. These forms of persistence
differ in several categories, including logical organization of
an object's state, the amount of work required of the application
programmer to support persistence, concurrent access to the persistent
object (from different processes), and support for transactional
commit and rollback semantics.
Files
Files are often used to store information between invocations
of a program. Data stored in a file can be simple (a text file)
or complex (a circuit diagram). In daily use of a computer, you
often interact with objects that are stored in files (word processing
documents, spreadsheets, network diagrams, and so on).
Files can be used as the basis for a persistence scheme in Java.
Although Java 1.0 does not support a built-in mechanism to store
objects in files, Java 1.0 does provide a portable streaming library
(DataInput and DataOutput).
This library makes it easier for the programmer to save and restore
objects.
A file-based persistence mechanism requires the programmer to
put a bit of work into achieving persistence. The programmer must
choose an external representation of the object and write the
code that saves and restores the objects.
Usually, concurrency control and transactional semantics do not
apply to file-based persistence. Storing objects in files is usually
appropriate for single-user applications that follow the File/Open
and File/Save model.
Note |
JavaSoft has recently introduced a new API that simplifies the process of storing objects
in files (and streaming objects across the network). Information about the Object Serialization
API can be found at http://chatsubo.javasoft.com/current/.
|
RDBMS
Relational database management systems (RDBMS) can also store
persistent objects, but the characteristics of a relational database
are different from file-based persistence. A relational database
is organized into tables, rows, and columns, rather than the unstructured
sequence of bytes represented by a file. An effort is under way
to standardize the use of relational databases in Java (the JDBC
API).
There are two major ways to store objects in a relational database.
The first option is to interact with the database on its terms.
The JDBC API provides interfaces that directly represent relational
database structures. These structures can be used and manipulated
as-is. The other option is to write your own Java classes and
"map" between the relational data structures and your
classes. This type of mapping is a well-understood problem for
which many commercial solutions are available (Java implementations
will no doubt be available soon).
When using a relational database, unless you are using a tool
to perform database-to-class mapping, you must write a large volume
of code to interact with the database. Managing objects in the
database requires you to write SQL statements (inserts, updates,
deletes, and so on), which are forwarded to the database through
the JDBC API.
Although using a relational database is more work, there are a
few benefits. Relational databases usually support concurrency
control and transactional properties. Multiple users can access
the database without stepping on each other's changes because
the database uses locks to safeguard access. Additionally, almost
all relational databases support ACID properties (atomicity,
consistency, isolation, durability). These properties protect
the integrity of the data by ensuring that blocks of work (referred
to as transactions) either complete successfully or are
rolled back without affecting other users.
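The hand-written SQL and the commit and rollback behavior described above can be sketched as follows. The table name ADDRESS_BOOK, its columns, and the AddressBookDao class are hypothetical, chosen only for illustration. The Connection is assumed to come from whichever JDBC driver is in use, so the sketch compiles against java.sql but needs a real database to run.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class AddressBookDao {
    // Hypothetical table and column names, for illustration only.
    static final String INSERT_SQL =
        "INSERT INTO ADDRESS_BOOK (NAME, ADDRESS, PHONE) VALUES (?, ?, ?)";

    // Inserts one entry as a single transaction: commit on success,
    // roll back if the insert fails.
    static void insertEntry(Connection conn, String name, String address, String phone)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);              // begin an explicit transaction
        try (PreparedStatement stmt = conn.prepareStatement(INSERT_SQL)) {
            stmt.setString(1, name);
            stmt.setString(2, address);
            stmt.setString(3, phone);
            stmt.executeUpdate();               // forward the SQL to the database
            conn.commit();                      // make the change durable
        } catch (SQLException e) {
            conn.rollback();                    // undo the partial work
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);  // restore the caller's setting
        }
    }
}
```

Even this small example shows the cost mentioned earlier: every attribute of the object has to be copied into a statement parameter by hand.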
Note |
The JDBC API has been officially standardized. Although few vendors are shipping products that support the API, almost all relational database vendors have publicly committed to providing implementations of the JDBC API.
|
ODBMS
Object database management systems (ODBMS) support persistence
in a different manner than file-based persistence and relational
databases. The philosophy behind object databases is to make the
programmer's job simpler. Object databases (as the name implies)
store objects. The programmer does not have to write SQL statements
or methods to package and unpackage objects; the object database
interface usually takes care of those details.
Object databases usually support concurrency control and ACID
properties, as relational databases do. They provide for concurrent
access to the database, and they also provide commit and rollback
transactional control. (Object databases are covered in greater
depth later in this chapter, in the "The PersistentJava (PJava)
Project" section.)
Note |
As this book went to press, there were no commercial object databases available for Java. Three vendors (Versant, O2, and Object Design) had publicly stated their intent to release Java object database products, but none was available. On the academic front, the PersistentJava project was nearing completion of its first implementation.
|
The following sections present an example of how to implement
a simple file-based persistent store (that you can use to add
persistability to your classes). First, you look at how
to read and write primitive data using standard classes and interfaces
provided by Java. Then you look at how to read and write whole
objects, not just primitive data types. Finally, you learn how
to apply these new interfaces to make your classes persistent.
IO Helpers: DataInput and DataOutput
Before discussing how to store whole objects in files, it is important
to learn how to store primitive Java data values in files (int,
float, String,
and so on). The java.io package
provides two interfaces (DataInput
and DataOutput) that contain
a standard API for reading and writing primitive Java types. Table
32.1 provides a summary of the methods in DataInput
and DataOutput.
Table 32.1. The DataInput and DataOutput APIs.
Data Type | DataInput | DataOutput |
boolean | readBoolean() | writeBoolean() |
byte | readByte() | writeByte() |
char | readChar() | writeChar() |
short | readShort() | writeShort() |
int | readInt() | writeInt() |
long | readLong() | writeLong() |
float | readFloat() | writeFloat() |
double | readDouble() | writeDouble() |
String | readUTF() | writeUTF() |
Note |
Even though String is not strictly
an elemental data type (it is a class),
DataInput and DataOutput
define an API for reading and writing Strings.
The primary reason is that the String data type is a major part of the
language; DataInput and DataOutput without String support
would be a less-than-functional solution. The String data type is also handled
differently; Strings are encoded in a way that compacts
the representation, when possible.
|
The DataInput and DataOutput
interfaces are simple to use. The following example demonstrates
a few of the DataInput and
DataOutput methods:
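import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
class Person {
    String name = null;
    int age = 0;
    ...
    // the DataOutput methods throw IOException, so it must be declared
    void write(DataOutput out) throws IOException {
        out.writeUTF(name); // write the name string
        out.writeInt(age); // write the age
    }
    ...
    void read(DataInput in) throws IOException {
        name = in.readUTF(); // read the name string
        age = in.readInt(); // read the age
    }
}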
DataInput and DataOutput
provide a platform-independent solution for the data representation
problem. Data written to a file (or socket) on one platform can
be read by Java programs on different platforms because the representation
of the data types is standardized. An int
or String written to a file
on a Windows NT machine can be read from that file on a Solaris
machine, Macintosh, and so on. If Java did not provide a standard
interface for data formatting, every programmer would solve this
problem independently. The result would be a Tower of Babel, which
would make communicating between Java programs problematic (especially
because Java is targeted for the network computing industry).
Sun has solved the data representation problem before. Years ago,
Sun created the eXternal Data Representation (XDR) format and
an accompanying C library. XDR was created to provide a standard
format for data interchange over networks and to serve as the
data format for Remote Procedure Calls (RPC). Today, XDR is still
widely used.
Although similar to XDR, the format required by DataInput
and DataOutput is not identical
to XDR. Java's solution is less complicated and more compact.
The DataInput/DataOutput
format requires the following:
- Data is represented in binary form (not
ASCII), for compactness.
- Data is represented in network byte-order
(big-endian).
- For elemental data types, data is stored
in exactly the same number of bytes as guaranteed by the JVM; that
is, a byte is stored
as one byte; a char, as two
bytes; an int, as four bytes,
and so on.
- No padding or byte-alignment is required.
- Strings are encoded
using a special format that reduces the number of bytes written
(especially if you are using the Latin character set).
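The rules in this list can be checked by writing a few values to an in-memory stream and printing the resulting bytes. The following is a small sketch; the class name DataFormatDemo and its helper methods are made up for this illustration, while DataOutputStream and ByteArrayOutputStream are standard java.io classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DataFormatDemo {
    // Writes the values to an in-memory stream and returns the raw bytes.
    static byte[] encode(int i, char c, String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(i);    // 4 bytes, most significant byte first
        out.writeChar(c);   // 2 bytes
        out.writeUTF(s);    // 2-byte length prefix, then the encoded characters
        out.flush();
        return bytes.toByteArray();
    }

    // Formats the bytes as space-separated two-digit hex values.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        // prints "00 00 01 02 00 41 00 02 68 69"
        System.out.println(toHex(encode(258, 'A', "hi")));
    }
}
```

The int 258 occupies four bytes in big-endian order (00 00 01 02), the char 'A' occupies two (00 41), and the string "hi" is a two-byte length (00 02) followed by one byte per Latin character (68 69). No padding appears between the values.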
Primitive data types can be written to or read from files, sockets,
or any type of stream using the DataInput
and DataOutput interfaces.
When reading and writing files, there are two implementations
of the DataInput and DataOutput
interfaces to choose from (in the java.io
package). The RandomAccessFile
class implements both DataInput
and DataOutput. The more
frequently used classes are DataInputStream
(which implements DataInput)
and the DataOutputStream
(which implements DataOutput).
To write data to a file, you should use a DataOutputStream
as a filter over a FileOutputStream.
Here's an example:
void write(File file, String s, int i, float f) throws IOException {
    // first open the FileOutputStream
    FileOutputStream fileout = new FileOutputStream(file);
    // then open the DataOutputStream "on top of" the
    // FileOutputStream that's already open
    DataOutputStream dataout = new DataOutputStream(fileout);
    // then write to the DataOutputStream, which will be
    // streamed "into" the FileOutputStream
    dataout.writeUTF(s);
    dataout.writeInt(i);
    dataout.writeFloat(f);
    // closing the DataOutputStream also closes the FileOutputStream
    dataout.close();
}
Reading from a file is as simple as the last example. You open
a DataInputStream over a
FileInputStream and make
calls to the DataInput reading
methods.
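The reading side might look like the following sketch. The class name DataFileDemo and the decision to return the three values joined into one string are assumptions made to keep the example short; the important point is that the values are read in exactly the order in which they were written.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class DataFileDemo {
    // Writes a string, an int, and a float, in that order.
    static void write(File file, String s, int i, float f) throws IOException {
        DataOutputStream dataout = new DataOutputStream(new FileOutputStream(file));
        try {
            dataout.writeUTF(s);
            dataout.writeInt(i);
            dataout.writeFloat(f);
        } finally {
            dataout.close();
        }
    }

    // Reads the values back in the same order and joins them for display.
    static String read(File file) throws IOException {
        // open the DataInputStream "on top of" a FileInputStream
        DataInputStream datain = new DataInputStream(new FileInputStream(file));
        try {
            String s = datain.readUTF();
            int i = datain.readInt();
            float f = datain.readFloat();
            return s + " " + i + " " + f;
        } finally {
            datain.close();
        }
    }
}
```

If the read calls are issued in a different order than the write calls, the stream is misinterpreted; nothing in the file records which type was written where.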
The Persistent
Framework
The java.io package supplies
the necessary classes to read and write primitive data. But what
about reading and writing entire objects? Although DataInput/DataOutput
is a powerful concept (the portable data format), these interfaces
do not contain methods to read or write entire objects. Objects
seem to be "left as an exercise for the reader." I decided
to take up the challenge and implement a simple framework for
reading and writing objects. The interfaces and classes in this
framework are present on the accompanying CD-ROM. Feel free to
use the provided framework in your code.
You have already encountered the concepts that go into reading
and writing primitive data. DataInput
and DataOutput can handle
the streaming of primitive types, but they do not handle class
types. To stream class types, we need a new concept: the concept
of "a class whose instances can stream themselves."
This can be generalized in an interface, called Persistent:
import java.io.IOException;
// PersistentInput and PersistentOutput are in the same package,
// so they need no import.
/**
 * Persistent interface. Provides a class with the ability to write
 * itself to a stream, and to read itself from a stream.<p>
 *
 * @see PersistentInput
 * @see PersistentOutput
 * @author Eric R Williams
 */
public interface Persistent {
    /**
     * Writes self to the specified output stream.<p>
     *
     * @param out the persistent output interface to write self to.
     * @exception IOException if an I/O problem occurs.
     */
    public void write(PersistentOutput out) throws IOException;
    /**
     * Reads self from the specified input stream.<p>
     *
     * @param in the persistent input interface to read self from.
     * @exception IOException if an I/O problem occurs.
     */
    public void read(PersistentInput in) throws IOException;
}
Note |
Note the use of javadoc-style comments in the preceding example. Documenting your code using the javadoc standard format is always a good idea. This format helps you produce online documents describing your code, and it is generally expected by other developers. For the remainder of this chapter, however, the javadoc-style comments have been removed to cut down on the size of the code listings.
|
The Persistent interface
provides a standard way to add persistence (and streamability)
to classes. To add persistence to a class, implement the Persistent
interface in that class. There are only two methods to implement:
one to write the object to an output stream (write(PersistentOutput))
and one to read the object from an input stream (read(PersistentInput)).
If you examine the Persistent
interface, you encounter two additional classes: PersistentOutput
and PersistentInput. They
are actually not classes, but interfaces. These interfaces extend
the DataInput and DataOutput
interface models to provide support for reading and writing Persistent
objects, as follows:
import java.io.DataOutput;
import java.io.IOException;
// Persistent is in the same package, so it needs no import.
public interface PersistentOutput extends DataOutput {
    void writePersistent(Persistent obj) throws IOException;
}
PersistentOutput defines
an API that extends the DataOutput
interface and adds a new method (to write Persistent
objects). The new method, writePersistent(Persistent),
is declared in a style consistent with the other methods declared
in the DataOutput interface.
A similar interface is defined to extend DataInput: the
PersistentInput interface:
import java.io.DataInput;
import java.io.IOException;
// Persistent is in the same package, so it needs no import.
public interface PersistentInput extends DataInput {
    Persistent readPersistent() throws IOException;
}
These three interfaces (Persistent,
PersistentInput, and PersistentOutput) form
a framework that makes it easy to add persistence to your classes.
There are two additional classes in the Persistent
framework: PersistentInputStream
and PersistentOutputStream.
These classes are discussed in detail in a later section.
Using the Simple Persistent Store
Now that you have been introduced to the Persistent
framework, let's examine how to apply that framework to make objects
persistent. This process involves modifying a class you have already
written to add the Persistent
interface to that class. We will use a simple class created to
demonstrate the Persistent
framework: the Shape class.
The original code for Shape
(without persistence) is listed here:
import java.io.*;
import java.awt.Point;
public class Shape {
    private Point[] vertices;
    private String name;
    public Shape(Point[] vertices, String name) {
        this.name = name;
        this.vertices = vertices;
    }
    public Shape(int size, String name) {
        this.name = name;
        vertices = new Point[size];
        for (int i=0; i<size; i++) {
            vertices[i] = new Point(0, 0);
        }
    }
    public Point getPoint(int pos) {
        return vertices[pos];
    }
    public String getName() {
        return name;
    }
}
Shape is a simple class;
it has only two attributes: a name and an array of points (the
boundaries of the shape). The Shape
class depends on java.awt.Point
to represent Point objects.
To add persistence to the Shape
class, we need to make a few changes to the class source code:
- Add implements
Persistent to the class
declaration line
- Add a no-parameter constructor (the reason
for this will be discussed later)
- Code the write(PersistentOutput)
method, which is required by the Persistent
interface
- Code the read(PersistentInput)
method, which is also required by the Persistent
interface
The first two items on this list are trivial. They involve minor
changes to the class. The latter two items are more involved tasks.
Before we start coding the read()
and write() methods, we need
to choose an external format for the Shape
class. The external format is a specification of the order and
structure of the object's attributes. One convenient notation
used to express this format is similar to C struct
declarations. (This notation is used in the Java Virtual Machine
Specification to describe the layout for Java .class
files.) We can represent the Shape
class using the following structure:
int vertex_count;
struct {
    int x;
    int y;
} vertices [vertex_count];
String name;
This notation specifies that the first element in the format is
labeled vertex_count and
is an int. The second element
is labeled vertices; it is
an array of length vertex_count
(which was already specified). The array is composed of a compound
structure containing two ints,
x and y,
respectively. The last element is a String,
labeled name. In this notation,
the labels exist for human consumption only; they are not included
in the stored objects. Labels help readers of the format understand
what data is being represented.
Once you choose an external format for the Shape
class, you can begin to construct the routines to read and write
a Shape. Here is an implementation
of the write(PersistentOutput)
method:
public void write(PersistentOutput out) throws IOException {
    out.writeInt(vertices.length); // write # of points
    for(int i=0; i<vertices.length; i++) { // write each point
        out.writeInt(vertices[i].x);
        out.writeInt(vertices[i].y);
    }
    out.writeUTF(name); // write shape name
}
Only two of the DataOutput
interface methods are used in this example: writeInt()
and writeUTF(). As you can
see, this method logically carries out the agreed-on format: array
length, followed by the array of points, and then followed by
a string. The process of writing an object to a file is not difficult;
it is expressed in about five lines of code.
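To see what the external format amounts to on disk, the same sequence of calls can be directed at an in-memory stream and the bytes counted. The sketch below is an illustration only: the class ShapeFormatDemo is hypothetical, and it writes to a plain DataOutputStream rather than a PersistentOutput so that it does not depend on the rest of the framework.

```java
import java.awt.Point;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class ShapeFormatDemo {
    // Same field order as the write() method above: vertex count,
    // then x and y for each vertex, then the name.
    static byte[] encode(Point[] vertices, String name) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(vertices.length);
        for (int i = 0; i < vertices.length; i++) {
            out.writeInt(vertices[i].x);
            out.writeInt(vertices[i].y);
        }
        out.writeUTF(name);
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Point[] square = {
            new Point(0, 0), new Point(1, 0), new Point(1, 1), new Point(0, 1)
        };
        // 4 (count) + 4 * 8 (vertices) + 2 (string length) + 9 (characters) = 47
        System.out.println(encode(square, "SquareOne").length); // prints "47"
    }
}
```

A four-vertex shape named "SquareOne" therefore occupies 47 bytes in this format, with no labels or padding stored alongside the data.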
The following is an implementation of the read(PersistentInput)
method:
public void read(PersistentInput in) throws IOException {
    vertices = new Point[in.readInt()]; // read # of points
    for(int i=0; i<vertices.length; i++) { // read each point
        vertices[i] = new Point(in.readInt(), in.readInt());
    }
    name = in.readUTF(); // read shape name
}
The read() method implements
the agreed-on format. Again, the method is short and simple to
understand, using just two methods from the DataInput
interface: readInt() and
readUTF(). First, it reads
the vertices' array size, followed by each vertex (a Point
consisting of two ints, x
and y), and finally reads
a String, the name of the
shape.
Now that we have seen the pieces, let's put it all together. The
following code listing includes the Shape
class (renamed to PShape),
plus the additions that have been made to support
persistence:
package COM.MCP.Samsnet.jun;
import java.io.*;
import java.awt.Point;
import COM.MCP.Samsnet.jun.Persistent;
public class PShape implements Persistent {
    private Point[] vertices;
    private String name;
    public PShape() { // need a no-parameter constructor
        vertices = null;
        name = null;
    }
    public PShape(Point[] vertices, String name) {
        this.name = name;
        this.vertices = vertices;
    }
    public PShape(int size, String name) {
        this.name = name;
        vertices = new Point[size];
        for (int i=0; i<size; i++) {
            vertices[i] = new Point(0, 0);
        }
    }
    public Point getPoint(int pos) {
        return vertices[pos];
    }
    public String getName() {
        return name;
    }
    public void write(PersistentOutput out) throws IOException {
        out.writeInt(vertices.length); // write # of points
        for(int i=0; i<vertices.length; i++) { // write each point
            out.writeInt(vertices[i].x);
            out.writeInt(vertices[i].y);
        }
        out.writeUTF(name); // write shape name
    }
    public void read(PersistentInput in) throws IOException {
        vertices = new Point[in.readInt()]; // read # of points
        for(int i=0; i<vertices.length; i++) { // read each point
            vertices[i] = new Point(in.readInt(), in.readInt());
        }
        name = in.readUTF(); // read shape name
    }
    public String toString() {
        StringBuffer b = new StringBuffer(name);
        for (int i=0; i<vertices.length; i++) {
            b.append(" (" + vertices[i].x + "," + vertices[i].y + ")");
        }
        return b.toString();
    }
    public boolean equals(Object obj) {
        boolean isequal = false;
        if (obj instanceof PShape) {
            PShape shape = (PShape)obj;
            isequal = (this.name.equals(shape.name))
                && (this.vertices.length == shape.vertices.length);
            int i=0;
            while (isequal && i<vertices.length) {
                if (! this.vertices[i].equals(shape.vertices[i])) {
                    isequal = false;
                    break;
                }
                i++;
            }
        }
        return isequal;
    }
}
To validate the persistence of the PShape
class, we need to have a test class that does the
following:
- Creates a PShape
object
- Writes it to a file using a PersistentOutputStream
- Reads it back from the file using a PersistentInputStream
- Compares the two objects
The following class, PShapeTest,
validates the persistence of PShape.
(All these classes are on the accompanying CD-ROM, so feel free
to run this test.)
package COM.MCP.Samsnet.jun;
import COM.MCP.Samsnet.jun.PShape;
import COM.MCP.Samsnet.jun.PersistentOutputStream;
import COM.MCP.Samsnet.jun.PersistentInputStream;
import java.io.*;
public class PShapeTest {
    public static void main(String[] args) {
        try {
            PShape square = new PShape(4, "SquareOne");
            square.getPoint(0).move(0, 0);
            square.getPoint(1).move(1, 0);
            square.getPoint(2).move(1, 1);
            square.getPoint(3).move(0, 1);
            PersistentOutputStream out = // create a PersistentOutputStream
                new PersistentOutputStream( // on top of a FileOutputStream
                    new FileOutputStream("pshape.sav"));
            out.writePersistent(square); // *** write the Shape ***
            out.close();
            PersistentInputStream in = // create a PersistentInputStream
                new PersistentInputStream( // on top of a FileInputStream
                    new FileInputStream("pshape.sav"));
            PShape shape2 =
                (PShape) in.readPersistent(); // *** read the Shape ***
            in.close();
            if (square.equals(shape2)) {
                System.out.println("everything is ok!");
            }
        } catch (Exception ee) {
            System.err.println(ee.toString());
            ee.printStackTrace();
        }
    } // main
} // class
The Implementation of PersistentInputStream
and PersistentOutputStream
The only missing pieces now are the classes that provide implementations
for the PersistentOutput
and PersistentInput interfaces.
As interfaces, they are API specifications only; implementations
are required if you are going to use the interfaces.
Let's start with PersistentOutput.
The PersistentOutput interface
is very complicated; it contains all the methods of DataOutput
(approximately 14 methods), plus writePersistent().
That's a lot of methods to implement! Fortunately, reuse by inheritance
comes in handy; a class that nearly matches the needs already
exists. By subclassing DataOutputStream,
all the DataOutput methods
defined in DataOutputStream
are inherited (and do not have to be reimplemented). You only
have to implement a constructor and a writePersistent()
method. Here's a listing of the PersistentOutputStream
class:
import java.io.*;
// Persistent and PersistentOutput are in the same package,
// so they need no import.
public class PersistentOutputStream extends DataOutputStream
        implements PersistentOutput {
    public PersistentOutputStream(OutputStream out) {
        super(out);
    }
    public final void writePersistent(Persistent obj) throws IOException {
        if (obj == null) { // treat null in a special way
            writeUTF("null"); // write "null" as the class name
        } else {
            writeUTF(obj.getClass().getName()); // write the object's class name
            obj.write(this); // then write the object itself
        }
    }
}
The writePersistent() method
writes the string "null"
if the specified Persistent
object is null. Otherwise,
the method writes the class name of the object (a String),
followed by the object writing itself to the stream (using the
write(PersistentOutput) method
of the Persistent interface).
The PersistentOutputStream
does not have to understand the format a Persistent
object uses when it writes itself to the stream. Moving the writing
logic to the classes that implement Persistent
is what the Persistent interface
is all about.
The PersistentInputStream
is slightly more complicated, but it still inherits most of its
behavior from DataInputStream,
as shown here:
import java.io.*;
// Persistent and PersistentInput are in the same package,
// so they need no import.
public class PersistentInputStream extends DataInputStream
        implements PersistentInput {
    public PersistentInputStream(InputStream in) {
        super(in);
    }
    public final Persistent readPersistent() throws IOException {
        Persistent obj = null;
        String classname = readUTF(); // read the class name
        if ("null".equals(classname)) {
            obj = null; // if "null", return null
        } else {
            try {
                // retrieve the Class object for the specified class name
                Class clazz = Class.forName(classname);
                // build a new instance of the Class (throws an exception if
                // the class is abstract or does not have a no-param constructor)
                obj = (Persistent) clazz.newInstance();
                // let the object read itself from the stream
                obj.read(this);
            } catch (ClassNotFoundException ee) { // catch all kinds of
                throw new IOException(ee.toString()); // exceptions and rethrow
            } catch (InstantiationException ee) {
                throw new IOException(ee.toString());
            } catch (IllegalAccessException ee) {
                throw new IOException(ee.toString());
            }
        }
        return obj;
    }
}
The readPersistent() method
reads the name of the object's class from the stream. If that
name is equal to "null",
the null value is returned.
Otherwise, the method locates the Java Class
object corresponding to the class name and uses the Class
to create a new instance of the Persistent
object. The new Persistent
object then reads itself from the stream in the read(PersistentInput)
method.
You may wonder about the exception handling in the readPersistent()
method. Why does it have so many catch
statements? They were used to keep the readPersistent()
method consistent with the methods of DataInput,
all of which throw only IOException.
If you do not catch the listed exceptions and rethrow them as
IOExceptions, the exception
class names must be declared in the throws
clause of the readPersistent()
method, which would be inconsistent with DataInput.
Note |
The object creation step in the PersistentInputStream class requires the use of the Class method newInstance(), which is Java's generic interface for creating an object, given the Class instance. To allocate a new object of a class using newInstance(), the class must have a public constructor that takes no parameters (this is the constructor method that will be invoked by newInstance()). A public no-parameter constructor was added to the PShape class to support the use of newInstance().
|
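On current Java releases, Class.newInstance() is deprecated (since Java 9), and the usual replacement is to look up the no-parameter constructor and invoke it. The sketch below shows that variation under the same error-handling policy as readPersistent(). The Widget and Instantiator classes are hypothetical and exist only for this illustration.

```java
import java.io.IOException;

class Widget {
    public Widget() { }   // the no-parameter constructor that reflection needs
}

class Instantiator {
    // Creates an instance from a class name and reports any reflection
    // failure as an IOException, as readPersistent() does.
    static Object create(String classname) throws IOException {
        try {
            Class<?> clazz = Class.forName(classname);
            // look up the no-parameter constructor, then invoke it
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException ee) {
            // covers a missing class, a missing or inaccessible constructor,
            // an abstract class, and an exception thrown by the constructor
            throw new IOException(ee.toString());
        }
    }
}
```

The requirement stated in the note is unchanged: a class without an accessible no-parameter constructor still cannot be created this way.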
The PersistentInputStream
and PersistentOutputStream
implementation of reading and writing Persistent
objects has several limitations:
- If you attempt to read a persistent object
whose Java class cannot be found and loaded by name (for example,
because it is not on the class path), an exception is thrown.
- Object identity is not considered. Two
references to a single object are written as two objects on a
PersistentOutputStream.
- Cyclical data structures cause the PersistentOutputStream
to enter a recursive loop, eventually exhausting stack space and
throwing an exception. (An example of a cyclical structure is
one in which two objects contain references to each other.)
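The object identity limitation can be demonstrated with a condensed copy of the framework. The sketch below is not the CD-ROM code: it is compressed to fit in one listing, it uses getDeclaredConstructor().newInstance() in place of Class.newInstance(), and the Label and IdentityDemo classes are hypothetical test classes added for the illustration.

```java
import java.io.*;

interface Persistent {
    void write(PersistentOutput out) throws IOException;
    void read(PersistentInput in) throws IOException;
}
interface PersistentOutput extends DataOutput {
    void writePersistent(Persistent obj) throws IOException;
}
interface PersistentInput extends DataInput {
    Persistent readPersistent() throws IOException;
}

class PersistentOutputStream extends DataOutputStream implements PersistentOutput {
    PersistentOutputStream(OutputStream out) { super(out); }
    public final void writePersistent(Persistent obj) throws IOException {
        if (obj == null) {
            writeUTF("null");
        } else {
            writeUTF(obj.getClass().getName());  // class name, then the object
            obj.write(this);
        }
    }
}

class PersistentInputStream extends DataInputStream implements PersistentInput {
    PersistentInputStream(InputStream in) { super(in); }
    public final Persistent readPersistent() throws IOException {
        String classname = readUTF();
        if ("null".equals(classname)) return null;
        try {
            Persistent obj = (Persistent) Class.forName(classname)
                    .getDeclaredConstructor().newInstance();
            obj.read(this);
            return obj;
        } catch (ReflectiveOperationException ee) {
            throw new IOException(ee.toString());
        }
    }
}

// Hypothetical persistent class holding a single string.
class Label implements Persistent {
    String text;
    public Label() { }
    Label(String text) { this.text = text; }
    public void write(PersistentOutput out) throws IOException { out.writeUTF(text); }
    public void read(PersistentInput in) throws IOException { text = in.readUTF(); }
}

class IdentityDemo {
    // Writes the same object twice, reads both copies back, and reports
    // whether the two restored references point to one object.
    static boolean sharedAfterRoundTrip() throws IOException {
        Label label = new Label("shared");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PersistentOutputStream out = new PersistentOutputStream(bytes);
        out.writePersistent(label);
        out.writePersistent(label);   // second reference to the same object
        out.close();
        PersistentInputStream in =
            new PersistentInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        Persistent first = in.readPersistent();
        Persistent second = in.readPersistent();
        in.close();
        return first == second;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sharedAfterRoundTrip()); // prints "false"
    }
}
```

Before the round trip there is one Label with two references to it; afterward there are two separate Label objects with equal contents. A store that preserved identity would have to record that the second reference pointed to an object already written.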
The Persistent framework
classes are simple and straightforward. In short order, you can
add "persistence" to your classes; you can store objects
in files or send them across a network to another computer. These
interfaces and classes are not a general solution to the problem
of persistence, but they are a good solution when you have to store
or send simple objects. Additionally, the Persistent
framework is a useful tool to teach some of the concepts of persistence.
In October 1995 (the early days of Java, before the language skyrocketed
in popularity), Sun funded a year-long research project at the
University of Glasgow to investigate adding persistence
to the Java programming language. The Glasgow researchers have
proposed a design specification for adding "orthogonal"
persistence to Java. They have also begun building a persistent
storage interface to link Java to a persistent store.
Persistent Store Concepts
Few programmers are familiar with persistent stores or object
databases. The following brief sections introduce the basic concepts
involved in a persistent store.
Note |
The phrases persistent store and object database are often used interchangeably. Because the authors of the PJava design refer to PJava as an "interface to a persistent store," this chapter refers to PJava as a "persistent storage" interface.
|
Persistent Stores versus Relational Databases
Foremost, a persistent store is a kind of database. You
are probably familiar with the term database (a storage
pool for information). Most commercially available databases support
long-term data storage on disk, structural organization of the
data, methods to retrieve data from the database, methods to update
data already stored in the database, row or page locking to prevent
concurrent access problems, isolation of uncompleted transactions
from other transactions, and so on. Most persistent stores meet
these criteria.
By far the most common type of client-server database system is
the relational database (for example, Oracle, Informix, Sybase,
DB2, and so on). Contrasting a persistent store with a relational
database is a useful exercise to understand what a persistent
store is and what it is not.
Relational databases are organized in tabular data structures:
tables, columns, and rows. Data from different tables can be joined
to create new ways of looking at the data. SQL is used to send
commands to the database, such as commands to create new rows
of data, to update rows, and so on. SQL commands can also be used
from other programming languages because they are sent to the
database server for processing.
Relational databases, with their tabular data structures, do not
mesh well with object-oriented (OO) programming languages. There
are three major problems encountered using relational databases
from an OO language. First, relational data structures do not
provide for class encapsulation. OO programmers are encouraged
to model their domain using classes, providing an API to class
users, and "hiding" all data within the class. Relational
structures expose all data and do not allow encapsulation by an
API. Second, OO classes support a rich set of data types that
are difficult or impossible to model efficiently in a relational
structure. Examples include multidimensional arrays, dictionaries,
and object references. Last, it is difficult to represent class
inheritance in a relational database. Although it is possible,
deep class inheritance trees can result in n-way joins
on the database server, which have poor performance.
Tools that attempt to solve the object and relational mismatch
are available. These tools map relational data structures into
OO classes using relatively simple rules (for example, map tables
to classes, columns to attributes, and foreign key attributes
to object relationships). Although some of these products have
been successful, this approach has had problems. These products
suffer from performance issues, particularly when complex navigation
is performed through the mapped data structures. Additionally,
these products limit the type-expressiveness of the language because
not all the data types expressible in the object-oriented language
are easily expressible in a relational database.
Persistent stores are different from relational databases. Persistent
stores do the following:
- Eliminate the use of relational data structures
(instead, whole objects are stored directly in the database)
- Enable the programmer to write classes
in a normal, object-oriented fashion to represent data that will
be made persistent
- Enable the programmer to take advantage
of more data types than is possible when using a relational database
- Provide a simpler interface than a relational
database interface
Creating and Using Persistent Objects
Different persistent storage interfaces have different methods
for creating persistent objects (or making existing objects persistent).
Some interfaces require the programmer to specify whether an object
is to be persistent at the time an object is created. Other persistent
stores implement a concept referred to as persistent roots.
Persistent root objects are explicitly identified as persistent.
Any object that a persistent root refers to, directly or through
a chain of references, is also considered persistent and is saved
in the persistent store. This concept is called persistence
through reachability.
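Persistence through reachability amounts to a graph traversal that starts at the root. The following sketch illustrates the idea under simplifying assumptions: GraphObject and ReachabilitySketch are hypothetical names, and each object lists its outgoing references explicitly rather than having a store discover them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical object type that lists its outgoing references.
class GraphObject {
    final String label;
    final List<GraphObject> refs = new ArrayList<>();
    GraphObject(String label) { this.label = label; }
}

class ReachabilitySketch {
    // Returns every object reachable from root. Objects are compared by
    // identity, so shared references and cycles are visited only once.
    static Set<GraphObject> reachableFrom(GraphObject root) {
        Set<GraphObject> seen =
            Collections.newSetFromMap(new IdentityHashMap<GraphObject, Boolean>());
        Deque<GraphObject> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            GraphObject o = pending.pop();
            if (seen.add(o)) {
                for (GraphObject r : o.refs) {
                    pending.push(r);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        GraphObject root = new GraphObject("root");
        GraphObject a = new GraphObject("a");
        GraphObject b = new GraphObject("b");
        GraphObject unrelated = new GraphObject("unrelated");
        root.refs.add(a);
        a.refs.add(b);
        b.refs.add(root);   // cycle back to the root
        for (GraphObject o : reachableFrom(root)) {
            System.out.println("persistent: " + o.label);
        }
    }
}
```

In this sketch, root, a, and b would be saved, while unrelated would not, because no chain of references leads to it from the root.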
Retrieving objects from a persistent store is significantly different
from retrieving data through SQL. When using SQL, the programmer
must explicitly request data (using SELECT
statements); with persistent stores, programmers seldom make explicit
queries for objects. Persistent stores usually provide a mechanism
to request only "top-level" objects, either through
direct query or through a request for a particular persistent
root.
Persistent storage interfaces almost universally employ a process
known as swizzling (or object faulting) to retrieve
objects from the database. Objects are retrieved on the fly, as
they are needed. After obtaining a reference to a top-level object,
programmers normally use that object to access related objects.
When a program attempts to access an object that has not yet been
retrieved from the database, the object is swizzled in. The attempt
to access the object is trapped by the database interface, which
then retrieves the object's storage block from the database, restores
the object, and then allows the access to continue.
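The effect of object faulting can be approximated in ordinary Java with a reference that loads its target on first use. This is only a sketch of the idea: SketchStore and LazyRef are hypothetical names, the "store" is an in-memory map, and the fault is triggered by an explicit get() call, whereas a real persistent storage interface traps the access transparently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a persistent store: object ID -> stored data.
class SketchStore {
    private final Map<Long, String> blocks = new HashMap<>();
    int fetchCount = 0;   // number of trips made to the "database"

    void put(long id, String data) { blocks.put(id, data); }

    String fetch(long id) {
        fetchCount++;
        return blocks.get(id);
    }
}

// A reference that holds only an object ID until the object is first needed.
class LazyRef {
    private final SketchStore store;
    private final long id;
    private String value;
    private boolean loaded = false;

    LazyRef(SketchStore store, long id) {
        this.store = store;
        this.id = id;
    }

    // First call faults the object in; later calls reuse the loaded copy.
    String get() {
        if (!loaded) {
            value = store.fetch(id);
            loaded = true;
        }
        return value;
    }
}

class LazyRefDemo {
    public static void main(String[] args) {
        SketchStore store = new SketchStore();
        store.put(42L, "order #42");
        LazyRef ref = new LazyRef(store, 42L);
        System.out.println("fetches before access: " + store.fetchCount);
        System.out.println(ref.get());
        System.out.println(ref.get());
        System.out.println("fetches after two accesses: " + store.fetchCount);
    }
}
```

Only the first access reaches the store; objects that are never touched are never loaded at all.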
Finally, persistent stores usually have a mechanism to identify
objects uniquely: the object ID. Every object in a persistent
store is assigned its own unique object ID, which can be used
to differentiate objects of the same class whose values are equal.
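The distinction between equal values and distinct identities can be shown with a small registry that hands out IDs by object identity. ObjectIdSketch is a hypothetical name, and the scheme (a counter plus an IdentityHashMap) is an assumption made for illustration, not how any particular persistent store assigns its IDs.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class ObjectIdSketch {
    // Keys are compared with ==, not equals(), so equal values stay distinct.
    private final Map<Object, Long> ids = new IdentityHashMap<>();
    private long nextId = 1;

    // Returns the ID of this exact object, assigning a new one on first sight.
    long idOf(Object o) {
        Long id = ids.get(o);
        if (id == null) {
            id = nextId++;
            ids.put(o, id);
        }
        return id;
    }

    public static void main(String[] args) {
        ObjectIdSketch registry = new ObjectIdSketch();
        String first = new String("hi");
        String second = new String("hi");
        System.out.println("equal values: " + first.equals(second));
        System.out.println("ID of first:  " + registry.idOf(first));
        System.out.println("ID of second: " + registry.idOf(second));
    }
}
```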
PJava Design
The first PersistentJava design, known as PJava0, was published
in January 1996. An additional paper (Atkinson, et al. '96) was
published in February and describes the design issues of PJava0.
(Both of these papers are available from http://www.dcs.gla.ac.uk/~susan/pjava.)
The PJava0 design goals, principles, and architecture are outlined
in the following sections.
Project Goals
The stated goal of the PersistentJava project is to provide orthogonal
persistence in Java. The PJava researchers are creating a persistent
storage mechanism that can store objects of any type in the persistent
store. This is the working meaning of orthogonal: persistence
is independent of data type. Any object, without respect to type,
can be made persistent.
Many persistent stores and object databases do not support orthogonal
persistence, which is extremely hard to implement in most programming
languages. Orthogonal persistence means that programmers can write
code without considering whether they are dealing with persistent
objects. This forces the persistent storage interface to be extremely
flexible in how it deals with data types. It also makes implementing
a database server that is independent of any programming language
difficult because a very tight binding is made to one language's
type system.
The Glasgow team has set out with a goal of orthogonal persistence;
doing so has implications they must handle. Any object, be it
of a user-defined or system-defined class, can be persistent.
Persistent objects can include Object,
Panel, SecurityManager,
Button, Class,
Hashtable, and so on.
An additional goal of the research project is the building of
a prototype application that uses the prototype persistent storage
interface. The application is referred to as Forest, a
distributed software configuration management and build system
([Atkinson, et al. 96] Atkinson, Daynès & Spence. Draft
PJava Design 1.2. Department of Computer Science, University
of Glasgow. January 1996).
Design Principles
The PJava team used several principles to guide their design:
- Data type independence from persistence
(orthogonal)
- Persistence through reachability from
persistent roots
- No changes to the Java language
- Support for different styles of transactions
- Persistence without modification to existing
Java code
- Flexibility, to allow for integration
with multiple persistent stores
The PJava team intentionally left out one potential design goal:
"No changes to the Java virtual machine." In fact, the
team has actively pursued the modification of the JVM; it is a
central part of the architecture (and probably the only feasible
way to implement orthogonal persistence). Unfortunately, JavaSoft
has stated that they will not incorporate the PJava changes into
the commercial JVM, effectively relegating PJava to the academic
community for the time being.
The foremost point to remember about the design of PersistentJava
is that it does not require the programmer to change any existing
classes. It does not require the programmer to use a "special"
version of the system classes. It does, however, require the programmer
to use a customized virtual machine.
Storing and Retrieving Objects
One of the first things you want to know as the user of a persistent
store is how to make objects persistent. How do you store objects
in the database? PersistentJava incorporates the concept of a
persistent root. The Draft PJava Design 1.2 document states
that an early revision of the design included a PersistentRoot
class; objects of type PersistentRoot
(or a subclass thereof) had the property of "being a persistent
root." However, the design was changed; any object may be
registered as a persistent root, thus making the "root"
property independent of data type.
Here is an example of how to make an object a persistent root
in PJava0:
// make obj a persistent root (pstore is a PJavaStore)
pstore.registerPRoot("root-1", obj);
To retrieve a persistent root from the database, follow this example:
// get the handle for all Open Orders
Orders[] orders = (Orders[]) pstore.getPRoot("OpenOrders");
Note |
The preceding code examples are the only PJava code samples included with this book. As this book goes to press, the PJava0 implementation has yet to be completed.
|
Recall from the earlier discussion of persistent roots (in the
section "Persistent Store Concepts") that roots are
only the starting point for the identification of persistent objects.
By adding a single persistent root to the database, you may be
adding thousands of objects to the persistent Java store.
Now you can store root objects in the persistent store and retrieve
them. But how do you access other objects? Does a similar "ask
the database for the object" interface exist? The answer
is both yes and no. When you use a root object to access related
objects, you call methods on and retrieve the attributes of those
objects. When you attempt to access a related object that has
not yet been brought from the database, the modified virtual machine
intercepts this action, bringing the object from the database
for you. You are not required to do anything special. Use objects
as you normally would; the object retrieval mechanism is transparent.
The PJava virtual machine (a modified JVM) performs work that
is not visible to the programmer: the VM monitors access to objects.
When an attempt is made to access a persistent object that has
not yet been accessed, PJava goes into action. Part of the PJava
system is called on to retrieve the object. It determines whether
the storage block containing the object has already been loaded;
if not, it makes a trip to the persistent store. When the object's
storage is loaded, PJava converts the byte-oriented storage into
a Java object. The PJava VM then allows your code to continue
accessing the object. This mechanism of transparent object retrieval
is often called swizzling, or object faulting (a
legacy of certain object databases that perform this operation
using OS page-faulting mechanisms).
Transactions
The next thing you may want to know about PJava is how to begin
and end a transaction. The designers of PJava wanted to allow
multiple transaction styles, so they created a transaction root
class, TransactionShell.
This class has two provided subclasses: NestedTransaction
and OLTPTransaction, but
the programmer can subclass TransactionShell
to create new transactional styles.
Transactions in PJava can either be launched synchronously (that
is, in the same thread) or asynchronously (in a different thread)
by invoking the start() method
of the transaction object. The TransactionShell
class executes the user's transaction logic through a Runnable
object, whose run() method
is invoked as the "main" method of the transaction.
To obtain the result of the transaction (whether it succeeds or
fails), call the claim()
method. If you want to stop an asynchronously running transaction,
you can invoke the kill()
method on that transaction.
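Because the PJava0 implementation is not available, the following sketch only imitates the start, claim, and kill life cycle using standard Java threads. SketchTransaction is a hypothetical class, not PJava's TransactionShell; the boolean parameter on start() and the use of thread interruption for kill() are assumptions made for illustration.

```java
// Hypothetical imitation of a transaction life cycle; not the PJava API.
class SketchTransaction {
    private final Runnable body;
    private Thread worker;                 // set only for asynchronous runs
    private volatile boolean succeeded;

    SketchTransaction(Runnable body) { this.body = body; }

    // Runs the body in the calling thread, or in a new thread if async is true.
    void start(boolean async) {
        if (async) {
            worker = new Thread(this::execute);
            worker.start();
        } else {
            execute();
        }
    }

    // The Runnable's run() method acts as the "main" method of the transaction.
    private void execute() {
        try {
            body.run();
            succeeded = true;
        } catch (RuntimeException e) {
            succeeded = false;             // an uncaught exception means failure
        }
    }

    // Waits for the transaction to finish and reports whether it succeeded.
    boolean claim() {
        if (worker != null) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return succeeded;
    }

    // Asks an asynchronously running transaction to stop.
    void kill() {
        if (worker != null) {
            worker.interrupt();
        }
    }
}

class TransactionSketchDemo {
    public static void main(String[] args) {
        SketchTransaction quick = new SketchTransaction(() -> System.out.println("working"));
        quick.start(false);
        System.out.println("quick succeeded: " + quick.claim());

        SketchTransaction slow = new SketchTransaction(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                throw new IllegalStateException("killed", e);
            }
        });
        slow.start(true);
        slow.kill();
        System.out.println("slow succeeded: " + slow.claim());
    }
}
```

Note that interruption is cooperative: the sketch's kill() works only because the transaction body responds to the interrupt. How PJava stops a running transaction is not specified here.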
In PJava, you can run one transaction nested within another transaction
using the NestedTransaction
class. Nested transactions enable you to perform updates in a
child transaction without affecting the state of the parent transaction.
A child transaction that completes successfully passes all its
updates (the modified objects) to its parent transaction. If the
child transaction aborts, none of its updates are ever reflected
in the parent transaction. You also can spawn parallel, independent
NestedTransactions. In this
case, each of the sibling transactions is isolated from all others,
and can commit or abort independently.
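The visibility rules for nested transactions can be modeled with a chain of update maps. This is a sketch under stated assumptions, not the PJava NestedTransaction class: NestedTxSketch is a hypothetical name, the "objects" are integer values keyed by name, and a top-level commit does nothing further, where a real store would write the updates to disk.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of nested transactions as layered update maps.
class NestedTxSketch {
    private final NestedTxSketch parent;   // null for a top-level transaction
    private final Map<String, Integer> updates = new HashMap<>();

    NestedTxSketch(NestedTxSketch parent) { this.parent = parent; }

    // Starts a child transaction nested within this one.
    NestedTxSketch child() { return new NestedTxSketch(this); }

    void put(String key, int value) { updates.put(key, value); }

    // Reads see this transaction's own updates first, then its ancestors'.
    Integer get(String key) {
        if (updates.containsKey(key)) {
            return updates.get(key);
        }
        return parent == null ? null : parent.get(key);
    }

    // Success: all updates are handed to the parent transaction.
    void commit() {
        if (parent != null) {
            parent.updates.putAll(updates);
            updates.clear();
        }
    }

    // Abort: updates are discarded and the parent never sees them.
    void abort() { updates.clear(); }

    public static void main(String[] args) {
        NestedTxSketch top = new NestedTxSketch(null);
        top.put("balance", 100);

        NestedTxSketch first = top.child();
        NestedTxSketch second = top.child();   // sibling of first
        first.put("balance", 80);
        System.out.println("parent before commit: " + top.get("balance"));
        System.out.println("sibling view:         " + second.get("balance"));
        first.commit();
        System.out.println("parent after commit:  " + top.get("balance"));

        second.put("bonus", 5);
        second.abort();
        System.out.println("aborted update:       " + top.get("bonus"));
    }
}
```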
An additional transaction class, OLTPTransaction,
also is available. An OLTPTransaction
is a traditional transaction style that cannot be executed asynchronously
and cannot be nested.
Persistence involves extending the lifetime of an object beyond
the lifetime of the program in which it was created. In this chapter,
you have seen several possible ways to implement persistence:
- Saving the representation of an object
directly to a file using the DataOutput
and DataInput interfaces
- Using the Persistent
framework provided with this chapter
- Using some form of database library (for
example, JDBC)
- Using a persistent store, like the one
being created by researchers at the University of Glasgow
Copyright 1998
EarthWeb Inc., All rights reserved.
Copyright 1998 Macmillan Computer Publishing. All rights reserved.