An Overview

towards a marriage of theoretical and practical computer science

Introduction

There are many sources for the ideas behind this project. In the fall of 1997, while researching a final project for Michael Lewis's Artificial Intelligence course, we became interested in Douglas Hofstadter's book, "Fluid Concepts and Creative Analogies", and spent a great deal of time thinking about and discussing Hoftstadter's ideas. As our final project, we attempted a port of Hoftsadter's and Mitchell's Copycat system (originally written in LISP) to C++. It never became fully operational, but became the inspiration for this independent study.

Copycat, as implemented by Mitchell is an AI system for the construction of analogies between "letter strings". For example, given the problem "ABC:ABD::TUV:?" (which might be read "ABC is to ABD as TUV is to what?"), Copycat may produce the answer "TUW" by noting that the last letter in the first analogy is replaced by its successor, and "doing the same thing" to complete the second analogy. In fact, Copycat is able to produce more "creative" analogies to more challenging problems. Some examples include

"ABC:ABD::XYZ:WYZ"

(replacing rightmost with leftmost and successor with predecessor) and

"ABC:ABD::MRRJJJ:MRRJJJJ"

(replacing "alphabetic position" with "length of sub-string"). Clearly Copycat is an intriguing program, and is made more so by Hoftstadter's repeated claim that the program's framework is independent of any particular problem domain.

Looking over Mitchell's LISP code, we found that claims of scalablity and domain independence, while potentially accurate, were clearly not a major factor in the design and implementation of the actual code for the project. Little effort was made to separate the basic framework from problem specific information and routines. The dialect of LISP that was used seems to be rather platform dependent and while LISP is commonly used for AI systems, it is not as widely used within the general programming community. We sought to remedy these problems and others by creating a system of interest to both theoretical computer science and practical business users.

Early in the design phase, it became clear that Hoftstadter's slipnet, a variation of semantic networks, was of greater relevance to database management theory than to the narrowly defined problem Copycat was developed to explore. Hence our Java implementation, entitled the Reactively Ordered Data Model (RODM), is intended, in part, to provide a better Database Management System (DBMS), one that not only surpasses the functionality of the industry standard Structured Query Language (SQL), but is capable of encoding and evaluating higher order predicate calculus.

A Quick Walk Through the Slipnet

All the data in the ROD model is stored in three word sentences of the form "Subject verb object". For example, "I built this_system" is a legal sentence in the ROD model. A collection of these sentences forms the slipnet, a labelled, directed graph which is analogous to a database in traditional relational systems or to the AI construct of frames or semantic networks. Data can be viewed textually as a series of sentences, or graphically as a collection of points representing nouns and objects connected by directed lines, representing verbs or relations. An example of the graphical view of a sample database is shown below:

Some of the sentences embedded in this database include Professor inst_of Community (meaning that the concept of Professor and the concept of Community are related by the fact that professors are part of the community) and Professor has Department which signifies that every professor has the attribute department (every professor belongs to some department). Note that the has relation can be used to create a database schema (or frame) by dictating what attributes an instance of that scheme must specify.

Our application supports the creation, editing, and querying of slipnet data, and can store and retrieve this data as a binary file or data stream (so that network-based database systems can be easily created), or as a text file comprised of three word sentences which can be easily created and interpreted by a human.

Using our query language, RODQL, a user can query the database to retrieve answers to complicated questions like:

"What has anything to do with either Java or Computer Science, and show me all the Math Majors too?"

The result set can be viewed either as a series of true sentences or as a graph, and intermediate results can be held for the construction of even more convoluted queries.

The Slipnet and Beyond

While the design and implementation of the database system has been the focus of the first phase of the project, it is really just the foundation of a larger artificial intelligence framework. Using a heuristically driven search engine to guide RODQL-type queries, we hope to provide a system that can address problems from a wide variety of domains. In this way the ROD model can be used not only as a powerful database management tool (one that includes artificially intelligent search capabilities), but also as a general purpose problem solving application.

Although these extensions to the basic ROD framework have yet to be implemented you'll see that such developments have played a major role in our overall design philosophy throughout the project. Moreover, we find that even without the AI search engine, Hoftstatdter's extensions to the standard semantic network implementation can be used to increase the performance of any database management system.

Back to the top of this page.

Footnote: I would just like to state for the record that the Reactively Ordered Data acronym was invented entirely by Skip, despite my objections. (He is endlessly amused by the fact that my name can fit into an "eight dot three" filename extension.) In compensation, the Java package that encapsulates our parser and related structures is entitled SFC, or Skip Foundation Classes.

--R. Waldhoff