Lehrgebiet Informationssysteme

FB Informatik

  Boris Stumm
(C) AG DBIS
 

Open diploma, project, bachelor and master theses

I currently have various work to be done, ranging from implementation-heavy to conceptual. Details change once in a while, so it is best to come by and ask me directly. Below I have written down some of the open work to give you an impression of what to expect :-)

Note that many of the topics are not really database-heavy, so even if you are not a DBMS expert, you will find something interesting.

You can write the thesis in English or German, whichever you prefer.

Prerequisites

  • The implementation is done in Java 5, so you need some experience with Java and the JDK API. I place as much emphasis on code quality as on functionality.
  • Depending on the topic, knowledge of SQL, JDBC, XML, RDF, OWL or web services might come in handy.
  • All files are managed in a Subversion repository.

Open work (Caro)

To get the most benefit out of existing information, companies try to combine information from different sources by using information integration technologies. This results in highly complex, distributed systems with many dependencies between them.

Because of restructuring, acquisitions, etc., such an information infrastructure is subject to constant change. This "system evolution" normally affects only a few systems directly. But because of the interwoven dependencies, it might have an indirect impact on other systems as well. Possible consequences are data corruption, system failures or inconsistencies that are detected much later. It is very important to keep the number and duration of these consequences to a minimum.

Caro tries to monitor the state of the whole information infrastructure and to analyse the effects of changes to systems, no matter whether they are planned or ad hoc. We make two assumptions here: first, we will have to live with incomplete information, and second, predefined processes will not be adhered to. These assumptions simply reflect human behaviour, which we cannot change. Even under these difficult circumstances, our approach works on a "best effort" basis: we may lose some detail in the analysis, but we can still get approximate results. The more information (ontologies, database schemas, etc.) we have, and the more people adhere to processes (agreements between responsible administrators etc.), the better the results of the analysis are. There is a trade-off between putting more work into enabling Caro to do a good analysis and having more manual work afterwards when problems are detected. In neither case will consequences of a change go undetected, since we always use pessimistic estimates.
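To give an impression of what a pessimistic estimate can look like, here is a minimal sketch. It is not Caro's actual analysis: the class `DependencyGraph`, its methods and the system names are made up for illustration. It assumes we only know which system depends on which, and therefore reports every system reachable from the changed one as possibly affected.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical dependency graph; an edge source -> dependent means "dependent uses source". */
class DependencyGraph {
    private final Map<String, List<String>> dependents = new HashMap<String, List<String>>();

    /** Records that 'dependent' uses data or services of 'source'. */
    void addDependency(String source, String dependent) {
        List<String> list = dependents.get(source);
        if (list == null) {
            list = new ArrayList<String>();
            dependents.put(source, list);
        }
        list.add(dependent);
    }

    /**
     * Pessimistic estimate: without further detail we cannot rule out any
     * system that depends, directly or indirectly, on the changed one.
     */
    Set<String> possiblyAffected(String changed) {
        Set<String> affected = new LinkedHashSet<String>();
        Deque<String> queue = new ArrayDeque<String>();
        queue.add(changed);
        while (!queue.isEmpty()) {
            List<String> next = dependents.get(queue.remove());
            if (next == null) {
                continue;
            }
            for (String system : next) {
                // add() returns false for systems seen before, so cycles terminate
                if (affected.add(system)) {
                    queue.add(system);
                }
            }
        }
        return affected;
    }
}
```

With more metadata (for example, which columns a dependent system actually reads), such an analysis could prune the result; without it, the whole reachable set is reported.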

Development of a metadata repository for versioned storage of metadata

In Caro, we have to store and manage metadata collected from various information systems. The data changes over time, and we have to keep the history of it. Also, the metadata might be changed concurrently by several people or programs, so besides versioning, ACID principles are important. Therefore, a metadata repository (MDR) is needed which can handle these requirements.

Your task consists of the following topics:

  • Evaluate if and how existing versioning systems, like SVN, can be used in the implementation of the MDR.
  • Implement the MDR in Java and, depending on the outcome of the versioning system evaluation, build on such a system.
  • Provide a web service interface to the repository.
  • Provide a local interface for the case that the repository and the accessing application run in the same JVM.
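As a rough idea of what the local interface could look like, here is a sketch under my own assumptions: the names `MetadataRepository` and `InMemoryRepository` and the string-based metadata are purely illustrative. It keeps the full history per key and rejects a commit that is not based on the latest version, so concurrent writers cannot silently overwrite each other. Durability is left out, because everything lives in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical local interface of the MDR; all names are illustrative. */
interface MetadataRepository {
    /** Stores a new version if expectedVersion is the latest one; returns the new version number. */
    int commit(String key, String metadata, int expectedVersion);

    /** Returns the content of the given version (numbering starts at 1). */
    String read(String key, int version);

    /** Returns the latest version number, or 0 if the key is unknown. */
    int latestVersion(String key);
}

/** In-memory stand-in: versioned and atomic within one JVM, but not durable. */
class InMemoryRepository implements MetadataRepository {
    private final Map<String, List<String>> history = new HashMap<String, List<String>>();

    public synchronized int commit(String key, String metadata, int expectedVersion) {
        int latest = latestVersion(key);
        if (latest != expectedVersion) {
            // somebody else committed in the meantime: the caller has to re-read and retry
            throw new IllegalStateException(
                    "conflict on " + key + ": latest version is " + latest);
        }
        List<String> versions = history.get(key);
        if (versions == null) {
            versions = new ArrayList<String>();
            history.put(key, versions);
        }
        versions.add(metadata);
        return versions.size();
    }

    public synchronized String read(String key, int version) {
        List<String> versions = history.get(key);
        if (versions == null || version < 1 || version > versions.size()) {
            throw new IllegalArgumentException("no version " + version + " of " + key);
        }
        return versions.get(version - 1);
    }

    public synchronized int latestVersion(String key) {
        List<String> versions = history.get(key);
        return versions == null ? 0 : versions.size();
    }
}
```

A real MDR would put persistent storage (or an existing versioning system, depending on the evaluation) behind the same interface.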

Development of a SQL meta model for the change impact system description model (CISDM) of Caro

The internal data model of Caro is a typed digraph, which allows arbitrary metadata to be described and stored. The meta-model is partitioned into a specific part for change impact analysis (CIA) and parts for each data model that is to be stored. Basically, the meta-model defines node and edge types like "Compound", "hasPart" and "Part" on the CIA level, and types like "Table", "hasColumn" and "Column" in the SQL-specific part.

Your task consists of the following topics:

  • Create a type hierarchy capturing the most important SQL constructs.
  • Create restrictions on node and edge types to prevent inconsistencies in metadata graphs.
  • According to the insights gained while developing the SQL meta model, refine and improve the CISDM.

The meta model is described in the Web Ontology Language (OWL), so basic knowledge of RDF and OWL is recommended for this work.
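To illustrate the idea of a typed digraph with restrictions on edge types, here is a small Java stand-in. The actual meta model is written in OWL, so this is only an analogy: the class `TypedGraph` and its methods are invented for this example, and a type hierarchy (subtypes) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative typed digraph; not the Caro data model itself. */
class TypedGraph {
    /** Edge type -> {required source node type, required target node type}. */
    private final Map<String, String[]> edgeRules = new HashMap<String, String[]>();
    /** Node id -> node type. */
    private final Map<String, String> nodeTypes = new HashMap<String, String>();
    /** Stored edges as {from, edge type, to}. */
    private final List<String[]> edges = new ArrayList<String[]>();

    /** Declares that edges of this type may only connect the given node types. */
    void allowEdge(String edgeType, String sourceType, String targetType) {
        edgeRules.put(edgeType, new String[] { sourceType, targetType });
    }

    void addNode(String id, String type) {
        nodeTypes.put(id, type);
    }

    /** Adds an edge, rejecting it if it violates the declared restriction. */
    void addEdge(String from, String edgeType, String to) {
        String[] rule = edgeRules.get(edgeType);
        if (rule == null) {
            throw new IllegalArgumentException("unknown edge type: " + edgeType);
        }
        if (!rule[0].equals(nodeTypes.get(from)) || !rule[1].equals(nodeTypes.get(to))) {
            throw new IllegalArgumentException(
                    edgeType + " must connect " + rule[0] + " to " + rule[1]);
        }
        edges.add(new String[] { from, edgeType, to });
    }

    int edgeCount() {
        return edges.size();
    }
}
```

For example, after `allowEdge("hasColumn", "Table", "Column")`, an edge from a "Column" node to a "Table" node is rejected, which is the kind of inconsistency the restrictions are meant to prevent.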

Development of an XML meta model for the change impact system description model of Caro

This is comparable to the SQL meta model described above.

Development of a Framework for metadata agents (MDAs)

Metadata agents in Caro are responsible for several different tasks:

  • Observation of information systems for changes, either periodically or continuously.
  • Metadata extraction from systems, and transformation into the Caro metadata format.
  • Notification of the change management component if changes are detected.
  • Manual editing of metadata.
  • Control of the observed information system in case of problems.
  • Communication with users if needed.

Most of these tasks are very specific to the observed systems and so cannot be implemented in a generic way. However, it is important to provide an MDA framework that offers as much generic functionality as possible and thus reduces the effort needed to adapt MDAs to specific information systems.

Your task consists of the following topics:

  • Conceptual development of the MDA framework
  • Implementation
  • Prototypical implementation of specific plugins as proof-of-concept
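One possible way to split generic from system-specific functionality is sketched below. This is only an assumption about how such a framework might look: `MetadataAgent`, `MetadataChangeListener` and the string-based metadata format are hypothetical. The base class handles change detection and notification, while a subclass only has to say how metadata is extracted from its system.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback, e.g. implemented by the change management component. */
interface MetadataChangeListener {
    void metadataChanged(String systemName, String newMetadata);
}

/** Generic part of an agent; the system-specific part is extractMetadata(). */
abstract class MetadataAgent {
    private final String systemName;
    private final List<MetadataChangeListener> listeners =
            new ArrayList<MetadataChangeListener>();
    private String lastSnapshot; // null until the first observation

    MetadataAgent(String systemName) {
        this.systemName = systemName;
    }

    void addListener(MetadataChangeListener listener) {
        listeners.add(listener);
    }

    /** System-specific: extract metadata and transform it into a common format. */
    protected abstract String extractMetadata();

    /**
     * Generic: one observation cycle. The first call only records a baseline;
     * later calls notify all listeners if the metadata differs from the last snapshot.
     */
    boolean poll() {
        String current = extractMetadata();
        boolean changed = lastSnapshot != null && !lastSnapshot.equals(current);
        lastSnapshot = current;
        if (changed) {
            for (MetadataChangeListener listener : listeners) {
                listener.metadataChanged(systemName, current);
            }
        }
        return changed;
    }
}
```

Periodic observation would then just mean calling `poll()` from a timer; continuous observation, manual editing and control of the observed system would need further extension points.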

Development of specific metadata agents

Once the MDA framework described above exists, we need implementations for several data models, such as SQL, XML, file systems, web services, etc. Your task is to provide these implementations.

Development of a change manager for Caro

The change manager is the central component of Caro, responsible for the coordination of MDAs and the metadata repository, as well as for the analysis. Core analysis functionality already exists, but various management and communication functions are still missing.

Your task consists of the following topics:

  • Designing the change manager component and its interfaces to other components
  • Implementation
  • Setting up a scenario as proof of concept

Open work (GraphEdit)

A recent project of ours is GraphEdit, an editor for arbitrary graph models. The distinguishing feature of GraphEdit is its abstraction capabilities. Depending on a user-provided style sheet, it can abstract from the underlying graph model and show just the relevant parts in a user-friendly manner.

Graphs are manipulated with "edit operations", which are user-provided rules specifying where and how the graph can be modified. This way, the end user can edit the graph with high-level operations and does not have to worry about "nodes" and "edges".
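To make the idea of an edit operation more concrete, here is a sketch. In GraphEdit the rules are provided by the user; the hard-coded Java classes below (`SimpleGraph`, `EditOperation`, `AddColumn`) are hypothetical and only illustrate the "where" (a precondition) and the "how" (the modification) of such a rule.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Minimal graph stand-in: node id -> node type, plus edges as {from, label, to}. */
class SimpleGraph {
    final Map<String, String> nodeTypes = new HashMap<String, String>();
    final List<String[]> edges = new ArrayList<String[]>();
}

/** Hypothetical shape of an edit operation: where it applies and what it does. */
interface EditOperation {
    boolean isApplicable(SimpleGraph graph, String selectedNode);

    void apply(SimpleGraph graph, String selectedNode, String argument);
}

/** High-level operation "add a column to a table", hiding the node and edge handling. */
class AddColumn implements EditOperation {
    public boolean isApplicable(SimpleGraph graph, String selectedNode) {
        return "Table".equals(graph.nodeTypes.get(selectedNode));
    }

    public void apply(SimpleGraph graph, String selectedNode, String columnName) {
        if (!isApplicable(graph, selectedNode)) {
            throw new IllegalArgumentException(selectedNode + " is not a table");
        }
        String columnId = selectedNode + "." + columnName;
        graph.nodeTypes.put(columnId, "Column");
        graph.edges.add(new String[] { selectedNode, "hasColumn", columnId });
    }
}
```

The end user would only see "add column" on a selected table; the new node and the connecting edge are created behind the scenes.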

Development of style sheets for GraphEdit

GraphEdit is applicable to a wide range of applications if the corresponding style sheets are provided. We have already implemented a prototypical SQL DDL editor (yes, even SQL can be represented as a graph :-), but we strive for more, for example:

  • A UML editor
  • An E/R editor
  • A comfortable FOAF viewer
  • An OWL/RDF editor
  • And much more :)

The goal of this work is to estimate the effort it takes to write style sheets for a certain application, and to identify the parts of the editor that still need improvement.

Generating code from GXL files

A graph editor is well suited for editing, but in general the graph has to be serialized sooner or later. This could be the serialization of an E/R diagram into SQL code, or of a UML diagram into Java code. It is often required that the generated code remains human-readable, e.g., for debugging. Currently, GraphEdit just writes out graphs as GXL files; GXL is an XML-based format for graph representation.

Your task consists of the following topics:

  • Conceptually design a framework that allows code generation from GXL files.
  • Implementation.
  • Prototypical implementation of a specific code generator, e.g., for E/R diagrams.
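To show the kind of transformation meant here, the following sketch turns a tiny GXL graph into SQL DDL. It is a hand-written one-off, not the framework asked for above, and it rests on several assumptions of mine: nodes carry `kind`, `name` and `type` attributes with `<string>` values, an edge from a table node to a column node means "has column", and the input has no DOCTYPE declaration. Only the GXL elements `node`, `edge` (with `from` and `to`) and `attr` are used.

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Illustrative one-off generator: GXL graph of tables and columns -> CREATE TABLE statements. */
class GxlToSql {

    /** Returns the <string> value of the <attr> child with the given name, or null. */
    static String attr(Element node, String name) {
        NodeList attrs = node.getElementsByTagName("attr");
        for (int i = 0; i < attrs.getLength(); i++) {
            Element a = (Element) attrs.item(i);
            if (name.equals(a.getAttribute("name"))) {
                return a.getElementsByTagName("string").item(0).getTextContent().trim();
            }
        }
        return null;
    }

    static String generate(String gxl) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(gxl.getBytes("UTF-8")));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable GXL document: " + e);
        }
        // Index all nodes by id and prepare a column list for every table node.
        Map<String, Element> nodes = new LinkedHashMap<String, Element>();
        Map<String, List<String>> columns = new LinkedHashMap<String, List<String>>();
        NodeList nodeList = doc.getElementsByTagName("node");
        for (int i = 0; i < nodeList.getLength(); i++) {
            Element n = (Element) nodeList.item(i);
            nodes.put(n.getAttribute("id"), n);
            if ("Table".equals(attr(n, "kind"))) {
                columns.put(n.getAttribute("id"), new ArrayList<String>());
            }
        }
        // Every edge from a table to a column contributes one column definition.
        NodeList edges = doc.getElementsByTagName("edge");
        for (int i = 0; i < edges.getLength(); i++) {
            Element edge = (Element) edges.item(i);
            List<String> cols = columns.get(edge.getAttribute("from"));
            Element target = nodes.get(edge.getAttribute("to"));
            if (cols != null && target != null && "Column".equals(attr(target, "kind"))) {
                cols.add(attr(target, "name") + " " + attr(target, "type"));
            }
        }
        // Emit one readable CREATE TABLE statement per table.
        StringBuilder sql = new StringBuilder();
        for (Map.Entry<String, List<String>> table : columns.entrySet()) {
            sql.append("CREATE TABLE ").append(attr(nodes.get(table.getKey()), "name")).append(" (\n");
            List<String> cols = table.getValue();
            for (int i = 0; i < cols.size(); i++) {
                sql.append("  ").append(cols.get(i)).append(i < cols.size() - 1 ? ",\n" : "\n");
            }
            sql.append(");\n");
        }
        return sql.toString();
    }
}
```

A framework would replace the hard-coded traversal and output format with something configurable per target language, which is exactly the conceptual part of the task.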

Extending GraphEdit to support complex node objects

In principle, GraphEdit can use arbitrary JComponent subclasses, e.g. JTables or JTrees, to display complex node structures. However, it is not trivial to use these widgets and maintain consistency between the view and the underlying graph model. Your task here is to evaluate what changes are necessary to enable the use of these advanced widgets, and to implement them.