Publications & Invited Talks

Selected Publications

Hartmut Liefke and Susan Davidson. View Maintenance for Hierarchical Semistructured Data. (PDF, PS)
In Proceedings of International DAWAK Conference, Greenwich, UK, September 2000.

Hartmut Liefke and Dan Suciu. An Extensible Compressor for XML Data. (PDF, PS)
In ACM SIGMOD Record, Volume 29, Number 1, March 2000.

Hartmut Liefke and Dan Suciu. An Efficient Compressor for XML Data. (PDF, PS) (Best Paper Award)
In Proceedings of ACM SIGMOD International Conference on Management of Data, Dallas, USA, June 2000.
(Full Version as PDF or PS)

Hartmut Liefke and Susan Davidson. Efficient View Maintenance in XML Data Warehouses. (PDF, PS)
Technical Report MS-CIS-99-27, University of Pennsylvania, Philadelphia, USA (1999)

Hartmut Liefke and Susan B. Davidson. Specifying Updates in Biomedical Databases. (PDF, PS)
In Proceedings of International Conference on Scientific and Statistical Database Management. Cleveland, USA, July 1999.
(Full Version as PDF or PS)

Hartmut Liefke. Horizontal Query Optimization on Ordered Semistructured Data. (PDF, PS)
In Proceedings of ACM SIGMOD International Workshop on the Web and Databases. Philadelphia, USA, June 1999.

Hartmut Liefke and Susan B. Davidson. Processing Updates on Complex Value Databases. (PDF, PS)
In Proceedings of Information Resource Management Association International Conference. Hershey, USA, May 1999.

Hartmut Liefke and Susan Davidson. Efficient View Maintenance in XML Data Warehouses. (PDF, PS)
Technical Report MS-CIS-99-11, University of Pennsylvania, Philadelphia, USA (1999)

Hartmut Liefke and Susan Davidson. An Execution Model for CPL+. (PDF, PS)
Technical Report MS-CIS-98-29, University of Pennsylvania, Philadelphia, USA (1998)

Peter Buneman, Alin Deutsch, Wenfei Fan, Hartmut Liefke, Arnaud Sahuguet, and Wang-Chiew Tan.
Beyond XML Query Languages. (PDF, PS)
In W3C Query Languages Workshop, Boston, USA, December 1998.

Hartmut Liefke. Quaternion Calculus for Modeling Rotations in 3D Space. (PDF, PS)
Preliminary PhD Exam, University of Pennsylvania, 1998

Talks

The following is a list of some of my talks. The list does not always include the most recent talks and some of the talks from 1996-98 are missing too.

IT Architectures and B2B Integration at BMW Group
November 24, 2003, University of Washington, Seattle, USA
IT Architectures and B2B Integration at BMW Group
September 15, 2003, Hong Kong University of Science & Technology, Computer Science Department, Hong Kong
XML in eCommerce Anwendungen: Erfahrungen und Lösungen [XML in E-Commerce Applications: Experiences and Solutions]
June 12, 2001, Darmstadt XML Congress, Darmstadt, Germany
XML - Ein Segen für eCommerce Anwendungen? [XML: A Blessing for E-Commerce Applications?]
January 16, 2001, Dresden University of Technology, Computer Science Department, Germany
XMILL: An Efficient Compressor for XML Data
May 17, 2000, ACM SIGMOD International Conference, Dallas, Texas, USA
XML - Eine Datenbank Revolution? [XML: A Database Revolution?]
January 10/11, 2000, Magdeburg University and Free University Berlin, Germany

Abstract (translated from German): XML is a new data format for representing and exchanging data over the Internet. Compared with conventional, vendor-specific data formats, XML stands out in two respects: it is standardized and flexible. Beyond data exchange, XML is increasingly seen as a basis for storing data. This, however, requires the development of efficient methods for accessing and manipulating XML data.
In research, XML has been identified as an instance of "semistructured data". Query languages, indexing and optimization techniques, and physical storage models have been developed and applied to XML. This talk gives an overview of current developments and open problems in XML research.
An Efficient Compressor for XML
December 14, 1999, AT&T Labs Research, Florham Park, NJ
An Efficient Compressor for XML
October 18, 1999, Graduate Research Symposium, University of Pennsylvania
October 5, 1999, DB Seminar, University of Pennsylvania
August 25, 1999, AT&T Labs Research, Florham Park, NJ
An Efficient Compressor for XML
August 24, 1999, Bell-Labs Lucent Technologies, Murray Hill, NJ

Abstract: XML is becoming an increasingly popular standard for representing and storing documents and for transporting data over the Internet. The amount of data available in XML is growing rapidly, and efficient transport and storage techniques are necessary. One such technique is compression. Conventional compressors, such as Lempel-Ziv or Huffman encoding, achieve reasonable compression. However, they do not consider the specific syntax and semantics of XML and thus miss several opportunities for compression.
In this talk, we will describe a special-purpose compressor for XML, called XMLzip, that improves on general-purpose ones. The main component of XMLzip is a clustering technique that groups data elements together before applying conventional data compression to them. Depending on the type of XML data to be compressed, the user can choose between default clustering techniques or define their own clustering strategies using regular expressions. Furthermore, XMLzip allows the user to add domain-specific compressors for complex text structures, such as URLs or dates, which can further improve the compression rate.
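The clustering idea in the abstract above can be illustrated with a minimal Python sketch. It is not the XMLzip implementation: it assumes that text values are grouped simply by the name of their enclosing element, that the input has no attributes, no text outside the root element, and no newlines inside values, and it uses zlib as a stand-in for the conventional compressor. The names SAMPLE, cluster, compress, and decompress are hypothetical and exist only for this illustration.

```python
import re
import zlib
from collections import defaultdict

# Hypothetical sample document, made up for this sketch.
SAMPLE = (
    "<log>"
    "<entry><host>a.example.org</host><date>1999-08-24</date></entry>"
    "<entry><host>b.example.org</host><date>1999-08-25</date></entry>"
    "</log>"
)

# Matches an opening tag, a closing tag, or a run of text (no attributes).
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)>|([^<]+)")


def cluster(xml):
    """Split XML into a structure stream and one text container per tag."""
    structure = []                   # tag names, "/" for close, "#" for text
    containers = defaultdict(list)   # enclosing tag name -> text values
    open_tags = []
    for closing, name, text in TOKEN.findall(xml):
        if text:
            containers[open_tags[-1]].append(text)
            structure.append("#")
        elif closing:
            open_tags.pop()
            structure.append("/")
        else:
            open_tags.append(name)
            structure.append(name)
    return structure, dict(containers)


def compress(xml):
    """Compress the structure stream and each container separately."""
    structure, containers = cluster(xml)
    # The empty key holds the structure stream; tag names are never empty.
    packed = {"": zlib.compress(" ".join(structure).encode("utf-8"))}
    for tag, values in containers.items():
        packed[tag] = zlib.compress("\n".join(values).encode("utf-8"))
    return packed


def decompress(packed):
    """Rebuild the document by replaying the structure stream."""
    structure = zlib.decompress(packed[""]).decode("utf-8").split(" ")
    values = {
        tag: iter(zlib.decompress(blob).decode("utf-8").split("\n"))
        for tag, blob in packed.items() if tag
    }
    out, open_tags = [], []
    for token in structure:
        if token == "#":
            out.append(next(values[open_tags[-1]]))
        elif token == "/":
            out.append("</" + open_tags.pop() + ">")
        else:
            open_tags.append(token)
            out.append("<" + token + ">")
    return "".join(out)


if __name__ == "__main__":
    packed = compress(SAMPLE)
    print(sorted(packed))                 # container names
    print(decompress(packed) == SAMPLE)   # round-trip check
```

Grouping similar values (all host names, all dates) into the same container gives the back-end compressor more homogeneous input. Whether this pays off depends on the data, and on an input as small as the sample the per-container overhead will dominate.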
Specifying Updates in Biomedical Databases
July 28, 1999, Intl. Conference on Scientific and Statistical Database Management, Cleveland

Abstract: Many of the publicly available biomedical data sources -- such as Genbank and SwissProt -- are not stored in traditional databases but in a variety of file formats (e.g. ASN.1 and EMBL). The data is complex, involving deeply nested structures. While query languages for such data have been well-studied, the issue of updating such databases has not. The need for a concise update language is critical since the changes to the data are typically very small when compared to the entire value.
Starting with a query language called Collection Programming Language (CPL), we describe an extension called CPL+ which provides an intuitive framework for updates on complex values. We illustrate the language using examples and present various optimizations that can substantially improve the performance of complex updates.
Horizontal Query Optimization on Ordered Semistructured Data
June 4, 1999, ACM SIGMOD Workshop on The Web and Databases (WebDB'99), Philadelphia, Pennsylvania

Abstract: The exchange and storage of XML data is becoming increasingly important. In contrast to conventional semistructured data, the labels in a document-oriented representation such as XML are ordered. Furthermore, regular expressions (DTDs) describe the horizontal (and vertical) structure of the data. Traditional query languages for semistructured data ignore the horizontal order and are therefore limited in their expressiveness and optimizability.
We describe a query language for ordered semistructured data that provides primitives for specifying more powerful queries. Furthermore, we describe how horizontal type information in DTDs is used to optimize queries based on finite automata.
Processing Updates on Complex Value Databases
May 16, 1999, IRMA International Conference, Hershey, Pennsylvania

Abstract: Query languages and optimization techniques for complex and object-oriented databases have been extensively studied by the database community. Languages for updating such databases, however, have not been studied to the same extent, although they are clearly important since databases change over time. We have therefore developed a language for updating complex value databases called CPL+. The syntax of CPL+ is concise and optimizable.
To argue that the rewrite rules produce updates with fewer accesses and updates to stored values, we present an execution model for CPL+ and an abstract storage model for a complex value database. We develop the notion of a workspace - i.e. the set of persistent objects that are accessed or updated within an update. Based on this measurement, we illustrate how rewriting of update expressions can reduce the cost of updates.
XML-Extensions: XLink, XPointer, Namespaces, XML-Data, DCD
May 5, 1999, Lecture Series "XML and Beyond", University of Pennsylvania

Abstract: This lecture was given as part of the "XML and Beyond" course organized by the Database Research Group for academic and industrial interest groups.
Horizontal Query Optimization on Ordered Semistructured Data
March 18, 1999, Database Research Seminar

Abstract: This is an extended version of the WebDB'99 talk.
An Introduction to O₂ (PS-Slides)
1998, Database Research Seminar

Abstract: O₂ is a major object-oriented database management system, which has become increasingly popular in the last few years.
In this (rather application-oriented) talk, I will give a general introduction to the system. I will describe important features of O₂, the system architecture and its components. The underlying data model and the supported programming and query languages (ODMG C++, O₂C, and OQL) will be discussed in detail by using examples. Furthermore, I will give a brief overview of system management, transaction management, and administration.
Quaternion Calculus for Modeling Rotations in 3D Space (PS-Slides)
May 13, 1998, Preliminary Exam Defense

Abstract: Rotations in three-dimensional space have been of great interest in various scientific communities. The set of rotations forms the multiplicative group SO(3). Different parameterizations of SO(3), such as Euler angles, quaternions, and Cayley matrices, have been analyzed and are frequently used in computer graphics.
Quaternions are one of the most natural and concise ways of representing rotations in 3D space. In this paper, we describe the algebraic and geometric foundations of quaternions. Interestingly, it turns out that it is possible to model rotations as elements of S³, the unit sphere in the four-dimensional space. Indeed, the group of rotations SO(3) has the topology of the real projective space RP³.
While the geometric and algebraic characteristics of single rotations have been well-studied, the representation of curves on the spherical space SO(3) has only recently been investigated. One of the most intriguing problems connected to rotations is the interpolation between given snapshots (or keyframes) at different times. In Euclidean space, polynomials can easily be used for interpolation. Unfortunately, in the curved space of rotations SO(3), methods for interpolation are more challenging. This paper presents the most important approaches for interpolating rotations and modeling curves on the spherical space S³, such as spherical Bezier curves.
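As a small illustration of the ideas in this abstract, the following Python sketch represents rotations as unit quaternions (points on S³) and interpolates between two keyframes with spherical linear interpolation (slerp). Slerp is a standard technique and is used here as an assumption; the abstract does not say which interpolation schemes the paper covers beyond spherical Bezier curves. The function names and the (w, x, y, z) component order are choices made for this sketch only.

```python
import math


def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def q_from_axis_angle(axis, angle):
    """Unit quaternion for a rotation by `angle` radians about `axis`."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    s = math.sin(angle / 2) / norm
    return (math.cos(angle / 2), x * s, y * s, z * s)


def rotate(q, v):
    """Rotate the 3D vector v by the unit quaternion q: q * (0, v) * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = q_mul(q_mul(q, (0.0, *v)), conj)
    return (rx, ry, rz)


def slerp(q0, q1, t):
    """Interpolate along the great arc on S^3 between unit quaternions."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q and -q describe the same rotation; flip to take the shorter arc.
        q1 = tuple(-c for c in q1)
        dot = -dot
    theta = math.acos(min(1.0, dot))
    if theta < 1e-9:
        return q0  # keyframes (nearly) coincide
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))


if __name__ == "__main__":
    identity = (1.0, 0.0, 0.0, 0.0)
    quarter_turn = q_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
    print(rotate(quarter_turn, (1.0, 0.0, 0.0)))
    print(slerp(identity, quarter_turn, 0.5))
```

The sign flip in slerp reflects the fact that q and -q represent the same rotation, which is the double cover of SO(3) by S³ behind the RP³ topology mentioned above.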
Die Abbildung von Datenbankschemata in Wissensbasen [Mapping Database Schemas into Knowledge Bases] (WinWord-File)
March 11, 1998, Invited Talk at Chemnitz University of Technology, Germany

Abstract (translated from German): Knowledge representation in artificial intelligence is often based on description logics. Knowledge bases built on such logics are, in particular, easy to read and extend, and new knowledge can be inferred from them.
The work underlying this talk deals with mapping relational and object-oriented databases into description logics. This makes it possible to combine the strengths of commercial databases (high reliability, efficient query processing) with the advantages of knowledge-based systems (extensibility, simple queries, logical inference). The talk covers the mapping of schemas to concept descriptions as well as the transformation and optimization of queries.
Furthermore, possible applications of such mappings within agencies and agency networks are described, and the difficulties of representing knowledge in distributed systems are addressed. In addition, the possibility of negotiations over the objects in the knowledge base is discussed.
HERMES: A Heterogeneous Reasoning and Mediator System (PS-Slides)
1997, CIS650 Seminar Presentation

Abstract: HERMES is a system for integrating heterogeneous databases and reasoning paradigms. It is under development at the University of Maryland.
Component Objects and Component Software (PS-Slides)
1997, Database Research Seminar

Abstract: The main goal of object and software components is to distribute software and objects over networks with a certain degree of abstraction. Important objectives are system independence and location transparency.
In the talk I want to give an overview of current approaches in this diverse and rapidly changing market. Firstly, I will briefly discuss the historic approach of Remote Procedure Call (RPC) and its data description language. Secondly, the object-oriented Common Object Request Broker Architecture (CORBA) and some of its implementations will be described. Lastly, I will give an introduction to Microsoft's Component Object Model (COM) and its (in)famous application: Object Linking and Embedding (OLE).
Issues about Object-Oriented Modeling (PS-Slides)
1997, Database Research Seminar

Abstract: Object-orientation has gained ground over the last years, and today's software engineering is unthinkable without it. In the talk I want to present a very general definition of object-orientation and show how real-world OO systems restrict it, leading to a major lack of modeling power. Connected to this, I will also make a short excursion into knowledge representation in artificial intelligence, which shows some interesting similarities. Furthermore, I will introduce a concept called configurations, which leads to an extension of the conventional object-oriented inheritance scheme, and I will show how this would improve the modeling power of conventional programming languages. I will discuss major ideas in object-oriented software engineering, such as automated software and document generation and component software, and important concepts such as COM, CORBA, and OLE.