GraphDuce is a joint project between the CDuce team and Brixlogic, a start-up company that provides XML / Web Services business-layer software solutions for the financial services market.
The project is financed by the Réseau National de recherche et d'innovation en Technologies Logicielles (RNTL) and aims at ...
More information about the project can be found on the following page: GraphDuce (in French).
This ACI is motivated by the increasing number of applications that produce, consume or handle large sets of data, or "datamasses". In many cases, these are either raw data or a collection of data from various sources, both of which lack uniform descriptive criteria. Such cases require more flexibility than the classical relational model can provide, and have given rise to the so-called semi-structured data model, of which XML is one of the most prominent examples.
Our project intends to study the processing, querying and handling of large datamasses whenever data is available in XML format. We pay particular attention to the problems raised by programming languages and query languages. We aim to cover in a uniform way a wide spectrum of areas, namely: programming languages (expressiveness, typing, new programming primitives, logics underlying queries, logical optimization), data access (streamed data, compression, access to secondary storage, persistence engines), and implementation (compilation of pattern matching, physical optimization, subtyping verification, execution models for streamed data).
We will tackle these challenges following three research directions:
- query languages: one of the characteristics of the relational model is to
base query languages on the relational algebra or the relational calculus.
These are paradigms characterized by high declarativity (in the
sense that they describe the result rather than the way to obtain it)
and limited expressiveness (notably, they are not Turing complete). The
"simplicity" of these languages is at the origin of their good
performance, which can be further improved by using the algebraic
properties of the operators (logical optimization) or by secondary memory
management techniques (physical optimization). Our goal is to develop a
similar, or at least close, framework for the XML model, and we will
pursue it as follows: theoretical study of the expressiveness and
complexity of the query languages; definition of query languages for XML
and their implementation; definition and validation of optimization techniques.
- streaming: the ability to process streams of data without storing
whole documents (or storing them only partially) is crucial in the
context of datamasses. We will also consider streaming when the data is
compressed. One of the main difficulties to overcome here is to identify
a suitable class of "streamable" queries, with or without compression,
and in the former case to determine the optimal compression granularity.
- document typing: type systems are used in the first place for
document validation and for checking integrity constraints, but as
with standard programming languages, types are at the basis of many
helpful optimizations. This makes the study of typing systems one of
our primary objectives.
Another motivation for this line of work is our interest in integrity constraints whose satisfaction does not depend on the ordering of the fields in a document, unlike the constraints expressible in "classical" type systems for XML such as DTDs. This is a natural choice when processing data originating from the fusion of several relational databases (a frequent instance of large documents), since the order of the fields is then irrelevant.
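As an illustration of the streaming and order-independent typing directions above, here is a minimal Python sketch. It is not CDuce code and not the project's method: the element name "record", the field names and the sample document are assumptions made up for the example. It reads a document incrementally with the standard library's iterparse, checks each record's fields as a multiset (so their order does not matter), and discards each record once checked.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical constraint: every <record> holds exactly one of each field,
# in any order. The field names are illustrative, not taken from the project.
REQUIRED_FIELDS = {"name", "date", "amount"}


def record_is_valid(record):
    """Check the fields of one <record> as a multiset, ignoring their order."""
    counts = Counter(child.tag for child in record)
    return set(counts) == REQUIRED_FIELDS and all(n == 1 for n in counts.values())


def count_invalid_records(source):
    """Stream over the document, validating one record at a time."""
    invalid = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "record":
            if not record_is_valid(elem):
                invalid += 1
            elem.clear()  # drop the record's children to limit memory use
    return invalid


# Made-up sample: the second record is valid despite its field order,
# the third one lacks a <date> field.
SAMPLE = b"""<records>
  <record><name>a</name><date>2005-01-01</date><amount>10</amount></record>
  <record><amount>7</amount><name>b</name><date>2005-01-02</date></record>
  <record><name>c</name><amount>3</amount></record>
</records>"""

invalid = count_invalid_records(io.BytesIO(SAMPLE))
```

A DTD content model such as (name, date, amount) would reject the second record because of its field order, whereas the multiset check accepts it.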
The groups involved in our project have each already been working separately on XML document handling, although this is only one of the incentives for us to work together. Indeed, we share the same fundamental theoretical approach, namely automata theory and the associated logics, and the same interest in query languages and document validation (typing, integrity constraints). Beyond our agreement on foundational tools and on goals, cooperation inside the project is further strengthened by the choice of a single software target, the CDuce language, a joint development of LIENS and LRI, two of the sites involved in this project.
More information about the project can be found on the following page: XML Transformation Languages: logic and applications (TraLaLA) (in French).
MyThS (Models and Types for Security in Mobile Distributed Systems, Contract IST-2001-32617) is a project funded by the Information Society Technologies (IST) Programme of the European Union. It is a collaboration between the Ecole Normale Superieure de Paris, the University of Sussex, and the University Ca' Foscari of Venice.
MyThS seeks to develop type-based foundational theories of security for mobile and distributed systems. By relying on strong typing as the basic principle, MyThS addresses the foundations of programming languages and paradigms that allow static detection of security violations, and aims at developing type theoretic methods and tools that enable formal analyses of security guarantees appropriate for systems and applications on the global computing platform.
More information about the project can be found on the following page: MyThS.
Preserving the confidentiality and integrity of data hosted in multiple distributed sources (personal, administrative, healthcare, business or scientific data) constitutes a tremendous challenge for the database community. Unfortunately, existing access control models implemented in Database Management Systems (DBMS) exhibit important weaknesses. First, existing models are unable to tackle the complexity of distributed and decentralized organizations as well as the growing diversity of channels to access the information. Second, while the semantics of access control policies is well established when applied to relational data, things become fuzzier when semi-structured and hierarchical data, like XML documents, are considered. Third, existing models suffer from a centralized access rights administration, making them more vulnerable to both internal and external attacks (according to the FBI computer crime and security report, more than 50% of database attacks are conducted by insiders). The goal of the CASC project is to address these three important issues: how to tackle complex distributed organizations, how to define accurate access control policies on XML-like data and how to secure the global architecture against attacks.
Several access right models have been proposed in the literature (the best known being DAC, MAC and RBAC), and existing DBMSs mix concepts from different models in the same implementation. The resulting models are not always well formalized, so some situations are complex to model and may lead to unexpected information leakage. In this context, we proposed a formal access right model, called ORBAC (Organization Based Access Control model), that encompasses all the concepts required to express a security policy in complex distributed organizations. Its generality and formal foundation make this model the best candidate to serve as a common framework in this project. The work plan will be divided into three tasks, each of them addressing one of the aforementioned issues:
- Extending ORBAC towards distributed architectures. The objective is to extend ORBAC with the concepts required to deploy and administer the model in distributed organizations. More precisely, the following problems have to be addressed: consistency of the access rules to be deployed, distribution of the access right control, distribution of the access right administration and characterization of the trusted components that need to be integrated in the global architecture to secure it.
- Instantiating ORBAC in the XML context. XML is now a de facto standard to exchange data on the Internet. To date, little attention has been paid to the definition of an access right model for XML, and all proposals suffer from important drawbacks. Defining a coherent and powerful access right model for XML is still an open issue that slows down the deployment of many Web applications dealing with sensitive data. The ORBAC model provides a sound basis to define such a model. While ORBAC is agnostic with respect to the data model, interesting and difficult problems are foreseen in the translation of ORBAC concepts into XML concepts.
- Chip-Secured XML-ORBAC architecture. Whatever the expressive power of an access right model, it remains inoperative against attacks directed at the database footprint on disk by an intruder and against the actions of an ill-intentioned Database Administrator (DBA) (a DBA has enough privileges to change the access right policy or to tamper with the access right management). Our objective is to study how data encryption and secured hardware components (e.g., smartcards or tokens) could be exploited to secure the control and the administration of ORBAC access right rules applied to XML documents.
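To make the kind of rule discussed in these tasks more concrete, here is a minimal Python sketch of an organization-based permission check. It assumes the abstraction usually presented in the Or-BAC literature, where an organization grants a role permission to perform an activity on a view in a given context, and where subjects, actions and objects are mapped to roles, activities and views. All class, attribute and sample names are hypothetical, and the handling of contexts is simplified to a set of context names that hold for the request. This is not the CASC implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permission:
    """Abstract rule: org grants role the right to do activity on view in context."""
    org: str
    role: str
    activity: str
    view: str
    context: str


class Policy:
    """Hypothetical container for abstract rules and concrete-to-abstract mappings."""

    def __init__(self):
        self.permissions = set()
        self.empower = set()   # (org, subject, role): subject plays role in org
        self.consider = set()  # (org, action, activity): action implements activity
        self.use = set()       # (org, obj, view): obj belongs to view

    def is_permitted(self, subject, action, obj, active_contexts):
        """Derive a concrete permission from the abstract rules.

        active_contexts is a simplification: the set of context names
        assumed to hold for this particular request.
        """
        for p in self.permissions:
            if ((p.org, subject, p.role) in self.empower
                    and (p.org, action, p.activity) in self.consider
                    and (p.org, obj, p.view) in self.use
                    and p.context in active_contexts):
                return True
        return False


# Made-up example policy.
policy = Policy()
policy.permissions.add(
    Permission("hospital", "physician", "consult", "medical_record", "working_hours"))
policy.empower.add(("hospital", "alice", "physician"))
policy.consider.add(("hospital", "read", "consult"))
policy.use.add(("hospital", "file42", "medical_record"))

allowed = policy.is_permitted("alice", "read", "file42", {"working_hours"})
denied = policy.is_permitted("alice", "read", "file42", set())
```

In the XML setting, the objects grouped into a view would presumably be document nodes or fragments rather than whole files; how to define such views is the open question named in the second task.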
This project may lead to significant advances in the following areas: (1) abstraction of the fundamental concepts required in any access right model and formalization of the associated administration procedures, (2) definition of a powerful and sound access right model for XML documents and (3) definition of chip-secured data access and administration architectures. In addition, this combination of research efforts around the ORBAC model makes it possible to investigate a complete and general solution to secure distributed confidential data.
This ambitious objective could be reached thanks to the complementary skills of the CASC partners, namely: formalization of access right models integrating the concepts of organization and context usage (ENST-B), access right models for XML documents (LIUPPA), security analysis of XML transformations (ENS-LRI) and chip-secured data access models and data encryption (INRIA).
More information about the project can be found in the following page