The goal of this project is to produce a new generation of XML programming languages stemming from the synergy of integrating three approaches into a single framework: a logical approach, a data-oriented approach, and a programming language approach. These are languages whose constructions are inspired by the latest results in programming language (PL) research; with precise and polymorphic type systems that merge PL typing techniques with logical-solver-based type inference; with efficient implementations derived from the latest research on tree automata and formally certified by the latest theorem prover technologies; with optimizations derived directly from their type systems and logical formalizations, and whose efficiency will be formally guaranteed; and with the capacity to specify and formally verify invariants, business rules, and data integrity. They are also languages with a direct and immediate impact on standardization processes. More information is available at the Typex web site.


The research we propose to undertake seeks to push the frontier of XML technology innovation in three interconnected directions. First, we propose to study languages and algorithms, and to develop prototypes, for efficient and expressive XML processing, in particular advancing towards massively distributed XML repositories. Second, we will consider models for describing, controlling, and reacting to the dynamic behavior of XML corpora and XML schemas over time. Third, we propose theories, models and prototypes for composing XML programs for richer interactions, and for composing XML schemas into rich, expressive, yet formally grounded type descriptions. More information is available at the Codex site.


GraphDuce is a joint project between the CDuce team and Brixlogic, a start-up company that provides XML / Web Services business layer software solutions for the financial services market.

The project is financed by the Réseau National de recherche et d'innovation en Technologies Logicielles (RNTL) and aims at ...

More information about the project can be found in the following page on GraphDuce (in French)


This ACI is motivated by the increasing number of applications that produce, consume or handle large sets of data, or "datamasses". In many cases, these are either raw data or a collection of data from various sources, both of which lack uniform descriptive criteria. Such cases require more flexibility than the classical relational model can provide, and have given rise to the so-called semi-structured data model [1], of which XML is one of the most prominent examples.

Our project intends to study the processing, querying and handling of large datamasses whenever data is available in XML format. We pay particular attention to the problems of programming languages and query languages. We aim to cover in a uniform way a wide spectrum of different areas, namely: programming languages (expressiveness, typing, new programming primitives, logics underlying queries, logical optimization), data access (streamed data, compression, access to secondary memory storage, persistence engines), and implementation (pattern matching compilation, physical optimization, subtyping verification, execution models for streamed data).

We will tackle these challenges following three research directions:

  1. query languages: one of the characteristics of the relational model is that its query languages are based on the relational algebra or the relational calculus. These paradigms are characterized by high declarativity (in the sense that they describe the result rather than the way to obtain it) and limited expressiveness (notably, they are not Turing complete). The "simplicity" of these languages is at the origin of their good performance, which can be further improved by using the algebraic properties of the operators (logical optimization) or by secondary memory management techniques (physical optimization). Our goal is to develop a similar, or at least close, framework for the XML model, and we will pursue it as follows: theoretical study of the expressiveness and complexity of the query languages; definition of query languages for XML and their implementation; definition and validation of optimization techniques.

  2. streaming: the ability to process streams of data without storing whole documents (or storing them only partially) is crucial in the context of datamasses. We will also consider the aspects of streaming that arise when the data is compressed. Stream processing is not always possible [2], so one of the main difficulties to overcome here is to identify a suitable class of "streamable" queries, with or without compression, and in the former case to determine the optimal compression granularity.

  3. document typing: type systems are used in the first place for document validation and for checking integrity constraints, but as with standard programming languages, types are at the basis of many helpful optimizations. This makes the study of typing systems one of our primary objectives.

    Another motivation for this line of work is our interest in integrity constraints whose satisfaction does not depend on the ordering of the fields in a document, unlike the constraints expressible in "classical" type systems for XML such as DTD. This is a natural choice when processing data originating from the fusion of several relational databases (a frequent instance of large documents), since the order of the fields is then irrelevant.
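As a rough illustration of how the streaming and document typing directions interact, the sketch below checks an order-independent constraint over a stream of records without keeping their content in memory. It is written in Python with the standard library only and is not taken from the project or from CDuce; the record layout, the element names and the REQUIRED_FIELDS constraint are all hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical constraint: every record must contain these fields, in any order.
REQUIRED_FIELDS = {"name", "email"}


def invalid_records(stream, record_tag="record"):
    """Yield the position of each record that lacks a required field."""
    index = 0
    # iterparse reports each element as soon as its end tag has been read,
    # so a record can be checked before the rest of the document arrives.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == record_tag:
            present = {child.tag for child in elem}
            # Set inclusion ignores the order in which the fields appear.
            if not REQUIRED_FIELDS <= present:
                yield index
            index += 1
            # Discard the children and text of the record just checked;
            # only an empty placeholder element stays attached to the root.
            elem.clear()


SAMPLE = b"""<records>
  <record><name>A</name><email>a@example.org</email></record>
  <record><email>b@example.org</email><name>B</name></record>
  <record><name>C</name></record>
</records>"""

print(list(invalid_records(io.BytesIO(SAMPLE))))  # [2]
```

The second record lists its fields in the opposite order from the first and is still accepted, whereas a DTD content model such as (name, email) would reject it. Only the third record, which has no email field, is reported.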

The groups involved in our project have each already been working separately on XML document handling, although this is only one of the incentives for us to work together. Indeed, we share the same fundamental theoretical approach, namely automata theory and the associated logics, and the same interest in query languages and document validation (typing, integrity constraints). Beyond our agreement on foundational tools and on goals, cooperation inside the project is further strengthened by the choice of a single software target, the CDuce language [3][4], a joint development of LIENS and LRI, two of the sites involved in this project.

More information about the project can be found in the following page on XML Transformation Languages: logic and applications (TraLaLA) (in French)


MyThS (Models and Types for Security in Mobile Distributed Systems, Contract IST-2001-32617) is a project funded by the Information Society Technologies (IST) Programme of the European Union. It is a collaboration between the Ecole Normale Superieure de Paris, the University of Sussex, and the University Ca' Foscari of Venice.

MyThS seeks to develop type-based foundational theories of security for mobile and distributed systems. By relying on strong typing as the basic principle, MyThS addresses the foundations of programming languages and paradigms that allow static detection of security violations, and aims at developing type theoretic methods and tools that enable formal analyses of security guarantees appropriate for systems and applications on the global computing platform.

More information about the project can be found in the following page on MyThS


Preserving the confidentiality and integrity of data hosted in multiple distributed sources (personal, administrative, healthcare, business or scientific data) constitutes a tremendous challenge for the database community. Unfortunately, existing access control models implemented in Database Management Systems (DBMS) exhibit important weaknesses. First, existing models are unable to tackle the complexity of distributed and decentralized organizations as well as the growing diversity of channels to access the information. Second, while the semantics of access control policies is well established when applied to relational data, things become fuzzier when semi-structured and hierarchical data, such as XML documents, are considered. Third, existing models suffer from a centralized access rights administration, making them more vulnerable to both internal and external attacks (according to the FBI computer crime and security report, more than 50% of database attacks are conducted by insiders). The goal of the CASC project is to address these three important issues: how to tackle complex distributed organizations, how to define accurate access control policies on XML-like data and how to secure the global architecture against attacks.

Several access right models have been proposed in the literature (the best known being DAC, MAC and RBAC) and existing DBMS mix concepts from different models in the same implementation. The resulting models are not always well formalized, so some situations are complex to model and may lead to unexpected information leakage. In this context, we proposed a formal access right model, called ORBAC (Organization Based Access Control model), that encompasses all the concepts required to express a security policy in complex distributed organizations. Its generality and formal foundation make this model the best candidate to serve as a common framework in this project. The work plan will be divided into three tasks, each of them addressing one of the aforementioned issues:

- Extending ORBAC towards distributed architectures. The objective is to extend ORBAC with the concepts required to deploy and administer the model in distributed organizations. More precisely, the following problems have to be addressed: consistency of the access rules to be deployed, distribution of the access right control, distribution of the access right administration and characterization of the trusted components that need to be integrated in the global architecture to secure it.

- Instantiating ORBAC in the XML context. XML is now a de facto standard to exchange data on the Internet. To date, little attention has been paid to the definition of an access right model for XML and all proposals suffer from important drawbacks. Defining a coherent and powerful access right model for XML is still an open issue that slows down the deployment of many Web applications dealing with sensitive data. The ORBAC model provides a sound basis to define such a model. While ORBAC is agnostic with respect to the data model, interesting and difficult problems are foreseen in the translation of ORBAC concepts into XML concepts.

- Chip-Secured XML-ORBAC architecture. Whatever the expressive power of an access right model, it remains ineffective against attacks directed at the database footprint on disk by an intruder and against the actions of an ill-intentioned Database Administrator (DBA) (a DBA has enough privileges to change the access right policy or to tamper with the access right management). Our objective is to study how data encryption and secured hardware components (e.g., smartcards or tokens) could be exploited to secure the control and the administration of ORBAC access right rules applied to XML documents.

This project may lead to significant advances in the following areas: (1) abstraction of the fundamental concepts required in any access right model and formalization of the associated administration procedures, (2) definition of a powerful and sound access right model for XML documents and (3) definition of chip-secured data access and administration architectures. In addition, this combination of research efforts around the ORBAC model makes it possible to investigate a complete and general solution to secure distributed confidential data.

This ambitious objective could be reached thanks to the complementary skills of the CASC partners, namely: formalization of access right models integrating the concepts of organization and context usage (ENST-B), access right models for XML documents (LIUPPA), security analysis of XML transformations (ENS-LRI) and chip-secured data access models and data encryption (INRIA).

More information about the project can be found in the following page

[1] S. Abiteboul, P. Buneman, and D. Suciu. Data on the Web: From Relations to Semistructured Data and XML. Morgan Kaufmann, 1999.

[2] Luc Ségoufin and Victor Vianu. Validating streaming XML documents. In Symposium on Principles of Database Systems (PODS), 2002.

[3] V. Benzaken, G. Castagna, and A. Frisch. CDuce: An XML-Centric General-Purpose Language. In Proceedings of the ACM International Conference on Functional Programming, 2003.

[4] CDuce: A modern programming language adapted to the manipulation of XML documents: