We need students to help implement the Mansion mobile agent system now!
For general information about the Mansion system and introductory papers see the Mansion webpage.
Student projects are available, ranging from Masters projects of about 6 months to short-term (about 6 weeks) implementation projects, e.g., as part of the "capita selecta" section of the topmasters curriculum. The projects cover research areas ranging from (low-level) operating system support for mobile agent execution (jailing, see below) to high-level Mansion world design. Students applying for project 1 should have a background that includes the Operating Systems course. Students from AI or other computer science directions are welcome to apply for projects 2 and 3. Supervision is done by Guido van 't Noordende (guido@cs.vu.nl, first contact) and Andy Tanenbaum or Frances Brazier, depending on the project.
Porting the jailer
Porting the jailer requires, among other things, porting a number of assembly routines of our user-level, x86-based jailer so that they work on another CPU. In addition, porting requires a study of the system call interface of the target operating system (e.g., Linux and BSD systems differ in the way that system call arguments are passed to the kernel), and generally requires a lot of tweaking of the basic policy enforcement module so that it allows calls which are harmless (on the target UNIX version) and inspects those which may interfere with the user's policy.
The Linux-specific (ptrace(2)-based) user-space jailer is mostly finished and is a good starting point for porting the system. We also implemented sophisticated Linux kernel support for the jailing system, but that is not relevant to this project.
This is a difficult project, and it is mainly intended as a full-time (6 months or so) student project. There will be little room for taking other classes or doing assignments in parallel with this project.
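The split between calls that are allowed outright and calls that must be inspected can be pictured as a per-platform table. The following is only a minimal sketch, written in Python for brevity rather than in the C the projects call for. The table contents, the names Verdict, LINUX_TABLE, classify and check_open, and the deny-by-default choice are all assumptions made for illustration; none of them is taken from the actual jailer.

```python
import posixpath
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"      # harmless on this platform, passed through unchanged
    INSPECT = "inspect"  # arguments must be checked against the user's policy
    DENY = "deny"        # never permitted inside a jail


# Hypothetical classification for one platform; a port to another UNIX
# would supply its own table after studying that system's call interface.
LINUX_TABLE = {
    "getpid": Verdict.ALLOW,
    "read": Verdict.ALLOW,
    "open": Verdict.INSPECT,
    "connect": Verdict.INSPECT,
    "ptrace": Verdict.DENY,
}


def classify(table, syscall_name):
    """Look up a system call; anything not listed is denied (assumed default)."""
    return table.get(syscall_name, Verdict.DENY)


def check_open(path, allowed_prefixes):
    """Example inspection step: permit open() only below allowed directories.

    The path is normalized first so that '..' components cannot escape.
    Symbolic links are NOT resolved here; a real check would have to handle them.
    """
    normalized = posixpath.normpath(path)
    return any(
        normalized == prefix or normalized.startswith(prefix + "/")
        for prefix in allowed_prefixes
    )
```

Keeping the table separate from the inspection routines means that, in this sketch, most of the per-platform tweaking would be confined to the table.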
Resource management
In terms of scientific contribution, the more interesting of the two jailer projects is to extend the jailer with resource management functionality. Resource management is important to limit the resource usage (e.g., CPU time, network bandwidth, file system usage) of an agent. We also intend to use resource management as the basis for implementing a jailing system which prevents covert channels between jails running on the same host, which is important when agents running in different jails have different security levels or are otherwise constrained using some kind of information flow control policy.
In this context, an important part of this project is that you find out in which ways a process running on Linux could use resource exhaustion (e.g., through CPU usage, memory usage, file locking, disk thrashing, etc.) for exporting information to a process in another jail. In addition, there should be ways not only to prevent covert channels based on resource exhaustion, but also to prevent other processes from noticing such excessive resource usage. In other words, we should be able to prevent one jailed process from sending bits through an indirect channel, and another jailed process from reading those bits.
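To make the idea of a resource exhaustion channel concrete, here is a small, purely illustrative simulation. It assumes a single abstract resource with a fixed capacity, a sender that signals a 1 by consuming most of it, and a receiver that measures how much it can still obtain. The capacity, the usage levels and all function names are hypothetical; real channels on Linux are noisier and depend on the scheduler and the resource in question.

```python
CAPACITY = 100  # abstract units of some shared resource (assumed)


def receiver_share_shared(sender_usage, receiver_demand=100):
    """Shared pool: the receiver gets whatever the sender leaves over."""
    return min(receiver_demand, CAPACITY - sender_usage)


def receiver_share_partitioned(sender_usage, quota=50, receiver_demand=100):
    """Fixed per-jail quota: the receiver's share ignores the sender's usage."""
    return min(receiver_demand, quota)


def transmit(bits, share_fn):
    """The sender consumes 90 units for a 1 bit and 10 units for a 0 bit."""
    observed = []
    for bit in bits:
        sender_usage = 90 if bit else 10
        observed.append(share_fn(sender_usage))
    return observed


def decode(observed, threshold=50):
    """The receiver reads a 1 whenever it obtained less than the threshold."""
    return [1 if share < threshold else 0 for share in observed]
```

With the shared pool, decode(transmit(bits, receiver_share_shared)) reproduces the sender's bits. With fixed quotas every observation is identical, so the receiver learns nothing from it. Static partitioning is only one conceivable scheme, and it trades utilisation for isolation.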
Hooks for adding resource management to the jailer are available in the jailer's policy module, but the current jailer implementation does not support resource management yet. The implementation does prevent jailed processes from setting up direct communication channels (e.g., through files, sockets or IPC mechanisms) to processes in another jail, but indirect channels are currently not covered by the implementation.
The topic of this project is therefore twofold. First, you will do research on feasible resource management schemes for use in the Linux-specific jailing system (it is hard enough to study resource management for one specific UNIX flavour, so we limit ourselves to Linux for this project), and study the impact of enforcing these on the behaviour of jailed processes and their effectiveness for limiting the overall system load. Second, you will study how those mechanisms can be used to prevent excessive resource usage in order to prevent export of information from a jail via covert channels.
Due to the mix of low-level detail required for implementing resource management schemes and higher-level requirements which are important for covert channel prevention and for implementing information flow policies, this project will probably be supervised on a daily basis by two Ph.D. students. The overall supervisor will be Andy Tanenbaum.
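One existing Linux mechanism that such a study could start from is the per-process limit set with setrlimit(2), which Python exposes through its standard resource module. The sketch below starts a child process with a CPU time limit and a file size limit. It is not the jailer's hook interface, which is not described here; the function names and the chosen limits are assumptions, and the module is available on UNIX-like systems only.

```python
import resource
import subprocess
import sys


def _cap(kind, wanted):
    """Set soft and hard limit to 'wanted', never above the current hard limit."""
    _soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    resource.setrlimit(kind, (wanted, wanted))


def run_with_limits(argv, cpu_seconds, max_file_bytes):
    """Run argv as a child process under CPU time and file size limits."""

    def apply_limits():
        # Executed in the child between fork() and exec().
        _cap(resource.RLIMIT_CPU, cpu_seconds)
        _cap(resource.RLIMIT_FSIZE, max_file_bytes)

    return subprocess.run(
        argv, preexec_fn=apply_limits, capture_output=True, text=True
    )


def demo():
    """Start a child Python that reports the CPU limit it was given."""
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_CPU)[0])"
    result = run_with_limits(
        [sys.executable, "-c", probe], cpu_seconds=5, max_file_bytes=1 << 20
    )
    return result.returncode, result.stdout.strip()
```

Limits of this kind bound how much a process can consume, but they do not by themselves hide one process's consumption from another, which is the second half of the problem described above.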
A full-time student project (at least 6 months) could cover the resource management project. A similar timeframe is conceivable for porting the jailer to at least one other UNIX operating system or hardware architecture, although that could prove more problematic than extending the current jailer with resource management support.
Both projects require good knowledge of the C programming language. Porting requires some knowledge of Intel x86 assembly. Other assembly languages can be learned during the project (in the case of the 6-month project, second bullet).
The project(s) can start immediately.
A primary application domain in which this can be used is the medical domain, where privacy protection of patient data is of primary concern. A confined room allows agents to search, for example, patient data for epidemiological studies (i.e., do statistics on a database with patient data) without any sensitive patient information being able to escape the room. An agent (i.e., arbitrary code written by a scientist) can only export anonymized or aggregate data from the room, by giving it to the confined room's guardian agent and retrieving it from the guardian agent after leaving the room. The confined room allows any researcher to send agents to a hospital's confined room and search there, even if the researcher is not one of the hospital's doctors or staff. Nowadays, looking at patient data in other hospitals is not possible, which restricts the scope of most medical or epidemiological studies to patients at one or at most a few hospitals.
A primary problem is how to filter data which an agent wants to export such that an agent cannot secretly export confidential data. The interface (communication language) with which an agent has to communicate with the guardian agent has to be designed. It is likely that this interface is application specific; however, the design of the interface / communication language is expected to teach us much about the more general problems that occur when designing confined rooms and guardian agents.
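As a starting point for thinking about such an interface, the sketch below shows a guardian that accepts only structured aggregate results computed over a minimum number of patients and holds them until the agent has left the room. Everything in it is hypothetical: the AggregateResult type, the minimum group size of 5, and the deposit/collect operations are assumptions for illustration, not a design taken from Mansion.

```python
from dataclasses import dataclass

MIN_GROUP_SIZE = 5  # hypothetical threshold below which a result is refused


@dataclass(frozen=True)
class AggregateResult:
    label: str       # what was measured, e.g. a statistic's name
    group_size: int  # number of patient records the value was computed over
    value: float     # the aggregate itself


class Guardian:
    """Holds approved results for agents until they have left the room."""

    def __init__(self, min_group_size=MIN_GROUP_SIZE):
        self._min_group_size = min_group_size
        self._held = {}  # agent id -> list of approved AggregateResult

    def deposit(self, agent_id, item):
        """Accept an item for export, or refuse it; returns True if accepted."""
        # Raw records, free text and other unstructured data are refused.
        if not isinstance(item, AggregateResult):
            return False
        # Aggregates over very small groups could identify individuals.
        if item.group_size < self._min_group_size:
            return False
        self._held.setdefault(agent_id, []).append(item)
        return True

    def collect(self, agent_id):
        """Hand over (and forget) everything approved for this agent."""
        return self._held.pop(agent_id, [])
```

A filter of this kind does not on its own stop an agent from encoding confidential data in the values it reports, which is part of why the design of the interface is a research question.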
This project is about designing a confined room containing confidential data and a suitable guardian agent for use in a medical application. A first required step is to work out in detail a suitable application which is representative of a problem (such as the one outlined above) in the medical domain. Initially, you (we) will have to talk to medical researchers about how medical databases are currently organized and what kind of analysis can be done on those databases, so that you can design a prototype demonstration system that represents a real medical system. Once this is done, you will have to design and implement a confined room with a guardian agent, and some agents that do work in the confined room, possibly using existing data analysis tools. A prototype infrastructure (middleware) system has been implemented, with which the designed rooms and agents can be built.
This project is a Masters project. AI students as well as Informatica students are invited to apply. Students in Medical Information Systems are also explicitly invited to apply. Prior knowledge of, or affinity with, medical systems is not required but is a plus.
The project can start immediately.
As an example, in a bioinformatics setting, an agent can be sent to a room owned by the Brookhaven Protein Data Base (PDB) or the EMBL bioinformatics institutes, where it can find protein data or sequence data (depending on the application) in the databases hosted by those institutes. Rather than using the institute's own algorithms (which may be available to researchers through a web service interface, for example), a researcher can implement their own protein or sequence analysis algorithms, embed these in an agent, and ship the agent off to the room in question to do its own computation there. When the agent is done, it will return to its owner and present the results. It is also possible to send a number of agents, each implementing a different program, to the database at the same time, so that the data can be analysed in several stages implemented by the different agents, which can communicate with each other using TCP connections. All agents run locally. This improves efficiency compared to the case where a web service representing stage 1 returns its results to the (remote) user, which then have to be sent back to another web service representing stage 2 of the computation pipeline, etc. (this is how things are currently done in a number of web-service-based bioinformatics applications). In addition, it is very important that a researcher can send their own programs to a database (a room), rather than being restricted to using the programs which are provided by the database or web service owner (e.g., PDB or EMBL), which may yield different results than what is required by the user.
We are looking for a student who has a background in computer science and preferably has some affinity with biological problems or bioinformatics tools. Part of the assignment is to find a common way in which an agent can access and search for information in a database for a specific application domain (such as sequence data or protein data). A lot of work is currently being done to standardise access to most biological databases; we can make use of this standardisation for our application. A number of tools for analysis of biological data are available from the bioinformatics department at the VU. These tools (programs) can readily be embedded into a mobile agent.
Infrastructural (middleware) tools exist to ship agents over the internet and start them up in a room in a secure way. The end result of the project should be a working, although probably simple, Mansion world which contains a couple of rooms containing biological databases, and a number of pre-built agents (based on existing bioinformatics programs) which can be shipped to these rooms and collect information there.
The project will be supervised jointly by the computer systems and bioinformatics departments and has a duration of at least 6 months.
The project can start immediately.
For information about the above projects or other projects in the Mansion framework, contact Guido van 't Noordende (guido@cs.vu.nl).
The project is special in that it is relatively open and in that you will have relative freedom in finding suitable applications for the security mechanisms that you have to implement; the project is therefore not necessarily bound to Mansion. A literature study is part of the project's requirements. Please contact Guido van 't Noordende (guido@cs.vu.nl) for more information on the possibilities, project duration, etc. The project can start immediately.
Introduction
Mansion uses distributed (i.e., replicated) objects. These objects implement some of the core functionality of the Mansion system, such as rooms and services (see the Mansion project page for more information). An object in Mansion can contain content that can be inspected by mobile agents. Content can range from a simple web document to a collection of pictures or movies or a database containing medical information. Another example of the use of objects in Mansion is where an object implements a service, such as an agent location service. This service has to be available all the time, and replication can be used to achieve a degree of reliability and availability. However, the name of the location service should be independent of where the service is located, i.e., it should have location-independent naming, such that the service can be transparently replicated on the fly, for example when demand increases. In addition, the clients that use the location service (e.g., middleware processes in the Mansion system) must be sure that they talk to the right service and not to some impostor. From the content perspective, i.e., from the service or object provider's side, it is necessary that an object's (full) content is only replicated to trusted processes; this allows an object or service owner to make sure that the integrity is protected, and keeps placement (on particular machines) of the data in the object under control of the object / data owner.
In both cases, a mechanism is required that binds a name to a replicated service (or rather, to processes that provide the service) in a secure, location-transparent way. This is achieved using public-key cryptography, by means of the concept of zones.
A zone is a group of processes (e.g., middleware processes hosting objects), each of which has its own public/private key pair, of which the public key is signed using a common key, the zone key. The zone key is owned and kept secret by the zone owner, which can sign the public keys of all members of the zone. A process is a member of a zone as long as it has a valid zone membership certificate signed by the zone owner. Zone certificates have an expiration date; in some cases, zone membership certificates may be blacklisted by the zone owner. The zone's public key is hashed to yield a self-certifying ZoneID. Using the ZoneID, members of the zone can be looked up using a location service that is not necessarily trusted, such as DNS, and authenticated using an authenticated key-exchange (challenge-reply) protocol based on the ZoneID (see [1] and [2]).
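The checks a client would perform on a claimed zone member can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 is used as the hash only because it is available in the Python standard library (the hash function is not specified here), the MembershipCert layout is hypothetical, and signature verification is delegated to a caller-supplied verify_signature function standing in for a real public-key library. The challenge-reply step, in which the member proves possession of its private key, is not shown.

```python
import hashlib
import time
from dataclasses import dataclass


def zone_id(zone_public_key: bytes) -> str:
    """Derive a self-certifying ZoneID by hashing the zone's public key.

    SHA-256 is an assumption made for this sketch.
    """
    return hashlib.sha256(zone_public_key).hexdigest()


@dataclass(frozen=True)
class MembershipCert:
    member_public_key: bytes
    expires_at: float  # seconds since the epoch
    signature: bytes   # made with the zone's private key


def check_member(expected_zone_id, zone_public_key, cert,
                 verify_signature, blacklist=frozenset(), now=None):
    """Return True if cert proves membership of the zone named by the ZoneID."""
    now = time.time() if now is None else now
    # 1. The presented zone key must hash to the ZoneID we looked up, so an
    #    untrusted location service cannot substitute another zone's key.
    if zone_id(zone_public_key) != expected_zone_id:
        return False
    # 2. Certificates expire and may be blacklisted by the zone owner.
    if now >= cert.expires_at:
        return False
    if cert.member_public_key in blacklist:
        return False
    # 3. The zone key must have signed the member's key and expiry date.
    signed = cert.member_public_key + repr(cert.expires_at).encode()
    return verify_signature(zone_public_key, signed, cert.signature)
```

Because the ZoneID is derived from the key itself, the first check needs no certificate authority; trust rests entirely on obtaining the correct ZoneID in the first place.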
The Project
The mechanisms and services needed to control distribution of objects based on zone-based distribution policies need to be explored and implemented. Objects can be implemented as Globe objects, or as web services. Globe has a security architecture which is mostly implemented but which has to be adapted so that it supports zone-based authentication. Another system that could potentially be used to implement distributed objects in Mansion is Ibis (see www.cs.vu.nl/ibis). Ibis is a Java-based system that is primarily targeted at high performance computing but also aims to scale beyond local area networks. If Ibis were to be used in Mansion, it would have to be adapted to support zone-based distribution policies and authentication as outlined above. Ibis objects can be accessed remotely using Java-RMI or via a web service interface.
You need to implement the basic libraries required to do secure authentication based on ZoneIDs. In addition, you need to integrate those tools in one or more distributed systems, like Globe, Ibis, or a (transparently) distributed web service. You also have to implement a naming service that publishes the contact points of zone members. Naming services and lookup tools for zone members (e.g., using DNS) should be implemented as self-contained client libraries, in such a way that they are usable by different systems. We hope the tools can be used in the Mansion framework, but your own implementation(s) will only become part of Mansion if their usability is shown by a demonstration system that you have to implement.
You should begin by identifying the relevant pieces of software needed to implement Mansion security based on ZoneIDs in Globe, but also in Ibis and web services. A literature study of other distributed systems, and of how the zone-based control mechanisms and implementation can be applied to these systems, is part of the thesis requirements. Your initial exploration can be used as the basis for implementing zone-based authentication in one or more existing systems.
The project can start immediately. Please contact Guido van 't Noordende (guido@cs.vu.nl) for more information.
References
[1] D. Mazieres, M. Kaminsky, M.F. Kaashoek, E. Witchel. Separating key management from file system security. 17th ACM Symposium on Operating Systems Principles (SOSP) 1999, pp. 124-139.
[2] G.J. van 't Noordende, F.M.T. Brazier, A.S. Tanenbaum. Security in a Mobile Agent System. First IEEE Symposium on Multi-Agent Security and Survivability, Philadelphia, August 2004.
[3] M. van Steen, P. Homburg, and A.S. Tanenbaum. Globe: A Wide-Area Distributed System. IEEE Concurrency, January-March, 1999, pp. 70-78.
[4] B.C. Popescu, M. van Steen, A.S. Tanenbaum. A Security Architecture for Object-Based Distributed Systems. Proc. 18th Annual Computer Security Applications Conference, Las Vegas, NV, December 2002.
[5] W. Vogels. Web Services Are Not Distributed Objects. IEEE Internet Computing, Nov-Dec 2003, pp. 59-66.