CP2PC User Interface Specifications (Version 7)
===================================

This document contains the concepts and functions that are in a CP2PC
application's user interface, such as a GUI.

In previous documents we uncovered the concepts and functions that a
selection of existing P2P networks have. (Some of these concepts are
shared by all of the networks; others are particular to a subset of
the networks.) These concepts were classified as:

1. User concepts.
2. Concepts internal to the P2P application, neither part of the user
   concepts nor of the P2P protocol's API.
3. Concepts supported by the P2P protocol's API.

In this document we distill (mainly from the above user concepts) a set
of user concepts and functions of a CP2PC application's user interface.

Note that in CP2PC the aim is to develop a single API on top of which a
*number* of applications might be built, each with their own user
interface. The CP2PC application that we have in mind here is able to
provide access to the most fundamental file-sharing functionality of
any P2P network. We assume that:

1. Any real-world P2P application that interfaces to a number of P2P
   networks will need an important subset of these concepts.
2. Concepts supported by a real-world P2P application that are not
   found in this document are not fundamental to file sharing. An
   example of this is chatting.

The goal of this exercise is to arrive at a user interface that:

1. We know must be supported in full by the CP2PC API.
2. Might actually be developed at a later stage by us.

In this document we do not address the internal design of the
application. The application is considered a black box with a GUI (or
other user interface).

This document also contains additional elements, such as rationale, how
a particular user concept might be mapped to a particular P2P network,
and even functions that are not fundamental but would be nice to have.
This kind of information is placed between square brackets ([]).

In this document, we use 'P2P network' to mean 'P2P file sharing
network'.


Concepts
--------

  File
  File ID
  File attributes
  File Collection
  Collection ID
  Peer


File
----

The application's basic element of shared data is the file. This is a
concept present in all file sharing P2P networks in some form or
another.


File ID
-------

In the user interface specifications we define two kinds of file ID: a
native (P2P network-specific) file ID and a CP2PC file ID.

A file ID identifies a file, e.g. for downloading. Different instances
of a file on a P2P network have different file IDs. A file ID is not
necessarily a unique identifier. The file ID is best compared to a URI.

Native and CP2PC file IDs solve a very specific problem: they allow a
file to be referenced (named) by different applications. In particular,
some networks, such as CFS, do not (yet) have a search facility. In
these networks the only way to access a shared file is by obtaining a
reference to it from the publisher of the file.


Native File ID
--------------

Native file IDs are persistent, network-wide references that are
specific to a particular P2P network. They allow a file to be
referenced by different applications on that P2P network. Native file
IDs can be exchanged between CP2PC and non-CP2PC users that are on the
P2P network.

We are assuming that each network that supports a persistent,
network-wide file ID has a standard textual (string) representation of
the file ID (which might for example be exchanged by email). For
example, in CFS it is the public key of a CFS file system, combined
with a pathname within that file system as follows:
'pubkey/dir/.../dir/file'. Note that a native file ID need not be a
URI.

[ For the moment we are assuming that native file IDs are representable
  as strings. If we should encounter a P2P network whose native file ID
  has a binary representation, we may additionally define a binary form
  of the native file ID.

  Some networks might not allow file IDs to be exchanged 'out of band'
  among their users. For these networks the CP2PC application does not
  need to support a native file ID. We are not aware of specific
  examples of such networks. However, one can easily imagine a P2P
  network protocol in which a search returns file IDs that:
  1) are 'ephemeral' (non-persistent) and can only be used for
     *immediate* download, or
  2) can be used for download only by the application that invoked the
     search operation (i.e. are not network-wide).
  Such ephemeral or non-network-wide file IDs cannot be used as native
  file IDs. ]


CP2PC File ID
-------------

CP2PC file IDs are persistent, global references that are specific to
CP2PC. They allow a file on any P2P network to be referenced by
different CP2PC applications. (Such a file need not necessarily have
been published by a CP2PC application. A CP2PC application could
generate a CP2PC file ID for a file published by a non-CP2PC
application.) CP2PC file IDs can be exchanged between CP2PC users but
are of little use to non-CP2PC users.

CP2PC file IDs are derived from native file IDs as follows. A CP2PC
file ID is formed from the native file ID together with the name of a
P2P network on which the file exists. As an example, given a CFS native
file ID 'pubkey/dir/.../dir/file', the CP2PC file ID is:
cp2pc:cfs:pubkey/dir/.../dir/file.

[ CP2PC file IDs are only supported for those P2P networks that CP2PC
  has native file ID support for (see earlier notes). ]


File attributes
---------------

A file can have attributes, such as its name, size, hash or checksum,
and type (e.g. 'music'). Some attributes apply to all files (e.g.
size); others are specific to a file type (e.g. 'artist' in the case of
music). Note that not all P2P networks have explicit support for
attributes other than a filename or file size.

We may wish to distinguish general attributes (such as filename and
size) and network-specific attributes.
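
The derivation of a CP2PC file ID from a native file ID, and the split
between general and other attributes, can be sketched as follows. This
is only an illustration under stated assumptions: the 'cp2pc:' prefix
and the lower-case network name 'cfs' are taken from the single example
in 'CP2PC File ID'; the function names, the FileAttributes layout, and
the rule that a network name contains no ':' are hypothetical and are
not defined by this specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

CP2PC_SCHEME = "cp2pc"  # prefix as it appears in the CFS example


def make_cp2pc_file_id(network: str, native_id: str) -> str:
    """Form a CP2PC file ID from a network name and a native file ID."""
    # Assumption: network names are non-empty and contain no ':'.
    if not network or ":" in network:
        raise ValueError("network name must be non-empty and contain no ':'")
    return f"{CP2PC_SCHEME}:{network}:{native_id}"


def split_cp2pc_file_id(cp2pc_id: str) -> Tuple[str, str]:
    """Return (network, native_id). The native ID may itself contain ':'."""
    scheme, _, rest = cp2pc_id.partition(":")
    network, sep, native_id = rest.partition(":")
    if scheme != CP2PC_SCHEME or not sep or not network:
        raise ValueError(f"not a CP2PC file ID: {cp2pc_id!r}")
    return network, native_id


@dataclass
class FileAttributes:
    # General attributes, which apply to all files.
    name: str
    size: int
    # Type-specific or network-specific attributes, e.g. {"artist": "..."}.
    # Hypothetical layout: a plain string-to-string mapping.
    extra: Dict[str, str] = field(default_factory=dict)
```

Because the native file ID is treated as an opaque string, the sketch
splits only on the first two ':' characters, so a native ID that
contains ':' survives a round trip unchanged.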

[ Network-specific attributes might be flexibly implemented by allowing
  them to be defined by the network-specific part of the CP2PC
  application. ]


File Collection
---------------

For publishing, files may optionally be grouped together as
collections. The files in a collection belong together in some
user-defined way. For example, the tracks of a music album typically
belong together and can be published as a collection. The files in a
collection can be individually accessed and searched for on the P2P
networks (i.e. by other P2P applications).

The main reason for introducing collections into CP2PC is that some P2P
networks only publish files as collections. For example, in CFS, files
are published as part of a file system hierarchy; in GDN, files are
published as part of a package. In these networks, when files are
published in collections, they are more efficiently managed than when
they are published individually. In GDN, for example, the files in a
package are replicated together.

Note that P2P networks that publish files as collections still allow
the files in a collection to be accessed (e.g. downloaded)
individually.

For P2P networks that do not explicitly support a collection of files,
the collection concept can be ignored, or it can be mapped to some
other native concept if one is available.

[ For example, in mnet a collection might be mapped to a file
  attribute: each file in the collection carries the same value for
  this attribute. A search by another user that includes such a file as
  a result can display the collection attribute, which allows the user
  to subsequently search for other files in the collection. (Note
  however that the mnet native application that we have seen does not
  display non-standard attributes such as our collection.) ]

CP2PC collections are allowed to be empty.


Collection ID
-------------

Each collection has a native (P2P network-specific) collection ID and a
CP2PC collection ID. These IDs are analogous to the file IDs.

[ A collection ID should if possible be distinguishable from a file ID
  so that an application can recognise it as such. This allows
  erroneous use of a file ID as a collection ID, or vice versa, to be
  detected. Note that we cannot enforce this for native IDs. ]


Peer
----

A peer is a 'node' in one of the P2P networks. Depending on the
network, a peer can be contained within a user application (such as
Limewire/Gnutella), or it may be running independently of a user
application, as a kind of server (such as a CFS node).

Perhaps surprisingly, in some native P2P applications the peer concept
is very transparent. For example in CFS, files are referenced through
the native file ID, which does not contain any peer identification.
However, in an application such as Limewire (Gnutella), it is normal to
see what peer publishes a file.

[ In the CP2PC application the peer concept can be mostly tucked away,
  but it should be present as 'meta-information' (e.g. information
  about a file might include the publishing peer, if that is known).
  Also, during configuration the peer concept is often required: the
  address of some peer is often needed for joining a P2P network. ]


Functions
---------

  Joining
  Searching
  Downloading
  Publishing
  Meta-information
  Configuration and Control
  Monitoring


Joining
-------

Connecting to and disconnecting from the P2P networks.

To communicate with peers on a P2P network, the CP2PC application must
join that network. A user has control over what P2P networks to join.
Contact information may be required to join a particular P2P network,
e.g. the address of another peer.

[ A user might specify what networks to join at start-up. In addition a
  user might connect and disconnect while the application is running. ]

[ For each network that requires it a user should specify some
  network-specific contact information, e.g. the address of another
  peer. In many cases contact information can be transparent to the
  user.
  For example, the Limewire (Gnutella) application comes with contact
  information built in. However, a user must be able to override such
  information, e.g. to join a private network. ]


Searching
---------

Searching for files on the P2P networks.

To search, the user specifies a search query. At some point search
results will start to come in.


Search Query
------------

Some P2P networks support free form search queries, typically
consisting of a simple list of keywords; other networks support
structured queries, for example consisting of a list of attribute/value
pairs. Yet other networks such as CFS and GDN do not support search at
all.

Both free form and structured search queries are supported in an
integrated way. An example of an integrated approach is as follows. The
user might be presented with a search form in which attribute/value
pairs can be entered. A structured search query based directly on the
form is sent to the P2P networks that support such a query. For P2P
networks that only support free form searches, only the value parts of
the form are used.

[ A form can contain general attributes such as a file name. It can
  also contain standard attributes for each file type, such as a
  performer in the case of a music file type. See the 'File attributes'
  section. ]


Search Results
--------------

Search results consist of a number of entries, one entry per file. For
each entry the remote file meta-information (see 'Meta-information') is
available.

[ Files may be displayed as groups. There are several ways to group
  files, which include:
  o Group collections of files.
  o Group similar file names. ]


Downloading
-----------

Downloading a file published by a (remote) peer on one of the P2P
networks.

Downloads can be initiated in two ways:

1. From a search result (e.g. clicking a search result file entry).
2. By specifying a native file ID.

The following information about the download is available:

  o Download progress.
  o Remote file meta-information (see 'Meta-information').
  o Remote peer meta-information (see 'Meta-information').

[ Nice features:

  When initiated from the search results:
    o any of a group
    o swarmed

  Resume broken download. ]


Publishing
----------

Making a file available to the P2P networks for searching and/or
downloading. We also discuss 'unpublishing' here.

During a publish operation by the user a number of operations may need
to be performed by the application, depending on each P2P network:

  o Registering information about the file in a service (GDN, mnet).
  o Uploading the file and its contents to a P2P network (GDN, CFS,
    mnet).
  o Merely making the file visible to incoming search requests and
    download requests (Gnutella).

The user can publish files in one of two modes:

1) Individual file publishing.
2) File collection publishing.

The difference between these two modes is that with file collection
publishing, the user explicitly groups a number of files together as a
file collection. In the individual file publishing mode, no grouping is
performed by the user. These two modes are further described in the
following sections.


Individual file publishing
--------------------------

In this mode the user selects a number of files for publishing without
explicitly creating a file collection. The CP2PC application places any
such files in a 'default collection'. This is transparent to the user.


File collection publishing
--------------------------

In this mode the user selects a number of files for publishing and
creates a separate file collection for them. The following operations
on collections are supported:

  o Create/publish collection.
    This operation (a) defines what files are initially part of the
    collection, and (b) publishes the files in the collection.
  o Modify collection.
    This operation allows files to be added to or removed from an
    existing collection. Basically, this operation allows the initial
    selection of files to be modified.
  o Delete/unpublish collection.
    This operation withdraws the files in a collection from
    publication, and deletes the collection.

'Published collection' meta-information is available for each
collection currently published (see 'Meta-information').


Selection
---------

There are two ways to select files for publishing. They apply to both
individual file publishing and file collection publishing.

1. Select files in the file system (such as in JXTA). Such files can
   also be unpublished later by unselecting them.

2. 'Monitored directories'. Directories in the file system can be
   selected, whose files are to be published (such as in LimeWire
   (Gnutella)). The CP2PC application will monitor such directories for
   changes as follows. Files that are added to the directory are
   published by the application; files that are removed from the
   directory are unpublished by the application. There may be a delay
   between the changes to the contents of a directory and the
   publish/unpublish operation. [ Some kind of 'update' button might be
   added. ] Monitored directories can also be unselected later, in
   which case any files in the directory are unpublished.

[ Using the API terminology defined in the CP2PC API document ('P2P
  component' and 'application component'), a directory monitored by the
  application is not necessarily monitored by P2P components, even for
  Gnutella-like P2P components. It may instead be monitored by the
  application component.

  It may be useful to allow files to be deleted from monitored
  directories without necessarily unpublishing them. A P2P network in
  which published files are uploaded to the network does not require a
  published file to be present in the user's file system.

  It may also be useful to unpublish a file from a monitored directory
  without having to delete it.

  A filter (e.g. a file pattern) might optionally be specified with a
  monitored directory. Files in the directory that do not match this
  pattern are never published.
]

Notes:

  o Monitoring the *contents* of a published file is a separate issue.
    We do not require our CP2PC application to monitor the contents of
    published files.

  o Modifications to the contents of a published file (performed
    outside of the application) may or may not be observed by other
    peers. In the normal case this is unspecified (however see below).
    In particular it is possible that such changes are visible on only
    some P2P networks but not on others.

As a more advanced feature the application could allow the user to
explicitly specify that a file be published with its *current*
contents. If the contents of the file change afterwards, the changes
are not visible to other peers. [ This may require the application to
create an internal copy of the file. ]


Miscellaneous
-------------

A list of currently published files is available. 'Published file'
meta-information (see 'Meta-information') is available for each file
currently published.

[ Use smileys for the status of each file:

    :-)  published
    :-x  locally available, but not published
    :-(  file is published but not locally available (error) ]


Meta-information
----------------

Meta-information about objects such as files and peers.


Remote file
-----------

File published by a remote peer:

  o Native and CP2PC file ID.
  o File collection that the file is part of.
  o File attributes.
  o The remote peer that published the file (plus meta).
  o The P2P network that the file has been published on.


Published file
--------------

File published by the CP2PC application:

  o Native and CP2PC file ID.
  o Collection it is part of (plus meta).
  o Publishing progress (think: uploading).


Published collection
--------------------

File collection published by the CP2PC application:

  o Native and CP2PC collection ID.
  o Published files in the collection (plus meta).


Remote peer
-----------

XXX


Configuration and Control
-------------------------

We distinguish general configuration and network-specific
configuration.


General configuration
---------------------

  o Whether behind a firewall.
  o Specification of Internet connection bandwidth.
  o Which networks to join.
  o Whether to be a minimal peer, or offer optional services as well
    (Gnutella superpeer, JXTA rendezvous, mnet services, etc.)
  o File system resources available to CP2PC (maps to e.g. the number
    of CFS virtual servers to create).

[ Limit bandwidth for use by data transmissions. ]


Network-specific configuration
------------------------------

  o Contact address of another peer on each P2P network.
  o Addresses of GDN object servers.
  o Relay address to use, when behind a firewall.
  o Authentication information (e.g. a password).
  o Mojo control.
  o JXTA peer groups to be a member of.

[ Net-specific overrides of general configuration. For example, in the
  general configuration the user might specify to be a 'minimal peer'.
  In the network-specific configuration, the user might specify that it
  should be a JXTA rendezvous.

  Define and block freeloading. ]


Monitoring
----------

Monitoring the activity and resource usage of the CP2PC application and
of the P2P networks. Much of this information may be available to a
limited extent only.


P2P Networks
------------

How large is a P2P network: number of peers and files.
Info about connections to a P2P network.
Services that are offered by this peer to a P2P network (see
Configuration).


File system resources
---------------------

Resources in use for storage by a P2P network, typically for remote
peers.
Resources used by my published files.


Incoming searches
-----------------

What searches am I handling? May apply to Gnutella and mnet only.


File data transmission
----------------------

File data transmitted to and from other peers. This includes:

  o Downloads initiated by the user.
  o Data received from a remote peer for storage by this peer (CFS).
  o Data sent by this peer for storage by a remote peer (CFS), or as a
    result of a remote user search.

The information includes (to the extent available):

  o Number of bytes transmitted so far.
  o Speed (bytes per second) of current transmissions.
  o Currently active transmissions: how many, what peers.
  o The number of requests for locally published files.


Other
-----

Info about mojo.


[
Other nice features
-------------------

History of files previously published, downloads, searches.
Persistence and Crash recovery, e.g. auto-resumption of current
downloads.


Peer Group
----------

Peers can be grouped together to form a peer group. Depending on the
P2P network, a peer group may be used for various purposes, such as
visibility of publishing, or security. Not all P2P networks have a peer
group concept.

In Gnutella something like a peer group can be defined by having a
common password among several peers. In Gnutella (LimeWire application)
each peer is part of at most one peer group.

In JXTA multiple peer groups can be defined and named, and a peer can
be a member of multiple peer groups.

In GDN a peer group could be defined as the set of users that are
registered with the same access granting organisation. These users can
publish their files on the same set of object servers (those object
servers that trust the access granting organisation).

[ In the CP2PC application the peer group concept can be mostly tucked
  away. However, advanced users should be able to specify peer groups
  for CP2PC operations, when that is appropriate. Each CP2PC peer group
  corresponds to a single peer group in one of the P2P networks, and is
  specified in a network-specific way. ]
]


CHANGES
-------

Since version 6
---------------

  o In the introduction, mentioned that 'P2P network' means 'P2P file
    sharing network'.
  o In 'Native File ID' clarified that a native file ID is a string and
    that it need not be a URI. Also added a note that in the future we
    may need to deal with a binary native file ID as well.

Since version 5
---------------

  o In 'Collection ID', clarified why a collection ID should be
    distinguishable from a file ID.

Since version 4
---------------

  o In the introduction, added comments that this document does not
    address the internal design of the application.
  o In 'File Collection', specified that CP2PC collections are allowed
    to be empty.
  o In 'Selection', specified that a directory monitored by the
    application is not necessarily monitored by P2P components.
  o In 'Selection', specified the 'advanced feature' of allowing the
    user to specify that a file be published with its current contents.
  o In 'Miscellaneous' changed 'Use smileys for the status of each
    published file' to 'Use smileys for the status of each file'.
  o In 'Peer Group' added comments about how to define a peer group in
    GDN.