VOStore interface

Reagan Moore moore at sdsc.edu
Fri Aug 5 17:11:56 PDT 2005

This is where we differ in the specification.

Which account does VOStore run under on the local storage system?


>If the user is accessing the local storage system directly then they 
>can do whatever they want. VOStore, however, is the presentation of 
>that repository to the VO world and does not necessarily interface 
>with a VOSpace layer: this means that the VOStore interface has to 
>be capable of handling the VO authentication mechanism. The 
>authorization story is as we seemed to have agreed.
>    Cheers,
>    Matthew
>Reagan Moore wrote:
>>The expectation is that the VOStore interface does not need to do 
>>either authentication or authorization.  If a person is working 
>>directly with a local storage system, then they are accessing their 
>>own personal data while running under their personal account ID. 
>>They can execute the VOStore interface as a local application.
>>If VOSpace is accessing the local storage system through VOStore, 
>>then VOSpace authenticates its access to the local storage system 
>>to read or write files under the VOSpace account ID.  Again VOStore 
>>is just a local application that VOSpace executes.
>>If the owner of data on the local storage repository chooses to 
>>make a file world readable, then VOSpace would be able to access 
>>the file through VOStore.
>>>Reagan Moore wrote:
>>>>I would like to propose the following separation of identity and 
>>>>access control management.  The issues appear to be how to 
>>>>separate support for local files in a local storage repository 
>>>>from the files that are registered into a shared collection that 
>>>>spans multiple storage repositories.  An easy way to make the 
>>>>differentiation is to identify the usage model for each type of 
>>>>data management system.  I would like to learn whether this 
>>>>approach would meet all of the IVOA requirements.
>>>>Local storage repository:
>>>>This is a storage system that is controlled by local 
>>>>administrators who establish access accounts for the persons who 
>>>>are allowed to use the system.
>>>>The users can choose their own file names, manipulate the files 
>>>>with the utilities that are available on the local storage, and 
>>>>are authenticated by the local system.  If desired, a user could 
>>>>log onto the local storage repository, and use a VO specific 
>>>>interface such as VOStore to access their own personal data. 
>>>>Since VOStore would be run under their account ID to access files 
>>>>that they own, there is no additional required authentication. 
>>>>They could also use other access mechanisms such as perl scripts, 
>>>>or Unix shell commands, C library calls, whatever is supported on 
>>>>the local storage repository.  These access mechanisms allow them 
>>>>to access files that they own.
>>>>A VOStore interface for this usage model would provide:
>>>>- get file
>>>>- put file
>>>>- list files
>>>>The only advantage is that if the VOStore interface were 
>>>>supported on all local storage repositories, the user would have 
>>>>a standard access mechanism.
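As a rough illustration of this minimal usage model, the three operations could be sketched in Python as below. The class name LocalVOStore and its method names are hypothetical and are not taken from any VOStore draft. The sketch assumes the store is simply a directory that the calling user can already read and write, so it performs no authentication of its own.

```python
from pathlib import Path


class LocalVOStore:
    """Hypothetical get/put/list interface over a directory.

    It runs under whatever account invokes it and relies entirely on the
    local file system permissions; no authentication is done here.
    """

    def __init__(self, root):
        self.root = Path(root)

    def put_file(self, name, data):
        # Store the bytes under the user's own chosen file name.
        (self.root / name).write_bytes(data)

    def get_file(self, name):
        # Return the stored bytes; fails if the local account cannot read them.
        return (self.root / name).read_bytes()

    def list_files(self):
        # List regular files only, in a stable order.
        return sorted(p.name for p in self.root.iterdir() if p.is_file())
```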
>>>>Shared collection - VOSpace:
>>>>The purpose of the shared collection is to organize files across 
>>>>multiple storage repositories, provide a way to register files 
>>>>into the shared collection, establish access controls on the 
>>>>shared data, provide standard services for manipulating the files 
>>>>(Cone Search, SIAP, SSAP, Mosaic, ...), support replication, 
>>>>support selection of the closest file.
>>>>The shared collection provides a global (or logical) name space 
>>>>that can be organized in a directory structure independently of 
>>>>the naming convention and path hierarchy employed at the local 
>>>>storage systems. Thus the VOSpace system must manage the mapping 
>>>>from the logical name space to the naming convention used in the 
>>>>local storage system.
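A minimal sketch of such a mapping, assuming it is held as a simple in-memory table, might look like the following. The names LogicalNamespace and PhysicalLocation, the repository label and the example paths are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLocation:
    repository: str  # hypothetical label for a local storage repository
    local_path: str  # name in that repository's own naming convention


class LogicalNamespace:
    """Maps logical (VOSpace) names to locations in local repositories."""

    def __init__(self):
        self._entries = {}  # logical name -> list of PhysicalLocation

    def register(self, logical_name, location):
        # The same logical name may point at several physical copies.
        self._entries.setdefault(logical_name, []).append(location)

    def resolve(self, logical_name):
        if logical_name not in self._entries:
            raise KeyError(f"no such logical name: {logical_name}")
        return list(self._entries[logical_name])


ns = LogicalNamespace()
ns.register("/home/alice/m31.fits",
            PhysicalLocation("store-a", "/data/0001/ab12.fits"))
```

The logical directory structure is whatever the logical names spell out, so it can differ freely from the path hierarchy at each repository.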
>>>>An account ID is established under which the shared collection 
>>>>(VOSpace) is able to deposit files in the local storage 
>>>>repository. This means the shared collection owns the data that 
>>>>is stored at the local storage repository.  In order to access 
>>>>the data, a user would need to authenticate herself to the shared 
>>>>collection, which in turn authenticates itself to the local 
>>>>storage repository. Whether or not to allow the access is 
>>>>controlled by ACLs managed by VOSpace.  This means that the 
>>>>authentication mechanism used by VOSpace is completely 
>>>>independent of the authentication mechanisms used by the local 
>>>>storage systems.
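The two-step access path described here can be sketched as follows. This is only an illustration under stated assumptions: the user is taken to be already authenticated to the shared collection, the ACL is a plain set of user names per file, and the class names LocalRepository and SharedCollection are hypothetical.

```python
class AccessDenied(Exception):
    """Raised when an ACL or a local account check refuses a request."""


class LocalRepository:
    """Stands in for a local storage system that only knows local accounts."""

    def __init__(self):
        self._files = {}  # local path -> (owning account, contents)

    def write(self, account, path, data):
        self._files[path] = (account, data)

    def read(self, account, path):
        owner, data = self._files[path]
        if owner != account:
            raise AccessDenied(f"{account} does not own {path}")
        return data


class SharedCollection:
    """Owns the deposited files and applies its own ACLs to VO users."""

    def __init__(self, account, repository):
        self.account = account        # account ID used at the repository
        self.repository = repository
        self.acls = {}                # path -> set of users allowed to read

    def deposit(self, path, data, readers):
        # The file is written under the collection's account, not the user's.
        self.repository.write(self.account, path, data)
        self.acls[path] = set(readers)

    def read(self, user, path):
        # Step 1: the collection decides, using its own ACLs.
        if user not in self.acls.get(path, set()):
            raise AccessDenied(f"{user} may not read {path}")
        # Step 2: the collection accesses the repository as itself.
        return self.repository.read(self.account, path)
```

The repository never sees the VO user's identity, which is why the two authentication mechanisms can be chosen independently.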
>>>>In order to handle the fact that local storage systems use a 
>>>>variety of authentication mechanisms (Unix password, PKI 
>>>>certificates, Kerberos tickets, DCE credentials, ...) the 
>>>>VOSpace implementation could use the Generic Security Service API 
>>>>(GSSAPI) to handle the heterogeneity.  In addition, an arbitrary 
>>>>authentication mechanism can be chosen for authenticating users 
>>>>to VOSpace.
>>>>If a VOStore interface is provided by the local storage 
>>>>repository, then VOSpace would be able to invoke the VOStore 
>>>>access mechanism (running under the VOSpace account ID).  Note 
>>>>that in this model VOStore does no authentication.  All 
>>>>authentication is controlled by a combination of the local 
>>>>storage system and VOSpace.
>>>>The type of operations that would be required by VOStore, 
>>>>however, are more sophisticated.  They include:
>>>>- get file
>>>>- put file
>>>>- list files
>>>>- register an existing file into VOSpace, while mapping from the 
>>>>local name to the VOSpace preferred name
>>>>- register an existing directory structure into VOSpace, while 
>>>>setting the VOSpace logical names and VOSpace directory structure 
>>>>to be the same as the local directory structure
>>>>- register an existing local file into VOSpace as a replica of an 
>>>>existing VOSpace logical file.
>>>>With the latter three commands, it is possible to meet the 
>>>>specific requirement that users be able to control the names of 
>>>>files both on the local system and in VOSpace.  Note that for the 
>>>>user to access the local file system they required an account ID 
>>>>on the local file system.  They then stored a local file under 
>>>>their own account ID. They would add read permission for the 
>>>>VOSpace account ID to their local file to permit access by VOSpace.
>>>>This separates authorization cleanly between the local storage 
>>>>system (which only checks for access by local account IDs) and 
>>>>the VOSpace shared collection (which authorizes all accesses to 
>>>>files owned by VOSpace).  This means that VOSpace is managing 
>>>>multiple levels of indirection:
>>>>- mapping from the global or logical file name space to the local 
>>>>repository name space
>>>>- mapping from an authenticated user through application of ACLs 
>>>>to decide whether the user can read a VOSpace owned file.
>>>>- mapping to the preferred location for accessing replicas (typically 
>>>>pick a file on the file system with the user's IP address, then 
>>>>any other file system, then a tape archive)
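The replica preference in the last item could be sketched as a simple ranking. The dictionary fields "kind" and "address" are assumptions made for this example, as are the sample addresses; nothing here is taken from an actual VOSpace or SRB interface.

```python
def replica_rank(replica, user_address):
    """Lower rank is preferred: same-address disk, other disk, then tape."""
    if replica["kind"] == "disk" and replica["address"] == user_address:
        return 0
    if replica["kind"] == "disk":
        return 1
    return 2  # anything else is treated as a tape archive


def choose_replica(replicas, user_address):
    # min() returns the first replica with the lowest rank.
    return min(replicas, key=lambda r: replica_rank(r, user_address))


replicas = [
    {"kind": "tape", "address": "192.0.2.30"},
    {"kind": "disk", "address": "192.0.2.20"},
    {"kind": "disk", "address": "192.0.2.10"},
]
```

With these sample replicas, a user at 192.0.2.10 would be given the third entry, and a user at any other address the first disk copy in the list.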
>>>>For completeness, VOStore may need an operation that sets access 
>>>>permission for VOSpace, when VOStore is run under the local user 
>>>>account ID.
>>>>Reagan Moore
>>>>>I think that most of what is VOStore and what is VOSpace is 
>>>>>clear; however, the two grey areas are access control 
>>>>>(authorization) and identifiers and this stems from the use case 
>>>>>where the user wants direct access to a VOStore (e.g. a local 
>>>>>store) and does not want to go through the VOSpace layer. Here 
>>>>>are my suggestions for handling these areas:
>>>>>Access control:
>>>>>A VOStore can run in two modes: authorized and unauthorized. An 
>>>>>unauthorized VOStore is semantically equivalent to an anonymous 
>>>>>ftp site: any authenticated user (we still maintain security) 
>>>>>can put something in, move/rename it, get it and delete it.
>>>>>An authorized VOStore will only allow the requested operation if 
>>>>>a valid authentication token is included in the request - all 
>>>>>the VOStore has to do here is validate the authentication token. 
>>>>>The generation of the authentication token is handled by 
>>>>>VOSpace: it makes sure that the authenticated user has 
>>>>>permission to do what they are requesting and if so, places a 
>>>>>valid token in the request down to the VOStore.
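One way such a token could work is sketched below. The message does not say how tokens are built, so this example assumes an HMAC over the request fields with a secret shared between VOSpace and the VOStore; the function names and the secret value are hypothetical.

```python
import hashlib
import hmac

# Assumption: VOSpace and the VOStore share this secret out of band.
SHARED_SECRET = b"example-secret"


def issue_token(user, operation, target, secret=SHARED_SECRET):
    """VOSpace side: called only after its own permission check succeeds."""
    message = "\n".join([user, operation, target]).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def validate_token(token, user, operation, target, secret=SHARED_SECRET):
    """VOStore side: recompute the token and compare in constant time."""
    expected = issue_token(user, operation, target, secret)
    return hmac.compare_digest(token, expected)
```

In this arrangement the VOStore keeps no user list at all; it only checks that the request it received is the one VOSpace approved.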
>>>>>The protocol identifier ivo:// identifies a resource that exists 
>>>>>in the VO. It does not promise that you can completely resolve a 
>>>>>URI beginning ivo:// in a registry, merely that some component 
>>>>>of the URI will relate to a resource that has a registry entry, 
>>>>>i.e. the bit before the first # can be resolved in a registry. 
>>>>>So I can go to a registry and find out where 
>>>>>ivo://nvo.caltech/vostores/vostore1 is
>>>>>but I need to go to the VOStore interface for this store to resolve 
>>>>>ivo://nvo.caltech/vostores/vostore1#halibut3. I do not see why 
>>>>>we need to introduce a second protocol just for VOStore contents.
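The split described here, registry-resolvable part before the first # and store-local part after it, can be shown with a short sketch. The function name is hypothetical; the identifier is the one used in the message.

```python
def split_ivo_identifier(identifier):
    """Split an ivo:// identifier at the first '#'.

    Returns (registry_part, local_part); local_part is None when the
    identifier has no fragment.
    """
    if not identifier.startswith("ivo://"):
        raise ValueError(f"not an ivo:// identifier: {identifier}")
    registry_part, _, local_part = identifier.partition("#")
    return registry_part, (local_part or None)


registry_part, local_part = split_ivo_identifier(
    "ivo://nvo.caltech/vostores/vostore1#halibut3")
```

Here registry_part is what would be looked up in a registry, and local_part is what would be passed to that store's VOStore interface.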
>>>>>Now resolution of individual VOStore identifiers has to be done 
>>>>>at the VOStore level; however, VOSpace gives you the ability to 
>>>>>set up a single logical identifier for multiple copies of the 
>>>>>same resource so here we might want a separate protocol (vos:), and 
>>>>>resolution of this identifier has to be done at the VOSpace 
>>>>>level since VOSpace manages multiple VOStores.
>>>>>    Cheers,
>>>>>    Matthew
>>>>>Paul Harrison wrote:
>>>>>>Reagan Moore wrote:
>>>>>>>The differentiation between the VOStore and VOSpace interfaces 
>>>>>>>is becoming unclear.  The latest draft implies that properties 
>>>>>>>that were originally associated with VOSpace would now be 
>>>>>>>supported by VOStore.
>>>>>>I have to say that I agree that there seems to be some 
>>>>>>confusion in this area - with hindsight it was probably a 
>>>>>>mistake to defer the specification of VOSpace and work on 
>>>>>>VOStore alone as the "easier" problem - the specifications 
>>>>>>should be worked in tandem to see where it is most appropriate 
>>>>>>to place roles and responsibilities for particular use cases, 
>>>>>>so that a "global" solution is arrived at.
>>>>>>I thought that the original separation into VOStore and VOSpace 
>>>>>>was done so that VOStore could be an essentially "dumb" BLOB 
>>>>>>repository that did what it was told by the VOSpace layer when 
>>>>>>it comes to issues of file permissions and hierarchical file 
>>>>>>names. However, because no VOSpace specification was created, 
>>>>>>these more advanced features have crept into the VOStore layer.
>>>>>>>Let's look at the current VOStore and VOSpace proposal:
>>>>>>>VOStore                                     VOSpace
>>>>>>>Storage of objects                          management of virtual file system
>>>>>>>data stored under unspecified ID?
>>>>>>>no user home directory                      User home directory
>>>>>>>directory hierarchy                         Directory hierarchy
>>>>>>>Unique file name within storage             User-defined file names
>>>>>>>                                            Mapping VOSpace name to VOStore name
>>>>>>>                                            List files for user
>>>>>>>Restrict access by user identity?
>>>>>>>Identify files with URIs
>>>>>>>Access controls on local file name          Access controls on VOSpace name
>>>>>>>This characterization mixes name space, mixes access controls, 
>>>>>>>does not provide consistent identity, does not allow 
>>>>>>>consistent management.  For instance, if a URI is being 
>>>>>>>provided for file identity within the VOStore interface, then 
>>>>>>>there is no need for user-specified names within VOStore.  A 
>>>>>>>second issue is the assumption that file access can be 
>>>>>>>restricted by user identity. This means that the VOStore must 
>>>>>>>manage the owner for each file, access controls for each file. 
>>>>>>>File systems usually do this by creating accounts for each 
>>>>>>>user name and applying Unix permissions.  Is this capability 
>>>>>>>to be provided now by both VOSpace and VOStore?  We need a 
>>>>>>>cleaner separation of capabilities.
>>>>>>This security aspect is crucial - it is clear that the owners 
>>>>>>of VOStores would not want to be managing user identity lists 
>>>>>>of all the VObs users at their stores - the fine grained access 
>>>>>>controls should be at the VOSpace level. If VOStores only 
>>>>>>respond to requests from trusted VOSpace services then this is 
>>>>>>possible, but I think that the perceived requirement for more 
>>>>>>detailed access control in the VOStore layer has come about 
>>>>>>because prototype end-user applications have appeared that talk 
>>>>>>directly to the VOStore layer - of course, it is not surprising 
>>>>>>that this has happened because there was no VOSpace definition 
>>>>>>for the end user applications to talk to.
>>>>>>How file/BLOB identity is managed is also crucial to producing 
>>>>>>a system that offers more than ftp. I thought that one of the 
>>>>>>fundamental driving use cases for a VOSpace was that the same 
>>>>>>BLOB could potentially live on several VOStores, and that when 
>>>>>>specifying a resource in VOSpace, in a workflow for instance, 
>>>>>>the resource could be retrieved from the VOStore that was 
>>>>>>"closest" on the network to where the resource would be 
>>>>>>consumed. This sort of use case does require some careful 
>>>>>>thought about the allocation and management of identifiers, and 
>>>>>>I think probably means that the VOStore will have to be aware 
>>>>>>of the VOSpace identifier.
>>>>>>I also have an issue with reusing ivo: as the protocol part for 
>>>>>>the URI of an identifier in this system - ivo: is already well 
>>>>>>defined and used as the identifier for registry entries, and the 
>>>>>>"protocol" for accessing the entity associated with the 
>>>>>>identifier is defined in the registry interface standard. This 
>>>>>>means that given an identifier of the form 
>>>>>>ivo://authority.org/something#blah a software agent (or human 
>>>>>>for that matter) cannot tell by inspection whether the 
>>>>>>identifier refers to a file in VOSpace or is simply a reference 
>>>>>>to a registry entry (e.g. for a SkyNode) - this leads to 
>>>>>>software having to be more complex in order constantly to test 
>>>>>>for the different possibilities. I think that it would be 
>>>>>>better to have a URI with a different protocol part, vos: for 
>>>>>>instance, it would then be immediately apparent that the 
>>>>>>VOSpace protocol should be used to access the entity referred 
>>>>>>to by the identifier.
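The benefit of a distinct protocol part can be illustrated with a small dispatch sketch. The vos: scheme is only a proposal in this thread, so the identifier forms and the returned labels below are assumptions for the example.

```python
from urllib.parse import urlsplit


def access_protocol(identifier):
    """Decide how to access an identifier by looking only at its scheme."""
    scheme = urlsplit(identifier).scheme
    if scheme == "vos":
        return "vospace"   # hand the identifier to a VOSpace service
    if scheme == "ivo":
        return "registry"  # resolve it through the registry interface
    raise ValueError(f"unrecognised scheme in {identifier!r}")
```

With ivo: used for both purposes, no such one-line test is possible and the software has to try each interpretation in turn.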
>>>>>>>Let's look at the Storage Resource Broker data grid separation 
>>>>>>>of local storage management from the virtual file system:
>>>>>>>Local storage system                        SRB name space
>>>>>>>Storage of objects                          management of virtual file system
>>>>>>>data stored under SRB ID
>>>>>>>no user home directory                      User home directory
>>>>>>>directory indirection structure             Directory hierarchy
>>>>>>>Unique file name within storage             User-defined file names
>>>>>>>                                            Mapping SRB name to local file name
>>>>>>>                                            List files for user
>>>>>>>Access through SRB ID, controlled by SRB
>>>>>>>                                            Identify files by URIs
>>>>>>>                                            Access controls on SRB name
>>>>>>I think that, as Reagan points out, the separation of 
>>>>>>responsibilities that SRB has with the local storage system is 
>>>>>>pretty much the right model for VOSpace and VOStore - though 
>>>>>>it means that SRB is pretty much at VOSpace level rather than a 
>>>>>>VOStore as is suggested in the current VOSpace definition 
>>>If you also allow the possibility that the local storage 
>>>repository can run in an unauthorized (anonymous access) manner 
>>>then this is exactly what Guy and I were suggesting. Does that 
>>>mean that we actually all agree on this :-)
>>>    Cheers,
>>>    Matthew
