VOStore interface

Reagan Moore moore at sdsc.edu
Mon Aug 8 12:15:45 PDT 2005


Matthew:
The account does make a difference when VOStore attempts to access 
the requested data.

If VOStore gets an authenticated request for local data that belongs 
to another user ID, how does VOStore gain access to the data?

The challenge is that VOStore is being used to access both local data 
owned by an individual, as well as data that is being published into 
VOSpace.

Access to data in VOSpace can be achieved by having VOStore run under 
the VOSpace account ID.

Access to data owned by an individual requires either that VOStore 
run under the individual's account, or that permission be given by 
the individual for access by the account that VOStore is run under.

One way out is to assume VOStore is run under the account that 
corresponds to VOSpace.  Then VOSpace can manage the authorization 
independently of VOStore.
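
The account options above can be sketched as a small decision function. This is only an illustration: the names `StoredFile` and `can_access` and the account strings are hypothetical, not part of any VOStore draft.

```python
from dataclasses import dataclass, field


@dataclass
class StoredFile:
    """A file on the local storage system (hypothetical model)."""
    owner: str                                  # local account ID that owns the file
    readers: set = field(default_factory=set)   # accounts granted read access by the owner
    world_readable: bool = False


def can_access(run_as: str, f: StoredFile) -> bool:
    """Return True if a VOStore process running under `run_as` can read `f`."""
    if run_as == f.owner:          # VOStore runs under the owning account
        return True
    if f.world_readable:           # owner made the file world readable
        return True
    return run_as in f.readers     # owner granted permission to this account


published = StoredFile(owner="vospace")                 # data published into VOSpace
private = StoredFile(owner="alice")                     # personal data, no grant
shared = StoredFile(owner="alice", readers={"vospace"}) # personal data, grant to VOSpace

for name, f in [("published", published), ("private", private), ("shared", shared)]:
    print(name, can_access("vospace", f))
```

With VOStore running under the VOSpace account, only the file with no grant is out of reach, which is the case the question above is about.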

Reagan


>Hi,
>
>I would argue that this is an implementation issue: you have to make 
>sure that VOStore can fulfil what it promises.
>
>The required functionality for authentication is just that the 
>VOStore can recognise a valid message, e.g. the certificate used to 
>sign the SOAP message has the NVO CA in its certificate chain.
>
>    Cheers,
>
>    Matthew
>
>Reagan Moore wrote:
>
>>Matthew:
>>This is where we differ in the specification.
>>
>>Which account does VOStore run under on the local storage system?
>>
>>Reagan
>>
>>>Hi,
>>>
>>>If the user is accessing the local storage system directly then 
>>>they can do whatever they want. VOStore, however, is the 
>>>presentation of that repository to the VO world and does not 
>>>necessarily interface with a VOSpace layer: this means that the 
>>>VOStore interface has to be capable of handling the VO 
>>>authentication mechanism. The authorization story is as we seemed 
>>>to have agreed.
>>>
>>>    Cheers,
>>>
>>>    Matthew
>>>
>>>
>>>Reagan Moore wrote:
>>>
>>>>Matthew:
>>>>
>>>>The expectation is that the VOStore interface does not need to do 
>>>>either authentication or authorization.  If a person is working 
>>>>directly with a local storage system, then they are accessing 
>>>>their own personal data while running under their personal 
>>>>account ID. They can execute the VOStore interface as a local 
>>>>application.
>>>>
>>>>If VOSpace is accessing the local storage system through VOStore, 
>>>>then VOSpace authenticates its access to the local storage system 
>>>>to read or write files under the VOSpace account ID.  Again 
>>>>VOStore is just a local application that VOSpace executes.
>>>>
>>>>If the owner of data on the local storage repository chooses to 
>>>>make a file world readable, then VOSpace would be able to access 
>>>>the file through VOStore.
>>>>
>>>>Reagan
>>>>
>>>>>Reagan Moore wrote:
>>>>>
>>>>>>I would like to propose the following separation of identity 
>>>>>>and access control management.  The issues appear to be how to 
>>>>>>separate support for local files in a local storage repository 
>>>>>>from the files that are registered into a shared collection 
>>>>>>that spans multiple storage repositories.  An easy way to make 
>>>>>>the differentiation is to identify the usage model for each 
>>>>>>type of data management system.  I would like to learn whether 
>>>>>>this approach would meet all of the IVOA requirements.
>>>>>>
>>>>>>Local storage repository:
>>>>>>
>>>>>>This is a storage system that is controlled by local 
>>>>>>administrators who establish access accounts for the persons 
>>>>>>who are allowed to use the system.
>>>>>>The users can choose their own file names, manipulate the files 
>>>>>>with the utilities that are available on the local storage, and 
>>>>>>are authenticated by the local system.  If desired, a user 
>>>>>>could log onto the local storage repository, and use a VO 
>>>>>>specific interface such as VOStore to access their own personal 
>>>>>>data. Since VOStore would be run under their account ID to 
>>>>>>access files that they own, there is no additional required 
>>>>>>authentication. They could also use other access mechanisms 
>>>>>>such as perl scripts, or Unix shell commands, C library calls, 
>>>>>>whatever is supported on the local storage repository.  These 
>>>>>>access mechanisms allow them to access files that they own.
>>>>>>
>>>>>>A VOStore interface for this usage model would provide:
>>>>>>- get file
>>>>>>- put file
>>>>>>- list files
>>>>>>The only advantage is that if the VOStore interface were 
>>>>>>supported on all local storage repositories, the user would 
>>>>>>have a standard access mechanism.
>>>>>>
>>>>>>Shared collection - VOSpace:
>>>>>>
>>>>>>The purpose of the shared collection is to organize files 
>>>>>>across multiple storage repositories, provide a way to register 
>>>>>>files into the shared collection, establish access controls on 
>>>>>>the shared data, provide standard services for manipulating the 
>>>>>>files (Cone Search, SIAP, SSAP, Mosaic, ...), support 
>>>>>>replication, support selection of the closest file.
>>>>>>
>>>>>>The shared collection provides a global (or logical) name space 
>>>>>>that can be organized in a directory structure independently of 
>>>>>>the naming convention and path hierarchy employed at the local 
>>>>>>storage systems. Thus the VOSpace system must manage the 
>>>>>>mapping from the logical name space to the naming convention 
>>>>>>used in the local storage system.
>>>>>>
>>>>>>An account ID is established under which the shared collection 
>>>>>>(VOSpace) is able to deposit files in the local storage 
>>>>>>repository. This means the shared collection owns the data that 
>>>>>>is stored at the local storage repository.  In order to access 
>>>>>>the data, a user would need to authenticate herself to the 
>>>>>>shared collection, which in turn authenticates itself to the 
>>>>>>local storage repository. Whether or not to allow the access is 
>>>>>>controlled by ACLs managed by VOSpace.  This means that the 
>>>>>>authentication mechanism used by VOSpace is completely 
>>>>>>independent of the authentication mechanisms used by the local 
>>>>>>storage systems.
>>>>>>
>>>>>>In order to handle the fact that local storage systems use a 
>>>>>>variety of authentication mechanisms (Unix password, PKI 
>>>>>>certificates, Kerberos certificates, DCE credentials, ...) the 
>>>>>>VOSpace implementation could use the Generic Security Service 
>>>>>>API (GSSAPI) to handle the heterogeneity.  In addition, an 
>>>>>>arbitrary authentication mechanism can be chosen for 
>>>>>>authenticating users to VOSpace.
>>>>>>
>>>>>>If a VOStore interface is provided by the local storage 
>>>>>>repository, then VOSpace would be able to invoke the VOStore 
>>>>>>access mechanism (running under the VOSpace account ID).  Note 
>>>>>>that in this model VOStore does no authentication.  All 
>>>>>>authentication is controlled by a combination of the local 
>>>>>>storage system and VOSpace.
>>>>>>
>>>>>>The type of operations that would be required by VOStore, 
>>>>>>however, are more sophisticated.  They include:
>>>>>>- get file
>>>>>>- put file
>>>>>>- list files
>>>>>>- register an existing file into VOSpace, while mapping from 
>>>>>>the local name to the VOSpace preferred name
>>>>>>- register an existing directory structure into VOSpace, while 
>>>>>>setting the VOSpace logical names and VOSpace directory 
>>>>>>structure to be the same as the local directory structure
>>>>>>- register an existing local file into VOSpace as a replica of 
>>>>>>an existing VOSpace logical file.
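
As a rough illustration of the three registration operations listed above, here is a minimal in-memory sketch. The `Catalog` class, its method names, and all paths are assumptions made for the example; the thread does not define an API.

```python
import posixpath


class Catalog:
    """Hypothetical VOSpace catalog: logical name -> list of (store, local path)."""

    def __init__(self):
        self.entries = {}

    def register_file(self, store, local_path, logical_name):
        """Register an existing local file under a VOSpace preferred name."""
        if logical_name in self.entries:
            raise ValueError(f"{logical_name} is already registered")
        self.entries[logical_name] = [(store, local_path)]

    def register_directory(self, store, local_root, local_paths, logical_root):
        """Register files so the logical tree mirrors the local directory tree."""
        for path in local_paths:
            relative = posixpath.relpath(path, local_root)
            self.register_file(store, path, posixpath.join(logical_root, relative))

    def register_replica(self, store, local_path, logical_name):
        """Register a local file as a replica of an existing logical file."""
        if logical_name not in self.entries:
            raise KeyError(f"{logical_name} is not registered")
        self.entries[logical_name].append((store, local_path))


catalog = Catalog()
catalog.register_file("store1", "/data/alice/m31.fits", "/vospace/alice/andromeda.fits")
catalog.register_directory(
    "store1",
    "/data/alice/survey",
    ["/data/alice/survey/run1/a.fits", "/data/alice/survey/run2/b.fits"],
    "/vospace/alice/survey",
)
catalog.register_replica("store2", "/archive/0042", "/vospace/alice/andromeda.fits")

for logical_name, replicas in sorted(catalog.entries.items()):
    print(logical_name, replicas)
```

The local name and the logical name are stored side by side, so the user controls both independently.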
>>>>>>
>>>>>>With the latter three commands, it is possible to meet the 
>>>>>>specific requirement that users be able to control the names of 
>>>>>>files both on the local system and in VOSpace.  Note that for 
>>>>>>the user to access the local file system they required an 
>>>>>>account ID on the local file system.  They then stored a local 
>>>>>>file under their own account ID. They would add read permission 
>>>>>>for the VOSpace account ID to their local file to permit access 
>>>>>>by VOSpace.
>>>>>>
>>>>>>This separates authorization cleanly between the local storage 
>>>>>>system (which only checks for access by local account IDs) and 
>>>>>>the VOSpace shared collection (which authorizes all accesses to 
>>>>>>files owned by VOSpace).  This means that VOSpace is managing 
>>>>>>multiple levels of indirection:
>>>>>>- mapping from the global or logical file name space to the 
>>>>>>local repository name space
>>>>>>- mapping from an authenticated user through application of 
>>>>>>ACLs to decide whether the user can read a VOSpace owned file.
>>>>>>- mapping preferred location for accessing replicas (typically 
>>>>>>pick a file on the file system with the user's IP address, then 
>>>>>>any other file system, then a tape archive)
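
The three levels of indirection above could be combined roughly as follows. This is a sketch under assumed data structures (`Replica`, `rank`, `resolve`, plain dictionaries for the catalog and ACLs); a real VOSpace would keep this state in a catalog service.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    store: str        # hypothetical VOStore name
    local_path: str   # name in the local repository name space
    host: str         # address of the file system holding the copy
    kind: str         # "disk" or "tape"


def rank(replica, user_host):
    """Lower is better: same-host disk, then any other disk, then tape."""
    if replica.kind == "disk" and replica.host == user_host:
        return 0
    if replica.kind == "disk":
        return 1
    return 2


def resolve(logical_name, user, user_host, catalog, acls):
    """Map a logical name to the preferred replica, after the ACL check."""
    if user not in acls.get(logical_name, set()):
        raise PermissionError(f"{user} may not read {logical_name}")
    return min(catalog[logical_name], key=lambda r: rank(r, user_host))


catalog = {
    "/vospace/alice/andromeda.fits": [
        Replica("tape1", "/hpss/0042", "10.0.0.9", "tape"),
        Replica("store2", "/d/m31.fits", "10.0.0.2", "disk"),
        Replica("store1", "/data/alice/m31.fits", "10.0.0.1", "disk"),
    ]
}
acls = {"/vospace/alice/andromeda.fits": {"alice", "bob"}}

chosen = resolve("/vospace/alice/andromeda.fits", "bob", "10.0.0.1", catalog, acls)
print(chosen.store, chosen.local_path)
```

The local storage system never sees the user identity here; it only sees the replica path that VOSpace asks for.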
>>>>>>
>>>>>>For completeness, VOStore may need an operation that sets 
>>>>>>access permission for VOSpace, when VOStore is run under the 
>>>>>>local user account ID.
>>>>>>
>>>>>>
>>>>>>Reagan Moore
>>>>>>
>>>>>>>
>>>>>>>I think that most of what is VOStore and what is VOSpace is 
>>>>>>>clear; however, the two grey areas are access control 
>>>>>>>(authorization) and identifiers and this stems from the use 
>>>>>>>case where the user wants direct access to a VOStore (e.g. a 
>>>>>>>local store) and does not want to go through the VOSpace 
>>>>>>>layer. Here are my suggestions for handling these areas:
>>>>>>>
>>>>>>>Access control:
>>>>>>>-------------------
>>>>>>>
>>>>>>>A VOStore can run in two modes: authorized and unauthorized. 
>>>>>>>An unauthorized VOStore is semantically equivalent to an 
>>>>>>>anonymous ftp site: any authenticated user (we still maintain 
>>>>>>>security) can put something in, move/rename it, get it and 
>>>>>>>delete it.
>>>>>>>An authorized VOStore will only allow the requested operation 
>>>>>>>if a valid authentication token is included in the request - 
>>>>>>>all the VOStore has to do here is validate the authentication 
>>>>>>>token. The generation of the authentication token is handled 
>>>>>>>by VOSpace: it makes sure that the authenticated user has 
>>>>>>>permission to do what they are requesting and if so, places a 
>>>>>>>valid token in the request down to the VOStore.
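
One way to picture the authorized mode is a token that VOSpace computes and VOStore merely checks. The sketch below assumes an HMAC over the request fields with a key shared between the two services; the token format, the key, and the function names are all assumptions, since the proposal above does not say how the token is built.

```python
import hashlib
import hmac

# Assumption: VOSpace and VOStore share this secret out of band.
SHARED_KEY = b"example-key-known-to-vospace-and-vostore"


def issue_token(user, operation, path, key=SHARED_KEY):
    """VOSpace side: called only after VOSpace has checked the user's permission."""
    message = "\n".join([user, operation, path]).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def validate_token(token, user, operation, path, key=SHARED_KEY):
    """VOStore side: recompute the token and compare in constant time."""
    expected = issue_token(user, operation, path, key)
    return hmac.compare_digest(token, expected)


token = issue_token("alice", "get", "/data/alice/m31.fits")
print(validate_token(token, "alice", "get", "/data/alice/m31.fits"))
print(validate_token(token, "alice", "delete", "/data/alice/m31.fits"))
```

VOStore keeps no user list in this arrangement. A usable scheme would also need an expiry time in the signed fields, which is omitted here.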
>>>>>>>
>>>>>>>Identifiers:
>>>>>>>--------------
>>>>>>>
>>>>>>>The protocol identifier ivo:// identifies a resource that 
>>>>>>>exists in the VO. It does not promise that you can completely 
>>>>>>>resolve a URI beginning ivo:// in a registry, merely that some 
>>>>>>>component of the URI will relate to a resource that has a 
>>>>>>>registry entry, i.e. the bit before the first # can be 
>>>>>>>resolved in a registry. So I can go to a registry and find out 
>>>>>>>where ivo://nvo.caltech/vostores/vostore1 is
>>>>>>>but I need to go to VOStore interface for this store to 
>>>>>>>resolve ivo://nvo.caltech/vostores/vostore1#halibut3. I do not 
>>>>>>>see why we need to introduce a second protocol just for 
>>>>>>>VOStore contents.
>>>>>>>
>>>>>>>Now resolution of individual VOStore identifiers has to be 
>>>>>>>done at the VOStore level; however, VOSpace gives you the 
>>>>>>>ability to set up a single logical identifier for multiple 
>>>>>>>copies of the same resource so here we might want a separate 
>>>>>>>protocol: vos and resolution of this identifier has to be done 
>>>>>>>at the VOSpace level since VOSpace manages multiple VOStores.
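
The two-step resolution described above can be sketched by splitting the identifier at the first `#`. The function name `plan_resolution` and the returned dictionary layout are hypothetical, and the `vos://` identifier in the usage lines is an invented example of the proposed second scheme.

```python
from urllib.parse import urlsplit


def plan_resolution(identifier):
    """Say which service resolves which part of an identifier."""
    parts = urlsplit(identifier)
    if parts.scheme == "ivo":
        # The part before the first '#' has a registry entry; the fragment,
        # if any, is resolved by the VOStore that the registry points to.
        registry_part = f"{parts.scheme}://{parts.netloc}{parts.path}"
        return {
            "registry_lookup": registry_part,
            "vostore_lookup": parts.fragment or None,
            "vospace_lookup": None,
        }
    if parts.scheme == "vos":
        # A logical identifier that may have several copies: VOSpace resolves it.
        return {
            "registry_lookup": None,
            "vostore_lookup": None,
            "vospace_lookup": identifier,
        }
    raise ValueError(f"unsupported scheme: {parts.scheme!r}")


print(plan_resolution("ivo://nvo.caltech/vostores/vostore1#halibut3"))
print(plan_resolution("ivo://nvo.caltech/vostores/vostore1"))
print(plan_resolution("vos://nvo.caltech/halibut3"))
```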
>>>>>>>
>>>>>>>    Cheers,
>>>>>>>
>>>>>>>    Matthew
>>>>>>>
>>>>>>>
>>>>>>>Paul Harrison wrote:
>>>>>>>
>>>>>>>>Reagan Moore wrote:
>>>>>>>>
>>>>>>>>>The differentiation between the VOStore and VOSpace 
>>>>>>>>>interfaces is becoming unclear.  The latest draft implies 
>>>>>>>>>that properties that were originally associated with VOSpace 
>>>>>>>>>would now be supported by VOStore.
>>>>>>>>>
>>>>>>>>
>>>>>>>>I have to say that I agree that there seems to be some 
>>>>>>>>confusion in this area - with hindsight it was probably a 
>>>>>>>>mistake to defer the specification of VOSpace and work on 
>>>>>>>>VOStore alone as the "easier" problem - the specifications 
>>>>>>>>should be worked in tandem to see where it is most 
>>>>>>>>appropriate to place roles and responsibilities for 
>>>>>>>>particular use cases, so that a "global" solution is arrived 
>>>>>>>>at.
>>>>>>>>
>>>>>>>>I thought that the original separation into VOStore and 
>>>>>>>>VOSpace was done so that VOStore could be an essentially 
>>>>>>>>"dumb" BLOB repository that did what it was told by the 
>>>>>>>>VOSpace layer when it comes to issues of file permissions and 
>>>>>>>>hierarchical file names. However, because no VOSpace 
>>>>>>>>specification was created, these more advanced features have 
>>>>>>>>crept into the VOStore layer.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>Let's look at the current VOStore and VOSpace proposal:
>>>>>>>>>
>>>>>>>>>VOStore                                 VOSpace
>>>>>>>>>Storage of objects                      management of virtual file system
>>>>>>>>>data stored under unspecified ID?
>>>>>>>>>no user home directory                  User home directory
>>>>>>>>>directory hierarchy                     Directory hierarchy
>>>>>>>>>Unique file name within storage         User-defined file names
>>>>>>>>>                                        Mapping VOSpace name to VOStore name
>>>>>>>>>                                        List files for user
>>>>>>>>>Restrict access by user identity?
>>>>>>>>>Identify files with URIs
>>>>>>>>>Access controls on local file name      Access controls on VOSpace name
>>>>>>>>>
>>>>>>>>>This characterization mixes name space, mixes access 
>>>>>>>>>controls, does not provide consistent identity, does not 
>>>>>>>>>allow consistent management.  For instance, if a URI is 
>>>>>>>>>being provided for file identity within the VOStore 
>>>>>>>>>interface, then there is no need for user-specified names 
>>>>>>>>>within VOStore.  A second issue is the assumption that file 
>>>>>>>>>access can be restricted by user identity. This means that 
>>>>>>>>>the VOStore must manage the owner and the access controls 
>>>>>>>>>for each file. File systems usually do this by 
>>>>>>>>>creating accounts for each user name and applying Unix 
>>>>>>>>>permissions.  Is this capability to be provided now by both 
>>>>>>>>>VOSpace and VOStore?  We need a cleaner separation of 
>>>>>>>>>capabilities.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>This security aspect is crucial - it is clear that the owners 
>>>>>>>>of VOStores would not want to be managing user identity lists 
>>>>>>>>of all the VObs users at their stores - the fine grained 
>>>>>>>>access controls should be at the VOSpace level. If VOStores 
>>>>>>>>only respond to requests from trusted VOSpace services then 
>>>>>>>>this is possible, but I think that the perceived requirement 
>>>>>>>>for more detailed access control in the VOStore layer has 
>>>>>>>>come about because prototype end-user applications have 
>>>>>>>>appeared that talk directly to the VOStore layer - of course, 
>>>>>>>>it is not surprising that this has happened because there was 
>>>>>>>>no VOSpace definition for the end user applications to talk 
>>>>>>>>to.
>>>>>>>>
>>>>>>>>How file/BLOB identity is managed is also crucial to 
>>>>>>>>producing a system that offers more than ftp. I thought that 
>>>>>>>>one of the fundamental driving use cases for a VOSpace was 
>>>>>>>>that the same BLOB could potentially live on several 
>>>>>>>>VOStores, and that when specifying a resource in VOSpace, in 
>>>>>>>>a workflow for instance, the resource could be retrieved from 
>>>>>>>>the VOStore that was "closest" on the network to where the 
>>>>>>>>resource would be consumed. This sort of use case does 
>>>>>>>>require some careful thought about the allocation and 
>>>>>>>>management of identifiers, and I think probably means that 
>>>>>>>>the VOStore will have to be aware of the VOSpace identifier.
>>>>>>>>
>>>>>>>>I also have an issue with reusing ivo: as the protocol part 
>>>>>>>>for the URI of an identifier in this system - ivo: is already 
>>>>>>>>well defined and used as the identifier for registry entries, 
>>>>>>>>and the "protocol" for accessing the entity associated with 
>>>>>>>>the identifier is defined in the registry interface standard. 
>>>>>>>>This means that given an identifier of the form 
>>>>>>>>ivo://authority.org/something#blah a software agent (or human 
>>>>>>>>for that matter) cannot tell by inspection whether the 
>>>>>>>>identifier refers to a file in VOSpace or is simply a 
>>>>>>>>reference to a registry entry (e.g. for a SkyNode) - this 
>>>>>>>>leads to software having to be more complex in order 
>>>>>>>>constantly to test for the different possibilities. I think 
>>>>>>>>that it would be better to have a URI with a different 
>>>>>>>>protocol part, vos: for instance, it would then be 
>>>>>>>>immediately apparent that the VOSpace protocol should be used 
>>>>>>>>to access the entity referred to by the identifier.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>Let's look at the Storage Resource Broker data grid 
>>>>>>>>>separation of local storage management from the virtual file 
>>>>>>>>>system management:
>>>>>>>>>
>>>>>>>>>Local storage system                        SRB name space
>>>>>>>>>Storage of objects                          management of virtual file system
>>>>>>>>>data stored under SRB ID
>>>>>>>>>no user home directory                      User home directory
>>>>>>>>>directory indirection structure             Directory hierarchy
>>>>>>>>>Unique file name within storage             User-defined file names
>>>>>>>>>                                            Mapping SRB name to local file name
>>>>>>>>>                                            List files for user
>>>>>>>>>Access through SRB ID, controlled by SRB
>>>>>>>>>                                            Identify files by URIs
>>>>>>>>>                                            Access controls on SRB name
>>>>>>>>>
>>>>>>>>
>>>>>>>>I think that, as Reagan points out, the separation of 
>>>>>>>>responsibilities that SRB has with the local storage system 
>>>>>>>>is pretty much the right model for VOSpace and VOStore - 
>>>>>>>>though it means that SRB is pretty much at VOSpace level 
>>>>>>>>rather than a VOStore as is suggested in the current VOSpace 
>>>>>>>>definition document.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>Hi,
>>>>>
>>>>>If you also allow the possibility that the local storage 
>>>>>repository can run in an unauthorized (anonymous access) manner 
>>>>>then this is exactly what Guy and I were suggesting. Does that 
>>>>>mean that we actually all agree on this :-)
>>>>>
>>>>>    Cheers,
>>>>>
>>>>>    Matthew


