VOStore interface

Matthew Graham mjg at cacr.caltech.edu
Fri Aug 5 17:34:11 PDT 2005


Hi,

I would argue that this is an implementation issue: you have to make 
sure that VOStore can fulfil what it promises.

The required functionality for authentication is just that the VOStore 
can recognise a valid message, e.g. the certificate used to sign the 
SOAP message has the NVO CA in its certificate chain.
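
As a rough illustration of that requirement, here is a minimal sketch. The `Cert` record, the `chain_has_trusted_ca` helper and the "NVO CA" name string are all hypothetical, and the check only compares names: a real service would verify the signatures with a proper X.509 library.

```python
from dataclasses import dataclass


# Hypothetical, simplified certificate record: names only, no cryptography.
@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str


TRUSTED_CAS = {"NVO CA"}  # assumed CA name, for illustration only


def chain_has_trusted_ca(chain, trusted=TRUSTED_CAS):
    """Return True if the chain links up and reaches a trusted CA.

    `chain` is ordered from the signer's certificate up to the topmost
    certificate that was sent with the message.
    """
    if not chain:
        return False
    # Each certificate must have been issued by the next one in the chain.
    for child, parent in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            return False
    # The trusted CA may be included in the chain, or be the issuer of
    # the last certificate in it.
    in_chain = any(cert.subject in trusted for cert in chain)
    return in_chain or chain[-1].issuer in trusted
```

The point of the sketch is that the VOStore needs no per-user list for this step: it only has to know which CAs it trusts.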

    Cheers,

    Matthew

Reagan Moore wrote:

> Matthew:
> This is where we differ in the specification.
>
> Which account does VOStore run under on the local storage system?
>
> Reagan
>
>> Hi,
>>
>> If the user is accessing the local storage system directly then they 
>> can do whatever they want. VOStore, however, is the presentation of 
>> that repository to the VO world and does not necessarily interface 
>> with a VOSpace layer: this means that the VOStore interface has to be 
>> capable of handling the VO authentication mechanism. The 
>> authorization story is as we seem to have agreed.
>>
>>    Cheers,
>>
>>    Matthew
>>
>>
>> Reagan Moore wrote:
>>
>>> Matthew:
>>>
>>> The expectation is that the VOStore interface does not need to do 
>>> either authentication or authorization.  If a person is working 
>>> directly with a local storage system, then they are accessing their 
>>> own personal data while running under their personal account ID. 
>>> They can execute the VOStore interface as a local application.
>>>
>>> If VOSpace is accessing the local storage system through VOStore, 
>>> then VOSpace authenticates its access to the local storage system to 
>>> read or write files under the VOSpace account ID.  Again VOStore is 
>>> just a local application that VOSpace executes.
>>>
>>> If the owner of data on the local storage repository chooses to make 
>>> a file world readable, then VOSpace would be able to access the file 
>>> through VOStore.
>>>
>>> Reagan
>>>
>>>
>>>> Reagan Moore wrote:
>>>>
>>>>> I would like to propose the following separation of identity and 
>>>>> access control management.  The issues appear to be how to 
>>>>> separate support for local files in a local storage repository 
>>>>> from the files that are registered into a shared collection that 
>>>>> spans multiple storage repositories.  An easy way to make the 
>>>>> differentiation is to identify the usage model for each type of 
>>>>> data management system.  I would like to learn whether this 
>>>>> approach would meet all of the IVOA requirements.
>>>>>
>>>>> Local storage repository:
>>>>>
>>>>> This is a storage system that is controlled by local 
>>>>> administrators who establish access accounts for the persons who 
>>>>> are allowed to use the system.
>>>>> The users can choose their own file names, manipulate the files 
>>>>> with the utilities that are available on the local storage, and 
>>>>> are authenticated by the local system.  If desired, a user could 
>>>>> log onto the local storage repository, and use a VO specific 
>>>>> interface such as VOStore to access their own personal data. Since 
>>>>> VOStore would be run under their account ID to access files that 
>>>>> they own, there is no additional required authentication. They 
>>>>> could also use other access mechanisms such as perl scripts, or 
>>>>> Unix shell commands, C library calls, whatever is supported on the 
>>>>> local storage repository.  These access mechanisms allow them to 
>>>>> access files that they own.
>>>>>
>>>>> A VOStore interface for this usage model would provide:
>>>>> - get file
>>>>> - put file
>>>>> - list files
>>>>> The only advantage is that if the VOStore interface were supported 
>>>>> on all local storage repositories, the user would have a standard 
>>>>> access mechanism.
>>>>>
>>>>> Shared collection - VOSpace:
>>>>>
>>>>> The purpose of the shared collection is to organize files across 
>>>>> multiple storage repositories, provide a way to register files 
>>>>> into the shared collection, establish access controls on the 
>>>>> shared data, provide standard services for manipulating the files 
>>>>> (Cone Search, SIAP, SSAP, Mosaic, ...), support replication, and 
>>>>> support selection of the closest copy of a file.
>>>>>
>>>>> The shared collection provides a global (or logical) name space 
>>>>> that can be organized in a directory structure independently of 
>>>>> the naming convention and path hierarchy employed at the local 
>>>>> storage systems. Thus the VOSpace system must manage the mapping 
>>>>> from the logical name space to the naming convention used in the 
>>>>> local storage system.
>>>>>
>>>>> An account ID is established under which the shared collection 
>>>>> (VOSpace) is able to deposit files in the local storage 
>>>>> repository. This means the shared collection owns the data that is 
>>>>> stored at the local storage repository.  In order to access the 
>>>>> data, a user would need to authenticate herself to the shared 
>>>>> collection, which in turn authenticates itself to the local 
>>>>> storage repository. Whether or not to allow the access is 
>>>>> controlled by ACLs managed by VOSpace.  This means that the 
>>>>> authentication mechanism used by VOSpace is completely independent 
>>>>> of the authentication mechanisms used by the local storage systems.
>>>>>
>>>>> In order to handle the fact that local storage systems use a 
>>>>> variety of authentication mechanisms (Unix password, PKI 
>>>>> certificates, Kerberos tickets, DCE credentials, ...) the 
>>>>> VOSpace implementation could use the Generic Security Service API 
>>>>> (GSSAPI) to handle the heterogeneity.  In addition, an arbitrary 
>>>>> authentication mechanism can be chosen for authenticating users to 
>>>>> VOSpace.
>>>>>
>>>>> If a VOStore interface is provided by the local storage 
>>>>> repository, then VOSpace would be able to invoke the VOStore 
>>>>> access mechanism (running under the VOSpace account ID).  Note 
>>>>> that in this model VOStore does no authentication.  All 
>>>>> authentication is controlled by a combination of the local storage 
>>>>> system and VOSpace.
>>>>>
>>>>> The types of operations that would be required by VOStore, however, 
>>>>> are more sophisticated.  They include:
>>>>> - get file
>>>>> - put file
>>>>> - list files
>>>>> - register an existing file into VOSpace, while mapping from the 
>>>>> local name to the VOSpace preferred name
>>>>> - register an existing directory structure into VOSpace, while 
>>>>> setting the VOSpace logical names and VOSpace directory structure 
>>>>> to be the same as the local directory structure
>>>>> - register an existing local file into VOSpace as a replica of an 
>>>>> existing VOSpace logical file.
>>>>>
>>>>> With the latter three commands, it is possible to meet the 
>>>>> specific requirement that users be able to control the names of 
>>>>> files both on the local system and in VOSpace.  Note that for the 
>>>>> user to access the local file system they need an account ID 
>>>>> on the local file system.  They then store a local file under 
>>>>> their own account ID, and add read permission for the 
>>>>> VOSpace account ID to their local file to permit access by VOSpace.
>>>>>
>>>>> This separates authorization cleanly between the local storage 
>>>>> system (which only checks for access by local account IDs) and the 
>>>>> VOSpace shared collection (which authorizes all accesses to files 
>>>>> owned by VOSpace).  This means that VOSpace is managing multiple 
>>>>> levels of indirection:
>>>>> - mapping from the global or logical file name space to the local 
>>>>> repository name space
>>>>> - mapping from an authenticated user through application of ACLs 
>>>>> to decide whether the user can read a VOSpace owned file.
>>>>> - mapping preferred location for accessing replicas (typically 
>>>>> pick a file on the file system with the user's IP address, then 
>>>>> any other file system, then a tape archive)
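
The three levels of indirection listed above can be sketched roughly as follows. This is an illustration only: the `Replica` and `LogicalFile` records, the `choose_replica` function and the in-memory `catalog` dictionary are hypothetical names, not part of any VOSpace draft.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    host: str          # address of the storage system holding this copy
    local_path: str    # name of the file in the local repository name space
    on_tape: bool = False


@dataclass
class LogicalFile:
    readers: set = field(default_factory=set)     # ACL: users allowed to read
    replicas: list = field(default_factory=list)  # physical copies


def choose_replica(catalog, logical_name, user, user_host):
    """Map a logical name to one physical copy for an authenticated user."""
    entry = catalog[logical_name]   # logical name -> local names
    if user not in entry.readers:   # ACL check done by VOSpace
        raise PermissionError(f"{user} may not read {logical_name}")

    def rank(replica):
        # Preference order: same host as the user, then any other
        # file system, then a tape archive.
        if replica.on_tape:
            return 2
        return 0 if replica.host == user_host else 1

    return min(entry.replicas, key=rank)
```

Note that the local storage system never sees the end user here: it only ever serves the file to the VOSpace account.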
>>>>>
>>>>> For completeness, VOStore may need an operation that sets access 
>>>>> permission for VOSpace, when VOStore is run under the local user 
>>>>> account ID.
>>>>>
>>>>>
>>>>> Reagan Moore
>>>>>
>>>>>>
>>>>>> I think that most of what is VOStore and what is VOSpace is 
>>>>>> clear; however, the two grey areas are access control 
>>>>>> (authorization) and identifiers and this stems from the use case 
>>>>>> where the user wants direct access to a VOStore (e.g. a local 
>>>>>> store) and does not want to go through the VOSpace layer. Here 
>>>>>> are my suggestions for handling these areas:
>>>>>>
>>>>>> Access control:
>>>>>> -------------------
>>>>>>
>>>>>> A VOStore can run in two modes: authorized and unauthorized. An 
>>>>>> unauthorized VOStore is semantically equivalent to an anonymous 
>>>>>> ftp site: any authenticated user (we still maintain security) can 
>>>>>> put something in, move/rename it, get it and delete it.
>>>>>> An authorized VOStore will only allow the requested operation if 
>>>>>> a valid authentication token is included in the request - all the 
>>>>>> VOStore has to do here is validate the authentication token. The 
>>>>>> generation of the authentication token is handled by VOSpace: it 
>>>>>> makes sure that the authenticated user has permission to do what 
>>>>>> they are requesting and if so, places a valid token in the 
>>>>>> request down to the VOStore.
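
One possible shape for the authorized mode is sketched below. The token format is not specified anywhere in this thread: the use of an HMAC over (user, operation, target) with a key shared between VOSpace and VOStore is purely an assumption for illustration, as are the names `issue_token`, `validate_token` and `SHARED_KEY`.

```python
import hashlib
import hmac

# Assumption: VOSpace and VOStore share this key out of band.
SHARED_KEY = b"demo-key-known-to-vospace-and-vostore"


def issue_token(user, operation, target, key=SHARED_KEY):
    """VOSpace side: call only after VOSpace's own permission check passed."""
    message = "\n".join([user, operation, target]).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def validate_token(token, user, operation, target, key=SHARED_KEY):
    """VOStore side: recompute the token and compare; no ACL lookup."""
    expected = issue_token(user, operation, target, key)
    return hmac.compare_digest(token, expected)
```

The VOStore only validates: a token issued for one operation does not validate for another, so the decision about who may do what stays entirely with VOSpace.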
>>>>>>
>>>>>> Identifiers:
>>>>>> --------------
>>>>>>
>>>>>> The protocol identifier ivo:// identifies a resource that exists 
>>>>>> in the VO. It does not promise that you can completely resolve a 
>>>>>> URI beginning ivo:// in a registry, merely that some component of 
>>>>>> the URI will relate to a resource that has a registry entry, i.e. 
>>>>>> the bit before the first # can be resolved in a registry. So I 
>>>>>> can go to a registry and find out where 
>>>>>> ivo://nvo.caltech/vostores/vostore1 is, but I need to go to the 
>>>>>> VOStore interface for this store to resolve 
>>>>>> ivo://nvo.caltech/vostores/vostore1#halibut3. I do not see why we 
>>>>>> need to introduce a second protocol just for VOStore contents.
>>>>>>
>>>>>> Now resolution of individual VOStore identifiers has to be done 
>>>>>> at the VOStore level; however, VOSpace gives you the ability to 
>>>>>> set up a single logical identifier for multiple copies of the 
>>>>>> same resource, so here we might want a separate protocol, vos:. 
>>>>>> Resolution of this identifier has to be done at the VOSpace level 
>>>>>> since VOSpace manages multiple VOStores.
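
The resolution rule described above can be sketched as follows. The function names are hypothetical, and the `vos://` form is only the suggestion made in this thread, not a defined scheme.

```python
def split_identifier(uri):
    """Split a VO identifier into (scheme, registry part, fragment).

    The registry part is everything before the first '#'; the fragment
    is None when the URI has no '#'.
    """
    scheme, sep, _ = uri.partition("://")
    if not sep:
        raise ValueError(f"not a scheme://... identifier: {uri!r}")
    resource, hash_sep, fragment = uri.partition("#")
    return scheme, resource, (fragment if hash_sep else None)


def resolution_level(uri):
    """Say which layer has to resolve the identifier."""
    scheme, _, fragment = split_identifier(uri)
    if scheme == "vos":
        return "VOSpace"   # logical name, possibly with several copies
    if scheme == "ivo":
        # Without a fragment the registry can resolve it; with one, the
        # registry finds the store and the store resolves the rest.
        return "registry" if fragment is None else "VOStore"
    raise ValueError(f"unknown scheme: {scheme!r}")
```

For the example above, `split_identifier` separates `ivo://nvo.caltech/vostores/vostore1` (looked up in a registry) from `halibut3` (resolved by that store).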
>>>>>>
>>>>>>    Cheers,
>>>>>>
>>>>>>    Matthew
>>>>>>
>>>>>>
>>>>>> Paul Harrison wrote:
>>>>>>
>>>>>>> Reagan Moore wrote:
>>>>>>>
>>>>>>>> The differentiation between the VOStore and VOSpace interfaces 
>>>>>>>> is becoming unclear.  The latest draft implies that properties 
>>>>>>>> that were originally associated with VOSpace would now be 
>>>>>>>> supported by VOStore.
>>>>>>>>
>>>>>>>
>>>>>>> I have to say that I agree that there seems to be some confusion 
>>>>>>> in this area - with hindsight it was probably a mistake to defer 
>>>>>>> the specification of VOSpace and work on VOStore alone as the 
>>>>>>> "easier" problem - the specifications should be worked in tandem 
>>>>>>> to see where it is most appropriate to place roles and 
>>>>>>> responsibilities for particular use cases, so that a "global" 
>>>>>>> solution is arrived at.
>>>>>>>
>>>>>>> I thought that the original separation into VOStore and VOSpace 
>>>>>>> was done so that VOStore could be an essentially "dumb" BLOB 
>>>>>>> repository that did what it was told by the VOSpace layer when 
>>>>>>> it comes to issues of file permissions and hierarchical file 
>>>>>>> names. However, because no VOSpace specification was created, 
>>>>>>> these more advanced features have crept into the VOStore layer.
>>>>>>>
>>>>>>>>
>>>>>>>> Let's look at the current VOStore and VOSpace proposal:
>>>>>>>>
>>>>>>>> VOStore                             VOSpace
>>>>>>>> Storage of objects                  Management of virtual file system
>>>>>>>> Data stored under unspecified ID?
>>>>>>>> No user home directory              User home directory
>>>>>>>> Directory hierarchy                 Directory hierarchy
>>>>>>>> Unique file name within storage     User-defined file names
>>>>>>>>                                     Mapping VOSpace name to VOStore name
>>>>>>>>                                     List files for user
>>>>>>>> Restrict access by user identity?
>>>>>>>> Identify files with URIs
>>>>>>>> Access controls on local file name  Access controls on VOSpace name
>>>>>>>>
>>>>>>>> This characterization mixes name space, mixes access controls, 
>>>>>>>> does not provide consistent identity, does not allow consistent 
>>>>>>>> management.  For instance, if a URI is being provided for file 
>>>>>>>> identity within the VOStore interface, then there is no need 
>>>>>>>> for user-specified names within VOStore.  A second issue is the 
>>>>>>>> assumption that file access can be restricted by user identity. 
>>>>>>>> This means that the VOStore must manage the owner for each 
>>>>>>>> file, access controls for each file. File systems usually do 
>>>>>>>> this by creating accounts for each user name and applying Unix 
>>>>>>>> permissions.  Is this capability to be provided now by both 
>>>>>>>> VOSpace and VOStore?  We need a cleaner separation of 
>>>>>>>> capabilities.
>>>>>>>
>>>>>>> This security aspect is crucial - it is clear that the owners of 
>>>>>>> VOStores would not want to be managing user identity lists of 
>>>>>>> all the VObs users at their stores - the fine grained access 
>>>>>>> controls should be at the VOSpace level. If VOStores only 
>>>>>>> respond to requests from trusted VOSpace services then this is 
>>>>>>> possible, but I think that the perceived requirement for more 
>>>>>>> detailed access control in the VOStore layer has come about 
>>>>>>> because prototype end-user applications have appeared that talk 
>>>>>>> directly to the VOStore layer - of course, it is not surprising 
>>>>>>> that this has happened because there was no VOSpace definition 
>>>>>>> for the end user applications to talk to.
>>>>>>>
>>>>>>> How file/BLOB identity is managed is also crucial to producing a 
>>>>>>> system that offers more than ftp. I thought that one of the 
>>>>>>> fundamental driving use cases for a VOSpace was that the same 
>>>>>>> BLOB could potentially live on several VOStores, and that when 
>>>>>>> specifying a resource in VOSpace, in a workflow for instance, 
>>>>>>> the resource could be retrieved from the VOStore that was 
>>>>>>> "closest" on the network to where the resource would be 
>>>>>>> consumed. This sort of use case does require some careful 
>>>>>>> thought about the allocation and management of identifiers, and 
>>>>>>> I think probably means that the VOStore will have to be aware of 
>>>>>>> the VOSpace identifier.
>>>>>>>
>>>>>>> I also have an issue with reusing ivo: as the protocol part for 
>>>>>>> the URI of an identifier in this system - ivo: is already well 
>>>>>>> defined and used as the identifier for registry entries, and the 
>>>>>>> "protocol" for accessing the entity associated with the 
>>>>>>> identifier is defined in the registry interface standard. This 
>>>>>>> means that given an identifier of the form 
>>>>>>> ivo://authority.org/something#blah a software agent (or human 
>>>>>>> for that matter) cannot tell by inspection whether the 
>>>>>>> identifier refers to a file in VOSpace or is simply a reference 
>>>>>>> to a registry entry (e.g. for a SkyNode) - this leads to 
>>>>>>> software having to be more complex in order constantly to test 
>>>>>>> for the different possibilities. I think that it would be better 
>>>>>>> to have a URI with a different protocol part, vos: for instance, 
>>>>>>> it would then be immediately apparent that the VOSpace protocol 
>>>>>>> should be used to access the entity referred to by the identifier.
>>>>>>>
>>>>>>>>
>>>>>>>> Let's look at the Storage Resource Broker data grid separation 
>>>>>>>> of local storage management from the virtual file system 
>>>>>>>> management:
>>>>>>>>
>>>>>>>> Local storage system                SRB name space
>>>>>>>> Storage of objects                  Management of virtual file system
>>>>>>>> Data stored under SRB ID
>>>>>>>> No user home directory              User home directory
>>>>>>>> Directory indirection structure     Directory hierarchy
>>>>>>>> Unique file name within storage     User-defined file names
>>>>>>>>                                     Mapping SRB name to local file name
>>>>>>>>                                     List files for user
>>>>>>>> Access through SRB ID, controlled by SRB
>>>>>>>>                                     Identify files by URIs
>>>>>>>>                                     Access controls on SRB name
>>>>>>>>
>>>>>>>
>>>>>>> I think that, as Reagan points out, the separation of 
>>>>>>> responsibilities that SRB has with the local storage system is 
>>>>>>> pretty much the right model for VOSpace and VOStore - though it 
>>>>>>> means that SRB is pretty much at the VOSpace level rather than a 
>>>>>>> VOStore as is suggested in the current VOSpace definition document.
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>> Hi,
>>>>
>>>> If you also allow the possibility that the local storage repository 
>>>> can run in an unauthorized (anonymous access) manner then this is 
>>>> exactly what Guy and I were suggesting. Does that mean that we 
>>>> actually all agree on this :-)
>>>>
>>>>    Cheers,
>>>>
>>>>    Matthew
>>>
>


