From andre.schaaff at astro.unistra.fr Wed Oct 12 08:38:05 2011
From: andre.schaaff at astro.unistra.fr (Andre Schaaff)
Date: Wed, 12 Oct 2011 17:38:05 +0200
Subject: IVOA Grid and Web Services sessions
In-Reply-To: <4E931845.7090608@astro.unistra.fr>
References: <4E931845.7090608@astro.unistra.fr>
Message-ID: <4E95B45D.1010908@astro.unistra.fr>

Hello,

It seems that my mail was not distributed on the grid list last Monday.

Regards,

André

-------- Original Message --------
Subject: IVOA Grid and Web Services sessions
Date: Mon, 10 Oct 2011 18:07:33 +0200
From: Andre Schaaff
Reply-To: andre.schaaff at astro.unistra.fr
Organization: CDS
To: grid at ivoa.net
CC: Andreas Wicenec

Hello,

If you are attending the Pune interop meeting and wish to contribute to
the GWS sessions, please let me know the title and the time you need.
We have 2 sessions called VOSpace and UWS, but it is completely open.
So feel free to propose a presentation outside these two topics.

Andre

--
Andre Schaaff
Observatoire astronomique
11, rue de l'Universite
F-67000 Strasbourg

From pierre.lesidaner at obspm.fr Wed Oct 12 09:28:17 2011
From: pierre.lesidaner at obspm.fr (Pierre Le Sidaner)
Date: Wed, 12 Oct 2011 18:28:17 +0200
Subject: IVOA Grid and Web Services sessions
In-Reply-To: <4E95B45D.1010908@astro.unistra.fr>
References: <4E931845.7090608@astro.unistra.fr> <4E95B45D.1010908@astro.unistra.fr>
Message-ID: <4E95C021.5020004@obspm.fr>

On 12/10/2011 17:38, Andre Schaaff wrote:
> Hello,
>
> It seems that my mail has not been diffused on the grid list last monday.
>
> regards
>
> André
>
> -------- Original Message --------
> Subject: IVOA Grid and Web Services sessions
> Date: Mon, 10 Oct 2011 18:07:33 +0200
> From: Andre Schaaff
> Reply-To: andre.schaaff at astro.unistra.fr
> Organization: CDS
> To: grid at ivoa.net
> CC: Andreas Wicenec
>
> Hello,
>
> If you are attending the Pune interop meeting and wish to contribute to
> the GWS sessions, please let me know the title and the time you need.
> We have 2 sessions called VOSpace and UWS, but it is completely open.
> So feel free to propose a presentation outside these two topics.
>
>
> Andre

Hi Andre

We (Jonathan and I) want to make a contribution:

Proposition of UWS 1.1

We hope to have finished the document by tomorrow and to send it to the
mailing list.
Sorry for being so late, but it takes a lot of time for the writers to
converge on all the points.
We hope that the time spent will be saved in discussions later.

Regards
Pierre
>
>

--
-------------------------------------------------------------------------
Pierre Le Sidaner
Observatoire de Paris
Division Informatique de l'Observatoire
Observatoire Virtuel
01 40 51 20 89
61, avenue de l'Observatoire 75014 Paris
mailto:pierre.lesidaner at obspm.fr
http://vo-web.obspm.fr
--------------------------------------------------------------------------

From pierre.lesidaner at obspm.fr Thu Oct 13 04:20:51 2011
From: pierre.lesidaner at obspm.fr (Pierre Le Sidaner)
Date: Thu, 13 Oct 2011 13:20:51 +0200
Subject: Modification Of UWS 1 to 1.1
In-Reply-To: <4E96AC8D.1040003@obspm.fr>
References: <4E96AC8D.1040003@obspm.fr>
Message-ID: <4E96C993.2000602@obspm.fr>

| Hi all
|
| As we discussed in Napoli, we have made the proposed evolution
| of the document concerning the REST messages.
| We have tried to simplify the messages
| and to give all the HTTP response codes for a message.
| We only present the modification of paragraph 2 in the document, and
| we have also provided the UML schemas to explain the resources and
| sequences.
As we have not discussed this point with the group, we do not
| promote an XML schema for 1.1.
|
| What are the main differences from 1.0:
| real simplification using REST principles, which will not allow multiple
| interpretations of a command
| creation and starting of a job happen in one phase
| parameters are included in the job-starting phase
| We hope that this proposal will be much easier to implement,
| on both the server and the client side. It has taken us a lot of time and
| exchanges with the French agency CNES, who made the first client, to
| write this simplified sequence.
| We have removed some messages that are useless from our point of view, like
| abort from the user, because it does the same thing as delete, since you
| cannot retrieve the result, as explained in version 1.0.
| We have removed the pending phase as described before. A job can be in the
| suspended phase, but that is only a server action.
| We have added the possibility to upload a file and not only to give a URL
|
| What has to be discussed:
| pagination for long job lists. We can propose a standard way
| authentication. We propose to adopt the RFC standard already existing
| in HTTP with token
| WADL as a service description that can leave open any JDL as an XML
| description of parameters inside. This is important for building clients,
| and allows a simple service to be described easily or a complex XML
| model to be used for theory services.
|
| We hope that you will have many useful comments on the text. As the resource
| schema is quite big, we send you a JPG in addition to the PDF.
|
| Regards
| Jonathan, Jean-Christophe and Pierre
-------------- next part --------------
A non-text attachment was scrubbed...
Name: uws-v1.1.pdf
Type: application/pdf
Size: 223776 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: uws_resources.png
Type: image/png
Size: 109519 bytes
Desc: not available
URL:

From skoda at sunstel.asu.cas.cz Thu Oct 13 05:52:17 2011
From: skoda at sunstel.asu.cas.cz (Petr Skoda)
Date: Thu, 13 Oct 2011 14:52:17 +0200 (CEST)
Subject: Modification Of UWS 1 to 1.1
In-Reply-To: <4E96C993.2000602@obspm.fr>
References: <4E96AC8D.1040003@obspm.fr> <4E96C993.2000602@obspm.fr>
Message-ID:

Hi Pierre and others,

As you remember, we tried to use UWS for setting up VO-KOREL, which is a
"cloud-like" service for running one particular FORTRAN program (korel) in a
user-friendly environment - thus requiring the web browser to interact with
the service (including job control). It may, of course, be different from the
original requirements on UWS to provide asynchronous communication for TAP
queries ....

However, I think that in the future our approach may easily be followed,
providing astronomers with nice wrappers of their "boring" numerical code,
and so the "cloud" aspect will be more accentuated. So I will comment on
your changes from this point of view:

On Thu, 13 Oct 2011, Pierre Le Sidaner wrote:
> | real simplification using REST Principe that will not allow multiple |
> interpretation of a command
> | creation and starting a job is on one phase

NO - it is exactly where current UWS is handy:

We commonly have such a scenario - the user prepares several jobs in his
working space (there is some kind of quota imposed for each user) - i.e. he
uploads massive data sets and parameter sets for a number of experiments
(different spectral regions for disentangling, different sets of input
spectra etc ...). As he is aware he has a limited amount of memory and number
of processes, he has to decide which jobs to run in parallel. He may run them
and disconnect. Then he may use a mobile device to look at his job list to
see the results, and by changing some parameters he can rerun a job
immediately - here it means the creation of a new job and its run together.
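
The scenario above (prepare jobs ahead of time, start a chosen one later, or create and run in a single step) can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the phase names are those of UWS 1.0, but `JobList`, `create`, `set_phase` and `delete` are hypothetical names, the set of phases from which ABORT is accepted is a guess, and a real service would expose these operations over HTTP rather than as method calls.

```python
from dataclasses import dataclass, field

# Phase names as used in UWS 1.0; everything else here is illustrative.
PENDING, QUEUED, EXECUTING, ABORTED = "PENDING", "QUEUED", "EXECUTING", "ABORTED"


@dataclass
class Job:
    job_id: str
    parameters: dict
    phase: str = PENDING
    results: list = field(default_factory=list)


class JobList:
    """Hypothetical in-memory stand-in for a UWS job list."""

    def __init__(self):
        self._jobs = {}
        self._next_id = 1

    def create(self, parameters, run=False):
        """Create a job; it stays PENDING unless run=True is given."""
        job = Job(str(self._next_id), dict(parameters))
        self._next_id += 1
        self._jobs[job.job_id] = job
        if run:
            self.set_phase(job.job_id, "RUN")
        return job

    def set_phase(self, job_id, command):
        """Apply a PHASE=RUN or PHASE=ABORT style request to a job."""
        job = self._jobs[job_id]
        if command == "RUN" and job.phase == PENDING:
            job.phase = QUEUED
        # Assumption: ABORT is accepted from any not-yet-finished phase.
        elif command == "ABORT" and job.phase in (PENDING, QUEUED, EXECUTING):
            job.phase = ABORTED  # results gathered so far are kept
        else:
            raise ValueError(f"cannot apply {command} in phase {job.phase}")
        return job.phase

    def delete(self, job_id):
        """Destroy the job together with its parameters and results."""
        del self._jobs[job_id]

    def phases(self):
        """Return a mapping of job id to current phase."""
        return {job_id: job.phase for job_id, job in self._jobs.items()}
```

With this model, `create(params)` followed later by `set_phase(job_id, "RUN")` corresponds to the two-step use, while `create(params, run=True)` corresponds to creation and start in one step; an aborted job stays in the list with its results until `delete` is called.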
But he can as well look at the mobile job list and decide: OK, now I know the
method converges and I can run one large job that was prepared some time ago
and has thus been in the PENDING phase.

The same with ABORT and DELETE - he may know how long the typical run on a
given set should take. But if it is running too long, probably something is
wrong and he can manually ABORT that job. Or he might run several smaller
experiments and the long job in parallel, but it becomes clear that such a
parameter combination does not give good results, so he may ABORT that long
job.

The DELETE, as we understood the schema, we use for deleting the whole space
for the job - it means input data, parameter set and RESULTS (our system
produces results even if the job is not finished - e.g. the part of
convergence or divergence can be seen here, as well as stdout, giving a hint
e.g. about a parameter error).

> | We hope that this proposition will be much more easy to implement
> | both from server and client phase.

It is too short-sighted to see only the aspect of easiness while losing
important interaction and lowering the user's comfort.

> | We have removed some useless messages on our point of view like abort |
> from the user. Because it make the same thing as delete as you can
> | not retrieve the result as explain in 1.0 version.

I do not understand why you could not retrieve results after abort - and it
does not do the same:

in UWS 1.0 sec 2.2.3.6
"A job may be aborted by POSTing to the /{jobs}/(job-id)/phase URI. The POST
contains a single parameter PHASE=ABORT which instructs the UWS to attempt to
abort the job. Aborting a job has the effect of stopping a job executing,
but the resources associated with a job remain intact. "
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

sec 2.2.3.2:

" Deleting a Job

Sending a HTTP DELETE to a Job resource destroys that job, with the
                                        ^^^^^^^^^^^^^^^^
meaning noted in the definition of the Job object, above. "

and definition of the JOB:

"2.1.2.
Job

A Job object contains the state of one job. The state is a collection of
other objects. Each Job contains:

* Exactly one Execution Phase.
* Exactly one Execution Duration.
* Exactly one Deletion Time
* Exactly one Quote.
* Exactly one Results List.
* Exactly one Owner.
* Zero or one Run Identifier.
* Zero or one Error."
------------------

As we understand it - DELETING means removing all remnants of the job -
including the results.

> | We have remove pending phase as describe before. Job can be on
> | suspended phase, but it's only server action.

NO - suspend is only for the case when the processor is not available.
But pending means the job space is created (data and parameters uploaded,
item in the database structure created etc.), but the real application (e.g.
the number crunching code) is neither run nor deployed on the GRID by the
queue subsystem.

> | We have add possibility to upload file and not only to give URL

It is exactly how we do this in VO-KOREL - upload of several files (one or
two data files and parameter files); we can also upload all of this in one
tgz file prepared by the user (for automation of the CLOUD service).

> | | What have to be discuss :
> | pagination for long job list.

Especially the isolation of individual users, so that they see only their own
jobs,

> | | We hope that you have many useful comment on the text. As the resource |

As I said, we are probably doing something revolutionary with UWS and so our
interpretation may be wrong (I hope Paul and Dave will comment on their
primary ideas of UWS - but I liked the original design as very flexible).

What I also dislike in your proposal is the requirement for the server to
give an estimate of the runtime. In 99% of multiparametric optimizations
(even with genetic algorithms) you cannot predict the convergence. But you
may impose an initial limit just to prevent a closed loop or solution
oscillation. But the user has to decide himself how to change the limit.
That's the reason we allocate each user a limited number of processes and
amount of memory - but some queue priority would probably be better.

I feel that UWS (which is not a standard like a protocol or data format, but
a conceptual idea - a pattern for how to do things) should rather be expanded
to allow more flexibility and design freedom, even for yet unforeseen
purposes, than be restricted. Perhaps we might define some implementation
standard at the level of "protocol" for given purposes with all the MUST and
MAY - e.g. as requirements for pure machine-to-machine interaction (like
TAP).

I think the current TAP is well orthogonal, so e.g. you can combine for your
purpose the CreateJob+StartJob (phase transition from
Pending-RUN-Queued-Executing) as just two successive calls.

I am not able to attend Pune, but will follow the program nearly on-line, so
please put all the materials on the wiki ASAP after the presentation and
discussion in the UWS session.

Best regards,

Petr

BTW you may find the description of VO-KOREL here:
http://www.ta3.sk/IB2E/posters/F05.pdf

*************************************************************************
*  Petr Skoda                      Phone : +420-323-649201, ext. 361    *
*  Stellar Department                      +420-323-620361              *
*  Astronomical Institute AS CR    Fax   : +420-323-620250              *
*  251 65 Ondrejov                 e-mail: skoda at sunstel.asu.cas.cz   *
*  Czech Republic                                                       *
*************************************************************************

From paul.harrison at manchester.ac.uk Fri Oct 14 03:22:25 2011
From: paul.harrison at manchester.ac.uk (Paul Harrison)
Date: Fri, 14 Oct 2011 11:22:25 +0100
Subject: Modification Of UWS 1 to 1.1
In-Reply-To: <4E96C993.2000602@obspm.fr>
References: <4E96AC8D.1040003@obspm.fr> <4E96C993.2000602@obspm.fr>
Message-ID: <86344C63-7064-4C5A-960C-51E6751CFF4C@manchester.ac.uk>

Dear Pierre et. al.
First some general comments; I think that in several areas you are making proposals that would make UWS 1.0 services invalid - this cannot be done, from both a point of IVOA procedure (a point version must be backwards compatible) and from a strategic perspective (we do not want to make anything obsolete at this stage). In addition you seem to want to be narrowing the scope of UWS by suggesting removing features that you do not want to use but that others (e.g. see email from Petr) find useful. You have also presented the document as a rewrite of the original document, rather than as a list of suggested changes - this makes it rather difficult to see what changes you are actually proposing (much is just the same as UWS 1.0). In general UWS 1.1. should be only a clarification (and possible small extension) of UWS 1.0 - I have already made the edits to the document with the uncontroversial clarifications from the last interop. http://code.google.com/p/volute/source/diff?spec=svn1596&r=1499&format=side&path=/trunk/projects/grid/uws/doc/UWS.html&old_path=/trunk/projects/grid/uws/doc/UWS.html&old=1353 With regard to other of your specific points, I think that I have made a response in http://www.ivoa.net/pipermail/grid/2011-May/002503.html. I will not be attending the Interop in person, but hopefully will be able to at least monitor what is going on remotely if someone can have something like skype running. Regards, Paul. On 2011-10 -13, at 12:20, Pierre Le Sidaner wrote: > | Hi all > | | As we have discuss it in Napoli, we have made the proposed evolution > | of the document concerning the rest messages. > | We have try to simplify the messages > | to give all the HTTP code response for a message. > | We only present the modification of paragraph 2 in the document and > | we have provide also the UML schemas to explain the resources and > | sequences. As we have not discuss this point with the group we don't > | promote an xml schema for 1.1. 
> | | What are the main difference between 1.0 > | real simplification using REST Principe that will not allow multiple | interpretation of a command > | creation and starting a job is on one phase > | parameters are include in the starting job phase > | We hope that this proposition will be much more easy to implement > | both from server and client phase. It has take us a lot of time and > | exchange with french agency CNES who have made the first client to > | write this simplified sequence. > | We have removed some useless messages on our point of view like abort | from the user. Because it make the same thing as delete as you can > | not retrieve the result as explain in 1.0 version. > | We have remove pending phase as describe before. Job can be on > | suspended phase, but it's only server action. > | We have add possibility to upload file and not only to give URL > | | What have to be discuss : > | pagination for long job list. We can propose a standard way > | authentication We propose to adopt the RFC standard already existing > | in HTTP with token > | WADL as a service description that can leave open any JDL as an XML | description of parameters inside. This is important to build client > | and allow to describe easily simple service or to have a complex XML > | model for theory services. > | | We hope that you have many useful comment on the text. As the resource | schema is quite big, we push you a JPG in supplement to the PDF. > | | Regards > | Jonathan, Jean-Christophe and Pierre > > > Dr. Paul Harrison JBCA, Manchester University http://www.manchester.ac.uk/jodrellbank From andre.schaaff at astro.unistra.fr Fri Oct 14 05:49:26 2011 From: andre.schaaff at astro.unistra.fr (Andre Schaaff) Date: Fri, 14 Oct 2011 14:49:26 +0200 Subject: Last call : IVOA Grid and Web Services sessions Message-ID: <4E982FD6.4070908@astro.unistra.fr> Hello, If you wish to contribute to the GWS sessions, please let me know today the title and the time you need. 
As said in the previous call for contributions, we have 2 sessions called
VOSpace and UWS, but it is completely open.

Andre

From andre.schaaff at astro.unistra.fr Fri Oct 14 06:22:45 2011
From: andre.schaaff at astro.unistra.fr (Andre Schaaff)
Date: Fri, 14 Oct 2011 15:22:45 +0200
Subject: UWS discussion
Message-ID: <4E9837A5.6070701@astro.unistra.fr>

Following the last mails concerning UWS.

Concerning the second GWS session, mainly dedicated to UWS: as time always
runs fast during the session and as not all interested people are attending
the interop, I think that it would be helpful to continue the discussion by
mail (the session is on Thursday).

So I encourage Paul, Pierre, Petr and all the other people involved to
continue this discussion and to find a compromise on a first set of topics
before the session.

André

From pierre.lesidaner at obspm.fr Sun Oct 16 05:54:16 2011
From: pierre.lesidaner at obspm.fr (Pierre Le Sidaner)
Date: Sun, 16 Oct 2011 14:54:16 +0200
Subject: Modification Of UWS 1 to 1.1
In-Reply-To:
References: <4E96AC8D.1040003@obspm.fr> <4E96C993.2000602@obspm.fr>
Message-ID: <4E9AD3F8.1050100@obspm.fr>

Hi Petr

I will try to reply point by point.

UWS is the language for managing jobs remotely in an asynchronous way; to be
clear, we define how to interface with a UWS service. We want it to be
simple, just to simplify its implementation.

>
>
> Hi Pierre and others,
>
> As you remember we tried to use UWS for setting the VO-KOREL which is
> a "cloud-like" service for running the one particular FORTRAN program
> (korel) in a user-friendly environment - thus requiring the web
> browser to interact with the service (including job control). It may,
> of course be different from original requirements on UWS to provide
> asynchronous communication for TAP queries ....
>
> However, I think that in the future our approach may be easily
> followed providing the astronomers with nice wrappers of their "boring
> " numerical code and so the "cloud" aspect will be more accented. So I
> will commnet your changes from this point of view:
>
>
> On Thu, 13 Oct 2011, Pierre Le Sidaner wrote:
>> | real simplification using REST Principe that will not allow
>> multiple | interpretation of a command
>> | creation and starting a job is on one phase
>
> NO - it is exactly where current UWS is handy:
>
> We have common such a scenario - user prepares in his working space
> (there is some kind of quota imposed for each user) several jobs -
> i.e. he uploads massive data sets and parameter sets for number of
> experiments (different spectral regions for disentangling, different
> set o finput spectra etc ...). As he is aware he has limited number of
> memory and processes, he has to decide what jobs to run in parallel.
> He may run it and disconnnect. Then he may use mobile device to look
> in his job list to see the results and by changing some parameters can
> rerun it immediately - here it means the creation of new job and run
> together.

From my point of view, the management of which jobs have to be sent in
parallel, what RAM is available and what the processor speed is, is not the
user's problem. He has to send jobs. He can send many of them; he does not
have to know whether other users send jobs at the same time. This is an
infrastructure problem managed by the provider, who can have many kinds of
cluster, scheduler or batch queue.
So the user sends 100 jobs. They are placed in a queue. Usually the scheduler
sends these jobs to the available CPUs using its knowledge of the RAM, the
number of CPUs and the time reserved for execution. This is how it works on
every cluster.

>
> But he can as well look in mobile job list and decide OK now I know
> the methods converges and I can run one large job prepared for some
> time and thus being in PENDING phase.
Don't reinvent the scheduler: jobs are in a queue, and with your mobile you
can see the status of all your jobs if you have a web interface for that.

>
> Rhe same with ABORT and DELETE - he may know how long should the
> typical run on given set take. But if it is running too long probably
> something is wrong and he can manually ABORT that job. Or he might run
> several smaller experiments and the long job in parallel, but it is
> clear that such a parameter combination does not make good results so
> he may ABORT that long job.
>
> The DELETE as we understood the schema we use for deleting the whole
> space for the job - it means both input data, parameter set and
> RESULTS (our system produces results even if job is not finshed - e.g.
> the part of convergence or divergence can be seen here as well as
> stdout (giving hint e.g. for parameter error).
>
>

There is as yet no space dedicated to the user outside VOSpace. So input
parameters are destroyed after the job anyway. But you tell me that you want
intermediate results from a job you decided to stop. We have to think about
that and reintroduce abort if it is necessary.

>
>> | We hope that this proposition will be much more easy to implement
>> | both from server and client phase.
>
> It is too short-sighted to see only the aspect of easiness while
> loosing important interaction and lowering the user's comfort.

I really don't see the point. The user will not talk UWS; he will have an
interface and will play with it. We only describe the messages exchanged
between this interface and the server. We have to cover all the possible user
requirements in these exchanged messages. I don't see where the comfort is
lowered. But I am open to any extension if our proposal does not fulfil the
requirements. We are only trying to make the standard clearer by defining the
message sent back for every action, and by limiting the number of ways to do
the same action, in order to limit ambiguity, not comfort. Then
implementation will be easier and more efficient.
> >> | We have removed some useless messages on our point of view like >> abort | from the user. Because it make the same thing as delete as >> you can >> | not retrieve the result as explain in 1.0 version. > > I do not understand why you could not retrieve results after abort - > and it does not do the same: > > in UWS1.0 sec 2.2.3.6 > "A job may be aborted by POSTing to the /{jobs}/(job-id)/phase URI. > The POST contains a single parameter PHASE=ABORT which instructs the > UWS to attempt to abort the job. Aborting a job has the effect of > stopping a job executing, but the resources associated with a job > remain intact. " > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Ok as I say before, I understand this need and we have to face it, you are right > > > sec 2.2.3.2: > > " Deleting a Job > > Sending a HTTP DELETE to a Job resource destroys that job, with the > ^^^^^^^^^^^^^^^^ > meaning noted in the definition of the Job object, above. " > > and definition of the JOB: > > "2.1.2. Job > > A Job object contains the state of one job. The state is a collection > of other objects. Each Job contains: > > * > > Exactly one Execution Phase. > * > > Exactly one Execution Duration. > * > > Exactly one Deletion Time > * > > Exactly one Quote. > * > > Exactly one Results List. > * > > Exactly one Owner. > * > > Zero or one Run Identifier. > * > > Zero or one Error." > ------------------ > > As we understand it - the DELETING means removing all remnants of the > job - including the results. Yes > > > >> | We have remove pending phase as describe before. Job can be on >> | suspended phase, but it's only server action. > > NO - suspend it is only in case the processor is not available. > But pending means the job space is created (data and parameters > uploaded, item in database structure created etc.... But the real > application (e.g. number cruchning code is neither run nor deployed on > GRID by queue sub system. 
As I said before, we have to face the problem of intermediate results, but
not substitute ourselves for the job manager. Otherwise, when there are
multiple users and multiple CPUs, it becomes unmanageable. It is not the
purpose of UWS; it is the provider's internal business.

>
>
>> | We have add possibility to upload file and not only to give URL
>
> it is exactly how we do this in VO-KOREL - upload of several files
> (one one or two data files and parameter files, we can upload as well
> all of this in one tgz file prepared by the user (for automation of
> the CLOUD service"
>
>> | | What have to be discuss :
>> | pagination for long job list.
>
> Especially the isolation of individual users to see only their own jobs,

This can be done only if we identify the user. If we want to define the
identification mechanism, then we propose to use the HTTP authentication
mechanism, which is an RFC. But I agree it is the next step we have to face.

>
>> | | We hope that you have many useful comment on the text. As the
>> resource |
>
> As i said we are probably doing something revolutionary with UWS and
> so our interpretation may be wrong (I hope Paul and Dave will comment
> on their primary ideas of UWS - but I liked the original design as
> very flexible)
>
> Wht I dislike in your proposal as well is the requirement for server
> to give the estimate of runtime. In 99% of multiparametric
> optimization (even with genetic algorithms) you cannot predict the
> covergence. But you may impose initial limit just to prevent the
> closed loop or solution oscillation. But the user has to decide
> himself how to change the limit.
> Thats the reason we allocate its user limited amount of processed and
> memory - but some queue priority would be probably better.

I understand the difficulty of time prediction. But you know the code better
than the user. So you are able to make an initial estimate of the duration.
The queue priority depends on the resources you ask for.
You cannot have the same priority for a quick single-processor job and a
multi-processor job that uses all the memory of the machine and runs for
200 h. So if you say "my job is a 5 min job" and then "I change my mind, it
will use 200 h", it is not acceptable for the other users and it breaks the
whole mechanism of the job queue that decides the job scheduling.

>
>
> I feel the UWS (which is not a standard like protocol or data format,
> but a conceptual idea - pattern how to do things) should be rather
> expanded to emply more flexibility and design freedom even for yet
> unforeseen purposes, than to restrict it.

As you say, UWS is not like a Simple Access protocol. It is the definition of
a language for managing remote jobs. In UWS you ask a service how to query it
and retrieve results. From this information you are able to make a client
that sends jobs and retrieves results. But we did not give any recommendation
on the web interface. Our purpose is not to reduce functionality, but just to
remove ambiguity. If you can send a job in 3 different ways,
it means that people implementing the server side will have to face choices
that are usually not clear,
it means that the client must implement the 3 ways,
it means that the description of the service has to describe 3 ways.
I am talking only about sending jobs; we can also talk about parameters, job
status ...

>
> Perhaps we might define some implementation standard at the level of
> "protocol" for given purposes with all the MUST and MAY - e.g. as a
> requirements for pure machine-to-machine intearction (like is TAP).
>
> I think the current TAP is well ortogonal so e.g. you can combine for
> your purpose the CreateJOb+Startjob (phase transition from
> Pending-RUN-Queuded-Executing) as just two succesive calls .

I just say: give me a good reason to create a job and not send it. Why, if
you create a job, would you not want to send it?
And why are you not able to wait until you are ready, and then send the job
directly without going through this pending status? Then I will have to face
this new phase, and a new lifetime for jobs in the pending phase.

Anyway, thank you for your comments and proposals. I hope to read you soon.

Regards
Pierre

--
-------------------------------------------------------------------------
Pierre Le Sidaner
Observatoire de Paris
Division Informatique de l'Observatoire
Observatoire Virtuel
01 40 51 20 89
61, avenue de l'Observatoire 75014 Paris
mailto:pierre.lesidaner at obspm.fr
http://vo-web.obspm.fr
--------------------------------------------------------------------------

From pierre.lesidaner at obspm.fr Sun Oct 16 06:07:41 2011
From: pierre.lesidaner at obspm.fr (Pierre Le Sidaner)
Date: Sun, 16 Oct 2011 15:07:41 +0200
Subject: Modification Of UWS 1 to 1.1
In-Reply-To: <86344C63-7064-4C5A-960C-51E6751CFF4C@manchester.ac.uk>
References: <4E96AC8D.1040003@obspm.fr> <4E96C993.2000602@obspm.fr> <86344C63-7064-4C5A-960C-51E6751CFF4C@manchester.ac.uk>
Message-ID: <4E9AD71D.5010900@obspm.fr>

On 14/10/2011 12:22, Paul Harrison wrote:
> Dear Pierre et. al.
>
> First some general comments;
>
> I think that in several areas you are making proposals that would make UWS 1.0 services invalid - this cannot be done, from both a point of IVOA procedure (a point version must be backwards compatible) and from a strategic perspective (we do not want to make anything obsolete at this stage). In addition you seem to want to be narrowing the scope of UWS by suggesting removing features that you do not want to use but that others (e.g. see email from Petr) find useful.

We don't remove what was there. We just clarify and simplify UWS. It was at
the last interop that, to make things move on, we left 1.0 as it was and
modified things in 1.1.
I don't remove what I don't want to use. I just don't see the use of some
phases. And neither do the groups we have worked with.
So after implementing 1.0 we decide to remove what no proposed use case need. But it's a proposition to be discuss. As I answer to petr, we thought that is user can send sub-program or modify parameters instead of aborting job. But we didn't face convergence effect to be retreive. That's the interest of discussing things in large group. > You have also presented the document as a rewrite of the original document, rather than as a list of suggested changes - this makes it rather difficult to see what changes you are actually proposing (much is just the same as UWS 1.0). In general UWS 1.1. should be only a clarification (and possible small extension) of UWS 1.0 - I have already made the edits to the document with the uncontroversial clarifications from the last interop. > > http://code.google.com/p/volute/source/diff?spec=svn1596&r=1499&format=side&path=/trunk/projects/grid/uws/doc/UWS.html&old_path=/trunk/projects/grid/uws/doc/UWS.html&old=1353 > > With regard to other of your specific points, I think that I have made a response in http://www.ivoa.net/pipermail/grid/2011-May/002503.html. > > I will not be attending the Interop in person, but hopefully will be able to at least monitor what is going on remotely if someone can have something like skype running. I see with Andr?, but we will try for sure. For the document, we don't intent to rewrite all, we just have proposed paragraph 2 with the same HTML format. > Regards, > Paul. > > On 2011-10 -13, at 12:20, Pierre Le Sidaner wrote: > >> | Hi all >> | | As we have discuss it in Napoli, we have made the proposed evolution >> | of the document concerning the rest messages. >> | We have try to simplify the messages >> | to give all the HTTP code response for a message. >> | We only present the modification of paragraph 2 in the document and >> | we have provide also the UML schemas to explain the resources and >> | sequences. As we have not discuss this point with the group we don't >> | promote an xml schema for 1.1. 
>> |
>> | What are the main differences from 1.0:
>> | - a real simplification using REST principles, which will not allow
>> |   multiple interpretations of a command
>> | - creating and starting a job happen in one phase
>> | - parameters are included in the job start phase
>> | We hope that this proposal will be much easier to implement
>> | on both the server and the client side. It took us a lot of time, and
>> | many exchanges with the French agency CNES, who made the first client,
>> | to write this simplified sequence.
>> | We have removed some messages that are useless from our point of view,
>> | like abort from the user, because it does the same thing as delete: you
>> | cannot retrieve the result, as explained in the 1.0 version.
>> | We have removed the pending phase as described before. A job can be in
>> | the suspended phase, but only through a server action.
>> | We have added the possibility to upload a file and not only to give a URL.
>> |
>> | What has to be discussed:
>> | - pagination for long job lists. We can propose a standard way.
>> | - authentication. We propose to adopt the existing RFC standard
>> |   for HTTP, with a token.
>> | - WADL as a service description, which can leave open any JDL as an XML
>> |   description of the parameters inside. This is important for building
>> |   clients, and makes it possible to describe a simple service easily or
>> |   to have a complex XML model for theory services.
>> |
>> | We hope that you will have many useful comments on the text. As the
>> | resource schema is quite big, we are sending you a JPG in addition to
>> | the PDF.
>> |
>> | Regards
>> | Jonathan, Jean-Christophe and Pierre
>>
> Dr. Paul Harrison
> JBCA, Manchester University
> http://www.manchester.ac.uk/jodrellbank
>

--
-------------------------------------------------------------------------
Pierre Le Sidaner
Observatoire de Paris

Division Informatique de l'Observatoire
Observatoire Virtuel
01 40 51 20 89
61, avenue de l'Observatoire 75014 Paris
mailto:pierre.lesidaner at obspm.fr
http://vo-web.obspm.fr
--------------------------------------------------------------------------

From paul.harrison at manchester.ac.uk  Mon Oct 17 01:00:20 2011
From: paul.harrison at manchester.ac.uk (Paul Harrison)
Date: Mon, 17 Oct 2011 09:00:20 +0100
Subject: Modification Of UWS 1 to 1.1
In-Reply-To: <4E9AD71D.5010900@obspm.fr>
References: <4E96AC8D.1040003@obspm.fr> <4E96C993.2000602@obspm.fr> <86344C63-7064-4C5A-960C-51E6751CFF4C@manchester.ac.uk> <4E9AD71D.5010900@obspm.fr>
Message-ID: <78219593-B207-4296-B25F-D6A058D73FBD@manchester.ac.uk>

On 2011-10-16, at 14:07, Pierre Le Sidaner wrote:

> On 14/10/2011 12:22, Paul Harrison wrote:
>> Dear Pierre et al.
>>
>> First some general comments;
>>
>> I think that in several areas you are making proposals that would make UWS 1.0 services invalid - this cannot be done, from both a point of IVOA procedure (a point version must be backwards compatible) and from a strategic perspective (we do not want to make anything obsolete at this stage). In addition you seem to want to be narrowing the scope of UWS by suggesting removing features that you do not want to use but that others (e.g. see email from Petr) find useful.
> We are not removing what was there. We are just clarifying and simplifying UWS. At the last interop the idea was that, to keep things moving, we leave 1.0 as it was and modify things in 1.1.
> I am not removing what I do not want to use. I just do not see the use of some phases, and neither do the groups we have worked with. So after implementing 1.0 we decided to remove what no proposed use case needs. But it is a proposal to be discussed.
This is a central point - you might want to "simplify" how UWS works, but in a 1.1 version nothing that was true in the 1.0 version can be invalidated (see section 1.2 of http://www.ivoa.net/Documents/DocStd/20100413/REC-DocStd-1.2.pdf), so any changes would be in addition to old 1.0 behaviour of the UWS interface. I think that it is certainly true that the 1.1 version of the document could benefit from some more clarification of the purpose of several features of the 1.0 interface so that it is more obvious what must be implemented and what is an optional behaviour.

Regards,
Paul.

From skoda at sunstel.asu.cas.cz  Mon Oct 17 11:15:07 2011
From: skoda at sunstel.asu.cas.cz (Petr Skoda)
Date: Mon, 17 Oct 2011 20:15:07 +0200 (CEST)
Subject: Modification Of UWS 1 to 1.1
In-Reply-To: <4E9AD3F8.1050100@obspm.fr>
References: <4E96AC8D.1040003@obspm.fr> <4E96C993.2000602@obspm.fr> <4E9AD3F8.1050100@obspm.fr>
Message-ID:

Hi Pierre,

I am sorry I could not answer earlier, but I am still just in time before Thursday's UWS session...

> I try to reply point by point
> UWS is the language to manage jobs in an asynchronous way at a distance; we are
> clear that we define how to interface to a UWS service.

I think UWS is not a language (like JDL - which is already a part of the implementation and, as said in 1.3, the "change of JDL changes the service contract"). I feel the suggested Parameter Description Language as presented by Carlo Zwolf, Franck le Petit and Paul Harrison http://www.ivoa.net/internal/IVOA/InterOpOct2011Theory/PDL_2011.10.17.pdf (I am following the Interop despite not being there ;-) is such a JDL (in fact the parameter description).

> We want it to be simple just to simplify its implementation.

It is your wish to make your life simple - but when something has to be a global recommendation that meets the requirements of most of the community, things are never simple (I know the situation with "Simple" Spectral Access ;-) and generic data sets ...)
>> creation of new job and run together.
> From my point of view, the management of which jobs have to be sent in
> parallel, what RAM is available, and what the processor speed is, is not the
> user's problem.

I am afraid you are looking at the situation as a computer scientist and not as a scientist. A wise scientist knows his needs quite well and makes decisions to optimize the time in which he gets PROPER results. It was even more critical when mainframes had a limited number of quick (fast) queues and long-running ones with different memory and CPU time limits - I am sure that this will eventually have to be put into UWS somehow as well.

> He has to send jobs. He can send many of them; he does not have to know whether
> other users are sending jobs at the same time. This is an infrastructure problem
> managed by the provider, who can have many kinds of clusters, schedulers and
> batch queues.

Yes - the direct management of computation (how many nodes, which type, what size of memory...) has to be enforced by the manager of the infrastructure after investigating the typical requirements of users, average job size, run time etc .....

But as I tried to explain (probably not well ;-) our understanding of the UWS idea (i.e. my view as your opponent, and that of my colleagues behind VO-KOREL) is:
UWS defines the way in which an asynchronous (and stateful, job-oriented) service can be implemented, without any intention to restrict its usage to a particular service contract and without any limits on the possible implementation.

UWS is a PATTERN or RECIPE for HOW to arrange things.

Your idea of its usage is already very concrete:
the user has an ANONYMOUS grid to which he submits an UNRESTRICTED number of ARBITRARILY large jobs, and it is the task of a scheduler to process all of them according to the policy and rules implemented by the data provider.

> Usually the scheduler sends this job to the available CPU using its knowledge
> of the RAM, the number of CPUs and the time reserved for execution.
> This is how it works on every cluster.
Well, that is a further part of the system - the real execution (i.e. allocating RAM and executing the binary on a given node) - see below:

> Don't reinvent the scheduler: jobs are in a queue, and with your mobile you can
> see the status of all your jobs if you have a web interface for that.

It is not about the scheduler - it is about my knowledge and experience.

>> There is as yet no dedicated space for the user outside VOSpace.

In current VO services we are providing this space. But I do not see the reason to use VOSpace if things are stored at a given URL. VOSpace is for exchange of data between services.

> So input parameters are destroyed after the job anyway.

In your case yes; in ours we have user space and files where they are stored. I think we are talking about different parts of the "job run" - see below.

> But you tell me that you want intermediate results from a job you decided to
> stop.
> We have to think about that and reintroduce abort if it is necessary.

It is allowed by UWS and we need it (imagine the log file with the progress of convergence of some model). If the job is aborted manually (I am impatient), or by exceeding the allowed CPU time (in a given queue), or by the Execution Duration timeout, the results may still be very useful, telling you that the job was almost done but the user was just optimistic in giving a small tolerance for convergence - but they may equally tell you that this is nonsense - it does not converge with such a set of params ..... or it is oscillating ...

> The user will not talk UWS; he will have an interface and will play with it.
> We only describe the messages exchanged between this interface and the server.

Exactly

> We have to define all the possible user requirements in these exchanged messages.

YES - the requirement of the user interface is:
I want to upload large data (a set of spectra) and initial parameters to my REMOTE USER SPACE for several experiments, and after getting results I can decide to modify parameters and rerun the job with the same data, or run another experiment that is already PENDING.
> I don't see where the loss of comfort is. But I am open to any extension if
> our proposal does not fulfil the requirement.

If you want to remove PENDING and execute immediately after job creation, then you break the requirements given above. If not, then I am sorry for the misunderstanding....

>>> We have removed the pending phase as described before. A job can be in
>>> | the suspended phase, but only through a server action.
> We are only trying to make the standard clearer by defining the messages sent
> back from every action, and by limiting the number of methods that do the same
> action, to limit ambiguity, not comfort. Then implementation will be
> easier and more efficient.

Nothing against easier, more readable and clearer formulations. But you have to preserve the functionality .....

>>> | We have removed some messages that are useless from our point of view,
>>> | like abort from the user, because it does the same thing as delete: you
>>> | cannot retrieve the result, as explained in the 1.0 version.
>>
>> I do not understand why you could not retrieve results after abort -
>> and it does not do the same:
>> As we understand it - DELETING means removing all remnants of the job -
>> including the results.
> Yes
>>
> As I said before,
> we have to face the problem of intermediate results

People do not like running a code blindly - they like some intermediate results to be produced so that they can check the progress of a job (of course sometimes it may not be easy, as the system cannot transfer them from a given node to the output node of a cluster - however, some distributed file systems try to do this anyway ...)

> But not to substitute ourselves for the job manager.

The user has to be his own job manager ;-) But he has to have an interface allowing him to control the jobs ...

> Otherwise, when there are multiple users and multiple CPUs, it becomes
> unmanageable. It is not the purpose of UWS; it is the provider's internal business.

Well, in every OS the user has the right to kill HIS jobs if something is going wrong.
It would be very dumb to allow the user to execute many jobs (and bring the machine to its knees) just to play with different parameter sets because he wants to get some results without thinking harder about the nature of the problem. (This is our experience with VO-KOREL: there are a lot of parameters and, as you know, a global optimization task requires some knowledge of the problem, to restrict yourself to the parameters critical for convergence and to a given range of probable solutions. This is well understood by users; nevertheless, given enough freedom, they try to put the burden of decision on the "dumb" computer - then they run hundreds of jobs in parallel day and night so as not to let the machine idle ;-)

So a user in a restricted environment (e.g. limited CPU, memory etc.) will have to decide which jobs to run, and while computing he may come up with another idea for getting a solution faster. Then he naturally decides to abort the job and run another. The same applies to the PENDING state - he might prepare a job but still be waiting for results or an error from the one currently running.

>> Especially the isolation of individual users to see only their own jobs,
> This can be done only if we identify the user.

Well, in a non-anonymous environment you have clear identification of the user (he has to authenticate somehow). In an anonymous one you should probably arrange some temporary ID. But as UWS is stateful, you have to remember the state (and the user) to be able to identify him again after re-connection.

Section 2.1.8 (added in UWS 1.0) defines the Owner, but only for an authenticated service. I think it is not forbidden to use this object even for an anonymous service, using some other means of identifying users.

Here I am not sure how to impose the rule of STATEFULNESS for an anonymous user - it was not our case ... in principle, knowing the job id, anybody could get someone else's results after completion - probably not wanted by most of the community.
> If we want to define the mechanism of identification, then we propose to use
> the HTTP authentication mechanism, which is an RFC

This is what we are doing as well ...

>> As I said, we are probably doing something revolutionary with UWS and so our
>> interpretation may be wrong (I hope Paul and Dave will comment on their
>> primary ideas of UWS - but I liked the original design as very flexible)

I am explaining below what I think is different between my vision and yours.

> But you know the code better than the user. So you are able to make an
> initial estimate of the duration.

NO - in certain cases it depends on the parameters - and it is almost unpredictable - in the case of multiparametric optimization. Of course, if you just run image processing you can easily scale the job time using the size of the image as a parameter. I am afraid that is a special case, however.

> The queue priority depends on the resources you ask for. You cannot have the same
> priority for a quick single-processor job and a multi-processor job that uses
> all the memory of the machine and runs for 200h.

Exactly - that is why I think the user has to be provided with different queues, and there should be some UWS parameter hinting to the UWS service which one the job should be submitted to. Then this "crystal ball number" may be replaced with the default time limit of the given queue - but it is still difficult, as usually CPU time, not wall time, is used in queue policies....

> So what if you say my job is a 5mn job, then I change my mind and it will use 200h.

YES - I can ABORT all my jobs to get within the limits allowing me to use the long-term queue - it depends on the job management policy and my status (being allowed to use more memory, CPU time etc ...)

> It is not acceptable for the other users and it breaks the whole mechanism of the
> job queue that decides the job scheduling process.

The user should have the freedom to decide what he wants, if he is allowed to do it.
If you remove ABORT there is no way to stop a wrongly prepared job. If you replace it with DELETE then all the effort is lost, while in the ABORT case the user could check the results produced so far (to recognize that the problem diverges, so that he has to think more about its physical nature and reformulate the problem).

> In UWS you ask a service how to query it and retrieve results. From
> this information you are able to make a client that sends jobs and
> retrieves results. But we did not give any recommendation on a web
> interface.

That was the original objection of Mathew as well - UWS is for running automatic submission - mainly for TAP. It was not originally intended for browser interaction.

As I said, we tried to find another application of UWS (more attractive to scientists - I have asked many of them), as it is a general pattern without restriction. But so far I do not see any reason why only the blind automatic execution of jobs should be the official way, if a different idea can fit well with the current standard.

> Our purpose is not to reduce functionality, but just to remove ambiguity.

This is philosophical rhetoric ;-)
You are reducing the functionality, as you do not allow me to prepare a job without running it, and you have removed my right to decide about its fate (STOPPING it).
The ambiguity is often only apparent - tiny differences may have crucial consequences (see any arbitrary EULA ;-)

> If you can send a job in 3 different manners, it means that people
> implementing the server side will have to face a choice that is usually not clear,
> it means that the client must implement the 3 manners,
> it means that the description of the service has to cover a description of 3
> manners.

Can you describe exactly which 3 means of starting you have in mind?
The job must be CREATED by uploading parameters.
And as it says in 2.2.3.1, you may add the job control parameter PHASE=RUN, so that the job is run immediately - so all your services may define this implicitly in their parameter sets (but I would still let the user check this option in the input form, or add it to the client string etc ...)

If you remove the PENDING phase you restrict the user's freedom. It is a key issue - you are forcing everybody to accept your solution, as you currently feel powerful enough to dictate your particular solution to others. And what about the poor older services having PENDING, ABORT etc.? It looks like the MS or Oracle policy - drop everything that is working, as we have decided to make you happy with our new version, which is the only one we support. (Sorry, nothing personal against your group - I am sure you wanted to make more developers happy without seeing the consequences.)

But OTOH I feel this is the crucial point of misunderstanding between VO developers and potential VO users, and probably one of the reasons why interest in the VO is diminishing in the wider astronomical community (e.g. as seen in the state of VO affairs at ESO). The developers want to make the astronomers accept their simple-to-implement and conceptually clean solution for a VO application or service, while the scientists want something different, which is however harder to implement or not obvious to the developer.

All the IVOA rules have tried to maintain backward compatibility, and most standards were extended to add new features, never cut in functionality!

> I just say, give me a good reason to create a job and not send it. Why, if
> you create a job, don't you want to send it?

Probably the concepts we are talking about are different -
our solution, as I said, is CLOUD-like - so the user has some private space allocated to which he can upload the data + parameters. He can use this space as he needs, within certain limits.
If he uploads a lot of data, the disk quota system stops him from creating new jobs (so he has to decide what to remove - i.e. data, input parameters and results) - this is what I understand as the DELETE operation.

By CREATE of a job I mean the allocation of this private space, to which the data and parameters are uploaded - this is what I call PENDING - and after PHASE=RUN this volume is sent by some scheduler to a particular machine for execution, or submitted to some queue - here the private policy of the scheduler comes into play - the state is QUEUED. Then the particular node may be given the processing code and data, and it runs it. (Maybe I am wrong, but in a cluster or grid the code is not on a node; it is uploaded there by the scheduling node from the user's space. So here UWS should decide what numerical code to run (it knows the service URL), and this should be sent with the user's data, wrapped somehow for deployment.)

In other words - there is a UWS server that has to handle the user's space, to which the data and params are uploaded. There must be some server containing the binaries of the code of all supported services, and there is an entry point (scheduling node) of the grid or cluster, which does not care about UWS but gets the binary of the given service plus the parameters and data of the user, and according to its scheduler decides on which node to run it. You may have a networked (clustered) filesystem and a homogenized cluster, and then all the binaries are the same, and only the name of the binary code is sent, in some job description language, to the given node, referring to the shared user's space where the data and params were stored.
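
The life cycle sketched in the paragraphs above (a job created in PENDING, PHASE=RUN moving it to QUEUED, a node then EXECUTING it) can be written down as a small transition table. This is only a minimal sketch in Python: the phase names come from UWS 1.0, but the function name and the choice of which phases accept an abort are assumptions made for illustration, not text from the standard.

```python
from enum import Enum


class Phase(Enum):
    PENDING = "PENDING"      # job created, data and parameters uploaded
    QUEUED = "QUEUED"        # handed to the scheduler, waiting for a node
    EXECUTING = "EXECUTING"  # running on some node
    COMPLETED = "COMPLETED"
    ABORTED = "ABORTED"


# Hypothetical table of client-requested phase changes (assumption:
# an abort is accepted in any phase before the job has finished).
CLIENT_TRANSITIONS = {
    (Phase.PENDING, "RUN"): Phase.QUEUED,
    (Phase.PENDING, "ABORT"): Phase.ABORTED,
    (Phase.QUEUED, "ABORT"): Phase.ABORTED,
    (Phase.EXECUTING, "ABORT"): Phase.ABORTED,
}


def request_phase(current, requested):
    """Return the new phase, or raise ValueError if the request is not allowed."""
    try:
        return CLIENT_TRANSITIONS[(current, requested)]
    except KeyError:
        raise ValueError(
            f"{requested} not allowed in phase {current.value}"
        ) from None


if __name__ == "__main__":
    phase = Phase.PENDING                  # right after job creation
    phase = request_phase(phase, "RUN")    # the user decides to start it
    print(phase.value)
    phase = request_phase(phase, "ABORT")  # the user changes his mind
    print(phase.value)
```

Moves such as QUEUED to EXECUTING or EXECUTING to COMPLETED are left out on purpose, since in the picture above they belong to the scheduler and not to the user.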
He can run a limited number of parallel jobs (we are thinking about some policy for different memory requirements, time limits etc., even if this is a task for the scheduler - but the user has to have some options, like: you can run 5 short low-memory jobs, and others will be queued, or one large job with more memory for a longer time). Even if we do not let him decide, we can let the scheduler decide, but then some queue classes must be defined and he must decide how to run the job. Even in the case of such a TAP application the user may want to query the whole universe ;-) (a large amount of data expected in a large cone, a larger range of magnitudes etc ....) but it may take a long time - or he decides to restrict the query and wants to have results quickly ....

> And why are you not able to wait
> until you are ready to send the job directly, without going through this
> pending status?

As I tried to explain above - the user in our case has allocated storage on our servers, which he can use directly for a long time - it stores his experiments, comments, results, graphs etc ... It is a CLOUD concept, which is extremely comfortable for users, like an e-shop - all my purchases are there, so I can re-order the same goods again - well, it is a bit different ;-) but here I can see the improvements in parameters on the same data and re-execute the jobs with the same data (already uploaded - it may be large - in my space) and changed parameters (one number modified from my mobile ;-)

Of course, in case I re-submit the same job (i.e. I take the same data sets and the same parameters, and am allowed to edit the parameters) I do not force the PENDING phase - as I suppose I want to see the results of my modified job (or even of the same one - as the server version of the binary may have been upgraded) immediately.

So once again - creating a job means uploading large data + params to the user's space on the server. But then I can use this space to run the same job with a small change in parameters without needing to upload all the data again.
I agree this is a particular case, but in general it may be extended to the concept of workflows - the data (e.g. for processing an image mosaic) will take time to upload, but then I may play with the workflow description from my mobile and check the quality of the solution (mostly not the image itself but some quality plot, statistics etc ...)

Probably you should ask the users of your services whether they would like to have a history of their jobs without needing to re-upload the data every time. There are still slow networks in hotels, or when using GPRS etc ....

What we are allowing the users is "supercomputing on my phone".

And I am pretty sure astronomers will soon require this in the VO - as mobile freedom is increasing - tablets etc ..

There will have to be a proper model for the VO to provide such cloud services - instead of running Aladin on a large computer (lots of memory for Java etc ..) a simple browser will connect to Aladin in the cloud to see the DSS and 2MASS thumbnails of the region where a supernova has just exploded. The same goes for the theory case.

To summarize for further consideration in the GWS session - I do not support your idea of removing PENDING, ABORT etc ...

I think the critical issue is the solution of job isolation - authentication, and some identification of users for open (public) services (e.g. cookies). The new version of UWS should somehow include the parameter description language and will need some concept of queue priorities.

I would even appreciate another phase ;-) STOPPED, as a logical extension of ABORT - stop executing, free the processor, allow the user to see the results for a short time and decide whether to ABORT or CONTINUE - the state is the same as SUSPENDED, but the system does not run the job again once the troubles (e.g. memory limit) are gone; the user has to put it into the EXECUTING state intentionally. Of course it will be harder to implement in a GRID environment, but I am sure modern systems will allow this (something like checkpoint storage, an unswapped job etc ...)
The user may already be running a long job and not be allowed to run another, but he may want to stop the execution to free resources, run a quicker job and then continue computing (for another month ;-) ...

Of course the crucial question is what happens on the computing node - is it a temporary stop for several minutes, where the process is just swapped out (and will be killed later), or a kind of checkpoint, where the whole space of the running job is stored in some user space from which it can be recalled after weeks etc ....

I am not experienced in such technologies on clusters, but it may even be some special feature of the application binary that allows it to store the intermediate results of a long computation, and the newly executed job may be given a parameter to continue after re-reading such a state (something like SAVE in IDL, or "Save session" in most unix applications ...)

To be honest, I dislike the idea of an automatic Destruction Time removing my results shortly after they have been computed. I would first let the user decide what he wants to keep within the limited space (rather than removing jobs older than e.g. 2 months and keeping the current ones, as the older ones may be important, representing well some solution of my problem that I still want to work on, while the current ones are the results of multiple demos of some silly experiment).

But if the Destruction Time is long enough (e.g. a month or more) then the idea in the last paragraph of 2.2.3.3 is nice (if the quota is exceeded, further jobs are stored only for shorter times - if the user is warned about this, he will be careful not to lose fancy new results ;-)

And I like 4.3 - the idea of CEA v2 - very much.

Best regards,

Petr

*************************************************************************
*  Petr Skoda                         Phone : +420-323-649201, ext. 361 *
*  Stellar Department                         +420-323-620361           *
*  Astronomical Institute AS CR       Fax   : +420-323-620250           *
*  251 65 Ondrejov                    e-mail: skoda at sunstel.asu.cas.cz *
*  Czech Republic                                                       *
*************************************************************************

From andre.schaaff at astro.unistra.fr  Wed Oct 19 22:20:46 2011
From: andre.schaaff at astro.unistra.fr (Andre Schaaff)
Date: Thu, 20 Oct 2011 07:20:46 +0200
Subject: Pune GWS sessions presentations
Message-ID: <4E9FAFAE.1090004@astro.unistra.fr>

Hello,

All the presentations from yesterday's session ... and from today's upcoming session are available.

http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/InterOpOct2011GWS

Regards

André

From Paul.Harrison at manchester.ac.uk  Thu Oct 20 00:06:12 2011
From: Paul.Harrison at manchester.ac.uk (Paul Harrison)
Date: Thu, 20 Oct 2011 08:06:12 +0100
Subject: UWS 1.1 comments
Message-ID: <1204671A-21C7-474A-BC17-2091D6B66EE9@manchester.ac.uk>

Hi,

Given the difficulties with my attempt to join the Registry session the other day via Skype, I thought that I would send an email with my thoughts just in case Skype fails us again....

Here is my response to the presentation by Normand, Malapert & Le Sidaner http://www.ivoa.net/internal/IVOA/InterOpOct2011GWS/normand_malapert_lesidaner_uws11_IVOA2011.pdf - Slightly unsatisfactory again, as this has ended up as an email rather than a "real-time" conversation.

In general
--------------

1. The "REST purity" of the design of UWS was constrained by some pragmatic criteria (and anyway no-one should be a REST zealot ;-))
  * it should be possible to drive a UWS service from most browsers (without javascript)
  * it should fit in with existing patterns of use of the S*AP protocols

2. Any changes to UWS need to be backwards compatible so as not to invalidate any existing services - this is particularly important as UWS is used by TAP.
This means in general that existing behaviours cannot be changed, and any new behaviours cannot be mandatory - unless I guess that there is a majority amongst the people who have already implemented services and clients to make the "disruptive" change.

3. UWS is meant to be able to act as a uniform façade onto a grid/cloud backend whilst simplifying and unifying some common concepts of job control systems. Several of the suggestions are to remove features which might be relevant in a more generalized model of a job control system, but for now do not seem to have an immediate need. It should be noted that in order to reach a consensus in a timely manner on UWS 1.0, other metadata (e.g. quotas, priorities) that was not absolutely necessary for basic operation was not included. So I would expect future versions of UWS to add to the metadata by including these more difficult to generalize areas.

Specifically
--------------

Looking at the presentation page by page.

1. Page 4
The reason for the two step execution of a job is so that the job with its parameters can be evaluated by the service and then changes be made to the job metadata that might affect the execution of the job. It was already recognised that it might be desired to just accept all the job execution defaults and put the job into an executing state in one operation - see http://www.ivoa.net/Documents/UWS/20101010/REC-UWS-1.0-20101010.html#d1e1375 - by including a PHASE=RUN amongst the initial job parameters - perhaps this needs to be made clearer and could be explicitly separated from the JDL by saying that job metadata parameters are passed during the job creation step only in the "query" part of the creation URI, not in the POST body.

2. Page 5
What is the tunnelling API that is mentioned wrt having only DELETE to delete jobs?
- Whilst the addition of responding to a POST with ACTION=DELETE is a small amount of extra work for the author of the UWS server side, they will be thanked by the author of a browser based UWS client side implementation as they can do this with a one button FORM that is guaranteed to work on just about every browser implementation without having to write any tricky javascript to be able to send a DELETE http method.

3. Page 6.
I think that everyone is agreed that the Quote is difficult for the service to provide (it even says so in the standard http://www.ivoa.net/Documents/UWS/20101010/REC-UWS-1.0-20101010.html#Quote) and perhaps we should not have included it in the initial standard. However the service is allowed to say "don't know".

ExecutionDuration is supposed to be roughly equivalent to CPU time. The CPU time is an exact measure and job control systems often limit the CPU time that can be given to a particular job. I agree that the language used in the description of ExecutionDuration is rather misleading as it makes it seem like ExecutionDuration is just like Quote because of the phrase "wall clock time" (though because of the possibility of a job being suspended and not knowing exactly when a job will start once queued they can never be exactly equivalent) - the intention was to say that if CPU time was not available then wall clock time could be used as a measure (but the wording says that it should always be used) - I think that the solution here is to make sure that the ExecutionDuration definition is more carefully worded.

4. Page 7.
There is a clear use case for users wanting to be able to set an ExecutionDuration if the service that they are using has quotas - they can stop a single job (possibly unexpectedly because of the input parameters) from using all of their quota in one execution thus preventing other jobs from being run.
Similarly setting the DestructionTime when the job is initialized can be useful to make sure that the job (and associated storage) is deleted in a timely fashion - preventing them from exceeding a storage quota if they are submitting many jobs in succession. Another use of this facility is that the default destruction time on a service may be less than the maximum that the service will allow, so the client can request more time than they would be given by default.

5. Page 8.
As already said you can start the job with all its parameters in one step already in V1.0 or you can do it in several steps using feedback from the server to fine tune the job metadata. Also note that you are not allowed to create new parameters after the initial POST, only potentially change their values - and again this is an optional feature for the server to support - the only place that the server must support setting parameter values is in the initial POST.

6. Page 9
The most basic use case for a user being able to abort a job is if they submit a job and start it executing and then realise that they have made a mistake (e.g. with a parameter value) that will cause the job to run for days when it should only take minutes - they can be a good citizen and abort the job.

7. Page 10.
I am not a great fan of pagination myself (it is more complex for both the server author and the client author). However, if there is a perceived need for this facility at the meeting, then it must be that HTTP GET on /{jobs} returns the whole list and *NOT* a paginated version of the list - mainly for backwards compatibility, but also because it makes no sense to return the paginated version (how big is the page?) - the desire for pagination should always be indicated by the relevant parameters in the query part of the request URI. Pagination would also require a change to the UWS schema (to indicate that only part of the response is in the job list) which would be disruptive.
I actually feel that some standard filtering - e.g. only list jobs according to phase, jobs newer than date etc. - would be better than pagination...

8. Page 11. Authentication is orthogonal to the UWS specification, and what it says about it (http://www.ivoa.net/Documents/UWS/20101010/REC-UWS-1.0-20101010.html#security) is probably sufficient, as authentication is dealt with by the http://www.ivoa.net/Documents/latest/SSOAuthMech.html standard. I have long been of the opinion that the SSOAuth standard is not really sufficient on its own for creating a practical SSO system, as it relies on X509 user certificates to achieve the "Single" part and X509 certificates have not reached a wide enough user base. BTW your suggestion to use basic authentication is currently disallowed by the SSOAuth standard and that is why UWS returns 403 for the areas that the user is not allowed to see, because 401 implies that the client could try again using Basic Auth. Anyway there is still much to be debated on the practical use of authentication mechanisms in the IVOA, but it is not directly a UWS 1.1 issue.

9. Page 12. What you describe is just about the only option that you have in UWS 1.0 if your JDL is not expressible as simple parameter/value pairs and cannot easily be the legal content of a parameter element. The standard does say this, but you have to look in two places: section 2.2.2.4 (http://www.ivoa.net/Documents/UWS/20101010/REC-UWS-1.0-20101010.html#d1e1353) and section 2.1.11 (http://www.ivoa.net/Documents/UWS/20101010/REC-UWS-1.0-20101010.html#ResultsList2). However a client should be able to pick up a parameter value whether it is given "in line" or "by reference", so if the service can express its JDL as the legal content of a parameter value in line, then it can, although I agree that it is a better choice to do it "by reference".
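
To make the filtering idea in point 7 concrete, here is a minimal server-side sketch in Python. It is only an illustration under stated assumptions: the query parameter names PHASE and AFTER, the in-memory JOBS list and the function list_jobs are all hypothetical and are not defined by UWS 1.0. The one behaviour taken from the discussion above is that a GET on the job list with no query parameters returns the whole list.

```python
from datetime import datetime
from urllib.parse import parse_qs, urlsplit

# Hypothetical in-memory job list; a real service would read its job store.
JOBS = [
    {"id": "job1", "phase": "COMPLETED", "created": datetime(2011, 10, 1, 9, 0)},
    {"id": "job2", "phase": "EXECUTING", "created": datetime(2011, 10, 15, 12, 30)},
    {"id": "job3", "phase": "COMPLETED", "created": datetime(2011, 10, 18, 8, 15)},
]


def list_jobs(request_uri, jobs=JOBS):
    """Return the ids of the jobs selected by the query part of request_uri.

    PHASE and AFTER are assumed parameter names, not part of UWS 1.0.
    With no query parameters the whole list is returned unchanged.
    """
    query = parse_qs(urlsplit(request_uri).query)
    selected = jobs
    if "PHASE" in query:
        # Several PHASE values may be given; keep jobs matching any of them.
        wanted = set(query["PHASE"])
        selected = [job for job in selected if job["phase"] in wanted]
    if "AFTER" in query:
        # Keep only jobs created strictly after the given ISO 8601 date.
        cutoff = datetime.fromisoformat(query["AFTER"][0])
        selected = [job for job in selected if job["created"] > cutoff]
    return [job["id"] for job in selected]
```

For example, list_jobs("/jobs") would list every job, while list_jobs("/jobs?PHASE=COMPLETED&AFTER=2011-10-10T00:00:00") would narrow the list to completed jobs created after that date. Because the filters live entirely in the query string, existing clients that issue a plain GET are unaffected.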

Other
-------

On some points from Dean Hinshaw's DataScope and UWS talk

* I think it is OK for results to appear before a job has finished - it is not against the spirit of the standard - indeed it is part of the reason why Aborting a job can leave partial results.

* It is definitely not OK to return a result value in-line - it is invalid against the current schema - I cannot really remember why we did not allow both in-line and by-reference values for results as we do for parameters because it seems sensible to me now, but would be a disruptive change at this stage to allow content in the result element as it would require a

Conclusion
---------------

I think that UWS 1.1 should be about clarification of the UWS 1.0 standard rather than attempting to make changes to the basic model - frankly it is too late for that now. I think that the presentation highlights areas where more explanation is needed. A future version beyond 1.1 could introduce new extended features to the UWS pattern.

Paul.

p.s. I think that I had almost forgotten about this page myself - http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/UWSEnhancement - a place for suggestions for UWS enhancements.

Dr. Paul Harrison
JBCA, Manchester University
http://www.manchester.ac.uk/jodrellbank

From andre.schaaff at astro.unistra.fr Thu Oct 20 00:26:30 2011
From: andre.schaaff at astro.unistra.fr (Andre Schaaff)
Date: Thu, 20 Oct 2011 09:26:30 +0200
Subject: REST document
Message-ID: <4E9FCD26.7000303@astro.unistra.fr>

Hello,

The document concerning the REST Basic Profile.

André

-------------- next part --------------
A non-text attachment was scrubbed...
Name: IVOA-WD-VOREST.pdf
Type: 0/unknown
Size: 252061 bytes
Desc: not available
URL:

From patrick.dowler at nrc-cnrc.gc.ca Thu Oct 20 01:31:23 2011
From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler)
Date: Thu, 20 Oct 2011 01:31:23 -0700
Subject: UWS 1.1 comments
In-Reply-To: <1204671A-21C7-474A-BC17-2091D6B66EE9@manchester.ac.uk>
References: <1204671A-21C7-474A-BC17-2091D6B66EE9@manchester.ac.uk>
Message-ID: <201110200131.24237.patrick.dowler@nrc-cnrc.gc.ca>

On 2011-10-20 00:06:12 Paul Harrison wrote:
> As already said you can start the job with all its parameters in one step
> already in V1.0 or you can do it in several steps using feedback from the
> server to fine tune the job metadata. Also note that you are not allowed
> to create new parameters after the initial POST, only potentially change
> their values - and again this is an optional feature for the server to
> support - the only place that the server must support setting parameter
> values is in the initial POST.

Paul - can you point to the section of the UWS spec that prohibits one to POST new params to the job or the parameters child resource? I naively would think that a POST to /joblist/ would modify the job and a POST to /joblist//parameters would modify the parameter list (eg add something to it). A POST to /joblist//parameters/foo would modify the existing parameter named foo and fail (404) if it was not there (not create it). This would all be standard REST style, imo.

Thoughts?

PS-In my opinion, changes to a spec that allow something previously not allowed or otherwise relax constraints are in principle backwards compatible, but if the spec really says that POST FOO=bar /joblist/job must modify an existing param named FOO, that is problematic.

--
Patrick Dowler
Tel/Tél: (250) 363-0044
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9E 2M7

Centre canadien de donnees astronomiques
Conseil national de recherches Canada
5071, chemin West Saanich
Victoria (C.-B.) V9E 2M7

From paul.harrison at manchester.ac.uk Thu Oct 20 02:10:50 2011
From: paul.harrison at manchester.ac.uk (Paul Harrison)
Date: Thu, 20 Oct 2011 10:10:50 +0100
Subject: UWS 1.1 comments
In-Reply-To: <201110200131.24237.patrick.dowler@nrc-cnrc.gc.ca>
References: <1204671A-21C7-474A-BC17-2091D6B66EE9@manchester.ac.uk> <201110200131.24237.patrick.dowler@nrc-cnrc.gc.ca>
Message-ID:

On 2011-10-20, at 09:31, Patrick Dowler wrote:

> On 2011-10-20 00:06:12 Paul Harrison wrote:
>> As already said you can start the job with all its parameters in one step
>> already in V1.0 or you can do it in several steps using feedback from the
>> server to fine tune the job metadata. Also note that you are not allowed
>> to create new parameters after the initial POST, only potentially change
>> their values - and again this is an optional feature for the server to
>> support - the only place that the server must support setting parameter
>> values is in the initial POST.
>
> Paul - can you point to the section of the UWS spec that prohibits one to POST
> new params to the job or the parameters child resource? I naively would think
> that a POST to /joblist/ would modify the job and a POST to
> /joblist//parameters would modify the parameter list (eg add something
> to it). A POST to /joblist//parameters/foo would modify the existing
> parameter named foo and fail (404) if it was not there (not create it). This
> would all be standard REST style, imo.
>
> Thoughts?
>
>
> PS-In my opinion, changes to a spec that allow something previously not
> allowed or otherwise relax constraints are in principle backwards compatible,
> but if the spec really says that POST FOO=bar /joblist/job must modify an
> existing param named FOO, that is problematic.

It was not explicit in the original 1.0 specification (the wording just said that you may "update a parameter"), but I thought that the discussion from the last Interop had come up with the recommendation that it be made clear that parameter creation could only happen at job creation time, and so I had added "(but not create)" to the latest draft http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html#ResultsList2 (sec 2.1.11) as one of the "uncontroversial" changes. The thinking behind it was that as parameter creation was part of the JDL, which is outside the scope of UWS, there could not be a generalized client that could create parameters in the parameter list.

The whole area of being able to set the parameters in the parameter list is actually optional, and given that the original standard did not explicitly allow or disallow it, I guess the decision can still go either way - I take it that you are using this feature....

Paul.
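
The two readings discussed in this exchange can be compared in a small Python sketch. It is not taken from either specification draft: the Job class, the post_parameter method and the allow_create flag are hypothetical names, and the 200 returned on success is only a placeholder for whatever success response a real service gives. The 404 for a missing parameter follows the behaviour Patrick suggests for a POST to an individual parameter resource.

```python
class Job:
    """Hypothetical job holding the parameters given at creation time."""

    def __init__(self, parameters):
        # Parameters supplied in the initial POST that created the job.
        self.parameters = dict(parameters)

    def post_parameter(self, name, value, allow_create=False):
        """Handle a later POST that sets one parameter; return a status code.

        allow_create=False models "update (but not create)": an unknown
        name is rejected with 404 and the job is left unchanged.
        allow_create=True models the relaxed reading, where the POST may
        add a new parameter to the list.
        """
        if name not in self.parameters and not allow_create:
            return 404
        self.parameters[name] = value
        return 200  # placeholder success code, an assumption of this sketch
```

Under the strict policy, Job({"FOO": "1"}).post_parameter("BAR", "x") is refused, whereas the same call with allow_create=True adds BAR. A client written against the strict policy keeps working against a relaxed server, which is the sense in which relaxing the constraint is backwards compatible.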