WS-Gram

From TeraGrid Wiki

Description

Globus Toolkit 4 contains a web service-based Grid Resource Allocation and Management (GRAM) component. WS-GRAM is a WSRF[1]-based web service used by computational resources on the TeraGrid[2] to remotely submit, monitor, and cancel jobs. These jobs may be either interactive jobs, which tend to perform simple tasks that complete quickly, or jobs managed by a scheduler such as PBS[3].

Services

Managed Job Factory Service

The Managed Job Factory Service (MJFS) creates an instance of the stateful Managed Job Service when a client calls the createManagedJob method. The resulting Managed Job Service is used to control and monitor a job. In addition, the MJFS publishes information about the characteristics of the compute resource using the WS-Resource specification[4], [5].

Operations

  • createManagedJob : This operation creates a Managed Job Service instance, subscribes the client for notifications if requested, and replies with one or two endpoint references (EPRs)[6]. These EPRs in turn point at the actual Managed Job Service used to control and monitor a job.
    • Input Parameters: a job description, an optional initial termination time for the job resource, and an optional state notification subscription request.
    • Output Parameters: one or two endpoint references (EPRs). The first EPR points to the Managed Job Service; the second points to the subscription service and is present only when the client requests a notification subscription.
    • Fault: createManagedJobFault
  • getResourceProperty, getMultipleResourceProperties, and QueryResourceProperties are all part of the WS-ResourceProperties portType. WSDL[7], XSD[8].
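
The createManagedJob contract described above (a job description plus two optional inputs, returning one or two EPRs) can be sketched as a toy in-memory model. This is only an illustration of the shape of the call, not the Globus Toolkit API: the class names ToyJobFactory and EndpointReference, the service paths, and the use of ValueError to stand in for createManagedJobFault are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import count
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class EndpointReference:
    """Hypothetical, simplified EPR: a service address plus a resource key."""
    address: str
    resource_key: str


class ToyJobFactory:
    """In-memory imitation of the createManagedJob contract (not a real client)."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")
        self._ids = count(1)
        # Remembers the optional initial termination time of each job resource.
        self.termination_times: Dict[str, Optional[datetime]] = {}

    def create_managed_job(
        self,
        job_description: str,
        termination_time: Optional[datetime] = None,
        subscribe: bool = False,
    ) -> Tuple[EndpointReference, Optional[EndpointReference]]:
        if not job_description.strip():
            # Stand-in for createManagedJobFault (assumption for this sketch).
            raise ValueError("createManagedJobFault: empty job description")
        key = f"job-{next(self._ids)}"
        self.termination_times[key] = termination_time
        # The first EPR always points at the (toy) Managed Job Service.
        job_epr = EndpointReference(f"{self.base_url}/ToyManagedJobService", key)
        # The second EPR exists only when a subscription was requested.
        sub_epr = (
            EndpointReference(f"{self.base_url}/ToySubscriptionService", key)
            if subscribe
            else None
        )
        return job_epr, sub_epr
```

Calling create_managed_job("<job/>") on this model returns a job EPR and None; passing subscribe=True returns both EPRs, mirroring the "one or two EPRs" wording above.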

Managed Job Service

Operations

  • release : This operation takes no parameters and returns nothing. Its purpose is to release a hold placed on a state through the use of the "holdState"[9] field in the job description.
  • setTerminationTime : This (optional) step allows the client to reschedule automatic termination to a time different from the one originally set when the ManagedJob resource was created.
  • destroy : This (optional) step allows the client to explicitly abort a job and destroy the ManagedJob resource in the event that the scheduled automatic termination time is not adequate. If the job has already completed (i.e. is in the Done or Failed state), this simply destroys the resource associated with the job. If the job has not completed, appropriate steps are taken to purge the job process from the scheduler and perform cleanup operations before the job state is set to Failed.
  • subscribe : This (optional) step allows a client to subscribe for notifications of status (and particularly life cycle status) of the ManagedJob resource. For responsiveness, it is possible to establish an initial subscription in the createManagedJob() operation without an additional round-trip communication to the newly created job.
  • getResourceProperty and getMultipleResourceProperties : These (optional) steps allow a client to query the status (and particularly life cycle status) of the ManagedJob resource.
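
The release and destroy semantics listed above can be illustrated with a small state model. This is a sketch under stated assumptions rather than GRAM's implementation: the ToyManagedJob class is hypothetical, the Pending and Active states are included only for illustration (the text above names only Done and Failed), and a hold is modelled simply as "the job stops advancing once it reaches the hold state until release is called".

```python
from enum import Enum
from typing import Optional


class JobState(Enum):
    PENDING = "Pending"   # illustrative state, assumed for this sketch
    ACTIVE = "Active"     # illustrative state, assumed for this sketch
    DONE = "Done"
    FAILED = "Failed"


class ToyManagedJob:
    """Hypothetical model of a ManagedJob resource's hold/destroy behaviour."""

    def __init__(self, hold_state: Optional[JobState] = None) -> None:
        self.state = JobState.PENDING
        self.hold_state = hold_state
        self.held = False
        self.destroyed = False
        self.purged_from_scheduler = False

    def advance(self, new_state: JobState) -> None:
        """Move to new_state unless the job is held or already destroyed."""
        if self.held or self.destroyed:
            return
        self.state = new_state
        if new_state == self.hold_state:
            self.held = True

    def release(self) -> None:
        """Takes no parameters and returns nothing; lifts the hold."""
        self.held = False

    def destroy(self) -> None:
        """Abort an unfinished job, then destroy the resource."""
        if self.state not in (JobState.DONE, JobState.FAILED):
            # Unfinished job: purge it from the scheduler and mark it Failed.
            self.purged_from_scheduler = True
            self.state = JobState.FAILED
        self.destroyed = True
```

In this model, destroying a job that is already Done leaves its state untouched and only removes the resource, while destroying an Active job marks it purged and Failed first.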

Job Description

A GRAM job can be described using the XML Schema found here[10]. The GT 4.0 WS GRAM: Job Description Schema Doc[11] is a more human-readable description of the same schema. A job description written against this schema might look like the following simple job example:

<?xml version="1.0" encoding="UTF-8"?>
<job>
    <executable>/bin/echo</executable>
    <directory>${GLOBUS_USER_HOME}</directory>
    <argument>Welcome to the Teragrid.</argument>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>

or this more complex example which demonstrates how to submit a multijob:

<?xml version="1.0" encoding="UTF-8"?>
<multiJob xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job" 
     xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
    <factoryEndpoint>
        <wsa:Address>
            https://localhost:8443/wsrf/services/ManagedJobFactoryService
        </wsa:Address>
        <wsa:ReferenceProperties>
            <gram:ResourceID>Multi</gram:ResourceID>
        </wsa:ReferenceProperties>
    </factoryEndpoint>
    <directory>${GLOBUS_LOCATION}</directory>
    <count>1</count>

    <job>
        <factoryEndpoint>
            <wsa:Address>https://localhost:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
            <wsa:ReferenceProperties>
                <gram:ResourceID>Fork</gram:ResourceID>
            </wsa:ReferenceProperties>
        </factoryEndpoint>
        <executable>/bin/date</executable>
        <stdout>${GLOBUS_USER_HOME}/stdout.p1</stdout>
        <stderr>${GLOBUS_USER_HOME}/stderr.p1</stderr>
        <count>2</count>
    </job>

    <job>
        <factoryEndpoint>
            <wsa:Address>https://localhost:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
            <wsa:ReferenceProperties>
                <gram:ResourceID>Fork</gram:ResourceID>
            </wsa:ReferenceProperties>
        </factoryEndpoint>
        <executable>/bin/echo</executable>
        <argument>Hello World!</argument>        
        <stdout>${GLOBUS_USER_HOME}/stdout.p2</stdout>
        <stderr>${GLOBUS_USER_HOME}/stderr.p2</stderr>
        <count>1</count>
    </job>

</multiJob>
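
Before submitting a description like the ones above, it can be handy to check that the file is well-formed and to see which executable each sub-job runs. The following is a minimal sketch using only the Python standard library; the function name summarize_jobs is hypothetical, and it relies only on the element names and the gram namespace URI that appear in the examples above. It does not validate against the job description schema.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Namespace URI copied from the multiJob example above.
GRAM_NS = "http://www.globus.org/namespaces/2004/10/gram/job"


def summarize_jobs(xml_text: str) -> List[Tuple[Optional[str], Optional[str], Optional[str]]]:
    """Return (ResourceID, executable, count) for each job in a description.

    A plain <job> document yields one entry; a <multiJob> document yields one
    entry per nested <job>. Missing elements are reported as None.
    Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    """
    root = ET.fromstring(xml_text)
    jobs = [root] if root.tag == "job" else root.findall("job")
    summary = []
    for job in jobs:
        resource_id = job.findtext(f".//{{{GRAM_NS}}}ResourceID")
        summary.append((resource_id, job.findtext("executable"), job.findtext("count")))
    return summary


if __name__ == "__main__":
    sample = """
    <multiJob xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
              xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
        <job>
            <factoryEndpoint>
                <wsa:ReferenceProperties>
                    <gram:ResourceID>Fork</gram:ResourceID>
                </wsa:ReferenceProperties>
            </factoryEndpoint>
            <executable>/bin/date</executable>
            <count>2</count>
        </job>
    </multiJob>
    """
    for entry in summarize_jobs(sample):
        print(entry)
```

Because a parse error is raised for malformed input, running this over a job file catches unbalanced tags before the file is handed to the submission client.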

WSDL

All WS-GRAM WSDL files, portTypes, and XSD data types can be found here[12].

Examples

Command Line Example

The WS-GRAM service can be invoked from the command line using the globusrun-ws command, which is provided by the Globus Toolkit and found in the $GLOBUS_LOCATION/bin directory.

The globusrun-ws command usage can be seen at the globusrun-ws Command Usage page.

Fork Job

For example, the following will submit a simple job to the local machine using the locally installed version of GT4.

tg-login1% globusrun-ws -submit -f simpleJob.xml

where simpleJob.xml is:

<?xml version="1.0" encoding="UTF-8"?>
<job>
    <executable>/bin/echo</executable>
    <directory>${GLOBUS_USER_HOME}</directory>
    <argument>Welcome to the Teragrid.</argument>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>

Batch Job

To submit a simple job to a PBS queue using the locally installed version of GT4, specify the scheduler with the Factory Type (-Ft) flag.

tg-login1% globusrun-ws -submit -Ft PBS -f simpleQueue.xml

where simpleQueue.xml is:

<?xml version="1.0" encoding="UTF-8"?>
<job>
    <executable>${GLOBUS_USER_HOME}/ws-gram/run.sh</executable>
    <directory>${GLOBUS_USER_HOME}/ws-gram</directory>
    <stdout>${GLOBUS_USER_HOME}/ws-gram/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/ws-gram/stderr</stderr>
    <count>2</count>
    <queue>dque</queue>
</job>

To submit the same batch job to a remote host, simply add the -factory flag and the address of the Managed Job Factory Service, like so:

lslogin1% globusrun-ws -submit -factory https://tg-login1.sdsc.teragrid.org:8443/wsrf/services/ManagedJobFactoryService -Ft PBS -f simpleQueue.xml
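
The three command lines above differ only in which optional flags are present, so a script that submits jobs can assemble them from one helper. The sketch below is a hedged illustration: build_submit_command is a hypothetical name, and it uses only the flags shown on this page (-submit, -f, -Ft, -factory). It builds the argument list and does not run anything.

```python
import shlex
from typing import List, Optional


def build_submit_command(
    job_file: str,
    factory_type: Optional[str] = None,
    factory_url: Optional[str] = None,
) -> List[str]:
    """Assemble a globusrun-ws submit command as an argument list."""
    cmd = ["globusrun-ws", "-submit"]
    if factory_url:
        # Remote submission: address of the Managed Job Factory Service.
        cmd += ["-factory", factory_url]
    if factory_type:
        # Scheduler selection, e.g. PBS.
        cmd += ["-Ft", factory_type]
    cmd += ["-f", job_file]
    return cmd


if __name__ == "__main__":
    remote = build_submit_command(
        "simpleQueue.xml",
        factory_type="PBS",
        factory_url="https://tg-login1.sdsc.teragrid.org:8443/wsrf/services/ManagedJobFactoryService",
    )
    # Show the command as it would be typed at a shell prompt.
    print(shlex.join(remote))
```

On a host with GT4 installed, the returned list could be passed to subprocess.run; keeping it as a list rather than a single string avoids shell quoting problems with file names.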

Programmatic Example

The GT4 documentation contains a fairly comprehensive guide, "Submitting a job in Java using WS GRAM", here[13].

Links & References

Globus GT 4.0 WS GRAM page [14].

Submitting a job in Java using WS GRAM [15].

GramJob.java source code from cvs [16].

GT 4.0 WS GRAM Developer's Guide [17].




--Steve Mock, mock@sdsc.edu, 22:08, 25 January 2007 (GMT)
