
Collection Job

A collection job is a set of mutually independent jobs that is submitted, monitored and controlled as a single request. It is a simple way to submit multiple jobs at once, and it is particularly useful when the sub-jobs share common input files: the WMS allows sandboxes to be shared and inherited, so only a single copy of each file is transferred even when it is used by many sub-jobs.
Collection Job JDL Example

[

  Type = "collection";

  InputSandbox = {

    "input_common1.txt",

    "input_common2.txt"

  };

nodes = {

   [

     JobType = "Normal";

     NodeName = "node1";

     Executable = "/bin/sh";

     Arguments = "script_node1.sh";

     InputSandbox = {"script_node1.sh",

                      root.InputSandbox[0]

                    };

     StdOutput = "myoutput1";

     StdError  = "myerror1";

     OutputSandbox = {"myoutput1","myerror1"};

     ShallowRetryCount = 1;

   ],

[

     JobType = "Normal";

     NodeName = "node2";

     Executable = "/bin/sh";

     InputSandbox = {"script_node2.sh",

                    root.InputSandbox[1]

                  };

     Arguments = "script_node2.sh";

     StdOutput = "myoutput2";

     StdError  = "myerror2";

     OutputSandbox = {"myoutput2","myerror2"};

    ShallowRetryCount = 1;

   ],

 [

     JobType = "Normal";

     NodeName = "node3";

     Executable = "/bin/cat";

     InputSandbox = {root.InputSandbox};

     Arguments = "*.txt";

     StdOutput = "myoutput3";

     StdError  = "myerror3";

     OutputSandbox = {"myoutput3","myerror3"};

    ShallowRetryCount = 1;

   ]

 };

]

script_node1.sh

#!/bin/sh

echo "Current date is `date`"

echo "Dumping now input files"

echo "**********************"

cat *.txt

script_node2.sh

#!/bin/sh

echo "Running machine is `hostname`"

ls -l

echo "Dumping now input files"

echo "**********************"

cat *.txt

input_common1.txt

first input

input_common2.txt

second input
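What node3 will print can be checked locally before submission: both shared sandbox files land in the job's working directory on the worker node, and node3 simply runs /bin/cat over them. A sketch (the /tmp/collection-demo path is just a scratch directory for the example):

```shell
# Recreate the two shared sandbox files in a scratch directory and run
# node3's command (/bin/cat with argument *.txt) against them.
mkdir -p /tmp/collection-demo
cd /tmp/collection-demo
echo "first input"  > input_common1.txt
echo "second input" > input_common2.txt
cat *.txt   # node3's StdOutput ("myoutput3") will contain these two lines
```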

 

Working with Input Data From the Storage Element

 

BlastParametric.jdl

JobType        = "Parametric";

Executable     = "blastparam.sh";

Arguments      = "input output _PARAM_";

PerusalFileEnable = true;

PerusalTimeInterval = 120;

Parameters = 151;

ParameterStart = 1;

ParameterStep = 1;

StdOutput      = "std.out";

StdError       = "std.err";

InputSandbox   = {"blastparam.sh"};

OutputSandbox  = {"std.err","std.out"};

Requirements = Member("VO-academicgrid-BLAST", other.GlueHostApplicationSoftwareRunTimeEnvironment);
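For each value the parameter takes, the WMS generates one sub-job with _PARAM_ substituted into Arguments. Assuming the usual gLite semantics (values run from ParameterStart in steps of ParameterStep and stop before reaching Parameters), this JDL yields 150 sub-jobs fetching frag1.fasta through frag150.fasta. The expansion can be previewed locally:

```shell
# Sketch: enumerate the _PARAM_ values for ParameterStart=1,
# ParameterStep=1, Parameters=151 (treated as an exclusive upper bound),
# and the fragment file each sub-job downloads in blastparam.sh.
START=1; STEP=1; STOP=151
seq "$START" "$STEP" "$((STOP - 1))" |
  while read -r PARAM; do echo "frag${PARAM}.fasta"; done \
  > /tmp/param-files.txt
head -2 /tmp/param-files.txt   # first fragments: frag1.fasta, frag2.fasta
wc -l < /tmp/param-files.txt   # one line per sub-job
```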

blastparam.sh

#!/bin/sh

export PATH=$PATH:$VO_ACADEMICGRID_SW_DIR/blast/bin

export LFC_HOST=lfc.biruni.upm.my

FOLDER_PREFIX=/grid/academicgrid/farhan/blast/full

FOLDER_FASTA=$1

FOLDER_RESULT=$2

FASTA_FILE=frag$3.fasta

OUTPUT_FILE=$FASTA_FILE.xml

FASTA_LFC=$FOLDER_PREFIX/$FOLDER_FASTA/$FASTA_FILE

lcg-cp lfn:$FASTA_LFC file:$FASTA_FILE

export BLASTDB=/opt/blastdb

export BLASTMAT=$VO_ACADEMICGRID_SW_DIR/blast/data

echo "Blast started at `date`"

blastall -b 20 -v 20 -p blastx -e 0.001 -m 7 -d nr -i $FASTA_FILE -o $OUTPUT_FILE

echo "Blast ended at `date`"

lcg-cr -l lfn:$FOLDER_PREFIX/$FOLDER_RESULT/$OUTPUT_FILE -d dpm.biruni.upm.my file:$OUTPUT_FILE

 

Working with Requirement

 

The Requirements attribute can be used to express any kind of constraint on the resources where the job can run. Its value is a Boolean expression that must evaluate to true on a given CE for the job to be eligible to run there. There are several ways a requirement can be expressed:
  • Forming expressions: To force a job to only run on a particular CE:
    • Requirements = other.GlueCEName == "academicgrid";
    • Requirements = other.GlueCEName == "academicgrid" && other.GlueCEInfoTotalCPUs > 1;
    • Requirements = !(other.GlueCEInfoTotalCPUs < 10);

  • Member: The Member function is used to run only on clusters that provide the required software:
    • Requirements = Member("VO-academicgrid-OPENFOAM", other.GlueHostApplicationSoftwareRunTimeEnvironment);
    • Use the lcg-info --list-ce --vo academicgrid --attr 'Tag' command to check which software tags the grid sites provide.
  • RegExp: The RegExp function can be used to test whether a supplied string matches a regular expression, for example:
    • Requirements = RegExp("biruni.upm.my", other.GlueCEUniqueId);

  • Gangmatching: In order to specify requirements involving three entities (i.e., the job, the CE and an SE), the WMS uses a special match-making mechanism called gangmatching. This is supported by some JDL functions: anyMatch, whichMatch, allMatch. For example, to ensure that the job runs on a CE with at least 200 MB of free disk space on a close SE, the following JDL expression can be used:
    • Requirements = anyMatch(other.storage.CloseSEs,target.GlueSAStateAvailableSpace > 204800);

  • Rank: The choice of the CE where the job is executed, among all those satisfying the requirements, is based on the rank of the CE, a quantity expressed as a floating-point number. The CE with the highest rank is selected. The user can define the rank with the Rank attribute as a function of the CE attributes. The default definition takes into account the number of free CPUs on the CE:
    • Rank = other.GlueCEStateFreeCPUs;
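The pieces above combine naturally: Requirements filters the candidate CEs, and Rank orders the survivors. A sketch (reusing the BLAST tag from the examples above) that prefers the site with the shortest estimated wait, negating the response time so that the highest rank wins:

```
Requirements = Member("VO-academicgrid-BLAST",
                      other.GlueHostApplicationSoftwareRunTimeEnvironment);
Rank = -other.GlueCEStateEstimatedResponseTime;
```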

List of tags that can be used:

 

Tag | Description
GlueCEUniqueID | name of the Computing Element, including the queue (see lcg-infosites --vo academicgrid ce)
GlueCEPolicyMaxCPUTime | required CPU time (affects esp. the assignment to a queue)
GlueCEPolicyMaxWallClockTime | required total running time (also affects the assignment to a queue)
GlueHostOperatingSystemName | name of the operating system on the worker node
GlueHostOperatingSystemRelease | release of the operating system on the worker node
GlueHostArchitecturePlatformType | processor architecture on the worker node
GlueHostApplicationSoftwareRunTimeEnvironment | list of tags set by the software managers of the VO; has to be checked with Member
GlueCEStateEstimatedResponseTime | estimated time between submission to the site and start of the job on a worker node (for ranking)
GlueCEStateFreeCPUs | number of free CPUs on a site (useful for ranking)
GlueHostNetworkAdapterOutboundIP | boolean variable that specifies whether the worker node has outbound IP connectivity, i.e. a way to the internet

 

 

Directed Acyclic Graph Job

A DAG (directed acyclic graph) represents a set of jobs where the input, output, or execution of one or more jobs depends on one or more other jobs. It is very useful for solving complex jobs with multiple dependent processes or workflows. DAG-type jobs are supported by EMI/UMD WMS version 2 onwards. Below is an example of a DAG relationship:

 

dag.jdl

[

Type = "dag";

InputSandbox = {"job.sh", "job2.sh"};

Nodes = [

  nodeA = [

    Description = [

    JobType = "Normal";

    Executable = "job.sh";

    Arguments = "A";

    StdOutput = "std.out";

    StdError = "std.err";

    InputSandbox = {root.InputSandbox[0]};

    OutputSandbox = {"std.out","std.err"};

    ];

  ];

  nodeB = [

    Description = [

    JobType = "Normal";

    Executable = "job2.sh";

    Arguments = "B";

    StdOutput = "std.out";

    StdError = "std.err";

    InputSandbox = {root.InputSandbox[1]};

    OutputSandbox = {"std.out","std.err"};

    ];

  ];

  nodeC = [

    Description = [

    JobType = "Normal";

    Executable = "job3.sh";

    Arguments = "C";

    StdOutput = "std.out";

    StdError = "std.err";

    InputSandbox = {"job3.sh"};

    OutputSandbox = {"std.out","std.err"};

     ];

  ];

  nodeD = [

  Description = [

    JobType = "Normal";

    Executable = "job.sh";

    Arguments = "D";

    StdOutput = "std.out";

    StdError = "std.err";

    InputSandbox = {root.InputSandbox[0]};

    OutputSandbox = {"std.out","std.err"};

    ];

  ];

];

Dependencies = { {nodeA,nodeB},{nodeA,nodeC},{{nodeB,nodeC},nodeD} };

]
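The Dependencies attribute above reads: nodeA must finish before nodeB and before nodeC, and nodeD starts only after both nodeB and nodeC have finished; nodeB and nodeC themselves are independent, so the WMS may run them concurrently. A plain-shell sketch of that ordering (not WMS syntax), with echo standing in for running each node:

```shell
# Sketch of the execution order enforced by
# Dependencies = { {nodeA,nodeB},{nodeA,nodeC},{{nodeB,nodeC},nodeD} }
: > /tmp/dag-order.txt                              # start with an empty log
run() { echo "running node$1" | tee -a /tmp/dag-order.txt; }
run A        # no predecessors: runs first
run B        # needs A
run C        # needs A; independent of B, so the WMS could run it in parallel
run D        # needs both B and C
```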

job.sh

#!/bin/bash

echo "Job $1 - `date` - BEGIN"

hostname

sleep 100

echo "Job $1 - `date` - END"

job2.sh

#!/bin/bash

echo "This is job2"

echo "Job $1 - `date` - BEGIN"

hostname

sleep 100

echo "Job $1 - `date` - END"

job3.sh

#!/bin/bash

echo "This is job3"

echo "Job $1 - `date` - BEGIN"

hostname

sleep 100

echo "Job $1 - `date` - END"
