/*
 * Title.......: Identity Example
 * Version.....: 1.0
 * Date........: Oct 13 2004
 * Author......: Marco Pappalardo INFN Catania marco.pappalardo@ct.infn.it
 * Support.....: grid-prod@ct.infn.it //for bugs and comments
 * Thanks To...: Alessio Gianelle //please donate to him
 * Disclaimer..:) This example is distributed as is for Gilda.
 *               The author will not be responsible for any damage that may occur
 *               to people using it. Be Happy and Enjoy the GRID!!!!
 */

/*
 * File........: ReadMe file with instructions for use of the example.
 */

THE IDENTITY EXAMPLE
====================

This example contains a DAG made of 9 different nodes. Each node consists of a job
running a script that we'll show in a while. For the moment, let's focus on the DAG
itself by looking at the JDL DAG description listed below.

[
  type = "dag";
  max_nodes_running = 10;
  nodes = [
    nodeA = [
      node_type = "edg-jdl";
      file ="nodes/nodeA.jdl" ;
    ];
    nodeB = [
      node_type = "edg-jdl";
      file ="nodes/nodeB.jdl" ;
    ];
    nodeC = [
      node_type = "edg-jdl";
      file ="nodes/nodeC.jdl" ;
    ];
    nodeD = [
      node_type = "edg-jdl";
      file ="nodes/nodeD.jdl" ;
    ];
    nodeE = [
      node_type = "edg-jdl";
      file ="nodes/nodeE.jdl" ;
    ];
    nodeF = [
      node_type = "edg-jdl";
      file ="nodes/nodeF.jdl" ;
    ];
    nodeG = [
      node_type = "edg-jdl";
      file ="nodes/nodeG.jdl" ;
    ];
    nodeH = [
      node_type = "edg-jdl";
      file ="nodes/nodeH.jdl" ;
    ];
    nodeI = [
      node_type = "edg-jdl";
      file ="nodes/nodeI.jdl" ;
    ];
    dependencies = {
      { nodeA, nodeB }, { nodeB, nodeC }, { nodeD, nodeE }, { nodeG, nodeH }
    }
  ];
]

Defining a node is very simple. Within the nodes attribute section, insert a new node
by writing as follows:

nodes = [
  ....
  nodeA = [
    node_type = "edg-jdl";
    file ="nodes/nodeA.jdl" ;
  ];
  ...
];

The node_type attribute contains the type of the node just defined. Currently "edg-jdl"
is the only supported type. The file attribute contains the pathname of the node JDL
file, which describes what the job/node will do when executed.
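Since every node entry follows the same pattern, the entries can also be generated
instead of typed by hand. The lines below are only a sketch in Python: the helper name
node_entry is made up for this illustration, and the nodes/ directory layout is simply
the one used in this example. It is not part of any EDG tool.

```python
def node_entry(name, jdl_dir="nodes"):
    """Return the JDL text that declares one DAG node.

    Hypothetical helper: it only mirrors the layout of the entries
    used in this example.
    """
    return (
        f'{name} = [\n'
        f'  node_type = "edg-jdl";\n'
        f'  file = "{jdl_dir}/{name}.jdl";\n'
        f'];'
    )


# The nine nodes of the identity example: nodeA ... nodeI.
names = ["node" + letter for letter in "ABCDEFGHI"]
entries = [node_entry(name) for name in names]
print("\n".join(entries))
```

The printed text can be pasted inside the nodes = [ ... ] section, before the
dependencies attribute.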
In our example, nodeA is described by the nodes/nodeA.jdl JDL file:

[
  Executable = "identity_message.sh";
  Arguments = "NodeA";
  RetryCount = 2 ;
  VirtualOrganisation = "gilda";
  InputSandbox = {"nodes/identity_message.sh"};
  Stdoutput = "std.out" ;
  StdError = "std.err" ;
  OutputSandbox ={ "std.out" ,"std.err"} ;
  rank = 1.0;
]

We won't describe this JDL in depth; you should learn it from the public standard JDL
documentation. Let's simply underline the meaning of the attributes used in our example:

1. Executable. A script or another executable file that will be run as soon as the
   node is run.
2. Arguments. Command line arguments to be passed to the executable.
3. InputSandbox. The executable and, if needed, all input files.
4. StdOutput. The name of the output file to fill with the standard output produced
   by the job.
5. StdError. The name of the output file to fill with the standard error stream
   produced by the job.
6. OutputSandbox. The list of output files of the job to be retrieved at the end of
   execution.

Some nodes can run independently from the rest of the DAG; others are subject to the
dependency conditions listed in the bottom part of the DAG JDL. The dependencies among
the nodes are expressed as pairs of node names. As an example, to specify that nodeB
depends on (must wait for the completion of) nodeA, we write { nodeA, nodeB }. You can
read it as "nodeA precedes nodeB". A "triangular dependency", by which nodeB and nodeC
both depend on the completion of nodeA, can be described through

  {nodeA, nodeB }, {nodeA, nodeC}

as well as through

  {nodeA, {nodeB, nodeC} }

In our example, dependencies are set as follows:

nodes = [
  ....
  nodeA = [
    node_type = "edg-jdl";
    file ="nodes/nodeA.jdl" ;
  ];
  ...
  dependencies = {
    { nodeA, nodeB }, { nodeB, nodeC }, { nodeD, nodeE }, { nodeG, nodeH }
  }
];

In practice:
  A precedes B, which precedes C
  D precedes E
  F is independent
  G precedes H
  I is independent

SUBMITTING A DAG
================

Here you can find the command line and the standard output produced. Let's submit our
first DAG. Please notice that the edg-dag-submit command must be used.

grid-demo1:~/DAGexample> edg-dag-submit --vo gilda identity_ex/identity_dag.jdl

Selected Virtual Organisation name (from --vo option): gilda
Connecting to host grid007.ct.infn.it, port 7772
Logging to host grid007.ct.infn.it, port 9002

*********************************************************************************************
JOB SUBMIT OUTCOME
The dag has been successfully submitted to the Network Server.
Use edg-job-status command to check job current status. Your dag identifier (edg_jobId) is:

- https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ

*********************************************************************************************

After submitting our DAG, we have to monitor its status while waiting for its completion.

CHECKING DAG STATUS
===================

The status of the DAG is monitored through periodic edg-job-status commands. The
edg-job-status command requires the edg_jobId of the DAG (see the previous submit outcome).
grid-demo1:~/DAGexample> edg-job-status https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ
Current Status: Running

- Nodes information for:
    Status info for the Job : https://grid007.ct.infn.it:9000/wCDw2nJN2qAXuKzPSuw4Ag
    Current Status: Waiting
    Status info for the Job : https://grid007.ct.infn.it:9000/d4u0AIMxwqPfkcJFhnntwQ
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/94S253rmkJ9toLx2WVTGiQ
    Current Status: Waiting
    Status info for the Job : https://grid007.ct.infn.it:9000/lux4ceqUdUMrOCi18nCBDA
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/iJBvrVeLAM3ZhEBXYzs_0A
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/isqLkOSkFWny9Kna_4JAug
    Current Status: Waiting
    Status info for the Job : https://grid007.ct.infn.it:9000/QA-KMEKOUZxxM8bvUfuglw
    Current Status: Waiting
    Status info for the Job : https://grid007.ct.infn.it:9000/GedxAULdmIQTGCQecT0fYg
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/FZkQw5yZ0bcsOUWiMGYbuw
    Current Status: Waiting
*************************************************************

You can repeat the status command as many times as you want in order to follow the DAG
execution. You'll see the node statuses change.
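If you save the status report to a file, you can summarise it with a few lines of
Python instead of reading it by eye. This is only a sketch: the function name
parse_status is made up here, and it assumes the usual layout where each
"Status info for the Job" line is followed by its "Current Status" line and the first
entry is the DAG itself.

```python
import re
from collections import Counter


def parse_status(text):
    """Return (job_id, status) pairs found in an edg-job-status report.

    Assumption: every "Status info for the Job" line is followed by a
    "Current Status:" line, as in the reports shown in this ReadMe.
    """
    pattern = re.compile(
        r"Status info for the Job\s*:\s*(\S+)\s+Current Status:\s*([^\n]+)"
    )
    return [(job, status.strip()) for job, status in pattern.findall(text)]


sample = """\
Status info for the Job : https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ
Current Status: Running
- Nodes information for:
    Status info for the Job : https://grid007.ct.infn.it:9000/wCDw2nJN2qAXuKzPSuw4Ag
    Current Status: Waiting
    Status info for the Job : https://grid007.ct.infn.it:9000/d4u0AIMxwqPfkcJFhnntwQ
    Current Status: Submitted
"""

pairs = parse_status(sample)
dag, nodes = pairs[0], pairs[1:]          # first entry is the DAG itself
counts = Counter(status for _, status in nodes)
print("DAG:", dag[1], "| nodes:", dict(counts))
```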
grid-demo1:~/DAGexample> edg-job-status https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ
Current Status: Running

- Nodes information for:
    Status info for the Job : https://grid007.ct.infn.it:9000/wCDw2nJN2qAXuKzPSuw4Ag
    Current Status: Scheduled
    Status info for the Job : https://grid007.ct.infn.it:9000/d4u0AIMxwqPfkcJFhnntwQ
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/94S253rmkJ9toLx2WVTGiQ
    Current Status: Scheduled
    Status info for the Job : https://grid007.ct.infn.it:9000/lux4ceqUdUMrOCi18nCBDA
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/iJBvrVeLAM3ZhEBXYzs_0A
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/isqLkOSkFWny9Kna_4JAug
    Current Status: Scheduled
    Status info for the Job : https://grid007.ct.infn.it:9000/QA-KMEKOUZxxM8bvUfuglw
    Current Status: Scheduled
    Status info for the Job : https://grid007.ct.infn.it:9000/GedxAULdmIQTGCQecT0fYg
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/FZkQw5yZ0bcsOUWiMGYbuw
    Current Status: Scheduled
*************************************************************

Let's check again.
grid-demo1:~/DAGexample> edg-job-status https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ
Current Status: Running

- Nodes information for:
    Status info for the Job : https://grid007.ct.infn.it:9000/wCDw2nJN2qAXuKzPSuw4Ag
    Current Status: Scheduled
    Status info for the Job : https://grid007.ct.infn.it:9000/d4u0AIMxwqPfkcJFhnntwQ
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/94S253rmkJ9toLx2WVTGiQ
    Current Status: Scheduled
    Status info for the Job : https://grid007.ct.infn.it:9000/lux4ceqUdUMrOCi18nCBDA
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/iJBvrVeLAM3ZhEBXYzs_0A
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/isqLkOSkFWny9Kna_4JAug
    Current Status: Scheduled
    Status info for the Job : https://grid007.ct.infn.it:9000/QA-KMEKOUZxxM8bvUfuglw
    Current Status: Scheduled
    Status info for the Job : https://grid007.ct.infn.it:9000/GedxAULdmIQTGCQecT0fYg
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/FZkQw5yZ0bcsOUWiMGYbuw
    Current Status: Scheduled
*************************************************************

grid-demo1:~/DAGexample>

If you need more detail from the command, use the -v (verbosity) option to raise the
verbosity level to 1 or 2 (it is commonly set to 0).
grid-demo1:~/DAGexample> edg-job-status -v 2 https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ ************************************************************* BOOKKEEPING INFORMATION: Status info for the Job : https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ Current Status: Running Status Reason: unavailable Destination: dagman Submitted: Fri Oct 15 14:46:40 2004 CEST --- - cancelling = 0 - ce_node = <193.206.208.12:33456> - children_num = 9 - condorId = 19 - cpuTime = 0 - destination = dagman - done_code = 0 - expectUpdate = 0 - jobtype = 1 - lastUpdateTime = Fri Oct 15 14:47:10 2004 CEST - location = LRMS/worknode/<193.206.208.12:33456> - network_server = grid007.ct.infn.it:7772 - owner = /C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappalardo@ct.infn.it - resubmitted = 0 - seed = uLU0BArrdV98O41PLThJ5Q - subjob_failed = 0 ************************************************************* - Nodes information for: https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ: Status info for the Job : https://grid007.ct.infn.it:9000/wCDw2nJN2qAXuKzPSuw4Ag Current Status: Scheduled Status Reason: Job successfully submitted to Globus Destination: cetilab.tilab.com:2119/jobmanager-lcgpbs-infinite Submitted: Fri Oct 15 14:46:40 2004 CEST Parent Job: https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ --- - cancelling = 0 - children_num = 0 - condorId = 24 - cpuTime = 0 - destination = cetilab.tilab.com:2119/jobmanager-lcgpbs-infinite - done_code = 0 - expectUpdate = 0 - jobtype = 0 - lastUpdateTime = Fri Oct 15 14:47:44 2004 CEST - location = LRMS/cetilab.tilab.com:2119/jobmanager-lcgpbs//var/edgwl/logmonitor/CondorG.log/dag.https_3a_2f_2fgrid007.ct.infn.it_3a9000_2fyJXRixjP5feimRRs0hqZeQ.log - network_server = grid007.ct.infn.it:7772 - owner = /C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappalardo@ct.infn.it - resubmitted = 0 - seed = sW6nRSW0uyBfecwst_MUfA - subjob_failed = 0 - user 
tags = edg_wl_ui_dagnodename=nodeF; ************************************************************* Status info for the Job : https://grid007.ct.infn.it:9000/d4u0AIMxwqPfkcJFhnntwQ Current Status: Submitted Submitted: Fri Oct 15 14:46:40 2004 CEST Parent Job: https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ --- - cancelling = 0 - children_num = 0 - cpuTime = 0 - done_code = 0 - expectUpdate = 0 - jobtype = 0 - lastUpdateTime = Fri Oct 15 14:46:41 2004 CEST - network_server = grid007.ct.infn.it:7772 - owner = /C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappalardo@ct.infn.it - resubmitted = 0 - seed = sW6nRSW0uyBfecwst_MUfA - subjob_failed = 0 - user tags = edg_wl_ui_dagnodename=nodeB; ************************************************************* Status info for the Job : https://grid007.ct.infn.it:9000/94S253rmkJ9toLx2WVTGiQ Current Status: Scheduled Status Reason: Job successfully submitted to Globus Destination: cetilab.tilab.com:2119/jobmanager-lcgpbs-ifinite Submitted: Fri Oct 15 14:46:40 2004 = LRMS/cetilab.tilab.com:2119/jobmanager-lcgpbs//var/edgwl/logmonitor/CondorG.log/dag.https_3a_2f_2fgrid007.ct.infn.it_3a9000_2fyJP5feimRRs0hqZeQ.log - network_server = grid007.ct.infn.it:7772 - owner = /C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappalardo@ct.infn.it - resubmitted = 0 - seed = sW6nRSW0uyBfecwst_MUfA - subjob_failed = 0 - user tags = edg_wl_ui_dagnodename=nodeA; ************************************************************* GETTING LOGGING INFO OF NODES OR DAGS ===================================== Another way to retrieve information on the status of the DAG or one of its nodes is using edg-job-get-logging-info command. E.g. 
for the DAG type:

grid-demo1:~/DAGexample> edg-job-get-logging-info https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ

**********************************************************************
LOGGING INFORMATION:

Printing info for the Job : https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ

Event: RegJob
Event: Transfer
Event: Transfer
Event: Accepted
Event: EnQueued
Event: DeQueued
Event: Match
Event: EnQueued
Event: EnQueued
Event: DeQueued
Event: Transfer
Event: Transfer
Event: Accepted
Event: Running
**********************************************************************

Verbosity can be set on the command line for this command too.
For a node the command is the same. Just remember to change the JobId!!!

grid-demo1:~/DAGexample> edg-job-get-logging-info https://grid007.ct.infn.it:9000/FZkQw5yZ0bcsOUWiMGYbuw

**********************************************************************
LOGGING INFORMATION:

Printing info for the Job : https://grid007.ct.infn.it:9000/FZkQw5yZ0bcsOUWiMGYbuw

Event: RegJob
Event: RegJob
Event: UserTag
Event: HelperCall
Event: Match
Event: Running
Event: Done
Event: Accepted
Event: Transfer
Event: HelperReturn
**********************************************************************

FAILURE HANDLING
================

In this example, we'll continue monitoring the status of the DAG while waiting for its
completion.
So, enter the status command repeatedly in order to follow the execution:

grid-demo1:~/DAGexample> edg-job-status https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ
Current Status: Running

- Nodes information for:
    Status info for the Job : https://grid007.ct.infn.it:9000/wCDw2nJN2qAXuKzPSuw4Ag
    Current Status: Running
    Status info for the Job : https://grid007.ct.infn.it:9000/d4u0AIMxwqPfkcJFhnntwQ
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/94S253rmkJ9toLx2WVTGiQ
    Current Status: Running
    Status info for the Job : https://grid007.ct.infn.it:9000/lux4ceqUdUMrOCi18nCBDA
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/iJBvrVeLAM3ZhEBXYzs_0A
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/isqLkOSkFWny9Kna_4JAug
    Current Status: Running
    Status info for the Job : https://grid007.ct.infn.it:9000/QA-KMEKOUZxxM8bvUfuglw
    Current Status: Running
    Status info for the Job : https://grid007.ct.infn.it:9000/GedxAULdmIQTGCQecT0fYg
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/FZkQw5yZ0bcsOUWiMGYbuw
    Current Status: Running
*************************************************************

Wow! Five of the nine nodes are running. The ones you see still in Submitted status are
waiting for the completion of the preceding ones, due to the dependency settings. As
soon as the preceding ones are completed (Done (Success) status), they begin to change
status.
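The rule "a node can start only when all the nodes preceding it are done" can be
written down in a few lines. The Python sketch below is just an illustration of that
rule applied to the dependency pairs of this example; the function name ready_nodes is
made up here, and this is not how the workload manager is actually implemented.

```python
def ready_nodes(nodes, dependencies, done):
    """Return the nodes that are not done and whose predecessors are all done."""
    parents = {node: set() for node in nodes}
    for before, after in dependencies:
        parents[after].add(before)
    return sorted(n for n in nodes if n not in done and parents[n] <= done)


nodes = ["node" + letter for letter in "ABCDEFGHI"]
# Same pairs as the dependencies attribute of the DAG: (predecessor, successor).
dependencies = [
    ("nodeA", "nodeB"),
    ("nodeB", "nodeC"),
    ("nodeD", "nodeE"),
    ("nodeG", "nodeH"),
]

# Nothing finished yet: only the nodes without predecessors may start.
print(ready_nodes(nodes, dependencies, done=set()))
# prints ['nodeA', 'nodeD', 'nodeF', 'nodeG', 'nodeI']

# Once nodeA is done, nodeB becomes eligible as well.
print(ready_nodes(nodes, dependencies, done={"nodeA", "nodeF"}))
# prints ['nodeB', 'nodeD', 'nodeG', 'nodeI']
```

The first call returns five nodes, the same number of nodes that are not held back by
a dependency in the status report.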
Look below:

grid-demo1:~/DAGexample> edg-job-status https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ
Current Status: Running

- Nodes information for:
    Status info for the Job : https://grid007.ct.infn.it:9000/wCDw2nJN2qAXuKzPSuw4Ag
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/d4u0AIMxwqPfkcJFhnntwQ
    Current Status: Scheduled
    Status info for the Job : https://grid007.ct.infn.it:9000/94S253rmkJ9toLx2WVTGiQ
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/lux4ceqUdUMrOCi18nCBDA
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/iJBvrVeLAM3ZhEBXYzs_0A
    Current Status: Scheduled
    Status info for the Job : https://grid007.ct.infn.it:9000/isqLkOSkFWny9Kna_4JAug
    Current Status: Running
    Status info for the Job : https://grid007.ct.infn.it:9000/QA-KMEKOUZxxM8bvUfuglw
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/GedxAULdmIQTGCQecT0fYg
    Current Status: Submitted
    Status info for the Job : https://grid007.ct.infn.it:9000/FZkQw5yZ0bcsOUWiMGYbuw
    Current Status: Done (Success)
*************************************************************

Unfortunately, not every submission ends in success!! It can happen that a DAG fails,
and in our case the DAG failed. In fact, you can see Done (Failed) entries in the next
status report.
grid-demo1:~/DAGexample> edg-job-status https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ
Current Status: Done (Exit Code !=0)

- Nodes information for:
    Status info for the Job : https://grid007.ct.infn.it:9000/wCDw2nJN2qAXuKzPSuw4Ag
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/d4u0AIMxwqPfkcJFhnntwQ
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/94S253rmkJ9toLx2WVTGiQ
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/lux4ceqUdUMrOCi18nCBDA
    Current Status: Done (Failed)
    Status info for the Job : https://grid007.ct.infn.it:9000/iJBvrVeLAM3ZhEBXYzs_0A
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/isqLkOSkFWny9Kna_4JAug
    Current Status: Done (Failed)
    Status info for the Job : https://grid007.ct.infn.it:9000/QA-KMEKOUZxxM8bvUfuglw
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/GedxAULdmIQTGCQecT0fYg
    Current Status: Done (Failed)
    Status info for the Job : https://grid007.ct.infn.it:9000/FZkQw5yZ0bcsOUWiMGYbuw
    Current Status: Done (Success)
*************************************************************

The failure is related to the failure of the job identified by the
https://grid007.ct.infn.it:9000/isqLkOSkFWny9Kna_4JAug jobId. We need to find the
reason. Was there any error in my JDLs (in my case the answer is obviously not :) but
anyone should check/ask first!!!!)? Did anything bad happen to the CE or WN where my
nodes were sent? Let's try to understand why: look at the line marked with
"***** ERROR IS HERE *****" on the right-hand side!!!
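When the verbose logging report is long, the failing events can also be picked out
with a short script once the report is saved to a file. The Python lines below are
only a sketch: the function name failed_events is made up here, and it assumes the
report keeps one "Event:" line per event followed by one "- key = value" line per
field, as printed by edg-job-get-logging-info -v 2.

```python
def failed_events(text):
    """Return the events of a verbose logging report whose status_code is FAILED.

    Assumption: each event starts with an "Event:" line and its fields
    follow as "- key = value" lines.
    """
    events, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("- ") and " = " in line:
            if current is not None:
                key, value = line[2:].split(" = ", 1)
                current[key.strip()] = value.strip()
        elif "Event:" in line:
            current = {"event": line.split("Event:", 1)[1].strip()}
            events.append(current)
    return [e for e in events if e.get("status_code") == "FAILED"]


sample = """\
---
Event: Done
- exit_code = 0
- reason = (nil)
- source = LRMS
- status_code = OK
---
Event: Done
- exit_code = 1
- reason = Globus resource down
- source = LogMonitor
- status_code = FAILED
"""

failures = failed_events(sample)
for event in failures:
    print(event["event"], "from", event["source"], ":", event["reason"])
```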
grid-demo1:~/DAGexample> edg-job-get-logging-info -v 2 https://grid007.ct.infn.it:9000/isqLkOSkFWny9Kna_4JAug ********************************************************************** LOGGING INFORMATION: Printing info for the Job : https://grid007.ct.infn.it:9000/isqLkOSkFWny9Kna_4JAug --- Event: RegJob - arrived = Fri Oct 15 12:46:40 2004 - host = grid007.ct.infn.it - ns = grid007.ct.infn.it:7772 - nsubjobs = 0 - parent = https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ - source = UserInterface - src_instance = (nil) - timestamp = Fri Oct 15 12:46:40 2004 - user = /C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappalardo@ct.infn.it --- Event: RegJob - arrived = Fri Oct 15 12:46:41 2004 - host = grid-demo1.ct.infn.it - ns = grid007.ct.infn.it:7772 - nsubjobs = 0 - parent = https://grid007.ct.infn.it:9000/yJXRixjP5feimRRs0hqZeQ - seed = sW6nRSW0uyBfecwst_MUfA - source = UserInterface - timestamp = Fri Oct 15 12:46:41 2004 - user = /C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappalardo@ct.infn.it --- Event: UserTag - arrived = Fri Oct 15 12:46:42 2004 - host = grid-demo1.ct.infn.it - name = edg_wl_ui_dagnodename - source = UserInterface - timestamp = Fri Oct 15 12:46:42 2004 - user = /C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappalardo@ct.infn.it - value = nodeD --- Event: HelperCall - arrived = Fri Oct 15 12:47:05 2004 - helper_name = BigHelper - helper_params = /var/edgwl/jobcontrol/submit/yJ/dag.https_3a_2f_2fgrid007.ct.infn.it_3a9000_2fyJXRixjP5feimRRs0hqZeQ/ad.https_3a_2f_2fgrid007.ct.infn.it_3a9000_2fisqLkOSkFWny9Kna_5f4JAug /var/edgwl/jobcontrol/submit/yJ/dag.https_3a_2f_2fgrid007.ct.infn.it_3a9000_2fyJXRixjP5feimRRs0hqZeQ/Condor.https_3a_2f_2fgrid007.ct.infn.it_3a9000_2fisqLkOSkFWny9Kna_5f4JAug.submit - host = grid007.ct.infn.it - source = BigHelper - src_role = CALLED - timestamp = Fri Oct 15 12:47:05 2004 - user = 
/C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappalardo@ct.infn.it --- Event: Match - arrived = Fri Oct 15 12:47:14 2004 - dest_id = cetilab.tilab.com:2119/jobmanager-lcgpbs-infinite - host = grid007.ct.infn.it - source = BigHelper - timestamp = Fri Oct 15 12:47:14 2004 - user = /C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappalardo@ct.infn.it --- Event: Running - arrived = Fri Oct 15 12:55:59 2004 - host = wn2tilab.tilab.com - node = wn2tilab.tilab.com - source = LRMS - timestamp = Fri Oct 15 12:55:59 2004 - user = /C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappalardo@ct.infn.it --- Event: Done - arrived = Fri Oct 15 12:56:03 2004 - exit_code = 0 - host = wn2tilab.tilab.com - reason = (nil) - source = LRMS - status_code = OK - timestamp = Fri Oct 15 12:56:02 2004 - user = /C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappa ardo@ct.infn.it --- Event: Accepted - arrived = Fri Oct 15 12:47:22 2004 - from = JobController - from_host = localhost - from_instance = unavailable - host = grid007.ct.infn.it - local_jobid = 23 - source = LogMonitor - src_instance = unique - timestamp = Fri Oct 15 12:47:22 2004 - user = anonymous --- Event: Transfer - arrived = Fri Oct 15 12:47:44 2004 - dest_host = cetilab.tilab.com:2119/jobmanager-lcgpbs - dest_instance = /var/edgwl/logmonitor/CondorG.log/dag.https_3a_2f_2fgrid007.ct.infn.it_3a9000_2fyJXRixjP5feimRRs0hqZeQ.log - dest_jobid = unavailable - destination = LRMS - host = grid007.ct.infn.it - reason = Job successfully submitted to Globus - result = OK - source = LogMonitor - src_instance = unique - timestamp = Fri Oct 15 12:47:44 2004 - user = anonymous --- Event: Running - arrived = Fri Oct 15 12:58:01 2004 - host = grid007.ct.infn.it - node = cetilab.tilab.com - source = LogMonitor - src_instance = unique - timestamp = Fri Oct 15 12:58:01 2004 - user = 
anonymous
---
Event: Done
- arrived = Fri Oct 15 13:17:44 2004
- exit_code = 1
- host = grid007.ct.infn.it
- reason = Globus resource down          ***** ERROR IS HERE *****
- source = LogMonitor
- src_instance = unique
- status_code = FAILED
- timestamp = Fri Oct 15 13:17:44 2004
- user = anonymous
---
Event: Done
- arrived = Fri Oct 15 13:28:12 2004
- exit_code = 1
- host = grid007.ct.infn.it
- reason = Job got an error while in the CondorG queue.
- source = LogMonitor
- src_instance = unique
- status_code = FAILED
- timestamp = Fri Oct 15 13:28:12 2004
- user = anonymous
---
Event: HelperReturn
- arrived = Fri Oct 15 12:47:15 2004
- helper_name = BigHelper
- host = grid007.ct.infn.it
- retval = 0
- source = BigHelper
- src_role = CALLED
- timestamp = Fri Oct 15 12:47:15 2004
- user = /C=IT/O=GILDA/OU=Personal Certificate/L=INFN - Catania/CN=Marco Pappalardo/Email=marco.pappalardo@ct.infn.it
**********************************************************************

As expected, the error occurred on the remote resource (CE or WN) selected for the
node, not in the JDL. The failure reason says it clearly: "Globus resource down".

SUCCESSFUL JOB AND OUTPUT RETRIEVAL
===================================

Let's try again by submitting another DAG. This time we'll be lucky and the DAG will
end successfully.

grid-demo1:~/DAGexample> edg-dag-submit --vo gilda identity_ex/identity_dag.jdl

Selected Virtual Organisation name (from --vo option): gilda
Connecting to host grid007.ct.infn.it, port 7772
Logging to host grid007.ct.infn.it, port 9002

*********************************************************************************************
JOB SUBMIT OUTCOME
The dag has been successfully submitted to the Network Server.
Use edg-job-status command to check job current status. Your dag identifier (edg_jobId) is:

- https://grid007.ct.infn.it:9000/WtEp3dKpZUPlPdnu-tWU0Q

*********************************************************************************************

After a few minutes, we'll have:

grid-demo1:~/DAGexample> edg-job-status https://grid007.ct.infn.it:9000/WtEp3dKpZUPlPdnu-tWU0Q

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://grid007.ct.infn.it:9000/WtEp3dKpZUPlPdnu-tWU0Q
Current Status: Done (Success)

- Nodes information for:
    Status info for the Job : https://grid007.ct.infn.it:9000/veqdHQ1p_kCE76LHW-RIMA
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/V3Gpl_itaaDPinAj1gxgEA
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/UUYAiYbCmH_HGAPtKnd-MQ
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/VOgVFX7vppOUzgiQaKHAiQ
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/NqcqFDNXVq_r25t2_XCodg
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/yh8NXDjyAXuRr7w-IG0V-Q
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/rarUxdlDdXochHmUqEy6_w
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/BOafn48VeqvsmRlbkYtOFA
    Current Status: Done (Success)
    Status info for the Job : https://grid007.ct.infn.it:9000/ntcWs3YB5fRP9hWfIldlqQ
    Current Status: Done (Success)
*************************************************************

We can now retrieve our output through the edg-job-get-output command, as below:

grid-demo1:~/DAGexample> edg-job-get-output --dir . https://grid007.ct.infn.it:9000/WtEp3dKpZUPlPdnu-tWU0Q

The --dir parameter provides the destination path for the output download.
In this directory you'll find a directory for the DAG and, inside it, one directory per
node. Check each node's output. For any node you should see something like this:

Ciao, I'm node nodeA!
I'm running on grid-demo1
I am part of the Hostname Identity DAG Example for GILDA.
I was born in Catania, Italy, and I'm very happy to meet you.
That's all!!!

*****************************************
* Title   | Identity_Message.sh         *
* Author  | Marco Pappalardo            *
* E-mail  | marco.pappalardo@ct.infn.it *
* Version | 1.0 - Oct 13 2004           *
*****************************************
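To check all the node outputs in one go, you can walk the DAG directory with a few
lines of Python. This is only a sketch: the function name collect_stdout is made up
here, and the demo builds a throw-away directory tree with two fake nodes. The real
sub-directory names created by edg-job-get-output may differ, so only the "one
directory per node, each with a std.out file" layout is assumed.

```python
import tempfile
from pathlib import Path


def collect_stdout(dag_dir, filename="std.out"):
    """Map each node directory name to the text of its std.out file."""
    results = {}
    for node_dir in sorted(Path(dag_dir).iterdir()):
        out_file = node_dir / filename
        if node_dir.is_dir() and out_file.is_file():
            results[node_dir.name] = out_file.read_text()
    return results


# Demo on a temporary tree; with real output, pass the DAG directory instead.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("nodeA", "nodeB"):      # hypothetical directory names
        node_dir = Path(tmp) / name
        node_dir.mkdir()
        (node_dir / "std.out").write_text(f"Ciao, I'm node {name}!\n")
    demo = collect_stdout(tmp)

for name, text in demo.items():
    print(name, "->", text.strip())
```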