Configuration

Running algorithms from the SHADOW library requires two important configuration files:

The Workflow specification

The Environment specification

These are both defined in JSON files, which are described in more detail below.

Workflow Specification

{
    "nodes": [
        {
            "comp": 119000,
            "id": 0
        },
        {
            "comp": 92000,
            "id": 1
        },
        {
            "comp": 95000,
            "id": 2
        },
        ....
        ....
    ],
    "links": [
        {
            "transfer_data": 18,
            "source": 0,
            "target": 1
        },
        {
            "transfer_data": 12,
            "source": 0,
            "target": 2
        },
        {
            "transfer_data": 9,
            "source": 0,
            "target": 3
        },
        ...
        ...
    ]
}
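As a rough illustration of how this structure can be consumed, the sketch below parses a shortened copy of the workflow above and builds a successor map from the "links" list. It uses only the Python standard library and is not part of the SHADOW API; the function name `build_successors` and the inline `workflow_text` are assumptions made for this example, and the link to task 3 is left out because that node is elided in the listing.

```python
import json

# Shortened, hypothetical copy of the workflow shown above (tasks 0-2 only).
workflow_text = """
{
    "nodes": [
        {"comp": 119000, "id": 0},
        {"comp": 92000, "id": 1},
        {"comp": 95000, "id": 2}
    ],
    "links": [
        {"transfer_data": 18, "source": 0, "target": 1},
        {"transfer_data": 12, "source": 0, "target": 2}
    ]
}
"""


def build_successors(workflow):
    """Map each task id to a list of (successor id, transfer_data) pairs."""
    successors = {node["id"]: [] for node in workflow["nodes"]}
    for link in workflow["links"]:
        # Fail early if a link refers to a task that is not declared in "nodes".
        if link["source"] not in successors or link["target"] not in successors:
            raise ValueError(f"link refers to unknown task: {link}")
        successors[link["source"]].append((link["target"], link["transfer_data"]))
    return successors


workflow = json.loads(workflow_text)
successors = build_successors(workflow)
print(successors)
```

A check like this can catch a link that points at a missing task before the file is handed to a scheduler.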

The environment in which the workflow is scheduled is defined in a separate file, so that scheduling can be tested across different environment configurations. Additionally, a workflow is unlikely to change, whereas it will regularly be run in different environments.

{
  "system": {
    "resources": {
      "cat0_m0": {
        "flops": 7.0
      },
      "cat1_m1": {
        "flops": 6.0
      },
      "cat2_m2": {
        "flops": 11.0
      }
    },
    "compute_bandwidth": {
      "cat0": 1.0,
      "cat1": 1.0,
      "cat2": 1.0
    }
  }
}
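To give a feel for how the two files might relate, the sketch below estimates how long a single task would take on each resource. It assumes that runtime is simply the task's "comp" value divided by the resource's "flops" value; this relationship is an assumption made for illustration and is not confirmed by the text above. The function name `estimated_runtimes` is likewise hypothetical and not part of SHADOW.

```python
import json

# Copy of the environment specification shown above.
environment_text = """
{
  "system": {
    "resources": {
      "cat0_m0": {"flops": 7.0},
      "cat1_m1": {"flops": 6.0},
      "cat2_m2": {"flops": 11.0}
    },
    "compute_bandwidth": {"cat0": 1.0, "cat1": 1.0, "cat2": 1.0}
  }
}
"""


def estimated_runtimes(comp, environment):
    """Return {resource name: comp / flops} (assumed cost model, see lead-in)."""
    resources = environment["system"]["resources"]
    return {name: comp / spec["flops"] for name, spec in resources.items()}


environment = json.loads(environment_text)
# Task 0 from the workflow example has "comp": 119000.
runtimes = estimated_runtimes(119000, environment)
fastest = min(runtimes, key=runtimes.get)
print(runtimes, fastest)
```

Under this assumed model, the same workflow file yields different costs as soon as the environment file changes, which is the reason for keeping the two apart.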

For large systems, with many resources of the same type, the following is common:

"resources": {
  "cat0_m0": {
    "flops": 145.0
  },
  "cat0_m1": {
    "flops": 145.0
  },
  "cat0_m2": {
    "flops": 145.0
  },
  "cat0_m3": {
    "flops": 145.0
  }
}
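Writing many identical entries by hand is tedious, so a block like the one above could be generated instead. The following is a minimal sketch using only the standard library; the helper name `make_resources` is hypothetical, and the `<category>_m<index>` naming simply follows the pattern seen in the examples.

```python
import json


def make_resources(category, count, flops):
    """Build `count` identical resource entries named <category>_m<index>."""
    return {f"{category}_m{i}": {"flops": flops} for i in range(count)}


resources = make_resources("cat0", 4, 145.0)
# Print the fragment in the same layout as the listing above.
print(json.dumps({"resources": resources}, indent=2))
```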

As mentioned earlier, it is also possible to use pre-calculated costs (i.e. completion time in seconds) when scheduling with SHADOW. This approach is less flexible for scheduling workflows, but is common in the scheduling algorithm literature. To use it, give each task in the workflow specification JSON file a list of costs, and add the ‘header’ shown in the following example.

Here, we present an example for the DAG presented in the original HEFT paper:

{
    "header": {
        "time": true
    },
    ...

    "nodes": [
        {
            "comp": [
                14,
                16,
                9
            ],
            "id": 0
        },
        ...
    ]
}
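A quick consistency check can be useful before scheduling with pre-calculated costs. The sketch below assumes that each "comp" list holds one cost per resource in the environment (three in the example above); that interpretation is an assumption, as is the helper name `tasks_with_bad_costs`. The second task in the sample data is hypothetical and deliberately one cost short.

```python
def tasks_with_bad_costs(workflow, num_resources):
    """Return ids of tasks whose "comp" is not a list of num_resources costs."""
    if not workflow.get("header", {}).get("time"):
        raise ValueError('expected "header": {"time": true}')
    bad = []
    for node in workflow["nodes"]:
        comp = node["comp"]
        if not isinstance(comp, list) or len(comp) != num_resources:
            bad.append(node["id"])
    return bad


workflow = {
    "header": {"time": True},
    "nodes": [
        {"comp": [14, 16, 9], "id": 0},  # task 0 from the example above
        {"comp": [5, 7], "id": 1},       # hypothetical task, one cost missing
    ],
}

bad = tasks_with_bad_costs(workflow, num_resources=3)
print(bad)
```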