Overview of the Grid Computing Toolbox
Calling Sequence
Grid[command](arguments)
command(arguments)
Description
• The Grid Computing Toolbox unlocks additional routines in the Grid package for distributed parallel computing with Maple. The Grid package is a Maple library that is part of the standard Maple distribution.
This package contains one submodule:
- The Server module contains commands to start and stop a grid server on the local machine.
• The Grid library contains procedures for distributing computations across an arbitrary number of machines, or across multiple CPUs on the same machine. Each machine or CPU is called a "node" in the grid.
• Grid offers MPI-like commands for message passing, along with several high-level parallel commands.
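As a rough illustration of the high-level commands, the sketch below uses Grid:-Map and Grid:-Seq, the parallel counterparts of map and seq in the Grid package. It assumes the default "local" mode, and the squaring operation is only a placeholder workload chosen for this example.

```maple
# Assumption: default "local" mode, so no grid server is required.
Grid:-Setup("local");

# Apply a function to each list element, with the work split across nodes.
Grid:-Map(x -> x^2, [1, 2, 3, 4]);

# Build a sequence in parallel, analogous to seq(i^2, i = 1 .. 4).
Grid:-Seq(i^2, i = 1 .. 4);
```

The results should agree with the serial map and seq calls on the same input. For a workload this small, the cost of starting the nodes is likely to outweigh any speedup.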
• The Grid package has three modes:
"local" - the default mode, included with Maple, for running parallel multiprocess computations on a single machine
"mpi" - a mode for interacting with the local Message Passing Interface (MPI) protocol and scheduler for batch processing
"hpc" - a mode for distributed computing on a heterogeneous network
• In "hpc" mode, each node in the grid must run a grid server process, which must be started explicitly on that node. By default, the servers discover each other automatically and the grid is assembled without further action. Auto-discovery can be turned off for use in a controlled environment, for example a PBS setup.
• The "mpi" mode requires an implementation of the MPI library specification for message passing. For more information about such libraries, see http://www.mcs.anl.gov/research/projects/mpi/. The current MPI implementation in the Grid Computing Toolbox supports the Microsoft® high-performance computing (HPC) MPI found on Microsoft® Windows® Server 2008 clusters.
• Parallel jobs can be launched interactively from within Maple by specifying the number of nodes to run on.
• Parallel jobs can also be launched from a batch system such as PBS, in which case PBS controls which nodes the job runs on.
Getting Started
• To get started quickly, you can run a personal server on your desktop machine to develop and test parallel applications before running them on a real grid. The PersonalGridServer worksheet starts such a personal server, which lets you simulate a grid with any number of nodes on your desktop.
• To configure Grid for running on a cluster of machines in "hpc" mode, you must start a Grid Server on each machine that you want to run a parallel job on. Once the servers are started, you, or anyone else with the Grid package and access to the servers, can run any number of parallel jobs until the servers are shut down. See Starting the grid server for more details. In "mpi" and "local" modes, no external Grid Server process needs to be started.
• Before launching parallel jobs, you must first choose a mode of execution by running the Grid[Setup] command. When choosing a distributed mode, either "mpi" or "hpc", the simplest next step is to execute the Grid[Launch]() command with no arguments. This opens an interactive worksheet from which you can launch jobs. The worksheet also contains examples of how to write a parallel Maple program. Note: This command works only in the Standard Worksheet Interface.
• For advanced use, there are a number of other ways to start parallel jobs. You can use the Grid[Launch] command, specifying the number of nodes and the code to run, or you can use a command-line tool to launch parallel batch jobs.
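A minimal sketch of launching a job with Grid[Launch], assuming "local" mode so that no external server is needed. The procedure name hello and the node count of 4 are hypothetical choices made for this illustration.

```maple
# Assumption: "local" mode, which needs no external Grid Server process.
Grid:-Setup("local");

# Hypothetical job: each node reports its own index and the grid size.
hello := proc()
    printf("node %d of %d\n", Grid:-MyNode(), Grid:-NumNodes());
end proc:

# Run the procedure on 4 nodes; node 0 acts as the master node.
Grid:-Launch(hello, numnodes = 4);
```

The same call pattern should carry over to the distributed modes once Grid[Setup] has been given the appropriate mode, although the exact setup options depend on your installation.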
Grid Properties
The Grid Computing Toolbox stores its settings in the file conf/grid.properties, located relative to the Grid Computing Toolbox directory. You can query that directory with the command kernelopts(toolboxdir="Grid").
These settings are described in the Properties help page.
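For example, the location of the settings file could be assembled as in the sketch below. It assumes the toolbox is installed, so that the kernelopts query returns a directory. The variable names tbdir and propsFile are hypothetical.

```maple
# Query the toolbox installation directory (assumes the toolbox is installed).
tbdir := kernelopts(toolboxdir = "Grid");

# Build the path to the settings file named above.
propsFile := FileTools:-JoinPath([tbdir, "conf", "grid.properties"]);

# Check that the file is actually present before relying on it.
FileTools:-Exists(propsFile);
```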
Grid License Types
Grid licenses control how many servers can be started on a single machine (gridcpus) and the maximum number of nodes that can take part in a parallel job (gridnodes). If the Grid library fails to detect a valid license, the software allows execution only in "local" mode.
Accessing Grid Library Commands
Each command in the Grid library can be accessed by using either the long form or the short form of the command name in the command calling sequence.
The underlying implementation of the Grid library is a module; therefore, it is possible to use the form Grid:-command to access a command from the package. For more information, see Module Members.
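The three access forms can be sketched with NumNodes, a Grid command that also appears in the example on this page. Any other command exported by the package could be substituted.

```maple
# Long form: index into the package by command name.
Grid[NumNodes]();

# Module-member form, available because Grid is implemented as a module.
Grid:-NumNodes();

# Short form: load the package exports first, then call the command directly.
with(Grid):
NumNodes();
```

The module-member form is often the safer choice inside procedures, because it does not depend on whether with(Grid) has been executed in the session.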
List of Grid Library Commands
Refer to the Overview of the Grid Package help page for a list of available commands and their descriptions. The Grid[Server] submodule and the Grid[Setup] command are particularly important when using the Grid Computing Toolbox.
Certain commands are only useful during execution of a parallel job. Other commands are used to prepare and launch parallel jobs. To display the help page for a particular Grid command, see Getting Help with a Command in a Package.
Examples
Define settings for a local server.
Create a small test environment. This starts 3 nodes locally on this computer.
Note: These nodes run in a child process of Maple and should be stopped before restarting or closing Maple.
The server would normally be started from the script provided in the bin directory of the Grid Computing Toolbox installation.
Determine if the nodes are available.
Define a self-contained procedure containing code that will be executed on each of the nodes.
sampleCode := proc()
    uses Grid;
    local thisNode, maxNodes, val, i, rply;
    thisNode := MyNode();    # index of this node; node 0 is the master
    maxNodes := NumNodes();  # total number of nodes in the job
    print("PID=", kernelopts(pid));
    if thisNode <> 0 then
        # Worker node: read the input sent by the master node
        val := Receive(0);
        val := val^thisNode;
        print(val);
        # Return the result to the master node
        Send(0, val);
    else
        # Master node: send a value to each of the other nodes
        for i from 1 to maxNodes - 1 do
            Send(i, i + 10);
        end do;
        # Collect the replies in node order
        rply := 0;
        for i from 1 to maxNodes - 1 do
            val := Receive(i);
            print(val);
            rply := rply, val;
        end do;
        return [rply];
    end if;
end proc:
Execute the code on the grid network. The master node, node 0, which controls the grid, is defined by the Setup command.
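The launch command itself is not shown here. One plausible way to run the procedure, given as a sketch rather than the original input, is the following. It assumes "local" mode instead of a separately started server, and uses numnodes = 3 to match the three nodes in this example.

```maple
# Assumption: "local" mode; a server-based setup would configure Grid:-Setup differently.
Grid:-Setup("local");

# Run sampleCode on 3 nodes. Node 0 sends 11 and 12 to nodes 1 and 2,
# which reply with 11^1 = 11 and 12^2 = 144.
Grid:-Launch(sampleCode, numnodes = 3);
```

Tracing the procedure with three nodes, node 0 builds the list [0, 11, 144], because rply is initialized to 0 before the replies are appended.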
Node 0: "PID=", 23461
Node 1: "PID=", 23465
Node 2: "PID=", 23463
Node 2: 144
Node 1: 11
Node 0: 11
Node 0: 144