mt README

This file is intended to give Myrinet programmers a vague idea of how
the GM mapper and related tools can be used, modified, and extended. It
should be read along with the mt source code.

Table of Contents

0. building mt.
1.0 mt overview
1.0.1 mt directories
1.0.2 mt classes
1.1 mt file formats
1.1.0 map files (*.map)
1.1.1 route files. (*.routes)
1.1.2 hosts files. (*.hosts)
1.1.3 counters files. (*.counters)
1.4. The mt_Graph class.
1.5 Route Calculators
2.0 mapper
2.1 test_mapper
3. simple_simulator
4. ad_hoc_simulator
5. ping
6. test_ping
7. simple_routes
11. routes4danny
12. test_static_mapper
13. static_mapper
14. file_mapper
15. merlin
16. deadlock
17. map2wiring
18. longest
19. best
20. examine
21. conflicts
22. sift
23. merge_routes
24. safe_routes
25. reduce

0. building mt.

Most of the simulators and test programs are not built by the GM
makefiles. You must build them separately. To do this, type "make all
gm" in the GM/mt directory. You need GNU make. If GNU make is not
called "make" on your system, specify MAKE=gmake (for instance) on the
command line when you invoke make. The top level makefile itself calls
make in the mt subdirectories.

The mt makefiles depend on a script and makefile include files in the
mt/config directory. The script is called "arch.sh", and there is one
makefile include file for each architecture the mt software has been
compiled on before. You will have to modify the makefile include file
for your architecture, which defines paths to compilers and so on.

You can cross-compile by specifying the value of arch when you invoke
make. For instance,

    make arch=ppc_vxWorks

builds for vxWorks.

The mt libraries and tools should be compilable by any ANSI C++
compiler, like GNU C++. Users of the Microsoft compiler should disable
the non-ANSI kludges first.

The code is filled with assert statements. If you want failed asserts
to terminate the application, dumping core, compile with -DMT_ABORT.
Tools are built into a directory called mt/tools/`mt/config/arch.sh`.

1.0 mt overview

"mt" stands for "mapper tools". It includes the GM mapper, a Myrinet
simulator library, a framework for writing custom mappers and route
calculators, and a bunch of test programs. All tools are built in the
"tools" directory.

Tools are put together from different modules. You can write your own
modules and modify standard mt tools to operate differently. For
instance, you can replace the route calculator in the GM mapper so that
it configures the network using your own routing algorithm. You can
simulate a Myrinet using this different routing algorithm, and see how
much bandwidth the routes allow under full traffic, where the
bottlenecks are, and whether there are deadlocks.

You can create your network in a file and compute routes for it without
using the GM mapper, or you can look at the routes computed by the GM
mapper without using gm_board_info. You can implement your own network
mapping algorithm to replace the one used by the GM mapper, and you can
test it on an arbitrary network with the Myrinet simulator. You can
import routes created by your own software and scripts into the GM
mapper and configure a Myrinet with these routes.

You can design your own network topologies with the graphical network
editing program (merlin), simulate the topologies, and see how well
they work. You can see graphical representations of your Myrinet and of
the routes on it. You can take a network map and a file of routes and
print out the switches and ports each route goes through.

Each mt tool links to:

- the mt library, libmt.a
- a route calculator (like libsimple.a)
- a network library (either libsm.a (simulator) or libmtgm.a and
  libgm.a (GM))
- an optional simulation (like libnway.a)
- an object file with main in it (like mapper.o)

Each module has its own set of arguments, and these arguments must be
entered in separate sections on the command line.
If you run any program with -help, you'll get a list of all the section
names available. You can get the usage statement for a section by
putting -help in that section.

    -route-args ...
    -simulator-args ...
    -simulation-args ...
    -mapper-args ...
    -job-args ...

Instead of typing out arguments you can use an "args" file, which is
just a file containing command line arguments. The mt/tools directory
has many example args files. Line breaks are okay.

1.0.1 mt directories

    ad_hoc_calculator - the ad hoc calculator library
    config            - makefile include files
    dijkstra          - dijkstra calculator library
    file_calculator   - calculator library that takes routes from a
                        route file
    force             - brute force calculator library
    libmt             - mt library: lots of support classes
    merlin            - merlin java program
    mmapper           - myrinet mapper library
    nway              - n-way simulation library
    ping              - ping library
    simple            - simple route calculator library
    smapper           - static mapper library
    tools             - tools built by linking mt libraries together
    watch             - java program to show mapper counters files

1.0.2 mt classes

The following classes are part of libmt.a, and are the building blocks
of the mt tools and other libraries. Each class is defined in the
mt/libmt directory by a header file and source file named after it. For
instance, mt_Address is defined by the files mt/libmt/mt_Address.[ch].

    mt_Address       - implements a 48 bit Myrinet board id
    mt_Args          - represents command line arguments to mt tools
    mt_Calculator    - base class for all route calculators
    mt_Component     - base class for all mt classes
    mt_File          - filename manipulation class
    mt_FileReader    - reads files into token streams
    mt_FileWriter    - writes files
    mt_Gm            - abstract libgm
    mt_Graph         - represents a myrinet
    mt_Host          - a host on a myrinet
    mt_HostTable     - the GM ID table
    mt_Job           - something that can run on a network (simulation
                       or mapper)
    mt_LineReader    - reads lines into token streams
    mt_MapFile       - reads (parses) and writes map files
    mt_Mapper        - base class for mappers. Includes route
                       distributing code
    mt_MapperModule  - mapper wrapper class
    mt_MapperOptions - reads mapper arguments
    mt_Message       - GM mapper mapping messages
    mt_Module        - base class for pieces of mt tools
    mt_Network       - something that can send and receive messages
                       (GM or simulator)
    mt_NetworkQueue  - a network queue class
    mt_Node          - a host or a switch
    mt_NodeFactory   - something that makes nodes
    mt_Queue         - a queue class
    mt_Queueable     - something that can be queued
    mt_Responder     - something that can respond to a mapping message
                       (for the simulator)
    mt_Route         - implements a Myrinet route
    mt_RouteFile     - reads and writes routes files
    mt_RouteTable    - stores all host pair routes
    mt_Simulation    - a traffic pattern
    mt_StringReader  - reads strings into a token stream
    mt_Switch        - a Myrinet switch
    mt_Tokenizer     - converts a stream into tokens (used for parsing
                       code)
    sm_Callable      - event handler for simulator
    sm_EventList     - event list for simulator
    sm_Graph         - graph for simulator
    sm_Host          - simulated host
    sm_Node          - simulated node
    sm_Null          - disconnected port for simulator (swallows
                       messages)
    sm_Packet        - a simulated Myrinet packet
    sm_Port          - a port on a simulated switch
    sm_Simulator     - the Myrinet simulator
    sm_Switch        - simulated switch

1.1 mt file formats

There are certain file formats that work with the mt tools. These
formats are human readable ASCII text. All numbers in these files are
base 10 unless noted otherwise (for instance, board IDs in hosts files
are hexadecimal).

1.1.0 map files (*.map)

Map files are written by the GM mapper. They are read by the static
mapper, by the simulator, and by the java program called merlin. They
describe a network by giving node names and connections.

In map files whitespace and newlines delimit tokens. Map files contain
some redundancies to simplify the parsing code. The format is
expandable. The tokens "x", "y", "s", and "h" are reserved.

A map file begins with a list of node definitions in any order. A node
definition starts with a node type, which is either "s" for switch or
"h" for host. Following the node type is a port count.
The port count for hosts is ignored, and should be given as a "-".
After the port count is the node name, which can be any text, so long
as it doesn't exceed mt_Node::NAME_LENGTH, which is currently 32, and
is defined in mt_Node.h. Node names must be unique. If they contain
spaces then the whole name must be quoted with double quotes. Names
without spaces do not have to be quoted.

After the node name comes the connection count. The connection count
will be less than the port count if some ports are not connected. After
the connection count there follows a list of connections. A connection
consists of a 0-based local port index, a remote node type, a remote
port count, a remote node name, and a 0-based remote port index, where
local means the node being defined, and remote means the node at the
other end of the connection.

After the list of connections there follows any number of option/value
pairs. An option is any non-reserved token and a value is any token.
Tokens with spaces in them must be surrounded by double quotes. Users
of the mt libraries can add whatever new option/value pairs they need
for their own programs. There are some pre-existing options that the mt
tools use. These include x, y, number, meshX, meshY, and limit. The map
file parsing class mt_MapFile works with the network graph class
mt_Graph to allow programs to parse arbitrary option/value pairs,
making the map file format expandable.

Here is part of a map file:

    s 16 "s0"
    16
    0 h - "h0" 0
    1 h - "h1" 0
    2 h - "h2" 0
    3 h - "h3" 0
    4 h - "h4" 0
    5 h - "h5" 0
    6 h - "h6" 0
    7 h - "h7" 0
    8 h - "h8" 0
    9 h - "h9" 0
    10 s 16 "s1" 0
    11 s 16 "s1" 2
    12 s 16 "s1" 4
    13 s 16 "s3" 3
    14 s 16 "s3" 4
    15 s 16 "s3" 5
    ;optional parameters below
    x 60
    y 10
    meshX 0
    meshY 0
    number 0

    h - "h0"
    1
    0 s 16 "s0" 0
    ;optional
    x 60
    y 10
    number 6
    ...

1.1.1 route files. (*.routes)

These are written by route calculators and the mapper. They give
Myrinet style routes (relative hops) for a set of hosts. (All to all.)
In route files newlines delimit routes. A route file starts with a host
count. Then follows a number giving the maximum number of routes
between any two hosts. There can be more than one route between any two
hosts. For instance, a route calculator may be written so that it
computes a low priority route and a high priority route between every
pair of hosts. In this example case, the maximum route count would be
2. The simple calculator built into the standard GM mapper does not
compute multiple routes, so route files written by it will have a
maximum route count of 1.

After the host count and the maximum route count there follow sections.
Each section contains routes from a single host to all other hosts in
the network. A section begins with the source host name. Host names in
route files are like those in map files. After the host name there
follow two numbers in parentheses. The first number is the number of
routes in the section. This will be more than the host count given at
the start of the route file if there are alternate routes in the
section. The second number is the sum of all the route lengths in the
section. This number simplifies the parsing of the route file and
memory allocation.

The rest of the section is made up of routes, one route per line. A
route consists of a destination host name and a list of relative port
hops separated by commas.

Here is part of a route file:

    81 1
    "c-15.SU-18.SM-0.alaska" (81 302)
    "c-15.SU-18.SM-0.alaska" 0
    "c-8.SU-18.SM-0.alaska" -7
    "c-9.SU-18.SM-0.alaska" -6
    "c-10.SU-18.SM-0.alaska" -5
    "c-11.SU-18.SM-0.alaska" -4
    "c-12.SU-18.SM-0.alaska" -3
    "c-13.SU-18.SM-0.alaska" -2
    "c-14.SU-18.SM-0.alaska" -1
    "c-0.SU-19.SM-0.alaska" 1,-9
    ...

1.1.2 hosts files. (*.hosts)

These are written by the mapper and represent an assignment of GM IDs
to Myrinet board IDs. Each assignment consists of a GM ID, a number
representing a host type, a 48-bit hexadecimal Myrinet board ID, and a
host name. Commonly the host type is 0.
Here is part of a hosts file:

    1 0 0060dd7fee2e "bigbox.myri.com"
    2 0 0060dd7fed1b "mtx.myri.com"

1.1.3 counters files. (*.counters)

Counters files are made by the mapper and show counts of mapping
messages and errors. The format is a counter name followed by a count
on each line.

1.2 building mt

This section has moved to the top of the file; see section 0.

1.4. The mt_Graph class.

mt_Graph is the most important class used by the mt libraries and
tools. It is defined in the files mt/libmt/mt_Graph.[ch]. mt_Graph is a
subclass of mt_NodeFactory, which means that it has two methods that
return newly created mt_Node instances: newHost () and newSwitch ().
You can subclass mt_Graph to represent graphs of your own kinds of
hosts and switches by overriding newHost () and newSwitch () to return
something other than the standard mt_Host and mt_Switch objects. When a
map file is being parsed into an mt_Graph, new nodes are constructed by
the mt_Graph's mt_NodeFactory methods.

1.5 Route Calculators

A route calculator is a C++ class that conforms to the specification
defined by mt/libmt/mt_Calculator.h; in other words, it is a subclass
of the abstract class mt_Calculator. All calculators provide (or think
they provide) deadlock free routes, which is an important feature for
Myrinets.

There are four route calculators included with the mt software:
simple_routes, ad_hoc_routes, dijkstra_routes and force_routes.
simple_routes is the standard up*/down* route calculator used by the GM
mapper. ad_hoc_routes implements various Danny algorithms for specific
kinds of networks, like meshes and three-levels. dijkstra_routes
implements Dijkstra's algorithm with an up*/down* constraint to avoid
deadlocks. In practice the routes generated by this algorithm are no
better than those found by simple_routes. Finally, force_routes
implements some brute-force algorithm that I don't know anything about.

Here is the life span of a route calculator. First it is created and
its constructor is called.
Then it is given a chance to parse command line arguments through its
parseArgs () method. The command line arguments are represented by an
mt_Args object. Then the calculator is given a graph of a network to
compute routes for. The graph is an mt_Graph object. Its interface is
defined in libmt/mt_Graph.h. In the initialize method the calculator
should make a copy of the mt_Graph for its own use. Since
mt_Calculators are subclasses of mt_Graphs, this copying amounts to the
mt_Calculator adding every node in the original graph to itself. This
copying is done automatically by the mt_Calculator::initialize
function, so the overriding subclass initialize method should call this
superclass function first.

After the calculator is initialized, the calculator's getRoute function
is called repeatedly as the caller queries the calculator for routes
between hosts. Then the calculator's cleanup method is called, during
which time the calculator should free any memory it allocated in its
initialize method. The steps from initialization to cleanup can be
repeated during the course of the calculator's life span. When the
calculator is no longer needed, its destructor is called and it is
destroyed.

Route calculators provide definitions for the following virtual
functions:

virtual int getMaxRoutes () = 0;

    Returns the maximum number of routes between any two hosts.
    Calculators that don't calculate alternate routes between host
    pairs should return 1.

virtual int getNumRoutes (int from, int to) = 0;

    Returns the number of routes between two hosts. The arguments "to"
    and "from" are 0-based indexes into an mt_Graph's hosts array,
    which is accessed by mt_Graph::getHost ().

virtual int initialize (char*mapFile);

    Initializes the calculator to compute routes for the network
    represented by the map file "mapFile". A map file can be converted
    into an mt_Graph with the mt_MapFile class.

virtual int initialize (mt_Graph*graph, mt_Node*root);

    Initializes the calculator to compute routes for the network in
    the mt_Graph "graph", with the root "root", which is the root for
    a tree-based calculator like an up*/down* style calculator.

virtual void cleanup ();

    Frees any memory allocated by initialize ().

virtual int parseArgs (mt_Args*args) = 0;

    Parses the calculator's command line arguments. Use
    mt_Component::printFormat () to report any syntax errors or
    unrecognized options, and return 0 on error.

virtual void usage () = 0;

    Prints a usage statement for the calculator's arguments with
    mt_Component::printFormat (), which has the same syntax as
    printf ().

virtual int getRoute (int from, int to, int routeIndex, mt_Route*route) = 0;

    Writes the routeIndex-th route from host "from" to host "to" into
    "route", and returns the route length. routeIndex ranges from 0 to
    getNumRoutes (from, to) - 1.

virtual mt_Node*newHost (char*name, char*type) = 0;
virtual mt_Node*newSwitch (char*name, char*type) = 0;

    These functions are common to all mt_Graph objects. mt_Calculators
    are subclasses of mt_Graph. The functions return a new host or
    switch subclassed from the abstract class mt_Node. If a route
    calculation algorithm needs a graph with nodes providing extra
    functionality, it should return instances of these nodes here.
    Otherwise it can just return instances of the predefined classes
    mt_Host and mt_Switch. There are a dozen virtual functions that an
    mt_Node has to define, including setOption (), which is called for
    each option/value pair in a node definition in a map file when a
    graph is being created from a map file.

Myrinet programmers may want to provide their own route calculator. A
custom calculator may provide better routes than the general purpose
calculator built into the GM mapper for a particular network.
A calculator can be linked to the GM mapper library or to the
static_mapper to make custom versions of the GM mapper or the
static_mapper that perform exactly the same as the standard versions,
except for the routes that get assigned. Programmers can test their
route calculators by linking to the stand alone calculator program, to
make their own stand alone calculators. Programmers should look at the
targets in mt/tools/makefile to see how mt pieces are put together to
form programs, specifically simple_routes, mapper, test_mapper, and
routes4danny.

2.0 mapper

"mapper" is the GM mapper. It is made by compiling and linking the
source file tools/mapper.c with a route calculator (libsimple.a), the
myrinet mapper library (libmm.a), the mt-gm library (libmtgm.a), the mt
library (libmt.a), and the gm library (libgm.a).

You can make your own GM mapper by writing your own route calculator
library and compiling and linking mapper.c to it and libmm.a,
libmtgm.a, libmt.a and libgm.a.

There are two files in the mt/tools directory meant to be used with the
mapper: "active.args" and "passive.args". On a Myrinet there should be
just one mapper running with "active.args". Additional mappers should
be started with "passive.args". "passive.args" makes the mapper run in
a sort of passive mode, to be used as a backup mapper in case the
primary mapper becomes disconnected from the Myrinet. To make the
mapper passive, "passive.args" disables the -make-hosts and -level
options. Two additional options, -just-set-hostname and -never-end, may
be enabled. These make the passive mapper remain passive forever
without doing much. I think that this useless configuration was a
requirement for Microsoft Windows NT (tm). Users with real operating
systems should not enable -just-set-hostname and -never-end.

For information about specific options, read the comments in the
"active.args" file.
2.1 test_mapper

"test_mapper" is a version of "mapper" linked to the Myrinet simulator
(libsm.a) instead of to libmtgm.a and libgm.a. You should always make a
test_mapper along with a mapper. It just means linking what you already
wrote to other libraries. Instead of exploring a real network,
test_mapper explores a simulated one.

You run test_mapper like you run the GM mapper: it takes its arguments
from an "args" file. There is an example "args" file called
gm/mt/tools/mapper.args which should be modified before you run
test_mapper. This "args" file is the same as the standard "active.args"
or "passive.args", except that it has a section of arguments added for
the simulator. This section begins with the line "-simulator-args". The
most important of these arguments is "-map-file", which tells the
simulator which map file (node names and network topology) to simulate.
The easiest way to create a map file is to use the map file created by
the GM mapper as it runs. You shouldn't have to change any other
arguments in the -simulator-args section.

3. simple_simulator

A simple n-way simulation is defined in the mt/nway directory. This
simulation links to the Myrinet simulator library, making the program
mt/tools/arch/simple_simulator. You start simple_simulator the same way
you start test_mapper (also a simulation), by specifying an "args" file
on the command line. There is an example "args" file called "sim.args".
The simple_simulator uses the standard simple route calculator,
libsimple.a. The simulation returns a usage figure for each network
link. Networks can have usages less than 1 because of blocking.

4. ad_hoc_simulator

The n-way simulation linked with ad hoc route calculators is called the
"ad_hoc_simulator". Only Danny uses these route calculators.

5. ping

"ping" uses the GM mapper port to send out GM mapper scout messages or
probe messages to hosts or switches. It keeps track of replies. Missing
messages indicate a problem.
To compile ping type "make all gm" in the mt directory. (This will also
make the GM mapper.)

Ping also sends probe messages to switches to test ports. See the file
ping.args for example arguments. You should modify ping.args and run
ping by typing "ping ping.args". You can ignore the simulator section
of the args file. You should specify at least -myself, -map-file, and
-test-switches or -test-hosts. You can also use command line arguments
directly with ping, without an args file.

If you get late messages, increase the timeout. If you don't specify a
node, ping will test all ports on all switches or hosts. To make a map
file, run the GM mapper with -map-once enabled, and use the map file it
produces, mapper.map. Try using gm_debug with ping to see CRC counters.

You can test all ports on a single switch with the -node option. Ports
can only be tested if they are connected to switches or loopback
cables. If you use the mapper to generate a map before running ping,
make sure that you enable the -find-loops option.

Ping can be run forever, with the -forever option. Try the -test-xbars
option to test xbars thoroughly.

ping examples:

    intel_linux/ping -route-args -job-args -map-file mapper.map -myself compaq.myri.com -test-xbars -print-xbars
    tested 3 nodes
    received all 227 messages
    switch s0:
    the route from compaq.myri.com to s0 is
    00 . 1 . . . . . . . . . . . . . .
    01 1 . . . . . . . . . . . . . . .
    02 . . . . . . . . . . . . . . . .
    03 . . . . . . . . . . . . . . . .
    04 . . . . . . . . . . . . . . . .
    05 . . . . . . . . . . . . . . . .
    06 . . . . . . . . . . . . . . . .
    07 . . . . . . . . . . . . . . . .
    08 . . . . . . . . . . . . . . . .
    09 . . . . . . . . . . . . . . . .
    10 . . . . . . . . . . . . . . . .
    11 . . . . . . . . . . . . . . . .
    12 . . . . . . . . . . . . . . . .
    13 . . . . . . . . . . . . . . . .
    14 . . . . . . . . . . . . . . . .
    15 . . . . . . . . . . . . . . . .
    switch s1:
    the route from compaq.myri.com to s1 is -5
    00 . . . . . . . . . . . . . . . .
    01 . . . . . . . . . . . . . . . .
    02 . . . . . . . . . . . . . . . .
    03 . . . . . . . . . . . . . . . .
    04 . . . . . . . . . . . . . . . .
    05 . . . . . . . . . . . . . . . .
    06 . . . . . . . . . . . . . . . .
    07 . . . . . . . . . . . . . . . .
    08 . . . . . . . . . . . . . . . .
    09 . . . . . . . . . . . . . . . .
    10 . . . . . . . . . . . . . . . .
    11 . . . . . . . . . . . . . . . .
    12 . . . . . . . . . . . . . . . .
    13 . . . . . . . . . . . . . . . .
    14 . . . . . . . . . . . . . . . .
    15 . . . . . . . . . . . . . . . .
    switch s2:
    the route from compaq.myri.com to s2 is -4
    00 . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    01 . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    02 . 1 . 1 1 1 1 1 1 1 1 1 1 1 1 1
    03 . 1 1 . 1 1 1 1 1 1 1 1 1 1 1 1
    04 . 1 1 1 . 1 1 1 1 1 1 1 1 1 1 1
    05 . 1 1 1 1 . 1 1 1 1 1 1 1 1 1 1
    06 . 1 1 1 1 1 . 1 1 1 1 1 1 1 1 1
    07 . 1 1 1 1 1 1 . 1 1 1 1 1 1 1 1
    08 . 1 1 1 1 1 1 1 . 1 1 1 1 1 1 1
    09 . 1 1 1 1 1 1 1 1 . 1 1 1 1 1 1
    10 . 1 1 1 1 1 1 1 1 1 . 1 1 1 1 1
    11 . 1 1 1 1 1 1 1 1 1 1 . 1 1 1 1
    12 . 1 1 1 1 1 1 1 1 1 1 1 . 1 1 1
    13 . 1 1 1 1 1 1 1 1 1 1 1 1 . 1 1
    14 . 1 1 1 1 1 1 1 1 1 1 1 1 1 . 1
    15 . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 .
    compaq.myri.com%

    compaq.myri.com% intel_linux/ping -route-args -job-args -map-file mapper.map -myself compaq.myri.com -test-xbars -print-xbars -node s2 -fearless
    tested 1 node
    received all 256 messages
    switch s2:
    the route from compaq.myri.com to s2 is -4
    00 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    01 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    02 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    03 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    04 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    05 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    06 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    07 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    08 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    09 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    10 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    11 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    12 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    13 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    14 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    15 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

    compaq.myri.com% intel_linux/ping -route-args -job-args -map-file mapper.map -myself compaq.myri.com -test-xbars -node s2 -min-size 10 -max-size 1000 -increment-size 100
    tested 1 node
    received all 2250 messages
    switch s2:
    the route from compaq.myri.com to s2 is -4
    00 . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    01 . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    02 . 1 . 1 1 1 1 1 1 1 1 1 1 1 1 1
    03 . 1 1 . 1 1 1 1 1 1 1 1 1 1 1 1
    04 . 1 1 1 . 1 1 1 1 1 1 1 1 1 1 1
    05 . 1 1 1 1 . 1 1 1 1 1 1 1 1 1 1
    06 . 1 1 1 1 1 . 1 1 1 1 1 1 1 1 1
    07 . 1 1 1 1 1 1 . 1 1 1 1 1 1 1 1
    08 . 1 1 1 1 1 1 1 . 1 1 1 1 1 1 1
    09 . 1 1 1 1 1 1 1 1 . 1 1 1 1 1 1
    10 . 1 1 1 1 1 1 1 1 1 . 1 1 1 1 1
    11 . 1 1 1 1 1 1 1 1 1 1 . 1 1 1 1
    12 . 1 1 1 1 1 1 1 1 1 1 1 . 1 1 1
    13 . 1 1 1 1 1 1 1 1 1 1 1 1 . 1 1
    14 . 1 1 1 1 1 1 1 1 1 1 1 1 1 . 1
    15 . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 .

    compaq.myri.com% intel_linux/ping -route-args -job-args -map-file mapper.map -myself compaq.myri.com -test-xbars -node s2 -min-size 10 -max-size 1000 -increment-size 100 -in-port 12 -out-port 3
    tested 1 node
    received all 10 messages
    switch s2:
    the route from compaq.myri.com to s2 is -4
    00 . . . . . . . . . . . . . . . .
    01 . . . . . . . . . . . . . . . .
    02 . . . . . . . . . . . . . . . .
    03 . . . . . . . . . . . . . . . .
    04 . . . . . . . . . . . . . . . .
    05 . . . . . . . . . . . . . . . .
    06 . . . . . . . . . . . . . . . .
    07 . . . . . . . . . . . . . . . .
    08 . . . . . . . . . . . . . . . .
    09 . . . . . . . . . . . . . . . .
    10 . . . . . . . . . . . . . . . .
    11 . . . . . . . . . . . . . . . .
    12 . . . 1 . . . . . . . . . . . .
    13 . . . . . . . . . . . . . . . .
    14 . . . . . . . . . . . . . . . .
    15 . . . . . . . . . . . . . . . .
    compaq.myri.com%

6. test_ping

"test_ping" is the same as "ping" except that it is linked to the
Myrinet simulator. It is probably not useful.

7. simple_routes

"simple_routes" is a stand alone route calculator. All stand alone
calculators are made by compiling and linking the source file
mt/tools/map2routes.c to the libmt library and to a route calculator
library. simple_routes is linked to the simple route calculator, whose
source files are located in "mt/simple".

You can make your own stand alone route calculator by writing your own
route calculator library and compiling and linking
mt/tools/map2routes.c to it and libmt.

All route calculators are linked with map2routes.o to form stand alone
route calculators. These take a map file and return a route file.

    usage: sparc_solaris/simple_routes routefile [-route-args ...]

11. routes4danny

"routes4danny" takes a map file and a route file and prints out the
routes in an easier to read format. It is the smallest mt program, and
you can look at its source file, mt/tools/routes4danny.c, to see how to
initialize the mt library and use it.

    usage: sparc_solaris/routes4danny

12. test_static_mapper

This is identical to the static_mapper, except that it is linked to the
Myrinet simulator instead of the GM library.

13. static_mapper

"static_mapper" configures a GM network using a map file instead of
exploring it.

14. file_mapper

This is the static mapper linked to the file calculator.
It lets the user configure a network with routes from a routes file.
The file_mapper should be run with the file.args file in the tools
directory. The user has to modify file.args, specifying the names of
the map file, the routes file, and the mapper node.

If you can map and configure your network once, with the mapper, you
can re-configure any node in the future without mapping the network
again, or using the network at all. So you can bring up individual
nodes without relying on a mapper running elsewhere on the network.

Here's what you need to do.

0. Follow the instructions in gm/mt/README on how to make the extra
   mapper tools. All of the following steps and files refer to tools
   built in the mt/tools directory, not in gm/binary/sbin.

1. Power on all nodes and start GM on them.

3. Edit the active.args file, enabling -map-file, -make-hosts,
   -routes-file, and -map-once. Run the mapper. It will configure the
   whole network and terminate, leaving a map-file, a routes-file and
   a hosts-file. The point of this step is to get a map-file, a
   routes-file, and a hosts-file, all of which are needed in the next
   steps. You can use your own tools to generate these files if you
   like, avoiding the use of the mapper altogether.

4. Now that you have these files, you can configure the network
   without using the mapper. Edit file.args to give it the names of
   the files created by step 3, and the name of the host you are on,
   and run file_mapper with file.args as its sole parameter. To
   configure the whole network, disable -never-configure-others.
   Otherwise enable it, and with -map-once, file_mapper will configure
   the node it is running on, and then terminate.

15. merlin

merlin is a java program that lets you edit map files graphically, that
shows routes on a map file, and so on. See
http://www.myri.com/staff/finucane/merlin

16. deadlock

Use the "deadlock" program in the tools directory to determine whether
a set of routes can possibly deadlock on a network.
    mercury% sparc_solaris/deadlock
    sparc_solaris/deadlock
    usage: sparc_solaris/deadlock
    mercury% sparc_solaris/deadlock oct.map oct.routes
    sparc_solaris/deadlock oct.map oct.routes
    cannot deadlock
    mercury% sparc_solaris/deadlock oct.map oct-shortest.routes
    sparc_solaris/deadlock oct.map oct-shortest.routes
    can deadlock on the following cycle:
    n16:4 n16:5 n21:0 n21:1 n17:5 n17:4 n20:1 n20:0 n16:4
    mercury%

17. map2wiring

map2wiring converts a map file to a wiring diagram.

    creator% sparc_solaris/map2wiring
    usage: sparc_solaris/map2wiring
    creator% sparc_solaris/map2wiring oct1.map
    n0:0 n16:0
    n1:0 n16:1
    n2:0 n16:2
    n3:0 n16:3
    n16:4 n20:0
    n16:5 n21:0
    n16:6 n22:0
    n16:7 n23:0
    n20:1 n17:4
    n20:2 n18:4
    n20:3 n19:4
    n21:1 n17:5
    n21:2 n18:5
    n21:3 n19:5
    n22:1 n17:6
    n22:2 n18:6
    n22:3 n19:6
    n23:1 n17:7
    n23:2 n18:7
    n23:3 n19:7
    creator%

18. longest

longest prints out the longest route in a route file.

    mpi0.myri.com% intel_linux/longest f.map f.shortest
    longest route is route 0 from cadmin to c0131, 6 hops
    mpi0.myri.com%

19. best

best repeatedly simulates random traffic on a network using each host
as the root of the up*/down* tree.

    intel_linux/best g.map 10000
    0.3752 c0195
    0.3752 c0211
    0.3752 c0227
    0.3752 cadmin
    0.3752 c0194
    0.3752 c0210
    0.3752 c0226
    0.3852 c0193
    0.3852 c0209
    0.3852 c0225
    0.3852 c0192
    0.3852 c0208
    0.3852 c0224

20. examine

examine does something that I can't remember.

21. conflicts

conflicts takes a map file, a route file, and a file containing host
name pairs and prints out how many routes go across each link in the
network.

    compaq.myri.com% cat small.hosts
    h1 h2
    h2 h3
    compaq.myri.com% intel_linux/conflicts small.map small.routes small.hosts | more
    s0:0 0
    s0:1 0
    s0:2 0
    s0:3 0
    s0:4 0
    s0:5 0
    s0:6 0
    s0:7 0
    s0:8 0
    s0:9 1
    s0:10 1

22. sift

sift takes a map, a route file, and a file containing GM ID pairs and
prints out counts of ports used in routes between these GM ID pairs.
sift is meant to be used to locate bad ports from point to point error
data.
    bash$ cat sift.errors
    1 2
    1 5
    5 1
    5 4
    5 6
    bash$ intel_linux/sift -map-file sift.map -route-file sift.routes -error-file sift.errors
    s0 port 4 had 1, link to s2
    s0 port 10 had 1, link to s5
    s2 port 3 had 1, link to s0
    s2 port 12 had 1, link to s8
    s5 port 2 had 2, link to s6
    s5 port 11 had 1, link to s0
    s8 port 0 had 2, link to s9
    s8 port 9 had 1, link to s2
    s9 port 0 had 1, link to s8
    s9 port 2 had 2, link to s10
    s6 port 1 had 1, link to s5
    s6 port 12 had 2, link to h0
    s10 port 3 had 1, link to s9
    s10 port 7 had 2, link to s14
    s14 port 1 had 2, link to s13
    s14 port 8 had 1, link to s10
    s13 port 7 had 3, link to h4
    s13 port 10 had 1, link to s14

23. merge_routes

merge_routes takes 2 or more route files (possibly incomplete) and
merges them into 1 route file. Duplicate routes are taken from the
earlier files.

    usage: intel_linux/merge_routes [...]

    intel_linux/merge_routes map.map 1.routes 2.routes 3.routes > 4.routes

24. safe_routes

safe_routes takes a route file that can deadlock (like a file produced
with -shortest-output) and a deadlock-free route file and replaces
routes in the first file that cause deadlocks with routes from the
second file.

    usage: intel_linux/safe_routes

    [finucane@blue1 tools]$ intel_linux/safe_routes f.map f.bad f.good f.mixed
    routes can deadlock on the following cycle:
    s1:4 00e041000632 to 00e041000556 i.e. 218 to 44
    s9:4 00e04100069d to 00e041000508 i.e. 217 to 109
    s9:5 00e04100069d to 00e041000508 i.e. 217 to 109
    s34:5 00e041000556 to 0060dd7fbfaf i.e. 44 to 5
    s34:4 00e041000612 to 0060dd7fbfaf i.e. 157 to 5
    s11:4 00e04100051e to 00e041000632 i.e. 107 to 218
    s11:6 00e041000616 to 00e041000632 i.e. 125 to 218
    s1:6 0060dd7fbfaf to 00e041000556 i.e. 5 to 44
    s1:4
    removed route from 218 to 44
    replacing route
    checking revised routes
    routes cannot deadlock

25. reduce

reduce removes nodes from a map file.

    usage: intel_linux/reduce [ ...]

    intel_linux/reduce f.map g.map 0060dd7fbfa6 0060dd7fbfa5