Actual source code: Map.hh
1: #ifndef included_ALE_Map_hh
2: #define included_ALE_Map_hh
4: #ifndef included_ALE_Sifter_hh
5: #include <Sifter.hh>
6: #endif
9: // Concepts.
10: // -----------
11: // Because of the use of templating and generic programming techniques,
12: // many fundamental ALE types cannot be defined by making progressively specific
13: // class declarations within a given hierarchy. Instead, they must satisfy certain
14: // Concept requirements that make them acceptable inputs to various Algorithms.
15: // This reflects a shortcoming of the current compiler technology that some defining
16: // features of Concepts have to be specified in documentation, rather than within the
17: // language. Sometimes, however, conceptual types can be viewed as themselves
18: // encapsulating Algorithms acting on other types implementing certain concepts.
19: // This allows to define the structure of these algorithmic types using generic
20: // programming techniques available in C++, for example.
23: // Atlas & Sec
24: // -----------------------
25: // In the past we have considered the Atlas concept, which, for the given
26: // Sifter, chart and ind types computed the assignment of indices
27: // to the points in the underlying Sieve (relative to a given chart).
28: // This mimics a system of local coordinate systems on a manifold or some
29: // such space.
31: // Essentially, Atlas can be thought of as a Sifter, with the
32: // underlying Sifter's points (base or cap?) in the cap, charts in the base
33: // and indices being the color on the edges. However, any type that responds
34: // to the fundamental requests -- setting the number of indices over a point
35: // (fiber dimension), (re)ordering the indices after some dimension modifications
36: // have been made, and retrieving the indices -- is an Atlas.
38: // An object that assigns data of a given type to the (point, chart) pairs of an Atlas
39: // is called its section or Sec. If an Atlas is viewed as a discrete model of a structure
40: // bundle over the point space, a Sec is a discrete model of a section of that bundle.
41: // Sec is required to define a restrict method, which, given a (point,chart) pair returns
42: // a data iterator.
44: // If the Sifter underling the Atlas is a Sieve, we assume that to each
45: // covering arrow p --> q (within the same chart), corresponds a mapping of the data d(p) <-- d(q),
46: // reflecting the idea that d(q) can be 'restricted' to d(p) within the same chart (or perhaps
47: // in any chart?). This sort of behavior is certainly impossible to guarantee through
48: // an interface specification, but it remains a conceptual requirement on Sec that
49: // the data over p are somehow "included" in the data over q. This "inclusion" is partly specified
50: // in the Atlas (e.g., by ensuring that the indices over p are included in those over q),
51: // and partly in Sec itself.
54: // Map & ParMap concepts.
55: // -----------------------
56: // A Map is thought of as a type of mapping from a Sec class to another Sec class.
57: // Maps and ParMaps can be viewed as algorithmic types acting on the input/output Sec types.
58: // Most importantly, a Map must advertise the Atlases of the input and output Sec
59: // types. Furthermore, a Map acts essentially as a Sec relative to the output
60: // atlas: giving a fixed input Sec S, a Map extends the Sec interface by defining restrictions
61: // to output (point,chart) pairs relative to S. Alternatively, a Map can return an output Sec T,
62: // containing the result of applying the map to S, which can be queried independently.
64: // With these features of a Map hardly define any implementation structure (rather an interface),
65: // since the particular behavior of restrictions is unconstrained by this specification.
66: // Particular Maps can impose further constraints on the input/output atlases
67: // and expand the interface, so long as they conform to the Map concept outlined above.
68: // For example, we can require that each output chart data depend only on the explicitly specified
69: // input charts' data. This is done by specifying the 'lightcone' Sifter or a Sieve, connecting
70: // the input charts to the output charts they depend on. Taking the cone (or the closure, if it's
71: // a Sieve) of a given output chart in the lightcone returns all of the input charts necessary
72: // to compute the data over the output chart.
74: // The specification of a Map's lightcone is very necessary to enable preallocation of internal
75: // data structures, such as the matrix storage, for linear maps. In the distributed context
76: // it also enables the setup of communication structures. This behavior is encapsulated in
77: // a ParMap, which is itself a conceptual type that extends Map. ParMap encapsulates (among other things)
78: // three conceptual Map objects: Gather,Scatter and the Transform. A ParMap is then an algorithm that orchestrates
79: // the action of other maps.
81: // ***
82: // The ParMap algorithm is a most clear illustration of the locazation principle underlying Sieves and computation over
83: // them: restrict-compute-assemble. To compute the action of a ParMap on a distributed Sec the necessary data must be
84: // communicate to and from the processes requiring and holding the data. The total overlap of a processes domain with the
85: // rest of the communicator is naturally covered by the individual overlaps indexed by the remote ranks.
86: // To communicate the local input Sec data to the remote processes, the Sec is first restricted to each of the overlap pieces,
87: // forming another Sec, whose charts are the overlaps indexed by indices, and whose points are the (in_point,in_chart)
88: // pairs on which in the input Sec is defined. This Sec can be viewed as multisheeted coverings of the overlap porition
89: // of the input Sec, and the multiplexing process forming the new Sec will be called Scatter. It is a map between two Secs
90: // with the input atlas and the new rank-indexed atlas.
91: // After Scatter maps the input Sec to a multisheeted covering, the multisheeted data are communicated to the processes according
92: // to the rank in each chart. This can be viewed as a map between two such Secs -- communication is certainly a mapping in
93: // the distributed context -- done 'locally' over each chart. Thus, the data from the input Sec are first localized onto
94: // each rank-chart, then mapped, and finally must be assembled.
95: // To obtain a Sec over the local domain, the Scatter process must be reversed, using the a map called Gather. Gather
96: // takes in the multisheeted covering of the overlap obtained after the communication phase, and obtains a single
97: // (in_point,in_chart) data point from the collection of all such points over all the rank charts. During this reduction
98: // the overlap portion of the communicated input Sec is unified with the local data over the same (in_point,in_chart) pair,
99: // completing the assembly of the input Sec.
100: // Once the input data have been communicate to the consuming processes, Transform locally maps the data from the input
101: // Sec to the output Sec. The Gather/Scatter maps involved in the communication stage depend on the GatherAtlas, which
102: // describes the multisheeted covering of InAtlas. Gather/Scatter maps can in principle operate on a Sec of any Atlas,
103: // unifying the data over the same point in different charts, producing a Sec over a single-charted atlas. Different
104: // implementations of Gather/Scatter lead to different multiplexing/reduction procedures, while the atlas structure stays
105: // the same. Gather/Scatter maps can be used locally as well and need not act on the result of a communication.
106: // ***
108: // The Gather output atlas has the same structure as the ParMap input atlas,
109: // while the Gather input atlas -- GatherAtlas -- combines the ParMap input (point,chart) pairs into
110: // the source and puts the communicator rank in the target. The Scatter input/output atlases have the
111: // structure of the output/input atlases of Gather respectively. The Transform atlases have the same
112: // structure as the ParMap atlases.
113: // (Gather input) (Gather output == Transform input)
114:
115: // index index
116: // (point,chart) -----> rank <==> point -----> chart
118: // (Scatter output) (Scatter input == Transform input)
120: // GatherAtlas is constructed from ParMap's input atlas using the lightcone.
121: // The Gather input/Scatter output atlas essentially applies the idea of a chart recursively:
122: // Transform input charts are distributed among different processes. Given a single process, its overlap
123: // with other processes can be indexed by their remote communicator ranks. All (point_in,chart_in) pairs shared
124: // with a given rank are part of a single rank-chart. This way a single in-chart is "blown up"
125: // into a "multisheeted" covering by rank-charts; each (point_in,chart_in) pair becomes a rank-point within
126: // one or many rank-charts.
128: // The data over this rank-atlas are essentially the data in the send/receive buffers,
129: // and the Scatter map is responsible for (multiplexing) packing and moving the data from the input Sec into the rank-Sec
130: // encapsulating these buffers. Once this has been done, ParMap executes the communication code (send/recv),
131: // and the rank-multisheeted data are transfered to the required processes.
133: // Scatter send/recv
134: // ... rank_0
135: // point_in --> chart_in ==> (point_in,chart_in) --> rank_k ==>
136: // ... rank_K
138: // Then Gather reduces the data over a single (point_in,chart_in) pair in all of the rank-charts. This can be thought
139: // of as gluing all of the partial sections over the overlaps with remote processes into a single "remote" Sec and then
140: // gluing it with the "local" Sec. Once the remote data have been assimilated into the local input Sec, Transform does
141: // its thing.
144: // rank_0 Gather Transform
145: // ...
146: // (point_in,chart_in) --> rank_n ==> point_in --> chart_in ==> point_out --> chart_out
147: // ...
148: // rank_N
150: // Observe that the structure of the GatherAtlas is essentially the same as the structure of the
151: // Overlap Sifter in ParDelta, therefore the Overlap code can be reused. However, that code is not customizable,
152: // while we may want to allow the GatherAtlas constructor the flexibility to massage the atlas (e.g., to keep only
153: // s single rank for a given (point,chart) pair, thereby implementing the 'owner' concept). Making GatherAtlas a class
154: // a class will allow this flexibility by exposing the input atlas computation method to overloading.
155: // The prototypical GatherAtlas object will be implemented to keep all of the ranks in the remote overlap under
156: // a given (point_in, chart_in) pair. Custom GatherAtlas objects may prune that so that the number and amount
157: // of data sent/recv'd by ParMap is only as required. Here we assume that the overlap is small and computed only once
158: // or infrequently, while ParMap mappings are frequent.
160:
163: //
164: // Atlas, Sec and Map classes
165: //
166: namespace ALE {
167: namespace X {
169: // We require that any class implementing the Atlas concept extending Sifter.
170: // FIX: should Atlas depend on a Sieve type? A Sifter type?
171: template <typename Ind_, typename Point_, typename Chart_>
172: class Atlas : public ASifter<Ind_, Point_, Chart_, SifterDef::multiColor> {
173: public:
174: //
175: // Encapsulated types
176: //
177: typedef Point_ point_type;
178: typedef Chart_ chart_type;
179: typedef Ind_ ind_type;
180: typedef ALE::pair<ind_type, ind_type> index_type;
181: //
182: typedef ASifter<index_type, point_type, chart_type, SifterDef::multiColor> sifter_type;
183: public:
184: //
185: // Basic interface
186: //
187: Atlas(MPI_Comm comm = PETSC_COMM_SELF, const int& debug = 0) : sifter_type(comm, debug) {};
188: virtual ~Atlas(){};
189: //
190: // Extended interface
191: //
192: ind_type size(){
193: ind_type sz = 0;
194: // Here we simply look at each chart's cone and add up the sizes
195: // In fact, we assume that within a chart indices are contiguous,
196: // so we simply add up the offset to the size of the last point of each chart's cone and sum them up.
197: baseSequence base = this->base();
198: for(typename baseSequence::iterator bitor = base->begin(); bitor != base->end(); bitor++) {
199: index_type ii = this->cone(*bitor)->rbegin()->color();
200: sz += ii.first + ii.second;
201: }
202: return sz;
203: };
204: ind_type size(const chart_type& c) {
205: // Here we simply look at the chart's cone and add up the sizes.
206: // In fact, we assume that within a chart indices are contiguous,
207: // so we simply return the sum of the offset to the size of the chart's last point.
208: index_type ii = this->cone(c).rbegin()->color();
209: ind_type sz = ii.first + ii.second;
210: return sz;
211: };
212: ind_type size(const chart_type& c, const point_type& p) {
213: // Here we assume that at most a single arrow between p and c exists
214: arrowSequence arrows = this->arrows(p,c);
215: ind_type sz = 0;
216: if(arrows.begin() != arrows.end()) {
217: sz = arrows.begin()->first;
218: }
219: return sz;
220: };
221:
222: ind_type offset(const chart_type& c) {
223: // We assume that within a chart indices are contiguous, so the offset of the chart
224: // is the offset of the first element in its cone.
225: ind_type off = this->cone(c).begin()->color().first;
226: return off;
227: };
228: ind_type offset(const chart_type& c, const point_type& p) {
229: // Here we assume that at most a single arrow between p and c exists
230: arrowSequence arrows = this->arrows(p,c);
231: // CONTINUE: what's the offset in case p is not in c
232: ind_type sz = 0;
233: if(arrows.begin() != arrows.end()) {
234: sz = arrows.begin()->first;
235: }
236: return sz;
237: };
238:
239: };// class Atlas
241:
243: // FIX: should Sec depend on a Sieve? Perhaps Atlas should encapsulate a Sieve type?
244: template <typename Data_, typename Atlas_>
245: class Sec {
246: public:
247: //
248: // Encapsulated types
249: //
250: typedef Atlas_ atlas_type;
251: typedef Data_ data_type;
252: typedef typename atlas_type::point_type point_type;
253: typedef typename atlas_type::chart_type chart_type;
254: typedef typename atlas_type::index_type index_type;
255: //
256: // Perhaps the most important incapsulated type: sequence of data elements over a sequence of AtlasArrows.
257: // of the sequence.
258: template <typename AtlasArrowSequence_>
259: class DataSequence {
260: // The AtlasArrowSequence_ encodes the arrows over a chart or a chart-point pair.
261: // The crucial assumption is that the begin() and rbegin() of the AtlasArrowSequence_ contain the extremal indices
262: public:
263: //
264: // Encapsulated types
265: //
266: typedef AtlasArrowSequence_ atlas_arrow_sequence_type;
267: //
268: // Encapsulated iterators
269: class iterator {
270: public:
271: typedef std::input_iterator_tag iterator_category;
272: typedef data_type value_type;
273: typedef int difference_type;
274: typedef value_type* pointer;
275: typedef value_type& reference;
276: protected:
277: // Encapsulates a data_type pointer
278: data_type* _ptr;
279: public:
280: iterator(const iterator& iter) : _ptr(iter._ptr) {};
281: iterator(data_type* ptr) : _ptr(ptr) {};
282: data_type& operator*(){return *(this->_ptr);};
283: virtual iterator operator++() {++this->_ptr; return *this;};
284: virtual iterator operator++(int n) {iterator tmp(this->_ptr); ++this->_ptr; return tmp;};
285: virtual bool operator==(const iterator& iter) const {return this->_ptr == iter._ptr;};
286: virtual bool operator!=(const iterator& iter) const {return this->_ptr != iter._ptr;};
287: };
288: //
289: class reverse_iterator {
290: public:
291: typedef std::input_iterator_tag iterator_category;
292: typedef data_type value_type;
293: typedef int difference_type;
294: typedef value_type* pointer;
295: typedef value_type& reference;
296: protected:
297: // Encapsulates a data_type pointer
298: data_type* _ptr;
299: public:
300: reverse_iterator(const reverse_iterator& iter) : _ptr(iter._ptr) {};
301: reverse_iterator(data_type* ptr) : _ptr(ptr) {};
302: data_type& operator*(){return *(this->_ptr);};
303: virtual reverse_iterator operator++() {--this->_ptr; return *this;};
304: virtual reverse_iterator operator++(int n) {reverse_iterator tmp(this->_ptr); --this->_ptr; return tmp;};
305: virtual bool operator==(const reverse_iterator& iter) const {return this->_ptr == iter._ptr;};
306: virtual bool operator!=(const reverse_iterator& iter) const {return this->_ptr != iter._ptr;};
307: };
308: protected:
309: const atlas_arrow_sequence_type& _arrows;
310: data_type *_base_ptr;
311: index _size;
312: public:
313: //
314: // Basic interface
315: DataSequence(data_type *arr, const atlas_arrow_sequence_type& arrows) : _arrows(arrows) {
316: // We immediately calculate the base pointer into the array and the size of the data sequence.
317: // To compute the index of the base pointer look at the beginning of the _arrows sequence.
318: this->_base_ptr = arr + this->_arrows->begin()->color().first;
319: // To compute the total size of the array segement, we look at the end of the _arrows sequence.
320: ALE::pair<index_type, index_type> ii = this->_arrows->rbegin()->color();
321: this->_size = ii.first + ii.second;
322: };
323: ~DataSequence(){};
324: //
325: // Extended interface
326: index_type size() {return this->_size;};
327: iterator begin() {return iterator(this->_base_ptr);};
328: iterator end() {return iterator(this->_base_ptr+this->_size+1);};
329: iterator rbegin() {return reverse_iterator(this->_base_ptr+this->_size);};
330: iterator rend() {return reverse_iterator(this->_base_ptr-1);};
331: }; // class Sec::DataSequence
332: protected:
333: Obj<atlas_type> _atlas;
334: data_type* _data;
335: bool _allocated;
336: public:
337: //
338: // Basic interface
339: //
340: Sec(const Obj<atlas_type> atlas, const (data_type*)& data) {this->setAtlas(atlas, false); this->_data = data;};
341: Sec(const Obj<atlas_type> atlas = Obj<atlas_type>()) {this->_data = NULL; this->setAtlas(atlas);};
342: ~Sec(){if((this->_data != NULL)&&(this->_allocated)) {PetscFree(this->_data); CHKERROR(ierr, "Error in PetscFree");}};
343: //
344: // Extended interface
345: //
346: void setAtlas(const Obj<atlas_type>& atlas, bool allocate = true) {
347: if(!this->_atlas.isNull()) {
348: throw ALE::Exception("Cannot reset nonempty atlas");
349: }
350: else {
351: if(atlas.isNull()) {
352: throw ALE::Exception("Cannot set a nonempty atlas");
353: }
354: else {
355: this->_atlas = atlas;
356: this->_allocated = allocate;
357: if(allocate) {
358: // Allocate data
359: PetscMalloc(this->_atlas->size()*sizeof(data_type), &this->_data); CHKERROR(ierr, "Error in PetscMalloc");
360: }
361: }
362: }
363: };// setAtlas()
364: Obj<atlas_type> getAtlas() {return this->_atlas;};
365: Obj<atlas_type> atlas() {return getAtlas();};
366: //
367: DataSequence<typename atlas_type::coneSequence>
368: restrict(const chart_type& chart) {
369: return DataSequence<typename atlas_type::coneSequence>(this->_data, this->_atlas->cone(chart));
370: };
371: DataSequence<typename atlas_type::arrowSequence>
372: restrict(const chart_type& chart, const point_type& point) {
373: return DataSequence<typename atlas_type::coneSequence>(this->_data, this->_atlas->arrows(point, chart));
374: };
375: };// class Sec
378: template <typename Data_, typename Atlas_>
379: class ArraySec : public Sec<Data_, Atlas_> {
380: public:
381: //
382: // Encapsulated types
383: //
384: typedef Sec<Data_,Atlas_> sec_type;
385: typedef Atlas_ atlas_type;
386: typedef Data_ data_type;
387: typedef typename atlas_type::point_type point_type;
388: typedef typename atlas_type::chart_type chart_type;
389: typedef typename atlas_type::index_type index_type;
390: //
391: // Basic interface
392: //
393: ArraySec(const Obj<atlas_type> atlas, const (data_type*)& data) {this->setAtlas(atlas, false); this->_data = data;};
394: ArraySec(const Obj<atlas_type> atlas = Obj<atlas_type>()) {this->_data = NULL; this->setAtlas(atlas);};
395: ~ArraySec(){if((this->_data != NULL)&&(this->_allocated)) {PetscFree(this->_data); CHKERROR(ierr, "Error in PetscFree");}};
396: //
397: // Extended interface
398: //
399: data_type*
400: restrict(const chart_type& chart) {
401: return this->_data + this->_atlas->offset(chart);
402: };
403: data_type*
404: restrict(const chart_type& chart, const point_type& point) {
405: return this->_data + this->_atlas->offset(chart,point);
406: };
407: }; // class ArraySec
410: // Overlap is a container class that declares GatherAtlas and ScatterAtlas types as well as
411: // defines their construction procedures.
412: // InAtlas_ and OutAtlas_ are Atlases, Lightcone_ is a Sifter.
413: // Lightcone connects the charts of some ParMap's OutAtlas_ to the ParMap's InAtlas_.
414: // GatherAtlas and ScatterAtlas have (point_in,chart_in) pairs as points and process ranks as charts.
415: // Preconditions: InAtlas, Lightcone, and OutAtlas share communicator; we require that chart type be Point
416: // FIX: should GatherAtlas/ScatterAtlas depend on an underlying Topology Sieve?
417: template <typename Data_, typename InAtlas_, typename OutAtlas_, typename Lightcone_>
418: class Overlap {
419: public:
420: // encapsulated types
421: typedef Data_ data_type;
422: typedef InAtlas_ in_atlas_type;
423: typedef Lightcone_ lightcone_type;
424: typedef OutAtlas_ out_atlas_type;
425: typedef MPI_Int chart_type;
426: typedef ALE::pair<in_atlas_type::point_type, chart_type> point_type;
427: typedef typename in_atlas_type::ind_type ind_type;
428: typedef typename in_atlas_type::index_type index_type;
429: //
430: typedef typename Atlas<point_type, chart_type, ind_type> gather_scatter_atlas_type;
431: protected:
432: Obj<in_atlas_type> _in_atlas;
433: Obj<out_atlas_type> _out_atlas;
434: Obj<lightcone_type> _lightcone;
435: //
436: Obj<gather_scatter_atlas_type> _gather_atlas;
437: Obj<gather_scatter_atlas_type> _scatter_atlas;
438: public:
439: //
440: Overlap(const Obj<in_atlas_type>& in_atlas,const Obj<out_atlas_type>& out_atlas, const Obj<lightcone_type>& lightcone = Obj<lightcone_type>()) : atlas_type(in_atlas->comm()), _in_atlas(in_atlas),_out_atlas(out_atlas),_lightcone(lightcone) {
441: this->computeAtlases(this->_gather_atlas, this->_scatter_atlas);
442: };
443: ~Overlap(){};
444: //
445: // Extended interface
446: Obj<in_atlas_type> inAtlas() {return this->_in_atlas();};
447: Obj<out_atlas_type> outAtlas() {return this->_out_atlas();};
448: Obj<lightcone_type> lightcone(){return this->_lightcone();}
449: //
450: void computeAtlases(Obj<gather_atlas_type> gather_atlas, Obj<scatter_atlas_type> scatter_atlas) {
451: // This function computes the gather and scatter atlases necessary for exchanging and fusing the input data lying over the
452: // overlap with the remote processes into the local data over the overlap points.
453: // The Lightcone sifter encodes the dependence between input and output charts of some map: each out-chart in the base
454: // of the Lightcone depends on the in-chart in its Lightcone cone (depends for the computation of the map values).
455: // A Null Lightcone Obj is interpreted as an identity Lightcone, hence the if-else dichotomy in the code below.
456: //
458: // In order to retrieve the remote points that OutAtlas depends on, we compute the overlap of the local bases of InAtlas
459: // restricted to a suitable subset of Lightcone's cap. The idea is that at most the charts the in the cap of the Lightcone
460: // are required locally by the OutAtlas.
461: // Furthermore, at most the cone of OutAtlas' base in Lightcone is required, which is the set we use to compute the overlap.
462: // If the Lightcone Obj is Null, we take all of the OutAtlas base as the base of the overlap in InAtlas.
463: //
464: if(gather_atlas.isNull()) {
465: gather_atlas = gather_atlas_type(this->_in_atlas->comm());
466: }
467: if(scatter_atlas.isNull()){
468: scatter_atlas = scatter_atlas_type(this->_in_atlas->comm());
469: }
470: //
471: if(!lightcone.isNull()) {
472: // Take all of out-charts & compute their lightcone closure; this will give all of the in-charts required locally
473: typename out_atlas_type::capSequence out_base = this->_out_atlas->base();
474: typename lightcone_type::coneSet in_charts = this->_lightcone->closure(out_base);
475: // Now we compute the "overlap" of in_atlas with these local in-charts; this will be the gather atlas
476: this->__pullbackAtlases(in_charts, gather_atlas, scatter_atlas);
478: }// if(!lightcone.isNull())
479: else {
480: // FIX: handle the Null lightcone case
481: }
482: };// computeAtlases()
484: protected:
485: // Internal type definitions to ensure compatibility with the legacy code in the parallel subroutines
486: typedef ALE::Point Point;
487: typedef int int32_t;
488: typedef ALE::pair<int32_t, int32_t> int_pair;
489: typedef ALE::set<std::pair<int32_t, int32_t> > int_pair_set;
490: typedef ALE::map<int32_t,int32_t> int__int;
491: typedef ALE::map<Point, int32_t> Point__int;
492: typedef ALE::map<Point, std::pair<int32_t,int32_t> > Point__int_int;
493: typedef ALE::map<Point, int_pair_set> Point__int_pair_set;
495: template <typename Sequence>
496: svoid __determinePointOwners(const Obj<Sequence>& points, int32_t *LeaseData, Point__int& owner) {
498: // The Sequence points will be referred to as 'base' throughout, although it may in fact represent a cap.
499: MPI_Comm comm = this->comm();
500: MPI_Int rank = this->commRank();
501: MPI_Int size = this->commSize();
503: // We need to partition global nodes among lessors, which we do by global prefix
504: // First we determine the extent of global prefices and the bounds on the indices with each global prefix.
505: int minGlobalPrefix = 0;
506: // Determine the local extent of global domains
507: for(typename Sequence::iterator point_itor = points->begin(); point_itor != points->end(); point_itor++) {
508: Point p = (*point_itor);
509: if((p.prefix < 0) && (p.prefix < minGlobalPrefix)) {
510: minGlobalPrefix = p.prefix;
511: }
512: }
513: int MinGlobalPrefix;
514: MPI_Allreduce(&minGlobalPrefix, &MinGlobalPrefix, 1, MPIU_INT, MPI_MIN, comm);
515: CHKERROR(ierr, "Error in MPI_Allreduce");
516:
517: int__int BaseLowerBound, BaseUpperBound; // global quantities computed from the local quantities below
518: int__int BaseMaxSize; // the maximum size of the global base index space by global prefix
519: int__int BaseSliceScale, BaseSliceSize, BaseSliceOffset;
520:
521: if(MinGlobalPrefix < 0) { // if we actually do have global base points
522: // Determine the upper and lower bounds on the indices of base points with each global prefix.
523: // We use maps to keep track of these quantities with different global prefices.
524: int__int baseLowerBound, baseUpperBound; // local quantities
525: // Initialize local bound maps with the upper below lower so we can later recognize omitted prefices.
526: for(int d = -1; d >= MinGlobalPrefix; d--) {
527: baseLowerBound[d] = 0; baseUpperBound[d] = -1;
528: }
529: // Compute local bounds
530: for(typename Sequence::iterator point_itor = points->begin(); point_itor != points->end(); point_itor++) {
531: Point p = (*point_itor);
532: int d = p.prefix;
533: int i = p.index;
534: if(d < 0) { // it is indeed a global prefix
535: if (i < baseLowerBound[d]) {
536: baseLowerBound[d] = i;
537: }
538: if (i > baseUpperBound[d]) {
539: baseUpperBound[d] = i;
540: }
541: }
542: }
543: // Compute global bounds
544: for(int d = -1; d >= MinGlobalPrefix; d--){
545: int lowerBound, upperBound, maxSize;
546: MPI_Allreduce(&baseLowerBound[d],&lowerBound,1,MPIU_INT,MPI_MIN,comm);
547: CHKERROR(ierr, "Error in MPI_Allreduce");
548: MPI_Allreduce(&baseUpperBound[d],&upperBound,1,MPIU_INT,MPI_MAX,comm);
549: CHKERROR(ierr, "Error in MPI_Allreduce");
550: maxSize = upperBound - lowerBound + 1;
551: if(maxSize > 0) { // there are actually some indices in this global prefix
552: BaseLowerBound[d] = lowerBound;
553: BaseUpperBound[d] = upperBound;
554: BaseMaxSize[d] = maxSize;
555:
556: // Each processor (at least potentially) owns a slice of the base indices with each global indices.
557: // The size of the slice with global prefix d is BaseMaxSize[d]/size + 1 (except if rank == size-1,
558: // where the slice size can be smaller; +1 is for safety).
559:
560: // For a non-empty domain d we compute and store the slice size in BaseSliceScale[d] (the 'typical' slice size) and
561: // BaseSliceSize[d] (the 'actual' slice size, which only differs from 'typical' for processor with rank == size -1 ).
562: // Likewise, each processor has to keep track of the index offset for each slice it owns and stores it in BaseSliceOffset[d].
563: BaseSliceScale[d] = BaseMaxSize[d]/size + 1;
564: BaseSliceSize[d] = BaseSliceScale[d];
565: if (rank == size-1) {
566: BaseSliceSize[d] = BaseMaxSize[d] - BaseSliceScale[d]*(size-1);
567: }
568: BaseSliceSize[d] = PetscMax(1,BaseSliceSize[d]);
569: BaseSliceOffset[d] = BaseLowerBound[d] + BaseSliceScale[d]*rank;
570: }// for(int d = -1; d >= MinGlobalPrefix; d--){
571: }
572: }// if(MinGlobalDomain < 0)
573:
574: for (typename Sequence::iterator point_itor = points->begin(); point_itor != points->end(); point_itor++) {
575: Point p = (*point_itor);
576: // Determine which slice p falls into
577: // ASSUMPTION on Point type
578: int d = p.prefix;
579: int i = p.index;
580: int proc;
581: if(d < 0) { // global domain -- determine the owner by which slice p falls into
582: proc = (i-BaseLowerBound[d])/BaseSliceScale[d];
583: }
584: else { // local domain -- must refer to a rank within the comm
585: if(d >= size) {
586: throw ALE::Exception("Local domain outside of comm size");
587: }
588: proc = d;
589: }
590: owner[p] = proc;
591: LeaseData[2*proc+1] = 1; // processor owns at least one of ours (i.e., the number of leases from proc is 1)
592: LeaseData[2*proc]++; // count of how many we lease from proc
593: }
595: // Base was empty
596: if(points->begin() == points->end()) {
597: for(int p = 0; p < size; p++) {
598: LeaseData[2*p+0] = 0;
599: LeaseData[2*p+1] = 0;
600: }
601: }
602: }; // __determinePointOwners()
605: template <typename BaseSequence_>
606: void __pullbackAtlases(const BaseSequence& pointsB, Obj<gather_atlas_type>& gather_atlas, Obj<scatter_atlas_type>& scatter_atlas){
607: typedef typename in_atlas_type::traits::baseSequence Sequence;
608: MPI_Comm comm = _graphA->comm();
609: int size = _graphA->commSize();
610: int rank = _graphA->commRank();
611: PetscObject petscObj = _graphA->petscObj();
612: PetscMPIInt tag1, tag2, tag3, tag4, tag5, tag6;
614: // The bases we are going to work with
615: Obj<Sequence> pointsA = _graphA->base();
616: Obj<Sequence> pointsB = _graphB->base();
618: // We MUST have the same sellers for points in A and B (same point owner determination)
619: int *BuyDataA; // 2 ints per processor: number of A base points we buy and number of sales (0 or 1).
620: int *BuyDataB; // 2 ints per processor: number of B base points we buy and number of sales (0 or 1).
621: PetscMalloc2(2*size,int,&BuyDataA,2*size,int,&BuyDataB);CHKERROR(ierr, "Error in PetscMalloc");
622: PetscMemzero(BuyDataA, 2*size * sizeof(int));CHKERROR(ierr, "Error in PetscMemzero");
623: PetscMemzero(BuyDataB, 2*size * sizeof(int));CHKERROR(ierr, "Error in PetscMemzero");
624: // Map from points to the process managing its bin (seller)
625: Point__int ownerA, ownerB;
627: // determine owners of each base node and save it in a map
628: __determinePointOwners(_graphA, pointsA, BuyDataA, ownerA);
629: __determinePointOwners(_graphB, pointsB, BuyDataB, ownerB);
631: int msgSize = 3; // A point is 2 ints, and the cone size is 1
632: int BuyCountA = 0; // The number of sellers with which this process (A buyer) communicates
633: int BuyCountB = 0; // The number of sellers with which this process (B buyer) communicates
634: int *BuySizesA; // The number of A points to buy from each seller
635: int *BuySizesB; // The number of B points to buy from each seller
636: int *SellersA; // The process for each seller of A points
637: int *SellersB; // The process for each seller of B points
638: int *offsetsA = new int[size];
639: int *offsetsB = new int[size];
640: for(int p = 0; p < size; ++p) {
641: BuyCountA += BuyDataA[2*p+1];
642: BuyCountB += BuyDataB[2*p+1];
643: }
644: PetscMalloc2(BuyCountA,int,&BuySizesA,BuyCountA,int,&SellersA);CHKERROR(ierr, "Error in PetscMalloc");
645: PetscMalloc2(BuyCountB,int,&BuySizesB,BuyCountB,int,&SellersB);CHKERROR(ierr, "Error in PetscMalloc");
646: for(int p = 0, buyNumA = 0, buyNumB = 0; p < size; ++p) {
647: if (BuyDataA[2*p+1]) {
648: SellersA[buyNumA] = p;
649: BuySizesA[buyNumA++] = BuyDataA[2*p];
650: }
651: if (BuyDataB[2*p+1]) {
652: SellersB[buyNumB] = p;
653: BuySizesB[buyNumB++] = BuyDataB[2*p];
654: }
655: if (p == 0) {
656: offsetsA[p] = 0;
657: offsetsB[p] = 0;
658: } else {
659: offsetsA[p] = offsetsA[p-1] + msgSize*BuyDataA[2*(p-1)];
660: offsetsB[p] = offsetsB[p-1] + msgSize*BuyDataB[2*(p-1)];
661: }
662: }
664: // All points are bought from someone
665: int32_t *BuyPointsA; // (point, coneSize) for each A point boung from a seller
666: int32_t *BuyPointsB; // (point, coneSize) for each B point boung from a seller
667: PetscMalloc2(msgSize*pointsA->size(),int32_t,&BuyPointsA,msgSize*pointsB->size(),int32_t,&BuyPointsB);CHKERROR(ierr,"Error in PetscMalloc");
668: for (typename Sequence::iterator p_itor = pointsA->begin(); p_itor != pointsA->end(); p_itor++) {
669: BuyPointsA[offsetsA[ownerA[*p_itor]]++] = (*p_itor).prefix;
670: BuyPointsA[offsetsA[ownerA[*p_itor]]++] = (*p_itor).index;
671: BuyPointsA[offsetsA[ownerA[*p_itor]]++] = _graphA->cone(*p_itor)->size();
672: }
673: for (typename Sequence::iterator p_itor = pointsB->begin(); p_itor != pointsB->end(); p_itor++) {
674: BuyPointsB[offsetsB[ownerB[*p_itor]]++] = (*p_itor).prefix;
675: BuyPointsB[offsetsB[ownerB[*p_itor]]++] = (*p_itor).index;
676: BuyPointsB[offsetsB[ownerB[*p_itor]]++] = _graphB->cone(*p_itor)->size();
677: }
678: for(int b = 0, o = 0; b < BuyCountA; ++b) {
679: if (offsetsA[SellersA[b]] - o != msgSize*BuySizesA[b]) {
680: throw ALE::Exception("Invalid A point size");
681: }
682: o += msgSize*BuySizesA[b];
683: }
684: for(int b = 0, o = 0; b < BuyCountB; ++b) {
685: if (offsetsB[SellersB[b]] - o != msgSize*BuySizesB[b]) {
686: throw ALE::Exception("Invalid B point size");
687: }
688: o += msgSize*BuySizesB[b];
689: }
690: delete [] offsetsA;
691: delete [] offsetsB;
693: int SellCountA; // The number of A point buyers with which this process (seller) communicates
694: int SellCountB; // The number of B point buyers with which this process (seller) communicates
695: int *SellSizesA; // The number of A points to sell to each buyer
696: int *SellSizesB; // The number of B points to sell to each buyer
697: int *BuyersA; // The process for each A point buyer
698: int *BuyersB; // The process for each B point buyer
699: int MaxSellSizeA; // The maximum number of messages to be sold to any A point buyer
700: int MaxSellSizeB; // The maximum number of messages to be sold to any B point buyer
701: int32_t *SellPointsA = PETSC_NULL; // The points and cone sizes from all buyers
702: int32_t *SellPointsB = PETSC_NULL; // The points and cone sizes from all buyers
703: PetscMaxSum(comm, BuyDataA, &MaxSellSizeA, &SellCountA);CHKERROR(ierr,"Error in PetscMaxSum");
704: PetscMaxSum(comm, BuyDataB, &MaxSellSizeB, &SellCountB);CHKERROR(ierr,"Error in PetscMaxSum");
705: PetscMalloc2(SellCountA,int,&SellSizesA,SellCountA,int,&BuyersA);CHKERROR(ierr, "Error in PetscMalloc");
706: PetscMalloc2(SellCountB,int,&SellSizesB,SellCountB,int,&BuyersB);CHKERROR(ierr, "Error in PetscMalloc");
707: for(int s = 0; s < SellCountA; s++) {
708: SellSizesA[s] = MaxSellSizeA;
709: BuyersA[s] = MPI_ANY_SOURCE;
710: }
711: for(int s = 0; s < SellCountB; s++) {
712: SellSizesB[s] = MaxSellSizeB;
713: BuyersB[s] = MPI_ANY_SOURCE;
714: }
716: if (debug) {
717: ostringstream txt;
719: for(int s = 0; s < BuyCountA; s++) {
720: txt << "BuySizesA["<<s<<"]: "<<BuySizesA[s]<<" from seller "<<SellersA[s]<< std::endl;
721: }
722: for(int p = 0; p < (int) pointsA->size(); p++) {
723: txt << "["<<rank<<"]: BuyPointsA["<<p<<"]: ("<<BuyPointsA[p*msgSize]<<", "<<BuyPointsA[p*msgSize+1]<<") coneSize "<<BuyPointsA[p*msgSize+2]<<std::endl;
724: }
725: for(int s = 0; s < BuyCountB; s++) {
726: txt << "BuySizesB["<<s<<"]: "<<BuySizesB[s]<<" from seller "<<SellersB[s]<< std::endl;
727: }
728: for(int p = 0; p < (int) pointsB->size(); p++) {
729: txt << "["<<rank<<"]: BuyPointsB["<<p<<"]: ("<<BuyPointsB[p*msgSize]<<", "<<BuyPointsB[p*msgSize+1]<<") coneSize "<<BuyPointsB[p*msgSize+2]<<std::endl;
730: }
731: PetscSynchronizedPrintf(comm, txt.str().c_str()); CHKERROR(ierr, "Error in PetscSynchronizedPrintf");
732: PetscSynchronizedFlush(comm);CHKERROR(ierr,"Error in PetscSynchronizedFlush");
733: }
735: // First tell sellers which points we want to buy
736: // SellSizes, Buyers, and SellPoints are output
737: PetscObjectGetNewTag(petscObj, &tag1); CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
738: commCycle(comm, tag1, msgSize, BuyCountA, BuySizesA, SellersA, BuyPointsA, SellCountA, SellSizesA, BuyersA, &SellPointsA);
739: PetscObjectGetNewTag(petscObj, &tag2); CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
740: commCycle(comm, tag2, msgSize, BuyCountB, BuySizesB, SellersB, BuyPointsB, SellCountB, SellSizesB, BuyersB, &SellPointsB);
742: if (debug) {
743: ostringstream txt;
745: if (!rank) {txt << "Unsquished" << std::endl;}
746: for(int p = 0; p < SellCountA*MaxSellSizeA; p++) {
747: txt << "["<<rank<<"]: SellPointsA["<<p<<"]: ("<<SellPointsA[p*msgSize]<<", "<<SellPointsA[p*msgSize+1]<<") coneSize "<<SellPointsA[p*msgSize+2]<<std::endl;
748: }
749: for(int p = 0; p < SellCountB*MaxSellSizeB; p++) {
750: txt << "["<<rank<<"]: SellPointsB["<<p<<"]: ("<<SellPointsB[p*msgSize]<<", "<<SellPointsB[p*msgSize+1]<<") coneSize "<<SellPointsB[p*msgSize+2]<<std::endl;
751: }
752: PetscSynchronizedPrintf(comm, txt.str().c_str()); CHKERROR(ierr, "Error in PetscSynchronizedPrintf");
753: PetscSynchronizedFlush(comm);CHKERROR(ierr,"Error in PetscSynchronizedFlush");
754: }
756: // Since we gave maximum sizes, we need to squeeze SellPoints
757: for(int s = 0, offset = 0; s < SellCountA; s++) {
758: if (offset != s*MaxSellSizeA*msgSize) {
759: PetscMemmove(&SellPointsA[offset], &SellPointsA[s*MaxSellSizeA*msgSize], SellSizesA[s]*msgSize*sizeof(int32_t));CHKERROR(ierr,"Error in PetscMemmove");
760: }
761: offset += SellSizesA[s]*msgSize;
762: }
763: for(int s = 0, offset = 0; s < SellCountB; s++) {
764: if (offset != s*MaxSellSizeB*msgSize) {
765: PetscMemmove(&SellPointsB[offset], &SellPointsB[s*MaxSellSizeB*msgSize], SellSizesB[s]*msgSize*sizeof(int32_t));CHKERROR(ierr,"Error in PetscMemmove");
766: }
767: offset += SellSizesB[s]*msgSize;
768: }
770: if (debug) {
771: ostringstream txt;
772: int SellSizeA = 0, SellSizeB = 0;
774: if (!rank) {txt << "Squished" << std::endl;}
775: for(int s = 0; s < SellCountA; s++) {
776: SellSizeA += SellSizesA[s];
777: txt << "SellSizesA["<<s<<"]: "<<SellSizesA[s]<<" from buyer "<<BuyersA[s]<< std::endl;
778: }
779: for(int p = 0; p < SellSizeA; p++) {
780: txt << "["<<rank<<"]: SellPointsA["<<p<<"]: ("<<SellPointsA[p*msgSize]<<", "<<SellPointsA[p*msgSize+1]<<") coneSize "<<SellPointsA[p*msgSize+2]<<std::endl;
781: }
782: for(int s = 0; s < SellCountB; s++) {
783: SellSizeB += SellSizesB[s];
784: txt << "SellSizesB["<<s<<"]: "<<SellSizesB[s]<<" from buyer "<<BuyersB[s]<< std::endl;
785: }
786: for(int p = 0; p < SellSizeB; p++) {
787: txt << "["<<rank<<"]: SellPointsB["<<p<<"]: ("<<SellPointsB[p*msgSize]<<", "<<SellPointsB[p*msgSize+1]<<") coneSize "<<SellPointsB[p*msgSize+2]<<std::endl;
788: }
789: PetscSynchronizedPrintf(comm, txt.str().c_str()); CHKERROR(ierr, "Error in PetscSynchronizedPrintf");
790: PetscSynchronizedFlush(comm);CHKERROR(ierr,"Error in PetscSynchronizedFlush");
791: }
793: // Map from A base points to (B process, B coneSize) pairs
794: Point__int_pair_set BillOfSaleAtoB;
795: // Map from B base points to (A process, A coneSize) pairs
796: Point__int_pair_set BillOfSaleBtoA;
798: // Find the A points being sold to B buyers and record the B cone size
799: for(int s = 0, offset = 0; s < SellCountA; s++) {
800: for(int m = 0; m < SellSizesA[s]; m++) {
801: Point point = Point(SellPointsA[offset], SellPointsA[offset+1]);
802: // Just insert the point
803: int size = BillOfSaleAtoB[point].size();
804: // Avoid unused variable warning
805: if (!size) offset += msgSize;
806: }
807: }
808: for(int s = 0, offset = 0; s < SellCountB; s++) {
809: for(int m = 0; m < SellSizesB[s]; m++) {
810: Point point = Point(SellPointsB[offset], SellPointsB[offset+1]);
812: if (BillOfSaleAtoB.find(point) != BillOfSaleAtoB.end()) {
813: BillOfSaleAtoB[point].insert(int_pair(BuyersB[s], SellPointsB[offset+2]));
814: }
815: offset += msgSize;
816: }
817: }
818: // Find the B points being sold to A buyers and record the A cone size
819: for(int s = 0, offset = 0; s < SellCountB; s++) {
820: for(int m = 0; m < SellSizesB[s]; m++) {
821: Point point = Point(SellPointsB[offset], SellPointsB[offset+1]);
822: // Just insert the point
823: int size = BillOfSaleBtoA[point].size();
824: // Avoid unused variable warning
825: if (!size) offset += msgSize;
826: }
827: }
828: for(int s = 0, offset = 0; s < SellCountA; s++) {
829: for(int m = 0; m < SellSizesA[s]; m++) {
830: Point point = Point(SellPointsA[offset], SellPointsA[offset+1]);
832: if (BillOfSaleBtoA.find(point) != BillOfSaleBtoA.end()) {
833: BillOfSaleBtoA[point].insert(int_pair(BuyersA[s], SellPointsA[offset+2]));
834: }
835: offset += msgSize;
836: }
837: }
838: // Calculate number of B buyers for A base points
839: for(int s = 0, offset = 0; s < SellCountA; s++) {
840: for(int m = 0; m < SellSizesA[s]; m++) {
841: Point point = Point(SellPointsA[offset], SellPointsA[offset+1]);
843: SellPointsA[offset+2] = BillOfSaleAtoB[point].size();
844: offset += msgSize;
845: }
846: }
847: // Calculate number of A buyers for B base points
848: for(int s = 0, offset = 0; s < SellCountB; s++) {
849: for(int m = 0; m < SellSizesB[s]; m++) {
850: Point point = Point(SellPointsB[offset], SellPointsB[offset+1]);
852: SellPointsB[offset+2] = BillOfSaleBtoA[point].size();
853: offset += msgSize;
854: }
855: }
857: // Tell A buyers how many B buyers there were (contained in BuyPointsA)
858: // Tell B buyers how many A buyers there were (contained in BuyPointsB)
859: PetscObjectGetNewTag(petscObj, &tag3); CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
860: commCycle(comm, tag3, msgSize, SellCountA, SellSizesA, BuyersA, SellPointsA, BuyCountA, BuySizesA, SellersA, &BuyPointsA);
861: PetscObjectGetNewTag(petscObj, &tag4); CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
862: commCycle(comm, tag4, msgSize, SellCountB, SellSizesB, BuyersB, SellPointsB, BuyCountB, BuySizesB, SellersB, &BuyPointsB);
864: if (debug) {
865: ostringstream txt;
866: int BuySizeA = 0, BuySizeB = 0;
868: if (!rank) {txt << "Got other B and A buyers" << std::endl;}
869: for(int s = 0; s < BuyCountA; s++) {
870: BuySizeA += BuySizesA[s];
871: txt << "BuySizesA["<<s<<"]: "<<BuySizesA[s]<<" from seller "<<SellersA[s]<< std::endl;
872: }
873: for(int p = 0; p < BuySizeA; p++) {
874: txt << "["<<rank<<"]: BuyPointsA["<<p<<"]: ("<<BuyPointsA[p*msgSize]<<", "<<BuyPointsA[p*msgSize+1]<<") B buyers "<<BuyPointsA[p*msgSize+2]<<std::endl;
875: }
876: for(int s = 0; s < BuyCountB; s++) {
877: BuySizeB += BuySizesB[s];
878: txt << "BuySizesB["<<s<<"]: "<<BuySizesB[s]<<" from seller "<<SellersB[s]<< std::endl;
879: }
880: for(int p = 0; p < BuySizeB; p++) {
881: txt << "["<<rank<<"]: BuyPointsB["<<p<<"]: ("<<BuyPointsB[p*msgSize]<<", "<<BuyPointsB[p*msgSize+1]<<") A buyers "<<BuyPointsB[p*msgSize+2]<<std::endl;
882: }
883: PetscSynchronizedPrintf(comm, txt.str().c_str()); CHKERROR(ierr, "Error in PetscSynchronizedPrintf");
884: PetscSynchronizedFlush(comm);CHKERROR(ierr,"Error in PetscSynchronizedFlush");
885: }
887: int BuyConesSizeA = 0;
888: int BuyConesSizeB = 0;
889: int SellConesSizeA = 0;
890: int SellConesSizeB = 0;
891: int *BuyConesSizesA; // The number of A points to buy from each seller
892: int *BuyConesSizesB; // The number of B points to buy from each seller
893: int *SellConesSizesA; // The number of A points to sell to each buyer
894: int *SellConesSizesB; // The number of B points to sell to each buyer
895: int32_t *SellConesA; // The (rank, B cone size) for each A point from all other B buyers
896: int32_t *SellConesB; // The (rank, A cone size) for each B point from all other A buyers
897: int32_t *overlapInfoA = PETSC_NULL; // The (rank, B cone size) for each A point from all other B buyers
898: int32_t *overlapInfoB = PETSC_NULL; // The (rank, A cone size) for each B point from all other A buyers
899: PetscMalloc2(BuyCountA,int,&BuyConesSizesA,SellCountA,int,&SellConesSizesA);CHKERROR(ierr, "Error in PetscMalloc");
900: PetscMalloc2(BuyCountB,int,&BuyConesSizesB,SellCountB,int,&SellConesSizesB);CHKERROR(ierr, "Error in PetscMalloc");
901: for(int s = 0, offset = 0; s < SellCountA; s++) {
902: SellConesSizesA[s] = 0;
904: for(int m = 0; m < SellSizesA[s]; m++) {
905: SellConesSizesA[s] += SellPointsA[offset+2];
906: offset += msgSize;
907: }
908: SellConesSizeA += SellConesSizesA[s];
909: }
910: for(int s = 0, offset = 0; s < SellCountB; s++) {
911: SellConesSizesB[s] = 0;
913: for(int m = 0; m < SellSizesB[s]; m++) {
914: SellConesSizesB[s] += SellPointsB[offset+2];
915: offset += msgSize;
916: }
917: SellConesSizeB += SellConesSizesB[s];
918: }
920: for(int b = 0, offset = 0; b < BuyCountA; b++) {
921: BuyConesSizesA[b] = 0;
923: for(int m = 0; m < BuySizesA[b]; m++) {
924: BuyConesSizesA[b] += BuyPointsA[offset+2];
925: offset += msgSize;
926: }
927: BuyConesSizeA += BuyConesSizesA[b];
928: }
929: for(int b = 0, offset = 0; b < BuyCountB; b++) {
930: BuyConesSizesB[b] = 0;
932: for(int m = 0; m < BuySizesB[b]; m++) {
933: BuyConesSizesB[b] += BuyPointsB[offset+2];
934: offset += msgSize;
935: }
936: BuyConesSizeB += BuyConesSizesB[b];
937: }
939: int cMsgSize = 2;
940: PetscMalloc2(SellConesSizeA*cMsgSize,int32_t,&SellConesA,SellConesSizeB*cMsgSize,int32_t,&SellConesB);CHKERROR(ierr, "Error in PetscMalloc");
941: for(int s = 0, offset = 0, cOffset = 0, SellConeSize = 0; s < SellCountA; s++) {
942: for(int m = 0; m < SellSizesA[s]; m++) {
943: Point point(SellPointsA[offset],SellPointsA[offset+1]);
945: for(typename int_pair_set::iterator p_iter = BillOfSaleAtoB[point].begin(); p_iter != BillOfSaleAtoB[point].end(); ++p_iter) {
946: SellConesA[cOffset+0] = (*p_iter).first;
947: SellConesA[cOffset+1] = (*p_iter).second;
948: cOffset += cMsgSize;
949: }
950: offset += msgSize;
951: }
952: if (cOffset - cMsgSize*SellConeSize != cMsgSize*SellConesSizesA[s]) {
953: throw ALE::Exception("Nonmatching sizes");
954: }
955: SellConeSize += SellConesSizesA[s];
956: }
957: for(int s = 0, offset = 0, cOffset = 0, SellConeSize = 0; s < SellCountB; s++) {
958: for(int m = 0; m < SellSizesB[s]; m++) {
959: Point point(SellPointsB[offset],SellPointsB[offset+1]);
961: for(typename int_pair_set::iterator p_iter = BillOfSaleBtoA[point].begin(); p_iter != BillOfSaleBtoA[point].end(); ++p_iter) {
962: SellConesB[cOffset+0] = (*p_iter).first;
963: SellConesB[cOffset+1] = (*p_iter).second;
964: cOffset += cMsgSize;
965: }
966: offset += msgSize;
967: }
968: if (cOffset - cMsgSize*SellConeSize != cMsgSize*SellConesSizesB[s]) {
969: throw ALE::Exception("Nonmatching sizes");
970: }
971: SellConeSize += SellConesSizesB[s];
972: }
974: // Then send A buyers a (rank, cone size) for all B buyers of the same points
975: PetscObjectGetNewTag(petscObj, &tag5); CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
976: commCycle(comm, tag5, cMsgSize, SellCountA, SellConesSizesA, BuyersA, SellConesA, BuyCountA, BuyConesSizesA, SellersA, &overlapInfoA);
977: PetscObjectGetNewTag(petscObj, &tag6); CHKERROR(ierr, "Failed on PetscObjectGetNewTag");
978: commCycle(comm, tag6, cMsgSize, SellCountB, SellConesSizesB, BuyersB, SellConesB, BuyCountB, BuyConesSizesB, SellersB, &overlapInfoB);
980: // Finally build the A-->B overlap sifter
981: // (remote rank) ---(base A overlap point, remote cone size, local cone size)---> (base A overlap point)
982: for(int b = 0, offset = 0, cOffset = 0; b < BuyCountA; b++) {
983: for(int m = 0; m < BuySizesA[b]; m++) {
984: Point p(BuyPointsA[offset],BuyPointsA[offset+1]);
986: for(int n = 0; n < BuyPointsA[offset+2]; n++) {
987: int neighbor = overlapInfoA[cOffset+0];
988: int coneSize = overlapInfoA[cOffset+1];
990: // Record the point, size of the cone over p coming in from neighbor, and going out to the neighbor for the arrow color
991: overlap->addArrow(neighbor, ALE::pair<int,Point>(0, p), ALE::pair<Point,ALE::pair<int,int> >(p, ALE::pair<int,int>(coneSize, _graphA->cone(p)->size())) );
992: cOffset += cMsgSize;
993: }
994: offset += msgSize;
995: }
996: }
998: // Finally build the B-->A overlap sifter
999: // (remote rank) ---(base B overlap point, remote cone size, local cone size)---> (base B overlap point)
1000: for(int b = 0, offset = 0, cOffset = 0; b < BuyCountB; b++) {
1001: for(int m = 0; m < BuySizesB[b]; m++) {
1002: Point p(BuyPointsB[offset],BuyPointsB[offset+1]);
1004: for(int n = 0; n < BuyPointsB[offset+2]; n++) {
1005: int neighbor = overlapInfoB[cOffset+0];
1006: int coneSize = overlapInfoB[cOffset+1];
1008: // Record the point, size of the cone over p coming in from neighbor, and going out to the neighbor for the arrow color
1009: overlap->addArrow(neighbor, ALE::pair<int,Point>(1, p), ALE::pair<Point,ALE::pair<int,int> >(p, ALE::pair<int,int>(coneSize, _graphB->cone(p)->size())) );
1010: cOffset += cMsgSize;
1011: }
1012: offset += msgSize;
1013: }
1014: }
1015: }; // __pullbackAtlases()
1017: public:
1018: }; // class Overlap
1020: template <typename Data_, typename Map_, typename Overlap_, typename Fusion_>
1021: class ParMap { // class ParMap
1022: public:
1023: //
1024: // Encapsulated types
1025: //
1026: // Overlap is an object that encapsulates GatherAtlas & ScatterAtlas objects, while Fusion encapsulates and the corresponding
1027: // Gather & Scatter map objects. The idea is that Gather & Scatter depend on the structure of the corresponding atlases, but
1028: // do not necessarily control their construction. Therefore the two types of objects and/or their constructors can be overloaded
1029: // separately.
1030: // For convenience Overlap encapsulates the data it is constructed from: two atlases, InAtlas and OutAtlas, and Lightcone,
1031: // which is a Sifter.
1032: // InAtlas refers to the input Sec (the argument of the ParMap), while OutAtlas refers to the output Sec (the result of ParMap).
1033: // InAtlas_ and OutAtlas_ have arrows from points to charts decorated with indices into Data_ storage of the input/output Sec
1034: // respectively.
1035: // GatherAtlas is an Atlas lifting InAtlas into a rank-indexed covering by overlaps with remote processes.
1036: // The overlap is computed accroding to Lightcone, which connects the charts of OutAtlas_ to InAtlas_, which implies the two
1037: // atlases have charts of the same type. The idea is that data dependencies are initially expressed at the chart level;
1038: // a refinement of this can be achieved by subclassing (or implementing a new) GatherAtlas.
1039: // Gather is a Map that reduce a Sec over GatherAtlas with data over each ((in_point,in_chart),rank) pair to a Sec over InAtlas,
1040: // with data over each (in_point, in_chart) pair. Scatter maps in the opposite direction by "multiplexing" the data onto a
1041: // rank-indexed covering.
1042: // Map is a Map sending an InAtlas Sec obtained from Gather, into an OutAtlas Sec.
1043: //
1044: typedef Data_ data_type;
1045: typedef Overlap_ overlap_type;
1046: typedef typename overlap_type::in_atlas_type in_atlas_type;
1047: typedef typename overlap_type::out_atlas_type out_atlas_type;
1048: typedef typename lightcone_type::out_atlas_type lightcone_type;
1049: //
1050: typedef Fusion_ fusion_type;
1051: typedef typename fusion_type::gather_type gather_type;
1052: typedef typename fusion_type::scatter_type scatter_type;
1053: typedef Map_ map_type;
1054: //
1055: protected:
1056: int _debug;
1057: //
1058: Obj<map_type> _map;
1059: Obj<overlap_type> _overlap_type;
1060: Obj<fusion_type> _fusion;
1061: protected:
1062: void __init(MPI_Comm comm) {
1064: this->_comm = comm;
1065: MPI_Comm_rank(this->_comm, &this->_commRank); CHKERROR(ierr, "Error in MPI_Comm_rank");
1066: MPI_Comm_size(this->_comm, &this->_commSize); CHKERROR(ierr, "Error in MPI_Comm_rank");
1067: };
1068: public:
1069: //
1070: // Basic interface
1071: //
1072: ParMap(const Obj<map_type>& map, const Obj<overlap_type>& overlap, const Obj<fusion_type>& fusion)
1073: : _debug(0), _map(map), _overlap(overlap), _fusion(fusion) {};
1074: ~ParMap(){};
1075: //
1076: // Extended interface
1077: //
1078: int getDebug() {return this->_debug;}
1079: void setDebug(const int& d) {this->_debug = d;};
1080: MPI_Comm comm() {return this->_comm;};
1081: MPI_Int commRank() {return this->_commRank;};
1082: MPI_Int commSize() {return this->_commSize;};
1083: }; // class ParMap
1084:
1086: } // namespace X
1087: } // namespace ALE
1089: #endif