@@ -22,8 +22,8 @@ from the build directory.
2222
2323Key RAJA features shown in this section are:
2424
25- * ``RAJA::View ``
26- * ``RAJA::Layout `` and ``RAJA::OffsetLayout `` constructs
25+ * ``RAJA::View ``
26+ * ``RAJA::Layout `` and ``RAJA::OffsetLayout `` constructs
2727 * Layout permutations
2828
2929The examples in this section illustrate RAJA View and Layout concepts
@@ -40,11 +40,11 @@ operation, using :math:`N \times N` matrices:
4040 :end-before: _cstyle_matmult_end
4141 :language: C++
4242
43- As is commonly done for efficiency in C and C++, we have allocated the data
44- for the matrices as one-dimensional arrays. Thus, we need to manually compute
43+ As is commonly done for efficiency in C and C++, we have allocated the data
44+ for the matrices as one-dimensional arrays. Thus, we need to manually compute
4545the data pointer offsets for the row and column indices in the kernel.
4646Here, we use the array ``Cref `` to hold a reference solution matrix that
47- we use to compare with results generated by the examples below.
47+ we use to compare with results generated by the examples below.
4848
4949To simplify the multi-dimensional indexing, we can use ``RAJA::View `` objects,
5050which we define as:
@@ -55,20 +55,31 @@ which we define as:
5555 :language: C++
5656
5757Here we define three ``RAJA::View `` objects, 'Aview', 'Bview', and 'Cview',
58- that *wrap * the array data pointers, 'A', 'B', and 'C', respectively. We
59- pass a data pointer as the first argument to each view constructor and then
58+ that *wrap * the array data pointers, 'A', 'B', and 'C', respectively. We
59+ pass a data pointer as the first argument to each view constructor and then
6060the extent of each matrix dimension as the second and third arguments. There
6161are two extent arguments since we indicate two dimensions in the ``RAJA::Layout`` template
62- parameter list. The matrices are square and each extent is 'N'. Here, the
63- template parameters to ``RAJA::View `` are the array data type 'double' and
62+ parameter list. The matrices are square and each extent is 'N'. Here, the
63+ template parameters to ``RAJA::View `` are the array data type 'double' and
6464a ``RAJA::Layout `` type. Specifically::
6565
6666 RAJA::Layout<2, int>
6767
68- means that each View represents a two-dimensional default data layout, and
69- that we will use values of type 'int' to index into the arrays.
68+ means that each View represents a two-dimensional default data layout, and
69+ that we will use values of type 'int' to index into the arrays.
7070
71- Using the ``RAJA::View `` objects, we can access the data entries for the rows
71+ .. note :: A third argument in the Layout type can be used to specify the index
72+ with unit stride::
73+
74+ RAJA::Layout<2, int, 1>
75+
76+ In the example above, index 1 is marked as having unit stride, which makes
77+ multi-dimensional indexing more efficient by avoiding an unnecessary
78+ multiplication by ``1``.
79+
80+
81+
82+ Using the ``RAJA::View `` objects, we can access the data entries for the rows
7283and columns using a more natural, less error-prone syntax:
7384
7485.. literalinclude :: ../../../../exercises/view-layout_solution.cpp
@@ -79,9 +90,9 @@ and columns using a more natural, less error-prone syntax:
7990Default Layouts Use Row-major Ordering
8091^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8192
82- The default data layout ordering in RAJA is *row-major *, which is the
83- convention for multi-dimensional array indexing in C and C++. This means that
84- the rightmost index will be stride-1, the index to the left of the rightmost
93+ The default data layout ordering in RAJA is *row-major *, which is the
94+ convention for multi-dimensional array indexing in C and C++. This means that
95+ the rightmost index will be stride-1, the index to the left of the rightmost
8596index will have stride equal to the extent of the rightmost dimension, and
8697so on.
8798
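The row-major striding rule described above can be sketched in plain C++. This is only an illustration of the arithmetic, not RAJA's implementation; the helper names ``row_major_strides`` and ``linear_index`` are hypothetical and do not exist in RAJA.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of RAJA): compute row-major strides.
// The rightmost dimension gets stride 1; each dimension to its left gets
// a stride equal to the product of all extents to its right.
template <std::size_t N>
std::array<int, N> row_major_strides(const std::array<int, N>& extents)
{
  std::array<int, N> strides{};
  int stride = 1;
  for (std::size_t d = N; d-- > 0;) {
    assert(extents[d] > 0);  // extents are assumed to be positive
    strides[d] = stride;
    stride *= extents[d];
  }
  return strides;
}

// Hypothetical helper (not part of RAJA): linear offset of a
// multi-dimensional index, i.e. the sum of index * stride per dimension.
template <std::size_t N>
int linear_index(const std::array<int, N>& idx,
                 const std::array<int, N>& strides)
{
  int lin = 0;
  for (std::size_t d = 0; d < N; ++d) {
    lin += idx[d] * strides[d];
  }
  return lin;
}
```

For example, extents (3, 5, 2) give strides (10, 2, 1), so the index triple (1, 2, 0) maps to linear offset 1 * 10 + 2 * 2 + 0 * 1 = 14.
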
@@ -90,32 +101,32 @@ so on.
90101 see :ref: `feat-view-label ` for more details.
91102
92103To illustrate the default data layout striding, we next show simple
93- one-, two-, and three-dimensional examples where the for-loop ordering
94- for the different dimensions is such that all data access is stride-1. We
104+ one-, two-, and three-dimensional examples where the for-loop ordering
105+ for the different dimensions is such that all data access is stride-1. We
95106begin by defining some dimensions, then allocate and initialize arrays:
96107
97108.. literalinclude :: ../../../../exercises/view-layout_solution.cpp
98109 :start-after: _default_views_init_start
99110 :end-before: _default_views_init_end
100111 :language: C++
101112
102- The version of the array initialization kernel using a one-dimensional
113+ The version of the array initialization kernel using a one-dimensional
103114``RAJA::View `` is:
104115
105116.. literalinclude :: ../../../../exercises/view-layout_solution.cpp
106117 :start-after: _default_view1D_start
107118 :end-before: _default_view1D_end
108119 :language: C++
109120
110- The version of the array initialization using a two-dimensional
121+ The version of the array initialization using a two-dimensional
111122``RAJA::View `` is:
112123
113124.. literalinclude :: ../../../../exercises/view-layout_solution.cpp
114125 :start-after: _default_view2D_start
115126 :end-before: _default_view2D_end
116127 :language: C++
117128
118- The three-dimensional version is:
129+ The three-dimensional version is:
119130
120131.. literalinclude :: ../../../../exercises/view-layout_solution.cpp
121132 :start-after: _default_view3D_start
@@ -126,16 +137,16 @@ It's worth repeating that the data array access in all three variants shown
126137here using ``RAJA::View `` objects is stride-1 since we order the for-loops
127138in the loop nests to match the row-major ordering.
128139
129- RAJA Layout types support other data access patterns with different striding
130- orders, offsets, and permutations. To this point, we have used the default
131- Layout constructor. RAJA provides methods to generate Layouts for different
132- indexing patterns. We describe these in the next several sections. Next, we
140+ RAJA Layout types support other data access patterns with different striding
141+ orders, offsets, and permutations. To this point, we have used the default
142+ Layout constructor. RAJA provides methods to generate Layouts for different
143+ indexing patterns. We describe these in the next several sections. Next, we
133144show how to permute the data striding order using permuted Layouts.
134145
135146Permuted Layouts Change Data Striding Order
136147^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
137148
138- Every ``RAJA::Layout `` object has a permutation. When a permutation is not
149+ Every ``RAJA::Layout `` object has a permutation. When a permutation is not
139150specified at creation, a Layout will use the identity permutation. Here are
140151examples where the identity permutation is explicitly provided. First, in
141152two dimensions:
@@ -153,10 +164,10 @@ Then, in three dimensions:
153164 :language: C++
154165
155166These two examples access the data with stride-1 ordering, the same as in
156- the earlier examples, which is shown by the nested loop ordering.
167+ the earlier examples, which is shown by the nested loop ordering.
157168The identity permutation in two dimensions is '{0, 1}' and is '{0, 1, 2}'
158- for three dimensions. The method ``RAJA::make_permuted_layout `` is used to
159- create a ``RAJA::Layout `` object with a permutation. The method takes two
169+ for three dimensions. The method ``RAJA::make_permuted_layout `` is used to
170+ create a ``RAJA::Layout `` object with a permutation. The method takes two
160171arguments, the extents of each dimension and the permutation.
161172
162173.. note :: If a permuted Layout is created with the *identity permutation*
@@ -170,9 +181,9 @@ Next, we permute the striding order for the two-dimensional example:
170181 :language: C++
171182
172183Read from right to left, the permutation '{1, 0}' specifies that the first
173- (zero) index 'i' is stride-1 and the second index (one) 'j' has stride equal
174- to the extent of the first Layout dimension 'Nx'. This is evident in the
175- for-loop ordering.
184+ (zero) index 'i' is stride-1, which is also captured in the ``RAJA::Layout`` type,
185+ and the second index (one) 'j' has stride equal to the extent of the first
186+ Layout dimension 'Nx'. This is evident in the for-loop ordering.
176187
177188Here is the three-dimensional case, where we have reversed the striding order
178189using the permutation '{2, 1, 0}':
@@ -182,7 +193,16 @@ using the permutation '{2, 1, 0}':
182193 :end-before: _perma_view3D_end
183194 :language: C++
184195
185- The data access remains stride-1 due to the for-loop reordering. For fun,
196+ .. note :: As unit stride is now held by index 0, we adjust the Layout template
197+ argument accordingly::
198+
199+ RAJA::Layout<3, int, 0>
200+
201+ As before, index 0 is marked as having unit stride, which makes
202+ multi-dimensional indexing more efficient by avoiding an unnecessary
203+ multiplication by ``1``.
204+
205+ The data access remains stride-1 due to the for-loop reordering. For fun,
186206here is another three-dimensional permutation:
187207
188208.. literalinclude :: ../../../../exercises/view-layout_solution.cpp
@@ -197,8 +217,8 @@ Multi-dimensional Indices and Linear Indices
197217^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
198218
199219``RAJA::Layout `` types provide methods to convert between linear indices and
200- multi-dimensional indices and vice versa. Recall the Layout 'perm3a_layout'
201- from above that was created with the permutation '{2, 1, 0}'. To get the
220+ multi-dimensional indices and vice versa. Recall the Layout 'perm3a_layout'
221+ from above that was created with the permutation '{2, 1, 0}'. To get the
202222linear index corresponding to the index triple '(1, 2, 0)', you can do
203223this::
204224
@@ -210,36 +230,36 @@ for linear index 7, you can do::
210230 int i, j, k;
211231 perm3a_layout.toIndices(7, i, j, k);
212232
213- This sets 'i' to 1, 'j' to 2, and 'k' to 0.
233+ This sets 'i' to 1, 'j' to 2, and 'k' to 0.
214234
215- Similarly for the Layout 'permb_layout', which was created with the
235+ Similarly for the Layout 'perm3b_layout', which was created with the
216236permutation '{1, 2, 0}'::
217237
218- lin = perm3b_layout(1, 2, 0);
238+ lin = perm3b_layout(1, 2, 0);
219239
220240sets 'lin' to 13 = 1 + 0 * Nx + 2 * Nx * Nz and::
221241
222242 perm3b_layout.toIndices(13, i, j, k);
223243
224244sets 'i' to 1, 'j' to 2, and 'k' to 0.
225245
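The two conversions above can be reproduced with a small plain C++ sketch. It mimics the arithmetic only and is not RAJA's implementation; ``PermutedStrides``, ``make_permuted_strides``, ``to_linear``, and ``to_indices`` are hypothetical names. The permutation is read as in the text: its rightmost entry names the stride-1 index.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a permuted layout (not a RAJA type).
template <std::size_t N>
struct PermutedStrides {
  std::array<int, N> extents;
  std::array<int, N> strides;
};

// Walk the permutation from right to left: the rightmost entry names the
// stride-1 index, the next entry to the left has stride equal to that
// index's extent, and so on.
template <std::size_t N>
PermutedStrides<N> make_permuted_strides(const std::array<int, N>& extents,
                                         const std::array<int, N>& perm)
{
  PermutedStrides<N> p{extents, {}};
  int stride = 1;
  for (std::size_t k = N; k-- > 0;) {
    const int dim = perm[k];
    assert(dim >= 0 && dim < static_cast<int>(N));
    p.strides[dim] = stride;
    stride *= extents[dim];
  }
  return p;
}

// Multi-dimensional index -> linear index.
template <std::size_t N>
int to_linear(const PermutedStrides<N>& p, const std::array<int, N>& idx)
{
  int lin = 0;
  for (std::size_t d = 0; d < N; ++d) {
    lin += idx[d] * p.strides[d];
  }
  return lin;
}

// Linear index -> multi-dimensional index (inverse of to_linear for
// in-bounds values).
template <std::size_t N>
std::array<int, N> to_indices(const PermutedStrides<N>& p, int lin)
{
  std::array<int, N> idx{};
  for (std::size_t d = 0; d < N; ++d) {
    idx[d] = (lin / p.strides[d]) % p.extents[d];
  }
  return idx;
}
```

The values 7 and 13 quoted above imply Nx = 3 and Nz = 2. Taking Ny = 5 as an assumption, the permutation '{2, 1, 0}' maps (1, 2, 0) to 7 and the permutation '{1, 2, 0}' maps it to 13, and ``to_indices`` recovers (1, 2, 0) in both cases.
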
226- There are more examples in the exercise file associated with this section.
246+ There are more examples in the exercise file associated with this section.
227247Feel free to experiment with them.
228248
229249One important item to note is that, by default, there is no bounds checking
230250on indices passed to a ``RAJA::View `` data access method or ``RAJA::Layout ``
231- index computation methods. Therefore, it is the responsibility of a user
232- to ensure that indices passed to ``RAJA::View `` and ``RAJA::Layoout ``
233- methods are in bounds to avoid accessing data outside
234- of the View or computing invalid indices.
251+ index computation methods. Therefore, it is the responsibility of a user
252+ to ensure that indices passed to ``RAJA::View `` and ``RAJA::Layout ``
253+ methods are in bounds to avoid accessing data outside
254+ of the View or computing invalid indices.
235255
236- .. note :: RAJA provides a CMake variable ``RAJA_ENABLE_BOUNDS_CHECK`` to
256+ .. note :: RAJA provides a CMake variable ``RAJA_ENABLE_BOUNDS_CHECK`` to
237257 turn run time bounds checking on or off when the code is compiled.
238258 Enabling bounds checking is useful for debugging and to ensure
239259 your code is correct. However, when enabled, bounds checking adds
240260 noticeable run time overhead. So it should not be enabled for
241- a production build of your code.
242-
261+ a production build of your code.
262+
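To make the trade-off concrete, here is a plain C++ sketch of a two-dimensional, row-major container with an unchecked and a checked accessor. It is not how RAJA implements bounds checking; ``CheckedGrid2D`` is a hypothetical class used only to show the idea of a check that can be compiled out.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical 2D row-major container (not a RAJA type).
class CheckedGrid2D {
public:
  CheckedGrid2D(int rows, int cols)
    : rows_(rows), cols_(cols), data_(checked_size(rows, cols), 0.0) {}

  // Fast path: the assert is compiled out when NDEBUG is defined, which is
  // loosely analogous to building with bounds checking turned off.
  double& operator()(int i, int j)
  {
    assert(in_bounds(i, j));
    return data_[offset(i, j)];
  }

  // Always-checked path: pays for the comparison on every access.
  double& at(int i, int j)
  {
    if (!in_bounds(i, j)) {
      throw std::out_of_range("CheckedGrid2D index out of bounds");
    }
    return data_[offset(i, j)];
  }

private:
  static std::size_t checked_size(int rows, int cols)
  {
    if (rows <= 0 || cols <= 0) {
      throw std::invalid_argument("extents must be positive");
    }
    return static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols);
  }

  bool in_bounds(int i, int j) const
  {
    return i >= 0 && i < rows_ && j >= 0 && j < cols_;
  }

  // Row-major offset: 'j' is the stride-1 index.
  std::size_t offset(int i, int j) const
  {
    return static_cast<std::size_t>(i) * static_cast<std::size_t>(cols_) +
           static_cast<std::size_t>(j);
  }

  int rows_;
  int cols_;
  std::vector<double> data_;
};
```
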
243263Offset Layouts Apply Offsets to Indices
244264^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
245265
@@ -251,9 +271,9 @@ We first illustrate the concept of an offset with a C-style for-loop:
251271 :end-before: _cstyle_offlayout1D_end
252272 :language: C++
253273
254- Here, the for-loop runs from 'imin' to 'imax-1' (i.e., -5 to 5). To avoid
255- out-of-bounds negative indexing, we subtract 'imin' (i.e., -5) from the loop
256- index 'i'.
274+ Here, the for-loop runs from 'imin' to 'imax-1' (i.e., -5 to 5). To avoid
275+ out-of-bounds negative indexing, we subtract 'imin' (i.e., -5) from the loop
276+ index 'i'.
257277
258278To do the same thing with RAJA, we create a ``RAJA::OffsetLayout `` object
259279and use it to index into the array:
@@ -264,7 +284,7 @@ and use it to index into the array:
264284 :language: C++
265285
266286``RAJA::OffsetLayout `` is a different type than ``RAJA::Layout `` because
267- it contains offset information. The arguments to the
287+ it contains offset information. The arguments to the
268288``RAJA::make_offset_layout `` method are the index bounds.
269289
270290As expected, the two-dimensional case is similar. First, a C-style loop:
@@ -284,7 +304,7 @@ and then the same operation using a ``RAJA::OffsetLayout`` object:
284304Note that the first argument passed to ``RAJA::make_offset_layout `` contains
285305the lower bounds for 'i' and 'j' and the second argument contains the upper
286306bounds. Also, the 'j' index is stride-1 by default since we did not pass
287- a permutation to the ``RAJA::make_offset_layout `` method, which is the same
307+ a permutation to the ``RAJA::make_offset_layout `` method, which is the same
288308as the non-offset Layout usage.
289309
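The index shift that an offset layout performs can be sketched in plain C++ as follows. ``Offset1D`` and ``Offset2D`` are hypothetical helpers, not RAJA types, and the sketch assumes half-open index ranges [min, max), which matches the C-style loop running from 'imin' to 'imax-1'.

```cpp
#include <cassert>

// Hypothetical 1D offset indexer (not a RAJA type): maps i in
// [imin, imax) to a zero-based array position.
struct Offset1D {
  int imin;
  int imax;

  int operator()(int i) const
  {
    assert(i >= imin && i < imax);
    return i - imin;
  }
};

// Hypothetical 2D offset indexer with the default (row-major) ordering,
// so 'j' is the stride-1 index.
struct Offset2D {
  int imin;
  int imax;
  int jmin;
  int jmax;

  int operator()(int i, int j) const
  {
    assert(i >= imin && i < imax);
    assert(j >= jmin && j < jmax);
    return (i - imin) * (jmax - jmin) + (j - jmin);
  }
};
```

With 'imin' = -5 and 'imax' = 6, ``Offset1D`` maps -5 to position 0 and 5 to position 10, the same shift as subtracting 'imin' by hand in the C-style loop.
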
290310Just like ``RAJA::Layout `` has a permutation, so does ``RAJA::OffsetLayout ``.
@@ -293,11 +313,10 @@ Here is an example where we permute the (i, j) index stride ordering:
293313.. literalinclude :: ../../../../exercises/view-layout_solution.cpp
294314 :start-after: _raja_permofflayout2D_start
295315 :end-before: _raja_permofflayout2D_end
296- :language: C++
316+ :language: C++
297317
298- The permutation '{1, 0}' is passed as the third argument to
299- ``RAJA::make_offset_layout ``. From the ordering of the for-loops, we can see
300- that the 'i' index is stride-1 and the 'j' index has stride equal to the
301- extent of the 'i' dimension so the for-loop nest strides through
318+ The permutation '{1, 0}' is passed as the third argument to
319+ ``RAJA::make_offset_layout ``. From the ordering of the for-loops, we can see
320+ that the 'i' index is stride-1 and the 'j' index has stride equal to the
321+ extent of the 'i' dimension so the for-loop nest strides through
302322the data with unit stride.
303-