Commit 75a4cf7

Merge branch 'develop' into woptim/extended-radiuss-envs
2 parents adc7413 + 2fcd22e commit 75a4cf7

11 files changed: +165 -98 lines changed

.gitlab/jobs/poodle.yml

Lines changed: 2 additions & 3 deletions
@@ -29,13 +29,12 @@ gcc_10_3_1:
2929
SPEC: " ~shared +openmp +omptask ~vectorization +tests %gcc@=10.3.1 ${PROJECT_POODLE_DEPS}"
3030
extends: .job_on_poodle
3131

32-
# Known issue currently under investigation
32+
# custom variant
3333
# https://github.com/LLNL/RAJA/pull/1712#issuecomment-2292006843
3434
intel_2023_2_1:
3535
variables:
36-
SPEC: "${PROJECT_POODLE_VARIANTS} %intel@=2023.2.1 ${PROJECT_POODLE_DEPS}"
36+
SPEC: "${PROJECT_POODLE_VARIANTS} +lowopttest cxxflags==-fp-model=precise %intel@=2023.2.1 ${PROJECT_POODLE_DEPS}"
3737
extends: .job_on_poodle
38-
allow_failure: true
3938

4039
############
4140
# Extra jobs

.gitlab/jobs/ruby.yml

Lines changed: 2 additions & 3 deletions
@@ -29,13 +29,12 @@ gcc_10_3_1:
2929
SPEC: " ~shared +openmp +omptask ~vectorization +tests %gcc@=10.3.1 ${PROJECT_RUBY_DEPS}"
3030
extends: .job_on_ruby
3131

32-
# Known issue currently under investigation
32+
# custom variant
3333
# https://github.com/LLNL/RAJA/pull/1712#issuecomment-2292006843
3434
intel_2023_2_1:
3535
variables:
36-
SPEC: "${PROJECT_RUBY_VARIANTS} %intel@=2023.2.1 ${PROJECT_RUBY_DEPS}"
36+
SPEC: "${PROJECT_RUBY_VARIANTS} +lowopttest cxxflags==-fp-model=precise %intel@=2023.2.1 ${PROJECT_RUBY_DEPS}"
3737
extends: .job_on_ruby
38-
allow_failure: true
3938

4039
############
4140
# Extra jobs

docs/sphinx/user_guide/tutorial/view_layout.rst

Lines changed: 75 additions & 56 deletions
@@ -22,8 +22,8 @@ from the build directory.
2222

2323
Key RAJA features shown in this section are:
2424

25-
* ``RAJA::View``
26-
* ``RAJA::Layout`` and ``RAJA::OffsetLayout`` constructs
25+
* ``RAJA::View``
26+
* ``RAJA::Layout`` and ``RAJA::OffsetLayout`` constructs
2727
* Layout permutations
2828

2929
The examples in this section illustrate RAJA View and Layout concepts
@@ -40,11 +40,11 @@ operation, using :math:`N \times N` matrices:
4040
:end-before: _cstyle_matmult_end
4141
:language: C++
4242

43-
As is commonly done for efficiency in C and C++, we have allocated the data
44-
for the matrices as one-dimensional arrays. Thus, we need to manually compute
43+
As is commonly done for efficiency in C and C++, we have allocated the data
44+
for the matrices as one-dimensional arrays. Thus, we need to manually compute
4545
the data pointer offsets for the row and column indices in the kernel.
4646
Here, we use the array ``Cref`` to hold a reference solution matrix that
47-
we use to compare with results generated by the examples below.
47+
we use to compare with results generated by the examples below.
4848

4949
To simplify the multi-dimensional indexing, we can use ``RAJA::View`` objects,
5050
which we define as:
@@ -55,20 +55,31 @@ which we define as:
5555
:language: C++
5656

5757
Here we define three ``RAJA::View`` objects, 'Aview', 'Bview', and 'Cview',
58-
that *wrap* the array data pointers, 'A', 'B', and 'C', respectively. We
59-
pass a data pointer as the first argument to each view constructor and then
58+
that *wrap* the array data pointers, 'A', 'B', and 'C', respectively. We
59+
pass a data pointer as the first argument to each view constructor and then
6060
the extent of each matrix dimension as the second and third arguments. There
6161
are two extent arguments since we indicate two dimensions in the ``RAJA::Layout`` template
62-
parameter list. The matrices are square and each extent is 'N'. Here, the
63-
template parameters to ``RAJA::View`` are the array data type 'double' and
62+
parameter list. The matrices are square and each extent is 'N'. Here, the
63+
template parameters to ``RAJA::View`` are the array data type 'double' and
6464
a ``RAJA::Layout`` type. Specifically::
6565

6666
RAJA::Layout<2, int>
6767

68-
means that each View represents a two-dimensional default data layout, and
69-
that we will use values of type 'int' to index into the arrays.
68+
means that each View represents a two-dimensional default data layout, and
69+
that we will use values of type 'int' to index into the arrays.
7070

71-
Using the ``RAJA::View`` objects, we can access the data entries for the rows
71+
.. note:: A third argument in the Layout type can be used to specify the index
72+
with unit stride::
73+
74+
RAJA::Layout<2, int, 1>
75+
76+
In the example above, index 1 is marked as having unit stride, which makes
77+
multi-dimensional indexing more efficient by avoiding an unnecessary
78+
multiplication by ``1``.
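To make the unit-stride idea concrete, the following is a minimal sketch in plain C++ that does not use RAJA. The function name `linear_index_2d`, its stride parameters, and the use of `-1` to mean "no unit-stride dimension known at compile time" are assumptions made for illustration; this is not RAJA's implementation. It shows how naming the stride-one index at compile time lets the index computation skip one multiplication while producing the same result.

```cpp
// Hypothetical sketch (not RAJA code): a 2D linear index computation in which
// the template parameter names the dimension known to have unit stride.
// StrideOneDim == -1 is this sketch's convention for "not known".
template <int StrideOneDim>
constexpr int linear_index_2d(int i, int j, int stride_i, int stride_j)
{
  // For the unit-stride dimension the index is used directly, so the
  // multiplication by a stride of 1 is never performed.
  return (StrideOneDim == 0 ? i : i * stride_i) +
         (StrideOneDim == 1 ? j : j * stride_j);
}
```

For a row-major 4 x 4 array (strides 4 and 1), `linear_index_2d<1>(2, 3, 4, 1)` and `linear_index_2d<-1>(2, 3, 4, 1)` evaluate to the same offset; only the arithmetic performed differs.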
79+
80+
81+
82+
Using the ``RAJA::View`` objects, we can access the data entries for the rows
7283
and columns using a more natural, less error-prone syntax:
7384

7485
.. literalinclude:: ../../../../exercises/view-layout_solution.cpp
@@ -79,9 +90,9 @@ and columns using a more natural, less error-prone syntax:
7990
Default Layouts Use Row-major Ordering
8091
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8192

82-
The default data layout ordering in RAJA is *row-major*, which is the
83-
convention for multi-dimensional array indexing in C and C++. This means that
84-
the rightmost index will be stride-1, the index to the left of the rightmost
93+
The default data layout ordering in RAJA is *row-major*, which is the
94+
convention for multi-dimensional array indexing in C and C++. This means that
95+
the rightmost index will be stride-1, the index to the left of the rightmost
8596
index will have stride equal to the extent of the rightmost dimension, and
8697
so on.
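The striding rule above can be written out as a short sketch in plain C++ (no RAJA required). The function names `row_major_2d` and `row_major_3d` are hypothetical and exist only for this illustration, and the extents used in the note that follows are arbitrary.

```cpp
// Hypothetical helpers illustrating row-major striding.
// The rightmost index has stride 1; each index to its left has a stride equal
// to the product of the extents of all dimensions to its right.
constexpr int row_major_2d(int i, int j, int Ny)
{
  return i * Ny + j;
}

constexpr int row_major_3d(int i, int j, int k, int Ny, int Nz)
{
  return (i * Ny + j) * Nz + k;
}
```

With extents Ny = 5 and Nz = 2, stepping 'k' by one moves one element, stepping 'j' by one moves Nz = 2 elements, and stepping 'i' by one moves Ny * Nz = 10 elements.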
8798

@@ -90,32 +101,32 @@ so on.
90101
see :ref:`feat-view-label` for more details.
91102

92103
To illustrate the default data layout striding, we next show simple
93-
one-, two-, and three-dimensional examples where the for-loop ordering
94-
for the different dimensions is such that all data access is stride-1. We
104+
one-, two-, and three-dimensional examples where the for-loop ordering
105+
for the different dimensions is such that all data access is stride-1. We
95106
begin by defining some dimensions, then allocating and initializing arrays:
96107

97108
.. literalinclude:: ../../../../exercises/view-layout_solution.cpp
98109
:start-after: _default_views_init_start
99110
:end-before: _default_views_init_end
100111
:language: C++
101112

102-
The version of the array initialization kernel using a one-dimensional
113+
The version of the array initialization kernel using a one-dimensional
103114
``RAJA::View`` is:
104115

105116
.. literalinclude:: ../../../../exercises/view-layout_solution.cpp
106117
:start-after: _default_view1D_start
107118
:end-before: _default_view1D_end
108119
:language: C++
109120

110-
The version of the array initialization using a two-dimensional
121+
The version of the array initialization using a two-dimensional
111122
``RAJA::View`` is:
112123

113124
.. literalinclude:: ../../../../exercises/view-layout_solution.cpp
114125
:start-after: _default_view2D_start
115126
:end-before: _default_view2D_end
116127
:language: C++
117128

118-
The three-dimensional version is:
129+
The three-dimensional version is:
119130

120131
.. literalinclude:: ../../../../exercises/view-layout_solution.cpp
121132
:start-after: _default_view3D_start
@@ -126,16 +137,16 @@ It's worth repeating that the data array access in all three variants shown
126137
here using ``RAJA::View`` objects is stride-1 since we order the for-loops
127138
in the loop nests to match the row-major ordering.
128139

129-
RAJA Layout types support other data access patterns with different striding
130-
orders, offsets, and permutations. To this point, we have used the default
131-
Layout constructor. RAJA provides methods to generate Layouts for different
132-
indexing patterns. We describe these in the next several sections. Next, we
140+
RAJA Layout types support other data access patterns with different striding
141+
orders, offsets, and permutations. To this point, we have used the default
142+
Layout constructor. RAJA provides methods to generate Layouts for different
143+
indexing patterns. We describe these in the next several sections. Next, we
133144
show how to permute the data striding order using permuted Layouts.
134145

135146
Permuted Layouts Change Data Striding Order
136147
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
137148

138-
Every ``RAJA::Layout`` object has a permutation. When a permutation is not
149+
Every ``RAJA::Layout`` object has a permutation. When a permutation is not
139150
specified at creation, a Layout will use the identity permutation. Here are
140151
examples where the identity permutation is explicitly provided. First, in
141152
two dimensions:
@@ -153,10 +164,10 @@ Then, in three dimensions:
153164
:language: C++
154165

155166
These two examples access the data with stride-1 ordering, the same as in
156-
the earlier examples, which is shown by the nested loop ordering.
167+
the earlier examples, which is shown by the nested loop ordering.
157168
The identity permutation in two dimensions is '{0, 1}' and is '{0, 1, 2}'
158-
for three dimensions. The method ``RAJA::make_permuted_layout`` is used to
159-
create a ``RAJA::Layout`` object with a permutation. The method takes two
169+
for three dimensions. The method ``RAJA::make_permuted_layout`` is used to
170+
create a ``RAJA::Layout`` object with a permutation. The method takes two
160171
arguments, the extents of each dimension and the permutation.
161172

162173
.. note:: If a permuted Layout is created with the *identity permutation*
@@ -170,9 +181,9 @@ Next, we permute the striding order for the two-dimensional example:
170181
:language: C++
171182

172183
Read from right to left, the permutation '{1, 0}' specifies that the first
173-
(zero) index 'i' is stride-1 and the second index (one) 'j' has stride equal
174-
to the extent of the first Layout dimension 'Nx'. This is evident in the
175-
for-loop ordering.
184+
(zero) index 'i' is stride-1, which is also captured in the ``RAJA::Layout`` type,
185+
and the second index (one) 'j' has stride equal to the extent of the first
186+
Layout dimension 'Nx'. This is evident in the for-loop ordering.
176187

177188
Here is the three-dimensional case, where we have reversed the striding order
178189
using the permutation '{2, 1, 0}':
@@ -182,7 +193,16 @@ using the permutation '{2, 1, 0}':
182193
:end-before: _perma_view3D_end
183194
:language: C++
184195

185-
The data access remains stride-1 due to the for-loop reordering. For fun,
196+
.. note:: As unit stride is now held by index 0, we adjust the Layout template
197+
argument accordingly::
198+
199+
RAJA::Layout<3, int, 0>
200+
201+
As before, index 0 is marked as having unit stride, which makes
202+
multi-dimensional indexing more efficient by avoiding an unnecessary
203+
multiplication by ``1``.
204+
205+
The data access remains stride-1 due to the for-loop reordering. For fun,
186206
here is another three-dimensional permutation:
187207

188208
.. literalinclude:: ../../../../exercises/view-layout_solution.cpp
@@ -197,8 +217,8 @@ Multi-dimensional Indices and Linear Indices
197217
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
198218

199219
``RAJA::Layout`` types provide methods to convert between linear indices and
200-
multi-dimensional indices and vice versa. Recall the Layout 'perm3a_layout'
201-
from above that was created with the permutation '{2, 1, 0}'. To get the
220+
multi-dimensional indices. Recall the Layout 'perm3a_layout'
221+
from above that was created with the permutation '{2, 1, 0}'. To get the
202222
linear index corresponding to the index triple '(1, 2, 0)', you can do
203223
this::
204224

@@ -210,36 +230,36 @@ for linear index 7, you can do::
210230
int i, j, k;
211231
perm3a_layout.toIndices(7, i, j, k);
212232

213-
This sets 'i' to 1, 'j' to 2, and 'k' to 0.
233+
This sets 'i' to 1, 'j' to 2, and 'k' to 0.
214234

215-
Similarly for the Layout 'permb_layout', which was created with the
235+
Similarly for the Layout 'perm3b_layout', which was created with the
216236
permutation '{1, 2, 0}'::
217237

218-
lin = perm3b_layout(1, 2, 0);
238+
lin = perm3b_layout(1, 2, 0);
219239

220240
sets 'lin' to 13 = 1 + 0 * Nx + 2 * Nx * Nz and::
221241

222242
perm3b_layout.toIndices(13, i, j, k);
223243

224244
sets 'i' to 1, 'j' to 2, and 'k' to 0.
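The arithmetic behind these conversions can be checked with a small sketch in plain C++ that mimics the permutation '{1, 2, 0}' by hand rather than calling RAJA. The function names are hypothetical. The extents Nx = 3 and Nz = 2 are inferred from the values quoted above (linear index 7 for '(1, 2, 0)' under '{2, 1, 0}', and 13 under '{1, 2, 0}'); they are an assumption of this sketch rather than something the text states directly.

```cpp
// Hypothetical sketch of the index arithmetic for permutation {1, 2, 0}:
// 'i' has stride 1, 'k' has stride Nx, and 'j' has stride Nx * Nz.
constexpr int Nx = 3;  // assumed extent, inferred from the quoted values
constexpr int Nz = 2;  // assumed extent, inferred from the quoted values

// Forward mapping: index triple to linear index.
constexpr int perm3b_linear(int i, int j, int k)
{
  return i + k * Nx + j * Nx * Nz;
}

// Inverse mapping: recover each index from a linear index.
constexpr int perm3b_i(int lin) { return lin % Nx; }
constexpr int perm3b_k(int lin) { return (lin / Nx) % Nz; }
constexpr int perm3b_j(int lin) { return lin / (Nx * Nz); }
```

Under these assumptions, `perm3b_linear(1, 2, 0)` reproduces the value 13 = 1 + 0 * Nx + 2 * Nx * Nz, and the three inverse helpers applied to 13 recover 'i' = 1, 'j' = 2, and 'k' = 0.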
225245

226-
There are more examples in the exercise file associated with this section.
246+
There are more examples in the exercise file associated with this section.
227247
Feel free to experiment with them.
228248

229249
One important item to note is that, by default, there is no bounds checking
230250
on indices passed to a ``RAJA::View`` data access method or ``RAJA::Layout``
231-
index computation methods. Therefore, it is the responsibility of a user
232-
to ensure that indices passed to ``RAJA::View`` and ``RAJA::Layoout``
233-
methods are in bounds to avoid accessing data outside
234-
of the View or computing invalid indices.
251+
index computation methods. Therefore, it is the responsibility of a user
252+
to ensure that indices passed to ``RAJA::View`` and ``RAJA::Layout``
253+
methods are in bounds to avoid accessing data outside
254+
of the View or computing invalid indices.
235255

236-
.. note:: RAJA provides a CMake variable ``RAJA_ENABLE_BOUNDS_CHECK`` to
256+
.. note:: RAJA provides a CMake variable ``RAJA_ENABLE_BOUNDS_CHECK`` to
237257
turn run time bounds checking on or off when the code is compiled.
238258
Enabling bounds checking is useful for debugging and to ensure
239259
your code is correct. However, when enabled, bounds checking adds
240260
noticeable run time overhead. So it should not be enabled for
241-
a production build of your code.
242-
261+
a production build of your code.
262+
243263
Offset Layouts Apply Offsets to Indices
244264
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
245265

@@ -251,9 +271,9 @@ We first illustrate the concept of an offset with a C-style for-loop:
251271
:end-before: _cstyle_offlayout1D_end
252272
:language: C++
253273

254-
Here, the for-loop runs from 'imin' to 'imax-1' (i.e., -5 to 5). To avoid
255-
out-of-bounds negative indexing, we subtract 'imin' (i.e., -5) from the loop
256-
index 'i'.
274+
Here, the for-loop runs from 'imin' to 'imax-1' (i.e., -5 to 5). To avoid
275+
out-of-bounds negative indexing, we subtract 'imin' (i.e., -5) from the loop
276+
index 'i'.
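As a rough illustration of the offset arithmetic, here is a self-contained C++ sketch. The helper name `offset_index` and the bounds `imin = -5` and `imax = 6` are assumptions chosen to match the range quoted above; the tutorial's actual loop lives in the exercise file.

```cpp
// Hypothetical helper: map a loop index in [imin, imax) to a zero-based
// array position by subtracting the lower bound.
constexpr int imin = -5;  // assumed lower bound
constexpr int imax = 6;   // assumed upper bound (exclusive)

constexpr int offset_index(int i)
{
  return i - imin;
}
```

An array of `imax - imin` entries can then be filled with a loop such as `for (int i = imin; i < imax; ++i) { arr[offset_index(i)] = i; }` without ever using a negative array position.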
257277

258278
To do the same thing with RAJA, we create a ``RAJA::OffsetLayout`` object
259279
and use it to index into the array:
@@ -264,7 +284,7 @@ and use it to index into the array:
264284
:language: C++
265285

266286
``RAJA::OffsetLayout`` is a different type than ``RAJA::Layout`` because
267-
it contains offset information. The arguments to the
287+
it contains offset information. The arguments to the
268288
``RAJA::make_offset_layout`` method are the index bounds.
269289

270290
As expected, the two-dimensional case is similar. First, a C-style loop:
@@ -284,7 +304,7 @@ and then the same operation using a ``RAJA::OffsetLayout`` object:
284304
Note that the first argument passed to ``RAJA::make_offset_layout`` contains
285305
the lower bounds for 'i' and 'j' and the second argument contains the upper
286306
bounds. Also, the 'j' index is stride-1 by default since we did not pass
287-
a permutation to the ``RAJA::make_offset_layout`` method, which is the same
307+
a permutation to the ``RAJA::make_offset_layout`` method, which is the same
288308
as the non-offset Layout usage.
289309

290310
Just like ``RAJA::Layout`` has a permutation, so does ``RAJA::OffsetLayout``.
@@ -293,11 +313,10 @@ Here is an example where we permute the (i, j) index stride ordering:
293313
.. literalinclude:: ../../../../exercises/view-layout_solution.cpp
294314
:start-after: _raja_permofflayout2D_start
295315
:end-before: _raja_permofflayout2D_end
296-
:language: C++
316+
:language: C++
297317

298-
The permutation '{1, 0}' is passed as the third argument to
299-
``RAJA::make_offset_layout``. From the ordering of the for-loops, we can see
300-
that the 'i' index is stride-1 and the 'j' index has stride equal to the
301-
extent of the 'i' dimension so the for-loop nest strides through
318+
The permutation '{1, 0}' is passed as the third argument to
319+
``RAJA::make_offset_layout``. From the ordering of the for-loops, we can see
320+
that the 'i' index is stride-1 and the 'j' index has stride equal to the
321+
extent of the 'i' dimension so the for-loop nest strides through
302322
the data with unit stride.
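The permuted offset case can also be sketched by hand. The following plain C++ fragment is an illustration only: the function names, the bounds, and the half-open convention for the upper bounds are all assumptions of this sketch and are not taken from RAJA.

```cpp
// Hypothetical sketch: 2D offset indexing with 'i' as the stride-1 index.
// Bounds are treated as half-open ranges [imin, imax) and [jmin, jmax),
// which is a convention chosen for this example.
constexpr int imin = -1, imax = 3;  // assumed bounds for 'i' (extent 4)
constexpr int jmin = -2, jmax = 1;  // assumed bounds for 'j' (extent 3)

// Total number of entries covered by the two index ranges.
constexpr int total_size()
{
  return (imax - imin) * (jmax - jmin);
}

constexpr int perm_offset_linear(int i, int j)
{
  // Shift each index by its lower bound, then apply the permuted strides:
  // 'i' has stride 1 and 'j' has stride equal to the extent of 'i'.
  return (i - imin) + (j - jmin) * (imax - imin);
}
```

Looping with 'j' outermost and 'i' innermost visits the linear positions in increasing order, which is the unit-stride traversal described above.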
303-

exercises/view-layout.cpp

Lines changed: 9 additions & 9 deletions
@@ -105,9 +105,9 @@ int main(int RAJA_UNUSED_ARG(argc), char **RAJA_UNUSED_ARG(argv[]))
105105
// Note: we use default Layout
106106
//
107107
// _matmult_views_start
108-
RAJA::View< double, RAJA::Layout<2, int> > Aview(A, N, N);
109-
RAJA::View< double, RAJA::Layout<2, int> > Bview(B, N, N);
110-
RAJA::View< double, RAJA::Layout<2, int> > Cview(C, N, N);
108+
RAJA::View< double, RAJA::Layout<2, int, 1> > Aview(A, N, N);
109+
RAJA::View< double, RAJA::Layout<2, int, 1> > Bview(B, N, N);
110+
RAJA::View< double, RAJA::Layout<2, int, 1> > Cview(C, N, N);
111111
// _matmult_views_end
112112

113113
// _cstyle_matmult_views_start
@@ -165,7 +165,7 @@ int main(int RAJA_UNUSED_ARG(argc), char **RAJA_UNUSED_ARG(argv[]))
165165
std::memset(a, 0, Ntot * sizeof(int));
166166

167167
// _default_view1D_start
168-
RAJA::View< int, RAJA::Layout<1, int> > view_1D(a, Ntot);
168+
RAJA::View< int, RAJA::Layout<1, int, 0> > view_1D(a, Ntot);
169169

170170
for (int i = 0; i < Ntot; ++i) {
171171
view_1D(i) = i;
@@ -182,7 +182,7 @@ int main(int RAJA_UNUSED_ARG(argc), char **RAJA_UNUSED_ARG(argv[]))
182182
std::memset(a, 0, Ntot * sizeof(int));
183183

184184
// _default_view2D_start
185-
RAJA::View< int, RAJA::Layout<2, int> > view_2D(a, Nx, Ny);
185+
RAJA::View< int, RAJA::Layout<2, int, 1> > view_2D(a, Nx, Ny);
186186

187187
int iter{0};
188188
for (int i = 0; i < Nx; ++i) {
@@ -229,9 +229,9 @@ int main(int RAJA_UNUSED_ARG(argc), char **RAJA_UNUSED_ARG(argv[]))
229229

230230
// _default_perm_view2D_start
231231
std::array<RAJA::idx_t, 2> defperm2 {{0, 1}};
232-
RAJA::Layout< 2, int > defperm2_layout =
232+
RAJA::Layout< 2, int> defperm2_layout =
233233
RAJA::make_permuted_layout( {{Nx, Ny}}, defperm2);
234-
RAJA::View< int, RAJA::Layout<2, int> > defperm_view_2D(a, defperm2_layout);
234+
RAJA::View< int, RAJA::Layout<2, int, 1> > defperm_view_2D(a, defperm2_layout);
235235

236236
iter = 0;
237237
for (int i = 0; i < Nx; ++i) {
@@ -272,7 +272,7 @@ int main(int RAJA_UNUSED_ARG(argc), char **RAJA_UNUSED_ARG(argv[]))
272272
std::array<RAJA::idx_t, 2> perm2 {{1, 0}};
273273
RAJA::Layout< 2, int > perm2_layout =
274274
RAJA::make_permuted_layout( {{Nx, Ny}}, perm2);
275-
RAJA::View< int, RAJA::Layout<2, int> > perm_view_2D(a, perm2_layout);
275+
RAJA::View< int, RAJA::Layout<2, int, 0> > perm_view_2D(a, perm2_layout);
276276

277277
iter = 0;
278278
for (int j = 0; j < Ny; ++j) {
@@ -318,7 +318,7 @@ int main(int RAJA_UNUSED_ARG(argc), char **RAJA_UNUSED_ARG(argv[]))
318318
std::array<RAJA::idx_t, 3> perm3b {{1, 2, 0}};
319319
RAJA::Layout< 3, int > perm3b_layout =
320320
RAJA::make_permuted_layout( {{Nx, Ny, Nz}}, perm3b);
321-
RAJA::View< int, RAJA::Layout<3, int> > perm3b_view_3D(a, perm3b_layout);
321+
RAJA::View< int, RAJA::Layout<3, int, 0> > perm3b_view_3D(a, perm3b_layout);
322322

323323
iter = 0;
324324
for (int j = 0; j < Ny; ++j) {
