llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazu Hirata	192d9dd731	[mlir] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 19:58:32 -08:00
Kazu Hirata	1a36588ec6	[mlir] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-03 18:50:27 -08:00
Nicolas Vasilache	3af6438372	Revert "[WIP] Add support for MMA conversion for 1-D vector.transfer followed by a broadcast to 2-D" This reverts commit `7db25f78db`. This was mistakenly stacked below (and committed) along with an NFC change.	2022-12-01 02:57:03 -08:00
Nicolas Vasilache	7db25f78db	[WIP] Add support for MMA conversion for 1-D vector.transfer followed by a broadcast to 2-D Differential Revision: https://reviews.llvm.org/D139040	2022-12-01 02:49:47 -08:00
Quinn Dawkins	c0321edc26	[mlir][gpu] Adding support for transposed mma_load_matrix Enables transposed gpu.subgroup_mma_load_matrix and updates the lowerings in Vector to GPU and GPU to SPIRV. Needed to enable B transpose matmuls lowering to wmma ops. Taken over from author: stanley-nod <stanley@nod-labs.com> Reviewed By: ThomasRaoux, antiagainst Differential Revision: https://reviews.llvm.org/D138770	2022-11-29 03:35:49 +00:00
Ramkumar Ramachandra	d32ec5232c	mlir/VectorToGPU: use std::optional (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: See also: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716 Signed-off-by: Ramkumar Ramachandra <r@artagnon.com>	2022-11-27 13:32:18 -08:00
Aliia Khasanova	399638f98c	Merge kDynamicSize and kDynamicSentinel into one constant. resolve conflicts Differential Revision: https://reviews.llvm.org/D138282	2022-11-21 13:01:26 +00:00
Mehdi Amini	6a7a1188d3	Apply clang-tidy fixes for llvm-else-after-return in VectorToGPU.cpp (NFC)	2022-11-06 20:15:00 +00:00
Manish Gupta	114ba722c1	[mlir][NVGPU] Handle native mma.sync and ldmatrix(x4) sizes This patch handles native `mma.sync` sizes and enables issuing `ldmatrix` on the largest possible tiles for matrixB. It requires handling `vector.extract_strided_slice` from vector to nvgpu lowering. Differential Revision: https://reviews.llvm.org/D135749	2022-10-19 17:10:21 -07:00
Christopher Bate	ea2ed80e6d	[mlir][nvgpu] NFC - move NVGPU conversion helpers to NvGpu utils library The ConvertVectorToGpu pass implementation contained a small private support library for performing various calculations during conversion between `vector` and `nvgpu.mma.sync` and `nvgpu.ldmatrix` operations. The support library is moved under `Dialect/NVGPU/Utils` because the functions have wider utility. Some documentation comments are added or improved. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D135303	2022-10-05 20:21:27 -06:00
Jakub Kuderski	abc362a107	[mlir][arith] Change dialect name from Arithmetic to Arith Suggested by @lattner in https://discourse.llvm.org/t/rfc-define-precise-arith-semantics/65507/22. Tested with: `ninja check-mlir check-mlir-integration check-mlir-mlir-spirv-cpu-runner check-mlir-mlir-vulkan-runner check-mlir-examples` and `bazel build --config=generic_clang @llvm-project//mlir:all`. Reviewed By: lattner, Mogball, rriddle, jpienaar, mehdi_amini Differential Revision: https://reviews.llvm.org/D134762	2022-09-29 11:23:28 -04:00
Kazu Hirata	be650de57d	[mlir] Use empty (NFC)	2022-09-18 17:46:53 -07:00
Oleg Shyshkov	4758e916e1	[mlir] Change IteratorType in ContractionOp in Vector dialect from string to enum. This is the first step in replacing iterator_type from strings with enums in Vector and Linalg dialect. This change adds IteratorTypeAttr and uses it in ContractionOp. To avoid breaking all the tests, print/parse code has conversion between string and enum for now. There is shared code in StructuredOpsUtils.h that expects iterator types to be strings. To break this dependency, this change forks the helper functions `isParallelIterator` and `isReductionIterator` to utils in both dialects and adds `getIteratorTypeNames()` to support backward compatibility with StructuredGenerator. In later changes, I plan to add a similar enum attribute to Linalg. Differential Revision: https://reviews.llvm.org/D133696	2022-09-12 16:59:34 +02:00
Michele Scuttari	67d0d7ac0a	[MLIR] Update pass declarations to new autogenerated files The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure. Reviewed By: mehdi_amini, rriddle Differential Review: https://reviews.llvm.org/D132838	2022-08-31 12:28:45 +02:00
Michele Scuttari	039b969b32	Revert "[MLIR] Update pass declarations to new autogenerated files" This reverts commit `2be8af8f0e`.	2022-08-30 22:21:55 +02:00
Michele Scuttari	2be8af8f0e	[MLIR] Update pass declarations to new autogenerated files The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure. Reviewed By: mehdi_amini, rriddle Differential Review: https://reviews.llvm.org/D132838	2022-08-30 21:56:31 +02:00
Manish Gupta	14d79afeae	[mlir][NVGPU] nvgpu.mmasync on F32 through TF32 Adds optional attribute to support tensor cores on F32 datatype by lowering to `mma.sync` with TF32 operands. Since TF32 is not a native datatype in LLVM, we are adding `tf32Enabled` as an attribute to allow the IR to be aware of `MmaSyncOp` datatype. Additionally, this patch adds placeholders for nvgpu-to-nvgpu transformation targeting higher precision tf32x3. For mma.sync on f32 input using tensor cores there are two possibilities: (a) tf32 (1 `mma.sync` per warp-level matrix-multiply-accumulate) (b) tf32x3 (3 `mma.sync` per warp-level matrix-multiply-accumulate) Typically, tf32 tensor core acceleration comes at a cost of accuracy from missing precision bits. While f32 has 23 precision bits, tf32 has only 10 precision bits. tf32x3 aims to recover the precision bits by splitting each operand into two tf32 values and issuing three `mma.sync` tensor core operations. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D130294	2022-08-01 23:23:27 +00:00
Jeff Niu	e179532284	[mlir] Remove types from attributes This patch removes the `type` field from `Attribute` along with the `Attribute::getType` accessor. Going forward, this means that attributes in MLIR will no longer have types as a first-class concept. This patch lays the groundwork to incrementally remove or refactor code that relies on generic attributes being typed. The immediate impact will be on attributes that rely on `Attribute` containing a type, such as `IntegerAttr`, `DenseElementsAttr`, and `ml_program::ExternAttr`, which will now need to define a type parameter on their storage classes. This will save memory as all other attribute kinds will no longer contain a type. Moreover, it will not be possible to generically query the type of an attribute directly. This patch provides an attribute interface `TypedAttr` that implements only one method, `getType`, which can be used to generically query the types of attributes that implement the interface. This interface can be used to retain the concept of a "typed attribute". The ODS-generated accessor for a `type` parameter automatically implements this method. Next steps will be to refactor the assembly formats of certain operations that rely on `parseAttribute(type)` and `printAttributeWithoutType` to remove special handling of type elision until `type` can be removed from the dialect parsing hook entirely; and incrementally remove uses of `TypedAttr`. Reviewed By: lattner, rriddle, jpienaar Differential Revision: https://reviews.llvm.org/D130092	2022-07-31 20:01:31 -04:00
Jacques Pienaar	d2c0572b2e	[mlir] Flip LinAlg dialect to _Both This one required more changes than ideal due to overlapping generated name with different return types. Changed getIndexingMaps to getIndexingMapsArray to move it out of the way/highlight that it returns (more expensively) a SmallVector and uses the prefixed name for the Attribute. Differential Revision: https://reviews.llvm.org/D129919	2022-07-19 14:42:58 -07:00
Christopher Bate	670eee08ce	[mlir][VectorToGPU] Fix support for i4, col-major operand support For the conversion to nvgpu `mma.sync` and `ldmatrix` pathways, the code was missing support for the `i4` data type. While fixing this, another bug was discovered that caused the number of ldmatrix tiles calculated for certain operand types and configurations to be incorrect. This change fixes both issues and adds additional tests. Differential Revision: https://reviews.llvm.org/D128074	2022-06-30 10:26:59 -06:00
Kazu Hirata	064a08cd95	Don't use Optional::hasValue (NFC)	2022-06-20 20:05:16 -07:00
Alex Zinenko	8b68da2c7d	[mlir] move SCF headers to SCF/{IR,Transforms} respectively This aligns the SCF dialect file layout with the majority of the dialects. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D128049	2022-06-20 10:18:01 +02:00
Christopher Bate	51b925df94	[mlir][nvgpu] shared memory access optimization pass This change adds a transformation and pass to the NvGPU dialect that attempts to optimize reads/writes from a memref representing GPU shared memory in order to avoid bank conflicts. Given a value representing a shared memory memref, it traverses all reads/writes within the parent op and, subject to suitable conditions, rewrites all last dimension index values such that element locations in the final (col) dimension are given by `newColIdx = col % vecSize + perm[row](col/vecSize,row)` where `perm` is a permutation function indexed by `row` and `vecSize` is the vector access size in elements (currently assumes 128bit vectorized accesses, but this can be made a parameter). This specific transformation can help optimize typical distributed & vectorized accesses common to loading matrix multiplication operands to/from shared memory. Differential Revision: https://reviews.llvm.org/D127457	2022-06-17 09:31:05 -06:00
Mogball	d7ef488bb6	[mlir][gpu] Move GPU headers into IR/ and Transforms/ Depends on D127350 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D127352	2022-06-09 22:49:03 +00:00
Thomas Raoux	271a48e029	[mlir][VectorToGPU] Fix bug generating incorrect ldmatrix ops ldmatrix transpose can only be used with types that are 16bits wide. Differential Revision: https://reviews.llvm.org/D126846	2022-06-03 04:30:22 +00:00
Christopher Bate	1ca772ed95	[MLIR][GPU] Add NvGpu mma.sync path to the VectorToGPU pass This change adds the option to lower to NvGpu dialect ops during the VectorToGPU conversion pass. Because this transformation reuses existing VectorToGPU logic, a separate VectorToNvGpu conversion pass is not created. The option `use-nvgpu` is added to the VectorToGPU pass. When this is true, the pass will attempt to convert slices rooted at `vector.contract` operations into `nvgpu.mma.sync` ops, and `vector.transfer_read` ops are converted to either `nvgpu.ldmatrix` or one or more `vector.load` operations. The specific data loaded will depend on the thread id within a subgroup (warp). These index calculations depend on data type and shape of the MMA op according to the downstream PTX specification. The code for supporting these details is separated into `NvGpuSupport.cpp|h`. Differential Revision: https://reviews.llvm.org/D122940	2022-05-20 09:42:55 -06:00
Jacques Pienaar	7c38fd605b	[mlir] Flip Vector dialect accessors used to prefixed form. This has been on _Both for a couple of weeks. Flip usages in core with intention to flip flag to _Prefixed in follow up. Needed to add a couple of helper methods in AffineOps and Linalg to facilitate a pure flag flip in follow up as some of these classes are used in templates and so sensitive to Vector dialect changes. Differential Revision: https://reviews.llvm.org/D122151	2022-03-28 11:24:47 -07:00
Thomas Raoux	d77f483640	[mlir][gpu] Relax restriction on mma load/store op Those ops can support more complex layouts as long as the innermost dimension is contiguous. Differential Revision: https://reviews.llvm.org/D122452	2022-03-25 04:03:40 +00:00
River Riddle	47f175b09b	[mlir] Update FuncOp conversion passes to Pass/InterfacePass<FunctionOpInterface> These passes generally don't rely on any special aspects of FuncOp, and moving allows for these passes to be used in many more situations. The passes that obviously weren't relying on invariants guaranteed by a "function" were updated to be generic passes, the rest were updated to be FunctionOpInterface InterfacePasses. The test updates are NFC switching from implicit nesting (-pass -pass2) form to the -pass-pipeline form (generic passes do not implicitly nest as op-specific passes do). Differential Revision: https://reviews.llvm.org/D121190	2022-03-08 12:25:32 -08:00
Matthias Springer	99ef9eebad	[mlir][vector][NFC] Split into IR, Transforms and Utils This reduces the dependencies of the MLIRVector target and makes the dialect consistent with other dialects. Differential Revision: https://reviews.llvm.org/D118533	2022-01-31 19:17:09 +09:00
Thomas Raoux	a57ccad5a6	[VectorToGPU] Fix horizontal stride calculation for N-D memref Fix a bug in how we calculate the stride of mma load/store ops for N-D memrefs Differential Revision: https://reviews.llvm.org/D118378	2022-01-27 13:35:56 -08:00
River Riddle	e084679f96	[mlir] Make locations required when adding/creating block arguments BlockArguments gained the ability to have locations attached a while ago, but they have always been optional. This goes against the core tenet of MLIR where location information is a requirement, so this commit updates the API to require locations. Fixes #53279 Differential Revision: https://reviews.llvm.org/D117633	2022-01-19 17:35:35 -08:00
River Riddle	4157455425	[mlir][Pass] Deprecate FunctionPass in favor of OperationPass<FuncOp> The only benefit of FunctionPass is that it filters out function declarations. This isn't enough to justify carrying it around, as we can simply filter out declarations when necessary within the pass. We can also explore better scheduling primitives to filter out declarations at the pipeline level in the future. The definition of FunctionPass is left intact for now to allow time for downstream users to migrate. Differential Revision: https://reviews.llvm.org/D117182	2022-01-18 19:52:44 -08:00
Mehdi Amini	e4853be2f1	Apply clang-tidy fixes for performance-for-range-copy to MLIR (NFC)	2022-01-02 22:19:56 +00:00
Mehdi Amini	6786d7e4f5	Apply clang-tidy fixes for readability-simplify-boolean-expr to MLIR (NFC) Reviewed By: rriddle, Mogball Differential Revision: https://reviews.llvm.org/D116253	2022-01-02 01:59:31 +00:00
Jacques Pienaar	c0342a2de8	[mlir] Switching accessors to prefixed form (NFC) Makes eventual prefixing flag flip smaller change.	2021-12-20 08:03:43 -08:00
Nicolas Vasilache	c537a94334	[mlir][Vector] Thread 0-d vectors through vector.transfer ops This revision adds 0-d vector support to vector.transfer ops. In the process, numerous cleanups are applied, in particular around normalizing and reducing the number of builders. Reviewed By: ThomasRaoux, springerm Differential Revision: https://reviews.llvm.org/D114803	2021-12-01 16:49:43 +00:00
Alexander Belyaev	9b1d90e8ac	[mlir] Move min/max ops from Std to Arith. Differential Revision: https://reviews.llvm.org/D113881	2021-11-15 13:19:17 +01:00
Thomas Raoux	e7969240dc	[mlir][VectorToGPU] Support more cases in conversion to MMA ops Support load with broadcast, elementwise divf op and remove the hardcoded restriction on the vector size. Picking the right size should be enforced by the user; conversion to llvm/spirv will fail if the size is not supported. Differential Revision: https://reviews.llvm.org/D113618	2021-11-11 13:10:38 -08:00
River Riddle	937e40a8cf	[mlir] Remove the non-templated DenseElementsAttr::getSplatValue This predates the templated variant, and has been simply forwarding to getSplatValue<Attribute> for some time. Removing this makes the API a bit more uniform, and also helps prevent users from thinking it is "cheap".	2021-11-09 01:40:40 +00:00
thomasraoux	7fbb0678fa	[mlir][VectorToGPU] Add support for elementwise mma to vector to GPU Differential Revision: https://reviews.llvm.org/D112960	2021-11-02 08:01:04 -07:00
Jacques Pienaar	cfb72fd3a0	[mlir] Switch arith, llvm, std & shape dialects to accessors prefixed both form. Following https://llvm.discourse.group/t/psa-ods-generated-accessors-will-change-to-have-a-get-prefix-update-you-apis/4476, this follows flipping these dialects to _Both prefixed form. This changes the accessors to have a prefix. This was possible mostly without breaking changes if the existing convenience methods were used. (https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp was used to migrate the callers post flipping, using the output from Operator.cpp) Differential Revision: https://reviews.llvm.org/D112383	2021-10-24 18:36:33 -07:00
Mogball	a54f4eae0e	[MLIR] Replace std ops with arith dialect ops Precursor: https://reviews.llvm.org/D110200 Removed redundant ops from the standard dialect that were moved to the `arith` or `math` dialects. Renamed all instances of operations in the codebase and in tests. Reviewed By: rriddle, jpienaar Differential Revision: https://reviews.llvm.org/D110797	2021-10-13 03:07:03 +00:00
thomasraoux	4392841949	[mlir][VectorToGPU] Support converting vector.broadcast to MMA op Differential Revision: https://reviews.llvm.org/D105175	2021-06-30 09:08:55 -07:00
thomasraoux	1a86559276	[mlir][VectorToGPU] Add conversion for scf::For op with Matrix operands Differential Revision: https://reviews.llvm.org/D104134	2021-06-24 15:42:28 -07:00
thomasraoux	6413226dce	[mlir][VectorToGPU] Add conversion for splat constant to MMA const matrix Differential Revision: https://reviews.llvm.org/D104133	2021-06-24 15:38:12 -07:00
Matthias Springer	66f878cee9	[mlir][NFC] Remove Standard dialect dependency on MemRef dialect * Remove dependency: Standard --> MemRef * Add dependencies: GPUToNVVMTransforms --> MemRef, Linalg --> MemRef, MemRef --> Tensor * Note: The `subtensor_insert_propagate_dest_cast` test case in MemRef/canonicalize.mlir will be moved to Tensor/canonicalize.mlir in a subsequent commit, which moves over the remaining Tensor ops from the Standard dialect to the Tensor dialect. Differential Revision: https://reviews.llvm.org/D104506	2021-06-21 17:55:23 +09:00
thomasraoux	edd9515bd1	[mlir][VectorToGPU] First step to convert vector ops to GPU MMA ops This is the first step to convert vector ops to MMA operations in order to target GPU tensor core ops. This currently only supports simple cases; transpose and element-wise operations will be added later. Differential Revision: https://reviews.llvm.org/D102962	2021-06-11 07:52:32 -07:00

48 Commits