706 Commits

Author SHA1 Message Date
ziliangzl 6a8e4d4667 [VENTUS][#119]Fix workgroup function barrier scope 2024-05-14 14:03:20 +08:00
ziliangzl d977b0bf8b [VENTUS][#119]Complete workgroup function implementation
Implement work_group_reduce_<op> functions in wgreduce.cl.
Implement work_group_scan_inclusive_<op> and work_group_scan_exclusive_<op> functions in wgscan.cl.
Passed corresponding OPENCL-CTS tests.
2024-05-14 13:36:09 +08:00
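For readers unfamiliar with the work-group collective functions named above, the following is a sequential reference model of the add variants. It is only a sketch of the expected results, not the code in wgreduce.cl or wgscan.cl; the function names are hypothetical, and it assumes element i of the array stands for the value contributed by the work-item with linear local ID i.

```c
#include <stddef.h>

/* Hypothetical reference model: result every work-item receives from a
   work-group add reduction. */
int ref_reduce_add(const int *in, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += in[i];
    return acc;
}

/* Inclusive scan: out[i] includes work-item i's own value. */
void ref_scan_inclusive_add(const int *in, int *out, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; ++i) {
        acc += in[i];
        out[i] = acc;
    }
}

/* Exclusive scan: out[i] covers only work-items before i; out[0] is the
   identity of the operation (0 for add). */
void ref_scan_exclusive_add(const int *in, int *out, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; ++i) {
        out[i] = acc;
        acc += in[i];
    }
}
```

A model like this can serve as the host-side expected value when checking kernel output.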
ziliangzl 4789f2096b [VENTUS][#119]Add work_group_broadcast implementation
Passed corresponding OPENCL-CTS test.
2024-05-13 11:22:56 +08:00
ziliangzl d63caa094a [VENTUS][#119]Fix multiple definition of __wg_scratch
1. Global variables shouldn't be defined in a header.
2. Fix code format.
2024-05-13 11:18:03 +08:00
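The rule behind this fix can be sketched in plain C. The variable name and size below are hypothetical, and the real __wg_scratch presumably carries an OpenCL address-space qualifier that is omitted here. The header holds only an `extern` declaration, and exactly one source file holds the definition, so including the header from several files does not produce duplicate symbols.

```c
/* Would live in a header (hypothetical wg_scratch.h): declaration only,
   safe to include from many translation units. */
extern int wg_scratch_demo[32];

/* Would live in exactly one source file (hypothetical wg_scratch.c):
   the single definition that reserves storage. */
int wg_scratch_demo[32];

/* Any translation unit that includes the header can use the variable. */
void wg_scratch_demo_store(int slot, int value) {
    wg_scratch_demo[slot] = value;
}
```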
ziliangzl 6b70120436 [VENTUS][libclc][feat]Start workgroup function implementation
1. Implement barrier and work_group_barrier functions with intrinsics.
2. Implement work_group_all and work_group_any functions; passed corresponding OPENCL-CTS test.
2024-05-11 16:38:52 +08:00
zhoujing 797c85d829 [patch] Add a fix patch from terapines_dev branch 2024-03-08 18:23:47 +08:00
zhoujing 87fe5f3ce8 [VENTUS][fix] Put local variables declared in kernel function into shared memory 2024-03-05 16:32:59 +08:00
zhoujing 6cac00d141 [NFC] comment fix 2024-02-01 15:03:53 +08:00
qinfan 4b25812260 [VENTUS] Fix some comments
Fix some comments.
2024-02-01 14:56:01 +08:00
zhoujing 3bd573e3b3 [VENTUS][fix] Remove codes and fix wrong register error in workitem.s 2024-02-01 14:56:01 +08:00
zhoujing 03759b1bed [VENTUS][fix] Fix get_local_id builtin function implementation 2024-02-01 14:56:01 +08:00
zhoujingya efd82b9d86 [#56][fix] Fix the implementation of get_local_linear_id 2024-02-01 14:56:01 +08:00
zhoujingya e04c1a6ec7 [#56][fix] Fix workitem function(enqueued_local_size & local_linear_id) bugs in libclc
Support get_enqueued_local_size function and fix the calculation of get_local_linear_id
2024-02-01 14:56:01 +08:00
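For reference, the OpenCL C specification defines the local linear ID as a row-major flattening of the three local IDs. The sketch below restates that formula in plain C; the function name and array-based interface are assumptions for illustration, not the libclc code.

```c
#include <stddef.h>

/* id[d] stands for get_local_id(d), size[d] for get_local_size(d).
   Hypothetical helper mirroring the specification's formula. */
size_t ref_local_linear_id(const size_t id[3], const size_t size[3]) {
    return id[2] * size[1] * size[0]
         + id[1] * size[0]
         + id[0];
}
```

With a 4x5x6 work-group, the last work-item (3, 4, 5) maps to 119, one less than the work-group size of 120.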
zhoujingya 965f8c1fb6
Merge branch 'main' into eliminate_call_frame 2024-02-01 13:15:03 +08:00
zhoujing aaf1c41a21 [VENTUS][fix] Fix clzl function implementation in floatdidf 2024-01-31 16:12:58 +08:00
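A 64-bit count-leading-zeros helper of the kind a floatdidf (64-bit integer to double) routine relies on can be sketched as below. This is a simple bit-by-bit reference under the assumption that a zero input should yield 64; it is not the code from this commit, and the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical reference: number of zero bits above the highest set bit. */
int ref_clz64(uint64_t x) {
    if (x == 0)
        return 64; /* assumed convention for a zero input */
    int n = 0;
    while ((x & (UINT64_C(1) << 63)) == 0) {
        x <<= 1;
        ++n;
    }
    return n;
}
```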
zhoujingya dfd2affa51 [VENTUS][fix] Fix float precision issue in libclc for ventus
There are many potential precision bugs in libclc, especially the functions
under `libclc/riscv32/lib/compiler-rt` directory.
2024-01-31 16:12:31 +08:00
zhoujingya b793a55a42
Merge pull request #95 from THU-DSP-LAB/clamp
[VENTUS][fix] Fix clamp function
2024-01-22 22:27:44 +08:00
qinfan eb8de4e634 [VENTUS][fix] Add mul_hi function
Add mul_hi function.
2024-01-22 16:55:53 +08:00
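As a reminder of what mul_hi computes, the sketch below returns the upper 32 bits of the full 64-bit product of two 32-bit operands. The function names are hypothetical and this is a reference model, not the libclc implementation; the signed variant assumes the compiler performs an arithmetic right shift on negative values, which C leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical reference: high half of the unsigned 32x32 product. */
uint32_t ref_mul_hi_u32(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* Signed variant; relies on arithmetic right shift for negative products. */
int32_t ref_mul_hi_i32(int32_t a, int32_t b) {
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}
```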
qinfan 9181e1a435 [VENTUS][fix] Fix clamp function
Fix clamp function.
2024-01-22 16:44:35 +08:00
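The OpenCL C specification defines clamp(x, minval, maxval) as min(max(x, minval), maxval), with undefined results when minval is greater than maxval. A scalar integer sketch of that definition follows; the function name is hypothetical and this is not the fixed libclc code.

```c
/* Hypothetical reference for the scalar int case of clamp. */
int ref_clamp_i32(int x, int minval, int maxval) {
    int lower_bounded = x > minval ? x : minval;          /* max(x, minval) */
    return lower_bounded < maxval ? lower_bounded : maxval; /* min(.., maxval) */
}
```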
zhoujingya 7e8e66058c [VENTUS][fix] Add ctz function implementation
Add ctz function implementation.
2024-01-11 14:21:05 +08:00
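The OpenCL C ctz builtin returns the number of trailing zero bits, and the bit width of the type when the input is 0. A 32-bit reference sketch is shown below; the function name is hypothetical and the loop is chosen for clarity rather than speed, so it should not be read as the implementation added in this commit.

```c
#include <stdint.h>

/* Hypothetical reference: count of zero bits below the lowest set bit. */
int ref_ctz32(uint32_t x) {
    if (x == 0)
        return 32; /* bit width of the type for a zero input */
    int n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}
```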
qinfan d809d3a2bd [VENTUS][fix] Fix the offset of private variables on the stack
Fix the offset of private variables on the stack.
2023-12-22 16:47:19 +08:00
qinfan 755797e27c [VENTUS][fix] Fix framelowering and calculation method of stack offset
1. Add VMV_V_X in emitEpilogue.
2. Change all the positive numbers added by TP to negative numbers (in LowerCall).
3. Fix the LowerCall function to generate the correct store instructions for passing function parameters.
4. Fix the hasReservedCallFrame function to return false.
5. Align the convention between caller and callee when parameters are passed on the stack.
6. Change the stack offset calculation method of TP.
7. Unify the calculation of the TP stack and SP stack offsets.
8. Note that the calculation of the sp offset in workitem.S must be modified manually. Since the growth direction of the stack is different from that of traditional RISC-V, it is now stipulated that, for both the SP stack and the TP stack, data is stored at the position where the stack pointer is not offset.
9. There is an SPAdj check in the eliminateFrameIndex function, but we don't need this value at all, so add a getSPAdjust function that returns zero.
10. V33 holds a wrong value when parameters are pushed to the TP stack, so there must be an MV instruction to refresh V33 after ADJCALLSTACKDOWN.
2023-12-20 17:03:01 +08:00
zhoujingya 12bb90bd11 [VENTUS][fix] Add libclc function parameter vector size equal 3 support
In the current libclc library, builtin functions whose parameters contain vec3
are neither overloaded nor implemented, so we need to add the related
declarations.

For cts test cases:
* prefetch
* async_copy_global_to_local
* async_copy_local_to_global
2023-12-07 10:19:28 +08:00
ouyangxiao ebb43f7877 [VENTUS][fix] Fix the bug that the calculation of global id is incorrect 2023-10-19 14:46:45 +08:00
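For context, the OpenCL specification relates the global ID in one dimension to the work-group ID, the enqueued local size, the local ID, and the global offset. The sketch below restates that relation for a single dimension; the function and parameter names are hypothetical, and it is not the corrected Ventus code.

```c
#include <stddef.h>

/* Hypothetical reference for one dimension d:
   global_id = group_id * enqueued_local_size + local_id + global_offset */
size_t ref_global_id(size_t group_id, size_t enqueued_local_size,
                     size_t local_id, size_t global_offset) {
    return group_id * enqueued_local_size + local_id + global_offset;
}
```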
zhoujingya dc788b8e36 [VENTUS][fix] Fix ra offset in workitem.s 2023-09-14 15:52:08 +08:00
zhoujing 90698c8fb9 [VENTUS][fix] add epilog/prolog information in __builtin_riscv_global_linear_id 2023-08-16 15:04:05 +08:00
zhoujing 826c4cb599 Revert "[VENTUS][fix] Insert barrier instruction for function calling"
This reverts commit 7e4b7a6ae1.
2023-08-16 14:50:42 +08:00
zhoujing d40f4ec38d [VENTUS][fix] Change the sp initialization for different warp
Previously, sp was initialized to the same base address for every warp.
When warps finish at different times, the sp of a warp that finishes later is
changed by a warp that finished earlier, so sp must be initialized separately for each warp.
2023-08-15 13:54:13 +08:00
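One way to picture a per-warp stack base is the arithmetic below. It is only an illustration: the function name, the parameters, and the assumption that each warp gets a fixed-size contiguous slice are all hypothetical, and the actual initialization lives in assembly in workitem.S.

```c
#include <stdint.h>

/* Hypothetical: each warp starts at its own slice of the stack region,
   so one warp's sp updates cannot overwrite another warp's frames. */
uint32_t ref_warp_sp_base(uint32_t stack_base, uint32_t warp_id,
                          uint32_t bytes_per_warp) {
    return stack_base + warp_id * bytes_per_warp;
}
```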
zhoujing 35f164a462 [VENTUS][libclc] Add more compiler-rt function support
Specifically, this commit prepares for later printf function support.
2023-08-14 15:33:34 +08:00
zhoujing 88a22b9e90 [VENTUS][fix] Change vmv.s.x to vmv.v.x in workitem.S 2023-08-01 14:04:59 +08:00
zhoujing 7e4b7a6ae1 [VENTUS][fix] Insert barrier instruction for function calling
Stack space is shared between different warps. If two warps are executing
different functions, their accesses to the return address conflict,
so the warp that executes faster cannot find its return address.
We therefore add a barrier instruction after the lw and before the ret
to ensure that the warps have the same scope of the sp pointer.
2023-07-31 11:01:14 +08:00
zhoujing 85cf676b2c [VENTUS][RISCV][feat] Accelerate libclc building process
Disable builtin cmake script defined in libclc by upstream to accelerate ventus
libclc building process
2023-07-25 10:54:40 +08:00
zhoujing bdf01afb7d [VENTUS][RISCV][fix] Add CSR_WGID definition 2023-07-21 10:01:38 +08:00
zhoujing 258d76a7c8 [VENTUS][RISCV][fix] Fix get_global_id_z function bug 2023-07-19 19:32:26 +08:00
zhoujing 1a6ead3f43 [VENTUS][RISCV][fix] Fix workitem function implementation 2023-07-19 17:45:35 +08:00
zhoujing b8223e72bd [VENTUS][RISCV][feat] Build libclc library into an object file rather than an archive file
In our previous design, the libclc library was built as a static library, which made the generated
ELF file large. We now change the compiler and linker options to make the generated ELF file much smaller; details can be found in this pull request: https://github.com/THU-DSP-LAB/pocl/pull/11
2023-07-17 21:39:02 +08:00
zhoujing 2b2f2342e1 [VENTUS][RISCV][NFC] Clean up code 2023-07-14 09:13:15 +08:00
zhoujing f026e3e0aa [VENTUS][RISCV][fix] Fix spike_end wrong section bug 2023-07-13 21:27:55 +08:00
zhoujingya a5c1106e25 [VENTUS][RISCV][fix] Remove redundant code 2023-07-13 13:53:59 +08:00
zhoujing 209306abc9 [VENTUS][RISCV][libclc] Add more soft float function support 2023-07-13 13:24:03 +08:00
zhoujing d2c3e47f6a [VENTUS][RISCV][fix] Remove redundant fma function 2023-07-13 09:49:30 +08:00
zhoujing 760e86baf2 [VENTUS][RISCV][fix&feat] Add cl_khr_fp64 support and add missing header file 2023-07-13 09:21:50 +08:00
zhoujing 7b949c1868 [VENTUS][RISCV][libclc] Add additional soft float support functions 2023-07-12 21:02:52 +08:00
zhoujing a5b2d952d7 [VENTUS][RISCV][libclc] Add soft float support 2023-07-12 16:40:11 +08:00
zhoujing 1bfef21c65 [VENTUS][RISCV][fix] Move 'tohost' and 'fromhost' from section 'tohost' to '.data'
We changed this because we found function jump bugs in the rodinia GPU test.
2023-07-10 09:12:02 +08:00
zhoujing 9899aee134 [VENTUS][RISCV][fix] Fix undefined symbol errors in libclc
We expanded all the macros used in these changed files by hand; a little stupid, but it works.
2023-07-07 16:24:29 +08:00
zhoujing be2463898a [VENTUS][RISCV][Fix] Remove 64-bit related code in gen_convert.py 2023-07-07 09:59:44 +08:00
zhoujing 21e0b87130 [VENTUS][RISCV][Fix] Typo fix 2023-07-07 09:42:55 +08:00
zhoujing 24c8b19c42 [VENTUS][RISCV][Fix] Remove 64-bit related code in gen_convert.py
This file ultimately generates the convert.cl file.
2023-07-07 09:40:34 +08:00
zhoujing fb8b0daa4d [VENTUS][RISCV][fix] Fix build errors caused by a missing semicolon in libclc 2023-07-07 09:35:22 +08:00