34 Commits

Author SHA1 Message Date
Guillaume Chatelet 436c8f4420 [reland][libc] Add bcopy
Differential Revision: https://reviews.llvm.org/D138994
2022-12-01 10:07:04 +00:00
Guillaume Chatelet c5fe7eb216 Revert D138994 "[libc] Add bcopy"
Broke build bot

This reverts commit 186a15f7a9.
2022-12-01 09:55:36 +00:00
Guillaume Chatelet 186a15f7a9 [libc] Add bcopy
Differential Revision: https://reviews.llvm.org/D138994
2022-12-01 09:52:10 +00:00
Tue Ly 45233cc1ca [libc][math] Add place-holder implementation for pow function.
Add place-holder implementation for pow function to unblock libc demo
examples.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D137109
2022-10-31 17:23:33 -04:00
Tue Ly 97b4cc83e1 [libc][math] Add place-holder implementation for asin to unblock demo examples.
Add a place-holder implementation for asin to unblock libc demo
examples.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D137105
2022-10-31 17:22:12 -04:00
Tue Ly a752460d73 [libc][math] Implement exp10f function correctly rounded to all rounding modes.
Implement exp10f function correctly rounded to all rounding modes.

Algorithm: perform range reduction to write
```
  10^x = 2^(hi + mid) * 10^lo
```
where:
```
  hi is an integer,
  0 <= mid * 2^5 < 2^5
  -log10(2) / 2^6 <= lo <= log10(2) / 2^6
```
Then `2^mid` is stored in a table of 32 entries and the product `2^hi * 2^mid` is
performed by adding `hi` into the exponent field of `2^mid`.
`10^lo` is then approximated by a degree-5 minimax polynomial generated by Sollya with:
```
  > P = fpminimax((10^x - 1)/x, 4, [|D...|], [-log10(2)/64, log10(2)/64]);
```
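The range reduction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of the patch: it works in double precision, the names `exp10_sketch`, `add_to_exponent` and `EXP2_MID` are hypothetical, and a degree-5 Taylor polynomial stands in for the Sollya minimax polynomial, whose coefficients are not given here.

```python
import math
import struct

LOG2_10 = math.log2(10.0)
LOG10_2 = math.log10(2.0)
LN_10 = math.log(10.0)

# Table of 2^mid for mid = i/32, i = 0..31 (32 entries, as in the description).
EXP2_MID = [2.0 ** (i / 32.0) for i in range(32)]


def add_to_exponent(value, hi):
    """Multiply a positive normal double by 2^hi by adding hi to its exponent field.

    Assumes the result stays in the normal range (no overflow/underflow handling).
    """
    bits = struct.unpack("<q", struct.pack("<d", value))[0]
    bits += hi << 52  # the exponent field of a double starts at bit 52
    return struct.unpack("<d", struct.pack("<q", bits))[0]


def exp10_sketch(x):
    """Approximate 10^x as 2^(hi + mid) * 10^lo."""
    k = round(x * LOG2_10 * 32.0)   # hi + mid = k / 32
    hi = k >> 5                     # integer part (floor), also for negative k
    mid_index = k & 31              # mid * 2^5, an integer in [0, 32)
    lo = x - k * (LOG10_2 / 32.0)   # |lo| <= log10(2) / 2^6
    # Stand-in for the minimax polynomial: degree-5 Taylor expansion of
    # 10^lo = e^t with t = lo * ln(10), evaluated with Horner's scheme.
    t = lo * LN_10
    p = 1.0 + t * (1.0 + t * (0.5 + t * (1.0 / 6 + t * (1.0 / 24 + t / 120))))
    return add_to_exponent(EXP2_MID[mid_index], hi) * p
```

The sketch makes no claim about correct rounding; it only shows how `hi`, `mid` and `lo` fit together.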
Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH reciprocal throughput   : 10.215
System LIBC reciprocal throughput : 7.944

LIBC reciprocal throughput        : 38.538
LIBC reciprocal throughput        : 12.175   (with `-msse4.2` flag)
LIBC reciprocal throughput        : 9.862    (with `-mfma` flag)

$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency   : 40.744
System LIBC latency : 37.546

BEFORE
LIBC latency        : 48.989
LIBC latency        : 44.486   (with `-msse4.2` flag)
LIBC latency        : 40.221   (with `-mfma` flag)
```
This patch relies on https://reviews.llvm.org/D134002

Reviewed By: orex, zimmermann6

Differential Revision: https://reviews.llvm.org/D134104
2022-09-19 10:01:40 -04:00
Tue Ly 463dcc8749 [libc][math] Implement acosf function correctly rounded for all rounding modes.
Implement acosf function correctly rounded for all rounding modes.

We perform range reduction as follows:

- When `|x| < 2^(-10)`, we use the cubic Taylor polynomial:
```
  acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 / 6.
```
- When `2^(-10) <= |x| <= 0.5`, we use the same approximation that is used for `asinf(x)` when `|x| <= 0.5`:
```
  acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 * P(x^2).
```
- When `0.5 < x <= 1`, we use the double angle formula: `cos(2y) = 1 - 2 * sin^2 (y)` to reduce to:
```
  acos(x) = 2 * asin( sqrt( (1 - x)/2 ) )
```
- When `-1 <= x < -0.5`, we reduce to the positive case above using the formula:
```
  acos(x) = pi - acos(-x)
```
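The four cases above can be sketched as follows. This is an illustration only: `acos_sketch` and `asin_small` are hypothetical names, the sketch runs in double precision, and `math.asin` stands in for the polynomial approximation used on `|x| <= 0.5`.

```python
import math


def asin_small(x):
    """Stand-in for the asin approximation on |x| <= 0.5 (here simply math.asin)."""
    assert abs(x) <= 0.5
    return math.asin(x)


def acos_sketch(x):
    """Piecewise range reduction for acos on [-1, 1]."""
    ax = abs(x)
    if ax < 2.0 ** -10:
        # Cubic Taylor polynomial: acos(x) ~ pi/2 - x - x^3 / 6.
        return math.pi / 2 - x - x * x * x / 6.0
    if ax <= 0.5:
        # acos(x) = pi/2 - asin(x), same kernel as asinf.
        return math.pi / 2 - asin_small(x)
    if x > 0.5:
        # Double angle formula: acos(x) = 2 * asin(sqrt((1 - x)/2)),
        # where sqrt((1 - x)/2) < 0.5 so the small-range kernel applies.
        return 2.0 * asin_small(math.sqrt((1.0 - x) / 2.0))
    # -1 <= x < -0.5: reduce to the positive case.
    return math.pi - acos_sketch(-x)
```

Inputs outside `[-1, 1]` are not handled and raise a `ValueError` from `math.sqrt` or fail the assertion.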

Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh acosf
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH reciprocal throughput   : 28.613
System LIBC reciprocal throughput : 29.204
LIBC reciprocal throughput        : 24.271

$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency   : 55.554
System LIBC latency : 76.879
LIBC latency        : 62.118
```

Reviewed By: orex, zimmermann6

Differential Revision: https://reviews.llvm.org/D133550
2022-09-09 09:55:30 -04:00
Tue Ly e2f065c2a3 [libc][math] Implement asinf function correctly rounded for all rounding modes.
Implement asinf function correctly rounded for all rounding modes.

For `|x| <= 0.5`, we approximate `asin(x)` by
```
  asin(x) = x * P(x^2)
```
where `P(X^2) = Q(X)` is a degree-20 minimax even polynomial approximating
`asin(x)/x` on `[0, 0.5]` generated by Sollya with:
```
  > Q = fpminimax(asin(x)/x, [|0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20|],
                 [|1, D...|], [0, 0.5]);
```

When `|x| > 0.5`, we perform range reduction as follows.
Assume further that `0.5 < x <= 1`, and let:
```
  y = asin(x)
```
We will use the double angle formula:
```
  cos(2X) = 1 - 2 sin^2(X)
```
and the complement angle identity:
```
  x = sin(y) = cos(pi/2 - y)
              = 1 - 2 sin^2 (pi/4 - y/2)
```
So:
```
  sin(pi/4 - y/2) = sqrt( (1 - x)/2 )
```
And hence:
```
  pi/4 - y/2 = asin( sqrt( (1 - x)/2 ) )
```
Equivalently:
```
  asin(x) = y = pi/2 - 2 * asin( sqrt( (1 - x)/2 ) )
```
Let `u = (1 - x)/2`, then
```
  asin(x) = pi/2 - 2 * asin( sqrt(u) )
```
Moreover, since `0.5 < x <= 1`,
```
  0 <= u < 1/4, and 0 <= sqrt(u) < 0.5.
```
And hence we can reuse the same polynomial approximation of `asin(x)` when
`|x| <= 0.5`:
```
  asin(x) = pi/2 - 2 * sqrt(u) * P(u).
```
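A minimal sketch of this reduction, assuming double precision and hypothetical names `asin_kernel` and `asin_sketch`. A truncated Taylor series of `asin(t)/t` stands in for the minimax polynomial `P`, and negative inputs are handled with the odd symmetry `asin(-x) = -asin(x)`, which the description leaves implicit. With `t = sqrt(u)`, the kernel `t * P(t^2)` evaluates `sqrt(u) * P(u)`.

```python
import math


def asin_kernel(t, terms=30):
    """asin(t) for |t| <= 0.5, computed as t * P(t^2).

    P is a truncated Taylor series here, a stand-in for the minimax polynomial.
    """
    t2 = t * t
    coeff = 1.0   # (2n)! / (4^n * (n!)^2), starting at n = 0
    power = 1.0   # t^(2n)
    p = 0.0
    for n in range(terms):
        p += coeff / (2 * n + 1) * power
        coeff *= (2 * n + 1) / (2 * n + 2)
        power *= t2
    return t * p


def asin_sketch(x):
    """asin on [-1, 1] using the double angle range reduction."""
    if abs(x) <= 0.5:
        return asin_kernel(x)
    u = (1.0 - abs(x)) / 2.0                      # 0 <= u < 1/4
    result = math.pi / 2 - 2.0 * asin_kernel(math.sqrt(u))
    return math.copysign(result, x)               # asin is odd
```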

Performance benchmark using `perf` tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf
CORE-MATH reciprocal throughput   : 23.418
System LIBC reciprocal throughput : 27.310
LIBC reciprocal throughput        : 22.741

$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency   : 58.884
System LIBC latency : 62.055
LIBC latency        : 62.037
```

Reviewed By: orex, zimmermann6

Differential Revision: https://reviews.llvm.org/D133400
2022-09-07 19:27:47 -04:00
Kirill Okhotnikov 77e1d9beed [libc][math] Added atanf function.
Performance by core-math (core-math/glibc 2.31/current llvm-14):
28.879/20.843/20.15

Differential Revision: https://reviews.llvm.org/D132842
2022-08-30 22:39:54 +02:00
Kirill Okhotnikov 6c1fc7e430 [libc][math] Added atanhf function.
Performance by core-math (core-math/glibc 2.31/current llvm-14):
10.845/43.174/13.467

The review is done on top of D132809.

Differential Revision: https://reviews.llvm.org/D132811
2022-08-30 22:39:54 +02:00
Tue Ly af59bac4ca [libc] Add missing header and Windows entrypoints for tanf. 2022-08-12 09:34:33 -04:00
Kirill Okhotnikov 5ef987c985 [libc][math] Added tanhf function.
Correctly rounded function. Performance is ~2x faster than the glibc equivalent.

Performance (llvm 12 intel):
```
CORE_MATH_PERF_MODE=rdtsc PERF_ARGS='' ./perf.sh tanhf
GNU libc version: 2.31
GNU libc release: stable
13.279
37.492
18.145
CORE_MATH_PERF_MODE=rdtsc PERF_ARGS='--latency' ./perf.sh tanhf
GNU libc version: 2.31
GNU libc release: stable
40.658
109.582
66.568
```

Differential Revision: https://reviews.llvm.org/D130780
2022-08-01 22:43:00 +02:00
Kirill Okhotnikov a7f55f0805 [libc][math] Added sinhf function.
Differential Revision: https://reviews.llvm.org/D129278
2022-07-29 17:20:53 +02:00
Kirill Okhotnikov fcb9d7e2cf [libc][math] Added coshf function.
Differential Revision: https://reviews.llvm.org/D129275
2022-07-29 16:57:28 +02:00
Alex Brachet c179bcc151 [libc] Add imaxabs
Differential Revision: https://reviews.llvm.org/D129517
2022-07-11 21:28:21 +00:00
Kirill Okhotnikov b8e8012aa2 [libc][math] fmod/fmodf implementation.
This is an implementation of the floating-point remainder function fmod from the standard libm.
The underlying algorithm was developed by myself, though it has probably been
invented before.
Some features of the implementation:
1. The code is written in more-or-less modern C++.
2. One general implementation for both float and double precision numbers.
3. Platform/architecture-dependent and independent code and tests are split.
4. Tests cover 100% of the code for both float and double numbers. Test cases with NaN/Inf etc. are copied from glibc.
5. The new implementation is in general 2-4 times faster for “regular” x,y values. It can be 20 times faster when x/y is huge, but can also be 2 times slower in the double denormalized range (according to the perf tests provided).
6. Two different implementations of the division loop are provided. On some platforms division can be a very time-consuming operation: depending on the platform, it can be 3-10 times slower than multiplication.
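
To make the idea of a division loop concrete, here is a generic shift-and-subtract remainder on integer significands. It is not the algorithm of this patch, which is not described here. It is only a sketch of how fmod can be computed exactly without a hardware division per step, and the names `decompose` and `fmod_sketch` are hypothetical.

```python
import math


def decompose(v):
    """Write a non-negative finite double as m * 2^e with an integer m < 2^53."""
    m, e = math.frexp(v)                 # v = m * 2^e with 0.5 <= m < 1 (or m == 0)
    return int(m * (1 << 53)), e - 53


def fmod_sketch(x, y):
    """Remainder of x / y with the sign of x, for finite x and nonzero finite y."""
    mx, ex = decompose(abs(x))
    my, ey = decompose(abs(y))
    if ex >= ey:
        r = mx % my
        # Shift-and-subtract loop: each iteration consumes one bit of the
        # exponent difference and keeps 0 <= r < my.
        for _ in range(ex - ey):
            r <<= 1
            if r >= my:
                r -= my
        e = ey
    else:
        # y has the larger exponent: align y to x's scale and reduce once.
        r = mx % (my << (ey - ex))
        e = ex
    return math.copysign(math.ldexp(float(r), e), x)
```

Because all intermediate values are integers, the result is exact. For example, `fmod_sketch(-7.5, 2.0)` agrees with `math.fmod(-7.5, 2.0)`.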

Performance tests:

The test is based on the CORE-MATH project (https://gitlab.inria.fr/core-math/core-math). At Tue Ly's suggestion, I took the hypot function and used it as a template for fmod, preserving all test cases.

`./check.sh <--special|--worst> fmodf` passed.
`CORE_MATH_PERF_MODE=rdtsc ./perf.sh fmodf` results are

```
GNU libc version: 2.35
GNU libc release: stable
21.166 <-- FPU
51.031 <-- current glibc
37.659 <-- this fmod version.
```
2022-06-24 23:09:14 +02:00
Alex Brachet b1183305f8 [libc] Add strlcat
Differential Revision: https://reviews.llvm.org/D125978
2022-05-19 21:48:39 +00:00
Alex Brachet fc2c8b2371 [libc] Add strlcpy
Differential Revision: https://reviews.llvm.org/D125806
2022-05-18 17:45:05 +00:00
Michael Jones 270ca878d9 [libc] Update windows entrypoint list
The entrypoint list for windows hasn't been updated in a while. This
adds all of the entrypoints that are working for windows now.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D125058
2022-05-06 11:30:50 -07:00
Michael Jones 805899e68a [libc] Change FEnv to use MXCSR as source of truth
This patch primarily fixes the fenv implementation on Windows, since
Windows uses the MXCSR in place of the x87 status registers for storing
information about the floating point environment. This allows FEnv to
work correctly on Windows, and successfully build.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D121839
2022-03-23 16:08:00 -07:00
Tue Ly 9e7688c71e [libc] Implement log1pf correctly rounded to all rounding modes.
Implement log1pf correctly rounded to all rounding modes relying on logf implementation for exponent > 2^(-8).

Reviewed By: sivachandra, zimmermann6

Differential Revision: https://reviews.llvm.org/D118962
2022-02-07 16:17:18 -05:00
Tue Ly 63d2df003e [libc] Implement correctly rounded log2f based on RLIBM library.
Implement log2f based on RLIBM library correctly rounded for all rounding modes.

Reviewed By: sivachandra, michaelrj, santoshn, jpl169, zimmermann6

Differential Revision: https://reviews.llvm.org/D115828
2022-01-14 12:40:49 -05:00
Tue Ly d08a801b5f [libc] Implement correctly rounded logf based on RLIBM library.
Implement correctly rounded logf based on RLIBM library: https://people.cs.rutgers.edu/~sn349/rlibm/.

Reviewed By: sivachandra, santoshn, jpl169, zimmermann6

Differential Revision: https://reviews.llvm.org/D115408
2021-12-16 13:43:15 -05:00
Michael Jones 035325275c [libc] add inttypes header
Add inttypes.h to llvm libc. Its first functions, strtoimax and
strtoumax, are included.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D108736
2021-08-26 18:04:21 +00:00
Michael Jones eff11176c5 [libc] Enable string to integer conversion functions in the default build
Adds atoi, atol, atoll, strtol, strtoll, strtoul, and strtoull to the
list of entrypoints for Windows and aarch64 linux, as well as moving
them out of the LLVM_LIBC_FULL_BUILD condition for x86_64 linux.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D108477
2021-08-23 21:18:14 +00:00
Siva Chandra Reddy 1cd3d19271 [libc] Add bcmp to the windows config. 2021-08-20 04:51:09 +00:00
Siva Chandra Reddy 9a56d71f61 [libc][NFC] Disable double precision cos, sin and tan on Windows.
The current x86_64 implementations do not build on the windows bot
machine. We will re-enable them after fixing the problem.
2021-08-17 16:47:20 +00:00
Siva Chandra Reddy 66d92efc66 [libc] Add trigonometric and exponential functions to the windows config. 2021-07-31 01:30:26 +00:00
Michael Jones c6d03b583b [libc] add strncmp to strings
Add strncmp as a function to strings.h. Also adds unit tests, and adds
strncmp as an entrypoint for all current platforms.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D106901
2021-07-28 21:37:12 +00:00
Siva Chandra Reddy dd8b93a9e7 [libc] Fix x86_64 fenv implementation for windows
All fenv functions are also enabled for windows. Since two tests,
enabled_exceptions_test and feholdexcept_test, are still failing on
windows, they have been disabled.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D106808
2021-07-27 20:53:01 +00:00
Hedin Garca 8baa87d918 [libc] Enable MPFR library for math functions test
Added more math functions to the Windows entrypoints
and added a cmake option (-DLLVM_LIBC_MPFR_INSTALL_PATH)
that lets the user specify the path where the MPFR
library is installed so it can be linked. The try_compile was
moved to LLVMLibCCheckMPFR.cmake so that the variable it
sets retains its value in other files included from
the same parent file. In particular, this lets
LIBC_TESTS_CAN_USE_MPFR be true when the user specifies
MPFR's path and keep its value even after leaving the file.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D106894
2021-07-27 20:40:04 +00:00
Hedin Garca efa2115266 [libc] Include nextafter's functions to Windows's entrypoints
Added the nextafter family of functions and refactored
NextAfterTest.h to correctly define bitWidthOfType for both
Linux and Windows by letting FloatProperties take care
of the directives' logic based on the platform being used.
This allows nextafter's tests to run successfully.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D106395
2021-07-21 13:28:01 +00:00
Hedin Garca f49f2e2d1f [libc] Append math functions to Window's entrypoints
Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D106391
2021-07-21 13:21:55 +00:00
Caitlyn Cano e4b9fecd39 [libc] Add minimal Windows config
A README file with procedure for building/testing LLVM libc on Windows
has also been added.

Reviewed By: sivachandra, aeubanks

Differential Revision: https://reviews.llvm.org/D105231
2021-07-01 20:45:57 +00:00