[Support] unsafe pointer arithmetic in llvm_regcomp()

regcomp.c uses the "start + count < end" idiom to check that more than
"count" bytes are available between "start" and "end", two pointers into
the same array of char.

This is fine, unless "start + count" goes more than one element past
the end of the array. In this case, a pedantic interpretation of the C
standard makes the pointer arithmetic itself, and thus the comparison
of such a pointer against "end", undefined, and optimizers from hell
will happily remove as much code as possible because of this.

An example of this occurs in regcomp.c's bothcases(), which defines
bracket[3], sets "next" to "bracket" and "end" to "bracket + 2". Then it
invokes p_bracket(), which starts with "if (p->next + 5 < p->end)"...

Because bothcases() and p_bracket() are static functions in regcomp.c,
there is a real risk of miscompilation if aggressive inlining happens.
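
The scenario above can be reproduced in isolation. This is only a minimal
sketch: the two-member "struct parse", the helper wants_six() and the
function bothcases_like() are hypothetical stand-ins, not the actual code
in regcomp.c, and the contents of "bracket" are arbitrary.

```c
#include <assert.h>

/* Hypothetical, stripped-down stand-in for regcomp.c's struct parse. */
struct parse {
	char *next;	/* next character to scan */
	char *end;	/* one past the last character to scan */
};

/*
 * Hypothetical helper mirroring the rewritten check in p_bracket():
 * returns 1 if at least six characters remain. Only "end - next" is
 * computed, so no pointer outside the array is ever formed.
 */
static int
wants_six(const struct parse *p)
{
	assert(p->next <= p->end);	/* both point into the same array */
	return p->end - p->next > 5;
}

/* Mimics the pointer setup described for bothcases(). */
static int
bothcases_like(void)
{
	char bracket[3] = { 'a', 'A', '\0' };	/* arbitrary contents */
	struct parse p;

	p.next = bracket;
	p.end = bracket + 2;
	/* "p.next + 5" would lie beyond bracket[3]; "p.end - p.next" is 2. */
	return wants_six(&p);
}
```

With this setup the check simply evaluates to false, without ever
computing "next + 5".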

The following diff rewrites the "start + count < end" constructs into
"end - start > count". Assuming "end" and "start" are always pointing in
the array (such as "bracket[3]" above), "end - start" is well-defined
and can be compared without trouble.
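
The rewritten form can be sketched as a standalone helper. The name
more_than() is made up for illustration and does not exist in regcomp.c;
the sketch assumes "start" and "end" point into (or one past the end of)
the same array, with "start" not after "end".

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if more than "count" chars lie between
 * "start" and "end". The old idiom would be "start + count < end", which
 * may form a pointer far beyond the array; the subtraction below never
 * leaves it.
 */
static int
more_than(const char *start, const char *end, ptrdiff_t count)
{
	assert(start <= end);	/* precondition: same array, start first */
	return end - start > count;
}
```

For an 8-byte buffer "buf", more_than(buf, buf + 8, 5) is true and
more_than(buf, buf + 2, 5) is false, which matches what the old idiom
would give whenever the old idiom is well-defined.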

As a bonus, MORE2() implies MORE(), so SEETWO() can be simplified a bit.
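
The equivalence of the two SEETWO() forms can be checked with a small
sketch. The PEEK/MORE macros are copied from the diff; the names
SEETWO_OLD, SEETWO_NEW and seetwo_forms_agree(), as well as the reduced
"struct parse", are hypothetical and only serve the illustration.

```c
#include <assert.h>

/* Hypothetical, stripped-down stand-in for regcomp.c's struct parse. */
struct parse {
	const char *next;
	const char *end;
};

#define PEEK()	(*p->next)
#define PEEK2()	(*(p->next+1))
#define MORE()	(p->end - p->next > 0)
#define MORE2()	(p->end - p->next > 1)
/* Old and new bodies of SEETWO(), renamed so both can coexist. */
#define SEETWO_OLD(a, b) (MORE() && MORE2() && PEEK() == (a) && PEEK2() == (b))
#define SEETWO_NEW(a, b) (MORE2() && PEEK() == (a) && PEEK2() == (b))

/*
 * Returns 1 if both forms give the same answer at every scan position
 * of the n-character buffer "s", including the position at the very end.
 */
static int
seetwo_forms_agree(const char *s, int n, char a, char b)
{
	struct parse ps, *p = &ps;
	int i;

	assert(n >= 0);
	ps.end = s + n;
	for (i = 0; i <= n; i++) {
		ps.next = s + i;
		if (SEETWO_OLD(a, b) != SEETWO_NEW(a, b))
			return 0;
	}
	return 1;
}
```

Since "end - next > 1" can only hold when "end - next > 0" does, the
extra MORE() test never changes the result.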

Bug report: https://github.com/llvm/llvm-project/issues/47993

Reviewed By: MaskRay, vitalybuka

Differential Revision: https://reviews.llvm.org/D97129
Miod Vallat 2022-02-03 19:50:58 -05:00 committed by Brad Smith
parent 2b78ef06c2
commit 877c84acd4
1 changed files with 14 additions and 12 deletions

@@ -249,10 +249,10 @@ static char nuls[10]; /* place to point scanner in event of error */
  */
 #define PEEK() (*p->next)
 #define PEEK2() (*(p->next+1))
-#define MORE() (p->next < p->end)
-#define MORE2() (p->next+1 < p->end)
+#define MORE() (p->end - p->next > 0)
+#define MORE2() (p->end - p->next > 1)
 #define SEE(c) (MORE() && PEEK() == (c))
-#define SEETWO(a, b) (MORE() && MORE2() && PEEK() == (a) && PEEK2() == (b))
+#define SEETWO(a, b) (MORE2() && PEEK() == (a) && PEEK2() == (b))
 #define EAT(c) ((SEE(c)) ? (NEXT(), 1) : 0)
 #define EATTWO(a, b) ((SEETWO(a, b)) ? (NEXT2(), 1) : 0)
 #define NEXT() (p->next++)
@@ -800,15 +800,17 @@ p_bracket(struct parse *p)
 	int invert = 0;
 	/* Dept of Truly Sickening Special-Case Kludges */
-	if (p->next + 5 < p->end && strncmp(p->next, "[:<:]]", 6) == 0) {
-		EMIT(OBOW, 0);
-		NEXTn(6);
-		return;
-	}
-	if (p->next + 5 < p->end && strncmp(p->next, "[:>:]]", 6) == 0) {
-		EMIT(OEOW, 0);
-		NEXTn(6);
-		return;
+	if (p->end - p->next > 5) {
+		if (strncmp(p->next, "[:<:]]", 6) == 0) {
+			EMIT(OBOW, 0);
+			NEXTn(6);
+			return;
+		}
+		if (strncmp(p->next, "[:>:]]", 6) == 0) {
+			EMIT(OEOW, 0);
+			NEXTn(6);
+			return;
+		}
 	}
 	if ((cs = allocset(p)) == NULL) {