6 Commits

Author SHA1 Message Date
Richard Smith d61a5cefe6 PR38870: Add warning for zero-width Unicode characters appearing in
identifiers.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@341700 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-07 19:25:39 +00:00
Richard Smith 6c3c48f7a3 Warn if we find a Unicode homoglyph for a symbol in an identifier.
Specifically, warn if:
 * we find a character that the language standard says we must treat as an
   identifier character, and
 * that character is not reasonably an identifier character (it's a punctuation
   character or similar), and
 * it renders identically to a valid non-identifier character in common
   fixed-width fonts.

Some tools "helpfully" substitute the surprising characters for the expected
characters, and replacing semicolons with Greek question marks is a common
"prank".

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@320697 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14 13:15:08 +00:00
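
The check described in commit 6c3c48f7a3 can be pictured as a table lookup from an identifier code point to the ASCII symbol it resembles. The following is a minimal sketch, not the code from that commit: the names `HomoglyphEntry`, `kHomoglyphs`, and `lookalikeFor` are hypothetical, the table is a small illustrative subset, and the convention of using 0 for zero-width characters is an assumption made here for illustration.

```cpp
// Hypothetical table entry: a code point that may appear in an identifier and
// the ASCII symbol it resembles. A lookalike of 0 marks a zero-width character
// (an assumption of this sketch).
struct HomoglyphEntry {
  char32_t codePoint;
  char lookalike;
};

// Illustrative subset only; a real table would be much larger.
constexpr HomoglyphEntry kHomoglyphs[] = {
    {0x01C3, '!'},  // LATIN LETTER RETROFLEX CLICK
    {0x037E, ';'},  // GREEK QUESTION MARK
    {0x200B, 0},    // ZERO WIDTH SPACE
    {0x2212, '-'},  // MINUS SIGN
};

// Returns the ASCII lookalike for c, 0 if c is a listed zero-width character,
// or -1 if c is not in the table (no warning needed).
int lookalikeFor(char32_t c) {
  for (const HomoglyphEntry &entry : kHomoglyphs) {
    if (entry.codePoint == c)
      return entry.lookalike;
  }
  return -1;
}
```

A lexer built along these lines would call `lookalikeFor` on each non-ASCII code point it accepts as part of an identifier and emit a warning naming the expected symbol whenever the result is not -1.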
Richard Smith 53ffb9e3f3 Add test that we correctly allow some non-letter Unicode characters in
identifiers, and extend the existing test to also cover C++.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@248079 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-19 02:14:12 +00:00
Jordan Rose 0ed4394874 Lexer: Don't warn about Unicode in preprocessor directives.
This allows people to use Unicode in their #pragma mark and in macros
that exist only to be string-ized.

<rdar://problem/13107323&13121362>

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@174081 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-31 19:48:48 +00:00
Jordan Rose 74c2498bb9 Don't warn about Unicode characters in -E mode.
People use the C preprocessor for things other than C files. Some of them
have Unicode characters. We shouldn't warn about Unicode characters
appearing outside of identifiers in this case.

There's not currently a way for the preprocessor to tell if it's in -E mode,
so I added a new flag, derived from the PreprocessorOutputOptions. This is
only used by the Unicode warnings for now, but could conceivably be used by
other warnings or even behavioral differences later.

<rdar://problem/13107323>

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173881 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-30 01:52:57 +00:00
Jordan Rose fc12060ed5 As an extension, treat Unicode whitespace characters as whitespace.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173370 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-24 20:50:50 +00:00