Description |
According to ISO C and POSIX, (size_t)-2 is returned:
If the next n bytes contribute to an incomplete but potentially valid character, and all n bytes have been processed (no value is stored).
uClibc's mbrtowc returns (size_t)-2 when given input such as [0xc0] (1 byte) or [0xed 0xa0] (2 bytes), even though these are NOT potentially valid: neither is an initial subsequence of any valid UTF-8 sequence. This behavior complicates a calling program's logic for detecting where an encoding error occurred when attempting to resync, and probably causes scanf to behave in a noncompliant way when using the %ls, %lc, or %l[ specifiers (consuming more bytes than it should). |
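To make "initial subsequence of a valid UTF-8 sequence" concrete, here is a minimal sketch of a standalone check, assuming the well-formed byte ranges of RFC 3629. The helper name utf8_is_incomplete_prefix is hypothetical and is not part of uClibc or any libc; it only illustrates which inputs could reasonably yield (size_t)-2.

```c
#include <stddef.h>

/* Hypothetical helper (not a libc function): returns 1 if the n bytes at s
 * are an incomplete but potentially valid prefix of a UTF-8 sequence,
 * and 0 if they are already complete or can never become valid. */
int utf8_is_incomplete_prefix(const unsigned char *s, size_t n)
{
    size_t len, i;
    unsigned char lo = 0x80, hi = 0xBF;  /* allowed range for the 2nd byte */

    if (n == 0)
        return 1;  /* no bytes seen yet, so nothing rules out a valid character */

    /* The lead byte determines the total sequence length. */
    if (s[0] >= 0xC2 && s[0] <= 0xDF)
        len = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF)
        len = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF4)
        len = 4;
    else
        return 0;  /* ASCII is complete; 0x80..0xC1 and 0xF5..0xFF never lead */

    /* Some lead bytes restrict the second byte further. */
    if (s[0] == 0xE0) lo = 0xA0;  /* rejects overlong 3-byte forms */
    if (s[0] == 0xED) hi = 0x9F;  /* rejects UTF-16 surrogates */
    if (s[0] == 0xF0) lo = 0x90;  /* rejects overlong 4-byte forms */
    if (s[0] == 0xF4) hi = 0x8F;  /* rejects values above U+10FFFF */

    if (n >= len)
        return 0;  /* the sequence is complete (or too long), not incomplete */

    if (n >= 2 && (s[1] < lo || s[1] > hi))
        return 0;
    for (i = 2; i < n; i++)
        if (s[i] < 0x80 || s[i] > 0xBF)
            return 0;

    return 1;
}
```

Under this check, [0xc0] and [0xed 0xa0] are both rejected, while a genuine prefix such as [0xe2 0x82] (the start of U+20AC) is accepted. A caller could run such a check after mbrtowc returns (size_t)-2 to work around the behavior described above, though that is only one possible approach.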