[QScintilla] Search Whole Word matches Only on regular expression symbol
Baz Walter
bazwal at ftml.net
Sat Oct 12 23:29:20 BST 2013
On 12/10/13 11:52, Phil Thompson wrote:
> So I need to call SCI_SETWORDCHARS when a lexer is set using the value
> returned by the lexer's wordCharacters() method.
>
> Is this likely to cause any unforeseen problems?
As usual with Scintilla, the main source of potential problems is
single-byte vs multi-byte encodings. For Latin-1, any byte in the range
0-255 can be set as a word character. But for UTF-8, only the ASCII
range is relevant: all Unicode characters above 127 are always treated
as word characters, regardless of what has been set using SCI_SETWORDCHARS.
However, Scintilla's default set of word characters (i.e. those set via
SCI_SETCHARSDEFAULT) includes the standard alphanumerics and underscore,
*plus* all the characters in the range 128-255 (regardless of the
code-page setting).
So, assuming the current lexers' wordCharacters() methods only ever return
ASCII characters, there is some potential for changes in behaviour if
QScintilla is being used in *Latin-1* mode (UTF-8 mode should be unaffected).
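
To make the difference concrete, here is a rough pure-Python model of the
classification described above. It does not call Scintilla at all; the
names (DEFAULT_WORD_CHARS, LEXER_WORD_CHARS, is_word_char) are made up for
illustration, and the ASCII-only lexer set is the assumption stated above.

```python
import string

ASCII_WORD = string.ascii_letters + string.digits + "_"

# Model of the default set as described above: ASCII alphanumerics,
# underscore, plus every value in the range 128-255.
DEFAULT_WORD_CHARS = frozenset(ord(c) for c in ASCII_WORD) | frozenset(range(128, 256))

# Hypothetical lexer set: assumed to contain ASCII characters only.
LEXER_WORD_CHARS = frozenset(ord(c) for c in ASCII_WORD)


def is_word_char(code, word_chars, utf8_mode):
    """Classify one character code in this simplified model."""
    if utf8_mode and code > 127:
        # In UTF-8 mode everything above 127 counts as a word character,
        # whatever set has been configured.
        return True
    return code in word_chars


e_acute = ord("é")  # 0xE9, a single byte in Latin-1

# Latin-1 mode: the character stops being a word character once an
# ASCII-only set replaces the default one.
latin1_before = is_word_char(e_acute, DEFAULT_WORD_CHARS, utf8_mode=False)
latin1_after = is_word_char(e_acute, LEXER_WORD_CHARS, utf8_mode=False)

# UTF-8 mode: no change either way.
utf8_before = is_word_char(e_acute, DEFAULT_WORD_CHARS, utf8_mode=True)
utf8_after = is_word_char(e_acute, LEXER_WORD_CHARS, utf8_mode=True)
```

In this model only the Latin-1 result flips, which is the behaviour change
to watch for with whole-word searches on accented text.
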
The only other potential issue I can think of at the moment is that
setting the word characters automatically resets the whitespace and
punctuation characters to their default values.
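
The practical consequence is an ordering constraint: any custom whitespace
or punctuation set has to be applied again after the word characters are
set. A toy Python model of that reset, again not touching Scintilla itself
(the CharClasses class and its default whitespace set are invented purely
for illustration):

```python
class CharClasses:
    """Toy model: setting word characters first resets the other classes."""

    # Illustrative default only; the real defaults live inside Scintilla.
    DEFAULT_WHITESPACE = frozenset(" \t")

    def __init__(self):
        self.whitespace = self.DEFAULT_WHITESPACE
        self.word = frozenset()

    def set_whitespace_chars(self, chars):
        self.whitespace = frozenset(chars)

    def set_word_chars(self, chars):
        # Mirrors the behaviour described above: the other classes go back
        # to their defaults before the new word set is applied.
        self.whitespace = self.DEFAULT_WHITESPACE
        self.word = frozenset(chars)


classes = CharClasses()
classes.set_whitespace_chars(" \t,")
classes.set_word_chars("abc_")        # discards the custom whitespace set
lost = classes.whitespace

classes.set_whitespace_chars(" \t,")  # re-apply afterwards to keep it
kept = classes.whitespace
```

So if the word characters are set automatically whenever a lexer is set,
an application that customises whitespace or punctuation would need to
re-apply those settings after each lexer change.
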
--
Regards
Baz Walter