[QScintilla] Search Whole Word matches Only on regular expression symbol

Baz Walter bazwal at ftml.net
Sat Oct 12 23:29:20 BST 2013

On 12/10/13 11:52, Phil Thompson wrote:
> So I need to call SCI_SETWORDCHARS when a lexer is set using the value
> returned by the lexer's wordCharacters() method.
> Is this likely to cause any unforeseen problems?

As usual with Scintilla, the main source of potential problems is
single-byte vs multi-byte encodings. For Latin-1, any byte in the range
0-255 can be set as a word character. But for UTF-8, only the ASCII
range is relevant: all Unicode characters above 127 are always treated
as word characters, regardless of what has been set using SCI_SETWORDCHARS.

However, Scintilla's default set of word characters (i.e. those set via 
SCI_SETCHARSDEFAULT) includes the standard alphanumerics and underscore, 
*plus* all the characters in the range 128-255 (regardless of the 
code-page setting).

So, assuming the current lexers' wordCharacters() methods only ever
return ASCII, there is some potential for changes in behaviour if
QScintilla is being used in *Latin-1* mode, since the bytes 128-255
would then no longer count as word characters (UTF-8 mode should be
unaffected).
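
To make the Latin-1 case concrete, here is a rough Python sketch. It
only models the two character sets as described above and does not call
Scintilla at all; the helper name word_bytes_lost and the sample lexer
value are made up for illustration.

```python
import string

# Default word characters as described above: ASCII alphanumerics,
# underscore, plus every byte value in the range 128-255.
DEFAULT_WORD_BYTES = (
    set((string.ascii_letters + string.digits + "_").encode("ascii"))
    | set(range(128, 256))
)


def word_bytes_lost(lexer_word_chars: str) -> set:
    """Return the byte values that are word characters by default but
    would stop being so once the lexer's own set replaces the default.
    (Hypothetical helper, not part of QScintilla.)"""
    return DEFAULT_WORD_BYTES - set(lexer_word_chars.encode("latin-1"))


# Assumed lexer value: ASCII alphanumerics and underscore only.
lexer_chars = string.ascii_letters + string.digits + "_"
lost = word_bytes_lost(lexer_chars)
```

With an ASCII-only lexer set, everything in 128-255 ends up in "lost",
so in Latin-1 mode a byte such as 0xE9 (e-acute) would start acting as
a word boundary.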

The only other potential issue I can think of at the moment is that
setting the word characters automatically resets the whitespace and
punctuation characters to their default values, so any custom values
for those would need to be set again afterwards.
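
One way to cope with that reset is to always send the word characters
first and re-apply any custom whitespace/punctuation sets afterwards.
A minimal sketch of the ordering, assuming a hypothetical send(message,
chars) callable standing in for however the messages actually get
delivered (the message names are passed as plain strings here, purely
for illustration):

```python
def apply_char_classes(send, word, whitespace=None, punctuation=None):
    """Send the word characters first, then any custom whitespace and
    punctuation sets, so the reset triggered by SCI_SETWORDCHARS does
    not discard them. `send` is a hypothetical callable."""
    send("SCI_SETWORDCHARS", word)
    if whitespace is not None:
        send("SCI_SETWHITESPACECHARS", whitespace)
    if punctuation is not None:
        send("SCI_SETPUNCTUATIONCHARS", punctuation)


# Record the messages instead of talking to a real editor widget.
sent = []
apply_char_classes(
    lambda msg, chars: sent.append((msg, chars)),
    word=b"abc_",
    whitespace=b" \t",
)
```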

Baz Walter
