dgit.raspbian.org Git - emacs.git/commit

author	Yuan Fu <casouri@gmail.com>
	Sat, 14 Sep 2024 04:42:17 +0000 (21:42 -0700)
committer	Yuan Fu <casouri@gmail.com>
	Sat, 14 Sep 2024 07:28:23 +0000 (00:28 -0700)
commit	6a6d7925c9ddbf558f70932661ee943262aea4ca
tree	03e91c84614c3ee2afc1c143e6ad74ea7c4f8fab
parent	76faf7e60910ffc29b134fa4d16e3d8c176097a7

Fix range handling so it works for multibyte buffer (bug#73204)

Here, by multibyte buffer I mean a buffer that includes non-ASCII
characters.

The problem is illustrated by this comment, which I copied from the
source:

======================================================================
(ref:bytepos-range-pitfall) Suppose we have the following buffer
content ([ ] is a unibyte char, [    ] is a multibyte char):

    [a][b][c][d][e][ f  ]

and the following ranges (denoted by braces):

    [a][b][c][d][e][ f  ]
    {       }{    }

So far so good, now user deletes a unibyte char at the beginning:

    [b][c][d][e][ f  ]
    {       }{    }

Oops, now our range cuts into the multibyte char, bad!
======================================================================
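
The pitfall above can be reproduced outside Emacs. The following is a
minimal sketch, not code from treesit.c: UTF-8 bytes stand in for the
buffer text, "é" (two bytes) stands in for the multibyte char [ f  ],
offsets are 0-based, and the helper name is_char_boundary is made up for
the illustration.

```python
# Illustration only: none of these names come from Emacs or tree-sitter.

def is_char_boundary(data: bytes, pos: int) -> bool:
    """Return True if byte offset POS does not fall inside a UTF-8 sequence."""
    # UTF-8 continuation bytes always look like 0b10xxxxxx.
    return pos == len(data) or (data[pos] & 0xC0) != 0x80

# [a][b][c][d][e][ f  ] with "é" playing the role of the multibyte char.
before = "abcde\u00e9".encode("utf-8")

# Byte ranges as in the diagram: {a b c} and {d e}.
stale_byte_ranges = [(0, 3), (3, 5)]

# Every range edge sits on a character boundary before the edit.
edges_ok_before = all(
    is_char_boundary(before, pos) for rng in stale_byte_ranges for pos in rng
)

# The user deletes the unibyte char at the beginning; the byte ranges are
# reused unchanged.
after = "bcde\u00e9".encode("utf-8")

# The end of the second range (byte 5) now lands inside "é".
edges_ok_after = all(
    is_char_boundary(after, pos) for rng in stale_byte_ranges for pos in rng
)

print(edges_ok_before, edges_ok_after)
```

The second check fails because byte offset 5 is the continuation byte of
"é" once the leading char is gone, which is the "cuts into the multibyte
char" case from the comment.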

* src/treesit.c (treesit_debug_print_parser_list): Minor fix.
(treesit_sync_visible_region): Change the way we fix up ranges: instead
of using the bytepos ranges from tree-sitter, use the cached Lisp
charpos ranges.
(treesit_make_ts_ranges): New function.
(Ftreesit_parser_set_included_ranges): Refactor out the new function
treesit_make_ts_ranges.
(Ftreesit_parser_included_ranges): Rather than getting the ranges from
tree-sitter, just return the cached Lisp ranges.

* src/treesit.h (Lisp_TS_Parser): Add some comment.
* test/src/treesit-tests.el (treesit-range-fixup-after-edit): New test.
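
The idea behind the change, keeping ranges as character positions and
deriving byte positions only when they are needed, can be sketched as
follows. This is an assumption-laden illustration, not the logic of
treesit_make_ts_ranges: positions are 0-based, UTF-8 stands in for the
buffer encoding, clamping to the text length is a simplification of
whatever fixup the real code does, and the names char_to_byte and
make_byte_ranges are hypothetical.

```python
# Illustration only: a stand-in for "cache charpos ranges, derive bytepos".

def char_to_byte(text: str, charpos: int) -> int:
    """Return the UTF-8 byte offset of character offset CHARPOS in TEXT."""
    return len(text[:charpos].encode("utf-8"))

def make_byte_ranges(text: str, char_ranges):
    """Derive byte ranges from cached character ranges for the current TEXT.

    Ranges are clamped to the text length (a simplifying assumption) and
    empty ranges are dropped.
    """
    limit = len(text)
    byte_ranges = []
    for start, end in char_ranges:
        start, end = min(start, limit), min(end, limit)
        if start < end:
            byte_ranges.append((char_to_byte(text, start), char_to_byte(text, end)))
    return byte_ranges

# Cached character ranges, same shape as in the pitfall comment.
cached_char_ranges = [(0, 3), (3, 5)]

# Text after the leading unibyte char was deleted; "é" is the multibyte char.
text_after_edit = "bcde\u00e9"

derived = make_byte_ranges(text_after_edit, cached_char_ranges)
print(derived)
```

Because each edge is computed from a character offset, it always lands on
a character boundary, whatever mix of unibyte and multibyte chars precedes
it.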

src/treesit.c
src/treesit.h
test/src/treesit-tests.el