summary |
shortlog | log |
commit |
commitdiff |
tree
first ⋅ prev ⋅ next
Carbo Kuo [Thu, 25 Feb 2021 12:13:38 +0000 (21:13 +0900)]
Fix a severe performance bug in `Conversion::Convert` that caused O(N^2) complexity.
In `Conversion.cpp`, line 27:
```
Optional<const DictEntry*> matched = dict->MatchPrefix(pstr);
```
pstr is a `const char*`. However, there is no overloaded function which parameter `const char*`.
Therefore it matches `Optional<const DictEntry*> MatchPrefix(const std::string& word) const`.
There is an implicit type conversion from `char*` to `std::string` with time complexity O(N).
I added new benchmark tests. Before the fix:
1: ------------------------------------------------------------------
1: Benchmark Time CPU Iterations
1: ------------------------------------------------------------------
1: BM_Initialization/hk2s 1.17 ms 1.12 ms 645
1: BM_Initialization/hk2t 0.116 ms 0.116 ms 5922
1: BM_Initialization/jp2t 0.206 ms 0.201 ms 3500
1: BM_Initialization/s2hk 18.2 ms 17.9 ms 40
1: BM_Initialization/s2t 18.2 ms 18.1 ms 39
1: BM_Initialization/s2tw 17.9 ms 17.8 ms 39
1: BM_Initialization/s2twp 18.6 ms 18.4 ms 39
1: BM_Initialization/t2hk 0.055 ms 0.054 ms 12907
1: BM_Initialization/t2jp 0.120 ms 0.117 ms 5978
1: BM_Initialization/t2s 0.988 ms 0.984 ms 710
1: BM_Initialization/tw2s 1.08 ms 1.05 ms 672
1: BM_Initialization/tw2sp 1.26 ms 1.24 ms 563
1: BM_Initialization/tw2t 0.088 ms 0.083 ms 8528
1: BM_Convert2M 413 ms 413 ms 2
1: BM_Convert/100 1.09 ms 1.09 ms 629
1: BM_Convert/1000 33.2 ms 33.2 ms 21
1: BM_Convert/10000 2822 ms 2822 ms 1
1: BM_Convert/100000 (took longer than 5 minutes, killed)
Now:
1: ------------------------------------------------------------------
1: Benchmark Time CPU Iterations
1: ------------------------------------------------------------------
1: BM_Initialization/hk2s 1.07 ms 1.07 ms 650
1: BM_Initialization/hk2t 0.114 ms 0.114 ms 6092
1: BM_Initialization/jp2t 0.204 ms 0.200 ms 3503
1: BM_Initialization/s2hk 18.2 ms 18.0 ms 40
1: BM_Initialization/s2t 17.6 ms 17.6 ms 39
1: BM_Initialization/s2tw 18.0 ms 17.9 ms 40
1: BM_Initialization/s2twp 17.9 ms 17.9 ms 39
1: BM_Initialization/t2hk 0.055 ms 0.055 ms 12855
1: BM_Initialization/t2jp 0.125 ms 0.124 ms 5772
1: BM_Initialization/t2s 1.000 ms 0.989 ms 695
1: BM_Initialization/tw2s 1.09 ms 1.07 ms 668
1: BM_Initialization/tw2sp 1.29 ms 1.26 ms 558
1: BM_Initialization/tw2t 0.082 ms 0.082 ms 8558
1: BM_Convert2M 361 ms 361 ms 2
1: BM_Convert/100 0.741 ms 0.740 ms 948
1: BM_Convert/1000 7.54 ms 7.52 ms 94
1: BM_Convert/10000 76.3 ms 76.3 ms 9
1: BM_Convert/100000 786 ms 786 ms 1
This bug was reported in https://github.com/BYVoid/OpenCC/issues/478 and https://github.com/BYVoid/OpenCC/issues/517.
Gbp-Pq: Name 0007-Fix-a-severe-performance-bug-in-Conversion-Convert-t.patch
Carbo Kuo [Thu, 25 Feb 2021 11:48:50 +0000 (20:48 +0900)]
Fix a bug in the calculation of DictGroup::keyMaxLength.
The length should be the maximum of all sub-dictionaries in the dictionary group.
Gbp-Pq: Name 0006-Fix-a-bug-in-the-calculation-of-DictGroup-keyMaxLeng.patch
Shengjing Zhu [Sat, 6 Mar 2021 09:53:25 +0000 (17:53 +0800)]
Use system googletest
Forwarded: not-needed
Gbp-Pq: Name 0005-Use-system-googletest.patch
Gunnar Hjalmarsson [Sat, 6 Mar 2021 09:45:20 +0000 (17:45 +0800)]
Don't fetch remote images when reading docs on disk
Forwarded: no
Last-Update: 2021-01-14
This fixes a privacy breach previously reported as Lintian warnings
Gbp-Pq: Name 0004-no-remote-images-when-reading-docs-on-disk.patch
Boyuan Yang [Fri, 26 Jun 2020 00:32:44 +0000 (20:32 -0400)]
data/: Explicitly use python3
Before "python" is default to python3, this patch should be kept.
Last-Update: 2020-06-25
Gbp-Pq: Name 0003-data-Explicitly-use-python3.patch
Boyuan Yang [Sat, 23 May 2020 15:11:46 +0000 (11:11 -0400)]
use-system-libraries
Gbp-Pq: Name 0002-use-system-libraries.patch
IME Packaging Team [Mon, 9 Oct 2017 14:02:39 +0000 (22:02 +0800)]
use-cmake-install-libdir
===================================================================
Gbp-Pq: Name 0001-use-cmake-install-libdir.patch
Shengjing Zhu [Sun, 7 Mar 2021 06:20:40 +0000 (06:20 +0000)]
opencc (1.1.1+git20200624+ds2-10) unstable; urgency=medium
* Team upload.
* Upload to unstable.
* Backport patch to fix performance regression again.
Add
+ 0006-Fix-a-bug-in-the-calculation-of-DictGroup-keyMaxLeng.patch
+ 0007-Fix-a-severe-performance-bug-in-Conversion-Convert-t.patch
[dgit import unpatched opencc 1.1.1+git20200624+ds2-10]
Shengjing Zhu [Sun, 7 Mar 2021 06:20:40 +0000 (06:20 +0000)]
Import opencc_1.1.1+git20200624+ds2-10.debian.tar.xz
[dgit import tarball opencc 1.1.1+git20200624+ds2-10 opencc_1.1.1+git20200624+ds2-10.debian.tar.xz]
Boyuan Yang [Fri, 26 Jun 2020 00:34:50 +0000 (01:34 +0100)]
Import opencc_1.1.1+git20200624+ds2.orig.tar.gz
[dgit import orig opencc_1.1.1+git20200624+ds2.orig.tar.gz]