ux...

ux-trie(簡潔データ構造のトライで,tail を実装してさらに tail 自体もトライ化して圧縮する)のバグが直ったようなので,トライのサイズや検索性能を調べているのだけど,長い tail を持つようなデータだと,トライのサイズが tx 比で 1/2 近くまで小さくなる一方,検索は(辞書順だと)tx 比で倍程度遅くなるようだ(tail を圧縮しないと tx よりトライのサイズが大きくなる).これぐらい極端だと,かえって潔い感じかもしれない.トライ(ダブル配列,簡潔データ構造)と STL コンテナ - ny23の日記に ux を加えて再実験したものは以下の通り.

======================================================================
 Wikipedia-en 記事タイトル |  Build  | Search  | Search* | Size [bytes]
======================================================================
 sorted keys
  Darts 0.32             | 14.251s |  0.544s |  6.694s |  552,547,920
  darts-clone 0.32g5     |  7.349s |  0.662s |  5.961s |  258,735,104
  darts-clone 0.32g5 v:1 | 16.834s |  0.809s |  6.305s |   95,080,448
 *darts-clone 0.32e5     | 14.971s |  0.927s |  5.948s |  113,360,140
 *darts-clone 0.32e5 v:1 | 14.452s |  0.854s |  5.630s |   87,122,880
  darts-clone 0.32e5 huge|  4.309s |  0.547s |  6.018s |  147,491,712
 *DASTrie 1.0 -c         |      NA |      NA |      NA |           NA
  DASTrie 1.0            |  4.856s |  0.703s |  6.056s |  157,593,742
 *doar 0.0.13 static     |  8.797s |  0.730s |  5.977s |  123,000,249
  --------------------------------------------------------------------
 *doar 0.0.13 dynamic    | +3.922s |  0.689s |  5.658s | +123,000,249
  dda                    |  3.237s |  0.633s |  7.167s |  517,091,328
 ---------------------------------------------------------------------
  map           k:char*  |  2.920s |  2.093s | 17.227s |(>163,859,393)
  map           k:string | 10.504s |  5.640s | 25.184s |(>163,859,393)
  unordered_map k:char*  |  6.354s |  2.007s |  3.872s |(>163,859,393)
  unordered_map k:string |  7.240s |  2.245s |  4.590s |(>163,859,393)
  --------------------------------------------------------------------
  tx 0.17                | 18.676s | 25.349s | 63.566s |   87,520,717
  ux 0.1.2               | 47.255s | 46.280s | 67.860s |   49,134,003
  ux 0.1.2 isTailUX=false| 16.795s | 30.812s | 50.047s |  105,423,225
 =====================================================================
 randomized keys
  doar 0.0.13 dynamic    |+24.844s |  1.946s |  5.660s | +123,000,249
  dda                    | 10.103s |  3.700s |  8.473s |  517,091,328
  --------------------------------------------------------------------
  map           k:char*  | 18.346s |  3.589s | 21.163s |(>163,859,393)
  map           k:string | 27.629s |  8.044s | 30.681s |(>163,859,393)
  unordered_map k:char*  |  6.381s |  3.794s |  3.878s |(>163,859,393)
  unordered_map k:string |  7.262s |  4.580s |  4.601s |(>163,859,393)
  --------------------------------------------------------------------
  tx 0.17                | 38.663s | 26.469s | 63.192s |   87,520,717
  ux 0.1.2               | 62.991s | 47.582s | 68.247s |   49,134,003
  ux 0.1.2 isTailUX=false| 33.518s | 32.109s | 50.454s |  105,423,225
 ======================================================================

======================================================================
 郵便番号辞書              |  Build  | Search  | Search* | Size [bytes]
======================================================================
 sorted keys
  Darts 0.32             |  0.039s |  0.003s |  0.012s |    2,156,576
  darts-clone 0.32g5     |  0.040s |  0.004s |  0.011s |    1,069,056
  darts-clone 0.32g5 v:1 |  0.022s |  0.003s |  0.008s |      106,496
 *darts-clone 0.32e5     |  0.187s |  0.004s |  0.012s |    1,154,676
 *darts-clone 0.32e5 v:1 |  0.179s |  0.004s |  0.011s |      650,116
  darts-clone 0.32e5 huge|  0.156s |  0.004s |  0.014s |    1,363,656
 *DASTrie 1.0 -c         |  2.268s |  0.006s |  0.015s |    1,248,609
  DASTrie 1.0            |  2.434s |  0.006s |  0.016s |    1,411,783
 *doar 0.0.13 static     |  0.066s |  0.004s |  0.013s |    1,204,561
  --------------------------------------------------------------------
 *doar 0.0.13 dynamic    | +0.090s |  0.004s |  0.013s |   +1,204,561
  dda                    |  0.023s |  0.004s |  0.012s |    2,129,920
  --------------------------------------------------------------------
  map           k:char*  |  0.044s |  0.028s |  0.074s |  (>1,425,852)
  map           k:string |  0.138s |  0.068s |  0.142s |  (>1,425,852)
  unordered_map k:char*  |  0.024s |  0.010s |  0.028s |  (>1,425,852)
  unordered_map k:string |  0.040s |  0.013s |  0.039s |  (>1,425,852)
  --------------------------------------------------------------------
  tx 0.17                |  0.098s |  0.091s |  0.133s |      222,888
  ux 0.1.2               |  0.094s |  0.161s |  0.189s |      220,209
  ux 0.1.2 isTailUX=false|  0.094s |  0.162s |  0.190s |      222,165
 =====================================================================
 randomized keys
 *doar 0.0.13 dynamic    | +0.105s |  0.009s |  0.013s |   +1,204,561
  dda                    |  0.034s |  0.010s |  0.013s |    2,129,920
  --------------------------------------------------------------------
  map           k:char*  |  0.068s |  0.035s |  0.084s |  (>1,425,852)
  map           k:string |  0.144s |  0.087s |  0.161s |  (>1,425,852)
  unordered_map k:char*  |  0.030s |  0.025s |  0.028s |  (>1,425,852)
  unordered_map k:string |  0.049s |  0.036s |  0.040s |  (>1,425,852)
  --------------------------------------------------------------------
  tx 0.17                |  0.177s |  0.098s |  0.133s |      222,888
  ux 0.1.2               |  0.177s |  0.171s |  0.189s |      220,209
  ux 0.1.2 isTailUX=false|  0.175s |  0.169s |  0.190s |      222,165
======================================================================
(compiled with gcc 4.6.0 20100925, -O2 -march=core2 -m64)