Check-in [c9310c9a]
Overview
Comment: Fix Unicode character encoding issues on Windows in the fts4unicode test file.
SHA1: c9310c9a2bad11f1d033a57b33ea7aed43a8238d
User & Date: mistachkin 2013-10-12 00:56:21
Original Comment: Fix Unicode character encoding issues on Windows.
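
The sketch below illustrates the kind of problem this check-in presumably addresses; it is not part of the check-in itself. It assumes the test script was saved as UTF-8 and read on Windows using cp1252 as the system encoding (Tcl 8.x `source` uses the system encoding unless `-encoding` is given). The variable names are hypothetical. A `\u` escape keeps the script pure ASCII, so the resulting string does not depend on how the file is decoded.

```tcl
# Hypothetical illustration; names and the cp1252 assumption are not from the check-in.

# A \u escape names the code point directly (U+00C4, capital A with diaeresis).
set escaped "\uC4"

# Simulate a UTF-8 encoded literal being decoded as cp1252:
# encode the character to UTF-8 bytes, then decode those bytes as cp1252.
set utf8_bytes [encoding convertto utf-8 "\uC4"]
set misread [encoding convertfrom cp1252 $utf8_bytes]

puts [string length $escaped]   ;# 1: the intended single character
puts [string length $misread]   ;# 2: the two UTF-8 bytes became two characters
puts [scan $escaped %c]         ;# 196, i.e. 0xC4
```

Under these assumptions a literal non-ASCII character in the test input would reach the tokenizer as two unrelated characters, while the escaped form is unaffected.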
Context
2013-10-12
  02:31  Permit the creation of VSIX packages for Win32. check-in: 035d03e9 user: mistachkin tags: trunk
  00:56  Fix Unicode character encoding issues on Windows in the fts4unicode test file. check-in: c9310c9a user: mistachkin tags: trunk
2013-10-11
  23:37  Identify requirements text in the SQLITE_CONFIG_ documentation. Fix a typo (a duplicated word) in part of that documentation. Add some requirements marks for DETACH to the test scripts. No code changes. check-in: 1be0a3ad user: drh tags: trunk
Changes

Changes to test/fts4unicode.test.

Old version (lines 40-78):

    append sql "'"
  }
  append sql ")"
  uplevel [list do_execsql_test $tn $sql [list [list {*}$res]]]
}

do_unicode_token_test 1.0 {a B c D} {0 a a 1 b B 2 c c 3 d D}

do_unicode_token_test 1.1 {Ä Ö Ü} {0 ä Ä 1 ö Ö 2 ü Ü}


do_unicode_token_test 1.2 {xÄx xÖx xÜx} {0 xäx xÄx 1 xöx xÖx 2 xüx xÜx}


# 0x00DF is a small "sharp s". 0x1E9E is a capital sharp s.
do_unicode_token_test 1.3 "\uDF" "0 \uDF \uDF"
do_unicode_token_test 1.4 "\u1E9E" "0 ß \u1E9E"
do_unicode_token_test 1.5 "\u1E9E" "0 \uDF \u1E9E"

do_unicode_token_test 1.6 "The quick brown fox" {
  0 the The 1 quick quick 2 brown brown 3 fox fox
}
do_unicode_token_test 1.7 "The\u00bfquick\u224ebrown\u2263fox" {
  0 the The 1 quick quick 2 brown brown 3 fox fox
}

do_unicode_token_test2 1.8  {a B c D} {0 a a 1 b B 2 c c 3 d D}
do_unicode_token_test2 1.9  {Ä Ö Ü} {0 a Ä 1 o Ö 2 u Ü}

do_unicode_token_test2 1.10 {xÄx xÖx xÜx} {0 xax xÄx 1 xox xÖx 2 xux xÜx}


# Check that diacritics are removed if remove_diacritics=1 is specified.
# And that they do not break tokens.
do_unicode_token_test2 1.11 "xx\u0301xx" "0 xxxx xx\u301xx"

# Title-case mappings work
do_unicode_token_test 1.12 "\u01c5" "0 \u01c6 \u01c5"

#-------------------------------------------------------------------------
#
set docs [list {
  Enhance the INSERT syntax to allow multiple rows to be inserted via the
  VALUES clause.
} {

New version (lines 40-83):

    append sql "'"
  }
  append sql ")"
  uplevel [list do_execsql_test $tn $sql [list [list {*}$res]]]
}

do_unicode_token_test 1.0 {a B c D} {0 a a 1 b B 2 c c 3 d D}

do_unicode_token_test 1.1 "\uC4 \uD6 \uDC" \
    "0 \uE4 \uC4 1 \uF6 \uD6 2 \uFC \uDC"

do_unicode_token_test 1.2 "x\uC4x x\uD6x x\uDCx" \
    "0 x\uE4x x\uC4x 1 x\uF6x x\uD6x 2 x\uFCx x\uDCx"

# 0x00DF is a small "sharp s". 0x1E9E is a capital sharp s.
do_unicode_token_test 1.3 "\uDF" "0 \uDF \uDF"
do_unicode_token_test 1.4 "\u1E9E" "0 \uDF \u1E9E"


do_unicode_token_test 1.5 "The quick brown fox" {
  0 the The 1 quick quick 2 brown brown 3 fox fox
}
do_unicode_token_test 1.6 "The\u00bfquick\u224ebrown\u2263fox" {
  0 the The 1 quick quick 2 brown brown 3 fox fox
}

do_unicode_token_test2 1.7  {a B c D} {0 a a 1 b B 2 c c 3 d D}
do_unicode_token_test2 1.8  "\uC4 \uD6 \uDC" "0 a \uC4 1 o \uD6 2 u \uDC"

do_unicode_token_test2 1.9  "x\uC4x x\uD6x x\uDCx" \
    "0 xax x\uC4x 1 xox x\uD6x 2 xux x\uDCx"

# Check that diacritics are removed if remove_diacritics=1 is specified.
# And that they do not break tokens.
do_unicode_token_test2 1.10 "xx\u0301xx" "0 xxxx xx\u301xx"

# Title-case mappings work
do_unicode_token_test 1.11 "\u01c5" "0 \u01c6 \u01c5"

#-------------------------------------------------------------------------
#
set docs [list {
  Enhance the INSERT syntax to allow multiple rows to be inserted via the
  VALUES clause.
} {