Overview
Comment: | Fix handling of utf-16 encoding of code point 0xE000. (CVS 4017) |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: | bfc35ce8673ce51f726535b90c1d86be |
User & Date: | danielk1977 2007-05-16 18:11:41.000 |
Context
2007-05-16
18:23 | Remove the SKIP_UTF16 macros (they are no longer in use). (CVS 4018) (check-in: 73e654fbdc user: danielk1977 tags: trunk) | |
18:11 | Fix handling of utf-16 encoding of code point 0xE000. (CVS 4017) (check-in: bfc35ce867 user: danielk1977 tags: trunk) | |
17:50 | Avoid passing a negative value to isspace() in a couple places. (CVS 4016) (check-in: d5db8be368 user: danielk1977 tags: trunk) | |
Changes
Changes to src/utf.c.
︙ | ︙ | |||
Lines 8-22 after the change:

** May you find forgiveness for yourself and forgive others.
** May you share freely, never taking more than you give.
**
*************************************************************************
** This file contains routines used to translate between UTF-8,
** UTF-16, UTF-16BE, and UTF-16LE.
**
** $Id: utf.c,v 1.49 2007/05/16 18:11:41 danielk1977 Exp $
**
** Notes on UTF-8:
**
**   Byte-0    Byte-1    Byte-2    Byte-3    Value
**  0xxxxxxx                                 00000000 00000000 0xxxxxxx
**  110yyyyy  10xxxxxx                       00000000 00000yyy yyxxxxxx
**  1110zzzz  10yyyyyy  10xxxxxx             00000000 zzzzyyyy yyxxxxxx
︙ | ︙ | |||
Lines 103-128 after the change:

    *zOut++ = (c&0x00FF);   \
  }   \
}

#define READ_UTF16LE(zIn, c){   \
  c = (*zIn++);   \
  c += ((*zIn++)<<8);   \
  if( c>=0xD800 && c<0xE000 ){   \
    int c2 = (*zIn++);   \
    c2 += ((*zIn++)<<8);   \
    c = (c2&0x03FF) + ((c&0x003F)<<10) + (((c&0x03C0)+0x0040)<<10);   \
    if( (c & 0xFFFF0000)==0 ) c = 0xFFFD;   \
  }   \
}

#define READ_UTF16BE(zIn, c){   \
  c = ((*zIn++)<<8);   \
  c += (*zIn++);   \
  if( c>=0xD800 && c<0xE000 ){   \
    int c2 = ((*zIn++)<<8);   \
    c2 += (*zIn++);   \
    c = (c2&0x03FF) + ((c&0x003F)<<10) + (((c&0x03C0)+0x0040)<<10);   \
    if( (c & 0xFFFF0000)==0 ) c = 0xFFFD;   \
  }   \
}
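The surrogate range is 0xD800 through 0xDFFF, so the test in both macros has to exclude 0xE000, which is an ordinary code point in the Basic Multilingual Plane. A check that also matched 0xE000 would treat it as the first half of a surrogate pair and consume the following code unit as well (the pre-fix line is not shown on this page, so that reading is an inference from the check-in comment). The following is a minimal sketch of the same decoding logic as a function; `read_utf16le` and `check_e000` are hypothetical names for illustration and are not part of SQLite.

```c
#include <assert.h>

/* Hypothetical helper (not SQLite code): decode one code point from a
** little-endian UTF-16 byte stream and advance *pz past it. */
static unsigned int read_utf16le(const unsigned char **pz){
  const unsigned char *z = *pz;
  unsigned int c = (unsigned int)z[0] | ((unsigned int)z[1]<<8);
  z += 2;
  if( c>=0xD800 && c<0xE000 ){   /* surrogates are 0xD800..0xDFFF only */
    unsigned int c2 = (unsigned int)z[0] | ((unsigned int)z[1]<<8);
    z += 2;
    /* Same arithmetic as the macro: low 10 bits from c2, the high
    ** bits from c, plus the 0x10000 offset (0x0040<<10). */
    c = (c2&0x03FF) + ((c&0x003F)<<10) + (((c&0x03C0)+0x0040)<<10);
    if( (c & 0xFFFF0000)==0 ) c = 0xFFFD;   /* kept from the macro */
  }
  *pz = z;
  return c;
}

/* U+E000 followed by 'A': the decoder must consume exactly two bytes
** for U+E000 and leave 'A' for the next call. */
static void check_e000(void){
  static const unsigned char buf[] = { 0x00, 0xE0, 0x41, 0x00 };
  const unsigned char *z = buf;
  assert( read_utf16le(&z)==0xE000 );
  assert( z==buf+2 );
  assert( read_utf16le(&z)==0x41 );
}
```

With an inclusive upper bound the first call in `check_e000` would swallow the bytes of 'A' as if they were a low surrogate, which is the kind of failure the new `enc-10` test is meant to catch.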
︙ | ︙ | |||
Lines 484-514 after the change:

    t = i;
    if( i>=0xD800 && i<=0xDFFF ) t = 0xFFFD;
    if( (i&0xFFFFFFFE)==0xFFFE ) t = 0xFFFD;
    assert( c==t );
    assert( (z-zBuf)==n );
  }
  for(i=0; i<0x00110000; i++){
    if( i>=0xD800 && i<0xE000 ) continue;
    z = zBuf;
    WRITE_UTF16LE(z, i);
    n = z-zBuf;
    z[0] = 0;
    z = zBuf;
    READ_UTF16LE(z, c);
    assert( c==i );
    assert( (z-zBuf)==n );
  }
  for(i=0; i<0x00110000; i++){
    if( i>=0xD800 && i<0xE000 ) continue;
    z = zBuf;
    WRITE_UTF16BE(z, i);
    n = z-zBuf;
    z[0] = 0;
    z = zBuf;
    READ_UTF16BE(z, c);
    assert( c==i );
    assert( (z-zBuf)==n );
  }
}
#endif /* SQLITE_TEST */
#endif /* SQLITE_OMIT_UTF16 */
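The self-test loops above write every code point except the surrogates and read it back, so U+E000 is now part of the round trip. As a rough standalone illustration of that idea, the sketch below does the same over 16-bit code units rather than bytes. The helpers `utf16_encode`, `utf16_decode`, and `utf16_roundtrip_all` are assumptions made for this example and do not appear in SQLite.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; these are not SQLite's macros. */

/* Encode code point c as UTF-16 code units; returns the number of units. */
static int utf16_encode(uint32_t c, uint16_t out[2]){
  if( c<0x10000 ){
    out[0] = (uint16_t)c;
    return 1;
  }
  c -= 0x10000;
  out[0] = (uint16_t)(0xD800 + (c>>10));     /* high surrogate */
  out[1] = (uint16_t)(0xDC00 + (c&0x3FF));   /* low surrogate */
  return 2;
}

/* Decode one code point; stores the number of units consumed in *pn. */
static uint32_t utf16_decode(const uint16_t *in, int *pn){
  uint32_t c = in[0];
  if( c>=0xD800 && c<0xE000 ){               /* exclusive upper bound */
    *pn = 2;
    return 0x10000 + ((c&0x3FF)<<10) + (in[1]&0x3FF);
  }
  *pn = 1;
  return c;
}

/* Round-trip every code point outside the surrogate range.
** Returns the number of code points checked. */
static uint32_t utf16_roundtrip_all(void){
  uint32_t i, nChecked = 0;
  for(i=0; i<0x110000; i++){
    uint16_t buf[2] = { 0, 0 };
    int nWritten, nRead;
    if( i>=0xD800 && i<0xE000 ) continue;    /* skip surrogates only */
    nWritten = utf16_encode(i, buf);
    assert( utf16_decode(buf, &nRead)==i );
    assert( nRead==nWritten );
    nChecked++;
  }
  return nChecked;
}
```

Skipping with `i<0xE000` rather than a wider bound matters here for the same reason as in the decoder: a loop that also skipped 0xE000 would never notice a reader that mishandles it.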
Changes to test/enc.test.
︙ | ︙ | |||
Lines 9-23 after the change:

#
#***********************************************************************
# This file implements regression tests for SQLite library.  The focus of
# this file is testing the SQLite routines used for converting between the
# various suported unicode encodings (UTF-8, UTF-16, UTF-16le and
# UTF-16be).
#
# $Id: enc.test,v 1.6 2007/05/16 18:11:41 danielk1977 Exp $

set testdir [file dirname $argv0]
source $testdir/tester.tcl

# Skip this test if the build does not support multiple encodings.
#
ifcapable {!utf16} {
︙ | ︙ | |||
Lines 144-154 after the change:

test_conversion enc-X "\u0100"
test_conversion enc-4 "\u1234"
test_conversion enc-5 "\u4321abc"
test_conversion enc-6 "\u4321\u1234"
test_conversion enc-7 [string repeat "abcde\u00EF\u00EE\uFFFCabc" 100]
test_conversion enc-8 [string repeat "\u007E\u007F\u0080\u0081" 100]
test_conversion enc-9 [string repeat "\u07FE\u07FF\u0800\u0801\uFFF0" 100]
test_conversion enc-10 [string repeat "\uE000" 100]

finish_test