Big re-organization of repository [W.I.P]

This commit is contained in:
Yehonal
2016-08-11 20:25:27 +02:00
parent c62a72c0a8
commit 0f85ce1c54
3016 changed files with 1271 additions and 1 deletions

View File

@@ -0,0 +1,281 @@
CHARSET_INFO
============
A structure containing data for charset+collation pair implementation.
Virtual functions that use this data are collected into separate
structures, MY_CHARSET_HANDLER and MY_COLLATION_HANDLER.
typedef struct charset_info_st
{
uint number;
uint primary_number;
uint binary_number;
uint state;
const char *csname;
const char *name;
const char *comment;
uchar *ctype;
uchar *to_lower;
uchar *to_upper;
uchar *sort_order;
uint16 *tab_to_uni;
MY_UNI_IDX *tab_from_uni;
uchar state_map[256];
uchar ident_map[256];
uint strxfrm_multiply;
uint mbminlen;
uint mbmaxlen;
uint16 max_sort_char; /* For LIKE optimization */
MY_CHARSET_HANDLER *cset;
MY_COLLATION_HANDLER *coll;
} CHARSET_INFO;
CHARSET_INFO fields description:
===============================
Numbers (identifiers)
---------------------
number - an ID uniquely identifying this charset+collation pair.
primary_number - ID of a charset+collation pair, which consists
of the same character set and the default collation of this
character set. Not really used now. Intended to optimize some
parts of the code where we need to find the default collation
using its non-default counterpart for the given character set.
binary_number - ID of a charset+collation pair, which consists
of the same character set and the binary collation of this
character set. Not really used now.
Names
-----
csname - name of the character set for this charset+collation pair.
name - name of the collation for this charset+collation pair.
comment - a text comment, displayed in "Description" column of
SHOW CHARACTER SET output.
Conversion tables
-----------------
ctype - pointer to array[257] of "type of characters"
bit mask for each character, e.g., whether a
character is a digit, letter, separator, etc.
Monty 2004-10-21:
If you look at the macros, we use ctype[(char)+1].
ctype[0] is traditionally in most ctype libraries
reserved for EOF (-1). The idea is that you can use
the result from fgetc() directly with ctype[]. As
we have to be compatible with external ctype[] versions,
it's better to do it the same way as they do...
to_lower - pointer to array[256] used in LCASE()
to_upper - pointer to array[256] used in UCASE()
sort_order - pointer to array[256] used for strings comparison
In all Asian charsets these arrays are set up as follows:
- All bytes in the range 0x80..0xFF were marked as letters in the
ctype array.
- The to_lower and to_upper arrays map only ASCII letters.
UPPER() and LOWER() doesn't really work for multi-byte characters.
Most of the characters in Asian character sets are ideograms
anyway and they don't have case mapping. However, there are
still some characters from European alphabets.
For example:
_ujis 0x8FAAF2 - LATIN CAPITAL LETTER Y WITH ACUTE
_ujis 0x8FABF2 - LATIN SMALL LETTER Y WITH ACUTE
But they don't map to each other with UPPER and LOWER operations.
- The sort_order array is filled case insensitively for the
ASCII range 0x00..0x7F, and in "binary" fashion for the multi-byte
range 0x80..0xFF for these collations:
cp932_japanese_ci,
euckr_korean_ci,
eucjpms_japanese_ci,
gb2312_chinese_ci,
sjis_japanese_ci,
ujis_japanese_ci.
So multi-byte characters are sorted just according to their codes.
- Two collations are still case insensitive for the ASCII characters,
but have special sorting order for multi-byte characters
(something more complex than just according to codes):
big5_chinese_ci
gbk_chinese_ci
So handlers for these collations use only the 0x00..0x7F part
of their sort_order arrays, and apply the special functions
for multi-byte characters
In Unicode character sets we have full support of UPPER/LOWER mapping,
for sorting order, and for character type detection.
"utf8_general_ci" still has the "old-fashioned" arrays
like to_upper, to_lower, sort_order and ctype, but they are
not really used (maybe only in some rare legacy functions).
Unicode conversion data
-----------------------
For 8-bit character sets:
tab_to_uni : array[256] of charset->Unicode translation
tab_from_uni: a structure for Unicode->charset translation
Non-8-bit charsets have their own structures per charset
hidden in corresponding ctype-xxx.c file and don't use
tab_to_uni and tab_from_uni tables.
Parser maps
-----------
state_map[]
ident_map[]
These maps are used to quickly identify whether a character is an
identifier part, a digit, a special character, or a part of another
SQL language lexical item.
Probably can be combined with ctype array in the future.
But for some reasons these two arrays are used in the parser,
while a separate ctype[] array is used in the other part of the
code, like fulltext, etc.
Miscellaneous fields
--------------------
strxfrm_multiply - how many times a sort key (that is, a string
that can be passed into memcmp() for comparison)
can be longer than the original string.
Usually it is 1. For some complex
collations it can be bigger. For example,
in latin1_german2_ci, a sort key is up to
two times longer than the original string.
e.g. Letter 'A' with two dots above is
substituted with 'AE'.
mbminlen - minimum multi-byte sequence length.
Now always 1 except for ucs2. For ucs2,
it is 2.
mbmaxlen - maximum multi-byte sequence length.
1 for 8-bit charsets. Can be also 2 or 3.
max_sort_char - for LIKE range
in case of 8-bit character sets - native code
of maximum character (max_str pad byte);
in case of UTF8 and UCS2 - Unicode code of the maximum
possible character (usually U+FFFF). This code is
converted to multi-byte representation (usually 0xEFBFBF)
and then used as a pad sequence for max_str.
in case of other multi-byte character sets -
max_str pad byte (usually 0xFF).
MY_CHARSET_HANDLER
==================
MY_CHARSET_HANDLER is a collection of character-set
related routines. Defined in m_ctype.h. Have the
following set of functions:
Multi-byte routines
------------------
ismbchar() - detects whether the given string is a multi-byte sequence
mbcharlen() - returns length of multi-byte sequence starting with
the given character
numchars() - returns number of characters in the given string, e.g.
in SQL function CHAR_LENGTH().
charpos() - calculates the offset of the given position in the string.
Used in SQL functions LEFT(), RIGHT(), SUBSTRING(),
INSERT()
well_formed_len()
- returns length of a given multi-byte string in bytes
Used in INSERTs to shorten the given string so it
a) is "well formed" according to the given character set
b) can fit into the given data type
lengthsp() - returns the length of the given string without trailing spaces.
Unicode conversion routines
---------------------------
mb_wc - converts the left multi-byte sequence into its Unicode code.
mc_mb - converts the given Unicode code into multi-byte sequence.
Case and sort conversion
------------------------
caseup_str - converts the given 0-terminated string to uppercase
casedn_str - converts the given 0-terminated string to lowercase
caseup - converts the given string to lowercase using length
casedn - converts the given string to lowercase using length
Number-to-string conversion routines
------------------------------------
snprintf()
long10_to_str()
longlong10_to_str()
The names are pretty self-describing.
String padding routines
-----------------------
fill() - writes the given Unicode value into the given string
with the given length. Used to pad the string, usually
with space character, according to the given charset.
String-to-number conversion routines
------------------------------------
strntol()
strntoul()
strntoll()
strntoull()
strntod()
These functions are almost the same as their STDLIB counterparts,
but also:
- accept length instead of 0-terminator
- are character set dependent
Simple scanner routines
-----------------------
scan() - to skip leading spaces in the given string.
Used when a string value is inserted into a numeric field.
MY_COLLATION_HANDLER
====================
strnncoll() - compares two strings according to the given collation
strnncollsp() - like the above but ignores trailing spaces
strnxfrm() - makes a sort key suitable for memcmp() corresponding
to the given string
like_range() - creates a LIKE range, for optimizer
wildcmp() - wildcard comparison, for LIKE
strcasecmp() - 0-terminated string comparison
instr() - finds the first substring appearance in the string
hash_sort() - calculates hash value taking into account
the collation rules, e.g. case-insensitivity,
accent sensitivity, etc.

View File

@@ -0,0 +1,82 @@
File : README
Author : Richard A. O'Keefe.
Updated: 30 April 1984
Purpose: Explain the new strings package.
The UNIX string libraries (described in the string(3) manual page)
differ from UNIX to UNIX (e.g. strtok is not in V7 or 4.1bsd). Worse,
the sources are not in the public domain, so that if there is a string
routine which is nearly what you want but not quite you can't take a
copy and modify it. And of course C programmers on non-UNIX systems
are at the mercy of their supplier.
This package was designed to let me do reasonable things with C's
strings whatever UNIX (V7, PaNiX, UX63, 4.1bsd) I happen to be using.
Everything in the System III manual is here and does just what the S3
manual says it does. There are also lots of new goodies. I'm sorry
about the names, but the routines do have to work on asphyxiated-at-
birth systems which truncate identifiers. The convention is that a
routine is called
str [n] [c] <operation>
If there is an "n", it means that the function takes an (int) "length"
argument, which bounds the number of characters to be moved or looked
at. If the function has a "set" argument, a "c" in the name indicates
that the complement of the set is used. Functions or variables whose
names start with _ are support routines which aren't really meant for
general use. I don't know what the "p" is doing in "strpbrk", but it
is there in the S3 manual so it's here too. "istrtok" does not follow
this rule, but with 7 letters what can you do?
I have included new versions of atoi(3) and atol(3) as well. They
use a new primitive str2int, which takes a pair of bounds and a radix,
and does much more thorough checking than the normal atoi and atol do.
The result returned by atoi & atol is valid if and only if errno == 0.
There is also an output conversion routine int2str, with itoa and ltoa
as interface macros. Only after writing int2str did I notice that the
str2int routine has no provision for unsigned numbers. On reflection,
I don't greatly care. I'm afraid that int2str may depend on your "C"
compiler in unexpected ways. Do check the code with -S.
Several of these routines have "asm" inclusions conditional on the
VaxAsm option. These insertions can make the routines which have them
quite a bit faster, but there is a snag. The VAX architects, for some
reason best known to themselves and their therapists, decided that all
"strings" were shorter than 2^16 bytes. Even when the length operands
are in 32-bit registers, only 16 bits count. So the "asm" versions do
not work for long strings. If you can guarantee that all your strings
will be short, define VaxAsm in the makefile, but in general, and when
using other machines, do not define it.
To use this library, you need the "strings.a" library file and the
"strings.h" and "ctypes.h" header files. The other header files are
for compiling the library itself, though if you are hacking extensions
you may find them useful. General users really shouldn't see them.
I've defined a few macros I find useful in "strings.h"; if you have no
need for "index", "rindex", "streql", and "beql", just edit them out.
On the 4.1bsd system I am using declaring all these functions 'extern'
does not mean that they will all be loaded; but only the ones you use.
When using lesser systems you may find it necessary to break strings.h
up, or you could get by with just adding "extern" declarations for the
functions you want as you need them. Many of these functions have the
same names as functions in the "standard C library", by design as this
is a replacement/reimplementation of part of that library. So you may
have to talk the loader into loading this library first. Again, I've
found no problems on 4.1bsd.
You may wonder at my failure to provide manual pages for this code.
For the things in V7, 4.?, or SIII, you should be able to use whichever
manual page came with that system, and anything I might write would be
so like it as to raise suspicions of violating AT&T copyrights. In the
sources you will find comments which provide far more documentation for
these routines than AT&T ever provided for their strings stuff, I just
don't happen to have put it in nroff -man form. Had I done so, the .3
files would have outbulked the .c files!
These files are in the public domain. This includes getopt.c, which
is the work of Henry Spencer, University of Toronto Zoology, who says of
it "None of this software is derived from Bell software. I had no access
to the source for Bell's versions at the time I wrote it. This software
is hereby explicitly placed in the public domain. It may be used for
any purpose on any machine by anyone." I would greatly prefer it if *my*
material received no military use.

View File

@@ -0,0 +1,38 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/* File : bchange.c
Author : Michael widenius
Updated: 1987-03-20
Defines: bchange()
bchange(dst, old_length, src, new_length, tot_length)
replaces old_length characters at dst to new_length characters from
src in a buffer with tot_length bytes.
*/
#include <my_global.h>
#include "m_string.h"
void bchange(register uchar *dst, size_t old_length, register const uchar *src,
size_t new_length, size_t tot_length)
{
size_t rest=tot_length-old_length;
if (old_length < new_length)
bmove_upp(dst+rest+new_length,dst+tot_length,rest);
else
bmove(dst+new_length,dst+old_length,rest);
memcpy(dst,src,new_length);
}

View File

@@ -0,0 +1,32 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/* File : bmove.c
Author : Michael widenius
Updated: 1987-03-20
Defines: bmove_upp()
bmove_upp(dst, src, len) moves exactly "len" bytes from the source
"src-len" to the destination "dst-len" counting downwards.
*/
#include <my_global.h>
#include "m_string.h"
void bmove_upp(register uchar *dst, register const uchar *src,
register size_t len)
{
while (len-- != 0) *--dst = *--src;
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,579 @@
/* Copyright (C) 2002 MySQL AB & tommy@valley.ne.jp.
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Library General Public
License as published by the Free Software Foundation; version 2
of the License.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Library General Public License for more details.
You should have received a copy of the GNU Library General Public
License along with this library; if not, write to the Free
Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
MA 02111-1307, USA */
/* This file is for binary pseudo charset, created by bar@mysql.com */
#include <my_global.h>
#include "m_string.h"
#include "m_ctype.h"
static uchar ctype_bin[]=
{
0,
32, 32, 32, 32, 32, 32, 32, 32, 32, 40, 40, 40, 40, 40, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
72, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
132,132,132,132,132,132,132,132,132,132, 16, 16, 16, 16, 16, 16,
16,129,129,129,129,129,129, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 16, 16, 16, 16, 16,
16,130,130,130,130,130,130, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 16, 16, 16, 16, 32,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
};
/* Dummy array for toupper / tolower / sortorder */
static uchar bin_char_array[] =
{
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
96, 97, 98, 99,100,101,102,103,104,105,106,107,108,109,110,111,
112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,
128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,
144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,
176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,
192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,
208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,
224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,
240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255
};
static my_bool
my_coll_init_8bit_bin(CHARSET_INFO *cs,
void *(*alloc)(size_t) __attribute__((unused)))
{
cs->max_sort_char=255;
return FALSE;
}
static int my_strnncoll_binary(CHARSET_INFO * cs __attribute__((unused)),
const uchar *s, size_t slen,
const uchar *t, size_t tlen,
my_bool t_is_prefix)
{
size_t len=min(slen,tlen);
int cmp= memcmp(s,t,len);
return cmp ? cmp : (int)((t_is_prefix ? len : slen) - tlen);
}
size_t my_lengthsp_binary(CHARSET_INFO *cs __attribute__((unused)),
const char *ptr __attribute__((unused)),
size_t length)
{
return length;
}
/*
Compare two strings. Result is sign(first_argument - second_argument)
SYNOPSIS
my_strnncollsp_binary()
cs Chararacter set
s String to compare
slen Length of 's'
t String to compare
tlen Length of 't'
NOTE
This function is used for real binary strings, i.e. for
BLOB, BINARY(N) and VARBINARY(N).
It compares trailing spaces as spaces.
RETURN
< 0 s < t
0 s == t
> 0 s > t
*/
static int my_strnncollsp_binary(CHARSET_INFO * cs __attribute__((unused)),
const uchar *s, size_t slen,
const uchar *t, size_t tlen,
my_bool diff_if_only_endspace_difference
__attribute__((unused)))
{
return my_strnncoll_binary(cs,s,slen,t,tlen,0);
}
static int my_strnncoll_8bit_bin(CHARSET_INFO * cs __attribute__((unused)),
const uchar *s, size_t slen,
const uchar *t, size_t tlen,
my_bool t_is_prefix)
{
size_t len=min(slen,tlen);
int cmp= memcmp(s,t,len);
return cmp ? cmp : (int)((t_is_prefix ? len : slen) - tlen);
}
/*
Compare two strings. Result is sign(first_argument - second_argument)
SYNOPSIS
my_strnncollsp_8bit_bin()
cs Chararacter set
s String to compare
slen Length of 's'
t String to compare
tlen Length of 't'
diff_if_only_endspace_difference
Set to 1 if the strings should be regarded as different
if they only difference in end space
NOTE
This function is used for character strings with binary collations.
The shorter string is extended with end space to be as long as the longer
one.
RETURN
< 0 s < t
0 s == t
> 0 s > t
*/
static int my_strnncollsp_8bit_bin(CHARSET_INFO * cs __attribute__((unused)),
const uchar *a, size_t a_length,
const uchar *b, size_t b_length,
my_bool diff_if_only_endspace_difference)
{
const uchar *end;
size_t length;
int res;
#ifndef VARCHAR_WITH_DIFF_ENDSPACE_ARE_DIFFERENT_FOR_UNIQUE
diff_if_only_endspace_difference= 0;
#endif
end= a + (length= min(a_length, b_length));
while (a < end)
{
if (*a++ != *b++)
return ((int) a[-1] - (int) b[-1]);
}
res= 0;
if (a_length != b_length)
{
int swap= 1;
/*
Check the next not space character of the longer key. If it's < ' ',
then it's smaller than the other key.
*/
if (diff_if_only_endspace_difference)
res= 1; /* Assume 'a' is bigger */
if (a_length < b_length)
{
/* put shorter key in s */
a_length= b_length;
a= b;
swap= -1; /* swap sign of result */
res= -res;
}
for (end= a + a_length-length; a < end ; a++)
{
if (*a != ' ')
return (*a < ' ') ? -swap : swap;
}
}
return res;
}
/* This function is used for all conversion functions */
static size_t my_case_str_bin(CHARSET_INFO *cs __attribute__((unused)),
char *str __attribute__((unused)))
{
return 0;
}
static size_t my_case_bin(CHARSET_INFO *cs __attribute__((unused)),
char *src __attribute__((unused)),
size_t srclen,
char *dst __attribute__((unused)),
size_t dstlen __attribute__((unused)))
{
return srclen;
}
static int my_strcasecmp_bin(CHARSET_INFO * cs __attribute__((unused)),
const char *s, const char *t)
{
return strcmp(s,t);
}
uint my_mbcharlen_8bit(CHARSET_INFO *cs __attribute__((unused)),
uint c __attribute__((unused)))
{
return 1;
}
static int my_mb_wc_bin(CHARSET_INFO *cs __attribute__((unused)),
my_wc_t *wc,
const uchar *str,
const uchar *end __attribute__((unused)))
{
if (str >= end)
return MY_CS_TOOSMALL;
*wc=str[0];
return 1;
}
static int my_wc_mb_bin(CHARSET_INFO *cs __attribute__((unused)),
my_wc_t wc,
uchar *s,
uchar *e __attribute__((unused)))
{
if (s >= e)
return MY_CS_TOOSMALL;
if (wc < 256)
{
s[0]= (char) wc;
return 1;
}
return MY_CS_ILUNI;
}
void my_hash_sort_8bit_bin(CHARSET_INFO *cs __attribute__((unused)),
const uchar *key, size_t len,
ulong *nr1, ulong *nr2)
{
const uchar *pos = key;
/*
Remove trailing spaces. We have to do this to be able to compare
'A ' and 'A' as identical
*/
key= skip_trailing_space(key, len);
for (; pos < (uchar*) key ; pos++)
{
nr1[0]^=(ulong) ((((uint) nr1[0] & 63)+nr2[0]) *
((uint)*pos)) + (nr1[0] << 8);
nr2[0]+=3;
}
}
void my_hash_sort_bin(CHARSET_INFO *cs __attribute__((unused)),
const uchar *key, size_t len,ulong *nr1, ulong *nr2)
{
const uchar *pos = key;
key+= len;
for (; pos < (uchar*) key ; pos++)
{
nr1[0]^=(ulong) ((((uint) nr1[0] & 63)+nr2[0]) *
((uint)*pos)) + (nr1[0] << 8);
nr2[0]+=3;
}
}
/*
The following defines is here to keep the following code identical to
the one in ctype-simple.c
*/
#define likeconv(s,A) (A)
#define INC_PTR(cs,A,B) (A)++
int my_wildcmp_bin(CHARSET_INFO *cs,
const char *str,const char *str_end,
const char *wildstr,const char *wildend,
int escape, int w_one, int w_many)
{
int result= -1; /* Not found, using wildcards */
while (wildstr != wildend)
{
while (*wildstr != w_many && *wildstr != w_one)
{
if (*wildstr == escape && wildstr+1 != wildend)
wildstr++;
if (str == str_end || likeconv(cs,*wildstr++) != likeconv(cs,*str++))
return(1); /* No match */
if (wildstr == wildend)
return(str != str_end); /* Match if both are at end */
result=1; /* Found an anchor char */
}
if (*wildstr == w_one)
{
do
{
if (str == str_end) /* Skip one char if possible */
return(result);
INC_PTR(cs,str,str_end);
} while (++wildstr < wildend && *wildstr == w_one);
if (wildstr == wildend)
break;
}
if (*wildstr == w_many)
{ /* Found w_many */
uchar cmp;
wildstr++;
/* Remove any '%' and '_' from the wild search string */
for (; wildstr != wildend ; wildstr++)
{
if (*wildstr == w_many)
continue;
if (*wildstr == w_one)
{
if (str == str_end)
return(-1);
INC_PTR(cs,str,str_end);
continue;
}
break; /* Not a wild character */
}
if (wildstr == wildend)
return(0); /* match if w_many is last */
if (str == str_end)
return(-1);
if ((cmp= *wildstr) == escape && wildstr+1 != wildend)
cmp= *++wildstr;
INC_PTR(cs,wildstr,wildend); /* This is compared through cmp */
cmp=likeconv(cs,cmp);
do
{
while (str != str_end && (uchar) likeconv(cs,*str) != cmp)
str++;
if (str++ == str_end)
return(-1);
{
int tmp=my_wildcmp_bin(cs,str,str_end,wildstr,wildend,escape,w_one,
w_many);
if (tmp <= 0)
return(tmp);
}
} while (str != str_end && wildstr[0] != w_many);
return(-1);
}
}
return(str != str_end ? 1 : 0);
}
static size_t my_strnxfrm_bin(CHARSET_INFO *cs __attribute__((unused)),
uchar *dest, size_t dstlen,
const uchar *src, size_t srclen)
{
if (dest != src)
memcpy(dest, src, min(dstlen,srclen));
if (dstlen > srclen)
bfill(dest + srclen, dstlen - srclen, 0);
return dstlen;
}
static
size_t my_strnxfrm_8bit_bin(CHARSET_INFO *cs __attribute__((unused)),
uchar *dest, size_t dstlen,
const uchar *src, size_t srclen)
{
if (dest != src)
memcpy(dest, src, min(dstlen,srclen));
if (dstlen > srclen)
bfill(dest + srclen, dstlen - srclen, ' ');
return dstlen;
}
static
uint my_instr_bin(CHARSET_INFO *cs __attribute__((unused)),
const char *b, size_t b_length,
const char *s, size_t s_length,
my_match_t *match, uint nmatch)
{
register const uchar *str, *search, *end, *search_end;
if (s_length <= b_length)
{
if (!s_length)
{
if (nmatch)
{
match->beg= 0;
match->end= 0;
match->mb_len= 0;
}
return 1; /* Empty string is always found */
}
str= (const uchar*) b;
search= (const uchar*) s;
end= (const uchar*) b+b_length-s_length+1;
search_end= (const uchar*) s + s_length;
skip:
while (str != end)
{
if ( (*str++) == (*search))
{
register const uchar *i,*j;
i= str;
j= search+1;
while (j != search_end)
if ((*i++) != (*j++))
goto skip;
if (nmatch > 0)
{
match[0].beg= 0;
match[0].end= (size_t) (str- (const uchar*)b-1);
match[0].mb_len= match[0].end;
if (nmatch > 1)
{
match[1].beg= match[0].end;
match[1].end= match[0].end+s_length;
match[1].mb_len= match[1].end-match[1].beg;
}
}
return 2;
}
}
}
return 0;
}
MY_COLLATION_HANDLER my_collation_8bit_bin_handler =
{
my_coll_init_8bit_bin,
my_strnncoll_8bit_bin,
my_strnncollsp_8bit_bin,
my_strnxfrm_8bit_bin,
my_strnxfrmlen_simple,
my_like_range_simple,
my_wildcmp_bin,
my_strcasecmp_bin,
my_instr_bin,
my_hash_sort_8bit_bin,
my_propagate_simple
};
static MY_COLLATION_HANDLER my_collation_binary_handler =
{
NULL, /* init */
my_strnncoll_binary,
my_strnncollsp_binary,
my_strnxfrm_bin,
my_strnxfrmlen_simple,
my_like_range_simple,
my_wildcmp_bin,
my_strcasecmp_bin,
my_instr_bin,
my_hash_sort_bin,
my_propagate_simple
};
static MY_CHARSET_HANDLER my_charset_handler=
{
NULL, /* init */
NULL, /* ismbchar */
my_mbcharlen_8bit, /* mbcharlen */
my_numchars_8bit,
my_charpos_8bit,
my_well_formed_len_8bit,
my_lengthsp_binary,
my_numcells_8bit,
my_mb_wc_bin,
my_wc_mb_bin,
my_mb_ctype_8bit,
my_case_str_bin,
my_case_str_bin,
my_case_bin,
my_case_bin,
my_snprintf_8bit,
my_long10_to_str_8bit,
my_longlong10_to_str_8bit,
my_fill_8bit,
my_strntol_8bit,
my_strntoul_8bit,
my_strntoll_8bit,
my_strntoull_8bit,
my_strntod_8bit,
my_strtoll10_8bit,
my_strntoull10rnd_8bit,
my_scan_8bit
};
CHARSET_INFO my_charset_bin =
{
63,0,0, /* number */
MY_CS_COMPILED|MY_CS_BINSORT|MY_CS_PRIMARY,/* state */
"binary", /* cs name */
"binary", /* name */
"", /* comment */
NULL, /* tailoring */
ctype_bin, /* ctype */
bin_char_array, /* to_lower */
bin_char_array, /* to_upper */
NULL, /* sort_order */
NULL, /* contractions */
NULL, /* sort_order_big*/
NULL, /* tab_to_uni */
NULL, /* tab_from_uni */
my_unicase_default, /* caseinfo */
NULL, /* state_map */
NULL, /* ident_map */
1, /* strxfrm_multiply */
1, /* caseup_multiply */
1, /* casedn_multiply */
1, /* mbminlen */
1, /* mbmaxlen */
0, /* min_sort_char */
255, /* max_sort_char */
0, /* pad char */
0, /* escape_with_backslash_is_dangerous */
&my_charset_handler,
&my_collation_binary_handler
};

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,639 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/* File strings/ctype-czech.c for MySQL.
This file implements the Czech sorting for the MySQL database
server (www.mysql.com). Due to some complicated rules the
Czech language has for sorting strings, a more complex
solution was needed than the one-to-one conversion table. To
note a few, here is an example of a Czech sorting sequence:
co < hlaska < hláska < hlava < chlapec < krtek
It because some of the rules are: double char 'ch' is sorted
between 'h' and 'i'. Accented character 'á' (a with acute) is
sorted after 'a' and before 'b', but only if the word is
otherwise the same. However, because 's' is sorted before 'v'
in hlava, the accentness of 'á' is overridden. There are many
more rules.
This file defines functions my_strxfrm and my_strcoll for
C-like zero terminated strings and my_strnxfrm and my_strnncoll
for strings where the length comes as an parameter. Also
defined here you will find function my_like_range that returns
index range strings for LIKE expression and the
MY_STRXFRM_MULTIPLY set to value 4 -- this is the ratio the
strings grows during my_strxfrm. The algorithm has four
passes, that's why we need four times more space for expanded
string.
This file also contains the ISO-Latin-2 definitions of
characters.
Author: (c) 1997--1998 Jan Pazdziora, adelton@fi.muni.cz
Jan Pazdziora has a shared copyright for this code
The original of this file can also be found at
http://www.fi.muni.cz/~adelton/l10n/
Bug reports and suggestions are always welcome.
*/
/*
* This comment is parsed by configure to create ctype.c,
* so don't change it unless you know what you are doing.
*
* .configure. strxfrm_multiply_czech=4
*/
#define SKIP_TRAILING_SPACES 1
#define REAL_MYSQL
#ifdef REAL_MYSQL
#include <my_global.h>
#include "m_string.h"
#include "m_ctype.h"
#else
#include <stdio.h>
#define uchar unsigned char
#endif
#ifdef HAVE_CHARSET_latin2
/*
These are four tables for four passes of the algorithm. Please see
below for what are the "special values"
*/
static uchar *CZ_SORT_TABLE[] = {
(uchar*) "\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\043\044\045\046\047\050\051\052\053\054\000\000\000\000\000\000\000\003\004\377\007\010\011\012\013\015\016\017\020\022\023\024\025\026\027\031\033\034\035\036\037\040\041\000\000\000\000\000\000\003\004\377\007\010\011\012\013\015\016\017\020\022\023\024\025\026\027\031\033\034\035\036\037\040\041\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\003\000\021\000\020\032\000\000\032\032\033\042\000\042\042\000\003\000\021\000\020\032\000\000\032\032\033\042\000\042\042\027\003\003\003\003\020\006\006\006\010\010\010\010\015\015\007\007\023\023\024\024\024\024\000\030\034\034\034\034\040\033\000\027\003\003\003\003\020\006\006\006\010\010\010\010\015\015\007\007\023\023\024\024\024\024\000\030\034\034\034\034\040\033\000",
(uchar*) "\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\106\107\110\111\112\113\114\115\116\117\000\000\000\000\000\000\000\003\011\377\016\021\026\027\030\032\035\036\037\043\044\047\054\055\056\061\065\070\075\076\077\100\102\000\000\000\000\000\000\003\011\377\016\021\026\027\030\032\035\036\037\043\044\047\054\055\056\061\065\070\075\076\077\100\102\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\010\000\042\000\041\063\000\000\062\064\066\104\000\103\105\000\010\000\042\000\041\063\000\000\062\064\066\104\000\103\105\057\004\005\007\006\040\014\015\013\022\025\024\023\033\034\017\020\046\045\050\051\053\052\000\060\072\071\074\073\101\067\000\057\004\005\007\006\040\014\015\013\022\025\024\023\033\034\017\020\046\045\050\051\053\052\000\060\072\071\074\073\101\067\000",
(uchar*) "\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\212\213\214\215\216\217\220\221\222\223\000\000\000\000\000\000\000\004\020\377\032\040\052\054\056\063\071\073\075\105\107\115\127\131\133\141\151\157\171\173\175\177\203\000\000\000\000\000\000\003\017\377\031\037\051\053\055\062\070\072\074\104\106\114\126\130\132\140\150\156\170\172\174\176\202\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\016\000\103\000\101\145\000\000\143\147\153\207\000\205\211\000\015\000\102\000\100\144\000\000\142\146\152\206\000\204\210\135\006\010\014\012\077\026\030\024\042\050\046\044\065\067\034\036\113\111\117\121\125\123\000\137\163\161\167\165\201\155\000\134\005\007\013\011\076\025\027\023\041\047\045\043\064\066\033\035\112\110\116\120\124\122\000\136\162\160\166\164\200\154\000",
(uchar*) "\264\265\266\267\270\271\272\273\274\002\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\322\323\002\230\232\253\324\252\251\234\240\241\261\260\225\262\224\235\212\213\214\215\216\217\220\221\222\223\231\226\244\257\245\227\250\004\020\377\032\040\052\054\056\063\071\073\075\105\107\115\127\131\133\141\151\157\171\173\175\177\203\242\237\243\254\255\233\003\017\377\031\037\051\053\055\062\070\072\074\104\106\114\126\130\132\140\150\156\170\172\174\176\202\246\236\247\256\325\264\265\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\322\323\326\016\327\103\330\101\145\331\332\143\147\153\207\333\205\211\334\015\335\102\336\100\144\337\340\142\146\152\206\341\204\210\135\006\010\014\012\077\026\030\024\042\050\046\044\065\067\034\036\113\111\117\121\125\123\263\137\163\161\167\165\201\155\342\134\005\007\013\011\076\025\027\023\041\047\045\043\064\066\033\035\112\110\116\120\124\122\343\136\162\160\166\164\200\154\344",
};
/*
These define the valuse for the double chars that need to be
sorted as they were single characters -- in Czech these are
'ch', 'Ch' and 'CH'.
*/
struct wordvalue
{
const char * word;
uchar *outvalue;
};
static struct wordvalue doubles[] = {
{ "ch", (uchar*) "\014\031\057\057" },
{ "Ch", (uchar*) "\014\031\060\060" },
{ "CH", (uchar*) "\014\031\061\061" },
{ "c", (uchar*) "\005\012\021\021" },
{ "C", (uchar*) "\005\012\022\022" },
};
/*
Unformal description of the algorithm:
We walk the string left to right.
The end of the string is either passed as parameter, or is
*p == 0. This is hidden in the IS_END macro.
In the first two passes, we compare word by word. So we make
first and second pass on the first word, first and second pass
on the second word, etc. If we come to the end of the string
during the first pass, we need to jump to the last word of the
second pass.
End of pass is marked with value 1 on the output.
For each character, we read it's value from the table.
If the value is ignore (0), we go straight to the next character.
If the value is space/end of word (2) and we are in the first
or second pass, we skip all characters having value 0 -- 2 and
switch the passwd.
If it's the compose character (255), we check if the double
exists behind it, find its value.
We append 0 to the end.
---
Neformální popis algoritmu:
Procházíme øetìzec zleva doprava.
Konec øetìzce je pøedán buï jako parametr, nebo je to *p == 0.
Toto je o¹etøeno makrem IS_END.
Pokud jsme do¹li na konec øetìzce pøi prùchodu 0, nejdeme na
zaèátek, ale na ulo¾enou pozici, proto¾e první a druhý prùchod
bì¾í souèasnì.
Konec vstupu (prùchodu) oznaèíme na výstupu hodnotou 1.
Pro ka¾dý znak øetìzce naèteme hodnotu z tøídící tabulky.
Jde-li o hodnotu ignorovat (0), skoèíme ihned na dal¹í znak..
Jde-li o hodnotu konec slova (2) a je to prùchod 0 nebo 1,
pøeskoèíme v¹echny dal¹í 0 -- 2 a prohodíme prùchody.
Jde-li o kompozitní znak (255), otestujeme, zda následuje
správný do dvojice, dohledáme správnou hodnotu.
Na konci pøipojíme znak 0
*/
#define ADD_TO_RESULT(dest, len, totlen, value) \
if ((totlen) < (len)) { dest[totlen] = value; } (totlen++);
#define IS_END(p, src, len) (((char *)p - (char *)src) >= (len))
#define NEXT_CMP_VALUE(src, p, store, pass, value, len) \
while (1) \
{ \
if (IS_END(p, src, len)) \
{ \
/* when we are at the end of string */ \
/* return either 0 for end of string */ \
/* or 1 for end of pass */ \
value= 0; \
if (pass != 3) \
{ \
p= (pass++ == 0) ? store : src; \
value = 1; \
} \
break; \
} \
/* not at end of string */ \
value = CZ_SORT_TABLE[pass][*p]; \
if (value == 0) \
{ p++; continue; } /* ignore value */ \
if (value == 2) /* space */ \
{ \
const uchar *tmp; \
const uchar *runner = ++p; \
while (!(IS_END(runner, src, len)) && (CZ_SORT_TABLE[pass][*runner] == 2)) \
runner++; /* skip all spaces */ \
if (IS_END(runner, src, len) && SKIP_TRAILING_SPACES) \
p = runner; \
if ((pass <= 2) && !(IS_END(runner, src, len))) \
p = runner; \
if (IS_END(p, src, len)) \
continue; \
/* we switch passes */ \
if (pass > 1) \
break; \
tmp = p; \
pass= 1-pass; \
p = store; store = tmp; \
break; \
} \
if (value == 255) \
{ \
int i; \
for (i = 0; i < (int) sizeof(doubles); i++) \
{ \
const char * pattern = doubles[i].word; \
const char * q = (const char *) p; \
int j = 0; \
while (pattern[j]) \
{ \
if (IS_END(q, src, len) || (*q != pattern[j])) \
break; \
j++; q++; \
} \
if (!(pattern[j])) \
{ \
value = (int)(doubles[i].outvalue[pass]); \
p= (const uchar *) q - 1; \
break; \
} \
} \
} \
p++; \
break; \
}
/*
Function strnncoll, actually strcoll, with Czech sorting, which expect
the length of the strings being specified
*/
static int my_strnncoll_czech(CHARSET_INFO *cs __attribute__((unused)),
const uchar *s1, size_t len1,
const uchar *s2, size_t len2,
my_bool s2_is_prefix)
{
int v1, v2;
const uchar *p1, * p2, * store1, * store2;
int pass1 = 0, pass2 = 0;
if (s2_is_prefix && len1 > len2)
len1=len2;
p1 = s1; p2 = s2;
store1 = s1; store2 = s2;
do
{
int diff;
NEXT_CMP_VALUE(s1, p1, store1, pass1, v1, (int)len1);
NEXT_CMP_VALUE(s2, p2, store2, pass2, v2, (int)len2);
if ((diff = v1 - v2))
return diff;
}
while (v1);
return 0;
}
/*
TODO: Fix this one to compare strings as they are done in ctype-simple1
*/
static
int my_strnncollsp_czech(CHARSET_INFO * cs,
const uchar *s, size_t slen,
const uchar *t, size_t tlen,
my_bool diff_if_only_endspace_difference
__attribute__((unused)))
{
for ( ; slen && s[slen-1] == ' ' ; slen--);
for ( ; tlen && t[tlen-1] == ' ' ; tlen--);
return my_strnncoll_czech(cs,s,slen,t,tlen,0);
}
/*
Function strnxfrm, actually strxfrm, with Czech sorting, which expect
the length of the strings being specified
*/
static size_t my_strnxfrm_czech(CHARSET_INFO *cs __attribute__((unused)),
uchar *dest, size_t len,
const uchar *src, size_t srclen)
{
int value;
const uchar *p, * store;
int pass = 0;
size_t totlen = 0;
p = src; store = src;
do
{
NEXT_CMP_VALUE(src, p, store, pass, value, (int)srclen);
ADD_TO_RESULT(dest, len, totlen, value);
}
while (value);
if (len > totlen)
bfill(dest + totlen, len - totlen, ' ');
return len;
}
#undef IS_END
/*
Neformální popis algoritmu:
procházíme øetìzec zleva doprava
konec øetìzce poznáme podle *p == 0
pokud jsme do¹li na konec øetìzce pøi prùchodu 0, nejdeme na
zaèátek, ale na ulo¾enou pozici, proto¾e první a druhý
prùchod bì¾í souèasnì
konec vstupu (prùchodu) oznaèíme na výstupu hodnotou 1
naèteme hodnotu z tøídící tabulky
jde-li o hodnotu ignorovat (0), skoèíme na dal¹í prùchod
jde-li o hodnotu konec slova (2) a je to prùchod 0 nebo 1,
pøeskoèíme v¹echny dal¹í 0 -- 2 a prohodíme
prùchody
jde-li o kompozitní znak (255), otestujeme, zda následuje
správný do dvojice, dohledáme správnou hodnotu
na konci pøipojíme znak 0
*/
/*
** Calculate min_str and max_str that ranges a LIKE string.
** Arguments:
** ptr Pointer to LIKE string.
** ptr_length Length of LIKE string.
** escape Escape character in LIKE. (Normally '\').
** All escape characters should be removed from min_str and max_str
** res_length Length of min_str and max_str.
** min_str Smallest case sensitive string that ranges LIKE.
** Should be space padded to res_length.
** max_str Largest case sensitive string that ranges LIKE.
** Normally padded with the biggest character sort value.
**
** The function should return 0 if ok and 1 if the LIKE string can't be
** optimized !
*/
#ifdef REAL_MYSQL
#define min_sort_char ' '
#define max_sort_char '9'
#define EXAMPLE
static my_bool my_like_range_czech(CHARSET_INFO *cs __attribute__((unused)),
const char *ptr,size_t ptr_length,
pbool escape, pbool w_one, pbool w_many,
size_t res_length, char *min_str,
char *max_str,
size_t *min_length,size_t *max_length)
{
#ifdef EXAMPLE
uchar value;
const char *end=ptr+ptr_length;
char *min_org=min_str;
char *min_end=min_str+res_length;
for (; ptr != end && min_str != min_end ; ptr++)
{
if (*ptr == w_one) /* '_' in SQL */
{ break; }
if (*ptr == w_many) /* '%' in SQL */
{ break; }
if (*ptr == escape && ptr+1 != end)
{ ptr++; } /* Skip escape */
value = CZ_SORT_TABLE[0][(int) (uchar) *ptr];
if (value == 0) /* Ignore in the first pass */
{ continue; }
if (value <= 2) /* End of pass or end of string */
{ break; }
if (value == 255) /* Double char too compicated */
{ break; }
*min_str++= *max_str++ = *ptr;
}
if (cs->state & MY_CS_BINSORT)
*min_length= (size_t) (min_str - min_org);
else
{
/* 'a\0\0... is the smallest possible string */
*min_length= res_length;
}
/* a\ff\ff... is the biggest possible string */
*max_length= res_length;
while (min_str != min_end)
{
*min_str++ = min_sort_char; /* Because of key compression */
*max_str++ = max_sort_char;
}
return 0;
#else
return 1;
#endif
}
#endif
#ifdef REAL_MYSQL
/* This is a latin2 file */
/*
* File generated by cset
* (C) Abandoned 1997 Zarko Mocnik <zarko.mocnik@dem.si>
*
* definition table reworked by Jaromir Dolecek <dolecek@ics.muni.cz>
*/
#include <my_global.h>
#include "m_string.h"
static uchar ctype_czech[257] = {
0,
32, 32, 32, 32, 32, 32, 32, 32, 32, 40, 40, 40, 40, 40, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
72, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
132,132,132,132,132,132,132,132,132,132, 16, 16, 16, 16, 16, 16,
16,129,129,129,129,129,129, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 16, 16, 16, 16, 16,
16,130,130,130,130,130,130, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 16, 16, 16, 16, 32,
32, 32, 32, 32, 32, 32, 32, 32, 40, 40, 40, 40, 40, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 72,
1, 16, 1, 16, 1, 1, 16, 0, 0, 1, 1, 1, 1, 16, 1, 1,
16, 2, 16, 2, 16, 2, 2, 16, 16, 2, 2, 2, 2, 16, 2, 2,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
16, 1, 1, 1, 1, 1, 1, 16, 1, 1, 1, 1, 1, 1, 1, 16,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 16, 2, 2, 2, 2, 2, 2, 2, 16,
};
static uchar to_lower_czech[] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 97, 98, 99,100,101,102,103,104,105,106,107,108,109,110,111,
112,113,114,115,116,117,118,119,120,121,122, 91, 92, 93, 94, 95,
96, 97, 98, 99,100,101,102,103,104,105,106,107,108,109,110,111,
112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,
128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,
144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,
177,161,179,163,181,182,166,167,168,185,186,187,188,173,190,191,
176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,
224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,
208,241,242,243,244,245,246,215,248,249,250,251,252,253,254,223,
224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,
240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,
};
static uchar to_upper_czech[] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
96, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,123,124,125,126,127,
128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,
144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,
176,160,178,162,180,164,165,183,184,169,170,171,172,189,174,175,
192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,
208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,
192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,
240,209,210,211,212,213,214,247,216,217,218,219,220,221,222,255,
};
static uchar sort_order_czech[] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 65, 71, 72, 76, 78, 83, 84, 85, 86, 90, 91, 92, 96, 97,100,
105,106,107,110,114,117,122,123,124,125,127,131,132,133,134,135,
136, 65, 71, 72, 76, 78, 83, 84, 85, 86, 90, 91, 92, 96, 97,100,
105,106,107,110,114,117,122,123,124,125,127,137,138,139,140, 0,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,255,
66,255, 93,255, 94,111,255,255,255,112,113,115,128,255,129,130,
255, 66,255, 93,255, 94,111,255,255,112,113,115,128,255,129,130,
108, 67, 68, 69, 70, 95, 73, 75, 74, 79, 81, 82, 80, 89, 87, 77,
255, 98, 99,101,102,103,104,255,109,119,118,120,121,126,116,255,
108, 67, 68, 69, 70, 95, 73, 75, 74, 79, 81, 82, 80, 89, 88, 77,
255, 98, 99,101,102,103,104,255,109,119,118,120,121,126,116,255,
};
static uint16 tab_8859_2_uni[256]={
0,0x0001,0x0002,0x0003,0x0004,0x0005,0x0006,0x0007,
0x0008,0x0009,0x000A,0x000B,0x000C,0x000D,0x000E,0x000F,
0x0010,0x0011,0x0012,0x0013,0x0014,0x0015,0x0016,0x0017,
0x0018,0x0019,0x001A,0x001B,0x001C,0x001D,0x001E,0x001F,
0x0020,0x0021,0x0022,0x0023,0x0024,0x0025,0x0026,0x0027,
0x0028,0x0029,0x002A,0x002B,0x002C,0x002D,0x002E,0x002F,
0x0030,0x0031,0x0032,0x0033,0x0034,0x0035,0x0036,0x0037,
0x0038,0x0039,0x003A,0x003B,0x003C,0x003D,0x003E,0x003F,
0x0040,0x0041,0x0042,0x0043,0x0044,0x0045,0x0046,0x0047,
0x0048,0x0049,0x004A,0x004B,0x004C,0x004D,0x004E,0x004F,
0x0050,0x0051,0x0052,0x0053,0x0054,0x0055,0x0056,0x0057,
0x0058,0x0059,0x005A,0x005B,0x005C,0x005D,0x005E,0x005F,
0x0060,0x0061,0x0062,0x0063,0x0064,0x0065,0x0066,0x0067,
0x0068,0x0069,0x006A,0x006B,0x006C,0x006D,0x006E,0x006F,
0x0070,0x0071,0x0072,0x0073,0x0074,0x0075,0x0076,0x0077,
0x0078,0x0079,0x007A,0x007B,0x007C,0x007D,0x007E, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0x00A0,0x0104,0x02D8,0x0141,0x00A4,0x013D,0x015A,0x00A7,
0x00A8,0x0160,0x015E,0x0164,0x0179,0x00AD,0x017D,0x017B,
0x00B0,0x0105,0x02DB,0x0142,0x00B4,0x013E,0x015B,0x02C7,
0x00B8,0x0161,0x015F,0x0165,0x017A,0x02DD,0x017E,0x017C,
0x0154,0x00C1,0x00C2,0x0102,0x00C4,0x0139,0x0106,0x00C7,
0x010C,0x00C9,0x0118,0x00CB,0x011A,0x00CD,0x00CE,0x010E,
0x0110,0x0143,0x0147,0x00D3,0x00D4,0x0150,0x00D6,0x00D7,
0x0158,0x016E,0x00DA,0x0170,0x00DC,0x00DD,0x0162,0x00DF,
0x0155,0x00E1,0x00E2,0x0103,0x00E4,0x013A,0x0107,0x00E7,
0x010D,0x00E9,0x0119,0x00EB,0x011B,0x00ED,0x00EE,0x010F,
0x0111,0x0144,0x0148,0x00F3,0x00F4,0x0151,0x00F6,0x00F7,
0x0159,0x016F,0x00FA,0x0171,0x00FC,0x00FD,0x0163,0x02D9
};
/* 0000-00FD , 254 chars */
static uchar tab_uni_8859_2_plane00[]={
0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0A,0x0B,0x0C,0x0D,0x0E,0x0F,
0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1A,0x1B,0x1C,0x1D,0x1E,0x1F,
0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28,0x29,0x2A,0x2B,0x2C,0x2D,0x2E,0x2F,
0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x3A,0x3B,0x3C,0x3D,0x3E,0x3F,
0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4A,0x4B,0x4C,0x4D,0x4E,0x4F,
0x50,0x51,0x52,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5A,0x5B,0x5C,0x5D,0x5E,0x5F,
0x60,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6A,0x6B,0x6C,0x6D,0x6E,0x6F,
0x70,0x71,0x72,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7A,0x7B,0x7C,0x7D,0x7E,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0xA0,0x00,0x00,0x00,0xA4,0x00,0x00,0xA7,0xA8,0x00,0x00,0x00,0x00,0xAD,0x00,0x00,
0xB0,0x00,0x00,0x00,0xB4,0x00,0x00,0x00,0xB8,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0xC1,0xC2,0x00,0xC4,0x00,0x00,0xC7,0x00,0xC9,0x00,0xCB,0x00,0xCD,0xCE,0x00,
0x00,0x00,0x00,0xD3,0xD4,0x00,0xD6,0xD7,0x00,0x00,0xDA,0x00,0xDC,0xDD,0x00,0xDF,
0x00,0xE1,0xE2,0x00,0xE4,0x00,0x00,0xE7,0x00,0xE9,0x00,0xEB,0x00,0xED,0xEE,0x00,
0x00,0x00,0x00,0xF3,0xF4,0x00,0xF6,0xF7,0x00,0x00,0xFA,0x00,0xFC,0xFD};
/* 0102-017E , 125 chars */
static uchar tab_uni_8859_2_plane01[]={
0xC3,0xE3,0xA1,0xB1,0xC6,0xE6,0x00,0x00,0x00,0x00,0xC8,0xE8,0xCF,0xEF,0xD0,0xF0,
0x00,0x00,0x00,0x00,0x00,0x00,0xCA,0xEA,0xCC,0xEC,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0xC5,0xE5,0x00,0x00,0xA5,0xB5,0x00,0x00,0xA3,
0xB3,0xD1,0xF1,0x00,0x00,0xD2,0xF2,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0xD5,0xF5,
0x00,0x00,0xC0,0xE0,0x00,0x00,0xD8,0xF8,0xA6,0xB6,0x00,0x00,0xAA,0xBA,0xA9,0xB9,
0xDE,0xFE,0xAB,0xBB,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0xD9,0xF9,0xDB,0xFB,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0xAC,0xBC,0xAF,0xBF,0xAE,0xBE};
/* 02C7-02DD , 23 chars */
static uchar tab_uni_8859_2_plane02[]={
0xB7,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0xA2,0xFF,0x00,0xB2,0x00,0xBD};
static MY_UNI_IDX idx_uni_8859_2[]={
{0x0000,0x00FD,tab_uni_8859_2_plane00},
{0x0102,0x017E,tab_uni_8859_2_plane01},
{0x02C7,0x02DD,tab_uni_8859_2_plane02},
{0,0,NULL}
};
static MY_COLLATION_HANDLER my_collation_latin2_czech_ci_handler =
{
NULL, /* init */
my_strnncoll_czech,
my_strnncollsp_czech,
my_strnxfrm_czech,
my_strnxfrmlen_simple,
my_like_range_czech,
my_wildcmp_bin,
my_strcasecmp_8bit,
my_instr_simple,
my_hash_sort_simple,
my_propagate_simple
};
CHARSET_INFO my_charset_latin2_czech_ci =
{
2,0,0, /* number */
MY_CS_COMPILED|MY_CS_STRNXFRM|MY_CS_CSSORT, /* state */
"latin2", /* cs name */
"latin2_czech_cs", /* name */
"", /* comment */
NULL, /* tailoring */
ctype_czech,
to_lower_czech,
to_upper_czech,
sort_order_czech,
NULL, /* contractions */
NULL, /* sort_order_big*/
tab_8859_2_uni, /* tab_to_uni */
idx_uni_8859_2, /* tab_from_uni */
my_unicase_default, /* caseinfo */
NULL, /* state_map */
NULL, /* ident_map */
4, /* strxfrm_multiply */
1, /* caseup_multiply */
1, /* casedn_multiply */
1, /* mbminlen */
1, /* mbmaxlen */
0, /* min_sort_char */
0, /* max_sort_char */
' ', /* pad char */
0, /* escape_with_backslash_is_dangerous */
&my_charset_8bit_handler,
&my_collation_latin2_czech_ci_handler
};
#endif
#endif

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,782 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#include <my_global.h>
#include "m_string.h"
#include "m_ctype.h"
static uchar ctype_latin1[] = {
0,
32, 32, 32, 32, 32, 32, 32, 32, 32, 40, 40, 40, 40, 40, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
72, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
132,132,132,132,132,132,132,132,132,132, 16, 16, 16, 16, 16, 16,
16,129,129,129,129,129,129, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 16, 16, 16, 16, 16,
16,130,130,130,130,130,130, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 16, 16, 16, 16, 32,
16, 0, 16, 2, 16, 16, 16, 16, 16, 16, 1, 16, 1, 0, 1, 0,
0, 16, 16, 16, 16, 16, 16, 16, 16, 16, 2, 16, 2, 0, 2, 1,
72, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 16, 1, 1, 1, 1, 1, 1, 1, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 16, 2, 2, 2, 2, 2, 2, 2, 2
};
static uchar to_lower_latin1[] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 97, 98, 99,100,101,102,103,104,105,106,107,108,109,110,111,
112,113,114,115,116,117,118,119,120,121,122, 91, 92, 93, 94, 95,
96, 97, 98, 99,100,101,102,103,104,105,106,107,108,109,110,111,
112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,
128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,
144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,
176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,
224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,
240,241,242,243,244,245,246,215,248,249,250,251,252,253,254,223,
224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,
240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255
};
static uchar to_upper_latin1[] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
96, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,123,124,125,126,127,
128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,
144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,
176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,
192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,
208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,
192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,
208,209,210,211,212,213,214,247,216,217,218,219,220,221,222,255
};
static uchar sort_order_latin1[] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
96, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,123,124,125,126,127,
128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,
144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,
176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,
65, 65, 65, 65, 92, 91, 92, 67, 69, 69, 69, 69, 73, 73, 73, 73,
68, 78, 79, 79, 79, 79, 93,215,216, 85, 85, 85, 89, 89,222,223,
65, 65, 65, 65, 92, 91, 92, 67, 69, 69, 69, 69, 73, 73, 73, 73,
68, 78, 79, 79, 79, 79, 93,247,216, 85, 85, 85, 89, 89,222,255
};
/*
WL#1494 notes:
We'll use cp1252 instead of iso-8859-1.
cp1252 contains printable characters in the range 0x80-0x9F.
In ISO 8859-1, these code points have no associated printable
characters. Therefore, by converting from CP1252 to ISO 8859-1,
one would lose the euro (for instance). Since most people are
unaware of the difference, and since we don't really want a
"Windows ANSI" to differ from a "Unix ANSI", we will:
- continue to pretend the latin1 character set is ISO 8859-1
- actually allow the storage of euro etc. so it's actually cp1252
Also we'll map these five undefined cp1252 character:
0x81, 0x8D, 0x8F, 0x90, 0x9D
into corresponding control characters:
U+0081, U+008D, U+008F, U+0090, U+009D.
like ISO-8859-1 does. Otherwise, loading "mysqldump"
output doesn't reproduce these undefined characters.
*/
unsigned short cs_to_uni[256]={
0x0000,0x0001,0x0002,0x0003,0x0004,0x0005,0x0006,0x0007,
0x0008,0x0009,0x000A,0x000B,0x000C,0x000D,0x000E,0x000F,
0x0010,0x0011,0x0012,0x0013,0x0014,0x0015,0x0016,0x0017,
0x0018,0x0019,0x001A,0x001B,0x001C,0x001D,0x001E,0x001F,
0x0020,0x0021,0x0022,0x0023,0x0024,0x0025,0x0026,0x0027,
0x0028,0x0029,0x002A,0x002B,0x002C,0x002D,0x002E,0x002F,
0x0030,0x0031,0x0032,0x0033,0x0034,0x0035,0x0036,0x0037,
0x0038,0x0039,0x003A,0x003B,0x003C,0x003D,0x003E,0x003F,
0x0040,0x0041,0x0042,0x0043,0x0044,0x0045,0x0046,0x0047,
0x0048,0x0049,0x004A,0x004B,0x004C,0x004D,0x004E,0x004F,
0x0050,0x0051,0x0052,0x0053,0x0054,0x0055,0x0056,0x0057,
0x0058,0x0059,0x005A,0x005B,0x005C,0x005D,0x005E,0x005F,
0x0060,0x0061,0x0062,0x0063,0x0064,0x0065,0x0066,0x0067,
0x0068,0x0069,0x006A,0x006B,0x006C,0x006D,0x006E,0x006F,
0x0070,0x0071,0x0072,0x0073,0x0074,0x0075,0x0076,0x0077,
0x0078,0x0079,0x007A,0x007B,0x007C,0x007D,0x007E,0x007F,
0x20AC,0x0081,0x201A,0x0192,0x201E,0x2026,0x2020,0x2021,
0x02C6,0x2030,0x0160,0x2039,0x0152,0x008D,0x017D,0x008F,
0x0090,0x2018,0x2019,0x201C,0x201D,0x2022,0x2013,0x2014,
0x02DC,0x2122,0x0161,0x203A,0x0153,0x009D,0x017E,0x0178,
0x00A0,0x00A1,0x00A2,0x00A3,0x00A4,0x00A5,0x00A6,0x00A7,
0x00A8,0x00A9,0x00AA,0x00AB,0x00AC,0x00AD,0x00AE,0x00AF,
0x00B0,0x00B1,0x00B2,0x00B3,0x00B4,0x00B5,0x00B6,0x00B7,
0x00B8,0x00B9,0x00BA,0x00BB,0x00BC,0x00BD,0x00BE,0x00BF,
0x00C0,0x00C1,0x00C2,0x00C3,0x00C4,0x00C5,0x00C6,0x00C7,
0x00C8,0x00C9,0x00CA,0x00CB,0x00CC,0x00CD,0x00CE,0x00CF,
0x00D0,0x00D1,0x00D2,0x00D3,0x00D4,0x00D5,0x00D6,0x00D7,
0x00D8,0x00D9,0x00DA,0x00DB,0x00DC,0x00DD,0x00DE,0x00DF,
0x00E0,0x00E1,0x00E2,0x00E3,0x00E4,0x00E5,0x00E6,0x00E7,
0x00E8,0x00E9,0x00EA,0x00EB,0x00EC,0x00ED,0x00EE,0x00EF,
0x00F0,0x00F1,0x00F2,0x00F3,0x00F4,0x00F5,0x00F6,0x00F7,
0x00F8,0x00F9,0x00FA,0x00FB,0x00FC,0x00FD,0x00FE,0x00FF
};
uchar pl00[256]={
0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,
0x08,0x09,0x0A,0x0B,0x0C,0x0D,0x0E,0x0F,
0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,
0x18,0x19,0x1A,0x1B,0x1C,0x1D,0x1E,0x1F,
0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,
0x28,0x29,0x2A,0x2B,0x2C,0x2D,0x2E,0x2F,
0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,
0x38,0x39,0x3A,0x3B,0x3C,0x3D,0x3E,0x3F,
0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47,
0x48,0x49,0x4A,0x4B,0x4C,0x4D,0x4E,0x4F,
0x50,0x51,0x52,0x53,0x54,0x55,0x56,0x57,
0x58,0x59,0x5A,0x5B,0x5C,0x5D,0x5E,0x5F,
0x60,0x61,0x62,0x63,0x64,0x65,0x66,0x67,
0x68,0x69,0x6A,0x6B,0x6C,0x6D,0x6E,0x6F,
0x70,0x71,0x72,0x73,0x74,0x75,0x76,0x77,
0x78,0x79,0x7A,0x7B,0x7C,0x7D,0x7E,0x7F,
0x00,0x81,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x8D,0x00,0x8F,
0x90,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x9D,0x00,0x00,
0xA0,0xA1,0xA2,0xA3,0xA4,0xA5,0xA6,0xA7,
0xA8,0xA9,0xAA,0xAB,0xAC,0xAD,0xAE,0xAF,
0xB0,0xB1,0xB2,0xB3,0xB4,0xB5,0xB6,0xB7,
0xB8,0xB9,0xBA,0xBB,0xBC,0xBD,0xBE,0xBF,
0xC0,0xC1,0xC2,0xC3,0xC4,0xC5,0xC6,0xC7,
0xC8,0xC9,0xCA,0xCB,0xCC,0xCD,0xCE,0xCF,
0xD0,0xD1,0xD2,0xD3,0xD4,0xD5,0xD6,0xD7,
0xD8,0xD9,0xDA,0xDB,0xDC,0xDD,0xDE,0xDF,
0xE0,0xE1,0xE2,0xE3,0xE4,0xE5,0xE6,0xE7,
0xE8,0xE9,0xEA,0xEB,0xEC,0xED,0xEE,0xEF,
0xF0,0xF1,0xF2,0xF3,0xF4,0xF5,0xF6,0xF7,
0xF8,0xF9,0xFA,0xFB,0xFC,0xFD,0xFE,0xFF
};
uchar pl01[256]={
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x8C,0x9C,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x8A,0x9A,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x9F,0x00,0x00,0x00,0x00,0x8E,0x9E,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x83,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00
};
uchar pl02[256]={
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x88,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x98,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00
};
uchar pl20[256]={
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x96,0x97,0x00,0x00,0x00,
0x91,0x92,0x82,0x00,0x93,0x94,0x84,0x00,
0x86,0x87,0x95,0x00,0x00,0x00,0x85,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x89,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x8B,0x9B,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x80,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00
};
uchar pl21[256]={
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x99,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00
};
uchar *uni_to_cs[256]={
pl00,pl01,pl02,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
pl20,pl21,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL
};
static
int my_mb_wc_latin1(CHARSET_INFO *cs __attribute__((unused)),
my_wc_t *wc,
const uchar *str,
const uchar *end __attribute__((unused)))
{
if (str >= end)
return MY_CS_TOOSMALL;
*wc=cs_to_uni[*str];
return (!wc[0] && str[0]) ? -1 : 1;
}
static
int my_wc_mb_latin1(CHARSET_INFO *cs __attribute__((unused)),
my_wc_t wc,
uchar *str,
uchar *end __attribute__((unused)))
{
uchar *pl;
if (str >= end)
return MY_CS_TOOSMALL;
pl= uni_to_cs[(wc>>8) & 0xFF];
str[0]= pl ? pl[wc & 0xFF] : '\0';
return (!str[0] && wc) ? MY_CS_ILUNI : 1;
}
static MY_CHARSET_HANDLER my_charset_handler=
{
NULL, /* init */
NULL,
my_mbcharlen_8bit,
my_numchars_8bit,
my_charpos_8bit,
my_well_formed_len_8bit,
my_lengthsp_8bit,
my_numcells_8bit,
my_mb_wc_latin1,
my_wc_mb_latin1,
my_mb_ctype_8bit,
my_caseup_str_8bit,
my_casedn_str_8bit,
my_caseup_8bit,
my_casedn_8bit,
my_snprintf_8bit,
my_long10_to_str_8bit,
my_longlong10_to_str_8bit,
my_fill_8bit,
my_strntol_8bit,
my_strntoul_8bit,
my_strntoll_8bit,
my_strntoull_8bit,
my_strntod_8bit,
my_strtoll10_8bit,
my_strntoull10rnd_8bit,
my_scan_8bit
};
CHARSET_INFO my_charset_latin1=
{
8,0,0, /* number */
MY_CS_COMPILED | MY_CS_PRIMARY, /* state */
"latin1", /* cs name */
"latin1_swedish_ci", /* name */
"", /* comment */
NULL, /* tailoring */
ctype_latin1,
to_lower_latin1,
to_upper_latin1,
sort_order_latin1,
NULL, /* contractions */
NULL, /* sort_order_big*/
cs_to_uni, /* tab_to_uni */
NULL, /* tab_from_uni */
my_unicase_default, /* caseinfo */
NULL, /* state_map */
NULL, /* ident_map */
1, /* strxfrm_multiply */
1, /* caseup_multiply */
1, /* casedn_multiply */
1, /* mbminlen */
1, /* mbmaxlen */
0, /* min_sort_char */
255, /* max_sort_char */
' ', /* pad char */
0, /* escape_with_backslash_is_dangerous */
&my_charset_handler,
&my_collation_8bit_simple_ci_handler
};
/*
* This file is the latin1 character set with German sorting
*
* The modern sort order is used, where:
*
* 'ä' -> "ae"
* 'ö' -> "oe"
* 'ü' -> "ue"
* 'ß' -> "ss"
*/
/*
* This is a simple latin1 mapping table, which maps all accented
* characters to their non-accented equivalents. Note: in this
* table, 'ä' is mapped to 'A', 'ÿ' is mapped to 'Y', etc. - all
* accented characters except the following are treated the same way.
* Ü, ü, Ö, ö, Ä, ä
*/
static uchar sort_order_latin1_de[] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
96, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,123,124,125,126,127,
128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,
144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,
176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,
65, 65, 65, 65,196, 65, 92, 67, 69, 69, 69, 69, 73, 73, 73, 73,
68, 78, 79, 79, 79, 79,214,215,216, 85, 85, 85,220, 89,222,223,
65, 65, 65, 65,196, 65, 92, 67, 69, 69, 69, 69, 73, 73, 73, 73,
68, 78, 79, 79, 79, 79,214,247,216, 85, 85, 85,220, 89,222, 89
};
/*
same as sort_order_latin_de, but maps ALL accented chars to unaccented ones
*/
uchar combo1map[]={
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
96, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,123,124,125,126,127,
128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,
144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,
176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,
65, 65, 65, 65, 65, 65, 92, 67, 69, 69, 69, 69, 73, 73, 73, 73,
68, 78, 79, 79, 79, 79, 79,215,216, 85, 85, 85, 85, 89,222, 83,
65, 65, 65, 65, 65, 65, 92, 67, 69, 69, 69, 69, 73, 73, 73, 73,
68, 78, 79, 79, 79, 79, 79,247,216, 85, 85, 85, 85, 89,222, 89
};
uchar combo2map[]={
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,69, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0,69, 0, 0, 0, 0, 0,69, 0, 0,83, 0, 0, 0, 0,69, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,69, 0, 0, 0, 0, 0,69, 0, 0, 0, 0
};
/*
Some notes about the following comparison rules:
By definition, my_strnncoll_latin_de must works exactly as if had called
my_strnxfrm_latin_de() on both strings and compared the result strings.
This means that:
Ä must also matches ÁE and Aè, because my_strxn_frm_latin_de() will convert
both to AE.
The other option would be to not do any accent removal in
sort_order_latin_de[] at all
*/
static int my_strnncoll_latin1_de(CHARSET_INFO *cs __attribute__((unused)),
const uchar *a, size_t a_length,
const uchar *b, size_t b_length,
my_bool b_is_prefix)
{
const uchar *a_end= a + a_length;
const uchar *b_end= b + b_length;
uchar a_char, a_extend= 0, b_char, b_extend= 0;
while ((a < a_end || a_extend) && (b < b_end || b_extend))
{
if (a_extend)
{
a_char=a_extend; a_extend=0;
}
else
{
a_extend=combo2map[*a];
a_char=combo1map[*a++];
}
if (b_extend)
{
b_char=b_extend; b_extend=0;
}
else
{
b_extend=combo2map[*b];
b_char=combo1map[*b++];
}
if (a_char != b_char)
return (int) a_char - (int) b_char;
}
/*
A simple test of string lengths won't work -- we test to see
which string ran out first
*/
return ((a < a_end || a_extend) ? (b_is_prefix ? 0 : 1) :
(b < b_end || b_extend) ? -1 : 0);
}
static int my_strnncollsp_latin1_de(CHARSET_INFO *cs __attribute__((unused)),
const uchar *a, size_t a_length,
const uchar *b, size_t b_length,
my_bool diff_if_only_endspace_difference)
{
const uchar *a_end= a + a_length, *b_end= b + b_length;
uchar a_char, a_extend= 0, b_char, b_extend= 0;
int res;
#ifndef VARCHAR_WITH_DIFF_ENDSPACE_ARE_DIFFERENT_FOR_UNIQUE
diff_if_only_endspace_difference= 0;
#endif
while ((a < a_end || a_extend) && (b < b_end || b_extend))
{
if (a_extend)
{
a_char=a_extend;
a_extend= 0;
}
else
{
a_extend= combo2map[*a];
a_char= combo1map[*a++];
}
if (b_extend)
{
b_char= b_extend;
b_extend= 0;
}
else
{
b_extend= combo2map[*b];
b_char= combo1map[*b++];
}
if (a_char != b_char)
return (int) a_char - (int) b_char;
}
/* Check if double character last */
if (a_extend)
return 1;
if (b_extend)
return -1;
res= 0;
if (a != a_end || b != b_end)
{
int swap= 1;
if (diff_if_only_endspace_difference)
res= 1; /* Assume 'a' is bigger */
/*
Check the next not space character of the longer key. If it's < ' ',
then it's smaller than the other key.
*/
if (a == a_end)
{
/* put shorter key in a */
a_end= b_end;
a= b;
swap= -1; /* swap sign of result */
res= -res;
}
for ( ; a < a_end ; a++)
{
if (*a != ' ')
return (*a < ' ') ? -swap : swap;
}
}
return res;
}
static size_t my_strnxfrm_latin1_de(CHARSET_INFO *cs __attribute__((unused)),
uchar *dest, size_t len,
const uchar *src, size_t srclen)
{
const uchar *de = dest + len;
const uchar *se = src + srclen;
for ( ; src < se && dest < de ; src++)
{
uchar chr=combo1map[*src];
*dest++=chr;
if ((chr=combo2map[*src]) && dest < de)
*dest++=chr;
}
if (dest < de)
bfill(dest, de - dest, ' ');
return (int) len;
}
void my_hash_sort_latin1_de(CHARSET_INFO *cs __attribute__((unused)),
const uchar *key, size_t len,
ulong *nr1, ulong *nr2)
{
const uchar *end;
/*
Remove end space. We have to do this to be able to compare
'AE' and 'Ä' as identical
*/
end= skip_trailing_space(key, len);
for (; key < end ; key++)
{
uint X= (uint) combo1map[(uint) *key];
nr1[0]^=(ulong) ((((uint) nr1[0] & 63)+nr2[0]) * X) + (nr1[0] << 8);
nr2[0]+=3;
if ((X= combo2map[*key]))
{
nr1[0]^=(ulong) ((((uint) nr1[0] & 63)+nr2[0]) * X) + (nr1[0] << 8);
nr2[0]+=3;
}
}
}
static MY_COLLATION_HANDLER my_collation_german2_ci_handler=
{
NULL, /* init */
my_strnncoll_latin1_de,
my_strnncollsp_latin1_de,
my_strnxfrm_latin1_de,
my_strnxfrmlen_simple,
my_like_range_simple,
my_wildcmp_8bit,
my_strcasecmp_8bit,
my_instr_simple,
my_hash_sort_latin1_de,
my_propagate_complex
};
CHARSET_INFO my_charset_latin1_german2_ci=
{
31,0,0, /* number */
MY_CS_COMPILED|MY_CS_STRNXFRM, /* state */
"latin1", /* cs name */
"latin1_german2_ci", /* name */
"", /* comment */
NULL, /* tailoring */
ctype_latin1,
to_lower_latin1,
to_upper_latin1,
sort_order_latin1_de,
NULL, /* contractions */
NULL, /* sort_order_big*/
cs_to_uni, /* tab_to_uni */
NULL, /* tab_from_uni */
my_unicase_default, /* caseinfo */
NULL, /* state_map */
NULL, /* ident_map */
2, /* strxfrm_multiply */
1, /* caseup_multiply */
1, /* casedn_multiply */
1, /* mbminlen */
1, /* mbmaxlen */
0, /* min_sort_char */
247, /* max_sort_char */
' ', /* pad char */
0, /* escape_with_backslash_is_dangerous */
&my_charset_handler,
&my_collation_german2_ci_handler
};
CHARSET_INFO my_charset_latin1_bin=
{
47,0,0, /* number */
MY_CS_COMPILED|MY_CS_BINSORT, /* state */
"latin1", /* cs name */
"latin1_bin", /* name */
"", /* comment */
NULL, /* tailoring */
ctype_latin1,
to_lower_latin1,
to_upper_latin1,
NULL, /* sort_order */
NULL, /* contractions */
NULL, /* sort_order_big*/
cs_to_uni, /* tab_to_uni */
NULL, /* tab_from_uni */
my_unicase_default, /* caseinfo */
NULL, /* state_map */
NULL, /* ident_map */
1, /* strxfrm_multiply */
1, /* caseup_multiply */
1, /* casedn_multiply */
1, /* mbminlen */
1, /* mbmaxlen */
0, /* min_sort_char */
255, /* max_sort_char */
' ', /* pad char */
0, /* escape_with_backslash_is_dangerous */
&my_charset_handler,
&my_collation_8bit_bin_handler
};

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,965 @@
/* Copyright (C) 2000-2003 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
Copyright (C) 2003 by Sathit Jittanupat
<jsat66@hotmail.com,jsat66@yahoo.com>
* solving bug crash with long text field string
* sorting with different number of space or sign char. within string
Copyright (C) 2001 by Korakot Chaovavanich <korakot@iname.com> and
Apisilp Trunganont <apisilp@pantip.inet.co.th>
Copyright (C) 1998, 1999 by Pruet Boonma <pruet@eng.cmu.ac.th>
Copyright (C) 1998 by Theppitak Karoonboonyanan <thep@links.nectec.or.th>
Copyright (C) 1989, 1991 by Samphan Raruenrom <samphan@thai.com>
Permission to use, copy, modify, distribute and sell this software
and its documentation for any purpose is hereby granted without fee,
provided that the above copyright notice appear in all copies.
Samphan Raruenrom , Theppitak Karoonboonyanan , Pruet Boonma ,
Korakot Chaovavanich and Apisilp Trunganont makes no representations
about the suitability of this software for any purpose. It is provided
"as is" without express or implied warranty.
*/
/*
This file is basicly tis620 character sets with some extra functions
for tis-620 handling
*/
/*
* This comment is parsed by configure to create ctype.c,
* so don't change it unless you know what you are doing.
*
* .configure. strxfrm_multiply_tis620=4
*/
#include <my_global.h>
#include <my_sys.h>
#include "m_string.h"
#include "m_ctype.h"
#include "t_ctype.h"
#ifdef HAVE_CHARSET_tis620
#define BUFFER_MULTIPLY 4
#define M L_MIDDLE
#define U L_UPPER
#define L L_LOWER
#define UU L_UPRUPR
#define X L_MIDDLE
static int t_ctype[][TOT_LEVELS] = {
/*0x00*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x01*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x02*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x03*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x04*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x05*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x06*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x07*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x08*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x09*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x0A*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x0B*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x0C*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x0D*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x0E*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x0F*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x10*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x11*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x12*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x13*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x14*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x15*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x16*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x17*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x18*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x19*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x1A*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x1B*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x1C*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x1D*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x1E*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x1F*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x20*/ { IGNORE, IGNORE, L3_SPACE, IGNORE, M},
/*0x21*/ { IGNORE, IGNORE, L3_EXCLAMATION, IGNORE, M },
/*0x22*/ { IGNORE, IGNORE, L3_QUOTATION, IGNORE, M },
/*0x23*/ { IGNORE, IGNORE, L3_NUMBER, IGNORE, M },
/*0x24*/ { IGNORE, IGNORE, L3_DOLLAR, IGNORE, M },
/*0x25*/ { IGNORE, IGNORE, L3_PERCENT, IGNORE, M },
/*0x26*/ { IGNORE, IGNORE, L3_AMPERSAND, IGNORE, M },
/*0x27*/ { IGNORE, IGNORE, L3_APOSTROPHE, IGNORE, M },
/*0x28*/ { IGNORE, IGNORE, L3_L_PARANTHESIS, IGNORE, M },
/*0x29*/ { IGNORE, IGNORE, L3_R_PARENTHESIS, IGNORE, M },
/*0x2A*/ { IGNORE, IGNORE, L3_ASTERISK, IGNORE, M },
/*0x2B*/ { IGNORE, IGNORE, L3_PLUS, IGNORE, M },
/*0x2C*/ { IGNORE, IGNORE, L3_COMMA, IGNORE, M },
/*0x2D*/ { IGNORE, IGNORE, L3_HYPHEN, IGNORE, M },
/*0x2E*/ { IGNORE, IGNORE, L3_FULL_STOP, IGNORE, M },
/*0x2F*/ { IGNORE, IGNORE, L3_SOLIDUS, IGNORE, M },
/*0x30*/ { L1_08, L2_BLANK, L3_BLANK, L4_BLANK, M },
/*0x31*/ { L1_18, L2_BLANK, L3_BLANK, L4_BLANK, M },
/*0x32*/ { L1_28, L2_BLANK, L3_BLANK, L4_BLANK, M },
/*0x33*/ { L1_38, L2_BLANK, L3_BLANK, L4_BLANK, M },
/*0x34*/ { L1_48, L2_BLANK, L3_BLANK, L4_BLANK, M },
/*0x35*/ { L1_58, L2_BLANK, L3_BLANK, L4_BLANK, M },
/*0x36*/ { L1_68, L2_BLANK, L3_BLANK, L4_BLANK, M },
/*0x37*/ { L1_78, L2_BLANK, L3_BLANK, L4_BLANK, M },
/*0x38*/ { L1_88, L2_BLANK, L3_BLANK, L4_BLANK, M },
/*0x39*/ { L1_98, L2_BLANK, L3_BLANK, L4_BLANK, M },
/*0x3A*/ { IGNORE, IGNORE, L3_COLON, IGNORE, M },
/*0x3B*/ { IGNORE, IGNORE, L3_SEMICOLON, IGNORE, M },
/*0x3C*/ { IGNORE, IGNORE, L3_LESS_THAN, IGNORE, M },
/*0x3D*/ { IGNORE, IGNORE, L3_EQUAL, IGNORE, M },
/*0x3E*/ { IGNORE, IGNORE, L3_GREATER_THAN, IGNORE, M },
/*0x3F*/ { IGNORE, IGNORE, L3_QUESTION, IGNORE, M },
/*0x40*/ { IGNORE, IGNORE, L3_AT, IGNORE, M },
/*0x41*/ { L1_A8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x42*/ { L1_B8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x43*/ { L1_C8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x44*/ { L1_D8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x45*/ { L1_E8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x46*/ { L1_F8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x47*/ { L1_G8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x48*/ { L1_H8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x49*/ { L1_I8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x4A*/ { L1_J8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x4B*/ { L1_K8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x4C*/ { L1_L8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x4D*/ { L1_M8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x4E*/ { L1_N8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x4F*/ { L1_O8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x50*/ { L1_P8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x51*/ { L1_Q8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x52*/ { L1_R8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x53*/ { L1_S8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x54*/ { L1_T8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x55*/ { L1_U8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x56*/ { L1_V8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x57*/ { L1_W8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x58*/ { L1_X8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x59*/ { L1_Y8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x5A*/ { L1_Z8, L2_BLANK, L3_BLANK, L4_CAP, M },
/*0x5B*/ { IGNORE, IGNORE, L3_L_BRACKET, IGNORE, M },
/*0x5C*/ { IGNORE, IGNORE, L3_BK_SOLIDUS, IGNORE, M },
/*0x5D*/ { IGNORE, IGNORE, L3_R_BRACKET, IGNORE, M },
/*0x5E*/ { IGNORE, IGNORE, L3_CIRCUMFLEX, IGNORE, M },
/*0x5F*/ { IGNORE, IGNORE, L3_LOW_LINE, IGNORE, M },
/*0x60*/ { IGNORE, IGNORE, L3_GRAVE, IGNORE, M },
/*0x61*/ { L1_A8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x62*/ { L1_B8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x63*/ { L1_C8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x64*/ { L1_D8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x65*/ { L1_E8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x66*/ { L1_F8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x67*/ { L1_G8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x68*/ { L1_H8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x69*/ { L1_I8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x6A*/ { L1_J8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x6B*/ { L1_K8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x6C*/ { L1_L8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x6D*/ { L1_M8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x6E*/ { L1_N8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x6F*/ { L1_O8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x70*/ { L1_P8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x71*/ { L1_Q8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x72*/ { L1_R8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x73*/ { L1_S8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x74*/ { L1_T8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x75*/ { L1_U8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x76*/ { L1_V8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x77*/ { L1_W8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x78*/ { L1_X8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x79*/ { L1_Y8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x7A*/ { L1_Z8, L2_BLANK, L3_BLANK, L4_MIN, M },
/*0x7B*/ { IGNORE, IGNORE, L3_L_BRACE, IGNORE, M },
/*0x7C*/ { IGNORE, IGNORE, L3_V_LINE, IGNORE, M },
/*0x7D*/ { IGNORE, IGNORE, L3_R_BRACE, IGNORE, M },
/*0x7E*/ { IGNORE, IGNORE, L3_TILDE, IGNORE, M },
/*0x7F*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x80*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x81*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x82*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x83*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x84*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x85*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x86*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x87*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x88*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x89*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x8A*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x8B*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x8C*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x8D*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x8E*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x8F*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x90*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x91*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x92*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x93*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x94*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x95*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x96*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x97*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x98*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x99*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x9A*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x9B*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x9C*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x9D*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x9E*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0x9F*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0xA0*/ { IGNORE, IGNORE, L3_NB_SACE, IGNORE, X },
/*0xA1*/ { L1_KO_KAI, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xA2*/ { L1_KHO_KHAI, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xA3*/ { L1_KHO_KHUAT, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xA4*/ { L1_KHO_KHWAI, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xA5*/ { L1_KHO_KHON, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xA6*/ { L1_KHO_RAKHANG, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xA7*/ { L1_NGO_NGU, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xA8*/ { L1_CHO_CHAN, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xA9*/ { L1_CHO_CHING, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xAA*/ { L1_CHO_CHANG, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xAB*/ { L1_SO_SO, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xAC*/ { L1_CHO_CHOE, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xAD*/ { L1_YO_YING, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xAE*/ { L1_DO_CHADA, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xAF*/ { L1_TO_PATAK, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xB0*/ { L1_THO_THAN, L2_BLANK,L3_BLANK, L4_BLANK, M | _consnt},
/*0xB1*/ { L1_THO_NANGMONTHO, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xB2*/ { L1_THO_PHUTHAO, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xB3*/ { L1_NO_NEN, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xB4*/ { L1_DO_DEK, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xB5*/ { L1_TO_TAO, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xB6*/ { L1_THO_THUNG, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xB7*/ { L1_THO_THAHAN, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xB8*/ { L1_THO_THONG, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xB9*/ { L1_NO_NU, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xBA*/ { L1_BO_BAIMAI, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xBB*/ { L1_PO_PLA, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xBC*/ { L1_PHO_PHUNG, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xBD*/ { L1_FO_FA, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xBE*/ { L1_PHO_PHAN, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xBF*/ { L1_FO_FAN, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xC0*/ { L1_PHO_SAMPHAO, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xC1*/ { L1_MO_MA, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xC2*/ { L1_YO_YAK, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xC3*/ { L1_RO_RUA, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xC4*/ { L1_RU, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xC5*/ { L1_LO_LING, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xC6*/ { L1_LU, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xC7*/ { L1_WO_WAEN, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xC8*/ { L1_SO_SALA, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xC9*/ { L1_SO_RUSI, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xCA*/ { L1_SO_SUA, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xCB*/ { L1_HO_HIP, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xCC*/ { L1_LO_CHULA, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xCD*/ { L1_O_ANG, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xCE*/ { L1_HO_NOKHUK, L2_BLANK, L3_BLANK, L4_BLANK, M | _consnt},
/*0xCF*/ { IGNORE, IGNORE, L3_PAIYAN_NOI, IGNORE, M},
/*0xD0*/ { L1_SARA_A, L2_BLANK, L3_BLANK, L4_BLANK, M | _fllwvowel},
/*0xD1*/ { L1_MAI_HAN_AKAT, L2_BLANK, L3_BLANK, L4_BLANK, U | _uprvowel},
/*0xD2*/ { L1_SARA_AA, L2_BLANK, L3_BLANK, L4_BLANK, M | _fllwvowel},
/*0xD3*/ { L1_SARA_AM, L2_BLANK, L3_BLANK, L4_BLANK, M | _fllwvowel},
/*0xD4*/ { L1_SARA_I, L2_BLANK, L3_BLANK, L4_BLANK, U | _uprvowel},
/*0xD5*/ { L1_SARA_II, L2_BLANK, L3_BLANK, L4_BLANK, U | _uprvowel},
/*0xD6*/ { L1_SARA_UE, L2_BLANK, L3_BLANK, L4_BLANK, U | _uprvowel},
/*0xD7*/ { L1_SARA_UEE, L2_BLANK, L3_BLANK, L4_BLANK, U | _uprvowel},
/*0xD8*/ { L1_SARA_U, L2_BLANK, L3_BLANK, L4_BLANK, L | _lwrvowel},
/*0xD9*/ { L1_SARA_UU, L2_BLANK, L3_BLANK, L4_BLANK, L | _lwrvowel},
/*0xDA*/ { IGNORE, L2_PINTHU, L3_BLANK, L4_BLANK, L },
/*0xDB*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0xDC*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0xDD*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0xDE*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0xDF*/ { IGNORE, IGNORE, L3_BAHT, IGNORE, M},
/*0xE0*/ { L1_SARA_E, L2_BLANK, L3_BLANK, L4_BLANK, M | _ldvowel },
/*0xE1*/ { L1_SARA_AE, L2_BLANK, L3_BLANK, L4_BLANK, M | _ldvowel },
/*0xE2*/ { L1_SARA_O, L2_BLANK, L3_BLANK, L4_BLANK, M | _ldvowel },
/*0xE3*/ { L1_SARA_AI_MAIMUAN, L2_BLANK, L3_BLANK, L4_BLANK, M | _ldvowel },
/*0xE4*/ { L1_SARA_AI_MAIMALAI, L2_BLANK, L3_BLANK, L4_BLANK, M | _ldvowel },
/*0xE5*/ { L1_SARA_AA, L2_BLANK, L3_BLANK, L4_EXT, M | _fllwvowel },
/*0xE6*/ { IGNORE, IGNORE, L3_MAI_YAMOK, IGNORE, M | _stone },
/*0xE7*/ { IGNORE, L2_TYKHU, L3_BLANK, L4_BLANK, U | _diacrt1 | _stone },
/*0xE8*/ { IGNORE, L2_TONE1, L3_BLANK, L4_BLANK, UU | _tone | _combine | _stone },
/*0xE9*/ { IGNORE, L2_TONE2, L3_BLANK, L4_BLANK, UU | _tone | _combine | _stone },
/*0xEA*/ { IGNORE, L2_TONE3, L3_BLANK, L4_BLANK, UU | _tone | _combine | _stone },
/*0xEB*/ { IGNORE, L2_TONE4, L3_BLANK, L4_BLANK, UU | _tone | _combine | _stone },
/*0xEC*/ { IGNORE, L2_GARAN, L3_BLANK, L4_BLANK, UU | _diacrt2 | _combine | _stone },
/*0xED*/ { L1_NKHIT, L2_BLANK, L3_BLANK, L4_BLANK, U | _diacrt1 },
/*0xEE*/ { IGNORE, L2_YAMAK, L3_BLANK, L4_BLANK, U | _diacrt1 },
/*0xEF*/ { IGNORE, IGNORE, L3_FONGMAN, IGNORE, M },
/*0xF0*/ { L1_08, L2_THAII, L3_BLANK, L4_BLANK, M | _tdig },
/*0xF1*/ { L1_18, L2_THAII, L3_BLANK, L4_BLANK, M | _tdig },
/*0xF2*/ { L1_28, L2_THAII, L3_BLANK, L4_BLANK, M | _tdig },
/*0xF3*/ { L1_38, L2_THAII, L3_BLANK, L4_BLANK, M | _tdig },
/*0xF4*/ { L1_48, L2_THAII, L3_BLANK, L4_BLANK, M | _tdig },
/*0xF5*/ { L1_58, L2_THAII, L3_BLANK, L4_BLANK, M | _tdig },
/*0xF6*/ { L1_68, L2_THAII, L3_BLANK, L4_BLANK, M | _tdig },
/*0xF7*/ { L1_78, L2_THAII, L3_BLANK, L4_BLANK, M | _tdig },
/*0xF8*/ { L1_88, L2_THAII, L3_BLANK, L4_BLANK, M | _tdig },
/*0xF9*/ { L1_98, L2_THAII, L3_BLANK, L4_BLANK, M | _tdig },
/*0xFA*/ { IGNORE, IGNORE, L3_ANGKHANKHU, IGNORE, X },
/*0xFB*/ { IGNORE, IGNORE, L3_KHOMUT, IGNORE, X },
/*0xFC*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0xFD*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/*0xFE*/ { IGNORE, IGNORE, IGNORE, IGNORE, X },
/* Utilize 0xFF for max_sort_chr in my_like_range_tis620 */
/*0xFF*/ { 255 /*IGNORE*/, IGNORE, IGNORE, IGNORE, X },
};
static uchar ctype_tis620[257] =
{
0, /* For standard library */
32,32,32,32,32,32,32,32,32,40,40,40,40,40,32,32,
32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,
72,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,
132,132,132,132,132,132,132,132,132,132,16,16,16,16,16,16,
16,129,129,129,129,129,129,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,16,16,16,16,16,
16,130,130,130,130,130,130,2,2,2,2,2,2,2,2,2,
2,2,2,2,2,2,2,2,2,2,2,16,16,16,16,32,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
};
static uchar to_lower_tis620[]=
{
'\000','\001','\002','\003','\004','\005','\006','\007',
'\010','\011','\012','\013','\014','\015','\016','\017',
'\020','\021','\022','\023','\024','\025','\026','\027',
'\030','\031','\032','\033','\034','\035','\036','\037',
' ', '!', '"', '#', '$', '%', '&', '\'',
'(', ')', '*', '+', ',', '-', '.', '/',
'0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', ':', ';', '<', '=', '>', '?',
'@', 'a', 'b', 'c', 'd', 'e', 'f', 'g',
'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o',
'p', 'q', 'r', 's', 't', 'u', 'v', 'w',
'x', 'y', 'z', '[', '\\', ']', '^', '_',
'`', 'a', 'b', 'c', 'd', 'e', 'f', 'g',
'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o',
'p', 'q', 'r', 's', 't', 'u', 'v', 'w',
'x', 'y', 'z', '{', '|', '}', '~', '\177',
(uchar) '\200',(uchar) '\201',(uchar) '\202',(uchar) '\203',(uchar) '\204',(uchar) '\205',(uchar) '\206',(uchar) '\207',
(uchar) '\210',(uchar) '\211',(uchar) '\212',(uchar) '\213',(uchar) '\214',(uchar) '\215',(uchar) '\216',(uchar) '\217',
(uchar) '\220',(uchar) '\221',(uchar) '\222',(uchar) '\223',(uchar) '\224',(uchar) '\225',(uchar) '\226',(uchar) '\227',
(uchar) '\230',(uchar) '\231',(uchar) '\232',(uchar) '\233',(uchar) '\234',(uchar) '\235',(uchar) '\236',(uchar) '\237',
(uchar) '\240',(uchar) '\241',(uchar) '\242',(uchar) '\243',(uchar) '\244',(uchar) '\245',(uchar) '\246',(uchar) '\247',
(uchar) '\250',(uchar) '\251',(uchar) '\252',(uchar) '\253',(uchar) '\254',(uchar) '\255',(uchar) '\256',(uchar) '\257',
(uchar) '\260',(uchar) '\261',(uchar) '\262',(uchar) '\263',(uchar) '\264',(uchar) '\265',(uchar) '\266',(uchar) '\267',
(uchar) '\270',(uchar) '\271',(uchar) '\272',(uchar) '\273',(uchar) '\274',(uchar) '\275',(uchar) '\276',(uchar) '\277',
(uchar) '\300',(uchar) '\301',(uchar) '\302',(uchar) '\303',(uchar) '\304',(uchar) '\305',(uchar) '\306',(uchar) '\307',
(uchar) '\310',(uchar) '\311',(uchar) '\312',(uchar) '\313',(uchar) '\314',(uchar) '\315',(uchar) '\316',(uchar) '\317',
(uchar) '\320',(uchar) '\321',(uchar) '\322',(uchar) '\323',(uchar) '\324',(uchar) '\325',(uchar) '\326',(uchar) '\327',
(uchar) '\330',(uchar) '\331',(uchar) '\332',(uchar) '\333',(uchar) '\334',(uchar) '\335',(uchar) '\336',(uchar) '\337',
(uchar) '\340',(uchar) '\341',(uchar) '\342',(uchar) '\343',(uchar) '\344',(uchar) '\345',(uchar) '\346',(uchar) '\347',
(uchar) '\350',(uchar) '\351',(uchar) '\352',(uchar) '\353',(uchar) '\354',(uchar) '\355',(uchar) '\356',(uchar) '\357',
(uchar) '\360',(uchar) '\361',(uchar) '\362',(uchar) '\363',(uchar) '\364',(uchar) '\365',(uchar) '\366',(uchar) '\367',
(uchar) '\370',(uchar) '\371',(uchar) '\372',(uchar) '\373',(uchar) '\374',(uchar) '\375',(uchar) '\376',(uchar) '\377',
};
static uchar to_upper_tis620[]=
{
'\000','\001','\002','\003','\004','\005','\006','\007',
'\010','\011','\012','\013','\014','\015','\016','\017',
'\020','\021','\022','\023','\024','\025','\026','\027',
'\030','\031','\032','\033','\034','\035','\036','\037',
' ', '!', '"', '#', '$', '%', '&', '\'',
'(', ')', '*', '+', ',', '-', '.', '/',
'0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', ':', ';', '<', '=', '>', '?',
'@', 'A', 'B', 'C', 'D', 'E', 'F', 'G',
'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
'X', 'Y', 'Z', '[', '\\', ']', '^', '_',
'`', 'A', 'B', 'C', 'D', 'E', 'F', 'G',
'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
'X', 'Y', 'Z', '{', '|', '}', '~', '\177',
(uchar) '\200',(uchar) '\201',(uchar) '\202',(uchar) '\203',(uchar) '\204',(uchar) '\205',(uchar) '\206',(uchar) '\207',
(uchar) '\210',(uchar) '\211',(uchar) '\212',(uchar) '\213',(uchar) '\214',(uchar) '\215',(uchar) '\216',(uchar) '\217',
(uchar) '\220',(uchar) '\221',(uchar) '\222',(uchar) '\223',(uchar) '\224',(uchar) '\225',(uchar) '\226',(uchar) '\227',
(uchar) '\230',(uchar) '\231',(uchar) '\232',(uchar) '\233',(uchar) '\234',(uchar) '\235',(uchar) '\236',(uchar) '\237',
(uchar) '\240',(uchar) '\241',(uchar) '\242',(uchar) '\243',(uchar) '\244',(uchar) '\245',(uchar) '\246',(uchar) '\247',
(uchar) '\250',(uchar) '\251',(uchar) '\252',(uchar) '\253',(uchar) '\254',(uchar) '\255',(uchar) '\256',(uchar) '\257',
(uchar) '\260',(uchar) '\261',(uchar) '\262',(uchar) '\263',(uchar) '\264',(uchar) '\265',(uchar) '\266',(uchar) '\267',
(uchar) '\270',(uchar) '\271',(uchar) '\272',(uchar) '\273',(uchar) '\274',(uchar) '\275',(uchar) '\276',(uchar) '\277',
(uchar) '\300',(uchar) '\301',(uchar) '\302',(uchar) '\303',(uchar) '\304',(uchar) '\305',(uchar) '\306',(uchar) '\307',
(uchar) '\310',(uchar) '\311',(uchar) '\312',(uchar) '\313',(uchar) '\314',(uchar) '\315',(uchar) '\316',(uchar) '\317',
(uchar) '\320',(uchar) '\321',(uchar) '\322',(uchar) '\323',(uchar) '\324',(uchar) '\325',(uchar) '\326',(uchar) '\327',
(uchar) '\330',(uchar) '\331',(uchar) '\332',(uchar) '\333',(uchar) '\334',(uchar) '\335',(uchar) '\336',(uchar) '\337',
(uchar) '\340',(uchar) '\341',(uchar) '\342',(uchar) '\343',(uchar) '\344',(uchar) '\345',(uchar) '\346',(uchar) '\347',
(uchar) '\350',(uchar) '\351',(uchar) '\352',(uchar) '\353',(uchar) '\354',(uchar) '\355',(uchar) '\356',(uchar) '\357',
(uchar) '\360',(uchar) '\361',(uchar) '\362',(uchar) '\363',(uchar) '\364',(uchar) '\365',(uchar) '\366',(uchar) '\367',
(uchar) '\370',(uchar) '\371',(uchar) '\372',(uchar) '\373',(uchar) '\374',(uchar) '\375',(uchar) '\376',(uchar) '\377',
};
static uchar sort_order_tis620[]=
{
'\000','\001','\002','\003','\004','\005','\006','\007',
'\010','\011','\012','\013','\014','\015','\016','\017',
'\020','\021','\022','\023','\024','\025','\026','\027',
'\030','\031','\032','\033','\034','\035','\036','\037',
' ', '!', '"', '#', '$', '%', '&', '\'',
'(', ')', '*', '+', ',', '-', '.', '/',
'0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', ':', ';', '<', '=', '>', '?',
'@', 'A', 'B', 'C', 'D', 'E', 'F', 'G',
'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
'X', 'Y', 'Z', '\\', ']', '[', '^', '_',
'E', 'A', 'B', 'C', 'D', 'E', 'F', 'G',
'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
'X', 'Y', 'Z', '{', '|', '}', 'Y', '\177',
(uchar) '\200',(uchar) '\201',(uchar) '\202',(uchar) '\203',(uchar) '\204',(uchar) '\205',(uchar) '\206',(uchar) '\207',
(uchar) '\210',(uchar) '\211',(uchar) '\212',(uchar) '\213',(uchar) '\214',(uchar) '\215',(uchar) '\216',(uchar) '\217',
(uchar) '\220',(uchar) '\221',(uchar) '\222',(uchar) '\223',(uchar) '\224',(uchar) '\225',(uchar) '\226',(uchar) '\227',
(uchar) '\230',(uchar) '\231',(uchar) '\232',(uchar) '\233',(uchar) '\234',(uchar) '\235',(uchar) '\236',(uchar) '\237',
(uchar) '\240',(uchar) '\241',(uchar) '\242',(uchar) '\243',(uchar) '\244',(uchar) '\245',(uchar) '\246',(uchar) '\247',
(uchar) '\250',(uchar) '\251',(uchar) '\252',(uchar) '\253',(uchar) '\254',(uchar) '\255',(uchar) '\256',(uchar) '\257',
(uchar) '\260',(uchar) '\261',(uchar) '\262',(uchar) '\263',(uchar) '\264',(uchar) '\265',(uchar) '\266',(uchar) '\267',
(uchar) '\270',(uchar) '\271',(uchar) '\272',(uchar) '\273',(uchar) '\274',(uchar) '\275',(uchar) '\276',(uchar) '\277',
(uchar) '\300',(uchar) '\301',(uchar) '\302',(uchar) '\303',(uchar) '\304',(uchar) '\305',(uchar) '\306',(uchar) '\307',
(uchar) '\310',(uchar) '\311',(uchar) '\312',(uchar) '\313',(uchar) '\314',(uchar) '\315',(uchar) '\316',(uchar) '\317',
(uchar) '\320',(uchar) '\321',(uchar) '\322',(uchar) '\323',(uchar) '\324',(uchar) '\325',(uchar) '\326',(uchar) '\327',
(uchar) '\330',(uchar) '\331',(uchar) '\332',(uchar) '\333',(uchar) '\334',(uchar) '\335',(uchar) '\336',(uchar) '\337',
(uchar) '\340',(uchar) '\341',(uchar) '\342',(uchar) '\343',(uchar) '\344',(uchar) '\345',(uchar) '\346',(uchar) '\347',
(uchar) '\350',(uchar) '\351',(uchar) '\352',(uchar) '\353',(uchar) '\354',(uchar) '\355',(uchar) '\356',(uchar) '\357',
(uchar) '\360',(uchar) '\361',(uchar) '\362',(uchar) '\363',(uchar) '\364',(uchar) '\365',(uchar) '\366',(uchar) '\367',
(uchar) '\370',(uchar) '\371',(uchar) '\372',(uchar) '\373',(uchar) '\374',(uchar) '\375',(uchar) '\376',(uchar) '\377',
};
/*
Convert thai string to "Standard C String Function" sortable string
SYNOPSIS
thai2sortable()
tstr String to convert. Does not have to end with \0
len Length of tstr
*/
static size_t thai2sortable(uchar *tstr, size_t len)
{
uchar *p;
int tlen;
uchar l2bias;
tlen= len;
l2bias= 256 - 8;
for (p= tstr; tlen > 0; p++, tlen--)
{
uchar c= *p;
if (isthai(c))
{
int *t_ctype0= t_ctype[c];
if (isconsnt(c))
l2bias -= 8;
if (isldvowel(c) && tlen != 1 && isconsnt(p[1]))
{
/* simply swap between leading-vowel and consonant */
*p= p[1];
p[1]= c;
tlen--;
p++;
continue;
}
/* if found level 2 char (L2_GARAN,L2_TONE*,L2_TYKHU) move to last */
if (t_ctype0[1] >= L2_GARAN)
{
/*
l2bias use to control position weight of l2char
example (*=l2char) XX*X must come before X*XX
*/
memmove((char*) p, (char*) (p+1), tlen-1);
tstr[len-1]= l2bias + t_ctype0[1]- L2_GARAN +1;
p--;
continue;
}
}
else
{
l2bias-= 8;
*p= to_lower_tis620[c];
}
}
return len;
}
/*
strncoll() replacement, compare 2 string, both are converted to sortable
string
NOTE:
We can't cut strings at end \0 as this would break comparision with
LIKE characters, where the min range is stored as end \0
Arg: 2 Strings and it compare length
Ret: strcmp result
*/
static
int my_strnncoll_tis620(CHARSET_INFO *cs __attribute__((unused)),
const uchar *s1, size_t len1,
const uchar *s2, size_t len2,
my_bool s2_is_prefix)
{
uchar buf[80] ;
uchar *tc1, *tc2;
int i;
if (s2_is_prefix && len1 > len2)
len1= len2;
tc1= buf;
if ((len1 + len2 +2) > (int) sizeof(buf))
tc1= (uchar*) my_str_malloc(len1+len2+2);
tc2= tc1 + len1+1;
memcpy((char*) tc1, (char*) s1, len1);
tc1[len1]= 0; /* if length(s1)> len1, need to put 'end of string' */
memcpy((char *)tc2, (char *)s2, len2);
tc2[len2]= 0; /* put end of string */
thai2sortable(tc1, len1);
thai2sortable(tc2, len2);
i= strcmp((char*)tc1, (char*)tc2);
if (tc1 != buf)
my_str_free(tc1);
return i;
}
static
int my_strnncollsp_tis620(CHARSET_INFO * cs __attribute__((unused)),
const uchar *a0, size_t a_length,
const uchar *b0, size_t b_length,
my_bool diff_if_only_endspace_difference)
{
uchar buf[80], *end, *a, *b, *alloced= NULL;
size_t length;
int res= 0;
#ifndef VARCHAR_WITH_DIFF_ENDSPACE_ARE_DIFFERENT_FOR_UNIQUE
diff_if_only_endspace_difference= 0;
#endif
a= buf;
if ((a_length + b_length +2) > (int) sizeof(buf))
alloced= a= (uchar*) my_str_malloc(a_length+b_length+2);
b= a + a_length+1;
memcpy((char*) a, (char*) a0, a_length);
a[a_length]= 0; /* if length(a0)> len1, need to put 'end of string' */
memcpy((char *)b, (char *)b0, b_length);
b[b_length]= 0; /* put end of string */
a_length= thai2sortable(a, a_length);
b_length= thai2sortable(b, b_length);
end= a + (length= min(a_length, b_length));
while (a < end)
{
if (*a++ != *b++)
{
res= ((int) a[-1] - (int) b[-1]);
goto ret;
}
}
if (a_length != b_length)
{
int swap= 1;
if (diff_if_only_endspace_difference)
res= 1; /* Assume 'a' is bigger */
/*
Check the next not space character of the longer key. If it's < ' ',
then it's smaller than the other key.
*/
if (a_length < b_length)
{
/* put shorter key in s */
a_length= b_length;
a= b;
swap= -1; /* swap sign of result */
res= -res;
}
for (end= a + a_length-length; a < end ; a++)
{
if (*a != ' ')
{
res= (*a < ' ') ? -swap : swap;
goto ret;
}
}
}
ret:
if (alloced)
my_str_free(alloced);
return res;
}
/*
strnxfrm replacment, convert Thai string to sortable string
Arg: Destination buffer, source string, dest length and source length
Ret: Conveted string size
*/
static
size_t my_strnxfrm_tis620(CHARSET_INFO *cs __attribute__((unused)),
uchar *dest, size_t len,
const uchar *src, size_t srclen)
{
size_t dstlen= len;
len= (size_t) (strmake((char*) dest, (char*) src, min(len, srclen)) -
(char*) dest);
len= thai2sortable(dest, len);
if (dstlen > len)
bfill(dest + len, dstlen - len, ' ');
return dstlen;
}
static unsigned short cs_to_uni[256]={
0x0000,0x0001,0x0002,0x0003,0x0004,0x0005,0x0006,0x0007,
0x0008,0x0009,0x000A,0x000B,0x000C,0x000D,0x000E,0x000F,
0x0010,0x0011,0x0012,0x0013,0x0014,0x0015,0x0016,0x0017,
0x0018,0x0019,0x001A,0x001B,0x001C,0x001D,0x001E,0x001F,
0x0020,0x0021,0x0022,0x0023,0x0024,0x0025,0x0026,0x0027,
0x0028,0x0029,0x002A,0x002B,0x002C,0x002D,0x002E,0x002F,
0x0030,0x0031,0x0032,0x0033,0x0034,0x0035,0x0036,0x0037,
0x0038,0x0039,0x003A,0x003B,0x003C,0x003D,0x003E,0x003F,
0x0040,0x0041,0x0042,0x0043,0x0044,0x0045,0x0046,0x0047,
0x0048,0x0049,0x004A,0x004B,0x004C,0x004D,0x004E,0x004F,
0x0050,0x0051,0x0052,0x0053,0x0054,0x0055,0x0056,0x0057,
0x0058,0x0059,0x005A,0x005B,0x005C,0x005D,0x005E,0x005F,
0x0060,0x0061,0x0062,0x0063,0x0064,0x0065,0x0066,0x0067,
0x0068,0x0069,0x006A,0x006B,0x006C,0x006D,0x006E,0x006F,
0x0070,0x0071,0x0072,0x0073,0x0074,0x0075,0x0076,0x0077,
0x0078,0x0079,0x007A,0x007B,0x007C,0x007D,0x007E,0x007F,
0x0080,0x0081,0x0082,0x0083,0x0084,0x0085,0x0086,0x0087,
0x0088,0x0089,0x008A,0x008B,0x008C,0x008D,0x008E,0x008F,
0x0090,0x0091,0x0092,0x0093,0x0094,0x0095,0x0096,0x0097,
0x0098,0x0099,0x009A,0x009B,0x009C,0x009D,0x009E,0x009F,
0xFFFD,0x0E01,0x0E02,0x0E03,0x0E04,0x0E05,0x0E06,0x0E07,
0x0E08,0x0E09,0x0E0A,0x0E0B,0x0E0C,0x0E0D,0x0E0E,0x0E0F,
0x0E10,0x0E11,0x0E12,0x0E13,0x0E14,0x0E15,0x0E16,0x0E17,
0x0E18,0x0E19,0x0E1A,0x0E1B,0x0E1C,0x0E1D,0x0E1E,0x0E1F,
0x0E20,0x0E21,0x0E22,0x0E23,0x0E24,0x0E25,0x0E26,0x0E27,
0x0E28,0x0E29,0x0E2A,0x0E2B,0x0E2C,0x0E2D,0x0E2E,0x0E2F,
0x0E30,0x0E31,0x0E32,0x0E33,0x0E34,0x0E35,0x0E36,0x0E37,
0x0E38,0x0E39,0x0E3A,0xFFFD,0xFFFD,0xFFFD,0xFFFD,0x0E3F,
0x0E40,0x0E41,0x0E42,0x0E43,0x0E44,0x0E45,0x0E46,0x0E47,
0x0E48,0x0E49,0x0E4A,0x0E4B,0x0E4C,0x0E4D,0x0E4E,0x0E4F,
0x0E50,0x0E51,0x0E52,0x0E53,0x0E54,0x0E55,0x0E56,0x0E57,
0x0E58,0x0E59,0x0E5A,0x0E5B,0xFFFD,0xFFFD,0xFFFD,0xFFFD
};
static uchar pl00[256]={
0x0000,0x0001,0x0002,0x0003,0x0004,0x0005,0x0006,0x0007,
0x0008,0x0009,0x000A,0x000B,0x000C,0x000D,0x000E,0x000F,
0x0010,0x0011,0x0012,0x0013,0x0014,0x0015,0x0016,0x0017,
0x0018,0x0019,0x001A,0x001B,0x001C,0x001D,0x001E,0x001F,
0x0020,0x0021,0x0022,0x0023,0x0024,0x0025,0x0026,0x0027,
0x0028,0x0029,0x002A,0x002B,0x002C,0x002D,0x002E,0x002F,
0x0030,0x0031,0x0032,0x0033,0x0034,0x0035,0x0036,0x0037,
0x0038,0x0039,0x003A,0x003B,0x003C,0x003D,0x003E,0x003F,
0x0040,0x0041,0x0042,0x0043,0x0044,0x0045,0x0046,0x0047,
0x0048,0x0049,0x004A,0x004B,0x004C,0x004D,0x004E,0x004F,
0x0050,0x0051,0x0052,0x0053,0x0054,0x0055,0x0056,0x0057,
0x0058,0x0059,0x005A,0x005B,0x005C,0x005D,0x005E,0x005F,
0x0060,0x0061,0x0062,0x0063,0x0064,0x0065,0x0066,0x0067,
0x0068,0x0069,0x006A,0x006B,0x006C,0x006D,0x006E,0x006F,
0x0070,0x0071,0x0072,0x0073,0x0074,0x0075,0x0076,0x0077,
0x0078,0x0079,0x007A,0x007B,0x007C,0x007D,0x007E,0x007F,
0x0080,0x0081,0x0082,0x0083,0x0084,0x0085,0x0086,0x0087,
0x0088,0x0089,0x008A,0x008B,0x008C,0x008D,0x008E,0x008F,
0x0090,0x0091,0x0092,0x0093,0x0094,0x0095,0x0096,0x0097,
0x0098,0x0099,0x009A,0x009B,0x009C,0x009D,0x009E,0x009F,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000
};
static uchar pl0E[256]={
0x0000,0x00A1,0x00A2,0x00A3,0x00A4,0x00A5,0x00A6,0x00A7,
0x00A8,0x00A9,0x00AA,0x00AB,0x00AC,0x00AD,0x00AE,0x00AF,
0x00B0,0x00B1,0x00B2,0x00B3,0x00B4,0x00B5,0x00B6,0x00B7,
0x00B8,0x00B9,0x00BA,0x00BB,0x00BC,0x00BD,0x00BE,0x00BF,
0x00C0,0x00C1,0x00C2,0x00C3,0x00C4,0x00C5,0x00C6,0x00C7,
0x00C8,0x00C9,0x00CA,0x00CB,0x00CC,0x00CD,0x00CE,0x00CF,
0x00D0,0x00D1,0x00D2,0x00D3,0x00D4,0x00D5,0x00D6,0x00D7,
0x00D8,0x00D9,0x00DA,0x0000,0x0000,0x0000,0x0000,0x00DF,
0x00E0,0x00E1,0x00E2,0x00E3,0x00E4,0x00E5,0x00E6,0x00E7,
0x00E8,0x00E9,0x00EA,0x00EB,0x00EC,0x00ED,0x00EE,0x00EF,
0x00F0,0x00F1,0x00F2,0x00F3,0x00F4,0x00F5,0x00F6,0x00F7,
0x00F8,0x00F9,0x00FA,0x00FB,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000
};
static uchar plFF[256]={
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,
0x0000,0x0000,0x0000,0x0000,0x0000,0x00FF,0x0000,0x0000
};
static uchar *uni_to_cs[256]={
pl00,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,pl0E,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,plFF
};
static
int my_mb_wc_tis620(CHARSET_INFO *cs __attribute__((unused)),
my_wc_t *wc,
const uchar *str,
const uchar *end __attribute__((unused)))
{
if (str >= end)
return MY_CS_TOOSMALL;
*wc=cs_to_uni[*str];
return (!wc[0] && str[0]) ? -1 : 1;
}
static
int my_wc_mb_tis620(CHARSET_INFO *cs __attribute__((unused)),
my_wc_t wc,
uchar *str,
uchar *end __attribute__((unused)))
{
uchar *pl;
if (str >= end)
return MY_CS_TOOSMALL;
pl= uni_to_cs[(wc>>8) & 0xFF];
str[0]= pl ? pl[wc & 0xFF] : '\0';
return (!str[0] && wc) ? MY_CS_ILUNI : 1;
}
static MY_COLLATION_HANDLER my_collation_ci_handler =
{
NULL, /* init */
my_strnncoll_tis620,
my_strnncollsp_tis620,
my_strnxfrm_tis620,
my_strnxfrmlen_simple,
my_like_range_simple,
my_wildcmp_8bit, /* wildcmp */
my_strcasecmp_8bit,
my_instr_simple, /* QQ: To be fixed */
my_hash_sort_simple,
my_propagate_simple
};
static MY_CHARSET_HANDLER my_charset_handler=
{
NULL, /* init */
NULL, /* ismbchar */
my_mbcharlen_8bit, /* mbcharlen */
my_numchars_8bit,
my_charpos_8bit,
my_well_formed_len_8bit,
my_lengthsp_8bit,
my_numcells_8bit,
my_mb_wc_tis620, /* mb_wc */
my_wc_mb_tis620, /* wc_mb */
my_mb_ctype_8bit,
my_caseup_str_8bit,
my_casedn_str_8bit,
my_caseup_8bit,
my_casedn_8bit,
my_snprintf_8bit,
my_long10_to_str_8bit,
my_longlong10_to_str_8bit,
my_fill_8bit,
my_strntol_8bit,
my_strntoul_8bit,
my_strntoll_8bit,
my_strntoull_8bit,
my_strntod_8bit,
my_strtoll10_8bit,
my_strntoull10rnd_8bit,
my_scan_8bit
};
CHARSET_INFO my_charset_tis620_thai_ci=
{
18,0,0, /* number */
MY_CS_COMPILED|MY_CS_PRIMARY|MY_CS_STRNXFRM, /* state */
"tis620", /* cs name */
"tis620_thai_ci", /* name */
"", /* comment */
NULL, /* tailoring */
ctype_tis620,
to_lower_tis620,
to_upper_tis620,
sort_order_tis620,
NULL, /* contractions */
NULL, /* sort_order_big*/
NULL, /* tab_to_uni */
NULL, /* tab_from_uni */
my_unicase_default, /* caseinfo */
NULL, /* state_map */
NULL, /* ident_map */
4, /* strxfrm_multiply */
1, /* caseup_multiply */
1, /* casedn_multiply */
1, /* mbminlen */
1, /* mbmaxlen */
0, /* min_sort_char */
255, /* max_sort_char */
' ', /* pad char */
0, /* escape_with_backslash_is_dangerous */
&my_charset_handler,
&my_collation_ci_handler
};
CHARSET_INFO my_charset_tis620_bin=
{
89,0,0, /* number */
MY_CS_COMPILED|MY_CS_BINSORT, /* state */
"tis620", /* cs name */
"tis620_bin", /* name */
"", /* comment */
NULL, /* tailoring */
ctype_tis620,
to_lower_tis620,
to_upper_tis620,
NULL, /* sort_order */
NULL, /* contractions */
NULL, /* sort_order_big*/
NULL, /* tab_to_uni */
NULL, /* tab_from_uni */
my_unicase_default, /* caseinfo */
NULL, /* state_map */
NULL, /* ident_map */
1, /* strxfrm_multiply */
1, /* caseup_multiply */
1, /* casedn_multiply */
1, /* mbminlen */
1, /* mbmaxlen */
0, /* min_sort_char */
255, /* max_sort_char */
' ', /* pad char */
0, /* escape_with_backslash_is_dangerous */
&my_charset_handler,
&my_collation_8bit_bin_handler
};
#endif

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,715 @@
/* Copyright (C) 2003 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
Shared, independent copyright: (C) 2001 Jan Pazdziora.
Development of this software was supported by Neocortex, s.r.o.
MySQL AB expresses its gratitude to Jan for for giving us this software.
Bug reports and suggestions are always welcome.
This file implements the collating sequence for Windows-1250
character set. It merely extends the binary sorting of US-ASCII
by adding characters with diacritical marks into proper places.
In addition, it sorts 'ch' between 'h' and 'i', and the sorting
is case sensitive, with uppercase being sorted first, in the
second pass.
*/
/*
* This comment is parsed by configure to create ctype.c,
* so don't change it unless you know what you are doing.
*
* .configure. strxfrm_multiply_win1250ch=2
*/
#define REAL_MYSQL
#ifdef REAL_MYSQL
#include "my_global.h"
#include "m_string.h"
#include "m_ctype.h"
#else
#include <stdio.h>
#define uchar unsigned char
#endif
#ifdef HAVE_CHARSET_cp1250
static uint16 tab_cp1250_uni[256]={
0,0x0001,0x0002,0x0003,0x0004,0x0005,0x0006,0x0007,
0x0008,0x0009,0x000A,0x000B,0x000C,0x000D,0x000E,0x000F,
0x0010,0x0011,0x0012,0x0013,0x0014,0x0015,0x0016,0x0017,
0x0018,0x0019,0x001A,0x001B,0x001C,0x001D,0x001E,0x001F,
0x0020,0x0021,0x0022,0x0023,0x0024,0x0025,0x0026,0x0027,
0x0028,0x0029,0x002A,0x002B,0x002C,0x002D,0x002E,0x002F,
0x0030,0x0031,0x0032,0x0033,0x0034,0x0035,0x0036,0x0037,
0x0038,0x0039,0x003A,0x003B,0x003C,0x003D,0x003E,0x003F,
0x0040,0x0041,0x0042,0x0043,0x0044,0x0045,0x0046,0x0047,
0x0048,0x0049,0x004A,0x004B,0x004C,0x004D,0x004E,0x004F,
0x0050,0x0051,0x0052,0x0053,0x0054,0x0055,0x0056,0x0057,
0x0058,0x0059,0x005A,0x005B,0x005C,0x005D,0x005E,0x005F,
0x0060,0x0061,0x0062,0x0063,0x0064,0x0065,0x0066,0x0067,
0x0068,0x0069,0x006A,0x006B,0x006C,0x006D,0x006E,0x006F,
0x0070,0x0071,0x0072,0x0073,0x0074,0x0075,0x0076,0x0077,
0x0078,0x0079,0x007A,0x007B,0x007C,0x007D,0x007E,0x007F,
0x20AC, 0,0x201A, 0,0x201E,0x2026,0x2020,0x2021,
0,0x2030,0x0160,0x2039,0x015A,0x0164,0x017D,0x0179,
0,0x2018,0x2019,0x201C,0x201D,0x2022,0x2013,0x2014,
0,0x2122,0x0161,0x203A,0x015B,0x0165,0x017E,0x017A,
0x00A0,0x02C7,0x02D8,0x0141,0x00A4,0x0104,0x00A6,0x00A7,
0x00A8,0x00A9,0x015E,0x00AB,0x00AC,0x00AD,0x00AE,0x017B,
0x00B0,0x00B1,0x02DB,0x0142,0x00B4,0x00B5,0x00B6,0x00B7,
0x00B8,0x0105,0x015F,0x00BB,0x013D,0x02DD,0x013E,0x017C,
0x0154,0x00C1,0x00C2,0x0102,0x00C4,0x0139,0x0106,0x00C7,
0x010C,0x00C9,0x0118,0x00CB,0x011A,0x00CD,0x00CE,0x010E,
0x0110,0x0143,0x0147,0x00D3,0x00D4,0x0150,0x00D6,0x00D7,
0x0158,0x016E,0x00DA,0x0170,0x00DC,0x00DD,0x0162,0x00DF,
0x0155,0x00E1,0x00E2,0x0103,0x00E4,0x013A,0x0107,0x00E7,
0x010D,0x00E9,0x0119,0x00EB,0x011B,0x00ED,0x00EE,0x010F,
0x0111,0x0144,0x0148,0x00F3,0x00F4,0x0151,0x00F6,0x00F7,
0x0159,0x016F,0x00FA,0x0171,0x00FC,0x00FD,0x0163,0x02D9
};
/* 0000-00FD , 254 chars */
static uchar tab_uni_cp1250_plane00[]={
0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0A,0x0B,0x0C,0x0D,0x0E,0x0F,
0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1A,0x1B,0x1C,0x1D,0x1E,0x1F,
0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28,0x29,0x2A,0x2B,0x2C,0x2D,0x2E,0x2F,
0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x3A,0x3B,0x3C,0x3D,0x3E,0x3F,
0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4A,0x4B,0x4C,0x4D,0x4E,0x4F,
0x50,0x51,0x52,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5A,0x5B,0x5C,0x5D,0x5E,0x5F,
0x60,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6A,0x6B,0x6C,0x6D,0x6E,0x6F,
0x70,0x71,0x72,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7A,0x7B,0x7C,0x7D,0x7E,0x7F,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0xA0,0x00,0x00,0x00,0xA4,0x00,0xA6,0xA7,0xA8,0xA9,0x00,0xAB,0xAC,0xAD,0xAE,0x00,
0xB0,0xB1,0x00,0x00,0xB4,0xB5,0xB6,0xB7,0xB8,0x00,0x00,0xBB,0x00,0x00,0x00,0x00,
0x00,0xC1,0xC2,0x00,0xC4,0x00,0x00,0xC7,0x00,0xC9,0x00,0xCB,0x00,0xCD,0xCE,0x00,
0x00,0x00,0x00,0xD3,0xD4,0x00,0xD6,0xD7,0x00,0x00,0xDA,0x00,0xDC,0xDD,0x00,0xDF,
0x00,0xE1,0xE2,0x00,0xE4,0x00,0x00,0xE7,0x00,0xE9,0x00,0xEB,0x00,0xED,0xEE,0x00,
0x00,0x00,0x00,0xF3,0xF4,0x00,0xF6,0xF7,0x00,0x00,0xFA,0x00,0xFC,0xFD};
/* 0102-017E , 125 chars */
static uchar tab_uni_cp1250_plane01[]={
0xC3,0xE3,0xA5,0xB9,0xC6,0xE6,0x00,0x00,0x00,0x00,0xC8,0xE8,0xCF,0xEF,0xD0,0xF0,
0x00,0x00,0x00,0x00,0x00,0x00,0xCA,0xEA,0xCC,0xEC,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0xC5,0xE5,0x00,0x00,0xBC,0xBE,0x00,0x00,0xA3,
0xB3,0xD1,0xF1,0x00,0x00,0xD2,0xF2,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0xD5,0xF5,
0x00,0x00,0xC0,0xE0,0x00,0x00,0xD8,0xF8,0x8C,0x9C,0x00,0x00,0xAA,0xBA,0x8A,0x9A,
0xDE,0xFE,0x8D,0x9D,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0xD9,0xF9,0xDB,0xFB,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x8F,0x9F,0xAF,0xBF,0x8E,0x9E};
/* 2013-20AC , 154 chars */
static uchar tab_uni_cp1250_plane20[]={
0x96,0x97,0x00,0x00,0x00,0x91,0x92,0x82,0x00,0x93,0x94,0x84,0x00,0x86,0x87,0x95,
0x00,0x00,0x00,0x85,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x89,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x8B,0x9B,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x80};
/* 02C7-02DD , 23 chars */
static uchar tab_uni_cp1250_plane02[]={
0xA1,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
0x00,0xA2,0xFF,0x00,0xB2,0x00,0xBD};
/* 2122-2122 , 1 chars */
static uchar tab_uni_cp1250_plane21[]={
0x99};
static MY_UNI_IDX idx_uni_cp1250[]={
{0x0000,0x00FD,tab_uni_cp1250_plane00},
{0x0102,0x017E,tab_uni_cp1250_plane01},
{0x2013,0x20AC,tab_uni_cp1250_plane20},
{0x02C7,0x02DD,tab_uni_cp1250_plane02},
{0x2122,0x2122,tab_uni_cp1250_plane21},
{0,0,NULL}
};
static uchar ctype_win1250ch[] = {
0x00,
0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20,
0x20, 0x28, 0x28, 0x28, 0x28, 0x28, 0x20, 0x20,
0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20,
0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20,
0x48, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10,
0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10,
0x84, 0x84, 0x84, 0x84, 0x84, 0x84, 0x84, 0x84,
0x84, 0x84, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10,
0x10, 0x81, 0x81, 0x81, 0x81, 0x81, 0x81, 0x01,
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
0x01, 0x01, 0x01, 0x10, 0x10, 0x10, 0x10, 0x10,
0x10, 0x82, 0x82, 0x82, 0x82, 0x82, 0x82, 0x02,
0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02,
0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02,
0x02, 0x02, 0x02, 0x10, 0x10, 0x10, 0x10, 0x20,
0x20, 0x20, 0x10, 0x20, 0x10, 0x10, 0x10, 0x10,
0x20, 0x10, 0x01, 0x10, 0x01, 0x01, 0x01, 0x01,
0x20, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10, 0x10,
0x20, 0x10, 0x02, 0x10, 0x02, 0x02, 0x02, 0x02,
0x48, 0x10, 0x10, 0x01, 0x10, 0x01, 0x10, 0x01,
0x10, 0x10, 0x01, 0x10, 0x10, 0x10, 0x10, 0x01,
0x10, 0x10, 0x10, 0x02, 0x10, 0x10, 0x10, 0x10,
0x10, 0x02, 0x02, 0x10, 0x01, 0x10, 0x02, 0x02,
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x10,
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x02,
0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02,
0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02,
0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x10,
0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x10
};
static uchar to_lower_win1250ch[] = {
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f,
0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27,
0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f,
0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f,
0x40, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67,
0x68, 0x69, 0x6a, 0x6b, 0x6c, 0x6d, 0x6e, 0x6f,
0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77,
0x78, 0x79, 0x7a, 0x5b, 0x5c, 0x5d, 0x5e, 0x5f,
0x60, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67,
0x68, 0x69, 0x6a, 0x6b, 0x6c, 0x6d, 0x6e, 0x6f,
0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77,
0x78, 0x79, 0x7a, 0x7b, 0x7c, 0x7d, 0x7e, 0x7f,
0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
0x88, 0x89, 0x9a, 0x8b, 0x9c, 0x9d, 0x9e, 0x9f,
0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f,
0xa0, 0xa1, 0xa2, 0xb3, 0xa4, 0xb9, 0xa6, 0xdf,
0xa8, 0xa9, 0xba, 0xab, 0xac, 0xad, 0xae, 0xbf,
0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
0xb8, 0xb9, 0xba, 0xbb, 0xbe, 0xbd, 0xbe, 0xbf,
0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xd7,
0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xdf,
0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff
};
static uchar to_upper_win1250ch[] = {
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f,
0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27,
0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f,
0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f,
0x40, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47,
0x48, 0x49, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4f,
0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57,
0x58, 0x59, 0x5a, 0x5b, 0x5c, 0x5d, 0x5e, 0x5f,
0x60, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47,
0x48, 0x49, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4f,
0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57,
0x58, 0x59, 0x5a, 0x7b, 0x7c, 0x7d, 0x7e, 0x7f,
0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
0x98, 0x99, 0x8a, 0x9b, 0x8c, 0x8d, 0x8e, 0x8f,
0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
0xb0, 0xb1, 0xb2, 0xa3, 0xb4, 0xb5, 0xb6, 0xb7,
0xb8, 0xa5, 0xaa, 0xbb, 0xbc, 0xbd, 0xbc, 0xaf,
0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7,
0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xa7,
0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xf7,
0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xff
};
static uchar sort_order_win1250ch[] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111,
112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127,
128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143,
144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159,
160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175,
176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191,
192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207,
208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223,
224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239,
240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255
};
static uchar _sort_order_win1250ch1[] = {
0x81, 0x81, 0x81, 0x81, 0x81, 0x81, 0x81, 0x81,
0x81, 0x81, 0x81, 0x81, 0x81, 0x81, 0x81, 0x81,
0x81, 0x81, 0x81, 0x81, 0x81, 0x81, 0x81, 0x81,
0x81, 0x81, 0x81, 0x81, 0x81, 0x81, 0x81, 0x81,
/* space ord 32 0x20 */
0x82, 0x83, 0x84, 0x85, 0x86, 0x87, 0x88, 0x89,
0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, 0x90, 0x91,
/* 0 ord 48 0x30 */
0x92, 0x93, 0x94, 0x95, 0x96, 0x97, 0x98, 0x99,
0x9a, 0x9b,
/* colon ord 58 0x3a */
0x9c, 0x9d, 0x9e, 0x9f, 0xa0, 0xa1,
0xa2,
/* A ord 65 0x41 */
0xa4, 0xa5,
/* C ord 67 0x43 */
0xff, 0xa8, 0xa9, 0xaa, 0xab,
0xac, 0xae, 0xaf, 0xb0, 0xb1, 0xb2, 0xb3, 0xb4,
0xb5, 0xb6,
/* R ord 82 0x52 */
0xb7,
/* S ord 83 0x53 */
0xb9, 0xbc, 0xbd, 0xbe, 0xbf,
0xc0, 0xc1, 0xc2,
/* [ ord 91 0x5b */
0xc4, 0xc5, 0xc6, 0xc7, 0xc8,
0xc9,
/* a ord 97 0x61 */
0xa4, 0xa5, 0xff, 0xa8, 0xa9, 0xaa, 0xab,
0xac, 0xae, 0xaf, 0xb0, 0xb1, 0xb2, 0xb3, 0xb4,
0xb5, 0xb6, 0xb7, 0xb9, 0xbc, 0xbd, 0xbe, 0xbf,
0xc0, 0xc1, 0xc2,
/* { ord 123 0x7b */
0xca, 0xcb, 0xcc, 0xcd, 0x81,
0x81, 0x81, 0xce, 0x81, 0xcf, 0xd0, 0xd1, 0xd2,
0x81, 0xd3,
/* Scaron ord 138 0x8a */
0xba, 0xd4, 0xb9, 0xbc, 0xc3, 0xc2,
0x81, 0xd5, 0xd6, 0xd7, 0xd8, 0xd9, 0xda, 0xdb,
0x81, 0xdc, 0xba, 0xdd, 0xb9, 0xbc, 0xc3, 0xc2,
/* nobreakspace ord 160 0xa0 */
0x82, 0xde, 0xdf, 0xb1, 0xe0, 0xa4, 0xe1, 0xe2,
0xe3, 0xe4, 0xb9, 0xe5, 0xe6, 0xe7, 0xe8, 0xc2,
0xe9, 0xea, 0xeb, 0xb1, 0xed, 0xee, 0x81, 0xef,
/* cedilla ord 183 0xb8 */
0xf0, 0xa4, 0xb9, 0xf1, 0xb1, 0xf2, 0xb1, 0xc2,
0xb7, 0xa4, 0xa4, 0xa4, 0xa4, 0xb1, 0xa6, 0xa6,
0xa7, 0xa9, 0xa9, 0xa9, 0xa9, 0xae, 0xae, 0xa8,
/* Eth ord 208 0xd0 */
0xa8, 0xb3, 0xb3, 0xb4, 0xb4, 0xb4, 0xb4, 0xf3,
0xb8, 0xbd, 0xbd, 0xbd, 0xbd, 0xc1, 0xbc, 0xbb,
/* racute ord 224 0xe0 */
0xb7, 0xa4, 0xa4, 0xa4, 0xa4, 0xb1, 0xa6, 0xa6,
0xa7, 0xa9, 0xa9, 0xa9, 0xa9, 0xae, 0xae, 0xa8,
/* eth ord 240 0xf0 */
0xa8, 0xb3, 0xb3, 0xb4, 0xb4, 0xb4, 0xb4, 0xf4,
0xb8, 0xbd, 0xbd, 0xbd, 0xbd, 0xc1, 0xbc, 0xf5
};
static uchar _sort_order_win1250ch2[] = {
0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09,
0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x11,
0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19,
0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f, 0x20, 0x21,
/* space ord 32 0x20 */
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
/* 0 ord 48 0x30 */
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
0x01, 0x01,
/* colon ord 58 0x3a */
0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
0x01,
/* A ord 65 0x41 */
0x01, 0x01,
/* C ord 67 0x43 */
0xff, 0x01, 0x01, 0x01, 0x01,
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
0x01, 0x01,
/* R ord 82 0x52 */
0x01,
/* S ord 83 0x53 */
0x01, 0x01, 0x01, 0x01, 0x01,
0x01, 0x01, 0x01,
/* [ ord 91 0x5b */
0x01, 0x01, 0x01, 0x01, 0x01,
0x01,
/* a ord 97 0x61 */
0x02, 0x02, 0xff, 0x02, 0x02, 0x02, 0x02,
0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02,
0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02,
0x02, 0x02, 0x02,
/* { ord 123 0x7b */
0x01, 0x01, 0x01, 0x01, 0x22,
0x23, 0x24, 0x01, 0x25, 0x01, 0x01, 0x01, 0x01,
0x26, 0x01,
/* Scaron ord 138 0x8a */
0x01, 0x01, 0x03, 0x03, 0x01, 0x05,
0x27, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
0x28, 0x01, 0x02, 0x01, 0x04, 0x04, 0x02, 0x06,
/* nobreakspace ord 160 0xa0 */
0x02, 0x01, 0x01, 0x07, 0x01, 0x11, 0x01, 0x01,
0x01, 0x01, 0x05, 0x01, 0x01, 0x01, 0x01, 0x03,
0x01, 0x01, 0x01, 0x08, 0x01, 0x01, 0x29, 0x01,
/* cedilla ord 184 0xb8 */
0x01, 0x12, 0x06, 0x01, 0x05, 0x01, 0x06, 0x04,
0x03, 0x03, 0x05, 0x07, 0x09, 0x03, 0x03, 0x05,
0x01, 0x03, 0x09, 0x07, 0x05, 0x03, 0x05, 0x03,
/* Eth ord 208 0xd0 */
0x05, 0x03, 0x05, 0x03, 0x05, 0x09, 0x07, 0x01,
0x01, 0x05, 0x03, 0x09, 0x07, 0x03, 0x05, 0x01,
/* racute ord 224 0xe0 */
0x04, 0x04, 0x06, 0x08, 0x0a, 0x04, 0x04, 0x06,
0x02, 0x04, 0x0a, 0x08, 0x06, 0x04, 0x06, 0x04,
/* eth ord 240 0xf0 */
0x06, 0x04, 0x06, 0x04, 0x06, 0x0a, 0x08, 0x01,
0x02, 0x06, 0x04, 0x0a, 0x08, 0x04, 0x06, 0x01
};
struct wordvalue {
const uchar *word;
uchar pass1;
uchar pass2;
};
static struct wordvalue doubles[] = {
{ (uchar*) "ch", 0xad, 0x03 },
{ (uchar*) "c", 0xa6, 0x02 },
{ (uchar*) "Ch", 0xad, 0x02 },
{ (uchar*) "CH", 0xad, 0x01 },
{ (uchar*) "C", 0xa6, 0x01 },
};
#define NEXT_CMP_VALUE(src, p, pass, value, len) \
while (1) { \
if (IS_END(p, src, len)) { \
if (pass == 0 && len > 0) { p= src; pass++; } \
else { value = 0; break; } \
} \
value = ((pass == 0) ? _sort_order_win1250ch1[*p] \
: _sort_order_win1250ch2[*p]); \
if (value == 0xff) { \
int i; \
for (i = 0; i < (int) sizeof(doubles); i++) { \
const uchar *patt = doubles[i].word; \
const uchar *q = (const uchar *) p; \
while (*patt \
&& !(IS_END(q, src, len)) \
&& (*patt == *q)) { \
patt++; q++; \
} \
if (!(*patt)) { \
value = (int)((pass == 0) \
? doubles[i].pass1 \
: doubles[i].pass2); \
p = (const uchar *) q - 1; \
break; \
} \
} \
} \
p++; \
break; \
}
#define IS_END(p, src, len) (((char *)p - (char *)src) >= (len))
static int my_strnncoll_win1250ch(CHARSET_INFO *cs __attribute__((unused)),
const uchar *s1, size_t len1,
const uchar *s2, size_t len2,
my_bool s2_is_prefix)
{
int v1, v2;
const uchar *p1, * p2;
int pass1 = 0, pass2 = 0;
int diff;
if (s2_is_prefix && len1 > len2)
len1=len2;
p1 = s1; p2 = s2;
do
{
NEXT_CMP_VALUE(s1, p1, pass1, v1, (int)len1);
NEXT_CMP_VALUE(s2, p2, pass2, v2, (int)len2);
if ((diff = v1 - v2))
return diff;
} while (v1);
return 0;
}
/*
TODO: Has to be fixed as strnncollsp in ctype-simple
*/
static
int my_strnncollsp_win1250ch(CHARSET_INFO * cs,
const uchar *s, size_t slen,
const uchar *t, size_t tlen,
my_bool diff_if_only_endspace_difference
__attribute__((unused)))
{
for ( ; slen && s[slen-1] == ' ' ; slen--);
for ( ; tlen && t[tlen-1] == ' ' ; tlen--);
return my_strnncoll_win1250ch(cs,s,slen,t,tlen,0);
}
static size_t my_strnxfrm_win1250ch(CHARSET_INFO * cs __attribute__((unused)),
uchar *dest, size_t len,
const uchar *src, size_t srclen)
{
int value;
const uchar *p;
int pass = 0;
size_t totlen = 0;
p = src;
do {
NEXT_CMP_VALUE(src, p, pass, value, (int)srclen);
if (totlen <= len)
dest[totlen] = value;
totlen++;
} while (value) ;
if (len > totlen)
bfill(dest + totlen, len - totlen, ' ');
return len;
}
#undef IS_END
#ifdef REAL_MYSQL
static uchar like_range_prefix_min_win1250ch[]=
{
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F,
0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27,
0x28, 0x29, 0x2A, 0x2B, 0x2C, 0x2D, 0x2E, 0x2F,
0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
0x38, 0x39, 0x3A, 0x3B, 0x3C, 0x3D, 0x3E, 0x3F,
0x40, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47,
0x48, 0x49, 0x4A, 0x4B, 0x4C, 0x4D, 0x4E, 0x4F,
0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57,
0x58, 0x59, 0x5A, 0x5B, 0x5C, 0x5D, 0x5E, 0x5F,
0x60, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67,
0x68, 0x69, 0x6A, 0x6B, 0x6C, 0x6D, 0x6E, 0x6F,
0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77,
0x78, 0x79, 0x7A, 0x7B, 0x7C, 0x7D, 0x7E, 0x7F,
0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
0x88, 0x89, 0x8A, 0x8B, 0x8C, 0x8D, 0x8E, 0x8F,
0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
0x98, 0x99, 0x9A, 0x9B, 0x9C, 0x9D, 0x9E, 0x9F,
0xA0, 0xA1, 0xA2, 0xA3, 0xA4, 0xA5, 0xA6, 0xA7,
0xA8, 0xA9, 0xAA, 0xAB, 0xAC, 0xAD, 0xAE, 0xAF,
0xB0, 0xB1, 0xB2, 0xB3, 0xB4, 0xB5, 0xB6, 0xB7,
0xB8, 0xB9, 0xBA, 0xBB, 0xBC, 0xBD, 0xBE, 0xBF,
0xC0, 0xC1, 0xC2, 0xC3, 0xC4, 0xC5, 0xC6, 0xC7,
0xC8, 0xC9, 0xCA, 0xCB, 0xCC, 0xCD, 0xCE, 0xCF,
0xD0, 0xD1, 0xD2, 0xD3, 0xD4, 0xD5, 0xD6, 0xD7,
0xD8, 0xD9, 0xDA, 0xDB, 0xDC, 0xDD, 0xDE, 0xDF,
0xE0, 0xE1, 0xE2, 0xE3, 0xE4, 0xE5, 0xE6, 0xE7,
0xE8, 0xE9, 0xEA, 0xEB, 0xEC, 0xED, 0xEE, 0xEF,
0xF0, 0xF1, 0xF2, 0xF3, 0xF4, 0xF5, 0xF6, 0xF7,
0xF8, 0xF9, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF
};
/*
The letter "C" is a special case:
"CH" is sorted between "H" and "I".
prefix_max for "C" is "I": prefix_max[0x43] == 0x49
prefix_max for "c" is "i": prefix_max[0x63] == 0x69
For all other characters: prefix_max[i] == i
*/
static uchar like_range_prefix_max_win1250ch[]=
{
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F,
0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27,
0x28, 0x29, 0x2A, 0x2B, 0x2C, 0x2D, 0x2E, 0x2F,
0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
0x38, 0x39, 0x3A, 0x3B, 0x3C, 0x3D, 0x3E, 0x3F,
0x40, 0x41, 0x42, 0x49, 0x44, 0x45, 0x46, 0x47,
0x48, 0x49, 0x4A, 0x4B, 0x4C, 0x4D, 0x4E, 0x4F,
0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57,
0x58, 0x59, 0x5A, 0x5B, 0x5C, 0x5D, 0x5E, 0x5F,
0x60, 0x61, 0x62, 0x69, 0x64, 0x65, 0x66, 0x67,
0x68, 0x69, 0x6A, 0x6B, 0x6C, 0x6D, 0x6E, 0x6F,
0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77,
0x78, 0x79, 0x7A, 0x7B, 0x7C, 0x7D, 0x7E, 0x7F,
0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
0x88, 0x89, 0x8A, 0x8B, 0x8C, 0x8D, 0x8E, 0x8F,
0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
0x98, 0x99, 0x9A, 0x9B, 0x9C, 0x9D, 0x9E, 0x9F,
0xA0, 0xA1, 0xA2, 0xA3, 0xA4, 0xA5, 0xA6, 0xA7,
0xA8, 0xA9, 0xAA, 0xAB, 0xAC, 0xAD, 0xAE, 0xAF,
0xB0, 0xB1, 0xB2, 0xB3, 0xB4, 0xB5, 0xB6, 0xB7,
0xB8, 0xB9, 0xBA, 0xBB, 0xBC, 0xBD, 0xBE, 0xBF,
0xC0, 0xC1, 0xC2, 0xC3, 0xC4, 0xC5, 0xC6, 0xC7,
0xC8, 0xC9, 0xCA, 0xCB, 0xCC, 0xCD, 0xCE, 0xCF,
0xD0, 0xD1, 0xD2, 0xD3, 0xD4, 0xD5, 0xD6, 0xD7,
0xD8, 0xD9, 0xDA, 0xDB, 0xDC, 0xDD, 0xDE, 0xDF,
0xE0, 0xE1, 0xE2, 0xE3, 0xE4, 0xE5, 0xE6, 0xE7,
0xE8, 0xE9, 0xEA, 0xEB, 0xEC, 0xED, 0xEE, 0xEF,
0xF0, 0xF1, 0xF2, 0xF3, 0xF4, 0xF5, 0xF6, 0xF7,
0xF8, 0xF9, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF
};
#define min_sort_char '\x20'
#define max_sort_char '\xff'
/*
** Calculate min_str and max_str that ranges a LIKE string.
** Arguments:
** ptr Pointer to LIKE string.
** ptr_length Length of LIKE string.
** escape Escape character in LIKE. (Normally '\').
** All escape characters should be removed from min_str and max_str
** res_length Length of min_str and max_str.
** min_str Smallest case sensitive string that ranges LIKE.
** Should be space padded to res_length.
** max_str Largest case sensitive string that ranges LIKE.
** Normally padded with the biggest character sort value.
**
** The function should return 0 if ok and 1 if the LIKE string can't be
** optimized !
*/
static my_bool
my_like_range_win1250ch(CHARSET_INFO *cs __attribute__((unused)),
const char *ptr, size_t ptr_length,
pbool escape, pbool w_one, pbool w_many,
size_t res_length,
char *min_str, char *max_str,
size_t *min_length, size_t *max_length)
{
int only_min_found= 1;
const char *end = ptr + ptr_length;
char *min_org = min_str;
char *min_end = min_str + res_length;
/* return 1; */
for (; ptr != end && min_str != min_end ; ptr++)
{
if (*ptr == escape && ptr+1 != end)
ptr++; /* Skip escape */
else if (*ptr == w_one || *ptr == w_many) /* '_' or '%' in SQL */
break;
*min_str= like_range_prefix_min_win1250ch[(uint) (uchar) (*ptr)];
if (*min_str != min_sort_char)
only_min_found= 0;
min_str++;
*max_str++= like_range_prefix_max_win1250ch[(uint) (uchar) (*ptr)];
}
if (cs->state & MY_CS_BINSORT)
*min_length= (size_t) (min_str - min_org);
else
{
/* 'a\0\0... is the smallest possible string */
*min_length= res_length;
}
/* a\ff\ff... is the biggest possible string */
*max_length= res_length;
while (min_str != min_end)
{
*min_str++ = min_sort_char;
*max_str++ = max_sort_char;
}
return (only_min_found);
}
static MY_COLLATION_HANDLER my_collation_czech_ci_handler =
{
NULL, /* init */
my_strnncoll_win1250ch,
my_strnncollsp_win1250ch,
my_strnxfrm_win1250ch,
my_strnxfrmlen_simple,
my_like_range_win1250ch,
my_wildcmp_8bit,
my_strcasecmp_8bit,
my_instr_simple,
my_hash_sort_simple,
my_propagate_simple
};
CHARSET_INFO my_charset_cp1250_czech_ci =
{
34,0,0, /* number */
MY_CS_COMPILED|MY_CS_STRNXFRM|MY_CS_CSSORT, /* state */
"cp1250", /* cs name */
"cp1250_czech_cs", /* name */
"", /* comment */
NULL, /* tailoring */
ctype_win1250ch,
to_lower_win1250ch,
to_upper_win1250ch,
sort_order_win1250ch,
NULL, /* contractions */
NULL, /* sort_order_big*/
tab_cp1250_uni, /* tab_to_uni */
idx_uni_cp1250, /* tab_from_uni */
my_unicase_default, /* caseinfo */
NULL, /* state_map */
NULL, /* ident_map */
2, /* strxfrm_multiply */
1, /* caseup_multiply */
1, /* casedn_multiply */
1, /* mbminlen */
1, /* mbmaxlen */
0, /* min_sort_char */
0, /* max_sort_char */
' ', /* pad char */
0, /* escape_with_backslash_is_dangerous */
&my_charset_8bit_handler,
&my_collation_czech_ci_handler
};
#endif /* REAL_MYSQL */
#endif /* HAVE_CHARSET_cp1250 */

View File

@@ -0,0 +1,430 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#include <my_global.h>
#include <m_ctype.h>
#include <my_xml.h>
#ifndef SCO
#include <m_string.h>
#endif
/*
This files implements routines which parse XML based
character set and collation description files.
Unicode collations are encoded according to
Unicode Technical Standard #35
Locale Data Markup Language (LDML)
http://www.unicode.org/reports/tr35/
and converted into ICU string according to
Collation Customization
http://oss.software.ibm.com/icu/userguide/Collate_Customization.html
*/
static char *mstr(char *str,const char *src,size_t l1,size_t l2)
{
l1= l1<l2 ? l1 : l2;
memcpy(str,src,l1);
str[l1]='\0';
return str;
}
struct my_cs_file_section_st
{
int state;
const char *str;
};
#define _CS_MISC 1
#define _CS_ID 2
#define _CS_CSNAME 3
#define _CS_FAMILY 4
#define _CS_ORDER 5
#define _CS_COLNAME 6
#define _CS_FLAG 7
#define _CS_CHARSET 8
#define _CS_COLLATION 9
#define _CS_UPPERMAP 10
#define _CS_LOWERMAP 11
#define _CS_UNIMAP 12
#define _CS_COLLMAP 13
#define _CS_CTYPEMAP 14
#define _CS_PRIMARY_ID 15
#define _CS_BINARY_ID 16
#define _CS_CSDESCRIPT 17
#define _CS_RESET 18
#define _CS_DIFF1 19
#define _CS_DIFF2 20
#define _CS_DIFF3 21
#define _CS_IDENTICAL 22
static struct my_cs_file_section_st sec[] =
{
{_CS_MISC, "xml"},
{_CS_MISC, "xml/version"},
{_CS_MISC, "xml/encoding"},
{_CS_MISC, "charsets"},
{_CS_MISC, "charsets/max-id"},
{_CS_CHARSET, "charsets/charset"},
{_CS_PRIMARY_ID, "charsets/charset/primary-id"},
{_CS_BINARY_ID, "charsets/charset/binary-id"},
{_CS_CSNAME, "charsets/charset/name"},
{_CS_FAMILY, "charsets/charset/family"},
{_CS_CSDESCRIPT, "charsets/charset/description"},
{_CS_MISC, "charsets/charset/alias"},
{_CS_MISC, "charsets/charset/ctype"},
{_CS_CTYPEMAP, "charsets/charset/ctype/map"},
{_CS_MISC, "charsets/charset/upper"},
{_CS_UPPERMAP, "charsets/charset/upper/map"},
{_CS_MISC, "charsets/charset/lower"},
{_CS_LOWERMAP, "charsets/charset/lower/map"},
{_CS_MISC, "charsets/charset/unicode"},
{_CS_UNIMAP, "charsets/charset/unicode/map"},
{_CS_COLLATION, "charsets/charset/collation"},
{_CS_COLNAME, "charsets/charset/collation/name"},
{_CS_ID, "charsets/charset/collation/id"},
{_CS_ORDER, "charsets/charset/collation/order"},
{_CS_FLAG, "charsets/charset/collation/flag"},
{_CS_COLLMAP, "charsets/charset/collation/map"},
{_CS_RESET, "charsets/charset/collation/rules/reset"},
{_CS_DIFF1, "charsets/charset/collation/rules/p"},
{_CS_DIFF2, "charsets/charset/collation/rules/s"},
{_CS_DIFF3, "charsets/charset/collation/rules/t"},
{_CS_IDENTICAL, "charsets/charset/collation/rules/i"},
{0, NULL}
};
static struct my_cs_file_section_st * cs_file_sec(const char *attr, size_t len)
{
struct my_cs_file_section_st *s;
for (s=sec; s->str; s++)
{
if (!strncmp(attr,s->str,len))
return s;
}
return NULL;
}
#define MY_CS_CSDESCR_SIZE 64
#define MY_CS_TAILORING_SIZE 1024
typedef struct my_cs_file_info
{
char csname[MY_CS_NAME_SIZE];
char name[MY_CS_NAME_SIZE];
uchar ctype[MY_CS_CTYPE_TABLE_SIZE];
uchar to_lower[MY_CS_TO_LOWER_TABLE_SIZE];
uchar to_upper[MY_CS_TO_UPPER_TABLE_SIZE];
uchar sort_order[MY_CS_SORT_ORDER_TABLE_SIZE];
uint16 tab_to_uni[MY_CS_TO_UNI_TABLE_SIZE];
char comment[MY_CS_CSDESCR_SIZE];
char tailoring[MY_CS_TAILORING_SIZE];
size_t tailoring_length;
CHARSET_INFO cs;
int (*add_collation)(CHARSET_INFO *cs);
} MY_CHARSET_LOADER;
static int fill_uchar(uchar *a,uint size,const char *str, size_t len)
{
uint i= 0;
const char *s, *b, *e=str+len;
for (s=str ; s < e ; i++)
{
for ( ; (s < e) && strchr(" \t\r\n",s[0]); s++) ;
b=s;
for ( ; (s < e) && !strchr(" \t\r\n",s[0]); s++) ;
if (s == b || i > size)
break;
a[i]= (uchar) strtoul(b,NULL,16);
}
return 0;
}
static int fill_uint16(uint16 *a,uint size,const char *str, size_t len)
{
uint i= 0;
const char *s, *b, *e=str+len;
for (s=str ; s < e ; i++)
{
for ( ; (s < e) && strchr(" \t\r\n",s[0]); s++) ;
b=s;
for ( ; (s < e) && !strchr(" \t\r\n",s[0]); s++) ;
if (s == b || i > size)
break;
a[i]= (uint16) strtol(b,NULL,16);
}
return 0;
}
static int cs_enter(MY_XML_PARSER *st,const char *attr, size_t len)
{
struct my_cs_file_info *i= (struct my_cs_file_info *)st->user_data;
struct my_cs_file_section_st *s= cs_file_sec(attr,len);
if ( s && (s->state == _CS_CHARSET))
bzero(&i->cs,sizeof(i->cs));
if (s && (s->state == _CS_COLLATION))
i->tailoring_length= 0;
return MY_XML_OK;
}
static int cs_leave(MY_XML_PARSER *st,const char *attr, size_t len)
{
struct my_cs_file_info *i= (struct my_cs_file_info *)st->user_data;
struct my_cs_file_section_st *s= cs_file_sec(attr,len);
int state= s ? s->state : 0;
int rc;
switch(state){
case _CS_COLLATION:
rc= i->add_collation ? i->add_collation(&i->cs) : MY_XML_OK;
break;
default:
rc=MY_XML_OK;
}
return rc;
}
static int cs_value(MY_XML_PARSER *st,const char *attr, size_t len)
{
struct my_cs_file_info *i= (struct my_cs_file_info *)st->user_data;
struct my_cs_file_section_st *s;
int state= (int)((s=cs_file_sec(st->attr, strlen(st->attr))) ? s->state :
0);
switch (state) {
case _CS_ID:
i->cs.number= strtol(attr,(char**)NULL,10);
break;
case _CS_BINARY_ID:
i->cs.binary_number= strtol(attr,(char**)NULL,10);
break;
case _CS_PRIMARY_ID:
i->cs.primary_number= strtol(attr,(char**)NULL,10);
break;
case _CS_COLNAME:
i->cs.name=mstr(i->name,attr,len,MY_CS_NAME_SIZE-1);
break;
case _CS_CSNAME:
i->cs.csname=mstr(i->csname,attr,len,MY_CS_NAME_SIZE-1);
break;
case _CS_CSDESCRIPT:
i->cs.comment=mstr(i->comment,attr,len,MY_CS_CSDESCR_SIZE-1);
break;
case _CS_FLAG:
if (!strncmp("primary",attr,len))
i->cs.state|= MY_CS_PRIMARY;
else if (!strncmp("binary",attr,len))
i->cs.state|= MY_CS_BINSORT;
else if (!strncmp("compiled",attr,len))
i->cs.state|= MY_CS_COMPILED;
break;
case _CS_UPPERMAP:
fill_uchar(i->to_upper,MY_CS_TO_UPPER_TABLE_SIZE,attr,len);
i->cs.to_upper=i->to_upper;
break;
case _CS_LOWERMAP:
fill_uchar(i->to_lower,MY_CS_TO_LOWER_TABLE_SIZE,attr,len);
i->cs.to_lower=i->to_lower;
break;
case _CS_UNIMAP:
fill_uint16(i->tab_to_uni,MY_CS_TO_UNI_TABLE_SIZE,attr,len);
i->cs.tab_to_uni=i->tab_to_uni;
break;
case _CS_COLLMAP:
fill_uchar(i->sort_order,MY_CS_SORT_ORDER_TABLE_SIZE,attr,len);
i->cs.sort_order=i->sort_order;
break;
case _CS_CTYPEMAP:
fill_uchar(i->ctype,MY_CS_CTYPE_TABLE_SIZE,attr,len);
i->cs.ctype=i->ctype;
break;
case _CS_RESET:
case _CS_DIFF1:
case _CS_DIFF2:
case _CS_DIFF3:
case _CS_IDENTICAL:
{
/*
Convert collation description from
Locale Data Markup Language (LDML)
into ICU Collation Customization expression.
*/
char arg[16];
const char *cmd[]= {"&","<","<<","<<<","="};
i->cs.tailoring= i->tailoring;
mstr(arg,attr,len,sizeof(arg)-1);
if (i->tailoring_length + 20 < sizeof(i->tailoring))
{
char *dst= i->tailoring_length + i->tailoring;
i->tailoring_length+= sprintf(dst," %s %s",cmd[state-_CS_RESET],arg);
}
}
}
return MY_XML_OK;
}
my_bool my_parse_charset_xml(const char *buf, size_t len,
int (*add_collation)(CHARSET_INFO *cs))
{
MY_XML_PARSER p;
struct my_cs_file_info i;
my_bool rc;
my_xml_parser_create(&p);
my_xml_set_enter_handler(&p,cs_enter);
my_xml_set_value_handler(&p,cs_value);
my_xml_set_leave_handler(&p,cs_leave);
i.add_collation= add_collation;
my_xml_set_user_data(&p,(void*)&i);
rc= (my_xml_parse(&p,buf,len) == MY_XML_OK) ? FALSE : TRUE;
my_xml_parser_free(&p);
return rc;
}
/*
Check repertoire: detect pure ascii strings
*/
uint
my_string_repertoire(CHARSET_INFO *cs, const char *str, ulong length)
{
const char *strend= str + length;
if (cs->mbminlen == 1)
{
for ( ; str < strend; str++)
{
if (((uchar) *str) > 0x7F)
return MY_REPERTOIRE_UNICODE30;
}
}
else
{
my_wc_t wc;
int chlen;
for (;
(chlen= cs->cset->mb_wc(cs, &wc, (uchar*) str, (uchar*) strend)) > 0;
str+= chlen)
{
if (wc > 0x7F)
return MY_REPERTOIRE_UNICODE30;
}
}
return MY_REPERTOIRE_ASCII;
}
/*
Returns repertoire for charset
*/
uint my_charset_repertoire(CHARSET_INFO *cs)
{
return cs->state & MY_CS_PUREASCII ?
MY_REPERTOIRE_ASCII : MY_REPERTOIRE_UNICODE30;
}
/*
Detect whether a character set is ASCII compatible.
Returns TRUE for:
- all 8bit character sets whose Unicode mapping of 0x7B is '{'
(ignores swe7 which maps 0x7B to "LATIN LETTER A WITH DIAERESIS")
- all multi-byte character sets having mbminlen == 1
(ignores ucs2 whose mbminlen is 2)
TODO:
When merging to 5.2, this function should be changed
to check a new flag MY_CS_NONASCII,
return (cs->flag & MY_CS_NONASCII) ? 0 : 1;
This flag was previously added into 5.2 under terms
of WL#3759 "Optimize identifier conversion in client-server protocol"
especially to mark character sets not compatible with ASCII.
We won't backport this flag to 5.0 or 5.1.
This function is Ok for 5.0 and 5.1, because we're not going
to introduce new tricky character sets between 5.0 and 5.2.
*/
my_bool
my_charset_is_ascii_based(CHARSET_INFO *cs)
{
return
(cs->mbmaxlen == 1 && cs->tab_to_uni && cs->tab_to_uni['{'] == '{') ||
(cs->mbminlen == 1 && cs->mbmaxlen > 1);
}
/*
Detect if a character set is 8bit,
and it is pure ascii, i.e. doesn't have
characters outside U+0000..U+007F
This functions is shared between "conf_to_src"
and dynamic charsets loader in "mysqld".
*/
my_bool
my_charset_is_8bit_pure_ascii(CHARSET_INFO *cs)
{
size_t code;
if (!cs->tab_to_uni)
return 0;
for (code= 0; code < 256; code++)
{
if (cs->tab_to_uni[code] > 0x7F)
return 0;
}
return 1;
}
/*
Shared function between conf_to_src and mysys.
Check if a 8bit character set is compatible with
ascii on the range 0x00..0x7F.
*/
my_bool
my_charset_is_ascii_compatible(CHARSET_INFO *cs)
{
uint i;
if (!cs->tab_to_uni)
return 1;
for (i= 0; i < 128; i++)
{
if (cs->tab_to_uni[i] != i)
return 0;
}
return 1;
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,164 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#include <my_global.h>
#include "m_string.h"
/*
_dig_vec arrays are public because they are used in several outer places.
*/
char _dig_vec_upper[] =
"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char _dig_vec_lower[] =
"0123456789abcdefghijklmnopqrstuvwxyz";
/*
Convert integer to its string representation in given scale of notation.
SYNOPSIS
int2str()
val - value to convert
dst - points to buffer where string representation should be stored
radix - radix of scale of notation
upcase - set to 1 if we should use upper-case digits
DESCRIPTION
Converts the (long) integer value to its character form and moves it to
the destination buffer followed by a terminating NUL.
If radix is -2..-36, val is taken to be SIGNED, if radix is 2..36, val is
taken to be UNSIGNED. That is, val is signed if and only if radix is.
All other radixes treated as bad and nothing will be changed in this case.
For conversion to decimal representation (radix is -10 or 10) one can use
optimized int10_to_str() function.
RETURN VALUE
Pointer to ending NUL character or NullS if radix is bad.
*/
char *
int2str(register long int val, register char *dst, register int radix,
int upcase)
{
char buffer[65];
register char *p;
long int new_val;
char *dig_vec= upcase ? _dig_vec_upper : _dig_vec_lower;
ulong uval= (ulong) val;
if (radix < 0)
{
if (radix < -36 || radix > -2)
return NullS;
if (val < 0)
{
*dst++ = '-';
/* Avoid integer overflow in (-val) for LONGLONG_MIN (BUG#31799). */
uval = (ulong)0 - uval;
}
radix = -radix;
}
else if (radix > 36 || radix < 2)
return NullS;
/*
The slightly contorted code which follows is due to the fact that
few machines directly support unsigned long / and %. Certainly
the VAX C compiler generates a subroutine call. In the interests
of efficiency (hollow laugh) I let this happen for the first digit
only; after that "val" will be in range so that signed integer
division will do. Sorry 'bout that. CHECK THE CODE PRODUCED BY
YOUR C COMPILER. The first % and / should be unsigned, the second
% and / signed, but C compilers tend to be extraordinarily
sensitive to minor details of style. This works on a VAX, that's
all I claim for it.
*/
p = &buffer[sizeof(buffer)-1];
*p = '\0';
new_val= uval / (ulong) radix;
*--p = dig_vec[(uchar) (uval- (ulong) new_val*(ulong) radix)];
val = new_val;
#ifdef HAVE_LDIV
while (val != 0)
{
ldiv_t res;
res=ldiv(val,radix);
*--p = dig_vec[res.rem];
val= res.quot;
}
#else
while (val != 0)
{
new_val=val/radix;
*--p = dig_vec[(uchar) (val-new_val*radix)];
val= new_val;
}
#endif
while ((*dst++ = *p++) != 0) ;
return dst-1;
}
/*
Converts integer to its string representation in decimal notation.
SYNOPSIS
int10_to_str()
val - value to convert
dst - points to buffer where string representation should be stored
radix - flag that shows whenever val should be taken as signed or not
DESCRIPTION
This is version of int2str() function which is optimized for normal case
of radix 10/-10. It takes only sign of radix parameter into account and
not its absolute value.
RETURN VALUE
Pointer to ending NUL character.
*/
char *int10_to_str(long int val,char *dst,int radix)
{
char buffer[65];
register char *p;
long int new_val;
unsigned long int uval = (unsigned long int) val;
if (radix < 0) /* -10 */
{
if (val < 0)
{
*dst++ = '-';
/* Avoid integer overflow in (-val) for LONGLONG_MIN (BUG#31799). */
uval = (unsigned long int)0 - uval;
}
}
p = &buffer[sizeof(buffer)-1];
*p = '\0';
new_val= (long) (uval / 10);
*--p = '0'+ (char) (uval - (unsigned long) new_val * 10);
val = new_val;
while (val != 0)
{
new_val=val/10;
*--p = '0' + (char) (val-new_val*10);
val= new_val;
}
while ((*dst++ = *p++) != 0) ;
return dst-1;
}

View File

@@ -0,0 +1,32 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/* File : is_prefix.c
Author : Michael Widenius
Defines: is_prefix()
is_prefix(s, t) returns 1 if s starts with t.
A empty t is allways a prefix.
*/
#include <my_global.h>
#include "m_string.h"
int is_prefix(register const char *s, register const char *t)
{
while (*t)
if (*s++ != *t++) return 0;
return 1; /* WRONG */
}

View File

@@ -0,0 +1,478 @@
; Character code definition file for latin2 languages (for use with cset)
; it's written for Czech, but may be used generally; works
; minimum for Slovenian alphabet too (or at least I hope so)
;
; Written by Jaromir Dolecek <dolecek@ics.muni.cz>
;
; Notes of defined sorting order:
; Upper/Lower case is ignored
; All characters with the accents are sorted after appropriate
; character without accent in order:
; Xacute, Xring , Xcaron, Xslash, Xcedilla, Xogonek, Xcircumflex,
; Xbreve, Xhungarumlaut, Xdieresis, Xdotaccent
;
latin2
***
NUL 0 C
SOH C
STX C
ETX C
EOT C
ENQ C
ACK C
BEL C
BS C
HT CS
LF CS
VT CS
FF CS
CR CS
SO C
SI C
DLE C
DC1 C
DC2 C
DC3 C
DC4 C
NAK C
SYN C
ETB C
CAN C
EM C
SUB C
ESC C
FS C
GS C
RS C
US C
/space BS
/exclam P
/quotedbl P
/numbersign P
/dollar P
/percent P
/ampersand P
/quoteright P
/parenleft P
/parenright P
/asterisk P
/plus P
/comma P
/minus P
/period P
/slash P
/zero NX
/one NX
/two NX
/three NX
/four NX
/five NX
/six NX
/seven NX
/eight NX
/nine NX
/colon P
/semicolon P
/less P
/equal P
/greater P
/question P
/at P
/A UX
/B UX
/C UX
/D UX
/E UX
/F UX
/G U
/H U
/I U
/J U
/K U
/L U
/M U
/N U
/O U
/P U
/Q U
/R U
/S U
/T U
/U U
/V U
/W U
/X U
/Y U
/Z U
/bracketleft P
/backslash P
/bracketright P
/asciicircum P
/underscore P
/quoteleft P
/a LX
/b LX
/c LX
/d LX
/e LX
/f LX
/g L
/h L
/i L
/j L
/k L
/l L
/m L
/n L
/o L
/p L
/q L
/r L
/s L
/t L
/u L
/v L
/w L
/x L
/y L
/z L
/braceleft P
/bar P
/braceright P
/tilde P
NUL_ C
SOH_ C
STX_ C
ETX_ C
EOT_ C
ENQ_ C
ACK_ C
BEL_ C
BS_ C
HT_ CS
LF_ CS
VT_ CS
FF_ CS
CR_ CS
SO_ C
SI_ C
DLE_ C
DC1_ C
DC2_ C
DC3_ C
DC4_ C
NAK_ C
SYN_ C
ETB_ C
CAN_ C
EM_ C
SUB_ C
ESC_ C
FS_ C
GS_ C
RS_ C
US_ C
/space_ SB
/Aogonek U
/breve P
/Lslash U
/currency P
/Lcaron U
/Sacute U
/dieresis P
/Scaron 169 U
/Scedilla U
/Tcaron U
/Zacute U
/hyphen P
/Zcaron U
/Zdotaccent U
/degree P
/aogonek L
/ogonek P
/lslash L
/acute P
/lcaron L
/sacute L
/caron P
/cedilla P
/scaron L
/scedilla L
/tcaron L
/zacute L
/hungarumlaut P
/zcaron L
/zdotaccent L
/Racute U
/Aacute U
/Acircumflex U
/Abreve U
/Adieresis U
/Lacute U
/Cacute U
/Ccedilla U
/Ccaron U
/Eacute U
/Eogonek U
/Edieresis U
/Ecaron U
/Iacute U
/Icircumflex U
/Dcaron U
/Eth P
/Nacute U
/Ncaron U
/Oacute U
/Ocircumflex U
/Ohungarumlaut U
/Odieresis U
/multiply P
/Rcaron U
/Uring U
/Uacute U
/Uhungarumlaut U
/Udieresis U
/Yacute U
/Tcedilla U
/germandbls P
/racute L
/aacute L
/acircumflex L
/abreve L
/adieresis L
/lacute L
/cacute L
/ccedilla L
/ccaron L
/eacute L
/eogonek L
/edieresis L
/ecaron L
/iacute L
/icircumflex L
/dcaron L
/dbar L
/nacute L
/ncaron L
/oacute L
/ocircumflex L
/ohungarumlaut L
/odieresis L
/divide P
/rcaron L
/uring L
/uacute L
/uhungarumlaut L
/udieresis L
/yacute L
/tcedilla L
/dotaccent P
***
/A /a
/B /b
/C /c
/D /d
/E /e
/F /f
/G /g
/H /h
/I /i
/J /j
/K /k
/L /l
/M /m
/N /n
/O /o
/P /p
/Q /q
/R /r
/S /s
/T /t
/U /u
/V /v
/W /w
/X /x
/Y /y
/Z /z
/Aogonek /aogonek
/Lslash /lslash
/Lcaron /lcaron
/Sacute /sacute
/Scaron /scaron
/Scedilla /scedilla
/Tcaron /tcaron
/Zacute /zacute
/Zcaron /zcaron
/Zdotaccent /zdotaccent
/Racute /racute
/Aacute /aacute
/Acircumflex /acircumflex
/Abreve /abreve
/Adieresis /adieresis
/Lacute /lacute
/Cacute /cacute
/Ccedilla /ccedilla
/Ccaron /ccaron
/Eacute /eacute
/Eogonek /eogonek
/Edieresis /edieresis
/Ecaron /ecaron
/Iacute /iacute
/Icircumflex /icircumflex
/Dcaron /dcaron
/Nacute /nacute
/Ncaron /ncaron
/Oacute /oacute
/Ocircumflex /ocircumflex
/Ohungarumlaut /ohungarumlaut
/Odieresis /odieresis
/Rcaron /rcaron
/Uring /uring
/Uacute /uacute
/Uhungarumlaut /uhungarumlaut
/Udieresis /udieresis
/Yacute /yacute
/Tcedilla /tcedilla
***
NUL NUL_
SOH SOH_
STX STX_
ETX ETX_
EOT EOT_
ENQ ENQ_
ACK ACK_
BEL BEL_
BS BS_
HT HT_
LF LF_
VT VT_
FF FF_
CR CR_
SO SO_
SI SI_
DLE DLE_
DC1 DC1_
DC2 DC2_
DC3 DC3_
DC4 DC4_
NAK NAK_
SYN SYN_
ETB ETB_
CAN CAN_
EM EM_
SUB SUB_
ESC ESC_
FS FS_
GS GS_
RS RS_
US US_
/space
/exclam
/quotedbl
/numbersign
/dollar
/percent
/ampersand
/quoteright
/parenleft
/parenright
/asterisk
/plus
/comma
/minus
/period
/slash
/zero
/one
/two
/three
/four
/five
/six
/seven
/eight
/nine
/colon
/semicolon
/less
/equal
/greater
/question
/at
/A /a
/Aogonek /aogonek
/Aacute /aacute
/Acircumflex /acircumflex
/Abreve /abreve
/Adieresis /adieresis
/B /b
/C /c
/Cacute /cacute
/Ccaron /ccaron
/Ccedilla /ccedilla
/D /d
/Dcaron /dcaron
/E /e
/Eacute /eacute
/Ecaron /ecaron
/Eogonek /eogonek
/Edieresis /edieresis
/F /f
/G /g
/H /h
/I /i
/Icircumflex
/icircumflex
/Iacute /iacute
/J /j
/K /k
/L /l
/Lslash /lslash
/Lcaron /lcaron
/Lacute /lacute
/M /m
/N /n
/Nacute /nacute
/Ncaron /ncaron
/O /o
/Oacute /oacute
/Ocircumflex /ocircumflex
/Ohungarumlaut /ohungarumlaut
/Odieresis /odieresis
/P /p
/Q /q
/R /r
/Racute /racute
/Rcaron /rcaron
/S /s
/Sacute /sacute
/Scaron /scaron
/Scedilla /scedilla
/T /t
/Tcaron /tcaron
/Tcedilla /tcedilla
/U /u
/Uacute /uacute
/Uring /uring
/Uhungarumlaut /uhungarumlaut
/Udieresis /udieresis
/V /v
/W /w
/X /x
/Y /y
/Yacute /yacute
/Z /z
/Zacute /zacute
/Zcaron /zcaron
/Zdotaccent /zdotaccent
/bracketleft
/backslash
/bracketright
/asciicircum
/underscore
/quoteleft
/braceleft
/bar
/braceright
/tilde
***

View File

@@ -0,0 +1,40 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
Defines: llstr();
llstr(value, buff);
This function saves a longlong value in a buffer and returns the pointer to
the buffer. This is useful when trying to portable print longlong
variables with printf() as there is no usable printf() standard one can use.
*/
#include <my_global.h>
#include "m_string.h"
char *llstr(longlong value,char *buff)
{
longlong10_to_str(value,buff,-10);
return buff;
}
char *ullstr(longlong value,char *buff)
{
longlong10_to_str(value,buff,10);
return buff;
}

View File

@@ -0,0 +1,143 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
Defines: longlong2str();
longlong2str(dst, radix, val)
converts the (longlong) integer "val" to character form and moves it to
the destination string "dst" followed by a terminating NUL. The
result is normally a pointer to this NUL character, but if the radix
is dud the result will be NullS and nothing will be changed.
If radix is -2..-36, val is taken to be SIGNED.
If radix is 2.. 36, val is taken to be UNSIGNED.
That is, val is signed if and only if radix is. You will normally
use radix -10 only through itoa and ltoa, for radix 2, 8, or 16
unsigned is what you generally want.
_dig_vec is public just in case someone has a use for it.
The definitions of itoa and ltoa are actually macros in m_string.h,
but this is where the code is.
Note: The standard itoa() returns a pointer to the argument, when int2str
returns the pointer to the end-null.
itoa assumes that 10 -base numbers are allways signed and other arn't.
*/
#include <my_global.h>
#include "m_string.h"
#ifndef ll2str
/*
This assumes that longlong multiplication is faster than longlong division.
*/
char *ll2str(longlong val,char *dst,int radix, int upcase)
{
char buffer[65];
register char *p;
long long_val;
char *dig_vec= upcase ? _dig_vec_upper : _dig_vec_lower;
ulonglong uval= (ulonglong) val;
if (radix < 0)
{
if (radix < -36 || radix > -2) return (char*) 0;
if (val < 0) {
*dst++ = '-';
/* Avoid integer overflow in (-val) for LONGLONG_MIN (BUG#31799). */
uval = (ulonglong)0 - uval;
}
radix = -radix;
}
else
{
if (radix > 36 || radix < 2) return (char*) 0;
}
if (uval == 0)
{
*dst++='0';
*dst='\0';
return dst;
}
p = &buffer[sizeof(buffer)-1];
*p = '\0';
while (uval > (ulonglong) LONG_MAX)
{
ulonglong quo= uval/(uint) radix;
uint rem= (uint) (uval- quo* (uint) radix);
*--p= dig_vec[rem];
uval= quo;
}
long_val= (long) uval;
while (long_val != 0)
{
long quo= long_val/radix;
*--p= dig_vec[(uchar) (long_val - quo*radix)];
long_val= quo;
}
while ((*dst++ = *p++) != 0) ;
return dst-1;
}
#endif
#ifndef longlong10_to_str
char *longlong10_to_str(longlong val,char *dst,int radix)
{
char buffer[65];
register char *p;
long long_val;
ulonglong uval= (ulonglong) val;
if (radix < 0)
{
if (val < 0)
{
*dst++ = '-';
/* Avoid integer overflow in (-val) for LONGLONG_MIN (BUG#31799). */
uval = (ulonglong)0 - uval;
}
}
if (uval == 0)
{
*dst++='0';
*dst='\0';
return dst;
}
p = &buffer[sizeof(buffer)-1];
*p = '\0';
while (uval > (ulonglong) LONG_MAX)
{
ulonglong quo= uval/(uint) 10;
uint rem= (uint) (uval- quo* (uint) 10);
*--p = _dig_vec_upper[rem];
uval= quo;
}
long_val= (long) uval;
while (long_val != 0)
{
long quo= long_val/10;
*--p = _dig_vec_upper[(uchar) (long_val - quo*10)];
long_val= quo;
}
while ((*dst++ = *p++) != 0) ;
return dst-1;
}
#endif

View File

@@ -0,0 +1,104 @@
/* Copyright (C) 2005 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#include <my_global.h>
#include "m_string.h"
#include "m_ctype.h"
#define NEQ(A, B) ((A) != (B))
#define EQU(A, B) ((A) == (B))
/**
Macro for the body of the string scanning.
@param CS The character set of the string
@param STR Pointer to beginning of string
@param END Pointer to one-after-end of string
@param ACC Pointer to beginning of accept (or reject) string
@param LEN Length of accept (or reject) string
@param CMP is a function-like for doing the comparison of two characters.
*/
#define SCAN_STRING(CS, STR, END, ACC, LEN, CMP) \
do { \
uint mbl; \
const char *ptr_str, *ptr_acc; \
const char *acc_end= (ACC) + (LEN); \
for (ptr_str= (STR) ; ptr_str < (END) ; ptr_str+= mbl) \
{ \
mbl= my_mbcharlen((CS), *(uchar*)ptr_str); \
if (mbl < 2) \
{ \
DBUG_ASSERT(mbl == 1); \
for (ptr_acc= (ACC) ; ptr_acc < acc_end ; ++ptr_acc) \
if (CMP(*ptr_acc, *ptr_str)) \
goto end; \
} \
} \
end: \
return (size_t) (ptr_str - (STR)); \
} while (0)
/*
my_strchr(cs, str, end, c) returns a pointer to the first place in
str where c (1-byte character) occurs, or NULL if c does not occur
in str. This function is multi-byte safe.
TODO: should be moved to CHARSET_INFO if it's going to be called
frequently.
*/
char *my_strchr(CHARSET_INFO *cs, const char *str, const char *end,
pchar c)
{
uint mbl;
while (str < end)
{
mbl= my_mbcharlen(cs, *(uchar *)str);
if (mbl < 2)
{
if (*str == c)
return((char *)str);
str++;
}
else
str+= mbl;
}
return(0);
}
/**
Calculate the length of the initial segment of 'str' which consists
entirely of characters not in 'reject'.
@note The reject string points to single-byte characters so it is
only possible to find the first occurrence of a single-byte
character. Multi-byte characters in 'str' are treated as not
matching any character in the reject string.
@todo should be moved to CHARSET_INFO if it's going to be called
frequently.
@internal The implementation builds on the assumption that 'str' is long,
while 'reject' is short. So it compares each character in string
with the characters in 'reject' in a tight loop over the characters
in 'reject'.
*/
size_t my_strcspn(CHARSET_INFO *cs, const char *str, const char *str_end,
const char *reject)
{
SCAN_STRING(cs, str, str_end, reject, strlen(reject), EQU);
}

View File

@@ -0,0 +1,236 @@
/* Copyright (C) 2003 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#include <my_global.h>
#include <my_sys.h> /* Needed for MY_ERRNO_ERANGE */
#include <m_string.h>
#define MAX_NEGATIVE_NUMBER ((ulonglong) LL(0x8000000000000000))
#define INIT_CNT 9
#define LFACTOR ULL(1000000000)
#define LFACTOR1 ULL(10000000000)
#define LFACTOR2 ULL(100000000000)
static unsigned long lfactor[9]=
{
1L, 10L, 100L, 1000L, 10000L, 100000L, 1000000L, 10000000L, 100000000L
};
/*
Convert a string to an to unsigned long long integer value
SYNOPSYS
my_strtoll10()
nptr in pointer to the string to be converted
endptr in/out pointer to the end of the string/
pointer to the stop character
error out returned error code
DESCRIPTION
This function takes the decimal representation of integer number
from string nptr and converts it to an signed or unsigned
long long integer value.
Space characters and tab are ignored.
A sign character might precede the digit characters. The number
may have any number of pre-zero digits.
The function stops reading the string nptr at the first character
that is not a decimal digit. If endptr is not NULL then the function
will not read characters after *endptr.
RETURN VALUES
Value of string as a signed/unsigned longlong integer
if no error and endptr != NULL, it will be set to point at the character
after the number
The error parameter contains information how things went:
-1 Number was an ok negative number
0 ok
ERANGE If the the value of the converted number exceeded the
maximum negative/unsigned long long integer.
In this case the return value is ~0 if value was
positive and LONGLONG_MIN if value was negative.
EDOM If the string didn't contain any digits. In this case
the return value is 0.
If endptr is not NULL the function will store the end pointer to
the stop character here.
*/
longlong my_strtoll10(const char *nptr, char **endptr, int *error)
{
const char *s, *end, *start, *n_end, *true_end;
char *dummy;
uchar c;
unsigned long i, j, k;
ulonglong li;
int negative;
ulong cutoff, cutoff2, cutoff3;
s= nptr;
/* If fixed length string */
if (endptr)
{
end= *endptr;
while (s != end && (*s == ' ' || *s == '\t'))
s++;
if (s == end)
goto no_conv;
}
else
{
endptr= &dummy; /* Easier end test */
while (*s == ' ' || *s == '\t')
s++;
if (!*s)
goto no_conv;
/* This number must be big to guard against a lot of pre-zeros */
end= s+65535; /* Can't be longer than this */
}
/* Check for a sign. */
negative= 0;
if (*s == '-')
{
*error= -1; /* Mark as negative number */
negative= 1;
if (++s == end)
goto no_conv;
cutoff= MAX_NEGATIVE_NUMBER / LFACTOR2;
cutoff2= (MAX_NEGATIVE_NUMBER % LFACTOR2) / 100;
cutoff3= MAX_NEGATIVE_NUMBER % 100;
}
else
{
*error= 0;
if (*s == '+')
{
if (++s == end)
goto no_conv;
}
cutoff= ULONGLONG_MAX / LFACTOR2;
cutoff2= ULONGLONG_MAX % LFACTOR2 / 100;
cutoff3= ULONGLONG_MAX % 100;
}
/* Handle case where we have a lot of pre-zero */
if (*s == '0')
{
i= 0;
do
{
if (++s == end)
goto end_i; /* Return 0 */
}
while (*s == '0');
n_end= s+ INIT_CNT;
}
else
{
/* Read first digit to check that it's a valid number */
if ((c= (*s-'0')) > 9)
goto no_conv;
i= c;
n_end= ++s+ INIT_CNT-1;
}
/* Handle first 9 digits and store them in i */
if (n_end > end)
n_end= end;
for (; s != n_end ; s++)
{
if ((c= (*s-'0')) > 9)
goto end_i;
i= i*10+c;
}
if (s == end)
goto end_i;
/* Handle next 9 digits and store them in j */
j= 0;
start= s; /* Used to know how much to shift i */
n_end= true_end= s + INIT_CNT;
if (n_end > end)
n_end= end;
do
{
if ((c= (*s-'0')) > 9)
goto end_i_and_j;
j= j*10+c;
} while (++s != n_end);
if (s == end)
{
if (s != true_end)
goto end_i_and_j;
goto end3;
}
if ((c= (*s-'0')) > 9)
goto end3;
/* Handle the next 1 or 2 digits and store them in k */
k=c;
if (++s == end || (c= (*s-'0')) > 9)
goto end4;
k= k*10+c;
*endptr= (char*) ++s;
/* number string should have ended here */
if (s != end && (c= (*s-'0')) <= 9)
goto overflow;
/* Check that we didn't get an overflow with the last digit */
if (i > cutoff || (i == cutoff && ((j > cutoff2 || j == cutoff2) &&
k > cutoff3)))
goto overflow;
li=i*LFACTOR2+ (ulonglong) j*100 + k;
return (longlong) li;
overflow: /* *endptr is set here */
*error= MY_ERRNO_ERANGE;
return negative ? LONGLONG_MIN : (longlong) ULONGLONG_MAX;
end_i:
*endptr= (char*) s;
return (negative ? ((longlong) -(long) i) : (longlong) i);
end_i_and_j:
li= (ulonglong) i * lfactor[(uint) (s-start)] + j;
*endptr= (char*) s;
return (negative ? -((longlong) li) : (longlong) li);
end3:
li=(ulonglong) i*LFACTOR+ (ulonglong) j;
*endptr= (char*) s;
return (negative ? -((longlong) li) : (longlong) li);
end4:
li=(ulonglong) i*LFACTOR1+ (ulonglong) j * 10 + k;
*endptr= (char*) s;
if (negative)
{
if (li > MAX_NEGATIVE_NUMBER)
goto overflow;
return -((longlong) li);
}
return (longlong) li;
no_conv:
/* There was no number to convert. */
*error= MY_ERRNO_EDOM;
*endptr= (char *) nptr;
return 0;
}

View File

@@ -0,0 +1,681 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#include <my_global.h>
#include <m_string.h>
#include <stdarg.h>
#include <m_ctype.h>
#define MAX_ARGS 32 /* max positional args count*/
#define MAX_PRINT_INFO 32 /* max print position count */
#define LENGTH_ARG 1
#define WIDTH_ARG 2
#define PREZERO_ARG 4
#define ESCAPED_ARG 8
typedef struct pos_arg_info ARGS_INFO;
typedef struct print_info PRINT_INFO;
struct pos_arg_info
{
char arg_type; /* argument type */
uint have_longlong; /* used from integer values */
char *str_arg; /* string value of the arg */
longlong longlong_arg; /* integer value of the arg */
double double_arg; /* double value of the arg */
};
struct print_info
{
char arg_type; /* argument type */
size_t arg_idx; /* index of the positional arg */
size_t length; /* print width or arg index */
size_t width; /* print width or arg index */
uint flags;
const char *begin; /**/
const char *end; /**/
};
/**
Calculates print length or index of positional argument
@param fmt processed string
@param length print length or index of positional argument
@param pre_zero returns flags with PREZERO_ARG set if necessary
@retval
string position right after length digits
*/
static const char *get_length(const char *fmt, size_t *length, uint *pre_zero)
{
for (; my_isdigit(&my_charset_latin1, *fmt); fmt++)
{
*length= *length * 10 + (uint)(*fmt - '0');
if (!*length)
*pre_zero|= PREZERO_ARG; /* first digit was 0 */
}
return fmt;
}
/**
Calculates print width or index of positional argument
@param fmt processed string
@param width print width or index of positional argument
@retval
string position right after width digits
*/
static const char *get_width(const char *fmt, size_t *width)
{
for (; my_isdigit(&my_charset_latin1, *fmt); fmt++)
{
*width= *width * 10 + (uint)(*fmt - '0');
}
return fmt;
}
/**
Calculates print width or index of positional argument
@param fmt processed string
@param have_longlong TRUE if longlong is required
@retval
string position right after modifier symbol
*/
static const char *check_longlong(const char *fmt, uint *have_longlong)
{
*have_longlong= 0;
if (*fmt == 'l')
{
fmt++;
if (*fmt != 'l')
*have_longlong= (sizeof(long) == sizeof(longlong));
else
{
fmt++;
*have_longlong= 1;
}
}
else if (*fmt == 'z')
{
fmt++;
*have_longlong= (sizeof(size_t) == sizeof(longlong));
}
return fmt;
}
/**
Returns escaped string
@param cs string charset
@param to buffer where escaped string will be placed
@param end end of buffer
@param par string to escape
@param par_len string length
@param quote_char character for quoting
@retval
position in buffer which points on the end of escaped string
*/
static char *backtick_string(CHARSET_INFO *cs, char *to, char *end,
char *par, size_t par_len, char quote_char)
{
uint char_len;
char *start= to;
char *par_end= par + par_len;
size_t buff_length= (size_t) (end - to);
if (buff_length <= par_len)
goto err;
*start++= quote_char;
for ( ; par < par_end; par+= char_len)
{
uchar c= *(uchar *) par;
if (!(char_len= my_mbcharlen(cs, c)))
char_len= 1;
if (char_len == 1 && c == (uchar) quote_char )
{
if (start + 1 >= end)
goto err;
*start++= quote_char;
}
if (start + char_len >= end)
goto err;
start= strnmov(start, par, char_len);
}
if (start + 1 >= end)
goto err;
*start++= quote_char;
return start;
err:
*to='\0';
return to;
}
/**
Prints string argument
*/
static char *process_str_arg(CHARSET_INFO *cs, char *to, char *end,
size_t width, char *par, uint print_type)
{
int well_formed_error;
size_t plen, left_len= (size_t) (end - to) + 1;
if (!par)
par = (char*) "(null)";
plen= strnlen(par, width);
if (left_len <= plen)
plen = left_len - 1;
plen= cs->cset->well_formed_len(cs, par, par + plen,
width, &well_formed_error);
if (print_type & ESCAPED_ARG)
to= backtick_string(cs, to, end, par, plen, '`');
else
to= strnmov(to,par,plen);
return to;
}
/**
Prints binary argument
*/
static char *process_bin_arg(char *to, char *end, size_t width, char *par)
{
DBUG_ASSERT(to <= end);
if (to + width + 1 > end)
width= end - to - 1; /* sign doesn't matter */
memmove(to, par, width);
to+= width;
return to;
}
/**
Prints double or float argument
*/
static char *process_dbl_arg(char *to, char *end, size_t width,
double par, char arg_type)
{
if (width == SIZE_T_MAX)
width= FLT_DIG; /* width not set, use default */
else if (width >= NOT_FIXED_DEC)
width= NOT_FIXED_DEC - 1; /* max.precision for my_fcvt() */
width= min(width, (size_t)(end-to) - 1);
if (arg_type == 'f')
to+= my_fcvt(par, (int)width , to, NULL);
else
to+= my_gcvt(par, MY_GCVT_ARG_DOUBLE, (int) width , to, NULL);
return to;
}
/**
Prints integer argument
*/
static char *process_int_arg(char *to, char *end, size_t length,
longlong par, char arg_type, uint print_type)
{
size_t res_length, to_length;
char *store_start= to, *store_end;
char buff[32];
if ((to_length= (size_t) (end-to)) < 16 || length)
store_start= buff;
if (arg_type == 'd' || arg_type == 'i')
store_end= longlong10_to_str(par, store_start, -10);
else if (arg_type == 'u')
store_end= longlong10_to_str(par, store_start, 10);
else if (arg_type == 'p')
{
store_start[0]= '0';
store_start[1]= 'x';
store_end= ll2str(par, store_start + 2, 16, 0);
}
else if (arg_type == 'o')
{
store_end= ll2str(par, store_start, 8, 0);
}
else
{
DBUG_ASSERT(arg_type == 'X' || arg_type =='x');
store_end= ll2str(par, store_start, 16, (arg_type == 'X'));
}
if ((res_length= (size_t) (store_end - store_start)) > to_length)
return to; /* num doesn't fit in output */
/* If %#d syntax was used, we have to pre-zero/pre-space the string */
if (store_start == buff)
{
length= min(length, to_length);
if (res_length < length)
{
size_t diff= (length- res_length);
bfill(to, diff, (print_type & PREZERO_ARG) ? '0' : ' ');
if (arg_type == 'p' && print_type & PREZERO_ARG)
{
if (diff > 1)
to[1]= 'x';
else
store_start[0]= 'x';
store_start[1]= '0';
}
to+= diff;
}
bmove(to, store_start, res_length);
}
to+= res_length;
return to;
}
/**
Procesed positional arguments.
@param cs string charset
@param to buffer where processed string will be place
@param end end of buffer
@param par format string
@param arg_index arg index of the first occurrence of positional arg
@param ap list of parameters
@retval
end of buffer where processed string is placed
*/
static char *process_args(CHARSET_INFO *cs, char *to, char *end,
const char* fmt, size_t arg_index, va_list ap)
{
ARGS_INFO args_arr[MAX_ARGS];
PRINT_INFO print_arr[MAX_PRINT_INFO];
uint idx= 0, arg_count= arg_index;
start:
/* Here we are at the beginning of positional argument, right after $ */
arg_index--;
print_arr[idx].flags= 0;
if (*fmt == '`')
{
print_arr[idx].flags|= ESCAPED_ARG;
fmt++;
}
if (*fmt == '-')
fmt++;
print_arr[idx].length= print_arr[idx].width= 0;
/* Get print length */
if (*fmt == '*')
{
fmt++;
fmt= get_length(fmt, &print_arr[idx].length, &print_arr[idx].flags);
print_arr[idx].length--;
DBUG_ASSERT(*fmt == '$' && print_arr[idx].length < MAX_ARGS);
args_arr[print_arr[idx].length].arg_type= 'd';
print_arr[idx].flags|= LENGTH_ARG;
arg_count= max(arg_count, print_arr[idx].length + 1);
fmt++;
}
else
fmt= get_length(fmt, &print_arr[idx].length, &print_arr[idx].flags);
if (*fmt == '.')
{
fmt++;
/* Get print width */
if (*fmt == '*')
{
fmt++;
fmt= get_width(fmt, &print_arr[idx].width);
print_arr[idx].width--;
DBUG_ASSERT(*fmt == '$' && print_arr[idx].width < MAX_ARGS);
args_arr[print_arr[idx].width].arg_type= 'd';
print_arr[idx].flags|= WIDTH_ARG;
arg_count= max(arg_count, print_arr[idx].width + 1);
fmt++;
}
else
fmt= get_width(fmt, &print_arr[idx].width);
}
else
print_arr[idx].width= SIZE_T_MAX;
fmt= check_longlong(fmt, &args_arr[arg_index].have_longlong);
if (*fmt == 'p')
args_arr[arg_index].have_longlong= (sizeof(void *) == sizeof(longlong));
args_arr[arg_index].arg_type= print_arr[idx].arg_type= *fmt;
print_arr[idx].arg_idx= arg_index;
print_arr[idx].begin= ++fmt;
while (*fmt && *fmt != '%')
fmt++;
if (!*fmt) /* End of format string */
{
uint i;
print_arr[idx].end= fmt;
/* Obtain parameters from the list */
for (i= 0 ; i < arg_count; i++)
{
switch (args_arr[i].arg_type) {
case 's':
case 'b':
args_arr[i].str_arg= va_arg(ap, char *);
break;
case 'f':
case 'g':
args_arr[i].double_arg= va_arg(ap, double);
break;
case 'd':
case 'i':
case 'u':
case 'x':
case 'X':
case 'o':
case 'p':
if (args_arr[i].have_longlong)
args_arr[i].longlong_arg= va_arg(ap,longlong);
else if (args_arr[i].arg_type == 'd' || args_arr[i].arg_type == 'i')
args_arr[i].longlong_arg= va_arg(ap, int);
else
args_arr[i].longlong_arg= va_arg(ap, uint);
break;
case 'c':
args_arr[i].longlong_arg= va_arg(ap, int);
break;
default:
DBUG_ASSERT(0);
}
}
/* Print result string */
for (i= 0; i <= idx; i++)
{
size_t width= 0, length= 0;
switch (print_arr[i].arg_type) {
case 's':
{
char *par= args_arr[print_arr[i].arg_idx].str_arg;
width= (print_arr[i].flags & WIDTH_ARG)
? (size_t)args_arr[print_arr[i].width].longlong_arg
: print_arr[i].width;
to= process_str_arg(cs, to, end, width, par, print_arr[i].flags);
break;
}
case 'b':
{
char *par = args_arr[print_arr[i].arg_idx].str_arg;
width= (print_arr[i].flags & WIDTH_ARG)
? (size_t)args_arr[print_arr[i].width].longlong_arg
: print_arr[i].width;
to= process_bin_arg(to, end, width, par);
break;
}
case 'c':
{
if (to == end)
break;
*to++= (char) args_arr[print_arr[i].arg_idx].longlong_arg;
break;
}
case 'f':
case 'g':
{
double d= args_arr[print_arr[i].arg_idx].double_arg;
width= (print_arr[i].flags & WIDTH_ARG) ?
(uint)args_arr[print_arr[i].width].longlong_arg : print_arr[i].width;
to= process_dbl_arg(to, end, width, d, print_arr[i].arg_type);
break;
}
case 'd':
case 'i':
case 'u':
case 'x':
case 'X':
case 'o':
case 'p':
{
/* Integer parameter */
longlong larg;
length= (print_arr[i].flags & LENGTH_ARG)
? (size_t)args_arr[print_arr[i].length].longlong_arg
: print_arr[i].length;
if (args_arr[print_arr[i].arg_idx].have_longlong)
larg = args_arr[print_arr[i].arg_idx].longlong_arg;
else if (print_arr[i].arg_type == 'd' || print_arr[i].arg_type == 'i' )
larg = (int) args_arr[print_arr[i].arg_idx].longlong_arg;
else
larg= (uint) args_arr[print_arr[i].arg_idx].longlong_arg;
to= process_int_arg(to, end, length, larg, print_arr[i].arg_type,
print_arr[i].flags);
break;
}
default:
break;
}
if (to == end)
break;
length= min(end - to , print_arr[i].end - print_arr[i].begin);
if (to + length < end)
length++;
to= strnmov(to, print_arr[i].begin, length);
}
DBUG_ASSERT(to <= end);
*to='\0'; /* End of errmessage */
return to;
}
else
{
/* Process next positional argument*/
DBUG_ASSERT(*fmt == '%');
print_arr[idx].end= fmt - 1;
idx++;
fmt++;
arg_index= 0;
fmt= get_width(fmt, &arg_index);
DBUG_ASSERT(*fmt == '$');
fmt++;
arg_count= max(arg_count, arg_index);
goto start;
}
return 0;
}
/**
Produces output string according to a format string
See the detailed documentation around my_snprintf_service_st
@param cs string charset
@param to buffer where processed string will be place
@param n size of buffer
@param par format string
@param ap list of parameters
@retval
length of result string
*/
size_t my_vsnprintf_ex(CHARSET_INFO *cs, char *to, size_t n,
const char* fmt, va_list ap)
{
char *start=to, *end=to+n-1;
size_t length, width;
uint print_type, have_longlong;
for (; *fmt ; fmt++)
{
if (*fmt != '%')
{
if (to == end) /* End of buffer */
break;
*to++= *fmt; /* Copy ordinary char */
continue;
}
fmt++; /* skip '%' */
length= width= 0;
print_type= 0;
/* Read max fill size (only used with %d and %u) */
if (my_isdigit(&my_charset_latin1, *fmt))
{
fmt= get_length(fmt, &length, &print_type);
if (*fmt == '$')
{
to= process_args(cs, to, end, (fmt+1), length, ap);
return (size_t) (to - start);
}
}
else
{
if (*fmt == '`')
{
print_type|= ESCAPED_ARG;
fmt++;
}
if (*fmt == '-')
fmt++;
if (*fmt == '*')
{
fmt++;
length= va_arg(ap, int);
}
else
fmt= get_length(fmt, &length, &print_type);
}
if (*fmt == '.')
{
fmt++;
if (*fmt == '*')
{
fmt++;
width= va_arg(ap, int);
}
else
fmt= get_width(fmt, &width);
}
else
width= SIZE_T_MAX;
fmt= check_longlong(fmt, &have_longlong);
if (*fmt == 's') /* String parameter */
{
reg2 char *par= va_arg(ap, char *);
to= process_str_arg(cs, to, end, width, par, print_type);
continue;
}
else if (*fmt == 'b') /* Buffer parameter */
{
char *par = va_arg(ap, char *);
to= process_bin_arg(to, end, width, par);
continue;
}
else if (*fmt == 'f' || *fmt == 'g')
{
double d= va_arg(ap, double);
to= process_dbl_arg(to, end, width, d, *fmt);
continue;
}
else if (*fmt == 'd' || *fmt == 'i' || *fmt == 'u' || *fmt == 'x' ||
*fmt == 'X' || *fmt == 'p' || *fmt == 'o')
{
/* Integer parameter */
longlong larg;
if (*fmt == 'p')
have_longlong= (sizeof(void *) == sizeof(longlong));
if (have_longlong)
larg = va_arg(ap,longlong);
else if (*fmt == 'd' || *fmt == 'i')
larg = va_arg(ap, int);
else
larg= va_arg(ap, uint);
to= process_int_arg(to, end, length, larg, *fmt, print_type);
continue;
}
else if (*fmt == 'c') /* Character parameter */
{
register int larg;
if (to == end)
break;
larg = va_arg(ap, int);
*to++= (char) larg;
continue;
}
/* We come here on '%%', unknown code or too long parameter */
if (to == end)
break;
*to++='%'; /* % used as % or unknown code */
}
DBUG_ASSERT(to <= end);
*to='\0'; /* End of errmessage */
return (size_t) (to - start);
}
/*
Limited snprintf() implementations
exported to plugins as a service, see the detailed documentation
around my_snprintf_service_st
*/
size_t my_vsnprintf(char *to, size_t n, const char* fmt, va_list ap)
{
return my_vsnprintf_ex(&my_charset_latin1, to, n, fmt, ap);
}
size_t my_snprintf(char* to, size_t n, const char* fmt, ...)
{
size_t result;
va_list args;
va_start(args,fmt);
result= my_vsnprintf(to, n, fmt, args);
va_end(args);
return result;
}

View File

@@ -0,0 +1,201 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
str2int(src, radix, lower, upper, &val)
converts the string pointed to by src to an integer and stores it in
val. It skips leading spaces and tabs (but not newlines, formfeeds,
backspaces), then it accepts an optional sign and a sequence of digits
in the specified radix. The result should satisfy lower <= *val <= upper.
The result is a pointer to the first character after the number;
trailing spaces will NOT be skipped.
If an error is detected, the result will be NullS, the value put
in val will be 0, and errno will be set to
EDOM if there are no digits
ERANGE if the result would overflow or otherwise fail to lie
within the specified bounds.
Check that the bounds are right for your machine.
This looks amazingly complicated for what you probably thought was an
easy task. Coping with integer overflow and the asymmetric range of
twos complement machines is anything but easy.
So that users of atoi and atol can check whether an error occured,
I have taken a wholly unprecedented step: errno is CLEARED if this
call has no problems.
*/
#include <my_global.h>
#include "m_string.h"
#include "m_ctype.h"
#include "my_sys.h" /* defines errno */
#include <errno.h>
#define char_val(X) (X >= '0' && X <= '9' ? X-'0' :\
X >= 'A' && X <= 'Z' ? X-'A'+10 :\
X >= 'a' && X <= 'z' ? X-'a'+10 :\
'\177')
char *str2int(register const char *src, register int radix, long int lower,
long int upper, long int *val)
{
int sign; /* is number negative (+1) or positive (-1) */
int n; /* number of digits yet to be converted */
long limit; /* "largest" possible valid input */
long scale; /* the amount to multiply next digit by */
long sofar; /* the running value */
register int d; /* (negative of) next digit */
char *start;
int digits[32]; /* Room for numbers */
/* Make sure *val is sensible in case of error */
*val = 0;
/* Check that the radix is in the range 2..36 */
#ifndef DBUG_OFF
if (radix < 2 || radix > 36) {
errno=EDOM;
return NullS;
}
#endif
/* The basic problem is: how do we handle the conversion of
a number without resorting to machine-specific code to
check for overflow? Obviously, we have to ensure that
no calculation can overflow. We are guaranteed that the
"lower" and "upper" arguments are valid machine integers.
On sign-and-magnitude, twos-complement, and ones-complement
machines all, if +|n| is representable, so is -|n|, but on
twos complement machines the converse is not true. So the
"maximum" representable number has a negative representative.
Limit is set to min(-|lower|,-|upper|); this is the "largest"
number we are concerned with. */
/* Calculate Limit using Scale as a scratch variable */
if ((limit = lower) > 0) limit = -limit;
if ((scale = upper) > 0) scale = -scale;
if (scale < limit) limit = scale;
/* Skip leading spaces and check for a sign.
Note: because on a 2s complement machine MinLong is a valid
integer but |MinLong| is not, we have to keep the current
converted value (and the scale!) as *negative* numbers,
so the sign is the opposite of what you might expect.
*/
while (my_isspace(&my_charset_latin1,*src)) src++;
sign = -1;
if (*src == '+') src++; else
if (*src == '-') src++, sign = 1;
/* Skip leading zeros so that we never compute a power of radix
in scale that we won't have a need for. Otherwise sticking
enough 0s in front of a number could cause the multiplication
to overflow when it neededn't.
*/
start=(char*) src;
while (*src == '0') src++;
/* Move over the remaining digits. We have to convert from left
to left in order to avoid overflow. Answer is after last digit.
*/
for (n = 0; (digits[n]=char_val(*src)) < radix && n < 20; n++,src++) ;
/* Check that there is at least one digit */
if (start == src) {
errno=EDOM;
return NullS;
}
/* The invariant we want to maintain is that src is just
to the right of n digits, we've converted k digits to
sofar, scale = -radix**k, and scale < sofar < 0. Now
if the final number is to be within the original
Limit, we must have (to the left)*scale+sofar >= Limit,
or (to the left)*scale >= Limit-sofar, i.e. the digits
to the left of src must form an integer <= (Limit-sofar)/(scale).
In particular, this is true of the next digit. In our
incremental calculation of Limit,
IT IS VITAL that (-|N|)/(-|D|) = |N|/|D|
*/
for (sofar = 0, scale = -1; --n >= 1;)
{
if ((long) -(d=digits[n]) < limit) {
errno=ERANGE;
return NullS;
}
limit = (limit+d)/radix, sofar += d*scale; scale *= radix;
}
if (n == 0)
{
if ((long) -(d=digits[n]) < limit) /* get last digit */
{
errno=ERANGE;
return NullS;
}
sofar+=d*scale;
}
/* Now it might still happen that sofar = -32768 or its equivalent,
so we can't just multiply by the sign and check that the result
is in the range lower..upper. All of this caution is a right
pain in the neck. If only there were a standard routine which
says generate thus and such a signal on integer overflow...
But not enough machines can do it *SIGH*.
*/
if (sign < 0)
{
if (sofar < -LONG_MAX || (sofar= -sofar) > upper)
{
errno=ERANGE;
return NullS;
}
}
else if (sofar < lower)
{
errno=ERANGE;
return NullS;
}
*val = sofar;
errno=0; /* indicate that all went well */
return (char*) src;
}
/* Theese are so slow compared with ordinary, optimized atoi */
#ifdef WANT_OUR_ATOI
int atoi(const char *src)
{
long val;
str2int(src, 10, (long) INT_MIN, (long) INT_MAX, &val);
return (int) val;
}
long atol(const char *src)
{
long val;
str2int(src, 10, LONG_MIN, LONG_MAX, &val);
return val;
}
#endif /* WANT_OUR_ATOI */

View File

@@ -0,0 +1,33 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#include <my_global.h>
#include <m_string.h>
static void *my_str_malloc_default(size_t size)
{
void *ret= malloc(size);
if (!ret)
exit(1);
return ret;
}
static void my_str_free_default(void *ptr)
{
free(ptr);
}
void *(*my_str_malloc)(size_t)= &my_str_malloc_default;
void (*my_str_free)(void *)= &my_str_free_default;

View File

@@ -0,0 +1,39 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/* File : strappend.c
Author : Monty
Updated: 1987.02.07
Defines: strappend()
strappend(dest, len, fill) appends fill-characters to a string so that
the result length == len. If the string is longer than len it's
trunked. The des+len character is allways set to NULL.
*/
#include <my_global.h>
#include "m_string.h"
void strappend(register char *s, size_t len, pchar fill)
{
register char *endpos;
endpos = s+len;
while (*s++);
s--;
while (s<endpos) *(s++) = fill;
*(endpos) = '\0';
} /* strappend */

View File

@@ -0,0 +1,36 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/* File : strcend.c
Author : Michael Widenius: ifdef MC68000
Updated: 20 April 1984
Defines: strcend()
strcend(s, c) returns a pointer to the first place in s where c
occurs, or a pointer to the end-null of s if c does not occur in s.
*/
#include <my_global.h>
#include "m_string.h"
char *strcend(register const char *s, register pchar c)
{
for (;;)
{
if (*s == (char) c) return (char*) s;
if (!*s++) return (char*) s-1;
}
}

View File

@@ -0,0 +1,44 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/* File : strcont.c
Author : Monty
Updated: 1988.07.27
Defines: strcont()
strcont(str, set) if str contanies any character in the string set.
The result is the position of the first found character in str, or NullS
if there isn't anything found.
*/
#include <my_global.h>
#include "m_string.h"
char * strcont(reg1 const char *str,reg2 const char *set)
{
reg3 char * start = (char *) set;
while (*str)
{
while (*set)
{
if (*set++ == *str)
return ((char*) str);
}
set=start; str++;
}
return (NullS);
} /* strcont */

View File

@@ -0,0 +1,37 @@
/* Copyright (C) 2002 MySQL AB
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Library General Public
License as published by the Free Software Foundation; version 2
of the License.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Library General Public License for more details.
You should have received a copy of the GNU Library General Public
License along with this library; if not, write to the Free
Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
MA 02111-1307, USA */
/* File : strend.c
Author : Richard A. O'Keefe.
Updated: 23 April 1984
Defines: strend()
strend(s) returns a character pointer to the NUL which ends s. That
is, strend(s)-s == strlen(s). This is useful for adding things at
the end of strings. It is redundant, because strchr(s,'\0') could
be used instead, but this is clearer and faster.
*/
#include <my_global.h>
#include "m_string.h"
char *strend(register const char *s)
{
while (*s++);
return (char*) (s-1);
}

View File

@@ -0,0 +1,34 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/* File : strfill.c
Author : Monty
Updated: 1987.04.16
Defines: strfill()
strfill(dest, len, fill) makes a string of fill-characters. The result
string is of length == len. The des+len character is allways set to NULL.
strfill() returns pointer to dest+len;
*/
#include <my_global.h>
#include "m_string.h"
char * strfill(char *s, size_t len, pchar fill)
{
while (len--) *s++ = fill;
*(s) = '\0';
return(s);
} /* strfill */

View File

@@ -0,0 +1,54 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/* File : strmake.c
Author : Michael Widenius
Updated: 20 Jul 1984
Defines: strmake()
strmake(dst,src,length) moves length characters, or until end, of src to
dst and appends a closing NUL to dst.
Note that if strlen(src) >= length then dst[length] will be set to \0
strmake() returns pointer to closing null
*/
#include <my_global.h>
#include "m_string.h"
char *strmake(register char *dst, register const char *src, size_t length)
{
#ifdef EXTRA_DEBUG
/*
'length' is the maximum length of the string; the buffer needs
to be one character larger to accomodate the terminating '\0'.
This is easy to get wrong, so we make sure we write to the
entire length of the buffer to identify incorrect buffer-sizes.
We only initialise the "unused" part of the buffer here, a) for
efficiency, and b) because dst==src is allowed, so initialising
the entire buffer would overwrite the source-string. Also, we
write a character rather than '\0' as this makes spotting these
problems in the results easier.
*/
uint n= 0;
while (n < length && src[n++]);
memset(dst + n, (int) 'Z', length - n + 1);
#endif
while (length--)
if (! (*dst++ = *src++))
return dst-1;
*dst=0;
return dst;
}

View File

@@ -0,0 +1,37 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
strmov(dst, src) moves all the characters of src (including the
closing NUL) to dst, and returns a pointer to the new closing NUL in
dst. The similar UNIX routine strcpy returns the old value of dst,
which I have never found useful. strmov(strmov(dst,a),b) moves a//b
into dst, which seems useful.
*/
#include <my_global.h>
#include "m_string.h"
#ifdef strmov
#undef strmov
#define strmov strmov_overlapp
#endif
char *strmov(register char *dst, register const char *src)
{
while ((*dst++ = *src++)) ;
return dst-1;
}

View File

@@ -0,0 +1,34 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/* File : strnlen.c
Author : Michael Widenius
Updated: 20 April 1984
Defines: strnlen.
strnlen(s, len) returns the length of s or len if s is longer than len.
*/
#include <my_global.h>
#include "m_string.h"
#ifndef HAVE_STRNLEN
size_t strnlen(register const char *s, register size_t maxlen)
{
const char *end= (const char *)memchr(s, '\0', maxlen);
return end ? (size_t) (end - s) : maxlen;
}
#endif

View File

@@ -0,0 +1,34 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
strnmov(dst,src,length) moves length characters, or until end, of src to
dst and appends a closing NUL to dst if src is shorter than length.
The result is a pointer to the first NUL in dst, or is dst+n if dst was
truncated.
*/
#include <my_global.h>
#include "m_string.h"
char *strnmov(register char *dst, register const char *src, size_t n)
{
while (n-- != 0) {
if (!(*dst++ = *src++)) {
return (char*) dst-1;
}
}
return dst;
}

View File

@@ -0,0 +1,50 @@
/* Copyright (C) 2002 MySQL AB
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Library General Public
License as published by the Free Software Foundation; version 2
of the License.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Library General Public License for more details.
You should have received a copy of the GNU Library General Public
License along with this library; if not, write to the Free
Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
MA 02111-1307, USA */
/* File : strxmov.c
Author : Richard A. O'Keefe.
Updated: 25 may 1984
Defines: strxmov()
strxmov(dst, src1, ..., srcn, NullS)
moves the concatenation of src1,...,srcn to dst, terminates it
with a NUL character, and returns a pointer to the terminating NUL.
It is just like strmov except that it concatenates multiple sources.
Beware: the last argument should be the null character pointer.
Take VERY great care not to omit it! Also be careful to use NullS
and NOT to use 0, as on some machines 0 is not the same size as a
character pointer, or not the same bit pattern as NullS.
*/
#include <my_global.h>
#include "m_string.h"
#include <stdarg.h>
char *strxmov(char *dst,const char *src, ...)
{
va_list pvar;
va_start(pvar,src);
while (src != NullS) {
while ((*dst++ = *src++)) ;
dst--;
src = va_arg(pvar, char *);
}
va_end(pvar);
*dst = 0; /* there might have been no sources! */
return dst;
}

View File

@@ -0,0 +1,63 @@
/* Copyright (C) 2002 MySQL AB
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Library General Public
License as published by the Free Software Foundation; version 2
of the License.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Library General Public License for more details.
You should have received a copy of the GNU Library General Public
License along with this library; if not, write to the Free
Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
MA 02111-1307, USA */
/* File : strxnmov.c
Author : Richard A. O'Keefe.
Updated: 2 June 1984
Defines: strxnmov()
strxnmov(dst, len, src1, ..., srcn, NullS)
moves the first len characters of the concatenation of src1,...,srcn
to dst and add a closing NUL character.
It is just like strnmov except that it concatenates multiple sources.
Beware: the last argument should be the null character pointer.
Take VERY great care not to omit it! Also be careful to use NullS
and NOT to use 0, as on some machines 0 is not the same size as a
character pointer, or not the same bit pattern as NullS.
NOTE
strxnmov is like strnmov in that it moves up to len
characters; dst will be padded on the right with one '\0' character.
if total-string-length >= length then dst[length] will be set to \0
*/
#include <my_global.h>
#include "m_string.h"
#include <stdarg.h>
char *strxnmov(char *dst, size_t len, const char *src, ...)
{
va_list pvar;
char *end_of_dst=dst+len;
va_start(pvar,src);
while (src != NullS)
{
do
{
if (dst == end_of_dst)
goto end;
}
while ((*dst++ = *src++));
dst--;
src = va_arg(pvar, char *);
}
end:
*dst=0;
va_end(pvar);
return dst;
}

View File

@@ -0,0 +1,258 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
Copyright (C) 1998, 1999 by Pruet Boonma, all rights reserved.
Copyright (C) 1998 by Theppitak Karoonboonyanan, all rights reserved.
Permission to use, copy, modify, distribute and sell this software
and its documentation for any purpose is hereby granted without fee,
provided that the above copyright notice appear in all copies.
Smaphan Raruenrom and Pruet Boonma makes no representations about
the suitability of this software for any purpose. It is provided
"as is" without express or implied warranty.
*/
/*
LC_COLLATE category + Level information
*/
#ifndef _t_ctype_h
#define _t_ctype_h
typedef unsigned char tchar;
#define TOT_LEVELS 5
#define LAST_LEVEL 4 /* TOT_LEVELS - 1 */
#define IGNORE 0
/* level 1 symbols & order */
enum l1_symbols {
L1_08 = TOT_LEVELS,
L1_18,
L1_28,
L1_38,
L1_48,
L1_58,
L1_68,
L1_78,
L1_88,
L1_98,
L1_A8,
L1_B8,
L1_C8,
L1_D8,
L1_E8,
L1_F8,
L1_G8,
L1_H8,
L1_I8,
L1_J8,
L1_K8,
L1_L8,
L1_M8,
L1_N8,
L1_O8,
L1_P8,
L1_Q8,
L1_R8,
L1_S8,
L1_T8,
L1_U8,
L1_V8,
L1_W8,
L1_X8,
L1_Y8,
L1_Z8,
L1_KO_KAI,
L1_KHO_KHAI,
L1_KHO_KHUAT,
L1_KHO_KHWAI,
L1_KHO_KHON,
L1_KHO_RAKHANG,
L1_NGO_NGU,
L1_CHO_CHAN,
L1_CHO_CHING,
L1_CHO_CHANG,
L1_SO_SO,
L1_CHO_CHOE,
L1_YO_YING,
L1_DO_CHADA,
L1_TO_PATAK,
L1_THO_THAN,
L1_THO_NANGMONTHO,
L1_THO_PHUTHAO,
L1_NO_NEN,
L1_DO_DEK,
L1_TO_TAO,
L1_THO_THUNG,
L1_THO_THAHAN,
L1_THO_THONG,
L1_NO_NU,
L1_BO_BAIMAI,
L1_PO_PLA,
L1_PHO_PHUNG,
L1_FO_FA,
L1_PHO_PHAN,
L1_FO_FAN,
L1_PHO_SAMPHAO,
L1_MO_MA,
L1_YO_YAK,
L1_RO_RUA,
L1_RU,
L1_LO_LING,
L1_LU,
L1_WO_WAEN,
L1_SO_SALA,
L1_SO_RUSI,
L1_SO_SUA,
L1_HO_HIP,
L1_LO_CHULA,
L1_O_ANG,
L1_HO_NOKHUK,
L1_NKHIT,
L1_SARA_A,
L1_MAI_HAN_AKAT,
L1_SARA_AA,
L1_SARA_AM,
L1_SARA_I,
L1_SARA_II,
L1_SARA_UE,
L1_SARA_UEE,
L1_SARA_U,
L1_SARA_UU,
L1_SARA_E,
L1_SARA_AE,
L1_SARA_O,
L1_SARA_AI_MAIMUAN,
L1_SARA_AI_MAIMALAI
};
/* level 2 symbols & order */
enum l2_symbols {
L2_BLANK = TOT_LEVELS,
L2_THAII,
L2_YAMAK,
L2_PINTHU,
L2_GARAN,
L2_TYKHU,
L2_TONE1,
L2_TONE2,
L2_TONE3,
L2_TONE4
};
/* level 3 symbols & order */
enum l3_symbols {
L3_BLANK = TOT_LEVELS,
L3_SPACE,
L3_NB_SACE,
L3_LOW_LINE,
L3_HYPHEN,
L3_COMMA,
L3_SEMICOLON,
L3_COLON,
L3_EXCLAMATION,
L3_QUESTION,
L3_SOLIDUS,
L3_FULL_STOP,
L3_PAIYAN_NOI,
L3_MAI_YAMOK,
L3_GRAVE,
L3_CIRCUMFLEX,
L3_TILDE,
L3_APOSTROPHE,
L3_QUOTATION,
L3_L_PARANTHESIS,
L3_L_BRACKET,
L3_L_BRACE,
L3_R_BRACE,
L3_R_BRACKET,
L3_R_PARENTHESIS,
L3_AT,
L3_BAHT,
L3_DOLLAR,
L3_FONGMAN,
L3_ANGKHANKHU,
L3_KHOMUT,
L3_ASTERISK,
L3_BK_SOLIDUS,
L3_AMPERSAND,
L3_NUMBER,
L3_PERCENT,
L3_PLUS,
L3_LESS_THAN,
L3_EQUAL,
L3_GREATER_THAN,
L3_V_LINE
};
/* level 4 symbols & order */
enum l4_symbols {
L4_BLANK = TOT_LEVELS,
L4_MIN,
L4_CAP,
L4_EXT
};
enum level_symbols {
L_UPRUPR = TOT_LEVELS,
L_UPPER,
L_MIDDLE,
L_LOWER
};
#define _is(c) (t_ctype[(c)][LAST_LEVEL])
#define _level 8
#define _consnt 16
#define _ldvowel 32
#define _fllwvowel 64
#define _uprvowel 128
#define _lwrvowel 256
#define _tone 512
#define _diacrt1 1024
#define _diacrt2 2048
#define _combine 4096
#define _stone 8192
#define _tdig 16384
#define _rearvowel (_fllwvowel | _uprvowel | _lwrvowel)
#define _diacrt (_diacrt1 | _diacrt2)
#define levelof(c) ( _is(c) & _level )
#define isthai(c) ( (c) >= 128 )
#define istalpha(c) ( _is(c) & (_consnt|_ldvowel|_rearvowel|\
_tone|_diacrt1|_diacrt2) )
#define isconsnt(c) ( _is(c) & _consnt )
#define isldvowel(c) ( _is(c) & _ldvowel )
#define isfllwvowel(c) ( _is(c) & _fllwvowel )
#define ismidvowel(c) ( _is(c) & (_ldvowel|_fllwvowel) )
#define isuprvowel(c) ( _is(c) & _uprvowel )
#define islwrvowel(c) ( _is(c) & _lwrvowel )
#define isuprlwrvowel(c) ( _is(c) & (_lwrvowel | _uprvowel))
#define isrearvowel(c) ( _is(c) & _rearvowel )
#define isvowel(c) ( _is(c) & (_ldvowel|_rearvowel) )
#define istone(c) ( _is(c) & _tone )
#define isunldable(c) ( _is(c) & (_rearvowel|_tone|_diacrt1|_diacrt2) )
#define iscombinable(c) ( _is(c) & _combine )
#define istdigit(c) ( _is(c) & _tdig )
#define isstone(c) ( _is(c) & _stone )
#define isdiacrt1(c) ( _is(c) & _diacrt1)
#define isdiacrt2(c) ( _is(c) & _diacrt2)
#define isdiacrt(c) ( _is(c) & _diacrt)
/* Function prototype called by sql/field.cc */
void ThNormalize(uchar* ptr, uint field_length, const uchar* from, uint length);
#endif

View File

@@ -0,0 +1,496 @@
/* Copyright (C) 2000 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#include "my_global.h"
#include "m_string.h"
#include "my_xml.h"
#define MY_XML_UNKNOWN 'U'
#define MY_XML_EOF 'E'
#define MY_XML_STRING 'S'
#define MY_XML_IDENT 'I'
#define MY_XML_EQ '='
#define MY_XML_LT '<'
#define MY_XML_GT '>'
#define MY_XML_SLASH '/'
#define MY_XML_COMMENT 'C'
#define MY_XML_TEXT 'T'
#define MY_XML_QUESTION '?'
#define MY_XML_EXCLAM '!'
#define MY_XML_CDATA 'D'
typedef struct xml_attr_st
{
const char *beg;
const char *end;
} MY_XML_ATTR;
/*
XML ctype:
*/
#define MY_XML_ID0 0x01 /* Identifier initial character */
#define MY_XML_ID1 0x02 /* Identifier medial character */
#define MY_XML_SPC 0x08 /* Spacing character */
/*
http://www.w3.org/TR/REC-xml/
[4] NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' |
CombiningChar | Extender
[5] Name ::= (Letter | '_' | ':') (NameChar)*
*/
static char my_xml_ctype[256]=
{
/*00*/ 0,0,0,0,0,0,0,0,0,8,8,0,0,8,0,0,
/*10*/ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
/*20*/ 8,0,0,0,0,0,0,0,0,0,0,0,0,2,2,0, /* !"#$%&'()*+,-./ */
/*30*/ 2,2,2,2,2,2,2,2,2,2,3,0,0,0,0,0, /* 0123456789:;<=>? */
/*40*/ 0,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3, /* @ABCDEFGHIJKLMNO */
/*50*/ 3,3,3,3,3,3,3,3,3,3,3,0,0,0,0,3, /* PQRSTUVWXYZ[\]^_ */
/*60*/ 0,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3, /* `abcdefghijklmno */
/*70*/ 3,3,3,3,3,3,3,3,3,3,3,0,0,0,0,0, /* pqrstuvwxyz{|}~ */
/*80*/ 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
/*90*/ 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
/*A0*/ 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
/*B0*/ 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
/*C0*/ 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
/*D0*/ 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
/*E0*/ 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
/*F0*/ 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3
};
#define my_xml_is_space(c) (my_xml_ctype[(uchar) (c)] & MY_XML_SPC)
#define my_xml_is_id0(c) (my_xml_ctype[(uchar) (c)] & MY_XML_ID0)
#define my_xml_is_id1(c) (my_xml_ctype[(uchar) (c)] & MY_XML_ID1)
static const char *lex2str(int lex)
{
switch(lex)
{
case MY_XML_EOF: return "END-OF-INPUT";
case MY_XML_STRING: return "STRING";
case MY_XML_IDENT: return "IDENT";
case MY_XML_CDATA: return "CDATA";
case MY_XML_EQ: return "'='";
case MY_XML_LT: return "'<'";
case MY_XML_GT: return "'>'";
case MY_XML_SLASH: return "'/'";
case MY_XML_COMMENT: return "COMMENT";
case MY_XML_TEXT: return "TEXT";
case MY_XML_QUESTION: return "'?'";
case MY_XML_EXCLAM: return "'!'";
}
return "unknown token";
}
static void my_xml_norm_text(MY_XML_ATTR *a)
{
for ( ; (a->beg < a->end) && my_xml_is_space(a->beg[0]) ; a->beg++ );
for ( ; (a->beg < a->end) && my_xml_is_space(a->end[-1]) ; a->end-- );
}
static int my_xml_scan(MY_XML_PARSER *p,MY_XML_ATTR *a)
{
int lex;
for (; ( p->cur < p->end) && my_xml_is_space(p->cur[0]) ; p->cur++);
if (p->cur >= p->end)
{
a->beg=p->end;
a->end=p->end;
lex=MY_XML_EOF;
goto ret;
}
a->beg=p->cur;
a->end=p->cur;
if ((p->end - p->cur > 3) && !memcmp(p->cur,"<!--",4))
{
for (; (p->cur < p->end) && memcmp(p->cur, "-->", 3); p->cur++)
{}
if (!memcmp(p->cur, "-->", 3))
p->cur+=3;
a->end=p->cur;
lex=MY_XML_COMMENT;
}
else if (!memcmp(p->cur, "<![CDATA[",9))
{
p->cur+= 9;
for (; p->cur < p->end - 2 ; p->cur++)
{
if (p->cur[0] == ']' && p->cur[1] == ']' && p->cur[2] == '>')
{
p->cur+= 3;
a->end= p->cur;
break;
}
}
lex= MY_XML_CDATA;
}
else if (strchr("?=/<>!",p->cur[0]))
{
p->cur++;
a->end=p->cur;
lex=a->beg[0];
}
else if ( (p->cur[0] == '"') || (p->cur[0] == '\'') )
{
p->cur++;
for (; ( p->cur < p->end ) && (p->cur[0] != a->beg[0]); p->cur++)
{}
a->end=p->cur;
if (a->beg[0] == p->cur[0])p->cur++;
a->beg++;
if (!(p->flags & MY_XML_FLAG_SKIP_TEXT_NORMALIZATION))
my_xml_norm_text(a);
lex=MY_XML_STRING;
}
else if (my_xml_is_id0(p->cur[0]))
{
p->cur++;
while (p->cur < p->end && my_xml_is_id1(p->cur[0]))
p->cur++;
a->end=p->cur;
my_xml_norm_text(a);
lex=MY_XML_IDENT;
}
else
lex= MY_XML_UNKNOWN;
#if 0
printf("LEX=%s[%d]\n",lex2str(lex),a->end-a->beg);
#endif
ret:
return lex;
}
static int my_xml_value(MY_XML_PARSER *st, const char *str, size_t len)
{
return (st->value) ? (st->value)(st,str,len) : MY_XML_OK;
}
static int my_xml_enter(MY_XML_PARSER *st, const char *str, size_t len)
{
if ((size_t) (st->attrend-st->attr+len+1) > sizeof(st->attr))
{
sprintf(st->errstr,"To deep XML");
return MY_XML_ERROR;
}
if (st->attrend > st->attr)
{
st->attrend[0]= '/';
st->attrend++;
}
memcpy(st->attrend,str,len);
st->attrend+=len;
st->attrend[0]='\0';
if (st->flags & MY_XML_FLAG_RELATIVE_NAMES)
return st->enter ? st->enter(st, str, len) : MY_XML_OK;
else
return st->enter ? st->enter(st,st->attr,st->attrend-st->attr) : MY_XML_OK;
}
static void mstr(char *s,const char *src,size_t l1, size_t l2)
{
l1 = l1<l2 ? l1 : l2;
memcpy(s,src,l1);
s[l1]='\0';
}
static int my_xml_leave(MY_XML_PARSER *p, const char *str, size_t slen)
{
char *e;
size_t glen;
char s[32];
char g[32];
int rc;
/* Find previous '/' or beginning */
for (e=p->attrend; (e>p->attr) && (e[0] != '/') ; e--);
glen = (size_t) ((e[0] == '/') ? (p->attrend-e-1) : p->attrend-e);
if (str && (slen != glen))
{
mstr(s,str,sizeof(s)-1,slen);
if (glen)
{
mstr(g,e+1,sizeof(g)-1,glen),
sprintf(p->errstr,"'</%s>' unexpected ('</%s>' wanted)",s,g);
}
else
sprintf(p->errstr,"'</%s>' unexpected (END-OF-INPUT wanted)", s);
return MY_XML_ERROR;
}
if (p->flags & MY_XML_FLAG_RELATIVE_NAMES)
rc= p->leave_xml ? p->leave_xml(p, str, slen) : MY_XML_OK;
else
rc= (p->leave_xml ? p->leave_xml(p,p->attr,p->attrend-p->attr) :
MY_XML_OK);
*e='\0';
p->attrend=e;
return rc;
}
int my_xml_parse(MY_XML_PARSER *p,const char *str, size_t len)
{
p->attrend=p->attr;
p->beg=str;
p->cur=str;
p->end=str+len;
while ( p->cur < p->end )
{
MY_XML_ATTR a;
if (p->cur[0] == '<')
{
int lex;
int question=0;
int exclam=0;
lex=my_xml_scan(p,&a);
if (MY_XML_COMMENT == lex)
continue;
if (lex == MY_XML_CDATA)
{
a.beg+= 9;
a.end-= 3;
my_xml_value(p, a.beg, (size_t) (a.end-a.beg));
continue;
}
lex=my_xml_scan(p,&a);
if (MY_XML_SLASH == lex)
{
if (MY_XML_IDENT != (lex=my_xml_scan(p,&a)))
{
sprintf(p->errstr,"%s unexpected (ident wanted)",lex2str(lex));
return MY_XML_ERROR;
}
if (MY_XML_OK != my_xml_leave(p,a.beg,(size_t) (a.end-a.beg)))
return MY_XML_ERROR;
lex=my_xml_scan(p,&a);
goto gt;
}
if (MY_XML_EXCLAM == lex)
{
lex=my_xml_scan(p,&a);
exclam=1;
}
else if (MY_XML_QUESTION == lex)
{
lex=my_xml_scan(p,&a);
question=1;
}
if (MY_XML_IDENT == lex)
{
p->current_node_type= MY_XML_NODE_TAG;
if (MY_XML_OK != my_xml_enter(p,a.beg,(size_t) (a.end-a.beg)))
return MY_XML_ERROR;
}
else
{
sprintf(p->errstr,"%s unexpected (ident or '/' wanted)",
lex2str(lex));
return MY_XML_ERROR;
}
while ((MY_XML_IDENT == (lex=my_xml_scan(p,&a))) ||
((MY_XML_STRING == lex && exclam)))
{
MY_XML_ATTR b;
if (MY_XML_EQ == (lex=my_xml_scan(p,&b)))
{
lex=my_xml_scan(p,&b);
if ( (lex == MY_XML_IDENT) || (lex == MY_XML_STRING) )
{
p->current_node_type= MY_XML_NODE_ATTR;
if ((MY_XML_OK != my_xml_enter(p,a.beg,(size_t) (a.end-a.beg))) ||
(MY_XML_OK != my_xml_value(p,b.beg,(size_t) (b.end-b.beg))) ||
(MY_XML_OK != my_xml_leave(p,a.beg,(size_t) (a.end-a.beg))))
return MY_XML_ERROR;
}
else
{
sprintf(p->errstr,"%s unexpected (ident or string wanted)",
lex2str(lex));
return MY_XML_ERROR;
}
}
else if (MY_XML_IDENT == lex)
{
p->current_node_type= MY_XML_NODE_ATTR;
if ((MY_XML_OK != my_xml_enter(p,a.beg,(size_t) (a.end-a.beg))) ||
(MY_XML_OK != my_xml_leave(p,a.beg,(size_t) (a.end-a.beg))))
return MY_XML_ERROR;
}
else if ((MY_XML_STRING == lex) && exclam)
{
/*
We are in <!DOCTYPE>, e.g.
<!DOCTYPE name SYSTEM "SystemLiteral">
<!DOCTYPE name PUBLIC "PublidLiteral" "SystemLiteral">
Just skip "SystemLiteral" and "PublicidLiteral"
*/
}
else
break;
}
if (lex == MY_XML_SLASH)
{
if (MY_XML_OK != my_xml_leave(p,NULL,0))
return MY_XML_ERROR;
lex=my_xml_scan(p,&a);
}
gt:
if (question)
{
if (lex != MY_XML_QUESTION)
{
sprintf(p->errstr,"%s unexpected ('?' wanted)",lex2str(lex));
return MY_XML_ERROR;
}
if (MY_XML_OK != my_xml_leave(p,NULL,0))
return MY_XML_ERROR;
lex=my_xml_scan(p,&a);
}
if (exclam)
{
if (MY_XML_OK != my_xml_leave(p,NULL,0))
return MY_XML_ERROR;
}
if (lex != MY_XML_GT)
{
sprintf(p->errstr,"%s unexpected ('>' wanted)",lex2str(lex));
return MY_XML_ERROR;
}
}
else
{
a.beg=p->cur;
for ( ; (p->cur < p->end) && (p->cur[0] != '<') ; p->cur++);
a.end=p->cur;
if (!(p->flags & MY_XML_FLAG_SKIP_TEXT_NORMALIZATION))
my_xml_norm_text(&a);
if (a.beg != a.end)
{
my_xml_value(p,a.beg,(size_t) (a.end-a.beg));
}
}
}
if (p->attr[0])
{
sprintf(p->errstr,"unexpected END-OF-INPUT");
return MY_XML_ERROR;
}
return MY_XML_OK;
}
void my_xml_parser_create(MY_XML_PARSER *p)
{
bzero((void*)p,sizeof(p[0]));
}
void my_xml_parser_free(MY_XML_PARSER *p __attribute__((unused)))
{
}
void my_xml_set_value_handler(MY_XML_PARSER *p,
int (*action)(MY_XML_PARSER *p, const char *s,
size_t l))
{
p->value=action;
}
void my_xml_set_enter_handler(MY_XML_PARSER *p,
int (*action)(MY_XML_PARSER *p, const char *s,
size_t l))
{
p->enter=action;
}
void my_xml_set_leave_handler(MY_XML_PARSER *p,
int (*action)(MY_XML_PARSER *p, const char *s,
size_t l))
{
p->leave_xml=action;
}
void my_xml_set_user_data(MY_XML_PARSER *p, void *user_data)
{
p->user_data=user_data;
}
const char *my_xml_error_string(MY_XML_PARSER *p)
{
return p->errstr;
}
size_t my_xml_error_pos(MY_XML_PARSER *p)
{
const char *beg=p->beg;
const char *s;
for ( s=p->beg ; s<p->cur; s++)
{
if (s[0] == '\n')
beg=s;
}
return (size_t) (p->cur-beg);
}
uint my_xml_error_lineno(MY_XML_PARSER *p)
{
uint res=0;
const char *s;
for (s=p->beg ; s<p->cur; s++)
{
if (s[0] == '\n')
res++;
}
return res;
}