Original: A Roundup of Chinese Character Encodings and Pinyin Input Methods

2008-4-10 22:08  Category: Engineers' Workplace
Internal Code, GB Code, and Area-Position Code


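A computer can process Chinese text only after every character has been assigned a code; these codes are collectively called Chinese character codes. Moving Chinese text through a system is a process of converting between these codes.
Exchange code (汉字交换码): the uniform code assigned to each character when text is transferred between Chinese information processing systems or communication systems. China has issued a national standard for the exchange code, "Code of Chinese Graphic Character Set for Information Interchange, Primary Set", designated GB 2312-80 and also called the "GB code" (国标码, national standard code).
GB code: every Chinese character code should follow this standard. The internal code, the design of character fonts, the conversion of input codes, and the character address codes of output devices are all based on it. GB 2312-80 is the GB code. It specifies that a character is represented by two bytes, each using only 7 bits, in the same way as ASCII.
Area-position code (区位码): the full GB 2312-80 character set is arranged in a 94×94 grid. Each row is called an "area" (区), numbered 01~94, and each column is called a "position" (位), numbered 01~94. This gives the GB 2312-80 area-position chart, and a code that identifies a character by its location in the chart is called an area-position code.
Internal code (机内码): to avoid ambiguity when ASCII and the GB code are used together, most Chinese systems set the high bit of each GB code byte to 1 and use the result as the internal code. This removes the ambiguity between Chinese and Western internal codes and keeps a very simple mapping between the internal code and the GB code.
The three codes are related as follows. Converting the area number and the position number of the (decimal) area-position code to hexadecimal and adding 20H to each gives the two bytes of the GB code. Setting the top bit of each GB code byte to 1, that is, adding 80H to each byte, gives the internal code. Equivalently, converting the area number and the position number to hexadecimal and adding A0H to each gives the internal code directly.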
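
The three conversions above can be sketched in C as follows. This is only an illustration of the arithmetic; the function names are invented for the example and do not come from any library.

```c
#include <stdint.h>

/* Area-position code -> GB (exchange) code: add 20H to each byte. */
uint16_t quwei_to_gb(uint8_t area, uint8_t pos)
{
    return (uint16_t)(((area + 0x20u) << 8) | (pos + 0x20u));
}

/* GB code -> internal code: set the top bit of both bytes (add 80H to each). */
uint16_t gb_to_internal(uint16_t gb)
{
    return (uint16_t)(gb | 0x8080u);
}

/* Area-position code -> internal code in one step: add A0H to each byte. */
uint16_t quwei_to_internal(uint8_t area, uint8_t pos)
{
    return (uint16_t)(((area + 0xA0u) << 8) | (pos + 0xA0u));
}
```

For area 16, position 01, `quwei_to_gb(16, 1)` gives 3021H and `quwei_to_internal(16, 1)` gives B0A1H.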

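 The GB2312 encoding covers symbols, digits, letters, Japanese kana, table-drawing symbols, and more, although Chinese characters are of course the main part. It is a 16-bit (two-byte) code, and the simplified Chinese characters occupy the range B0A1 through F7FE. A complete code table is available at http://ash.jp/code/cn/gb2312tbl.htm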
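
As a minimal sketch of how the B0A1~F7FE range can be used, the following C function tests whether a two-byte internal code falls in the Chinese character part of GB2312. The function name is an assumption made for this example. The extra check on lead byte D7 reflects the five undefined positions at the end of area 55.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when (hi, lo) is the internal code of a GB2312 Chinese
   character, i.e. areas 16..87 (lead bytes B0..F7), positions 01..94. */
bool is_gb2312_hanzi(uint8_t hi, uint8_t lo)
{
    if (hi < 0xB0 || hi > 0xF7) return false;   /* outside areas 16..87 */
    if (lo < 0xA1 || lo > 0xFE) return false;   /* outside positions 01..94 */
    if (hi == 0xD7 && lo > 0xF9) return false;  /* area 55, positions 90..94 are undefined */
    return true;
}
```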


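Quick Reference Table of Chinese Character Codes


Analysis of the Algorithm for Converting Chinese Text to Full Pinyin


 Principles of Chinese Character Encoding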


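1. The national standard system of Chinese character codes
  Chinese characters are numerous and rich in attributes, so the system of codes for them is correspondingly complex. It includes:
  (1) Internal codes. These represent characters inside a computer's Chinese system and are the base codes of that system.
  (2) Exchange codes. These are the code standards used to exchange national-standard characters (held, for example, as internal codes) between systems.
  (3) Input codes. These are the various code systems used to enter characters on a standard computer keyboard.
  (4) Dot-matrix codes. These are the code systems used to display characters on screen and to print them.
  (5) Glyph control codes. These are codes defined for printing different typefaces and character styles.
  Some of these code systems need a unified national standard and others do not. In recent years China has issued a series of national standards for Chinese information processing; they will continue to be refined and aligned with international practice.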


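2. The national standard exchange code
  China has issued the "National Standard of the People's Republic of China: Chinese Character Code for Information Interchange", designated GB2312-80. This code is also called the GB code. Its character set contains 3755 level-1 characters, 3008 level-2 characters, and 682 graphic symbols, 7445 characters in total.
  GB2312-80 places all of its characters and symbols in a grid of 94 rows and 94 columns. Each row is called an "area", numbered 01 to 94, and each column is called a "position", numbered 01 to 94. The area number and position number of a character or symbol, written together as four Arabic digits, form its "area-position code". The first two digits are the area number and the last two are the position number. An area-position code identifies exactly one character or symbol, and every character or symbol has exactly one area-position code. The character "母" has the area-position code 3624, so it sits at area 36, position 24; the question mark "?" has the code 0331, so it sits at area 03, position 31.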
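  The areas fall into the following four groups:
    (1) Areas 01 to 15: graphic symbols. Areas 01 to 09 hold the standard symbols, and areas 10 to 15 are reserved for user-defined symbols.
  The contents of areas 01 to 09 are as follows:
  1) Area 01: general symbols, such as separators, punctuation marks, operators, and unit symbols. The standard counts 202 general symbols in all, a figure that also includes the table-drawing symbols;
  2) Area 02: 60 serial numbers, namely 1.~20., (1)~(20), ①~⑩, and (一)~(十), together with the Roman numerals Ⅰ~Ⅻ;
  3) Area 03: the digits 0~9 (22 numerals when Ⅰ~Ⅻ are counted) and 52 English letters, 26 uppercase A~Z and 26 lowercase a~z;
  4) Area 04: 83 Japanese hiragana;
  5) Area 05: 86 Japanese katakana;
  6) Area 06: 48 Greek letters;
  7) Area 07: 66 Russian letters;
  8) Area 08: 26 Hanyu Pinyin symbols (vowels with tone marks) and 37 Zhuyin (bopomofo) letters;
  9) Area 09: table-drawing symbols.
    (2) Areas 16 to 55: level-1 (commonly used) characters, 3755 in all. The characters in these 40 areas are sorted by pinyin, and characters with the same pronunciation are sorted by stroke order. Positions 90 to 94 of area 55 are undefined.
    (3) Areas 56 to 87: level-2 characters, 3008 in all, sorted by radical.
    (4) Areas 88 to 94: user-defined characters.
    The user-defined symbol areas 10 to 15 and the user-defined character areas 88 to 94 let users define symbols and characters that the GB code does not include.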
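
The four groups can be expressed as a small lookup on the area number. The sketch below is illustrative only; the enum and function names are assumptions made for the example.

```c
/* Kinds of GB2312 areas, following the four groups described above. */
typedef enum {
    AREA_INVALID,      /* not in 1..94 */
    AREA_SYMBOL,       /* areas 01..09: standard symbols */
    AREA_USER_SYMBOL,  /* areas 10..15: user-defined symbols */
    AREA_LEVEL1,       /* areas 16..55: level-1 characters */
    AREA_LEVEL2,       /* areas 56..87: level-2 characters */
    AREA_USER_HANZI    /* areas 88..94: user-defined characters */
} AreaKind;

AreaKind classify_area(int area)
{
    if (area < 1 || area > 94) return AREA_INVALID;
    if (area <= 9)  return AREA_SYMBOL;
    if (area <= 15) return AREA_USER_SYMBOL;
    if (area <= 55) return AREA_LEVEL1;
    if (area <= 87) return AREA_LEVEL2;
    return AREA_USER_HANZI;
}
```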


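3. The national standard internal code
    The internal code of a character is the code that represents it inside the computer. It differs slightly from the area-position code. As described above, the area number and the position number both lie between 1 and 94, so using the area-position code directly as the internal code would collide with basic ASCII. To avoid that, the internal code must stay clear of the ASCII control codes (00H~1FH) and must also be distinguishable from the ASCII printable characters. Both goals are met by first adding 20H to the area number and to the position number, and then adding a further 80H (the "H" indicates that the two preceding digits are hexadecimal). After this processing the internal code of a character occupies two bytes, called the high byte and the low byte, formed as follows:
    high byte = area number + 20H + 80H (that is, area number + A0H)
    low byte = position number + 20H + 80H (that is, position number + A0H)
    Because the area and position numbers range from 01H to 5EH (decimal 01~94), the high and low bytes range from A1H to FEH (decimal 161~254).
    For example, the area-position code of the character "啊" is 1601. In hexadecimal its area and position numbers are 10H and 01H, so the high byte of its internal code is B0H, the low byte is A1H, and the internal code is B0A1H.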
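
Going the other way, an internal code can be split back into its area and position numbers by subtracting A0H from each byte. The following is a minimal sketch; the function name and the choice to reject bytes outside A1H~FEH are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode an internal code into area and position numbers (both 1..94).
   Returns false, leaving the outputs untouched, if either byte is
   outside the A1H..FEH range. */
bool internal_to_quwei(uint8_t hi, uint8_t lo, uint8_t *area, uint8_t *pos)
{
    if (hi < 0xA1 || hi > 0xFE || lo < 0xA1 || lo > 0xFE) {
        return false;
    }
    *area = (uint8_t)(hi - 0xA0);
    *pos  = (uint8_t)(lo - 0xA0);
    return true;
}
```

Decoding B0H, A1H yields area 16, position 1, the area-position code 1601 of "啊".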


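4. Input codes
    On a standard computer keyboard, entering Chinese characters is very different from entering Western text. For Western text, one keystroke directly enters the corresponding character or code, so "keying" and "entering" mean the same thing. For Chinese, "keying" is the act of pressing keys, the process of operating the keyboard, while "entering" means delivering the desired character or symbol to where it is needed, which is the purpose of operating the keyboard. Many input methods exist today, and therefore many input codes. Input codes face the user: different input codes involve different operations but produce the same result. Whatever input method is used, every entered character is stored on the medium as an internal code, and is sent and received as an exchange code when transmitted.
    The area-position code defined by GB2312-80 and the long-established telegraph code can both serve as input codes. With these codes the character code and the input code correspond one to one, and both have standard status. They are written with the ten Arabic digits, and every character has a code of the same length, four digits.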
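
Because an area-position input code is just four decimal digits, converting what the user typed into an internal code needs no table. The sketch below assumes the input arrives as a NUL-terminated string and returns 0 for anything that is not a valid code; the function name is hypothetical.

```c
#include <stdint.h>

/* Convert a four-digit area-position input such as "1601" into the
   two-byte internal code. Returns 0 if the string is not exactly four
   digits or if the area or position is outside 01..94. */
uint16_t quwei_input_to_internal(const char *digits)
{
    int i, area, pos;

    for (i = 0; i < 4; i++) {
        /* A short string fails here on its terminating NUL. */
        if (digits[i] < '0' || digits[i] > '9') return 0;
    }
    if (digits[4] != '\0') return 0;

    area = (digits[0] - '0') * 10 + (digits[1] - '0');
    pos  = (digits[2] - '0') * 10 + (digits[3] - '0');
    if (area < 1 || area > 94 || pos < 1 || pos > 94) return 0;

    return (uint16_t)(((area + 0xA0) << 8) | (pos + 0xA0));
}
```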
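    Other coding schemes are numerous. They can be compared on the following points:
    (1) Code type. Schemes can be pinyin-based, shape-based, or a combination of sound and shape.
    (2) Coding rules. Schemes differ greatly here: some have simple rules that are easy to learn and remember, others have complex rules that are harder to remember.
    (3) Code character set. Some schemes use the letter keys, some the digit keys, some both, and some use even more keys as their code character set.
    (4) Code length. This depends on the size of the code character set: the larger the set, the shorter the code. Schemes that use the 26 letters generally have a code length of four.
    (5) Mapping. Apart from the area-position code and the telegraph code mentioned above, which are one-to-one with no duplicate codes, every other existing scheme has some number of duplicate codes, meaning that one code maps to several characters. To make input more flexible, many schemes also let several codes map to the same character, for example the 双音 (shuangyin) coding.
    (6) Coding of single characters and words. To improve efficiency, existing schemes define codes for words as well as for single characters, and some let users add their own words to the dictionary. The gain in efficiency comes at the cost of more to remember and more complex operation.
    (7) Type and size of the code table. Converting an input code to an internal code generally requires searching a code table held in memory. If the input code and the internal code are related by a simple function that can be computed by formula, as with the area-position code, no table is needed; any code without such a relationship needs one. The size of the table depends on the data structure, the number of characters, the number of words, and similar factors. For the 6763 level-1 and level-2 characters defined by GB2312-80, the code tables of the various schemes range from a few thousand bytes to tens of thousands of bytes. As vocabularies grow, some code tables have reached several megabytes.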


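5. Dot-matrix codes
    Chinese characters are generally displayed and printed using dot matrices. Because there are so many characters and their shapes vary so much, each output typeface needs its own set of dot-matrix glyphs. The dot-matrix code of a character is the code of its dot-matrix glyph, and the complete set of dot-matrix codes stored on a medium is called a font library.
  A 16×16 dot-matrix character has 16 rows of 16 dots each. If each dot is represented by one bit, each row has 16 bits and takes two bytes, with bit value 0 meaning a white dot and 1 meaning a black dot. A 16×16 character therefore takes 2×16 = 32 bytes. By the same reasoning, a 24×24 character takes 72 bytes and a 32×32 character takes 128 bytes; these bytes form the character's glyph data in the font library.
    To display or print a character, the Chinese system uses the character's internal code to locate its glyph data in the font library, then reads that data and renders it on the screen or the printer.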
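
The size calculation and the lookup step can be sketched as below. The size formula follows directly from one bit per dot. The offset formula is an assumption: it presumes a font library that stores glyphs back to back in area-position order starting from area 01, position 01, which the text does not specify. Both function names are invented for the example.

```c
#include <stdint.h>

/* Bytes needed for one glyph at one bit per dot, each row padded to
   a whole number of bytes. */
uint32_t glyph_bytes(uint32_t width, uint32_t height)
{
    return ((width + 7u) / 8u) * height;
}

/* Byte offset of a glyph inside the font library, ASSUMING glyphs are
   stored consecutively in area-position order from area 01, position 01.
   hi and lo are the two bytes of the internal code (A1H..FEH each). */
uint32_t glyph_offset(uint8_t hi, uint8_t lo, uint32_t bytes_per_glyph)
{
    uint32_t area = hi - 0xA0u;   /* 1..94 */
    uint32_t pos  = lo - 0xA0u;   /* 1..94 */
    return ((area - 1u) * 94u + (pos - 1u)) * bytes_per_glyph;
}
```

Under that layout, the 16×16 glyph of "啊" (internal code B0A1H, area 16, position 1) would start at byte 15 × 94 × 32 = 45120.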


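  Design of a Chinese Input Method for Embedded Systems


Abstract: In intelligent terminals based on embedded systems, a Chinese human-machine interface is a required feature, and some systems also need Chinese text input. This article describes a Chinese input method that uses few resources and is suitable for implementation on an MCU.


Keywords: embedded system; Chinese input method; numeric keypad


Introduction


Human-machine interfaces built from an LCD and a numeric keypad are now widely used in intelligent terminals. The requirements differ by application: some need only to display and select simple parameters, while some information terminals also need text input.


With a high-performance CPU and a standard display device, a friendly interface can be built on the GUI software supported by a commercial embedded system (such as Linux or WinCE). In many cases, however, the terminal uses an MCU and its display is a small LCD with a non-standard interface. A low-cost implementation that uses few resources is then needed.


The intelligent terminal project the author worked on is a fairly typical MCU-based human-machine interface. It uses a 128×64 dot-matrix LCD module and must display the level-1 and level-2 common characters, encoded in Unicode, and support Chinese input. The code and data related to the input method occupy about 20 kB in this application. The application was developed on the real-time operating system μC/OS-II; see the relevant literature for details.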


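A Simple Chinese Pinyin Input Method


In essence, a Chinese input method establishes a mapping from key combinations to character codes. An input method for an embedded system with a numeric keypad is therefore not fundamentally different from one for a PC with a standard keyboard; the main difference is that processor and memory resources are more limited in embedded applications. For the character "你", for example, the key combination under pinyin input is "ni" on a PC keyboard and "64" on a typical numeric keypad.


Most handheld devices (such as smartphones) implement character input with the digit keys 0~9 and a few simple control keys. The best known examples are T9 and iTap, which are widely used in mobile phones. Here we describe how to implement a simple pinyin input method.


A typical terminal keypad has 12 keys: the digits 0~9 and the two special keys "*" and "#". By common convention, digit 1 corresponds to the space; it works much like the space bar on a PC and is used to enter a space or to confirm the current character. The digit keys 2~9 correspond to the following pinyin letters: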


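2: a b c    3: d e f    4: g h i

5: j k l    6: m n o    7: p q r s

8: t u v    9: w x y z


The "0", "*", and "#" keys serve as control keys in the input method. We use "#" as the "select key", which chooses among the different pinyin strings that share the same digit key combination.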
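
The letter-to-key table above is enough to turn any pinyin string into its digit key combination. The following sketch shows one way to do it; the function name and the error convention are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Translate a lowercase pinyin string into the digit keys that type it,
   e.g. "ni" -> "64". Returns 0 on success, or -1 if the input contains
   anything other than a..z or the output buffer is too small. */
int pinyin_to_keys(const char *pinyin, char *out, size_t out_size)
{
    /* Key for each letter a..z: abc=2 def=3 ghi=4 jkl=5 mno=6 pqrs=7 tuv=8 wxyz=9 */
    static const char keys[] = "22233344455566677778889999";
    size_t i, len = strlen(pinyin);

    if (len + 1 > out_size) return -1;
    for (i = 0; i < len; i++) {
        if (pinyin[i] < 'a' || pinyin[i] > 'z') return -1;
        out[i] = keys[pinyin[i] - 'a'];
    }
    out[len] = '\0';
    return 0;
}
```

A conversion like this is also a natural building block for a tool that generates the input method's data tables from a pinyin list.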


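The input method uses two important data structures, PY_NODE and PY_SUBNODE. Each PY_NODE corresponds to one digit key combination, and each PY_SUBNODE corresponds to one pinyin string. Because one digit combination can correspond to several pinyin strings (for example, "226" corresponds to "ban", "bao", "can", and "cao"), the two structures together implement a two-level lookup table.


The PY_NODEs are organized as a tree, and the PY_SUBNODEs as doubly linked lists. Figure 1 shows the basic relationship between the two.


The two structures are defined as follows: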


typedef struct py_node{
    unsigned int son[8];            // ID of the PY_NODE to move to when key 2~9 is pressed next
    unsigned int father;            // ID of the parent node
    struct py_subnode *ptrpy;       // pointer to the first PY_SUBNODE under this node
}PY_NODE;

typedef rom struct py_subnode{
    unsigned char py[7];            // pinyin string of this node
    struct py_subnode *prev;        // pointer to the previous PY_SUBNODE
    struct py_subnode *next;        // pointer to the next PY_SUBNODE
    unsigned char *ptrUnicode;      // pointer to the Unicode code table for this node
}PY_SUBNODE;


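The pinyin table we used in the design contains 412 pinyin strings, so the system needs 412 PY_SUBNODEs, one for each; 250 PY_NODEs were built in total. Building this data is fairly tedious and is done in the following five steps:


1. Group the characters by pinyin, sort each group by frequency of use, and convert the characters to Unicode or GB code, whichever the system requires;


2. Convert each valid pinyin string to its digit key combination, for example "cui" to "284"; these values account for some of the PY_NODEs;


3. Add intermediate PY_NODEs for strings that are not valid pinyin themselves but become valid with further input, such as "b", "c", "don", and "dua";


4. Link the PY_SUBNODEs that share a digit key combination into a list, with the ptrpy pointer of the corresponding PY_NODE pointing to the head;


5. Link the PY_NODEs into a tree according to their digit key combinations.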
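
Steps 2 to 5 can be sketched as a small table builder. The code below is a simplified illustration rather than the article's implementation: it uses array indices in place of the pointers in PY_NODE and PY_SUBNODE, omits the character tables, and uses small fixed capacities. All names in it are assumptions made for the example.

```c
#include <string.h>

#define MAX_NODES 64u
#define MAX_SUBS  64u
#define NONE      0xFFFFu   /* marks "no node" or "no entry" */

typedef struct {
    unsigned son[8];     /* next node for keys 2..9, or NONE */
    unsigned father;     /* parent node, NONE for the root */
    unsigned first_sub;  /* head of the pinyin list, NONE for an intermediate node */
} Node;

typedef struct {
    char py[7];          /* pinyin string, at most 6 letters */
    unsigned prev, next; /* neighbours in the doubly linked list, or NONE */
} Sub;

static Node nodes[MAX_NODES];
static Sub subs[MAX_SUBS];
static unsigned node_count, sub_count;

static unsigned new_node(unsigned father)
{
    unsigned k, id = node_count++;
    for (k = 0; k < 8; k++) nodes[id].son[k] = NONE;
    nodes[id].father = father;
    nodes[id].first_sub = NONE;
    return id;
}

/* Reset the tables; node 0 is the root (nothing typed yet). */
void trie_init(void)
{
    node_count = 0;
    sub_count = 0;
    new_node(NONE);
}

/* Index 0..7 of the key (2..9) that carries letter c, or -1. */
static int key_of(char c)
{
    static const char keys[] = "22233344455566677778889999";
    if (c < 'a' || c > 'z') return -1;
    return keys[c - 'a'] - '2';
}

/* Add one valid pinyin string. Missing intermediate nodes are created on
   the way (step 3), the string is appended to the node's list (step 4),
   and father/son links form the tree (step 5).
   Returns the node it was attached to, or NONE on error. */
unsigned trie_add(const char *py)
{
    unsigned cur = 0, s, t;
    size_t i, len = strlen(py);

    if (len == 0 || len > 6) return NONE;
    for (i = 0; i < len; i++) {
        int k = key_of(py[i]);
        if (k < 0) return NONE;
        if (nodes[cur].son[k] == NONE) {
            if (node_count >= MAX_NODES) return NONE;
            nodes[cur].son[k] = new_node(cur);
        }
        cur = nodes[cur].son[k];
    }
    if (sub_count >= MAX_SUBS) return NONE;
    s = sub_count++;
    strcpy(subs[s].py, py);
    subs[s].prev = NONE;
    subs[s].next = NONE;
    if (nodes[cur].first_sub == NONE) {
        nodes[cur].first_sub = s;
    } else {
        for (t = nodes[cur].first_sub; subs[t].next != NONE; t = subs[t].next) { }
        subs[t].next = s;
        subs[s].prev = t;
    }
    return cur;
}

/* Follow a digit string such as "226" from the root; NONE if no such node. */
unsigned trie_find(const char *digits)
{
    unsigned cur = 0;
    for (; *digits != '\0'; digits++) {
        if (*digits < '2' || *digits > '9') return NONE;
        cur = nodes[cur].son[*digits - '2'];
        if (cur == NONE) return NONE;
    }
    return cur;
}

/* Number of pinyin strings attached to a node. */
unsigned trie_count_subs(unsigned node)
{
    unsigned n = 0, s;
    for (s = nodes[node].first_sub; s != NONE; s = subs[s].next) n++;
    return n;
}
```

After adding "ban", "bao", "can", and "cao", the node reached by "226" carries all four strings, while the node for "22" exists only as an intermediate step with an empty list.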


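The organization shown in Figure 1 is not complicated, but building it takes considerable work; in practice a conversion program can be written to generate it automatically. Figure 2 shows a fragment of the pinyin input method's data structure.


Changing the current PY_NODE is normally accompanied by some display operations. These vary by application and are not discussed further here.


At the current node, a designated control key (such as "#") can be used to choose among the PY_SUBNODEs under this PY_NODE, narrowing the range of candidate characters.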
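
Handling the select key amounts to stepping through the node's linked list. The sketch below shows one possible policy, wrapping from the last pinyin string back to the first; whether the original design wraps is not stated, so that behaviour is an assumption, as are the type and function names.

```c
#include <stddef.h>

/* Simplified stand-in for PY_SUBNODE: only the fields needed here. */
typedef struct sub {
    const char *py;      /* pinyin string of this entry */
    struct sub *prev;    /* previous entry, NULL at the head */
    struct sub *next;    /* next entry, NULL at the tail */
} Sub;

/* Called when the select key is pressed. Returns the entry that becomes
   current: the head if nothing is selected yet, otherwise the next
   entry, wrapping from the tail back to the head. */
const Sub *select_next(const Sub *head, const Sub *current)
{
    if (current == NULL) return head;
    return (current->next != NULL) ? current->next : head;
}
```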


 


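Additional Features


The pinyin input method described above is fairly simple and provides the basic functions an input method needs. Some applications demand more, and the method can be extended to meet those demands. Some common requirements and the corresponding improvements are listed below:


① Adding more common characters


More common characters can be added to the input method above. Considering only the roughly 7000 common characters in the GB code, the storage used by the input method grows by 14 kB.


② Adding an association feature


To make input friendlier, many input methods offer association: after a character is entered, the characters that commonly follow it automatically become candidates for the user to choose from.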


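③ Stroke input


The advantage of stroke input over pinyin input is that it produces fewer duplicate codes, and entering uncommon characters does not require paging through many candidates.


Take the five-stroke input method as an example: characters can be entered with just five keys. The method divides the strokes of Chinese characters into five classes, "一", "丨", "丿", "丶", and "乛", which are assigned to the keys "7", "8", "9", "*", and "0" respectively. The character "你", for example, is the sequence "丿", "丨", "丿", "乛", "丨", "丿", "丶".


The difference between stroke input and pinyin input lies in how it feels to the user, not in what the machine does. Fundamentally, only the mapping between key combinations and the character table differs: "你" corresponds to "64" under pinyin input and to "989089*" under stroke input.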
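
The stroke-to-key assignment can be sketched in the same way as the pinyin mapping. The enum names below are labels chosen for this example (horizontal, vertical, left-falling, dot, turning); only the key assignment 7, 8, 9, *, 0 comes from the text.

```c
#include <stddef.h>

/* The five stroke classes, in the order they are assigned to keys. */
typedef enum {
    STROKE_HENG = 0,  /* horizontal   -> key 7 */
    STROKE_SHU,       /* vertical     -> key 8 */
    STROKE_PIE,       /* left-falling -> key 9 */
    STROKE_DIAN,      /* dot          -> key * */
    STROKE_ZHE        /* turning      -> key 0 */
} Stroke;

/* Convert a stroke sequence into the key string that enters it.
   Returns 0 on success, -1 on an invalid stroke or a buffer that is
   too small for n keys plus the terminating NUL. */
int strokes_to_keys(const Stroke *strokes, size_t n, char *out, size_t out_size)
{
    static const char key_of[5] = { '7', '8', '9', '*', '0' };
    size_t i;

    if (n + 1 > out_size) return -1;
    for (i = 0; i < n; i++) {
        if ((int)strokes[i] < 0 || (int)strokes[i] > (int)STROKE_ZHE) return -1;
        out[i] = key_of[strokes[i]];
    }
    out[n] = '\0';
    return 0;
}
```

Feeding in the seven strokes of "你" listed above produces the key string "989089*".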


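④ Special symbols, English letters, and digits


Commonly used special symbols, English letters, and digits are usually handled by implementing each as a separate input mode.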


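Software and Hardware Design


The quality of an input method shows less in its algorithm than in how well it fits actual needs. Optimizing it therefore means optimizing the organization of the PY_NODE and PY_SUBNODE data described above: the ordering of characters, the organization of follow-up characters for the association feature, and whether the user interface suits the way people work. Because the algorithm itself is simple, an implementation in C can be both efficient and portable.


Many 8-bit MCUs have an address space of no more than 64 kB. That is far too small for the font library of a Chinese interface plus the large data structures of the input method (a 16×16 dot-matrix font for the 6763 level-1 and level-2 characters alone needs 6763 × 32 bytes, about 220 kB), so address paging is commonly used to extend the address range. A latch outside the MCU serves as the "page" register, and the page size is chosen to suit the MCU and the application; on the MCS51 family a page can be as large as 64 kB. Because the page register is an exclusive resource, it must not be written inside an interrupt handler, and in an RTOS-based multitasking environment several tasks must not use the page register at the same time.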
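
The address arithmetic behind paging is a division of the linear address by the page size. The sketch below shows only that calculation; it does not touch any hardware, and the type and function names are assumptions. On a real target, writing the page latch and then reading the data would have to be protected as one unit, for example with a mutex under an RTOS, for the exclusivity reasons given above.

```c
#include <stdint.h>

/* A linear address split into a page number and an offset in that page. */
typedef struct {
    uint8_t  page;    /* value to write to the external page latch */
    uint16_t offset;  /* address within the selected page */
} BankedAddr;

/* Split a linear address for a given page size (1..65536 bytes).
   Assumes the result needs at most 256 pages. */
BankedAddr to_banked(uint32_t linear, uint32_t page_size)
{
    BankedAddr a;
    a.page   = (uint8_t)(linear / page_size);
    a.offset = (uint16_t)(linear % page_size);
    return a;
}
```

With 32 kB pages, for example, linear address 45120 falls in page 1 at offset 12352.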


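Conclusion


8-bit and 16-bit MCUs are mostly used in low-cost devices. When a commercial input method is too expensive or cannot be obtained, writing one in-house is a workable option. This article covers only the basic approach, however. The approach works, but input method code written this way should be tested over an extended period before it ships as production software.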



 A T9 Pinyin Input Method That Runs Directly in the Keil Simulator (Complete Version)
 
Author: anonymous    Source: 侃单片机    Updated: 2005-4-14
 
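/*
I saw that someone had posted a T9 pinyin input method on the forum and that many people were interested,
so I dug out a program I wrote long ago and recompiled it.
Its main feature is that it runs directly in the Keil simulator: switch to the serial window and the results appear there.
I hope you find it useful.
Simulation steps:
1. Save the contents of the three posts as 51t9py.c, 51t9py_indexa.h, and 5py_mb.h in the same directory, then add 51t9py.c to the project and build it.
   (51t9py.c includes the code table header as "PY_mb.h", so the saved file name has to match that include.)
2. Keil's simulated serial window displays one byte at a time, so Chinese characters show up garbled. Load a Chinese platform that redraws the display, such as RICHWIN or RichView, or refresh the screen by hand. Download RichView from http://www.pchome.net/dl/chinese.htm first, then install and run it.
3. In Keil, press "Ctrl+F5" to start the simulation and "F5" to run at full speed, switch to the serial window, and type the following in order:
     64*.6 426***.5 98*.7 936.3 586.1 4826*.1 9464*.7 64*.6
4. Key mapping (modelled on my own phone; everything is typed on the PC numeric keypad):
   Num                 /: previous pinyin    *: next pinyin
   7:pqrs    8:tuv     9:wxyz   -: previous page
   4:ghi     5:jkl     6:mno    +: next page
   1:unused  2:abc     3:def    Enter: toggle between input mode and selection mode
   0:unused            . , Space, and Enter: toggle between input mode and selection mode
*/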
//Save the contents of this post as 51t9py.c
//--------------------------------------------------------------------------//
//                        Source code, openly published                     //
//                    (c) Copyright 2001-2003 xuwenjun                     //
//                            All Rights Reserved                           //
//                                    V1.00                                 //
//--------------------------------------------------------------------------//
//Title:       T9 pinyin input method module                                //
//File:        51t9py.c                                                     //
//Version:     V1.00                                                        //
//Modified by: Xu Wenjun (徐文军)         E-mail:xuwenjun@21cn.com           //
//Date:        05-4-8                                                       //
//Description: T9 pinyin input method module                                //
//Notice:                                                                   //
//        This code is provided free of charge for study only; any use or   //
//        modification must credit the source in the file.                  //
//        For commercial use, contact the author. E-mail:xuwenjun@21cn.com  //
//        Questions: mailto xuwenjun@21cn.com   Feedback is welcome!        //
//--------------------------------------------------------------------------//
//Previous version: none                 Previous file name:                //
//Created by:  Xu Wenjun (徐文军)         E-mail:xuwenjun@21cn.com           //
//Date:        02-11-05                                                     //
//Description:                                                              //
//       1. An old program, adapted from the 51py input method routines by  //
//          Zhang Kai (张凯) and Li Qiang (李强): an index was added and the //
//          main and test programs completed so that it runs directly in    //
//          the Keil simulator                                              //
//       2. The simulation steps, the sample input, and the key mapping are //
//          given in the comment block at the top of this file              //
//--------------------------------------------------------------------------//
#include<string.h>
#include<stdio.h>
#include"PY_mb.h"
//#include"51t9_MB.h"
#include"51t9py_indexa.h"


#define CNTLQ      0x11
#define CNTLS      0x13
#define DEL        0x7F
#define BACKSPACE  0x08
#define CR         0x0D
#define LF         0x0A


unsigned char cpt9PY_Mblen;
struct t9PY_index code  * cpt9PY_Mb[16];


unsigned char t9PY_ime(char *strInput_t9PY_str)
{
    struct t9PY_index *cpHZ,*cpHZedge,*cpHZTemp;
    unsigned char i,j,cInputStrLength;


    cpt9PY_Mblen=0;                                //number of matching entries found
    j=0;                                           //j tracks the longest partial match
    cInputStrLength=strlen(strInput_t9PY_str);     //length of the input key string
    if(*strInput_t9PY_str=='\0')return(0);         //empty input: return 0


    cpHZ=&(t9PY_index2[0]);                        //start of the index table
    cpHZedge=t9PY_index2+sizeof(t9PY_index2)/sizeof(t9PY_index2[0]);
    cpHZTemp=cpHZ;                                 //fallback: the empty first entry
//    strInput_t9PY_str++;                        //point to the second character of the input
    while(cpHZ < cpHZedge)                       //scan every index entry
    {
        for(i=0;i<cInputStrLength;i++)
        {
               if(*(strInput_t9PY_str+i)!=*((*cpHZ).t9PY_T9+i))    //compare the key strings
            {
                if (i+1 > j)
                {
                    j=i+1;                    //j tracks the longest partial match
                    cpHZTemp=cpHZ;
                }
                break;                        //mismatch found, stop comparing
            }           
        }
        if((i==cInputStrLength) && (cpt9PY_Mblen<16))    //whole input matched; keep at most 16 entries
        {
            cpt9PY_Mb[cpt9PY_Mblen]=cpHZ;
            cpt9PY_Mblen++;
        }
        cpHZ++;
    }
     if(cpt9PY_Mblen==0)                    //no match: fall back to the longest partial match
        cpt9PY_Mb[0]=cpHZTemp;
    return (cpt9PY_Mblen);                //number of matching entries, 0 if none
}


char * t9PY_ime_mb(char *strInput_t9PY_str)
{
    if(t9PY_ime(strInput_t9PY_str) > 0)
        return ((*(cpt9PY_Mb[0])).PY_mb);
    else
        return (PY_mb_space);
}


void t9PY_Test(void)
{
    bit PYEnter=0;
    bit HZok=0;
    unsigned char temp;
//    unsigned char temp2;
    unsigned char t9PYn=0;
    char idata inline[16]={0x00};
    idata char chinese_word[3]="  ";
    char tempchar,Add=0,i=0;
    struct t9PY_index *cpTemp;
    cpTemp=t9PY_index2;                 //start at the empty first entry so '+' is safe before any match
//    printf ("\n按键 /:上一拼音 *:下一拼音 .和空格及回车键:输入状态和选字状态切换\n");         //
    printf ("请按键:2-abc 3-def 4-ghi 5-jkl 6-mno 7-pqrs 8-tuv 9-wxyz \n");          //
    while(!HZok)
    {
        tempchar=getchar();
        switch (tempchar)
        {
//            case '0':
            case '1':
            case '2':
            case '3':
            case '4':
            case '5':
            case '6':
            case '7':
            case '8':
            case '9':
                  if (!PYEnter && i<sizeof(inline)-1)    //input mode only; leave room for the terminating 0
                 {
                    inline[i]=tempchar;
                    i++;
                    Add=0;
                    t9PY_ime(inline);
                }
                break;
            case '/':
                if (t9PYn >0) t9PYn --;
                break;
            case '*':
                t9PYn ++;
                if (t9PYn >=cpt9PY_Mblen) t9PYn --;
                break;
            case '-':
                if (Add >= 12) Add -= 12;
                break;
            case '=':
            case '+':
                if (Add + 12 < strlen((*cpTemp).PY_mb)) Add += 12;    //avoids unsigned wrap-around on short tables
                break;
            case BACKSPACE:
                if (i>0) i--;
                inline[i]=0x00;
                Add=0;
                t9PY_ime(inline);
//                   cpTemp=cpt9PY_Mb[t9PYn];
                break;
//            case '\n':
            case '.':                        //toggle between input mode and selection mode
            case ' ':
            case '\n':
                PYEnter ^=1;
                break;
            default     :
//                HZok=1;
                break;
        }


        printf ("                                               \r");
          if (PYEnter)
         {
            printf ("选");
              cpTemp=cpt9PY_Mb[t9PYn];
            if(((*cpTemp).PY_mb != PY_mb_space) && (tempchar>='1') && (tempchar<='9'))    //entry has characters to pick from
            {
                HZok=1;
                t9PYn=0;
                printf ("                                                 \r");
//                printf ("%s\n",inline);
                chinese_word[0]=*((*cpTemp).PY_mb+Add+(tempchar-'1')*2);
                chinese_word[1]=*((*cpTemp).PY_mb+Add+(tempchar-'1')*2+1);
                   printf (chinese_word);
                   printf ("\n");
            }
            else
            {
//                printf ((*(cpTemp)).PY);
                printf (":");
                   printf ((*cpTemp).PY_mb+Add);
//                   printf ("\n拼音1 2 3 4 5 6 7 8 9\r");
            }
        }
        else
        {
            printf ("拼");
            for (temp=t9PYn;temp<cpt9PY_Mblen;temp++)
            {
                cpTemp=cpt9PY_Mb[temp];
//                    temp2=((strlen((*(cpTemp)).PY_mb)-Add)/2);
//                    printf ("%2bd:%02bd:",temp,temp2);
                    printf (":");
                    printf ((*(cpTemp)).PY);
//                       printf ((*(cpTemp)).PY_mb+Add);
//                printf ("\n");
            }
//               printf ("\n");
        }
    }
}


//-----Test program follows---------------------------------------------------------------//
#include <REG52.H>
#include <stdio.h>
#ifdef MONITOR51                         /* Debugging with Monitor-51 needs   */
    char code reserve [3] _at_ 0x23;         /* space for serial interrupt if     */
#endif                                   /* Stop Execution with Serial Intr.  */
                                         /* is enabled                        */
void main (void) {
    char input_string[]="98";


/*------------------------------------------------
Setup the serial port for 9600 baud at 11.0592MHz.
------------------------------------------------*/
#ifndef MONITOR51
    SCON  = 0x50;                /* SCON: mode 1, 8-bit UART, enable rcvr      */
    TMOD |= 0x20;               /* TMOD: timer 1, mode 2, 8-bit reload        */
    TH1   = 253;                /* TH1:  reload value for 9600 baud @ 11.0592MHz (SMOD=0); 250 would give 4800 */
    TR1   = 1;                  /* TR1:  timer 1 run                          */
    TI    = 1;                  /* TI:   set TI to send first char of UART    */
#endif


/*------------------------------------------------
Note that an embedded program never exits (because
there is no operating system to return to).  It
must loop and execute forever.
------------------------------------------------*/
//  printf ("Hello World\n");   /* Print "Hello World" */
    printf ("\n");
    printf ("%s\n",input_string);
    printf (t9PY_ime_mb(input_string));
    printf ("按键对应(全部在PC的小键盘操作)\n");
    printf ("        /-上一拼音 *-下一拼音\n");
    printf ("7-pqrs  8-tuv      9-wxyz   --前翻页\n");
    printf ("4-ghi   5-jkl      6-mno    +-后翻页\n");
    printf ("1-无效  2-abc      3-def    回车键-输入状态和选字状态切换\n");
    printf ("0-无效             .和空格及回车键-输入状态和选字状态切换\n\n");
    while(1)
    {
        t9PY_Test();
    }
}



//Save the contents of this post as 51t9py_indexa.h
//--------------------------------------------------------------------------//
//                        Source code, openly published                     //
//                    (c) Copyright 2001-2003 xuwenjun                     //
//                            All Rights Reserved                           //
//                                    V1.00                                 //
//--------------------------------------------------------------------------//
//Title:       T9 pinyin input method index                                 //
//File:        51t9py_indexa.h                                              //
//Version:     V1.00                                                        //
//Modified by: Xu Wenjun (徐文军)         E-mail:xuwenjun@21cn.com           //
//Date:        05-4-8                                                       //
//Description: T9 pinyin input method index                                 //
//Notice:                                                                   //
//        This code is provided free of charge for study only; any use or   //
//        modification must credit the source in the file.                  //
//        For commercial use, contact the author. E-mail: xuwenjun@21cn.com //
//        Questions: mailto xuwenjun@21cn.com    Feedback is welcome!       //
//--------------------------------------------------------------------------//
//Previous version: none                 Previous file name:                //
//Created by:  Xu Wenjun (徐文军)         E-mail:xuwenjun@21cn.com           //
//Date:        02-11-05                                                     //
//Description:                                                              //
//--------------------------------------------------------------------------//
struct t9PY_index
{
    char code *t9PY_T9;
    char code *PY;
    char code *PY_mb;
};


/* Pinyin input method lookup table: index from T9 digit strings to pinyin and character tables */
struct t9PY_index code t9PY_index2[] ={{"","",PY_mb_space},
                                    {"2","a",PY_mb_a},
                                    {"3","e",PY_mb_e},
                                    {"4","i",PY_mb_space},
                                    {"6","o",PY_mb_o},
                                    {"8","u",PY_mb_space},
                                    {"8","v",PY_mb_space},
                                    {"24","ai",PY_mb_ai},
                                    {"26","an",PY_mb_an},
                                    {"26","ao",PY_mb_ao},
                                    {"22","ba",PY_mb_ba},
                                    {"24","bi",PY_mb_bi},
                                    {"26","bo",PY_mb_bo},
                                    {"28","bu",PY_mb_bu},
                                    {"22","ca",PY_mb_ca},
                                    {"23","ce",PY_mb_ce},
                                    {"24","ci",PY_mb_ci},
                                    {"28","cu",PY_mb_cu},
                                    {"32","da",PY_mb_da},
                                    {"33","de",PY_mb_de},
                                    {"34","di",PY_mb_di},
                                    {"38","du",PY_mb_du},
                                    {"36","en",PY_mb_en},
                                    {"37","er",PY_mb_er},
                                    {"32","fa",PY_mb_fa},
                                    {"36","fo",PY_mb_fo},
                                    {"38","fu",PY_mb_fu},
                                    {"42","ha",PY_mb_ha},
                                    {"42","ga",PY_mb_ga},
                                    {"43","ge",PY_mb_ge},
                                    {"43","he",PY_mb_he},
                                    {"48","gu",PY_mb_gu},
                                    {"48","hu",PY_mb_hu},
                                    {"54","ji",PY_mb_ji},
                                    {"58","ju",PY_mb_ju},
                                    {"52","ka",PY_mb_ka},
                                    {"53","ke",PY_mb_ke},
                                    {"58","ku",PY_mb_ku},
                                    {"52","la",PY_mb_la},
                                    {"53","le",PY_mb_le},
                                    {"54","li",PY_mb_li},
                                    {"58","lu",PY_mb_lu},
                                    {"58","lv",PY_mb_lv},
                                    {"62","ma",PY_mb_ma},
                                    {"63","me",PY_mb_me},
                                    {"64","mi",PY_mb_mi},
                                    {"66","mo",PY_mb_mo},
                                    {"68","mu",PY_mb_mu},
                                    {"62","na",PY_mb_na},
                                    {"63","ne",PY_mb_ne},
                                    {"64","ni",PY_mb_ni},
                                    {"68","nu",PY_mb_nu},
                                    {"68","nv",PY_mb_nv},
                                    {"68","ou",PY_mb_ou},
                                    {"72","pa",PY_mb_pa},
                                    {"74","pi",PY_mb_pi},
                                    {"76","po",PY_mb_po},
                                    {"78","pu",PY_mb_pu},
                                    {"74","qi",PY_mb_qi},
                                    {"78","qu",PY_mb_qu},
                                    {"73","re",PY_mb_re},
                                    {"74","ri",PY_mb_ri},
                                    {"78","ru",PY_mb_ru},
                                    {"72","sa",PY_mb_sa},
                                    {"73","se",PY_mb_se},
                                    {"74","si",PY_mb_si},
                                    {"78","su",PY_mb_su},
                                    {"82","ta",PY_mb_ta},
                                    {"83","te",PY_mb_te},
                                    {"84","ti",PY_mb_ti},
                                    {"88","tu",PY_mb_tu},
                                    {"92","wa",PY_mb_wa},
                                    {"96","wo",PY_mb_wo},
                                    {"98","wu",PY_mb_wu},
                                    {"94","xi",PY_mb_xi},
                                    {"98","xu",PY_mb_xu},
                                    {"92","ya",PY_mb_ya},
                                    {"93","ye",PY_mb_ye},
                                    {"94","yi",PY_mb_yi},
                                    {"96","yo",PY_mb_yo},
                                    {"98","yu",PY_mb_yu},
                                    {"92","za",PY_mb_za},
                                    {"93","ze",PY_mb_ze},
                                    {"94","zi",PY_mb_zi},
                                    {"98","zu",PY_mb_zu},
                                    {"264","ang",PY_mb_ang},
                                    {"224","bai",PY_mb_bai},
                                    {"226","ban",PY_mb_ban},
                                    {"226","bao",PY_mb_bao},
                                    {"234","bei",PY_mb_bei},
                                    {"236","ben",PY_mb_ben},
                                    {"243","bie",PY_mb_bie},
                                    {"246","bin",PY_mb_bin},
                                    {"224","cai",PY_mb_cai},
                                    {"226","can",PY_mb_can},
                                    {"226","cao",PY_mb_cao},
                                    {"242","cha",PY_mb_cha},
                                    {"243","che",PY_mb_che},
                                    {"244","chi",PY_mb_chi},
                                    {"248","chu",PY_mb_chu},
                                    {"268","cou",PY_mb_cou},
                                    {"284","cui",PY_mb_cui},
                                    {"286","cun",PY_mb_cun},
                                    {"286","cuo",PY_mb_cuo},
                                    {"324","dai",PY_mb_dai},
                                    {"326","dan",PY_mb_dan},
                                    {"326","dao",PY_mb_dao},
                                    {"343","die",PY_mb_die},
                                    {"348","diu",PY_mb_diu},
                                    {"368","dou",PY_mb_dou},
                                    {"384","dui",PY_mb_dui},
                                    {"386","dun",PY_mb_dun},
                                    {"386","duo",PY_mb_duo},
                                    {"326","fan",PY_mb_fan},
                                    {"334","fei",PY_mb_fei},
                                    {"336","fen",PY_mb_fen},
                                    {"368","fou",PY_mb_fou},
                                    {"424","gai",PY_mb_gai},
                                    {"426","gan",PY_mb_gan},
                                    {"426","gao",PY_mb_gao},
                                    {"434","gei",PY_mb_gei},
                                    {"436","gen",PY_mb_gen},
                                    {"468","gou",PY_mb_gou},
                                    {"482","gua",PY_mb_gua},
                                    {"484","gui",PY_mb_gui},
                                    {"486","gun",PY_mb_gun},
                                    {"486","guo",PY_mb_guo},
                                    {"424","hai",PY_mb_hai},
                                    {"426","han",PY_mb_han},
                                    {"426","hao",PY_mb_hao},
                                    {"434","hei",PY_mb_hei},
                                    {"436","hen",PY_mb_hen},
                                    {"468","hou",PY_mb_hou},
                                    {"482","hua",PY_mb_hua},
                                    {"484","hui",PY_mb_hui},
                                    {"486","hun",PY_mb_hun},
                                    {"486","huo",PY_mb_huo},
                                    {"542","jia",PY_mb_jia},
                                    {"543","jie",PY_mb_jie},
                                    {"546","jin",PY_mb_jin},
                                    {"548","jiu",PY_mb_jiu},
                                    {"583","jue",PY_mb_jue},
                                    {"586","jun",PY_mb_jun},
                                    {"524","kai",PY_mb_kai},
                                    {"526","kan",PY_mb_kan},
                                    {"526","kao",PY_mb_kao},
                                    {"536","ken",PY_mb_ken},
                                    {"568","kou",PY_mb_kou},
                                    {"582","kua",PY_mb_kua},
                                    {"584","kui",PY_mb_kui},
                                    {"586","kun",PY_mb_kun},
                                    {"586","kuo",PY_mb_kuo},
                                    {"524","lai",PY_mb_lai},
                                    {"526","lan",PY_mb_lan},
                                    {"526","lao",PY_mb_lao},
                                    {"534","lei",PY_mb_lei},
                                    {"543","lie",PY_mb_lie},
                                    {"546","lin",PY_mb_lin},
                                    {"548","liu",PY_mb_liu},
                                    {"568","lou",PY_mb_lou},
                                    {"583","lue",PY_mb_lue},
                                    {"586","lun",PY_mb_lun},
                                    {"586","luo",PY_mb_luo},
                                    {"624","mai",PY_mb_mai},
                                    {"626","man",PY_mb_man},
                                    {"626","mao",PY_mb_mao},
                                    {"634","mei",PY_mb_mei},
                                    {"636","men",PY_mb_men},
                                    {"643","mie",PY_mb_mie},
                                    {"646","min",PY_mb_min},
                                    {"648","miu",PY_mb_miu},
                                    {"668","mou",PY_mb_mou},
                                    {"624","nai",PY_mb_nai},
                                    {"626","nan",PY_mb_nan},
                                    {"626","nao",PY_mb_nao},
                                    {"634","nei",PY_mb_nei},
                                    {"636","nen",PY_mb_nen},
                                    {"643","nie",PY_mb_nie},
                                    {"646","nin",PY_mb_nin},
                                    {"648","niu",PY_mb_niu},
                                    {"683","nue",PY_mb_nue},
                                    {"686","nuo",PY_mb_nuo},
                                    {"724","pai",PY_mb_pai},
                                    {"726","pan",PY_mb_pan},
                                    {"726","pao",PY_mb_pao},
                                    {"734","pei",PY_mb_pei},
                                    {"736","pen",PY_mb_pen},
                                    {"743","pie",PY_mb_pie},
                                    {"746","pin",PY_mb_pin},
                                    {"768","pou",PY_mb_pou},
                                    {"742","qia",PY_mb_qia},
                                    {"743","qie",PY_mb_qie},
                                    {"746","qin",PY_mb_qin},
                                    {"748","qiu",PY_mb_qiu},
                                    {"783","que",PY_mb_que},
                                    {"786","qun",PY_mb_qun},
                                    {"726","ran",PY_mb_ran},
                                    {"726","rao",PY_mb_rao},
                                    {"736","ren",PY_mb_ren},
                                    {"768","rou",PY_mb_rou},
                                    {"784","rui",PY_mb_rui},
                                    {"786","run",PY_mb_run},
                                    {"786","ruo",PY_mb_ruo},
                                    {"724","sai",PY_mb_sai},
                                    {"726","sao",PY_mb_sao},
                                    {"726","san",PY_mb_san},
                                    {"736","sen",PY_mb_sen},
                                    {"742","sha",PY_mb_sha},
                                    {"743","she",PY_mb_she},
                                    {"744","shi",PY_mb_shi},
                                    {"748","shu",PY_mb_shu},
                                    {"768","sou",PY_mb_sou},
                                    {"784","sui",PY_mb_sui},
                                    {"786","sun",PY_mb_sun},
                                    {"786","suo",PY_mb_suo},
                                    {"824","tai",PY_mb_tai},
                                    {"826","tan",PY_mb_tan},
                                    {"826","tao",PY_mb_tao},
                                    {"843","tie",PY_mb_tie},
                                    {"868","tou",PY_mb_tou},
                                    {"884","tui",PY_mb_tui},
                                    {"886","tun",PY_mb_tun},
                                    {"886","tuo",PY_mb_tuo},
                                    {"924","wai",PY_mb_wai},
                                    {"926","wan",PY_mb_wan},
                                    {"934","wei",PY_mb_wei},
                                    {"936","wen",PY_mb_wen},
                                    {"942","xia",PY_mb_xia},
                                    {"943","xie",PY_mb_xie},
                                    {"946","xin",PY_mb_xin},
                                    {"948","xiu",PY_mb_xiu},
                                    {"983","xue",PY_mb_xue},
                                    {"986","xun",PY_mb_xun},
                                    {"926","yan",PY_mb_yan},
                                    {"926","yao",PY_mb_yao},
                                    {"946","yin",PY_mb_yin},
                                    {"968","you",PY_mb_you},
                                    {"983","yue",PY_mb_yue},
                                    {"986","yun",PY_mb_yun},
                                    {"924","zai",PY_mb_zai},
                                    {"926","zan",PY_mb_zan},
                                    {"926","zao",PY_mb_zao},
                                    {"934","zei",PY_mb_zei},
                                    {"936","zen",PY_mb_zen},
                                    {"942","zha",PY_mb_zha},
                                    {"943","zhe",PY_mb_zhe},
                                    {"944","zhi",PY_mb_zhi},
                                    {"948","zhu",PY_mb_zhu},
                                    {"968","zou",PY_mb_zou},
                                    {"984","zui",PY_mb_zui},
                                    {"986","zun",PY_mb_zun},
                                    {"986","zuo",PY_mb_zuo},
                                    {"2264","bang",PY_mb_bang},
                                    {"2364","beng",PY_mb_beng},
                                    {"2426","bian",PY_mb_bian},
                                    {"2426","biao",PY_mb_biao},
                                    {"2464","bing",PY_mb_bing},
                                    {"2264","cang",PY_mb_cang},
                                    {"2364","ceng",PY_mb_ceng},
                                    {"2424","chai",PY_mb_chai},
                                    {"2426","chan",PY_mb_chan},
                                    {"2426","chao",PY_mb_chao},
                                    {"2436","chen",PY_mb_chen},
                                    {"2468","chou",PY_mb_chou},
                                    {"24824","chuai",PY_mb_chuai},
                                    {"2484","chui",PY_mb_chui},
                                    {"2486","chun",PY_mb_chun},
                                    {"2486","chuo",PY_mb_chuo},
                                    {"2664","cong",PY_mb_cong},
                                    {"2826","cuan",PY_mb_cuan},
                                    {"3264","dang",PY_mb_dang},
                                    {"3364","deng",PY_mb_deng},
                                    {"3426","dian",PY_mb_dian},
                                    {"3426","diao",PY_mb_diao},
                                    {"3464","ding",PY_mb_ding},
                                    {"3664","dong",PY_mb_dong},
                                    {"3826","duan",PY_mb_duan},
                                    {"3264","fang",PY_mb_fang},
                                    {"3364","feng",PY_mb_feng},
                                    {"4264","gang",PY_mb_gang},
                                    {"4364","geng",PY_mb_geng},
                                    {"4664","gong",PY_mb_gong},
                                    {"4824","guai",PY_mb_guai},
                                    {"4826","guan",PY_mb_guan},
                                    {"4264","hang",PY_mb_hang},
                                    {"4364","heng",PY_mb_heng},
                                    {"4664","hong",PY_mb_hong},
                                    {"4824","huai",PY_mb_huai},
                                    {"4826","huan",PY_mb_huan},
                                    {"5426","jian",PY_mb_jian},
                                    {"5426","jiao",PY_mb_jiao},
                                    {"5464","jing",PY_mb_jing},
                                    {"5826","juan",PY_mb_juan},
                                    {"5264","kang",PY_mb_kang},
                                    {"5364","keng",PY_mb_keng},
                                    {"5664","kong",PY_mb_kong},
                                    {"5824","kuai",PY_mb_kuai},
                                    {"5826","kuan",PY_mb_kuan},
                                    {"5264","lang",PY_mb_lang},
                                    {"5364","leng",PY_mb_leng},
                                    {"5426","lian",PY_mb_lian},
                                    {"5426","liao",PY_mb_liao},
                                    {"5464","ling",PY_mb_ling},
                                    {"5664","long",PY_mb_long},
                                    {"5826","luan",PY_mb_luan},
                                    {"6264","mang",PY_mb_mang},
                                    {"6364","meng",PY_mb_meng},
                                    {"6426","mian",PY_mb_mian},
                                    {"6426","miao",PY_mb_miao},
                                    {"6464","ming",PY_mb_ming},
                                    {"6264","nang",PY_mb_nang},
                                    {"6364","neng",PY_mb_neng},
                                    {"6426","nian",PY_mb_nian},
                                    {"6426","niao",PY_mb_niao},
                                    {"6464","ning",PY_mb_ning},
                                    {"6664","nong",PY_mb_nong},
                                    {"6826","nuan",PY_mb_nuan},
                                    {"7264","pang",PY_mb_pang},
                                    {"7364","peng",PY_mb_peng},
                                    {"7426","pian",PY_mb_pian},
                                    {"7426","piao",PY_mb_piao},
                                    {"7464","ping",PY_mb_ping},
                                    {"7426","qian",PY_mb_qian},
                                    {"7426","qiao",PY_mb_qiao},
                                    {"7464","qing",PY_mb_qing},
                                    {"7826","quan",PY_mb_quan},
                                    {"7264","rang",PY_mb_rang},
                                    {"7364","reng",PY_mb_reng},
                                    {"7664","rong",PY_mb_rong},
                                    {"7826","ruan",PY_mb_ruan},
                                    {"7264","sang",PY_mb_sang},
                                    {"7364","seng",PY_mb_seng},
                                    {"7424","shai",PY_mb_shai},
                                    {"7426","shan",PY_mb_shan},
                                    {"7426","shao",PY_mb_shao},
                                    {"7436","shen",PY_mb_shen},
                                    {"7468","shou",PY_mb_shou},
                                    {"7482","shua",PY_mb_shua},
                                    {"7484","shui",PY_mb_shui},
                                    {"7486","shun",PY_mb_shun},
                                    {"7486","shuo",PY_mb_shuo},
                                    {"7664","song",PY_mb_song},
                                    {"7826","suan",PY_mb_suan},
                                    {"8264","tang",PY_mb_tang},
                                    {"8364","teng",PY_mb_teng},
                                    {"8426","tian",PY_mb_tian},
                                    {"8426","tiao",PY_mb_tiao},
                                    {"8464","ting",PY_mb_ting},
                                    {"8664","tong",PY_mb_tong},
                                    {"8826","tuan",PY_mb_tuan},
                                    {"9264","wang",PY_mb_wang},
                                    {"9364","weng",PY_mb_weng},
                                    {"9426","xian",PY_mb_xian},
                                    {"9426","xiao",PY_mb_xiao},
                                    {"9464","xing",PY_mb_xing},
                                    {"9826","xuan",PY_mb_xuan},
                                    {"9264","yang",PY_mb_yang},
                                    {"9464","ying",PY_mb_ying},
                                    {"9664","yong",PY_mb_yong},
                                    {"9826","yuan",PY_mb_yuan},
                                    {"9264","zang",PY_mb_zang},
                                    {"9364","zeng",PY_mb_zeng},
                                    {"9424","zhai",PY_mb_zhai},
                                    {"9426","zhan",PY_mb_zhan},
                                    {"9426","zhao",PY_mb_zhao},
                                    {"9436","zhen",PY_mb_zhen},
                                    {"9468","zhou",PY_mb_zhou},
                                    {"9482","zhua",PY_mb_zhua},
                                    {"9484","zhui",PY_mb_zhui},
                                    {"9486","zhun",PY_mb_zhun},
                                    {"9486","zhuo",PY_mb_zhuo},
                                    {"9664","zong",PY_mb_zong},
                                    {"9826","zuan",PY_mb_zuan},
                                    {"24264","chang",PY_mb_chang},
                                    {"24364","cheng",PY_mb_cheng},
                                    {"24664","chong",PY_mb_chong},
                                    {"24826","chuan",PY_mb_chuan},
                                    {"48264","guang",PY_mb_guang},
                                    {"48264","huang",PY_mb_huang},
                                    {"54264","jiang",PY_mb_jiang},
                                    {"54664","jiong",PY_mb_jiong},
                                    {"58264","kuang",PY_mb_kuang},
                                    {"54264","liang",PY_mb_liang},
                                    {"64264","niang",PY_mb_niang},
                                    {"74264","qiang",PY_mb_qiang},
                                    {"74664","qiong",PY_mb_qiong},
                                    {"74264","shang",PY_mb_shang},
                                    {"74364","sheng",PY_mb_sheng},
                                    {"74824","shuai",PY_mb_shuai},
                                    {"74826","shuan",PY_mb_shuan},
                                    {"94264","xiang",PY_mb_xiang},
                                    {"94664","xiong",PY_mb_xiong},
                                    {"94264","zhang",PY_mb_zhang},
                                    {"94364","zheng",PY_mb_zheng},
                                    {"94664","zhong",PY_mb_zhong},
                                    {"94824","zhuai",PY_mb_zhuai},
                                    {"94826","zhuan",PY_mb_zhuan},
                                    {"248264","chuang",PY_mb_chuang},
                                    {"748264","shuang",PY_mb_shuang},
                                    {"948264","zhuang",PY_mb_zhuang},
};
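
Each digit string in the first column of the index table is simply the syllable spelled out on a phone keypad (2:abc, 3:def, 4:ghi, 5:jkl, 6:mno, 7:pqrs, 8:tuv, 9:wxyz). The sketch below shows one way to derive the key from the spelling, which is handy for checking hand-typed entries. The names `keypadDigit` and `pinyinToDigits` are hypothetical helpers introduced for illustration; they are not part of the original listing.

```cpp
#include <cassert>
#include <string>

// Map one lowercase letter to its keypad digit
// (2:abc 3:def 4:ghi 5:jkl 6:mno 7:pqrs 8:tuv 9:wxyz).
// Returns '\0' for any other character.
char keypadDigit(char c)
{
    static const char *keys[] = {"abc", "def", "ghi", "jkl",
                                 "mno", "pqrs", "tuv", "wxyz"};
    for (int k = 0; k < 8; ++k)
        for (const char *p = keys[k]; *p; ++p)
            if (*p == c)
                return static_cast<char>('2' + k);
    return '\0';
}

// Hypothetical helper: build the digit key for one pinyin syllable.
std::string pinyinToDigits(const std::string &pinyin)
{
    std::string digits;
    for (std::string::size_type i = 0; i < pinyin.size(); ++i)
    {
        char d = keypadDigit(pinyin[i]);
        assert(d != '\0');   // a syllable is expected to contain only a-z
        digits += d;
    }
    return digits;
}
```

Comparing `pinyinToDigits(entry.pinyin)` with the stored key for every row would flag any entry whose digits and spelling disagree.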


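// py_mb.h

// Pinyin input method: candidate character tables (code tables, "mb"), one per syllable.
// "code" is the Keil C51 keyword that places each table in program memory.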
code char PY_mb_a[]     ={"阿啊"};
code char PY_mb_ai[]    ={"哎哀唉埃挨皑癌矮蔼艾爱隘碍"};
code char PY_mb_an[]    ={"安氨鞍俺岸按案胺暗"};
code char PY_mb_ang[]   ={"肮昂盎"};
code char PY_mb_ao[]    ={"凹敖熬翱袄傲奥澳懊"};
code char PY_mb_ba[]    ={"八巴叭扒吧芭疤捌笆拔跋把靶坝爸罢霸"};
code char PY_mb_bai[]   ={"白百佰柏摆败拜稗"};
code char PY_mb_ban[]   ={"扳班般颁斑搬板版办半伴扮拌绊瓣"};
code char PY_mb_bang[]  ={"邦帮梆绑榜膀蚌傍棒谤磅镑"};
code char PY_mb_bao[]   ={"包苞胞褒雹宝饱保堡报抱豹鲍暴爆剥薄瀑"};
code char PY_mb_bei[]   ={"卑杯悲碑北贝狈备背钡倍被惫焙辈"};
code char PY_mb_ben[]   ={"奔本苯笨夯"};
code char PY_mb_beng[]  ={"崩绷甭泵迸蹦"};
code char PY_mb_bi[]    ={"逼鼻比彼笔鄙币必毕闭庇毖陛毙敝痹蓖弊碧蔽壁避臂"};
code char PY_mb_bian[]  ={"边编鞭贬扁卞便变遍辨辩辫"};
code char PY_mb_biao[]  ={"彪标膘表"};
code char PY_mb_bie[]   ={"憋鳖别瘪"};
code char PY_mb_bin[]   ={"宾彬斌滨濒摈"};
code char PY_mb_bing[]  ={"冰兵丙秉柄炳饼并病"};
code char PY_mb_bo[]    ={"拨波玻钵脖菠播伯驳帛泊勃铂舶博渤搏箔膊卜"};
code char PY_mb_bu[]    ={"补哺捕不布步怖部埠簿"};
code char PY_mb_ca[]    ={"擦"};
code char PY_mb_cai[]   ={"猜才材财裁采彩睬踩菜蔡"};
code char PY_mb_can[]   ={"参餐残蚕惭惨灿"};
code char PY_mb_cang[]  ={"仓沧苍舱藏"};
code char PY_mb_cao[]   ={"操糙曹槽草"};
code char PY_mb_ce[]    ={"册侧厕测策"};
code char PY_mb_ceng[]  ={"层蹭曾"};
code char PY_mb_cha[]   ={"叉插查茬茶搽察碴岔诧差刹"};
code char PY_mb_chai[]  ={"拆柴豺"};
code char PY_mb_chan[]  ={"掺搀谗馋缠蝉产铲阐颤"};
code char PY_mb_chang[] ={"昌猖肠尝偿常厂场敞畅倡唱"};
code char PY_mb_chao[]  ={"抄钞超巢朝嘲潮吵炒绰"};
code char PY_mb_che[]   ={"车扯彻掣撤澈"};
code char PY_mb_chen[]  ={"郴尘臣忱沉辰陈晨衬趁"};
code char PY_mb_cheng[] ={"称撑成呈承诚城乘惩程澄橙逞骋秤"};
code char PY_mb_chi[]   ={"吃痴弛池驰迟持尺侈齿耻斥赤炽翅"};
code char PY_mb_chong[] ={"充冲虫崇宠"};
code char PY_mb_chou[]  ={"抽仇绸畴愁稠筹酬踌丑瞅臭"};
code char PY_mb_chu[]   ={"出初除厨滁锄雏橱躇础储楚处搐触矗畜"};
code char PY_mb_chuai[] ={"揣"};
code char PY_mb_chuan[] ={"川穿传船椽喘串"};
code char PY_mb_chuang[]={"闯疮窗床创"};
code char PY_mb_chui[]  ={"吹炊垂捶锤"};
code char PY_mb_chun[]  ={"春椿纯唇淳醇蠢"};
code char PY_mb_chuo[]  ={"戳"};
code char PY_mb_ci[]    ={"疵词茨瓷慈辞磁雌此次刺赐"};
code char PY_mb_cong[]  ={"囱从匆葱聪丛"};
code char PY_mb_cou[]   ={"凑"};
code char PY_mb_cu[]    ={"粗促醋簇"};
code char PY_mb_cuan[]  ={"蹿窜篡"};
code char PY_mb_cui[]   ={"崔催摧脆淬瘁粹翠"};
code char PY_mb_cun[]   ={"村存寸"};
code char PY_mb_cuo[]   ={"搓磋撮挫措错"};
code char PY_mb_da[]    ={"搭达答瘩打大"};
code char PY_mb_dai[]   ={"呆歹傣代带待怠殆贷袋逮戴"};
code char PY_mb_dan[]   ={"丹单担耽郸胆掸旦但诞弹惮淡蛋氮"};
code char PY_mb_dang[]  ={"当挡党荡档"};
code char PY_mb_dao[]   ={"刀导岛倒捣祷蹈到悼盗道稻"};
code char PY_mb_de[]    ={"得德的"};
code char PY_mb_deng[]  ={"灯登蹬等邓凳瞪"};
code char PY_mb_di[]    ={"低堤滴狄迪敌涤笛嫡底抵地弟帝递第缔蒂"};
code char PY_mb_dian[]  ={"掂滇颠典点碘电佃甸店垫惦淀奠殿靛"};
code char PY_mb_diao[]  ={"刁叼凋碉雕吊钓掉"};
code char PY_mb_die[]   ={"爹跌迭谍叠碟蝶"};
code char PY_mb_ding[]  ={"丁叮盯钉顶鼎订定锭"};
code char PY_mb_diu[]   ={"丢"};
code char PY_mb_dong[]  ={"东冬董懂动冻侗恫栋洞"};
code char PY_mb_dou[]   ={"都兜斗抖陡豆逗痘"};
code char PY_mb_du[]    ={"督毒读犊独堵赌睹妒杜肚度渡镀"};
code char PY_mb_duan[]  ={"端短段断缎锻"};
code char PY_mb_dui[]   ={"堆队对兑"};
code char PY_mb_dun[]   ={"吨敦墩蹲盾钝顿遁"};
code char PY_mb_duo[]   ={"多哆夺掇朵垛躲剁堕舵惰跺"};
code char PY_mb_e[]     ={"讹俄娥峨鹅蛾额厄扼恶饿鄂遏"};
code char PY_mb_en[]    ={"恩"};
code char PY_mb_er[]    ={"儿而尔耳洱饵二贰"};
code char PY_mb_fa[]    ={"发乏伐罚阀筏法珐"};
code char PY_mb_fan[]   ={"帆番翻藩凡矾钒烦樊繁反返犯泛饭范贩"};
code char PY_mb_fang[]  ={"方坊芳防妨房肪仿访纺放"};
code char PY_mb_fei[]   ={"飞非啡菲肥匪诽吠废沸肺费"};
code char PY_mb_fen[]   ={"分吩纷芬氛酚坟汾焚粉份奋忿愤粪"};
code char PY_mb_feng[]  ={"丰风枫封疯峰烽锋蜂冯逢缝讽凤奉"};
code char PY_mb_fo[]    ={"佛"};
code char PY_mb_fou[]   ={"否"};
code char PY_mb_fu[]    ={"夫肤孵敷弗伏扶拂服俘氟浮涪符袱幅福辐抚甫府斧俯釜辅腑腐父讣付妇负附咐阜复赴副傅富赋缚腹覆"};
code char PY_mb_ga[]    ={"嘎噶"};
code char PY_mb_gai[]   ={"该改钙盖溉概"};
code char PY_mb_gan[]   ={"干甘杆肝柑竿秆赶敢感赣"};
code char PY_mb_gang[]  ={"冈刚岗纲肛缸钢港杠"};
code char PY_mb_gao[]   ={"皋羔高膏篙糕搞稿镐告"};
code char PY_mb_ge[]    ={"戈疙哥胳鸽割搁歌阁革格葛隔个各铬咯"};
code char PY_mb_gei[]   ={"给"};
code char PY_mb_gen[]   ={"根跟"};
code char PY_mb_geng[]  ={"更庚耕羹埂耿梗"};
code char PY_mb_gong[]  ={"工弓公功攻供宫恭躬龚巩汞拱共贡"};
code char PY_mb_gou[]   ={"勾沟钩狗苟构购垢够"};
code char PY_mb_gu[]    ={"估咕姑孤沽菇辜箍古谷股骨蛊鼓固故顾雇"};
code char PY_mb_gua[]   ={"瓜刮剐寡挂褂"};
code char PY_mb_guai[]  ={"乖拐怪"};
code char PY_mb_guan[]  ={"关观官冠棺馆管贯惯灌罐"};
code char PY_mb_guang[] ={"光广逛"};
code char PY_mb_gui[]   ={"归圭龟规闺硅瑰轨诡癸鬼刽柜贵桂跪"};
code char PY_mb_gun[]   ={"辊滚棍"};
code char PY_mb_guo[]   ={"郭锅国果裹过"};
code char PY_mb_ha[]    ={"蛤哈"};
code char PY_mb_hai[]   ={"孩骸海亥骇害氦"};
code char PY_mb_han[]   ={"酣憨含邯函涵寒韩罕喊汉汗旱悍捍焊憾撼翰"};
code char PY_mb_hang[]  ={"杭航行"};
code char PY_mb_hao[]   ={"毫豪嚎壕好郝号浩耗"};
code char PY_mb_he[]    ={"呵喝禾合何和河阂核荷涸盒菏贺褐赫鹤"};
code char PY_mb_hei[]   ={"黑嘿"};
code char PY_mb_hen[]   ={"痕很狠恨"};
code char PY_mb_heng[]  ={"亨哼恒横衡"};
code char PY_mb_hong[]  ={"轰哄烘弘红宏洪虹鸿"};
code char PY_mb_hou[]   ={"侯喉猴吼后厚候"};
code char PY_mb_hu[]    ={"乎呼忽弧狐胡壶湖葫瑚糊蝴虎唬互户护沪"};
code char PY_mb_hua[]   ={"花华哗滑猾化划画话"};
code char PY_mb_huai[]  ={"怀徊淮槐坏"};
code char PY_mb_huan[]  ={"欢还环桓缓幻宦唤换涣患焕痪豢"};
code char PY_mb_huang[] ={"荒慌皇凰黄惶煌蝗磺簧恍晃谎幌"};
code char PY_mb_hui[]   ={"灰恢挥辉徽回蛔悔卉汇会讳绘诲烩贿晦秽惠毁慧"};
code char PY_mb_hun[]   ={"昏荤婚浑魂混"};
code char PY_mb_huo[]   ={"豁活火伙或货获祸惑霍"};
code char PY_mb_ji[]    ={"讥击饥圾机肌鸡迹姬积基绩缉畸箕稽激及吉汲级即极急疾棘集嫉辑籍几己挤脊计记伎纪妓忌技际剂季既济继寂寄悸祭蓟冀藉"};
code char PY_mb_jia[]   ={"加夹佳枷家嘉荚颊甲贾钾价驾架假嫁稼挟"};
code char PY_mb_jian[]  ={"奸尖坚歼间肩艰兼监笺缄煎拣俭柬茧捡减剪检硷简碱见件建饯剑荐贱健涧舰渐溅践鉴键箭"};
code char PY_mb_jiang[] ={"江姜将浆僵疆讲奖桨蒋匠降酱"};
code char PY_mb_jiao[]  ={"交郊娇浇骄胶椒焦蕉礁角狡绞饺矫脚铰搅剿缴叫轿较教窖酵觉嚼"};
code char PY_mb_jie[]   ={"阶皆接秸揭街节劫杰洁结捷睫截竭姐解介戒芥届界疥诫借"};
code char PY_mb_jin[]   ={"巾今斤金津筋襟仅紧谨锦尽劲近进晋浸烬禁靳"};
code char PY_mb_jing[]  ={"京经茎荆惊晶睛粳兢精鲸井颈景警净径痉竞竟敬靖境静镜"};
code char PY_mb_jiong[] ={"炯窘"};
code char PY_mb_jiu[]   ={"纠究揪九久灸玖韭酒旧臼咎疚厩救就舅"};
code char PY_mb_ju[]    ={"居拘狙驹疽鞠局桔菊咀沮举矩句巨拒具炬俱剧惧据距锯聚踞"};
code char PY_mb_juan[]  ={"娟捐鹃卷倦绢眷"};
code char PY_mb_jue[]   ={"撅决诀抉绝倔掘爵攫"};
code char PY_mb_jun[]   ={"军君均钧菌俊郡峻浚骏竣"};
code char PY_mb_ka[]    ={"咖喀卡"};
code char PY_mb_kai[]   ={"开揩凯慨楷"};
code char PY_mb_kan[]   ={"槛刊勘堪坎砍看"};
code char PY_mb_kang[]  ={"康慷糠扛亢抗炕"};
code char PY_mb_kao[]   ={"考拷烤靠"};
code char PY_mb_ke[]    ={"坷苛柯科棵颗磕壳咳可渴克刻客课"};
code char PY_mb_ken[]   ={"肯垦恳啃"};
code char PY_mb_keng[]  ={"吭坑"};
code char PY_mb_kong[]  ={"空孔恐控"};
code char PY_mb_kou[]   ={"抠口扣寇"};
code char PY_mb_ku[]    ={"枯哭窟苦库裤酷"};
code char PY_mb_kua[]   ={"夸垮挎胯跨"};
code char PY_mb_kuai[]  ={"块快侩筷"};
code char PY_mb_kuan[]  ={"宽款"};
code char PY_mb_kuang[] ={"匡筐狂况旷矿框眶"};
code char PY_mb_kui[]   ={"亏岿盔窥奎葵魁傀愧溃馈"};
code char PY_mb_kun[]   ={"坤昆捆困"};
code char PY_mb_kuo[]   ={"扩括阔廓"};
code char PY_mb_la[]    ={"垃拉啦喇腊蜡辣"};
code char PY_mb_lai[]   ={"来莱赖"};
code char PY_mb_lan[]   ={"兰拦栏婪阑蓝谰澜篮览揽缆懒烂滥"};
code char PY_mb_lang[]  ={"郎狼廊琅榔朗浪"};
code char PY_mb_lao[]   ={"捞劳牢老佬姥涝烙酪"};
code char PY_mb_le[]    ={"乐勒了"};
code char PY_mb_lei[]   ={"雷镭垒磊蕾儡肋泪类累擂"};
code char PY_mb_leng[]  ={"棱楞冷"};
code char PY_mb_li[]    ={"厘梨狸离莉犁漓璃黎篱礼李里哩理鲤力历厉立吏丽利励沥例隶俐荔栗砾粒傈痢"};
code char PY_mb_lian[]  ={"连帘怜涟莲联廉镰敛脸练炼恋链"};
code char PY_mb_liang[] ={"俩良凉梁粮粱两亮谅辆晾量"};
code char PY_mb_liao[]  ={"潦辽疗聊僚寥廖撩燎镣料撂"};
code char PY_mb_lie[]   ={"列劣烈猎裂"};
code char PY_mb_lin[]   ={"邻林临淋琳霖磷鳞凛吝赁拎"};
code char PY_mb_ling[]  ={"伶灵岭玲凌铃陵羚菱零龄领令另"};
code char PY_mb_liu[]   ={"溜刘流留琉硫馏榴瘤柳六"};
code char PY_mb_long[]  ={"龙咙笼聋隆窿陇垄拢"};
code char PY_mb_lou[]   ={"娄楼搂篓陋漏"};
code char PY_mb_lu[]    ={"露卢庐芦炉颅卤虏掳鲁陆录赂鹿禄碌路戮潞麓"};
code char PY_mb_luan[]  ={"孪峦挛滦卵乱"};
code char PY_mb_lue[]   ={"掠略"};
code char PY_mb_lun[]   ={"抡仑伦沦纶轮论"};
code char PY_mb_luo[]   ={"罗萝逻锣箩骡螺裸洛络骆落"};
code char PY_mb_lv[]    ={"滤驴吕侣旅铝屡缕履律虑率绿氯"};
code char PY_mb_ma[]    ={"妈麻马玛码蚂骂吗嘛"};
code char PY_mb_mai[]   ={"埋买迈麦卖脉"};
code char PY_mb_man[]   ={"蛮馒瞒满曼谩慢漫蔓"};
code char PY_mb_mang[]  ={"忙芒盲茫莽氓"};
code char PY_mb_mao[]   ={"猫毛矛茅锚卯铆茂冒贸帽貌"};
code char PY_mb_me[]    ={"么"};
code char PY_mb_mei[]   ={"没枚玫眉梅媒煤酶霉每美镁妹昧媚寐"};
code char PY_mb_men[]   ={"门闷们"};
code char PY_mb_meng[]  ={"萌盟檬猛蒙锰孟梦"};
code char PY_mb_mi[]    ={"弥迷谜醚糜靡米眯泌觅秘密幂蜜"};
code char PY_mb_mian[]  ={"眠绵棉免勉娩冕缅面"};
code char PY_mb_miao[]  ={"苗描瞄秒渺藐妙庙"};
code char PY_mb_mie[]   ={"灭蔑"};
code char PY_mb_min[]   ={"民皿抿闽悯敏"};
code char PY_mb_ming[]  ={"名明鸣铭螟命"};
code char PY_mb_miu[]   ={"谬"};
code char PY_mb_mo[]    ={"貉摸摹模膜摩磨蘑魔抹末沫陌莫寞漠墨默"};
code char PY_mb_mou[]   ={"牟谋某"};
code char PY_mb_mu[]    ={"母亩牡姆拇木目牧募墓幕睦慕暮穆"};
code char PY_mb_na[]    ={"拿哪那纳娜钠呐"};
code char PY_mb_nai[]   ={"乃奶氖奈耐"};
code char PY_mb_nan[]   ={"男南难"};
code char PY_mb_nang[]  ={"囊"};
code char PY_mb_nao[]   ={"挠恼脑闹淖"};
code char PY_mb_ne[]    ={"呢"};
code char PY_mb_nei[]   ={"内馁"};
code char PY_mb_nen[]   ={"嫩"};
code char PY_mb_neng[]  ={"能"};
code char PY_mb_ni[]    ={"妮尼泥倪霓你拟逆匿溺腻"};
code char PY_mb_nian[]  ={"拈年捻撵碾念蔫"};
code char PY_mb_niang[] ={"娘酿"};
code char PY_mb_niao[]  ={"鸟尿"};
code char PY_mb_nie[]   ={"捏涅聂啮镊镍孽"};
code char PY_mb_nin[]   ={"您"};
code char PY_mb_ning[]  ={"宁拧狞柠凝泞"};
code char PY_mb_niu[]   ={"牛扭纽钮"};
code char PY_mb_nong[]  ={"农浓脓弄"};
code char PY_mb_nu[]    ={"奴努怒"};
code char PY_mb_nuan[]  ={"暖"};
code char PY_mb_nue[]   ={"疟虐"};
code char PY_mb_nuo[]   ={"挪诺懦糯"};
code char PY_mb_nv[]    ={"女"};
code char PY_mb_o[]     ={"哦"};
code char PY_mb_ou[]    ={"欧殴鸥呕偶藕沤"};
code char PY_mb_pa[]    ={"趴啪爬耙琶帕怕"};
code char PY_mb_pai[]   ={"拍徘排牌派湃"};
code char PY_mb_pan[]   ={"潘攀盘磐判叛盼畔"};
code char PY_mb_pang[]  ={"乓庞旁耪胖"};
code char PY_mb_pao[]   ={"抛刨咆炮袍跑泡"};
code char PY_mb_pei[]   ={"呸胚陪培赔裴沛佩配"};
code char PY_mb_pen[]   ={"喷盆"};
code char PY_mb_peng[]  ={"抨砰烹朋彭棚硼蓬鹏澎篷膨捧碰"};
code char PY_mb_pi[]    ={"辟批坯披砒劈霹皮毗疲啤琵脾匹痞屁僻譬"};
code char PY_mb_pian[]  ={"片偏篇骗"};
code char PY_mb_piao[]  ={"漂飘瓢票"};
code char PY_mb_pie[]   ={"撇瞥"};
code char PY_mb_pin[]   ={"拼贫频品聘"};
code char PY_mb_ping[]  ={"乒平评凭坪苹屏瓶萍"};
code char PY_mb_po[]    ={"坡泼颇婆迫破粕魄"};
code char PY_mb_pou[]   ={"剖"};
code char PY_mb_pu[]    ={"脯仆扑铺莆菩葡蒲朴圃埔浦普谱曝"};
code char PY_mb_qi[]    ={"七沏妻柒凄栖戚期欺漆祁齐其奇歧祈脐崎畦骑棋旗乞企岂启起气讫迄弃汽泣契砌器"};
code char PY_mb_qia[]   ={"掐恰洽"};
code char PY_mb_qian[]  ={"千仟扦迁钎牵铅谦签前钱钳乾潜黔浅遣谴欠堑嵌歉"};
code char PY_mb_qiang[] ={"呛羌枪腔强墙蔷抢"};
code char PY_mb_qiao[]  ={"悄敲锹橇乔侨桥瞧巧俏峭窍翘撬鞘"};
code char PY_mb_qie[]   ={"切茄且怯窃"};
code char PY_mb_qin[]   ={"亲侵钦芹秦琴禽勤擒寝沁"};
code char PY_mb_qing[]  ={"青氢轻倾卿清情晴氰擎顷请庆"};
code char PY_mb_qiong[] ={"穷琼"};
code char PY_mb_qiu[]   ={"丘邱秋囚求泅酋球"};
code char PY_mb_qu[]    ={"区曲驱屈蛆躯趋渠取娶龋去趣"};
code char PY_mb_quan[]  ={"圈全权泉拳痊醛颧犬劝券"};
code char PY_mb_que[]   ={"炔缺瘸却雀确鹊榷"};
code char PY_mb_qun[]   ={"裙群"};
code char PY_mb_ran[]   ={"然燃冉染"};
code char PY_mb_rang[]  ={"瓤嚷壤攘让"};
code char PY_mb_rao[]   ={"饶扰绕"};
code char PY_mb_re[]    ={"惹热"};
code char PY_mb_ren[]   ={"人仁壬忍刃认任纫妊韧"};
code char PY_mb_reng[]  ={"扔仍"};
code char PY_mb_ri[]    ={"日"};
code char PY_mb_rong[]  ={"戎绒茸荣容溶蓉熔融冗"};
code char PY_mb_rou[]   ={"柔揉肉"};
code char PY_mb_ru[]    ={"如茹儒孺蠕汝乳辱入褥"};
code char PY_mb_ruan[]  ={"阮软"};
code char PY_mb_rui[]   ={"蕊锐瑞"};
code char PY_mb_run[]   ={"闰润"};
code char PY_mb_ruo[]   ={"若弱"};
code char PY_mb_sa[]    ={"撒洒萨"};
code char PY_mb_sai[]   ={"塞腮鳃赛"};
code char PY_mb_san[]   ={"三叁伞散"};
code char PY_mb_sang[]  ={"桑嗓丧"};
code char PY_mb_sao[]   ={"搔骚扫嫂"};
code char PY_mb_se[]    ={"色涩瑟"};
code char PY_mb_sen[]   ={"森"};
code char PY_mb_seng[]  ={"僧"};
code char PY_mb_sha[]   ={"杀沙纱砂莎傻啥煞厦"};
code char PY_mb_shai[]  ={"筛晒"};
code char PY_mb_shan[]  ={"山删杉衫珊煽闪陕汕苫扇善缮擅膳赡栅"};
code char PY_mb_shang[] ={"伤商墒裳晌赏上尚"};
code char PY_mb_shao[]  ={"捎梢烧稍勺芍韶少邵绍哨"};
code char PY_mb_she[]   ={"奢赊舌蛇舍设社射涉赦慑摄"};
code char PY_mb_shen[]  ={"申伸身呻绅娠砷深神沈审婶肾甚渗慎什"};
code char PY_mb_sheng[] ={"升生声牲胜甥绳省圣盛剩"};
code char PY_mb_shi[]   ={"匙尸失师虱诗施狮湿十石时识实拾蚀食史矢使始驶屎士氏世仕市示式事侍势视试饰室恃拭是柿适逝释嗜誓噬似"};
code char PY_mb_shou[]  ={"收手守首寿受兽售授瘦"};
code char PY_mb_shu[]   ={"书抒叔枢殊梳淑疏舒输蔬孰赎熟暑黍署鼠蜀薯曙术戍束述树竖恕庶数墅漱属"};
code char PY_mb_shua[]  ={"刷耍"};
code char PY_mb_shuai[] ={"衰摔甩帅"};
code char PY_mb_shuan[] ={"拴栓"};
code char PY_mb_shuang[]={"双霜爽"};
code char PY_mb_shui[]  ={"谁水税睡"};
code char PY_mb_shun[]  ={"吮顺舜瞬"};
code char PY_mb_shuo[]  ={"说烁朔硕"};
code char PY_mb_si[]    ={"丝司私思斯嘶撕死巳四寺伺饲嗣肆"};
code char PY_mb_song[]  ={"松怂耸讼宋诵送颂"};
code char PY_mb_sou[]   ={"嗽搜艘擞"};
code char PY_mb_su[]    ={"苏酥俗诉肃素速粟塑溯僳"};
code char PY_mb_suan[]  ={"酸蒜算"};
code char PY_mb_sui[]   ={"虽绥隋随髓岁祟遂碎隧穗"};
code char PY_mb_sun[]   ={"孙损笋"};
code char PY_mb_suo[]   ={"唆梭蓑缩所索琐锁"};
code char PY_mb_ta[]    ={"她他它塌塔獭挞踏蹋"};
code char PY_mb_tai[]   ={"胎台抬苔太汰态泰酞"};
code char PY_mb_tan[]   ={"坍贪摊滩瘫坛谈痰谭潭檀坦袒毯叹炭探碳"};
code char PY_mb_tang[]  ={"汤唐堂棠塘搪膛糖倘淌躺烫趟"};
code char PY_mb_tao[]   ={"涛绦掏滔逃桃陶淘萄讨套"};
code char PY_mb_te[]    ={"特"};
code char PY_mb_teng[]  ={"疼腾誊藤"};
code char PY_mb_ti[]    ={"剔梯锑踢啼提题蹄体屉剃涕惕替嚏"};
code char PY_mb_tian[]  ={"天添田恬甜填腆舔"};
code char PY_mb_tiao[]  ={"调挑条迢眺跳"};
code char PY_mb_tie[]   ={"贴铁帖"};
code char PY_mb_ting[]  ={"厅汀听烃廷亭庭停挺艇"};
code char PY_mb_tong[]  ={"通同彤桐铜童酮瞳统捅桶筒痛"};
code char PY_mb_tou[]   ={"偷头投透"};
code char PY_mb_tu[]    ={"凸秃突图徒涂途屠土吐兔"};
code char PY_mb_tuan[]  ={"湍团"};
code char PY_mb_tui[]   ={"推颓腿退蜕褪"};
code char PY_mb_tun[]   ={"囤吞屯臀"};
code char PY_mb_tuo[]   ={"托拖脱驮陀驼鸵妥椭拓唾"};
code char PY_mb_wa[]    ={"哇娃挖洼蛙瓦袜"};
code char PY_mb_wai[]   ={"歪外"};
code char PY_mb_wan[]   ={"弯湾豌丸完玩顽烷宛挽晚婉惋皖碗万腕"};
code char PY_mb_wang[]  ={"汪亡王网往枉妄忘旺望"};
code char PY_mb_wei[]   ={"危威微巍为韦围违桅唯惟维潍伟伪尾纬苇委萎卫未位味畏胃尉谓喂渭蔚慰魏"};
code char PY_mb_wen[]   ={"温瘟文纹闻蚊吻紊稳问"};
code char PY_mb_weng[]  ={"翁嗡瓮"};
code char PY_mb_wo[]    ={"挝涡窝蜗我沃卧握斡"};
code char PY_mb_wu[]    ={"乌污呜巫屋诬钨无毋吴吾芜梧五午伍坞武侮捂舞勿务戊物误悟晤雾"};
code char PY_mb_xi[]    ={"夕汐西吸希昔析矽息牺悉惜烯硒晰犀稀溪锡熄熙嘻膝习席袭媳檄洗喜戏系细隙"};
code char PY_mb_xia[]   ={"虾瞎匣侠峡狭暇辖霞下吓夏"};
code char PY_mb_xian[]  ={"铣仙先纤掀锨鲜闲弦贤咸涎舷衔嫌显险县现线限宪陷馅羡献腺"};
code char PY_mb_xiang[] ={"乡相香厢湘箱襄镶详祥翔享响想向巷项象像橡"};
code char PY_mb_xiao[]  ={"宵消萧硝销霄嚣淆小晓孝肖哮效校笑啸"};
code char PY_mb_xie[]   ={"些楔歇蝎协邪胁斜谐携鞋写泄泻卸屑械谢懈蟹"};
code char PY_mb_xin[]   ={"心忻芯辛欣锌新薪信衅"};
code char PY_mb_xing[]  ={"兴星惺猩腥刑邢形型醒杏姓幸性"};
code char PY_mb_xiong[] ={"凶兄匈汹胸雄熊"};
code char PY_mb_xiu[]   ={"宿休修羞朽秀绣袖锈嗅"};
code char PY_mb_xu[]    ={"戌须虚嘘需墟徐许旭序叙恤绪续酗婿絮蓄吁"};
code char PY_mb_xuan[]  ={"轩宣喧玄悬旋选癣绚眩"};
code char PY_mb_xue[]   ={"削靴薛穴学雪血"};
code char PY_mb_xun[]   ={"勋熏寻巡旬驯询循训讯汛迅逊殉"};
code char PY_mb_ya[]    ={"丫压呀押鸦鸭牙芽蚜崖涯衙哑雅亚讶"};
code char PY_mb_yan[]   ={"咽烟淹焉阉延严言岩沿炎研盐阎蜒颜奄衍掩眼演厌彦砚唁宴艳验谚堰焰雁燕"};
code char PY_mb_yang[]  ={"央殃秧鸯扬羊阳杨佯疡洋仰养氧痒样漾"};
code char PY_mb_yao[]   ={"侥妖腰邀尧姚窑谣摇遥瑶咬舀药要耀钥"};
code char PY_mb_ye[]    ={"椰噎爷耶也冶野业叶曳页夜掖液腋"};
code char PY_mb_yi[]    ={"一伊衣医依铱壹揖仪夷沂宜姨胰移遗颐疑彝乙已以矣蚁倚椅义亿忆艺议亦屹异役抑译邑易绎诣疫益谊翌逸意溢肄裔毅翼臆"};
code char PY_mb_yin[]   ={"因阴姻茵荫音殷吟寅淫银尹引饮隐印"};
code char PY_mb_ying[]  ={"应英婴缨樱鹰迎盈荧莹萤营蝇赢颖影映硬"};
code char PY_mb_yo[]    ={"哟"};
code char PY_mb_yong[]  ={"佣拥痈庸雍臃永咏泳勇涌恿蛹踊用"};
code char PY_mb_you[]   ={"优忧幽悠尤由犹邮油铀游友有酉又右幼佑诱釉"};
code char PY_mb_yu[]    ={"迂淤渝于予余盂鱼俞娱渔隅愉逾愚榆虞舆与宇屿羽雨禹语玉驭芋育郁狱峪浴预域欲喻寓御裕遇愈誉豫"};
code char PY_mb_yuan[]  ={"冤鸳渊元员园垣原圆袁援缘源猿辕远苑怨院愿"};
code char PY_mb_yue[]   ={"乐曰约月岳悦阅跃粤越钥"};
code char PY_mb_yun[]   ={"云匀郧耘允陨孕运晕酝韵蕴"};
code char PY_mb_za[]    ={"匝杂砸咋"};
code char PY_mb_zai[]   ={"灾哉栽宰载再在仔"};
code char PY_mb_zan[]   ={"咱攒暂赞"};
code char PY_mb_zang[]  ={"赃脏葬"};
code char PY_mb_zao[]   ={"遭糟凿早枣蚤澡藻灶皂造噪燥躁"};
code char PY_mb_ze[]    ={"则择泽责"};
code char PY_mb_zei[]   ={"贼"};
code char PY_mb_zen[]   ={"怎"};
code char PY_mb_zeng[]  ={"增憎赠"};
code char PY_mb_zha[]   ={"喳渣扎札轧闸铡眨乍诈炸榨柞"};
code char PY_mb_zhai[]  ={"斋摘宅翟窄债寨"};
code char PY_mb_zhan[]  ={"沾毡粘詹瞻斩展盏崭辗占战栈站绽湛蘸"};
code char PY_mb_zhang[] ={"长张章彰漳樟涨掌丈仗帐杖胀账障瘴"};
code char PY_mb_zhao[]  ={"招昭找沼召兆赵照罩肇爪"};
code char PY_mb_zhe[]   ={"遮折哲蛰辙者锗这浙蔗着"};
code char PY_mb_zhen[]  ={"贞针侦珍真砧斟甄臻诊枕疹阵振镇震帧"};
code char PY_mb_zheng[] ={"争征怔挣狰睁蒸拯整正证郑政症"};
code char PY_mb_zhi[]   ={"之支汁芝吱枝知织肢脂蜘执侄直值职植殖止只旨址纸指趾至志制帜治炙质峙挚秩致掷痔窒智滞稚置"};
code char PY_mb_zhong[] ={"中忠终盅钟衷肿种仲众重"};
code char PY_mb_zhou[]  ={"州舟诌周洲粥轴肘帚咒宙昼皱骤"};
code char PY_mb_zhu[]   ={"朱诛株珠诸猪蛛竹烛逐主拄煮嘱瞩住助注贮驻柱祝著蛀筑铸"};
code char PY_mb_zhua[]  ={"抓"};
code char PY_mb_zhuai[] ={"拽"};
code char PY_mb_zhuan[] ={"专砖转撰篆"};
code char PY_mb_zhuang[]={"妆庄桩装壮状幢撞"};
code char PY_mb_zhui[]  ={"追椎锥坠缀赘"};
code char PY_mb_zhun[]  ={"谆准"};
code char PY_mb_zhuo[]  ={"卓拙捉桌灼茁浊酌啄琢"};
code char PY_mb_zi[]    ={"孜兹咨姿资淄滋籽子紫滓字自渍"};
code char PY_mb_zong[]  ={"宗综棕踪鬃总纵"};
code char PY_mb_zou[]   ={"邹走奏揍"};
code char PY_mb_zu[]    ={"租足卒族诅阻组祖"};
code char PY_mb_zuan[]  ={"赚纂钻"};
code char PY_mb_zui[]   ={"嘴最罪醉"};
code char PY_mb_zun[]   ={"尊遵"};
code char PY_mb_zuo[]   ={"昨左佐作坐座做"};
code char PY_mb_space[] ={""};
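
A candidate table is just a string of characters, so an input method has to step through it one character at a time. The sketch below assumes the tables are compiled as GB2312, where every Chinese character occupies exactly two bytes; that encoding is an assumption about the build, and `candidateCount` and `candidateAt` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Number of two-byte characters in a GB2312-encoded candidate table.
std::size_t candidateCount(const char *mb)
{
    return std::strlen(mb) / 2;
}

// Hypothetical helper: return the n-th candidate (0-based) as a two-byte
// string, or an empty string when n is out of range.
std::string candidateAt(const char *mb, std::size_t n)
{
    if (n >= candidateCount(mb))
        return std::string();
    return std::string(mb + 2 * n, 2);
}
```

With the table for "a" written as raw GB2312 bytes (`"\xB0\xA2\xB0\xA1"`, that is, 阿 followed by 啊), `candidateAt(table, 1)` yields the two bytes of 啊.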




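Fuzzy Homophone Queries Using the Pinyin Input Method Dictionary
By Wang Shouyin


    Most management applications need query features, and finding the matching records accurately and quickly is the core of any of them. In practice, queries are usually implemented by comparing and testing character strings. While developing a personnel management system, we needed a new kind of query: knowing only how a person's name is pronounced, not how each character is written, the user should be able to search the database and see every record whose name matches that pronunciation. Chinese has a large number of homophones, so plain string comparison falls short: a search for "李晓军" (Li Xiaojun) will not find a record stored as "李小君" (also read Li Xiaojun). To solve this, the author used the Input Method Generator that ships with Windows to turn the system's pinyin input method dictionary into a pinyin lookup table, and with that table implemented pronunciation-based fuzzy queries in a personnel management system written in VFP. The approach is as follows: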


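---- 1. Generate a pinyin dictionary lookup table


---- In Windows, choose Start > Programs > Accessories > Input Method Generator (输入法生成器). In the generator window, click the Reverse Conversion (逆转换) tab, click the Open File button, and select WINPY.MB in the WINDOWS\SYSTEM folder on the hard disk. Enter C:\WINPY.TXT as the code table source file (码表原文件), then click Reverse Conversion. The system converts the full-pinyin dictionary and produces a plain text file; a short program can then turn this text file into a pinyin dictionary lookup table.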


---- 2. In VFP, write a generation program named ZH.PRG


---- The program reads as follows:


CREA TABL B1 (NR C(60),HZ C(2),PY1 C(12),PY2 C(12))   && create a temporary table
USE B1                                                && open the new table
APPE FROM C:\WINPY.TXT SDF                            && load WINPY.TXT, the dictionary text produced by the Input Method Generator
DELE FOR ASC(SUBS(NR,3,1))>=128                       && mark every multi-character phrase, keeping only single characters
DELE FOR RECN()<13                                    && mark the header records of the code table
PACK                                                  && physically remove the marked records
* Split each line into the character and its pinyin. There are two pinyin
* fields; the second holds an additional reading when the line lists one.
REPL HZ WITH SUBS(NR,1,2),PY1 WITH SUBS(NR,3,AT(' ',NR)-2),;
     PY2 WITH SUBS(NR,AT(' ',NR)+1) ALL
* To accommodate users with southern accents, fold zh, ch, sh into z, c, s.
REPL PY1 WITH 's'+SUBS(PY1,3) FOR "sh"$PY1
REPL PY1 WITH 'c'+SUBS(PY1,3) FOR "ch"$PY1
REPL PY1 WITH 'z'+SUBS(PY1,3) FOR "zh"$PY1
REPL PY2 WITH 's'+SUBS(PY2,3) FOR "sh"$PY2
REPL PY2 WITH 'c'+SUBS(PY2,3) FOR "ch"$PY2
REPL PY2 WITH 'z'+SUBS(PY2,3) FOR "zh"$PY2
COPY TO PYZDK FIEL HZ,PY1,PY2                         && write the pinyin lookup table
USE                                                   && close the open table
ERASE 'B1.DBF'                                        && delete the temporary table


---- Running this program from the VFP Command window generates the pinyin lookup table, named PYZDK.DBF.
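
The accent-tolerance step in the program (folding zh, ch, sh into z, c, s) is easy to restate outside VFP. The following is a minimal sketch of that one rule; `foldRetroflex` is a hypothetical name, and the function assumes its input is a lowercase pinyin syllable.

```cpp
#include <cassert>
#include <string>

// Fold the initials zh/ch/sh into z/c/s so that speakers who do not
// distinguish them still get a match. Other syllables are returned unchanged.
std::string foldRetroflex(const std::string &py)
{
    if (py.size() >= 2 && py[1] == 'h' &&
        (py[0] == 'z' || py[0] == 'c' || py[0] == 's'))
        return py.substr(0, 1) + py.substr(2);
    return py;
}
```

Applying the same fold to the reading of the query character keeps both sides of the comparison consistent.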


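---- 3. Create a new form


---- Add two tables to the data environment: 人员情况表 (the personnel table) and PYZDK. The personnel table needs a field holding the names to be searched, called NAME, and a flag field called BZW.


---- On the form, create a LABEL1 object that shows the prompt "请输入要查询的姓名" ("Enter the name to search for"); a text box for entering the query text; a grid object that displays the contents of the personnel table; and two command buttons. COMMAND2 runs the query procedure and COMMAND1 closes the query window.


---- Set the COMMAND1 button's properties as follows: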


COMMAND1.CAPTION='退出'          && "Exit"
In the CLICK event of the COMMAND1 button, write the code:
THISFORM.RELEASE
Set the COMMAND2 button's properties as follows:
COMMAND2.CAPTION='开始查询'      && "Start query"
In the CLICK event of the COMMAND2 button, write the code:
IF THIS.CAPTION='开始查询'
   THIS.CAPTION='恢复数据'       && "Restore data"
   SRNR=ALLT(THISFORM.TEXT1.VALUE)
   ZDD=LEN(SRNR)/2               && number of characters typed (two bytes each)
   DIME HH(100)
   HH=''
   * For each typed character, collect all of its homophones into HH(I)
   FOR I=1 TO LEN(SRNR)/2
      ABC=SUBS(SRNR,I*2-1,2)
      SELE PYZDK
      SET FILT TO
      LOCA FOR ABC$HZ
      PYNR1=PY1
      PYNR2=PY2
      IF PYNR2<>' '
         SET FILT TO PY1=PYNR1 OR PY2=PYNR2
      ELSE
         SET FILT TO PY1=PYNR1
      ENDIF
      GO TOP
      DO WHIL NOT EOF()
         HH(I)=HH(I)+HZ
         SKIP
      ENDDO
   ENDFOR
   * Flag every personnel record whose name matches position by position
   SELE 人员情况表
   SET FILT TO
   REPL BZW WITH '' ALL
   GO TOP
   DO WHIL NOT EOF()
      JSQ=0
      FOR I=1 TO LEN(ALLT(NAME))/2
        IF SUBS(NAME,I*2-1,2)$HH(I)
          JSQ=JSQ+1
        ENDIF
      ENDFOR
      IF JSQ=LEN(ALLT(NAME))/2 OR JSQ>=ZDD
         REPL BZW WITH '*'
      ENDIF
      SKIP
   ENDDO
   SET FILT TO BZW="*"
   THISFORM.REFRESH
ELSE
   THIS.CAPTION='开始查询'
   SELE 人员情况表
   SET FILT TO
   GO TOP
   THISFORM.REFRESH
ENDIF
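---- Once everything is set up, run the form, type the search text into the text box, and click the query button; the grid then shows every record that matches the search text. Clicking the same button again restores the full data, and clicking the exit button closes the form.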


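---- 4. How the query works


---- Type the search text into the input box provided by the form. Suppose we are searching a personnel table for someone whose name sounds like "李晓军". The program first locates the first input character in the generated pinyin lookup table, then uses that character's reading to filter the table so that only characters with the same reading remain. A loop concatenates all of these homophones into one string and stores it in a variable. The remaining input characters are handled the same way, so the program ends up with one string variable per input character. It then scans the table being queried, splits each person's name into characters, and tests each character against the corresponding string. A record satisfies the query only if every character of the name is found in its corresponding string; such a record is flagged, and the scan moves on to the next record until the end of the table is reached. Finally, all flagged records are displayed, which gives a pronunciation-based fuzzy query.


---- This has been a brief introduction to one way of implementing pronunciation-based fuzzy queries in VFP. The program does no error handling and is offered only as an approach; readers can build on the principle above to implement whatever form of pronunciation-based fuzzy query their own work requires.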
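
The matching principle described above can be restated compactly in C++. This is a simplified sketch rather than a port of the VFP form: it compares readings directly instead of building homophone strings, it requires the query and the stored name to have the same length, and it assumes one reading per character. `Name`, `PinyinDict`, `soundsLike` and `sampleDict` are hypothetical names, and the tiny dictionary is hand-made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

typedef std::vector<std::string> Name;                  // one element per character
typedef std::map<std::string, std::string> PinyinDict;  // character -> pinyin

// True when query and name have the same length and, position by position,
// both characters are in the dictionary with the same reading.
bool soundsLike(const PinyinDict &dict, const Name &query, const Name &name)
{
    if (query.size() != name.size())
        return false;
    for (Name::size_type i = 0; i < query.size(); ++i)
    {
        PinyinDict::const_iterator q = dict.find(query[i]);
        PinyinDict::const_iterator n = dict.find(name[i]);
        if (q == dict.end() || n == dict.end() || q->second != n->second)
            return false;
    }
    return true;
}

// Hand-made sample data standing in for the generated lookup table.
PinyinDict sampleDict()
{
    PinyinDict d;
    d["李"] = "li";
    d["晓"] = "xiao";
    d["小"] = "xiao";
    d["军"] = "jun";
    d["君"] = "jun";
    d["王"] = "wang";
    return d;
}
```

With this data, the query 李/晓/军 matches a stored 李/小/君 but not 王/小/君.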


 


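/*
The pinyin input method on mobile phones is quite "smart": from a sequence of digit keys
it automatically finds the letter combinations that form valid pinyin.
The keys map as 2:abc, 3:def, 4:ghi, 5:jkl, 6:mno, 7:pqrs, 8:tuv, 9:wxyz.


Write a program that, for an input digit sequence, finds the matching letter combinations
and outputs them as pinyin.
If there is more than one match, output them sorted alphabetically.
*/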


#pragma warning(disable:4786)


#include <cstdio>
#include <cstdlib>   // system
#include <cstring>   // strlen, strcmp
#include <cassert>
#include <vector>
#include <algorithm>
#include <string>


using namespace std;


class NameList
{
public:
    void init()
    {
        string a;
        for (int j = 0; ym[j]; j++)
        {
            a = ym[j];
            mStrs.push_back(a);
            for (int i = 0; sm[i]; i++)
            {
                a = sm[i];
                a += ym[j] ;
                mStrs.push_back(a);
            }
        }
    }
    void load(const char *file)
    {
        char buf[128];
        char buf2[1024];
        string a;
        mStrs.clear();
        mStrGBs.clear();
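        FILE *fp = fopen(file,"rt");
        if (fp == NULL)    // nothing to load if the file cannot be opened
            return;
        // field widths keep fscanf from overrunning buf and buf2
        while(fscanf(fp, "%127s %1023s", buf, buf2) == 2)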
        {
            a = buf;
            mStrs.push_back(a);
            a = buf2;
            mStrGBs.push_back(a);
        }
        fclose(fp);
    }
    void show()
    {
        for (vector< string >::iterator s = mStrs.begin(); s != mStrs.end(); ++s)
        {
            printf("%s ", s->c_str());
        }
    }
    vector< string > mStrs;
    vector< string > mStrGBs;
private:
    static const char *sm[];
    static const char *ym[];
};


const char *NameList::sm[]=
{
    "b","p","m","f","d", "t","n","l","g","k",
    "h","j","q","x","z", "c","s","zh","ch","sh",
    "r","y","w",
    NULL,
};
const char *NameList::ym[]=
{
    "a","ai","ao","an","ang",
    "o","ou","ong",
    "e","ei","er","en","eng",
    "i","iang","ian","iao","in","ing","iu","ia","ie","iong",
    "u","uo","uang","un","uai","uan",
    "ui","ue","ua",
    NULL,
};
class Digit2PinyinConverter
{
public:
    struct StrCmpFunctor
    {
        // order by length first, then alphabetically
        bool operator () (const char *s1, const char *s2) const
        {
            int len1 = strlen(s1);
            int len2 = strlen(s2);
            if (len1 != len2)
                return len1 < len2;
            return strcmp(s1, s2) < 0;
        }
    };
    void init()
    {
        validStrings.clear();
        vector< string >::iterator name = namelist.mStrs.begin();
        while(name != namelist.mStrs.end())
        {
            validStrings.push_back(name->c_str());
            ++name;
        }
        digits = "";
        sort(validStrings.begin(), validStrings.end(), StrCmpFunctor());
    }
    /// @param digit a key character in the range '2'..'9'
    void push(char digit)
    {
        if (digit < '2' || digit > '9')
            return;
        digits += digit;
        vector< const char * > next;
        vector< const char * >::iterator name = validStrings.begin();
        while(name != validStrings.end())
        {
            if (match(*name, digits.size() - 1, digit))
            {
                next.push_back(*name);
            }
            ++name;
        }
        validStrings = next;
        sort(validStrings.begin(), validStrings.end(), StrCmpFunctor());
    }


    void show()
    {
        printf("+-------+-------+-------+\n");
        printf("|       |2 abc  |3 def  |\n");
        printf("+-------+-------+-------+\n");
        printf("|4 ghi  |5 jkl  |6 mno  |\n");
        printf("+-------+-------+-------+\n");
        printf("|7 pqrs |8 tuv  |9 wxyz |\n");
        printf("+-------+-------+-------+\n");
        printf("|-<prev>|0<back>|+<next>|\n");
        printf("+-------+-------+-------+\n");
    }
    string digits;
    vector< const char * > validStrings;
    static NameList namelist;
private: 
    bool match(const char *name, int pos, char digit)
    {
        assert(digit >= '0' && digit <= '9');
        int len = strlen(name);
        if (len <= pos)
            return false;
        char d = name[pos];
        const char *maplist = map[digit - '0'];
        for (; *maplist; maplist++)
        {
            if (d == *maplist)
                return true;
        }
        return false;
    }
    static const char *map[10];
};
const char *Digit2PinyinConverter::map[10]=
{
    "", "","abc","def","ghi","jkl","mno","pqrs","tuv","wxyz"
};
NameList Digit2PinyinConverter::namelist;


int main()
{
    char c = '0';
    int page = 0;
    int i;


    Digit2PinyinConverter converter;
    Digit2PinyinConverter::namelist.load("pinyin2.txt");
    converter.init();
    do
    {
        if (c == '\n')
            continue;
        if (c == '0')
            converter.init();
        if (c == '+')
        {
            page++;
            int maxpage = ((int)converter.validStrings.size() + 9)/10;
            // pages are numbered from 0, so the last valid page is maxpage - 1
            if (page >= maxpage)
                page = maxpage > 0 ? maxpage - 1 : 0;
        }
        if (c == '-' && page > 0)
            page--;
        if (c >= '2' && c <= '9')
        {
            page = 0;
            converter.push(c);
        }
       
        system("cls");
        converter.show();
        for (i = 0; i < (int)converter.digits.size(); i++)
            printf("%c", converter.digits[i]);
        printf("\n");
        for (i=0;i<10 && i + page * 10 < (int)converter.validStrings.size();i++)
        {
            if (strlen(converter.validStrings[i + page * 10]) > strlen(converter.digits.c_str()))
                break;
            printf("%d:%s ", i, converter.validStrings[i + page * 10]);
        }
        printf("\nPage %d, Total %d\n", page, (int)converter.validStrings.size());
    } while(scanf("%c", &c) == 1);


    return 0;
}
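
The listing above narrows the candidate list one key press at a time. Another way to look at the same problem, shown here only as a sketch, is to convert each syllable to its digit string and compare whole strings. The function names `keyFor`, `toDigits`, and `candidates` are made up for this example, and the syllables are assumed to consist of lowercase ASCII letters only.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Map one lowercase letter to its keypad digit ('2'..'9').
char keyFor(char letter)
{
    static const char* keys[] = {"abc", "def", "ghi", "jkl",
                                 "mno", "pqrs", "tuv", "wxyz"};
    assert(letter >= 'a' && letter <= 'z');
    for (int k = 0; k < 8; ++k)
        for (const char* p = keys[k]; *p; ++p)
            if (*p == letter)
                return static_cast<char>('2' + k);
    return '?';  // not reached for 'a'..'z'
}

// Convert a whole syllable to the digit sequence that types it.
std::string toDigits(const std::string& syllable)
{
    std::string digits;
    for (char ch : syllable)
        digits += keyFor(ch);
    return digits;
}

// Return, in alphabetical order, the syllables typed by exactly `digits`.
std::vector<std::string> candidates(const std::vector<std::string>& syllables,
                                    const std::string& digits)
{
    std::vector<std::string> out;
    for (const std::string& s : syllables)
        if (toDigits(s) == digits)
            out.push_back(s);
    std::sort(out.begin(), out.end());
    return out;
}
```

For example, `toDigits("zhong")` gives "94664", and with the syllable list ni, mi, li, ma the digits "64" select mi and ni. Unlike the listing above, this sketch only returns syllables whose length equals the number of digits typed; it does not keep longer syllables as prefixes for later key presses.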




Trackback: http://tb.blog.csdn.net/TrackBack.aspx?PostId=1561644
