8. Automatic evaluation of analysis results
"Self-knowledge": "To know what you know, and to know what you do not know, that is true knowledge."
In natural language understanding, "self-knowledge" means correctly indicating which analysis results are free of doubtful points.
Based on the sentence-category criteria and association testing, ASC can itself evaluate the degree to which a sentence satisfies those criteria. It can report where the criteria are satisfied well, where they are satisfied less well, and where they are not satisfied.
ASC has three levels of self-knowledge: sentence level, chunk level, and word level.
I think self-knowledge is a form of intelligence that a machine needs, and one that a machine can have.
9. Comparison of ASC and syntax parsing
[Diagram. Surviving labels: sentence, concepts, words, tagging, syntax tree, semantic analysis, phrase, bottom-up, concept association, association veins, concept activation, sentence category, chunk, top-down, conceptual space, linguistic space]
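13. Algorithms under study
a Local composition processing inside semantic chunks
b Proper name recognition:
   Existing basis: a standalone, rule-based recognition algorithm
   Further work: add expectation-driven capability
c Discovery of the various kinds of dynamic words: exploit the property of Chinese that "character meanings act as primitives and word meanings are compositions of them"
d Recovery of ellipsis and resolution of reference
e Confirmation of inter-sentence relations
Difficulties covered: 11, 15, 18, 19, 20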
14. Semantic distance computing
Semantic distance computing (SDC) computes the semantic distance between two concepts. Here are the meanings of two words as represented in concept symbols:
光荣 (glory): uga00+ugc01+rd00aem1+u71381
任务 (mission, task): (ga00,rc01)
SDC compares the letters and digits in the concept representations. The more symbols two concepts share, the smaller their semantic distance and the more closely they are related.
In this way, expertise in string comparison becomes, through SDC, expertise in semantic (or association) computing.
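The slide does not give the actual SDC formula, so the following is only a minimal sketch of the stated idea (more shared letters and digits means a smaller distance). It assumes a Dice-style overlap over the multiset of alphanumeric symbols; the function names `symbols`, `symbol_overlap`, and `semantic_distance` are hypothetical, and the real SDC may weight symbols by position or hierarchy level.

```python
from collections import Counter


def symbols(concept: str) -> Counter:
    """Count the letters and digits in a concept string, ignoring '+', ',', '(' and ')'."""
    return Counter(ch for ch in concept if ch.isalnum())


def symbol_overlap(a: str, b: str) -> float:
    """Dice coefficient over the two symbol multisets (an assumed measure, not the slide's)."""
    ca, cb = symbols(a), symbols(b)
    shared = sum((ca & cb).values())  # multiset intersection: symbols present in both
    total = sum(ca.values()) + sum(cb.values())
    return 2 * shared / total if total else 0.0


def semantic_distance(a: str, b: str) -> float:
    """Map overlap to a distance in [0, 1]: identical symbol sets give 0, disjoint ones give 1."""
    return 1.0 - symbol_overlap(a, b)


# Concept strings taken from the slide.
GLORY = "uga00+ugc01+rd00aem1+u71381"   # 光荣 (glory)
MISSION = "(ga00,rc01)"                 # 任务 (mission, task)

if __name__ == "__main__":
    print(semantic_distance(GLORY, MISSION))
```

Treating the concept as a bag of characters is the crudest reading of "compare the letters and digits"; it ignores symbol order, which a faithful implementation would probably need to respect.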
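24. Engineering aspects
a Robustness research:
   Adapt to large-scale analysis of news-style corpora
   10G corpus, including the 1994 People's Daily, online news, and other sources
b Portability research:
   The system can run on Windows, Linux, Unix, and other operating systems
c Standardization research:
   Input may be in GB, Big5, Unicode, and other encodings
   Output uses internationally standard XML markup, which gives good extensibility
d Mutual adaptation with the knowledge base, so that the two develop and improve together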
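The standardization item (c) can be illustrated with a small sketch: decode input from a declared encoding and emit XML. This is not the system's actual I/O layer. The codec mapping (in particular reading "Unicode" as UTF-8), the function name `to_xml`, and the element and attribute names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the slide's encoding labels to Python codec names.
# "Unicode" is assumed to mean UTF-8 here; the slide does not say which form is used.
CODECS = {"GB": "gb18030", "Big5": "big5", "Unicode": "utf-8"}


def to_xml(raw: bytes, encoding_label: str) -> str:
    """Decode raw input bytes and wrap the text in a minimal XML document."""
    text = raw.decode(CODECS[encoding_label])
    root = ET.Element("analysis")  # hypothetical element name
    sentence = ET.SubElement(
        root, "sentence", {"source-encoding": encoding_label}
    )
    sentence.text = text
    # encoding="unicode" makes tostring return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_xml("光荣的任务".encode("gb18030"), "GB"))
```

Normalizing every input to one internal text form before analysis keeps the analyzer itself independent of the source encoding.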