My research project has been supported by the National Basic Research Program of China (973), “Basic research on standards of the syndrome defined by Chinese medicine and its correlation with disease and formulas”. The main task is to analyze the complicated correlations between physicochemical parameters and syndromes in traditional Chinese medicine (TCM), and to discover the symptoms relevant to each syndrome using entropy-based intelligent calculation methods. In line with the project requirements, I have completed the following work.

1. The application of entropy in discretization and association analysis

Entropy-based mutual information is a nonlinear measure of association that is widely applied in theory and practice. However, assuming that the variables follow a normal distribution is questionable, particularly with small samples. To address this, we discretize the variables using rough sets and entropy while preserving their original information, and then calculate the correlation between the resulting categorical variables. In this way, we not only reduce the computational cost but also avoid the error introduced by the normality assumption.

2. The application of entropy in syndrome research

Many approaches to clustering have been introduced, but there is almost no research on how to determine the cluster size (the number of elements in a cluster). To address this, we propose a technique, based on contribution degree, that adaptively selects the number of symptoms in each syndrome. We apply this method to data on depression, chronic renal failure (CRF) and chronic hepatitis B to retrieve TCM syndromes. The method offers a new line of thought for syndrome standardization.

3. The application of entropy in feature selection

The four diagnostic methods of TCM yield too many symptoms in clinical practice. It is difficult to collect all of them, and doing so does not benefit syndrome differentiation.
In this work, feature selection based on mutual information is studied to select an optimal subset of symptoms, and the selected subset is fed to an SVM classifier for objective syndrome differentiation. An incremental SVM learning algorithm is introduced to address the difficulty of training the classifier as clinical data accumulate.
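
The idea of ranking symptoms by their mutual information with a syndrome label can be illustrated with a minimal sketch. It is not the algorithm developed in the project: it assumes the symptoms are already discrete (here binary), scores each symptom independently without accounting for redundancy between symptoms, and omits the SVM step. The function names and the toy data are hypothetical and made up for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def rank_symptoms(symptoms, labels, k):
    """Return the k symptom names with the highest MI with the labels."""
    scores = {name: mutual_information(col, labels)
              for name, col in symptoms.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy, made-up data: 1 = symptom present, 0 = absent (one entry per patient).
symptoms = {
    "fatigue":   [1, 1, 1, 0, 0, 0, 1, 0],
    "insomnia":  [1, 0, 1, 0, 1, 0, 1, 0],
    "dry_mouth": [1, 1, 0, 0, 1, 1, 0, 0],
}
labels = [1, 1, 1, 0, 0, 0, 1, 0]  # hypothetical syndrome label per patient

selected = rank_symptoms(symptoms, labels, k=2)
print(selected)
```

In this toy data, "fatigue" coincides with the label and so carries the full 1 bit of information, while "dry_mouth" is statistically independent of the label and scores 0. In practice the selected columns would then be passed to a classifier such as an SVM.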