BioInfoBank Library


Mi, H (Hong)

Latest papers:

BMC Bioinformatics. 2009 ;10 Suppl 1 :S22 19208122 (P,S,G,E,B) Cited:1
Automation Department, Xiamen University, Xiamen, 361005, P.R.C. yang@xmu.edu.cn
BACKGROUND: Most machine-learning classifiers output label predictions for new instances without indicating how reliable those predictions are. This limits their applicability in critical domains such as medical diagnosis, where incorrect predictions have serious consequences. Further, the default assumption of equal misclassification costs is most likely violated in medical diagnosis. RESULTS: In this paper, we present a modified random forest classifier that is incorporated into the conformal predictor scheme. A conformal predictor is a transductive learning scheme that uses Kolmogorov complexity to test the randomness of a particular sample with respect to the training sets. Our method is well calibrated: the performance can be set prior to classification, and the accuracy rate is exactly equal to the predefined confidence level. Further, to address the cost-sensitive problem, we extend our method to a label-conditional predictor, which takes into account the different costs of misclassification in different classes and allows a different confidence level to be specified for each class. Intensive experiments on benchmark datasets and real-world applications show that the resultant classifier is well calibrated and able to control the specific risk of each class. CONCLUSION: Using the RF outlier measure to design a nonconformity measure benefits the resultant predictor. Further, a label-conditional classifier is developed and turns out to be an alternative approach to the cost-sensitive learning problem, one that relies on label-wise predefined confidence levels. The target of minimizing the risk of misclassification is achieved by specifying a different confidence level for each class.
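The label-conditional idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: it assumes an inductive (split) conformal predictor rather than the transductive scheme, and it swaps the random forest outlier measure for a hypothetical nonconformity score, the Euclidean distance to the class mean. All function names and the toy data are illustrative assumptions.

```python
import math

def class_means(X, y):
    """Mean feature vector of each class in the proper training set."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        counts[label] = counts.get(label, 0) + 1
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def nonconformity(x, label, means):
    # Hypothetical score (assumption): distance to the class mean.
    # The paper uses a random forest outlier measure instead.
    return math.dist(x, means[label])

def calibrate(X_cal, y_cal, means):
    """Nonconformity scores of the calibration set, grouped by true label."""
    scores = {}
    for x, label in zip(X_cal, y_cal):
        scores.setdefault(label, []).append(nonconformity(x, label, means))
    return scores

def p_value(x, label, means, cal_scores):
    """Label-conditional p-value: compare only with calibration samples of that label."""
    a = nonconformity(x, label, means)
    same = cal_scores[label]
    return (sum(1 for s in same if s >= a) + 1) / (len(same) + 1)

def predict_set(x, means, cal_scores, significance):
    """Keep every label whose p-value exceeds that label's own significance level."""
    return {lab for lab in means
            if p_value(x, lab, means, cal_scores) > significance[lab]}

# Toy data (illustrative only): two well-separated classes.
X_train = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
X_cal = [(0.5, 0.4), (0.2, 0.2), (0.9, 0.9), (1.5, 1.5),
         (10.5, 10.4), (10.2, 10.2), (10.9, 10.9), (11.5, 11.5)]
y_cal = [0, 0, 0, 0, 1, 1, 1, 1]

means = class_means(X_train, y_train)
cal = calibrate(X_cal, y_cal, means)
x_new = (0.5, 0.7)
strict = predict_set(x_new, means, cal, {0: 0.25, 1: 0.25})
lenient_for_1 = predict_set(x_new, means, cal, {0: 0.25, 1: 0.1})
```

Lowering the significance level for one class (here class 1) makes the predictor more reluctant to rule that class out, which is how per-class confidence levels can stand in for explicit misclassification costs.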
Spectrochim Acta A Mol Biomol Spectrosc. 2007 Nov 7;: 18155640 (P,S,G,E,B,D)
Near-infrared (NIR) spectroscopy, in combination with chemometrics, enables nondestructive analysis of solid samples without time-consuming sample preparation. A new method is investigated for the nondestructive determination of compound amoxicillin powder drug via NIR spectroscopy combined with an improved neural network model based on principal component analysis (PCA) and radial basis function (RBF) neural networks. PCA is applied to extract relevant features from the large volume of spectral data in order to reduce the number of input variables of the RBF neural networks. Various optimum principal component analysis-radial basis function (PCA-RBF) network models based on conventional spectra and preprocessed spectra (standard normal variate (SNV) and multiplicative scatter correction (MSC)) have been established and compared. Principal component regression (PCR) and partial least squares (PLS) multivariate calibrations are also used and compared with the PCA-RBF neural networks. Experimental results show that the proposed PCA-RBF method is more efficient than the PCR and PLS multivariate calibrations, and that the PCA-RBF approach with SNV-preprocessed spectra provides the best performance.
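The PCA-RBF pipeline described above can be sketched in a much reduced form. This is an illustration under stated assumptions, not the authors' model: it keeps only the first principal component (found by power iteration), places one Gaussian unit on each calibration sample, and solves for the output weights with a small ridge term. The function names, the `width` and `ridge` parameters, and the synthetic spectra are all hypothetical.

```python
import math

def first_component(X, iters=50):
    """Column means and unit loading vector of the first principal component."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - means[j] for j in range(d)] for row in X]
    # Start from the largest centered row (assumes the spectra are not all identical).
    w = max(Xc, key=lambda r: sum(v * v for v in r))
    for _ in range(iters):  # power iteration on Xc^T Xc
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        scores = [sum(a * b for a, b in zip(row, w)) for row in Xc]
        w = [sum(s * row[j] for s, row in zip(scores, Xc)) for j in range(d)]
    norm = math.sqrt(sum(v * v for v in w))
    return means, [v / norm for v in w]

def score(x, means, w):
    """Project one spectrum onto the first principal component."""
    return sum((v - m) * c for v, m, c in zip(x, means, w))

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def gaussian(s, c, width):
    return math.exp(-((s - c) ** 2) / (2 * width ** 2))

def fit(X, y, width, ridge=1e-10):
    """Reduce each spectrum to one PCA score, then fit Gaussian RBF output weights."""
    means, w = first_component(X)
    centers = [score(x, means, w) for x in X]  # one unit per calibration sample
    K = [[gaussian(s, c, width) + (ridge if i == j else 0.0)
          for j, c in enumerate(centers)] for i, s in enumerate(centers)]
    return means, w, centers, solve(K, y)

def predict(model, x, width):
    means, w, centers, weights = model
    s = score(x, means, w)
    return sum(a * gaussian(s, c, width) for a, c in zip(weights, centers))

# Synthetic "spectra" (assumption): absorbance proportional to concentration.
profile = [1.0, 2.0, 2.0]
conc = [1.0, 2.0, 3.0, 4.0, 5.0]
X = [[c * p for p in profile] for c in conc]
model = fit(X, conc, width=3.0)
fitted = [predict(model, x, width=3.0) for x in X]
```

A realistic model would keep several principal components and choose the number of RBF units and their width by validation; the single-component version is used here only to keep the sketch short.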

Most cited papers:

BMC Bioinformatics. 2009 ;10 Suppl 1 :S22 19208122 (P,S,G,E,B) Cited:1
Automation Department, Xiamen University, Xiamen, 361005, P.R.C. yang@xmu.edu.cn
Eur J Pharm Sci. 2007 Mar 19;: 17449230 (P,S,G,E,B,D) Cited:1
Department of Chemistry, College of Chemistry, Jilin University, Changchun 130021, China.
A new assay method is investigated for the nondestructive determination of erythromycin ethylsuccinate powder drug via short-wave near-infrared (NIR) spectroscopy combined with radial basis function (RBF) neural networks. Modern near-infrared spectroscopy is efficient, simple and nondestructive, and has been used for chemical analysis in diverse fields. Short-wave NIR is a more rapid, flexible, and cost-effective method for controlling product concentration in the pharmaceutical industry. RBF neural networks are local approximation networks with advantages in function approximation and learning speed. In addition, the structure of RBF networks is simple. Estimation and calibration of the sample concentration via short-wave NIR are carried out with the aid of RBF models based on conventional spectra, standard normal variate (SNV) spectra, multiplicative scatter correction (MSC) spectra and first-derivative spectra. Various optimum models are established and compared. Experimental results show that the models based on SNV spectra perform better. For the optimized RBF neural network model after SNV treatment, the root-mean-square errors (RMSE) for the calibration set and test set were 0.3266% and 0.5244%, respectively, and the correlation coefficients (R) for the calibration set and test set were 0.9942 and 0.9852, respectively. The proposed RBF method based on short-wave NIR is more valuable and economical for quantitative analysis than traditional methods such as partial least squares (PLS).
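The preprocessing steps and figures of merit named in this abstract follow standard definitions, sketched below under stated assumptions: SNV uses the sample standard deviation of each spectrum, MSC regresses each spectrum on the mean spectrum of the set, and the function names and example numbers are illustrative rather than taken from the paper.

```python
import math
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: center and scale each spectrum by its own statistics."""
    m, s = mean(spectrum), stdev(spectrum)  # sample standard deviation (n - 1)
    return [(v - m) / s for v in spectrum]

def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    ref = [mean(col) for col in zip(*spectra)]
    ref_mean = mean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for spec in spectra:
        spec_mean = mean(spec)
        # Least-squares fit: spec ≈ intercept + slope * ref
        slope = sum((r - ref_mean) * (v - spec_mean) for r, v in zip(ref, spec)) / sxx
        intercept = spec_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in spec])
    return corrected

def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson_r(a, b):
    """Correlation coefficient R between reference and predicted values."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

# Illustrative spectra: the second is an offset and scaled copy of the first.
raw = [[1.0, 2.0, 3.0], [3.0, 5.0, 7.0]]
snv_rows = [snv(row) for row in raw]
msc_rows = msc(raw)
```

Both corrections remove additive offsets and multiplicative scaling between spectra, so the two example rows become identical after either treatment; they differ in that SNV uses only the spectrum itself while MSC needs a reference spectrum.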