In silico prediction of toxicity of phenols to Tetrahymena pyriformis by using genetic algorithm and decision tree-based modeling approach
Abstract
Risk assessment of chemicals is an important issue in environmental protection; however, experimental data are lacking for a large number of end-points. Experimental determination of the toxicity of chemicals is costly and time-consuming. In silico tools such as quantitative structure–toxicity relationship (QSTR) models, which are constructed on the basis of computational molecular descriptors, can predict missing data for toxic end-points for existing or even not yet synthesized chemicals. Phenol derivatives are known to be aquatic pollutants. Against this background, we aimed to develop an accurate and reliable QSTR model for predicting the toxicity of 206 phenols to Tetrahymena pyriformis. A multiple linear regression (MLR)-based QSTR was obtained using a powerful descriptor selection tool named the Memorized_ACO algorithm. The statistical parameters of this model were $R^2_{\text{training}} = 0.72$ and $R^2_{\text{test}} = 0.68$. To develop a higher-quality QSTR model, classification and regression tree (CART) was employed. Two approaches were considered: (1) the phenols were classified into different modes of action using CART, and (2) the phenols in the training set were partitioned into several subsets by a tree in such a manner that a high-quality MLR could be developed in each subset. For the first approach, the statistical parameters of the resultant QSTR model improved to $R^2_{\text{training}} = 0.83$ and $R^2_{\text{test}} = 0.75$. A genetic algorithm (GA) was employed in the second approach to obtain an optimal tree, and the final QSTR model provided excellent prediction accuracy for the training and test sets ($R^2_{\text{training}} = 0.91$ and $R^2_{\text{test}} = 0.93$). The mean absolute error for the test set was 0.1615.

Highlights
A new algorithm that combines GA and CART is introduced. The proposed algorithm inherits the advantages of both local and global approaches and was used to predict phenol toxicity. The algorithm partitioned the phenols into several subsets regardless of their modes of action (MOAs) to obtain statistically sound QSARs. Compared with the QSAR models reported in the literature, our model showed better statistical performance.
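
The idea behind the second approach (partition the training set so that each subset supports its own linear model, then score the result with $R^2$ and mean absolute error) can be illustrated with a minimal Python sketch. This is not the authors' GA-CART implementation. It assumes a single hypothetical descriptor, a single split, synthetic data generated only for illustration, and an exhaustive threshold search in place of the genetic algorithm. The function names are invented for this example, and $R^2$ is taken here as the usual coefficient of determination, which may differ from the definition used in the paper.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def fit_split_model(xs, ys, min_leaf=5):
    """One-split 'tree': try every threshold, fit a line on each side,
    and keep the split with the smallest training sum of squared errors.
    (Exhaustive search stands in for the GA of the paper.)"""
    best = None
    for thr in sorted(set(xs)):
        left = [(x, y) for x, y in zip(xs, ys) if x <= thr]
        right = [(x, y) for x, y in zip(xs, ys) if x > thr]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        lm = fit_line(*zip(*left))
        rm = fit_line(*zip(*right))
        sse = sum((y - (lm[0] + lm[1] * x)) ** 2 for x, y in left)
        sse += sum((y - (rm[0] + rm[1] * x)) ** 2 for x, y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:
        raise ValueError("not enough data for a split")
    return best[1:]

def predict_split(model, xs):
    thr, (a1, b1), (a2, b2) = model
    return [a1 + b1 * x if x <= thr else a2 + b2 * x for x in xs]

# Synthetic data (illustrative only): two regimes with different trends.
rng = random.Random(0)
xs = [rng.uniform(0.0, 4.0) for _ in range(200)]
ys = [(0.5 * x if x <= 2.0 else 4.0 - x) + rng.gauss(0.0, 0.1) for x in xs]
x_train, y_train, x_test, y_test = xs[:150], ys[:150], xs[150:], ys[150:]

# Global model: one line for all compounds.
a, b = fit_line(x_train, y_train)
r2_global = r2(y_test, [a + b * x for x in x_test])

# Local models: one line per subset produced by the split.
model = fit_split_model(x_train, y_train)
pred_split = predict_split(model, x_test)
r2_split = r2(y_test, pred_split)
mae_split = mae(y_test, pred_split)

print("global R2 (test):", round(r2_global, 3))
print("split  R2 (test):", round(r2_split, 3))
print("split  MAE (test):", round(mae_split, 3))
```

On data like these, where two groups follow different trends, a single global line fits poorly while the per-subset lines fit well, which is the motivation for combining a partitioning step with local regressions.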
- DOI : http://dx.doi.org/10.1016/j.chemosphere.2016.12.095