Mohanapriya, D and Beena, R (2021) Improving topic modelling for Prediction of Drug Indication and Side effects. Improving topic modelling for Prediction of Drug Indication and Side effects, 25 (4). pp. 11542-11558. ISSN 1583-6258
Abstract
Text mining is a common technique in systems biology because it can reveal hidden relationships between drugs,
genes, and diseases in large quantities of data. Improved Predict Drug Indications and Side Effects using Topic
Modelling and Natural Language Processing (IPISTON) is a text mining technique for drug phenotype and
side effect prediction. In IPISTON, Latent Dirichlet Allocation (LDA) was used to model the topics from
the sentences in the collected data. Using the topics and the Gene Regulation Score (GRS), a drug-topic probability
matrix was constructed and given as input, along with the syntactic distance measure, to Conditional
Random Field (CRF) and Bi-directional Long Short-Term Memory-CRF (BILSTM-CRF) classifiers for
prediction of drug-phenotype and drug-side effect relationships. In this paper, Enhanced Topic
Modelling-IPISTON (ETP-IPISTON) is proposed to enhance the topic modelling for better prediction of
drug-phenotype and drug-side effect associations. A logistic LDA is introduced for topic modelling. It is
capable of handling a wide variety of data modalities. The logistic LDA eliminates the generative portion of
LDA while keeping the factorization of the conditional distribution over latent variables. The logistic LDA
generates the gene vector and latent vector of every gene, which are given as input to the cells of BILSTM-CRF
for topic modelling. In BILSTM-CRF, the logistic LDA reduces the computational cost of extracting topics
from a large corpus. Using the topics modelled by logistic LDA-BILSTM-CRF and the GRS, a drug-topic
probability matrix is constructed and used, along with the syntactic distance measure, in CRF, BILSTM-CRF,
Naïve Bayes, Classification and Regression Tree (CART) and logistic regression classifiers for prediction of
drug-phenotype and drug-side effect relationships.
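The abstract does not spell out how the drug-topic probability matrix is built from the topics and the GRS. As a minimal sketch of one plausible reading, the Python below assumes that each sentence already has a topic distribution from some topic model, that each sentence mentions a set of genes, and that the GRS is available as a non-negative score per drug and gene. The function name `drug_topic_matrix`, the input layout, the weighting rule, and the toy values are all hypothetical and are not taken from the paper.

```python
def drug_topic_matrix(sentence_topics, sentence_genes, grs):
    """Aggregate per-sentence topic distributions into one row per drug.

    sentence_topics: list of topic-probability lists, one per sentence
    sentence_genes:  list of gene-name lists, the genes mentioned per sentence
    grs:             dict drug -> {gene: non-negative score}; an assumed
                     layout for the Gene Regulation Score, not the paper's
    """
    n_topics = len(sentence_topics[0])
    matrix = {}
    for drug, gene_scores in grs.items():
        row = [0.0] * n_topics
        for topics, genes in zip(sentence_topics, sentence_genes):
            # Assumed rule: weight a sentence by the summed GRS of its genes.
            weight = sum(gene_scores.get(gene, 0.0) for gene in genes)
            for k, prob in enumerate(topics):
                row[k] += weight * prob
        total = sum(row)
        # Normalise so each drug's row is a probability distribution.
        matrix[drug] = [value / total for value in row] if total > 0 else row
    return matrix


# Toy inputs, invented for illustration only.
sentence_topics = [[0.9, 0.1], [0.2, 0.8]]
sentence_genes = [["TP53"], ["EGFR"]]
grs = {"drugA": {"TP53": 1.0}, "drugB": {"TP53": 1.0, "EGFR": 1.0}}

matrix = drug_topic_matrix(sentence_topics, sentence_genes, grs)
for drug, row in matrix.items():
    print(drug, [round(value, 3) for value in row])
```

Under these assumptions a drug whose regulated genes appear mostly in sentences about one topic receives a row concentrated on that topic. In the pipeline described above, such rows would then be passed to the classifiers together with the syntactic distance measure.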
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Text mining, drug-phenotype relationship, drug-side effect relationship, logistic LDA, enhanced topic modelling, IPISTON. |
Divisions: | PSG College of Arts and Science > Department of Computer Science |
Depositing User: | Mr Team Mosys |
Date Deposited: | 29 Feb 2024 10:49 |
Last Modified: | 29 Feb 2024 10:49 |
URI: | http://ir.psgcas.ac.in/id/eprint/2112 |