Prediction of liver diseases with Random Forest classifier with principal component feature extraction    
           
  
        
            
                
                    | Author | M. Rekha Sundari, P. Raga Manognya, Rishi V., A. Venkata Mahesh and Mugada Swetha | 
                
                    | e-ISSN | 1819-6608 | 
                
                    | On Pages | 2020-2027 | 
                
                    | Volume No. | 18 | 
                
                    | Issue No. | 17 | 
                
                    | Issue Date | November 8, 2023 | 
                
                
                    | DOI | https://doi.org/10.59018/0923247 | 
                
                    | Keywords | liver disease, synthetic minority oversampling technique, encoded nominal and continuous, edited nearest neighbourhood, random forest. | 
            
        
        
         
            
            Abstract
            
        
Liver disease is among the most serious illnesses affecting human health, and it can arise from many different factors. The liver is the largest organ inside the body, yet it is frequently affected by disease. Subtypes of liver illness include fatty liver disease, cirrhosis, hepatitis, chronic liver disease, liver cancer, and liver tumors. This study develops machine learning models to improve the prediction of liver disease using data balancing techniques, such as the Synthetic Minority Oversampling Technique (SMOTE) combined with Edited Nearest Neighbourhood (ENN). The features balanced with SMOTE-ENN are then transformed using principal component analysis (PCA). In addition, correlation and skewness analyses are used to clean the dataset and reduce the number of features. Finally, prediction is carried out by a Random Forest classifier on the PCA-extracted, SMOTE-ENN-balanced features. Simulations conducted on a liver-disease dataset show that the proposed method outperforms existing methods.
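
The balancing step named in the abstract can be illustrated with a small sketch. This is not the authors' implementation, which the abstract does not show. It is a minimal pure-Python illustration that assumes two classes, Euclidean distance, and a majority-vote rule for ENN. The function names (`smote`, `enn`, `smote_enn`) and the toy data are hypothetical. In practice a library such as imbalanced-learn (`SMOTEENN`) would likely be used, followed by PCA and a Random Forest classifier.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_nearest(i, X, k, candidates=None):
    """Indices of the k rows of X closest to row i (row i itself excluded)."""
    pool = range(len(X)) if candidates is None else candidates
    others = [j for j in pool if j != i]
    return sorted(others, key=lambda j: sq_dist(X[i], X[j]))[:k]


def smote(X, y, minority, k=3, seed=0):
    """Oversample the minority class until the two classes are equal in size.

    Each synthetic row lies on the segment between a minority row and one of
    its k nearest minority neighbours. Assumes a binary problem (assumption)
    and at least two minority rows.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = (len(y) - len(min_idx)) - len(min_idx)
    X_out, y_out = [list(row) for row in X], list(y)
    for _ in range(max(n_new, 0)):
        i = rng.choice(min_idx)
        j = rng.choice(k_nearest(i, X, k, candidates=min_idx))
        gap = rng.random()  # interpolation factor in [0, 1)
        X_out.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
        y_out.append(minority)
    return X_out, y_out


def enn(X, y, k=3):
    """Edited nearest neighbours with a majority-vote rule (assumption):
    drop rows whose label disagrees with most of their k nearest neighbours."""
    keep = []
    for i in range(len(X)):
        votes = [y[j] for j in k_nearest(i, X, k)]
        if votes.count(y[i]) * 2 > len(votes):
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, minority, k=3, seed=0):
    """Oversample with SMOTE, then clean the result with ENN."""
    X_bal, y_bal = smote(X, y, minority, k=k, seed=seed)
    return enn(X_bal, y_bal, k=k)


# Toy data (made up for illustration): six majority rows, two minority rows.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.0], [0.2, 0.3],
     [2.0, 2.0], [2.1, 2.2]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = smote_enn(X, y, minority=1)
print(y_bal.count(0), y_bal.count(1))
```

Oversampling first and cleaning second means ENN can also remove synthetic rows that land in regions dominated by the other class, which is the usual motivation for combining the two techniques.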
                
                
               