Pancreatic ductal adenocarcinoma (PDAC) is the most common form of pancreatic cancer, accounting for over 90% of cases, and is characterized by aggressive growth, early metastasis, and resistance to therapy. A comprehensive understanding of the molecular mechanisms driving PDAC is essential for improving diagnosis, prognosis, and treatment. In this study, a multiomics approach was applied by analyzing both DNA methylation and RNA-sequencing (RNA-seq) datasets obtained from The Cancer Genome Atlas Pancreatic Adenocarcinoma project. The methylation dataset included substantially more tumor samples than normal samples, and a similar imbalance was observed in the RNA-seq dataset. This class imbalance posed a challenge for direct feature selection, because a model trained on such data could be biased toward tumor-associated features. To address this issue, six imbalance correction techniques were evaluated and compared: Random Oversampling, Synthetic Minority Over-sampling Technique (SMOTE), and Adaptive Synthetic sampling (ADASYN) for oversampling, along with Random Undersampling, Cluster Centroids, and AllKNN for undersampling. Identifying the most effective imbalance correction method is essential for improving feature selection accuracy and facilitating the discovery of novel genes associated with PDAC. A deeper understanding of these genes could contribute to the development of non-invasive diagnostic tests and personalized treatment strategies for PDAC.
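To make the class imbalance problem concrete, the following minimal sketch illustrates the two simplest of the six techniques named above, random oversampling and random undersampling, using only the Python standard library. It is not the pipeline used in this study: the function names `random_oversample` and `random_undersample`, the toy feature values, and the 8-to-2 tumor-to-normal ratio are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes match the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            j = rng.choice(idx)  # sample with replacement from this class
            X_out.append(X[j])
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class so all classes match the smallest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.sample(idx, target))  # sample without replacement
    keep.sort()  # preserve the original sample order
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: 8 "tumor" and 2 "normal" samples with made-up features.
X = [[0.1 * i, 1.0 - 0.1 * i] for i in range(10)]
y = ["tumor"] * 8 + ["normal"] * 2

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
print(Counter(y_over))   # class counts after oversampling
print(Counter(y_under))  # class counts after undersampling
```

Oversampling keeps every original sample but repeats minority rows, whereas undersampling discards majority rows. SMOTE and ADASYN instead synthesize new minority samples, while Cluster Centroids and AllKNN select or summarize majority samples, so they would require a different implementation than this sketch.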