Variable Rate Agriculture (VRA) is a data-driven approach aimed at reducing the environmental impact of commercial farming. It uses machine learning models (MLMs) to predict crop yield more rapidly than traditional soil analyses allow. However, MLMs require large datasets, and agricultural data is often limited. Cross-validation (CV) techniques help assess and improve model generalization by evaluating performance on held-out subsets of the data, even when data is limited. This study used a three-year dataset on hybrid wheat from Minnesota, covering pre-planting conditions, crop growth, and yield mass. Four machine learning models (linear regression, random forest, XGBoost, and feed-forward neural networks (FFNN)) were developed to link pre-growing conditions with yield outcomes. Two CV methods, Random CV and Spatial Grid CV, were applied to compare model performance, with overfitting assessed using the coefficient of determination (R²) and root mean squared error (RMSE). Feature selection was performed to identify the spectral indices that most influenced model output. Findings indicated that Random CV generally outperformed Spatial Grid CV on both the full and the reduced feature sets. Linear regression was limited by feature selection, whereas the FFNN showed occasional improvement. Overall, Random CV proved more effective, especially with a diverse dataset, enhancing model reliability in VRA applications.
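
To make the difference between the two CV schemes concrete, the following is a minimal sketch, not the study's implementation, of how samples might be assigned to folds under Random CV and Spatial Grid CV, along with the two evaluation metrics. The function names, the `cell_size` parameter, and the use of planar `(x, y)` coordinates are illustrative assumptions; the abstract does not state the grid resolution or the number of folds.

```python
import math
import random
from collections import defaultdict


def random_folds(n_samples, k, seed=0):
    """Random CV (sketch): shuffle sample indices and deal them into k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    fold_of = [0] * n_samples
    for position, sample in enumerate(idx):
        fold_of[sample] = position % k
    return fold_of


def spatial_grid_folds(coords, cell_size, k, seed=0):
    """Spatial grid CV (sketch): all samples in one grid cell share a fold.

    coords is a list of (x, y) positions; cell_size is a hypothetical
    grid resolution in the same units as the coordinates.
    """
    cells = defaultdict(list)
    for sample, (x, y) in enumerate(coords):
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[cell].append(sample)
    # Shuffle whole cells, then deal cells (not individual samples) into folds.
    cell_ids = sorted(cells)
    random.Random(seed).shuffle(cell_ids)
    fold_of = [0] * len(coords)
    for position, cell in enumerate(cell_ids):
        for sample in cells[cell]:
            fold_of[sample] = position % k
    return fold_of


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Under this sketch, neighboring samples can land in different folds with Random CV, whereas Spatial Grid CV keeps each grid cell entirely within one fold, so every held-out fold consists of whole spatial blocks.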