As car dashboard cameras (dashcams) have become more common, the availability of dashcam imagery has grown with them. In recent years, dashcam imagery has been used predominantly in conjunction with computer vision techniques for autonomous vehicle systems. This research explores an alternative application of these technologies in the domain of public safety and security: we apply object detection to dashcam imagery to address the challenge of identifying vehicles associated with active Amber Alerts. With the goal of helping law enforcement locate abducted children more efficiently, we employ the YOLO (You Only Look Once) object detection model, a state-of-the-art deep learning framework known for its real-time performance and accuracy. Our methodology involves training and fine-tuning the YOLO model on a custom dataset of dashcam footage that incorporates diverse environmental conditions, such as varying lighting, weather, and traffic scenarios. Experimental results demonstrate that the model achieves high precision and recall in detecting target vehicles, validating its effectiveness for real-world deployment. This research highlights the potential of deep learning and computer vision techniques to address critical public safety challenges, offering a novel application of these technologies beyond their traditional use in autonomous driving. Our findings contribute to the growing body of work in computer science that seeks to harness AI for societal benefit.
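Detection precision and recall of the kind reported above are typically computed by matching predicted boxes to ground-truth boxes at an intersection-over-union (IoU) threshold. A minimal sketch of that bookkeeping; the 0.5 threshold and toy boxes are common illustrative defaults, not values from this study:

```python
# Boxes are (x1, y1, x2, y2) pixel coordinates.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(predictions, ground_truth, thresh=0.5):
    """Greedily match each predicted box to at most one true box."""
    unmatched = list(ground_truth)
    tp = 0
    for p in predictions:
        best = max(unmatched, key=lambda g: iou(p, g), default=None)
        if best is not None and iou(p, best) >= thresh:
            tp += 1
            unmatched.remove(best)
    fp = len(predictions) - tp   # predictions with no matching vehicle
    fn = len(unmatched)          # vehicles the model missed
    return tp / (tp + fp), tp / (tp + fn)
```

A detector that finds the target vehicle but also raises one false alarm scores precision 0.5, recall 1.0 under this scheme.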
Artificial Intelligence (AI) agents are transforming healthcare by automating tasks, enhancing diagnostic precision, and enabling personalized care. Our project aims to develop an AI-based system to automate the detection of IVC filters and complications, such as extravascular extension, in CT scans. IVC filters are crucial for patients with venous blood clots but are meant to be temporary, and delays in their removal can cause harm. Interventional radiology (IR) practices often rely on manual tracking methods, which are inadequate when patients transfer care. Many patients forget their filter’s presence, leaving new providers unaware. Building on previous research with Mayo Clinic NWWI, we aim to enhance an existing deep learning algorithm for IVC flagging and extend it to detect extravascular extension, flagging patients for closer follow-up. The system will also integrate large language models (LLMs) to process electronic health records (EHRs) and be modular for future expansion. Our goal is to create a reliable AI algorithm for detecting IVC filters and implement it in hospital settings.
Coronary Artery Bypass Grafting (CABG) is the most common form of open-heart surgery in the United States and is performed hundreds of thousands of times annually. The surgery can be performed on-pump, in which a cardiopulmonary bypass (heart-lung) machine takes over circulation while the heart is stopped, or off-pump, in which the patient's heart continues beating normally. One of the main complications associated with CABG is acute kidney injury (AKI), which has a high mortality rate. Our research goals are therefore to identify the important risk factors behind why a patient would experience AKI and to compare the on-pump and off-pump surgical techniques. We used datasets from Mayo Clinic comprising approximately 2,000 patients and several hundred features. We analyzed this dataset using several statistical and machine learning models, including Random Forest, XGBoost, and propensity analysis, with Inverse Probability of Treatment Weighting (IPTW) as our propensity analysis technique. From this analysis, we gathered a list of key features that can predict whether a patient will experience AKI when the surgery is performed on-pump. We also found no statistically significant difference in success rates between the on-pump and off-pump techniques; however, the high imbalance in the dataset requires further investigation. Mayo Choice Award: Our project is intended to improve patient outcomes by providing physicians with key predictors of whether a patient will experience AKI during surgery. The physician will then be able to make a better-informed decision about whether the surgery should be performed given the patient's characteristics and associated risk factors.
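The core of the IPTW technique named above is weighting each patient by the inverse of the probability of the treatment they actually received, so the weighted sample behaves more like a randomized comparison. A minimal sketch; the propensity scores below are illustrative numbers, not values fitted on the Mayo Clinic data:

```python
def iptw_weights(treated, propensity):
    """IPTW: weight treated patients by 1/p and controls by 1/(1-p),
    where p is the fitted propensity (probability of treatment)."""
    return [1.0 / p if t else 1.0 / (1.0 - p)
            for t, p in zip(treated, propensity)]

# On-pump (treated=True) vs off-pump patients with example propensities:
weights = iptw_weights([True, True, False, False], [0.8, 0.5, 0.2, 0.5])
```

In practice the propensity scores would come from a model (e.g. logistic regression) of treatment assignment on the patient features, and the weighted groups are then compared on the AKI outcome.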
This study proposes an AI-driven pipeline that combines pancreas segmentation outcomes for Pancreatic Ductal Adenocarcinoma (PDAC) diagnosis with a large language model (LLM) agent to enhance diagnostic and clinical analysis. Building upon established deep learning approaches in medical imaging, our project aims to extend traditional UNet segmentation methods by integrating an LLM agent that provides detailed diagnostic information for medical practitioners. Using the Pancreas Decathlon dataset, 3D CT scans are processed and trained over multiple iterations, utilizing attention mechanisms, sparse categorical cross-entropy, and Tversky loss. The predicted segmentation labels are used by the LLM to infer diagnostic details, such as the stage of disease progression, and to integrate results with electronic health records for longitudinal study. Ultimately, this integrated framework aims to help medical practitioners diagnose PDAC more effectively while offering additional supplemental information.
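The Tversky loss mentioned above generalizes the Dice coefficient by weighting false positives and false negatives separately, which is useful when the pancreas occupies a tiny fraction of the CT volume. A minimal sketch over flat binary masks; the alpha/beta values are illustrative, and alpha = beta = 0.5 recovers Dice:

```python
def tversky_index(pred, truth, alpha=0.5, beta=0.5):
    """pred, truth: flat lists of 0/1 voxel labels."""
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum(t and not p for p, t in zip(pred, truth))
    return tp / (tp + alpha * fp + beta * fn)

def tversky_loss(pred, truth, alpha=0.5, beta=0.5):
    # Minimized during training; raising beta penalizes missed voxels more.
    return 1.0 - tversky_index(pred, truth, alpha, beta)
```

In a real training loop this would be written over framework tensors with soft (probabilistic) predictions rather than hard 0/1 labels.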
Diabetes Mellitus is a chronic condition affecting millions worldwide, associated with factors such as age, body mass index, blood pressure, and social determinants such as income level, education, and healthcare access. This study uses a mix of these factors, derived from a public health survey, to train machine learning models for diabetes prediction. The data include 29 features and 223,022 records. A key goal is to investigate feature importance among risk factors to assess the impact of social determinants on diabetes. We employ six machine learning models, namely XGBoost, AdaBoost, LightGBM, Random Forest, Naive Bayes, and Logistic Regression, and utilize SHapley Additive exPlanations (SHAP) to measure feature importance. Predictive performance metrics include accuracy, precision, recall, and the area under the receiver operating characteristic curve. Empirical results show that five of the six models achieved 85% accuracy, with blood pressure, body mass index, cholesterol, weekly alcohol consumption, and time since the last checkup being the most significant predictive attributes. These initial findings highlight the potential of machine learning to predict diabetes and to contribute to early monitoring of the identified risk factors. Planned future research will investigate whether identifying and incorporating other factors would improve overall predictive performance.
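Of the metrics listed above, the area under the ROC curve is the least obvious to compute; it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one (the Mann-Whitney formulation). A small sketch with invented toy labels and scores:

```python
def roc_auc(labels, scores):
    """AUC as P(score of random positive > score of random negative),
    counting ties as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(labels, preds):
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)
```

The quadratic pairwise loop is fine for illustration; production libraries use a sort-based O(n log n) computation instead.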
Recent advancements in LLMs and generative AI technology, such as ChatGPT, Gemini, and Llama, have widespread applications. Such AI-based solutions strive to achieve extremely high levels of effectiveness in identifying and modeling a multitude of complex patterns and characteristics in textual data. Related literature shows an increasing focus on solutions that detect and model the complex characteristics and semantic formulations that are unique, respectively, to AI-generated and human-developed responses to a given input query or data. This work is a preliminary study investigating NLP-based approaches applicable to our research question: for a given query or input data, can we differentiate an AI-generated response from one developed by a human expert? The case study dataset has about 2,000 records, each describing a published research article, with four attributes: the title of the article, its AI-generated abstract, its human-developed abstract, and a class label. The two NLP-based approaches we are currently investigating are similarity-metric-based assessment paired with the known class labels of the respective abstracts, and an LLM-based approach for modeling. We will present the respective results and provide their individual and comparative analysis, as well as key conclusions.
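One common choice for the similarity-metric assessment described above is cosine similarity over bag-of-words counts of two abstracts. A minimal sketch, assuming whitespace tokenization and raw term counts (the study may well use a richer representation such as TF-IDF or embeddings):

```python
from collections import Counter
from math import sqrt

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = (sqrt(sum(v * v for v in a.values()))
            * sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0
```

A candidate abstract could then be scored against labeled AI-generated and human-developed abstracts, with the higher aggregate similarity suggesting its class.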
The field of electronic textiles (e-textiles) combines digital technology with textile objects, and has applications in fields such as wearable computing, theatrical design, and medicine. Prior work has examined deploying this technology in educational settings, to teach such skills as circuit design, computer programming, and iterative design. However, e-textile-based learning materials are still not commonly used, and more validated examples of such interventions would be valuable. The aim of this project is to investigate the state of the art in e-textile technology, especially in educational contexts, and to develop and evaluate an e-textiles intervention which could be deployed in a classroom or extra-curricular setting to teach introductory programming skills. So far, we have conducted a literature review examining applications of e-textiles in learning environments. For example, in one study [Seo et al. 2017], the researchers provided a safe environment for children with ASD (Autism Spectrum Disorder) to create their own sensory haptic toy. We found that many of the studies targeted middle and high school age children as a way to gauge and increase their knowledge of electricity and sewing techniques, but not many examined undergraduates. Therefore, in future work, we plan to conduct an experiment investigating the effectiveness of e-textiles in undergraduate learning.
The emergence of generative artificial intelligence (AI) has facilitated the creation of targeted, mass-produced, and highly effective phishing messages with unprecedented ease. Unlike previous methods, attackers no longer face the dilemma of choosing between investing time in crafting personalized spear phishing messages or opting for less effective but broadly distributed general phishing campaigns. Despite continuous warnings from security researchers and academics spanning over a year, there remains a notable scarcity of AI-generated phishing messages available for comprehensive study and analysis. A comprehensive corpus of AI-generated phishing messages would give researchers the data to devise effective strategies for detecting and thwarting these sophisticated techniques. To address this gap, we propose leveraging the computational capabilities of UWEC's Blugold Center for High Performance Computing with local Large Language Models (LLMs) to generate a diverse and extensive collection of malicious phishing messages for analysis, and to develop new techniques to better detect AI-generated phishing attacks.
Background: Patient education is linked to better health outcomes and is a core component of Family Medicine, where providers see a wide variety of patient health problems (Simonsmeier, 2022). Developing and maintaining an evidence-based and inclusive patient education library is resource-intensive. Content libraries at academic medical centers often are not inclusive of Family Medicine. Moreover, users cannot tailor content to individual patient needs, and accessing content is cumbersome. Objective: We aimed to close this education gap by developing an AI-assisted tool with which clinicians can easily generate trustworthy education content for diverse patient needs. Methods: Our tool combines a web scraper that pulls data from mayoclinic.org with a standalone user interface (UI) backed by a large language model (LLM), allowing users to generate printable education materials based on inputs such as disease name, content headers, text size, and patient reading level. We validated the LLM's accuracy and completeness with volunteer medical students. We plan to evaluate the tool's usability, time savings, and user satisfaction with a pilot study comparing the traditional workflow to our tool. Results & Future Work: Twice during the development process, output forms were evaluated by multiple clinicians to confirm medical accuracy and readability. Post-pilot, we will investigate translating the tool into clinical practice. Mayo Choice Award: Family medicine providers handle an incredibly large volume of diseases and diagnoses, so easy-to-access, adjustable educational material is vitally important: it decreases clerical burden for clinicians and increases patient health literacy (Hart, 2015).
Currently, even if providers are able to locate the educational forms without interrupting their workflow to visit the public website, they cannot adjust the material's reading level or text size to individual patient needs without extra steps, which keeps patients from fully understanding their diagnosis and relevant follow-up, including vital self-care instructions that lead to better patient outcomes (Simonsmeier, 2022). Overall, this tool provides the educational materials for over 400 diagnoses commonly seen in family medicine in one place, while also allowing providers to tailor the reading level and text size to each patient, which will lead to better overall health outcomes. Works Cited: Hart, S., 2015. Patient education accessibility. Medical Writing 24, 190–194. Simonsmeier, B.A., Flaig, M., Simacek, T., Schneider, M. What sixty years of research says about the effectiveness of patient education on health: a second order meta-analysis. Health Psychol Rev. 2022 Sep;16(3):450-474. doi: 10.1080/17437199.2021.1967184. Epub 2021 Aug 24. PMID: 34384337.
This website was developed as an educational tool to train students in accurately identifying different types of stuttering. The platform provides audio samples, allowing users to practice distinguishing between various stutter types, such as repetitions, prolongations, and blocks. As students classify these speech patterns, their responses are recorded and stored, with the goal of eventually forming a structured dataset. This dataset serves a dual purpose: enhancing student learning through hands-on experience and creating a valuable resource for future AI applications in speech therapy and automated stutter detection. The project aims to bridge the gap between AI and stutter disfluency detection. The resulting dataset can support the development of AI-driven tools for diagnosing and assisting individuals with speech disorders, ultimately improving accessibility to speech therapy solutions.
This project aims to standardize follow-up recommendations for colonoscopies by leveraging Generative AI and Natural Language Processing (NLP) to analyze colonoscopy and pathology reports. Current follow-up guidelines vary based on multiple factors, including polyp type, size, number, and patient history, often leading to inconsistencies in clinical recommendations. The AI system processes unstructured text from medical reports, extracting key diagnostic details and cross-referencing them with established guidelines to generate personalized return date recommendations. By automating this process, the project enhances accuracy, reduces variability in clinical decision-making, and improves workflow efficiency for healthcare providers. The standardized recommendations ensure that patients receive appropriate follow-up care, minimizing the risk of delayed or unnecessary procedures. This initiative demonstrates the potential of AI in streamlining medical decision-making, ultimately contributing to better patient outcomes and more consistent adherence to evidence-based guidelines in gastroenterology.
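Once the NLP step has extracted structured findings from the reports, the guideline cross-referencing described above reduces to a rule lookup. A deliberately simplified sketch; the function name, inputs, and interval values below are hypothetical placeholders for illustration only, not actual clinical guidelines:

```python
def follow_up_years(polyp_count, largest_mm, histology):
    """Map extracted colonoscopy findings to a follow-up interval (years).
    Thresholds and intervals are ILLUSTRATIVE, not clinical guidance;
    a real system would encode the published guideline tables and
    incorporate patient history."""
    if histology == "none":          # normal exam, no polyps found
        return 10
    if largest_mm >= 10 or polyp_count >= 5:
        return 3                     # higher-risk findings, shorter interval
    return 7                         # low-risk findings
```

The AI system's value lies in reliably populating those inputs from free-text colonoscopy and pathology reports, so the deterministic rule layer stays auditable.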
It is well known that the usefulness of a machine learning model is due to its ability to generalize to unseen data. This study uses three popular cyberbullying datasets to explore the effects of data, how it is collected, and how it is labeled, on the resulting machine learning models. The bias introduced from differing definitions of cyberbullying and from data collection is discussed in detail. An emphasis is made on the impact of dataset expansion methods, which utilize current data points to fetch and label new ones. Furthermore, explicit testing is performed to evaluate the ability of a model to generalize to unseen datasets through cross-dataset evaluation. As hypothesized, the models have a significant drop in the Macro F1 Score, with an average drop of 0.222. As such, this study effectively highlights the importance of dataset curation and cross-dataset testing for creating models with real-world applicability. The experiments and other code can be found at https://github.com/rootdrew27/cyberbullying-ml.
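The Macro F1 Score used for the cross-dataset comparison averages per-class F1 without class weighting, so performance on the minority (cyberbullying) class counts as much as on the majority class. A minimal sketch:

```python
def macro_f1(y_true, y_pred):
    """Per-class F1, averaged with equal weight per class."""
    classes = set(y_true) | set(y_pred)
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / len(f1s)
```

Cross-dataset evaluation then means training on one dataset and computing this score on a different one, which is where the reported average 0.222 drop appears.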
Curling is a strategic ice sport that presents unique challenges for AI research due to its combination of complex decision-making and intricate physical dynamics. This project aims to develop a physics-based curling simulator to address these challenges, enabling accurate modeling of stone movement, ice conditions, and sweeping effects. Our approach utilizes an existing physics engine, MuJoCo, to simulate realistic curling interactions. We implemented physics models based on leading theories for basic curling shot selections. The simulator initially focuses on stone dynamics and shot selection, with more complex features such as sweeping effects being added in later iterations. A visualization web app displays shot outcomes and will eventually support AI training and data analysis. In addition to the simulation application for curling research, we developed a training module covering both the physics of curling and interaction with the MuJoCo library. This module is designed to help new students learn the complicated physics of curling, as well as how to implement and maintain MuJoCo-based features in the simulator.
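At its simplest, the stone-dynamics component reduces to a stone decelerating under kinetic friction. A toy sketch using explicit Euler integration; the friction coefficient and timestep are illustrative, not tuned to real ice, and the full simulator's engine handles this (plus curl and collisions) internally:

```python
def simulate_stone(v0, mu=0.01, g=9.81, dt=0.01):
    """Distance (m) traveled by a stone released at speed v0 (m/s)
    under constant kinetic friction mu, integrated with explicit Euler."""
    x, v = 0.0, v0
    while v > 0:
        x += v * dt          # advance position
        v -= mu * g * dt     # friction decelerates the stone
    return x
```

The result can be checked against the analytic stopping distance v0^2 / (2 * mu * g), which the Euler scheme approaches as dt shrinks.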
Curling is a strategic team sport that presents unique challenges for artificial intelligence (AI) research, particularly in decision-making and physical simulation. However, a significant barrier to AI development in curling is the lack of structured and accessible datasets. This project aims to address this gap by leveraging standardized video feeds from Curling Stadium to generate datasets suitable for AI research. Our approach involves developing software that uses the image detection models YOLO (You Only Look Once) and SAM (Segment Anything Model) to analyze YouTube videos of curling matches, tracking objects such as rocks and players to gather data on their positions and movements. The expected outcome of the larger project is a structured and scalable dataset that can be used for AI-based curling research, including game strategy analysis and predictive modeling. This project lays the foundation for broader AI applications in curling by automating data collection, enabling machine learning models to analyze strategic decision-making, and fostering human-AI collaboration in sports analytics.
The Internet of Things (IoT) encompasses a variety of systems and devices that enable data exchange across networks. With this interleaved connectivity comes an inherent vulnerability to attacks. Traditional intrusion detection in IoT environments has been primarily human-reliant, but modern malicious methods surpass manual approaches. Machine Learning (ML)-based Intrusion Detection Systems (IDS) show promise but require refinement to match the effectiveness of human-monitored IDS. This study involved a literature review of research involving the NetFlow dataset NF-ToN-IoT-v2, created in 2022 to enable ML-based IDS development. The dataset includes approximately 16 million NetFlows, of which 63.99% are attack and 36.01% benign. The data's imbalanced nature was addressed through methods such as downsampling to reduce training bias. A hyper-parameter tuning pipeline was used to optimize algorithm testing and cross-validation, especially across different data balancing methods. The algorithms tested, chosen based on previous research found during the literature review, include Naïve Bayes, Random Forest, K-Nearest Neighbor (KNN), Support Vector Machines (SVM), and XGBoost. Comparative analysis using confusion matrices and bar plots enabled the evaluation of algorithm effectiveness. Overall, this research highlights the potential of ML approaches in IoT IDS development, leveraging NF-ToN-IoT-v2 to enhance detection accuracy and bridge the gap between human-monitored and ML-driven solutions.
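The downsampling step mentioned above can be sketched as randomly discarding majority-class (attack) flows until the classes are equal in size. A minimal version, assuming in-memory toy lists rather than the real 16-million-flow dataset:

```python
import random
from collections import Counter

def downsample(records, labels, seed=0):
    """Randomly undersample every class down to the smallest class size."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    target = min(Counter(labels).values())
    per_class = Counter()
    idx = list(range(len(labels)))
    rng.shuffle(idx)                   # pick a random subset of each class
    kept = [i for i in idx if per_class.update([labels[i]])
            or per_class[labels[i]] <= target]
    kept = [i for i in kept if True]   # keep original helper list
    return [records[i] for i in kept], [labels[i] for i in kept]
```

Downsampling trades training-set size for balance; the abstract's pipeline compares it against other balancing methods via cross-validation for exactly that reason.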
Pancreatic ductal adenocarcinoma (PDAC) is the most common form of pancreatic cancer, accounting for over 90% of cases, and is characterized by aggressive growth, early metastasis, and resistance to therapy. A comprehensive understanding of the molecular mechanisms driving PDAC is essential for improving diagnosis, prognosis, and treatment. In this study, a multiomics approach was applied by analyzing both DNA methylation and RNA-sequencing datasets obtained from The Cancer Genome Atlas Pancreatic Adenocarcinoma project. The methylation dataset included significantly more tumor samples than normal samples, and a similar imbalance was observed in the RNA-seq dataset. This disparity posed a challenge for direct feature selection, as it could lead to a model biased toward tumor-associated features. To address this issue, six data imbalance correction techniques were evaluated and compared: Random Oversampling, Synthetic Minority Over-sampling Technique (SMOTE), and Adaptive Synthetic (ADASYN) for oversampling, along with Random Undersampling, Cluster Centroids, and AllKNN for undersampling. Identifying the most effective imbalance correction method is essential for improving feature selection accuracy and facilitating the discovery of novel genes associated with PDAC. A deeper understanding of these oncogenes could contribute to the development of non-invasive diagnostic tests and personalized treatment strategies for PDAC.
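The core idea of SMOTE, one of the oversampling techniques compared above, is to synthesize new minority-class samples by interpolating between a minority point and one of its nearest minority neighbors. A minimal 2-D sketch; the point values and k are illustrative, and real implementations (e.g. imbalanced-learn) add refinements:

```python
import random

def smote_sample(minority, k=2, rng=None):
    """Generate one synthetic minority-class point a la SMOTE."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # k nearest minority neighbors of base, by squared Euclidean distance
    others = sorted((p for p in minority if p != base),
                    key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))[:k]
    neighbor = rng.choice(others)
    t = rng.random()  # interpolation factor in [0, 1)
    return tuple(a + t * (b - a) for a, b in zip(base, neighbor))
```

Because the synthetic point lies on the segment between two real minority samples, it stays inside the minority region instead of merely duplicating existing records as Random Oversampling does.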