Introduction: Physiological and psychological factors contribute to gastroesophageal reflux symptoms. However, most research assesses reflux symptom severity via questionnaires. We aim to identify the factors associated with real-time reflux symptom reporting using traditional statistical and machine learning approaches.
Aims and methods: Adult patients with refractory reflux symptoms (heartburn/regurgitation) completed 12 psychosocial questionnaires and standard 24-hour pH-impedance monitoring (pH-MII). pH-MII outcomes included proton-pump inhibitor (PPI) use (on/off), self-reported symptom frequency (via a button press within 2 minutes of experiencing a symptom), and the total number of acid reflux events. In the traditional statistical approach, principal component analysis (PCA) with varimax rotation reduced the psychological questionnaires into independent components. Next, the data were evaluated using a hurdle Poisson model consisting of 1) a logistic model examining the variables associated with reporting either no symptoms or ≥1 symptom during pH-MII, and 2) a Poisson model evaluating the variables associated with the number of symptoms reported among those who reported ≥1 symptom. The 5 psychological components, PPI use, and total reflux events served as independent variables. In the machine learning approach, the individual psychological questionnaires, PPI use, and total reflux events were entered into 11 different models with symptom frequency as the outcome. Cross-validated (CV) performance (8-fold CV) was compared across the models to select the best one.
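The two-part structure of a hurdle Poisson model can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the software used, and the count part is written here as a zero-truncated Poisson, which is the usual textbook form and an assumption about the model described above. The function names and the numeric linear predictors (1.2 and 2.0) are hypothetical, chosen only for illustration.

```python
import math

def hurdle_poisson_pmf(k, p_any, lam):
    """P(Y = k) under a hurdle Poisson model.

    p_any: probability of reporting >= 1 symptom (the logistic part).
    lam:   rate of the zero-truncated Poisson for the count part.
    """
    if k < 0:
        return 0.0
    if k == 0:
        # Part 1: the hurdle is not crossed.
        return 1.0 - p_any
    # Part 2: Poisson probability, renormalised to exclude zero.
    poisson_k = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    return p_any * poisson_k / (1.0 - math.exp(-lam))

def hurdle_poisson_mean(p_any, lam):
    """Expected symptom count implied by the two parts together."""
    return p_any * lam / (1.0 - math.exp(-lam))

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical linear predictors for one patient (illustrative values only).
p_any = logistic(1.2)   # logistic part: any symptom vs. none
lam = math.exp(2.0)     # count part: log link on the Poisson rate
```

Because each part has its own linear predictor, a variable can be associated with whether any symptom is reported without being associated with how many are reported, and vice versa.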
Results: 393 participants [mean (SD) age=48.5 (14.1); 60% female] were included. 301 (84%) participants had ≥1 symptom (median=8). PCA identified 5 components: health anxiety, general psychological health, personality, pain coping, and social functioning. General psychological health was the only significant variable in the logistic model. Health anxiety, pain coping, total reflux events, and PPI use were significant in the Poisson model (Table 1). Gradient Boosting Tuned 3 demonstrated the best CV predictive performance (average squared error: 25.62) and explained 53% of the variance in symptom frequency. The top 5 variables with the highest worth (i.e., contribution to symptom frequency) were total reflux events, pain-related anxiety, depressive symptoms, trait anxiety, and pain catastrophizing (Table 2).
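Selecting a model by cross-validated average squared error can be sketched as follows. This is a hedged illustration of the general procedure only: the two candidate models (a mean baseline and a one-feature least-squares line) are simple stand-ins for the 11 models in the study, the data are synthetic, and all function and variable names are hypothetical.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_average_squared_error(x, y, fit, predict, k=8, seed=0):
    """Mean squared held-out error, pooled over all k folds."""
    total = 0.0
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([x[i] for i in train], [y[i] for i in train])
        total += sum((y[i] - predict(model, x[i])) ** 2 for i in held_out)
    return total / len(y)

# Stand-in candidate 1: always predict the training mean.
def fit_mean(x, y):
    return sum(y) / len(y)

def predict_mean(model, xi):
    return model

# Stand-in candidate 2: least-squares line on a single predictor.
def fit_linear(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    return (my - slope * mx, slope)

def predict_linear(model, xi):
    intercept, slope = model
    return intercept + slope * xi

# Synthetic data (not study data): outcome rises with a single predictor.
rng = random.Random(1)
x = [i % 10 for i in range(40)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 0.5) for xi in x]

candidates = {
    "mean_baseline": (fit_mean, predict_mean),
    "one_feature_linear": (fit_linear, predict_linear),
}
scores = {name: cv_average_squared_error(x, y, f, p, k=8)
          for name, (f, p) in candidates.items()}
best_name = min(scores, key=scores.get)  # lowest CV error wins
```

Each observation is held out exactly once, so every candidate is scored only on data it was not fitted to, and the candidate with the lowest score is retained.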
Conclusions: General psychological symptoms (e.g., depressive symptoms) are associated with the presence or absence of real-time reflux symptoms. In those who do report symptoms, both illness-specific psychological processes and reflux events are associated with symptom frequency. Thus, reflux symptom reporting is a multifactorial process: general and illness-specific psychological processes, as well as reflux events, contribute to different aspects of the symptom experience.

