The study sample included Prisma Health patients, aged 18 years or older, engaged in ambulatory care and condition management, inpatient case management, or community health in South Carolina’s central (Midlands) and northwestern (Upstate) regions. This study population reflects the health system’s initial efforts to pilot SDoH screening.
Prisma Health is the largest non-profit health system in South Carolina and treats about 1.4 million patients annually. Its geographic footprint covers about half of the state’s total population. Census data based on the system’s geographic coverage describe the areas served by the health system as predominantly White (63.4%), with the second most common racial or ethnic group being Black (26.7%). Of individuals in this geographic area, 29.0% have a bachelor’s degree or higher (versus 32.9% nationally), 13.8% would be considered persons in poverty (versus 11.4% nationally), and 13.2% are without health insurance (versus 10.2% nationally). Median household income is $54,864 versus $64,994 nationally, and the areas covered have a population density of 332.6 people per square mile (areas with fewer than 500 people per square mile and fewer than 2,500 people are considered rural).
The data for this study came from the health system’s 2020 pilot of NowPow, a digital SDoH screening and referral platform. NowPow is a software program, embedded within the electronic medical record (EMR), that matches patients with SDoH-related needs to local resources based on their demographic information (e.g., distance from home).
This study uses three data sources: SDoH screening information from NowPow, EMR data, and geocoded patient addresses. This study was reviewed by the Institutional Review Board (IRB) at Prisma Health (IRB# Pro00105482) and approved as human subjects exempt. Informed consent was waived by the IRB committee, and the study was conducted in accordance with all relevant IRB guidelines.
Data were collected for the period June 1, 2019 to December 31, 2020 from NowPow and two EMR sources. SDoH screening information comes from a 13-item SDoH screening questionnaire that draws on validated instruments such as the Hunger Vital Sign, the UCLA Three-Item Loneliness Scale, and the Single Item Literacy Screener (SILS). The questionnaire has been further described elsewhere. Questions and full responses appear in Appendix 1. Questions were chosen to cover social needs that have documented links to health outcomes and for which resources were available to intervene locally. The questionnaire was verbally administered and captured SDoH needs for food insecurity, housing instability and quality, financial instability, transportation needs, interpersonal violence/abuse, language and health literacy, and social connectedness. Based on responses to screening questions, patients were referred to community-based resources via a digital prescription (‘HealtheRx’) in NowPow.
The two EMR data sources were the health system’s Epic and a local clinically integrated network’s database (inVio Health Network). A study database was created from NowPow questionnaire responses and EMR data by linking medical record numbers.
Multiple screens for a single patient could occur during the study period. For a given patient, the screen that took place earliest in the study period was categorized as the index screen. If multiple screens were recorded on that day, the last screen of the day was categorized as the index screen.
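The index-screen rule can be illustrated with a minimal Python sketch. It assumes the rule means "earliest calendar day, then the last screen recorded on that day"; the record layout (`screen_id`, `timestamp`) is hypothetical and not taken from the study database.

```python
from datetime import datetime

def index_screen(screens):
    """Pick one patient's index screen.

    Assumed reading of the rule: take the earliest calendar day with a screen,
    then the latest screen recorded on that day.
    """
    first_day = min(s["timestamp"].date() for s in screens)
    same_day = [s for s in screens if s["timestamp"].date() == first_day]
    return max(same_day, key=lambda s: s["timestamp"])

# Hypothetical screens for a single patient.
screens = [
    {"screen_id": "a", "timestamp": datetime(2020, 3, 2, 9, 15)},
    {"screen_id": "b", "timestamp": datetime(2020, 3, 2, 14, 40)},
    {"screen_id": "c", "timestamp": datetime(2020, 7, 19, 11, 5)},
]
print(index_screen(screens)["screen_id"])
```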
Finally, patient addresses were available through the EMR. Each address was associated with a latitude–longitude from the United States Census Geocoder. The latitude–longitude coordinates were then linked to the 2020 TIGER Block Group Shapefile from the US Census Bureau.
ED visit information was drawn from the inVio network EMR and the date of the index screen was available in the linked database. The number of ED visits after the index screen and before the end of the study period was calculated and was the primary outcome of interest in the study.
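As an illustration of the outcome definition, the following Python sketch counts ED visits for one patient. The boundary handling (visits strictly after the index date, up to and including the last day of the study period) is an assumption, since the text does not state how same-day visits were treated; the function and variable names are hypothetical.

```python
from datetime import date

STUDY_END = date(2020, 12, 31)  # end of the study period given in the text

def ed_visits_after_index(index_date, ed_visit_dates, study_end=STUDY_END):
    """Count ED visits after the index screen and before the end of the study.

    Assumption: a visit on the index date itself is not counted, and a visit
    on the final study day is counted.
    """
    return sum(1 for d in ed_visit_dates if index_date < d <= study_end)

# Hypothetical visit history for one patient.
visits = [date(2020, 2, 10), date(2020, 3, 2), date(2020, 5, 1),
          date(2020, 12, 31), date(2021, 1, 4)]
print(ed_visits_after_index(date(2020, 3, 2), visits))
```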
Key independent variable – SDoH screening responses
Responses to the SDoH screening questions from the 13-item questionnaire were modeled as follows. Eight of the thirteen screening questions were Yes/No, and ‘No’ was set as the reference level. The other five questions had more than two response categories, and their responses were dichotomized into Yes/No. For health literacy, ‘Never’ and ‘Rarely’ constituted the ‘No’ category and the remaining responses (‘Some of the time’, ‘Often’, ‘Always’) were grouped into the ‘Yes’ category, as suggested by Morris et al. The remaining four questions (two concerning food insecurity and two concerning social connectedness) were categorized such that the response indicating the lowest frequency (e.g., ‘Never true’ or ‘Hardly ever’) was the ‘No’ category and the remaining responses were grouped into the ‘Yes’ category.
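The dichotomization rules can be sketched in Python as follows. The function names are hypothetical, and the example response ‘Sometimes true’ is assumed from the standard Hunger Vital Sign wording rather than taken from Appendix 1. Handling of missing responses is not covered here.

```python
# Health literacy responses that map to 'No', per the rule described above.
HEALTH_LITERACY_NO = {"Never", "Rarely"}

def dichotomize_health_literacy(response):
    """'Never'/'Rarely' -> 'No'; every other response -> 'Yes'."""
    return "No" if response in HEALTH_LITERACY_NO else "Yes"

def dichotomize_least_frequent(response, least_frequent):
    """Food insecurity / social connectedness items: only the least frequent
    response option (e.g. 'Never true' or 'Hardly ever') maps to 'No'."""
    return "No" if response == least_frequent else "Yes"

print(dichotomize_health_literacy("Some of the time"))
print(dichotomize_least_frequent("Sometimes true", "Never true"))
print(dichotomize_least_frequent("Hardly ever", "Hardly ever"))
```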
Nine additional patient demographic variables were considered: smoking status (Yes/No), age at the time of the screen, female (Yes/No), pregnant (Yes/No), Hispanic (Yes/No), primary payer (Medicaid/Medicare/Other), race (White/Black/Other/Patient Refused), weight category according to BMI (Underweight/Healthy/Overweight/Obese), and total number of comorbidities (0–9).
To calculate the total number of comorbidities, the presence (present = 1, absent = 0) of each of asthma, cancer (heart, colorectal, prostate, lung, endometrial), chronic pulmonary disease, depression, diabetes, heart failure, substance use (alcohol use and drug use), Alzheimer’s disease (including related disorders or senile dementia), and cerebrovascular disease was summed. The Charlson Comorbidity Index was used to define cerebrovascular disease; the remaining comorbidities were defined by the Chronic Conditions Data Warehouse (CCW).
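A minimal Python sketch of the comorbidity count follows. It assumes that each of the nine condition groups contributes a single 0/1 indicator (so that any listed cancer counts once, and alcohol or drug use counts once), which is consistent with the stated 0–9 range; the key names are hypothetical.

```python
# One indicator per condition group; nine groups give a count between 0 and 9.
COMORBIDITIES = (
    "asthma", "cancer", "chronic_pulmonary_disease", "depression", "diabetes",
    "heart_failure", "substance_use", "alzheimers_disease",
    "cerebrovascular_disease",
)

def comorbidity_count(flags):
    """Sum the 0/1 indicators; a condition missing from `flags` counts as absent."""
    return sum(1 for name in COMORBIDITIES if flags.get(name, False))

# Hypothetical patient with asthma and diabetes only.
patient = {"asthma": True, "diabetes": True, "cancer": False}
print(comorbidity_count(patient))
```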
To create the weight category variable, BMI cutoffs were set at less than 18.5, 18.5 to less than 25, 25 to less than 30, and 30 or greater and corresponded to weight categories of underweight, healthy, overweight, and obese, respectively. The race data from the EMR included levels such as White, Black, American Indian or Alaskan Native, Asian, biracial or multiracial, etc. A new race variable was created that re-categorized levels to White, Black, Other, and Patient Refused. Race categories included Hispanic individuals and should not be interpreted as ‘non-Hispanic’.
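The two recoding steps can be sketched in Python as below. The BMI cutoffs are those stated above; the exact EMR race labels, and the assumption that ‘Patient Refused’ appears as an EMR level, are illustrative only.

```python
def weight_category(bmi):
    """Map BMI to the four weight categories using the stated cutoffs."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25:
        return "Healthy"
    if bmi < 30:
        return "Overweight"
    return "Obese"

def recode_race(emr_race):
    """Collapse EMR race levels to White / Black / Other / Patient Refused.

    Assumption: the EMR labels for the retained levels match these strings.
    """
    if emr_race in ("White", "Black", "Patient Refused"):
        return emr_race
    return "Other"

print(weight_category(27.3), recode_race("Asian"))
```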
The number of ED visits after the index screen was the response variable and was first modeled solely by a spatially varying intercept term. Bayesian zero-inflated Poisson, Poisson hurdle, zero-inflated negative binomial, and negative binomial hurdle regression models were considered. The zero-inflated negative binomial model was considered the best fit, as it produced the smallest deviance information criterion (DIC). The model was used to assess whether ED visits after the index screen per person per week varied across US Census block groups.
Next, the relationships between the number of ED visits and the response to each screening question were further explored with Bayesian zero-inflated Poisson, Poisson hurdle, zero-inflated negative binomial, and negative binomial hurdle models. Thirteen different models were created, one for each screening question. The expected value of the Poisson/negative binomial distribution governing the number of ED visits for individual \(i\) was given by \(\exp\left(\eta_i\right)\), where \(\eta_i = t_i + \boldsymbol{x}_i\boldsymbol{\beta} + r_i\psi_{s_i}\).
Here, \(t_i\) is an offset term equal to the log of the number of weeks between individual \(i\)’s index screen and the end of the study period; \(\boldsymbol{x}_i\) is a vector of covariates belonging to individual \(i\); \(\boldsymbol{\beta}\) is the associated vector of coefficients; \(r_i = 1\) if individual \(i\) responded ‘yes’ to the screening question under consideration and 0 otherwise; \(s_i\) denotes the census block group to which individual \(i\) belongs; and \(\psi_s\) is a spatially varying coefficient for block group \(s\). The covariates initially included in \(\boldsymbol{x}_i\) were an intercept, smoking status, age at the time of the screen, female, pregnant, Hispanic, primary payer, race, weight category, and total number of comorbidities.
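To make the role of the offset concrete, the following Python sketch evaluates the linear predictor for one hypothetical individual. Because the offset is the log of follow-up weeks, the mean of the count distribution equals weeks multiplied by a weekly rate, and a ‘yes’ response in block group \(s\) multiplies that rate by \(\exp(\psi_s)\). All numeric values are invented for illustration and are not estimates from the study.

```python
import math

def linear_predictor(weeks, x, beta, r, psi_s):
    """eta_i = t_i + x_i beta + r_i * psi_{s_i}, with offset t_i = log(weeks)."""
    t = math.log(weeks)
    return t + sum(xj * bj for xj, bj in zip(x, beta)) + r * psi_s

# Hypothetical values: intercept-only covariate vector, baseline rate of
# 0.02 visits per week, and a block-group rate ratio of 1.5 for 'yes'.
x, beta = [1.0], [math.log(0.02)]
psi_s = math.log(1.5)

mean_yes = math.exp(linear_predictor(40, x, beta, r=1, psi_s=psi_s))
mean_no = math.exp(linear_predictor(40, x, beta, r=0, psi_s=psi_s))
print(round(mean_yes, 3), round(mean_no, 3))  # 40 * 0.02 * 1.5 and 40 * 0.02
```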
Models were fit using the INLA package version 22.05.07 in R version 4.2.0. The vector of spatially varying coefficients \(\boldsymbol{\psi} = (\psi_1, \psi_2, \dots, \psi_S)^{\prime}\) was assumed to follow a conditional autoregressive (CAR) model at the census block group level. All parameters were assumed to follow the default prior specifications in the R INLA package for the ‘besagproper’ prior. For each model, zero-inflated Poisson, Poisson hurdle, zero-inflated negative binomial, and negative binomial hurdle regression were considered. The negative binomial hurdle model produced the lowest DIC for all 13 models and was designated as the model of choice.
In addition to the empirical support provided by DIC, the negative binomial hurdle model also has strong theoretical support. Negative binomial hurdle models are commonly used to model emergency department visits [42,43,44,45]. In a zero-inflated model, observations equal to zero can be either ‘structural’ zeros (modeled by a point mass at zero) or ‘sampling’ zeros (modeled by the chosen count distribution). In a hurdle model, all zero observations are structural zeros and the count distribution models only non-zero observations. When analyzing healthcare visit data, which are likely to have high numbers of zeros, it is common to conceptualize individuals as first deciding to seek care and then determining (or being provided advice on) what kind or how much care to seek. In this context, all zeros are structural zeros, as they arise from individuals who do not seek care, making the hurdle model more appropriate.
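The distinction between the two model families can be illustrated with a short Python sketch of their probability mass functions. For brevity, a Poisson count distribution stands in for the negative binomial used in the study, and the parameterization (a single zero probability `pi`) is a generic textbook form, not necessarily the one used internally by INLA.

```python
import math

def poisson_pmf(k, mu):
    return math.exp(-mu) * mu**k / math.factorial(k)

def zero_inflated_poisson_pmf(k, mu, pi):
    """pi is the probability of a structural zero; the Poisson part can
    still produce additional 'sampling' zeros."""
    if k == 0:
        return pi + (1 - pi) * poisson_pmf(0, mu)
    return (1 - pi) * poisson_pmf(k, mu)

def hurdle_poisson_pmf(k, mu, pi):
    """pi is the probability of any zero; positive counts follow a
    zero-truncated Poisson, so the count part never produces a zero."""
    if k == 0:
        return pi
    return (1 - pi) * poisson_pmf(k, mu) / (1 - poisson_pmf(0, mu))

# With the same (hypothetical) parameters, the zero-inflated model puts more
# mass on zero than pi alone, whereas the hurdle model puts exactly pi on zero.
print(zero_inflated_poisson_pmf(0, 2.0, 0.7), hurdle_poisson_pmf(0, 2.0, 0.7))
```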
Next, backward stepwise selection was performed on each model individually. Apart from the intercept and the spatial term, all variables were considered for removal. A variable was removed if its exclusion lowered the model’s DIC by at least two. If the removal of more than one variable met this threshold, the variable whose removal produced the largest decrease in DIC was removed. This process was repeated until no variable met the removal criterion, and the resulting model was used. The set of covariates included in the final model after the stepwise selection procedure varied by screening question. Table 1 indicates which covariates were included in each screening question model. The final specifications were analyzed to assess whether the association between the response to each screening question and ED visits after the index screen differed across block groups.
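The selection procedure can be sketched in Python as follows. The function `fit_dic` is a hypothetical stand-in for refitting the model (with the intercept and spatial term always retained) and returning its DIC; the toy DIC function below is invented purely so the sketch runs and does not reflect the study's results.

```python
def backward_select(candidates, fit_dic, threshold=2.0):
    """Backward stepwise selection on DIC.

    At each step, drop the covariate whose removal lowers DIC the most,
    provided the decrease is at least `threshold`; stop otherwise.
    """
    kept = list(candidates)
    current = fit_dic(kept)
    while kept:
        trials = {v: fit_dic([c for c in kept if c != v]) for v in kept}
        best = min(trials, key=trials.get)
        if current - trials[best] < threshold:
            break
        kept.remove(best)
        current = trials[best]
    return kept, current

def toy_dic(covariates):
    """Invented DIC: each term costs 3 units; useful terms repay more."""
    useful = {"age": 40.0, "payer": 25.0}
    return 1000.0 + sum(3.0 - useful.get(c, 0.0) for c in covariates)

kept, dic = backward_select(["age", "payer", "pregnant", "hispanic"], toy_dic)
print(kept, dic)
```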