Common inflammatory proteins linking frailty and area-level deprivation as key drivers of cardiovascular risk in women

Study population

Discovery cohort

Study participants were individuals enrolled in the TwinsUK Registry, a national register of adult twins recruited as volunteers without selecting for any particular disease or trait17. It is among the most detailed omics and phenotypic bioresources worldwide, including over 14,000 twins comparable to the general population for health and lifestyle characteristics17. Historically, the cohort is predominantly female. All twin participants provided informed written consent. The TwinsUK study, approved by the St Thomas’ Hospital Research Ethics Committee (REC reference: EC04/015), encompasses longitudinal clinical and multi-omic phenotyping, including substudies such as ours that utilise existing data collected under this protocol17.

Our study included 2144 twins with concurrent measures of frailty, social deprivation level, cardiovascular phenotypes and inflammation-related markers. Since frailty may change with age, we selected individuals with a maximum gap of 5 years between their frailty index (FI) and Olink proteomic data collection. Since the majority of subjects from TwinsUK exhibited similar socioeconomic status over time, we matched these individuals with index of multiple deprivation (IMD) data from the date closest to their Olink data collection. This selected cohort comprised the final population for this study.

Frailty

Frailty was estimated using the Rockwood FI, a comprehensive measure of the individual’s health and functional status. Thirty-nine binary outcomes related to health deficits, including physical and mental health, were identified from questionnaires and clinical visits18. FI was calculated as the ratio of the individual's number of health deficits to the number of completed domains. Individuals were categorised as non-frail (FI ≤ 0.08), pre-frail (0.08 < FI < 0.25) or frail (FI ≥ 0.25)19. When used as a continuous trait, FI was log-transformed and standardised.
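The FI calculation and the three categories can be sketched as follows. This is a minimal illustration rather than the study's code: the function names are hypothetical, each of the 39 items is treated as one "domain", and a missing answer is assumed to be encoded as None.

```python
from typing import Optional, Sequence


def frailty_index(deficits: Sequence[Optional[int]]) -> float:
    """Ratio of deficits present (1) to completed items (0 or 1).

    Items coded None are treated as not completed (an assumption).
    """
    completed = [d for d in deficits if d is not None]
    if not completed:
        raise ValueError("no completed items")
    return sum(completed) / len(completed)


def frailty_category(fi: float) -> str:
    """Apply the cut-offs stated in the text: <=0.08, (0.08, 0.25), >=0.25."""
    if fi >= 0.25:
        return "frail"
    if fi > 0.08:
        return "pre-frail"
    return "non-frail"


# Made-up participant: 10 deficits present, 27 absent, 2 items unanswered.
answers = [1] * 10 + [0] * 27 + [None] * 2
fi = frailty_index(answers)  # 10 / 37
print(round(fi, 3), frailty_category(fi))
```

Dividing by the number of completed items, rather than by 39, keeps the index comparable between participants who skipped different numbers of questions.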

SES measurement (area level based)

The index of multiple deprivation (IMD), a measure of neighbourhood-level socioeconomic status, was determined from the residential address postcode20. IMD scores are divided into deciles, with the first decile representing the most deprived areas and the tenth decile the least deprived areas.

Individuals with IMD in the 1st quintile and FI ≥ 0.25 were defined to be frail and living in a socioeconomic deprived area, while individuals with IMD in the 5th quintile and FI ≤ 0.08 were defined to be non-frail and living in the least socioeconomically deprived areas.

Atherosclerotic cardiovascular disease

The Atherosclerotic Cardiovascular Disease (ASCVD) risk score estimates the individual 10-year CVD risk based on ethnicity, age, sex, and traditional cardiometabolic risk factors (type 2 diabetes, smoking, total cholesterol, HDL cholesterol, systolic blood pressure, and treatment of hypertension) and was calculated as previously described21. The ASCVD scores were square root (sqrt)-transformed to reduce skewness in the data distribution and standardised for use in calculating the association within the TwinsUK and Nottingham OA cohorts.
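As an illustration of the transformation described above, the sketch below square-root-transforms a set of risk scores and converts them to z-scores. The input values are made up, the function name is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def sqrt_standardise(scores):
    """Square-root transform, then centre to mean 0 and scale to SD 1."""
    roots = [math.sqrt(s) for s in scores]
    mean = statistics.fmean(roots)
    sd = statistics.stdev(roots)  # sample SD (assumption)
    return [(r - mean) / sd for r in roots]


# Made-up 10-year risk scores (%), right-skewed like typical risk data.
ascvd_percent = [1.2, 2.5, 4.0, 7.5, 15.0, 28.0]
z_scores = sqrt_standardise(ascvd_percent)
print([round(z, 2) for z in z_scores])
```

The square root compresses the long right tail, so a few high-risk individuals have less leverage in the regression models.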

Lifestyle factors

Physical activity was measured using the International Physical Activity Questionnaire22. Smoking and drinking status were assessed from questionnaires that measured the frequency of alcohol and cigarette consumption. Individuals were identified as current smokers or not according to their smoking habits. Participants with different responses for their drinking habits were classified as never drinking, occasionally drinking, or weekly drinking.

Dietary information

A validated 131-item semi-quantitative food frequency questionnaire (EPIC-FFQ)23 was used to estimate habitual dietary intake. From the FFQs, intakes of food items, macronutrients and micronutrients were determined using FETA software23,24. From these intakes, we also calculated indices representing the overall dietary pattern, including the healthy eating index (HEI), which characterises intakes of foods and nutrients and has been associated with chronic diseases25.

Ischaemic heart disease

Incident ischaemic heart disease was ascertained through self-reported questionnaires administered to participants at regular follow-up intervals. Specifically, participants were asked whether a doctor had formally diagnosed them with ischaemic heart disease since their previous assessment. We defined incident cases as those reporting a diagnosis of ischaemic heart disease during the follow-up period who had not reported such a diagnosis at baseline.

Replication cohort

Results were replicated in 57 female individuals (≤79 years) with knee osteoarthritis (OA) from the Nottingham OA study26,27, for whom protein profiles from the Olink panel and ASCVD-related risk factors had been measured. Systolic blood pressure (SBP) in the Nottingham OA cohort was imputed using a linear regression model fitted to data from patients with knee OA in the INSPIRE cohort. The INSPIRE Study, which investigated the effects of dietary fibre and exercise on knee OA, recruited patients from the Nottingham area between May 2022 and December 2023. The participants of the Nottingham OA cohort and the INSPIRE cohort came from the same geographic area, so the two groups were expected to share similar ethnic backgrounds and lifestyles. Three candidate SBP models were compared, using (i) age only, (ii) age and BMI, and (iii) age, BMI and sex as predictors. The model using age and BMI, which had the lowest Akaike Information Criterion and Bayesian Information Criterion, was selected to predict SBP in the Nottingham OA cohort. Ethical approval for the Nottingham OA study was obtained from the East Midlands – Derby NHS Research Ethics Committee (20/EM/0065, 18/EM/0154, respectively) and the Health Research Authority (protocol no: 19098, 18021). All participants provided informed written consent.
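The model comparison described above can be sketched with ordinary least squares and Gaussian AIC/BIC. Everything in this example is illustrative: the data are synthetic, the coefficients used to simulate SBP are arbitrary, and the helper name fit_ols_ic is hypothetical. It assumes NumPy is available.

```python
import numpy as np


def fit_ols_ic(X, y):
    """Fit OLS with an intercept; return (coefficients, AIC, BIC).

    Information criteria use the Gaussian maximum log-likelihood and count
    the regression coefficients plus the residual variance as parameters.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1] + 1
    loglik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    return beta, 2 * k - 2 * loglik, k * np.log(n) - 2 * loglik


# Synthetic "training" cohort standing in for INSPIRE (not real data).
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(45, 80, n)
bmi = rng.normal(29, 4, n)
sex = rng.integers(0, 2, n).astype(float)
sbp = 90 + 0.5 * age + 1.0 * bmi + rng.normal(0, 5, n)

candidates = {
    "age": np.column_stack([age]),
    "age+bmi": np.column_stack([age, bmi]),
    "age+bmi+sex": np.column_stack([age, bmi, sex]),
}
results = {name: fit_ols_ic(X, sbp) for name, X in candidates.items()}
best_by_aic = min(results, key=lambda name: results[name][1])

# Predict SBP for a hypothetical replication participant (age 66, BMI 31).
b0, b_age, b_bmi = results["age+bmi"][0]
predicted_sbp = b0 + b_age * 66 + b_bmi * 31
print(best_by_aic, round(float(predicted_sbp), 1))
```

Both criteria penalise extra parameters, so a predictor such as sex is retained only if it reduces the residual error by more than the penalty.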

Olink panel profiling

Seventy-four inflammation-related proteins from plasma samples were profiled using the Olink Proximity Extension Assay (PEA) technique (v.3021 panels), as previously described28,29,30. Briefly, PEA uses a pair of antibodies labelled with unique complementary oligonucleotides (proximity probes) to bind to their specific target protein in a sample. This binding brings the probes into proximity, allowing them to hybridise and enabling DNA amplification of the protein signal, which is then quantified using next-generation sequencing. Plasma samples were randomly allocated to 96-well plates, each including six Olink controls and nine wells populated with a master plasma mix distributed across all 13 plates. Protein levels were reported as Normalised Protein Expression (NPX), a relative quantification unit on the log2 scale. Plasma samples from the TwinsUK cohort were measured in two batches at different time points. We excluded proteins that were measured in only one batch of Olink data, and imputed missing data using a KNN-based imputation method. This process resulted in the selection of 69 out of 74 inflammation-related proteins for downstream analysis.
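Because NPX is on the log2 scale, a difference in NPX corresponds to a fold change in relative protein level. The short sketch below makes that arithmetic explicit; the function name and the example values are hypothetical, and the comparison is only meaningful for the same protein, since NPX is a relative unit.

```python
def npx_fold_change(npx_a: float, npx_b: float) -> float:
    """Fold change in relative level implied by two NPX values (log2 scale)."""
    return 2.0 ** (npx_a - npx_b)


# A 1-unit NPX difference is a doubling; a 3-unit difference is 8-fold.
print(npx_fold_change(6.0, 5.0))
print(npx_fold_change(7.5, 4.5))
```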

Statistics and reproducibility

Statistical analysis was performed using R 4.2.0 and the machine learning models were constructed using Python 3.7.0.

Frailty and IMD association

Linear mixed models were used to explore the relationship between FI and IMD, adjusting for (i) age, BMI and family relatedness (as a random effect) and (ii) additionally for the age at which full-time education ended, drinking, smoking, physical activity and HEI. The 95% confidence intervals for effect sizes were calculated using the R function confint(), which computes profile likelihood-based confidence intervals for the fixed effects.

We used a two-step approach to identify proteins associated with deprivation and frailty. We first employed a Random Forest model with SHapley Additive exPlanations (SHAP)31 to identify the 20 most predictive proteins (based on feature importance) for distinguishing frail individuals in deprived areas from non-frail individuals in advantaged areas. We then tested these proteins individually using linear mixed models, with deprivation and frailty as separate outcomes, adjusting for age, BMI, batch, and family structure.

To identify the most important features, a Random Forest model with SHAP was used to classify individuals as socially deprived and frail versus non-deprived and non-frail. Age-matched non-frail individuals living in the least deprived areas were selected as controls for frail individuals living in deprived areas. Individuals were split into a training set and a validation set (70%/30%) using a group split method for the feature selection. Individuals from the same family were always allocated as a group to avoid separating twins into different sets31. In addition to the Random Forest model, a Partial Least Squares Discriminant Analysis (PLS-DA) model was developed using the identified key features. Performance was assessed using the area under the receiver operating characteristic curve (AUC) after five-fold cross-validation.
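The family-aware split can be sketched as below: families, not individuals, are shuffled and assigned to one set, so co-twins never end up on opposite sides. The function name, the family ID format and the seed are hypothetical, and the 70% target is applied to the number of families, which is an assumption about how the split was implemented.

```python
import random
from collections import defaultdict


def family_group_split(family_ids, train_fraction=0.7, seed=0):
    """Return (train_idx, valid_idx) with every family kept in one set."""
    members = defaultdict(list)
    for idx, fam in enumerate(family_ids):
        members[fam].append(idx)
    families = sorted(members)
    random.Random(seed).shuffle(families)  # reproducible shuffle of families
    n_train = round(train_fraction * len(families))
    train = [i for fam in families[:n_train] for i in members[fam]]
    valid = [i for fam in families[n_train:] for i in members[fam]]
    return sorted(train), sorted(valid)


# Ten hypothetical twin pairs: individuals 0 and 1 share family F0, and so on.
family_ids = [f"F{i // 2}" for i in range(20)]
train_idx, valid_idx = family_group_split(family_ids)
print(len(train_idx), len(valid_idx))
```

Splitting twins across sets would leak information, because co-twins share genetics and early environment and the validation score would then be optimistic. scikit-learn's GroupShuffleSplit provides the same behaviour for pipelines built on that library.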

We selected the top 20 important proteins from SHAP analysis following a widely used rule of thumb, aligning with several other omics studies employing SHAP for feature ranking32,33,34. This approach balances interpretability and biological relevance while avoiding overfitting. We then explored the relationship between the top 20 inflammation-related markers from SHAP and (i) frailty (as continuous trait) and (ii) social deprivation (IMD) using linear mixed models, adjusting for age, BMI, batch effects, family relatedness and multiple testing (Benjamini–Hochberg). Identified proteins significantly associated with both FI and social deprivation (P < 0.05 and False Discovery Rate [FDR] <0.1) were taken for downstream analysis.
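The Benjamini–Hochberg adjustment and the dual threshold (P < 0.05 and FDR < 0.1) can be illustrated with a small pure-Python sketch. The p-values are made up and the function name is hypothetical; in R the same adjustment is available through p.adjust with method "BH".

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (FDR) in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for five proteins.
p = [0.001, 0.008, 0.039, 0.041, 0.20]
fdr = benjamini_hochberg(p)
selected = [i for i in range(len(p)) if p[i] < 0.05 and fdr[i] < 0.1]
print(fdr, selected)
```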

Investigating the role of traditional, environmental and lifestyle factors on the association of the identified proteins

Linear mixed models adjusting for age, BMI, batch and family relatedness (as a random effect), with Benjamini–Hochberg correction for multiple testing, were used to investigate the association between the identified protein markers and (i) educational attainment; (ii) physical activity; (iii) HEI; (iv) the ASCVD score. Protein markers with an FDR < 0.1 were considered statistically significant.

Replication of protein signatures in the Nottingham OA study

Linear regression models adjusting for age, BMI and batch, with Benjamini–Hochberg correction for multiple testing, were used to determine the association between the identified inflammation-related markers and the ASCVD risk score in the female participants from the Nottingham OA study.

Association between the ASCVD-associated proteins and incident ischaemic heart disease in TwinsUK

For the survival analysis, the time-to-event outcome was incident ischaemic heart disease. Cases were defined as participants diagnosed with ischaemic heart disease within 10 years following the collection of Olink inflammatory data. Individuals with a prior history of ischaemic heart disease or who developed the disease during follow-up were excluded from the control group. Age-matched controls were selected at baseline for subjects with ischaemic heart disease. The Kaplan-Meier method was used to estimate the probability of ischaemic heart disease for individuals in the top and bottom tertiles of four replicated protein markers. Log-rank tests were performed to assess whether there was a statistically significant difference in disease risk between groups. To evaluate the impact of other covariates, including BMI and other CVD-related factors (smoking status, diabetes diagnosis, use of anti-hypertensive medication, total cholesterol level, HDL cholesterol level), on ischaemic heart disease risk, we used mixed-effects Cox regression models to estimate the association between externally validated markers and 10-year ischaemic heart disease risk, accounting for batch effects, family relatedness, and potential confounders.
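The Kaplan-Meier estimate used for each tertile group can be sketched in a few lines. This is an illustrative implementation on made-up follow-up data, with a hypothetical function name; in practice one curve would be computed for the top tertile and one for the bottom tertile of each protein, and the curves compared with a log-rank test.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up in years; events: 1 = diagnosed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        diagnosed = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            diagnosed += data[i][1]
            leaving += 1
            i += 1
        if diagnosed:
            survival *= 1 - diagnosed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up group of eight participants followed for up to 10 years.
times = [2, 3, 3, 5, 8, 10, 10, 10]
events = [1, 0, 1, 1, 0, 0, 0, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Censored participants leave the risk set without changing the survival estimate, which is how the method uses partial follow-up instead of discarding it.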

Exploring the mediatory role of the replicated proteins in TwinsUK in the association between IMD, frailty and ASCVD

We ran a mediation analysis to investigate the mediatory role of the identified proteins in the association between (i) IMD and ASCVD risk; (ii) FI and ASCVD risk. We conducted causal mediation analyses following the Baron and Kenny framework35. First, we evaluated the three essential mediation assumptions: (1) significant association between the independent variable and the dependent variable, (2) significant association between the independent variable and the mediator, and (3) significant association between the mediator and the dependent variable when controlling for the independent variable. After confirming these assumptions, we implemented formal causal mediation analysis using the ‘mediate’ function from the R package ‘mediation’ (version 4.5.0)36. Each mediator was analysed independently. We determined significant mediation based on both statistical significance (p < 0.05) and the magnitude of the indirect effect. The variance accounted for (VAF) was determined as the ratio of the indirect-to-total effect and distinguished the proportion of the variance explained by the mediation process (the proportion of the effect of social deprivation and frailty on CVD that goes through the protein markers).
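The VAF arithmetic can be illustrated with the classical product-of-coefficients decomposition for linear models. This is a simplified sketch, not the simulation-based estimation performed by the R 'mediation' package; the function name and the coefficient values are made up. Here a is the effect of the exposure (IMD or FI) on the protein, b is the effect of the protein on ASCVD risk adjusted for the exposure, and direct is the exposure effect on ASCVD risk adjusted for the protein.

```python
def variance_accounted_for(a: float, b: float, direct: float) -> float:
    """Proportion of the total effect that passes through the mediator."""
    indirect = a * b
    total = direct + indirect
    if total == 0:
        raise ValueError("total effect is zero; VAF is undefined")
    return indirect / total


# Made-up standardised coefficients: indirect = 0.06, total = 0.20.
vaf = variance_accounted_for(a=0.2, b=0.3, direct=0.14)
print(round(vaf, 2))
```

A VAF near 0 means the protein carries little of the association, while a value near 1 would indicate that most of the effect runs through it.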

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
