Other assessments carried out across the world support the use of ECP. For example, the Health Policy Advisory Committee on Technology (HealthPACT) of Australia and New Zealand has submitted to Euroscan, the International Information Network on New and Emerging Health Technologies, an assessment report on ECP for the treatment of GvHD. In fact, ECP has been classified as an innovative treatment with significant expected health benefits, in particular considering that there is no standard therapy for patients who fail steroid therapy. In 2011, HealthPACT released a report on ECP for the treatment of GvHD after bone marrow transplantation. According to the evidence reviewed in the report, ECP was judged to be well tolerated and safe in seriously ill adult and pediatric patients with limited treatment options. HealthPACT recommended that information on ECP be made widely known and did not ask for further research.
The growing acceptance of ECP in standard practice is evidenced by the publication of a protocol by the Cochrane Childhood Cancer Group on the comparison between ECP and alternative treatments in the management of cGvHD in pediatric patients.
ECP has already received recommendation for routine use in many countries. For example, the Centers for Medicare & Medicaid Services defined ECP as a reasonable and necessary treatment for patients with cGvHD whose disease is refractory to standard immunosuppressive drug treatment. Also, Cancer Care Ontario defined ECP as an acceptable therapy for the treatment of steroid-dependent/refractory acute GvHD and cGvHD in adult and pediatric patients. In the United Kingdom, ECP is suggested for use in the second-line treatment of cGvHD, as recommended by Dignan et al. These recommendations on ECP, as well as those from the Italian Society of Hemapheresis and Cell Manipulation and the Italian Group for Bone Marrow Transplantation, are made on the basis of its efficacy and safety, even though data from randomized clinical trials are still missing. In this respect, this work has to be considered an advance in knowledge because it provides evidence about the alternatives routinely used in the management of steroid-refractory/resistant cGvHD and the economic profile of Therakos online ECP. As far as the economic evaluation is concerned, a literature search performed on PubMed yielded a study assessing the cost-effectiveness of ECP in comparison to rituximab and imatinib in steroid-refractory cGvHD. The study showed that, from the Spanish NHS viewpoint, ECP is cost-effective. Furthermore, another cost-effectiveness analysis, conducted in Poland, showed that ECP is the most cost-effective alternative in the management of patients affected by cGvHD, compared with no alternative. These results are in line with our evaluation, which takes into consideration more alternatives. In conclusion, this comprehensive work may be considered an important step in the evaluation of the use of Therakos online ECP in steroid-refractory/resistant cGvHD in the Italian context and, with respect to some data, elsewhere.
It showed that Therakos online ECP should be considered an effective, safe, and cost-effective alternative in steroid-refractory/resistant cGvHD and should be promoted and used, avoiding regional differences.
The clinical justification for switching between brands of the same pharmaceutical preparation rests on an assumption of bioequivalence, that is, that both medicines have the same overall bioavailability and maximum plasma concentrations. A limited number of medicines are traditionally considered noninterchangeable, either because bioequivalence has not been established or because the therapeutic index is narrow and the risk of toxicity high. Despite this, successful substitution with generic cyclosporine, long considered the archetypical noninterchangeable medicine, has been recently reported in transplant patients. Establishing bioequivalence, however, does not guarantee acceptance of the “same-but-different” generic medicine by health professionals or patients, and concerns linger around the interchangeability of generic medicines with their originator counterparts. Literature related to brand switching of SSRI and serotonin-noradrenaline reuptake inhibitor medicines focuses mainly on the market share of the brands, especially in insurance settings with tiered-pricing plans that favor generic equivalents for full subsidy. Studies evaluating health outcomes of SSRI brand switching are limited in number and methodology. A US-based study reported increased health care costs associated with therapeutic brand switching (changing from one chemical entity to another), yet reported on brand-to-generic switching of SSRIs as a whole. The switcher patients included in this study also had significantly different baseline scores for depression than did their matched nonswitcher counterparts. Available reviews of brand-to-generic switching of psychotropic medicines include literature on a diverse range of medicines including older and newer antidepressants, antipsychotics, and antiepileptic medicines, blurring the picture on these distinct pharmacological entities and contributing to misperceptions.
New Zealand’s Pharmaceutical Management Agency (PHARMAC) is the agency responsible for making funding decisions regarding which pharmaceutical preparations will be listed in the Pharmaceutical Schedule and thus provided largely free to all New Zealanders (outside of a co-payment of
health inequalities; health surveys; population health; quality-adjusted life-years
There are various ways of summarizing a population’s overall lifetime experience of health by combining information on both mortality and morbidity. Perhaps the best known metrics are disability-free life expectancy (DFLE) and healthy life expectancy (HLE), which subtract years from life expectancy (LE) using a binary indicator of ill-health or disability. Recent efforts have been made to incorporate more sophisticated measures of morbidity into health expectancy estimates. Studies by Mathers et al. and Salomon et al. combined injury and disability prevalence rates with a set of disability weights to estimate disability- or health-adjusted LE, thereby reflecting the severity of conditions, not just their presence. Quality-adjusted life expectancy (QALE) is another recent approach to estimating health expectancy that uses a continuous ratio scale variable to measure morbidity, thus enabling it to incorporate detailed multiattribute data on health-related quality of life (HRQOL). The rising popularity of the quality-adjusted life-year (QALY) metric through its use in health technology assessment has led to its inclusion in national health surveys, affording researchers the opportunity to estimate QALY weights for a wide range of population subgroups using large, nationally representative data sets. Implementation of the QALE metric in health inequality research, however, has been limited to regional analyses, despite widespread application of other health expectancy indicators to inequality measurement.
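As a rough illustration of how a health expectancy such as QALE combines mortality and morbidity, the sketch below weights the person-years lived in each age band by a mean HRQOL weight and divides by the number of survivors at the starting age (a Sullivan-style calculation, assumed here for illustration). The function name, the three-band life table, and the weights are hypothetical and are not taken from any of the studies mentioned above.

```python
def health_expectancy(person_years, weights, survivors_at_start):
    """Weighted person-years lived, per survivor at the starting age.

    person_years: years lived by the life-table cohort in each age band
    weights: mean HRQOL weight (0 to 1) in each age band
    survivors_at_start: cohort size at the starting age
    """
    if len(person_years) != len(weights):
        raise ValueError("person_years and weights must have equal length")
    weighted = sum(years * w for years, w in zip(person_years, weights))
    return weighted / survivors_at_start


# Hypothetical abridged life table for a cohort of 100,000 births.
person_years = [4_000_000, 3_000_000, 1_000_000]  # three broad age bands
hrqol_weights = [0.9, 0.8, 0.6]                   # illustrative values only

# With all weights equal to 1 the result is ordinary life expectancy.
le = health_expectancy(person_years, [1.0, 1.0, 1.0], 100_000)
qale = health_expectancy(person_years, hrqol_weights, 100_000)
print(f"LE = {le:.1f} years, QALE = {qale:.1f} years")
```

Replacing the continuous weights with the proportion of people free of disability in each band would give a DFLE-type estimate from the same function.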
Step 2: Item Elimination
Table 3 summarizes the results of the item elimination analysis (more detail is provided in Supplemental Material 2 found at doi:10.1016/j.jval.2015.07.002). No items exhibited disordered thresholds. Five items from the physical subscale and one from the psychological subscale exhibited uniform DIF. Thirty-five and 22 respondents were removed from the physical and psychological subscale models, respectively, because of misfit to the Rasch model. No commonalities were found in the clinical or demographic characteristics of these respondents; hence, there was no indication that the scale was unsuitable for any particular subgroup of people with MS.
Initial overall fit statistics for both subscales indicated poor fit to the Rasch model. Eight items misfitted the model for the physical subscale, and two misfitted the model for the psychological subscale. Removing these items produced good overall fit to both models.
At the end of the item elimination phase, five conceptual dimensions were represented by one item each: General/other social/role functioning (IS13), Employment (IS16), Fatigue (IS23), Cognition (IS27), and Depression (IS29). A further three dimensions each had two remaining items: General/other physical functioning (IS01 and IS11), Mobility (IS14 and IS17), and General/other mental/emotional well-being (IS24 and IS26). Three dimensions were no longer represented because their constituent items had been eliminated: Independence (IS12), Bladder/bowel function (IS20), and Sleep quality (IS22).
Step 3: Item Selection
The aims of the item selection phase were to confirm the suitability of the items remaining as the sole representative of a dimension and to decide which items should be selected to represent the General/other physical functioning, Mobility, and General/other mental well-being dimensions. The results are summarized in Table 4.
All items that remained as the sole representative of a dimension had adequate spread across the latent space and well-spaced threshold probability curves at logit zero. Items IS13 and IS16 performed well across all criteria; IS23 and IS27 failed to meet the threshold for internal consistency but performed well against the other criteria; IS29 struggled against some criteria but exhibited the strongest internal consistency of any item from the psychological subscale.
General/Other Physical Functioning
IS01 showed a wider spread across the latent space than did IS11 and performed well on all criteria. IS11 had better spaced threshold probability curves but had a high fit residual and a relatively high proportion of missing data.
Mobility
Although IS14 and IS17 had equivalent spread across the latent space, the thresholds of item IS14 spanned logit zero whereas all thresholds for item IS17 were above logit zero, and the threshold probability curves for item IS14 were more widely spaced. IS14 had a high fit residual, whereas IS17 had a large ceiling effect.
As an example, Milstein et al. used SD to study and evaluate the US health system reform that included three main strategies: coverage, care, and protection. The model was designed to address questions around the impact of these strategies nationwide, individually and together. This is a typical example of a broad problem with systemwide implications that requires a holistic perspective with attention to dynamic processes within the system and its structure. The modelers estimated the relative and combined effects of the three strategies from 2000 to 2010 and asked what might have happened had the United States taken decisive action in these three areas during that decade in terms of reducing avoidable deaths and lowering health care costs for Americans. Results and simulated scenarios show that all three strategies have the potential of averting millions of deaths while offering good economic value. Beyond the 10-year horizon, however, protection yields the best result by saving more lives and money. The model offers a useful way of observing how the US health care system tends to respond to large-scale interventions. Scenarios let planners compare these major interventions regarding direction, timing, costs, and benefits. The interpretation of these results is as follows: 1) a 10-year horizon tends to obscure the full effect of interventions; 2) protective interventions could effectively complement coverage and care by ensuring that people stay healthy for longer, hence reducing excess demand on the health care system; and 3) because population-based prevention policies take longer to yield their full economic and health benefits, they should not be postponed until positive effects are seen from coverage and care.
Model outputs and level of insight are varied and dependent on the purpose of the model and the type of problem. In general terms, SD can produce patterns and trends, as well as mean values. SD allows for the elicitation of “mental models” from stakeholders involved in the discussions and also from those involved in the model-building process. A mental model is an explanation of the stakeholder’s thought process about how something works in the real world. This methodology generates a high level of insight about the problem and the system under study at strategic and policy levels.
Interpretation of outputs also depends on the type of problem and the purpose for which the model is designed. The model will not give a unique answer or optimal answer to a problem. Instead, the model allows experimentation to test alternative strategies (“what-if scenarios”) for system intervention and observing their potential outcomes to inform decision making before implementing a particular strategy.
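To make the idea of “what-if” experimentation concrete, the following minimal sketch simulates a two-stock model (healthy and ill) with simple Euler integration and compares a baseline run with a scenario in which the incidence rate is lower. The model structure, the rates, and the initial stock sizes are hypothetical and are not those of any published SD model; the sketch only shows the kind of trend output that SD produces.

```python
def simulate(years, incidence_rate, recovery_rate,
             healthy=900.0, ill=100.0, dt=0.25):
    """Euler integration of two stocks linked by two flows.

    Returns the size of the 'ill' stock at every time step.
    All parameter values are illustrative assumptions.
    """
    steps = int(years / dt)
    trajectory = [ill]
    for _ in range(steps):
        falling_ill = incidence_rate * healthy  # flow: healthy -> ill (per year)
        recovering = recovery_rate * ill        # flow: ill -> healthy (per year)
        healthy += (recovering - falling_ill) * dt
        ill += (falling_ill - recovering) * dt
        trajectory.append(ill)
    return trajectory


# Baseline versus a hypothetical prevention scenario with lower incidence.
baseline = simulate(10, incidence_rate=0.05, recovery_rate=0.20)
prevention = simulate(10, incidence_rate=0.03, recovery_rate=0.20)
print(f"ill after 10 years: baseline {baseline[-1]:.0f}, "
      f"prevention {prevention[-1]:.0f}")
```

The output of interest is the whole trajectory under each scenario rather than a single optimal answer.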
DES is used to represent processes at an individual level where people may be subject to events, whether they be decisions or occurrences over time. DES is a simulation method that captures individual-level heterogeneity and is used to characterize and analyze queuing processes and networks of queues where there is an emphasis on the utilization of resources.
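A minimal sketch of the DES idea, assuming a single server (for example, one clinician) and a first-come, first-served queue: arrivals and departures are kept in a time-ordered event list, and each patient's waiting time is recorded individually. The function name, the exponential interarrival and service times, and all parameter values are illustrative assumptions.

```python
import heapq
import random
from collections import deque


def simulate_single_server(n_patients, mean_interarrival, mean_service, seed=1):
    """Return each patient's waiting time in a first-come, first-served queue."""
    rng = random.Random(seed)
    events = []  # heap of (time, tie_breaker, kind, patient_id)
    clock = 0.0
    for pid in range(n_patients):
        clock += rng.expovariate(1.0 / mean_interarrival)
        heapq.heappush(events, (clock, pid, "arrival", pid))

    queue = deque()   # (patient_id, arrival_time) of patients waiting
    server_busy = False
    waits = {}
    tie = n_patients  # unique tie-breaker for departure events

    def start_service(now, pid, arrived):
        nonlocal tie
        waits[pid] = now - arrived
        finish = now + rng.expovariate(1.0 / mean_service)
        heapq.heappush(events, (finish, tie, "departure", pid))
        tie += 1

    while events:
        now, _, kind, pid = heapq.heappop(events)
        if kind == "arrival":
            if server_busy:
                queue.append((pid, now))
            else:
                server_busy = True
                start_service(now, pid, now)
        else:  # a departure frees the server for the next patient, if any
            if queue:
                next_pid, arrived = queue.popleft()
                start_service(now, next_pid, arrived)
            else:
                server_busy = False
    return [waits[pid] for pid in range(n_patients)]


waits = simulate_single_server(200, mean_interarrival=10.0, mean_service=8.0)
print(f"mean wait: {sum(waits) / len(waits):.1f} time units")
```

Individual-level heterogeneity could be added by giving each patient attributes that modify the service-time distribution.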
It must be noted that the current tool is intended to be used to optimize the time of a reevaluation, not to calculate a posteriori the value of a study. Hence, in actual practice, the method must be applied to prospective data.
The registry data were gathered retrospectively during 2008-2009. The database includes 391 patients with stage III colon cancer receiving adjuvant therapy (see  for detailed inclusion criteria), of which 281 patients had been treated with oxaliplatin (FOLFOX in 136 patients and CAPOX in 145 patients). The remaining patients received capecitabine (93 patients) or 5FU/LV (17 patients). Follow-up time before a relapse or censoring was reported and used to estimate disease-free survival (DFS). Drug costs and follow-up costs were also registered.
Prediction of Missing Data Values
Some patients did not have a relapse and were censored at the end of the data collection period. For the purpose of this study, however, we need to consider the case in which the data would have been gathered beyond 2008-2009. Therefore, from 2009 onwards, simulation was used to project the remainder of each patient’s lifetime. The simulation was based on Weibull distributions for DFLDs fitted to the available patient data. Using conditional survivals, the expected future life expectancies were computed for all patients (see Appendix 1 in Supplemental Materials found at doi:10.1016/j.jval.2014.10.008). Medical costs were observed for the period 2005-2008 and used to simulate treatment costs and follow-up costs for the remaining years. We used constant costs per day for the treatment phase and a gamma distribution on the proportion of total costs in each time interval during the follow-up phase, based on opinions of the involved experts (for more details, see Appendix 2 in Supplemental Materials found at doi:10.1016/j.jval.2014.10.008). This resulted in a partly empirical, partly simulated database covering the periods 2005-2009 and 2010-2012, containing information on all patients diagnosed in 2005-2006.
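The role of conditional survival in this kind of projection can be sketched as follows, assuming a two-parameter Weibull with survival function S(t) = exp(-(t/scale)^shape). The shape and scale values below are hypothetical; the fitted parameters and the exact simulation procedure are described in the appendix cited above, not here.

```python
import math
import random


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def conditional_survival(s, t, shape, scale):
    """P(T > t + s | T > t) for a patient still event-free at time t."""
    return weibull_survival(t + s, shape, scale) / weibull_survival(t, shape, scale)


def sample_event_time(t_censored, shape, scale, rng):
    """Draw a total event time, conditional on no event before t_censored.

    Inverse-transform sampling: solve S(T) / S(t_censored) = u for T.
    """
    u = 1.0 - rng.random()  # in (0, 1], so log(u) is defined
    return scale * ((t_censored / scale) ** shape - math.log(u)) ** (1.0 / shape)


# Hypothetical parameters, in days; illustration only.
shape, scale = 0.8, 4000.0
rng = random.Random(42)
censored_at = 1200.0
draws = [sample_event_time(censored_at, shape, scale, rng) for _ in range(5)]
print(conditional_survival(365.0, censored_at, shape, scale), draws)
```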
Survivals, Costs, and INBs
Because the purpose was to evaluate the registry at potential points of making a definite decision, we looked at the data at the end of each year as if there would be no more information available after that date. This mimics how the procedure could be used prospectively for a new decision, using a completely empirical database.
For each year, we filtered the (partly simulated) DFS to find the patients who had started treatment before the end of that year. If the patient experienced no event before the end of the year, the patient was censored for that year. The costs were simulated for all patients during their treatment and follow-up time. When a patient was censored, the costs up to the censoring point were considered (see Appendix 2 in Supplemental Materials found at doi:10.1016/j.jval.2014.10.008).
This resulted in seven different data sets containing the data observed up to the end of each year. These were analyzed to find their overall mean and standard errors of survivals and costs, taking account of censoring.
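The year-end filtering and censoring described above can be sketched as follows. The record layout (a treatment start date plus an event date, or None if no event occurred) and the example patients are hypothetical; the sketch covers only the survival part, not the cost simulation.

```python
from datetime import date


def snapshot_at_year_end(patients, year):
    """Follow-up as it would appear on 31 December of `year`.

    patients: iterable of (start_date, event_date_or_None)
    Returns a list of (follow_up_days, event_observed) tuples.
    """
    cutoff = date(year, 12, 31)
    rows = []
    for start, event in patients:
        if start > cutoff:
            continue  # treatment had not started yet; not in this data set
        if event is not None and event <= cutoff:
            rows.append(((event - start).days, True))
        else:
            rows.append(((cutoff - start).days, False))  # censored at year end
    return rows


# Hypothetical patients; illustration only.
patients = [
    (date(2005, 3, 1), date(2006, 6, 1)),
    (date(2005, 9, 1), None),
    (date(2006, 2, 1), date(2009, 1, 15)),
]
for year in (2005, 2006):
    print(year, snapshot_at_year_end(patients, year))
```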
Because of the severity of the condition, we assumed a rather large λ of €60,000/disease-free life-year gained (~€82/DFLD) in the base case. We changed this in the sensitivity analysis. The expected INB at the end of each year i was calculated as follows:
\[ \mathrm{INB}_i = \frac{\lambda}{365} \times \left[ E_i(S_o) - E_i(S_c) \right] - \left[ E_i(C_o) - E_i(C_c) \right] \tag{1} \]
where $E_i(X)$ denotes the mean of parameter X at the end of year i. Assuming independence between costs and the DFS time, the squared standard error of $\mathrm{INB}_i$ is then
\[ \mathrm{s.e.}_{\mathrm{INB}_i}^{2} = \left( \frac{\lambda}{365} \right)^{2} \times \left[ \mathrm{s.e.}_i(S_o)^{2} + \mathrm{s.e.}_i(S_c)^{2} \right] + \left[ \mathrm{s.e.}_i(C_o)^{2} + \mathrm{s.e.}_i(C_c)^{2} \right] \tag{2} \]
This can be estimated for each year, using the number of patients in the registry at the end of each year ($n_i$).
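Equations 1 and 2 translate directly into code. The sketch below assumes that survival is measured in days (so that λ/365 is a value per day) and keeps the subscripts o and c as they appear in the equations. The function names and the input values are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
import math


def expected_inb(lam, mean_s_o, mean_s_c, mean_c_o, mean_c_c):
    """Equation 1: expected incremental net benefit at the end of a year."""
    return (lam / 365.0) * (mean_s_o - mean_s_c) - (mean_c_o - mean_c_c)


def se_inb(lam, se_s_o, se_s_c, se_c_o, se_c_c):
    """Square root of Equation 2, assuming costs and DFS time are independent."""
    variance = ((lam / 365.0) ** 2 * (se_s_o ** 2 + se_s_c ** 2)
                + (se_c_o ** 2 + se_c_c ** 2))
    return math.sqrt(variance)


# Hypothetical inputs: lam = 36,500 per year gives a value of 100 per day.
lam = 36_500.0
inb = expected_inb(lam, mean_s_o=1000.0, mean_s_c=900.0,
                   mean_c_o=20_000.0, mean_c_c=15_000.0)
se = se_inb(lam, se_s_o=3.0, se_s_c=4.0, se_c_o=300.0, se_c_c=400.0)
print(f"INB = {inb:.0f}, s.e. = {se:.1f}")
```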
Prior Distribution of INB
The distribution for INB at t0 (the start of the conditional reimbursement period) reflects the information available when the original decision to set up the registry was made. This distribution is called the prior distribution in a Bayesian analysis.
Appraisal of psychometric properties and indicative criteria.

| Psychometric property | Indicative criteria |
|---|---|
| Content validity | Clear conceptual framework consistent with stated purpose of measurement; qualitative research with potential respondents |
| Construct validity | Structural validity from factor analysis; post hoc tests of unidimensionality by Rasch analysis; hypothesis testing, with a priori hypotheses about direction and magnitude of expected effect sizes; tests for differential item and scale functioning between sex, age groups, and different diagnoses |
| Reproducibility | Test-retest reliability: ICC >0.7 adequate, >0.9 excellent. Proxy reliability: child- and parent-reported reliability ICC >0.7 |
| Internal consistency | Cronbach α coefficient >0.7 and <0.9 |
| Precision | Assessment of measurement error; floor or ceiling effects <15%; evidence provided by Rasch analysis and/or interval-level scaling |
| Responsiveness | Longitudinal data about change in scores with reference to hypotheses, measurement error, minimal important difference |

ICC, intraclass correlation coefficient.
For each article describing a study evaluating the psychometric performance of an eligible PROM, the following descriptive data were extracted: instrument version, first author name, publication year, study aim, study population, number of participants, age range, mean age, and setting or country where the study was conducted. Data were extracted by one reviewer (K.A.), and 50% were checked by a second (A.J.), with disagreements resolved by discussion with a third (C.M.), where necessary.
For each version of a PROM, evidence of the following psychometric properties was extracted: content validity (theoretical framework and/or qualitative research), construct validity (structural validity and hypothesis testing), internal consistency, test-retest reliability, proxy reliability, responsiveness, and precision. Data were extracted by one reviewer (K.A./A.J./A.T.) and checked by a second (A.J./K.A./A.J.), with disagreements resolved by discussion with a third (C.M.), where necessary.
Appraisal and Summary of Evidence for Psychometric Performance
Evidence of performance was summarized by psychometric property and judged using standardized reference criteria and thresholds (Table 1). We included an appraisal of validity, reliability, responsiveness, and precision. These data were summarized in a single rating for each measurement property following methods commonly used for the presentation of findings against the COSMIN criteria. Our summary judgment took into account the following elements: 1) data extracted from included studies, with reference to standard criteria; 2) the methodological quality of studies and number of studies; and 3) the thoroughness of testing, giving further weight to any studies that appeared not to have been conducted by the original developers (Table 2). Two reviewers (A.J./C.M.) made the judgment through discussion based on available evidence.
Indices for summarizing and appraising psychometric properties of patient-reported outcome measures.

| Rating | Definition | Description |
|---|---|---|
| 0 | Not reported | No studies found that evaluate this measurement property |
| ? | Not clearly determined | Studies were rated poor methodological quality; results not considered robust |
| − | Evidence not in favor | Studies were rated good or excellent methodological quality; results did not meet standard criteria for this property |
| +/− | Conflicting evidence | Studies were rated fair, good, or excellent methodological quality; results did not consistently meet standard criteria for this property, e.g., not for all domain scales |
| + | Some evidence in favor | Studies were rated fair or good methodological quality; standard criteria were met for the property |
| ++ | Some good evidence in favor | Studies were rated good or excellent methodological quality; standard criteria were met or exceeded |
| +++ | Good evidence in favor | Studies were rated good or excellent methodological quality; standard criteria were exceeded, results have been replicated |
The Labor and Delivery Index
LADY-X measures the labor and delivery experience of the woman who gave birth. It consists of seven items: 1) availability of competent professionals, 2) the information provided, 3) professionals’ responses to needs, 4) professionals’ emotional support, 5) feelings of safety, 6) concerns about the child’s condition, and 7) duration until first contact with child. Each item has three response categories, which vary by item; however, for all items, we presumed an ordering of the levels ranging from “very well,” through “adequately,” to “inadequately.” The levels of the last item, “duration until first contact,” vary slightly from this categorization. Based on these labels, the distance between the bottom category and the middle category is considered more meaningful than the distance between the middle category and the upper category for items one to six. For an overview of the seven items (termed attributes in the DCE context) and their levels, see Table 1.
Attribute description and levels.

| Attribute | Name | Abbreviation | Level 1 | Level 2 | Level 3 |
|---|---|---|---|---|---|
| 1 | Availability of competent health care professionals | Availability | At all times | Most of the time | Rarely |
| 2 | Information provided by health care professionals | Information | Very well | Adequately | Inadequately |
| 3 | Health care professionals’ responses to needs | Needs | Very | Reasonably | Not at all |
| 4 | Emotional support by health care professionals | Emotional support | Very well | Adequately | Inadequately |
| 5 | The mother’s feelings of safety | Safety | Very | Reasonably | Not enough |
| 6 | The mother’s concerns about the child’s condition | Concerns | No | Some | Many |
| 7 | Experienced duration until first contact with the child | First contact | Not long | Quite long | Very long |
The seven items were derived from a mixed-methods study that included the views of pregnant women, women who had recently given birth, and professionals, analyzing which aspects of labor and delivery are most important for a mother’s overall experience of labor and birth. The items were further evaluated in eight verbal probe interviews with women who had given birth in the past year. On the basis of these interviews, we concluded that the seven domains of LADY-X are clear, distinctive, relevant, complete, and applicable to all types of birth (place of birth, cesarean section, occurrence of complications, etc.). Thus, we concluded that LADY-X has good content validity. In parallel with this study, which was intended to estimate preference weights for LADY-X, a clinimetric evaluation of LADY-X was performed, which showed good reliability and construct validity. Results of the validation study will be presented elsewhere. An English version of LADY-X based on a forward-backward translation process is presented in the Appendix in Supplemental Materials found at doi:10.1016/j.jval.2015.07.005. The English version is yet to be evaluated in terms of validity and reliability.
Discrete Choice Design
The assumption in the DCE is that the object (e.g., a health care intervention or program) that is being valued is defined by a number of characteristics (attributes) and levels that are assigned to these attributes, as is the case for classification systems. In the present study, the object of interest was the birth experience of a mother, the seven items of LADY-X were the attributes that define the birth experience, and the three response categories of each item (e.g., “very well,” “adequately,” and “inadequately”) formed the attribute levels (see Table 1). The relative importance of these seven attributes was assessed by presenting respondents with a series of choice sets consisting of two hypothetical scenarios with varying combinations of attribute levels (see Figure 1 for an example of a choice set). For each choice set, respondents were asked to indicate their preferred birth experience scenario.
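To give a sense of scale, seven attributes with three levels each define 3^7 = 2187 possible birth experience scenarios, from which pairs are drawn to form choice sets. The sketch below enumerates the full factorial and pairs scenarios at random; random pairing is an assumption made here for illustration only, because the experimental design actually used to construct the choice sets is not described in this section.

```python
import itertools
import random

N_ATTRIBUTES = 7
LEVELS = (1, 2, 3)

# Full factorial: every combination of levels across the seven attributes.
profiles = list(itertools.product(LEVELS, repeat=N_ATTRIBUTES))


def draw_choice_sets(n_sets, seed=0):
    """Return n_sets pairs of distinct scenarios (random pairing, illustrative)."""
    rng = random.Random(seed)
    return [tuple(rng.sample(profiles, 2)) for _ in range(n_sets)]


choice_sets = draw_choice_sets(12)
print(len(profiles), choice_sets[0])
```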
The study population consisted of 139 participants. To ensure a variety of experienced health states in the study population, participants were recruited from three patient groups (those experiencing somatic complaints with a known cause [atherosclerosis or venous insufficiency], somatic complaints without a known cause [tinnitus], and psychological complaints [anxious or depressed]) and from a population-based sample. All participants were 18 years or older. Exclusion criteria were not being able to read and write in Dutch or not being able to handle the electronic ESM device because of impaired motor skills (for more details, see Appendix A in Supplemental Materials found at doi:10.1016/j.jval.2014.10.003).
ESM using the Maastricht routine 
The ESM consists of a beep questionnaire that participants are required to fill out at several unpredictable moments during the day, in addition to questions in the morning, on waking, and in the evening, when going to sleep. The validity and reliability of the Maastricht routine have been documented elsewhere. In this study, we used the PsyMate, a small user-friendly device programmed to generate beeps (and vibrations) 10 times a day between 07.30 h and 22.30 h randomly in 1½-hour intervals. At every beep, the PsyMate presents the questions and records the responses using a touchscreen keyboard. The beep questionnaire (see Appendix B in Supplemental Materials found at doi:10.1016/j.jval.2014.10.003) consists of items on feelings, physical symptoms, context (location, interaction, activities), and overall HRQOL. For the items on feelings (six for positive affect [PA] and five for negative affect [NA]) and PS (four items), a seven-point Likert scale was used. The contextual items had predetermined answering categories. To obtain a valuation of momentary HRQOL, a VAS anchored in the same way as the EQ-VAS (0 being the worst imaginable health state and 100 being the best imaginable health state) was included. A detailed description can be found in Appendix A.
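One way to read the sampling scheme above is that the 15 hours between 07.30 h and 22.30 h are split into ten 1½-hour blocks, with one beep at a random moment in each block. The sketch below implements that reading. It is an assumption made for illustration and is not a description of the PsyMate’s actual scheduling algorithm; the function name and parameters are hypothetical.

```python
import random
from datetime import datetime, timedelta


def beep_schedule(day, n_beeps=10, start="07:30", end="22:30", seed=None):
    """One random beep per equal-length block between start and end."""
    rng = random.Random(seed)
    t0 = datetime.strptime(f"{day} {start}", "%Y-%m-%d %H:%M")
    t1 = datetime.strptime(f"{day} {end}", "%Y-%m-%d %H:%M")
    block = (t1 - t0) / n_beeps  # 90 minutes with the default settings
    beeps = []
    for i in range(n_beeps):
        offset = timedelta(seconds=rng.random() * block.total_seconds())
        beeps.append(t0 + i * block + offset)
    return beeps


schedule = beep_schedule("2014-03-10", seed=7)
for beep in schedule:
    print(beep.strftime("%H:%M"))
```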
A global retrospective valuation of health, or HRQOL, was obtained using the EQ-VAS. The EQ-VAS is part of the EuroQol instrument, and it ranges from 0 (worst imaginable health state) to 100 (best imaginable health state). The EQ-VAS has good reliability.
Anxiety and depression were measured with the Hospital Anxiety and Depression Scale (HADS), which contains 14 items and has good reliability and validity. Each item on the questionnaire is scored on a scale of 0 to 3, with 3 indicating higher symptom frequencies. In addition, data on personal characteristics were collected.
The study consisted of three phases planned individually for each participant. All participants received €25 for their participation.
During the briefing (approximately 3 hours) on the first day, the rationale of the study was explained and instruction on the use of the PsyMate was given. A try-out sampling moment was simulated in which the participants were coached in answering the questions on the PsyMate. After the try-out, baseline global data were collected (the EQ-VAS, the HADS, and personal characteristics).
The ESM period
The ESM period comprised 6 days, starting the day after the briefing. During this week, the participants were asked to continue their normal life while carrying the PsyMate with them.
On the eighth day, the participants returned for a debriefing session. The ESM period was reviewed by means of a questionnaire. Participants had to answer whether the PsyMate had influenced their mood, activities, thoughts, or contacts with other people and whether they had been annoyed by the beeps. Furthermore, participants were asked whether the ESM week had been a typical week, whether any unusual incidents had occurred, whether items were unclear, and whether the questions allowed them to give a good representation of their experiences during the day. The EQ-VAS and the HADS were readministered.
CTT, RMT, and IRT: Comparison of evaluations.

Columns: psychometric property; CTT evaluation; IRT evaluation; RMT evaluation*.

Acceptability
CTT: The percentage of missing data for each item and the percentage of people for whom a PRO instrument score can be computed.
IRT: There are no formal IRT analyses for this property of a PRO instrument.
RMT: There are no formal RMT analyses for this property of a PRO instrument.

Targeting of the items
CTT: PRO instrument scores should span the entire range; floor (proportion of the sample at the minimum score) and ceiling (proportion of the sample at the maximum score) effects should be low.
IRT: The PRO items should provide information across the full range of the population for which the instrument is intended.
RMT: The relative distributions of item locations and person estimates (statistical indicators) are examined statistically and graphically.

Scaling assumptions
CTT: Summing item scores is considered legitimate when the items:
▪ Are approximately parallel (i.e., they measure at the same point)
▪ Contribute similarly to the variation of the total score (i.e., similar variances); otherwise, these should be standardized
▪ Measure a common underlying construct
▪ Contain a similar proportion of information concerning the construct being measured
IRT: There are no formal IRT analyses for this property of a PRO instrument.
RMT: There are no formal RMT analyses for this property of a PRO instrument.

Suitability of the response options
CTT: There are no formal traditional analyses for this property of a PRO instrument, although the patterns of item endorsement frequencies can be examined.
IRT: Each response option should provide information within the range of the population for which it is intended. Each response option should be distinct and should have a range along the scale within which it is the most likely response choice.
RMT: The examination of category probability curves shows the ordering of the thresholds for each item. A threshold marks the location on the latent continuum where two adjacent response categories are equally likely. The ordering of the thresholds should reflect the intended order of the categories.

Validity
CTT: The validity of the scale is evaluated using inter-item correlations and item-to-total correlations to gauge the strength of the relationships among the items and the appropriateness of scoring them together on one scale, as well as evidence for local dependencies.
IRT: Broad internal validity indicators. Fit residuals (statistical) summarize the difference between observed and expected responses to an item across all people (item–person interaction). Item characteristic curves display graphically the expected responses for each item across the continuum (the curve). Local independence: IRT analyses also consider the item scoring bias resulting from similar items being included in the same instrument.
RMT: Broad internal validity indicators. Fit residuals (statistical) summarize the difference between observed and expected responses to an item across all people (item–person interaction). Chi-square values (statistical) summarize the difference between observed and expected responses to an item for groups (known as class intervals) of people with relatively similar levels of ability (item–trait interaction). Item characteristic curves display graphically the expected responses for each item across the continuum (the curve), and the mean observed scores for each group of person scores (class intervals) can be plotted against the item characteristic curves. Local independence: RMT analyses also consider item scoring bias, which is the extent to which items are locally independent, i.e., individual items are not biased by each other.

Reliability
CTT: Commonly assessed using Cronbach’s alpha coefficient and item internal consistency indicators, including item-total correlations.
IRT: Assessed using the information curve, which is analogous to Cronbach’s alpha being calculated separately at each score along the range of the scale. This reflects that an instrument’s reliability may change depending on the level of the underlying condition being measured.
RMT: Examined using the Person Separation Index, which is analogous to Cronbach’s alpha.

CTT, classical test theory; IRT, item response theory; PRO, patient-reported outcome; RMT, Rasch measurement theory.
* Although the general tenets around issues such as fit, dependency, and reliability are effectively consistent across Rasch-based software programs, the broad descriptions here are based on analyses and outputs generated through RUMM 2030 and cannot be considered exactly the same as those of other programs.
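The CTT indicators listed in the table (item-level missing data, floor and ceiling effects, and Cronbach’s alpha) can be illustrated with a minimal sketch. The function names, the data layout (one list of item scores per respondent, with `None` marking a missing answer), and the use of sample variances are assumptions made for illustration; they are not taken from the analyses described here.

```python
from statistics import variance


def percent_missing(rows):
    """Percentage of missing answers (None) for each item."""
    n_items = len(rows[0])
    return [
        100.0 * sum(r[i] is None for r in rows) / len(rows)
        for i in range(n_items)
    ]


def floor_ceiling(rows, min_total, max_total):
    """Proportion of complete respondents at the lowest and highest total score."""
    totals = [sum(r) for r in rows]
    n = len(totals)
    floor = sum(t == min_total for t in totals) / n
    ceiling = sum(t == max_total for t in totals) / n
    return floor, ceiling


def cronbach_alpha(rows):
    """Cronbach's alpha from complete cases, using sample variances."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses to a three-item scale scored 1 to 3.
responses = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [2, None, 3]]
complete = [r for r in responses if None not in r]

missing = percent_missing(responses)
floor, ceiling = floor_ceiling(complete, min_total=3, max_total=9)
alpha = cronbach_alpha(complete)
```

In this sketch, respondents with any missing item are dropped before the total score is computed. That is only one of several possible conventions, and an instrument’s own scoring rules should take precedence.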
Whether or not the respondent is a smoker is a highly influential covariate and, hence, an important factor explaining preference heterogeneity among the respondents. Smokers clearly attributed a lower importance to the attributes lifestyle (P < 0.0001) and patient’s age (P = 0.0008), preferred cure to prevention (P < 0.0001), and discounted future health gains to a greater extent (P = 0.04). Body mass index (BMI) also turns out to be an important covariate. The higher the BMI score, the less a respondent takes into account the patient’s age (P < 0.0032), severity of illness (P = 0.0006), and the disease’s link with lifestyle (P = 0.04), and the more he or she prefers cure to prevention (P = 0.0034) and discounts future health gains (P = 0.008). To visualize the differential valuation by respondents with a “healthy” lifestyle and respondents with an “unhealthy” lifestyle, we partitioned the respondents into two groups: one group (in total 38% of the sample) containing smokers as well as respondents with a BMI exceeding 30 (i.e., the obesity threshold), and one group containing nonsmokers with a BMI lower than 30. The “unhealthy” lifestyle group preferred cure to prevention (P < 0.0001), attributed a lower weight to lifestyle (P < 0.0001) and patient’s age (P < 0.0001) (as illustrated in Fig. 4B), and, remarkably, attached more importance to the risk of adverse effects (P = 0.02).
The objective of our study was to investigate on which basis the Belgian population wants to set health care priorities. Although characteristics of the intervention (effectiveness and risk of adverse effects) and of the illness (severity of illness and time span) were found to matter, it was mainly the characteristics of the recipient that drove respondents’ preferences. Priority was given to younger patients and to those who have not somehow caused their own illness. We also detected substantial heterogeneity in the preferences: young, healthy, highly educated, or more health-conscious adults responded in a markedly different way than did older, unhealthy, less well educated, and health-unconscious ones.
Our results confirm studies in other countries indicating that the context shapes the social value of QALYs, and that the general public’s distributive preferences diverge from a simple health maximization approach, as would be prescribed by cost-utility analysis (i.e., minimizing cost/QALY). Many of these studies also observe a public preference for prioritizing younger patients over older ones, and several describe how a substantial number of participants want to account for self-inflicted illness. However, our results seem to diverge from these other studies in the strong impact of the lifestyle attribute and the relatively limited impact of severity of illness on priority setting.
We paid specific attention to the difference between prevention and cure. A few studies in the literature also compared stated preferences for both types of health care. These studies found no preference, a preference for prevention, or a preference for cure. Our sample valued prevention higher than cure only when it is targeted at relatively young age groups and when it protects against more severe illness. However, as the self-inflicted nature of a health condition was a factor of major relevance in our study, our results can indirectly be interpreted as providing further support for prevention in general. An allocation scheme that accounts for individual responsibility would mainly ration curative treatments because accountability for lifestyle is less relevant for (not) providing prevention, especially when it comes to primary prevention. Preventive programs can incentivize, or even enable, citizens to adopt healthy and responsible lifestyles before their lifestyle-associated risk exposure requires cure. Currently, preventive “lifestyle” policies such as alcohol, fat, sugar, or smoking taxes are gaining interest. Such measures, if effective, would increase short-term government income and reduce lifestyle-related morbidity.