Is it time to revisit the role of ultrasound in rheumatoid arthritis management?

For over a decade, a large number of studies have highlighted the benefits of ultrasound (US) in the diagnosis and management of rheumatic diseases, especially rheumatoid arthritis (RA). However, its benefits in routine practice have been less studied and trials examining US as part of various clinical strategies are just emerging, with recent randomised trials examining the added value of US in tight-control paradigms. The conclusions of these trials have raised questions on the role of US in RA management. This Viewpoint analyses the recent studies, and discusses potential limitations in study designs as well as the methodological challenges of assessing the added value of an imaging technique.

For over a decade there has been a rapidly expanding literature on the benefits of ultrasound (US) in rheumatic diseases. US provides accurate detection of both inflammation and damage at the joint level. [1][2][3] This predominantly research-focused literature has led to an exponential increase in the use of US in routine practice around the world, for diagnosis, management and guiding therapeutic decisions.
However, as with the introduction of any new medical technology or test, we still lack knowledge on how best to use US in routine care. [4] This was highlighted in the European League Against Rheumatism recommendations on the use of imaging in rheumatoid arthritis (RA) clinical practice, which demonstrated the gaps in the current knowledge base and the need for strategy trials. [3] In that context, it is good to see such trials emerging, with two recent studies focusing on the role of US in RA management: the 'Targeting Synovitis in Early Rheumatoid Arthritis' (TaSER) and the 'Aiming for Remission in rheumatoid arthritis: a randomised trial examining the benefit of ultrasound in a Clinical TIght Control regimen' (ARCTIC) multicentre studies. [5][6] Both studies looked at aspects of US imaging within the context of a modern tight-control treatment paradigm (ie, the benefit of US over conventional approaches for achieving clinical remission). In the TaSER trial, the authors tested the hypothesis that adding US disease activity assessment to a treat-to-target (T2T) strategy in patients with early inflammatory arthritis would produce superior clinical and imaging outcomes compared with a strategy driven by a clinical disease activity score (DAS) assessment. In the ARCTIC trial, the authors tested the effect of applying US versus not applying US in a T2T regimen in patients with early RA, with outcomes of achieving clinical remission and non-progression of structural damage.
The broad conclusion of these authors was that US does not add to clinical management of early RA. But are these studies sufficient to definitively inform our practice in this clinical context?

STUDY DESIGN ISSUES
When evaluating the role of a new technique, there are a number of potential study designs to consider. [7] However, 'the difference between the two groups in a randomized clinical trial evaluating diagnostic tests is completely explained by the group of patients that would have had discordant test results, if they had undergone both tests'. [8] Clinical practice reflects this: it is common to order an US where the clinician perceives a difference between their clinical assessment and a DAS. [9][10] In both ARCTIC and TaSER, patients had either clinical examination alone or clinical examination with US; we do not know how many patients in the clinical examination arm would have had discordant findings on US. A study design in which only such discordant patients were randomised to therapeutic changes would better evaluate the added value of the imaging test under evaluation.
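The quoted argument can be made concrete with a short calculation. The sketch below is illustrative only: the function name, the 20% discordance rate and the remission rates are hypothetical values chosen for arithmetic convenience, not data from TaSER or ARCTIC. It assumes that patients whose US findings agree with the clinical assessment receive the same treatment decision in both arms.

```python
def expected_arm_difference(p_discordant, outcome_if_acted_on, outcome_if_missed):
    """Expected between-arm difference in an outcome rate (hypothetical helper).

    Concordant patients get the same decision in both arms and contribute
    nothing to the difference, so any benefit confined to the discordant
    subgroup is diluted by that subgroup's share of the trial population.
    """
    if not 0.0 <= p_discordant <= 1.0:
        raise ValueError("p_discordant must be between 0 and 1")
    return p_discordant * (outcome_if_acted_on - outcome_if_missed)


# Hypothetical inputs: 20% of patients discordant; within that subgroup,
# remission in 60% when the US finding is acted on versus 40% when it is not.
diff_all_comers = expected_arm_difference(0.20, 0.60, 0.40)
# Same assumed effect in a trial that randomises discordant patients only.
diff_discordant_only = expected_arm_difference(1.00, 0.60, 0.40)

print(round(diff_all_comers, 3), round(diff_discordant_only, 3))
```

Under these assumed numbers, a 20-percentage-point benefit confined to discordant patients would appear as a difference of about 4 percentage points in an all-comers comparison, which is one reason a trial restricted to discordant patients could be more informative.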
The examination of study designs is made more complex by differing escalation rules. In TaSER, treatment was escalated without US assessment in patients with high disease activity (DAS28 >5.1), or with moderate disease activity (3.2<DAS28-erythrocyte sedimentation rate (ESR)<5.1) and more than two swollen joints. In ARCTIC the target was remission defined as DAS <1.6, plus the following criteria, which differed between the two treatment arms: no swollen joints in the clinical arm; no swollen joints and no joints with power Doppler (PD) signal in the US arm. In addition, ARCTIC reports a complex decision algorithm, with 'no response' (requiring escalation) defined for current DAS <2.4 as change of DAS <0.6 or <10% decrease of US total score (using a combined grey scale (GS) and PD score of 0-192), or for DAS >2.4 as change of DAS <1.2 or <20% decrease of US total score. It is not clear how treatment decisions were made in the discrepant group where DAS was reduced but the US score was increased. We applaud the researchers for emulating clinical practice where US is used to complement clinical assessment. However, the fact that in TaSER the ultrasonographer was also the clinician, and thus aware of the clinical assessments, could also have influenced (biased) the US assessment itself. On the other hand, in ARCTIC the clinician making treatment decisions was aware of both clinical examination and US findings; it is not clear how this may have influenced assessments. In ARCTIC 19% of decisions reportedly deviated from the protocol, but no further data are provided; this aspect is not reported for TaSER.
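To make the contrast between the two rule sets easier to follow, the rules as summarised above can be written out as a minimal sketch. This is our paraphrase of the text, not the trial protocols: the function and parameter names are hypothetical, the default thresholds in the last function are those quoted for the lower DAS band, and the 'either criterion fails' reading of the no-response rule is an assumption, since the handling of discrepant cases is not clear from the reports.

```python
def taser_escalates_without_us(das28_esr: float, swollen_joints: int) -> bool:
    """TaSER, as summarised in the text: escalation that bypasses US assessment."""
    high = das28_esr > 5.1
    moderate = 3.2 < das28_esr < 5.1  # bounds copied literally from the text
    return high or (moderate and swollen_joints > 2)


def arctic_target_met(das: float, swollen_joints: int, pd_joints: int, arm: str) -> bool:
    """ARCTIC target: DAS < 1.6 plus arm-specific joint criteria."""
    if arm not in ("clinical", "us"):
        raise ValueError("arm must be 'clinical' or 'us'")
    clinical_ok = das < 1.6 and swollen_joints == 0
    if arm == "clinical":
        return clinical_ok
    # US arm additionally requires no joints with power Doppler signal.
    return clinical_ok and pd_joints == 0


def arctic_no_response_literal(das_decrease: float, us_pct_decrease: float,
                               min_das_decrease: float = 0.6,
                               min_us_pct_decrease: float = 10.0) -> bool:
    """Assumed literal reading: no response if EITHER criterion is not met.

    das_decrease is previous DAS minus current DAS (positive = improvement);
    us_pct_decrease is negative when the total US score has increased.
    """
    return das_decrease < min_das_decrease or us_pct_decrease < min_us_pct_decrease


# Same hypothetical patient, judged against each arm's target.
print(arctic_target_met(1.2, 0, 1, "clinical"), arctic_target_met(1.2, 0, 1, "us"))

# Discrepant case: DAS improved by 1.0 but the total US score rose by 5%.
print(arctic_no_response_literal(das_decrease=1.0, us_pct_decrease=-5.0))
```

Under this literal reading the discrepant patient would be escalated despite a clinical response, whereas a reading that requires both criteria to fail would not escalate; the published summaries do not settle which applied.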
The study end points differed between the studies. In TaSER, the co-primary outcomes were the mean change in DAS44 and Rheumatoid Arthritis Magnetic Resonance Imaging Scoring (RAMRIS) erosions between baseline and 18 months. While it is understandable to choose a clinical inflammation outcome, in the absence of better measures, it must be noted that DAS28 (upon which T2T escalation rules were based) and DAS44 are not totally independent measures. DAS44 remission did favour the US group after 18 months. That there was little change in MRI erosions is not surprising, and the study may have been underpowered to detect RAMRIS changes based on data only recently emerging from early RA trials. [11][12] However, numerical changes in erosion progression were smaller in the US-examined group. In ARCTIC the primary end point was the proportion of patients with a combination between 16 months and 24 months of: (1) DAS clinical remission, (2) no swollen joints and (3) non-progression of radiographic joint damage. Again, DAS was used both for the escalation rules and for the primary outcome measure. In terms of objective structural outcomes, although the median change in total van der Heijde modified Sharp Score over 24 months was low, with no statistically significant differences between the two strategies in progression above a predefined cut-point, there was a borderline statistically significant difference in the 24-month change in radiographic joint damage between the groups, favouring the US tight-control strategy. So in both studies we see a trend to less joint damage with the US T2T approach. If no structural damage progression is the main goal of a T2T approach, then US seems to ensure a better outcome.

US ISSUES
How many joints should we evaluate in the context of an RA management strategy? There have been many studies looking at the responsiveness of various US joint scores in evaluating treatment response at the group level. Such data may inform which joints should be included in trials involving direction of treatment change. In this context, the ARCTIC study evaluated more joints than TaSER (32 vs 14). However, which US findings are most important for decision making? The studies used different US findings to assign treatment escalation. TaSER defined a PD signal ≥1 in at least two joints as a 'positive' finding. The importance of joints with, for example, no PD signal but severe GS (ie, severity grade ≥3) would not be taken into account. [13][14][15] ARCTIC used US decision rules based on <10% or <20% reduction in a total US (GS+PD) score, which the authors state corresponded to DAS changes of 0.6 and 1.2; this might suggest that the imaging changes were somehow linked to DAS scores and not used fully independently. Incorporating PD into decision rules also raises the issue of the quality of the US machine employed (in terms of its ability to detect PD signal, a capability that has been steadily improving in machines over the last decade), especially when the Doppler signal is the main parameter for decision making. [16] These technical issues, including the technical capability of the sonographers involved, were not extensively reported in the papers.
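The consequence of a PD-only rule can be illustrated with a small sketch. The joint grades below are invented for illustration and the function names are ours; the code assumes the usual semiquantitative grading of 0-3 per joint for each of GS and PD, which is consistent with the 0-192 range quoted for a 32-joint combined score but is not stated explicitly in the text above.

```python
def taser_us_positive(pd_grades):
    """'Positive' US finding as described in the text: PD grade >= 1 in at least two joints."""
    return sum(1 for grade in pd_grades if grade >= 1) >= 2


def severe_gs_without_pd(gs_grades, pd_grades, severe=3):
    """Count joints with severe grey-scale synovitis but no PD signal."""
    return sum(1 for gs, pd in zip(gs_grades, pd_grades) if gs >= severe and pd == 0)


def pct_decrease(previous_total, current_total):
    """Percentage decrease in a total (GS+PD) score; negative if the score rose."""
    if previous_total <= 0:
        raise ValueError("previous_total must be positive")
    return 100.0 * (previous_total - current_total) / previous_total


# Hypothetical 14-joint examination: two joints with grade 3 GS and no PD,
# and a single joint with grade 1 PD.
gs = [3, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
pd = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

print(taser_us_positive(pd))         # PD present in one joint only
print(severe_gs_without_pd(gs, pd))  # joints the PD-only rule does not see
print(32 * (3 + 3))                  # maximum combined score under the 0-3 assumption
print(pct_decrease(40, 38))          # a fall from 40 to 38 points
```

In this invented example the patient would not count as US 'positive' despite two joints with severe GS synovitis, and a two-point fall from a total score of 40 would fall short of a 10% reduction.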

CONCLUSIONS
What is clear from these recent studies is that they raise significant questions about how to conduct studies to examine the potential benefits of new instruments in rheumatology. Both TaSER and ARCTIC were conducted in very early RA populations, and the generalisability to all RA populations is unknown. The choice of an open-label approach and the lack of blinding for the US evaluations may represent a limitation of both studies. Nevertheless, it may be that US adds little to tight control in such populations, and it is possible that there is a 'floor' effect related to our existing therapies in such patients. From a clinical point of view, this means that even if the US information is accurate, the potential benefits may not be achieved because of the limitations of the therapies. And of course, the recent studies do not address the roles of US outside of RA management or the lack of information on the real implementation of a T2T approach in clinical practice.
We should be careful about throwing the baby out with the bathwater. The evidence from the recent trials should be considered within the context of the methodological issues raised here. The take-home message is that we still need robust evaluation of the usefulness of US in RA clinical practice, and precise guidance on how to report US studies in rheumatology.