Journal Peer Review and Editorial Evaluation: Cautious Innovator or Sleepy Giant?

Peer review of journal submissions has become one of the most important pillars of quality management in academic publishing. Because of growing concerns with the quality and effectiveness of the system, a host of enthusiastic innovators has proposed and experimented with new procedures and technologies. However, little is known about whether these innovations manage to convince other journal editors. This paper will address open questions regarding the implementation of new review procedures, the occurrence rate of various peer review procedures and their distribution over scientific disciplines or academic publishers, as well as the motivations for editors or publishers to engage in novel review procedures. It shows that in spite of enthusiastic innovation, the adoption of new peer review procedures is in fact very slow, with the exception of text similarity scanners. For now, peer review innovations appear to be restricted to specific niches in academic publishing. Analysing these niches, the article concludes with a reflection on the circumstances in which innovations might be more widely implemented.


Introduction
Peer review of journal submissions plays a key role in the work of virtually all academics and researchers. Reviewer comments become conditions for publication as papers are "put through peer review". The expression is used as if peer review is a singular system, but in fact journal peer review and editorial evaluation are now more diverse than ever. A host of enthusiastic innovators has proposed and experimented with new procedures and technologies, but little is known about whether these innovations manage to convince other journal editors. The innovations are advocated with a wide range of concerns over peer review and editorial evaluation, ranging from its fairness, its ability to assess the solidity of statistics or text recycling, to its transparency or cost. In response to such concerns, editorial innovations now include procedures such as open peer review, registered reports (reviewing research protocols rather than results), various scanners (including statistics and text similarity scanners), or external review platforms that offer outsourced peer review. In this sense, traditional peer review has become enveloped in editorial procedures that employ software support tools to guarantee the quality or integrity of publications, at least in some parts of the research publication system. Some functions previously attributed to peer review are now covered, or at least supported, by software tools or organisational innovations in editorial management. It is this wider set of editorial assessment practices (in which peer review still holds a central position) that constitutes the topic of our analysis. Table 1 outlines some of the major innovations in peer review and editorial procedures over the past decades.
While the innovations have been published and are often passionately defended by their proponents, there is little research on the actual distribution of the current diversity of editorial procedures. Available research has focused on a single review aspect, for instance, single-blind vs. double-blind (Pontille and Torny 2014); or on journals in a specific research area; or on a single publisher (Taylor & Francis 2015). In addition, these studies commonly focus on perceptions of review procedures. This leaves open questions regarding the implementation of new editorial and review procedures, as well as what motivates editors or publishers to engage in or reject novel procedures. This paper will address these questions based on a survey among journal editors of 361 research journals, covering a wide range of research fields. Using the results of the survey, we aim to answer two main questions. First, based on an inventory of different editorial and peer review models (Horbach and Halffman 2018b), we aim to assess how often these models are employed and how their usage is distributed over different academic disciplines and publishers. Second, we set out to elucidate the process of editorial innovation, asking why some innovations are implemented faster and more widely than others. In our article, we focus on peer review of journal submissions (as opposed to review of grant applications or other types of review). In this, we will distinguish between the editorial process, encompassing the entire workflow from submission to decisions on acceptance or rejection; and the peer review process, referring to the actual intellectual work done by peers in evaluating a manuscript. For example, the usage of text similarity scanners is part of the editorial process but complements the actual peer review process.
Our analysis shows that, in spite of enthusiastic innovation, the adoption of new editorial procedures is in fact very slow, with the exception of text similarity scanners. For now, innovations appear to be restricted to specific niches in academic publishing, despite the ardent commitment of their proponents.
After describing peer review and the editorial process from the perspective of innovations as a contentious set of procedures, we describe the method used to gather data. Our findings are organised as follows: after an overview of various editorial procedures' occurrence rate, we analyse their distribution over research fields, publishers, and changes over time. Based on qualitative responses, we can describe the conditions for innovation, as reported by the editors in our sample. After a general overview, we return to specific innovation niches in which editorial procedures are changing more quickly or drastically, in order to analyse the conditions for change. This leads to conclusions about the distribution of editorial procedures, the pace of innovation in the system, as well as some potential factors stimulating or triggering innovation in academic publishing.

Innovation in a Contentious Set of Procedures
Practices to evaluate the quality of articles submitted to scholarly journals have always been varied and contentious. Even though peer review is nowadays often presented as a universal gold standard guaranteeing the epistemic reliability of published work, the widespread use of referees by research journals is in fact relatively new (Fyfe et al. 2015, 2019; Baldwin 2015; Csiszar 2018). Systematic use of referees to evaluate submitted work was only introduced in learned societies in the early 19th century. Deep into the 20th century, the use of referees by journal editors was still considered optional (Fyfe et al. 2015, 2019; Csiszar 2016). The term "peer review" itself is a neologism that only became common in the 1970s (Baldwin 2015, 2018). The general use of peer review to evaluate knowledge claims became prominent in the specific context of US grant proposal evaluation in the late 1970s (Baldwin 2018; LaFollette 1992) and only later became common practice beyond the realm of natural and medical sciences. Throughout this history, refereeing has been used in various ways and for various purposes, ranging from state censorship, allocation of printing resources, fraud detection and quality improvement, to the protection of reputations (Biagioli 2002; LaFollette 1992). As Moxham and Fyfe put it: "(…) the relative durability of refereeing as a practice should not be mistaken for simple continuity of purpose or of meaning. What it was meant to accomplish, whom it was intended to benefit, and the perception of its virtues and defects varied considerably with time and place." (Moxham and Fyfe 2017: 888)
With the gradual uptake of review practices, the expectations of the system have been in constant flux and still remain controversial today (Horbach and Halffman 2018b). Two expectations meet with more or less general agreement. First, peer review is currently generally expected to act as a filter, distinguishing 'good' from 'bad' science. Second, it is widely expected to improve manuscripts: through their comments and feedback, reviewers are expected to assist authors in improving their manuscript's quality (Zuckerman and Merton 1971). However, other expectations are less universal and the understanding of 'quality' may vary considerably. Some actors go further, including expectations about fraud detection, the creation of fair and equal opportunities for all authors, and the establishment of a hierarchy of published results (Bohlin 2004). These expectations are disputed by other actors, mainly publishers, arguing that 'peer review was never designed nor intended to do so.' With the editorial process now encompassing much more than just the act of peer reviewing a manuscript, not least because of the introduction of several automated scanners, debates about the role of 'peer review' should now be reformulated as debates about the role of the editorial process in a broader sense.
In recent years, several additional challenges and concerns surrounding the editorial system have emerged. These include the discussions on open access publishing, potentially drastically impacting on publishers' business models; and other open science initiatives, demanding more transparency from both authors and journals (cOAlition S 2018). Also, the rise of alternative publishing formats, such as preprint servers that no longer require the direct involvement of publishers (Walker and Rocha da Silva 2015), raises further questions about the role of publishers. The involvement of wider communities, outside of invited reviewers, additionally challenges the notion of a 'peer' and its relevant expertise: who is sufficiently capable of assessing the quality and merit of a manuscript? And what is the role of metrics, including altmetrics, in assessing manuscripts? (Thelwall et al. 2013)
The differing expectations of the review and editorial system have led to a range of novel review procedures. For instance, the expectation to detect fraud has triggered the development of text and image similarity scanners (Ercegovac and Richardson 2004). An expectation of more extensive reviewing has led to calls for the involvement of a wider community through post-publication review (Knoepfler 2015), and even of commercial review platforms (Research Square 2017). Furthermore, changing selection criteria, interaction between stakeholders and cooperation between journals have aimed to foster research integrity through review procedures, rather than detecting integrity breaches after they occur (Hames 2014). Open review, including sharing review reports, as well as its radical opposite, the double-blind review system, have emerged out of questions of fairness in review (Rojas 2007; Pontille and Torny 2014; Okike et al. 2016).
These innovations have also been fuelled by concerns over peer review and the publication system in general. An extensive literature of editorials, letters and research articles has emerged claiming that peer review is essentially flawed (Smith 2006), that it is inconsistent (Peters and Ceci 1982), slow (Nguyen et al. 2015), and ineffective (Smith 2006; Lee et al. 2013). Moreover, it has been characterised as an 'empty gun' (Smith 2010), 'with no evidence that it works and no established way to provide [reviewer] training' (Patel 2014). More recently, claims have emerged about peer review as 'old fashioned and outdated' (Tennant et al. 2018), arguing that the peer review system is not keeping pace with rapid changes in (scientific) communication and research practices.
Given these growing concerns about peer review and editorial practices, we might expect these practices to be in considerable flux and innovations to spread quickly throughout the publishing system. We described and systematised the emergence of novel editorial procedures elsewhere (Horbach and Halffman 2018b), but here, we want to investigate the spread and implementation of these innovations among journals. We will understand 'implementation of innovations' as 'changes in the review procedures', which also allows for the re-implementation of more traditional models of peer review.
The spread and implementation of innovations has previously been widely studied by scholars from sociology, organisation studies, and science and technology studies, to name just a few (e.g. Wejnert 2002; Franke and Shah 2003; Peres et al. 2010). The spread of innovations in most of the traditional literature has been described as the 'diffusion of innovation'. However, this term is slightly misleading: the adaptation or implementation of an innovative technique, process, mechanism or practice requires active agents. Adoption is the result of concrete actions, rather than of some naturally occurring phenomenon, as is the case with diffusing chemicals (MacKenzie and Wajcman 1999; von Hippel 1976; Oudshoorn and Pinch 2007).
Accordingly, several scholars, most notably in the 'social construction of technology' tradition (Bijker et al. 1987), have outlined the various characteristics that describe the process in which innovations travel. First, innovations are picked up by active seekers of knowledge, in an active process of adoption (Greenhalgh et al. 2004). They do not travel driven by their own force and are not taken up by a passive set of recipients. One of the consequences is that strong communication networks in particular foster the spread of innovations. Second, users predominantly search for innovations when they experience a problem or opportunity to improve their practices (Wisdom et al. 2014). Third, users domesticate innovations, to make them correspond to their specific conditions and needs (Oudshoorn and Pinch 2007; von Hippel 1976). Fourth, users' expectations of new technologies play a significant role in their willingness to implement such innovations (Brown and Michael 2003; Verbong et al. 2008; Van Lente 1993). Especially given the contentious nature of peer review procedures and the contrasting expectations of the system, these expectations might be prominent factors in the implementation of review innovations. Indeed, what constitutes 'good' review varies among scientists and editors (Taylor & Francis 2015), thereby potentially leading to high diversity in review procedures, but also to obstacles for innovation as it leaves the requirements for novel procedures contentious (Van Lente 1993).
Academic publishing meets at least some of these conditions: increasing and shifting expectations of the system are explicitly voiced, related (social) networks exert pressure to adopt changes, well-developed communication capacities facilitate knowledge exchange, and there is sufficient financial clout (Larivière et al. 2015). In the remainder of this article we will present empirical data to demonstrate that, despite these expectations, the editorial process is a fairly stable system, in which change or adoption of novel procedures is actually rare. Even though the system may seem highly innovative in particular niches of academic publishing, it appears rather constant when looking at academic publishing in general.

Methods
Detailed information on editorial procedures used by journals is surprisingly hard to find. While some journal 'instructions for authors' indicate procedures for blinding author identities or reviewer selection, most journals do not explain the details of their editorial procedures. For our study, we were obliged to gather data through an online survey among editors, which was distributed in the context of a previous study on the effectiveness of peer review forms in preventing article retractions. We will briefly describe our methodology here and refer to the previous publication (Horbach and Halffman 2018a) for further details. Gathering information about peer review through a questionnaire has the advantage of a wide coverage and practicality of data collection, but also a drawback: procedures reported by the editors are not necessarily a perfect reflection of actual practices, which could vary between editors or might present a polished account. (A more detailed account of editorial practices will be the subject of a follow-up study.)

Editorial Procedure Questionnaire
Information about journal editorial procedures was gathered through a short questionnaire among journal editors. The questionnaire consisted of 12 questions, each articulating a specific attribute of editorial practices, based on a classification of editorial procedures by Horbach and Halffman (2018b). Respondents were asked to match their procedures for each of the editorial attributes with the options outlined in the table. (The full questionnaire can be found in Appendix A, electronic supplementary material.) In addition, we asked editors to indicate whether, when and how any of these attributes have been modified since 2000, which allows us to trace innovations in editorial procedures since the beginning of this century.

The Sample
The questionnaire was distributed via email to journal editors. Two strategies were used to gather email addresses. First, we used articles indexed in Web of Science (WoS) as 'editorial material' and extracted the email address of the corresponding author, on the assumption that authors of 'editorial material' would very likely be editors. 58,763 unique email addresses of editors from 6,245 different journals were collected in this way (about a fifth of all journals listed in WoS). Because of our initial interest in retracted journal articles, we manually amended the list with email addresses from editors of journals issuing at least ten retractions, according to the RetractionWatch database of retracted journal articles. This yielded a total of 6,331 different journals.
After distributing our questionnaire to these journals and sending out two reminders, we eventually obtained a total of 361 useful responses. The final response rate of 6.12% is low, but comparable to, or even higher than, response rates of similar online surveys among journal editors or authors regarding issues related to academic integrity (Hopp and Hoover 2017; Stitzel et al. 2018). Nevertheless, our sample covers a wide range of research fields and reflects the distribution of journals over fields. For a detailed overview of our sample, we refer to Horbach and Halffman (2018a).

Analysis
To analyse the distribution of editorial procedures across journals, we used information about the journals' academic disciplines and publishers. For the information regarding publishers we used the available data in the Web of Science database. Here, we distinguished between the five largest publishers (Elsevier, Springer, Wiley, Taylor & Francis, and Sage) and all other, smaller, publishers (Larivière et al. 2015). When analysing the distribution of editorial procedures over scientific disciplines, we made use of the categorisation of research disciplines in the Leiden  Table 2 lists all editorial procedures and attributes studied in our research. Appendix B (electronic supplementary material) also presents an overview of the current implementation of all 12 editorial attributes according to our data.

Current Implementation of Editorial Procedures
The table in Appendix B demonstrates some clear differences in the uptake of editorial procedures, especially indicating that several 'traditional' review procedures are still ubiquitous, such as selection of reviewers by editors (97%), keeping reviewer identities anonymous (94%), or pre-publication review (97%). In contrast, some more recent or innovative procedures are virtually absent, including review in which reviewer identities are made public (2%), review by commercial platforms (1%) and post-publication review (2%).
Even though some editorial procedures are ubiquitous there is no 'standard' model for peer review, nor even a limited set of standard models. The core set of review procedures, used in combination by 75% of all journals, consists of five principles: (i) pre-publication review, (ii) using methodological rigour and correctness as selection criteria, (iii) performed by external reviewers suggested and selected by editor(s), (iv) keeping reviewers anonymous (both to authors and other reviewers, as well as to readers of the published manuscript) and (v) making review reports accessible to authors and editors. However, as soon as we add more characteristics to this set, the commonality between journals quickly drops. Outside of this set, editorial procedures are quite diverse, with journals engaging in review procedures that differ on at least one of the twelve attributes studied. Hence, even though some basic review procedures seem universal, only relatively few journals use all of them and only very few journals perform the editorial process in the exact same way. Given the fact that editorial procedures in a large share of journals are more or less centrally organised through large publishers, this significant heterogeneity in journals' review procedures might be deemed surprising. In the following sections we will look more specifically at the distribution of editorial procedures across scientific disciplines and academic publishers.

Research Disciplines
Peer review is commonly presented as field-specific, with particular procedures that are common for particular research areas. However, our data suggest rather the opposite. For most of the editorial attributes studied, research fields appear strikingly similar. While journals tend to differ in their editorial procedures in subtle ways, when aggregating over disciplines, these variations dissolve. In fact, only two of the twelve attributes display substantial differences between fields: the level of author anonymity and the form of statistics review.

The former, demonstrated in Figure 1, represents a well-known difference between the social sciences and humanities (SSH), on the one hand, and natural and health sciences, on the other. While in SSH journals it is common to blind author identities to reviewers (but not to editors), journals in all other domains more commonly disclose author identities both to editors and reviewers. The biomedical and health sciences demonstrate most diversity with 63% of the journals disclosing author identities, 36% blinding author identities to reviewers only, and 2% blinding author identities both to reviewers and editors. These findings are consistent with a Taylor & Francis survey, in which SSH editors reported having used, at some point in time, double-blind review (86%) and single-blind review (35%), while STM editors reported single-blind review (75%) and double-blind review (42%) (Taylor & Francis 2015). Our overall occurrence rate for double-blind procedures resembles that of the Directory of Open Access Journals (48%), but reliable disciplinary break-downs are not provided there (Directory of Open Access Journals 2018).
The second major difference between scientific domains consists of how they perform statistics reviews (Figure 2). This is perhaps not so surprising given the different importance of statistical analyses in various domains. Most notably, statistical review was deemed to be 'not applicable' to many journals in mathematics and computer sciences, physical sciences and engineering, and social sciences and humanities. In contrast, it is considered relevant for the biomedical and health sciences, as well as for life and earth sciences. In the latter, statistics review is predominantly incorporated in the general review assignment, whereas in the former more than half of the journals report having specialist statistics reviewers to evaluate these aspects of the manuscript.

Publishers
Similar to the distribution of editorial procedures over scientific disciplines, the distribution of procedures over large and small academic publishers is fairly homogeneous. Distinguishing between the five largest publishers (Elsevier, Springer, Wiley, Taylor & Francis, and Sage) and the other, smaller, publishers, we only notice three significant differences in the way their affiliated journals organise their editorial process. Journals affiliated with the large publishers more often communicate author responses to review reports with reviewers (68% vs. 56%), and more often use plagiarism detection software (70% vs. 55%). In contrast, journals affiliated with smaller publishers more often facilitate reader commentary on the journal's webpage (25% vs. 14%).

Fig. 1 Level of author anonymity during review by journals in different research areas (legend: author identities are known to editor and reviewer; author identities are blinded to reviewer but known to editor; author identities are blinded to editor and reviewer)
Most interestingly, however, the distribution of editorial procedures for all other (47) attributes does not differ substantially between the largest and the smaller publishers. While some differences are to be expected if only due to chance, this suggests that the heterogeneity in editorial procedures occurs mainly within publishers, rather than across publishers. Hence this seems to demonstrate that editors of journals at larger publishers are relatively autonomous in their choice of editorial procedures. At least no difference can be spotted between the set of most prominent publishers and smaller publishers, including journals run by scientific communities or university presses.

Changes Over Time
In spite of a constant stream of innovations, editorial procedures in most journals are surprisingly stable. For example, whereas 54.3% of the journals disclosed author identities to reviewers and editors in 2000, 54.6% did so in 2008 and 54.0% in 2018. Similarly, reviewer identities were hidden from authors and other reviewers in 94.7% of the journals in 2000 and in 94.2% of journals in 2018. The vast majority of other procedures studied display very similar patterns.
For all 12 aspects of the editorial process used in our survey, we asked respondents whether any changes had taken place since 2000. Only 169 out of the 361 responding journals (47%) reported at least one such change and only 11 (3%) reported at least three changes. Hence, the majority of the journals do not report any change, suggesting that their editorial procedures have remained fixed since the beginning of this century. In total, 286 changes were reported, an average of 0.8 changes per journal. The majority of alterations in editorial procedures concerned the introduction of digital tools (most notably text similarity scanners), or changes in review criteria (usually becoming stricter), comprising respectively 39% and 16% of all changes.
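The shares quoted in this paragraph follow directly from the reported counts. As a minimal sketch, using only the figures stated in the text (the helper name `share` and the variable names are ours, chosen for illustration), the arithmetic can be re-traced as follows:

```python
# Counts as reported in the text above.
n_journals = 361                # responding journals
journals_with_change = 169      # journals reporting at least one change
journals_with_three_plus = 11   # journals reporting at least three changes
total_changes = 286             # all reported changes


def share(part, whole):
    """Return part/whole as a percentage, rounded to the nearest integer."""
    return round(100 * part / whole)


pct_with_change = share(journals_with_change, n_journals)
pct_three_plus = share(journals_with_three_plus, n_journals)
pct_no_change = share(n_journals - journals_with_change, n_journals)
changes_per_journal = round(total_changes / n_journals, 1)

print(pct_with_change, pct_three_plus, pct_no_change, changes_per_journal)
```

The complement of the first share (journals reporting no change at all) is what supports the statement that the majority of journals report no change.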
Because the number of changes in editorial procedures is so low, hardly any are visible when plotting trends of review procedures for most of the attributes studied. A clear trend since the year 2000 is visible only for the attribute concerning the usage of digital tools (Figure 3). The figure demonstrates that, especially over the last decade, journals are increasingly adopting text similarity scanners, while the share of journals not using any form of digital support is clearly declining.
Drawing on the literature about the spread of innovations, certain factors might be expected to drive the implementation of novel editorial procedures. First, we could expect more innovations in journals with high retraction rates, since innovations tend to appear as ways to tackle specific issues, and peer review is increasingly expected to detect fraudulent or erroneous manuscripts. Second, prominent, or highly established journals might be expected to be drivers of change, as they have more resources available, are more centrally positioned in communication networks and their reputation is at stake. Although heavily criticised, the journal impact factor (JIF) remains one of the few recognised indicators of journal prominence and prestige. Using JIF, one could expect that journals with a higher JIF are more likely to implement novel editorial procedures. However, neither factor, retraction rate or JIF, was significantly correlated with the number of changes in editorial procedures (with r-squared values of 0.003 and 0.00004 respectively). This suggests that neither the number of retractions nor the JIF has a stimulating or a restricting effect on the implementation of innovative review procedures.
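To illustrate what an r-squared value of this kind expresses, the sketch below computes the squared Pearson correlation between a journal-level indicator and the number of reported changes. It assumes a simple bivariate linear association, which the text does not specify, and the function name `r_squared` as well as the example values are hypothetical and are not the survey data.

```python
from math import sqrt


def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    r = cov / sqrt(var_x * var_y)
    return r * r


# Hypothetical illustration values only (NOT the survey data):
# an impact-factor-like indicator and the number of reported changes.
example_jif = [1.2, 3.4, 0.8, 5.1, 2.2, 4.0]
example_changes = [1, 0, 2, 1, 0, 1]
print(r_squared(example_jif, example_changes))
```

A value close to zero, as reported for both retraction rate and JIF, means that the indicator explains virtually none of the variation in the number of changes.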
Plotting the number of changes for which we have information on the exact date of implementation, we conclude that, even though only few changes occur, the rate of implementation is generally increasing (Figure 4). Showing the number of changes due to the implementation of plagiarism detection software, we again conclude that this type of change accounts for the majority of innovations in peer review. It remains to be studied whether the increasing pace of innovation is a general trend, or whether the apparent trend is merely an effect of specific innovations becoming more familiar and more ingrained in communication networks, thereby temporarily lowering the threshold for implementing this specific innovation (Wejnert 2002).
The literature on the spread of innovations distinguishes between smaller or larger collectives of actors implementing an innovation, suggesting that they may be more or less likely to do so in various circumstances. Therefore, we studied the number of changes in editorial procedures in journals of either one of the five largest publishers (Elsevier, Springer, Wiley, Taylor & Francis, and Sage) or any of the other publishers. The results are plotted in Figure 5. The figure shows that large publishers contribute slightly more to the number of implemented changes. However, when correcting for the fact that our sample comprises 198 journals from the large publishers and 162 journals from the smaller publishers, this difference in number of implemented changes becomes negligible. In addition, the trends of implementing changes in both larger and smaller publishers are highly similar, suggesting similar underlying mechanisms.
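The correction for group size described here amounts to comparing changes per journal rather than raw counts. The sketch below uses the group sizes given in the text (198 and 162 journals); the numbers of changes per group are not reported in this section, so the counts used are purely hypothetical placeholders, as is the function name `changes_per_journal`.

```python
def changes_per_journal(n_changes, n_journals):
    """Average number of reported changes per journal in a group."""
    if n_journals <= 0:
        raise ValueError("group must contain at least one journal")
    return n_changes / n_journals


# Group sizes as stated in the text.
n_large, n_small = 198, 162

# HYPOTHETICAL change counts, for illustration only; the split of
# changes between the two publisher groups is not given in the text.
changes_large, changes_small = 110, 90

rate_large = changes_per_journal(changes_large, n_large)
rate_small = changes_per_journal(changes_small, n_small)

# The raw counts differ by 20, but the per-journal rates coincide.
print(changes_large - changes_small, rate_large, rate_small)
```

With these placeholder counts, a visible difference in raw numbers disappears entirely once group size is taken into account, which is the pattern the paragraph describes.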

Some Reflection: Reasons for Change
Even though it was not a prime aim of our study, our data allow us to get some impression of the reasons why journals alter their editorial procedures. Though not directly invited to do so, a substantial share of the respondents reporting on changes in their editorial procedures included information about the reason for the change. Out of the 286 reported changes, 61 (21.3%) were provided with information on the reason for change. Even though these data have to be treated with caution, they show interesting patterns. Most notably, 'the availability of new tools made this possible' was frequently mentioned as a reason to adopt new editorial procedures. It was mentioned in 41% of all cases, not surprisingly, especially when reporting on changes in the use of (text similarity) scanners or support in statistical review. Other reasons frequently presented were the arrival of a new editor-in-chief (15%) or a (new) requirement by the publisher (8%).
Besides these three major reasons, other less frequently occurring motivations for change include 'pressure to increase impact factors', 'increased submission rates' and 'stopped to have access to this service' (e.g. to specialist statistics reviewers). In addition, some journals specifically addressed 'issues with fraud/misconduct' or the intention to 'filter 'bad' science' as reasons to implement different editorial procedures. Notably absent among the list of reported reasons for change was a history of retracted journal articles that 'slipped through' peer review and were later found to be problematic. This suggests that, by and large, opportunities to implement editorial innovations (i.e. the availability of and access to new tools, or the new expertise of a novel editor-in-chief) are the main motivators to change. In contrast, intrinsic arguments to improve peer review's capabilities or performance are seldom given as motivations for change. Even though our data are to be considered rather exploratory, they do suggest a clear pattern and raise several questions for future research.

Innovation Niches
Our analyses of editorial procedures show a very slow implementation rate. When looking at the editorial process 'from a distance', little seems to be changing. However, despite this apparent stability, some innovations are actually gaining a foothold, but only in very specific niches and particular contexts of the publication system, a phenomenon which is extensively described in innovation studies (e.g. Smith and Raven 2012). In the following, we will provide short descriptions of four niches in which particular innovations are getting established. This will allow for reflection on the circumstances in which innovations might be more widely implemented.

Text Similarity Scanners
Text similarity scanners are the only innovation for which we observe substantial implementation, with a significant increase in usage over the past decade. Combining different pieces of data from our study, a nuanced picture emerges about the reasons for their unique success.
First, text similarity scanners promise a simple fix for the rather uncontested issue of plagiarism and problematic text recycling. Unlike many of the other review procedures, these scanners promise a guaranteed solution to a specific problem, much more so than blinding author or reviewer identities, for instance. Hence, the expectations are clear, allowing for a relatively smooth translation of expectations into requirements for the tools (Van Lente 1993).
Second, journals and publishers have a major (commercial) stake in providing or promising duplication-free manuscripts. It allows them to sell a 'unique' product. Especially the larger, commercial publishers may be interested in this, in line with our finding that the use of text similarity scanners is one of the few examples distinguishing the larger from the smaller publishers.
Third, similarity scanners are not only used in the publishing industry, but also in higher education, scanning student papers for plagiarism. In fact, many of the developers of such scanners consider this their primary market. For editors and publishers, the usage of these scanners in higher education provides a testbed allowing them to see whether the scanners live up to expectations. Since many editors also have a role as lecturer, this allows them to get familiar with these tools via multiple communication networks.

Registered Reports in Health and Psychology Journals
A second example of an innovation that finds substantial implementation, though only in a particular niche, is the registered report, in which research is evaluated only based on its rationale and methodology, usually before data gathering has started. Currently, this review model has been implemented in a substantial number of psychology journals, as well as some journals in the health sciences (Center for Open Science 2018). Similar to text similarity scanners, registered reports were established with a fairly specific aim. They aim to address the alleged replication crisis, and promise to provide a more or less simple fix by facilitating the publication of negative results (combating publication bias) and making replication studies more attractive (Nosek and Lakens 2014; Horbach and Halffman 2018b). In addition, the registered report model is highly similar to the review model used in grant applications, which is also solely based on a study's a priori rationale and methodology. Hence, akin to the text similarity scanners, actors might become familiar with registered reports through various communication channels, thus making the innovation more familiar.
Even though concerns about the 'replication crisis' in science currently seem to be spreading, they originated in, and still mainly seem to affect, the medical sciences and (social) psychology (Wicherts 2017; Begley and Ioannidis 2015). Hence, the implementation of registered reports seems to be constrained to the area for which it provides a solution to an acknowledged and well-defined problem. In addition, the registered report format seems to be most applicable to certain areas of research (including the empirical, highly standardised fields, with low levels of researchers' degrees of freedom), while it is less applicable in fields with other methodological and epistemic traditions (such as the humanities).

Image Manipulation Scanners in Biomedical Journals
A third editorial innovation that we would like to single out comprises the use of image manipulation scanners. At present, they seem to be most commonly used in biomedical fields and, to a lesser extent, some journals in psychology (Scheman and Bennett 2017). Within these fields, they again provide a solution to an uncontested issue, namely the manipulation of figures and images, such as western blots. While detecting image tweaking is still technically challenging, highly standardised representations such as western blots allow for some automated detection, or at least flagging of potential problems. Even though some prominent cases of fraud were detected through careful scanning of images and figures, including the Schön case (Consoli 2006), such detection as yet relies on human skill. While techniques based on Artificial Intelligence promise to take this approach to a more automated level, such expectations remain to be fulfilled (BioMed Central 2017). Currently, the use of image manipulation scanners therefore seems to be constrained to (1) fields in which images commonly occur in manuscripts; and (2) those fields that have highly standardised representations in images and figures, thereby allowing relatively simple technical tools to be of genuine assistance.

Open Review at Several Publishers
The last peer review innovation implemented in specific niches is the open review model. Several publishers have now adopted this model, with some, such as BioMed Central and the British Medical Journal, launching a range of new journals adopting open review (Godlee 2002). This review procedure aligns with the more general call for opening up science and adhering to open science practices, including publishing open access, sharing data, and other forms of transparency in research (Nosek et al. 2015). Despite wide calls to follow these standards, our data show that implementation of the open review model is still rather modest and mainly confined to several individual publishers. Part of this may be due to the large variety of different forms of 'open review', a term that may encompass either the disclosure of reviewers' identities to the authors of a submitted manuscript, the disclosure of such identities to the wider public, or even the publication of entire review reports (Ross-Hellauer 2017). In fact, Ross-Hellauer (2017) found at least 22 different definitions of 'open peer review', showing that the phrase is currently highly ambiguous and has not yet settled into a single set of features or schema for implementation. This lack of uniformity may pose a serious obstacle for editors or publishers willing to implement some form of open review in their journals.

Conclusion and Outlook
This study has been one of the first attempts to map the distribution and development of journal peer review and editorial procedures. Our work presents new perspectives on multiple aspects of review. First, it shows that editorial procedures are diverse. The 'common core' of editorial procedures that are shared by a wide variety of journals comprises a surprisingly small set of procedures. Journals commonly differ in their review procedures in small or subtle ways.
Second, we witness only minor variations in editorial procedures when aggregating over either scientific disciplines or academic publishers. Hence, while individual journals commonly differ slightly in their editorial procedures, on a larger scale, there are few systematic patterns in these differences.
Third, over the past decades, an abundance of new review procedures have been suggested and initiated. However, adoption of these innovative formats by other journals is slow. Since the beginning of the century, only a very limited number of journals have made substantial changes to their editorial process. Even today, the traditional forms of peer review, single- or double-blind pre-publication review, still prevail over more innovative formats, such as open, post-publication, or review assisted by the wider community and digital tools. Text similarity scanners are the exception to this pattern. In the past decade, their uptake has rapidly increased and using such scanners has now become more or less common practice.
Fourth, we obtained some data about journals' motivations to alter their editorial procedures. Despite their exploratory nature, these data suggest that innovations most commonly occur as a response to novel opportunities, rather than as a response to shifting expectations or general threats.
Last, we sketch out several niches in which particular innovations have found their way to implementation. Although these are, with the exception of text similarity scanners, not or hardly visible in the overall figures on the distribution of editorial models, the implementation of innovations in particular niches may provide an important step towards further distribution (e.g. Verbong et al. 2008; Smith and Raven 2012). A closer look at these niches portrays several contexts in which implementation becomes more likely: contexts in which an innovation offers a fairly simple solution to an uncontested problem, or contexts in which actors may acquire prior familiarity with the innovation. In contrast, ambiguity about the specific features of an innovation as well as technical limitations and epistemic or methodological diversity seem to hinder widespread implementation.
Hence, we conclude that there are various reasons for text similarity scanners, for example, to be a special case in our sample of review procedures. Rather than spreading by the logic of a diffusion model, there are more subtle reasons for these tools to be specifically prone to implementation at a wide variety of journals. For the same reasons, a similar process might be expected to occur for other tools and review models in the future, including tools supporting statistics review and image manipulation scanners: they also promise a relatively simple fix for an uncontested issue, and they might be introduced, and create familiarity, in multiple contexts.
Our study is the first to analyse the implementation of a wide variety of editorial formats over a wide variety of journals. Earlier work has commonly focussed on a single aspect of review (for instance, mapping single-blind vs. double-blind procedures) or focussed on a specific research discipline or publisher (e.g. Taylor & Francis 2015).
Our study's findings may be somewhat limited by a number of factors. First, it is important to bear in mind the possible bias in the selection of and response by our study sample. When sampling email addresses of journal editors, we used the Web of Science database, searching for editorials written in 2017. This thus excludes journals not indexed by Web of Science, or those journals that do not publish editorials. In particular, this may have excluded several young, non-English, or niche journals. Especially some of the young or niche journals might be particularly innovative in their editorial process, hence the selection bias may have caused us to underestimate the diversity of editorial procedures. In particular, it may have led us to overlook the implementation of editorial procedures within a specific community that is underrepresented within our sample. For example, the apparent increase in the usage of registered reports in psychology journals (Center for Open Science 2018) is a trend not visible in our data.
Conversely, from the sample of journals that were sent an invitation to participate, those journals paying specific attention to their editorial process, as well as those being particularly keen on innovating editorial procedures, are arguably more likely to have responded to our survey. Hence, this potential response bias may in fact have resulted in an overestimation of diversity and innovation in editorial procedures. In fact, our data provide some hint of this phenomenon, with substantially more innovations in peer review reported by those journals that responded most quickly to our survey.
A final limitation of our study rests in the survey approach to collecting data about editorial procedures. Even though we tested our survey before distributing it, this type of data collection is inherently prone to misunderstanding of the wording, as well as incomplete knowledge of the editors of (past) review procedures, or different interpretation of terms by researchers and respondents. This might have led editors to classify their journal's editorial procedures differently from how we intended it and hence have influenced our analyses. Specifically, editors' incomplete knowledge of past changes in review procedures might have influenced our analysis of the implementation of new editorial models, especially with implementations established further in the past. In addition, we acknowledge that we have studied formal peer review procedures, rather than actual review practices. How editors say their review is performed might not, for all manuscripts, correspond to how things are done in practice.
Our findings raise several questions to be addressed in future research. Both the apparent lack of diversity in editorial procedures across disciplines and publishers as well as the slow implementation of novel procedures raise questions about the seeming inertia in review and factors hindering innovation. While the literature on the spread of innovations suggests that the academic publishing industry meets many of the conditions required for substantial adoption of innovations, we witness little of this in practice. In addition, the exploratory findings about reasons for journals to adopt new review procedures raise further questions to be explored, including about which actors are in the position to implement new review formats. On what do they base their decisions regarding innovation? And more generally, do journals deliberately adopt a strategy of cautious innovation, or is the peer review system merely a sleepy giant?
In addition, future research could examine actual review practices, rather than formal editorial procedures as reported by editors or outlined on journal web pages. These questions are to be tackled by more qualitative research involving the various stakeholders in the editorial process, including editorial boards, managing editors, publishers, authors and reviewers.
Last, an open question to be addressed in future research concerns the effects of various procedures: do they deliver on the expectations? For instance, are specialist statistics reviewers or statistics scanners indeed effective in increasing the quality of statistical analyses? And do text similarity scanners indeed decrease the amount of (textual) duplication in journal manuscripts? While some work on this has been done (Horbach and Halffman 2018a), much remains to be elucidated. In the end, such information should be the leading motivation for journals to adopt specific review procedures.