The DocMaps Framework for representing assertions on research products in an extensible, machine-readable, and discoverable format

Peer review of a research product varies widely depending on the publishers and platforms involved in the process. As scholarly publishing is disrupted by new innovations, peer review processes become more heterogeneous, placing an increasing burden on the researcher in understanding how they can communicate their scholarship. New ways to model such processes, and increase transparency, trust, and experimentation in scholarly publishing are needed. Many are emerging but can tend to focus on the needs of creators, and not those of readers, funders, and the whole scholarly publishing ecosystem. They may not place focus on representing editorial practices in ways that can be reliably aggregated, surfaced, and queried; are often limited to traditional peer review processes; and cannot capture the full range of editorial practices and events needed to accommodate alternative publication, review, and curation models. To support researchers in a world of experimentation in scholarly publishing, we propose a machine-readable, extensible, and discoverable framework for representing and surfacing review and editorial processes. Working with a Technical Committee composed of interested parties by employing a modified Delphi Method, we developed initial guiding principles and proposals towards an object-level editorial metadata framework compatible with a broad range of possible futures for scholarly publishing. We present the results of this process with a proposal and example use cases for DocMaps, a framework for representing object-level assertions.


Introduction
There is a need to increase transparency about how scholarly publishing is undertaken. The COVID-19 pandemic, with its requirements for rapid review and an explosion of pre-printing, shows the urgency of this need.
While it may be tempting to suggest that there should be a standard way to undertake processes of publication, review and curation, we propose that what the current ecosystem needs instead is a framework that defines some general principles and best practices, while allowing the ecosystem to experiment, grow and evolve as needed. To address this need, we set out to construct the DocMaps Framework for capturing valuable context about the processes used to create research products. This framework was designed to capture as much (or little) contextual data about a document as desired by the publisher: from a minimum assertion that an event took place, to a detailed history of every edit to a document.
We started by identifying three key requirements for a framework for representing and surfacing object-level review and editorial events to create a healthy and trusted ecosystem: • Extensibility: a wide range of editorial process events should be able to be represented, ranging from a simple assertion that a review occurred, to a complete history of editorial comments on a document, to a standalone review submitted by an independent reviewer.
• Machine-readability: assertions should be represented in a format that can be interpreted computationally and translated into visual representations.
• Interoperability: a single service should be able to interpret multiple taxonomies against the same criteria and arrive at the same interpretations.
The DocMaps project aims to assist the community by providing a framework to represent a broad range of editorial processes independent of the model used to define them; ultimately we aim to provide technical guides for implementing those requirements in a number of common formats.
The project took place in the midst of a rich ecosystem of initiatives related to models aiming to better describe and support peer review (see Appendix 1). Numerous recent initiatives have been positive developments in the space of peer review experimentation and transparency (Woolston 2019;Jones et al. 2020; "Peer Review Transparency," n.d.; "Review Maps · KFG Notes" 2019). Such initiatives understandably tend to focus on the needs of the creators, and can be limited to traditional, even field-specific, peer review processes. They may not focus on representing editorial practices in ways that can be reliably aggregated, surfaced, and queried, or capture the full range of editorial practices and events needed to accommodate new publishing workflows in which reviews may be conducted by multiple parties. But they can be integrated into a framework like DocMaps to allow those who do, or wish, to use them to frame their contextual metadata around them. DocMaps aims to be a framework with a common way of describing editorial events, to which publishers of documents can add the components they can, or wish to, share.
Here, we describe our work in constructing the DocMaps Framework, using a modified Delphi Method (Linstone and Turoff 1975) to iteratively gather and prioritise the views of a Technical Committee (TC, see Appendix 2) comprising a broad-range of expertise and stakeholder viewpoints: parties interested in modelling object-level editorial processes; parties who may be willing to surface metadata as part of their infrastructure; and entities who create metadata as part of their editorial process. The outcome, also presented here, is a proposal for DocMaps, a common framework for representing object-level editorial processes.
In essence, the DocMaps Framework is composed of "DocMaps", which are immutable assertions that describe an editorial event (or series of events). A "DocMap": • can describe multiple types of documents;.
• is asserted by the publisher of this document; and • exists independently from another DocMap, but can reference other DocMaps.

5
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 In this way, a DocMap can describe a published paper, a preprint, a single referee report, an aggregate report, a recommendation, etc., allowing (for example) review services to curate evaluations from multiple sources in their reviews. At its simplest, a DocMap could simply describe the existence of an object:

DocMap
{ createdOn: timestamp // The time the DocMap was created contentType: string // The type of content (book, chapter, review) content: uri | optional // A link to the content the DocMap refers to provider: uri | optional // A link identifying the provider of the DocMap (e.g. the journal) } We here present our proposal for the consideration of the scholarly publishing community, providing two worked examples of use cases identified and described with the assistance of our TC.

Method
The process of developing the DocMaps Framework chiefly relied on interacting with a Technical Committee (TC) composed of experts using a modified form of the Delphi Method (Linstone and Turoff 1975). Interactions with the TC were planned and executed by the Core Team (GSM, JKP, TRH, GS).
The TC was composed of parties interested in modelling object-level editorial processes; parties who may be willing to surface metadata as part of their infrastructure; and entities who create metadata as part of their editorial process. The list of permanent TC members involved throughout the process is described in The Delphi method is a systematic, interactive forecasting method which iteratively builds upon the combinatorial judgements of experts. The method seeks to increase the efficiency of face-to-face meetings by introducing additional structured group communications in between the meetings, in this case through the administration of strategically-designed surveys and requests for feedback. Time is then allowed for submission of responses, which are then anonymized and summarized by a facilitator. In this way, the method seeks to prevent group domination or "bandwagon" effects commonly encountered in face-to face meetings. In our version of the Delphi Method, members of the TC were not anonymous, and interacted with each other in regular (approximately monthly) one-hour meetings.
Following the traditional steps of the Delphi Method, the process was distinguished into four phases. The first exploratory phase (Phase I) allowed each TC member to contribute additional pertinent information, or make clear their initial reactions to the framing of the problem at hand. The next phase (Phase II) aimed to reach a reasonable consensus and shared understanding of how the group viewed the problem, including allowing discussions to clarify semantics or relative perspectives. The third phase (Phase III) sought to probe and explore areas of contention, and clarify outstanding issues, which in this case involved an initial draft proposal of the DocMaps Framework. The final phase (Phase IV) allowed us to present back to the group what had been heard, clarify any final questions and propose a final framework for publication.
In brief, the process involved the following steps: 1. An initial survey (Survey 1) prior to any face-to-face meetings, accompanying a description of the proposed project sent to participants who had previously agreed to take part in the TC (Phase I) 2. Meeting 1 (September 30th, 2020), where we presented the overall project, described the process, allowed group members to introduce themselves, and led a discussion of a summary of the results of Survey 1.
3. Survey 2 asked participants to suggest possible use cases for the DocMaps Framework, and asked for clarification on thoughts about an issue arising in Meeting 1, in order to reach consensus and clarify any arising issues (Phase II).
4. Survey 3 asked participants to vote on each proposed use case (summarized, standardized and anonymized) in order to help the group select two priority use cases to work up in the context of the DocMaps Framework.
5. Meeting 2 (October 29th, 2020) involved the TC picking two use cases, and breaking into two working groups to discuss the elements required to describe these use cases, and then gave data and their sources that would correspond to those elements.
6. Survey 4 proposed the first draft of the DocMaps Framework and asked for comments on the draft, as well as asking specific questions to probe and explore areas of contention (Phase III). In addition to this work, we also discussed the DocMaps Framework proposal at an initial recorded webinar for our Co-Creation Community, a group of self-identifying interested parties who we are asking to give more feedback now and as the project progresses.

Results
Delphi Method with the Technical Committee As described in the Methods, we first asked our TC to complete Survey 1 prior to any face-to-face meetings. This survey, accompanying a description of the proposed project, was designed to achieve the goals of Phase I of the Delphi Method by allowing each TC member to contribute additional pertinent information and make clear their initial reactions to the framing of the problem at hand.
Survey 1 asked: 2. What are the main aims, and hopes, of such a project for you and/or your organization, and the scholarly communications ecosystem more broadly?
3. What major milestones need to be reached between where we are now and the future you envision?
4. What are the major pitfalls that could disrupt the project?
In answer to question 1, TC members provided a list of editorial or evaluative "Events" that could be supported within the DocMaps Framework, which were summarized and are listed in Appendix 3. In summarizing the suggested events, we took care to highlight events listed in the peer review taxonomy developed by the International Association of Scientific, Technical and Medical Publishers (Jones et al. 2020).
The responses to question 2 were combined and summarized as shown in Figure 1.
The responses to questions 3 and 4 were combined into Figure 2, as a timeline of the project was possible to combine with pitfalls that would arise at particular defined timepoints. 10 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021  "What are the major pitfalls that could disrupt the project?" (depicted on the right-hand side).

11
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 During Meeting 1, we led a discussion of a summary of the results of Survey 1. In the course of this meeting, a clear tension arose between the need to define events, and the desire to try to build a minimal pilot of something that could describe these events. In particular it was noted that other initiatives (such as the STM initiative (Jones et al. 2020)) were working on defining taxonomies, and that attempting to define every single possible event, and achieve consensus on these definitions, would be difficult and not a desired task for DocMaps.
Another issue that arose through the course of the meeting was clearly articulated by one participant as a collective issue in defining an event, and in particular the "status" of an event, and its perception in the community, or its "standing". For example, an event can give an output (the "status" of the event) that may then be subjected to an assessment by a human, such as an editor (or, someone who makes an evaluation of its "standing"). This has previously been described extensively in discussions of preprints (Neylon et al. 2017).
We next asked the TC to provide responses to Survey 2 in order to reach consensus and clarify thoughts on the particular arising issue of status vs. standing (Phase II) and to provide suggestions for use cases that could be satisfied by the DocMaps Framework, in order to start working towards a proposal for a pilot.
Survey 2 asked: 1. Please give us at least 1 (maximum 3) use cases that you would like us to consider to ultimately work on to define Doc Maps. a. Feel free to describe the use case in as much detail as you would like; and cases can be as general or specific as you would like.
b. You can make use of this list of possible events you provided to us. c. As a prompt, you may consider and expand upon use cases suggested previously.
d. Please use the agile "user story" format of:  (Neylon et al. 2017)). Our stance is that Doc Maps should list, as closely as possible, objective events, and leave subjective judgements and interpretations to others (This includes subjective judgements themselves being listed as events). Do you agree or disagree with this stance? Please give us your thoughts.
In response to question 1, a number of use cases were identified. The full list is provided in Appendix 4.
The results were collated and then sent back to the TC, asking them to vote on which use cases they wished to prioritize (see Survey 3 below).
In general, respondents agreed with the stance in question 2. They noted that this stance would be likely to help build trust with organizations, and that as little room as possible should be left for interpretation and judgement. One example provided was that it would be correct to convey the event of e.g. an accept/reject decision, but not to convey how that decision was made.
A potential issue raised was that in service of constraining the framework to only objective events (or state changes), it would be hard to define such events, with suggestions that the group should try to specify some standard definitions. For example, in theory, excluding changes in standing could exclude recording publication or curation in journals, since they apply the value judgement as to whether a paper is worth the attention of their readers.
This concept was also articulated by TC participants as being potentially difficult for readers to understand, especially as separation of status from standing may not be trivial. For example, if a document was subjected to a plagiarism check, which was then signified through a badge (a change in status), that assumes and conveys to the reader that the article is plagiarism free (its standing), as per the standards and policies of the entity carrying out the check and providing this interpretation.
Survey 3 asked participants to vote on use cases proposed in Survey 2 (summarized, standardized and anonymized) in order to help the group pick 2 use cases to work up in the context of the DocMaps Framework.
Three use cases received the same, highest number of votes: 1. READER of a published article: wants to KNOW THE THOROUGHNESS of the peer review process of the article to BETTER DISTINGUISH trustable papers and MAKE A DECISION of further reading/paying to read/citing/sharing/etc.
2. JOURNAL: wants to PUBLISH its own reviews and the reviews transferred from another journal or peer review service alongside the published paper.
3. PREPRINT SERVER: wants to IDENTIFY AND SELECT third-party reviews to DISPLAY them alongside the associated preprint.
In Meeting 2, the TC agreed on two use cases, by combining the READER and JOURNAL use cases into a single use case, keeping the PREPRINT SERVER use case as stated. This followed from a discussion on the work that would be required in the READER use case of defining "thoroughness", which was viewed as ambitious, and by some TC participants as desirable, but overall probably a significant effort which could distract from the goal of designing a simple pilot use case.
The TC then separated into two working groups during the meeting to discuss the elements required to describe these use cases, and identify data (and their sources) that would correspond to those elements.
These results are described in Appendix 5.
In summary, the READER/JOURNAL group discussed elements that readers might want to see, or journals might want to communicate, to promote trust. This again brought up the issue of "status" vs.
14 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10.1101/2021.07.13.452204 doi: bioRxiv preprint "standing". Participants identified that there were many elements to draw from in the STM taxonomy (Jones et al. 2020). Participants discussed providing more transparency around quality control actions that journals may already do that are not communicated well at all, e.g. peer review of code. Participants also discussed the issue of expertise of reviewers (the field someone is in and how they contribute), and how difficult it would be to have consensus in describing this, perhaps needing the development of a taxonomy.
The PREPRINT SERVER group discussed the difference between accepting materials from a review service because they meet certain qualifications and agree to transmit certain pieces of information vs.
allowing anyone to send in a review: what information you would need to determine whether to accept or reject displaying the review? The group discussed the pre-approval route and explored how the review would be transmitted, what format it would use, and how that would affect the different types of data that would be needed. There followed a discussion about all the types of data that would be needed, one being the structured full text for the review, for example allowing you to extract scores or ranking. As there is no accept/reject decision in the preprint case, participants suggested that taxonomies may need to be developed for scores and ratings.
The Core Team took the results from this discussion and expanded the use cases, categorizing and prioritizing elements for each use case. Using this, a first draft of the DocMaps Framework was created (see Appendix 6). To begin work on Phase III of the Delphi Method, to probe and explore areas of contention, and clarify outstanding issues, this draft was sent to the TC with Survey 4 with a request for comments on the draft: 15 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; In response, TC participants sought clarification on two issues: the perceived tension between defining a taxonomy vs. a framework; and questions surrounding Versioning within the DocMaps Framework. Participants perceived a tension between efforts to standardize or taxonomize editorial event information, and building a framework to represent the information in those standards and taxonomies (and beyond) in a way that can be interpreted and used in multiple formats, for multiple purposes.
Participants also questioned the logic of having 'version' as a value for 'article type', rather than as a separate element, pointing out that versions were iterations of an object rather than a new object itself.
Would including version as a content type and a context introduce confusion? Would pre-publication revisions be classified as versions?
In order to explore these questions further, they were discussed in face-to-face Meeting 3. Participants discussed issues surrounding the location of metadata about content (e.g. a document) and about related objects (e.g. a review of that document), with each having its own DocMaps describing metadata about its own object. There was an agreement about the need for an entity to be able to knit all of these assertions together, and also where to store this metadata. Whether other organizations were already carrying out some of these functions, and how DocMaps differs, was discussed, and in particular TC participants identified one key value of DocMaps being its ability to use an open and distributed infrastructure. The DocMaps Framework would provide a structure that anybody could then use, and any organization could push or pull that information to or from any other organization. The pushing and pulling of information was identified as a problem which could be bracketed at this stage and addressed in a later phase of the project.
When discussing versioning, there was general consensus about a version being an attribute rather than a content type. Participants suggested learning from the web infrastructure itself and linking from one version to another indicating a predecessor or successor by means of reference rather than value from one version to another. This would have the advantage of a single update not being a huge burden on 16 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10.1101/2021.07.13.452204 doi: bioRxiv preprint whoever owns the DocMap for an object. Another discussion centered around what version reviews of an object would sit on, i.e. the version that is reviewed, or the version that results from a review. This raised the question of the technical difficulty of linking reviews to an object that "doesn't exist", e.g. a version of a manuscript that is not publicly available and was only seen by internal reviewers, editors, and the authors.
This was resolved as needing to place attributes on links as part of the DocMaps Framework, to capture this information.
In Meeting 4, the TC took a practical look into how building the Framework would look, using concrete examples, to see what issues could be exposed in the practical creation of the DocMaps Framework. This work was based on the previously identified uses cases, namely:

A publisher captures context about a review of an article published in their journal; and
2. An independent review service notifies a preprint server about a review of an article on their platform.
A community-driven technology effort, Sciety (sciety.org) is developing an application to support communities reviewing preprints, and their code is available on the articles reviewed already, and this provides an opportunity to work through a document which is deposited as a preprint, reviewed by a community with these reviews identified on Sciety, then published on in a journal. These events could approximate the two use cases identified by the TC. Using this example, the aim was to work with the TC to map out the events, and lay out a rough code for the DocMaps for the use cases. In doing this, we hoped to be able to have the TC identify any issues with the creation of DocMaps relating to a research product.

17
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10. 1101/2021 We asked, in each case, what events had taken place that we would want to record in its DocMap? And as events were identified, were there any aspects of those events that should be described by the DocMaps Framework? We also began to build the DocMap during the meeting for each case. In the process, we also learned that the same article was described on Early Evidence Base (https://eeb.embo.org/doi/10.1101/2020.02.20.958025), an experimental platform being developed by EMBO Press and SourceData, combining artificial intelligence with human curation and peer-review to highlight results posted in preprints. The example was pointed out as being particularly useful due to its illustration of a properly distributed system.
An analogy was made that the DocMap for the final article was somewhat like a rearview mirror of a vehicle i.e. of a process that had already happened, whereas the preprint case was like a view through the vehicle's side window i.e. a view of a process in progress.
On the subject of how reviews are carried out in practice, there was a discussion of two situations: 1) a version rejected by one journal after a round of review, then the same version is independently reviewed completely independently by a different journal 2) a version that is on e.g. Review Commons involving the same journals, the decisions are not as a result of independent review processes.
These raise questions about what the DocMaps look like, how these are all linked together, and the importance of thinking about how things are grouped. Both the scenarios of solicited review events, and peer or community review events need to be encapsulated, and can stand alone as events. But when an editorial judgement is made, some entity then has to wrap these things and say because of these events, an editorial judgement has been made. Both things are possible, but there is a tension between the actors in the ecosystem that need to make this possible. The journal will publish a summary of solicited reviews and the version that was used for that, and the decision made on this basis; but a community review 18 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10.1101/2021.07.13.452204 doi: bioRxiv preprint service may make an independent review and recommendation, and then some other entity again could then take all these things, and make a further judgement, citing those. The final stage of the TC process (Phase IV) took place in Meeting 5, where a summary of the work undertaken was presented back to the group, clarifying what had been heard, any final questions and proposing a final DocMaps Framework to be published in this work.

19
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10.1101/2021.07.13.452204 doi: bioRxiv preprint

The DocMaps Framework
The text of the DocMaps Framework, providing an opportunity to comment, can be found at https://docmaps.knowledgefutures.org/pub/sgkf1pqa.

Summary
DocMaps is a framework for capturing valuable context about the processes used to create documents, in a machine-readable way that can be expressed and interpreted in multiple formats depending on the use-case of the reader of the DocMap. The framework is designed to capture any amount of contextual data about a document -from a minimum assertion that a process took place, to a detailed history of every edit to a document.
We intend DocMaps to be general enough to capture the process behind any document, but our initial focus is capturing evaluative processes surrounding research documents -primarily peer review processes conducted by journals or by services reviewing preprints.

DocMaps
In addition to being the name of the framework, a DocMap is the name of the object used to capture information about a document. At its most basic, a DocMap consists of just a few pieces of information: { createdOn: timestamp // The time the DocMap was created contentType: string // The type of content (book, chapter, review) content: uri | optional // A link to the content the DocMap refers to provider: uri | optional // A link identifying the provider of the DocMap (e.g. the journal) } Using just these three pieces of information, a publisher could make a simple assertion that an article exists.

20
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10.1101/2021.07.13.452204 doi: bioRxiv preprint { createdOn: 07-07-1999T00:00:00Z contentType: "article" content: https://doi.org/10.1109/5.771073 provider: https://ieee.org } Of course, this assertion leaves much to be desired. Most DocMaps will contain more information, e.g. the date an article was published; which version of an article the DocMap refers to; etc. DocMaps accomplishes this by building and maintaining schemas for different Content Types.

Content Types
Content types extend the basic DocMaps structure to add information needed to describe a specific kind of content.
In the article example above, the publisher can provide more information about the article by looking at the schema for a DocMaps Article (see below) and providing as many of the fields as they feel necessary.

Contexts
When you create relationships between DocMaps to describe a series of events, these relationships are called Contexts, and are defined by a context key that describes the type of relationship and the 21 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made But the flexibility of DocMaps means they don't have to publish all this information at once. They could also publish a DocMap describing just the second version of the paper by creating a Version DocMap with an isVersionOf context containing an Article DocMap.

22
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made Because contexts can flexibly describe events in many different ways, we expect that conventions will arise relatively organically between providers and consumers of DocMaps for describing different editorial processes at different levels of complexity.
As you'll see in the use cases below, we've used the Versions context to provide information about both published article versions and review revision rounds. If a provider didn't need or want to report revision rounds, or if there was only one round, they could omit the Versions context. In all of these cases, the consumer of the DocMap can decide how much of the context they want to use depending on how they plan to use the DocMap.

Initial Use Cases
The Technical Committee specified two first use cases for DocMaps and further prioritized the data elements needed to capture relevant data for those cases. Below, we describe example DocMaps for these cases and explain how the elements requested by the Technical Committee can be imputed from the DocMaps.

A publisher captures context about a review of an article published in their journal
In this example, a journal is describing a double-masked peer review of an article with two rounds of revisions. They do this by nesting a Review context within an Article Context. They then further nest two Version Contexts within the Review Context to describe multiple rounds of feedback. was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

24
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 2. An independent review service notifies a preprint server about a review of an article on their platform.
In this example, a review service is describing a fully transparent review of a preprint article with links to the review report and author response. They do this by including a content field for the review object and filling out the author response and STM Association Taxonomy metadata to describe the process of the review.

25
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 26 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

Draft Content Type Schemas
These are initial drafts of content type schemas, and will be somewhat incomplete by design. They are intended to capture the information necessary to make the two initial use cases listed above possible.

Implementation
Once we discuss and come to agreement on the initial schemas, we'll turn to implementation. We're using pseudocode in the examples above because we want DocMaps to support multiple formats, since publishers' technological needs and capabilities are so varied.

28
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10.1101/2021.07.13.452204 doi: bioRxiv preprint

Case Study
Here, we present an example of how a real situation could be described using the DocMaps Framework.
In the example presented below, a document was deposited as a preprint on bioRxiv and after a review process described in Sciety and on Early Evidence Base, was published in eLife.

29
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 30 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made They could curate those individual docmaps and produce a combined review DocMap, as shown below.
Interestingly, this looks somewhat similar to the DocMap shown above for Use Case 1. However, it is possible that grouping reviews like this sends a signal about how the process was conducted, suggesting all reviews were part of a round of review stewarded by an editor.

31
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10.1101/2021.07.13.452204 doi: bioRxiv preprint

Discussion Technical Limitations
As with any group selected for development of an idea, the initial proposal will be limited by the perspectives, experiences, assumptions and motivations of the people selected to be on the Technical Committee. It is for precisely this reason that we are proposing this Framework as an idea to the wider community, and soliciting feedback as part of our larger Co-Creation Community (C3) effort with a larger group of stake-holders. This process is open-invitation and aims to work with interested stakeholders towards implementation of DocMaps to suit their needs.

Status vs Standing
A key tension that arose as part of this process was clarifying how, and indeed whether, DocMaps could list objective events without attaching subjective judgements and interpretations. In collectively discussing what "an event" in DocMaps is, we were able to articulate it in terms of the characterization of the objective elements of the event (its "status") compared to its interpretation or subjective judgement (its "standing"). The view was that DocMaps would aim to transmit information about an objective measurement. The issue of representing the "status" of an event versus its "standing", has been discussed previously with respect to preprints, with the authors pointing out that the issue becomes politicized as people mix these two concepts (Neylon et al. 2017). For example, the "status" of a preprint (e.g. its existence as a document on a server) is one thing, but the "standing" is then for the wider community to make a determination of how what that object means, for example in terms of the contribution of its contents to a body of knowledge. One outcome of this discussion was the agreement that it is not a goal of DocMaps to become a registry or accreditation agency for post-publication review services, for example. The aim is, as far as possible, to be content agnostic.

32
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021

Increasing Trust in Scholarly Publishing
One hope is that the information recorded in the DocMaps framework could be used to increase trust in scholarly publishing. In particular, we see a role in addressing concerns about misinformation with respect to preprints, and to clarify the status of retracted works, which have come to the fore during the COVID-19 pandemic. For example, an article funded by an organization with a particular agenda could currently be posted on a preprint server that does not undertake any quality control or screening, and used to spread disinformation. To most observers, including journalists and the general public, looking at the article on this server alone would not tell you that no quality control or screening has been undertaken, and it could have all the appearance of a legitimate article. If scholarly repositories surfaced information about editorial processes, the difference between an article (and its vetting process) on different servers could be readily apparent. Likewise, clear machine-readable ways of flagging retractions would make discredited works more readily identifiable as such, and even allow publishers to demonstrate that retractions of published papers can lead to improvements in the review process. DocMaps would make more data about editorial processes accessible, allowing researchers to systematically study processes and because the data would be public and interoperable, this could be done repeatedly and independently.

Avoiding defining and prescriptive taxonomic definitions
There can be a temptation to be prescriptive as providers of information, and to attempt to define taxonomies for how information can be represented. DocMaps aims to be agnostic to these definitions, and to focus instead on recording the existence of the underlying editorial events that shape taxonomies and workflows. One publisher may not want to consider including reviews of objects that don't exist publicly, and can ignore them. But another publisher could, and should be able to do so. We expect that publishers and consumers of DocMaps will describe and interpret the events provided according to their needs. The aim is to give people the data they need to make those judgements. This is a major difference between DocMaps and other schemas, taxonomies and formats: to allow the publication of information events in as flat a format as possible. The interpretation is done by the consumer of the events. In this way, we aim to ensure flexibility, and machine-readability, of contextual information for all types of uses, 33 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 including non-traditional and even non-academic ones. We expect that conventions for publishing and interpreting DocMaps that can be used with specific taxonomies and workflows will arise, and intend to provide venues to support the development of these conventions.

34
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 Conclusions and Future Directions Using a modified Delphi Method (Linstone and Turoff 1975) we were able to iteratively gather and prioritise the views of a Technical Committee to construct a proposed common framework, DocMaps, to represent object-level editorial processes in a machine-readable, interoperable and extensible manner.
The DocMaps Framework describes editorial events through immutable assertions known as "DocMaps", which: • can describe multiple types of documents;.
• are asserted by the publisher of this document; and • exist independently from another DocMap, but can reference other DocMaps.
In this way, a DocMap can describe a published paper, a preprint, a single referee report, an aggregate report, a recommendation, etc., allowing (for example) review services to curate evaluations from multiple sources in their reviews. At its simplest, a DocMap could simply describe the existence of an object.
Our proposal is for the consideration of the scholarly publishing community. Our future work involves gathering feedback on this proposal, with the assistance of a Co-Creation Community open to all interested parties, and working with interested stakeholders to develop pilot implementations and experiments using the proposed DocMaps Framework. As a first example, we are working with EMBO's Early Evidence Base and eLife's Sciety review curation platform, in conjunction with Cold Spring Harbor Lab's biorxiv and medrxiv preprint servers, to pilot using DocMaps to represent and display distributed review events across platforms. These pilots will allow us to evaluate the utility and limitations of DocMaps in practice.

35
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made This COAR protocol will enable different repositories/services to communicate and notify one another that a manuscript has been submitted, accepted, or rejected. However, it does not describe the nature of the editorial evaluation that has taken place. Doc Maps could be exchanged or pointed to with this protocol.

Repository and Services Interoperability Project (COAR)
Aim is to develop a standard and interoperable approach to link reviews and assertions from different services with research outputs An approach to linking a set of assertions with research products, with focus on transmission and notification with machine sources, that can then be represented by DocMaps.
Event Data (Crossref) At #bioPreprints2020, participants agreed to use Event Data to publish review metadata (rather than including it in metadata produced by the preprint server).
This is a centralized approach to solving the problem COAR wishes to solve in a decentralized way. Doc Maps could be exchanged or pointed to with this approach.

MECA (NISO)
A set of DTDs for exchanging manuscripts between journals and between preprint servers and journals, including information about the review process.
MECA enables transfer of reviewer identity/comments (see appendix B4), but not detailed information about other editorial practices.

39
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made A taxonomy of peer review types for display on journals sites Provides a taxonomy that can be slotted into the DocMaps Framework by publishers.

40
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

43
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10.1101/2021.07.13.452204 doi: bioRxiv preprint PLATFORM: wants to RENDER the publishing process in various visual formats. (e.g. : single merge pdf file with sequence of documents, dates, roles types; separate HTML pages; graphical timeline) PREPRINT SERVER: wants to IDENTIFY AND SELECT third-party reviews to DISPLAY them alongside the associated preprint.
PREPRINT-FOCUSED EVALUATION AGGREGATION SERVICE that DISPLAYS feeds of events about preprints to BUILD TRUST in that preprint: wants to have an OBJECTIVE LIST of facts about the evolution of a paper (including deposition, quality check, publication and review information) to SUBJECTIVELY INCLUDE these facts in a feed to provide trust indicators to a reader and show the evolution of the paper within the context of its lifetime. For example: http://hive.review/articles/10.1101/646810?flavour=a shows the context of a review, acceptance, deposition and publication but this was obtained via 3 different services which do not standardise their description of the data. Adding more information here is not possible because a) the cost is too high to build custom integrations which each new event type and b) most other event types are not available in the existing services anyway.
PREPRINT-FOCUSED EVALUATION AGGREGATION SERVICE that DISPLAYS feeds of events about preprints to BUILD TRUST in that preprint: wants a STANDARD way to INGEST peer review content from any provider to easily ADD NEW COMMUNITIES without development effort on either party's side. For example: can currently ingest reviews from e.g. Hypothes.is, RSS and custom API integrations, and even standardising the field names across these methods would allow easier onboarding, and faster innovation in the usage of these evaluation events.
PREPRINT-FOCUSED EVALUATION AGGREGATION SERVICE that gives a HOME TO PEER REVIEW COMMUNITIES: wants a STANDARD way to SHOW the editorial and review methods used by a community or on a review so that READERS can easily DETERMINE whether they can apply TRUST to a review based on how it was conducted. For example: http://hive.review/editorial-communities/316db7d9-88cc-4c26-b386-f067e0f56334 shows Review Commons' community page and describes their peer review model in free-form text. We could summarise this, filter by it, include it as a badge etc if the terminology were standardised.
AGGREGATOR: wants to FETCH REVIEWS and authors' replies produced by peer review services and DISPLAY ALONGSIDE the associated preprint and published papers. was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021  was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 Appendix 6: The First DocMaps Framework Draft (post-Meeting 2) The DocMaps Framework Summary DocMaps is a framework for capturing valuable context about the processes used to create documents, in a machine-readable way that can be expressed and interpreted in multiple formats depending on the use-case of the reader of the DocMap. The framework is designed to capture any amount of contextual data about a document -from a minimum assertion that a process took place, to a detailed history of every edit to a document.
We intend DocMaps to be general enough to capture the process behind any document, but our initial focus is capturing evaluative processes surrounding research documentsprimarily peer review processes conducted by journals or by services reviewing preprints. We are working with a technical committee of technologists, publishers, preprint server operators, aggregators, and advocates to develop the first use-cases for the framework.

50
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 Core Concepts

DocMaps
In addition to being the name of the framework, a DocMap is the name of the object used to capture information about a document. At its most basic, a DocMap consists of just a few pieces of information: { createdOn: timestamp // The time the DocMap was created contentType: string // The type of content (book, chapter, review) content: uri | optional // A link to the content the DocMap refers to provider: uri | optional // A link identifying the provider of the DocMap (e.g. the journal) } Using just these three pieces of information, a publisher could make a simple assertion that an article exists.
DocMaps accomplishes this by building and maintaining schemas for different Content Types.

51
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021

Content Types
Content types extend the basic DocMaps structure to add information needed to describe a specific kind of content.
In the article example above, the publisher can provide more information about the article by looking at the schema for a DocMaps Article (see below) and providing as many of the fields as they feel necessary. For example: { createdOn: 07-07-1999T00:00:00Z contentType: "article" content: https://doi.org/10.1109/5.771073 provider: https://ieee.org contributors: [{name: "N. Paskin", type: "author"}] title: "Toward unique identifiers" } Although the initial set of content types will be limited, we imagine working with relevant communities to develop many different types that support their use cases. For example: reviews, versions, translations; and discussions.
For now, we've defined three different content types with the input of the Technical Committee and the DocMaps core team: Articles, Reviews, and Versions. For more details, see the draft content type schemas below. Because contexts can flexibly describe events in many different ways, we expect that conventions will arise relatively organically between providers and consumers of DocMaps for describing different editorial processes at different levels of complexity.
As you'll see in the use cases below, we've used the Versions context to provide information about both published article versions and review revision rounds. If a provider didn't need or want to report revision rounds, or if there was only one round, they could omit the Versions context. In all of these cases, the consumer of the DocMap can decide how much of the context they want to use depending on how they plan to use the DocMap.
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10.1101/2021.07.13.452204 doi: bioRxiv preprint . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 Zaphod Beeblebrox, Reviewer

Draft Content Type Schemas
These are initial drafts of content type schemas, and will be somewhat incomplete by design. They are intended to capture the information necessary to make the two initial use cases listed above possible. was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made Once we discuss and come to agreement on the initial schemas, we'll turn to implementation. We're using pseudocode in the examples above because we want DocMaps to support multiple formats, since publishers' technological needs and capabilities are so varied.

64
. CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10.1101/2021.07.13.452204 doi: bioRxiv preprint . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 67 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made implied from order of array: article -> reviews -> contributors 69 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ; https://doi.org/10.1101/2021.07.13.452204 doi: bioRxiv preprint peer_review_stage n/a unknown, as article being reviewed is only referenced by ID. Could maybe be implied as "post-publication" from lack of provider or date_online for isReviewOf docmap transfer n/a unknown, as article being reviewed is only referenced by ID. could maybe be implied as "yes" from lack of provider for isReviewOf docmap revision_round 1 implied from number of elements in review -> versions recommendation accept -> accept mapped from review -> decision peer_review_type visible -> all_identities_visible, opt-in mapped from review -> identity_transparency (uses STM taxonomy) 70 . CC-BY 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted July 15, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021