EU Open Science Platform

The EC published its tender specifications for the Open Research Publishing Platform at the end of March, and as I suggested in an earlier blog on 20 March, it is completely open to “all natural and legal persons,” at least within the EU (due to Brexit, UK organizations appear to be excluded). I think the Commission is showing a commendable lack of prejudice, and good common sense as well, in being open to participants with publishing expertise, whether university- or library-organized, funder-led, NFP society, or commercial entity (publisher or other vendor). The Commission’s tender document is ambitious and demanding (more on this later), so it will require a competent organization or consortium of entities to fulfill. Some of the ambition is about technical performance (the 99.9% up-time requirement), some of it is about networking capabilities, and some is about combining publishing requirements (open peer review) with the technical issues. There is a further ambition: requiring a preprint server capability, with linking and automatic repository posting features, while providing no funding for it.

It had been suggested on Twitter and in some media (see my prior post and the Twitter comments back and forth) that commercial publishers such as my former employer Elsevier should be automatically disqualified because they do not support Open Access enough (odd, because two of the three largest OA publishers are commercial publishers), and because existing publishers and trade associations have the temerity to advocate for sound OA policies (i.e., publishing Green OA with embargo periods, given that Green OA, in contrast to Gold OA, means that no funds have been provided for the formal publishing activities). Helpfully, the Commission was quite universal in its approach, while remaining quite prescriptive in its requirements.

Richard Poynder retweeted Martin Eve’s analysis of the tender document (see the analysis here and below), which nicely captured the ambition of the project. I assume Poynder is suggesting that Elsevier would regard it as too much work for too little reward, a view I think many organizations would share! Bianca Kramer did an excellent job with a 17-point Twitter analysis on 2 April, which I describe below.

Three key themes of the tender document

Running throughout the tender document are three themes: ambition and demand (particularly on the technical side); control and authority on the part of the EC over publication processes (the open peer review process; preprints and repositories; standards such as CC BY); and the design of a “scientific utility” that can later be taken over directly by the EC or transferred to a new party (in other words, a platform that is highly portable). While there is nothing wrong with ambition, and government and other funders should always ensure they are getting value for money, I agree with some of the early critics that it is hard to see how existing scholarly communications participants, including established publishers, will be eager to bid, other than for the joy of the sheer challenge!

The EC might want to consider whether it needs to make more trade-offs to get the platform it wants, with all of the technical and portability requirements, by being less prescriptive about the publishing process: for example, by being flexible on staffing vs. automation, or by not insisting on open peer review, which is uncertain in effect and might well affect the timeliness of formal publication. It might be more practical to incorporate the possibility of open reviews and post-publication comments without requiring that peer reviewers openly post their comments and identities. Even among strong supporters of open review, there is some disagreement over the exact meaning of open peer review (see the 2017 review by Tony Ross-Hellauer).

Technical ambition/demand

As others have noted, the technical demands of the system are considerable. First, building a reliable publishing services platform, with author submissions, peer review, and external linking, especially to non-publication resources (publication resources would no doubt link through CrossRef), is non-trivial. Many vendors in the scholarly communications space have worked hard to provide scalable and reliable services, generally on a proprietary and highly customized basis. Online submission and review processes challenge most publishers, and the larger the scope of activity, the larger the challenge. The contents must be made available in multiple formats, with significant download activity expected (especially for text and data mining purposes). Responsiveness at the required 99.9% level might be difficult to maintain if the content is being constantly accessed and mined. Registration through ORCID and other EU systems is required (though common sign-in protocols will no doubt become more pervasive in any event). In addition to these identifiers, DOIs must be assigned for all article versions, and logs must be made available of all interactions. Somehow the system must be able to populate institutional and other repositories on an “automatic transfer” basis (at the request of the author). Preprints must be annotated with appropriate, CrossRef-style links. Quite a few standards have to be met, including Dublin Core for metadata, LOCKSS for archiving, and various graphics requirements, although established publishers are already navigating these.
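To make the identifier and metadata requirements a bit more concrete, here is a minimal sketch, in Python, of a Dublin Core-style record carrying a version-specific DOI and an ORCID iD. It is purely illustrative: the tender does not prescribe any implementation, the title, author, and 10.9999 DOI prefix are invented, and the ORCID shown is the registry’s published example iD.

    from xml.etree.ElementTree import Element, SubElement, tostring

    def dublin_core_record(title, author, orcid, version_doi):
        # Minimal Dublin Core 1.1 record; dc:title, dc:creator, etc. are standard elements.
        record = Element("metadata", {"xmlns:dc": "http://purl.org/dc/elements/1.1/"})
        SubElement(record, "dc:title").text = title
        SubElement(record, "dc:creator").text = f"{author} (https://orcid.org/{orcid})"
        # The tender requires a DOI for every article version, so the identifier is version-specific.
        SubElement(record, "dc:identifier").text = f"https://doi.org/{version_doi}"
        SubElement(record, "dc:rights").text = "https://creativecommons.org/licenses/by/4.0/"
        return tostring(record, encoding="unicode")

    print(dublin_core_record("An example Horizon 2020 article", "A. Researcher",
                             "0000-0002-1825-0097", "10.9999/example.v2"))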

Quite a lot of reporting is required, not only to the EC but also at the author and funding agency level, including citation information. Much of this is being done now; in fact F1000 (based in the UK, so probably disqualified) does much of this kind of reporting for users already. Finally and fundamentally, the software to be used shall be commercial off-the-shelf or open source, and specifically any “proprietary/exclusive technologies that are not available to other solution providers are not acceptable.”

So plenty of challenges.

Publishing process controls

The tender gives a nice diagram of the publishing process in the context of the platform requirements, as shown below.

The general workflow diagrammed here is very recognizable and common, although it is important to note that there is both a preprint server aspect (it is unclear what the relationship is between Horizon 2020 funding and the preprint requirement) and a general publication process. The diagram also over-simplifies the “first level check” requirements (which are not explored in the tender document in any detail), though perhaps this is akin to the initial screening done by eLife or PLoS. One might assume that a plagiarism check through CrossRef is contemplated, but again this is not clear (the tender document itself refers to “editors” performing these checks, so it sounds more manual than automated). The ALLEA code of conduct is referenced, but this is a general set of principles rather than a process-oriented document.
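Because the tender leaves the “first level check” undefined, here is a hypothetical sketch, in Python, of what an automated triage step ahead of those editors might look like. The field names and the 30% similarity threshold are my inventions, not anything specified by the Commission:

    def first_level_check(manuscript):
        # Hypothetical triage ahead of the human editors the tender refers to.
        issues = []
        # e.g., a score from a CrossRef Similarity Check style plagiarism report
        if manuscript.get("similarity_score", 0.0) > 0.30:
            issues.append("high text overlap; possible plagiarism")
        if not manuscript.get("h2020_funding_statement"):
            issues.append("missing Horizon 2020 funding statement")
        if not manuscript.get("ethics_declaration"):
            issues.append("missing ethics/competing-interest declaration")
        # The editors make the actual decision; this only flags obvious problems.
        return ("clean", issues) if not issues else ("flagged for editors", issues)

    print(first_level_check({"similarity_score": 0.42, "ethics_declaration": "none declared"}))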

Some of the criteria sections point to proven experience in developing and managing scientific publishing services, and note the requirement to establish a strong editing and project management team in addition to the technology staff. Importantly, there are requirements for establishing a scientific advisory board (a fundamental step in establishing any new journal), which is also important in helping to recruit qualified peer reviewers. Interestingly, the tender document says that the contractor “will be required to gather broad institutional support and the involvement of the research community in many fields across Europe and beyond… [helping to establish the] Platform as a successful and innovative publishing paradigm for research funded by Horizon 2020,” without any indication of how the Research directorate or the Commission itself might help in this mission. Perhaps this is why the document is so heavy on requirements for communications initiatives and staff.

There are very specific requirements for editing, proofreading, layout, and production, familiar to established publishers, in addition to communication and networking. It is interesting to review the staffing requirements; one might wonder whether, with the use of more online resources, some of this work could be done more efficiently.

Finally, notwithstanding the notion of respecting authors and their copyright (or that of their institutions or funders), there appears to be a straightforward requirement for CC BY Creative Commons licenses, which of course many OA advocates equate with OA publishing, as it grants the broadest possible re-use rights. Journal authors, however, when asked whether they might have concerns about CC BY and commercial or derivative use, do not seem as wholeheartedly enthusiastic (see the Taylor & Francis surveys).

Building the scholarly communications utility (portability)

The framework contract itself has a duration of four years, after which the EC expects the system to be operating well against the technical functionality requirements, with a minimum of 5,600 OA articles posted under the strict CC BY licensing approach, plus some number of preprints. Perhaps more importantly, the Commission appears to contemplate transferring the operation of the platform to itself or to some other party or parties at some point. The successful bidder will thus be responsible for ensuring that it can be eased out of the picture, with an appropriate depth of knowledge transfer. Though this might be helpful in ensuring transparency, it will likely be a de-motivating factor in the bidding process.

The price schedule (Annex 8)

While Annex 8 is only a form, the EC has made clear that while there may be some “building” costs contemplated in the early phase of the process, the Platform is supposed to operate financially on the basis of a price per peer-reviewed article (assuming there will be 5,600 of those). I do remember that at some point the NIH in the US indicated it was building and operating the PubMed Central database for around $4.4m a year (see the 2013 Scholarly Kitchen post). PMC hosts many hundreds of thousands of manuscripts, so presumably the EC will be looking for a cost significantly below that. It is important to remember, however, that in addition to the technical requirements and the staffing requirements (editorial and technical), there will also be costs involved on the preprint side. Of interest is the comment that the bidder “will not charge the Commission for the process leading to the posting of pre-prints or for articles that have been rejected during the initial checks.”
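As a rough back-of-the-envelope comparison (my own arithmetic: the PMC corpus size below is an assumed illustrative figure, and the $500 per-article price is likewise invented, since the tender form leaves pricing to the bidder):

    # Reported PMC operating cost (per the 2013 Scholarly Kitchen post) vs. an assumed corpus size
    pmc_annual_cost = 4_400_000       # dollars per year
    pmc_manuscripts = 500_000         # assumption standing in for "many hundreds of thousands"
    print(pmc_annual_cost / pmc_manuscripts)   # 8.8 dollars per hosted manuscript per year

    # The EC platform's minimum article volume at an invented per-article price
    ec_articles = 5_600
    assumed_price = 500               # dollars per peer-reviewed article (hypothetical)
    print(ec_articles * assumed_price)         # 2800000, i.e. $2.8m over the contract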

Other summaries/analysis

As noted, I thought the 17-point analysis that Bianca Kramer posted on Twitter on 2 April was very good (it is hard to capture that many salient points in a Twitter thread), noting that certain Open Science protocols and requirements were not incorporated. The thread was also critical that open source software was not required for all functionalities (though the requirement is for either publicly available “off-the-shelf” technology or open source, so in any event nothing proprietary/private), finding perhaps that the tender was not ambitious enough!

2018 NISO Conference

NISO Virtual Conference on Integrating the Preprint Into Scholarly Communications

I enjoyed participating in the NISO conference last week on preprints and their evolving role in scholarly communications (see http://www.niso.org/events/2018/02/preprint-integrating-form-scholarly-ecosystem), in which I spoke about publishing ethics processes and policies and their applicability to preprint servers. There is much discussion today about how systems that incorporate early publication (not only of scholarly papers, but also data and other types of research artifacts) can bypass the formal journal publication process, with the idea that speed to publication is critical. I wanted to point out that if new sites and services intend to supplant formal journal publishing, then those services will need to adopt greater formality in policy and process to better assure the scientific community, and perhaps more importantly society more broadly, of the quality of the content.

But it was also a great opportunity for me to learn more about changing practices among preprint servers, new links between university policies and repositories, and developments at both arXiv and bioRxiv. One overall sense I had was of the growing maturity of these initiatives, and the thoughtfulness of the organizers and participants.

My former colleague Gregg Gordon gave a terrific summary of the history of preprints and SSRN, the remarkable diversity of content and content types being posted and highlighted, from working papers to conference proceedings to early-version papers (SSRN does not do much with data to date), and the importance of breadth of content for cross-disciplinary research, quoting Granovetter’s 1973 article “The Strength of Weak Ties” (American Journal of Sociology 78(6): 1360–80).

The NIH’s Neil Thakur discussed policy development at NIH that supports the citation of “interim research products” in applications and reports. The NIH’s policy recognizes both the speed benefit of early versions and the relative authority that established preprint services can provide.

Neil also presented some survey results (noting they were not properly weighted from a statistical perspective) that supported the policy change, suggesting that the inclusion of interim products could improve the rigor of application reviews, while noting some concerns about the absence of formal peer review processes. The NIH’s new policy also includes guidelines on repository best practices and citation formats.

Matt Spitzer from the Center for Open Science described the OSF Preprints network, noting that it is an open source platform that can be used by multiple preprint servers, and listing the sites already launched on the platform (now 17, in a variety of fields). One huge benefit of a common infrastructure is the cross-connection it facilitates among the sites, including links to supplemental data and other materials (further information at https://osf.io/preprints/).

John Inglis of Cold Spring Harbor discussed the launch of bioRxiv (2013) and the upcoming launch of medRxiv. Echoing Darla Henderson’s comment on the importance of community (described below), John mentioned the strength of Cold Spring Harbor, its conferences and resident researchers, and its key publishing activities.

An important element for bioRxiv is its emphasis on publisher neutrality, which helps broaden the content posted on the site and ensures widespread participation. John described the bioRxiv screening process (for plagiarism and non-scientific claims), and offered some interesting statistics on the proportion of papers receiving substantial revision (29%) and the total number of papers on the site (20,000). medRxiv is being organized by researchers at Yale and CSH, and it intends to include a number of non-paper objects such as protocols and technical reports. John also described the ethical concerns with medical preprints, the importance of screening and disclaimers, and the announcement by PLoS and CSH, made on 6 February, of manuscript posting to bioRxiv.

In contrast to Gregg’s discussion of the importance of cross-fertilization among disciplines and communities, Darla Henderson from the ACS, in discussing the launch of ChemRxiv, emphasized the importance and strength of community in the ideation and development of the new service. Darla noted that although there had been discussions for some years about the need for a preprint service in chemistry, it took an organized effort and strong leadership, starting in early 2016, to identify the mission, assess strengths and weaknesses, and bring to bear ACS strengths (quality, organization). Perhaps most importantly, Darla emphasized the trust element that scientific societies can bring to the picture.

Oya Rieger presented interesting data on the grandmother of preprint servers, arXiv (started in 1991), which seems to have retained its vitality notwithstanding its age, with the number of submissions growing steadily to nearly 120,000 in 2017. Oya focused on design principles for the future of arXiv (the “next-gen” version), and noted that many of the comments received were to the effect of not fixing things that are not broken (a recent survey showed a 95% satisfaction rate). While some aspects of the current system may seem dated, they all function well and are easily managed. Commenters urged arXiv not to add features just for the sake of adding features (something I think all platforms are susceptible to and need to guard against). Oya also mentioned the importance of moderation for annotations and comments.

Jamie Wittenberg, head of Scholarly Communication at Indiana University, spoke about the university’s 2017 OA policy and its relationship to the university’s own repository and to preprint servers. Jamie noted that the policy requires deposit in a repository (absent a waiver) but is agnostic as to which repository.

In the wrap-up questions at the end, we discussed the evolution of journal policies on preprint posting, noting that while many journals once had policies under which pre-submission posting might disqualify an article from consideration, more journals have now accepted that preprints are part of, and not in conflict with, the formal publication process. Often journals simply have outdated policies and have not been asked to address the question directly and at the right level of engagement. The number of journals with these kinds of prohibitions is clearly decreasing. I did note in my presentation that some of the “weekly” journals still have strong news-embargo-type prohibitions, where the preprint might be viewed by the journal as violating the embargo, particularly if the ultimate journal of publication is mentioned. My slides can be found with the other speakers’ at the NISO site, but I will include them separately here.

Mark Seeley
16 February 2018