Below is an interview with Alison Mudditt, CEO of PLOS (Public Library of Science) discussing the impact of AI on publishing operations, research integrity, and sustainable open science.
Dave: I’ve spoken with several publishers at this point about AI and its effect on their publishing and business models. PLOS is an open access publisher so some of those licensing concerns might not apply. Could you give me a high level overview of how you are seeing AI affect PLOS’s publishing operations?
Alison: First, for clarity here – I’m talking about AI broadly. Not just generative language models, because we use both and there are different sets of issues with each.
At the moment, where we’re seeing the biggest impact is submission volume and research integrity. We’ve seen a dramatic increase in submissions over recent years and we believe there are a number of factors driving that. And although we can’t definitively attribute this to generative AI, we do think it’s playing a role in so far as it lowers the barrier to produce large volumes of text.
Alongside that comes increased difficulty in maintaining research integrity, especially for a publisher like PLOS – we’ve now hit 8,000 submissions a week, so that’s a lot for us to work through. We are using AI tools to help screen submissions up front, verifying the authenticity of data, authorship, and institutional affiliations. All those things become a lot more complex both with the volume and when they can be manipulated or fabricated with AI tools.
We are using digital tools like iThenticate for plagiarism detection along with other ways of screening up front – the evolution of AI tools is helping us there. They don’t fully replace humans, but they can help us flag papers that warrant a second look and human editorial judgment. So, we’re trying several AI tools to support screening and quality control. None of them have proven capable of fully replacing the insight of our editors and reviewers…which they are of course very pleased to hear! But we’re in an ongoing arms race between those who are using AI to undermine research integrity and our attempts to evolve our detection methods so that we can guard against that.
The other area where we see AI increasingly playing a role is in our operations: new tools can improve platform development by speeding up prototyping and expanding automated test coverage, and they can potentially lower costs as we look at improving workflows, taking out some of the more manual work to make those workflows much more efficient.
Dave: That’s super interesting. If you’ll indulge me, I’d like to go a little bit out of order with what you just said. So, I’ve seen some publishers float the idea of integrating AI into their processes to improve workflows like you just described. One of the places where I’ve seen a lot of sensitivity is with peer review. So I wonder – is PLOS using it or recommending it to peer reviewers – to use AI in that context or have you seen it and thought through those issues?
Alison: Peer review is one of those areas which, as you say, is a little tricky at the moment. We allow AI to be used in writing papers, but authors must disclose its use to be completely transparent. Authors remain fully responsible for the content of the submission and for ensuring AI isn’t used to either fabricate or misrepresent results. So that accountability of authorship is really critical.
Our editorial policies there are very much in line with guidance from organizations like COPE (Committee on Publication Ethics). When concerns do arise either about AI generated content in manuscripts or in peer reviews, we’re using a combination of human review and some of the AI detection tools that we’ve been piloting to help assess the situation. What we’ve found so far is that some of the tools offer useful indicators, but they’re not consistently reliable – partly because generative AI is evolving so rapidly. When we do find signs of AI use without disclosure in a paper, we’ll follow up directly with the authors.
When it comes to peer review, we are beginning to encounter concerns about AI generated peer reviews, which are becoming increasingly easy to produce. Editors and reviewers at this point are prohibited from uploading manuscripts into AI tools without disclosure. If we verify improper use, then we will go ahead and assess the broader impact – is the issue isolated? Is it part of a pattern involving specific peer reviewers, specific editors?
We will then work with the editor or the reviewer if necessary. Protecting the integrity of the review process is critical, and we don’t currently allow AI generated reviews. But there’s a gray area that we struggle with in this and other contexts: how do you identify the difference between when AI is used to help with writing style versus to generate the content?
Dave: Yeah, definitely. I think one of the big areas of concern, and you’ve touched on this a little bit, is research integrity. In the past, we’ve seen some really high profile data fabrication concerns, even well before AI. And I think a lot of people have looked at this technology and thought, gee, it’s really good at pretending to create stuff that looks legitimate. And so I wonder what kind of guidance or warning you’re giving or communicating either to peer reviewers or to your editors about data fabrication issues.
Alison: Yes, we have a program that is trying to keep our editorial boards aware of the trends here and AI-assisted data fabrication is one of the major threats, along with image manipulation. Both have become a lot easier with AI and cannot be reliably detected with current tools. Many of the concerns that come through to our publication ethics team start with concerns raised by reviewers or flagged during our internal checks. Like many publishers, we’ve been shifting as much of this research integrity work as we can to earlier in the workflow with the goal of identifying problematic papers even before they’re sent out for review. The growth of our Publication Ethics Team illustrates the growing scale of the problem: we started with just one person in 2018 and are now up to 18 people.
If we’re able to spot problems pre-publication then the paper is rejected; if they’re identified post-publication, the response is typically a retraction – but not without a comprehensive investigation. In terms of working with our editorial boards, we offer training to help them with spotting integrity issues, including the role of AI, and emphasize the importance of raising concerns.
The other solution to this is at industry level because there’s only so much we can do as an individual publisher. We’re working with STM’s Integrity Hub and one of the useful aspects of their tools is the potential to flag patterns across journals and publishers. If we can share data it is going to help us with identifying bad actors, whether individuals, organizations or specific countries. That’s harder for us to do at an individual publisher level.
Dave: Oh that’s super interesting – one of the questions I had which you partially addressed is what happens with submissions that have AI generated content but this also goes for fabricated data – what do the consequences look like for one of those bad actors or folks who are trying to pass content off as legitimate research when in fact it’s just entirely fabricated using these AI tools?
Alison: If we catch it early on, we’ll obviously reject the paper. One of the challenges that we face as a publisher and as an industry is the question of who’s responsible for research integrity. Research integrity is the top priority for PLOS, but it has also become a very significant cost across publishers. We obviously have a clear responsibility for the integrity of the content we publish, but the actual problems have usually originated much further upstream in the research workflow. It seems very reasonable to expect that institutions have a responsibility for the integrity of research conducted at their institutions and by their researchers. Unfortunately, we and many other publishers often find them unwilling to engage when a problem is identified. We often run into challenges identifying who to contact or simply get no response. And particularly if it’s a well-known researcher or a researcher who has significant grant funding, frankly they don’t want to know.
We understand that if we retract a paper, this potentially has a significant impact on a researcher’s career. And of course, the impact of retracting a paper also has a massive impact for their graduate students and collaborators. We’ve tried to take a very responsible approach of doing the leg work to understand what happened. That can take a long time – particularly if we don’t have a partnership with the institution. I’ve heard a few other publishers suggest that we should simply retract as soon as a problem is identified on a paper, and I have some sympathy for that perspective.
Dave: That makes sense. This is going a little off script but I’m curious – on the university angle of caring about research integrity. I can understand for individual cases institutions being reluctant for a variety of reasons but you’d think at a systemwide level they’d have a shared interest in ensuring research integrity. So are they engaged for example through COPE or other places where some of the conversations about tackling these issues are happening?
Alison: They are engaged and I think there’ve been a number of ideas put forward. One interesting suggestion put forward by Lisa Hinchliffe was that publishers should build research integrity oversight requirements into their agreements with institutions. I can see several challenges with implementing this, but it highlights the need for individual institutions to step up to their responsibilities here.
But much of this has to start earlier with improved training. Danny Kingsley has written about the gaps in graduate training in research practice, including integrity and ethics. This is one clear and strong example of where institutions need to step up – and it’s an area that’s ripe for systemwide collaboration across institutions, publishers, libraries, and other interested organizations (such as the UK Reproducibility Network, which has launched a training program in partnership with Research England).
One of my real concerns is that the overall climate makes this work all the more challenging. While science and research have never been separated from politics in other parts of the world, we’re now seeing clearer interference in parts of Europe and the US. The research enterprise has been under attack in countries such as Turkey and Hungary for a number of years now, and we’re seeing a similar playbook rolled out here in the US. This isn’t only about funding – though that’s obviously critical – but also attempts to police expression and the choice of what can be studied and how.
The recent “Gold Standard Science” Executive Order also has the potential to be counterproductive to reform movements. While its core tenets seem anodyne enough (federal and federally supported science should be reproducible and transparent, communicate error and uncertainty, and subject to unbiased peer review), the rigid framework it sets out is almost impossible for any single study to achieve. Rather than building a scientific integrity focused on ethics and objectivity, this Order has the potential to move us further from this goal – especially as political appointees are given the authority to both define scientific integrity and to decide which evidence counts.
Dave: Since we’re on the subject of federal government interference or involvement in scientific publishing – research support is in disarray right now. There’s funding cuts. There’s real concerns about funding agencies interjecting into what funded authors or government employee researchers can even say in their papers. For a while there was this list of words that were prohibited. And so I wonder how has this concern filtered through to PLOS and, what kind of guidance are you giving authors or editors around these sorts of issues?
Alison: We began to hear concerns not only from authors but also from editors back in February. We reviewed our editorial policies which all prioritize the integrity and the usability of content. Having reviewed those, we haven’t seen any need to change our policies regarding manuscript withdrawal, authorship change, scientifically accurate language, or correction/retraction. At the same time, we’re approaching questions and requests from authors with empathy and flexibility where we can. That said, there are requests we’ve had to deny – for example, requests to withdraw or to change authorship.
Dave: One of the things I mentioned was the funding cuts. And so on that subject, PLOS announced two major grants last fall focused on transforming its business and publishing model. I’d love to hear about what that is and how that project is going. And also if there is anything to say about just how changes in federal funding are factoring in – with overall reductions in grant funding availability.
Alison: Let me comment on that first and then move on to the research and development project. At this point in the year, our revenues are holding up strongly but we’re certainly watching and bracing for potential impact from both cuts to federal grant funding and the impact on institutions of higher education.
I’ve spoken to a lot of librarians over the past six months, and some have already started taking cuts. Others have even been asked to give back money that was allocated this year. But while there is likely a real impact here, I’ve also heard from other librarians who believe that this crisis could force librarians to more fundamentally reevaluate how and with whom they spend their money. The hopeful side of me would like to think that this can force a realignment that’s been a long time coming.
The second thing that we worry about is an increasingly fragmented infrastructure, along with geopolitical tensions and rising scientific nationalism. One of the challenges for an organization like PLOS as we look at the next few years is: can we remain a principled advocate for open access at the same time as adapting to a world which is becoming more closed off? How much do we resist pressure to compromise our values and to redefine open access within national and political boundaries?
I don’t think we have clear answers but this question is one that the project of open access and open science is going to have to confront over the coming years in some way or another.
Dave: Before we get to the second part of my question about the grant and your project, I want to go back because I skipped a question and it relates to this – I’ve seen in several LLM training data sets that PLOS has been relied on heavily as a data source. And I think the obvious reason is it’s all licensed very permissively and is legally clean. I’ve heard from authors – some sense of, if not reluctance, just a question – should I continue to publish open access because doesn’t that mean that these AI companies will scrape up my data and then use it in ways that maybe I didn’t anticipate? So I wonder what you think about PLOS being relied on so heavily as a data source.
Alison: That’s a difficult one because, in principle, I think it’s a good thing. But there are a number of important caveats. This first came up with the New York Times lawsuit, in which it was disclosed that PLOS content was the sixth largest data set that had been used in training. Obviously, the fact that we mandate the CC BY license for pretty much all of our content, along with its high-quality XML, makes any reuse easier. And reuse is one of our core goals.
The key problems for us are the lack of attribution and the lack of ethical oversight by the large commercial AI players. We’ve had very different sets of feedback from authors. Some researchers choose to publish openly for a reason and would far rather that their science is being used in training than some of the junk that’s out there on the internet. But others oppose participation in AI training altogether due to concerns over ethics, environmental costs, and lack of consent or attribution.
I think the challenge is to balance innovation and compliance. We would definitely rather that AI tools are trained on trustworthy content like ours. But we do think it’s important that AI companies find ways to comply with license terms. For us that means citation and attribution, linking back to sources, and indicating where content has been modified. This becomes all the more important as AI becomes a primary channel for knowledge discovery going forward.
Dave: This may not be a question you can answer, but there have been a few reports recently about AI being a dead end for search. People have inquiries about topics and they ask ChatGPT, what is this? And the AI draws on a variety of sources, but maybe doesn’t refer people out to where those answers came from. And that’s really interesting to me because even with our little Authors Alliance website, I’m looking at where our referral traffic is coming from and we’re starting to get quite a few referrals from some of these AI tools, where people are coming to our website from ChatGPT or Claude or others. I wonder if people are being directed to any PLOS-published research from that. So like I said, you may not have an answer to that.
Alison: We’ve seen a number of spikes on our website which we assume are bots crawling our sites to build LLMs. We’ve begun actively tracking people reaching our site via AI tools like ChatGPT over the past few months in Google Analytics, and it currently appears to be relatively low volume. That said, I know that some of the main publishing platforms are now seeing very significant traffic from these sites, so I suspect we’re in the early stages of a change which, like everything else in AI, might happen quite quickly.
Dave: Most of our referral traffic comes from Google and Bing and stuff like that, but it was not an insubstantial number of referrals to our website. And for me, I’m very happy – we give everything away for free. I’d much rather have people taking our copyright advice than other people’s.
Alison: Yes, exactly.
Dave: Let’s get back to what I asked about earlier – this major grant that PLOS has that is focused on transforming PLOS’s business models. Maybe you could just tell me a little bit about that and how it is going.
Alison: This project has evolved from research we’ve been doing over the past couple of years to identify the next major transformation for PLOS. PLOS was never intended to be simply an open access publisher – rather, our role has always been that of a catalyst for positive change in the way in which research is shared.
Our goal has been to identify the next big problems we want to solve – ones we think PLOS is in a unique position to solve. The digital age has made significant transformations to research workflows and the way in which research is conducted but research publishing itself has really lagged behind – a lot of the fundamental concepts have remained largely artifacts of print publications. Through extensive stakeholder conversations, we identified two key questions that we wanted to focus on.
The first one is the misalignment of incentives for open science. PLOS has done a lot over the past decade to promote open science practice through our journal portfolio. From this, we can see that many researchers see the benefits of sharing more openly and in principle, they’re willing to engage with open science practices. But the lack of recognition for doing so in the academic reward system is a huge obstacle to further adoption.
Of course, there are key elements of this problem that PLOS can’t do anything about. We are not going to change institutional academic reward systems, but all too often, this enables publishers to walk away from the problem and leave it for someone else. We believe that there are ways in which publishers can contribute to reducing the focus solely on published articles. If open science is going to gain wider adoption, there needs to be a much easier way to share and link those research outputs, to make them more discoverable in their own right, and to facilitate credit for them.
The second problem that we want to address is the business model. PLOS was one of the early publishers to use APCs, and while they played an important role in getting open access off the ground and proving that it was a viable business model, they’ve become co-opted in wholly unintended ways. In addition to the inequity they create, APCs reinforce the notion that articles are all that matters.
It’s also clear that APCs have done nothing to reduce the cost of publishing or to reduce the market power of the large commercial publishers. This is clearly illustrated by the 2024 JISC report evaluating the first decade of transformative agreements in the UK. While over 50% of UK research output is now published open access, libraries are still spending the majority of their budgets on subscriptions. And as we now know, APCs have disenfranchised many researchers and institutions, primarily in the Global South.
Dave: Those are two pretty hard problems. I’m curious if you can say – and I totally understand if you can’t on the business model front – could you give some sense or flavor of alternatives that you’re exploring? I know PLOS has a really robust institutional partnership program now, and when I was at Duke, I thought that was fantastic. We really wanted to jump in on that, and so that seems to me like a kind of interesting innovation on the per-article APC approach. But I would be interested if you could give any flavor of what the menu of options you’re thinking about are there.
Alison: This will be an institution-based model, and we’ve started out with a few core design objectives. We want to structure the participation fees to maximize open science and the social benefit that delivers. We want to ensure that costs and funding mechanisms are fair and transparent. We also want to avoid the administrative overhead of individual transaction fees that burden publishers, libraries and funders in the APC model. And we need a model that can capture sufficient funding from libraries to make it self-sufficient and self-sustaining over time.
Providing our publishing services purely as a public good runs into the classic collective action problem. We would require consistent grant funding – coordinated collective support over time. Or we could appeal to values-based acquisition practices at well-resourced libraries. But it’s questionable how such a model would fare over time, especially as budgets tighten. We’ve therefore been exploring models that include some private benefit, a sort of club-based benefit alongside the core public benefit (i.e. the content is always free to read). We’ve had excellent input and support from a variety of libraries and funders to date, and I think we’re headed in the right direction.
Dave: That’s really interesting and very interesting too to think about how the two pieces of this project fit together. Libraries are an important part of the funding for journals and for publishers, but increasingly with the fragmented APC payment landscape, you see an awful lot of money flowing out of institutions that is not institutionally coordinated. And I think, especially with what’s happening with grant funding now, universities are going to have to face a crisis in funding – how do they rein in all of this diffuse spending and maybe a better approach is to have an institutional investment in a publisher or publishers that are aligned with their open science goals.
Alison: I think that’s true and there is clearly a big realignment happening in the US. Whatever happens whenever we reach the other side of this, we’re not going back to how things were before. Although there’s a lot to be angry, frustrated and worried about, on my hopeful days I see an opportunity for us to seed a much needed transformation. One aspect of this for PLOS is positioning our publishing services as a collective that’s supported primarily by research producing organizations but also by research funders – we don’t want to lose that funding from the system and place the entire burden on libraries.
Dave: Right. That makes a lot of sense. This is really exciting. I’m very pleased that you’re taking such a close look at this and having a really open mind about how you’re approaching it. It’s easy to get stuck in what we have here and now. But it sounds really cool.
Alison: Definitely. I just want to reinforce the point you made about the linkage between these two problems. The new business model is necessary to support a different model of publication incorporating different outputs but we are also looking at this as a business model that will support our current journal portfolio.
One of the core learnings of our research to date is that these new publishing capabilities won’t appear as a shiny new product but as a transformation over time of our journals portfolio. We know from many other attempts to change researcher behavior that this is about much more than providing technological innovation. Behavioral change is hard enough already and one of the core advantages that PLOS has is a highly respected journals portfolio. This has enabled us to drive past changes, such as data sharing, because researchers have done this through vehicles they already know and trust.
Dave: Yes. If there’s anything I’ve learned in scholarly publishing over the last decade or two is that change is incremental. And trying to do it all at once is almost doomed to fail. So that approach makes a lot of sense.
Alison: It is. And I think that’s one of the interesting things when you look back at PLOS and open access. PLOS’s initial success with launching PLOS Biology was that it was like every other selective journal – it just changed the business model.
That said, I think that what hopefully comes out of our current work is more ambitious and transformative – because that’s why PLOS exists. How can we demonstrate a different way of publishing that meets the funding and equity challenges head on, but also allows open science practices to flourish in ways that have meaningful public and societal benefit?
Dave: Yeah. That’s fantastic. I’ve been in so many conversations over the years where we throw up our hands and say, “Well, none of us can convince promotion and tenure committees to change what they’re doing or to change the incentive structure, and so why are we even trying?” It’s really admirable that you’ve taken a much more positive approach to saying, well, there are some things where we can make a difference here.
Alison: Yes, exactly. And while some of this will be incremental, I fully believe there are steps that all of us can take.
Dave: This is great. Thank you for talking with me!