What is a Scientific Paper?

David Bradley reporting from Science Online London 2009 (#solo09)

The “modern” form of scientific publishing began in the 17th century when gentlemen (rarely has it been a lady until very recently) with an inquisitive bent decided it would be a good idea to share the results of their endeavours among their peers, for assessment, confirmation and debate. August bodies that published these seeds of enlightenment as well as the occasional monstrous calf (Robert Boyle quote) grew into the learned societies we still know and love today. Moreover, on the back of scientific industry these organisations and countless commercial concerns since have built vast empires to publish and profit from the growing piles of scientific information.

From Boyle’s “monstrous calf” in 1665 to Watson and Crick’s seminal, single-page “paper” in Nature in 1953 humbly announcing that they may have unlocked the secret of life, the status quo has remained…well…the same. There were innovations over the decades, mainly in the arcane areas of typesetting and lithography. With the emergence of the personal computer and the internet, however, things began to change ever so slightly. The flat and static nature of research papers remained pretty much the same, but the papers themselves were copied from treeware to PDF formats and online archives.

With the emergence of the age of digital media, social networking, online collaborative tools, and new business models for publishing, however, the late 90s saw the first waves of a sea change that would, dot.com froth aside, be the first ebbing of a revolution the full impact of which science is yet to observe.

At #solo09, Lee-Ann Coleman of the British Library asked “where next?” There are millions of research papers out there now, so how does science use new technology to mine this seam of information and data? More urgently though, what format should the modern scientific paper take? It is obvious from the way many pioneering scientists are working today, among them many conference delegates such as Richard Grant, Cameron Neylon and Peter Murray-Rust, that things are changing significantly.

With biology papers and particle physics research having vast author lists and mounds of technical data, should the science blog become the narrative resource, the results and discussion, for a research paper, with annotated databases and repositories of processes acting as the old method and supplementary data sections? asked Coleman.

Katherine Barnes of Nature Protocols explains how NPG is already taking steps towards such a view of the scientific paper. This entity is unusual in that it publishes protocols, recipes for how to do the experiment, rather than primary research papers, with a review of the method and detailed description. These “Protocols” are unlike traditional peer-reviewed papers in that they are not peer reviewed in the conventional sense but almost instantaneously critiqued by the community. What is left unsaid, of course, is who the critics should be, whether they are anonymous, and who pays for their services.

The digital video journal JOVE has almost taken this protocol approach to its logical conclusion, where each “paper” is an improved video presentation of a protocol. Such is their credibility in this age that they are indexed in PubMed. It is presumably relatively expensive, but really useful nevertheless, Barnes suggested. One audience comment, however, suggested that making a video would be one of the least costly parts of a research project overall.

Like arXiv and the ChemWeb chemistry pre-print server before it, NPG is also touting Precedings as a pre-print journal for biology and also offering innovations in Nature Chemistry, such as 3D structures, links to data, citations and download data. All apparently very innovative, but something that Henry Rzepa and Peter Murray-Rust were proving way back in the mid-1990s with the ECTOC and ECHET virtual conferences. The pioneering efforts of those conferences and the likes of ChemWeb and BioMedNet, which were web 2.0 years before the web 2.0 of reflective logos and so-called social media, seem to be neglected in discussions today. I digress.

Barnes said that NPG is always thinking of ways it can improve articles, handle big data sets, add movies, and make the traditional paper (invented way back when) as useful to scientists today as possible. The publisher still maintains a traditional view as to what a basic paper is, but nevertheless is asking how it can move forward to help researchers in the future.

Theodora Bloom, Chief Editor at PLoS Biology, did what she described as a whistle-stop tour of what’s wrong with scientific papers at the moment. We’ve come a long way since the 1953 paper by Watson and Crick, she said, but asserted that “papers” don’t really work now. There are some pressing problems that must be addressed, not least how to preserve a master copy for the record.

One of the problems that papers present when they meet the digital head on, and one that is probably more important to authors than publishers, is that the likes of PubMed rarely index the complete author lists for papers with huge numbers of authors, such as those that emerge from genomics programs. The inclusion of complete methods is a point of contention: some publishers do, some don’t, some reserve those details for supplementary information. However they’re handled, it doesn’t seem that an optimal standard approach has been reached that allows other scientists to quickly ascertain the protocols and attempt to reproduce them, an essential part of the validation of science, of course.

Moreover, a paper may have 1500 genes or 25000 images, so where and how should those be published or archived? asked Bloom. How would they be date stamped and if, as some studies claim, many scientists cannot trace the originals of published images and data, how do we preserve the provenance of a published scientific result and so allow technology to detect fraud and inappropriate manipulation?

We need a snapshot database of these “big” papers, suggested Bloom. Again, who pays, and who has the storage, virtual and offline? It could be argued that the likes of complex 3D protein structures and such are the essence of a paper anyway, and that the narrative description is just an access point and could be handled independently in the electronic lab book or through a blog. Indeed, if a “paper” is entirely machine readable and digital then where does the author express their views? Perhaps we could go back to the one-page narrative epitomised by Crick and Watson’s humble publication of 1953. This component could become an aside for the “paper”. What is the primary data? asked Bloom.

She also suggested that the time is ripe for integrating the good-old reference section and database information for real-time analysis of author activity and results. Publishing has moved on since 1953, and we have come a long way in the fifty years, Bloom asserted, but not quite far enough. There are already multiple versions of papers available online, and for the rich semantic material to work requires free and open access to the articles, as provided by the PLoS model. Other interested parties might disagree.

The final speaker in the session, Enrico Balli, points out that some organisations are already working in the new age of scientific publishing, in which the definition of a scientific paper has evolved significantly already. SISSA has published a small number of particle physics papers in its journal and started to run “non-printable” types of papers in the last year in Proceedings of Science (pos.sissa.it), papers such as proceedings, regular papers, educational activities, conference notes etc…

SISSA is also working alongside the UK’s Institute of Physics to publish normal papers alongside any other kind of content attached to those papers. Strange papers include a manual for the software used by a part of the physics community, for instance. It’s not a paper in the strict sense, Balli explained, it’s a manual, although written in a style as if it were a paper to conform to what reviewers expect of a paper. He also highlights the CERN/LHC Atlas project, the biggest particle physics experiment, with 5000 people and so enormous author lists. The “paper” is the manual for Atlas and has been published in a journal to explore every single nut, every single bolt. The author list alone covers twenty screens and, if it were printed as a traditional paper, might be a metre and a half tall.

Balli also asks, what of the data sets in particle physics for recreating experiments? The LHC will create unbelievable amounts of data and everyone will need to cite the data, but of what will the papers that emerge from this vast international collaboration consist? A comment from the floor pointed out that reproducibility of this data might be nice but is probably irrelevant for such enormous experiments.

Balli further points out that they are creating a new un-journal, the Journal of Stuff, which will be nothing like a traditional journal but will allow physicists the opportunity, perhaps with peer review, to publish the stuff they need to publish: data sets, manuals etc.

Nevertheless, to mix a metaphor, the seeds are being sown. After several centuries of evolution, perhaps it’s time Boyle’s monstrous calf was put out to pasture.

Pictured left to right: Katherine Barnes, Lee-Ann Coleman, Theodora Bloom, Enrico Balli

Incidentally, there was quite a lively Q&A after the speakers had done their set pieces, with Cameron Neylon, Peter Murray-Rust and others pitching in on how they consider various parties in the publishing industry to be to blame for different limitations in innovation when it comes to research papers, while others such as Nature’s Maxine Clarke defended the publishers’ corner to some extent.

Martin Fenner has aggregated many of the excellent posts from #solo09 that have already been published from the prequel, the conference itself, and the various breakouts etc. Although I wrote this post on the train home on Saturday, I didn’t want to publish it until today, so these guys were ahead of me with their reports.

16 thoughts on “What is a Scientific Paper?”

  1. When I first read this article I must admit I got pretty scared. But, after some moments, I said to myself: “there’s nothing to worry about, it’s just an article about scientific papers” so I calmed down and, nowadays, I can say I live a very good life. Thank you all, especially the author, who made clear, at least to me, some very dark points here with elegance and very effectively.

  2. The arguments about online publishing put forth in the article, as well as the reference links cited, are indeed convincing about the whole lot of online publications and blogs. However, the concern always remains about the validity and reliability of these publications. Nevertheless, they do provide good food for thought in the proposed direction, which can lead to setting up a new research idea/question for study.

    A very good analogous scenario that I can immediately think of is, for example, the reporting of unique cases in the form of a case report. Though one case report does not lead to any sound evidence, reporting of such anecdotal cases by several others may set the basis for future research in that direction.

    However, owing to the burden of writing for ‘publication’, for reasons such as a lack of time or a lack of expertise in writing for peer-reviewed publications, which demands competencies well beyond just subject knowledge, several of these unique cases remain unidentified. This leads to a delay in knowledge reaching the public and other researchers, on account of which patients keep suffering. In these instances, blogs or online reporting may be supportive resources, allowing ease and flexibility of reporting and thus helping to propagate the incidents and experiences.

  3. Thanks for a thought-provoking article. Anyone interested in the changing nature of scientific publication might be interested in having a look at Elsevier’s “Article of the future”, at http://beta.cell.com/

    In my faculty, we’re planning on moving our journal online and were wondering if there were any suggestions as to an appropriate platform? We’re thinking about Joomla, as it has an active developer base and loads of plugins that would bring most of the functionality we need.

  4. It depends on what we value, data or views. Sometimes I am told that views are not truer than facts, so if we have to omit one of data and views we choose views, and this makes “entirely machine readable” papers valid. But then we don’t need seminars and conferences either. Every scientist reads data from others and publishes his/hers. No angles or aspects of research are emphasized, and what’s worse, no theories are proposed. So we don’t have science following this way of thinking.

    So, views are important because they help form theories, which, although less true than pieces of fact, we still value much. Then the question becomes whether views and data should be shown together, in the same place. If not, it is rational to have on one hand a cold database of data (duh) while on the other a warm webinar of views, both based on Internet technologies.

    Maybe it is necessary to make the theoretical type of views more ‘machine friendly’ than the casual, saloon type because the former is to show its predictability and thus subjected to later reference and should be immediately digitally comparable/compatible with that cold database of data. But the change is the separated way we treat information in this case because of their different natures when it comes to either archiving or ‘trueness’.

  5. @Maxine Many thanks for commenting Maxine. I do have my notes from the follow-up discussion but didn’t get time to edit them into a decent shape. There were also invaluable comments from a couple of people in the audience whom I didn’t recognise, but I didn’t like to quote anonymously.

  6. As mentioned at the conference, I think that scientists find the filtering and editing process a valuable way to keep track of developments. I also think that one of the talks confounded the “paper” – a useful, focused if brief description of the concept of a piece of research – with the “data underlying it”. We need far more annotated, public repositories for various kinds of data – not cheap, and not cheap to maintain. Journals do and can continue to do their bit in making authors deposit their data and other materials (eg raw files from which they derived figures) in publicly accessible databases for all scientists to search/access. Both services are valuable and in my opinion necessary. Until the day when it is all done by computers and research trends emerge automatically from the mass. But we are by no means at that point yet.
