Open Access Science

John Wilbanks, executive director of the Science Commons, and his colleagues are now focusing on access to the literature, obtaining materials, and sharing data. Science Commons recently introduced a set of tools to allow authors greater control over papers published in scientific journals.

This week, they have launched the Neurocommons project, an open-source research platform for brain studies. This system uses text-mining tools and analysis software to annotate millions of neurology papers, so that researchers worldwide can find relevant information in a matter of minutes. Other sciences will follow. Wilbanks spoke with Popular Science magazine about his vision for open access science.

Chemical Crocodile Clips

I hate having to download standalone video players just to play video content. Google Video and YouTube are both guilty: obviously you can view online, but then you need an internet connection, unless you save the file to your hard drive and download the player…

Crocodile Clips provides simple simulation software and, you’ve guessed it, they have their own proprietary player. But I can excuse them, because their player is not a simple video rendering application but a simulator that allows educators and students alike to work with data and generate simulations of a whole range of processes from titration to animation. For ChemSpy.com users, the chemistry simulations and tutorials will probably be of most interest.

With the snappy Crocodile Chemistry, you get a simulated chemistry laboratory where you can model experiments and reactions, without all the hassle of fume cupboards and safety goggles. Drag chemicals, equipment and glassware from the toolbars at the side of the screen, and combine them as you wish. Choose whatever quantities and concentrations you like: reactions are modelled accurately as soon as you mix the chemicals. Plot graphs to analyse data from your experiment, and view mechanisms using 3D animations.

Moreover, if you really cannot face downloading yet another applet for viewing something, they also have a section of pre-simulated videos, ready for showing, that are aimed at training potential users, but only if you’re online and only in their proprietary video format.

Data Mining Prominent Scientists

Authoratory is a database that provides contact information, professional interests, social connections and funding for almost 300,000 leading scientists (the site quoted 289,943 as the actual figure at the time of writing). So, what makes this database unique? Well, the content is generated by data mining the millions of articles indexed by PubMed. Published papers are inspected and a personalized report built. You can hook out the most prominent expert in almost any field reported by PubMed, and the site will tell you how many papers they have published, give their research affiliations and collaborators, and list any NIH funding.

Researchers listed have to be based in the US, UK or Canada and have to have published at least three papers a year. The site claims that 2,129,859 papers with 2,699,772 unique authors have been mined, but only about ten percent of those authors are considered worthy of inclusion in the Authoratory release.
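For anyone tempted to try the mashup route, the two inclusion rules above are easy to express as a filter. This is only a rough sketch: the `Researcher` record, its field names and the sample entries are all hypothetical, since Authoratory's actual schema and selection code are not described by the site.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical record layout; Authoratory's real schema is not public here.
@dataclass
class Researcher:
    name: str
    country: str
    papers_per_year: float

# The two criteria quoted above: country of residence and publication rate.
ELIGIBLE_COUNTRIES = {"US", "UK", "Canada"}
MIN_PAPERS_PER_YEAR = 3


def is_listed(r: Researcher) -> bool:
    """Return True if the researcher meets both stated criteria."""
    return (r.country in ELIGIBLE_COUNTRIES
            and r.papers_per_year >= MIN_PAPERS_PER_YEAR)


def select_listed(researchers: List[Researcher]) -> List[Researcher]:
    """Keep only the researchers who would qualify for listing."""
    return [r for r in researchers if is_listed(r)]


# Made-up sample entries, purely for illustration.
sample = [
    Researcher("A. Example", "UK", 4.0),
    Researcher("B. Example", "Germany", 10.0),
    Researcher("C. Example", "US", 1.5),
]
listed_names = [r.name for r in select_listed(sample)]
```

Only the first sample entry passes both tests; the second fails on country and the third on output.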

I am sure someone adept at Yahoo Pipes could exploit this database in a mashup of chemistry papers and feeds and the Authoratory database. Mitch?

PubChem Statistics

In March 2006, I interviewed PubChem’s Steve Bryant for the Reactive Reports chemistry webzine and he revealed some of the inner workings and the aims of the PubChem chemistry database. Ever since, I’ve been rather curious about the growth of the site: how many scientists are using it? Unfortunately, Bryant tells me, getting a handle on that kind of data is difficult. “It’s a very tricky business to accurately condense all the raw log info on hits and IP addresses into an accurate summary of who’s using a given resource and how,” he explains.

However, there are a few tips you might use to extract some useful information from the site nevertheless. There is an easy way to look at current contents of the databases, for instance. The best trick is to go to the “global query” page:

http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi

Then enter “all[filter]” (no quotes) in the search box. This gives counts of how many records are in each database, e.g. 10,358,219 PubChem compounds, 552 assays, etc. There is also a summary of contributors to PubChem, which lists numbers of substances or assays by organization:

http://pubchem.ncbi.nlm.nih.gov/sources/sources.cgi

Now, obviously that doesn’t provide usage stats, but it does highlight a newsworthy aspect of developments at PubChem. Over the past year, there has been an increasing number (and diversity) of screening assay results. “We’re now up to over 10 million substance test results (sum of the number of substances tested in each assay, across all assays),” says Bryant. “We’ve also put some work into structure-activity analysis tools. For example, from the first assay answering the all[filter] query (AID 728, Factor XIIa Dose Response Confirmation), try “Related BioAssays | Related BioAssays, by Target Similarity”, then “Structure Activity Analysis”.”

Bryant points out that this “heatmap” display isn’t useful to all users. However, screeners who want to check on the selectivity of their “hits” are using these tools more and more, he says.

YASSE – Yet Another Science Search Engine

A new global science gateway – I’m sure they’d prefer me to display that phrase in blinking bright green text on a red background, but I won’t – has been launched by the US Department of Energy (DOE) and the British Library. The aim is “accelerating scientific discovery and progress through a multilateral partnership to enable federated searching of national and international scientific databases.” Yeah, okay. But isn’t this just yet another science search engine portal? Apparently, subsequent versions of WorldWideScience.org will offer access to additional sources as well as enhanced features.

That’s what they all say. I’m sure it’s a very worthy search tool and would like to hear from ChemSpy visitors who have tried it out and found its results useful/useless (del. as applic.).

Incidentally, the webmaster risks a duplicate content penalty because the two canonical forms of the web address (WorldWideScience.org and www.WorldWideScience.org) both return “200 OK”. One of them should have a 301 redirect applied.
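To make the fix concrete, here is a minimal sketch of the decision a server would make for each request. It is not the site's actual configuration: picking the www form as the canonical host is my assumption (either form would do, as long as only one answers with 200), and `respond` is a hypothetical helper rather than part of any web server's API.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www form is chosen as canonical; the bare domain is the alias.
CANONICAL_HOST = "www.worldwidescience.org"
ALIAS_HOSTS = {"worldwidescience.org"}


def respond(url: str):
    """Return (status, location) for a requested URL.

    Requests to an alias host get a 301 pointing at the same path and
    query on the canonical host; anything else is served with 200.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # host names are case-insensitive
    if host in ALIAS_HOSTS:
        target = urlunsplit(
            (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, "")
        )
        return 301, target
    return 200, None


bare = respond("http://WorldWideScience.org/")
www = respond("http://www.WorldWideScience.org/")
```

With this in place, only one address ever returns “200 OK”, and visitors and crawlers arriving at the other are sent permanently to it.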

Search on Steroids

Researchers at Xerox Corporation have developed new text mining software that goes beyond conventional keyword search, enabling it, in effect, to home in on the one or two golden nuggets among the trash in the garbage pit. I asked the developers how applicable the system might be to science searching, as it was originally aimed at administrative, legal and business-type environments. Apparently, the FactSpotter software will be just as well matched to searching out elements of risk in scientific documents, for instance.

White Biotech

Is it just me or is the title of the latest paper published on Chemistry Central rather unfortunately politically incorrect when taken out of context? I suspect it is just me, as Google throws up almost half a million entries for “white biotechnology” and the phrase itself was apparently coined in 2003 or thereabouts.

Anyway, the paper’s full title is “Relevance of Chemistry to White Biotechnology” and it is authored by Munishwar Gupta and Smita Raghava of the Indian Institute of Technology in Delhi. They discuss the emergence of novel biotechnological approaches to the bulk production of fine chemicals, biofuels, and agricultural products. It is, as the authors say, “a truly multidisciplinary area” with “further progress depending critically on the role of chemists.” The authors outline the state of the art and in so doing hope to encourage chemists to take up some of the challenges thrown up by this area of chemical science.

You can read the pre-press version of the paper here (as a pdf).

Thumbing Scientific Papers

A rather eye-catching paper was posted on the ChemRank site recently, entitled “How to write consistently boring scientific literature”. The paper is a parody of the art of writing a research paper by biologist Kaj Sand-Jensen of the University of Copenhagen. It begins, “Although scientists typically insist that their research is very exciting and adventurous when they talk to laymen and prospective students, the allure of this enthusiasm is too often lost in the predictable, stilted structure and language of their scientific publications. I present here, a top-10 list of recommendations for how to write consistently boring scientific publications. I then discuss why we should and how we could make these contributions more accessible and exciting.” Are you enticed by Sand-Jensen’s intro? Me neither. It just seems it would be as terse and as inaccessible to a lay reader as any of the papers he parodies. You can give it the thumbs up or the thumbs down on ChemRank.

Chemical Precedent

Readers with a fairly long memory will remember ChemWeb preprints. The pioneering site, which hosted my weekly Alchemist column from pilot issue till final closure and takeover by CI, now hosts a fortnightly newspick from yours truly. As to the preprint server, it attracted a lot of interest but, unfortunately, never took off in the way that the physics preprint service at LANL did. It seems that Nature Publishing Group is now hoping to step into the fold.

Nature Precedings (Geddit?), which will cover chemistry, biomedicine, and earth sciences, will host a wide range of research documents, including preprints, unpublished manuscripts, white papers, technical papers, supplementary findings, posters and presentations. All submissions will be reviewed by staff curators and accepted only if they are considered to be legitimate scientific contributions. The papers will not be peer reviewed. So, it’s almost exactly the same as ChemWeb preprints, but with the addition of biomed and geo. I hope it goes well; it is an interesting experiment, but one that did not produce the desired yield for ChemWeb despite that organisation’s peak membership being higher than that of the American Chemical Society. It takes more than a snappy name and some Web 2.0 graphics to win scientists over with novel Internet applications…thankfully.

Social Scientists Don’t Do Chemistry

To show scientific information flow between disciplines, Columbia University’s W. Bradford Paley and colleagues categorized about 800,000 papers into almost 800 areas based on citations of each in other papers. They produced a map of nodes in which node size is proportional to citation frequency and color distinguishes between 23 broader areas of scientific inquiry, from mental health to fluid mechanics.
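The sizing rule is simple enough to sketch. What follows is my own illustration, not Paley's code: I assume “size” scales linearly with citation count, normalised so the most-cited node gets a chosen maximum size, and the citation counts are made-up numbers attached to two of the area names mentioned above.

```python
from typing import Dict


def node_sizes(citations: Dict[str, int], max_size: float = 100.0) -> Dict[str, float]:
    """Scale each node's size linearly with its citation count.

    The most-cited node gets max_size; every other node gets the same
    fraction of max_size as its share of the peak citation count.
    """
    peak = max(citations.values(), default=0)
    if peak == 0:
        # No citations at all: every node collapses to zero size.
        return {area: 0.0 for area in citations}
    return {area: max_size * count / peak for area, count in citations.items()}


# Hypothetical citation counts, for illustration only.
example = {"mental health": 50, "fluid mechanics": 100}
sizes = node_sizes(example)
```

Whether the real map scales radius or area with citation frequency is not stated in the write-up; a plotting library would need that choice made explicitly.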

A write-up outlining the details appeared on the Discover Magazine site recently, and the first section heading announced that “Social Scientists Don’t Do Chemistry”. Presumably, the reverse is also true, as the relationships between disciplines are mutual in Paley’s map. So, what I’d like to know is: aren’t there examples of social scientists studying the impact of chemistry on our lives, perhaps touching on chemophobia and other phenomena? And what about those chemists who take a philosophical view of their science, considering its wider sociological implications in their work? If you have any examples or thoughts on this, please leave a comment.