Googling InChIKeys

What it is to be an InChI – at once unique and representative of so much, and yet meandering and potentially far too long. Now, IUPAC has launched a new format, the InChIKey, a condensed version of the InChI chemical identifier that will always be exactly 25 characters long.

This new format will make searching for molecules on the web much simpler by preventing the unpredictable breaks that happen with the conventional, seemingly endless InChI strings of some of the more complex compounds. It will thus facilitate a web-based InChI lookup service and allow InChIs to be represented in fixed-length database fields, which makes indexing a chemical structure database far easier. One of the most important aspects of this new approach is that it will allow InChI strings to be verified after they have been sent across a network. Imagine the woe, for instance, if the tail of your Viagra InChI were cropped short in transmission…
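To make the fixed-length and verification ideas concrete, here is a minimal Python sketch. It is emphatically not the real InChIKey algorithm: the function names `toy_key` and `arrived_intact`, the choice of SHA-256, and the letter encoding are all assumptions made purely for illustration, and the ethanol InChI is just a convenient short example. It only shows the general principle that a hash turns a string of any length into a key of fixed length, and that recomputing the key on arrival exposes a cropped string.

```python
import hashlib
import string

ALPHABET = string.ascii_uppercase  # 26 capital letters


def toy_key(inchi: str, length: int = 25) -> str:
    """Hypothetical fixed-length key for an InChI string.

    Illustration only: this is NOT the real InChIKey algorithm.
    """
    digest = hashlib.sha256(inchi.encode("utf-8")).digest()
    number = int.from_bytes(digest, "big")
    letters = []
    # Peel off one base-26 digit per output character.
    for _ in range(length):
        number, remainder = divmod(number, 26)
        letters.append(ALPHABET[remainder])
    return "".join(letters)


def arrived_intact(received_inchi: str, expected_key: str) -> bool:
    """Recompute the key on the receiving side and compare."""
    return toy_key(received_inchi) == expected_key


# Example InChI for ethanol; any InChI of any length works the same way.
ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
key = toy_key(ethanol)

print(len(key))                            # always 25, whatever the input
print(arrived_intact(ethanol, key))        # full string received
print(arrived_intact(ethanol[:-5], key))   # tail cropped in transmission
```

Because the key has the same length for ethanol as it would for a sprawling natural product, it fits neatly into a fixed-width database column, and a search engine has no reason to break it up.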

IUPAC admits that there is a finite, but very small, probability of finding two structures with the same InChIKey. The odds are of the order of billions to one against, but chemists are making more and more new substances as we speak, so you never know when there might be a data collision. It is very, very unlikely, however.
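The reason the odds shift as more substances are made is the familiar "birthday problem": the chance that any two keys in a collection coincide grows roughly with the square of the collection's size. The Python sketch below uses the standard birthday approximation. The function name `collision_probability` is made up for this example, and the key-space size used at the end is a hypothetical figure chosen for illustration, not a number published by IUPAC.

```python
import math


def collision_probability(n_structures: int, key_space: int) -> float:
    """Approximate chance that at least two of n_structures share a key.

    Uses the birthday approximation p ~ 1 - exp(-n(n-1) / (2N)),
    which assumes keys are spread uniformly over key_space values.
    """
    exponent = n_structures * (n_structures - 1) / (2 * key_space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-exponent)


# Sanity check with the classic case: 23 people, 365 birthdays.
print(collision_probability(23, 365))

# Hypothetical key space, for illustration only (not IUPAC's figure).
hypothetical_key_space = 26 ** 14
for n in (10**6, 10**8, 10**9):
    print(n, collision_probability(n, hypothetical_key_space))
```

The point to take away is the trend rather than any particular number: for a fixed key space, each tenfold growth in the number of known structures raises the collision risk about a hundredfold.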

The new release can be downloaded from the IUPAC web site (www.iupac.org/inchi). What effect this will have on my "passwords for chemists" idea, I simply don't know!

Author: David Bradley

Freelance science journalist, author of Deceived Wisdom. Sharp-shooting photographer and wannabe rock god.