The world of chemical research is currently going through a revolution. The constant decrease in the cost of chemical analysis and high-throughput experimentation has created an emerging need for data curation and storage facilities. At the same time, the possibilities the internet offers, such as private (intra-company) and public clouds, contribute to a data-oriented information climate. As a result, the computational chemist’s role is presently shifting from that of an algorithm developer to that of a data curator.

To illustrate and anticipate this ‘second industrial revolution’, three short case studies, or ‘visions’, have been fleshed out below.

Oil and Gas: a Vision of Petroleomics and the Digitizing of Oil

Close to seventy percent of the world’s oil cannot be recovered by traditional means, as it remains inaccessibly trapped in otherwise depleted reservoirs. If, accordingly, clever computational methods could improve today’s extraction efficiency by just one percentage point, the world’s oil production would increase significantly. The numbers, admittedly, are daunting: one percentage point equals one year of present-day oil consumption.

The petroleum industry’s future aim is to program software with which it is possible to store and annotate all data pertaining to a given oil field, as the development of a comprehensive software toolkit will most likely facilitate the exploration, recovery and processing of more oil in a more efficient, effective and green manner. Yet accurate software and sensor equipment easily produce terabytes of data; data that is difficult to process and, moreover, impractical to put into service without a concrete frame of reference. Consequently, all leading oil companies are currently engaged in projects that aim to make sense of big data. Primarily, the intention is to develop methods that can interpret data automatically.

With regard to oil recovery projects, Culgi B.V. tends to occupy a consulting position. The company aims to assist in the recovery of oil by offering advice and information about the chemical composition of oil and the condition of reservoirs. Culgi B.V. moreover believes that chemical modelling software should not be put into service in isolation. Rather, it should spin as a cog in a much larger machine and function as a semantic tool that helps bridge the gap between chemical analysis and data mining. We at Culgi B.V. firmly believe that the possibilities that today’s computational methodologies offer can be fruitfully exploited to fashion a sustainable and gainful future for the oil industry.

Materials: a Vision of Detergents

Personal care companies, such as Procter & Gamble and Unilever, are increasingly making use of e-Science, whereby all data on consumers, production methods and materials are integrated into one company-wide framework. Admittedly, the digitizing of data in the personal care industry is similar to that taking place in the oil industry, the difference being, of course, that the latter intends to improve the exploitation of raw materials and the former the marketing of consumer goods.

Bio-derived detergents offer an excellent example of how e-Science might be fruitfully employed to increase an enterprise’s productivity and profit. Personal care companies are progressively looking for detergents that are both environmentally friendly and chemically stable. That this is a formidable challenge cannot be overstated: bio-derived detergents are on the one hand expected to be renewable (that is: not to rely on petroleum for their composition and production), but on the other they are also meant to perform at least as well as traditional petrochemical products.

Over the last few years, scientists have increasingly started to use robots to produce and test several thousands of new detergent mixtures overnight. The produced mixtures are then screened on high-performance computers and analysed with the help of advanced computational algorithms. In the future, perhaps, these experimental and computational screening techniques will shorten the time-to-market of a new detergent, which currently takes a few years, to a few weeks.
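To make the screening idea above concrete, here is a minimal toy sketch of how candidate mixtures might be filtered and ranked computationally. The property names, weights and stability threshold are purely illustrative assumptions, not a real chemical model or any software mentioned in this text.

```python
# Toy sketch of high-throughput screening: filter candidate detergent
# mixtures by a stability criterion, then rank by a composite score.
# All properties, weights, and thresholds are hypothetical.
from dataclasses import dataclass


@dataclass
class Mixture:
    name: str
    cleaning_power: float    # hypothetical, normalized 0..1
    biodegradability: float  # hypothetical, normalized 0..1
    stability: float         # hypothetical, normalized 0..1


def score(m: Mixture) -> float:
    # Weighted sum of desired properties; weights are arbitrary here.
    return 0.5 * m.cleaning_power + 0.3 * m.biodegradability + 0.2 * m.stability


def screen(candidates, top_n=2):
    # Discard chemically unstable candidates, then rank the rest.
    stable = [m for m in candidates if m.stability >= 0.5]
    return sorted(stable, key=score, reverse=True)[:top_n]


if __name__ == "__main__":
    pool = [
        Mixture("A", 0.9, 0.2, 0.8),
        Mixture("B", 0.7, 0.9, 0.6),
        Mixture("C", 0.8, 0.8, 0.3),  # filtered out: below stability threshold
    ]
    for m in screen(pool):
        print(m.name, round(score(m), 2))
```

In a real pipeline the scoring function would be replaced by physics-based or data-driven property predictions, and the candidate pool would number in the thousands, as the paragraph above describes.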

Pharma: a Vision of a Digital Kinome

An appealing example of the ongoing in-vitro to in-silico transformation can be found in the pharmaceutical industry. The kinase protein family (the kinome) comprises 518 different proteins. Tumour growth usually advances because one or more kinase proteins are deregulated. If a drug could be invented that inhibits each of the 518 kinase proteins separately, then cancer could perhaps be turned into a tolerable condition rather than a devastating illness. Perhaps a daily administration of an inhibitor could slow down the disease’s development.

Unsurprisingly, perhaps, roughly thirty percent of all the work that is currently undertaken in the pharmaceutical industry is in one way or another related to the kinase protein family.

The targeting of a specific person’s kinome is referred to as a personalized form of medicine. Yet most of the drugs that are on the market today have been discovered through trial and error – this has, perhaps deservedly, earned the drug design industry the reputation of being capricious, opaque, and unintelligible. Admittedly, the designing of a successful drug has in the past often been a matter of luck. But maybe that will change in the future. Both a better understanding of the kinase family and the availability of an increasingly large dataset of protein structures will allow future researchers to rationally design inhibitors. Consequently, mankind’s ongoing battle against cancer might one day be turned into a straightforward engineering problem. But in order to fashion such a vision of the future into a reality, the following things are needed: bigger computers, better chemical models and more accessible data.