Drug Discovery + AI

Kiran Mitra
5 min read · Dec 13, 2020

Drug discovery has always been a major focus of the healthcare industry, from Alexander Fleming's discovery of penicillin to the COVID-19 vaccines that are coming out now. Pharmaceutical companies and scientists work to develop drugs that help the human population thrive. However, a lot of work goes into finding these drugs, and most candidates do not pan out as usable medications. So how are drugs discovered? Currently, there are two main scientific approaches.

Drug Discovery Through Chemistry

Chemistry is one way through which scientists discover drugs. Chemists study the structure and representation of molecules and use that knowledge to predict what effect a molecule will have on its target. They also run experiments in which they take or synthesize a compound and test it under controlled conditions to observe the effect.

Drug Discovery Through Biology

The biology approach is slightly different from the chemistry approach. It relies on the study of DNA, RNA, and the proteins that genes encode. Using this information, biologists design molecules that can interact safely with living matter.

A Process

Drug discovery is a long and expensive process. A widely cited 2014 estimate put the average cost of developing a single approved drug at about $2.6 billion. The average time for a drug to be developed and approved is more than 10 years. This is because an effective and safe drug has to meet many requirements.

Requirements:

  1. Must be effective
  2. Must be able to be absorbed by the body
  3. Must be non-toxic
  4. Must be capable of mass production (to be effectively distributed)
  5. Must have FDA approval
  6. etc.

Meeting all of these requirements is a challenge. Here is how it is done today.

There are two main methods: brute-force lab experiments and mathematical simulations.

Brute Force Lab Experiments

In brute-force lab experiments, scientists make several compounds and try all of them against the target. This method is very accurate, but it is slow and expensive because every compound must first be made and then tested on living matter.

Mathematical Simulations

These simulations do not require real materials and are much less expensive than the brute-force method. However, the lack of biological testing and the margin of error in the computations make this method less accurate.

Both methods have their pros and cons and neither is perfect. So let’s try a new method: AI.

Illustration by Michele Marconi

AI Models

AI models have only recently been applied to this problem, and a few have been successful. These models screen a wide variety of drug candidates (chemical compounds) and check each one for effectiveness, toxicity, and the other requirements. They are also versatile and can be trained for many different tasks and targets. They are fast and efficient, and some achieve excellent accuracy.

The input to a model is a candidate molecule (paired with the protein or other target it should act on), and the output is the predicted properties of that molecule. How do we represent the inputs in a format that is compatible with computers (i.e. numerically)?

Example Sequence

One way is to write each molecule as a sequence, i.e. just its chemical formula. However, this method hurts accuracy, because we lose the specific bonds in the molecule that give it its properties.

Example Sequence
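To make the loss of bond information concrete, here is a minimal sketch. It assumes a made-up dictionary format with only the heavy atoms listed (hydrogens left implicit); the helper names `formula` and `bonded_pairs` are hypothetical and written just for this illustration. Ethanol and dimethyl ether share the formula C2H6O, yet their atoms are connected differently.

```python
from collections import Counter

# Heavy atoms only (hydrogens are left implicit in this toy format).
# Bonds are pairs of indices into the "atoms" list.
ethanol = {"atoms": ["C", "C", "O"], "bonds": [(0, 1), (1, 2)]}         # C-C-O
dimethyl_ether = {"atoms": ["C", "O", "C"], "bonds": [(0, 1), (1, 2)]}  # C-O-C


def formula(molecule):
    """Collapse a molecule into a formula string by counting each element."""
    counts = Counter(molecule["atoms"])
    return "".join(f"{el}{n if n > 1 else ''}" for el, n in sorted(counts.items()))


def bonded_pairs(molecule):
    """List which element is bonded to which, ignoring atom order."""
    atoms = molecule["atoms"]
    return sorted(tuple(sorted((atoms[i], atoms[j]))) for i, j in molecule["bonds"])


print(formula(ethanol), formula(dimethyl_ether))   # identical heavy-atom formulas
print(bonded_pairs(ethanol))                       # one C-C bond and one C-O bond
print(bonded_pairs(dimethyl_ether))                # two C-O bonds
```

The formula strings match, so a model that only sees the formula cannot tell the two molecules apart, even though their bonds differ.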

Another method is to use a graph. The nodes would be the atoms (labeled by type) and the edges would be the chemical bonds.

Graph Representation Example
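As a rough sketch of what such a graph might look like in code, the snippet below stores the heavy atoms of acetic acid as a list of node labels and the bonds as an adjacency list. The layout and the helper `build_adjacency` are assumptions made for this example, not a standard file format, and hydrogens are again left out.

```python
# Nodes: one entry per heavy atom of acetic acid, labeled by element.
atoms = ["C", "C", "O", "O"]

# Edges: (atom index, atom index, bond order). The middle entry is the C=O double bond.
bonds = [(0, 1, 1), (1, 2, 2), (1, 3, 1)]


def build_adjacency(num_atoms, bonds):
    """Turn a bond list into an adjacency list: node -> [(neighbor, bond order)]."""
    adjacency = {i: [] for i in range(num_atoms)}
    for a, b, order in bonds:
        adjacency[a].append((b, order))
        adjacency[b].append((a, order))  # bonds are undirected, so store both directions
    return adjacency


adjacency = build_adjacency(len(atoms), bonds)
for i, neighbors in adjacency.items():
    described = ", ".join(f"{atoms[j]}{j} (order {order})" for j, order in neighbors)
    print(f"{atoms[i]}{i}: {described}")
```

Unlike the formula, this structure records exactly which atoms are connected and by what kind of bond.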

Graph Neural Networks (GNNs) are a great model for making predictions from graphs. They take a graph as input and glean information from its connections. Each node is affected by the nodes connected to it, which are in turn affected by their own neighbors, and so on. This process produces a molecule embedding, which captures each atom's role in the molecule's structure and chemistry. Using this analysis of the molecule, we can predict specific properties of the molecule. Researchers at MIT were able to use this method to find a new kind of broad-spectrum antibiotic.
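The idea that each node is influenced by its neighbors can be sketched in a few lines. This is a toy illustration only: it sums neighbor features with no learned weights or nonlinearities, which real GNNs rely on, and the function names and the two-element vocabulary are assumptions made for the example.

```python
ELEMENTS = ["C", "O"]  # tiny, made-up vocabulary for this example


def one_hot(symbol):
    """Starting feature vector for an atom: 1.0 in the slot for its element."""
    return [1.0 if symbol == el else 0.0 for el in ELEMENTS]


def message_passing_round(features, adjacency):
    """Each node's new vector is its own vector plus the sum of its neighbors' vectors."""
    updated = []
    for node, own in enumerate(features):
        total = list(own)
        for neighbor in adjacency[node]:
            for k, value in enumerate(features[neighbor]):
                total[k] += value
        updated.append(total)
    return updated


def molecule_embedding(features):
    """Pool all node vectors into one vector for the whole molecule (simple sum)."""
    return [sum(column) for column in zip(*features)]


atoms = ["C", "C", "O"]                   # heavy atoms of ethanol: C-C-O
adjacency = {0: [1], 1: [0, 2], 2: [1]}   # node -> list of bonded nodes

features = [one_hot(a) for a in atoms]
after_one = message_passing_round(features, adjacency)
after_two = message_passing_round(after_one, adjacency)

print(after_one)                      # [[2.0, 0.0], [2.0, 1.0], [1.0, 1.0]]
print(after_two)                      # [[4.0, 1.0], [5.0, 2.0], [3.0, 2.0]]
print(molecule_embedding(after_two))  # [12.0, 5.0]
```

After one round, the first carbon has only seen its carbon neighbor. After two rounds its vector has a nonzero oxygen entry, because information about the oxygen has traveled through the middle carbon. A trained GNN would feed a pooled vector like the last one into further layers to predict a property such as antibacterial activity.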

Halicin

This antibiotic was identified by researchers at MIT when they were searching for molecules that could combat E. coli. Regina Barzilay was one of the leaders of the project, which screened large chemical libraries, including one of over 107 million molecules, in search of compounds that could treat E. coli. They fed graph representations of the molecules to an AI model, and it flagged Halicin, a failed diabetes drug.

Halicin (Credit: MIT News)

After the model rediscovered Halicin, the researchers tested it in the lab and confirmed it as a new kind of broad-spectrum antibiotic.

Closing Thoughts

AI has many applications in healthcare and a lot of potential across its many facets. Drug discovery has recently come into focus as a promising field in which AI can be applied. AI has already produced results in the drug discovery world (Halicin) and continues to be a strong force for further innovation.

Kiran Mitra is a Student Ambassador in the Inspirit AI Student Ambassadors Program. Inspirit AI is a pre-collegiate enrichment program that exposes curious high school students globally to AI through live online classes. Learn more at https://www.inspiritai.com/.

Kiran Mitra

A high school student who is passionate about STEM, music, drama, photography, AI, and food. Most blogs are written as an Inspirit AI Ambassador (Stanford)