Using AI to Solve Medical Mysteries and Spur Rare Disease Treatments – Dr. Matthew Might, Kaul Precision Medicine Institute at the University of Alabama at Birmingham


“It's still early days in the application of all this technology relative to its long-term potential, but even so, it's already producing some big wins for patients,” says Dr. Matthew Might, whose impactful career in computer science and medicine has been shaped by the rare disease odyssey of one of his children. His son, Bertrand, was the first person in the world diagnosed with a particular form of NGLY1 deficiency, a neurogenic degenerative condition that causes developmental delays, seizures and frequent infections. Unfortunately, Bertrand succumbed to an infection at the age of twelve in 2020, but by that time, Dr. Might's work in precision medicine had led to crucial discoveries for dozens of children with NGLY1 deficiency. Now, as director of the Hugh Kaul Precision Medicine Institute at the University of Alabama at Birmingham, he uses an AI-based system programmed to connect the dots in extensive databases of medical literature to make inferences about potential therapies for uncommon diseases. Check out this fascinating conversation with host Shiv Gaglani about the promise of this approach, the challenges in repurposing drugs and conducting clinical trials in the rare disease community, the need for more genetic counselors and Dr. Might’s work on President Obama’s Precision Medicine Initiative, which he calls the Rosetta Stone of the human genome. Mentioned in this episode: https://www.uab.edu/medicine/pmi/




Shiv Gaglani: Hi, I'm Shiv Gaglani and today I'm happy to welcome Dr. Matt Might to Raise the Line whose impactful career in computer science and medicine has been shaped by the rare disease odyssey of one of his children. His son Bertrand was the first person in the world diagnosed with a particular form of NGLY1 deficiency, a neurogenic degenerative condition that causes developmental delays, seizures and frequent infections. Unfortunately, Bertrand succumbed to an infection at the age of twelve in 2020. But by that time, Dr. Might's work in precision medicine had led to crucial discoveries for dozens of children with NGLY1 deficiency and an AI based system that is helping solve medical mysteries of all sorts. 


Dr. Might has been the director of the Hugh Kaul Precision Medicine Institute at the University of Alabama at Birmingham since 2017, where he's also professor of internal medicine and a professor of computer science. Prior to that, he worked on the Obama administration's Precision Medicine Initiative and joined the faculty of the Department of Biomedical Informatics at Harvard Medical School, where he's currently a senior lecturer. 


Before we get started, I wanted to note connections to two other Raise the Line guests: Matt Wilsey of Grace Science, whose daughter also suffers from NGLY1 deficiency; and Chris Gibson, the founder of Recursion Pharmaceuticals, who has been involved in the quest for NGLY1 deficiency treatments. So, Dr. Might, thanks for taking the time to be with us today.


Dr. Matt Might: Oh, it's a pleasure to be here. Thanks for having me.


Shiv Gaglani: Of course. You’ve had quite an interesting career. I know our audience will find it fascinating. Starting with your beginnings in computer science, can you give our audience a bit of a breakdown of what got you interested in computer science and then eventually, cybersecurity?


Dr. Matt Might: Sure. Maybe it happened in the reverse order. When I was twelve years old, I was getting into mischief on the internet. This is sort of the early days of the internet. Prodigy and AOL were around, but you could get online. I kind of took to hacking. I really enjoyed it and from that developed a broader love of computing, computer science, and all that, and that really stuck with me all the way through the time I got to college where I finally majored in it and ultimately got a Ph.D. in it. But it's always been a passion.


Shiv Gaglani: Yeah. That's how a lot of people get into it. I think that the two ways are like white hat hacking or game development. The best software engineers I've met go between those two. Obviously, you've been applying your computer science skills to precision medicine for a while. Talk to us a bit about your diagnostic odyssey that your family went through with Bertrand.


Dr. Matt Might: So, Bertrand is my eldest son, and had a four-year diagnostic odyssey where we had no clue what was wrong, right from birth. It ultimately ended with really one of the first trials ever of a form of genomic sequencing just sort of figuring out what was going on with a very small group of kids who had been on similar intractable diagnostic odysseys. That was done at Duke University and the shocking result came back that Bertrand appeared to have this brand new genetic disease, NGLY1 deficiency. He was the first person discovered with it.


Shiv Gaglani: Can you talk to us a bit about the specifics of the odyssey? Like, was it ultimately you found the right specialists to test for that gene? Or was it a whole genome sequence that Bertrand went through or?


Dr. Matt Might: It was a whole exome sequence in this case. You know, genomes were still very expensive at the time, and exomes sort of produced this overnight breakthrough in costs because they were designed to look at the 2% of the human genome that's protein coding, which is where the vast majority of genetic disease lies. So, by looking at a very small fraction of the genome, you can capture most of the genetic disease and that's what suddenly made it feasible. It's what allowed that clinical trial to go forward.


Shiv Gaglani: That's a good distinction for our audience to know. So, after the diagnosis, what was life at home like with NGLY1 deficiency? Tell us a bit more about what it was like being a parent of a child with a rare disease during the course of his life.


Dr. Matt Might:  I kind of divide it into the undiagnosed and the diagnosed periods. The undiagnosed period is just sort of...it's kind of terrifying every day and it's not like it's not necessarily not terrifying afterwards, but just not knowing what's going on, not knowing what could come next, having no sense of the future and random hospitalizations thrown at you. It's a hard life. The analogy I've heard and used before is that it's like you're just drowning every day and you're just trying to tread water. That's a day in the life of a rare disease parent and then during that undiagnosed phase, you're also trying to learn as much medicine or genetics or whatever as you can to hopefully figure out what it is. Then afterwards, it's not genetics. You're trying to figure out, okay, how do I develop a drug? You know, what is a drug? All that sort of stuff, and so it just flips what you're trying to learn when you go from undiagnosed to diagnosed.


Shiv Gaglani: I mentioned in the intro that we recently spoke with Matt Wilsey. I think it was big news for the NGLY1 community with the first IND being submitted. I'd love to hear more about the Precision Medicine Institute work as well as your work at UAB with the AI system you've developed. Can you give us a bit of a sense of how you turned a devastating diagnosis and family experience into something so revolutionary and transformative that has applications well beyond NGLY1?


Dr. Matt Might: Yes. Bertrand's odyssey kind of snowballed beyond just him. It snowballed into finding other NGLY1 patients, and of course, it snowballed into trying to find treatments. But what it really did is it gave me an appreciation for how this was done, along with some realizations that now is sort of the perfect time to have not just computer science in medicine but computer scientists in medicine. I think actually being both a computer scientist in medicine, and doing computer science in medicine has been an advantage. One thing I saw right away was some of the potential for AI to make a difference and so when it comes to finding treatments, I should say I'm agnostic as to approach. Obviously, I'll do the computational stuff, if I can. But the Institute itself also does plenty of stuff in a wet lab, too. So, it's sort of a whatever gets the job done type of approach for us.


In the case of AI, I like it just because it's easy and it's cheap. When patients reach out, we can generally run queries pretty fast. We have developed a pretty extensive AI system at the Precision Medicine Institute at UAB called mediKanren. It's largely funded by the NCATS Consortium, called Translator and through this, it's been able to absorb hundreds of different datasets in a structured fashion through a number of teams that we collaborate with. Then we build automated reasoning on top of that to sort of connect the data points between all these datasets and allow us to make inferences about what could help influence some core mechanism of harm in a disease. That's really what it's all about. 


I should say one of the key datasets here has actually been a natural language processing of the entire medical literature. That one dataset alone is probably responsible for 30% of what we end up recommending to patients. It's kind of shocking when you think about that, because all of that was already known in theory, it's just that no individual physician actually knows all of it and so we can spot stuff that goes unforeseen or just unknown or is very old. Then when we connect dots between different datasets we make inferences. No two people have those data sets in their head at the same time so they couldn't make the connection, but of course, an AI system can.
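
The "connecting the dots" idea described above can be illustrated with a very small sketch. This is not mediKanren's actual code or data model; it only assumes that each dataset can be reduced to (subject, predicate, object) triples, and every drug, gene, and disorder name below is a made-up placeholder. The point is that a candidate only appears when a fact from one dataset is joined with a fact from another.

```python
from collections import defaultdict

# Hypothetical triples standing in for two separately curated datasets.
# Every name here is a toy placeholder, not a real finding.
DRUG_GENE_EDGES = [
    ("drug_A", "increases_activity_of", "GENE_1"),
    ("drug_B", "decreases_activity_of", "GENE_1"),
    ("drug_C", "increases_activity_of", "GENE_2"),
]
GENE_DISEASE_EDGES = [
    ("GENE_1", "reduced_activity_causes", "disorder_X"),
    ("GENE_2", "reduced_activity_causes", "disorder_Y"),
]


def build_index(edges):
    """Index triples by (predicate, object) so subjects can be looked up quickly."""
    index = defaultdict(list)
    for subject, predicate, obj in edges:
        index[(predicate, obj)].append(subject)
    return index


def candidate_drugs(disease):
    """Two-hop inference: disease -> underactive gene -> drug that boosts that gene."""
    gene_index = build_index(GENE_DISEASE_EDGES)
    drug_index = build_index(DRUG_GENE_EDGES)
    results = []
    for gene in gene_index[("reduced_activity_causes", disease)]:
        for drug in drug_index[("increases_activity_of", gene)]:
            results.append((drug, gene))
    return results


print(candidate_drugs("disorder_X"))
```

Neither toy dataset mentions a drug for `disorder_X` on its own; the suggestion only emerges from the join. A real system would also need to weigh evidence quality and contradictory edges, which this sketch ignores.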


Shiv Gaglani: I love that. We could spend so much time talking about these applications. My mentor here at Elsevier is a guy named Jan Herzhoff and he loves talking about connecting the dots and the value you can create for the world by just having the different silos talk to each other. Part of why we joined Elsevier is they are a large publisher of scientific journals and so a big part of what we're trying to do with rare diseases is make them more accessible. We're launching a journal of rare diseases next year and trying to make it more accessible for researchers, patients and others to access these articles. 


So, there's obviously the rare disease applications, but very specifically, what are some applications for mediKanren that you've been most proud of over the past two or three years and where do you see it going over the next couple of years? Because it does seem like everything is getting exponentially better. I think we're going from GPT-3 to GPT-4 very soon and that's like an order of magnitude better than what we've seen.


Dr. Matt Might: I'd say in some ways, it's still very early days in the application of all this technology relative to its long-term potential, but even so, it's already producing some big wins for patients. What's exciting to me is that in terms of the maturity of what it's done -- if we look at some of the first cases we tackled when I got to UAB right after we built the system -- some of them have reached extraordinary milestones. 


For example, there's an autism driven by mutations in a gene called ADNP. One of our first predictions was for that particular disorder. In that case, the ADNP gene is working at sort of 50% strength. The query was, "How do we make it work harder?" You know, "What increases the activity of this gene?" And it actually came back with an interesting answer. It said, "Have you tried low-dose ketamine?" No, obviously. No one's considered that yet. But sure enough, that ended up working.  

We knew it was working in patients probably two years ago, then the patient community got a clinical trial done and the year after that the paper finally got published. So, that's pretty recent. It was just four years from the time the prediction was made to the time that there's the gold standard validation of a publication. We can certainly point that out and say, it's real. It made a prediction no human would have made, it turned into a clinical trial and it worked. That's exciting almost no matter how you look at it, and also the fact that it was in autism, a field with very little pharmacological progress. I think that in and of itself is exciting, too. 


I see that as the tip of the iceberg. We've certainly had other patients where, in some cases, literally just better mastery of the literature helped. Like, that's all it took. The example I like to point to is the case of a girl who had cyclic vomiting syndrome. You know, a long ordeal in the hospital, desperate parents not sure what to do, a medical team that felt they tried everything. When we applied natural language processing to the entire medical literature, there's like 347 possible treatments that have been proposed at different points in time. So all it took was just going down that list and we started finding stuff they hadn't considered. 


In that case, it was nasally inhaled isopropyl alcohol. I mean, it wasn't any more complicated than that - and that worked. It just wasn't well known to anybody on the medical team or to any of us, frankly, but there are three papers out there that say it might actually work. So, sometimes we just find stuff that is sitting out there and it's technically known, but no one knows it.
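
The "going down that list" step in the cyclic vomiting example can be sketched in a few lines. This assumes, hypothetically, that a natural language processing pipeline has already reduced the literature to claims of the form (paper, treatment, relation, condition); the claim format, the `TREATS` label, and all names below are illustrative placeholders rather than the Institute's real pipeline or data.

```python
from collections import defaultdict

# Hypothetical NLP output: (paper_id, treatment, relation, condition).
# All identifiers are placeholders for illustration only.
EXTRACTED_CLAIMS = [
    ("paper_1", "treatment_A", "TREATS", "condition_X"),
    ("paper_2", "treatment_A", "TREATS", "condition_X"),
    ("paper_2", "treatment_B", "TREATS", "condition_X"),
    ("paper_3", "treatment_C", "TREATS", "condition_Y"),
    ("paper_4", "treatment_A", "TREATS", "condition_X"),
    ("paper_4", "treatment_A", "TREATS", "condition_X"),  # repeated mention in one paper
]


def proposed_treatments(condition, already_tried=()):
    """List treatments proposed for a condition, ranked by distinct supporting papers."""
    papers = defaultdict(set)
    for paper_id, treatment, relation, cond in EXTRACTED_CLAIMS:
        if relation == "TREATS" and cond == condition:
            papers[treatment].add(paper_id)  # a set counts each paper only once
    tried = set(already_tried)
    return sorted(
        ((t, len(p)) for t, p in papers.items() if t not in tried),
        key=lambda item: (-item[1], item[0]),  # most papers first, then by name
    )


print(proposed_treatments("condition_X", already_tried=["treatment_A"]))
```

Filtering out what the medical team has already tried leaves the options they may not have considered, including ones backed by only a handful of papers.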


Shiv Gaglani: Those are great examples. Really good. Actually, you preempted one of my questions, which is about one of my friends and former guests on the podcast, David Fajgenbaum. I'm on his advisory board for Every Cure, and actually, it was through Every Cure that I met Tania Simoncelli at the Chan Zuckerberg Initiative who recommended I get in touch with you. She and Tanisha Coates both speak highly of you. Every Cure is about repurposing three thousand drugs. In David’s case, he had Castleman disease. As you know, sirolimus was a common drug that really helped him and about a third of Castleman patients. 


Tell us a bit more about how do you scale out? What do you need to repurpose more drugs or to get more of these discoveries? Is it more human capital, more datasets, more funding, a combination of all of that?


Dr. Matt Might: It's some combination, but more of some things than other things, for sure. There seems to be a desperate lack of funding for clinical trials when it comes to repurposing. Usually it kind of unfolds like this: we get a promising prediction, we work with a clinical team, they try it out, we see something promising in the patient, we go "Okay, now what?" Well, the next step would be to reach out to the larger community to do a trial. And every single time, that becomes one of the big stumbling blocks because of recruiting, you know, putting together and funding a clinical trial that would give us that gold standard validation that we really got it right and this is something that patients really should look at. For me, I think that's the biggest obstacle. 


Certainly, other datasets would make a difference too. We have overleveraged certain datasets. There's one called LINCS and another one called CMAP. These are datasets that have built up large transcriptomic profiles of drugs interacting with cells. These are really valuable for making predictions about repurposing, but they're very sparse relative to the total space of cell types and genes and all the drugs you could really want in these datasets. So, we make a lot of inferences that I wish we didn't have to make where ground truth would make a huge difference.


Shiv Gaglani: Interesting. One overarching question I have -- and one of the benefits of doing this podcast is you get to ask questions like this to experts like yourself -- is just how do we connect these dots?  And if there's something we can follow up on about how maybe Elsevier could be helpful with providing more data, I'd love to touch base on that. 


Going into the clinical trials aspect, though, what are some of the hurdles? I know one is getting enough patients to run a clinical trial, right? When your son was diagnosed with NGLY1 he was the first. Now there are, what, eight hundred patients worldwide, and maybe even fewer? What is it...seventy families on your newsletter?


Dr. Matt Might: Yeah, I think about seventy families that we're in touch with. Depending on sort of the frequency of the alleles...probably five hundred, maybe eight hundred that are probably out there somewhere. We're in contact with a pretty sizable fraction of them, but there's clearly more that haven't been contacted yet.


Shiv Gaglani: Yeah, language barriers, etc.


Dr. Matt Might: At seventy people, you have enough to do a clinical trial sort of just barely, but probably not more than one clinical trial. That in and of itself actually leads to issues, too, when you have more than one candidate you might want to try. That is actually the case in the NGLY1 community. We're fortunate enough that there's more than one potential route to treatment at this point, which is not what every community is facing. Usually, it's the case where there might be a proposed treatment, but nowhere near enough patients to actually do a trial and even if there is, then not enough funding to do the trial. So, there's only a handful of communities that I think have really been able to run all the way because they happen to have that magic combination of numbers and funding and something to do.


Shiv Gaglani: What's an example? Like, one thing that comes to mind is epidermolysis bullosa seems to be a good example. We've had two people on from those communities, and they seem to be pretty far along in their discovery process.


Dr. Matt Might: Yeah, in terms of different communities being at different points, that's a good example. One we're working on right now is the KCNMA1 gain-of-function community. There are two different disorders for the gene, KCNMA1, and we found what seems to be an excellent treatment for the hyperactive or gain-of-function side of this community. There are probably just enough patients to do a clinical trial, but no funds to do it.


In this case, we've reached out to the company that has the patent to see if they're interested, and they aren't.  It was a thoughtful no. It was a very honest no.  It was like, "Well, there's just not enough of these patients to justify a whole new clinical trial." And I get it. Their view is kind of like, "Well, they could just take it off label." I suppose that's what's effectively happening already. But, it'd be nice to actually get good answers and good data to really guide these patients in the use of these medications.


Shiv Gaglani: Yeah, very interesting. One topic I know Matt Wilsey has talked a bunch about is biomarkers. And we had Luke Rosen on the podcast talking about KIF1A syndrome, which his child has, and the need for us to better understand endpoints. An endpoint for a trial that’s valid as defined by the FDA goes beyond what is important to rare disease families, which is that quality of life is improving. You know, something like off-label ketamine that may not justify a current clinical trial, but maybe there needs to be a rare disease clinical trial pathway. Is that sort of kind of what the community needs?


Dr. Matt Might: Yeah, I think when it comes to rare in general, we've got to think differently about clinical trials. I think we have to expand our flexibility in terms of what we'll accept in terms of endpoints. In some sense, we have to look at what endpoints are realistically measurable on the timeframes available and likely enough to produce meaningful effect sizes so that we actually can be somewhat confident that it's doing what we think it's doing.  I think in some sense, even focusing too much on the efficacy side is a big problem. I feel like in a lot of cases we'd be better off focusing on the safety side, and then worrying about the efficacy side through really innovative data science later on after some sort of new sort of limited approval, or through a very innovative trial design of some sort. I think that's probably the way to go.


Shiv Gaglani: That is interesting. And, again, that echoes what some other parents of rare disease children have said, which is clearly they're motivated to make sure any of these medications that they're trying are safe, but they also have this, you know, race against time... that they want to make sure they're getting therapies in the body. So fascinating. 


Turning our attention to precision medicine as a whole, I would love to hear about your experience working in the Obama administration and then moving forward, what are some of the other things that get you most excited about precision medicine?


Dr. Matt Might: I was really fortunate to be able to work for President Obama on the Precision Medicine Initiative really right from the beginning, even helping to co-author one of the original white papers laying out the structure of the initiative. What was great to me was the recognition of its necessity in the first place...this realization that clinical genomes are about to be a thing, and now they are. We're really bad at interpreting them even now because we're still building these datasets. If you sequence a patient who looks like they have a rare genetic disease and you pull out all these mutations, only about a third of the time do we get it right. And by that, I mean we actually find an answer, and the rest of the time we go, "Well, I don't know which of these mutations it might be." So, recognizing seven years ago that this problem was coming, I think, is incredible and the fact that we started building the data set to answer these questions that far back is prescient.


Now, of course, it's not just rare disease patients that need their genomes. We're all getting genomes, or 23andMe, or something like that and we're all wondering, “Well, what does that mutation mean for my health?”  It was exciting to be a part of that effort to build what I call the Rosetta Stone of the human genome so that we would be able to answer these questions in a meaningful way for every patient as they start to encounter their own genetic data.


Shiv Gaglani: It's very exciting. We just had Max Bronstein from the White House Office of Science and Technology Policy on the podcast talking about ARPA-H as a potential funding source for these kinds of innovations. It's definitely a space we want our learners to watch, because the way they practice medicine may completely change. How far off do you think we are from every newborn and every person getting their genome data?  Not just a 23andMe, but a whole genome? I know, Illumina just released a $200 one.


Dr. Matt Might:  Let's be honest. The problem isn't cost. It hasn't been cost for a while. In terms of generating the data it’s just not that expensive anymore. You could conceivably do it for everybody at this point. It's really more on an ethics side and on sort of a, “do we have the bandwidth to tell everybody what their genome means” side? We don't have enough genetic counselors to properly counsel every newborn, and then the ongoing counseling. Your genome stays the same, but the information that we know about your genome does not and so you almost need an update every single year of what we've learned about your genome. I think that's another piece of the puzzle. 


I don't think anybody has a good answer to this yet. What does it look like to sequence a newborn and find out they're probably gonna get Alzheimer's and get it earlier than most? What do you tell those parents? What do you tell that child as he or she grows up? That's the problem. Personally, I would want to know for my kids, and I do know for my kids. I'm sort of an information maximalist as a parent, but we are concerned about the people that don't want to know. I guess they just don't participate. I guess that's the bottom line. But, you know, any parent out there that right now wants their newborn sequenced, there's nothing stopping you. You can do it.


Shiv Gaglani: That's really interesting. I know these are really challenging bioethical issues and we were talking about 23andMe because we had their CEO Anne Wojcicki on the podcast, and we work with 23andMe to try to get primary care providers more trained on direct-to-consumer genetics and genomics in general. There are only five thousand genetic counselors in the country. That's not enough, as you said.


Dr. Matt Might: Nowhere near enough.


Shiv Gaglani: So, we need to get PAs to know this better and other primary care folks, so we're working to upskill them. But even that's changing year to year, and even doing the 23andMe every year, you're getting notifications just like you get on social media where somebody commented on the post you had. Similarly, you’ll get alerts that “now we know this about a mutation you have” or “we discovered a long-lost family member” or something like that.


Dr. Matt Might: So yeah, 23andMe has a good model. But what they know is only a fraction of what they can safely say because they have this very clear process for doing it very carefully. Their threshold has got to be set very high too so that they can be very confident that they're telling you the right thing. I know this because, of course, I've got my raw 23andMe data. I've got my whole genome. I can compare what my whole genome tells me versus what the 23andMe data tells me versus what the site actually tells me and these are three very different things. But what I will say is the raw 23andMe data does cover most of the really interesting stuff in the whole genome. It didn't miss anything really interesting as far as I can tell. So, the SNPs that were picked were picked well. I can say that. But when you compare what's in your raw 23andMe data versus what's in your reports, it's so far apart. But again, it's because that level of caution is critical, and how to do it thoughtfully, how to return to it appropriately. I respect the challenge, it's a big one.


Shiv Gaglani: Yeah, totally. Putting on your cybersecurity hat, which is why we're talking in the first place, what got you interested in this entire space? There's the ethical aspect, but what about just the cybersecurity privacy aspects of keeping a whole genome safe? Is that overplayed? Or are we getting into a sharing economy where everyone should know this? Certainly, I think there'll be implications for speeding up drug development, right?


Dr. Matt Might: Yeah, I think it's one of those situations where we're not quite sure what the security risks of posting your genome publicly are. George Church is piloting this, where he's having people post their genome publicly just to see what happens.  It's worth looking at the consent for that study. It's fascinating, because they had to brainstorm all these hypothetical risks of what could happen to you if you post your genome publicly. For example -- this is part of the process -- they'll say technically, someday in the future, somebody might use your publicly posted genome to clone your blood and plant it in a crime scene, just so you know. It's like, be aware that once this is out there, that could happen. It's just fascinating to think about all the things that could happen someday, even if they're not yet possible right now, by having your genome out there publicly.


Shiv Gaglani: Yeah.  There’s a thin line between the utopia that we're going towards, and the potential dystopia. Technology is not good or bad. It's how people apply it, really.


Dr. Matt Might: Right. What's interesting is the controversy around the use of these genomics datasets to catch serial killers. Arguably, it's a good thing to catch serial killers, right? We're all happy when that happens. But we can also admit it was a little creepy how it was done.


Shiv Gaglani: Yeah, you're right. Black Mirror. Interesting, very interesting. I had two other questions for you. The first is, you've had a very interesting career at the intersection of computer science and medicine, and you're still getting started in many ways. What advice would you give to our audience about approaching their careers?


Dr. Matt Might: I guess if I could redo my own career, I would have stayed in computer science but I would have learned the biology earlier than I was forced to by Bertrand's odyssey. So, I think, for anybody out there right now, if you're already in sort of a biological track, don't despair. There's plenty of time and plenty of resources for learning the computational side, and the more you do, the more opportunities are available to you. I work with students going in both directions -- students that are computational in nature that want to become more biological, or students who are biological and want to become more computational. You can move in either direction. But that fusion is very powerful.


Shiv Gaglani: Agree. Yeah, that's great, and hopefully people will look up UAB and your institute and see what they can do from there. Is there anything else you want to get across to our audience that we haven't yet talked about?


Dr. Matt Might: I think we've covered a lot. Precision medicine is obviously a vast space. AI and repurposing are one small part of it, but a very important part of it. What I tell folks is if it seems like repurposing isn't the right fit for you, don't worry. There are other routes you can take. It might be gene therapy, it might be gene editing...there are all these other modalities emerging all the time. Exploring them all is what I would say when you're thinking about how to apply precision medicine at the level of an individual patient or a small patient group.


Shiv Gaglani: That's great advice and I'm hoping that people will take it to heart because this truly will be how people practice medicine in the next, not just twenty years, but maybe ten, maybe even five and in many ways. So, Dr. Might, thank you so much for not only taking the time to be with us on the podcast, but more importantly for the work that you're doing to bring this future to reality.


Dr. Matt Might: Oh, well, absolutely. Thank you for having me.


Shiv Gaglani: And with that, I'm Shiv Gaglani. Thank you to our audience for checking out today's show, and remember to do your part to raise the line and strengthen our healthcare system. We're all in this together. Take care.