Lecture: Artificial Intelligence in Ophthalmology

During this live webinar, a panel of global experts presents new developments in ophthalmic artificial intelligence (AI). Topics covered include best practices for data collection, regulatory hurdles in various global markets, and specific proposals for integrating AI into clinical practice. The session concludes with a panel discussion, including audience questions and answers, to enhance the learning experience.

Moderator: Dr. Malik Y. Kahook

Lecturers: Gabriella Lanouette, Dr. Jayashree Kalpathy-Cramer, Dr. Pearse Keane, Dr. Chris Leung, Dr. Ciku (Wanjiku) Mathenge & Dr. Michael Abramoff

Transcript

DR KAHOOK: Good morning, everybody. I am really excited about today and this Cybersight webinar that is focused on artificial intelligence in ophthalmology. I’m gonna share a couple of slides as we start, because I have some groundwork to cover. I know many of you that are joining us today have participated as observers, and some of you have been attendees in lectures on this Cybersight webinar platform, so you’re familiar with the way that we do things. But I did want to cover some of the basics that will allow us to enjoy this as much as possible before we get started. I’m Mal Kahook, professor of ophthalmology at the University of Colorado. I’m an artificial intelligence novice, so I’m coming into this as a student, super excited about the panel that we have and about the potential to learn from everybody. Gonna advance here. So before we get started with the introductions, I just wanted to cover some stuff that you can take with you after the session. Most of you have been to Orbis.org as well as Cybersight.org. If you haven’t, I think it’s time for you to go and see all of the information that you can learn from. You see the screenshot that I have on the right hand side, with access to the library as well as multiple classes that teach you everything from ROP to glaucoma to some of the basics of the medications and lasers that we use in ophthalmology. Just a wealth of free content. So please sign up, and if you’re a teacher with residents and fellows, please have them sign up and participate as much as possible. I also want to thank at the top here Lawrence and Jonathan, the magic behind Cybersight. Lawrence is the one who is running things behind the scenes here. They’re working very hard for all the sessions, so I would like to thank them for the work they’re putting into it. And thank all of you. It’s not easy to attend these sessions, especially at different times around the world. For me, it’s early morning. For many of you, it’s much later in the day and perhaps getting towards the end of your day, so thank you for that. We’re always trying to improve these sessions. You can go through Orbis or Cybersight to give feedback, or contact me on Twitter. That’s probably the best way to say what you liked or didn’t like. You can see me and Gabriella arguing over things on Twitter, which I’m gonna start doing pretty soon now that I know her handle. You can ask questions as we’re going along, about the journey of learning about artificial intelligence, and I would love to learn from everybody. So let’s talk a little bit about who we’re going to hear from today. We have attendees from all over the world, in different time zones, and we also have speakers from all over the world, in different time zones. In no particular order, I’m gonna read off some of the bios here, which I think are extremely important to understand the depth of the field that we have speaking to us today. The first is Pearse Keane, professor of medical artificial intelligence at the UCL Institute of Ophthalmology and a consultant ophthalmologist at Moorfields. He is originally from Ireland and received his medical degree from University College Dublin, graduating in 2002. In 2016 he initiated a formal collaboration between Moorfields and Google DeepMind, and that’s when a lot of us got to know Pearse and the work he was doing. It was a shift in the way we were doing things.
He developed an early warning system for age-related macular degeneration, has done work in other areas of ophthalmology, and is really leading the effort to change the way we practice ophthalmology in clinic. So welcome, Pearse. Michael Abramoff is probably known to the majority of people who have dabbled in artificial intelligence. He’s a computer scientist and entrepreneur, a professor of ophthalmology at the University of Iowa with a joint appointment in the College of Engineering, and chair of Digital Diagnostics, the first autonomous AI diagnostics company in any field of medicine to get clearance. He is an academician, has been cited thousands of times, and has multiple issued patents. Chris Leung is someone I was talking to before we started the live session. I haven’t seen him for a long time, and I’m very excited to see him; I have learned a lot from him with his expertise in OCT. He’s a clinician scientist with a focus on diagnostics, has his master’s from Imperial College London, obtained his medical training and doctoral education from the Chinese University of Hong Kong, and completed his clinical and research fellowship at the Hamilton Glaucoma Center. He is playing a key role in advancing a lot of the new concepts that will allow us to access early diagnosis of glaucoma and detection of other ophthalmic diseases. He’s certainly a pioneer in using AI in glaucoma, and that’s what he’ll be lecturing on today. Gabriella Lanouette is a full stack machine learning engineer working for Orbis, an insider who is giving the first lecture today to talk about what Orbis is doing. She’s a part of Cybersight and develops algorithms for Cybersight AI tools, which can detect common eye diseases in mere seconds. She received her undergraduate degree from the University of Connecticut and holds a postgraduate degree in machine learning and artificial intelligence from the University of British Columbia. Dr. Mathenge is a researcher at the Rwanda Institute of Technology and professor of ophthalmology there. She conducts research and trains ophthalmic workers for the first ophthalmology residency program in Rwanda, has her master’s from University College London, was the first researcher on RAAB (Rapid Assessment of Avoidable Blindness) in Rwanda, was awarded the first research fellowship in 2006, and has conducted extensive research in retina. I’m trying to drag her more towards glaucoma with some of the work we do, but I’m really excited to have her and her expertise. She’ll be talking about one of the research projects she did, one that I was fortunate to participate in as well through one of her recent manuscripts. And last but not least is my colleague here, Jayashree Kalpathy-Cramer, recently appointed to the division of artificial intelligence in ophthalmology at the University of Colorado School of Medicine. Her research interests include artificial intelligence in ophthalmology, radiology, and oncology. Before this, she was at Harvard, co-director of the QTIM lab. She’s an electrical engineer by training, worked in the semiconductor industry for several years, decided that she was being paid too well, so she went into academia, and is now focused on the applications of machine learning and modeling in health care. She’s very passionate about machine learning, she’s been a great teacher, and now that she’s faculty I’m excited to learn from her every day as we interact on research projects. So the way that this is gonna work is Gabriella will start putting up her slide here. She’ll share… I’m gonna stop sharing mine so Gabriella can get on board.
And we’re gonna start with the first talk, which will cover the first handful of minutes as an introduction to Cybersight AI, and then I’ll give brief introductions to each talk as we go down the list. After the talks, we’ll go into a Q and A session. You can send in questions as we go along, so please do that. The speakers or panelists will answer your questions, or we might save some of them for the live Q and A. So Gabriella, I’m gonna hand it off to you, and I look forward to hearing your talk.

MS LANOUETTE: Thank you, Dr. Kahook, and thank you all for having me. This is really a humbling experience. And pretty exciting. I’m excited to hear the rest of my panelists. So today I’m gonna talk about the Orbis AI ecosystem. As Dr. Kahook mentioned, I’m a full stack machine learning engineer at Orbis. So this all really starts with Cybersight and our goals to build a global audience of health care professionals. These are digital services with global reach, and a big emphasis on global, and we do this by upskilling local health workers through free access to telemedicine and learning capabilities. We want to lower the barrier of entry to our services: as long as you have a device with an internet connection, you should be able to get access to the same learning material that you would in the best medical training program. That’s a picture of what Cybersight looks like. Next, the funnel model. This is for eye care specifically, and that applies to us, but it does apply to many areas in health care. Capacity is very important, and we want to give everyone the opportunity to be checked. But this journey from screening to treatment runs into capacity overload, and there is not much expertise. Screening is where the most pressure lands on the system: many skilled workers are required in many locations, and the systems can’t cope. This leads to fewer patients being assessed and treated. And AI can help with both of these issues. So this is a normal screening workflow. We start with patient imaging, which is then graded by a human grader on the ground, and then reported. And if everything is good, we circle back, and this leads to referrals and treatments. But in low resource environments, there’s a major shortage in grading, which then leads to delays and missed opportunities in reporting, which leads to misreferrals and mistreatments. And because there are 20 times fewer ophthalmologists per capita in low and middle income countries, this is a big overload. So assistive AI can increase the capacity. This does tackle the capacity issue, and provides the confidence needed to participate in the clinical workflow. And this can be done with the automated interpretation of retinal images using Deep Learning. This includes referral recommendations, disease grading, anomaly detection, and lesion visualization. And this is actually a fundus image that you would get returned from Cybersight AI, as you can see. There’s the vertical cup to disc ratio, biomarkers, hemorrhages, exudates. You would get a DR grading score for diabetic retinopathy, a glaucoma score, and many other disease markers. So Cybersight provides… Oh, sorry. Having an issue with my… Can you hear me still?

DR KAHOOK: We can hear you.

MS LANOUETTE: Sorry. So Cybersight provides this additional context in a telemedicine, human-in-the-loop workflow. It’s a standalone tool that’s integrated into existing screening workflows, and it’s free of charge in low and middle income environments. So is AI useful when implemented on the ground? A recent study that we ran, led by Ciku in Rwanda, showed that patients were more likely to adhere to referral recommendations when receiving instant AI results, and Ciku will speak more on this in her talk. So thank you for listening to my talk. Upload AI-only cases today at Cybersight.org, and if you’re interested in using our AI tools or collaborating, get in touch at [email protected].

DR KAHOOK: Thank you so much, Gabriella. That’s great. One thing that I should have said at the top, which you just mentioned, is the fact that Cybersight does have this collaborative consult service. That can be peer-to-peer, as well as using the AI system. So I encourage everybody to go visit that. And we’re gonna move on to the second talk, and that’s gonna be Jayashree Kalpathy-Cramer. She’s going to talk about the dos and the don’ts of data collection. So we’re trying to avoid bad info in, bad info out. And Jayashree is going to talk to us a little bit about the ecosystem, starting early AI projects, and how best to collect data and deal with data in AI. So Jayashree, you’re up.

DR KALPATHY-CRAMER: Thank you for the introduction, and for the opportunity to be here this morning. Or whatever time of the day it is. I’m gonna share my screen and hopefully… So as was mentioned, I’m gonna talk a little about the data. As I think most people here are aware, modern AI techniques especially really, really benefit from having lots of data, and are really only as good as the input data that goes into the model building process. As any data scientist will tell you, curating and annotating the data is one of the least enjoyable tasks and takes the most time. For most projects, the data curation can take up to 80% of the time frame, while the fun part of it, the model building, is very, very small in terms of the time and resources that go into building these systems. So machine learning is typically described as a system that gets better at a task through experience. That means the more data it sees when you’re building an AI model, the better the model gets. If the data are more diverse, the model again is more robust. The key ingredient for building machine learning systems is, again, data: it’s what goes into the system. It can be tabular data like a spreadsheet, it can be images, it can be a combination of images and demographics. A variety of data inputs can go into your machine learning model. And hopefully at the end of it what we get is something that transforms the input, whether images alone or other things, into an output. And there are many other considerations, in terms of what kind of algorithm you use, how you measure error, and so on. Typically, we talk about supervised and unsupervised learning. There are many types, but these are two commonly described kinds of machine learning. In supervised learning, which is what we typically tend to see published, we need lots of labeled examples. So along with the data, you need some labels. For instance, if you’re creating a glaucoma diagnosis model, you might need the images plus the output diagnosis. If you’re looking for ROP or things like that, again, we have data and we have labels. In unsupervised learning, we are looking for clusters in the data, and we don’t need the ground truth labels. These are things to think about when you’re starting to build a model, in terms of how you curate a dataset. There are many kinds of models people build. Deep Learning is common today, but support vector machines and naive Bayes have previously been common and might come back into fashion again. Another thing to think about, during the data collection and curation process, is what the model will output. These are the kinds of models we see most often. For instance, we have classification models. This can be for any disease. In this case, we do a lot of work in ROP. For instance, we might be interested in understanding if this particular image from this baby represents plus disease or not. And very often these classifications tend to be binary. Is the disease present or absent? And again, when you’re thinking about the data you’re collecting and the labels that you need, that’s the kind of information that you need. Along with this image, do you have a diagnosis? Sometimes we might want to do segmentation. And in segmentation, the goal is to delineate the region of interest. So in the case here, we might want to delineate the vasculature, so we have a segmentation model for the vessels. In other cases, we might want to delineate drusen or other characteristics from the image.
Detection is similar to that, but very often in detection models, you use a bounding box to essentially highlight the area of concern. So this is again an example of a detection model, with hemorrhages and different things outlined. You can also have regression models, and these are often used when you’re looking at severity of disease. So if you have normal, mild, moderate, severe, you might use ordinal regression, or if you have a numeric value, you might use other kinds of regression models. And so the annotation effort and the curation really depend a lot on the task. At what level are we doing the annotation? Is it done on the patient level, meaning do we just know, for the entire set of data associated with that patient, if it’s positive or negative? If we have multiple images from that patient (for instance, fundus, OCT, something else), are you doing it on the image level or the patient level? If you have 3D data, are you doing it on a frame or slice? And are you doing it on a pixel or voxel level? Depending on that, the effort that goes into the annotation and creation of the dataset can be quite, quite different. If you want to do segmentation annotation, it takes a long time. On the other hand, if you’re building a model that’s a segmentation model, the size of the dataset that you require can be much smaller. So there are definitely trade-offs between the amount of effort that you’re putting into creating this dataset and the number of images that you might need for it. Where are you getting the ground truth from? Are you getting it from the patient record? Is somebody prospectively annotating it? So that’s another question to think about: where the annotation is coming from. How are you doing it? Are you parsing the EHR to get the ground truth? Are you having somebody do it? How many cases do you need? One issue is that the ground truth can be quite subjective. We have a lot of variability between raters — this is for ROP. Asking eight experts to grade the same set of images as normal, pre-plus, or plus, you see quite a lot of variability for the same set of images. So as you’re creating the data, something to think about is what the consensus is and how you create the ground truth. You can get it from the EHR. This is a typical workflow. You get the data; you might have it from the EHR or might have somebody do a prospective evaluation. You typically preprocess the images, which might include normalization, cropping, and augmentation, and then you build the model. There are many different ways to do it: what architecture and what type of parameters are you using, or are you using Auto-ML to do it automatically? And the evaluation piece is really, really important. What is the ground truth? What is the metric? Are you looking at an independent dataset? In the case of ROP, for instance, we have a three-level system for plus disease. In the case of our data curation, we have the spreadsheet we got with this kind of information. We also wanted to then have multiple raters annotate, and eventually we get something like the plus or normal categories. We then do some preprocessing: we might normalize images, making it such that images coming from different cameras look more similar. We might crop out everything but the area of interest in ROP. Because we’re looking at vascular aspects of the image, we might do segmentation, might flip, might rotate, and so on.
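As a minimal sketch of how the pipeline just described can look in code, here is an illustrative supervised setup, assuming PyTorch and torchvision; the crop and image sizes, the resnet18 backbone, and the random batch standing in for real fundus images are all assumptions for the sketch, not details from the talk.

```python
import torch
import torch.nn as nn
import torchvision.transforms as T
from torchvision.models import resnet18

# Preprocessing + augmentation, as described above: crop to the region
# of interest, harmonize sizes across cameras, then flip/rotate.
train_tf = T.Compose([
    T.CenterCrop(448),
    T.Resize(224),
    T.RandomHorizontalFlip(),
    T.RandomRotation(15),
    T.ToTensor(),
])

# A binary "disease present/absent" classifier: a standard backbone
# with a two-class output head.
model = resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, 2)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step; random tensors stand in for a batch
# of preprocessed images with image-level ground-truth labels.
images = torch.randn(4, 3, 224, 224)
labels = torch.tensor([0, 1, 1, 0])

optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```

The same skeleton extends to the other task types mentioned above: swap the head and loss for ordinal regression, or swap the backbone for a U-Net-style network if the label is a pixel-level segmentation mask.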
And there are many frameworks that are Open Source and publicly available that you can use for your model building. Typically classification models are evaluated using an ROC curve, and you can again look at the performance of your model as well as the performance of individual raters. And this is something we see typically presented as well, again highlighting the issues that we have with the range of variability. So if you look at the literature, there are hundreds of thousands of publications in ophthalmology, and it’s getting to the point that it’s quite straightforward to build these models if you have the data. But some of the cautionary tales we’ve seen, both from ophthalmology and other areas like radiology, et cetera, are that these models can be brittle. They are not self-aware. They’re not explainable. They can be biased or not fair. They don’t generalize well. So if you build a model in one institution and give it to your colleagues across the pond or across the country, it might just not work well. There are a lot of data heterogeneity issues. If you have different populations, say we build a model in the US and take it to India, there are different camera systems with different fields of view and different protocols. Suddenly we’re expecting to see posterior pole images and we see this image on the left here. It really might mess up our model. Image quality is something that needs to be considered carefully, as a precursor to the classification model. And different biologies and different populations might affect the results. In COVID, for instance, they found that a lot of the models were not useful. We saw hundreds and hundreds of models published, and one of the biggest problems was these Frankenstein datasets. Really the problem is the data going into building these models. The model building part is easy, and it might seem like it’s working, but if the data going in is not good, it really can cause lots of problems. One of the big issues is what we call shortcut learning. The networks are sort of lazy, and they pick up spurious signals that might be things like scars in images, or markers: things that are not really what you’re looking for, but are associated with the output that you’re interested in. And this can really mess up your system. Different scanner types might look different, and you might have quite a bit of variability there, in such a way that if you train on one scanner and test on another scanner, it may not work. So if you train and test on the same imaging device, it looks great. If you train your model on one device and test it on another device, it may not work at all. And so thinking about how we create these nice, beautifully centralized datasets is something that we should be thinking about, but there are many, many challenges with central databases, because of privacy and so on, and international regulations. Again, it can be quite challenging to do that. So federated learning approaches can be useful there. AI can be biased, and so we need to think about that. And then again, to reemphasize, we really, really need data from all over the world in order to build models that are useful everywhere. So to summarize the dos and don’ts: please think carefully about building diverse datasets, and diversity refers to demographics, disease presentation, institutions, and devices. In terms of creating the annotations, to the extent that you can, multiple raters are important. We need to really check for bias and confounding in the data.
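A minimal sketch of the multiple-rater point above: measuring pairwise agreement between raters and taking a simple majority vote as the consensus ground truth. This assumes scikit-learn; the three raters and their labels are illustrative, not data from the ROP study mentioned.

```python
from collections import Counter
from sklearn.metrics import cohen_kappa_score

# Rows are images; columns are three raters grading the same image.
ratings = [
    ["normal",   "normal", "pre-plus"],
    ["plus",     "plus",   "plus"],
    ["pre-plus", "plus",   "pre-plus"],
    ["normal",   "normal", "normal"],
]

# Chance-corrected agreement between two raters
# (1.0 = perfect agreement, 0.0 = no better than chance).
rater_a = [row[0] for row in ratings]
rater_b = [row[1] for row in ratings]
print("kappa, rater A vs B:", cohen_kappa_score(rater_a, rater_b))

# Majority vote across raters as one simple consensus ground truth.
consensus = [Counter(row).most_common(1)[0][0] for row in ratings]
print("consensus labels:", consensus)
```

Reporting agreement alongside the consensus keeps the subjectivity of the ground truth visible, rather than hiding it inside a single label column.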
It’s very easy to accidentally leak data between your test dataset and your training dataset and get an inflated picture of performance. So thinking carefully about how not to leak data is useful, and the data needs to be representative. So to summarize: garbage in, garbage out, at some level, and good data really is the key to creating good, robust models.
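One common source of the leakage just described is splitting at the image level when a patient has contributed several images. A minimal sketch of a patient-level split, assuming scikit-learn; the toy image names, labels, and patient IDs are illustrative.

```python
from sklearn.model_selection import GroupShuffleSplit

# Toy data: six images from three patients, two images per patient.
images   = ["img1", "img2", "img3", "img4", "img5", "img6"]
labels   = [0, 0, 1, 1, 0, 1]
patients = ["A", "A", "B", "B", "C", "C"]

# Split on patients, not images, so that no patient's images end up
# on both sides of the train/test boundary.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.33, random_state=0)
train_idx, test_idx = next(splitter.split(images, labels, groups=patients))

print("train:", [images[i] for i in train_idx])
print("test: ", [images[i] for i in test_idx])
```

A naive image-level split could put img3 in training and img4 (the same patient) in testing, letting the model score well by recognizing the patient rather than the disease.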

DR KAHOOK: That’s great. Thank you so much, Jayashree. I think you’ve illustrated a lot of what has happened in the past, but also the struggle of how to get to where we’re aiming to be in the future. That’s what Pearse is gonna talk about. So let’s hand over the space for sharing to Pearse Keane, who is going to talk about concept to reality: how do we get from where we are today to getting closer to the clinic? And we can go over this in the Q and A session, because I think a lot of us will have questions about it. But Pearse, take it away. Tell us how we’re gonna get there.

DR KEANE: Thank you for the nice introduction. And let me say thank you for inviting me to this event. This is just wonderful. And it’s so great to see people from… I have already counted Nigeria, Bangladesh, Argentina, Romania. And probably many, many others. So I’m gonna speak about going from concept to clinical care. Or maybe the way that I would like to think of that is going from idea to algorithm and going from code to clinic. In the bottom right of this slide, I’ve included my Twitter handle. So if anyone would like to follow me on Twitter, I’m always looking for more followers, or would like to get in touch with me, they can get in touch that way. And on the last slide I’m going to give my email address, so if anyone has any questions or comments at all, I always love to receive those. So I guess I’ve already been introduced at the start of the session, but just to repeat some of that, I’m a consultant ophthalmologist at Moorfields Eye Hospital in London and I specialize in the treatment of retinal disease. And I’m also a professor of medical artificial intelligence at University College London. Although I’m not a computer scientist or engineer, I’ve come to find myself leading a multidisciplinary clinical research group that aims to develop and apply artificial intelligence in health care. One of the things we believe is that ophthalmology is one of the medical specialties at the forefront of this research. I’m supported by UK Research and Innovation, and these are my financial disclosures. As Dr. Kahook mentioned at the start, back in July 2016, I initiated a research collaboration between Moorfields Eye Hospital, where I work, and the artificial intelligence company DeepMind. DeepMind is one of the world’s leading AI companies, founded in 2010 in London, and the nice thing from my perspective is that two of the three co-founders are from London and graduates of University College London, the university affiliated with Moorfields. In 2014, they were acquired by Google. The best thing from my perspective is that they’re based in the King’s Cross area of London. For those who know London, that’s just two stops away from Moorfields on the London Underground. Now, we started this collaboration back in 2016, and the idea was to work together to develop a Deep Learning system that could evaluate OCT scans of the macula and could be used by people without expertise to triage the scans, to identify those patients with the most sight threatening disease and get them in front of a retina specialist as soon as possible. We worked hard for a couple of years, and in August 2018, it was very exciting to publish the first results of the collaboration in Nature Medicine. We described a Deep Learning system able to assess these macular OCT scans and evaluate them for more than 50 different retinal diseases, with a performance on par with some world leading experts at Moorfields for the triage of those conditions. It was also very exciting that this was featured on the cover of Nature Medicine. And this being AI, there’s a lot of hype around it, so it was very exciting for us that we got global media attention. Of course, that was very exciting, because I’ve done a lot of research before, which I’ve really put my heart and my soul into, that probably was not read by very many people. So it was nice to get a lot more attention on the research. At the same time, though, it’s also a little bit awkward, because there is a huge amount of hype around AI.
The fact of the matter is, although we think this has great potential, we haven’t actually saved the sight of millions of people. Yet. Now, let me just show you the algorithm in action. So this is running on a macular OCT scan from a patient in their 40s at Moorfields with very poorly controlled diabetes. Essentially the AI system consists of two neural networks. The first neural network does segmentation, which Jayashree talked about nicely in her previous lecture; essentially it means it delineates ten or fifteen anatomic features on the OCT scan. In fact, on the bottom right there, you can see all the different anatomic features. So in this case, you don’t have to be a retinal specialist to see this patient has diabetic macular edema. It’s delineating all the fluid on the exam. The segmentation feeds into a second neural network, a classification network, and you can see the outputs of this on the top left. In this case, the classification network is recommending a semi-urgent referral to see a retina specialist and it’s giving a diagnosis probability across different classes. It’s saying macular edema in this case. For example, this could be used in the context of some sort of community diabetic retinopathy screening program to help identify people with macular edema, to prioritize getting them seen urgently by a retina specialist. I’m here to talk about going from code to clinic. So the question is: that was 2018, and now it’s 2022. Where do things stand with this system, and are you in fact saving the sight of millions of people with it? I want to tell you a little story about a friend of mine, Ken Cukier, a senior editor at The Economist. Basically Ken contacted me in March of 2021 in a little bit of a panic, because he had suddenly lost vision in his left eye. I think we were in a COVID lockdown at that time in the UK, and so he wanted some advice about how he could get that sorted. I was able to direct him to go in to Moorfields, get seen in the emergency department, and get sorted out. But one of the things was that when he was in there, because he had written about our work in artificial intelligence before, he was really curious to see it in action as he was going through the system. So he was asking the receptionist: Where is the AI system? Asking the nurses: When am I gonna have the AI running on me? Everybody was looking at him like… What is this guy talking about? Because in fact, the AI system that we developed isn’t yet being used in the real world. And so a few days later, he sent this tweet where he kind of gently teased us a little bit, where he said: AI is a fraud! I went to Moorfields for emergency eye surgery and instead of DeepMind’s cool algorithms diagnosing me and robot surgeons, I had to accept the excellent, skilled, thoughtful care of human medical staff. What a disappointment! Now, as it happens, the condition that he had is not something that would be diagnosed with an OCT, and therefore the AI system that we developed for OCT wouldn’t really be of any benefit for him. But nevertheless, he makes a good point that the system he had written about is not actually in the real world. And so he very graciously invited me on his podcast to talk a little bit more about these real world challenges, and I’ll touch on them briefly now. So one of the key challenges is around the technical maturity of these algorithms.
And one of the things that has really struck me in the last few years, as a clinician, is that the piece of code that typically accompanies a research paper in a journal is a piece of experimental code. It’s nowhere near a piece of software that could be deployed at scale for human benefit. The other aspect is that these systems, while they’re very exciting, have plenty of ways in which they can go wrong, so they really require robust clinical validation. It’s one thing to show good results, as we had done, on an in silico retrospective dataset; it’s another thing to do clinical validation. That’s an important component. And the other part is that you need regulatory approval in most countries for these algorithms to be used. That is something that requires considerable expertise, effort, and resources. But those are well defined challenges. Some of the really gnarly challenges are around: how can these AI models actually be integrated and delivered in current health care systems? What are the business models, the economic models, around the use of these systems? Will they make business and commercial sense? And will health care professionals and doctors, et cetera, actually implement and adopt these systems? In fact, we’re still very, very early in AI enabled health care, in beginning to answer these questions, not just in ophthalmology but across other specialties also. And so actually, I was reading a biography of Thomas Edison, and it occurred to me that there’s a lot we can learn from his story and a lot we can learn about the lightbulb. So this was an article that I was reading, related to that: When Edison lit up Manhattan. Essentially it says: On September 4, 1882, the electrical age began. That was the day that Thomas Edison’s company flipped the switch on his power station on Pearl Street in New York, providing electricity to homes at a price comparable to gas. And it was fascinating to me to read about this story, because actually, the lightbulb had been invented by 20 or 30 different people over the previous 25 years. But the true beginning of the electricity age was when Thomas Edison could bring things together: reliable sources of electricity, a distribution system for that electricity, a connection system to light bulbs within the home, and then some sort of meter to be able to quantify the usage. That was the true beginning of the electricity age. And so what I think is that for us to go from code to clinic, we really need to have a network of innovations. It’s not just an AI model, but the other systems that it can be embedded within. So we’re really trying to think about that in the way that we implement our own systems. At Moorfields, we’re building a new hospital, and we’re already trying to think about: what are the new pathways of care into which we can deploy AI systems in the future? And how can we have the flexibility for new and as yet unimagined deployments? But really for me, the exciting potential of AI is: can we use it to bring specialized expertise out of hospitals into the community and potentially into the homes of patients in the future? I think we have a real opportunity in that regard in ophthalmology. Certainly in the United Kingdom, there’s a lot of potential because, in community optometry settings, we now have OCT being used as a standard part of almost all eye tests.
The big chains like Specsavers, Vision Express, and Costco, as well as independent optometry practices, are deploying these systems. We’re beginning to put in place the infrastructure, the telemedicine, to link those places with the hospital. So I think that will allow us to bring together this network of innovations. So just to conclude, then, this has been very much a whistlestop tour of the potential of some of the work that we’re doing in AI and health care. I do believe it has the potential to transform ophthalmology. We’re still in the early stages. And if anyone would like to get in touch with me, as promised, that’s my email address. Thank you very much.

DR KAHOOK: That’s great. Thank you so much. And I’ll be emailing you after this session with questions as well, along with, I’m sure, a lot of other people. So thank you for getting into some of the basics of the reality of where we are today, versus what not just the community at large is expecting, but also what physicians and colleagues are expecting. Hey, when are we using AI? Are you using AI in your clinic? I think I asked you that when you were visiting here a few months ago. So we’ll get into that in a bit and talk about diseases like glaucoma and retina, and we’ll start it off with a discussion around glaucoma from Chris Leung, who will specifically address AI applications in glaucoma. So Chris, take it away.

DR LEUNG: (sound garbled) this webinar. I come from the University of Hong Kong. And in today’s talk, I’d like to cover the opportunities, challenges, and unmet needs regarding the application of AI in glaucoma. The diagnostic performance of AI for the detection of glaucoma has been investigated using fundus photographs, OCT scans, and visual fields. In a systematic review and meta-analysis from last year, including 503 AI studies in medical imaging across different specialties in medicine, 82 of them pertained to ophthalmology, among which 22 were in the space of glaucoma; 17 studies had fundus photographs as the input, and five studies had OCT as the input. The conclusions were encouraging, because we see very high diagnostic performance, with an average area under the receiver operating characteristic curve of 0.93 for fundus photograph studies and 0.96 for OCT studies. These studies show huge opportunities for AI in glaucoma care. One potential application is to aid non-specialists in detecting glaucoma. AI is also useful to help quantify optic nerve (audio drop) when conventional algorithms fail to work. Here is an OCT scan we obtained from a PLEX Elite swept source OCT, and we’re looking at the retinal nerve fiber layer thickness distribution. We see the retinal nerve fiber layer from the temporal side (audio drop), but you can also appreciate that there are lots of artifacts: motion artifacts and segmentation errors, which are actually quite common in wide-field scans. Here we are applying Deep Learning segmentation, and we’re able to quantify the optic nerve head structure and the retinal nerve fiber layer much better with the Deep Learning algorithm. So not only is AI able to help detect glaucoma, it can also help when the conventional software algorithms fail to quantify the structures. Telemedicine plays a major role in health care delivery in the era of COVID. With increasing opportunities for home monitoring of health data, AI can actually help revolutionize how visual fields and OCT can be monitored. Glaucoma screening is another potential application of AI. It’s an important topic that I don’t think I have time to cover in detail. So what are the challenges in applying AI in glaucoma care? Although AI algorithms have a high diagnostic accuracy in glaucoma detection, level I evidence supporting the applications in glaucoma care is lacking. One major challenge we often see when we interpret the OCT AI or fundus photograph AI studies is the inconsistent definition of glaucoma. In 2016, the World Glaucoma Association consensus meeting, a group of glaucoma specialists, discussed how to diagnose glaucoma, and they defined glaucoma as a disease whose diagnosis is predicated on the detection of a thinned retinal nerve fiber layer and a narrowed neuroretinal rim. But in this consensus meeting, we didn’t discuss how thin is a thin retinal nerve fiber layer and how narrow is a narrowed neuroretinal rim. So that actually makes the defining features of glaucoma difficult to set in many studies. And here we see slides summarizing three AI studies using fundus photographs as input, and what we see here is that they use different features, different criteria, to define glaucoma. Some of them define glaucoma as having a vertical cup to disc ratio of more than 0.9, and you can see some of them mention using rim width and some of them mention using RNFL defects as defining criteria.
Likewise, for the AI OCT studies, the defining criteria are also not consistent. Here are three studies using three types of definition of glaucoma in defining the ground truth — some of them use visual fields, some of them use cupping and optic disc features. And it’s really hard to get a set of well defined criteria across these studies. Another challenge here is grader variability. A lot of the studies actually rely on fundus photographs to determine the presence of glaucoma, but we know interobserver agreement in determining glaucoma damage was only moderate, even among glaucoma experts. So this huge interobserver variability actually leads to the question of whether the labeling of the dataset, in studies that used thousands or tens of thousands of fundus photograph images, was graded reliably and consistently. That is always a big question mark. And even if we are able to come to a consensus (for example, in this picture, many of us would agree the cup to disc ratio is pretty big), consider the definitions in some of these studies. We have two fundus photograph AI studies in which glaucoma was defined as any one of a list of features, including a vertical cup to disc ratio bigger than 0.9 in one study, and bigger than 0.95 in the other. Yet even though here we have an optic disc fitting the exact criteria mentioned in the paper, when we look at the retinal nerve fiber layer, we see the retinal nerve fiber layer is actually pretty normal for these eyes. This is probably physiological cupping; there is no evidence of optic nerve damage. There are also technical limitations that may hinder the reliability of detecting optic nerve damage. We all know about the challenge of screening out poor quality images and motion artifacts before we actually train the AI algorithms, and these challenges can introduce biases into our training data. But even if we have good quality images, here we have three eyes, and we see these OCT pictures of the retinal nerve fiber layer. They have good signal strength. We don’t see any motion artifacts. They all show relatively normal retinal nerve fiber layer architecture, and the deviation probability maps are pretty normal. It’s only when we use another type of technique, a new technique that we developed some time ago, which we call ROTA, retinal nerve fiber layer optical texture analysis, and which allows us to uncover the optical texture and trajectories of individual nerve fiber bundles, that we’re able to see clear and definite retinal nerve fiber layer defects evidenced in all three of these eyes. So even if we’re using the best available technique indicated here, the OCT retinal nerve fiber layer analysis, we’re still not always able to detect glaucomatous damage. And all these defects also show corresponding loss of the visual field. So we need consensus in defining glaucoma in studies. We also need consensus in reporting the diagnostic performance of AI for glaucoma detection. Lastly, we need level I evidence to inform whether the application of AI can improve glaucoma care. This has actually been more challenging and difficult, especially for glaucoma, because by and large, glaucoma progresses very slowly. To demonstrate that the application of AI can help, perhaps by improving patient-reported outcomes or helping slow visual field progression, is hard.
It takes time, and the design of the clinical trial is also very challenging. With that, I’ll stop there. Thank you again for having me in this webinar.

DR KAHOOK: Thank you, Chris. All of your work has been central to our clinic on a daily basis. We have your papers printed out and sitting on our desks so we can teach our fellows and residents about OCT, and I’m sure we’ll continue to do the same thing for all the work you’re doing with AI. We have some huge barriers to adoption, and it’s as basic as just defining glaucoma, so that we all agree on what is glaucoma and what isn’t. So thank you for that. And to transition to a related topic, the application of AI in retina, Dr. Ciku Mathenge will give us a talk that involves a study with real world application of AI, which I think will also answer some of the questions that have been coming through in the Q and A box. So Ciku, I’ll hand it over to you. Thank you for joining us.

DR MATHENGE: Thank you very much, Malik, for inviting me to be part of this panel. And thank you also to all my panelists and to Cybersight. So my first engagement with AI was about ten years ago, when maybe that word was not what was used. I had just completed my PhD, and for my studies, I had set up 100 retina clinics in 100 villages in Kenya. What I was looking for was presentations of disease, out of curiosity but also out of my own frustration as a clinician, not really knowing what to do with my retina patients. Currently I’m an academic retinal specialist. About two years after I submitted and graduated and got my PhD, I got a call from Tunde Peto, who had read my images at Moorfields, and said there was a group interested in using my images to run this automated retinal image analysis. I had no idea what that was, but I was willing to give up my images for it, because what I told Tunde is: I have over 5,000 images. If they can be used for the good of humanity, I’m willing to share them with anybody. So it’s quite interesting that Dr. Abramoff, who was one of the leads on that study, is on the panel today. I’ve never met him. But thank you that you at least included me on the paper. So this paper showed that automated retinal image analysis was as good as Tunde Peto at Moorfields. That was really interesting for me. My second encounter with AI was through Cybersight. I teach residents and encourage them to use Cybersight and the consult service, where you can upload an image of a patient who is giving you a hard time, and as you wait for a mentor to give you an interpretation, you can upload it to the AI and get an answer. So I heard about it, but I didn’t really use it, because I didn’t really understand it. That was my next encounter with AI. My most recent encounter is this randomized trial that we completed last year in Rwanda. What we did is we set up screening at different diabetes clinics, and we divided the patients into a group that would get an instant report, using an AI platform, Cybersight AI, and another group that would wait 3 to 5 days to get a report from the reading center. What we wanted to see was: did that impact their uptake of care? There’s no point in spending resources on initiating screening programs if the people who we tell: You do have DR are not turning up for care, which is a big problem that we have. So was there going to be a change? And really what we found, and we now have the evidence to show it, is that when given the instant report, the majority of patients actually went to the ophthalmologist the same day, because it was more convenient. They didn’t have to make another trip to the city to look for the doctor. And there was a 30% increase in uptake of care. But also, they took up care much faster. So in this study, I was really interested in: how does AI change the way I practice ophthalmology? We’ve heard about the technical aspects of AI. I don’t really know the details of that. AI in my set-up does not have to be complicated. This is the set-up in one of our clinics in Rwanda. To use the AI, we needed to have a color printer, we needed to have a computer with a screen, and we needed a good camera. And I repeat: you need a good camera. I believe that a lot of the frustrations coming out of some of the AI studies stem from the choice of camera. A good camera has to be easy for the user to learn how to use. It has to be easy for the patient to understand what they’re supposed to do.
It has to be easy in terms of the speed of the procedure. And also, you don’t want to dilate patients. We were lucky to have a really good system, and within 30 seconds, we got a report. And this is a sample of the report that we got from this project. We were very lucky, because we had direct communication with the authors of this AI system, and what that allowed us to do was to say what we needed for our purposes, and they were willing to suit our purpose. So for example, on this report, the green is the DR screening. So this patient has no DR. For those with DR, we wanted this traffic light coding, because it’s easier for patients to visualize what we are talking about. Especially for patients with mild DR, I didn’t want to just tell them: You have no referable DR. I wanted them to see: they’re not at zero, they’re at one. And even though we’re not referring them, they were not okay, and therefore they need to keep up screening. This was very easy for patients to understand. Now, the red part in this report is about VCDR. I was worried about using a system where I tell the patient: You don’t have DR. You’re okay. And they go home, assuming that they have no other problem. Glaucoma is a problem in my area, and when I heard there was an algorithm that could grade the VCDR, I definitely wanted that flag for my patients. And the AI authors were willing to adjust and give me algorithms that suited my purpose. So I said: I want one that detects a large cup to disc ratio, and I want one that detects (audio drop) on the retina. So if a patient had no DR but had a huge macular scar, I didn’t want that patient to get a green report, looking as if they had no problem. We wanted a report such that whatever problem the patient had, even outside DR, the report might not give another diagnosis, but at least it flagged that something was not okay. That communication with the AI authors was one of the most important aspects that I found with this set-up. So artificial intelligence can mitigate some of the major barriers that we have in taking care of our diabetes and other retina patients. In my population, there’s limited awareness that diabetes, for example, can cause vision loss, or that that vision loss is different from cataract and is irreversible. This is my patient Maria, who came to me for cataract surgery but had irreversible damage from diabetes by the time we found her. We wanted to reduce these kinds of stories. There is a lack of systematic eye screening programs, including reading centers. So ophthalmologists have to be the ones who read the images, and we just don’t have enough time, or enough of us, to be the ones doing that. There are other barriers caused by distances to screening centers, or high costs, or equipment simply not being available everywhere. And then there’s poor coordination within the referral system, such that patients who do have diabetes — I mean diabetic retinopathy — actually don’t know where to go for care. With these three problems in white, AI can help: by helping the few of us who are there, who can look at retinas and tell what’s going on on the retina, do more with less, because we can delegate more tasks. AI also allows the delivery of quicker results, and as we saw, this does impact how soon and how often patients go for care. Then what we also found is that when the diabetologist had this report, it was easy for him to convince the patient to go for care. And the diabetologist is the person the patient knows.
When I go as a screener, I’m a stranger. But if the person who the patient actually came to visit can tell them: Look, your report shows you have a problem, they’re more likely to take up care. And of course, by putting this screening within the diabetes clinic, we’re creating integrated, patient-centered care, which is where we all want to go. We still have things to sort out in our region. We don’t have the relevant technical skills to build AI, and yet we want AI to be built that works for us. I don’t know any AI developer in my region. There is also a lack of investment in research and development, so that transition from the lab to market — I don’t know how it’s going to happen, because we don’t have investment in that area. And of course, we need to improve some of the basic infrastructure that has been mentioned by others. Internet is still very expensive and unreliable. We need the legal frameworks that will tell us what AI should and should not do, and also just to have enough data from our own population to inform the technologies that are being developed. I do believe that even in low and middle income countries, AI can really be part of the solutions that we give our retina patients. And I believe that DR is just the beginning. Thank you very much again for including me on this panel.

DR KAHOOK: Ciku, it’s a pleasure for us that you joined the panel. I know you’re very busy. But your experience with AI in the real world is really where a lot of people are trying to go. You’re already there. You’re implementing it on the ground. And I don’t think that should be understated. I think we should celebrate what you’re doing. So thank you for sharing all of this information. And I continue to look forward to working with you and learning from you. So I appreciate it. I’m going to move on to Michael Abramoff, who — I know he wouldn’t like me defining it this way — to me is one of the rock stars of AI. He’s certainly one of the original advocates of transitioning beyond just research, which of course is valuable, toward the ultimate goal of getting AI into the clinic and into patient care. I invited him today to speak about regulatory pathways, but during the Q and A, I’m sure we’ll have some other questions about his experiences — the good parts and the scars from commercializing a device. And we’ll get to that in a little bit. But before that, Michael, please speak to us about regulatory pathways.

DR ABRAMOFF: Thanks so much for having me. Dr. Mathenge, I recognized the name, and now I realize we actually did a paper together. So it’s nice to be on this panel together, and I hope to work together soon. Let me share my presentation. It should work. Almost there. So ten minutes is not a lot; I will try to squeeze it in. I have a couple of conflicts of interest. I’m speaking here as an academic, but clearly I have a role in Digital Diagnostics, so there’s a clear conflict of interest, as well as numerous (audio drop) of AI, including with FDA. So one of the interesting aspects that indeed we are running into is regulation around the world, specifically for autonomous AI. Some of the countries we have already implemented in are dark, and some of the countries where we’re right now going through clinical trials and other regulatory steps are shown in yellow. And there’s one in Bangladesh, where we are actually finishing and wrapping up a great trial in partnership with Orbis. Very exciting. I wish I could share the very exciting results; the paper should be coming out very soon. Anyway, some of the speakers already mentioned it, but I did want to establish this difference between autonomous AI and assistive AI. We as clinicians all use assistive AI, probably almost daily: from Zeiss or Heidelberg, it helps us with analyses of OCT scans. But that’s not the same as autonomous AI, which essentially does the same as a retina specialist does, where the retina specialist is not. Where there’s only an outlet and 5G, it allows you to make a high quality diagnosis, and it allows you to do it at the point of care. In developing countries, but in other countries as well, the access problem is the retina specialist: there are not enough of them, as well as other reasons. So you can be where the patient is. Rather than the patient coming to you, you can go to the patient. And that can help solve problems that have been intractable, including in telemedicine. About regulation: I cannot discuss all the regulations in Japan, Saudi Arabia, Europe, and the US. What I want to do is give a more general framework that has been really helpful. The first thing — I am in Iowa. You may not know where Iowa is, but it’s in the middle of the US. It’s a grain and corn producing state. That’s why you have silos, where the grain is stored. All around me are the silos. It’s a metaphor for working alone. So you see the AI working in the silo; that’s where you get rosy, rainbow, unicorn type projections. And the real world is a little bit different, because in addition to just solving the algorithmic problems and the accuracy, there are more general concerns about health care AI that people have, and that we have to solve to make and implement it in so many countries around the world. Will it benefit me as a patient? Will it exacerbate rather than improve health disparities? What is going to happen to my data? Is there racial or ethnic bias? Who is liable for errors? And who pays for all of this? We have solved this in the US, and we’re solving it in other countries, and we’re solving it with frameworks. So I will talk a lot about that. One mention I want to make is this concept of glamour AI, which is AI that is technologically really cool, but it doesn’t solve a patient problem, doesn’t solve a population health problem. It’s not benefiting patients. That is not what we’re interested in. And that these concerns are valid is shown by very recent examples.
There are examples in AI, where patients were harmed by AI; it has had a lot of effect in the US until today. And the fear of a backlash against all AI is real. We have seen it with gene therapy, where in the US especially, gene therapy was making great progress, really on the cusp of being implemented widely. There were some unethical trials. Young people died. And it was shut down for almost two decades. Only very recently did FDA approve gene therapy, in fact for a retinal disease. Ophthalmology is typically at the forefront of all these new developments. There are other examples I won’t go into. But if you want to address the issue, you need to step out of the silo, as an AI developer, as an AI creator, and make sure that all stakeholders in health care, and that means the creator but also patients, and in the US the government, policymakers, regulators like the FDA, physicians and provider organizations, payers, and even ethicists, all need to be comfortable when you start implementing AI at scale. And we actually made a choice: whether we were going to be the Uber of health care, breaking the system, on the left, or working within the system, addressing all these concerns. And that’s what we did. And that’s why we got, as was already mentioned at the beginning, in 2018, the first ever FDA approval for an autonomous AI. But that meant we had to do an enormous body of work on the ethics of AI. It started with ethics, because if you start from the very elementary bioethical principles that we have in medicine, like patient benefit, non-maleficence, equity, and responsibility, and build your regulatory and payment framework around those principles, you can be sure there are no gaps, and that ten or twenty years from now, we won’t say: Oh, we should have thought about that. Because these bioethical principles are thousands of years old. In every culture around the world, you can see very similar principles. So that is a way to effectively create a closed framework. And so we published that in 2020, after years of work. And the elementary message is really this: there are varying, competing bioethical principles, like non-maleficence, justice or equity, and autonomy, as you see here, and a doctor, a health care system, or an AI sits in the middle of them. You cannot 100% guarantee patient autonomy while at the same time guaranteeing 100% patient benefit. There’s always a balance to be found: more of this versus more of that. And different cultures, different health care systems, different physicians, and different AI creators will have to make that choice. But there is a choice to be made. And what the frameworks help with is measuring. If you tell an engineer: Be ethical, they will ask you: Well, how? So you need to have a way of measuring how much of an ethical principle you’re meeting. And I think that is the principal contribution of this ethical framework: we say, for example, for patient benefit, here are the ways you can measure it for an AI. For equity or racial bias, here’s how you can measure it for an AI. Autonomy, the same thing. Even responsibility. That was the contribution there. And that has been really helpful, because we started from this basis. It allowed us to build the ethical frameworks, ultimately with the goal of improving health disparities, improving access, lowering cost, and just in general making health care more available. So that’s the goal.
But for that, we worked very closely with the FDA and the regulatory agencies on regulatory frameworks. How do you design a clinical trial? How do you deal with bias? How do you develop and conceptualize an AI? How do you include it in the standard of care? And then, on the other hand, we're not talking much about it here, but there needs to be a business model for an AI creator, and for a provider there needs to be some way of getting reimbursed, so they are incentivized to use it. So payment frameworks are really important. We also worked very hard on the payment framework, the reimbursement framework, for AI. And that allowed us to have nationwide reimbursement in the US, which was very exciting as of this year. So, just sketching the close work with the US FDA on what we call the foundational principles of AI: they created a so-called collaborative community, where we worked together with ethicists, the FDA, tech companies, and clinicians to create guidelines, or what we call considerations, for how to validate an AI. How do you design a clinical trial? What about transparency and explainability of the AI? How do you deal with federated learning? How do you measure all of this? So that is the work we're doing. The first publication came out just last year. And it's very dense. It's about 30 pages. But it really lays out what you should be thinking about when you want to regulate AI, and especially autonomous AI. Similarly, it allowed us to create a reimbursement framework, starting from ethical principles. We built this together with payers, with physician organizations, especially people specializing in physician reimbursement in the US, and also with government agencies that pay for AI. And it was really important to base it in ethics, and to explain how you can get from an ethical principle all the way to reimbursement that actually works for everyone. If it doesn't work for the payers, if it's too expensive, they will not pay for it. If it's too low for creators, no one will build AIs. The R and D stops. So it is very important to address all of these. And finally, a little bit about liability. I didn't hear much about it today. But liability is very important in the US, of course, because of the medical malpractice claims of hundreds of millions of dollars that can be awarded here, and it is just as important in other countries. Who is liable for an AI diagnosis, and especially for an autonomous AI diagnosis? We have stated early on that it should be the AI creator that is held liable, not the provider using it, which for a diabetic eye exam is typically the primary care provider. And all of this led to the American Diabetes Association including this AI in their standard of care. If you're not familiar with the US system: it is very important that AI is now supporting the closing of care gaps, so that many care providers can make the diagnosis of diabetic retinopathy. They spent many years working this out, because they wanted to set a precedent for good AI, AI done the right way that does not harm patients, and for how you deal with that from a regulatory perspective. So yes, we have indeed been able to apply these frameworks, and we continue to apply them worldwide. I am very excited about the trials that are ongoing in these different countries and about how we work together with these regulatory agencies. I am also very proud, and cannot help but mention, our partnership with Orbis, especially as shown by the publication soon to appear about this great study from Bangladesh.
So I haven't been able to say very much specifically about which law or regulation to follow in each of these different countries, but hopefully it is helpful to have this more general overview of where things are going. Thank you very much.

DR KAHOOK: Great. Thank you, Michael. And if everybody could join in with your cameras, maybe we can get just the panelists on the screen. And Lawrence, you can direct that in the best way. So… Maybe we can take the PowerPoint off. Here we go. It was a really ambitious program, to get as many talks in as possible, and I had really two intentions behind this. One was: I wanted to learn from all of you. I read all of your papers, and this to me was, on a personal level, very exciting. But I also wanted to try to create an enduring resource that can go on Cybersight. The session itself will live on Cybersight for a very long time, and I think somebody who is curious about AI can go to it, really learn the basics, and start to pick up some of the threads that each of you has left in your talk. Michael just now: each slide could have been a five-hour session by itself, if given the time. And I certainly appreciate that. I also try, in the webinars that I moderate, not to go over an hour and a half, because a lot of people are busy. But the Q and A part I think is super important. I tried to answer a lot of the questions, if you noticed, as we were going along. But I did write a few of them down, and I want to ask you some light questions and some heavy questions. I'll try to direct each one to one person, but then anybody else can speak up. Pearse, the first one is for you. I'll throw you a heavy one to start with. One of the questions that came up, and certainly when you look at the people who dialed in, we have people from all over the world, different health care systems, different resources, is: how do we make sure that AI does not widen the care gap in some areas? What are some of the pitfalls that we have to watch out for, in that AI can maybe help some areas, some populations, but not others? It may actually benefit some and, if not hurt others, simply fail to benefit those who don't have access. What are your thoughts on that? I know you've thought deeply about this, but what can you share?

DR KEANE: Well, I think there are lots of aspects to that question, but one that I would like to highlight is that there is a real risk of something my friend Alastair Denniston has coined "health data poverty." This is the idea that if we have developed AI systems only on, say, retinal photographs from white Irish people like myself, in their 40s, then those systems are very unlikely to work in different populations, different ethnic groups, et cetera. So I think there's a real risk that our systems won't generalize, because, as Jayashree said in her talk, the Achilles heel of a lot of modern AI approaches is that they generally don't perform well on data that's different from the data they've been trained on. And related to that, another point is that ophthalmology in some senses has been at the forefront of this, through Michael's work and others'. But I think part of the reason for that is that there have been large ophthalmic imaging datasets, some of which have been made publicly available, like Messidor and EyePACS, and that has really stimulated a lot of this. Machine learning experts who are interested in doing medical applications can get access to ophthalmic datasets and start to test and develop their algorithms. But in some other medical specialties, like obstetrics and gynecology, at least as far as I know, there isn't really a large number of these public imaging datasets, and as a result it's not so common to see cutting-edge clinical AI papers in those specialties.
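To make that generalization risk concrete, here is a minimal sketch of how a team might audit a trained screening model on separate population cohorts rather than one pooled test set, so out-of-distribution performance drops stay visible. The cohort names, the `load_cohort` loader, and the `model` object are hypothetical placeholders, not any group's actual pipeline:

```python
# Minimal sketch (assumed names, not a real product pipeline):
# evaluate one trained referable-DR classifier per cohort.
from sklearn.metrics import roc_auc_score

COHORTS = ["internal_holdout", "messidor", "eyepacs", "new_region"]

def audit_generalization(model, load_cohort):
    """Print per-cohort AUC so distribution shift is not averaged away."""
    for name in COHORTS:
        images, labels = load_cohort(name)          # X, y with y in {0, 1}
        scores = model.predict_proba(images)[:, 1]  # P(referable)
        print(f"{name:>16}: AUC = {roc_auc_score(labels, scores):.3f}")
```

A large gap between the internal holdout and any external cohort is exactly the "health data poverty" failure mode described above.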

DR KAHOOK: Maybe, Michael, I'll throw the question to you, because you have the real-world application part of this with Digital Diagnostics. So how did you ensure that your data would be applicable to a wide-ranging population?

DR ABRAMOFF: That's a more general question. Even at the concept stage… I mentioned the tension between the different ethical principles; it comes back to that. If you want higher performance, it may require a more expensive scanner or imaging camera. It's easier to get high quality images from highly skilled operators, who are very expensive. But even that choice, versus maybe a lower cost imager where the AI, while keeping high performance, needs to compensate for lower quality, noisy images, will affect whether you're going to address health disparities or not. So it starts at the concept stage. How you train it is another thing that Pearse mentioned: include very diverse datasets, and make sure that you also validate on very diverse data. That's why I was happy to collaborate with Dr. Mathenge. Years ago we developed mostly on North African and European data, and they were able to validate it in Kenya, which was very exciting, because that was a big concern at the time. We have chosen to look at biomarkers. Right? So typically we do not train blindly from whole images. Rather we say: a hemorrhage or an exudate is racially invariant. If you have it, it doesn't matter what the background color of the retina or the pigmentation is. That is a longer discussion, I think, but there are ways to address and mitigate this. My point is more that there are a lot of choices, even at the start, that will affect whether or not you're able to solve health disparities.
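As a rough illustration of that biomarker-first design, the hedged sketch below scores hypothetical per-lesion detectors and combines their outputs, instead of training one end-to-end classifier on raw fundus pixels. `detect_hemorrhages` and `detect_exudates` are stand-in names for illustration, not Digital Diagnostics' actual detectors:

```python
# Minimal sketch, assuming each hypothetical detector returns a
# lesion-probability map (a NumPy array) for a fundus image.
import numpy as np

def screen_for_dr(image, detect_hemorrhages, detect_exudates, threshold=0.5):
    """Combine per-lesion evidence into a refer / no-refer output.
    Lesions such as hemorrhages look alike across retinal
    pigmentation, so lesion-level features are less sensitive to
    population shift than raw-pixel, end-to-end classifiers."""
    lesion_maps = [detect_hemorrhages(image), detect_exudates(image)]
    # Summarize each probability map by its strongest detection.
    evidence = np.array([float(m.max()) for m in lesion_maps])
    return "refer" if evidence.max() >= threshold else "no refer"
```

The design choice is the point: the invariance lives in the features, so the combination rule on top can stay simple and auditable.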

DR KAHOOK: That's great. That kind of leads into a question for Chris, from some of the content Michael just talked about. One of the issues here is quality of data, and the quality of data can be thought of on two different levels. One is when you're training a system and trying to see if you can actually take it into the real world. But once you get it into the real world — Chris, you've talked about OCTs giving you valid information versus overcalling or undercalling — how do you think we can adjudicate the information that we're obtaining from AI in the real world? I'm in clinic. I'm seeing patients. I have an AI system that is helping me with OCT. How should I think about this as a clinician when I'm getting that data? Should I believe everything? How would I adjudicate? How would I figure out what's right and wrong? I think you're on mute there. There you go.

DR LEUNG: It's not easy, even for specialists. I remember when spectral domain OCT was introduced, back in 2005, 2006. Back then, the Cirrus had what they called the retinal nerve fiber layer thickness deviation map, which showed where the retinal nerve fiber layer thickness was below the reference range. That was not an AI algorithm. But in our locality, where we have a lot of individuals with high myopia, we saw all these defects all the time, around the superior pole and the inferior pole. In the beginning, I thought: wow, highly myopic eyes are susceptible to the development of glaucoma that we were not able to detect before we had OCT; now that we had OCT, we saw all these defects. I think the same thing can happen here. Especially with new technologies, we have to be very careful. We always need to integrate prospective data. Especially for glaucoma, we need to follow the changes of the optic disc and the retinal nerve fiber layer over time, to ensure that this is a progressive type of neuropathy in the scans that we're using. But unfortunately, a lot of the time we would not have that. It's not easy; that's why we don't yet have a consensus on how to define glaucoma. But again, I think the integration of fundus photographs, OCT, and other clinical information together is critical to judge whether the test results we're looking at are true or not.
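One common way to operationalize that longitudinal check is trend-based analysis of serial measurements. The sketch below, with purely illustrative numbers and an assumed significance threshold, fits a linear trend to annual RNFL thickness values and flags a significantly negative slope; it is a teaching sketch of the idea, not a validated clinical rule:

```python
# Minimal sketch: flag progressive thinning from serial RNFL
# measurements instead of trusting a single flagged scan.
from scipy import stats

def rnfl_progressing(years, thickness_um, alpha=0.05):
    """Fit a linear trend to serial RNFL thickness measurements and
    flag progression when the slope is significantly negative."""
    slope, _, _, p_value, _ = stats.linregress(years, thickness_um)
    return slope < 0 and p_value < alpha

# Illustrative example: five annual scans with gradual thinning.
print(rnfl_progressing([0, 1, 2, 3, 4], [95, 93, 91, 88, 86]))  # True
```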

DR KAHOOK: Yeah. That last part I think is key for me in my practice. Right? When I'm looking at a visual field, an OCT, or a photo, each one is a piece of the puzzle. We can think of AI the same way. It's assisting what we're doing, not dictating what we're doing. It's gonna be a piece of the puzzle. And I appreciate that answer. Jayashree, I'm gonna shift gears a little bit and force you to talk about the differences between your experience in radiology and what you're seeing in ophthalmology. What can we learn in ophthalmology from your experience in radiology, and from what radiology has already done? Arguably radiology is the most advanced as far as applying AI to day-to-day practice, compared to what we're doing. And I know Michael might disagree with that, because it depends on how you define AI. But what would you say about the differences between radiology and ophthalmology, and what we can learn from radiology?

DR KALPATHY-CRAMER: So I think the problems are similar in many, many ways. For both fields, infrastructure is a key component of doing this for real. The notion of interoperability between imaging formats, and not having proprietary formats, things like that, can really limit how easy it is to apply AI. And really pushing towards standards is something that radiology has been doing for a number of years. Ophthalmology is gradually getting there, but, and maybe it's just my perspective, I feel like interoperability and the adoption of standard formats are not as advanced as in radiology. That said, I wouldn't necessarily say that one field is ahead of the other, because radiology also has very complex datasets; MRI can be very complex. And again, the same practical aspects of figuring out what to apply are not trivial. In many cases, it really does come down to a very mundane set of boring things, like the plumbing between systems, the things that nobody wants to talk about that take the most time, which are just the practical aspects. Validation, quality checks, things like that, I think, are not necessarily solved in radiology either. So infrastructure, again… I cannot emphasize enough how important things like that are. But it's the same with COVID and radiology: there was an explosion of really terrible publications that used AI, and I think we have to be cautious about that here as well. There were what they call Frankenstein datasets, where all the COVID-positive cases came from adults and the negative cases came from pediatric patients, so the model did a great job of separating pediatric patients from adults. Things of that sort have been brought up as well, along with notions of bias in the dataset. So that's something to learn from as well: what not to do. The amount of money and effort and resources that have gone into these awful publications is, I think, a cautionary tale for all of us. And we need to learn from those experiences.
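A simple screen for that kind of shortcut, sketched below under assumptions (generic image features plus a per-case source label; no real dataset is implied), is to check whether a trivial probe can predict which source each case came from. If it can, the disease label is likely confounded with the source, and per-source evaluation is needed:

```python
# Minimal sketch of a "Frankenstein dataset" check.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def site_leakage_score(features, site_labels):
    """Cross-validated accuracy of a trivial probe predicting which
    source each case came from. Accuracy far above chance suggests
    the disease label is confounded with the source, so pooled
    metrics will overstate real diagnostic performance."""
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, features, site_labels, cv=5).mean()
```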

DR KAHOOK: That's great. I think we can learn a lot about what not to do, because some of those trails were blazed, and some of those mistakes made, by others before us. And people like you and the others on this panel, I think, will keep us honest. Some of the questions that came in had to do with how we obtain data, and how we obtain the right type of data. A question for Gabriella: can you talk a little bit about the approach that Orbis takes to obtaining data? What's your philosophy when it comes to gathering data from around the world? Or anything you want to touch on from your experience within Orbis?

MS LANOUETTE: I do have a lot of experience with this, as collecting data was my main role when I first started working here. I think Orbis's main goal when it comes to data is diversity. And I know it's been mentioned a lot, but I don't think it can be mentioned enough. You need data with global representation, because this is a global problem in terms of eye care. And we take a very, I'd say, personal approach to this. We get a lot of data from open-source resources, such as the Kaggle DR dataset, and we partner with a lot of hospitals and clinics. Like I said, diversity is the biggest issue in terms of training AI and using AI. You can't expect to use an AI system in an area it hasn't been trained on.

DR KAHOOK: That’s great.

MS LANOUETTE: Diversity is key.

DR KAHOOK: Absolutely. And I think that's a theme throughout some of the questions that we got and some of the lectures that we heard. Fully agree with that. So Ciku, I'm gonna ask you a question, and give me an honest answer. Let's say you were talking to a philanthropist who was writing you a check and said: as much as you want. What would you do next to follow on the work with RAIDERS? Forget about the financial limitations. What would impact your day-to-day clinical life, as a follow-up to the clinical study, which I think was fantastic? It gave us a lot of information. So what's next?

DR MATHENGE: We do intend to take that next step, and we have started it in a small way. Some of the things that are needed at this point have nothing to do with imaging eyes and uploading to AI. It's the health system changes. For example, how do we charge for this new activity? Because I'm not gonna keep getting free AI from Orbis forever. So I need a tariff in the health system. And how is that tariff determined? We have an AI department of sorts in Rwanda, but it's just in its formation, and it exists exactly to guide on things like: do we develop storage in-country, or does storage go outside? What do we do with people who come and deposit cameras in clinics and say: just keep taking images and sending them to us, and for every image we'll give you something? So I think it's a lot of those other things, in the health system and in the regulatory system, that I'm finding are a challenge now, rather than affording a camera or a computer for a clinic.

DR KAHOOK: Great insight. And it kind of touches on something that Michael said earlier about the reimbursement part of what we’re dealing with in the US. If you don’t have a business model, you’re not gonna get the innovators working on it, because it’s not sustainable. It has very little to do with the intentions of the person. But more so with the reality of being able to carry projects forward. You have to get paid for it. So that you can keep innovating. Would you agree, Michael?

DR ABRAMOFF: Great point. And what is exciting about the study with Orbis in Bangladesh specifically is that we're looking at what the business model can look like, so that it works for a low income country but is still interesting for an AI creator. If the business model doesn't work in the US, it cannot work in Bangladesh either. I think we're solving that, which is very exciting, because it allows you to scale to other countries. But if you can't solve it, it will have to be funded philanthropically, and that's hard to sustain, as you said. Completely agree.

DR KAHOOK: The last question here, just for time's sake, is going to be for Pearse. It's become kind of a pastime for me to put Pearse on the spot. And this is being recorded and will live online, so we can all look back on it together. Tell us where you think AI in ophthalmology will be in 5, 10, or 20 years. You can pick whichever you want. I want you to make that grand prediction, and we're all gonna hold you to it. So go for it.

DR KEANE: Oh my goodness. First of all, I want to say: I don't see AI replacing doctors any time soon. Certainly within my lifetime, within my career, I don't see it. So what I imagine is the clinic in 2030. And the reality is that we're probably gonna be just as busy with our patients. We're gonna be still skipping lunch, and clinics will be running late. We're just gonna have new problems to deal with. And I imagine us in the clinic in 2030, 2040: we have a patient in front of us, we have ten different types of advanced imaging, we have their genomic information, their proteomic information, their metabolomic information, data from their Fitbit watch, their sleep patterns, information about the pollution levels in their neighborhood over the last three months, and a range of narrow AI tools that somehow draw some learning from each of those modalities. And then our role as clinicians will be to integrate that information, to plot the best course for that patient, and to convey that with the sort of human empathy and care and all of the things that make us, I would like to think, good doctors.

DR KAHOOK: I love that answer. I love that answer because it also refocuses some of the training we have to do for our residents and our fellows: the human factors of what we're talking about, how we deal with the information and how we convey it, are becoming more and more important with the addition of technology into our clinics. So it's kind of the opposite of what people feared. We're actually going back and asking: okay, how do I convey the information to this patient, and how do I speak to the patient appropriately? So I'm excited for the future. I'm excited to go back and watch this entire session when it's up on Cybersight. Just to remind everybody: Lawrence will work very hard to get this up on Cybersight, probably sometime next week, so please go take a look. And if you're in a group and haven't signed up individually, sign up for Cybersight so you can get all of the updates. I want to thank each and every one of the panel members. I've learned significantly from each of you throughout the years, as well as today. And I hope that we can come back and do the same thing in about ten years and see if Pearse was right. So don't forget the answers that he gave. I want all of you to have a great day, great evening, and please reach out with any questions through the Orbis and Cybersight systems. Thanks, everybody. Have a great rest of your day. Bye, everybody. I really appreciate all of you.

Last Updated: October 31, 2022
