Machine Learning and Hematopathology – Transcript and Audio
The views expressed in written materials or publications and by speakers and moderators do not necessarily reflect the official policies of the Department of Health and Human Services, nor does the mention of trade names, commercial practices, or organizations imply endorsement by the U.S. government.
Date of session: 03/22/20
Triona Henderson, MD, MPH
Centers for Disease Control and Prevention
Hooman H Rashidi MD, MS
University of California-Davis Health
TRIONA HENDERSON: I’m going to give the participants one more minute. We’re just about halfway through the number of people who’ve registered online right now.
OK, we’re going to go ahead and get started.
Good afternoon. My name is Triona Henderson. And I am a clinical pathologist and the facilitator of this ECHO Model™ pilot project. I extend a warm welcome from the Division of Laboratory Systems at the Centers for Disease Control and Prevention in Atlanta, Georgia. Thank you for joining.
The topic for this interactive discussion today is Machine Learning and Hematopathology. Our subject matter expert is Dr. Hooman Rashidi, a professor and Vice-chair of Graduate Medical Education, Vice-chair of Informatics and Computational Pathology, the director of their residency program, and the Director of Flow Cytometry and Immunology in the Department of Pathology and Laboratory Medicine at the University of California Davis School of Medicine in Sacramento, California.
Here is some technical information before introductions and presentation. Please use the video capabilities of whatever device you’re using for this session to enhance the interactivity of each session. All audience microphones are now muted. During the discussion section, please unmute yourself to speak.
If you are experiencing technical difficulties during the session, please send a private chat message to Johanzynn Gatewood, labeled as DLS ECHO, who’s waving at you now. She will do her best to respond to your issue.
If you are connecting by phone only, please announce yourself by name and state when beginning to speak.
Finally, as part of designing relevant sessions for these ECHO sessions, we are looking at your evaluation responses. We would like to encourage everyone to complete the post-session evaluation.
P.A.C.E.® credits are being offered for this session. Either a certificate of participation for those who are not requiring P.A.C.E.® or a P.A.C.E.® certificate will be issued upon completion, depending on your selection.
If you have additional comments, please send a private chat message directly to Johanzynn, or email directly to dlsecho– that’s DLSECHO@CDC.gov.
How are ECHO sessions different from teleconferences and webinars? In this case, we will have a discussion, and a really robust discussion, around the main topic that Dr. Rashidi will be presenting. Subject matter experts hope to share some of their work that may be translatable to all of you in your individual laboratories. These ECHO sessions will focus exclusively on clinical laboratories in the United States and US territories.
Once again, we value the discussion amongst all of you that ensues and want to encourage you to share your own experiences and challenges on this topic. We thank you for your interest and participation.
Here is an overview of the process. Our subject matter expert will present the information of his specialty. I will ask some clarifying questions to him. Then we will open up the floor for discussion of ideas or shared experiences with comments from the subject matter expert. Then we will have closing comments and reminders. And then we will adjourn.
Today’s session is being recorded. The audio and transcript will be shared on our website. If you do not wish to be recorded, please disconnect now.
Closed captioning will be provided for this session. Find the link right now in the chatbox.
Here is the biography of Dr. Rashidi. Dr. Rashidi combines his passion for patient care and education with his unique training in bioinformatics and computer programming to create innovative new tools and resources that improve clinical practice and education. Dr. Rashidi is the co-founder, developer, and senior editor of Hematology Outlines, a very popular online hematology atlas that is used internationally by medical schools and other training programs, and is endorsed by the American Society for Clinical Pathology for clinical laboratory scientist and medical technologist training.
Dr. Rashidi’s experience in bioinformatics dates back to his graduate years at UCSC, which subsequently allowed him to also serve as the principal author and editor of several popular bioinformatics textbooks. This background has also enabled him to develop various novel artificial intelligence and machine learning platforms, which has since led to multiple, active, and exciting research, clinical, educational, and quality improvement projects, which now include over 30 collaborators from multiple departments and various prominent institutions.
These studies have also led to several recent manuscripts in which he serves as the senior and corresponding author, the most recent of which is published in Nature Scientific Reports, and have also led to several filed patents. His most recent filed patent on the early identification of acute kidney injury through his novel machine learning approach is now being considered for licensing by an industry partner through the University of California.
In addition to the above, his most exciting and newest invention, which you will hear about today, is the unique, proprietary, automated machine learning, AutoML software known as MILO, Machine Intelligence Learning Optimizer, which is currently serving as a very powerful new AutoML tool for a large number of current clinical and educational projects.
Dr. Rashidi’s efforts are now widely recognized nationally and internationally as evidenced by his various national, international talk invites at various prestigious conferences and institutions.
Now, I will invite Dr. Rashidi to begin his presentation. Please be thinking about similar situations you’ve encountered and/or questions that you may have for him to enhance a robust discussion. Please offer your questions during the discussion period. Dr. Rashidi.
HOOMAN RASHIDI: Thank you so much for the kind intro. And it’s good to be here. And thank you for the invite. And I hope you enjoy this discussion. So I’ll get started.
I’m not going to go into too much in my background. But it’s important for you to know that, as a practicing physician with a background in bioinformatics, it really– the stuff that I’m going to be showing you is not just the work of me. It’s the work of a group of us that’s actually working together to– what we’ve brought to the thing. So one person by themselves can’t basically put something through like this. But hopefully, these tools can now be applicable in multiple sites so that others can use.
I do have to– as was mentioned, there are certain filed patents there, so this is my disclosure.
And in terms of overview of the talk, it’s a very simple talk in terms of what I’m going to be talking about. It’s mainly about technology. And specifically, the first part is going to be several minutes about technology and hematology education, although this can be outside of hematology as well, too, but specifically, our Hematology Outlines project, and exactly what is it that we’ve built, and how can you use it, because it’s a free resource.
And then from there, I’ll lead you on to future directions, and in terms of things that people are actually publishing on and you may be already familiar with within the machine learning world. And exactly what is machine learning? And more importantly, how can it be used? One of our classic projects that has led to its own IP, and then finally, our newest and most exciting project, which is MILO, which hopefully you’ll enjoy as much as we’ve enjoyed building it.
So how to utilize and incorporate these concepts along with technology and our teaching material, really, I’m going to be mainly talking about Hematology Outlines because this came out of needs. It was a project out of needs, where, when I was running the heme course in UC San Diego, hematology course, the students actually asked for me to build an online tool such as this.
And so this basically brought together 30-plus collaborators from various institutions to build a reliable atlas and glossary with multiple interactive tools that any user could actually use. And hopefully, some of you have already used it.
But the best way to get you to appreciate it is really through a demo. So I’ll just take you directly through the website and show you the demo.
So it’s a very simple website. It currently has two parts that are active, which is the full atlas and glossary. And it’s a comprehensive atlas of hematology and a glossary of hematology. It’s a very extensive glossary.
The neoplastic and the non-neoplastic hematology, those are going to be part of the upcoming textbook, which is not out yet. So the main parts that are live are the atlas and the glossary.
I’ll start with the glossary. Glossary, we basically– throughout this whole building process, we were always thinking about what is it as a user, and what a user would want in terms of their best way of learning and having the best user experience. And so in our glossary, where it’s an extensive glossary, as you read this stuff– and I apologize for it to be a little small in text.
But let’s say you’re reading about APTT. And let’s say you’re a new user, and you’re not familiar with the term, “heparin.” Well, obviously, “heparin” is also in our glossary. But since it’s already in another part of our glossary, it already automatically gets hyperlinked. And then you, as the user, can click that. And instead of taking you to “heparin,” it actually pops it up for you really quickly so that you can kind of see exactly how does that apply to the APTT that you were reading about. And this same concept applies to all the other terminologies that we have in there.
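[Editor's illustration] The automatic cross-linking described here could be sketched roughly as follows. This is only a minimal sketch, not how the Hematology Outlines site is actually implemented: the `GLOSSARY` entries, the `link_terms` function, and the `[[...]]` popup marker are all hypothetical names chosen for this example, and the definitions are placeholder text.

```python
import re

# Hypothetical mini-glossary with placeholder definitions (not the site's content).
GLOSSARY = {
    "APTT": "Activated partial thromboplastin time, a clotting-time test.",
    "heparin": "An anticoagulant drug.",
}


def link_terms(text, glossary, current_term):
    """Wrap every other glossary term found in `text` in a [[...]] popup marker."""
    # Do not link the entry to itself.
    terms = [t for t in glossary if t.lower() != current_term.lower()]
    if not terms:
        return text
    # Longest terms first, so a longer term wins over a shorter one it contains.
    terms.sort(key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: f"[[{m.group(0)}]]", text)


linked = link_terms("APTT is used to monitor heparin therapy.", GLOSSARY, "APTT")
print(linked)
```

A front end could then render each `[[term]]` marker as a clickable element that opens the definition in a popup instead of navigating away from the entry being read.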
Besides that, we wanted to have multiple ways for the user to be able to learn from the process. So one of the other ways that they can learn from the process is to have flashcards that are built in. So we have flashcard builders in there. And we’re actually in the process of putting out a new flashcard tool app, which is going to be out soon, as well, too.
And that being said, this is on our website, which is a free website that you can all access. There are pre-built flashcards that are already there. Or you can actually select the terms that you could make your own flashcards on. Very simple flashcard builder, where you basically can quiz yourself in terms of seeing how you basically– how much do you know about a particular entity? So “basophil,” you can click it. If it actually has an image from the atlas, it already pops that in there as well, too.
Ultimately, the main thing that’s behind all of this is the content. And the content is a fully comprehensive atlas of hematology. And then, again, this has come from collaboration with 30-plus authors and editors to make it not only something that’s very user friendly but also content reliable. And it gets constantly updated. And this has been going on for about ten years now.
So let’s say, for example, you’re interested in a particular area, in maybe bone marrow, secondary lymphoid tissue, or bloods. Let’s say, right now, you’re interested in blood. You can actually go directly– from an ontology standpoint, you can actually now go into it, and dig into it, and say, OK, red blood cells– let me see exactly what topics are within red blood cells.
You just– by clicking the red blood cells, it actually pops up the topics for you and says the ones in the green, says those are normal. The ones in the red, those are abnormal. So you can actually find those particular entities.
Let’s say you want to evaluate mature red blood cells and sickle cells. And so by doing that, you can put them side by side. And then you can evaluate them. And we’ve actually also enabled it where you can actually have a virtual microscope built-in or a zooming lens. And then, you can actually compare that sickle cell that’s there to a normal red blood cell that’s to the left.
But more importantly, instead of just having, let’s say, a new user who may not be familiar with the hematologic terms, we’ve had our editors and content experts actually put in what’s the best and most resembled criteria. So if you are dealing with, let’s say, a sickle cell, some of the things that you should be able to distinguish it from– for example, schistocyte is a very important one. You would want the user to be able to have access to that readily.
So by having that linked immediately, you can just press the Compare button, and it automatically pops it to the left side of it, or if it was on the right side, to the right side of it. And then you can now basically compare your schistocyte to the sickle cell that you looked at earlier. So you can zoom in and read about it.
And if you notice, also, the type of format is similar to some of the other atlases that are out there because we kind of mixed and matched, took the best of what we liked from other atlases, and incorporated it into our own version. So that’s in terms of content.
Now, a lot of users want to be able to have ways to assess themselves. And one of the ways to assess yourselves would be through matching quizzes, multiple choice quizzes, and then, obviously, maybe sometimes not having an actual quiz but just a quick way to self-assess without being judged or without getting a score.
And so matching quizzes are very simple. It’s almost like playing a game. Instead of reading the entire glossary, we’ve made the entire glossary terms to actually get matched with their actual title. So you can actually just drag your mouse and individually populate the individual entities and then say, OK, that looks like the answers to me. And then you check them. Obviously, you know, I should be able to get all of these correctly. Hopefully.
But then if you got some of them wrong, it would actually give you a red and say– and it’s not a scoring system here. This is just for you, another way for you to be able to read the glossary, to make it a little bit more enjoyable.
But then if you want to do actually a quiz where you actually get a score, we have a multiple choice quiz builder, which is in different categories– let’s say peripheral blood or bone marrow. And then you could then– it’s 10 questions at a time that it randomly picks. And so it generates thousands of different kinds of questions. And then it puts them in 10 categories, random categories.
So for example, in this case here, the person is being asked what is it that this is pointing to. So this looks like a pretty straightforward babesia. And so you basically say, OK, this is babesia and say submit it. And then it says immediately– because it’s an active learning process, it says correct, it was babesia.
So let’s say you’re comfortable with that. You don’t want to really read about this. Now, you want to go to the next one. The next one basically says, oh, this looks like a May-Hegglin, but let’s say the person is not used to this and may think that’s– by accident, they were wrong, that they thought it was hyper-segmented neutrophil. Well, if they put in hyper-segmented neutrophil, it will tell them, no, the answer was May-Hegglin.
But then now, you don’t have to wait till the end of the quiz to be able to learn about it. You could actually just actively learn about this by putting the quiz on hold, by clicking the Review. And it puts you directly into the full atlas and then puts the quiz on hold. And then you can learn about May-Hegglin right away and then say, OK, I feel good about it. Let’s go to the next one.
And then at the end of the 10, it gives you a score. And then you basically look at the score and it says, good job, or better luck next time, that sort of thing. And so we’ve had labs where, even though this doesn’t get an official credit, some labs have actually used it for competency internally to see how either their CLSs are doing, CLSs in training are doing. Or yeah, we’ve done it here, in other places with our residents and our fellows. So again, the possibilities are infinite in terms of things that you can do with it.
But besides that, if time is a limit, we’ve also built this virtual slide. And this is a patient. Well, it’s not a real patient, obviously, because this patient has basically every single abnormal cell that you can think of, and even some of the normal cells.
And so in order for us to build this, I had a case of a markedly pancytopenic patient, which I had lots of empty areas. So we basically pasted all or most of our cells within these and stitched several hundred digital frames so that the user can quickly test themselves if they are basically short on time.
So if I wanted to see what this is, this is a band neutrophil. But then I can shift the slide to the left or the right and test myself further. So this one is a blast that I basically picked. This one happens to be a target cell, and so on and so forth. So again, different ways for different people to learn. That’s the idea behind it.
We’ve put in also, in case there are areas of the world, especially who are basically devoid of, let’s say, Wi-Fi access, we’ve put this as a standalone program within an iPad and iPhone, and Android will be coming soon, so that people can actually run these on their own personal devices as well, too, and not dependent on being on the internet. So that’s it.
And we’ve had really good feedback. We’ve actually used this feedback to improve the program over the last ten years. And we’re grateful that it actually has improved it. And now it says 250,000, but it’s now over 300,000 folks who have used it. And it’s from different countries. And we’re grateful that it’s been adopted, as was seen, by ASCP for board certification for CLS and hematology specialists in CLS groups.
So that being said, this gives me– this is a completely different topic. But I’m going to switch gears now and now complete this with the future tools. And these are future tools that we could be using not just for education but for actual clinical practice, quality management, research studies, whatever your main idea may be. And these are artificial intelligence, machine learning platforms that I’m sure many of you have been used to and have been using, probably, some of you.
But in case you haven’t, and you want to know what artificial intelligence machine learning is– and I encourage you to please read our review paper in Academic Pathology, which is a nice long review of artificial intelligence but pretty concise in terms of what it covers, because of unsupervised learning.
And exactly– artificial intelligence is really a part of– machine learning, excuse me, is a part of artificial intelligence. And per some of the experts or the fathers of artificial intelligence, when they first– let’s say, Arthur Samuel. They said it’s basically a type of platform that will be able to automatically learn from experience without explicit programming. And those of us who are actually in this world, we know, even though that’s the definition, but in reality, there is programming involved.
So with that being said, once you’ve actually built the thing, the idea is that it will be able to do the task for you automatically. That’s the take-home point.
It has a lot of things in common, machine learning and human learning. In humans, we learn from touch, feel, smell. And in machine learning, the same thing’s the input. Instead of experiences, it’s data. That’s what it comes down to. So it’s important to know exactly what kind of data is getting imported.
And examples of AI/ML in our daily life include– I’m sure most of us are using Siri or Alexa. If you don’t, and you think that you don’t use it, I’m sure you are because you’re using emails. And within your email, you have spam filtering, which is a part of machine learning. And so people are using these platforms, even if they may not be fully aware of it. But I’m pretty sure this audience is fully aware of this.
That being said, the main question that people ask is that, is there a difference between machine learning for medical applications versus non-medical applications? And the answer is absolutely. There’s a huge difference between it. And the biggest difference is that, for those of us who are practicing medicine, we know that medicine is a balance between art and science. So there is this experience-driven process that basically introduces intra-observer variabilities. And so since that’s the case, the goal would be to build things that are minimizing intra-observer variabilities so that they can actually be accurate and generalizable with future stuff.
Most of the stuff that we actually do within the machine learning world and medicine and pathology is really within supervised machine learning. And exactly what supervised machine learning is is this. And I’m not– for the sake of time, I’m not going to go through the rest of them.
Machine learning gets really separated into supervised, unsupervised, and reinforcement learning. Reinforcement learning is not much used currently in the medical sciences. But the two that are used regularly are the unsupervised and supervised platforms, especially the supervised platform. And that means that somebody’s knowledge is basically being translated into the machine. And then ultimately, that machine has learned from experts’ knowledge base and can now predict future stuff based on what they’ve actually inputted.
The two main types of supervised learnings are classification, which is, by far, the most common thing that we use. And the second one is regression. Really, the big difference between them is regression is– let’s say you want to tell, based on the size of your house, and how many bedrooms it has, what’s the exact price that it will go for. So it’s a number output that it gives you, versus classification tasks will give you a discrete, qualitative outcome. So let’s say cancer versus no cancer or acute kidney injury versus no acute kidney injury. And I’m sure you get the point.
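[Editor's illustration] The contrast between regression and classification can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: all numbers are invented for illustration, the one-feature least-squares fit and the nearest-centroid rule are simple stand-ins chosen here, and none of it comes from the speaker's models.

```python
from statistics import mean

# --- Regression: the output is a number (for example, a house price) ---
# Invented illustrative data, not from the talk.
sizes = [1000, 1500, 2000, 2500]            # square feet
prices = [200000, 300000, 400000, 500000]   # dollars


def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 1800 + intercept  # a continuous, numeric prediction

# --- Classification: the output is a discrete label ---
# Invented marker values with made-up labels.
marker = [0.8, 1.0, 1.2, 3.0, 3.4, 3.8]
labels = ["no AKI", "no AKI", "no AKI", "AKI", "AKI", "AKI"]


def fit_centroids(xs, ys):
    """Average marker value per class (a nearest-centroid classifier)."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: mean(vals) for label, vals in groups.items()}


def classify(x, centroids):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


centroids = fit_centroids(marker, labels)
predicted_label = classify(2.9, centroids)  # a qualitative prediction

print(predicted_price, predicted_label)
```

The same training pattern (learn from labeled examples, then predict for a new case) applies to both; only the type of output differs.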
So classifications, just because you can actually be building these things, and people are doing classifications tasks, doesn’t necessarily mean that it actually is applicable and translatable easily within our world, from within the medical world. And the real world can really be a tough place, as this slide basically mentions.
And this was the experience that I’m sure many of you have heard of, which was the M.D. Anderson IBM Watson. Fantastic groups of people, super intelligent, super bright folks got together and came up with an idea to be able to come up with an oncology expert advisor to be able to help target cancer therapy. That being said, without going into too much detail, in a project like that, even with the right team, ultimately, what happened is that it didn’t meet the ultimate goal of the project.
So lots of money was put into the place. But ultimately, if you’re not basically meeting the need for those of us on the front line, the tool may not be helpful. So it’s very important to know what is it that you’re trying to address when you’re building these AI and ML tools.
So where do we go from there, if it’s not that easy to build things? Our group’s approach has been to go after the safer applications, the lowest hanging fruits. And so what I mean by that, it’s things that we know are going to basically help us and have a group that includes pathologists, laboratorians, clinicians, other scientists, statisticians all getting together and figuring out, what is it that we’re trying to address clinically? Will it meet the needs clinically? And can it be done from an AI/ML platform?
So I’ll give you an example of our acute kidney injury model, which initially started as a proof of concept. It’s now way beyond that. And so the goal here was to build a model that predicts acute kidney injury. And for those of you who are not familiar with acute kidney injury in terms of how it’s currently done, we started this project within our burns population, which actually does even worse within that population in terms of using the KDIGO criteria. The KDIGO criteria is the criteria that’s the gold standard that everybody uses, which is the Kidney Disease: Improving Global Outcomes criteria, which is based on serial– and the keyword here is “serial”– creatinine and urine outputs. So as you can tell, because it’s serial creatinine and urine outputs, it can take days for you to be able to get an answer.
And then unfortunately, the sensitivity, especially within our population pool, which was burn patients, is not that great. It was in the 50s, low 50s.
So then what comes to the rescue was this thing called NGAL, which we had a clinical trial on, which is Neutrophil Gelatinase-Associated Lipocalin. And this is something that people in Europe are using. And they basically are swearing how much of a better job it’s doing compared to KDIGO. Reportedly, it’s going to be coming through the FDA and getting cleared in the US in the future. Don’t know exactly when.
But then the idea was that, since we had a clinical trial on this, our idea was that– and this is going to hopefully be part of our process in the future. Can we enhance the performance of the NGAL, which is better than the 50%? It now goes up to a 70% sensitivity. Can we enhance it further by using machine learning platforms?
So if you did not use machine learning, here’s without machine learning. And you wanted to see how your acute kidney injury group is doing compared to no acute kidney injury. So the red is acute kidney injury group. You’ll notice the NGALs that were high, they basically, mostly, paired up in the acute kidney injury. And the ones that were low in the no acute kidney injury. But even with that, there are plenty of NGALs that are low that are acute kidney injuries.
So by itself, it’s not going to be the full answer. But if you mix and match it with other parameters like BNP for, let’s say, cardiorenal issues within an AKI patient, and along with the classic markers like creatinine and urine output, you may end up with a better model that will be able to ultimately get better performance criteria, better ROCs, better sensitivity, specificity, whatnot.
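[Editor's illustration] For readers less familiar with the performance criteria mentioned here, sensitivity and specificity come straight from the counts of correct and incorrect predictions. The sketch below uses invented labels (1 = AKI, 0 = no AKI), not study data, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Fraction of true disease cases that the test catches."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of disease-free cases that the test correctly clears."""
    return tn / (tn + fp)


# Invented example: 4 patients with AKI, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this made-up example the test misses half of the true cases, which is the kind of shortfall a sensitivity "in the low 50s" describes.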
So the idea was, OK, let’s see if we can build something like that. So the burns population proof of concept allowed us to build a model. And it specifically was on a very simple, k-nearest neighbor model that could enhance NGAL’s performance. And that was by combining– we found out that combining BNP, creatinine, and urine output, along with the NGAL, it actually improved the sensitivity and accuracy, and it brought it up into the low 90s, which was incredible. And hence, the intellectual property that ensued.
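[Editor's illustration] A k-nearest neighbor classifier over the four markers named here can be sketched as follows. This is not the authors' model: the feature values are invented placeholders with no clinical meaning, and the choices of k = 3, z-score scaling, and Euclidean distance are assumptions made only for this example.

```python
from collections import Counter
from math import dist
from statistics import mean, pstdev

# Columns: NGAL, BNP, creatinine, urine output.
# All values are invented placeholders, not patient data.
train_X = [
    [50, 40, 0.8, 1.2],
    [60, 55, 0.9, 1.0],
    [70, 60, 1.0, 1.1],
    [300, 400, 1.6, 0.4],
    [350, 500, 1.9, 0.3],
    [280, 450, 1.7, 0.35],
]
train_y = [0, 0, 0, 1, 1, 1]  # 0 = no AKI, 1 = AKI (made-up labels)


def fit_scaler(rows):
    """Per-column mean and standard deviation, so no single marker dominates."""
    cols = list(zip(*rows))
    return [mean(c) for c in cols], [pstdev(c) or 1.0 for c in cols]


def scale(row, means, stds):
    """Convert raw values to z-scores using the training statistics."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def knn_predict(row, X, y, k=3):
    """Majority vote among the k training rows closest to `row`."""
    nearest = sorted(range(len(X)), key=lambda i: dist(row, X[i]))[:k]
    return Counter(y[i] for i in nearest).most_common(1)[0][0]


means, stds = fit_scaler(train_X)
scaled_X = [scale(r, means, stds) for r in train_X]

new_patient = scale([320, 420, 1.8, 0.3], means, stds)
prediction = knn_predict(new_patient, scaled_X, train_y, k=3)
print(prediction)
```

Scaling matters for this kind of model because the markers are on very different numeric ranges; without it, the largest-valued marker would dominate the distance.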
So our follow-up AKI study, what have we done after that? We wanted to see the models that we had built within our burns population, can they be applied to non-burns population for predicting acute kidney injury. And the nice thing about it is that we found that it very much could be. So the models we had built with the burns population could definitely be applicable to the non-burn trauma patient population, with different algorithms included.
And here is an example of, if you did not use NGAL, you notice the ROC AUC, regardless of the algorithm, is not the greatest. But when you do bring NGAL into the picture, the ROC AUC improves dramatically. And this basically shows the same picture.
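[Editor's illustration] ROC AUC can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes it that way for two sets of invented scores; the score values and the "without" and "with" names are hypothetical and do not reproduce the study's results.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented labels and scores for two hypothetical models.
y_true = [1, 1, 1, 0, 0, 0]
scores_without = [0.6, 0.4, 0.7, 0.5, 0.3, 0.65]  # weaker separation
scores_with = [0.9, 0.8, 0.7, 0.2, 0.3, 0.4]      # clean separation

auc_without = roc_auc(y_true, scores_without)
auc_with = roc_auc(y_true, scores_with)
print(round(auc_without, 3), round(auc_with, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the comparison described here is being made.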
So in summary, the acute kidney injury machine learning models trained on burn populations were able to predict acute kidney injury in non-burn trauma and burn patient populations. And machine learning enhanced the predictive capability of NGAL and NGAL combined with other biomarkers. So by itself, it’s not as good as if it was with other biomarkers along with machine learning.
But really, the biggest thing, besides the improvement in terms of its performance, was that we did all of this on admission criteria, or admission serologies. So it means that, instead of waiting several days to get the acute kidney injury diagnosis, which on average, on KDIGO, would be 61.8, we actually did it a couple of days sooner. And that was really the big, groundbreaking part.
And quite frankly, we published our information. And then a few months later, we were glad to see that one of the Google groups actually published something similar within VA groups. And they were seeing a similar thing to what we had seen, that they can actually use machine learning to shorten the time to acute kidney injury diagnosis. So this is exciting stuff.
So exactly what does that mean? It means that instead of waiting several days and using just urine output and creatinine to come up with an acute kidney injury versus no acute kidney injury, you can actually use machine learning to actually do it way sooner, at the time of admission, so potentially, even as a point-of-care device.
So can this– the main question you may be asking right now is like, oh, that’s fine. Sounds good to you because you have a group, and you guys are all basically the experts in this stuff. But I’m not in that world. So can this process– can I be in that world? Is there an automated version of this process that I could use? So if I have a great idea, can I apply it and build a machine learning model? So basically, something where all investigators can have easy access to machine learning methods, where there will be no machine learning expertise required, no programming or software engineering background needed, just using a website?
And our solution to that was MILO. So we built it exactly for that. And we call it your fully automated machine learning solution. So we’re hoping that this will allow people in the future to be able to use this without having any background in this. Because ultimately, we know machine learning, artificial intelligence platform algorithms are being incorporated in health sciences. But when people see these figures, even if I’ve removed most of the mathematics behind it, it doesn’t really induce confidence.
So really, there are major challenges within the machine learning world. One is it’s intimidating. Most people do not want to go after it because it requires coming up with a team that has machine learning expertise, statistics, programming expertise in place.
And then very time-consuming– so most of us, as physicians, we’re working long hours. This is a side thing for a lot of people would be thinking about. So if something that’s going to take you months to maybe a year to do, you may not be able to– even if you want to, you may not be able to have the time to be able to dedicate to it. So to be able to find something that could actually save time would be a huge help.
And then, ultimately, as somebody who edits and reviews machine learning articles all the time, the one thing that I’ve seen in my experience and in the years past has been people have been publishing fantastic models and literature, beautiful models. But the problem is, as most of you know, 99% of them don’t come to product. They’re not used in regular workflows.
And obviously, that’s multifactorial. You know, some of it has to do with medical legal aspects. Some of it has to do with IT resources. Some of it has to do with expertise. Because really, ultimately, when you build a model, you’re building it in one language. It’s typically Python. And when you translate it to the one that’s going to be deployed for people to use on a website, that’s typically on an HTML 2 or some other web-based format that people are going to be looking at. So that translation requires a whole different set of expertise.
So this actually came from a friend, who basically said, I imagine a world where artificial intelligence machine learning studies are as easy as using a website on your laptop or even your smartphone. And I couldn’t have said it better myself. And so really, they basically said, oh, it would be great if there was no machine learning, statistics expertise needed.
To be fair, though, you will need to know something about statistics because the performance criteria that are spit out by the machine learning models are statistically based. But really, no machine learning expertise needed. It would be great. No software engineering expertise or programming required, just easy to do.
And our solution was MILO, which basically allows the investigator to do machine learning studies with no coding, programming, or machine learning expertise required. All the heavy lifting is done by MILO.
One of its key highlights is that it expedites the machine learning process. And I’ll show you the data in terms of how much faster. It’s so much faster than a non-automated, traditional approach. It also improves performance over the non-automated, traditional approach. And I’ll show you the data on that as well. And it becomes common sense once you see how it does it.
And then most importantly, is that anybody can do it because it’s on a very user friendly UI, user interface. So if you can use a website and you’ve attached files to an email, you can work with MILO. It’s that simple.
So exactly how does it expedite the machine learning process? This is now two separate IPs that I basically showed you earlier. If we’re looking at, let’s say, our acute kidney injury project, we worked nonstop on this. It took us four months to come up with this at full speed. And MILO was able to do all of that work, which had required all the people getting together, automatically in less than a day. So less than 20 hours. The same thing with our sepsis project, which took three to four months to do. MILO was able to do that whole thing and give the final results in less than a day as well. So just so much faster.
Most importantly, it’s a validated platform. This is, I feel, one of the biggest attributes and biggest strengths of our platform: it’s validated on a large number of IRB-approved studies. That actually says 10. It’s actually, technically, 11 now.
And if you notice, right now, from the 10 that are listed here– and they’re not just all from UC Davis. This is UC Davis now with other centers included. And so they’re multi-center data. It’s from different topics, so acute kidney injury, sepsis prediction, kidney transplant, delayed graft function, cardiology cath results. We even have our global health tuberculosis, active tuberculosis project, or even transfusion medicine, massive transfusion protocol.
But it’s not all health care data. It can also be used for student evaluation. So we actually have two models that are actually used by our medical school now. So this is not just an idea. It’s actually a tool that’s being used, with models that are able to identify students at risk of not doing well on future USMLE exams a year or two earlier. So again, it’s a validated model that way.
So exactly how does it work? So if you don’t remember anything that I’ve just told you so far, this is the take-home point for MILO. MILO’s thing is not magic. It just makes no assumptions whatsoever. So if you came up to me as a machine learning expert, I will have my own biases in terms of what’s best for your data set. So if you give me your data set, I may tell you, oh, the best algorithm or feature selector combination may be– the best algorithm will be random forest for you. You go to talk to another machine learning expert. They may say neural network. Another one will say logistic regression. And so each person has their own biases.
With MILO, we make none of those assumptions. So instead of the machine learning experts or the data science experts making the assumption about what’s best for your data set, your data set decides what’s best for it by using MILO. So MILO becomes a matchmaker.
And it follows the CRISP-DM approach, which is the Cross-Industry Standard Process for Data Mining. You may be familiar with it. It’s IBM’s modified approach. And we’ve actually made it even stricter because we’re dealing with health care data. And I’ll show you how.
So here’s the CRISP-DM approach, which is a very simple idea, which is that experts gather the data, so you collect your data sets. And then MILO will do everything within this yellow part that’s within the CRISP-DM. So it does the cleaning of the data. It does the scaling. It does the feature selection or transformations when they’re needed.
It actually builds the models on a subset of the data from the training set. It validates it on a subset that was left out. And then obviously, it cross validates to make sure that it’s statistically sound with seven different algorithms that are the most commonly used algorithms out there, including our neural network, our custom neural network. And then finally, it actually gives you a validation not based on the initial data set, but we require a secondary data set being built, which represents the true prevalence of whatever you’re looking to make the model for. So based on that, it gives you two test results back. And really, the secondary test result is the real test result of how well the model is doing.
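The workflow described here (fit on part of the initial data set, validate on the part that was left out, repeat across folds, then score once on a separate data set) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MILO's implementation: the function name `k_fold_indices`, the fold count, and the row count are all hypothetical.

```python
import random

def k_fold_indices(n_rows, k, seed=0):
    """Yield (train_idx, val_idx) pairs so every row is held out exactly once."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)       # fixed seed keeps the split reproducible
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds of near-equal size
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx

# Hypothetical sizes: 500 rows in the balanced training set, 10 folds.
for train_idx, val_idx in k_fold_indices(n_rows=500, k=10):
    # A model would be fit on train_idx and scored on val_idx here.
    assert not set(train_idx) & set(val_idx)  # no row is in both parts

# The secondary (generalization) data set never enters this loop;
# it is scored once, after a model has been chosen.
```

Keeping the secondary data set out of the loop is what makes its result an honest estimate: nothing about it can influence which model gets picked.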
And if the model is doing well, then you can take it to the next level and deploy it. So for the part where people would be stuck and not know where to take it next, we also allow the user to actually deploy it and start using it live. And I’ll show you how that’s done in a second.
So exactly what does that mean? Can it find the best models? So this was our approach with our sepsis study, which took three to four months to build– and this is one of the largest machine learning studies out there, with about 50,000 models. And we found within our burns population a fantastic model that had a fantastic ROC AUC. We were very excited.
Our main question was, can MILO find that same model? And the answer was absolutely. MILO found it. But most importantly, MILO found it in less than a day. And not only did it find the model, but because it had built 300,000 models instead of 50,000, it also found five additional models that actually beat the original one. So that’s where outperforming the non-automated approach comes in.
By the way, this thing that I’m just showing you here with the sepsis, we’ve actually repeated it on multiple other data sets. And we’re getting the same things. So that’s where it makes this very exciting.
So how does it achieve this? It’s a very simple idea: it builds multiple hundreds of thousands of models in a very directed way. Because of the number of feature selectors you have and the number of algorithms you have, the possibilities really could be in the billions. But we’ve confined it within our approach to be within a few hundred thousand, where the user then decides, within that few hundred thousand, what’s the best one for them, with our help. And it follows best practices within machine learning.
So here’s the quick demo of MILO. So let’s say you want to build a model that predicts sepsis. And so you bring in two data sets. The initial one happens to be a balanced data set so that you can actually build the models and initially validate it. So it’s a very simple binary classifier.
So here’s the example of our data, where we have features. These patients had– so there’s about 500 patient events here that have temperature, ventilation status, hemoglobin, hematocrit– you know it, all of these features. And we were mapping those features to the sepsis, the ones that were sepsis positive– and I mean, negative, sorry, zeros– and the ones that were sepsis positive, so about 500 cases.
And so in MILO, it’s as simple as attaching– because the first step in MILO is selecting the data, is attaching the training data. So you just basically press the button, and you attach it as an Excel file, a .csv Excel file, to MILO.
And then once you’ve attached, it puts a checkmark next to it. And then you do the same thing now within the one that is the true prevalence of the disease. So you can see how well it really does, not based on just the primary validation but really the generalization.
And so you bring in a secondary data set. In this case, it was a data set of patients that had similar features. But the true prevalence was 20% sepsis rate. And so the features all match, but now, the sepsis, most of them were negative, and 20% of them were positive.
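One way to see why the second data set matters: with the same sensitivity and specificity, the share of positive calls that are truly positive depends on how common the disease is. The sketch below applies Bayes' rule at the two prevalences mentioned (50% in the balanced set, 20% in the secondary set). The sensitivity and specificity values are assumed for illustration and are not MILO results.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Bayes' rule: probability of disease given a positive call."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Assumed, illustrative performance: sensitivity 0.90, specificity 0.85.
for prevalence in (0.50, 0.20):
    ppv = positive_predictive_value(0.90, 0.85, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```

With these assumed inputs, the positive predictive value falls from about 86% at 50% prevalence to 60% at 20% prevalence, even though the model itself has not changed.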
So then you basically do the same thing. You basically upload the data as an Excel file. And now, the third step is you select the target. And the target happens to be, can you fix those so they can predict sepsis from no sepsis? So sepsis happens to be the thing.
MILO’s approach is a very simple, four-step approach, which includes selecting the data, exploring the data, then training it in where you’d build all the models, and then finally, viewing the model’s performance. And then you as the user, as the investigator, you can decide which model is best for you. And if you’re ready, you can deploy it.
Exactly how does it work in real life? Here’s the video. So again, a four-step process. In the first step, you just basically pick the data. So you select the training. And then you basically say, OK, that’s the one that I want. And by the way, if you don’t pick the right one, it actually gives you the error, and says you have an error in this area or that area so you can go fix that Excel sheet for yourself.
So this basically pulls in all the Excel sheet features. Sepsis was the target. Now, you go to the next step, which is exploring the data to see what was ingested in MILO. On the left side is the training data initial validation. On the right side is the generalization data. As you can see, it was a 50-50 on the left side and the true prevalence on the right side, which is the real performance, which has nothing to do with the training itself.
And so now, you go into the training part. By default, we actually recommend running everything because you don’t know. And we want to make no assumptions. But let’s say you really want to save time, or you don’t want to run it with all the features, or you want to run it with specific feature selectors. Well, you can just uncheck certain things and run it customized, the way you like it.
So let’s say now you’re happy with that. And then you build it, because this is going to build all the pipelines for you. And what I mean by pipelines: a typical machine learning study typically has one. If you’re a bigger group, you may have 10. This, by default, when you’re running all of them, is going to build about 2,000 pipelines. So these 2,000 pipelines are going to give rise to several hundred thousand models. And if you notice within them, they’re doing different things. So each one of these algorithms is paired up with different feature selectors and whatnot to build all of these different models.
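The way pipelines multiply can be illustrated by taking every combination of a few building blocks. The option lists below are hypothetical (the talk does not enumerate MILO's scalers, feature selectors, or algorithms), so the counts show only the arithmetic, not MILO's actual numbers.

```python
from itertools import product

# Hypothetical building blocks, chosen only to illustrate the counting.
scalers = ["none", "standard", "min_max"]
feature_selectors = ["all_features", "top_25_percent", "top_50_percent", "top_75_percent"]
algorithms = ["logistic_regression", "k_nearest_neighbors", "random_forest", "neural_network"]

# One pipeline per (scaler, feature selector, algorithm) combination.
pipelines = list(product(scalers, feature_selectors, algorithms))
print(len(pipelines))  # 3 * 4 * 4 = 48

# Each pipeline is then tried with many hyperparameter settings, so the
# number of fitted models is much larger than the number of pipelines.
settings_per_pipeline = 100  # assumed, for illustration only
print(len(pipelines) * settings_per_pipeline)  # 4800
```

Adding one more option to any list multiplies the total, which is why a modest menu of choices grows into thousands of pipelines and many more models.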
Obviously, it’s going to take a while, so I’m going to say a little while later. And then once it gets up to 99%, and when it gets to 100%, it basically automatically clears out and goes to the results page so that you, as the user, can kind of see and evaluate the models.
The top part is the graphs, and the bottom is a live table. Each row is a model that you can select. So if you’re picking the model that’s in gray, that gets translated to the graphs that are on top. And because it’s a live table, you can decide which measure you’re looking at that’s important to you: sensitivity, specificity, their average, accuracy, ROC, or Brier score. You can also see how many models were built. So this one has built 129,000 models. And it tells you how many patients the models were trained and tested on, how many they were validated on for generalization, and what kind of statistics were used.
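Two of the measures named here, ROC AUC and the Brier score, are simple enough to compute by hand. The sketch below uses plain Python and a made-up four-patient example; it shows what the numbers mean and makes no claim about how MILO computes them.

```python
def roc_auc(y_true, scores):
    """Chance that a random positive case scores above a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, probs):
    """Mean squared gap between predicted probability and the 0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)

# Made-up example: true labels and predicted probabilities of sepsis.
y_true = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, probs))      # 3 of the 4 positive/negative pairs are ranked correctly
print(brier_score(y_true, probs))
```

ROC AUC looks only at ranking, while the Brier score also penalizes probabilities that are far from the actual outcome, so the two can disagree about which model is best.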
But then more importantly, based on this, let’s say you now have picked this model. And you said, OK, I really like this model. This was a k-nearest neighbor, and it had a sensitivity of 100%, specificity of 85. But it only picked 50% of my features. I want to know exactly what those 50% are, where, in MILO, it’s very transparent. So because we’re transparent, you can actually now download this whole table and get a lot more granularity by downloading the whole thing as an Excel file.
So that Export on the top right lets you download it immediately as a full Excel file. And now, you can open it and see exactly what was in these models in terms of more granularity. You may even want to repeat the stats yourself. So we give you the confusion matrix with the true positive, true negative, false positive, false negatives so you can redo that if you like. We give you the confidence intervals. We give you the parameters of the model itself, if you want to rebuild it in your own environment.
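Redoing the statistics from an exported confusion matrix takes only a few lines. The sketch below is a plain-Python illustration with made-up labels; the function names and dictionary keys are hypothetical and do not reflect the column names in MILO's export.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(t == 1 and p == 1 for t, p in pairs),
        "tn": sum(t == 0 and p == 0 for t, p in pairs),
        "fp": sum(t == 0 and p == 1 for t, p in pairs),
        "fn": sum(t == 1 and p == 0 for t, p in pairs),
    }

def summary_metrics(c):
    """Sensitivity, specificity, and accuracy from confusion counts."""
    return {
        "sensitivity": c["tp"] / (c["tp"] + c["fn"]),
        "specificity": c["tn"] / (c["tn"] + c["fp"]),
        "accuracy": (c["tp"] + c["tn"]) / sum(c.values()),
    }

# Made-up example: 3 septic and 5 non-septic patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)
print(summary_metrics(counts))
```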
But let’s say in this case here, you cared about the selected features. And then you basically can quickly say, OK, the selected features happen to be this column right here. And K, I believe, right there. And then so that one, basically, I just expanded it, and I can quickly see, oh, respiratory rate, temperature, WBC, that happens to be within the group that’s the best one compared to the ones that used all the features. So now, you know exactly which ones MILO picked as your best features when it was building the model. So half the features were better than all the features. So sometimes, more features are going to hurt the model.
And now, if you’re happy with that model, you can actually go live and test it. So this is now your new, live model. You can manually test it, or you can download it as a PMML file, which is a standard file format that we can actually put in, or as a joblib file, that you can put into, let’s say, Epic or other platforms.
And so here is, let’s say, a manual one. Let’s say it’s not automatically populated and you want to manually quickly test it. On the right side, it tells you the model performance. I’m just manually putting in the classic sepsis case, OK? So this is somebody who will have sepsis most likely. So I will say predict it based on this model. And this says, yes, the model also predicted it as sepsis, 62% probability.
So let’s say I’m feeling pretty good about it and say it sounds good. But let’s say you have a lot more things. Instead of manually, you would want to be able to do it automatically as a big batch. Or probably the best way to do it is to just build your own website from that model, so fix the model to your own website. And it takes literally a second or two to do that. So let’s say we just built the website for that model. So now the model is built. And I can now do a quick little manual or do a batch mode.
So now let me check on that generalization test set that I tested– just do a self-check to see what was called negative when it was truly negative, and what was called positive when it was truly positive. So here’s how fast you can actually get the results. You can quickly see all the negative cases where they were really negative. They were mostly called negative, but then every once in a while, some of them were called positive. And it also tells you the probability of this. So you can actually further investigate each one of these cases. Again, lots of transparency. Nothing is being hidden, so you can actually revalidate everything internally yourself.
So that being said, you can now go back to the home page from your thing, so seeing exactly how the home page is. You basically can now decide, OK, I want to go back to the home page and, let’s say, redo the study, or maybe go back to the actual website that I just built that had the model that I had an interest or the previous studies that I had, so like the sepsis that I just ran, or the actual models that I just built. So let me– the best KNN or the best massive transfusion protocol tests, or, let’s say, deep neural network, those are the ones that I can quickly go back to those and test them. And you can build as many models as you like to do.
So anyways, then long story short, it’s a super easy platform to use. No machine learning expertise is needed. It’s basically your machine learning pseudo group or your virtual group in a package. And so no programming or software engineering expertise is needed because it’s all done there. It just follows study best practices. And it does it a lot faster than the non-automated approaches. And most importantly, it’s a validated platform.
So I want to thank a lot of people who have helped during these studies. There are too many to actually thank individually. And with that being said, I especially want to thank our core MILO team, with whom we constantly chat, talk, and collaborate every other day. And I thank you for your attention. Thank you.
TRIONA HENDERSON: Thank you so much, Dr. Rashidi. This has really been exceptional, especially for our audience, who are community clinical laboratories or more in rural areas. A lot of the tools that you shared, I’m sure, will be really helpful.
In the interest of time, I’m going to keep my questions short. However, with your discussion first of Hematology Outlines, are you considering offering– I know, especially for our community clinical labs, they may not have time to go to national conferences. It’s very important to maintain certification, and as you said, competency for themselves and then their managers, who are reviewing that information annually. Are you considering or have you ever considered offering P.A.C.E.® credits, even if just for the testing modules or the individual tests that are offered on the site?
HOOMAN RASHIDI: Yeah, as you know, that part requires a lot more coordination. That’s a great point. And that has come up. And we’re happy to discuss in terms of partnering up with folks in terms of if they would want to use it to be able to offer that.
But from a resource standpoint, we just haven’t gone to that next step. It’s been mostly a study aid rather than– it’s people, when they’ve used it, it’s mostly been internal stuff, rather than counting towards an actual maintenance of certification or something like that. But happy to discuss that with whoever’s on the line or has an interest in that kind of a platform.
TRIONA HENDERSON: Awesome. And another question or clarifying question– I know you’ve mentioned a lot of current partners and users of MILO. For our community laboratories, can anyone just log in or access MILO online and put in their data?
HOOMAN RASHIDI: Yeah, that’s a great point. So right now, they can’t because, right now, we– literally, MILO’s patent filing– and we were going back and forth with the university lawyers up to a couple of months ago. And so that basically has now been in place. So we’re hoping that, in the next six months to a year, that it basically, through either the university or some third party vendor, that it could actually be provided as a tool that people can use. Because a lot of people are– actually, that’s the number one thing that everybody’s asking about is that how can I use it now.
And so before we actually go live with that part, right now, we’re doing further validations. So I only showed you the 11 IRB-approved data sets. We’re actually in the process to have– I think by the end of the year, we’re going to be approaching 20 of them. So there’s lots of people who are trying to do extra studies with us. And so with the more studies we do, the more we validate it, and the more products that are– or models that are actually used live that are used in quality projects and whatnot. And so, for example, we’re doing a COVID study right now with a couple of countries outside of the US. And they’re going to use MILO as a platform.
But again, these are going to be validation platforms that we would want to use. And we’re happy to discuss and collaborate right now in terms of doing stuff. But right now, it’s basically being used by a handful of super users within our core group of 30 people, or 37 people.
TRIONA HENDERSON: Understood. Thank you. Now, we will open up the lines for discussion and questions to Dr. Rashidi. Remember that everyone is muted. You can use the raised hand function and then unmute yourself to speak.
In the chat box, OK, from Dr. Denga for Dr. Rashidi. “How do you approach AI or machine learning tools for clinical use? How do you choose a reference method for validation? At what point do you move something from RUO to LDT? And how are they being viewed by the FDA for IVD use?”
HOOMAN RASHIDI: Yeah, that’s a terrific question. I knew somebody within this group would be asking that. So obviously, in our review article in Academic Pathology, we did cite that FDA white paper in terms of what the government is planning on coming up with in terms of best practices, validations, and whatnot. Personally, I see things like– so first off, right now, we’re doing validations kind of like a homebrew test kind of a thing. So these homebrew tests or LDT-type things, I’m sure, over time are going to have more oversight as they become more and more popular.
Unfortunately, and somewhat erroneously, some of these tools have been put in a similar category to calculators. So people felt that, oh, if you’ve been using a very basic statistical calculator, maybe this will be the same thing as that. And I personally don’t see it that way. I see this as being something very different. It has a lot of similarities. But the one big difference is that a statistical model is going to give you some result that you, as an investigator, will have to decide is relevant or not, whereas a machine learning model is predictive, so it’s actually going to give you an answer. And so because it’s giving you an answer like a test, it does have some of those LDT-type elements to it.
So I see in the future a need for having these in place. And quite frankly, there are multiple things that need to be addressed there. One would be having another platform that cross validates the first platform. So it’s just like when we buy a machine. Let’s say, my flow cytometry: I just finished validation with a second machine. So obviously, I cross validated it with external and also internal material to make sure that my new machine is going to perform similarly to the old machine. The same thing is going to be true with future machine learning platforms, where you are going to be given something, and some regulatory body would probably come through and say, hey, has this met the requirements in terms of internal validation, in terms of how it’s performing? But just as important, externally, when it spits out the answer, can you rebuild the models? And will it predict the same way?
The good news is that, within our platform, two things– to keep it simple, we’ve built it as a binary classifier only to make the statistics more straightforward because it’s obviously much tougher to come up with 95% confidence intervals when you have multiple classes. With a binary, it’s much easier. You know, the Clopper-Pearson or something like that– that’s what we use, actually.
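For a binary result, the Clopper-Pearson ("exact") interval mentioned here can be computed directly from the binomial distribution. The sketch below finds each bound by bisection using only the Python standard library. It illustrates the method itself and is not MILO's code; the helper names and the example counts are assumptions.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _solve(f, target, increasing, iters=100):
    """Bisection on [0, 1] for f(p) == target, where f is monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid  # the root lies above mid
        else:
            hi = mid  # the root lies at or below mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper

# Hypothetical example: 85 of 100 true negatives called negative (specificity 0.85).
low, high = clopper_pearson(85, 100)
print(f"specificity 0.85, 95% CI {low:.3f} to {high:.3f}")
```

Because sensitivity and specificity are each a simple proportion in the binary case, the same function covers both, which is part of why a binary classifier keeps the statistics straightforward.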
But that being said, because of that, we also enable the person to actually be able to rebuild the stuff because the platform we emulate is scikit-learn, which is the gold standard within machine learning.
So based on that, yeah, I definitely think that there’s got to be attention to that and oversight on that. I hope that answered it.
TRIONA HENDERSON: Thank you. And just a reminder to all the participants, you can enter your questions also in the chat box if you’re unable to use your microphone.
Here’s another question. “Where can participants from this session get information to keep current with the status of MILO?”
HOOMAN RASHIDI: Yeah, just email me directly. And we’ve even started collaborations with others outside of UC Davis, based on just starting a conversation.
I noticed from the list of people on there, some of the people listed on there are my old friends. Some of them were my attendings at Yale. So I do remember– I see some names, some very familiar names. And so–
But yeah, I’m always looking for new friends too. So old friends, new friends, bring it on. Because quite frankly, the reality is, with all of these things that we do, and especially in laboratory medicine, where 70%, as they say, of the clinical decisions are based on what we do, it’s very important for us to do it right. And the only way that you can really do it right is multi-site collaborations, making sure that you’re all working together and making sure you’re addressing everybody’s needs. So these are very important to do.
Now, the one thing that’s very nice about MILO is that I’m a firm believer that building one model in one place and applying it to every place is not a very good model. It’s not generalizable. Because what if you’re in a hospital system that has now brought in a new antibiotic, and then your data may change based on that new antibiotic or the environment? And then your sepsis patients may change based on that. So let’s face it. That’s not– the model that was built on a separate patient data set may not apply to you as well.
So using a system like MILO, you can actually build customizable, or let’s say, individualized machine learning models that are readily available within a day or two within your system, and that actually serve your needs better that way.
TRIONA HENDERSON: Perfect. I have a question, and I don’t know if you can answer.
HOOMAN RASHIDI: Yeah, of course.
TRIONA HENDERSON: You said that you were working on COVID-19 outside of the country with MILO. Is it proprietary? Or can you just share with us generally what you guys are working on?
HOOMAN RASHIDI: No, no, I can tell you. I mean, part of it has to do with the fact that it’s been very difficult to do stuff internally within the US because, as much as we have a lot of resources, you know, the data within the US has been very difficult to share. And that’s multifactorial, right– HIPAA and government agencies, all sorts of stuff.
But outside of the US, we’ve been approached. And one of them has been Pakistan. And we actually have ties or direct contact, I should say, with the highest level of their NIH. And because of those friendships that we’ve had, we are starting a study there. We’re also starting another one, not within the government sector, but in the private sector in Belgium.
So these are different things. We’re not– again, we’re not picky at this point. Ultimately, we want to see if we can learn something from other centers and not wait around to see what comes around here. And so hopefully, if we learn something from other centers, we can apply them here. And if some discussion like this stimulates new people to want to do stuff with us and build COVID-19 models with MILO internally within the US, I’m game.
TRIONA HENDERSON: [LAUGHS] Perfect. Thank you so much for this presentation. Can you advance the last slide for me please?
HOOMAN RASHIDI: Yeah, of course, yeah.
TRIONA HENDERSON: We just want to thank everyone for participating today. ECHO Sessions will continue next month.
Next month, we have Dr. Heather Signorelli, the chief laboratory officer from HealthONE/HCA in Denver, Colorado. And I know she’s been feverishly working with COVID-19 and setting up their hundreds of hospitals and clinics that she oversees. And so we’re really excited to have her next month, Friday May 22.
So the day is going to be different. Usually, for the past few months, we’ve been on a Wednesday. But it’s going to be Friday at 1:00 PM.
Please visit the DLS ECHO website to register for this session and to view and register for the subsequent sessions.
Thank you for taking part in today’s discussion. We hope you found it very valuable in the work that you’re doing and that we will engage you and what you’re doing in your clinical laboratories. We look forward to participation in future sessions. And we’ll keep you posted on any changes that occur.
Now we will adjourn. Thank you so much, and have a great day.
HOOMAN RASHIDI: Thank you.
Additional Resources and Related Publications
Supervised Machine Learning & Automated-ML platform MILO
- Yuan Q, Zhang H, Deng T et al. Role of Artificial Intelligence in Kidney Disease. International journal of medical sciences. 2020;17(7):970-984.
- Shi F, Wang J, Shi J et al. Review of Artificial Intelligence Techniques in Imaging Data Acquisition, Segmentation and Diagnosis for COVID-19. IEEE reviews in biomedical engineering. 2020.
- Sallah SR, Sergouniotis PI, Barton S et al. Using an integrative machine learning approach utilising homology modelling to clinically interpret genetic variants: CACNA1F as an exemplar. European journal of human genetics : EJHG. 2020
- Rashidi HH, Sen S, Palmieri TL, Blackmon T, Wajda J, Tran NK. Early Recognition of Burn- and Trauma-Related Acute Kidney Injury: A Pilot Comparison of Machine Learning Techniques. Scientific reports. 2020;10(1):205.
- Mango VL, Sun M, Wynn RT, Ha R. Should We Ignore, Follow, or Biopsy? Impact of Artificial Intelligence Decision Support on Breast Ultrasound Lesion Assessment. AJR American journal of roentgenology. 2020:1-8.
- Heo J, Park SJ, Kang SH, Oh CW, Bang JS, Kim T. Prediction of Intracranial Aneurysm Risk using Machine Learning. Scientific reports. 2020;10(1):6921.
- Harmon SA, Sanford TH, Brown GT et al. Multiresolution Application of Artificial Intelligence in Digital Pathology for Prediction of Positive Lymph Nodes From Primary Tumors in Bladder Cancer. JCO clinical cancer informatics. 2020;4
- Gameiro J, Branco T, Lopes JA. Artificial Intelligence in Acute Kidney Injury Risk Prediction. Journal of clinical medicine. 2020;9(3).
- Tran NK, Sen S, Palmieri TL et al. Artificial intelligence and machine learning for predicting acute kidney injury in severely burned patients: A proof of concept. Burns : journal of the International Society for Burn Injuries. 2019;45(6):1350
- Lee S, Mohr NM, Street WN, Nadkarni P. Machine Learning in Relation to Emergency Medicine Clinical and Operational Scenarios: An Overview. The western journal of emergency medicine. 2019;20(2):219-227.
- Stewart J, Sprivulis P, Dwivedi G. Artificial intelligence and machine learning in emergency medicine. Emergency medicine Australasia : EMA. 2018;30(6):870-874.
- Cabitza F, Banfi G. Machine learning in laboratory medicine: waiting for the flood? Clinical chemistry and laboratory medicine. 2018;56(4):516-524.
- Shimabukuro DW, Barton CW, Feldman MD, Mataraso SJ, Das R. Effect of a machine learning-based severe sepsis prediction algorithm on patient survival and hospital length of stay: a randomised clinical trial. BMJ open respiratory research. 2017;4(1):e000234.