In 2016, the article “Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs,” by Gulshan et al and the Google team, was published in JAMA. It was seen as a groundbreaking article for medicine and ophthalmology, specifically retina, as it gave us a glimpse into the reality of retinal diagnosis via machine learning algorithms (artificial intelligence, or AI). The paper showed impressive accuracy in detecting diabetic retinopathy (DR) from color fundus photos, matching and in some comparisons exceeding board-certified ophthalmologists.
In the months that followed, many articles, smaller papers and meeting discussions centered on the future of this technology; it was projected that it could first be used for highly accurate diagnosis of DR and other common retinal diseases before quickly leaping to full eye diagnosis and AI algorithms capable of interpreting multimodal image sets on any given patient. It was almost seductive to imagine feeding simple fundus photos, or even more complex images, into an algorithm that would return the correct diagnosis in seconds. No more fretting over issues such as white dot diseases or cancer- and melanoma-associated retinopathy (CAR and MAR), and no more wrestling with the true cause of subretinal fluid (chronic central serous retinopathy vs wet macular degeneration).
WHERE WE ARE TODAY
Unfortunately, in 2021, it feels as though we may be no closer to widespread autonomous AI diagnosis than we were 5 years ago. There have been great milestones in recent years, however: the FDA approved two AI technologies for diagnosis and screening of DR. IDx-DR (Digital Diagnostics) and EyeArt (Eyenuk), which can be used in conjunction with fundus camera images, have broken ground with the FDA, paving the way for what will likely be many new AI technologies in the coming years.
IDx-DR is FDA approved to give the following point-of-care determinations:
- More than mild DR detected: refer to an eye-care professional
- Negative for more than mild DR: rescreen in 12 months
EyeArt is FDA approved to give the following point-of-care determinations:
- More than mild DR
- Vision-threatening DR (severe DR and proliferative DR)
Both of these systems are similar and are aimed at non-eye-care settings, such as primary care offices, to offer screening to their diabetic patients. These systems have little to add to traditional eye-care practices (MD or OD), except when a practice has a very large diabetic screening population and limited provider resources, which is likely an uncommon scenario.
We also now have a growing knowledge base for gold-standard algorithm development (as recounted in a 2018 Ophthalmology paper on using automated algorithms for DR grading, by Krause et al). But it seems safe to say that these technologies are far from widespread, and adoption has been tentative. Furthermore, we have seen little progress on AI providing meaningful assistance to physicians in the clinic setting. So, what challenges does the use of AI for retinal diagnosis face today?
TECHNOLOGY
Technology is obviously at the core of this field, but it currently has limits. While detection and grading of DR, macular degeneration and retinal vein occlusion are within reach, most other eye diseases remain much more elusive. AI image detection outside the medical field is light years ahead of retinal AI algorithms.
A significant factor limiting AI detection of less common diseases is the number of training images available for machine learning algorithms to reach high accuracy. Images of DR are plentiful, especially in large publicly available data sets. However, images of rarer conditions are much more difficult to obtain in the numbers needed to train an algorithm successfully and accurately. This is especially critical for important, vision-threatening or rare findings (eg, papilledema, retinal detachment or intraocular tumor) in the screening population. It is a major problem in algorithm development, and it limits any comprehensive AI solution for retinal diagnosis.
Offering a complete retinal AI diagnostic suite would require dozens of algorithms, each needing tens of thousands of images. The images must be sourced, purchased and then labeled by paid physicians, which is not cheap. At this point, a community project in which images are uploaded and then labeled by volunteers over years may be the only reasonable path forward.
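To make the data requirement concrete, the sketch below shows what “training an algorithm” on labeled fundus photos involves, using transfer learning from a pretrained network, which is one common way to reduce (but not eliminate) the number of labeled images needed. The directory layout, class labels and hyperparameters are illustrative assumptions, not any vendor’s actual pipeline.

```python
# Minimal sketch: fine-tuning a pretrained classifier on a hypothetical
# folder of labeled fundus photos (fundus_images/<disease_label>/*.jpg).
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
data = datasets.ImageFolder("fundus_images", transform=tfm)  # assumed path
loader = DataLoader(data, batch_size=32, shuffle=True)

# Start from ImageNet weights so the rare-disease classifier needs far
# fewer labeled examples than training from scratch would.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(data.classes))

opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):  # illustrative epoch count
    for images, labels in loader:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()
```

Even with transfer learning, a classifier like this only becomes clinically reliable when each class folder holds thousands of expertly labeled examples, which is exactly the bottleneck for rare diseases.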
One promising concept was that retinal images could be linked to systemic disease and risk factors; for example, AI could detect vascular changes in the retina that closely correlate with myocardial infarction or cerebral stroke. Preventing these events would have huge value to health care. This likely still has promise but seems very difficult to achieve in practice. One very large issue is pairing longitudinal outcome data with retinal images; those kinds of data sets are incredibly costly and hard to manage.
While modern advanced medical technology can seemingly be created to address nearly any technical issue in medicine or surgery, it is sometimes hard to replace a good doctor, at least in a cost-effective manner. AI is a classic example. Consider the costs of developing an AI retinal diagnostic system: image acquisition and labeling, general R&D, AI engineers, return on investment and ongoing profits. Those costs add up. Then compare that with a retina specialist with a laptop reading fundus images at a reasonable pace. Human doctors are often hard to beat.
REGULATORY
AI systems have faced an uphill battle with the FDA. However, the main issues are not with the core technology of disease detection and diagnosis, which is well proven at this point. The challenges come with the required approval for each combination of a specific algorithm with a specific camera: each pairing must have a separate trial and FDA approval process. Then, with the addition of a new disease-detection algorithm, a trial must be done on each camera yet again.
While this may seem needless on the surface, it’s not: different cameras have very different image characteristics and gradeability rates. But it makes validation a combinatorial problem rather than a linear one, because the number of required trials is the product of the number of cameras and the number of algorithms. Dozens of common cameras are on the market today, and new, improved models are continually being added. This regulatory issue severely limits adoption, as moving to an automated system usually requires purchase of a new (approved) camera.
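A back-of-envelope calculation shows how quickly this multiplies; the counts below are assumptions for illustration only.

```python
# Each disease algorithm must be separately validated on each approved
# camera, so the required trials grow multiplicatively.
cameras = 24       # "dozens of common cameras"
algorithms = 10    # a modest multi-disease diagnostic suite

trials = cameras * algorithms
print(f"{trials} camera-algorithm validation trials")  # 240

# And every new camera added to the market forces one new trial for
# each existing algorithm:
print(f"+{algorithms} additional trials per new camera")
```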
As with most software, iteration is the key to progress. This is especially true in the field of AI, which is new and growing quickly. The current regulatory environment hampers this process, providing an obstacle for development and adoption of this promising technology.
CODING
In the last year, there has been a shift in the reimbursement landscape benefiting AI for DR, driven by adjustments to four CPT codes. First, the introduction of an “automated” retinal screening code (CPT 92229) was a long-awaited step. In addition, there were shifts in wording and reimbursement levels across all four codes (see Table).
| CPT Code | Description | Medicare Reimbursement |
| --- | --- | --- |
| 92250 | Fundus photography | Physician office: $39.78; hospital outpatient: $111.95 |
| 92227 | Imaging of retina for detection or monitoring of disease; remote clinical staff review and report, unilateral or bilateral | Physician office: $16.05; hospital outpatient: $33.84 |
| 92228 | Imaging of retina for detection or monitoring of disease; with remote physician or other qualified health care professional interpretation and report, unilateral or bilateral | Physician office: $31.06; hospital outpatient: $33.84 |
| 92229 | Imaging of retina for detection or monitoring of disease; point-of-care automated analysis and report, unilateral or bilateral | Physician office: $28.42-$55.00, per local MAC; hospital outpatient: $55.66 |

Note: Private payers usually pay somewhat more than Medicare, and Medicaid pays less. Some payers and MACs do not pay for DR screening at all.
Currently, CPT 92229 is the proper code for automated AI diagnosis with point-of-care reporting. It’s important to note that this requires immediate (point-of-care) diagnostic reporting, at least initially. With the currently FDA-approved systems on the market, this means a determination of “more than mild” referable DR vs no DR is delivered within seconds of the image being taken. However, having a physician review the screening images, even after the AI point-of-care analysis, is a good option in this situation because of the narrow diagnostic scope of current AI systems, which are only approved for DR.
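In software terms, the point-of-care determination is a simple branching report, as in this hypothetical sketch built from the IDx-DR outputs listed earlier (the function name and boolean input are assumptions for illustration):

```python
# Hypothetical point-of-care report, mirroring the two FDA-cleared
# IDx-DR determinations described above; real systems also report
# ungradeable images and other statuses.
def point_of_care_report(more_than_mild_dr: bool) -> str:
    if more_than_mild_dr:
        return "More than mild DR detected: refer to an eye-care professional"
    return "Negative for more than mild DR: rescreen in 12 months"

print(point_of_care_report(more_than_mild_dr=False))
```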
Still, most screening programs require that their patients be screened for all common vision-threatening posterior segment disease, such as glaucoma, macular degeneration, papilledema and obvious tumors and retinal detachments. It seems obvious that diagnosing all pathology contained in an image is the best medical practice. AI simply is not advanced enough to accomplish this comprehensively, while human graders have this capability.
Note that 92229 reimburses the most of the retinal imaging codes only in hospital outpatient locations. Reimbursement is lower in non-hospital settings, where it is determined by the local Medicare contractors. Medicaid rates are set state by state and are lower than Medicare in most instances. This confusing and spotty reimbursement landscape is a challenge for the diabetic screening market as a whole.
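For practices modeling expected revenue, the variability reduces to a lookup over the table above; this minimal sketch uses only the Medicare figures shown there (the 92229 office rate is a range because it varies by local MAC):

```python
# (physician office, hospital outpatient) Medicare rates from the Table.
MEDICARE = {
    "92250": (39.78, 111.95),
    "92227": (16.05, 33.84),
    "92228": (31.06, 33.84),
    "92229": ((28.42, 55.00), 55.66),  # office rate varies by MAC
}

def reimbursement(code: str, hospital_outpatient: bool):
    office, hospital = MEDICARE[code]
    return hospital if hospital_outpatient else office

print(reimbursement("92229", hospital_outpatient=True))   # 55.66
print(reimbursement("92229", hospital_outpatient=False))  # (28.42, 55.0)
```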
CPT 92250, once the primary code billed in primary care and other non-eye-care settings, is facing headwinds. Reimbursements have declined, and some payers no longer pay it when it is not billed in conjunction with an eye-care exam and an eye-care provider taxonomy.
LOOKING AHEAD
We may see an uptick in adoption of AI systems for DR, due to the slightly more favorable reimbursement policy at present. The care-related counterpoint is the continued lack of comprehensive retinal diagnostic ability (which human physicians can provide), along with the fact that humans may very well be cheaper than the real cost of AI systems. Until AI algorithms can accurately handle many camera types and expand easily into other pathologies, there will be headwinds. And even if those advances are made, the FDA will need to adjust its regulatory pathways to reflect them. At this point, both of these factors are longer-term ambitions.
However, AI has other uses, which may be even more useful and attainable. Consider upstream AI assistance for medical imaging. Using existing post-processing AI techniques, image quality can be improved dramatically. Elimination of image artifacts, an enlarged field of view and easier, faster image acquisition would be significant steps forward for our field. This technology is available today and is in use in other areas of medicine, such as radiology. One example is AI-assisted video capture for fundus imaging: short, high-quality videos capture hundreds of frames in seconds, with AI determining the best frames and the best portions of each frame, then stitching them together to create a high-quality, artifact-free, often wide-angle image. This is a simple, useful and attainable use of AI that could assist physicians and non-physician staff alike.
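As a rough illustration of one step in that pipeline, the sketch below scores the frames of a short fundus video by sharpness (variance of the Laplacian, a standard focus measure) and keeps the best candidates for downstream stitching. The file name, the frame budget and the use of OpenCV are assumptions for illustration, not any vendor’s actual method.

```python
import cv2

def best_frames(video_path: str, keep: int = 10):
    """Return the `keep` sharpest frames of a video."""
    cap = cv2.VideoCapture(video_path)
    scored = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
        scored.append((sharpness, frame))
    cap.release()
    # Keep the sharpest frames; a full system would also weigh coverage
    # and artifacts before stitching (eg, with cv2.Stitcher).
    scored.sort(key=lambda s: s[0], reverse=True)
    return [frame for _, frame in scored[:keep]]

frames = best_frames("fundus_capture.mp4")  # hypothetical file
```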
Autonomous AI interpretation of retinal images still has tremendous promise. As we often see in life, getting to the promised land frequently takes much longer than we first expected. OM