“Extraordinary claims require extraordinary evidence”
— Carl Sagan
Dear Readers,
I rarely write about specific companies or products on this Substack. First, there is a basic professionalism angle: blowing up another organization for clicks is antithetical to the educational mission that guides my work. Second, I don’t want to give readers the impression that I am biased for OR against any commercial entity; I prefer to stick to the facts and talk about modalities and studies rather than trademarked offerings.
However, today I am going to make an exception and tell you my thoughts about a company called HT Vet and their AI-powered HT Vista thermography test, which claims to be able to rule out cancer without standard-of-care procedures like fine needle aspirate cytology or a biopsy. I am doing this for multiple reasons. For one, multiple vets have repeatedly asked me about this product. Next, their ads are starting to blanket social media (likely contributing to point number one).
And finally, in my view, they are marketing this product very aggressively in a way that goes well beyond what their own data supports. So without further ado, I am going to walk you through the basics of how to tell whether a medical test is accurate or not, how the company says their product works, and evaluate their claims against the data. While this message is primarily written to reach veterinary professionals who may have to choose whether or not to buy/use this test, I have tried hard to make the points understandable for pet owners and to provide contrasts with human medicine for our physician/nurse/PA readers.
—Eric
Veterinary medicine is the regulatory Wild West
Human medicine—at least in North America, Europe, and other OECD countries—is heavily regulated. In the United States, the FDA must approve medical devices ranging from pacemakers to digital x-ray machines to blood analyzers. Manufacturers of medical devices must submit data showing their product is safe and effective. Diagnostic tests in the US are also regulated by the Centers for Medicare & Medicaid Services (CMS) Clinical Laboratory Improvement Amendments (CLIA) program.
So you may be surprised to learn that there is essentially no regulation required for veterinary diagnostics. From the FDA website on veterinary medical devices:
“The FDA does not require submission of a 510(k), PMA, or any pre-market approval for devices intended for animal use. Device manufacturers who exclusively manufacture or distribute animal devices are not required to register their establishments or list animal devices with FDA and are exempt from post-marketing reporting. It is the responsibility of the manufacturer and/or distributor of these articles to assure that these animal devices are safe, effective, and properly labeled.”
While the FDA does have the authority to take action if a device is misbranded or adulterated, the onus is essentially on the companies to self-regulate. This is truly a “buyer beware” industry!
There is also no equivalent to CMS-CLIA. While the American Association of Veterinary Laboratory Diagnosticians (AAVLD) does offer lab accreditation, it is voluntary and does not directly regulate pathology results like CLIA does. Virtually all of the AAVLD-accredited organizations are affiliated with universities or state diagnostic labs; none of the “Big Three” veterinary diagnostic labs or smaller competitors are AAVLD accredited.
In the absence of any real governmental oversight of the private diagnostic market, it is up to independent entities like me to educate pet owners and veterinarians.
A primer on evaluating diagnostic test performance
There are a number of common statistics we turn to when evaluating the accuracy of a diagnostic test. I hate talking about stats as much as you probably hate reading about them, so I will try to keep this brief and as simple as possible.
Sensitivity (SN): This measures how well a test can pick up a diagnosis like cancer. If there were ten patients with cancer, a test with 100% sensitivity would call 10 of 10 positive, while a test with 90% sensitivity would pick up only 9/10, missing 10%. In general, higher sensitivity is better for screening tests.
Specificity (SP): This is the ability of a test to correctly identify patients who don’t have the disease (i.e. negative results). For a test with 100% specificity, 10/10 patients without cancer would test negative. If the specificity dropped to 50%, half (5/10) of healthy patients would have false positive cancer results. Higher specificity is generally desirable for confirmatory tests.
Positive Predictive Value (PPV): This is the probability that a patient with a positive test for cancer actually does have cancer. A test with a 90% PPV would mean that 10% of patients with a positive result were actually false positives and do not have cancer.
Negative Predictive Value (NPV): Similarly, this is the likelihood that a patient with a negative result is truly healthy. It would be bad for a screening test to have low NPV, but as we shall see, this is not the end-all-be-all parameter…
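To make these four definitions concrete, here is a minimal Python sketch that computes all of them from the cells of a 2×2 confusion matrix. The counts are made up for illustration and are not from any study discussed in this article:

```python
# Compute test-performance metrics from a 2x2 confusion matrix.
# These counts are invented for illustration only.
tp, fn = 9, 1    # 10 patients WITH the disease: 9 caught, 1 missed
tn, fp = 45, 5   # 50 patients WITHOUT it: 45 cleared, 5 false alarms

sensitivity = tp / (tp + fn)   # how many sick patients the test catches
specificity = tn / (tn + fp)   # how many healthy patients it clears
ppv = tp / (tp + fp)           # chance a positive result is truly positive
npv = tn / (tn + fn)           # chance a negative result is truly negative

print(f"SN={sensitivity:.0%}  SP={specificity:.0%}  PPV={ppv:.0%}  NPV={npv:.0%}")
```

Notice that even with 90% sensitivity and 90% specificity, the PPV here is only about 64%, because healthy patients greatly outnumber sick ones in this invented sample. That interplay is exactly what the caveats below are about.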
Some caveats…
“The Truth” (i.e. whether a patient has the given condition) must be based on the best test available, aka the “Gold Standard.” Using a substitute test can render any of the above data unreliable.
Importantly, PPV and NPV depend on how common a diagnosis is in the population (the prevalence), whereas sensitivity and specificity do not. This means that NPV/PPV can vary widely if the sample pool in a study does not accurately match the frequency of the disease in the real world.
PPV increases with disease prevalence and decreases when the disease becomes more rare in a population
NPV changes in the opposite direction as prevalence—as the frequency of a disease in the population decreases, the NPV increases, and vice versa (NPV decreases as prevalence goes up)
It is possible for tests to perform well on one or a few dimensions, but have fair to poor overall accuracy. You should be wary about studies or companies that selectively report or highlight only one or two parameters.
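The prevalence dependence described above can be demonstrated numerically. The sketch below applies Bayes’ rule with fixed, purely illustrative values for sensitivity and specificity (they are not figures for any real test) and varies only the prevalence:

```python
# Show that PPV and NPV shift with prevalence while SN and SP stay fixed.
def ppv_npv(sn: float, sp: float, prev: float) -> tuple[float, float]:
    """Predictive values via Bayes' rule at a given disease prevalence."""
    ppv = sn * prev / (sn * prev + (1 - sp) * (1 - prev))
    npv = sp * (1 - prev) / (sp * (1 - prev) + (1 - sn) * prev)
    return ppv, npv

# The same hypothetical test (SN 85%, SP 90%) in three different populations:
for prev in (0.05, 0.20, 0.50):
    ppv, npv = ppv_npv(sn=0.85, sp=0.90, prev=prev)
    print(f"prevalence {prev:.0%}: PPV={ppv:.0%}  NPV={npv:.0%}")
```

As the disease becomes more common, PPV climbs (roughly 31% → 68% → 89% here) while NPV falls (99% → 96% → 86%), even though the test itself has not changed at all.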
How does HT Vista work?
To avoid misrepresenting their technology, I will quote the company itself. From the website, this is how it works:
“The underlying principle of HT Vista Heat Diffusion Imaging (HDI) technology is that benign and malignant tissues display different Heat Transfer Rates due to differences in composition, metabolism, tissue morphology, and vascular network, which affect their thermophysical properties. This innovative screening modality relies on the unique thermal signals recorded by the device, as the tissue is heated and left to cool down.”
More from their pilot study (Dank et al Front Vet Sci 2023):
“…the HT Vista measures the difference in the temperatures between the mass and the adjacent normal tissue at baseline and throughout the scanning process (the heating and cooling phase) and is, therefore, less affected by the external environment. The data are sent to the service cloud, where they are classified using a machine learning algorithm to determine whether the mass requires further investigation.”
How accurate is it?
The tagline for HT Vista is “Rule out cancer without an aspirate or biopsy,” so let’s see if their data backs that claim up. They initially published a pilot study on 69 masses that showed promising results, with 90% overall accuracy, 93% sensitivity, 88% specificity, 83% positive predictive value, and 95% negative predictive value. Based on this, they followed up with a larger validation study:
Dank G et al. Training and validation of a novel non-invasive imaging system for ruling out malignancy in canine subcutaneous and cutaneous masses using machine learning in 664 masses. Front Vet Sci. 2023 Sep 29;10:1164438.
The headline from the second study that HT Vista promotes is its 97% negative predictive value. As stated earlier, this means that almost all patients with negative results are likely free of a malignant mass, which is impressive. However, you can’t look at that one stat in isolation. The sensitivity in this study was only 85%, meaning 15% of malignant masses would be missed and falsely called benign! I would certainly be hesitant to say that a test that tells 1 in 7 patients with a malignant tumor that their lesion is benign is suitable to “rule out cancer.”
How does this performance stack up against traditional cytology? I have plotted the sensitivity and specificity from multiple studies comparing fine-needle aspirate cytology to histopathology for external lumps ‘n’ bumps (list of references here1). Unfortunately, there are not a ton of these studies in the veterinary literature, and many of those that do exist would not be a fair comparison to this device, as they evaluate internal organs (liver, spleen, prostate, etc) or sample types (bone, nasal mucosa, effusions, etc) that this machine was not designed for.
As you can see in the figure above, the sensitivity of the HT Vista is lower than cytology in 3 of 4 studies (the outlier is on lymph node cytology, which is arguably not a fair comparison, as nodes are more complex tissues than skin, and I suspect HT Vista is not capable of evaluating that sample type). Furthermore, the specificity is substantially lower than in all four studies on cytology. These are not outliers, either; the veterinary literature generally shows that FNA cytology has fair to moderate sensitivity and high specificity across contexts. The total diagnostic accuracy in the studies where it was available (Ku et al, 77.2%; Sontas et al, 96.5%) was also significantly higher than HT Vista’s, which was 69%.
One last note on this section: there were uncommon adverse effects associated with the heating procedure. Three dogs with mast cell tumors had signs of degranulation that had to be treated medically and resolved without serious complications. While this effect appears very rare, and MCT degranulation can occur with fine needle aspiration as well, it needs to be mentioned, especially in light of marketing that stresses the test is “completely non-invasive.”
Can we trust this study data?
For any diagnostic test, drug, or medical device, it is critical to assess more than just a few topline numbers. One of the first things to keep in mind is potential sources of bias. Both of these studies were conducted by numerous authors affiliated with HT Vet and/or its parent company HT BioImaging LTD. While a commercially sponsored or affiliated study does not automatically mean the data is unreliable, it is unquestionably a source of potential bias. We should treat these numbers as a best-case scenario that may well be lower when assessed by a completely independent research group.
Besides a potential commercial conflict of interest, there are multiple major study design weaknesses:
The benchmark diagnosis for the validation phase was based on cytology in 525 cases, while additional histopathology was done on only 41 lesions.
Many lesions sampled by cytology in this study were non-diagnostic (94 of 525 masses, 17.9%), and those cases were excluded from analysis. This would be considered a false negative in a typical cytology-histopathology correlation study and could significantly skew results.
The gold standard for tumor diagnosis is histopathology. Comparing the majority of results to a “silver” or “bronze” standard makes it difficult or impossible to interpret the diagnostic accuracy metrics provided.
There are far more benign lesions (378) than malignant (53) in this study, meaning the prevalence of cancer was only 12.3%. If you remember our earlier discussion on positive and negative predictive values, they are heavily influenced by disease frequency. NPV in particular goes up with a lower frequency of the diagnosis, so having a skew towards benign lesions (negative test results in this study) could make that parameter look better than it is in reality.
Multiple types of lesions were excluded from the study, including mammary, testicular, and facial tumors, along with tumors that were severely ulcerated. These are common samples submitted for cytology, and their exclusion likely boosts the apparent performance of HT Vista. Some of these exclusions were justified by limitations of cytology, but again, the ideal study design would have compared both HT Vista and cytology performance against the gold standard histopathology results in all cases.
The diversity of malignant lesions in the validation cohort was quite limited: the only categories listed were carcinoma (n=2), mast cell tumor (n=38), and soft tissue sarcoma (n=13). There were no lymphomas, malignant melanomas, non-STS sarcomas (e.g. hemangiosarcoma, histiocytic sarcoma), or other types of aggressive tumors. Without evaluating a wide and representative spectrum of cancers, it is impossible to conclude you can “rule out cancer” with this device.
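The prevalence concern raised above is easy to quantify with a back-of-envelope calculation. In the sketch below, the 85% sensitivity and 12.3% prevalence come from the figures already quoted in this article; the 68% specificity is my own rough back-calculation from the reported ~26% PPV and is an assumption, as is the hypothetical 30% comparison prevalence:

```python
# Rough check of how a benign-heavy cohort flatters the NPV.
def npv(sn: float, sp: float, prev: float) -> float:
    """Negative predictive value via Bayes' rule."""
    return sp * (1 - prev) / (sp * (1 - prev) + (1 - sn) * prev)

# SN and prevalence are from the study figures quoted above; SP is an assumed
# back-calculated value, and 30% is a hypothetical cancer-richer caseload.
study_npv = npv(sn=0.85, sp=0.68, prev=0.123)
clinic_npv = npv(sn=0.85, sp=0.68, prev=0.30)
print(f"NPV at 12.3% prevalence: {study_npv:.0%}")
print(f"NPV at 30.0% prevalence: {clinic_npv:.0%}")
```

Under these assumptions, the very same test drops from roughly the advertised 97% NPV to about 91% once malignant lesions make up a larger share of the caseload, without any change in the device itself.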
Final Thoughts
Beyond just the raw numbers, you need to consider the value of the result provided by a novel method like HT Vista versus an established modality like cytology. Cytology can be performed on-site at a low cost with equipment that most hospitals already own, whereas HT Vista adds more equipment and cost to the procedure. HT Vista provides a binary result of benign or malignant—it does not tell you what the underlying disease is in either case. Cytology, on the other hand, can often provide both a malignancy screen and a specific diagnosis.
Not all lesions are created equal, even within the same category. Consider three different lesions from the “benign” cohort in this study: an adnexal cyst, a histiocytoma, and an “inflammatory lesion.” Even assuming perfect diagnostic accuracy, the HT Vista system would call all three “benign” and leave it at that. Yet the action item for each is different. Most histiocytomas regress on their own as the immune system fights the tumor, so you could tell the owner to feel confident about watchful waiting or benign neglect. An adnexal cyst, on the other hand, can be very irritating to the patient’s skin and tends to recur after drainage, so surgical removal is often necessary. Finally, an inflammatory lesion may require one of several medical interventions, including antibiotics if there is an infection. In contrast to HT Vista, cytology can easily tell those lesions apart.
I can’t tell you whether or not you should buy or use the HT Vista system. All I can do is provide my summary evaluation of their data:
It has modest sensitivity, which could potentially lead to missing 1 in 7 cancers
It has poor specificity and positive predictive value, meaning most positive (malignant) HT Vista results are actually benign (the PPV was a mere 26%)
Numerous methodology issues in the studies reduce my confidence that the above numbers are reliable
In the best-case scenario, this device only provides a binary result that needs to be followed up with cytology and/or a biopsy to formulate a clinical plan anyway
As much as I have a vested interest in keeping my job interpreting cytology, my first duty as a veterinarian is to our animal patients and their owners. If a new modality came along that proved superior to traditional cytology, I would say we’d have to strongly consider changing the status quo.
Unfortunately this, as the kids say, ain’t it.
Update (8/27/24)
Well, HT Vet certainly was not a fan of this article! After it was published, the company sent me an email asking me to call them. When I didn’t respond, a representative approached me in person at a conference to try and change my opinion. And just this morning, HT Vet left a comment below saying that they have posted a response to me on their website’s blog.
In the spirit of open discussion I am linking to it here. I encourage everyone to read it in full for their perspective. However, I respectfully disagree that my critique “missed the mark,” though I did get a chuckle at their characterization of my analysis as “misleading” despite the fact that I quote their own public data! In their response they claim to now have achieved even better accuracy from >16,000 scans over the past two years.
To that I say: GREAT! Put the data out there for public consumption and cross-examination. Present it at a conference and get it published in a peer-reviewed journal. My challenge, if you want to do it properly, is to follow the STARD guidelines for diagnostic accuracy studies.
Substantively, their response largely ignores most of the issues that I raised. They make excuses for not using histopathology as the gold standard reference method to establish ground truth. They don’t even attempt to justify the limited diversity of their cancer cohort—how can they guarantee they can rule out something they didn’t evaluate in the study? They don’t say. They boast that their new and improved algorithm has a sensitivity of 90%—which would still miss 1 in 10 cancers—and completely ignore the issues of specificity, positive predictive value, and overall accuracy.
Given all of the above, I am quite happy to stand by my original comments.
Finally, for additional background on the history of thermography in cancer screening, one of my readers who is an MD radiation oncologist sent me this interesting article she wrote:
Her piece includes a link to a news story about the FDA sending a cease-and-desist letter to a thermography device company for inappropriately marketing their products:
“The FDA also says Total Thermal Imaging violated the FDA regulations by marketing the machines with advertising both online and in print brochures that says thermography is “an alternative to mammography” that is “far more efficient at detecting cancer.” According to the warning letter, marketing the thermography machines as a sole screening device is a “major change or modification in the intended use of the device” and requires premarket approval from the FDA.
“There is no valid scientific data to demonstrate that thermography devices, when used on their own or with another diagnostic test, are an effective screening tool for any medical condition including the early detection of breast cancer or other diseases and health conditions,” according to the FDA.”
In human medicine, excessive hype is regulated by the FDA. In veterinary medicine, we are on our own.
—Eric
Ghisleni G, Roccabianca P, Ceruti R, Stefanello D, Bertazzolo W, Bonfanti U, Caniatti M. Correlation between fine-needle aspiration cytology and histopathology in the evaluation of cutaneous and subcutaneous masses from dogs and cats. Vet Clin Pathol. 2006 Mar;35(1):24-30. doi: 10.1111/j.1939-165x.2006.tb00084.x. PMID: 16511787.
Ku CK, Kass PH, Christopher MM. Cytologic-histologic concordance in the diagnosis of neoplasia in canine and feline lymph nodes: a retrospective study of 367 cases. Vet Comp Oncol. 2017 Dec;15(4):1206-1217. doi: 10.1111/vco.12256. Epub 2016 Aug 15. PMID: 27523399.
Simon D, Schoenrock D, Nolte I, Baumgärtner W, Barron R, Mischke R. Cytologic examination of fine-needle aspirates from mammary gland tumors in the dog: diagnostic accuracy with comparison to histopathology and association with postoperative outcome. Vet Clin Pathol. 2009 Dec;38(4):521-8. doi: 10.1111/j.1939-165X.2009.00150.x. Epub 2009 Apr 24. PMID: 19392751.
Sontas BH, Yüzbaşıoğlu Öztürk G, Toydemir TF, Arun SS, Ekici H. Fine-needle aspiration biopsy of canine mammary gland tumours: a comparison between cytology and histopathology. Reprod Domest Anim. 2012 Feb;47(1):125-30. doi: 10.1111/j.1439-0531.2011.01810.x. Epub 2011 May 26. PMID: 21615802.