Problem
The goal of the project was to create an AI algorithm for analyzing tongue photos based on Traditional Chinese Medicine (TCM).
A simple tongue analysis enables patients to detect basic health issues and take further health steps, such as consulting a doctor.
During such an analysis, an expert evaluates visual tongue features such as coating, discoloration, cracks, teeth marks, and spots, based on three photos (front, back, and side of the tongue). The observed features are then mapped to a dozen or so of the most common TCM syndromes.
The problem is that this photo analysis is time-consuming and requires specialist knowledge, so the service has low availability for patients.
Challenges
The biggest challenges were the limited number of reliably labeled data points, the rarity of some symptoms, and the variability in photo quality.
The analyses completed so far were delivered to us as an export from an email inbox, with the photos in separate messages from their analyses. The analyses had no standardized format and were often written as free-form text, which required various NLP techniques to parse.
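One common technique for turning such free-form expert notes into training labels is keyword and pattern matching against a canonical symptom vocabulary. The sketch below illustrates the idea; the `SYMPTOM_PATTERNS` vocabulary and the example note are hypothetical, not the project's actual label set.

```python
import re

# Hypothetical symptom vocabulary: phrases an expert might use in free-form
# analysis text, mapped to canonical symptom labels (illustrative only).
SYMPTOM_PATTERNS = {
    "thick_coating": r"\bthick(?:\s+\w+)?\s+coating\b",
    "teeth_marks": r"\bteeth\s*marks?\b",
    "cracks": r"\bcracks?\b|\bfissures?\b",
    "red_spots": r"\bred\s+spots?\b",
}

def extract_symptom_labels(analysis_text: str) -> set[str]:
    """Map a free-form expert analysis to a set of canonical symptom labels."""
    text = analysis_text.lower()
    return {
        label
        for label, pattern in SYMPTOM_PATTERNS.items()
        if re.search(pattern, text)
    }

# Example free-form note, as it might appear in an exported email:
note = "Pale tongue with a thick white coating; slight teeth marks on the edges."
print(sorted(extract_symptom_labels(note)))  # ['teeth_marks', 'thick_coating']
```

In practice a rule-based pass like this usually only bootstraps the labels; ambiguous notes still go back to the expert for review.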
Additionally, some symptoms were subtle and difficult to label unambiguously, which required ongoing consultation with the TCM expert. We also had to adapt the CNN architecture to the specifics of medical image interpretation and to balance the syndrome classes.
Solution
In close cooperation with a TCM expert, we defined a list of tongue symptoms to be detected automatically by the AI model.
We collected and prepared data, labeling it in terms of symptoms and the corresponding syndromes.
Next, we built convolutional neural network (CNN) models using transfer learning from general-purpose pre-trained vision models, and tested two architectures: a simple syndrome classifier and a three-stage model (symptom detection → area analysis → classification), which achieved higher effectiveness.
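The data flow of the three-stage architecture can be sketched as three composed functions. The stubs below are hypothetical stand-ins for the trained models; only the shape of the pipeline (symptoms → area-level features → syndrome) reflects the description above.

```python
# Hypothetical stubs standing in for the trained CNN stages; in the real
# system each stage is a model, here they only illustrate the data flow.

def detect_symptoms(photos: list) -> list[tuple[str, str]]:
    """Stage 1: detect (symptom, tongue area) pairs in the photos."""
    return [("thick_coating", "center"), ("teeth_marks", "edges")]

def analyze_areas(symptoms: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Stage 2: group detected symptoms by tongue area."""
    by_area: dict[str, list[str]] = {}
    for symptom, area in symptoms:
        by_area.setdefault(area, []).append(symptom)
    return by_area

def classify_syndrome(area_features: dict[str, list[str]]) -> str:
    """Stage 3: map area-level features to a syndrome label (stub rule)."""
    if "thick_coating" in area_features.get("center", []):
        return "damp_heat"  # hypothetical syndrome label
    return "unknown"

def three_stage_predict(photos: list) -> str:
    """Compose the stages: symptom detection → area analysis → classification."""
    return classify_syndrome(analyze_areas(detect_symptoms(photos)))

print(three_stage_predict(["front.jpg", "back.jpg", "side.jpg"]))  # damp_heat
```

Splitting the task this way also keeps intermediate outputs (detected symptoms per area) inspectable, which makes expert verification easier than with an end-to-end classifier.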
The models were optimized for prediction accuracy, stability, and the ability to be further extended.
The models were deployed in the AWS cloud and run automatically when a patient uploads tongue photos. Detected symptoms are marked on the images and sent to the TCM expert for verification.
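The upload-triggered flow can be sketched as a small handler with its dependencies injected. This is a hypothetical simplification: the function and parameter names are assumptions, and the real deployment would wire this to an AWS event source and storage rather than in-memory stubs.

```python
def handle_upload(photo_keys, run_model, annotate, notify_expert):
    """Hypothetical handler, triggered when a patient uploads tongue photos
    (e.g. by a storage event): run the model, mark detected symptoms on the
    images, and forward them to the TCM expert for verification."""
    predictions = run_model(photo_keys)          # detected symptoms per photo
    annotated = [annotate(k, predictions[k]) for k in photo_keys]
    notify_expert(annotated)                     # expert verifies before results reach the patient
    return predictions

# Demo with in-memory stubs in place of the model and notification service:
sent = []
preds = handle_upload(
    ["front.jpg"],
    run_model=lambda keys: {k: ["cracks"] for k in keys},
    annotate=lambda key, symptoms: f"{key} [marked: {', '.join(symptoms)}]",
    notify_expert=sent.extend,
)
print(sent)  # ['front.jpg [marked: cracks]']
```

Keeping the expert-notification step as an explicit dependency makes it easy to remove later if the system moves to fully automatic delivery.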
Result
In the PoC phase, the CNN models achieved 80–90% accuracy in predicting TCM syndromes and detected individual symptoms in the images with high accuracy.
The system was integrated with services provided by the TCM expert and is used to speed up the expert’s analysis.
The system provides a solid foundation for further deployments. With a larger amount of data and further model optimization, accuracy could potentially reach up to 99%.
In the future, after achieving stable results, the system could operate fully automatically and send the analysis results directly to the patient, without the expert verification step.