AI-based Skin Quality Assessment: Validation Relative to Plastic Surgeon and Dermatologist Evaluation

Grace L. Landis; Alexis Bailin; Ella Putigna; Nadiya A. Persaud; Robert Minkes

Abstract

Background: Skin quality is closely linked to a patient's overall health and is a key target in facial aesthetics, yet its clinical assessment lacks standardized evaluation methods and remains subjective [1]. In aesthetic surgery and dermatology, assessment of skin features such as wrinkles, texture, and pigmentation is physician-judged and can vary between evaluators [2]. Recent advances in artificial intelligence (AI) have demonstrated potential for objective image-based patient analysis, creating an opportunity to enhance quantitative assessment in support of clinical decision making and patient satisfaction [2,3]. This study evaluates the diagnostic performance of multiple AI systems by comparing their patient assessments with those of physicians, exploring AI as a standardized tool to support physician facial analysis.

Methods: This study evaluated four large language model (LLM) chatbots (ChatGPT, Gemini, Claude, and Meta AI) using four case reports focused on treating photoaging. Each chatbot was provided with identical inputs consisting of a structured clinical vignette and paired baseline and post-treatment photographs. Models were instructed to act as board-certified dermatologists and assign Physician Global Assessment (PGA) scores using a predefined 0-4 improvement scale [4]. Outputs were evaluated using a standardized grading rubric assessing diagnostic accuracy, adherence to a required response format, correct application of the PGA scale, and clinical reasoning. Model performance was compared to assess scoring reliability against a gold standard reference taken from each original case report.

Results: Overall mean performance scores were highest for Claude (84.38), followed by Meta AI (58.13), Gemini (56.88), and ChatGPT (52.50). Across all LLMs, Primary Diagnosis, PGA Scaling Definitions, and Skin Type Specific Risk Factors were identified accurately. Data showed strong agreement regarding Alignment Between Narrative Description and Assigned Scores. Variability was demonstrated in PGA Domain Scoring between different criteria.
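As a minimal sketch of the comparison reported above, the four overall means can be ranked and each model's gap to the top scorer computed. The values are the ones stated in the Results; the function names are hypothetical, and the abstract does not state the maximum of the rubric scale, so no percentage is derived here.

```python
# Overall mean rubric scores as reported in the Results paragraph.
reported_means = {
    "Claude": 84.38,
    "Meta AI": 58.13,
    "Gemini": 56.88,
    "ChatGPT": 52.50,
}


def rank_models(means):
    """Return (model, score) pairs sorted from highest to lowest score."""
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


def gaps_to_leader(means):
    """Return each model's score difference from the top-ranked model."""
    ranked = rank_models(means)
    leader_score = ranked[0][1]
    return {model: round(leader_score - score, 2) for model, score in ranked}


ranking = rank_models(reported_means)
gaps = gaps_to_leader(reported_means)

for model, score in ranking:
    print(f"{model}: {score:.2f} (gap to leader: {gaps[model]:.2f})")
```

The gaps make the spread visible: the three lower-ranked models sit within about six points of each other, while Claude is separated from all of them by more than 26 points.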

Discussion: AI systems demonstrated moderate agreement with physician assessments, with Claude showing the closest alignment to the physician-defined gold standard. Although all LLMs accurately identified the three criteria named in the Results, PGA scores varied, particularly for dyschromia and wrinkle assessment, indicating that visual interpretation remains model-dependent. Scoring may have been influenced by external knowledge retrieval, similarities to published images, reliance on clinical vignettes rather than true image analysis, and subjective rubric interpretation. Larger studies with repeated trials and more diverse cases are necessary to determine reproducibility, reliability, and clinical applicability before full integration into practice. Overall, this pilot study suggests that LLMs have the potential to support physician facial analysis by enhancing the consistency and quantification of skin assessment.

Poster

non-peer-reviewed

Author Information

Grace L. Landis (Corresponding Author)

Research, Orlando College of Osteopathic Medicine, Orlando, USA

Alexis Bailin

Research, Orlando College of Osteopathic Medicine, Winter Garden, USA

Ella Putigna

Research, Orlando College of Osteopathic Medicine, Winter Garden, USA

Nadiya A. Persaud

College of Public Health, University of South Florida, Tampa, USA

Robert Minkes

Research, Orlando College of Osteopathic Medicine, Orlando, USA

Poster Information

Meeting

Second Annual OCOM Research Symposium, April 03, 2026

Publication history

Published: March 20, 2026

Copyright

© Copyright 2026 Landis et al. This is an open access poster distributed under the terms of the Creative Commons Attribution License CC-BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

