ML algorithms Archives - Healthentia
https://healthentia.com/tag/ml-algorithms/
Fri, 04 Jul 2025 09:28:11 +0000

Use of Real-World Data in clinical research
https://healthentia.com/use-of-real-world-data-in-clinical-research/
Mon, 16 Jan 2023 14:34:06 +0000

The post Use of Real-World Data in clinical research appeared first on Healthentia.


Definition & importance

Real-World Data (RWD) is any data relating to a patient’s health status, collected during the routine delivery of care, as opposed to data collected within the controlled setup of clinical trials. Hence RWD differs from trial data not so much in its type as in the process and population involved in its collection. RWD comes in several types and from several sources:
  • Clinical data from electronic health records (EHRs) and case report forms (CRFs). This data establishes who the subject is, providing demographics, family history, comorbidities, procedure and treatment history, and outcomes. Such data types are also common in clinical trials.
  • Patient-generated data from patient-reported outcome (PRO) questionnaires, or measurements from wearables. This is data collected in an everyday setting, providing insights directly from the patient, beyond clinic visits, procedures, and hospitalization. While patient-generated data is not unusual in clinical trials, there it is collected in a centralized manner during the regular visits of the trial volunteers to healthcare facilities. In the real-world context, collection is continuous and takes place at home.
  • Public and government data including cost and utilization data. Such data provides information on the healthcare system and the different stakeholders therein.
Such information can be used to create algorithms for risk stratification or to gain insight into associations between exposures, interventions, and outcomes. While clinical trials continue to be the main tool for studying the safety and efficacy of a new medicine, their controlled environment and well-defined cohorts constitute experimental conditions that do not represent real-world settings. RWD is a much better tool for understanding how patients react to a medicine once it is approved and available on the market, i.e., in routine medical care. The lack of highly controlled settings usually results in lower levels of confidence, but the outcomes represent a wider population of subjects. Such outcomes are therefore better suited to understanding and making decisions in everyday medical care, in broader settings than the controlled ones of clinical trials.

RWD: Collecting in a clinical vs. everyday setting

There can be a huge quality difference between RWD collected in a clinical setting and RWD collected in an everyday setting. In a clinical setting, the process is carried out sporadically by professionals, with subjects following strict guidelines (like time and method of collection, or diet prior to collection). In the everyday setting, the process is continuous and carried out by the subjects themselves. Whether the data is reported by the subjects or measured by devices the subjects operate, the continuous nature and the self-supervision can lead to low quality due to device failure (usually uncharged devices, wearables not worn when they should have been, or mobile applications left unused for too long and automatically closed down) and lack of adherence (forgetting to answer instances of repeating questionnaires, a growing decline of interest in the process). Also, clinical data can be much more specialized to the medical conditions at hand, compared to most behavioral data collected in an everyday setting. Despite these shortcomings of data collected in an everyday setting, it is now well established that behavior is part of the intervention. The high specialization and quality of the sporadic clinical data are complemented by the continuous nature of the behavioral, everyday data, in much the same way a low-resolution film complements the understanding offered by the occasional high-resolution photo.
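The two quality problems named above, device failure and lack of adherence, can be quantified per subject before any analysis. The following is a minimal sketch, not a description of any particular platform: the `DayRecord` fields and the 600-minute threshold for a "valid" wear day are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class DayRecord:
    """One day of everyday RWD for one subject (hypothetical structure)."""
    wear_minutes: int             # minutes for which the wearable delivered data
    questionnaires_sent: int      # questionnaires pushed to the subject that day
    questionnaires_answered: int  # questionnaires the subject actually answered


def quality_summary(days, min_wear_minutes=600):
    """Return the share of valid wear days and the questionnaire adherence.

    min_wear_minutes is an assumed threshold, not a value from the article.
    """
    if not days:
        return {"valid_wear_days": 0.0, "questionnaire_adherence": 0.0}
    valid = sum(1 for d in days if d.wear_minutes >= min_wear_minutes)
    sent = sum(d.questionnaires_sent for d in days)
    answered = sum(d.questionnaires_answered for d in days)
    return {
        "valid_wear_days": valid / len(days),
        "questionnaire_adherence": answered / sent if sent else 0.0,
    }


week = [
    DayRecord(900, 1, 1),
    DayRecord(120, 1, 0),  # device left uncharged for most of the day
    DayRecord(840, 1, 1),
    DayRecord(0, 1, 0),    # wearable not worn at all
]
summary = quality_summary(week)
```

Subjects whose summary falls below some agreed limit can then be flagged for a reminder, or their data can be excluded from a given analysis.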

Patient-generated, everyday RWD types

The behavioral, everyday RWD are categorized in terms of collection method and content. The following collection options are used:
  • Patient-reported via questionnaires: This collection model is closer to the established clinical trial approach, but this time the questionnaires are digital, pushed to subjects via some companion mobile app. They mostly have to do with self-assessment of different aspects.
  • Patient-reported via widgets: Similar to questionnaires, only this time rich graphical interfaces are employed. The widgets allow manual entry, or take advantage of integration with third-party devices meant for occasional use, like scales or blood pressure monitors, to collect measurements automatically.
  • Automatically reported by wearables: Continuous measurements from wearable devices are one of the most prominent sources of RWD. Ubiquitous activity trackers or more specialized devices like sleep monitors are integrated either at device level (when a Software Development Kit is available, e.g., via Apple HealthKit) or at device cloud level (when an Application Programming Interface is available).
  Using any of the above methods, the following everyday RWD types are collected:
  • Physiological: Data about physical activity, continuous monitoring of vitals, sleep
  • Psychological: Emotions
  • Social: Interactions (phone calls, social media)
  • Environmental: Living and working environment
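The two classification axes above, collection method and content type, map naturally onto a small data model. The sketch below is one possible representation under stated assumptions: the class and field names (`RwdSample`, `subject_id`, and so on) are hypothetical and are not taken from any Healthentia schema.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class CollectionMethod(Enum):
    QUESTIONNAIRE = "questionnaire"
    WIDGET = "widget"
    WEARABLE = "wearable"


class RwdCategory(Enum):
    PHYSIOLOGICAL = "physiological"
    PSYCHOLOGICAL = "psychological"
    SOCIAL = "social"
    ENVIRONMENTAL = "environmental"


@dataclass(frozen=True)
class RwdSample:
    """A single everyday RWD measurement or report (hypothetical structure)."""
    subject_id: str
    timestamp: datetime
    method: CollectionMethod
    category: RwdCategory
    name: str
    value: float


def count_by_category(samples):
    """Count how many samples fall into each of the four content types."""
    counts = {category: 0 for category in RwdCategory}
    for sample in samples:
        counts[sample.category] += 1
    return counts


samples = [
    RwdSample("s01", datetime(2023, 1, 16, 8, 0), CollectionMethod.WEARABLE,
              RwdCategory.PHYSIOLOGICAL, "steps", 8421),
    RwdSample("s01", datetime(2023, 1, 16, 9, 0), CollectionMethod.QUESTIONNAIRE,
              RwdCategory.PSYCHOLOGICAL, "mood_score", 4),
    RwdSample("s01", datetime(2023, 1, 16, 9, 5), CollectionMethod.WIDGET,
              RwdCategory.PHYSIOLOGICAL, "weight_kg", 78.4),
]
counts = count_by_category(samples)
```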
 

Learning on RWD

At a raw level, RWD can lead to decisions about individuals and cohorts via analytics visualizations. But a full understanding of the context of subjects is gained by processing the data with machine learning (ML) techniques. Supervised algorithms facilitate learning biomarkers, while unsupervised ones lead to phenotypes.

RWD facilitates learning digital composite biomarkers. Biomarkers are quantities characterizing some disease or outcome. Digital refers to their attributes being ubiquitously available, not only as clinical data. Composite refers to the combination of multiple attributes in an attempt to predict some outcome. ML algorithms are used to learn outcome predictors that combine the attributes non-linearly into digital composite biomarkers.

Phenotypes characterize the way the internal conditions of subjects manifest themselves for external observation. The different RWD attributes measured constitute the observation, and clusters of the observations correspond to different phenotypes. The clusters are learned from RWD using unsupervised ML algorithms. The clusters are then modeled for efficient representation of the phenotypes.
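To make the unsupervised side concrete, the sketch below clusters subjects into candidate phenotypes with plain k-means. The text does not name a specific clustering algorithm, so k-means is an assumption chosen for brevity, and the two attributes (thousands of daily steps, hours of sleep) and their values are invented for illustration.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every observation.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return labels, centroids


# Invented observations: (daily steps in thousands, hours of sleep) per subject.
observations = [
    (2.1, 5.5), (2.8, 6.0), (3.0, 5.2),
    (10.5, 7.8), (11.2, 8.1), (9.8, 7.5),
]
labels, centroids = kmeans(
    observations, centroids=[observations[0], observations[3]]
)
```

Each resulting centroid is a compact model of one cluster, in the sense of the last sentence above. With real RWD the attributes would normally be scaled to comparable ranges first, since k-means is sensitive to units.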

RWD in Healthentia

Our product Healthentia is used to collect all types of patient-generated, everyday RWD. Our subjects employ the Healthentia mobile app to answer questionnaires and to enter data via the widgets, either manually or using devices integrated via their Software Development Kits. Data collection also employs the Healthentia big data platform, which ingests further subject data using the Application Programming Interfaces of other device providers. The collected RWD is analyzed using the BI analytics available at the Healthentia portal for healthcare professionals. It is also processed using the smart services of Healthentia, namely:
  • The Learning Services for training models
  • The Inference Services for inferring with the help of the trained models
  • The Clinical Pathway for utilizing the raw RWD and the inference results in monitoring the state of subjects, and
  • The Virtual Coach for utilizing all the above in personalized advice given to the subjects.

‘Discovering biomarkers’
https://healthentia.com/discovering-biomarkers/
Mon, 19 Oct 2020 10:39:56 +0000

The post ‘Discovering biomarkers’ appeared first on Healthentia.

At Innovation Sprint we believe in the potential of the ‘missing data’ in clinical studies, such as lifestyle, activity, nutrition and sleep, to derive conclusions about the efficacy of treatments, as well as to bridge the gap between clinical research and eHealth/DTx. In the context of exploring ways to make use of such data, around a year ago we started the Digital Biotech activity, which involves the discovery of digital composite contextual biomarkers.

A biomarker is a naturally occurring characteristic by which a pathological or physiological process can be identified. A digital biomarker comprises objective, quantifiable physiological and behavioral data, measured utilising digital portable, wearable, implantable or digestible devices, to be used to predict and manage health-related outcomes.

Innovation Sprint has built a composite contextual biomarker based on multiple aspects of Real-World Data (RWD), collected from people unobtrusively while they follow their normal daily routine. It is composite in the sense that it is not based on a single measurement, but rather on multiple diverse measurements (objective RWD) and people’s reports (subjective RWD). It is contextual in the sense that not only the person is measured, but also the person’s lifestyle context: social and environmental aspects complement the more traditional physiological and psychological ones.

Our RWD

At Innovation Sprint we are strong advocates of the empirical knowledge that lifestyle is a strong determinant of health. Hence our biomarker is based on RWD spanning four important aspects of a person’s lifestyle:

◾ Physiological RWD quantifies physical behaviour (active vs sedentary lifestyle as measured by steps walked, floors climbed, activity types, minutes in different intensity levels or heart rate zones, resting heart rate, sleep characteristics) and includes body info (height, weight, gender, race), nutrition (water, other liquids, food) and symptoms (body temperature, cough, diarrhea, headache, nausea, pain, etc.).

◾ Psychological RWD quantifies mood at a simple level and, in more complex situations, mental state as collected via elaborate, domain-specific questionnaires. Measurements can also play a role, either directly (e.g. facial expression recognition) or indirectly (e.g. the weather where people live).

◾ Social RWD quantifies social activity of people. This can be measured indirectly from the usage of the phone (diversity, duration, frequency of calls) and social media (diversity, number, frequency of interactions). More direct information can be reported using questionnaires on activities with friends, family or co-workers.

◾ Environmental RWD indicates the quality of life. It is usually reported by the users, while measurements of living or working environment quality can be made with commercial devices (e.g. air quality meters).

AI for discovering our biomarker

Biomarker discovery at Innovation Sprint is done in three stages:

◾ Definition stage, where the domain experts select the clinically significant outcomes that need to be predicted by the biomarker(s).

◾ Manual RWD selection stage, where domain knowledge is applied to refine our generic RWD selection into those lifestyle aspects that are relevant to the disease/condition in question.

◾ Iterative design stage, where Machine Learning/AI algorithms are used to train a proprietary classifier using the selected RWD to predict the selected clinically significant outcomes. The classifier is applied to new data, yielding predictions and insights leading to digital therapeutics.
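The classifier used by Innovation Sprint is proprietary and is not described here. As a stand-in, the sketch below walks through the same three stages with a plain logistic regression trained by gradient descent: an outcome is defined, two RWD attributes are selected by hand, and a model is trained and then applied to new data. The attributes, values, labels and hyperparameters are all invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Linear combination of the selected RWD attributes."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def train_logistic(rows, labels, lr=0.01, epochs=2000):
    """Batch gradient descent on the logistic loss (assumed stand-in model)."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(score(weights, bias, x)) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 when the predicted probability of the outcome is at least 0.5."""
    return 1 if sigmoid(score(weights, bias, x)) >= 0.5 else 0


# Definition stage: outcome 1 = the (hypothetical) adverse outcome occurred.
# Manual selection stage: (daily steps in thousands, sedentary hours per day).
rows = [(2.0, 9.0), (3.0, 8.5), (2.5, 10.0), (9.0, 4.0), (10.0, 5.0), (11.0, 4.5)]
outcomes = [1, 1, 1, 0, 0, 0]

# Iterative design stage: train, then apply to subjects not seen in training.
weights, bias = train_logistic(rows, outcomes)
new_predictions = [predict(weights, bias, x) for x in [(2.2, 9.5), (10.5, 4.2)]]
```

In practice the "iterative" part means repeating the training with different attribute selections and model settings, and evaluating each round on data held out from training.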

Validating our approach

We employed RWD collected over 7 years to train a biomarker that predicts significant weight changes. Such a biomarker is important for patients with several diseases (e.g. NAFLD), as well as for the general population interested in well-being. We achieved over 80% correct prediction of the outcome, and we also analysed the different RWD aspects that led each individual to positive or negative outcomes, in order to offer personalized coaching services.
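The post does not say how the per-individual analysis was carried out. One simple way to illustrate the idea, assuming a linear score over the attributes, is to compare each subject with a baseline and rank the attributes by their contribution. The weights, baseline and subject values below are hypothetical, as is the small accuracy helper showing how a figure like "80% correct" is computed.

```python
def accuracy(predicted, actual):
    """Fraction of outcomes predicted correctly."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def contributions(weights, baseline, subject):
    """Per-attribute contribution to a linear score, relative to a baseline."""
    return {
        name: weights[name] * (subject[name] - baseline[name])
        for name in weights
    }


# Hypothetical model weights: positive values push towards the adverse outcome.
weights = {"daily_steps_k": -0.4, "sleep_hours": -0.3, "sedentary_hours": 0.5}
baseline = {"daily_steps_k": 7.0, "sleep_hours": 7.5, "sedentary_hours": 6.0}
subject = {"daily_steps_k": 3.0, "sleep_hours": 6.5, "sedentary_hours": 9.0}

drivers = contributions(weights, baseline, subject)
top_driver = max(drivers, key=drivers.get)  # attribute to address first in coaching

share_correct = accuracy([1, 0, 1, 1, 0], [1, 0, 0, 1, 0])
```

For this invented subject the low step count contributes most to the score, so a coaching message would address physical activity first.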

 

As we speak, we are utilising the same approach in other therapeutic areas, e.g. cervical cancer, to predict low toxicity events. Starting from 2021 we will validate this hypothesis in much larger cohorts, targeting, among others, COPD patients with Cardiovascular Disease comorbidities.

We will keep you updated on our observations and findings!

Aristodemos Pnevmatikakis
R&D Director, Innovation Sprint
