Saturday, 30 November 2024

From the tyger to the dying of the night

Around 1996, I chose a stanza from William Blake's poem The Tyger as my email signature (I already blogged about this in the past: The Tyger is back). Here it is:

    Tyger Tyger, burning bright,

    In the forests of the night;

    What immortal hand or eye,

    Could frame thy fearful symmetry?

The full poem can be found here.

What fascinated me in that verse was the "fearful symmetry", the emotion a young engineer like me experienced when looking closely at living organisms.

I used it for 15 years, dropping it when I moved to the University of Sheffield.  In 2018, when I returned to Bologna, the temptation was too strong, and I put it back as part of my email signature.

Today, I am going to replace it.  My new signature quotation will be:

    Do not go gentle into that good night.

    Rage, rage against the dying of the light.

    — Dylan Thomas, "Do not go gentle into that good night", in Collected Poems (1952)

It is the closing verse of a poem Dylan Thomas wrote in the form of a villanelle, a nineteen-line poetic form consisting of five tercets followed by a quatrain. The full text can be found here.  The line "Rage, rage against the dying of the light" closes every other stanza.  This reminds me of one of the folk songs I loved most when I was young, Riturnella, in the version by Musica Nova. It is a traditional song from Calabria, recovered in the seventies.

But the reason for this change is a different one.  I am entering a time in my life when you are constantly confronted with death: that of your parents, that of your youth's heroes, and your own. It is something new for me; trust me, it is tough.  Now, I suddenly understand something I read years ago: "all elders have a baseline of depression".  Yes, sure, I can see it now.

You need to fight it; you cannot abandon yourself to it. But how?  Well, today I came across a quotation from Dylan Thomas's poem, and I had a lightbulb moment: considering who I am, the best way is to be pissed off about it. So this is my plan: I do not want to go gentle into that good night; I will rage, rage against the dying of the light.

And since I still spend half of my life reading and writing emails (I need to do something about this as well), the best place to remind me of this is my email signature, right?

That's done, and I also added a good song by Rage Against the Machine to my daily playlist.

Done.  Let's rage.







Sunday, 13 October 2024

Does interdisciplinary science truly exist?

This is a rhetorical question; the answer is, of course, “yes”, because there are researchers who stray beyond the confines of their specific scientific domain and wander toward another one, ending up in that no man’s land so dear to Michel Foucault.

The In Silico World consortium has recently released in open access a report entitled “Regulatory barriers to the adoption of in silico trials”, which summarises years of work on this specific topic.  In chapter 4 of this report, entitled “A reflection on interdisciplinary decision-making”, we discuss, with a very narrow focus, the complex issue of recognising justified true belief in an interdisciplinary domain. But from it, we drew a more general reflection that we propose here.

If we accept Thomas Kuhn's idea that in each scientific domain periods of cumulative progress are periodically interrupted by periods of revolutionary science, when the whole domain jumps from one paradigm to another, then we can see interdisciplinary scientists as the heralds of a particular class of paradigm change: one that modifies the confines of a scientific domain (or creates a new one).

But these researchers travel in dangerous, unexplored lands. The biggest risk is the lack of a well-tested pragmatic epistemology. There are many reasons why science partitions the knowledge space into scientific domains.  A particularly important one is the need for a pragmatic, operational interpretation of the process through which a group of peers agrees on a particular belief.  The best way to collectively decide when a belief is a Justified True Belief varies considerably over the knowledge space.  The ways the physical and the social sciences decide what a Justified True Belief is are very different, and they form what we call the pragmatic epistemology of each subdomain of science.

The scientific method is unforgiving; in particular, if the peers of a scientific domain choose an ineffective pragmatic epistemology, they will tend to assume as true (and build upon) tentative knowledge that is later falsified. When this happens, everything produced using that tentative knowledge must be trashed, and the overall cost can be very high.  So eventually, the peers of each scientific domain develop a set of operational rules, tempered in the fire of experience, to decide how, in that specific knowledge territory, a Justified True Belief is recognised as such.

If you are conducting interdisciplinary research, you are wandering out of the comfort zone of your domain of origin.  For example, say you are a physicist who has started exploring particular aspects of living organisms, a topic traditionally within the remit of biology.  The pragmatic epistemologies of biology and physics are profoundly different, and rightfully so. Our physicist faces two choices: either remain coherent with his/her domain of origin and use the pragmatic epistemology of physics, or adopt that of the host domain, biology.  The first option is usually healthier: for a physicist who has not been trained as a biologist, adopting the pragmatic epistemology of biology can be very challenging (and vice versa, of course).  But either choice is ultimately a poor one, because that researcher is not investigating physics or biology but a new knowledge domain (let us call it biophysics) for which a well-tested pragmatic epistemology is not yet available.

Interdisciplinary research is precious because it expands scientific enquiry to previously untouched portions of the knowledge space. However, we must accept that it operates in difficult conditions from an epistemological point of view, and the quality of the knowledge produced might not be as high as that provided by other, more established domains. It also requires a very special mindset, less rigid and more accepting. But the biggest challenge is the need for epistemological attention. Most scientists are uninterested in how knowledge is produced from a philosophical point of view; like cops, they apply the existing laws with little interest in how the legislative process works.  Interdisciplinary scientists do not have this luxury; they need to understand the philosophical basis of their work and spend time debating how to decide what is true.

“Does interdisciplinary science truly exist?” ©2024 by Marco Viceconti is licensed under Creative Commons Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0).

This opinion piece was published in my personal blog, and as an Open Access document on Zenodo.



Saturday, 28 September 2024

European Union: from the land of rights to the land of rules

As usual, this blog post is inspired by a series of events, conversations, news, and opinions I collected over the last few weeks, which crystallised into the need to share an opinion piece publicly through this channel. As always for everything posted on this blog, these are my personal opinions, which do not necessarily represent the official views of my employer or of any of the organisations I am affiliated with in various roles.

The topic is complex and politically sensitive. It is also very general, dealing with how the European Union produces its policies. My narrow perspective refers to the specific domain of in silico medicine technologies. I am not qualified to generalise, but I suspect the problem I am describing below is not specific to a particular area of innovation.

I have always been a champion of the European Union. Compared to the USA, the European Union has been, for many years, the land of rights. Coherently with the values of the majority of its citizens, we created a society where healthcare and education are rights, citizens' privacy is protected, and so on. I am very proud of this. But I am afraid the last few years have transformed the EU into the land of rules and bureaucracy, and I believe this is crippling our ability to innovate and compete.

The introduction of in silico medicine is revolutionising healthcare. The ability of computer models, built with advanced techniques such as Artificial Intelligence, to predict changes in the health state of individual subjects opens amazing opportunities but also, as is always the case with disruptive technologies, a whole new set of potential risks for individuals and society. The differences in recent policy-making between the USA and the EU are creating dramatic disparities between researchers and companies working in this field.

When the General Data Protection Regulation (GDPR) was introduced in 2016, many, including myself, praised the European Commission. Eight years ago, it was already evident that information technology companies were abusing the personal data they collected; health information is particularly sensitive, and this was already having a major impact on my research domain.  However, as the GDPR was enforced, biomedical researchers in some member states faced huge difficulties.  The idea of demanding that member states define the rules under which exceptions to the GDPR can be made for scientific research, while probably a smart political move to pass such complex legislation within a reasonable timeframe, created a nightmare in which every member state has different rules and multicentric studies across member states are very difficult.  I am sure the legislators had no intention of preventing ethically robust biomedical research, but de facto, this is what has happened.

This story is well known to all operators in the field and shows, in my opinion, a pattern.  The European Parliament decides to legislate on a socially relevant topic with significant implications for complex, usually technological, innovations.  Technical advisors are mobilised to support the legislative process, and the legislation tries its best to capture all possible implications of the innovation and how the legislation will impact the socioeconomic processes that involve it. The result is large, complex legislation aimed at “covering all bases”.

However, for disruptive innovation, no group of experts, no matter how good, can foresee all the special cases and all the implications that such extensive policies can have. And the more complex the legislation is, the higher the chance it will produce unforeseen and undesired effects.  In many cases, the issues are related to the correct interpretation of the law and to its translation into procedures that streamline whatever legal processes economic operators must follow to comply.  The problem is that no one is in a position to handle this post-legislative process.  The European Parliament cannot manage the application of its laws, and the European Commission does not always have the in-house technical expertise necessary to do this.

I suggest that for such complex pieces of legislation, a federal agency should be appointed to oversee their application, providing guidelines that address concrete cases as they emerge, offering support and training, and harmonising the procedures across the Union.

In silico medicine, particularly In Silico Trials, has been one of these disruptive innovations.  In the USA, where such a federal authority exists (the Food and Drug Administration, FDA), the first guideline on the use of computer modelling and simulation in the certification of medical devices was published in 2016.  The Medical Device Regulation (EU 2017/745) acknowledges the growing role of software artefacts in medical products; however, in 2024, we still do not have a guideline that assists notified bodies in deciding when evidence obtained in silico can be accepted.  The only technical standard for in silico methodologies is ASME V&V 40:2018, whose production was driven by the FDA, which recognised the need.  Only this year did the IEC-ISO start a working group on the topic, which will produce an EU-harmonised standard in a few years, and only because the community of practice has lobbied for it.  In the USA, when a new in silico methodology for any type of medical product is developed, one can ask the FDA for qualification advice to explore whether such a methodology suits regulatory purposes. In Europe, the European Medicines Agency (EMA) provides this only for drug development tools; nothing is available for medical device development tools.

This situation is reversing the historical flow: medical companies used to certify their products first in the EU and then in the USA. Today, for in silico solutions, many companies choose the opposite strategy and start with the US regulator, which ensures clear, reliable regulatory pathways for such products.

Ahead of us, we have the application of the Artificial Intelligence Act over the next three years and the completion of the legislative process for the European Health Data Space regulation.  Both pieces of legislation could have a tremendous impact on health technology innovation. Without central authorities with the technical skills to provide guidelines on these new laws, the EU will face the same shortcomings it faced with the MDR and the GDPR.

“European Union: from the land of rights to the land of rules” ©2024 by Marco Viceconti is licensed under Creative Commons Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0).

This opinion piece was published in my personal blog, and as an Open Access document on Zenodo.




Saturday, 31 August 2024

How to gently retire

It has been two years since I last posted on this blog.

The reason is simple: 2023 and 2024 have been crazy, so busy that I could never find the time to let my mind go free and produce stories that, in my humble opinion, are worth sharing.

I hope things will improve in 2025, primarily because I am starting to implement my "Gentle Retirement" plan.  I decided to write about this plan for two reasons. The first is that one element of it will drastically reduce my attendance at conferences and other similar events, so I want to inform the many friends and colleagues who will no longer see me there about my future.  The second and more important is that all my closest work friends are in their sixties, or close to it, so I am sure they are all wondering about the best way to retire.  I am not sure mine is the best, but it might be worth sharing how and why I plan to approach this final passage of my professional career.

First, a brief history of myself to give some context.  After a short but fundamental period in the USA working with Prof Alì Seireg (see this article on one of his many achievements), I returned to Bologna (IT) and, in late 1989, started the Medical Technology Lab at the Rizzoli Orthopaedic Institute. I worked there until 2011. When I left, the lab hosted 45 people, counting PhD students, post-docs and researchers. I moved to the University of Sheffield, where in 2012 we started the Insigneo Institute for In Silico Medicine. I directed Insigneo for seven years, driving it to become the largest research institute on this topic in Europe.  In 2018, I returned to Bologna as a full professor of industrial bioengineering at the Alma Mater Studiorum - University of Bologna, with a joint appointment as director of my old lab at Rizzoli. For the third time, I had to restart from scratch, and despite my promises to take it easy after the crazy years with Insigneo, by the end of 2023 my group counted around 40 researchers.

Then 2024 comes in, and life gets complicated.  Working in Italy has never been easy, but the considerable funding of the COVID-19 recovery plan, combined with our proverbial administrative ineptitude, makes our daily work a nightmare.  The right-wing government is underfunding public healthcare, and working in a research hospital like Rizzoli is becoming more and more difficult. On top of this, I experienced some serious health problems.  And suddenly, I realise that, for the first time in my career, the pain is more than the gain.  Time to retire.  But how can I retire while preserving as much of my legacy as possible?

The first thing I have already cut is the time spent travelling to conferences and similar events.  All the travel budget is now spent supporting the travel of my coworkers; they have to slowly become the faces of our team, replacing me in the perception of our community.

I still have substantial funding until 2026.  However, I will not apply for any further funding; on the contrary, I will support those of my coworkers who are in a position to hold their own research funding as they pursue new opportunities.

From Sept 1st, 2024, I will resign as Director of the Medical Technology Lab at the Rizzoli Institute.  The lab has three senior researchers who already ran it without me for the seven years I was in Sheffield; in addition, two of my younger coworkers have tenure-track positions, so I am optimistic the lab will survive my departure.

At the end of the year, we will close my largest EU grant as coordinator, the In Silico World project.  This four-year endeavour has been a fantastic journey; on Sept 3rd, 2024, we will present the results to the research community in Stuttgart, in conjunction with the VPH conference.

After that, I will be left with primarily national funding that will finish in 2025, except for one project.  As these projects close, I will support the post-doctoral staff in finding a new position elsewhere.  I also hope there will be a tenure-track position in my university department to continue my academic legacy. 

By the end of 2025, I will be left only with the coordinating role in the DARE project, in which my responsibility as a spoke leader is primarily managerial.  Any research activity will be continued by those coworkers who managed to retain a tenure-track position in Bologna.

I will close all my professional social network accounts as my research career fades. I might open one to stay in touch with friends, but only to discuss non-professional topics. I will continue to post occasionally on this blog, but only about culture, society, and philosophy. 

In April 2027, I will be 66 years old.  Depending on my health, I could retire then or wait a few more years, during which I will focus on teaching and tutoring. The mandatory retirement age for full professors in Italy is 70, so I could keep teaching until 2031.

I do not have much more to offer regarding scientific discovery.  In math, the Fields Medal can be won only by those aged 40 or younger. I always thought that was an exaggeration, but I must admit that my creativity has steadily decreased in recent years.  What I can still do is share the significant experience I have accumulated over these many years with researchers in training, through teaching and tutoring.

Those who know me tend to describe me as a workaholic. So how will I manage with all that free time?  My wife and I would love to spend time in other places around the world, so if you need a visiting teacher, talk to me.  Then, I need to go back and brush up on my musical skills, which I have not cultivated in the last 40 years. I would also like to volunteer; it seems a moral imperative in this increasingly egoistic society. So, do not worry about me; I will manage even without working 14 hours per day, seven days per week.

As I fade from the public eye, I want to say that being part of this international research community has been an honour.  Thanks to all of you who worked, debated, revised, laughed, and argued with me during this long career.

Farewell.

Marco




Tuesday, 14 June 2022

Credibility of predictive Data-driven vs Knowledge-driven models: a layperson explanation

In science, the concept of truth has a meaning quite different from its colloquial use.  Scientists observe a natural phenomenon and formulate different hypotheses on why things happen the way we observe them. There are many ways to formulate such hypotheses, but the preferred one is to express them in quantitative, mathematical terms, which makes it easier to test whether they are well founded.

Once a certain hypothesis is made public, all scientists investigating the same natural phenomenon start to design experiments that could demonstrate that the hypothesis is wrong.  It is only when all possible attempts have been made, and the hypothesis has resisted all such attempts to prove it wrong, that we can call it a “scientific truth”. What that means is: “so far no one could prove it wrong, so we temporarily assume it to be true”.

Achieving a scientific truth is a long and costly process, but it is worth it: once a hypothesis becomes a scientific truth, or as we will call it from now on, scientific knowledge, it can be used to make predictions on how to best solve problems related to the natural phenomenon it refers to. At the risk of oversimplifying, physics aims to produce new scientific knowledge, which engineering uses to solve the problems of humanity. 

For the purpose of this note, it is important to stress that the mathematical form chosen to express a hypothesis cannot contradict the pre-existing scientific knowledge accumulated so far.  For example, we are quite sure that matter/energy cannot be created or destroyed, but only transformed; this is called the law of conservation in physics. Hence, any mathematical form we use to express a scientific hypothesis must not violate the law of conservation.


But the need to solve humanity's problems cannot wait for all the necessary scientific knowledge to become available, considering that it may take centuries for scientists to produce it.  Thus, scientists have developed methods that can be used to solve practical problems even when no knowledge is available, as long as there is plenty of quantitative data obtained by observing the phenomenon of interest.  When the necessary scientific knowledge is available, we solve problems by developing predictive models based on such knowledge; otherwise, we use models developed only from observational data.  We call the first type knowledge-driven models and the second type data-driven models.  The first type includes models built from the scientific knowledge provided by physics, chemistry, and physiology, for example.  Data-driven models include statistical models and the so-called Artificial Intelligence (AI) models (e.g. machine-learning models).
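To make the distinction concrete, here is a toy sketch that is not part of the original argument: for a hypothetical cooling experiment, a knowledge-driven model borrows its mathematical form from Newton's law of cooling, while a data-driven model is simply a polynomial fitted to the observations. All names and numbers below are made up for illustration.

    import numpy as np

    # Synthetic "observations" of an object cooling in a 20 C room (toy data).
    t_obs = np.linspace(0, 60, 13)                                  # minutes
    T_obs = 20 + 70 * np.exp(-0.05 * t_obs) + np.random.normal(0, 0.5, t_obs.size)

    # Knowledge-driven model: Newton's law of cooling. The mathematical form comes
    # from prior physical knowledge; only the cooling constant k needs to be
    # estimated (here it is simply assumed known, for brevity).
    def knowledge_driven(t, k=0.05):
        return 20 + (T_obs[0] - 20) * np.exp(-k * t)

    # Data-driven model: a polynomial fitted to the observations, with no physics built in.
    data_driven = np.polynomial.Polynomial.fit(t_obs, T_obs, deg=4)

    # Both may fit the observed range well, but only the knowledge-driven model is
    # constrained by prior knowledge when extrapolating (it can never cool below 20 C).
    print(knowledge_driven(180), data_driven(180))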


Now, if the problem at hand is critical (for example, when a wrong solution may threaten people's lives), before we use a model to solve it we need to be fairly sure that its predictions are credible, meaning sufficiently close to what happens in reality.  Thus, for critical problems, assessing the credibility of a model is vital. Most problems related to human health are critical, so it should not be a surprise that assessing the credibility of predictive models is a very serious matter in this domain. Unfortunately, assessing the credibility of a data-driven model turns out to be very different from assessing the credibility of a knowledge-driven model.  While the precise explanation of why they differ is quite convoluted and requires a solid grasp of mathematics, here we provide a layperson explanation, aimed at all healthcare stakeholders who by training do not have such a mathematical background but still need to make decisions about the credibility of models.


In order to quantify the error made by a predictive model, we need to observe the phenomenon of interest in a particular condition, measure the quantities of interest, then reproduce the same condition with the model, and compare the quantities it predicts to those measured experimentally.  Of course, this can be done only in a finite number of conditions; but how can we be sure that our model will continue to show the same level of predictive accuracy when we use it to predict the phenomenon in a condition different from those we tested?  Here is where the difference in how the model was built plays an important role.
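In practice, this comparison boils down to simple arithmetic on pairs of predicted and measured values; the numbers below are invented purely for illustration:

    import numpy as np

    # Hypothetical validation data: for each tested condition, the quantity of
    # interest as measured experimentally and as predicted by the model.
    measured  = np.array([12.1, 15.4, 18.9, 22.7])
    predicted = np.array([11.8, 15.9, 18.2, 23.5])

    errors = predicted - measured
    print("mean absolute error:", np.mean(np.abs(errors)))
    print("worst-case error   :", np.max(np.abs(errors)))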

For knowledge-driven models, it can be demonstrated that the mathematical forms chosen to express that knowledge, forms that must be compatible with all pre-existing scientific knowledge, ensure that if the model makes a prediction for a condition close to one we tested, its prediction error will also be close to the one quantified in the test. This allows us to assume that, once we have quantified the prediction error for a sufficiently large number of conditions within a range, for any other condition within that range the prediction error will remain comparable.  The benefit of this is that for knowledge-driven models we can conduct a properly designed validation campaign, at the end of which we can state with sufficient confidence the credibility of such a model.

However, this is not true for data-driven models.  In theory, a data-driven model could be very accurate for one condition, and totally wrong for another close to the first one.  So, the concept of credibility cannot be stated once and for all.  Assessing the credibility of a data-driven model is a continuous process; while we use the model, we periodically need to confirm that the predictive accuracy remains within the acceptable limits, by comparing the model’s predictions to new experimental observations.
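A minimal sketch of what this continuous process might look like, assuming an acceptance tolerance agreed in advance (the function name and numbers are hypothetical):

    import numpy as np

    def still_credible(predicted, measured, tolerance=1.0):
        # Periodic check: the worst-case error on the latest batch of new
        # observations must stay within the acceptance tolerance agreed in advance.
        return np.max(np.abs(np.asarray(predicted) - np.asarray(measured))) <= tolerance

    # Repeated each time a new batch of experimental observations becomes available.
    new_predicted = [10.2, 11.8, 13.1]
    new_measured  = [10.5, 11.6, 13.9]
    if not still_credible(new_predicted, new_measured):
        print("Predictive accuracy has drifted outside the acceptable limits: reassess the model.")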


To further complicate the matter, sometimes a model is composed of multiple parts, some built using a data-driven approach and others built using a knowledge-driven approach.  In such complex cases, the model must be decomposed into sub-models, and the credibility of each needs to be assessed in the way most appropriate for its type.


In conclusion, when no scientific knowledge is available for the phenomenon of interest, only data-driven models can be used. In that case, credibility assessment is a continuous process, like quality assessment: once the model is in use, we periodically need to reassess its predictive accuracy against new observational data.  When scientific knowledge is available, instead, knowledge-driven models are preferable, because their credibility can be confirmed with a finite number of validation experiments.




Friday, 8 January 2021

Positioning In Silico Medicine as a computationally-intensive science: a call to arms

In the last few months, I have followed with growing interest the recent developments in the computational sciences, and I feel compelled to raise a warning, which becomes a call for engagement addressed to the entire In Silico Medicine community.

At the risk of oversimplifying, with the launch of the EuroHPC initiative the European Commission has made a clear move in two directions: exascale computing (the development and effective use of new computer systems capable of 10^18 floating-point operations per second) and quantum computing (the use of quantum phenomena to perform computation).

Because of the strategic nature of this initiative, all computational sciences are slowly being divided into those that are considered computationally intensive and those that are not: the first will be asked to contribute to the definition of the specifications of these new exascale and quantum computing systems (codesign); as part of this, they will most likely receive dedicated funding, directly or through funding earmarked for solutions that exploit high-performance computing (HPC), as we already saw in some COVID-related calls in H2020.  I am less familiar with other regions of the world, but my impression is that the political agenda around the strategic value of HPC is the same in the USA, China, Japan, India, etc.  Thus, I dare say that the same trend is probably being observed everywhere.

There are some domains that are unquestionably seen as HPC science: Weather, Climatology and solid Earth Sciences; Astrophysics, High-Energy Physics and Plasma Physics; Materials Science, Chemistry and Nanoscience. When we look at Life Sciences and Medicine, the picture is blurred: there is a clear case for molecular simulations, but much less clarity for single-cell systems biology, and even less for systems physiology.  In Silico Medicine, intended as the clinical and industrial application of computational biomedicine methods, is in my opinion at present far from making a clear case for being an HPC scientific domain.  The 2013 Nobel Prize in Chemistry was awarded to a group of computational chemists; it will still take some decades before we can expect a Nobel Prize in Medicine for a computational researcher.

Having worked in this field since its beginnings 20 years ago, I can understand why most in silico medicine researchers see the computational challenge as an immaterial detail: as a community, our main focus is still on the credibility of our predictions in the clinical and regulatory context.

But I am worried that if we miss this train, it might take a long while before another one passes.  I think that as we prepare for Horizon Europe, or for the next round of NIH and NSF funding, we need to start thinking seriously about where the added value of porting our applications to HPC architectures lies, and to develop an HPC science research agenda where scalability is key.  We need to think grand science, from a computational point of view.  And we need to pursue computational grand challenges: can we simulate a phase III clinical trial by running 1000 patient-specific models?  Can we model all the cells in a whole tumour? Can we model the electrophysiology of all the cardiomyocytes in a human heart?  Can we couple a whole fluid-electro-mechanical model of the heart with a full fluid-chemo-mechanical model of the lungs?

Another thing we need to start working on as a community is the idea of the Virtual Physiological Human.  There is a funny story here: soon after the term was coined in 2005, we had to defend ourselves from those who were asking: are you planning to capture the entire human physiology in a single computer model?  At that time, of course, the answer was no, not even close.  But I think this idea should now be brought back, if not as a feasible goal any time soon, at least as something to aim for.  We have great models for the bones, joints and muscles; for the heart; for the pancreas; for the liver; for the lungs.  Can we aim for a neuromusculoskeletal model of human movement?  Or a cardiovasculorespiratory model of whole-body oxygenation dynamics?

This is a grand challenge for our community, and I call you all to arms.  Make sure all those who are thinking in this direction, seniors and juniors, join the #Scalability channel:

https://insilicoworld.slack.com/archives/C0151M02TA4 

If you click the link and get a message saying that you are not a member yet, follow this other link and request to join:

http://insilico.world/scalability-support-channel/

I also ask all of you to start posting your scalability challenges.  If you do not have any, it means you are not thinking big enough, so try again :-).

We need to build, as soon as possible, a good picture of the HPC needs of the In Silico Medicine community, and joining the #Scalability channel is the most effective way to do so. As a bonus, the top HPC experts in Europe who are partners in the CompBioMed Centre of Excellence will be happy to share their wisdom with you through the same channel and help you address your scalability issues in the most effective way.


 

Saturday, 21 November 2020

On the regulatory validation of AI models


Accepted epistemology suggests that a theory cannot be confirmed, since this would require infinite tests, but only falsified. So, we can never say a theory is true, only that it has not been disproved so far.  However, falsification attempts are not done randomly; they are purposely crafted to seek out all possible weaknesses. So, the process works empirically: theories that resisted falsification for some decades were not falsified subsequently, at most extended. For example, the theory of special relativity addressed the special case of bodies travelling close to the speed of light, but it did not truly falsify the second law of dynamics.  In fact, if you write Newton’s law as F = dp/dt, where p is the momentum, even the case where the mass varies due to relativistic effects is included.

But once a theory has resisted extensive attempts at falsification, for all practical purposes we assume it is true and use it to make predictions.  However, our predictions will be affected by some error, not because the theory is false, but because of how we use it to make predictions.  An accepted approach suggests that the prediction error of a mechanistic model can be described as the sum of the epistemic error (due to our imperfect application of the theory to the physical reality being predicted), the aleatoric error (due to the uncertainty affecting all the measured quantities we use to inform the model), and the numerical solution error, present only when the equations that describe the model are solved numerically.  For mechanistic models, based on theories that have resisted extensive falsification, validation simply means the quantification of the prediction error, ideally in all its three components (which gives rise to the verification, validation and uncertainty quantification (VV&UQ) process).
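In symbols (the notation here is mine, chosen only for illustration), the decomposition can be written as:

    e_prediction ≈ e_epistemic + e_aleatoric + e_numerical

where, roughly speaking, verification targets the numerical term, validation the epistemic term, and uncertainty quantification the aleatoric term.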

A phenomenological model is defined as a predictive model that does not use any prior knowledge to make predictions, only prior observations (data). Analytical AI models are a type of phenomenological model. When we talk about validation for a phenomenological model, we do not simply mean the quantification of the prediction error; since the phenomenological model contains an implicit theory, its validation is more akin to the falsification of a theory. And while an explicitly formulated theory can be purposely attacked in our falsification attempts, the implicit nature of phenomenological models forces us to use brute-force approaches to falsification.  This brings us to the curse of induction: a phenomenological model is never validated; we can only say that, with respect to the validation sets we have used to challenge it so far, the model has resisted our falsification attempts. But in principle, nothing guarantees that on the next validation set the model will not be proven totally wrong.

Following this line of thought, one would conclude that locked AI models cannot be trusted.  The best we can do is to formulate AI testing as a continuous process: as new validation sets are produced, the model must be retested again and again, and at most we will say it is valid “so far”.

But the world is not black and white.  For example, while purely phenomenological models do exist, purely mechanistic models do not. A simple way to prove this is to consider that, since the space-time resolution of our instruments is finite, every mechanistic model has some limits of validity imposed by the particular space-time scale at which we model the phenomenon of interest.  The second law of dynamics is no longer strictly valid at the speed of light, and it also shakes at the quantum scale.  To address this problem, virtually EVERY mechanistic model must include two phenomenological portions: one that describes everything bigger than our scale as boundary conditions, and one that describes everything smaller than our scale as constitutive equations. All this is to say that there are black-box models and grey-box models, but no white-box models. At most, light-grey models.  So what?  Well, if every model includes some phenomenological portion, in theory VV&UQ cannot be applied, for the arguments above. But VV&UQ works, and we trust our lives to airplanes and nuclear power stations because it works.

Which brings us to another issue. Above I wrote: “in principle, nothing guarantees that on the next validation set the model will not be proven totally wrong”. Well, this is not true.  If the phenomenological model is predicting a physical phenomenon, we can postulate some properties.  A very important one, which comes from the conservation principles, is that all physical phenomena show some degree of regularity. If Y varies with X, and for X = 1, Y = 10, while for X = 1.0002, Y = 10.002, then for X = 1.0001 we can safely state that it is impossible for Y to be 100,000 or 0.0003.  Y must have a value in the order of 10, because of the inherent regularity of physical processes.  Statisticians recognise this from another (purely phenomenological) perspective by noting that any finite sample of a random variable might be non-normal, but if we add enough of them the resulting distribution will eventually be normal (central limit theorem). This means that my estimate of an average value will converge asymptotically to the true average value, the one associated with an infinite sample size.

Thus, we can say that, as we increase the number of validation sets, the estimate of the average prediction error of a phenomenological model will converge asymptotically to the true average prediction error.  This means that if the number of validation sets is large enough, the value of the estimate will change monotonically, and its derivative will also decrease monotonically. This makes it possible to reliably estimate an upper bound on the prediction error, even with a finite number of validation sets.
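A minimal numerical sketch of this idea, with synthetic error values invented purely for illustration (this is not how a regulatory submission would do it):

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical absolute prediction errors measured on successive validation sets
    # (synthetic numbers; in this toy example the true average error is 2.0).
    errors = rng.normal(loc=2.0, scale=0.5, size=200).clip(min=0)

    # Running estimate of the average prediction error as validation sets accumulate.
    running_mean = np.cumsum(errors) / np.arange(1, errors.size + 1)

    # With enough validation sets the running estimate settles down; a conservative
    # upper bound can then be stated, e.g. the estimate plus its standard error.
    upper_bound = running_mean[-1] + errors.std(ddof=1) / np.sqrt(errors.size)
    print(f"estimated average error: {running_mean[-1]:.2f}, upper bound: {upper_bound:.2f}")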

It is too early in the day to come to any conclusion on whether and how the credibility of AI-based predictors can be evaluated from a regulatory point of view.  Here I have tried to show some aspects of the debate. But personally, I am optimistic: I believe we can reliably estimate the predictive accuracy of all models of physical processes, including purely phenomenological ones.

A final word of caution: this is definitely not true when the AI model is trying to predict a phenomenon affected by non-physical determinants.  Predictions involving psychological, behavioural, sociological, political or economic factors cannot rely on the inherent properties of physical systems, and thus, in my humble opinion, such phenomenological models can never be truly validated.  I can probably validate an AI-based model that predicts walking speed from the measurement of the acceleration of the body's centre of mass, but not a model that predicts whether a subject will go out walking today.