Antifragility and Machine Learning
Our intuition for most products, processes, and even some models is that they will either get worse over time or, if they fail, suffer a cascade of further failures. But what if we could intentionally design systems and models to only get better, even as the world around them gets worse?
By Prad Upadrashta, SVP & Chief Data Science Officer (AI solutions) at Mastech InfoTrellis
A client recently asked if our entity matching algorithms are “antifragile.” This got me thinking. It is a really interesting question. Bear with me as I take you on a mind trip to explore this question and its implications. First, let’s start with a common understanding of “antifragile.” We’ve all read the Google definition: when a system gains from stressors, shocks, volatility, noise, disorder, mistakes, faults, attacks, or failures, it is termed “antifragile.”
So, what does that look like? Well, at first pass, we might think it would look something like this:
Figure. A linear relationship between Entropy and Gain.
The line shows clearly that the system is gaining from disorder, e.g., +1 unit of gain for every +1 unit of disorder, but this is where things become weird. Antifragile systems also exhibit a peculiar response to each incremental increase in the level of disorder: the previous gains feed back into the system in such a way as to compound the effect, which gives the response a curved, or convex, appearance. This convexity is a critical feature of antifragile systems.
Simply, antifragility is defined as a convex response to a stressor or source of harm (for some range of variation), leading to a positive sensitivity to increase in volatility (or variability, stress, dispersion of outcomes, or uncertainty, what is grouped under the designation "disorder cluster"). Likewise, fragility is defined as a concave sensitivity to stressors, leading to a negative sensitivity to an increase in volatility. The relation between fragility, convexity, and sensitivity to disorder is mathematical, obtained by theorem, not derived from empirical data mining or some historical narrative. It is a priori.
— Taleb, N. N., Philosophy: 'Antifragility' as a mathematical idea. Nature, 2013 Feb 28; 494 (7438), 430.
So, the line should really be curved upward as follows:
Figure. Convexity of antifragility where the accelerating curve is due to feedback.
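To make the link between convexity and "gaining from disorder" concrete, here is a minimal sketch. It assumes a hypothetical convex response (x squared), a hypothetical concave mirror image, and normally distributed stress around a fixed average; none of these choices come from the article or from Taleb. It only illustrates that, for the same average stress, more volatility raises the average outcome of a convex response and lowers that of a concave one.

```python
import random
import statistics


def convex_response(x):
    """Hypothetical antifragile response: gains accelerate with the stressor."""
    return x ** 2


def concave_response(x):
    """Hypothetical fragile response: losses accelerate with the stressor."""
    return -(x ** 2)


def mean_response(response, volatility, mean_stress=1.0, n=20_000, seed=0):
    """Average response when stress fluctuates around mean_stress.

    The stress level is drawn from a normal distribution (an assumption);
    volatility is its standard deviation. The seed keeps runs repeatable.
    """
    rng = random.Random(seed)
    draws = (rng.gauss(mean_stress, volatility) for _ in range(n))
    return statistics.fmean(response(x) for x in draws)


# Same average stress each time; only the volatility changes.
for vol in (0.0, 0.5, 1.0):
    gain = mean_response(convex_response, vol)
    loss = mean_response(concave_response, vol)
    print(f"volatility={vol:.1f}  convex={gain:.3f}  concave={loss:.3f}")
```

A straight-line response would show the same average at every volatility level, which is why curvature, rather than slope alone, is what separates antifragile from merely growing.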
Compound interest is a well-known concept and serves here as a useful illustration of how the simple act of re-investing returns can lead to convexity in the return stream. When you take your prior gains and re-invest them into a consistently producing process, your prior gains also see gains, and so on ad infinitum. As these gains accumulate, we see a snowball effect over time. This results in a curved (accelerating) line, not a straight line. This curvature is convexity. Convex systems are nonlinear. I’m not suggesting here that compound interest is antifragile; rather, it is one practical example of a process in which positive feedback (reinvestment) drives the slope of the function to increase nonlinearly.
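The difference between the straight line and the curve can be sketched in a few lines. The principal, rate, and number of periods below are illustrative assumptions, and the function names are hypothetical. The second difference of each series (the change in the period-over-period change) is a simple discrete check for curvature: roughly zero for a line, positive for a convex curve.

```python
def simple_growth(principal, rate, periods):
    """Balance over time when gains are withdrawn, not reinvested (linear)."""
    return [principal * (1 + rate * t) for t in range(periods + 1)]


def compound_growth(principal, rate, periods):
    """Balance over time when gains are reinvested each period (convex)."""
    return [principal * (1 + rate) ** t for t in range(periods + 1)]


def second_differences(series):
    """Discrete curvature: how much each period's change itself changes."""
    return [
        series[i + 1] - 2 * series[i] + series[i - 1]
        for i in range(1, len(series) - 1)
    ]


# Illustrative numbers only: 100 units at 10% per period for 10 periods.
linear = simple_growth(100.0, 0.10, 10)
curved = compound_growth(100.0, 0.10, 10)

print("simple  :", [round(v, 2) for v in linear])
print("compound:", [round(v, 2) for v in curved])
print("curvature (simple)  :", [round(d, 4) for d in second_differences(linear)])
print("curvature (compound):", [round(d, 4) for d in second_differences(curved)])
```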
In his books, Nassim Taleb gives three examples of antifragility: airlines, restaurants, and Silicon Valley. All three become stronger every time something goes wrong. If an airplane goes down, every manufacturer will take pains to make sure that the next generation of airplanes never experiences the same problem. Silicon Valley is especially interesting because it treats every failure or inefficiency in the market as an opportunity, which leads to value creation through the formation of companies that address the problem. It is important to point out that these systems were not engineered to be antifragile; their antifragility seems to have been recognized only after the fact.
So, one open question is whether we can purposefully engineer systems and/or processes to make them antifragile.
Within the subset of natural systems, those that benefit from feedback turn out to be antifragile; for instance, the body’s immune system is strengthened in response to external stressors (viruses, bacteria, etc.). This, in fact, is the basis of all vaccines – the introduction of a weak stressor that triggers the immune system to adapt by learning to recognize the surface proteins that make up the viral shell – so that antibodies can be produced to attack the full-strength virus when it enters the system.
Most man-made systems and/or processes do not exhibit this sort of antifragile behavior on their own. At a process level, however, they turn out to be antifragile because humans have a natural inclination towards process improvement: we study past failures and adapt accordingly. So, we impose our own antifragile tendencies on the systems we design because we simply can’t leave things alone. Left to itself, the vast majority of the natural physical world exhibits a tendency to decay over time. This is particularly true of the failure modes observed in complex systems (where system failures rise exponentially over time). So, most real systems exhibit fragility, which looks more like this:
Figure. A failure curve that points downward. As one thing fails, it creates the conditions for something else to fail, leading to negative convexity.
In fragile systems, each failure leads to successively more failures. Failures compounding on top of failures lead to a rapidly decaying system; again, we note explicitly that this is not a linearly decaying scenario. Intuitively, we know that once something breaks, it sets in motion the tendency for other things to break (especially where there are strong direct or mechanical dependencies). Every failure makes the next failure more likely because failures are not independent of one another; in mechanical systems, in particular, failures tend to be highly dependent. This gives rise to the “right edge” of the so-called “bathtub curve” that is famous in the reliability world. One of the most frequently asked questions in the reliability world is: How likely is it that component B will fail, given that component A has already failed?
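The reliability question above is a conditional probability, and a small simulation shows how dependence changes the answer. Everything numeric here is a made-up assumption for illustration: component A fails 10% of the time, and B fails 5% of the time when A is healthy but 40% of the time once A has failed.

```python
import random


def simulate_dependent_failures(p_a=0.10, p_b_alone=0.05, p_b_after_a=0.40,
                                trials=100_000, seed=0):
    """Estimate P(B fails) and P(B fails | A failed) by simulation.

    All probabilities are hypothetical defaults, not measured values.
    """
    rng = random.Random(seed)
    a_failures = b_failures = both = 0
    for _ in range(trials):
        a_failed = rng.random() < p_a
        # B's failure chance depends on whether A has already failed.
        p_b = p_b_after_a if a_failed else p_b_alone
        b_failed = rng.random() < p_b
        a_failures += a_failed
        b_failures += b_failed
        both += a_failed and b_failed
    p_b_marginal = b_failures / trials
    p_b_given_a = both / a_failures
    return p_b_marginal, p_b_given_a


p_b, p_b_given_a = simulate_dependent_failures()
print(f"P(B fails)            ~ {p_b:.3f}")
print(f"P(B fails | A failed) ~ {p_b_given_a:.3f}")
```

If the two failures were independent, both estimates would be about the same; the gap between them is the dependence that drives the cascade.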
Figure. A “Bathtub Curve” model is used for modeling things like machine component failures. On the far right, you see an accelerating curve that represents the accelerating failure rate of a machine that is in its “wear out” phase.
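One common textbook way to sketch a bathtub-shaped failure rate is to add three hazard terms: a decreasing Weibull hazard for early "infant mortality" failures, a constant rate for random failures, and an increasing Weibull hazard for wear-out. This is an assumed model for illustration, not the one behind the figure, and every parameter value below is hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate at time t > 0: (k / s) * (t / s) ** (k - 1).

    shape < 1 gives a falling failure rate, shape > 1 a rising one.
    """
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t, infant=(0.5, 50.0), random_rate=0.002,
                   wear_out=(5.0, 100.0)):
    """Sum of early-life, random, and wear-out hazards (hypothetical values).

    infant and wear_out are (shape, scale) pairs; t must be positive.
    """
    return (
        weibull_hazard(t, *infant)
        + random_rate
        + weibull_hazard(t, *wear_out)
    )


# High at the start, low in mid-life, accelerating at the right edge.
for t in (1, 10, 30, 60, 100, 125, 150):
    print(f"t={t:>3}  failure rate={bathtub_hazard(t):.4f}")
```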
We can develop an intuition for the behavior of “antifragile” systems by studying and contrasting these extreme cases, i.e., the edges of the “known,” if you will. Studying the extremes of any problem can help you identify the “boundaries” of a problem. By bounding a problem between two lines, we can start to understand the characteristic behaviors we are likely to expect in-between, all the while implicitly making a few mathematical assumptions about continuity and regularity, which we won’t go into here. The point is, we can interpolate between the extremes to understand the characteristics of the system under a variety of different operating parameters.
Original. Reposted with permission.
Bio: Prad Upadrashta has over 20 years of experience, culminating in the role of Chief Data Science Officer at Mastech InfoTrellis. His focus areas are Artificial Intelligence, Machine/Deep Learning, Blockchain, IIoT/IoT, and Industry 4.0.