A Further Departure from the Neoclassical Period of AI
In the previous article in this series, we defined the concept of "machina behavioralis." Here, we delve deeper into additional behaviors and examples that further illustrate how AI decision-making deviates from the neoclassical interpretation of rationality [11].
More Behaviors and Examples
Temporal Discounting: Temporal discounting refers to the tendency to favor immediate rewards over future ones, which can affect AI decision-making. For instance, an AI might prioritize short-term gains in stock trading algorithms over long-term investment strategies, reflecting a bias towards immediate gratification similar to humans [1].
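To make the mechanism concrete, here is a minimal sketch of exponential discounting, using the discount factor familiar from reinforcement learning [12]. The reward amounts, delay, and gamma value are invented purely for illustration: with a sufficiently myopic factor, a small immediate reward outscores a much larger delayed one.

```python
# Minimal sketch: exponential temporal discounting in a reward-maximizing agent.
# The reward amounts, delay, and gamma below are made-up illustrative values.

def discounted_value(reward: float, delay: int, gamma: float) -> float:
    """Present value of a reward received `delay` steps in the future."""
    return reward * (gamma ** delay)

immediate = discounted_value(reward=10.0, delay=0, gamma=0.9)    # 10.0
delayed = discounted_value(reward=25.0, delay=10, gamma=0.9)     # ~8.7

# With gamma = 0.9 the agent prefers the smaller immediate payoff even though
# the delayed payoff is 2.5x larger, mirroring short-termism in trading agents.
print("immediate" if immediate > delayed else "delayed")
```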
Social Influence and Conformity: AI systems can be influenced by the behavior of other agents, leading to herd behavior or conformity. For example, in financial markets, algorithmic trading systems may follow trends set by other algorithms, amplifying market movements and creating bubbles or crashes. This mirrors human susceptibility to social influence [2].
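As a rough illustration of how conformity can amplify market movements, the toy simulation below (agent count, noise level, and price-impact factor are all invented) has every agent trade in the direction of the last price move, so a small initial shock compounds into a bubble-like run-up.

```python
# Toy sketch of herd behavior among trend-following trading agents.
# Agent count, noise level, and price-impact factor are arbitrary assumptions.
import random

random.seed(0)
price = 100.0
momentum = 0.01          # small initial upward shock
history = []

for step in range(20):
    # Every agent imitates the crowd: buy after an uptick, sell after a downtick.
    orders = [1 if momentum > 0 else -1 for _ in range(50)]
    net_demand = sum(orders) + random.randint(-5, 5)   # a little noise
    momentum = 0.002 * net_demand                      # herd demand moves the price
    price *= 1 + momentum
    history.append(round(price, 2))

print(history)   # prices ratchet upward step after step as agents imitate one another
```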
Ambiguity Aversion: Ambiguity aversion describes a preference for known risks over unknown ones. AI systems may exhibit this behavior by favoring decisions with well-defined probabilities over those with uncertain outcomes.
For instance, an AI in medical diagnosis might prefer treatments with well-documented efficacy over newer, less-certain options [3], at the patient's expense when the less-certain option would in fact have been the better decision.
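One way such ambiguity aversion can arise is through a worst-case (maxmin) decision rule. The sketch below uses hypothetical efficacy numbers: the known treatment and the uncertain one have the same midpoint efficacy, yet the worst-case rule always picks the known option.

```python
# Minimal sketch of an ambiguity-averse (maxmin / worst-case) choice rule.
# Efficacy values are hypothetical and chosen only for illustration.

def maxmin_value(efficacy_range):
    """An ambiguity-averse agent scores an uncertain option by its worst case."""
    return min(efficacy_range)

known_treatment = 0.70            # well-documented efficacy
novel_treatment = (0.50, 0.90)    # plausible range, true efficacy unknown

choice = "known" if known_treatment > maxmin_value(novel_treatment) else "novel"
print(choice)   # "known": the agent avoids ambiguity even though the novel
                # option's midpoint efficacy (0.70) is no worse
```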
Further, these systems might exhibit another form of bounded rationality [14], where heuristic methods are used to prioritize certain diagnostic tests over others based on limited computational resources or available data [8]. This can lead to diagnostic biases similar to those seen in human physicians [7].
Outside of healthcare, consider autonomous vehicles, which must make split-second decisions under uncertainty. These systems might display risk-averse behavior when navigating complex environments, preferring safer but less efficient routes to avoid potential accidents. This risk perception parallels human drivers' cautious behavior in uncertain conditions [3], [6].
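A simple way this shows up in planning is a risk-penalized cost function. In the hypothetical sketch below (route times, risk scores, and the weight `lam` are invented), a large risk weight makes the planner choose the slower but safer route.

```python
# Minimal sketch: risk-averse route selection via a risk-penalized cost.
# Route names, travel times, risk scores, and the weight `lam` are assumptions.

routes = {
    "direct": {"minutes": 12, "collision_risk": 0.020},
    "detour": {"minutes": 18, "collision_risk": 0.002},
}

def route_cost(route, lam):
    """Cost = travel time plus a weighted penalty for accident risk."""
    return route["minutes"] + lam * route["collision_risk"]

lam = 1000.0   # strongly risk-averse weighting
best = min(routes, key=lambda name: route_cost(routes[name], lam))
print(best)    # "detour": 18 + 2 = 20 beats 12 + 20 = 32
```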
Framing Effects: The way information is presented can significantly impact decision-making. AI systems, like humans, can be susceptible to framing effects. For example, the wording of queries or the presentation of options can influence an AI's recommendations in customer service or online search results [4].
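A deliberately naive sketch of how framing can leak into a scoring pipeline: the toy scorer below reacts to wording rather than to the underlying statistic, so two logically equivalent descriptions receive opposite scores. The scorer and descriptions are invented for illustration; real systems are far more complex.

```python
# Toy sketch: a wording-sensitive scorer exhibiting a framing effect.
# Real recommenders are far more complex; this only illustrates the failure mode.

def naive_score(description: str) -> int:
    """Keyword heuristic: rewards positive wording, penalizes negative wording."""
    score = 0
    if "success" in description:
        score += 1
    if "failure" in description:
        score -= 1
    return score

option_a = "Treatment with a 90% success rate"
option_b = "Treatment with a 10% failure rate"   # the same underlying statistic

print(naive_score(option_a), naive_score(option_b))   # 1 vs -1: the framing,
# not the facts, drives the recommendation
```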
Endowment Effect: The endowment effect refers to the tendency to overvalue owned items compared to equivalent items not owned [5]. In AI, this might manifest as a preference for retaining existing resources or rewards, even when new, potentially more valuable ones are available. For example, consider a resource-collecting robot. If the robot's reward function fails to properly appraise the potential value of new rewards in its environment, or if there is a distributional shift in the environment [13], it may overvalue the resources it has already gathered and therefore be reluctant to explore new areas or try new methods for resource collection, favoring its current endowment.
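Below is a minimal sketch of how such a mis-specified value function could produce endowment-like behavior; the inflation factor and payoffs are assumptions, not a description of any real system.

```python
# Minimal sketch: a value function that over-weights resources already held,
# so the agent declines to explore a new area with a higher objective yield.

def perceived_value(amount, owned, endowment_bias=2.0):
    """Owned resources are (wrongly) weighted more than identical new ones."""
    return amount * (endowment_bias if owned else 1.0)

current_stock = 5.0        # resources already gathered
new_area_yield = 8.0       # expected yield from exploring elsewhere

keep = perceived_value(current_stock, owned=True)        # 10.0
explore = perceived_value(new_area_yield, owned=False)   # 8.0

print("stay" if keep > explore else "explore")   # "stay", despite the new area
                                                 # being objectively more valuable
```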
Prospect Theory: As mentioned in the earlier article in the series, prospect theory could play a role in decision making. For example, in more sophisticated algorithmic trading, AI might exhibit behaviors reflecting prospect theory, such as risk-seeking behavior when facing potential losses. This can result in aggressive trading strategies during market downturns, mirroring human traders' tendencies to take greater risks when attempting to recoup losses [10].
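The sketch below uses the prospect-theory value function with its commonly cited parameter estimates (alpha = beta = 0.88, lambda = 2.25) to show why an agent facing losses may prefer a risky gamble over a sure loss of the same expected size; the trading framing is ours, not a real strategy.

```python
# Minimal sketch of the prospect-theory value function and risk-seeking
# in the loss domain. Parameters are the commonly cited estimates.

def pt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Concave for gains, convex and steeper (loss-averse) for losses."""
    return x ** alpha if x >= 0 else -lam * ((-x) ** beta)

sure_loss = pt_value(-50)                            # certain loss of 50
gamble = 0.5 * pt_value(-100) + 0.5 * pt_value(0)    # 50/50: lose 100 or nothing

# Both options lose 50 in expectation, but the gamble's prospect-theory value
# is higher (roughly -65 vs -70), so the agent takes the risk when in the red.
print("gamble" if gamble > sure_loss else "sure loss")
```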
Heuristic Bias: Also mentioned in the previous article, AI might rely on heuristics for efficiency, introducing biases similar to those seen in human decision-making. For example, AI systems used for recruitment may demonstrate heuristic bias by favoring candidates with characteristics similar to successful hires from the past. This can perpetuate existing biases in the hiring process, akin to human recruiters' reliance on heuristic shortcuts [9].
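Here is a minimal sketch of that heuristic shortcut: scoring candidates purely by resemblance to past hires (the attributes and profiles below are fabricated) reproduces whatever bias the historical hires encode.

```python
# Toy sketch: a similarity-to-past-hires heuristic for screening candidates.
# Attribute names and profiles are fabricated purely for illustration.

past_hires = [
    {"school": "Alma U", "gap_years": 0},
    {"school": "Alma U", "gap_years": 0},
    {"school": "State U", "gap_years": 0},
]

def similarity_score(candidate):
    """Count attribute matches between the candidate and every past hire."""
    return sum(
        candidate[key] == hire[key]
        for hire in past_hires
        for key in candidate
    )

familiar_profile = {"school": "Alma U", "gap_years": 0}
strong_but_different = {"school": "Other U", "gap_years": 2}

print(similarity_score(familiar_profile), similarity_score(strong_but_different))
# 5 vs 0: the heuristic favors whoever resembles previous hires,
# independent of actual ability, perpetuating historical bias
```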
Understanding the behaviors encapsulated by "machina behavioralis" is important for developing AI systems that better mimic the complexities of human decision-making. Recognizing these behavioral limitations, such as bounded rationality, heuristic biases, and the others discussed above, allows us to create more robust, adaptable, and fair AI systems. While the behaviors exemplified here are not exhaustive, we hope they provide sufficient grounds to explore these limitations further in the development of AI.
Glossary (Extended)
Machina Behavioralis: The proposed term to encapsulate additional behaviors in AI, such as bounded rationality, information asymmetry, heuristics and biases, prospect theory, and availability bias.
Temporal Discounting: The tendency to favor immediate rewards over future ones, affecting decision-making in both humans and AI.
Social Influence and Conformity: The impact of the behavior of other agents on an individual's decisions, leading to herd behavior.
Ambiguity Aversion: A preference for known risks over unknown ones, influencing decision-making under uncertainty.
Framing Effects: The influence of how information is presented on decision-making.
Endowment Effect: The tendency to overvalue owned items compared to equivalent items not owned.
References
[1] Green, L., & Myerson, J. (2004). “A Discounting Framework for Choice with Delayed and Probabilistic Rewards.” Psychological Bulletin, 130(5), 769-792.
[2] Bikhchandani, S., Hirshleifer, D., & Welch, I. (1992). “A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades.” Journal of Political Economy, 100(5), 992-1026.
[3] Ellsberg, D. (1961). “Risk, Ambiguity, and the Savage Axioms.” Quarterly Journal of Economics, 75(4), 643-669.
[4] Tversky, A., & Kahneman, D. (1981). “The Framing of Decisions and the Psychology of Choice.” Science, 211(4481), 453-458.
[5] Kahneman, D., Knetsch, J. L., & Thaler, R. H. (1991). “Anomalies: The Endowment Effect, Loss Aversion, and Status Quo Bias.” Journal of Economic Perspectives, 5(1), 193-206.
[6] Goodall, N. J. (2014). “Machine Ethics and Automated Vehicles.” In Road Vehicle Automation (pp. 93-102). Springer.
[7] Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). “Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations.” Science, 366(6464), 447-453.
[8] Tversky, A., & Kahneman, D. (1973). “Availability: A Heuristic for Judging Frequency and Probability.” Cognitive Psychology, 5(2), 207-232.
[9] Cowgill, B., Dell'Acqua, F., Deng, S., et al. (2021). “Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics.” AEA Papers and Proceedings, 111, 491-495.
[10] Barberis, N. C. (2013). “Thirty Years of Prospect Theory in Economics: A Review and Assessment.” Journal of Economic Perspectives, 27(1), 173-196.
[11] Landreth, H., & Colander, D. C. (2002). History of Economic Thought (4th ed.). Houghton Mifflin.
[12] Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press.
[13] Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016). “Concrete Problems in AI Safety.” arXiv preprint arXiv:1606.06565.
[14] Simon, H. A. (1955). “A Behavioral Model of Rational Choice.” Quarterly Journal of Economics, 69(1), 99-118.
These are well-differentiated areas of concern, and they point towards a recognizable feature landscape. In particular, I can imagine Ambiguity Aversion and Social Conformity being features, not bugs, given proper supplemental control mechanisms. We need maximum attention to such control mechanisms in the face of a capabilities explosion.
We don't want to naively mimic human decision-making (Homo Behavioralis), which is the origin of all these traits in Machina Behavioralis. We need to make something wholly better, something that drags human decision-making along without kicking and screaming, and we are only beginning to define what that is.
Even if we could get close to Machina Economicus, that would only address 1% of alignment issues: what goals should it rationally pursue? Humanity should figure out analog collective intelligence before accelerating half-assed AI experiments towards it, but we aren't going to do that; narrower attempts at "AI alignment" are all we have to, perhaps vainly, mitigate the coming disaster.