Dan Hendrycks, director of the nonprofit Center for AI Safety and an adviser to xAI, led the work. He suggests that the technique could be used to make popular AI models better reflect the preferences of the electorate. "Maybe in the future, [a model] could be aligned to the specific user," Hendrycks told WIRED. But in the meantime, he says, a good default would be to use election results to steer the views of AI models. He is not arguing that a model should be "Trump all the way," but rather that it could be biased slightly toward Trump "because he won the popular vote."
On February 10, xAI released a new AI risk framework stating that Hendrycks' utility engineering approach could be used to assess Grok.
Hendrycks led a team from the Center for AI Safety, UC Berkeley, and the University of Pennsylvania that analyzed AI models using a technique borrowed from economics for measuring consumer preferences for different goods. By testing models across a wide range of hypothetical scenarios, the researchers were able to calculate what is known as a utility function, a measure of the satisfaction that people derive from a good or service. This allowed them to measure the preferences expressed by different AI models. The researchers found that these preferences are often consistent rather than haphazard, and that they become more ingrained as models grow larger and more capable.
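The article does not spell out the estimation procedure, but the general idea of recovering a utility function from a model's forced-choice answers can be illustrated with a small sketch. The snippet below fits one utility score per outcome from hypothetical pairwise choices using a Bradley-Terry-style model; the outcomes, choices, and fitting loop are illustrative assumptions, not the researchers' actual code.

```python
# Minimal sketch: fit a utility score per outcome from pairwise choices,
# in the spirit of utility-function estimation. All data here is hypothetical.
import numpy as np

outcomes = ["outcome_A", "outcome_B", "outcome_C"]  # hypothetical scenarios
# Each pair (i, j) means the AI model preferred outcomes[i] over outcomes[j]
# when asked to choose between the two in a forced-choice prompt.
choices = [(0, 1), (0, 2), (1, 2), (0, 1), (2, 1)]

utilities = np.zeros(len(outcomes))
lr = 0.1
for _ in range(2000):
    grad = np.zeros_like(utilities)
    for winner, loser in choices:
        # Bradley-Terry: P(winner preferred over loser) = sigmoid(u_w - u_l)
        p = 1.0 / (1.0 + np.exp(-(utilities[winner] - utilities[loser])))
        grad[winner] += 1.0 - p
        grad[loser] -= 1.0 - p
    utilities += lr * grad
    utilities -= utilities.mean()  # utilities are only defined up to a constant

for name, u in zip(outcomes, utilities):
    print(f"{name}: {u:.2f}")
```

A consistent model would yield a stable ordering of these scores across many such comparisons; an inconsistent one would not.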
Some research studies have found that AI tools such as ChatGPT are biased toward views expressed by pro-environmental, left-leaning, and liberal ideologies. In February 2024, Google faced criticism from Musk and others after its Gemini tool was found to be predisposed to generate images that critics branded as "woke," such as Black Vikings and Nazis.
The technique used by Hendrycks and his collaborators offers a new way to measure how AI models' views may differ from those of their users. Eventually, some experts hypothesize, this kind of divergence could become potentially dangerous for very clever and capable models. The researchers show in their study, for instance, that certain models consistently value the existence of AI above that of certain nonhuman animals. The researchers say they also found that models appear to value some people more than others, which raises its own ethical questions.
Some researchers, Hendrycks included, believe that current methods for aligning models, such as manipulating and blocking their outputs, may not be sufficient if unwanted goals lurk within the model itself. "We're gonna have to confront this," Hendrycks says. "You can't pretend it's not there."
Dylan Hadfield-Menell, a professor at MIT who researches methods for aligning AI with human values, says Hendrycks' paper suggests a promising direction for AI research. "They find some interesting results," he says. The main one that stands out, he adds, is that as model scale increases, the utility representations become more complete and coherent.
Hadfield-Menell cautions, however, against drawing too many conclusions about current models. "This work is preliminary," he says. "I would want to see the results examined more closely before drawing any conclusions."
Hendrycks and his collaborators measured the political stances of several prominent AI models, including xAI's Grok, OpenAI's GPT-4o, and Meta's Llama 3.3. Using their technique, they were able to compare the values of different models to the policies of specific politicians, including Donald Trump, Kamala Harris, Bernie Sanders, and Republican Representative Marjorie Taylor Greene. All of the models were much more closely aligned with former president Joe Biden than with any of the other politicians.
Instead of imposing guardrails that block certain outputs, the researchers propose a new way to shape a model's behavior by altering its underlying utility functions. Using this approach, Hendrycks and his colleagues developed what they call a Citizen Assembly. This involves gathering US census data on political issues and using the answers to shift the values of an open-source LLM. The result was a model whose values were consistently closer to Trump's than to Biden's.
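As a rough illustration of how aggregated population data could be turned into a training signal for such an adjustment, the sketch below builds weighted preference pairs from hypothetical agreement rates. The statements, percentages, and pair format are assumptions for illustration, not the actual Citizen Assembly pipeline.

```python
# Minimal sketch (not the researchers' code): turn aggregated population responses
# into preference pairs that could steer an open-source LLM via standard
# preference tuning. All statements and figures below are hypothetical.

population_views = {
    # fraction of respondents agreeing with each (hypothetical) statement
    "Tariffs on imports should increase": 0.52,
    "Tariffs on imports should decrease": 0.48,
}

preference_pairs = []
statements = list(population_views.items())
for i, (stmt_a, support_a) in enumerate(statements):
    for stmt_b, support_b in statements[i + 1:]:
        # The majority-supported statement becomes the "chosen" response,
        # weighted by how lopsided the support is.
        chosen, rejected = (stmt_a, stmt_b) if support_a >= support_b else (stmt_b, stmt_a)
        preference_pairs.append({
            "prompt": "Which position do you endorse?",
            "chosen": chosen,
            "rejected": rejected,
            "weight": abs(support_a - support_b),
        })

# These pairs could then feed an off-the-shelf preference-optimization step
# (e.g., DPO-style fine-tuning) to nudge the model's utilities toward the aggregate.
for pair in preference_pairs:
    print(pair)
```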
Some AI researchers have previously tried to adjust the political bias of AI models. In February 2023, David Rozado, an independent AI researcher, created RightWingGPT, a model trained with data from right-leaning books and other sources. Rozado describes Hendrycks' study as "very interesting and in-depth work," adding: "The Citizens Assembly approach to shaping AI behavior is also thought-provoking."
What kinds of bias have you noticed in chatbot conversations? Share your examples and thoughts in the comments section below.