
AI’s Victories in Go Inspire Better Human Game Playing

2023-03-15 17:51:12

In 2016 a computer named AlphaGo made headlines for defeating then world champion Lee Sedol at the ancient, popular strategy game Go. The “superhuman” artificial intelligence, developed by Google DeepMind, lost only one of the five rounds to Sedol, generating comparisons to Garry Kasparov’s 1997 chess loss to IBM’s Deep Blue. Go, in which players face off by placing black and white pieces called stones with the goal of occupying territory on the game board, had been viewed as a more intractable challenge to a machine opponent than chess.

Much agonizing about the threat of AI to human ingenuity and livelihood followed AlphaGo’s victory, not unlike what’s happening right now with ChatGPT and its kin. In a 2016 news conference after the loss, though, a subdued Sedol offered a comment with a kernel of positivity. “Its style was different, and it was such an unusual experience that it took time for me to adjust,” he said. “AlphaGo made me realize that I must study Go more.”

At the time European Go champion Fan Hui, who’d also lost a private round of five games to AlphaGo months earlier, told Wired that the matches made him see the game “completely differently.” This improved his play so much that his world ranking “skyrocketed,” according to Wired.

Formally tracking the messy process of human decision-making can be tough. But a decades-long record of professional Go players’ moves gave researchers a way to assess the human strategic response to an AI provocation. A new study now confirms that Fan Hui’s improvements after facing the AlphaGo challenge weren’t just a singular fluke. In 2017, after that humbling AI win in 2016, human Go players gained access to data detailing the moves made by the AI system and, in a very humanlike way, developed new strategies that led to better-quality decisions in their game play. Confirmation of the changes in human game play appears in findings published on March 13 in the Proceedings of the National Academy of Sciences USA.

“It is amazing to see that human players have adapted so quickly to incorporate these new discoveries into their own play,” says David Silver, principal research scientist at DeepMind and leader of the AlphaGo project, who was not involved in the new study. “These results suggest that humans will adapt and build upon these discoveries to massively increase their potential.”

To pinpoint whether the advent of superhuman AI drove humans to generate new strategies for game play, Minkyu Shin, an assistant professor in the department of marketing at City University of Hong Kong, and his colleagues used a database of 5.8 million moves recorded during games from 1950 through 2021. This record, maintained at the website Games of Go on Download, reflects every move of Go games played in tournaments as far back as the 19th century. The researchers began analyzing games from 1950 onward because that’s the year modern Go rules were established.

In order to start combing through the massive record of 5.8 million game moves, the team first created a way to rate the quality of decision-making for each move. To develop this index, the researchers used yet another AI system, KataGo, to compare the win rates of each human decision against those of AI decisions. This huge analysis involved simulating 10,000 ways the game could play out after each of the 5.8 million human decisions.
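The index described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the study’s actual pipeline: `toy_win_rate` stands in for a KataGo-style evaluator (the real engine estimates win probabilities by simulating thousands of continuations), and the quality score is simply the shortfall of the human move’s win rate relative to the AI’s preferred move.

```python
# Hypothetical sketch of the decision-quality index described above.
# `ai_win_rate` stands in for a KataGo-style evaluation (not the real
# KataGo API): it returns the estimated win probability for the player
# after a given move in a given position.

def decision_quality(ai_win_rate, position, human_move, ai_move):
    """Quality of a human move relative to the AI's preferred move.

    Returns a value <= 0 when the AI move is better: 0 means the human
    move matches the AI-estimated best win rate, and negative values
    measure the shortfall.
    """
    human_rate = ai_win_rate(position, human_move)
    best_rate = ai_win_rate(position, ai_move)
    return human_rate - best_rate

# Toy evaluator: a fixed table of win rates, for illustration only.
def toy_win_rate(position, move):
    table = {("p1", "d4"): 0.52, ("p1", "q16"): 0.55}
    return table[(position, move)]

print(round(decision_quality(toy_win_rate, "p1", "d4", "q16"), 2))  # -0.03
```

Averaging this score over all of a player’s moves in a game would yield the kind of per-game quality rating the researchers could then track over time.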

With a quality rating for each of the human decisions in hand, the researchers then developed a means to pinpoint exactly when a human decision during a game was novel, meaning it had not been recorded before in the history of the game. Chess players have long used a similar approach to determine when a new strategy in game play emerges.

In the novelty analysis of Go game play, the researchers mapped up to 60 moves for each game and marked when a novel move was introduced. If it emerged at, say, move nine in one game but not until move 15 in another, then the former game would have a higher novelty index score than the latter. Shin and his colleagues found that after 2017, most moves that the team defined as novel occurred by move 35.
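The novelty bookkeeping can be sketched as follows. This is an assumed, simplified reading of the method: a move is treated as novel when the game’s move sequence up to that point has never appeared in any earlier recorded game, and the index rewards games whose first novel move comes earlier (capped at 60 moves, as in the analysis above). The function names and the exact index formula are illustrative.

```python
# Hypothetical sketch of the novelty analysis described above.
# A move at index k is "novel" if the sequence of moves up to and
# including k has never appeared in any earlier recorded game.

def first_novel_move(game, seen_prefixes, max_moves=60):
    """Return the 1-based index of the first novel move, or None.

    `seen_prefixes` is a set of move-sequence tuples drawn from all
    earlier games; it is updated in place so that later games are
    compared against this game's lines as well.
    """
    novel_at = None
    for k in range(1, min(len(game), max_moves) + 1):
        prefix = tuple(game[:k])
        if novel_at is None and prefix not in seen_prefixes:
            novel_at = k
        seen_prefixes.add(prefix)
    return novel_at

def novelty_index(novel_at, max_moves=60):
    """Illustrative score: earlier deviation from known play scores higher."""
    if novel_at is None:
        return 0.0
    return (max_moves - novel_at + 1) / max_moves

seen = set()
g1 = ["q16", "d4", "q3", "d17"]   # first recorded game: novel from move 1
g2 = ["q16", "d4", "c3", "d17"]   # repeats g1's opening, diverges at move 3
print(first_novel_move(g1, seen))  # 1
print(first_novel_move(g2, seen))  # 3
```

Under this scoring, a game that deviates from known play at move 9 gets a higher novelty index than one that first deviates at move 15, matching the comparison in the paragraph above.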

The researchers then looked at whether the timing of novel moves in game play tracked with an increased quality of decisions—whether making such moves actually improved a player’s advantage on the board and the likelihood of a win. They especially wanted to see what, if anything, happened to decision quality after AlphaGo bested its human challenger Sedol in 2016 and another series of human challengers in 2017.

The team found that before AI beat human Go champions, the level of human decision quality stayed pretty uniform for 66 years. After that fateful 2016–2017 period, decision quality scores began to climb. Humans were making better game play choices—maybe not enough to consistently beat superhuman AIs but still better.

Novelty scores also shot up after 2016–2017 from humans introducing new moves into games earlier during the game play sequence. And in their assessment of the link between novel moves and better-quality decisions, Shin and his colleagues found that before AlphaGo succeeded against human players, humans’ novel moves contributed less to good-quality decisions, on average, than nonnovel moves. After these landmark AI wins, the novel moves humans introduced into games contributed more on average than already known moves to better decision quality scores.

One possible explanation for these improvements is that humans were memorizing new play sequences of moves. In the study, Shin and his colleagues also assessed how much memorization could explain decision quality. The researchers found that memorization would not completely explain decision quality improvements and was “unlikely” to underlie the increased novelty seen after 2016–2017.

Murat Kantarcioglu, a professor of computer science at the University of Texas at Dallas, says that these findings, taken together with work he and others have done, show that “clearly, AI can help improve human decision-making.” Kantarcioglu, who was not involved in the current study, says that the ability of AI to process “vast search spaces,” such as all possible moves in a complex game such as Go, means that AI can “find new solutions and approaches to problems.” For example, an AI that flags medical imaging as suggestive of cancer could lead a clinician to look more closely than they might have before. “This in turn will make the person a better doctor and prevent such mistakes in the future,” he says.

A hitch—as the world is seeing right now with ChatGPT—is the issue of making AI more trustworthy, Kantarcioglu adds. “I believe this is the main challenge,” he says.

In this new phase of concerns about ChatGPT and other AIs, the findings offer “a hopeful perspective” on the potential for AI to be an ally rather than a “potential enemy in our journey towards progress and betterment,” Shin and his co-authors wrote in an e-mail to Scientific American.

“My co-authors and I are currently conducting online lab experiments to explore how humans can improve their prompts and achieve better outcomes from these programs,” Shin says. “Rather than viewing AI as a threat to human intelligence, we should embrace it as a valuable tool that can enhance our abilities.”
