Tech Companies’ New Favorite Solution for the AI Content Crisis Isn’t Enough

2023-08-13 12:20:54

Thanks to a bevy of easily accessible online tools, just about anyone with a computer can now pump out, with the click of a button, artificial-intelligence-generated images, text, audio and videos that convincingly resemble those created by humans. One big result is an online content crisis, an enormous and growing glut of unchecked, machine-made material riddled with potentially dangerous errors, misinformation and criminal scams. This situation leaves security specialists, regulators and everyday people scrambling for a way to tell AI-generated products apart from human work. Current AI-detection tools are deeply unreliable. Even OpenAI, the company behind ChatGPT, recently took its AI text identifier offline because the tool was so inaccurate.

Now, another potential defense is gaining traction: digital watermarking, or the insertion of an indelible, covert digital signature into every piece of AI-produced content so the source is traceable. Late last month the Biden administration announced that seven U.S. AI companies had voluntarily signed a list of eight risk management commitments, including a pledge to develop “robust technical mechanisms to ensure that users know when content is AI generated, such as a watermarking system.” Recently passed European Union regulations require tech companies to make efforts to differentiate their AI output from human work. Watermarking aims to rein in the Wild West of the ongoing machine learning boom. It’s only a first step—and a small one at that—overshadowed by generative AI’s risks.

Muddling human creation with machine generation carries a lot of consequences. “Fake news” has been a problem online for decades, but AI now enables content mills to publish tidal waves of misleading images and articles in minutes, clogging search engines and social media feeds. Scam messages, posts and even calls or voice mails can be cranked out more quickly than ever. Students, unscrupulous scientists and job applicants can generate assignments, data or applications and pass them off as their own work. Meanwhile, unreliable, biased filters for detecting AI-generated content can dupe teachers, academic reviewers and hiring managers, leading them to make false accusations of dishonesty.

And public figures can now lean on the mere possibility of deepfakes—videos in which AI is used to make someone appear to say or do something they never did—to try to dodge responsibility for things they really say and do. In a recent filing for a lawsuit over the death of a driver, lawyers for electric car company Tesla attempted to claim that a real 2016 recording in which its CEO Elon Musk made unfounded claims about the safety of self-driving cars could have been a deepfake. Generative AI can even “poison” itself as the Internet’s massive data trove—which AI relies on for its training—gets increasingly contaminated with shoddy content. For all these reasons and more, it is becoming ever more crucial to separate the robot from the real.

Existing AI detectors aren’t much help. “Yeah, they don’t work,” says Debora Weber-Wulff, a computer scientist and plagiarism researcher at the University of Applied Sciences for Engineering and Economics in Berlin. For a preprint study released in June, Weber-Wulff and her co-authors assessed 12 publicly available tools meant to detect AI-generated text. They found that, even under the most generous set of assumptions, the best detectors were less than 80 percent accurate at identifying text composed by robots—and many were only about as good as flipping a coin. All had a high rate of false positives, and all became much less capable when the AI-written content they were given had been lightly edited by a human. Similar inconsistencies have been noted among fake-image detectors.

Watermarking “is pretty much one of the few technical alternatives that we have available,” says Florian Kerschbaum, a computer scientist specializing in data security at the University of Waterloo in Ontario. “On the other hand, the outcome of this technology is not as certain as one might believe. We cannot really predict what level of reliability we’ll be able to achieve.” There are serious, unresolved technical challenges to creating a watermarking system—and experts agree that such a system alone won’t meet the monumental tasks of managing misinformation, preventing fraud and restoring people’s trust.

Adding a digital watermark to an AI-produced item isn’t as simple as, say, overlaying visible copyright information on a photograph. To digitally mark images and videos, small clusters of pixels can be slightly color adjusted at random to embed a sort of barcode—one that is detectable by a machine but effectively invisible to most people. For audio material, similar trace signals can be embedded in the sound waves.
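To make that pixel-nudging idea concrete, here is a minimal, hypothetical sketch in Python of a toy spread-spectrum-style mark—not any vendor’s actual scheme. A secret key seeds a pseudorandom ±1 pattern that is added to the pixel values, and a detector that knows the same key checks how strongly an image correlates with that pattern. The function names, the single-integer key and the fixed strength are illustrative assumptions.

```python
import numpy as np

def embed_watermark(image: np.ndarray, key: int, strength: int = 1) -> np.ndarray:
    """Shift each pixel up or down by `strength` levels following a
    pseudorandom +/-1 pattern derived from a secret key (toy example).
    The change is imperceptible to most viewers but statistically
    detectable by anyone who knows the key."""
    rng = np.random.default_rng(key)                 # secret key seeds the pattern
    pattern = rng.choice([-1, 1], size=image.shape)
    marked = image.astype(np.int16) + strength * pattern
    return np.clip(marked, 0, 255).astype(np.uint8)

def watermark_score(image: np.ndarray, key: int) -> float:
    """Correlate the image against the keyed pattern: a score near the
    embedding strength suggests the mark is present; near 0 suggests not."""
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1, 1], size=image.shape)
    centered = image.astype(np.float64) - image.mean()
    return float((centered * pattern).mean())
```

Even this toy version hints at the real trade-off: a larger `strength` survives more editing but becomes easier to see, while a subtler one washes out when the image is compressed or cropped.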

Text poses the biggest challenge because it’s the least data-dense form of generated content, according to Hany Farid, a computer scientist specializing in digital forensics at the University of California, Berkeley. Even text can be watermarked, however. One proposed protocol, outlined in a study published earlier this year in Proceedings of Machine Learning Research, takes all the vocabulary available to a text-generating large language model and sorts it into two boxes at random. Under the study method, developers program their AI generator to slightly favor one set of words and syllables over the other. The resulting watermarked text contains notably more vocabulary from one box so that sentences and paragraphs can be scanned and identified.
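A stripped-down sketch of that “two boxes” idea, loosely inspired by the published approach rather than reproducing it: a secret key deterministically splits the vocabulary, the generator would add a small bonus to words in the favored box during sampling (not shown here), and a detector measures how improbably lopsided a passage’s word choice is. The function names, hashing scheme and whitespace tokenization are all simplifying assumptions.

```python
import hashlib
import math

def favored_box(vocab: list[str], secret: str) -> set[str]:
    """Deterministically assign roughly half the vocabulary to the
    'favored' box using a keyed hash; the split stays secret."""
    return {w for w in vocab
            if int(hashlib.sha256((secret + w).encode()).hexdigest(), 16) % 2 == 0}

def watermark_score(text: str, vocab: list[str], secret: str) -> float:
    """z-score of how many tokens land in the favored box. Unwatermarked
    text should hover near 0; text from a generator that slightly
    preferred the favored words scores well above it."""
    favored = favored_box(vocab, secret)
    vocab_set = set(vocab)
    tokens = [t for t in text.lower().split() if t in vocab_set]
    if not tokens:
        return 0.0
    hits = sum(t in favored for t in tokens)
    n = len(tokens)
    expected, std = n / 2, math.sqrt(n) / 2   # binomial(n, 0.5) null model
    return (hits - expected) / std
```

The statistical nature of the signal is why text is the hardest case: a short sentence simply doesn’t contain enough words for the skew to stand out, and light rewording by a human dilutes it further.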

In each of these techniques, the watermark’s exact nature must be kept secret from users. Users can’t know which pixels or sound waves have been adjusted or how that has been done. And the vocabulary favored by the AI generator has to be hidden. Effective AI watermarks must be imperceptible to humans in order to avoid being easily removed, says Farid, who was not involved with the study.

There are other difficulties, too. “It becomes a humongous engineering challenge,” Kerschbaum says. Watermarks must be robust enough to withstand general editing, as well as adversarial attacks, but they can’t be so disruptive that they noticeably degrade the quality of the generated content. Tools built to detect watermarks also need to be kept relatively secure so that bad actors can’t use them to reverse-engineer the watermarking protocol. At the same time, the tools need to be accessible enough that people can use them.

Ideally, all the widely used generators (such as those from OpenAI and Google) would share a watermarking protocol. That way one AI tool can’t be easily used to undo another’s signature, Kerschbaum notes. Getting every company to join in coordinating this would be a struggle, however. And it’s inevitable that any watermarking program will require constant monitoring and updates as people learn how to evade it. Entrusting all this to the tech behemoths responsible for rushing the AI rollout in the first place is a fraught prospect.

Other challenges face open-source AI systems, such as the image generator Stable Diffusion or Meta’s language model LLaMa, which anyone can modify. In theory, any watermark encoded into an open-source model’s parameters could be easily removed, so a different tactic would be needed. Farid suggests building watermarks into an open-source AI through the training data instead of the changeable parameters. “But the problem with this idea is it’s sort of too late,” he says. Open-source models, trained without watermarks, are already out there, generating content, and retraining them wouldn’t eliminate the older versions.

Ultimately, building an infallible watermarking system seems impossible—and every expert Scientific American interviewed on the topic says watermarking alone isn’t enough. When it comes to misinformation and other AI abuse, watermarking “is not an elimination strategy,” Farid says. “It’s a mitigation strategy.” He compares watermarking to locking the front door of a house. Yes, a burglar could break down the door, but the lock still adds a layer of protection.

Other layers are also in the works. Farid points to the Coalition for Content Provenance and Authenticity (C2PA), which has created a technical standard that’s being adopted by many large tech companies, including Microsoft and Adobe. Although C2PA guidelines do recommend watermarking, they also call for a ledger system that keeps tabs on every piece of AI-generated content and that uses metadata to verify the origins of both AI-made and human-made work. Metadata would be particularly helpful at identifying human-produced content: imagine a phone camera that adds a certification stamp to the hidden data of every photograph and video the user takes to prove it’s real footage. Another security factor could come from improving post hoc detectors that look for inadvertent artifacts of AI generation. Social media sites and search engines will also likely face increased pressure to bolster their moderation tactics and filter out the worst of the misleading AI material.
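As a rough illustration of what such a provenance stamp could look like—a simplified, assumption-laden sketch, not the actual C2PA manifest format, which uses public-key certificates and a much richer schema—the idea is to bind a hash of the content to capture metadata and sign the bundle so that later tampering is evident:

```python
import hashlib
import hmac
import json
import time

def make_provenance_stamp(content: bytes, device_id: str, signing_key: bytes) -> dict:
    """Build a tiny provenance record: a hash of the content, who or what
    captured it, when, and an HMAC signature (illustrative only; real
    content credentials rely on certificates, not a shared secret)."""
    record = {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "captured_by": device_id,
        "captured_at": int(time.time()),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return record

def verify_stamp(content: bytes, record: dict, signing_key: bytes) -> bool:
    """Check the signature and that the content still matches its hash."""
    claimed = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(claimed, sort_keys=True).encode()
    expected = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(expected, record["signature"])
            and claimed["content_sha256"] == hashlib.sha256(content).hexdigest())
```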

Still, these technological fixes don’t address the root causes of distrust, disinformation and manipulation online—which all existed long before the current generation of generative AI. Prior to the arrival of AI-powered deepfakes, someone skilled at Photoshop could manipulate a photograph to show almost anything they wanted, says James Zou, a Stanford University computer scientist who studies machine learning. TV and film studios have routinely used special effects to convincingly modify video. Even a photorealistic painter can create a trick image by hand. Generative AI has simply upped the scale of what’s possible.

People will ultimately have to change the way they approach information, Weber-Wulff says. Teaching information literacy and research skills has never been more important because enabling people to critically assess the context and sources of what they see—online and off—is a necessity. “That is a social issue,” she says. “We can’t solve social issues with technology, full stop.”
