You loved ChatGPT. Wait until you see its rivals.

2023-01-31 04:45:15

Connor Leahy thought this might happen. “I’ve been basically waiting for this moment for at least two years now,” says the co-founder of the AI start-up Conjecture, referring to the current buzz surrounding ChatGPT. Deeply enmeshed in the development of large language models (LLMs) since his time at the helm of EleutherAI, an open-source cooperative of machine learning engineers and enthusiasts, Leahy got his general amazement at the capabilities of these programs out of the way as far back as 2019, when he first glimpsed GPT-2. “Now, I feel like a lot of people are having the reaction that I had,” he says.

That reaction can generally be described as one of delight, awe and not a little foreboding. Built using GPT-3, an LLM boasting some 175 billion parameters, ChatGPT has knocked the world for six with its effortless ability to write reams of eloquent, incisive prose (and some poetry, too). Joined by other services like DALL-E 2 and Midjourney, the market for generative AI has suddenly become very hot indeed, with Microsoft poised to dramatically increase its initial $1bn investment in ChatGPT’s creator, OpenAI, with a cash injection of up to $10bn. Little wonder, then, that increasing attention is being paid to the development of alternative LLMs to GPT-3 – foundation models that other tech giants would likely snap up in a heartbeat.

ChatGPT has captured the imagination of the general public – and big tech firms eager to invest in the next hot generative AI model. (Photo by Rmedia7 / Shutterstock)

Jurassic’s Park

One of these alternatives is Jurassic-1. Numbering some 178 billion parameters in its largest form, the LLM is the brainchild of AI21 Labs, a research start-up turned product and platform company founded in 2017 with the goal of creating models specialising in language generation and comprehension. “We didn’t want to play a pure research game like DeepMind,” says Ori Goshen, AI21 Labs’ chief executive (though it, too, is flirting with the idea of its own superpowered chatbot). “We admire DeepMind, we think they’re great. But we also wanted to bring real commercial value from the get-go.”

That same instinct is present at Cohere, another AI start-up based out of Toronto. Its latest model, explains its chief operating officer Martin Kon, powers classification, semantic search and content moderation across 160 languages. “That might not be exciting to consumers who enjoy generating poems about their cats or images of dogs in sushi houses, but we feel it’s certainly incredibly exciting to CEOs and executives of companies, organisations, and governments of all sizes everywhere in the world,” he says. 


The increasing buzz surrounding generative AI seems to be helping both start-ups hit their stride, with Cohere raising $125m in Series B funding in February last year and AI21 Labs’ Jurassic-1 released on AWS’s machine learning platform SageMaker in November. This doesn’t mean that AI21 Labs is imitating OpenAI’s close relationship with Microsoft, says Goshen. While he doesn’t rule out the possibility of a much closer partnership with a large platform – “I mean, never say never,” says Goshen, highlighting AI21 Labs’ fruitful partnerships with Amazon and Google – he maintains that the company’s neutrality when it comes to cloud providers has long-term benefits, not least in allowing it to explore the latest computing hardware, whatever the provider.

A similar mindset seems to prevail among start-ups in the LLM space, including BigScience and its BLOOM model, and Anthropic, which just released ChatGPT rival Claude to mixed reviews. As the market for new foundation models continues to heat up, Goshen predicts that demand will inevitably grow for LLMs to move away from being trained on pages scraped en masse from the internet to narrower, proprietary datasets. “That can create these more specialised models, maybe [for] specific domains, or even…models that have an understanding of a specific company,” he says.

An LLM alignment problem

It’s impossible to tell, however, whether we’re seeing the emergence of a dynamic marketplace for LLMs or the beginning of a longer process of consolidation, as Big Tech companies fight to acquire as much talent and capability for themselves as possible. What we may see, argues Leahy, is something akin to Goshen’s prediction, with the creation of smaller models designed to accomplish narrower goals. The creation of larger and more complex LLMs, however, will continue to depend on the goodwill and infrastructure of hyperscalers. “There really are only a few actors in the world who are capable of mustering the resources to train something or build something like GPT-4,” he says, referring to OpenAI’s next LLM, due to be released sometime this year. Details of the system have yet to be revealed, but many in the field predict it will surpass most other LLMs in size and complexity.

Even so, OpenAI’s CEO Sam Altman recently told Reuters that the firm wouldn’t release the model until it met strict safeguarding benchmarks. It’s a challenge Leahy sympathises with. Known in AI research as the ‘alignment problem’, it is one he spends much of his time at Conjecture working on: figuring out how to bind new models to quintessentially human ethics and motivations. While he believes that generative AI has the potential to become “unimaginably positive for the world,” he worries that not enough people realise machine intelligences think very differently to the people they’re serving – and that those who do realise, like OpenAI and other start-ups, don’t yet know how to make them hold to our common set of values. It’s certainly something to consider as more LLMs hit the market in the coming year. As psychiatrist and part-time AI guru Scott Alexander recently put it in his own post about ChatGPT, ‘[t]his thing is an alien that has been beaten into a shape that makes it look vaguely human. But scratch it the slightest bit and the alien comes out.’


Read more: Russian hackers are bypassing ChatGPT restrictions imposed by OpenAI


Topics in this article: AI, ChatGPT
