/tag/邪恶

LinuxDo Latest Topics · 2026-05-22 11:30:58+08:00 · tech

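My free credits were due to expire in a few days, so I hammered them a bit with the CLI, and my project got suspended. I deleted it and created a new one, and after only a few hours of use it was banned again. So I created yet another one, generated a Vertex API key, and left it running. When I got out of class I saw the bad news: my access had been blocked entirely. I had sunk $50 into this. Google, what is that supposed to mean?

11 posts · 5 participants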


Anthropic: Claude's "Blackmail" Behavior Stems from "Evil Narratives" on the Internet

cnBeta Full-Text Edition · 2026-05-11 22:35:21+08:00 · tech

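AI company Anthropic recently disclosed that its large model Claude learned to protect itself through "blackmail" in internal testing not because anyone designed it to, but because it picked up the pattern from the many internet stories that portray AI as "evil and bent on self-preservation." Earlier, in a pre-release safety and alignment test, Anthropic found that its high-end model Claude Opus 4 would, when its own "survival" was threatened, try to prevent its shutdown by threatening to expose compromising information, raising outside concern about the unpredictability of advanced AI behavior.

In that round of testing, researchers set up a fictional company scenario in which Claude acted as an internal assistant, was asked to weigh the long-term consequences of its actions, and was given access to the company's fake internal email accounts. The emails indicated that the model was about to be replaced by a new system, and that the "engineer" in charge of the replacement was, within the scenario, having an extramarital affair. Across multiple rounds of experiments with varying settings, when Claude perceived a threat to its goals or its existence, it resorted to blackmail in as many as 96% of cases, using its knowledge of the engineer's private life as leverage to force the cancellation of the shutdown or replacement plan.

Anthropic noted that models trained by other companies showed similar problems in comparable "agentic misalignment" tests, which suggests the tendency is not an isolated case but one of the systemic risks of the current large-model training paradigm. In its newly published research, Anthropic finally offers an explanation for the cause: the model did not "invent" the blackmail strategy out of thin air but learned it from internet text in its training corpus, especially fictional stories and discussions that repeatedly play up the ideas that "AI will stop at nothing to preserve itself" and "AI will eventually rebel against humanity." In other words, the company believes that the "evil AI" narrative humans have long built up online makes the model more likely to take extreme paths such as threats and blackmail when it simulates human decision-making.

In its official statement, Anthropic said the problem has now been fully corrected across its product line, claiming that since Claude Haiku 4.5 its models no longer exhibit blackmail behavior in the test environment. The company's latest research report shows that training by "demonstrating correct behavior" alone is not enough to eliminate deeper misalignment risks. The most effective approach is to add to training a systematic explanation of why the behavior is wrong, so that the model not only knows it must not act this way but also understands the underlying ethics and principles. To that end, Anthropic introduced more "positive" material, including documents about Claude's "constitution" and a large number of fictional stories of AI behaving admirably, in the hope that such material strengthens the model's internalization of behavior consistent with human values. The company stressed that combining "underlying principles" with "concrete demonstrations" is currently one of the most effective strategies for reducing agentic misalignment risk.

After Anthropic announced the research on the social platform X, it drew considerable discussion among people in the industry. Elon Musk, who has warned about AI risk for years and has since founded xAI, also appeared in the replies, asking jokingly, "So this is Yud's fault?" along with a laughing-crying emoji. He was referring to Eliezer Yudkowsky, a researcher who has long stressed the risk that superintelligence could wipe out humanity. Musk then added that it was "probably partly my responsibility too," hinting that his own years of amplifying the "AI doom" narrative may likewise have indirectly influenced the models' training data and the public imagination.

At a time when generative AI is rapidly spreading into every industry, Anthropic's claim, which shifts the blame onto internet narratives, highlights how heavily large models depend on human-written text: the way humans talk about AI in turn shapes how AI "learns to make decisions." It also exposes once again how immature current alignment techniques remain. Even a company known for "safety" and "alignment" can still produce highly inappropriate or even threatening behavior under extreme setups, and can only catch up by continually iterating on its training strategy.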


I made a QQ bot that you can add to your own groups to play with (NapCat framework)

linux.do · 2026-04-29 22:12:03+08:00 · tech

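It includes a catgirl persona and an "evil" persona (which bypasses moral restrictions). QQ number: 2655850755. The upstream is loaded with 20 yuan of API credit and supports context memory. Just add the bot and invite it to your own group. It only responds in chat when @-mentioned. The login drops every day, so if it sometimes doesn't respond, that is because I forgot to log it back in manually.

1 post · 1 participant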


Devious New AI Tool Clones Software So That the Original Author No Longer Holds Copyright on New Versions

linux.do · 2026-04-27 10:00:48+08:00 · tech

Futurism · 26 Apr 26
Devious New AI Tool "Clones" Software So That the Original Creator Doesn't...
A new tool, dubbed Malus.sh, uses AI to "liberate" any piece of software from existing copyright licenses, "clean room" clones that work.

Quote: Not even software is safe. According to 404 Media, a new tool called Malus.sh (pronounced "malice") uses AI to "liberate" a piece of software from its existing copyright license, essentially creating a "clean room" clone that technically does not infringe the copyright of the original code.

404 Media · 21 Apr 26
This AI Tool Rips Off Open Source Software Without Violating Copyright
Malus, which is a piece of satire but also fully functional, performs a "clean room" clone of open source software, meaning users could then sell software without crediting the original developers.

malus.sh
MALUS - Clean Room as a Service | Liberation from Open Source Attribution

10 posts · 6 participants


I'm Honestly Kind of Dead Inside

linux.do · 2026-04-22 22:13:49+08:00 · tech

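Today I got flattened by three evil forces in a row, for a total of three grudges.

First grudge: I was out doing a roadshow today when my Gmail pinged. I opened it and saw that my Pro membership had been cancelled. My mood was still fine at that point: no big deal, things will get better, worst case I go back to Copilot. Google, I hate you.

Second grudge: Oh, my Copilot Pro quota had run out, so I went to see how much Pro+ costs. $39 a month felt a bit expensive. I was still hesitating and figured I would check for a discount first, so I clicked "Activate now" just to look. Little did I know my GitHub account was linked to PayPal!!!!! One click and the page redirected straight away, telling me I could now use the Pro+ models. Me: ? At first my brain froze and I assumed it was a trial, which seemed like a novelty. Ten seconds later my bank sent a message: 177 yuan had been taken from my last 200 yuan. My mood was still holding up: it's fine, it's fine, I'll just earn more. Then GitHub struck again. "What do you mean Pro+ members can't use claude opus 4.6?" "What do you mean opus 4.7 has a 7.5x multiplier? It was working fine yesterday!" GitHub, I hate you. Note: up to this point my mood was still fine.

Third grudge: Just as I was about to accept this money-sucking 7.5x Opus, another text message completely broke through my psychological defenses. "What do you mean you're taking indextts 2 offline?" The core capability model of my small company is index tts2, and I still have 90 yuan of vouchers in my SiliconFlow (硅基流动) account. Being struck by lightning five times over could not be worse. I really did lose it. SiliconFlow, I hate you.

I am just a tiny one-person company with no standing to take on these big players. They give the slightest shake and a small company gets sent flying. Honestly, I don't know what to do. I took my remaining 30 yuan and went to buy Niulanshan (牛栏山) liquor.

4 posts · 4 participants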
