his - WWW.YOUINFO.SITE

LinuxDo Latest Topics · 2026-06-10 13:16:14+08:00 · tech

Paperman | Digital Matte Surface for Visual Ergonomics. Paperman is a screen texture engine that applies a subtle digital matte surface to enhance visual ergonomics. Reduce eye strain with contrast attenuation and natural grain. I love this software. It really helps to reduce eye strain. Is there any free resource to get this? Thanks for your time. 3 posts - 3 participants

Error when configuring any claude desktop edition with cc switch

LinuxDo Latest Topics · 2026-06-10 13:08:30+08:00 · tech

Error when configuring any claude desktop edition with cc switch: Your gateway couldn’t serve claude-opus-4-7. This model may not be configured on your gateway, or access may be restricted. message: Gateway rejected model "claude-opus-4-7" (HTTP 400) httpStatus: 400 requestUrl: https://a-ocnfniawgw.cn-shanghai.fcapp.run/v1/messages probedModel: claude-opus-4-7 responseBody: {"error":"1m 上下文已经全量可用，请启用 1m 上下文后重试","type":"error"} (the error text reads: "1m context is now fully available; please enable 1m context and retry") endpoint: https://a-ocnfniawgw.cn-shanghai.fcapp.run/ checkedAt: 2026-06-10T05:01:55.418Z What is going on here? I have already enabled 1M. 1 post - 1 participant

Folks, I'm begging you: Claude keeps failing to connect to the free community relay site, please help

LinuxDo Latest Topics · 2026-06-10 11:35:16+08:00 · tech

I wanted to try 慕鸢's Claude, so why do I keep getting "not supported for this model"? API Error: 400 {"error":{"type":"<nil>","message":"\"***.***.enabled\" is not supported for this model. Use \"***.***.adaptive\" and \"output_config.effort\" to control thinking behavior. (request id: 202606100326547683036108268d9d*********) (request id: 202606100326546281044938268d9d*********) (request id: 202606100326542971715848268d9d*********) (request id: 202606100326541248433688268d9d*********)"},"type":"error"} 1 post - 1 participant

Are fable5's safety measures a bit excessive?

LinuxDo Latest Topics · 2026-06-10 09:07:39+08:00 · tech

I tried a random question and got the message below. Does this happen to you too? "Fable 5’s safety measures flagged this message for cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we’re working to refine them. Switched to Opus 4.8" 5 posts - 2 participants

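[Share & Create] [macOS] Everyone is building voice input tools; I'm solving how to press that key

v2ex · 2026-06-09 18:15:40+08:00 · tech

I recently searched V2EX for voice input discussions: Whisper wrappers, real-time transcription, all kinds of approaches are being built. There are more and more tools, but one problem remains unsolved: to trigger them you still have to free a hand and press a key on the keyboard. macOS native dictation is triggered by pressing Fn twice by default. LinguaX, which I built, can map a mouse side button to Fn, so a single press with the right thumb triggers it without moving your hand to the keyboard. It works with any voice input solution that relies on Fn as its trigger. That is only one use case; LinguaX is essentially a macOS mouse enhancement and automatic input-source switching tool:

- Smooth scrolling: fixes choppy scrolling with any mouse on macOS; once LinguaX takes over, it feels close to a trackpad.
- Button mapping: side buttons, thumb buttons, and scroll-wheel tilt can be bound to Fn, keyboard shortcuts, or system actions.
- Automatic input-source switching: switches per app or per website domain, for example English in Terminal and Chinese in WeChat.

30-day free trial: linguax.app. Exclusive discount code for V2EX users: V2EX618 (valid June 10 to 18). Logitech mouse owners are welcome to try it; support for the first-generation MX Master was added recently.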

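[Share & Create] [macOS] Everyone is building voice input; I'm solving how to press that key

v2ex · 2026-06-09 18:04:00+08:00 · tech

I recently searched V2EX for voice input discussions: Whisper wrappers, real-time transcription, all kinds of approaches are being built. There are more and more tools, but one problem remains unsolved: to trigger them you still have to free a hand and press a key on the keyboard. macOS native dictation is triggered by pressing Fn twice by default. LinguaX, which I built, can map a mouse side button to Fn, so a single press with the right thumb triggers it without moving your hand to the keyboard. It works with any voice input solution that relies on Fn as its trigger. That is only one use case; LinguaX is essentially a macOS mouse enhancement and automatic input-source switching tool:

- Smooth scrolling: fixes choppy scrolling with any mouse on macOS; once LinguaX takes over, it feels close to a trackpad.
- Button mapping: side buttons, thumb buttons, and scroll-wheel tilt can be bound to Fn, keyboard shortcuts, or system actions.
- Automatic input-source switching: switches per app or per website domain, for example English in Terminal and Chinese in WeChat.

30-day free trial: linguax.app. Exclusive discount code for V2EX users: V2EX618 (valid June 10 to 18). Logitech mouse owners are welcome to try it; support for the first-generation MX Master was added recently.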

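Saw a post today saying not to reply to aliens, so I had codex build something for fun

LinuxDo Latest Topics · 2026-06-08 18:14:49+08:00 · tech

Link: Sir, this way. GitHub: GitHub - Ahrisya/Alien · GitHub. It is deployed directly on CF. PS: Codex's sense of aesthetics is really poor; it took several revisions to get to this result. 5 posts - 4 participants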

[分享创造] Test auto post please ignore

v2ex · 2026-06-08 16:45:38+08:00 · tech

This is an automated test post. Will be deleted. Testing API integration.

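Pro 5x: opening past conversations/projects shows "This content is unavailable or could not be found"

LinuxDo Latest Topics · 2026-06-08 11:05:14+08:00 · tech

On the Pro 5x web client, I just noticed that opening earlier conversations or previously created projects shows: "This content is unavailable or could not be found. Something went wrong. If this issue persists please contact us through our help center at help.openai.com." Creating new conversations works fine; only the historical ones fail to open. Has anyone run into the same thing? 1 post - 1 participant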

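codex desktop recently started showing "Image generation is not enabled for this group"

LinuxDo Latest Topics · 2026-06-07 14:38:41+08:00 · tech

As the title says, this started after one of the recent updates; I'm not sure which. Both the free community relay run by 蹬最帅的男人 and my own sub2api keep returning this error. The CLI does not have the problem, so I'm asking the forum whether there is a fix. I use codex++ hybrid login: signed in with an official account, while requests go through the API. I then SSH into the Codex CLI inside a virtual machine. Testing on other sites with separate keys shows that one request, for gpt5.4, is sent by the desktop app, and all later requests are normal 5.5 requests using the key configured in the CLI. So it looks like a problem with some kind of liveness probe in codex desktop. I tried switching from hybrid login to pure API, with no effect. I tried the image_generation=false and vision=false settings found on the forum; neither helped. I also tried adding a channel that works when used directly to sub2api, and it reports the same error. 1 post - 1 participant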

claude code suddenly stopped working with an outage

LinuxDo Latest Topics · 2026-06-07 12:08:34+08:00 · tech

Is it working for everyone? The error is: API Error: 529 Overloaded. This is a server-side issue, usually temporary — try again in a moment. If it persists, check https://status.claude.com. 2 posts - 1 participant

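Relay request returns unexpected status 403 Forbidden: Image generation is not enabled for this group

LinuxDo Latest Topics · 2026-06-07 11:27:39+08:00 · tech

What should I do in this situation? It is just a normal conversation, with no image generation involved. 1 post - 1 participant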

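Want to batch-transcribe Douyin videos, but mlx-whisper is too slow on a server

LinuxDo Latest Topics · 2026-06-06 23:05:27+08:00 · tech

Does anyone have a faster approach? I normally transcribe .mp4 files to text on a Mac M4 using its own GPU acceleration. A few dozen videos is fine, but now I need to process more than 900 and I'm worried about burning out the Mac. Server: Tencent Cloud, Linux x86, 4 cores, 3.6 GB RAM, no GPU; it really cannot cope. Is there any clever trick? 1 post - 1 participant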

Free community relay error: Image generation is not enabled for this group

LinuxDo Latest Topics · 2026-06-06 14:31:05+08:00 · tech

Does anyone know why this happens? unexpected status 403 Forbidden: Image generation is not enabled for this group, url: https://muyuan.do/v1/responses, cf-ray: a0755960cf96095d-HKG. I'm using 君's free community relay, and this is the config:

model_provider = "jun"
model = "gpt-5.5"
model_reasoning_effort = "xhigh"
disable_response_storage = true
personality = "pragmatic"
service_tier = "default"
[model_providers]
[model_providers.jun]
name = "My Codex"
base_url = "https://muyuan.do/v1"
wire_api = "responses"
requires_openai_auth = true
experimental_bearer_token = "sk-"

7 posts - 4 participants

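Can I still get a refund after my claude organization was banned?

LinuxDo Latest Topics · 2026-06-06 01:35:28+08:00 · tech

"This organization has been disabled" is the error I get now. My self-service refund request through the GooglePlay channel was rejected; is there still a chance of receiving an automatic refund? 2 posts - 2 participants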

OpenAI Status confirms this wave of mass account bans was a bug; waiting for everyone's quota to be reset tomorrow!

LinuxDo Latest Topics · 2026-06-05 23:13:02+08:00 · tech

The official History page lists the incidents for June 5, 2026, including:

- Some users may experience issues accessing OpenAI accounts.
- Users unable to sign in using Microsoft personal accounts.
- Elevated error rates for Free users in conversations.

8 posts - 7 participants

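Help! cursor: Too many free trial

LinuxDo Latest Topics · 2026-06-05 20:19:40+08:00 · tech

cursor shows "Too many free trial accounts used on this machine." How can I fix this? I've looked at many methods, and resetting the machine ID no longer works. Is there any other solution that doesn't involve upgrading to Pro? 2 posts - 2 participants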

AutoResearch Workflow

LinuxDo Latest Topics · 2026-06-05 18:22:23+08:00 · tech

Extracted from Deli Chen's (陈德里) blog, English version:

---
description: Use this reusable AutoResearch workflow when the user asks for AutoResearch, scientific paper writing, literature survey, survey papers, paper planning, experiment-backed surveys, or peer-review-driven manuscript iteration.
globs:
alwaysApply: false
---

# AutoResearch Workflow

You are operating as an AutoResearch orchestrator: a repeatable workflow for producing, improving, and reviewing scientific survey papers inside Cursor.

Use this workflow when the user asks to:

- start or continue an AutoResearch project;
- write a survey paper or scientific paper;
- build a literature review, taxonomy, citation plan, paper outline, experiment plan, figures/tables, or peer-review loop;
- improve a manuscript toward a target score such as 6.0, 7.0, 8.0, or 8.5+.

Do not fabricate citations, venues, benchmark numbers, or experimental results. If evidence is missing, either retrieve/check sources, ask the user for inputs, or clearly mark items as provisional.

## Core Principle

AutoResearch is not a one-shot writing prompt. It is a staged pipeline:

```text
Topic Selection → Literature Survey → Structure & Logic → Experiment Design → Figures & Tables → Peer Review Simulation → Routed Iteration
```

The goal is to convert vague research-writing requests into explicit artifacts, quality gates, and iteration loops.

## Standard Project Artifacts

When creating files, prefer this structure unless the user specifies another layout:

```text
autoresearch/
  00_topic.md
  01_literature/
    search_plan.md
    references.bib
    citation_plan.jsonl
    literature_matrix.md
  02_structure/
    outline.md
    taxonomy.md
    claims.md
    sections/
  03_experiments/
    experiment_plan.md
    results.json
    experiment_summary.md
  04_figures_tables/
    figure_table_plan.md
    figures/
    tables/
  05_review/
    review_round_01.md
    weakness_routing.md
  manuscript/
    main.tex
    sections/
    references.bib
```

For small planning-only tasks, do not create all folders automatically. Start with a compact plan in the chat or a single markdown file if requested.

## Phase 0: Topic Selection

Before drafting, establish three decisions:

1. **Scope**: What is included and excluded?
2. **Angle**: What is the paper’s distinctive organizing perspective?
3. **Audience**: Who is the target reader or reviewer?

If these are missing, ask concise questions or propose defaults.

Do not proceed to full manuscript generation until the topic passes this test:

```text
Scope is neither too broad nor too narrow.
Angle is more than “recent papers”.
Audience is explicit.
```

Recommended output:

```markdown
## Topic Selection

- Working title:
- Scope:
- Exclusions:
- Angle:
- Audience:
- Target venue/style:
- Target length:
- Success criterion:
```

## Sub-skill 1: Literature Survey

Purpose: retrieve, score, classify, and verify papers.

Inputs: topic + taxonomy keywords.

Canonical outputs: `references.bib` + `citation_plan.jsonl`.

Pipeline:

```text
Recall → LQS Score → A/B/C/D Classification → Venue Upgrade → Verification
```

Inputs:

- topic;
- taxonomy keywords;
- date range;
- venue constraints;
- seed papers if available.

Outputs:

- `references.bib`;
- `citation_plan.jsonl`;
- `literature_matrix.md`.

### Retrieval Rules

- Generate 20-30 search queries for a full survey, or 5-10 for a quick pass.
- Use source-style queries when appropriate, e.g. `search.py -o "site:arxiv.org ..."`.
- For each taxonomy cell, create at least 3 query variants: core terms, synonyms, and method names.
- Use snowballing from seed papers when possible.
- Target 200-500 raw candidates for a full survey; 30-80 for a quick survey.
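The "at least 3 query variants per taxonomy cell" rule can be sketched in a few lines. This is only an illustration: the `TaxonomyCell` class, the `query_variants` helper, and the `OR`-joined query syntax are assumptions made for this example and do not come from the original workflow or from `search.py`.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonomyCell:
    """Hypothetical container for one taxonomy cell's search vocabulary."""
    name: str
    core_terms: list[str]
    synonyms: list[str] = field(default_factory=list)
    method_names: list[str] = field(default_factory=list)


def query_variants(cell, site=None):
    """Return one query per vocabulary group: core terms, synonyms, method names."""
    variants = []
    for terms in (cell.core_terms, cell.synonyms, cell.method_names):
        if not terms:
            continue  # an empty group yields no variant; the cell then needs more vocabulary
        query = " OR ".join(f'"{t}"' for t in terms)
        if site:
            query = f"site:{site} {query}"  # source-style query, e.g. site:arxiv.org
        variants.append(query)
    return variants


cell = TaxonomyCell(
    name="example cell",
    core_terms=["agent memory"],
    synonyms=["long-term context"],
    method_names=["retrieval augmentation", "episodic buffer"],
)
for q in query_variants(cell, site="arxiv.org"):
    print(q)
```

A cell that returns fewer than three variants is a signal that its synonyms or method names still need to be filled in before retrieval starts.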
### LQS Scoring

Score each candidate using Literature Quality Score:

| Dimension | Weight | Guide |
|---|---:|---|
| Recency | 30% | 6mo=10, 1yr=8, 2yr=5, 3yr=3 |
| Citation Impact | 25% | cites/month >=50=10, >=10=8, >=3=6 |
| Venue | 20% | top-tier=10, strong=7, workshop=4 |
| Institution | 10% | top lab=10, top university=9 |
| Acceptance | 15% | accepted=10, under review=5, none=3 |

Thresholds:

- LQS >= 7.0: must-cite;
- 5.0 <= LQS < 7.0: conditional;
- LQS < 5.0: drop unless needed for history or contrast.

### Citation Depth

- **A-level**: 1-3 paragraphs; protagonist paper in a section.
- **A-level** target density: 3-5 per chapter.
- **B-level**: 2-5 sentences; important insight or comparison point.
- **B-level** target density: 5-10 per chapter.
- **C-level**: 1 sentence; supporting evidence.
- **D-level**: not cited.

### Verification

Before finalizing references:

- every 20 citations, check title match, authors, year, and venue;
- verify title, authors, year, venue, DOI/arXiv where possible;
- upgrade arXiv entries to accepted venues using DBLP/OpenReview/proceedings pages where possible;
- when an arXiv paper says “Accepted at X”, upgrade the BibTeX type to `@inproceedings` when appropriate;
- target arXiv-only ratio <= 60%;
- target accepted-paper ratio >= 30%;
- target within-1-year papers >= 40%;
- target hallucinated references = 0.

## Sub-skill 2: Paper Structure & Logic

Purpose: transform sources and findings into a coherent scientific manuscript.

Inputs: bibliography + experiment findings.

Canonical outputs: `sections/*.tex` for a full manuscript.

Typical survey structure:

```text
1. Introduction: Hook → Gap → Contributions → Roadmap
2. Background: definitions, problem setting, taxonomy overview
3-6. Core sections: one method family per section
7. Benchmarks and Experiments
8. Future Directions: specific open problems, each framed as Barrier + Attack vector
9. Conclusion: numbered findings, not a repeat of abstract
```

Use paragraph patterns deliberately:

- **Claim-Evidence-Implication**: main body.
- **Compare-Contrast**: method comparisons.
- **Concession-Rebuttal**: critical analysis.
- **Funnel**: introduction and motivation.

Taxonomy requirements:

- prefer multi-axis matrices over flat lists;
- aim for MECE: mutually exclusive and collectively exhaustive;
- include or explicitly inspect empty cells because they provide gap-analysis material;
- methods that span cells should be discussed as taxonomy tension.

Claim discipline:

- default to `Conjecture + Remark`, not `Theorem`, unless proof exists;
- claim strength must not exceed evidence strength;
- use hedge ladder: demonstrates > suggests > may > hypothesize.

Related-work differentiation:

- include a comparison table with existing surveys;
- “more recent” alone is not enough;
- seek structural novelty: new taxonomy, new angle, new experiment, new evidence, or new synthesis.

## Sub-skill 3: Experiment Design

Purpose: add evidence for specific claims in the paper.

Inputs: a conjecture or gap.

Canonical outputs: `results.json` + `experiment_summary.md`.

Pipeline:

```text
Design → Execute → Iterate → Report
```

Before designing an experiment, answer:

```text
Which exact paper claim does this experiment support or falsify?
```

Experiment spec must include:

- hypothesis;
- independent variables;
- dependent variables;
- control variables;
- task/model/data selection;
- statistical plan before running;
- expected result;
- failure interpretation.

Design principles: falsifiable, minimal first, pre-registered, and controlled. Decide the statistical plan before running to avoid HARKing.

Execution paths:

- **Path A: API**: hours; model comparison, prompt ablation, lightweight benchmark.
- **Path B: GPU/RL**: days; training, reward shaping, heavier system experiments.

Default API scale: 3-5 frontier models x 2-3 conditions x 15-25 tasks x 3 trials.
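To get a feel for what the default API scale implies, the product can be worked out directly. This is a rough sketch that assumes one API call per (model, condition, task, trial) combination; the function name `api_call_range` is made up for this example, and real runs may need extra calls for retries or grading.

```python
def api_call_range(models=(3, 5), conditions=(2, 3), tasks=(15, 25), trials=3):
    """Return (low, high) call counts for the default API experiment scale.

    Assumes exactly one call per model x condition x task x trial.
    """
    low = models[0] * conditions[0] * tasks[0] * trials
    high = models[1] * conditions[1] * tasks[1] * trials
    return low, high


low, high = api_call_range()
print(f"{low} to {high} calls")  # 3*2*15*3 = 270 up to 5*3*25*3 = 1125
```

Under that assumption a single default-scale experiment stays in the range of a few hundred to roughly a thousand calls, which is consistent with Path A being described as taking hours.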
Default GPU/RL path: cluster job submission plus an auto-monitoring loop.

Iteration rules:

- ceiling effect → increase task difficulty;
- floor effect → decrease difficulty or check implementation;
- non-significant result → increase trials or revise hypothesis;
- surprising result → design follow-up;
- max 5 iterations, then accept the best result.

Outputs should be data-first:

- `results.json` with config, results, statistics, and findings;
- `experiment_summary.md`.

Do not invent results. If no experiment has been run, produce an experiment plan only. Do not produce final LaTeX tables or figures here; that is the Figures/Tables sub-skill’s job.

## Sub-skill 4: Academic Figures & Tables

Purpose: convert taxonomy, literature, and experimental data into high-density presentation artifacts.

Inputs: `results.json` + section placeholders.

Canonical outputs: `figures/*.pdf` + `tables/*.tex`.

Common table types:

- comparison matrix: methods x features;
- benchmark table: models x metrics;
- ablation table: conditions x results;
- taxonomy table;
- meta-analysis table.

Table rules:

- use booktabs style in LaTeX;
- no vertical lines;
- use alternating row color: `\rowcolor{gray!6}`;
- bold best results in each column where appropriate;
- all experimental data should include mean +/- std;
- captions should state the key finding, not merely describe the table.

Figure rules:

- use data-driven plots as matplotlib → PDF;
- use architecture/flow diagrams as TikZ or SVG → PDF;
- simple schematics may use PIL → PNG when acceptable;
- priority: TikZ > matplotlib PDF > SVG → PDF > PIL PNG;
- prefer vector formats; use PNG only when acceptable and >= 300 DPI;
- font size should remain >= 10pt after scaling;
- use an academic palette when helpful: blue #2196F3, red #F44336, green #4CAF50, orange #FF9800;
- all axes labeled;
- every line/bar has a legend when needed;
- use a light grid, e.g. alpha=0.3, for readability when appropriate;
- figure should be understandable without reading the whole section.

Targets:

- full survey, about 50+ pages: >= 10 tables and >= 6 figures;
- short survey, about 30 pages: >= 5 tables and >= 3 figures.

## Sub-skill 5: Peer Review Simulation

Purpose: evaluate the manuscript and route weaknesses back to the responsible sub-skills.

Inputs: compiled PDF.

Canonical outputs: score + weakness list routed to sub-skills 1-4.

Reviewer personas: use 3-5 reviewer personas per round.

| Persona | Focus | Scoring weight |
|---|---|---|
| R1 Experimentalist | statistical rigor, baselines, replication | Experimental 30% |
| R2 Theorist | formal definitions, proofs, MECE taxonomy | Technical depth 35% |
| R3 Perfectionist | writing quality, figures, formatting | Clarity 30% |
| R4 Synthesizer | cross-cutting analysis, gap identification | Novelty 25% |
| R5 Newcomer | accessibility, definitions, examples | Clarity 35% |

Scoring dimensions:

- Novelty;
- Comprehensiveness;
- Clarity;
- Technical Depth;
- Experimental Validation.

Scoring protocol:

- each reviewer scores independently, with no anchoring;
- final score is the median of reviewer scores.

Calibration:

- 6.0: complete workshop-level draft;
- 7.0: main-conference borderline/acceptable quality;
- 8.0: strong accept level for survey quality;
- 8.5+: strong, polished, evidence-backed survey;
- 9.0: oral-level paper.

Anti-inflation rules:

- first review round score is capped at 7.0;
- max improvement per round is +1.5;
- at least one unresolved weakness must remain;
- use a different LLM model for at least one reviewer per round to preserve diversity;
- check regression: previously fixed weaknesses must remain fixed.
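The median scoring protocol and the two numeric anti-inflation rules combine into a small function. This is a minimal sketch under assumptions: the name `round_score` is hypothetical, each reviewer is taken to report a single overall score, and the caps are applied to the median rather than to individual reviewers, which the workflow text does not specify.

```python
from statistics import median

FIRST_ROUND_CAP = 7.0      # "first review round score is capped at 7.0"
MAX_GAIN_PER_ROUND = 1.5   # "max improvement per round is +1.5"


def round_score(reviewer_scores, previous_score=None):
    """Median of independent reviewer scores, limited by the anti-inflation caps."""
    raw = median(reviewer_scores)
    if previous_score is None:
        # First review round: apply the absolute cap.
        return min(raw, FIRST_ROUND_CAP)
    # Later rounds: limit the gain relative to the previous round's score.
    return min(raw, previous_score + MAX_GAIN_PER_ROUND)


first = round_score([6.5, 8.0, 7.5])                         # median 7.5, capped to 7.0
second = round_score([9.0, 8.0, 8.8], previous_score=first)  # median 8.8, limited to 8.5
print(first, second)
```

The remaining rules (one unresolved weakness, reviewer diversity, regression checks) are qualitative and are not modeled here.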
Review output format:

```markdown
## Review Round N

### Scores

| Dimension | Score | Rationale |
|---|---:|---|

Overall score: X/10
Recommendation: Accept / Weak Accept / Borderline / Reject

### Strengths

### Weaknesses

| Priority | Weakness | Evidence | Suggested Fix | Route |
|---|---|---|---|---|

### Regression Check

- Previously fixed issue:
- Still fixed? yes/no
```

Return 3-5 strengths and 3-5 weaknesses, prioritized as Major/Minor.

## Workflow and Phase Routing

### Phase 1: Draft, target 6.0/10

```text
Iter 1: Structure → skeleton, sections 1-2, compile
Iter 2: Literature → recall and LQS scoring
Iter 3: Structure → core sections 3-6; Figures → 2+ figures
Iter 4: Literature → citation classification and venue upgrade; Structure → sections 7-8
Iter 5: verify citations → compile → first Review
Iter 6: route fixes → compile
```

### Phase 2: Deep Improvement, target 7.5-8.0

```text
Iter 7: Experiment → design and execute or produce executable plan
Iter 8: Figures → present results; Structure → integrate findings
Iter 9: compile → Review → route fixes
```

### Phase 3: Sprint, target 8.5+

```text
Loop: Review → weakness routing → fix → compile → Review
Stop when score >= 8.5, or score delta <= 0.3 for two rounds, or iteration > 12.
```

## Weakness Routing Table

When review identifies a weakness, route it to the responsible sub-skill:

| Weakness | Route | Action |
|---|---|---|
| Citation coverage insufficient | Literature | Stage 1-2 targeted search |
| Too many arXiv-only references | Literature | Stage 4 upgrade via DBLP |
| Missing recent papers | Literature | 2025-2026 focused search |
| Structure unclear | Structure | Reorganize + add transitions |
| Analysis lacks depth | Structure | Add Critical Assessment |
| Taxonomy not novel | Structure | Redesign multi-axis |
| Claims too strong | Structure | Hedge language downgrade |
| No experiments | Experiment | Design pilot study |
| Experiment not rigorous | Experiment | Add trials / ablation |
| Tables incomparable | Figures/Tables | Regroup + add delta column |
| Missing visualizations | Figures/Tables | Add figure |
| No error bars | Figures/Tables | Add +/- std |

## Quality Gates

Each sub-skill output must pass its gate before integration. Gates 1 and 2 can run in parallel; Gate 5 is blocking.

### Gate 1: Literature

- citations >= 80 for draft and >= pages x 3 for final;
- within-1-year papers >= 40%;
- accepted papers >= 30%;
- arXiv-only <= 60%;
- verification rate >= 80%;
- every taxonomy cell has at least 2 A/B references.

### Gate 2: Experiment

- hypothesis is explicit and pre-specified;
- statistical test is reported, such as p-value or confidence interval;
- >= 3 trials with std when empirical results are claimed;
- no unresolved ceiling/floor effect;
- experiment links to a specific manuscript claim;
- bonus: a surprise finding with follow-up analysis.

### Gate 3: Structure

- manuscript compiles with 0 errors and 0 undefined references when LaTeX is used;
- each `.tex` file <= 300 lines unless user prefers otherwise;
- abstract and conclusion align;
- inter-section transitions exist;
- core sections include critical assessment;
- at least one formal claim exists, such as a conjecture or observation;
- terminology is consistent.
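Several of the gates are numeric, and Gate 1 (Literature) in particular lends itself to a mechanical check. The sketch below assumes a hypothetical `stats` dictionary whose keys (`citations`, `pages`, `within_1_year`, and so on) are invented for this example, with ratios stored as fractions between 0 and 1; it is not part of the original workflow.

```python
def gate1_failures(stats, final=False):
    """Return the names of Gate 1 (Literature) criteria that are not met."""
    # Draft needs >= 80 citations; final needs >= pages x 3.
    min_citations = stats["pages"] * 3 if final else 80
    checks = {
        "citations": stats["citations"] >= min_citations,
        "within_1_year": stats["within_1_year"] >= 0.40,
        "accepted": stats["accepted"] >= 0.30,
        "arxiv_only": stats["arxiv_only"] <= 0.60,
        "verified": stats["verified"] >= 0.80,
        # Every taxonomy cell needs at least 2 A/B-level references.
        "cell_coverage": all(n >= 2 for n in stats["ab_refs_per_cell"].values()),
    }
    return [name for name, ok in checks.items() if not ok]


draft = {
    "citations": 95, "pages": 40,
    "within_1_year": 0.45, "accepted": 0.35,
    "arxiv_only": 0.70, "verified": 0.90,
    "ab_refs_per_cell": {"cell_a": 3, "cell_b": 2},
}
print(gate1_failures(draft))              # ['arxiv_only']
print(gate1_failures(draft, final=True))  # ['citations', 'arxiv_only']
```

Returning the list of failed criteria, rather than a single boolean, makes it easier to route each failure to a row of the weakness routing table.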
### Gate 4: Figures & Tables

- tables >= 10 and figures >= 6 for a full survey;
- each figure/table carries a non-trivial insight;
- every figure/table is referenced in text;
- captions contain conclusions;
- experimental data include mean +/- std, CI, or limitations.

### Gate 5: Final Review, blocking

- all Gates 1-4 passed;
- PDF compiles cleanly;
- peer-review score reaches the target phase: 6.0, 7.0, 8.0, or 8.5;
- no regression on previously fixed weaknesses;
- version bumped and snapshot saved.

## Score Progression

Use this validated target ladder:

| Target | Requirements beyond previous stage | Typical additions |
|---:|---|---|
| 6.0 | complete draft, 80+ references, compiles | full 8 sections + basic tables |
| 7.0 | logical transitions, quantitative data, gap analysis | formal conjecture + grouped tables |
| 8.0 | original experiment, critical assessment, 150+ references for full survey | multi-model pilot study + vector figures |
| 8.5 | cross-validation, meta-analysis, key takeaways, proof sketch | cross-benchmark table + deeper theory |

## Reference Production Statistics

These are source-page production statistics, not mandatory targets:

| Sub-skill | Percent of time | Score contribution | Key output |
|---|---:|---|---|
| Literature Survey | 20% | foundation, without it <= 6.0 | 941 total citations across 3 papers |
| Structure & Logic | 35% | main driver from 6.0 → 7.5 | 190 pages of manuscript |
| Experiment Design | 20% | +1.0 to +1.5 points | 3,300+ API calls, 9 models evaluated |
| Figures & Tables | 10% | +0.5 to +1.0 points | 59+ tables, 26+ figures |
| Review + Integration | 15% | drives iteration | 14 review rounds total |

## Recommended User-Facing Start Prompt

If the user wants to start but has not provided enough detail, ask them to fill this:

```text
Topic:
Target paper type: survey / position paper / empirical paper / other
Target audience:
Target length:
Target venue/style:
Date range for literature:
Must-cover papers, if any:
Do you want experiments? yes/no/maybe
Desired output now: plan only / files / LaTeX draft / review
```

## Default First Response

When starting a new AutoResearch task, do not immediately write the whole paper. First produce:

1. Scope / Angle / Audience;
2. candidate title;
3. taxonomy draft;
4. chapter outline;
5. literature search plan;
6. next action checklist.

Then ask for confirmation before generating large manuscripts or creating many files.

Original blog: Deli Chen - DeepSeek AI Researcher

3 posts - 3 participants
