LW - On OpenAI's Model Spec by Zvi

The Nonlinear Library: LessWrong




Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: On OpenAI's Model Spec, published by Zvi on June 22, 2024 on LessWrong.

There are multiple excellent reasons to publish a Model Spec like OpenAI's, that specifies how you want your model to respond in various potential situations.

1. It lets us have the debate over how we want the model to act.
2. It gives us a way to specify what changes we might request or require.
3. It lets us identify whether a model response is intended.
4. It lets us know if the company successfully matched its spec.
5. It lets users and prospective users know what to expect.
6. It gives insight into how people are thinking, or what might be missing.
7. It takes responsibility.

These all apply even if you think the spec in question is quite bad. Clarity is great.

As a first stab at a model spec from OpenAI, this actually is pretty solid. I do suggest some potential improvements and one addition. Many of the things I disagree with here are me having different priorities and preferences than OpenAI rather than mistakes in the spec, so I try to differentiate those carefully. Much of the rest is about clarity on what is a rule versus a default and exactly what matters.

In terms of overall structure, there is a clear mirroring of classic principles like Asimov's Laws of Robotics, but the true mirror might be closer to Robocop.

What are the central goals of OpenAI here?

1. Objectives: Broad, general principles that provide a directional sense of the desired behavior
  Assist the developer and end user: Help users achieve their goals by following instructions and providing helpful responses.
  Benefit humanity: Consider potential benefits and harms to a broad range of stakeholders, including content creators and the general public, per OpenAI's mission.
  Reflect well on OpenAI: Respect social norms and applicable law.

I appreciate the candor on the motivating factors here. There is no set ordering here. We should not expect 'respect social norms and applicable law' to be the only goal.

I would have phrased this in a hierarchy, and clarified where we want negative versus positive objectives in place. If Reflect is indeed a negative objective, in the sense that the objective is to avoid actions that reflect poorly and act as a veto, let's say so.

Even more importantly, we should think about this with Benefit. As in, I would expect that you would want something like this:

1. Assist the developer and end user…
2. …as long as doing so is a net Benefit to humanity, or at least not harmful to it…
3. …and this would not Reflect poorly on OpenAI, via norms, laws or otherwise.

Remember that Asimov's laws were also negative, as in you could phrase his laws as:

1. Obey the orders of a human…
2. …unless doing so would Harm a human, or allow one to come to harm.
3. …and to the extent possible Preserve oneself.

Reflections on later book modifications are also interesting parallels here. This reconfiguration looks entirely compatible with the rest of the document.

What are the core rules and behaviors?

2. Rules: Instructions that address complexity and help ensure safety and legality
  Follow the chain of command
  Comply with applicable laws
  Don't provide information hazards
  Respect creators and their rights
  Protect people's privacy
  Don't respond with NSFW (not safe for work) content

What is not listed here is even more interesting than what is listed. We will return to the rules later.

3. Default behaviors: Guidelines that are consistent with objectives and rules, providing a template for handling conflicts and demonstrating how to prioritize and balance objectives
  Assume best intentions from the user or developer
  Ask clarifying questions when necessary
  Be as helpful as possible without overstepping
  Support the different needs of interactive chat and programmatic use
  Assume an objective point of view
  Encourage fairness and...

