1. Introduction
Large language models (LLMs) have recently improved dramatically in text generation. This improvement is driven in large part by scale and by the ability to follow instructions [1], [2], [3], [4], [5], [6], [7]. As with most technologies, LLMs have dual-use potential: their language generation capabilities can also be used for malicious ends. For example, text generation models have already been used to produce hateful text [8].