Contractors who evaluate the accuracy of AI responses help improve generative AI systems, such as Google's Gemini. In the past, contractors could skip prompts that were too complex or outside their area of expertise, such as prompts about coding or healthcare. But under new guidelines, they must rate all prompts, even those they lack the knowledge to evaluate.
This has raised concerns about the accuracy of Gemini's answers on difficult subjects. Contractors worry that the change could lead to incorrect or misleading responses. Google has yet to respond to these concerns.
Google’s Gemini is forcing contractors to rate AI responses outside their expertise
Generative AI may look like magic, but behind these systems are armies of workers at companies such as Google, OpenAI, and others, known as "prompt engineers" and analysts, who rate the accuracy of chatbot outputs to improve their AI.
But a new internal policy that Google passed down to contractors working on Gemini, seen by TechCrunch, has raised concerns that Gemini could be more likely to give inaccurate information to the public on highly sensitive topics like healthcare.
To improve Gemini, contractors working with GlobalLogic, an outsourcing firm owned by Hitachi, are routinely asked to evaluate AI-generated responses according to criteria like "truthfulness."
Until recently, these contractors could "skip" certain prompts, and thus opt out of evaluating the AI-written answers to them, if the prompt was well outside their area of expertise. For example, a contractor with no scientific background could skip a prompt asking a niche question about cardiology.
But last week, GlobalLogic announced that Google had changed its policy: contractors are no longer allowed to skip such prompts, regardless of their own expertise. The previous guidelines read: "If you do not have essential knowledge (e.g., coding, math) to rate this prompt, please avoid this task."
"You should not ignore prompts that require specialized domain knowledge," the guidelines now read. Instead, contractors are instructed to note that they lack domain knowledge and to "rate the parts of the prompt you understand."
Because contractors are sometimes tasked with evaluating highly technical AI responses about issues like serious diseases that they have no background in, the change has directly raised concerns about Gemini's accuracy on such topics.
Under the updated guidelines, contractors can now skip prompts in only two cases: if the prompt contains harmful content that requires special consent forms to review, or if they are "entirely missing information" like the full prompt or response.