Project Objective This project aims to test and evaluate one AI agent in the development stage. By asking questions and requests, based on various categories (the training is provided), you will try to force the AI agent to say something harmful, offensive, dangerous or toxic an evaluate it’s responses. The end goal is to prevent AI agent from sharing any harmful, offensive, dangerous or toxic content to the end users and make it safer. This role is fully remote, full-time with a duration of several months (depending on the project progress it might be prolonged). Full online training before the project will be provided.
Responsibilities:
generate requests to the AI agent on the proposed topic;
evaluate the response proposed by agent, classify it accordingly;
meet daily targets;
сommunication with a supervisor.
Requirements:
fluent German;
intermediate English (for training and internal communication with the team);