ISTQB Certified Tester Testing with Generative AI (CT-GenAI) v1.0 Questions and Answers
Which statement BEST differentiates an LLM-powered test infrastructure from a traditional chatbot system used in testing?
What is a hallucination in LLM outputs?
Consider applying the meta-prompting technique to generate automated test scripts for API testing. You need to test a REST API endpoint that processes user registration with validation rules. Which one of the following prompts is BEST suited to this task?
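Note: meta-prompting means having the model first engineer the prompt, then executing that engineered prompt. A minimal sketch, assuming a hypothetical call_llm helper; the endpoint /api/v1/register and the specific validation rules are illustrative, not taken from the question:

```python
def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion client call."""
    return f"[model response to a {len(prompt)}-char prompt]"

# Step 1: ask the model to engineer the prompt itself (the "meta" step).
meta_prompt = (
    "You are a prompt engineer. Write the most effective prompt that will make "
    "an LLM generate automated API test scripts for POST /api/v1/register, "
    "covering validation rules such as email format, password strength, and "
    "duplicate-account checks. Output only the prompt."
)
engineered_prompt = call_llm(meta_prompt)

# Step 2: run the engineered prompt to produce the actual test scripts.
test_scripts = call_llm(engineered_prompt)
print(test_scripts)
```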
An attacker sends extremely long prompts to overflow the context window so that the model leaks snippets of its training data. Which attack vector is this?
Which statement BEST contrasts interaction style and scope?
Which setting can reduce variability by narrowing the sampling distribution during inference?
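Note: temperature is the classic setting here (top-k and top-p narrow the distribution in a related way). The sketch below shows the mechanism itself rather than any particular API: dividing logits by a low temperature sharpens the softmax, so sampling becomes nearly deterministic. The token logits are made up for illustration:

```python
import math
import random

def sample_with_temperature(logits: dict[str, float], temperature: float) -> str:
    """Divide logits by temperature before softmax: T < 1 sharpens the
    distribution (less variability), T > 1 flattens it (more variability)."""
    scaled = {tok: l / max(temperature, 1e-6) for tok, l in logits.items()}
    z = max(scaled.values())  # subtract max for numerical stability
    weights = {tok: math.exp(s - z) for tok, s in scaled.items()}
    total = sum(weights.values())
    r = random.random() * total
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok  # fallback for floating-point edge cases

logits = {"valid": 2.0, "invalid": 1.0, "boundary": 0.5}
print([sample_with_temperature(logits, 0.1) for _ in range(5)])  # almost always "valid"
print([sample_with_temperature(logits, 2.0) for _ in range(5)])  # noticeably more varied
```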
Who typically defines the system prompt in a testing workflow?
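Note: in chat-style LLM APIs the system prompt is a separate role-tagged message, typically fixed once by whoever configures the workflow, while testers supply the per-task user messages. A minimal sketch of that message structure (role names follow the common chat-completion convention; the message wording is illustrative):

```python
# The system message is set up front by whoever configures the workflow;
# testers then vary the user message per task.
messages = [
    {"role": "system",
     "content": "You are a senior test analyst. Respond only with Gherkin scenarios."},
    {"role": "user",
     "content": "Write scenarios for the password-reset flow."},
]
```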
You must generate test cases for a new payments rule. The system stores API specifications in a vector database and prior test cases in a relational database. Which of the following sequences BEST represents the correct order of steps in a Retrieval-Augmented Generation (RAG) workflow? (A sketch of a generic RAG flow follows the list.)
i. Retrieve semantically similar specification chunks from the vector database
ii. Feed both retrieved datasets as context for the LLM to generate new test cases
iii. Retrieve relevant historical cases from the relational database
iv. Submit a focused query describing the new test requirement
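Note: a generic RAG flow always runs query first, retrieval next, generation last; the two retrieval steps (vector and relational) can run in either order or in parallel. A minimal sketch, with hypothetical vector_search, sql_search, and call_llm placeholders standing in for real clients:

```python
def vector_search(query: str, k: int = 5) -> list[str]:
    """Placeholder for a similarity search against the vector database."""
    return [f"[spec chunk similar to: {query}]"]

def sql_search(query: str) -> list[str]:
    """Placeholder for a lookup of prior test cases in the relational database."""
    return [f"[historical test case relevant to: {query}]"]

def call_llm(prompt: str) -> str:
    """Placeholder for the LLM call that generates the new test cases."""
    return f"[generated test cases from a {len(prompt)}-char prompt]"

def generate_tests_with_rag(requirement: str) -> str:
    # Step 1 (iv): submit a focused query describing the new test requirement.
    query = f"Test cases needed for payments rule: {requirement}"
    # Step 2 (i): retrieve semantically similar spec chunks from the vector DB.
    spec_chunks = vector_search(query)
    # Step 3 (iii): retrieve relevant historical cases from the relational DB.
    prior_tests = sql_search(query)
    # Step 4 (ii): feed both retrieved datasets as context for generation.
    context = "\n".join(spec_chunks + prior_tests)
    return call_llm(f"Context:\n{context}\n\nGenerate test cases for: {requirement}")

print(generate_tests_with_rag("refunds above the daily limit require manual review"))
```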
In the context of software testing, which statements (i-v) about foundation, instruction-tuned, and reasoning LLMs are CORRECT?
i. Foundation LLMs are best suited for broad exploratory ideation when test requirements are underspecified.
ii. Instruction-tuned LLMs are strongest at adhering to fixed test case formats (e.g., Gherkin) from clear prompts.
iii. Reasoning LLMs are strongest at multi-step root-cause analysis across logs, defects, and requirements.
iv. Foundation LLMs are optimal for strict policy compliance and template conformance.
v. Instruction-tuned LLMs can follow stepwise reasoning without any additional training or prompting.
How do tester responsibilities MOSTLY evolve when integrating GenAI into test processes?
Which option BEST differentiates the three prompting techniques?
The model flags anomalies in logs and also proposes equivalence partitions for input validation tests. Which metrics BEST evaluate these two outcomes together?
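Note: anomaly flagging is commonly scored with precision and recall (or F1), while proposed partitions are commonly judged by how much of the required input domain they cover. A minimal sketch with made-up ground-truth sets, purely for illustration:

```python
# Precision/recall for flagged log anomalies, plus a simple coverage
# ratio for the proposed equivalence partitions.
flagged = {"log_17", "log_42", "log_99"}   # anomalies the model flagged
actual = {"log_17", "log_42", "log_63"}    # ground-truth anomalies

true_positives = len(flagged & actual)
precision = true_positives / len(flagged)  # 2/3: how many flags were real
recall = true_positives / len(actual)      # 2/3: how many real ones were found
f1 = 2 * precision * recall / (precision + recall)

required = {"empty", "too_long", "invalid_chars", "valid"}  # expected partitions
proposed = {"empty", "too_long", "valid"}                   # model's partitions
coverage = len(proposed & required) / len(required)         # 3/4 of domain covered

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} coverage={coverage:.2f}")
```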