ISTQB Certified Tester - Testing with Generative AI (CT-GenAI) Online Practice
Last updated: March 9, 2026
You can use these online practice questions to gauge how well you know the ISTQB CT-GenAI exam material before deciding whether to register for the exam.
If you want to pass the exam and cut your preparation time by 35%, you can opt for the CT-GenAI dumps (latest real exam questions), which currently include 125 exam questions with answers.
Answer:
Explanation:
In structured prompt engineering, the Role component (also known as a Persona) is used to set the perspective, expertise, and tone of the LLM’s response. By assigning the role of a "senior test manager," the tester instructs the model to adopt the specific domain knowledge, vocabulary, and professional standards associated with that position. This technique is highly effective because LLMs are trained on vast datasets containing diverse professional documents; invoking a specific persona helps the model narrow its "latent space" to retrieve information relevant to that specific field. For instance, a senior test manager persona will prioritize risk management, resource allocation, and high-level strategy, whereas a "junior developer" persona might focus more on syntax and local unit tests. While Context (Option B) provides the background of the project and Instruction (Option A) defines the specific task to be performed, the Role serves as the foundation for how those instructions are interpreted. This ensures the generated testware aligns with the expected professional seniority and organizational maturity required for high-stakes environments like a payments platform.
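As a minimal sketch, a structured prompt with an explicit Role section might be assembled like this (the section labels and wording are illustrative, not prescribed by the syllabus):

```python
def build_prompt(role, context, instruction):
    """Assemble a structured prompt; the Role line sets the perspective
    from which the Instruction will be interpreted."""
    return (
        f"Role: You are a {role}.\n"
        f"Context: {context}\n"
        f"Instruction: {instruction}"
    )

prompt = build_prompt(
    role="senior test manager at a payments platform",
    context="We are planning regression testing for the checkout service.",
    instruction="Draft a risk-based test strategy outline.",
)
print(prompt.splitlines()[0])
# → Role: You are a senior test manager at a payments platform.
```

Swapping only the `role` argument (e.g., to "junior developer") changes how the same instruction is answered, which is exactly the effect described above.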
Answer:
Explanation:
Hallucinations (where an LLM generates factually incorrect or nonsensical information) occur primarily when the model lacks sufficient specific information and "fills in the gaps" using probabilistic patterns from its training data. The most effective mitigation strategy is "grounding," which involves providing the model with detailed, project-specific context. By including technical specifications, existing API schemas, business rules, and identified constraints within the prompt, the tester restricts the model’s operational space to the "project realities." This ensures the model does not have to guess or improvise details about the System Under Test (SUT). In contrast, randomizing prompts (Option B) or relying on generic examples (Option C) increases the likelihood of inconsistent and inaccurate outputs. Furthermore, using higher temperature settings (Option D) actually encourages creativity and randomness, which is the opposite of the precision required for testing and significantly increases the risk of hallucinations. Therefore, rich contextual grounding is the technical foundation for reliable AI-assisted test analysis.
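A grounded prompt can be sketched as follows; the API schema and business rules below are hypothetical placeholders for real project artifacts:

```python
# Illustrative grounding: embed project-specific facts in the prompt so the
# model does not have to guess details about the SUT. All data is invented.
API_SCHEMA = '{"POST /payments": {"amount": "integer (cents)", "currency": "ISO 4217"}}'
BUSINESS_RULES = [
    "Refunds must reference an existing payment ID.",
    "Amounts above 10000 EUR require manual approval.",
]

def grounded_prompt(task):
    rules = "\n".join(f"- {r}" for r in BUSINESS_RULES)
    return (
        "Use ONLY the facts below; if information is missing, say so "
        "instead of guessing.\n\n"
        f"API schema:\n{API_SCHEMA}\n\n"
        f"Business rules:\n{rules}\n\n"
        f"Task: {task}"
    )

p = grounded_prompt("Derive test conditions for the refund flow.")
```

The explicit "say so instead of guessing" instruction, combined with the schema and rules, narrows the model's operational space to the project realities described above.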
Answer:
Explanation:
LLMOps (Large Language Model Operations) is the set of practices used to manage the lifecycle of LLMs in production. When an organization integrates an AI chatbot into its test processes, the primary operational concern is maintaining data privacy and minimizing security risks, especially if using third-party APIs. Unlike traditional software, LLMs are "black boxes" that process every piece of data sent to them. A core LLMOps responsibility is ensuring that any "Prompt Data" (code, requirements, or logs) is not used by the provider to train their public models and that the communication channels are fully secured. While scalability (Option A) and latency (Option C) are important technical metrics, they are secondary to the catastrophic legal and reputational risk of a data breach. LLMOps in a testing context involves implementing data masking tools, monitoring for "Prompt Injection" attacks, and managing the "Grounding" data in vector databases to ensure it remains current and protected. This ensures the AI remains a safe and reliable asset within the enterprise testing ecosystem, rather than a liability for the organization’s intellectual property.
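The data-masking responsibility mentioned above can be sketched as a simple guardrail; real deployments would use dedicated masking tools, and these regexes are illustrative only:

```python
import re

# Hypothetical LLMOps guardrail: mask obvious PII before prompt data
# leaves the organizational boundary toward a third-party API.
def mask_pii(text):
    # Replace e-mail addresses with a placeholder token.
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "<EMAIL>", text)
    # Replace card-like runs of 13-16 digits (with optional separators).
    text = re.sub(r"\b(?:\d[ -]?){13,16}\b", "<CARD>", text)
    return text

log = "User jane.doe@example.com paid with 4111 1111 1111 1111."
print(mask_pii(log))
```

Masking is applied to the prompt payload itself, so even if the provider logs requests, the sensitive identifiers never leave the enterprise boundary.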
Answer:
Explanation:
In the context of LLM inference, Temperature is a hyperparameter that controls the randomness or "creativity" of the model's output. When the temperature is set high, the model's probability distribution is "flattened," meaning it is more likely to select less-probable tokens, leading to more diverse and sometimes unpredictable text. For software testing, where precision and repeatability are paramount, lowering the temperature (Option C) is the standard practice. A temperature of 0.0 makes the model "deterministic," meaning it will consistently choose the token with the highest probability. This narrows the sampling distribution and significantly reduces variability between runs. While a larger context window (Option D) allows the model to process more information, it does not directly control the randomness of token selection. Similarly, the "learning rate" (Option B) is a parameter used during the training or fine-tuning phase, not during inference. For generating test cases or scripts that must follow strict logic, a lower temperature ensures that the model remains focused and produces consistent results.
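The "flattening" effect of temperature can be shown directly on a toy probability distribution (a sketch of the standard temperature-scaled softmax; real inference stacks special-case temperature 0.0 as a pure argmax to avoid division by zero):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Temperature-scaled softmax: low T sharpens the distribution toward
    the top token; high T flattens it toward uniform."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                  # raw model scores for 3 candidate tokens
low = softmax_with_temperature(logits, 0.1)    # near-deterministic: top token ~1.0
high = softmax_with_temperature(logits, 10.0)  # flattened: close to uniform
```

With temperature 0.1 the top token absorbs essentially all of the probability mass (repeatable output), while at 10.0 the three tokens become nearly interchangeable, which is why high temperatures produce variable text.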
Answer:
Explanation:
Few-shot prompting is the technique of providing a few examples (exemplars) within the prompt to demonstrate the desired task and output format to the LLM. In this scenario, providing 10 existing, high-quality test cases acts as a "pattern" for the model to follow. This is significantly more effective than "Zero-shot prompting" (Option D), where the model is given a task without examples and may deviate from the specific organizational format required (e.g., specific JSON structures or assertion styles). While "Prompt chaining" (Option A) is useful for breaking down complex tasks into sub-tasks, the primary need here is pattern recognition and replication, which is the core strength of Few-shot learning. "Meta prompting" (Option C) involves having the AI write the prompt itself, which is unnecessary when the team already has clear examples. By using Few-shot prompting, the tester "conditions" the model's latent space to prioritize the provided format, ensuring that all 500 generated test cases maintain consistency with the HTTP methods, headers, and assertion logic defined in the exemplars.
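Assembling a few-shot prompt from existing exemplars can be sketched like this (the two exemplar test cases and the Request/Expected format are invented for illustration):

```python
# Exemplar pairs drawn from the team's existing, approved test cases.
exemplars = [
    ("GET /orders/{id} with a valid id", "expect 200 and an order body"),
    ("GET /orders/{id} with an unknown id", "expect 404 and an error body"),
]

def few_shot_prompt(exemplars, new_requirement):
    """Prefix the task with worked examples so the model replicates
    their format for the new requirement."""
    shots = "\n\n".join(
        f"Request: {req}\nExpected: {exp}" for req, exp in exemplars
    )
    return (
        "Generate a test case in the same format as the examples.\n\n"
        f"{shots}\n\nRequest: {new_requirement}\nExpected:"
    )

p = few_shot_prompt(exemplars, "DELETE /orders/{id} without an auth token")
```

Ending the prompt at "Expected:" invites the model to complete the final pair in the demonstrated style, which is the conditioning effect described above.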
Answer:
Explanation:
In a structured prompt, "Input Data" (or Reference Data) provides the specific subject matter that the model must process or analyze. The statement provided consists of factual identifiers and specific entities related to the System Under Test (SUT), such as the version number, the specific module name, reference IDs for existing tests, and a specific defect record. These elements serve as the raw material for the LLM's task. This differs from "Instructions" (Option C), which would be the command (e.g., "Analyze the following..."), or "Constraints" (Option B), which would define the boundaries of the task (e.g., "Do not include T-115"). "Output Format" (Option D) would define how the result should look (e.g., "Provide a JSON list"). By clearly labeling this section as Input Data, the tester helps the model distinguish between the "what" (the data) and the "how" (the instructions), which is a key principle of structured prompt engineering aimed at improving the accuracy of AI-generated analysis.
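The separation of "what" from "how" can be sketched as a prompt template with the four labeled sections discussed above (the SUT identifiers in the example are hypothetical):

```python
def structured_prompt(instruction, input_data, constraints, output_format):
    """Assemble a prompt whose labeled sections keep the data (Input Data)
    clearly separated from the commands (Instruction, Constraints, Output Format)."""
    return (
        f"Instruction: {instruction}\n"
        f"Input Data: {input_data}\n"
        f"Constraints: {constraints}\n"
        f"Output Format: {output_format}"
    )

p = structured_prompt(
    instruction="Analyze the following module for regression risk.",
    input_data="SUT v2.3, module PaymentGateway, tests T-101..T-114, defect D-88",
    constraints="Do not include T-115.",
    output_format="Provide a JSON list.",
)
```

Because each section is explicitly labeled, the model can treat the version numbers and reference IDs as material to analyze rather than as instructions to follow.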
Answer:
Explanation:
Prompt Chaining is a technique where a complex task is decomposed into several smaller, sequential steps, where the output of one step serves as the context or input for the next. This is far more reliable than a "one-shot" approach (Option A) because it reduces the cognitive load on the LLM and allows for intermediate verification. In the scenario of test analysis, the most logical and effective chain begins by extracting discrete test conditions from the raw requirements. Once these conditions are established, the next "link" in the chain is to prioritize them based on risk (impact and likelihood), which requires the model to reason specifically about the importance of each condition. The final step is to map these prioritized conditions back to the original requirements to identify any "coverage gaps." This systematic flow (Option B) mirrors the professional test analysis process defined in the ISTQB/CT-GenAI standards. By following this sequence, the tester ensures that the AI-generated output is logically derived and thorough, providing a clear "audit trail" from the initial requirement to the final prioritized test suite.
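The three-link chain described above can be sketched with a stubbed model call; in practice `call_llm` would invoke a real model, and each intermediate output would be reviewed before feeding the next step:

```python
def call_llm(prompt):
    """Stub standing in for a real model call; echoes the prompt's start
    so the chain's data flow is visible."""
    return f"[model output for: {prompt[:40]}...]"

def analyse(requirements):
    # Link 1: extract discrete test conditions from raw requirements.
    conditions = call_llm(f"Extract test conditions from: {requirements}")
    # Link 2: prioritize the extracted conditions by risk.
    prioritised = call_llm(f"Prioritise by risk (impact x likelihood): {conditions}")
    # Link 3: map back to requirements to find coverage gaps.
    gaps = call_llm(f"Map back to requirements and list coverage gaps: {prioritised}")
    return conditions, prioritised, gaps
```

Each link's output is passed verbatim into the next prompt, which is what produces the audit trail from requirement to prioritized suite.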
Answer:
Explanation:
A/B testing, also known as split testing, is a systematic empirical method used to compare two versions of a prompt (Version A and Version B) to determine which one performs better based on predefined evaluation metrics. In the realm of LLMs, where outputs can be stochastic (probabilistic), A/B testing is essential for mitigating inconsistency. When a team encounters vague or varying results for a user story, simply modifying the prompt iteratively (Option B) may improve the result but does not provide a statistical or objective basis for why one version is superior. By running A/B tests, testers can evaluate prompts against specific KPIs such as accuracy, relevance, format adherence, or the absence of hallucinations. This process involves sending the same input data through both prompt versions multiple times and scoring the outputs. The version that consistently yields the "stronger wording" or more precise testware is then selected as the production standard. This data-driven approach is a cornerstone of prompt engineering in professional environments, ensuring that the most effective linguistic structures are utilized to maximize the model's performance and reliability.
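A toy version of this loop can be sketched as follows; `run_and_score` is a stand-in for sending the prompt to the model and scoring the output against KPIs such as format adherence, and its fixed base scores are invented to make the comparison visible:

```python
import random

def run_and_score(prompt_version, rng):
    """Stub scorer: pretend version B consistently produces better-scoring
    output, with some run-to-run noise to mimic stochastic model behavior."""
    base = 0.8 if prompt_version == "B" else 0.6
    return min(1.0, max(0.0, base + rng.uniform(-0.1, 0.1)))

def ab_test(runs=50, seed=42):
    """Send the same input through both prompt versions `runs` times and
    select the version with the higher mean score."""
    rng = random.Random(seed)
    mean = lambda xs: sum(xs) / len(xs)
    score_a = mean([run_and_score("A", rng) for _ in range(runs)])
    score_b = mean([run_and_score("B", rng) for _ in range(runs)])
    return "B" if score_b > score_a else "A"
```

Running each version many times before comparing means is what turns a stochastic output into an objective, repeatable selection criterion.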
Answer:
Explanation:
A mature GenAI strategy for software testing must move beyond "hype" and focus on tangible value and operational feasibility. Selecting an LLM based on measurable test outcomes (such as reduction in test design time, increase in defect detection, or script accuracy) ensures that the AI investment directly supports the organization’s Quality Assurance goals. Furthermore, the model must be compatible with current infrastructure. This includes considerations for data security (on-prem vs. cloud), API integration capabilities, and cost-per-token efficiency. While vendor visibility (Option A) can be a factor, it is not a guarantee of task-specific performance. Prioritizing creativity over compliance (Option B) is highly risky for testing, where precision and policy adherence are paramount. Similarly, while broad functionality (Option C) is useful, it often results in "jack-of-all-trades" models that may not perform as well as specialized or instruction-tuned models on specific testing tasks. Strategic alignment requires a balance between model performance, organizational security requirements, and clear KPIs.
Answer:
Explanation:
The statement that "Strict GDPR compliance eliminates all privacy risk" is incorrect because compliance is a legal and procedural framework, not a foolproof technical shield against all possible risks. Even within a GDPR-compliant environment, risks such as "model inversion" attacks, accidental data leakage through "membership inference," or the unintentional generation of Sensitive Personally Identifiable Information (SPII) can still occur. Data privacy in GenAI is complex because LLMs function by processing and sometimes retaining patterns from the data they are fed. As noted in the CT-GenAI syllabus, some tools may process data in ways that are not fully transparent (Option A), and outputs can inadvertently include snippets of sensitive data used during the prompting or training phase (Option B). Furthermore, failing to adhere to regulations like GDPR or the EU AI Act certainly leads to legal and financial exposure (Option D). Therefore, while compliance frameworks significantly mitigate risk, they do not "eliminate" it; a robust GenAI strategy requires ongoing technical controls, data masking, and human oversight to manage residual privacy threats effectively.
Answer:
Explanation:
A successful Generative AI strategy for testing is heavily dependent on the quality of the data used for grounding (RAG) and prompting. The principle of "Garbage In, Garbage Out" is magnified with LLMs; therefore, a key strategic pillar is the prioritization of accurate, relevant, and high-quality input data. This involves establishing defined quality procedures to ensure that the requirements, codebases, and historical defect logs fed into the model are "clean" and representative of the current system state. Strategy must avoid the "unfiltered" approach (Option C), as including contradictory or obsolete data can lead to hallucinations or irrelevant test cases. While synthetic data (Option D) is a powerful tool for privacy, it cannot entirely replace the nuanced reality found in secured enterprise data. Furthermore, legacy data (Option A) often contains valuable insights for regression testing. Consequently, the strategy should focus on building a robust data pipeline that ensures only verified, contextually appropriate information is utilized, thereby increasing the reliability of AI-generated testware and ensuring it aligns with the organization's quality standards.
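A quality gate for grounding data can be sketched as a simple admissibility check; the freshness threshold, field names, and sample records are all hypothetical:

```python
from datetime import date

def is_admissible(record, today=date(2026, 1, 1), max_age_days=365):
    """Hypothetical grounding-data gate: admit only records that are
    reasonably current and pass a basic completeness check before they
    reach the RAG store."""
    fresh = (today - record["updated"]).days <= max_age_days
    complete = bool(record.get("text", "").strip())
    return fresh and complete

records = [
    {"id": "REQ-1", "text": "Refunds need approval.", "updated": date(2025, 11, 2)},
    {"id": "REQ-2", "text": "", "updated": date(2025, 12, 1)},           # incomplete
    {"id": "REQ-3", "text": "Old VAT rule.", "updated": date(2019, 3, 5)},  # stale
]
admitted = [r["id"] for r in records if is_admissible(r)]
```

Only verified, current records pass the gate, which is the "clean pipeline" principle described above; real pipelines would add contradiction and duplication checks on top.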
Answer:
Explanation:
This scenario describes a failure in the model's ability to apply logical weight to specific domain concepts, specifically in the context of Risk-Based Testing (RBT). When an LLM ranks a low-impact UI element (a tooltip) higher than a critical functional failure (payment processing), it demonstrates a "Reasoning error in risk calculation logic." While LLMs can follow formulas like Risk = Likelihood × Impact, they may lack the deep semantic understanding of "Impact" within a specific business domain unless explicitly guided. This is not necessarily a hallucination (Option C), as the model isn't necessarily inventing facts, but rather misapplying the logic of prioritization. It is also distinct from dataset bias (Option D), which would involve a systematic skewing across all outputs. In professional testing, this type of error highlights the necessity of "human-in-the-loop" verification. Testers must review AI-generated prioritizations to ensure that the logical deductions align with the actual business risk and technical criticality of the features being tested.
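The Risk = Likelihood × Impact formula from the text can be sketched with an explicit impact scale, so a payment failure cannot rank below a tooltip defect (the impact classes, weights, and findings are illustrative):

```python
# Hypothetical impact scale making domain criticality explicit.
IMPACT = {"cosmetic": 1, "minor": 2, "major": 4, "critical": 5}

def risk_score(likelihood, impact_class):
    """Risk = Likelihood x Impact, with impact taken from the explicit scale."""
    return likelihood * IMPACT[impact_class]

findings = [
    ("Tooltip text misaligned", risk_score(likelihood=3, impact_class="cosmetic")),
    ("Payment processing fails", risk_score(likelihood=2, impact_class="critical")),
]
ranked = sorted(findings, key=lambda f: f[1], reverse=True)
```

Encoding the impact weights explicitly (rather than leaving "Impact" to the model's judgment) is one way a human-in-the-loop can guide the prioritization the text describes.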
Answer:
Explanation:
Differentiating between prompting techniques is essential for a tester to select the right tool for the task. Few-shot prompting is characterized by providing the model with a few examples of inputs and desired outputs, allowing it to learn the pattern and format. Prompt Chaining involves breaking a complex task into a sequence of smaller, interconnected prompts, where the output of one step becomes the input for the next (e.g., first extract requirements, then generate test cases from those requirements). Meta-prompting is a more advanced technique where the user asks the LLM to help design, write, or refine the prompt itself, essentially using the AI as a "prompt engineer" to optimize the instructions.
Option D correctly identifies these core characteristics. Options A, B, and C contain fundamental mischaracterizations: for instance, Few-shot by definition requires examples, and Chaining uses a sequence of prompts rather than a single one. Mastering these distinctions allows testers to move from simple "chatting" to sophisticated AI orchestration that can handle complex, multi-stage testing workflows with high reliability.
Answer:
Explanation:
Data security is a paramount concern when using GenAI in testing, as test environments often contain sensitive business logic or PII (Personally Identifiable Information). To protect this data "at rest" (stored in databases or vector stores) and "in transit" (being sent to the LLM), a combination of technical controls is required. Role-Based Access Control (RBAC) is a fundamental security pillar that ensures only authorized individuals or services can access specific datasets or trigger GenAI workflows. This prevents unauthorized users from feeding sensitive enterprise data into public AI models. While encryption (which Option A wrongly replaces with obfuscation) and TLS (which Option C wrongly suggests disabling) are essential technical layers for protecting data in transit, RBAC provides the organizational "gatekeeping" necessary to manage who can interact with the AI system. In a professional GenAI strategy, testers must ensure that the tools they use adhere to strict access policies, ensuring that the "Input Data" used for prompting remains within the secured organizational boundary and is not leaked to unauthorized entities or public training sets.
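A minimal RBAC gatekeeper for GenAI workflows might look like this (the roles, permissions, and function names are invented for illustration):

```python
# Hypothetical role-to-permission mapping for a GenAI testing workflow.
PERMISSIONS = {
    "test_manager": {"submit_prompt", "view_output"},
    "contractor": {"view_output"},
}

def authorize(role, action):
    """Return True only if the role has been granted the requested action."""
    return action in PERMISSIONS.get(role, set())

def submit_prompt(role, prompt):
    """Gatekeeper: refuse to send enterprise data to the model unless the
    caller's role carries the submit_prompt permission."""
    if not authorize(role, "submit_prompt"):
        raise PermissionError(f"role '{role}' may not submit prompts")
    return f"submitted: {prompt}"
```

Placing the check in front of every model call ensures prompt data crosses the organizational boundary only for authorized roles, complementing encryption and TLS on the transport layer.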
Answer:
Explanation:
ISO/IEC 42001:2023 is the international standard for an AI Management System (AIMS). It is designed to help organizations develop, provide, or use AI systems responsibly by providing a certifiable framework of requirements and controls. In a software testing context, this standard is vital for establishing governance, ensuring that GenAI tools are used consistently and ethically across the lifecycle. NIST AI RMF 1.0 (Option B) is a highly respected framework, but it is a set of voluntary guidelines for managing risk, not a "requirement standard" for a management system. ISO/IEC 23053:2022 (Option C) provides a general framework for AI using machine learning but lacks the comprehensive "management system" scope found in 42001. Finally, the EU AI Act (Option D) is a regulation (law), not a technical standard. For a test organization looking to align its GenAI strategy with international best practices and achieve formal certification, ISO/IEC 42001 is the definitive standard to follow, as it covers the organizational processes, data handling, and risk management necessary for high-quality AI operations.