Exam Dumps
Every month, we help more than 1,000 people prepare well for their exams and pass them successfully.

Microsoft DP-800 Exam

Developing AI-Enabled Database Solutions Online Practice

Last updated: March 30, 2026

You can use these online practice questions to assess your knowledge of the Microsoft DP-800 exam topics and then decide whether to register for the exam.

To pass the exam with a 100% success rate and save 35% of your preparation time, choose the DP-800 dumps (latest real exam questions), which currently contain 61 exam questions and answers.


Question No : 1


HOTSPOT
You have an Azure SQL database that contains a table named stores. stores contains a column named description and a vector column named embedding.
You need to implement a hybrid search query that meets the following requirements:
• Uses full-text search on description for the keyword portion
• Returns the top 20 results based on a combined score that uses a weighted formula of 60% vector distance and 40% full-text rank
How should you configure the query components? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.



Answer:


Explanation:
For the vector portion, the correct choice is VECTOR_DISTANCE with ORDER BY distance ascending. The requirement is to build a combined weighted formula using the actual vector distance. Microsoft documents that VECTOR_DISTANCE returns the exact distance between two vectors. Since a lower distance means greater similarity, ascending distance is the right direction for ranking.
VECTOR_SEARCH is for ANN retrieval, but this hotspot specifically asks for a weighted formula based on distance, so VECTOR_DISTANCE is the appropriate operator.
For the keyword portion, the correct choice is CONTAINSTABLE on description and return ranked matches. Microsoft documents that CONTAINSTABLE returns a RANK column from 0 through 1000, which is exactly what is needed for weighted scoring in a hybrid formula.
For the final ranking expression, the best choice is order by (distance * 0.6) + ((1.0 - RANK/1000.0) * 0.4). This works because vector distance is a lower-is-better metric, while full-text RANK is a higher-is-better metric. Dividing RANK by 1000 normalizes it to the documented range, and subtracting from 1.0 converts it into a lower-is-better term so both components can be combined consistently in one ascending score. This final step is a sound inference based on Microsoft’s documented distance semantics and full-text rank range.
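Put together, the selections above could be sketched as the following query. Anything beyond the names given in the question, such as the store_id key column and the @keywords value, is an assumption:

```sql
DECLARE @query_vector VECTOR(1536);                    -- embedding of the user query, supplied by the app
DECLARE @keywords nvarchar(200) = N'organic grocery';  -- hypothetical keyword portion

SELECT TOP (20)
    s.description,
    (d.distance * 0.6) + ((1.0 - ft.[RANK] / 1000.0) * 0.4) AS combined_score
FROM stores AS s
JOIN CONTAINSTABLE(stores, description, @keywords) AS ft
    ON s.store_id = ft.[KEY]                           -- store_id is a hypothetical key column
CROSS APPLY (SELECT VECTOR_DISTANCE('cosine', s.embedding, @query_vector) AS distance) AS d
ORDER BY combined_score ASC;                           -- lower combined score = better match
```

Dividing RANK by 1000.0 (a float literal) keeps the normalization in floating point, and subtracting from 1.0 flips full-text rank into the same lower-is-better direction as the distance term.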

Question No : 2


You need to design a generative AI solution that uses a Microsoft SQL Server 2025 database named DB1 as a data source.
The solution must generate responses that meet the following requirements:
• Are grounded in the latest transactional and reference data stored in DB1
• Do NOT require retraining or fine-tuning the language model when the data changes
• Can include citations or references to the source data used in the response
Which scenario is the best use case for implementing a Retrieval Augmented Generation (RAG) pattern? More than one answer choice may achieve the goal. Select the BEST answer.

Answer:
Explanation:
The best use case for RAG is answering user questions based on company-specific knowledge. Microsoft defines RAG as a pattern that augments a language model with a retrieval system that provides grounding data at inference time, which is exactly what you need when responses must be based on the latest transactional and reference data, must avoid retraining/fine-tuning, and should be able to include citations or references to source data.
The other options do not fit as well:
summarizing free-form user input does not inherently require retrieval from DB1, training a custom model contradicts the requirement to avoid retraining/fine-tuning, generating marketing slogans is a creative generation task, not a grounding-and-citation scenario. RAG is specifically strong when answers must come from your organization’s own changing knowledge.

Question No : 3


HOTSPOT
Your company has an ecommerce catalog in a Microsoft SQL Server 2025 database named SalesDB. SalesDB contains a table named products. products contains the following columns:
• product_id (int)
• product_name (nvarchar(200))
• description (nvarchar(max))
• category (nvarchar(50))
• brand (nvarchar(W))
• price (decimal)
• sku (nvarchar(40))
The description fields are updated daily by a content pipeline, and price can change multiple times per day. You want customers to be able to submit natural language queries and apply structured filters for brand and price. You plan to store embeddings in a new VECTOR(1536) column and use VECTOR_SEARCH(... METRIC = 'cosine' ...).
For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.



Answer:


Explanation:
The first statement is Yes. Embeddings are used to represent the semantic meaning of content, and vector search is for conceptually similar matches over that content. Here, the semantically meaningful fields are product_name, category, and description. Using those together supports natural-language search, while brand and price can be handled as structured filters outside the embedding itself. This is an inference from Microsoft’s guidance that vector search works over embeddings representing content meaning, while filters remain part of the nonvector query pipeline.
The second statement is No. price changes multiple times per day and is a structured numeric attribute, not stable semantic content. Since the requirement already says customers can apply structured filters for brand and price, price does not need to be embedded into the text. Embedding volatile numeric values would also make embeddings stale faster without improving the semantic-search objective. This is again an inference grounded in Microsoft’s distinction between vector similarity over content and filtering/sorting over nonvector fields.
The third statement is Yes. In SQL Server’s vector type, the default underlying base type is float32 unless float16 is specified explicitly.
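The split between embedded semantic fields and structured filters could be sketched as follows. The embedding model name, the filter values, and the use of CONCAT_WS to combine the text fields are all assumptions, not part of the question:

```sql
-- Statement 1: embed the semantically meaningful text fields together.
-- Statement 2: keep the volatile, structured price value out of the embedding.
UPDATE products
SET embedding = AI_GENERATE_EMBEDDINGS(
        CONCAT_WS(N' | ', product_name, category, description) USE MODEL MyEmbeddingModel);

-- Statement 3: VECTOR(1536) stores float32 components by default.
-- brand and price remain ordinary structured filters outside the vector:
DECLARE @query_vector VECTOR(1536);  -- embedding of the customer's natural language query
SELECT TOP (10) p.product_name, p.price
FROM products AS p
WHERE p.brand = N'Contoso' AND p.price <= 50.00
ORDER BY VECTOR_DISTANCE('cosine', p.embedding, @query_vector);
```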

Question No : 4


HOTSPOT
You have an Azure SQL database that contains a table named knowledge_base. knowledge_base stores human resources (HR) policy documents and contains columns named title, content, category, and embedding.
You have an application named App1. App1 queries two relational tables named employee_profiles and benefits_enrollment that contain HR data. App1 hosts a chatbot that calls a large language model (LLM) directly.
Users report that the chatbot answers general HR questions correctly but provides outdated or incorrect answers when policies change. The chatbot also fails to answer questions that reference internal policy documents by title or category.
You need to recommend a Retrieval Augmented Generation (RAG) solution to resolve the chatbot issues.
What should you recommend? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.



Answer:


Explanation:
The correct recommendation is to retrieve grounding data from knowledge_base and, at inference time, generate query embeddings and run a vector similarity search.
The chatbot currently answers some general HR questions but fails when policies change and when users ask about internal policy documents by title or category. That is exactly the kind of problem RAG is meant to solve: ground the LLM in the organization’s proprietary content instead of relying on the model’s training data or unrelated transactional tables. Microsoft’s RAG guidance states that RAG extends LLMs by grounding responses in your own content and that, for agentic retrieval, knowledge bases unify knowledge sources for retrieval.
So the grounding data should come from knowledge_base, because that table stores the HR policy documents and already includes fields like title, content, category, and embedding. Those are the fields directly tied to the missing and outdated policy answers.
By contrast:
employee_profiles and benefits_enrollment are operational HR tables, not the authoritative store for policy-document grounding.
PDF exports of the policies would be inferior to querying the indexed/structured knowledge base already prepared for retrieval.
The LLM training data is specifically the wrong source when the issue is outdated internal content.
For the retrieval step, Microsoft’s guidance says to use embeddings for vector queries and notes that vector similarity search matches concepts, not exact terms. This is especially important because users ask about policy documents by title or category and also phrase questions in ways that might not exactly match document wording. Generating a query embedding and then running a vector similarity search is the appropriate retrieval step in a RAG pipeline.
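A minimal retrieval step matching this recommendation might look as follows. The embedding model name and the sample question are assumptions, and an exact-distance ORDER BY is used for brevity rather than an ANN index:

```sql
DECLARE @question nvarchar(max) = N'How many days of parental leave do we offer?';

-- 1. Generate the query embedding at inference time
DECLARE @query_vector VECTOR(1536) =
    AI_GENERATE_EMBEDDINGS(@question USE MODEL MyEmbeddingModel);

-- 2. Retrieve grounding rows from knowledge_base by vector similarity;
--    title and category come back with the content, so the chatbot can cite them
SELECT TOP (5) kb.title, kb.category, kb.content
FROM knowledge_base AS kb
ORDER BY VECTOR_DISTANCE('cosine', kb.embedding, @query_vector);
```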

Question No : 5


You have a Microsoft SQL Server 2025 instance that contains a database named SalesDB. SalesDB supports a Retrieval Augmented Generation (RAG) pattern for internal support tickets. The SQL Server instance runs without any outbound network connectivity.
You plan to generate embeddings inside the SQL Server instance and store them in a table for vector similarity queries.
You need to ensure that only a database user account named AIApplicationUser can run embedding generation by using the model.
Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

Answer:
Explanation:
Because the SQL Server 2025 instance has no outbound network connectivity, the embedding model cannot rely on a remote REST endpoint such as Azure AI Foundry or Azure OpenAI. Microsoft’s CREATE EXTERNAL MODEL documentation includes a local deployment pattern using ONNX Runtime running locally with local runtime/model paths. That is the right design when embeddings must be generated inside the SQL Server instance without external network access. Microsoft explicitly documents a local ONNX Runtime example for SQL Server 2025 and notes the required local runtime setup and model path configuration.
The permission requirement is handled by granting the application user access to use the external embeddings model. Microsoft’s AI_GENERATE_EMBEDDINGS documentation states that, as a prerequisite, you must create an external model of type EMBEDDINGS that is accessible via the correct grants, roles, and/or permissions. Among the choices, the exam-appropriate action is to grant EXECUTE permission on the external model object to AIApplicationUser so that only that database user can run embedding generation through the model.
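The two actions could be sketched roughly as follows. This is an illustration only: the model name, LOCATION value, and option names for a local ONNX Runtime deployment are placeholders and should be checked against the current CREATE EXTERNAL MODEL documentation:

```sql
-- Sketch: local ONNX Runtime model, no outbound network required
CREATE EXTERNAL MODEL LocalEmbeddings
WITH (
    LOCATION   = 'https://localhost/embeddings',  -- hypothetical local runtime endpoint
    API_FORMAT = 'ONNX Runtime',
    MODEL_TYPE = EMBEDDINGS,
    MODEL      = 'my-onnx-embedding-model'        -- hypothetical model name
);

-- Restrict embedding generation to the single application user
GRANT EXECUTE ON EXTERNAL MODEL::LocalEmbeddings TO AIApplicationUser;
```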

Question No : 6


You have an Azure SQL database that contains tables named dbo.ProductDocs and dbo.ProductDocsEmbeddings. dbo.ProductDocs contains product documentation and the following columns:
• DocId (int)
• Title (nvarchar(200))
• Body (nvarchar(max))
• LastModified (datetime2)
The documentation is edited throughout the day. dbo.ProductDocsEmbeddings contains the following columns:
• DocId (int)
• ChunkOrder (int)
• ChunkText (nvarchar(max))
• Embedding (vector(1536))
The current embedding pipeline runs once per night.
You need to ensure that embeddings are updated every time the underlying documentation content changes. The solution must NOT require a nightly batch process.
What should you include in the solution?

Answer:
Explanation:
The requirement is to ensure embeddings are updated every time the underlying content changes without relying on a nightly batch job. The right design is to enable change tracking on the source table so an external process can identify which rows changed and regenerate embeddings only for those rows. Microsoft documents that change detection mechanisms are used to pick up new and updated rows incrementally, which is the right pattern when you need near-continuous refresh instead of full nightly rebuilds.
This is better than:
A. fixed-size chunking, which affects chunk strategy but not change detection.
B. a smaller embedding model, which affects model cost/latency but not update triggering.
C. table triggers, which would push embedding-maintenance logic directly into write operations and is generally not the best design for AI-processing pipelines. The question specifically asks for a solution that replaces the nightly batch requirement, not one that performs heavyweight work inline during every transaction.
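The chosen approach can be sketched with standard change tracking DDL. The retention values are illustrative:

```sql
-- Change tracking must be enabled at the database level first
ALTER DATABASE CURRENT SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.ProductDocs ENABLE CHANGE_TRACKING;

-- The external embedding process keeps the last sync version it processed
-- and asks only for rows changed since then:
DECLARE @last_sync_version bigint = 0;  -- persisted by the pipeline between runs
SELECT ct.DocId, ct.SYS_CHANGE_OPERATION
FROM CHANGETABLE(CHANGES dbo.ProductDocs, @last_sync_version) AS ct;
```

The worker then regenerates embeddings only for the returned DocId values and records the new sync version, so no nightly full rebuild is needed.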

Question No : 7


You have an Azure SQL database that contains a table named dbo.ManualChunks. dbo.ManualChunks contains product manuals.
A retrieval query already returns the top five matching chunks as nvarchar(max) text.
You need to call an Azure OpenAI REST endpoint for chat completions. The request body must include both the user question and the retrieved chunks.
You write the following Transact-SQL code.



What should you insert at line 22?

Answer:
Explanation:
The correct insertion at line 22 is FOR JSON PATH, WITHOUT_ARRAY_WRAPPER.
The request body for the Azure OpenAI chat completions call must be a single JSON object containing the messages array with both the system/user content and the retrieved chunks. Microsoft documents that FOR JSON PATH is the preferred way to shape JSON output, especially when you want precise control over nested property names like messages[0].role and messages[1].content.
The key detail is WITHOUT_ARRAY_WRAPPER. By default, FOR JSON returns results enclosed in square brackets as a JSON array. Microsoft documents that WITHOUT_ARRAY_WRAPPER removes those brackets so a single JSON object is produced instead. That is exactly what is needed here for @payload, because the stored procedure is building one request body, not an array of request bodies.
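A simplified version of the payload construction is shown below. The surrounding stored procedure is not reproduced in the dump, so the variable names and message wording here are assumptions:

```sql
DECLARE @question nvarchar(max) = N'How do I reset the device?';
DECLARE @chunks   nvarchar(max) = N'...top five retrieved chunks, concatenated...';

DECLARE @payload nvarchar(max) = (
    SELECT
        JSON_QUERY((
            SELECT m.[role], m.content
            FROM (VALUES
                (N'system', N'Answer using only this context: ' + @chunks),
                (N'user',   @question)
            ) AS m([role], content)
            FOR JSON PATH                       -- inner array of message objects
        )) AS [messages]
    FOR JSON PATH, WITHOUT_ARRAY_WRAPPER        -- one JSON object, not a one-element array
);
-- @payload now holds: {"messages":[{"role":"system",...},{"role":"user",...}]}
```

The inner JSON_QUERY wrapper keeps the nested messages array from being re-escaped as a string when the outer FOR JSON runs.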

Question No : 8


HOTSPOT
You have an Azure AI Search service and an index named hotels that includes a vector field named DescriptionVector.
You query hotels by using the Search Documents REST API.
You add semantic ranking to the hybrid search query and discover that some queries return fewer results than expected, and captions and answers are missing.
You need to complete the hybrid search request to meet the following requirements:
• Include more documents when ranking.
• Always include captions and answers.



Answer:


Explanation:
These are the correct selections for a hybrid query that uses semantic ranking in Azure AI Search.
Use k = 50 because Microsoft explicitly recommends that when you combine semantic ranking with vector queries, you should set k to 50 so the semantic ranker has enough candidates to rerank. If you use a smaller value such as 10, semantic ranking can receive too few inputs, which is exactly why some queries return fewer results than expected.
Use queryType = "semantic" because captions and answers are only available on semantic queries. Microsoft documents that captions is valid only when the query type is semantic, and semantic answers are returned only for semantic queries.
Use captions = "extractive" because semantic captions are extractive passages pulled from the top-ranked documents. Microsoft’s REST documentation states that the valid captions option here is extractive and that it defaults to none if not specified.
Use answers = "extractive" because semantic answers in Azure AI Search are extractive, not generated. Microsoft documents that semantic answers are verbatim passages recognized as answers and the REST API lists extractive as the answer-return option.
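Combined, the four selections correspond to a Search Documents request body along these lines. The semantic configuration name, search text, and the (truncated) vector values are placeholders:

```json
{
  "search": "quiet hotel near the beach",
  "queryType": "semantic",
  "semanticConfiguration": "my-semantic-config",
  "captions": "extractive",
  "answers": "extractive",
  "vectorQueries": [
    {
      "kind": "vector",
      "fields": "DescriptionVector",
      "k": 50,
      "vector": [0.018, -0.025]
    }
  ]
}
```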

Question No : 9


DRAG DROP
You have an Azure SQL database named DB1 that contains two tables named knowledge_base and query_cache. knowledge_base contains support articles and embeddings. query_cache contains chat questions, responses, and embeddings. DB1 supports an AI-enabled chat agent.
You need to design a solution that meets the following requirements:
• Serializes the retrieved rows from knowledge_base
• Extracts the answer field from the response
• Extracts the embeddings to store in query_cache
You will call the external large language model (LLM) by using the sp_invoke_external_rest_endpoint stored procedure.
Which Transact-SQL commands should you use for each requirement? To answer, drag the appropriate commands to the correct requirements. Each command may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content. NOTE: Each correct selection is worth one point.



Answer:


Explanation:
The correct mapping is:
Serialize the retrieved rows → FOR JSON PATH
Extract the answer field → JSON_VALUE
Extract the embeddings → JSON_QUERY
To serialize the retrieved rows from knowledge_base, the correct command is FOR JSON PATH. Microsoft documents that FOR JSON formats query results as JSON, and PATH mode is the standard
way to shape relational rows into JSON for downstream application or AI use.
To extract the answer field from the response, the correct command is JSON_VALUE because answer is a single scalar field. Microsoft states that JSON_VALUE is used to extract a scalar value from JSON text.
To extract the embeddings to store in query_cache, the correct command is JSON_QUERY because embeddings are returned as a JSON array, not a scalar. Microsoft states that JSON_QUERY extracts an object or array from JSON text, which is exactly the right behavior for an embeddings payload.
The unused options are not the best fit here:
OPENJSON is mainly for shredding JSON into rows and columns.
AI_GENERATE_CHUNKS is for chunking text, not extracting fields from a response payload.
VECTOR_DISTANCE computes similarity between vectors and is unrelated to JSON extraction.
FOR XML PATH produces XML, not JSON.
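The scalar-versus-array distinction is easy to see on a toy response. The JSON shape below is illustrative only, not the actual LLM response format:

```sql
DECLARE @response nvarchar(max) =
    N'{"answer":"Restart the router.","embedding":[0.12,-0.08,0.33]}';

SELECT
    JSON_VALUE(@response, '$.answer')    AS answer_text,     -- scalar  -> JSON_VALUE
    JSON_QUERY(@response, '$.embedding') AS embedding_array; -- array   -> JSON_QUERY

-- Serializing retrieved rows for the request uses FOR JSON PATH, for example:
-- SELECT title, content FROM knowledge_base FOR JSON PATH;
```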

Question No : 10


You have an Azure SQL database that contains a table named dbo.Products. dbo.Products contains three columns named Embedding, Category, and Price. The Embedding column is defined as VECTOR(1536).
You use AI_GENERATE_EMBEDDINGS and VECTOR_SEARCH to support semantic search and apply additional filters on two columns named Category and Price.
You plan to change the embedding model from text-embedding-ada-002 to text-embedding-3-small.
Existing rows already contain embeddings in the Embedding column.
You need to implement the model change. Applications must be able to use VECTOR_SEARCH without runtime errors.
What should you do first?

Answer:
Explanation:
When you change embedding models, the stored vectors should be treated as belonging to a different embedding space unless you intentionally keep the entire corpus consistent. Microsoft’s vector guidance notes that when most or all embeddings are replaced with fresh embeddings from a new model, the recommended practice is to reload the new embeddings and, for large-scale replacement scenarios, consider dropping and recreating the vector index afterward so search quality remains predictable.
This question also says applications must continue to use VECTOR_SEARCH without runtime errors. VECTOR_SEARCH requires compatible vector dimensions, and the vector column already exists. Azure OpenAI documentation shows that text-embedding-ada-002 is fixed at 1536 dimensions and text-embedding-3-small supports up to 1536 dimensions. That means the migration can remain compatible with a VECTOR(1536) column, but the right implementation step is still to re-embed the existing rows so the table does not contain a mixed corpus produced by different models.
The other options are not appropriate:
B. Normalization does not solve a model migration problem.
C. Converting the vector column to nvarchar(max) would break the vector-native search design.
D. A vector index improves performance, but it does not migrate old embeddings to the new model.
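The re-embedding step could be sketched as follows. The question does not name the source text column, so Description below is a pure assumption, as is the external model name:

```sql
-- Re-embed every existing row with the new model so the corpus stays consistent.
-- text-embedding-3-small can emit 1536 dimensions, so the VECTOR(1536) column is reusable.
UPDATE dbo.Products
SET Embedding = AI_GENERATE_EMBEDDINGS(Description USE MODEL TextEmbedding3Small);
-- For a large table, batching this UPDATE and rebuilding any vector index
-- afterward would be advisable.
```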

Question No : 11


HOTSPOT
You have a SQL database in Microsoft Fabric named SalesDB that contains a table named dbo.Products.
You need to modify SalesDB to meet the following requirements:
• Create a vector index on the appropriate column.
• Use a supplied natural language query vector.
How should you complete the Transact-SQL code? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.



Answer:


Explanation:
The first correct selection is embedding because a vector index must be created on the vector column, not on a scalar distance column or a text column such as product_name. Microsoft’s CREATE VECTOR INDEX documentation shows that the index is created directly on the vector-valued column, for example ON product_embeddings(embedding).
The second correct selection is VECTOR_SEARCH because the requirement is to use a supplied natural language query vector and search against the indexed embeddings. Microsoft documents that VECTOR_SEARCH is the Transact-SQL function for approximate nearest neighbor vector retrieval and that it applies to SQL database in Microsoft Fabric as well as other supported SQL platforms.
This also matches the shown code pattern:
declare a vector variable such as @query_vector VECTOR(1536),
create a vector index on dbo.Products(embedding),
query with VECTOR_SEARCH(... SIMILAR_TO = @query_vector, METRIC = 'cosine', TOP_N = 10).
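Filling in that pattern gives a sketch like the one below. The index name and WITH options are assumptions; the VECTOR_SEARCH argument names follow the pattern quoted above, but check the current documentation for the exact signature:

```sql
CREATE VECTOR INDEX vix_products_embedding
ON dbo.Products(embedding)
WITH (METRIC = 'cosine', TYPE = 'diskann');   -- hypothetical index options

DECLARE @query_vector VECTOR(1536);           -- supplied natural language query vector

SELECT t.product_name, s.distance
FROM VECTOR_SEARCH(
        TABLE      = dbo.Products AS t,
        COLUMN     = embedding,
        SIMILAR_TO = @query_vector,
        METRIC     = 'cosine',
        TOP_N      = 10
     ) AS s
ORDER BY s.distance;
```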

Question No : 12


You have an Azure SQL database named SalesDB that contains a table named dbo.Articles. dbo.Articles contains two million articles with embeddings. The articles are updated frequently throughout the day.
You query the embeddings by using VECTOR_SEARCH.
Users report that semantic search results do NOT reflect the updates until the following day.
You need to ensure that the embeddings are updated whenever the articles change. The solution must minimize CPU usage on SalesDB.
Which embedding maintenance method should you implement?

Answer:
Explanation:
The correct answer is B because the problem is not the vector search operator itself. The problem is that embeddings are becoming stale when article content changes. Microsoft documents that change data capture (CDC) tracks insert, update, and delete operations on source tables, which makes it the right mechanism to identify only the rows that changed.
This also best satisfies the requirement to minimize CPU usage on SalesDB. With CDC, the database only records the row changes, and the embedding regeneration work can be moved to an external process such as an Azure Functions app. That avoids running embedding generation inline inside the database for every update and avoids repeatedly recalculating embeddings for unchanged rows. In contrast, an hourly full-table regeneration would be extremely wasteful on a table with two million frequently updated articles, and a trigger that calls embedding generation per row would push expensive AI work into the transactional path of the database.
Option A is incorrect because changing from VECTOR_SEARCH to VECTOR_DISTANCE does not regenerate embeddings; it only changes the retrieval method. Microsoft states that VECTOR_SEARCH is the ANN search function, while VECTOR_DISTANCE performs exact distance calculation, so neither option addresses stale embedding data.
So the right design is:
use CDC to detect only changed articles,
process those changes outside the database,
regenerate embeddings only for changed rows,
write back the refreshed embeddings for current semantic search results.
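Enabling CDC for this design takes two stored procedure calls:

```sql
-- Enable CDC at the database level, then for the frequently updated table
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Articles',
    @role_name     = NULL;   -- NULL = no gating role; consider tightening in production

-- An external worker (for example, an Azure Functions app) then polls the CDC
-- change functions, regenerates embeddings for changed rows only, and writes
-- the refreshed vectors back.
```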

Question No : 13


HOTSPOT
You have an Azure SQL database that contains the following tables and columns.



Embeddings in the NotesEmbeddings and DescriptionEmbeddings tables have been generated from values in the Description and Notes columns of the Articles table by using different chunk sizes.
You need to perform approximate nearest neighbor (ANN) queries across both embedding tables.
The solution must minimize the impact of using different chunk sizes.
What should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.



Answer:


Explanation:
The correct function is VECTOR_SEARCH because the requirement is to perform approximate nearest neighbor (ANN) queries. Microsoft’s SQL documentation states that VECTOR_SEARCH is the function used for vector similarity search, and that an ANN index is used only with VECTOR_SEARCH when a compatible vector index exists on the target column. By contrast, VECTOR_DISTANCE calculates an exact distance and does not use a vector index for ANN retrieval.
The correct distance metric is cosine distance. Microsoft documents that VECTOR_SEARCH supports cosine, dot, and euclidean metrics, and Microsoft guidance specifically notes that cosine similarity is commonly used for text embeddings. It also states that retrieval of the most similar texts to a given text typically functions better with cosine similarity, and that Azure OpenAI embeddings rely on cosine similarity to compute similarity between a query and documents. Since both NotesEmbeddings and DescriptionEmbeddings are text-derived embeddings and the goal is to minimize the impact of different chunk sizes, cosine is the best choice because it compares direction/angle rather than being as sensitive to vector magnitude as Euclidean distance.

Question No : 14


You have a Microsoft SQL Server 2025 instance that has a managed identity enabled.
You have a database that contains a table named dbo.ManualChunks. dbo.ManualChunks contains product manuals.
A retrieval query already returns the top five matching chunks as nvarchar(max) text.
You need to call an Azure OpenAI REST endpoint for chat completions. The solution must provide the highest level of security.
You write the following Transact-SQL code.



What should you insert at line 02?
A)



B)



C)



D)



E)



Answer:
Explanation:
The correct answer is Option B because the requirement is to call an Azure OpenAI REST endpoint from SQL Server 2025 while providing the highest level of security, and the instance already has a managed identity enabled. For Microsoft’s SQL AI features, the preferred secure pattern is to use a database scoped credential with IDENTITY = 'Managed Identity' instead of storing an API key. Microsoft documents that SQL Server 2025 supports managed identity for external AI endpoints, and for Azure OpenAI the credential secret uses the Cognitive Services resource identifier: {"resourceid":"https://cognitiveservices.azure.com"}.
So line 02 should be:
WITH IDENTITY = 'Managed Identity',
SECRET = '{"resourceid":"https://cognitiveservices.azure.com"}';
Why the other options are incorrect:
A and D use HTTP header or query-string credentials with an API key, which is less secure than
managed identity because a secret key must be stored and rotated manually. Microsoft recommends managed identity where supported to avoid embedded secrets.
C mixes Managed Identity with an api-key secret, which is not the correct pattern for Azure OpenAI managed-identity authentication.
E uses an invalid identity value for this scenario. The accepted credential identities for external REST endpoint calls include HTTPEndpointHeaders, HTTPEndpointQueryString, Managed Identity, and Shared Access Signature.
Because the endpoint is Azure OpenAI and the question explicitly asks for the highest security, managed identity with the Cognitive Services resource ID is the Microsoft-aligned answer.
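Put into a full statement, the credential could look like the sketch below. The endpoint name is hypothetical; by the documented pattern, the credential name should match the Azure OpenAI endpoint URL that sp_invoke_external_rest_endpoint calls:

```sql
-- Sketch: managed-identity credential for an Azure OpenAI endpoint
-- (a database master key must already exist before creating a credential)
CREATE DATABASE SCOPED CREDENTIAL [https://contoso-openai.openai.azure.com]
WITH IDENTITY = 'Managed Identity',
     SECRET   = '{"resourceid":"https://cognitiveservices.azure.com"}';
```

No API key is stored anywhere, which is what makes this the highest-security option among the choices.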

Question No : 15


You have an Azure SQL table that contains the following data.



You need to retrieve data to be used as context for a large language model (LLM). The solution must minimize token usage.
Which format should you use to send the data to the LLM?
A)



B)



C)



D)



Answer:
Explanation:
The correct choice is Option A because it provides the relevant semantic context the LLM needs while avoiding an unnecessary field that would add tokens without improving answer quality.
For LLM grounding and RAG-style context, Microsoft guidance emphasizes mapping and sending the fields that contain text pertinent to the use case. In this FAQ scenario, the useful context is the ProductName, the Question, and the Answer. Those three fields help the model understand both the subject domain and the actual Q&A pair. By contrast, FaqId is just a technical identifier and generally adds no semantic value for response generation, so including it wastes tokens.
That is why Option A is better than the others:
Option A keeps the meaningful text fields and removes the low-value identifier.
Option B is too minimal because it includes only the answer text as Prompt, which strips away the product and question context the LLM may need for accurate grounding.
Option C keeps FaqId but omits ProductName, which can be important disambiguating context.
Option D includes everything, but that does not minimize token usage because it keeps the unnecessary FaqId.
