Microsoft DP-420 Exam Dumps & Practice Test Questions
Question No 1:
You are designing a new container named container1 in an Azure Cosmos DB database. This container is intended to store product data along with product category information, and the workload will primarily consist of read operations.
Your primary goals in configuring the container are to:
Minimize the size of each partition, ensuring efficient resource use and load distribution.
Reduce maintenance overhead, aiming for a solution that requires minimal ongoing management.
When selecting a partition key for container1, you must ensure that the choice aligns with the above objectives.
Which two characteristics should you prioritize when selecting a partition key?
A. Unique
B. High cardinality
C. Low cardinality
D. Static
Correct Answer: B, D
Explanation:
In Azure Cosmos DB, careful selection of a partition key is essential for achieving scalability, performance, and cost-efficiency. Since Cosmos DB distributes data across partitions based on the partition key, the structure and behavior of that key directly influence the system's ability to manage and serve requests efficiently.
To meet the stated goals of minimizing partition size and reducing maintenance, two main characteristics of a good partition key are high cardinality and static values.
High cardinality refers to the number of distinct values a key can take. A high-cardinality partition key ensures that the data is evenly distributed across many logical partitions. This even distribution helps prevent "hot partitions", where a disproportionate share of data or operations is concentrated on a single partition, creating performance bottlenecks. For example, a partition key like productId, or a synthetic key such as categoryId#productId, provides high cardinality, distributing the data more uniformly across partitions and helping Cosmos DB scale effectively under heavy read loads.
Static values in the context of partition keys imply that once a value is assigned to a document, it doesn't change. This is important because Cosmos DB does not allow updates to the partition key of an existing document. Choosing a static key ensures that data doesn't need to be moved or re-partitioned later, significantly reducing maintenance complexity and potential errors during updates.
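The effect of cardinality can be seen with a small, self-contained Python sketch. The simple hash below is only a stand-in for Cosmos DB's internal partitioning, and the product/category data is invented for illustration:

```python
import hashlib
from collections import Counter

def partition_for(key: str, partition_count: int = 10) -> int:
    """Map a partition key value to a partition (a stand-in for Cosmos DB's hashing)."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % partition_count

# 1,000 products spread across only 4 categories
docs = [{"productId": f"p{i}", "categoryId": f"cat{i % 4}"} for i in range(1000)]

# Low cardinality: 4 distinct key values -> at most 4 partitions ever receive data
low = Counter(partition_for(d["categoryId"]) for d in docs)

# High cardinality: 1,000 distinct key values -> load spreads across partitions
high = Counter(partition_for(d["productId"]) for d in docs)

print("partitions used with low-cardinality key: ", len(low))   # at most 4
print("partitions used with high-cardinality key:", len(high))
```

However many partitions exist, a four-value key can never use more than four of them; the high-cardinality key lets the system use them all.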
Let’s look at why the other options are not ideal:
Unique (A): While unique keys ensure distinct partition values, using a completely unique value (e.g., a GUID for every document) would result in one document per logical partition, which adds unnecessary overhead and underutilizes throughput. Cosmos DB works best when multiple documents share the same partition key value, up to the size limit for a logical partition (20 GB per logical partition as of the current limit).
Low Cardinality (C): A low-cardinality key (like "category") could result in large amounts of data being grouped into very few partitions. This increases the risk of hot partitions, which degrade performance and limit scalability, especially in read-heavy workloads where requests may frequently hit the same partition.
In conclusion, to optimize for performance and operational simplicity in Cosmos DB, especially in read-heavy workloads, the ideal partition key should offer high cardinality for even data distribution and be static to avoid future update complications.
Question No 2:
You are working with an Azure Cosmos DB for NoSQL account that is globally distributed across four different regions. When a client application attempts to connect using the Azure Cosmos DB SQL SDK, it must determine the most appropriate regional endpoint to perform read and write operations efficiently.
Which two of the following factors directly influence how the SDK selects the most appropriate regional endpoint for reads and writes?
A. The consistency level specified within the RequestOptions object
B. The network latency between the client and each Cosmos DB region
C. The default consistency level defined at the Cosmos DB account level
D. The PreferredLocations list defined in the client configuration
E. The current availability status of each region
Correct Answer: B, D
Explanation:
Azure Cosmos DB is designed to provide low-latency, globally distributed access to your data. When using an SDK like the Azure Cosmos DB SQL SDK, selecting the optimal regional endpoint is crucial for ensuring performance and availability, especially for globally distributed applications.
Two primary factors that directly influence how the SDK determines which region to use for reads and writes are:
PreferredLocations list (D):
This is a list of regions defined by the developer or application configuration that specifies which regions the client prefers to use for data access. The SDK will attempt to connect to the first available region in this list for read or write operations, depending on the availability and role (read/write region). This allows applications to prioritize certain regions based on geographic proximity or business requirements, enabling optimal user experience and failover capabilities.
Network latency (B):
Azure Cosmos DB SDKs dynamically evaluate network latency to each of the available regions. When executing read operations, the SDK automatically chooses the closest or fastest region based on the latency measured from the client. This enhances read performance, ensuring the lowest possible response time by serving data from the nearest location. This is especially beneficial for applications serving global user bases.
Let’s analyze why the other options are incorrect:
Consistency level in RequestOptions (A): While consistency levels (like Strong, Session, Eventual) influence the data replication behavior and freshness, they do not directly control which region is selected for access. They are important for how the data behaves across regions, but not for endpoint selection.
Default consistency level (C): Similar to Option A, this impacts how consistent the data must be across replicas but does not affect the SDK’s region selection logic. It ensures consistent behavior but is orthogonal to physical region routing.
Region availability (E): While region availability is certainly considered internally by the SDK (i.e., it avoids offline or unhealthy regions), it is not something explicitly controlled by the user. The SDK handles failover automatically when a region becomes unavailable, but availability alone does not determine the endpoint selection — it is just one of several considerations handled behind the scenes.
In summary, the PreferredLocations configuration and network latency measurements are the two most influential and controllable factors that affect how the SDK routes traffic to regional endpoints in a multi-region Azure Cosmos DB setup.
Question No 3:
You are working with an Azure Cosmos DB for NoSQL account named account1. Your task is to programmatically create a new container called Container1 within this account using the Azure Cosmos DB .NET SDK. One of the key requirements is to ensure that no data within the container ever expires.
Which setting should you apply to achieve this?
A. Set TimeToLivePropertyPath to null
B. Set TimeToLivePropertyPath to 0
C. Set DefaultTimeToLive to null
D. Set DefaultTimeToLive to -1
Correct Answer: D
Explanation:
Azure Cosmos DB includes a Time to Live (TTL) feature that enables automatic deletion of documents after a defined period. This feature is often used to reduce storage costs and maintain fresh data by removing outdated entries. However, in cases where data must persist indefinitely unless explicitly deleted, TTL must be properly configured to disable automatic expiration.
There are two key TTL-related settings in Azure Cosmos DB:
DefaultTimeToLive: Determines whether TTL is enabled and how long an item should live by default (in seconds).
TimeToLivePropertyPath: An advanced setting used for dynamic TTL values based on a specific property in each document.
In this scenario, the requirement is not to expire data at all, which means TTL must be disabled.
To achieve this, the correct setting is to assign DefaultTimeToLive = -1. This explicitly instructs Cosmos DB that TTL should be disabled, ensuring that documents will not be removed automatically by the system under any circumstances.
Let’s review the provided options:
A (Set TimeToLivePropertyPath to null): This setting is only relevant when using per-item TTL based on a document field. It does not affect the container-wide TTL behavior unless such customization is desired, which is not the case here.
B (Set TimeToLivePropertyPath to 0): This also relates to the per-item TTL configuration and doesn’t control the default TTL behavior. Furthermore, setting TTL to 0 would actually cause documents to expire immediately after being written, which contradicts the requirement.
C (Set DefaultTimeToLive to null): While this might appear to disable TTL, the behavior can be ambiguous. In many SDKs and API versions, setting DefaultTimeToLive to null may not fully disable TTL but instead leave it undefined, potentially leading to unexpected behavior depending on the defaults or API version.
D (Set DefaultTimeToLive to -1): This is the explicit and recommended way to disable TTL in Azure Cosmos DB. It clearly tells the service not to delete documents automatically, satisfying the requirement to preserve all data permanently unless explicitly removed.
In summary, when using the Azure Cosmos DB .NET SDK and aiming for non-expiring data, DefaultTimeToLive = -1 is the correct and safest choice.
Question No 4:
You are optimizing a cloud-native application, App1, which uses Azure Cosmos DB with Eventual consistency. Your goals are to maintain high query throughput (without increasing RU consumption) and improve consistency of data reads.
Which consistency level should you choose?
A. Strong
B. Bounded Staleness
C. Session
D. Consistent Prefix
Correct Answer: C
Explanation:
Azure Cosmos DB offers five consistency levels, each representing a trade-off between performance, cost (in Request Units), and consistency guarantees. These levels are:
Strong – Guarantees linearizability; highest consistency, but highest RU consumption and latency.
Bounded Staleness – Ensures reads lag only by a defined time or number of versions; offers near-strong consistency with reduced but still significant latency overhead.
Session – Guarantees read-your-own-writes and monotonic reads within a client session; ideal balance for most applications.
Consistent Prefix – Guarantees order of writes, but not freshness or session-level consistency.
Eventual – No ordering or freshness guarantees; lowest latency and RU cost.
The default for most Cosmos DB SDKs is Session consistency because it strikes a powerful balance: stronger guarantees than Eventual while maintaining high performance and efficiency.
Let’s assess each option in the context of the problem:
A (Strong): While offering the best consistency, Strong is the most expensive in terms of RUs and network latency. It can also reduce throughput, violating the requirement to maximize query performance.
B (Bounded Staleness): Provides predictable consistency delays, which is better than Eventual, but still incurs higher RU costs and lower throughput than Session. Not ideal when RU preservation is key.
C (Session): This is the most suitable option. It delivers read-your-own-writes, a form of stronger consistency than Eventual, while still allowing high throughput and low latency, comparable to Eventual consistency in most scenarios. Since many apps work on a session-by-session basis (e.g., user sessions), this level of consistency is often "good enough" and extremely cost-effective.
D (Consistent Prefix): Guarantees that reads reflect the correct order of writes, but does not ensure users will read their own writes or even see updated data quickly. It's better than Eventual, but lacks session-level guarantees, making it less consistent than required.
Therefore, the Session consistency level is best for the given requirements. It offers a significant consistency improvement over Eventual while preserving RU efficiency and high throughput, meeting both business objectives without trade-offs.
Question No 5:
You are building a cloud-native application that stores JSON-based data using Azure Cosmos DB for NoSQL. To handle fluctuating workloads, you've configured the Cosmos DB container to use Autoscale throughput mode with a maximum limit of 20,000 RU/s (Request Units per second).
At present, the application is idle and is not making any requests (i.e., current usage is 0 RU/s). You need to understand the billing impact for this container configuration when Autoscale is enabled but no traffic is being processed.
When the actual RU/s usage is zero, how many RU/s will you be billed for in Azure Cosmos DB with Autoscale enabled and a maximum throughput of 20,000 RU/s?
A. 0
B. 200
C. 2,000
D. 4,000
E. 10,000
Correct Answer: C. 2,000
Explanation:
Azure Cosmos DB offers an Autoscale throughput mode, where the system dynamically adjusts provisioned RU/s based on actual traffic. While this provides flexibility and scalability, billing still occurs based on a defined range:
Minimum billable throughput in Autoscale mode is 10% of the maximum RU/s configured.
If no RU/s are consumed in an hour, you are still billed for the minimum—10% of the max RU/s.
In this case:
Maximum RU/s = 20,000
Minimum billable RU/s = 10% × 20,000 = 2,000 RU/s
Even if actual usage is 0 RU/s, you will be billed for 2,000 RU/s in that hour.
Autoscale mode ensures performance during usage spikes, but guarantees minimum capacity billing.
Minimum charge = 10% of your configured max RU/s per hour.
Therefore, with 20,000 max RU/s, the minimum billed is 2,000 RU/s, regardless of traffic.
Question No 6:
You are working with an Azure Cosmos DB for NoSQL account containing a database named DB1 and a container named Container1. Your objective is to manage Cosmos DB programmatically using the Azure Cosmos DB SDK (e.g., .NET, Python, Java, Node.js), instead of using the Azure Portal or CLI.
Which of the following operations can you perform using the Azure Cosmos DB SDK?
A. Create a new container inside the existing database DB1
B. List the physical partitions of Container1
C. Read a stored procedure from Container1
D. Create a User Defined Function (UDF) in Container1
Correct Answer: C. Read a stored procedure from Container1
Explanation:
The Azure Cosmos DB SDK allows developers to perform a wide range of operations, including working with server-side scripts such as stored procedures, UDFs, and triggers. These objects are stored inside a container and are accessible through the SDK.
Option C is correct because you can read (retrieve) a stored procedure using SDK methods such as Scripts.ReadStoredProcedureAsync in the .NET SDK or container.scripts.get_stored_procedure in the Python SDK.
This allows you to fetch the script's definition and metadata, or prepare to execute the stored procedure programmatically.
Other Options Explained:
A. Create a new container in DB1:
Creating a new container is possible via the SDK, but not via the container client—you must use the database client object. This makes the option imprecise for this context.
B. List physical partitions of Container1:
This is not supported via the SDK. Cosmos DB hides physical partitioning from users. Instead, developers work with logical partitions, defined by partition keys.
D. Create a UDF in Container1:
While you can execute UDFs in queries through the SDK, creating or uploading new UDFs is not supported via SDK. They are typically created using the Azure Portal or REST API.
Summary:
You can read stored procedures via the SDK, making Option C the valid and supported operation.
Other operations may be possible through different tools (e.g., portal, ARM templates) but not through the SDK in the context described.
Question No 7:
You are creating an Azure Cosmos DB for NoSQL solution to store continuous time-series data from IoT devices. Each device sends new data every second, and all data must be kept indefinitely for future analysis. Each data point includes details such as the deviceId, manufacturer, and multiple sensor readings, along with a timestamp.
Given the need to avoid hot partitions, ensure even data distribution, and stay within partition throughput limits, which partition key strategy should you use?
A. Use deviceManufacturer as the partition key
B. Create a new synthetic key that contains deviceId and timestamp
C. Create a new synthetic key that contains deviceId and deviceManufacturer
D. Use deviceId as the partition key
Correct Answer: B
Explanation:
When designing a Cosmos DB data model for high-throughput time-series data, the selection of an effective partition key is critical. A poor partitioning strategy can result in imbalanced partitions—known as partition skew—where certain partitions receive disproportionately more read and write traffic than others. This can lead to performance bottlenecks, increased latency, or even throttling if a partition exceeds its allocated throughput.
The sample JSON structure includes consistent per-second telemetry data from each IoT device. Each record contains both static fields (like deviceId and deviceManufacturer) and dynamic fields (like timestamp and sensor values). Writes are frequent and continuous, making it imperative to avoid hot partitions—where a single logical partition becomes overwhelmed with requests due to excessive writes to the same partition key.
Let’s break down the provided options:
A (deviceManufacturer):
Using the device manufacturer as the partition key is not advisable. Manufacturers are usually few in number, and most devices in production will be associated with a limited number of companies. This leads to very low cardinality, meaning the same partition key would be shared by a large number of data points across devices. Consequently, this results in hot partitions and severely limits the scalability of the solution.
D (deviceId):
At first glance, deviceId seems reasonable due to its high cardinality—each device has a unique identifier. However, since each device writes data every second, this means each device consistently writes to the same logical partition. In large deployments with thousands of devices, the overall data will be spread out. But in smaller-scale deployments or with devices that generate massive write traffic, individual partitions can still get overloaded. Therefore, while better than A, this doesn’t fully mitigate hot partition risks.
C (deviceId + deviceManufacturer):
This synthetic key slightly improves on option D by adding a second identifier. However, deviceManufacturer typically adds little to the uniqueness of the key, as it is not sufficiently granular. The result is still a scenario where each device’s data ends up going to the same logical partition every time, maintaining the hot partition problem. The manufacturer component does not contribute meaningfully to dispersing write load.
B (deviceId + timestamp):
This synthetic partition key is the most effective strategy. By combining a unique identifier (deviceId) with a frequently changing field (timestamp), the key becomes dynamic for every data point. Because timestamp changes every second, this results in high cardinality and an almost unique key per document. This ensures that writes are distributed across many different logical partitions, greatly reducing the chance of hot partitions and maximizing parallelism in data ingestion.
It’s important to note that while this approach increases partition key cardinality, Cosmos DB is designed to handle high-cardinality keys efficiently. It improves scalability and avoids bottlenecks, particularly in real-time, high-ingestion scenarios like IoT telemetry.
One trade-off is that querying for all records for a single device becomes more complex since they are now spread across multiple partitions. However, this can be mitigated with query optimization and the use of effective indexing.
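One way to build such a key is sketched below. The separator and the time granularity are design choices, not mandated by Cosmos DB: a raw per-second timestamp gives the near-unique keys described above (maximum write dispersion), while the hourly bucket used by default here trades a little dispersion for simpler per-device reads:

```python
from datetime import datetime, timezone

def synthetic_partition_key(device_id: str, ts: datetime,
                            bucket: str = "%Y-%m-%dT%H") -> str:
    """Combine deviceId with a timestamp-derived component to spread writes."""
    return f"{device_id}-{ts.strftime(bucket)}"

ts = datetime(2024, 5, 1, 13, 42, 7, tzinfo=timezone.utc)
print(synthetic_partition_key("device42", ts))                       # hourly bucket
print(synthetic_partition_key("device42", ts, "%Y-%m-%dT%H:%M:%S"))  # per-second key
```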
In conclusion, B is the correct answer because it ensures even write distribution, avoids hot partitions, and minimizes throughput constraints—critical requirements in time-series IoT applications.
Question No 8:
Why is it important to define partition keys correctly when designing containers in Azure Cosmos DB?
A) To reduce the cost of querying the database.
B) To ensure that each item is stored in a unique container.
C) To maximize throughput and efficiently distribute data across partitions.
D) To allow items to be written without indexing.
Correct Answer: C
Explanation:
Partition keys in Azure Cosmos DB play a fundamental role in how data is distributed and accessed across physical partitions. Selecting the right partition key is critical to optimize performance, scalability, and cost-efficiency. A well-chosen partition key leads to even data distribution and helps avoid hot partitions—where a disproportionate amount of data or operations are routed to a single physical partition, causing performance bottlenecks. Additionally, Cosmos DB assigns throughput (RU/s) at the partition level, so even partitioning supports better throughput management. Poor selection can lead to skewed access patterns and degraded application performance.
Question No 9:
Which of the following consistency levels in Azure Cosmos DB provides the lowest latency at the cost of potential data anomalies?
A) Bounded staleness consistency
B) Eventual consistency
C) Session consistency
D) Strong consistency
Correct Answer: B
Explanation:
Eventual consistency is the least strict consistency level offered by Azure Cosmos DB, which prioritizes low latency and high availability over data consistency. Under eventual consistency, data updates may not be immediately visible to all clients; however, given enough time and absence of further writes, all replicas will eventually converge to the latest version. While this can lead to temporary data anomalies, it's an acceptable trade-off in applications that prioritize performance and responsiveness over absolute accuracy in real-time—like social media feeds or product catalogs. In contrast, stronger consistency levels, like strong or bounded staleness, ensure more predictable reads but with higher latency.
Question No 10:
What is the primary use of Change Feed in Azure Cosmos DB?
A) To create snapshots of the data for backup purposes.
B) To monitor the indexing policy and storage usage.
C) To track changes to items in real-time for downstream processing.
D) To automatically partition large datasets.
Correct Answer: C
Explanation:
The Change Feed feature in Azure Cosmos DB provides a real-time, ordered log of changes (inserts and updates) that occur in a container. This feed can be consumed to trigger downstream processes, such as updating search indexes, sending notifications, or replicating changes to other systems. It is especially useful in event-driven architectures, enabling near real-time reactive processing of data. Unlike a traditional database trigger or polling mechanism, Change Feed offers scalability, reliability, and minimal latency without impacting the performance of transactional operations.
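The consumption model can be sketched with a conceptual simulation. This is not the SDK's API, only an illustration of the pattern: an ordered log of inserts and updates, each handed once to a downstream processor (here, a toy search index):

```python
# Invented sample changes, in the order the feed would deliver them
feed = [
    {"op": "insert", "doc": {"id": "p1", "price": 10}},
    {"op": "insert", "doc": {"id": "p2", "price": 5}},
    {"op": "update", "doc": {"id": "p1", "price": 12}},
]

search_index = {}

def process(change):
    # Project every change into the downstream store; later changes win
    search_index[change["doc"]["id"]] = change["doc"]

for change in feed:  # the feed preserves per-item order
    process(change)

print(search_index["p1"]["price"])  # the update supersedes the earlier insert
```

Because the feed is ordered per item, the downstream index always converges to the container's latest state without polling the container itself.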