Microsoft AI-900 Exam Dumps & Practice Test Questions

Question No 1:

You are working on a scientific research project aimed at analyzing environmental data to understand climate change. One aspect of this research involves predicting the sea level rise (in meters) over the next 10 years using historical data. The data set includes past sea levels, temperatures, ice melting rates, and other relevant environmental factors.

Given that the output is a continuous numerical value, which type of machine learning model is most suitable for this problem?

A. Classification
B. Regression
C. Clustering

Correct Answer: B. Regression

Explanation:

In this scenario, the goal is to predict a continuous numerical value—the rise in sea level over the next decade. When dealing with a machine learning problem where the output is a continuous number (like sea level height, temperature, or stock prices), the appropriate model is regression.

Regression is a form of supervised learning in which the model is trained on historical, labeled data (such as past sea levels, temperatures, and ice melt rates). The goal of the model is to predict a continuous output (e.g., the sea level in the future) based on input features. Regression models are designed to map the relationship between input variables and a continuous output, making them ideal for tasks like this one, where we need to predict numerical values.

On the other hand, classification is used when the output is categorical, meaning the model predicts discrete labels. For example, predicting whether an email is spam or not spam is a classification problem. Since the task at hand involves predicting a specific numerical value (sea level in meters), classification would not be applicable.

Clustering is another machine learning technique, but it is an unsupervised learning method. It groups similar data points together based on their features, without using labeled data. Clustering is useful for discovering hidden patterns or segmenting data but does not work well for making specific predictions like forecasting future sea levels.

To summarize, since this task requires predicting a continuous numerical value, regression is the most appropriate machine learning approach. Regression is widely used in scientific prediction, including climate modeling, where it produces forecasts from historical data. How accurate those forecasts are depends on the quality of the data and on how well the model fits it.
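
As a rough illustration of the regression idea, the sketch below fits a straight line to a handful of made-up (year, sea level rise) pairs using ordinary least squares and then extrapolates one value. The data and the helper name `fit_line` are invented for this example; a real study would use many features and far more data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years since a baseline and sea level rise in meters.
years = [0, 5, 10, 15, 20]
rise_m = [0.00, 0.02, 0.04, 0.06, 0.08]

slope, intercept = fit_line(years, rise_m)

# The prediction is a continuous number, not a category.
predicted = slope * 30 + intercept
print(f"Predicted rise after 30 years: {predicted:.3f} m")
```

The key point is the output type: the model returns a number on a continuous scale, which is what separates regression from classification.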

Question No 2:

You are tasked with developing a machine learning model to predict the likelihood of a customer purchasing a product based on their browsing behavior on an e-commerce website. The dataset consists of customer activity logs, including page views, product searches, and time spent on different sections of the website.

Which of the following machine learning techniques would be the most suitable for this predictive task?

A. Supervised Learning - Classification
B. Unsupervised Learning - Clustering
C. Supervised Learning - Regression
D. Reinforcement Learning

Correct Answer: A. Supervised Learning - Classification

Explanation:

In this scenario, the goal is to predict whether a customer will purchase a product based on their browsing behavior. This is a supervised learning task because the model is trained on labeled data. The labels indicate whether a customer made a purchase (a binary outcome: "purchase" or "no purchase"). Since the outcome is categorical, this is a classification problem. Classification models can also report a probability for each class, which covers the "likelihood" wording in the question.

  • Supervised Learning - Classification is the most appropriate technique here because the task involves learning from labeled data where each customer's browsing behavior is mapped to a category (purchased or not purchased).

  • Unsupervised Learning - Clustering is not suitable in this case because clustering is used to group similar data points without pre-defined labels. You already have labeled data with known outcomes, so clustering is unnecessary.

  • Supervised Learning - Regression would be used if the goal were to predict a continuous value, such as predicting the total amount spent by a customer. Since you are predicting a categorical outcome (purchase or not), regression is not appropriate.

  • Reinforcement Learning is used in scenarios where an agent learns through interactions with an environment, such as in game-playing or robotics, and is not applicable here since you're working with a predictive model based on historical data.

In summary, since you are predicting a categorical outcome (purchase or not), supervised learning with classification is the best approach.
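
To make the idea of learning from labeled examples concrete, here is a minimal sketch of a classifier. It uses k-nearest neighbors, chosen only because it is short; the question does not prescribe any particular algorithm. The session data, feature choice, and the function name `knn_predict` are all hypothetical.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    nearest = sorted(train, key=lambda row: sq_dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled sessions: (page_views, minutes_on_site) -> label
sessions = [
    ((2, 1.0), "no purchase"),
    ((3, 2.0), "no purchase"),
    ((4, 1.5), "no purchase"),
    ((12, 9.0), "purchase"),
    ((15, 11.0), "purchase"),
    ((10, 8.5), "purchase"),
]

busy_visitor = knn_predict(sessions, (11, 9.5))
brief_visitor = knn_predict(sessions, (3, 1.2))
print(busy_visitor, "/", brief_visitor)
```

Whatever the algorithm, the output is one of a fixed set of labels, which is what makes this a classification task.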

Question No 3:

You are working with a large retail dataset containing customer information, such as their demographics, past purchases, and browsing behavior on the company website. You want to predict how much a customer is likely to spend in the next month based on these features. 

Which machine learning model is best suited for predicting the total amount a customer will spend (a continuous value)?

A. Linear Regression
B. Decision Tree Classification
C. K-Means Clustering
D. Logistic Regression

Correct Answer: A. Linear Regression

Explanation:

In this scenario, the goal is to predict the total amount a customer will spend in the next month, which is a continuous numerical value. To solve this problem, the model should be able to predict numeric values rather than categories or clusters.

  • Linear Regression is the most appropriate model for this problem. It is a type of supervised learning model used for predicting continuous variables. In linear regression, the model learns the relationship between the input features (such as customer demographics, purchase history, etc.) and the continuous output (the amount a customer will spend). Linear regression works well when the relationship between the input variables and the output is expected to be linear, meaning that a change in the input features has a proportional effect on the prediction.

  • Decision Tree Classification is typically used for classification tasks, where the goal is to predict categorical labels (e.g., "yes" or "no", or "high", "medium", "low"). Since the output in this case is a continuous value (total amount spent), a classification model like decision tree classification is not suitable.

  • K-Means Clustering is an unsupervised learning technique used to group similar data points into clusters based on feature similarity. Clustering does not make predictions about continuous values, but instead organizes the data into groups. This method would not be helpful in predicting the total amount a customer will spend.

  • Logistic Regression is typically used for binary classification tasks (e.g., predicting whether an event will happen or not), not for predicting continuous outcomes. In this case, since you are predicting a continuous value, logistic regression is not appropriate.

In summary, linear regression is the best model for predicting continuous outcomes like the amount a customer will spend, as it is specifically designed for this kind of task.
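
The difference between options A and D can be sketched in a few lines. Both models compute a weighted sum of the features; logistic regression then squeezes that sum into the range 0 to 1, so it yields a class probability rather than an amount. The coefficients and feature meanings below are hypothetical and were not learned from real data.

```python
import math


def linear_output(weights, bias, features):
    """Linear regression prediction: an unbounded weighted sum."""
    return bias + sum(w * x for w, x in zip(weights, features))


def logistic_output(weights, bias, features):
    """Logistic regression: the same weighted sum passed through a sigmoid."""
    z = linear_output(weights, bias, features)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical customer: [past purchases, minutes spent browsing]
customer = [3, 40]

# Hypothetical coefficients for each model.
spend = linear_output([12.0, 0.5], 20.0, customer)
probability = logistic_output([0.4, 0.01], -1.5, customer)

print(f"Predicted spend: {spend:.2f}")
print(f"Probability of some event: {probability:.2f}")
```

Only the first value is on the scale the question asks for, an amount of money, which is why linear regression fits and logistic regression does not.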

Question No 4:

While reviewing machine learning basics, you're asked to determine whether each of the following statements is True (Yes) or False (No) based on best practices for developing and evaluating models.

  1. Labeling refers to tagging the training data with the correct output values.

  2. It is recommended to evaluate a model using the same data that was used to train it.

  3. Accuracy is always the most important metric for assessing a model's performance.

Instructions: Select Yes if the statement is true, otherwise select No.

Correct Answers:

  1. Yes

  2. No

  3. No

Explanation:

Understanding the basics of machine learning is essential for creating efficient and accurate models. Let's break down the truth behind each statement:

  1. Labeling refers to tagging the training data with the correct output values — Yes
    This statement is accurate. Labeling is a fundamental step in supervised learning, where each input in the training data is paired with a corresponding output, also known as the label. This allows the model to learn from the examples and make predictions. For instance, in image classification, labels might indicate whether a picture contains a dog or a cat. Proper labeling is critical to the model's ability to learn patterns and make accurate predictions on unseen data.

  2. It is recommended to evaluate a model using the same data that was used to train it — No
    This statement is incorrect. Evaluating a model on the same data it was trained on gives an overly optimistic result and hides overfitting. Overfitting occurs when the model becomes too tailored to the training data, performing well on it but poorly on new, unseen data. To detect overfitting and accurately assess a model's generalization capability, it's crucial to split the dataset into training and test sets. The test set, which the model has not seen during training, provides a more realistic measure of how the model will perform in real-world scenarios.

  3. Accuracy is always the most important metric for assessing a model's performance — No
    This is not always true. While accuracy (the proportion of correct predictions) is commonly used, it is not always the best metric, especially in cases of class imbalance. For example, in a fraud detection system, where fraudulent transactions make up only a small percentage of total transactions, accuracy can be misleading. A model that predicts "no fraud" for every transaction will still achieve high accuracy but will not detect a single fraud case. In such cases, precision, recall, F1-score, and ROC-AUC are often more informative and better suited for evaluating model performance.
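
The fraud example from statement 3 can be checked with simple arithmetic. The sketch below assumes a hypothetical test set of 1,000 transactions with 10 fraud cases and a model that always predicts "no fraud"; the function names are illustrative.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fn):
    # Share of actual positives that the model found.
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical test set: 10 fraudulent (1) and 990 legitimate (0) transactions.
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000  # a model that always answers "no fraud"

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
rec = recall(tp, fn)
print(f"accuracy={acc:.2f} recall={rec:.2f}")
```

Accuracy looks excellent while recall shows that no fraud was caught, which is why a single metric should not be trusted on imbalanced data.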

Question No 5:

You are developing an application that needs to extract structured data from scanned documents, such as receipts and forms. The data includes both plain text and key-value pairs (e.g., "Name: John Doe") as well as tabular information (e.g., product lists with prices). 

Which Azure AI service would be the most appropriate for this task?

A. Form Recognizer
B. Text Analytics
C. Language Understanding (LUIS)
D. Custom Vision

Correct Answer: A. Form Recognizer

Explanation:

For extracting structured data, including text, key-value pairs, and tabular information, Azure Form Recognizer is the ideal choice.

Azure Form Recognizer (since renamed Azure AI Document Intelligence) is a service designed to extract structured data from documents such as receipts, forms, and invoices. It uses machine learning models to understand the layout and context of a document, extracting both the text and its structure. For example, it can identify the total cost in a receipt or extract product names, prices, and quantities from a table in an invoice.

Here’s why Option A: Form Recognizer is the best solution:

  • Key-Value Pair Extraction: Form Recognizer can identify and extract key-value pairs, such as "Name: John Doe" or "Total: $50.00", which are essential for processing forms and receipts.

  • Table Extraction: It can intelligently detect tables and their contents, making it particularly useful for documents with multiple columns, rows, or financial data.

  • Flexibility: It supports both pre-built models for common document types and custom models that can be trained to handle specific document formats or layouts.

Now let’s review why the other options are not suitable:

  • Option B (Text Analytics): This service focuses on analyzing unstructured text for tasks such as sentiment analysis, entity recognition, and language detection. It doesn't extract structured data from documents like Form Recognizer.

  • Option C (Language Understanding - LUIS): LUIS is designed for natural language understanding in conversational applications like chatbots. It is not suited for structured data extraction from scanned documents.

  • Option D (Custom Vision): Custom Vision is used for image classification and object detection, not for extracting text or data from documents. It’s focused on visual analysis rather than text extraction.

Thus, Option A: Form Recognizer is the best choice for extracting structured data from scanned documents in a variety of formats.

Question No 6:

You are building an application that processes receipts to extract structured data, such as item names, subtotals, taxes, and total amounts, from scanned or photographed receipts. 

Which Azure Cognitive Service is most suited for this purpose?

A. Custom Vision
B. Form Recognizer
C. Ink Recognizer
D. Text Analytics

Correct Answer: B. Form Recognizer

Explanation:

The most suitable service for extracting structured data from receipts is Azure Form Recognizer, which is specifically designed for this kind of task.

Form Recognizer uses advanced machine learning models to recognize and extract key details from scanned or photographed documents. It can process receipts, invoices, business cards, and forms, extracting important information such as:

  • Merchant Name

  • Transaction Date

  • Line Items (product names, quantities, prices)

  • Subtotals, Taxes, and Totals

Here’s why Option B: Form Recognizer is the correct choice:

  • Accuracy: Form Recognizer is designed to handle semi-structured and structured documents like receipts, where the information follows a specific format but can vary slightly in layout.

  • Automated Data Extraction: With Form Recognizer, you can submit images or PDFs of receipts, and it will return structured data in JSON format, making it ideal for automating tasks like expense tracking or financial auditing.

  • Customization: Form Recognizer offers both pre-trained models for common document types and custom models that can be trained to handle specialized receipt formats, enhancing its flexibility.

Let’s look at why the other options are not appropriate for this task:

  • Option A (Custom Vision): Custom Vision is great for image classification and object detection, but it does not extract structured data or interpret document contents.

  • Option C (Ink Recognizer): Ink Recognizer is designed to recognize digital ink strokes, such as handwriting and shapes drawn with a pen or stylus. It is not suitable for processing scanned or photographed receipts or other structured documents.

  • Option D (Text Analytics): Text Analytics is useful for analyzing sentiment, key phrases, and entities in natural language, but it doesn’t work well for structured data extraction from documents.

In conclusion, Option B: Form Recognizer is the most effective tool for extracting structured data from receipts, providing both high accuracy and flexibility in handling various document layouts.
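
Once receipt data arrives as JSON, an application can work with it like any other structured record. The sketch below parses a small, hand-written JSON document and runs a basic consistency check of the kind an expense-tracking tool might use. The field names and values are invented for illustration and are not the service's actual response schema.

```python
import json

# Hypothetical, simplified result. NOT the real response schema of the service.
raw_result = """
{
  "merchant_name": "Contoso Cafe",
  "transaction_date": "2024-03-01",
  "items": [
    {"name": "Coffee", "quantity": 2, "price": 3.50},
    {"name": "Bagel", "quantity": 1, "price": 2.25}
  ],
  "subtotal": 9.25,
  "tax": 0.74,
  "total": 9.99
}
"""

receipt = json.loads(raw_result)

# Recompute the subtotal from the line items and compare with the stated values.
items_sum = sum(item["quantity"] * item["price"] for item in receipt["items"])
subtotal_ok = abs(items_sum - receipt["subtotal"]) < 0.005
total_ok = abs(receipt["subtotal"] + receipt["tax"] - receipt["total"]) < 0.005

print(receipt["merchant_name"], subtotal_ok, total_ok)
```

A real integration would read these values from the fields the service actually returns, so consult the service documentation for the exact structure.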

Question No 7:

You’ve developed an inference pipeline using Azure Machine Learning Designer and deployed it. Now, you wish to access this pipeline as a web service from an external application to get real-time predictions.

Which two parameters do you need in order to properly call and authenticate the web service?

A. The machine learning model's name
B. The training pipeline's endpoint
C. The API (authentication) key
D. The REST endpoint URL of the web service

Correct Answer: C and D

Explanation:

When you deploy an inference pipeline using Azure Machine Learning, it is exposed as a web service that can be accessed from external applications for real-time predictions. To interact with this web service securely, two crucial elements are required:

  1. The REST Endpoint URL (Option D) is the URL through which the deployed pipeline can be accessed. This URL serves as the address for external applications to send requests to get predictions.

  2. The API Key (Option C) is necessary to authenticate the web service request. This key ensures that only authorized applications or users can access the service. Without the API key, the request to the web service will be rejected.

The Model Name (Option A) is not required to invoke the web service. The inference pipeline already encapsulates the model within it, so specifying the model’s name is redundant.

The Training Endpoint (Option B) is used to connect to a model training pipeline, not an inference pipeline. Therefore, it’s irrelevant for calling a deployed inference service.

In summary, to access and authenticate the web service for real-time predictions, you need the REST endpoint URL and the API key to ensure secure communication.
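
The sketch below shows where the two values go when an external application calls the service: the endpoint URL is the request target, and the key travels in the `Authorization` header as a bearer token. The URL, key, and payload shape here are placeholders and assumptions; copy the real values and the expected input format from your deployed endpoint's details. The request is only built, not sent.

```python
import json
import urllib.request


def build_scoring_request(endpoint_url, api_key, payload):
    """Build (but do not send) a POST request for a key-authenticated endpoint."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(
        endpoint_url, data=body, headers=headers, method="POST"
    )


# Placeholder values for illustration only.
request = build_scoring_request(
    "https://example.invalid/score",          # REST endpoint URL (option D)
    "<your-api-key>",                          # API key (option C)
    {"Inputs": {"input1": [{"orders": 42}]}},  # hypothetical payload shape
)

# To actually call the service you would run:
#     with urllib.request.urlopen(request) as response:
#         result = json.loads(response.read())
print(request.get_method(), request.full_url)
```

Neither the model name nor any training endpoint appears anywhere in the request, which matches the reasoning for ruling out options A and B.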

Question No 8:

After building a real-time inference pipeline using Azure Machine Learning Designer, you now wish to deploy it as a web service so that external users or applications can access it in real-time.

Which deployment target should be chosen for deploying this real-time inference pipeline?

Options:

A. Local web service
B. Azure Container Instances
C. Azure Kubernetes Service (AKS)
D. Azure Machine Learning compute

Correct Answer: C. Azure Kubernetes Service (AKS)

Explanation:

When deploying a real-time inference pipeline for external use, the deployment target plays a crucial role in ensuring the service's scalability, reliability, and ability to respond to requests with low latency.

  1. Azure Kubernetes Service (AKS) is the optimal choice for deploying real-time inference pipelines in a production environment. AKS supports auto-scaling, high availability, and low-latency predictions, making it ideal for high-scale, production-grade deployments. It can handle large numbers of inference requests efficiently and ensures reliable service availability.

  2. Azure Container Instances (ACI) is suitable for short-term or testing environments, as it supports lightweight services. However, ACI lacks the scalability and production-grade features required for handling real-time inference at scale.

  3. Azure Machine Learning Compute is typically used for training machine learning models, not for deploying inference pipelines. It is optimized for batch processing workloads and does not support the low-latency demands of real-time inference.

  4. Local Web Service is generally used for testing purposes during development and is not suitable for production or real-time scenarios. It lacks the scalability and features needed for a robust, production-ready service.

In summary, Azure Kubernetes Service (AKS) is the most appropriate deployment target for exposing a real-time inference pipeline, offering high scalability and low-latency response times necessary for production environments.

Question No 9:

A logistics company wants to predict the number of overtime hours a delivery driver will work based on the number of delivery orders received on a given day. 

Which machine learning approach is best suited for this scenario?

A. Classification
B. Clustering
C. Regression

Correct Answer: C. Regression

Explanation:

This scenario involves predicting a continuous numerical value — the number of overtime hours a driver will work, based on the number of delivery orders. Therefore, this is a classic application of regression in machine learning.

  • Regression models are used to predict continuous or numerical values. In this case, the goal is to predict overtime hours, a numerical value, based on input data (the number of delivery orders). The machine learning model will be trained using historical data containing both the input (number of orders) and the output (overtime hours), and once trained, it will be able to predict overtime hours for new, unseen data.

  • Classification (Option A) is used when the output variable is categorical, such as classifying whether an email is spam or not. Since the output in this case is a numerical value (overtime hours), classification is not applicable.

  • Clustering (Option B) is an unsupervised learning technique used to group similar data points together based on features, without predefined labels. It is not used for predicting specific values like overtime hours.

In conclusion, regression is the correct approach here since the task is to predict a continuous variable, which is the number of overtime hours based on delivery orders.

Question No 10:

You are evaluating the capabilities of Azure Machine Learning Designer. Review the following statements regarding its features. For each statement, determine if it is True or False.

  • Statement 1: Azure Machine Learning Designer allows users to create machine learning models using a no-code or low-code approach.

  • Statement 2: Azure Machine Learning Designer supports saving pipelines as drafts, enabling users to return and continue their work later.

  • Statement 3: Azure Machine Learning Designer supports running custom JavaScript code for model development.

Correct Answers: True, True, False

Explanation:

  1. Azure Machine Learning Designer provides a no-code or low-code environment, allowing users to build machine learning models without needing to write extensive code. The drag-and-drop interface simplifies the process of constructing machine learning workflows, making Statement 1 True.

  2. Azure Machine Learning Designer also supports saving pipelines as drafts, which means users can save their work at any stage of the process, return later, and continue developing the pipeline. This capability facilitates iterative development, making Statement 2 True.

  3. However, Azure Machine Learning Designer does not support running custom JavaScript code. It is designed to work with Python and R scripts, as these are the primary languages used in the data science and machine learning communities. JavaScript is not supported in this environment, so Statement 3 is False.

In summary, Azure Machine Learning Designer enables a simplified workflow for machine learning model creation with no-code or low-code features and supports draft-saving. However, it does not support JavaScript integration.