Cloud Pass

Google Professional Machine Learning Engineer

Practice Test #2

Simulate the real exam experience with 50 questions and a 120-minute time limit. Practice with AI-verified answers and detailed explanations.

50 Questions · 120 Minutes · 700/1000 Passing Score


Practice Questions

Question 1

Your team is preparing to train a fraud detection model using data in BigQuery that includes several fields containing PII (for example, card_number, customer_email, and phone_number). The dataset has approximately 250 million rows and every column is required as a feature. Security requires that you reduce the sensitivity of PII before training while preserving each column’s format and length so downstream SQL joins and validations continue to work. The transformation must be deterministic so the same input always maps to the same protected value, and authorized teams must be able to decrypt values for audits. How should you proceed?

Randomizing sensitive values is not deterministic unless you maintain a mapping table, and it is typically not reversible for audits. It also risks breaking referential integrity and downstream joins/validations because randomized outputs may not preserve the original format/length constraints (e.g., card number patterns). While Dataflow can scale to 250M rows, this approach does not meet the deterministic and decryptable requirements.

Cloud DLP can both identify PII and apply de-identification at scale. Using DLP Format-Preserving Encryption (FPE) preserves the original data’s format and length, enabling downstream SQL joins and validations to continue working. With Cloud KMS protecting the key material, authorized teams can re-identify/decrypt for audits under controlled IAM. Dataflow provides the scalable execution layer for transforming hundreds of millions of BigQuery rows.

AES-256 with a per-row random salt makes the transformation non-deterministic (same input encrypts differently each time), which breaks the requirement for stable mapping needed for joins and consistent feature values. Additionally, standard ciphertext encoding (base64/hex) changes length and character set, violating format/length preservation. Building custom crypto also increases implementation risk compared to DLP’s managed FPE designed for this use case.

Dropping PII columns contradicts the requirement that every column is required as a feature for training. Authorized views can restrict access, but they do not reduce the sensitivity of the data used in training; the model pipeline would still either need the raw PII (violating security requirements) or lose critical features. This option addresses access control, not deterministic, reversible de-identification.

Question Analysis

Core Concept: This question tests privacy-preserving feature engineering for ML using managed de-identification. The key services are Cloud Data Loss Prevention (DLP) for de-identification and Format-Preserving Encryption (FPE), Cloud KMS for key management, and Dataflow for scalable transformation of BigQuery-scale datasets.

Why the Answer is Correct: You must reduce PII sensitivity while (1) preserving each column's format and length, (2) ensuring deterministic mapping (same input always maps to the same output), and (3) enabling authorized re-identification (decryption) for audits. Cloud DLP's FPE is designed exactly for this: it produces ciphertext that matches the original data's character set and length constraints (e.g., credit card-like strings), can be configured deterministically, and supports reversible transformation when paired with appropriate keying material. Using Cloud KMS to protect the wrapping key aligns with enterprise security and auditability requirements. Dataflow provides the throughput needed for ~250M rows and integrates well with BigQuery I/O.

Key Features / Configurations:
- DLP de-identification template using cryptoReplaceFfxFpeConfig (FPE/FFX mode) to preserve format/length.
- Deterministic behavior via consistent keying material and configuration; optionally use a stable surrogate/"tweak" strategy if required by policy.
- Cloud KMS-managed key encryption key (KEK) to protect the DLP crypto key material (enables centralized IAM, rotation, and audit logs).
- Dataflow batch pipeline reading from BigQuery, applying the DLP transform to specific columns, and writing back to BigQuery for training.
- Principle of least privilege: the Dataflow service account needs BigQuery read/write and DLP/KMS permissions; restrict decrypt capability to audit teams.

Common Misconceptions:
- "Randomizing" values removes PII but breaks joins/validations and is not reversible.
- Standard encryption with random salts improves security but defeats determinism and typically changes length/format, breaking downstream SQL expectations.
- Dropping PII columns violates the requirement that every column is needed as a feature.

Exam Tips: When you see requirements for (a) preserving format/length, (b) deterministic tokenization, and (c) reversible access for authorized users, think Cloud DLP FPE + Cloud KMS. For very large BigQuery datasets, pair it with Dataflow for scalable batch processing and use DLP templates for repeatability and governance (aligned with the Google Cloud Architecture Framework's security and operational excellence pillars).
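To make the two key properties concrete, the toy sketch below shows deterministic, format/length-preserving pseudonymization of a digit string using a keyed HMAC in plain Python. This is NOT Cloud DLP's FPE (cryptoReplaceFfxFpeConfig): unlike FPE it is not reversible, so it could not satisfy the audit/decrypt requirement in production, and the key here is a placeholder (real key material would be wrapped by Cloud KMS).

```python
# Illustrative sketch only: demonstrates the *properties* the question asks
# for (deterministic mapping, preserved length, digit-only output) using a
# keyed HMAC as a pseudorandom function. Not reversible, so not a substitute
# for DLP FPE in a real pipeline.
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical; production key material lives behind Cloud KMS

def pseudonymize_digits(value: str, key: bytes = KEY) -> str:
    """Map a digit string to another digit string of the same length."""
    digest = hmac.new(key, value.encode(), hashlib.sha256).digest()
    # Derive one output digit per input position from the keyed digest.
    return "".join(str(digest[i % len(digest)] % 10) for i in range(len(value)))

card = "4111111111111111"
token = pseudonymize_digits(card)
assert len(token) == len(card) and token.isdigit()   # format/length preserved
assert token == pseudonymize_digits(card)            # deterministic: joins stay stable
```

Because the mapping is deterministic, downstream SQL joins on the tokenized column continue to match, which is exactly the behavior DLP's deterministic FPE provides (with reversibility added on top).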

Question 2

You are a data scientist at a national power utility analyzing 850 million smart-meter readings from 3,000 substations collected over 5 years. For exploratory analysis, you must compute descriptive statistics (mean, median, mode) by device and region, perform complex hypothesis tests (e.g., differences between peak and off-peak or seasonal periods, with multiple comparisons), and plot feature variations at hourly and daily granularity over time, while using as much of the telemetry as possible and minimizing computational resources. What should you do?

Not ideal because it calculates descriptive statistics and runs statistical analyses inside a notebook after importing data. With 850 million rows, pulling large volumes into a user-managed notebook is expensive and slow, and may exceed memory/IO limits. Looker Studio is fine for visualization, but the heavy computations should be pushed down to BigQuery to minimize compute and leverage MPP execution.

Incorrect because it relies entirely on a Vertex AI Workbench user-managed notebook for importing and analyzing the full dataset. Notebooks are not designed as a primary engine for scanning and aggregating hundreds of millions of records; this typically requires large VM sizing, long runtimes, and high cost. It also reduces reproducibility and scalability compared to BigQuery-based processing.

Partially correct: BigQuery is the right place for descriptive statistics at scale, and Workbench can run complex hypothesis tests. However, using notebooks to generate all time plots is not the most resource-efficient approach for interactive hourly/daily visual exploration. Looker Studio can query BigQuery directly and offload visualization without keeping notebook compute running.

Correct: BigQuery handles large-scale aggregations and descriptive statistics efficiently, minimizing compute and cost. Looker Studio connects directly to BigQuery for interactive time-series plots at hourly/daily granularity without exporting data. Vertex AI Workbench is then used only for complex hypothesis testing and multiple-comparison procedures, ideally on filtered/aggregated extracts from BigQuery, balancing fidelity with resource efficiency.

Question Analysis

Core Concept: This question tests choosing the right tools for large-scale exploratory data analysis (EDA) on Google Cloud: push aggregation and filtering down to BigQuery (serverless MPP analytics), use a BI tool for interactive visualization, and reserve notebooks for advanced statistics that are not easily expressed in SQL.

Why the Answer is Correct: With 850 million time-series readings, importing the full dataset into a notebook is inefficient and often infeasible due to memory/IO limits and high compute cost. BigQuery is designed to scan and aggregate massive datasets efficiently and can compute descriptive statistics by device/region (mean, approximate quantiles for median, counts for mode) using SQL at scale. For plotting hourly/daily variations over time, Looker Studio (formerly Data Studio) can query BigQuery directly, enabling interactive dashboards without exporting data or running a notebook continuously. Complex hypothesis tests with multiple comparisons (e.g., t-tests/ANOVA variants, nonparametric tests, p-value adjustments) are better handled in Python/R in Vertex AI Workbench; critically, the notebook should query only the necessary slices/aggregates from BigQuery to minimize resources while still using as much telemetry as possible.

Key Features / Best Practices:
- BigQuery: partition by timestamp and cluster by device_id/region to reduce scanned bytes and cost; use approximate quantiles for a scalable median; use materialized views or scheduled queries for repeated rollups.
- Looker Studio: direct BigQuery connector, cached results, parameterized filters for peak/off-peak and seasonal windows.
- Vertex AI Workbench: use the BigQuery client/BigQuery Storage API to pull only required subsets; run statistical libraries (SciPy/Statsmodels) for hypothesis testing and multiple-comparison corrections.

These align with Google Cloud Architecture Framework principles: choose managed services, optimize cost/performance, and separate concerns (analytics vs visualization vs advanced computation).

Common Misconceptions: Options A and B assume notebooks are the primary engine for both aggregation and visualization, but notebooks are not optimized for scanning hundreds of millions of rows and lead to oversized instances and long runtimes. Option C is close, but it misses the most resource-efficient approach for visualization: using Looker Studio directly on BigQuery avoids notebook-based plotting workloads and supports broad stakeholder exploration.

Exam Tips: For very large datasets, default to BigQuery for heavy aggregations and filtering, BI tools for dashboards, and notebooks for specialized analyses. Watch for phrases like "minimize computational resources" and "use as much telemetry as possible"; they usually imply serverless analytics (BigQuery) plus direct-connect visualization rather than exporting data into notebooks.
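The multiple-comparison step described above is typically a few lines of notebook Python over BigQuery aggregates. Below is a minimal, self-contained sketch of the Holm-Bonferroni adjustment (the p-values are invented for illustration; in practice statsmodels' multipletests provides the same logic, tested and vectorized):

```python
# Holm-Bonferroni step-down correction: test p-values in ascending order
# against alpha / (m - rank); once one test fails, all larger p-values fail.

def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    m = len(p_values)
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # step-down: larger p-values cannot be rejected either
    return reject

# Hypothetical p-values from four peak-vs-off-peak tests:
print(holm_bonferroni([0.001, 0.04, 0.03, 0.20]))  # [True, False, False, False]
```

Only the smallest p-value survives the correction here, illustrating why raw per-test significance overstates findings when many seasonal/regional comparisons are run at once.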

Question 3

You are launching a grocery delivery mobile app across 3 cities and will use Google Cloud's Recommendations AI to build, test, and deploy product suggestions. You currently capture about 2.5 million user events per day, maintain a catalog of 120,000 SKUs with accurate price and availability, and your business objective is to raise average order value (AOV) by at least 6% within the next quarter while adhering to best practices. Which approach should you take to develop recommendations that most directly increase revenue under these constraints?

"You Might Also Like" is commonly used for discovery and can improve click-through rate on a home feed, but CTR is not the same as revenue. Home feed users may be in browsing mode, so incremental clicks may not translate into larger baskets or higher AOV. For an explicit AOV +6% goal in a short timeframe, cross-sell at high-intent moments is typically more effective than broad personalization on the home screen.

"Frequently Bought Together" is purpose-built to recommend complementary items that increase basket size (attach rate). Showing these recommendations on product detail and cart pages targets users close to purchase, which most directly impacts AOV and revenue. This aligns with best practices: leverage strong purchase/add-to-cart signals from high event volume and use accurate catalog metadata (price/availability) to avoid recommending out-of-stock or irrelevant items.

This reverses best practice. Recommendations AI Retail requires a product catalog to exist so events can be correctly attributed to items and enriched with metadata. Importing events first can lead to unmatched item IDs, reduced training quality, and delayed time-to-value. The system does not "backfill" missing metadata reliably from events; you should upload/maintain the catalog first (or in parallel) and then stream/import user events.

Creating placeholder SKUs with default categories/prices is explicitly counterproductive for recommendation quality and business outcomes. Recommendations AI relies on accurate item attributes (category, price, availability) to train and to filter/serve relevant results. Placeholders can cause irrelevant or misleading recommendations (e.g., wrong price or out-of-stock), harming user trust and conversion. It also complicates governance and measurement during A/B tests, producing noisy results.

Question Analysis

Core Concept: This question tests how to choose the most appropriate Recommendations AI model type and placement to meet a concrete business KPI (increase AOV/revenue) while following data-quality best practices. Recommendations AI offers different recommendation types optimized for different user intents and surfaces (home feed vs product detail vs cart).

Why the Answer is Correct: To most directly increase revenue/AOV, you want to increase basket size and attach rate (adding complementary items to an order). "Frequently Bought Together" is designed for cross-sell: it recommends items that are commonly purchased in the same transaction. Placing it on product detail pages and especially the cart page targets users at high purchase intent, where incremental add-ons are most likely to convert and increase order value within a quarter. With 2.5M events/day and a well-maintained catalog (120k SKUs with accurate price/availability), you have the key prerequisites to train and serve high-quality recommendations quickly.

Key Features / Best Practices:
- Use the Retail domain in Recommendations AI with a complete, accurate product catalog (including price, availability, categories, and attributes) and high-volume user events (view, add-to-cart, purchase).
- Ensure event logging includes user IDs (or visitor IDs), product IDs, timestamps, and event types; prioritize purchase and add-to-cart signals for revenue impact.
- Run online A/B tests (e.g., via your experimentation framework) comparing placements (PDP vs cart) and measure AOV, conversion rate, and revenue per session, not just CTR.
- Follow the Google Cloud Architecture Framework: align with business goals (AOV), ensure data quality and governance (accurate catalog), and design for reliability/observability (monitor recommendation serving latency and drift).

Common Misconceptions: CTR-optimized placements (home feed) can look successful but may not move revenue. Trying to shortcut catalog quality with placeholders often degrades model performance and violates best practices for Retail recommendations.

Exam Tips: When the KPI is revenue/AOV, prefer cross-sell/upsell recommendation types and high-intent surfaces (PDP/cart). When the KPI is engagement/discovery, home-feed "You Might Also Like" can be appropriate. Always import the catalog first (or keep it current) and avoid synthetic placeholders; Recommendations AI depends heavily on accurate item metadata and availability.
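A quick back-of-envelope check (all numbers below are hypothetical, not from the question) shows why attach rate is the lever that connects "Frequently Bought Together" to a +6% AOV target:

```python
# Hypothetical arithmetic: what attach rate must a cross-sell placement
# achieve for a +6% AOV lift, given an assumed baseline AOV and add-on price?
baseline_aov = 50.00   # assumed current average order value ($)
addon_price = 10.00    # assumed average price of an accepted add-on ($)
target_aov = baseline_aov * 1.06

# Fraction of orders that must accept exactly one add-on to hit the target:
required_attach_rate = (target_aov - baseline_aov) / addon_price
print(f"required attach rate: {required_attach_rate:.0%}")
```

Under these assumed numbers, roughly 30% of orders would need to accept a $10 add-on, which is why the explanation stresses A/B testing AOV and revenue per session directly rather than relying on CTR as a proxy.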

Question 4

A fintech analytics team has migrated 12 time-series forecasting and anomaly-detection models to Google Cloud over the last 90 days and is now standardizing new training on Vertex AI. You must implement a system that automatically tracks model artifacts (datasets, feature snapshots, checkpoints, and model binaries) and end-to-end lineage across pipeline steps for dev, staging, and prod. The solution must be simple to adopt via reusable templates, require minimal custom code, retain lineage for at least 180 days, and scale to future models without re-architecting. What should you do?

Vertex AI Pipelines integrates natively with Vertex ML Metadata to automatically capture lineage: which inputs produced which outputs, per component execution, across pipeline runs. Using the Vertex AI SDK enables reusable pipeline templates and components, minimizing custom code while standardizing dev/stage/prod workflows. This scales well as new models are added because lineage capture is built-in and consistent, and artifact URIs are tracked centrally.

Mixing Vertex AI Pipelines for artifacts and MLflow for lineage adds unnecessary operational complexity and weakens standardization. MLflow tracking is not the native lineage store for Vertex Pipelines, so you would need custom integration to correlate pipeline steps, artifacts, and environments. This violates the “minimal custom code” and “simple to adopt via reusable templates” requirements and increases maintenance burden over time.

Vertex AI Experiments is primarily for experiment/run tracking (parameters, metrics, comparisons) and is not designed to provide full end-to-end lineage across multi-step pipelines and artifact dependencies. While ML Metadata can store lineage, using Experiments for artifacts is a mismatch: artifacts like datasets, checkpoints, and binaries are typically managed via pipeline artifacts and storage, with MLMD capturing their relationships automatically.

Using Cloud Composer to schedule lineage capture via Cloud Run functions is a custom-built metadata system. It requires significant bespoke code to infer lineage, map artifacts to steps, and maintain consistency across dev/stage/prod. This approach is harder to standardize, less reliable for audit-grade provenance, and does not leverage Vertex AI’s native MLMD integration, making it a poor fit for low-code, scalable governance.

Question Analysis

Core Concept: This question tests end-to-end ML governance on Google Cloud: tracking artifacts (datasets, feature snapshots, checkpoints, model binaries) and lineage across pipeline steps and environments using Vertex AI Pipelines and Vertex ML Metadata (MLMD). This aligns with the Google Cloud Architecture Framework pillars of Operational Excellence (repeatable automation), Reliability (consistent provenance), and Security/Compliance (auditability).

Why the Answer is Correct: Vertex AI Pipelines (Kubeflow Pipelines on Vertex) automatically integrates with Vertex ML Metadata to record executions, inputs/outputs, and artifact URIs for each pipeline component. Using the Vertex AI SDK and reusable pipeline templates/components provides a low-code adoption path: teams standardize a pipeline pattern once, and future models inherit artifact and lineage tracking without re-architecting. This directly satisfies the requirements for minimal custom code, reusable templates, and scaling to additional models.

Key Features / How to Implement:
- Define pipelines with the Vertex AI SDK (KFP v2) and standard components (e.g., data extraction, feature generation, training, evaluation, deployment).
- Ensure each step produces typed artifacts (Dataset, Model, Metrics, etc.) and writes outputs to durable storage (typically Cloud Storage); MLMD stores metadata and lineage references to these artifacts.
- Use separate projects or environments (dev/stage/prod) with consistent pipeline templates; lineage is captured per run and can be queried for audits and debugging.
- Retention: MLMD retains lineage/metadata; the 180-day requirement is met by keeping the metadata and the underlying artifact storage (e.g., GCS lifecycle policies for binaries/checkpoints). If organizational policy requires, configure dataset/model artifact retention and access controls.

Common Misconceptions: Some assume MLflow is required for lineage; on Vertex, MLMD is the native lineage system tightly integrated with Pipelines. Others confuse Vertex AI Experiments (run tracking) with full artifact lineage across pipeline steps; Experiments is not a complete lineage solution for multi-step pipelines.

Exam Tips: When you see "end-to-end lineage across pipeline steps" and "simple via templates/minimal custom code," default to Vertex AI Pipelines + Vertex ML Metadata. Prefer native integrations over assembling multiple tools unless requirements explicitly demand third-party portability. Also remember: metadata stores references; artifact retention is handled by the backing storage (often GCS) and lifecycle policies.
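As a conceptual illustration only (plain Python, not the Vertex ML Metadata API), lineage reduces to recording which input artifact URIs produced which output artifact URIs per step execution, and walking those records backwards. MLMD captures exactly these relationships automatically when steps run inside Vertex AI Pipelines; the URIs below are made up.

```python
# Toy lineage store: each Execution records one pipeline-step run with its
# input and output artifact URIs. upstream_of() walks lineage backwards to
# find every artifact that contributed to a given output.
from dataclasses import dataclass, field

@dataclass
class Execution:
    step: str
    inputs: list
    outputs: list

@dataclass
class LineageStore:
    executions: list = field(default_factory=list)

    def record(self, step, inputs, outputs):
        self.executions.append(Execution(step, list(inputs), list(outputs)))

    def upstream_of(self, artifact_uri):
        """All artifacts that (transitively) produced artifact_uri."""
        result, frontier = set(), {artifact_uri}
        while frontier:
            uri = frontier.pop()
            for e in self.executions:
                if uri in e.outputs:
                    new = set(e.inputs) - result
                    result |= new
                    frontier |= new
        return result

store = LineageStore()
store.record("extract", ["bq://raw_table"], ["gs://ds/train.parquet"])
store.record("train", ["gs://ds/train.parquet"], ["gs://models/model.bin"])
print(store.upstream_of("gs://models/model.bin"))
```

The audit question "which dataset produced this model binary?" is answered by exactly this backwards walk, which is what MLMD's lineage queries provide out of the box.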

Question 5

You trained an automated scholarship eligibility classifier for a national education nonprofit using Vertex AI on 1.2 million labeled applications, reaching an offline ROC AUC of 0.95. The review board is concerned that predictions may be biased by applicant demographics (e.g., gender, ZIP-code-derived income bracket, first-generation college status) and asks you to deliver transparent insight into how the model makes decisions for 500 sampled approvals and denials, and to identify any fairness issues across these cohorts. What should you do?

Separating features in Feature Store and retraining without demographics is a remediation attempt, not the requested analysis. It also risks “fairness through unawareness,” which often fails because proxy variables (e.g., ZIP code, school, essay topics) can still encode demographics. Additionally, removing sensitive features can make it harder to measure fairness and audit outcomes by protected class. The board asked for transparent insight and cohort fairness evaluation first.

Vertex AI feature attribution (Explainable AI) provides per-instance explanations for approvals and denials, quantifying how each feature influenced each prediction. You can then aggregate and compare attributions and outcomes across cohorts (gender, income bracket, first-gen) to surface potential bias and proxy-feature reliance. This directly satisfies the requirement for transparency on 500 sampled decisions and supports fairness analysis using group-level comparisons and fairness metrics.

Vertex AI Model Monitoring focuses on operational issues like training-serving skew, feature drift, and prediction drift. While valuable for production reliability, it does not provide decision transparency for specific cases nor does it directly identify fairness issues across demographic cohorts. Retraining with recent data may even perpetuate or amplify bias if the underlying labeling or historical decision process is biased. This option addresses a different problem than requested.

Vector Search nearest-neighbor retrieval can help find similar examples, but it is not a primary tool for explainability or fairness auditing. Similarity search depends on embeddings and distance metrics and may not reveal which features drove the model’s decision or quantify demographic impacts. It can be a supplemental qualitative investigation technique, but it does not provide the systematic, per-feature, cohort-based transparency and bias evidence the board requested.

Question Analysis

Core Concept: This question tests model transparency and fairness evaluation in Vertex AI, specifically using explainability (feature attribution) to understand why individual predictions were made and then analyzing those explanations across demographic cohorts to detect potential bias. This aligns with responsible AI practices in the Google Cloud Architecture Framework (governance, risk management, and trust).

Why the Answer is Correct: The board requests "transparent insight" for 500 sampled approvals/denials and to "identify fairness issues across cohorts." Vertex AI feature attribution (Vertex Explainable AI) provides per-prediction (local) explanations showing how each input feature contributed to a specific decision. By aggregating attributions and outcomes by cohort (e.g., gender, income bracket, first-gen status), you can identify whether sensitive or proxy features disproportionately drive approvals/denials, and whether similarly qualified applicants receive different outcomes across groups: key evidence for a bias investigation.

Key Features / How to Do It: Use Vertex Explainable AI on the deployed model or batch predictions for the 500 sampled cases to obtain attributions (e.g., Integrated Gradients or sampled Shapley, depending on model type). Then slice results by cohort and compare: (1) the distribution of prediction scores, (2) the top contributing features, and (3) fairness metrics such as demographic parity difference, equal opportunity / TPR gaps, and calibration by group (often computed outside Vertex AI using BigQuery/Looker/Python, but driven by the attribution outputs). Also look for proxy variables (ZIP-code-derived income) acting as sensitive-feature surrogates.

Common Misconceptions: High ROC AUC (0.95) does not imply fairness; it can coexist with discriminatory behavior. Monitoring drift/skew is operationally important but does not answer why decisions are made or whether they are biased. Removing demographic features may not remove bias because proxies remain, and it can reduce the ability to measure and mitigate fairness issues.

Exam Tips: When the prompt asks for transparency, interpretability, and cohort-based bias analysis, think "Vertex Explainable AI / feature attribution + slice by groups." When it asks for production data drift or training-serving skew, think "Vertex Model Monitoring." For fairness, prefer measurement and evidence (explanations plus group metrics) before remediation steps like reweighting, constraints, or feature removal.
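The two group fairness metrics named above are simple to compute once predictions are sliced by cohort. A minimal sketch with synthetic data (in practice these would be computed over the 500 sampled Vertex AI predictions):

```python
# Demographic parity difference: gap in approval rates between cohorts.
# Equal-opportunity (TPR) gap: gap in true-positive rates, i.e., among truly
# eligible applicants, how often each cohort is actually approved.

def rate(xs):
    return sum(xs) / len(xs)

def demographic_parity_diff(pred_a, pred_b):
    return rate(pred_a) - rate(pred_b)

def tpr_gap(pred_a, label_a, pred_b, label_b):
    tpr = lambda p, y: rate([pi for pi, yi in zip(p, y) if yi == 1])
    return tpr(pred_a, label_a) - tpr(pred_b, label_b)

# Synthetic cohorts (1 = approved / truly eligible):
pred_a, label_a = [1, 1, 0, 1], [1, 1, 0, 1]
pred_b, label_b = [1, 0, 0, 0], [1, 1, 0, 1]
print(demographic_parity_diff(pred_a, pred_b))       # 0.5
print(tpr_gap(pred_a, label_a, pred_b, label_b))     # ~0.667
```

Here both metrics are large even though each cohort contains identically qualified applicants, the kind of evidence the board could act on; the attribution outputs then explain which features drive the disparity.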


Question 6

You are building a deep neural network classifier for a ride-sharing fraud detection system with 30 million training records. Several categorical features have very high cardinality (driver_id ≈ 320,000 unique values, vehicle_vin ≈ 110,000, pickup_zip ≈ 42,000), and due to a 16 GB GPU memory cap you cannot materialize a full one-hot vocabulary for each column. Which encoding should you use to feed these categorical features into the model so that the representation scales, remains sparse, and does not impose artificial ordinality?

Assigning integer indices and feeding them as continuous numeric inputs is generally incorrect for neural networks because it introduces artificial ordinality and distance (e.g., category 10 appears “closer” to 11 than to 1000). This can mislead the model unless you use an embedding lookup (which the option does not mention). While memory-efficient, it violates the requirement to avoid ordinality.

Feature hashing maps each category into a fixed number of buckets and produces a sparse one-hot (or multi-hot) vector over those buckets. It scales to very high cardinality without storing a full vocabulary, fits within memory constraints, and avoids ordinality because the representation is categorical rather than numeric magnitude. The main trade-off is hash collisions, managed by choosing adequate bucket sizes per feature.

Explicit one-hot encoding with a dimension per unique category is the most direct way to avoid ordinality, but it does not scale here. With hundreds of thousands of categories across multiple columns, the resulting vectors are extremely wide and can exceed GPU memory and increase compute costs. It also requires building and maintaining vocabularies, which is operationally heavy with evolving categories.

Run-length encoding is a compression technique for sequences/strings, not a standard ML encoding for categorical variables. It does not produce a stable numeric feature representation suitable for neural network ingestion, does not address ordinality, and does not provide a sparse categorical indicator structure. It would add complexity without solving the core modeling and scalability requirements.

Question Analysis

Core Concept: This question tests scalable encoding of high-cardinality categorical features for deep neural networks under memory constraints. The key requirement is an encoding that (1) scales to hundreds of thousands of categories, (2) stays sparse, and (3) avoids introducing artificial ordinality.

Why the Answer is Correct: Feature hashing (the hashing trick) maps each categorical value into one of a fixed number of buckets and represents it as a sparse one-hot vector over those buckets. This avoids building and storing an explicit vocabulary (expensive for driver_id/vehicle_vin) and keeps the input representation sparse, which is efficient for both memory and compute. Because the representation is one-hot over hash buckets, it does not impose an ordinal relationship between categories (unlike feeding integer IDs as continuous values). Hashing also works well in streaming/online settings and when new categories appear, since unseen values are still mapped deterministically to buckets.

Key Features / Best Practices:
- Choose bucket sizes per feature based on cardinality and acceptable collision rate (e.g., more buckets for driver_id than pickup_zip).
- Use separate hash spaces per feature to prevent cross-feature collisions.
- Implement via TensorFlow/Keras preprocessing layers (e.g., Hashing + CategoryEncoding) or the equivalent in other frameworks; keep the output as sparse tensors where supported.
- Understand the trade-off: collisions introduce controlled noise, which is often acceptable at scale and can be mitigated with larger bucket counts.

Common Misconceptions: Option A is tempting because integer IDs are compact, but feeding them directly as numeric inputs makes the model treat category "320000" as larger than "2", creating false ordinality and distance. Option C is the classic one-hot approach but becomes infeasible with very large vocabularies and GPU memory limits. Option D is unrelated to ML feature representation and does not create a meaningful numeric encoding.

Exam Tips: When you see "high cardinality + cannot materialize a vocabulary/one-hot + want sparse + no ordinality," think feature hashing (or embeddings if dense representations are acceptable). If the question explicitly asks to remain sparse, hashing-based one-hot is the canonical answer. Also remember the operational benefit: hashing handles unseen categories without retraining a vocabulary, aligning with scalable production ML practices and the Google Cloud Architecture Framework's emphasis on performance efficiency and operational simplicity.
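A minimal plain-Python sketch of the hashing trick for one feature, salting the hash with the feature name so each column gets its own hash space (the function and parameter names here are illustrative; in TensorFlow this would be a Hashing layer followed by CategoryEncoding):

```python
# Feature hashing: map a category to a bucket index deterministically,
# with no stored vocabulary. A stable cryptographic hash is used so the
# mapping is consistent across processes (Python's built-in hash() is not).
import hashlib

def hash_bucket(value: str, feature: str, num_buckets: int) -> int:
    # Salting with the feature name gives driver_id and pickup_zip
    # independent hash spaces, avoiding cross-feature collisions.
    h = hashlib.sha256(f"{feature}:{value}".encode()).hexdigest()
    return int(h, 16) % num_buckets

def sparse_one_hot(value: str, feature: str, num_buckets: int):
    # Sparse representation: a single (index, 1.0) pair instead of a
    # dense vector of length num_buckets.
    return (hash_bucket(value, feature, num_buckets), 1.0)

idx, val = sparse_one_hot("driver_000042", "driver_id", 2**18)
assert 0 <= idx < 2**18 and val == 1.0
# Unseen categories map deterministically, no vocabulary rebuild needed:
assert hash_bucket("driver_000042", "driver_id", 2**18) == idx
```

Note the memory profile: the model only stores weights for num_buckets positions per feature (2^18 ≈ 262k here), independent of how many distinct driver IDs ever appear.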

Question 7

You work for a wind farm operator. You have been asked to develop a model to predict whether a turbine will require unscheduled maintenance on a given day. Your team has processed 18 months of turbine telemetry (~2.4 million rows) and created a table with the following columns:
• Turbine_id
• Site_id
• Date
• Hours_since_last_service (measured in hours)
• Average_vibration_frequency (measured in Hz)
• Temperature_delta_7d (measured in °C)
• Unscheduled_maintenance (binary label indicating whether maintenance occurred on the Date)
Models must be deployed in us-central1, and you need to interpret the model's results for each individual online prediction with per-instance feature contributions. What should you do?

BigQuery ML can train boosted tree models on tabular data, but manually inspecting tree split rules is not the same as obtaining per-instance feature attribution values for each online prediction. The question specifically asks for feature contributions for individual predictions, which requires an explanation mechanism rather than human inspection of model structure. This option also does not describe deploying a model to a Vertex AI endpoint with the explain method, which is the managed pattern that directly satisfies the requirement.

Vertex AI AutoML Tabular supports training classification models on tabular data and deploying them to a Vertex AI endpoint in us-central1. Vertex AI Explainable AI can be enabled to return per-instance feature attributions via the explain method alongside online predictions. This exactly meets the requirement for individual prediction interpretability with feature contributions in the specified region.

Logistic regression coefficients describe the overall relationship between features and the target across the model, so they are a form of global interpretability rather than per-instance explanation. A large coefficient does not directly tell you how much that feature contributed to one specific prediction, especially when feature values differ by example and preprocessing may affect interpretation. This option also does not provide a deployed online endpoint with per-request explanations in us-central1.

L1 regularization is applied during training to encourage sparsity and reduce the influence of less useful features, but it is not something enabled at prediction time to explain a result. Even if a model is trained with L1 regularization, that does not produce per-instance feature attribution values for each online prediction. The option uses Vertex AI deployment language, but it does not include Explainable AI or the explain method, so it does not meet the stated interpretability requirement.

Question Analysis

Core Concept: This question is about choosing a Google Cloud service that supports online prediction in a specific region and returns per-instance feature attributions for each prediction. The clearest managed solution for this requirement is Vertex AI with Explainable AI on a deployed endpoint.

Why the Answer is Correct: Option B uses Vertex AI AutoML Tabular to train a binary classification model on tabular telemetry data, deploys it to a Vertex AI endpoint in us-central1, and enables feature attributions. The explain method on the endpoint returns attribution values for each feature for each individual prediction, which directly satisfies the requirement for per-instance interpretability during online serving.

Key Features: Vertex AI endpoints support low-latency online prediction and can be configured with explanation metadata and parameters. AutoML tabular models integrate with Vertex AI Explainable AI so you can request explanations together with predictions. This is the standard Google Cloud pattern when the exam asks for online predictions plus per-instance feature contributions.

Common Misconceptions: Global model interpretability is not the same as per-instance attribution. Tree structures or linear coefficients may help you understand the model overall, but they do not by themselves provide the required feature contribution values for each individual online prediction. Regularization is a training technique, not an explanation mechanism used at prediction time.

Exam Tips: When a question includes phrases like "online prediction," "endpoint," and "per-instance feature contributions," strongly prefer Vertex AI Prediction with Explainable AI. If the requirement is specifically for explanations on each served prediction, look for an option that explicitly mentions deploying to a Vertex AI endpoint and using explain.
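To show what "per-instance feature contributions" look like in practice, here is a small sketch that ranks features by attribution for a single prediction. The attribution payload below is a simplified, hypothetical stand-in for the feature-attributions map that a Vertex AI endpoint's explain method returns; the endpoint call itself appears only in a comment because it requires a deployed model and credentials.

```python
# Hypothetical per-instance attribution values, shaped loosely like the
# feature-attributions map in a Vertex AI explanation response.
# In real code this would come from something like:
#   endpoint = aiplatform.Endpoint(ENDPOINT_ID)
#   response = endpoint.explain(instances=[instance])
attributions = {
    "hours_since_last_service": 0.42,
    "average_vibration_frequency": 0.31,
    "temperature_delta_7d": -0.08,
    "site_id": 0.02,
}

def top_contributors(attrs: dict, k: int = 3) -> list:
    """Rank features by absolute contribution to this one prediction."""
    return sorted(attrs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:k]

ranked = top_contributors(attributions)
```

The key contrast with global interpretability: these numbers are specific to one served instance, so a different turbine on a different day would yield a different ranking.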

8
Question 8

Your media analytics team is building a Vertex AI Pipelines workflow running on a private GKE cluster in europe-west1, and the first task must run a parameterized BigQuery SQL that filters the last 24 hours of event logs (~30 million rows, ~15 GB scanned) and pass the query output directly as the input artifact to the next task; you want the simplest, lowest-effort approach that integrates cleanly into the pipeline with minimal custom code—what should you do?

Manual execution breaks automation and reproducibility, and it does not integrate cleanly with Vertex AI Pipelines orchestration. It introduces operational risk (missed runs, inconsistent parameters) and does not meet the requirement for a workflow task inside the pipeline. Exams generally penalize manual steps when an automated pipeline solution is requested.

A custom Python container calling the BigQuery API can work, but it is not the simplest approach. You must author code, manage dependencies (google-cloud-bigquery), build and store a container image, handle auth (Workload Identity/service account), and define artifact outputs. This is higher effort and maintenance than using a prebuilt component.

Creating a new component from scratch is even more work than option B. You would need to define component specs (inputs/outputs), containerization, versioning, and testing, duplicating functionality already available in official components. This contradicts the requirement for minimal custom code and lowest effort.

Using the official prebuilt BigQuery Query component from the Kubeflow Pipelines registry is the intended low-code solution. It integrates directly into the pipeline DAG, supports parameterized SQL, and produces standardized outputs (often a destination table reference or exported artifact) that downstream tasks can consume. It minimizes custom code and aligns with best practices for pipeline maintainability.

Question Analysis

Core Concept: This question tests using Vertex AI Pipelines (Kubeflow Pipelines v2) with prebuilt components to orchestrate data extraction from BigQuery with minimal custom code, and passing outputs as pipeline artifacts.

Why the Answer is Correct: Option D is the lowest-effort, cleanest integration: use the official prebuilt BigQuery Query component from the Kubeflow Pipelines registry. These components are designed to be dropped into a pipeline DAG, accept parameterized SQL, execute via BigQuery APIs, and materialize results in a pipeline-friendly way (typically as a BigQuery table reference or exported artifact), which the next step can consume. This aligns with “low-code” pipeline authoring and reduces maintenance burden versus custom containers.

Key Features / Best Practices:
- Prebuilt components provide standardized inputs/outputs, logging, and retries consistent with pipeline execution.
- Parameterization is supported via pipeline parameters (e.g., an execution date) to filter the “last 24 hours.”
- For 30M rows / ~15 GB scanned, BigQuery is appropriate; ensure partitioning (e.g., ingestion-time or event-time) and clustering to minimize bytes scanned and cost.
- Running on a private GKE cluster does not prevent calling BigQuery, because BigQuery is a Google-managed service accessed over Google APIs; use Workload Identity for secure auth. If strict egress controls exist, use Private Google Access / restricted VIPs as applicable.
- Prefer writing results to a temporary/destination table (or an extract to GCS) as the artifact boundary; passing an in-memory result set between tasks is not how KFP artifacts typically work.

Common Misconceptions: People often assume they must write a custom Python step (B/C) to call BigQuery. While workable, it increases code, container build/publish steps, dependency management, and long-term maintenance. Another misconception is that “directly pass query output” means streaming rows between steps; in practice, pipelines pass references (a table URI or GCS path) as artifacts.

Exam Tips: On the Professional ML Engineer exam, when asked for the “simplest/lowest-effort/minimal custom code” option in Vertex AI Pipelines, favor Google-provided or official KFP registry components over bespoke containers. Also remember that BigQuery cost/performance hinges on partition pruning and avoiding full scans; parameterize time filters and set destination tables/expiration for intermediate outputs.
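To illustrate the parameterization point, here is a minimal sketch of how the SQL handed to a prebuilt BigQuery query component might be built from a pipeline parameter. The table and column names (`proj.media.events`, `event_ts`) are hypothetical; the point is that filtering on a partitioned timestamp column bounds the scan to roughly one day of data rather than the full table.

```python
def build_daily_query(table: str, execution_ts: str) -> str:
    """Parameterized SQL for a 'last 24 hours' extraction step.

    `execution_ts` would be supplied as a pipeline parameter
    (e.g., the scheduled run time), so every run is reproducible
    for a given parameter value.
    """
    return f"""
    SELECT *
    FROM `{table}`
    WHERE event_ts >= TIMESTAMP_SUB(TIMESTAMP('{execution_ts}'), INTERVAL 24 HOUR)
      AND event_ts < TIMESTAMP('{execution_ts}')
    """

sql = build_daily_query("proj.media.events", "2025-11-24T00:00:00Z")
```

The component then executes this SQL and exposes the destination table reference as the output artifact for the downstream task.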

9
Question 9

Your team is fine-tuning a multilingual speech-to-text Transformer on Vertex AI using PyTorch DDP with 2 worker pools, each VM having 4x NVIDIA A100 40GB GPUs (total 8 GPUs) and a global batch size of 1024. You plan to use the Reduction Server strategy to accelerate cross-node gradient aggregation and will add a third worker pool dedicated to the reduction service. How should you configure the worker pools and container images for this distributed training job?

Incorrect. While it correctly separates training workers from the reduction service by using a third pool and the reductionserver image, it unnecessarily provisions GPUs for the reduction server pool. The reduction server primarily needs CPU, memory, and fast networking for tensor aggregation/transfer. Adding A100s increases cost significantly with little to no benefit, violating cost-optimization best practices.

Correct. Training runs on the first two GPU worker pools using the training container. The third pool runs the reductionserver container image without accelerators and should use a machine type optimized for CPU and network bandwidth (e.g., 32+ vCPUs and high-throughput networking). This matches the reduction server’s purpose: offloading cross-node gradient aggregation to reduce inter-node GPU communication overhead efficiently.

Incorrect. The scenario explicitly uses PyTorch DDP on A100 GPUs. Switching to TPUs changes the hardware and typically the distributed training stack (e.g., XLA, different parallelism patterns). The reduction server strategy described is intended for GPU-based multi-node training; using TPUs here is not aligned with the stated setup and would require re-architecting the training job.

Incorrect. This doubles down on the TPU mismatch and also incorrectly assigns TPUs to the reduction server pool. The reduction server is not a compute accelerator workload; it is a communication/aggregation service. Provisioning TPUs (or GPUs) for it is wasteful and does not address the real bottleneck, which is network and CPU handling of gradient reduction across nodes.

Question Analysis

Core Concept: This question tests Vertex AI custom training with multi-worker distributed PyTorch (DDP) and the Reduction Server strategy, which offloads cross-node gradient aggregation to dedicated CPU/network resources to reduce GPU-to-GPU communication overhead across VMs.

Why the Answer is Correct: With 2 worker pools of A100 GPUs (8 GPUs total) running the actual training, the reduction server should run as a separate service in its own worker pool. The reduction server’s job is network-heavy and CPU/memory-heavy (handling gradient reduction/aggregation and shuttling tensors), not GPU compute-heavy. Therefore, the third worker pool should not waste expensive accelerators; instead, it should use a high-bandwidth, high-vCPU machine type to maximize throughput and minimize latency for cross-node communication. Option B matches this: GPUs only on the training pools, and the reductionserver container on a non-accelerator pool with strong networking.

Key Features / Best Practices:
- Vertex AI supports multiple worker pools; each pool can have different machine types, accelerators, and container images.
- Reduction Server is designed to improve scaling efficiency when inter-node all-reduce becomes a bottleneck (common with large global batch sizes like 1024 and multi-node GPU training).
- Use GPU-enabled images only where CUDA/NCCL compute is required (the training workers). Use the separate reductionserver image for the reduction pool.
- Choose a machine type for the reduction pool optimized for CPU and network (e.g., 32+ vCPUs) and ensure high-throughput networking; this aligns with the Google Cloud Architecture Framework’s performance optimization and cost optimization pillars (avoid paying for unused GPUs).

Common Misconceptions: A is tempting because “distributed training = GPUs everywhere,” but the reduction server does not need GPUs; adding them increases cost without meaningfully improving reduction throughput. C and D incorrectly switch to TPUs, which is incompatible with the stated PyTorch DDP GPU setup and the reduction server approach described.

Exam Tips:
- When a component is described as “communication/aggregation” rather than “model compute,” assume CPU + network optimization, not accelerators.
- In Vertex AI, remember that each worker pool can run a different container image; training and auxiliary services (parameter servers, reduction servers) are commonly separated.
- Watch for cost/performance tradeoffs: the correct design usually avoids provisioning accelerators for non-training roles.
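A worker-pool layout along the lines of option B can be sketched as the `worker_pool_specs` payload of a Vertex AI CustomJob. The training image URI and exact machine types here are illustrative assumptions; the reduction server image path reflects the documented Vertex AI reductionserver container, but you should confirm the current URI in the Vertex AI docs before use.

```python
# Illustrative three-pool layout for PyTorch DDP + Reduction Server.
TRAIN_IMAGE = "us-docker.pkg.dev/my-project/train/stt-ddp:latest"  # hypothetical
REDUCTION_IMAGE = (
    "us-docker.pkg.dev/vertex-ai-restricted/training/reductionserver:latest"
)

gpu_pool = {
    "machine_spec": {
        "machine_type": "a2-highgpu-4g",        # 4x A100 40GB per VM
        "accelerator_type": "NVIDIA_TESLA_A100",
        "accelerator_count": 4,
    },
    "replica_count": 1,
    "container_spec": {"image_uri": TRAIN_IMAGE},
}

worker_pool_specs = [
    gpu_pool,  # pool 0: chief training worker
    gpu_pool,  # pool 1: additional training worker (8 GPUs total)
    {          # pool 2: reduction server -- CPU/network only, no accelerators
        "machine_spec": {"machine_type": "n1-highcpu-32"},
        "replica_count": 1,
        "container_spec": {"image_uri": REDUCTION_IMAGE},
    },
]
```

The design point is visible in the structure itself: only the two training pools carry `accelerator_type`/`accelerator_count`, while the reduction pool pays for vCPUs and bandwidth, not GPUs.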

10
Question 10

Your media subscription platform retrains a custom churn model every month on 48 GB of CSVs in Cloud Storage (~25 million rows) and then runs a batch job that scores 8.2 million users for the next 30 days; compliance demands auditable end-to-end lineage linking the exact data snapshot, container image digest, trained model version, and each batch prediction output URI, retained for 12 months; you need a repeatable batch process with built-in lineage for both the model and the predictions; what should you do?

This option can train and score data at the required scale, but it does not provide the strongest built-in end-to-end lineage. Moving the data into BigQuery is not necessary for the stated requirement, and custom prediction routines invoked through the SDK are not the same as an orchestrated pipeline that links training and prediction metadata together. You would need additional custom logic to correlate the exact training input snapshot, model version, and prediction output locations for audits. That makes it weaker than a single Vertex AI Pipeline containing both steps.

Vertex AI Experiments is useful for tracking runs and metrics, and Model Registry is useful for managing model versions, but neither service alone provides full workflow orchestration. This option also leaves the training process itself underspecified, so the lineage from the exact data snapshot and training execution to the registered model is incomplete. Although Vertex AI batch prediction can generate outputs at scale, the end-to-end chain from data to model to predictions is not as strongly captured as when both steps are executed inside a pipeline. Therefore, it does not best satisfy the requirement for built-in lineage across the entire monthly process.

A training pipeline is a good start because it can capture lineage for the training portion of the workflow. However, the option explicitly says batch predictions are generated directly in Vertex AI outside the pipeline, which breaks the single orchestrated lineage chain. That means the prediction outputs are not inherently tied to the same pipeline execution context as the training artifacts and model production step. For compliance and auditability, keeping prediction inside the pipeline is the stronger and more complete design.

Vertex AI Pipelines is the best choice because it orchestrates retraining and batch scoring as one repeatable managed workflow. Using a custom training job component creates a tracked training execution that produces a model artifact, and using the model batch predict component keeps the scoring step in the same pipeline context. This gives the strongest built-in lineage among the options, because the model used for prediction and the batch prediction outputs are associated with the same pipeline run metadata. It also aligns well with monthly scheduled execution and compliance-oriented auditability requirements.

Question Analysis

Core Concept: This question tests end-to-end ML orchestration with auditable lineage. On Google Cloud, the most direct way to get repeatable monthly retraining + batch scoring with built-in traceability is Vertex AI Pipelines (Kubeflow Pipelines on managed infrastructure) using pipeline components that automatically record artifacts and metadata in Vertex ML Metadata.

Why the Answer is Correct: Compliance requires lineage that links (1) the exact input data snapshot, (2) the container image digest used for training/scoring, (3) the trained model version, and (4) each batch prediction output URI, retained for 12 months. Vertex AI Pipelines provides run-level provenance: each pipeline run captures inputs/outputs as typed artifacts (e.g., Dataset, Model, Metrics, BatchPredictionJob) and stores their relationships in ML Metadata. Using a custom training job component ensures the training container image (including its digest) and parameters are captured as execution metadata. Using the pipeline batch prediction component creates a Vertex AI BatchPredictionJob whose output GCS URI and model resource name/version are recorded and linked to the upstream model artifact. This produces an auditable chain from data to model to predictions, repeatable every month.

Key Features / Best Practices:
- Use Vertex AI Pipelines with a scheduled trigger (e.g., Cloud Scheduler/Workflows) to run monthly.
- Materialize an immutable data snapshot (e.g., a GCS object generation, date-partitioned path, or BigQuery snapshot table) and pass that URI/version into the pipeline so lineage points to an unchanging input.
- Use Artifact Registry images pinned by digest (not just tags) in the custom training/prediction containers.
- Store outputs in GCS with run-specific prefixes; pipeline metadata will record the exact URIs.
- Retention: keep pipeline metadata and artifacts for 12 months (and apply GCS retention/lock policies if required). This aligns with Google Cloud Architecture Framework guidance on governance, auditability, and operational excellence.

Common Misconceptions: Many candidates assume Model Registry or Experiments alone provides end-to-end lineage. They help with model versioning and experiment tracking, but they don’t automatically link the full chain, including batch prediction outputs and the exact training/prediction executions, unless orchestrated through a pipeline/metadata system.

Exam Tips: When you see “repeatable process,” “monthly retraining,” and “auditable lineage from data to predictions,” think Vertex AI Pipelines + ML Metadata. Prefer pipeline components for training and batch prediction to get automatic artifact tracking and reproducibility. Also watch for wording like “container image digest” and “output URI,” which strongly implies pipeline execution metadata rather than ad-hoc SDK calls.
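The practice of pinning by digest and scoping outputs per run can be sketched as the parameter set handed to each monthly pipeline run. Every name, bucket, and digest below is hypothetical; the point is that each reference is immutable, so what ML Metadata records for a run can still be resolved 12 months later.

```python
from datetime import datetime, timezone

def run_parameters(run_id: str, now: datetime) -> dict:
    """Illustrative pipeline parameters that keep a monthly run auditable.

    - image pinned by digest (not a mutable tag)
    - dated, immutable data snapshot path
    - run-scoped output prefix for batch prediction results
    """
    month = now.strftime("%Y-%m")
    return {
        # A digest-pinned image can be recovered exactly, unlike ':latest'.
        "train_image": "us-docker.pkg.dev/proj/ml/churn-train@sha256:9f2c0ab1",
        "data_snapshot": f"gs://proj-ml-data/churn/snapshots/{month}/",
        "prediction_output": f"gs://proj-ml-preds/churn/{month}/{run_id}/",
    }

params = run_parameters("run-0042", datetime(2025, 11, 1, tzinfo=timezone.utc))
```

Because the pipeline records these values as execution inputs and artifact URIs, an auditor can walk from a specific prediction file back to the snapshot and image that produced it.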

Success Stories (7)

C
C***************Nov 24, 2025

Study period: 1 month

Just want to say a massive thank you to the entire Cloud Pass team for helping me pass my exam first time. I won't lie, it wasn't easy, especially the way the real exam is worded, but the way the practice questions teach you why your option was wrong really helps to frame your mind and understand what the question is asking for and which solutions you should be focusing on. Thanks once again.

F
f****Nov 23, 2025

Study period: 1 month

Good question banks and explanations that helped me practise for and pass the exam.

민
민**Nov 12, 2025

Study period: 1 month

After the lectures I went straight into the practice questions and scored around 80%, and I passed the exam with a high score. The app served me well.

S
S************Nov 11, 2025

Study period: 1 month

Good mix of theory and practical scenarios

A
A***********Nov 6, 2025

Study period: 1 month

I used the app mainly to review the fundamentals—data preparation, model tuning, and deployment options on GCP. The explanations were simple and to the point, which really helped before the exam.

Other Practice Tests

Practice Test #1

50 Questions·120 min·Pass 700/1000

Practice Test #3

50 Questions·120 min·Pass 700/1000
← View All Google Professional Machine Learning Engineer Questions

Start Practicing Now

Download Cloud Pass and start practicing all Google Professional Machine Learning Engineer exam questions.

Get it on Google Play · Download on the App Store
Cloud Pass

IT Certification Practice App


Certifications

AWS · GCP · Microsoft · Cisco · CompTIA · Databricks

Legal

FAQ · Privacy Policy · Terms of Service

Company

Contact · Delete Account

© Copyright 2026 Cloud Pass, All rights reserved.
