Cloud Pass

Google Professional Machine Learning Engineer

Practice Test #2

Simulate the real exam experience with 50 questions and a 120-minute time limit. Practice with AI-verified answers and detailed explanations.

50 Questions · 120 Minutes · 700/1000 Passing Score


Practice Questions

Question 1

Your team is preparing to train a fraud detection model using data in BigQuery that includes several fields containing PII (for example, card_number, customer_email, and phone_number). The dataset has approximately 250 million rows and every column is required as a feature. Security requires that you reduce the sensitivity of PII before training while preserving each column’s format and length so downstream SQL joins and validations continue to work. The transformation must be deterministic so the same input always maps to the same protected value, and authorized teams must be able to decrypt values for audits. How should you proceed?

Randomizing sensitive values is not deterministic unless you maintain a mapping table, and it is typically not reversible for audits. It also risks breaking referential integrity and downstream joins/validations because randomized outputs may not preserve the original format/length constraints (e.g., card number patterns). While Dataflow can scale to 250M rows, this approach does not meet the deterministic and decryptable requirements.

Cloud DLP can both identify PII and apply de-identification at scale. Using DLP Format-Preserving Encryption (FPE) preserves the original data’s format and length, enabling downstream SQL joins and validations to continue working. With Cloud KMS protecting the key material, authorized teams can re-identify/decrypt for audits under controlled IAM. Dataflow provides the scalable execution layer for transforming hundreds of millions of BigQuery rows.

AES-256 with a per-row random salt makes the transformation non-deterministic (same input encrypts differently each time), which breaks the requirement for stable mapping needed for joins and consistent feature values. Additionally, standard ciphertext encoding (base64/hex) changes length and character set, violating format/length preservation. Building custom crypto also increases implementation risk compared to DLP’s managed FPE designed for this use case.

Dropping PII columns contradicts the requirement that every column is required as a feature for training. Authorized views can restrict access, but they do not reduce the sensitivity of the data used in training; the model pipeline would still either need the raw PII (violating security requirements) or lose critical features. This option addresses access control, not deterministic, reversible de-identification.

Question Analysis

Core Concept: This question tests privacy-preserving feature engineering for ML using managed de-identification. The key services are Cloud Data Loss Prevention (DLP) for de-identification and Format-Preserving Encryption (FPE), Cloud KMS for key management, and Dataflow for scalable transformation of BigQuery-scale datasets.

Why the Answer is Correct: You must reduce PII sensitivity while (1) preserving each column's format and length, (2) ensuring deterministic mapping (same input always maps to the same output), and (3) enabling authorized re-identification (decryption) for audits. Cloud DLP's FPE is designed exactly for this: it produces ciphertext that matches the original data's character set and length constraints (e.g., credit card-like strings), can be configured deterministically, and supports reversible transformation when paired with appropriate keying material. Using Cloud KMS to protect the wrapping key aligns with enterprise security and auditability requirements. Dataflow provides the throughput needed for ~250M rows and integrates well with BigQuery I/O.

Key Features / Configurations:
- DLP de-identification template using cryptoReplaceFfxFpeConfig (FPE/FFX mode) to preserve format/length.
- Deterministic behavior via consistent keying material and configuration; optionally use a stable surrogate/"tweak" strategy if required by policy.
- Cloud KMS-managed key encryption key (KEK) to protect the DLP crypto key material (enables centralized IAM, rotation, and audit logs).
- Dataflow batch pipeline reading from BigQuery, applying the DLP transform to specific columns, and writing back to BigQuery for training.
- Principle of least privilege: the Dataflow service account needs BigQuery read/write and DLP/KMS permissions; restrict decrypt capability to audit teams.

Common Misconceptions:
- "Randomizing" values removes PII but breaks joins/validations and is not reversible.
- Standard encryption with random salts improves security but defeats determinism and typically changes length/format, breaking downstream SQL expectations.
- Dropping PII columns violates the requirement that every column is needed as a feature.

Exam Tips: When you see requirements for (a) preserving format/length, (b) deterministic tokenization, and (c) reversible access for authorized users, think Cloud DLP FPE + Cloud KMS. For very large BigQuery datasets, pair it with Dataflow for scalable batch processing and use DLP templates for repeatability and governance (aligned with the Google Cloud Architecture Framework's security and operational excellence pillars).
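To make the two key properties concrete, the toy sketch below shows deterministic, format/length-preserving pseudonymization of a digit string using a keyed HMAC in plain Python. This is NOT Cloud DLP's FPE (cryptoReplaceFfxFpeConfig): unlike FPE it is not reversible, so it could not satisfy the audit/decrypt requirement in production, and the key here is a placeholder (real key material would be wrapped by Cloud KMS).

```python
# Illustrative sketch only: demonstrates the *properties* the question asks
# for (deterministic mapping, preserved length, digit-only output) using a
# keyed HMAC as a pseudorandom function. Not reversible, so not a substitute
# for DLP FPE in a real pipeline.
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical; production key material lives behind Cloud KMS

def pseudonymize_digits(value: str, key: bytes = KEY) -> str:
    """Map a digit string to another digit string of the same length."""
    digest = hmac.new(key, value.encode(), hashlib.sha256).digest()
    # Derive one output digit per input position from the keyed digest.
    return "".join(str(digest[i % len(digest)] % 10) for i in range(len(value)))

card = "4111111111111111"
token = pseudonymize_digits(card)
assert len(token) == len(card) and token.isdigit()   # format/length preserved
assert token == pseudonymize_digits(card)            # deterministic: joins stay stable
```

Because the mapping is deterministic, downstream SQL joins on the tokenized column continue to match, which is exactly the behavior DLP's deterministic FPE provides (with reversibility added on top).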

Question 2

You are a data scientist at a national power utility analyzing 850 million smart-meter readings from 3,000 substations collected over 5 years. For exploratory analysis, you must compute descriptive statistics (mean, median, mode) by device and region, perform complex hypothesis tests (e.g., differences between peak and off-peak or seasonal periods, with multiple comparisons), and plot feature variations at hourly and daily granularity over time, while using as much of the telemetry as possible and minimizing computational resources. What should you do?

Not ideal because it calculates descriptive statistics and runs statistical analyses inside a notebook after importing data. With 850 million rows, pulling large volumes into a user-managed notebook is expensive and slow, and may exceed memory/IO limits. Looker Studio is fine for visualization, but the heavy computations should be pushed down to BigQuery to minimize compute and leverage MPP execution.

Incorrect because it relies entirely on a Vertex AI Workbench user-managed notebook for importing and analyzing the full dataset. Notebooks are not designed as a primary engine for scanning and aggregating hundreds of millions of records; this typically requires large VM sizing, long runtimes, and high cost. It also reduces reproducibility and scalability compared to BigQuery-based processing.

Partially correct: BigQuery is the right place for descriptive statistics at scale, and Workbench can run complex hypothesis tests. However, using notebooks to generate all time plots is not the most resource-efficient approach for interactive hourly/daily visual exploration. Looker Studio can query BigQuery directly and offload visualization without keeping notebook compute running.

Correct: BigQuery handles large-scale aggregations and descriptive statistics efficiently, minimizing compute and cost. Looker Studio connects directly to BigQuery for interactive time-series plots at hourly/daily granularity without exporting data. Vertex AI Workbench is then used only for complex hypothesis testing and multiple-comparison procedures, ideally on filtered/aggregated extracts from BigQuery, balancing fidelity with resource efficiency.

Question Analysis

Core Concept: This question tests choosing the right tools for large-scale exploratory data analysis (EDA) on Google Cloud: push aggregation and filtering down to BigQuery (serverless MPP analytics), use a BI tool for interactive visualization, and reserve notebooks for advanced statistics that are not easily expressed in SQL.

Why the Answer is Correct: With 850 million time-series readings, importing the full dataset into a notebook is inefficient and often infeasible due to memory/IO limits and high compute cost. BigQuery is designed to scan and aggregate massive datasets efficiently and can compute descriptive statistics by device/region (mean, approximate quantiles for median, counts for mode) using SQL at scale. For plotting hourly/daily variations over time, Looker Studio (formerly Data Studio) can query BigQuery directly, enabling interactive dashboards without exporting data or running a notebook continuously. Complex hypothesis tests with multiple comparisons (e.g., t-tests/ANOVA variants, nonparametric tests, p-value adjustments) are better handled in Python/R in Vertex AI Workbench; critically, the notebook should query only the necessary slices/aggregates from BigQuery to minimize resources while still using as much telemetry as possible.

Key Features / Best Practices:
- BigQuery: partition by timestamp and cluster by device_id/region to reduce scanned bytes and cost; use approximate quantiles for a scalable median; use materialized views or scheduled queries for repeated rollups.
- Looker Studio: direct BigQuery connector, cached results, parameterized filters for peak/off-peak and seasonal windows.
- Vertex AI Workbench: use the BigQuery client/BigQuery Storage API to pull only required subsets; run statistical libraries (SciPy/Statsmodels) for hypothesis testing and multiple-comparison corrections.

These align with Google Cloud Architecture Framework principles: choose managed services, optimize cost/performance, and separate concerns (analytics vs visualization vs advanced computation).

Common Misconceptions: Options A and B assume notebooks are the primary engine for both aggregation and visualization, but notebooks are not optimized for scanning hundreds of millions of rows and lead to oversized instances and long runtimes. Option C is close, but it misses the most resource-efficient approach for visualization: using Looker Studio directly on BigQuery avoids notebook-based plotting workloads and supports broad stakeholder exploration.

Exam Tips: For very large datasets, default to BigQuery for heavy aggregations and filtering, BI tools for dashboards, and notebooks for specialized analyses. Watch for phrases like "minimize computational resources" and "use as much telemetry as possible"; they usually imply serverless analytics (BigQuery) plus direct-connect visualization rather than exporting data into notebooks.
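The multiple-comparison step described above is typically a few lines of notebook Python over BigQuery aggregates. Below is a minimal, self-contained sketch of the Holm-Bonferroni adjustment (the p-values are invented for illustration; in practice statsmodels' multipletests provides the same logic, tested and vectorized):

```python
# Holm-Bonferroni step-down correction: test p-values in ascending order
# against alpha / (m - rank); once one test fails, all larger p-values fail.

def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    m = len(p_values)
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # step-down: larger p-values cannot be rejected either
    return reject

# Hypothetical p-values from four peak-vs-off-peak tests:
print(holm_bonferroni([0.001, 0.04, 0.03, 0.20]))  # [True, False, False, False]
```

Only the smallest p-value survives the correction here, illustrating why raw per-test significance overstates findings when many seasonal/regional comparisons are run at once.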

Question 3

You are launching a grocery delivery mobile app across 3 cities and will use Google Cloud's Recommendations AI to build, test, and deploy product suggestions. You currently capture about 2.5 million user events per day, maintain a catalog of 120,000 SKUs with accurate price and availability, and your business objective is to raise average order value (AOV) by at least 6% within the next quarter while adhering to best practices. Which approach should you take to develop recommendations that most directly increase revenue under these constraints?

"You Might Also Like" is commonly used for discovery and can improve click-through rate on a home feed, but CTR is not the same as revenue. Home feed users may be in browsing mode, so incremental clicks may not translate into larger baskets or higher AOV. For an explicit AOV +6% goal in a short timeframe, cross-sell at high-intent moments is typically more effective than broad personalization on the home screen.

"Frequently Bought Together" is purpose-built to recommend complementary items that increase basket size (attach rate). Showing these recommendations on product detail and cart pages targets users close to purchase, which most directly impacts AOV and revenue. This aligns with best practices: leverage strong purchase/add-to-cart signals from high event volume and use accurate catalog metadata (price/availability) to avoid recommending out-of-stock or irrelevant items.

This reverses best practice. Recommendations AI Retail requires a product catalog to exist so events can be correctly attributed to items and enriched with metadata. Importing events first can lead to unmatched item IDs, reduced training quality, and delayed time-to-value. The system does not "backfill" missing metadata reliably from events; you should upload/maintain the catalog first (or in parallel) and then stream/import user events.

Creating placeholder SKUs with default categories/prices is explicitly counterproductive for recommendation quality and business outcomes. Recommendations AI relies on accurate item attributes (category, price, availability) to train and to filter/serve relevant results. Placeholders can cause irrelevant or misleading recommendations (e.g., wrong price or out-of-stock), harming user trust and conversion. It also complicates governance and measurement during A/B tests, producing noisy results.

Question Analysis

Core Concept: This question tests how to choose the most appropriate Recommendations AI model type and placement to meet a concrete business KPI (increase AOV/revenue) while following data-quality best practices. Recommendations AI offers different recommendation types optimized for different user intents and surfaces (home feed vs product detail vs cart).

Why the Answer is Correct: To most directly increase revenue/AOV, you want to increase basket size and attach rate (adding complementary items to an order). "Frequently Bought Together" is designed for cross-sell: it recommends items that are commonly purchased in the same transaction. Placing it on product detail pages and especially the cart page targets users at high purchase intent, where incremental add-ons are most likely to convert and increase order value within a quarter. With 2.5M events/day and a well-maintained catalog (120k SKUs with accurate price/availability), you have the key prerequisites to train and serve high-quality recommendations quickly.

Key Features / Best Practices:
- Use the Retail domain in Recommendations AI with a complete, accurate product catalog (including price, availability, categories, and attributes) and high-volume user events (view, add-to-cart, purchase).
- Ensure event logging includes user IDs (or visitor IDs), product IDs, timestamps, and event types; prioritize purchase and add-to-cart signals for revenue impact.
- Run online A/B tests (e.g., via your experimentation framework) comparing placements (PDP vs cart) and measure AOV, conversion rate, and revenue per session, not just CTR.
- Follow the Google Cloud Architecture Framework: align with business goals (AOV), ensure data quality and governance (accurate catalog), and design for reliability/observability (monitor recommendation serving latency and drift).

Common Misconceptions: CTR-optimized placements (home feed) can look successful but may not move revenue. Trying to shortcut catalog quality with placeholders often degrades model performance and violates best practices for Retail recommendations.

Exam Tips: When the KPI is revenue/AOV, prefer cross-sell/upsell recommendation types and high-intent surfaces (PDP/cart). When the KPI is engagement/discovery, home-feed "You Might Also Like" can be appropriate. Always import the catalog first (or keep it current) and avoid synthetic placeholders; Recommendations AI depends heavily on accurate item metadata and availability.
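A quick back-of-envelope check (all numbers below are hypothetical, not from the question) shows why attach rate is the lever that connects "Frequently Bought Together" to a +6% AOV target:

```python
# Hypothetical arithmetic: what attach rate must a cross-sell placement
# achieve for a +6% AOV lift, given an assumed baseline AOV and add-on price?
baseline_aov = 50.00   # assumed current average order value ($)
addon_price = 10.00    # assumed average price of an accepted add-on ($)
target_aov = baseline_aov * 1.06

# Fraction of orders that must accept exactly one add-on to hit the target:
required_attach_rate = (target_aov - baseline_aov) / addon_price
print(f"required attach rate: {required_attach_rate:.0%}")
```

Under these assumed numbers, roughly 30% of orders would need to accept a $10 add-on, which is why the explanation stresses A/B testing AOV and revenue per session directly rather than relying on CTR as a proxy.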

Question 4

A fintech analytics team has migrated 12 time-series forecasting and anomaly-detection models to Google Cloud over the last 90 days and is now standardizing new training on Vertex AI. You must implement a system that automatically tracks model artifacts (datasets, feature snapshots, checkpoints, and model binaries) and end-to-end lineage across pipeline steps for dev, staging, and prod. The solution must be simple to adopt via reusable templates, require minimal custom code, retain lineage for at least 180 days, and scale to future models without re-architecting. What should you do?

Vertex AI Pipelines integrates natively with Vertex ML Metadata to automatically capture lineage: which inputs produced which outputs, per component execution, across pipeline runs. Using the Vertex AI SDK enables reusable pipeline templates and components, minimizing custom code while standardizing dev/stage/prod workflows. This scales well as new models are added because lineage capture is built-in and consistent, and artifact URIs are tracked centrally.

Mixing Vertex AI Pipelines for artifacts and MLflow for lineage adds unnecessary operational complexity and weakens standardization. MLflow tracking is not the native lineage store for Vertex Pipelines, so you would need custom integration to correlate pipeline steps, artifacts, and environments. This violates the “minimal custom code” and “simple to adopt via reusable templates” requirements and increases maintenance burden over time.

Vertex AI Experiments is primarily for experiment/run tracking (parameters, metrics, comparisons) and is not designed to provide full end-to-end lineage across multi-step pipelines and artifact dependencies. While ML Metadata can store lineage, using Experiments for artifacts is a mismatch: artifacts like datasets, checkpoints, and binaries are typically managed via pipeline artifacts and storage, with MLMD capturing their relationships automatically.

Using Cloud Composer to schedule lineage capture via Cloud Run functions is a custom-built metadata system. It requires significant bespoke code to infer lineage, map artifacts to steps, and maintain consistency across dev/stage/prod. This approach is harder to standardize, less reliable for audit-grade provenance, and does not leverage Vertex AI’s native MLMD integration, making it a poor fit for low-code, scalable governance.

Question Analysis

Core Concept: This question tests end-to-end ML governance on Google Cloud: tracking artifacts (datasets, feature snapshots, checkpoints, model binaries) and lineage across pipeline steps and environments using Vertex AI Pipelines and Vertex ML Metadata (MLMD). This aligns with the Google Cloud Architecture Framework pillars of Operational Excellence (repeatable automation), Reliability (consistent provenance), and Security/Compliance (auditability).

Why the Answer is Correct: Vertex AI Pipelines (Kubeflow Pipelines on Vertex) automatically integrates with Vertex ML Metadata to record executions, inputs/outputs, and artifact URIs for each pipeline component. Using the Vertex AI SDK and reusable pipeline templates/components provides a low-code adoption path: teams standardize a pipeline pattern once, and future models inherit artifact and lineage tracking without re-architecting. This directly satisfies the requirements for minimal custom code, reusable templates, and scaling to additional models.

Key Features / How to Implement:
- Define pipelines with the Vertex AI SDK (KFP v2) and standard components (e.g., data extraction, feature generation, training, evaluation, deployment).
- Ensure each step produces typed artifacts (Dataset, Model, Metrics, etc.) and writes outputs to durable storage (typically Cloud Storage); MLMD stores metadata and lineage references to these artifacts.
- Use separate projects or environments (dev/stage/prod) with consistent pipeline templates; lineage is captured per run and can be queried for audits and debugging.
- Retention: MLMD retains lineage/metadata; the 180-day requirement is met by keeping the metadata and the underlying artifact storage (e.g., GCS lifecycle policies for binaries/checkpoints). If organizational policy requires, configure dataset/model artifact retention and access controls.

Common Misconceptions: Some assume MLflow is required for lineage; on Vertex, MLMD is the native lineage system tightly integrated with Pipelines. Others confuse Vertex AI Experiments (run tracking) with full artifact lineage across pipeline steps; Experiments is not a complete lineage solution for multi-step pipelines.

Exam Tips: When you see "end-to-end lineage across pipeline steps" and "simple via templates/minimal custom code," default to Vertex AI Pipelines + Vertex ML Metadata. Prefer native integrations over assembling multiple tools unless requirements explicitly demand third-party portability. Also remember: metadata stores references; artifact retention is handled by the backing storage (often GCS) and lifecycle policies.
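As a conceptual illustration only (plain Python, not the Vertex ML Metadata API), lineage reduces to recording which input artifact URIs produced which output artifact URIs per step execution, and walking those records backwards. MLMD captures exactly these relationships automatically when steps run inside Vertex AI Pipelines; the URIs below are made up.

```python
# Toy lineage store: each Execution records one pipeline-step run with its
# input and output artifact URIs. upstream_of() walks lineage backwards to
# find every artifact that contributed to a given output.
from dataclasses import dataclass, field

@dataclass
class Execution:
    step: str
    inputs: list
    outputs: list

@dataclass
class LineageStore:
    executions: list = field(default_factory=list)

    def record(self, step, inputs, outputs):
        self.executions.append(Execution(step, list(inputs), list(outputs)))

    def upstream_of(self, artifact_uri):
        """All artifacts that (transitively) produced artifact_uri."""
        result, frontier = set(), {artifact_uri}
        while frontier:
            uri = frontier.pop()
            for e in self.executions:
                if uri in e.outputs:
                    new = set(e.inputs) - result
                    result |= new
                    frontier |= new
        return result

store = LineageStore()
store.record("extract", ["bq://raw_table"], ["gs://ds/train.parquet"])
store.record("train", ["gs://ds/train.parquet"], ["gs://models/model.bin"])
print(store.upstream_of("gs://models/model.bin"))
```

The audit question "which dataset produced this model binary?" is answered by exactly this backwards walk, which is what MLMD's lineage queries provide out of the box.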

Question 5

You trained an automated scholarship eligibility classifier for a national education nonprofit using Vertex AI on 1.2 million labeled applications, reaching an offline ROC AUC of 0.95. The review board is concerned that predictions may be biased by applicant demographics (e.g., gender, ZIP-code-derived income bracket, first-generation college status) and asks you to deliver transparent insight into how the model makes decisions for 500 sampled approvals and denials, and to identify any fairness issues across these cohorts. What should you do?

Separating features in Feature Store and retraining without demographics is a remediation attempt, not the requested analysis. It also risks “fairness through unawareness,” which often fails because proxy variables (e.g., ZIP code, school, essay topics) can still encode demographics. Additionally, removing sensitive features can make it harder to measure fairness and audit outcomes by protected class. The board asked for transparent insight and cohort fairness evaluation first.

Vertex AI feature attribution (Explainable AI) provides per-instance explanations for approvals and denials, quantifying how each feature influenced each prediction. You can then aggregate and compare attributions and outcomes across cohorts (gender, income bracket, first-gen) to surface potential bias and proxy-feature reliance. This directly satisfies the requirement for transparency on 500 sampled decisions and supports fairness analysis using group-level comparisons and fairness metrics.

Vertex AI Model Monitoring focuses on operational issues like training-serving skew, feature drift, and prediction drift. While valuable for production reliability, it does not provide decision transparency for specific cases nor does it directly identify fairness issues across demographic cohorts. Retraining with recent data may even perpetuate or amplify bias if the underlying labeling or historical decision process is biased. This option addresses a different problem than requested.

Vector Search nearest-neighbor retrieval can help find similar examples, but it is not a primary tool for explainability or fairness auditing. Similarity search depends on embeddings and distance metrics and may not reveal which features drove the model’s decision or quantify demographic impacts. It can be a supplemental qualitative investigation technique, but it does not provide the systematic, per-feature, cohort-based transparency and bias evidence the board requested.

Question Analysis

Core Concept: This question tests model transparency and fairness evaluation in Vertex AI, specifically using explainability (feature attribution) to understand why individual predictions were made and then analyzing those explanations across demographic cohorts to detect potential bias. This aligns with responsible AI practices in the Google Cloud Architecture Framework (governance, risk management, and trust).

Why the Answer is Correct: The board requests "transparent insight" for 500 sampled approvals/denials and to "identify fairness issues across cohorts." Vertex AI feature attribution (Vertex Explainable AI) provides per-prediction (local) explanations showing how each input feature contributed to a specific decision. By aggregating attributions and outcomes by cohort (e.g., gender, income bracket, first-gen status), you can identify whether sensitive or proxy features disproportionately drive approvals/denials, and whether similarly qualified applicants receive different outcomes across groups: key evidence for a bias investigation.

Key Features / How to Do It: Use Vertex Explainable AI on the deployed model or batch predictions for the 500 sampled cases to obtain attributions (e.g., Integrated Gradients or sampled Shapley, depending on model type). Then slice results by cohort and compare: (1) the distribution of prediction scores, (2) the top contributing features, and (3) fairness metrics such as demographic parity difference, equal opportunity / TPR gaps, and calibration by group (often computed outside Vertex AI using BigQuery/Looker/Python, but driven by the attribution outputs). Also look for proxy variables (ZIP-code-derived income) acting as sensitive-feature surrogates.

Common Misconceptions: High ROC AUC (0.95) does not imply fairness; it can coexist with discriminatory behavior. Monitoring drift/skew is operationally important but does not answer why decisions are made or whether they are biased. Removing demographic features may not remove bias because proxies remain, and it can reduce the ability to measure and mitigate fairness issues.

Exam Tips: When the prompt asks for transparency, interpretability, and cohort-based bias analysis, think "Vertex Explainable AI / feature attribution + slice by groups." When it asks for production data drift or training-serving skew, think "Vertex Model Monitoring." For fairness, prefer measurement and evidence (explanations plus group metrics) before remediation steps like reweighting, constraints, or feature removal.
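The two group fairness metrics named above are simple to compute once predictions are sliced by cohort. A minimal sketch with synthetic data (in practice these would be computed over the 500 sampled Vertex AI predictions):

```python
# Demographic parity difference: gap in approval rates between cohorts.
# Equal-opportunity (TPR) gap: gap in true-positive rates, i.e., among truly
# eligible applicants, how often each cohort is actually approved.

def rate(xs):
    return sum(xs) / len(xs)

def demographic_parity_diff(pred_a, pred_b):
    return rate(pred_a) - rate(pred_b)

def tpr_gap(pred_a, label_a, pred_b, label_b):
    tpr = lambda p, y: rate([pi for pi, yi in zip(p, y) if yi == 1])
    return tpr(pred_a, label_a) - tpr(pred_b, label_b)

# Synthetic cohorts (1 = approved / truly eligible):
pred_a, label_a = [1, 1, 0, 1], [1, 1, 0, 1]
pred_b, label_b = [1, 0, 0, 0], [1, 1, 0, 1]
print(demographic_parity_diff(pred_a, pred_b))       # 0.5
print(tpr_gap(pred_a, label_a, pred_b, label_b))     # ~0.667
```

Here both metrics are large even though each cohort contains identically qualified applicants, the kind of evidence the board could act on; the attribution outputs then explain which features drive the disparity.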


Question 6

You are building a deep neural network classifier for a ride-sharing fraud detection system with 30 million training records. Several categorical features have very high cardinality (driver_id ≈ 320,000 unique values, vehicle_vin ≈ 110,000, pickup_zip ≈ 42,000), and due to a 16 GB GPU memory cap you cannot materialize a full one-hot vocabulary for each column. Which encoding should you use to feed these categorical features into the model so that the representation scales, remains sparse, and does not impose artificial ordinality?

Assigning integer indices and feeding them as continuous numeric inputs is generally incorrect for neural networks because it introduces artificial ordinality and distance (e.g., category 10 appears “closer” to 11 than to 1000). This can mislead the model unless you use an embedding lookup (which the option does not mention). While memory-efficient, it violates the requirement to avoid ordinality.

Feature hashing maps each category into a fixed number of buckets and produces a sparse one-hot (or multi-hot) vector over those buckets. It scales to very high cardinality without storing a full vocabulary, fits within memory constraints, and avoids ordinality because the representation is categorical rather than numeric magnitude. The main trade-off is hash collisions, managed by choosing adequate bucket sizes per feature.

Explicit one-hot encoding with a dimension per unique category is the most direct way to avoid ordinality, but it does not scale here. With hundreds of thousands of categories across multiple columns, the resulting vectors are extremely wide and can exceed GPU memory and increase compute costs. It also requires building and maintaining vocabularies, which is operationally heavy with evolving categories.

Run-length encoding is a compression technique for sequences/strings, not a standard ML encoding for categorical variables. It does not produce a stable numeric feature representation suitable for neural network ingestion, does not address ordinality, and does not provide a sparse categorical indicator structure. It would add complexity without solving the core modeling and scalability requirements.

Question Analysis

Core Concept: This question tests scalable encoding of high-cardinality categorical features for deep neural networks under memory constraints. The key requirement is an encoding that (1) scales to hundreds of thousands of categories, (2) stays sparse, and (3) avoids introducing artificial ordinality.

Why the Answer is Correct: Feature hashing (the hashing trick) maps each categorical value into one of a fixed number of buckets and represents it as a sparse one-hot vector over those buckets. This avoids building and storing an explicit vocabulary (expensive for driver_id/vehicle_vin) and keeps the input representation sparse, which is efficient for both memory and compute. Because the representation is one-hot over hash buckets, it does not impose an ordinal relationship between categories (unlike feeding integer IDs as continuous values). Hashing also works well in streaming/online settings and when new categories appear, since unseen values are still mapped deterministically to buckets.

Key Features / Best Practices:
- Choose bucket sizes per feature based on cardinality and acceptable collision rate (e.g., more buckets for driver_id than pickup_zip).
- Use separate hash spaces per feature to prevent cross-feature collisions.
- Implement via TensorFlow/Keras preprocessing layers (e.g., Hashing + CategoryEncoding) or the equivalent in other frameworks; keep the output as sparse tensors where supported.
- Understand the trade-off: collisions introduce controlled noise, which is often acceptable at scale and can be mitigated with larger bucket counts.

Common Misconceptions: Option A is tempting because integer IDs are compact, but feeding them directly as numeric inputs makes the model treat category "320000" as larger than "2", creating false ordinality and distance. Option C is the classic one-hot approach but becomes infeasible with very large vocabularies and GPU memory limits. Option D is unrelated to ML feature representation and does not create a meaningful numeric encoding.

Exam Tips: When you see "high cardinality + cannot materialize a vocabulary/one-hot + want sparse + no ordinality," think feature hashing (or embeddings if dense representations are acceptable). If the question explicitly asks to remain sparse, hashing-based one-hot is the canonical answer. Also remember the operational benefit: hashing handles unseen categories without retraining a vocabulary, aligning with scalable production ML practices and the Google Cloud Architecture Framework's emphasis on performance efficiency and operational simplicity.
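A minimal plain-Python sketch of the hashing trick for one feature, salting the hash with the feature name so each column gets its own hash space (the function and parameter names here are illustrative; in TensorFlow this would be a Hashing layer followed by CategoryEncoding):

```python
# Feature hashing: map a category to a bucket index deterministically,
# with no stored vocabulary. A stable cryptographic hash is used so the
# mapping is consistent across processes (Python's built-in hash() is not).
import hashlib

def hash_bucket(value: str, feature: str, num_buckets: int) -> int:
    # Salting with the feature name gives driver_id and pickup_zip
    # independent hash spaces, avoiding cross-feature collisions.
    h = hashlib.sha256(f"{feature}:{value}".encode()).hexdigest()
    return int(h, 16) % num_buckets

def sparse_one_hot(value: str, feature: str, num_buckets: int):
    # Sparse representation: a single (index, 1.0) pair instead of a
    # dense vector of length num_buckets.
    return (hash_bucket(value, feature, num_buckets), 1.0)

idx, val = sparse_one_hot("driver_000042", "driver_id", 2**18)
assert 0 <= idx < 2**18 and val == 1.0
# Unseen categories map deterministically, no vocabulary rebuild needed:
assert hash_bucket("driver_000042", "driver_id", 2**18) == idx
```

Note the memory profile: the model only stores weights for num_buckets positions per feature (2^18 ≈ 262k here), independent of how many distinct driver IDs ever appear.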

Question 7

You work for a wind farm operator. You have been asked to develop a model to predict whether a turbine will require unscheduled maintenance on a given day. Your team has processed 18 months of turbine telemetry (~2.4 million rows) and created a table with the following columns:
• Turbine_id
• Site_id
• Date
• Hours_since_last_service (measured in hours)
• Average_vibration_frequency (measured in Hz)
• Temperature_delta_7d (measured in °C)
• Unscheduled_maintenance (binary label indicating whether maintenance occurred on the Date)
Models must be deployed in us-central1, and you need to interpret the model's results for each individual online prediction with per-instance feature contributions. What should you do?

BigQuery ML can train boosted tree models on tabular data, but manually inspecting tree split rules is not the same as obtaining per-instance feature attribution values for each online prediction. The question specifically asks for feature contributions for individual predictions, which requires an explanation mechanism rather than human inspection of model structure. This option also does not describe deploying a model to a Vertex AI endpoint with the explain method, which is the managed pattern that directly satisfies the requirement.

Vertex AI AutoML Tabular supports training classification models on tabular data and deploying them to a Vertex AI endpoint in us-central1. Vertex AI Explainable AI can be enabled to return per-instance feature attributions via the explain method alongside online predictions. This exactly meets the requirement for individual prediction interpretability with feature contributions in the specified region.

Logistic regression coefficients describe the overall relationship between features and the target across the model, so they are a form of global interpretability rather than per-instance explanation. A large coefficient does not directly tell you how much that feature contributed to one specific prediction, especially when feature values differ by example and preprocessing may affect interpretation. This option also does not provide a deployed online endpoint with per-request explanations in us-central1.

L1 regularization is applied during training to encourage sparsity and reduce the influence of less useful features, but it is not something enabled at prediction time to explain a result. Even if a model is trained with L1 regularization, that does not produce per-instance feature attribution values for each online prediction. The option uses Vertex AI deployment language, but it does not include Explainable AI or the explain method, so it does not meet the stated interpretability requirement.

Question Analysis

Core Concept: This question is about choosing a Google Cloud service that supports online prediction in a specific region and returns per-instance feature attributions for each prediction. The clearest managed solution for this requirement is Vertex AI with Explainable AI on a deployed endpoint.

Why the Answer is Correct: Option B uses Vertex AI AutoML Tabular to train a binary classification model on tabular telemetry data, deploys it to a Vertex AI endpoint in us-central1, and enables feature attributions. The explain method on the endpoint returns attribution values for each feature for each individual prediction, which directly satisfies the requirement for per-instance interpretability during online serving.

Key Features: Vertex AI endpoints support low-latency online prediction and can be configured with explanation metadata and parameters. AutoML tabular models integrate with Vertex AI Explainable AI so you can request explanations together with predictions. This is the standard Google Cloud pattern when the exam asks for online predictions plus per-instance feature contributions.

Common Misconceptions: Global model interpretability is not the same as per-instance attribution. Tree structures or linear coefficients may help you understand the model overall, but they do not by themselves provide the required feature contribution values for each individual online prediction. Regularization is a training technique, not an explanation mechanism used at prediction time.

Exam Tips: When a question includes phrases like "online prediction," "endpoint," and "per-instance feature contributions," strongly prefer Vertex AI Prediction with Explainable AI. If the requirement is specifically for explanations on each served prediction, look for an option that explicitly mentions deploying to a Vertex AI endpoint and using explain.
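To show what "per-instance feature contributions" look like in practice, here is a small sketch that ranks features by attribution for a single prediction. The attribution payload below is a simplified, hypothetical stand-in for the feature-attributions map that a Vertex AI endpoint's explain method returns; the endpoint call itself appears only in a comment because it requires a deployed model and credentials.

```python
# Hypothetical per-instance attribution values, shaped loosely like the
# feature-attributions map in a Vertex AI explanation response.
# In real code this would come from something like:
#   endpoint = aiplatform.Endpoint(ENDPOINT_ID)
#   response = endpoint.explain(instances=[instance])
attributions = {
    "hours_since_last_service": 0.42,
    "average_vibration_frequency": 0.31,
    "temperature_delta_7d": -0.08,
    "site_id": 0.02,
}

def top_contributors(attrs: dict, k: int = 3) -> list:
    """Rank features by absolute contribution to this one prediction."""
    return sorted(attrs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:k]

ranked = top_contributors(attributions)
```

The key contrast with global interpretability: these numbers are specific to one served instance, so a different turbine on a different day would yield a different ranking.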

8
Question 8

Your media analytics team is building a Vertex AI Pipelines workflow running on a private GKE cluster in europe-west1, and the first task must run a parameterized BigQuery SQL that filters the last 24 hours of event logs (~30 million rows, ~15 GB scanned) and pass the query output directly as the input artifact to the next task; you want the simplest, lowest-effort approach that integrates cleanly into the pipeline with minimal custom code—what should you do?

Manual execution breaks automation and reproducibility, and it does not integrate cleanly with Vertex AI Pipelines orchestration. It introduces operational risk (missed runs, inconsistent parameters) and does not meet the requirement for a workflow task inside the pipeline. Exams generally penalize manual steps when an automated pipeline solution is requested.

A custom Python container calling the BigQuery API can work, but it is not the simplest approach. You must author code, manage dependencies (google-cloud-bigquery), build and store a container image, handle auth (Workload Identity/service account), and define artifact outputs. This is higher effort and maintenance than using a prebuilt component.

Creating a new component from scratch is even more work than option B. You would need to define component specs (inputs/outputs), containerization, versioning, and testing, duplicating functionality already available in official components. This contradicts the requirement for minimal custom code and lowest effort.

Using the official prebuilt BigQuery Query component from the Kubeflow Pipelines registry is the intended low-code solution. It integrates directly into the pipeline DAG, supports parameterized SQL, and produces standardized outputs (often a destination table reference or exported artifact) that downstream tasks can consume. It minimizes custom code and aligns with best practices for pipeline maintainability.

Question Analysis

Core Concept: This question tests using Vertex AI Pipelines (Kubeflow Pipelines v2) with prebuilt components to orchestrate data extraction from BigQuery with minimal custom code, and passing outputs as pipeline artifacts.

Why the Answer is Correct: Option D is the lowest-effort, cleanest integration: use the official prebuilt BigQuery Query component from the Kubeflow Pipelines registry. These components are designed to be dropped into a pipeline DAG, accept parameterized SQL, execute via BigQuery APIs, and materialize results in a pipeline-friendly way (typically as a BigQuery table reference or exported artifact), which the next step can consume. This aligns with “low-code” pipeline authoring and reduces maintenance burden versus custom containers.

Key Features / Best Practices:
- Prebuilt components provide standardized inputs/outputs, logging, and retries consistent with pipeline execution.
- Parameterization is supported via pipeline parameters (e.g., an execution date) to filter the “last 24 hours.”
- For 30M rows / ~15 GB scanned, BigQuery is appropriate; ensure partitioning (e.g., ingestion-time or event-time) and clustering to minimize bytes scanned and cost.
- Running on a private GKE cluster does not prevent calling BigQuery, because BigQuery is a Google-managed service accessed over Google APIs; use Workload Identity for secure auth. If strict egress controls exist, use Private Google Access / restricted VIPs as applicable.
- Prefer writing results to a temporary/destination table (or an extract to GCS) as the artifact boundary; passing an in-memory result set between tasks is not how KFP artifacts typically work.

Common Misconceptions: People often assume they must write a custom Python step (B/C) to call BigQuery. While workable, it increases code, container build/publish steps, dependency management, and long-term maintenance. Another misconception is that “directly pass query output” means streaming rows between steps; in practice, pipelines pass references (a table URI or GCS path) as artifacts.

Exam Tips: On the Professional ML Engineer exam, when asked for the “simplest/lowest-effort/minimal custom code” option in Vertex AI Pipelines, favor Google-provided or official KFP registry components over bespoke containers. Also remember that BigQuery cost/performance hinges on partition pruning and avoiding full scans; parameterize time filters and set destination tables/expiration for intermediate outputs.
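To illustrate the parameterization point, here is a minimal sketch of how the SQL handed to a prebuilt BigQuery query component might be built from a pipeline parameter. The table and column names (`proj.media.events`, `event_ts`) are hypothetical; the point is that filtering on a partitioned timestamp column bounds the scan to roughly one day of data rather than the full table.

```python
def build_daily_query(table: str, execution_ts: str) -> str:
    """Parameterized SQL for a 'last 24 hours' extraction step.

    `execution_ts` would be supplied as a pipeline parameter
    (e.g., the scheduled run time), so every run is reproducible
    for a given parameter value.
    """
    return f"""
    SELECT *
    FROM `{table}`
    WHERE event_ts >= TIMESTAMP_SUB(TIMESTAMP('{execution_ts}'), INTERVAL 24 HOUR)
      AND event_ts < TIMESTAMP('{execution_ts}')
    """

sql = build_daily_query("proj.media.events", "2025-11-24T00:00:00Z")
```

The component then executes this SQL and exposes the destination table reference as the output artifact for the downstream task.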

9
Question 9

Your team is fine-tuning a multilingual speech-to-text Transformer on Vertex AI using PyTorch DDP with 2 worker pools, each VM having 4x NVIDIA A100 40GB GPUs (total 8 GPUs) and a global batch size of 1024. You plan to use the Reduction Server strategy to accelerate cross-node gradient aggregation and will add a third worker pool dedicated to the reduction service. How should you configure the worker pools and container images for this distributed training job?

Incorrect. While it correctly separates training workers from the reduction service by using a third pool and the reductionserver image, it unnecessarily provisions GPUs for the reduction server pool. The reduction server primarily needs CPU, memory, and fast networking for tensor aggregation/transfer. Adding A100s increases cost significantly with little to no benefit, violating cost-optimization best practices.

Correct. Training runs on the first two GPU worker pools using the training container. The third pool runs the reductionserver container image without accelerators and should use a machine type optimized for CPU and network bandwidth (e.g., 32+ vCPUs and high-throughput networking). This matches the reduction server’s purpose: offloading cross-node gradient aggregation to reduce inter-node GPU communication overhead efficiently.

Incorrect. The scenario explicitly uses PyTorch DDP on A100 GPUs. Switching to TPUs changes the hardware and typically the distributed training stack (e.g., XLA, different parallelism patterns). The reduction server strategy described is intended for GPU-based multi-node training; using TPUs here is not aligned with the stated setup and would require re-architecting the training job.

Incorrect. This doubles down on the TPU mismatch and also incorrectly assigns TPUs to the reduction server pool. The reduction server is not a compute accelerator workload; it is a communication/aggregation service. Provisioning TPUs (or GPUs) for it is wasteful and does not address the real bottleneck, which is network and CPU handling of gradient reduction across nodes.

Question Analysis

Core Concept: This question tests Vertex AI custom training with multi-worker distributed PyTorch (DDP) and the Reduction Server strategy, which offloads cross-node gradient aggregation to dedicated CPU/network resources to reduce GPU-to-GPU communication overhead across VMs.

Why the Answer is Correct: With 2 worker pools of A100 GPUs (8 GPUs total) running the actual training, the reduction server should run as a separate service in its own worker pool. The reduction server’s job is network-heavy and CPU/memory-heavy (handling gradient reduction/aggregation and shuttling tensors), not GPU compute-heavy. Therefore, the third worker pool should not waste expensive accelerators; instead, it should use a high-bandwidth, high-vCPU machine type to maximize throughput and minimize latency for cross-node communication. Option B matches this: GPUs only on the training pools, and the reductionserver container on a non-accelerator pool with strong networking.

Key Features / Best Practices:
- Vertex AI supports multiple worker pools; each pool can have different machine types, accelerators, and container images.
- Reduction Server is designed to improve scaling efficiency when inter-node all-reduce becomes a bottleneck (common with large global batch sizes like 1024 and multi-node GPU training).
- Use GPU-enabled images only where CUDA/NCCL compute is required (the training workers). Use the separate reductionserver image for the reduction pool.
- Choose a machine type for the reduction pool optimized for CPU and network (e.g., 32+ vCPUs) and ensure high-throughput networking; this aligns with the Google Cloud Architecture Framework’s performance optimization and cost optimization pillars (avoid paying for unused GPUs).

Common Misconceptions: A is tempting because “distributed training = GPUs everywhere,” but the reduction server does not need GPUs; adding them increases cost without meaningfully improving reduction throughput. C and D incorrectly switch to TPUs, which is incompatible with the stated PyTorch DDP GPU setup and the reduction server approach described.

Exam Tips:
- When a component is described as “communication/aggregation” rather than “model compute,” assume CPU + network optimization, not accelerators.
- In Vertex AI, remember that each worker pool can run a different container image; training and auxiliary services (parameter servers, reduction servers) are commonly separated.
- Watch for cost/performance tradeoffs: the correct design usually avoids provisioning accelerators for non-training roles.
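A worker-pool layout along the lines of option B can be sketched as the `worker_pool_specs` payload of a Vertex AI CustomJob. The training image URI and exact machine types here are illustrative assumptions; the reduction server image path reflects the documented Vertex AI reductionserver container, but you should confirm the current URI in the Vertex AI docs before use.

```python
# Illustrative three-pool layout for PyTorch DDP + Reduction Server.
TRAIN_IMAGE = "us-docker.pkg.dev/my-project/train/stt-ddp:latest"  # hypothetical
REDUCTION_IMAGE = (
    "us-docker.pkg.dev/vertex-ai-restricted/training/reductionserver:latest"
)

gpu_pool = {
    "machine_spec": {
        "machine_type": "a2-highgpu-4g",        # 4x A100 40GB per VM
        "accelerator_type": "NVIDIA_TESLA_A100",
        "accelerator_count": 4,
    },
    "replica_count": 1,
    "container_spec": {"image_uri": TRAIN_IMAGE},
}

worker_pool_specs = [
    gpu_pool,  # pool 0: chief training worker
    gpu_pool,  # pool 1: additional training worker (8 GPUs total)
    {          # pool 2: reduction server -- CPU/network only, no accelerators
        "machine_spec": {"machine_type": "n1-highcpu-32"},
        "replica_count": 1,
        "container_spec": {"image_uri": REDUCTION_IMAGE},
    },
]
```

The design point is visible in the structure itself: only the two training pools carry `accelerator_type`/`accelerator_count`, while the reduction pool pays for vCPUs and bandwidth, not GPUs.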

10
Question 10

Your media subscription platform retrains a custom churn model every month on 48 GB of CSVs in Cloud Storage (~25 million rows) and then runs a batch job that scores 8.2 million users for the next 30 days; compliance demands auditable end-to-end lineage linking the exact data snapshot, container image digest, trained model version, and each batch prediction output URI, retained for 12 months; you need a repeatable batch process with built-in lineage for both the model and the predictions; what should you do?

This option can train and score data at the required scale, but it does not provide the strongest built-in end-to-end lineage. Moving the data into BigQuery is not necessary for the stated requirement, and custom prediction routines invoked through the SDK are not the same as an orchestrated pipeline that links training and prediction metadata together. You would need additional custom logic to correlate the exact training input snapshot, model version, and prediction output locations for audits. That makes it weaker than a single Vertex AI Pipeline containing both steps.

Vertex AI Experiments is useful for tracking runs and metrics, and Model Registry is useful for managing model versions, but neither service alone provides full workflow orchestration. This option also leaves the training process itself underspecified, so the lineage from the exact data snapshot and training execution to the registered model is incomplete. Although Vertex AI batch prediction can generate outputs at scale, the end-to-end chain from data to model to predictions is not as strongly captured as when both steps are executed inside a pipeline. Therefore, it does not best satisfy the requirement for built-in lineage across the entire monthly process.

A training pipeline is a good start because it can capture lineage for the training portion of the workflow. However, the option explicitly says batch predictions are generated directly in Vertex AI outside the pipeline, which breaks the single orchestrated lineage chain. That means the prediction outputs are not inherently tied to the same pipeline execution context as the training artifacts and model production step. For compliance and auditability, keeping prediction inside the pipeline is the stronger and more complete design.

Vertex AI Pipelines is the best choice because it orchestrates retraining and batch scoring as one repeatable managed workflow. Using a custom training job component creates a tracked training execution that produces a model artifact, and using the model batch predict component keeps the scoring step in the same pipeline context. This gives the strongest built-in lineage among the options, because the model used for prediction and the batch prediction outputs are associated with the same pipeline run metadata. It also aligns well with monthly scheduled execution and compliance-oriented auditability requirements.

Question Analysis

Core Concept: This question tests end-to-end ML orchestration with auditable lineage. On Google Cloud, the most direct way to get repeatable monthly retraining + batch scoring with built-in traceability is Vertex AI Pipelines (Kubeflow Pipelines on managed infrastructure) using pipeline components that automatically record artifacts and metadata in Vertex ML Metadata.

Why the Answer is Correct: Compliance requires lineage that links (1) the exact input data snapshot, (2) the container image digest used for training/scoring, (3) the trained model version, and (4) each batch prediction output URI, retained for 12 months. Vertex AI Pipelines provides run-level provenance: each pipeline run captures inputs/outputs as typed artifacts (e.g., Dataset, Model, Metrics, BatchPredictionJob) and stores their relationships in ML Metadata. Using a custom training job component ensures the training container image (including its digest) and parameters are captured as execution metadata. Using the pipeline batch prediction component creates a Vertex AI BatchPredictionJob whose output GCS URI and model resource name/version are recorded and linked to the upstream model artifact. This produces an auditable chain from data to model to predictions, repeatable every month.

Key Features / Best Practices:
- Use Vertex AI Pipelines with a scheduled trigger (e.g., Cloud Scheduler/Workflows) to run monthly.
- Materialize an immutable data snapshot (e.g., a GCS object generation, date-partitioned path, or BigQuery snapshot table) and pass that URI/version into the pipeline so lineage points to an unchanging input.
- Use Artifact Registry images pinned by digest (not just tags) in the custom training/prediction containers.
- Store outputs in GCS with run-specific prefixes; pipeline metadata will record the exact URIs.
- Retention: keep pipeline metadata and artifacts for 12 months (and apply GCS retention/lock policies if required). This aligns with Google Cloud Architecture Framework guidance on governance, auditability, and operational excellence.

Common Misconceptions: Many candidates assume Model Registry or Experiments alone provides end-to-end lineage. They help with model versioning and experiment tracking, but they don’t automatically link the full chain, including batch prediction outputs and the exact training/prediction executions, unless orchestrated through a pipeline/metadata system.

Exam Tips: When you see “repeatable process,” “monthly retraining,” and “auditable lineage from data to predictions,” think Vertex AI Pipelines + ML Metadata. Prefer pipeline components for training and batch prediction to get automatic artifact tracking and reproducibility. Also watch for wording like “container image digest” and “output URI,” which strongly implies pipeline execution metadata rather than ad-hoc SDK calls.
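The practice of pinning by digest and scoping outputs per run can be sketched as the parameter set handed to each monthly pipeline run. Every name, bucket, and digest below is hypothetical; the point is that each reference is immutable, so what ML Metadata records for a run can still be resolved 12 months later.

```python
from datetime import datetime, timezone

def run_parameters(run_id: str, now: datetime) -> dict:
    """Illustrative pipeline parameters that keep a monthly run auditable.

    - image pinned by digest (not a mutable tag)
    - dated, immutable data snapshot path
    - run-scoped output prefix for batch prediction results
    """
    month = now.strftime("%Y-%m")
    return {
        # A digest-pinned image can be recovered exactly, unlike ':latest'.
        "train_image": "us-docker.pkg.dev/proj/ml/churn-train@sha256:9f2c0ab1",
        "data_snapshot": f"gs://proj-ml-data/churn/snapshots/{month}/",
        "prediction_output": f"gs://proj-ml-preds/churn/{month}/{run_id}/",
    }

params = run_parameters("run-0042", datetime(2025, 11, 1, tzinfo=timezone.utc))
```

Because the pipeline records these values as execution inputs and artifact URIs, an auditor can walk from a specific prediction file back to the snapshot and image that produced it.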

Success Stories (7)

C
C***************Nov 24, 2025

Study period: 1 month

Just want to say a massive thank you to the entire Cloud Pass team for helping me pass my exam first time. I won't lie, it wasn't easy, especially the way the real exam is worded, but the way the practice questions teach you why your option was wrong really helps to frame your mind and understand what the question is asking for and which solutions you should be focusing on. Thanks once again.

F
f****Nov 23, 2025

Study period: 1 month

Good question banks and explanations that helped me practise for and pass the exam.

민
민**Nov 12, 2025

Study period: 1 month

After the lectures I went straight into the practice questions and scored around 80%, and I passed the exam with a high score. The app served me well.

S
S************Nov 11, 2025

Study period: 1 month

Good mix of theory and practical scenarios

A
A***********Nov 6, 2025

Study period: 1 month

I used the app mainly to review the fundamentals—data preparation, model tuning, and deployment options on GCP. The explanations were simple and to the point, which really helped before the exam.

Other Practice Tests

Practice Test #1

50 Questions·120 min·Pass 700/1000

Practice Test #3

50 Questions·120 min·Pass 700/1000
← View All Google Professional Machine Learning Engineer Questions

Start Practicing Now

Download Cloud Pass and start practicing all Google Professional Machine Learning Engineer exam questions.

Get it on Google Play · Download on the App Store
Cloud Pass

IT Certification Practice App


Certifications

AWS · GCP · Microsoft · Cisco · CompTIA · Databricks

Legal

FAQ · Privacy Policy · Terms of Service

Company

Contact · Delete Account

© Copyright 2026 Cloud Pass, All rights reserved.
