Last updated: 12.03.2026
Public preview
CHAPTER
04
Smart recommendations, upsell, bundling, and guardrails

Technical Cookbook

Did you know that AI systems were created (also) by programmers? At least for the moment, programmers are still needed to integrate AI with existing processes and software. This chapter is dedicated to those who want to see small snippets of such an integration with their own eyes. If you are a decision-maker, the minimum (but complete) implementation flow in 4.1 is for you. Or you can skip to the next chapter.

There is no absolute winner between the Build and Managed Deep Learning implementation options, which we compared in Ch. 2.4. In practice, for different use cases, many companies use a hybrid approach. For example, they use managed components for product recommendations but implement custom solutions for time-series forecasting or intelligent discounting.

The correct decision flow starts from the deliverables exemplified in Ch. 3.3, then identifies the appropriate technical approach, and finally estimates the necessary effort.

STEP BY STEP
4.1

A complete minimal flow

This section walks step by step through the implementation of an AI recommendation engine, giving you a blueprint for the Build variant (custom implementation) and a checklist for the Managed variant (Cloud Native).

In both variants, the committed costs are easy to quantify (specifically data engineers or Google Cloud consumption), but what is invisible is the opportunity cost: the cost of a poor implementation that goes unused. The goal is to present the mandatory steps for success.

Step 0
4.1.1

Minimum requirements to start

Before implementation, the company must define:

  • Business objectives: what you want to improve (AOV, conversion rate, margin, sale of slow-moving products). See Ch. 2.2.
  • Measurable KPIs: company indicators (monthly or quarterly) and technological metrics (e.g., latency and fallback).
  • Business rules: what the AI must respect (no price, stock, or compatibility hallucinations; minimum margin; brand diversity rules; exclusion of bad-paying customers). See Ch. 2.5.1.

Build

Meetings with sales, marketing, finance, and IT departments to define:

  • 1-2 pilot scenarios (e.g., "upsell in quoting", "recommendation on product page").
  • A minimal set of guardrails (margins, stock, compatibility).
  • Feedback collection, especially negative feedback (e.g., "this recommendation does not interest me", "this recommendation led to 0 sales last month").

Managed

The same processes as in the Build variant.

Plus:

  • Understanding the objectives definable in the dashboard (e.g., maximize CTR / CVR / Revenue).
  • Preparing traffic segments for A/B testing.
  • Defining the region. At present, Vertex AI Search for commerce only allows a global location.
  • Defining the compliance and security requirements (e.g., SOC 2 Type II, NIST AI RMF, HIPAA for medical, CCPA for data privacy, NIS2 for EU business).
Step 1
4.1.2

Data inventory and connection

The recommendation system needs at least three types of data detailed in Ch. 2.3.2: the product catalog, customer interaction history, and continuous events (e.g., clickstream). We focus on the first two.

Build

Identify the sources: ERP, WMS, CRM, e-commerce, and essential Excel files.

Define a unified data schema (unique product ID, mapping between ERP, CRM, and e-commerce codes, normalization plan for units of measurement, etc.).

Perform data cleaning and normalization, as recommended in Ch. 2.5.2.

Build an ETL/ELT pipeline (e.g., in BigQuery or another data warehouse) for:

  • Importing the catalog once a day (full) + incrementally as needed.
  • Optional basic filters for pre-processing (e.g., products out of stock for 3 months are excluded).
  • Near-continuous import of order / traffic events (streaming or batch at intervals).
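The daily full import plus incremental updates can be sketched as a merge step. This is illustrative pseudologic, not a production ETL; the `merge_catalog` name and the record layout are assumptions:

```python
from datetime import datetime

def merge_catalog(full_catalog: dict[str, dict], incremental: list[dict]) -> dict[str, dict]:
    """Apply incremental product updates on top of the daily full import.

    full_catalog: sku -> product record (from the daily full load)
    incremental: list of partial updates, each with at least 'sku' and 'updated_at'
    """
    merged = dict(full_catalog)
    for update in incremental:
        sku = update["sku"]
        current = merged.get(sku, {})
        # Keep only the newest record per SKU; out-of-order (stale) updates are ignored
        if current.get("updated_at") is None or update["updated_at"] > current["updated_at"]:
            merged[sku] = {**current, **update}
    return merged
```

In a real pipeline this merge would typically run as a MERGE statement inside the data warehouse (e.g., BigQuery) rather than in application code.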

Managed

Data cleaning and normalization is mandatory in this case as well.

Instead of building a full ETL, you use standard connectors and feeds:

  • Catalog feed (API/CSV/JSON) mapped to the strict schema of Google. If you use BigQuery as a data warehouse, you can connect the data directly.
  • Business rules can be initially defined as feeds: lists of products to promote, exclude, or bundle, then expanded via the console or configs.
  • Prepare the events feed: server-side events, Google Tag Manager, logs.
  • Note that code libraries and API endpoints still use the historical name of Retail (e.g., google.cloud.retail_v2).

For B2B, if prices vary per customer, Vertex AI Search for commerce can learn using list prices, then the application needs to enforce the correct price. There is also support for distinct prices per customer segments, but it requires complex data engineering.
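The "application enforces the correct price" pattern can be sketched as a thin post-processing step on the recommender's output. The field names and the `contract_prices` lookup are assumptions for illustration:

```python
def enforce_customer_prices(
    recommendations: list[dict],
    contract_prices: dict[str, float],
) -> list[dict]:
    """Overwrite list prices returned by the recommender with the
    customer-specific contract price before display (B2B)."""
    result = []
    for rec in recommendations:
        rec = dict(rec)  # copy so the original response stays untouched
        sku = rec["sku"]
        if sku in contract_prices:
            rec["price"] = contract_prices[sku]
            rec["price_source"] = "contract"
        else:
            rec["price_source"] = "list"
        result.append(rec)
    return result
```

Tagging the price source makes later audits of quotes and carts much easier.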

Recommendation: even in the Managed variant, keep a copy of the data in a data warehouse (e.g., BigQuery) for audit purposes and to have your own source of commercial truth.

Step 2
4.1.3

Defining events, deliverables, and insertion points

You cannot optimize what you do not measure. The company must standardize the events that describe user behavior.

Vertex AI Search for commerce defines standard API event types; check the official documentation for the complete list.

Discovery: search, detail-page-view, category-page-view, home-page-view, shopping-cart-page-view.

Transactional: add-to-cart, remove-from-cart, purchase-complete.

Then, decide in which concrete software deliverables (See Ch. 3.3) you will insert the AI recommendations. These give you the insertion points. For example, in an e-commerce UI, you can display recommendations on: the homepage, category pages, product pages, the cart, PDF quotes, emails, and the administrative backend (for the support team).

Data Risks in B2B
  • In B2B, the tracking key for indicators is not the individual, as in B2C, but rather the contract, the customer (account or organization), or a group of customers.
  • In B2B, as transactions are large but rare, losing even one event in tracking can be catastrophic. Use retry and idempotency mechanisms.
  • We recommend pseudo-anonymization (e.g., hashing) of personal data before sending it to the recommendation system. You maintain consistency with reduced compliance risk.
  • For explainability and audit, it is recommended to log all data interactions with the recommendation system. This way, you can separately reconstruct technical KPIs like NDCG@k (ranking relevance) and Recall@k (search relevance) in the future.
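The pseudo-anonymization recommended above can be as simple as a salted hash. This is a sketch; a real deployment needs proper key management for the salt:

```python
import hashlib

def pseudonymize(identifier: str, salt: str) -> str:
    """Deterministically hash a personal identifier (e.g., an ERP customer ID)
    before sending it to the recommendation system. The same input always
    yields the same hash, so event streams stay consistent per customer,
    while the raw identifier never leaves your systems."""
    return hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()
```

Because the mapping is deterministic, you can still join events per customer on your side while the recommendation system only ever sees the hash.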

Build

Define an event schema (names, data payload, coherent unique IDs). Implement their logging from the company's existing applications:

  • JS scripts for the e-commerce site.
  • Hooks in the e-commerce backend, ERP, WMS, or CRM.
  • Logging to files or topics (e.g., Pub/Sub, Kafka).

Decide how the application will receive recommendations:

  • An API endpoint, e.g., POST /recom with account_id=… and context=… parameters.
  • Widget on the e-commerce site or in internal applications.

Managed

Map your events to the strict schema of Google. If you have events in BigQuery or Google Tag Manager, you can connect them directly.

Configure ServingConfigs for each insertion point (Placement) you define in your applications (e.g., homepage_recs, cart_upsell).

A ServingConfig includes:

  • The model selected from Google's list (e.g., Buy it Again), with its maximization objective already selected.
  • Serving Controls: Filter (exclusion; search mode only), Boost (promotion), Bury (demotion).
  • A diversity setting: how non-repetitive the recommendations are.

You call a ServingConfig when you invoke the API, so a single ServingConfig can serve multiple insertion points in your company's applications if they share the same rules. Serving Controls can be implemented as JSON files, so they can be reused across different search ServingConfigs.

Note that, in older versions, objects of a special type Placement were defined, but these are deprecated in the new API for Vertex AI Search for commerce (retail_v2).

The benefit is that you won't implement from scratch. But you must decide what success means for each insertion point, and import historical data (a minimum of 3 months is recommended) so the algorithms do not start from zero.

Step 3
4.1.4

Candidate generation (Retrieval)

The first technical step of the recommendation system will be to find potential candidates: dozens or hundreds of products that are relevant. This is the Retrieval phase from the architecture described in Ch. 2.3.1.

Build

Configure a lexical search engine (BM25 / full-text) on the catalog and, in parallel, a vector search engine with embeddings for products, customers, and session sequences.

Optionally implement a rules-based pre-processing module: customer type, stock, avoiding irrelevant products.

The result obtained from Vector Search is a list of candidates that then go to Scoring.
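Combining the lexical and vector candidate lists can be sketched with reciprocal rank fusion, one common (not the only) fusion technique:

```python
def fuse_candidates(
    lexical: list[str],
    vector: list[str],
    k: int = 60,
    limit: int = 100,
) -> list[str]:
    """Reciprocal rank fusion: each source contributes 1 / (k + rank) to a
    product's score, so products found by both BM25 and vector search
    accumulate score from both rankings and rise to the top."""
    scores: dict[str, float] = {}
    for ranking in (lexical, vector):
        for rank, sku in enumerate(ranking, start=1):
            scores[sku] = scores.get(sku, 0.0) + 1.0 / (k + rank)
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered[:limit]
```

Fusion tends to surface the products both engines agree on, which is usually what you want to hand to the Scoring phase.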

Managed

The platform offers native Retrieval based on events, on content (content-based / deep attribution: attributes, product sheet, including images), and on associations.

For upsell and cross-sell, data engineering is usually employed together with the Collections capability: for each product you can send numerous package IDs (complementary, substitute, etc.) that contain it, so the model transfers relevance.

In the console, you can define new synonyms (e.g., define "pkg" to be "package") and, especially, filters (e.g., only products in stock, only this brand / category) or diversity strategies ("don't just recommend what is ultra-popular").

You don't implement the recommendation algorithms, but you must pay attention to configuration and data engineering.

See Guide #4 for combining lexical search with semantic search

Step 4
4.1.5

Ranking, rules, and final list generation

This contains the most technically complex steps. For ranking, the AI scoring models will perform:

  1. Ordering the candidates.
  2. Post-processing through business rules (guardrails). These can be exclusion rules (hard), ordering rules (soft), or process rules (e.g., audit).
  3. Final delivery of a short list of products to the application in milliseconds (in optimized scenarios).

Build

Train and call a ranking model (e.g., learning to rank type) that receives:

  • signals from the catalog: price, margin, category, compatibility.
  • context signals: customer type, time, cart / quote details.

Apply business rules in post-processing (e.g., Python, rule engines like Drools):

  • Filter products without stock or below the minimum margin.
  • Promote specific products or campaigns.
  • Ensure deduplication and diversity. E.g., "no 3 consecutive products from the same brand".

Test scaling capacity and optimize for low latency: caching, model optimizations.
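The diversity rule quoted above ("no 3 consecutive products from the same brand") can be sketched as a greedy re-ordering pass in post-processing; `brand_of` is a hypothetical SKU-to-brand lookup:

```python
def enforce_brand_diversity(
    skus: list[str],
    brand_of: dict[str, str],
    max_run: int = 2,
) -> list[str]:
    """Greedily re-order a ranked list so no more than max_run consecutive
    items share a brand, preserving the original order where possible."""
    remaining = list(skus)
    result: list[str] = []
    while remaining:
        placed = False
        for i, sku in enumerate(remaining):
            brand = brand_of.get(sku)
            run = result[-max_run:]
            # Place the first item that does not extend a same-brand run
            if len(run) < max_run or any(brand_of.get(s) != brand for s in run):
                result.append(remaining.pop(i))
                placed = True
                break
        if not placed:
            # No compliant item left; append the rest as-is (soft rule)
            result.extend(remaining)
            break
    return result
```

Treating diversity as a soft rule (the final `extend`) avoids returning fewer items than requested when the catalog is brand-heavy.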

Managed

Initially, choose the objectives and ranking parameters: "optimize for click on recommendation" / "optimize for revenue per session" / "optimize for conversion". Once trained, a model's optimization objective cannot be changed; you will have to re-train it if you want a different objective.

You can also configure the training interval of the model.

You refine business rules as Filters / Boost / Bury (Serving Controls). To obtain desired effects, creativity and solid data engineering are required:

  • Each control has a structure of condition / effect (e.g., in CEL - Common Expression Language). This brings you back to defining the schema.
  • For example, to implement an external temporal forecast rule, you can create a seasonal_score attribute in the schema (e.g., 0.0-1.0) and then define a Boost.
  • Note that some commercial rules usually remain in the middleware or application, such as permissions.
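To make the seasonal_score idea concrete, here is a local Python illustration of what such a Boost effectively does to a ranked list. The real control is configured as condition/effect in the service, not in your code; the field names (score, seasonal_score) are assumptions:

```python
def apply_seasonal_boost(
    candidates: list[dict],
    threshold: float = 0.7,
    boost: float = 0.2,
) -> list[dict]:
    """Illustration of a Boost control: candidates whose seasonal_score
    attribute exceeds the threshold get a score bump, then the list is
    re-sorted. In the Managed variant this happens inside the service."""
    boosted = []
    for c in candidates:
        c = dict(c)
        if c.get("seasonal_score", 0.0) >= threshold:
            c["score"] = c["score"] + boost
        boosted.append(c)
    return sorted(boosted, key=lambda c: c["score"], reverse=True)
```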

Automatic scaling in the cloud at peak load (e.g., Black Friday) is usually an advantage.

See Guide #5 - Continuous optimization, reporting, and feedback

Step 5
4.1.6

UX integration and fallback

The AI recommendation engine must be seamlessly connected with the software application (e.g., e-commerce). It must also allow for feedback and ensure predictable behavior for the customers or sales agents who benefit.

Build

A dedicated microservice with an API will return the final list of recommendations.

Build a UI component that displays the recommendations (e.g., an e-commerce site).

Define a graceful fallback: what is shown when the API does not respond or returns no recommendations? For example, a SQL query for the newest products can run in these cases.

Implement an explicit feedback mechanism (Thumbs up/down or Hide). Upon re-training, this signal must be much stronger than simply ignoring the recommendation.
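The fallback behaviour above can be sketched as a wrapper around the recommendation call; `fetch_recommendations` and `newest_products` are hypothetical helpers standing in for your API client and the SQL fallback query:

```python
def recommendations_with_fallback(fetch_recommendations, newest_products, limit=10):
    """Never leave the UI empty: if the recommender fails or returns
    nothing, fall back to a deterministic list (e.g., newest products)."""
    try:
        recs = fetch_recommendations(limit=limit)
    except Exception:
        recs = []  # in production: log the error, increment a fallback metric
    if not recs:
        recs = newest_products(limit=limit)
    return recs[:limit]
```

Counting how often the fallback branch fires is itself a useful technical KPI (see the fallback metric in Step 0).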

Managed

Send the output to the application programmatically (via API/SDK) or embed some mini-applications available in Vertex AI Search for commerce (e.g., frontend widgets via JS).

In the first case, you build the UI component as in the Build variant.

Provide for fallback as in the Build variant to ensure there is never an empty interface (necessary when using the API variant). Google widgets come with a built-in fallback mechanism; in the API versions there is a strictFiltering parameter that can be set to false so that recommendations are still returned even when the filters would be strict enough to exclude all products.

Provide explicit feedback mechanisms. Usually, Vertex AI Search for commerce collects implicit feedback (clicks, conversions). Negative feedback must be implemented case by case. You can use Filter / Bury or even an event for removal from the cart. In the application you can use data engineering to filter out the products sent to AI or returned from AI.
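The application-side filtering mentioned above can be sketched as a blocklist built from explicit negative feedback; `hidden_skus` would come from your own feedback store:

```python
def filter_hidden(recommendations: list[str], hidden_skus: set[str]) -> list[str]:
    """Drop products the customer explicitly hid (thumbs down / 'Hide'),
    applied after the Managed service returns its predictions."""
    return [sku for sku in recommendations if sku not in hidden_skus]
```

Combine this with over-fetching (request a few more items than you display) so that filtering does not leave the widget short.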

Step 6
4.1.7

Testing, analysis, and continuous feedback

A robust recommendation system depends on A/B testing and continuous maintenance after delivery. You monitor KPIs, anomalies (e.g., quality reports), and feedback from customers or the company.

Shadow mode (running AI recommendations for internal audiences before launching to customers) allows for algorithm validation on real data without risking revenue or reputation. Similarly, a backtesting procedure runs the algorithm on historical data, potentially using a backfill, before any live run.

Build

Configure A/B testing (in-house or through a SaaS platform like split.io):

  • Group A = baseline situation.
  • Group B = applying new algorithms.

Compare KPIs (usually business metrics): AOV, recommendation acceptance rate, quoting time.
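Group assignment for the A/B test can be sketched as a deterministic hash bucket, so the same account always lands in the same group for the lifetime of the experiment:

```python
import hashlib

def ab_group(account_id: str, experiment: str, split_pct: int = 50) -> str:
    """Deterministically assign an account to group A (baseline) or
    B (new algorithm). Hashing the experiment name in avoids correlated
    assignments across different experiments."""
    digest = hashlib.sha256(f"{experiment}:{account_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return "B" if bucket < split_pct else "A"
```

In B2B, assign at the account level (not per user or device) so that colleagues from the same customer see a consistent experience.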

After validation and launch, implement analysis dashboards (e.g., Grafana / Looker Studio) for quality reporting:

  • Business KPI trends (see above).
  • Statistical distribution of recommendations: bestseller percentage, diversity by brand / price / category.
  • Interventions of business rules (guardrails) and fallback mechanisms.

When modifying algorithms, run backtesting (testing on old data) and shadow mode (internal testing in parallel with the existing system).

Managed

For A/B testing, use Vertex AI Search for commerce experiments:

  • Change objectives and strategies without modifying the programmed code.
  • Analyze results in visual dashboards.
  • Adjust based on KPIs (usually business-oriented).

For analysis, the platform offers visual reports and metrics: CTR, conversion rate, revenue per session. Custom dashboards can be configured as in the Build variant, e.g., in Looker Studio.

You can run shadow mode experiments on a new model (not in production) to compare results internally.

You can also experiment with gradually replacing configured Boost / Bury / Filter rules with those auto-calculated by the AI model (as positive effects are confirmed).

 

If you have successfully gone through these minimal steps, you have sketched out a complete recommendation system. Furthermore, you have learned that AI recommendations are not a magic black box, but a structure within the company's sales system, where data, rules, and people are essential.

In brief:

  • Deep Learning is not a single process, but a sequence of steps: defining objectives, cleaning and connecting data, defining events, generating candidates, ranking and post-processing, UX integration, explicit feedback, monitoring.
  • Build (custom) offers maximum control but requires a dedicated and disciplined technical team in addition to the entire company's effort.
  • Managed outsources part of the complexity but requires company-level decisions and careful technical configuration.
  • Both variants depend on data cleaning and A/B testing to achieve results.
CODE EXAMPLES
4.2

Challenges and solutions in the Build approach

We present code snippets for a Deep Learning Build implementation. These four scenarios correspond to the business challenges in Ch. 2.5: guardrails and data contracts, as well as the technical challenges in Ch. 2.6: business function models and bundling.

Notes

  • The examples are in pseudocode to demonstrate in practice several key pieces of AI recommendation logic.
  • For Big Data, logic tends to move from Python to Spark or SQL to avoid iterating over millions of products. See examples in Ch. 4.2.4.
  • Code examples are simplified for clarity and require adaptation to specific production environments, plus authentication/IAM, protection for concurrency, exceptions, retries, etc.

Guardrails: Brand security and exclusion rules

The AI has produced a raw score for each candidate, and guardrails must ensure we do not sell products below the minimum margin or without stock, and that we don't leave the UI empty if APIs fail.

Example B1: Filtering in the recommendation pipeline (Python)

Snippet: pipeline that receives candidates with scores and applies guardrails before sending to UI:

from dataclasses import dataclass

@dataclass
class Candidate:
    sku: str
    score_ai: float
    stock: int
    margin_pct: float
    forbidden: bool = False

def apply_guardrails(
    candidates: list[Candidate],
    min_margin_pct: float,
    min_score_ai: float,
    max_items: int = 10,
) -> list[Candidate]:
    # 1. filtering forbidden products, without stock or under minimum margin
    filtered = [
        c for c in candidates
        if not c.forbidden
        and c.stock > 0
        and c.margin_pct >= min_margin_pct
        and c.score_ai >= min_score_ai
    ]

    # 2. if the list is too short after filtering, fill with internal fallback (get_top_selling_products)
    if len(filtered) < max_items:
        missing = max_items - len(filtered)
        exclude_skus = {c.sku for c in filtered}
        filtered.extend(get_top_selling_products(exclude_skus=exclude_skus, limit=missing))
    # 3. sort by AI score and limit
    filtered.sort(key=lambda c: c.score_ai, reverse=True)
    return filtered[:max_items]
    
    
Notes
  • Guardrails are a distinct function in the pipeline.
  • Fallback (step 2) is just as important as filtering.
  • Ensure that get_top_selling_products() applies the same guardrails.

Example B2: Candidate quality validation at table level (SQL)

In addition to runtime filters, you can perform offline checks that detect bad recommendations (e.g., with negative margins).

Snippet: pseudocode for detecting recommendations with margin below threshold:

SELECT
  r.account_id
  , r.scenario
  , r.sku_recom
  , r.score_ai
  , p.margin_pct
  , r.created_at
FROM recommendations_log r
JOIN products p ON r.sku_recom = p.sku
WHERE 
  p.margin_pct <= 0.15 -- or a parametrizable threshold
  AND r.created_at > (NOW() - INTERVAL '1 DAY')
ORDER BY r.created_at DESC
LIMIT 100;
    
    
Notes
  • These types of queries generate a quality report in a dashboard for analysis.
  • You decide to what extent the guardrails are sufficient or more guardrails need to be defined.

Data ingestion and consistency

Without correct ingestion, bad recommendations will be generated. In addition to the latency challenge, data consistency is important. All events must have minimum details and be uniquely identified for idempotency.

Example B3: Event model (Python)

Snippet: stock update event structure, validated before the pipeline / Feature Store:

from pydantic import BaseModel, Field
from datetime import datetime
from typing import Optional

class InventoryUpdateEvent(BaseModel):
    sku: str
    new_stock: int = Field(ge=0)
    source: Optional[str] = None     # e.g. "WMS", "ERP", "Manual"
    updated_at: datetime
    idempotency_key: str = Field(min_length=8)

    # Here you can add feature-computation logic, for which a pipeline should be defined
    
    
Notes
  • Validates minimum data (new_stock is not negative) and allows ingestion in any order via updated_at.
  • In B2B, new_stock must usually be calculated as available inventory (total - reserved) or even pickable inventory (from the WMS).
  • idempotency_key ensures that during a retry/duplicate send of the same stock update, the system can process it only once.

Example B4: Upsert for stock (SQL)

Snippet: Postgres pseudocode for stock update. In BigQuery use MERGE:

INSERT INTO inventory (sku, stock, updated_at)
VALUES (:sku, :stock, :updated_at)
ON CONFLICT (sku)
DO UPDATE SET
  stock = EXCLUDED.stock
  , updated_at = EXCLUDED.updated_at
WHERE 
    inventory.updated_at IS NULL 
    OR inventory.updated_at < EXCLUDED.updated_at;
    
    
Notes
  • Idempotency can be additionally implemented with a deduplication table or a field (e.g., last_event_id).
  • You can determine if the product is eligible for recommendations (stock > 0) and mark it.
  • In case of modification, you trigger a task that invalidates the local cache for it.

Example B5: Stock consistency check (SQL)

Snippet: pseudocode for detecting differences between the official stock and the local recommendation cache:

SELECT
  i.sku
  , i.stock AS stock_official
  , c.stock_cache AS stock_recommender
FROM inventory i
JOIN recommender_cache c ON i.sku = c.sku
WHERE i.stock <> c.stock_cache;
    
    
Notes
  • Investigate where real-time ingestion is failing. E.g.: lost updates, unsent updates.
  • It is recommended that the system detects problems before they are reported by customers or agents.

Choosing the right model for the business function

Recall from Ch. 2.6.1 that there is no single magic recommendation model. In practice, you have a model for Similar Items, another for Frequently Bought Together, perhaps another for Buy it Again, plus special models.

Example B6: Simple model router (Python)

Snippet: router that decides which model to call based on context, with Feature Store variant:

from enum import Enum

class Scenario(Enum):
    SIMILAR_ITEMS = "similar_items"
    CART_ADDONS = "cart_addons"
    SUBSTITUTES = "substitutes"

# Registry mapping scenario -> model
MODEL_REGISTRY = {
    Scenario.SIMILAR_ITEMS:  "models/similar_items_v3",
    Scenario.CART_ADDONS:    "models/cart_addons_v5",
    Scenario.SUBSTITUTES:    "models/substitutes_v2",
}

# If you are using a Feature Store, you can have a separate registry for scenario service versions
FS_SERVICE_REGISTRY = {
    Scenario.SIMILAR_ITEMS:  "similar_items_service_v1",
    Scenario.CART_ADDONS:    "cart_addons_service_v2",
    Scenario.SUBSTITUTES:    "substitutes_service_v1",
}

def recommend_for_scenario(
    scenario: Scenario,
    account_id: str,
    context: dict,
) -> list[str]:
    model_path = MODEL_REGISTRY[scenario]
    
    full_context = context.copy()
    # Interaction with Feature Store (conceptual): first get details from the store, then hydrate the context
    # fs_service = FS_SERVICE_REGISTRY[scenario]
    # account_features = call_fs_to_get_features(account_id=account_id, service=fs_service)
    # full_context.update(account_features)

    # Call to a local model or a cloud endpoint (with all necessary data)
    scores = call_model(model_path, account_id=account_id, inputs=full_context)
    

    # scores is a list of (sku, ai_score), e.g., [("SKU1", 0.92), ("SKU2", 0.87)...]
    ordered = sorted(scores, key=lambda x: x[1], reverse=True)
    return [sku for sku, _ in ordered]
    
    
Notes
  • Model and service version organization is essential, and it is better to keep them decoupled. If you don't know which version is running in a scenario, you cannot explain what is happening in production.
  • Through hydration you fetch, just before model inference, the latest attributes from the Feature Store for that account_id.
  • The Feature Store logic is not shown; the contract of call_fs_to_get_features includes connection, caching, and deserialization (for minimal latency).

Example B7: Declarative config for scenarios

Snippet: scenario configuration in a declarative file (JSON or YAML), easily modifiable by an architect or product owner (see Ch. 5.2.1):

scenarios:
  similar_items:
    model: "models/similar_items_v3"
    max_items: 10

  cart_addons:
    model: "models/cart_addons_v5"
    max_items: 6

  substitutes:
    model: "models/substitutes_v2"
    max_items: 5
    
    
Notes
  • The router from the previous example can read this configuration at startup.
  • You can use Convention over Configuration principles (e.g., scenario name matches the model name, similar_items is also in the audit store name, etc.).

Bundling and dynamic packages

Bundling can occur in two places:

  • In the recommendation system: what bundle to propose for the current cart or quote.
  • In the data warehouse: which proposals we recommend becoming permanent bundles in the application.

In a production situation with millions of historical transactions, simple iteration (e.g., Python) is inefficient due to combinatorial explosion. Data clusters (e.g., Databricks, Spark) are used to search for associations like "those who buy A, also buy B with probability X."

Example B8: Calculating association rules with Spark (PySpark)

Snippet: FPGrowth (Frequent Pattern Growth) algorithm to find associations:

from pyspark.ml.fpm import FPGrowth
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# 1. Data: each row is a transaction with an array of SKUs
# schema: [order_id: string, items: array]
df_orders = spark.table("sales_history_clean")

# 2. Configure the FPGrowth algorithm
# minSupport: products appear together in X% of transactions
# minConfidence: if a customer buys A, there is X% chance they will also buy B
fp_growth = FPGrowth(itemsCol="items", minSupport=0.001, minConfidence=0.2)

# 3. Train the distributed model
model = fp_growth.fit(df_orders)

# 4. Extract association rules (antecedent -> consequent)
# schema: [antecedent: [SKU_A], consequent: [SKU_B], confidence: 0.85, lift: 2.4]
association_rules = model.associationRules

# Save to a temporary table for further processing
association_rules.createOrReplaceTempView("raw_rules")
    
    
Notes
  • This code runs distributed on a Spark cluster (for an Enterprise implementation) and processes history quickly.
  • The raw result must be refined with business rules (guardrails) later.
  • In pre-processing, overly common products that act as statistical noise should be removed: packaging, consumables, or shipping if it has an SKU in the ERP.
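The noise-removal note can be sketched in plain Python (in production this would be a Spark pre-processing step); the threshold is an assumption to tune per catalog:

```python
from collections import Counter

def drop_noise_skus(transactions: list[list[str]], max_share: float = 0.05) -> list[list[str]]:
    """Remove SKUs that appear in more than max_share of all transactions
    (packaging, consumables, shipping lines) before association mining,
    so they do not dominate the FPGrowth rules."""
    n = len(transactions)
    counts = Counter(sku for items in transactions for sku in set(items))
    noise = {sku for sku, c in counts.items() if c / n > max_share}
    return [[sku for sku in items if sku not in noise] for items in transactions]
```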

Example B9: Refining associated bundles by margin and stock (Spark SQL)

Once associations exist, the recommendation system applies its logic only after business verifications. We exemplify a complex case using SQL over Spark:

  • There is an association table between bundles (antecedent), which can have multiple products, and a single recommended product (consequent).
  • We verify that products haven't been deleted, have prices, and that the associated products (consequent) have stock > 10.
  • We calculate the total margin of the association (antecedent + consequent).
  • Among bundles with positive total margin, we favor those with a product in stock for over 120 days.
Snippet: sorting bundles by margin and old stock with consistency checks (Hive/Spark):

WITH rule_financials AS (
    SELECT 
        r.antecedent
        , r.consequent
        , r.confidence
        -- COALESCE to avoid NULL
        , SUM(COALESCE(p_ant.margin_eur, 0)) as antecedent_total_margin        
        -- Check if all products in the antecedent are found
        , COUNT(p_ant.sku) as found_ant_products
        , SIZE(r.antecedent) as expected_ant_products
        , MAX(COALESCE(p_cons.margin_eur, 0)) as consequent_margin
        , MAX(p_cons.days_in_inventory) as cons_days_inv
    FROM raw_rules r
    LATERAL VIEW explode(r.antecedent) a AS ant_sku
    -- JOIN for antecedent
    JOIN products p_ant ON a.ant_sku = p_ant.sku  
    -- JOIN for consequent with direct stock filter > 10 
    JOIN products p_cons ON r.consequent[0] = p_cons.sku AND p_cons.stock_quantity > 10
    GROUP BY 
        r.antecedent
        , r.consequent
        , r.confidence
)

SELECT
    antecedent as current_basket
    , consequent[0] as recommended_upsell
    , confidence
    , (antecedent_total_margin + consequent_margin) as total_bundle_margin
    -- Score calculation
    , (confidence 
        * (antecedent_total_margin + consequent_margin) 
        * (CASE WHEN cons_days_inv > 120 THEN 1.5 ELSE 1.0 END)
        ) as priority_score
FROM rule_financials
WHERE 
    -- in the JOIN, no products from the antecedent were lost
    found_ant_products = expected_ant_products
    -- eliminate negative or zero margins (recommended)
    AND (antecedent_total_margin + consequent_margin) > 0
ORDER BY 
    priority_score DESC;
    
    
Notes

  • The 1.5 multiplier for stock older than 120 days is a commercial policy parameter; adjust it (and the thresholds) to your own inventory strategy.
  • The found_ant_products = expected_ant_products check guards against silently keeping bundles whose antecedent products were deleted from the catalog.
Example B10: Applying bundles at the time of quoting (Python)

When an agent builds a quote, the system can automatically propose bundles based on precalculated bundle lists:

Snippet: pseudocode assigning precalculated bundles to current products:

def suggest_bundles_for_quote(
    current_skus: list[str],
    bundles: dict[tuple[str, str], int],
    max_suggestions: int = 5,
) -> list[tuple[str, str]]:
    suggestions: list[tuple[str, str]] = []
    seen: set[tuple[str, str]] = set()
    current = set(current_skus)

    for sku in current_skus:
        for (a, b), _count in bundles.items():
            pair = None

            if sku == a and b not in current:
                pair = (a, b)
            elif sku == b and a not in current:
                pair = (b, a)

            if pair and pair not in seen:
                seen.add(pair)
                suggestions.append(pair)

                if len(suggestions) >= max_suggestions:
                    break

        if len(suggestions) >= max_suggestions:
            break

    return suggestions
    
    
Notes
  • The company no longer depends on agent memory (Gen 0) to propose cross-sells. AI identifies patterns, and post-processing rules validate and order bundles according to commercial policy.
  • In production, you can index bundles by SKU (hashmap) to avoid loops.
  • In production, data pipelines and retries must be monitored 24/7.

In addition to coding, in the Build approach, server infrastructure, data pipelines, and their monitoring are essential.

Case Study: AI-Ready Infrastructure (web, email, sales)

In the next section, we see how Google takes over part of the work by configuring the Managed approach.

CONFIGURATION EXAMPLES
4.3

Challenges and solutions in the Managed (Cloud Native) version

We present configuration examples for a Managed implementation. They can be read in comparison with the previous section. The snippets illustrate the business challenges from Ch. 2.5: guardrails and data contract, as well as the technical challenges from Ch. 2.6: business function models and bundling.

Key elements:

  1. Real-time events: userEvents API.
  2. Business rules (guardrails): Serving Controls (Boost / Bury / Filter, Filter being used only in Search mode), including site-wide variants (per entire application).
  3. Bundles and commercial strategies: declarative JSON configurations and the selected objective for the model (CTR / CVR / Revenue).
  4. Testing: distinct ServingConfigs and A/B testing from the dashboard via Experiments.

In applications that also use generative components (e.g., result summarization or chatbot assistants like copilots), prompts are built based on the same fields and rules. Context engineering (products, events, customers, rules) is the first line of control.

We define the configuration and let the service handle scaling, training, and monitoring.

Note: Code examples are conceptual; they require adaptation to the specific production environment. Always check the official documentation for the exact keys and fields.

Guardrails and ServingConfigs selection for an insertion point in the application

In the Build version, we separated which model we call (scenario / router) from which rules we apply (guardrails). In the Managed version, the same logic is expressed through:

  • ServingConfigs (placement parameter in the API): used for any insertion point (homepage_recs, cart_upsell).
  • filters and params in the prediction request.
  • possibly, reusable Serving Controls (Boost/Bury/Filter) for search.

Example M1: Prediction request for upsell with filters (Python)

Snippet: Python code requesting complementary products for items in the cart, only from the non-hobby range and in stock

from google.cloud import retail_v2

project_id = "PROJECT_ID"
catalog_id = "default_catalog"
serving_config_id = "cart_upsell"
# fill in
account_id = "HASH_ID_ERP_CUSTOMER"
user_id = "HASH_ID_ERP_CUSTOMER_REP"
device_id = "HASH_ID_UNIQUE_DEVICE"

serving_config = (
    f"projects/{project_id}/locations/global/catalogs/{catalog_id}"
    f"/servingConfigs/{serving_config_id}"
)

client = retail_v2.PredictionServiceClient()

user_event = retail_v2.UserEvent(
    event_type="shopping-cart-page-view",
    visitor_id=device_id,
    user_info=retail_v2.UserInfo(user_id=user_id),
    product_details=[
        # The SKUs from the cart, including quantity
        retail_v2.ProductDetail(product=retail_v2.Product(id="SKU-SCREW-36"), quantity=1),
        retail_v2.ProductDetail(product=retail_v2.Product(id="SKU-PIN-35"), quantity=2)
    ]
)
# Custom attributes are essential in B2B
user_event.attributes["account_id"] = retail_v2.CustomAttribute(text=[account_id])

request = retail_v2.PredictRequest(
    placement=serving_config,
    user_event=user_event,
    page_size=10, # maximum number of recommendations
    filter='(availability: ANY("IN_STOCK")) AND NOT (categories: ANY("Hobby-DIY"))', # business rules
    params={
        "filterSyntaxV2": True, # allows advanced filters
        "strictFiltering": True, # respects written filters, no fallback
        "diversityLevel": "medium-diversity",
        "returnProduct": True, # returns product metadata
        "returnScore": True, # returns AI score
        "priceRerankLevel": "low-price-reranking", # slightly prioritizes higher-priced products (not profit)
    },
)

response = client.predict(request=request)
    
    
Notes

Guardrails are expressed through:

  • filter expressions, written according to the official filter syntax documentation.
  • setting diversityLevel to avoid monotony in the list. Beware: diversityLevel combined with added filters can drastically reduce the number of results.
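One way to handle the trade-off between strictFiltering and result count is a fallback retry with a relaxed filter. This is a sketch only: predict_fn is a hypothetical wrapper around the client.predict call from Example M1, and whether (and which) guardrails may be relaxed is a commercial decision.

```python
from typing import Callable

def predict_with_fallback(
    predict_fn: Callable[[str], list[str]],  # hypothetical wrapper around client.predict
    strict_filter: str,
    relaxed_filter: str,
    min_results: int = 3,
) -> list[str]:
    """Try the strict business filter first; if strictFiltering plus
    diversity leaves too few results, retry with a relaxed filter."""
    results = predict_fn(strict_filter)
    if len(results) >= min_results:
        return results
    return predict_fn(relaxed_filter)
```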

Example M2: Different ServingConfigs for different scenarios

Instead of a custom model router, we use different ServingConfigs for scenarios:

  • projects/.../servingConfigs/homepage_recs
  • projects/.../servingConfigs/cart_upsell
  • projects/.../servingConfigs/product_page_similar
Snippet: Python code calling the appropriate ServingConfig and receiving product IDs

from typing import Any, Optional
from google.cloud import retail_v2


def recommended_for_placement(
    user_event: retail_v2.UserEvent,  # see Example M1
    serving_config_id: str,  # Ex: "homepage_recs", "cart_upsell"
    project_id: str,
    catalog_id: str,         # Ex: "default_catalog"
    page_size: int = 10,
    filter_expr: Optional[str] = None,
    params: Optional[dict[str, Any]] = None,
) -> list[str]:
    client = retail_v2.PredictionServiceClient()

    serving_config = (
        f"projects/{project_id}/locations/global/catalogs/{catalog_id}/"
        f"servingConfigs/{serving_config_id}"
    )
    # we build the base request
    request = retail_v2.PredictRequest(
        placement=serving_config,
        user_event=user_event,
        page_size=page_size,  # maximum number of recommendations
        params={
            "filterSyntaxV2": True,
            "returnProduct": True,
            "returnScore": True,
        },
    )

    if filter_expr:
        request.filter = filter_expr

    if params:
        request.params.update(params)

    response = client.predict(request=request)
    
    # Recommended product IDs
    return [result.id for result in response.results]
    
    
Notes
  • You no longer use "model_X_vY.pkl"; scenarios are configured at the insertion point level (placement), and the code can remain almost unchanged.
  • Behind the placement, the application specifies a ServingConfig to the Google API.
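The "router" that replaces the Build-version model selection can be as small as a scenario-to-placement map; the scenario names below are illustrative:

```python
# Illustrative mapping from application scenario to ServingConfig ID.
SCENARIO_TO_SERVING_CONFIG = {
    "homepage": "homepage_recs",
    "cart": "cart_upsell",
    "product_page": "product_page_similar",
}

def serving_config_for(scenario: str) -> str:
    # Fall back to a safe default placement instead of failing the page.
    return SCENARIO_TO_SERVING_CONFIG.get(scenario, "homepage_recs")
```

The returned ID is then passed as serving_config_id to the function from Example M2; model selection itself happens in the Google configuration behind each placement.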

How we capture user intent: real-time events

The first implementation step is the same as in the Build version: the catalog is loaded via batch feeds. But the critical difference is capturing real-time events via API. We use Vertex AI Search for commerce as the central point for userEvents with a specific B2B nuance.

Example M3: detail-page-view event (JavaScript)

When a B2B user intends to buy a product, we create a "product page view" payload that can be sent via GTM/pixel or backend.

The unit at which behavior is aggregated for tracking (e.g.: company, account, contract) must be decided:
  • Google documentation recommends visitorId uniqueness per device and userId uniqueness per user (e.g.: customer representative).
  • There are attributes (features) that can be sent in UserEvent.attributes that will help train the model. Example M1 uses this method to send account_id.
  • You can experiment with treating the client account as a single user (userId = account_id). In this case, the history of different client representatives will be merged under the same userId. We exemplify below.
  • In any case, the attributionToken field is used when a previous search/recommendation was received; otherwise, the model will not learn the correlation between conversion and prediction.
Snippet: payload for detail-page-view

let account_id = "HASH_ID_ERP_CUSTOMER";
let device_id = "HASH_ID_UNIQUE_DEVICE";
                            
const userEvent = {
  eventType: "detail-page-view",
  visitorId: device_id,
  userInfo: { userId: account_id },
  productDetails: [{ product: { id: "SKU-PIN-35" }, quantity: 1 }],
  eventTime: new Date().toISOString(),
};

// for downstream events after a previous predict/search
userEvent.attributionToken = attributionTokenFromPredict;
    
    

Example M4: Writing the event using the Python client

Snippet: Server-side (or a dedicated worker) sends the event via API

from typing import Any
from google.cloud import retail_v2
from google.protobuf.json_format import ParseDict

def write_event_from_json(
    user_event_json: dict[str, Any], # See Example M3
    project_id: str,
    catalog_id: str, # Ex: "default_catalog"    
) -> retail_v2.UserEvent:
    
    client = retail_v2.UserEventServiceClient()

    parent = f"projects/{project_id}/locations/global/catalogs/{catalog_id}"

    # ParseDict takes camelCase keys and maps them to proto fields
    user_event = ParseDict(user_event_json, retail_v2.UserEvent())

    request = retail_v2.WriteUserEventRequest(
        parent=parent,
        user_event=user_event,
    )

    return client.write_user_event(request=request)
    
    
Notes
  • You don't need to build the machine learning infrastructure. But you need a robust delivery mechanism for events via API.
  • Based on the data you send, the Google managed service learns according to the configuration.
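A minimal sketch of such a delivery mechanism, assuming a send_fn that wraps write_event_from_json above. In production you would retry only errors the API marks as retryable and add dead-letter handling:

```python
import time
from typing import Any, Callable

def deliver_with_retry(
    send_fn: Callable[[dict[str, Any]], Any],  # e.g. wraps write_event_from_json
    event: dict[str, Any],
    max_attempts: int = 4,
    base_delay: float = 0.5,
) -> Any:
    """Retry transient failures with exponential backoff; losing an event
    hurts the model more than an occasional duplicate write."""
    last_error: Exception | None = None
    for attempt in range(max_attempts):
        try:
            return send_fn(event)
        except Exception as exc:  # production: catch only retryable errors
            last_error = exc
            time.sleep(base_delay * (2 ** attempt))
    raise last_error
```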

Data ingestion, consistency, and experimentation

Part of the data work is handled by the service through catalog (and stock) feeds, userEvents for behavior, ServingConfig settings, and A/B testing from the dashboard.

Example M5: Stock update payload (check current documentation)

Snippet: product returns to stock, prepare a catalog update

{
  "id": "SKU-SCREW-36",
  "title": "Screw RTE 6x60 mm",
  "availability": "IN_STOCK", 
  "priceInfo": {
    "price": 45.90,
    "currencyCode": "USD"
  },
  "fulfillmentInfo": [
    {
      "type": "pickup-in-store",
      "placeIds": ["store_london_1"]
    }
  ]
}

    
    

After submission, the service will handle propagating the stock in the index (the availability update is near real-time).
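Note that the update endpoint addresses the product by its full resource name, while the payload above carries only the short id. A small helper makes the convention explicit; the default_branch alias is an assumption here, so check the current documentation for the exact branch naming:

```python
def product_resource_name(
    project_id: str,
    catalog_id: str,
    product_id: str,
    branch_id: str = "default_branch",  # assumption: alias for the default branch
) -> str:
    """Build the full resource name the product update endpoint expects."""
    return (
        f"projects/{project_id}/locations/global/catalogs/{catalog_id}"
        f"/branches/{branch_id}/products/{product_id}"
    )
```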

Example M6: A/B testing at the ServingConfig level

Vertex AI Search for commerce offers support in the dashboard for experiments.

Technical separation of configurations:

servingConfigs/homepage_recs_v1 (Base A with controls or objectives),

servingConfigs/homepage_recs_v2 (Base B with different controls or objectives).

Instead of building testing or reporting logic, we focus on:

  • defining the feeds correctly (catalog, user, events),
  • configuring the ServingConfig (model with objective, controls, diversity),
  • configuring the Serving Controls: Boost / Bury for search and recommendations, Filter only for search,
  • maintaining the attributionToken in events to correlate conversions with predictions, and adding special fields to server-side UserEvents (e.g.: experimentIds, check current documentation).
See Guide #5 Continuous Optimization and Reporting

Commercial strategies and bundling

In the Build version, the logic for bundling, liquidating stock, and prioritizing margins was written in code. In the Cloud Native version, many decisions are expressed declaratively: through a ServingConfig object and through attached Serving Controls (Boost/Bury/Filter).

Example M7: ServingConfig config for search (check current documentation)

Snippet: ServingConfig configuration for homepage searches, can use reusable controls

{
  "displayName": "Home Products",
  "solutionTypes": ["SOLUTION_TYPE_SEARCH"], // will be used for search
  "boostControlIds": [
    "margin_boost" // see control in Example M9
  ]
}
    
    

Example M8: ServingConfig config for recommendations (check current documentation)

Snippet: ServingConfig configuration for recommendations, does not use reusable controls, can use a price strategy

{
  "displayName": "Upsell Recommendations",
  "modelId": "upsell_model_v1", // name of the trained model (with the selected objective)
  "solutionTypes": ["SOLUTION_TYPE_RECOMMENDATION"], // will be used for recommendations
  "priceRerankingLevel": "low-price-reranking", // slightly prioritize higher-priced products (not profit) 
  "diversityLevel": "high-diversity" // recommended here, not in Python call
}
    
    

Example M9: Global Boost Search Control for margin (check current documentation)

Snippet: control that pushes products with good margins higher in search

{
  "displayName": "Margin boost",
  "solutionTypes": ["SOLUTION_TYPE_SEARCH"],
  "rule": {
    "condition": { }, // applies to all searches
    "boostAction": {
      "boost": 0.3, // value between -1 and 1, below 0 is Bury (demotion)
      "productsFilter": "attributes.margin_pct: IN(0.2i, *)" // margin greater than or equal to 20%, attribute set as indexable, i means inclusive
    }
  }
}
    

Example M10: Dynamic bundle suggestion in cart (Python)

Snippet: function that retrieves active bundles (already added to the catalog) for a product in the cart.

from google.cloud import retail_v2
from collections.abc import Iterable

def recommend_bundles_for_cart_item(
    item_sku: str,
    cart_skus: Iterable[str],
    device_id: str, #HASH_ID_UNIQUE_DEVICE
    account_id: str, #HASH_ID_ERP_CUSTOMER (company as user)
    project_id: str = "PROJECT_ID",
    catalog_id: str = "default_catalog",
    serving_config_id: str = "bundle_recs",
    page_size: int = 6 
) -> list[str]:
   
    
    # 1. Create user-event, using the variant company as user
    user_event = retail_v2.UserEvent(
        event_type="shopping-cart-page-view",
        visitor_id=device_id,
        user_info=retail_v2.UserInfo(user_id=account_id),
        product_details=[
            retail_v2.ProductDetail(product=retail_v2.Product(id=str(sku)), quantity=1)
            for sku in cart_skus
        ],
    )

    # 2. ANY("...") for exact match, attributes are of type text and set as filterable
    filter_expr = (
        f'(availability: ANY("IN_STOCK")) AND '
        f'(attributes.is_bundle: ANY("true")) AND '
        f'(attributes.bundle_component_skus: ANY("{item_sku}"))'
    )
    
    # 3. See Example M2
    ids = recommended_for_placement(
        user_event=user_event,    
        serving_config_id=serving_config_id,
        project_id=project_id,
        catalog_id=catalog_id,
        page_size=page_size,
        filter_expr=filter_expr
    )

    return ids
Notes
  • Use solutionTypes (plural) in the definition of reusable Serving Controls.
  • Similar to Example M9, you can send out-of-stock products to the bottom of the search results with a Bury (negative boost).
  • Example M10 shows the data engineering required to suggest bundles in the cart. The bundles were created in the application and sent to Vertex AI Search for commerce with the custom attributes is_bundle and bundle_component_skus; the AI then recommends them for the current basket according to the optimized objective.
  • In the privacy-first architecture from Ch. 2.4.4., not all data will be sent to the cloud, so some logic must be moved from ServingConfigs to middleware (application).
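The last note can be sketched as a middleware post-filter. All names below are illustrative; the point is only that the permission and category data never leaves the company:

```python
def apply_local_guardrails(
    recommended_skus: list[str],      # product IDs returned by the cloud service
    allowed_skus: set[str],           # on-premise contract/permission store
    category_by_sku: dict[str, str],  # local category data not sent to the cloud
    blocked_categories: set[str],
) -> list[str]:
    """Post-filter cloud recommendations against data that never left
    the company, preserving the AI ordering of the survivors."""
    return [
        sku for sku in recommended_skus
        if sku in allowed_skus
        and category_by_sku.get(sku) not in blocked_categories
    ]
```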

Case Study: Legacy ERP-Vertex AI Search Integration for Distribution

The bottom line:

A recommendation system can be implemented either in the Build (custom) version, or in the Managed (Cloud Native) version, with a trade-off between low-level control and implementation plus maintenance speed:

  • In the Build version, rules are code (e.g.: Python, Java, Go).
  • In the Managed version, rules generally move from code into declarative configurations and data engineering.
  • In the Managed version, the model is managed by Google, and developers focus on data synchronization, defining guardrails, and potential extensions (e.g., temporal forecasts discussed in Ch. 2.1.4).

If you are evaluating the possibility of implementing one of the two versions in-house, you can follow the project steps recommended above. The Build version is recommended if you have an ML development team and your own data. The Managed version is recommended for faster delivery.

See the cost matrix and our methodology in Ch. 5

Do you need a technical validation before implementation? OPTI Software offers a free audit.

Continue Exploring

Chapter 1
Why AI?
The problem: "The ERP is not a sales engine". `Brownfield` context and the retirement cliff crisis.
Read the Business Case
Chapter 2
How does it work?
Recommendation generations and deep learning. Goals and KPIs. Build vs. managed. Guardrails and the data contract.
See the Technology
Chapter 3
What can be built?
Deliverables built for B2B sales.
Agent copilots
Smart bundles
Substitutes
Re-order
See all Deliverables
Chapter 5
When and with what resources?
Team structure, required resources, budget, and Google Cloud assurances. Plus: our methodology.
See the Cost Structure
AI News
Future: Agentic AI
UCP launch in Jan 2026, new AI technologies maturing, and how companies can adapt.
See AI Updates
Resources
Resources & Glossary
Glossary (AI, business, software) + bibliography, whitepapers, and useful links from the guide.
See Resources
Gallery
Choose an AI topic
Explore role-based resources (CEO, Business, etc.) in a thematic gallery
Choose Role & Topic

Quick Questions

What does a Build implementation mean in practice?

Full control over data pipelines, rules, and decision logic, using technologies such as SQL, Spark, and custom code for ranking and guardrails.

Why is idempotency important for events?

Because stock and price updates can be sent multiple times. Without idempotency, systems quickly drift into inconsistent states.

How is bundling implemented correctly in B2B?

Bundle availability is determined by the limiting component, relative to the quantity required. Any other approach leads to unrealistic commercial promises.
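In Python, the rule stated above is essentially one line; the dictionaries are illustrative:

```python
def bundle_availability(
    stock: dict[str, int],     # current stock per component SKU
    required: dict[str, int],  # quantity of each component per bundle unit
) -> int:
    """Sellable bundle units = the limiting component, relative to the
    quantity each bundle unit requires. Missing components count as zero."""
    return min(stock.get(sku, 0) // qty for sku, qty in required.items())
```

For example, 100 screws (4 per bundle) and 30 pins (2 per bundle) allow min(25, 15) = 15 bundles.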

What should remain in a Managed system?

Retrieval, pre-trained models, and scalable infrastructure. Critical commercial logic usually remains in middleware.

How do you avoid technical lock-in?

By clearly separating data, rules, and models, and keeping critical logic exportable.

What is the TLDR (conclusion)?

This chapter shows how the system is actually built: which events to send, how to avoid duplicates, keep stock consistent, and enforce guardrails so recommendations stay useful and auditable.

What technologies and methodologies are involved?

Technologies: Python, SQL, JavaScript, Postgres, MySQL, BigQuery, Spark, Databricks, Vertex AI Search, Vertex AI Search for commerce, Retail API, Cloud Logging, Cloud Monitoring, Pub/Sub, Dataflow, Cloud Run
Methodologies: event-driven architecture, idempotency keys, deduplication, UPSERT/MERGE, just-in-time validation (stock/price/permissions), market basket analysis (FP-Growth), offline scoring (NDCG/Recall), A/B testing, shadow testing, observability and reason codes

Sign up for the complete PDF manual

The reference manual will also be available in a complete PDF version. You will receive the PDF via email 48 hours before the official launch (approx. March 19, 2026). OPTI Software sends exclusive Tech & Biz news once a month.

Interested?

Schedule a meeting

Get a Free Audit
