
DATA DO – データ道


DataScientists: a blog about everything data-related.

From Notebook to Production: Building End-to-End ML Pipelines with Kubeflow, KServe, and Fractional GPU Sharing

Transitioning a machine learning model from an experimental Jupyter Notebook to a highly available, auto-scaling production endpoint is rarely a linear path. In an enterprise environment, this migration introduces severe operational friction—not just in terms of rewriting code, but in handling infrastructure efficiency. With modern workloads demanding massive compute resources, assigning an entire enterprise-grade GPU…

July 27, 2026
Start Fresh, Don’t Lift and Shift: Scaling Analytics Platforms with dbt-core and PostgreSQL

We observed that executing a “lift and shift” of legacy, sprawling SQL scripts onto an enterprise cloud data warehouse fails to resolve core structural data issues. It transitions architectural technical debt into a variable, unconstrained operational expense. Moving unoptimized queries onto infinite-compute cloud platforms masks underlying engineering deficiencies rather than fixing them. We reject this…

July 9, 2026
PostgreSQL Data Mesh: A Technical Guide to Schema Segmentation, Boundaries, and Governance

We deploy PostgreSQL natively to execute a decentralized data mesh architecture, proving that multimillion-dollar cloud platforms and proprietary vendor ecosystems are infrastructure bloat. By utilizing open-source database primitives, we eliminate dependencies on specific tech conglomerates and cloud provider pricing models. We enforce domain boundaries, query allocations, and data product contracts directly through the PostgreSQL…

July 3, 2026
Deterministic RAG Auditing: Implementing Verifiable Grounding & Lineage on Unified PostgreSQL

The pervasive “lost in the middle” phenomenon is a failure of semantic retrieval, not just context window capacity. While increasing token limits is tempting, standard Retrieval-Augmented Generation (RAG) pipelines depend on isolated chunk embeddings and generic vector similarity. As a result, they frequently bury critical technical dependencies deep within long prompts. If a system cannot…

June 26, 2026
Beating “Lost in the Middle”: Unified Graph RAG on PostgreSQL

Our evaluation shows that by substituting naive chunk-based vector lookups with relationally injected context, the model’s $F_1$ verification score increased from $0.61$ to $0.89$. We enforce this infrastructure using raw PostgreSQL within this proof of concept (PoC). The core engineering win of this implementation is the consolidation of the storage footprint: we completely discard specialized,…

June 19, 2026
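
For reference, and assuming the verification score quoted in the excerpt above is the standard $F_1$ measure (the excerpt does not define it), $F_1$ is the harmonic mean of precision $P$ and recall $R$:

$$F_1 = \frac{2PR}{P + R}, \qquad P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}$$

Here $TP$, $FP$, and $FN$ are the counts of true positives, false positives, and false negatives. Under that reading, a move from $0.61$ to $0.89$ means precision and recall improved together, since the harmonic mean is pulled toward the lower of the two.
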
RAG Context Pruning for Efficiency and Cost Optimization

After baseline production runs across our clients’ financial discovery pipelines, we observed an increase in Time-to-First-Token (TTFT) when retrieved context exceeded 2,500 tokens. Furthermore, the system’s retrieval accuracy score decayed when the target information was located in the middle 40% of the injected payload. We addressed this bottleneck by deploying an inline sentence-level extractive context…

June 3, 2026
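
The excerpt above is cut off before it describes how the pruning step works, so the following is only a minimal sketch of what sentence-level extractive pruning can look like, not the post's implementation. It assumes lexical overlap with the query as the relevance score and a whitespace-style word count as a stand-in for tokenizer tokens; `prune_context` and its parameters are hypothetical names.

```python
import re


def prune_context(query: str, context: str, max_tokens: int = 2500) -> str:
    """Keep the sentences most similar to the query within a token budget.

    Assumption: relevance is plain word overlap and a "token" is one word.
    A production system would use a real tokenizer and a stronger scorer.
    """

    def tokens(text: str) -> list[str]:
        return re.findall(r"\w+", text.lower())

    query_terms = set(tokens(query))
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]

    # Score each sentence by the fraction of query terms it contains.
    scored = []
    for index, sentence in enumerate(sentences):
        overlap = set(tokens(sentence)) & query_terms
        score = len(overlap) / max(len(query_terms), 1)
        scored.append((score, index, sentence))

    # Greedily keep the best-scoring sentences until the budget is spent.
    kept = []
    budget = max_tokens
    for score, index, sentence in sorted(scored, key=lambda item: (-item[0], item[1])):
        cost = len(tokens(sentence))
        if score > 0 and cost <= budget:
            kept.append((index, sentence))
            budget -= cost

    # Restore the original order so the pruned text still reads naturally.
    return " ".join(sentence for _, sentence in sorted(kept))
```

Called with the query `"invoice total due"` and a context that mixes two invoice sentences with an unrelated one, the function drops the unrelated sentence and returns the other two in their original order. Lowering `max_tokens` trades recall for a shorter prompt.
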
Production-Grade Compliance: Engineering the EU AI Act into Sovereign Agentic Pipelines

We measured a 42% increase in inference latency when we shifted from standard RAG to a cryptographically verifiable audit chain. We accept this overhead. After 2,000 simulated audit requests, we verified that any response lacking a signed Model_Hash and Data_Snapshot_ID could be purged within 150ms, effectively hardening the system against the “Black Box” failure modes targeted…

May 21, 2026
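
The excerpt above mentions purging responses that lack a signed `Model_Hash` and `Data_Snapshot_ID`, but does not say how the signature is produced. As a rough sketch only, the check could be built on an HMAC over the serialized record. HMAC-SHA256, the `signature` field, and the function names below are assumptions made for illustration, not details taken from the post.

```python
import hashlib
import hmac
import json

# Field names come from the excerpt; everything else here is hypothetical.
REQUIRED_FIELDS = ("Model_Hash", "Data_Snapshot_ID")


def _digest(record: dict, key: bytes) -> str:
    # Serialize with sorted keys so the same record always signs identically.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def sign_response(record: dict, key: bytes) -> dict:
    """Return a copy of the record with an HMAC-SHA256 signature attached."""
    return {**record, "signature": _digest(record, key)}


def is_auditable(signed: dict, key: bytes) -> bool:
    """True only if both lineage fields are present and the signature matches."""
    if any(not signed.get(field) for field in REQUIRED_FIELDS):
        return False
    record = {k: v for k, v in signed.items() if k != "signature"}
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(_digest(record, key), str(signed.get("signature", "")))


def purge_unverifiable(responses: list[dict], key: bytes) -> list[dict]:
    """Drop every response that cannot be verified."""
    return [response for response in responses if is_auditable(response, key)]
```

A response that is missing either field, or whose content was changed after signing, fails `is_auditable` and is removed by `purge_unverifiable`. A shared-secret HMAC is the simplest choice for a sketch; an asymmetric signature would let auditors verify without holding the signing key.
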
Unified Graph-RAG in a Single Postgres Engine

Our production benchmarks confirm that consolidating Hybrid Graph-RAG into a single PostgreSQL instance via pgvector and Apache AGE reduced cross-service network latency and eliminated the consistency lag inherent in multi-database synchronization. The Unified Postgres Architecture: We enforce a unified data layer by storing vector embeddings and graph property data within the same relational clusters. This…

May 13, 2026
Production Metric: 14.2% Semantic Decay

After processing 2.8 million unstructured retail fragments, we observed that 14.2% of records passing traditional NOT NULL and regex constraints contained semantic noise, specifically CAPTCHA text, “out of stock” redirects, and promotional modals, that poisoned downstream RAG embeddings. We enforced a deterministic quality gate using PydanticAI and a sovereign vLLM cluster, which suppressed these failures…

May 6, 2026
Cost-Aware Agentic Workflows with PydanticAI

Introduction: The Hidden Price of Autonomy. The Architecture of a Cost Guardrail. Implementing Usage Limits with PydanticAI: PydanticAI provides the primary library-level enforcement mechanism through its UsageLimits class. Real-Time Cost Tracking with LiteLLM: While PydanticAI manages counts, LiteLLM converts those counts to dollars. Detailed HITL Workflow: The Slack Intervention. For an SMB, a simple notification…

April 29, 2026
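
The split described in the excerpt above, where one component counts usage and another converts counts to dollars, can be illustrated without either library. The sketch below is plain Python and deliberately does not use PydanticAI's `UsageLimits` or LiteLLM; the `CostGuardrail` class, its fields, and the per-token prices are hypothetical and only show the shape of the idea.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a run goes past its request or dollar budget."""


@dataclass
class CostGuardrail:
    max_requests: int
    max_cost_usd: float
    usd_per_1k_input: float   # assumed price, not a real model's rate
    usd_per_1k_output: float  # assumed price, not a real model's rate
    requests: int = 0
    cost_usd: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add one model call to the running totals and enforce both limits."""
        self.requests += 1
        self.cost_usd += (input_tokens / 1000) * self.usd_per_1k_input
        self.cost_usd += (output_tokens / 1000) * self.usd_per_1k_output
        if self.requests > self.max_requests:
            raise BudgetExceeded(f"request limit of {self.max_requests} exceeded")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit of ${self.max_cost_usd:.4f} exceeded")
        return self.cost_usd
```

An agent loop would call `record` after every model response and treat `BudgetExceeded` as the point to pause and ask a human, which is where a notification step such as the one the excerpt mentions would attach.
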

