How AlloyDB transformed Bayer’s data operations

🔖 Topics: Data architecture

🏢 Organizations: Bayer, AlloyDB, Google


Migrating to AlloyDB has been transformative for our business. In our previous PostgreSQL setup, the primary writer was responsible both for handling write operations and for replicating those changes to the reader nodes. The anticipated increase in write traffic and reader count would have overwhelmed this node, creating bottlenecks and increasing replication lag. AlloyDB’s architecture, which uses a single source of truth for all nodes, significantly reduced the impact of scaling read traffic. After migrating, we saw a dramatic improvement in performance, which lets us meet growing demand while keeping replication delay consistently low. In parallel load tests, a smaller AlloyDB instance reduced response times by over 50% on average and increased throughput by 5x compared to our previous PostgreSQL solution.
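The scaling argument above can be illustrated with a toy model. This is only a sketch: the function names and the figure of 1,000 writes per second are assumptions chosen for illustration, not measurements from our workload, and real replication cost also depends on log volume, network, and configuration.

```python
def primary_replication_sends(writes_per_sec: int, readers: int) -> int:
    """Toy model (assumption): the primary ships every change to each reader itself."""
    return writes_per_sec * readers


def shared_storage_writes(writes_per_sec: int, readers: int) -> int:
    """Toy model (assumption): the writer records each change once in a shared
    storage layer, so its work does not grow with the number of readers."""
    return writes_per_sec


# Hypothetical load of 1,000 writes per second with a growing reader count.
for readers in (2, 8, 32):
    fan_out = primary_replication_sends(1000, readers)
    shared = shared_storage_writes(1000, readers)
    print(f"readers={readers:>2}  primary fan-out={fan_out:>6}/s  shared storage={shared}/s")
```

In this simplified view, the work on the primary grows with every reader added, while the work on a writer backed by a shared source of truth stays flat.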

By migrating to AlloyDB, we’ve ensured that our business growth won’t be hindered by database limitations, allowing us to focus on innovation. The true test of our migration came during our first peak harvest season, a time when performance is critical for product decision timelines. Due to agriculture’s seasonal nature, a delay of just a few days can postpone a product launch by an entire year. Our customers were understandably nervous, but thanks to Google Cloud and AlloyDB, the harvest season went as smoothly as we could have hoped.

To support our data strategy, we have adopted a consistent architecture across our Google Cloud projects. For a typical project, the stack consists of pods hosted on Google Kubernetes Engine (GKE) and pipelines for publishing events and analytics data. While Bayer uses Apache Kafka across teams and cloud providers for data streaming, individual teams regularly use Pub/Sub internally for messaging and event-driven architectures. Data for analytics and reporting is generally stored in BigQuery, with custom processes for materialization once it lands. By using cross-project BigQuery datasets, we can work with a larger, real-time user group and enhance our operational capabilities.

Read more at Google Cloud Blog