Case Study: Over-Configurability in a (huuuuge) eCommerce System

Author(s): Gernot, Christine

A huge organization developed and operated a system for selling special kinds of goods to its customers.

What is the system’s background?

To understand the system and its properties, one needs to know a bit about the goods the system deals with: They

The system started as a collection of mainframe programs, some of which are still in operation and in heavy use, especially those integrating enterprises and data from other countries (!). As the organization is extremely risk-aware, it operated more than 6 (!) test stages for the system. At least 5 of these stages are responsible for certain parameters of product configuration, and thus need to test how their parameters influence the overall pricing and availability of the products. Sounds complicated? Yeah, it really is…

In production environments, all these configurations culminate in “data-driven pricing”, where the data resides in a plethora of different formats and sources: some configuration is stored in mainframe databases, some in relational stores, and some in configuration files (!).
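To picture why this is hard to reason about, consider that the effective value of a single pricing parameter depends on which source wins. The following is a minimal sketch only: the source names, the parameter names, and the "later source overrides earlier" precedence rule are all assumptions made for illustration, not details of the actual system. It merges parameters from several sources and records where each effective value came from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resolved:
    value: object   # the effective parameter value
    source: str     # name of the source that supplied it


def resolve(sources):
    """Merge parameters from (name, params) pairs.

    Sources are ordered from lowest to highest precedence (an assumed rule):
    a later source overrides an earlier one for the same key.
    """
    effective = {}
    for name, params in sources:
        for key, value in params.items():
            effective[key] = Resolved(value, name)
    return effective


# Hypothetical sources and parameters, for illustration only.
sources = [
    ("mainframe_db", {"base_price": 100, "region_factor": 1.2}),
    ("relational_store", {"region_factor": 1.1, "discount": 0.05}),
    ("config_file", {"discount": 0.10}),
]

effective = resolve(sources)
for key, resolved in sorted(effective.items()):
    print(f"{key} = {resolved.value} (from {resolved.source})")
```

Keeping the source next to each value is the point of the sketch: without that provenance, nobody can tell which of the many stores actually determined a given price.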

To increase flexibility even further, the developers included more than 300 (!) “feature toggles” in their code, but only about half of those were documented :-(
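A gap like this between toggles in the code and toggles in the documentation can be measured mechanically. The sketch below is a hedged illustration, not the team's tooling: it assumes toggles are queried through a hypothetical `is_enabled("name")` call and that the documented toggle names are available as a plain list. It reports undocumented toggles, documented toggles that no longer appear in the code, and the documentation coverage.

```python
import re

# Assumption: toggles are read via a hypothetical is_enabled("toggle_name") call.
TOGGLE_PATTERN = re.compile(r"""is_enabled\(\s*["']([\w.-]+)["']\s*\)""")


def toggles_in_source(text):
    """Return the set of toggle names referenced in one source text."""
    return set(TOGGLE_PATTERN.findall(text))


def audit(source_texts, documented):
    """Compare toggles used in code against the documented toggle names."""
    used = set()
    for text in source_texts:
        used |= toggles_in_source(text)
    documented = set(documented)
    return {
        "undocumented": sorted(used - documented),
        "stale_docs": sorted(documented - used),
        # Share of used toggles that are documented; 1.0 if no toggles are used.
        "coverage": len(used & documented) / len(used) if used else 1.0,
    }


# Hypothetical code snippets and documentation list, for illustration only.
code = [
    'if is_enabled("new_pricing"): apply_new_pricing()',
    "export = is_enabled('legacy_export') and is_enabled(\"new_pricing\")",
]
documented_toggles = ["new_pricing", "old_checkout"]

report = audit(code, documented_toggles)
print(report)
```

In this toy input, one of the two toggles used in the code is documented, which mirrors the roughly 50% coverage described above.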

What was the good idea?

(optional: What happened, was there a turning or tipping point?)

What were the bad consequences, why was everything bad?

Overly large and heterogeneous development team (>150 persons from different companies), making development very expensive (and slow, see below).

Time-to-market became slower and slower, due to:

Lack of persistent knowledge about the system and its operation (missing overview documentation).

No overview of configuration options (the official feature-toggle documentation covered only 50% of the more than 300 existing toggles).

Operational instability: increasing number of severe runtime issues (system not available).

Many regression errors (errors that surfaced again, although they had already been fixed in a prior version).

Which patterns were encountered?