Human organizations are commonly characterized by a hierarchical chain of command that facilitates division of labor and integration of effort. Higher-level employees set the strategic frame that constrains lower-level employees, who carry out the detailed operations that implement the strategy. Typically, strategic and operational decisions are made by different individuals who act over different timescales and rely on different kinds of information. We hypothesize that when such decision processes are hierarchically distributed among different individuals, they produce highly heterogeneous and strongly path-dependent joint learning dynamics. To investigate this, we design laboratory experiments in which human dyads face repeated joint tasks, with one individual assigned to make the strategic decisions and the other the operational ones. The experiments yield a puzzling bimodal performance distribution: some pairs learn, while others fail to learn after a few periods. We also develop a computational model that mirrors the experimental setting and predicts the heterogeneity of performance among human dyads. Comparison of experimental and simulation data suggests that self-reinforcing dynamics arising from initial choices are sufficient to explain the performance heterogeneity observed experimentally.