We consider the problem of optimizing information-theoretic quantities in recurrent networks via synaptic learning. In contrast to feedforward networks, recurrence presents a key challenge: an optimal learning rule must account for the joint distribution of the whole network. In particular, this makes a local rule (i.e., one that depends only on pairwise interactions) difficult to construct. Here, we report a local metaplastic learning rule that performs approximate optimization by estimating whole-network statistics through several slow, nested dynamical variables. These dynamics give the rule both anti-Hebbian and Hebbian components, allowing decorrelating and correlating learning regimes to arise whenever each is favorable for optimality. We compare the performance of the synthesized rule with classical BCM dynamics and use the learned networks to perform history-dependent tasks that highlight the advantages of recurrence. Finally, we show that the learned networks are consistent with notions of criticality, including balanced ratios of excitation and inhibition.