We present an end-to-end model using streaming physiological time series to predict near-term risk for hypoxemia, a rare, but life-threatening condition known to cause serious patient harm during surgery. Inspired by the fact that a hypoxemia event is defined based on a future sequence of low SpO2 (i.e., blood oxygen saturation) instances, we propose the hybrid inference network (hiNet) that makes hybrid inference on both future low SpO2 instances and hypoxemia outcomes. hiNet integrates 1) a joint sequence autoencoder that simultaneously optimizes a discriminative decoder for label prediction, and 2) two auxiliary decoders trained for data reconstruction and forecast, which seamlessly learn contextual latent representations that capture the transition from present states to future states. All decoders share a memory-based encoder that helps capture the global dynamics of patient measurement. For a large surgical cohort of 72,081 surgeries at a major academic medical center, our model outperforms strong baselines including the model used by the state-of-the-art hypoxemia prediction system. With its capability to make real-time predictions of near-term hypoxemic at clinically acceptable alarm rates, hiNet shows promise in improving clinical decision making and easing burden of perioperative care.