Network embedding (NE) maps a network into a low-dimensional space while preserving intrinsic features of the network. Variational Auto-Encoder (VAE) has been actively studied for NE. These VAE-based methods typically utilize both network topologies and node semantics and treat these two types of data in the same way. However, the information of network topology and information of node semantics are orthogonal and are often from different sources; the former quantifies coupling relationships among nodes, whereas the latter represents node specific properties. Ignoring this difference affects NE. To address this issue, we develop a network-specific VAE for NE, named as NetVAE. In the encoding phase of our new approach, compression of network structures and compression of node attributes share the same encoder in order to perform co-training to achieve transfer learning and information integration. In the decoding phase, a dual decoder is introduced to reconstruct network topologies and node attributes separately. Specifically, as a part of the dual decoder, we develop a novel method based on a Gaussian mixture model and the block model to reconstruct network structures. Extensive experiments on large real-world networks demonstrate a superior performance of the new approach over the state-of-the-art methods.