In this work, we present an algorithm, dubbed FlipOut, which can reliably identify and remove redundant connections in a neural network while it is training. In our experiments, we remove more than 90% of the connections in the networks we tested on with little to no impact on performance, and achieve the best results reported in the literature when removing 99% or more of the connections. We take this a step further by also applying quantization: each remaining connection is approximated by a less precise version of itself, which lets us store it with fewer bits of memory (8 instead of 32 per connection). We find that these two methods, pruning and quantization, are complementary, allowing us to remove 75% of the connections while storing the remaining weights with 4 times fewer bits and little degradation in accuracy. Thus, a theoretical 16x reduction in memory can be achieved.
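To make the pruning and quantization steps, and the 16x arithmetic, concrete, here is a minimal sketch. It does not implement the FlipOut criterion itself; it uses generic magnitude pruning and symmetric uniform 8-bit quantization on a random weight matrix, with all names, shapes, and thresholds chosen for illustration only.

```python
# Minimal sketch (not the FlipOut criterion): magnitude pruning plus
# uniform 8-bit quantization of a weight matrix, and the resulting
# compression factor. All values here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(256, 256)).astype(np.float32)  # a dense layer's weights

# --- Pruning: zero out the 75% of connections with the smallest magnitude ---
sparsity = 0.75
threshold = np.quantile(np.abs(weights), sparsity)
mask = np.abs(weights) > threshold           # keep only the largest ~25% of weights
pruned = weights * mask

# --- Quantization: approximate each remaining weight with 8 bits instead of 32 ---
scale = np.abs(pruned).max() / 127.0         # symmetric uniform quantization
quantized = np.round(pruned / scale).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale  # values the network would use

# --- Compression arithmetic: 4x from pruning times 4x from quantization = 16x ---
kept_fraction = mask.mean()                  # ~0.25
bits_ratio = 8 / 32                          # 0.25
compression = 1.0 / (kept_fraction * bits_ratio)
print(f"kept {kept_fraction:.0%} of weights at 8/32 bits -> ~{compression:.0f}x smaller")
```

Running this prints a compression factor of roughly 16x, matching the back-of-the-envelope calculation above: keeping a quarter of the connections and a quarter of the bits per connection multiplies to a sixteenth of the original memory.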