Many modern communication channels are modeled as Gaussian multiple-input multiple-output (MIMO) channels. Examples include multi-tone digital subscriber line (DSL), orthogonal frequency division multiplexing (OFDM), and multiple transmit-receive antenna systems. Here, we consider Gaussian MIMO channels with discrete input alphabets. The MIMO channel can be transformed into a set of parallel subchannels using the singular value decomposition (SVD). In OFDM, the FFT operation generates the parallel subchannels, provided that a suitable cyclic prefix is added. We propose a non-diagonal precoder based on the pairing of subchannels to increase the mutual information. The pairings are given by simple 2x2 real rotation matrices, each parameterized by a single angle. This precoding structure enables us to express the total mutual information as a sum of the mutual information of all the pairs. The problem of finding the optimal precoder with the above structure, which maximizes the total mutual information, is solved by i) optimizing the rotation angle and the power allocation within each pair and ii) finding the optimal pairing and power allocation among the pairs. It is shown that the mutual information achieved with the proposed pairing scheme is very close to that achieved with the optimal precoder of Cruz et al., and is significantly better than that of the mercury/waterfilling strategy of Lozano et al. Our approach greatly reduces the complexity of both precoder optimization and detection, making it suitable for practical applications.
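
As an illustration of the precoder structure described above, the following is a minimal NumPy sketch, not the optimization procedure itself. It diagonalizes a channel matrix with the SVD and applies a 2x2 real rotation to each pair of subchannels. The function names (`rotation`, `pairing_precoder`, `effective_channel`), the order in which power scaling and rotation are applied, and the example pairing, angles, and powers are all assumptions made for illustration; none of them are optimized values.

```python
import numpy as np

def rotation(theta):
    """2x2 real rotation matrix parameterized by a single angle."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def pairing_precoder(n, pairs, thetas, powers):
    """Build an n x n real precoder acting on paired subchannels.

    pairs  : disjoint (i, j) index tuples covering 0..n-1 (assumed)
    thetas : one rotation angle per pair
    powers : length-n nonnegative power allocation, applied to the
             symbols before the rotation (an illustrative choice)
    """
    P = np.zeros((n, n))
    for (i, j), theta in zip(pairs, thetas):
        block = rotation(theta) @ np.diag(np.sqrt([powers[i], powers[j]]))
        P[np.ix_([i, j], [i, j])] = block
    return P

def effective_channel(H, P):
    """Return (G, s), where G = U^H H V P and s holds the singular values.

    With H = U diag(s) V^H, sending V P x and applying U^H at the
    receiver gives G = diag(s) P, which couples only paired subchannels.
    """
    U, s, Vh = np.linalg.svd(H)
    V = Vh.conj().T
    return U.conj().T @ H @ V @ P, s

# Hypothetical 4x4 example with an arbitrary pairing and arbitrary angles.
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
pairs = [(0, 3), (1, 2)]
P = pairing_precoder(4, pairs, thetas=[0.3, 0.7], powers=[1.0, 0.5, 1.5, 1.0])
G, s = effective_channel(H, P)
```

Because each rotation is orthogonal, the total transmit power equals the sum of the entries of `powers`, and the effective channel `G` is nonzero only within each pair. Under this structure the receiver can detect each pair separately, and the total mutual information splits into a sum over pairs.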