We consider a problem of considerable practical interest: 
the recovery of a data matrix from a sampling of its entries. 
One practical example is the recommender system, where 
users submit ratings for a small subset of, say, movies, 
and the seller provides recommendations based on the users' 
inferred preferences. Since we have only a few known entries 
in this user-movie ratings matrix, the goal is to infer the 
unrated entries from this subset of entries so that we can 
recommend movies that have high estimated ratings. A much 
publicized example is the Netflix Challenge. 

Formally, suppose that we observe m entries selected uniformly 
at random from a large data matrix. Under what conditions can 
we complete the matrix and recover the entries that are not 
observed? We show that one can recover low-rank matrices 
exactly from what appear to be highly incomplete sets of 
sampled entries. Further, we introduce an efficient algorithm 
that perfectly recovers the unknown entries. We can complete 
matrices with about a billion entries in a matter of minutes 
on a personal computer. 

For many applications of interest, a more realistic scenario is 
when we have noise in the observations. Can the above approach 
be generalized when the underlying data is corrupted by noise?
We show that, perhaps surprisingly, the same algorithm is guaranteed 
to achieve order-optimal performance in a variety of circumstances.