This paper provides bounds on the sample complexity of estimating Kronecker-structured dictionaries for $K$th-order tensor data. The training samples are generated as linear combinations of the atoms of these structured dictionaries and observed through white Gaussian noise. The sample-complexity lower bound follows from a lower bound on the minimax risk of dictionary estimation that holds for general coefficient distributions and can be further specialized to sparse-Gaussian coefficients. This bound scales linearly with the sum of the products of the dimensions of the (smaller) coordinate dictionaries that make up the Kronecker structure. An explicit dictionary estimation algorithm for 2nd-order tensor data is also provided whose sample complexity matches the lower bound in the scaling sense. Numerical experiments highlight the advantages of explicitly accounting for the tensor structure of the data during dictionary learning.
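For concreteness, the setup summarized above can be sketched as follows; the dimension symbols $m_k \times p_k$ for the $k$th coordinate dictionary, the noise variance $\sigma^2$, and the sample count $N$ are notation introduced here for illustration rather than taken from the abstract. Each (vectorized) observation is a noisy linear combination of atoms of a Kronecker-structured dictionary,
\[
	\mathbf{y} = \left( \mathbf{D}_1 \otimes \mathbf{D}_2 \otimes \cdots \otimes \mathbf{D}_K \right) \mathbf{x} + \mathbf{n},
	\qquad \mathbf{D}_k \in \mathbb{R}^{m_k \times p_k}, \quad \mathbf{n} \sim \mathcal{N}\!\left(\mathbf{0}, \sigma^2 \mathbf{I}\right),
\]
and the stated linear scaling of the lower bound on the number of training samples then reads, in order notation,
\[
	N = \Omega\!\left( \sum_{k=1}^{K} m_k p_k \right),
\]
i.e., the sum over the $K$ coordinate dictionaries of the product of each one's dimensions.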