At CFILT, a few of us have been working on understanding the IBM Models thoroughly. The IBM paper on SMT is a classic, seminal paper in the history of Machine Translation, and a must-read for anybody wanting to work in this area. It's not an easy read, and we spent quite a lot of time figuring out how the estimation results are derived. Some notes came out of these discussions; they work out in detail the steps that are missing in the original paper. Hopefully they will be useful for everybody. The scanned notes on estimation for Model 1 and Model 2 can be found here. They are not a replacement for the original paper, but are just meant to supplement the reading of it. Thanks to Mitesh for helping out with the key steps in the derivation.