In a 2D echocardiogram exam, an ultrasound probe samples the heart with 2D slices. Changing the orientation and position of the probe changes the slice viewpoint, altering the cardiac anatomy being imaged. Determining the probe viewpoint is therefore an essential step in automatic cardiac echo image analysis. In this paper we present a system for automatic view classification that exploits cues from both cardiac structure and motion in echocardiogram videos. In our framework, each image in an echocardiogram video is represented by a set of novel salient features. We locate these features at scale-invariant points in edge-filtered motion magnitude images and encode them using local spatial, textural, and kinetic information. Training involves learning a hierarchical feature dictionary and the parameters of a pyramid-matching-kernel support vector machine. At test time, each image is classified independently and casts a vote toward the classification of its parent video; the viewpoint with the most votes wins. Through experiments on a large database of echocardiograms from both diseased and control subjects, we show that our technique consistently outperforms state-of-the-art methods on the popular four-view classification test. We also present results for eight-view classification to demonstrate the scalability of our framework.
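The frame-level voting scheme described in the abstract can be sketched as follows; this is a minimal illustration of majority voting over per-frame predictions, and the view labels and function name are hypothetical, not taken from the paper:

```python
from collections import Counter

def classify_video(frame_labels):
    """Aggregate independent per-frame view predictions into a single
    video-level label by majority vote (ties broken by first occurrence)."""
    if not frame_labels:
        raise ValueError("no frame predictions to aggregate")
    return Counter(frame_labels).most_common(1)[0][0]

# Hypothetical per-frame SVM outputs for one echo video:
frames = ["A4C", "A4C", "PLAX", "A4C", "A2C"]
print(classify_video(frames))  # → A4C
```

In the paper's pipeline the per-frame labels would come from the pyramid-matching-kernel SVM; here they are hard-coded strings purely to show the aggregation step.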