Linux distributions are often huge (4+ GB) because they contain just about all the software under the sun on the disk. Everything from the OS and desktop to things like DNS servers, DHCP servers, SQL servers, web servers, network news servers: you name it, it's there. But most admins don't load anywhere close to everything. In fact, I usually load almost nothing and add only what I need on a particular server. My builds often contain only 350 or so of the nearly 5,000 packages that might be available for the distro.
And I concur 1000% with doing as much as you can in a virtual machine. It's a great learning experience with almost zero risk to your actual machine and OS.