You'll want to consult YouTube's help section.
I've done some of that for you:
http://www.google.com/support/youtube/bin/answer.py?hl=en&answer=55744
http://www.google.com/support/youtube/bin/answer.py?hl=en&answer=132460
Try reading those articles. I think you'll find them useful. If you're still not finding what you need, go back to YouTube's help section, or even search Google for it. It's all well documented.
Note that the end delivery is in .flv format. Note also that MP3 and AAC are the preferred audio track formats. Typical .wav audio on Windows is sampled at 44.1 kHz, and video usually runs at 30 fps, using any of a number of video codecs. If your audio is in a codec that YouTube doesn't understand well, or the sample rate is too low (say 22 kHz or lower), the audio will sound off. YouTube converts the incoming files based on file "magic" (the signature bytes that identify a file's type) and common conversion utilities from Linux. While those tools are very powerful, they sometimes make "best guesses."
Cell phones often record at a fairly low audio sample rate (22 kHz or lower) and a frame rate of 15 fps or lower. You can't expect good quality from a low-quality original.
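If you want to check an audio file's sample rate before uploading, here is a minimal sketch using Python's standard `wave` module. It only handles uncompressed .wav files. The 22,050 Hz threshold and the helper names (`wav_sample_rate`, `is_low_quality`, `make_silent_wav`) are my own assumptions for illustration, not anything YouTube publishes.

```python
import io
import wave

# Assumed cutoff, taken from the "22 kHz or lower" rule of thumb above.
LOW_RATE_HZ = 22050


def wav_sample_rate(source):
    """Return the sample rate in Hz of a WAV file (path or file object)."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def is_low_quality(rate_hz, threshold_hz=LOW_RATE_HZ):
    """True when the sample rate is at or below the assumed cutoff."""
    return rate_hz <= threshold_hz


def make_silent_wav(rate_hz, seconds=1):
    """Build an in-memory mono 16-bit WAV of silence, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * rate_hz * seconds)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for rate in (44100, 22050, 8000):
        measured = wav_sample_rate(make_silent_wav(rate))
        label = "low, expect rough audio" if is_low_quality(measured) else "fine"
        print(f"{measured} Hz: {label}")
```

For a real file, pass its path instead of the in-memory buffer, e.g. `wav_sample_rate("clip.wav")`, where `clip.wav` stands in for your own file.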
A 44.1 kHz audio stream compressed as AAC at 128 kbps, in an mp4 or flv file using H.264 video, at 16:9, 29.97 fps (NTSC) or 30 fps, 1920 x 1080, is probably about as good as your upload file is going to get. (That's basically HDTV.) Video bit rates can vary. For decent HDTV you need a pretty fast video stream of 1.5 Mbps or more. See:
http://en.wikipedia.org/wiki/H.264/MPEG-4_AVC#Levels
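To make those settings concrete, here is a rough sketch that builds an ffmpeg command line with them and estimates the resulting file size. Using ffmpeg is my assumption (it is a common Linux conversion tool, but nothing above confirms it is what YouTube runs), and the function names and the `input.mov` / `upload.mp4` file names are placeholders. The script only prints the command; it does not run it.

```python
import shlex


def build_ffmpeg_command(src, dst, video_kbps=1500, audio_kbps=128):
    """Return an ffmpeg argument list for 1080p H.264 video with AAC audio."""
    return [
        "ffmpeg",
        "-i", src,                   # input file
        "-c:v", "libx264",           # H.264 video encoder
        "-b:v", f"{video_kbps}k",    # video bit rate
        "-s", "1920x1080",           # frame size (16:9)
        "-r", "30000/1001",          # 29.97 fps (NTSC)
        "-c:a", "aac",               # AAC audio encoder
        "-b:a", f"{audio_kbps}k",    # audio bit rate
        "-ar", "44100",              # audio sample rate in Hz
        dst,
    ]


def estimated_size_mb(seconds, video_kbps=1500, audio_kbps=128):
    """Approximate file size in megabytes, ignoring container overhead."""
    total_kilobits = (video_kbps + audio_kbps) * seconds
    return total_kilobits / 8 / 1000  # kilobits -> kilobytes -> megabytes


if __name__ == "__main__":
    print(shlex.join(build_ffmpeg_command("input.mov", "upload.mp4")))
    print(f"about {estimated_size_mb(60):.2f} MB per minute")
```

The size estimate is just (video rate + audio rate) x duration, so at 1.5 Mbps video plus 128 kbps audio you are looking at roughly 12 MB per minute of footage.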
Take time to empower yourself by reading up on YouTube's help section.
Note that as we move to 3D HDTV, video bit rates are going to get much higher to support all the frames needed for those effects.
Some folks, like myself, have monitors and video cards that can support 3D today, but it takes a ton of computing horsepower.
If you have an AMC digital theater near you, you're already able to see phased 3D (RealD 3D) at very high sample rates and streaming speeds. Audio sample rates are 96 kHz in most digital theater applications, with 120 fps for right and left 3D (phase-shifted frames).