Category Archives: Video & Audio
MP3s and the Degradation of Listening
Don’t get me wrong! I own three iPods, which I use extensively and absolutely adore for their portability and other obvious advantages. I, of course, use them differently than most listeners. (If you are lazy or impatient, feel free to jump to the bottom of the page and read how.) Most listeners use mp3 players and mp3 files in ways that severely degrade sound quality and eventually erode the listener’s ability to even tell the difference between good and bad sound quality. But more on this a little later.
Disclaimer: For the cynics amongst you, I am not sponsored by any record label trying to boost CD sales; I could actually not care less. All the information below is not product-specific, is based on facts, and is common knowledge to anyone with a basic understanding of the physics of sound, digital sound processing, hearing physiology, and auditory perception. Ignore at your own risk!
CD sound quality
First, let me address some fundamental issues related to the relationship between CD sound data rates and sound quality.
CD quality is usually described in terms of:
- sampling rate (44,100 samples/sec.),
- bit depth (16 bits per sample, often loosely called bit rate), and
- stereo presentation.
Doing some simple math, we can figure out that CD-quality sound corresponds to a data rate of 1411 kbits/sec. (44,100 * 16 * 2 = 1,411,200 bits/sec. = ~1411 kbits/sec.) Sampling rate determines the upper frequency limit (corresponding, in general, to timbre, or sound quality) that can be faithfully represented in a digital sound file (about half of the sampling rate). Bit depth (the 16 bits per sample listed above) determines the dynamic range (i.e. difference between the softest and strongest sound) that can be faithfully represented in a digital sound file (~6 dB per bit).
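For readers who like to check the numbers, here is a minimal Python sketch of the arithmetic above. The function names are my own, and the dynamic-range formula is the standard idealized approximation (20 * log10 of the number of quantization levels), which is where the "~6 dB per bit" rule of thumb comes from.

```python
import math

SAMPLE_RATE_HZ = 44_100  # samples per second, per channel
BIT_DEPTH = 16           # bits per sample
CHANNELS = 2             # stereo


def data_rate_bps(sample_rate_hz, bit_depth, channels):
    """Raw (uncompressed) data rate in bits per second."""
    return sample_rate_hz * bit_depth * channels


def upper_frequency_limit_hz(sample_rate_hz):
    """Highest frequency that can be represented: half the sampling rate."""
    return sample_rate_hz / 2


def dynamic_range_db(bit_depth):
    """Idealized dynamic range: each extra bit adds about 6.02 dB."""
    return 20 * math.log10(2 ** bit_depth)


cd_rate = data_rate_bps(SAMPLE_RATE_HZ, BIT_DEPTH, CHANNELS)
print(cd_rate)                                   # 1411200 bits/sec
print(upper_frequency_limit_hz(SAMPLE_RATE_HZ))  # 22050.0 Hz
print(round(dynamic_range_db(BIT_DEPTH), 1))     # 96.3 dB
```

Both results sit right at the limits of hearing quoted in this post: about 22 kHz against a ~20 kHz hearing limit, and about 96 dB against ~100 dB.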
Given the maximum frequency and dynamic range of safe and functional human hearing (~20 kHz and ~100 dB respectively), CD-quality digital sound is very close to the best sound quality we can ever hear. There have been several valid arguments put forward, advocating the need for sampling rates higher than 44,100 samples/sec. (e.g. 96,000 samples/sec.), bit depths higher than 16 bits (e.g. 24 or 32 bits), and more than two channels (e.g. various versions of surround sound). Depending on the type of sound in question (e.g. the sound’s frequency/dynamic range and spatial spread) and what you want to do with it (e.g. process/analyze it in some way or just listen to it), such increases may or may not result in a perceptible increase in sound quality. So for the vast majority of listening contexts, CD-quality sound (i.e. 1411 kbits/sec. data rate) does correspond to the best quality sound one can hear.
Compressed sound quality
Now, let’s move to compressed quality sound, whether in mp3, iPod, Real, or any other format.
Every sound-compression technique has two objectives:
a) to reduce a sound file’s data rate and therefore overall file size (for easier download and storage) and
b) to accomplish (a) without noticeably degrading the perceived quality of the sound.
Sound-compression algorithms basically remove bits from a digital sound file and select the bits to be removed so that the information that will be lost will not be perceived by listeners as a noticeable loss in quality.
Compression algorithms base their selective removal of information from a digital file on three perceptual principles:
- Just noticeable difference in frequency and intensity:
Our ears’ ability to perceive frequency and intensity differences as pitch and loudness differences respectively is not as fine grained as the frequency and intensity resolution supported by CD-quality sound. So it is possible to selectively remove some relevant information without the listeners noticing its removal.
- Masking:
Strong sounds at one frequency can mask soft sounds at nearby frequencies, making them inaudible. It is, therefore, possible to remove digital information representing soft frequencies that are closely surrounded by much stronger frequencies, without the listeners noticing the removal, since they would not have been able to hear such soft sounds in the first place.
- Dependence of loudness on frequency:
Even if different frequencies have the same intensity they do not sound equally loud. In general, for a given intensity, middle frequencies sound louder than high frequencies, which sound louder than low frequencies. Given the phenomenon of masking described above, this dependence of loudness on frequency allows us to remove some soft frequencies even if they are further away from a given strong frequency, providing an additional opportunity to remove bits (information) from a digital file without listeners noticing the loss. In addition, the dynamic range of hearing is much lower for low than for middle and high frequencies and may be adequately represented by ~10 versus 16 bits, offering one more possibility for unnoticeable data-rate reduction.
Different compression algorithms (e.g. mp3, iTunes, etc.) implement the above principles in different ways, and each company claims to have the best algorithm, achieving the most reduction in file size with the least noticeable reduction in sound quality.
Digital music downloads and the stupefaction of a generation of listeners
Regardless of which company and algorithm is the best, one thing is certain. No matter how the previously discussed principles are implemented and no matter how inventive each company’s programmers are, there is no way for the above principles to support the over 90 percent reduction of information required to go from a CD-quality file to a standard mp3. In other words, reducing data rates from CD quality (1411 kbits/sec.) to the standard downloadable-music-file quality (128 kbits/sec.) is impossible without a noticeable deterioration in sound quality.
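To put a number on that "over 90 percent," here is a small Python sketch. It assumes the CD rate of 1411.2 kbits/sec. worked out earlier and decimal megabytes (1 MB = 1,000,000 bytes); the function names are just for illustration.

```python
CD_KBPS = 1411.2  # CD-quality data rate in kbits/sec (44,100 * 16 * 2 / 1000)


def reduction_percent(target_kbps, source_kbps=CD_KBPS):
    """Share of the original data that is discarded at the target rate."""
    return 100 * (1 - target_kbps / source_kbps)


def megabytes_per_minute(kbps):
    """Storage needed for one minute of audio, in decimal megabytes."""
    bits_per_minute = kbps * 1000 * 60
    return bits_per_minute / 8 / 1_000_000


for rate in (128, 320, CD_KBPS):
    print(
        f"{rate} kbits/sec: {reduction_percent(rate):.1f}% removed, "
        f"{megabytes_per_minute(rate):.2f} MB per minute"
    )
```

At 128 kbits/sec. roughly 90.9 percent of the CD data is gone; at 320 kbits/sec. the figure is about 77.3 percent.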
In fact, the 139th meeting of the Acoustical Society of America devoted an entire session to the matter, with multiple acousticians and music researchers presenting their perceptual studies on the relationship between compression-data rates and sound quality. Based on these and other, more recent, relevant works, it appears that data rates below ~320 kbits/sec. result in clearly noticeable deterioration of perceived sound quality for all sound files with more than minimal frequency, dynamic, and spatial spread ranges. (E.g. listening to early Ramones at low or high data rates will not make as much of a difference as listening to, say, the Beatles’ “Sergeant Pepper” album.) Such low data rates cannot faithfully represent wide ranges of perceivable frequency, intensity, and spatial-separation changes, resulting in ‘mp3s’ that include only a small proportion of the sonic variations included in the originally recorded file.
As data rates drop, there is a gradual deterioration in
a) frequency resolution (loss of high frequencies, translated as loss of clarity),
b) dynamic range (small dynamic changes can no longer be represented in the compressed file, resulting in flatter ‘volume’ song profiles), and
c) spatial spread (loss of cross-channel differences, resulting in either exaggeration or loss of stereo separation).
When this degradation of sound quality is combined with the fact that most young listeners get their music only online, what we end up with is a generation of listeners that is exposed to, and therefore ‘trained’ in, an impoverished listening environment. Prolonged and consistent exposure to impoverished listening environments is a recipe for cognitive deterioration in listening ability. That is, in the ability to focus attention on and be able to tell the difference between fine (and, if we continue this way, even coarse) sound variations.
Such deterioration will not only affect how we listen to music but also sound perception and communication in general, since our ability to tell the difference between sound sources (i.e. who said what) and sound source locations (i.e. where did the sound come from) is intricately linked to our ability to focus attention on fine sound-quality differences.
What you should do
a) Do not listen to music exclusively in mp3 (or any other compressed) format.
Go to a live concert! Listen to a CD over a good home sound system, set of headphones, or car stereo!
b) Unless a piece of music is not available in another format, do not waste your money on iTunes or any other music download service, until such services start offering data rates greater than 300 kbits/sec.
c) When you load CDs on your iPod or other device, select an uncompressed format (e.g. .wav or .aif). If you don’t have the hard disk space on your player to do this, convert at the highest available data rate (currently 320 kbits/sec. on iTunes).
d) Finally, get a good pair of headphones for your mp3 player! The headsets given out with iPods and most mp3 players are of such bad quality that they essentially create a tight bottleneck to the quality of your digital files and players. The response of these headphones has been designed to match the low quality of popular iTunes or other mp3 files (128 kbits/sec). Mp3-player manufacturers do this for two wise (for them) reasons:
i) poor-quality headsets are cheap to produce and good enough to reproduce the poor-quality mp3 files you are fed, and
ii) poor quality headsets prevent you from creating/requesting music files at higher data rates because when listening over such headphones you cannot even tell the difference between good and bad sound quality.
Well, what can I say? Wake up and listen to the music!
Outsourcing Subtitles
Running the video production team for IDD, I am often asked to include subtitles with the videos we create. However, we don’t really have an efficient workflow for producing subtitles and I am often unable to fulfill the request. I know we need to improve our ability in creating subtitles—not only to meet the demands of our diverse student body (students with disabilities, international students, etc.), but also to allow for text-based video-searching, which will increase each video’s value as a learning object.
Recently, I asked some of our Graduate Assistants (GAs) to assist in producing subtitled tracks for our videos using a shareware subtitling application. When you factor in software training, transcription time, proofreading, etc., it took a GA two hours and 20 minutes on average to produce one minute of subtitled video. Once a GA was experienced with the processes and comfortable with the software, he or she could produce one minute of subtitled video in 20 to 30 minutes.
Last year, IDD produced 128 hours of original video content. In order to caption all of the videos we produced last year, it would cost us $215,050. (GAs make $12/hour.) Even if we used only experienced GAs, our annual cost would still be $30,720 and require at least two GAs dedicated to subtitling.
This past summer at the Annual Conference on Distance Teaching & Learning in Madison, Wisconsin, I was introduced to a company named Automatic Sync Technologies. The University of Nevada, Las Vegas uses Automatic Sync as its exclusive partner in creating video transcripts and subtitles. Through a web-based interface, users upload their videos to Automatic Sync and receive a subtitle track and a full transcript three days later. At this point, all the video producers have to do is associate the subtitle tracks with their original videos and they are done. Automatic Sync pricing is based on the volume of videos you submit. The more you submit, the cheaper it gets. Captioning our 128 hours using Automatic Sync would have cost DePaul $17,114, a significant savings even at our most efficient production capabilities.
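The cost comparison can be reproduced with a short Python sketch. The inputs (128 hours of video, $12/hour, 140 or 20 labor minutes per video minute, and the $17,114 quote) are the figures from this post; the function and variable names are only illustrative. With the rounded average of two hours and 20 minutes, the in-house total comes out to $215,040, within $10 of the figure quoted above, presumably because the underlying average was not exactly 140 minutes.

```python
HOURS_OF_VIDEO = 128   # original video produced last year
GA_HOURLY_WAGE = 12    # dollars per hour
VENDOR_QUOTE = 17_114  # quoted outsourcing cost for the same 128 hours


def in_house_cost(video_hours, labor_minutes_per_video_minute, hourly_wage):
    """Labor cost of subtitling everything in-house, in dollars."""
    video_minutes = video_hours * 60
    labor_hours = video_minutes * labor_minutes_per_video_minute / 60
    return labor_hours * hourly_wage


average = in_house_cost(HOURS_OF_VIDEO, 140, GA_HOURLY_WAGE)     # 2 h 20 min per minute
experienced = in_house_cost(HOURS_OF_VIDEO, 20, GA_HOURLY_WAGE)  # best case, 20 min per minute

print(f"Average GA:     ${average:,.0f}")
print(f"Experienced GA: ${experienced:,.0f}")
print(f"Outsourced:     ${VENDOR_QUOTE:,}")
print(f"Savings vs. experienced GAs: ${experienced - VENDOR_QUOTE:,.0f}")
```

Even in the best in-house case, the outsourced quote is lower by a little over $13,000.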
Outsourcing our subtitling work to Automatic Sync or one of their competitors seems like a no-brainer. It’s cheaper than doing it in-house, produces a more reliable product, and lets our GAs spend time working on other valuable projects.
If your university or organization has an efficient and effective way of producing subtitles for video, I’d love to hear about it.
Video-Sharing Network Showdown, Part 2
Since my first post on video sharing, I’ve received suggestions for other sites that should be included in my evaluation of video-sharing networks. After reviewing these suggestions, I decided to add Viddler (www.viddler.com) to the list. With this addition, these are the 15 sites chosen to participate in the showdown:
Each site will be ranked from 1 to 15 (1 being worst, 15 being best) in each evaluation category. Important or crucial categories will be given a multiplier to give them extra weight in the rankings. The cumulative scores will be tabulated and the site with the highest score will be declared the winner.
Evaluation Criteria (with multiplier weight in parentheses):
User Experience (3x): With regard to user experience, I am focusing specifically on simplicity. How easy is it to go from being a new user to uploading a video and organizing as needed? How many steps are involved in uploading each video? I rewarded sites with a clean interface and downgraded sites with a busy MySpace feel.
Sharing/Embedding (3x): Can the video be embedded in a web page or within a Blackboard class? Does the site provide the code (HTML or JavaScript) to make embedding as simple as cut and paste?
File Size/Storage Space (2x): What are the size and storage limitations, if any? Is there a limit on the number of videos that can be uploaded? Are you limited to a certain volume of uploads per month/week? Is the limitation based on upload file size or the encoded file size? Are videos limited to a certain length of time such as YouTube’s 10-minute limitation?
Ownership (2x): What are the site’s terms of use and privacy policies? Are the terms of service easy to read and understand? Does the sharing service claim exclusive or partial ownership of the video?
Privacy (2x): Are uploaded videos available to the web audience at large? Can the videos be protected and only shared with a private group?
File Formats Accepted (2x): Does the video need to be converted to a specific format before it can be uploaded? How many file types (.mov, .mpg, .wmv, .rm) does the site accept?
Conversion/Encoding (2x): Are the files encoded to a file format that allows for optimal playback? Where does the encoding occur? Some sites require you to encode the video on your machine before it is uploaded. That allows for faster upload times but requires you to download the conversion software tool or applet. Does the site support encoding in multiple file formats?
Downloads/Full Screen (1x): Can viewers download the posted video to their computers? Can videos be played back in Full Screen or are viewers required to watch the video in a little player?
Site Management (1x): Is there any danger that the site will go out of business and the videos will be lost? Does the site have a stable and well-established ownership group such as Google’s ownership of YouTube or Yahoo!’s ownership of JumpCut? Does management seem more focused on being bought out or on building a long-lasting product?
Extras (1x): In an effort to compete with the popularity of YouTube, several of the new sites are launching new features to allow for greater control and manipulation of the video. Are these features useful or just fancy window dressing to attract users? How stable is the new functionality? Do the features work across platforms (Mac & PC)? Examples Include:
- Editing/Remix: Several sites now allow you to change or edit your uploaded clips and combine them with additional clips. This allows you to add new material or update your previously posted videos without having to re-encode. This keeps videos current and reusable.
- Direct Recording/Post from a Camera: Allows you to upload directly from a web-cam or camera attached to a PC.
- Viewer Interaction: You can create a room to watch and interact with other users while sharing your videos.
- Timeline Tagging: You can tag the timeline of the video with keywords and/or comments. This is great for note taking.
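The scoring scheme described above (rank each site per category, multiply by the category weight, add everything up) can be sketched in a few lines of Python. The weights come straight from the criteria list; the two sites and their ranks are made up for illustration and are not the actual showdown results.

```python
# Category weights, as listed in the evaluation criteria.
WEIGHTS = {
    "User Experience": 3,
    "Sharing/Embedding": 3,
    "File Size/Storage Space": 2,
    "Ownership": 2,
    "Privacy": 2,
    "File Formats Accepted": 2,
    "Conversion/Encoding": 2,
    "Downloads/Full Screen": 1,
    "Site Management": 1,
    "Extras": 1,
}
NUM_SITES = 15  # ranks run from 1 (worst) to 15 (best)


def weighted_score(ranks):
    """Sum of rank * weight over all categories for one site."""
    missing = WEIGHTS.keys() - ranks.keys()
    if missing:
        raise ValueError(f"missing ranks for: {sorted(missing)}")
    return sum(WEIGHTS[category] * ranks[category] for category in WEIGHTS)


max_score = NUM_SITES * sum(WEIGHTS.values())

# Hypothetical ranks, for illustration only.
site_a = {category: 12 for category in WEIGHTS}
site_b = {**site_a, "File Size/Storage Space": 3}

print(max_score)               # 285
print(weighted_score(site_a))  # 228
print(weighted_score(site_b))  # 210
```

With these weights the best possible score is 285, and a single weak rank in a 2x category costs a site twice as much as the same rank in a 1x category.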
Score Charts
Final Scores
Conclusions
The top four—Viddler, Vimeo, Eyespot, and Blip.TV—all scored over 200 points in the survey and are all excellent options for use in an educational setting.
Viddler’s strengths are an excellent interface coupled with very useful and easy-to-use extra features. I thought the direct web-cam capture provided a simple way to leave quick video comments and instructions, but I was really impressed by the ability to add comments and tags within the video timeline. This provides students with a great way to manage notes. Viddler is also the only site that allowed full-screen mode functionality with embedded videos.
Vimeo came in 2nd and was the site that was the most fun to use. The ease of setup and uploading surpasses the other sites in the showdown. In addition, Vimeo allows you to set the size and settings of the video and generate the new embed code on the fly. Vimeo’s only limitation is its 250MB per week limit (about 50 minutes of compressed video). Compare that to Viddler’s 500MB per file limit with no weekly maximum and you can see it’s a serious drawback if you are posting lots of video or plan on pre-posting a large amount in preparation for an upcoming quarter or semester. If you remove the file size category from the review, Vimeo actually comes out on top. That’s pretty remarkable considering Vimeo provides very little in the extras department. In short, what Vimeo does do, it does very well.
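As a rough sanity check on those limits, this Python sketch works out the video bit rate implied by "250MB is about 50 minutes" and how much video a single 500MB file would hold at that same rate. It assumes decimal megabytes and a constant bit rate, so treat the results as ballpark figures only.

```python
def implied_kbps(megabytes, minutes):
    """Average bit rate (kbits/sec) if `megabytes` holds `minutes` of video."""
    bits = megabytes * 1_000_000 * 8
    return bits / (minutes * 60) / 1000


def minutes_that_fit(megabytes, kbps):
    """Minutes of video at `kbps` that fit into `megabytes`."""
    bits = megabytes * 1_000_000 * 8
    return bits / (kbps * 1000) / 60


rate = implied_kbps(250, 50)
print(f"Implied bit rate: about {rate:.0f} kbits/sec")
print(f"One 500MB file holds about {minutes_that_fit(500, rate):.0f} minutes")
```

At roughly 667 kbits/sec., a single 500MB upload holds about 100 minutes, which is twice what the 250MB weekly allowance permits in total.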
Blip.tv was easy to use and had a wide array of features. Especially useful is the one-click distribution, which lets you quickly post your videos to your blog or as an iTunes podcast. Blip also accepts every video file format you can throw at it, including RealMedia and 16×9 aspect ratios.
Eyespot also had a clean and user-friendly interface. Its big advantage was the ease of use of its editing and remixing features. If that’s a feature you want, Eyespot is the best.
So, those are the best of the bunch and the ones I would recommend to faculty who are interested in using video in their online classrooms. If you have any questions or comments about my rankings or want to know more about the sites I reviewed, feel free to comment on the blog or send me an email at rsalisbu@depaul.edu.
Video-Sharing Network Showdown, Part 1
With the increased use and demand for video in distance learning and the popularity of video services such as YouTube, I wondered what role these video-sharing services could play in an educational environment. Often an institution may not provide internal video hosting or time requirements may not allow the instructor to go through the centralized service and still meet the needs of the class. In these cases, a video sharing service can provide the solution for hosting and sharing the videos.
Clearly YouTube is the most well known of the video hosting platforms—but is it the best for educational use? Several competitors are slowly gaining an increased audience and are attempting to differentiate themselves from YouTube by providing a better user experience and/or unique set of features such as subtitling or editing.
I want to compare the leading video-sharing networks from an instructor’s perspective and find which one is best suited for use in an online classroom. The first step was to eliminate the sites that I didn’t think would fit into an educational setting and thus were not worth comparing.
Elimination Criteria
The following criteria were used to eliminate certain video-sharing sites from consideration:
Ad Networks: I eliminated Revver and other sites that were primarily ad networks that embed ads into the uploaded videos and provide no opt-out option.
Site Editorial Control: Sites that must approve content before it is posted were also eliminated from consideration. For example, VideoJug was eliminated because it maintains strict editorial control of all posted videos and will take down any video that does not meet its site requirements for a “How To” video.
Cost: I will only evaluate free video sharing sites. I excluded pay sites and will not evaluate features that are only available to upgraded accounts.
Which Sites Will Be Compared?
After applying the elimination criteria, these are the sites that were chosen to participate in the showdown:
The Next Step
In my next post, the sites will be ranked from 1 to 14 (1 being worst, 14 being best) based on how well each one meets specific evaluation criteria. Important or crucial categories will be given a multiplier to give them extra weight in the rankings. The cumulative scores will be tabulated and the site with the highest score will be declared the winner.