USB 3.0 and Linux

USB is getting a facelift!

In the beginning, there was USB 1.1, with the “low speed” and “full speed” devices (at 1.5 Mbps and 12 Mbps, respectively). Then USB 2.0 came along with “high speed” devices that ran at 480 Mbps. Now the new USB 3.0 bus specification defines “SuperSpeed” devices that run at 5 Gbps (5,000 Mbps).

SuperSpeed Logo - image copyright 2008 Sarah Sharp
Now that the bus specification is public, I can finally talk about the code I’ve been developing at work. I’ve been writing a Linux driver for xHCI (the new USB 3.0 host controller), and changing the Linux kernel stack to support USB 3.0 devices. On November 17th, I got to demo my work at the world’s first USB 3.0 “SuperSpeed” Developers Conference.

This is a demo showing a USB 3.0 Mass Storage Device (commonly called a USB drive, thumb drive, or flash drive) prototype running under Linux with an unmodified Mass Storage Device driver. My Linux xHCI driver is necessary to communicate with the USB 3.0 device through the xHCI host controller prototype. The FPGA prototype was provided by Fresco Logic, a company that sells host controller and device IP.

The demo showed speeds that were about 3.5 times faster than USB 2.0 high speed devices. I expect this demo to be even faster when the device and host controller are implemented in silicon.

Details about USB 3.0

USB 3.0 is about 10 times faster than USB 2.0 (5 Gbps versus 480 Mbps on the wire). Roughly speaking, that means a file that takes 30 minutes to transfer over USB 2.0 could take 3 minutes to transfer over USB 3.0.

USB 3.0 also provides better power management, which translates to longer laptop battery life. USB 3.0 is backwards compatible. That means you can plug all your USB 2.0 devices into a USB 3.0 port, or plug your USB 3.0 device into a USB 2.0 port. The USB 3.0 device will work at USB 2.0 speeds in the latter case, but that means consumers don’t have to upgrade their PC or laptop to use USB 3.0 devices at the slower speed.

FAQ

Q: When will there be USB 3.0 host controllers and devices?

A: Jeff Ravencraft, USB-IF President and Chairman, estimates that USB 3.0 devices will be shipped as early as mid-2009. See his SuperSpeed DevCon keynote slides, page 14.

Q: When will Windows have USB 3.0 support?

A: In the SuperSpeed Developers Conference keynote, Sriram Rajagopalan of Microsoft announced that they needed to rearchitect the USB stack, and they would have USB 3.0 support in “Windows 7+”. Windows 7 is Microsoft’s operating system version after Vista.

From a later Windows USB 3.0 session, Lars Giusti’s slides say, “Early input from some partners indicates they would like us to consider supporting it on Windows Vista and newer Windows OS.” Of course, that is only a statement of a request for support, not an official statement of support for USB 3.0 under Windows Vista. Nothing has been said of Windows XP.

Basically, Windows users have been promised official USB 3.0 support for Windows 7, not Vista or XP or older OSes. Some other USB vendors might ship unofficial Windows drivers for other Windows OSes, but that is the official word from Redmond as of now.

Q: When will Linux have USB 3.0 support?

Tux, the Linux mascot
A: For Linux to have basic USB 3.0 support, two things need to be added. First, we need to add support to the Linux USB stack to handle the new device speed and other changes mandated by the USB 3.0 bus specification. Second, we need to have a driver for the xHCI host controller. A host controller is the hardware that sits behind your USB port and talks to the USB devices you plug in.

Now that the bus specification is public, I can start pushing the patches for the USB core changes. They will need to be reviewed and possibly changed before they make it into the mainline kernel. Once the changes make it into the mainline kernel, they’ll be picked up by Linux Operating System Vendors like Red Hat, Novell, and Ubuntu.

The xHCI host controller driver is a little trickier. The xHCI specification is not public yet. It’s currently available under NDA with Intel as a 0.9 draft specification. Since it’s not a public specification, I’m forbidden to ship code that would reveal what’s in the specification. That means the xHCI driver can’t be sent out for review by the whole Linux community until the xHCI specification is public. The driver is much bigger than the USB core code changes, so I know it will go through several review iterations before it gets accepted into the mainline kernel.

The beauty of this open source process is that you can watch the development by following the Linux USB mailing list. I’ll also update my blog with any announcements about Linux USB 3.0.

Q: Ok, so what does “basic” Linux USB 3.0 support mean?

A: Basic support means that some features might be lacking. We might not have awesome power management right off the bat, or we might be missing USB 3.0 support from some class drivers, or, heck, some versions of my xHCI driver might crash your system. My driver will be marked EXPERIMENTAL for a reason. 😉

The Linux philosophy is to ship early, and ship often. When code is shipped, more eyes can look at it and improve the code. Shipping partially functional code is better than waiting until the code is “perfect” and having the community report some fatal flaw or, worse, not understand why you’re trying to get changes in.

Q: Will there be a Linux compliance program?

A: The USB-IF has run, and always will run, the official USB compliance program. Devices and host controllers that pass the compliance suite are allowed to use the official USB logos on their products. There are no plans for a separate Linux compliance program.

However, I’m willing to test new devices and host controllers on an unofficial basis to make sure they work properly under Linux. You can contact me at my Intel work address at Sarah.A.Sharp at linux.intel.com.

About the SuperSpeed DevCon Windows Demo

Fresco Logic in the keynote - image copyright 2008 Dian Kurniawan
I can’t really talk about my Linux demo without talking about the Windows Demo that was in the SuperSpeed Developers Conference keynote. The demo was created by the Intel team that works closely with the USB-IF. It was the same demo they used for IDF Taipei in August 2008. The demo used the same Fresco Logic USB 3.0 prototypes and nearly the same PC system that I used.

Their goal was to show the maximum speed possible from the host controller and USB 3.0 device. To do that, they ran a simple compliance test suite that allocated a giant DMA buffer and sent data as fast as possible to the USB 3.0 device. The device was programmed to use the USB 3.0 protocol, but it was basically a loop back device. Their demo showed speeds of 318 MBps.

Fresco Logic windows keynote screenshot - image copyright 2008 Dian Kurniawan
The “wire speed” of USB 3.0 is 5 Gbps. The 8b/10b encoding and upper-level protocol overhead make USB 3.0 applications see less than that. The projected max application bandwidth for USB 3.0 devices is in the 400 MBps range. In comparison, the wire speed of USB 2.0 is 60 MBps, and applications see around 35 MBps.
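To make the encoding overhead concrete, here is a back-of-the-envelope sketch in Python. It only accounts for the line encoding (8 data bits carried in every 10 bits on the wire), so it gives a ceiling above the projected 400 MBps range; the remaining gap is the upper-level protocol overhead, which this sketch does not model. The function name `payload_ceiling_mbps` is made up for illustration, and 1 MBps is assumed to mean 1,000,000 bytes per second.

```python
# Rough ceiling implied by line encoding alone (protocol overhead not modeled).
# Assumption: 1 MBps = 1,000,000 bytes per second.

def payload_ceiling_mbps(wire_bits_per_sec, data_bits=8, coded_bits=10):
    """Return the byte rate in MBps left over after line-encoding overhead."""
    payload_bits_per_sec = wire_bits_per_sec * data_bits / coded_bits
    return payload_bits_per_sec / 8 / 1_000_000

# USB 3.0: 5 Gbps on the wire, 8 data bits per 10 wire bits.
usb3_ceiling = payload_ceiling_mbps(5_000_000_000)

# USB 2.0: plain bits-to-bytes conversion, matching the 60 MBps figure above.
usb2_wire = payload_ceiling_mbps(480_000_000, data_bits=1, coded_bits=1)

print(usb3_ceiling, usb2_wire)
```

The USB 3.0 result is the best case before any packet headers, acknowledgements, or class-driver overhead are subtracted.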

About the SuperSpeed DevCon Linux Demo

Linux and Windows USB 3.0 demos side by side - image copyright Dian Kurniawan
Fresco Logic, a company selling USB 3.0 host controller and device IP, provided hardware prototypes for an xHCI host controller and a USB 3.0 Mass Storage Device. For the demo, the host controller and the device were implemented on the same FPGA. I didn’t have time to get a two-card solution working.

Linux USB 3.0 demo - image copyright 2008 Sarah Sharp
The Linux demo copied Arjan’s 5-second boot video to the MSD, and played it using mplayer. The demo then read the 64KB mass storage device using dd with various block sizes, and displayed the host controller bandwidth measurements on a traffic graph. Visitors could see traffic spikes as the dd commands started and completed. They could also see the bandwidth increase as the block size increased to 16KB. In trial runs, block sizes larger than 16KB did not increase performance.

The script forced the system to drop all caches between runs (with `echo 3 > /proc/sys/vm/drop_caches`). This made sure that the video would actually be fetched from the device, instead of being cached. The dd command also used the direct I/O flag to ensure parts of the disk would not be cached.
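A read sweep along these lines could be sketched in Python as follows. This is not the script used in the demo: the helper names (`dd_read_command`, `drop_caches`, `run_sweep`) and the device path are placeholders, and the only pieces taken from the text are the `drop_caches` write and dd's direct I/O flag (`iflag=direct` in GNU dd). Writing to `/proc/sys/vm/drop_caches` requires root.

```python
import subprocess

DROP_CACHES = "/proc/sys/vm/drop_caches"

def dd_read_command(device, block_size, total_bytes):
    """Build a dd command that reads total_bytes from device, bypassing the page cache."""
    count = total_bytes // block_size
    return ["dd", f"if={device}", "of=/dev/null",
            f"bs={block_size}", f"count={count}", "iflag=direct"]

def drop_caches():
    """Same effect as `echo 3 > /proc/sys/vm/drop_caches` (needs root)."""
    with open(DROP_CACHES, "w") as f:
        f.write("3\n")

def run_sweep(device, total_bytes, block_sizes):
    """Read the same amount of data once per block size, dropping caches before each run."""
    for block_size in block_sizes:
        drop_caches()
        subprocess.run(dd_read_command(device, block_size, total_bytes), check=True)
```

A call such as `run_sweep("/dev/sdX", total_bytes, [512, 4096, 16384])` would then step through the block sizes, with `/dev/sdX` and `total_bytes` replaced by the real device node and the amount of data to read.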

The timing measurements were taken with ktime_get(), which uses high res timers. The goal was to measure the best possible bandwidth the operating system would see. This is the same measurement that the Windows demo used. The Windows compliance driver allocated one giant DMA buffer. The Linux stack set up DMA scatter-gather lists dynamically, as real applications requested data.

The measurements were fairly simple. When an URB was enqueued, the host controller driver would take a timestamp with ktime_get() just before the URB’s buffer was passed off to the hardware. When the host controller interrupted the system to indicate buffer completion, the host controller driver would take another timestamp. The delta time and the number of bytes transferred were then passed off to a generic HCD statistics reporting module that would return the delta time and bytes pairs to userspace. I got this reporting working two days before the demo.
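The driver code is kernel C and is not shown here. As a user-space analogue only, the bookkeeping described above can be sketched in Python, with `time.monotonic_ns()` standing in for ktime_get(). The class and method names (`TransferTimer`, `submit`, `complete`) are hypothetical and do not correspond to anything in the real driver.

```python
import time

class TransferTimer:
    """Record (delta_ns, nbytes) pairs for transfers, keyed by an opaque transfer id."""

    def __init__(self, clock=time.monotonic_ns):
        self._clock = clock      # injectable so the logic can be tested without real time
        self._started = {}       # transfer id -> timestamp taken at hand-off
        self.samples = []        # completed (delta_ns, nbytes) pairs

    def submit(self, transfer_id):
        """Timestamp taken just before the buffer is handed to the hardware."""
        self._started[transfer_id] = self._clock()

    def complete(self, transfer_id, nbytes):
        """Timestamp taken on completion; store the elapsed time with the byte count."""
        delta_ns = self._clock() - self._started.pop(transfer_id)
        self.samples.append((delta_ns, nbytes))
        return delta_ns
```

Keeping the raw pairs, rather than a precomputed rate, leaves the choice of units and averaging to whatever reads the statistics.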

Linux USB demo stack - image copyright 2008 Sarah Sharp
My wonderful husband, Jamey, found an open source real-time graphing program called Trend. It looked perfect for the demo, except for the fact that it would only update when it was given data. Since the HCD stats file would block when there wasn’t data (we didn’t want to spend too many CPU cycles returning zero), Jamey wrote a threaded C program to sample the input from the HCD stats file.

The program takes the HCD stats file as an input, calculates the throughput (in MBps) for each pair read from the stats file, and passes off the max throughput to Trend for each sample. It would pass zeros off to Trend if the file I/O was blocked. This meant the Trend graph updated at the same rate, even if there was no USB traffic. The Trend graph also auto-updated its vertical scale as the throughput increased. This seemed to confuse some people, but I thought the “growing” effect was fun.
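The original sampler was written in C and is not reproduced here. The following Python sketch shows the same idea under stated assumptions: one thread does the blocking reads, and the main loop emits one value per tick, which is zero when nothing arrived. The stats file format (one `delta_ns bytes` pair per line), the function names, the sampling interval, and the output format are all guesses for illustration; MBps is taken to mean 1,000,000 bytes per second.

```python
import queue
import threading
import time

def throughput_mbps(delta_ns, nbytes):
    """bytes / ns converted to MBps: (nbytes / delta_ns) * 1e9 / 1e6."""
    return nbytes * 1000 / delta_ns

def reader(stats_file, pairs):
    """Runs in its own thread; blocking reads here never stall the sampler."""
    for line in stats_file:
        fields = line.split()
        if len(fields) == 2:
            pairs.put((int(fields[0]), int(fields[1])))

def sample_once(pairs):
    """Drain everything queued since the last tick; return the peak MBps, or 0.0 if idle."""
    peak = 0.0
    while True:
        try:
            delta_ns, nbytes = pairs.get_nowait()
        except queue.Empty:
            return peak
        if delta_ns > 0:
            peak = max(peak, throughput_mbps(delta_ns, nbytes))

def run(stats_file, out, interval_s=0.1, ticks=None):
    """Emit one sample per interval, forever or for a fixed number of ticks."""
    pairs = queue.Queue()
    threading.Thread(target=reader, args=(stats_file, pairs), daemon=True).start()
    emitted = 0
    while ticks is None or emitted < ticks:
        time.sleep(interval_s)
        print(sample_once(pairs), file=out, flush=True)
        emitted += 1
```

Because `sample_once` never blocks, the output rate depends only on the interval, which is what keeps the graph scrolling when the bus is idle.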

The Windows demo saw around 318 MBps, while the Linux demo typically showed 125 MBps. I saw as high as 233 MBps while formatting the disk. dd is not the best application to use for performance testing; I only used it to whip up a simple demo.

Application layer measurements showed poor performance (around 2 MBps). I think two things added fixed latencies between the application layer and the host controller hardware. First, there was a massive amount of debugging output in the host controller driver and mass storage driver. Second, I had placed some msleep() calls in the USB MSD driver so that I could see the debugging output and trigger a PCI analyzer at the same time. I didn’t have time to take those out before I ran my demo. I need to rerun the tests with debugging disabled and profile the upper layers of the stack for other bottlenecks.

If you’re interested in seeing the demo run on an EHCI host controller and a high speed USB 2.0 MSD, you can view the Google video here, or download the higher resolution ogg here.