NVMe Servers: The Next Leap in Bare Metal Performance

May 14th, 2015

Jacob Smith

SVP, Revenue
A few years ago, SSDs took the industry by storm, boosting performance for all kinds of use cases. Now Packet is introducing the next latency-killing revolution, and its name is NVMe.

Before we get to the promise of NVMe: what is it?

NVM Express (or NVMe) is an interface standard for taking full advantage of flash media (aka SSDs) attached over PCIe. While PCIe-attached flash has been in play for several years via expensive, proprietary solutions (FusionIO anyone?), the industry has now adopted the NVMe standard, opening the door to all kinds of innovation.

In the next year or so, you’re going to see a LOT of NVMe flash hitting the market, especially for data-intensive applications, where it can deliver six times the throughput of existing 6 Gb/s SATA SSDs with significantly lower latency. Zoom zoom!

Wait, 6x the throughput AND lower latency on reading and writing data? How does this thing work? Well, quite simply, it gets rid of the SATA bus and the associated AHCI-based drivers, and streamlines the I/O stack. SATA was designed primarily for interfacing with mechanical hard disk drives (HDDs), and has become increasingly inadequate as SSDs have improved - frankly, the disks are faster than the connector.

By ripping out plumbing that was built for a world of slow, queued-up, spinning disks, NVMe flash drives offer reduced latency, more input/output operations per second (IOPS), much more bandwidth to the processor, and lower power consumption than SAS-based or SATA-based SSDs.

Key Benefits of NVMe

| | NVMe | AHCI (SATA) |
|---|---|---|
| Latency (in microseconds) | 2.8 µs | 6.0 µs |
| Maximum Queue Depth | Up to 64K queues with 64K commands each | Up to 1 queue with 32 commands each |
| Multicore Support | Yes | Limited |
| 4KB Efficiency | One 64B fetch | Two serialized host DRAM fetches required |

Source: Intel via http://baremet.al/nvme-differences
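To put the table in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the figures from the table above and takes "64K" to mean 65,536, which is an assumption about Intel's shorthand. The function name `max_outstanding` is ours, purely for illustration.

```python
# Figures copied from the comparison table above (Intel's numbers, not measured here).
NVME_LATENCY_US = 2.8
AHCI_LATENCY_US = 6.0

# Assumption: "64K" in the table means 64 * 1024 = 65,536.
NVME_QUEUES = 64 * 1024
NVME_COMMANDS_PER_QUEUE = 64 * 1024
AHCI_QUEUES = 1
AHCI_COMMANDS_PER_QUEUE = 32


def max_outstanding(queues: int, commands_per_queue: int) -> int:
    """Theoretical upper bound on commands that can be queued at once."""
    return queues * commands_per_queue


latency_ratio = AHCI_LATENCY_US / NVME_LATENCY_US
nvme_in_flight = max_outstanding(NVME_QUEUES, NVME_COMMANDS_PER_QUEUE)
ahci_in_flight = max_outstanding(AHCI_QUEUES, AHCI_COMMANDS_PER_QUEUE)

print(f"AHCI latency is about {latency_ratio:.1f}x NVMe latency")
print(f"NVMe theoretical queued commands: {nvme_in_flight:,}")
print(f"AHCI theoretical queued commands: {ahci_in_flight:,}")
```

The queue numbers are protocol ceilings rather than something a real drive will sustain, but they show why NVMe scales across many cores while AHCI funnels everything through a single short queue.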

 

Packet and NVMe - Our Backstory

Packet’s product strategy prioritizes consistent availability over customization. Our goal with each configuration is to push the boundaries of performance while delivering a lot of value for the price.

Looking at the market, we saw a dearth of good options for high-IO workloads: big data, databases, etc. The best option is the Rackspace High IO OnMetal box, which relies on a pair of super expensive 1.6TB Seagate Nytro WarpDrives. This means you’re paying $1750 a month + management per box (or about $2.50 an hour before management).

AWS offers a high-IO option (the i2.4xlarge), but it still works out to $3.41 an hour. We wanted to do better - our goal was less than $2 per hour.
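For anyone who wants to check the math, here is a rough sketch of the hourly/monthly conversions behind those prices. It assumes the $1750 Rackspace figure is a monthly price and that a billing month is 730 hours; both are our assumptions, and the helper names are hypothetical.

```python
# Assumption: one billing month = 365 days * 24 hours / 12 months = 730 hours.
HOURS_PER_MONTH = 730


def monthly_to_hourly(monthly_usd: float) -> float:
    """Convert a monthly price to an approximate hourly price."""
    return monthly_usd / HOURS_PER_MONTH


def hourly_to_monthly(hourly_usd: float) -> float:
    """Convert an hourly price to an approximate monthly price."""
    return hourly_usd * HOURS_PER_MONTH


# Prices quoted in this post; management fees are not included.
rackspace_hourly = monthly_to_hourly(1750)   # assumed to be a monthly price
aws_monthly = hourly_to_monthly(3.41)
target_monthly = hourly_to_monthly(2.00)     # our "less than $2 per hour" goal

print(f"Rackspace OnMetal IO: ~${rackspace_hourly:.2f} / hr")
print(f"AWS i2.4xlarge:       ~${aws_monthly:,.0f} / month")
print(f"$2 / hr target:       ~${target_monthly:,.0f} / month")
```

Under the 730-hour assumption the Rackspace box lands a little under $2.50 an hour, which is why we call it "about" $2.50.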

We quickly became enamored with the emerging NVMe standard, and the ecosystem of flash drives coming to market that take advantage of these huge increases in performance with no real change to the underlying NAND.  Our friends at SuperMicro shipped us a FatTwin with 2 x 800GB Intel P3600 NVMe Flash Drives.  We were smitten!  But could we bring this tech to the masses, not just the chosen few with deep pockets?

Our Type 3 Config - A “Price to Performance” Powerhouse at Just $1.75 / hr

To ensure that this config would work for a broad enough range of power users, we beefed up our FatTwins with a healthy dose of extra-speedy, low-latency DDR4 RAM (128GB DDR-2100 ECC) and a generous number of processing cores (16 physical / 32 hyperthreaded, via 2 x E5-2640 v3 chips running at 2.6GHz).

We kept the boot disks fast with dual enterprise SSDs. This means that our Type 3 config can tackle intensive IO applications that also require fast processing threads. It is primed for in-memory databases like Redis and Aerospike. And with 32 hyperthreaded cores, it even serves as a pretty awesome virtualization node for private cloud workloads (e.g. VMware or OpenStack / Xen / KVM, etc).

Massive Disk IO Requires a Better Network

With compute and storage demands increasing and scale-out architectures becoming more common, inter-server traffic has become a critical scalability bottleneck. Most cloud providers limit “east-west” traffic between servers, either because of under-investment in hardware or because of limitations in the software-based routers (e.g. OVS, tunnel overhead) that are the basis for most virtual clouds. To avoid that bottleneck, we include bonded 2 x 10GbE SFP+ interfaces that offer high-throughput, low-latency traffic between servers, and we have over-invested in our switching infrastructure to ensure maximum performance and reliability.
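A quick sketch shows why the network matters at this level of disk performance. It compares the ~1 million 4k random read IOPS figure quoted for the Type 3 in this post against the raw line rate of two bonded 10GbE links. It assumes 4k means 4,096 bytes and ignores protocol overhead and bonding efficiency, so treat the output as a rough ceiling rather than a benchmark.

```python
# Storage side: Packet's quoted figure of ~1 million 4k random read IOPS.
IOPS = 1_000_000
BLOCK_BYTES = 4 * 1024          # assumption: "4k" = 4,096 bytes

# Network side: bonded 2 x 10GbE, raw line rate only (no protocol overhead).
LINKS = 2
LINK_GBITS_PER_S = 10

storage_gbytes_per_s = IOPS * BLOCK_BYTES / 1e9
network_gbytes_per_s = LINKS * LINK_GBITS_PER_S / 8   # 8 bits per byte

print(f"Storage random read ceiling: ~{storage_gbytes_per_s:.2f} GB/s")
print(f"Bonded network ceiling:      ~{network_gbytes_per_s:.2f} GB/s")
print(f"Storage / network ratio:     ~{storage_gbytes_per_s / network_gbytes_per_s:.2f}x")
```

Even with both links bonded, the drives can in principle read data faster than the network can ship it, so anything less than 2 x 10GbE would leave a lot of that flash performance stranded on the box.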

How It Compares

So, how’d we do? Pretty good, we think! The specs of the Type 3 are off the charts, and we managed to bring the price down to a point where just about anybody can harness the power of these incredible machines to tackle their intensive (or just well loved!) workloads. As always, they’re available on-demand, by the hour, via the Packet platform.
 

| | Packet “Type 3” | Rackspace OnMetal IO | AWS i2.4xlarge |
|---|---|---|---|
| Price | $1.75 / hr | ~$2.50 / hr + management | $3.41 / hr |
| Storage | 2 x 800GB Intel P3600 NVMe Flash Drives | 2 x 1.6TB Seagate Nytro WarpDrive BLP4-1600 Cards | 4 x 800GB SSDs |
| Boot | 2 x 120GB SSDs | 32GB SATADOM | n/a |
| IOPS (random read) | ~1 million (4k) | 370k (8k) | 100k (4k) |
| RAM | 128GB DDR4 | 128GB | 122GB |
| Processing Cores | 16 physical | 20 physical | 16 virtual |
| Network | 2 x 10GbE (SFP+) | 2 x 10GbE | Enhanced networking available |
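One way to read the comparison is price per unit of random read performance. The sketch below divides each hourly price by its quoted IOPS figure. The numbers come straight from this post, with AWS taken at the $3.41 an hour quoted in the text. The metric itself (dollars per hour per 100k IOPS) is just an illustrative yardstick we made up, and it glosses over the fact that the Rackspace figure is for 8k blocks while the other two are for 4k.

```python
# Hourly price and quoted random read IOPS for each config, from this post.
# Rackspace pricing excludes management; its IOPS figure is for 8k blocks.
configs = {
    "Packet Type 3": {"usd_per_hour": 1.75, "iops": 1_000_000},
    "Rackspace OnMetal IO": {"usd_per_hour": 2.50, "iops": 370_000},
    "AWS i2.4xlarge": {"usd_per_hour": 3.41, "iops": 100_000},
}


def usd_per_100k_iops(usd_per_hour: float, iops: int) -> float:
    """Hourly cost of each 100,000 random read IOPS (illustrative metric)."""
    return usd_per_hour / (iops / 100_000)


results = {
    name: usd_per_100k_iops(c["usd_per_hour"], c["iops"])
    for name, c in configs.items()
}

# Print cheapest first.
for name, cost in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name:<22} ${cost:.3f} / hr per 100k IOPS")
```

Vendor-quoted IOPS are measured under different conditions, so this is a rough comparison at best, but the gap is wide enough that the ordering is unlikely to change.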