Modified squid optimized for SSD/HDD mixed setup (based on squid-2.7.STABLE9)

From: Ning Wang <renren.wangning_at_gmail.com>
Date: Sun, 26 Aug 2012 13:05:29 +0800

Hi all,

Here is our modified squid, optimized for an SSD/HDD mixed setup (based
on squid-2.7.STABLE9), hosted at https://github.com/renren/squid-ssd.git

Squid-ssd is suitable for serving millions of small files such as
avatars, pictures, and photos. For serving big files, you can use the
other stores squid has had for years.

To build squid-ssd, add the --enable-coss-aio-ops option to the
configure script, and then run make && make install.
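
A minimal build sketch, assuming a standard autoconf checkout of the
repository (the install prefix and the --enable-storeio list here are
illustrative; coss just needs to be among the compiled-in stores, as
with stock squid-2.7):

./configure --prefix=/usr/local/squid \
        --enable-storeio="coss,aufs,ufs" \
        --enable-coss-aio-ops
make && make install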

In the config file, we add a 'backstore' option to the coss store.
Below is an example configuration that fronts three 400000 MB HDD
stripes with one 100000 MB SSD stripe (cache_dir sizes are in
megabytes):
cache_dir coss /mnt/ssd0/stripe 100000 \
        backstore=/mnt/hdd0/stripe,400000 \
        backstore=/mnt/hdd1/stripe,400000 \
        backstore=/mnt/hdd2/stripe,400000 \
        max-size=1024768 block-size=1024
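
Before initializing anything, you can sanity-check such a configuration
with squid's built-in config parser (the binary and config paths below
are assumptions following the build prefix above):

/usr/local/squid/sbin/squid -f /usr/local/squid/etc/squid.conf -k parse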

For a clean setup, you can use squid -z to initialize the coss stripe
files. Besides that, squid-ssd is designed to be compatible with legacy
coss stripe files. In the example above, say you already have three
coss stripe files /mnt/hdd{0,1,2}/stripe on three different HDDs, and
you have just installed an SSD and mounted it at /mnt/ssd0. You can
initialize the SSD stripe file by copying the head of an existing
stripe:

dd if=/mnt/hdd0/stripe of=/mnt/ssd0/stripe bs=1M count=100000

Architecture

Storing small files is a big problem for most conventional native
filesystems: there is too much metadata, and data is scattered all over
the place, so heavy random reads and writes are involved when
allocating, writing, reading, moving, and deleting files. For a cache
like squid, the same problem exists and is amplified, because a cache
serves heavy random reads.

The OS filesystem (memory) caching mechanism may help, if all that
metadata is cached in memory. But caching data is also important, and
sometimes more important, because serving data from the memory cache
avoids disk I/O entirely, including metadata I/O.

Squid's COSS store is designed on the principle that pieces are
bundled together to reduce the aforementioned overhead. Small files are
packed into big COSS files; although reads are still mostly random,
writes are sequentialized as much as possible. And a few big files
carry far less metadata overhead than millions of small files. For
real-world maintenance, COSS is also an optimized (though not
problem-free) solution, because loading a COSS file is totally
sequential.

You may find many merits in the COSS store, but the world is not
perfect yet: random reads from within a few big files are still random
reads. And COSS introduces extra overhead for relocating hot content,
even if that overhead is not heavy.

So using SSD is a natural and logical choice. The current generation
of SSDs is excellent at sequential writes and at sequential and random
reads, and has decent random-write capability (compared to HDD).

SSD is getting cheaper every day, but it is still expensive compared
to HDD. An all-SSD setup may or may not be a good idea.

If your workload has a high repeat rate (that is, on average, cached
objects are accessed many times) and the cache eviction & replacement
rate is low (so total disk capacity is not an issue), use an all-SSD
setup. You can even use a 10G NIC to match the throughput capability.

Nevertheless, if you run a popular web 2.0 site, you're almost
certainly not so lucky. Avatars should be OK. Some pictures and photos
can be hot spots, but only for a short time while the related content
is new. They will be dug up some months later, accessed a few times,
and forgotten under the dust again.

So you face these challenges when applying SSD in CDN nodes:
1. You need to serve huge traffic while keeping responses quick.
2. You need fast writes for the modest traffic of back-to-origin fetches.
3. You need big capacity to reduce back-to-origin fetches; SSD is expensive.

Squid-ssd addresses those challenges well:
1. Users read from SSD; back-to-origin fetches are written to SSD.
2. Objects are evicted from SSD to HDD, and from HDD they are simply dropped.
3. Objects are promoted from HDD to SSD to avoid back-to-origin fetches.

TODOs
1. Use fallocate to initialize coss stripe files on filesystems that
support the fallocate system call (ext4, xfs); an interim manual
workaround is sketched after this list.
2. Add an index for the coss file to speed up startup. This code is in
the coss-index branch, but it has not been reviewed or fully tested.
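
Until TODO 1 lands, you can preallocate a fresh stripe file yourself
with the fallocate(1) utility from util-linux, which issues the same
system call on ext4/xfs (the size matches the 100000 MB SSD stripe in
the example above; whether squid-ssd accepts a preallocated all-zero
stripe without further initialization is an assumption, so still run
squid -z for a clean setup):

fallocate -l 100000MiB /mnt/ssd0/stripe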

LIMITATIONS
Squid 2.7 is single-process and can't scale well on multi-core systems
(however, you can run multiple instances when convenient; a sketch
follows below). You may see throughput capped at 400-600 Mbps when
serving small files (such as avatars) because one core is nearly 100%
utilized.
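
A hypothetical two-instance layout (the binary path, config paths, and
core numbers are illustrative; each instance needs its own http_port,
cache_dir, and pid_filename in its config file, and taskset pins each
process to its own core):

taskset -c 0 /usr/local/squid/sbin/squid -f /usr/local/squid/etc/squid-a.conf
taskset -c 1 /usr/local/squid/sbin/squid -f /usr/local/squid/etc/squid-b.conf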

And the wiki: https://github.com/renren/squid-ssd/wiki