35

I work in film production and need VERY fast access to very large raw footage files. I am presently using eSATA 6Gbps docks for internal hard drives, which go as fast as the drive will go.

If I were to use a server and connect to my workstations via network, what (relatively affordable) technology would allow me to get anywhere near to, or exceed the drive speed?

hjpotter92
  • 722
  • 2
  • 13
  • 27
Ben
  • 261
  • 3
  • 4
  • 12
    10Gbps Ethernet is well into "affordable if you have budget" range these days. However, an 8 or 16Gbps SAN (storage area network, aka Fibre Channel) might be better suited to the particular use? – Ecnerwal Mar 08 '17 at 02:00
  • @Ecnerwal Many thanks for your thoughts. What would make Fibre Channel more appropriate? – Ben Mar 08 '17 at 02:17
  • 3
    It's specifically designed/optimized for storage networks. – Ecnerwal Mar 08 '17 at 02:21
  • 3
    Just want to check -- are you using SSD's? If not, that is an essential upgrade. – trpt4him Mar 08 '17 at 02:33
  • 1
    I see no added value of a SAN compared to a simpler and cheaper NAS here. – jiggunjer Mar 08 '17 at 03:01
  • 1
    I'm really hoping to see some answers that really dive into the networking issue ... not just reminders that the OP should buy fast SSDs. I myself would like to have a very fast home network to access my file server - I'm willing to upgrade the file server (or its adapters) as well as the network, or alternatively, set up a SAN (but I have no idea how to do that). Some pro/cons as well as fiber vs copper explanations would be much appreciated. – davidbak Mar 08 '17 at 06:10
  • @jiggunjer I find NASes to be hit and miss in terms of OS quality; you frequently get very little control over them, and I've had to give up on NASes because they become unstable in use. A SAN would be overkill, yes, but a NAS would be too basic. SANs are at the other end of the scale and way more than the OP needs, but I'd trust one with irreplaceable production data; I wouldn't trust a NAS to do the same. – Gargravarr Mar 08 '17 at 12:13
  • 3
    To properly answer this, more info is needed: how much data must be stored, how many concurrent users, and where the data is accessed from (what kind of machine)? A 100Gb/s network will not help if it is accessed from two or three machines with 1Gb/s network adapters. – JFL Mar 08 '17 at 13:05
  • 1
    LinusTechTips has several videos detailing their very fast internal network which they use in a similar use case. – SethWhite Mar 08 '17 at 14:39
  • I've built such a setup once. We went with a dedicated storage system (96 SATA channels <-> 6x 10GigE on an FPGA). The network part was the smallest problem there, we just gave dedicated ports to the machines that need fast access, and one port went to a switch for the remaining network. – Simon Richter Mar 08 '17 at 17:04
  • I know this doesn't directly address the question, but have you looked into enterprise NAS solutions such as Netapp or EMC? Yes, those can be pricey, but I would think a film production company would have the budget for such a thing. – Charles Burge Mar 08 '17 at 20:50
  • This question reminds me of the stuff I see the people of LinusTechTips do on Youtube. You might want to check out their videos on their storage and video rendering servers. Note that they are a professional company with already quite a big budget and a lot of sponsorships by PC part vendors, so they might be doing some stuff that is out of your budget range. Nevertheless, you could pick up some ideas from their videos. Link to their channel: https://www.youtube.com/user/LinusTechTips – BlueCacti Mar 09 '17 at 10:49
  • What "relatively affordable" means is critical to this question IMHO – Elder Geek Mar 09 '17 at 18:32

6 Answers

31

Here in early 2017, judging from information readily available online, it looks like the fastest SATA HDDs max out at around 220 MegaBytes/sec, which is 1.760 Gigabits/sec.

So if you're just trying to beat the speed of a single drive, and you're limited to HDDs for cost-per-terabyte reasons with huge video files, then 10 Gigabit Ethernet is plenty.
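A quick back-of-the-envelope conversion makes that comparison concrete. This is just an illustrative sketch; the 220 MB/s figure is the fast-HDD estimate above, and the link speeds are nominal line rates rather than application-level goodput:

```python
# Compare per-drive throughput against nominal network line rates.
def mbytes_to_gbits(mbytes_per_s: float) -> float:
    """Convert MB/s (decimal megabytes) to Gbit/s."""
    return mbytes_per_s * 8 / 1000

hdd_gbps = mbytes_to_gbits(220)   # fast 7200 rpm SATA HDD, ~220 MB/s
print(f"Single fast HDD: {hdd_gbps:.2f} Gbit/s")

for name, rate_gbps in {"1 GbE": 1, "10 GbE": 10, "40 Gb link": 40}.items():
    bottleneck = "network" if rate_gbps < hdd_gbps else "drive"
    print(f"{name:>10}: {rate_gbps:5.1f} Gbit/s -> bottleneck is the {bottleneck}")
```

On Gigabit Ethernet the network caps even a single fast HDD; at 10 Gigabit and above the drive becomes the limit, which is the point being made here.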


As an aside, note that Thunderbolt Networking is 10 Gigabit/sec as well, so if you already have Thunderbolt ports, you can experiment with that. It could conceivably beat your 6 Gigabit eSATA 3 ports, although I'm not sure about that because eSATA is very storage-specific, whereas doing storage over Ethernet has more overhead. Also note that Thunderbolt is a desktop bus; it only reaches a few meters, not the 100m that 10 Gigabit Ethernet can handle. So while Thunderbolt may be interesting for experimenting and prototyping while you weigh your options, it's probably not the right long-term solution for you unless you want to keep all your workstations and disks connected back-to-back around a large table.


So that was for single HDDs. But if you RAID those drives together so that every read or write gets spread out across multiple drives, you can get much better performance than a single drive. Also, depending on your budget, you could put PCIe/M.2 NVMe SSDs into a PC to act as a server/NAS and get blazing-fast storage performance (around 3.4 GigaBytes/sec == 27 gigabits/sec) per drive.

In that case you might want to look at something faster than 10 Gigabit Ethernet, but glancing around online, it looks like prices jump quite dramatically beyond 10 Gigabit Ethernet. So you might want to look at doing link aggregation across multiple 10 Gigabit links. I've also seen anecdotes online that second-hand network equipment, such as used 40Gbps InfiniBand gear, can be bought on eBay for dirt cheap, if you don't mind the hassles that come with buying used gear on eBay.
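Whatever link you settle on, it's worth measuring what it actually delivers end to end before spending on faster gear, because drivers, protocol overhead and CPU all take a cut. iperf3 is the standard tool for this; the snippet below is only a rough, hedged stand-in (the port number is arbitrary, and a single Python process may itself be unable to saturate a 10 Gigabit link):

```python
# Minimal TCP throughput probe. Run "python probe.py server" on one
# machine, then "python probe.py <server-ip>" on another; the server
# prints the achieved rate. Use iperf3 for serious measurements.
import socket, sys, time

PORT = 5201          # arbitrary free port (same default iperf3 uses)
CHUNK = 1 << 20      # 1 MiB per send/recv
DURATION = 10        # seconds the client transmits for

def server():
    with socket.create_server(("", PORT)) as srv:
        conn, _addr = srv.accept()
        with conn:
            total, start = 0, time.time()
            while True:
                data = conn.recv(CHUNK)
                if not data:          # client closed the connection
                    break
                total += len(data)
            secs = time.time() - start
    print(f"Received {total / 1e9:.2f} GB in {secs:.1f} s "
          f"= {total * 8 / secs / 1e9:.2f} Gbit/s")

def client(host):
    payload = b"\0" * CHUNK
    with socket.create_connection((host, PORT)) as conn:
        end = time.time() + DURATION
        while time.time() < end:
            conn.sendall(payload)     # closing the socket ends the test

if __name__ == "__main__":
    server() if sys.argv[1] == "server" else client(sys.argv[1])
```

Running this (or, better, iperf3) between the server and each workstation quickly shows whether the wire, the NIC or the storage is the real bottleneck.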

Spiff
  • 101,729
  • 17
  • 175
  • 229
  • 1
    What about multiple 10Gbit NICs + bonding? – André Borie Mar 08 '17 at 10:22
  • 2
    bonding allow to have separate flows use different physical links, but a single flow will use only one NIC. So this will be useful when several files are served to different people at the same time, but will not increase bandwidth for a single file access. – JFL Mar 08 '17 at 10:39
  • But if disks raided, then higher speeds would be possible, no? Wait. I see you mentioned it now. – mathreadler Mar 08 '17 at 10:41
  • Just to make it clear, Ethernet and SATA use 8b/10b encoding, so a disk speed of 440MB/s is equivalent to a SATA/Ethernet speed of 4.4Gb/s. – Jirka Picek Mar 08 '17 at 11:10
  • @JirkaPicek No, I know for a fact that Gigabit Ethernet runs 4 pairs at 250 Mbps each, for a fundamental rate of 1.2Gbps, which is 1Gbps after deducting 8b/10b. I'm pretty sure 10GBASE-T quotes the more realistic rate too, and I bet SATA does too, although I'm less familiar with SATA's low level workings. – Spiff Mar 08 '17 at 15:37
  • Remember Ethernet gets saturated at around 50% utilisation due to collisions and the like, so 1Gb line speed does not equal 1Gb of usable bandwidth, especially when you have more than one device using the network. This is the TL;DR version before people shout at me for being vague. – John U Mar 08 '17 at 16:44
  • For the server, RAID is a good idea, and for affordability: Windows 10 has "Storage Spaces" which is a Software "raid" that, when I experimented with it, got transfer rates comparable to my SSD using two old HDD drives. The advantage being the cost per TB for HDDs is much lower than SSD. ( I discontinued using it because one of the drives failed) – Yorik Mar 08 '17 at 16:50
  • 4
    @JohnU No, Ethernet is 94% efficient with standard 1500 byte frames, and even better with jumbo frames. I routinely see 930+ Mbps of throughput without even trying. Gigabit Ethernet does not allow hubs, only switches, so it is always full duplex and thus never has collisions. – Spiff Mar 08 '17 at 17:05 [see the worked frame-overhead calculation after this comment thread]
  • 1
    @JFL Ethernet bonding (IEEE 802.3ad link aggregation) distributes load packet-by-packet, not per-flow. Because networking technologies are generally designed in discrete layers, Ethernet (layer 2, the data link layer) has no idea about the protocols above it such as IP or TCP, so it has no idea of flows. – Spiff Mar 08 '17 at 17:29
  • @spiff this is true, but it uses a hashing algorithm based on a tuple like source-ip / destination-ip / source-mac / destination-mac that will always have the same result for a given flow, so all packets matching those tuples will use the same physical link. This is by design, to avoid out-of-order packet delivery. – JFL Mar 08 '17 at 17:33
  • 1
    @JFL IEEE 802.3ad (Now part of IEEE 802.1AX) certainly does not use IP addresses in any way, because the IEEE is very careful to make sure their link layer technologies are independent of all layer 3 (network layer) technologies. – Spiff Mar 08 '17 at 17:47
  • @Spiff, maybe, I'll check; in real life, however, IP *can* be used, and even layer 4 information (like TCP / UDP ports), to better balance the traffic. Anyway, whichever criteria are used to balance traffic among the links, a single flow will still always use the same physical link. I think the Linux bonding driver can round-robin the frames, but it is never used this way since it causes issues. – JFL Mar 08 '17 at 18:05
  • 2
    @JFL Linux aggregation certainly can use some layer 3+ information (Linux can do lots of unusual things), but please remember you really need both sides to support the same bonding method for it to be useful. If you are connecting to a switch, then you can really only use what the switch supports. – Zoredache Mar 08 '17 at 18:42
  • @Spiff, where did you find those 220 MB/s HDDs? Especially "on a budget"? Also, the SSDs in my laptop seem to max out at 530 MB/s (4.24 Gbit/s). Might want to update that. – Gizmo Mar 08 '17 at 19:44
  • 1
    NVME SSD's max out at [3.5 GBps](http://www.samsung.com/semiconductor/minisite/ssd/product/consumer/ssd960.html) (not Gbps), and come in fairly large sizes. (if you really want to limit this to sata, why even mention drive speeds? sata III is 6Gbps < 10GbE) – Steve Cox Mar 08 '17 at 22:13
  • @Gizmo Seagate Barracuda Pro and WD Black both get over 200MB/s, and are around US$300 for 5-6TB. He asked for "relatively affordable", which I assumed from context was relative to what video pros are used to spending on equipment, not a teen building his first PC. Perhaps my perspective was skewed by the fact I was also seeing how 40-100Gbps Ethernet switches can cost US$30k or more, which is more than I spent on my last new car. – Spiff Mar 08 '17 at 22:45
  • @SteveCox That's good feedback. I've reworked my Answer so I'm not limiting SSDs to SATA. – Spiff Mar 09 '17 at 03:29
  • I've built high-speed converged storage networks on 10 GbE equipment before. It is definitely possible to link aggregate for 20 Gbps and up where the load is balanced at the frame level, not per flow, but you have to have equipment that supports it and you have to know what you're doing. And enabling jumbo frames makes a huge difference in throughput, but you have to have equipment that supports it and you have to know what you're doing. – Todd Wilcox Mar 09 '17 at 13:58
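To ground the efficiency figures debated in the comments above, here is a hedged worked calculation of the per-frame overhead behind the roughly 94% number. It assumes IPv4 and TCP headers without options and a 9000-byte jumbo MTU, and it ignores ACK traffic and retransmissions, so real goodput sits slightly lower:

```python
# TCP goodput over Ethernet as a fraction of line rate, by MTU.
# Per-frame wire overhead: preamble+SFD 8 B, Ethernet header 14 B,
# FCS 4 B, inter-frame gap 12 B; plus 20 B IPv4 + 20 B TCP headers.
WIRE_OVERHEAD = 8 + 14 + 4 + 12       # bytes on the wire around each frame
IP_TCP_HEADERS = 20 + 20

for mtu in (1500, 9000):              # standard vs. jumbo frames
    goodput = mtu - IP_TCP_HEADERS    # application bytes per frame
    on_wire = mtu + WIRE_OVERHEAD     # total bytes the link must carry
    eff = goodput / on_wire
    print(f"MTU {mtu}: {eff:.1%} of line rate "
          f"(~{eff:.3f} Gbit/s usable per 1 Gbit/s of link speed)")
```

That lines up with the 930+ Mbps figure quoted above for Gigabit Ethernet, and it shows that jumbo frames mainly reduce per-packet processing rather than adding much raw goodput.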
5

If you stick to SATA disks, implementing 10Gb Ethernet and building a reasonably sized RAID10 on the server will net you a noticeable performance increase beyond that of a single SATA disk. This will be a worthy investment because you can share the server among as many workstations as you need, and expand in the future by adding switches. You'll need to run Cat-6 Ethernet cable, as Cat-5e won't cut it - don't forget to add this expense to your calculations. You could also add SSDs as cache to speed the system up even further; since you're working with video footage, I assume you need vast amounts of storage space, which would be extremely expensive to build purely with SSDs.

You could buy a pre-made rackmount server from Dell or HP and use the hardware RAID card, or if you're more of a hardware person, you could buy a cheaper chassis from Supermicro and build the storage machine yourself, using software RAID in either Windows or Linux. Hardware RAID is often faster when a RAID1 is involved, since software must write to each disk in turn and wait for the write to complete before moving onto the next operation; a RAID card can generally write to both disks in parallel, and cache the write operation, returning control to the OS immediately. Do note, however, that although a RAID0 would be even faster, you have no redundancy and a single drive failure will cause complete data loss; never use a RAID0 when you have data that you want to keep. I recommend contacting Dell or HP or another big name to see if they can help you spec out a system to meet your needs.
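To make those RAID trade-offs a little more concrete, here is a rough, hedged sketch of usable capacity, guaranteed fault tolerance and idealised sequential throughput for the levels mentioned. The drive count, capacity and per-drive speed are example figures only, and real arrays and controllers will fall short of the theoretical stripe numbers:

```python
# Idealised characteristics of n identical drives at the common RAID levels.
# Real-world throughput will be lower; RAID10 often survives more than one
# failure, but only one is guaranteed.
def raid_profile(level: str, n: int, drive_tb: float, drive_mb_s: float):
    """Return (usable TB, guaranteed failures survived, ideal read MB/s, ideal write MB/s)."""
    if level == "RAID0":    # pure striping: fast, but zero redundancy
        return n * drive_tb, 0, n * drive_mb_s, n * drive_mb_s
    if level == "RAID1":    # n-way mirror: reads can hit any copy, writes hit them all
        return drive_tb, n - 1, n * drive_mb_s, drive_mb_s
    if level == "RAID10":   # striped mirrors (n even): half the spindles count for writes
        return n / 2 * drive_tb, 1, n * drive_mb_s, n / 2 * drive_mb_s
    raise ValueError(f"unknown level: {level}")

for level in ("RAID0", "RAID1", "RAID10"):
    tb, failures, rd, wr = raid_profile(level, n=8, drive_tb=6, drive_mb_s=220)
    print(f"{level:>6}: {tb:5.1f} TB usable, survives {failures} failure(s), "
          f"~{rd:.0f}/{wr:.0f} MB/s ideal read/write")
```

Note that even the idealised RAID10 read rate here (about 1,760 MB/s, roughly 14 Gbit/s) already exceeds a single 10Gb link, so with several workstations the network, not the array, tends to become the shared bottleneck.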

At the high end of the scale you have Storage Area Networks (SANs) but these are designed to allow many operations from vast numbers of separate clients in parallel; a benefit of this is that the throughput is very high for a small number of connected machines, but is likely overkill for your needs, and very expensive. At the low end, you have Network Attached Storage (NAS) devices that others have mentioned, but although much simpler than a full server, I don't recommend them since a NAS is frequently a black box; they're designed to be plug & play for most users, and as a result you have little control over the OS - I have just had to send back a small NAS I bought for a client's office as it became unstable after a day's use.

Another advantage to building a server is that you're concentrating all your footage in one place, which makes it practical and relatively easy to back it up regularly. Never neglect your backup strategy; one day you'll need to depend on it!

Gargravarr
  • 289
  • 2
  • 14
  • 3
    "software must write to each disk in turn and wait for the write to complete before moving onto the next operation". That hasn't been true for several decades; even non-server disks nowadays support command queuing. And strictly speaking, with DMA software didn't need to wait either. That's been supported in desktops since 1998 or so. – MSalters Mar 08 '17 at 10:25
  • It's probably worth noting that while your first sentence is true for a single user, the 10Gb being shared bandwidth means that with multiple users trying to access the share simultaneously the bandwidth each gets will eventually drop below single user SATA levels. – Dan Is Fiddling By Firelight Mar 08 '17 at 16:59
  • 2
    a NAS isn't some special device by definition; you can have a normal Linux PC with Samba or NFS and call it a NAS. The key difference is file-level sharing vs block-level sharing: a block device can only be used by one client at a time. Also, SANs are more expensive and take longer/more knowledge to configure. – jiggunjer Mar 09 '17 at 02:40
  • @jiggunjer Very true, but most who would build a PC with Samba would call it a file server, not a NAS. I'm just playing on the technicality to mean the special-purpose device. Also very true about SANs, which is why I classed them as overkill. – Gargravarr Mar 09 '17 at 09:39
  • @DanNeely yes, as more clients are added with simultaneous access, speed to each workstation will decrease, but the total rate of data coming out of the server ought to remain very high. Augmenting with SSD cache could compensate for most of this. – Gargravarr Mar 09 '17 at 09:41
3

Nothing beats 10GbE's flexibility and ease of configuration, but SAS is surprisingly networkable all on its own:

For small numbers of workstations (n<8) where volumes do not need to be writable from multiple computers at once, SAS works great. With a Tyan JBOD ($1,500) and an LSI HBA ($400), we get 3,400 MB/s (27 Gbps) sustained transfers to SSD. The JBOD has an internal switch with 3 uplinks to HBAs, but SAS switches are available for higher port counts.

Here is a speed test of one of our volumes:

[Screenshot: CDM5 benchmark showing 2,862 MB/s sequential read at queue depth 32; 4K single-threaded random read 29 MB/s]

Internally, we use this solution with clustered Windows Servers running Storage Spaces that are distributed to clients over 10GbE.

[Diagram: Clustered Windows Servers]

Mitch
  • 1,143
  • 1
  • 9
  • 16
2

https://www.newegg.com/Product/Product.aspx?Item=N82E16820147593&cm_re=samsung_m.2--20-147-593--Product

First up, an M.2 SSD (like the one linked above), which given the correct motherboard can reach 2-4 GB/s, and possibly higher. RAID a few of these together, and your speeds are even higher.

Teaming multiple 10Gb NICs could get you close to native speeds.

Anything faster than 10Gb costs a lot more money.

40 gigabit NIC https://www.serversupply.com/products/part_search/pid_lookup.asp?pid=263133&gclid=Cj0KEQiA9P7FBRCtoO33_LGUtPQBEiQAU_tBgIU2jZrJKf0kXFy96roOllcRkp7j-VoubG_n7xb0_pgaAnaT8P8HAQ

cybernard
  • 13,380
  • 3
  • 29
  • 33
  • Going above 10gb is also likely to put local bus speeds on the scene as a limiting factor :) A PCI-express 2.0 x16 slot is theoretically 64GBit/s - but overheads on all sides (bus to memory, driver software to memory, cpu load...) will gnaw that down pretty close to where you can be glad if your 10GBe runs at line speed ... – rackandboneman Mar 08 '17 at 08:33
  • 2
    @rackandboneman: That's why you have smart NIC's. The 10Gb/s Ethernet is raw speed. The Ethernet, IP and TCP headers can be handled by the NIC, leaving the CPU to deal with just the real data. – MSalters Mar 08 '17 at 10:20
  • That's only IF these offloading mechanisms work in combination with your application scenario - and while they cut down on CPU usage, they won't reduce the data volume THAT significantly unless a lot of that traffic is garbage in the first place... – rackandboneman Mar 08 '17 at 13:14
  • @rackandboneman PCIe 3.0 x16 exists so speeds can be double that. – cybernard Mar 08 '17 at 13:29
  • 10gbe can be far cheaper than that: [Mellanox nic for $180](http://www.provantage.com/mellanox-technologies-mcx311a-xcat~7MLNX1KC.htm) and 25gbe comes close [Mellanox 25gbe for $300](http://www.provantage.com/mellanox-technologies-mcx4111a-acat~7MLNX1XF.htm). What kills you is port cost on switches, but eBay or a full mesh makes that problem go away. – Mitch Mar 08 '17 at 20:38
  • @rackandboneman, RDMA can all but eliminate CPU transfers with SMB3 fileshare accesses. See [SMB Direct and RDMA performance demo](https://blogs.technet.microsoft.com/josebda/2014/03/10/smb-direct-and-rdma-performance-demo-from-teched-includes-summary-powershell-scripts-and-links/) or [Impact of RDMA Networking in an SMB3 workload over 100GbE](https://www.youtube.com/watch?v=u8ZYhUjSUoI). There will be protocol overhead, but that won't be much (<10%). – Mitch Mar 08 '17 at 20:46
  • And those QSFP ports need special cables? – jiggunjer Mar 09 '17 at 02:48
  • @jiggunjer Yes, special cables, compared to RJ45. – cybernard Mar 09 '17 at 14:56
0

Use a fiber Ethernet card with a fiber switch, ideally multimode, or single-mode with pads. If you've got the budget you can make the whole thing 10Gbps, and if you use link aggregation with a dual-port fiber card you can get 20Gbps. Since the term "relatively affordable" changes depending on what your budget is, actual numbers as to what you're willing to spend would be helpful.

Sunny
  • 1
0

The cheapest way would be to use link aggregation (i.e. run N cables, typically 2) instead of one.

This way a dirt cheap 1Gb link will become a 2Gb link.

Obviously it means you need at least 4 ports in total for a single client and server (2 on each end).
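One caveat, also raised in the comments on the top answer: standard link aggregation balances traffic per flow, not per packet. The transmitting side hashes address fields and pins each conversation to one member link, so aggregation raises total throughput across several clients but does not speed up a single large file copy. Below is a hedged sketch of a typical "layer 2" hash policy; the MAC addresses and helper function are made up for illustration:

```python
# Simplified illustration of an LACP/bonding-style "layer2" transmit hash:
# every frame between a given source/destination MAC pair is sent on the
# same member link, so one flow never exceeds a single link's speed.
def layer2_hash(src_mac: str, dst_mac: str, n_links: int) -> int:
    """XOR the last octets of the two MACs and pick a member link."""
    src = int(src_mac.split(":")[-1], 16)
    dst = int(dst_mac.split(":")[-1], 16)
    return (src ^ dst) % n_links

server = "52:54:00:aa:bb:01"                        # hypothetical server MAC
workstations = [f"52:54:00:cc:dd:{i:02x}" for i in range(4)]

for ws in workstations:
    link = layer2_hash(server, ws, n_links=2)       # a 2-port bond
    print(f"{ws} -> bond member {link}")
```

Different workstations land on different bond members, so several editors pulling footage at once see the extra bandwidth, but any single transfer still tops out at one link's speed.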

Antzi
  • 101
  • 5
  • and the client needs 2 ports... Do simple switches work with this? or should it be a direct link? – jiggunjer Mar 09 '17 at 02:28
  • @jiggunjer A simple switch can be used for that, but you need a special configuration on all computers/servers that use this scheme. – Antzi Mar 09 '17 at 02:38
  • is this aggregation the same as bonding mentioned in the comments of the answer by @Spiff? In which case bandwidth isn't increased for single file access... – jiggunjer Mar 09 '17 at 03:07
  • @jiggunjer This really comes down to how you do the setup, so I can't really answer. – Antzi Mar 09 '17 at 03:21
  • wouldn't most simple switches be limited to 1Gb, even with link aggregation ? – Florian Castellane Mar 09 '17 at 13:18
  • 1
    @FlorianCastellane The latest cheap one I bought wasn't. Depends on the specs. – Antzi Mar 10 '17 at 00:10