Production Wiki

Knowledge base for the Film Program at the UCF Nicholson School of Communication and Media

External Drive Storage 101

Overview

In today’s world of digital media, storage is paramount. The countless hours you have spent shooting, editing, finishing, and exporting your project can all vanish in the event of a catastrophic incident such as mechanical failure, loss/theft, or even accidental damage caused by just dropping your hard drive. The purpose of this article is to give you a better understanding of how storage drives work and how to select which option works best for storing and accessing your media project.

Storage Capacity

When selecting a drive, storage capacity is obviously one of the first things you should consider. However, in the world of media production one size does not necessarily fit all. In today’s digital age, almost everything you generate will need to be stored. That includes HD video dailies, project files, documents, images, and tons of other digital information.

Terms (Symbols)

In order to get the best bang for your buck when shopping for new storage, having command of the language will be very helpful in making sure you make informed decisions:

  • Bit (b) – Short for binary digit, it is the basic unit of information in computing. A bit represents either a zero (0) or a one (1) only.
    • Relative size: 1 bit = a single yes/no value, or one-eighth of a character of text
  • Byte (B) – A unit of digital information that consists of 8 bits.
    • Relative size: 1 byte = about 1 character of plain text
  • Kilobyte (kB) – Approximately* 1,000 bytes of digital information.
    • Relative size: 1 kB = about a half page of text
  • Megabyte (MB) – Approximately* 1,000,000 bytes or 1,000 kB of digital information.
    • Relative size: 1 MB = about 500 pages of text
  • Gigabyte (GB) – Approximately* 1,000,000,000 bytes or 1,000 MB of digital information.
    • Relative size: 1 GB = about 1min of 1080p ProRes 422 video footage
  • Terabyte (TB) – Approximately* 1,000,000,000,000 bytes or 1,000 GB of digital information.
    • Relative size: 1 TB = about 16 Hrs 30 Mins of 1080p ProRes 422 video footage
  • Petabyte (PB) – Approximately* 1,000,000,000,000,000 bytes or 1,000 TB of digital information.
    • Relative size: 1 PB = about 2 years of 1080p ProRes 422  video footage

A kilobyte is a kilobyte, right? Not exactly.

* To help further confuse us all, there are actually two common standards by which data capacity is measured. Drive manufacturers and Mac OS X (version 10.6 Snow Leopard and later, including OS X and macOS) use a unit of 1,000 bytes, based on the decimal (metric) prefix, to equal 1 kilobyte, while the Windows operating system uses a unit of 1,024 bytes, based on a binary prefix, to equal 1 kilobyte. Both recognize that 8 bits = 1 byte, so they are fundamentally the same, but every multiple of that unit is different. This seems like a small issue, except that the discrepancy grows as you move from kilobytes to megabytes, gigabytes, terabytes, petabytes, and so on: it is about 2.4% at the kilobyte level, but about 10% at the terabyte level.

Computer scientists use different terms to indicate this difference in measurement (i.e. kibibyte (KiB), mebibyte (MiB), or gibibyte (GiB)), but they are not commonly used by consumers. This difference is why the 500 GB HDD you bought appears as approximately 465 GB on Windows computers and approximately 500 GB on a macOS computer. Remember, however, that you are not losing any storage capacity by using a Windows-based computer: your drive always holds the same amount of data, and only the unit of measurement differs between operating systems and drive manufacturers. It is the same as measuring one distance as either 1 yard or 0.9144 meters.
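The 500 GB example can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not an official tool; the function name reported_capacity is made up for this illustration.

```python
# Decimal (drive makers, macOS) vs. binary (Windows) views of one byte count.
DECIMAL_GB = 1000 ** 3   # bytes in one gigabyte (GB)
BINARY_GIB = 1024 ** 3   # bytes in one gibibyte (GiB)


def reported_capacity(advertised_gb: float) -> tuple[float, float]:
    """Return (decimal GB, binary GiB) for a drive sold as `advertised_gb`."""
    total_bytes = advertised_gb * DECIMAL_GB
    return total_bytes / DECIMAL_GB, total_bytes / BINARY_GIB


decimal_gb, binary_gib = reported_capacity(500)
print(f"{decimal_gb:.1f} GB (decimal) = {binary_gib:.1f} GiB (binary)")
# prints: 500.0 GB (decimal) = 465.7 GiB (binary)
```

The byte count never changes; only the divisor used to display it does.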

The Precision of Language

The language, characters, and symbols used to describe digital information are very particular. For instance, when someone writes “TB” it means terabyte, but when “Tb” is written (with a lower-case “b”), that symbol represents a terabit. The main difference is that a terabyte represents 1,000,000,000,000 bytes (8,000,000,000,000 bits) of digital information, while a terabit represents 1,000,000,000,000 binary digits (bits), which is one-eighth as much.

As you may have already noticed, there are common variations in the symbols you will see. One pair that you will deal with regularly is kB/s and kbps. It is important to recognize the difference between them: kB/s represents kilobytes per second and kbps represents kilobits per second, so 1 kB/s is equal to 8 kbps. That means if a website such as Vimeo asks you to make sure that the HD video you are uploading is set at a bitrate of 5000 kbps, that is equivalent to asking for a video with a bitrate of 625 kB/s. If you uploaded a 5000 kB/s video by mistake, you just tried to deliver a video with a bitrate of 40,000 kbps. That is 8 times the data rate they asked for. The key to all of this is to remember that bits have a constant 8:1 ratio with bytes.

Tip: When performing data rate conversions such as the Vimeo video example given in the above paragraph, the trend has been to use the decimal prefix standard of 1,000 bytes = 1 kilobyte. Use a Bit Calculator to ensure conversion accuracy.
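If you do not have a bit calculator handy, the 8:1 conversion is simple enough to sketch yourself. The Python below is only an illustration using the decimal prefix standard mentioned in the tip; the function names are hypothetical.

```python
BITS_PER_BYTE = 8  # the constant 8:1 ratio between bits and bytes


def kbps_to_kBps(kilobits_per_second: float) -> float:
    """Convert kilobits per second (kbps) to kilobytes per second (kB/s)."""
    return kilobits_per_second / BITS_PER_BYTE


def kBps_to_kbps(kilobytes_per_second: float) -> float:
    """Convert kilobytes per second (kB/s) to kilobits per second (kbps)."""
    return kilobytes_per_second * BITS_PER_BYTE


print(kbps_to_kBps(5000))   # the requested 5000 kbps expressed in kB/s
print(kBps_to_kbps(5000))   # a mistaken 5000 kB/s export expressed in kbps
```

Run against the Vimeo example, the first call gives 625 kB/s and the second gives 40,000 kbps, matching the figures in the paragraph above.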

Calculating Storage Demand

Now that you understand how to identify the capacity of any given storage device, and are aware that media files have a variable data rate, you have the basic skills required to calculate what size drive you will need to purchase to properly store all of your project media and documents.
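As a rough illustration of that calculation, here is a minimal Python sketch that turns an average data rate and a running time into a storage estimate. It assumes decimal units (1 GB = 1,000,000,000 bytes) and uses the 153 Mbps Apple ProRes 422 average quoted in the USB section of this article; the function name and the optional safety margin are assumptions made for the example, not a program requirement.

```python
def storage_needed_gb(data_rate_mbps: float, minutes: float,
                      safety_margin: float = 0.0) -> float:
    """Estimate storage in decimal GB for footage at an average data rate.

    data_rate_mbps: average data rate in megabits per second (Mbps)
    minutes:        total running time of the footage
    safety_margin:  extra headroom as a fraction (0.25 = 25% more space)
    """
    total_bits = data_rate_mbps * 1_000_000 * minutes * 60
    total_bytes = total_bits / 8              # 8 bits per byte
    return total_bytes / 1_000_000_000 * (1 + safety_margin)


# One hour of 153 Mbps footage, with and without 25% headroom.
print(f"{storage_needed_gb(153, 60):.2f} GB")
print(f"{storage_needed_gb(153, 60, safety_margin=0.25):.2f} GB")
```

Because real media has a variable data rate, treat the result as an estimate and round up when you shop for a drive.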

File Systems

Also casually known as a format, a file system is how information is stored and retrieved on a drive. Before you begin filling your hard drive with your precious data, you should take the time to understand file system formats and which one works best for you. Many drives come preformatted with a file system that will probably work just fine, but you can always reformat a drive to best suit your needs. When deciding on a file system format, there are several variables to consider. Here is a list of the most commonly used file systems in personal computing and media production:

Apple File System (APFS)

A proprietary file system principally developed for supporting the macOS (High Sierra 10.13 and later) by Apple, Inc. It was made to fix some of the core problems with HFS+ and is optimized to work with solid state drives. It is now the primary file system for running the macOS.

  • Maximum individual file size limit – 8 Exabytes (8.6 Billion Gigabytes).

Hierarchical File System Plus (HFS +)

Also known as “Mac OS Extended,” a proprietary file system developed by Apple, Inc. for use in computer systems running Mac OS. As of 10.12.4, it is principally used for formatting external drives in the macOS environment. It replaced HFS, also known as “Mac OS Standard.” Drives and images formatted as HFS+ can only be read and written by the macOS and Mac OS X operating systems. They are not supported by Windows OS unless you install a third party utility such as MacDrive for read and write ability.

  • Maximum individual file size limit – 8 Exabytes (8.6 Billion Gigabytes).

New Technology File System (NTFS) 

A proprietary file system developed by Microsoft. Starting with Windows NT 3.1, it is the default file system of the Windows OS family. An NTFS formatted drive can be read by Mac OS X, but it cannot be written to unless a third party utility such as NTFS for Mac is installed.

  • Maximum individual file size limit – 16 Exabytes (17.1 Billion Gigabytes).

Extensible File Allocation Table (exFAT)

Introduced by Microsoft in 2006 as a solution for flash memory devices such as USB flash drives and SD cards. It is intended to be used across platforms (Windows and macOS) similarly to FAT32, while working around the 4 GB file size limitation. While a great concept in theory, the overall reliability and performance of this file system is lacking. Many problems with using external hard drives with macOS computers are a result of using exFAT instead of HFS+, or instead of NTFS along with a third party utility such as NTFS for Mac. We do not recommend using it to format your external drives.

  • Maximum individual file size limit – 128 Petabytes (128 Million Gigabytes).

File Allocation Table 32 (FAT32)

Also known as MS-DOS (FAT), a simple and robust legacy file system. It offers good performance even in lightweight implementations, but cannot deliver the same performance, reliability, and scalability as more modern file systems. It is supported, for compatibility reasons, by virtually all existing operating systems for personal computers, and is therefore a well-suited format for data exchange between computers and devices of almost any type and age, from the early 1980s up to the present day. It is great for formatting flash drives to move small amounts of data between workstations, but certainly not ideal for formatting a hard drive you plan to use for editing video media, due to its rather small file size limitation.

  • Maximum individual file size limit – 4 Gigabytes.
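To see why the FAT32 limit matters for video, the sketch below estimates how long a single clip can run before it hits the ceiling. It is a minimal Python illustration that assumes the exact FAT32 limit of 4 GiB minus one byte (4,294,967,295 bytes) and a constant average data rate; the function names are hypothetical.

```python
FAT32_MAX_FILE_BYTES = 4 * 1024 ** 3 - 1   # 4 GiB minus one byte


def fits_on_fat32(file_bytes: int) -> bool:
    """True if a single file of this size can be stored on a FAT32 volume."""
    return file_bytes <= FAT32_MAX_FILE_BYTES


def max_clip_seconds(data_rate_mbps: float) -> float:
    """Longest single clip, in seconds, that stays under the FAT32 limit."""
    bytes_per_second = data_rate_mbps * 1_000_000 / 8
    return FAT32_MAX_FILE_BYTES / bytes_per_second


# 153 Mbps is the Apple ProRes 422 average quoted elsewhere in this article.
print(f"{max_clip_seconds(153) / 60:.1f} minutes per clip at 153 Mbps")
print(fits_on_fat32(5_000_000_000))   # a 5 GB export does not fit
```

At HD data rates a single file reaches the limit in only a few minutes of footage, which is why the other file systems listed above are better suited to editing drives.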

Enclosure Type

Portable hard drives come in a variety of enclosures that allow for diverse drive configurations. Depending on your needs, there are different advantages to each type and its features. Most external drives are for expanding the capacity of a workstation, providing an on-the-go solution, or both.

Desktop

Larger, fully functional hard drive enclosure that typically contains a 3.5″ drive. The largest hard drive capacities are available in these enclosures. All of these enclosures can be transported to other computers, but some are less convenient than others, as they typically have external power supplies. Even RAID arrays are available in this enclosure type.

Portable

Smaller, lightweight, and easy to transport. The ideal storage hardware for someone who has to transport their media projects on a regular basis. Capable of all the functionality of a regular desktop external drive, except for a lower maximum drive capacity. The drive is typically hub-powered, as it utilizes a 2.5″ drive.

A Variety of Drive Enclosure Types from G-Technology

Drive Type

In principle, hard drives have been fundamentally the same for quite a while. However, there are new technologies emerging that have created significant changes to digital storage. Here are your options:

Hard Disk Drive

Hard Disk Drives (HDD) with mechanically spinning disks are still the de facto standard of data storage for desktop computing. They store data magnetically and are comprised of one or more non-magnetic flat circular platters (disks) that are coated with a layer of magnetic material. Information can be read from and written to specific areas of a platter via the read-and-write head, which is mounted on a very precise armature.

  • Pros: Cheap capacity, reliable when handled properly, and a proven technology.
  • Cons: HDDs are the reliable equipment that most of the world’s data depends on, but they can be quite vulnerable to damage when mishandled. The platters inside spin at thousands of revolutions per minute (rpm), which creates a lot of momentum. Read-and-write heads operate very close to the surface of the platter, so you should take special care not to move the drive when it is in use. Sudden impacts and axis movements may jar the head, causing accidental contact with the platter surface. This can cause irreparable damage to the writing surface and the permanent loss of your data. They can be loud. Lengthy start up times occur when the disks have to spin up from a stationary position (this typically happens when the drive has gone to sleep while not in use).

Solid-State Drive

Solid-State Drives (SSD) have been around for a while, but are considered an emerging technology. Data is stored electronically in integrated circuit assemblies with no moving parts. They work very similarly to the USB flash drives you may be familiar with. This allows for reliable, fast drives in small form factors, but currently comes at significant cost compared to traditional HDDs.

  • Pros: High read-and-write speeds and no disk spin up wait times mean very little latency and high performance. They are quiet and not sensitive to drive orientation or movement when operating (as is the case with traditional rotating HDDs). Because they are so fast and reliable, higher than average data rates can be achieved using these drives, allowing effective throughput to climb closer to an interface’s theoretical maximum speed.
  • Cons: They are relatively expensive. The price per GB is 2 to 3 times higher than HDDs. Capacities are growing but are limited. There are 1TB SSDs available on the market. They do pose an increased chance of catastrophic failure. Due to the way they handle information, SSDs are very hard to recover in the event of a file structure failure. Because mechanical HDDs write in a linear fashion on the surface of a physical disk, it is easier to go in and retrieve the data in a worst case scenario.

Form Factor

The rigid housing the drive components are assembled in. There are many sizes when it comes to drive form factor, but let’s just stick to the two most commonly used with modern desktop computing and digital media storage:

3.5″

A size you will typically find in most desktop computers such as your Mac Pro or PC tower. It’s a form factor typically associated with traditional HDDs and optical drives like DVD-ROMs. HDDs at this form factor size are capable of rotational speeds of 10,000rpm or more (for premium drives), but typically operate at 7,200rpm.

2.5″

A size you will typically find in laptops and smaller electronic devices like your MacBook Pro, iMac, or Playstation 3. It’s a form factor that houses both smaller HDDs and SSDs. HDDs at this form factor size tend to max out at 7,200rpm (for premium drives), but typically operate at 5,400rpm.

Drive Quantity

Single

The majority of external drive enclosures are built with just one hard drive. This simple setup is usually cheap, compact, and reliable.

Multiple

There are some external drive enclosures that are built with two, three, four, or even five hard drives in one single enclosure. Many of these multi-drive enclosures feature easy to remove drives that can be replaced in the event of an individual drive failure or capacity upgrade.

Drive Configuration

Most external drive enclosures have a static drive configuration, but some allow for modification to improve performance or to create data redundancy. Here are the most common configurations you will run across in the world of video editing:

Normal

Simple. Just one formatted hard drive.

Redundant Array of Independent Disks (RAID)

Complex. RAID is a storage technology that combines multiple drives into a logical unit for the purposes of data redundancy and/or performance improvement. Data is distributed across the drives in one of several ways, referred to as RAID levels, depending on the specific level of redundancy and performance required. Here is a list of the RAID levels most commonly used in media production:

  • RAID 0 – Data striping without parity or mirroring. Provides improved performance and additional storage but no fault tolerance. Any drive failure destroys the array and any data stored on it, and the likelihood of failure increases with more drives in the array. This setup provides no additional protection and is purely for improving drive read/write performance to increase data rates.
    • Example: Two 1 TB drives set up as RAID 0 will have improved performance, no redundant data, and an operating capacity of 2 TB.
  • RAID 1 – Data mirroring without parity or striping. Data is written identically to two drives, thereby producing a mirrored set to protect your data against mechanical failure.
    • Example: Two 1 TB drives set up as RAID 1 will have no improvement in performance, redundant data, and an operating capacity of 1 TB.
  • RAID 5 – Data redundancy with striping and distributed parity. The array distributes parity along with the data and requires that all drives but one be present to operate. The array is not destroyed by a single drive failure. Requires at least three disks for setup. The advantage of this setup is an increase in drive read/write performance along with efficient data redundancy that does not require a 1-to-1 capacity ratio.
    • Example: Four 2 TB drives set up as RAID 5 will have an improvement in performance, redundant data, and an operating capacity of 6 TB.

Use this handy RAID Disk Space Calculator to determine your total storage capacity after a RAID level is applied to your drives.
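The capacity rules behind such a calculator are simple for the three levels listed above. The following minimal Python sketch assumes every drive in the array is the same size; the function name raid_usable_tb is made up for this illustration and does not come from any particular RAID tool.

```python
def raid_usable_tb(level: int, drive_count: int, drive_tb: float) -> float:
    """Usable capacity in TB for RAID 0, 1, or 5 built from equal drives."""
    if level == 0:
        if drive_count < 2:
            raise ValueError("RAID 0 needs at least two drives")
        return drive_count * drive_tb          # striping: all space is usable
    if level == 1:
        if drive_count < 2:
            raise ValueError("RAID 1 needs at least two drives")
        return drive_tb                        # mirroring: one drive's worth
    if level == 5:
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return (drive_count - 1) * drive_tb    # one drive's worth of parity
    raise ValueError(f"RAID level {level} is not covered by this sketch")


print(raid_usable_tb(0, 2, 1))   # two 1 TB drives, RAID 0
print(raid_usable_tb(1, 2, 1))   # two 1 TB drives, RAID 1
print(raid_usable_tb(5, 4, 2))   # four 2 TB drives, RAID 5
```

The three calls reproduce the 2 TB, 1 TB, and 6 TB figures from the examples in the list above.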

Important Note Concerning Backing Up Your Data:

Redundant RAID arrays do not protect your data from file corruption, physical theft, or catastrophic damage such as a house fire or being submerged in water. They only protect your data from loss in the event of mechanical failure of the primary data disks.

Just a Bunch of Disks (JBOD)

A JBOD is an array of drives not set up in a RAID. These drives can be set up as separate, independent volumes or they may be combined (using software such as the macOS Disk Utility) to create one single volume. The advantage of this configuration, inside of a single enclosure, is that you are able to interface several hard drives to your computer without having to connect multiple external drive enclosures.

Interface

In layman’s terms, a hardware interface is how you connect your drive (or other device) to your computer. It is the point of interaction between a hard drive and the computer. It is comprised of the mechanical, electrical and logical signals at the interface and the protocol for sequencing them.

Throughput

After you determine how much storage you will need, the next thing you have to figure out is how much throughput (or data rate) the media you will be storing, and eventually editing, will require. Throughput is the average rate of the successful delivery of information packages over a network. This means it is not a constant number; it fluctuates with data location on the disk, file size, network traffic, and other variables. A drive’s interface plays a crucial role in this average.

There are many types of interfaces that allow you to connect a portable external drive to your computer system. Here is a breakdown of the most commonly used in computerized video editing:

Universal Serial Bus (USB)

USB was initially developed in the mid-1990s and has since become the dominant bus connection used for most computer peripheral devices such as keyboards, printers, and hard drives, and for linking to electronic devices such as phones and iPods.

USB 2.0 (which many modern peripherals, such as keyboards and printers, still use) has an effective throughput to an external hard disk drive (HDD) of about 289 Mbps with a theoretical maximum interface speed of 480 Mbps. Because throughput information is just an averaging of what is being transmitted, this range is too limited to use when editing HD video footage (such as Apple ProRes 422 codec’s average data rate of 153 Mbps). SD video footage (such as DV NTSC with an average data rate of 48 Mbps) can be supported, but in practice we find using USB 2.0 tends to bottleneck any video editing workflow, as the throughput will occasionally drop below the data rate demand during video playback. USB 2.0 is best used for just everyday computing needs and archiving digital media.
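The reasoning above, that an average throughput needs room to dip without falling below the codec’s data rate, can be sketched as a simple headroom check. This is only an illustration in Python: the headroom factor of 3 is an assumption chosen for the example, not a published standard, and the function name is hypothetical.

```python
def has_headroom(effective_mbps: float, codec_mbps: float,
                 headroom: float = 3.0) -> bool:
    """True if average throughput exceeds the codec rate by `headroom` times.

    `headroom` is an assumed safety factor that allows for the dips in
    throughput that occur during playback.
    """
    return effective_mbps >= codec_mbps * headroom


USB2_EFFECTIVE_MBPS = 289    # effective USB 2.0 throughput quoted above

print(has_headroom(USB2_EFFECTIVE_MBPS, 153))   # Apple ProRes 422 (HD)
print(has_headroom(USB2_EFFECTIVE_MBPS, 48))    # DV NTSC (SD)
```

With this assumed factor the HD check fails and the SD check passes, but as noted above, a passing check on paper does not guarantee smooth editing in practice.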

USB 3.0 launched at the end of 2008 and is now the standard used in modern peripherals. It has an effective throughput to an external HDD of about 920 Mbps with a theoretical maximum interface speed of 5 Gbps. This means its average data rate is more than enough to handle 1998×1080 2K, 1920×1080 HD, and 720×480 SD video workflows.

USB 3.1 and 3.2 first arrived on the consumer market in 2014, but did not become widespread until late 2017. USB 3.1 Gen 1 has essentially the same data rates as USB 3.0. USB 3.1 Gen 2 (later renamed USB 3.2 Gen 2) has an effective throughput of 5 Gbps with a theoretical maximum interface speed of 10 Gbps. These data rates are robust enough to handle 4096×2160 4K DCI and 3840×2160 UHD, either as RAW frames or compressed video. While USB 3.1 is technically available using the USB 3.0 connector types, it is principally used with the USB Type C connector (which looks very similar to the original Micro USB connector but is more oval in shape).

USB Type C (aka USB-C) Male Connector
USB-C Female Port (on laptop device)

Thunderbolt

Developed by Intel and first released by Apple in 2011, Thunderbolt 1 is a peripheral interface that combines traditional PCIe and DisplayPort (DP) functionality into a Mini DisplayPort (MDP) type connector. This merger of functionality has created a fast, powerful, and flexible peripheral interface that is used to connect everything from external monitors and component expansion slots to portable hard drives. The technology was presented as having an initial theoretical maximum interface speed of 10 Gbps and promising a final speed of 100 Gbps in future versions. The current effective throughput average is about 920 Mbps due to Hard Disk Drive rotation limitations and other drive side operational limitations. Solid-State Drives should allow for a faster data rate average of 1.65 Gbps as their costs drop and they are eventually adopted into video editing. It is robust enough to easily handle 2K, 1920×1080 HD, and 720×480 SD video workflows, and it supports some 4K workflows.

Thunderbolt 2 was released in 2013. By joining the two 10 Gbps channels, it supports a theoretical data rate of 20 Gbps. While doing this does not inherently increase the total overall bandwidth, it does make for a more flexible interface that is able to transfer 4K video files while simultaneously playing back 4K video on an external video monitor. It too uses the Mini DisplayPort (MDP) type connector.

Thunderbolt 3 was released in 2015. It allows up to 4 lanes of PCI Express 3.0 for general-purpose data transfer and 4 lanes of DisplayPort 1.4 for video playback over a maximum combined data rate of 40 Gbps. This essentially doubles the throughput and flexibility of the interface. It is more than capable of handling most modern 4K, 10-bit video workflows. Thunderbolt 3 shares connectors with Universal Serial Bus by adopting the USB-C type connector.

Thunderbolt 4 was released in 2020. The maximum bandwidth remains at 40 Gbps, the same as Thunderbolt 3 and four times as fast as USB 3.2 Gen 2. It supports multi-port accessory architecture, not just daisy chaining. It also supports 8K displays with 10-bit color. Thunderbolt 4 shares connectors with Universal Serial Bus by using the USB-C type connector.

Thunderbolt 1 and 2 Connector (Same form factor as Mini-Display Port type, but distinguished by using a thunderbolt icon.)
Thunderbolt 3 and 4 Female Port (Same form factor as USB-C type but distinguished by using a thunderbolt icon.)
Thunderbolt 4 Male Connector (Same form factor as USB-C type, but distinguished by using a thunderbolt icon. Also using the number “4” to distinguish itself from Thunderbolt 3 cables.)

Firewire (IEEE 1394)

  • Firewire 400 – FW 400 has an effective throughput to an external HDD of about 320 Mbps with a theoretical maximum interface speed of 393 Mbps. Because throughput information is just an averaging of what is being transmitted, this range is too limited to use when editing most HD video footage (such as Apple ProRes 422 codec’s 153 Mbps), as the throughput will occasionally drop below the data rate demand during playback. SD video footage (such as DV NTSC with an average data rate of 48 Mbps) can be properly supported.
  • Firewire 800 – FW 800 has an effective throughput to an external HDD of about 600 Mbps with a theoretical maximum interface speed of 786 Mbps. Editing most 2K and HD video footage (such as Apple ProRes 422 codec’s 153 Mbps) can be properly supported with this interface.

Primarily developed by Apple in the late 1980s and early 1990s and released to the public in 1995, its introduction as a peripheral interface had a major impact in the world of digital video editing, as its significant throughput allowed for inexpensive real-time video capturing for nonlinear editing workstations.

eSATA

Standardized in 2004, eSATA (e standing for external) provides a variant of the traditional SATA interface (used inside desktop computers) meant for external connectivity with portable drives. It is about four times faster than USB 2.0 and Firewire 400, twice as fast as Firewire 800. It has an effective throughput to an external HDD of about 984 Mbps with a theoretical maximum interface speed of 3 Gbps. It is robust enough to easily handle 2K, 1920×1080 HD, and 720×480 SD video workflows. The major drawback with eSATA is its limited standard deployment by most major computer and portable hard drive manufacturers.
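To put the effective throughput figures quoted in this section side by side, the minimal Python sketch below estimates how long it would take to copy 100 GB to an external HDD over each interface. It assumes decimal units and a steady average rate, which real transfers rarely sustain; the dictionary and function names are made up for the illustration.

```python
# Effective throughput to an external HDD, in Mbps, as quoted in this article.
EFFECTIVE_MBPS = {
    "USB 2.0": 289,
    "Firewire 400": 320,
    "Firewire 800": 600,
    "USB 3.0": 920,
    "eSATA": 984,
}


def transfer_minutes(size_gb: float, throughput_mbps: float) -> float:
    """Estimated minutes to move `size_gb` decimal gigabytes at a steady rate."""
    megabits = size_gb * 1000 * 8      # GB -> MB -> megabits
    return megabits / throughput_mbps / 60


for name, mbps in EFFECTIVE_MBPS.items():
    print(f"{name:<13}{transfer_minutes(100, mbps):6.1f} min")
```

Treat the results as best-case averages, since throughput varies with file size, data location on the disk, and other factors described above.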

Power Supply

AC Power Adapter

A simple AC to DC power supply that allows you to power your external drive from a traditional 110v Edison wall outlet (US).

Hub Powered

Many smaller portable drives can be powered by an interface hub such as USB, Firewire, or Thunderbolt. They do not always require an external power source, making them easier to transport and connect to computer workstations. However, sometimes interface hubs can become oversaturated by power demands (such as multiple hub-powered drives drawing from the same interface hub). In this situation an external power supply for the drive is usually required. If you plan on operating multiple hard drives at one time, it is best not to overinvest in hard drives that can only be hub powered.

Summary

Hard drives are the primary vessel for editing and storing your precious data. An entire project’s production value and potential exists on just a few metallic disks. It is crucial that the new generation of media producers develop an understanding and have a strong command of the nomenclature and technologies used in data storage to ensure that their projects are safely completed and are archived successfully for future use.

A data backup guide will be produced in the near future to review successful practices in protecting your data.
