Perth start-up Tape Ark plays a part in Microsoft’s ambitious cloud project
“People pay to store these tapes on air-conditioned shelves, and it’s not even accessible. We solve the problem of putting this content online,” Mr. Holmes said.
Data tapes have been around since the 1960s. They owe their popularity to the fact that other mass storage options were at first limited and later expensive. Companies also traditionally viewed tape as a security measure, because the tapes were kept in secure storage facilities.
However, tapes had their drawbacks. Retrieving data stored on a tape could take days (a van had to drive to the storage site, pick up the tape and bring it back), access was sequential and required special drives, and sometimes weather or accidents damaged parts of a tape.
Offsite storage was not foolproof either. Storage provider Iron Mountain lost all the records held at one of its London sites in a fire in 2006, and occasionally interns, employees or malicious third parties have stolen a company’s tapes.
Mr. Holmes’ former clients are an eclectic mix. He has migrated tapes that NASA’s Apollo missions recorded on the moon, which were discovered gathering cobwebs at Curtin University. He said that at a previous company he founded, he migrated all of the Ethiopian government’s mining exploration data. And there have been fun ones, such as the Rock & Roll Hall of Fame’s archives.
He estimated that his company’s work has so far helped customers decommission the equivalent of 13 data centers.
Data tapes have advanced over the years and now have greater storage capacity and faster access times. They are the preferred medium for companies to keep records for regulatory purposes.
But Mr. Holmes said tape was nowhere near as cost-effective or feature-rich as cloud storage.
“There have been over 60 price cuts in cloud storage, but there have only been cost increases in purchasing tape. In many circumstances, cloud storage is now cheaper on a cost-per-gigabyte basis,” he said.
“And secondly, we believe that some of the most profound discoveries of the next decade will come from what is called ‘dark data’, and it is this data that is stored on tapes in warehouses, with nobody knowing what they contain,” he said.
The job is not without obstacles. Once, Mr. Holmes had to write to the Australian Computer Museum for a 60-year-old IBM tape drive that could not be found anywhere else. Hiring is also a problem, as most computer engineering graduates have never seen a data tape, let alone worked with one.
Mr. Holmes, a physicist by training, has also been unable to find a recycler that will accept Tape Ark’s shredded tapes.
When a data tape arrives, Tape Ark first photographs it using a camera with mirrors that show all sides. It also extracts metadata from tapes that carry radio frequency identification chips.
The first step produces a count of all defects and errors on the tape, along with the amount of data it holds. The result is an audit report that is sent to the customer, who then chooses what to extract based on their needs and cloud storage cost estimates.
Tape Ark then uses its proprietary system, ArkBridge, to ingest the data from the tape, reformat it, and output it in a cloud-ready format. The process takes anywhere from seven minutes for the earliest data tapes, from the late 1960s, to 11 hours for more complex ones.
In the case of Microsoft’s Met project, cloud data would go on slabs of fused silica that Microsoft says will withstand environmental damage and provide backup for 1,000 years. Microsoft has been developing the slabs as part of Project Silica, which aims to bring archival storage to the cloud.
Tape Ark has also filed two patents. The first is for a fast box indexer, which it uses to look inside archived storage to verify content. The second is for a universal drive for data tapes, which is at the prototype stage.
Mr. Holmes declined to comment on Tape Ark’s earnings. He said the company was started from earlier IT ventures and now has about a dozen shareholders.