Tar Parser Overview

A tar archive (short for "tape archive") is a file format that bundles multiple files and directories into a single file. Unlike zip, tar does not compress the contents -- it simply concatenates them with metadata headers. Tar archives are commonly used for distributing source code, backups, and container images.
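
To make "concatenates them with metadata headers" concrete, here is a minimal sketch of how one header block could be decoded. It assumes the classic POSIX ustar layout (512-byte blocks, name at offset 0, octal size at offset 124, type flag at offset 156) and ignores long-name and pax extensions. The helper names (parseHeaderBlock, readString, readOctal, paddedSize) are hypothetical and are not part of tar-parser.

```ts
// Every tar header occupies one 512-byte block (ustar layout assumed).
const BLOCK_SIZE = 512

interface RawHeader {
  name: string
  size: number
  typeflag: string
}

// Read a NUL-terminated ASCII field from the block.
function readString(block: Uint8Array, offset: number, length: number): string {
  let end = offset
  while (end < offset + length && block[end] !== 0) end++
  return new TextDecoder().decode(block.subarray(offset, end))
}

// Numeric fields are stored as octal ASCII digits padded with spaces or NULs.
function readOctal(block: Uint8Array, offset: number, length: number): number {
  let text = readString(block, offset, length).trim()
  return text === '' ? 0 : parseInt(text, 8)
}

function parseHeaderBlock(block: Uint8Array): RawHeader {
  if (block.length !== BLOCK_SIZE) throw new Error('expected a 512-byte header block')
  return {
    name: readString(block, 0, 100),
    size: readOctal(block, 124, 12),
    // A NUL type flag is treated like '0' (regular file).
    typeflag: readString(block, 156, 1) || '0',
  }
}

// File data follows its header and is padded with NULs to the next block boundary.
function paddedSize(size: number): number {
  return Math.ceil(size / BLOCK_SIZE) * BLOCK_SIZE
}
```

A parser built this way reads a header block, consumes paddedSize(size) bytes of data, and repeats until the archive ends.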

The tar-parser package provides a streaming parser for tar archives that works in any JavaScript runtime -- Node.js, Deno, Bun, Cloudflare Workers, and browsers. It reads a ReadableStream<Uint8Array> and yields entries one at a time, so you can process archives larger than available memory.

When to Use This Package

  • Extracting files from tar archives fetched over HTTP
  • Processing npm packages (which are .tar.gz files)
  • Inspecting container image layers
  • Extracting specific files from large archives without downloading the whole thing (tar has no index, so entries are read in order, but you can stop once you have the ones you need)
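
Since npm packages are gzip-compressed, the byte stream most likely has to be decompressed before it is parsed as tar. The sketch below uses the standard DecompressionStream web API for that step. The gunzip helper name is hypothetical, and the idea that the parser expects uncompressed tar bytes is an assumption.

```ts
// Wrap a gzip-compressed byte stream so that it yields the raw tar bytes.
// DecompressionStream is a standard web API (browsers, Node.js 18+, Deno, Bun).
function gunzip(stream: ReadableStream<Uint8Array>): ReadableStream<Uint8Array> {
  return stream.pipeThrough(new DecompressionStream('gzip')) as ReadableStream<Uint8Array>
}

// Hypothetical usage with the parser described on this page:
//   let response = await fetch('https://example.com/package.tgz')
//   let entries = await parseTar(gunzip(response.body!))
```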

Quick Example

```ts
import { parseTar } from 'remix/tar-parser'

let response = await fetch('https://example.com/archive.tar')
let entries = await parseTar(response.body!)

for (let entry of entries) {
  console.log(entry.header.name) // 'src/index.ts'
  console.log(entry.header.size) // 1024
  console.log(entry.header.type) // 'file'

  let content = await entry.text()
  console.log(content)
}
```

Key Concepts

  • Entry -- A single file or directory in the tar archive. Each entry has a header with metadata and methods to read the content (text(), arrayBuffer(), stream()).
  • Header -- Metadata for an entry: name (file path), size (byte length), type (file, directory, or symlink), mode (permissions), and mtime (modification time).
  • Streaming -- The TarParser class processes the archive incrementally via async iteration, so memory usage depends on the entry currently being read rather than on the total archive size.
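
As an illustration of why incremental iteration helps, the sketch below scans entries in order and stops at the first matching path. It is written against a hypothetical EntryLike interface that mirrors only the header.name and text() members described above, so it works with any async iterable of such objects. Whether TarParser cancels the underlying stream on early exit is an assumption, not something this page confirms.

```ts
// Hypothetical minimal entry shape, mirroring the fields described above.
interface EntryLike {
  header: { name: string; size: number }
  text(): Promise<string>
}

// Scan entries in order and return the text of the first one whose path matches.
// Returning from inside `for await` calls the iterator's return() method,
// which gives a streaming source the chance to stop reading further input.
async function findEntryText(
  entries: AsyncIterable<EntryLike>,
  path: string,
): Promise<string | null> {
  for await (let entry of entries) {
    if (entry.header.name === path) return entry.text()
  }
  return null
}

// Hypothetical usage, assuming TarParser is async-iterable as described:
//   let text = await findEntryText(new TarParser(response.body!), 'package/package.json')
```

Entries that come before the match are skipped without their content being buffered, and entries after it are never requested.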

Two Parsing Modes

Function               Use Case
parseTar(stream)       Buffer all entries into an array. Simple for small archives.
new TarParser(stream)  Async iterator that yields entries one at a time. Memory-efficient for large archives.

Related

  • fs -- Write extracted files to disk
  • mime -- Detect content types of extracted files
  • API Reference -- Full API documentation

Released under the MIT License.