tar-parser

The tar-parser package provides streaming parsing of tar archives. It works in any JavaScript runtime including Node.js, Deno, Bun, and Cloudflare Workers.

Installation

The tar parser is included with Remix. No additional installation is needed.

Import

ts
import {
  TarParser,
  parseTar,
  parseTarHeader,
  TarEntry,
} from 'remix/tar-parser'

API

parseTar(stream)

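Parses an entire tar archive from a ReadableStream and returns a promise that resolves to an array of entries. Every entry is collected before the promise resolves, so prefer TarParser for large archives.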

ts
function parseTar(
  stream: ReadableStream<Uint8Array>,
): Promise<TarEntry[]>

ts
let response = await fetch('https://example.com/archive.tar')
let entries = await parseTar(response.body!)

for (let entry of entries) {
  console.log(entry.header.name) // 'src/index.ts'
  console.log(entry.header.size) // 1024
  let content = await entry.text()
}

TarParser

A class for streaming tar archive parsing using async iteration. Use this when you want to process entries one at a time without loading the entire archive into memory.

ts
class TarParser {
  constructor(stream: ReadableStream<Uint8Array>)
  [Symbol.asyncIterator](): AsyncIterableIterator<TarEntry>
}

ts
let parser = new TarParser(stream)

for await (let entry of parser) {
  console.log(entry.header.name)
  console.log(entry.header.size)

  // Read entry content
  let bytes = await entry.arrayBuffer()
}

parseTarHeader(buffer)

Parses a single 512-byte tar header block and returns a TarHeader object. This is a low-level utility for building custom tar processing logic.

ts
function parseTarHeader(buffer: Uint8Array): TarHeader | null

Returns null if the buffer is an end-of-archive marker (all zeros).
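
To give a sense of what a header block contains, here is a minimal sketch that reads a few fields by hand from the standard ustar layout (name at offset 0, size at offset 124, type flag at offset 156). It is not the tar-parser implementation: `readHeaderFields` and `buildHeaderBlock` are hypothetical helpers for illustration, the checksum is not verified, and PAX or GNU long-name extensions are ignored.

```ts
// Illustrative only: reads a few ustar fields by hand. Not the tar-parser implementation.
const BLOCK_SIZE = 512

// Decode a NUL-terminated ASCII field.
function readString(block: Uint8Array, offset: number, length: number): string {
  let end = offset
  while (end < offset + length && block[end] !== 0) end++
  return new TextDecoder().decode(block.subarray(offset, end))
}

// Numeric fields are stored as octal ASCII digits, padded with spaces or NULs.
function readOctal(block: Uint8Array, offset: number, length: number): number {
  let digits = readString(block, offset, length).trim()
  return digits === '' ? 0 : parseInt(digits, 8)
}

function readHeaderFields(block: Uint8Array) {
  if (block.length !== BLOCK_SIZE) throw new Error('expected a 512-byte block')
  // An all-zero block marks the end of the archive.
  if (block.every((byte) => byte === 0)) return null
  return {
    name: readString(block, 0, 100),
    size: readOctal(block, 124, 12),
    typeflag: readString(block, 156, 1), // '0' = regular file, '5' = directory
  }
}

// Hypothetical helper that builds a header block for the demo (checksum omitted).
function buildHeaderBlock(name: string, size: number): Uint8Array {
  let block = new Uint8Array(BLOCK_SIZE)
  let encoder = new TextEncoder()
  block.set(encoder.encode(name), 0)
  block.set(encoder.encode(size.toString(8).padStart(11, '0')), 124)
  block.set(encoder.encode('0'), 156)
  return block
}

let fields = readHeaderFields(buildHeaderBlock('src/index.ts', 1024))
console.log(fields)
```

In practice parseTarHeader covers this for you; the sketch only shows why the function takes exactly one 512-byte block and why an all-zero block maps to null.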

TarEntry

Represents a single entry (file or directory) in a tar archive.

ts
interface TarEntry {
  readonly header: TarHeader

  arrayBuffer(): Promise<ArrayBuffer>
  text(): Promise<string>
  stream(): ReadableStream<Uint8Array>
}

  • arrayBuffer() -- Returns the entry's content as an ArrayBuffer.
  • text() -- Returns the entry's content decoded as UTF-8 text.
  • stream() -- Returns a ReadableStream of the entry's content.
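
Which method to use mostly depends on how much of the entry you want in memory at once. The sketch below reads an entry through stream() chunk by chunk instead of buffering it. So that it runs without an archive, it uses a hypothetical in-memory stand-in (`fakeEntry`) that mirrors part of the TarEntry shape above; `EntryLike`, `fakeEntry`, and `countBytes` are illustrative names, not part of tar-parser.

```ts
// Subset of the TarEntry shape documented above, copied locally for a self-contained demo.
interface EntryLike {
  text(): Promise<string>
  stream(): ReadableStream<Uint8Array>
}

// Hypothetical stand-in backed by an in-memory buffer. Not part of tar-parser.
function fakeEntry(content: string): EntryLike {
  let bytes = new TextEncoder().encode(content)
  return {
    async text() {
      return content
    },
    stream() {
      return new ReadableStream<Uint8Array>({
        start(controller) {
          // Emit the content in 4-byte chunks to mimic streaming.
          for (let i = 0; i < bytes.length; i += 4) {
            controller.enqueue(bytes.subarray(i, i + 4))
          }
          controller.close()
        },
      })
    },
  }
}

// Consume the stream chunk by chunk without buffering the whole entry.
async function countBytes(entry: EntryLike): Promise<number> {
  let reader = entry.stream().getReader()
  let total = 0
  while (true) {
    let result = await reader.read()
    if (result.done) break
    total += result.value.length
  }
  return total
}

countBytes(fakeEntry('hello tar world')).then((total) => console.log(total))
```

With a real entry, only the `countBytes` part would apply; text() and arrayBuffer() are the simpler choice when the content is small enough to hold in memory.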

TarHeader

Metadata for a single tar entry.

ts
interface TarHeader {
  name: string       // File path within the archive
  size: number       // File size in bytes
  mode: number       // Unix file mode (e.g. 0o644)
  uid: number        // Owner user ID
  gid: number        // Owner group ID
  mtime: Date        // Last modification time
  type: TarEntryType // 'file', 'directory', 'symlink', etc.
  linkName: string   // Target path for symlinks
  prefix: string     // Path prefix for long filenames
}
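
As a small illustration of working with these fields, the sketch below formats a header as an `ls`-style summary line. `HeaderLike` copies only the fields it needs from the interface above, and `formatMode` and `describeHeader` are hypothetical helpers, not part of tar-parser.

```ts
// Subset of the TarHeader fields documented above.
interface HeaderLike {
  name: string
  size: number
  mode: number
  mtime: Date
}

// Render the permission bits of a Unix mode as e.g. 'rw-r--r--'.
function formatMode(mode: number): string {
  let flags = 'rwxrwxrwx'
  let out = ''
  for (let i = 0; i < 9; i++) {
    // Bit 8 is the owner's read bit; bit 0 is the "other" execute bit.
    out += mode & (1 << (8 - i)) ? flags.charAt(i) : '-'
  }
  return out
}

// Combine permissions, size, modification date, and name into one line.
function describeHeader(header: HeaderLike): string {
  let date = header.mtime.toISOString().slice(0, 10)
  return `${formatMode(header.mode)} ${String(header.size).padStart(8)} ${date} ${header.name}`
}

console.log(
  describeHeader({
    name: 'src/index.ts',
    size: 1024,
    mode: 0o644,
    mtime: new Date('2024-01-15T00:00:00Z'),
  }),
)
```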

Examples

Extract an Archive

ts
import { parseTar } from 'remix/tar-parser'

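let response = await fetch('https://registry.npmjs.org/remix/-/remix-1.0.0.tgz')

// .tgz archives are gzip-compressed, so decompress before parsing
let tarStream = response.body!.pipeThrough(new DecompressionStream('gzip'))
let entries = await parseTar(tarStream)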

for (let entry of entries) {
  if (entry.header.type === 'file') {
    console.log(`${entry.header.name} (${entry.header.size} bytes)`)
  }
}

Stream Large Archives

Process entries one at a time to keep memory usage low:

ts
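// Assumes Node.js for mkdir and writeFile
import { mkdir, writeFile } from 'node:fs/promises'
import { TarParser } from 'remix/tar-parser'

// stream: a ReadableStream<Uint8Array> of uncompressed tar data
let parser = new TarParser(stream)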

for await (let entry of parser) {
  if (entry.header.type === 'directory') {
    await mkdir(entry.header.name, { recursive: true })
    continue
  }

  if (entry.header.type === 'file') {
    await writeFile(entry.header.name, entry.stream())
  }
}
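
The loop above writes each entry to the path stored in its header. Those names come from the archive itself, so when extracting archives you do not control, it is worth checking that each path stays inside the destination directory before writing. A minimal sketch, assuming Node.js; `resolveInside` is a hypothetical helper and not part of tar-parser:

```ts
import * as path from 'node:path'

// Resolve an archive entry name against a destination directory. Returns null
// when the result would land outside that directory (e.g. '../../etc/passwd').
function resolveInside(root: string, entryName: string): string | null {
  let base = path.resolve(root)
  let target = path.resolve(base, entryName)
  let relative = path.relative(base, target)

  let escapes =
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative) // e.g. a different drive on Windows

  return escapes ? null : target
}

console.log(resolveInside('out', 'src/index.ts'))
console.log(resolveInside('out', '../secrets.txt'))
```

Inside the extraction loop, you could call `resolveInside(destination, entry.header.name)` and skip any entry for which it returns null. Symlink entries would need separate handling, which this sketch does not attempt.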

Filter Specific Files

ts
import { TarParser } from 'remix/tar-parser'

let parser = new TarParser(stream)

for await (let entry of parser) {
  if (entry.header.name.endsWith('.json')) {
    let text = await entry.text()
    let data = JSON.parse(text)
    console.log(data)
  }
}

Read a Single File from an Archive

ts
import { TarParser } from 'remix/tar-parser'

let parser = new TarParser(stream)

for await (let entry of parser) {
  if (entry.header.name === 'package/package.json') {
    let pkg = JSON.parse(await entry.text())
    console.log(pkg.name, pkg.version)
    break
  }
}

Related

  • multipart-parser -- Streaming multipart/form-data parsing.
  • fs -- Filesystem utilities for writing extracted files.

Released under the MIT License.