Acknowledge

Some critics of the PDF.js project say that the default viewer should be easier to integrate. As a developer who wants to show a PDF quickly, I understand them to some extent, but knowing the primary goal of an OSS project helps us understand the trade-offs its maintainers have made.

Everything here is what I learned from the PDF.js project. It may not be entirely accurate, but I hope it helps you understand PDF.js better.

Background

PDF.js is an OSS project supported by Mozilla and built with HTML5. Its goal is to create a general-purpose, web standards-based platform for rendering PDFs in the Firefox browser. Many people find it hard to integrate into their own projects, and that is somewhat intentional.

It is not developed as a component or library that you can drop in like most npm packages, because its primary goal is to work well inside the Firefox browser. That goal does not match most developers' expectations, but it is a trade-off made by the maintainers, and we should understand it.

Issues that discuss why it's not easy to integrate

Introduction

PDF.js uses a web worker for better rendering performance.

A typical web application produces a single bundle, but PDF.js produces at least four: the main, worker, sandbox, and web bundles.

javascript
// From https://github.com/mozilla/pdf.js/blob/master/gulpfile.mjs#L1001
function buildGeneric(defines, dir) {
  rimraf.sync(dir);

  return merge([
    createMainBundle(defines).pipe(gulp.dest(dir + "build")),
    createWorkerBundle(defines).pipe(gulp.dest(dir + "build")),
    createSandboxBundle(defines).pipe(gulp.dest(dir + "build")),
    createWebBundle(defines, {
      defaultPreferencesDir: defines.SKIP_BABEL
        ? "generic/"
        : "generic-legacy/",
    }).pipe(gulp.dest(dir + "web")),
    // ...
  ]);
}

The default viewer uses the web bundle directly, and the web bundle depends on the main and worker bundles. Keep in mind that the main bundle must be loaded before the default viewer is used. Each time a PDF document is opened with the open method, a new worker is created to render it.

The main bundle is built from its entry file, src/pdf.js. The worker bundle is built from its entry file, src/pdf.worker.js.

The web directory contains the source code of the default viewer. Every module there that depends on the main bundle has to import from the pdfjs-lib package, which is resolved to web/pdfjs.js at build time through webpack's resolve.alias option.
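The alias mechanism can be pictured as a small webpack configuration fragment. This is only a minimal sketch, not the project's actual build configuration: the `createViewerWebpackConfig` helper, its `rootDir` parameter, and the `mode` and `entry` values are assumptions made for illustration. Only the `resolve.alias` option and the mapping from `pdfjs-lib` to `web/pdfjs.js` come from the description above.

```javascript
// Hypothetical helper: returns a webpack-style config object for the viewer.
// `rootDir` is assumed to be the absolute path of a PDF.js checkout.
function createViewerWebpackConfig(rootDir) {
  return {
    mode: "production",
    // Assumed entry file of the viewer, for illustration only.
    entry: `${rootDir}/web/viewer.js`,
    resolve: {
      alias: {
        // Any `import ... from "pdfjs-lib"` is redirected to this file.
        "pdfjs-lib": `${rootDir}/web/pdfjs.js`,
      },
    },
  };
}
```

With such an alias in place, viewer modules never reference the main bundle by path; they only know the name `pdfjs-lib`.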

javascript
// web/pdfjs.js
// https://github.com/mozilla/pdf.js/blob/master/web/pdfjs.js
if ((typeof PDFJSDev === "undefined" || PDFJSDev.test("GENERIC")) && !globalThis.pdfjsLib) {
  await globalThis.pdfjsLibPromise;
}

const {
  AbortException,
  // ...
} = globalThis.pdfjsLib;

export {
  AbortException,
  // ...
};

As the code above shows, the web bundle requires the main bundle to be loaded first. Its exports are read from globalThis.pdfjsLib, which is only guaranteed to exist once globalThis.pdfjsLibPromise has resolved.
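To make the loading order concrete, here is a self-contained sketch of the same handshake that runs without PDF.js. It is an illustration under assumptions: `loadFakeMainBundle` and `getLibExports` are hypothetical names, the fake library object only carries an `AbortException` class, and a plain object stands in for `globalThis`.

```javascript
// Hypothetical stand-in for the main bundle: it publishes a promise first and
// sets `pdfjsLib` on the given global object once "loading" has finished.
function loadFakeMainBundle(globalObject) {
  globalObject.pdfjsLibPromise = new Promise(resolve => {
    setTimeout(() => {
      globalObject.pdfjsLib = {
        AbortException: class AbortException extends Error {},
      };
      resolve(globalObject.pdfjsLib);
    }, 0);
  });
}

// Mirrors the check in web/pdfjs.js: wait for the promise only when the
// library object is not there yet, then read the exports from it.
async function getLibExports(globalObject) {
  if (!globalObject.pdfjsLib) {
    await globalObject.pdfjsLibPromise;
  }
  const { AbortException } = globalObject.pdfjsLib;
  return { AbortException };
}
```

Calling `getLibExports` before the fake main bundle has finished loading simply waits; reading `pdfjsLib` directly at that moment would yield `undefined`.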

javascript
// An example module, web/alt_text_manager.js, that depends on the main bundle.
//
// From https://github.com/mozilla/pdf.js/blob/master/web/alt_text_manager.js
import { DOMSVGFactory, shadow } from "pdfjs-lib";

class AltTextManager {
  // ...
}

PDFViewerApplication

The global PDFViewerApplication object is the entry point of the PDF.js default viewer. It glues all the modules together and provides the API of the default viewer.

AppOptions

There are dozens of options in PDF.js, and for now they all fall into four kinds. At first you may wonder why there are so many options and what they mean. This document will help you understand them.

Let's go through them one by one!

Option Kinds

  • VIEWER
  • API
  • WORKER
  • PREFERENCE
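One way to picture option kinds is as bit flags attached to each option, so that the options of one kind can be collected in a single pass. The sketch below is a simplified illustration rather than the PDF.js source: the flag values, the `optionTable` object with its sample values, and the `getOptionsByKind` helper are all assumptions made for this example.

```javascript
// Assumed flag values; each kind occupies its own bit so kinds can be combined.
const OptionKind = {
  VIEWER: 0x02,
  API: 0x04,
  WORKER: 0x08,
  PREFERENCE: 0x80,
};

// Hypothetical option table: every option has a value and one or more kinds.
const optionTable = {
  defaultUrl: { value: "example.pdf", kind: OptionKind.VIEWER },
  textLayerMode: {
    value: 1,
    kind: OptionKind.VIEWER | OptionKind.PREFERENCE,
  },
  workerSrc: { value: "../build/pdf.worker.mjs", kind: OptionKind.WORKER },
};

// Collects the values of all options whose kind includes the requested flag.
function getOptionsByKind(table, kind) {
  const result = {};
  for (const [name, option] of Object.entries(table)) {
    if (option.kind & kind) {
      result[name] = option.value;
    }
  }
  return result;
}
```

For instance, `getOptionsByKind(optionTable, OptionKind.WORKER)` collects only the options that have to be handed to the worker setup.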

Important options

  • defaultUrl
  • locale
  • workerSrc

defaultUrl

  • Type URL | string | Uint8Array

The URL of the PDF file. If you run into a CORS issue when loading a PDF file from a different origin, see the origin match error section on the common pitfalls page.

locale

  • Type string
  • Default en-US

The locale of the viewer. It is easy to switch to a different locale by setting this option. See all supported locales in the l10n folder.

workerSrc

The URL of the PDF.js web worker bundle.

PDF.js uses a web worker to speed up rendering, which means some code must create and initialize that worker from workerSrc. The option comes with a default value, but the right value always depends on how you deploy PDF.js. Make sure the worker bundle can actually be downloaded from the workerSrc URL.
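Because a relative workerSrc is resolved against the page that hosts the viewer, it can help to compute the final URL explicitly when debugging a deployment. The following is a minimal sketch under assumptions: `resolveWorkerSrc` and `configureWorker` are hypothetical helpers, and the example paths are made up. `GlobalWorkerOptions.workerSrc` is the property of the main bundle that holds the worker URL; the default viewer normally sets it for you from its own options.

```javascript
// Resolves a (possibly relative) workerSrc against the URL of the hosting page.
function resolveWorkerSrc(workerSrc, pageUrl) {
  return new URL(workerSrc, pageUrl).href;
}

// Hypothetical helper: stores the resolved URL on the library object that is
// passed in (for example `globalThis.pdfjsLib`) and returns it for logging.
function configureWorker(pdfjsLib, workerSrc, pageUrl) {
  const resolved = resolveWorkerSrc(workerSrc, pageUrl);
  pdfjsLib.GlobalWorkerOptions.workerSrc = resolved;
  return resolved;
}
```

Opening the resolved URL directly in the browser is a quick way to check that the worker bundle is actually served from that location.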

Worker Request

Other options

textLayerMode

Controls whether the text layer is enabled.

Important Events

documentinit

Emitted after the setInitialView method has been called and the initial view is shown successfully. Once this event is emitted, the viewer is ready to use.

documenterror

Emitted after an error occurs while rendering the document.

pagerendered

Emitted after each page is rendered successfully.
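The three events above can be observed through the viewer's event bus. The sketch below is one possible way to do it, not an official recipe: `watchViewerEvents` and the shape of the returned `state` object are assumptions, and the function only relies on the event bus having an `on(name, listener)` method and on `pagerendered` carrying a `pageNumber` field.

```javascript
// Hypothetical helper: registers listeners for the three events and keeps a
// small summary of what has happened so far.
function watchViewerEvents(eventBus, log = console.log) {
  const state = { ready: false, failed: false, renderedPages: [] };

  eventBus.on("documentinit", () => {
    state.ready = true;
    log("The initial view is shown; the viewer is ready.");
  });

  eventBus.on("pagerendered", evt => {
    state.renderedPages.push(evt.pageNumber);
    log(`Page ${evt.pageNumber} rendered.`);
  });

  eventBus.on("documenterror", () => {
    state.failed = true;
    log("An error occurred while rendering the document.");
  });

  return state;
}
```

In a page that hosts the default viewer, this would typically be called as `watchViewerEvents(PDFViewerApplication.eventBus)` once the application has finished initializing.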

Gulp Tasks
