Architecting Your Browser Extension with Message Passing
Learning to build a browser extension can be a lot of fun, though it may take some ramp-up. It requires using browser concepts that many web developers typically don’t touch. Once you get the hang of them, they can enhance your understanding of your web browser’s capabilities and constraints.
Despite some great documentation of the essentials, it can be tough to understand how different concepts fit together and build the experience you want.
In this guide, we’ll build a simple extension that allows the user to highlight a word on a webpage and get a preview of the Wikipedia article for that word. We’ll use popups, content scripts, and service workers all together. The extension will be built for Chrome, but you can adapt this for Firefox as well.
But first, the essentials
Before reading this guide, you should familiarize yourself with the core concepts. In particular, be sure you understand Manifest V3, content scripts, popups, and service workers.
- Manifest: overall declaration of your extension which also declares its entrypoint
- Content scripts: scripts that run in the context of a webpage
- Popups: many extensions open a small window when clicked. That’s a “popup”.
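- Service workers: event-driven scripts that run in the background of your extension. In Manifest V3, the browser can stop them when they are idle and restart them when an event arrives.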
The web of i/o
For web applications, the server has capabilities and access to information that the client doesn’t, and vice-versa. To work together, they pass messages back and forth. Browser extensions work the same way. The various scripts and service workers are sandboxed, so they must use message passing to work together. You’ll need to become comfortable using message passing to implement even simple features in your extension.
chrome.runtime.sendMessage("Hello world!")
This is a simple way to send a message to your service worker. If your service worker is listening for messages, it’ll receive one with the string Hello world!
You can send objects, too, and this will become necessary in our later examples. The main rule is that messages must be serializable as JSON, so you can’t send things like functions.
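To get a feel for what survives the trip, the sketch below round-trips a message through JSON, which roughly approximates what the receiver sees. `simulateMessageTransfer` is a hypothetical helper for illustration and is not part of the extension API.

```typescript
// Approximates message passing by round-tripping a value through JSON.
// Hypothetical helper, not a Chrome API.
function simulateMessageTransfer<T>(message: T): unknown {
  return JSON.parse(JSON.stringify(message));
}

const original = {
  topic: "greeting",
  text: "Hello world!",
  sentAt: new Date(0),                         // becomes an ISO string
  onReply: () => console.log("never arrives"), // functions are dropped
};

const received = simulateMessageTransfer(original) as Record<string, unknown>;
console.log(received);
```

The function-valued key disappears and the Date arrives as a plain string, so it is safest to stick to strings, numbers, booleans, arrays, and plain objects.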
You can send messages to tabs, too, using chrome.tabs.sendMessage, though you need to know the ID of the tab to send the message to. We have an example of this later in the guide.
chrome.runtime.onMessage.addListener(
  (message, sender, sendResponse) => {
    if (message === "Hello world!") {
      // sender.id is the ID of the extension that sent the message
      sendResponse(`Thanks for the message, sender #${sender.id}!`)
    }
  })
Here’s an example of listening to a message. It’s a little more complicated than sending a message, since you declare a callback function instead of a simple JSON-compatible value. Let’s break down the callback function:
- The message is the exact message that was sent. It’s a good idea to check the content of the message to see what business logic should be invoked. This means that the callback will usually start with a big if or switch statement.
- The sender reveals some information about which script, tab, or document sent the message. Since many contexts can send a message, different senders can look quite different.
- sendResponse is a function that allows the receiver to reply to a message. We’ll talk more about this later, but this can be really useful for making messages more transactional.
Both tabs and service workers are capable of using addListener.
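As a rough illustration of how senders differ, here is a minimal sketch that inspects a sender object. `SenderLike` is a hypothetical stand-in covering only three fields of the real sender object (id, url, and tab), and `describeSender` is an illustrative helper, not an extension API.

```typescript
// Minimal stand-in for the sender fields used here. The real sender object
// has more fields; this reduced shape is an assumption for illustration.
type SenderLike = {
  id?: string;           // ID of the extension that sent the message
  url?: string;          // URL of the page or frame that sent the message
  tab?: { id?: number }; // only present when the message came from a tab
};

function describeSender(sender: SenderLike): string {
  if (sender.tab?.id !== undefined) {
    return `a script running in tab ${sender.tab.id}`;
  }
  if (sender.url !== undefined) {
    return `an extension page at ${sender.url}`;
  }
  return "an unknown extension context";
}

console.log(describeSender({ id: "abc", tab: { id: 42 } }));
console.log(describeSender({ id: "abc", url: "chrome-extension://abc/popup.html" }));
```

A message from a content script carries a tab, while one from the popup typically does not, which is why the tab check comes first.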
A sample extension using scripts and message passing
Here’s the code for our extension. This extension lets the user highlight any word in their browser, and by clicking the extension “action”, they’ll get a preview of the topic word from Wikipedia.
manifest.json
{
"manifest_version": 3,
"name": "What is this?",
"description": "Asks Wikipedia about a highlighted word",
"version": "1.0",
"action": {
"default_icon": "icon.png",
"default_popup": "popup.html"
},
"background": {
"service_worker": "background.js"
},
"permissions": ["activeTab", "scripting"],
"host_permissions": [
"*://*.wikipedia.org/*"
]
}
This manifest has the following characteristics:
- It has a popup. When the user clicks the “action” icon, it will open popup.html.
- It has a service worker. The browser starts it when the extension is installed and again whenever an event it listens for arrives.
- It has the activeTab permission, which gives our extension temporary access to the currently focused tab once the user invokes the extension.
- It has the scripting permission, which allows scripts to be injected into browser tabs with chrome.scripting.
- It has a host permission for wikipedia.org and its subdomains, which will allow cross-origin fetch requests to be made from the extension.
popup.html
<html>
<body style="width: 500px">
<h1>What is this?</h1>
<div id="report"></div>
<script src="popup.js"></script>
</body>
</html>
This is the markup page for the popup. It mostly contains DOM containers that will be manipulated by the script later. It also loads popup.js. When the extension loads an HTML page, that page can load scripts from the project folder, just as if the scripts were hosted alongside it.
popup.js
(async() => {
const report = document.getElementById("report");
report.innerText = "Waiting..."
// Kick off the plugin business logic - the service worker will receive
// this message ...
const result = await chrome.runtime.sendMessage("opened");
// ... and return an async result
report.innerText = ""
// Add an annotation to the source article
const link = document.createElement("a");
link.innerText = "From Wikipedia"
link.href = result.link
link.target = "_blank"
report.appendChild(link)
// The service worker can't access certain DOM APIs, so we let the popup
// script parse the content
const virtualDocument = new DOMParser()
.parseFromString(result.content, "text/html");
// Get the first paragraph with content
const firstParagraph = Array.from(virtualDocument.querySelectorAll("p"))
.find(p => p.innerText.trim() !== "");
// Add it to the DOM
const p = document.createElement("p");
p.innerText = firstParagraph.innerText;
report.appendChild(p);
})()
This is the self-invoking popup script that executes every time the popup is opened. There’s a bit going on, so there are some comments to explain what’s happening. Most of the code is garden-variety DOM manipulation. You can also see an example of sendMessage, which returns a Promise that resolves with the response from the message recipient.
background.js
(async() => {
chrome.runtime.onMessage.addListener(
(message, _sender, sendResponse) => {
(async () => {
// The popup will have broadcast an "opened" message, which this
// will receive
if(message == "opened"){
// Get the active tab. The activeTab permission in the manifest gives
// us temporary access to it once the user clicks the action
const [ tab ] = await chrome.tabs
.query({ active: true, currentWindow: true });
const tabId = tab.id;
// Inject the content script into the tab so that we can get
// the highlighted word. We need the scripting permission
// in the manifest for this to work ...
const injection = await chrome.scripting.executeScript({
target: { tabId },
files: ["content.js"],
});
// ... and it returns a value
const word = injection[0].result;
// Encode the word so spaces and punctuation are safe in a URL
const link = `https://en.wikipedia.org/wiki/${encodeURIComponent(word)}`
// Fetch the link. We can't reliably do this in a content
// script because of CORS, so we let the service worker do
// it. We need a matching host_permission in the manifest
// for the service worker to bypass CORS
const response = await fetch(link)
const content = await response.text()
// Send a reply. This will resolve the promise
sendResponse({ content, link });
}
})();
// IMPORTANT: Return true from this function to indicate that
// we will send a response asynchronously. Note that we return this
// _outside_ of the async function closure above. If we don't do this,
// the message channel closes as soon as the listener returns, and the
// popup's sendMessage promise resolves before our response is ready!
return true;
},
);
})()
This is the code for the service worker, which is specified by our manifest.json. It’s more complex and extension-oriented, so there are comments here as well. Here are the concepts we employ:
- We look for the focused tab. The user will execute the extension while viewing a tab, so we need to know which one it is.
- We execute a content script remotely on the tab. This also returns a Promise, and we can receive information from the injected script when it’s done.
- We use the fetch API to send a request to Wikipedia. Normally this wouldn’t work (unless we were already on a Wikipedia page) due to CORS restrictions. Extensions will bypass CORS, though, as long as there is a matching entry in host_permissions in the manifest.
- We call sendResponse to reply to the popup script.
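The “return true, reply later” pattern from the listener can be factored into a small helper. The sketch below is one possible shape, not the only way to do it. `respondAsync` and its types are hypothetical names, and it is exercised here with a fake sendResponse so it runs outside a browser.

```typescript
type SendResponse = (response?: unknown) => void;
type AsyncHandler = (message: unknown) => Promise<unknown>;

// Wraps an async handler in a function with the onMessage listener shape.
function respondAsync(handler: AsyncHandler) {
  return (message: unknown, _sender: unknown, sendResponse: SendResponse): boolean => {
    handler(message)
      .then((result) => sendResponse(result))
      .catch((error: unknown) => sendResponse({ error: String(error) }));
    // Returned synchronously, before the handler finishes, so the
    // response channel stays open.
    return true;
  };
}

// In the service worker this would be registered roughly as (not run here):
//   chrome.runtime.onMessage.addListener(respondAsync(async (message) => { ... }));

// Stand-in invocation with a fake sendResponse to show the ordering.
const listener = respondAsync(async (message) => `echo: ${String(message)}`);
const replies: unknown[] = [];
const keptOpen = listener("opened", {}, (response) => replies.push(response));
console.log(keptOpen, replies.length); // true 0 (the reply arrives on a later tick)
```

Sending an `{ error }` object on failure is a design choice of this sketch. Without it, a handler that throws would leave the popup waiting on a promise that never gets a useful answer.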
content.js
// This script is injected by the service worker on a browser tab
(async() => {
// Get the highlighted value and return it. The last evaluated value
// in this script is what will be returned to the service worker
return window.getSelection().toString();
})();
This is the content script that the service worker injected on the current tab. All it does is return the highlighted text.
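A selection can contain spaces, punctuation, or nothing at all, so it helps to clean it up before using it as part of a URL. Here is a minimal sketch of that step. `buildWikipediaLink` is a hypothetical helper, and the `language` parameter is an assumption that anticipates the language feature added later in this guide.

```typescript
// Hypothetical helper: turns a raw selection into an article URL, or null
// when nothing usable is selected.
function buildWikipediaLink(language: string, selection: string): string | null {
  // Wikipedia article URLs use underscores where titles have spaces
  const title = selection.trim().replace(/\s+/g, "_");
  if (title === "") {
    return null;
  }
  return `https://${language}.wikipedia.org/wiki/${encodeURIComponent(title)}`;
}

console.log(buildWikipediaLink("en", "  New York "));
// → https://en.wikipedia.org/wiki/New_York
console.log(buildWikipediaLink("en", "C++"));
// → https://en.wikipedia.org/wiki/C%2B%2B
console.log(buildWikipediaLink("en", "   "));
// → null
```

Returning null for an empty selection lets the caller show a “select a word first” message instead of fetching a meaningless URL.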
Why didn’t we just invoke fetch here? Because a content script on a real webpage is not treated as a “part of the extension” like the service worker or the popup page. It won’t be able to bypass CORS even with a matching entry in host_permissions, so the fetch would fail.
Does this mean we could use fetch in the popup and totally remove the service worker? Yes, that will work. However, extensions are simpler to maintain if the popup is treated as more of a “view” layer. If you think about the model-view-controller paradigm, try to think of the popup as the “view” and the service worker as the “controller”. This should help you organize your code’s responsibilities. The service worker can also keep state between messages for as long as it stays alive.
And that’s it! This code gets you a working extension complete with a background service worker and message passing. Here’s a basic diagram of what we’ve built.
Tightening up your message passing
As you build up your business logic, you’ll probably need to add more message passing. If you’re using an IDE, you may find that it can’t predict what types of messages can be passed and what they may contain, since messages are being passed across a context boundary and aren’t just JavaScript functions with known signatures. This is a recipe for runtime errors and frustrating debugging sessions.
Use TypeScript
Not all projects need TypeScript, but extensions are well suited to it because of how they are invoked. You may find they’re harder to cover with automated tests, so it’s good to use TypeScript to catch the most avoidable runtime errors. Using discriminated unions in TypeScript is especially useful for event listeners, which we’ll illustrate later.
Use message topics
Though we can send primitive messages like strings and numbers, it’s better to send a JavaScript object with a “topic” key. This gives receivers a simple way to tell whether a message matters to them before examining its other values. This might seem familiar to those who’ve worked with message bus architectures.
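Because any context can send anything, a receiver may want to confirm that a message really has a topic before branching on it. The sketch below shows one way to do that with a type guard. `isTopicMessage` and `describeMessage` are hypothetical names used only for illustration.

```typescript
type TopicMessage = { topic: string } & Record<string, unknown>;

// Runtime check that an incoming value is an object with a string "topic"
function isTopicMessage(value: unknown): value is TopicMessage {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { topic?: unknown }).topic === "string"
  );
}

function describeMessage(message: unknown): string {
  if (!isTopicMessage(message)) {
    return "ignored";
  }
  switch (message.topic) {
    case "opened":
      return "popup opened";
    case "changeLanguage":
      return `language set to ${String(message.language)}`;
    default:
      return `unknown topic: ${message.topic}`;
  }
}

console.log(describeMessage("Hello world!"));
console.log(describeMessage({ topic: "changeLanguage", language: "fr" }));
```

Messages without a topic fall through to a single “ignored” branch, so malformed input is handled in one place rather than in every case.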
Here’s an example of how we can use structured messages with topics to change the desired language of the extension. Let’s say we want to add a feature where the user can toggle the language of the Wikipedia entry the extension retrieves.
popup.html
<html>
<body style="width: 500px">
<div style="display: flex;">
<div style="flex-grow: 1">
<h2>What is this?</h2>
</div>
<div>
<button id="language-en">EN</button>
<button id="language-es">ES</button>
<button id="language-fr">FR</button>
</div>
</div>
<div id="report"></div>
<script src="dist/popup.js"></script>
</body>
</html>
Here we’ve added a few buttons, one for each supported language.
popup.ts
(async() => {
const report = document.getElementById("report") as HTMLDivElement;
const reset = () => {
report.innerText = "Waiting..."
}
const updateDefinition = (result: { content: string, link: string }) => {
report.innerText = ""
const link = document.createElement("a");
link.innerText = "From Wikipedia"
link.href = result.link
link.target = "_blank"
report.appendChild(link)
const virtualDocument = new DOMParser()
.parseFromString(result.content, "text/html");
const firstParagraph = Array.from(virtualDocument.querySelectorAll("p"))
.find(p => p.innerText.trim() !== "") as HTMLParagraphElement;
const p = document.createElement("p");
p.innerText = firstParagraph.innerText;
report.appendChild(p);
}
const enButton = document.getElementById("language-en") as HTMLButtonElement;
const esButton = document.getElementById("language-es") as HTMLButtonElement;
const frButton = document.getElementById("language-fr") as HTMLButtonElement;
enButton.addEventListener("click", async () => {
reset();
const result = await chrome.runtime
.sendMessage({ topic: "changeLanguage", language: "en" });
updateDefinition(result);
});
esButton.addEventListener("click", async () => {
reset();
const result = await chrome.runtime
.sendMessage({ topic: "changeLanguage", language: "es" });
updateDefinition(result);
});
frButton.addEventListener("click", async () => {
reset();
const result = await chrome.runtime
.sendMessage({ topic: "changeLanguage", language: "fr" });
updateDefinition(result);
});
reset();
const result = await chrome.runtime.sendMessage({ topic: "opened" });
updateDefinition(result);
})()
Here we’ve changed a few things. On script load, we now send a message with a topic of “opened”. We also have click handlers for our new buttons that send a new type of message, including a topic and a “language”. Each of these will return an updated Wikipedia preview from the service worker, so they all re-render the DOM.
background.ts
let language = "en";
(async () => {
chrome.runtime.onMessage.addListener(
(message, _sender, sendResponse) => {
const getDefinition = async () => {
const [tab] = await chrome.tabs
.query({active: true, currentWindow: true});
const tabId = tab.id as number;
const injection = await chrome.scripting.executeScript({
target: {tabId},
files: ["content.js"],
});
const word = injection[0].result;
const link = `https://${language}.wikipedia.org/wiki/${encodeURIComponent(word)}`
const response = await fetch(link)
const content = await response.text()
sendResponse({content, link});
}
(async () => {
// The popup will have broadcast an "opened" message, which
// this will receive
if (message.topic == "opened") {
await getDefinition()
} else if(message.topic == "changeLanguage") {
console.log("changing language")
// The popup will have broadcast a "changeLanguage" message,
// which this will receive
language = message.language;
await getDefinition()
}
})();
// return true from this function to indicate that we will send
// a response asynchronously
return true;
},
);
})()
More changes here as well. The listener now checks the topic of the incoming message. If the language is changing, we set a language value before requesting the Wikipedia article. This value lives in memory for the life of the service worker, so it’s a simple piece of statefulness for our extension, though it resets to "en" if the browser stops and restarts the worker. The next time we request a Wikipedia preview, it will be in the new language.
Without having an object with a reliable topic at the root, we’d have to write a lot more code to ensure that we don’t accidentally hit a runtime error.
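If the chosen language should outlive the service worker, one option is to keep it in extension storage rather than in a variable. The sketch below is written against a small `StorageAreaLike` interface so it can run outside a browser. The interface, the helper names, and the in-memory implementation are all hypothetical. In an extension you would likely pass chrome.storage.local, which needs the "storage" permission in the manifest.

```typescript
// Structural stand-in for the parts of a storage area used below (assumption:
// chrome.storage.local offers promise-returning get and set like these).
interface StorageAreaLike {
  get(key: string): Promise<Record<string, unknown>>;
  set(items: Record<string, unknown>): Promise<void>;
}

const DEFAULT_LANGUAGE = "en";

async function loadLanguage(storage: StorageAreaLike): Promise<string> {
  const { language } = await storage.get("language");
  return typeof language === "string" ? language : DEFAULT_LANGUAGE;
}

async function saveLanguage(storage: StorageAreaLike, language: string): Promise<void> {
  await storage.set({ language });
}

// In-memory implementation so the sketch runs outside a browser
function createMemoryStorage(): StorageAreaLike {
  const data = new Map<string, unknown>();
  return {
    async get(key: string): Promise<Record<string, unknown>> {
      const result: Record<string, unknown> = {};
      if (data.has(key)) {
        result[key] = data.get(key);
      }
      return result;
    },
    async set(items: Record<string, unknown>): Promise<void> {
      for (const [key, value] of Object.entries(items)) {
        data.set(key, value);
      }
    },
  };
}

const demoStorage = createMemoryStorage();
loadLanguage(demoStorage)
  .then((before) => {
    console.log(before); // falls back to the default
    return saveLanguage(demoStorage, "fr");
  })
  .then(() => loadLanguage(demoStorage))
  .then((after) => console.log(after));
```

With this shape, the "changeLanguage" branch would save the new value and getDefinition would load it, instead of both sharing a module-level variable.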
Wrap the messaging and listening methods
We still don’t have great type safety with this code. If you’re using @types/chrome, you’ll get bindings for the extension APIs, but you’ll have to build your own type safety for the messages you send and receive. We want to achieve a few goals:
- Assert that sent messages obey a specific message shape.
- Use discriminated unions to know what keys are available for a message of a given topic.
- Require sent responses to have a specific shape depending on the topic.
export async function sendMessage(
parameters: RequestParameters,
): Promise<MessageResponse> {
console.log("sending message", parameters);
if ("tabId" in parameters && parameters.tabId) {
return await chrome.tabs.sendMessage(parameters.tabId, parameters);
} else {
return await chrome.runtime.sendMessage(parameters);
}
}
Here’s an example of a sendMessage wrapper. Note the input and output types, which are user-defined (the definitions for these are further down).
The function is very simple, and the only logic is to differentiate between whether the message should be sent to a tab or not. If you don’t intend to send messages to browser tabs, though, you don’t even need that conditional logic.
export const addListener = (callback: Callback) => {
chrome.runtime.onMessage.addListener(
(message: MessageRequest, _sender, sendResponse) => {
console.log("message received", message);
// Attach sendResponse at runtime. This is why sendResponse is optional, so you should nullguard it
message.sendResponse = (params: unknown) => {
console.log("sending response", params);
sendResponse(params);
};
callback(message);
return true;
},
);
};
Here’s an example of an addListener wrapper. This also includes some instrumentation to help us track messages in the browser, which can be useful for debugging.
The Callback type is user-defined, which is illustrated further down.
background.ts
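import { addListener } from "./messaging";
// Module-level state; it resets whenever the service worker restarts
let language = "en";
(() => {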
addListener(async (message) => {
const getDefinition = async () => {
const [tab] = await chrome.tabs.query({active: true, currentWindow: true});
const tabId = tab.id as number;
const injection = await chrome.scripting.executeScript({
target: {tabId},
files: ["content.js"],
});
const word = injection[0].result as string;
const link = `https://${language}.wikipedia.org/wiki/${encodeURIComponent(word)}`
const response = await fetch(link)
const content = await response.text()
return { content, link, word };
}
if (message.topic == "opened") {
// type system knows message is OpenedMessageRequest
const definition = await getDefinition()
message.sendResponse?.(definition);
} else if (message.topic == "changeLanguage") {
// type system knows message is ChangeLanguageRequest
language = message.language;
const definition = await getDefinition()
message.sendResponse?.(definition);
}
},
);
})()
Here’s our message receiver again. We’re now using the addListener wrapper, which simplifies the API a bit. The conditional branches now benefit from discriminated unions: within each branch, TypeScript knows what type of message was received.
The sendResponse method on the message is new and is injected via addListener. We’ll use this to send an asynchronous reply to the sender. Below is how we stitch everything together to get typesafe message responses as well.
messaging.ts
type Preview = { content: string, link: string }
// Inputs and outputs for "opened" message
export type OpenedMessageResponse = Preview & { word: string };
export type OpenedMessageRequest = {
topic: "opened"
sendResponse?: (params: OpenedMessageResponse) => void;
};
// Inputs and outputs for "changeLanguage" message
export type ChangeLanguageResponse = Preview;
export type ChangeLanguageRequest = {
topic: "changeLanguage";
language: string,
sendResponse?: (params: ChangeLanguageResponse) => void;
};
// Union type for all messages
export type MessageRequest =
| OpenedMessageRequest
| ChangeLanguageRequest;
type Callback = (message: MessageRequest) => void;
// addListener wrapper. This includes some logging to help track the flow of
// messages in the browser
export const addListener = (callback: Callback) => {
chrome.runtime.onMessage.addListener(
(message: MessageRequest, _sender, sendResponse) => {
console.log("message received", message);
// Attach sendResponse at runtime. This is why sendResponse is optional,
// so you should nullguard it
message.sendResponse = (params: unknown) => {
console.log("sending response", params);
sendResponse(params);
};
callback(message);
return true;
},
);
};
// Handle requests that may need to be sent to a tab. Not required, but useful
// if you need to send messages to tabs
type RequestParameters = MessageRequest | ({ tabId: number } & MessageRequest);
// Union for all message responses
export type MessageResponse =
| OpenedMessageResponse
| ChangeLanguageResponse;
// Signature overloads for message inputs and outputs. If you add a new message
// type, be sure to add an overload here
export async function sendMessage(
input: OpenedMessageRequest,
): Promise<OpenedMessageResponse>;
export async function sendMessage(
input: ChangeLanguageRequest,
): Promise<ChangeLanguageResponse>;
// Actual function definition, which must receive a union of all possible
// message types
export async function sendMessage(
parameters: RequestParameters,
): Promise<MessageResponse> {
console.log("sending message", parameters);
if ("tabId" in parameters && parameters.tabId) {
return await chrome.tabs.sendMessage(parameters.tabId, parameters);
} else {
return await chrome.runtime.sendMessage(parameters);
}
}
These are all of the messaging wrappers (already mentioned) and types in one place. There’s a lot to go over, so there are comments to help you along the way. The result is that you’ll have an API such as below:
// resolves to a ChangeLanguageResponse
const languageResult = await sendMessage({ topic: "changeLanguage", language: "fr" });
// resolves to an OpenedMessageResponse
const openedResult = await sendMessage({ topic: "opened" });
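If maintaining one overload per topic becomes tedious, a lookup type keyed by topic is one possible alternative. This is a sketch of a different technique from the overloads above, not a replacement the guide depends on. `ResponseByTopic`, `TopicRequest`, `Transport`, and `createSendMessage` are hypothetical names, and the fake transport exists only so the example runs outside a browser.

```typescript
type Preview = { content: string; link: string };

// Maps each topic to its response shape (same shapes as in messaging.ts)
type ResponseByTopic = {
  opened: Preview & { word: string };
  changeLanguage: Preview;
};

type TopicRequest =
  | { topic: "opened" }
  | { topic: "changeLanguage"; language: string };

// In an extension this could be (message) => chrome.runtime.sendMessage(message)
type Transport = (message: TopicRequest) => Promise<unknown>;

function createSendMessage(transport: Transport) {
  return async function sendMessage<R extends TopicRequest>(
    request: R,
  ): Promise<ResponseByTopic[R["topic"]]> {
    // The cast trusts the receiver to honor the contract; nothing checks it
    // at runtime.
    return (await transport(request)) as ResponseByTopic[R["topic"]];
  };
}

// Fake transport so the sketch runs outside a browser
const fakeTransport: Transport = async (message) => ({
  content: "<p>stub</p>",
  link: `https://example.org/${message.topic}`,
  word: "stub",
});

const send = createSendMessage(fakeTransport);
send({ topic: "opened" }).then((result) => console.log(result.word));
send({ topic: "changeLanguage", language: "fr" }).then((result) => console.log(result.link));
```

Adding a new message type then means adding one entry to `ResponseByTopic` and one member to the request union. The trade-off is that the generic signature is a little harder to read than explicit overloads.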
The full code for this example project is here if you want a good starting point for your extension. Happy building!