Download authenticated Canvas attachments with AI agents

Canvas LMS puts course content behind your institution's SSO. The agent uses your existing browser session to fetch lecture slides, parse .docx assignments, and pull PDFs, without re-engineering the auth.

Why this is genuinely hard: Canvas attachment URLs are session-bound. They include a signed token tied to your SSO cookie, so a copied URL fails when opened in another browser. Calling fetch() from the page console runs into Canvas's strict connect-src CSP. The standard Playwright/Browser Use approach of "log in, then download" requires storing your SSO password somewhere, which most universities forbid. Chromeflow's privileged-context tools run in the extension's background service worker, so they inherit your real Chrome cookies and are not subject to the page's CSP. The result is one tool call.
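To make the mechanism concrete, here is a minimal sketch of what a cookie-carrying fetch from an extension background context could look like. It is not Chromeflow's actual implementation: the name privilegedFetch and the injectable fetchImpl parameter are hypothetical, and the sketch assumes the extension holds host permissions for the Canvas origin.

```javascript
// Hypothetical sketch, not Chromeflow's real code.
// Assumes it runs in an extension service worker with host permissions
// for the target origin, so the page's CSP does not apply to this request.
async function privilegedFetch(url, fetchImpl = fetch) {
  // credentials: "include" asks the browser to attach cookies for the target origin.
  const response = await fetchImpl(url, { credentials: "include", redirect: "follow" });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return {
    contentType: response.headers.get("content-type") ?? "",
    bytes: new Uint8Array(await response.arrayBuffer()),
  };
}
```

Passing fetchImpl explicitly is only there so the function can be exercised outside the browser with a stub.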

The three privileged-context tools

fetch_url(url) — generic HTTP, returns the response body. Use for JSON APIs and arbitrary endpoints.
download_file(url, filename?) — Chrome's authenticated download flow. Returns the absolute path on disk. Use when another tool needs the bytes (e.g. local pdftotext).
read_attachment(url, format?) — privileged fetch + format-aware text extraction in one call. Handles .docx (via in-extension ZIP parsing — no libreoffice needed), .txt, .md, .csv, .json, .xml, .html. The fastest path to text content.
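The choice between the three tools can be sketched as a small decision helper. The function chooseTool and its needsBytesOnDisk flag are hypothetical, and treating a name with no file extension as an API endpoint is a heuristic assumed here, not documented behavior.

```javascript
// Formats that read_attachment is described as handling.
const TEXT_FORMATS = new Set(["docx", "txt", "md", "csv", "json", "xml", "html"]);

// Hypothetical helper: returns the name of the tool an agent might pick.
function chooseTool(filename, needsBytesOnDisk = false) {
  const match = /\.([a-z0-9]+)$/i.exec(filename);
  const ext = match ? match[1].toLowerCase() : "";
  // Another local tool needs the file itself, so download it.
  if (needsBytesOnDisk) return "download_file";
  // Assumption: no extension means a generic endpoint such as a JSON API.
  if (ext === "") return "fetch_url";
  // Text-extractable formats go straight to read_attachment.
  if (TEXT_FORMATS.has(ext)) return "read_attachment";
  // Everything else (for example .pdf) is downloaded for local processing.
  return "download_file";
}
```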

Walkthrough: read a Canvas lecture's .docx slides

1. Navigate to the Canvas page

open_page("https://canvas.manchester.ac.uk/courses/12345/modules")

You're already signed in via your institution's SSO. Chromeflow uses that session.

2. Find the attachment

const found = find_text("Lecture 5", scope_selector=".context_module")
// found holds the matching link element, its href, and its click coordinates
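The link's href is not necessarily the download URL itself. As a sketch, and assuming the href contains a /files/<id> segment (for example /courses/12345/files/8675309?wrap=1, which is an assumption about your Canvas instance), the download URL used in the next step could be derived like this. The helper name toDownloadUrl is hypothetical.

```javascript
// Hypothetical helper: derive a Canvas download URL from a file link's href.
// Returns null when the href does not contain a /files/<id> segment.
function toDownloadUrl(href, pageUrl) {
  // Resolve relative hrefs against the page they were found on.
  const url = new URL(href, pageUrl);
  const match = /\/files\/(\d+)/.exec(url.pathname);
  if (!match) return null;
  return `${url.origin}/files/${match[1]}/download?download_frd=1`;
}
```

If the helper returns null, the link probably points at a module item page rather than a file, and the agent would need to open that page first.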

3. Read the file content directly

read_attachment("https://canvas.manchester.ac.uk/files/8675309/download?download_frd=1", "docx")
// Returns: the extracted text of the .docx, ready to pass to the agent.
// No download to disk, no separate CLI, no libreoffice.

Behind the scenes: the extension fetches the URL with your Chrome cookies (the privileged context is not bound by the page's CSP), recognizes the .docx MIME type, unpacks the archive with its in-extension ZIP parser, which inflates entries using DecompressionStream("deflate-raw"), extracts word/document.xml, strips the XML tags, and returns clean text.
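The last two steps, turning word/document.xml into plain text, can be illustrated with a simplified sketch. This is not the extension's actual code: documentXmlToText and decodeXmlEntities are hypothetical names, and the sketch assumes the XML has already been unzipped and decoded to a string. It ignores tabs, line breaks inside runs, and nested text boxes.

```javascript
// Decode the five predefined XML entities (&amp; last, to avoid double-decoding).
function decodeXmlEntities(s) {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}

// Simplified sketch: one output line per <w:p> paragraph,
// built by concatenating the text of its <w:t> runs.
function documentXmlToText(xml) {
  const paragraphs = [];
  // "<w:p" must be followed by a space or ">" so <w:pPr> is not mistaken for a paragraph.
  for (const p of xml.matchAll(/<w:p[ >][\s\S]*?<\/w:p>/g)) {
    let text = "";
    // Same trick for <w:t>, so <w:tab/> and <w:tbl> are skipped.
    for (const t of p[0].matchAll(/<w:t(?: [^>]*)?>([\s\S]*?)<\/w:t>/g)) {
      text += decodeXmlEntities(t[1]);
    }
    paragraphs.push(text);
  }
  return paragraphs.join("\n");
}
```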

4. Or grab the raw PDF

const path = download_file(
  "https://canvas.manchester.ac.uk/files/8675310/download?download_frd=1",
  "lecture5-slides.pdf"
)
// Returns: "/Users/you/Downloads/lecture5-slides.pdf"
// Now any local tool (pdftotext, Preview, your code) can read it.

Native PDF parsing is on the roadmap. For now, use download_file followed by pdftotext.
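As a sketch of that second step, assuming the agent side runs on Node.js and that pdftotext (from Poppler or Xpdf) is installed and on the PATH, the downloaded file could be converted like this. The function names are hypothetical.

```javascript
// Build the pdftotext argument list: "-layout" keeps the physical page layout,
// and "-" as the output file sends the extracted text to stdout.
function pdftotextArgs(pdfPath) {
  return ["-layout", pdfPath, "-"];
}

// Hypothetical wrapper: run pdftotext on the path returned by download_file.
// Assumes Node.js and a pdftotext binary on the PATH.
async function extractPdfText(pdfPath) {
  const { execFile } = await import("node:child_process");
  const { promisify } = await import("node:util");
  const { stdout } = await promisify(execFile)("pdftotext", pdftotextArgs(pdfPath));
  return stdout;
}
```

Using execFile with an argument array, rather than a shell string, avoids quoting problems when the filename contains spaces.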

Use cases this unlocks

"Summarize this week's lectures" — agent enumerates module attachments, reads each one with read_attachment, produces a study guide.
"Pull all my assignment PDFs and tell me what's due" — agent walks the assignments page, downloads each PDF with download_file, reads dates via local pdftotext.
"What does the rubric for Assignment 3 say about the API design section?" — agent finds the rubric, read_attachment, answers from the actual text rather than guessing.
"Did the lecturer post anything new since Friday?" — agent diffs the modules page using get_page_text, surfaces new items.

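The "anything new since Friday" case comes down to comparing two text snapshots of the modules page. A minimal sketch, assuming the agent kept the earlier get_page_text output somewhere (where it is stored is up to the agent; newLines is a hypothetical helper):

```javascript
// Return the non-empty lines in currentText that did not appear in previousText.
// Lines are trimmed before comparison; order follows the current page.
function newLines(previousText, currentText) {
  const seen = new Set(
    previousText.split("\n").map((line) => line.trim()).filter(Boolean)
  );
  return currentText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line && !seen.has(line));
}
```

A set-based comparison like this reports additions only. Removed or reordered items would need a proper diff.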
Why not just use the Canvas API?

The Canvas REST API exists, but it requires an API token (your institution may or may not let you generate one), the token is bearer-only (no SSO inheritance), and many features (group assignments, locked files, certain LTI integrations) are still cookie-bound. Chromeflow's privileged-context fetch sidesteps all of this: if it's reachable from your browser, the agent can fetch it.

Privacy note

All of this runs locally. The agent talks to a WebSocket on 127.0.0.1. The extension talks to Chrome. None of your Canvas data leaves your machine except through the agent's own LLM API calls (Claude, OpenAI, etc.), which are subject to the agent's own policy. Chromeflow itself collects no telemetry.
