Every day, thousands of web services generate PDF (Portable Document Format) files—bills, contracts, reports. This step is often treated as a technical routine, “just convert the HTML,” but in practice it’s exactly where a trust boundary is crossed. The renderer parses HTML, downloads external resources, processes fonts, SVGs, and images, and sometimes has access to the network and the file system. Risky behavior can occur by default, without explicit options or warnings. That is enough for a PDF converter to become an SSRF proxy, a data leak channel, or even cause denial of service.
We therefore conducted a targeted analysis of popular HTML-to-PDF libraries written in the PHP, JavaScript, and Java languages: TCPDF, html2pdf, jsPDF, mPDF, snappy, dompdf, and OpenPDF. During the research, the PT Swarm team identified 13 vulnerabilities, demonstrated 7 intentional behaviors, and highlighted 6 potential misconfigurations. These included vulnerability classes such as Files or Directories Accessible to External Parties, Deserialization of Untrusted Data, Server-Side Request Forgery, and Denial of Service.
PDF generation is increasingly common across e‑commerce, fintech, logistics, and SaaS. Such services are often deployed inside the perimeter, next to sensitive data, where network controls are looser. This means that even a seemingly harmless bug in the renderer can escalate into a serious incident: leakage of documents, secrets, or internal URLs.
In this article, we present a threat model for an HTML-to-PDF library, walk through representative findings for each library, and provide PoC snippets.
Continue reading
