Docker for microservices – Using JavaScript and Chrome to convert HTML to PDF

Posted by Dennis Eriksson on 11 August 2020


In our team we have struggled to find a simple yet complete solution for designing and generating reports and protocols for one of our customer's business applications. Fortunately, with the help of Docker, Node.js and Chrome, we found one.


Our old solution – Telerik Reporting

For some time we have been using an old version of Telerik Reporting. Although Telerik Reporting is a powerful framework packed with functionality for building reports, it has caused us problems from time to time.


Our biggest struggles so far:

  • Version control
      • Because Telerik Reporting is based on drag-and-drop design, the underlying code is autogenerated. That sometimes makes it hard to revert changes you want to undo.
      • For the same reason, merge conflicts are hard to resolve, since autogenerated code is rarely easy to read or get an overview of.
  • Drag and drop
      • Drag and drop mostly makes for a short learning curve, but the flexibility is almost never as good as in purely code-based frameworks.
  • Spreading skills across the team
      • We aim to spread all skills across all team members, but it is hard to remember how Telerik Reporting works and time-consuming to pass that knowledge on.


Our new idea – HTML to PDF

We looked for a way to create reports and protocols that would let us reuse our existing knowledge of building templates and that would solve our biggest struggles. What immediately came to mind was HTML. The developers in our team are familiar with web development, so HTML is a markup language everyone has worked with. It is easy to read, and it also addresses all of the struggles above.


Some concerns

Although we thought we might have found our solution, there were still a few concerns to deal with. How do we turn HTML into PDF, and what about page breaks and pagination?


Our first thought was to use an open source library for the conversion from HTML to PDF. But after hours of googling and testing we found that this is not an easy task, as most of the free open source libraries we found did not convert the HTML in a way that satisfied us. One thing that seemed difficult to sort out was preventing page breaks from appearing in the middle of a sentence. The libraries that did seem to do a sensible conversion cost more money than we could justify.


Google Chrome solves our concerns

Then we found out that others have been using the Google Chrome browser to solve this type of problem. Chrome does an excellent job of converting web pages to PDF documents. It even supports controlling pagination and preventing unwanted page breaks. So we thought, why don't we just use Chrome for the conversion?
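As an illustration of the page break control mentioned above, here is a minimal sketch of a report template built in JavaScript. The helper names `buildReportHtml` and `escapeHtml`, the class names and the A4 page size are assumptions made for this example, not taken from our actual templates. The CSS properties `break-inside`, `break-before` and the `@page` rule are standard print CSS that Chrome honours when printing to PDF.

```javascript
// Hypothetical helper: escape text before placing it in the template.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: build a printable HTML report from a title and sections.
// Each section is an object of the form { heading, text }.
function buildReportHtml(title, sections) {
  const css = `
    @page { size: A4; margin: 20mm; }       /* assumed page size and margins */
    .section { break-inside: avoid; }       /* keep a section on one page if it fits */
    .new-page { break-before: page; }       /* force a page break before the element */
  `;
  const body = sections
    .map((s) =>
      `<div class="section"><h2>${escapeHtml(s.heading)}</h2>` +
      `<p>${escapeHtml(s.text)}</p></div>`)
    .join('\n');
  return `<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>${escapeHtml(title)}</title><style>${css}</style></head>
<body><h1>${escapeHtml(title)}</h1>
${body}
</body>
</html>`;
}
```

Because the template is ordinary HTML and CSS in a text file, it can be reviewed and merged like any other source code.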


Final solution – Docker, Chrome, Node.js and Azure

To implement the idea of using Google Chrome as the conversion engine, we wanted to package the whole functionality as a microservice. We knew that Chrome had to be installed somewhere, but we could not install it directly on our existing servers. In fact, we don't have any servers, as we run the "main" application on a PaaS (Platform as a Service) offering in Azure. And honestly, we didn't want to make infrastructural changes to the core application either.


What about Docker?

Docker allows us to create a small image with everything we need already defined and installed: Chrome, Node.js and the few lines of code that manage the HTML to PDF conversion. Docker containers are also quite simple to run on Azure, a provider we already use in the project.


Why Node.js?

We do need a few lines of code to manage the whole conversion: some for the web server that receives the incoming HTML, and some to pass that HTML to Chrome and then return the finished PDF file as the response to the client.


Why JavaScript?

Well, we could have used whatever programming language we wanted, but as we only needed a few lines of code it was hard to justify, say, C# over the simplicity of solving this problem with pure JavaScript. We ended up using Express.js as the web server framework. A JavaScript library called Puppeteer was also used to simplify the commands sent to Chrome. We also ran Chrome in "headless" mode, as we weren't interested in actually opening a browser window.
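To give an idea of how little code is involved, here is a minimal sketch of such a service using Express and Puppeteer. It is not our production code. The `/pdf` route, the port, the size limit, the A4 format and the `--no-sandbox` flag (often needed when Chrome runs as root inside a container) are all assumptions made for this example, as are the function names `htmlToPdf` and `startServer`.

```javascript
// Convert an HTML string to a PDF buffer.
// `browser` is expected to be the object returned by puppeteer.launch().
async function htmlToPdf(browser, html) {
  const page = await browser.newPage();
  try {
    // Load the incoming HTML and wait until network activity has settled.
    await page.setContent(html, { waitUntil: 'networkidle0' });
    // Let Chrome print the page; format and background are assumed settings.
    const pdf = await page.pdf({ format: 'A4', printBackground: true });
    return Buffer.from(pdf);
  } finally {
    // Always close the tab so pages do not pile up in the browser.
    await page.close();
  }
}

// Wire the conversion into a small web server.
// Requires `npm install express puppeteer`; route and port are hypothetical.
async function startServer(port = 3000) {
  const express = require('express');
  const puppeteer = require('puppeteer');

  // One headless Chrome instance is shared by all requests.
  const browser = await puppeteer.launch({
    headless: true,
    args: ['--no-sandbox'], // assumption: container runs Chrome as root
  });

  const app = express();
  // Accept the raw HTML document as the request body.
  app.use(express.text({ type: 'text/html', limit: '5mb' }));

  app.post('/pdf', async (req, res) => {
    try {
      const pdf = await htmlToPdf(browser, req.body);
      res.type('application/pdf').send(pdf);
    } catch (err) {
      res.status(500).send('PDF conversion failed');
    }
  });

  return app.listen(port);
}

// To run the service: startServer(3000);
```

With the server running, a client would POST an HTML document with the `Content-Type: text/html` header to `/pdf` and receive the PDF as the response body. Passing the browser into `htmlToPdf` keeps the conversion logic separate from the web server, which also makes it possible to exercise it with a stand-in browser object.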