Automating Contract Generation with Dynamic PDFs
A few weeks ago, I needed to create a system that automatically generated PDF contracts for a client. Until then, the client had been manually editing Microsoft Word documents and saving them as PDFs. Doing this about 30 times a day was a significant waste of time, so it needed to be automated.
Solutions and their drawbacks
A good approach is to use template variables and replace them just before generating the PDFs. This works great and integrates seamlessly with most CRMs. However, it only solves part of the problem: how do you generate PDFs for the contracts after the replacements? Below are the possible options, ordered from the easiest to implement, along with the benefits and drawbacks of each approach.
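As a minimal sketch of the template-variable idea, the snippet below fills placeholders in plain text. The `{{ name }}` placeholder syntax, the `fill_template` helper, and the sample values are assumptions made for illustration, not part of any particular CRM.

```python
import re
from typing import Mapping

# Assumed placeholder syntax for this sketch: {{ variable_name }}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_template(template: str, values: Mapping[str, str]) -> str:
    """Replace every {{ name }} placeholder with its value.

    Raises KeyError when the template uses a variable that was not supplied,
    so a contract is never generated with an unfilled field.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing value for placeholder: {name}")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


contract = fill_template(
    "This agreement is between {{ client_name }} and {{ provider_name }}.",
    {"client_name": "Acme Ltd", "provider_name": "Example Studio"},
)
print(contract)
```

Failing loudly on a missing variable is a design choice here: for contracts, an error is usually preferable to a silently blank field.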
1. Using a WYSIWYG and converting to PDF
This entails using a rich text editor such as Summernote and then a client-side PDF converter such as jsPDF or html2pdf.
Benefits
- This is easy and quick to set up.
- Everything is done on the frontend, which makes it an accessible solution.
Drawbacks
- Fails when the contract needs complex layout or formatting.
Overall, this is a great option if the contracts are straightforward and need no complex layout.
2. Copying the content of the Microsoft Word document into a text field and then converting to PDF
This is quite similar to the first solution, except that you start from the existing Word document and paste its content into a text input. This fixes some of the layout issues of using a WYSIWYG editor. However, pasting Word content into a text field often breaks the formatting and sometimes introduces gaps or spaces between text sections. I also found that it sometimes split tables across pages in a way that cut the text in half.
Benefits
- This is easy and quick to set up.
- Everything is done on the frontend, which makes it an accessible solution.
Drawbacks
- Creates weird artifacts when dealing with complex Microsoft Word content.
Sometimes the best hack is to stop hacking and use the tool built for the job.
3. Docker comes to the rescue
Now we get to the optimal solution, albeit with some caveats. We run an instance of LibreOffice and use it the way we would use Microsoft Word. This is the core idea of the docx-to-pdf project: it provides a server that exposes an endpoint that takes your docx file and returns a well-formatted PDF file, similar to what Microsoft Word produces. This is perfect, but it means that we have to make our replacements in the document itself and not in plain text content. Most server-side frameworks have a good library for reading and writing docx files. For PHP/Laravel, I use phpoffice/phpword, and for Java/Spring Boot, I use the Apache POI library.
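To illustrate what "replacing in the document" means, here is a rough sketch that uses only the Python standard library. It relies on the fact that a docx file is a ZIP archive whose main body lives in `word/document.xml`. The `fill_docx` helper and the `{{name}}` placeholder syntax are assumptions for this example. It is also a simplification: Word can split a placeholder across several formatting runs, in which case a plain string replacement will miss it, which is one reason to prefer a dedicated library such as the ones mentioned above.

```python
import io
import zipfile
from typing import Mapping
from xml.sax.saxutils import escape


def fill_docx(docx_bytes: bytes, values: Mapping[str, str]) -> bytes:
    """Return a copy of the docx with {{name}} placeholders replaced.

    Only the main body (word/document.xml) is touched; headers, footers,
    and every other archive member are copied through unchanged.
    """
    source = zipfile.ZipFile(io.BytesIO(docx_bytes))
    output = io.BytesIO()
    with source, zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == "word/document.xml":
                # Assumes the part is UTF-8 encoded, as Word normally writes it.
                xml = data.decode("utf-8")
                for name, value in values.items():
                    # Escape &, < and > so the value cannot break the XML.
                    xml = xml.replace("{{" + name + "}}", escape(value))
                data = xml.encode("utf-8")
            target.writestr(item, data)
    return output.getvalue()
```

A call would look like `fill_docx(open("template.docx", "rb").read(), {"client_name": "Acme Ltd"})`, where `template.docx` is a hypothetical file containing the placeholder.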
Benefits
- The results are the best of the bunch with no weird artifacts.
Drawbacks
- Can be difficult to set up for anyone with little to no experience running Docker containers or backend services.
- Introduces new latency issues since you’re now making calls over the network.
- Additional service to maintain.
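For a sense of what the conversion step involves, the sketch below calls LibreOffice in headless mode directly. It assumes LibreOffice is installed and its `soffice` binary is on the `PATH`. The helper names are made up for this example, and this is not a description of how the docx-to-pdf project is implemented internally.

```python
import subprocess
from pathlib import Path
from typing import List


def build_convert_command(docx_path: Path, out_dir: Path, binary: str = "soffice") -> List[str]:
    """Build the headless LibreOffice command that converts one file to PDF."""
    return [
        binary,
        "--headless",          # run without a GUI
        "--convert-to", "pdf",  # target format
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_to_pdf(docx_path: Path, out_dir: Path, timeout_seconds: int = 60) -> Path:
    """Run the conversion and return the path where the PDF is expected."""
    subprocess.run(
        build_convert_command(docx_path, out_dir),
        check=True,               # raise if LibreOffice exits with an error
        timeout=timeout_seconds,  # avoid hanging forever on a bad document
    )
    # LibreOffice names the output after the input file, with a .pdf extension.
    return out_dir / (docx_path.stem + ".pdf")
```

Wrapping this call behind an HTTP endpoint in a container is what turns it into the extra service, with the maintenance and network costs listed above.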
Conclusion
The optimal solution works, but the hassle might not be worth it for a team with no wiggle room for another service to manage. That said, I find that Kubernetes makes this a trivial decision: calling the converter by its local service name, instead of going over the external network, can reduce the latency greatly. Additionally, you can deploy multiple instances of the service if you notice that it has become a bottleneck.
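As a small illustration of the local service name, Kubernetes gives each Service a cluster-internal DNS name of the form `<service>.<namespace>.svc.cluster.local`. The sketch below builds such a URL. The service name, namespace, port, and path are all hypothetical values and should be replaced with whatever your deployment actually uses.

```python
def in_cluster_url(service: str, namespace: str, port: int, path: str = "/") -> str:
    """Build a URL that stays inside the cluster by using the Service DNS name."""
    host = f"{service}.{namespace}.svc.cluster.local"
    return f"http://{host}:{port}{path}"


# Hypothetical values: adjust to the real Service, namespace, port, and route.
converter_url = in_cluster_url("docx-to-pdf", "default", 8080, "/convert")
print(converter_url)
```

Because the name resolves to the Service rather than to a single pod, the same URL keeps working when you scale the converter to several instances.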
