Creating a $24k/mo, open-source solution to the headaches of headless browsers

Hello, who are you and what are you working on?

Hey friends! I’m Joel Griffith, and I work on a small little slice of open-source called browserless. Browserless is simply a tiny web-server that “productionalizes” all the stuff about headless browsers and their automation capabilities. It handles everything from dependencies, garbage collection, and all the libraries to support, of which there’s quite a few!

It takes care of all this in a docker image so you can go build out your automated tests, scraping, or image/pdf rendering service without having to think about all the stuff you’ll need. I’ve been working on the project for about two years now.

Why did you start browserless?

It’s a bit of a story, but I always love telling it because it’s a great example of how you can start working on one thing, and pivot to something completely different when it’s not working. I think it also reinforces the whole “fail fast” mentality that folks seem to talk about, but don’t ever go into detail on.

So roughly 2 1/2 years ago I was trying to build a simple web-app where I could create gift-lists without being locked into any sort of seller or retailer. Every time a birthday/holiday/wedding rolled around I always struggled with trying to find the relevant party’s wishlist, where to go, and what was already bought. This was the problem I was focused on and I had a good feeling about it since it was solving my own personal problem and I assumed others must be having similar issues.

I was making pretty good headway on the design and “list” part of this site when I finally hit a huge snag with uploading URLs of items you’d like. Most sites out there return pretty good HTML from a simple cURL-like call, but there were a few that didn’t. These sites (single-page apps) ran a lot of JavaScript on their page to fetch data and build out the final HTML. This would force me to run some headless browser behind the scenes and have it do all this work.

At the time, puppeteer or even earlier libraries like chromeless didn’t exist, so I decided to write my own library for headless Chrome. This spawned a project called Navalia, which is still out there although it’s not being maintained anymore. I learned a lot about Chrome’s internal protocol for remote debugging (which is pretty much how all the various drivers out there work), and by the time I was finished with this library and moving on with my web-app, puppeteer came out. And it came out in a really big way.

After having gone through the exercise of getting this part of my app ready for headless chrome, and the amount of fanfare for puppeteer, it became pretty clear to me that many many others were going to be going through the exact same steps I just went through. Given all of that, I decided to switch over to doing a headless-browser service over my initial wishlist app.

What were the early days like?

The fundamentals of what you decide to build are really important, and so I spent a lot of time trying to think about what folks would want and what they’d need. Initially, I thought supporting just puppeteer would be the only thing necessary, and having a good mechanism in place for load-balancing work. I also wanted to package this all up in a really easy way, so you could just download one thing and be done.

After a lot of thought, I decided we’d need:

  • An internal queueing mechanism.
  • Support for running on-premise and hosted.
  • A way to debug and watch sessions live.
  • Some sort of health/monitoring system.

A good chunk of time was spent getting the queue set up and working properly, which, now that I look back on it, wasn’t really necessary. There are way better technologies out there that already do this, like Nginx, and so building that was just duplicating stuff that’s already out there. It also added a good deal of complexity to the codebase, which I somewhat regret.
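To make the idea of an internal queueing mechanism concrete, here is a minimal sketch of a concurrency-limited queue. It is not browserless’s actual implementation: the `SessionQueue` class, its limits, and the `Admission` states are hypothetical names chosen for illustration. It assumes each job signals completion by calling a `done` callback.

```typescript
// Hypothetical sketch: not browserless's real queue, just the general technique.
type Job = (done: () => void) => void;
type Admission = "running" | "queued" | "rejected";

class SessionQueue {
  private running = 0;
  private readonly waiting: Job[] = [];
  private readonly maxConcurrent: number;
  private readonly maxQueued: number;

  constructor(maxConcurrent: number, maxQueued: number) {
    this.maxConcurrent = maxConcurrent;
    this.maxQueued = maxQueued;
  }

  // Admit a job: start it now, park it in line, or turn it away when full.
  add(job: Job): Admission {
    if (this.running < this.maxConcurrent) {
      this.start(job);
      return "running";
    }
    if (this.waiting.length < this.maxQueued) {
      this.waiting.push(job);
      return "queued";
    }
    return "rejected";
  }

  // Snapshot of current load, e.g. for a health endpoint.
  stats(): { running: number; queued: number } {
    return { running: this.running, queued: this.waiting.length };
  }

  private start(job: Job): void {
    this.running++;
    let finished = false;
    job(() => {
      if (finished) return; // ignore duplicate done() calls
      finished = true;
      this.running--;
      const next = this.waiting.shift();
      if (next) this.start(next); // hand the freed slot to the next job in line
    });
  }
}
```

Rejecting work once the line is full is one design choice among several; a real service might instead return an HTTP error or apply a timeout to queued jobs.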

The most fun thing to work on was our live session viewer and debugger. This let me utilize all the stuff I learned when writing Navalia for a really first-class tool. Instead of just recording videos of your browser’s session, which I’d argue has little value, you can instead “drop-in” on any live-running session and remote debug it with all of Chrome’s devtools (and it even has a little viewport to see the browser). And there’s literally nothing you need to set up: just click a link and you’re there watching it!

The other challenge with a lot of the project was dealing with WebSockets, which is the protocol Chrome uses for issuing commands, in a flexible way. Each load-balancing technology out there handles them differently since they’re not straightforward like HTTP, and load-balancing can be tricky in a fleet of servers. We use a fantastic npm module called http-proxy, and it makes most of this magic just work out of the box.
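One reason WebSockets are harder to balance than plain HTTP is that a connection stays pinned to one server for the whole session, so a balancer has to track open sockets rather than just rotate through servers. The sketch below shows a least-connections picker as one way to handle that. It is an illustration only: the `LeastConnectionsBalancer` class and backend names are hypothetical, and this is neither http-proxy’s API nor a description of how browserless routes traffic.

```typescript
// Hypothetical sketch: a least-connections picker for long-lived sockets.
interface Backend {
  name: string;
  open: number; // sockets currently held open on this backend
}

class LeastConnectionsBalancer {
  private readonly backends: Backend[];

  constructor(names: string[]) {
    this.backends = names.map((name) => ({ name, open: 0 }));
  }

  // Pick the backend with the fewest open sockets and count the new one.
  // Ties go to the backend listed first.
  connect(): string {
    if (this.backends.length === 0) throw new Error("no backends configured");
    let best = this.backends[0];
    for (const candidate of this.backends) {
      if (candidate.open < best.open) best = candidate;
    }
    best.open++;
    return best.name;
  }

  // Call when a socket closes so the backend's capacity is freed.
  disconnect(name: string): void {
    for (const backend of this.backends) {
      if (backend.name === name && backend.open > 0) {
        backend.open--;
        return;
      }
    }
  }
}
```

In practice the chosen backend name would be passed to whatever proxy layer forwards the upgrade request, and `disconnect` would be wired to the socket’s close event.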

Other than that, there was a lot of debugging memory leaks, getting fonts working, and all sorts of zombie-process issues. After all that, I can say with pretty good confidence now that we’re really, really stable.

How have you grown browserless’s usage?

One of the greatest things about the project is that it is really designed for developers as the end-users, so all the channels and communities that’d use it I’m already familiar with. A lot of the time this means just announcing things in Slack, on a niche sub-reddit, or even on Hacker News. HN actually has been kind of back-and-forth in terms of returns, but it’ll still get it out there more.

Honestly, the biggest difference we saw was when we started detailing all the findings we were witnessing in our blog. Now that we support numerous libraries and run-times, there’s a lot of places where things can break, and we try to highlight those in our documentation as well as our blog-posts. This has the great side-effect of building up your SEO, and that just snowballs further and further.

What definitely hasn’t worked for us is paid strategies. Whether that’s marketing, advertising, or whatever else. There’s always been this thought that I’ve had, that advertising and “paid” attention is really in no one’s best interest. You’re likely to get users who really aren’t going to get any value out of your software, so your churn increases, and you’ve also just paid for that user that’s churned. These things get harder to tease out since it’s almost impossible to ask “show me all the churned users this period acquired from advertising channels.” Maybe that’s possible, but you’d have to do a lot of wrangling to get it all working.

I’ve also found that agonizing over tracking metrics is just a huge waste of time. You might be able to trace back peaks in traffic to some event, but it’s not clear whether there’s anything special you did personally. I suppose that, early on, it can help to identify what’s working and where to spend time. However, once you find your audience and where they’re coming from, just focus on executing and not “buying” users.

How have you managed the workload and the community?

It’s been really easy so far. Since our primary users are developers themselves, I find that there’s rarely conflict that needs tending to. Occasionally, a use-case pops up that’s rather convoluted and tough to accommodate, which I generally try to avoid. We’ve explicitly built certain APIs and features so you have an escape-hatch for tougher things.

For example, we’ve had issues crop up with screenshots where folks will ask us if they can embed profiles and other “prior-state” into the session. This would mean we’d have to maintain some sort of key-value storage locally and allow the screenshot API to access this. This is such a complex task, and really hard to design for as you now have to build a local cache, design an API for it, and do all the other stuff that caches need (TTL, access restrictions, and all that).

Instead, we allow you to write your own functions, which you can embed your own logic into. This gives folks the freedom to put their business-logic into it, and design their own systems around a really flexible interface.

I think the general design consensus and prioritizing I’ve arrived at has been:

  • Make what’s there better.
  • Optimize for the common case.
  • Always make sure there’s an escape hatch when needed.

It allows us to move quickly on newer things, keep what’s there in better shape, and give folks a way out if needed even though it might not be ideal.

How much time do you devote to browserless?

Right now I’m part-time on it, though that might change in the future. I spend a lot of time thinking about things before I go implementing them, which I think helps with time spent actually programming. I’ve always heard the saying “months of programming can save you hours of planning,” which I generally understood, but it’s not until you have no time that you start to realize its beauty.

I’ve got a full-time job, a wife and kids, and browserless.io. As you can probably imagine, the first two take up 99% of my wakeful time. Therefore, when I do somehow find time for browserless, I have to make my best use of that time. Thus, I generally spend a good few weeks or months thinking about something before I even open up my editor. If I can, I’ll even write tests prior to implementing, which makes writing that feature even better since you now have a way to track how close you are to being done.

All of that said, I probably spend a few hours a week working on new stuff and improving the system. Of course I could always use more, but I only like to do actual work on the project when I know what it is I’m doing.

browserless is a company too, why did you decide to make it open source?

I did this for a few reasons looking back: primarily I wanted something out there in open-source-land that I built and gave back. browserless.io is free to use for open-source projects and efforts, and I always want to keep that going. I’ve also learned so so so much by reading over open-source code and using it. A lot of non-tangibles in software (good API design, interacting with the community, etc.) can only come from being exposed to a lot of different projects. Open-source is one of the ways you can make that happen at an incredible scale.

Another big reason is to increase our reach. We probably could get by by having paid ads or something else for browserless.io, but why do that when you can open-source the work? With open-sourcing, it’s win-win: folks can find it, learn from it, and talk about it; and we get lots of traffic from those sources without having to resort to advertising.

Now I do realize this is kind of muddying the waters with some folks in the open-source community. However, I think the tradeoffs are quite worth it, since we’ll live in a world with less ads which I think most folks can get behind. The only shortcoming is you run into projects that might at some point ask for some money, but it’s generally only when you’re making money, which I think is quite reasonable.

What other advantages do you see in not being closed source?

There’s probably a lot of users out there using browserless.io that should be paying for it, and if I had it “closed” then I’d probably have more revenue. However there’s a few non-tangibles when it comes to the closed/open debate.

First is that having open-source software helps people get started, and also gives them something to use as a reference. As I sort of mentioned above, I learned a lot by looking at open source projects and “copying” their ideas. I grew up in the 90s, and can easily recall a time when there weren’t a lot of resources out there for learning software, and I think it’s one of the largest innovations of the 2000s. Being closed source would mean working in the opposite direction of this.

Another benefit is that we get a lot of traffic and feature requests by being open source. It also allows us to collaborate at a level most of our users are familiar with, which is in pull requests and issues. I sort of equate it to speaking a language, and most engineers out there already speak the language of git/GitHub. Having that be available gives them a mechanism for communicating with us in a way they feel comfortable with, which helps build bridges.

Will we always build products in open-source? Maybe not, I think it’s suitable for some things to be closed, but I really enjoy doing things out in the open and talking about it.

What are the biggest obstacles you’ve had to overcome?

The largest is overcoming imposter syndrome, which becomes more apparent when you’re talking money with people for software. Since software isn’t tangible it’s hard to put a price on it, and we generally use this “value” moniker to describe it since it can’t be gauged any other way. Value can be anything from time savings, speed of development, or even time to market. All these things have different value for different organizations, and thus have a different monetary aspect to it. This makes it very difficult when something may seem “easy” to you, but can be a huge shift for someone else, and especially when you feel like an imposter!

The other issue has been writing the software to run both in an on-premise setting as well as a cloud/hosted setting. These are two very different users and use-cases, so making a single product work for both can be especially tricky. However, the exercise of containing the software so it can be run on-prem, or really elsewhere, means you’ve also decoupled the core pieces from your architecture. Having done so, you can now have a much better deployment topology where certain problems just vanish. For instance, dedicated accounts for us mean that each user can run their own instance on their own VM, meaning there’s a lot less chance of having security issues, and if someone breaks their deployment it won’t have an impact on others. We sort of took this inspiration to all parts of our architecture, and nearly any part of the system can go entirely down without affecting currently running work elsewhere, which is quite liberating!

What about obstacles with motivation, and funding?

Motivation, to me, is the sole thing that will push all other problems to their end. The lack or abundance of motivation is what will really determine if your project succeeds or not. I look at all the other projects out there that were successful because the author was motivated enough to build something and push it to success. HTTP servers are a great example: do we really need more of them? Probably not. However their creators believed in something enough and were motivated, and thus there is a proliferation of ways to stand-up and handle HTTP requests.

The easiest way that I know of finding that motivation is by starting with passion. Finding something out there that gets you excited, and immediately pushes you into the “flow” is the best way to see if you’re passionate about a particular topic. I’ve had several occasions where I’ve fallen asleep with my laptop’s editor still on: this is a sign of passion. Obviously, if you’re not careful then passion can become obsession, which leads to burnout, so paying attention to that is also crucial as you’ll need to stay in it for the long-run if you want to reach escape velocity.

This sort of leads me to the funding aspect. Most projects out there ask for money, or hope someday to get hired by a company that will fund them. This style of funding might work for the top 1-5% of projects out there, but it doesn’t work for the whole ecosystem. What does work is the good old-fashioned business of selling licenses and, where necessary, hosted services. Licenses are a lot easier to get going, whereas building your own cloud takes considerable time and effort. Having done both, I’d strongly recommend going with licenses over providing your own cloud, but that’s the subject for another interview!

One thing I have grown opposed to is venture funding. That, to me, just has so many conflicts of interest that it is only viable for businesses where a lot of capital is absolutely necessary. Funding puts a lot of pressure on you as an author, and is a forcing function to grow your own personal skills in a compressed timeline in order to get ROI for your investor. I know this sounds like a gross generalization, but at the end of the day you owe someone money for a promise, and having that weigh heavy over your head seems incredibly stressful to me.

I’d rather start with something small, and have it grow organically over time. Sure there will be other stress: working a job to keep yourself alive and so on, however it’s not nearly the same type of stress as getting a loan. You can always back out or sell a bootstrapped project, but it’s a lot harder to exit out when you’ve taken money from someone.

What are your hopes for the future of browserless?

I’m stupid happy with our debugging tools and APIs. These are so hard to get polished and elegant, and I feel like there’s a great start in both of those, so I’m incredibly excited just for our starting point. In the future, I hope to make it easier for folks without a strong development background, and those on tight enterprise systems.

Getting newcomers up and running quickly is something I’m really passionate about now. How cool would it be if you could just fire up your browser, do the work you want it to, and press a button and now it just magically does that someplace for you without ever having to write code? There have been a few stabs at these (Chrome extensions and others), but I think having something really seamless is key since the users themselves won’t be able to handle lots of steps or debugging. It’s a really hard problem, but if it can be solved well then there’s so much potential it’s hard not to get excited about it.

Finally, I’d love to see other browsers aside from Chrome included. It definitely gets the job done, but folks have issues with it, be it personally or whatever, so having more variety would be great. It’s getting there; however, the only interface browsers share is Selenium via the webdriver protocol, which has its flaws and shortcomings.

What advice do you have for other open source projects and maintainers?

Have a good plan in place for funding your venture, or at least a pathway. You might start out roofing houses for “fun”, but after a while it’ll need to serve more than just your passion for roofing houses: you work to live and not the opposite. I’d also say pick something you’re really interested in, and that maybe there’s interest elsewhere out there as well. It can be a new technology (Rust seems interesting) or a new library, but when you’re not feeling motivated, having someone else be is a great way to get back on it.

Remember, at the end of the day, it’s just software and there are more important things in life! While it’s fun to bask in those stars you get on GitHub, or having someone pay you real money, in the end, it’s just a quick hit of dopamine and there’s more out there that’s at stake. Have a lot of fun, but don’t get too upset if it doesn’t work out!

To find out more about Joel and browserless, you can check out the website here, see the source code on GitHub or follow the company on Twitter.
