Lighthouse update February 16th
Website to feed
The past week was first and foremost about one improvement: website to feed conversion. It enables users to subscribe to websites that don’t provide an RSS feed.
This feature consists of multiple areas. The backbone is extracting items from a website based on CSS selectors, and then putting those items through the same pipeline as items of an RSS feed. That means extracting the full content, calculating reading time, creating a summary, creating an about sentence, detecting language and topic, and so on.
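To illustrate the idea of one shared pipeline, here is a minimal sketch. It is not Lighthouse's actual code: the `Item` class, the step functions, the 200 words per minute reading speed, and the truncation-based "summary" are all assumptions made up for this example. The point is only that an item looks the same to the pipeline whether it came from an RSS feed or from a website.

```python
import math
import re
from dataclasses import dataclass, field


@dataclass
class Item:
    # Hypothetical item shape; the source (RSS or website) is not part of it.
    url: str
    title: str
    content: str = ""
    meta: dict = field(default_factory=dict)


def add_reading_time(item, words_per_minute=200):
    # Assumed reading speed; rounds up and never reports less than 1 minute.
    words = len(re.findall(r"\w+", item.content))
    item.meta["reading_minutes"] = max(1, math.ceil(words / words_per_minute))
    return item


def add_summary(item, max_chars=80):
    # Placeholder only: truncates the content instead of really summarizing.
    item.meta["summary"] = item.content[:max_chars].strip()
    return item


# The same ordered steps run for every item, regardless of where it came from.
PIPELINE = [add_reading_time, add_summary]


def process(item, steps=PIPELINE):
    for step in steps:
        item = step(item)
    return item


from_website = process(Item("https://example.com/a", "A", "word " * 450))
print(from_website.meta["reading_minutes"])
```

Because the steps only depend on the item, adding a new source means producing `Item`-like objects and nothing else.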
The additional areas are all about making it easier to use. Showing the website and letting users select items simplifies the feature. For many websites it’s not even necessary to know about the selectors. This also required some heuristics about which elements to select and how to find the repeating items from just one selection.
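One possible heuristic for finding the repeating items from a single selection is to walk up from the selected element until an ancestor has enough siblings with the same tag and classes. The sketch below shows that idea using only Python's standard `html.parser`. It is an assumption about how such a heuristic could work, not how Lighthouse does it; `Node`, `TreeBuilder`, `find_repeating_items`, and the `min_items=3` threshold are hypothetical.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    def __init__(self, tag, classes=(), parent=None):
        self.tag = tag
        self.classes = tuple(classes)
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a very small element tree (tags, classes, direct text)."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        node = Node(tag, classes, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag, if there is one.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def find_by_text(node, text):
    # Stand-in for "the element the user clicked on".
    if text in node.text:
        return node
    for child in node.children:
        found = find_by_text(child, text)
        if found is not None:
            return found
    return None


def find_repeating_items(selected, min_items=3):
    # Walk up until the current element has enough look-alike siblings.
    node = selected
    while node.parent is not None:
        group = [s for s in node.parent.children
                 if s.tag == node.tag and s.classes == node.classes]
        if len(group) >= min_items:
            return group
        node = node.parent
    return [selected]


def to_selector(node):
    return node.tag + "".join("." + c for c in node.classes)


PAGE = """
<ul class="posts">
  <li class="post"><a href="/a"><h2>First</h2></a></li>
  <li class="post"><a href="/b"><h2>Second</h2></a></li>
  <li class="post"><a href="/c"><h2>Third</h2></a></li>
</ul>
"""

clicked = find_by_text(parse(PAGE), "First")
items = find_repeating_items(clicked)
print(to_selector(items[0]), len(items))
```

Requiring identical classes is deliberately strict to keep the sketch short. Real pages often mix in modifier classes, so a production heuristic would likely need a looser notion of "similar".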
The user experience can always be improved, but I think as it is right now it’s already quite decent.
The next step for this feature is to automatically find the relevant items, without the user having to select anything.
Next steps
An ongoing topic is the first user experience. It’s not where I want it to be, but honestly it’s difficult to know or imagine how it should be. One issue that came up repeatedly is the premium trial, and that users don’t want to provide their credit card just to start the trial. That’s fair. But Paddle, the payment system Lighthouse uses, doesn’t provide another option. They have one in private beta, but unfortunately I didn’t get invited to that. So I’m going to bite the bullet and implement this myself. It won’t be as good as if Paddle did it, but at least users will get the premium experience for 2 weeks after signup.
An improvement I’ve had my eye on for some time is using the HTML of RSS feed items for the preview. Lighthouse attempts to parse the full content for all items, but that’s not always possible. If websites disallow it via robots.txt, or block it via bot protection, Lighthouse doesn’t get the content. In these cases it shows that access was blocked. But if the feed contains some content, that could be displayed instead. Feeds usually don’t contain the full content, but it’s at least something.
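The fallback order described here could look roughly like the sketch below. It uses Python's standard `urllib.robotparser` for the robots.txt check. Everything else is an assumption for illustration: the `LighthouseBot` user agent string, the function names, and the returned labels are made up and not taken from Lighthouse.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "LighthouseBot"  # hypothetical user agent name


def allowed_by_robots(robots_txt, url, user_agent=USER_AGENT):
    # Parse an already downloaded robots.txt and ask whether the URL may be fetched.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def choose_preview(full_content, feed_html):
    # Prefer the parsed full content, then the HTML from the feed item,
    # and only then fall back to the "blocked" notice.
    if full_content:
        return ("full", full_content)
    if feed_html and feed_html.strip():
        return ("feed", feed_html)
    return ("blocked", "Access was blocked")


ROBOTS = "User-agent: *\nDisallow: /private/\n"
url = "https://example.com/private/post"

full = "<article>...</article>" if allowed_by_robots(ROBOTS, url) else None
kind, html = choose_preview(full, "<p>Teaser from the feed</p>")
print(kind)
```

Bot protection can't be detected from robots.txt, so in practice the "no full content" case would also cover failed or blocked requests. The sketch only models that as `full_content` being empty.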
One more thing I wanted to do for a long time, and can finally make time for, is creating collections of feeds for specific topics. For example “Frontier AI labs”, “Company engineering blogs”, “JS ecosystem”, and so on. The blogroll editor is the basis for that. It lets you create a collection of websites and feeds, and export OPML from that. I’m going to improve its UX a bit and then start creating these collections.
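For reference, exporting a collection as OPML can be quite small. The sketch below builds an OPML 2.0 document with Python's standard `xml.etree.ElementTree`. The `export_opml` function, the dictionary keys, and the example URLs are assumptions for this example and say nothing about how the blogroll editor is implemented.

```python
import xml.etree.ElementTree as ET


def export_opml(title, feeds):
    # One <outline> per feed; `feeds` is a list of dicts with hypothetical keys.
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for feed in feeds:
        ET.SubElement(
            body,
            "outline",
            type="rss",
            text=feed["title"],
            xmlUrl=feed["xml_url"],
            htmlUrl=feed["html_url"],
        )
    return ET.tostring(opml, encoding="unicode")


collection = [
    {
        "title": "Example Engineering Blog",
        "xml_url": "https://example.com/feed.xml",
        "html_url": "https://example.com/",
    },
]
print(export_opml("Company engineering blogs", collection))
```

Most feed readers can import a file like this, which is what makes OPML a convenient format for sharing topic collections.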