
This is the first in a series of posts about my personal vision to implement a semantic desktop for everyday use, building on a modern, (meta-)data-centric open source OS, Haiku. There are many reasons for targeting this nic(h)e OS (outlined below), and many unique use cases for such a solution, but the primary focus is on personal knowledge management (PKM), because I think the “personal” part gets lost in the myriad of online and commercial services that popped up in recent years.
Most of these are controlled by single companies that often charge steep subscription fees[1] and may “pivot” or disappear at some point (yes, I’m looking at you, Evernote). Exporting your data when migrating between services often loses information (metadata, cross-linking, formatting). This leaves a glaring gap for a true local-first, desktop-based solution for organizing and working with all the stuff that gobbles up your laptop’s disk space, one that makes use of the vast array of tools already available there...
Before you shout “niche”, “desktop is dead” or “there is already a tool for that”[2], let me explain what is so special about SEN, which stands for Semantic Extensions and is more of an infrastructure (with many implicit use cases that don’t need specialized apps) than yet another tool(box/suite).
A Very Short History and Comparison
Semantic Desktop on Linux: KDE NEPOMUK and Baloo
Since my initial thoughts on such a semantic desktop 20 years ago in June 2003[3], some progress has been made in the form of preview/detail panes in file browsers, and the open source desktop KDE went as far as integrating the full breadth and weight of semantic web standards in an EU-funded project (NEPOMUK). This resulted in a promising solution but ultimately failed because of the sheer complexity of using a full-blown search index (Lucene) and database (the semantic triple store Virtuoso), which carried a heavy performance cost (all the more painful on the hardware of that time). UI-wise, it was also too complicated to use and seemed more targeted at people familiar with the underlying semantic web standards (RDF, XML,…).
Eventually, NEPOMUK was replaced with a more lightweight and stripped-down approach, Baloo, which focuses on file indexing and search but otherwise does not provide any semantic functionality like relations or deep links into applications (more on that below). Baloo extracts metadata from supported file types and indexes it along with file names in a database-like storage file[4].
Unlike BeOS/Haiku (see next section) and therefore SEN, Baloo cannot rely on extended filesystem attributes (often called “xattrs”) for storage, as this would run into several problems caused by the otherwise desirable and powerful pluggable nature of systems like Linux[5], and by the way metadata is treated in different abstraction layers:
In some ways using xattrs is 'the right place' to put this tagging info, but Linux doesn't have a 'core API' that would allow a 'xattrs preserved by default, and interpreted consistently' policy. Therefore there are a lot of ways to lose your tags. KDE has been trying to make this work for many years, but it is difficult given the way UNIX/POSIX has evolved.
source: https://userbase.kde.org/Baloo#Baloo_and_extended_attributes
Since you cannot guarantee consistent treatment of filesystem metadata across all the available desktop environments and tools, it might be overwritten or removed at any time (e.g. GNOME’s gedit will silently remove it upon save, and most other tools will too, since metadata is often “forgotten” when copying files). Even in KDE alone, applications handle metadata in different ways, e.g., the standard file manager Dolphin embraces xattrs, while the (excellent!) photo management app digiKam uses its own index (but can at least synchronize with metadata contained in the header of photo files using the IPTC or EXIF standards).
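The metadata loss described above can be reproduced with a few lines of Python on Linux. This is only a minimal sketch, not how Baloo or Dolphin work internally. It assumes a filesystem that supports user xattrs, and the attribute name `user.xdg.tags`, the file names and the tag values are illustrative assumptions. It tags a file, copies it once with `shutil.copyfile` (content only) and once with `shutil.copy2` (which also tries to preserve metadata), and reports what the copy still carries.

```python
import os
import shutil
import tempfile

# Attribute name used for desktop tags; treat it as an assumption of this sketch.
TAG_ATTR = "user.xdg.tags"


def tags_after_copy(copy_func):
    """Tag a file via xattr, copy it with copy_func, return the copy's tags.

    Returns None when the platform or filesystem does not support user xattrs,
    and b"" when the attribute did not survive the copy.
    """
    if not hasattr(os, "setxattr"):  # the os.*xattr functions are Linux-only
        return None
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "note.txt")
        dst = os.path.join(tmp, "note-copy.txt")
        with open(src, "w") as f:
            f.write("remember the milk\n")
        try:
            os.setxattr(src, TAG_ATTR, b"todo,personal")
        except OSError:
            return None  # filesystem without user xattr support
        copy_func(src, dst)
        try:
            return os.getxattr(dst, TAG_ATTR)
        except OSError:
            return b""  # the tag was silently dropped


if __name__ == "__main__":
    print("copyfile:", tags_after_copy(shutil.copyfile))
    print("copy2:   ", tags_after_copy(shutil.copy2))
```

Whether the tags survive depends entirely on which copy routine a tool happens to use, which is exactly the kind of inconsistency the KDE quote above points at.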
This results in a limited and very heterogeneous solution that falls short of a unified, efficient and data-centric approach for consistent, transparent and unobtrusive personal information management, as needed for a semantic desktop of today.
MacOS: Spot on or Spotlight?
MacOS was always special in handling files and metadata, not only regarding the resource handling of apps, but also in providing color labels (which can be renamed to more semantic and useful names instead of just the color name). These allow a simple categorization of files but cannot be extended, since they are hard-coded into the OS.
Later, it gained one of the best desktop search engines with Spotlight (which was developed by Dominic Giampaolo, who originally engineered the exceptional BeOS filesystem, see below). Much like NEPOMUK and Baloo on KDE, this mechanism indexes files for easy search and preview, but doesn’t go beyond that. It does not know of any inherent relations between files and does not allow users to add them manually.
Finally, MacOS is not really data-centric, since even the stock Notes application uses a database instead of keeping notes in files. This makes backup unnecessarily complicated, and users cannot access their notes without the app, which can be a bit sluggish at times (I once had real issues with it when it wouldn’t come up, and since I kept my work notes there, I had no personal backup in place).
MacOS does provide the best integration possibilities in a mainstream OS though, e.g. the Services infrastructure that apps can use to expose functionality that can also be accessed via scripting. One example that utilizes inter-application scripting is Hookmark, a streamlined commercial extension that provides deep linking between applications and certain supported file types.
On the downside, it is Mac-only, commercial and closed source (with the drawbacks outlined above), and stores links in a separate folder, which needs to be carefully handled upon backup, restore and migration. It does not (and to my limited technical MacOS knowledge cannot) transparently extend the system with native file relations stored in file system attributes. In summary, the system feels a bit tacked on as an afterthought; for example, file templates are also managed separately instead of being natively provided (like in KDE or BeOS/Haiku).
To sum up and circle back to the creative punny headline, MacOS is great for everyday use and light knowledge work, but still does not provide a data-centric, transparent and open platform for personal knowledge management.
Conclusion
In this first post, I could only briefly cover existing solutions to provide some context and a starting point for the main actor of this Substack, SEN. If you are interested in more detail and want to dive into the full breadth and depth of SEN right now, including a technical description of its internal architecture and realization, and also want to support me with this project even more, go ahead and grab a copy of the generally interesting first book on personal knowledge graphs (PKG), aptly named Personal Knowledge Graphs, at the official book website or from any major online shop (epub for now, print to come soon). SEN is covered in a separate chapter and is just one of the many insightful and interesting pieces on the topic of PKM.
What I really miss in today’s landscape is a native, data-centric desktop system that treats entities and their relations as native files and naturally forms my personal knowledge space, without breaking the flow, without introducing a schism between different systems or tools, and without forcing me into yet another closed shop with its own data silo, visualisation and UX paradigms.
Instead, a transparent and open personal information infrastructure should integrate seamlessly into an existing, user-first desktop system, adding just what is necessary to provide the missing (semantic) links, keeping everything in the file system, and treating entities and relations as first-class citizens rather than as an afterthought.
Embrace Change: Enter BeOS & Haiku
A Short History and Overview of BeOS
In the second half of the 1990s, BeOS came out of nowhere and was in many ways 15-20 years ahead of the competition (just look at the official demo video and compare that to other desktop OSes of that time…). It was a completely fresh approach with a new kernel, a modern metadata-driven filesystem featuring indexed custom attributes for fast and dynamic queries (sounds familiar? see the note on Spotlight above), a message-based API also used for inter-application communication, pervasive multi-threading, and all that in a slick and very responsive UI that made Windows 95 look like GEOS on the Commodore C64, which was already available in 1986. Even MacOS looked quite old in comparison (actually, as you might already know, before Steve Jobs returned to Apple and brought with him his creation NeXT, which became the basis of MacOS X, there was a serious chance that Apple would acquire Be and make BeOS the basis for its next-gen OS).
From a knowledge worker’s perspective, the highlight of this young and excitingly fresh take on a desktop OS that really dared to think different was its data-centric approach: a completely new filesystem that natively supported custom metadata stored as extended attributes attached to files, and a way of treating data as individual entities that could be handled and manipulated without specialized applications. This means that data was free and open to user manipulation (e.g. through the flexible file browser Tracker). Data could be exchanged between applications through the OS filesystem API and was not tied to a specific application, and even scripts could access it easily via shell commands.
The real power was in the easy customizability and the direct, user-friendly visualization and interaction design, esp. regarding the special features around custom file types (to represent any kind of entities like Books, Media, Mail or Contacts). For example, you could quite easily build a simple DVD library without having to resort to a special application, just by using the system’s FileTypes settings and the stock file browser, as detailed in this workshop.
Keeping your information this way is very intuitive and straightforward, but that is only half the battle, as you know - the real power of knowledge management lies in finding the right information at the right time. This is where native, filesystem-based custom queries come into play, as illustrated in the screenshot below:
In a similar way, managing personal contacts is as easy as creating a file of type Person and filling in the relevant information, which is kept as standard metadata in the file’s custom attributes:
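The idea behind this section, typed files that carry their data in attributes plus an index that answers queries without a dedicated application, can be approximated with a small toy model. The following sketch is not the Haiku or BeOS API. The class names, the attribute names (`DVD:year`, `DVD:genre`, `Person:city`) and the sample entries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """A file reduced to what matters here: a name, a type, and attributes."""
    name: str
    file_type: str                      # hypothetical type label, e.g. "DVD"
    attrs: dict = field(default_factory=dict)


class AttributeIndex:
    """Toy stand-in for a filesystem that indexes selected attributes."""

    def __init__(self, indexed):
        self.indexed = set(indexed)
        # attribute name -> attribute value -> list of entries
        self.index = {name: {} for name in self.indexed}
        self.entries = []

    def add(self, entry):
        """Store an entry and register its indexed attributes."""
        self.entries.append(entry)
        for key, value in entry.attrs.items():
            if key in self.indexed:
                self.index[key].setdefault(value, []).append(entry)

    def find_equal(self, attr, value):
        """Exact-match lookup served from the index, no scan over all entries."""
        return [e.name for e in self.index[attr].get(value, [])]

    def find_where(self, attr, predicate):
        """Range-style lookup: tests only the distinct indexed values."""
        hits = []
        for value, entries in self.index[attr].items():
            if predicate(value):
                hits.extend(e.name for e in entries)
        return sorted(hits)


fs = AttributeIndex(indexed=["DVD:year", "Person:city"])
fs.add(Entry("Blade Runner", "DVD", {"DVD:year": 1982, "DVD:genre": "Sci-Fi"}))
fs.add(Entry("Arrival", "DVD", {"DVD:year": 2016, "DVD:genre": "Sci-Fi"}))
fs.add(Entry("Ada Example", "Person", {"Person:city": "Vienna"}))

print(fs.find_where("DVD:year", lambda y: y >= 2000))   # ['Arrival']
print(fs.find_equal("Person:city", "Vienna"))           # ['Ada Example']
```

The point of the model is the division of labour: the entries hold nothing but attributes, and the index alone decides which questions can be answered quickly, so any tool that can read attributes can take part.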
Phoenix reborn: Haiku
Although Be, Inc., and BeOS went the way of many things too good to be true (they had even convinced an OEM to ship PCs with BeOS pre-installed), BeOS was also too good to die, and a group of open source developers and BeOS lovers got together to build Haiku (as also repeatedly covered in the news). This is a clean-room implementation of BeOS, based on publicly and legally available information (including the legendary nerdy newsletters, API docs,…).
Haiku not only recreates the experience of BeOS but also brings it to the current state of the art in terms of hardware support (including new architectures like ARM or RISC-V) and software management (it has its own package manager, inspired by the best but adding its own twist using filesystem-based overlays, similar to ZFS/btrfs snapshots), as well as bugfixes, optimisations and extensions of the API (faster messaging, new functionality) and user interface (SVG-like icons, modern font rendering etc.).
Not reinventing the wheel, developers work closely with projects like FreeBSD (for driver support) and also actively contribute back, e.g. to GCC, or to WebKit for the native browser WebPositive (think Safari, but there is some room for improvement still).
Finally, with a native port of Qt, KDE applications like KOffice and standard apps like LibreOffice are also available, massively extending software support for everyday use.
Gravitating towards Minimalism
Altogether, this provides a very attractive platform for knowledge management, albeit one targeting a niche OS. However, this can be a blessing, esp. for knowledge workers, keeping them free of the distractions and noise of ad-laden mainstream systems.
Just look at the trend towards minimalism in software, like NothingOS, which is monochrome and strives to be low-noise and distraction-free.
This trend is even happening in hardware: check out this deliberately restricted hardware project.
Given the sleek minimalism of Haiku and its native approach to information management, it makes a very compelling prototyping environment, and possibly more than that.
In my next post, I will finally introduce you to SEN and how it integrates with the modern filesystem and API of Haiku to realize a modern semantic desktop with native semantic links on an infrastructure level.
[1] With the notable exception of open source solutions like Anytype, or offers with a free plan like Capacities.
[2] Obviously you are already better informed than that, since you are reading this Substack ;-)
[3] I even still have the original “napkin” (in my case it was the back of a fax paper from my first IT job) where I scribbled down the first ideas on a linked desktop, but then - life happened, and I only came back to the idea many years later when moving in 2019…
[4] This can again cause a performance impact and heavy resource usage, which users still report today; it can only be partially solved by fine-tuning Baloo’s configuration.
[5] Don’t get me wrong, I’m just talking about metadata here. I fully embrace the modular structure of UNIX and Linux, which goes back all the way to the 1960s, with its extremely smart and powerful pipes&filters concept and the tools metaphor described in the landmark book “Software Tools” in 1976.