Architecture

An architectural overview of HASH


Technical Outline

Type system

HASH is built around a flexible type system which allows for information to be expressed as entities with types. Types can inherit from one another, and entities can have multiple types. Users can create their own types, keeping them private, or making them public. Types can also be composed from other types, making HASH the first multi-tenant, open-source, typed knowledge graph in the world.

System concepts

Most entities and types you work with in HASH will be user-created, or generated by apps or integrations. But there are also a number of core concepts which you'll likely want to familiarize yourself with as a developer building on HASH.

  • Instance: a single running node of HASH, generally available via the world wide web (e.g. at a web address like hash.ai), deployed within a private network (e.g. available via a VPN), or running locally for development purposes.
  • Instance Admin: relevant in the context of self-hosting HASH, instance admins are users with the ability to perform instance-wide moderation and updates, or to change certain settings.
  • Web: a web is a namespace within an instance that contains entities and types, which belongs to either a user or an organization.
  • User: a single account on an instance, corresponding to a real person, which can be logged into via the HASH application interface.
  • Organization: a collectively addressable group of users on an instance.
  • Block: a discrete piece of UI in a page for editing or displaying data in a specific way (e.g. a Paragraph, Map, or Table block)
  • Page: a user-created page within a web containing any number of blocks. All pages MUST be either a:
    1. Document: a linear page, whose blocks appear in one or more columns
    2. Canvas: a freeform page, whose blocks can be dragged-and-dropped anywhere
  • File: a binary object stored as a single entity. Files MAY have a more specific system-recognized sub-type (e.g. Image File), extending the range of ways in which they can be used in HASH.

System components

Each instance of HASH relies on a number of components. Opportunities for contribution will mainly arise in the key internally-written applications:

  • HASH Graph (apps/hash-graph): a Rust application which manages interaction with the datastore and authorization service, and exposes a REST API to other components of the system for creating, updating and querying types, entities, and permission-related records. The Graph has no knowledge of particular types, and is solely responsible for providing consumers with the ability to build and query a graph.
  • Node API (apps/hash-api): a Node.js/TypeScript application which is mainly responsible for authenticating users and passing requests between them and the Graph, and servicing the particular queries the frontend relies on. It exposes a GraphQL API, as well as some other endpoints related to file upload and retrieval, and external integration authentication and synchronization.
  • Temporal workers (apps/*worker*): Node.js applications registered with a Temporal server, which can be called by other components of the system to run workflows for background, scheduled or long-running jobs. Heavily used by AI features.
  • Frontend (apps/hash-frontend): a Next.js/TypeScript frontend for the HASH workspace application. It provides a graphical interface for users to build their own web through the creation of types and entities, as well as reporting on the results of automated web-building jobs (e.g. intergrations, AI inference).
  • Browser plugin (apps/plugin-browser): a React/TypeScript browser extension which allows users to configure the AI-enabled creation of entities as they browse the web.
  • Realtime service: a stream of changes in the datastore(s) that allows services to subscribe to realtime updates on entities and types.

We also make use of third-party open-source applications as part of:

  • Authentication system: based on Ory Kratos and Hydra, handles user accounts, sessions and access tokens. Managed by the Node API.
  • Authorization system: based on SpiceDB, extends a Zanzibar-like way of providing permissionsed access to information. Managed by the Graph.
  • Execution system: based on Temporal, powers flows
  • Datastore(s): currently Postgres, with planned support for additional specialized backends for selectively storing/offloading specific kinds of data (e.g. timeseries, financial/accounting) and queries (e.g. full-text/vector search). Managed by the Graph.
  • Blobstore(s): currently supportive of S3-compatible APIs (with Cloudflare R2 utilized by default), with planned for support for file-type specific handlers (e.g. Cloudflare Images for images, Cloudflare Stream for video). Managed by the Node API.

More information about each of these, as well as the corresponding code, can be found in the hash- prefixed subdirectories within apps (in our @hashintel/hash public monorepo):

Previous