BER Data Registry

A LinkML schema for cataloging databases and data sources across scientific lakehouses at LBNL, including KBASE (Spark-based) and Dremio environments. Informed by the HPDF Data Catalog & Lakehouse Demo report (Cohoon & Paine, LBNL-2001745, Dec 2025), DCAT v3, and DCAT-US.

URI: https://w3id.org/ber-data/ber-data-registry

Name: ber_data_registry

Classes

Class Description
Catalog Top-level container for the BER data catalog, holding references to lakehouse...
CatalogEntity Abstract base class providing shared metadata fields for all catalog entries
        DataSource A cataloged data source within a lakehouse
        Lakehouse A lakehouse environment such as the KBASE Spark lakehouse or a Dremio instanc...
ContactPoint Structured contact information for a data source, providing a name and email ...
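The class hierarchy above can be sketched in plain Python. This is a minimal illustration, not the generated LinkML data model: the class names come from the table, but which slots attach to which class, which slots are required, and all example values (identifiers, names, URLs) are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContactPoint:
    """Name and email for a data source contact."""
    contact_name: str
    contact_email: str


@dataclass
class CatalogEntity:
    """Shared metadata fields; treated as abstract by convention in this sketch."""
    id: str
    title: Optional[str] = None
    description: Optional[str] = None


@dataclass
class DataSource(CatalogEntity):
    # Assumption: contact_point and keywords are DataSource slots.
    contact_point: Optional[ContactPoint] = None
    keywords: List[str] = field(default_factory=list)


@dataclass
class Lakehouse(CatalogEntity):
    # Assumption: endpoint_url and catalog_entries are Lakehouse slots.
    endpoint_url: Optional[str] = None
    catalog_entries: List[DataSource] = field(default_factory=list)


@dataclass
class Catalog:
    """Top-level container holding the registered lakehouses."""
    lakehouses: List[Lakehouse] = field(default_factory=list)


# Hypothetical example values, for illustration only.
source = DataSource(
    id="example:source-1",
    title="Example source",
    contact_point=ContactPoint("Example Person", "person@example.org"),
)
lakehouse = Lakehouse(
    id="example:lakehouse-1",
    endpoint_url="https://example.org/lakehouse",
    catalog_entries=[source],
)
catalog = Catalog(lakehouses=[lakehouse])
```

The nesting mirrors the slot descriptions: a Catalog lists lakehouses, and each Lakehouse lists the data sources it hosts.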

Slots

Slot Description
access_level Visibility/access level of this data source
catalog_entries Catalog entries (data sources) hosted in this lakehouse
category Organizational category (project, shared, personal, system)
contact_email Email address of the contact person
contact_name Full name of the contact person
contact_point Structured contact information for this data source
created_date Date this entity was first created or registered
data_quality_notes Notes on data quality, known issues, or limitations
database_engine Database engine for Dremio database sources (e
deprecation_date Date this data source was deprecated
deprecation_reason Explanation for why this data source was deprecated
description Free-text description of this entity
documentation_url URL to external documentation for this data source
doi DOI if this data source has been published
domain Scientific domain(s) covered by this data source
endpoint_url URL endpoint for accessing the lakehouse
facility Originating facility (e
format Data format(s) available (e
id Unique identifier for this catalog entity
instrument Instrument or sensor that generated the data
is_deprecated Whether this data source is deprecated
keywords Discovery tags for this data source
lakehouses Lakehouses registered in this catalog
last_modified Date this entity was last updated
license License governing use of this data source
lineage Provenance or lineage information for this data source
modality Data modality (e
namespace Database name, source name, or space name within the lakehouse (e
operator Organization or team operating this lakehouse
owner Person or team responsible for this data source
platform_type The technology platform underlying this lakehouse
previous_version Reference to the previous version of this data source
project_affiliation BER program affiliations (e
replaced_by Reference to the data source that replaces this deprecated one
row_count Number of rows or records in the data source
size_bytes Total size of the data source in bytes
source_type Type of data source within the lakehouse (e
spatial_coverage Geographic or spatial coverage description
status Current lifecycle status of this data source
table_count Number of tables or collections in the data source
temporal_coverage_end End date of the temporal coverage of this data
temporal_coverage_start Start date of the temporal coverage of this data
title Human-readable name for this entity
update_schedule How frequently this data source is updated
version Version identifier for this data source
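Several of these slots are naturally read together, for example the deprecation slots and the temporal coverage pair. The following sketch shows one way a record could be checked for internal consistency. The rules themselves are assumptions, not constraints stated by the schema, and the record is assumed to be a plain dictionary keyed by slot name with ISO 8601 date strings.

```python
from datetime import date
from typing import Any, Dict, List


def check_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of consistency problems found in one data source record."""
    problems = []

    # Assumed rule: a deprecated source should say when it was deprecated.
    deprecated = bool(record.get("is_deprecated", False))
    if deprecated and not record.get("deprecation_date"):
        problems.append("is_deprecated is true but deprecation_date is missing")

    # Assumed rule: deprecation details only make sense on a deprecated source.
    if not deprecated:
        for slot in ("deprecation_date", "deprecation_reason", "replaced_by"):
            if record.get(slot):
                problems.append(f"{slot} is set but is_deprecated is not true")

    # Assumed rule: temporal coverage must not end before it starts.
    start = record.get("temporal_coverage_start")
    end = record.get("temporal_coverage_end")
    if start and end and date.fromisoformat(start) > date.fromisoformat(end):
        problems.append("temporal_coverage_start is after temporal_coverage_end")

    return problems
```

An empty list means the record passed every check in this sketch; it says nothing about validity against the actual schema.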

Enumerations

Enumeration Description
AccessLevel Visibility and access restrictions for a data source
DatabaseEngine Specific database engine for database-type sources
DataSourceCategory Organizational category for a data source
DataSourceStatus Lifecycle status of a data source
PlatformType Technology platform for a lakehouse
SourceType Type of data source within a lakehouse, particularly relevant for Dremio envi...
UpdateFrequency How often a data source is updated
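As an illustration of how an enumeration constrains a slot, the sketch below models DataSourceCategory in Python. The four values are taken from the description of the category slot (project, shared, personal, system); that these are exactly the permissible values of the enumeration, and that they are spelled in lowercase, is an assumption. The parse_category helper is hypothetical.

```python
from enum import Enum


class DataSourceCategory(str, Enum):
    """Organizational category for a data source (values assumed from the slot text)."""
    PROJECT = "project"
    SHARED = "shared"
    PERSONAL = "personal"
    SYSTEM = "system"


def parse_category(raw: str) -> DataSourceCategory:
    """Normalize whitespace and case, then look up the value.

    Raises ValueError if the text is not one of the known categories.
    """
    return DataSourceCategory(raw.strip().lower())
```

Rejecting unknown values at parse time keeps free-text variants out of the catalog, which is the main reason to use an enumeration rather than a string slot.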

Types

Type Description
Boolean A binary (true or false) value
Curie A compact URI
Date A date (year, month and day) in an idealized calendar
DateOrDatetime Either a date or a datetime
Datetime The combination of a date and time
Decimal A real number with arbitrary precision that conforms to the xsd:decimal speci...
Double A real number that conforms to the xsd:double specification
Float A real number that conforms to the xsd:float specification
Integer An integer
Jsonpath A string encoding a JSON Path
Jsonpointer A string encoding a JSON Pointer
Ncname Prefix part of CURIE
Nodeidentifier A URI, CURIE or BNODE that represents a node in a model
Objectidentifier A URI or CURIE that represents an object in the model
Sparqlpath A string encoding a SPARQL Property Path
String A character string
Time A time object represents a (local) time of day, independent of any particular...
Uri A complete URI
Uriorcurie A URI or a CURIE

Subsets

Subset Description