Opening an existing file for reading by creating and opening a read I/O object for the file.
Reading NWB neurodata_types by constructing the corresponding RegisteredType class to represent the neurodata_type, e.g., NWBFile or ElectricalSeries.
Reading data from RegisteredType objects by creating a ReadDataWrapper object for lazy read access to the particular dataset or attribute field.
Using ReadDataWrapper::values, we can then request the parts of the data we are interested in. Only at that point is the data loaded from disk and returned as a DataBlock, which contains a 1D vector with the data and the shape of the data.
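
To make the "1D vector plus shape" layout more concrete, the following is a minimal, self-contained sketch of how a multi-dimensional index can be mapped to a position in a flat vector. The names FlatBlock, flatOffset, and elementAt are hypothetical and are not part of AqNWB. The sketch also assumes row-major (C) ordering, which is the usual convention for HDF5's C/C++ API but should be confirmed for the data at hand.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a block of data: a flat 1D vector plus its shape.
// This is NOT the AqNWB DataBlock class, only an illustration of the layout.
template<typename T>
struct FlatBlock
{
  std::vector<T> data;             // all values, flattened
  std::vector<std::size_t> shape;  // size of each dimension
};

// Convert a multi-dimensional index to an offset into the flat vector,
// assuming row-major (C) order, i.e., the last dimension varies fastest.
inline std::size_t flatOffset(const std::vector<std::size_t>& shape,
                              const std::vector<std::size_t>& index)
{
  if (shape.size() != index.size()) {
    throw std::invalid_argument("index rank does not match shape rank");
  }
  std::size_t offset = 0;
  for (std::size_t d = 0; d < shape.size(); ++d) {
    if (index[d] >= shape[d]) {
      throw std::out_of_range("index exceeds shape");
    }
    offset = offset * shape[d] + index[d];
  }
  return offset;
}

// Look up one element of a block by its multi-dimensional index.
template<typename T>
const T& elementAt(const FlatBlock<T>& block,
                   const std::vector<std::size_t>& index)
{
  return block.data.at(flatOffset(block.shape, index));
}
```

For a block with shape {numSamples, numChannels}, this means that all channels of one time step are stored next to each other in the flat vector.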

Opening an existing file for reading

    // Open a new I/O for reading
    std::shared_ptr<BaseIO> readio = createIO("HDF5", path);
    readio->open(FileMode::ReadOnly);

References: See createIO and HDF5IO

Reading NWB neurodata_types

Reading known RegisteredType objects

When the path and type of objects are fixed in the schema (or we know them based on other conventions), we can read the types directly from the file. For example, here we first read the NWBFile directly, which we know exists at the root "/" of the file. We then read the ElectrodeTable via the predefined NWBFile::readElectrodeTable method. The advantage of this approach is that we do not need to manually specify paths or object types. Similarly, when we read the location column, we do not need to specify the name or the data type to use.

    // Read the NWBFile
    auto readNWBFile =
        NWB::RegisteredType::create<AQNWB::NWB::NWBFile>("/", readio);
    // Read the ElectrodesTable
    auto readElectrodeTable = readNWBFile->readElectrodeTable();
    // Read the location data. Note that both the type of the class and the
    // type of the data values are set for us, here VectorDataTyped<std::string>
    auto locationColumn = readElectrodeTable->readLocationColumn();
    auto locationColumnValues = locationColumn->readData()->values();
    // confirm that the values are correct
    std::vector<std::string> expectedLocationValues = {
        "unknown", "unknown", "unknown", "unknown"};
    REQUIRE(locationColumnValues.data == expectedLocationValues);

Searching for RegisteredType objects

When paths are not fixed, we can use the findTypes() function of our I/O object to conveniently search for objects with a given type.

    std::unordered_set<std::string> typesToSearch = {"core::ElectricalSeries"};
    std::unordered_map<std::string, std::string> found_electrical_series =
        readio->findTypes(
            "/",  // start search at the root of the file
            typesToSearch,  // search for all ElectricalSeries
            IO::SearchMode::CONTINUE_ON_TYPE  // search also within types
        );
 

Note: Any RegisteredType object (such as our NWBFile) provides the convenience method findOwnedTypes, which uses findTypes() to search within the given object (so that we don't need to specify the path argument). By default, findOwnedTypes uses the STOP_ON_TYPE mode, i.e., the search does not recurse further into defined types (hence returning only data elements that the object owns directly). Alternatively, we can set the search mode to CONTINUE_ON_TYPE to search recursively through all types (here the whole file, since we started at the root "/").

Warning: The current implementation of findTypes() is not aware of inheritance but searches for exact matches of types only. However, we can search for objects of multiple different types at the same time by specifying multiple types to search for in our typesToSearch.
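
The consequence of exact matching can be illustrated with a small, self-contained sketch. The function filterExactTypes below is a hypothetical stand-in that filters a path-to-type map by exact string comparison. It is not the findTypes() implementation, and the object path "/acquisition/spikes0" is made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in that mimics exact-match type filtering on a
// path -> type map. It is not the findTypes() implementation.
inline std::unordered_map<std::string, std::string> filterExactTypes(
    const std::unordered_map<std::string, std::string>& objects,
    const std::unordered_set<std::string>& typesToSearch)
{
  std::unordered_map<std::string, std::string> matches;
  for (const auto& entry : objects) {
    // Exact string comparison only: subtypes are not matched implicitly
    if (typesToSearch.count(entry.second) > 0) {
      matches.emplace(entry.first, entry.second);
    }
  }
  return matches;
}
```

With exact matching, an object whose type extends ElectricalSeries is only found if that subtype is also listed explicitly in the set of types to search for.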

The found_electrical_series provides us with a map where each key is the path to an object and its corresponding value is the type of the object. Using this information we can read the neurodata_type objects from the file via the RegisteredType::create factory methods to conveniently construct an instance of the corresponding class in AqNWB.

    // Read the ElectricalSeries from the file.
    std::string esdata_path = "/acquisition/esdata0";
    auto readElectricalSeries =
        NWB::RegisteredType::create<AQNWB::NWB::ElectricalSeries>(esdata_path,
                                                                  readio);

Note: findTypes does not guarantee that objects are returned in any particular order. Instead of retrieving the first object via found_electrical_series.begin()->first, we hard-code the esdata_path variable here to ensure consistent behavior of the tutorial across platforms.
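
If a deterministic order is needed without hard-coding a path, one option is to sort the paths found by the search. The following is a minimal sketch of that idea using only the standard library. The helper name sortedPaths is hypothetical and not part of AqNWB.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Collect the keys (object paths) of a path -> type map and sort them so
// that the iteration order no longer depends on the hash table layout.
inline std::vector<std::string> sortedPaths(
    const std::unordered_map<std::string, std::string>& found)
{
  std::vector<std::string> paths;
  paths.reserve(found.size());
  for (const auto& entry : found) {
    paths.push_back(entry.first);
  }
  std::sort(paths.begin(), paths.end());
  return paths;
}
```

Calling sortedPaths(found_electrical_series).front() would then return the lexicographically smallest path on every platform, provided the map is not empty.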

Note

RegisteredType::create comes in a few different flavors:

1. When we pass the path and io and specify the type as a template parameter (as in the example above), the instance is constructed using the common constructor and we get a pointer to the specific type directly. That is, the above example is equivalent to creating the object via auto readElectricalSeries = ElectricalSeries(path, io).
2. When we pass only the path and io, AqNWB reads the neurodata_type and namespace attributes from the NWB file to determine the type to use (e.g., ElectricalSeries). The generated instance is then returned as a generic RegisteredType pointer that we can cast to the specific type if necessary, e.g., via auto readElectricalSeries = std::dynamic_pointer_cast<AQNWB::NWB::ElectricalSeries>(readRegisteredType);.
3. When we pass the fullname (e.g., core::ElectricalSeries), path, and io, the behavior is the same as in option 2, but we avoid reading the neurodata_type and namespace attributes from the file to determine the type. This option is useful after using findTypes, since the type information was already determined during the search, so we can use found_electrical_series.begin()->second to set the fullname.
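
The difference between getting a pointer to the specific type and getting a generic pointer that needs a cast can be sketched with a small, self-contained example. All names below (BaseType, SeriesType, createTyped, createByName, and the type name "demo::SeriesType") are hypothetical and only mimic the general pattern. They do not show how RegisteredType::create is implemented.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical minimal type hierarchy, standing in for registered types
struct BaseType
{
  explicit BaseType(std::string path) : m_path(std::move(path)) {}
  virtual ~BaseType() = default;
  std::string m_path;
};

struct SeriesType : BaseType
{
  using BaseType::BaseType;
};

// Type given as a template parameter: the caller gets a pointer to the
// specific type directly.
template<typename T>
std::shared_ptr<T> createTyped(const std::string& path)
{
  return std::make_shared<T>(path);
}

// Type given by name at runtime: only a generic base pointer can be
// returned. Unknown names yield nullptr.
inline std::shared_ptr<BaseType> createByName(const std::string& fullName,
                                              const std::string& path)
{
  using Factory = std::function<std::shared_ptr<BaseType>(const std::string&)>;
  static const std::unordered_map<std::string, Factory> registry = {
      {"demo::SeriesType",
       [](const std::string& p) { return std::make_shared<SeriesType>(p); }}};
  const auto it = registry.find(fullName);
  if (it == registry.end()) {
    return nullptr;
  }
  return it->second(path);
}
```

A caller of createByName would use std::dynamic_pointer_cast<SeriesType> and check the result against nullptr before using type-specific members, just as with a generic RegisteredType pointer.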

Reading data from RegisteredType objects

Now we can read fields, and subsets of the data stored in those fields.

Reading predefined data fields

For fields with a predefined, fixed name in the schema, AqNWB provides read methods for convenient access to such common data fields.

    // Now we can read the data in the same way we did during write
    auto readElectricalSeriesData = readElectricalSeries->readData();
    auto readDataValues = readElectricalSeriesData->values();
    auto readBoostMultiArray = readDataValues.as_multi_array<2>();
    REQUIRE(readDataValues.data.size() == (numSamples * numChannels));
    REQUIRE(readDataValues.shape[0] == numSamples);
    REQUIRE(readDataValues.shape[1] == numChannels);

    // We can also read just subsets of the data, e.g., the first 9 time steps
    // for the first channel. "auto dataSlice" is again of type DataBlock<float>
    std::vector<SizeType> start = {0, 0};
    std::vector<SizeType> count = {9, 1};
    auto dataSlice = readElectricalSeriesData->values(start, count);
    // Validate that the slice was read correctly
    REQUIRE(dataSlice.data.size() == 9);
    REQUIRE(dataSlice.shape[0] == 9);
    REQUIRE(dataSlice.shape[1] == 1);

    // Or read a string attribute, e.g., the unit
    std::string esUnitValue =
        readElectricalSeries->readDataUnit()->values().data[0];
    REQUIRE(esUnitValue == std::string("volts"));

Note: For attributes, slicing is disabled at compile time since attributes are intended for small data only.
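
The start and count arguments used for slicing above describe a rectangular selection: start is the index of the first element in each dimension, and count is the number of elements to read in each dimension (not an end index). The following self-contained sketch applies the same selection logic to a 2D array stored as a flat vector. The helper slice2D is hypothetical, assumes row-major ordering, and is not how AqNWB reads from disk.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Extract the rectangular selection [start, start + count) from a 2D array
// stored as a flat, row-major vector with the given shape {rows, cols}.
template<typename T>
std::vector<T> slice2D(const std::vector<T>& data,
                       const std::vector<std::size_t>& shape,
                       const std::vector<std::size_t>& start,
                       const std::vector<std::size_t>& count)
{
  if (shape.size() != 2 || start.size() != 2 || count.size() != 2) {
    throw std::invalid_argument("shape, start, and count must have rank 2");
  }
  if (data.size() != shape[0] * shape[1]) {
    throw std::invalid_argument("data size does not match shape");
  }
  if (start[0] + count[0] > shape[0] || start[1] + count[1] > shape[1]) {
    throw std::out_of_range("selection exceeds the array bounds");
  }
  std::vector<T> result;
  result.reserve(count[0] * count[1]);
  for (std::size_t r = start[0]; r < start[0] + count[0]; ++r) {
    for (std::size_t c = start[1]; c < start[1] + count[1]; ++c) {
      result.push_back(data[r * shape[1] + c]);
    }
  }
  return result;
}
```

With start = {0, 0} and count = {9, 1}, this selects nine time steps of the first channel, which matches the shape {9, 1} checked in the example above.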

Reading arbitrary fields

Even if there is no dedicated DEFINE_FIELD definition available, we can still read any arbitrary sub-field associated with a particular RegisteredType via the generic RegisteredType::readField method. For example, to read the data from the ElectricalSeries:

    // Read the data field via the generic readField method
    auto readElectricalSeriesData3 =
        readElectricalSeries->readField<StorageObjectType::Dataset, float>(
            std::string("data"));
    // Read the data values as usual
    auto readDataValues3 = readElectricalSeriesData3->values();
    REQUIRE(readDataValues3.data.size() == (numSamples * numChannels));

Note

Using this approach, we need to specify the template parameters to use with the ReadDataWrapper, i.e.:

OTYPE: specifies the type of object being wrapped (AQNWB::Types::StorageObjectType)
VTYPE: defines the value type of the data

Warning: In particular for fields that are optional, it is useful to first check that the field actually exists via ReadDataWrapper::exists.

Similarly, we can also read any sub-fields that are themselves RegisteredType objects:

    // read the ElectricalSeries from the NWBFile object via the readField
    // method returning a generic std::shared_ptr<RegisteredType>
    auto readRegisteredType = readNWBFile->readField(esdata_path);
    // cast the generic pointer to the more specific ElectricalSeries
    std::shared_ptr<AQNWB::NWB::ElectricalSeries> readElectricalSeries2 =
        std::dynamic_pointer_cast<AQNWB::NWB::ElectricalSeries>(
            readRegisteredType);
    REQUIRE(readElectricalSeries2 != nullptr);

Note: Even though we do not specify the template parameter for RegisteredType::create here, the function still creates the correct type by reading the type information from the NWB file. However, because we do not specify the type, the function returns the object as a RegisteredType pointer, which we can subsequently cast to the appropriate type if necessary.

Working with fields with unknown data type

C++ is a statically typed language, i.e., we need to know the type of every variable at compile time. This can be particularly challenging when reading data from disk, where the data type may not be known beforehand. AqNWB helps us here by allocating memory and determining data types for us when reading data fields. However, when we want to compute on the data, we still need to know the data type, e.g., to use the typed DataBlock<DTYPE> we need to know the DTYPE.

Using std::variant with std::visit (introduced in C++17) provides an alternative approach that can help us avoid writing complex switch/case statements to check for all possible types when we don't know the data type beforehand. For example, using std::visit we can define a set of functions to compute the mean for any 1D std::vector:

    // Standard headers used by the helper functions below
    #include <numeric>
    #include <stdexcept>
    #include <string>
    #include <type_traits>
    #include <variant>
    #include <vector>

    // Helper function to compute the mean of a vector
    template<typename T>
    inline double compute_mean(const T& data)
    {
      if (data.empty()) {
        throw std::runtime_error("Data vector is empty");
      }
      double sum = std::accumulate(data.begin(), data.end(), 0.0);
      return sum / data.size();
    }

    // Function to compute the mean using std::visit
    inline double compute_mean(const BaseDataType::BaseDataVectorVariant& variant)
    {
      return std::visit(
          [](auto&& arg) -> double
          {
            using T = std::decay_t<decltype(arg)>;
            // Check that the variant represents a BaseDataType we can compute on
            if constexpr (std::is_same_v<T, std::monostate>) {
              throw std::runtime_error("Invalid data type");
            } else if constexpr (std::is_same_v<T, std::vector<std::string>>) {
              throw std::runtime_error("Cannot compute mean of string data");
            } else {
              return compute_mean(arg);  // Compute the mean
            }
          },
          variant);
    }

Using DataBlockGeneric::as_variant we can then cast our data to a BaseDataVectorVariant, which is a std::variant representing a 1D std::vector containing values of any valid BaseDataType. We can then use our compute_mean functions to conveniently compute on the data without having to explicitly specify the type of the data ourselves.

    // Compute the mean using the std::variant approach. We specify
    // the types of variables for clarity, but could use "auto" instead
    DataBlockGeneric genericDataBlock =
        readElectricalSeriesData->valuesGeneric();
    BaseDataType::BaseDataVectorVariant variantData =
        genericDataBlock.as_variant();
    double meanFromVariant = compute_mean(variantData);
    // Compare with computing the mean from the typed DataBlock<float>. We
    // specify the template type for clarity although the compiler can infer it.
    double meanFromTypedVector =
        compute_mean<std::vector<float>>(readDataValues.data);
    REQUIRE(meanFromVariant == Catch::Approx(meanFromTypedVector));
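
To experiment with the std::visit pattern independently of AqNWB, the following self-contained sketch uses a hypothetical stand-in variant (DemoVectorVariant) in place of BaseDataVectorVariant. As a design variation, the hypothetical helper tryMean returns std::optional<double> instead of throwing, so callers can handle unsupported or empty data without exception handling. The set of types in the stand-in variant is an assumption chosen for illustration only.

```cpp
#include <cassert>
#include <numeric>
#include <optional>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical stand-in for a variant over 1D vectors of supported types
using DemoVectorVariant = std::variant<std::monostate,
                                       std::vector<float>,
                                       std::vector<int>,
                                       std::vector<std::string>>;

// Return the mean of numeric data, or std::nullopt if the variant holds no
// data, string data, or an empty vector.
inline std::optional<double> tryMean(const DemoVectorVariant& variant)
{
  return std::visit(
      [](const auto& values) -> std::optional<double>
      {
        using T = std::decay_t<decltype(values)>;
        if constexpr (std::is_same_v<T, std::monostate>
                      || std::is_same_v<T, std::vector<std::string>>) {
          return std::nullopt;  // nothing numeric to average
        } else {
          if (values.empty()) {
            return std::nullopt;
          }
          const double sum =
              std::accumulate(values.begin(), values.end(), 0.0);
          return sum / static_cast<double>(values.size());
        }
      },
      variant);
}
```

Whether to throw or to return an empty optional is a matter of taste. Throwing keeps call sites short, while the optional makes the "cannot compute" case explicit in the return type.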
