Neurodata Without Borders Extracellular Electrophysiology Tutorial

This tutorial

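In this tutorial, we create fake data for a hypothetical extracellular electrophysiology experiment with a freely moving animal and write it to an NWB file. The types of data we will convert are subject information, animal position, trials, raw voltage recordings, local field potential (LFP), and spike times.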

Installing MatNWB

The commands inside the block comment below (between %{ and %}) install MatNWB from source; remove the comment markers or run the lines individually. MatNWB works by automatically generating API classes from the NWB schema. Call generateCore() to generate these classes.
%{
!git clone https://github.com/NeurodataWithoutBorders/matnwb.git
cd matnwb
addpath(genpath(pwd));
generateCore();
%}
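After installation, it can help to confirm that MATLAB sees both the MatNWB functions and the generated classes. The following is a minimal sketch, not part of MatNWB itself; it assumes that nwbRead is on the path once MatNWB is installed and that types.core.TimeSeries is one of the classes produced by generateCore().

```matlab
% Sketch: check whether MatNWB is on the path and its classes were generated.
% Assumption: nwbRead ships with MatNWB; types.core.TimeSeries is generated.
has_matnwb  = exist('nwbRead', 'file') == 2;
has_classes = ~isempty(meta.class.fromName('types.core.TimeSeries'));

if has_matnwb && has_classes
    disp('MatNWB is installed and the core classes have been generated.');
elseif has_matnwb
    disp('MatNWB is on the path, but generateCore() has not been run yet.');
else
    disp('MatNWB was not found; add the matnwb folder to the path first.');
end
```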

Set up the NWB file

An NWB file represents a single session of an experiment. Each file must have a session_description, an identifier, and a session_start_time. Create a new NwbFile object with those and additional metadata. MatNWB class constructors take MATLAB-style name-value pair arguments, where each argument name is followed by its value.
nwb = NwbFile( ...
'session_description', 'mouse in open exploration',...
'identifier', 'Mouse5_Day3', ...
'session_start_time', datetime(2018, 4, 25, 2, 30, 3), ...
'general_experimenter', 'My Name', ... % optional
'general_session_id', 'session_1234', ... % optional
'general_institution', 'University of My Institution', ... % optional
'general_related_publications', 'DOI:10.1016/j.neuron.2016.12.011'); % optional
nwb
nwb =
NwbFile with properties:
    nwb_version: '2.4.0'
    acquisition: [0×1 types.untyped.Set]
    analysis: [0×1 types.untyped.Set]
    file_create_date: []
    general: [0×1 types.untyped.Set]
    general_data_collection: []
    general_devices: [0×1 types.untyped.Set]
    general_experiment_description: []
    general_experimenter: 'My Name'
    general_extracellular_ephys: [0×1 types.untyped.Set]
    general_extracellular_ephys_electrodes: []
    general_institution: 'University of My Institution'
    general_intracellular_ephys: [0×1 types.untyped.Set]
    general_intracellular_ephys_experimental_conditions: []
    general_intracellular_ephys_filtering: []
    general_intracellular_ephys_intracellular_recordings: []
    general_intracellular_ephys_repetitions: []
    general_intracellular_ephys_sequential_recordings: []
    general_intracellular_ephys_simultaneous_recordings: []
    general_intracellular_ephys_sweep_table: []
    general_keywords: []
    general_lab: []
    general_notes: []
    general_optogenetics: [0×1 types.untyped.Set]
    general_optophysiology: [0×1 types.untyped.Set]
    general_pharmacology: []
    general_protocol: []
    general_related_publications: 'DOI:10.1016/j.neuron.2016.12.011'
    general_session_id: 'session_1234'
    general_slices: []
    general_source_script: []
    general_source_script_file_name: []
    general_stimulus: []
    general_subject: []
    general_surgery: []
    general_virus: []
    identifier: 'Mouse5_Day3'
    intervals: [0×1 types.untyped.Set]
    intervals_epochs: []
    intervals_invalid_times: []
    intervals_trials: []
    processing: [0×1 types.untyped.Set]
    scratch: [0×1 types.untyped.Set]
    session_description: 'mouse in open exploration'
    session_start_time: 2018-04-25T02:30:03.000000-04:00
    stimulus_presentation: [0×1 types.untyped.Set]
    stimulus_templates: [0×1 types.untyped.Set]
    timestamps_reference_time: []
    units: []

Subject information

Create a Subject object to store information about the experimental subject, such as age, species, genotype, sex, and a free-form description. Then set nwb.general_subject to the Subject object.
Each of these fields is free-form, so any value is valid. The example below follows common conventions: age as an ISO 8601 duration ('P90D' means 90 days), species as a Latin binomial name, and sex as a single letter.
subject = types.core.Subject( ...
'subject_id', '001', ...
'age', 'P90D', ...
'description', 'mouse 5', ...
'species', 'Mus musculus', ...
'sex', 'M')
subject =
Subject with properties:
    age: 'P90D'
    date_of_birth: []
    description: 'mouse 5'
    genotype: []
    sex: 'M'
    species: 'Mus musculus'
    strain: []
    subject_id: '001'
    weight: []
nwb.general_subject = subject;
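The age string 'P90D' used above is an ISO 8601 duration expressed in days. As a small sketch of how such a string could be derived, the snippet below computes the age from a date of birth and a session date. Both dates are hypothetical values chosen for illustration; they are not part of the tutorial data.

```matlab
% Sketch: build an ISO 8601 duration string (in days) for the 'age' field.
birth_date   = datetime(2018, 1, 25);   % hypothetical date of birth
session_date = datetime(2018, 4, 25);   % hypothetical session date

% Whole days elapsed between the two dates.
age_in_days = floor(days(session_date - birth_date));

% 'P' starts an ISO 8601 duration and 'D' marks the unit as days.
age_iso8601 = sprintf('P%dD', age_in_days);
disp(age_iso8601);
```

The resulting character vector can be passed as the 'age' argument of types.core.Subject in the same way as the literal 'P90D'.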

SpatialSeries and Position

Many types of data have special data types in NWB. To store the spatial position of a subject, we will use the SpatialSeries and Position classes.
SpatialSeries is a subclass of TimeSeries. TimeSeries is a common base class for measurements sampled over time, and provides fields for data and time (regularly or irregularly sampled).
Here, we put a SpatialSeries object called 'SpatialSeries' in a Position object, and put that in a ProcessingModule named 'behavior'.
position_data = [linspace(0,10,100); linspace(0,8,100)];
spatial_series_ts = types.core.SpatialSeries( ...
'data', position_data, ...
'reference_frame', '(0,0) is bottom left corner', ...
'timestamps', linspace(0, 100)/200)
spatial_series_ts =
SpatialSeries with properties:
    reference_frame: '(0,0) is bottom left corner'
    starting_time_unit: 'seconds'
    timestamps_interval: 1
    timestamps_unit: 'seconds'
    comments: 'no comments'
    control: []
    control_description: []
    data: [2×100 double]
    data_continuity: []
    data_conversion: 1
    data_resolution: -1
    data_unit: 'meters'
    description: 'no description'
    starting_time: []
    starting_time_rate: []
    timestamps: [1×100 double]
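A TimeSeries can describe time either with explicit timestamps or, for regular sampling, with a starting time and a sampling rate. The sketch below illustrates that the two descriptions carry the same information when sampling is regular. The 200 Hz rate is a hypothetical value, and the MatNWB call at the end runs only if the generated classes are available.

```matlab
% Sketch: two equivalent ways to describe regularly sampled time.
n_samples     = 100;
starting_time = 0.0;     % seconds
rate          = 200.0;   % Hz (hypothetical sampling rate)

% Explicit timestamps, one per sample.
timestamps = starting_time + (0:n_samples - 1) / rate;

% With regular sampling the interval between samples is constant (1/rate),
% so starting_time and rate describe the same time axis as timestamps.
is_regular = all(abs(diff(timestamps) - 1/rate) < 1e-12);

% If MatNWB is available, declare the series without explicit timestamps.
if ~isempty(meta.class.fromName('types.core.SpatialSeries'))
    spatial_series_regular = types.core.SpatialSeries( ...
        'data', [linspace(0, 10, n_samples); linspace(0, 8, n_samples)], ...
        'reference_frame', '(0,0) is bottom left corner', ...
        'starting_time', starting_time, ...
        'starting_time_rate', rate);
end
```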
To help data analysis and visualization tools know that this SpatialSeries object represents the position of the animal, store the SpatialSeries object inside of a Position object.
Position = types.core.Position('SpatialSeries', spatial_series_ts);
NWB differentiates between raw, acquired data, which should never change, and processed data, which are the results of preprocessing algorithms and could change. Let's assume that the animal's position was computed from a video tracking algorithm, so it would be classified as processed data. Since processed data can be very diverse, NWB allows us to create processing modules, which are like folders, to store related processed data or data that comes from a single algorithm.
Create a processing module called "behavior" for storing behavioral data in the NWBFile and add the Position object to the module.
% create processing module
behavior_mod = types.core.ProcessingModule( ...
'description', 'contains behavioral data')
behavior_mod =
ProcessingModule with properties:
    description: 'contains behavioral data'
    dynamictable: [0×1 types.untyped.Set]
    nwbdatainterface: [0×1 types.untyped.Set]
% add the Position object (that holds the SpatialSeries object)
behavior_mod.nwbdatainterface.set(...
'Position', Position);
% add the processing module to the NWBFile object, and name it "behavior"
nwb.processing.set('behavior', behavior_mod);

Test write

Now, write the NWB file that we have built so far.
nwbExport(nwb, 'ecephys_tutorial1.nwb')
We can then read the file and print it to inspect its contents. We can also print the SpatialSeries data that we created by referencing the names of the objects in the hierarchy that contain it. The processing module called 'behavior' contains our Position object. By default, the Position object is named 'Position'. The Position object contains our SpatialSeries object named 'SpatialSeries'.
read_nwbfile = nwbRead('ecephys_tutorial1.nwb')
read_nwbfile =
NwbFile with properties:
    nwb_version: '2.4.0'
    acquisition: [0×1 types.untyped.Set]
    analysis: [0×1 types.untyped.Set]
    file_create_date: [1×1 types.untyped.DataStub]
    general: [0×1 types.untyped.Set]
    general_data_collection: []
    general_devices: [0×1 types.untyped.Set]
    general_experiment_description: []
    general_experimenter: [1×1 types.untyped.DataStub]
    general_extracellular_ephys: [0×1 types.untyped.Set]
    general_extracellular_ephys_electrodes: []
    general_institution: 'University of My Institution'
    general_intracellular_ephys: [0×1 types.untyped.Set]
    general_intracellular_ephys_experimental_conditions: []
    general_intracellular_ephys_filtering: []
    general_intracellular_ephys_intracellular_recordings: []
    general_intracellular_ephys_repetitions: []
    general_intracellular_ephys_sequential_recordings: []
    general_intracellular_ephys_simultaneous_recordings: []
    general_intracellular_ephys_sweep_table: []
    general_keywords: []
    general_lab: []
    general_notes: []
    general_optogenetics: [0×1 types.untyped.Set]
    general_optophysiology: [0×1 types.untyped.Set]
    general_pharmacology: []
    general_protocol: []
    general_related_publications: [1×1 types.untyped.DataStub]
    general_session_id: 'session_1234'
    general_slices: []
    general_source_script: []
    general_source_script_file_name: []
    general_stimulus: []
    general_subject: [1×1 types.core.Subject]
    general_surgery: []
    general_virus: []
    identifier: 'Mouse5_Day3'
    intervals: [0×1 types.untyped.Set]
    intervals_epochs: []
    intervals_invalid_times: []
    intervals_trials: []
    processing: [1×1 types.untyped.Set]
    scratch: [0×1 types.untyped.Set]
    session_description: 'mouse in open exploration'
    session_start_time: 2018-04-25T02:30:03.000000-04:00
    stimulus_presentation: [0×1 types.untyped.Set]
    stimulus_templates: [0×1 types.untyped.Set]
    timestamps_reference_time: 2018-04-25T02:30:03.000000-04:00
    units: []
read_nwbfile.processing.get('behavior').nwbdatainterface.get('Position').spatialseries.get('SpatialSeries')
ans =
SpatialSeries with properties:
    reference_frame: '(0,0) is bottom left corner'
    starting_time_unit: 'seconds'
    timestamps_interval: 1
    timestamps_unit: 'seconds'
    comments: 'no comments'
    control: []
    control_description: []
    data: [1×1 types.untyped.DataStub]
    data_continuity: []
    data_conversion: 1
    data_resolution: -1
    data_unit: 'meters'
    description: 'no description'
    starting_time: []
    starting_time_rate: []
    timestamps: [1×1 types.untyped.DataStub]
We can also use the HDFView tool to inspect the resulting NWB file.
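As an alternative to HDFView, MATLAB's built-in HDF5 functions can show the file layout, since an NWB file is an HDF5 file. The sketch below assumes that ecephys_tutorial1.nwb was written to the current folder by the nwbExport call above and that the processing module is stored under /processing/behavior; it prints a message instead if the file is missing.

```matlab
% Sketch: inspect the HDF5 layout of the NWB file with built-in functions.
nwb_path = 'ecephys_tutorial1.nwb';   % file written by nwbExport above

if isfile(nwb_path)
    % List the top-level groups of the file.
    info = h5info(nwb_path);
    disp({info.Groups.Name}');

    % Print the contents of the 'behavior' processing module.
    h5disp(nwb_path, '/processing/behavior');
else
    fprintf('File %s not found; run nwbExport first.\n', nwb_path);
end
```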

Trials

Trials are stored in a TimeIntervals object which is a subclass of DynamicTable. DynamicTable objects are used to store tabular metadata throughout NWB, including for trials, electrodes, and sorted units. They offer flexibility for tabular data by allowing required columns, optional columns, and custom columns.
The trials DynamicTable can be thought of as a table with this structure:
id    start_time    stop_time    correct
0     0.1           1.0          false
1     1.5           2.0          true
2     2.5           3.0          false
Here, we add a custom column, 'correct', which holds a boolean value for each trial.
trials = types.core.TimeIntervals( ...
'colnames', {'start_time', 'stop_time', 'correct'}, ...
'description', 'trial data and properties', ...
'id', types.hdmf_common.ElementIdentifiers('data', 0:2), ...
'start_time', types.hdmf_common.VectorData( ...
'data', [.1, 1.5, 2.5], ...
'description','start time of trial' ...
), ...
'stop_time', types.hdmf_common.VectorData( ...
'data', [1., 2., 3.], ...
'description','end of each trial' ...
), ...
'correct', types.hdmf_common.VectorData( ...
'data', [false, true, false], ...
'description', 'whether the trial was correct') ...
)
trials =
TimeIntervals with properties:
    start_time: [1×1 types.hdmf_common.VectorData]
    stop_time: [1×1 types.hdmf_common.VectorData]
    tags: []
    tags_index: []
    timeseries: []
    timeseries_index: []
    colnames: {'start_time' 'stop_time' 'correct'}
    description: 'trial data and properties'
    id: [1×1 types.hdmf_common.ElementIdentifiers]
    vectordata: [1×1 types.untyped.Set]
nwb.intervals_trials = trials;
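Trial intervals are typically used to slice time series data during analysis. The following sketch uses plain MATLAB arrays with the same trial values as above to select the samples that fall inside each trial. The 10 Hz timestamps and the sine-wave signal are hypothetical stand-ins for real data.

```matlab
% Sketch: use trial start/stop times to select samples of a time series.
start_time = [0.1, 1.5, 2.5];          % same values as the trials table
stop_time  = [1.0, 2.0, 3.0];
correct    = [false, true, false];

timestamps = (0:30) / 10;              % hypothetical 10 Hz time axis (0 to 3 s)
signal     = sin(2 * pi * timestamps); % hypothetical signal

n_trials          = numel(start_time);
samples_per_trial = zeros(1, n_trials);
trial_mean        = zeros(1, n_trials);
for itrial = 1:n_trials
    % Logical mask of samples inside the (inclusive) trial window.
    in_trial = timestamps >= start_time(itrial) & timestamps <= stop_time(itrial);
    samples_per_trial(itrial) = nnz(in_trial);
    trial_mean(itrial)        = mean(signal(in_trial));
end

trial_duration   = stop_time - start_time;   % seconds per trial
fraction_correct = mean(correct);            % share of correct trials
```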

Extracellular electrophysiology

To store extracellular electrophysiology data, you must first create an electrodes table describing the electrodes that generated the data. Extracellular electrodes are stored in an electrodes table, which is also a DynamicTable. The electrodes table has several required columns: x, y, z, imp (impedance), location, filtering, and group (the electrode group).

Electrode table

Since this is a DynamicTable, we can add custom metadata columns. We will add a "label" column to the table.
Here, we also demonstrate another method for creating DynamicTables: first create a native MATLAB table object, then call util.table2nwb to convert it into a DynamicTable.
nshanks = 4;
nchannels_per_shank = 3;
variables = {'x', 'y', 'z', 'imp', 'location', 'filtering', 'group', 'label'};
tbl = cell2table(cell(0, length(variables)), 'VariableNames', variables);
device = types.core.Device(...
'description', 'the best array', ...
'manufacturer', 'Probe Company 9000' ...
);
nwb.general_devices.set('array', device);
for ishank = 1:nshanks
    electrode_group = types.core.ElectrodeGroup( ...
        'description', ['electrode group for shank' num2str(ishank)], ...
        'location', 'brain area', ...
        'device', types.untyped.SoftLink(device) ...
    );
    nwb.general_extracellular_ephys.set(['shank' num2str(ishank)], electrode_group);
    group_object_view = types.untyped.ObjectView(electrode_group);
    for ielec = 1:nchannels_per_shank
        electrode_label = ['shank' num2str(ishank) 'elec' num2str(ielec)];
        tbl = [...
            tbl; ...
            {5.3, 1.5, 8.5, NaN, 'unknown', 'unknown', group_object_view, electrode_label} ...
        ];
    end
end
tbl
tbl = 12×8 table
      x         y         z         imp    location     filtering    group             label
 1    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank1elec1'
 2    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank1elec2'
 3    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank1elec3'
 4    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank2elec1'
 5    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank2elec2'
 6    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank2elec3'
 7    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank3elec1'
 8    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank3elec2'
 9    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank3elec3'
10    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank4elec1'
11    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank4elec2'
12    5.3000    1.5000    8.5000    NaN    'unknown'    'unknown'    1×1 ObjectView    'shank4elec3'
electrode_table = util.table2nwb(tbl, 'all electrodes');
nwb.general_extracellular_ephys_electrodes = electrode_table;

Links

In the loop above, we create ElectrodeGroup objects. The electrodes table then uses an ObjectView in each row to link to the corresponding ElectrodeGroup object. An ObjectView is an object that allows you to create a link from one neurodata type to another.

ElectricalSeries

Voltage data are stored in ElectricalSeries objects. ElectricalSeries is a subclass of TimeSeries specialized for voltage data. In order to create our ElectricalSeries object, we will need to reference a set of rows in the electrodes table to indicate which electrodes were recorded. We will do this by creating a DynamicTableRegion, which is a type of link that allows you to reference specific rows of a DynamicTable, such as the electrodes table, by row indices.
Create a DynamicTableRegion that references all rows of the electrodes table.
electrode_table_region = types.hdmf_common.DynamicTableRegion( ...
'table', types.untyped.ObjectView(electrode_table), ...
'description', 'all electrodes', ...
'data', (0:height(tbl)-1)');
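A DynamicTableRegion does not have to cover the whole table. As a sketch, the snippet below computes the zero-based row indices of the electrodes on one shank from their labels, using the same label scheme as the electrodes table above. The final block builds a region for that subset only when the electrode_table variable from this tutorial exists in the workspace; the choice of shank 2 is arbitrary.

```matlab
% Sketch: zero-based row indices for the electrodes of a single shank.
nshanks = 4;
nchannels_per_shank = 3;

% Rebuild the label column used in the electrodes table.
labels = cell(nshanks * nchannels_per_shank, 1);
row = 0;
for ishank = 1:nshanks
    for ielec = 1:nchannels_per_shank
        row = row + 1;
        labels{row} = sprintf('shank%delec%d', ishank, ielec);
    end
end

% MATLAB rows are 1-based, while the region data above starts at 0.
on_shank2       = startsWith(labels, 'shank2');
rows_zero_based = find(on_shank2) - 1;

% Only runs inside the tutorial session, where electrode_table is defined.
if exist('electrode_table', 'var') == 1
    shank2_region = types.hdmf_common.DynamicTableRegion( ...
        'table', types.untyped.ObjectView(electrode_table), ...
        'description', 'electrodes on shank 2', ...
        'data', rows_zero_based);
end
```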
Now create an ElectricalSeries object to hold acquisition data collected during the experiment.
electrical_series = types.core.ElectricalSeries( ...
'starting_time', 0.0, ... % seconds
'starting_time_rate', 30000., ... % Hz
'data', randn(12, 3000), ...
'electrodes', electrode_table_region, ...
'data_unit', 'volts');
This is the voltage data recorded directly from our electrodes, so it goes in the acquisition group.
nwb.acquisition.set('ElectricalSeries', electrical_series);

LFP

Local field potential (LFP) refers in this case to data that has been downsampled and/or filtered from the original acquisition data and is used to analyze signals in the lower frequency range. Filtered and downsampled LFP data would also be stored in an ElectricalSeries. To help data analysis and visualization tools know that this ElectricalSeries object represents LFP data, store it inside an LFP object, then place the LFP object in a ProcessingModule named 'ecephys'. This is analogous to how we stored the SpatialSeries object inside of a Position object and stored the Position object in a ProcessingModule named 'behavior' earlier.
electrical_series = types.core.ElectricalSeries( ...
'starting_time', 0.0, ... % seconds
'starting_time_rate', 1000., ... % Hz
'data', randn(12, 100), ...
'electrodes', electrode_table_region, ...
'data_unit', 'volts');
lfp = types.core.LFP('ElectricalSeries', electrical_series);
ecephys_module = types.core.ProcessingModule(...
'description', 'extracellular electrophysiology');
ecephys_module.nwbdatainterface.set('LFP', lfp);
nwb.processing.set('ecephys', ecephys_module);
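The tutorial fills the LFP series with random numbers. To illustrate what "filtered and downsampled" could mean in practice, here is a rough sketch that derives a 1 kHz signal from 30 kHz data with a moving average followed by decimation. The moving average is a crude low-pass filter chosen because it needs no toolbox; it is an assumption for illustration, not the method used by the tutorial authors.

```matlab
% Sketch: derive LFP-like data from raw data by smoothing and downsampling.
raw_rate = 30000;                 % Hz, as in the acquisition ElectricalSeries
lfp_rate = 1000;                  % Hz, as in the LFP ElectricalSeries
raw_data = randn(12, 3000);       % channels x time, fake raw voltage

decimation = raw_rate / lfp_rate; % keep every 30th sample

% Moving average along the time dimension acts as a simple low-pass filter.
smoothed = movmean(raw_data, decimation, 2);

% Keep one sample per LFP period.
lfp_data = smoothed(:, 1:decimation:end);

fprintf('LFP data size: %d channels x %d samples\n', size(lfp_data, 1), size(lfp_data, 2));
```

With these sizes the result has the same 12 × 100 shape as the random LFP data used above.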

Spike times

Ragged arrays

Spike times are stored in another DynamicTable of subtype Units. The default Units table is at /units in the HDF5 file. You can add columns to the Units table just like you did for electrodes and trials. Here, we generate some random spike data and populate the table.
num_cells = 10;
firing_rate = 20;
spikes = cell(1, num_cells);
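for icell = 1:num_cells
    % Draw a Poisson-distributed number of values for this cell. Each value is
    % an independent exponential sample with mean 1/firing_rate seconds.
    % poissrnd and exprnd require the Statistics and Machine Learning Toolbox.
    n_spikes = poissrnd(20);
    spikes{icell} = exprnd(1/firing_rate, 1, n_spikes);
end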
spikes
spikes = 1×10 cell
  Columns 1 through 5
    {1×28 double}    {1×25 double}    {1×22 double}    {1×23 double}    {1×18 double}
  Columns 6 through 10
    {1×21 double}    {1×18 double}    {1×29 double}    {1×15 double}    {1×20 double}
Spike times are an example of a ragged array: it is like a matrix, but each row has a different number of elements. We can represent this type of data as an indexed column of the units DynamicTable. An indexed column has two components: the vector data object that holds the concatenated data, and the vector index object that holds the positions in that vector where each row ends. You can use the convenience function util.create_indexed_column to create these objects.
[spike_times_vector, spike_times_index] = util.create_indexed_column(spikes);
nwb.units = types.core.Units( ...
'colnames', {'spike_times'}, ...
'description', 'units table', ...
'id', types.hdmf_common.ElementIdentifiers( ...
'data', int64(0:length(spikes) - 1) ...
), ...
'spike_times', spike_times_vector, ...
'spike_times_index', spike_times_index ...
);
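To make the indexed-column layout concrete, the sketch below builds the two components by hand with plain MATLAB arrays and reads one row back. It illustrates the idea only and does not reproduce the internals of util.create_indexed_column; the three short rows are made-up values.

```matlab
% Sketch: a ragged array stored as concatenated data plus row-end indices.
rows = {[0.1 0.2 0.3], 0.5, [0.7 0.9]};   % made-up spike times for 3 units

% Vector data: all rows concatenated into one flat vector.
vector_data = [rows{:}];

% Vector index: position of the last element of each row in vector_data.
vector_index = cumsum(cellfun(@numel, rows));

% Read one row back: it starts right after the previous row ends.
row = 3;
if row == 1
    first = 1;
else
    first = vector_index(row - 1) + 1;
end
last       = vector_index(row);
row_values = vector_data(first:last);
```

Here vector_index is [3 4 6], so the third row spans elements 5 to 6 of vector_data.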

Write the file

nwbExport(nwb, 'ecephys_tutorial.nwb')

Reading NWB data

Data arrays are read lazily from the file. Accessing TimeSeries.data on a file opened with nwbRead does not read the data values; it returns a DataStub object that can be indexed to read data on demand. This lets you work conveniently with datasets that are too large to fit in RAM all at once. Calling load on the DataStub with no input arguments reads the entire dataset:
nwb2 = nwbRead('ecephys_tutorial.nwb');
nwb2.processing.get('ecephys'). ...
nwbdatainterface.get('LFP'). ...
electricalseries.get('ElectricalSeries'). ...
data.load;
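After nwbRead, some properties come back as DataStub objects while others are ordinary MATLAB values, as the printed file contents earlier show. A small guard such as the sketch below loads a value only when it is a DataStub. The variable value is a stand-in for any property read from the file; here it is set to an ordinary matrix so that the snippet runs on its own.

```matlab
% Sketch: load a property only if it is a lazily read DataStub.
value = magic(4);   % stand-in for a property read from an NWB file

% isa returns false for ordinary arrays, so plain values pass through as-is.
if isa(value, 'types.untyped.DataStub')
    value = value.load();   % read the full dataset into memory
end

disp(size(value));
```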

Accessing data regions

If all you need is a data region, you can index a DataStub object like you would any normal array in MATLAB, as shown below. When indexing the dataset this way, only the selected region is read from disk into RAM. This allows you to handle very large datasets that would not fit entirely into RAM.
% read section of LFP
nwb2.processing.get('ecephys'). ...
nwbdatainterface.get('LFP'). ...
electricalseries.get('ElectricalSeries'). ...
data(1:5, 1:10)
ans = 5×10
    0.3238    0.0568    0.1861   -1.0841    0.4015    1.5673    0.5038    0.9148    0.6305   -1.8499
   -0.2572    1.1631   -0.2843   -2.0889   -0.6124   -0.0981   -0.1522   -0.6394   -1.9411    0.2052
   -1.0528   -0.1458    0.1278    1.6205    1.2195   -0.3119    0.8782   -0.0532   -0.5970    1.4307
   -1.3472   -1.0314   -0.7384    0.0003    1.1219    0.0319   -1.0600    0.4216   -1.6961    0.5677
   -0.5865    2.4174   -0.6169   -0.6104   -0.8354   -0.3832    0.4795   -1.4911    0.3497   -0.2414
% Use the utility function util.read_indexed_column to read the spike times
% of a specific unit (here, the first row of the units table).
util.read_indexed_column( ...
nwb.units.spike_times_index, nwb.units.spike_times, 1)
ans = 28×1
    0.0192
    0.0151
    0.0079
    0.0401
    0.0393
    0.0343
    0.0657
    0.0066
    0.0230
    0.0292
    ⋮

Learn more!

See the API documentation to learn what data types are available.

MATLAB tutorials

Python tutorials

The Python tutorials give more details about specific data types and cover advanced NWB topics.