%A Matlab script that creates a NetCDF file for use in Hy3S %This script will create the large-scale system benchmark used in the BMC %bioinformatics paper and will also give you a good example of how NetCDF %files are created. %This script uses the MexCDF project's NetCDF Toolbox for Matlab. Do not %confuse the Toolbox with MexCDF's other project, the SNC Tools. They both %do the same thing, but in slightly different ways. Once you know how to %use one, using the other is straightforward. %Comments, Questions? Email me at salis at cems.umn.edu clear filename = 'OurBenchmarkModel.nc'; %You can change this. Mfast = 10; %And this too. Mslow = 10; %Also this. %------------------------------------------------------------------------- %First, we'll summarize the structure of the NetCDF file needed by Hy3S. %A NetCDF file is composed of dimensions, variables, and attributes. %Each variable can be either a scalar (no dimensions) or an %array with multiple dimensions. Variables have a type %(integer, single float, double float, etc). %Attributes are meta-data that can be used to self-describe the contents of the file. Attributes can %be defined in the global space or on any variable. %The following are the names of the NetCDF dimensions/variables/attributes. %Note: For convenience, they can be different than the names of variables within this Matlab %script. %List of needed NetCDF dimensions: %NumTimePoints -- number of timepoints in each trial %NumTrials -- number of independent trials %NumSpecies -- total number of species in model %NumSaveSpecies -- number of species whose trajectories are being saved to disk %NumReactions -- number of reactions in model %NumModels -- total number of models (this is usually the "unlimited" record dimension) %NumPerturbations -- number of system perturbations. %NumVariations -- number of variations of kinetic parameters or initial conditions. Used by the GUI, but not by the simulation programs. %NumMaxDepList -- maximum number of species/kinetic parameters in each rate law (default is 6) %NumMaxStoichList -- maximum number of species/stoichiometric coefficients in each reaction (default is 25) %StringLen -- length of a string (default is 72) %DataLen -- length of data in arrays of misc. simulation parameters. %List of needed NetCDF variables and their types: %int ExpType (scalar) -- the Experiment Type of the NetCDF file. 1 => Single Model. 2 => Multi-Model. %double TStart (scalar) -- begin time (seconds) %double TEnd (scalar) -- end time (seconds) %double SaveTime (scalar) -- amount of time between each savepoint of a trajectory to disk. (seconds) %double Volume (scalar) -- the initial volume (Liters) %double CellGrowthTime (scalar) -- the average amount of time between cell divisions (minutes) %double CellGrowthTimeSD (scalar) -- the standard deviation of the time between cell divisions (minutes) %int NumModels (scalar) -- the number of models currently in the NetCDF file. %int LastTrial (scalar) -- the last trial simulated. %int LastModel (scalar) -- the last model simulated. %int MaxNumModels (scalar) -- The maximum number of models allowed in the file. (Not currently used for anything.) %character Reaction_names (NumReactions x StringLen) -- reaction names %int Reaction_Rate_Laws (NumReactions x 1) -- each rate law is designated by an integer. Each reaction then has a rate law number. %int Reaction_StoichSpecies (NumReactions x NumMaxStoichList) -- list of species affected by each reaction's occurrence %int Reaction_StoichCoeff (NumReactions x NumMaxStoichList) -- list of stoichiometric coefficients of each reaction %int Reaction_DepList (NumReactions x NumMaxDepList) -- list of species present in the rate law of each reaction %int Reaction_StoichListLen (NumReactions x 1) -- length of StoichSpecies & StoichCoeff %int Reaction_DListLen (NumReactions x 1) -- length of DepList %int Reaction_OptionalData (NumReactions x 1) -- Used to store the number of steps in a gamma-distributed reaction. %double Reaction_Rate_Constants -- contains the kinetic parameters of each reaction % If ExpType = 1 then dimensions are (NumReactions x NumMaxDepList) % If ExpType = 2 then dimensions are (NumModels x NumReactions x NumMaxDepList) %int SpeciesIC -- contains the initial conditions for each species % If ExpType = 1 then dimensions are (NumSpecies x 1) % If ExpType = 2 then dimensions are (NumModels x NumSpecies) %character Species_names (NumSpecies x StringLen) -- species names %int SpeciesSplitOnDivision (NumSpecies x 1) -- 0 if species is not halved on cell division. 1 if it is halved. %int SaveSpeciesData (NumSpecies x 1) -- 0 if species is not saved to disk. 1 if it is saved to disk. %int VariationIDs (NumVariations x DataLen) -- stores integer numbers related to encoding how the kinetic parameters / initial conditions are %varied %double VariationData (NumVariations x DataLen) -- stores double prec. %numbers related to encoding how the kinetic parameters / initial conditions are varied. %OUTPUT variables %double Time (NumTimePoints x 1) -- stores the times when the trajectories are saved to disk. %double State -- stores the numbers of molecules of each saved species when the trajectories are saved to disk. % If ExpType = 1 then dimensions are (NumTrials x NumTimePoints x NumSpecies) % If ExpType = 2 then dimensions are (NumModels x NumTrials x NumTimePoints x NumSpecies) %Global Attributes %Data_Written -- If 0, no data has been written. 1, otherwise. When %running a Hy3S program, if the Data_Written attribute is set to 1 then the '-OV' flag is required to overwrite the data. %Model_Description -- string describing the model. %Creation -- creation date. %Last_Modified -- last modified date. %Author_Info -- author information %Format_Version -- format version (1.3 is the current one.) %------------------------------------------------------------------------- %Now, we create the data that we will be writing to the file. %Each variable is read by Hy3S and used to simulate the stochastic %dynamics of the model. %The Experiment Type ExpType = 1; %This indicates that this NetCDF file is a 'Single model' file. %ExpType = 2; %This indicates that this NetCDF file is a 'Multi-model' %file. %Time-related variables TStart = 0.000; TEnd = 200.000; SaveTime = 10.000; TimePoints = floor( (TEnd - TStart) / SaveTime) + 1; %Volume-related variables Vo = 1e-15; CellGrowthTime = 0.0; CellGrowthTimeSD = 0.0; %Some misc. simulation parameters. NumModels = 1; MaxNumModels = NumModels; LastModel = 0; LastTrial = 0; StringLen = 72; MaxSpeciesList = 25; MaxDepList = 6; %Now we create the stoichiometric matrix of the reactions, set the rate %laws of the reactions, and their kinetic parameters. %We are creating a large-scale system benchmark with Mfast fast/continuous %reactions and Mslow slow/discrete ones. %Total number of reactions. M = Mfast + Mslow; %Number of trials NumTrials = 1; %You can change this too. %To keep things simple, we'll have each reaction be 2nd order bi-molecular %or 'A + B --> C'. A, B, and C will be different species. To make sure the %dependency graph is sufficiently non-sparse (or that would be cheating!!!) %we will round-robin through the number of species to create a system of %coupled fast + slow reactions. We do this by creating the system of fast %reactions like 'S1 + S2 --> S3', 'S4 + S5 --> S6', and so on until %'S_Rf-2 + S_Rf-1 --> S_Rf'. And then coupling the slow reactions to the %fast ones by adding reactions like 'S1 + S_Rf+1 --> S_Rf+2' and %'S2 + S_Rf+3 --> S_Rf+4'. Rf is the number of fast/continuous reactions. %Total number of species will be ... N = 3*Mfast + 2*Mslow; %Each reaction has a stoichiometric vector of coefficients, a rate law %number, one or more kinetic parameters, and a list of species which the %rate law depends on. We allow the stoichiometric vector to be different %than the list of species in the rate law because we would also like to %include approximate reaction mechanisms, like Michaelis Menten kinetics. %The rate law for a 2nd order bi-molecular reaction is stored as #3 in the %Hy3S program. The full list of rate laws is in ratelaws.f90. %To keep things interesting, we'll also round robin through a list of %kinetic parameters. There is only one kinetic parameter per reaction for %the 2nd order bi-molecular rate law so we only need to pick one the %following list. %**The reaction kinetics are stored in macroscopic units** %They will be converted to 'mesoscopic' (in terms of molecules) within the %program. kfastconstant(1:10) = [15.06 30.11 60.22 40.0 50.0 60.0 70.0 80.0 90.0 100.0]; kslowconstant(1:10) = [60.22 60.22 30.0 40.0 50.0 60.0 70.0 80.0 90.0 100.0]; %We also want to set the initial conditions for the species. %The reactant and product species for the fast reactions must be sufficiently numerous. Xo_fast_reactants = 1e6; %Very large Xo_fast_products = 100; %Just large enough to make the approximation. %The slow reactions include 1 reactant that is affected by the fast %reactions and 1 that is not. Its product species can be initialized to 0 %molecules. We set the initial conditions for the species not affected by fast reactions below. Xo_slow_reactants = 100; Xo_slow_products = 0; %The loop that creates the stoichiometric matrix (SM), the list of species %affected by each reaction (SLIST), the list of species appearing in each rate law (DLIST), %setting the kinetic constants (C) of each reaction, the rate laws for each reaction (MType), %the initial conditions for each species (Xo), and the names of each species (Species_names) %and reaction (Reaction_names). %We should initialize the data variables. SList = zeros(M, MaxSpeciesList); SM = zeros(M, MaxSpeciesList); DList = zeros(M, MaxDepList); c = zeros(M, MaxDepList); MType = zeros(M,1); OptionalData = zeros(M,1); Xo = zeros(N,1); Species_names = []; Reaction_names = []; counter=0; %Loop over the fast/continuous reactions in the benchmark. for i=1:Mfast SList(i,1:3) = [counter+1 counter+2 counter+3]; %Si+1 + Si+2 --> Si+3, incremented. SList(i,4:MaxSpeciesList) = 0; %No other species used. SListLen(i) = 3; SM(i,1:3) = [-1 -1 1]; %Stoichiometric vector of coefficients for 'A + B --> C' SM(i,4:MaxSpeciesList) = 0; %No other species are used. DList(i,1:2) = [counter+1 counter+2]; %Only the first two species (Si+1 and Si+2) are used in the rate law. DList(i,3:MaxDepList) = 0; %No other species are used. DListLen(i) = 2; c(i,1) = kfastconstant(mod(i,length(kfastconstant))+1); %Round robin selection of the first kinetic parameter. c(i,2:MaxDepList) = 0.0; %No other kinetic parameters are used in the reaction. MType(i) = 3; %The rate law number for a 2nd order bi-molecular reaction. %Setting the initial conditions. Xo(counter+1) = Xo_fast_reactants; Xo(counter+2) = Xo_fast_reactants; Xo(counter+3) = Xo_fast_products; %The name of each species. We'll just number them 1 to N. Species_names = strvcat(Species_names,['S' num2str(counter+1) blanks(72 - length(num2str(counter+1)) - 1)]); Species_names = strvcat(Species_names,['S' num2str(counter+2) blanks(72 - length(num2str(counter+2)) - 1)]); Species_names = strvcat(Species_names,['S' num2str(counter+3) blanks(72 - length(num2str(counter+3)) - 1)]); %The name of each reactions. We'll just number them 1 to M. Reaction_names = strvcat(Reaction_names,['R' num2str(i) blanks(72 - length(num2str(i)) - 1)]); counter = counter + 3; %Increment the counter. end Fastcounter = 1; %Loop over the slow reactions in the benchmark. for i=Mfast+1:M SList(i,1:3) = [Fastcounter counter+1 counter+2]; SList(i,4:MaxSpeciesList) = 0; SListLen(i) = 3; SM(i,1:3) = [-1 -1 1]; SM(i,4:MaxSpeciesList) = 0; DList(i,1:2) = [Fastcounter counter+1]; DList(i,3:MaxDepList) = 0; DListLen(i) = 2; c(i,1) = kslowconstant(mod(i,length(kfastconstant))+1); c(i,2:MaxDepList) = 0.00; MType(i) = 3; Fastcounter = Fastcounter + 3; if (Fastcounter > 3 * Mfast) Fastcounter = 1; end Xo(counter+1) = Xo_slow_reactants; Xo(counter+2) = Xo_slow_products; %The name of each species. We'll just number them 1 to N. Species_names = strvcat(Species_names,['S' num2str(counter+1) blanks(72 - length(num2str(counter+1)) - 1)]); Species_names = strvcat(Species_names,['S' num2str(counter+2) blanks(72 - length(num2str(counter+2)) - 1)]); %The name of each reactions. We'll just number them 1 to M. Reaction_names = strvcat(Reaction_names,['R' num2str(i) blanks(72 - length(num2str(i)) - 1)]); end %We also need to create variables that describe ... %If the species split on division if cell division is turned on. SpeciesSplitOnDivision = zeros(N,1); %1 is yes. %If the trajectories of each species will be saved or not. SaveSpeciesData = ones(N,1); %1 is yes. %Ok, all of the data is set. %Now we open up the NetCDF file, define the dimensions and variables, and %then write the data into the variables. %Open the NetCDF file in 'clobber' mode -- this will overwrite an existing %file. nc = netcdf(filename,'clobber'); %Define all of the needed dimensions with their lengths nc('NumTrials') = NumTrials; nc('NumSpecies') = N; nc('NumReactions') = M; nc('NumTimePoints') = TimePoints; nc('NumModels') = 0; %Unlimited dimension has length of zero. Only one unlimited dimension is allowed per NetCDF file. nc('NumMaxDepList') = 6; nc('NumMaxStoichList') = 25; nc('StringLen') = StringLen; %Define all of the needed variables, their types, and their dimensions. %Then put the data into the variables. %** Use the squiggly brackets {} for variables. %Define Scalar variables nc{'TStart'} = ncdouble(); nc{'TEnd'} = ncdouble(); nc{'SaveTime'} = ncdouble(); nc{'Volume'} = ncdouble(); nc{'CellGrowthTime'} = ncdouble(); nc{'CellGrowthTimeSD'} = ncdouble(); nc{'ExpType'} = ncint(); nc{'LastTrial'} = ncint(); nc{'LastModel'} = ncint(); nc{'MaxNumModels'} = ncint(); nc{'NumModels'} = ncint(); %Put data into them nc{'TStart'}(:) = TStart; nc{'TEnd'}(:) = TEnd; nc{'SaveTime'}(:) = SaveTime; nc{'LastTrial'}(:) = LastTrial; nc{'LastModel'}(:) = LastModel; nc{'ExpType'}(:) = ExpType; nc{'MaxNumModels'}(:) = MaxNumModels; nc{'NumModels'}(:) = NumModels; nc{'Volume'}(:) = Vo; nc{'CellGrowthTime'}(:) = CellGrowthTime; nc{'CellGrowthTimeSD'}(:) = CellGrowthTimeSD; %Define variables with NumSpecies dimension nc{'SpeciesSplitOnDivision'} = ncint('NumSpecies'); nc{'SaveSpeciesData'} = ncint('NumSpecies'); %Put data into them nc{'SpeciesSplitOnDivision'}(:) = SpeciesSplitOnDivision; nc{'SaveSpeciesData'}(:) = SaveSpeciesData; %Define variables with NumReactions dimension nc{'Reaction_Rate_Laws'} = ncint('NumReactions'); nc{'Reaction_DListLen'} = ncint('NumReactions'); nc{'Reaction_StoichListLen'} = ncint('NumReactions'); nc{'Reaction_OptionalData'} = ncint('NumReactions'); %Put data into them nc{'Reaction_Rate_Laws'}(:) = MType; nc{'Reaction_DListLen'}(:) = DListLen; nc{'Reaction_StoichListLen'}(:) = SListLen; nc{'Reaction_OptionalData'}(:) = OptionalData; %Define 2D variables nc{'Reaction_StoichCoeff'} = ncint('NumReactions','NumMaxStoichList'); nc{'Reaction_StoichSpecies'} = ncint('NumReactions','NumMaxStoichList'); nc{'Reaction_DepList'} = ncint('NumReactions','NumMaxDepList'); nc{'Reaction_names'} = ncchar('NumReactions','StringLen'); nc{'Species_names'} = ncchar('NumSpecies','StringLen'); for i=1:N nc{'Species_names'}(i,:) = Species_names(i,:); end for i=1:M nc{'Reaction_names'}(i,:) = Reaction_names(i,:); nc{'Reaction_StoichCoeff'}(i,:) = SM(i,:); nc{'Reaction_StoichSpecies'}(i,:) = SList(i,:); nc{'Reaction_DepList'}(i,:) = DList(i,:); end %Define ExpType-dependent variables if (ExpType == 1) nc{'SpeciesIC'} = ncint('NumSpecies'); nc{'Reaction_Rate_Constants'} = ncdouble('NumReactions','NumMaxDepList'); %Put data into them. nc{'SpeciesIC'}(:) = Xo; nc{'Reaction_Rate_Constants'}(:,:) = c; else if (ExpType == 2) nc{'SpeciesIC'} = ncint('NumModels','NumSpecies'); nc{'Reaction_Rate_Constants'} = ncdouble('NumModels','NumReactions','NumMaxDepList'); %Put data into them. for i=1:NumModels nc{'SpeciesIC'}(:,i) = Xo; %or something else nc{'Reaction_Rate_Constants'}(i,:,:) = c; end end end %Define and put data into attributes. Attributes automatically determine %what type they should be. datestring = datestr(now,0); nc.Model_Description = 'Automatically Generated Large-scale Benchmark Model'; nc.Data_Written = 0; nc.Format_Version = '1.3'; nc.Creation = deblank(datestring); nc.Last_Modified = deblank(datestring); nc.Author_Info = 'Benchmark Maker code by Howard Salis'; %Close the file. close(nc); %Done!