In this paper we propose a methodology to determine the structure of the pseudo-stoichiometric coefficient matrix K in a mass-balance-based model, i.e. the number of biomasses that must be taken into account to reproduce an available data set. It consists of estimating the number of reactions needed to represent the main mass transfer within the bioreactor, which provides the dimension of K. The method is applied to data from an anaerobic digestion process and shows that a model including a single biomass is sufficient. We then apply the same method to synthetic data generated by the complex ADM1 model, showing that the main model features can be obtained with two biomasses.
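As an illustration of how the number of reactions could be estimated from data, the following is a minimal sketch and not necessarily the procedure used in the paper. It assumes that the measurements have already been corrected for transport terms, so that each column of a data matrix `U` is a linear combination of the columns of K, and it counts the singular directions needed to explain a chosen share of the variance. The function name `estimate_reaction_count`, the 95% threshold, the matrix `K_true`, and the random rates are all hypothetical choices made for this example.

```python
import numpy as np


def estimate_reaction_count(U, variance_threshold=0.95):
    """Smallest number of singular directions of U whose cumulative
    share of the total variance reaches variance_threshold.

    U has one row per measured variable and one column per sample.
    The threshold value is an assumption, not taken from the paper.
    """
    singular_values = np.linalg.svd(U, compute_uv=False)
    explained = np.cumsum(singular_values**2) / np.sum(singular_values**2)
    # First index at which the cumulative share reaches the threshold.
    return int(np.searchsorted(explained, variance_threshold) + 1)


# Synthetic illustration with a hypothetical K (4 variables, 2 reactions).
rng = np.random.default_rng(0)
K_true = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0],
                   [1.0, -1.0]])
rates = rng.uniform(0.0, 1.0, size=(2, 200))         # hypothetical reaction rates
noise = 0.01 * rng.standard_normal((4, 200))         # small measurement noise
U = K_true @ rates + noise

n_reactions = estimate_reaction_count(U)
print("estimated number of reactions:", n_reactions)
```

In this synthetic case the estimate should match the two columns of the hypothetical K. On real data the result depends on the threshold and on the noise level, so the count is best read together with the full singular value spectrum.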