The brain is the most mysterious and intricate biological structure known to man. It governs the way we live, think, reason and behave. It is built of billions of neurons that communicate with each other through trillions of connections using electrical signals. The neocortex is undoubtedly the most complex brain region, hosting all higher brain functions such as sensory perception and cognitive behavior. Scientists have struggled for more than a century to identify the anatomical blueprint of its cellular organization and to understand its dynamic behavior. The Blue Brain Project aims to develop the first simulation-based research environment for modeling and studying the basic building block of the neocortex, the neocortical column. The model column is based on data obtained from the juvenile rat somatosensory cortex. The best way to build such a large-scale model is to follow a data-driven, constraint-based approach that integrates biological information at different levels. In this thesis, I present novel informatics approaches to systematically constrain the membrane protein composition of different neocortical neuron types. In the first study, I investigate the single-cell gene expression patterns of ion channels in different neuron types and present a combinatorial rule extractor model that reverse engineers the expression rules observed in ten different neocortical neuron types. In the second study, I present a physical constraint checker that mines publicly available data resources on all known membrane proteins, such as channels and receptors, and extracts the available information on their specific locations on different parts of the neuronal morphology, their 3D molecular structures, and their genetic and proteomic annotations.
Combining the results from both studies allows us to build neuronal models that better reflect the biological reality of neocortical neurons. The combined constraints ensure the correct localization and distribution of membrane proteins across neuronal morphologies, and they bound the maximum number of copies of each protein from its molecular surface area and the available neuronal membrane surface area. This novel data-driven, constraint-based approach is a first step towards integrating existing biological knowledge and constraints from various data sources into the process of building neuron models, bringing the models closer to the biology of real neurons.
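The surface-area bound mentioned above can be illustrated with a minimal sketch. It is not the constraint checker described in this thesis: the function name, the parameter names, and the `packing_fraction` parameter are assumptions introduced only for illustration, and the sketch simply divides the usable membrane area by the area occupied by one protein.

```python
import math


def max_protein_count(membrane_area_um2, protein_footprint_nm2, packing_fraction=1.0):
    """Upper bound on how many copies of one protein fit on a membrane patch.

    All names are hypothetical. packing_fraction is an assumed fraction of the
    membrane that proteins may occupy (1.0 means the whole area is available).
    """
    if membrane_area_um2 < 0:
        raise ValueError("membrane area must be non-negative")
    if protein_footprint_nm2 <= 0:
        raise ValueError("protein footprint must be positive")
    if not 0 < packing_fraction <= 1:
        raise ValueError("packing fraction must be in (0, 1]")

    # Convert square micrometres to square nanometres (1 um^2 = 1e6 nm^2).
    membrane_area_nm2 = membrane_area_um2 * 1e6

    # Only whole proteins count, so round down.
    return math.floor(membrane_area_nm2 * packing_fraction / protein_footprint_nm2)


# Illustrative values only, not measurements from the thesis.
print(max_protein_count(1.0, 100.0))
print(max_protein_count(1.0, 100.0, packing_fraction=0.5))
```

A real checker would also have to account for the membrane area shared among all protein types and for compartment-specific localization, which this sketch ignores.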