With the ever-growing data sizes along with the increasing complexity of the modern problem formulations, contemporary applications in science and engineering impose heavy computational and storage burdens on the optimization algorithms. As a result, there is a recent trend where heuristic approaches with unverifiable assumptions are overtaking more rigorous, conventional methods at the expense of robustness and reproducibility. My recent research results show that this trend can be overturned when we jointly exploit dimensionality reduction and adaptivity in optimization at its core. I contend that even the classical convex optimization did not reach yet its limits of scalability. Many applications in signal processing and machine learning cast a fitting problem from limited data, introducing spatial priors to be able to solve these otherwise ill-posed problems. Data is small, the solution is compact, but the search space is high in dimensions. These problems clearly suffer from the wasteful use of storage resources by the classical optimization algorithms. This problem is prevalent. Storage is a critical bottleneck that prevents us from solving many important large-scale optimization problems. For example, semidefinite programs (SDP) often have low-rank solutions. However, SDP algorithms require us to store a matrix decision variable in the ambient dimensions. This dissertation presents a convex optimization paradigm which makes it possible to solve trillion dimensional SDP relaxations to combinatorial decision problems on a regular personal computer. The key idea is to use an optimization procedure that performs compact updates, and to maintain only a small sketch of the decision variable. Once the iterations terminate, we can recover an approximate solution from the information stored in this sketch. We start by formalizing this idea for a model problem with low-rank solutions. Based on the classical conditional gradient method, we propose the first convex optimization algorithm with optimal storage guarantees for this model problem. Then, we develop and describe the key ingredients to extend our results for a broader set of problems, SDP in particular. SDP formulations are key in signal processing, machine learning, and other engineering applications. By greatly enhancing the scalability of optimization algorithms for solving SDP formulations, we can potentially unlock new results in important scientific applications.