The holy grail for a domain-specific language (DSL) is to be friendly and fast. A DSL should be friendly in the sense that it is easy to use by DSL end-users, and easy to develop by DSL authors. DSLs can be developed as entirely new compilers and ecosystems, which requires tremendous effort and often requires DSL authors to reinvent the wheel. Or, DSLs can be developed as libraries embedded in an existing host language, which requires significantly less effort. Embedded DSLs (EDSLs) manifest a trade-off between being friendly and fast as they stand divided in two groups: - Deep EDSLs trade user experience of both DSL authors and DSL end-users, for improved program performance. As proposed by Elliott et al., deep EDSLs build an intermediate representation (IR) of a program that can be used to drive domain-specific optimizations, which can significantly improve performance. However, this program IR introduces significant usability hurdles for both the DSL authors and DSL end-users. Correctly transforming programs is difficult and error prone for DSL authors, while error messages can be cryptic and confusing for DSL end-users. - Shallow EDSLs trade program performance for good user experience. As proposed by P. Hudak, shallow EDSLs omit construction of the IR, i.e., they are executed directly in the host language. Although friendly to both DSL end-users and DSL authors, they can not perform domain-specific optimizations and thus exhibit inferior performance. This thesis makes a stride towards achieving both (1) good user experience for DSL authors and DSL end-users, as well as (2) enabling domain-specific optimizations for improved performance. It unites shallow and deep DSLs by defining an automatic translation from end-user-friendly shallow DSLs, to better-performing deep DSLs. The translation uses reflection of the host language to cherry-pick the best of both shallow and deep EDSLs. During program development, a DSL end-user is presented with user friendly features of a shallow EDSL. Before execution in a production environment, programs are reliably translated into the high-performance deep EDSL with equivalent semantics. Since maintaining both shallow and deep EDSLs is difficult, the thesis further shows how to reuse the shallow-to-deep translation to automatically generate deep EDSLs based on shallow EDSLs. With automatic generation of the deep EDSL, the DSL author is required only to develop a shallow EDSL and the domain-specific optimizations for the deep EDSL that she deems useful. Finally, the thesis discusses a new programming abstraction that eases the development, of a specific kind, of deep EDSLs that are compiled in two stages. The new abstraction simplifies management of dynamic compilation in two-stage deep EDSLs.