A common technique to extend linear prediction to nonstationary signals is time segmentation: the signal is split into small portions and the modelization is carried out locally. The accuracy of the analysis is, however, dependent on the window size and on the signal characteristics, so that the problem of finding a good segmentation is crucial to the entire modeling scheme. In this paper, we present an algorithm which determines the optimal segmentation with respect to a cost function relating prediction error to modeling cost. The proposed approach casts the problem in a rate/distortion (R/D) framework, whereby the segmentation is implicitly computed while minimizing the modelization distortion for a given modelization cost. The algorithm is implemented by means of dynamic programming and takes the form of a trellis-based Lagrangian minimization. The optimal linear predictor, when applied to speech coding, dramatically reduces the number of bits per second devoted to the modeling parameters in comparison to fixed-window schemes