Due to the steady tendency to propose highly customized products and to respond to volatile (i.e random) demands, Flexible Manufacturing Systems (FMS) are now present in most shopfloors. In this thesis, flexibility in a FMS is understood as the ability of a single production cell to deliver several different types of items, say N. The production capacity is usually limited in the sense that only one or K < N type(s) of items can be simultaneously produced. Each type of item faces a specific market demand which often shows random fluctuations. To insure a high reactivity when facing such random demands, efficient production rules for the FMS are mandatory. These rules include in particular two generic entities, namely: Finished Goods Inventories. The fluctuations of i) the production flows, (due to failure-prone machines) and ii) the demand flows, can be partly absorbed by the presence of Finished Good Inventories (FIG). Such storage zones incur costs especially when serving highly customized demands. Clearly, the balance between the advantage of high reactivity on the one hand and storage costs on the other introduces complex optimization issues. The optimal solution will generally include hybrid production rules, i.e. certain types of products optimally require FIG (we call this strategy make-to-stock production), while other types require no FIG (we call this strategy make-to-order production). Dynamic Scheduling rules. The intrinsic presence of fluctuations implies that simple deterministic scheduling rules (as for example deterministic polling rules which produce each type j of items periodically during a fixed period of time Tj) may lead to a very poor performance. Clearly, an optimal production schedule will be based on both past experience and observation of the present state of the system (i.e the populations of the FIG and the instantaneous rates of the demand and production flows). Hence, any optimal scheduling rule will necessarily present a time adaptive character (i.e. real time scheduling rules). In spite of a growing usage of FMS in industry, the general problem of determining the optimal dynamic scheduling of flexible manufacturing systems remains, in its full generality, an open issue of operations research. In order to give some answers to the question of optimal scheduling, the present thesis will discuss two mathematical models known as "Multi-Armed Bandit Problem" (MABP) and "Restless Bandit Problem" (RBP) in terms of which the FMS can be modeled. Let us briefly recall the salient features of the Bandit formalization. In its basic version, the MABP considers a series of N stochastic processes (also called projects) evolving in parallel. At each time t, a decision maker (DM) can engage at most one project (this feature reflects the limited resource property). The engaged project generates an instantaneous cost while the disengaged projects incur no cost and remain fixed ("frozen" dynamic rule of the disengaged projects). The optimal scheduling problem in the MABP consists in choosing at each time t which project to engage in order to minimize the global cost over a given time horizon. An exact solution of the basic MABP has been given in 1974 by Gittins. The solution is based on the construction of a set of N priority indices (the so-called Gittins indices) which assign to each project an "urgency function" in terms of which the optimal dynamic scheduling reads: "At each instant, engage the item exhibiting the smallest priority index value". To use the MABP formalization in the production engineering context, the basic hypothesis of the "frozen" dynamics needs to be loosened. It is indeed mandatory to allow for the following features: Disengaged projects evolve in time and do incur costs. This generalization is necessary to reflect the fact that demands for items not currently produced continue to accrue and their FGI obviously also incurs costs. The assumptions in a) lead us to study the so-called "Restless Bandit problems" for which the optimal scheduling rule is yet unknown. As it has been noted in the (scarce) literature available, a naive generalization of the priority indices will definitely not yield the optimal rule. However, close to optimal solutions can still be expressed in terms of suitably generalized priority indices. The use of such indices has the determining advantage of leading to very simple --though sub-optimal-- scheduling policies. This is indeed an essential feature of the production applications we have in mind. Hence, the construction of simple solvable models of optimal scheduling rules, will help to develop the perception needed to construct reliable heuristics applicable to the general problems. Accordingly, the general approach adopted in the present thesis is: Construct explicitly solvable classes of Bandit problems (Classical and Restless). Develop a heuristic to approach the problem of FMS and tests its validity on the basis of simulation studies. Moreover, as flexibility in a FMS generally generate setup penalties (such as the need for additional workforce incurring additional costs or cleaning operations imposing switching time delay) we will further: Construct an explicitly solvable class of MABP with setup penalties. Develop a heuristic to approach the general problem of MABP with setup costs and test its validity on the basis of the simple models introduced in iii). Our original contributions are: Explicit computation of the Gittins index for MABP (without switching penalties). We were able to compute explicitly the form of the Gittins indices when the evolution is given by a piecewise deterministic process which is intrinsically non-Markovian. This is among the few classes of non-Markovian examples in the literature for which the Gittins indices can be computed explicitly (M.-O. Hongler and F. Dusonchet, "Optimal stopping and Gittins indices for piecewise deterministic evolution process", Discrete Events Systems (2001) (11), 235--248). Explicit treatment of the Restless Multi-Armed Bandit process. We studied several underlying random dynamics relevant for the production engineering context, (e.g. diffusion processes as well as birth and death processes). We obtained explicit generalized priority indices and the resulting dynamic scheduling was compared with exact results derived numerically by A. Ha, (Oper. Res. 45, 1994, 42-53). Finally, using the RBP, we propose a sub-optimal heuristic solving the multi-items make-to-stock production problem (F. Dusonchet and M.-O. Hongler, "Continuous Time Restless Bandit and Dynamic Scheduling for Make-to-Stock Production", accepted for publication by IEEE Trans. on Robo. and Auto., (2003)). Construction of a sub-optimal heuristic for the MABP with setup penalties. Adding setup penalties to optimal control problems singularly increases the complexity of the solution. Therefore, very few results presently exist and the optimal decision problem with setups remains mostly unexplored. Our effort concentrates on the MABP with setup penalties and significant progress has been made by constructing a new class of MABP with setups for which the optimal policy can be explicitly constructed by recursion. Using this optimal derivation, we then propose a heuristic, approaching the optimal policy for general MABP with switching penalties (F. Dusonchet and M.-O. Hongler, "Optimal Policy for Deteriorating Two-Armed Bandit Problems with Switching Costs", accepted by Automatica, (2003)).