We present a partially connected recurrent network applied to the problem of load forecasting. Many authors have pointed out that recurrent networks can model NARMAX processes; here we present a construction scheme for the moving-average (MA) part. In addition, we present a modification of the learning step that improves both learning convergence and forecast accuracy. Finally, a continuous learning scheme and a robust learning scheme, both of which proved necessary when using an MA part, enable us to reach good forecast precision compared with the model currently in use at the utility.