Scaling robots down to miniature size introduces many new challenges, including limited memory and program size, low processor performance, and limited power autonomy. In this paper we describe the concept and implementation of learning a safe-wandering task on the autonomous Alice micro-robots. We propose a simplified reinforcement learning algorithm, based on one-step Q-learning, that is optimized for speed and memory consumption. The algorithm uses only integer-based sum operators and avoids floating-point and multiplication operators. Finally, the quality of learning is compared with that of a floating-point based algorithm.
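
To make the idea of a multiplication-free, integer-only update concrete, the following is a minimal sketch of a one-step Q-learning update in C. It is not the algorithm proposed in this paper: it is an illustrative variant that assumes the learning rate and discount factor are restricted to powers of two, so that both can be applied with bit shifts in addition to sums. The table dimensions, the names `q_table`, `q_update`, `q_max` and `shift_toward_zero`, and the constants `ALPHA_SHIFT` and `GAMMA_SHIFT` are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* All sizes and constants below are assumptions made for illustration. */
#define N_STATES    8  /* hypothetical number of discretized sensor states */
#define N_ACTIONS   3  /* hypothetical actions, e.g. forward, left, right  */
#define GAMMA_SHIFT 3  /* discount factor  = 1 - 2^-3 = 0.875 (assumed)    */
#define ALPHA_SHIFT 2  /* learning rate    = 2^-2     = 0.25  (assumed)    */

/* Integer Q-table; static storage is zero-initialized. With |reward| kept
 * small (say below 1000), values stay well inside the int16_t range. */
static int16_t q_table[N_STATES][N_ACTIONS];

/* Largest Q-value available in state s. */
static int16_t q_max(uint8_t s)
{
    int16_t best = q_table[s][0];
    for (uint8_t a = 1; a < N_ACTIONS; ++a) {
        if (q_table[s][a] > best) {
            best = q_table[s][a];
        }
    }
    return best;
}

/* Divide by 2^k, rounding toward zero. Shifting the magnitude avoids the
 * implementation-defined right shift of negative signed values in C. */
static int32_t shift_toward_zero(int32_t v, unsigned k)
{
    return (v >= 0) ? (v >> k) : -((-v) >> k);
}

/* One-step update Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a)),
 * written with sums, subtractions and shifts only. */
void q_update(uint8_t s, uint8_t a, int16_t reward, uint8_t s_next)
{
    assert(s < N_STATES && a < N_ACTIONS && s_next < N_STATES);

    int32_t next = q_max(s_next);
    /* gamma * next computed as next - next / 2^GAMMA_SHIFT */
    int32_t target = reward + next - shift_toward_zero(next, GAMMA_SHIFT);
    int32_t delta = target - q_table[s][a];

    q_table[s][a] = (int16_t)(q_table[s][a]
                              + shift_toward_zero(delta, ALPHA_SHIFT));
}
```

A caller would invoke `q_update(s, a, reward, s_next)` once per control step, after observing the reward and the next state. Restricting both constants to powers of two trades tuning freedom for a very small instruction footprint, which is the kind of compromise a memory-constrained micro-robot may require.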