We propose a technique to localize computation in Instruction Set Extensions (ISEs) that are clocked at very high speed with respect to the processor. In order to save power, data to and from Custom Instruction Units (CIUs) is synchronized via an optical signal that is detected through a Single-Photon Avalanche Diode (SPAD) capable of timing uncertainties as low as 50 picoseconds. The CIUs comprise a free-standing local oscillator serving a computing area of a few tens of square micrometers, thus resulting in extremely reduced power dissipations, since the distribution of a high frequency clock over long distances is avoided. This approach is based on the globally asynchronous locally synchronous concept, whereby the granularity of the local domains is reduced to a minimum, thus enabling extremely high local clock frequencies and low power, while minimizing substrate noise injection and intra-chip interference. Thanks to this approach we can free ourself from expensive synchronization techniques such as FIFOs, delays, or flip-flop based synchronizers by creating fixed synchronization points in time where data can be exchanged. The paradigm is demonstrated on a chip designed and fabricated in a standard 90nm CMOS technology. A full characterization demonstrates the suitability of the approach.