A mixed-signal computer architecture and its application to power system problems
To my family.

Ἐν ἄρχῃ ἦν ὁ Λόγος.
— Ἰω. 1:1
Acknowledgements

First of all, I would like to thank my supervisors Maher Kayal and Rachid Cherkaoui. They gave me the opportunity to conduct my research “the old way”, in the sense of providing me with suggestions and tips along the way rather than constraining me to a fixed target. It is this kind of freedom that allows the investigation of “crazy ideas”, and it is thanks to them that this work was made possible.

I would equally like to thank Professor Thierry Van Cutsem of the University of Liège and his group for their exemplary hospitality during my stay with them in Liège. Particular mention goes to my dear friend and colleague Petros Aristidou for all our fruitful discussions and his invaluable help.

A big part of what you read in this manuscript was done in collaboration with colleagues from my lab and elsewhere. Sometimes the collaboration was so tight that it is really difficult to say who contributed what. I would like to specially mention Guillaume Lanz, Denis Sallin and Georgios Lilis, each one deeply dedicated to his field and always willing to share and help.

Apart from the above, a number of people have contributed their bit to the scientific part of this work. Be it a discussion over a coffee, a review of a paper, a coding idea, or some useful small talk over a pint of beer: to all of you, I am grateful.

I would also like to thank all the people who have made my days here in Switzerland happy and cheerful: my office-, flat-, and team-mates, and all my amazing Greek and Swiss friends.

I would finally like to thank my family, for their constant support and for the happiness they bring to my life.

Lausanne, 4 June 2015

Θ. K.
Abstract

Radical changes are taking place in the landscape of modern power systems. This massive shift in the way the system is designed and operated has been termed the advent of the "smart grid". One of its implications is a strong market pull for faster power system analysis computing. This work concerns in particular transient simulation, which is one of the most demanding power system analyses. This refers to the imitation of the operation of the real-world system over time, for time scales that cover the majority of slow electromechanical transient phenomena. The general mathematical formulation of the simulation problem includes a set of non-linear differential algebraic equations (DAEs).

The algebraic part of this set involves heavy linear algebra computations, which are related to the admittance matrix of the network topology. These computations are a critical factor in the overall performance of a transient simulator.

This work proposes the use of analog electronic computing as a means of exceeding the performance barriers of conventional digital computers for the linear algebra operations. Analog computing is integrated into the framework of a power system transient simulator, yielding significant computational performance benefits to the latter. Two hybrid (analog and digital) computers are presented.

The first prototype has been implemented using reconfigurable hardware. At its core, analog computing is used for linear algebra operations, while pipelined digital resources on a field programmable gate array (FPGA) handle all remaining computations. The properties of the analog hardware are thoroughly examined, with special attention to accuracy and timing. The application of the platform to the transient analysis of power system dynamics showed a speedup of two orders of magnitude against conventional software solutions.

The second prototype is proposed as a future conceptual architecture that would overcome the limitations of the already implemented hardware, while retaining its virtues. The design space of this future architecture has been thoroughly explored, with the help of a software emulator. For one possible suggested implementation, speedups of two orders of magnitude against software solvers have been observed for the linear algebra operations.

Key words: [Computing]: analog computing, pipeline processing, field programmable gate arrays, reconfigurable computing, computer accelerator architectures, high performance computing; [Power systems]: power system simulations, power system analysis computing, power system dynamics; [Mathematics]: linear algebra - linear systems, numerical analysis - numerical simulation, numerical linear algebra;
Résumé


This work concerns in particular a very demanding analysis, transient simulation. This refers to the imitation of the operation of the real system, for time scales that cover the majority of slow electromechanical transient phenomena. The general mathematical formulation of the problem comprises a set of non-linear differential-algebraic equations (DAEs).

In the algebraic part of this set, heavy linear algebra computations take place, which are related to the admittance matrix of the topology. These computations are a critical factor in the overall performance of a transient simulator.

This work proposes the use of analog computing as a means of overcoming the performance barriers of conventional digital computers for linear algebra operations. Analog computing is integrated into the framework of a transient simulator and brings significant performance benefits. Two hybrid computers, analog and digital, are presented.

The first prototype has been implemented using reconfigurable resources. At its core, analog computing is used for linear algebra operations, while the digital resources on a Field Programmable Gate Array (FPGA) handle all remaining computations. The properties of the analog hardware are examined. Particular attention is paid to accuracy and timing. The application of the platform to the transient analysis of power system dynamics showed a speedup of two orders of magnitude compared to conventional software solutions.

The second prototype is proposed as a future conceptual architecture that would overcome the limitations of the hardware already implemented, while retaining its virtues. The design space of this future architecture has been thoroughly explored with the help of a software emulator. For one suggested implementation, speedups of two orders of magnitude over conventional software have been observed for the linear algebra operations.

Key words: [Computing]: analog computing, pipeline processing, programmable logic circuits, reconfigurable computing, computer accelerator architectures, high performance computing; [Power systems]: power system simulations, power system analysis computing, power system dynamics; [Mathematics]: linear algebra - linear systems, numerical analysis - numerical simulation, numerical linear algebra;
Περίληψη

Radical changes are taking place in the field of modern power systems. This shift in the way the system is designed and operated is part of what is called the smart grid. One of its consequences is the need for faster power system analysis computations.

This work focuses on a particularly demanding analysis, power system simulation. This refers to the faithful representation of the operation of the system for time scales that cover the majority of the slow electromechanical transient phenomena in the system. The general mathematical formulation of the simulation problem includes a set of non-linear differential-algebraic equations (DAEs).

The algebraic part of this set includes many heavy linear algebra computations. These are related to the admittance matrix of the topology and are critical to the overall performance of a power system transient simulator.

This work proposes the use of analog architectures as a means of overcoming the limitations of conventional digital computers, particularly as regards linear algebra computations. Analog computations are integrated into the flow of the transient simulator, bringing significant benefits to its performance.

Two hybrid computers are presented.

The first concerns a prototype that has been built on reconfigurable hardware. At its core, analog computations are used for the linear algebra, while pipelined processors have been programmed on a field programmable gate array (FPGA) for the remaining computations. The properties of the analog part are examined thoroughly, with particular emphasis on accuracy and timing. The application of the platform to power system simulation showed a speedup of two orders of magnitude relative to conventional software.

A second prototype is proposed as a future architecture that will be able to retain the virtues of the existing prototype while also overcoming its limitations. An emulator was developed in software in order to analyze and optimize the design. For the final proposed configuration, a speedup of two orders of magnitude was achieved for the linear algebra computations, relative to conventional software.

Key words: Computing: analog computing, pipeline processing, array
Contents

Acknowledgements

Abstract (en/fr/el)

List of figures

List of tables

Introduction

1 Modern power system landscape
  1.1 Current architecture
    1.1.1 The need for change
  1.2 Future architecture: the Smart Grid
    1.2.1 Distributed energy resources
    1.2.2 Aggregations
  1.3 The effect of the smart grid on analysis tools

2 Power system simulation, linear algebra and computing platforms
  2.1 Simulation
    2.1.1 Mathematical formulation of simulation problems
  2.2 Linear algebra in power systems
    2.2.1 Algorithms and implementations
    2.2.2 Coherence between the algorithm and the platform
  2.3 Dedicated platforms
    2.3.1 SIMD
    2.3.2 Multi-core
    2.3.3 Heterogeneous computing
    2.3.4 GPU
    2.3.5 FPGA
    2.3.6 DSP
    2.3.7 ASIC and VLSI
    2.3.8 Conventional computing and its limitations
    2.3.9 Unconventional computing
  2.4 Analog electronic computers
    2.4.1 In linear algebra
    2.4.2 In power systems computing
    2.4.3 Evaluation criteria
  2.5 Outlook

3 Realized dedicated mixed signal solver
  3.1 Hardware
    3.1.1 Analog part
    3.1.2 Digital part
    3.1.3 Timing
  3.2 Software
    3.2.1 Backend
    3.2.2 Frontend
  3.3 Inaccuracy
    3.3.1 Analog inaccuracy
    3.3.2 Digital inaccuracy
    3.3.3 Effect on the mathematical operation
    3.3.4 Calibration
  3.4 Results
    3.4.1 Linear system solving
    3.4.2 Sample radial and meshed topologies
    3.4.3 Transient simulation
    3.4.4 Dynamic stability analysis
    3.4.5 Effect of the integration algorithm on the results
    3.4.6 Effect of time step versus waiting time
  3.5 Conclusions
    3.5.1 Comparison with related work
    3.5.2 Limitations

4 Concept future solver
  4.1 RAMSES overview
  4.2 Design methodology
    4.2.1 Power system components and matrix building
    4.2.2 Electronic equivalents
    4.2.3 Value range profiling
  4.3 MSC architecture
    4.3.1 Local cells
    4.3.2 Global architecture
    4.3.3 Topological mapping and value mapping
    4.3.4 Mathematical operations
    4.3.5 Inaccuracies and effect in the linear operations
    4.3.6 Interface between the MSC and the RAMSES flow
    4.3.7 Timing
# List of Figures

1.1 A schematic of the traditional structure of a power system [1]
1.2 The power system as a black box
1.3 The operation cycle of a stakeholder
1.4 The effect of a faster operation cycle for a stakeholder
1.5 Building blocks of an Energy Management System (EMS) of a utility operator
1.6 The concept of DER aggregation

2.1 Power system dynamic simulation domains for different time scales - adapted from [2, 3]

3.1 Overview of the multi-platform system
3.2 Levels of the multi-platform system
3.3 Photo of the multi-platform system
3.4 Overview of the existing mixed-signal computer [4].
3.5 Partitioned solution scheme for power system simulation equations
3.6 Generalized π model of a generic power system branch
3.7 The complex two-port network for a branch connecting buses f and t and the effect it has on the building of the Y matrix
3.8 The complex one-port network for a shunt element on bus s and the effect it has on the building of the Y matrix
3.9 Power system topological mapping into an electronic resistor network equivalent.
3.10 Schematic of an electrical branch of the existing FPPNS
3.11 Detail of the implementation of a slice of the FPPNS using discrete electronics
3.12 A stack of four FPPNS slices
3.13 Synoptical schematic of a node of the FPPNS
3.14 Configuration of the digital part of the existing computer prototype (Altera Cyclone III FPGA).
3.15 Stability region of the Forward Euler method and the 2-step Adams-Bashforth method
3.16 Pipelined versions of the FE and the AB2 algorithms
3.17 Datapath of the synthesized pipeline for generators that are modeled with the classical generator model of (3.30) using the FE integration scheme of (3.28)
3.18 The USB controller and the shared RAM that is interfaced to the FPGA
3.19 Schematic diagram of the timing break up of the operations of the FPPNS for one simulation step
3.20 Diagram of the timing break up of a computing pipeline
3.21 Architecture overview of `elab-tsaot`
3.22 Interface design pattern for the SS and TD engines
3.23 Hardware Abstraction Layer (HAL) of the dedicated hardware
3.24 The model-view-controller (MVC) software architectural pattern used to implement the frontend of `elab-tsaot`
3.25 Physical and schematic representation of parasitics for an electrical branch that contains a potentiometer
3.26 Schematic representation of the resulting electrical circuit taking into account the node parasitic capacitances
3.27 An RC circuit in which R and C are in series.
3.28 Calibration procedure
3.29 Sample topology with 5 electrical branches and 5 electrical nodes
3.30 Sample radial topology
3.31 Sample meshed topology
3.32 Maximum absolute voltage error for radial and meshed topologies of increasing size
3.33 Rotor angle oscillations of generator #3 of the 18-bus topology using different simulators
3.34 Rotor angle oscillations of generator #11 of the 59-bus topology using different simulators
3.35 Rotor angle oscillations of generator #54 of the 59-bus topology using different simulators
3.36 Reduced schematic of the 59-bus test case
3.37 Timing breakup of computation time for transient simulation of the 18-bus
3.38 Screenshot of the `Analysis editor` of the `elab-tsaot` visualizing results for an n-1 branch contingency analysis on an 18-bus system
3.39 Minimal time-step FPPNS error compared to PC software simulation reference for the 18-bus test system
3.40 FPPNS error due to digital imprecision for different time steps for the 18-bus test system
3.41 FE instability while AB2 succeeds in retaining the stability of the numerical solution
3.42 CCT for branches #4, #10, #30 of the 18-bus system with varying timesteps using FE & AB2
3.43 CCT for branch #30 of the 18-bus system with varying timesteps using FE & AB2, in a calibrated and a non-calibrated FPPNS environment
3.44 Internal angle of generator #4 after a transient event in the 18-bus case, for different ADC waiting times

4.1 A schematic overview of the proposed mixed-signal computer (MSC)
4.2 The flow of the RAMSES transient simulator with the linear algebra operation identified in red
4.3 The two-port, four-pole network (2 poles per port) defined for a branch by (4.8) and the effect it has on the building of the $\mathbf{Y}$
4.4 Empirical pie chart of occurrences of branch types for several typical power systems with sizes ranging between 3-15k buses
4.5 The one-port, two-pole network (2 poles per port) defined for a shunt element by (4.9) and the effect it has on the building of the $\mathbf{Y}$
4.6 The $2 \times 2$ diagonal correction that arises from the dynamic behavior of an injector and the effect it has in the augmenting of $\mathbf{Y}$ to $\tilde{\mathbf{Y}}$
4.7 An equivalent of a power system bus that follows the real decomposition, the ordering and the sign convention for the MSC
4.8 An outline of an electrical two-port network used to map a power system branch
4.9 Examples of the effect of the TPN building blocks (potentiometers and VCCS) to the $g$-parameters matrix of the TPN
4.10 Schematic of the digital part of a TPN
4.11 An outline of an electrical one-port network used to map a power system shunt element or a diagonal correction
4.12 Schematic of the digital part of an OPN
4.13 Synoptical diagram of an injection and measurement node of the MSC
4.14 Schematic of the digital part of an IMN
4.15 Schematic of a local cell (LC) of the MSC
4.16 Connectivity detail and simplified schematic of a TPN connection multiplexer (MUX)
4.17 Global view of the digital architecture of the MSC
4.18 Relative $\infty$-norm errors that are introduced in the electronic equivalent of the $\Gamma$ matrix
4.19 Relative $\infty$-norm errors that are introduced in the electronic equivalent of the $\Gamma^{-1}$ matrix
4.20 Relative $\infty$-norm errors that are introduced in the electronic equivalent of the $\mathcal{J}$ vector
4.21 Relative $\infty$-norm errors that are introduced in the electronic equivalent of the $\mathcal{J}$ vector; dynamic scaling of the current mapping ratio $\rho_I$ is followed as per (4.36)
4.22 Relative $\infty$-norm errors that are introduced to the voltage solution $\tilde{\mathbf{V}}$ due to conductance and current inaccuracies of the MSC
4.23 Relative $\infty$-norm errors introduced to the voltage solution $\tilde{\mathbf{V}}'$ due to the quantization of the voltage ADC of the IMNs
4.24 The schematic of the electronic equivalent of an 11-bus system created in OrCAD Capture
4.25 Thumb-rule diagram for the order of magnitude of parasitic capacitances on different electronic design paradigms
4.26 Transients of the node of voltages of the MSC-mapping of the 11-bus system using a current source slew rate of $t_{irr} = 10 \text{ ns}$
4.27 Transients of the node of voltages of the MSC-mapping of the 11-bus system using a current source slew rate of $t_{irr} = 1 \text{ ns}$
4.28 Transients of the node of voltages of the MSC-mapping of the 11-bus system assuming uniform parasitic capacitances of $C = 10 \text{ fF}$
4.29 Voltage of bus #7 of the 11-bus system after a 60 ms 3-$\phi$ fault on the same bus
4.30 MSC invocations for each time instant for the transient scenario on the 11-bus system
4.31 Average penalty on the number of Newton (internal) iterations for one time (external) iteration
4.32 Voltage of bus #4072 of the 77-bus system after a 60 ms 3-$\phi$ fault on the same bus
4.33 MSC invocations for each time instant for the transient scenario on the 77-bus system

5.1 Most generic sparsity pattern of the $A_l$ matrix of injectors
# List of Tables

1.1 Operations in power systems
1.2 Smart grid technologies
1.3 Benefits of distributed generation

2.1 Abstraction layers of power system computing
2.2 Linear algebra libraries in power system research and applications
2.3 Analogies between flow networks, power systems, and analog electronic networks

3.1 Time domain and Laplace domain
3.2 Stages of inaccuracy of linear operations performed by the mixed-signal computer
3.3 59-bus test case maximum absolute rotor angle deviation for TSA
3.4 Speed comparison between different engines for the TD simulation of the 18-bus system
3.5 Timing break-up between PC, USB communication and the FPPNS for a transient stability operation
3.6 Timing results summary for the n-1 branch contingency analysis
3.7 Timing results summary for the CCT analysis
3.8 n-1 branch contingency results for different integration algorithms and time steps
3.9 Characterization of the existing platform

4.1 Comparison of the roles of different platforms and domains in the FPPNS and the MSC
4.2 Different models for different parts of the branch model of Fig. 3.6
4.3 TPN modifications and their ability to represent branch types
4.4 TPN modifications, their complexity and their primary usage for branch type representation
4.5 Interface between the RAMSES software and the MSC platform
4.6 Empirical settling times for different parasitic capacitances and current sources slew rates
4.7 Free design parameters of the MSC and their effect in the operation
4.8 Test cases to validate the design guidelines of table 4.7
4.9 Effect of analog inaccuracy to the average penalty on Newton iterations with respect to the bit resolution of the current injections
4.10 Effect of analog inaccuracy to the average penalty on Newton iterations with respect to the bit resolution of the voltage measurements
4.11 Effect of analog inaccuracy to the average penalty on Newton iterations with respect to the bit resolution of the reconfigurable conductances

A.1 General overview of systems to be examined
A.2 General linear algebraic properties of the admittance matrix $Y$ of power system test cases
Introduction

The purpose of this work is to create a computing platform dedicated to power system simulation. First of all, it is important to understand why such a platform is necessary. To this end, an overview of the modern power system and its stakeholders is given in chapter 1. The operations performed by the stakeholders are described and their time intensity is assessed. The environment is shifting, however, with radical changes imminent in what has been termed the advent of the "smart grid". The triggers of this change are presented and the premises of this future state of the grid are shown. It is partly this change that calls for faster power system analysis tools.

Equally important is to see what is expected of a power system analysis platform. Special focus is given to the simulation of power system dynamics, since this is the major focus of this work. Through modeling of the power system components, each analysis is associated with a mathematical formulation. The latter is handled by dedicated algorithms. The algorithms are concerned with how the problem is solved, and they are executed on hardware computing platforms, i.e. where the problem is solved. This logical chain, and the coherence that is necessary between its components, is the focus of the first part of chapter 2.

In the course of this discussion it will become evident that linear algebra has a central position in power system computations. This is related to the interconnected nature of the system under study, which results in large matrix-vector relations between the quantities of interest, e.g. voltages, currents, and admittances. Linear algebra qualifies as one of the most challenging scientific computing domains for conventional digital computing. The state of the art of conventional digital platforms with favorable computing characteristics for linear algebra and power system applications is reviewed in the second part of chapter 2. As an alternative to conventional digital computing, the analog electronic paradigm is presented, which is the main focus of the following chapters of this work. Both analog electronic networks and power system topologies are flow networks, and this underlying affinity makes analog electronic computing particularly pertinent to power system problems.
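
To make the matrix-vector relation between voltages, currents, and admittances concrete, the following minimal sketch assembles a nodal admittance matrix and solves $\mathbf{Y}\mathbf{V} = \mathbf{I}$ for the bus voltages. The 3-bus network, its per-unit values, and the function names are assumptions chosen purely for illustration; they are not taken from the test cases of this work, and a dense Gaussian elimination stands in for the sparse solvers used in practice.

```python
def build_admittance_matrix(n_buses, branches, shunts):
    """Assemble a dense nodal admittance matrix Y from branch and shunt admittances."""
    Y = [[0j] * n_buses for _ in range(n_buses)]
    for f, t, y in branches:
        # A series branch between buses f and t adds y to both diagonal
        # entries and -y to the two off-diagonal entries.
        Y[f][f] += y
        Y[t][t] += y
        Y[f][t] -= y
        Y[t][f] -= y
    for s, y in shunts:
        # A shunt element on bus s only affects the diagonal.
        Y[s][s] += y
    return Y


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        acc = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / M[r][r]
    return x


def mat_vec(A, x):
    """Matrix-vector product, used here only to check the residual."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Hypothetical 3-bus example (illustrative per-unit values).
branches = [
    (0, 1, 1 / (0.01 + 0.10j)),
    (1, 2, 1 / (0.02 + 0.20j)),
    (0, 2, 1 / (0.01 + 0.15j)),
]
shunts = [(0, 0.5 - 1.0j), (1, 0.2j), (2, 0.8 - 0.3j)]
Y = build_admittance_matrix(3, branches, shunts)
I = [1.0 + 0.0j, 0.0j, -0.5 + 0.2j]  # current injections
V = solve_linear_system(Y, I)        # bus voltages
residual = max(abs(r - i) for r, i in zip(mat_vec(Y, V), I))
```

Every branch touches only the rows and columns of its two end buses, which is why the matrix mirrors the network topology. It is this same structure that an electronic network can reproduce physically.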

Chapter 3 presents a hardware platform that is dedicated to the transient simulation of power systems. The platform, termed Field Programmable Power Network System (FPPNS), has an analog and a digital part. The analog part handles the linear algebra operations, and the digital one handles the solution of the differential algebraic equations that describe the dynamic behavior of the components of the system. A USB connection links the FPPNS to a conventional PC. On the PC side, a co-designed software application has been created. It acts as a user interface as well as a wrapper of the functionality of the dedicated hardware. The mathematical properties of the FPPNS are thoroughly examined in the chapter. The inaccuracies of the hardware are examined and their effect on the final results provided by the FPPNS is shown. The system is evaluated across nine different criteria, and a comparison with the related state of the art is conducted.
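
The division of labor described above can be illustrated with a minimal software sketch of a partitioned scheme, in which an algebraic network solution alternates with an explicit integration step. The sketch assumes a single classical-model generator connected to an infinite bus and integrated with the Forward Euler method. All parameter values and function names are hypothetical, and the code is not the FPPNS implementation: in the hardware, the step marked as algebraic is the one assigned to the analog part.

```python
import cmath
import math


def network_solve(e_internal, v_inf, x_d, x_line):
    """Algebraic step: solve the one-node nodal equation Y * V = I."""
    y_gen = 1 / (1j * x_d)      # Norton admittance of the generator
    y_line = 1 / (1j * x_line)  # line towards the infinite bus
    i_inj = e_internal * y_gen + v_inf * y_line
    return i_inj / (y_gen + y_line)


def electrical_power(delta, e_mag, v_inf, x_d, x_line):
    """Active power delivered by the internal voltage source of the generator."""
    e_internal = cmath.rect(e_mag, delta)
    v_term = network_solve(e_internal, v_inf, x_d, x_line)
    i_gen = (e_internal - v_term) / (1j * x_d)
    return (e_internal * i_gen.conjugate()).real


def simulate(delta0, dw0, p_mech, steps, dt=1e-3, e_mag=1.05, v_inf=1.0 + 0j,
             x_d=0.3, x_line=0.2, inertia=7.0, damping=2.0,
             omega0=2 * math.pi * 50):
    """Forward Euler on the swing equation with one network solve per step."""
    delta, dw = delta0, dw0
    trajectory = [delta]
    for _ in range(steps):
        # Algebraic part: network solution for the present rotor angle.
        p_elec = electrical_power(delta, e_mag, v_inf, x_d, x_line)
        # Differential part: derivatives of rotor angle and speed deviation.
        d_delta = omega0 * dw
        d_dw = (p_mech - p_elec - damping * dw) / inertia
        delta += dt * d_delta
        dw += dt * d_dw
        trajectory.append(delta)
    return trajectory


# Starting at an equilibrium (mechanical power equal to electrical power),
# the rotor angle should remain constant.
p_equilibrium = electrical_power(0.5, 1.05, 1.0 + 0j, 0.3, 0.2)
steady = simulate(0.5, 0.0, p_equilibrium, steps=200)
```

For this one-node network the algebraic step reduces to a scalar division. For a real topology it becomes a full linear system solution at every time step, which is why that step dominates the cost.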

Despite the favorable performance characteristics of the realized platform, certain limitations have been identified. Hence, chapter 4 introduces a concept design for a future generation of a mixed-signal computer that overcomes most of the limitations of the current prototype. The new design is again mixed-signal (analog and digital) and is termed Mixed Signal Computer (MSC). An a priori design objective was to integrate the MSC into the flow of the transient simulator RAMSES [5, 6]. A software emulator of the proposed hardware platform has been created. With the help of the emulator a thorough exploration of the design space of MSC parameters has been conducted. The effect of individual parameters on the overall accuracy of the results has been examined and related design guidelines have been proposed. One suggested implementation has been successfully utilized in the RAMSES flow to simulate small and medium sized systems. For the linear algebra operations, speedups of four orders of magnitude have been achieved. This translates to respectable overall speedups of the simulation. Finally, conclusions and future perspectives of this work are given in chapter 5.

This work lies in the cross-field of three distinct and vast domains: power systems, electronics, and computing. An exhaustive presentation of every encountered topic is not realistically feasible and is out of the scope of this manuscript. When required, the reader is referred to related literature. Where additional insight on peripheral topics has been deemed useful, dedicated appendices have been included.

**Contributions of this work**

Hereunder the major original contributions of this work are listed.

- A detailed list of operations and analyses in the modern power system environment has been compiled.
- An overview of the emerging future of the power system, termed the “smart grid”, has been presented.
- The critical role of linear algebra operations in power system computations has been identified and related literature has been reviewed.
- The importance of the coherence between the underlying platform and the algorithm that is executed has been highlighted. In view of this observation, a literature review on hardware platforms that are dedicated to linear algebra operations has been compiled.

- The limitations of conventional digital electronic hardware are noted and analog computing is proposed as an alternative. The state of analog computing is given, again with a particular focus on power system applications.

- A set of plausible evaluation criteria for analog computers is drafted. These criteria are used to evaluate the main results of this work.

- A hardware prototype dedicated to the analysis of power systems has been created. The system spans a conventional host PC and a dedicated hardware platform.
  - On the PC side, dedicated software has been written. It acts as a user interface and it exposes the functionality of the dedicated hardware to any programming language through an API.
  - The dedicated hardware consists of an analog and a digital part which is synthesized on reconfigurable hardware. The author did not take part in the actual implementation of the platform. He participated in the design of the system and the determination of its specifications, especially on the digital and the communication interface side.
  - The mathematical properties of the analog part of the platform have been detailed. The accuracy of the actual implementation has been thoroughly studied.
  - An investigation has been conducted on the effect of numerical integration on the quality of the final results of the dedicated hardware.
  - The timing properties of the analog part of the hardware have been investigated and an aggregate simplified model has been proposed.
  - The limitations of the proposed design are exposed.

- A conceptual future solver that overcomes the limitations of the existing prototype has been proposed, termed Mixed Signal Computer (MSC). It is again hybrid (analog-digital) and dedicated to power system simulation. A major design goal has been to integrate the proposed hardware in the flow of the RAMSES [5, 6] power system transient simulator software.
  - The mathematical properties of the MSC are presented. An analysis of the accuracy and the timing of the operations is given, in the same vein as for the existing realized prototype.
  - A fully parameterizable software emulator of the MSC has been created. It has been used to get results on the application of the MSC in atomic linear algebra operations, as well as on its integration into the RAMSES flow. Based on these results an exploration of the design space has been conducted and design guidelines are proposed.

In this chapter an overview of the architecture of the power system is given. Its traditional structure is presented alongside a brief description of the stakeholders that participate in its design, planning and everyday operation. A list of operations of the stakeholders is compiled and characterized by their time intensity.

The traditional way to operate the grid is changing. Reasons that call for that change will be highlighted. The prospective evolution of the grid into the future is what is often called the “smart grid”. Its premises will be postulated and its side effect on power system analysis tools will be shown.

1.1 Current architecture

The power system has been named the “largest man-made machine ever”. It can be defined as a complex socio-economic-technical system, featuring strong interrelation between societal needs and expectations, market operations and technical infrastructure. Electricity is conceivably the most multipurpose energy carrier in our modern global economy, and it is therefore primarily linked to human and economic development. Electricity growth has overtaken that of any other fuel, leading to ever-increasing shares in the overall mix. This trend is expected to continue throughout the following decades, with large parts of the world population in developing countries appealing to be connected to power grids [7].

Electrical power systems have been traditionally designed and operated taking energy from high-voltage levels, and distributing it to lower voltage level networks. There are large generation units connected to transmission networks. In these networks there is a bulk transport of electricity, with central coordination of control. Demands are passive and uncontrollable, connected to distribution networks. Distribution systems are also passive and, in the lower levels of voltage, radial in operation. They are designed to accept power from transmission systems and distribute to customers, generally with unidirectional flows [8]. Such a system is depicted in Fig. 1.1.

It is made up of many geographically dispersed components and it can exhibit global change almost instantaneously as a result of local actions. Actions are exerted on the components of the system by the stakeholders. As a stakeholder we consider anyone that is in interaction with the power system, i.e. that is affected by it or that can affect the system through his actions. It is out of the scope of this work to present a thorough stakeholder analysis of the power system. Instead the following sample entities are defined indicatively for the facilitation of the analysis that follows. These entities have arisen after the liberalization and the unbundling of the top-down monopolies in power system operators. They are the conceptual players in the power system world that have an interest/utility in fulfilling their objectives. These objectives may be financial profit, or simply the utility of serving their need for electricity.

**ISO** refers to the Independent System Operator. ISOs coordinate, control and monitor the operation of the system in a given region. In some regions around the world the ISO also has market authorities.

**GENCO** stands for Generation Company, i.e. a company engaged solely in producing electricity, normally by owning or maintaining energy production facilities (e.g. generators).

**TRANSCO** stands for Transmission Company, as understood in the market sense, i.e. a company which owns or maintains energy transmission facilities and the business object of which is the transmission of energy.

**DISCO** stands for Distribution Company, i.e. a theoretical (or real) company which owns or maintains energy distribution facilities and the business object of which is the distribution of energy from the transmission level of the grid to the **End users**.

**End users** are the final consumers of electricity, be they residential, commercial or industrial. As will be explained in the sections that follow, there is a recent trend for end users to also feed electricity back into the grid.

The above definitions of the ISO, the GENCO, the TRANSCO and the DISCO are only indicative and always depend on the local power system governance and regulatory scheme. Often the boundaries between their responsibilities are blurred.

The aim of the stakeholders is to drive the system in a way that generates the most utility to them. Utility can be monetary or anything relevant, e.g. proper functioning of the grid within its technical limits for some ISOs. An illustrative abstraction can be introduced that is drawn from general system theory. For each stakeholder, the power system can be viewed as a black box $\mathcal{F}$ that has two kinds of inputs: the *uncontrollable* inputs $u_u$ that the stakeholder cannot control, and the ones that can be controlled, termed *controllable* $u_c$ and *semi-controllable* inputs $u_{sc}$. The former are termed controllable because the stakeholder has a large degree of freedom on them $[u_{cmin} - u_{cmax}]$, and the latter are termed semi-controllable because the degree of freedom on them $[u_{scmin} - u_{scmax}]$ is smaller. The goal of each stakeholder is to manipulate the controllable and the semi-controllable inputs so as to keep some system “state” variable $x$ within some “good operation limits” $[x_{min} - x_{max}]$. These state variables (e.g. voltage magnitudes, angles, power flows, prices) are different for each stakeholder and they are all subject to equality and inequality constraints, to ensure proper operation of the system. This concept is illustrated in Fig. 1.2.
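
The abstraction can be written compactly as follows. This is only a sketch: the static input-output form of $\mathcal{F}$ and the constraint functions $g$ and $h$ are notational assumptions introduced here for illustration and do not appear in the text.

$$
x = \mathcal{F}(u_u, u_c, u_{sc}), \qquad
u_{cmin} \le u_c \le u_{cmax}, \qquad
u_{scmin} \le u_{sc} \le u_{scmax},
$$

$$
x_{min} \le x \le x_{max}, \qquad
g(x, u_u, u_c, u_{sc}) = 0, \qquad
h(x, u_u, u_c, u_{sc}) \le 0 .
$$

The stakeholder chooses $u_c$ and $u_{sc}$ within their ranges, while $u_u$ is imposed from outside.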

The way each stakeholder acts on the system is as follows.

**Acquisition** is the phase where the stakeholder collects data on the system, i.e. “measures” variables of interest to him.

**Analysis** is the phase where the stakeholder runs his “computation routines” in order to determine his actions.
Chapter 1. Modern power system landscape

**Actuation** is the feedback of the stakeholder back to the grid, in terms of values for the controllable and the semi-controllable variables.

The above cycle is executed repeatedly, in what is here called the stakeholder operation cycle, shown in Fig. 1.3.

The faster the stakeholder cycle, the better the ideal $x$ curve is approximated, as shown in Fig. 1.4. In the ideal case where the acquisition, analysis and actuation phases were all instantaneous, the $x$ curve would be as close to the ideal one as possible. It is therefore in the interest of the stakeholder to improve the speed of every element of his operation cycle.
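
The effect of the cycle period can be illustrated with a toy simulation. The sketch below is not taken from this work: the sinusoidal uncontrollable input, the additive "black box" and the function name `mean_deviation` are all assumptions chosen for illustration. The stakeholder measures the disturbance once per cycle, computes a cancelling action and holds it until the next cycle.

```python
import math


def mean_deviation(cycle_period, horizon=600.0, dt=0.1, disturbance_period=120.0):
    """Mean |x| when the stakeholder acts once every `cycle_period` seconds.

    Toy model (an assumption, not the model of this work): x = u_u + u_c,
    with the ideal value of x equal to zero.
    """
    steps = int(round(horizon / dt))
    steps_per_cycle = max(1, int(round(cycle_period / dt)))
    u_c = 0.0  # controllable input, held constant between actuations
    total = 0.0
    for k in range(steps):
        t = k * dt
        # Uncontrollable input: a slow sinusoidal disturbance (hypothetical).
        u_u = math.sin(2.0 * math.pi * t / disturbance_period)
        if k % steps_per_cycle == 0:
            # Acquisition (measure u_u), analysis (compute the cancelling
            # action) and actuation (apply it), all assumed instantaneous.
            u_c = -u_u
        x = u_u + u_c  # deviation of the state variable from its ideal value
        total += abs(x)
    return total / steps


fast = mean_deviation(1.0)
medium = mean_deviation(5.0)
slow = mean_deviation(20.0)
```

In this toy setting the mean deviation grows with the cycle period, which is the behavior sketched in Fig. 1.4: the shorter the cycle, the closer $x$ stays to its ideal curve.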

Figure 1.3 – The operation cycle of a stakeholder

Figure 1.4 – The effect of a faster operation cycle for a stakeholder

Figure 1.5 – Building blocks of an Energy Management System (EMS) of a utility operator

For utility operators, these operations are performed in computer-aided environments called energy management systems (EMS). EMSs have been present since the era of vertical integration in power systems. EMSs are a point of monitoring, control and optimization for a part of a grid. For that to be enabled, an EMS employs a wide range of supervisory control and data acquisition (SCADA) functionality. Such a system is shown schematically in Fig. 1.5; color coding from Fig. 1.3 has been retained.

Table 1.1 provides a non-exhaustive list of operations commonly executed in the frame of a modern power system. Column “Level” refers to the level of the power system, as in table 1.2.
<table>
<thead>
<tr>
<th>Operation</th>
<th>Algorithms</th>
<th>Software &amp; Equipment involved</th>
<th>Level</th>
<th>Stakeholders</th>
<th>Time intensity</th>
</tr>
</thead>
<tbody>
<tr>
<td>Economical feasibility studies (equipment sizing)</td>
<td>Operational research</td>
<td></td>
<td>GTD</td>
<td>ISO, GENCO, TRANSCO, DISCO</td>
<td>1</td>
</tr>
<tr>
<td>Transmission expansion, Optimal compensation device/capacitor placement</td>
<td>Prediction scenarios/methods, (Available Transfer Capability (ATC) methods), VAR optimization methods</td>
<td></td>
<td>T</td>
<td>TRANSCO</td>
<td>1</td>
</tr>
<tr>
<td>Power flow analysis</td>
<td>Power flow algorithms, network reduction algorithms</td>
<td></td>
<td>*</td>
<td>ISO</td>
<td>2..3</td>
</tr>
<tr>
<td>Static Security Assessment (SSA), contingency analysis, emergency switching plans</td>
<td>N-1/N-P contingency analysis</td>
<td></td>
<td>*</td>
<td>ISO, GENCO, TRANSCO, DISCO</td>
<td>2</td>
</tr>
<tr>
<td>Protective device coordination/sizing, Distance relaying configuration, Reclosure synchronization</td>
<td>Short circuit analysis (ANSI C37/IEC 60909)</td>
<td>Distance/protection relays</td>
<td>*</td>
<td>ISO, GENCO, TRANSCO, DISCO</td>
<td>1-2</td>
</tr>
<tr>
<td>Voltage stability analysis</td>
<td>Continuation power flow (CPF) methods, Bifurcation analysis</td>
<td></td>
<td>*</td>
<td>ISO</td>
<td>2</td>
</tr>
<tr>
<td>Power quality/Harmonic (distortion) analysis</td>
<td>Fourier analysis</td>
<td></td>
<td>TD</td>
<td>ISO, TRANSCO, DISCO</td>
<td>2</td>
</tr>
<tr>
<td>Small-signal Stability analysis, Power System Stabilizer (PSS) configuration</td>
<td>Eigenvalue algorithms, sensitivity analysis</td>
<td></td>
<td>T</td>
<td>ISO, TRANSCO, DISCO</td>
<td>1-2</td>
</tr>
<tr>
<td>Dynamic Security Assessment (DSA)</td>
<td>Time domain (TD) simulation (ms scale), Symmetrical components theory</td>
<td></td>
<td>TD</td>
<td>ISO, TRANSCO, DISCO</td>
<td>2</td>
</tr>
<tr>
<td>Electro-Magnetic Transients analysis (EMT), Lightning protection</td>
<td>Time domain (TD) simulation (sub-ms scale)</td>
<td>EMTP</td>
<td>TD</td>
<td>ISO, TRANSCO, DISCO</td>
<td>2</td>
</tr>
<tr>
<td>Reliability assessment</td>
<td>Probabilistic methods, Index based quantification</td>
<td></td>
<td>*</td>
<td>ISO, GENCO, TRANSCO, DISCO</td>
<td>2</td>
</tr>
<tr>
<td>Optimization, Unit Commitment</td>
<td>Unit commitment algorithms, Optimal Power Flow (OPF) algorithms, emissions/security constrained</td>
<td></td>
<td>*</td>
<td>ISO, GENCO</td>
<td>3</td>
</tr>
<tr>
<td>Market decision making/bidding</td>
<td>Market simulation</td>
<td>Market agents</td>
<td></td>
<td>ISO, GENCO, End users</td>
<td>4</td>
</tr>
<tr>
<td>Load forecasting</td>
<td>Statistical methods, Intelligent systems (ANN, etc.)</td>
<td></td>
<td>C</td>
<td>ISO, GENCO, End users</td>
<td>3-4</td>
</tr>
<tr>
<td>Price/Load forecasting</td>
<td>Artificial intelligence, neural networks, time series analysis</td>
<td>Microgrid controllers</td>
<td></td>
<td></td>
<td>3-4</td>
</tr>
<tr>
<td>State Estimation, Monitoring/Situational awareness (engineering diagnostics, billings)</td>
<td>WLS, Kalman filters</td>
<td>SCADA, Wide Area Measurement Systems (WAMS)</td>
<td>TD C</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Active power flow and frequency control, Power electronic interface control</td>
<td>AGC &amp; FACTS configuration</td>
<td>Prime movers (speed governing)</td>
<td></td>
<td>ISO, DISCO</td>
<td>3-5</td>
</tr>
<tr>
<td>Smart metering</td>
<td></td>
<td>Smart meters</td>
<td>C</td>
<td>ISO, DISCO</td>
<td>3-5</td>
</tr>
<tr>
<td>Intelligent load shedding</td>
<td></td>
<td>Distributed electronics (sensors, actuators, comm. infrastructure)</td>
<td>C</td>
<td>ISO, DISCO</td>
<td>3-5</td>
</tr>
<tr>
<td>Home/Building automation</td>
<td>Line parameter estimation</td>
<td></td>
<td>TD C</td>
<td>ISO, DISCO</td>
<td>1</td>
</tr>
<tr>
<td>Transmission/distribution line design</td>
<td>Reactive compensation techniques, AVR configuration</td>
<td></td>
<td>TD C</td>
<td>ISO, DISCO</td>
<td>4-5</td>
</tr>
<tr>
<td>Reactive power flow and voltage control</td>
<td></td>
<td>Tap changing/ULTC transformers, shunt/series capacitor banks/reactors, static var systems</td>
<td>TD G</td>
<td>GENCO, ISO</td>
<td>1-2</td>
</tr>
<tr>
<td>Sub-synchronous resonance analysis</td>
<td></td>
<td>Prime movers and related controls</td>
<td>GT C</td>
<td>ISO</td>
<td>1</td>
</tr>
<tr>
<td>Controlled islanding schemes, Black start planning, System restoration</td>
<td></td>
<td></td>
<td>GT C</td>
<td>ISO</td>
<td>1</td>
</tr>
<tr>
<td>Insulation coordination</td>
<td>Time-domain simulation (sub-ms scale)</td>
<td></td>
<td>TD</td>
<td>TRANSCO, DISCO</td>
<td></td>
</tr>
<tr>
<td>Power factor correction</td>
<td></td>
<td>Compensation device/capacitor banks</td>
<td>C</td>
<td>End users</td>
<td></td>
</tr>
</tbody>
</table>

The column “Time intensity” refers to how tight the time margins are for the operation to be finished. Increasing numbers denote tighter requirements.

**1: planning/design operations.** These analyses are often performed only once, in a planning/design phase; hence, they have no practical time constraints.

**2: periodic offline operations.** These analyses are executed once per day or week. They are performed more than once, but so infrequently that their periodicity poses almost no practical limit on their execution time.

**3: periodic online operations.** These analyses are executed within a tighter time frame than the above, e.g. in the execution frame of a “measurement - data acquisition - state estimation - analysis” cycle of an Energy Management System of a system operator (15 minutes), or in the bidding-clearing time frame of an energy exchange (5 minutes).

**4: real-time operations.** These operations are performed continuously, i.e. they are subject to some real-time constraints.

**5: hard real-time operations.** In this case of real-time operations the constraints are “hard”, in the sense that if they are not met, some part of the expected functionality of the system will fail. These tasks are the most demanding from a computing point of view.

1.1.1 The need for change

It is now widely accepted that changes to the aforementioned traditional architecture are necessary. Among others, factors contributing to this reshaping are the following [10].

- Different governance frame: liberalization, shift from vertical monopolies to deregulated markets.

- Consumer consciousness, power “ethics”, e.g. consider post-Fukushima declining support for nuclear.

- Aging infrastructure: early post-WWII.

- Increase in demand quantity beyond expectation, e.g. demand doubled since ’60s.

- Changes in demand nature: changing peak hours, load patterns, increased quality-of-service demands, differentiated products.

- Advances in technology: storage, distributed generation, smart metering.

- Climate change and related political/public opinion pressures.

---

1 It is useful to note a distinction raised in [9], between the terms online and real-time when used in the power system domain. Online refers to the results of the analysis tool being available to the SCADA/EMS of the operator, while real-time refers to the results of the analysis being ready within a time frame that is deemed “real-time” for the specific application.

As very significant investments will be required to simply renew this infrastructure ($6.9 trillion capital investment spanning 15 to 25 years reported in [11]), the most efficient way forward is to incorporate innovative technologies and solutions when planning and executing this renewal [12].

1.2 Future architecture: the Smart Grid

The term smart grid is used loosely to denote a plethora of emerging technologies in the field of power systems. It would be more precise to refer to the smart grid as the grid of the future, but by convention the term is retained throughout this work. Two underlying conceptual points of the smart grid are its de-centralization and the existence of localized intelligence.

History and organizational analysis suggest that a centralized point of control comes in the way of rapid scaling of value creating activities, often becoming the bottleneck limiting evolution of ideas and novelty of executions [13]. Furthermore, a bottom-line premise of the smart grid is to be self-healing. This implies some sort of underlying distributed control of the system with individual components as independent intelligent agents. These agents would compete and cooperate to achieve global optimization [14].

A complex socio-technical system featuring similar requirements is the internet. And as Robert Metcalfe urges, it is of utmost importance to mine the history of the internet, to learn from that experience, and to find the correct lenses to look at the smart grid problem [15]. The genius in the internet is in its flexibility and versatility. For the smart grid to follow the success story of the internet, it needs to endorse this versatile design and encourage innovative participation from all stakeholders.

For that the following premises are set [13]: The smart grid is the future evolution of the power system that:

- will meet all our energy demands (not just current electricity demand),
- will be powered almost totally by renewable resources,
- will operate all its components at their maximum allowable capacity,
- will have command and control intelligence that is pervasively distributed and sub-second responsive, and
- will be able to self heal.

Table 1.2 – Smart grid technologies

<table>
<thead>
<tr>
<th>Tech</th>
<th>Level</th>
</tr>
</thead>
<tbody>
<tr>
<td>Substation automation</td>
<td>TD</td>
</tr>
<tr>
<td>Distribution automation</td>
<td>TD</td>
</tr>
<tr>
<td>Phasor measurement</td>
<td>GTD</td>
</tr>
<tr>
<td>Smart metering</td>
<td>C</td>
</tr>
<tr>
<td>Smart appliances</td>
<td>C</td>
</tr>
<tr>
<td>Home automation</td>
<td>C</td>
</tr>
<tr>
<td>Demand response</td>
<td>C</td>
</tr>
<tr>
<td>Electric vehicles</td>
<td>GC</td>
</tr>
<tr>
<td>Energy storage</td>
<td>GC</td>
</tr>
<tr>
<td>Distributed generation</td>
<td>G</td>
</tr>
<tr>
<td>Renewable sources</td>
<td>G</td>
</tr>
<tr>
<td>Communication infrastructure</td>
<td>GTDC</td>
</tr>
<tr>
<td>DER aggregation schemes</td>
<td>GTDC</td>
</tr>
</tbody>
</table>

1 G: Generation; T: Transmission; D: Distribution; C: Consumption

The definition of the smart grid can also depend on local conditions; different countries can have very different starting points for the progress towards the smart grid. Deployment of smart grid technologies will occur over a long period of time, adding successive layers of functionality and capability onto existing equipment and systems.

Technology is the key consideration and it can be defined by certain technical characteristics: e.g. predictive, integrated, interactive, optimized, flexible, accessible, reliable, economic, and secure. Broadly speaking, three major technological domains of the smart grid are distributed intelligence, communication technologies, and automated control systems [16]. So far, research has been focusing on all three of them. Among others, impressive work has been carried out in technologies such as: substation automation, distribution automation, phasor measurement (PMU), smart metering, smart appliances, home automation, demand response, electric vehicles (PHEV), energy storage, distributed generation, renewable sources, communication infrastructure, DER aggregation schemes, etc. Scrutiny of the aforementioned is out of the scope of this work. Table 1.2 summarizes them according to which level of the power system they pertain to: Generation, Transmission, Distribution or Consumption.

The next section presents recent advances in the field of distributed energy resources (DER) that seem to be greatly contributing to shifting the power system paradigm away from the prevalent centralized production scheme. The vehicle through which DER are to impact the structure of the power system is DER aggregations, reviewed in section 1.2.2.

1.2.1 Distributed energy resources

Perhaps the most heavily researched field in smart-grid related technologies is distributed energy resources (DER). DER include distributed generation (DG), responsive loads and storage systems alike. Smart grid advances regarding the loads mainly focus on their responsiveness (see table 1.2). Storage systems would add to the flexibility of the system, providing the ability to shift energy consumption/production in time. Distributed generation is the sector that has received the most attention of the three and seems to have the greatest potential.

There is an ongoing debate on the definition of distributed generation. In the author’s opinion, the most apt definition has been given in [17]: DG is defined as an electric power source connected directly to the distribution network or on the customer side of the meter. Recent developments in DG include the following [18].

- Improvement of the efficiency of solar cells up to 20-24%.
- Increase in the capacity of wind generators from few kW up to several MW.
- Development of micro-, bio- and multi-fuel-CHPs (combined heat and power) plants to replace the conventional ones.
- Development of fuel cell technologies.
- Increase of efficiency and capacity of storage devices.
- Introduction of new renewable energy resources such as tidal generators, small hydro generators, etc.

Reference [19] provides a comprehensive insight on DG technologies and their respective status quo.

The projected benefits of increased DG penetration seem to be in accord with the concept of sustainable development. This becomes clear once the benefits of DG are categorized in four domains as shown in table 1.3 [20, 8].

For completeness' sake, it should be noted that high-degree penetration of DG into the bulk grid can also have adverse effects. These mostly relate to the technical operation of the system [8], including inversion in the energy flow, and difficulties in voltage control and in the management of reactive power. Also, despite the professed positive impact on reliability, it has been argued that there can also be a negative impact due to the non-dispatchable nature of some types of generation, such as intermittent solar panels, wind turbines, tidal generators, etc. Furthermore, the unpredictability implied by the dependence of those generation units on weather conditions would require additional conventional standby capacity.

Up to now, DER have been used to displace energy from conventional generating plants but not to displace their capacity, as they are not visible to system operators [23]. However, it is now clearly understood that DG needs not only to displace the energy produced by central generation but also to provide all the associated flexibility and manageability. So there is a need to move away from the fit-and-forget approach that has characterized DG installation so far [20]. As a means of comprehensive integration of DER into power systems, various aggregation schemes have been suggested.

<table>
<thead>
<tr>
<th>Domain</th>
<th>Benefit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technical</td>
<td>increase overall system efficiency e.g. CHP, bypass ‘congestion’ in existing transmission grids, avoid transmission losses, provide network support or ancillary services, increase flexibility, continuity and reliability of supply, positive effect on transient stability [21]</td>
</tr>
<tr>
<td>Social [22]</td>
<td>enable power equity, diversification of energy sources to enhance energy security, support for competition policy, improved utilization of local resources, local job creation</td>
</tr>
<tr>
<td>Environmental</td>
<td>increase overall system efficiency e.g. CHP, bypass ‘congestion’ in existing transmission grids, avoid transmission losses, provide network support or ancillary services, increase flexibility, continuity and reliability of supply</td>
</tr>
<tr>
<td>Economic</td>
<td>uncertainty in electricity markets favours small generation schemes, cost-effective improvement of power quality and reliability, new market opportunities, facilitate investment (easier real estate, easy acquisition of equipment)</td>
</tr>
</tbody>
</table>

Table 1.3 – Benefits of distributed generation

1.2.2 Aggregations

Aggregation is the process of linking small groups of industrial, commercial, or residential customers into a larger power unit to make them visible to the electric system (see figure 1.6). Aggregations usually involve loads, local storage and DG. Thus, load or generation profiles of individual consumers and/or small generators appear as a single unit to the system. Building up a large and flexible portfolio enables aggregators to operate DER and to provide services to the power system, e.g. system balancing. It is by combining these features (more flexibility, lower operating costs) that aggregations will reduce the gap to profitability of DER units [16]. As mature representative examples of such aggregation schemes one can mention the microgrid (μGrid) [24, 25] and the virtual power plant (VPP) [26].

A microgrid (see figure 1.6) usually consists of an aggregation of localized small-scale generation and demand resources connected at a single point of common coupling (PCC), usually at a low voltage level in the grid. The main idea is grouping these resources and presenting them as a single controllable entity to the rest of the grid. In this way, DG becomes visible and, given attractive remuneration, it can provide services for the grid, e.g. it can act as a source of power, or it can provide ancillary services [27]. An interesting feature of a microgrid is that it can operate in two modes, either connected to the grid or in an autonomous islanded mode, increasing the flexibility of the host power system. To the utility the microgrid can be thought of as a controlled cell of the power system. To the customer it can be designed to meet his special needs and provide additional benefits. An elaborate presentation of the technological status quo of microgrids, their management and related projects around the world can be found in [28, 29].

The virtual power plant is a similar concept. It consists of a flexible representation of a portfolio of smaller generation and demand resources, in a way that a single operation profile is created. This profile consists of the composite parameters characterising each participating element of the aggregation. As a result, the VPP is characterized as a whole by a set of parameters usually associated with a traditional transmission connected generator, such as scheduled output, ramp rates, voltage regulation capability and reserves [30, 23]. The VPP concept was thoroughly investigated by the European project FENIX 1.

## 1.3 The effect of the smart grid on analysis tools

One very important side effect of the advent of the smart grid is the requirement of faster
analysis tools for the grid.

• Given the shorter time constants of the inertia-poor, sub-(milli-)second responsive
smart grid environment it is a technical requirement that the “real-time” frame is shorter.

• It is in the interest of all stakeholders to perform their analyses faster, since this would give them an edge over the competition in the open market/operations environment of the smart grid.

• There is less experience with the operating mechanisms of the smart grid from a technical point of view (e.g. voltage, transient stability etc.), hence more analyses are required to ensure the safe operation of the grid.

1 http://www.fenix-project.org/

Faster tools mean a shorter time for the “Analysis” part of the stakeholder cycle of Fig. 1.3, which leads to better attainment of the goals of the stakeholders, as per Fig. 1.4. Should a substantial speed increase be achieved, new opportunities would arise in the Energy Management Centers (EMC) of the power utilities. Current analyses would be completed within shorter time windows, and more extensive analyses (e.g. for larger, more detailed networks) would be possible in the same amount of time. More importantly, there would be a technology push, in the sense that new types of applications and services could be explored that are currently hindered by the unavailability of sufficiently potent tools [31].

# 2 Power system simulation, linear algebra and computing platforms

This work is particularly concerned with transient simulation. The latter involves heavy linear algebra operations on the admittance matrix of the grid. Given their computing intensity, linear algebra operations are a domain where significant performance benefits can be harnessed. Designers of power system applications have long been aware of this fact and have tried to optimize the related software and hardware parts. Different architectures have been employed, a review of which is given in this chapter.

The vast majority of related work concerns conventional digital computing. Certain characteristics of linear algebra operations make conventional computing less appropriate for them. As an alternative, analog computing is proposed. Both analog electronic networks and power system topologies are flow networks, and this underlying affinity makes analog electronic computing particularly pertinent to power system problems. The state of the art of analog computing is given at the end of the chapter, with particular emphasis on power system applications.


## 2.1 Simulation

Out of the different operations of table 1.1, this work focuses on those primarily concerned with the simulation of the power system. Simulation and its related concept of modelling allow us to get information about how the system will behave without testing it in real life. Simulation is the imitation of the operation of the real-world system over time. Power system simulation may refer to different time scales, as shown in Fig. 2.1.

Simulation is particularly pertinent in the field of power systems, for a range of reasons.

- Due to the sheer size of the system, performing real-life tests would be too expensive.
- In fact, such tests would often not be possible at all, since the system is in continuous use, i.e. it cannot be stopped for scenario testing.
- The (slow) time constants involved prohibit the realization of all the analyses that would be needed. Therefore it is quite common to perform faster-than-real-time simulations so as to speed up the analysis in question.

A power system simulation application consists of three abstraction layers (1-3): the power system problem, the underlying mathematical problem, and the computing implementation of the latter. These levels and the transitions between them are illustrated in table 2.1.

The *power system problem* is the one to be solved. It comes directly from the power system world, e.g. the power flow analysis. The translation of the power system problem to a mathematical problem is done through the *modeling* of the power system. Model selection is a delicate process determined by (a) the purpose, (b) the time scale, and (c) the level of detail required by the application in question. At level 2, the problem consists of mathematical predicates which can be handled by relevant algorithms. These algorithms are amenable to a *computing implementation*, i.e. machine code that can be executed directly by a computing platform. The *hardware* computing platform on which the implementation executes is shown as layer 0 in the table.

It is important to note that the above is generic and applies, more or less, to a wide range of analyses. It is equally important to highlight that after model selection is complete, at level 2, the problem has been completely translated outside the power system domain. From then on, it can be decoupled from power system knowledge and expertise, and can be handled by mathematicians and computer scientists.

Table 2.1 – Abstraction layers of power system computing

<table>
<thead>
<tr><th>Layer</th><th>Description</th><th>Transition to the layer below</th></tr>
</thead>
<tbody>
<tr><td>3</td><td>Power system problem</td><td>↓ modeling</td></tr>
<tr><td>2</td><td>Mathematical problem</td><td>↓ algorithms</td></tr>
<tr><td>1</td><td>Computing implementation</td><td>↓ runs on</td></tr>
<tr><td>0</td><td>Platform (hardware)</td><td></td></tr>
</tbody>
</table>

As an example, most of what are thought of as “power system algorithms” are actually not in the power system domain at all. When W. F. Tinney introduced the Newton-Raphson (NR) power flow algorithm in 1967 [32], he applied an already established mathematical algorithm, i.e. the Newton-Raphson method, to a typical mathematical problem, i.e. a system of non-linear equations. The latter is the formulation that results from the modeling of the power flow problem in the power systems domain.

### 2.1.1 Mathematical formulation of simulation problems

A general formulation of any power system simulation problem includes a set of non-linear differential algebraic equations (DAEs).

\[
x = \begin{bmatrix} x_d \\ x_a \end{bmatrix}, \quad \begin{bmatrix} \dot{x}_d \\ 0 \end{bmatrix} = \begin{bmatrix} f_d(x_d, x_a) \\ f_a(x_d, x_a) \end{bmatrix} \tag{2.1}
\]

The evolution of the state variables \( x_d \) and the algebraic variables \( x_a \) over time is determined by numerically integrating the differential part of the equations \( f_d \). If an initial value \( x_d(0) = x_{d,0} \) is assumed for (2.1), then its exact analytical solution over time is \( x(t) = \phi_t(x_{d,0}, t_0) \). The point of numerical integration algorithms is to approximate this exact analytical trajectory by a sequence of points \( x^k \) that correspond to respective time instants \( t^k \). Given the point at time \( t^k \), a numerical integration algorithm produces the point of the trajectory for time instant \( t^{(k+1)} = t^k + h \), where \( h \) is the timestep [33]. The solution of the above is highly dependent on the time scales of interest for the study, as per Fig. 2.1. In the extreme case of steady-state studies (e.g. power flow on the far right of the figure), all the variables are algebraic and the equation set is purely algebraic, \( 0 = f_a(x_a) \).
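
As a minimal sketch of the time-stepping idea described above, the following Python snippet integrates a toy semi-explicit DAE of the form (2.1) with the forward Euler method. The toy system, the closed-form solution of its algebraic part, and the names `f_d`, `solve_algebraic` and `simulate` are illustrative assumptions; forward Euler is chosen only for brevity and is not claimed to be the integration scheme used in this work.

```python
import math


def f_d(x_d, x_a):
    """Differential part of the toy DAE: dx_d/dt = -x_d + x_a."""
    return -x_d + x_a


def solve_algebraic(x_d):
    """Solve the algebraic part 0 = f_a(x_d, x_a) = x_a - 0.5 * x_d for x_a."""
    return 0.5 * x_d


def simulate(x_d0, h, n_steps):
    """Forward Euler: x_d^(k+1) = x_d^k + h * f_d(x_d^k, x_a^k)."""
    x_d = x_d0
    trajectory = [(0.0, x_d, solve_algebraic(x_d))]
    for k in range(1, n_steps + 1):
        x_a = solve_algebraic(x_d)     # enforce 0 = f_a at time t^k
        x_d = x_d + h * f_d(x_d, x_a)  # advance the state by one timestep h
        trajectory.append((k * h, x_d, solve_algebraic(x_d)))
    return trajectory


traj = simulate(x_d0=1.0, h=0.01, n_steps=100)
t_end, x_d_end, x_a_end = traj[-1]
exact = math.exp(-0.5 * t_end)  # analytical solution of this toy system
print(f"t={t_end:.2f}  numerical x_d={x_d_end:.5f}  exact x_d={exact:.5f}")
```

Reducing `h` brings the sequence of points \( x^k \) closer to the analytical trajectory, at the price of more steps per simulated second.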

## 2.2 Linear algebra in power systems

One class of mathematical problems (level 2) that is ubiquitous in power system computing applications is linear algebra operations. The exact amount of time spent in linear algebra operations in a power system algorithm depends on its exact flow.

For example:

- In [34], the linear algebra solver takes 30.3-62.9% of the total execution time for the power flow calculations on an artificial system of 800k buses, and 28.6-80.8% for the power flow calculations on the WECC system.

- In the transient simulator detailed in [5], linear algebra related operations (mostly linear system solving and sparse matrix-vector multiplication) account for 21-58.1% of the total execution time of the program.

- The assembling and factorization of the Jacobian at every step of a quasi-Newton power flow method in [35] takes up 85% of the computing time.

Given the computing intensity of linear algebra operations, the latter are identified as having a significant potential for performance benefits in power system calculations. The operations that are particularly pertinent are (sparse) matrix-vector multiplication (2.2a) and (sparse) linear system solving (2.2b).

\[
\text{solve for } y \ (A, x \text{ known}): \quad y = A \cdot x \tag{2.2a}
\]

\[
\text{solve for } x \ (A, y \text{ known}): \quad y = A \cdot x \tag{2.2b}
\]

The matrices involved in these operations are usually identical or closely related to the admittance matrix \( Y \) of the topology, which links bus voltages to current injections. It is square, sparse, complex and often symmetric; it arises from the interconnected nature of the grid and formalizes the nodal voltage equations [36, 37, 38]. The size of \( Y \) is equal to the number of buses in the system, and in cases of practical interest it is on the order of (tens of) thousands [39, 40, 41].
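
To make the structure of \( Y \) concrete, the sketch below assembles the admittance matrix of a hypothetical 3-bus system and evaluates the current injections as a matrix-vector product, i.e. an instance of (2.2a). The network data, the neglect of shunt elements and the function names are assumptions made purely for illustration.

```python
def build_admittance_matrix(n_buses, branches):
    """Assemble the nodal admittance matrix Y from a list of branches.

    Each branch is (from_bus, to_bus, series_impedance). Shunt elements are
    ignored in this sketch.
    """
    Y = [[0j] * n_buses for _ in range(n_buses)]
    for i, j, z in branches:
        y = 1 / z          # series admittance of the branch
        Y[i][i] += y       # diagonal: sum of admittances incident to the bus
        Y[j][j] += y
        Y[i][j] -= y       # off-diagonal: minus the branch admittance
        Y[j][i] -= y
    return Y


def current_injections(Y, V):
    """Nodal equations I = Y * V, an instance of operation (2.2a)."""
    return [sum(y_ij * v_j for y_ij, v_j in zip(row, V)) for row in Y]


# Hypothetical 3-bus system with lines 0-1 and 1-2 (per-unit impedances).
branches = [(0, 1, 0.01 + 0.1j), (1, 2, 0.02 + 0.2j)]
Y = build_admittance_matrix(3, branches)
V = [1.0 + 0j, 0.98 - 0.02j, 0.97 - 0.03j]
I = current_injections(Y, V)
print(I)
```

Buses 0 and 2 are not directly connected, so the corresponding entries of `Y` stay zero; in large grids this is what makes the matrix sparse.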

### 2.2.1 Algorithms and implementations

The algorithm to solve the matrix-vector multiplication problem of (2.2a) is trivial from a mathematical point of view. However, this basic algorithm can be implemented in a variety of ways: in the ordering of the loops, the storage scheme of the elements, vectorization/grouping of the operands, parallelization techniques etc.
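
As an illustration of how the same mathematical operation admits different implementations, the following sketch computes a dense matrix-vector product with two loop orderings. Both functions are hypothetical examples written for this purpose; they return the same result but traverse the matrix row by row or column by column, which matters on real platforms depending on the storage scheme.

```python
def matvec_row_oriented(A, x):
    """y_i = sum_j A[i][j] * x[j]: one inner product per row of A."""
    y = [0.0] * len(A)
    for i in range(len(A)):
        for j in range(len(x)):
            y[i] += A[i][j] * x[j]
    return y


def matvec_column_oriented(A, x):
    """y = sum_j x[j] * A[:, j]: one scaled column update per column of A."""
    y = [0.0] * len(A)
    for j in range(len(x)):
        for i in range(len(A)):
            y[i] += A[i][j] * x[j]
    return y


A = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, 1.0]
print(matvec_row_oriented(A, x), matvec_column_oriented(A, x))
```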

The solution of the linear system problem of (2.2b) is less trivial, but well-established direct (e.g. LU decomposition, Cholesky decomposition, QR decomposition, etc.) and iterative techniques (Jacobi, Gauss-Seidel, CG, BiCG, GMRES) exist to solve such systems [42, 43, 44]. Direct methods are so called because they attempt to solve (2.2b) for an exact \( x \) in “one shot”, meaning in a set sequence of a finite number of algorithmic steps. This is in contrast to iterative methods, which start from an initial guess and try to generate a sequence of improving solution approximations for (2.2b). These approximations converge to the precise solution only in the limit (\( \text{iter} \to \infty \)).
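
The contrast between the two families can be sketched as follows for a small, real-valued, diagonally dominant system. The direct solver is plain Gaussian elimination with partial pivoting and the iterative solver is Gauss-Seidel with a residual-based stopping criterion. Both are simplified dense illustrations with hypothetical names and parameters (`tol`, `max_iter`), not the sparse library routines used in the cited works.

```python
def solve_direct(A, y):
    """Gaussian elimination with partial pivoting: a fixed, finite step count."""
    n = len(A)
    # Augmented matrix [A | y], copied so the inputs are left untouched.
    M = [list(map(float, row)) + [float(b)] for row, b in zip(A, y)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= factor * M[k][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def solve_gauss_seidel(A, y, tol=1e-10, max_iter=1000):
    """Gauss-Seidel: refine an initial guess until the residual is below tol."""
    n = len(A)
    x = [0.0] * n  # initial guess
    for it in range(1, max_iter + 1):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (y[i] - s) / A[i][i]
        residual = max(
            abs(sum(A[i][j] * x[j] for j in range(n)) - y[i]) for i in range(n)
        )
        if residual < tol:
            return x, it
    return x, max_iter


A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
y = [2.0, 4.0, 10.0]
x_direct = solve_direct(A, y)
x_iter, n_iter = solve_gauss_seidel(A, y)
print(x_direct, x_iter, n_iter)
```

Loosening `tol` lets the iterative solver stop earlier with a less accurate answer, an option that a direct method does not offer.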

The realization of these theoretical problem-solving algorithms on computing platforms is the focus of numerical linear algebra. Numerical linear algebra implementations are usually based on low-level building blocks of elementary operations. These operations are traditionally grouped in three levels of Basic Linear Algebra Subprograms (BLAS) functionality [45, 46, 47]. Usually the performance is heavily dependent on how well the BLAS software realization matches/exploits the specificities of the host platform. This explains the wealth of vendor/architecture-specific/optimized BLAS implementations that are available, e.g. [48, 49].

Higher-level linear algebra libraries are constructed by invoking kernels from one of the three basic BLAS levels. They usually provide a broader range of linear algebra routines, such as matrix factorizations, linear system solving and eigen-analysis. Given that in most cases these libraries are based on some lower-level BLAS implementation, their performance depends strongly on the performance of the latter. A survey of linear algebra libraries is given in [50].

The major trend in including linear algebra algorithms in the flow of a power system application is the utilization of off-the-shelf libraries [51, 52, 53, 54]. Efforts to develop implementations tailored to the very specific needs of the problem also exist [55, 56]. Table 2.2 summarizes linear algebra library utilization in power system research.

The sparse matrix-vector multiplication problem of (2.2a) falls under BLAS level 2 functionality, with the naming convention \( \text{xyyMV} \). A performance edge may be gained if special properties of the matrix/vector operands are exploited, such as real/complex entries, (block) triangular/banded/Hermitian/symmetrical structure, sparsity, diagonal dominance, positive definiteness, etc. [56].

For the linear system solving problem of (2.2b), the majority of power system related work uses predominantly direct methods, such as L(D)U decomposition with or without pivoting [36, 57, 40, 5, 3], as well as multifrontal Cholesky and QR decompositions [52]. Among these implementations, superior results are reported for KLU in [40, 94, 69, 54, 95].

Preconditioned iterative methods, sometimes called inexact, have recently emerged in the power system application landscape [70, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 53, 83]. As mentioned, iterative methods are able to reduce the residual of the linear solution \( r = A \cdot x - y \) down to a predefined desirable tolerance, in this way avoiding the extra computational cost of an over-solution. They can be more easily parallelized and are thus deemed more suitable for multi-core platforms [60]. Also, they are not as critically dependent on the ordering of the matrix \( Y \) as direct methods [38]. Computationally, iterative solvers prevail only when the number of buses is rather high (\( > 21k \) [53], \( > 2k \) [99]), and this advantage becomes quite significant for a very high number of buses (\( > 300k \) [103]). As in the direct method case, iterative solvers are integrated into power system related works mostly as off-the-shelf linear system libraries.

Table 2.2 – Linear algebra libraries in power system research and applications

<table>
<thead>
<tr><th>Library name</th><th>Related work</th></tr>
</thead>
<tbody>
<tr><td>SuperLU</td><td>[57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67]</td></tr>
<tr><td>KLU</td><td>[68, 51, 40, 69]</td></tr>
<tr><td>UMFPACK</td><td>[52, 70, 71, 72, 73, 67, 69, 74, 75]</td></tr>
<tr><td>NMath</td><td>[76]</td></tr>
<tr><td>ILNumerics</td><td>[77]</td></tr>
<tr><td>ALGLIB</td><td>[78]</td></tr>
<tr><td>SPARSKIT</td><td>[79, 80, 81]</td></tr>
<tr><td>MUMPS</td><td>[70]</td></tr>
<tr><td>PARDISO</td><td>[58, 69, 82, 83]</td></tr>
<tr><td>PETSc</td><td>[84, 85, 39, 86, 87, 34]</td></tr>
<tr><td>SuiteSparse</td><td>[88, 83, 89]</td></tr>
<tr><td>CXSparse</td><td>[90, 91]</td></tr>
<tr><td>Hypre</td><td>[92]</td></tr>
<tr><td>NICSLU</td><td>[92]</td></tr>
<tr><td>HSL [93]</td><td>[5, 6]</td></tr>
</tbody>
</table>

### 2.2.2 Coherence between the algorithm and the platform

The coherence between specificities of the algorithm and the platform is essential for high performance. This coherence refers to how well the algorithm is adapted to the particularities of the architecture, e.g. a vectorized version of matrix-vector multiplication will run faster on a machine with SIMD instructions. Vice versa, it also refers to how suitable the platform is to the nature/frequency/type of operations executed in the algorithm, e.g. an architecture with high memory bandwidth is more suitable for an algorithm with many memory access operations.

In linear algebra algorithms in general, and the ones for (2.2a) & (2.2b) in particular, the following general observations can be made. Some of them are more pertinent to dense rather than sparse versions of linear algebra operations.

a. The vast majority of the operations are non-integer arithmetic operations (in contrast to integer arithmetic, bitwise, logical, etc. operations).

b. Operations feature data parallelism, i.e. they can usually be vectorized, e.g. matrix-vector multiplication can be decomposed into multiple vector inner products.

c. The operands change often, i.e. a limited number of operations is applied to a large number of operands and not vice versa; e.g. in the multiplication of a vector by a scalar, a single multiplication operation has to be performed on every element of the vector.

d. The operands usually change in a regular, non-random fashion, e.g. in the multiplication of a vector by a scalar, the elements of the vector are accessed incrementally, one by one.

e. There are many loops, e.g. across columns and rows (in the dense case) for a matrix multiplication by a scalar.

Conventional General Purpose (GP) single-core Central Processing Units (CPUs) are not optimally designed to handle such requirements. Especially troublesome are the vectorial operations in matrix manipulations, and the speed limitations in memory access. The performance penalty cannot be offset by increasing the clock frequency beyond a practical limit, mainly because of electronic limitations and power consumption issues.

The idea of platforms/architectures dedicated to specific problems is based on the obvious observation that routines run faster on hardware specifically made for them. Various architectures have been proposed and established that overcome the linear algebra performance shortcomings of the traditional CPU. Architectures dedicated to power system linear algebra are the ones that most effectively handle the specificities (a)-(e) identified above.

Point (a) is now trivially handled by the inclusion of hardware that can handle non-integer arithmetic operands, e.g. Floating-Point Units (FPUs) that are ubiquitous in modern day processing units.

Point (b) is handled by vectorizing the datapath of the arithmetic unit of the platform. This can be done either by the inclusion of Single Instruction Multiple Data (SIMD) instructions in the Instruction Set (IS) of the machine, or by pipelining, e.g. in linear and/or systolic arrays of sub-CPUs.

Point (c) relates to the ratio of compute power to memory bandwidth of the platform. Linear algebra in that sense is particularly memory-intensive. The issue can be handled by increasing the overall memory bandwidth or by increasing the memory bandwidth locally, e.g. by the introduction of distributed memory (so that processing units (PUs) may have faster access to their local memory). Generally, exploiting the memory hierarchy has been a major driving force for the advance of linear algebra libraries.
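
The memory intensity of point (c) can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a dense double-precision matrix-vector product in which every operand is moved to or from memory exactly once (no cache reuse); the function name and these assumptions are illustrative and do not come from the cited works.

```python
def matvec_arithmetic_intensity(n, bytes_per_value=8):
    """Floating-point operations per byte of memory traffic for y = A * x.

    Assumes a dense n-by-n matrix and no cache reuse: A and x are read once
    and y is written once.
    """
    flops = 2 * n * n                                # one multiply and one add per matrix entry
    bytes_moved = bytes_per_value * (n * n + 2 * n)  # read A, read x, write y
    return flops / bytes_moved


for n in (10, 1000, 100000):
    print(n, round(matvec_arithmetic_intensity(n), 4))
```

Under these assumptions the ratio stays below 0.25 operations per byte regardless of the matrix size, so the memory system rather than the arithmetic unit tends to limit the throughput of this operation.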

Point (d) strongly relates to point (b) in that the operands of many linear algebra super-operations are vectors. Thus, their implementation may be sped up using automatic counters, so that arithmetic operations are performed on successive positions of a contiguous memory space, similar to the counters in direct memory access (DMA) units. A 1-D vectorization of this kind is found at the core of many sparse matrix storage formats and computation schemes [109].
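
As an example of such a one-dimensional layout, the sketch below stores a matrix in the compressed sparse row (CSR) format and multiplies it by a vector. CSR is chosen here as a commonly used format and is an assumption of this illustration, not necessarily the scheme of the cited work; the function names are hypothetical.

```python
def dense_to_csr(A):
    """Convert a dense matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in A:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)    # non-zero entries, stored contiguously
                col_idx.append(j)   # column of each stored entry
        row_ptr.append(len(values))  # where the next row starts in `values`
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """y = A * x using only the stored non-zero entries."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        # The index k sweeps a contiguous slice of `values` and `col_idx`.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


A = [[4.0, 0.0, 0.0], [0.0, 3.0, 1.0], [2.0, 0.0, 5.0]]
values, col_idx, row_ptr = dense_to_csr(A)
print(csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))
```

The matrix entries are read strictly in storage order, which is the incremental access pattern of point (d); only the accesses to `x` are indirect.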

Point (e) can be effectively tackled by hardware-controlled looping, which reduces loop overhead. The parallelized nature of linear algebra suggests that such operations run best on analogously parallel platforms. It is no surprise that most platform modifications aim at some sort of parallelization of the computation datapath, since it is nowadays established that performance increase is to be sought in the parallelization domain [110, 111].

## 2.3 Dedicated platforms

Most platforms combine more than one of the above, in an effort to maximize benefits. Comparative studies on the performance of different platform architectures in linear algebra applications can be found in [112, 111, 113, 114, 115, 116]. An overview of platform paradigms is given in the following subsections.

### 2.3.1 SIMD

One kind of mildly dedicated platform is the processor that implements Single Instruction Multiple Data (SIMD) instructions, i.e. that is able to process multiple data within the same instruction cycle or in a pipelined fashion (see also points (b) and (d)). Almost all modern processing units have SIMD capabilities (VIS, MMX, 3DNow!, SSE1-5, XOP, FMA3-4, AVX/-512) which can be exploited by linear algebra implementations, e.g. [48]. Of course, such implementations should take into account the vectorizing capabilities of the hardware, and therefore follow a different coding philosophy. Generally, assembling operands into vectors that are fed into higher-level BLAS routines should be pursued, as it makes better use of the capabilities and the memory hierarchy of the platform [117].

The potential of SIMD architectures for power system oriented, linear algebra applications was identified as early as their introduction as “array” or “vector processors” [31, 118, 119, 120, 121], and has enjoyed widespread popularity ever since [122, 123, 124, 125, 117, 90, 91]. SIMD functionality (data parallelism) is nowadays usually integrated into systems with multiple levels of parallelism (instruction and task parallelism) in multi-core platforms (see next section) [122, 123, 124, 125, 117].

### 2.3.2 Multi-core

A natural extension of the level of parallelism of the platform are multi-core architectures. In this work, for simplification purposes, multi-core is understood as having more than one datapath, no matter how this is physically implemented, e.g. multiple cores on a die, multiple dies in a package, multiple packages in a system, etc. In this way, the term multi-core in this work encompasses the (normally wider) terms of multi-chip CPUs and multi-CPU systems. For example, a cluster of 5 computers, each of which has 2 quad-core processors, is counted as a 40-core system in this respect. The recent Multiple/Many Integrated Cores (MIC) accelerator, and Single-Chip Cloud Computer (SCC) [126] hardware fall into the wider category of multi-core architectures, as defined hereinabove.

Multi-core architectures may have different connectivity and arbitration patterns (symmetric, asymmetric), memory arrangements (shared, distributed, shared-distributed), and may also implement SIMD instruction sets, but in any case they are expected to be able to execute multiple processes without time-sharing, i.e. fully in parallel. Most linear algebra routines can be naturally parallelized (e.g. matrix multiplication [127]) up to the limit set by Amdahl's law. Another slightly different flavor of parallelization is to cast the algorithms into a block/tile/panel version with blocks shared between different computing cores [128], or the problem can be treated in a domain-decomposed way, with each of the sub-problems being handed to a core (e.g. linear system solving with the Schur decomposition [129], or multi-grid methods [130]).
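
The limit mentioned above follows from Amdahl's law: if a fraction \( p \) of the runtime can be parallelized over \( n \) cores, the speedup is bounded by \( 1 / ((1 - p) + p / n) \). A minimal sketch follows; the parallel fractions used are hypothetical values, not measurements from the cited works, and the 40 cores echo the cluster example given earlier.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Upper bound on the speedup when only part of the work is parallelizable."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


# Hypothetical parallel fractions evaluated on the 40-core example system.
for p in (0.5, 0.9, 0.99):
    print(p, round(amdahl_speedup(p, 40), 2))
```

Even with 90% of the work parallelized, the speedup can never exceed 10, however many cores are added.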

The vast majority of high-grade linear algebra takes advantage of multi-core architectures and vectorization when possible. MIC-accelerated multi-core systems successfully rival more dedicated architectures (e.g. GPU-accelerated) in terms of performance [131].

**In power systems**

Power system particularities for the special case of (2.2a) & (2.2b) have been considered in [132, 133, 134]. Generally, the majority of applications in the power system domain exploit multi-core architectures for the linear algebra part. This is usually done through the inclusion of a multi-core enabled linear algebra library. Related work covers many power system analyses, e.g.

- power flow studies [135, 136, 137, 138, 139, 140, 141, 142],
- voltage stability studies [143],
- transient stability/electromechanical transients simulation [144, 137, 145, 146, 147, 148, 138],
- state estimation [149, 60, 140],
- electromagnetic transients simulation [146].

All parallelized power system analyses assume the execution of the application on a platform with parallel capabilities, i.e. multi-core, often with SIMD support. In any case, multi-core platforms call for *parallel programming*. Software parallelization should always be in accord with hardware capabilities. A detailed presentation of the topic is outside the scope of this work.

### 2.3.3 Heterogeneous computing

Another class of specialized platforms are the ones that follow the heterogeneous computing paradigm. Heterogeneity refers to the use of processing units (cores) with dissimilar architecture and properties in an effort to get the best of all worlds. The workload is divided between different PUs based on their specialization. Usually, heterogeneous architectures are centered around one (or possibly more) GP PU(s) that handles generic, mostly serial, tasks. The GP PU(s) are interfaced with dedicated PU(s), often referred to as coprocessors or accelerators, which handle specialized tasks such as vector operations, intense floating point mathematics, signal processing, etc.

Over time, heterogeneous computing has gained significant acceptance for scientific computing [112], especially for linear algebra applications [111]. Connectivity, communication schemes and memory arrangements between the different PUs (processors and coprocessors) vary. The coprocessors that have been credited with the biggest potential in linear algebra applications are MICs (see previous subsection), Graphical Processing Units (GPUs), Digital Signal Processors (DSPs), Field Programmable Gate Arrays (FPGAs) and Application Specific Integrated Circuits (ASICs), all briefly presented in the following subsections.

### 2.3.4 GPU

GPUs are highly parallel vector processors with local memory for each of the processing cores. The total amount of memory is usually smaller than traditional multi-core CPUs (across different levels of cache) but the number of processing cores is normally larger [140]. They were initially targeted to graphics operations, but nowadays they are increasingly used as GP PUs, in what is known as General-Purpose Computing on Graphics Processing Units (GPGPU) arrangements [150]. Their characteristics make them ideal for a wide range of operations similar in computational nature [151].

The highly parallelized, wide-width SIMD datapaths of GPUs can tackle many of the points identified as problematic in section 2.2.2. Therefore a great wealth of GPU-enabled linear algebra work exists in the literature, e.g. [152, 153]. Most of the time the integration of GPUs is done through linear algebra libraries that support GPU acceleration. Loop restructuring for the parallel operations (vectorizing, chaining, blocking, tiling) is critical in the process [107]. GPUs are often utilized en masse and/or combined with multi-core GP PUs in hybrid architectures, e.g. [154, 155]. The FLOPS performance of modern GPUs is often one order of magnitude higher than that of GP CPUs. Given an optimized implementation, GPU speedups in the range of order(s) of magnitude are reported in most of the above; the most conservative estimate, a speedup of 2.5x, is given in [156]. One limitation of GPUs is that they have a higher ratio of compute power to memory bandwidth than GP (C)PUs, which might lead to memory bottlenecks for GPU kernels [157]. Concerns have also been raised regarding their energy efficiency per operation [158], but relevant results in the Green500 list show that the first 17 spots are all occupied by heterogeneous GPU-accelerated architectures (as of June 2014) [159].

**In power systems**

Power system applications with GPU-accelerated linear algebra operations have been presented for many types of analysis:

- power flow [160, 161, 162, 163, 164, 165, 166, 167, 168],
- state estimation [169, 170],
- transient stability/electromechanical transients simulation [171, 172, 173, 174],
- electromagnetic transients simulation [175, 176, 177, 178, 179],
- optimization, e.g. Optimal Power Flow (OPF) [180], Unit Commitment [179].

In all cases, the heterogeneous computing concept is used: the CPU assumes main control of the application and the GPU accelerates the burdensome parts.

### 2.3.5 FPGA

Field Programmable Gate Arrays (FPGAs) are integrated circuits that can be reconfigured after their manufacturing. This enables the designer of an FPGA platform to tailor it to the specificities of the application. FPGAs have recently seen penetration into the supercomputing domain, and this trend is expected to continue [181]. Because of the reconfigurability of the datapath and the high local distributed memory bandwidth, FPGAs have been deemed particularly suitable for linear algebra kernels, e.g. [182]. In most recent implementations, speedups of one order of magnitude over conventional platforms are reported for linear system solving [183].

A key enabler of FPGA utilization in linear algebra applications has been the addition of “hard” arithmetic cores, particularly for floating point operations [184], see also point (a) in section 2.2.2. To target the vector/matrix nature of linear algebra operations, the dataflow arrangements in FPGAs include systolic [185] and serial/parallel linear arrays [186]. The use of pipelining is also ubiquitous in the above.

**In power systems**

A lot of recent research has focused on accelerating typical power system analyses through the use of FPGAs:

- power flow [187, 188, 189, 190, 191, 192, 193, 194]
- electromagnetic transients simulation [195, 196, 197, 198, 199, 200, 201, 202]
- optimization [203, 191]

As for GPUs, in most of the cases the FPGA is used as a dedicated accelerator running in conjunction with a conventional machine that handles the main flow of the application.

### 2.3.6 DSP

Digital Signal Processors (DSPs) are processing units optimized for the needs of digital signal processing. Digital signal processing operations are related to points (b) and (c) of section 2.2.2 and the datapath of DSPs is optimized for them. This high degree of specialization makes the instruction set of DSPs highly irregular, which renders them less suitable for general purpose computations. Streaming (direct memory access, DMA), pipelining (single-cycle instructions), SIMD with very-long-instruction-word (VLIW) techniques and hardware multiply-accumulate (MAC) units are commonplace in DSP architectures.

The use of DSPs for numerical linear algebra has been readily investigated [204, 205, 206]. It normally involves an array of DSP processors [204] or a combination of DSPs with GP PUs in heterogeneous architectures [207]. Because of the exceptional power efficiency of DSP resources, these heterogeneous architectures have achieved some impressive FLOPS/Joule figures [206]. Throughput is found to stabilize above a matrix size of approximately 500-1500, and scaling with the number of DSP cores has been shown to be quasi-linear [206, 204]. In [205] a sparse matrix-vector multiplication is realized, similar to the one required by (2.2a). The potential of DSPs to surpass GPUs in terms of power efficiency, if manufactured using modern fabrication techniques, is highlighted there (subsequently realized by TI® in the C667x series, as of March 2013).

### 2.3.7 ASIC and VLSI

The most dedicated platforms that can be constructed are Application Specific Integrated Circuits (ASICs), built through Very Large Scale Integration (VLSI) processes. ASICs are expected to deliver performance that greatly exceeds that of general purpose PUs, at the cost of significantly reduced flexibility/reconfigurability. For linear algebra, due to the vectorial nature of operations, proposed ASIC solutions have linear [208] or systolic array [209, 210, 211] arrangements.

For the sake of completeness, it should be mentioned that in between general purpose and fully application-specific solutions (ASICs) there exist the Application-Specific Instruction Set Processors (ASIPs) [212]. These PU cores provide a tradeoff between the flexibility of general-purpose computing and dedication [213]. ASIP linear algebra implementations have a fixed instruction set part for GP computing and adapt their reconfigurable part to linear algebra needs, e.g. matrix singular value decomposition (SVD) in [213].

Conceptual designs for “power-system computers” based on custom VLSI ASICs have been proposed [214, 215, 216]. All of the related work is conceptual and is based on systolic arrays, in an effort to match the nature of linear algebra computations. Nowadays, ASIC concepts can be proven by rapid-prototyping them on FPGAs.

2.3.8 Conventional computing and its limitations

All of the above platforms are based on conventional technology. The term conventional encompasses three main aspects. Conventional computers are

- **digital**, in that data is represented using discrete values (for a finer difference between digital and its contrary, analog, see [217]),
- **binary**, in that the discrete values of the data are encoded, stored and manipulated using their binary representation, i.e. “ones” and “zeros”, and
- **electronic**, in that the carrier of information is electricity, i.e. voltage, current and charge.

Digital binary electronic computers are manufactured using (very) large scale integration (VLSI) techniques on semi-conductor materials, mostly silicon, overlaid with various conducting (metal) or insulating layers. A set of circuits that implement a certain functionality is packed on the same physical piece, the chip, and the result is called an integrated circuit (IC). All of the platforms mentioned in the previous subsections, e.g. CPUs, GPUs, FPGAs, etc. come in the form of ICs. More complex designs are built on special boards, the Printed Circuit Boards (PCBs), with metallic connections that connect many ICs together.

Moore's law states that the computing power of digital electronic computers increases exponentially over time. Recently, fatigue in this trend has been observed [218]. The features of the semiconductors on the IC have been continuously scaled down, and limitations are now faced due to the (lack of) accuracy of the available lithographic technology. Another important factor is the increase of the frequency that the chips are clocked at. The two contribute to a variety of physical problems such as excessive heating of the chip, excessive power consumption (through leakage currents) [219], increased probability of errors due to manufacturing defects [220] and the appearance of quantum phenomena (e.g. quantum tunneling) that impede the proper functioning of the device [221].

2.3.9 Unconventional computing

Problems with conventional architectures have led to the emergence of unconventional computing paradigms. Unconventional can be understood as differing in any of the above three aspects that define conventional computing. Various such platforms have been proposed, such as optical computers (non-electronic), quantum computers (many different proposals), bio-computers (non-electronic) and wetware computers (non-digital and non-electronic), mechanical computers (non-electronic), analog electronic computers (non-digital), etc. All of the above try to gain an advantage over conventional platforms in some metric, for example speed, size, energy consumption, heat generation, etc. Most of these technologies are either in their infancy (optical, quantum, bio) or have been deemed deprecated for scientific computation use (mechanical, analog electronic).

Depending on the properties of the underlying architecture some have proven better suited to linear algebra applications, such as optical computers [222], quantum computers [223, 224] and analog electronic computers. Implementations are usually experimental and of limited practicality, but serve well as concepts to get a better understanding of the technologies. Analog electronic computers are of particular interest to this work and will be analyzed in a separate section.

2.4 Analog electronic computers

In analog electronic computers data is represented using continuous electrical quantities (e.g. voltage, current, charge, etc.). The architecture of analog computers includes reconfigurable electronic devices such as potentiometers, operational amplifiers, controlled sources etc. The functionality desired from the computer can be obtained by setting the values of the reconfigurable elements as well as the connectivity pattern between them. Their scope of application includes simple arithmetic, the solution of differential equations, learning algorithms, neural network implementations, signal processing, linear algebra operations etc.

The main source of performance gain for analog computers is that the laws of physics themselves act as the computing force behind them [225]. The time that it takes to solve a problem on an analog computer depends only on the settling time of the computer. Additionally, this settling time is little affected by increasing the size of the problem [226]. From a theoretical point of view there is evidence that some analog configurations, such as neural nets, possess extraordinary computing properties [227, 228].

Apart from performance benefits, additional favorable characteristics of analog architectures have long been identified [229]. The ability to represent continuous real-world quantities with analogously continuous ones is a natural advantage [230]. Also, a continuous model of computation better fits the specificities of some kinds of problems such as differential equations [231]. Finally, the power efficiency and related problems of digital architectures are a major driver for a possible re-emergence of analog computing [232, 226].

Analog computers can be coupled with digital in arrangements that are called hybrid [233]. They are also called mixed-signal, in the sense that analog and digital signals coexist on the same hardware, and discrete-continuous because of the nature of the digital and analog signals. The transition between the digital and the analog domain is realized by arrays of analog-to-digital (ADC) and digital-to-analog converters (DAC).

The main reason for implementing hybrid systems is to harness the benefits of both worlds, the computational prowess of the analog and the accuracy/precision of the digital. Often, the analog part is used to provide a seed (initial condition) for a digital part that takes over afterwards to produce the final accurate solution [232]. A seed closer to the correct solution greatly enhances the speed of many iterative numerical algorithms. Various dedicated and general-purpose hybrid platforms have been proposed [234, 235, 232, 4].

Due to the pervasiveness of digital computing in our era, all modern hybrid platforms are "digital-computer-oriented". The core platform is digital and the analog part is used as a black-box accelerator, dedicated to a specific operation, e.g. the "math-coprocessor" in [236]. All analog signals are brought back to the digital domain, and usually no analog interface to the outside world is available.

2.4.1 In linear algebra

Some characteristics of analog platforms make them suitable for linear algebra applications. Points (b), (d), (e) of section 2.2.2 can be handled naturally in analog with meshed grid structures. This is because an analogy is drawn between the linear algebra operation and the physics that govern the behavior of such analog structures. More simply, analog grid computers are governed by linear algebra relations, exactly like the ones they are required to execute; therefore, the "mapping" is natural.

Naturally, the solution will suffer from errors caused by inaccuracies in the values of the analog components. From a mathematical point of view, these can be seen as epsilon perturbations to the perfect values of the operands (matrices and vectors).

The solution speed depends on the settling time of the device, which, under certain circumstances, can be significantly shorter than the solution time of digital solvers. This is so because the laws of physics perform the "computations". The settling time is owed to the non-instantaneous response of any physical circuit, due to parasitic and possibly non-parasitic capacitances in the circuit (and the resulting RC constants). Any additional auxiliary times, e.g. ADC and DAC conversions, should be added to this settling time. In total, in analog linear algebra, accuracy is traded for performance.
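As a rough illustration of how an RC constant translates into a settling time, the sketch below assumes that the circuit response is dominated by a single pole, so that the output approaches its final value as $1 - e^{-t/RC}$. This first-order model, the function name `settling_time` and the component values are assumptions made for illustration only; they are not taken from any of the cited platforms.

```python
import math

def settling_time(r_ohms, c_farads, tolerance):
    """Time for a first-order RC response to come within `tolerance`
    (relative error) of its final value: t = R * C * ln(1 / tolerance)."""
    return r_ohms * c_farads * math.log(1.0 / tolerance)

# Hypothetical values: 10 kOhm, 10 pF, settle to within one 8-bit step.
R_EXAMPLE = 10e3
C_EXAMPLE = 10e-12
tau = R_EXAMPLE * C_EXAMPLE
t_settle = settling_time(R_EXAMPLE, C_EXAMPLE, 2.0 ** -8)

# Number of time constants needed for 8-bit settling (8 * ln 2).
time_constants = t_settle / tau
```

Under this assumption the settling time grows only logarithmically with the required accuracy, while conversion times of the ADCs and DACs would have to be added on top.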

Exploratory studies on linear algebra using analog structures were carried out in the 1960s and the 1970s [237, 238, 239, 240, 241, 242, 243] and the early 1990s [244, 245, 246]. Recent relevant work has been based on Field Programmable Analog Arrays (FPAAs) [247, 248]. In [237, 238, 241] the authors dealt with the solution of linear systems that arise in the finite difference solution of partial differential equations, such as the diffusion equation. The resulting matrix has a very specific fixed structure, therefore a fixed-connectivity resistor lattice is used. The linear algebraic properties of the connectivity matrix of the resistor network are elaborated in [240]. In [242] it is correctly identified that the use of such fixed topologies of purely passive components significantly narrows the range of matrices that can be handled. As a solution to the problem, the addition of active elements is proposed. This is similar to the use of negative resistor values in [237, 238].

References [237, 238, 240, 241, 242, 245] all use a hybrid (digital-analog) iterative scheme. They integrate the (finitely precise) analog solver into a digital iterative algorithm. In [242, 243, 245] the convergence properties of such a process are examined and [245] gives insight into its timing. The benefit of this scheme is that analog imprecisions have no effect on the accuracy of the final solution.

Table 2.3 – Analogies between flow networks, power systems, and analog electronic networks

<table>
<thead>
<tr>
<th>Flow network</th>
<th>Power systems</th>
<th>Analog electronics network</th>
</tr>
</thead>
<tbody>
<tr>
<td>Node $v \in V$</td>
<td>Power system bus</td>
<td>Electrical node</td>
</tr>
<tr>
<td>Edge $e \in E$</td>
<td>Power system branch</td>
<td>Electrical branch</td>
</tr>
<tr>
<td>Flow</td>
<td>Current</td>
<td>Electrical current</td>
</tr>
<tr>
<td>Sources and sinks</td>
<td>Injectors (e.g. generators, loads, etc.)</td>
<td>Current sources</td>
</tr>
<tr>
<td>Edge capacity constraints</td>
<td>Branch/transformer ratings</td>
<td>Power ratings of the components of electrical branches</td>
</tr>
</tbody>
</table>

In [246] an analog CMOS neural network implementation for the solution of (overdetermined) linear systems is proposed. The interconnection scheme of [242] is used. Finally, [247, 248] use the resources of an FPAA [249] to implement $2 \times 2$ dense vector-matrix multiplications.
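The hybrid (digital-analog) iterative scheme mentioned in this section can be sketched as a classical iterative refinement loop. In the sketch below the analog solver is merely emulated by an exact solve of a perturbed matrix; this emulation, the 1 % tolerance, the example matrix and the names `analog_solve` and `hybrid_refinement` are assumptions for illustration and do not reproduce any of the cited implementations.

```python
import numpy as np

def analog_solve(A_analog, rhs):
    """Stand-in for the analog solver: exact solve of a perturbed matrix."""
    return np.linalg.solve(A_analog, rhs)

def hybrid_refinement(A, b, A_analog, iterations=30):
    """Digital outer loop that corrects the solution with inexact analog solves."""
    x = np.zeros_like(b)
    for _ in range(iterations):
        residual = b - A @ x                       # evaluated digitally
        x = x + analog_solve(A_analog, residual)   # inexact analog correction
    return x

# Hypothetical diagonally dominant 4x4 system.
A = np.array([[4.0, -1.0, 0.0, -1.0],
              [-1.0, 4.0, -1.0, 0.0],
              [0.0, -1.0, 4.0, -1.0],
              [-1.0, 0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])

# Assumed component tolerance: every matrix entry is off by up to 1 %.
rng = np.random.default_rng(0)
A_analog = A * (1.0 + rng.uniform(-0.01, 0.01, size=A.shape))

x_exact = np.linalg.solve(A, b)
error_one_shot = np.linalg.norm(analog_solve(A_analog, b) - x_exact)
error_hybrid = np.linalg.norm(hybrid_refinement(A, b, A_analog) - x_exact)
```

Because the residual is computed digitally with the exact matrix, the error of the inexact solver shrinks at every pass, provided the perturbation is small enough for the iteration to contract.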

2.4.2 In power systems computing

Analog platforms have a long history in power system applications [250]. This mainly comes from the fact that the two domains, power systems and analog electronics, feature topological similarities, as they are both typical examples of linear flow networks. A flow network consists of vertices $V$ called nodes and edges $E$ called arcs. The connectivity between edges and nodes is given by a directed graph $G = (V,E)$. Flow $f : V \times V \to \mathbb{R}$ (or $\mathbb{Z}$ for integer flows) is conveyed by edges, which have a given limited capacity $f(u,v) \leq c(u,v)$. On the vertices, sinks or sources may be present. The aggregate input and output flow to/from a given vertex must be equal, as per the flow conservation property. The analogies between a power system, an analog electronic network, and a generic flow network are shown in table 2.3.
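The two defining properties of a flow network, the capacity constraint and flow conservation, can be checked in a few lines. The network below, its node names and its numerical values are purely hypothetical; sources are modelled as positive and sinks as negative injections, which is a convention assumed here.

```python
# Hypothetical flow network: edge flows f(u, v) and capacities c(u, v).
flows = {("s", "a"): 3.0, ("s", "b"): 2.0, ("a", "t"): 3.0, ("b", "t"): 2.0}
capacities = {("s", "a"): 4.0, ("s", "b"): 2.0, ("a", "t"): 3.0, ("b", "t"): 5.0}
injections = {"s": 5.0, "a": 0.0, "b": 0.0, "t": -5.0}  # source > 0, sink < 0

def respects_capacity(flows, capacities):
    """Capacity constraint: f(u, v) <= c(u, v) on every edge."""
    return all(flows[edge] <= capacities[edge] for edge in flows)

def net_outflow(node, flows):
    """Flow leaving a node minus flow entering it."""
    leaving = sum(f for (u, _), f in flows.items() if u == node)
    entering = sum(f for (_, v), f in flows.items() if v == node)
    return leaving - entering

def conserves_flow(flows, injections, tol=1e-12):
    """Flow conservation: the net outflow of each node equals its injection."""
    return all(abs(net_outflow(node, flows) - injected) <= tol
               for node, injected in injections.items())

capacity_ok = respects_capacity(flows, capacities)
conservation_ok = conserves_flow(flows, injections)
```

In the electrical analogy of table 2.3 the conservation check corresponds to Kirchhoff's current law at every node.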

The analogies highlighted above have been exploited to create analog computers dedicated to power systems computations. Up until the 1960s transient network analyzers (TNA), also called AC network analyzers, were the established platforms for power system analysis purposes, e.g. power flow studies, short circuit analysis, transient stability studies, etc. They included scaled-down models of the elements of the real grid and operated with real analog voltage and current quantities.

The main disadvantages of TNAs were their cost (in terms of size, power consumption and money), lack of scalability (in terms of the limited size of systems that could be simulated) and lack of flexibility (in terms of models of the grid components that could be simulated). However, they managed to provide real-time simulation of events at a point in time when the computational power of digital computers was still insubstantial. This was because they implicitly performed linear algebra operations such as (2.2a) and (2.2b) using analog computing (through their wirings), and hence they were using the ultra-fast laws of physics as computing force. Additionally, they were immune to issues arising from the numerical solution of the problem, e.g. numerical instability.

2.4.3 Evaluation criteria

The criteria on which analog electronic computers can be evaluated are the following.

Cost. This refers to the dollar cost of the platform. Generally this metric is rather unfavorable
for analog computers, since the number of the required resources scales proportionally to the
system that has to be analyzed - see also “scalability” hereunder.

Size. This refers to the physical size of the computer. This can be quite critical since an analog
computer is composed of actual physical electronic components. Especially in the past,
TNAs used to be quite large and an entire room used to be dedicated to the TNA.

Interfacing. This refers to the capabilities of the analog computer to be interfaced to other
computing platforms. This is especially important nowadays since practically all modern
computing infrastructure is digital.

Reconfigurability. This refers to the capabilities of the computer to be reconfigured, in the
sense of performing different (mathematical) operations. Reconfiguration implies galvanically
connecting the analog components of the computer. This wiring was commonly done
manually in older machines [251].

Scalability. Scalability refers to the ability of the computer to handle a larger amount of
work. For power system problems this translates to larger power system sizes. In order to
accommodate this growth the analog computer has to be enlarged with more real physical
analog electronic components. Additionally, larger problems may affect the performance of
the computer, as explained in the next paragraph.

Performance. The solution time of an analog computer greatly depends on the settling
time of its electronics. The latter is affected by parasitic and non-parasitic elements (e.g.
capacitances).

Accuracy and precision. The solution of an analog computer to a problem can be considered
a random variable $X$ with a normal distribution $N(\mu, \sigma^2)$. Accuracy is understood as how close the
solution $x$ computed in analog for a problem is to the true one $x^\ast$. Precision is understood
as how close repeated solutions in analog are to each other. Accuracy can be quantified by the
distance of the mean value $\mu$ of $X$ from $x^\ast$ and precision by the standard deviation $\sigma$ of $X$. The main cause of inaccuracy
is the inherent inaccuracy of the electrical components that are used as well as losses of all
kinds, e.g. leakage currents, parasitic impedances, etc. The main cause of imprecision is the
Johnson-Nyquist (thermal) noise, which can often be approximated as white noise. The higher
the signal-to-noise ratio (SNR) in the system, the higher its precision.

Power efficiency. Power efficiency can be understood as performance per watt. Joule losses account for the majority of the consumption in analog computers.

Functional completeness. Functional completeness refers to the range of computations that can be performed by the analog computer. By definition, analog components are not universal in the way that digital resources are in the sense of a Turing machine, i.e. analog components are bound to executing the specific operation they were (a priori) manufactured for. This can significantly hamper the modeling/functional capabilities of the analog computer.
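The distinction between accuracy and precision in the criteria above can be made concrete with a small numerical experiment. The sketch below draws repeated "analog" solutions from a normal distribution and takes the distance of the sample mean from $x^\ast$ as the measure of accuracy and the sample standard deviation as the measure of precision. The bias, the noise level and the value of $x^\ast$ are hypothetical numbers chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

x_true = 1.0          # x*: exact solution (hypothetical value)
bias = 0.02           # systematic error, e.g. component tolerances (assumed)
sigma_noise = 0.005   # standard deviation of the noise (assumed)

# Repeated analog solutions of the same problem, X ~ N(mu, sigma^2).
samples = x_true + bias + sigma_noise * rng.standard_normal(10_000)

mu = samples.mean()
accuracy_error = abs(mu - x_true)   # how far the mean lies from x*
precision = samples.std(ddof=1)     # spread of the repeated solutions
```

Averaging repeated runs reduces the effect of noise on a reported value, but it leaves the bias untouched, which is why the two quantities are treated separately.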

2.5 Outlook

Despite their natural dedicated-ness to power system problems, analog TNAs were abandoned in favor of digital computers running numerical routines, once digital technology of adequate performance was widely available. This was because digital technology was deemed generally superior in almost every aspect. TNAs remained a bit longer in use for education [252], for the simulation of power electronic equipment (e.g. HVDC, SVC) [253] and real-time studies (e.g. HIL) [254], but they were later superseded by Real Time Digital Simulators (RTDSs) in that domain.

An important paradigm that was learned from TNAs was that the calculations for the power grid (the ensemble of interconnecting elements, such as transmission lines and transformers), and the calculations for the components connected to the grid (generators, loads, etc.) were carried out in two connected, but physically different machines [255].

- The grid calculations involved a big linear algebra part (with the admittance matrix) that reflects the interdependence of voltages and currents in the system. This part was handled in a (miniaturized) DC or AC model of the real-world grid.

- The calculations for the bus components were usually differential-algebraic in nature and reflected their dynamic behavior. This part was handled by miniature models of the real-world elements.

Analog computing was brought back to the fore by the work of Fried [256, 257]. The major drive for that was the demand for higher performance highlighted in chapter 1. Great advances in analog electronic technology (especially manufacturing) offered a potential to solve the deficiencies of analog computers in the criteria of section 2.4.3.

In [257], five different approaches with different levels of abstraction and analog computation were proposed. The most promising one is based on the simulation of phasors and uses a real decomposition of the (complex) nodal equation. This gave birth to subsequent related research, centered around three universities:

- the Arizona State University [258, 259, 260],
- the Drexel University [191, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273] and
- the École Polytechnique Fédérale de Lausanne (the home university of the original research of Fried) [256, 257, 274, 275, 276, 277, 278, 279, 280, 281, 4, 282, 283, 284, 285, 286, 287].

This work is a continuation of investigations in the field of the development of hybrid (analog and digital) computers dedicated to power system applications. In chapter 3 a related realized prototype will be presented. The prototype will be evaluated based on the criteria of section 2.4.3 and will be compared to the related state of the art. Advantages and shortcomings will be underlined and a future conceptual design will be proposed in chapter 4 to overcome the drawbacks of the current design.
3 Realized dedicated mixed signal solver

Chapter 3 presents a hardware platform that is dedicated to the transient simulation of power systems. The platform, termed Field Programmable Power Network System (FPPNS), has an analog and a digital part. Analog computing is used for the matrix solution part, while pipelined digital resources on a field programmable gate array (FPGA) handle all non-linear-algebraic computations.

The FPPNS is connected to a host PC as a USB peripheral. On the PC side a power system analysis suite has been developed, termed elab-tsaot. Through a set of OS drivers the dedicated platform is recognized as a hardware accelerator for the computations of the analysis suite. The entire software design and functionality is presented in the chapter.

The mathematical properties of the FPPNS are thoroughly examined. The inaccuracies of the analog hardware are examined and their effect on the final results provided by the FPPNS is shown. The system is evaluated across the nine criteria of section 2.4.3, and a comparison with related state-of-the-art is conducted.

Fig. 3.1 shows a schematic overview of the multi-platform system. On the left there is the dedicated hardware, on the right the software, and in between the USB connection of the two. A color convention has been adopted to denote the different levels of the whole system, shown in Fig. 3.2. A photo of the dedicated hardware connected to a real PC is shown in Fig. 3.3. The hardware part of the system will be referred to as the Field Programmable Power Network System (FPPNS) for the rest of this work and the software part as elab-tsaot.

3.1 Hardware

The FPPNS consists of two parts, an analog part and a digital one. Analog computing is used for the matrix solution part, while pipelined digital resources on a field programmable gate array (FPGA) handle all non-linear-algebraic computations. The analog and the digital domains are interfaced using analog-to-digital (ADC) and digital-to-analog (DAC) converters. An overview of the proposed mixed-signal platform can be seen in Fig. 3.4.

The system is targeted at the transient simulation of power systems. In particular it handles the transient simulation of balanced phasors of electrical quantities for symmetrical three-phase networks in the per-unit system. All related modeling assumptions and theory concerning the above can be found in appendix B.

The general formulation of the problem to solve is the one of (2.1). FPPNS uses a partitioned (and not simultaneous) approach to solve (2.1) [288, 283]. Equations are separated into grid and injection equations. At each time step of the integration, the nodal flow algebraic equation of the grid is solved and then the dynamic behavior of the injections is determined, as per Fig. 3.5.

The nodal voltage equation of the figure is included in the $f_a$ set of (2.1). The injection equations consist of the differential equations $f_i$ and the algebraic equations $g_i$ for each injection $i$. Functions $f_i$ and $g_i$ are included in the $f_d$ and $f_a$ sets of (2.1) respectively. $I$ is the vector of currents injected at each bus and $V$ is the vector of bus voltages. Both $I$ and $V$ are included in the set of algebraic variables $x_a$ of (2.1). One current injection $I_i$ and one voltage $V_i$ are associated with each injector. The latter has a set of differential variables $x_i$ and algebraic variables $(V_i, I_i, y_i)$. The former are included in $x_d$ and the latter in $x_a$ of (2.1). The set of DAEs for each injector is parameterized by a set of parameters $\lambda_i$ - more on this in section 3.1.2.

Figure 3.4 – Overview of the existing mixed-signal computer [4].

Grid equations

\[ \underset{(n \times 1)}{I} = \underset{(n \times n)}{Y} \cdot \underset{(n \times 1)}{V} \]

Injection equations

\[ \dot{x}_i = f_i(x_i, y_i, \lambda_i) \]
\[ 0 = g_i(x_i, y_i, \lambda_i) \]

Figure 3.5 – Partitioned solution scheme for power system simulation equations
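The alternation between a grid solve and an injector update can be sketched in a few lines. The example below is a toy model under strong assumptions: the admittance matrix is real, every injector is a current source with $I_i = x_i$ whose state is driven by the deviation of its bus voltage from a reference, and forward Euler is used as the integrator. None of these modelling choices, nor the name `simulate_partitioned`, are taken from the FPPNS; the grid solve is the step that the analog part would perform.

```python
import numpy as np

def simulate_partitioned(Y, v_ref, gain=1.0, dt=0.01, steps=5000):
    """Alternate a grid solve and an injector update at every time step."""
    x = np.zeros(len(v_ref))   # differential variables x_i, one per injector
    V = np.zeros(len(v_ref))
    for _ in range(steps):
        I = x.copy()                  # toy g_i: each injector injects I_i = x_i
        V = np.linalg.solve(Y, I)     # grid equations I = Y V, solved for V
        x_dot = gain * (v_ref - V)    # toy f_i: drive the bus voltage to v_ref
        x = x + dt * x_dot            # forward Euler step (assumed integrator)
    return V, x

# Hypothetical 2-bus grid with a real, non-singular admittance matrix.
Y = np.array([[3.0, -1.0],
              [-1.0, 2.0]])
v_ref = np.array([1.0, 0.95])

V_final, x_final = simulate_partitioned(Y, v_ref)
```

In each step the injector currents are treated as known when the grid is solved, and the bus voltages are treated as known when the injectors are advanced, which is what distinguishes the partitioned approach from a simultaneous one.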
3.1.1 Analog part

The analog part of the existing platform contains the wealth of reconfigurable electronics that perform the hyper-parallelized analog computation of the linear system (nodal equations), which arises in a power system.

Derivation of the nodal equations

A starting point for the investigation is to consider branches as complex two-port networks. In our case, one port is denoted as the “from” port of the network (related quantities subscripted with the letter $f$) and the other as the “to” port (related quantities subscripted with the letter $t$). Each port is a complex interface for the electrical quantities, current and voltage. The convention is that the currents are entering the two-port network on both sides. For such a two-port network the (complex) admittance parameters are defined as follows.

\[
\begin{bmatrix}
i_f \\
i_t
\end{bmatrix}
=
\begin{bmatrix}
y_{ff} & y_{ft} \\
y_{tf} & y_{tt}
\end{bmatrix}
\cdot
\begin{bmatrix}
v_f \\
v_t
\end{bmatrix}
\tag{3.1}
\]

Every interconnecting element of the power system grid can be modeled as a two-port network such as the above. Three-port elements such as three-winding transformers can be considered as a combination of three normal two-winding transformers modeled with their $\pi$-equivalent, and one fictional bus [36].

Branch models can have various degrees of complexity depending on the time scale of phenomena of interest. It is very common for studies in the transient time scale to neglect the fast transient (electromagnetic) behavior of the interconnecting elements, e.g. transformers and lines.

In this work branches are modeled with the generalized $\pi$-model of Fig. 3.6, where all quantities refer to the phasors of the positive sequence in the per unit system. This model adequately describes the behavior of short transmission lines ($l \ll \lambda$), where short refers to the length of the line with respect to the wavelength $\lambda = c/ f_n$ of the carrier frequency $f_n$ of the system [36]. In the general case the admittance parameters of the two-port network are as follows.

Figure 3.6 – Generalized π model of a generic power system branch

\[ y_{ff} = \left. \frac{i_f}{v_f} \right|_{v_t=0} = \frac{y_s + y_f}{x^2} \quad (3.2a) \]

\[ y_{ft} = \left. \frac{i_f}{v_t} \right|_{v_f=0} = -\frac{y_s}{x \cdot e^{-j\theta}} \quad (3.2b) \]

\[ y_{tf} = \left. \frac{i_t}{v_f} \right|_{v_t=0} = -\frac{y_s}{x \cdot e^{j\theta}} \quad (3.2c) \]

\[ y_{tt} = \left. \frac{i_t}{v_t} \right|_{v_f=0} = y_s + y_t \quad (3.2d) \]

Where

• \( n = x \cdot e^{j\theta}[\varphi] \) is the ratio of the ideal transformer,

• \( y_s = g_s + j \cdot b_s = 1/(r_s + j \cdot x_s)[pu] \rightarrow \begin{cases} g_s = \frac{r_s}{r_s^2 + x_s^2} \\ b_s = -\frac{x_s}{r_s^2 + x_s^2} \end{cases} \) is the transversal line admittance,

• \( y_f = g_f + j \cdot b_f[pu] \) is the shunt admittance at the from side, and

• \( y_t = g_t + j \cdot b_t[pu] \) is the shunt admittance at the to side.

The degrees of freedom in the model are the following 8 real parameters.

• \( x [\varphi] \) is the tap setting of the transformer, and it quantifies the magnitude scaling of quantities on the secondary winding side.

• \( \theta [rad] \) is the phase shift introduced by the transformer to quantities on the secondary winding side.

• \( g_f \) [pu] is the shunt conductance at the from side.

• \( b_f \) [pu] is the shunt susceptance at the from side.

• \( g_t \) [pu] is the shunt conductance at the to side.

• \( b_t \) [pu] is the shunt susceptance at the to side.

• \( r_s \) [pu] is the series resistance of the line.

• \( x_s \) [pu] is the series reactance of the line.

In the above the pu units refer to the corresponding physical unit translated in the pu system, e.g. Ohms for impedances, etc.
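The mapping from the eight real branch parameters to the four complex admittance parameters of (3.2) can be written down directly. The function name `branch_admittances` and the example line data are hypothetical; the formulas are those of (3.2a)-(3.2d), with $y_{tf}$ taken as the $e^{j\theta}$ counterpart of $y_{ft}$.

```python
import cmath

def branch_admittances(x, theta, g_f, b_f, g_t, b_t, r_s, x_s):
    """Two-port admittance parameters of the generalized pi-model, eq. (3.2)."""
    y_s = 1.0 / complex(r_s, x_s)   # series admittance of the line
    y_f = complex(g_f, b_f)         # shunt admittance at the from side
    y_t = complex(g_t, b_t)         # shunt admittance at the to side
    y_ff = (y_s + y_f) / x**2
    y_ft = -y_s / (x * cmath.exp(-1j * theta))
    y_tf = -y_s / (x * cmath.exp(1j * theta))
    y_tt = y_s + y_t
    return y_ff, y_ft, y_tf, y_tt

# Hypothetical lossless line without a transformer (x = 1, theta = 0).
y_ff, y_ft, y_tf, y_tt = branch_admittances(
    x=1.0, theta=0.0, g_f=0.0, b_f=0.02, g_t=0.0, b_t=0.02, r_s=0.0, x_s=0.1)
```

For a branch without phase shift the two transfer admittances coincide, so the two-port is reciprocal; a non-zero $\theta$ breaks this symmetry.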

Apart from branches, which connect two different nodes, it is also very common to have elements that are shunt to buses, i.e. that connect a bus to ground. For these, a one-port network representation is used. The admittance parameter of this network simply relates the voltage of the node to the current that is injected through the shunt element to ground.

\[
i_n = y_{sh} \cdot v_n \tag{3.3}\]

A power system topology is usually provided as a list of all its constituent elements, branches and shunt elements. Branches are provided in lists having values for all the parameters of the models that are involved. Connectivity is usually provided in the form of edge lists, e.g. branch \( k \) connecting from-bus \( f \) to to-bus \( t \). Let \( N = (U, E) \) be the graph of the power system with a bus set \( U \), \(|U| = n\), and a branch set \( E \), \(|E| = m\). Then the from- and the to-incidence matrices of the graph \( N \) can be defined.

\[
(m \times n)\; C_f \text{ with } c_{f,ji} = \begin{cases} 
1, & \text{if branch } j \text{ is from-incident to bus } i \\
0, & \text{if branch } j \text{ is not from-incident to bus } i 
\end{cases} \tag{3.4}
\]

\[
(m \times n)\; C_t \text{ with } c_{t,ji} = \begin{cases} 
1, & \text{if branch } j \text{ is to-incident to bus } i \\
0, & \text{if branch } j \text{ is not to-incident to bus } i 
\end{cases} \tag{3.5}
\]

The complex two-port admittance parameters of (3.2) can be collected in four \((m \times 1)\) vectors \( Y_{ff}, Y_{ft}, Y_{tf} \) and \( Y_{tt} \) and then the contribution of branches to the overall admittance matrix can be formalized as follows.

\[
(n \times n) Y_{br} = C_f^T \cdot \text{diag}(Y_{ff}) \cdot C_f + C_f^T \cdot \text{diag}(Y_{ft}) \cdot C_t + C_t^T \cdot \text{diag}(Y_{tf}) \cdot C_f + C_t^T \cdot \text{diag}(Y_{tt}) \cdot C_t \tag{3.6}
\]

Figure 3.7 – The complex two-port network for a branch connecting buses $f$ and $t$ and the effect it has on the building of the $Y$ matrix

Similarly if there are $m_s$ shunt elements in the system the shunt incidence matrix can be defined.

$$(m_s \times n)\; C_s \text{ with } c_{s,ji} = \begin{cases} 1, & \text{if shunt element } j \text{ is incident to bus } i \\ 0, & \text{otherwise} \end{cases}$$ (3.7)

The complex admittance parameters can be collected to a $(m_s \times 1)$ vector $Y_{sh}$ and the contribution of shunt elements to the overall admittance matrix can be formalized as follows.

$$(n \times n) Y_{sh} = C_s^T \cdot \text{diag}(Y_{sh}) \cdot C_s$$ (3.8)

The final complete admittance matrix of the grid is the addition of the effects of branches and bus shunt elements.

$$Y = Y_{br} + Y_{sh}$$ (3.9)

The effect that branch $k$ has on the building of the $Y$ matrix according to (3.6) is shown in Fig. 3.7, and the effect of shunt elements according to (3.8) is shown in Fig. 3.8.
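The assembly of the admittance matrix from the incidence matrices, following (3.6), (3.8) and (3.9), can be sketched with dense NumPy arrays. The function names, the 3-bus topology and the line data below are hypothetical; a production implementation would normally use sparse matrices, which is a design choice left out here for brevity.

```python
import numpy as np

def incidence(bus_of_element, n):
    """Incidence matrix with one row per element and one column per bus."""
    C = np.zeros((len(bus_of_element), n))
    C[np.arange(len(bus_of_element)), bus_of_element] = 1.0
    return C

def build_admittance_matrix(n, from_bus, to_bus, Yff, Yft, Ytf, Ytt,
                            shunt_bus, Ysh):
    """Y = Y_br + Y_sh, following eqs. (3.6), (3.8) and (3.9)."""
    Cf, Ct = incidence(from_bus, n), incidence(to_bus, n)
    Cs = incidence(shunt_bus, n)
    Y_br = (Cf.T @ np.diag(Yff) @ Cf + Cf.T @ np.diag(Yft) @ Ct
            + Ct.T @ np.diag(Ytf) @ Cf + Ct.T @ np.diag(Ytt) @ Ct)
    Y_shunt = Cs.T @ np.diag(Ysh) @ Cs
    return Y_br + Y_shunt

# Hypothetical 3-bus system: lossless lines 0-1, 1-2 and 0-2 (0-based buses),
# no transformers, no branch shunts, one shunt element at bus 2.
from_bus, to_bus = [0, 1, 0], [1, 2, 2]
y_s = 1.0 / (1j * np.array([0.1, 0.2, 0.25]))
Y = build_admittance_matrix(3, from_bus, to_bus,
                            Yff=y_s, Yft=-y_s, Ytf=-y_s, Ytt=y_s,
                            shunt_bus=[2], Ysh=np.array([0.05j]))
```

Each branch touches exactly four entries of $Y$ (two diagonal, two off-diagonal) and each shunt element exactly one diagonal entry, which is the pattern that the figures referenced above depict.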

For a power system of $n$ buses, the $n \times n$ complex nodal equations that link nodal current injections $I$ to bus voltages $V$ can be decomposed in real parts as follows.

$$I = Y \cdot V \rightarrow \begin{cases} \text{Re}(I) = G \cdot \text{Re}(V) - B \cdot \text{Im}(V) \\ \text{Im}(I) = G \cdot \text{Im}(V) + B \cdot \text{Re}(V) \end{cases}$$ (3.10)

Where $Y = G + j \cdot B$ is the admittance matrix of the topology. In high-voltage transmission lines the assumption that $x_s \gg r_s$ is often made (Ch. 5 in [289]). In the extreme case $r_s/x_s \to 0$. Substituting this into the transversal line admittance parameters of branches yields the following.

$$g_s = \frac{r_s}{r_s^2 + x_s^2} = \frac{(r_s/x_s)}{x_s(r_s/x_s)^2 + x_s} \to 0 \quad \text{as } r_s/x_s \to 0$$ (3.11a)

$$b_s = -\frac{x_s}{r_s^2 + x_s^2} = -\frac{1}{x_s} \cdot \frac{1}{(r_s/x_s)^2 + 1} \to -\frac{1}{x_s} \quad \text{as } r_s/x_s \to 0$$ (3.11b)

An additional simplifying assumption is that no transformers exist in the model, i.e. $x = 1$ and $\theta = 0$ for all branches. Lastly, the conductive part of all shunts (shunt admittances at the from and to side of branches as well as proper shunt elements) is neglected, i.e. $g_f = 0$ and $g_t = 0$ for all branches, and $g_{sh} = 0$ for all shunt elements. This is a reasonable assumption, since shunt conductance means a galvanic connection between a point in the electric grid and the ground, i.e. a permanent “fault”. With the simplifying assumptions above, $G_{ps} = 0$ and two separate $n \times n$ linear systems are required to model the relations between $I$ and $V$.

$$\begin{bmatrix}
-\text{Im}[I] \\
\text{Re}[I]
\end{bmatrix} = 
\begin{bmatrix}
-B & 0 \\
0 & -B
\end{bmatrix} \cdot 
\begin{bmatrix}
\text{Re}[V] \\
\text{Im}[V]
\end{bmatrix}$$ (3.12)

The sign of the first subset of equations is changed so that the resulting $-B$ has positive diagonal entries and negative off-diagonal entries. This is explained hereunder.

For the transversal reactance of (3.11) the following holds.

$$x_s = x_L - x_C = \omega \cdot L - \frac{1}{\omega \cdot C}$$ (3.13)

Figure 3.9 – Power system topological mapping into an electronic resistor network equivalent.

Where \( x_L \) is the inductive reactance of the line, \( x_C \) is the capacitive reactance, \( L \) is the inductance, \( C \) is the capacitance of the line and \( \omega \) is the angular frequency. For normal power system transmission lines the reactance is inductive, hence the susceptance is negative\(^1\).

\[
\frac{x_L}{x_C} > 1 \implies x_s > 0 \implies b_s < 0 \tag{3.14}
\]

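Hence the diagonal elements of \( B \) are negative and its off-diagonal elements are positive. Inverting the sign of \( B \) therefore yields a matrix \( -B \) with positive diagonal elements and negative off-diagonal elements, as stated for (3.12).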
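The sign structure of \( -B \) can be checked numerically. The following is a minimal sketch, assuming a hypothetical 3-bus lossless network whose branch list and reactance values are chosen purely for illustration; the function name `susceptance_matrix` is not part of the platform.

```python
import numpy as np

# Hypothetical 3-bus lossless network: (from bus, to bus, series reactance x_s in pu).
branches = [(0, 1, 0.10), (1, 2, 0.20), (0, 2, 0.25)]


def susceptance_matrix(n_buses, branches):
    """Assemble B for r_s -> 0, no transformers and no shunts, using b_s = -1/x_s."""
    B = np.zeros((n_buses, n_buses))
    for f, t, x_s in branches:
        b_s = -1.0 / x_s          # negative for an inductive line
        B[f, f] += b_s            # diagonal: sum of incident series susceptances
        B[t, t] += b_s
        B[f, t] -= b_s            # off-diagonal: minus the series susceptance
        B[t, f] -= b_s
    return B


B = susceptance_matrix(3, branches)
minus_B = -B
off_diagonal = minus_B[~np.eye(3, dtype=bool)]

print(np.diag(minus_B))    # every entry is positive
print(off_diagonal)        # every entry is negative
```

Under these assumptions \( -B \) has the same sign pattern as the conductance matrix of a resistor network, which is what makes the mapping onto resistors possible.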

**Creation of the analog equivalent**

Each power system branch is associated with one electrical branch, and a resistor network is created that features topological similarity with the \( B \) matrix. The process is called *topological mapping* and is illustrated in Fig. 3.9.

An electrical branch of the existing prototype is shown in Fig. 3.10. Similar to power system branches, there is a “from” and a “to” side, denoted with an \( f \) and a \( t \) in the figure. The effective resistance of the branch is set by two digitally reprogrammable potentiometers, \( pot_f \) and \( pot_t \). Both potentiometers have a resolution of 8 bits and a range of 10 kΩ. Their nominal resistance value is given by

\[
r_{pot} = \frac{t}{2^M} \cdot R \tag{3.15}
\]

Where \( t \in [0, 2^M) \) is the digital tap setting, \( M = 8 \) is the bit resolution and \( R = 10 \, \text{k}\Omega \) is the full-scale resistance.

\(^1\)This might not be the case for lines where series capacitive compensation is present, but this (exceptional) case is not considered in this work.

The design also includes a transversal short-circuiting switch \( sw_s \) and a switch that can be used to emulate a galvanic connection in the middle of the line. The exact point where the faulty connection is located is determined by the ratio of the resistances of \( pot_f \) and \( pot_t \). In the following equation \( loc \) is the emulated fault position as a percentage of the line length, measured from the “from” side, e.g. \( loc = 0\% \) corresponds to a fault at the \( f \)-node.

\[
loc = \frac{pot_f}{pot_f + pot_t} \cdot 100\,\% \tag{3.16}
\]
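As an illustration of (3.15) and (3.16), the following minimal sketch computes the nominal potentiometer resistance and the emulated fault position from two tap settings. The function names and the example tap values are assumptions made for this sketch.

```python
M = 8            # bit resolution of the potentiometers
R = 10_000.0     # nominal full-scale resistance in ohm


def pot_resistance(tap, m=M, r_full=R):
    """Nominal resistance of (3.15) for an integer tap in [0, 2^M)."""
    if not 0 <= tap < 2 ** m:
        raise ValueError("tap out of range")
    return tap / 2 ** m * r_full


def fault_location_percent(tap_f, tap_t):
    """Emulated fault position of (3.16), in percent of line length from the 'from' side."""
    r_f = pot_resistance(tap_f)
    r_t = pot_resistance(tap_t)
    return 100.0 * r_f / (r_f + r_t)


print(pot_resistance(128))               # 5000.0
print(fault_location_percent(64, 192))   # 25.0
```

In this example the two potentiometers add up to the full-scale value, and the fault is emulated at a quarter of the line length from the \( f \)-node.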

Notice that there are two duplicate networks of electrical branches (as shown in Fig. 3.4). All connections and values (\( pot_f, pot_t, \) etc.) are the same for corresponding branches in the two networks. This is necessary since two equivalents of \( -B \) are required in (3.12). The first network is sometimes called “real”, since it models the relation between the imaginary current and the real voltage, and the second is called “imaginary”, since it models the relation between the real current and the imaginary voltage in (3.12).

In the current implementation a bank of electrical branches exists, the connectivity pattern of which can be modified at will (with some limitations) by a set of electronic programmable switches. The topology of the available electrical branch resources is shown in Fig. 3.11. The constrained topology of the electrical branch bank limits the topologies that can be mapped onto the FPPNS.

Each slice of the FPPNS has a capacity of 24 nodes. Each of the resistors in the figure schematically depicts an electrical branch such as the one of Fig. 3.10. At the end of each electrical branch there are digitally controlled electrical switches that can disconnect the branch. The implementation is based on discrete electronics on a custom-made PCB.

The PCB has analog extension connections that allow it to be vertically connected.

Once the topological mapping of the branches is decided, the mapping of power system buses to electrical nodes is also automatically induced. In every electrical node, currents and/or voltages can be written to and read from the resistor network using dedicated circuitry. In the realized prototype, current injection, voltage injection and voltage reading are supported for each node. The schematic electronic interface of a node of the resistor network in the current implementation is shown in Fig. 3.13. All electrical nodes possess individual ADC/DAC units, therefore all injections and measurements can be done fully in parallel.

In the current implementation a 12-bit DAC has been used. When the DAC is operated in voltage injection mode its output is as follows.

\[
V_{DAC} = \frac{t}{2^M} \cdot V_{FS} - V_{off} \tag{3.17}
\]

Where \( t \in [0, 2^M) \) is the digital tap setting, \( M = 12 \) is the bit resolution and \( V_{FS} = 5\,V \) is the full-scale voltage of the DAC. \( V_{off} = 2.5\,V \) is an offset voltage that allows the final output of the DAC to assume both positive and negative values in \( [-2.5, +2.5)\,V \).

When the DAC is operated in current injection mode, the instrumentation amplifier (shown schematically as IA in the figure) is connected so that a $V_{DAC}$ voltage drop is applied across a conversion resistor $R_{conv}$. The resulting current output is

$$I_{DAC} = \frac{V_{DAC}}{R_{conv}}$$ (3.18)

The latter is a function of the voltage DAC tap setting $t$ of (3.17). The exact setting of $R_{conv}$ depends on the desired full-scale output current $(V_{FS} - V_{off})/R_{conv}$.

Voltage measurement functionality is provided through the A/D block of Fig. 3.13. The following model describes its functioning.

$$D^R = \frac{V_{in}}{V_{FS}} \cdot 2^M$$ (3.19a)

$$D = \begin{cases} 0 & \text{if } D^R \leq 0 \\ 2^M - 1 & \text{if } D^R \geq 2^M - 1 \\ \lfloor D^R + 0.5 \rfloor & \text{otherwise} \end{cases}$$ (3.19b)

$$V_{ADC} = \frac{D \cdot V_{FS}}{2^M}$$ (3.19c)

The analog input to the ADC is $V_{in}$ and the output is the digital word $D$ of length $M$. $D^R$ would be the output word if it were allowed to assume non-integer values; it is an intermediate result in (3.19a) that facilitates the analysis. $D^R$ is quantized and saturated to $D$ in (3.19b). $V_{FS}$ determines the range of the ADC, since it is the full-scale quantity that can be accepted at its input, $V_{in} \in [0, V_{FS})$. Finally, $V_{ADC}$ is the value that is actually "read" by the ADC and that is communicated to the digital parts of the system.
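The converter models of (3.17) to (3.19) can be sketched in a few lines. The sketch below is illustrative only: the function names are hypothetical, the ADC word length is left as a parameter because it is not stated above (the default of 12 bits is an assumption), and any value passed as `r_conv` is an assumed example.

```python
import math

V_FS = 5.0      # full-scale voltage in volt
V_OFF = 2.5     # DAC offset voltage in volt


def dac_voltage(tap, m=12):
    """Voltage-mode DAC output of (3.17) for an integer tap in [0, 2^M)."""
    return tap / 2 ** m * V_FS - V_OFF


def dac_current(tap, r_conv, m=12):
    """Current-mode output of (3.18): V_DAC applied across R_conv (in ohm)."""
    return dac_voltage(tap, m) / r_conv


def adc_read(v_in, m=12):
    """ADC model of (3.19): scale, saturate or round to the nearest code, convert back."""
    d_real = v_in / V_FS * 2 ** m            # (3.19a)
    if d_real <= 0:                          # (3.19b), lower saturation
        code = 0
    elif d_real >= 2 ** m - 1:               # (3.19b), upper saturation
        code = 2 ** m - 1
    else:                                    # (3.19b), round half up
        code = math.floor(d_real + 0.5)
    return code, code * V_FS / 2 ** m        # (3.19c)


print(dac_voltage(2048))    # 0.0, i.e. mid-scale
print(adc_read(1.0))        # code 819 and the corresponding read-back voltage
print(adc_read(6.0)[0])     # 4095, saturated at the top code
```

The read-back voltage differs from the input by at most half a quantization step as long as the input stays within the range of the converter.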

Once the topological mapping is complete, the value mapping process takes place. Admittance, current and voltage values are scaled from one domain to the other via multiplicative mapping ratios $\rho_Y$, $\rho_I$ and $\rho_V$ with units of $\Omega^{-1}/\text{pu}$, $A/\text{pu}$ and $V/\text{pu}$ respectively.

\begin{align}
g_{el} &= \rho_Y \cdot (-b_{ps}) \tag{3.20a} \\
i_{el} &= \rho_I \cdot i_{ps} \tag{3.20b} \\
v_{el} &= \rho_V \cdot v_{ps} \tag{3.20c}
\end{align}

In the above, the $el$ and $ps$ subscripts are added to distinguish quantities of the electronic domain from those of the power system domain. Due to Ohm’s law the three ratios are related, so only two of them can be chosen freely.

\begin{equation}
\rho_I = \rho_Y \cdot \rho_V \tag{3.21}
\end{equation}

The mapping ratio for impedances is simply the inverse of the admittance ratio.

\begin{equation}
\rho_Z = \frac{1}{\rho_Y} \tag{3.22}
\end{equation}

The result after the topological and value mappings is an electronic system that is equivalent to the power system topology.

\begin{equation}
I_{el} = G_{el} \cdot V_{el} \tag{3.23}
\end{equation}

The current injection vector $I_{el}$ contains elements as per (3.18), “known” voltage values in $V_{el}$ are injected into the grid by the mechanism described in (3.17), and “unknown” voltage values in $V_{el}$ are measured from the grid as per (3.19).
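The value mapping of (3.20) to (3.22) can be illustrated with a short sketch. All numerical values below (the two chosen ratios and the branch susceptance) are hypothetical and serve only to show how the remaining ratios and the electrical branch values follow.

```python
import math


def mapping_ratios(rho_Y, rho_V):
    """Return (rho_Y, rho_I, rho_V, rho_Z) using (3.21) and (3.22)."""
    rho_I = rho_Y * rho_V
    rho_Z = 1.0 / rho_Y
    return rho_Y, rho_I, rho_V, rho_Z


def map_branch(b_ps, rho_Y):
    """Conductance (S) and resistance (ohm) of the electrical branch, per (3.20a)."""
    g_el = rho_Y * (-b_ps)
    return g_el, 1.0 / g_el


# Assumed example: rho_Y = 2e-5 S/pu and rho_V = 2 V/pu are free choices.
rho_Y, rho_I, rho_V, rho_Z = mapping_ratios(rho_Y=2e-5, rho_V=2.0)
g_el, r_el = map_branch(b_ps=-10.0, rho_Y=rho_Y)

# Consistency with Ohm's law for a single branch: i_ps = -b_ps * v_ps.
v_ps = 0.05
i_ps = 10.0 * v_ps
assert math.isclose(g_el * (rho_V * v_ps), rho_I * i_ps)

print(rho_I, rho_Z, r_el)
```

With these assumed ratios, a branch with a series susceptance of -10 pu maps to a resistance of about 5 kΩ, i.e. within the range of the potentiometers.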

**Mathematical operations**

Once the analog of (3.23) has been created, the resistor network of the FPPNS can be seen as a causal system. For a grid of $n$ electrical nodes, there are $n$ node voltages and $n$ current injections. At each node, either a voltage or a current injection can be performed. Nodes with voltage injections have unknown current injections and, vice versa, nodes with current injections have unknown voltages. Unknown electrical quantities are induced by the physics of the system.

When only voltages are injected, \( V_{el} \) is known in (3.23), \( I_{el} \) is the unknown vector and the operation performed by the FPPNS corresponds to a matrix-vector multiplication. When only currents are injected, \( I_{el} \) is known (rhs vector), \( V_{el} \) is the unknown vector, and the operation performed corresponds to the solution of a linear system.

An alternative operation is to inject currents to some of the nodes and voltages to the others. Without loss of generality, let \( \mathcal{N} \) be the set of all electrical nodes, with cardinality \( |\mathcal{N}| = n \), and let \( \mathcal{I}_n = \{1, \ldots, n\} \) be the set of bus indexes. Let \( \mathcal{I}_I \subseteq \mathcal{I}_n \) be the non-empty index set of the buses whose current is unknown, i.e. the buses at which voltage is injected. The complementary index set \( \mathcal{I}_V = \mathcal{I}_n \setminus \mathcal{I}_I \) then contains the buses whose voltage is unknown, i.e. the buses at which current is injected. Borrowing the notation from [290], for a matrix \( A \) and any two index sets \( \mathcal{I}_{s1} \) and \( \mathcal{I}_{s2} \), let \( A[\mathcal{I}_{s1}, \mathcal{I}_{s2}] \) denote the submatrix of \( A \) formed by the rows indexed by \( \mathcal{I}_{s1} \) and the columns indexed by \( \mathcal{I}_{s2} \). Similarly, for a one-dimensional vector \( u \), let \( u[\mathcal{I}_{s1}] \) denote the subvector of \( u \) formed by the elements indexed by \( \mathcal{I}_{s1} \). Then mixed current-voltage injection can be described in matrix notation as follows.

\[
\begin{bmatrix}
I_{el}[\mathcal{I}_I] \\
I_{el}[\mathcal{I}_V]
\end{bmatrix}
= \begin{bmatrix}
G_{el}[\mathcal{I}_I, \mathcal{I}_I] & G_{el}[\mathcal{I}_I, \mathcal{I}_V] \\
G_{el}[\mathcal{I}_V, \mathcal{I}_I] & G_{el}[\mathcal{I}_V, \mathcal{I}_V]
\end{bmatrix}
\begin{bmatrix}
V_{el}[\mathcal{I}_I] \\
V_{el}[\mathcal{I}_V]
\end{bmatrix}
\tag{3.24}
\]

Where \( I_{el}[\mathcal{I}_I] \) and \( V_{el}[\mathcal{I}_V] \) are the unknowns. The above can be reformulated as a linear system problem.

\[
\begin{bmatrix}
I_{el}[\mathcal{I}_I] \\
V_{el}[\mathcal{I}_V]
\end{bmatrix}
= \begin{bmatrix}
G_{el}/G_{el}[\mathcal{I}_V, \mathcal{I}_V] & G_{el}[\mathcal{I}_I, \mathcal{I}_V] \cdot G_{el}[\mathcal{I}_V, \mathcal{I}_V]^{-1} \\
-G_{el}[\mathcal{I}_V, \mathcal{I}_V]^{-1} \cdot G_{el}[\mathcal{I}_V, \mathcal{I}_I] & G_{el}[\mathcal{I}_V, \mathcal{I}_V]^{-1}
\end{bmatrix}
\begin{bmatrix}
V_{el}[\mathcal{I}_I] \\
I_{el}[\mathcal{I}_V]
\end{bmatrix}
\tag{3.25}
\]

Where \( G_{el}/G_{el}[\mathcal{I}_V, \mathcal{I}_V] \) is the Schur complement of the block \( G_{el}[\mathcal{I}_V, \mathcal{I}_V] \) in \( G_{el} \), defined as follows.

\[
G_{el}/G_{el}[\mathcal{I}_V, \mathcal{I}_V] = G_{el}[\mathcal{I}_I, \mathcal{I}_I] - G_{el}[\mathcal{I}_I, \mathcal{I}_V] \cdot G_{el}[\mathcal{I}_V, \mathcal{I}_V]^{-1} \cdot G_{el}[\mathcal{I}_V, \mathcal{I}_I]
\tag{3.26}
\]
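The mixed injection operation can be reproduced in software as a cross-check. The sketch below assumes a hypothetical 3-node conductance matrix and injection values; `idx_I` holds the nodes with known voltage and unknown current, and `idx_V` the nodes with known current and unknown voltage, matching the unknowns \( I_{el}[\mathcal{I}_I] \) and \( V_{el}[\mathcal{I}_V] \) of (3.24). The function name `mixed_injection` is an assumption of this sketch.

```python
import numpy as np


def mixed_injection(G, idx_I, idx_V, v_known, i_known):
    """Return (I[idx_I], V[idx_V]) given V[idx_I] = v_known and I[idx_V] = i_known."""
    G_II = G[np.ix_(idx_I, idx_I)]
    G_IV = G[np.ix_(idx_I, idx_V)]
    G_VI = G[np.ix_(idx_V, idx_I)]
    G_VV = G[np.ix_(idx_V, idx_V)]
    # Second block row of (3.24): G_VV * V_V = I_V - G_VI * V_I
    v_unknown = np.linalg.solve(G_VV, i_known - G_VI @ v_known)
    # First block row of (3.24) gives the unknown currents.
    i_unknown = G_II @ v_known + G_IV @ v_unknown
    return i_unknown, v_unknown


# Hypothetical 3-node conductance matrix in siemens.
G = 1e-4 * np.array([[14.0, -10.0, -4.0],
                     [-10.0, 15.0, -5.0],
                     [-4.0, -5.0, 9.0]])
idx_I, idx_V = [0], [1, 2]
v_known = np.array([1.0])            # volt, injected at node 0
i_known = np.array([-2e-4, 1e-4])    # ampere, injected at nodes 1 and 2

i_unknown, v_unknown = mixed_injection(G, idx_I, idx_V, v_known, i_known)

# Same currents via the Schur complement of (3.26).
G_VV = G[np.ix_(idx_V, idx_V)]
schur = G[np.ix_(idx_I, idx_I)] - G[np.ix_(idx_I, idx_V)] @ np.linalg.solve(
    G_VV, G[np.ix_(idx_V, idx_I)])
i_check = schur @ v_known + G[np.ix_(idx_I, idx_V)] @ np.linalg.solve(G_VV, i_known)

print(i_unknown, v_unknown)
```

Solving with `np.linalg.solve` avoids forming the inverse of \( G_{el}[\mathcal{I}_V, \mathcal{I}_V] \) explicitly, which is a numerical design choice of this sketch and not a statement about how the analog hardware operates.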

### 3.1.2 Digital part

The digital part of the platform is realized on reconfigurable hardware. An Altera Cyclone III FPGA has been used, the configuration of which is shown in Fig. 3.14. The Avalon interconnection bus is the backbone of the synthesized system. A softcore NIOS II CPU has been instantiated to orchestrate the computation pipelines as well as to perform all auxiliary functions, i.e. initialization, communication with a PC, etc.

The main purpose of the digital part is to solve the equations that govern the behavior of the elements connected to the grid: generators, loads, etc. This is in sharp contrast to other comparable work that uses analog models for the components connected to the grid (generators, loads, etc.) [273]. In our design all related computations are done in the digital domain, and hence practically unlimited flexibility is offered.

At any time, the behavior of any power system element can be fully defined as a set of DAEs that depend on the complex voltage and the complex current at the bus to which the element is connected, as shown in Fig. 3.5.

\[
S_i = \begin{cases}
    \dot{x}_i = f_i(x_i, y_i, \lambda_i) \\
    0 = g_i(x_i, y_i, \lambda_i)
\end{cases}
\tag{3.27}
\]

Any other internal algebraic and dynamic variable that might concern injection \( i \) is decoupled from the solution of other injections hence this part of the computations is amenable to heavy parallelization. In the current implementation, a tradeoff between computational optimization, flexibility, and resource usage is achieved by adopting a pipelined computation scheme [4].

The set of injections \( E \) is partitioned into \( K \) equivalence classes \( E_k \) of structurally similar equations, i.e. equations that differ only in a set of parameters. Parallelization along this partitioning scheme is suggested: one pipelined computational module \( C_k \) exists per equivalence class \( E_k \) and processes the injections of that class. The numerical integration of the DAE set of each injector is performed in the pipelines. At each pipeline clock cycle, the parameters of the equations to be solved on \( C_k \) are iterated across the injections that belong to the class \( E_k \).

In the current implementation two explicit algorithms have been used: the Forward Euler (FE) method (1st order) and the 2-step Adams-Bashforth (AB2) method (2nd order). FE is a single-step method, and AB2 is a multistep method.

\[ \text{FE: } x_{k+1} = x_k + h \cdot f(x_k) \tag{3.28} \]
\[ \text{AB2: } x_{k+1} = x_k + h \cdot \left[ \frac{3}{2} f(x_k) - \frac{1}{2} f(x_{k-1}) \right] \tag{3.29} \]

The stability region of the two methods is shown in Fig. 3.15, in red for FE and in blue for AB2. This shows that AB2 has a smaller region of stability. However, when stable, it is expected to perform better than FE since it is a second order method, and thus limits error propagation between steps. The latter can be of great importance since the implementation of the algorithm resides in a low-accuracy environment (see more in section 3.3).
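The trade-off between the two schemes can be reproduced on a scalar test equation. The sketch below is a floating-point illustration only, not the fixed-point pipeline; the test problems \( \dot{x} = -x \) and \( \dot{x} = -15x \) and the step size are assumptions chosen so that \( h\lambda \) falls inside both stability regions in the first case and inside that of FE only in the second. Starting AB2 with one FE step is also an assumption of this sketch.

```python
import math


def integrate(f, x0, h, n_steps, method="FE"):
    """Integrate dx/dt = f(x) with FE (3.28) or AB2 (3.29); returns the list of states."""
    xs = [x0]
    f_prev = None
    for _ in range(n_steps):
        f_now = f(xs[-1])
        if method == "FE" or f_prev is None:
            # AB2 needs two derivative samples, so its first step falls back to FE.
            x_next = xs[-1] + h * f_now
        else:
            x_next = xs[-1] + h * (1.5 * f_now - 0.5 * f_prev)
        f_prev = f_now
        xs.append(x_next)
    return xs


def slow(x):
    return -x           # h * lambda = -0.1, inside both stability regions


def fast(x):
    return -15.0 * x    # h * lambda = -1.5, stable for FE only


h = 0.1
exact = math.exp(-1.0)                                   # slow problem at t = 1
err_fe = abs(integrate(slow, 1.0, h, 10, "FE")[-1] - exact)
err_ab2 = abs(integrate(slow, 1.0, h, 10, "AB2")[-1] - exact)
fe_fast = integrate(fast, 1.0, h, 50, "FE")[-1]
ab2_fast = integrate(fast, 1.0, h, 50, "AB2")[-1]

print(err_fe, err_ab2)      # the AB2 error is the smaller one
print(fe_fast, ab2_fast)    # FE decays, AB2 diverges
```

On the slow problem AB2 is markedly more accurate for the same step size, while on the fast problem it diverges although FE remains stable, in line with the stability regions described above.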

Based on the integration scheme, the dedicated computation pipelines are synthesized on the FPGA. A schematic representation of the pipelined versions of FE and AB2 is shown in Fig. 3.16. **Fixed-point arithmetic** has been used along the datapath of each pipeline. The characteristic of fixed-point arithmetic is the fixed number of fractional bits in the representation of a real number. There are two benefits from adopting it in a datapath.

- faster execution of arithmetic operations, and
- lower silicon footprint of arithmetic modules, i.e. resource utilization in the FPGA.

However, this comes at the cost of accuracy loss, as explained in section 3.3.
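The accuracy cost of a fixed number of fractional bits can be illustrated with a small software model. The word length and the number of fractional bits used below (16 and 8) are hypothetical and do not describe the actual datapath widths; the helper names are likewise assumptions of this sketch.

```python
import math


def to_fixed(value, frac_bits, word_bits):
    """Quantize a real number to a signed fixed-point integer (round half up, saturate)."""
    scaled = math.floor(value * (1 << frac_bits) + 0.5)
    lowest = -(1 << (word_bits - 1))
    highest = (1 << (word_bits - 1)) - 1
    return max(lowest, min(highest, scaled))


def to_real(raw, frac_bits):
    """Interpret a fixed-point integer as a real number."""
    return raw / (1 << frac_bits)


def fixed_mul(a, b, frac_bits):
    """Multiply two fixed-point integers; the product carries 2 * frac_bits, so shift back."""
    return (a * b) >> frac_bits   # the shift truncates towards minus infinity


FRAC, WORD = 8, 16                        # assumed format, for illustration only
raw = to_fixed(0.1, FRAC, WORD)
print(raw, to_real(raw, FRAC))            # 26 0.1015625
print(to_fixed(1000.0, FRAC, WORD))       # 32767, saturated
product = fixed_mul(to_fixed(1.5, FRAC, WORD), to_fixed(2.0, FRAC, WORD), FRAC)
print(to_real(product, FRAC))             # 3.0
```

With 8 fractional bits the representation error of a single value is at most \( 2^{-9} \); such errors are introduced at every stage of a datapath and accumulate over integration steps.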

After all pipelined computations are finished, the nodal injections are updated in parallel and the analog part performs the linear algebra operation for the next step. This interaction with the grid is done using a dedicated interface to the FPPNS, seen at the bottom of Fig. 3.14. The computational pipelines operate in a DMA fashion, reading and writing data to the ADC/DAC array of the previous section using dedicated I/O pins. The analog grid driver also manages the connectivity of the FPPNS via the digitally controlled switches of the latter. Also notice that any parameter of the analog part (e.g. line, switch status) can be changed at runtime without manual intervention.

**Example**

A simple example for the above is a set of generators modeled using the classical generator model. They are all put under the same equivalence class \( E_k \), and they are all described using a similar set of DAEs (the swing equation), adapted to the different parameters of each individual generator (mechanical starting time and transient impedance). The dynamics of generators modeled by the classical generator model are given by the swing equation.

\[
\begin{aligned}
\dot{\omega} &= \frac{P_m - P_e - D \cdot (\omega - \omega_0)}{M} \quad (3.30a) \\
\dot{\delta} &= \omega - \omega_0 \quad (3.30b) \\
I &= \frac{V - (E' \angle \delta)}{r_a + j \cdot x'} \quad (3.30c) \\
P_e &= \text{Re}(V \cdot I^*) \quad (3.30d)
\end{aligned}
\]

Where \( \omega \) is the rotor speed of the generator, \( P_m \) is the mechanical power acting on the rotor of the generator, \( P_e \) is the electrical power that is supplied by the generator to the grid, \( D \) is a damping coefficient, \( \omega_0 \) is the nominal synchronous speed, \( M \) is the mechanical starting time, \( \delta \) is the machine internal angle, \( E' \) is the machine internal voltage amplitude, \( V \) is the complex terminal voltage, \( I \) is the complex current that is injected into the grid, \( r_a \) is the armature resistance and \( x' \) is the transient reactance.

It is customary for “classical” transient stability studies to model generators with the above set of equations [291]. When first-swing only instability is of concern, damping is also often neglected, and results are rather on the pessimistic side. The above can be written in the form of (3.27) as follows.

\[
x := \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \omega \\ \delta \end{bmatrix} \tag{3.31a}
\]

\[
y := \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} V \\ I \\ P_e \end{bmatrix} \tag{3.31b}
\]

\[
\lambda := \begin{bmatrix} P_m \\ D \\ \omega_0 \\ M \\ E' \\ r_a + j \cdot x' \end{bmatrix} \tag{3.31c}
\]

\[
\dot{x} := \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} \left( \lambda_1 - y_3 - \lambda_2 \cdot (x_1 - \lambda_3) \right) / \lambda_4 \\ x_1 - \lambda_3 \end{bmatrix} := \begin{bmatrix} f_1(x, y, \lambda) \\ f_2(x, y, \lambda) \end{bmatrix} \tag{3.31d}
\]

\[
0 = \begin{bmatrix} y_1 - \lambda_5 \cdot e^{j \cdot x_2} - y_2 \cdot \lambda_6 \\ \text{Re}\left\{ y_1 \cdot y_2^* \right\} - y_3 \end{bmatrix} := \begin{bmatrix} g_1(x, y, \lambda) \\ g_2(x, y, \lambda) \end{bmatrix} \tag{3.31e}
\]
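The differential part (3.31d) combined with the FE scheme of (3.28) gives the state update that a generator pipeline has to evaluate for every injection. The following floating-point sketch shows one such update; the parameter values are hypothetical per-unit numbers, and the electrical power \( y_3 = P_e \) is passed in as an input, since in the platform it results from the interaction with the grid.

```python
def swing_derivatives(x, p_e, lam):
    """f_1 and f_2 of (3.31d); x = (omega, delta), lam = (P_m, D, omega_0, M)."""
    omega, _delta = x
    p_m, damping, omega_0, m_start = lam
    d_omega = (p_m - p_e - damping * (omega - omega_0)) / m_start
    d_delta = omega - omega_0
    return d_omega, d_delta


def fe_step(x, p_e, lam, h):
    """One Forward Euler step of the swing equation."""
    d_omega, d_delta = swing_derivatives(x, p_e, lam)
    return x[0] + h * d_omega, x[1] + h * d_delta


lam = (0.8, 0.0, 1.0, 6.0)      # assumed P_m, D, omega_0, M
x0 = (1.0, 0.5)                 # assumed initial speed and angle

steady = fe_step(x0, p_e=0.8, lam=lam, h=0.01)     # P_e = P_m: state is unchanged
faulted = fe_step(x0, p_e=0.0, lam=lam, h=0.01)    # P_e = 0: the rotor accelerates
print(steady, faulted)
```

Since FE is explicit, the angle reacts to the speed change only one step later, which is visible in the unchanged angle after the first faulted step.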

The dedicated pipeline for the computations of generators modeled with the classic generator model using FE integration is shown in Fig. 3.17. The figure presents a detailed view on the fixed-point datapath. The width of the pipeline appears in green, and the time spent on each block (in FPGA clock cycles) is shown in brown.

Similar classes of equations can be derived for other power system elements connected to buses, e.g. dynamic loads, synchronous condensers, induction machines, etc. For each of the equivalence classes similar dedicated pipelines can be created, based on the integration scheme that is used. In the current implementation 8 pipelines have been created: for generators with the classical model and for constant impedance, constant current and constant power loads, each using either FE or AB2 numerical integration.

**Communication interface**

A Cypress CY7C68016 USB 2.0 controller is interfaced to the FPGA via a dual-port RAM of 4096 words of 32 bits. The configuration is shown in Fig. 3.18. Access to the shared RAM is provided to the host PC by a dedicated driver utility, as described in the next section. Threaded USB 2.0 read and write operations of up to 65.69 Mbps [292] are supported.

Figure 3.17 – Datapath of the synthesized pipeline for generators that are modeled with the classical generator model of (3.30) using the FE integration scheme of (3.28)

Figure 3.18 – The USB controller and the shared RAM that is interfaced to the FPGA

### 3.1.3 Timing

Fig. 3.19 presents a schematic diagram of the timing breakdown of the operations of the FPPNS. The total time per iteration is as follows.

\[ t_{IT} = t_{A/D} + t_P + t_{D/A} + t_{rd} \tag{3.32} \]

Where

- \( t_{A/D} = 200 \text{ns} \) is the time for the ADC array to read the values from the analog grid. The current implementation is based on AD7356 converters operated at 5 MSPS (Million Samples Per Second).

- \( t_P \) is the pipeline computing time. This value is explained hereunder.

- \( t_{D/A} = 400 \text{ns} \) is the time for the DAC array to update the analog inputs (injections) to the analog grid. The current implementation is based on AD5449 converters.

- \( t_{rd} \) is the waiting time for the analog electronics to settle, and will be explained in section 3.3.1.

Regarding \( t_P \), the time that each pipeline takes to complete is

\[ t_{P_i} = (\ell_{P_i} + N_{P_i} - 1) \cdot t_{clk} \tag{3.33} \]

Figure 3.19 – Schematic diagram of the timing break up of the operations of the FPPNS for one simulation step

Where $\ell_{P_i}$ is the length of pipeline $i$ in clock cycles (in the example of Fig. 3.17, $\ell_{P_i} = 22$), $t_{clk} = 1/f_{clk} = 8\,\text{ns}$ is the clock period (in the current implementation the digital clock frequency is $f_{clk} = 125\,\text{MHz}$) and $N_{P_i}$ is the number of injections to be processed by the pipeline. The equation becomes clearer in view of Fig. 3.20. Since all pipelines run in parallel, the time that the digital part of the computations takes to complete is the maximum time spent in any of the pipelines, $t_P = \max_i t_{P_i}$.
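The timing model of (3.32) and (3.33) can be evaluated with a short sketch. The converter times and the clock period are the ones given above; the second pipeline length, the numbers of injections and the settling time `t_rd_ns` are hypothetical values used only to exercise the formulas.

```python
T_CLK_NS = 8.0      # clock period for f_clk = 125 MHz
T_AD_NS = 200.0     # ADC array read time
T_DA_NS = 400.0     # DAC array update time


def pipeline_time_ns(length_cycles, n_injections, t_clk_ns=T_CLK_NS):
    """Completion time of one pipeline, (3.33)."""
    return (length_cycles + n_injections - 1) * t_clk_ns


def iteration_time_ns(pipelines, t_rd_ns):
    """Total time per iteration, (3.32); pipelines is a list of (length, n_injections)."""
    t_p = max(pipeline_time_ns(length, n) for length, n in pipelines)
    return T_AD_NS + t_p + T_DA_NS + t_rd_ns


# A 22-stage pipeline with 10 injections and an assumed 15-stage pipeline with 24.
pipelines = [(22, 10), (15, 24)]
print([pipeline_time_ns(length, n) for length, n in pipelines])   # [248.0, 304.0]
print(iteration_time_ns(pipelines, t_rd_ns=1000.0))               # 1904.0
```

The example shows that the number of injections assigned to a pipeline can matter more than its length: the shorter pipeline is the slower one here because it processes more injections.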

### 3.2 Software

Alongside the dedicated hardware a co-design software has been developed, named elab-tsaot$^2$. The entire application is written in C++. Compilation has been done using MinGW 4.7 32 bit. The frontend uses the Qt framework and widgets from the Qwt library. Parts of the Boost C++ Libraries have been used in the backend (Random, Timer, Chrono, Any, uBLAS, Pointer Container, Bimap). Linear algebra functionality is provided through BLAS and LAPACK from Netlib. An overview is shown in Fig. 3.21.

#### 3.2.1 Backend

The backend is the core of the application. A software model of the power system has been created (Power system model). Models can be imported from and exported to XML files that adhere to a specially tailored schema. The power system model can be fed into the steady state engine (SS engine), which is responsible for all steady-state calculations on the power system.

An interface design pattern has been followed for the SS engine, as shown in Fig. 3.22a.

---

$^2$Available on the GIT repository https://kyriakid@git.epfl.ch/repo/elab-tsaot2.git. Access is upon request to the author.

Figure 3.21 – Architecture overview of elab-tsaot

Figure 3.22 – Interface design pattern for the SS and TD engines

Implementations that want to qualify as SS engines for elab-tsaot have to respect the API defined by the abstract class ssengine. Its inputs are shown schematically in the figure: the power system to be solved and the options on which analysis to perform and how to perform it. After the analysis is complete, its results are stored in the SS results bank.

The current ssengine software implementation can perform power flow analysis on a given power system. It implements a Newton-Raphson power flow using polar coordinates [37]. Linear algebra functionality for the above is offered by the Netlib BLAS and LAPACK libraries.

Once the power flow has been solved, the power system model is updated with its steady state values and it is fed as an input to the time domain simulation engine (TD simulation engine). A similar interface pattern has been used for it, as shown in Fig. 3.22b.

The three inputs to the TD engine are the power system, the scenario to be executed, and options for the simulation. For a power system to be ready for TD analysis, the power flow solution needs to be already available. A scenario is the set of events that will take place in the time-domain simulation window. Scenarios are created in a special user interface (Scenario editor) and stored in a dedicated store (Scenario set). They can also be imported from and exported to an XML file that adheres to a dedicated schema. Options are auxiliary inputs to the TD engine, such as the time step, the duration of the simulation, the variables to store in the results, etc. They are provided to the engine by the user. The result of a run of the TD engine is a set of waveforms (TD results) which are stored in a dedicated store (TD results bank). Results can be imported from and exported to text files with a special format.

Two TD engine implementations are provided in the current prototype. The first one is the software class that uses the dedicated hardware to perform the TD analysis. At its core there is a Hardware Abstraction Layer (HAL), as shown in Fig. 3.23. The HAL respects the API of Fig. 3.22b through the inputs Power system, Scenario and Options and the output TD results; the color coding of Fig. 3.22b has been retained.

The HAL contains a model of the hardware and provides a set of functionality to bridge the software with the hardware world.

**Calibration**

The purpose of calibration is to increase the accuracy of the hardware platform. Calibration is detailed in section 3.3.4.

**Mapper**

The mapper utility creates the association between the power system topology and the analog and digital resources of the hardware. It corresponds to the topological mapping operation of section 3.1.1. It is similar to the “place and route” operation in the design flow of modern FPGAs. Mapper information can be imported from and exported to an XML file that adheres to a dedicated schema.

**Fitter**

After the mapping is complete the fitter is responsible for calculating the exact parameters of all the resources of the hardware. This involves translating values from the power system domain to the electronics domain, as described in the *value mapping* operation of section 3.1.1.

**Power system and Scenario Encode**

The next step is to encode the information of the hardware model into a format that the machine itself can understand, i.e. into an equivalent of a “binary” code. In our platform this is called the *bitstream*. The bitstream contains the binary encoding of the configuration of the entire hardware of the dedicated platform (values of potentiometers, status of switches, etc.), which is to be communicated to the platform. This operation is similar to the “assembler” operation in the design flow of modern FPGAs.

**USB Comm**

The HAL encapsulates communication to the hardware through a dedicated USB driver. A wrapper dynamic-link library (DLL) has been developed based on the Cypress CyUSB library. The drivers provide direct access to the shared memory of the USB comm module of the hardware (see Fig. 3.18). Threaded USB 2.0 read and write operations are supported. The protocol used to communicate commands and data between the two platforms is a flag-polled shared memory scheme.

**Results parser**

After an operation has finished executing on the hardware, the results are communicated back to the software application. These raw results are translated into power system quantities by the HAL. It is the result of this parsing that meets the TD results requirements of the tdengine software API.

A second TD engine implementation is a purely software based one. It implements the partitioned solution scheme of Fig. 3.5 by purely software techniques. BLAS and LAPACK provide the linear algebra functionality to solve the grid nodal equation \( I = Y \cdot V \). The FE and AB2 numerical integration algorithms of (3.28) and (3.29) are used to solve the DAE sets (3.27) of each injector. The purpose of this research-grade software engine is to provide an accuracy benchmark of the hardware-based one.

#### 3.2.2 Frontend

Fig. 3.24 shows the software design pattern followed for the frontend of elab-tsaot. The *model* is the core of the application. It maintains the state and the data represented in every view of the GUI. When changes occur, the model updates all its views. The *controller* is the interface presented to the user (buttons, tabular interfaces, menus, etc.) that allows the user to manipulate the application. Finally, the *view* is the user interface which displays the information contained in the model. Any object that needs information about the model has to be a registered view with the model. The user sees the views and manipulates the controller.

Frontend equivalents of the HAL blocks have also been created for the convenience of the user, as shown in Fig. 3.23. Calibration, mapper, fitter and communication operations have their dedicated frontends: Calibration editor, Mapper editor, Fitter editor, and Comm. editor respectively.

### 3.3 Inaccuracy

The inaccuracy sources of the hardware platform can be categorized as digital and analog.

#### 3.3.1 Analog inaccuracy

On the analog side, both the topological mapping of Fig. 3.9 and the value mapping of (3.20) are perfectly accurate in the mathematical sense. By that, we mean that there is a perfect translation of the power system quantities/topology to the electronic quantities/topology. Once the electronic quantities are realized in real hardware, however, inaccuracy has to be considered.

The main causes of inaccuracy can be categorized in points (L) that refer to the accuracy of the electrical lines, and points (N) that refer to the accuracy of the electrical nodes and their injections (currents and voltages).

(a) the quantization and saturation of the resistor values due to the finite resolution and finite range of the potentiometers (L),

(b) the inherent inaccuracy of the potentiometer (L),

(c) parasitic impedances of the lines and the configuration switches (L),
(d) the quantization and saturation of the written voltage/current values due to the finite resolution and range of the DACs (N),
(e) the quantization and saturation of the read voltage values due to the finite resolution and range of the ADCs (N),
(f) non-linearities (integral and differential) and errors (offset and gain) of the DACs and ADCs (N), and
(g) the finite waiting time for the analog electronics to settle due to RC time constant effects (N).

In what follows an effort is made to analyze and quantify the above. In this section, bold typeface is used as a notational convention for random variables.

**Line inaccuracies**

Given (3.15) for the functioning of the potentiometer the maximum attainable value is

\[ r_{pot} = \frac{2^M - 1}{2^M} \cdot R. \tag{3.34} \]

When a specific value \( r_{req} \) is requested from the potentiometer, \( t \) has to be calculated so that the resulting \( r_{pot} \) from (3.15) best approximates \( r_{req} \). This tap setting can be calculated from (3.15). However, the real full-scale value \( R \) is a priori unavailable. Instead it is known in the form of a random distribution.

\[ R \sim \mathcal{N}(\hat{R}, \sigma^2_R) \tag{3.35} \]

This distribution may be known from the manufacturing process or may be the result of measurements. Equation (3.15) is solved using the expected full-scale value.

\[ t^R = \frac{r_{req}}{\hat{R}} \cdot 2^M. \tag{3.36} \]

Notice that \( t^R \) in (3.36) is not random, but it is also not an integer, whereas (3.15) requires an integer tap setting. Due to the discrete nature of the reconfigurable element, it has to be quantized and saturated according to the limits.

\[
t = \begin{cases}
0 & \text{if } t^R \leq 0 \\
2^M - 1 & \text{if } t^R \geq 2^M - 1 \\
\lfloor t^R + 0.5 \rfloor & \text{otherwise}
\end{cases}
\tag{3.37}
\]

In the non-saturated case \( t \) is rounded half towards positive infinity. If the requested value is outside the attainable limits of the element, then the tap setting saturates accordingly, to the value closest to the one requested.
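The tap calculation of (3.36) and (3.37) is sketched below. The function names are hypothetical; the expected full-scale value and the resolution are those of the potentiometers described in section 3.1.1, and the requested resistances are arbitrary examples.

```python
import math


def tap_setting(r_req, r_hat=10_000.0, m=8):
    """Tap of (3.36)-(3.37): scale by the expected full-scale value, saturate, round half up."""
    t_real = r_req / r_hat * 2 ** m        # (3.36), generally not an integer
    if t_real <= 0:
        return 0
    if t_real >= 2 ** m - 1:
        return 2 ** m - 1
    return math.floor(t_real + 0.5)


def nominal_resistance(tap, r_hat=10_000.0, m=8):
    """Nominal output of (3.15) when the full-scale value equals its expected value."""
    return tap / 2 ** m * r_hat


for r_req in (-5.0, 3333.0, 5000.0, 12_000.0):
    tap = tap_setting(r_req)
    print(r_req, tap, nominal_resistance(tap))
```

For requests inside the attainable range the nominal output deviates from the request by at most half a tap step, i.e. about 19.5 Ω here; requests outside the range are clipped to the nearest attainable value.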

Assuming a uniformly random \( r_{req} \) within the attainable limits of (3.34), the resulting \( t^R \) from (3.36) is also uniformly random within the tap limits. Accordingly, the \( t \) calculated from (3.37) is also random: it is always the integer closest to \( t^R \).

\[
t = t^R + q_t \tag{3.38}
\]

Where \( q_t \) is the effect of quantization on the tap setting. According to the above it is safe to assume that \( q_t \sim \mathcal{U} (-1/2, +1/2) \), i.e. the quantization error is at most half a tap step. The mean and variance of \( q_t \) follow from the uniform distribution.

\[
E[q_t] = 0 \tag{3.39a}
\]
\[
var[q_t] = 1/12 \tag{3.39b}
\]

In turn the mean and variance of \( t \) are

\[
E[t] = t^R, \tag{3.40a}
\]
\[
var[t] = 1/12. \tag{3.40b}
\]

Substituting (3.38) into (3.15) yields the final (random) output of the reconfigurable element.

\[
r_{\text{pot}} = \frac{r_{\text{req}}}{\hat{R}} \cdot R + \frac{1}{2^M} \cdot R \cdot q_t \tag{3.41}
\]

The mean and variance of (3.41) are as follows.

\[
E[r_{\text{pot}}] = r_{\text{req}} \tag{3.42a}
\]

\[
\text{var}[r_{\text{pot}}] = \underbrace{\left(\frac{r_{\text{req}}}{\hat{R}}\right)^2 \cdot \sigma_R^2}_{\text{due to analog inaccuracies}} + \underbrace{\frac{1}{12} \cdot \left(\frac{\hat{R}}{2^M}\right)^2}_{\text{due to quantization}} \tag{3.42b}
\]

From (3.42a) it becomes clear that, in the way the reconfigurable potentiometer is operated, its output is on average equal to the desired value \(r_{\text{req}}\). In (3.42b) the two sources of the inaccuracy are highlighted.

An error variable can be defined as \(\epsilon_R = r_{\text{pot}} - r_{\text{req}}\). For this error variable the following holds.

\[
E[\epsilon_R] = 0 \tag{3.43a}
\]

\[
\text{var}[\epsilon_R] = \text{var}[r_{\text{pot}}] \tag{3.43b}
\]

The above analysis covers points (a) and (b) from the list in the beginning of this subsection. In the current implementation the expected full-scale resistance is \(\hat{R} = 10000\ \Omega\), the bit resolution is \(M = 8\) bits and the variance of the full-scale resistance is \(\sigma_R^2 = (666.7\ \Omega)^2\). The last value is inferred from the nominal tolerance, which the manufacturer of the potentiometers provides as a \(\pm 3\sigma\) value in percent. For our potentiometers this is 20\%, i.e. \(3 \sigma_R = 2000\ \Omega\). Then (3.43b) becomes

\[
\text{var}[\epsilon_R] = 4.4 \cdot 10^{-3} \cdot r_{\text{req}}^2 + 127.15 \quad [\Omega^2]. \tag{3.44}
\]
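The two coefficients of (3.44) can be reproduced directly from (3.42b) and the component values quoted above. The following is a minimal sketch; the function name `pot_error_variance` is hypothetical and the printed formatting is arbitrary.

```python
def pot_error_variance(r_req, r_hat=10_000.0, sigma_r=666.7, m_bits=8):
    """Return the (analog, quantization) variance terms of (3.42b) in ohm^2."""
    # Term due to the spread of the real full-scale resistance R.
    analog = (r_req / r_hat) ** 2 * sigma_r ** 2
    # Term due to rounding the tap to the nearest integer.
    quantization = (r_hat / 2 ** m_bits) ** 2 / 12.0
    return analog, quantization


analog, quantization = pot_error_variance(1000.0)
print(f"analog: {analog:.1f} ohm^2, quantization: {quantization:.2f} ohm^2")
```

For a request of 1 kΩ the analog term (a standard deviation of roughly 67 Ω) already dominates the quantization term (roughly 11 Ω), and its share grows with the requested value.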

Regarding point (c) of the list, an approximated schematic of the actual realization of the circuit between the two end nodes of an electrical branch that contains a potentiometer can be seen in Fig. 3.25.

Parasitic capacitors \((c_{\text{ndf}}, c_{\text{cd}}, c_{\text{pot}}, c_{\text{sw}}\) and \(c_{\text{ndt}}\)) and inductors \((l_{\text{cd}}, l_{\text{pot}}\) and \(l_{\text{sw}}\)) are not considered here since they do not affect the steady-state functioning of the circuit.

Figure 3.25 – Physical and schematic representation of parasitics for an electrical branch that contains a potentiometer

The final resistance of the branch is given by the following.

\[ r = r_{pot} + \underbrace{r_{cd} + r_{cs}}_{r_{off}} \tag{3.45} \]

The additional offset resistance \( r_{off} \) is the combined result of the resistance \( r_{cd} \) of the conductors that are involved in the electrical path and the non-zero on-resistance \( r_{cs} \) of the electronic switches in the same path.

Ideally the conductors have a zero resistance. However in reality their effective resistance is given by Pouillet’s law.

\[ r_{cd} = \rho_{cd} \cdot \frac{\ell_{cd}}{A_{cd}} \tag{3.46} \]

Where \( \rho_{cd} \) is the resistivity of the material, \( \ell_{cd} \) is the total length and \( A_{cd} \) is the area of the cross-section of the conductor. A constant cross-section along the length is assumed in the above. In the current implementation the conductors are copper PCB traces with a rectangular cross-section.

\[ \rho_{cd} = 1.68 \cdot 10^{-8} \, \Omega \cdot \text{m} \quad \text{(the resistivity of copper)} \]
\[ A_{cd} = 254 \, \mu\text{m} \times 35 \, \mu\text{m} = 8890 \, \mu\text{m}^2 \]

The length \( \ell_{cd} \) varies depending on the exact electrical branch considered on the prototype. A good approximation in the current prototype is as follows.

\[ \ell_{cd} \approx 5.5 \text{cm} . \]  

By using (3.46) the average conductor resistance is \( r_{cd} \approx 0.1 \, \Omega \).

The reconfiguration of the topology of the resistor network is effectuated by the switching of electronic switches. Depending on the routing of the line on the electronic board a different number \( n_{sw} \) of switches can be in its path. The term \( r_{cs} \) in (3.45) is the lumped effect of all of them.

\[ r_{cs} = n_{sw} \cdot r_{sw} \]  

When a switch is open its nominal resistance is \( r_{sw} \to \infty \) and when closed it is \( r_{sw} \to 0 \). However the latter is not always accurate due to manufacturing imperfections. In our case the average on-resistance of the digitally controlled switches is \( r_{sw} \approx 2.5 \, \Omega \). On average \( n_{sw} = 2 \) switches are involved in the electrical path of a potentiometer (one on each side of it). Hence the combined effect of them is \( r_{cs} \approx 5 \, \Omega \).

In total the offset value in (3.45) is

\[ r_{off} \approx 5.1 \, \Omega . \]  

This value is, in most cases, at least two orders of magnitude smaller than the effective values of the resistors used in the topology, and thus it can be neglected.
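The offset estimate can be checked with a short calculation based on Pouillet's law and the values quoted above (copper resistivity, a 254 µm × 35 µm trace, about 5.5 cm of length, two switches of about 2.5 Ω each). The sketch below is illustrative only; the function and parameter names are hypothetical.

```python
def branch_offset_resistance(length_m=0.055, width_m=254e-6,
                             thickness_m=35e-6, rho=1.68e-8,
                             n_sw=2, r_sw=2.5):
    """Return (r_cd, r_cs, r_off) in ohm for one electrical branch."""
    # Pouillet's law for a trace of constant rectangular cross-section.
    r_cd = rho * length_m / (width_m * thickness_m)
    # Lumped on-resistance of all switches in the path.
    r_cs = n_sw * r_sw
    return r_cd, r_cs, r_cd + r_cs


r_cd, r_cs, r_off = branch_offset_resistance()
print(f"r_cd = {r_cd:.3f} ohm, r_cs = {r_cs:.1f} ohm, r_off = {r_off:.2f} ohm")
```

The switches account for almost all of the offset; the trace resistance contributes only about 2 % of it.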

Voltage injection inaccuracies

Equation (3.17) holds for the voltage DACs of the FPPNS. An inaccurate version of it is as follows ($V_{off}$ is neglected since it does not affect the computations).

$$V_{DAC} = \frac{t}{2^M} \cdot (V_{FS} + \epsilon_{DAC1}) + \epsilon_{DAC2} \quad (3.51)$$

The random variable $V_{FS}$ comes from a circuit that generates the full-scale reference voltage of the DAC. It is a combination of an Analog Devices AD5624 16-bit, B Grade nanoDAC with a fixed output, in series with a buffer Analog Devices AD8685 low-noise, precision CMOS op. amp. The random variable for the generated voltage is as follows.

$$V_{FS} \sim \mathcal{N}(\hat{V}_{FS}, \sigma_{FS}^2) \quad (3.52)$$

The reference is set to $\hat{V}_{FS} = 5 \, V$ and the variance is $\sigma_{FS}^2 = (305 \cdot 10^{-6} + 50 \cdot 10^{-6})^2$. The latter is taken from the datasheets of the components:

- a maximum (i.e. $3 \cdot \sigma$) inaccuracy of $\pm 12$ bits for 5 V for the AD5624, and
- a typical (i.e. $1 \cdot \sigma$) inaccuracy of 50 $\mu$V for a supply voltage of 5 V for the AD8658.

The additional error terms in (3.51) are determined by the characteristics of the Analog Devices AD5449 12-bit DAC that has been used in the design.

- $\epsilon_{DAC1} \sim \mathcal{N}(0, \sigma_{DAC1}^2)$ corresponds to a gain error of $V_{FS} \pm 0.5\%$. Hence, for $V_{FS} = 5 \, V$ it is $\sigma_{DAC1} = 25 \cdot 10^{-3}$ V.
- $\epsilon_{DAC2} \sim \mathcal{N}(0, \sigma_{DAC2}^2)$ corresponds to a relative accuracy of $\pm 1 \, LSB$. The value of an LSB can be calculated from (3.17): $\sigma_{DAC2} = 5 / 2^{12} = 1.22 \cdot 10^{-3}$ V.

The functioning of voltage DACs follows a similar line to the functioning of potentiometers. When a $V_{req}$ is requested from the voltage DAC, $t$ has to be calculated so that the resulting $V_{DAC}$ from (3.17) best approximates $V_{req}$. So, the latter is solved for $t$.

$$t^R = \frac{V_{req} \cdot 2^M}{\hat{V}_{FS}} \quad (3.53)$$

Notice that $t^R$ in (3.53) is not random, and also not integer, as required by (3.17). Due to the discrete nature of the reconfigurable element, it has to be quantized and saturated according to the limits, similarly to (3.37). Again, in the non-saturated case \( t \) is rounded half towards positive infinity. If the requested value is outside the attainable limits of the element, then the tap setting saturates accordingly, to the value closest to the one requested.

Assuming a uniformly random \( V_{req} \) within the attainable limits of the voltage DAC, the resulting \( t^R \) from (3.53) is also uniformly random within the tap limits. Accordingly, the \( t \) calculated as in (3.37) is also random: it is always the integer tap closest to \( t^R \), as per (3.38). The term \( q_t \) stands for the effect of quantization on the tap setting and follows the uniform distribution of (3.39). By using the distributions of all the random variables, the mean and variance of (3.51) can be calculated.

\[
E[V_{DAC}] = V_{req} \quad \text{(3.54a)}
\]

\[
\text{var}[V_{DAC}] = \underbrace{\left( \frac{V_{req}}{\hat{V}_{FS}} \right)^2 \left( \sigma_{FS}^2 + \sigma_{DAC1}^2 \right) + \sigma_{DAC2}^2}_{\text{due to analog inaccuracies}} + \underbrace{\frac{1}{12} \left( \frac{1}{2^M} \right)^2 \left( \sigma_{FS}^2 + \sigma_{DAC1}^2 \right) + \frac{1}{12} \left( \frac{\hat{V}_{FS}}{2^M} \right)^2}_{\text{due to quantization}} \quad \text{(3.54b)}
\]

From (3.54a) it becomes clear that, in the way the reconfigurable element is operated, its output is on average equal to the desired value \( V_{req} \). In (3.54b) the two sources of the inaccuracy are highlighted. Again, an error variable can be defined as \( \epsilon_V = V_{DAC} - V_{req} \). For this error variable the following holds.

\[
E[\epsilon_V] = 0 \quad \text{(3.55a)}
\]

\[
\text{var}[\epsilon_V] = \text{var}[V_{DAC}] \quad \text{(3.55b)}
\]

Substituting values for the current implementation in (3.54b) yields the expression for the variance of voltage injections in the current system.

\[
\text{var}[V_{DAC}] = V_{req}^2 \cdot 2.5 \cdot 10^{-5} + 1.5 \cdot 10^{-6} + 1.24 \cdot 10^{-7} \quad [V^2] \quad \text{(3.56)}
\]

The first two terms are due to the inaccuracy of the analog components and the last one is due to quantization. This analysis covers points (d) and (f) of the list in the beginning of the section for voltage injections.
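The three numerical terms of (3.56) can be reproduced from the datasheet figures quoted above. The sketch below rests on two assumptions that should be read as such: the constant \( 1.5 \cdot 10^{-6} \) term is taken to be \( \sigma_{DAC2}^2 \), and the cross term between quantization and the full-scale spread (on the order of \( 10^{-12}\ V^2 \)) is neglected. The function name is hypothetical.

```python
def vdac_variance_terms(v_req, v_fs=5.0, m_bits=12,
                        sigma_fs=355e-6, sigma_dac1=25e-3,
                        sigma_dac2=1.22e-3):
    """Return (gain, offset, quantization) variance terms in V^2."""
    # Spread of the full-scale reference and of the DAC gain, scaled by
    # the requested fraction of full scale.
    gain = (v_req / v_fs) ** 2 * (sigma_fs ** 2 + sigma_dac1 ** 2)
    # Additive DAC error (assumed origin of the constant 1.5e-6 term).
    offset = sigma_dac2 ** 2
    # Uniform rounding error of half an LSB at most.
    quantization = (v_fs / 2 ** m_bits) ** 2 / 12.0
    return gain, offset, quantization


gain, offset, quantization = vdac_variance_terms(1.0)
print(gain, offset, quantization)
```

With these values the gain-related term exceeds the other two as soon as the requested voltage is above a few hundred millivolts.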

Current injection inaccuracies

A similar line can be followed for current injections in the FPPNS. It was presented in (3.18) that current injections are realized by the voltage-to-current converter topology of Fig. 3.13. The voltage in the numerator of (3.18) follows the analysis of (3.51) and the conversion resistance in the denominator of (3.18) follows the analysis of (3.41). The mean value of $I_{DAC}$ is as follows.

$$E[I_{DAC}] = \frac{E[V_{DAC}]}{E[R_{conv}]} = \frac{V_{req}}{R_{conv}}$$

(3.57)

In the above $R_{conv}$ is the fixed resistance value requested from the conversion resistance and $V_{req}$ is the value requested from the voltage DAC. The latter is computed as

$$V_{req} = I_{req} \cdot R_{conv}$$

(3.58)

Where $I_{req}$ is the current asked as a current injection.

The variance of $I_{DAC}$ is determined by the first-order rule for the variance of the ratio of two independent random variables.

$$var[I_{DAC}] = var \left[ \frac{V_{DAC}}{R_{conv}} \right] \approx \frac{var[V_{DAC}] \cdot E[R_{conv}]^2 + var[R_{conv}] \cdot E[V_{DAC}]^2}{E[R_{conv}]^4}$$

(3.59)

The analytical expression of the result is omitted here for the sake of brevity. By defining an error variable $\epsilon_I = I_{DAC} - I_{req}$, the distribution of this error can be found.

$$E[\epsilon_I] = 0$$

(3.60a)

$$var[\epsilon_I] = var[I_{DAC}]$$

(3.60b)

This analysis covers the points (d) and (f) of the list in the beginning of the section, for current injections.
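The propagation rule for the ratio can be evaluated numerically as follows. The sketch assumes the usual first-order (delta-method) form, in which the denominator is the fourth power of the mean resistance so that the result has units of A². The function name and the example numbers are hypothetical.

```python
def ratio_variance(mean_v, var_v, mean_r, var_r):
    """First-order variance of V/R for independent V and R."""
    return (var_v * mean_r ** 2 + var_r * mean_v ** 2) / mean_r ** 4


# Hypothetical example: 1 V across a nominal 1 kohm conversion resistance,
# with standard deviations of 5 mV and 10 ohm respectively.
var_i = ratio_variance(mean_v=1.0, var_v=(5e-3) ** 2,
                       mean_r=1000.0, var_r=10.0 ** 2)
print(f"std of injected current: {var_i ** 0.5:.3e} A")
```

If the resistance were exact (`var_r = 0`) the expression reduces to \( var[V_{DAC}] / R_{conv}^2 \), as expected for a division by a constant.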

Voltage measurement inaccuracies

A similar line of analysis is followed for all measurement devices in the system (voltage and current ADCs). Equations (3.19) describe the perfect functioning of the voltage ADCs.

The voltage measurement subsystem of the FPPNS is based on an AD7356 differential-input 12-bit successive approximation (SAR) ADC by Analog Devices. Assuming a random \( V_{\text{in}} \) within the limits of the ADC, an inaccurate version of (3.19a) is given hereunder.

\[
D^R = \frac{V_{\text{in}}}{V_{\text{FS}}} \cdot (2^M + \epsilon_{FS}) + \epsilon_{\text{mid}} + \epsilon_{\text{INL}}
\]  

(3.61)

\( V_{\text{FS}} \) is the full-scale reference voltage of the ADC. It is provided by an Analog Devices ADR445B LDO XFET ultralow noise voltage reference. The multiplicative error term \( \epsilon_{FS} \) is a full-scale gain error. The additive error term \( \epsilon_{\text{mid}} \) is a mid-scale offset. The additive error term \( \epsilon_{\text{INL}} \) is due to the integral non-linearity of the converter. Based on the datasheets of the components of the actual implementation, the random distributions are

- \( V_{\text{FS}} \sim \mathcal{N}(\hat{V}_{FS}, \sigma_{VFS}^2) \), where \( \hat{V}_{FS} = 5 \) V and \( \sigma_{VFS} = 2 \cdot 10^{-3} \) V.
- \( \epsilon_{FS} \sim \mathcal{N}(0, \sigma_{FS}^2) \), where \( \sigma_{FS} = 1 \) bit.
- \( \epsilon_{\text{mid}} \sim \mathcal{N}(\mu_{\text{mid}}, \sigma_{\text{mid}}^2) \), where \( \mu_{\text{mid}} = 5 \) bits and \( \sigma_{\text{mid}} = 1.66 \) bits.
- \( \epsilon_{\text{INL}} \sim \mathcal{N}(0, \sigma_{\text{INL}}^2) \), where \( \sigma_{\text{INL}} = 0.5 \) bit.

If the input is outside the allowable limits of the ADC, then the digital output saturates to the value that represents the input as closely as possible. If the input is assumed uniformly random within the saturation limits, then the quantization operation of (3.19b) can be rewritten as follows.

\[
D = D^R + q_D
\]

(3.62)

The quantization term \( q_D \) follows a uniform distribution.

\[
q_D \sim \mathcal{U}(-1/2, +1/2)
\]

(3.63)

Through (3.19c) this yields the random value expression for \( V_{ADC} \).

\[
V_{ADC} = \underbrace{\frac{\hat{V}_{FS}}{2^M} \cdot \left( \frac{V_{\text{in}}}{V_{\text{FS}}} \cdot (2^M + \epsilon_{FS}) + \epsilon_{\text{mid}} + \epsilon_{\text{INL}} \right)}_{\text{due to analog inaccuracies}} + \underbrace{\frac{\hat{V}_{FS}}{2^M} \cdot q_D}_{\text{due to quantization}} \tag{3.64}
\]

The mean and the variance of $V_{ADC}$ are as follows.

\[
E[V_{ADC}] = V_{in} + \frac{\mu_{mid} \cdot \hat{V}_{FS}}{2^M} \quad (3.65a)
\]

\[
var[V_{ADC}] = \underbrace{V_{in}^2 \cdot \left( \frac{\sigma_{VFS}^2}{\hat{V}_{FS}^2} + \frac{\sigma_{FS}^2}{2^{2M}} \right) + \left( \frac{\hat{V}_{FS}}{2^M} \right)^2 \cdot \left( \sigma_{mid}^2 + \sigma_{INL}^2 \right)}_{\text{due to analog inaccuracies}} + \underbrace{\frac{1}{12} \cdot \left( \frac{\hat{V}_{FS}}{2^M} \right)^2}_{\text{due to quantization}} \quad (3.65b)
\]

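An error variable can be defined as $\epsilon_{V_{ADC}} = V_{ADC} - V_{in}$. From (3.65) the following holds for this error variable; its mean is the mid-scale offset expressed in volts.

\[
E[\epsilon_{V_{ADC}}] = \frac{\mu_{mid} \cdot \hat{V}_{FS}}{2^M} \quad (3.66a)
\]

\[
var[\epsilon_{V_{ADC}}] = var[V_{ADC}] \quad (3.66b)
\]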

Substituting the values for the current implementation in (3.65b) an analytical expression for the variance is derived.

\[
var[V_{ADC}] = V_{in}^2 \cdot 2.2 \cdot 10^{-7} + 4.48 \cdot 10^{-6} + 1.24 \cdot 10^{-7} \quad (3.67)
\]

The first two terms are due to the inaccuracy of the analog components and the last one is due to quantization. This analysis covers points (e) and (f) of the list in the beginning of the section.
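The coefficients of (3.67) can be reproduced from the distributions listed above, with the bit-valued standard deviations converted to volts through one LSB, \( \hat{V}_{FS} / 2^M \). This is a sketch under the assumption that the gain-error term enters as \( (\sigma_{FS} / 2^M)^2 \), which is what the quoted value \( 2.2 \cdot 10^{-7} \) implies. The function name is hypothetical.

```python
def vadc_variance_terms(v_in, v_fs=5.0, m_bits=12, sigma_vfs=2e-3,
                        sigma_fs=1.0, sigma_mid=1.66, sigma_inl=0.5):
    """Return (gain, offset, quantization) variance terms in V^2."""
    lsb = v_fs / 2 ** m_bits                 # volts per bit
    # Reference spread plus full-scale gain error, both relative.
    gain = v_in ** 2 * ((sigma_vfs / v_fs) ** 2
                        + (sigma_fs / 2 ** m_bits) ** 2)
    # Mid-scale offset and INL, given in bits and converted to volts.
    offset = lsb ** 2 * (sigma_mid ** 2 + sigma_inl ** 2)
    # Uniform rounding error of the conversion.
    quantization = lsb ** 2 / 12.0
    return gain, offset, quantization


gain, offset, quantization = vadc_variance_terms(1.0)
print(gain, offset, quantization)
```

Unlike the voltage DAC, the constant offset-related term here exceeds the input-dependent term at a 1 V input, so the measurement error is only weakly dependent on the measured value in the lower part of the range.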

Parasitical capacitances

Point (g) strongly relates to the parasitic capacitances and inductances that appear in the electronic circuit. This can be partly seen in Fig. 3.25. Parasitic inductances for this scale of conductive lines, i.e. lengths on the order of a few centimeters up to a few tens of centimeters, can safely be neglected. However, this is not the case for capacitances.

Parasitic capacitances appear not only in external nodes, at the ends of integrated circuits (ICs) as shown schematically in Fig. 3.25, but in every electrical node of the circuit, including internal nodes of the ICs. A simplifying assumption is made that parasitic capacitances appear only as shunts from each electrical node to ground and not in series in the electrical branches of the circuit.

The FPPNS is operated in steady-state, i.e. with all variables that describe its behavior (voltages, currents) being invariant with time. The result of the parasitic capacitances is that changes in the electrical steady state cannot occur instantaneously. Instead the circuit goes through a transient state during which variables change non-periodically. Gradually the transient state vanishes and the new steady state is reached. This transient is mainly due to the physical requirement to charge the parasitic capacitors. This procedure cannot be instantaneous since this would require infinite power [293].

Figure 3.26 – Schematic representation of the resulting electrical circuit taking into account the node parasitic capacitances

Table 3.1 – Time domain and Laplace domain

<table>
<thead>
<tr>
<th>Component</th>
<th>Time domain</th>
<th>Laplace domain</th>
</tr>
</thead>
<tbody>
<tr>
<td>resistor</td>
<td>$v = R \cdot i$</td>
<td>$V(s) = R \cdot I(s)$</td>
</tr>
<tr>
<td>capacitor</td>
<td>$\frac{\partial v}{\partial t} = \frac{1}{C} \cdot i$</td>
<td>$s \cdot C \cdot V(s) - C \cdot V_0 = I(s)$</td>
</tr>
</tbody>
</table>

When a read command is issued by the computational pipelines to the ADCs, the latter sample-and-hold the voltage at their input. All other things considered perfect, these FPPNS voltages are not in their final (correct) values until the transient is over. Therefore if the read command is issued before the new steady state is reached, erroneous results will be output by the FPPNS for the solution of the linear system.

The exact way to model this transient of the electrical circuit is out of the scope of this work. The reader is referred to the relevant bibliography instead [293, 294, 295]. Hereunder a brief attempt is made to gain an insight into the mechanism of the phenomenon with the help of the Laplace transform.

The resulting analog circuit contains a resistor grid, and at each node a parasitic capacitor and a current source. Fig. 3.26 shows a schematic representation. Some of the nodes may contain a voltage source instead; the analysis remains the same. Table 3.1 contains the Laplace-domain equivalents of the resistor and capacitor.

The time-domain equations of the grid are as follows.

\[
G \cdot V + \text{diag}(C) \cdot \begin{bmatrix}
\frac{\partial V_1}{\partial t} \\
\frac{\partial V_2}{\partial t} \\
\vdots \\
\frac{\partial V_n}{\partial t}
\end{bmatrix} = I
\]

(3.68)

Where \( G \) is the conductance matrix resulting from the resistor network, \( V \) is the node voltage vector of the circuit, \( C = [C_1 \ldots C_n]^T \) is the vector of the node parasitic capacitances and \( I \) is the current injection vector. The Laplace transform \( \mathcal{L} \{ \cdot \} \) is applied to the above; in the resulting expressions \( C \) is used as shorthand for the diagonal matrix \( \text{diag}(C) \). An initial voltage condition at each node (capacitor) of \( V^0 = [V_1^0 \ldots V_n^0]^T \) is considered. Current injections are considered to be step inputs, i.e. for node \( k \) the injection steps from \( 0 \) to \( I_k \) at \( t = 0 \). Hence, the Laplace transform of the current injection vector is \( I(s) = (1/s) \cdot I \).

\[
(s \cdot C + G) \cdot V(s) = \frac{1}{s} \cdot I + C \cdot V^0
\]

(3.69)

Mathematically the solution for the voltage in the s-domain is given as follows.

\[
V(s) = (s \cdot C + G)^{-1} \cdot \left( \frac{1}{s} \cdot I + C \cdot V^0 \right)
\]

(3.70)

According to the initial value theorem, the initial value of the solution for \( V \) in the time-domain is as follows. As expected it coincides with \( V^0 \).

\[
V|_{t=0^+} = \lim_{s \to \infty} s \cdot V(s) = \lim_{s \to \infty} \left[s \cdot \left(C + \frac{G}{s}\right)\right]^{-1} \cdot s \cdot \left(\frac{1}{s} \cdot I + C \cdot V^0\right) = V^0
\]

(3.71)

According to the final value theorem, the final value of the solution for \( V \) in the time-domain is given in the following equation. As expected it only depends on the resistor grid and not on the capacitors. It is therefore established that the final solution will be what is expected from the solver.

\[
V|_{t=\infty} = \lim_{s \to 0} s \cdot V(s) = \lim_{s \to 0} (s \cdot C + G)^{-1} \cdot \left( I + s \cdot C \cdot V^0 \right) = G^{-1} \cdot I
\]

(3.72)
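The initial and final values derived above can be illustrated by integrating (3.68) numerically for a very small grid. The sketch below uses a two-node grid with hypothetical conductances, capacitances and injections (they are not values of the prototype) and a plain forward-Euler scheme, chosen only for brevity.

```python
def simulate_two_node_grid(g, c, i_inj, v0, dt, steps):
    """Forward-Euler integration of C dV/dt = I - G V for a 2-node grid."""
    v = list(v0)
    for _ in range(steps):
        dv0 = (i_inj[0] - g[0][0] * v[0] - g[0][1] * v[1]) / c[0]
        dv1 = (i_inj[1] - g[1][0] * v[0] - g[1][1] * v[1]) / c[1]
        v = [v[0] + dt * dv0, v[1] + dt * dv1]
    return v


def steady_state(g, i_inj):
    """Closed-form G^{-1} I for a 2x2 conductance matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [(g[1][1] * i_inj[0] - g[0][1] * i_inj[1]) / det,
            (g[0][0] * i_inj[1] - g[1][0] * i_inj[0]) / det]


G = [[2e-3, -1e-3], [-1e-3, 1.5e-3]]   # siemens (hypothetical values)
C = [10e-12, 10e-12]                   # farads (hypothetical values)
I = [1e-3, 0.0]                        # amperes, step applied at t = 0
v_end = simulate_two_node_grid(G, C, I, [0.0, 0.0], dt=1e-10, steps=5000)
v_inf = steady_state(G, I)
print(v_end, v_inf)
```

After a simulated time that is long compared with the time constants of the grid, the integrated voltages coincide with \( G^{-1} \cdot I \) regardless of the capacitance values, in line with (3.72).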

The actual trajectory between $V^{0} \rightarrow G^{-1} \cdot I$ depends on the actual time-domain solution of the above. This can be derived by using the inverse Laplace transform $\mathcal{L}^{-1} \{ \cdot \}$ on (3.70). In reality this $V(t)$ solution is tedious to attain. Hence, an approximation of the mechanism is proposed.

A one-pole model, similar to a series RC circuit, is assumed for the transition. Such a circuit is shown in Fig. 3.27. $V_{\text{ideal}}$ on the left of the figure is the “ideal” voltage imposed to the node by the collective effect of the resistor grid and voltages/currents of other nodes. $V_{\text{actual}}$ is the actual voltage on the node that is emulated by the one-pole model. The transfer function of such a system is as follows.

$$H(s) = \frac{1}{\tau \cdot s + 1} \quad (3.73)$$

Where the RC constant is $\tau = R \cdot C$. The system is assumed to be initially in a steady state and the initial condition is $V_{\text{actual}}(t = 0) = V_{\text{actual}}^{0} = V_{\text{ideal}}^{0}$. A step input for $V_{\text{ideal}}$ is defined as follows.

$$V_{\text{ideal}}(t) = V_{\text{ideal}}^{0} + (V_{\text{ideal}}^{*} - V_{\text{ideal}}^{0}) \cdot 1_{[0,\infty)}, \text{ where } 1_{[0,\infty)} = \begin{cases} 0 & \text{if } t < 0 \\ 1 & \text{if } t \geq 0 \end{cases} \quad (3.74)$$

The time domain output of $V_{\text{actual}}$ is then

$$V_{\text{actual}}(t) = V_{\text{ideal}}^{0} \cdot e^{-t/\tau} + V_{\text{ideal}}^{*} \cdot \left(1 - e^{-t/\tau}\right). \quad (3.75)$$

An ADC read command is issued at some time $t_{rd}$ and the ADCs sample-and-hold the voltage at that exact time instant, i.e. $V_{\text{actual}}(t_{rd})$. This sampled-and-held voltage is then converted to digital, and corresponds to the solution of the analog part for the electrical node in question. A similar model can be applied to all nodes of the system.

The mechanism described hereinabove introduces a tradeoff between the accuracy of the solution and the waiting time $t$. The longer the waiting time, the closer the FPPNS solution to
the true electronic solution, i.e. to the electronic steady-state. This speed-versus-accuracy tradeoff is a feature that can be readily exploited when/where required.
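Under the one-pole model of (3.75), the waiting time needed for a given settling tolerance follows in closed form, \( t = \tau \cdot \ln(|V_{\text{ideal}}^{*} - V_{\text{ideal}}^{0}| / \text{tol}) \). The sketch below only illustrates this relation; the function names and the tolerance argument are hypothetical, and the model itself is the rough approximation discussed above.

```python
import math


def remaining_gap(v0, v_star, t, tau):
    """Gap V* - V_actual(t) left after waiting t, from (3.75)."""
    return (v_star - v0) * math.exp(-t / tau)


def waiting_time(v0, v_star, tol, tau):
    """Smallest wait for which |V* - V_actual(t)| <= tol (one-pole model)."""
    gap = abs(v_star - v0)
    if gap <= tol:
        return 0.0          # already within tolerance, no wait needed
    return tau * math.log(gap / tol)


# Full-scale step settling to within half an LSB of a 12-bit converter.
t_wait = waiting_time(v0=0.0, v_star=5.0, tol=5.0 / 2 ** 13, tau=1.0)
print(f"required wait: {t_wait:.2f} time constants")
```

Because the gap decays exponentially, each additional bit of settling accuracy costs a fixed \( \tau \cdot \ln 2 \) of extra waiting time.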

The above simplified model of (3.73)-(3.75) is of course inaccurate and should only be used as a rough reference when deciding on the waiting time between the new current injection and the ADC read command. The selection of this waiting time is a design choice that determines whether the FPPNS functions properly.

It would be closer to reality to assume different RC constants \( \tau_k \) for different nodes \( k = 1,...,n \), however assuming a system wide constant \( \tau \) may be sufficient. This universal \( \tau \) cannot be obtained analytically. Instead it can be determined by a series of empirical investigations, by measuring how “fast” the voltage value at the nodes of the system stabilizes to a final value. This empirically determined value is of course going to vary between different power systems, and different topological and value mappings of the same system. This is because all conductances and capacitances that are involved change in each case. Related results are presented in section 3.4.6.

3.3.2 Digital inaccuracy

There are two main inaccuracy sources in the digital part of the hardware: the use of fixed-point arithmetic and the inherent inaccuracies of the integration algorithms.

**Fixed-point arithmetic**

A fixed-point arithmetic is fully defined by a bit width and a scaling factor \( \langle B, S \rangle \). Assume a positive real number \( x \in \mathbb{R}^+ \) without loss of generality. Negative numbers can be treated with any common signed number representation, e.g. signed magnitude, one’s complement, two’s complement, etc. Then the number is amenable to a fixed-point representation \( \hat{x} = x_{int} \cdot S \), where \( x_{int} \in \mathbb{Z} \) is an integer and \( S \in \mathbb{R}^+ \) is a scaling factor. If \( x_{int} \) is expressed in the binary numeral system, as it is done in modern digital computers, a sequence of \( B \) bits is used to represent it.

\[
x_{int} = 0b \ b_B b_{B-1} ... b_2 b_1
\]

(3.76)

Where \( b_i \) in the above are bits. The difference between the actual value of a number and its representation is the error introduced by the fixed-point representation.

\[
e_x = x - \hat{x}
\]

(3.77)

The minimum representable value by \( x_{\text{int}} \) is obviously zero, \( \min(x_{\text{int}}) = 0 \) - when all bits are equal to zero, and the maximum representable value is \( \max(x_{\text{int}}) = 2^B - 1 \) - for all bits equal to one. Scaled by \( S \) these correspond to a minimum and a maximum value for \( \hat{x} \).

\[
\begin{aligned}
\min(\hat{x}) &= 0 \\
\max(\hat{x}) &= (2^B - 1) \cdot S
\end{aligned}
\tag{3.78}
\]

If the number to be represented is outside the above limits, then it is not representable with the \( \langle B, S \rangle \) fixed-point arithmetic, i.e. an overflow occurs. Either \( B \) or \( S \) has to be increased. The increase of \( B \) does not degrade the accuracy/granularity of the arithmetic, but comes at the cost of additional hardware resources. The increase of \( S \) is free from a hardware point of view, but results in coarser representations, i.e. increased errors. If neither of the two measures is taken in the datapath, this results in information loss due to truncation.

Also, for a given \( \langle B, S \rangle \), by construction the scaling factor is also the value resolution of \( \hat{x} \): since the minimum increase in \( x_{\text{int}} \) is one, the minimum increase in \( \hat{x} \) is \( 1 \cdot S \). Assuming an \( x \in [\min(\hat{x}), \max(\hat{x})] \), this value resolution introduces a quantization inaccuracy. If a well behaved \( x \to \hat{x} \) mapping exists, for this inaccuracy it holds

\[
|e_x| \leq \frac{S}{2}
\]  
(3.79)

This is so, since if the error exceeds \( \pm S/2 \), \( x \) should be mapped to the next or previous integer \( x_{\text{int}} \) value.
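The representation, its overflow condition and the error bound of (3.79) can be illustrated with a small sketch. Round-to-nearest is assumed here as the "well behaved" mapping; the function name `to_fixed` and the example values of \( B \) and \( S \) are hypothetical and do not correspond to a particular stage of the pipelines.

```python
import math


def to_fixed(x, bits, scale):
    """Map a non-negative real x to (x_int, x_hat) in <bits, scale>.

    Raises OverflowError when x is not representable, i.e. when the
    rounded integer falls outside [0, 2**bits - 1].
    """
    x_int = math.floor(x / scale + 0.5)      # round to the nearest integer
    if x_int < 0 or x_int > 2 ** bits - 1:
        raise OverflowError("value outside the <B, S> range")
    return x_int, x_int * scale


B, S = 8, 1.0 / 256                          # hypothetical arithmetic
x = 0.3
x_int, x_hat = to_fixed(x, B, S)
e_x = x - x_hat                              # representation error (3.77)
print(x_int, x_hat, e_x, abs(e_x) <= S / 2)
```

Requesting `to_fixed(1.0, B, S)` with the same arithmetic raises an overflow, since the largest representable value is \( (2^8 - 1) / 256 \); either a wider \( B \) or a coarser \( S \) would be needed.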

In the current implementation care has been taken to minimize truncation inaccuracies. Most datapath truncations occur in a dedicated truncation block - see “trunk” in Fig. 3.17. Minor truncations occur also elsewhere, but attention has been paid to minimize the probability of losing information due to truncation. This has been achieved by designing the pipelines by taking into account the range of the operands at each stage.

As for the quantization inaccuracies, if examined per se, of course they introduce errors in the fixed-point datapath. However, it should be taken into account that the digital part operates in conjunction with the analog part. The interface between the two is the array of analog-to-digital (ADC) and digital-to-analog (DAC) converters. By definition the latter only provide fixed-point representations of analog quantities, hence, the very first step of the computing pipeline forcibly uses fixed-point arithmetic. In the current implementation the bit length of this first stage is 12 bit (bit resolution of both ADCs and DACs).

This limit is actually the quantization accuracy bottleneck for the whole design. Every later stage has been designed to be more granular than this first stage. Hence the effect of fixed-point quantization in the pipeline can be *neglected*.

Table 3.2 – Stages of inaccuracy of linear operations performed by the mixed-signal computer

<table>
<thead>
<tr>
<th>Stage</th>
<th>Perturbed equation</th>
<th>Due to</th>
</tr>
</thead>
<tbody>
<tr>
<td>a</td>
<td>$\tilde{G}_{el} = G_{el} + E_{G}$</td>
<td>inaccuracies in potentiometer admittances</td>
</tr>
<tr>
<td>b</td>
<td>$\tilde{I}_{el} = I_{el} + \epsilon_{I}$</td>
<td>inaccuracies in current DAC injections</td>
</tr>
<tr>
<td>c</td>
<td>$\tilde{V}_{el} = V_{el} + \epsilon_{V1}$</td>
<td>combined effect of stages (a) and (b) to voltage</td>
</tr>
<tr>
<td>d</td>
<td>$V'_{el} = \tilde{V}_{el} + \epsilon_{V2}$</td>
<td>inaccuracies in voltage ADC measurements</td>
</tr>
</tbody>
</table>

Numerical integration algorithms

Another source of digital inaccuracy is the numerical integration algorithms. Every numerical integration algorithm, including the FE and AB2 which have been implemented in this work, only produces an approximation to the perfect solution of the DAEs/ODEs. The exact mechanism of the error, called *truncation error*, is briefly introduced in section C.3 of appendix C. This is the same whether the algorithm is implemented in conventional PC software or in the FPGA computational pipelines as described in section 3.1.2. In that sense the use of the FPPNS does not introduce any additional inaccuracy compared to standard power system analysis tools.

There is one point of interest however. As explained in the previous sections, there are many sources of digital and analog inaccuracy in the system. This raises concerns on the usability of the final results with respect to the integration algorithm alone - all other inaccuracy sources considered equal. A series of studies to investigate the phenomenon have been carried out [286, 287], and have confirmed this suspicion. Relevant results will be presented in section 3.4.

3.3.3 Effect on the mathematical operation

Individual inaccuracy components described in the previous sections contribute to the overall inaccuracy of operations performed by the FPPNS. This section focuses specifically on the analysis of the linear system operation of (3.23). The analysis of the inaccuracy of other operations (e.g. matrix-vector multiplication) goes along the same lines.

Inaccuracy in linear system solving is manifested in four stages, as it is formalized in table 3.2.

Stage a) Inaccuracies of the conductances of the potentiometers that make up \( G_{el} \) result in an additive error term \( E_{G} \) in the construction of the matrix. Errors in individual entries of \( E_{G} \) are contributed by every potentiometer that is incident to the electrical node in question. Based on the assumptions of the previous subsection, these contributions follow the distribution of (3.42).

Stage b) Inaccuracies of the injected values of DACs result in the additive error term \( \epsilon_I \) in the \( I_{el} \) injection vector. The distribution of (3.60) is followed for the individual components of \( \epsilon_I \).

Stage c) The imperfect injections \( \tilde{I}_{el} \) applied on the imperfect grid \( \tilde{G}_{el} \) induce voltages that are different from the ones that would occur in case the perfectly accurate \( I_{el} \) and \( G_{el} \) had been used. Stage c is exactly this inaccuracy to the voltage because of imperfect current injections and grid conductances.

\[
\begin{aligned}
\text{perfect: } & V_{el} = G_{el}^{-1} \cdot I_{el} \\
\text{imperfect: } & \tilde{V}_{el} = \tilde{G}_{el}^{-1} \cdot \tilde{I}_{el} \\
\text{definition of error: } & \tilde{V}_{el} = V_{el} + \epsilon_{V1}
\end{aligned}
\tag{3.80}
\]

Stage d) This last stage comprises two parts, so its effect can be broken up accordingly.

\[
\epsilon_{V2} = \epsilon_{V2a} + \epsilon_{V2b} \quad (3.81)
\]

The first part concerns the fact that \( \tilde{V}_{el} \) is the steady-state voltage that would be induced on the resistor grid if \( \tilde{I}_{el} \) was applied. However, as discussed in section 3.3.1, this steady-state voltage is not reached instantaneously due to parasitic RC phenomena. This lag can be formalized by the introduction of an additive error \( \epsilon_{V2a} \) into \( \tilde{V}_{el} \). The longer the waiting time for the analog grid to settle, the smaller the magnitude of this additional error term. If a system-wide RC constant \( \tau \) is assumed, based on (3.75), \( \epsilon_{V2a} \) can be approximated as follows.

\[
\epsilon_{V2a} = \tilde{V}_{el}(t) - \tilde{V}_{el}^* = (\tilde{V}_{el}^0 - \tilde{V}_{el}^*) \cdot e^{-\frac{t}{\tau}} \quad (3.82)
\]

Where \( \tilde{V}_{el}^* \) is the final settling value of the electronics if they were allowed enough time to settle, \( \tilde{V}_{el}^0 \) is the initial analog steady state condition, and \( \tilde{V}_{el}(t) \) is the time-dependent value that is finally read by the ADCs. It is seen from the above that as expected \( \epsilon_{V2a} \) features an exponential decay with time.

Assume that at the time the ADC read command is issued the real voltage on the FPPNS is \( V_{rd} = \tilde{V}_{el}(t)|_{t=t_{rd}} = \tilde{V}_{el} + \epsilon_{V2a} \). Then, the \( V_{rd} \) value cannot be read accurately, since only finite precision measurement devices are available. This introduces the additive error vector \( \epsilon_{V2b} \). Individual entries of \( \epsilon_{V2b} \) follow the normal random distribution of (3.66).

The total absolute error of the solution of the analog solver compared to one provided by an
### 3.3.4 Calibration

A calibration system has been developed for the analog part of the FPPNS [284]. The purpose of calibration is to improve the accuracy of the analog system. This translates into updated expected values and lower variances for many of the “unknown” inaccurate variables in the system, e.g. the full-scale resistance of the potentiometers (3.35), the combined inaccuracies of voltage DACs (3.51), and the combined inaccuracies of voltage ADCs (3.61). The new expected values result in better tap setting calculations for potentiometers and voltage sources through (3.36) and (3.53). Combined with lower variances, this leads to smaller $E_G, \epsilon_I, \epsilon_{V1}, \epsilon_{V2b}$ as detailed in section 3.3.3. In turn, this leads to a more accurate solution of the original problem.

The calibration sequence is fully automated in the sense that no user intervention is required. Hence, it can be carried out as often as required, for example due to drift of the analog parameters inherent to changes in temperature. It is realized using the probing/measurement resources that are already available on the boards of the prototype implementation. Fig. 3.28 illustrates the procedure. The steps described hereunder suffer naturally from the finite accuracy of the on-board measurement devices as well as from their quantized nature.

The process starts by correcting the offset of the ADC by connecting its input to the analog ground (step 1). This step reduces the collective effect of $\sigma_{VFS}, \sigma_{FS}, \sigma_{mid}, \sigma_{INL}$ in (3.61). Indirectly this reduces the $\epsilon_{V2b}$ in step (d) of table 3.2.

Step 2 takes the calibrated ADCs as reference to correct the output characteristic of the DACs. This reduces the collective effect of $\sigma_{FS}$, $\sigma_{DAC1}$, $\sigma_{DAC2}$ in (3.54). In turn this amelioration has a positive impact on the current injection mechanism through (3.59).

Step 3 calibrates the conversion resistors $R_{conv}$, which are part of the current injection mechanism. For the conversion resistor of every node, an accurate voltage divider is built by putting $R_{conv}$ in series with a resistor of very high accuracy ($\pm0.1\%$), $R_{acc}$ in the figure. In this way $E[R_{conv}]$ is updated and $var[R_{conv}]$ is reduced, therefore the current injection through (3.59) is rendered more accurate. Steps 2 and 3 of the calibration process combined reduce the $\epsilon_I$ component of (b) of table 3.2.

Step 4 uses the already calibrated values of the conversion resistors to calibrate the potentiometers of the resistor network ($R_{(x,y)}$ in the figure). This step naturally inherits the imperfect calibration of the conversion resistors in the previous step. Overall it reduces the effect of $\sigma_R$ of (3.41) and in turn $E_G$ in step (a) of table 3.2.

There are two comments to make regarding the sub-optimality of the current calibration procedure.

- Given that it utilizes measurement instruments that are available on-board, its correction capabilities are limited by the inherent inaccuracies of the instruments themselves (ADC inaccuracy). In the ideal case, calibration measurements would be provided by a dedicated (possibly external) measurement instrument.

- It assumes a linear model for the functioning of the DACs, which is only roughly true. The output of the DAC is essentially non-linear because of manufacturing integral and differential non-linearities, INL and DNL respectively. In the ideal case, a correction look-up table for all codes of every single DAC should be available. However, due to resource utilization considerations (limited memory of the digital core), this has not been implemented in the current prototype.
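The linear DAC model mentioned in the second point can be illustrated with a short sketch. It fits a gain and an offset to (code, measured voltage) pairs by least squares and then picks the code that best realizes a requested voltage. The 12-bit resolution, the sample codes and the synthetic gain and offset are assumptions made for this example; the actual on-board procedure is the one of Fig. 3.28 and may differ.

```python
def fit_line(codes, measured):
    """Least-squares fit of measured ~= gain * code + offset."""
    n = len(codes)
    mean_c = sum(codes) / n
    mean_v = sum(measured) / n
    sxx = sum((c - mean_c) ** 2 for c in codes)
    sxy = sum((c - mean_c) * (v - mean_v) for c, v in zip(codes, measured))
    gain = sxy / sxx
    offset = mean_v - gain * mean_c
    return gain, offset

def corrected_code(v_req, gain, offset, n_bits=12):
    """Code whose modelled output is closest to v_req, clipped to the DAC range."""
    code = round((v_req - offset) / gain)
    return max(0, min(2**n_bits - 1, code))

# Synthetic DAC used only for illustration: 1.22 mV per code, 3 mV offset.
true_gain, true_offset = 1.22e-3, 3e-3
codes = [0, 1024, 2048, 3072, 4095]
measured = [true_gain * c + true_offset for c in codes]

gain, offset = fit_line(codes, measured)
code = corrected_code(2.5, gain, offset)
print(gain, offset, code)
```

A purely linear correction of this kind cannot remove INL and DNL, which is why a per-code look-up table is named above as the ideal alternative.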

As will be shown in section 3.4, calibration has a positive effect on the quality of the results of the FPPNS. Where relevant, “automatic” calibration will refer to the actual calibration on the prototype implementation. “Ideal” calibration will refer to a procedure which involves external measurement devices and the use of (manually calculated) corrections for each DAC output code. It is a non-automatic (i.e. manual) procedure that counterbalances the two points mentioned above.
3.4 Results

3.4.1 Linear system solving

First, in order to demonstrate the effect of inaccurate electronics on the solution of a mathematical problem, the resistor topology of Fig. 3.29 is used as a case study (or sample) for non-calibrated electronics.

Every resistor in the grid is $R_{xy} = 2000 \, \Omega$, hence every conductance is $G_{xy} = 5 \cdot 10^{-4} \, S$. External measurements with a Fluke 45 multimeter (resistor measurement precision of $\pm 0.05\%$) showed the following discrepancies in the real value of the resistance of the potentiometers.

\[
\begin{bmatrix}
R_{01} \\
R_{12} \\
R_{23} \\
R_{34} \\
R_{14}
\end{bmatrix} = \begin{bmatrix}
2000 \\
2000 \\
2000 \\
2000 \\
2000
\end{bmatrix} \, \Omega, \quad \begin{bmatrix}
\tilde{R}_{01} \\
\tilde{R}_{12} \\
\tilde{R}_{23} \\
\tilde{R}_{34} \\
\tilde{R}_{14}
\end{bmatrix} = \begin{bmatrix}
2096 \\
2083 \\
2047 \\
2095 \\
2059
\end{bmatrix} \, \Omega
\] (3.84)

The additive error introduced by the inaccuracies is

\[
\epsilon_R = \begin{bmatrix}
96 \\
83 \\
47 \\
95 \\
59
\end{bmatrix} \, \Omega.
\] (3.85)
Substituting $R_{req} = 2000 \, \Omega$ into (3.44) yields an a priori estimate of the variance.

$$\text{var} [\epsilon_R] = (133.8 \, \Omega)^2$$  \hspace{1cm} (3.86)

This value is notably higher than the values of (3.85). This is because the 20% tolerance value given for the full-scale resistance of the potentiometers by the manufacturer is very much on the conservative side. Empirical tests of the potentiometers showed a tolerance of approximately 5%. This would result in a variance of $\sigma_R^2 = (166.6 \, \Omega)^2$. Substituting this to (3.43b) yields the following formula.

$$\text{var} [\epsilon_R] = 2.7 \cdot 10^{-4} \cdot R_{req}^2 + 127.15$$  \hspace{1cm} (3.87)

In the latter substituting $R_{req} = 2000 \, \Omega$ yields a more realistic a priori estimate of the variance.

$$\text{var} [\epsilon_R] = (35.2 \, \Omega)^2$$  \hspace{1cm} (3.88)

The latter is closer to the values observed in (3.85).

The perfect conductance matrix of the resistor grid would be as follows.

$$G_{el} = \begin{bmatrix}
0.5 & -0.5 & 0 & 0 & 0 \\
-0.5 & 1.5 & -0.5 & 0 & -0.5 \\
0 & -0.5 & 1 & -0.5 & 0 \\
0 & 0 & -0.5 & 1 & -0.5 \\
0 & -0.5 & 0 & -0.5 & 1
\end{bmatrix} \cdot 10^{-3} \, S$$  \hspace{1cm} (3.89)

Due to inaccuracies, it is actually

$$\tilde{G}_{el} = \begin{bmatrix}
0.477 & -0.477 & 0 & 0 & 0 \\
-0.477 & 1.443 & -0.48 & 0 & -0.486 \\
0 & -0.48 & 0.969 & -0.489 & 0 \\
0 & 0 & -0.489 & 0.966 & -0.477 \\
0 & -0.486 & 0 & -0.477 & 0.963
\end{bmatrix} \cdot 10^{-3} \, S.$$  \hspace{1cm} (3.90)
The additive error introduced by the inaccuracies is as follows.

\[
E_G = \begin{bmatrix}
-0.023 & 0.023 & 0 & 0 & 0 \\
0.023 & -0.057 & 0.02 & 0 & 0.014 \\
0 & 0.02 & -0.031 & 0.011 & 0 \\
0 & 0 & 0.011 & -0.034 & 0.023 \\
0 & 0.014 & 0 & 0.023 & -0.037
\end{bmatrix} \cdot 10^{-3} \text{ S}
\]  

(3.91)
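The matrices of (3.89)-(3.91) follow mechanically from the branch list and the resistor values of (3.84). The sketch below rebuilds them with NumPy; the branch ordering is taken from the indices of (3.84), while the function and variable names are illustrative only.

```python
import numpy as np

# Branches of the sample grid as (node a, node b), in the order of (3.84).
branches = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 4)]
r_ideal = np.array([2000.0, 2000.0, 2000.0, 2000.0, 2000.0])    # ohm
r_actual = np.array([2096.0, 2083.0, 2047.0, 2095.0, 2059.0])   # ohm, measured

def conductance_matrix(branches, resistances, n_nodes=5):
    """Nodal conductance matrix of a resistor network without shunts."""
    g = np.zeros((n_nodes, n_nodes))
    for (a, b), r in zip(branches, resistances):
        y = 1.0 / r
        g[a, a] += y
        g[b, b] += y
        g[a, b] -= y
        g[b, a] -= y
    return g

g_ideal = conductance_matrix(branches, r_ideal)
g_actual = conductance_matrix(branches, r_actual)
e_g = g_actual - g_ideal          # additive error matrix E_G

print(np.round(g_actual * 1e3, 3))   # in mS
print(np.round(e_g * 1e3, 3))        # in mS
```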

Node 0 is kept at constant voltage, i.e. a voltage injection is performed. Nodes 1-4 are used as current injection points. The constant voltage node is used to set a reference node with known voltage magnitude and angle of 1 \( \angle 0^\circ \). Hence the operation performed by the resistor grid is a mixed current and voltage injection, as described in (3.25)-(3.26). The voltage injection node set is \( \mathcal{I}_V = \{0\} \) and the current injection node set is \( \mathcal{I}_I = \{1, 2, 3, 4\} \). The perfect matrix of the operation can be calculated from (3.25)-(3.26).

\[
\begin{bmatrix}
V_0 \\
I_1 \\
I_2 \\
I_3 \\
I_4
\end{bmatrix}
= D \cdot
\begin{bmatrix}
I_0 \\
V_1 \\
V_2 \\
V_3 \\
V_4
\end{bmatrix}
\]  

(3.92)

Where,

\[
D = \begin{bmatrix}
2 \cdot 10^6 & 1000 & 0 & 0 & 0 \\
-1000 & 1 & -0.5 & 0 & -0.5 \\
0 & -0.5 & 1 & -0.5 & 0 \\
0 & 0 & -0.5 & 1 & -0.5 \\
0 & -0.5 & 0 & -0.5 & 1
\end{bmatrix} \cdot 10^{-3} \text{ S}.
\]  

(3.93)

Due to inaccuracies, the actual matrix used in the solver is as follows.

\[
\tilde{D} = \begin{bmatrix}
2.096 \cdot 10^6 & 1000 & 0 & 0 & 0 \\
-1000 & 0.966 & -0.48 & 0 & -0.486 \\
0 & -0.48 & 0.969 & -0.489 & 0 \\
0 & 0 & -0.489 & 0.966 & -0.477 \\
0 & -0.486 & 0 & -0.477 & 0.963
\end{bmatrix} \cdot 10^{-3} \text{ S}
\]  

(3.94)

The voltages in nodes 1-4 can also be given by the following.

\[
\begin{bmatrix}
V_1 \\
V_2 \\
V_3 \\
V_4
\end{bmatrix} = G_{el}[2:5, 2:5]^{-1} \cdot \begin{bmatrix} I_1 \\ I_2 \\ I_3 \\ I_4 \end{bmatrix}.
\]

(3.95)

Also, as said, due to electronic inaccuracies, the injected quantities are not precise. The perfect and the actual vectors of the injections follow. Actual injections were measured using the same instrument as for the resistor measurement (voltage measurement precision of $\pm 0.025\%$).

\[
\begin{bmatrix}
V_0 \\
I_1 \\
I_2 \\
I_3 \\
I_4
\end{bmatrix} = \begin{bmatrix} 0 \\ 70 \\ 110 \\ 35 \\ -125 \end{bmatrix} \text{ mV, } \mu\text{A}, \quad \begin{bmatrix}
\tilde{V}_0 \\
\tilde{I}_1 \\
\tilde{I}_2 \\
\tilde{I}_3 \\
\tilde{I}_4
\end{bmatrix} = \begin{bmatrix} 1.7 \\ 69 \\ 109.8 \\ 35.9 \\ -124.1 \end{bmatrix} \text{ mV, } \mu\text{A}
\]

(3.96)

The error vector is as follows.

\[
\begin{bmatrix}
\epsilon_{V_0} \\
\epsilon_{I_1} \\
\epsilon_{I_2} \\
\epsilon_{I_3} \\
\epsilon_{I_4}
\end{bmatrix} = \begin{bmatrix} 1.7 \\ -1 \\ -0.2 \\ 0.9 \\ 0.9 \end{bmatrix} \text{ mV, } \mu\text{A}\] 

(3.97)

Regarding the voltage injection at node 0, substituting $V_{\text{req}} = 2.5 \text{ V}$ in (3.56) yields an a priori estimate of the variance of the voltage injection.\(^3\)

\[
\text{var} [V_{\text{DAC}}] = (2.8 \text{ mV})^2
\] 

(3.98)

This is consistent with the 1.7 mV voltage error observed in (3.97).

\(^3\)According to (3.17) the ground voltage offset is at $V_{\text{off}} = 2.5 \text{ V}$, so a voltage DAC output of $V_{\text{req}} = 2.5 \text{ V}$ is required for a final voltage injection of 0 V.

By disregarding the inaccuracy of the conversion resistor in (3.59), and considering a value of $R_{conv} = 3125 \, \Omega$, the voltage DAC inaccuracy can be translated into a current injection inaccuracy (through (3.18)).

$$\text{var} [I_{DAC}] = \frac{\text{var} [V_{DAC}]}{R^2_{conv}} = (0.9 \, \mu A)^2$$  \hspace{1cm} (3.99)

Again, values of current inaccuracies observed in (3.97) are remarkably close to the theoretically expected values of (3.99).

The voltages of the nodes 1-4 can be determined from (3.95). The perfect solution and the solution expected due to the DAC and grid imprecisions are given hereunder.

$$\begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \end{bmatrix} = G_{el}[2 : 5, 2 : 5]^{-1} \cdot \begin{bmatrix} I_1 \\ I_2 \\ I_3 \\ I_4 \end{bmatrix} = \begin{bmatrix} 180 \\ 317.5 \\ 235 \\ 82.5 \end{bmatrix} \text{ mV}, \quad \begin{bmatrix} \tilde{V}_1 \\ \tilde{V}_2 \\ \tilde{V}_3 \\ \tilde{V}_4 \end{bmatrix} = \tilde{G}_{el}[2 : 5, 2 : 5]^{-1} \cdot \begin{bmatrix} \tilde{I}_1 \\ \tilde{I}_2 \\ \tilde{I}_3 \\ \tilde{I}_4 \end{bmatrix} = \begin{bmatrix} 191.6 \\ 336 \\ 253.2 \\ 93.3 \end{bmatrix} \text{ mV}$$

(3.100)
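The two solutions of (3.100) can be recomputed from the resistor values of (3.84) and the injections of (3.96). The sketch below is one way to do this with NumPy. Node 0 is eliminated by moving its known voltage to the right-hand side; applying this to the measured 1.7 mV reference offset is an assumption of the sketch, as (3.100) does not show that term explicitly. Note that Python indexing is 0-based, so the sub-matrix written as [2:5, 2:5] in the text corresponds to `g[1:, 1:]`.

```python
import numpy as np

branches = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 4)]

def conductance_matrix(resistances, n_nodes=5):
    """Nodal conductance matrix of the sample grid."""
    g = np.zeros((n_nodes, n_nodes))
    for (a, b), r in zip(branches, resistances):
        y = 1.0 / r
        g[a, a] += y
        g[b, b] += y
        g[a, b] -= y
        g[b, a] -= y
    return g

def solve_grid(resistances, v0, injections):
    """Voltages of nodes 1-4 with node 0 held at v0 and currents injected at 1-4."""
    g = conductance_matrix(resistances)
    rhs = injections - g[1:, 0] * v0     # known reference voltage moved to the RHS
    return np.linalg.solve(g[1:, 1:], rhs)

v_ideal = solve_grid([2000.0] * 5, 0.0,
                     np.array([70.0, 110.0, 35.0, -125.0]) * 1e-6)
v_actual = solve_grid([2096.0, 2083.0, 2047.0, 2095.0, 2059.0], 1.7e-3,
                      np.array([69.0, 109.8, 35.9, -124.1]) * 1e-6)

print(np.round(v_ideal * 1e3, 1))    # mV
print(np.round(v_actual * 1e3, 1))   # mV
```

The difference `v_actual - v_ideal` is the stage (c) error that the text discusses next.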

Then, the voltage error of stage (c) of table 3.2 is as follows.

$$\epsilon_{V_1} = \begin{bmatrix} 11.6 \\ 18.5 \\ 18.2 \\ 10.8 \end{bmatrix} \quad mV$$

(3.101)

Due to ADC imperfections, the exact $\tilde{V}$ values cannot be measured with infinite accuracy. Instead, the following is provided as the solution by the analog solver.

$$\begin{bmatrix} \tilde{V}'_1 \\ \tilde{V}'_2 \\ \tilde{V}'_3 \\ \tilde{V}'_4 \end{bmatrix} = \begin{bmatrix} 188 \\ 333.3 \\ 250.2 \\ 89.1 \end{bmatrix} \text{ mV}$$

(3.102)

![Sample radial topology](image)

Figure 3.30 – Sample radial topology

The grid is allowed to completely settle for the measurement of the above, hence $\epsilon_{V_{2a}} = 0$ in (3.81), and the entire stage (d) error is due to ADC inaccuracies.

\[
\epsilon_{V2} = \epsilon_{V2b} = \begin{bmatrix}
-3.6 \\
-2.75 \\
-3.04 \\
-4.19
\end{bmatrix} \text{ mV}
\]  

(3.103)

The total absolute error in the voltage solution of the FPPNS is as per (3.83).

\[
\epsilon_V = \begin{bmatrix}
-8 \\
-15.8 \\
-15.2 \\
-6.6
\end{bmatrix} \text{ mV, } \quad \frac{\epsilon_V}{V} = \begin{bmatrix}
4.44 \\
4.98 \\
6.47 \\
8.00
\end{bmatrix} \% 
\]  

(3.104)
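The entries of (3.104) can be cross-checked with a few lines of arithmetic. In the sketch below the perfect voltages are the ones obtained from the ideal grid of (3.89) with the ideal injections of (3.96), and the readings are those of (3.102). The sign convention (perfect minus read) is chosen to match the signs printed in (3.104) and is an assumption of this sketch.

```python
v_perfect = [180.0, 317.5, 235.0, 82.5]   # mV, ideal grid with ideal injections
v_read = [188.0, 333.3, 250.2, 89.1]      # mV, values returned by the ADCs

# Total absolute error, perfect minus read.
eps_v = [p - r for p, r in zip(v_perfect, v_read)]

# Relative error magnitude in percent of the perfect value.
rel_percent = [abs(e) / p * 100.0 for e, p in zip(eps_v, v_perfect)]

print([round(e, 1) for e in eps_v])
print([round(x, 2) for x in rel_percent])
```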

3.4.2 Sample radial and meshed topologies

This subsection investigates the effect of the underlying topology graph on the error. Two topological archetypes are chosen, a purely radial topology and a perfectly meshed topology, in an attempt to gain insight into the propagation of the error. This insight is valuable as every real-world power system is composed of a set of radial and meshed parts.

Radial

A trivial radial architecture of \( n + 1 \) nodes is shown in Fig. 3.30. There is a voltage injection at node 0 (constant voltage) and current injections at nodes 1 → \( n \). All injections (voltage and current) are considered inaccurate, and so are the resistor values. The voltage inaccuracy at node 0 is due to the inaccuracy of the injection, while at nodes 1 → \( n \) it is a result of the other inaccuracies.

An analytical expression for the voltage of node $k$ is as follows.

$$
\tilde{V}_k = \tilde{V}_0 + \sum_{i=1}^{k} \tilde{R}_i \sum_{j=i}^{n} \tilde{I}_j 
$$

(3.105)

Additive error models can be assumed for all resistors, voltages and currents.

$$
\tilde{R}_i = R_i + \epsilon_{Ri}
$$

(3.106a)

$$
\tilde{V}_i = V_i + \epsilon_{Vi}
$$

(3.106b)

$$
\tilde{I}_i = I_i + \epsilon_{Ii}
$$

(3.106c)

The distribution of $\epsilon_{Ri}$ is given by (3.43), the distribution of $\epsilon_{Vi}$ is given by (3.55) and $\epsilon_{Ii}$ follows the distribution of (3.60). Based on the above, an expression for the error part of $\tilde{V}_k$ can be derived from (3.105).

$$
\epsilon_{V_k} = \epsilon_{V0} + \sum_{i=1}^{k} R_i \sum_{j=i}^{n} \epsilon_{Ij} + \sum_{i=1}^{k} \epsilon_{Ri} \sum_{j=i}^{n} I_j + \sum_{i=1}^{k} \epsilon_{Ri} \sum_{j=i}^{n} \epsilon_{Ij}
$$

(3.107)

Let $\sigma_R^2$, $\sigma_V^2$ and $\sigma_I^2$ be the variances of the individual resistance, voltage injection and current injection errors respectively. Then an expression for the variance of $\epsilon_{V_k}$ is as follows.

$$
var[\epsilon_{V_k}] = \sigma^2_V + \sigma^2_I \sum_{i=1}^{k} (n - i + 1) \cdot R_i^2 + \sigma^2_R \sum_{i=1}^{k} \left( \sum_{j=i}^{n} I_j \right)^2 + \sigma^2_R \cdot \sigma^2_I \cdot \frac{k}{2} \cdot (2 \cdot n - k + 1)
$$

(3.108)

From the above it can be seen that the variance increases with the distance $k$ from the fixed voltage node. Also, for the same $k$ (distance from the fixed voltage node), the variance is greater for larger $n$ (total number of nodes). This is in accordance with the intuition that larger (radial) topologies are more prone to uncertainty than smaller ones.
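Both trends can be checked numerically with a small Monte Carlo experiment on (3.105). The sketch below is illustrative only: the error standard deviations (1 mV, 35 Ω, 1 µA) are assumed round numbers, Gaussian errors are assumed for all quantities, and the function names are not taken from the thesis software.

```python
import numpy as np

def radial_voltages(v0, r, i):
    """Voltages of nodes 1..n of a radial feeder, following (3.105).

    r[..., m] is the resistor between nodes m and m+1 and i[..., m] is the
    current injected at node m+1 (0-based arrays, last axis is the node index).
    """
    # Current through branch m: sum of the injections at node m+1 and beyond.
    branch_current = np.cumsum(i[..., ::-1], axis=-1)[..., ::-1]
    return v0 + np.cumsum(r * branch_current, axis=-1)

def error_variance(n, sigma_v, sigma_r, sigma_i,
                   r_nom=2000.0, i_nom=5e-6, trials=20000, seed=0):
    """Monte Carlo estimate of var[eps_Vk] for k = 1..n."""
    rng = np.random.default_rng(seed)
    ideal = radial_voltages(0.0, np.full(n, r_nom), np.full(n, i_nom))
    v0 = rng.normal(0.0, sigma_v, size=(trials, 1))
    r = r_nom + rng.normal(0.0, sigma_r, size=(trials, n))
    i = i_nom + rng.normal(0.0, sigma_i, size=(trials, n))
    return (radial_voltages(v0, r, i) - ideal).var(axis=0)

var_small = error_variance(5, 1e-3, 35.0, 1e-6)
var_large = error_variance(10, 1e-3, 35.0, 1e-6)
print(np.sqrt(var_small))   # standard deviation per node, n = 5  [V]
print(np.sqrt(var_large))   # standard deviation per node, n = 10 [V]
```

With these assumed values the estimated variance grows along the feeder, and node 1 of the longer feeder is less accurate than node 1 of the shorter one.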

Meshed

A trivially meshed topology (single loop) is shown in Fig. 3.31.

The voltage of node $k$ can be found by reaching the node through the traversal $0 \rightarrow 1 \rightarrow ... \rightarrow k$. 


An analytical expression (including the inaccuracies) is as follows.

\[
\tilde{V}_k = \tilde{V}_0 + \underbrace{\frac{\sum_{j=1}^{n} \tilde{I}_j \sum_{r=j+1}^{n+1} \tilde{R}_r}{\sum_{q=1}^{n+1} \tilde{R}_q}}_{k \text{ independent}} \cdot \sum_{i=1}^{k} \tilde{R}_i - \underbrace{\sum_{i=1}^{k} \tilde{R}_i \sum_{j=1}^{i-1} \tilde{I}_j}_{k \text{ dependent, } \propto k}
\]

(3.109)

Alternatively, node \( k \) can be reached by traversing \( 0 \rightarrow n \rightarrow ... \rightarrow k \). Then, the analytical expression for the voltage is analogously calculated.

\[
\tilde{V}_k = \tilde{V}_0 + \underbrace{\frac{\sum_{j=1}^{n} \tilde{I}_j \sum_{r=1}^{j} \tilde{R}_r}{\sum_{q=1}^{n+1} \tilde{R}_q}}_{k \text{ independent}} \cdot \sum_{i=k+1}^{n+1} \tilde{R}_i - \underbrace{\sum_{i=k+1}^{n+1} \tilde{R}_i \sum_{j=i}^{n} \tilde{I}_j}_{k \text{ dependent, } \propto (n + 1 - k)}
\]

(3.110)

The analytical derivation of the above is omitted for brevity. It is easy to show that (3.109) and (3.110) are equivalent. An analysis along the lines of (3.107)-(3.108) is possible but too tedious to present here. Instead, a few insightful remarks are made.

- Unlike the radial case, in the meshed one the voltage of node \( k \) involves all resistors and currents in the topology, regardless of \( k \); see the \( k \)-independent underbraced quantity in both (3.109) and (3.110).
- The number of inaccurate terms in the sums increases the further we move away from node 0.
  - For nodes closer to 0 in the \( 0 \rightarrow 1 \rightarrow ... \rightarrow k \) direction, the \( k \)-dependent underbraced quantity in (3.109) grows as \( k \) increases.
  - For nodes closer to 0 in the \( 0 \rightarrow n \rightarrow ... \rightarrow k \) direction, the \( k \)-dependent underbraced quantity in (3.110) grows as \( k \) decreases, i.e. as the node moves away from node 0 along that path.
- As in the radial case, for the same \( k \), larger \( n \) translates in any case into more inaccurate terms in the sums of (3.109) and (3.110), hence the total variance is higher.

Figure 3.32 – Maximum absolute voltage error for radial and meshed topologies of increasing size
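The equivalence of the two traversals can be checked numerically against a plain nodal solution of the loop. The sketch below rests on several assumptions that are not confirmed by the text: resistor \( R_i \) is taken to connect nodes \( i-1 \) and \( i \), \( R_{n+1} \) closes the loop between node \( n \) and node 0, injected currents are counted positive into the node as in (3.105), and the current split between the two paths is derived here by superposition.

```python
import numpy as np

def ring_nodal(v0, r, i):
    """Reference solution of the single-loop topology by nodal analysis."""
    n = len(i)
    g = np.zeros((n + 1, n + 1))
    for m, res in enumerate(r):            # r[m] links node m and node m+1 (mod n+1)
        a, b = m, (m + 1) % (n + 1)
        g[a, a] += 1.0 / res
        g[b, b] += 1.0 / res
        g[a, b] -= 1.0 / res
        g[b, a] -= 1.0 / res
    return np.linalg.solve(g[1:, 1:], i - g[1:, 0] * v0)

def ring_forward(v0, r, i, k):
    """Voltage of node k reached through 0 -> 1 -> ... -> k."""
    n = len(i)
    # Current returning to node 0 through R_1 (k independent).
    y = sum(i[j - 1] * r[j:].sum() for j in range(1, n + 1)) / r.sum()
    return v0 + sum(r[m - 1] * (y - i[:m - 1].sum()) for m in range(1, k + 1))

def ring_backward(v0, r, i, k):
    """Voltage of node k reached through 0 -> n -> ... -> k."""
    n = len(i)
    # Current returning to node 0 through R_{n+1} (k independent).
    x = sum(i[j - 1] * r[:j].sum() for j in range(1, n + 1)) / r.sum()
    return v0 + sum(r[m - 1] * (x - i[m - 1:].sum()) for m in range(k + 1, n + 2))

rng = np.random.default_rng(1)
n = 6
r = 2000.0 + rng.normal(0.0, 35.0, n + 1)    # n+1 loop resistors [ohm]
i = 5e-6 + rng.normal(0.0, 1e-6, n)          # n current injections [A]
v0 = 1.7e-3                                  # reference voltage with offset [V]

reference = ring_nodal(v0, r, i)
forward = np.array([ring_forward(v0, r, i, k) for k in range(1, n + 1)])
backward = np.array([ring_backward(v0, r, i, k) for k in range(1, n + 1)])
print(np.max(np.abs(forward - reference)), np.max(np.abs(backward - reference)))
```

Both traversals touch every resistor and every injection through the loop current, which is the first remark of the list above.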

The above observations are illustrated with an example. A radial and a meshed topology with a single loop have been implemented and tested. In both cases, all branch resistances have been set to $R_k = 2000 \Omega$, $k = 1 \ldots n$ and current injections to $I_k = 5 \mu A$. Topologies of size $n = 2 \ldots 18$ have been created. Fig. 3.32 shows the maximum absolute error (across different $k$ values) of (3.105) and (3.109)-(3.110) for different $n$ values. The perfect values have been calculated in software using double-precision arithmetic. The curves correspond to the different states of calibration of the FPPNS.

In any case, the detrimental effect of the size on accuracy is clearly demonstrated. This is particularly true for the radial case, in which the node “furthest away” from node 0 is the most affected. For an arbitrary topology, containing radial and meshed subparts, a quantitative conclusion cannot be drawn a priori, but this general trend is expected to hold. The more “radial” a topology is, the more severe the inaccuracy possibly induced in the voltage results of the analog solver. This will be illustrated with a real-world example in a following section.

Another conclusion that can be drawn from this subsection is the greatly beneficial effect calibration has on the quality of the results of the solver. As mentioned in section 3.3.4, calibration helps reduce the error of each individual component.

In the case of automatic calibration (with circular markers), trends similar to the non-calibrated case are observed. This can be explained by the fact that, due to its inherently limited accuracy, the measurement circuit (ADC) is unable to perfectly correct the DAC output characteristic (step 2 of Fig. 3.28). The effect of this residual error accumulates for larger topologies.

In the case of ideal calibration, manual measurements of voltage quantities are made offline and are passed asynchronously to the digital core. Correction tables for the DACs that are involved in the topology are created and stored in external memory. In the ideal calibration inaccuracies are limited to the quantization limits and actual quantities are distributed randomly (±1 bit) around the ideal value. Therefore, a node injecting a too high current value might be partly counterbalanced by a neighbor node injecting a smaller value. The effect of this is that an almost size-indifferent behavior of the analog solver is demonstrated.

3.4.3 Transient simulation

The FPPNS has been used for the transient simulation of power systems as per Fig. 3.5. Most tests have been performed on the systems rc3, lanz5, fabre18, fabre36, fabre59. Details and references on the test systems are given in appendix A.

Typical perturbation scenarios will be used for the 18- and 59-bus systems. A short-circuit in the topology can create serious power imbalances, i.e. non-zero \( P_m - P_e \) terms in (3.30), which in turn initiate a dynamic response of the angles. This response has been calculated by the FPPNS, as well as by a reference software running on a conventional PC [296]. In the current FPPNS-elab-tsaot implementation, all machine dynamic variables, as well as any voltage and current injection in the system, can be requested as an output result. The results are stored in the TD results bank of the software and can be visualized in the Analysis editor.

The fully digital implementation (notation identifier “sim”) of [296] will be used as a benchmark reference. Difference of the angle trajectories will be used as a metric to measure the quality of the results of the dedicated hardware. Results with no calibration (notation identifier “nc”) and with automatic calibration (notation identifier “ac”) will be presented. For this application, ideal calibration is not available, since the current digital part of the solver lacks the memory necessary to store calibration corrections for the entire voltage output range.

Fig. 3.33 shows the rotor angle oscillations for one generator (gen. #3) in the 18-bus topology. The scenario is a short-circuit at the middle of the branch connecting buses #3 and #4, active for 70 ms, with a post-fault tripping of the line.

The deviation of the solver from the reference, \( |\delta^{\text{sim}}_3 - \delta^{\text{nc}}_3| \) and \( |\delta^{\text{sim}}_3 - \delta^{\text{ac}}_3| \), is less than two degrees for the entire simulation window. For most applications this margin is well tolerable. The high accuracy of the solver can be explained partly by the small size of the topology and partly by the fact that the topology is highly meshed, as discussed in section 3.4.2. The effect of automatic calibration is hardly noticeable. This is because, as mentioned in section 3.3.4, the on-board converters cannot measure, and therefore correct, beyond a certain level of accuracy; this level corresponds to the residual inaccuracy shown in the figure.

The same flow of operations is performed on the 59-bus topology. The scenario is a short-circuit at the middle of the branch connecting buses #9 and #10, active for 200 ms, without a post-fault tripping of the line, i.e. the fault disappears while the line stays connected. Figs. 3.34 and 3.35 show the rotor angle oscillations for generators #11 and #54. A very interesting phenomenon is observed.

While the angle of gen. #11 is highly accurate, \( |\delta_{11}^{\text{sim}} - \delta_{11}^{\text{nc}}|, \ |\delta_{11}^{\text{sim}} - \delta_{11}^{\text{ac}}| \leq 1^\circ \ \forall \ t \), the one of gen. #54 is not. This is one case in which calibration significantly increases the quality of the results, as demonstrated by the following.

\[
|\delta_{54}^{\text{sim}}(t) - \delta_{54}^{\text{nc}}(t)| \leq 7.6^\circ \ \forall \ t
\]  \hspace{1cm} (3.111a)

\[
|\delta_{54}^{\text{sim}}(t) - \delta_{54}^{\text{ac}}(t)| \leq 2.6^\circ \ \forall \ t
\]  \hspace{1cm} (3.111b)

The poor accuracy for generator #54 can be partly explained by examining the topology and by recalling what has been discussed in section 3.4.2. Fig. 3.36 contains a simplified view of the 59-bus topology. It shows that this system can be split into two sub-networks with a single radial connection. According to section 3.4.2, the voltage results of sub-network 2 are expected to be less accurate than the ones of sub-network 1, as table 3.3 attests. The table shows the maximum absolute angle deviation for all synchronous machines of this topology. The impact of the calibration as well as the dependence of the accuracy on the concerned sub-network can be clearly seen.

Figure 3.34 – Rotor angle oscillations of generator #11 of the 59-bus topology using different simulators

Figure 3.35 – Rotor angle oscillations of generator #54 of the 59-bus topology using different simulators
Figure 3.36 – Reduced schematic of the 59-bus test case

Table 3.3 – 59-bus test case maximum absolute rotor angle deviation for TSA

<table>
<thead>
<tr>
<th>Generator #</th>
<th>Sub-network</th>
<th>Angle nc [°]</th>
<th>Angle ac [°]</th>
</tr>
</thead>
<tbody>
<tr>
<td>7</td>
<td>2</td>
<td>5.5</td>
<td>2.1</td>
</tr>
<tr>
<td>8</td>
<td>2</td>
<td>5.8</td>
<td>1.9</td>
</tr>
<tr>
<td>9</td>
<td>1</td>
<td>0.5</td>
<td>1.1</td>
</tr>
<tr>
<td>11</td>
<td>1</td>
<td>0.6</td>
<td>0.4</td>
</tr>
<tr>
<td>12</td>
<td>1</td>
<td>0.7</td>
<td>0.4</td>
</tr>
<tr>
<td>19</td>
<td>2</td>
<td>4.9</td>
<td>1.6</td>
</tr>
<tr>
<td>28</td>
<td>2</td>
<td>5.7</td>
<td>1.8</td>
</tr>
<tr>
<td>34</td>
<td>2</td>
<td>5.4</td>
<td>1.7</td>
</tr>
<tr>
<td>36</td>
<td>1</td>
<td>1.9</td>
<td>0.4</td>
</tr>
<tr>
<td>54</td>
<td>2</td>
<td>7.6</td>
<td>2.6</td>
</tr>
</tbody>
</table>

nc: refers to non-calibrated hardware
ac: refers to auto-calibrated hardware
Generally, the transient simulation results provided by the solver have tolerable levels of inaccuracy. A factor that smooths out part of the overall analog inaccuracy is locality. The power mismatch of (3.30) at a bus depends on voltage differences between that bus and its direct-neighbor bus(es). This local relative inaccuracy is expected to be lower than the absolute system-wide one. Therefore the results are closer to the correct value than the absolute voltage errors alone would suggest.

**Power consumption**

The average power consumption of one PCB is approximately 5.85 W (for steady voltage and current values of 6.5 V and 0.9 A), yielding a total of approximately 23.4 W for the entire platform. This figure compares favorably with the consumption of other dedicated accelerator hardware such as GPUs, e.g. 49 W (max) for a low-range desktop ATI Radeon X800 GTO, and 250 W for a high-end NVIDIA GeForce GTX TITAN X. An exact performance-per-watt figure (e.g. FLOPS/W) is difficult to obtain, since it is difficult to quantify the operation of the analog part in floating point operations.

**Timing breakup**

A timing breakup of the calculation is presented in Fig. 3.37. It is compared to a research-grade software (MatDyn [296]) and to an industrial-grade software (RAMSES [6]) in table 3.4.

Time profiling of the FPPNS starts when all initialization data have been written to the 2-port RAM of Fig. 3.18, and stops when the results have been written back to that memory. This is a fair measurement in the sense that this RAM is the interface of the FPPNS accelerator to the elab-tsaot software. Time spent on the hardware platform can be divided into three steps: pre-processing, processing and post-processing. The first is dedicated to the initialization of the digital and analog resources. The second is dedicated to the calculations of the partitioned scheme of Fig. 3.5 on the pipelines. The timing break-up of each iteration is according to Fig. 3.19. The processing time also accounts for the time to reconfigure the grid according to the predefined fault (“Fault on” and “Fault off”). The post-processing stage extracts the results and writes them to the shared memory. As can be seen, a significant percentage of the overall time is spent on auxiliary tasks (pre- and post-processing), and not on the calculations themselves.

The program flow of MatDyn is similar to the one used implicitly in the FPPNS. A partitioned approach is used to solve the set of DAEs for the TD simulation. Classical generator models have been used for all generators without exciters or governors. A predictor-corrector Euler method has been used for numerical integration since it has similar complexity and properties to the FE/AB2 methods used in the FPPNS.
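For readers unfamiliar with the predictor-corrector Euler scheme, the sketch below applies one common reading of it (Heun's method: a forward Euler predictor followed by a trapezoidal corrector) to a classical generator model. It is not the MatDyn or FPPNS code. In particular, the electrical power is replaced by the single-machine infinite-bus expression \( P_{max} \sin \delta \) instead of a network solution, and all parameter values are hypothetical.

```python
import math

def swing_rhs(delta, omega, p_m, p_max, m):
    """Classical model: d(delta)/dt = omega, M * d(omega)/dt = Pm - Pe."""
    p_e = p_max * math.sin(delta)        # stand-in for the network solution
    return omega, (p_m - p_e) / m

def heun_step(delta, omega, h, p_m, p_max, m):
    """One predictor-corrector Euler step of size h."""
    d1, w1 = swing_rhs(delta, omega, p_m, p_max, m)
    delta_pred = delta + h * d1          # predictor: forward Euler
    omega_pred = omega + h * w1
    d2, w2 = swing_rhs(delta_pred, omega_pred, p_m, p_max, m)
    # Corrector: average of the slopes at both ends of the step.
    return delta + 0.5 * h * (d1 + d2), omega + 0.5 * h * (w1 + w2)

def simulate(delta0, omega0, h, steps, **params):
    """Integrate the swing equation and return the (delta, omega) trajectory."""
    trajectory = [(delta0, omega0)]
    delta, omega = delta0, omega0
    for _ in range(steps):
        delta, omega = heun_step(delta, omega, h, **params)
        trajectory.append((delta, omega))
    return trajectory

# Hypothetical per-unit parameters, for illustration only.
params = dict(p_m=0.8, p_max=2.0, m=0.1)
delta_eq = math.asin(params["p_m"] / params["p_max"])
trajectory = simulate(delta_eq + 0.1, 0.0, h=1e-3, steps=2000, **params)
print(max(abs(d - delta_eq) for d, _ in trajectory))
```

Each step costs two evaluations of the right-hand side, which in a partitioned scheme means two network solutions per time step.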

RAMSES uses a simultaneous scheme to solve the simulation equations. Afterwards, an implicit integration scheme is applied, and the resulting non-linear system of equations is solved using a dishonest Newton method. This flow is significantly different from the one used in the FPPNS and MatDyn; however, results of the comparison are included for the sake of completeness.

A time that does not appear in Fig. 3.37 is the time used to communicate between the dedicated hardware and a host PC. Table 3.5 shows the timing breakup between the different platforms, for the time domain simulation that has been presented in this section. Operations on the PC involve all preconditioning actions that take place in the elab-\textit{tsaot} software. These are mainly auxiliary tasks, as shown in Fig. 3.23. USB timings in the table comprise the entire communication cycle. Normally for a standard FPPNS operation there are two stages of writing to the shared RAM.

- Write the bitstream to the FPPNS slices. This bitstream contains information on the topology to be mapped on the analog hardware and on the scenarios to be executed. In the current implementation, each slice has to be written separately.

- Issue the start commands that correspond to the analysis that is to be executed. These commands also contain generic information on the simulation, e.g. time step, duration, etc.

The FPPNS part of table 3.5 comprises the computation of Fig. 3.37. After the FPPNS computation is finished, the results are available in the internal memory of the FPGA. They can be retrieved by the elab-\textit{tsaot} by issuing specific request-commands through the USB interface.

As seen, in the current implementation the USB communication is a bottleneck that can have a very detrimental effect on the overall throughput of the platform. This motivates using the platform for serial “batch” execution of analyses/scenarios, as will be shown in the following section [283].

Table 3.4 – Speed comparison between different engines for the TD simulation of the 18-bus system

<table>
<thead>
<tr>
<th>Tool</th>
<th>Computation time (ms)</th>
</tr>
</thead>
<tbody>
<tr>
<td>MatDyn (academic sw)</td>
<td>~ 360</td>
</tr>
<tr>
<td>RAMSES (industrial sw)</td>
<td>~ 24</td>
</tr>
<tr>
<td>FPPNS</td>
<td>5.6</td>
</tr>
</tbody>
</table>

Figure 3.37 – Timing breakup of computation time for transient simulation of the 18-bus system

Table 3.5 – Timing break-up between PC, USB communication and the FPPNS for a transient stability operation

<table>
<thead>
<tr>
<th>Time</th>
</tr>
</thead>
<tbody>
<tr>
<td>PC (elab-tsaot)</td>
</tr>
<tr>
<td>USB comm</td>
</tr>
<tr>
<td>FPPNS</td>
</tr>
</tbody>
</table>

Table 3.6 – Timing results summary for the n-1 branch contingency analysis

<table>
<thead>
<tr>
<th>System size</th>
<th>Branches examined</th>
<th>Time sim [s]</th>
<th>Time FPPNS [s]</th>
<th>Speedup</th>
</tr>
</thead>
<tbody>
<tr>
<td>5</td>
<td>5</td>
<td>0.9</td>
<td>0.23</td>
<td>3.9x</td>
</tr>
<tr>
<td>18</td>
<td>20</td>
<td>10.2</td>
<td>0.5</td>
<td>20.4x</td>
</tr>
<tr>
<td>36</td>
<td>39</td>
<td>106.7</td>
<td>0.92</td>
<td>116x</td>
</tr>
<tr>
<td>59</td>
<td>55</td>
<td>434.9</td>
<td>1.13</td>
<td>384.9x</td>
</tr>
</tbody>
</table>

3.4.4 Dynamic stability analysis

Dynamic Stability Assessment (DSA) is an analysis concerned with the quantitative/qualitative characterization of the ability of the system to retain a state of operating equilibrium after being subjected to severe disturbances. The transient events examined during DSA studies are handled using TD simulations.

Branch n-1 contingency analysis

A common procedure in DSA studies is n-1 contingency analysis. Here, the operator is interested in knowing whether the system (all generators) retains its stability after an outage/perturbation is applied to each one of its elements. The outcome of this study is a set of simple boolean answers (stable/unstable), one for each of the contingencies in the test set. The answer from the hardware platform can therefore be communicated with a single boolean value per scenario, which greatly reduces the communication requirements.

A variation of this procedure was applied to different test systems. A set of branches was selected to be examined and perfect 3φ faults were applied in the middle of each branch for 200 ms. Table 3.6 summarizes the timing results for the software-only (sim) and the mixed-platform (FPPNS) solution. The numbers given for the timings include the time for the USB communication with the FPPNS. The software implementation of the n-1 DSA module is based on the TD engine detailed in section 3.2.1.
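
The screening loop described above can be sketched as follows. This is a minimal illustration and not the FPPNS implementation: `run_contingency_screen`, `pack_flags` and the `is_stable` callback are hypothetical names, and the toy predicate only stands in for a full TD simulation. The sketch shows why the communication load is small: one bit per contingency is enough.

```python
from typing import Callable, Dict, Iterable


def run_contingency_screen(
    branches: Iterable[str],
    is_stable: Callable[[str, float], bool],
    fault_duration_s: float = 0.2,
) -> Dict[str, bool]:
    """Apply one fault per branch and record a stable/unstable flag.

    `is_stable` is a hypothetical stand-in for a transient simulation that
    returns True when all generators keep their stability.
    """
    return {br: is_stable(br, fault_duration_s) for br in branches}


def pack_flags(results: Dict[str, bool]) -> int:
    """Pack the boolean answers into one integer, one bit per contingency."""
    word = 0
    for bit, stable in enumerate(results.values()):
        if stable:
            word |= 1 << bit
    return word


# Toy stand-in (assumption): each branch has a known critical clearing time.
toy_cct_s = {"br1": 0.35, "br2": 0.15, "br3": 0.50}
results = run_contingency_screen(toy_cct_s, lambda br, t: t < toy_cct_s[br])
flags = pack_flags(results)
```

With the toy data, only the fault on `br2` lasts longer than its assumed critical clearing time, so it is the only contingency flagged as unstable.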

Contingency analysis results can be visualized in the Analysis editor as shown in the screenshot of Fig. 3.38. The results of an n-1 analysis for the 18-bus system are shown with branch “D0” selected (in dashed blue). Generators that lose their stability in case of a fault
Critical Clearing Time analysis

The Critical Clearing Time (CCT) analysis determines the maximum duration of a fault on a given branch for which the system (even marginally) maintains its stability. Naturally, CCT depends on TD simulations. The most common approach is to perform transient simulations in a binary search fashion over the fault duration. The CCT value is bracketed by an upper and a lower bound, the difference of which must fall below a requested precision.

A CCT algorithm similar to the one described here was implemented on the FPGA. Perfect fugitive 3φ faults in the middle of the branches were considered. A binary search was performed for each fault location, in the search window $t_{src} \in [0.0\,\text{s}, 1.5\,\text{s}]$, and a precision of $t_{prec} = 10\,\text{ms}$ was requested. Timing results for the software-only (sim) and the mixed-platform (FPPNS) architectures are summarized in Table 3.7. The software-only implementation of the CCT DSA module is based on the TD engine detailed in section 3.2.1.
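
The binary search over the fault duration can be sketched as below. This is an illustrative sketch rather than the FPGA implementation: `critical_clearing_time` and the `is_stable` oracle are hypothetical names, the oracle stands in for one transient simulation, and stability is assumed to be monotonic in the fault duration (stable at the lower bound, unstable at the upper bound).

```python
from typing import Callable, Tuple


def critical_clearing_time(
    is_stable: Callable[[float], bool],
    t_lo: float = 0.0,
    t_hi: float = 1.5,
    t_prec: float = 0.010,
) -> Tuple[float, float, int]:
    """Bracket the CCT by bisection on the fault duration (seconds).

    Returns (lower bound, upper bound, number of simulations run).
    """
    n_sim = 0
    while t_hi - t_lo > t_prec:
        t_mid = 0.5 * (t_lo + t_hi)
        n_sim += 1
        if is_stable(t_mid):
            t_lo = t_mid  # still stable: the CCT is at least t_mid
        else:
            t_hi = t_mid  # unstable: the CCT is below t_mid
    return t_lo, t_hi, n_sim


# Toy oracle with an assumed "true" CCT of 0.237 s (hypothetical value).
lo, hi, n_sim = critical_clearing_time(lambda t: t <= 0.237)
```

For the search window and precision quoted above, this plain bisection halves the 1.5 s window eight times before the bracket is narrower than 10 ms, i.e. eight oracle calls per fault location in the sketch.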

3.4.5 Effect of the integration algorithm on the results

The integration algorithms that are synthesized in the fixed-point datapaths of the pipelines in the FPGA reside in a low-accuracy environment, as discussed in section 3.3. This exacerbates their inherent numerical inaccuracy. The purpose of this section is to investigate the effect of different integration algorithms on the quality and the usability of the results. Related studies have been performed on the fabre18 test system.

Assessment of absolute precision for different time-steps

A metric to quantify the relative precision of two simulators is the maximum absolute difference in the trajectories of the internal angle of the same generator, $L_i^\infty = \max_t |\Delta \delta_i(t)|$.
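
As a small illustration of this metric, the sketch below computes $L_i^\infty$ for one generator from two sampled angle trajectories. The function name `l_inf` and the sample values are hypothetical, and both trajectories are assumed to be sampled at the same time instants.

```python
from typing import Sequence


def l_inf(delta_ref: Sequence[float], delta_test: Sequence[float]) -> float:
    """Maximum absolute difference between two angle trajectories.

    Both trajectories are assumed to share the same time grid.
    """
    if len(delta_ref) != len(delta_test):
        raise ValueError("trajectories must share the same time grid")
    return max(abs(a - b) for a, b in zip(delta_ref, delta_test))


# Made-up sample trajectories (radians), for illustration only.
ref = [0.00, 0.10, 0.25, 0.10]
test = [0.00, 0.12, 0.21, 0.11]
err = l_inf(ref, test)
```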

Fig. 3.39 presents $L_i^\infty$ for all generators in the test system. Simulations are run on the emulator using a time step $h = 60\,\mu\text{s}$ for both FE and AB2. This value is orders of magnitude smaller than common practice in TRANSCOs [297]. Before the FPPNS run, the calibration procedure of section 3.3.4 was carried out to minimize the relative inaccuracy. Results from a PC software 4th-order Runge-Kutta (RK4) implementation were used as an “angle solution” reference. Discrepancies between the best possible analog solution and the software reference can be attributed exclusively to analog imprecision.

As the time step of the hardware emulator increases, its accuracy decreases. This deterioration is due to digital imprecision, which grows with the time step. Fig. 3.40 shows this added inaccuracy relative to the minimum time step case, which is taken as the reference. The $L_i^\infty$ values of the generators are displayed in groups of time steps. The decrease in the quality, and therefore the usability, of the results is less pronounced for AB2 than for FE. For example, a time step of 15.6 ms yields acceptable results for AB2, while FE collapses. For very large time steps (e.g. 62.5 ms), both algorithms collapse and a quantitative assessment of $L_i^\infty$ is irrelevant.
Chapter 3. Realized dedicated mixed signal solver

Figure 3.40 – FPPNS error due to digital imprecision for different time steps for the 18-bus test system

Figure 3.41 – FE instability while AB2 succeeds in retaining the stability of the numerical solution

A selected case that illustrates the contrasting behavior of FE and AB2 is shown in Fig. 3.41, which gives a detailed view of the trajectory of the internal machine angle of a generator after a perturbation. The differences in the amplitude and the exact time instant of the first peak of the trajectory quantify the global accumulated truncation error of the algorithms. It is clear that the AB2 trajectory is much closer to the “real” one coming from RK4 in software. As the phenomenon evolves, local truncation errors accumulate further in the FE case and finally render the trajectory unstable after the second peak. It is crucial to note that this instability is a numerical artifact of the integration algorithm (FE) and is not due to a real-life phenomenon. This clearly demonstrates that such results given by FE are unusable and potentially dangerous for real-world operation analyses. This issue is further analyzed in the following subsection.
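
The qualitative difference between the two algorithms can be reproduced on a toy problem. The sketch below integrates an undamped linear oscillator, used here only as a crude proxy for swing dynamics, with forward Euler and with second-order Adams-Bashforth in ordinary floating point. It does not model the fixed-point datapath or the analog grid of the FPPNS, and the function name `simulate` and all parameter values are assumptions. The exact solution keeps an amplitude of 1, so any growth is a numerical artifact.

```python
import math


def simulate(method: str, h: float, n_steps: int, omega: float = 1.0) -> float:
    """Integrate x' = v, v' = -omega^2 x from (1, 0); return final amplitude."""

    def f(x: float, v: float):
        return v, -omega * omega * x

    x, v = 1.0, 0.0
    fx_prev, fv_prev = f(x, v)
    # Both methods start with one forward Euler step (AB2 needs two points).
    x, v = x + h * fx_prev, v + h * fv_prev
    for _ in range(n_steps - 1):
        fx, fv = f(x, v)
        if method == "FE":
            x, v = x + h * fx, v + h * fv
        elif method == "AB2":
            x, v = (x + h * (1.5 * fx - 0.5 * fx_prev),
                    v + h * (1.5 * fv - 0.5 * fv_prev))
        else:
            raise ValueError(method)
        fx_prev, fv_prev = fx, fv
    return math.hypot(x, v / omega)


amp_fe = simulate("FE", h=0.1, n_steps=1000)
amp_ab2 = simulate("AB2", h=0.1, n_steps=1000)
```

In this toy setting the FE amplitude grows by more than two orders of magnitude over the run, while the AB2 amplitude stays within a few percent of the exact value. Neither explicit method is strictly stable for an undamped oscillation, but the artificial growth of AB2 is far slower for the same step.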

Table 3.8 – n-1 branch contingency results for different integration algorithms and time steps

<table>
<thead>
<tr>
<th>contingency</th>
<th>sim</th>
<th>$h = 7.8\,\text{ms}$</th>
<th>$h = 15.6\,\text{ms}$</th>
<th>$h = 31.3\,\text{ms}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>250ms @br#1</td>
<td>S</td>
<td>S</td>
<td>U</td>
<td>S</td>
</tr>
<tr>
<td>250ms @br#6</td>
<td>S</td>
<td>U</td>
<td>S</td>
<td>S</td>
</tr>
<tr>
<td>250ms @br#10</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
</tr>
<tr>
<td>250ms @br#13</td>
<td>S</td>
<td>S</td>
<td>U</td>
<td>S</td>
</tr>
<tr>
<td>250ms @br#17</td>
<td>S</td>
<td>S</td>
<td>U</td>
<td>S</td>
</tr>
<tr>
<td>250ms @br#20</td>
<td>S</td>
<td>S</td>
<td>S</td>
<td>S</td>
</tr>
<tr>
<td>250ms @br#31</td>
<td>S</td>
<td>S</td>
<td>U</td>
<td>S</td>
</tr>
<tr>
<td>410ms @br#32</td>
<td>U</td>
<td>U</td>
<td>U</td>
<td>U</td>
</tr>
<tr>
<td>250ms @br#33</td>
<td>S</td>
<td>S</td>
<td>U</td>
<td>S</td>
</tr>
</tbody>
</table>

Effect on n-1 branch contingency analysis

Table 3.8 summarizes the results of an n-1 contingency analysis for the test system, for a selected set of branch contingencies. The “sim” column shows reference results from a purely software RK4 simulator. It is clearly seen that as the time steps become larger (\( a \to b \to c \)) FE implementations fail to agree with the reference, while AB2 implementations succeed in doing so. Boolean results in bold typeface show this mismatch between FE results and the reference.

Effect on Critical Clearing Time analysis

Erroneous CCT results can be critical for the safety of the system, since many coordinated protection schemes rely on them. Fig. 3.42 presents CCT results for branches #4, #10 and #30. It is clear from the graphs that AB2 is able to provide reasonable results (\( \pm 2.2\% \) of the minimum step reference) with time steps as large as \( \sim 16 \, ms \). In contrast, for FE results to be in the same range, time steps of \( \leq 2 \, ms \) have to be used. Since CCT analysis is a repetitive procedure due to the binary search scheme (\( \sim 90 \) transient simulations for each point of the figures for 10 ms CCT precision), larger time steps result in considerable computational savings.

This phenomenon becomes even more pronounced when the accuracy of the environment surrounding the numerical integration blocks is further reduced. A related set of tests was performed for the power system, this time without the calibration of the analog part. This makes the approximation of the solution of the linear system even worse, hence reducing the overall accuracy of the solver. Fig. 3.43 presents the relevant results for branch #30, for the two algorithms. It is clear that AB2 is far less affected than FE by the drop in accuracy of the linear solution. For the latter, we observe results that are off by a margin of 8% compared to the reference, for time steps as low as 4 ms.
Figure 3.42 – CCT for branches #4, #10, #30 of the 18-bus system with varying timesteps using FE & AB2

Figure 3.43 – CCT for branch #30 of the 18-bus system with varying timesteps using FE & AB2, in a calibrated and a non-calibrated FPPNS environment

Figure 3.44 – Internal angle of generator #4 after a transient event in the 18-bus case, for different ADC waiting times.

3.4.6 Effect of time step versus waiting time

The expression for the error on the final voltage due to the parasitic RC capacitance (3.82) can be rewritten as follows.

\[ \epsilon_{V2a} = \epsilon_{V2a}(\Delta \tilde{V}_{el}, t_{rd}) = \Delta \tilde{V}_{el} \cdot e^{-\frac{t_{rd}}{\tau}} \]  

(3.112)

where \( \Delta \tilde{V}_{el} := \tilde{V}_{el}^* - \tilde{V}_{el}^0 \) is the ideal voltage change induced by the updated current/injections of the new step. It is seen that the magnitude of \( \epsilon_{V2a} \) is directly proportional to the magnitude of this voltage change. The latter is expected to be larger for larger time steps, regardless of the numerical integration algorithm that is used. This is easily understood, since a larger time step allows the dynamics of the system to manifest for longer, hence the state of the system moves further away than with a smaller time step. This growth of \( \epsilon_{V2a} \) with \( h \) might require a longer waiting time \( t_{rd} \) to compensate, i.e., for larger time steps the analog grid might require a longer time to settle.
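
Inverting (3.112) for a tolerated error \( \epsilon_{max} \) gives the waiting time \( t_{rd} \geq \tau \ln(|\Delta \tilde{V}_{el}| / \epsilon_{max}) \). The sketch below evaluates this relation. The function names and all numerical values (time constant, tolerance, voltage changes) are hypothetical and chosen only for illustration; they are not measurements from the FPPNS.

```python
import math


def settling_error(delta_v: float, t_rd: float, tau: float) -> float:
    """Residual voltage error after waiting t_rd, following (3.112)."""
    return delta_v * math.exp(-t_rd / tau)


def required_wait(delta_v: float, eps_max: float, tau: float) -> float:
    """Smallest t_rd for which |error| <= eps_max (zero if already met)."""
    if abs(delta_v) <= eps_max:
        return 0.0
    return tau * math.log(abs(delta_v) / eps_max)


tau = 0.2e-6    # hypothetical RC time constant [s]
eps_max = 1e-3  # hypothetical tolerated error [V]
t_small = required_wait(0.01, eps_max, tau)  # small voltage change
t_large = required_wait(0.10, eps_max, tau)  # ten times larger change
```

Under these assumptions a tenfold larger voltage change doubles the required waiting time: the error scales linearly with the voltage change, but the waiting time needed to suppress it grows only logarithmically.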

Relevant tests have been conducted for the 18-bus system. Fig. 3.44 shows the internal angle of generator #4 for a 3φ fault scenario. The blue line is the true angle, obtained with a software simulator, while grey lines correspond to the solutions given by the FPPNS for different waiting times \( t_{rd} \).

It is clearly seen that the longer the waiting time, the closer the FPPNS result is to the true value. This is particularly evident in parts of the graph where there are peaks (and valleys). Peaks are smaller for shorter waiting times. During peaks the $\Delta \tilde{V}_{el}$ variations are large, and through (3.112) larger errors are introduced into the voltage solutions. The shorter the waiting time, the harder it is for the FPPNS solution to follow the true one. Empirical studies show that $t_{rd} = 1\,\mu\text{s}$ is a good trade-off between speed and accuracy in most of the cases that have been tried.

3.5 Conclusions

To the knowledge of the author, the FPPNS presented in this chapter is the most advanced mixed-signal power system computer. Table 3.9 provides an overview of the existing prototype based on the criteria established in section 3.1.1. All figures given in the table concern a research-grade prototype that has been realized using off-the-shelf discrete components as a proof of concept. Significant improvements can be expected from more optimized implementations, such as one involving customized integrated circuits, as suggested in [4].

3.5.1 Comparison with related work

As far as only the linear algebra capabilities of the FPPNS are concerned, related work has been presented in section 2.4.1. Most of the efforts presented in that section, however, are conceptual and provide no implementation of the proposed system, unlike the work presented in this chapter. Additionally, many of the studies deal with the solution of linear systems that arise in the finite difference solution of partial differential equations, which result in matrices of fixed specific structures. Our work is of course suitable for the solution of such linear systems, but it goes well beyond that by offering complete reconfigurability of the resistor network, so that an arbitrary topology (i.e. matrix structure) can be mapped onto it. In our approach the analog part performs the linear algebra operation in one shot, without any iterative scheme. Hence the inaccuracies of the electronics have a direct impact on the accuracy of the operation. An alternative has been proposed in [241, 240, 242, 245], which uses iterative refinement techniques in an effort to decrease the negative impact of analog imprecision on the final solution. The drawback of this approach is that iterative solutions (using the residuals) of the same linear system are required, hence the performance is decreased. Recent FPAA-based work is at a very experimental stage, as it concerns only dense algebra operations of system sizes up to 2 [248]. In our current implementation matrices of size up to $2 \times 96$ can be handled.

Works related to analog computing for power system studies have been presented in section 2.4.2. The work of Fried [257] gave birth to research primarily centered around two universities, EPFL and Drexel University [271, 268, 273, 269, 270]. The work of the two research groups can be compared along the points of section 2.4.3, wherever relevant data is available.

Table 3.9 – Characterization of the existing platform

<table>
<thead>
<tr>
<th>Criterion</th>
<th>Remarks</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Cost</strong></td>
<td>The cost of all the discrete components that make up each one of the prototype PCBs is on the order of 5k CHF, yielding a total cost of 20k CHF ($4 \times 5$k).</td>
</tr>
<tr>
<td><strong>Size</strong></td>
<td>The entire computer is self-contained in a $32 \times 26 \times 10$ cm case. Power supply and communication connections are provided.</td>
</tr>
<tr>
<td><strong>Interfacing</strong></td>
<td>A USB connection is provided to the platform. A shared memory is available both to the mixed-signal computer central processing unit (a synthesized NIOS II CPU) and to the CPU of the conventional host computer (through dedicated drivers).</td>
</tr>
<tr>
<td><strong>Reconfigurability</strong></td>
<td>The topology of the resistor network is a fixed 5-degree lattice, the edges of which can be enabled/disabled through the use of digitally controlled switches. The values of all resistors, current sources and voltage sources are reconfigurable (even at runtime) according to their respective bit resolution.</td>
</tr>
<tr>
<td><strong>Scalability</strong></td>
<td>A total capacity of 96 power system nodes ($4 \times 24$) is provided. Systems up to 57 buses have been successfully analyzed. There is evidence that the favorable linear algebra computing properties scale well with system size, but a detailed investigation is pending.</td>
</tr>
<tr>
<td><strong>Performance</strong></td>
<td>Against research-grade solvers, speedups of up to three orders of magnitude have been achieved in dynamic stability assessment [283]. Communication time between the platform and the conventional host PC is a main bottleneck, and hence performance benefits are maximized when analyses are performed in series.</td>
</tr>
<tr>
<td><strong>Accuracy</strong></td>
<td>The main sources of error are the inaccurate analog electronics and the discrete values they can assume (quantization). An automated calibration scheme has been developed in order to ameliorate the phenomenon [284]. In total, an average relative error of &lt;6% has been estimated and observed for the linear system solution.</td>
</tr>
<tr>
<td><strong>Power efficiency</strong></td>
<td>The standard average consumption of one PCB is approximately 5.85 W (for steady voltage and current values of 6.5 V and 0.9 A), yielding a total of approximately 23.4 W for the entire platform.</td>
</tr>
<tr>
<td><strong>Functional completeness</strong></td>
<td>On the analog part, only linear algebra operations are supported, by injecting currents and measuring voltages (see also section 3.1.1). The fixed predefined topology of the available (positive resistance) potentiometer bank creates limitations on the nature and value range of admittance matrices that can be mapped on the FPPNS.</td>
</tr>
</tbody>
</table>

**Scalability and Accuracy.** The largest network that has been presented was one of 5 nodes and 2 machines in [273]. There are claims of potential to scale up to thousands of nodes, but the precision of the results is already questionable for the 5-bus case (a fact that the authors of [273] acknowledge in section VII). The current capacity of the FPPNS is 96 buses, and networks of up to 59 buses have been realized and tested while retaining acceptable precision for the results.

**Reconfigurability.** In [273] the topology of the power network is realized by manually wiring analog boards of fixed sub-topology. It is difficult to imagine how a system of more than a few tens of buses could possibly be mapped to an array of FPAA boards in this way. Additionally, the FPAAs that are used are configured once and retain their configuration for the rest of the study. In contrast, our system has complete runtime reconfigurability. The topology of the power system is automatically mapped to a suitable resistor network, and any parameter (e.g. line or injection) or system value (e.g. simulation step size) can be changed on the fly while calculations are performed, without manual intervention. The commands to perform these changes are issued by the NIOS CPU, which can be programmed at will (e.g. in the C programming language).

**Interfacing.** In [273] it is mentioned that DAQ hardware by National Instruments and LabVIEW software have been used as the interface of the platform to the user, via a conventional PC. The interface presented in [271] is mostly dedicated to the configuration of the fixed topology, and all preprocessing, post-processing and interpretation of the results is done by the human user.

On the other hand, the software presented in section 3.2 provides a complete integration of the proposed platform into a power system analysis application. All operations executed by the dedicated hardware are transparent to the user, who has the look and feel of a purely software solution. On the developer's side, an API has been created that allows the integration of the platform as a linear algebra accelerator into any program by a simple function call.

**Functional completeness.** Drexel prototypes retain all four subnetworks from the real decomposition of (3.10) [298, 271, 268]. This allows them to retain both the conductive and the susceptive parts of the admittance matrix of the power system, unlike this work, which neglects the conductive part.

The way bus injectors are treated is a fundamental difference between this work and the Drexel approach. The latter uses analog models for the components connected to the grid. In the opinion of the author, this adds unnecessary complexity to the design, as it is very difficult and time-consuming to reproduce accurate analog models of real system components that may vary arbitrarily in complexity. In our design all computations related to components connected to buses are done in the digital domain, and hence practically unlimited flexibility is offered.

3.5.2 Limitations

Despite its many merits the current architecture has inherent shortcomings that limit its applicability.

- The conductance part of the admittance matrix has to be neglected.
- Negative electronic conductance values are not supported.
- Non-symmetric admittance matrix non-zero patterns are not supported.
- Despite the digitally controlled switches the pre-fixed topology of the resistor network poses limitations to the power systems (i.e. admittance matrices) that can be handled.
- The USB communication is a major performance bottleneck.

Chapter 4 presents a conceptual new design which retains some of the features of the FPPNS, but at the same time tries to mitigate many of its shortcomings.
4 Concept future solver

In this chapter, a hybrid analog-digital computer, called mixed-signal computer (MSC), is proposed as a linear algebra accelerator specially adapted to power system analysis requirements. The MSC retains some design principles of the FPPNS while overcoming its limitations. It works in conjunction with conventional power system simulation software running on a host PC. The program flow is diverted to the MSC when intensive linear algebra operations are required. The MSC returns the result and the flow of the conventional software continues.

The analog and the digital architecture of the proposed MSC are presented, alongside a detailed analysis of the design procedure and the mathematical properties of the platform. A software emulator of the proposed architecture is created in order to validate the concept and assess its functionality. With the help of the emulator, an exploration of the design space is conducted and a set of design guidelines is given. The computational prowess of a suggested implementation is demonstrated by its application to small- and mid-sized realistic topologies.

The proposed new design concerns a linear-algebra-enabled accelerator dedicated to power system computations. It has an analog part and a digital part and will hereafter be referred to as a Mixed-Signal Computer (MSC). The new system retains some design concepts of the existing FPPNS prototype, while at the same time it has provisions to overcome its limitations.

It is meant to be connected to the frame (e.g. motherboard) of a conventional digital computer, through a standard digital interface. The conventional computer runs the power system analysis application, and linear algebra computations are offloaded to the MSC. Fig. 4.1 gives a conceptual overview.

On the left of the figure there is the conventional platform that runs the power system analysis software. In this study a typical desktop PC has been used as the platform and the RAMSES phasor transient simulator of [6] as the power system analysis software.

On the right, the MSC is shown schematically. Parts shaded in light blue are in the digital domain, while parts shaded in red are in the analog domain. The connection between the conventional platform and the MSC is realized through a standard electronic interface, e.g. PCI Express, the controller of which is depicted in the IF box. The digital part does not have any computing capabilities. Instead, its processing unit (PU) handles all tasks that concern the configuration of the analog part as well as the communication/interfacing of the MSC with the conventional computer.

There is a digital control bus between the digital PU and the analog parts of the MSC, denoted by a bold black line in the figure. The analog part of the MSC is the computing core dedicated to linear algebra operations. It consists of four main parts: Injection and Measurement Nodes (IMN), Two-Port Networks (TPN), One-Port Networks (OPN), and the interconnection fabric between them - shown as a grid of analog (red) lines shaded in gray. The exact functioning of the analog grid and its components is explained in the next subsections.
Table 4.1 – Comparison of the roles of different platforms and domains in the FPPNS and the MSC

<table>
<thead>
<tr>
<th>Platform/Domain</th>
<th>FPPNS</th>
<th>MSC</th>
</tr>
</thead>
<tbody>
<tr>
<td>PC</td>
<td>Only handles user interface and auxiliary tasks</td>
<td>Controls the flow of the power system analysis program (RAMSES); also handles auxiliary tasks for the MSC</td>
</tr>
<tr>
<td>Digital hardware</td>
<td>Controls the flow of the power system analysis program</td>
<td>Only handles auxiliary and interfacing tasks</td>
</tr>
<tr>
<td>Analog hardware</td>
<td>Handles the linear algebra part of the computations</td>
<td>Handles the linear algebra part of the computations</td>
</tr>
</tbody>
</table>

Table 4.1 summarizes differences in the design philosophy between the FPPNS of chapter 3 and the MSC of this chapter.

4.1 RAMSES overview

The hardware has been designed with the specificities of the tasks that it needs to execute in mind, in order to increase the coherence between the hardware platform and the problem, as per section 2.2.2 of chapter 2. This section gives a short introduction to RAMSES, the software that will host the use of the MSC.

RAMSES is a relaxable-accuracy multi-threaded software simulator for electrical power systems. The general formulation of power system simulation problems has been described in (2.1) in chapter 2 (the equation is repeated below for the convenience of the reader).

\[
\begin{bmatrix}
\dot{x}_d \\
0
\end{bmatrix}
= \begin{bmatrix}
f_d(x_d, x_a) \\
f_a(x_d, x_a)
\end{bmatrix}
\] (2.1)

The dynamic \( x_d \) and algebraic \( x_a \) variables may correspond to the grid (interconnecting components such as branches, transformers, etc.) or to injectors connected to the grid (generators, loads, etc.). RAMSES uses the simultaneous approach to solve (2.1). The vector of unknown variables is ordered into subvectors that correspond to injector-related variables \( x = [x_1 \ x_2 \ ... \ x_n]^T \) followed by bus voltages \( V \); the elements of \( x_i \) are dynamic or algebraic variables and the elements of \( V \) are algebraic variables. Similarly, equations are grouped into sets of DAEs \( \phi_i \) that correspond to injector \( i \), and a large linear equation \( g \) that corresponds to the grid. An implicit integration algorithm, e.g. the second-order Backward Differentiation Formula (BDF), is used to discretize the \( \phi_i \)’s into \( f_i \)’s. The combination of the above results in a non-linear system that has to be solved at each time instant \( k \).

\[
\dot{x}_i = \phi_i(x_i, V) \;\xrightarrow{\text{BDF}}\; 0 = f_i(x_i, V), \quad i = 1\ldots n
\]

\[
0 = g(x_1, x_2, \ldots, x_n, V)
\]

(4.1)
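
As an illustration of the discretization step in (4.1), the textbook constant-step second-order BDF applied to a purely differential injector state reads as follows. This is only a sketch under stated assumptions: the exact scaling and sign convention of \( f_i \) used in RAMSES, and its treatment of algebraic components of \( x_i \), are not given here and may differ.

```latex
\dot{x}_i = \phi_i(x_i, V)
\;\;\xrightarrow{\text{BDF2}}\;\;
0 = f_i\big(x_i(t), V(t)\big)
  := x_i(t) - \tfrac{4}{3}\,x_i(t-h) + \tfrac{1}{3}\,x_i(t-2h)
     - \tfrac{2}{3}\,h\,\phi_i\big(x_i(t), V(t)\big)
```

The values at \( t-h \) and \( t-2h \) are known from previous steps, so \( f_i \) is an algebraic (generally non-linear) function of the unknowns \( x_i(t) \) and \( V(t) \) only.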

For the solution of the non-linear system of (4.1) a Newton scheme is used (\( m=1,2,\ldots \)). At each time step, the Newton iterates of \( x_i \) and \( V \) are initialized from the values of the previous time step

\[
x_i^{m=0} = x_i(t-h) \quad V^{m=0} = V(t-h)
\]

and then the following is performed.

\[
x_i^m = x_i^{m-1} + \Delta x_i^m, \quad i = 1\ldots n
\]

(4.2a)

\[
V^m = V^{m-1} + \Delta V^m
\]

(4.2b)

In (4.2), the Newton corrections \( \Delta x_i^m \) and \( \Delta V^m \) at iteration \( m \) come from the solution of the following linear system.

\[
\begin{bmatrix}
A_1 & & & & B_1 \\
 & A_2 & & & B_2 \\
 & & \ddots & & \vdots \\
 & & & A_n & B_n \\
-C_1 & -C_2 & \cdots & -C_n & D
\end{bmatrix}
\begin{bmatrix}
\Delta x_1^m \\
\Delta x_2^m \\
\vdots \\
\Delta x_n^m \\
\Delta V^m
\end{bmatrix} =
-\begin{bmatrix}
f_1(x_1^{m-1}, V^{m-1}) \\
f_2(x_2^{m-1}, V^{m-1}) \\
\vdots \\
f_n(x_n^{m-1}, V^{m-1}) \\
g(x^{m-1}, V^{m-1})
\end{bmatrix}
\]

(4.3)

In (4.3) \( A_i, B_i \) are the Jacobians of \( f_i \) w.r.t. \( x_i \) and \( V \) respectively. \( C_i \) are matrices to extract current components from \( x_i \) and \( D \) corresponds to a reordered real expansion of the admittance matrix \( Y \) of the grid.

\[
I = D \cdot V \Leftrightarrow
\begin{bmatrix}
i_1^R \\
i_1^I \\
i_2^R \\
i_2^I \\
\vdots \\
i_n^R \\
i_n^I
\end{bmatrix}
= \begin{bmatrix}
b_{11} & g_{11} & b_{12} & g_{12} & \ldots & b_{1N} & g_{1N} \\
g_{11} & -b_{11} & g_{12} & -b_{12} & \ldots & g_{1N} & -b_{1N} \\
b_{21} & g_{21} & b_{22} & g_{22} & \ldots & b_{2N} & g_{2N} \\
g_{21} & -b_{21} & g_{22} & -b_{22} & \ldots & g_{2N} & -b_{2N} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
b_{N1} & g_{N1} & b_{N2} & g_{N2} & \ldots & b_{NN} & g_{NN} \\
g_{N1} & -b_{N1} & g_{N2} & -b_{N2} & \ldots & g_{NN} & -b_{NN}
\end{bmatrix}
\begin{bmatrix}
V_1^R \\
V_1^I \\
V_2^R \\
V_2^I \\
\vdots \\
V_N^R \\
V_N^I
\end{bmatrix}
\]

(4.4)

The Bordered Block Diagonal (BBD) structure of the Jacobian of (4.3) is exploited to decompose the problem into \( n + 1 \) subsystems using the Schur complement domain decomposition method.

\[
\tilde{D} \Delta V^m = -g(x^{m-1}, V^{m-1}) - \sum_{i=1}^{n} C_i A_i^{-1} f_i(x_i^{m-1}, V^{m-1})
\]

(4.5a)

where \( \tilde{D} = D + \sum_{i=1}^{n} C_i A_i^{-1} B_i \)

(4.5b)

\[
A_i \cdot \Delta x_i^m = -f_i(x_i^{m-1}, V^{m-1}) - B_i \Delta V^m, \quad i = 1 \ldots n
\]

(4.5c)

The (reordered real expansion of the) admittance matrix is Schur-augmented in (4.5b). By construction, each term \( C_i A_i^{-1} B_i \) contributes a \( 2 \times 2 \) correction that is added along the diagonal of \( D \), in the position that corresponds to the bus to which the injector is connected. These corrections result from the discretization of the differential algebraic equations that govern the dynamic behavior of the injectors connected to power system buses.

\[
K_i = C_i A_i^{-1} B_i =
\begin{bmatrix}
0 & \cdots & & \cdots & 0 \\
\vdots & \ddots & & & \vdots \\
 & & \begin{matrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{matrix} & & \\
\vdots & & & \ddots & \vdots \\
0 & \cdots & & \cdots & 0
\end{bmatrix}
\]

(4.6)

In (4.5a), the unknown vector \( x \) and the r.h.s. vector \( b \) have units of voltage and current respectively, but are not exactly voltage and current quantities. Actually \( x \) is a voltage correction vector and \( b \) is a pseudo-current vector, the explanation of which can be found in [5]. Analogously to (4.4), voltage corrections and pseudo-currents can be written in terms of their real and imaginary components

\[
x = \begin{bmatrix} x_1^R & x_1^I & x_2^R & x_2^I & \cdots & x_N^R & x_N^I \end{bmatrix}^T
\]

\[
b = \begin{bmatrix} b_1^R & b_1^I & b_2^R & b_2^I & \cdots & b_N^R & b_N^I \end{bmatrix}^T.
\]

Finally, the linear system of (4.5a) is solved for \( x = \Delta V^m \) and (4.5c) is solved for \( \Delta x_i^m \), for every injector \( i \). Having \( \Delta V^m \) and \( \Delta x_i^m \), the next Newton iterate is calculated from (4.2). The above flow is summarized in Fig. 4.2. The demanding linear algebra operation of (4.5a) is highlighted in red.
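
The decomposition of (4.5a)-(4.5c) can be sketched on a tiny example. The code below is not RAMSES code: for brevity every injector has a single scalar state (so \( A_i \) is a scalar, \( B_i \) a row and \( C_i \) a column), all names and numbers are hypothetical, and a small dense Gaussian elimination stands in for the sparse solver or the analog grid. The residuals of the full bordered system are evaluated at the end to check that the Schur route solves it.

```python
def solve(M, rhs):
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            k = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= k * aug[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def schur_newton_step(a, B, C, D, f, g):
    """One Newton correction via the Schur complement, scalar injector states."""
    nv = len(g)
    Dt = [row[:] for row in D]   # becomes D~ of (4.5b)
    rhs = [-gk for gk in g]      # becomes the r.h.s. of (4.5a)
    for ai, Bi, Ci, fi in zip(a, B, C, f):
        for r in range(nv):
            rhs[r] -= Ci[r] * fi / ai
            for c in range(nv):
                Dt[r][c] += Ci[r] * Bi[c] / ai
    dV = solve(Dt, rhs)          # (4.5a)
    dx = [(-fi - sum(b * v for b, v in zip(Bi, dV))) / ai   # (4.5c)
          for ai, Bi, fi in zip(a, B, f)]
    return dx, dV


# Hypothetical data: two injectors, two voltage unknowns.
a = [2.0, 4.0]
B = [[1.0, 0.5], [0.0, 1.0]]
C = [[1.0, 0.0], [0.0, 1.0]]
D = [[4.0, -1.0], [-1.0, 3.0]]
f = [0.2, -0.1]
g = [0.05, 0.3]
dx, dV = schur_newton_step(a, B, C, D, f, g)

# Residuals of the full bordered system: injector rows, then grid rows.
inj_res = [a[i] * dx[i] + sum(B[i][c] * dV[c] for c in range(2)) + f[i]
           for i in range(2)]
grid_res = [sum(D[r][c] * dV[c] for c in range(2))
            - sum(C[i][r] * dx[i] for i in range(2)) + g[r]
            for r in range(2)]
```

Only the reduced system in `dV` couples all unknowns; the injector corrections are then recovered independently, which is what makes the voltage solve the natural candidate for offloading.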

Figure 4.2 – The flow of the RAMSES transient simulator with the linear algebra operation identified in red

For reasons that will become evident later on in this work, the signs of the rows corresponding to imaginary pseudo-current quantities are inverted. The final linear system is written hereunder. The solution of (4.7) is equivalent to that of (4.5a). \( \mathcal{I} \) is the pseudo-current vector, \( \Psi = \mathcal{Y} + \mathcal{X} \) is the augmented, reordered, sign-inverted (for odd rows only) admittance matrix, and \( \mathcal{V} \) is the pseudo-voltage solution. \( \mathcal{X} \) is the ensemble of the individual diagonal corrections \( \mathcal{X}_i \) that correspond to injectors $i$.

\[
\underbrace{\left(
\underbrace{\begin{bmatrix}
-b_{11} & -g_{11} & -b_{12} & -g_{12} & \cdots & -b_{1N} & -g_{1N} \\
 g_{11} & -b_{11} & g_{12} & -b_{12} & \cdots & g_{1N} & -b_{1N} \\
-b_{21} & -g_{21} & -b_{22} & -g_{22} & \cdots & -b_{2N} & -g_{2N} \\
 g_{21} & -b_{21} & g_{22} & -b_{22} & \cdots & g_{2N} & -b_{2N} \\
 \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
-b_{N1} & -g_{N1} & -b_{N2} & -g_{N2} & \cdots & -b_{NN} & -g_{NN} \\
 g_{N1} & -b_{N1} & g_{N2} & -b_{N2} & \cdots & g_{NN} & -b_{NN}
\end{bmatrix}}_{\mathcal{Y}}
+
\underbrace{\sum_{i=1}^{n}
\begin{bmatrix}
0 & \cdots & \cdots & 0 \\
\vdots & k_{11} & k_{12} & \vdots \\
\vdots & k_{21} & k_{22} & \vdots \\
0 & \cdots & \cdots & 0
\end{bmatrix}}_{\mathcal{X}}
\right)}_{\Psi}
\underbrace{\begin{bmatrix}
x_1^R \\
x_1^I \\
x_2^R \\
x_2^I \\
\vdots \\
x_N^R \\
x_N^I
\end{bmatrix}}_{\mathcal{V}}
= \mathcal{I}
\]

(4.7)

The main purpose for the design of the MSC is to tackle the computational needs of this linear system solving operation. Particularities of the original power system problem have been taken into account in the design of the MSC. The next sections detail how.

4.2 Design methodology

There are three steps in creating (4.7).

1. The admittance parameters of the power system branches and shunt elements are expressed using an ordering and a sign convention that is compatible with (4.7).

2. The $\mathcal{Y}$ matrix is created by connecting the branches between the buses, according to the given (graph) topology of the system. The same is done with the bus shunt elements.

3. $\Psi$ is generated by augmenting $\mathcal{Y}$ with the diagonal corrections that correspond to the injectors. These corrections are also expressed using the ordering and sign convention above, and each is applied to the $2 \times 2$ block (rows and columns) on the main diagonal of $\mathcal{Y}$ that corresponds to the bus to which the injector is connected.

The design of the MSC is based on the steps above through the mapping principle of Fig. 3.9, which has been retained.
Figure 4.3 – The two-port, four-pole network (2 poles per port) defined for a branch by (4.8) and the effect it has on the building of the $\mathbf{Y}$.

### 4.2.1 Power system components and matrix building

The electrical nature of a branch can be modeled using its complex $y$-parameters of (3.1). These parameters can be written in a real-expanded form using an ordering and a sign convention compatible with that of $\mathbf{Y}$.

$$
\begin{bmatrix}
-i_f^I \\
 i_f^R \\
-i_t^I \\
 i_t^R \\
\end{bmatrix} =
\begin{bmatrix}
-b_{ff} & -g_{ff} & -b_{ft} & -g_{ft} \\
 g_{ff} & -b_{ff} & g_{ft} & -b_{ft} \\
-b_{tf} & -g_{tf} & -b_{tt} & -g_{tt} \\
 g_{tf} & -b_{tf} & g_{tt} & -b_{tt} \\
\end{bmatrix}
\begin{bmatrix}
v_f^R \\
v_f^I \\
v_t^R \\
v_t^I \\
\end{bmatrix}
$$ (4.8)

The parameters of the above can be derived by taking into account the model of a power system branch, as shown in Fig. 3.6. The two-port, four-pole network defined by (4.8) is shown in Fig. 4.3 alongside the effect it has on the building of $\mathbf{Y}$. This is analogous to the complex case of Fig. 3.7 of chapter 3.
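The real expansion behind (4.8) can be checked numerically. The following is a minimal Python sketch, assuming each complex parameter is written as $y = g + jb$ and that rows are ordered as $(-i^I, i^R)$, the convention of (4.10). The function names and the numerical values are hypothetical and serve only as an illustration.

```python
def expand_entry(y: complex) -> list:
    """2x2 real block of one complex admittance y = g + jb.

    Rows are ordered (-i^I, i^R), columns (v^R, v^I).
    """
    g, b = y.real, y.imag
    return [[-b, -g],
            [g, -b]]


def branch_block(yff: complex, yft: complex, ytf: complex, ytt: complex) -> list:
    """4x4 real-expanded y-parameter matrix of a branch."""
    top = [ra + rb for ra, rb in zip(expand_entry(yff), expand_entry(yft))]
    bottom = [ra + rb for ra, rb in zip(expand_entry(ytf), expand_entry(ytt))]
    return top + bottom


def block_times_v(block: list, vf: complex, vt: complex) -> list:
    """Multiply the 4x4 block by the vector (v_f^R, v_f^I, v_t^R, v_t^I)."""
    v = [vf.real, vf.imag, vt.real, vt.imag]
    return [sum(a * x for a, x in zip(row, v)) for row in block]


# Illustrative values only (not taken from any of the test systems)
yff, yft, ytf, ytt = 1 - 5j, -1 + 5j, -1 + 5j, 1 - 4.9j
vf, vt = 1.0 + 0.1j, 0.98 - 0.05j

# Reference result from plain complex arithmetic
i_f = yff * vf + yft * vt
i_t = ytf * vf + ytt * vt
expected = [-i_f.imag, i_f.real, -i_t.imag, i_t.real]

result = block_times_v(branch_block(yff, yft, ytf, ytt), vf, vt)
```

Here `result` and `expected` agree to rounding error, which confirms that the real-expanded block reproduces the complex branch relation under the stated sign convention.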

For the different parts of Fig. 3.6 different models can be assumed. Table 4.2 summarizes these different sub-models. The model of the shunts part can be codified using two axes. The first axis concerns the completeness of the model and the second whether it is balanced or not. The letter codes for the table are as follows.

- for Transformer part models: $F$ - full model, $R$ - non-phase shifting model, $N$ - no transformer
- for Shunts part models (first axis): $F$ - full model, $I$ - imaginary model, $N$ - no shunts
- for Shunts part models (second axis): $u$ - unbalanced, $b$ - balanced
- for Line part models: $F$ - full model, $I$ - imaginary model

Figure 4.4 – Empirical pie chart of occurrences of branch types for several typical power systems with sizes ranging between 3-15k buses

The combination of models for the transformer, the shunts and the line parts yields the overall two-port network model of the branch. The following nomenclature is established: models are referred to using the abbreviation $TS\varepsilon\ell$. $T$ refers to the transformer model, $S$ refers to the completeness axis of the shunts model, $\varepsilon$ refers to the balanced-ness axis of the shunts model and $\ell$ refers to the line model. Capital $X$ or small $x$ is used as a wildcard, i.e. it stands for any sub-model in that position. E.g. $NIbF$ has an $N$ model for the transformer, an $Ib$ model for the shunts and an $F$ model for the line.
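The nomenclature can be illustrated with a small classifier. This is a minimal Python sketch, assuming the sub-model definitions of Table 4.2 and assuming that the balance letter is omitted when there are no shunts (as in codes such as $FNF$ or $NNI$). The function name, argument names and tolerance are hypothetical.

```python
def branch_code(x: float, theta: float,
                g_f: float, b_f: float, g_t: float, b_t: float,
                g_s: float, tol: float = 1e-12) -> str:
    """Return the model code of a branch, e.g. 'NIbF'.

    x, theta  : transformer ratio magnitude and phase shift
    g_f .. b_t: shunt conductances/susceptances at the from and to sides
    g_s       : series conductance of the line part
    """
    def zero(v: float) -> bool:
        return abs(v) <= tol

    # Transformer part: N (none), R (non-phase shifting), F (full)
    if zero(theta) and zero(x - 1.0):
        t_code = "N"
    elif zero(theta):
        t_code = "R"
    else:
        t_code = "F"

    # Shunts part, completeness axis: N (none), I (imaginary), F (full)
    if all(zero(v) for v in (g_f, b_f, g_t, b_t)):
        s_code = "N"
    elif zero(g_f) and zero(g_t):
        s_code = "I"
    else:
        s_code = "F"

    # Shunts part, balance axis: b (balanced), u (unbalanced);
    # assumed to be omitted when there are no shunts
    if s_code == "N":
        e_code = ""
    elif zero(g_f - g_t) and zero(b_f - b_t):
        e_code = "b"
    else:
        e_code = "u"

    # Line part: I (imaginary, g_s neglected), F (full)
    l_code = "I" if zero(g_s) else "F"

    return t_code + s_code + e_code + l_code


# A plain line with purely susceptive, equal shunts and a full series part
code_line = branch_code(1.0, 0.0, 0.0, 0.02, 0.0, 0.02, 0.5)
# A non-phase shifting transformer without shunts and with g_s neglected
code_trafo = branch_code(1.05, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
```

With these inputs `code_line` is `'NIbF'` and `code_trafo` is `'RNI'`.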

Fig. 4.4 shows the pie chart of the branch types that occur in the power systems of appendix A. The size of the power systems is between 3 and 15226 buses. Among all possible models, there are ten types ($FIbF$, $RIbF$, $NIbF$, $FNF$, $RNF$, $NNF$, $NIbI$, $FNI$, $RNI$ and $NNI$) that appear in systems used in transient simulations.

Analogously, for a shunt element ($g_{sh} + j \cdot b_{sh}$) on bus $n$ the $y$-parameters are as follows. The one-port network defined by these parameters and its effect on the building of $\mathcal{Y}$ is shown in Fig. 4.5. This is analogous to the complex case of Fig. 3.8.

\[
\begin{bmatrix}
-i_n^I \\
 i_n^R
\end{bmatrix} = \begin{bmatrix}
-b_{sh} & -g_{sh} \\
 g_{sh} & -b_{sh}
\end{bmatrix} \begin{bmatrix}
 v_n^R \\
 v_n^I
\end{bmatrix}
\] (4.9)
Table 4.2 – Different models for different parts of the branch model of Fig. 3.6

<table>
<thead>
<tr>
<th>Transformer part</th>
<th>Shunts part (first axis)</th>
<th>Shunts part (second axis) / Line part</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Full model:</strong> In the most general case no assumptions are made for the transformer parameters, i.e. $x \neq 0$ close to 1 and $\theta \neq 0$. When the transformer is operating at nominal tap setting then $x = 1$.</td>
<td><strong>Full model:</strong> No constraints are set for the parameters of the shunts, $g_f, b_f, g_t$ and $b_t$.</td>
<td><strong>Unbalanced:</strong> This is the general case where the “from” part and the “to” part can have different parameters.</td>
</tr>
<tr>
<td><strong>Non-phase shifting model:</strong> When the transformer is not phase shifting then $\theta = 0$ and the ratio is real $n = x \in \mathbb{R}$.</td>
<td><strong>Imaginary model:</strong> The conductive parts at the “from” and the “to” sides are neglected ($g_f = g_t = 0$) and the shunt admittances only have a susceptive part.</td>
<td><strong>Balanced:</strong> The shunts are equal for both sides $g_f = g_t$ and $b_f = b_t$.</td>
</tr>
<tr>
<td><strong>No transformer:</strong> Neglecting the transformer means $x = 1$ and $\theta = 0$ and the branch becomes a simple line; the generalized $\pi$ model of Fig. 3.6 becomes a standard (non-generalized) $\pi$ model.</td>
<td><strong>No shunts:</strong> Neglecting the shunts means $g_f = b_f = g_t = b_t = 0$.</td>
<td><strong>Imaginary model:</strong> In high-voltage transmission lines a common assumption that $x_s \gg r_s$ is made. This leads to $g_s \rightarrow 0$, hence the line has only a susceptive part.</td>
</tr>
</tbody>
</table>

Figure 4.5 – The one-port, two-pole network (2 poles per port) defined for a shunt element by (4.9) and the effect it has on the building of the $Y$.

The $Y$ matrix is generated by taking into account the effect of all branches and shunt elements. It then needs to be augmented into $\tilde{Y}$ by taking into account all the diagonal corrections that correspond to injectors. According to observations made earlier in this section, the diagonal correction for an injector represents a relation between the voltage at the bus to which the injector is connected and the current injected into the grid. This is formalized in the following equation. The effect of an injector's diagonal correction on the building of matrix $\tilde{Y}$ is illustrated in Fig. 4.6.

$$\begin{bmatrix} -i_n^I \\ i_n^R \end{bmatrix} = \begin{bmatrix} -k_{11} & -k_{12} \\ k_{21} & k_{22} \end{bmatrix} \cdot \begin{bmatrix} v_n^R \\ v_n^I \end{bmatrix} \quad (4.10)$$

To accompany the matrix building procedure described above, a model is also assumed for the nodes of the system. These concern the pseudo-current r.h.s. vector $I_v$ and the solution vector $V$. Given the real decomposition scheme and the sign convention that has been adopted, a bus of the power system can be represented as a set of two points, at which the currents are injected and the voltages are measured, as shown in Fig. 4.7. $b_I$ and $b_R$ refer to the $b$ vector of (4.5a).
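The three matrix building steps can be sketched in software as a stamping procedure. The following minimal Python sketch assumes 0-based bus indices, complex $y$-parameters per branch, and the sign pattern of (4.10) as printed for the diagonal corrections. All function names, the data layout and the numbers are hypothetical.

```python
def expand(y: complex) -> list:
    """2x2 real block of y = g + jb with rows ordered (-i^I, i^R)."""
    return [[-y.imag, -y.real],
            [y.real, -y.imag]]


def stamp(matrix: list, row: int, col: int, block: list) -> None:
    """Add a small block into matrix with its top-left corner at (row, col)."""
    for r, block_row in enumerate(block):
        for c, value in enumerate(block_row):
            matrix[row + r][col + c] += value


def build_matrix(n_bus: int, branches: list, shunts: list, corrections: list) -> list:
    """Assemble the augmented real matrix (2*n_bus x 2*n_bus).

    branches    : tuples (from_bus, to_bus, yff, yft, ytf, ytt)
    shunts      : tuples (bus, y_sh)
    corrections : tuples (bus, (k11, k12, k21, k22))
    """
    m = [[0.0] * (2 * n_bus) for _ in range(2 * n_bus)]
    # Steps 1 and 2: branches and shunts, connected according to the topology
    for f, t, yff, yft, ytf, ytt in branches:
        stamp(m, 2 * f, 2 * f, expand(yff))
        stamp(m, 2 * f, 2 * t, expand(yft))
        stamp(m, 2 * t, 2 * f, expand(ytf))
        stamp(m, 2 * t, 2 * t, expand(ytt))
    for n, y_sh in shunts:
        stamp(m, 2 * n, 2 * n, expand(y_sh))
    # Step 3: diagonal corrections on the 2x2 diagonal block of the bus,
    # with the signs of (4.10) as printed
    for n, (k11, k12, k21, k22) in corrections:
        stamp(m, 2 * n, 2 * n, [[-k11, -k12], [k21, k22]])
    return m


# Illustrative two-bus system: one line, one shunt, one injector
y = 1 - 5j
psi = build_matrix(
    n_bus=2,
    branches=[(0, 1, y, -y, -y, y)],
    shunts=[(1, 0.02j)],
    corrections=[(0, (0.1, 0.2, 0.3, 0.4))],
)
```

Only the $2 \times 2$ diagonal block of bus 0 is affected by the injector, while the shunt changes only the diagonal block of bus 1.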

### 4.2.2 Electronic equivalents

Based on the analysis of the power system components presented in the previous section, their electronic equivalents are presented here.

### Electronic Two-Port Networks

For an electronic two-port network to have a conductance parameter matrix such as the one of (4.8) a two-port network four-pole design such as the one of Fig. 4.8 is adopted. The resulting conductance parameters are as follows in (4.11). The nodes of interest in the TPN are the nodes denoted by 1, 2, 3 and 4. The effect of any internal circuitry will be projected to the overall g-parameters of the two-port network, as long as it involves one of these nodes.

\[
\begin{bmatrix}
I_1 \\
I_2 \\
I_3 \\
I_4 \\
\end{bmatrix} =
\begin{bmatrix}
g_{11} & g_{12} & g_{13} & g_{14} \\
g_{21} & g_{22} & g_{23} & g_{24} \\
g_{31} & g_{32} & g_{33} & g_{34} \\
g_{41} & g_{42} & g_{43} & g_{44} \\
\end{bmatrix} \cdot
\begin{bmatrix}
V_1 \\
V_2 \\
V_3 \\
V_4 \\
\end{bmatrix}
\]  

(4.11)

The building blocks of the internal topologies of the TPN are potentiometers and voltage controlled current sources (VCCS). The effect of a potentiometer on the g-parameters of the TPN is shown in Fig. 4.9a. A VCCS is an active electronic element that controls the current injection at an electrical node based on the voltage at other points of the circuit, i.e. it acts as a transconductance. This explains why a VCCS can affect a single position of the g-parameters matrix, without affecting the symmetrical entry or the corresponding entries of the main diagonal, as shown in Fig. 4.9b.
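The effect of the two building blocks on the g-parameters matrix can be sketched as follows. This minimal Python sketch assumes ideal elements, 1-based pole numbers, and a signed transconductance for the VCCS (the sign convention of the actual circuit is not fixed here). The function names and values are hypothetical. As a check, two potentiometers reproduce the Mk5 pattern with entries $g_{13}$ and $g_{24}$.

```python
def empty_tpn() -> list:
    """4x4 g-parameters matrix of a TPN with no internal elements."""
    return [[0.0] * 4 for _ in range(4)]


def add_potentiometer(g_matrix: list, i: int, j: int, g: float) -> None:
    """Conductance g between poles i and j (1-based).

    Affects the two diagonal entries and the two symmetrical
    off-diagonal entries.
    """
    i, j = i - 1, j - 1
    g_matrix[i][i] += g
    g_matrix[j][j] += g
    g_matrix[i][j] -= g
    g_matrix[j][i] -= g


def add_vccs(g_matrix: list, i: int, j: int, gm: float) -> None:
    """VCCS acting on pole i, controlled by the voltage at pole j.

    Only the single entry (i, j) changes; gm is a signed transconductance.
    """
    g_matrix[i - 1][j - 1] += gm


# Mk5-like pattern: potentiometers between poles 1-3 and 2-4
mk5 = empty_tpn()
add_potentiometer(mk5, 1, 3, 2.0)   # g13, illustrative value
add_potentiometer(mk5, 2, 4, 3.0)   # g24, illustrative value

# Adding one VCCS changes exactly one entry
with_vccs = [row[:] for row in mk5]
add_vccs(with_vccs, 1, 2, 0.5)
```

After the last call only entry (1, 2) differs between `with_vccs` and `mk5`; the symmetrical entry (2, 1) and the diagonal are untouched.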

The y-parameters matrix of each power system branch type has different degrees of freedom and numerical properties, e.g. non-zero pattern, symmetry, etc. Analysis conducted by the author showed that five distinct electronic TPN modifications are sufficient to cover all branch models. These modifications are called TPN Mk1 ... Mk5. Table 4.3 summarizes their ability to represent the power system branch types of Fig. 4.4. TPNs are in decreasing order of complexity.
![Diagram of a TPN with a potentiometer and VCCS connections]

(a) Connection of a potentiometer between poles 1 and 3 of a TPN

(b) Connection of a VCCS in a TPN to draw current from pole 1, controlled by voltage at pole 2

Figure 4.9 – Examples of the effect of the TPN building blocks (potentiometers and VCCS) to the g-parameters matrix of the TPN

Table 4.3 – TPN modifications and their ability to represent branch types

<table>
<thead>
<tr>
<th>TPN</th>
<th>FTbF</th>
<th>RTbF</th>
<th>NTbF</th>
<th>FbF</th>
<th>NNbF</th>
<th>NNbF</th>
<th>FNI</th>
<th>RNI</th>
<th>NNI</th>
</tr>
</thead>
<tbody>
<tr>
<td>Mk1</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
</tr>
<tr>
<td>Mk2</td>
<td>N</td>
<td>Y</td>
<td>N</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
</tr>
<tr>
<td>Mk3</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
</tr>
<tr>
<td>Mk4</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>Y</td>
<td>Y</td>
</tr>
<tr>
<td>Mk5</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>N</td>
<td>Y</td>
</tr>
</tbody>
</table>

It is always desirable to represent branch types with TPNs that are just complex enough, and “Y”s in bold show the TPN that fulfills this criterion. TPNs of higher complexity can be used to represent branches of lower complexity. This can be thought of as a procedure of downgrading, as it uses resources in excess of what is really necessary for the representation of the branches. TPN downgrading might be useful when the TPN bank is out of resources of a particular Mk, but there are still resources available in Mks of higher complexity.
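The downgrading procedure can be sketched as a simple allocation rule. This minimal Python sketch assumes the primary usage of each Mk as listed in Table 4.4 and assumes that every Mk of higher complexity (lower Mk number) can represent the branch types of the Mks of lower complexity. The dictionary, the function name and the bank contents are hypothetical.

```python
# Primary (just complex enough) TPN modification per branch type,
# following the primary usage column of Table 4.4
PRIMARY_MK = {
    "FIbF": 1, "FNF": 1,
    "RIbF": 2, "NIbF": 2, "RNF": 2,
    "NNF": 3, "FNI": 3,
    "NIbI": 4, "RNI": 4,
    "NNI": 5,
}


def allocate_tpn(branch_type: str, free: dict):
    """Pick a TPN for a branch, downgrading a more complex TPN if needed.

    free maps the Mk number to the count of still available TPNs and is
    updated in place. Returns the Mk number used, or None if the bank
    cannot host the branch.
    """
    mk = PRIMARY_MK[branch_type]
    while mk >= 1:
        if free.get(mk, 0) > 0:
            free[mk] -= 1
            return mk
        mk -= 1  # try the next modification of higher complexity
    return None


# Illustrative bank: no Mk2 left, one Mk1 and one Mk5 available
bank = {1: 1, 2: 0, 5: 1}
first = allocate_tpn("NNI", bank)    # primary Mk5 is available
second = allocate_tpn("RNF", bank)   # Mk2 exhausted, falls back to Mk1
third = allocate_tpn("RNF", bank)    # nothing suitable is left
```

In this example `first` is 5, `second` is 1 (a downgraded Mk1) and `third` is `None`.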

An overview of the architecture of TPN modifications alongside the resulting electronic g-parameters matrix is presented hereunder. The total g-parameters matrices are the combined effect of potentiometers and VCCS in the TPNs. Voltage variables beside the VCCSs denote the controlling voltage of the source.

For TPN Mk3 different dependencies are required for the different models that are mapped onto this Mk, e.g. NNF and FNI. Therefore two “versions” of the g-parameters matrix are shown. For the implementation, the issue can be easily addressed with an additional digitally controlled switch that selects the control voltage of the respective VCCS.

Mk1

\[
\begin{bmatrix}
  g_{1a} & g_{1b} & g_{1c} & g_{1d} \\
  g_{2b} & g_{2a} & g_{2c} & g_{2d} \\
  g_{3b} & g_{3c} & g_{3a} & g_{3d} \\
  g_{4b} & g_{4c} & g_{4d} & g_{4a}
\end{bmatrix}
\]

Mk2

\[
\begin{bmatrix}
  g_{13} + g_{1a} & g_{1b} & -g_{13} & g_{1b} \\
  g_{2b} & g_{24} + g_{2a} & g_{2c} & -g_{24} \\
  -g_{13} & g_{3b} & g_{13} + g_{3a} & g_{3c} \\
  g_{4b} & -g_{24} & g_{4c} & g_{24} + g_{4a}
\end{bmatrix}
\]

Mk3 (Ver. a) and (Ver. b)

Mk4

\[
\begin{bmatrix}
  g_{13} + g_{1a} & 0 & -g_{13} & 0 \\
  0 & g_{24} + g_{2a} & g_{2b} & -g_{24} \\
  -g_{13} & g_{3b} & g_{13} + g_{3a} & 0 \\
  g_{4b} & -g_{24} & 0 & g_{24} + g_{4a}
\end{bmatrix}
\]

Mk5

\[
\begin{bmatrix}
  g_{13} & 0 & -g_{13} & 0 \\
  0 & g_{24} & 0 & -g_{24} \\
  -g_{13} & 0 & g_{13} & 0 \\
  0 & -g_{24} & 0 & g_{24}
\end{bmatrix}
\]

Apart from the analog part, TPNs also have a digital part, illustrated in Fig. 4.10. In the figure, black wires represent digital connections, red wires and coloring represent analog connections and generally analog domain functionality, and bold wires stand for (digital or analog) buses (i.e. more than one connection).

The TPN_PU is the digital processing unit present in the TPN. Depending on its Mk, the TPN may have $X_{TPN}$ number of VCCSs and $Y_{TPN}$ number of potentiometers. The analog functionality of the TPN is depicted as the red square on the bottom of the figure.

The TPN_PU is connected to the rest of the MSC by three digital buses (data_bus, address_bus and control_bus). These are used to deliver commands to the TPN by higher levels of the MSC hierarchy. In turn the TPN_PU generates the internal digital signals that are required to drive its VCCSs and POTs.

Table 4.4 summarizes the complexity of TPN modifications in terms of numbers of VCCSs ($X_{TPN}$) and potentiometers ($Y_{TPN}$). The last column of the table refers to the nominal percentage of TPNs of each type that should be available in the TPN bank. This percentage is the sum of the occurrence percentages of the branch types (see Fig. 4.4) that the TPN Mk is primarily used to represent (second to last column).

### Electronic One-Port Networks

An analogous approach has been followed for shunt elements and diagonal corrections. A one-port network (OPN) structure has been created as shown in Fig. 4.11. Its g-parameters matrix is given in (4.12). The nodes of interest in the OPN are nodes 1 and 2. The effect of any internal circuitry will be projected to the overall g-parameters of the one-port network, as long as it involves one of these nodes. The building blocks of OPNs are again potentiometers and VCCSs. Two OPN modifications are proposed. Their architecture alongside their corresponding $g$-parameter matrices are presented hereunder. OPN Mk1 is used to represent shunt elements or diagonal corrections when they are complex (both real and imaginary parts), while OPN Mk2 can represent only real-only or imaginary-only power system one-ports.

Table 4.4 – TPN modifications, their complexity and their primary usage for branch type representation

<table>
<thead>
<tr>
<th>TPN</th>
<th>$X_{TPN}$</th>
<th>$Y_{TPN}$</th>
<th>Primarily used branch types</th>
<th>Nominal percentage in the TPN bank</th>
</tr>
</thead>
<tbody>
<tr>
<td>Mk1</td>
<td>12</td>
<td>4</td>
<td>FIbF, FNF</td>
<td>0.20 %</td>
</tr>
<tr>
<td>Mk2</td>
<td>12</td>
<td>2</td>
<td>RIbF, NIbF, RNF</td>
<td>82.40 %</td>
</tr>
<tr>
<td>Mk3</td>
<td>8</td>
<td>2</td>
<td>NNF, FNI</td>
<td>14.53 %</td>
</tr>
<tr>
<td>Mk4</td>
<td>4</td>
<td>2</td>
<td>NIbI, RNI</td>
<td>0.58 %</td>
</tr>
<tr>
<td>Mk5</td>
<td>0</td>
<td>2</td>
<td>NNI</td>
<td>2.29 %</td>
</tr>
</tbody>
</table>

Figure 4.11 – An outline of an electrical one-port network used to map a power system shunt element or a diagonal correction

$$\begin{bmatrix} I_1 \\ I_2 \end{bmatrix} = \begin{bmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{bmatrix} \cdot \begin{bmatrix} V_1 \\ V_2 \end{bmatrix}$$ (4.12)

OPNs also have a digital part, illustrated in Fig. 4.12. The figure is analogous to Fig. 4.10 of the TPN case.

### Injection and Measurement Node

The pseudo-current injections and the pseudo-voltage measurements of Fig. 4.7 will be performed by an array of electronic nodes. A synoptical diagram of this injection and measurement node (IMN) is shown in Fig. 4.13. \(I_{ia}\) and \(I_{ib}\) stand for current injection devices, \(V_{ia}\) and \(V_{ib}\) for voltage injection devices. The type of the injection to the node is determined by a pair of digitally controlled switches.

\(I_{ma}\) and \(I_{mb}\) are current measurement devices (e.g. ammeters) and hence they are connected in series, while \(V_{ma}\) and \(V_{mb}\) are voltage measurement devices (e.g. voltmeters) and so they are connected in parallel. All building blocks of the IMN are digitally controlled.

The digital part of the IMN is illustrated in Fig. 4.14. The figure is analogous to Figs. 4.10 and 4.12 of the TPN/OPNs.

### 4.2.3 Value range profiling

For the sizing of the electronics of the MSC, a value profiling of the corresponding power system elements is necessary. Empirical tests have been conducted on all the power systems of appendix A. The value range of the branch admittances is as follows.

\[
g \sim (5 \cdot 10^{-3}, 3 \cdot 10^{3}) \ pu \tag{4.13}
\]

\[
b \sim (1 \cdot 10^{-2}, 2 \cdot 10^{4}) \ pu \tag{4.14}
\]

The value range of shunt admittances is as follows.

Figure 4.13 – Synoptical diagram of an injection and measurement node of the MSC

Figure 4.14 – Schematic of the digital part of an IMN

### 4.3 MSC architecture

### 4.3.1 Local cells

A set of \( Y_{2LC} \) IMNs, \( Y_{2LC} \) OPNs, and \( X_{LC} \) TPNs constitutes an entity that in the MSC frame is called a local cell (LC), shown schematically in Fig. 4.15. An LC is a group of electrical nodes with a higher degree of connectivity between them, so as to resemble the clustered nature of real-world power system grids. In a local cell \( Y_{2LC} \) buses can be represented.

The representation of a bus is handled by the components that are inside the gray shaded boxes on the left of the figure. Injections and measurements are handled by IMNs and the representation of a shunt element and/or diagonal corrections by OPNs. The connection of the latter to the electrical node of the IMN is controlled by the switch \( \text{OPN}_\text{SW} \), which is controlled by a digital line \( \text{cs} \).

On the top right of the figure, the interconnection part of the local cell is highlighted in yellow. This interconnection infrastructure is managed by a dedicated processing unit \( \text{LC}_\text{CONN}_\text{PU} \). The horizontal double-analog connections represent the electrical nodes of the MSC. The vertical lines are connected through analog multiplexers (MUX) to TPNs.

The digital processing core of the LC is the local cell processing unit (LC\_PU) on the bottom left of the figure. It controls all the components of the LC through three buses, lc\_data\_bus, lc\_address\_bus and lc\_ctrl\_bus.

The MUX entity (shown in Fig. 4.16) is a digitally controlled double analog multiplexer. Each of the analog lines (shown in bold red in the figure) represents two analog signals, as per the convention in the MSC. The multiplexer on the left is controlled by a digital control signal \( \text{mux}_f \) and leads to the “from” (analog) connectivity side of the associated TPN. The multiplexer on the right is controlled by a digital control signal \( \text{mux}_t \) and leads to the “to” (analog) connectivity side of the associated TPN. There are as many positions of the analog multiplexers as there are electrical nodes (IMNs) in the LC, i.e. \( Y_{2\text{LC}} \), plus one extra position for no connection. This mechanism ensures that the TPN can be connected to any IMN on the from-side, and to another one on the to-side. Connectivity is limited to the LC, i.e. only IMNs of the LC can be connected to TPNs of the LC.

For the LC, clusters of \( Y_{2\text{LC}} = 8 \) IMN nodes and \( X_{\text{LC}} = 15 \) TPN branches can be envisioned. This would allow the LC to be able to accommodate a neighborhood of a power system that consists of a cluster of 8 densely connected buses. Out of these buses, there are \( Y_{1\text{LC}} \) that can be connected to buses of neighboring LCs. This is done in the frame of the global MSC structure that is presented in the following subsection.

### 4.3.2 Global architecture

Fig. 4.17 presents a synoptical view of the global MSC architecture (GB). The proposed MSC is to be connected to a normal PC architecture as a hardware accelerator through a standard PC bus. The GB consists of \( Y_{\text{GB}} \) LC entities.

The interconnection part of the GB is highlighted in yellow on the right of the figure, and it is similar to the interconnection found in the LC, but not exactly the same. On the bottom of the figure there are \( X_{\text{GB}} \) TPNs. These are used to connect LCs together through the \( Y_{1\text{LC}} \) analog connections of the latter. The analog connections of LCs are shown as horizontal red lines in the figure. The from and to sides of the TPNs are connected to analog wires, which are shown as vertical red lines in the figure. The horizontal and vertical lines can be interconnected by a set of switches. For each possible connection, one switch is provided. In total there is the following number of switches.

\[
\underbrace{Y_{\text{GB}} \cdot Y_{1\text{LC}}}_{\text{\# of horizontal lines}} \cdot \underbrace{2 \cdot X_{\text{GB}}}_{\text{\# of vertical lines}}
\]

Each of the switches is driven by a digital control signal. These control signals are multiplexed into GB_PU and fed to a series of cascaded multiplexers up to the point of the single bit wires that control the switches. This is done to avoid a potentially prohibitive number of digital lines exiting from the GB_PU.

Figure 4.15 – Schematic of a local cell (LC) of the MSC

Figure 4.16 – Connectivity detail and simplified schematic of a TPN connection multiplexer (MUX)

This interconnection scheme has been selected for the global architecture of the MSC because of parasitic capacitances. Indeed, in the interconnection scheme that is used in the LCs there are analog lines that connect each of the TPNs with every IMN. Given the limited size of the LC, this is not problematic. In the GB case however, such a scheme would result in a prohibitive number of total analog connections, which in turn would lead to a prohibitive amount of (capacitive) parasitics. The latter would have an adverse effect on the performance of the MSC, as will be explained in a later section.

The total number of LCs that are included in the GB MSC scheme depends on the required capacity of the platform. For example assuming LCs with a capacity of 8 nodes, then 10 of them would be required to have a total MSC capacity of 80 buses. The number of TPNs in the GB affects the available connectivity between the LC clusters.

All \( Y_{1LC}, Y_{2LC}, X_{LC}, Y_{GB} \) and \( X_{GB} \) are parameters to be decided on depending on the requirements from an actual MSC implementation.
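The dependence of the platform size on these parameters can be sketched numerically. The following minimal Python sketch only evaluates quantities stated in the text: the bus capacity, the number of GB interconnection switches, and the number of positions of each LC multiplexer. The function name and the values chosen for $Y_{1LC}$ and $X_{GB}$ are hypothetical; $Y_{2LC} = 8$, $X_{LC} = 15$ and $Y_{GB} = 10$ follow the examples given in this section.

```python
def msc_sizing(y_gb: int, y1_lc: int, y2_lc: int, x_lc: int, x_gb: int) -> dict:
    """Basic size figures of an MSC for a given parameter set."""
    horizontal_lines = y_gb * y1_lc   # analog connections exported by the LCs
    vertical_lines = 2 * x_gb         # from and to sides of the GB TPNs
    return {
        "bus_capacity": y_gb * y2_lc,
        "gb_switches": horizontal_lines * vertical_lines,
        "lc_mux_positions": y2_lc + 1,    # one extra position: no connection
        "total_tpns": y_gb * x_lc + x_gb,
    }


# Y_1LC = 2 and X_GB = 20 are illustrative assumptions
sizing = msc_sizing(y_gb=10, y1_lc=2, y2_lc=8, x_lc=15, x_gb=20)
```

For this parameter set the sketch gives a capacity of 80 buses, 800 GB switches, 9 multiplexer positions per LC multiplexer and 170 TPNs in total.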

### 4.3.3 Topological mapping and value mapping

The procedure to associate power system branches to electronic TPNs (edges), and buses and shunts to electronic IMNs and OPNs (nodes) is the topological mapping described in section 3.1 of chapter 3. This is achieved through the interconnection fabric of the local cells and the global structure as detailed in the previous subsection. The topological mapping is necessary only once for a given topology.

After a power system branch has been mapped to an electronic TPN, the latter has to be configured so that (4.11) is equivalent to (4.8). For a shunt element or diagonal correction mapped on an OPN, the latter has to be configured so that (4.12) is equivalent to the corresponding power system shunt or diagonal correction quantity. Finally, the electronic injections by an IMN associated to a power system bus have to correspond to the real-world injections of the bus in the power system. This procedure is the value mapping of section 3.1 of chapter 3. The multiplicative mapping ratios $\rho_Y$, $\rho_I$ and $\rho_V$ defined in that section are again relevant. The result is that an equivalent system is created in electronics.

\[
\begin{aligned}
\mathbf{J} &= \Psi \cdot \mathcal{V} && \text{(power system)} \\
\mathbf{I} &= \underbrace{\left( \mathbf{G} + \mathbf{G}_K \right)}_{\Gamma} \cdot \mathbf{V} && \text{(electronic equivalent)}
\end{aligned}
\] (4.21)

The matrix $G$ is the electronic equivalent of the topology admittance matrix $\mathcal{Y}$. $G$ is augmented with the extra terms $G_K$ that correspond to the diagonal corrections $\mathbf{K}$. The resulting total matrix $\Gamma$ is the electronic equivalent of the power system matrix $\Psi$. The power system current injections and voltage measurements ($\mathbf{J}$ and $\mathcal{V}$) are handled by the IMNs ($\mathbf{I}$ and $\mathbf{V}$).

The topology matrix $G$ (made up by TPNs and OPNs corresponding to shunts) is not expected to change often, while $G_K$ (made up by OPNs corresponding to diagonal corrections) may change often. Hence the corresponding OPNs have to be modified accordingly. The injection and measurement vectors $\mathbf{I}$ and $\mathbf{V}$ are the main input and output variables of the MSC and are expected to change in every invocation.

### 4.3.4 Mathematical operations

In the MSC, voltages and currents can be both injected and read using resources of the IMNs. This is an upgrade compared to the FPPNS, where currents can be injected but not read from the grid. For a node, either a voltage or a current injection can be performed. Nodes with voltage injections have unknown current injections, and vice versa nodes with current injections have unknown voltages. Unknown electrical variables are induced by the physics of the system.

Similar to the FPPNS, when only voltages are injected, $V$ is known in (4.21), $I$ is the unknown vector and the operation performed by the analog part of the MSC corresponds to a matrix-vector multiplication. When only currents are injected, $I$ is known (rhs vector), $V$ is the unknown vector, and the operation performed corresponds to a linear system solving. Mixed voltage and current injections result in the operation described in (3.25).
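The three operations can be emulated in software with a single partitioned routine. The following minimal Python sketch assumes ideal components and a small dense $\Gamma$; it uses a generic partitioned solve for the mixed case, which is an assumption and not necessarily the exact form of (3.25). The function names and the example matrix are hypothetical.

```python
def solve(a: list, b: list) -> list:
    """Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[p] = m[p], m[k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for c in range(k, n + 1):
                m[r][c] -= f * m[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(m[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (m[k][n] - s) / m[k][k]
    return x


def msc_operation(gamma: list, inj: list, is_current: list):
    """Return (I, V) satisfying I = gamma * V for mixed injections.

    inj[k] is a current if is_current[k] is true, otherwise a voltage.
    """
    n = len(inj)
    cur = [k for k in range(n) if is_current[k]]
    vol = [k for k in range(n) if not is_current[k]]
    v = [0.0] * n
    i = [0.0] * n
    for k in vol:
        v[k] = inj[k]
    for k in cur:
        i[k] = inj[k]
    if cur:
        # Unknown voltages at the current-injected nodes
        a = [[gamma[r][c] for c in cur] for r in cur]
        rhs = [i[r] - sum(gamma[r][c] * v[c] for c in vol) for r in cur]
        for k, value in zip(cur, solve(a, rhs)):
            v[k] = value
    for r in vol:
        # Unknown currents at the voltage-injected nodes
        i[r] = sum(gamma[r][c] * v[c] for c in range(n))
    return i, v


gamma = [[2.0, -1.0], [-1.0, 3.0]]                            # illustrative values
i_mv, v_mv = msc_operation(gamma, [1.0, 2.0], [False, False])  # matrix-vector product
i_ls, v_ls = msc_operation(gamma, [0.0, 5.0], [True, True])    # linear system solving
i_mx, v_mx = msc_operation(gamma, [0.0, 2.0], [True, False])   # mixed injections
```

All three calls describe the same operating point, $V = (1, 2)$ and $I = (0, 5)$, reached from different sets of known quantities.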

### 4.3.5 Inaccuracies and effect in the linear operations

An inaccuracy analysis of the MSC can be performed based on section 3.3 of chapter 3. The following model is assumed for all reconfigurable devices (potentiometers, VCCSs, current DACs and voltage DACs) in the system. In the following $X$ may stand for a conductance $G$ of a TPN or OPN, or an injected voltage $V$ or a current $I$ of an IMN.

\[
X = \frac{s}{2^M} \cdot t \cdot X_{FS} \tag{4.22}
\]

Here $X$ is a physical electronic quantity, $s$ represents the sign, $t \in \mathbb{Z}^+$ is the tap setting, $M$ is the bit resolution and $X_{FS} \in \mathbb{R}^+$ is the full-scale value of the reconfigurable element. The tap setting can take integer values in the range $t \in [0, 2^M)$. The sign $s$ of the electrical quantity can be controlled using one bit $s_b$, e.g. by polarity reversal switches at the outputs of DACs. This extra sign bit is not counted in $M$. The pair $[s_b, t]$ can be thought of as a signed magnitude representation, of word length $M+1$, of the real signed value of the component. When a specific value $X_{req}$ is requested from the reconfigurable element, $s_b$ and $t$ have to be calculated so that the resulting $X$ from (4.22) best approximates $X_{req}$. The sign bit is simply set as $s_b \leftarrow \text{sign}(X_{req})$. The tap setting can be calculated from (4.22).
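The calculation of $s_b$ and $t$ from a requested value can be sketched as follows. This minimal Python sketch assumes that the nominal full-scale value is used, that the tap is obtained by rounding to the nearest integer, and that requests beyond the range saturate at $2^M - 1$; the rounding and saturation rules are assumptions, since the text only states that $t$ follows from (4.22). The function name and the values are hypothetical.

```python
import math


def configure_element(x_req: float, x_fs: float, m_bits: int):
    """Return (s_b, t, x) for a requested value x_req.

    s_b : sign bit, True for a positive sign
    t   : integer tap setting in [0, 2**m_bits)
    x   : value actually realized according to the model of (4.22)
    """
    s = 1 if x_req >= 0 else -1
    # Nearest tap (assumption), saturated at the largest tap
    t = math.floor(abs(x_req) / x_fs * 2 ** m_bits + 0.5)
    t = min(t, 2 ** m_bits - 1)
    x = s * t * x_fs / 2 ** m_bits
    return s > 0, t, x


# Illustrative 8-bit element with a full-scale value of 1.0
sb_a, t_a, x_a = configure_element(-0.3, 1.0, 8)
sb_b, t_b, x_b = configure_element(2.0, 1.0, 8)   # beyond the range
```

In the first call the tap is 77 and the realized value is $-77/256$; in the second call the tap saturates at 255.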

Inaccuracies come from the fact that the real full-scale value $X_{FS}$ is a priori unavailable. Instead, it is only known in the form of a random distribution $X_{FS} \sim \mathcal{N}(\hat{X}_{FS}, \sigma_{X_{FS}}^2)$. An analysis similar to (3.35)-(3.43) can be followed for the inaccuracy that is introduced.

For the measurement devices in the system (voltage and current ADCs) the following simplified model is assumed.

\[ s_b \leftarrow \text{sign}(X_{in}) \]  \hspace{1cm} (4.23a)

\[ D^R = \frac{|X_{in}|}{X_{FS}} \cdot 2^M \]  \hspace{1cm} (4.23b)

\[ D = \begin{cases} 
0 & \text{if } D^R \leq 0 \\
2^M - 1 & \text{if } D^R \geq 2^M - 1 \\
\lfloor D^R + 0.5 \rfloor & \text{otherwise}
\end{cases} \]  \hspace{1cm} (4.23c)

\[ X = \frac{s \cdot D \cdot X_{FS}}{2^M} \]  \hspace{1cm} (4.23d)

The analog input to the ADC is \( X_{in} \) and the digital output is the signed magnitude pair \([s_b, D]\). \( s_b \) is the sign bit, which is true when \( s = +1 \). \( D \) is the magnitude word of length \( M \). \( D^R \) is the magnitude word if it were allowed to assume non-integer values; it is quantized and saturated to \( D \) in (4.23c). \( X_{FS} \) determines the range of the ADC, since it is the full-scale quantity that can be accepted at its input, \(|X_{in}| \in [0, X_{FS})\). The value that is actually “read” by the ADC, and that is communicated to digital parts of the system, is given by \( X \). Inaccuracy for (4.23) can be investigated along the lines of (3.61)-(3.65a) of the FPPNS ADCs.
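The simplified ADC model of (4.23) can be sketched in a few lines. This minimal Python sketch assumes the nominal full-scale value and reads the quantization in (4.23c) as rounding to the nearest integer, i.e. the floor of $D^R + 0.5$, which is an assumption. The function name and the values are hypothetical.

```python
import math


def adc_read(x_in: float, x_fs: float, m_bits: int):
    """Return (s_b, d, x) for an analog input x_in.

    s_b : sign bit, True when s = +1
    d   : magnitude word, quantized and saturated to [0, 2**m_bits - 1]
    x   : value communicated to the digital part
    """
    s = 1 if x_in >= 0 else -1
    d_real = abs(x_in) / x_fs * 2 ** m_bits           # (4.23b)
    d = math.floor(d_real + 0.5)                      # quantization
    d = min(max(d, 0), 2 ** m_bits - 1)               # saturation, (4.23c)
    x = s * d * x_fs / 2 ** m_bits                    # (4.23d)
    return s > 0, d, x


# Illustrative 4-bit ADC with a full-scale value of 2.0
exact = adc_read(0.5, 2.0, 4)       # falls exactly on a level
rounded = adc_read(0.3, 2.0, 4)     # rounded to the nearest level
saturated = adc_read(-3.0, 2.0, 4)  # beyond the range
```

Here `exact` is `(True, 4, 0.5)`, `rounded` is `(True, 2, 0.25)` and `saturated` is `(False, 15, -1.875)`.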

Inaccuracies of individual components contribute to the overall inaccuracy of operations performed by the MSC. Inaccuracy is manifested in different stages, similar to table 3.2 in chapter 3 for the FPPNS.

\[ \tilde{\Gamma} = \Gamma + E_\Gamma \]  \hspace{1cm} (4.24)

\[ \tilde{I} = I + \epsilon_I \]  \hspace{1cm} (4.25)

\[ \tilde{V} = V + \epsilon_{V1} \]  \hspace{1cm} (4.26)

\[ \tilde{V}' = \tilde{V} + \epsilon_{V2} \]  \hspace{1cm} (4.27)

The $E_\Gamma$ inaccuracies in (4.24) are owed to the combined effect of inaccuracies in $G$ and the diagonal correction matrix $G_K$. The two together yield the total inaccuracy in $\Gamma$.

\[ \tilde{G} = G + E_G \]
\[ \tilde{G}_K = G_K + E_{G_K} \]
\[ \tilde{\Gamma} = \Gamma + E_\Gamma, \text{ where } E_\Gamma \equiv E_G + E_{G_K} \]  \hspace{1cm} (4.29)

Table 4.5 – Interface between the RAMSES software and the MSC platform

<table>
<thead>
<tr>
<th>Variable</th>
<th>Description</th>
<th>Data Type</th>
<th>I/O type</th>
</tr>
</thead>
<tbody>
<tr>
<td>t</td>
<td>time</td>
<td>1 × 1 double</td>
<td>I</td>
</tr>
<tr>
<td>updtDCorr</td>
<td>flag to denote update of the diag. corrections</td>
<td>1 × 1 bool</td>
<td>I</td>
</tr>
<tr>
<td>dCorr</td>
<td>diagonal corrections</td>
<td>4 × M double</td>
<td>I</td>
</tr>
<tr>
<td>iInj</td>
<td>flags to denote whether node has a current injection</td>
<td>1 × 2N bool</td>
<td>I</td>
</tr>
<tr>
<td>b</td>
<td>bus pseudo-current injection (rhs)</td>
<td>1 × 2N double</td>
<td>I</td>
</tr>
<tr>
<td>x</td>
<td>bus pseudo-voltage (solution)</td>
<td>1 × 2N double</td>
<td>O</td>
</tr>
</tbody>
</table>

### 4.3.6 Interface between the MSC and the RAMSES flow

Details about the actual internal memory structure of the GB_PU will not be given, since this will depend greatly on the actual final implementation. All the data needed for the functioning of the MSC are provided to the GB_PU through PC_MB_IF by the main CPU of the host PC. Then the GB_PU parses the data and generates the words necessary to drive all MSC components through the bus sets (gb_data, gb_address, gb_control) and (tpn_data, tpn_addr, tpn_ctrl).

During the initialization of the MSC, the power system data, the mapping data and the scenario data are transferred to the GB_PU. First, power system and mapping information is parsed and the topology is initialized on the MSC. This involves configuring the connectivity switches, the TPN values and the OPN values. The respective commands are issued by the GB_PU to the LC_PUs that are involved.

The scenario file is also parsed, so that the MSC knows when (at which time step) topological changes and faults are to be applied in the simulation process. For example, if at some point a three phase fault is to be applied at a bus of the system, in the electronic world this translates to a shunt connection of an OPN to ground at the electrical node that corresponds to the faulted bus. The parameters of the OPN are determined by the nature of the fault.

In the simulation process, the MSC is used within the RAMSES flow, as shown in Fig. 4.2. Since a Newton scheme is used for each time step, many internal (Newton) iterations may be required for convergence, and the MSC is invoked at every internal iteration. The interface between the MSC and the RAMSES software executed on the conventional CPU(s) of the platform is shown in Table 4.5. The I/O type column is as seen from the GB_PU side.

The time variable t allows the MSC to keep track of the progress of the simulation. If two consecutive invocations carry the same t, the new invocation concerns another internal Newton iteration of the same time step.

The diagonal corrections vector dCorr corresponds to the (linearized) dynamic behavior characteristics of power system elements that are connected to predefined buses. The indexes of the bus locations of these diagonal corrections are provided by RAMSES at the initialization step of the simulation, as part of the topology of the system. The update of the diagonal corrections is performed only when the `updtDCorr` flag is true. Normally, this happens only for a few iterations in the course of a RAMSES simulation.

The `b` variable contains the vector of the injections required by the current internal Newton iteration of the current time step in RAMSES. The type of each injection is determined by the `iInj` flag vector, which has true entries in positions where current injections are to be performed and false entries at voltage injections. After the \( I = \Gamma \cdot V \) system has been solved, the MSC writes the voltage output into the `x` vector.
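
As an illustration only, the software interface of Table 4.5 can be written down as a plain data structure. The Python sketch below is not part of RAMSES or of the MATLAB emulator: the class name `MscCall`, the helper names, and the one-bit encoding of the flags are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List

DOUBLE_BITS = 64  # IEEE 754 double precision
BOOL_BITS = 1     # assumption: each flag travels as a single bit


@dataclass
class MscCall:
    """One MSC invocation, mirroring the variables of Table 4.5."""
    t: float                  # simulation time (input)
    updtDCorr: bool           # update the diagonal corrections? (input)
    dCorr: List[List[float]]  # 4 x M diagonal corrections (input)
    iInj: List[bool]          # 1 x 2N, True marks a current injection (input)
    b: List[float]            # 1 x 2N pseudo-current injections, rhs (input)
    x: List[float] = field(default_factory=list)  # 1 x 2N solution (output)


def payload_bits(n_buses: int, n_injectors: int) -> int:
    """Bits exchanged per invocation, inputs and output counted together."""
    doubles = 1 + 4 * n_injectors + 2 * n_buses + 2 * n_buses  # t, dCorr, b, x
    bools = 1 + 2 * n_buses                                    # updtDCorr, iInj
    return doubles * DOUBLE_BITS + bools * BOOL_BITS


def is_new_time_step(previous_t: float, call: MscCall) -> bool:
    """A repeated t signals another Newton iteration of the same time step."""
    return call.t != previous_t
```

For N = 77 buses and M = 42 injectors, `payload_bits(77, 42)` evaluates to 30683 bits under these assumptions.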

4.3.7 Timing

The sequence of actions in an MSC invocation is as follows.

1. The CPU of the PC writes the interface variables to the `GB_PU` via the `PC_MB_IF`.
2. The MSC performs all auxiliary tasks (e.g. internal memory I/O) and computes the values to be dispatched as commands to the LCs.
3. If topological changes are scheduled, the `GB_PU` issues the topology reconfiguration commands to the LCs.
4. If the `updtDCorr` flag is true, the `GB_PU` issues OPN (re)configuration commands to the LCs.
5. The MSC issues to the LCs the commands to perform the updated DAC injections.
6. The MSC waits for the analog electronics to settle.
7. The MSC issues the read-ADC commands to the LCs.
8. The `GB_PU` sends the solution (`x` vector) back to the CPU of the PC via the `PC_MB_IF`.

The timing of the above steps depends heavily on the actual implementation. What follows is an attempt to gain some insight into it.

The time to complete steps (1) and (8), \( t_1 \) and \( t_8 \) respectively, depends on the speed of the `PC_MB_IF` connection. Based on the amount of data to be communicated, these times can be accurately estimated. Time \( t_2 \) of step (2) will depend on the actual operations performed by the `GB_PU` as well as the actual implementation of the latter, and only a rough estimate of it can be given. Times of steps (3), (4), (5), and (7), \( t_3, t_4, t_5 \) and \( t_7 \) can be accurately estimated.

The waiting time of step (6), \( t_6 \), is the time required for the analog electronics to reach their next steady state after the DAC injections have been applied. This is similar to \( t_{rd} \) in the FPPNS implementation. The RC phenomena of Fig. 3.26 also occur in the MSC, so a simplified RC model similar to (3.75) can be assumed.
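
The way the eight step times combine into the duration of one invocation can be sketched as follows. This is a minimal bookkeeping example, not a model of any actual implementation: the class `StepTimes`, its field names and the function `invocation_time` are hypothetical, and all durations are placeholders to be filled in by the designer.

```python
from dataclasses import dataclass


@dataclass
class StepTimes:
    """Durations of steps (1)-(8) in seconds; every value is a placeholder."""
    t1_write: float     # step 1: PC -> GB_PU transfer over PC_MB_IF
    t2_prepare: float   # step 2: auxiliary tasks and command preparation
    t3_topology: float  # step 3: topology reconfiguration (conditional)
    t4_opn: float       # step 4: OPN (re)configuration (conditional)
    t5_dac: float       # step 5: DAC injection commands
    t6_settle: float    # step 6: wait for the analog electronics to settle
    t7_adc: float       # step 7: ADC read commands
    t8_read: float      # step 8: GB_PU -> PC transfer over PC_MB_IF


def invocation_time(s: StepTimes, topology_change: bool, updt_dcorr: bool) -> float:
    """Total time of one MSC invocation; steps 3 and 4 count only when triggered."""
    total = s.t1_write + s.t2_prepare + s.t5_dac + s.t6_settle + s.t7_adc + s.t8_read
    if topology_change:
        total += s.t3_topology
    if updt_dcorr:
        total += s.t4_opn
    return total
```

Since steps (3) and (4) are executed only occasionally, the typical invocation is dominated by the communication steps, the DAC and ADC commands, and the analog settling time.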
Chapter 4. Concept future solver

4.4 Numerical results

An emulator of the MSC concept has been created in MATLAB. This emulator provides the complete functionality of the MSC in software. It is fully parameterizable in terms of the characteristics (accuracy, bit resolution, etc.) of the electronics that have been modeled. The results presented in this section have been obtained with this emulator.

4.4.1 Selection of mapping ratios

The first design choice is the selection of the mapping ratios. There are three ratios with two degrees of freedom, as per (3.21). The ratio selection is important because the physical limits of the electronics have to be respected, e.g. no mapped conductance may exceed the maximum achievable conductance of the reconfigurable potentiometers and the VCCSs. For this section the following limits have been assumed. They are arbitrary and will depend on the actual electronic realization.

\[
\max(g_{el}) = 1/1000 \text{ S} \tag{4.30}
\]

\[
\max(v_{el}) = 10 \text{ V} \tag{4.31}
\]

\[
\max(i_{el}) = 100 \text{ mA} \tag{4.32}
\]

Full use of the electronic conductance range is made when the highest power system admittance value \( \max(y_{ps}) \) is mapped to the maximum achievable electronic conductance. The admittance ratio is therefore as follows.

\[
\rho_Y = \frac{\max(g_{el})}{\max(y_{ps})} \tag{4.33}
\]

The \( \max(y_{ps}) \) value can be found easily by a search in the conductances of the given topology. Similarly, it makes sense to associate the maximum power system pseudo-voltage value to the actual maximum electronic voltage.

\[
\rho_V = \frac{\max(v_{el})}{\max(v_{ps})} \tag{4.34}
\]

According to (4.20), a good approximation for the maximum expected pseudo-voltage values is \( \max(v_{ps}) = 1 \text{ V} \).
Finally, the current mapping ratio is calculated through Ohm’s law of (3.21).

\[ \rho_I = \rho_Y \cdot \rho_V \tag{4.35} \]

This is the scheme that has been used in what follows.
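
The ratio selection of (4.33)-(4.35) can be sketched in a few lines of Python. The function names and the numerical values in the usage note are hypothetical; only the limits of (4.30)-(4.32) are taken from the text, and they appear as default arguments.

```python
def mapping_ratios(max_y_ps, max_v_ps, max_g_el=1e-3, max_v_el=10.0):
    """Return (rho_Y, rho_V, rho_I) following (4.33), (4.34) and (4.35)."""
    rho_y = max_g_el / max_y_ps  # largest admittance uses the full conductance range
    rho_v = max_v_el / max_v_ps  # largest pseudo-voltage uses the full voltage range
    rho_i = rho_y * rho_v        # the current ratio follows from Ohm's law
    return rho_y, rho_v, rho_i


def current_within_limit(max_i_ps, rho_i, max_i_el=0.1):
    """Check that the largest mapped injection respects the limit of (4.32)."""
    return max_i_ps * rho_i <= max_i_el
```

For example, an assumed topology with a largest admittance of 50 and a largest pseudo-voltage of 1 gives `mapping_ratios(50.0, 1.0)`, i.e. roughly `(2e-05, 10.0, 2e-04)`. Because the current ratio is not a free choice in this scheme, the largest injection has to be checked separately against the current limit, which is what `current_within_limit` does.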

4.4.2 Linear system solving

The first way to demonstrate the validity of the MSC concept is to show that it actually solves the problem it is supposed to solve. The 11-bus system of appendix A is used as a test case. Due to the random nature of the true full-scale conductance values, the tests presented in this section have been repeatedly executed so as to avoid probabilistic artifacts.

**Grid inaccuracies**

Branches and shunt elements of the power system are mapped onto TPNs and OPNs as per the previous sections. The topology of the power system is recreated in electronics by “virtually” closing the corresponding digital switches in the MATLAB emulator. As a result the \( \tilde{\Gamma} \) matrix is created. This \( \tilde{\Gamma} \) matrix already takes into account the inaccuracies of the electronics and the quantization, as explained in (4.24).

In Fig. 4.18 the relative error of \( \Gamma \) is plotted with respect to the bit resolution of the potentiometers and the VCCSs, as well as with respect to their assumed accuracy. The vertical axis corresponds to the quantity \( \| E_{\Gamma} \|_\infty / \| \Gamma \|_\infty \). The horizontal axis corresponds to different bit resolutions \( M_G \) for the reconfigurable conductances. Different lines correspond to different relative accuracies \( tol_{GFS} \) (in percent) for the full scale value \( G_{FS} \) of the potentiometers. The corresponding standard deviation is calculated as \( \sigma_{GFS} = tol_{GFS} / 3 \).

The effect of the relative accuracy is far more pronounced than the effect of the bit resolution. The error tends to increase for lower resolutions \( (M_G \leq 9) \), especially for low relative accuracies \( tol_{GFS} > 10\% \). The use of an extremely high resolution for the potentiometers is not deemed justifiable, as no additional benefit appears for \( M_G \geq 13 \).

A similar result is obtained for the inverse matrix \( \Gamma^{-1} \), as shown in Fig. 4.19. The dip in the relative error norm for \( M_G = 7 \) is coincidental. Again, for \( M_G > 10 \), and most clearly for \( M_G \geq 12 \), there is a residual error that cannot be reduced further by increasing the bit resolution of the reconfigurable electronics.

Based on the above, a design guideline of \( M_G \geq 12 \) bits is proposed for the reconfigurable conductances in the MSC.
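
The interplay between quantization and analog tolerance can be illustrated with a simplified stand-in for the emulator. The sketch below is not the model of (4.24): it quantizes individual conductances against a full-scale value whose true magnitude is drawn with \( \sigma_{GFS} = tol_{GFS} / 3 \), and it reports the largest element error relative to the largest target rather than a matrix norm. All function names are hypothetical.

```python
import random


def realised_conductance(g_target, g_fs_nominal, bits, tol_fs, rng):
    """Conductance delivered by an M_G-bit element with an imprecise full scale."""
    sigma = tol_fs / 3.0                            # sigma_GFS = tol_GFS / 3
    g_fs_true = g_fs_nominal * (1.0 + rng.gauss(0.0, sigma))
    levels = 2 ** bits - 1
    code = round(g_target / g_fs_nominal * levels)  # digital setting
    code = max(0, min(levels, code))                # clamp to the available codes
    return code / levels * g_fs_true


def relative_max_error(targets, g_fs_nominal, bits, tol_fs, seed=0):
    """Largest absolute element error divided by the largest target value."""
    rng = random.Random(seed)
    errors = [abs(realised_conductance(g, g_fs_nominal, bits, tol_fs, rng) - g)
              for g in targets]
    return max(errors) / max(abs(g) for g in targets)
```

With `tol_fs = 0` the error is bounded by half a quantization step, so it shrinks as the resolution grows. With a non-zero tolerance the full-scale error does not depend on the number of bits, which is one way to picture the residual error that remains at high resolutions.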

Figure 4.18 – Relative $\infty$-norm errors that are introduced in the electronic equivalent of the $\Gamma$ matrix

Figure 4.19 – Relative $\infty$-norm errors that are introduced in the electronic equivalent of the $\Gamma^{-1}$ matrix

Figure 4.20 – Relative $\infty$-norm errors that are introduced in the electronic equivalent of the $I$ vector

**Current injection inaccuracies**

A similar study can be performed for the currents that are injected by the IMNs of the MSC. Their inaccuracy is mainly controlled by two variables, the bit resolution $M_I$ and the full-scale relative accuracy $tol_{I_{FS}}$. The current mapping ratio is as per (4.35). A typical $I$ injection has been assumed (taken from a real-world scenario) and it has been mapped on the MSC. Results are summarized in Fig. 4.20. The vertical axis corresponds to the quantity $\|e_I\|_\infty / \|I\|_\infty$, i.e. the relative error in the current injection that is introduced by MSC inaccuracies. The horizontal axis corresponds to different bit resolutions $M_I$. Different lines correspond to different relative accuracies $tol_{I_{FS}}$.

From the figure it is seen that increasing the bit resolution of the DACs is beneficial for the overall accuracy. Beyond $M_I \geq 15$ the residual inaccuracy is mostly due to analog inaccuracy and cannot be decreased further.

Here an important remark should be made regarding the mapping ratios. The results of Fig. 4.20 are obtained by using the ratios of (4.33)-(4.35). In them, the full range of admittances and voltages is exploited by dynamically scaling the ratio in (4.33) and (4.34). The remaining current ratio is calculated by Ohm's law. Instead, similar dynamic scaling could be applied to the current ratio.

\[
\rho_I = \frac{\max(i_{el})}{\max(i_{ps})} \tag{4.36}
\]

Then the voltage ratio has to be calculated through Ohm's law, so that \( \rho_I = \rho_Y \cdot \rho_V \) of (4.35) still holds.

\[ \rho_V = \frac{\rho_I}{\rho_Y} \tag{4.37} \]

The benefit of this is that the entire range of the current DACs in the IMNs can be exploited. This becomes evident in Fig. 4.21. The figure presents the same results as Fig. 4.20, but using the dynamic current scaling of (4.36)-(4.37) instead. It can be seen that the error is far less dependent on the resolution than in the previous case.

Based on the above, a design guideline of \( M_I \geq 16 \) bits is proposed for the current DACs of the MSC.
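
One informal way to see why dynamic current scaling helps is to look at how much of the DAC range the largest injection actually reaches. The sketch below is illustrative only: the function names and the notion of "bits spanning the signal" are assumptions of this example and do not come from the emulator.

```python
import math


def dynamic_current_ratios(max_i_ps, rho_y, max_i_el=0.1):
    """Ratios when the current ratio, not the voltage ratio, is scaled dynamically."""
    rho_i = max_i_el / max_i_ps  # largest injection maps to the DAC full scale
    rho_v = rho_i / rho_y        # follows from rho_I = rho_Y * rho_V of (4.35)
    return rho_i, rho_v


def dac_range_used(max_i_ps, rho_i, max_i_el=0.1):
    """Fraction of the DAC full scale reached by the largest mapped injection."""
    return max_i_ps * rho_i / max_i_el


def bits_spanning_signal(m_i, fraction_used):
    """Informal count of DAC bits that span the signal when part of the range is idle."""
    return m_i + math.log2(fraction_used)
```

If a scaling scheme leaves the largest injection at a quarter of the full scale, `bits_spanning_signal(16, 0.25)` gives 14, i.e. two bits of a 16-bit DAC are effectively unused. With the dynamic current ratio the fraction used is 1 by construction, but the resulting voltage ratio still has to be checked against the voltage limit of the electronics.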

**Inaccuracies induced to the voltages due to grid and current injection inaccuracies**

The error \( \epsilon_{V1} \) in (4.26) is naturally affected by the assumed \( E_{\Gamma} \) and \( \epsilon_I \) inaccuracies. In turn, the latter depend on the assumed inaccuracy of the reconfigurable conductances and current DACs respectively, as shown earlier in this section. Fig. 4.22 clarifies this connection. The x-axis is the relative full-scale inaccuracy of the conductances \( tol_{GFS} \), the y-axis is the relative full-scale inaccuracy of the I-DACs \( tol_{I_{FS}} \). The z-axis corresponds to the quantity \( \| \epsilon_{V1} \|_\infty / \| V \|_\infty \).

From the figure it can be seen that $\epsilon_{V1}$ is less sensitive to $I$ inaccuracies than it is to $G$ inaccuracies. This insensitivity is more pronounced for more inaccurate $\Gamma$ grids (i.e. higher $G$ inaccuracy).

**Voltage measurement inaccuracies**

The last inaccuracy introduced to the final solution is $\epsilon_{V2}$, the one of the voltage ADCs of the IMNs. For the purpose of this study no analog inaccuracy has been included in the model of the ADC. Hence the entire $\epsilon_{V2}$ inaccuracy in the MATLAB model is due to quantization. Fig. 4.23 summarizes the results for $\|\epsilon_{V2}\|_{\infty} / \|V\|_{\infty}$.

A knee in the graph appears for $M_{V_{ADC}} \approx 11$. A much higher resolution offers little further reduction of the overall error. For the rest of this work a design guideline of $M_{V_{ADC}} \geq 14$ bits is proposed.

The tests presented in this subsection refer to an 11-bus system mapped on the MSC. The graphs presented in the section illustrate the trends on accuracy sufficiently well, since the same observations have been made for larger systems.

Figure 4.23 – Relative $\infty$-norm errors introduced to the voltage solution $\hat{V}'$ due to the quantization of the voltage ADC of the IMNs

Timing

Regarding the timing of the analog operation of the MSC, the model of Fig. 3.26 is assumed. In order to investigate the phenomenon a SPICE model of the MSC for the 11-bus system has been created. All electrical parameters (conductances, parasitic capacitances, etc.) of the SPICE model are modifiable at will. Fig. 4.24 shows the overview of the topology. For each power system bus, there are two electrical nodes, “a” and “b”. The 4-pole nature of TPNs, and the 2-pole nature of OPNs are also evident. Changes in the IMN current injections are modeled using pulse current sources, seen on the bottom of the figure. The current sources complete the change in their output in a time $t_{itr}$ (output slew).

A series of realistic current injection changes has been simulated in the SPICE model in order to investigate the RC phenomenon. All changes have been taken from realistic cases that occur in the simulation of transients in real power system scenarios. The same output slew rate $t_{itr}$ has been assumed for all current DACs and the same capacitance $C$ has been assumed at every electrical node. To put the study into context, Fig. 4.25 provides a thumb-rule diagram for the order of magnitude of parasitic capacitances in different electronic design paradigms.

Figs. 4.26 show voltage transient results for $t_{itr} = 10$ ns. The horizontal axes are time in s, the vertical axes are voltage in V. For each voltage trajectory, a “o” marker denotes the time instant at which the trajectory has reached 63.2% of the final steady-state value, a “+” marker denotes the time instant at which 90% of the steady-state value has been reached, a “*” marker denotes the time instant at which 95% of the steady-state value has been reached, and a “x” marker denotes the time instant at which 99% of the steady-state value has been reached.

Figure 4.24 – The schematic of the electronic equivalent of an 11-bus system created in OrCAD Capture

10 pF | inter-PCB connections
1 pF  | PCB connections
discrete ICs
100 fF | IC implementations
10 fF  | transistor gate C at 180nm CMOS

Figure 4.25 – Thumb-rule diagram for the order of magnitude of parasitic capacitances on different electronic design paradigms

As expected, different node voltages have different delays. Most trajectories follow an exponential trend without overshoot, hence the 1-pole RC model of section 3.3.1 can provide a reasonable general approximation of the behavior. An interesting phenomenon is observed in Fig. 4.26d. Because the analog grid responds so quickly, the limiting factor is the speed of the DACs. Indeed, all voltage curves follow a 10 ns ramp, which is exactly $t_{itr}$. DACs that are one order of magnitude faster, $t_{itr} = 1$ ns, are assumed for the next set of tests. Figs. 4.27 summarize the results. The responses are qualitatively similar. The same “saturation” effect is observed in Fig. 4.27c, where the voltage responses simply follow the 1 ns ramp of the current injections. By retaining the assumption of uniform parasitic capacitances of $C = 10$ fF, faster responses can be obtained with faster current DACs. Figs. 4.28 summarize the related results. For $t_{itr} = 200$ ps the slew rate of the DAC starts to be comparable with the speed of an inverter in a 180 nm CMOS process.

Table 4.6 summarizes the results of the study. Empirical figures have been included for the times at which the voltages reach 63.2%, 90%, 95% and 99% of the steady-state value (named $t_{0.63}$, $t_{0.90}$, $t_{0.95}$ and $t_{0.99}$ respectively).

The general trends observed in the study of this subsection can be assumed for systems of different size and topology. However, the exact figures of the timings are strongly dependent on the exact implementation of the MSC. Additionally, they are affected by the actual values of the reconfigurable conductances (potentiometers and VCCSs). Due to the latter, the RC phenomena are different for different power systems, since different conductances are mapped onto the electronics. Moreover, even for the same topology, a different $\rho_Y$ mapping ratio results in different electronic conductances through the value mapping and hence affects the RC timings. For these reasons, the figures presented above should be taken as general rules of thumb for the design process and not as strict metrics.
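
For orientation, the settling thresholds used in this subsection can be related to each other under the idealized assumption of a single-pole step response, \( v(t) = v_{\infty} (1 - e^{-t/\tau}) \). The sketch below ignores the finite ramp of the current sources, so it is only indicative; the function name is hypothetical.

```python
import math


def settle_time(tau, fraction):
    """Time for a single-pole step response to reach `fraction` of its final value."""
    return -tau * math.log(1.0 - fraction)


# Thresholds marked in Figs. 4.26-4.28, expressed in units of tau
thresholds = {p: settle_time(1.0, p) for p in (0.632, 0.90, 0.95, 0.99)}
```

Under this assumption the four thresholds are reached after about 1.0, 2.3, 3.0 and 4.6 time constants. Taking the first row of Table 4.6 as an example and reading $\tau \approx 47$ ns from $t_{0.63}$, the ideal model would give roughly 108 ns, 141 ns and 216 ns for the remaining thresholds, against the empirical 88 ns, 140 ns and 202 ns.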

Figure 4.26 – Transients of the node voltages of the MSC mapping of the 11-bus system using a current source slew rate of $t_{itr} = 10$ ns

Table 4.6 – Empirical settling times for different parasitic capacitances and current sources slew rates

<table>
<thead>
<tr>
<th>$t_{itr}$</th>
<th>C</th>
<th>$t_{0.63}$</th>
<th>$t_{0.90}$</th>
<th>$t_{0.95}$</th>
<th>$t_{0.99}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>10 ns</td>
<td>5 pF</td>
<td>47 ns</td>
<td>88 ns</td>
<td>140 ns</td>
<td>202 ns</td>
</tr>
<tr>
<td>10 ns</td>
<td>1 pF</td>
<td>14 ns</td>
<td>26 ns</td>
<td>33 ns</td>
<td>46 ns</td>
</tr>
<tr>
<td>10 ns</td>
<td>500 fF</td>
<td>10 ns</td>
<td>16 ns</td>
<td>19 ns</td>
<td>26 ns</td>
</tr>
<tr>
<td>10 ns</td>
<td>100 fF</td>
<td>7 ns</td>
<td>9.8 ns</td>
<td>10.5 ns</td>
<td>12 ns</td>
</tr>
<tr>
<td>1 ns</td>
<td>100 fF</td>
<td>1.36 ns</td>
<td>2.59 ns</td>
<td>3.24 ns</td>
<td>4.49 ns</td>
</tr>
<tr>
<td>1 ns</td>
<td>50 fF</td>
<td>0.98 ns</td>
<td>1.60 ns</td>
<td>1.93 ns</td>
<td>2.57 ns</td>
</tr>
<tr>
<td>1 ns</td>
<td>10 fF</td>
<td>710 ps</td>
<td>980 ps</td>
<td>1.05 ns</td>
<td>1.20 ns</td>
</tr>
<tr>
<td>500 ps</td>
<td>10 fF</td>
<td>400 ps</td>
<td>545 ps</td>
<td>610 ps</td>
<td>750 ps</td>
</tr>
<tr>
<td>200 ps</td>
<td>10 fF</td>
<td>195 ps</td>
<td>320 ps</td>
<td>386 ps</td>
<td>515 ps</td>
</tr>
</tbody>
</table>

Figure 4.27 – Transients of the node voltages of the MSC mapping of the 11-bus system using a current source slew rate of $t_{itr} = 1 \text{ ns}$

Figure 4.28 – Transients of the node voltages of the MSC mapping of the 11-bus system assuming uniform parasitic capacitances of $C = 10 \text{ fF}$

Table 4.7 – Free design parameters of the MSC and their effect on its operation

<table>
<thead>
<tr>
<th>Free design parameter</th>
<th>Description</th>
<th>Effect</th>
<th>Rule of thumb</th>
</tr>
</thead>
<tbody>
<tr>
<td>$M_G$</td>
<td>Bit resolution of the reconfigurable conductances</td>
<td>Affects overall accuracy through $E_{\Gamma}$ of (4.24) and $\epsilon_{V1}$ of (4.26)</td>
<td>$M_G \geq 12$</td>
</tr>
<tr>
<td>$M_I$</td>
<td>Bit resolution of the current DACs of IMNs</td>
<td>Affects overall accuracy through $\epsilon_I$ of (4.25) and $\epsilon_{V1}$ of (4.26)</td>
<td>$M_I \geq 16$</td>
</tr>
<tr>
<td>$M_V$</td>
<td>Bit resolution of the voltage ADCs of IMNs</td>
<td>Affects overall accuracy through $\epsilon_{V2}$ of (4.27)</td>
<td>$M_V \geq 14$</td>
</tr>
<tr>
<td>$tol_{GFS}$</td>
<td>Analog inaccuracy of the reconfigurable conductances</td>
<td>Affects overall accuracy through $E_{\Gamma}$ of (4.24) and $\epsilon_{V1}$ of (4.26)</td>
<td>see Fig. 4.22</td>
</tr>
<tr>
<td>$tol_{I_{FS}}$</td>
<td>Analog inaccuracy of the current DACs of IMNs</td>
<td>Affects overall accuracy through $\epsilon_I$ of (4.25) and $\epsilon_{V1}$ of (4.26)</td>
<td>see Fig. 4.22</td>
</tr>
<tr>
<td>$C$</td>
<td>Assumed uniform parasitic capacitance of electrical nodes of the MSC</td>
<td>Affects the time delay due to RC phenomena</td>
<td>see Table 4.6</td>
</tr>
<tr>
<td>$t_{itr}$</td>
<td>Output slew rate of the current DACs</td>
<td>Affects the time delay due to RC phenomena</td>
<td>see Table 4.6</td>
</tr>
</tbody>
</table>

4.4.3 Integration into the RAMSES flow

An arrangement similar to the one of Fig. 4.1 has been used to test the proposed architecture. The RAMSES engine (ver. 3.13) resided on a conventional modern desktop computer (Intel Core i7 2x2.80GHz, 8.0 GB RAM, Windows 7 64b). An emulator of the proposed MSC was created in MATLAB. The flow of the RAMSES simulator was interfaced with the MSC emulator according to Fig. 4.2. Calling of the MSC functionality was done using the software interface of Table 4.5.

The free design parameters of the MSC and their effect on its operation are summarized in Table 4.7. When integrated into RAMSES, all these parameters affect the functioning of the MSC as a component of the RAMSES flow. This effect is examined in this section.

A three-cycle (60 ms) three-phase fault is applied on bus #7 of the 11-bus system of appendix A at time $t_{\text{fault}} = 0.5$ s. The resulting voltage on the faulted bus is shown in Fig. 4.29. This result has been obtained with the RAMSES software, using a variable time step $h$ between 1 and 10 ms and the BDF numerical integration algorithm.

Imperfect linear system solving in the internal (Newton) iterations cannot make the solution of a given time step inaccurate. This is because the Newton scheme stops only when the zero of the non-linear function has been found within a specified tolerance. An imperfect solution of the linear sub-problem can, however, affect the scheme in two ways.

- It can result in a higher number of Newton iterations for a given time step.
- It can result in smaller steps being selected by the variable time step mechanism.

Both effects increase the total number of invocations of the linear system solving module. This increase has a potentially detrimental effect on the overall speed of the simulation. The outcome depends on the speed of linear system solving: if the speedup in the module alone is significant enough, the higher number of invocations can be counterbalanced and justified.
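
The first of the two effects can be reproduced on a toy problem. The sketch below is a scalar illustration and not the RAMSES scheme: it solves \( x^2 - 2 = 0 \) with Newton's method and scales every step by a fixed factor to mimic an imperfect solution of the linearized sub-problem. The function name, the test equation and the 20% error are assumptions of the example; the iteration limit of 40 mirrors the limit mentioned later in this section.

```python
def newton_sqrt2(step_error=0.0, tol=1e-10, max_iter=40):
    """Solve x**2 - 2 = 0 with Newton steps scaled by (1 + step_error).

    Returns the solution and the number of steps that were needed."""
    x = 1.0
    for k in range(max_iter):
        f = x * x - 2.0
        if abs(f) < tol:              # the stopping test uses the true residual
            return x, k
        dx = -f / (2.0 * x)           # exact solution of the linearized problem
        x += (1.0 + step_error) * dx  # perturbed solution that is actually applied
    raise RuntimeError("Newton scheme did not converge")


exact_x, exact_iters = newton_sqrt2(0.0)
inexact_x, inexact_iters = newton_sqrt2(0.2)
```

Both runs stop at a point that satisfies the same residual tolerance, so the accuracy of the accepted solution is unaffected. The perturbed run simply needs more steps, because the quadratic convergence of the exact scheme degrades to a linear one.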

Validation of the design guidelines

A set of thumb-rule design guidelines has been put forth in the previous section concerning the bit resolution of the conductances, the current DACs, and the voltage ADCs. A series of tests is conducted to verify these guidelines by integrating the MSC into RAMSES. Table 4.8 shows the related results. Six different design configurations have been created (A-F). The design parameters for each test case are shown in the respective rows of the table. The simulation that results in Fig. 4.29 was executed for each case. The number of resulting time steps for the time window $t = 0...1.0$ s is shown in column $N_t$. The last column, $N_{calls}$, shows the total number of MSC invocations for the same time window.

Case A has unrealistically good values for all design parameters (e.g. resolutions of 24 bits, no analog inaccuracy). The resulting $N_t$ and $N_{calls}$ values can be taken as reference for the MSC. In all other cases, there is one parameter, in bold typeface, which mildly violates the design guidelines of Table 4.7. The adverse effect of this violation is clearly seen in the last column.

Fig. 4.30 visualizes the phenomenon across time. The horizontal axis corresponds to simulated time and the vertical axis shows the number of times the MSC was invoked for the solution of the non-linear system at every time step. This corresponds to the number of (internal) Newton iterations for each time step. In most cases $N_{calls}$ increases significantly, while in one particular case (F) the simulation fails to converge, since the maximum number of Newton iterations (40) is reached in the second time step. From the above it can be concluded that a potential MSC implementation should respect the guidelines discussed previously.

Table 4.8 – Test cases to validate the design guidelines of Table 4.7

<table>
<thead>
<tr>
<th>test case</th>
<th>$M_G$</th>
<th>$M_I$</th>
<th>$M_V$</th>
<th>$tol_{GFS}$</th>
<th>$tol_{I_{FS}}$</th>
<th>$N_t$</th>
<th>$N_{calls}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>A</td>
<td>24</td>
<td>24</td>
<td>24</td>
<td>0%</td>
<td>0%</td>
<td>106</td>
<td>164</td>
</tr>
<tr>
<td>B</td>
<td><b>6</b></td>
<td>24</td>
<td>24</td>
<td>0%</td>
<td>0%</td>
<td>106</td>
<td>169</td>
</tr>
<tr>
<td>C</td>
<td>24</td>
<td><b>15</b></td>
<td>24</td>
<td>0%</td>
<td>0%</td>
<td>106</td>
<td>334</td>
</tr>
<tr>
<td>D</td>
<td>24</td>
<td><b>14</b></td>
<td>24</td>
<td>0%</td>
<td>0%</td>
<td>106</td>
<td>784</td>
</tr>
<tr>
<td>E</td>
<td>24</td>
<td>24</td>
<td><b>13</b></td>
<td>0%</td>
<td>0%</td>
<td>106</td>
<td>286</td>
</tr>
<tr>
<td>F</td>
<td>24</td>
<td>24</td>
<td><b>12</b></td>
<td>0%</td>
<td>0%</td>
<td>2 (divergence)</td>
<td>46</td>
</tr>
</tbody>
</table>

Figure 4.30 – MSC invocations for each time instant for the transient scenario on the 11-bus system

Exploration of a reasonable design space

A thorough analysis of the design space has been conducted for different bit resolutions and analog inaccuracies. The limits of this design space combine the design guidelines proposed and validated in the previous sections with implementation realism, i.e. only resolutions up to an implementable limit have been considered. The design space is as follows.

- $M_I = \{16, 18, 20\}$
- $M_V = \{16, 18, 20\}$
- $M_G = \{12, 14, 16, 18\}$
- $tol = tol_{GFS} = tol_{IFS} = \{0.5\%, 1\%, 5\%, 10\%, 15\%, 20\%\}$

Figure 4.31 – Average penalty on the number of Newton (internal) iterations for one time (external) iteration

The same scenario as in the previous subsection has been considered. Multiple runs have been conducted in order to cancel out probabilistic outliers.

Fig. 4.31 shows the penalty that is incurred on the number of times the linear system is solved, for different accuracy levels (horizontal axis). The software solution is taken as a reference. Results are averaged over all other free design parameters, $M_I, M_V, M_G$. For example, a value of +0.4 on the vertical axis means that, for the corresponding analog inaccuracy tolerance on the horizontal axis, the Newton scheme needs on average 0.4 more iterations to converge; this result is averaged over all other parameter variations.

The study concerns the time window from $t = 0.5$ s to $t = 1$ s of the response shown in Fig. 4.29. This choice is made because complex dynamics manifest in the network right after the three-phase fault is applied. The resulting non-linear systems require more than one Newton iteration to converge. Hence the effect of the inaccurate MSC linear system solution is better demonstrated.

Tables 4.9, 4.10, and 4.11 show the effect of analog inaccuracy with respect to the bit resolution of current injections, voltage measurements, and reconfigurable conductances respectively.

Table 4.9 – Effect of analog inaccuracy on the average penalty on Newton iterations with respect to the bit resolution of the current injections

<table>
<thead>
<tr>
<th>$M_I$</th>
<th>0.5%</th>
<th>1%</th>
<th>5%</th>
<th>10%</th>
<th>15%</th>
<th>20%</th>
<th>ave</th>
<th>inc. gain</th>
</tr>
</thead>
<tbody>
<tr>
<td>16</td>
<td>0.267</td>
<td>0.261</td>
<td>0.466</td>
<td>0.742</td>
<td>0.881</td>
<td>1.258</td>
<td>0.646</td>
<td></td>
</tr>
<tr>
<td>18</td>
<td>0.012</td>
<td>0.029</td>
<td>0.097</td>
<td>0.194</td>
<td>0.278</td>
<td>0.437</td>
<td>0.175</td>
<td>-0.471</td>
</tr>
<tr>
<td>20</td>
<td>0.012</td>
<td>0.029</td>
<td>0.094</td>
<td>0.188</td>
<td>0.272</td>
<td>0.378</td>
<td>0.162</td>
<td>-0.012</td>
</tr>
</tbody>
</table>

Table 4.10 – Effect of analog inaccuracy on the average penalty on Newton iterations with respect to the bit resolution of the voltage measurements

<table>
<thead>
<tr>
<th>$M_V$</th>
<th>0.5%</th>
<th>1%</th>
<th>5%</th>
<th>10%</th>
<th>15%</th>
<th>20%</th>
<th>ave</th>
<th>inc. gain</th>
</tr>
</thead>
<tbody>
<tr>
<td>16</td>
<td>0.142</td>
<td>0.150</td>
<td>0.261</td>
<td>0.397</td>
<td>0.508</td>
<td>0.715</td>
<td>0.362</td>
<td></td>
</tr>
<tr>
<td>18</td>
<td>0.012</td>
<td>0.088</td>
<td>0.193</td>
<td>0.380</td>
<td>0.454</td>
<td>0.645</td>
<td>0.308</td>
<td>-0.054</td>
</tr>
<tr>
<td>20</td>
<td>0.012</td>
<td>0.082</td>
<td>0.204</td>
<td>0.349</td>
<td>0.469</td>
<td>0.713</td>
<td>0.313</td>
<td>+0.005</td>
</tr>
</tbody>
</table>

For each table, the respective bit resolution and the analog accuracy are varied while all other design parameters are held constant. Numbers correspond to the average penalty on the number of Newton (internal) iterations for one time (external) iteration, as in Fig. 4.31. The last column of each table presents the gain obtained by increasing the bit resolution of the respective element (moving down the table). A minus sign corresponds to a decrease of the penalty, and hence to a "gain" from a performance point of view.
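
How the last two columns are derived from the others can be checked with a few lines of Python. The sketch below uses the penalty values of Table 4.9 as input; the function name and the dictionary layout are choices made for the example.

```python
def average_and_gain(rows):
    """Compute the 'ave' and 'inc. gain' columns of a penalty table.

    rows maps a bit resolution to its penalties, one per tolerance level.
    Returns {resolution: (average, incremental gain or None for the first row)}."""
    result = {}
    previous = None
    for resolution in sorted(rows):
        avg = sum(rows[resolution]) / len(rows[resolution])
        gain = None if previous is None else avg - previous
        result[resolution] = (avg, gain)
        previous = avg
    return result


# Penalties of Table 4.9 for tolerances 0.5%, 1%, 5%, 10%, 15% and 20%
table_4_9 = {
    16: [0.267, 0.261, 0.466, 0.742, 0.881, 1.258],
    18: [0.012, 0.029, 0.097, 0.194, 0.278, 0.437],
    20: [0.012, 0.029, 0.094, 0.188, 0.272, 0.378],
}
columns = average_and_gain(table_4_9)
```

Up to rounding, this reproduces the averages and incremental gains listed in Table 4.9, e.g. an average of about 0.646 for $M_I = 16$ and a gain of about -0.471 when moving to $M_I = 18$.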

From the tables the following observations can be drawn.

- The most influential parameter in the accuracy of the MSC is the resolution of the current injectors $M_I$.
- For every element, the gains from increasing the bit resolution fade out beyond a certain level.

Table 4.11 – Effect of analog inaccuracy on the average penalty on Newton iterations with respect to the bit resolution of the reconfigurable conductances

<table>
<thead>
<tr>
<th>$M_G$</th>
<th>0.5%</th>
<th>1%</th>
<th>5%</th>
<th>10%</th>
<th>15%</th>
<th>20%</th>
<th>ave</th>
<th>inc. gain</th>
</tr>
</thead>
<tbody>
<tr>
<td>12</td>
<td>0.070</td>
<td>0.109</td>
<td>0.235</td>
<td>0.387</td>
<td>0.455</td>
<td>0.650</td>
<td>0.318</td>
<td></td>
</tr>
<tr>
<td>14</td>
<td>0.088</td>
<td>0.103</td>
<td>0.214</td>
<td>0.356</td>
<td>0.457</td>
<td>0.784</td>
<td>0.334</td>
<td>+0.016</td>
</tr>
<tr>
<td>16</td>
<td>0.136</td>
<td>0.107</td>
<td>0.202</td>
<td>0.372</td>
<td>0.494</td>
<td>0.704</td>
<td>0.336</td>
<td>+0.002</td>
</tr>
<tr>
<td>18</td>
<td>0.095</td>
<td>0.107</td>
<td>0.226</td>
<td>0.385</td>
<td>0.502</td>
<td>0.626</td>
<td>0.323</td>
<td>-0.012</td>
</tr>
<tr>
<td>20</td>
<td>0.042</td>
<td>0.067</td>
<td>0.205</td>
<td>0.340</td>
<td>0.386</td>
<td>0.564</td>
<td>0.268</td>
<td>-0.055</td>
</tr>
</tbody>
</table>

- The bit resolution of the reconfigurable conductances $M_G$ seems to have only a minor influence on the overall accuracy. Given the “implementation cost” of a higher $M_G$, this observation can yield significant resource savings in a potential design.

- An increase in analog inaccuracy has an almost direct impact on the performance penalty, see Fig. 4.31. A quasi-linear sensitivity of approximately 0.03 additional iterations per percentage point of analog inaccuracy has been observed.

A proposed design choice and validation on a larger system

The actual design choice for all design parameters will depend on their implementation cost. This cost may take the form of more, larger, more complex, or more expensive electronics. As a realistic validation case the following has been selected.

- $M_I = 18$
- $M_V = 18$
- $M_G = 14$

This configuration is used to simulate a scenario on a much larger test case, the 77-bus system. A three-cycle (60 ms) three-phase fault is applied on bus #4072. Fig. 4.32 shows the voltage trajectory of the faulted bus. The fault is applied at $t = 1.0$ s and the time window of the simulation is from $t = 0.0$ s to $t = 2.0$ s.

The MSC configuration proposed above has been tested in the same simulation for two accuracy levels, $tol_{GFS} = tol_{I_{FS}} = 5\%$ and $tol_{GFS} = tol_{I_{FS}} = 1\%$. Fig. 4.33 presents the number of linear system solution operations across different time steps for the full-software simulator and for the MSC using the two accuracy levels. There are $N_{ls} = 406$ linear system solutions in the fully software run. There are $N_{MSC} = 615$ MSC invocations for the 5% accuracy case and $N_{MSC} = 538$ for the 1% case.

Figure 4.33 – MSC invocations for each time instant for the transient scenario on the 77-bus system

The time profiling of the software solution as given by RAMSES is as follows.

\begin{itemize}
  \item[a:] Local system building, factorization and Schur complement 0.0386s (23.47\%)
  \item[b:] Reduced system eliminations \& factorization (network) 0.0062s ( 3.76\%)
  \item[c:] Reduced system solution (network) 0.0056s ( 3.39\%)
  \item[d:] Injector evaluation and solution 0.0611s (37.17\%)
  \item[e:] Convergence checks at end of each N-I 0.0077s ( 4.67\%)
  \item[f:] Discrete events and other at end of time step 0.0039s ( 2.35\%)
  \item[g:] Discrete controller computation (DCTL) 0.0006s ( 0.35\%)
  \item[h:] Time step initialization 0.0316s (19.24\%)
  \item[i:] Remaining time 0.0092s ( 5.61\%)
\end{itemize}

Total elapsed time 0.1643s

The factorization (LU) and the solution of the factorized system of (4.5) are done in (b) and (c) respectively. An estimated 30% of the time spent in (a) concerns the $\tilde{D}$ matrix in (4.5). This estimate comes from an internal profiling of RAMSES. Hence, the work that would be taken over by MSC invocations corresponds to the times of (b) and (c) plus this part of the time of (a). In this example this is $t_{sw} = 0.0062 + 0.0056 + 0.3 \cdot 0.0386 \text{ s} = 23.38 \text{ ms}$.
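
The way $t_{sw}$ follows from the profile can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the dictionary and variable names are illustrative and are not part of RAMSES, and the 30% share of (a) is the estimate quoted above.

```python
# Profile entries copied from the RAMSES listing above (seconds).
profile_s = {
    "a": 0.0386,  # local system building, factorization and Schur complement
    "b": 0.0062,  # reduced system eliminations and factorization (network)
    "c": 0.0056,  # reduced system solution (network)
}
total_s = 0.1643    # total elapsed time of the software run
share_of_a = 0.30   # estimated part of (a) that concerns the D-tilde matrix

# Time spent in the tasks that an MSC invocation would replace.
t_sw_s = profile_s["b"] + profile_s["c"] + share_of_a * profile_s["a"]
t_sw_ms = 1e3 * t_sw_s
fraction_of_total = t_sw_s / total_s

print(f"t_sw = {t_sw_ms:.2f} ms ({100 * fraction_of_total:.1f}% of the run)")
```

The last printed number is the share of the total run that MSC acceleration can act upon, which bounds the achievable overall gain.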


The time profiling of an MSC invocation is as explained in section 4.3.6.

\[ t_{MSC} = t_{comm} + t_{iop} + t_{RC} \tag{4.38} \]

The grouping of times shown above is natural in the sense that the three groups depend on different causes. The communication time depends on the speed \( B_{comm} \) of the PC_MB_IF as well as on the total amount of data that has to be transmitted, \( D_{comm} \). The bandwidth of a PCI Express 3.0 x16 link is assumed, \( B_{comm} = 126 \) Gbit/s. The total amount of data that needs to be communicated at each MSC invocation can be calculated from table 4.5, where \( N = 77 \) is the number of buses, \( M = 42 \) is the number of bus injectors for the example in question, and the size of the double precision floating point format is 64 bits.

\[ D_{comm} = (1 + 4 \cdot M + 4 \cdot N) \cdot \text{sizeof}(\text{double}) + (1 + 2 \cdot N) \cdot \text{sizeof}(\text{bool}) = 30683 \text{ bits} \]  

(4.39)

The above give an estimate for the communication time.

\[ t_{comm} = \frac{D_{comm}}{B_{comm}} \approx 400 \text{ ns} \tag{4.40} \]
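
The data volume of (4.39) can be reproduced with a short Python sketch. The one-bit size of a boolean is an assumption inferred from the quoted total of 30683 bits, and the constant names are illustrative. The raw payload-over-bandwidth ratio computed at the end ignores any protocol overhead; it comes out below the figure of about 400 ns used in (4.40), so the value adopted in the text is the more conservative of the two.

```python
N_BUSES = 77               # N in the text
N_INJECTORS = 42           # M in the text
BITS_DOUBLE = 64           # double precision floating point
BITS_BOOL = 1              # assumption: one bit per boolean, as implied by (4.39)
B_COMM_BIT_PER_S = 126e9   # assumed PCI Express 3.0 x16 bandwidth

# Data exchanged per MSC invocation, following the structure of (4.39).
d_comm_bits = ((1 + 4 * N_INJECTORS + 4 * N_BUSES) * BITS_DOUBLE
               + (1 + 2 * N_BUSES) * BITS_BOOL)

# Raw transfer time without any protocol overhead.
raw_transfer_ns = 1e9 * d_comm_bits / B_COMM_BIT_PER_S

print(d_comm_bits)
print(f"{raw_transfer_ns:.1f} ns")
```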

The exact time for the internal MSC operations cannot be determined unless the exact architecture and program flow of the digital part of the computer is known. In any case it holds

\[ t_{iop} = N_{op} \cdot t_{clk} = \frac{N_{op}}{f_{clk}}, \tag{4.41} \]

where \( N_{op} \) is the number of operations, \( t_{clk} \) is the clock period and \( f_{clk} \) is the clock frequency of the digital part of the MSC. Naturally, \( N_{op} \) is proportional to the size of the system, since values concerning the elements of the system are moved across the MSC digital hierarchy (see Figs. 4.17 and 4.15). For the sake of quantifying \( t_{iop} \), it is assumed that

\[ N_{op} \approx 3 \cdot 2 \cdot N + 4 \cdot M + 50 = 680. \]  

(4.42)

This rough assumption is in line with table 4.5. An auxiliary slack of 50 has been added to account for tasks that are necessary for the digital functioning of the MSC. Additionally, a digital clock frequency of $f_{clk} = 1 \text{ GHz}$ is assumed. With current technology, higher frequencies can easily be achieved in speed-optimized digital ICs. With these assumptions $t_{iop}$ is as per (4.41).

$$t_{iop} \approx 680 \text{ ns}$$  \hspace{1cm} (4.43)

For the waiting time, table 4.6 is pertinent. The waiting time has to be selected according to the $t_{itr}$ and $C$ parameters of the final implementation. Given that the figures in the table only show an order of magnitude and not an exact result, a sufficient multiplicative safety margin should be taken. For the sake of quantification, in this example $t_{itr} = 1 \text{ ns}$ and $C = 100 \text{ fF}$ are assumed. The waiting time is then set to ten times the $t_{0.95}$ figure of the table.

$$t_{RC} = 10 \cdot 3.24 \text{ ns} \approx 35 \text{ ns}$$  \hspace{1cm} (4.44)

The total time of an MSC invocation would then be as follows.

$$t_{MSC} = t_{comm} + t_{iop} + t_{RC} \approx 1.2 \ \mu\text{s}$$  \hspace{1cm} (4.45)

This figure can be multiplied by the total number of MSC invocations in each case to yield an estimate for the total time spent for the MSC acceleration.

For the 5% accuracy case: $t_{hw} = N_{MSC} \cdot t_{MSC} = 615 \cdot 1.2 \ \mu\text{s} = 738 \ \mu\text{s}$  \hspace{1cm} (4.46)

For the 1% accuracy case: $t_{hw} = N_{MSC} \cdot t_{MSC} = 538 \cdot 1.2 \ \mu\text{s} \approx 646 \ \mu\text{s}$  \hspace{1cm} (4.47)
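
The bookkeeping of (4.45) to (4.47) can be sketched in Python as follows. The three component times are the estimates quoted in the text; the plain sum of these components is slightly below the rounded-up value of 1.2 µs that the text adopts, and the sketch uses the rounded value so as to reproduce the quoted totals. All names are illustrative.

```python
T_COMM_NS = 400.0   # communication time, (4.40)
T_IOP_NS = 680.0    # internal digital operations, (4.43)
T_RC_NS = 35.0      # analog waiting time, (4.44)

# Plain sum of the three contributions, in microseconds.
t_msc_raw_us = (T_COMM_NS + T_IOP_NS + T_RC_NS) / 1e3

# Rounded-up value adopted in (4.45); used below to reproduce the text.
T_MSC_US = 1.2

invocations = {"5%": 615, "1%": 538}   # N_MSC per accuracy level
t_hw_us = {case: n * T_MSC_US for case, n in invocations.items()}

for case, value in t_hw_us.items():
    print(f"accuracy {case}: t_hw = {value:.1f} us")
```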

These figures have to be compared with the time that is spent in equivalent tasks in the full-software solution $t_{sw} = 23.38 \text{ ms}$. For the linear algebra part only, this implies the following speedup.

$$S_{LA} = \frac{t_{sw}}{t_{hw}} = \frac{23.38 \text{ ms}}{738 \ldots 646 \ \mu\text{s}} \approx 31.7 \ldots 36.2$$  \hspace{1cm} (4.48)

In this particular example, the MSC is more than one order of magnitude faster than conventional software for the linear algebra part. If incorporated into the total timing of the solver, this would result in the following speedup.

\[
S_{\text{tot}} = \frac{T_{\text{old}}}{T_{\text{new}}} \approx \frac{164.3 \text{ ms}}{141.6 \text{ ms}} \approx 1.16 \tag{4.49}
\]
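
The two speedup figures can be cross-checked with a small Python sketch. It assumes that the new total time is obtained by replacing the software linear algebra time $t_{sw}$ with the MSC time $t_{hw}$, which reproduces the 141.6 ms of (4.49); the function and variable names are illustrative. The last quantity is the limiting speedup for an infinitely fast accelerator.

```python
T_OLD_MS = 164.3           # total elapsed time of the software run
T_SW_MS = 23.38            # linear algebra time that the MSC would replace
T_HW_MS = (0.646, 0.738)   # MSC time for the 1% and 5% accuracy cases


def total_speedup(t_old, t_replaced, t_accelerated):
    """Overall speedup when t_replaced is swapped for t_accelerated."""
    return t_old / (t_old - t_replaced + t_accelerated)


s_la = [T_SW_MS / t_hw for t_hw in T_HW_MS]
s_tot = [total_speedup(T_OLD_MS, T_SW_MS, t_hw) for t_hw in T_HW_MS]
s_limit = total_speedup(T_OLD_MS, T_SW_MS, 0.0)

print([round(s, 1) for s in s_la])
print([round(s, 3) for s in s_tot], round(s_limit, 3))
```

Even with a zero-time accelerator the overall speedup stays below about 1.17 in this example, because only the linear algebra share of the run is accelerated.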

As said, the figures that are showcased here can be exactly known only after an implementation of the MSC. It is up to the final designer of the system to make all engineering choices that best fit the implementation needs. As shown above, by making some moderate, reasonable assumptions, very significant performance benefits can be expected from the architecture. The benefit in the overall solution time is naturally limited by the share of linear system solving in the total time. It may then be beneficial to design power system algorithms that intentionally perform more linear system solving operations, given that the latter are computationally cheaper through the MSC.

### 4.5 Conclusions

In this chapter, a mixed-signal computer, the MSC, has been presented that is dedicated to linear algebra operations. To the knowledge of the author, the MSC is the most advanced and versatile linear-algebra-enabled computer that has been proposed. This work is also, again to the knowledge of the author, the most thorough presentation of the (linear algebra) computing properties of analog electronics. This chapter contains all the design-analysis tools for a future actual implementation of the MSC.

Naturally, the architecture as well as its analysis largely draws upon concepts presented in chapter 3. Structural reconfigurability is greatly enhanced mainly due to the intelligent interconnection scheme of Figs. 4.15-4.17. Much wider value flexibility is offered due to the use of VCCSs in the TPNs and OPNs and the communication bottleneck has been lifted by connecting the MSC directly to the motherboard of the host PC through PC\_MB\_IF - see scheme in Fig. 4.1. It can therefore be concluded that the design goals have been met since the virtues of the original have been retained, while all of the limitations identified in section 3.5.2 have been overcome.
5 Conclusions

It is out of the scope of this work to question the established computing paradigm that has been shaping the world for the past sixty years. Digital computing is here to stay for a reason. This thesis is merely a look back at analog computing given the advances in modern electronics. It has been quite some time since such retrospective research was last conducted. To the knowledge of the author, this is the most thorough study on the properties of analog computing for linear algebra purposes to this day. Due to the affinity of analog electronic circuits with power system topologies, analog computing has been applied to problems in the power system domain.

The realization presented in chapter 3, termed the Field Programmable Power Network System (FPPNS), is the most advanced computer in the world in this aspect. Performance-wise, tremendous benefits have been noted. Benefits come from both the analog part that handles the linear algebra computations and from the dedicated digital part that handles the solution of the differential algebraic equations of the simulation formulation.

Regarding the analog part, the most important lesson learned from the development and the use of the current prototype is that by definition analog computing cannot match the unrivaled reconfigurability and functional completeness of digital architectures. To put it differently, an analog computing setup is only good for what it has been exactly pre-configured for.

A further problem is that virtually all modern computing platforms are either completely digital, or at least digital-computer-oriented. Hence, for an analog computer to be able to coexist in such a digital ecosystem, bidirectional analog-to-digital electronic interfaces are required. Unsurprisingly, the latter can be as complex to build and operate as the analog computing core itself.

The future computer of chapter 4, termed the Mixed-Signal Computer (MSC), addresses the FPPNS limitations at a conceptual level. Three drastic changes in the design philosophy were key to that.

- The analog computer was brought closer to the digital host PC - see Fig. 4.1. Then the analog computer can operate as a dedicated accelerator while general purpose tasks are left to the host PC.

- Active electronics (e.g. VCCSs) have been used in the design of analog processing units. This is more “expensive” from an implementation point of view, but leads to far fewer computing constraints for the platform.

- An elaborate interconnection fabric has been proposed - see Figs. 4.15-4.17. This largely reduces most topological limitations, and hence greatly enlarges the set of compatible matrices.

The principle of coherency between a computing platform and the problem has been considered in every step of the hardware design process. The exact requirements were drawn from the RAMSES power system simulator software, which has been used as an application to host MSC acceleration. Exact evaluation of the MSC would require an implementation, which was out of the scope of this work. Instead, all the tools necessary for such an implementation have been provided in this work. Sample results show very significant performance benefits for the linear algebra operations, which translate into gains in the overall performance of the analysis.

5.1 Future work

In this section future extensions to this work are proposed. It is divided into two subsections, one dedicated to the already existing prototype of chapter 3 and another dedicated to the concept of chapter 4.

5.1.1 On the existing solver

Upgrade the communication system. In the current prototype a USB 2.0 connection is used. By upgrading it to a more advanced version, e.g. USB 3.0 or 3.1, significant gains could be made in the communication time, which is currently a major performance bottleneck.

Implement other power system applications. This could include power flow, quasi-steady state simulation, etc. For every case, the algorithm will have to be designed according to the capabilities of the FPPNS. In cases where repeated linear system solutions are required, e.g. in many voltage stability analysis algorithms, significant performance improvements can be attained.

Re-implement the system using higher scale integration. In the current prototype discrete IC components have been used on multiple PCBs. By moving towards higher integration, many benefits can be expected, e.g. smaller size, lower consumption, and fewer parasitics. Especially interesting would be the integration of the analog part of the FPPNS into ICs.

**Implement an implicit integration algorithm.** An implicit scheme would normally result in a non-linear system since it involves future values of the dynamic variables. The resulting non-linear system can be solved with a Newton scheme, similar to the one used in RAMSES. The Jacobian involved in the Newton scheme is very much related to the admittance matrix of the topology, which can be handled in the analog part of the FPPNS. Such a solution scheme is quite different from the current partitioned scheme that uses explicit integration - see Fig. 3.5. Hence, significant changes would be required on the digital side of the FPPNS as well.

**Integrate the FPPNS into the software framework of an existing power system simulation package.** The software API that wraps the hardware accelerator functionality already exists in the elab-tsaoct. Minor changes in the flow of the host software as well as in the FPPNS API might be required. On the software side, the simplifying assumptions required by the FPPNS should be taken into account. This integration procedure is similar to what is proposed in chapter 4, i.e. the integration of the MSC into the RAMSES flow.

### 5.1.2 On the conceptual future solver

**Implement the MSC.** All the tools that are required for the analysis of a potential MSC implementation have been introduced in chapter 4. The implementation architecture and underlying technology will determine all the final parameters such as the parasitic capacitances, hence the design can be made accordingly. The proposal of the author is to first implement a Local Cell with relatively small node capacity. Then if functionality of the LC prototype is validated, the LC can be reproduced and the global architecture can be extended as per Fig. 4.17.

**Use the MSC outside the power system domain.** Since the core of the MSC functionality is linear algebra operations, it can be used as a linear algebra accelerator, even outside the power system domain. Theoretically, arbitrary matrices can be mapped on the reconfigurable analog fabric and the required operation can be performed through IMN injections and measurements. This might create some complications, since the analog part of the MSC is tailored for power system domain operations and it has been designed by taking into account the analogy of table 2.3. There are many cases where similar analogies can be found, e.g. in cases where the physical system underlying the matrix is a flow network. In the general case, this cannot be guaranteed and hence the matrix might not be mappable to the MSC.

**Extend the MSC functionality to solve** (4.5c). As seen in the time profiling of the RAMSES flow in section 4.4.3, a significant amount of time is spent in “Injector evaluation and solution” (approx. 37%). This concerns the factorization and solution of the linear systems of (4.5c). The latter refer to the dynamic injectors to the grid and their controllers. The sparsity pattern of the $A_i$ matrix for a synchronous machine is shown in Fig. 5.1. This is a result that comes from the differential equations that govern the non-linear behavior of the injector. For the same type of equipment, e.g. synchronous machine, wind turbine, etc., the size of the matrix is always the same. The same holds for the non-zero pattern. This is due to the ordering of the equations and the dynamic variables that is the same in RAMSES across injectors of the same type.

One can envision analog electronic entities, similar to TPNs and OPNs, that map the matrices $A_i$ of the injector linear systems. Since the non-zero pattern and the size are fixed, a corresponding electronic topology can be derived. A bank of such electronic entities can exist in the MSC and each injector can be mapped onto one of them. At the corresponding point of the flow of RAMSES (see Fig. 4.2) the solution of the (4.5c) systems can be performed fully in parallel on the MSC.

**Modify RAMSES to exploit more the linear algebra prowess of the MSC.** For every change in the $\tilde{D}$ matrix, mathematically its $L \cdot U$ factors also change. However, given the computational cost of matrix decompositions, what is usually done is to

- update the $\tilde{D}$ matrix as seldom as possible, and also
- refactorize it as seldom as possible.

This is referred to as a (very) dishonest Newton scheme. The penalty incurred by this strategy is that sometimes the convergence of the Newton solution is slowed down, and more iterations are needed.

This dishonest procedure is not required for the MSC since the “refactorization” of the matrix is not necessary. A change in the matrix translates into a closing or an opening of an electronic switch, or the reconfiguration of the value of one of the electronic components. This procedure is almost instantaneous. Hence it makes sense to update the $D$ matrix as often as it would be deemed beneficial to reducing the total number of Newton (internal) and possible time (external) iterations.
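
The convergence penalty of a dishonest scheme can be illustrated on a scalar toy problem. The sketch below is not taken from RAMSES: the function, the starting point and the helper name `newton_solve` are chosen purely for illustration. The honest variant re-evaluates the derivative at every iteration, while the dishonest one keeps the derivative computed at the starting point, which is the scalar analogue of never refactorizing the Jacobian.

```python
def newton_solve(f, dfdx, x0, refresh_derivative, tol=1e-10, max_iter=200):
    """Return (root, iterations); the derivative stays frozen at x0
    when refresh_derivative is False (dishonest scheme)."""
    x = x0
    slope = dfdx(x0)
    for iteration in range(1, max_iter + 1):
        if refresh_derivative:
            slope = dfdx(x)          # honest scheme: update every iteration
        x = x - f(x) / slope
        if abs(f(x)) < tol:
            return x, iteration
    raise RuntimeError("no convergence within max_iter iterations")


def f(x):
    return x ** 3 - 2.0 * x - 5.0    # illustrative non-linear equation


def dfdx(x):
    return 3.0 * x ** 2 - 2.0


honest_root, honest_iterations = newton_solve(f, dfdx, 3.0, True)
dishonest_root, dishonest_iterations = newton_solve(f, dfdx, 3.0, False)

print(honest_iterations, dishonest_iterations)
```

Both variants reach the same root, but the frozen-derivative run needs noticeably more iterations. On a conventional processor each of those iterations is cheap because no refactorization is needed, which is the trade-off that the MSC would remove.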

In the same line, an entirely different flow for the host power system algorithm can be considered. The more it relies on linear algebra operations, the more the benefit from MSC acceleration.

5.2 Final discussion

A very good question would be the following: What is the outlook of analog computing in general, and in the linear algebra and power system domains in particular? The question becomes more important and interesting if the modern barriers in digital computers are considered, e.g. the stagnation of the performance increase law and power dissipation considerations.

It is the opinion of the author that the answer to these questions is related to three things,

- the existence of an analogy between the problem to be solved, and a real-life physical phenomenon that could potentially be reproduced on the analog computer
- the required “general purpose-ness”, and
- the level of required accuracy.

The first point determines whether analog computing is altogether available for a given problem. For the second point, the more general purpose a computer needs to be, the less space there is for analog in its design. Lastly, for the third point, the harder the accuracy requirements are in a domain, the less pertinent analog computing is for it.

Combining the above, it is here postulated that a future for analog computers may exist as hardware accelerators for operations for which accuracy requirements are soft. This is similar to the MSC that has been proposed in chapter 4 of this work. The power system domain is particularly promising for such architectures since there exists an analogy between the topologies of an analog circuit and a power system. It is up to advances in electronics to provide the level of required analog accuracy and low parasitic effects so as to make such a system a viable alternative to digital (co-)processors.
Test power system topologies

An overview of the systems that have been used as test cases throughout this work is presented in table A.1. These systems have all been regularly used for different studies. They concern systems of different complexity and simplifying assumptions. For example, the systems rc3, lanz5, fabre18, fabre36, padiyar39e and fabre59 all have the assumption $G = 0$. This assumption is valid for rougher studies at the transmission level. The systems fabre36 and fabre59 have been generated by modifying ieee30 and ieee57 respectively, by adding some branches of almost zero impedance (node duplication). The system padiyar39e is a version of padiyar39 with conductances neglected. These systems also have no shunt components and no phase-shifting transformers. As a result, their $Y = B$ matrices are actually singular, since for every row $i$ it holds that $\sum_{j=1}^{N} Y[i, j] = 0$.
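
The singularity statement can be checked on a toy example. The Python sketch below builds the matrix of a hypothetical three-bus ring without shunts; the branch values are invented for illustration and are not the data of rc3, and the sign convention of the susceptances is left aside. Every row sums to zero, so the vector of ones lies in the null space and the determinant vanishes.

```python
def branch_matrix(n_buses, branches):
    """Nodal matrix of a network without shunts: each branch value is
    added to both diagonal entries and subtracted off-diagonal."""
    matrix = [[0.0] * n_buses for _ in range(n_buses)]
    for i, j, value in branches:
        matrix[i][i] += value
        matrix[j][j] += value
        matrix[i][j] -= value
        matrix[j][i] -= value
    return matrix


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Hypothetical branch values (from bus, to bus, value).
branches = [(0, 1, 1.0), (0, 2, 2.0), (1, 2, 4.0)]
B = branch_matrix(3, branches)
row_sums = [sum(row) for row in B]

print(row_sums, det3(B))
```

Adding a single shunt element to any bus breaks the zero row sum and, in this toy case, makes the matrix non-singular.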

For the aforementioned systems, the properties of the $N \times N$ complex admittance matrix $Y$ have been identified. Table A.2 presents the respective results. All matrices are structurally symmetric, as discussed earlier, so the corresponding column is omitted. Naturally, the more demanding property conditions are harder to meet for larger systems, e.g. symmetry conditions are not met for the systems poland2383, poland2736, poland2746, poland3374 and peg15226 due to the presence of phase-shifting transformers.
### Appendix A. Test power system topologies

Table A.1 – General overview of systems to be examined

<table>
<thead>
<tr>
<th>Name-Size</th>
<th>Branch count</th>
<th>Shunt count</th>
<th>nnz</th>
<th>Sparsity</th>
</tr>
</thead>
<tbody>
<tr>
<td>rc3 [299]</td>
<td>3</td>
<td>0</td>
<td>9</td>
<td>100%</td>
</tr>
<tr>
<td>gs4 [300]</td>
<td>4</td>
<td>0</td>
<td>12</td>
<td>75%</td>
</tr>
<tr>
<td>lanz5 [301]</td>
<td>5</td>
<td>0</td>
<td>15</td>
<td>60%</td>
</tr>
<tr>
<td>powell5 [37]</td>
<td>7</td>
<td>0</td>
<td>19</td>
<td>76%</td>
</tr>
<tr>
<td>ww6 [302]</td>
<td>11</td>
<td>0</td>
<td>28</td>
<td>77.8%</td>
</tr>
<tr>
<td>chow9 [303]</td>
<td>9</td>
<td>0</td>
<td>27</td>
<td>33.3%</td>
</tr>
<tr>
<td>wsc9 [304]</td>
<td>9</td>
<td>4</td>
<td>27</td>
<td>33.3%</td>
</tr>
<tr>
<td>kundur11 [305]</td>
<td>12</td>
<td>2</td>
<td>31</td>
<td>25.6%</td>
</tr>
<tr>
<td>ieee14 [306]</td>
<td>20</td>
<td>1</td>
<td>54</td>
<td>27.6%</td>
</tr>
<tr>
<td>fabre18 ¹</td>
<td>24</td>
<td>0</td>
<td>66</td>
<td>20.4%</td>
</tr>
<tr>
<td>ieee24 [307]</td>
<td>38</td>
<td>1</td>
<td>92</td>
<td>16.0%</td>
</tr>
<tr>
<td>alsac30 [308]</td>
<td>41</td>
<td>2</td>
<td>112</td>
<td>12.4%</td>
</tr>
<tr>
<td>ieee30 [306]</td>
<td>41</td>
<td>2</td>
<td>112</td>
<td>12.4%</td>
</tr>
<tr>
<td>fabre36 ²</td>
<td>46</td>
<td>0</td>
<td>128</td>
<td>9.9%</td>
</tr>
<tr>
<td>ne39 [309]</td>
<td>46</td>
<td>0</td>
<td>131</td>
<td>8.6%</td>
</tr>
<tr>
<td>padiyar39 [33]</td>
<td>46</td>
<td>0</td>
<td>131</td>
<td>8.6%</td>
</tr>
<tr>
<td>padiyar39e ³</td>
<td>46</td>
<td>0</td>
<td>131</td>
<td>8.6%</td>
</tr>
<tr>
<td>ieee57 [306]</td>
<td>80</td>
<td>3</td>
<td>213</td>
<td>6.6%</td>
</tr>
<tr>
<td>fabre59 ⁴</td>
<td>70</td>
<td>0</td>
<td>199</td>
<td>5.7%</td>
</tr>
<tr>
<td>nordic77 [310]</td>
<td>105</td>
<td>11</td>
<td>253</td>
<td>4.3%</td>
</tr>
<tr>
<td>lel101 [311]</td>
<td>110</td>
<td>0</td>
<td>315</td>
<td>3.1%</td>
</tr>
<tr>
<td>ieee118 [306]</td>
<td>186</td>
<td>14</td>
<td>476</td>
<td>3.4%</td>
</tr>
<tr>
<td>ieee300 [306]</td>
<td>411</td>
<td>29</td>
<td>1118</td>
<td>1.2%</td>
</tr>
<tr>
<td>poland2383 [303]</td>
<td>2896</td>
<td>0</td>
<td>8155</td>
<td>0.14%</td>
</tr>
<tr>
<td>poland2736 [303]</td>
<td>3269</td>
<td>1</td>
<td>9262</td>
<td>0.12%</td>
</tr>
<tr>
<td>poland2746 [303]</td>
<td>3279</td>
<td>0</td>
<td>9292</td>
<td>0.12%</td>
</tr>
<tr>
<td>poland3012 [303]</td>
<td>3572</td>
<td>9</td>
<td>10144</td>
<td>0.11%</td>
</tr>
<tr>
<td>poland3120 [303]</td>
<td>3693</td>
<td>9</td>
<td>10488</td>
<td>0.11%</td>
</tr>
<tr>
<td>poland3374 [303]</td>
<td>4161</td>
<td>9</td>
<td>11510</td>
<td>0.10%</td>
</tr>
<tr>
<td>peg15226 [40]</td>
<td>21492</td>
<td>1373</td>
<td>55060</td>
<td>0.02%</td>
</tr>
</tbody>
</table>

¹ fabre18 is a modified version of ieee14
² fabre36 is a modified version of ieee30
³ padiyar39e is a modified version of padiyar39
⁴ fabre59 is a modified version of ieee57
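
The "Sparsity" column of table A.1 appears to be the fill ratio $nnz / N^2$, i.e. the number of non-zero entries over the total number of entries, with $N$ the bus count embedded in the system name. This reading is an inference from the tabulated numbers rather than a definition given in the text. A minimal Python sketch, with an illustrative function name, reproduces three of the rows.

```python
def fill_ratio(nnz, n_buses):
    """Share of non-zero entries in an n_buses x n_buses matrix."""
    return nnz / (n_buses * n_buses)


# (nnz, N) pairs copied from table A.1.
samples = {"rc3": (9, 3), "ieee14": (54, 14), "nordic77": (253, 77)}

for name, (nnz, n_buses) in samples.items():
    print(f"{name}: {100 * fill_ratio(nnz, n_buses):.1f}%")
```
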
Table A.2 – General linear algebraic properties of the admittance matrix $Y$ of power system test cases

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>rc3</td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
</tr>
<tr>
<td>gs4</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
</tr>
<tr>
<td>lanz5</td>
<td>✓</td>
<td>✓</td>
<td></td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>powell5</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
</tr>
<tr>
<td>ww6</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
</tr>
<tr>
<td>chow9</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>wsc9</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>kundur11</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>ieee14</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>fabre18</td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>ieeerts24</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>alsac30</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>ieee30</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>fabre36</td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>ne39</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>padiyar39</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>padiyar39e</td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>ieee57</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>fabre59</td>
<td>✓</td>
<td>✓</td>
<td>✓</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>nordic77</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>lel101</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>ieee118</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>ieee300</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>✓</td>
<td>✓</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>poland2383</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>poland2736</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>poland2746</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>poland3012</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>poland3120</td>
<td>✓</td>
<td>-</td>
<td>-</td>
<td></td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>✓</td>
</tr>
<tr>
<td>poland3374</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>peg15226</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
</tbody>
</table>
Common modeling assumptions in power systems

In this appendix the mathematical background of some common power system modeling assumptions is presented.

Nomenclature

The following nomenclature conventions are adopted.

- Small letters are used to denote real or complex scalars, e.g. $x = 1.1$.
- Bold small letters are used to denote phasors, e.g. $\mathbf{x} = x \cdot e^{j\phi}$.
- Capital letters are used for matrices and vectors (of real or complex scalars), e.g. $X_{123} = [x_1 \ x_2 \ x_3]^T$. Vectors are always considered as column vectors.
- A bar over a letter is used to denote quantities in the per unit system, e.g. $\bar{x} = x / x_{\text{base}}$.
- Combinations of the above are possible, e.g. a bold capital letter with a bar is used for a vector of phasors in the per unit system $\bar{\mathbf{X}}_{123} = [\bar{x}_1 \ \bar{x}_2 \ \bar{x}_3]^T = [\bar{x}_1 \cdot e^{j\phi_1} \ \bar{x}_2 \cdot e^{j\phi_2} \ \bar{x}_3 \cdot e^{j\phi_3}]^T = [(x_1 / x_{\text{base}}) \cdot e^{j\phi_1} \ (x_2 / x_{\text{base}}) \cdot e^{j\phi_2} \ (x_3 / x_{\text{base}}) \cdot e^{j\phi_3}]^T$.

B.1 Phasor representation

Let $x(t)$ be a time-domain waveform. According to the generalized averaging method [312], in the interval $\tau \in (t - T, t]$ the waveform can be approximated as follows.

$$x(\tau) = \sum_{k=-\infty}^{\infty} \langle x \rangle_k(t) \cdot e^{j \cdot k \omega_s \tau}$$  \hspace{1cm} (B.1)

where \( \omega_s = \frac{2\pi}{T} \) is the sampling frequency and \( \langle x \rangle_k(t) \) are the Short-Time Fourier series coefficients, given by

\[
\langle x \rangle_k(t) = \frac{1}{T} \int_{t-T}^{t} x(\tau) \cdot e^{-j \cdot k \omega_s \tau} d\tau.
\] (B.2)

The above is in essence a time-frequency representation of the original signal, as the signal in (B.1) is written as a sum of frequency-related \((k \cdot \omega_s)\), time-varying coefficients. In power system literature these coefficients are called dynamic phasors and they are commonly used to obtain state-space models in which the phasors are the state variables [313, 314].

In the general case, when \( x(t) \) is complex, the following holds.

\[
\langle x \rangle_{-k}(t) = \left( \langle x^* \rangle_k(t) \right)^* \quad \text{(B.3)}
\]

For real waveforms this simplifies further to

\[
\langle x \rangle_{-k}(t) = \left( \langle x \rangle_k(t) \right)^*.
\] (B.4)

Then (B.1) becomes

\[
x(t) = \langle x \rangle_0(t) + \sum_{k=1}^{\infty} \left[ \langle x \rangle_k(t) \cdot e^{j \cdot k \omega_s t} + \left( \langle x \rangle_k(t) \cdot e^{j \cdot k \omega_s t} \right)^* \right]
\] (B.5)

\[
= \langle x \rangle_0(t) + 2 \cdot \sum_{k=1}^{\infty} \Re \left\{ \langle x \rangle_k(t) \cdot e^{j \cdot k \omega_s t} \right\}.
\] (B.6)
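
As a numerical sanity check of (B.2) and (B.6), the Python sketch below approximates the coefficients of a constant-amplitude cosine with a midpoint rule and rebuilds the sample $x(t)$ from them. The test signal, the evaluation instant and all names are hypothetical. With the $1/T$ normalization of (B.2), the $k = 1$ coefficient of a cosine of amplitude $A$ and phase $\phi$ evaluates to $(A/2) \cdot e^{j\phi}$, and the factor 2 in (B.6) restores the full amplitude.

```python
import cmath
import math


def short_time_fourier_coefficient(x, k, t, period, samples=2000):
    """Approximate (B.2) with a midpoint rule over the window (t - period, t]."""
    omega_s = 2.0 * math.pi / period
    dt = period / samples
    total = 0j
    for n in range(samples):
        tau = t - period + (n + 0.5) * dt
        total += x(tau) * cmath.exp(-1j * k * omega_s * tau) * dt
    return total / period


AMPLITUDE, PHASE, FREQUENCY_HZ = 1.5, 0.4, 50.0   # hypothetical test signal
OMEGA = 2.0 * math.pi * FREQUENCY_HZ
PERIOD = 1.0 / FREQUENCY_HZ


def signal(tau):
    return AMPLITUDE * math.cos(OMEGA * tau + PHASE)


t_now = 0.013
c0 = short_time_fourier_coefficient(signal, 0, t_now, PERIOD)
c1 = short_time_fourier_coefficient(signal, 1, t_now, PERIOD)
c2 = short_time_fourier_coefficient(signal, 2, t_now, PERIOD)

# Reconstruction according to (B.6), truncated after k = 1.
x_rebuilt = c0.real + 2.0 * (c1 * cmath.exp(1j * OMEGA * t_now)).real

print(abs(c1), cmath.phase(c1), x_rebuilt, signal(t_now))
```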

We assume a sinusoidal waveform.

\[
x(t) = |x|(t) \cdot \cos \left( \omega(t) \cdot t + \phi(t) \right)
\] (B.7)

Another common assumption is that the parameters of the waveform above are time-invariant.

\[ |x|(t) = |x| = \text{const} \]
\[ \omega(t) = \omega = \text{const} \]
\[ \phi(t) = \phi = \text{const} \]

Then, for \( \omega_s = \omega \) it can be verified from (B.2) that

\[ \langle x \rangle_1(t) = \frac{|x|}{2} \cdot e^{j\phi} = \text{const} \]  \( \text{(B.8)} \)
\[ \langle x \rangle_{k}(t) = 0 \quad \text{for } |k| \neq 1 \]  \( \text{(B.9)} \)

while \( \langle x \rangle_{-1}(t) \) follows from (B.4). Up to a constant scaling factor, it is this first coefficient that is used as the phasor of the signal, \( \mathbf{x} \propto \langle x \rangle_1(t) \). Under normal circumstances a single phasor is enough to describe a signal of the real system. This phasor is the first coefficient of the Short-Time Fourier Series expansion of the signal [313].

The phasor representation of a signal is slowly varying compared to the signal itself given that the bandwidth of the latter is (usually much) smaller than its carrier frequency; for more on that and on the relation between a signal, its Hilbert transform and its phasor see also [315].

Phasors denote quantities that rotate in the counterclockwise direction with the nominal frequency of the system, e.g. 50 Hz for Europe. A common convention for the magnitude of the phasor is to use the RMS value of the sinusoid rather than its maximum value [316]. An example of a phasor is presented hereunder.

\[ x(t) = \sqrt{2} \cdot x \cdot \cos(\omega \cdot t + \phi_x) \rightarrow \mathbf{x} = x \cdot e^{j\phi_x} \]

The phasor form above is called the exponential form and it is equivalent to the rectangular form and the polar form, shown hereunder.

\[ \mathbf{x} = x \cdot e^{j\phi_x} = Re[\mathbf{x}] + j \cdot Im[\mathbf{x}] = x \angle \phi_x \]  \( \text{(B.10)} \)

B.2 Balance in electrical quantities

An arbitrary three-phase voltage vector is given by the following.

\[ \mathbf{V}_{abc} = \begin{bmatrix} \mathbf{v}_a \\ \mathbf{v}_b \\ \mathbf{v}_c \end{bmatrix} = \begin{bmatrix} v_a e^{j \phi_a} \\ v_b e^{j \phi_b} \\ v_c e^{j \phi_c} \end{bmatrix} \]  \hspace{1cm} (B.11)

According to the symmetrical components theory introduced by Fortescue in his seminal paper [317] a three-phase system can be resolved into three balanced systems. First the Fortescue operator \( \alpha \) is defined, which denotes a positive phase shift of 120°.

\[ \alpha := e^{j \frac{2\pi}{3}} \]  \hspace{1cm} (B.12)

Also

\[ \alpha^2 = e^{j \frac{4\pi}{3}} = \alpha^* \]  \hspace{1cm} for a shift of 240°, \hspace{1cm} (B.13)
\[ \alpha^3 = e^{j 2\pi} = 1 \]  \hspace{1cm} for a shift of 360°. \hspace{1cm} (B.14)

A transformation matrix is then defined as follows.

\[ T_{0+-}^{abc} := \begin{bmatrix} 1 & 1 & 1 \\ 1 & \alpha^2 & \alpha \\ 1 & \alpha & \alpha^2 \end{bmatrix} \] (B.15)

Left multiplication with \( T_{0+-}^{abc} \) represents a change of basis of the vectors in \( \mathbb{C}^3 \). So, \( T_{0+-}^{abc} \) can be thought of as a change-of-basis matrix from the symmetrical components to the three-phase domain. In basis transformation matrices the subscript is the starting frame of reference and the superscript is the destination frame of reference.

Left multiplying with the inverse of the transformation matrix performs exactly the opposite transformation, from the three-phase to the symmetrical components domain.

\[ T_{abc}^{0+-} = \left( T_{0+-}^{abc} \right)^{-1} = \frac{1}{3} \begin{bmatrix} 1 & 1 & 1 \\ 1 & \alpha & \alpha^2 \\ 1 & \alpha^2 & \alpha \end{bmatrix} \] (B.16)

The following also hold.

- for the transpose: \[ \left( T_{0+-}^{abc} \right)^T = T_{0+-}^{abc} \] (B.17)
- for the complex conjugate: \[ \left( T_{0+-}^{abc} \right)^* = 3 \cdot T_{abc}^{0+-} \] (B.18)
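The properties of the Fortescue operator and of the transformation matrix can be verified numerically. The sketch below is a minimal Python illustration under the assumption that the matrix of (B.15) maps symmetrical components to phase quantities and that of (B.16) performs the opposite mapping; the names `T`, `T_INV`, `matmul` and `close` are hypothetical helpers.

```python
import cmath
import math

ALPHA = cmath.exp(2j * math.pi / 3)  # Fortescue operator, +120 degrees

# Symmetrical components -> three-phase, as in (B.15).
T = [[1, 1, 1],
     [1, ALPHA**2, ALPHA],
     [1, ALPHA, ALPHA**2]]

# Three-phase -> symmetrical components, as in (B.16).
T_INV = [[1 / 3, 1 / 3, 1 / 3],
         [1 / 3, ALPHA / 3, ALPHA**2 / 3],
         [1 / 3, ALPHA**2 / 3, ALPHA / 3]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def close(a, b, tol=1e-12):
    """Element-wise comparison of two 3x3 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))


IDENTITY = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
transpose = [[T[j][i] for j in range(3)] for i in range(3)]
conjugate = [[complex(T[i][j]).conjugate() for j in range(3)] for i in range(3)]
three_t_inv = [[3 * T_INV[i][j] for j in range(3)] for i in range(3)]
```

The product `matmul(T, T_INV)` reproduces the identity, the transpose equals `T` itself, and the complex conjugate equals three times the inverse.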

Expressing the arbitrary voltage vector of phasors in terms of its symmetrical components gives the following.

\[ V_{abc} = T_{0+-}^{abc} \cdot V_{0+-} = T_{0+-}^{abc} \cdot \begin{bmatrix} v_0 \\ v_+ \\ v_- \end{bmatrix} = \begin{bmatrix} v_0 \\ v_0 \\ v_0 \end{bmatrix} + \begin{bmatrix} v_+ \\ \alpha^2 \cdot v_+ \\ \alpha \cdot v_+ \end{bmatrix} + \begin{bmatrix} v_- \\ \alpha \cdot v_- \\ \alpha^2 \cdot v_- \end{bmatrix} = V_0 + V_+ + V_- \] (B.19)

The zero (or homopolar) sequence consists of three phasors equal in magnitude and phase.

\[ V_0 := \begin{bmatrix} v_{a0} \\ v_{b0} \\ v_{c0} \end{bmatrix} = \begin{bmatrix} v_0 \\ v_0 \\ v_0 \end{bmatrix} \] (B.20)

The positive (or direct) sequence consists of a balanced system of three phasors having the same sequence as the original signal. As the phasors are rotating counter-clockwise, \( v_{b+} \) lags 120° behind \( v_{a+} \), and \( v_{c+} \) lags 120° behind \( v_{b+} \) (and 240° behind \( v_{a+} \)). So the sequence is \( a \rightarrow b \rightarrow c \).

\[ V_+ := \begin{bmatrix} v_{a+} \\ v_{b+} \\ v_{c+} \end{bmatrix} = \begin{bmatrix} v_+ \\ \alpha^2 \cdot v_+ \\ \alpha \cdot v_+ \end{bmatrix} \] (B.21)

The negative (or inverse) sequence consists of a balanced system of three phasors having the opposite sequence from the original signal. As the phasors are rotating counter-clockwise, \( v_{c-} \) lags 120° behind \( v_{a-} \), and \( v_{b-} \) lags 120° behind \( v_{c-} \) (and 240° behind \( v_{a-} \)). So the sequence is \( a \rightarrow c \rightarrow b \).

\[ V_- := \begin{bmatrix} v_{a-} \\ v_{b-} \\ v_{c-} \end{bmatrix} = \begin{bmatrix} v_- \\ \alpha \cdot v_- \\ \alpha^2 \cdot v_- \end{bmatrix} \] (B.22)

The base phasors of the symmetric sequences can be derived from the original signal as follows.

\[ V_{0+-} = T_{abc}^{0+-} \cdot V_{abc} \] (B.23)

Note that for each of the sequences the three constituent phasors appear always together, e.g. for the positive sequence, \( v_{a+} \) can never exist without \( v_{b+} \) and \( v_{c+} \) [316].

When a balanced operation is considered, amplitudes of the phasors of the three phases are equal, and angles are equidistant around a cycle of a full period \( 2\pi \).

\[
\begin{aligned}
v_a &= v_b = v_c \\
\phi_b &= \phi_a - \frac{2\pi}{3} \\
\phi_c &= \phi_a - 2 \cdot \frac{2\pi}{3}
\end{aligned}
\] (B.24)

The three-phase vector can then be rewritten as follows.

\[
V_{abc} = v_a \cdot e^{j\phi_a} \cdot \begin{bmatrix} 1 \\ \alpha^{-1} \\ \alpha^{-2} \end{bmatrix} = v_a \cdot e^{j\phi_a} \cdot \begin{bmatrix} 1 \\ \alpha^2 \\ \alpha \end{bmatrix}
\] (B.25)

Applying the symmetrical components transformation to it yields

\[
V_{0+-} = T_{abc}^{0+-} \cdot V_{abc} = \ldots = \begin{bmatrix}
0 \\
v_a \cdot e^{j\phi_a} \\
0
\end{bmatrix} \Rightarrow v_+ = v_a \cdot e^{j\phi_a}, \quad v_0 = v_- = 0.
\] (B.26)

That is, under balanced conditions only the direct sequence is present and a single phasor can represent the whole three-phase voltage.
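The decomposition can be illustrated with a short Python sketch. The helper names `to_sequences` and `to_phases` and the sample phasors are assumptions made for illustration; the formulas are the row-by-row expansion of the inverse and direct transformation matrices.

```python
import cmath
import math

ALPHA = cmath.exp(2j * math.pi / 3)  # Fortescue operator


def to_sequences(v_a, v_b, v_c):
    """Phase phasors -> (zero, positive, negative) sequence phasors."""
    v_0 = (v_a + v_b + v_c) / 3
    v_pos = (v_a + ALPHA * v_b + ALPHA**2 * v_c) / 3
    v_neg = (v_a + ALPHA**2 * v_b + ALPHA * v_c) / 3
    return v_0, v_pos, v_neg


def to_phases(v_0, v_pos, v_neg):
    """(zero, positive, negative) sequence phasors -> phase phasors."""
    v_a = v_0 + v_pos + v_neg
    v_b = v_0 + ALPHA**2 * v_pos + ALPHA * v_neg
    v_c = v_0 + ALPHA * v_pos + ALPHA**2 * v_neg
    return v_a, v_b, v_c


# Balanced set (assumed values): b lags a by 120 degrees, c by 240 degrees.
v_a = cmath.rect(230.0, 0.4)
balanced = (v_a, v_a * ALPHA**2, v_a * ALPHA)
zero_seq, pos_seq, neg_seq = to_sequences(*balanced)

# Unbalanced set (assumed values): all three sequences are present.
unbalanced = (cmath.rect(230.0, 0.0), cmath.rect(200.0, -2.0), cmath.rect(250.0, 2.3))
round_trip = to_phases(*to_sequences(*unbalanced))
```

For the balanced set only `pos_seq` is non-zero and it equals `v_a`, whereas the unbalanced set needs all three sequences to be reconstructed.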

### B.3 Symmetry in the network

In this section the effect of the symmetrical components transformation will be examined for the interconnecting building blocks of the network, i.e. the transmission lines. In a general three-phase topology, the system comprises a set of connected three-port elements. The impedance parameters of a sample such element are given as follows.

\[
V_{abc} = Z_{abc} \cdot I_{abc}, \quad V_{abc}, I_{abc} \in \mathbb{C}^3 \text{ and } Z_{abc} \in \mathbb{C}^{3 \times 3}
\] (B.27)

The theory behind the analytical derivation of the elements of the matrix hereinabove, i.e. line parameters estimation, is beyond the scope of this work and the interested reader is referred to relevant bibliography [289, 318, 319]. In the above, the input in any phase \((a, b\) or \(c)\) affects the output of the other phases, through the off-diagonal coupling terms of the matrix \(Z_{abc}\) (or \(Y_{abc}\) in the admittance formulation). In other words, all inputs are coupled with all outputs. In order to have a decoupled input-output relation, the relation matrix needs to be diagonal.

A common assumption in many analyses is that three-port elements in the system are balanced, and thus their impedance parameters have a fully symmetric matrix [318].

\[
Z_{abc} = \begin{bmatrix}
z & m & m \\
m & z & m \\
m & m & z
\end{bmatrix}
\] (B.28)

Matrices such as the one above are a special case of circulant matrices. As such, the eigenvectors and eigenvalues can be easily calculated.

\[
\begin{aligned}
\lambda_0 &= z + m \cdot \omega_0 + m \cdot \omega_0^2, \quad u_0 = [1 \quad \omega_0 \quad \omega_0^2]^T \\
\lambda_1 &= z + m \cdot \omega_1 + m \cdot \omega_1^2, \quad u_1 = [1 \quad \omega_1 \quad \omega_1^2]^T \\
\lambda_2 &= z + m \cdot \omega_2 + m \cdot \omega_2^2, \quad u_2 = [1 \quad \omega_2 \quad \omega_2^2]^T
\end{aligned}
\]

(B.29)

Where \( \omega_k = \exp(j \cdot \frac{2\pi k}{3}) \), \( k = 0, 1, 2 \), are the cube roots of unity. Finally

\[
\begin{aligned}
\lambda_0 &= z + 2 \cdot m, \quad u_0 = [1 \quad 1 \quad 1]^T \\
\lambda_1 &= z - m, \quad u_1 = [1 \quad \exp(j \cdot \frac{2\pi}{3}) \quad \exp(j \cdot \frac{4\pi}{3})]^T \\
\lambda_2 &= z - m, \quad u_2 = [1 \quad \exp(j \cdot \frac{4\pi}{3}) \quad \exp(j \cdot \frac{2\pi}{3})]^T
\end{aligned}
\]

(B.30)

It can be seen that \( \exp(j \cdot \frac{2\pi}{3}) \equiv \alpha \) is nothing but the Fortescue operator.

According to eigendecomposition principles \( Z_{abc} \) can be diagonalized as follows.

\[
T^{-1} \cdot Z_{abc} \cdot T = Z_{diag} = \begin{bmatrix}
\lambda_0 & 0 & 0 \\
0 & \lambda_1 & 0 \\
0 & 0 & \lambda_2
\end{bmatrix}
\]

(B.31)

\( T \) is an invertible matrix, the columns of which are the right eigenvectors of \( Z_{abc} \), and the corresponding diagonal entry in \( Z_{diag} \) is the corresponding eigenvalue. It is easily seen that \( T \) coincides with the \( T_{0+-}^{abc} \) matrix that has been defined in the previous section, up to the order of the eigenvectors \( u_1 \) and \( u_2 \), which share the eigenvalue \( z - m \). Through this observation, it can be seen that when the symmetrical components transformation is applied to balanced three-port impedance networks, it diagonalizes the impedance matrix of the network.

The diagonalized form of \( Z_{abc} \) is defined as the diagonal eigenvalue matrix.

\[
Z_{0+-} := \begin{bmatrix}
z_0 & 0 & 0 \\
0 & z_+ & 0 \\
0 & 0 & z_-
\end{bmatrix}
= Z_{diag} = \begin{bmatrix}
z + 2 \cdot m & 0 & 0 \\
0 & z - m & 0 \\
0 & 0 & z - m
\end{bmatrix}
\]

(B.32)

This is substituted in (B.27) to yield the decoupled equations.

\[
V_{abc} = T_{0+-}^{abc} \cdot Z_{0+-} \cdot T_{abc}^{0+-} \cdot I_{abc} \Rightarrow V_{0+-} = Z_{0+-} \cdot I_{0+-} \Rightarrow \begin{cases} v_0 = z_0 \cdot i_0 \\ v_+ = z_+ \cdot i_+ \\ v_- = z_- \cdot i_- \end{cases}
\]

(B.33)

The last set of equations gives the decoupled impedance relations of the three-port network. The usefulness of symmetrical components is that each sequence can be studied separately. Under balanced operating conditions it has been seen that the homopolar and negative sequence phasors are zero, so only the positive sequence equation is relevant. The positive sequence impedance \(z_+\) is what is normally available in power flow studies for the branches of the system.
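The diagonalization can be checked numerically for a balanced impedance matrix. The following Python sketch is illustrative: the self and mutual impedance values are assumed, and `matmul` and `sequence_impedance` are hypothetical helper names.

```python
import cmath
import math

ALPHA = cmath.exp(2j * math.pi / 3)  # Fortescue operator

# Symmetrical components -> three-phase, and its inverse.
T = [[1, 1, 1],
     [1, ALPHA**2, ALPHA],
     [1, ALPHA, ALPHA**2]]
T_INV = [[1 / 3, 1 / 3, 1 / 3],
         [1 / 3, ALPHA / 3, ALPHA**2 / 3],
         [1 / 3, ALPHA**2 / 3, ALPHA / 3]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def sequence_impedance(z, m):
    """Transform the balanced matrix of (B.28) to the sequence domain."""
    z_abc = [[z, m, m],
             [m, z, m],
             [m, m, z]]
    return matmul(matmul(T_INV, z_abc), T)


# Assumed per-phase self and mutual impedances, in ohm.
z_self, z_mutual = 0.10 + 1.00j, 0.02 + 0.40j
z_seq = sequence_impedance(z_self, z_mutual)
```

The result is diagonal, with `z_self + 2 * z_mutual` for the zero sequence and `z_self - z_mutual` for both the positive and the negative sequence.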

In the normal non-degenerate case \(Z_{abc}\) is invertible, with \(Y_{abc} = Z_{abc}^{-1}\). Then, a similar line of reasoning can be followed for the admittance version of every equation. The relevant final equation would then be

\[
i_+ = y_+ \cdot v_+.
\]

(B.34)

B.4 Power considerations

Let a single phase, line-to-neutral voltage be

\[
v(t) = \sqrt{2} \cdot v \cdot \cos(\omega \cdot t + \phi_v) \rightarrow v = v \cdot e^{j \phi_v}
\]

(B.35)

Let a single phase line current be

\[
i(t) = \sqrt{2} \cdot i \cdot \cos(\omega \cdot t + \phi_i) \rightarrow i = i \cdot e^{j \phi_i}
\]

(B.36)

Then the instantaneous power entering a single port network is as follows [295].

\[
p_i(t) = v(t) \cdot i(t) = \ldots = v \cdot i \cdot \cos(\phi_v - \phi_i) \cdot [1 + \cos(2\omega t + 2\phi_i)] - v \cdot i \cdot \sin(\phi_v - \phi_i) \cdot \sin(2\omega t + 2\phi_i)
\]

(B.37)

In the above there are two distinct parts.

- \( p_{ip}(t) := v \cdot i \cdot \cos(\phi_v - \phi_i)[1 + \cos(2\omega t + 2\phi_i)] \) is the instantaneous active power. This is the amount of power that is irreversibly consumed for the production of real work.

- \( p_{iq}(t) := -v \cdot i \cdot \sin(\phi_v - \phi_i) \cdot \sin(2\omega t + 2\phi_i) \) is the instantaneous reactive power. This is a byproduct of the alternating nature of the system as it is the result of the component of current that is in quadrature with the voltage. The reactive power supports the transfer of active power over the transmission medium. It is temporarily stored in inductances and capacitances in the network and has a zero average value.

The \( 2\omega t \) terms in the above vanish when averaged over one period \( 2\pi/\omega \). We define the (average) active and (peak) reactive power as follows.

\[
\begin{aligned}
p &:= v \cdot i \cdot \cos(\phi_v - \phi_i) \quad \text{(B.38)} \\
q &:= v \cdot i \cdot \sin(\phi_v - \phi_i) \quad \text{(B.39)}
\end{aligned}
\]

The instantaneous power then can be written as follows.

\[
p_i(t) = p \cdot [1 + \cos(2\omega t + 2\phi_i)] - q \cdot \sin(2\omega t + 2\phi_i) \quad \text{(B.40)}
\]

Active power \( p \) is the average amount of work producing power delivered to the element in question over one period of the sinusoidal voltage. Reactive power \( q \) is the peak value of the instantaneous reactive power as defined before.

Note that neither \( p \) nor \( q \) is sinusoidal. However, we can make use of the phasors that were defined at the beginning of this section to define the single-phase apparent power as follows.

\[
s_{1\Phi} := v \cdot i^* = v \cdot i \cdot e^{j(\phi_v - \phi_i)} \quad \text{(B.41)}
\]

The above is only valid if the magnitudes of the phasors \( v \) and \( i \) are the RMS values of the line-to-neutral voltage and line current respectively, as defined in (B.35) and (B.36). Of course (B.41) can also be written in rectangular or polar form, \( s_{1\Phi} = p + j \cdot q = s_{1\Phi} \cdot e^{j\phi} \), with \( \phi = \phi_v - \phi_i \).
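The link between the phasor expression and the time-domain power can be illustrated numerically. The Python sketch below uses assumed sample phasors and hypothetical helper names; it computes \( s_{1\Phi} \) from RMS phasors and compares its real part with the average of \( v(t) \cdot i(t) \) over one period.

```python
import cmath
import math


def complex_power(v, i):
    """Single-phase apparent power s = v * conj(i) for RMS phasors, as in (B.41)."""
    return v * i.conjugate()


def instantaneous_power(v, i, omega, t):
    """Product v(t) * i(t) rebuilt from the RMS phasors."""
    v_t = math.sqrt(2.0) * (v * cmath.exp(1j * omega * t)).real
    i_t = math.sqrt(2.0) * (i * cmath.exp(1j * omega * t)).real
    return v_t * i_t


def average_power(v, i, omega, samples=2000):
    """Mean of the instantaneous power over one period 2*pi/omega."""
    period = 2.0 * math.pi / omega
    total = sum(instantaneous_power(v, i, omega, n * period / samples)
                for n in range(samples))
    return total / samples


OMEGA = 2.0 * math.pi * 50.0          # assumed 50 Hz system
v = cmath.rect(230.0, 0.2)            # assumed 230 V RMS, 0.2 rad
i = cmath.rect(10.0, -0.3)            # assumed 10 A RMS, -0.3 rad
s = complex_power(v, i)
p, q = s.real, s.imag                 # rectangular form s = p + j*q
```

The averaged instantaneous power matches `p`, which is why the RMS convention for phasor magnitudes is needed for (B.41) to hold.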
A similar expression applies to the averaged three-phase apparent power.

\[ s_{3\Phi} = V_{abc}^T \cdot I_{abc}^* = v_a \cdot i_a^* + v_b \cdot i_b^* + v_c \cdot i_c^* \]  

(B.42)

For theories on the full three-phase *instantaneous* power the interested reader is referred to [320].

In case the signals are balanced, the above becomes

\[ s_{3\Phi} = v_a \cdot i_a^* + v_b \cdot i_b^* + v_c \cdot i_c^* = v_a \cdot i_a^* + \alpha^2 \cdot v_a \cdot (\alpha^2)^* \cdot i_a^* + \alpha \cdot v_a \cdot \alpha^* \cdot i_a^* = 3 \cdot v_a \cdot i_a^* = 3 \cdot v_a \cdot i_a \cdot e^{j(\phi_v - \phi_i)}. \]  

(B.43)

If the symmetrical components transformation is used

\[ s_{3\Phi} = V_{abc}^T \cdot I_{abc}^* = (T_{0+-}^{abc} \cdot V_{0+-})^T \cdot (T_{0+-}^{abc} \cdot I_{0+-})^* = \ldots = 3 \cdot V_{0+-}^T \cdot I_{0+-}^*. \]

(B.44)

The scalar factor $3 \times$ is produced by the matrix multiplication $(T_{0+-}^{abc})^T \cdot (T_{0+-}^{abc})^*$. It is reasonable if the symmetrical components are thought of as three sets of three balanced phasors, in total 9 phasors. In the case of balanced signals, the above becomes

\[ s_{3\Phi} = 3 \cdot v_+ \cdot i_+^* = 3 \cdot v_a \cdot i_a \cdot e^{j(\phi_v - \phi_i)}, \]

(B.45)

which coincides with (B.43), as $v_+ \equiv v_a$ and $i_+ \equiv i_a$.

Another comment on the factor $3 \times$ is that it signifies that the basis transformation defined by $T_{0+-}^{abc}$ is not power invariant.

\[ (T_{0+-}^{abc})^T \cdot (T_{0+-}^{abc})^* = 3 \cdot I_3 \]

(B.46)

Where $I_3$ is the $3 \times 3$ identity matrix. In order for the transformation to be power invariant a new transformation matrix would have to be defined as follows [321].

\[ T_{0+-}^{abc} = \frac{1}{h} \begin{bmatrix} 1 & 1 & 1 \\ 1 & \alpha^2 & \alpha \\ 1 & \alpha & \alpha^2 \end{bmatrix} \]

(B.47)

Then the inverse would be

\[ \left( T_{0+-}^{abc} \right)^{-1} = \frac{h}{3} \begin{bmatrix} 1 & 1 & 1 \\ 1 & \alpha & \alpha^2 \\ 1 & \alpha^2 & \alpha \end{bmatrix}, \]

(B.48)

and

\[ (T_{0+-}^{abc})^T \cdot (T_{0+-}^{abc})^* = \frac{3}{h^2} \cdot I_3. \]

(B.49)

For \( h = \sqrt{3} \) the transformation is power invariant. In this case, \( T_{0+-}^{abc} \) is also unitary (i.e. the complex analogue of orthonormal): all three column vectors of \( T_{0+-}^{abc} \) are of unit length and they are mutually orthogonal.
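The effect of the scaling factor \( h \) on power can be illustrated with a short Python sketch. The phasor values are arbitrary assumed numbers and the helper names are hypothetical; `to_sequences` applies the scaled inverse transformation of (B.48).

```python
import cmath
import math

ALPHA = cmath.exp(2j * math.pi / 3)  # Fortescue operator


def to_sequences(phases, h=1.0):
    """abc -> (0, +, -) with the scaled inverse matrix of (B.48)."""
    a, b, c = phases
    k = h / 3.0
    return (k * (a + b + c),
            k * (a + ALPHA * b + ALPHA**2 * c),
            k * (a + ALPHA**2 * b + ALPHA * c))


def complex_power(voltages, currents):
    """Sum of v * conj(i) over the three entries of each vector."""
    return sum(v * i.conjugate() for v, i in zip(voltages, currents))


# Assumed unbalanced phasors, RMS magnitudes in V and A.
v_abc = (cmath.rect(230.0, 0.0), cmath.rect(210.0, -2.0), cmath.rect(240.0, 2.2))
i_abc = (cmath.rect(10.0, -0.3), cmath.rect(12.0, -2.5), cmath.rect(8.0, 1.9))
s_abc = complex_power(v_abc, i_abc)


def sequence_power(h):
    """Power computed from the sequence-domain phasors for a given h."""
    return complex_power(to_sequences(v_abc, h), to_sequences(i_abc, h))
```

With `h = 1` the phase-domain power is three times `sequence_power(1.0)`, while with `h = math.sqrt(3)` both domains give the same value.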

B.5 Per-unit representation

In order to further simplify the equations to be handled, all system quantities are normalized according to the per unit system (pu). For a quantity \( x \) the following holds.

\[ \bar{x} \text{ [in pu]} = \frac{x \text{ [in SI]}}{x_{base} \text{ [in SI]}} \]  

(B.50)

Bases of different quantities can be set arbitrarily and independently, but this affects the form of the equations in the pu domain. Two examples of voltage, current and power bases will be considered to illustrate the fact.

Case A

Two of \( s_{\text{base}A}, v_{\text{base}A} \) and \( i_{\text{base}A} \) are picked arbitrarily and the last of the three is selected to respect

\[ s_{\text{base}A} = 3 \cdot v_{\text{base}A} \cdot i_{\text{base}A}. \]  (B.51)

Case B

Two of \( s_{\text{base}B}, v_{\text{base}B} \), and \( i_{\text{base}B} \) are picked arbitrarily and the last of the three is selected to respect

\[ s_{\text{base}B} = v_{\text{base}B} \cdot i_{\text{base}B}. \]  (B.52)

Case A could be for example when

- \( i_{\text{base}A} \) is set to correspond to the rated RMS current of a line,
- \( v_{\text{base}A} \) is set to correspond to the rated RMS line-to-ground voltage of a line, and
- \( s_{\text{base}A} \) as calculated by (B.51) would correspond to rated three-phase power.

Case B could be for example when

- \( i_{\text{base}B} \) is set to correspond to the rated RMS current of a line,
- \( v_{\text{base}B} \) is set to correspond to the rated RMS line-to-ground voltage of a line, and
- \( s_{\text{base}B} \) as calculated by (B.52) would correspond to rated single-phase power.

The equations for single- and three-phase power in a balanced system are respectively as follows.

\[ s_{1\Phi} = v_{\Phi} \cdot i_{\Phi}^* \]  (B.53)
\[ s_{3\Phi} = 3 \cdot v_{\Phi} \cdot i_{\Phi}^* \]  (B.54)

The above are in the SI system. In order to convert them to a pu system, both sides of the equations are divided by the respective \( s_{\text{base}} \).

Case A

For the single-phase power

\[
\begin{aligned}
\frac{s_{1\Phi}}{s_{\text{base}A}} &= \frac{v_{\Phi} \cdot i_{\Phi}^*}{s_{\text{base}A}} \\
\bar{s}_{1\Phi} &= \frac{1}{3} \cdot \frac{v_{\Phi}}{v_{\text{base}A}} \cdot \frac{i_{\Phi}^*}{i_{\text{base}A}} \\
\bar{s}_{1\Phi} &= \frac{1}{3} \cdot \bar{v}_{\Phi} \cdot \bar{i}_{\Phi}^*
\end{aligned}
\]

(B.55)

For the three-phase power

\[
\begin{aligned}
\frac{s_{3\Phi}}{s_{\text{base}A}} &= 3 \cdot \frac{v_{\Phi} \cdot i_{\Phi}^*}{s_{\text{base}A}} \\
\bar{s}_{3\Phi} &= \frac{3}{3} \cdot \frac{v_{\Phi}}{v_{\text{base}A}} \cdot \frac{i_{\Phi}^*}{i_{\text{base}A}} \\
\bar{s}_{3\Phi} &= \bar{v}_{\Phi} \cdot \bar{i}_{\Phi}^*
\end{aligned}
\]

(B.56)

Case B

For the single-phase power

\[
\begin{aligned}
\frac{s_{1\Phi}}{s_{\text{base}B}} &= \frac{v_{\Phi} \cdot i_{\Phi}^*}{s_{\text{base}B}} \\
\bar{s}_{1\Phi} &= \frac{v_{\Phi}}{v_{\text{base}B}} \cdot \frac{i_{\Phi}^*}{i_{\text{base}B}} \\
\bar{s}_{1\Phi} &= \bar{v}_{\Phi} \cdot \bar{i}_{\Phi}^*
\end{aligned}
\]

(B.57)

For the three-phase power

\[
\begin{aligned}
\frac{s_{3\Phi}}{s_{\text{base}B}} &= 3 \cdot \frac{v_{\Phi} \cdot i_{\Phi}^*}{s_{\text{base}B}} \\
\bar{s}_{3\Phi} &= 3 \cdot \frac{v_{\Phi}}{v_{\text{base}B}} \cdot \frac{i_{\Phi}^*}{i_{\text{base}B}} \\
\bar{s}_{3\Phi} &= 3 \cdot \bar{v}_{\Phi} \cdot \bar{i}_{\Phi}^*
\end{aligned}
\]

(B.58)

From the above it is clear that the selection of base quantities affects the form of the equations in the pu domain.
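The two cases can be compared numerically. The following Python sketch is only an illustration with assumed base and operating values; `per_unit_power` is a hypothetical helper that works with magnitudes only. Under the Case A relation (B.51) the factor 3 of the three-phase power is absorbed by the power base, whereas under the Case B relation (B.52) it remains in the pu equation.

```python
def per_unit_power(v_phase, i_phase, v_base, i_base, case):
    """Return (single-phase, three-phase) power magnitudes in pu.

    v_phase, i_phase: RMS phase voltage and line current magnitudes (SI).
    case: "A" uses s_base = 3*v_base*i_base, "B" uses s_base = v_base*i_base.
    """
    if case == "A":
        s_base = 3.0 * v_base * i_base   # relation (B.51)
    elif case == "B":
        s_base = v_base * i_base         # relation (B.52)
    else:
        raise ValueError("case must be 'A' or 'B'")
    s_1phase = v_phase * i_phase         # balanced single-phase power magnitude
    s_3phase = 3.0 * v_phase * i_phase   # balanced three-phase power magnitude
    return s_1phase / s_base, s_3phase / s_base


# Assumed values: 0.9 pu voltage and 0.5 pu current in both cases.
v_base, i_base = 100.0, 10.0
v_phase, i_phase = 90.0, 5.0
v_pu, i_pu = v_phase / v_base, i_phase / i_base

s1_a, s3_a = per_unit_power(v_phase, i_phase, v_base, i_base, "A")
s1_b, s3_b = per_unit_power(v_phase, i_phase, v_base, i_base, "B")
```

For the same pu voltage and current, the pu three-phase power is `v_pu * i_pu` in Case A and `3 * v_pu * i_pu` in Case B.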
In this work a standard base system [321, 33] is adopted. First, the power base is defined as a system-wide three-phase power base. Then the voltage base is defined as the rated line-to-line RMS voltage at the to-bus of the branch. All other relevant bases are derived from them [33].

\[ s_{\text{base}} = \text{system-wide 3-phase power base, in MVA} \quad (B.59a) \]
\[ v_{\text{base}} = \text{rated line-to-line RMS bus voltage at the to-bus of the branch, in kV} \quad (B.59b) \]
\[ i_{\text{base}} = \frac{s_{\text{base}}}{v_{\text{base}}} \text{ corresponding to } \sqrt{3} \times \text{ line RMS current, in A} \quad (B.59c) \]
\[ z_{\text{base}} = \frac{v_{\text{base}}}{i_{\text{base}}} \text{, in } \Omega \quad (B.59d) \]

The voltage base is defined according to the \( \pi \)-equivalent model of a transmission line. It is taken to be the to-bus voltage of the branch, given that in the \( \pi \)-model the line impedance is considered after the transformer (see Fig. 3.6 in chapter 3).

The three-phase power equation can be alternatively read as follows.

\[ |s_{3\Phi}| = 3 \cdot v_{\Phi} \cdot i_{\Phi} = (\sqrt{3} \cdot v_{\Phi}) \cdot (\sqrt{3} \cdot i_{\Phi}) = v_{LL} \cdot i_{\sqrt{3}\Phi} \quad (B.60) \]

Where

- \( v_{LL} = \sqrt{3} \cdot v_{\Phi} \) is the line-to-line voltage, and
- \( i_{\sqrt{3}\Phi} = \sqrt{3} \cdot i_{\Phi} \) is the line current multiplied by \( \sqrt{3} \).

To put it simply, a three-phase line carrying 1 pu voltage (line-to-line RMS) and 1 pu current (\( \sqrt{3} \times \) line RMS), is transferring 1 pu of power.

These are quantities that are commonly considered in various power system studies, e.g. power flow. Multiplying both sides of (B.34) by \( \sqrt{3} \) gives

\[ i_{+\sqrt{3}\Phi} = y_+ \cdot v_{+LL}. \quad (B.61) \]

Where \( i_{+\sqrt{3}\Phi} \) is the line RMS current of the direct sequence multiplied by a factor of \( \sqrt{3} \), and \( v_{+LL} \) is the line-to-line RMS voltage of the direct sequence. In the same vein, rewriting (B.45) gives

\[ s_{3\Phi} = 3 \cdot v_+ \cdot i^*_+ = \left( \sqrt{3} \cdot v_+ \right) \cdot \left( \sqrt{3} \cdot i^*_+ \right) \iff s_{3\Phi} = v_{+LL} \cdot i^*_{+\sqrt{3}\Phi}. \tag{B.62} \]

The pu transformation applied to (B.61) and (B.62) gives the core equations that are very commonly used in many power system analyses.

\[ \bar{i}_{+\sqrt{3}\Phi} = \bar{y}_+ \cdot \bar{v}_{+LL} \tag{B.63} \]
\[ \bar{s}_{3\Phi} = \bar{v}_{+LL} \cdot \bar{i}^*_{+\sqrt{3}\Phi} \tag{B.64} \]

In order to make the pu transformation procedure clear an example for a real three-phase case is provided. Let the three-phase VA base be

\[ s_{base} = 100 \text{ kVA}. \]

We consider the base line-to-line RMS voltage to be

\[ v_{base} = 11 \text{ kV}. \]

The base value for the quantity $\sqrt{3} \times$ line RMS current is

\[ i_{base} = \frac{100}{11} = 9.1 \text{ A}. \]

Suppose that in the system there is a three-port network fed with voltage and current that have the following full three-phase time-domain expressions.

\[ V_{abc} = \begin{bmatrix} 10e3 \cdot \cos(\omega t + \phi_v) \\ 10e3 \cdot \cos(\omega t + \phi_v - 120^\circ) \\ 10e3 \cdot \cos(\omega t + \phi_v - 240^\circ) \end{bmatrix} \quad V \]

\[ I_{abc} = \begin{bmatrix} 5 \cdot \cos(\omega t + \phi_i) \\ 5 \cdot \cos(\omega t + \phi_i - 120^\circ) \\ 5 \cdot \cos(\omega t + \phi_i - 240^\circ) \end{bmatrix} \quad A \]

The phasors of the above are as follows.

\[
V_{abc} = \begin{bmatrix}
7.07e3 \cdot e^{j\phi_v} \\
7.07e3 \cdot e^{j(\phi_v - \frac{2\pi}{3})} \\
7.07e3 \cdot e^{j(\phi_v - \frac{4\pi}{3})}
\end{bmatrix} \quad V
\]

\[
I_{abc} = \begin{bmatrix}
3.54 \cdot e^{j\phi_i} \\
3.54 \cdot e^{j(\phi_i - \frac{2\pi}{3})} \\
3.54 \cdot e^{j(\phi_i - \frac{4\pi}{3})}
\end{bmatrix} \quad A
\]

The three-phase power is calculated directly from the SI three-phase phasors as follows.

\[
s_{3\Phi} = V_{abc}^T \cdot I_{abc}^* = v_a \cdot i_a^* + v_b \cdot i_b^* + v_c \cdot i_c^* = 75 \cdot e^{j(\phi_v - \phi_i)} \quad kVA
\]

Alternatively, the per unit system can be used. First, the line-to-line RMS voltage is calculated as

\[
v_{LL} = \sqrt{3} \cdot v_{1\Phi} = 12.25e3 \quad V.
\]

Then, the quantity $\sqrt{3} \times$ line RMS current is calculated.

\[
i_{Lx\sqrt{3}} = \sqrt{3} \cdot i_{1\Phi} = 6.13 \quad A
\]

The above two, converted to pu, are

\[ \bar{v}_{LL} = \frac{v_{LL}}{v_{base}} = 1.11 \text{ pu}, \]
\[ \bar{i}_{Lx\sqrt{3}} = \frac{i_{Lx\sqrt{3}}}{i_{base}} = 0.67 \text{ pu}. \]

And the magnitude of the apparent power is

\[ \bar{s} = \bar{v} \cdot \bar{i} = 0.75 \text{ pu} \]

Converting the latter to SI, we see that it coincides with the value that has been directly calculated in SI.

\[ s = \bar{s} \cdot s_{base} = 75 \text{kVA} \]
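The arithmetic of the example can be reproduced with the short Python sketch below. It is a check under stated assumptions: the phase voltage is taken as 10 kV peak, and the phase current as 5 A peak (about 3.54 A RMS), which is the value implied by the \( \sqrt{3} \times \) line RMS current of 6.13 A. The variable names are hypothetical.

```python
import math

# Bases of the example: 100 kVA three-phase, 11 kV line-to-line.
s_base = 100e3                      # VA
v_base = 11e3                       # V, line-to-line RMS
i_base = s_base / v_base            # A, base for sqrt(3) x line RMS current

# RMS phase quantities (assumed peaks: 10 kV and 5 A).
v_phase = 10e3 / math.sqrt(2.0)
i_phase = 5.0 / math.sqrt(2.0)

# Direct calculation in SI: |s| = 3 * v * i for a balanced system.
s_si_direct = 3.0 * v_phase * i_phase

# Per-unit route: scale both quantities by sqrt(3), then normalize.
v_ll = math.sqrt(3.0) * v_phase
i_sqrt3 = math.sqrt(3.0) * i_phase
v_pu = v_ll / v_base
i_pu = i_sqrt3 / i_base
s_pu = v_pu * i_pu
s_si_from_pu = s_pu * s_base
```

Both routes give 75 kVA, with intermediate values of about 1.11 pu for the voltage and 0.67 pu for the current.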
Appendix C. Numerical integration

Let an ordinary differential equation be

\[ \dot{x} = f(x). \]  \hspace{1cm} (C.1)

Suppose an initial value of \( x(0) = x_0 \). Then the solution of the above along time can be expressed as follows.

\[ x(t) = \phi(x_0) \]  \hspace{1cm} (C.2)

The point of numerical integration algorithms is to approximate this perfect analytical trajectory by a sequence of points \( x_k \) that correspond to respective time instants \( t_k \). For a given point in time \( t_k \) a numerical integration algorithm produces a point of the trajectory for time instant \( t_{k+1} = t_k + h \), where \( h \) is a timestep [33].

### C.1 Explicit and implicit methods

Generally, numerical integration algorithms can be divided into two broad categories, *implicit* and *explicit* methods. In order to calculate the next step, implicit methods require knowledge of the variable at the next step itself.

\[ \mathbf{x}_{k+1} = g(\mathbf{x}_{k+1}, x_k, x_{k-1}, \ldots, x_0) \]  \hspace{1cm} (C.3)

Unknown quantities (in bold) appear on both sides of the equation, and thus it has to be solved implicitly.

\[ \tilde{g}(\mathbf{x}_{k+1}) = \mathbf{x}_{k+1} - g(\mathbf{x}_{k+1}, x_k, x_{k-1}, \ldots, x_0) = 0 \quad (C.4) \]

On the contrary, explicit methods provide a direct, and hence explicit, solution for the next step, with unknowns appearing only on the left-hand side.

\[ \mathbf{x}_{k+1} = g(x_k, x_{k-1}, \ldots, x_0) \quad (C.5) \]

Obviously, implicit methods are computationally much more demanding. This comes though with the benefit of having much better numerical properties, in terms of numerical stability. This is to say that they succeed in bounding the propagation of the error in stiff problems. Stiffness is the property exhibited by some differential equations, in which for the solution sequence \( x_k \) to be close to the real (perfect) solution \( x(t) \), the time step \( h \) of numerical integration algorithms has to be very small. Otherwise numerical instability occurs, which means that the error is propagated in an unbounded fashion, rendering the solution useless.

Stiffness is often due to the fact that in the real system there are phenomena with very different time scales. For electromechanical transient simulation in power systems, this is not really the case as the dynamics are confined to the range of the electromechanical time constants of the generators of the network. Usually the latter are concentrated in the region of a few hertz, and so transient simulators often benefit from the ease of implementation and the computational simplicity of explicit algorithms. A rule of thumb is that the integration time step should be one-fifth [291] up to one-half [297] the smallest effective time constant in the element models. Typical operator studies are performed with time steps of one-half [297] or one-fourth [322] of the period of the system nominal frequency, i.e. 50 Hz in Europe and 60 Hz in N. America. In cases where simulators are designed to tackle both transient and longer term phenomena, the set of differential-algebraic equations is stiffer, and implicit algorithms are often favored [323].
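The difference between the two categories can be illustrated on a stiff scalar problem. The Python sketch below is a minimal example and not the integration scheme of any particular simulator: forward Euler stands for an explicit method and backward Euler for an implicit one, the implicit equation is solved with Newton's method, and the test problem \( \dot{x} = -1000 \cdot x \) and all names are assumptions.

```python
import math


def forward_euler_step(f, x, h):
    """Explicit step: the unknown appears only on the left-hand side."""
    return x + h * f(x)


def backward_euler_step(f, dfdx, x, h, iterations=20):
    """Implicit step: solve y - x - h*f(y) = 0 for y with Newton's method."""
    y = x
    for _ in range(iterations):
        residual = y - x - h * f(y)
        y -= residual / (1.0 - h * dfdx(y))
    return y


def integrate(step, x0, steps):
    """Apply a one-argument step function repeatedly."""
    x = x0
    for _ in range(steps):
        x = step(x)
    return x


def f(x):
    return -1000.0 * x     # assumed stiff test problem


def dfdx(x):
    return -1000.0


h_large = 0.01             # ten times the time constant of the problem
explicit_large = integrate(lambda x: forward_euler_step(f, x, h_large), 1.0, 50)
implicit_large = integrate(lambda x: backward_euler_step(f, dfdx, x, h_large), 1.0, 50)

h_small = 1e-4             # one tenth of the time constant
explicit_small = integrate(lambda x: forward_euler_step(f, x, h_small), 1.0, 50)
```

With the large step the explicit result grows without bound while the implicit one decays like the true solution; with the small step the explicit method follows the exact solution \( e^{-1000 t} \) reasonably well, at the cost of many more steps per unit of simulated time.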

C.2 Numerical properties of methods

A deeply specialized analysis of the properties of numerical algorithms is outside the scope of this study. The interested reader is referred to [324, 325]. However, for clarity's sake a brief overview is given in this section.

C.2.1 Convergence

A numerical method is convergent if the approximated next step \( x_{k+1} \) approaches the exact solution \( x(t_k + h) \) as the time step becomes infinitesimally small.

\[
\lim_{h \to 0^+} x_{k+1} = x(t_k + h) \tag{C.6}
\]

C.2.2 Order

The order of a method is a measure of how well the difference equation defining the method approximates the differential equation. This can be quantified by the local truncation error.

\[
\delta = x_{k+1} - x(t_k + h) \tag{C.7}
\]

Where \( x_{k+1} \) is given by the formula of the method. A method has order \( p \) if the local error is of order \( O(h^{p+1}) \) as the time step \( h \) goes to zero. Intuitively, the higher the order of the method, the smaller the propagation of the error across time steps.
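The order of a method can be estimated empirically from the way the local error shrinks with the time step. The Python sketch below does this for the forward Euler method on the assumed test problem \( \dot{x} = -x \), \( x(0) = 1 \); the function names are hypothetical.

```python
import math


def local_error(h):
    """One forward Euler step from the exact value x(0) = 1 for dx/dt = -x."""
    x0 = 1.0
    numerical = x0 + h * (-x0)
    exact = x0 * math.exp(-h)
    return abs(numerical - exact)


def observed_exponent(h):
    """Exponent e such that the local error behaves like h**e near h."""
    return math.log2(local_error(h) / local_error(h / 2.0))


exponent = observed_exponent(0.1)
```

Halving the step divides the local error by roughly four, so the exponent is close to 2, i.e. \( O(h^{p+1}) \) with \( p = 1 \): forward Euler is a first-order method.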

C.2.3 Stability

Stability directly relates to the concept of stiffness. A common way to assess the stability of a method is to apply it to the standard test equation.

\[
\dot{y} = \lambda \cdot y, \quad y_0 = y(0) = 1, \quad \lambda \in \mathbb{C} \tag{C.8}
\]

The analytical solution of the above is \( y(t) = e^{\lambda t} \). This is a test of how well the method can follow dynamics with time constants dictated by \( \lambda \). When applied to this problem, the solution for the next step of an integration algorithm has the general form

\[
y_{k+1} = F(\lambda \cdot h) \cdot y_k \tag{C.9}
\]

where \( F(\cdot) \) is determined by the algorithm. By induction the following holds.

\[
y_k = F(\lambda \cdot h)^k \cdot y_0 \tag{C.10}
\]

So, for \( \lim_{k \to \infty} y_k = 0 \), it has to be \( |F(z)| < 1 \), where \( z := \lambda \cdot h \in \mathbb{C} \). The latter defines the stability region for the method, in the complex \( z \) plane. A smaller stability region normally means that smaller time steps have to be adopted to prevent instability from occurring.
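As an illustration, the function \( F(z) \) of (C.9) is written out below for three well-known methods. These expressions are standard textbook results rather than formulas taken from this work, and the Python function names are hypothetical.

```python
def amplification_forward_euler(z):
    """Forward (explicit) Euler: y_next = (1 + z) * y."""
    return 1.0 + z


def amplification_backward_euler(z):
    """Backward (implicit) Euler: y_next = y / (1 - z)."""
    return 1.0 / (1.0 - z)


def amplification_trapezoidal(z):
    """Trapezoidal rule: y_next = (1 + z/2) / (1 - z/2) * y."""
    return (1.0 + z / 2.0) / (1.0 - z / 2.0)


def is_stable(amplification, z):
    """True if z lies inside the stability region |F(z)| < 1."""
    return abs(amplification(z)) < 1.0


# Assumed example: a decaying mode with coefficient -1000 and step 0.003,
# so that z is approximately -3.
z_example = -1000.0 * 0.003
```

For `z_example` forward Euler is unstable whereas the two implicit methods are stable; forward Euler becomes stable on the negative real axis only when the step is reduced so that `z` lies between -2 and 0.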

C.3 Truncation error

At each iteration a local truncation error is introduced into the solution. The errors of the individual iterations accumulate to a final global truncation error for the method. Assume a continuous differential equation.

$$\dot{x} = f(x)$$  \hspace{1cm} (C.11)

We would like to approximate the exact solution $x(t)$ with a sequence of values $x^{(k)}$ at discrete time steps, $t = t_1, \ldots, t_N = 1 \cdot h, \ldots, N \cdot h$, where $h$ is a constant time step. Suppose that we apply a method which results in the following numerical integration formula for each subsequent step.

$$x^{(k+1)} = x^{(k)} + h \cdot A(x^{(k)}, h, f)$$  \hspace{1cm} (C.12)

$A$ is called the increment factor and is generally a function of previous approximations of $x(t)$ (only $x^{(k)}$ in the simplified example above), the time step $h$ and the function $f$. Formally the local truncation error $\tau_{k+1}$ is the error that $A$ introduces to $x^{(k+1)}$ if perfect knowledge of the previous value(s) is assumed, i.e. $x^{(k)} = x(t_k)$.

$$\tau_{k+1} := x(t_{k+1}) - x^{(k+1)} = x(t_{k+1}) - x^{(k)} - h \cdot A(x^{(k)}, h, f)$$  \hspace{1cm} (C.13)

In a similar line, the global truncation error $\epsilon_k$ is calculated as the accumulation of all the local truncation errors of all iterations if perfect knowledge of the initial value of $x$ is assumed.

$$\epsilon_k := x(t_k) - x^{(k)} = x(t_k) - \left( x(0) + h \cdot \sum_{i=0}^{k-1} A(x^{(i)}, h, f) \right)$$  \hspace{1cm} (C.14)

In the above a simplistic formula for $x^{(k+1)}$ has been assumed in (C.12). Expressions for the local and global truncation errors can likewise be derived if more complex integration methods are assumed.
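The accumulation of the error can be observed numerically. The Python sketch below applies the one-step formula (C.12) with the simplest possible increment, \( A = f(x^{(k)}) \) (forward Euler), to the assumed test problem \( \dot{x} = -x \), \( x(0) = 1 \) on the interval \( [0, 1] \); all names are hypothetical.

```python
import math


def integrate(f, increment, x0, h, steps):
    """Repeated application of x_next = x + h * A(x, h, f), as in (C.12)."""
    x = x0
    for _ in range(steps):
        x = x + h * increment(x, h, f)
    return x


def euler_increment(x, h, f):
    """Simplest increment: the slope at the current point."""
    return f(x)


def global_error(steps):
    """Error at t = 1 against the exact solution exp(-1)."""
    h = 1.0 / steps
    approximation = integrate(lambda x: -x, euler_increment, 1.0, h, steps)
    return abs(math.exp(-1.0) - approximation)


ratio = global_error(100) / global_error(200)
```

Halving the time step roughly halves the global error, which is consistent with a first-order method: the local errors are of order \( h^2 \), but about \( 1/h \) of them accumulate over a fixed interval.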
Bibliography




Theodoros Kyriakidis

Born on September 06, 1986. Greek Citizen.

Ecole Polytechnique Fédérale de Lausanne
Electronics Laboratory
Lausanne, Vaud, CH-1015

Phone: +41 21 69 34609, +30 6978 175 508
Email: thekyria@gmail.com

Education


Ph.D. Research visit Montefiore Institute, Université de Liège, Aug 2014 - Jan 2015.


Employment

Software Developer, Aristotle University of Thessaloniki (AUTh) Jan 2009 - Dec 2009
Development of a book e-distribution system: Database design, support and system upgrade

Network Engineering, Network Operations Center, AUTh Nov 2008 - Jun 2009
Installation, configuration and troubleshooting of network routers and ethernet switches

Assembly and spatial design of electrical automation panels

IT support, Information Technology Center, AUTh Nov 2005 - Feb 2006
Hardware and software troubleshooting and support

Skills and Qualifications

Research Skills & Interests

Electronics: reconfigurable computing, analog computing, embedded systems.

Power systems: dynamics, security assessment, optimization, power flow, smart grid architecture.

Mathematics: numerical analysis, (non-)linear systems, linear algebra, ODEs/DAEs, regression analysis.

Computing: (scientific) programming, network engineering, databases, sensor fusion.

Qualifications

FPGAs (h/w & s/w co-design), MCUs, PCBs, C[++]/[#], MATLAB, PHP, [My]SQL, VHDL, Networking - Device Management & conf. (Cisco), Server admin (W/LAMP), CMSs, Latex, Versioning (svn, git).
Languages

Greek (native)       English (prof. - C2)       Spanish (ave. - B2)
German (bas. - C1)  Italian (ave. - B2)    French (ave. - n/a)

Soft Skills & Other Competencies

Leadership, integrity, fast learning ability, discipline, work ethic, adaptability, multi-cultural living and working experience, polyglotism, synthetic thinking, driving licences (A, B).

Affiliations

Member, Technical Chamber of Greece, 2010-present.
Member, IEEE, 2008-present.
Member, IEEE Power & Energy Society, 2010-present.
  - Member, IEEE PES Swiss Chapter.
  - Member, IEEE Smart Grid Technical Community.
Member, IEEE Computer Society, 2008-2012 & 2014-present.
Member, ACM, 2014-present.

Publications (Selected)

Awards and Honors

Swiss National Science Foundation (SNSF) Doc.Mobility fellowship, 2014.
Scholarship by the Fulbright Foundation, 2010.
Academic Excellence Scholarship by the Greek National Scholarship Foundation (IKY), 2008 & 2004.