Editors: H. Martin Bücker, George Corliss, Paul Hovland, Uwe Naumann, Boyana Norris
Title: Automatic Differentiation: Applications, Theory, and Implementations
Publisher: Springer-Verlag
ISBN: 9783540284383
Edition: 1
Price: CHF 165.10

Category: General, Encyclopedias
Language: English
Pages: 370
Copy protection: Watermark/DRM
Devices: PC/MAC/eReader/Tablet
Format: PDF

This collection covers the state of the art in automatic differentiation theory and practice. Practitioners and students will learn about advances in automatic differentiation techniques and strategies for the implementation of robust and powerful tools. Computational scientists and engineers will benefit from the discussion of applications, which provide insight into effective strategies for using automatic differentiation for design optimization, sensitivity analysis, and uncertainty quantification.

Written for: Computational scientists
Keywords: automatic differentiation, optimization, sensitivity analysis.

Backwards Differentiation in AD and Neural Nets: Past Links and New Opportunities (p. 15)

Paul J. Werbos
National Science Foundation, Arlington, VA, USA

pwerbos@nsf.gov

Summary.
Backwards calculation of derivatives (sometimes called the reverse mode, the full adjoint method, or backpropagation) has been developed and applied in many fields. This paper reviews several strands of history, advanced capabilities, and types of application, particularly those which are crucial to the development of brain-like capabilities in intelligent control and artificial intelligence.

Key words: Reverse mode, backpropagation, intelligent control, reinforcement learning, neural networks, MLP, recurrent networks, approximate dynamic programming, adjoint, implicit systems

1 Introduction and Summary
Backwards differentiation, or "the reverse accumulation of derivatives," has been used in many different fields, under different names, for different purposes. This paper will review that part of the history and concepts which I experienced directly. More importantly, it will describe how reverse differentiation could have more impact across a much wider range of applications.

Backwards differentiation has been used in four main ways known to me:

1. In automatic differentiation (AD), a field well covered by the rest of this book. In AD, reverse differentiation is usually called the "reverse method" or the "adjoint method." However, the term "adjoint method" has actually been used to describe two different generations of methods. Only the newer generation, which Griewank has called "the true adjoint method," captures the full power of the method.

2. In neural networks, where it is normally called "backpropagation" [532, 541, 544]. Surveys have shown that backpropagation is used in a majority of the real-world applications of artificial neural networks (ANNs). This is the stream of work that I know best, and may even claim to have originated.

3. In hand-coded "adjoint" or "dual" subroutines developed for specific models and applications, e.g., [534, 535, 539, 540].

4. In circuit design. Because the calculations of the reverse method are all local, it is possible to insert circuits onto a chip which calculate derivatives backwards physically on the same chip which calculates the quantities being differentiated. Professor Robert Newcomb at the University of Maryland, College Park, is one of the people who has implemented such "adjoint circuits."
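The locality property invoked in item 4 is easy to see in a small worked example. The sketch below (in Python, with illustrative names; it is not code from this chapter or book) differentiates f(x1, x2) = sin(x1) * (x1 + x2) in reverse mode. Each adjoint update uses only the local partial derivative of a single elementary operation, which is what makes a physical, node-by-node realization of the reverse sweep conceivable.

import math

def f_and_gradient(x1, x2):
    # Forward sweep: evaluate the function and record intermediate values.
    v1 = math.sin(x1)      # v1 = sin(x1)
    v2 = x1 + x2           # v2 = x1 + x2
    y = v1 * v2            # y  = v1 * v2

    # Reverse sweep: propagate adjoints (partial derivatives of y) backwards.
    # Every update below uses only the local partial of one operation.
    y_bar = 1.0
    v1_bar = y_bar * v2             # d y / d v1 = v2
    v2_bar = y_bar * v1             # d y / d v2 = v1
    x1_bar = v1_bar * math.cos(x1)  # through v1 = sin(x1)
    x1_bar += v2_bar * 1.0          # through v2 = x1 + x2
    x2_bar = v2_bar * 1.0           # through v2 = x1 + x2
    return y, (x1_bar, x2_bar)

value, grad = f_and_gradient(0.5, 1.5)

A single reverse sweep yields all partial derivatives of y at roughly the cost of one additional function evaluation, which is the property that both the AD and neural network communities exploit.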

Some of us believe that local calculations of this kind must exist in the brain, because the computational capabilities of the brain require some use of derivatives and because mechanisms have been found in the brain which fit this idea.

These four strands of research could benefit greatly from greater collaboration. For example, the AD community may well have the deepest understanding of how to actually calculate derivatives and to build robust dual subroutines, but the neural network community has worked hard to find many ways of using backpropagation in a wide variety of applications.

The gap between the AD community and the neural network community reminds me of a split I once saw between some people making aircraft engines and people making aircraft bodies.

Preface 5
Contents 7
List of Contributors 11
Perspectives on Automatic Differentiation: Past, Present, and Future? 18
1 The Algorithmic Approach 19
2 Transformation of Algorithms 20
3 Development of AD 21
4 Present Tasks and Future Prospects 28
5 Beyond AD 31
Backwards Differentiation in AD and Neural Nets: Past Links and New Opportunities 32
1 Introduction and Summary 32
2 Motivations and Early History 34
3 Types of Differentiation Capability We Have Developed 42
Solutions of ODEs with Removable Singularities 52
1 Introduction 52
2 Notation and Some Polynomial Algebra 53
3 Elementary Functions 53
4 Other Functions 59
5 Higher Order Equations 61
6 Open Questions 62
Automatic Propagation of Uncertainties 64
1 Introduction 64
2 Linear Models 65
3 Contrast with Interval Analysis 68
4 Nonlinear Models 69
5 Implementation with Automatic Differentiation 71
6 Validation of Uncertainty Models 73
7 Way Ahead 75
High-Order Representation of Poincaré Maps 76
1 Introduction 76
2 Overview of DA Tools 77
3 Description of the Method 78
4 Examples 80
Computation of Matrix Permanent with Automatic Differentiation 84
1 Introduction 84
2 Formulation 85
3 Methods 87
4 Algorithms 89
5 Discussions and Comments 92
6 Conclusion 93
Computing Sparse Jacobian Matrices Optimally 94
1 Introduction 94
2 Optimal Matrix Compression and Restoration 96
3 Schur Complement Approach 98
4 Combined Determination 100
5 Using Recurring Sparsity Structure in Rows 101
6 Numerical Experiments 103
7 Concluding Remarks 103
Application of AD-based Quasi-Newton Methods to Stiff ODEs 106
1 Introduction 106
2 Quasi-Newton Approximations 108
3 Implementation Details 112
4 Numerical Results 112
5 Conclusions and Outlook 115
Reduction of Storage Requirement by Checkpointing for Time-Dependent Optimal Control Problems in ODEs 116
1 Introduction 116
2 Quasilinearization Techniques 118
3 Nested Reversal Schedules 121
4 Numerical Example 126
5 Conclusion and Outlook 127
Improving the Performance of the Vertex Elimination Algorithm for Derivative Calculation 128
1 Introduction 128
2 Heuristics 130
3 Performance Analysis 131
4 A Statement Reordering Scheme 133
5 A Greedy List Scheduling Algorithm 135
6 Conclusions and Further Work 137
Acknowledgements 137
Flattening Basic Blocks 138
1 The Problem 138
2 Variable Identification 141
3 Removing Ambiguity by Splitting 142
4 Practical Solution 143
5 Splitting into Edge Subgraphs 146
6 Outlook 148
7 Conclusions 150
The Adjoint Data-Flow Analyses: Formalization, Properties, and Applications 152
1 Introduction 152
2 Adjoints by Automatic Differentiation 153
3 Classical Data-Flow Analyses 154
4 Adjoint Data-Flow Analyses 155
5 Application 160
6 Conclusion 162
Semiautomatic Differentiation for Efficient Gradient Computations 164
1 Introduction 164
2 Action on a Mesh 165
3 Some AD Alternatives 166
4 The RAD Package for Reverse AD 169
5 Test Results 170
6 Implications for Source Transformation 174
7 Concluding Remarks 174
Acknowledgment 175
Computing Adjoints with the NAGWare Fortran 95 Compiler 176
1 Aims of the CompAD Project 176
2 Compiler AD – A Motivating Example 177
3 Linearization of the Computational Graph 179
4 Putting AD into the Compiler 181
5 Case Study: Seeding in Forward and Reverse Mode 183
6 Summary, Conclusion, and Outlook 186
Extension of TAPENADE toward Fortran 95 188
1 Introduction 188
2 Nesting of Modules and Subprograms 189
3 Derived Types 190
4 Overloading 191
5 Array Features 193
6 Conclusion 195
A Macro Language for Derivative Definition in ADiMat 198
1 Introduction 198
2 MATLAB in the Context of an AD Tool 199
3 The Macro Language 200
4 Exploiting Structure of a Given Code 205
5 Conclusion and Future Work 205