Backwards Differentiation in AD and Neural Nets: Past Links and New Opportunities
Paul J. Werbos
National Science Foundation, Arlington, VA, USA
pwerbos@nsf.gov
Summary. Backwards calculation of derivatives, sometimes called the reverse mode, the full adjoint method, or backpropagation, has been developed and applied in many fields. This paper reviews several strands of history, advanced capabilities, and types of application, particularly those which are crucial to the development of brain-like capabilities in intelligent control and artificial intelligence.
Key words: Reverse mode, backpropagation, intelligent control, reinforcement learning, neural networks, MLP, recurrent networks, approximate dynamic programming, adjoint, implicit systems
1 Introduction and Summary
Backwards differentiation, or "the reverse accumulation of derivatives," has been used in many different fields, under different names, for different purposes. This paper will review the parts of that history, and of the underlying concepts, which I experienced directly. More importantly, it will describe how reverse differentiation could have more impact across a much wider range of applications.
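As an illustration only (my own toy example, not taken from any of the papers cited here), the following Python sketch shows reverse accumulation on a small function: one forward sweep stores the intermediate values, and one backward sweep propagates adjoints from the output to the inputs, returning the full gradient at roughly the cost of a single extra function evaluation. The function and variable names are arbitrary choices for this sketch.

```python
import math

def f_and_gradient(x1, x2):
    """Reverse accumulation for the toy function f(x1, x2) = sin(x1*x2) + x2**2."""
    # Forward sweep: compute and store intermediates.
    v1 = x1 * x2          # v1 = x1 * x2
    v2 = math.sin(v1)     # v2 = sin(v1)
    v3 = x2 ** 2          # v3 = x2^2
    f = v2 + v3           # output

    # Backward sweep: seed the output adjoint with 1 and apply the
    # chain rule step by step, using only locally stored values.
    f_bar = 1.0
    v2_bar = f_bar                            # df/dv2 = 1
    v3_bar = f_bar                            # df/dv3 = 1
    v1_bar = v2_bar * math.cos(v1)            # through sin
    x1_bar = v1_bar * x2                      # dv1/dx1 = x2
    x2_bar = v1_bar * x1 + v3_bar * 2.0 * x2  # two paths reach x2

    return f, (x1_bar, x2_bar)

if __name__ == "__main__":
    value, grad = f_and_gradient(0.5, 2.0)
    print(value, grad)  # value and full gradient from one backward sweep
```

Note that every step of the backward sweep uses only values stored at, or adjacent to, the corresponding node of the computation; this locality is what the circuit and brain analogies in item 4 below rely on.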
Backwards differentiation has been used in four main ways known to me:
1. In automatic differentiation (AD), a field well covered by the rest of this book. In AD, reverse differentiation is usually called the "reverse method" or the "adjoint method." However, the term "adjoint method" has actually been used to describe two different generations of methods. Only the newer generation, which Griewank has called "the true adjoint method," captures the full power of the method.
2. In neural networks, where it is normally called "backpropagation" [532, 541, 544]. Surveys have shown that backpropagation is used in a majority of the real-world applications of artificial neural networks (ANNs). This is the stream of work that I know best, and which I may even claim to have originated.
3. In hand-coded "adjoint" or "dual" subroutines developed for specific models and applications, e.g., [534, 535, 539, 540].
4. In circuit design. Because the calculations of the reverse method are all local, it is possible to insert circuits onto a chip which physically calculate derivatives backwards, on the same chip that calculates the quantities being differentiated. Professor Robert Newcomb at the University of Maryland, College Park, is one of the people who has implemented such "adjoint circuits."
Some of us believe that local calculations of this kind must exist in the brain, because the computational capabilities of the brain require some use of derivatives and because mechanisms have been found in the brain which fit this idea.
These four strands of research could benefit greatly from greater collaboration. For example, the AD community may well have the deepest understanding of how to actually calculate derivatives and how to build robust dual subroutines, but the neural network community has worked hard to find many ways of using backpropagation in a wide variety of applications.
The gap between the AD community and the neural network community reminds me of a split I once saw between some people making aircraft engines and people making aircraft bodies.