6810: 1094 Activities 6

Online handouts: listings for diffeq_test.cpp, diffeq_routines.cpp, eigen_tridiagonal_class.cpp.

The Chapter 6 notes have an introduction to algorithms for integrating differential equations. As part of these activities, we'll go over the basic ideas by example, using the routines in session06.zip, in preparation for looking at "Anharmonic Oscillations" and "Differential Chaos in Phase Space".

Your goals for this session:

Please work in pairs (more or less). The instructors will bounce around 1094 and answer questions.

Wrapping GSL Functions: eigen_tridiagonal_class

The code eigen_tridiagonal_class.cpp, together with GslHamiltonian.cpp and GslHamiltonian.h, are a rewrite of eigen_tridiagonal.cpp from Session 5. A C++ class called "Hamiltonian" hides all of the GSL function calls from the main program. Minimal changes were made to the code for clarity (so it is not optimal!).

  1. Look at the eigen_tridiagonal_class.cpp printout and compare to the eigen_tridiagonal.cpp printout from Session 5. If this is your first (real) exposure to a C++ class (or if you forget what you used to know), there will be confusing aspects. Look at the motivation in the Session 6 notes. A detailed guide to the implementation will be given in the Session 7 notes. For now, identify what has happened to each of the GSL function calls. In what ways is the new version better? (For experts, what would you do differently?)

  2. Verify that the code still works. Try adding a loop over the matrix dimension N. Does it work?

  3. (Bonus) What additional functions might you define in the Hamiltonian class? What other classes might you define?

Introduction to Parallel Processing with OpenMP

OpenMP (not to be confused with Open MPI!) implements parallel processing when there is shared memory, as is common these days with multi-core hardware. If you have a dual-core processor that means that in principle you can run your program twice as fast by running two "threads" of your code simultaneously on the two cores. If you have more cores available, in principle the speed scales with the number of cores. One way to achieve this in practice is to use OpenMP. We'll use a simple example written by Chris Orban to illustrate how it works. In the past this only worked on Linux or on Macs, but now it should work under Cygwin as well.

  1. The 1094 machines each have multiple cores: use shift-ctrl-esc and then choose the Performance tab (or About This Mac --> System Report --> Hardware). How many cores are there?

  2. Look at the program simpson_cosint_openmp.cpp in an editor or in the printed copy. The key openmp features are the omp include statement and the #pragma omp statement, which is an instruction to the compiler. What part of the program is executed in parallel?

  3. Let's compare using one and two threads. There is a built-in timer in the program, but run the program with time ./simpson_cosint_openmp, which will automatically print some timing info on the cpu time and the wall time after you run the program. Compile the program with g++ following the instructions in the comments, and then run it. Record the "num_time", the CPU time, and the wall time. (In tcsh, these are the first and third numbers; in bash, these are the second and first numbers.)

    Then change from two threads to one thread by modifying the omp_set_num_threads command in the code, recompile it, and re-run. Record the numbers again. Why do you think the CPU time is about the same but the wall time differs? Is the parallelization working?

  4. Now try to use all the cores you can. How well does the program scale? (E.g., compare (num_time for 1 core)/(num_time for n cores) to n.) Why is it not perfect scaling?

  5. (BONUS) For completeness, you can try the Intel C++ compiler (on linux), following the instructions in the top comment section of the program. Does it behave the same way as g++?

Integrating a First-Order Differential Equation

The code diffeq_test.cpp calls the differential-equation-solving routines in diffeq_routines.cpp ("euler" and "runge4") to integrate a series of coupled differential equations (but we'll start with a single equation). Functions for the right-hand-side of a first order differential equation (dy/dt = rhs[y,t]) and the exact y(t) [called "exact_answer"] for a specified initial condition are also defined in this file. There is also a header file diffeq_routines.h.

  1. Look through the Activities 6 Background Notes for a quick overview of differential equation solving.
  2. Download and unzip session06.zip.
  3. Use make_diffeq_test to compile and link diffeq_test. Run the program to generate diffeq_test.dat and look at it in an editor. The gnuplot plotfile diffeq_test.plt generates comparison plots of the integrated function from the output in diffeq_test.out. Load this plotfile in gnuplot:
    gnuplot> load "diffeq_test.plt"
    and examine the result. What can you conclude at this point?

  4. Look at the printout for diffeq_test.cpp and diffeq_routines.cpp and compare to the Activities 6 notes to figure out what is going on. The codes follow the notation in the notes. At present there is only one equation (first-order), so only y[0] is used. What is the differential equation being integrated?

  5. Modify the diffeq_test.plt file to plot the relative error at each value of t. (Modify the plot file and NOT the program; see the gnuplot handout on plot files for an example of how to do this.) As usual in studying errors, a log-log scale will be useful. The first point at t=0 may get in the way. Use "set xrange [?:?]" in gnuplot (where you fill in the ?'s) to avoid this problem. What can you say qualitatively about the errors?

  6. Now generate and plot results for a second value of h (your plot should have both values of h, so think about how to best do this). You'll want it use something like 1/10 the value, so it's easy to check the effect (e.g., if the difference goes like hn, then you'll see 10n, which is easy to see on a log-log plot). When the local error (for each step h) adds coherently, then the "accumulated" or "global" error for a given algorithm at tf scales like Nf*(local error) = (tf/h)*(local error). You can verify for Euler's algorithm, for which the local error to be h2 (see notes and excerpt), that the global error is, in fact, h. What is the local error for 4th-order Runge-Kutta according to the graph?

  7. Next, check how the accumulated error at one value of t scales with h for the two algorithms. Take t=1, for example. You'll need to modify the code to output the results for y(t=1) for a range of h's (think logarithmic!). Some things to be careful of: There should be different regions of the error plot, as we've seen before. Interpret them.

    Given your results, how would you choose a step size for 4th-order Runge-Kutta? (Hint: How do you explain the behavior of the error for small h?)

Extra: 2nd Order Runge-Kutta Algorithm

This exercise is intended to verify that you understand the meaning of the Runge-Kutta algorithms and how they are implemented in C++. (Most likely for a future assignment.)

  1. Add a third diffeq routine to the code, which implements the 2nd-order Runge-Kutta algorithms described in the Activities 6 notes.
  2. Check the scaling of the error with h, i.e., is the error proportional to a power of h?. What is the power?

  3. Try a different differential equation, such as dy/dt = 2*cos(2pi*t). How do the errors compare to your first equation?

Extra: Integrating a 2nd-Order Differential Equation

Second-order differential equations are treated by writing them as two coupled 1st-order equations, as described in the Activities 6 notes. We'll try this out for a simple example, which we'll generalize later to look at chaotic behavior. (Most likely pushed to Session 7.)

  1. Consider the differential equation for a simple, undriven harmonic oscillator [e.g., a mass m on a spring with constant k: d2x/dt2 + (k/m)x = 0]. What is the general solution in terms of the initial position and velocity?

  2. Rewrite the differential equation as two coupled 1st-order equations [for x and v=dx/dt]. Use units in which the oscillator mass and spring constant are equal to one. Take the initial position to be 1.0 and the initial velocity to be zero.

  3. Modify a copy of diffeq_test.cpp to use the runge4 subroutine to calculate this oscillator for times t=0 to t=10. You'll need to change N, modify rhs to consider both i==0 and i==1, put the exact answer in exact_answer, and change the limits of t appropriately.
  4. Plot the result and the exact result for comparison. What do you conclude?

6810: 1094 Activities 6. Last modified: .