Development Process Example

Test-Driven Development

Test-driven development is a process that makes testing part of both design and implementation. The idea is that writing tests up front forces you to think harder about what you're trying to do, and you get the benefit of having tests from the start.

Purpose

This page has a few goals:

  • Document my work on adding more analysis functionality to enzo.
  • Provide an example of the development process for discussion.
  • Demonstrate a unit testing framework in conjunction with lcatest.

The Plan

  1. Build and run a test using libenzo.a
  2. Rebuild the volume measuring tool from the EAL.
    1. Standalone
    2. Inline
  3. Add analysis checkpoints.

simplectest

There are a ton (tens) of unit testing frameworks out there for C/C++. Some languages (Java, Python) have standard, or de facto standard, frameworks. Frameworks have also been developed for other languages (Fortran, C/C++, IDL). Since enzo is complex to start with, and the code is not structured around unit testing, I wanted the simplest solution to get started with.

simplectest is about as lightweight as possible: a single header file with a bunch of macros that make writing tests faster.
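To give a feel for how little a macro-based framework needs, here is a rough sketch. It is not simplectest's actual source: the macro names START_TEST, ASSERT and END_TEST mirror the ones used on this page, but the counters and the run_toy_suite function are made up for illustration.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical bookkeeping; the real tests.h keeps its own.
static int tests_run = 0;
static int tests_failed = 0;
static bool current_test_ok = true;

// Open a named test: print its name and reset the pass/fail state.
#define START_TEST(name) \
  { std::printf("> %s...\n", name); current_test_ok = true; ++tests_run;

// Record a failure without aborting, so later tests still run.
#define ASSERT(cond) \
  do { if (!(cond)) { \
    std::printf("  assertion failed: %s (line %d)\n", #cond, __LINE__); \
    current_test_ok = false; } } while (0)

// Close the test and count it as failed if any assertion tripped.
#define END_TEST() \
  if (!current_test_ok) ++tests_failed; }

// Runs two toy tests and returns the number of failed tests.
int run_toy_suite() {
  START_TEST("arithmetic")
    ASSERT(1 + 1 == 2);
  END_TEST()

  START_TEST("deliberately wrong")
    ASSERT(2 * 2 == 5);
  END_TEST()

  std::printf("Tests run: %d  Failures: %d\n", tests_run, tests_failed);
  return tests_failed;
}
```

Because each test is just a brace-delimited block, variables declared before a START_TEST stay visible inside it, which is how the tests on this page share things like TopGrid.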

First Try: Building

Got the makefile system working by creating a lib target in amr_mpi/src.

Stupid MPI Test

Since I just wanted to test something, I decided to test MPI, and CommunicationInitialize. Here's the code, built on simplectest. enzoheaders.h is basically the top half of X_Main.C, with all of the headers and prototypes.

#include "enzoheaders.h"
#include "tests.h"

// Start the overall test suite
START_TESTS()

// A new group of tests, with an identifier
START_TEST("init comm")
  // Initialize Communications   
  CommunicationInitialize(&argc, &argv);

  MPI_Arg mpi_size;
  MPI_Comm_size(MPI_COMM_WORLD, &mpi_size);

  ASSERT(mpi_size == NumberOfProcessors);

  CommunicationFinalize();
END_TEST()

// End the overall test suite
END_TESTS()

Results

And we have our first passing unit test!

cable:~/Projects/Enzo/tests rpwagner$ ./enzoexample 
> init comm...
MPI_Init: NumberOfProcessors = 1

--- Results ---
Tests run:    1
Passes:       1
Failures:     0
cable:~/Projects/Enzo/tests rpwagner$ 

For fun, I decided to find out what happens in the multiple-task case. What we get is duplicated output, one copy per task.

cable:~/Projects/Enzo/tests rpwagner$ mpirun -np 2 ./enzoexample 
> init comm...
> init comm...
MPI_Init: NumberOfProcessors = 2

--- Results ---
Tests run:    1
Passes:       1
Failures:     0

--- Results ---
Tests run:    1
Passes:       1
Failures:     0
cable:~/Projects/Enzo/tests rpwagner$ 

Creating MPI Version

I copied tests.h to mpitests.h and wrapped all of the printf statements in an "if I'm root" check. Now the only duplicated output is whatever prints before MPI is initialized.

cable:~/Projects/Enzo/tests rpwagner$ mpirun -np 2 ./enzoexample 
> init comm...
> init comm...
MPI_Init: NumberOfProcessors = 2

--- Results ---
Tests run:    1
Passes:       1
Failures:     0
cable:~/Projects/Enzo/tests rpwagner$ 
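The root-only wrapping of the printf statements could look roughly like this. It is only a sketch: my_rank, ROOT_RANK and root_printf are hypothetical names, and my_rank stands in for the value a real build would get from MPI_Comm_rank after MPI is initialized.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Hypothetical stand-in for the rank reported by MPI_Comm_rank.
// It stays 0 on every task until MPI has been initialized and the rank is set.
static int my_rank = 0;
const int ROOT_RANK = 0;

// printf-style helper that only prints on the root task.
// Returns the number of characters written, or 0 when output is suppressed.
int root_printf(const char *fmt, ...) {
  if (my_rank != ROOT_RANK) return 0;
  va_list args;
  va_start(args, fmt);
  int written = std::vprintf(fmt, args);
  va_end(args);
  return written;
}
```

In this sketch every task starts out believing it is rank 0, so anything printed before the rank is known still appears once per task. That would be consistent with the doubled "> init comm..." line above, which prints before CommunicationInitialize is called.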

Controlling I/O

One of the most basic things the analysis code needs to do is not read in the grid data unless asked. The line is sort of drawn between text and HDF5 files. Things in HDF5 files tend to be big, and therefore use a lot of memory. And there can be a lot of them, which means more disk access. So it's good to control when and how data gets read in. Of course, the hierarchy's so damned big, this seems kind of dumb to say. This was implemented before using a default argument to grid::ReadGrid, and I'll do that again.

The first test will be to call ReadAllData, and have it return FAIL when there are no grid files present. Then we'll need a flag for saying DontReadData, and ReadAllData should be satisfied with just a parameter, hierarchy, and boundary file.

Results

This is working, using three changes:

  • A new global variable in global_data.h: EXTERN int LoadGridDataAtStart;
  • Changing grid::ReadGrid
    • Modifying the arguments to ReadGrid(FILE *fptr, int ReadText = TRUE, int ReadData = TRUE);
    • Putting some flow control in grid::ReadGrid
  • Similar changes to ExternalBoundary::ReadExternalBoundary
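As a rough illustration of the changes listed above, here is a toy version of the flow control. ToyGrid and read_hierarchy_entry are hypothetical stand-ins, not enzo code; only the ReadGrid argument shape and the LoadGridDataAtStart name come from this page, and plain 1/0 is used in place of TRUE/FALSE.

```cpp
#include <cassert>
#include <cstdio>

// Global switch named after the one added to global_data.h on this page.
int LoadGridDataAtStart = 1;

// Toy stand-in for enzo's grid class; only the control flow is modelled.
struct ToyGrid {
  bool header_loaded = false;
  bool data_loaded = false;

  // Same argument shape as the modified ReadGrid: both reads default to on,
  // so existing callers keep their old behavior.
  bool ReadGrid(FILE *fptr, int ReadText = 1, int ReadData = 1) {
    if (ReadText) {
      if (fptr == nullptr) return false;  // the text entry needs an open file
      header_loaded = true;               // real code would parse the hierarchy entry here
    }
    if (ReadData) {
      data_loaded = true;                 // real code would read the HDF5 fields here
    }
    return true;
  }
};

// Hypothetical caller: always read the text entry, read the heavy data only if asked.
bool read_hierarchy_entry(ToyGrid &g, FILE *fptr) {
  return g.ReadGrid(fptr, 1, LoadGridDataAtStart);
}
```

The appeal of default arguments here is that the simulation code path does not have to change at all; only the analysis code passes the extra flags.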

Testing

To test things, I ran a small (16^3, three levels) spherical collapse simulation, and put the data in the tests/ directory. Yes, this is binary data, but we don't have a place to keep test data for regression testing.

The first test checks a restart:

#include "enzoheaders.h"
#include "enzo_unit_tests.h"

// Start the overall test suite
START_TESTS()
  // Initialize Communications
  CommunicationInitialize(&argc, &argv);

  TopGridData MetaData;
  HierarchyEntry TopGrid;
  ExternalBoundary Exterior;
  char *ParameterFile          = "infall32r3d_0003";

  SetDefaultGlobalValues(MetaData);

START_TEST("read param")
  ASSERT(ReadAllData(ParameterFile, &TopGrid, MetaData, &Exterior));
END_TEST()

START_TEST("data read in")
  ASSERT(TopGrid.GridData->BaryonField[0]);
END_TEST()

  CommunicationFinalize();
// End the overall test suite
END_TESTS()

And here are the results:

Running readdata...
MPI_Init: NumberOfProcessors = 1
> read param...
Output to Global Dir /Users/rpwagner/tmp
> data read in...

--- Results ---
Tests run:    2
Passes:       2
Failures:     0

Yay!

The next test is supposed to read in grid data on demand, using the AnalysisBaseClass. But it's not working, so I'll be back.

OK, things are going again. The BaryonFileName wasn't being stored, so it couldn't read in the data during a second call to grid::ReadGrid. BaryonFileName and ParticleFileName are now members of grid, and wrapped in #ifdef ENZO_ANALYSIS to prevent bloat during simulations. I think it may be possible to go overboard on the conditional compilation wrappers, but not in grid variables. The more of those we can turn off, the better.
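A minimal sketch of that idea, assuming a toy grid: the file name is remembered on the first pass so a later call can find the data, and the extra member only exists when ENZO_ANALYSIS is defined. ToyGrid, ReadHeader, ReadDataOnDemand, BaryonField0 and TOY_NAME_LENGTH are hypothetical; the real grid class is laid out differently.

```cpp
#include <cassert>
#include <cstring>

#define ENZO_ANALYSIS        // defined here so the sketch compiles the analysis members
#define TOY_NAME_LENGTH 512  // hypothetical buffer size for this sketch

struct ToyGrid {
  float *BaryonField0 = nullptr;
#ifdef ENZO_ANALYSIS
  char BaryonFileName[TOY_NAME_LENGTH] = "";  // absent from simulation builds
#endif

  // First pass: parse the text entry and, in analysis builds, keep the file name.
  void ReadHeader(const char *data_file) {
#ifdef ENZO_ANALYSIS
    std::strncpy(BaryonFileName, data_file, TOY_NAME_LENGTH - 1);
    BaryonFileName[TOY_NAME_LENGTH - 1] = '\0';
#else
    (void)data_file;
#endif
  }

#ifdef ENZO_ANALYSIS
  // Second pass: succeeds only if the first pass remembered the file name.
  bool ReadDataOnDemand() {
    if (BaryonFileName[0] == '\0') return false;
    if (BaryonField0 == nullptr)
      BaryonField0 = new float[8]();  // placeholder for the real HDF5 read
    return true;
  }
#endif

  ~ToyGrid() { delete[] BaryonField0; }
};
```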

Anyway, this example is pretty well wrapped up. Here's the first set of tests for the AnalysisBaseClass. They use most of the public member functions and test the expected I/O behavior.

#include "enzoheaders.h"
#include "AnalysisBaseClass.h"
#include "enzo_unit_tests.h"

// Start the overall test suite
START_TESTS()
  // Initialize Communications
  CommunicationInitialize(&argc, &argv);

  TopGridData MetaData;
  HierarchyEntry TopGrid;
  ExternalBoundary Exterior;
  char *ParameterFile          = "infall32r3d_0003";
  AnalysisBaseClass *analysis_obj = NULL;
  SetDefaultGlobalValues(MetaData);

  LoadGridDataAtStart = FALSE;

// A new group of tests, with an identifier
START_TEST("read data")
  ASSERT(ReadAllData(ParameterFile, &TopGrid, MetaData, &Exterior));
END_TEST()

START_TEST("grid data not read in")
  ASSERT(TopGrid.GridData->BaryonField[0] == NULL);
END_TEST()

START_TEST("init analysis")
  analysis_obj = new AnalysisBaseClass(&MetaData, &TopGrid);
  ASSERT(analysis_obj);
END_TEST()

START_TEST("baryon filename")
  ASSERT(!strcmp(TopGrid.GridData->BaryonFileName, "infall32r3d_0003.grid0001"));
END_TEST()

// A new group of tests, with an identifier
START_TEST("grids in volume")
  ASSERT(analysis_obj->NumberOfGridsInVolume() == 3);
END_TEST()

// A new group of tests, with an identifier
START_TEST("good point")
  float good_point[3] = {0.25, 0.25, 0.25};
  ASSERT(analysis_obj->FindGrid(&TopGrid, good_point));
END_TEST()

START_TEST("bad point")
  float bad_point[3] = {1.25, 1.25, 1.25};
  ASSERT(analysis_obj->FindGrid(&TopGrid, bad_point) == NULL);
END_TEST()

// A new group of tests, with an identifier
START_TEST("find grid")
  float point_in3[3] = {0.55, 0.55, 0.55};
  HierarchyEntry *grid3 = TopGrid.NextGridNextLevel->NextGridNextLevel; 
  ASSERT(analysis_obj->FindGrid(&TopGrid, point_in3) == grid3);
END_TEST()

// A new group of tests, with an identifier
START_TEST("flag grid")
  analysis_obj->FlagGridCells(&TopGrid);
  ASSERT(TopGrid.GridData->FlaggingField);
END_TEST()

// A new group of tests, with an identifier
START_TEST("count flagged cells")
  int cell_count = 0, i, j, k, index;

  int *flags = TopGrid.GridData->FlaggingField;
  int *gridStart = TopGrid.GridData->GridStartIndex;
  int *gridEnd = TopGrid.GridData->GridEndIndex;
  int *gridDim = TopGrid.GridData->GridDimension;

  // Loop over the active cells only; index is the offset of row (j, k)
  for (k = gridStart[2]; k <= gridEnd[2]; k++)
    for (j = gridStart[1]; j <= gridEnd[1]; j++) {
      index = (k*gridDim[1] + j)*gridDim[0];
      for (i = gridStart[0]; i <= gridEnd[0]; i++)
        if (flags[index + i])
          cell_count++;
    }

  ASSERT_EQUALS(cell_count, 27);

END_TEST()

START_TEST("analysis region")
  FLOAT leftedge[3] = {0.1, 0.1, 0.1};
  FLOAT rightedge[3] = {0.45, 0.45, 0.45};
  analysis_obj->SetAnalysisRegion(leftedge, rightedge);
  ASSERT(analysis_obj->NumberOfGridsInVolume() == 2);
END_TEST()

START_TEST("read in grid data")
  ASSERT_EQ(analysis_obj->OpenData(), 2);
END_TEST()

START_TEST("check for baryon field")
  ASSERT(TopGrid.GridData->BaryonField[0]);
END_TEST()

  CommunicationFinalize();
// End the overall test suite
END_TESTS()

/* Used to capture exit statuses.  */
void my_exit(int status)
{

}

Which gives us the result:

Running analysisbaseclass...
MPI_Init: NumberOfProcessors = 1
> read data...
Output to Global Dir /Users/rpwagner/tmp
> grid data not read in...
> init analysis...
> baryon filename...
> grids in volume...
> good point...
> bad point...
> find grid...
> flag grid...
> count flagged cells...
> analysis region...
> read in grid data...
ClearFlaggingField: Warning, field already present.
> check for baryon field...

--- Results ---
Tests run:   13
Passes:      13
Failures:     0
cable:~/Projects/Enzo/tests rpwagner$ 

Yay!
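The index arithmetic in the "count flagged cells" test is easier to check on a toy field. The sketch below pulls the same loop into a standalone function. count_flagged and toy_count are hypothetical helpers, and the 10^3 field with 3 ghost zones per side is made up; it does not reproduce the real test grid, it is only sized so the flagged block comes out to 27 cells.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count non-zero flags inside the active region [start, end] of a field
// stored with x varying fastest, the same layout the test loop assumes.
int count_flagged(const std::vector<int> &flags, const int dim[3],
                  const int start[3], const int end[3]) {
  assert(flags.size() == static_cast<std::size_t>(dim[0]) * dim[1] * dim[2]);
  int count = 0;
  for (int k = start[2]; k <= end[2]; k++)
    for (int j = start[1]; j <= end[1]; j++) {
      int row = (k * dim[1] + j) * dim[0];  // offset of cell (0, j, k)
      for (int i = start[0]; i <= end[0]; i++)
        if (flags[row + i]) count++;
    }
  return count;
}

// Toy field: 10^3 cells, active region 3..6 in each dimension, with a
// 3x3x3 block flagged inside it and one stray flag in a ghost cell.
int toy_count() {
  const int dim[3] = {10, 10, 10};
  const int start[3] = {3, 3, 3};
  const int end[3] = {6, 6, 6};
  std::vector<int> flags(10 * 10 * 10, 0);
  flags[0] = 1;  // ghost cell (0, 0, 0): must not be counted
  for (int k = 4; k <= 6; k++)
    for (int j = 4; j <= 6; j++)
      for (int i = 4; i <= 6; i++)
        flags[(k * dim[1] + j) * dim[0] + i] = 1;
  return count_flagged(flags, dim, start, end);
}
```

Restricting the loops to the start and end indices is what keeps flags in the ghost zones out of the count.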