Refactoring Passes

We have been working in the C++ version of EnergyPlus for almost ten years now, but the code still has a lot of "Fortran" in it. There have been a few targeted refactoring efforts focused on specific areas (e.g., plant) and there will be more, but a few sweeps through the code to remove some old idioms and replace with new and better ones will create a better environment in which to perform those future efforts and establish new patterns that new and updated code can mimic. I estimate that we need to do about 12-15 passes of various "sizes". The way I would characterize the size of a pass is:

small: can be done by one person in two months or less.
medium: can be done by two people in one iteration or less.
large: requires more than two people and/or more than one iteration.

Here are the passes that I am thinking of along with my effort estimates, which may be completely off. I've tried to order them "topologically", i.e., the ones that don't depend on other passes first. These are actually not "refactoring" per se, they are more "re-idiomizing" or "re-patterning". Real refactoring, e.g., reorganizing objects to leverage inheritance and virtual functions, collecting related data items into structs, converting groups scalar variables into array variables indexed by enum, etc. is instance-specific and probably cannot be done in a pass. Those are probably better done in a targeted way for specific modules that can clearly benefit from such transformations.

Enum. Medium. This pass is already underway and about 60% completed. It consists of three sub-passes. The first converts collections of constexpr int constants into enum class enumerations with a standard format in which the first item is Invalid = -1 and the last item is Num. This part may require some thinking and rejiggering (technical term) as in several places the code exhibits an undue reliance on the actual values of enumerations rather than on their symbolic names. The second sub-pass creates a constexpr std::array<std::string_view, static_cast<int>(ec::Num)> = ecNamesUC = {}; array of upper-cased names for each enum class ec that appears in the IDF and replaces the if-else ladders used to convert the enumeration strings to their values with calls to static_cast<ec>getEnumerationValue(str, ecNamesUC);. A goal of this sub-pass is to eliminate case-insensitive comparisons to constant strings. The third sub-pass replaces the SELECT_CASE_var if-else ladders that use these enumerations with switch/case statements.
Namespace. Small. This pass is intended to reverse the proliferation of namespaces in EnergyPlus and shorten namespace names. The original Fortran to C++ translation automatically created namespace X for every X.hh file, but there is no fundamental one-to-one relationship between files and namespaces. There is also no fundamental relationship between namespace names and file names. Logically, there are not 300+ namespaces in EnergyPlus. The number is probably more like 20 or 30. And because there are few logical namespaces, individual namespace names can be short. The C++ standard template library namespace is called std, not StandardTemplateLibrary. HeatPumpWaterToWaterCOOLING is not a good namespace name. Shortening these names will shorten lines, make code more readable, and make it easier for us to stop using using. Look here for a longer discussion.
Optional. Small. This pass would eliminate optional parameters and replace them with default parameters. A modification that would make this pass into a medium or even large pass would be to pass all reference variables by pointer rather than reference which--among other things--would allow us to use nullptr as a default value. See here for a longer discussion of why reference parameters are evil.
EPVectorClass. Medium. This pass consists of two sub-components which can probably be done together. The first replaces all instances of ObjexxFCL::Array1D for item types other than Real64 and bool with EPVector, which is a much lighter template. This change will add bounds-checking to debug builds, slowing down debug-build simulations significantly. To negate this, we will need to establish the pattern of using auto & x = state.dataX->X(XNum); followed by x.Field1, x.Field2, etc. rather than constantly doing state.dataX->X(XNum).Field1, state.dataX->X(XNum).Field2 all over the place (you know what I'm talking about). This will also make the code more readable. Actually, we need to establish this pattern regardless and it is probably the most important aspect of this pass.
EPVectorRealBool. Small. This pass complements the EPVectorClass pass and eliminates uses of Array1D_bool and Array1D<Real64>. This is split into a separate pass because it will require hand rolling loops for some of the arithmetic and boolean operators implemented for these templates. We may want to implement the boolean operations in EPVector if they are not implemented already.
EPVEctorXD. Medium. This pass would replace all Array2D, Array3D, etc. with combinations of nested EPVector and std::array as appropriate. This may seem quite expensive but can be amortized by collecting multiple ArrayXD<Real64> that have similar indexing structures and making them into a single ArrayXD<struct MultiReal64>, which is probably a good thing to do anyway. This may be the final vestige of ObjexxFCL.
std::vector. Large. This may be a pipe dream, but at some point we need to consider moving away from 1-based indexing.
epJSON. Large. This one is simple conceptually, but will probably take some time to actually implement. The IDF input processor interface, i.e., getObjectItem, has been internally re-plumbed to access the epJSON infrastructure. This layer should be peeled away and objects should be constructed using the epJSON interface directly. There are already one or two examples of this.
Initialization. Large. This pass will unlock a number of other optimizations and simplifications. This intent is to remove the ubiquitous if (state.dataX->GetInputFlag) { GetXInput(state); state.dataX->GetInputFlag = false; } from the prologues of about half the functions in EnergyPlus and the only slightly less ubiquitous if (CompIndex == 0) { CompIndex = FindItemInList(CompName, state.dataX->XList); if (CompIndex == 0) { ShowFatalError(""); } } that typically follows it. The idea is to replace this code with two-pass initialization at startup. The first pass reads and instantiates all objects from the input file, setting all fields that do not reference other objects. The second pass resolves object references and sets object indices--one focus of this second initialization pass is to eliminate case-insensitive comparisons for user-defined names and references, a second focus is to eliminate reference strings in objects. Once these two passes are implemented, the scattered initialization code can be deleted and object name parameters can be removed from internal functions.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactoring Passes

Clone this wiki locally