PcCountPair tracker and LoopPoint #255
Open. studyztp wants to merge 20 commits into develop from region_tracker
- bcb4d03 Param: added PcCountPair (studyztp)
- 547db65 PcCountTracker and PcCountTrackerManager basic finished (studyztp)
- 1aea6e6 added get_pc_count and get_current_pc_count_pair function (studyztp)
- 5ed5db1 added stdlib looppoint class and output json file (studyztp)
- 7d09994 fixed hash problem for python PcCountPair (studyztp)
- eabc1a5 fixed update_relatives() function (studyztp)
- 0b735e1 take relative count for end always (studyztp)
- 52691ab added LoopPointRestore and json profile option for LoopPointCheckpiont (studyztp)
- 0eb4720 fixed small things (studyztp)
- f7a15a4 use c++ to_string function for __str__ (studyztp)
- 0a9cf2d fixed the empty space (studyztp)
- 0f5b292 added comments (studyztp)
- c47378a fixed a bug in LoopPointRestore where it takes the start relative cou… (studyztp)
- 4349663 changed LoopPointFilePath and CheckPointDir format (studyztp)
- 36adc37 added LoopPoint documentation (studyztp)
- f587c9b changed parameter names for looppoint input file and checkpoint dir i… (studyztp)
- 6fd552d used pre-commit (studyztp)
- 85a6e69 addded print_all_targets in debug-flags (studyztp)
- 991c945 removed exit event PCCOUNTTRACK_END and added exit_when_empty option … (studyztp)
- 47d980d cpu: fixup some style stuff for loopoints (powerjg)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,315 @@ | ||
# LoopPoint Documentation | ||
|
||
|
||
1. [Take Checkpoint](#take-checkpoint) | ||
- [LoopPointCheckpoint Module](#looppointcheckpoint-module) | ||
- [looppoint_save_checkpoint_generator](#looppoint_save_checkpoint_generator) | ||
- [Example Script](#take-checkpoint-example-script) | ||
2. [Restore Checkpoint](#restore-checkpoint) | ||
- [Example Script](#restore-checkpoint-example-script) | ||
3. [Json File](#json-file) | ||
- [structure](#structure) | ||
- [components](#components) | ||
- [Example](#example) | ||
4. [Others](#others) | ||
- [PcCountPair](#pccountpair) | ||
- [Debug Flag](#debug-flag) | ||
|
||
|
||
The main methodology of performing LoopPoint sampling in the gem5 simulator is taking and restoring checkpoints for LoopPoint significant regions. | ||
|
||
This can be done easily with the standard library LoopPoint modules. | ||
|
||
## Take Checkpoint | ||
In the gem5 standard library, there is a `LoopPointCheckpoint` module designed specifically to take checkpoints for LoopPoint with the `set_se_looppoint_workload` module and the `looppoint_save_checkpoint_generator` generator. | ||
|
||
|
||
### LoopPointCheckpoint module | ||
The `LoopPointCheckpoint` module requires two parameters: `looppoint_file` and `if_csv`. | ||
|
||
`looppoint_file` is a Path object that contains the path of the LoopPoint data file. The data file can be the csv file generated by Pin, or the json file generated by gem5. | ||
|
||
`if_csv` is a boolean object that is True if the file is a csv file, or False if the file is a json file. | ||
|
||
With these two inputs, the `LoopPointCheckpoint` module is able to generate and store information needed for the LoopPoint workload. | ||
|
||
Calling `output_json_file` on the LoopPoint object in the Python script is strongly recommended, because the resulting json file is needed to restore the checkpoint later with the `LoopPointRestore` module. | ||
|
||
After creating a `LoopPointCheckpoint` object with the required inputs, this object can be passed into the `set_se_looppoint_workload`. | ||
|
||
If the LoopPoint information is correctly passed to the cores, a `SIMPOINT_BEGIN` exit event is generated each time a target Program Counter address has been executed the number of times the LoopPoint data specifies. | ||
|
||
When the exit event is raised, the `looppoint_save_checkpoint_generator` can evaluate the (PC,count) pair and determine if it is going to take a checkpoint at it or not. | ||
|
||
### looppoint_save_checkpoint_generator | ||
The `looppoint_save_checkpoint_generator` requires three parameters: `checkpoint_dir`, `looppoint`, and `update_relatives`. | ||
|
||
`checkpoint_dir` is a Path object that contains the path to the directory where the checkpoints should be stored. | ||
|
||
`looppoint` is the LoopPoint object (the `LoopPointCheckpoint` object created above). It is used to identify the current (PC, count) pair the simulation has reached, to look up the region id that corresponds to this pair, and to update the relative counts in the region if needed. | ||
|
||
`update_relatives` is a boolean that is True if the relative counts in the region should be updated, and False otherwise. If your LoopPoint data file does not contain relative counts, it should be True, because relative count information is needed in the json file to restore checkpoints for LoopPoint. It defaults to True. | ||
|
||
When the simulation has encountered all the (PC, count) pairs in the LoopPoint data file, it raises a `PCCOUNTTRACK_END` exit event. With the standard library, the simulation can exit the simulation loop when this exit event is encountered. | ||
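The tracking behaviour described in this section can be pictured with a small stand-alone sketch. It does not use gem5: `PcCountTrackerSketch`, `execute` and `run_trace` are hypothetical names, and the event names are plain strings that mirror the exit events used in the example script. It only illustrates the idea of counting executions of target PCs, signalling when a (PC, count) pair is reached, and signalling again once no targets remain.

```python
from collections import Counter


class PcCountTrackerSketch:
    """Toy model of (PC, count) tracking; not the gem5 implementation."""

    def __init__(self, targets):
        # targets: iterable of (pc, count) pairs taken from the LoopPoint data
        self.targets = set(targets)
        self.target_pcs = {pc for pc, _ in self.targets}
        self.counts = Counter()

    def execute(self, pc):
        """Record one execution of pc and return the events it triggers."""
        events = []
        if pc not in self.target_pcs:
            return events
        self.counts[pc] += 1
        pair = (pc, self.counts[pc])
        if pair in self.targets:
            self.targets.remove(pair)
            events.append(("SIMPOINT_BEGIN", pair))
            if not self.targets:
                # Every pair from the data file has now been seen.
                events.append(("PCCOUNTTRACK_END", pair))
        return events


def run_trace(tracker, trace):
    """Feed a sequence of PCs to the tracker and collect all events."""
    events = []
    for pc in trace:
        events.extend(tracker.execute(pc))
    return events


tracker = PcCountTrackerSketch([(0x400, 2), (0x500, 1)])
events = run_trace(tracker, [0x400, 0x500, 0x300, 0x400])
for name, (pc, count) in events:
    print(f"{name}: pc={pc:#x} count={count}")
```

In the real flow, the handler registered for the first kind of event decides whether to take a checkpoint, and the handler for the second one leaves the simulation loop.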
|
||
|
||
### Take Checkpoint Example Script | ||
``` Python | ||
from gem5.simulate.exit_event import ExitEvent | ||
from gem5.simulate.simulator import Simulator | ||
from gem5.resources.resource import Resource | ||
from gem5.components.cachehierarchies.classic.no_cache import NoCache | ||
from gem5.components.boards.simple_board import SimpleBoard | ||
from gem5.components.memory.single_channel import SingleChannelDDR3_1600 | ||
from gem5.components.processors.simple_processor import SimpleProcessor | ||
from gem5.components.processors.cpu_types import CPUTypes | ||
from gem5.isas import ISA | ||
from pathlib import Path | ||
# LoopPoint related modules | ||
from gem5.utils.looppoint import LoopPointCheckpoint | ||
from gem5.simulate.exit_event_generators import looppoint_save_checkpoint_generator, exit_generator | ||
|
||
# When taking checkpoints, we can use simpler configurations, for example NoCache for cache_hierarchy and ATOMIC CPUs. | ||
# Most of the configurations can be changed in the restore script. | ||
# There are some exceptions, for example the num_cores and the size of memory must match in both the taking checkpoint script and the restore script. | ||
cache_hierarchy = NoCache() | ||
|
||
memory = SingleChannelDDR3_1600(size="2GB") | ||
|
||
processor = SimpleProcessor( | ||
cpu_type=CPUTypes.ATOMIC, | ||
isa=ISA.X86, | ||
num_cores=9, | ||
) | ||
|
||
board = SimpleBoard( | ||
clk_freq="3GHz", | ||
processor=processor, | ||
memory=memory, | ||
cache_hierarchy=cache_hierarchy, | ||
) | ||
|
||
dir = Path("LoopPoint-CheckPoints/") | ||
dir.mkdir(exist_ok=True) | ||
|
||
# ---------------LoopPoint part begins--------------- | ||
# LoopPoint object created | ||
looppoint = LoopPointCheckpoint( | ||
looppoint_file=Path("LoopPoint.csv"), | ||
if_csv=True | ||
) | ||
|
||
board.set_se_looppoint_workload( | ||
binary= Resource("example-binary"), | ||
looppoint = looppoint | ||
) | ||
|
||
simulator = Simulator( | ||
board=board, | ||
on_exit_event ={ | ||
# setup exit event generator | ||
ExitEvent.SIMPOINT_BEGIN : | ||
looppoint_save_checkpoint_generator( | ||
checkpoint_dir=dir, | ||
looppoint=looppoint, | ||
update_relatives=True | ||
), | ||
# exit the simulation loop when PCCOUNTTRACK_END | ||
# is encountered | ||
ExitEvent.PCCOUNTTRACK_END : exit_generator() | ||
} | ||
) | ||
|
||
simulator.run() | ||
# output json file | ||
looppoint.output_json_file() | ||
# ---------------LoopPoint part ends--------------- | ||
``` | ||
|
||
## Restore Checkpoint | ||
Restoring checkpoints taken with the methodology above is fairly straightforward with the `LoopPointRestore` module and `set_se_looppoint_workload`, and it mirrors how the checkpoints are taken. | ||
|
||
### LoopPointRestore module | ||
The `LoopPointRestore` module requires two parameters: `looppoint_file` and `checkpoint_path`. | ||
|
||
`looppoint_file` is a Path object that contains the path to the LoopPoint json file that is generated with the checkpoint taking script. | ||
|
||
`checkpoint_path` is a Path object that contains the path to the checkpoint directory. | ||
|
||
With these two inputs, the module is able to get the information needed to restore checkpoints taken for LoopPoint. | ||
|
||
After the `LoopPointRestore` object is passed into `set_se_looppoint_workload`, the simulation raises a `SIMPOINT_BEGIN` exit event when it encounters the (PC, count) pair. You can set up an exit event generator with the `Simulator` module in the standard library to perform a different action for each exit event. | ||
|
||
### Restore Checkpoint Example Script | ||
``` Python | ||
from gem5.simulate.exit_event import ExitEvent | ||
from gem5.simulate.simulator import Simulator | ||
from gem5.components.cachehierarchies.classic.\ | ||
private_l1_private_l2_cache_hierarchy import ( | ||
PrivateL1PrivateL2CacheHierarchy, | ||
) | ||
from gem5.components.boards.simple_board import SimpleBoard | ||
from gem5.components.memory import DualChannelDDR4_2400 | ||
from gem5.components.processors.simple_processor import SimpleProcessor | ||
from gem5.components.processors.cpu_types import CPUTypes | ||
from gem5.isas import ISA | ||
from gem5.resources.resource import Resource | ||
from pathlib import Path | ||
from gem5.simulate.exit_event_generators import dump_reset_generator, exit_generator | ||
# LoopPoint related module | ||
from gem5.utils.looppoint import LoopPointRestore | ||
|
||
# Using a more complex cache hierarchy than the one | ||
# used in the taking checkpoint script | ||
cache_hierarchy = PrivateL1PrivateL2CacheHierarchy( | ||
l1d_size="32kB", | ||
l1i_size="32kB", | ||
l2_size="256kB", | ||
) | ||
|
||
memory = DualChannelDDR4_2400(size="2GB") | ||
|
||
processor = SimpleProcessor( | ||
cpu_type=CPUTypes.O3, | ||
isa=ISA.X86, | ||
num_cores=9, | ||
) | ||
|
||
board = SimpleBoard( | ||
clk_freq="3GHz", | ||
processor=processor, | ||
memory=memory, | ||
cache_hierarchy=cache_hierarchy, | ||
) | ||
|
||
# ---------------LoopPoint part begins--------------- | ||
# LoopPoint object created | ||
looppoint = LoopPointRestore( | ||
looppoint_file=Path("LoopPoint.json"), | ||
checkpoint_path=Path("LoopPoint-CheckPoints/cpt.Region1") | ||
) | ||
|
||
board.set_se_looppoint_workload( | ||
binary= Resource("example-binary"), | ||
looppoint=looppoint | ||
) | ||
|
||
simulator = Simulator( | ||
board=board, | ||
checkpoint_path=Path("LoopPoint-CheckPoints/cpt.Region1"), | ||
on_exit_event ={ | ||
# setup generator to perform a statistic dump and reset | ||
# when the SIMPOINT_BEGIN exit event is encountered | ||
ExitEvent.SIMPOINT_BEGIN : dump_reset_generator(), | ||
# exit the simulation loop when PCCOUNTTRACK_END | ||
# is encountered | ||
ExitEvent.PCCOUNTTRACK_END: exit_generator() | ||
} | ||
) | ||
|
||
simulator.run() | ||
# ---------------LoopPoint part ends--------------- | ||
``` | ||
|
||
## Json File | ||
|
||
|
||
### structure | ||
|
||
Below is the structure of the json file | ||
``` | ||
region id | ||
|___ simulation | ||
| |___ start | ||
| | |___ pc | ||
| | |___ global | ||
| | |___ relative | ||
| |___ end | ||
| |___ pc | ||
| |___ global | ||
| |___ relative | ||
|___ warmup # optional to region | ||
| |___ start | ||
| | |___ pc | ||
| | |___ global | ||
| | |___ relative | ||
| |___ end | ||
| |___ pc | ||
| |___ global | ||
| |___ relative | ||
|___ multiplier | ||
|
||
``` | ||
|
||
### components | ||
|
||
`region id` represents a specific region that is selected by the LoopPoint sampling methodology. Each region has a simulation region, and it may or may not have a warmup region. | ||
|
||
`simulation` is the region that should be simulated in detail. | ||
|
||
`warmup` is the region that should be used to warm up the system before performing detailed simulation on the simulation region. | ||
|
||
`pc` is the Program Counter address. | ||
|
||
`global` is the global count that starts counting from the beginning of the workload. | ||
|
||
`relative` is the relative count that starts counting from the beginning of the (PC, count) pair where the checkpoint is taken. | ||
|
||
`multiplier` is the weight that should be applied on the data for this region to calculate the runtime with the LoopPoint sampling methodology. | ||
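As a rough illustration of how `multiplier` is meant to be applied, the sketch below scales a per-region measurement by its weight and sums the results. It assumes the extrapolation is a plain weighted sum, which is how the description above reads; `weighted_total` is a hypothetical helper, and the cycle counts are made-up placeholders rather than measured results.

```python
def weighted_total(per_region_value, multipliers):
    """Sum each region's measured value scaled by its multiplier."""
    return sum(
        per_region_value[region] * multipliers[region]
        for region in per_region_value
    )


# Multipliers match the two regions in the Example section of this document.
multipliers = {"1": 4.0, "2": 1.0}
# Placeholder measurements for illustration only, not real simulation output.
measured_cycles = {"1": 1000, "2": 500}

estimate = weighted_total(measured_cycles, multipliers)
print(estimate)
```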
|
||
### Example | ||
|
||
``` Json | ||
"1": { | ||
"simulation": { | ||
"start": { | ||
"pc": 4221392, | ||
"global": 211076617, | ||
"relative": 15326617 | ||
}, | ||
"end": { | ||
"pc": 4221392, | ||
"global": 219060252, | ||
"relative": 23310252 | ||
} | ||
}, | ||
"multiplier": 4.0, | ||
"warmup": { | ||
"start": { | ||
"pc": 4221056, | ||
"count": 23520614 | ||
}, | ||
"end": { | ||
"pc": 4221392, | ||
"count": 211076617 | ||
} | ||
} | ||
}, | ||
"2": { | ||
"simulation": { | ||
"start": { | ||
"pc": 4206672, | ||
"global": 1 | ||
}, | ||
"end": { | ||
"pc": 4221392, | ||
"global": 6861604, | ||
"relative": 6861604 | ||
} | ||
}, | ||
"multiplier": 1.0 | ||
} | ||
``` | ||
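The relationship between `global` and `relative` can be checked against the numbers in region `1` above. The sketch reads `relative` as `global` minus the number of times that PC had already executed when the checkpoint was taken. This is an interpretation of the component descriptions, not a formula stated in the PR, and `checkpoint_offset` is a hypothetical helper. The region is wrapped in braces here so that it parses as a complete json document.

```python
import json

# Simulation markers of region "1", copied from the example above.
profile = json.loads("""
{
  "1": {
    "simulation": {
      "start": {"pc": 4221392, "global": 211076617, "relative": 15326617},
      "end": {"pc": 4221392, "global": 219060252, "relative": 23310252}
    },
    "multiplier": 4.0
  }
}
""")


def checkpoint_offset(marker):
    """Executions of the marker's PC that happened before the checkpoint."""
    return marker["global"] - marker["relative"]


sim = profile["1"]["simulation"]
start_offset = checkpoint_offset(sim["start"])
end_offset = checkpoint_offset(sim["end"])
print(start_offset, end_offset)
```

Both markers share the same PC, so under this reading they should yield the same offset, which makes the check a cheap sanity test for a generated json file.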
|
||
## Others | ||
|
||
### PcCountPair | ||
|
||
`PcCountPair` is a special parameter that stores a Program Counter address (`PcCountPair::pc`) and an integer count (`PcCountPair::count`). A `PcCountPair` object can be passed between Python and C++ and is used in the same way on both sides. However, the hashing functions of this parameter differ between Python and C++. As a known limitation, a C++ `PcCountPair` object cannot be used as a key to look up a value in a Python hash table. | ||
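One way to work around this kind of lookup limitation is to key tables by a plain `(pc, count)` tuple. The sketch below is stand-alone and does not use gem5: `PyPair` and `ForeignPair` are hypothetical stand-ins for the Python-side parameter and a wrapped C++ object, and `as_key` is a hypothetical helper. It only shows why an object whose hash and equality do not match the stored keys cannot find its entry, and how normalising to a tuple avoids the problem.

```python
class PyPair:
    """Stand-in for the Python-side pair with value-based hashing."""

    def __init__(self, pc, count):
        self.pc = pc
        self.count = count

    def __eq__(self, other):
        if not isinstance(other, PyPair):
            return NotImplemented
        return (self.pc, self.count) == (other.pc, other.count)

    def __hash__(self):
        return hash((self.pc, self.count))


class ForeignPair:
    """Stand-in for a wrapped C++ object: default identity-based hashing."""

    def __init__(self, pc, count):
        self.pc = pc
        self.count = count


def as_key(pair):
    """Normalise any pair-like object to a plain (pc, count) tuple."""
    return (pair.pc, pair.count)


region_of = {PyPair(4221392, 211076617): 1}
foreign = ForeignPair(4221392, 211076617)

# Direct lookup with the foreign object does not find the stored entry.
print(foreign in region_of)

# Re-keying the table by tuples makes the lookup independent of object type.
region_by_key = {as_key(pair): region for pair, region in region_of.items()}
print(region_by_key[as_key(foreign)])
```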
|
||
### Debug Flag | ||
|
||
`PcCountTracker` is a debug flag for the `PcCountTrackerManager` SimObject, the underlying object that is responsible for tracking the Program Counter addresses and the number of times each has been executed. It can be enabled on the command line. For example: | ||
|
||
build/X86/gem5.opt --debug-flags=PcCountTracker config_script.py | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -138,6 +138,66 @@ class Cycles | |
friend std::ostream& operator<<(std::ostream &out, const Cycles & cycles); | ||
}; | ||
|
||
class PcCountPair | ||
{ | ||
|
||
private: | ||
|
||
/** The Program Counter address*/ | ||
uint64_t pc; | ||
/** The count of the Program Counter address*/ | ||
int count; | ||
|
||
public: | ||
|
||
/** Explicit constructor assigning the pc and count values*/ | ||
explicit constexpr PcCountPair(uint64_t _pc, int _count) : | ||
pc(_pc), count(_count) {} | ||
|
||
/** Default constructor for parameter classes*/ | ||
PcCountPair() : pc(0), count(0) {} | ||
|
||
/** Returns the Program Counter address*/ | ||
constexpr uint64_t getPC () const {return pc;} | ||
/** Returns the count of the Program Counter address*/ | ||
constexpr int getCount() const {return count;} | ||
|
||
/** Greater than comparison*/ | ||
constexpr bool | ||
operator>(const PcCountPair& cc) const | ||
{ | ||
return count > cc.getCount(); | ||
} | ||
|
||
/** Equal comparison*/ | ||
constexpr bool | ||
operator==(const PcCountPair& cc) const | ||
{ | ||
return (pc == cc.getPC() && count == cc.getCount()); | ||
} | ||
|
||
/** String format*/ | ||
std::string | ||
to_string() const | ||
{ | ||
std::string s = "(" + std::to_string(pc) | ||
+ ", " + std::to_string(count) + ")"; | ||
return s; | ||
} | ||
|
||
/** Enable hashing for this parameter*/ | ||
struct HashFunction | ||
{ | ||
size_t operator()(const PcCountPair& item) const | ||
{ | ||
size_t xHash = std::hash<uint64_t>()(item.pc); | ||
size_t yHash = std::hash<int>()(item.count) << 1; | ||
Review comment: Why left shift 1? |
||
return xHash ^ yHash; | ||
} | ||
}; | ||
|
||
}; | ||
|
||
/** | ||
* Address type | ||
* This will probably be moved somewhere else in the near future. | ||
|
Review comment: I think this should be in its own file. I would make a new file `src/cpu/probes/pc_count_pair.hh`.