Juan Ignacio Bustos Gorostegui
[email protected]
-
A) What were you able to get working and why do you think that you couldn't get the final objective?
- I managed to implement a functional secure memory that correctly simulated the added delay introduced by the Secure Unit acting as a middleware for the CPU's requests. To do so, I had to integrate two new buffers into the design, as the model as it stood could not handle a failed send from either the CPU or the MEM port.
In the new system, instead of sending all the requests to fetch the tree's levels at the same time, the packets are added to a METADATA_REQUEST_BUFFER and only the original request is sent to the MEM port; that way, no metadata request is lost in the event of a failed send. Then, when a NextReqSendEvent is processed, the METADATA_REQUEST_BUFFER is given full priority: only when it is empty do we pull a new packet from the REQUEST_BUFFER. By sending only one message at a time, handling a failed send becomes trivial, and by giving full priority to the METADATA_REQUEST_BUFFER, it is ensured that the steps needed to authenticate a request are completed before proceeding to the next one. Almost the same modifications were made to the handling of responses, adding an AUTHENTICATED_PACKETS_BUFFER with full priority over the original RESPONSE_BUFFER.
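The priority rule above can be sketched as follows. This is a minimal illustration, not the actual gem5 code: the member names (`metadataReqBuffer`, `reqBuffer`, `pickNextRequest`, `retryMetadata`) are hypothetical stand-ins for the buffers described, and plain `int`s stand in for packets.

```cpp
#include <deque>
#include <optional>

// Hypothetical sketch of the described scheme: metadata fetches queued in
// metadataReqBuffer are always drained before any new CPU request is pulled
// from reqBuffer, so only one packet is in flight at a time and a failed
// send can simply be retried on the next event.
struct SecureMemCtrl {
    std::deque<int> metadataReqBuffer; // pending tree-level fetches
    std::deque<int> reqBuffer;         // pending original CPU requests

    // Called on each NextReqSendEvent: choose the next packet to send.
    std::optional<int> pickNextRequest() {
        if (!metadataReqBuffer.empty()) {      // full priority to metadata
            int pkt = metadataReqBuffer.front();
            metadataReqBuffer.pop_front();
            return pkt;
        }
        if (!reqBuffer.empty()) {              // only when metadata is empty
            int pkt = reqBuffer.front();
            reqBuffer.pop_front();
            return pkt;
        }
        return std::nullopt;                   // nothing left to send
    }

    // On a failed send, push the packet back to the front of its buffer
    // so it is retried first on the next NextReqSendEvent.
    void retryMetadata(int pkt) { metadataReqBuffer.push_front(pkt); }
};
```

The response side would follow the same shape, with the AUTHENTICATED_PACKETS_BUFFER taking the role of the high-priority queue.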
Although this implementation worked and the impact of the delay was clearly noticeable, since each metadata request carried its own delay, trying to mount the cache was a failure: the code compiled, but I was not able to simulate without panics. If I had to guess why, I probably missed some step in the C++-to-Python interface that was needed for the components to communicate, or perhaps I did not understand how to set up the cache properly.
-
B) Describe the experiment that you would run if everything was working.
- With a working basic model and its upgrade with an added cache, I would run a basic linear traffic generator to have a standard way to compare the effect of the cache on performance. Metadata requests make up at least 66% of all the traffic to memory, and given that in my implementation each of them incurs its own delay, the impact on performance is severe. With a cache for metadata requests only, not only should memory accesses drop to roughly a third, but the added delay on each request would stop being such a significant problem. All of this could be tested by comparing the ratio of memory requests per CPU request and the total simulated time between the two models.
-
C) Run some applications (getting started suite works, or matrix multiply) using the baseline and discuss the impact that you believe the secure memory changes will have.
- As described in question B), and given how dependent matrix multiplication is on memory-fetch overhead, the impact of the cache would be extremely beneficial to performance, although the result would almost surely still be worse than on a CPU without secure memory.