Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

umulp is more delay intensive than regular umul #1810

Open
proppy opened this issue Dec 20, 2024 · 6 comments
Open

umulp is more delay intensive than regular umul #1810

proppy opened this issue Dec 20, 2024 · 6 comments
Labels
bug Something isn't working or is incorrect codegen Related to emitting (System)Verilog. delay model Predicting/modeling delays of target (backend) processes

Comments

@proppy
Copy link
Member

proppy commented Dec 20, 2024

Describe the bug
umulp characterization show more delay than a * b for asap7.

Expected behavior

umulp should take less time that a*b

Screenshots
a * b yield the following delay preview:
Image

umulp(a, b) yield the following delay preview:
Image

@proppy proppy added bug Something isn't working or is incorrect delay model Predicting/modeling delays of target (backend) processes labels Dec 20, 2024
@proppy
Copy link
Member Author

proppy commented Dec 20, 2024

asap7 8-bit umul:

data_points {
operation {
op: "kUMul"
bit_count: 8
operands {
bit_count: 8
}
operands {
bit_count: 8
}
specialization: OPERANDS_IDENTICAL
}
delay: 553
delay_offset: 94
}

asap7 8-bit umulp:
data_points {
operation {
op: "kUMulp"
bit_count: 8
operands {
bit_count: 8
}
operands {
bit_count: 8
}
specialization: OPERANDS_IDENTICAL
}
delay: 577
delay_offset: 94
}

@proppy
Copy link
Member Author

proppy commented Dec 20, 2024

Looks like this is the case too for sky130 too:

8-bit umul:

data_points {
operation {
op: "kUMul"
bit_count: 8
operands {
bit_count: 8
}
operands {
bit_count: 8
}
specialization: OPERANDS_IDENTICAL
}
delay: 1131
delay_offset: 241
}

8-bit umulp:

data_points {
operation {
op: "kUMulp"
bit_count: 8
operands {
bit_count: 8
}
operands {
bit_count: 8
}
specialization: OPERANDS_IDENTICAL
}
delay: 1181
delay_offset: 241
}

@proppy
Copy link
Member Author

proppy commented Dec 20, 2024

looks like this actually come from codegen: umulp is simply substracting a fix offset from the umul result:

  function automatic [15:0] umulp8b_8b_x_8b (input reg [7:0] lhs, input reg [7:0] rhs);
    reg [7:0] result;
    begin
      result = lhs * rhs;
      umulp8b_8b_x_8b = {8'h43, result - 8'h43};
    end
  endfunction

@proppy proppy added the codegen Related to emitting (System)Verilog. label Dec 20, 2024
@proppy proppy changed the title asap7 model for umulp looks wrong asap7 model/codegen for umulp is more delay intensive than regular umul Dec 20, 2024
@proppy proppy changed the title asap7 model/codegen for umulp is more delay intensive than regular umul umulp is more delay intensive than regular umul Dec 20, 2024
@proppy
Copy link
Member Author

proppy commented Dec 20, 2024

looks like this was introduced w/ af042f3#diff-a2fc2bd44970c5305b7655bcbce2d4e7f01015e489b78fe35bdc46dcfb3b00bd (and that before that one of the partial product was just 0).

@ericastor
Copy link
Collaborator

I think the change linked above was intended to be specific to simulation.

Possibly the actual issue is that XLS currently outputs its simulation-only stand-in for umulp when targeting a PDK that does not provide partial-multiplication operations.

@meheff
Copy link
Collaborator

meheff commented Dec 24, 2024

Umulp enables splitting a multiply into a partial product (which produces two values) and the final accumulation. This can enable better scheduling because a pipeline register can appear in the middle of the multiply. However, this is only useful if you are using an IP libraries has a fast partial product and use the mulp_format flag in order to instantiate the IP block:

- `--smulp_format=...` and `--umulp_format=...` set the format strings for

Otherwise XLS just emits something like ((X * Y - C), C) for some constant C for the mulp which is not helpful. We probably want to emit an error if mulp is used without mulp_format being specified, assuming we don't do that already.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working or is incorrect codegen Related to emitting (System)Verilog. delay model Predicting/modeling delays of target (backend) processes
Projects
None yet
Development

No branches or pull requests

3 participants