Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sger and dger. #18

Merged
merged 1 commit into from
Feb 2, 2015
Merged

Add sger and dger. #18

merged 1 commit into from
Feb 2, 2015

Conversation

archblob
Copy link
Contributor

This is here without the tests so we can discuss it.
Regarding #14.

@cartazio
Copy link
Member

when you have the chance, if you rebase this ontop of head I can happily merge it in (along with some basic testing)

@cartazio
Copy link
Member

bump :)

@archblob
Copy link
Contributor Author

Rebased and added some simple tests.

x <- Matrix.generateMutableDenseVector 2 (\_ -> 2.0)
y <- Matrix.generateMutableDenseVector 2 (\_ -> 3.0)
BLAS.sger 2.0 x y res
resList <- Matrix.mutableVectorToList $ _bufferDenMutMat res
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be good to add a wee comment explaining
A := alpha_x_y' + A, with the number filled in. So that why this test is correct is more immediately obvious.

Which does raise the realted point that we should probably put those formulae in the haddcoks somewhhere, but probably not in the tests or this PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eg a= 2* (2*3) +1= 13

@cartazio cartazio merged commit 7165a33 into wellposed:master Feb 2, 2015
@cartazio
Copy link
Member

cartazio commented Feb 2, 2015

merged, thanks @archblob ! :)

gerAbstraction gerName gerSafeFFI gerUnsafeFFI = ger
where
shouldCallFast :: Int -> Int -> Bool
shouldCallFast m n = flopsThreshold >= (fromIntegral m :: Int64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@archblob derp, should this have been a <= instead? ie we should be calling the fast code IF m*n is LESS than flopsThreshold? shows what I get for merging a PR before doing a close reading

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't really looked at the code since the pr was done almost a year ago :-P. I just rebased and added some tests. Will take a look when I get home.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we should be calling the fast code IF mn is LESS(or LEQ) than flopsThreshold. But I think it is the same that we should be calling the fast code IF flopsThreshold is GREATER(or GEQ) than mn?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct. It is correct as written. Fast is true when the product is
less than the threshold. Probably I should write a little helper
function to make it less confusing
On Thursday, May 7, 2015, Jueji Yang [email protected] wrote:

In src/Numerical/HBLAS/BLAS/Internal.hs
#18 (comment):

@@ -182,6 +186,35 @@ gemvAbstraction gemvName gemvSafeFFI gemvUnsafeFFI constHandler = gemv
(fromIntegral bstride) betaPtr cp (fromIntegral cstride)

+{-# NOINLINE gerAbstraction #-}
+gerAbstraction :: (SM.Storable el, PrimMonad m)

  •           => String
    
  •           -> GerFunFFI el
    
  •           -> GerFunFFI el
    
  •           -> forall orient . GerFun el orient (PrimState m) m
    
    +gerAbstraction gerName gerSafeFFI gerUnsafeFFI = ger
  • where
  •  shouldCallFast :: Int -> Int -> Bool
    
  •  shouldCallFast m n = flopsThreshold >= (fromIntegral m :: Int64)
    

Yes, I think we should be calling the fast code IF m_n is LESS_(or LEQ)
than flopsThreshold. But I think it is the same that we should be calling
the fast code IF flopsThreshold is GREATER(or GEQ) than m
n?


Reply to this email directly or view it on GitHub
https://github.com/wellposed/hblas/pull/18/files#r29869978.

@cartazio
Copy link
Member

cartazio commented Feb 2, 2015

@archblob no worries, i'll actually audit it properly + all the other FFI bindings when i have a chance in the next few days

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants