
Performance of (+,-,*,/)(::TaylorScalar,::Number): easy short path? #25

Closed
lrnv wants to merge 4 commits

Conversation

lrnv
Contributor

@lrnv commented Feb 7, 2023

Hey,

The current implementation of *(::TaylorScalar,::Number) in your package promotes the scalar to a TaylorScalar, like so:

for op in (:+, :-, :*, :/)
    @eval @inline $op(a::TaylorScalar, b::Number) = $op(promote(a, b)...)
    @eval @inline $op(a::Number, b::TaylorScalar) = $op(promote(a, b)...)
end

and then uses the full-blown multiplication of TaylorScalars:

@generated function *(a::TaylorScalar{T, N}, b::TaylorScalar{T, N}) where {T, N}
    return quote
        va, vb = value(a), value(b)
        @inbounds TaylorScalar($([:(+($([:($(binomial(i - 1, j - 1)) * va[$j] *
                                           vb[$(i + 1 - j)]) for j in 1:i]...)))
                                  for i in 1:N]...))
    end
end

Although very general and elegant, this seems super wasteful, as a bunch of multiplications by zero will occur and still cost runtime.
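For example, with N = 3 the promoted operand's coefficients are just (b, 0, 0), so the unrolled product above still multiplies and adds every zero term (illustrative expansion, not literal package output):

using TaylorDiff

# Illustrative only: what the generated product effectively computes for N = 3
# when b::Number has been promoted to the coefficient tuple vb = (b, 0, 0).
va = (1.0, 2.0, 3.0)   # coefficients of some TaylorScalar{Float64, 3}
b  = 0.5
TaylorScalar(1 * va[1] * b,
             1 * va[1] * 0 + 1 * va[2] * b,
             1 * va[1] * 0 + 2 * va[2] * 0 + 1 * va[3] * b)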

t = TaylorScalar(rand(20)...)
x = rand(10)
@btime Ref(t) .* x; # 922 ns
@btime (TaylorScalar(t.value .* xi) for xi in x) # 58 ns

and so a specific function would be nice. The same thing occurs for +, -, / if I read your code correctly.

As I am not well versed in the @eval shenanigans, the code I produced is quite ugly and has unnecessary loops, but I could not fix that myself ;)

Please tell me what you think; I got a 40% speedup in my application with this.
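(Not the actual PR diff, just to make the idea concrete: a minimal sketch of such a short path, assuming value(a) returns the coefficient tuple as in the generated function above, and the varargs constructor shown in the benchmark.)

using TaylorDiff
import Base: *

# Sketch only: scale every coefficient of `a` directly instead of promoting `b`
# to a TaylorScalar and running the full convolution above.
@inline *(a::TaylorScalar{T, N}, b::Number) where {T, N} =
    TaylorScalar(map(x -> x * b, TaylorDiff.value(a))...)
@inline *(a::Number, b::TaylorScalar{T, N}) where {T, N} = b * a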

@codecov-commenter

Codecov Report

Base: 86.44% // Head: 86.81% // Increases project coverage by +0.36% 🎉

Coverage data is based on head (280bb66) compared to base (7b2d2d6).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #25      +/-   ##
==========================================
+ Coverage   86.44%   86.81%   +0.36%     
==========================================
  Files           5        5              
  Lines         214      220       +6     
==========================================
+ Hits          185      191       +6     
  Misses         29       29              
Impacted Files Coverage Δ
src/scalar.jl 100.00% <100.00%> (ø)

@tansongchen
Member

tansongchen commented Feb 7, 2023

Thanks for your contribution. Yes, I agree this will be a lot more efficient; I just haven't had a chance to address it. If you take a look at our continuous benchmark page (on the page, navigate to your branch first), you will notice that the performance for the single-variable Taylor expansion case improved a lot.

There is a caveat, though: the performance for PINN regressed. That's because Zygote is too weak to infer type through your [2:end] semantics. But you probably don't want to dig into the Zygote compatibility layer (which is a huge mess 😱 ) so I will fix that part for you.

@lrnv
Contributor Author

lrnv commented Feb 7, 2023

I'm sorry, I cleaned up the work and force-pushed onto my branch, so if you want to pick up from there you can, of course.

Edit: Yes, indeed your benchmarks show a nice improvement! This is super cool stuff you've got there with these benchmarks btw; having them in real time on PRs and branches is neat!

@tansongchen
Member

Yeah, that is built on PkgBenchmark.jl and my own little frontend; still WIP and buggy, but feel free to take a look and play around.

@lrnv
Contributor Author

lrnv commented Feb 8, 2023

Hey

I added convenience functions to be able to multiply and/or divide TaylorScalar{T,N}'s with different Ts. This was missing and is very helpful for me.

Indeed, I need to differentiate through TaylorScalar's machinery, so I end up with types of the form TaylorScalar{ForwardDiff.Dual{...},N}.
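(For illustration only, not the exact methods added in this PR: one way to allow mixed element types is to promote the coefficient tuples first, again assuming value and the varargs constructor.)

using TaylorDiff
import Base: *

# Hypothetical sketch: multiply two TaylorScalars with different element types
# T and S by converting both coefficient tuples to a common promoted type.
function *(a::TaylorScalar{T, N}, b::TaylorScalar{S, N}) where {T, S, N}
    R = promote_type(T, S)
    TaylorScalar(R.(TaylorDiff.value(a))...) * TaylorScalar(R.(TaylorDiff.value(b))...)
end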

Btw, is there a gradient function planned, or just the derivative?

@ToucheSir

ToucheSir commented Feb 9, 2023

> That's because Zygote is too weak to infer type through your [2:end] semantics

Instead of tuple slicing, you can use Base.tail or ntuple(i -> mytup[i + 1], length(mytup) - 1). It's uglier but more likely to be type stable; we use this pattern across a bunch of FluxML code. It should be possible to improve the type stability of tuple indexing in Zygote, but the reality is that anything which isn't a bugfix is low priority because of limited dev resources.
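For instance (plain tuples, nothing package-specific):

t = (1.0, 2.0, 3.0, 4.0)

t[2:end]                              # tuple slicing; hard for Zygote to infer
Base.tail(t)                          # (2.0, 3.0, 4.0), type-stable
ntuple(i -> t[i + 1], length(t) - 1)  # same result, spelled out by hand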

@tansongchen
Member

Thanks @ToucheSir, that sounds like a quick fix. Maybe @lrnv can try that?

@tansongchen
Member

> Btw, is there a gradient function planned or just the derivative?

And regarding your gradient question, it depends on what you mean:

  1. gradient(f, xs, n) will compute the n-th partial derivative for each x in xs, e.g. compute f_xx and f_yy at once. This is easy to implement (like a convenience wrapper), and would indeed make some use cases easier (a rough sketch of this reading follows the list).
  2. gradient(f, xs, n) will compute the n-th Hessian-like tensor (which is an n-dimensional tensor); I didn't see a real use case for that, and it isn't really something that Taylor-mode is good at. Taylor-mode is very good for directional derivatives, and probably also good for nesting directional derivatives 2 or 3 times (like f_xxyyy), but not for the full derivative tensor.
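A rough sketch of reading 1 (purely illustrative; it assumes a directional-derivative entry point like derivative(f, x, l, order), and the real API may differ):

using TaylorDiff

# Hypothetical convenience wrapper: the n-th pure partial derivative along each
# coordinate axis, e.g. (f_xx, f_yy) for n = 2 in two dimensions.
function gradient_like(f, xs::AbstractVector, n::Integer)
    map(eachindex(xs)) do i
        l = zero(xs)
        l[i] = one(eltype(xs))        # i-th unit direction
        derivative(f, xs, l, n)       # assumed directional-derivative signature
    end
end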

@tansongchen
Member

BTW, you can use julia -e 'using JuliaFormatter; format(".", verbose=true)' to format your code, as shown in https://github.com/JuliaDiff/TaylorDiff.jl/blob/main/.github/workflows/FormatCheck.yml

@tansongchen
Member

Changing slicing to tail didn't fix the problem... I need to take a closer look next week

@lrnv
Contributor Author

lrnv commented Feb 24, 2023

Hey, sorry I did not catch up on your responses, I did not see them...

@tansongchen
Member

I will be working on this during the summer, though with a different approach. I am currently adding chunking support to this package, like ForwardDiff.jl, something like this:

struct TaylorScalar{V, N, K}
  primal::V
  partials::NTuple{K, NTuple{N, V}}
end

where K is the order and N is the chunk size, as in ForwardDiff.jl. partials will be a polynomial whose coefficients are tuples, not numbers.

Based on this, we have easy access to the primal value via getfield, so we won't face the same issue as we currently have.
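(Illustrative only; this constructor is hypothetical.)

# With the proposed layout the primal is a plain struct field, so reading it is
# a getfield, not a tuple slice: e.g. with V = Float64, N = 2, K = 2
t = TaylorScalar(1.0, ((0.1, 0.2), (0.01, 0.02)))  # hypothetical constructor
t.primal                                           # == 1.0, direct field access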

@lrnv
Contributor Author

lrnv commented May 22, 2024

Thanks for the feedback @tansongchen. For me the real issue is #61, and until that is dealt with I cannot switch from TaylorSeries.jl to TaylorDiff.jl to benefit from the (already better) performance, let alone think of even better performance from this PR.

@lrnv
Contributor Author

lrnv commented Sep 25, 2024

closed in favor of #81

@lrnv lrnv closed this Sep 25, 2024