rustdoc: use a templating engine to generate HTML #84419

Closed
jsha opened this issue Apr 22, 2021 · 45 comments
Labels
A-rustdoc-ui Area: Rustdoc UI (generated HTML) C-enhancement Category: An issue proposing an enhancement or a PR with one. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.

Comments

@jsha
Contributor

jsha commented Apr 22, 2021

Right now rustdoc generates HTML with a series of writes embedded in the Rust code. This means the Rust code needs to process the elements of a page in the same order that they will be emitted in HTML. It makes it hard to get a holistic view of the page structure. It means we need to take care to make sure tags are balanced, which can be particularly hard when a function has multiple return points. It means in order to make any changes to the HTML, you need a good understanding of the Rust code.
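
To make the tag-balancing problem concrete, here is a minimal sketch in the write-as-you-go style. The function names and the HTML are hypothetical, not taken from rustdoc's source, and HTML escaping is ignored for brevity. The first function has an early return that skips the closing tag; the second closes the tag on every path.

```rust
use std::fmt::Write;

/// Hypothetical renderer in the write-as-you-go style (not rustdoc's real code).
/// The early return skips the closing tag, so the output is unbalanced.
fn render_item_buggy(out: &mut String, name: &str, docs: Option<&str>) {
    write!(out, "<div class=\"item\">").unwrap();
    write!(out, "<h2>{}</h2>", name).unwrap();
    let docs = match docs {
        Some(d) => d,
        None => return, // bug: `</div>` is never written on this path
    };
    write!(out, "<p>{}</p>", docs).unwrap();
    write!(out, "</div>").unwrap();
}

/// Same structure, but every path closes the tag it opened.
fn render_item_fixed(out: &mut String, name: &str, docs: Option<&str>) {
    write!(out, "<div class=\"item\">").unwrap();
    write!(out, "<h2>{}</h2>", name).unwrap();
    if let Some(d) = docs {
        write!(out, "<p>{}</p>", d).unwrap();
    }
    write!(out, "</div>").unwrap();
}
```

Calling `render_item_buggy` with `None` for the docs produces an opening `<div>` that is never closed. With a template, the opening and closing tags would sit next to each other in one file, which makes this kind of mistake easier to spot.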

If we use a templating engine, we can separate the templates from the Rust code, improving those problems. It would also mean we could have a mode for rustdoc to read templates at runtime instead of using its compiled-in templates. That would make iterating on HTML improvements easier, because we wouldn't have to wait for a compile cycle for each change.

There are a variety of templating engines out there, but I think mustache is quite good. In particular the choice to have no logic in templates is important, because otherwise it is tempting to put some logic in templates and some in the driving code, which becomes confusing. In Rust, the handlebars crate looks mature and actively maintained. It implements the handlebars template language, which is evidently different from mustache but pretty close.

@jsha jsha added T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. C-enhancement Category: An issue proposing an enhancement or a PR with one. A-rustdoc-ui Area: Rustdoc UI (generated HTML) labels Apr 22, 2021
@ghost

ghost commented Apr 22, 2021

https://github.com/Keats/tera
Tera is a good engine too.
It's similar to the Django template engine, which is the best and friendliest engine.
https://docs.djangoproject.com/en/3.1/topics/templates/
I like Rust, and like Python.

@GuillaumeGomez
Member

If we use a templating engine, we can separate the templates from the Rust code, improving those problems. It would also mean we could have a mode for rustdoc to read templates at runtime instead of using its compiled-in templates. That would make iterating on HTML improvements easier, because we wouldn't have to wait for a compile cycle for each change.

No, we don't want anything at runtime because it requires running a server, which we definitely don't want. Closing this issue then.

@Swatinem
Contributor

we don't want anything at runtime because it requires running a server, which we definitely don't want.

I think you misinterpreted this. It's rather "at documentation generation time". So it would be more like a static site generator. No need to run a server.

I could imagine an unstable --template-dir switch as well to control the templates being used, to maximize iteration speed on this.
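
A minimal sketch of how such a switch could behave, assuming a hypothetical `load_page_template` helper, a hypothetical `page.html` file name, and a placeholder built-in template (none of these exist in rustdoc): when a template directory is given, the template is read from disk at documentation generation time; otherwise the compiled-in copy is used.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Placeholder for a template compiled into the binary
/// (in practice this could come from `include_str!`).
const BUILTIN_PAGE_TEMPLATE: &str = "<main>{{ content }}</main>";

/// Returns the page template source.
/// With `Some(dir)`, reads `dir/page.html` from disk so edits take effect
/// without rebuilding; with `None`, falls back to the compiled-in template.
fn load_page_template(template_dir: Option<&Path>) -> io::Result<String> {
    match template_dir {
        Some(dir) => fs::read_to_string(dir.join("page.html")),
        None => Ok(BUILTIN_PAGE_TEMPLATE.to_string()),
    }
}
```

Returning an error when the directory is given but the file is missing, rather than silently falling back, is a design choice in this sketch: it keeps a typo in the path from going unnoticed.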

@GuillaumeGomez
Member

GuillaumeGomez commented Apr 22, 2021

Oh, my bad. I'm still not a big fan of the idea of depending on an external template engine though...

EDIT: to add a bit more on my comment: rustdoc can generate up to thousands of files for one crate. I'm afraid that using a template engine might have a big (negative) impact on performance. But that's just a hunch; we need actual numbers before saying such things and I don't have them. And I'm afraid that if someone makes the implementation and the numbers are bad, it might just end up being a lot of work for nothing. Not really a great experience...

@jyn514
Member

jyn514 commented Apr 22, 2021

In particular the choice to have no logic in templates is important, because otherwise it is tempting to put some logic in templates and some in the driving code, which becomes confusing.

FYI almost all template languages have at least if conditions/for loops, including handlebars. Maybe that's not what you meant though?

@jyn514
Member

jyn514 commented Apr 22, 2021

And yes, I would prefer tera if possible for consistency with docs.rs.

@jsha
Contributor Author

jsha commented Apr 22, 2021

I hear you, @GuillaumeGomez, about introducing extra dependencies, particularly in something that's such a core, critical tool. I think the benefit in maintainability is likely to be worth it.

rustdoc can generate up to thousands of files for one crate. I'm afraid that using a template engine might have a big (negative) impact on performance. But that's just a hunch; we need actual numbers before saying such things and I don't have them. And I'm afraid that if someone makes the implementation and the numbers are bad, it might just end up being a lot of work for nothing.

These are definite risks. I suspect the performance will be good. Templating engines are often used as part of the live request path in web servers, so they have to be fast. They might even do a better job of optimizing string handling than our current series of write! calls does.

almost all template languages have at least if conditions/for loops, including handlebars. Maybe that's not what you meant though?

Mustache describes itself as logic-less:

We call it "logic-less" because there are no if statements, else clauses, or for loops. Instead there are only tags. Some tags are replaced with a value, some nothing, and others a series of values. This document explains the different types of Mustache tags.

And Handlebars has some discussion about what it means by logic-less.

But I'm not too picky. I think we would benefit from any templating engine, and if we go that route, one that is familiar to some team members already would of course be great.

@GuillaumeGomez
Member

As long as it doesn't require external libraries to be installed to work and doesn't kill performance, I think I can live with it. :)

@workingjubilee
Member

https://github.com/Keats/tera
Tera is a good engine too.
It's similar to the Django template engine, which is the best and friendliest engine.
https://docs.djangoproject.com/en/3.1/topics/templates/
I like Rust, and like Python.

Tera, as a Jinja dialect, admits a fairly "large language" compared to handlebars.
If we were to use a Jinja dialect, askama is similar and mostly compiled.
This leads to some considerable performance benefits, which @djc, the creator of askama (and notable Rust contributor) has investigated: https://github.com/djc/template-benchmarks-rs
These benchmarks show that most template engines do not beat std; they mostly compete with it. Interestingly, maud and markup do beat it, and both have the distinction of still being mostly Rust (they're basically just macros).

@djc
Contributor

djc commented Apr 22, 2021

I'll try not to shill for Askama too much, but to some extent I actually consider the type safety of Askama (and Askama-like engines) to be more interesting/useful than its performance (though both are, of course, valuable).

In any case, happy to answer any questions that come up.

@ghost

ghost commented Apr 23, 2021

Tera and Askama have both been good so far. I am using Tera; it has better documentation. Speed and performance have been good so far, and the feature set seems complete.

Askama looks good too, though its documentation seems to need some work.

@GuillaumeGomez
Member

Little sidenote: something we'll need to be very careful about is the size of the rendered template. For example, on docs.rs, tera generates a lot of whitespace. And considering the number of files rustdoc generates, such "unused" chars will have a cost pretty quickly.

@jyn514
Member

jyn514 commented Apr 23, 2021

@GuillaumeGomez could we minify the generated HTML? I think we do that already for some of the shared files.

I'd like to suggest we not bikeshed the exact template engine too much - Askama seems nice and I'm fine with that, but I suspect it will be a roughly equivalent, massive amount of work whichever engine we pick.

@GuillaumeGomez
Member

Just to be clear: it is insanely complicated to minify HTML once it's generated. It has to take multiple things into account. For example: when is it fine to remove line breaks and whitespace? Just answering such a simple question is quite tricky. And in all cases that requires parsing the HTML, which is another huge problem.

So to put it more simply: it has to be handled correctly before the generation by the template engine itself.
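
As a small illustration of why this is tricky, here is a deliberately naive sketch (the `collapse_whitespace` name is hypothetical, and this is not a proposed approach). It collapses every run of whitespace into a single space without parsing the HTML, which is harmless inside a paragraph but destroys the layout of a `<pre>` block, where line breaks and indentation are significant.

```rust
/// Naive "minifier": collapses every run of whitespace into one space.
/// It deliberately ignores HTML structure to show what goes wrong.
fn collapse_whitespace(html: &str) -> String {
    html.split_whitespace().collect::<Vec<_>>().join(" ")
}
```

Applied to `<p>a\n  b</p>` this gives `<p>a b</p>`, which renders the same. Applied to a `<pre>` block containing a multi-line code sample, it flattens the sample onto one line, so the rendered page changes. Doing this correctly means knowing which element each piece of whitespace belongs to, which in turn means parsing the HTML.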

@jsha
Contributor Author

jsha commented Apr 23, 2021

I'd like to suggest we not bikeshed the exact template engine too much - Askama seems nice and I'm fine with that, but I suspect it will be a roughly equivalent, massive amount of work whichever engine we pick.

Agreed. I'm seeing a good discussion of template engines, but I haven't heard much about whether people think it's a worthwhile endeavor or not. I'd like to hear more of that! Even if the answer is "not worthwhile right now." :-)

@GuillaumeGomez
Member

I think everyone agreed on the fact that it would improve code a lot. I was sharing my concerns but I also think it's a good idea for code maintenance. So here are the three mandatory things:

  1. Doesn't require external dependency (I think it's obvious but better to write it down).
  2. Doesn't generate HTML files bigger than they currently are.
  3. Doesn't kill performance (I think it's the least difficult one based on the previous comments).

For the rest, I only know jinja2 as template engine from docs.rs (and a few others from python but out of scope here). So I'll let you pick one and simply enjoy the ride. :)

@jsha
Contributor Author

jsha commented Apr 23, 2021

Doesn't require external dependency (I think it's obvious but better to write it down).

When you say external dependency, I assume you don't mean "a crate," since all the options we're discussing are crates. Do you mean, e.g. a binary or library that's not managed by crates?

@GuillaumeGomez
Member

Crates are fine. I meant external libraries/binaries.

@jsha
Contributor Author

jsha commented Apr 25, 2021

I gave Askama a try this afternoon. There's a lot to like about it! You can see my very small example program below. I tried to see what it would look like for the settings page.

The main drawbacks in my mind are:

  • Templates must be built at compile time, not runtime. One of the big benefits I'd hoped to get from templatizing was to speed up the rustdoc maintainer workflow by not requiring a rebuild of rustdoc on every HTML change.
  • Parse errors for malformed templates currently don't give much guidance about where the mistake is. For instance:

error: unable to parse template:

"{% for setting in settings %}\n{{% match setting %}\n{% when Setting::Toggle with {js_data_name, description, default_value} %}\n  <div class=\"setting-line\">\n       <label class=\"toggle\">\n         <input type=\"checkbox\" id=\"{{ js_data_name }} \" >\n         <span class=\"slider\"></span>\n       </label>\n       <div>{{ description }}</div>\n   </div>\n{% else %}\n{% endmatch %}\n{% endfor %}\n</ul>"

The template (settings.html):

<ul>
{% for setting in settings %}
{% match setting %}
{% when Setting::Toggle with {js_data_name, description, default_value} %}
  <div class="setting-line">
       <label class="toggle">
         <input type="checkbox" id="{{ js_data_name }}">
         <span class="slider"></span>
       </label>
       <div>{{ description }}</div>
   </div>
{% else %}
{% endmatch %}
{% endfor %}
</ul>

The Rust program:

use askama::Template;

#[derive(Template)]
#[template(path = "settings.html")]
struct Settings {
    settings: Vec<Setting>,
}

#[derive(Debug)]
enum Setting {
    Section {
        description: &'static str,
        sub_settings: Vec<Setting>,
    },
    Toggle {
        js_data_name: &'static str,
        description: &'static str,
        default_value: bool,
    },
    Select {
        js_data_name: &'static str,
        description: &'static str,
        default_value: &'static str,
        options: Vec<(String, String)>,
    },
}

fn main() {
    let s = Settings {
        settings: vec![Setting::Toggle {
            js_data_name: "foo",
            description: "bar",
            default_value: true,
        }],
    };
    println!("{}", s.render().unwrap());
}

@jsha
Contributor Author

jsha commented May 16, 2021

I tried out tera and like it so far. It allows runtime loading of templates, and it has friendly error messages. It has features to suppress whitespace when needed.

Unfortunately it was one of the worst scorers on @djc's templating benchmarks, along with handlebars.

use tera::Tera;
use tera::Context;
use serde::Serialize;

#[derive(Serialize)]
struct Settings {
    settings: Vec<Setting>,
}

#[derive(Serialize)]
enum Setting {
    Toggle(bool),
}

fn main() {
    // Use globbing
    let tera = match Tera::new("templates/**/*.html") {
        Ok(t) => t,
        Err(e) => {
            println!("Parsing error(s): {}", e);
            ::std::process::exit(1);
        }
    };

    let s = Settings {
        settings: vec![Setting::Toggle(true)],
    };

    println!("{}", tera.render("settings.html", &Context::from_serialize(s).unwrap()).unwrap());
}

templates/settings.html:

<form>
{%- for setting in settings -%}
  <input type="checkbox"
    {%- if setting == true -%}
      checked
    {- endif -%}
  >
{%- endfor -%}
</form>

Error output when templates are malformed (as the above is):

Parsing error(s): 
* Failed to parse "templates/settings.html"
 --> 8:1
  |
8 | {%- endfor -%}␊
  | ^---
  |
  = unexpected tag; expected an `elif` tag, an `else` tag, an endif tag (`{% endif %}`), or some content

Output when the template is fixed:

<form><input type="checkbox"></form>

@jyn514
Member

jyn514 commented May 16, 2021

@jsha how big is the difference in performance? Rustdoc doesn't spend most of its time in render, if it's a linear slowdown it's probably ok.

@jsha
Contributor Author

jsha commented May 16, 2021

https://github.com/djc/template-benchmarks-rs

Big table/Tera time: [3.3310 ms 3.3435 ms 3.3555 ms]
Big table/write time: [337.27 us 339.34 us 341.40 us]

Teams/Tera time: [7.8097 us 7.8556 us 7.9058 us]
Teams/write time: [697.02 ns 702.11 ns 708.27 ns]

So, about 10x on the fairly small benchmark examples, but I don't know how that would actually play out in the real world with rustdoc, particularly given what you say about rustdoc not spending most of its time in rendering.

@jyn514
Member

jyn514 commented May 16, 2021

For comparison, render takes 8% of the time on syn, a "typical" case for rustdoc; 1% on diesel, a trait heavy crate; and 16% on stm32, which is a stress test in general (it has almost 12000 items).
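
As a rough back-of-the-envelope sketch of what those shares could mean, assume (pessimistically, and only for illustration) that all of render time slows down by the roughly 10x ratio seen in the microbenchmarks above, and that nothing else changes. The `overall_slowdown` function is a hypothetical helper for this arithmetic, not a measurement of rustdoc.

```rust
/// Overall slowdown factor when only the rendering phase gets slower.
/// `render_share` is the fraction of total time currently spent rendering
/// (between 0.0 and 1.0); `render_slowdown` is how many times slower
/// rendering becomes. The rest of the run is assumed to be unchanged.
fn overall_slowdown(render_share: f64, render_slowdown: f64) -> f64 {
    (1.0 - render_share) + render_share * render_slowdown
}
```

Under that assumption the totals would come out to about 1.72x for syn (8%), 1.09x for diesel (1%), and 2.44x for stm32 (16%). Real numbers would likely be lower, since not all of render time is string formatting that a template engine would replace, so this is better read as a loose upper bound than as a prediction.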

@est31
Member

est31 commented Jun 9, 2021

Could Askama perhaps offer an alternative mode where it compiles down to a different template language, or can be evaluated by a dynamic template engine? That mode could then be enabled for rustdoc developers and would allow fast edit/debug cycles when editing the HTML, which was raised as one of the concerns, while keeping top performance outside of developer mode.

@jyn514
Member

jyn514 commented Jun 9, 2021

@est31 what benefit would that have over just using a dynamic template language? It seems like an awful lot of work.

@est31
Member

est31 commented Jun 9, 2021

@jyn514 the dynamic templating language is slow, no? Using one would impair rustdoc run times for users. So have a dynamic mode for rapid development, and a compiled-in mode for runtime.

@jyn514
Member

jyn514 commented Jun 9, 2021

@est31 at that point it would be simpler to use tera for debug builds and askama for release or something. But it doesn't matter anyway because rustdoc is horrifically slow in debug builds, so everyone would only use askama in practice.

@djc
Contributor

djc commented Jun 9, 2021

Askama is basically a simple compiler which parses templates into an AST and then generates Rust code from that AST. There's probably not a lot of hidden complexity in writing a different code generator backend for Askama, and I'd be happy to mentor anyone who's interested in working on that.

Another approach might be to just reuse Tera templates for Askama. The template languages are fairly similar and probably have a fairly large common subset.

@jyn514
Member

jyn514 commented Jun 9, 2021

Err I guess you could have a "dev" mode that only switches between askama and dynamic template language and still uses --release unconditionally. But that seems confusing.

@est31
Member

est31 commented Jun 9, 2021

@jyn514 I was thinking about a config.toml value that controls this, maybe influenced by the profile.

@jyn514
Member

jyn514 commented Jun 9, 2021

All of this seems like premature optimization anyway; the hard part is actually switching to a template language, and switching between languages afterwards is very simple in comparison.

@est31
Member

est31 commented Jun 9, 2021

@jyn514 good point. This dynamic rendering mode is probably better discussed in an askama issue. I've created one: djc/askama#491. Once there is templating it should be easy to switch from one language to another.

bors added a commit to rust-lang-ci/rust that referenced this issue Jun 21, 2021
Use Tera templates for rustdoc.

Replaces a format!() call in layout::render with a template
expansion. Introduces a `templates` field in SharedContext so parts
of rustdoc can share pre-rendered templates.

This currently builds in a copy of the single template available, like
with static files. However, future work can make this live-loadable with
a perma-unstable flag, to make rustdoc developers' work easier.

Part of rust-lang#84419.

Demo at https://hoffman-andrews.com/rust/tera/std/string/struct.String.html.
@jsha
Contributor Author

jsha commented Oct 9, 2021

One interesting thing I noticed when writing #89695: Because the template input objects have to implement Serialize, I had to collect items into a Vec when otherwise an iterator would have done fine. This produces a bit of extra allocation over what we might prefer. There's some discussion in serde-rs/serde#571 about how serde deals with implementing Serialize for iterators. Intuitively, things that impl Serialize can be serialized multiple times, but an iterator can only be iterated once, so it's not a good candidate for impl Serialize.

I'd be interested in ideas on how to improve the situation. I'm not sure how much impact it will have on performance, but if it's something we can solve it would be nice to do so.
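
To illustrate the mismatch without pulling in serde, here is a std-only sketch. `Render` is a hypothetical stand-in for a `Serialize`-like trait that takes `&self`, and `OnceIter` is one possible workaround that is an assumption of this sketch, not something proposed in this thread: it hides the iterator behind a `RefCell<Option<_>>` so it can be consumed through a shared reference, at the cost of producing nothing on a second call.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for a `Serialize`-like trait: it takes `&self`,
/// so callers may render the same value any number of times.
trait Render {
    fn render(&self) -> String;
}

/// Joins items with commas, consuming the iterator as it goes.
fn join_items<I: Iterator<Item = String>>(items: I) -> String {
    let mut out = String::new();
    for (i, item) in items.enumerate() {
        if i > 0 {
            out.push(',');
        }
        out.push_str(&item);
    }
    out
}

/// A `Vec` can be rendered repeatedly, but it has to be allocated first.
impl Render for Vec<String> {
    fn render(&self) -> String {
        join_items(self.iter().cloned())
    }
}

/// Wrapper that lets an iterator be consumed through `&self`, exactly once.
struct OnceIter<I>(RefCell<Option<I>>);

impl<I> OnceIter<I> {
    fn new(iter: I) -> Self {
        OnceIter(RefCell::new(Some(iter)))
    }
}

impl<I: Iterator<Item = String>> Render for OnceIter<I> {
    fn render(&self) -> String {
        // `take()` leaves `None` behind, so a second call finds no iterator.
        match self.0.borrow_mut().take() {
            Some(items) => join_items(items),
            None => String::new(),
        }
    }
}
```

Rendering the `Vec` twice gives the same result both times, while rendering the `OnceIter` twice gives the items once and then an empty string. That second behavior is the surprising part, and it is the reason an iterator is an awkward fit for a trait whose contract allows repeated calls.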

@GuillaumeGomez
Member

We can run a perf check if you think it will have a big impact?

@jsha
Contributor Author

jsha commented Oct 9, 2021

I think at the scale we're using it in #89695 it won't show up in profiling - it's a vec of ~3 entries, each of them short strings. Probably when we get to a place that calls for larger Vecs (like a vec of all items in a doc page), we should profile.

bors added a commit to rust-lang-ci/rust that referenced this issue Oct 10, 2021
Move top part of print_item to Tera templates

Part of rust-lang#84419.

This moves the first line of each item page (e.g. `Struct foo::Bar .... 1.0.0 [-][src]`) into a Tera template.

I also moved template initialization into its own module and added a small macro to reduce duplication and opportunity for errors.
@jyn514
Member

jyn514 commented Jan 2, 2022

We have currently stopped transitioning to templates because it's a fairly large performance regression (up to 3.5% just for switching print_item and no other calls: #89732). I discussed this with the team on Zulip and we decided to try Askama out for now to see if it can avoid the performance regressions. Would love for someone to make a PR switching the current tera templates to Askama!

@djc
Contributor

djc commented Jan 3, 2022

See #92526.

djc added a commit to djc/rust that referenced this issue Jan 10, 2022
bors added a commit to rust-lang-ci/rust that referenced this issue Jan 13, 2022
Migrate rustdoc from Tera to Askama

See rust-lang#84419.

Should probably get a benchmarking run to verify if it has the intended effect on rustdoc performance.

cc `@jsha` `@jyn514`.
@camelid
Member

camelid commented Jan 13, 2022

Should we close this issue? I think unless we run into issues, we're going to gradually convert to Askama templates. So I don't think there's much point in leaving this open.

@GuillaumeGomez
Member

It should have been closed when we added tera already.


9 participants