Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport/forward port ch12 #4008

Merged
merged 6 commits into from
Aug 14, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
683 changes: 375 additions & 308 deletions nostarch/chapter12.md

Large diffs are not rendered by default.

28 changes: 7 additions & 21 deletions packages/tools/src/bin/link2print.rs
Original file line number Diff line number Diff line change
Expand Up @@ -46,27 +46,11 @@ fn parse_references(buffer: String) -> (String, HashMap<String, String>) {
fn parse_links((buffer, ref_map): (String, HashMap<String, String>)) -> String {
// FIXME: check which punctuation is allowed by spec.
let re = Regex::new(r###"(?:(?P<pre>(?:```(?:[^`]|`[^`])*`?\n```\n)|(?:[^\[]`[^`\n]+[\n]?[^`\n]*`))|(?:\[(?P<name>[^]]+)\](?:(?:\([[:blank:]]*(?P<val>[^")]*[^ ])(?:[[:blank:]]*"[^"]*")?\))|(?:\[(?P<key>[^]]*)\]))?))"###).expect("could not create regex");
let error_code =
Regex::new(r###"^E\d{4}$"###).expect("could not create regex");
let output = re.replace_all(&buffer, |caps: &Captures<'_>| {
match caps.name("pre") {
Some(pre_section) => pre_section.as_str().to_string(),
None => {
let name = caps.name("name").expect("could not get name").as_str();
// Really we should ignore text inside code blocks,
// this is a hack to not try to treat `#[derive()]`,
// `[profile]`, `[test]`, or `[E\d\d\d\d]` like a link.
if name.starts_with("derive(") ||
name.starts_with("profile") ||
name.starts_with("test") ||
name.starts_with("no_mangle") ||
name.starts_with("cfg") ||
name.starts_with("unoptimized") ||
name.starts_with("ignore") ||
name.starts_with("should_panic") ||
error_code.is_match(name) {
return format!("[{name}]")
}

let val = match caps.name("val") {
// `[name](link)`
Expand All @@ -81,8 +65,10 @@ fn parse_links((buffer, ref_map): (String, HashMap<String, String>)) -> String {
_ => ref_map.get(&key.as_str().to_uppercase()).unwrap_or_else(|| panic!("could not find url for the link text `{}`", key.as_str())).to_string(),
}
}
// `[name]` as reference
None => ref_map.get(&name.to_uppercase()).unwrap_or_else(|| panic!("could not find url for the link text `{name}`")).to_string(),
// `[name]` is within code and should not be treated as a link
None => {
return format!("[{name}]");
}
}
}
};
Expand Down Expand Up @@ -236,11 +222,11 @@ more text"
}

#[test]
fn parses_link_without_reference_as_reference() {
fn does_not_parse_link_without_reference_as_reference() {
let source = r"[link] is alone
[link]: The contents"
.to_string();
let target = r"link at *The contents* is alone".to_string();
let target = r"[link] is alone".to_string();
assert_eq!(parse(source), target);
}

Expand Down Expand Up @@ -294,7 +280,7 @@ version = "0.1.0"

[dependencies]
```
Another [link]
Another [link][]
more text
[link]: http://gohere
"###
Expand Down
16 changes: 7 additions & 9 deletions src/ch12-00-an-io-project.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,8 @@ Along the way, we’ll show how to make our command line tool use the terminal
features that many other command line tools use. We’ll read the value of an
environment variable to allow the user to configure the behavior of our tool.
We’ll also print error messages to the standard error console stream (`stderr`)
instead of standard output (`stdout`), so, for example, the user can redirect
successful output to a file while still seeing error messages onscreen.
instead of standard output (`stdout`) so that, for example, the user can
redirect successful output to a file while still seeing error messages onscreen.

One Rust community member, Andrew Gallant, has already created a fully
featured, very fast version of `grep`, called `ripgrep`. By comparison, our
Expand All @@ -29,17 +29,15 @@ background knowledge you need to understand a real-world project such as

Our `grep` project will combine a number of concepts you’ve learned so far:

* Organizing code (using what you learned about modules in [Chapter 7][ch7]<!--
ignore -->)
* Using vectors and strings (collections, [Chapter 8][ch8]<!-- ignore -->)
* Organizing code ([Chapter 7][ch7]<!-- ignore -->)
* Using vectors and strings ([Chapter 8][ch8]<!-- ignore -->)
* Handling errors ([Chapter 9][ch9]<!-- ignore -->)
* Using traits and lifetimes where appropriate ([Chapter 10][ch10]<!-- ignore
-->)
* Using traits and lifetimes where appropriate ([Chapter 10][ch10]<!-- ignore -->)
* Writing tests ([Chapter 11][ch11]<!-- ignore -->)

We’ll also briefly introduce closures, iterators, and trait objects, which
Chapters [13][ch13]<!-- ignore --> and [17][ch17]<!-- ignore --> will cover in
detail.
[Chapter 13][ch13]<!-- ignore --> and [Chapter 17][ch17]<!-- ignore --> will
cover in detail.

[ch7]: ch07-00-managing-growing-projects-with-packages-crates-and-modules.html
[ch8]: ch08-00-common-collections.html
Expand Down
16 changes: 8 additions & 8 deletions src/ch12-01-accepting-command-line-arguments.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ to turn it into a collection, such as a vector, that contains all the elements
the iterator produces.

The code in Listing 12-1 allows your `minigrep` program to read any command
line arguments passed to it and then collect the values into a vector.
line arguments passed to it, and then collect the values into a vector.

<Listing number="12-1" file-name="src/main.rs" caption="Collecting the command line arguments into a vector and printing them">

Expand All @@ -47,7 +47,7 @@ line arguments passed to it and then collect the values into a vector.

</Listing>

First, we bring the `std::env` module into scope with a `use` statement so we
First we bring the `std::env` module into scope with a `use` statement so we
can use its `args` function. Notice that the `std::env::args` function is
nested in two levels of modules. As we discussed in [Chapter
7][ch7-idiomatic-use]<!-- ignore -->, in cases where the desired function is
Expand All @@ -63,14 +63,14 @@ mistaken for a function that’s defined in the current module.
> Unicode. If your program needs to accept arguments containing invalid
> Unicode, use `std::env::args_os` instead. That function returns an iterator
> that produces `OsString` values instead of `String` values. We’ve chosen to
> use `std::env::args` here for simplicity, because `OsString` values differ
> per platform and are more complex to work with than `String` values.
> use `std::env::args` here for simplicity because `OsString` values differ per
> platform and are more complex to work with than `String` values.

On the first line of `main`, we call `env::args`, and we immediately use
`collect` to turn the iterator into a vector containing all the values produced
by the iterator. We can use the `collect` function to create many kinds of
collections, so we explicitly annotate the type of `args` to specify that we
want a vector of strings. Although we very rarely need to annotate types in
want a vector of strings. Although you very rarely need to annotate types in
Rust, `collect` is one function you do often need to annotate because Rust
isn’t able to infer the kind of collection you want.

Expand All @@ -89,8 +89,8 @@ Notice that the first value in the vector is `"target/debug/minigrep"`, which
is the name of our binary. This matches the behavior of the arguments list in
C, letting programs use the name by which they were invoked in their execution.
It’s often convenient to have access to the program name in case you want to
print it in messages or change behavior of the program based on what command
line alias was used to invoke the program. But for the purposes of this
print it in messages or change the behavior of the program based on what
command line alias was used to invoke the program. But for the purposes of this
chapter, we’ll ignore it and save only the two arguments we need.

### Saving the Argument Values in Variables
Expand All @@ -109,7 +109,7 @@ we can use the values throughout the rest of the program. We do that in Listing
</Listing>

As we saw when we printed the vector, the program’s name takes up the first
value in the vector at `args[0]`, so we’re starting arguments at index `1`. The
value in the vector at `args[0]`, so we’re starting arguments at index 1. The
first argument `minigrep` takes is the string we’re searching for, so we put a
reference to the first argument in the variable `query`. The second argument
will be the file path, so we put a reference to the second argument in the
Expand Down
15 changes: 8 additions & 7 deletions src/ch12-02-reading-a-file.md
Original file line number Diff line number Diff line change
@@ -1,13 +1,13 @@
## Reading a File

Now we’ll add functionality to read the file specified in the `file_path`
argument. First, we need a sample file to test it with: we’ll use a file with a
argument. First we need a sample file to test it with: we’ll use a file with a
small amount of text over multiple lines with some repeated words. Listing 12-3
has an Emily Dickinson poem that will work well! Create a file called
*poem.txt* at the root level of your project, and enter the poem “I’m Nobody!
Who are you?”

<Listing number="12-3" file-name="poem.txt" caption="A poem by Emily Dickinson makes a good test case">
<Listing number="12-3" file-name="poem.txt" caption="A poem by Emily Dickinson makes a good test case.">

```text
{{#include ../listings/ch12-an-io-project/listing-12-03/poem.txt}}
Expand All @@ -26,11 +26,12 @@ shown in Listing 12-4.

</Listing>

First, we bring in a relevant part of the standard library with a `use`
First we bring in a relevant part of the standard library with a `use`
statement: we need `std::fs` to handle files.

In `main`, the new statement `fs::read_to_string` takes the `file_path`, opens
that file, and returns a `std::io::Result<String>` of the file’s contents.
that file, and returns a value of type `std::io::Result<String>` that contains
the file’s contents.

After that, we again add a temporary `println!` statement that prints the value
of `contents` after the file is read, so we can check that the program is
Expand All @@ -50,6 +51,6 @@ responsibilities: generally, functions are clearer and easier to maintain if
each function is responsible for only one idea. The other problem is that we’re
not handling errors as well as we could. The program is still small, so these
flaws aren’t a big problem, but as the program grows, it will be harder to fix
them cleanly. It’s good practice to begin refactoring early on when developing
a program, because it’s much easier to refactor smaller amounts of code. We’ll
do that next.
them cleanly. It’s a good practice to begin refactoring early on when
developing a program because it’s much easier to refactor smaller amounts of
code. We’ll do that next.
61 changes: 30 additions & 31 deletions src/ch12-03-improving-error-handling-and-modularity.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,8 +41,8 @@ community has developed guidelines for splitting the separate concerns of a
binary program when `main` starts getting large. This process has the following
steps:

* Split your program into a *main.rs* and a *lib.rs* and move your program’s
logic to *lib.rs*.
* Split your program into a *main.rs* file and a *lib.rs* file and move your
program’s logic to *lib.rs*.
* As long as your command line parsing logic is small, it can remain in
*main.rs*.
* When the command line parsing logic starts getting complicated, extract it
Expand All @@ -57,7 +57,7 @@ should be limited to the following:
* Handling the error if `run` returns an error

This pattern is about separating concerns: *main.rs* handles running the
program, and *lib.rs* handles all the logic of the task at hand. Because you
program and *lib.rs* handles all the logic of the task at hand. Because you
can’t test the `main` function directly, this structure lets you test all of
your program’s logic by moving it into functions in *lib.rs*. The code that
remains in *main.rs* will be small enough to verify its correctness by reading
Expand Down Expand Up @@ -163,7 +163,7 @@ named for their purpose.

So far, we’ve extracted the logic responsible for parsing the command line
arguments from `main` and placed it in the `parse_config` function. Doing so
helped us to see that the `query` and `file_path` values were related and that
helped us see that the `query` and `file_path` values were related, and that
relationship should be conveyed in our code. We then added a `Config` struct to
name the related purpose of `query` and `file_path` and to be able to return the
values’ names as struct field names from the `parse_config` function.
Expand Down Expand Up @@ -208,8 +208,8 @@ they should do instead. Let’s fix that now.
#### Improving the Error Message

In Listing 12-8, we add a check in the `new` function that will verify that the
slice is long enough before accessing index 1 and 2. If the slice isn’t long
enough, the program panics and displays a better error message.
slice is long enough before accessing index 1 and index 2. If the slice isn’t
long enough, the program panics and displays a better error message.

<Listing number="12-8" file-name="src/main.rs" caption="Adding a check for the number of arguments">

Expand All @@ -222,10 +222,10 @@ enough, the program panics and displays a better error message.
This code is similar to [the `Guess::new` function we wrote in Listing
9-13][ch9-custom-types]<!-- ignore -->, where we called `panic!` when the
`value` argument was out of the range of valid values. Instead of checking for
a range of values here, we’re checking that the length of `args` is at least 3
and the rest of the function can operate under the assumption that this
a range of values here, we’re checking that the length of `args` is at least
`3` and the rest of the function can operate under the assumption that this
condition has been met. If `args` has fewer than three items, this condition
will be true, and we call the `panic!` macro to end the program immediately.
will be `true`, and we call the `panic!` macro to end the program immediately.

With these extra few lines of code in `new`, let’s run the program without any
arguments again to see what the error looks like now:
Expand All @@ -235,8 +235,8 @@ arguments again to see what the error looks like now:
```

This output is better: we now have a reasonable error message. However, we also
have extraneous information we don’t want to give to our users. Perhaps using
the technique we used in Listing 9-13 isn’t the best to use here: a call to
have extraneous information we don’t want to give to our users. Perhaps the
technique we used in Listing 9-13 isn’t the best one to use here: a call to
`panic!` is more appropriate for a programming problem than a usage problem,
[as discussed in Chapter 9][ch9-error-guidelines]<!-- ignore -->. Instead,
we’ll use the other technique you learned about in Chapter 9—[returning a
Expand Down Expand Up @@ -270,7 +270,7 @@ well, which we’ll do in the next listing.
</Listing>

Our `build` function returns a `Result` with a `Config` instance in the success
case and a `&'static str` in the error case. Our error values will always be
case and a string literal in the error case. Our error values will always be
string literals that have the `'static` lifetime.

We’ve made two changes in the body of the function: instead of calling `panic!`
Expand Down Expand Up @@ -306,15 +306,15 @@ In this listing, we’ve used a method we haven’t covered in detail yet:
`unwrap_or_else`, which is defined on `Result<T, E>` by the standard library.
Using `unwrap_or_else` allows us to define some custom, non-`panic!` error
handling. If the `Result` is an `Ok` value, this method’s behavior is similar
to `unwrap`: it returns the inner value `Ok` is wrapping. However, if the value
is an `Err` value, this method calls the code in the *closure*, which is an
anonymous function we define and pass as an argument to `unwrap_or_else`. We’ll
cover closures in more detail in [Chapter 13][ch13]<!-- ignore -->. For now,
you just need to know that `unwrap_or_else` will pass the inner value of the
`Err`, which in this case is the static string `"not enough arguments"` that we
added in Listing 12-9, to our closure in the argument `err` that appears
between the vertical pipes. The code in the closure can then use the `err`
value when it runs.
to `unwrap`: it returns the inner value that `Ok` is wrapping. However, if the
value is an `Err` value, this method calls the code in the *closure*, which is
an anonymous function we define and pass as an argument to `unwrap_or_else`.
We’ll cover closures in more detail in [Chapter 13][ch13]<!-- ignore -->. For
now, you just need to know that `unwrap_or_else` will pass the inner value of
the `Err`, which in this case is the static string `"not enough arguments"`
that we added in Listing 12-9, to our closure in the argument `err` that
appears between the vertical pipes. The code in the closure can then use the
`err` value when it runs.

We’ve added a new `use` line to bring `process` from the standard library into
scope. The code in the closure that will be run in the error case is only two
Expand Down Expand Up @@ -386,7 +386,7 @@ know that `Box<dyn Error>` means the function will return a type that
implements the `Error` trait, but we don’t have to specify what particular type
the return value will be. This gives us flexibility to return error values that
may be of different types in different error cases. The `dyn` keyword is short
for dynamic.”
for *dynamic*.

Second, we’ve removed the call to `expect` in favor of the `?` operator, as we
talked about in [Chapter 9][ch9-question-mark]<!-- ignore -->. Rather than
Expand Down Expand Up @@ -423,22 +423,22 @@ with `Config::build` in Listing 12-10, but with a slight difference:
```

We use `if let` rather than `unwrap_or_else` to check whether `run` returns an
`Err` value and call `process::exit(1)` if it does. The `run` function doesn’t
return a value that we want to `unwrap` in the same way that `Config::build`
returns the `Config` instance. Because `run` returns `()` in the success case,
we only care about detecting an error, so we don’t need `unwrap_or_else` to
return the unwrapped value, which would only be `()`.
`Err` value and to call `process::exit(1)` if it does. The `run` function
doesn’t return a value that we want to `unwrap` in the same way that
`Config::build` returns the `Config` instance. Because `run` returns `()` in
the success case, we only care about detecting an error, so we don’t need
`unwrap_or_else` to return the unwrapped value, which would only be `()`.

The bodies of the `if let` and the `unwrap_or_else` functions are the same in
both cases: we print the error and exit.

### Splitting Code into a Library Crate

Our `minigrep` project is looking good so far! Now we’ll split the
*src/main.rs* file and put some code into the *src/lib.rs* file. That way we
*src/main.rs* file and put some code into the *src/lib.rs* file. That way, we
can test the code and have a *src/main.rs* file with fewer responsibilities.

Let’s move all the code that isn’t the `main` function from *src/main.rs* to
Let’s move all the code that isn’t in the `main` function from *src/main.rs* to
*src/lib.rs*:

* The `run` function definition
Expand Down Expand Up @@ -476,8 +476,7 @@ binary crate in *src/main.rs*, as shown in Listing 12-14.
We add a `use minigrep::Config` line to bring the `Config` type from the
library crate into the binary crate’s scope, and we prefix the `run` function
with our crate name. Now all the functionality should be connected and should
work. Run the program with `cargo run` and make sure everything works
correctly.
work. Run the program with `cargo run` and make sure everything works correctly.

Whew! That was a lot of work, but we’ve set ourselves up for success in the
future. Now it’s much easier to handle errors, and we’ve made the code more
Expand Down
Loading