
Fix Hyperlink/Travis Issues #1582

Closed · wants to merge 6 commits

Conversation

@skipjack (Collaborator) commented Sep 9, 2017

Reverted 8281949 so we can test and try to resolve the hyperlink issue on Travis' network. I think there may still be one link failure, due to a package's README, that we'll have to resolve before we can fix the core issue.

Related to Munter/hyperlink#121

@skipjack (Collaborator, Author) commented Oct 2, 2017

@Munter I figured out the issue here and it's actually more a problem with our customizations to the hyperlink call than hyperlink itself. See the check-links.js script...

The issue is that we don't output anything until hyperlink completes, which is what allows us to filter through the failures and exclude checks on specific elements/URLs. It would be great if hyperlink had an option for element-based exclusion; however, I'm guessing we could also update the script to write to the console while it's running, to avoid long stretches with no output. The benefit of native exclusion is that hyperlink could avoid testing certain links altogether, improving performance and making this script unnecessary.

I'm not too familiar with the tap ecosystem, but if you have any advice, that'd be much appreciated. I see there is an --exclude option, but only for URLs, and unfortunately in one of the two cases (class="support__[^"]*") there's no consistent pattern to the URLs. Interested in your thoughts on the best approach so we can get this resolved.
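For what it's worth, a minimal sketch of the streaming approach, assuming check-links.js receives hyperlink's TAP output on stdin and uses tap-parser (the parser this script actually depended on); the exclusion pattern here is just the one mentioned above, for illustration:

```js
// check-links.js (sketch): stream hyperlink's TAP output and report
// failures as they arrive instead of buffering until the run completes.
const Parser = require('tap-parser');

// Illustrative exclusions: the real script filtered failures on element
// markup such as class="support__..." where the URLs follow no fixed pattern.
const excluded = [/class="support__[^"]*"/];

const parser = new Parser();
let failures = 0;

parser.on('assert', assert => {
  if (assert.ok) return;
  const details = JSON.stringify(assert.diag || {});
  if (excluded.some(pattern => pattern.test(details))) return; // known noise
  failures++;
  console.error(`not ok ${assert.id} ${assert.name}`); // immediate output
});

parser.on('complete', () => {
  process.exitCode = failures > 0 ? 1 : 0;
});

process.stdin.pipe(parser);
```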

@skipjack mentioned this pull request Oct 3, 2017
@Munter (Collaborator) commented Oct 4, 2017

I think the solution might be to use an appropriate tap reporter, or to write your own with a filter. Most tap reporters I have seen report the tap stream continuously. Alternatively, you could use tee to pipe the tap output to a file while still logging all the activity to the console, then post-process the tap output from the file.

@skipjack (Collaborator, Author) commented Oct 6, 2017

> Most tap reporters I have seen report the tap stream continuously.

Yeah, so I updated our script to check and output line by line instead of waiting for completion, but unfortunately, as you can see, the build still fails for some reason. There must be one link check that hangs for over 10 minutes. A quick hack would be to use a setInterval to log a "..." string every few minutes to prevent the build from timing out, but that's starting to feel pretty desperate.
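The heartbeat hack would only be a few lines (a sketch; Travis aborts a job after 10 minutes of silence, so any interval below that works, and runLinkChecker is a hypothetical stand-in for the actual hyperlink invocation):

```js
// Hypothetical stand-in for the actual hyperlink run.
function runLinkChecker() {
  return new Promise(resolve => setTimeout(resolve, 1000));
}

// Log a heartbeat so Travis sees output while the checker runs.
const heartbeat = setInterval(() => console.log('...'), 5 * 60 * 1000);

runLinkChecker().then(
  () => clearInterval(heartbeat),
  err => {
    clearInterval(heartbeat);
    console.error(err);
    process.exitCode = 1;
  }
);
```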

@Munter (Collaborator) commented Oct 6, 2017

Looks like there is no default timeout for teepee, which is the HTTP client assetgraph uses. I'll see if I can set a relevant default and maybe expose the acceptable timeout as a CLI option.

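The kind of fix under discussion, shown with Node's plain http client rather than teepee's actual API (which isn't documented in this thread), would look roughly like this:

```js
const http = require('http');

// Abort any request that takes longer than 30 seconds, so a single
// hung connection can't stall the whole link-checking run.
const req = http.get('http://example.com/', res => {
  res.resume(); // drain the body; we only care about reachability
  res.on('end', () => console.log('done:', res.statusCode));
});

req.setTimeout(30 * 1000, () => {
  console.error('request timed out, aborting');
  req.abort(); // req.destroy() in modern Node
});

req.on('error', err => console.error('request failed:', err.message));
```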

@skipjack (Collaborator, Author) commented

@Munter somehow it's still hanging on the link following this line:

loading build/1ebd0482aadade65f20ec178219fe012.woff2

(this is with the patch I made for your fix/defaultTimeout branch)

@skipjack (Collaborator, Author) commented Oct 11, 2017

@Munter btw, I'd be more than happy to add you as a contributor if you're interested. It might be easier for you to debug this branch/PR/CI build that way.

@Munter (Collaborator) commented Oct 14, 2017

@skipjack yeah, that might be useful. I'm not sure when I'll have the time to do a deep dive on this, but when I do, it should speed up my feedback loop.

@skipjack (Collaborator, Author) commented

> yeah, that might be useful. I'm not sure when I'll have the time to do a deep dive on this, but when I do, it should speed up my feedback loop.

No worries, just sent you the collaborator invite. For now, I'm just running hyperlink locally on bigger PRs that might cause link failures. Let me know if there's anything else I can do to help.

@skipjack (Collaborator, Author) commented

@pierreneter would you be interested in submitting a PR to the hyperlink repo to resolve Munter/hyperlink#121 and thus fix this issue? We basically need to shut down any long-running requests somehow; I'm sure @Munter would help if you have ideas.

I've been pushing hard to get the site's information architecture sorted out and its sections cleaned up, which I think will be pretty complete by the end of this week. I have a plan for how to start working on #1525 and #1400 next week (12/24-ish), assuming all goes to plan. Hoping to review that with you and @jeremenichelli if you're both still interested and willing to help.

That said, it would be good to get this resolved before we dig into that, as hyperlink is a great way for us to make sure we aren't breaking things when making major changes. If you, or anyone else, would be willing to tackle this, I'd be very appreciative 🙏.

@pierreneter (Member) commented

OK, I will try.

@jeremenichelli (Member) commented

@skipjack I'm in for the next steps towards this. Just ping me on the right discussions; I haven't been in Slack much the last few weeks because of personal stuff.

I think that moving to another static generator like Jekyll, Hugo, Gatsby, etc. is worth a hangout to see what's best, given the complexity of a hypothetical migration and the amount of work it would mean. I personally would love for that to happen :)

@pierreneter (Member) commented

@skipjack I think @Munter just did it. We need to update the hyperlink version to see whether it works. But what about handling the timeout?

@pierreneter (Member) commented

About #1525 and #1400: same as @jeremenichelli, I vote for Gatsby, but we need more discussion.

@pierreneter (Member) commented

See the logs at https://travis-ci.org/webpack/webpack.js.org/builds/318201104. It looks like setting a timeout for teepee does not solve our problem.

@pierreneter (Member) commented

It's stuck at a font file (woff2, which is not plain text). It's a local file, so it's not a problem with the request. PNG is also not plain text, but it passes quickly. I think for local files that aren't plain text we just need to check whether they exist, not load them. @Munter what do you think?
P.S. @skipjack, one small problem: we need to check external links like "medium.com/@ridermansb" (https://travis-ci.org/webpack/webpack.js.org/builds/318427547#L2228) that are missing the URI scheme, and fix them.
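The existence check suggested above could be as simple as the following sketch, which stats binary assets like fonts instead of loading and parsing them (checkLocalAsset is a hypothetical helper, not hyperlink's API):

```js
const fs = require('fs');
const path = require('path');

// Binary assets (fonts, images) contain no outgoing links to follow,
// so verifying that the file exists on disk is enough.
function checkLocalAsset(root, href) {
  try {
    fs.accessSync(path.join(root, href), fs.constants.R_OK);
    return true; // file exists and is readable
  } catch (err) {
    return false; // broken local reference
  }
}

console.log(checkLocalAsset('build', '1ebd0482aadade65f20ec178219fe012.woff2'));
```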

@skipjack (Collaborator, Author) commented

@pierreneter I'm actually not sure it's hanging on the font file, even though it's the last log line:

loading build/314bbcd238d458622bbf32427346774f.woff

I think it may be loading the next one but failing before it gets to logging it out (though I could be wrong). I'm pretty sure it's one of the sponsor/backer links. I think we have to debug more at the hyperlink level. Whatever the case may be in terms of fixing URLs or network issues, hyperlink should abort a request after a minute or two (or some user-provided time period).

@pierreneter (Member) commented

@skipjack @Munter added aborting of requests after a 30-second timeout (Munter/hyperlink@a0bb4b3), but I don't know whether it works.

@skipjack (Collaborator, Author) commented Dec 19, 2017

> Just ping me on the right discussions; I haven't been in Slack much the last few weeks because of personal stuff.

@jeremenichelli Cool, no problem. Yeah, I was thinking maybe you could take the lead on refactoring the fetch scripts (i.e. the ones that populate external docs not local to the site). As we discussed a while back, a prefix like _ could be used on those files for a simpler .gitignore and general structure.

> I think that moving to another static generator like Jekyll, Hugo, Gatsby, etc. is worth a hangout to see what's best, given the complexity of a hypothetical migration and the amount of work it would mean. I personally would love for that to happen :)

> I vote for Gatsby, but we need more discussion.

@pierreneter @jeremenichelli yeah, if we're going to switch to another SSG, I think Gatsby would be the best alternative. @pierreneter I did finally track down the StaticSiteGeneratorPlugin approach, which would yield something very similar to what Gatsby outputs but with pure webpack. Personally, I think this approach would be a bit nicer, as we'd retain full control over the process, and in some ways it would be kind of poetic to build the webpack site with just webpack rather than "endorse" a generator when there are already a lot out there.

How about this: I'll throw together a branch within the next two weeks to demo what I mean by the pure-webpack approach. If that feels clunky, we can take a stab at porting to Gatsby. If not, maybe we continue down the pure-webpack path and see where it takes us. Either approach would leave us in a better spot than we are now and would make the tasks in #1525 and #1400 easier to tackle. The question in my mind is more: do we port to Gatsby and let it handle everything for us behind the scenes, or can we build a relatively simple webpack config that yields the same thing but leaves us with more control if needed? (Btw, Gatsby utilizes the StaticSiteGeneratorPlugin under the hood.)

Let's not clutter this thread any more (my bad 😁), but I really will ping you guys soon to finish discussing infrastructure.

@skipjack (Collaborator, Author) commented

> added aborting of requests after a 30-second timeout (Munter/hyperlink@a0bb4b3), but I don't know whether it works.

@pierreneter could you use a git dependency in the package.json on this branch to test it out?

@skipjack (Collaborator, Author) commented

@pierreneter actually, I think that commit was the one from a while ago that we tested but that didn't work. Feel free to try again though.

@Munter (Collaborator) commented Dec 19, 2017

I see several possible causes of error here. It might be some request in a promise that throws unexpectedly and is not caught, or some crazy long download, or an actual timeout. I was considering pushing everything that initiates a request into some sort of request-tracking array and removing it again when the request has resolved, then logging out that array's contents periodically. That means if there is a 10-minute hang that stops the CI server, we should see the unhandled connections logged in the console. That would let us know which resources cause this and possibly how to recreate it locally to implement a test against it.
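A minimal sketch of that tracking idea (trackRequest is a hypothetical wrapper; the point is the periodic dump of whatever is still in flight):

```js
// Track in-flight requests so a hang shows up in the CI log.
const inFlight = new Set();

function trackRequest(url, promise) {
  inFlight.add(url);
  const remove = () => inFlight.delete(url);
  promise.then(remove, remove);
  return promise;
}

// Periodically dump pending URLs; if the build hangs, the last dump
// shows exactly which requests never resolved.
const reporter = setInterval(() => {
  if (inFlight.size > 0) {
    console.log('still pending:', [...inFlight].join(', '));
  }
}, 30 * 1000);
reporter.unref(); // don't keep the process alive just for the logging
```

Every place that initiates a request would then call trackRequest(url, doFetch(url)) instead of calling doFetch(url) directly.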

@pierreneter (Member) commented

I added more logging to the requests in my custom hyperlink. The CI test results suggest the connection attempts and HTTP status checks are never reached; it seems it's stuck in the file-loading process. So many files, with so many links.

@skipjack (Collaborator, Author) commented

@Munter any idea why the teepee timeout isn't working? I poked around a little, which was a bit tough as the teepee property isn't mentioned in assetgraph's README, but I managed to find the related code and then poked around in teepee's source to find the timeout-related bits.

I was initially thinking we could do the same thing with a setTimeout within hyperlink, but it seems like all requests are basically handled behind the scenes by assetgraph and teepee? Should our next step just be to npm link the hyperlink package and then npm link assetgraph and/or teepee so we can debug the whole stack?

> I was considering pushing everything that initiates a request into some sort of request-tracking array and removing it again when the request has resolved, then logging out that array's contents periodically. That means if there is a 10-minute hang that stops the CI server, we should see the unhandled connections logged in the console.

Where can we get access to the requests? Somewhere from within the assetgraph (ag) instance? If we can do that, maybe there's some way we can invoke teepee to abort long-standing requests (as a short-term fix, without having to npm link the whole stack)?

@Munter (Collaborator) commented Apr 5, 2018

Since hyperlink is currently set up in a way where it will fail the build, I think we need to make a few tweaks to reduce either the scope of the tests or their severity.

This version of hyperlink has a --todo option, which allows you to define matching patterns for tests that are acceptable failures. We should probably use that to allow people to create PRs that are not affected by the outside world.

I see a few different ways of going about this:

Mark tests of imported content as todo

We are fetching a bunch of content from the READMEs of a lot of plugins. If there are errors in any of those, it will affect the build, but the individual plugin doesn't have any way of anticipating that its change will break the docs build. If we can identify certain routes that contain purely external content, we could mark the tests of outgoing links from those pages as todo.

Skip tests of external links completely on a PR test

The most important piece of information hyperlink gives you is whether your internally controlled website is an intact dependency graph. We could reduce the checks done on a normal PR to only look at the internal structure.

This should probably be paired with a periodic test that runs the full hyperlink suite, so the docs site isn't left pointing at missing external pages. I'm not aware of any really awesome reporting tools that such error output could be posted to in a structured way. I think you can do Slack integrations with Travis, so maybe there should be a daily full check with errors reported to a Slack channel for the community to pick up along the way, without them being build blockers.
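As a sketch of the second option, a wrapper could run hyperlink and treat only failures on internal pages as blocking. The CLI arguments and the shape of the TAP diagnostics here are assumptions; hyperlink's real flags (--todo, --skip, etc.) are in its README:

```js
const { spawn } = require('child_process');
const Parser = require('tap-parser');

// Hypothetical invocation; check hyperlink's README for the real flags.
const checker = spawn('hyperlink', ['--root', 'build', 'build/index.html']);

const parser = new Parser();
let internalFailures = 0;

parser.on('assert', assert => {
  if (assert.ok) return;
  // The diagnostic field holding the checked URL is an assumption.
  const target = (assert.diag && assert.diag.at) || assert.name || '';
  // External breakage is reported but left for the periodic full run.
  if (!/^https?:\/\//.test(target) || target.includes('webpack.js.org')) {
    internalFailures++;
  }
  console.error(`not ok ${assert.id} ${assert.name}`);
});

parser.on('complete', () => {
  process.exitCode = internalFailures > 0 ? 1 : 0;
});

checker.stdout.pipe(parser);
```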

@Munter (Collaborator) commented Apr 5, 2018

The CI tests are passing on this branch now, but I don't think we are ready to merge just yet. See #1582 (comment)

@Munter (Collaborator) left a review comment

Hyperlink tap-spot needs to be updated to fail the build on error.

There needs to be a good way to separate critical errors from warnings in hyperlink while still being able to follow up on them, even if they don't break the build.

Hyperlink should be updated to the upcoming v4 release when it is out.

@montogeek (Member) commented

@Munter Sounds good

@Munter (Collaborator) commented Apr 9, 2018

The change set was getting a bit unwieldy, so I squashed everything to keep this branch manageable. The change set is quite small after all the experiments; no need to keep that history.

skipjack and others added 6 commits April 11, 2018 00:35
…pot reporter

fix(tests): fix build termination issues with `lint:links`

refactor(tests): test out `defaultTimeout` branch

refactor(tests): test out syntax error patch for hyperlink fix

Update hyperlink version

Use --verbose option to determine the problem

Try my custom hyperlink

add limit assets loading per time

Change limit to 60

Remove concurrency of hyperlink

Switch hyperlink to assetgraph4 feature branch

The upcoming major version of Hyperlink is being built here, and we're seeing good improvements on both memory consumption and runtime

Switch to node 8 on Travis

Hyperlink uses Assetgraph 4, which has node 8 as a minimum version.

Add extremely verbose memory output to hyperlink

Update hyperlink assetgraph dependency to 4.0.5

This clears out memory leaks from javascript assets where jsdom documents were not properly closed

Update hyperlink

Switch to raw hyperlink tap output to see more context

Update hyperlink reference to include assetgraph 4.1.1

Bring back check-links tap filtering

Updated hyperlink and use new skipfilters and tap reporter

Update hyperlink to version that doesn't fail on svg redirects and gives false negative content-type-mismatch errors

Remove check-links.js. hyperlink --skip and tap-spot do the same

Removed tap-parser dependency, which was only used in now deleted check-links script
@montogeek (Member) commented

@Munter What else is missing to merge this one?

@Munter (Collaborator) commented May 8, 2018

@montogeek I can't build the current master on my machine, so I can't iterate. I need the updated rebuild branch merged before I can continue.

@montogeek (Member) commented

@Munter Why is it not building for you? :/

@Munter (Collaborator) commented May 8, 2018

I can't wait 50 minutes for a build, which is what the current master takes on my MacBook Air.

@montogeek (Member) commented

@Munter Would it help if I sent you a zip file with the build dir contents?

@Munter (Collaborator) commented May 9, 2018

Not really. I need the ability to iterate and fix things as I go.

@skipjack (Collaborator, Author) commented

@Munter if that's the case, then maybe rebase on rebuild and re-point this PR to rebuild? It's getting pretty close to done; we're just working on getting the last few PRs in, and then we'll have to do some thorough testing to make sure we didn't miss anything.

@skipjack mentioned this pull request May 26, 2018
@skipjack (Collaborator, Author) commented

I'm still strapped for time but would love to see this finished. If anyone has the bandwidth to dig into it before I do (which might still be a few months out), I highly recommend rebasing on rebuild, re-pointing this PR to rebuild, and then testing to see what else is needed to call this complete.

@montogeek (Member) commented

I could do it tomorrow.

@Munter mentioned this pull request Jul 23, 2018
@Munter (Collaborator) commented Jul 23, 2018

Closing in favor of #2382.

@Munter closed this Jul 23, 2018
montogeek pushed a commit that referenced this pull request Aug 3, 2018
* Update link checking to use latest hyperlink. Replaces #1582

* Stop appending an empty fragment to URLs pointing at no fragment identifier

* Resolve relative hrefs from READMEs to their github.com URL instead of linking locally inside docs

* Prefer https where available and also use https in examples

* Make non-fragment links inside headings visible

* Update webpack docs china link

* Fixed invalid edit links to index pages

* skip fragment check for vimdoc that is known to fail

* Fix broken link to migration guide from configuration/resolve

* Also resolve github README relative links when they don't start with a dot

* Add missing pages: Vote, Organization, Starter kits

* Remove unused code

* Fix link that was redirected

* Open all target='_blank' links with rel='noopener' for security

* Skip link checking img.shields.io

* Change order of markdown url rewriting to avoid errors

* Use lowercase variable name for node url module. See #2382 (comment)

* Update link to file-loader github repo

* Rewrite links from https://npmjs.com to https://www.npmjs.com to avoid the inevitable redirect

* Skip checking links to npmjs.com/package that causes hyperlink to get http 500 responses. This should probably be reverted and fixed later

* Updated organization projects urls to their redirect target

* Update tool-list to get correct links for projects

* Updated links to their redirect target

* Remove jscs-loader. The project is deprecated, merged with eslint and the domain we link to is for sale

* Update link to karma-webpack project

* Remove link to named-modules-plugin, which is now in webpack core

* Added missing mini-css-extract-plugin to fetch script

* Skip checking links to known deleted github users

* Update broken fragments to work with new markdown generator

* Fixed more broken links

* infra(site) Copy assets when building

* Update github location for css-loader

* Remove dead github accounts from contributors listings

* infra(site) Fix fetchPackages script when running locally

* Fix internal fragment links in optimization.md

* Skip link checking of opencollective.com/webpack. Massive html response made the checker go into CPU overdrive

* Remove link to non-existing named-modules-plugin

* Use new release of hyperlink

* feat(infra) Travis optimization (#2404)

* Fix internal fragment links in optimization.md

* Skip link checking of opencollective.com/webpack. Massive html response made the checker go into CPU overdrive

* Try out travis staged build

* Add proselint

* Upgrade pip before using it

* Move before-hooks around to try and make proselint install

* Try adding python as a language as well to get an updated version

* More messing with config

* Manually handle node versioning

* Add node minor version to nvm install. Defaulted to slightly too low version

* Manually install node 8.11

* Try a matrix build to separate node and python stuff

* Add linkcheck task and try a different cache setup

* Minor name change to test if cache works correctly across builds

* Attempt to combine matrix and staged build

* Attempt going back to staged build

* Bump travis

* Ping Travis. You alive?

* Fix broken travis.yml

* Fix wrong stage order

* Explicitly run specific lintings, exclude proselint

* Allow failures for link checker

* Change proselint cache directory. Maybe this works

* Add new script to fetch repository names that the docs should contain. Try to centralize github API usage to avoid rate limitations

* Add caching to eslint

* Remove parts of deploy.sh that are now run by travis

* Added new ./repositories to store github api results

* Replace fetch.sh with npm scripts and fetch-package-readmes.js

* Attempt to make caches more specific to the containers and stages they refer to

* Remove deprecated fetch_packages script. Now replaced by two-step fetch-package-repos and fetch-package-readmes

* Attempt a more windows friendly build by using npm and webpack instead of shell commands

* Disable link checking for now to speed up builds

* Revert "Disable link checking for now to speed up builds"

This reverts commit 7d4bb06.

* Add dist to proselint cache so generated content also gets checked

* Remove unnecessary GITHUB_TOKEN env variable check in repo fetching script

* Revert "Add dist to proselint cache so generated content also gets checked"

This reverts commit fc7676d.

* Rework pipeline for better speed and make proselint a deployment blocker

* Rename build stage to reflect its actions

* Run content-tree only after generating all external content

* Remove link to non-existing named-modules-plugin

* Fix double slashes in edit links

* Rename stages

* Add new internal link check as a blocking check

* Fix wrong name in allowed_failures config
@chenxsan deleted the fix-hyperlink branch June 24, 2021