Niwt - Nifty Integrated Web Tools

Niwt is a project begun by MicahCowan, one-time maintainer of GNU Wget.

Niwt is a tool for downloading resources from the web. It aims to (eventually) reproduce most of the functionality of GNU Wget, plus some additional features; in particular, it will support automatic connection restoration and recursive website fetching/mirroring. However, Niwt's design philosophy differs radically from Wget's: it is built entirely around Unix pipelines, with facilities for easily swapping out or extending every existing piece of functionality with an alternative (or additional) program that offers equivalent (or improved) functionality, as opposed to Wget's more monolithic nature. It is meant to be "on-the-fly patchable", even by non-programmers. This is expected to result in a big trade-off: Niwt gives up the relative efficiency and lower resource consumption that Wget enjoys in exchange for extreme and relatively easy customization.

Wget is felt to be a very powerful and flexible tool for fetching files from the web, but it perhaps suffers from trying to do too many things at once. Rather than following the Unix development principle of "doing one thing and doing it well", it is a tool that does several things, some of which might be said to be done in only a mediocre fashion. A primary goal of Niwt is to separate the various tasks performed by a tool like Wget into pipelines of distinct, user-replaceable programs, each of which takes responsibility for as little as possible. The overall result may not necessarily be viewed as "better" than Wget, which does its job quite well, and with much more efficiency than Niwt is ever likely to achieve. Niwt's design model is explicitly to go "to ridiculous extremes" in modularity, and both its strengths and weaknesses are primarily a result of that design philosophy.

Imagine being able to use grep to decide which links Wget follows, or to automatically (and safely) extract tarballs via tar when they're downloaded. Imagine being able to tell Wget whether to follow a link based on which page it was found in. Or to use timestamps for one section of the website, but unconditionally pull the rest. Or to follow links that were found in downloaded PDF files, not just HTML. Imagine being able to transform links that Wget parses before it follows them (perhaps to redirect to a mirror site?). These are the sorts of things that Niwt intends to make possible through its extensible design.
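
To make the idea of a swappable link filter concrete, here is a minimal sketch in Python. It is not Niwt's actual interface, which this page does not describe; the function name filter_links, the KEEP_HTML stand-in stage, and the example URLs are all hypothetical. It only illustrates the general pattern of handing a list of candidate links to an arbitrary external program and following whatever that program lets through.

```python
import subprocess
import sys


def filter_links(links, command):
    """Pipe links (one per line) through an external command.

    Returns the lines the command writes to standard output, so any
    program that reads lines on stdin and writes lines on stdout can
    act as the filter. This is a hypothetical helper, not part of Niwt.
    """
    result = subprocess.run(
        command,
        input="".join(link + "\n" for link in links),
        capture_output=True,
        text=True,
        check=False,  # filters such as grep exit non-zero when nothing matches
    )
    return result.stdout.splitlines()


# Stand-in filter stage (an assumption for illustration): a tiny Python
# one-liner that keeps only links ending in ".html". Any other program
# with the same stdin/stdout behaviour could take its place.
KEEP_HTML = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.writelines("
    "l for l in sys.stdin if l.rstrip().endswith('.html'))",
]

candidate_links = [
    "http://example.com/index.html",
    "http://example.com/manual.pdf",
    "http://example.com/news.html",
]

links_to_follow = filter_links(candidate_links, KEEP_HTML)
```

On a system where grep is installed, replacing KEEP_HTML with ["grep", "-v", "\\.pdf$"] would drop the PDF link instead, without any change to filter_links itself. That kind of substitution is the sort of flexibility the paragraph above has in mind.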

The Niwt project is (or will be) a weaving together of many of the ideas that MicahCowan either formed, was exposed to, or had discussions about during his time as GNU Wget's maintainer, as well as (of course) the existing strengths that Wget has to offer.

For more information, you can also hang out at the IRC channel #niwt @ irc.freenode.net, or check out the mailing list (http://addictivecode.org/mailman/listinfo/niwt-users/).

To download and install, see InstallingNiwt.

Niwt’s source code is free and open source software, and is available under the MIT (simple BSD-style) license. Unlike Wget, Niwt is not affiliated with the GNU Project.

Project Goals & Motivations

Some examples of facilities that Wget provides, but which could benefit from separation into distinct, user-replaceable programs with distinct responsibilities:

And here are some more exotic ideas about benefits that could be provided by such a model, that do not have any current equivalents in Wget:

Of course, not everything is a bowl of cherries: there are marked trade-offs—especially in the area of performance.

For this reason, Niwt will clearly not serve the needs of every user that currently finds Wget useful, and Wget will continue to be a vital tool in many users' toolbelts. In fact, while it is hoped that Niwt will be of interest to a large number of users, it is certain that Wget will continue to meet needs that Niwt never can. Niwt makes some fairly extreme trades, primarily of efficiency and resource consumption, in return for great flexibility and customization.

See TryingOutNiwt for an overview of how Niwt is used, how it's designed, and what it can currently be made to do.

Niwt (last edited 2012-08-08 16:38:45 by MicahCowan)