Internet JUNKBUSTER Technical Information



Manual Page


A copy of this page in standard man macro format is included in the tar archive.

*  Name

junkbuster - The Internet Junkbuster Proxy TM

*  Synopsis

junkbuster [-a] [-y] [-s] [-c cookiefile] [-v]
[-u user_agent] [-r referer] [-t from]
[-b blockfile] [-j jarfile] [-l logfile]
[-w NAME=VALUE] [-x Header_text]
[-h [bind_host_address][:bind_port]]
[-f forward_host[:port]] [-d N]
[-g gw_protocol[:[gw_host][:gw_port]]]

*  Description

junkbuster is an instrumentable proxy that filters the HTTP stream between web servers and browsers.

Options

-a
(Obsolete) Accept the server's Set-cookie headers, passing them through to the browser. This option was removed in Version 1.2 and replaced by an improvement to the -c option.

-b blockfile
Block requests to URLs matching any pattern given in the lines of the blockfile. The junkbuster instead returns an error 403 (Forbidden) and an explanation, though the browser may display only a broken image icon. The syntax of a pattern is [domain][:port][/path] (the http:// or https:// protocol part is omitted). To decide if a pattern matches a target, the domains are compared first, then the paths.

To compare the domains, the pattern domain and the target domain specified in the URL are each broken into their components. (Components are separated by the . (period) character.) Next each of the target components is compared with the corresponding pattern component: last with last, next-to-last with next-to-last, and so on. (This is called right-anchored matching.) If all of the pattern components find their match in the target, then the domains are considered a match. Case is irrelevant when comparing domain components.

A successfully matching pattern can be an anchored substring of a target, but not vice versa. Thus if a pattern doesn't specify a domain, it matches all domains. Furthermore, when comparing two components, the components must either match in their entirety or up to a wildcard * (star character) in the pattern. The wildcard feature implements only a "prefix" match capability ("abc*" vs. "abcdefg"), not suffix matching ("*efg" vs. "abcdefg") or infix matching ("abc*efg" vs. "abcdefg"). The feature is restricted to the domain component; it is unrelated to the optional regular expression feature in the path (described below).

If a numeric port is specified in the pattern domain, then the target port must match as well. The default port in a target is port 80.

If the domain and port match, then the target URL path is checked for a match against the path in the pattern. Paths are compared with a simple case-sensitive left-anchored substring comparison. Once again, the pattern can be an anchored substring of the target, but not vice versa. A path of / (slash) would match all paths. Wildcards are not considered in path comparisons.

For example, the target URL
   the.yellow-brick-road.com/TinMan/has_no_brain
would be matched (and blocked) by the following patterns
   yellow-brick-road.com
and
   Yellow*.COM
and
   /TinM
but not
   follow.the.yellow-brick-road.com
or
   /tinman
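
The comparison rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not the junkbuster's own code: the function names are made up for the example, the optional regular expression feature in the path is left out, and a * anywhere but the end of a component is simply treated as a literal character.

```python
def split_pattern(pattern):
    """Split '[domain][:port][/path]' into (domain, port, path)."""
    slash = pattern.find("/")
    if slash == -1:
        host, path = pattern, ""
    else:
        host, path = pattern[:slash], pattern[slash:]
    domain, _, port = host.partition(":")
    return domain, port, path


def component_matches(pat, tgt):
    """Compare one domain component; case is irrelevant, trailing * is a prefix match."""
    pat, tgt = pat.lower(), tgt.lower()
    if pat.endswith("*"):
        return tgt.startswith(pat[:-1])
    return pat == tgt


def domain_matches(pat_domain, tgt_domain):
    """Right-anchored comparison: last with last, next-to-last with next-to-last."""
    if not pat_domain:
        return True  # a pattern without a domain matches all domains
    pat = pat_domain.split(".")
    tgt = tgt_domain.split(".")
    if len(pat) > len(tgt):
        return False  # the pattern may be a substring of the target, not vice versa
    return all(component_matches(p, t)
               for p, t in zip(reversed(pat), reversed(tgt)))


def url_matches(pattern, target):
    """True if target (a URL without its protocol part) is matched by pattern."""
    p_domain, p_port, p_path = split_pattern(pattern)
    t_domain, t_port, t_path = split_pattern(target)
    if not domain_matches(p_domain, t_domain):
        return False
    if p_port and p_port != (t_port or "80"):  # default target port is 80
        return False
    # Paths: case-sensitive, left-anchored, no wildcards.
    return (t_path or "/").startswith(p_path)
```

Called with the target the.yellow-brick-road.com/TinMan/has_no_brain, url_matches() accepts the patterns yellow-brick-road.com, Yellow*.COM and /TinM, and rejects follow.the.yellow-brick-road.com and /tinman, as in the example above.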

Comments in a blockfile start with a # (hash) character and end at a new line. Blank lines are also ignored.

Lines beginning with a ~ (tilde) character are taken to be exceptions: a URL blocked by previous patterns that matches the rest of the line is let through. (The last match wins.)
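
As a rough sketch of how comments, blank lines and exceptions combine, the following Python reads blockfile text and applies the "last match wins" rule. The names parse_blockfile, is_blocked and substring_match are invented for this illustration, and substring_match is a deliberately crude stand-in for the real pattern comparison described earlier; the host names are placeholders.

```python
def parse_blockfile(text):
    """Return a list of (is_exception, pattern) pairs in file order."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # a comment runs from # to end of line
        if not line:
            continue                          # blank lines are ignored
        if line.startswith("~"):
            rules.append((True, line[1:].strip()))
        else:
            rules.append((False, line))
    return rules


def is_blocked(rules, target, matches):
    """Apply every rule in order; the last matching line decides."""
    blocked = False
    for is_exception, pattern in rules:
        if matches(pattern, target):
            blocked = not is_exception
    return blocked


def substring_match(pattern, target):
    """Stand-in matcher for the example only, NOT the junkbuster algorithm."""
    return pattern in target


EXAMPLE = """\
# hypothetical blockfile
ads.example.com

~ads.example.com/ok   # let this part of the site through
"""
```

With these rules, ads.example.com/banner.gif is blocked, while ads.example.com/ok/page.html is let through because the exception line comes later in the file.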

Patterns may contain POSIX regular expressions provided the junkbuster was compiled with this option. The idiom /*.*/ad can then be used.

In version 1.3 and later the blockfile and cookiefile are checked for changes before each request.

-w NAME=VALUE
Specifies a pair to be sent as a cookie with every request to the server. (Such boring cookies are called wafers.) This option may be called more than once to generate multiple wafers. The original Netscape specification prohibited semi-colons, commas and white space; these characters will be URL-encoded if used in wafers. The Path and Domain attributes are not currently supported.
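
To make the encoding rule concrete, here is a small Python sketch that percent-encodes only the characters named above (semi-colons, commas and white space). The function names are hypothetical, and joining several wafers into one Cookie header with "; " is an assumption about the output format rather than something this page specifies.

```python
import string

# Characters the original Netscape specification prohibited in cookie values.
FORBIDDEN = set(";,") | set(string.whitespace)


def url_encode(text):
    """Percent-encode forbidden characters; leave everything else alone."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in FORBIDDEN else ch
        for ch in text
    )


def cookie_header(wafers):
    """Build one Cookie header from a list of NAME=VALUE strings (assumed layout)."""
    return "Cookie: " + "; ".join(url_encode(w) for w in wafers)
```

For example, url_encode("NOTICE=no cookies, please") gives NOTICE=no%20cookies%2C%20please.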

-c cookiefile
Enforce the cookie management policy specified in the cookiefile. If this option is not used, all cookies are silently crunched, so that users who never want cookies aren't bothered by browsers asking whether each cookie should be accepted. However, cookies can still get through via JavaScript and SSL, so alerts should be left on.

In Version 1.2 and later this option must be followed by a filename containing instructions on which sites are allowed to receive and set cookies. By default cookies are dropped in both the browser's request and the server's response, unless the URL requested matches an entry in the cookiefile. The matching algorithm is the same as for the blockfile. A leading > character allows server-bound cookies only; a < allows only browser-bound cookies; a ~ character stops cookies in both directions. Thus a cookiefile containing a single line with the two characters >* will pass on all cookies to servers but not give any new ones to the browser.
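
The meaning of the leading characters can be summarized in a short Python sketch. parse_cookie_rule is a hypothetical helper, and treating an entry with no leading character as permitting cookies in both directions is a reading of the text above, not a quotation of the source code.

```python
def parse_cookie_rule(line):
    """Return (pattern, allow_to_server, allow_to_browser) for one cookiefile line."""
    line = line.strip()
    if line.startswith(">"):
        return line[1:], True, False   # server-bound cookies only
    if line.startswith("<"):
        return line[1:], False, True   # browser-bound cookies only
    if line.startswith("~"):
        return line[1:], False, False  # stop cookies in both directions
    return line, True, True            # assumption: plain entry allows both
```

The single-line cookiefile >* therefore parses to the pattern * with server-bound cookies allowed and browser-bound cookies stopped.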

-j jarfile
All Set-cookie attempts by the server are logged to jarfile. If no wafer is specified, one containing a canned notice (the vanilla wafer) is added as an alert to the server unless the -v option is invoked.

-v
Suppress the vanilla wafer.

-t from
If the browser discloses an email address in the FROM header (most don't), replace it with from. If from is set to . (the period character) the FROM is passed to the server unchanged. The default is to delete the FROM header.

-r referer
Whenever the browser discloses the URL that led to the current request, replace it with referer. If referer is set to . (period) the URL is passed to the server unchanged. In Version 1.4 and later, if referer is set to @ (at) the URL is sent in cases where the cookiefile specifies that a cookie would be sent. (No way to send bogus referers selectively is provided.) The default is to delete Referer.
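
The cases for -r can be laid out as a small decision function. This Python sketch uses made-up names (rewrite_referer, cookie_allowed), and it assumes that with @ the header is deleted when no cookie would be sent, which the text above implies but does not state outright.

```python
def rewrite_referer(setting, original, cookie_allowed=False):
    """Return the Referer value to send to the server, or None to delete it.

    setting        -- the -r argument, or None if -r was not given
    original       -- the Referer sent by the browser, or None if it sent none
    cookie_allowed -- True if the cookiefile would let a cookie go to this URL
    """
    if original is None:
        return None                 # nothing disclosed, nothing to replace
    if setting is None:
        return None                 # default: delete Referer
    if setting == ".":
        return original             # pass unchanged
    if setting == "@":
        # Assumption: fall back to deletion when no cookie would be sent.
        return original if cookie_allowed else None
    return setting                  # replace with the fixed value
```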

-u user-agent
Information disclosed by the browser about itself is replaced with the value user-agent. If user-agent is set to . (period) the User-Agent header is passed to the server unchanged, along with any UA headers produced by MS-IE (which would otherwise be deleted). In Version 1.4 and later, if user-agent is set to @ (at) these headers are sent unchanged in cases where the cookiefile specifies that a cookie would be sent; otherwise only the default User-Agent header is sent. That default is Mozilla/3.0 (Netscape) with an unremarkable Macintosh configuration. If used with a browser less advanced than Mozilla/3.0 or IE-3, the default may encourage pages containing extensions that confuse the browser.

-h [host][:port]
If host is specified, bind the junkbuster to that IP address. The default is to bind to all IP addresses (INADDR_ANY). Specifying a port is optional; the default is 8000.
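
A minimal Python sketch of how an argument of the form [host][:port] breaks down, with the defaults given above. parse_bind is a hypothetical name, and returning None to stand for "all addresses" (INADDR_ANY) is a choice made for the example.

```python
def parse_bind(arg, default_port=8000):
    """Split '[host][:port]' into (host, port); host None means all addresses."""
    host, _, port = arg.partition(":")
    return (host or None), (int(port) if port else default_port)
```

So 127.0.0.1 yields port 8000 on that address only, and :8080 yields port 8080 on all addresses.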

-f forward_host[:port]
Forward all HTTP requests from the client to the forward_host[:port]. The default port is 8000. Chained proxies may be wanted to better conceal the IP address of the end-client, to add caching or different kinds of filtering, or to interface to an existing firewall proxy. No loop detection is performed. When setting up chains of proxies that might loop back, try adding Squid.

-g gw_protocol[:[gw_host][:gw_port]]
Use gw_protocol as the gateway protocol. (Only in Version 1.4 and later.) The default is to use no gateway protocol; this may be explicitly specified as direct on the command line. The SOCKS4 protocol may be specified as socks or socks4. The SOCKS4A protocol is specified as socks4a. The SOCKS5 protocol is not currently supported. The default gw_port is 1080.

The user's browser should not be configured to use SOCKS; the proxy conducts the negotiations, not the browser.

The user identification capabilities of SOCKS4 are deliberately not used; the user is always identified to the SOCKS server as userid=anonymous. If the server's policy is to reject requests from anonymous, the proxy will not work. Use -d 3 to see the status returned by the server.
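
For readers unfamiliar with the wire format, the following Python sketch builds SOCKS4 and SOCKS4A CONNECT requests with the fixed user ID described above, and checks the status byte of the reply. It follows the published SOCKS4 and SOCKS4A protocol descriptions; the function names are invented here, and the assumption that the userid field holds the literal bytes "anonymous" is a reading of the paragraph above.

```python
import socket
import struct

USERID = b"anonymous"  # assumed literal content of the userid field


def socks4_request(host_ip, port):
    """SOCKS4 CONNECT: VN=4, CD=1, DSTPORT, DSTIP, USERID, NUL."""
    return (struct.pack(">BBH", 4, 1, port)
            + socket.inet_aton(host_ip)
            + USERID + b"\x00")


def socks4a_request(hostname, port):
    """SOCKS4A CONNECT: DSTIP is 0.0.0.x with x nonzero; hostname follows USERID."""
    return (struct.pack(">BBH", 4, 1, port)
            + bytes([0, 0, 0, 1])
            + USERID + b"\x00"
            + hostname.encode("ascii") + b"\x00")


def socks4_granted(reply):
    """The second byte of the 8-byte reply is the status; 90 means granted."""
    return len(reply) >= 2 and reply[1] == 90
```

With SOCKS4 the proxy must resolve the host name itself; with SOCKS4A the name is handed to the SOCKS server for resolution.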

-d N
Set debug mode. The most common value is 1, to pinpoint offensive URLs, so they can be added to the blockfile. The value of N is a bitwise logical-OR of the following values:
1 = URLs (show each URL requested by the browser);
2 = Connections (show each connection to or from the proxy);
4 = I/O (log I/O errors);
8 = Headers (as each header is scanned, show the header and what is done to it);
16 = Log everything (including debugging traces and the contents of the pages).

Because most browsers send several requests in parallel their contents may appear intermingled, so the -s option is recommended when using -d with N greater than 1.
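
Since N is a bitwise OR, values are combined by adding the distinct bits. The Python sketch below spells this out; the class and member names are labels chosen for the example, only the numeric values come from the list above.

```python
from enum import IntFlag


class Debug(IntFlag):
    URLS = 1
    CONNECTIONS = 2
    IO = 4
    HEADERS = 8
    EVERYTHING = 16


def debug_value(*flags):
    """Combine flags into the number to pass after -d."""
    value = 0
    for flag in flags:
        value |= flag
    return int(value)
```

For instance, debug_value(Debug.URLS, Debug.CONNECTIONS) is 3, so -d 3 shows both URLs and connections, and debug_value(Debug.URLS, Debug.HEADERS) is 9.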

-y
Add X-Forwarded-For headers to the server-bound HTTP stream indicating the client IP address to the server, in the new style of Squid 1.1.4.

-x HeaderText
Add the HeaderText verbatim to requests to the server. Typical uses include adding old-style forwarding notices such as Forwarded: by http://pro-privacy-isp.net and reinstating Keep-Alive headers (which the junkbuster deletes so as not to reveal its existence). No checking is done for correctness or plausibility, so it can be used to throw any old trash into the server-bound HTTP stream. Please don't litter.

-s
Do not fork() a separate process to handle each connection. This is useful when debugging, because it keeps the proxy single-threaded.

-l logfile
Write all debugging data into logfile. The default logfile is the standard output.

*  Installation and Use

Browsers must be told where to find the junkbuster (e.g. localhost port 8000). To set the HTTP proxy in Netscape 3.0, go through: Options; Network Preferences; Proxies; Manual Proxy Configuration; View. See the FAQ for other browsers. The Security Proxy should also be set to the same values, otherwise https: URLs won't work.

Note the limitations explained in the FAQ.

*  Checking Options

To allow users to check that a junkbuster is running and how it is configured, the proxy intercepts requests for any URL ending in /show-proxy-args and blocks them, returning instead information on its version number and current configuration, including the contents of its blockfile. If the browser is not actually using the proxy, such a request goes straight to the named server, so it is best to request a URL that then returns an explicit warning that no junkbuster intervened, such as the one Junkbusters provides (listed under See Also).
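
The same check can be scripted. The Python sketch below uses the standard urllib.request module to send a request through a proxy assumed to be listening on localhost port 8000; the function names are hypothetical, and nothing is fetched until fetch_proxy_configuration() is actually called.

```python
import urllib.request


def proxy_opener(host="localhost", port=8000):
    """Return an opener that sends http: requests through the given proxy."""
    proxy = urllib.request.ProxyHandler(
        {"http": "http://{}:{}".format(host, port)}
    )
    return urllib.request.build_opener(proxy)


def show_proxy_args_url(site):
    """Append /show-proxy-args to a site or directory URL."""
    return site.rstrip("/") + "/show-proxy-args"


def fetch_proxy_configuration(site, host="localhost", port=8000):
    """Request the check URL via the proxy and return the response text."""
    opener = proxy_opener(host, port)
    with opener.open(show_proxy_args_url(site), timeout=10) as response:
        return response.read().decode("latin-1")
```

Calling fetch_proxy_configuration("http://internet.junkbuster.com/cgi-bin") requests the URL listed under See Also; whether that address still answers is not something this page can promise.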

*  See Also

http://www.junkbusters.com/ht/en/ijbfaq.html
http://www.junkbusters.com/ht/en/cookies.html
http://internet.junkbuster.com/cgi-bin/show-proxy-args
http://www.netscape.com/newsref/std/cookie_spec.html
http://www.ds.internic.net/rfc/rfc2109.txt
http://squid.nlanr.net/Squid/
http://www.math.ucsb.edu/%7Eboldt/

*  Copyright and GPL

Written and copyright by the Anonymous Coders and Junkbusters Corporation and made available under the GNU General Public License (GPL). This software comes with NO WARRANTY. Internet Junkbuster is a trademark of Junkbusters Corporation.


Copyright © 1996-7 Junkbusters Corporation. (TM) Copying and distribution permitted under the GNU General Public License: view document's source. 1997/04/22 http://www.junkbusters.com/ht/en/ijbman.html

webmaster@junkbusters.com