Archive-name: clarinet/headers

ClariNet articles come in the USENET message interchange format. This is
a variant of the ARPA/Internet electronic mail format. Exact details on
that format can be found in documents known as RFC822 (Mail format) and
RFC1036 (USENET format) which are stored for anonymous pickup on UUNET
and a variety of other machines.

ClariNet articles use the standard USENET headers, plus a variety of
special custom ones. Here we explain how we use the standard headers and
the meanings of our extensions.

Standard headers:

From:
The mail address found here will almost always be clarinews@clarinet.com.
The comment, or user's full name, will be the reporter's name. In some
cases, a title like "Science Reporter" or an affiliation will be added.
In some cases, the From: address is an e-mail address that reaches the
news agency. In the case of UPI stories, this is not true, and replies
go simply to us.

Subject:
In most cases, this is a professional reporter/editor's headline for the
story. In some cases, such as standing (regular) stories -- stock
reports, weather, sports statistics, etc. -- a headline is filled in by
ClariNet, possibly including the date of the story. UPI headlines are in
mixed case. Some syndicated feature headlines are in upper case.
Newsbytes headlines come in upper case, but are converted by ClariNet
software to mixed case.

Keywords:
On this line, we translate the reporter's story coding along with our
own keywords. A list of possible regular keywords is available. All
keywords are human-generated by reporters and editors. Unfortunately,
the coding system UPI uses is prone to errors. It's very terse, and a
single keystroke error can create a ridiculous keyword. With thousands
of stories moving every day, this is frequent enough to be annoying, but
infrequent enough to be easily tolerated.

Newsgroups:
Articles are crossposted to a variety of newsgroups based on their
coding and keywords.
In addition, certain regular stories are put in special newsgroups based
on their slugword (see below). In general, a story is crossposted to up
to 5 groups, so that those following a topic get every story related to
that topic. All modern news reading software makes sure that you never
see a crossposted article more than once, no matter how many groups it
appears in.

Date:
The time that we got the story directly from the wire, which we receive
via satellite. On average it will not reach you for another two hours,
due to batching, propagation delays and deliberate delays required by
contract.

Message-id:
We form message-ids from the slugword and an encoding of the date and
time. Sometimes a checksum is used when the story arrives without a date
and time.

References:
ClariNet messages contain References lines that can be used by
thread-following newsreading tools such as trn. References are generated
when a story is an update to an earlier story, and when a story is a
sidebar to an earlier story. We do not list all the messages in a
reference chain -- normally, we will list only the immediate predecessor
of a story, and the root of the story tree. This is done for each level
of sidebar -- though normally sidebars only go one level deep.

If you use a threaded newsreader you will thus see chains of updates
grouped together. Not all updates replace their predecessor, so you can
see several real stories in a chain. For example, if you come in to
clari.sports.baseball after a few days, you might see an entire series
of ball game stories grouped together as one thread. You will also see
related stories on a major topic grouped together.

Supersedes:
On some stories, when a story replaces an earlier version, the
Supersedes header is used to specify the message-id of the replaced
story. This doesn't always work, so a cancel message is also issued.
In some cases only the cancel is issued, and we note what was replaced
with an x-supersedes header, which is really just a comment.

ClariNet Special Headers

Slugword:
This is a special story-specific keyword. Every story is assigned a
slugword. If the story is updated, it goes out again with the same
slugword. We use this to cancel the old story before issuing the update,
so that only one version of the story exists on your machine at a given
time.

Most slugwords are just simple words. The main story on George Bush, for
example, is usually slugged "bush." There is no formal pattern to this
that you can use, however. It is a safe bet that any story slugged
"bush" would be about him, but if some other bush became news, it might
be used in that context as well.

Sidebars to stories will often use a component slug that links them to
the main story. For example, the Panama invasion was slugged "panama,"
and a variety of stories around it were slugged "panama-response,"
"panama-nuncio" and so on. Sometimes more levels will appear.

Slugwords can also be used to indicate standing stories -- those that
repeat with some frequency. The daily PEOPLE column is always slugged
"people." You can track a standing story by looking for its slug. A list
of standing stories is available.

Location:
This field provides the location for the story. Sometimes a comma
delimited list of locations is provided. Unfortunately, quite often the
reporter does not code the location of a story, particularly on U.S.
domestic news. Most international news is coded for location. Possible
location codes include country names such as "canada" or "west germany"
and state names such as "california." Regions and continents are also
coded, and even a few places like New York City. In general, expect a
location only on an International story or a U.S. regional story.

ACategory:
This provides the ANPA story category. There are just over a dozen of
these. They provide a general story category.
Our keywords give far more specific coding. The ACategory is useful if
you're looking for general coding. The categories are:

        usa             General U.S. related news
        special         Special section (rarely used)
        feature         Feature article
        food            Recipes etc. (rarely used)
        entertainment
        financial
        international   Non U.S. stories
        commentary      Editorials etc.
        lifestyle
        weather
        regional        Regions of the USA
        national        Artificial category, local version of national story
        political
        scoreboard      Sports score reports
        racing          (Not covered by ClariNet)
        sports
        travel
        advisory        (For editors only -- not released)
        washington
        reserved        (Unknown Category)
        natbriefs       Radio National Briefs
        briefs          Radio briefs
        headlines       Radio headlines
        reg-headlines   Radio regional headlines
        markets         Radio stock market reports
        billboard
        television      Radio reports about Television

Most stories are usa, international, financial, sports or entertainment.
Most stories in clari.local groups are regional.

Priority:
This is a general indication of the importance of the story. Priorities
are:

        FLASH            Once a decade type stories
        BULLETIN         Top stories of the week
        urgent           Top breaking stories of the day
        major            Big non-breaking stories (artificial category)
        regular          Most stories
        daily            Lower priority stories
        deferred         (never used)
        release-at-will  Advance material for release any time
        advance          Material for future release
        weekend          Material for weekend newspapers

Stories of the "flash," "bulletin" or "urgent" priority are known as
"breaking news." Each priority has its own newsgroup so that you can
track the biggest stories directly. We have not yet seen a posting to
clari.news.flash. The last known flash was "space shuttle explodes."
(Flashes are always 3 words, followed up by a bulletin.) You usually see
2-4 bulletins a week, although there will always be multiple versions of
any bulletin story. You see 2-4 urgent stories per day as well.

"major" is a priority we created.
It is for stories that are, in wire parlance, "skedded." They have a
regular priority but are rated as important stories by the desk editors.
They go into the "top" newsgroups.

Format:
The format field is somewhat redundant. It describes what sort of story
this article is. It is most useful on sports stories, which come in a
variety of formats. Formats depend on the ACategory. Some formats, like
a "game story," are only possible on a sports story.

        advisory                  For editors only -- not sent out
        annual                    Annual summary (Sports/financial/some news)
        audio advisory            For radio stations
        breaking                  Urgent/bulletin/flash
        briefs                    Short summaries of major stories
        close                     Report at close of market trading
        correspondent's advisory  For reporters only -- not sent out
        daily                     Lower priority news
        daybook                   For reporters only -- not sent out
        feature                   Feature stories
        game story                Report on a game
        glances                   Sports at a glance report
        headlines                 Two sentence summaries of major stories
        interim                   Report while market is open
        linescores                Broken down score reports
        market wrapup             Final market report
        open                      Market opening report
        ratings                   Team rankings and reports
        regular                   Most news
        scorecard                 List of scores
        snap scores               Quick scores for radio
        summary                   Summaries (sports/stock market/etc.)
        table                     Sports statistics
        week-end                  Market reports at end of week

Some stories may have multiple formats, comma-delimited. Unfortunately
this is more often than not a coding error.

ANPA:, Codes:, X-takes:
These lines mostly serve as comments, used by us to track how our
software decodes the stories from the non-formalized wire format.

While it is not a supported header, here are the meanings of the fields
on the ANPA line.

        ANPA: Wc: 446; Id: a0723; Sel: na--i; Adate: 3-17-1235pes; Ver 2/0; V: sked ld

        Wc:     Word count
        Id:     Internal wire story ID -- unique number for the day
        Sel:    Wire selector code
        Adate:  Date story was written
        Ver:    Major and minor version numbers for this story.
        V:      Version field, sometimes indicates reason for update.
                Many keywords are possible here, which we won't document.

Codes:
This comment line contains the original reporter's cryptic coding of the
story. We have translated all this into human readable information
above. It is there for our debugging purposes only. Not a supported
header.

X-Takes:
If the story was sent to us in multiple parts (don't ask why), the
number of parts received is listed here. Not a supported header.
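
For readers who want to inspect the ANPA comment line anyway, here is a
minimal Python sketch that splits it into its fields. It assumes the
semicolon-separated layout of the single sample line shown above; the
function name parse_anpa is our own invention for illustration, and since
this header is unsupported the layout may differ on real articles. The
sample writes "Ver 2/0" without a colon, so the sketch accepts a field
name followed by either a colon or plain whitespace.

```python
import re


def parse_anpa(line):
    """Split an 'ANPA:' comment line into a dict of field name -> value.

    Hypothetical helper: the layout is inferred from one sample line only.
    """
    # Drop the leading header name, if it is present.
    body = line.split(":", 1)[1] if line.startswith("ANPA:") else line
    fields = {}
    for part in body.split(";"):
        part = part.strip()
        if not part:
            continue
        # A field is a name, an optional colon, then the value.
        match = re.match(r"([A-Za-z]+)\s*:?\s*(.*)$", part)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


sample = ("ANPA: Wc: 446; Id: a0723; Sel: na--i; "
          "Adate: 3-17-1235pes; Ver 2/0; V: sked ld")
fields = parse_anpa(sample)
# The Ver field holds the major and minor version numbers.
major, minor = fields["Ver"].split("/")
print(fields["Wc"], fields["Id"], major, minor)
```

Values are kept as strings so that nothing is lost if a field turns out
to carry something other than what the sample suggests.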