TELECOM Digest     Sun, 7 Nov 93 10:27:30 CST    Volume 13 : Issue 739

Inside This Issue:                       Moderator: Patrick A. Townson

    Orange County DACS Outage (Urban Surfer)
    Dialup by Modem Bank to Ethernet (Scott M. Pfeffer)
    Fire Update 11-5-93 9:00 AM PST (Pete Tompkins)
    "Fake Switch" Box or Tester (Karl Bunch)
    Canada Goes 1+ 10D For All Long Distance, Sept '94 (Dave Leibold)
    Skokie, IL, and Telephone History (Dave Levenson)
----------------------------------------------------------------------

Date: Fri, 05 Nov 1993 13:20:02 PST
From: Urban Surfer
Subject: Orange County DACS Outage
Reply-To: matt@phs.com
Organization: Pacificare Health Systems

About six weeks ago, I posted in the Digest an account of the DACS outage in Orange County, CA. I received several queries for more information. It seems that a lot of people were disturbed to learn about the potential points of failure on a DACS as well as the bug we experienced.

I recently took a tour of the affected CO and met with the switch and DACS administrators to ask further questions. At this point, they believe that they have fully addressed all software and procedural issues with the DACS IV. They also stated that the software patches they applied have been propagated throughout the entire Bell network.

The following is the public disclosure report sent to the FCC from Pacific Bell. This report was retyped from a fax, so any errors are mine or my secretary's.

FINAL SERVICE DISRUPTION REPORT

CATEGORY: 50,000+
REPORTING COMPANY: Pacific Bell
REPORT CONTACT/TELEPHONE: Eva Low (510) 823-2910
LOCATION OF DISRUPTION: Anaheim, California (ANHMCA#11)

1. DATE AND TIME OF INCIDENT:

9/15/93 0752 HRS.

2. GEOGRAPHIC AREA AFFECTED:

The failure of this Digital Crossconnect System (DCS) affected a portion of the city of Anaheim, California. This geographic area is located in the Los Angeles, California LATA 730.

3. ESTIMATED NUMBER OF CUSTOMERS AFFECTED:

Potentially, 67,528 customers could have been affected by this failure.
This estimate was derived based on the number and type of working circuits on the DCS.

4. TYPES OF SERVICES AFFECTED (e.g., INTEREXCHANGE, LOCAL, CELLULAR, 911 EMERGENCY SERVICES):

All services using the interoffice transport network into and out of the Anaheim 11 central office building were affected. This included two local switching entities which were isolated from the interoffice network (intraoffice calling was not affected). Operator and directory assistance services were adversely affected and the Anaheim Public Safety Answering Point (PSAP) was without Automatic Location Identification (ALI) during this failure.

5. DURATION OF THE INCIDENT:

Date and time of disruption: 9/15/93 at 0752
Date and time of full service restoral: 9/15/93 at 1557
Duration of incident (minutes): 485

6. ESTIMATED NUMBER OF BLOCKED CALLS:

Approximately 746,950 calls were blocked during this incident. This estimate is based on data from the switches using the same day and time of the week prior to the incident.

7A. CAUSE OF THE INCIDENT:

This service outage was caused by a software defect in the Digital Access Cross-Connect System IV-2000 (DACS IV-2000). The software defect caused information in an area of the database called "Frame Data Page" to become corrupted. This corruption did not have an immediate impact on service. The Frame Data Page contains critical information related to the systems software program identity which is used by the DACS IV-2000 during system recovery. This corruption went undetected and was propagated from active memory to the hard disk and system backup tapes.

Prior to the outage, most input commands issued to the DACS IV-2000 were responded to with "Retry Later" (RL) messages. In accordance with standard procedures, a system reset was activated to clear the system of RL responses in order to reestablish communications with the DACS IV-2000.
The system design is such that when a system reset is activated, data resident on the hard disk is loaded onto active memory. On this occasion, the aforementioned corrupted Frame Data Page caused the DACS IV-2000 to reinitialize the cross-connect map and drop all active cross connects in the system. A total system outage ensued. Attempts to recover the system by rebooting from system backup tapes failed because the corrupted Frame Data Page also existed on these tapes.

AT&T determined that the database corruption resulted from improper software process interactions involving the preemption of a lower priority process by a higher priority process during a very specific small window of time when the program was manipulating internal data pointers. The use of these data pointers by the higher priority process resulted in corruption of the Frame Data Page described above.

7B. NAME AND TYPE OF EQUIPMENT/VENDOR NAME:

Name: Digital Access Crossconnect System IV-2000 (DACS IV)
Type: Digital Crossconnect System (DCS)
Vendor: American Telephone & Telegraph (AT&T)

7C. SPECIFIC PART OF THE NETWORK INVOLVED (e.g., LOOP SWITCH, INTEROFFICE):

This disruption involved the interoffice transport portions of the network.

8. METHOD USED TO RESTORE SERVICE:

Standard emergency action recovery procedures were executed by Pacific Bell field personnel under the direction of Pacific Bell Electronic Systems Assistance Center (ESAC), in consultation with AT&T RTAC, TSO, and Bell Laboratories (Bell Labs). Multiple attempts to recover system operation from storage media failed since the corruption was present on the hard drive as well as all backup tapes maintained in the office. A special software debugging tool called "DACSmate" was thereupon attached to the DACS IV-2000.
Through the use of DACSmate, AT&T Bell Labs performed an intensive analysis of the database on the backup tape and determined that the extent of the corruption was confined to the Frame Data Page only: the cross-connect map itself maintained integrity. Bell Labs used DACSmate to copy the existent cross-connect map from tape and reloaded the DCS hardware, thereby restoring transmission for customer service; however, the system controller remained in an out-of-service state.

Subsequently, Bell Labs prepared a "new" database containing both a valid Frame Data Page and the cross-connect map information. Standard procedures were used to load this database from tape into the system, and full functionality was restored to the DACS IV-2000.

9. STEPS TAKEN TO PREVENT REOCCURRENCE OF THE OUTAGE:

1. AT&T issued/reissued the following bulletins, called either COACH or Urgent Problem Notification (UPN) Bulletins, as a result of this incident:

   a) UPN Bulletin 9309171.1 was issued on September 17, 1993, alerting the industry that system resets can result in service interruptions, and hence, are not to be performed as part of normal troubleshooting of main controller problems. The UPN also recommends that the next level of support be contacted because it may be necessary to use special debugging tools to ensure that data corruption is not present.

   b) UPN Bulletin 9309171.2, issued on October 13, 1993, amends UPN Bulletin 9309171.1. The amended version describes the cause of the data corruption and also identifies the correcting software program releases (see item 2 below).

   c) COACH Bulletin #050393.2, issued on September 17, 1993, cancels the use of resets (recommended in Bulletin #050393.1). This issue recommends that the next level of support be contacted and that special debugging tools may be needed to ensure that data corruption is not present.

2.
The correction for the process interaction problem is available in the following software releases:

    Redundant Controller:  2.3drc (avail 11/7/93)
                           3.0drc (avail now)
                           2.3d   (avail 10/25/93)

Additional defensive measures have been developed to have the system automatically validate the database for integrity and to prevent the inadvertent propagation of corrupted data. These changes (tracked via MR CS 93-26601) will be available as follows:

    Redundant Controller:  2.3drc  (avail 11/7/93)
                           3.01drc (avail now)
    Simplex Controller:    2.3d    (avail 10/25/93)

A software patch (overwrite) for MR# CS 93-26601 was developed for Simplex Controller 2.2d and Redundant Controller 2.2drc and is now available. Pacific Bell is planning to deploy Release 2.3 or later in all DACS IV-2000 offices by 1994. Patch application prior to this will be determined on a site-by-site basis.

3. Pacific Bell issued an ESAC Flash, 93-010F, prohibiting the use of system resets without ESAC involvement. Moreover, ESAC will use DACSmate to verify that the database is not corrupted prior to initiating a reset. Deployment of additional defensive measures (2.2 patch, Release 2.3 and 3.0.2 software) provides this data validation internal to the DACS IV-2000 software. (See also items 2 and 4 in this section.)

4. The DACSmate software debugging tool currently does not have remote access capability; however, enhancements for remote access via an X.25 wide area network are under development. This remote capability requires development of a companion called DACSlink, which AT&T will be jointly testing with Pacific Bell in December 1993. Pending successful completion of testing, Pacific Bell will implement DACSlink in all of its DACS IV-2000 offices during 1994. On October 8, 1993, AT&T provided Pacific Bell with additional portable DACSmate units, pending the deployment of DACSlink.

Matt Holdrege    matt@phs.com    MH235

------------------------------

From: sp9183@swuts.sbc.com (Scott M.
Pfeffer)
Subject: Dialup by Modem Bank to Ethernet
Date: 6 Nov 93 18:44:05 GMT
Organization: Southwestern Bell Telephone Company

Dialup to modem bank. Investigating options such as Trailblazer series, etc. Need information on availability, pricing, recommendations, caveats, or experiences with implementing dialup to a host which can route TCP/IP traffic across modems.

Nice picture:

         --------
         | OFC/ |    _____                       _______
         | HOME |   |     |                    /|       |
         | CPTR |---|Modem|---SWITCHED TELCO----| Modem |
         |      |   |_____|                    \| Pool  |
         --------                               |_______|
                                                 |||||||
---------                           ________     \_____/
| Network\       _________         | Linkup |      ||
|   of    |=====|Sun SPARC| <======| Device |======>//
| Suns   /      |_________|        |________|
---------

Specifically, I am looking for information on the following pieces:

1. The Modem Pool. (What vendors, advice...)

2. The Linkup Device between the modem pool and the Sun (can be PC bridge, or an interface card in the Sun for the modem pool device, or anything else...) (Trailblazer?).

3. I have no questions about the OFC/Home cptr, the Modem attached to the home computer, or the network of Suns. Those details have already been worked out...

Requirements:

1. Support 2400, 9600 baud. Autobaud detection would be nice. Initially 8 or 16 modems at least. Must be easily expandable.

2. Support compression, error control, but allow non-compression and/or non-error-control modems at the OFC/HOME to work (through automatic negotiation).

3. Total transparency between Sun and OFC/HOME CPTR. That is, once connected, ALL data from OFC to Sun will be delivered untouched, and from Sun to OFC, too. This means the linkup device as well as the modem pool device must not interfere or attempt to interpret the data coming across once the connection is established.

4. Cost reduction information valuable...

Why:

1. Have client application on OFC/HOME computers that I want to talk to a server application on the Sun SPARC via DIALUP.

2. Client talks TCP over phone line using SLIP/PPP for serial IP.
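
To illustrate why requirement 3 (total transparency) matters for serial IP, here is a minimal sketch of SLIP framing, assuming the byte values given in RFC 1055. The function names slip_encode and slip_decode are hypothetical, chosen only for this example, and nothing here describes any particular modem pool or linkup product. SLIP escapes only its own END and ESC bytes, so every other value, including the XON/XOFF characters 0x11 and 0x13, crosses the line unchanged and must not be interpreted by any device in between.

```python
# SLIP framing sketch, assuming the special byte values from RFC 1055.
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD


def slip_encode(packet: bytes) -> bytes:
    """Wrap one IP packet in a SLIP frame (hypothetical helper name)."""
    out = bytearray([END])  # a leading END flushes any accumulated line noise
    for b in packet:
        if b == END:
            out += bytes([ESC, ESC_END])  # END inside data is escaped
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])  # ESC inside data is escaped
        else:
            out.append(b)  # all other bytes are sent as-is
    out.append(END)  # a trailing END marks the end of the frame
    return bytes(out)


def slip_decode(frame: bytes) -> bytes:
    """Recover the packet from a single SLIP frame (hypothetical helper name)."""
    out = bytearray()
    escaped = False
    for b in frame:
        if escaped:
            # Undo the two defined escapes; pass anything else through.
            out.append(END if b == ESC_END else ESC if b == ESC_ESC else b)
            escaped = False
        elif b == ESC:
            escaped = True
        elif b == END:
            continue  # frame delimiter, not part of the packet
        else:
            out.append(b)
    return bytes(out)


if __name__ == "__main__":
    packet = bytes(range(256))  # every possible byte value
    assert slip_decode(slip_encode(packet)) == packet
    print("round trip ok")
```

If a modem pool or linkup device swallowed 0x11 or 0x13 for software flow control, the round trip above would no longer hold on a real line, which is the failure the transparency requirement is meant to rule out.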
Thanks in advance. Please reply to:

Scott Pfeffer    sp9183@swuts.sbc.com
or call direct: (314) 235-7213
Information Services, Southwestern Bell Telephone
18-N-22, One Bell Center, St. Louis, Missouri 63101

------------------------------

From: tompkins@pete.tti.com (Tompkins)
Subject: Fire Update 11-5-93 9:00 AM PST
Reply-To: tompkins@pete.tti.com (Tompkins)
Organization: Transaction Technology, Inc.
Date: Sat, 6 Nov 1993 17:35:30 GMT

I am thoroughly impressed with the quality of the telephone service throughout Malibu the last three days. Outbound calls were totally unrestricted at all times; inbound volume was limited intentionally to ensure our ability (and more importantly, emergency personnel's ability) to call out as necessary. A number of areas in east Malibu (310-456 exchange) still get a busy. I suspect cables going to some of the fire areas have been welded together by the fire.

Away from phones for a minute:

I just drove into work for the first time since Monday. Pacific Coast Highway is open to Malibu residents (actually, it's open to anyone, but none of the canyon accesses are open to non-residents).

The scene was one of total devastation -- but it was also a scene of many successes. For ten miles along PCH, the fire burned right up to the Highway where there wasn't a structure. And for that same stretch, it burned up to the back wall of the structures that fronted PCH. La Costa (one of the areas that has gotten a lot of news coverage) looked like a bomb hit it -- but the La Costa houses on PCH are untouched even though they are a mere 50' from their neighbors to the rear, which, in most cases, were destroyed!

In Carbon Canyon, the fire raced through in seconds. The residents all assumed their houses were gone -- but the vast majority were missed -- partly the luck of the draw, and partly the hard work of fire fighters, partly aided by the large separation between houses (these are two to ten acre lots).
In one case (a close personal friend), the fire fighters apparently cut out a piece of burning roof and broke in through a window to extinguish some burning furniture. They actually went out of their way to cover the other furniture, obviously taking pains to minimize the damage they caused -- they hauled the smoldering furniture a hundred yards down the canyon, and went on to the next house. This is one family who left KNOWING their house was gone and returned to find really minor damage!

Further west (PCH DOES run east and west through Malibu, in spite of what the TV newspeople might have you believe!), a ring of burned out brush encircles the Civic Center and also the Malibu Knolls residential area, but nothing was burned (at least as far as you can see from the Highway). Anyone who has visited Malibu has probably noticed the castle overlooking the Civic Center. Its walls are singed, as are the walls of many of its neighbors.

Hughes Research, Pepperdine and the neighboring residential area, Malibu Country Estates: same story -- fire right up to the edge of the buildings. Four houses at Puerco Canyon (a little further west) have burnt brush all the way around them, with no apparent damage!

Some hot spots still exist in Corral Canyon on the west flank and also in and around Topanga on the east flank, but there is no further threat to structures. We all thank God for the cool, damp sea breezes that returned to the area Wednesday.

I don't want to minimize the devastation and the loss of hundreds of people, but it is obvious that in many areas the skillful and hard work of thousands of fire fighters from all over the western U.S. saved many hundreds more homes. The hearts of all of Malibu go out to these hard working people.
Pete Tompkins

------------------------------

Date: Sat, 06 Nov 1993 19:37:02 GMT
From: karl@ttank.ttank.com (Karl Bunch)
Subject: "Fake Switch" Box or Tester
Organization: Think Tank Software, Norwalk, CA

I'm looking for a circuit or "magic box" that would allow me to basically plug two phones back-to-back. Given Phone A & B, if phone A were picked up Phone B would ring, and when phone B is picked up they could converse as normal until one of them hangs up. The same could be true in the reverse (Phone B would ring A if it's on-hook etc.)

I want to hook up a phone to a voice-mail board and allow the board to ring the phone or the phone to "call into" the board without using up phone lines etc.

I'm extremely ignorant as to how phones even ring. So be very complete in any reply you may make.

Please reply by e-mail.

Thanks for any help,

Karl Bunch                 UUCP: ..!uunet!cerritos.edu!ttank!karl
Think Tank Software        INTERNET: karl@ttank.com

------------------------------

Date: Sat, 06 Nov 1993 23:06:58 -0400
From: Dave.Leibold@f730.n250.z1.FIDONET.ORG (Dave Leibold)
Subject: Canada Goes 1 + 10D For All Long Distance, Sept '94

[from Bell News, Bell Ontario, 25 Oct 93. Text is Bell Canada's.]

Canadian dialing patterns to change in 1994.

Get ready for the next big change in dialing patterns. The way callers make long distance calls within their own area code will change for everyone in Canada (and most of North America) on September 4, 1994.

Currently, to dial long distance within your own area code, you dial 1 or 0 and then the seven-digit number. Area code 905 is the only exception to this.

But population growth, as well as the boom in new technologies such as cellular and fax machines, has used up almost all of the available area codes under the present North American Numbering Plan (NANP).
To solve the problem, calling long distance within your own area code will require you to drop in your area code and dial 1 or 0 + area code + xxx-xxxx (just as you do when making long distance calls to other area codes).

No change will take place in the way that local calls or long distance calls to other area codes are dialed.

Dave Leibold - via FidoNet node 1:250/98
INTERNET: Dave.Leibold@f730.n250.z1.FIDONET.ORG

------------------------------

From: dave@westmark.com (Dave Levenson)
Subject: Skokie, IL, and Telephone History
Organization: Westmark, Inc.
Date: Sat, 6 Nov 1993 17:59:11 GMT

In light of our Moderator's recent move from Chicago to Skokie, IL, I thought I'd share a bit of telephone trivia from the mid 1960's.

It was reported in that year that the largest number of telephones per capita (86 per 100 population) anywhere in the world was in the District of Columbia, USA. The second largest value of this number (I think the number was around 70 or so) was in Skokie, Illinois. (Now that Pat lives there, the number is probably higher!)

Dave Levenson          Internet: dave@westmark.com
Westmark, Inc.         UUCP: {uunet | rutgers | att}!westmark!dave
Stirling, NJ, USA      Voice: 908 647 0900  Fax: 908 647 6857

[Moderator's Note: Actually, I have but three lines: one for voice, one for data and one for fax. I'll make do somehow. The fax is now on a full-time dedicated line and available to anyone who wants to use it: 708-329-0572. The Skokie area was also the home of Teletype Corporation, as some old-timers may recall. I am just hoping very desperately that things will work out financially for me and the family. :( PAT]

------------------------------

End of TELECOM Digest V13 #739
******************************